<<

Algebraic

Ira M. Gessel1 Richard P. Stanley2 Department of Department of Mathematics Brandeis University Massachusetts Institute of Technology Waltham, MA 02254 Cambridge, MA 02139

1. Introduction 2. 3. Generating Functions 4. Free 5. Circular Words 6. Lagrange Inversion 7. The Transfer Method 8. and Partitions 9. Exponential Generating Functions 10. with Restricted Position 11. Cancellation 12. Inclusion-Exclusion 13. Mobius Inversion 14. Symmetric Functions

1. Introduction In all areas of mathematics, questions arise of the form “Given a description of a nite , what is its ?” Enumerative deals with questions of this sort in which the sets to be counted have a fairly simple structure, and come in indexed families, where the is most often the set of nonnegative . The two branches of enumerative combinatorics discussed in this book are asymptotic enumeration and algebraic enumeration. In asymptotic enumeration, the basic goal is an approximate but simple formula which describes the order of growth of the as a of their parameters. Algebraic enumeration deals with exact results, either explicit formulas for the in question, or more often, generating functions or recurrences from which the numbers can be computed. The two fundamental tools in enumeration are bijections and generating functions, which we introduce in the next two sections. If there is a simple formula for the cardinality of a set, we would like to nd a “reason” for the existence of such a formula. For example, if a set S has cardinality 2n, we may hope to prove this by nding a between S and the set of of an n- set. The method of generating functions has a long history, but has often been regarded as an ad hoc device. One of the main themes of this article is to explain how generating functions arise naturally in enumeration problems.

1 Partially supported by NSF Grant DMS-8902666 2 Partially supported by NSF Grant DMS-8401376 This paper will appear in the Handbook of Combinatorics, edited by Ronald Graham, Martin Grotschel, and LaszloLovasz, to be published by North-Holland.

1 Further information and references on the topics discussed here may be found in the books of Comtet (1972), Goulden and Jackson (1983), Riordan (1958), Stanley (1986), and Stanton and White (1986).

2. Bijections The method of bijections is really nothing more than the denition of cardinality: two sets have the same of elements if there is a bijection from one to the other. Thus if we nd a bijection between two sets, we have a proof that their cardinalities are equal; and conversely, if we know that two sets have the same cardinality, we may hope to nd an explanation in the existence of an easily describable bijection between them. For example, it is very easy to construct bijections between the following three sets: the set of 0-1 of length n, the set of subsets of [n]={1, 2,...,n}, and the set of compositions of n +1. (A composition of an is an expression of that integer as sum of positive integers. For example, the compositions of 3 are 1+1+1,1+2, 2 + 1, and 3.) The composition a1 + a2 + + ak of n + 1 corresponds to the S = {a1,a1 +a2,...,a1 +a2 ++ak1} of {1, 2,...,n} and to the 0-1 u1u2 un in which ui = 1 if and only if i ∈ S. Moreover, in our example, a composition with k parts corresponds to a subset of cardinality k 1 and to a 0-1 sequence with k 1 ones and thus n there are k1 of each of these. It is easy to give a that the set of compositions of n with parts 1 and 2 is equinumerous with the set of compositions of n + 2 with all parts at least 2: given a composition a1 +a2 + +ak of n+2 with all ai 2, we replace each ai with 2+1| + {z +1}

ai2 and then we remove the initial 2. If we let fn be the number of compositions of n with parts 1 and 2, then fn is easily seen to satisfy the recurrence fn = fn1 + fn2 for n 2, with the initial conditions f0 = 1 and f1 =1.Thusfn is a Fibonacci number. (The Fibonacci numbers are usually normalized by F0 = 0 and F1 =1,wehavefn = Fn+1.) As another example, if is a of [n], then we can express as a product of cycles, where each cycle is of the form i(i) 2(i) s(i) . We can also express as the linear arrangement of [n], (1) (2) (n). Thus the set of cycles {(1 4), (2), (3 5)} corresponds to the linear arrangement 42513. So we have a bijection between sets of cycles and linear arrangements. This simple bijection turns out to be useful. We use it to give a proof, due to Joyal (1981, p. 16), of Cayley’s formula for labeled trees. First note that the bijection implies that for any nite set S the number of sets of cycles of elements of S (each element appearing exactly once in some cycle) is equal to the number of linear arrangements of elements of S. The number of functions from [n]to[n] is clearly nn. To each such function f we may associate its functional digraph which has an arc from i to f(i) for each i in [n]. Now every weakly connected component of a functional digraph (i.e., connected component of the underlying undirected graph) can be represented by a cycle of rooted trees. So by the correspondence just given, nn is also the number of linear arrangements of rooted trees on n 2 [n]. We claim now that n = n tn, where tn is the number of trees on [n]. 2 It is clear that n tn is the number of triples (x, y, T ), where x, y ∈ [n] and T is a tree on [n]. Given such a triple, we obtain a linear arrangement of rooted trees by removing all

2 arcs on the unique path from x to y and taking the nodes on this path to be the roots of n2 the trees that remain. This correspondence is bijective, and thus tn = n . Prufer (1918) gave a dierent bijection for Cayley’s formula, which is easier to describe but harder to justify. Given a labeled tree on [n], let i1 be the least leaf (node of degree 1), and suppose that i1 is adjacent to j1. Now remove i1 from the tree and let i2 be the least leaf of the new tree, and suppose that i2 is adjacent to j2. Repeat this procedure until only two nodes are left. Then the original tree is uniquely determined by j1 jn2 and conversely any sequence j1 jn2 of elements of [n] is obtained from some tree. Thus the number of trees is nn2. Both proofs of Cayley’s formula sketched above can be rened to count trees according to the number of nodes of each degree, and thereby to prove the Lagrange inversion formula, which we shall discuss in Section 6. (See Labelle (1981).) There is another useful bijection between sets of cycles and linear arrangements which we shall call Foata’s transformation (see, for example, Foata (1983)) that has interesting properties. Given a permutation in cycle notation, we write each cycle with its least element rst, and then we arrange the cycles in decreasing order by their least elements. Thus in our example above, we would have = (35)(2)(14). Then we remove the parentheses to obtain a new permutation whose 1-line notation is ˆ =35214. If is a permutation of [n], then a left-right minimum (or lower record)of is an index i such that (i) <(j) for all j

This number is (up to sign) a Stirling number of the rst kind. We shall see them again in Sections 3 and 9. In Section 10 we shall need a variant of Foata’s transformation in which left-right maxima are used instead of left-right minima.

3. Generating Functions The basic idea of generating functions is the following: instead of nding the cardinality of a set S, we assign to each in S a weight w(P). Then the G(S) for S (with respect to the weighting function w)is ∈S w(). Thus the concept of generating function for a set is a generalization of the concept of cardinality. Note that S may be innite as long as the sum converges (often as a formal power ). The weights may be elements of any abelian , but they are usually monomials in a of polynomials or power series. In a typical application each element of S will l() have a ‘length’ l() and we take the weightP of to be x , where x is an indeterminate. l() Then knowing the generating function ∈S x is equivalent to knowing the number of elements of S of each length. Analogous to the product rule for cardinalities, |A||B| = |A B|, is the product rule for generating functions, G(A)G(B)=G(A B), where we take the ‘product weight’ on A B, dened by w (, ) = w()w().

3 As an example, suppose we want to count sequences of zeros and ones of length n according to the number of zeros they contain. We can identify the set of 0-1 sequences of length n with the {0, 1}n. If we weight {0, 1} by w(0) = x and w(1) = y then the product weight on {0, 1}n assigns to a sequence with j zeros and k (= n j) ones the weight xjyk.Thus X n (x + y)n = xjyk j j+k=n is the generating function for sequences of zeros and ones of length n by the number of zeros and the number of ones. If we want to count sequences of zeros and ones of all lengths with this weighting, we sum on n to obtain the generating function X∞ j + k 1 xjyk = . j 1 x y j,k=0 Now suppose we want to count compositions with parts 1 and 2. Rather than picking an integer n and considering the compositions of n, we pick an integer k and consider the set Ck of all compositions of any integer with exactly k parts (each part being 1 or 2). We k 2 may identify Ck with {1, 2} . If we assign 1 the weight x and 2 the weight x , where x is n an indeterminate, then the product weight of a composition of n in Ck is x .Thus X2k k G(C )=G({1, 2}k)=G({1, 2})k =(x + x2)k = xn. k n k n=k Thus there are k compositions of n with k parts, each part 1 or 2. As before, if we don’t nk P ∞ 2 k 2 1 care about the number of parts, we sum on k to obtain k=0(x + x ) =(1 x x ) as the generating function for all partitions into parts 1 and 2. By the same kind of reasoning, if A is any set ofP positive integers, then the generating function for compositions with k i k parts, all in A,is i∈A x and the generating function for compositions with any number P i 1 ofP parts, all in A,is 1 i∈A x . In particular, if A is the set of positive integers then i i∈A x = x/(1 x) so the generating function for compositions with k parts is k X∞ x n 1 = xn 1 x k 1 n=k for k>0 and the generating function for all compositions is 1 X∞ x 1 x x 1 = =1+ =1+ 2n1xn. 1 x 1 2x 1 2x n=1 The generating function for compositions with parts greater than 1 is ∞ 2 1 2 X x 1 x x n 1 = =1+ =1+ F x . 1 x 1 x x2 1 x x2 n 1 n=1

4 Similarly, the generating function for compositions with odd parts is 1 X∞ x x 1 =1+ =1+ F xn. 1 x2 1 x x2 n n=1

Thus we have proved using generating functions two of the results we proved using bi- jections in the preceding section. Notice that the generating functions take into account initial cases which did not arise in the bijective approach. For our next two examples, we consider another bijection for permutations. Suppose that is a permutation of [n]. We associate to a sequence a1a2 an of integers satisfying 0 aj j 1 for each j as follows: aj is the number of indices i(j). The sequence a1a2 an is called the inversion table of : an inversion of is a pair (i, j) with i(j), and thus aj is the number of inversions of of the form (i, j). It is not dicult to show that the correspondence between permutations and their inversion tables gives a bijection between the set Sn of permutations of [n] and the set Tn of sequences a1a2 an of integers satisfying 0 aj j 1 for each j. Note that Tn is the Cartesian product {0}{0, 1}{0, 1, ,n 1}. We shall use the inversion table to count permutations by inversions and also by cycles. Let I() be the number of inversions of . We would like to nd the generating function I() G(Sn) for permutations of [n] where each permutation is assigned the weight q .To do this we note that I() is the sum of the entries of the inversion table of , and thus if a1++an we assign the weight w(a)=q to a = a1a2 an ∈ Tn, then we have

n1 G(Sn)=G(Tn)=1 (1 + q) (1 + q + + q ).

Next we count permutations by left-right minima. It is clear that j is a left-right k minimum of if and only if aj = j 1. Thus if we assign the weight t to a permutation in Sn with k left-right minima, and to a sequence in Tn with k occurrences of aj = j 1, then we have

Xn k G(Sn)=G(Tn)=t(t + 1)(t +2)(t + n 1) = c(n, k)t , k=0 where c(n, k) is (by denition) the unsigned Stirling number of the rst kind. By Theorem 2.1, it follows that c(n, k) is also the number of permutations in Sn with k cycles.

4. Free Monoids Free monoids provide a useful way of organizing many simple applications of generating functions. Let A be a set of “letters.” The free A is the set of all nite sequences (including the empty sequence) of elements of A, usually called words, with the of . We can construct an from A by taking formal sums of elements of A with coecients in some ring. We write 1 for the empty sequence, which is the unit of this algebra. These formal sums are then in noncommuting variables. The generating function G(S) for a subset S of A is the sum of its elements.

5 If S and T are subsets of A, we write ST for the set {st | s ∈ S and t ∈ T }.We say that the product ST is unique if every element of ST has only one such factorization. The fundamental fact about generating functions is that if the product ST is unique, then G(ST)=G(S)G(T ). More generally, we may dene a free monoid to be a set together with an associative which is isomorphic to a free monoid as dened above. Let A+ = A \{1} and suppose that S is a subset of A+ such that for each k, every element of Sk has a unique factorization s1s2 sk with each si in S. SuchS a set S is sometimes called a ∞ i uniquely decodable code, or simply a code. Then S = i=0 S is a free monoid. We call the elements of S the primes of the free monoid S. In this case X∞ G(S)= G(S)i = 1 G(S) 1. i=0

In particular, G(A)= 1 G(A) 1. Among the simplest free monoid problems are those dealing with compositions of integers, as we saw in the previous section. A composition of an integer is simply an element of the free monoid P, where P is the set of positive integers. As a more interesting example, let A = {X, Y }, let S be the subset of A consisting of words with equal numbers of X’s and Y ’s, and let T be the subset of A of words with no nonempty initial segment in S. Then A = ST uniquely, so G(A)=(1 X Y )1 = G(S)G(T ). Moreover, S is a free monoid U , where U is the set of words in S which cannot be factored nontrivially in S. The sets S, T , and U have simple interpretations in terms of walks in the plane, starting at the origin. If X and Y are represented by unit steps in the x and y directions, then S corresponds to walks which end on the main diagonal, T corresponds to walks that never return to the main diagonal, and U corresponds to walks that return to the main diagonal only at the end. It is often useful to replace the noncommuting variables by commuting variables. If we replace the letter X by the variable x, we are assigning X the weight x. (More formally, we are applying a in which the of X is x.) In our example, if we weight X and Y by commuting variables x and y, then G(A) P∞ becomes 1/(1 x y) and G(S) becomes 2n xnyn =(1 4xy)1/2 since there are n=0 n √ 2n G n ways of arranging nX’s and nY’s. Thus (T ) becomes 1 4xy/(1 x y). It can be shown that this is equal to X∞ |m n| m + n (4.1) xmyn, m + n n m,n=0 where the constant term is 1. The coecients in (4.1) are called ballot numbers and we shall see them again in Section 6. If we replace x and y by the same variable z, the generating function for T becomes √ 1 4z2 1+2z = √ . 1 2z 1 4z2

6 2n Thus the number of words in T of length 2n is n and the number of words in T of length 2n 2n + 1 is 2 n . Although we usually work with formal power series, it is sometimes useful for variables to take on real values. We derive an inequality called McMillan’s inequality which is useful in information theory. (See McMillan (1956).) Let A be an alphabet (set of letters) of size r, and let S beacodeinA, so that S is a free monoid. Let us weight each letter of A by t, and let G(S)=p(t). We know that G(S)= 1p(t) 1 as formal power series in t. Since there are rk words in A of length k, the coecient of tk in 1 p(t) 1 is at most rk.If0<<1/r then the P ∞ k k 1 1 1 series k=0 r converges absolutely to (1 r) , and thus 1 p() (1 r) , which implies p() r. Taking the limit as approaches 1/r from below, we obtain p(1/r) 1. Thus we have proved the following:

(4.2) Theorem. Let S be a uniquely decodable code in anP alphabet of size r, and for ∞ k each k let pk be the number of words in S of length k. Then k=1 pkr 1.

In some applications of free monoids, the ‘letters’ have some internal structure. For example, consider the set of permutations of [n]={1, 2,...n} satisfying |(i) i|1. We can represent a permutation of [n] as a digraph with node set [n] with an arc from i to (i) for each i. If we draw the digraph with the nodes in increasing order, we get a picture like this one, which corresponds to the permutation 2143576:

It is clear that these permutations form a free monoid with the two ‘letters,’ or primes

and

Thus the generating function (by length) for these permutations is (1 x x2)1, and there is an obvious bijection between these permutations and compositions with parts 1 and 2. Sometimes it is easier to count all the elements of a free monoid than just the primes. If we represent arbitrary permutations as in the previous example, then we have a free monoid in which the primes, called indecomposable permutations, are those permutations of [n] (for some n) such that for 1 i

7 For example, 4213 is indecomposable:

but21534 is not:

Thus if g(x) is the generating function for indecomposable permutations, we have X∞ n! xn = 1 g(x) 1, n=0 so ! X∞ 1 g(x)=1 n! xn , n=0 as shown by Comtet (1972). 5. Circular words We now study some properties of words in which the letters are thought of as arranged in a circle, so that the last letter is considered to be followed by the rst. (This should not be confused with the problem of equivalence classes of words under , which we discuss in Section 14.) We dene the cyclic shift operator C on words by

Ca1a2 ak = a2a3 aka1. A conjugate or cyclic permutation of a w is a word of the form Cmw for some m.If S is a set of words, then we dene S to be the set of all conjugates of words in S. Suppose that S is a free submonoid of the free monoid A , and let w = s1s2 sk be k i k an element of S , where each si is in S. It is clear that C w ∈ S whenever i takes on any of the k values 0, l(s1), l(s1s2), ..., l(s1s2 sk1), where l(v) denotes the length of the word v. If these are the only values of i, with 0 i

8 (5.1) Theorem. Suppose that S is cyclically free and let Q = Sk ∩ An. Then k|Q| = n|Q|. Proof. We count pairs (i, w), where Ciw ∈ Q and 0 i

X∞ 0 tzg (z) |(Sk) ∩ An|tkzn = . 1 tg(z) n,k=1

Equivalently, X∞ zn 1 |(Sk) ∩ An|tk = log . n 1 tg(z) n,k=1

We can use Theorem (5.1) to count the number of k-subsets of [n] with no two con- secutive elements, where 1 and n are considered consecutive. We take S = {ab, b}. The subsets we want correspond to words in Snk ∩{a, b}n . These words contain n letters, of which n k are b’s, and hence k are a’s. The positions of the a’s in one of these words determines the subset. S is clearly cyclically free, so by Theorem (5.1), the number of n nk such subsets is nk k . Our next example will be useful in proving the Lagrange inversion formula in the next section. Let be any function from A to the real numbers. Extend to all of A by dening (a1a2 ak)=(a1)++ (ak). Dene R by (5.3) R = { w | if w = uv with u =6 1 then (u) < 0 }.

It is easily veried that R is a cyclically free submonoid of A. The following description of R is the key step in our proof of the Lagrange inversion formula in the next section: (5.4) Lemma. Let R and be as above. Then R = {1}∪{w | (w) < 0 }. Proof. We need only show that if (w) < 0 then for some i, then Ciw ∈ R. Of the heads (initial segments) h of w which maximize (h), let u be the longest, and let w = uv. Then vu is easily veried to be in R.

6. Lagrange inversion In the last example, let A = {x1,x0,x1,x2, } and dene : A → Z by (xi1 xim )=i1 + + im.

9 Let R be as in (5.3) and let S = { w | w ∈ R and (w)=1. }. We claim that R = S. Since we know that R is a free monoid, we need only show that if w is a prime of R then (w)=1. To see this, let w be a prime of R. Since (w) < 0 and (xi) 1 for each xi ∈ A, w must have a head h with (h)=1. Let w = uv, where u is the longest head of w for which (u)=1. Then v must be in R, since otherwise w would have a longer head h with (h) 1. Since w is a prime of R, this means w = u. It follows that if v is any word in R with (v)=k then v ∈ Sk. Now let v be any word in S and suppose v = uxi. Then u is in R with (u)+i = 1, so (u)=1 i, and thus u ∈ Si+1. It follows that

[∞ i+1 (6.1) S = S xi, i=1

where the is disjoint. We are now ready to prove the Lagrange inversion formula. We use the notation [xn]F (x) to denote the coecient of xn in F (x). P ∞ n (6.2) Theorem. Let g(u)= n=0 gnu , where the gn are indeterminates. Then there is a unique formal power series f in the gn satisfying f = g(f), and for k>0,

X∞ k (6.3) f k = [unk]g(u)n. n n=1

Proof. It is easily seen that the equation f = g(f) has a unique solution. Let us assign to the letter xi the weight gi+1 and let f be the image of G(S) under this assignment. Then from (6.1) we have X∞ i+1 f = f gi+1 = g(f). i=1 By the weighted version of Theorem (5.1), the sum of the weights of the words of length n in Sk is k/n times the sum of the weights of the words in (Sk ∩ An). But by Lemma (5.4), the sum of the weights of the words in (Sk ∩ An) is g(u) n [uk] =[unk]g(u)n. u

The proof we have just given is essentially that of Raney (1960). It is clear that if the gi are assigned values that are not necessarily indeterminates, then the theorem still holds as long as the sum in (6.3) converges as a formal power series and f is uniquely determined as a formal power series by Pf = g(f). The usual formulation of Lagrange ∞ n inversion is obtained by taking g(u)=z n=0 rnu , where z is an indeterminate and the rn are arbitrary.

10 One of the most important applications of Lagrange inversion is to the enumeration of ordered trees. (An ordered tree is a rooted unlabeled tree in which the children of any node are linearly ordered.) Let us weight a node with i children in an ordered tree by gi, and weight the tree by the product of the weights of its nodes. If f is the sum of the weights of all ordered trees, then since an ordered tree consists of a root together with some number (possibly zero) of children, each of which may be an arbitrary ordered tree, we have X∞ i f = gif = g f , i=0 P ∞ i where g(u)= i=0 giu . The Lagrange inversion formula then yields the following:

(6.4) Theorem. The number of k- of ordered trees in which a total of ni nodes have i children is k n P , where n = i ni, n n0,n1,n2,

if n1 +2n2 +3n3 = n k, and 0 otherwise. It is not hard to derive Theorem (6.2) from Theorem (6.4), so any other proof of Theorem (6.4) (for example, by induction), yields a proof of the Lagrange inversion formula. Our approach can also be used to give a purely combinatorial proof of (6.4) without the use of generating functions. A few special cases of (6.2) and (6.4) are especially important. If there are a nodes with 2 children, b nodes with no children, and no other nodes, then with b = a + k the number of k-tuples of such trees is k n k 2a + k = . n a 2a + k a

These numbers are called ballot numbers. The special case k = 1 gives the Catalan numbers 1 2a +1 1 2a = . 2a +1 a a +1 a 6 To apply (6.2) directly to this case, we√ may take g0 =1,g2 = x, and gi = 0 for i =0, 2. Then f satises f =1+xf 2,sof = 1 1 4x /2x, and we obtain √ k X∞ 1 1 4x k 2a + k = xa. 2x 2a + k a a=0

To count all ordered trees we set gi = x for all i, to obtain the equation f(x)= x/ 1 f(x) , with the solution √ k X∞ X∞ 1 1 4x k 2n k 1 k 2n + k f(x)k = = xn = xn+k, 2 n n k 2n + k n n=0 n=0

11 so we again obtain the Catalan and ballot numbers. It is an instructive exercise to nd a bijection between these classes of trees, and to relate these results to formula (4.1). Our analysis gives a well-known bijection between ordered trees and words in S. The code c(t) for a tree T may be dened as follows: If the root of T has no children, then c(T )=x1. Otherwise, if the children of the root of T are (in order) the roots of trees T1, T2, ..., Tk, then c(T )=c(T1) c(Tk)xk1. For another example, we dene a binary tree to be a rooted tree in which every node has a left child, a right child, neither, or both. Thus

and

are dierent binary trees. Let us weight a binary tree with n nodes, i left children, and j right children by xnLiRj. Then if f is the generating function for these trees, we have f = x(1 + Lf)(1 + Rf), and thus by Lagrange inversion we have X∞ Xnk k n n f k = LiRnkixn. n i i + k n=k i=0 1 n n For k = 1, the numbers are called Runyon numbers or Narayana numbers. n i i +1

7. The transfer matrix method Many enumeration problems can be transformed into problems of counting walks in di- graphs, which can be solved by the transfer matrix method. Suppose D is a nite digraph. To every arc of D we associate a weight. Let M be the matrix in which rows and columns are indexed by the nodes of D and the (i, j) entry of M is the sum of the weights of the arcs from i to j. Then by the denition of matrix , the (i, j) entry in M k is the sum of the weightsP of all walks of k arcs from i to j. It follows that (as long as the ∞ k 1 innite sums exist) k=0 M =(I M) counts all walks, where I is the identity matrix, and trace (I M)1 counts walks that end where they begin. For example, consider the following problem: Given integers n and i, what is the number t(n, i) of sequences a1a2 an of 0’s, 1’s, and 1’s with a1 ++an i (mod 6)? Here we take D to be the digraph with node set {0, 1, 2, 3, 4, 5} and an arc from each j to j 1, j, and j + 1, reduced modulo 6. We weight each arc by x.SoM is   xx000x    xxx000    0 xxx00   .  00xxx0  000xxx x 000xx

12 We nd that (I M)1 is the circulant matrix with rst column   1 2 1  + + +2  1 3x 1 2x 1+x   1 1 1   + 1   1 3x 1 2x 1+x     1 1 1   + 1  1  1 3x 1 2x 1+x   1 2 1  6    +2  1 3x 1 2x 1+x   1 1 1   + 1   1 3x 1 2x 1+x  1 1 1 + 1 1 3x 1 2x 1+x Thus for n>0, t(n, 0)=(3n +2n+1 +(1)n)/6 t(n, 1) = t(n, 5)=(3n +2n (1)n)/6 t(n, 2) = t(n, 4)=(3n 2n +(1)n)/6 t(n, 3)=(3n 2n+1 (1)n)/6. As another example, how many 0-1 sequences are there with specied numbers of occurrences of 00, 01, 10, and 11? Here we take D to be the weighted digraph

x 01

x 0 00 1 x 11

x 10

Then 1 1 x x (I M) 1 = 00 01 x 1 x 10 11 . 1 x11 x01 x10 1 x00 = (1 x00)(1 x11) x01x10 Thus, for example, the generating function for 0-1 sequences beginning with 0 and ending with 1 is X∞ x xi+1xi 01 = 01 10 (1 x )(1 x ) x x (1 x )i+1(1 x )i+1 00 11 01 10 i=0 00 11 X i + j i + k = xi+1xi xj xk . 01 10 00 11 j k i,j,k

13 i+j i+k so j k is the number of 0-1 sequences beginning with 0 and ending with 1, with i occurrences of 10 (and thus i+1 occurrences of 01), j occurrences of 00, and k occurrences of 11 (and thus i + j + 1 zeros and i + k + 1 ones). The transfer matrix method can often be used to show that a generating function is rational. For example, consider the problem of counting the number of ways of covering an m n rectangle with a xed nite set of polyominos. It is not hard to show, using the transfer matrix method, that for xed m the generating function on n is rational, although it is dicult to give an explicit formula. We will see another example of this type in Section 10.

8. Multisets and partitions We have so far considered problems involving linear arrangements. In this and the next section we turn to unordered collections. We rst consider the problem of counting mul- tisets, which are sets with repeated elements allowed. More formally, a on a set S is a function from S to the nonnegative integers; if is a multiset then (s) represents the multiplicity of s. IfQ each element s in S has a weight w(s), then we dene the weight (s) of the multiset to be s∈S w(s) . For each s in S, let Ms be a set of positive integers. Then the sum of the weights of all multisets on S such that (s)isinMs for each s in S is easily seen to be Y X w(s)i.

s∈S i∈Ms

We give a few examples. Let us take w(s)=x for all s in S, and assume |S| = n.If Ms = {0, 1} for each s, we are counting subsets, and the generating function is Xn n (1 + x)n = xk. k k=0

If Ms is the set of all nonnegative integers for each s, we are counting unrestricted multisets, and the generating function is X∞ n + k 1 (1 + x + x2 + )n =(1 x)n = xk. k k=0

If Ms = {0, 1,...,m} for each s, the generating function is n X∞ X 1 xm+1 n n + k (m +1)i 1 (1 + x + + xm)n = = xk (1)i . 1 x i k (m +1)i k=0 i

A multiset of positive integers with sum k is called a partition of k. The elements of a partition are called its parts. It is customary to list the parts of a partition in decreasing order, so a partition of k is often dened as a (weakly) decreasing sequence of positive

14 i integers with sum k. To count partitions, we weightQ i by q , where q is an indeterminate. ∞ i 1 Then the generating function for allQ partitions is i=1(1 q ) and the generating function ∞ i for partitions with distinct parts is i=1(1 + q ). Many theorems in the theory of partitions assert that one set of partitions is equinu- merous with another. The simplest of these, due to Euler, is that the number of partitions of n with odd parts is equal to the number of partitions of n with distinct parts. To prove this, we note that the generating function for partitions with odd parts is

Y Y∞ Y∞ (1 qi)1 = (1 qi)1 (1 q2j) i odd i=1 j=1 Y∞ Y∞ 1 q2i = = (1 + qi), 1 qi i=1 i=1 which is the generating function for partitions with distinct parts. It is not dicult to give a combinatorial proof of this result: Suppose is a partition with odd parts. If contains the odd part i with multiplicity k, let k =2e1 +2e2 + es +2 , where 0 e1

X∞ 2 Y∞ qi 1 = . (1 q)(1 q2) (1 qi) (1 q5j+1)(1 q5j+4) i=0 j=0

No simple bijective proof of (8.1) is known. A complicated bijective proof was found by Garsia and Milne (1981). We now prove an identity called the q-binomial theorem, which has many applications n1 to partitions. We introduce the notation (a)n for (1 a)(1 aq) (1 aq ), where q 2 n isQ understood. In particular, (q)n =(1 q)(1 q ) (1 q ). We also write (a)∞ for ∞ i i=0(1 aq ). (8.2) Theorem. (The q-binomial theorem.)

X∞ (a) (at)∞ n tn = . (q) (t)∞ n=0 n

15 Proof. Let X∞ (at)∞ n = fnt . (t)∞ n=0 Then X∞ (at)∞ (at)∞ n =(1 t) =(1 t) fnt . (tq)∞ (t)∞ n=0 But also, X∞ (at)∞ (atq)∞ n n =(1 at) =(1 at) fnq t . (tq)∞ (tq)∞ n=0 Equating coecients of tn,wehave

n n1 fn fn1 = q fn aq fn1,n 1,

n n1 and thus fn(1 q )=fn1(1 aq ). Since f0 = 1, this gives

Yn 1 aqi 1 (a) f = = n . n 1 qi (q) i=1 n Two cases are particularly worth noting. If a = qm, where m is a positive integer, then we have X∞ (qm) 1 (8.3) n tn = . (q) (t) n=0 n m The q-binomial coecient is dened to be n (q) = n . k (q)k(q)nk

m Since (q )n =(q)m+n1/(q)m1, we may rewrite (8.3) as

X∞ m + n 1 1 (8.4) tn = . n (1 t)(1 tq) (1 tqm1) n=0 n It follows from (8.4) that k is a polynomial in q that reduces to the binomial coecient n k for q =1. We can use (8.4) to count partitions with at most n parts, each part at most m.Itis clear that the desired generating function is the coecient of tn in 1 1 m = , (1 t)(1 tq) (1 tq ) (t)m+1

16 m+n and by (8.4) this is n . The case a = qm of the q-binomial theorem yields similarly (after changing q to q1 and t to t/q) Xm n n m m1 (8.5) t q(2) =(1+t)(1 + tq) (1 + tq ), n n=0

which implies that the generating function for partitions with n distinct parts, all less n (2) m than m, where 0 is allowed as a part, is q n . This result may be derived directly from our previous generating function for partitions with repeated parts allowed, since every partition with distinct parts is obtained uniquely from an unrestricted partition by adding 0 to the smallest part, 1 to the next smallest, and so on. There is an important for q-binomial coecients in terms of vector spaces over nite elds. (See, for example, Stanley (1986), p. 28, for the proof.)

(8.6) Theorem. Let q be a prime power. Then the number ofk-dimensional subspaces n of an n-dimensional over a eld with q elements is k . A comprehensive reference on the theory of partitions is Andrews (1976).

9. Exponential generating functions

If a0, a1, ...is a sequence of numbers, the power series

X∞ xn a n n! n=0

is called the exponential generating function for the sequence. Exponential generating functions arise in counting ‘labeled objects.’ Their usefulness comes from the fact that xm xn m + n xm+n = . m! n! m (m + n)!

If A is an object with label set [m] and B is an object with label set [n], we can combine m+n 0 0 them in m ways to get an object (A ,B) with label set [m + n]: We rst choose an m- element subset S of [m + n] and replace the labels of A with the elements of S (preserving their order) to get A0, and in the same way we get B0 from B and [m + n] \ S. Thus if f(x) and g(x) are exponential generating functions for classes of labeled ob- jects, then their product f(x)g(x) will be the exponential generating function for ordered pairs of these objects. For example, the exponential generating function for nonempty sets is X∞ xn ex 1= n! n=1

17 since the elements of [n] can be arranged as a nonempty set in one way if n>0 and in no ways if n =0.Thus X∞ xn (ex 1)2 = (2n 2) n! n=2 is the exponential generating function for ordered partitions of a set into two nonempty blocks. More generally, (ex 1)k is the exponential generating function for ordered parti- tions of a set into k nonempty blocks, and X∞ 1 (ex 1)k = 2 ex k=0 is the exponential generating function for all ordered partitions of a set. Now suppose that f(x) is the exponential generating function for a of labeled objects and that f(0) = 0. As we have seen, f(x)k is the exponential generating function for k-tuples of these objects. Every k-set can be arranged into a k- in k! ways, so f(x)k/k! is the exponential generating function for k-sets of these objects. Thus, for example, (ex 1)k/k! is the exponential generating function for partitions of a set into k blocks. The numbers S(n, k) dened by

X∞ (ex 1)k xn (9.1) = S(n, k) k! n! n=0 are called Stirling numbers of the second kind. If we sum on k we obtain the exponential x generating function exp(e 1) for all partitions of a set. The coecients Bn dened by

∞ X n x x B = ee 1 n n! n=0 are called Bell numbers. In general ef(x) counts sets of labeled objects each counted by f(x). Another important application of this principle (often called the ‘exponential formula’) is to the enumeration of permutations by cycle structure. A permutation may be considered as a set of cycles. If we weight a cycle of length i by ui and weight a permutation by the product of the weights of its cycles, then the exponential generating function for cycles is X∞ X∞ xn xn (n 1)! u = u , n n! n n n=1 n=1 and thus the exponential generating function for permutations by cycle structure is X∞ n exp( unx /n). n=1

18 If we set un = u for all n, then we are counting permutations by the number of cycles, and we obtain the generating function for the (unsigned) Stirling numbers of the rst kind,

X∞ X∞ Xn xn xn (1 x)u = u(u +1)(u + n 1) = c(n, k)uk, n! n! n=0 n=0 k=0 which we derived in a dierent way in Section 3. In some cases, there is a simpler expression for ef(x) than for f(x). For example, any labeled graph is a set of connected labeled graphs. Thus if g(x) is the exponential generating function for connected labeled graphs, then eg(x) is the exponential generating n function for all labeled graphs. But there are 2(2) labeled graphs on [n], so ! ∞ X n n x g(x) = log 2(2) . n! n=0 Exponential generating functions often satisfy simple dierential equations which can be explained combinatorially. If X∞ xn f(x)= f n n! n=0 then X∞ xn f 0(x)= f , n+1 n! n=0 so an object counted by f 0(x) with label set [n] is the same as an object counted by f(x) with label set [n + 1]. For example, let X∞ xn 1 f(x)= n! = n! 1 x n=0 be the exponential generating function for permutations (considered as linear arrangements of numbers). Then f 0(x) counts permutations of [n + 1] in which only the numbers in [n] are considered to be labels. We can consider n + 1 to be a ‘marker’ that separates the original permutation into a pair of permutations on [n], and we obtain the dierential equation f 0(x)=f(x)2. This decomposition can be used to obtain more information about permutations, as we shall see next. A descent of the permutation a1a2 an is an i for which ai >ai+1. It is convenient to count n as a descent also, if n>0. Let X∞ xn A(x)= A (t) n n! n=0 be the exponential generating function for permutations by descents, where a permutation k with k descents is weighted t . If we take a permutation = a1a2 an+1 on [n +1]

19 and remove the element n + 1, we are left with two permutations, 1 = a1a2 aj1 and 2 = aj+1 an+1, where aj = n+1. The number of descents of is the sum of the number of descents of 1 and 2 unless 1 is empty, when has an additional descent. Thus we obtain the dierential equation A0(x)= A(x) 1 A(x)+tA(x), together with the initial condition A(0) = 1. The dierential equation is easily solved by separation of variables, yielding

1 t A(x)= 1 te(1t)x

The polynomials An(t) are called Eulerian polynomials and their coecients are called Eulerian numbers. As another example, let us dene an up-down permutation to be a permutation > a1a2 an satisfying a1 a3

X∞ x2n+1 T (x)= D . 2n+1 (2n + 1)! n=0

Removing 2n+1 from an up-down permutation of [2n+1] for n 1 leaves a pair of up-down permutations of odd length. Taking into account the exceptional case n = 0, we obtain the dierential equation T 0(x)=T (x)2 + 1, with the initial condition T (0) = 0. Solving the dierential equation yields T (x) = tan x. The numbers D2n+1 are called tangent numbers. For the generating function X∞ x2n S(x)= D , 2n (2n)! n=0 a similar analysis yields the dierential equation S0(x)=T (x)S(x), with the initial con- dition S(0) = 1, which has the solution S(x) = sec x. The numbers D2n are called secant numbers. We will show that S(x) = sec x by a dierent method in Section 11. Another application of exponential generating functions is to the enumeration of la- beled rooted trees. Since a rooted tree can be represented as a root together with a set of subtrees, the exponential generating function t(x) for rooted trees satises

t(x)=xet(x).

We can solve this equation by the Lagrange inversion formula, and we obtain X∞ t(x)k n 1 xn = nnk , k! k 1 n! n=k

20 which for k = 1 gives a formula equivalent to Cayley’s.

10. Permutations with restricted position In this and the next several sections we discuss methods for dealing with formulas that involve . One way to deal with such formulas is to replace them with equivalent formulasP having onlyP positive terms. The example we give here isP based on the factP that the k k k k formula k Akt = k Bk(t 1) is equivalent to the formula k Ak(t +1) = k Bkt . (10.1) Theorem. Let R be a subset of [n] [n]. For any permutation of [n], let r() be the number of values of i ∈ [n] for which i, (i) ∈ R. Let

Xn X k r() a(t)= akt = t ,

k=0 ∈Sn where Sn is the set of permutations of [n]. Let bk be the number of k-subsets of R in which no two pairs agree in either coordinate. Then

Xn k a(t)= bk(n k)! (t 1) . k=0

In particular, Xn k a0 = a(0) = bk(n k)! (1) . k=0

Proof. We prove that Xn k a(t +1)= bk(n k)! t k=0 ∈S ∩ by counting in two ways pairs (, Q), in which n and Q G() R, where G()= { i, (i) | i ∈ [n] }. We weight such a pair by t|Q|. First, we have X X X X t|Q| = t|Q| = (t +1)|G()∩R| = a(t +1). (,Q) QG()∩R

Second, we have X X t|Q| = { | G() Q }t|Q|. (,Q) QR

If G() Q then Q does not contain two ordered pairs which agree in either coordinate, and if this condition is satised, Q can beP expanded to the graph of a permutation in | | n k (n Q )! ways. Thus the sum is equal to k=0 bk(n k)! t .

21 Theorem 10.1 is often proved by inclusion-exclusion, which we discuss in Section 12. See, for example, Riordan (1958), chapters 7 and 8. For our rst example, let R = { (i, i) | i ∈ [n] }. Then a(t) counts permutations by n xed points. Here b(k)= k ,so Xn (t 1)k a(t)=n! , k! k=0 P n k and in particular, a0 = n! k=0( 1) /k! is the number of derangements (permutations without xed points) of [n], denoted dn. Next we consider the case R = { (i, j) | ij 0or1 (modn) }, which is the classical probleme des menages. Here we can evaluate bk by a simple trick, but the generalizations in which i j 0, 1,...,s (mod n) could be solved by the transfer matrix method. Let us set p2i1 =(i, i) for 1 i n, p2i =(i, i + 1) for 1 i n 1 and p2n =(n, 1). Then bk is the number of k-subsets of {p1,...,p2n} containing no pi and pi+1 (or p2n and p1). Then as we saw in Section 5, 2n 2n k b = , k 2n k k and thus Xn 2n 2n k a = (1)k (n k)! . 0 2n k k k=0

Finally, let us take R = { (i, j) | i>j}. Then bk is the Stirling number S(n, n k). We prove this by giving a bijection between k-subsets of R counted by bk and partitions of [n] with n k blocks: to the subset {(i1,j1), (i2,j2),...,(ik,jk)} of R counted by bk we associate the nest partition in which is and js are in the same block for each s.Thus

Xn a(t)= S(n, n k)(n k)! (t 1)k. k=0

If we call this polynomial an(t), then a straightforward computation using (9.1) shows that X∞ X∞ xn t 1 1 t xn a (t) = =1+t1 1+ =1+ t1A (t) , n n! t e(t1)x 1 te(1t)x n n! n=0 n=1 where An(t) is the Eulerian polynomial. Similarly, if we had taken R = { (i, j) | i j }, we would have found a(t)=An(t) for all n. Thus for n 1 the three polynomials X X X t1+|{ i | (i)>(i+1) }|, t|{ i | i(i) }|, and t1+|{ i | i>(i) }|,

∈Sn ∈Sn ∈Sn

22 are all equal. A combinatorial proof is easily found through Foata’s transformation: For example, if =57216384 (where the dots represent descents) then Foata’s transformation takes to

1 =(57)(2)(1 6384), in which occurrences of (i) >(i + 1) together with the extra descent at the end have been transformed into occurrences of i 1(i). The variant of Foata’s transformation with left-right maxima instead of minima trans- forms to 2 = (5)(72163)(84), in which occurrences of (i) >(i+1) have been transformed into occurrences of i>2(i).

11. Cancellation In this section we consider a technique for simplifying sums of positive and negative terms by cancellation. We have two sets A+ and A, which we think of as ‘positive objects’ with sign +1 and ‘negative objects’ with sign 1. We want to nd a combinatorial interpretation to |A+||A|. We do this by nding a partial pairing of positive objects with negative objects; then |A+||A| will be equal to the contribution from the unpaired objects. (11.1) Theorem. Let A = A+ ∪ A and suppose that there is subset B of A and an ω dened on A \ B which is sign reversing: if ω(x) is dened, then x ∈ A+ if and only if ω(x) ∈ A. Then |A+||A| = |A+ ∩ B||A ∩ B|. In most (but not all) applications, B is a subset of either A+ or A. As an example, we give a combinatorial proof of the identity Xn n r + k r (1)k =(1)n . k m m n k=0

Let us rst consider the special case m = 0, which we may write as Xn n 1,n=0; (1)k = k 0,n>0. k=0

It is clear that we should take A to be the set of subsets of [n], with A+ the subsets of even cardinality and A the subsets of odd cardinality. For n>0 we want to nd a sign- reversing involution on all of A, so that B = ∅. Clearly the given by ω(K)=K {1} has the right properties, where denotes the symmetric dierence. Now we consider the general case. Let R be an r-element set disjoint from [n]. We may take A to be the set of all pairs (K, M), where K is a subset of [n] and M is an ∪ | | n r+k m-subset of R K. Then the number of such pairs with K = k is k m . We take

23 the sign of (K, M)tobe(1)|K|. It is not immediately obvious what B should be, but we may try to construct a sign-reversing involution on as large a subset of A as possible, and B will be whatever is left over. Given a pair (K, M) ∈ A, let j be the least element of [n] \ M if [n] \ M is nonempty. Then we set ω (K, M) =(K {j},M). This is clearly a sign-reversing involution. The only pairs (K, M) for which it is not dened are those for which [n] M. But if [n] M then since M ∩ [n] K, we must have K=[n] and M r must consist of [n] together with an (m n)-subset of R. There are mn of these and they all have sign (1)n. Thus the identity is proved. For our next example, let Dn be the number of up-down permutations of [n], as dened in Section 9. We give a completely dierent proof that X∞ x2n (11.2) D = sec x. 2n (2n)! n=0

If we multiply both sides of (11.2) by cos x and equate coecients of x2n/(2n)!, we see that (11.2) is equivalent to the recurrence Xn 2n 1, if n =0; (11.3) (1)n k D = 2k 2k 0, otherwise. k=0

The case n = 0 of (11.3) is clear. To interpret (11.3) for n>0, let A be the set of all ordered pairs (, ) such that for some subset S [2n] of even cardinality, is an up-down permutation of S and is the increasing permutation of [2n] \ S.If|S| has cardinality 2k then we give (, ) the sign (1)k. Thus for n = 8, a typical element of A is (1427, 3568). Now let =(a1a2 ...a2k,b1b2 ...b2n2k) be an element of A.Ifa2k >b1 or k =0,we dene ω()tobe(a1a2 ...a2kb1b2,b3 ...b2n2k) and if a2k

X∞ x2n+1 D = tan x 2n+1 (2n + 1)! n=0 can be proved similarly.

Following Zeilberger (1985), we now use a sign-reversing involution to prove the ‘ma- trix tree theorem,’ which gives a determinantal formula for the number of spanning ar- borescences of a digraph, rooted at a given node. Similar proofs have been found by several people, of whom the rst seems to be Temperley (1981). For each i, j, with 1 i, j n, letQ wij be an arbitrary weight. We dene the weight of a digraph on [n] to be the product wij over all arcs (i, j) of the digraph. We shall nd a formula for the sum of the weights of all arborescences on [n], rooted at n. (Then given any digraph D on [n], the number of spanning arborescences of D is obtaining by set wij equal to 1 for arcs (i, j)inD and to 0 for arcs not in D.)

24 First we observe that a determinant can be interpreted as a sum of signed weights of digraphs. A permutation digraph is a digraph in which every vertex has indegree and outdegree 1, or equivalently, in which every weakly connected component is a directed cycle. Any permutation of a set corresponds to the permutation digraph in which the arcs are i, (i) , and conversely, every permutation digraph is of this form. Now let M be the matrix (wij)i,j=1...n1. Then the determinant of M is equal to the sum over all permutations of [n 1] of

nY1 (11.4) (sgn ) (wi(i)). i=1

This product is clearly, up to sign, the weight of the permutation digraph correspondingQ r li+1 to . Now supposeQ that has r cycles, of lengths l1, l2, ..., lr. Then sgn = i=1( 1) n 1 r li r and ( 1) = i=1( 1) , so (11.4) is ( 1) times the weight of the permutation digraph corresponding to . Now consider the determinant

w21 + + wn1 w21 wn1,1 w12 w12 + w32 + + wn2 wn1,2 W = . . . . . .. . . . . . w1,n1 w2,n1 w1,n1 + + wn2,n1 + wn,n1

This is the determinant of M above, with wii replaced by

X wji. 1jn j=6 i

The digraphs that W counts will be obtained from permutation digraphs by replacing eachloop(i, i) with an arc (j, i) for some j =6 i (with j = n allowed), and the sign of such a digraph is (1)r, where r is the number of cycles length at least 2 in the original permutation digraph. More precisely, W is the sum of the signed weights of all pairs (P, T) of digraphs on [n] such that (1) P is a permutation digraph, with every cycle of length at least 2, on a set of nodes NP [n 1].

(2) T is a digraph without loops on [n] in which every node in [n 1] \ NP has indegree 1 and every node in NP ∪{n} has indegree 0. The signed weight of the pair (P, T)is(1)r times the product of the weights of P and T ,

25 where r is the number of cycles of P . Here is a typical pair (P, T):

T

P n

We now dene the sign-reversing involution ω on all pairs (P, T) such that either P or T contains a cycle: take the cycle containing the least vertex and transfer it from P to T or from T to P . Then ω is a weight-preserving sign-reversing involution that cancels all pairs except those in which P is empty and T is an arborescence rooted at n. Further examples of cancellation can be found in Stanton and White (1986).

12. Inclusion-exclusion The inclusion-exclusion principle is probably the most well-known technique for dealing with subtraction.

(12.1) Theorem.PLet f and g be two functionsP dened on the subsets of a nite set S |AB| such that f(A)= BA g(B). Then g(A)= BA( 1) f(B). Proof. We have X X (1)|AB|f(B)= (1)|AB|g(C) BA BA CB X X = g(C) (1)|AB| = g(A). CA CBA

A dual form of inclusion-exclusion may be proved the same way as Theorem (12.1): X X (12.2) f(A)= g(B) if and only if g(A)= (1)|BA|f(B). SBA SBA

26 An important special case of inclusion-exclusion occurs when f(A) and g(A) depend only on |A|, so we may write f(A)=f|A| and g(A)=g|A|. Then the between f and g may be written Xn Xn n n f = g and g = (1)nk f . n k k n k k k=0 k=0

PThese relations may be expressedP in terms of exponential generating functions: if F (x)= ∞ n ∞ n x x n=0 fnx /n! and G(x)= n=0 gnx /n! then F (x)=e G(x) and G(x)=e F (x). Another form of inclusion-exclusion is often used: Suppose we have a nite set X of elements, each of which has certain ‘properties,’ and let S be the set of all such properties. For each subset A of S let f(A) be the number of elements of X having all the properties in A (and possibly others). P (12.3) Theorem. Let Mi = |A|=i f(A) and let Ni be the number of elements of X having exactly i properties. Then X l N = (1)li M , i i l li and in particular, N0 = M0 M1 + M2 .

Proof. For A S, let g(A) be the numberP of elements of X having the proper- ties in A and no others. Then f(A)= g(B), so by inclusion-exclusion, g(A)= P P BA |BA| BA( 1) f(B). Thus Ni = |A|=i g(A) and the result follows by a straightforward calculation. Our rst example of inclusion-exclusion is to permutation enumeration. The descent set D() of a permutation of [n]is{ i | (i) >(i +1)}. Fix n, and for A [n 1], let g(A) beP the set of permutations with descent set A. We shall nd a simple formula { } for f(A)= BA g(B). Let A = a1

a1,a2 a1, ,ak ak1,n ak P |AB| so g(A) is given explicitly by g(A)= BA( 1) f(B). If we set a0 = 0 and ak+1 = n, then g(A) can be expressed compactly as the determi- nant 1 (12.4) n! (aj ai1)! i,j=1,...,k+1

27 where we interpret 1/r!as0forr<0.Q To see this, suppose that (mij)isanr r matrix r 6 for which mij =0ifj