<<

BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY Volume 36, Number 2, Pages 113–133 S 0273-0979(99)00775-2 Article electronically published on February 16, 1999

WHY THE CHARACTERISTIC POLYNOMIAL FACTORS

BRUCE E. SAGAN

Abstract. We survey three methods for proving that the characteristic poly- nomial of a finite ranked lattice factors over the nonnegative integers and indi- cate how they have evolved recently. The first technique uses geometric ideas and is based on Zaslavsky’s theory of signed graphs. The second approach is algebraic and employs results of Saito and Terao about free hyperplane ar- rangements. Finally we consider a purely combinatorial theorem of Stanley about supersolvable lattices and its generalizations.

1. Introduction The fundamental problem in enumerative combinatorics can be stated: given a set S, find a formula for its cardinality S . More generally, given a sequence of sets | | S0,S1,S2,... we would like to investigate properties of the sequence

(1) a0,a1,a2,...

where ai = Si ,i 0. From their definition we obviously have ai Z 0,the non-negative| integers.| ≥ ∈ ≥ We can also turn this problem around. Suppose we are given a sequence (1) where the ai are defined in a way that would permit them to be in the complex numbers, C. If, in fact, the terms are in Z 0, can we find a combinatorial explanation for this? ≥ One possibility, of course, would be to find a sequence of sets such that ai = Si . Such questions are of great interest currently in algebraic combinatorics. | | We are going to investigate a particular problem of this type: trying to explain why the roots of a certain polynomial associated with a partially ordered set, called the characteristic polynomial, often has all of its roots in Z 0. We will provide three explanations with tools drawn from three different areas≥ of : graph theory/geometry, algebra, and pure combinatorics. The first of these uses Zaslavsky’s lovely theory of signed graph coloring [57], [58], [59] which can be n n generalized to counting points of Z or of Fp inside a certain polytope [2], [15], [20], [26], [53]. (Here, Fp is the Galois field with p elements.) The next technique is based on theorems of Saito [41] and Terao [51] about free hyperplane arrangements. Work has also been done on related concepts such as inductive freeness [50] and recursive freeness [60]. The third method employs a theorem of Stanley [44] on

Received by the editors June 30, 1996, and in revised form November 30, 1998. 1991 Mathematics Subject Classification. Primary 06A07; Secondary 05C15, 20F55, 06C10, 52B30. Key words and phrases. Characteristic polynomial, free arrangement, M¨obius function, par- tially ordered set, signed graph, subspace arrangement, supersolvable lattice. Presented at the 35th meeting of the S´eminaire Lotharingien de Combinatoire, October 4–6, 1995.

c 1999 American Mathematical Society

113

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 114 BRUCE E. SAGAN

semimodular supersolvable lattices which has recently been generalized by Blass and me [14] by relaxing both restrictions on the lattice. Along the way we will meet many important combinatorial concepts. The rest of this paper is organized as follows. The next section will introduce the M¨obius function, µ, of a partially ordered set (poset) which is a far-reaching generalization of the one in number theory. In section 3 we will talk about gener- ating functions, an important way to manipulate sequences such as (1), and define the characteristic polynomial, χ, as the generating function for µ. The last three sections will be devoted to the three methods for proving that, for various posets, χ factors over Z 0. ≥

2. Mobius¨ functions and posets

In number theory, one usually sees the M¨obius function µ : Z>0 Z defined as → 0ifnis not square free, (2) µ(n)= ( 1)k if n is the product of k distinct primes,  − a definition which seems very strange at first blush. The importance of µ lies in the number-theoretic M¨obius Inversion Theorem.

Theorem 2.1. Let f,g : Z>0 Z satisfy → f(n)= g(d) dn X| for all n Z>0.Then ∈ g(n)= µ(n/d)f(d). dn X| Moving into the area of enumerative combinatorics, one of the very useful tools is the Principle of Inclusion-Exclusion or PIE. Theorem 2.2. Let S be a finite set and S ,... ,S S;then 1 n ⊆ n n S S = S S + S S +( 1)n S . | − i| | |− | i| | i ∩ j |−··· − | i| i=1 1 i n 1 i

Theorem 2.3. If f : Z 0 C then ≥ → ∆Sf(n)=f(n).

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use WHY THE CHARACTERISTIC POLYNOMIAL FACTORS 115

1, 2, 3 { } 3 18 QQ @  Q @ t  t Q t  Q @ 2 1, 2  1, 3 2, 3Q 6 @ 9 { } QQ { } QQ { } @ Q  Q  @ t t Q t Q t t t  Q  Q @ 1 1  2Q 3Q 2 @ 3 { } QQ { } {} @ Q  @ t t Q t  t t t Q  @ 0 Q @ 1 t t t ∅ The chain C3 The Boolean algebra B3 The divisor poset D18 Figure 1. Some example posets

One of the advantages of the combinatorial M¨obius function is that its inversion theorem unifies and generalizes the previous three results. In addition, it makes the definition (2) transparent, encodes topological information about posets [6], [40], and has even been used to bound the running time of certain algorithms [9]. We will now define this powerful invariant. Let P be a finite poset with partial order .IfPhas a unique minimal element, ≤ then it will be denoted 0=ˆ 0ˆP; and if it has a unique maximal element, then we will use the notation 1=ˆ 1ˆ .Ifx yin P , then the corresponding (closed) interval is P ≤ [x, y]= z : x z y { ≤ ≤ } and we let Int(P ) denote the set of all intervals of P .Notethat[x, y]isaposetin ˆ ˆ its own right with 0[x,y] = x, 1[x,y] = y.TheM¨obius function of P , µ :Int(P) Z, is defined recursively by → 1ifx=y, (3) µ(x, y)= xz

(4) µ(x, z)=δx,y x z y ≤X≤ where δx,y is the Kronecker delta. If P has a zero, then we define µ(x)=µ(0ˆ,x). Let us compute µ(x) in some simple posets. The chain, Cn, consists of the integers 0, 1,... ,n ordered in the usual manner; see Figure 1 for a picture of C . { } 3 It is immediate directly from the definition (3) that in Cn we have 1ifx=0 (5) µ(x)= 1ifx=1  0if− x2.  ≥ The Boolean algebra, Bn, has as elements all subsets of [n]:= 1,2,... ,n and as order relation, the case n = 3 being displayed in Figure 1.{ After computing} ⊆ the M¨obius function of B3, the reader will immediately guess that if x Bn then x ∈ µ(x)=( 1)| |. This follows easily from the following observations. The Cartesian −

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 116 BRUCE E. SAGAN

product of two posets P, Q is obtained by ordering the (x, y) P Q component- ∈ × wise. It is easy to prove directly from (3) that if 0ˆP and 0ˆQ exist, then

µP Q(x, y)=µP(x)µQ(y). × n Since Bn is isomorphic as a poset to the n-fold product (C1) , it is a simple matter to verify that its M¨obius function has the desired form using equation (5). The divisor poset, Dn, consists of all d n ordered by c d if c d. Figure 1 shows | ni ≤ | D18. Clearly if n has prime n = i pi , then we have an isomorphism D = C . So as with the Boolean algebra, we get the value of µ(d)tobeasin n ∼ i ni definition× (2), this time in a much more naturalQ way. In case the reader is not convinced that the definition (4) is natural, consider the incidence algebra of P , I(P ), which consists of all functions f :Int(P) C. The multiplication in this algebra is convolution defined by → f g(x, y)= f(x, z)g(z,y). ∗ x z y ≤X≤ Note that with this multiplication I(P ) has an δ :Int(P) C, → namely δ(x, y)=δx,y. One of the simplest but most important functions in I(P ) is the zeta function (so called because in I(Dn) it is related to the Riemann zeta function) given by ζ(x, y) = 1 for all intervals [x, y]. It is easy to see that ζ is 1 invertible in I(P )andinfactthatζ− =µwhere µ is defined by (4). The fundamental result about µ is the combinatorial M¨obius Inversion Theo- rem [40].

Theorem 2.4. Let P be a finite poset and f,g : P C. → 1. If for all x P we have f(x)= y xg(y),then ∈ ≤ g(x)= Pµ(y,x)f(y). y x X≤ 2. If for all x P we have f(x)= y xg(y),then ∈ ≥ g(x)= µP(x, y)f(y). y x X≥ It is now easy to obtain the Theorems 2.1, 2.2, and 2.3 as corollaries by using M¨obius inversion over Dn, Bn,andCn, respectively. For example, to get the Principle of Inclusion-Exclusion, use f,g : Bn Z 0 defined by → ≥ f(X)= S, |X| g(X)= S S , |X− i| iX [6∈ where SX = i X Si. ∩ ∈ 3. Generating functions and characteristic polynomials The (ordinary) generating function for the sequence (1) is the f(x)=a +a x+a x2+ . 0 1 2 ··· Generating functions are a powerful tool for studying sequences, and Wilf has written a wonderful text devoted entirely to their study [55]. There are several reasons why one might wish to convert a sequence into its generating function. It may be possible to find a closed form for f(x) when one does not exist for (an)n 0, ≥

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use WHY THE CHARACTERISTIC POLYNOMIAL FACTORS 117

or the expression for the generating function may be used to derive one for the sequence. Also, sometimes it is easier to obtain various properties of the an,such as a recursion or congruence relation, from f(x) rather than directly. By way of illustration, consider the sequence whose terms are

p(n) = number of integer partitions of the number n Z 0 ∈ ≥ where an integer partition is a way of writing n as an unordered sum of positive integers. For example, p(4) = 5 because of the partitions 4, 3+1, 2+2, 2+1+1, 1+1+1+1+1. There is no known closed form for p(n), but the generating function was found by Euler:

∞ ∞ 1 (6) p(n)xn = . 1 xk n=0 X kY=1 − To see this, note that 1/(1 xk)=1+xk+x2k+ , and so a term in the product obtained by choosing xik from− this expansion corresponds··· to choosing a partition with the integer k repeated i times. From (6) one can obtain all sorts of information about p(n), such as its asymptotic behavior. See Andrews’ book [1] for more details. Our main object of study will be the generating function for the M¨obius function of a poset, P , the so-called characteristic polynomial. Let P have a zero and be ranked so that for any x P all maximal chains from 0toˆ xhave the same length denoted ρ(x) and called the∈ rank of x.(Achain is a totally ordered subset of P , and maximal refers to inclusion.) The characteristic polynomial of P is then

ρ(1)ˆ ρ(x) (7) χ(P, t)= µ(x)t − . x P X∈ Note that we use the corank rather than the rank in the exponent on t so that χ will be monic. Let us look at some examples of posets and their characteristic polynomials, starting with those from the previous section. For the chain we clearly have

n n 1 n 1 χ(C ,t)=t t − =t − (t 1). n − − Now for the Boolean algebra we have

x n x n χ(B ,t)= ( 1)| |t −| | =(t 1) . n − − x [n] X⊆ Note that by the same argument, if k is the number of distinct primes dividing n, then

ρ(n) k k χ(D ,t)=t − (t 1) n − since the terms divisible by squares contribute nothing to the sum. As a fourth example, consider the partition poset, Πn, which consists of all set partitions of [n] (families of disjoint nonempty subsets whose union is [n]) ordered by refinement. 2 Direct computation with Π3 (see Figure 2) shows that χ(Π3,t)=t 3t+2 = (t 1)(t 2). In general − − − χ(Π ,t)=(t 1)(t 2) (t n+1). n − − ··· − Note that in all cases χ has only nonnegative integral roots.

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 118 BRUCE E. SAGAN

1, 2, 3 { } QQ  Q  t Q  Q 1, 2 3  1, 3 2 1 2, 3Q { }{ } QQ{ }{ } { }{ } Q  t Q t  t Q  Q

1 2t 3 { }{ }{ }

Figure 2. The partition poset Π3

Many of our example posets will arise as intersection lattices of subspace ar- rangements. A lattice, L, is poset such that every pair x, y L has a meet or greatest lower bound, x y, and a join or least upper bound, x ∈ y. All our lattices will be finite and so will∧ automatically have a zero 0=ˆ Land∨ a one 1=ˆ L.A subspace arrangement is a finite set V W (8) = K ,K ,... ,K A { 1 2 l} n of subspaces of real Euclidean space R .IfdimKi=n 1for1 i l,thenwesay that is a hyperplane arrangement and use H’s in place− of K≤’s. The≤ intersection A lattice of , L( ), has as elements all subspaces X Rn that can be written as an intersectionA ofA some of the elements of . The partial⊆ order is reverse inclusion, A so that X Y if and only if X Y .SoL( ) has minimal element Rn, maximal ≤ ⊇ A element K1 Kl, and join operation X Y = X Y . The reader can consult [7], [36] for more∩···∩ details about the general theory∨ of arrangements∩ which is currently a very active field. The characteristic polynomial of A is defined by (9) χ( ,t)= µ(X)tdim X . A X L( ) ∈XA This is not necessarily the same as χ(L( ),t)asdefinedin(7).If is a hyperplane arrangement, then the two will be equalA up to a factor of a powerA of t, so from the point of view of having integral roots there is no difference. In the general subspace case (7) and (9) may be quite dissimilar, and often the latter turns out to factor at least partially over Z 0 while the former does not. In the arrangement, case the roots of (9) are called the≥ exponents of and denoted exp .Infactwhen is the set of reflecting hyperplanes for a WeylA group W , then theseA roots are just theA usual exponents of W [51] which are always nonnegative integers. The reason that Weyl groups, as opposed to more general Coxeter groups, have well-behaved characteristic polynomials is that such groups stabilize an appropriate discrete subgroup of Zn. All of our previous example lattices can be realized as intersection lattices of subspace arrangements. The n-chain is L( ) with = K0,... ,Kn where Ki is the set of all points having the first i coordinatesA A zero.{ The Boolean} algebra is the intersection lattice of the arrangement of coordinate hyperplanes Hi : xi =0, 1 i n. By combining these two constructions, one can also obtain a realization of≤ the≤ divisor poset as a subspace arrangement. To get the partition lattice we use

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use WHY THE CHARACTERISTIC POLYNOMIAL FACTORS 119

χ( ,t) exp A t(tA 1)(t 2) (t n+1) 0,1,A2,... ,n 1 An − − ··· − − n (t 1)(t 3) (t 2n+1) 1,3,5,... ,2n 1 B (t−1)(t − 3) ···(t−2n+3)(t n+1) 1,3,5,... ,2n−3,n 1 Dn − − ··· − − − − Table 1. Characteristic polynomials and exponents of some Weyl arrangements

the Weyl arrangement of type A = x x =0 : 1 i

4. Signed graphs Zaslavsky developed his theory of signed graphs [57], [58], [59] to study hyper- plane arrangements contained in the Weyl arrangement . (Note that this includes Bn n and n.) In particular his coloring arguments provide one of the simplest ways Ato computeD the corresponding characteristic polynomials. A signed graph, G =(V,E), consists of a set V of vertices which we will always take to be 1, 2,... ,n , and a set of edges E which can be of three possible types: { } 1. a positive edge between i, j V , denoted ij+, ∈ 2. a negative edge between i, j V , denoted ij−, 3. or a half-edge which has only∈ one endpoint i V , denoted ih. ∈ One can have both the positive and negative edges between a given pair of vertices, in which case it is called a doubled edge and denoted ij±. The three types of edges correspond to the three types of hyperplanes in ,namelyx =x,x = x, Bn i j i −j and xi = 0 for the positive, negative, and half-edges, respectively. So to every hyperplane arrangement n we have an associated signed graph G with a hyperplane in if and onlyA⊆B if the corresponding edge is in G . Actually,A the possible edgesA in G really correspond to the vectors in the rootA system of type B perpendicular toA the hyperplanes, which are e e ,e+e,and e . (In the n i − j i j i full theory one also considers the root system Cn with roots 2ei which are modeled

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 120 BRUCE E. SAGAN

1 1 1 ¡A ¡A ¡A ¡ sA ¡ sA ¡ sA ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A HH ¨¨ HH ¨¨ ¡ A HH¡ ¨A¨ HH¡ ¨A¨ ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A ¡ A 2 ¡ A 3 2 ¡ A 3 2 ¡ A 3 @ s G s s G s@ s G s A3 B3 D3

Figure 3. Graphs for Weyl arrangements

by loops ii in G. This is why the somewhat strange definition of a half-edge is necessary. Loops and half-edges behave differently because, e.g., the former can be in a circuit of the graph while the later cannot.) In picturing a signed graph I will + draw an ordinary edge for ij , an edge with a slash through it for ij−,anedge with two slashes through it for ij±, and an edge starting at a vertex and wandering h off into space for i . The graphs G 3 ,G3,andG 3 are shown in Figure 3. Since we are using signed edges,A we areB also goingD to use signed colors for the vertices. For s Z 0 let [ s, s]= s, s +1,... ,s 1,s .Acoloring of the signed graph G ∈is a≥ function− c : V {−[ s,− s]. The fact− that the} number of colors t = [ s, s] =2s+ 1 is always odd→ will− be of significance later. A proper coloring c of |G−requires| that for every edge e E we have ∈ 1. if e = ij+,thenc(i)=c(j); 6 2. if e = ij−,thenc(i)= c(j); 3. if e = ih,thenc(i)=0.6 − 6 Note that the first of these three restrictions is the one associated with ordinary graphs and the four-color theorem [19]. The chromatic polynomial of G is P (G, t) = the number of proper colorings of G with t colors. It is not obvious from the definition that P (G, t) is actually a polynomial in t.In fact even more is true as we see in the following theorem of Zaslavsky.

Theorem 4.1 ([58]). Suppose n has signed graph G .Then A⊆B A χ( ,t)=P(G ,t). A A Theorem 4.1 trivializes the calculation of the characteristic polynomials for the three infinite families of Weyl arrangements and in so doing explains why they

factor over Z 0.For nthe graph G n consists of every possible positive edge. So ≥ A A to properly color G n we have t choices for vertex 1; then t 1 for vertex 2 since c(2) = c(1), and soA forth, yielding − 6 χ( n,t)=P(G )=t(t 1) (t n+1) A An − ··· − in agreement with Table 1. It will be convenient in a bit to have a shorthand for

this falling factorial,solet tn =t(t 1) (t n+1). In G n we also have every negative edge and half-edge.hi This− gives···t −1 choices for vertexB 1 since color 0 is not allowed; t 3 choices for vertex 2 since−c(2) = c(1), 0; and so on. These − 6 ± are exactly the factors in the n entry of Table 1. Finally consider G which is B Dn

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use WHY THE CHARACTERISTIC POLYNOMIAL FACTORS 121

just G n with the half-edges removed. There are two cases depending on whether the colorB 0 appears once or not at all. (It can’t appear two or more times because

G n G n .) If the color 0 is never used, then we have the same number of coloringsA ⊆ asD with . If 0 is used once, then there are n vertices that could receive Bn it and the rest are colored as in n 1.So B − n n 1 n 1 − − χ( ,t)= (t 2i +1)+n (t 2i +1)=(t n+1) (t 2i +1), Dn − − − − i=1 i=1 i=1 Y Y Y which again agrees with the table. Blass and I have generalized Zaslavsky’s theorem from hyperplane arrangements to subspace arrangements. If and are subspace arrangements, then we say that is embedded in if allA subspacesB of are intersections of subspaces of , i.e., A L( ). Now considerB [ s, s]n as aA cube of integer lattice points in Bn (notA⊆ to be confusedB with our use− of lattice as a type of partially ordered set). LetR [ s, s]n denote the set of points of the cube which lie on none of the subspaces in− . We\ willA include a proof of the next result because it amply illustrates the importanceA S of the M¨obius Inversion Theorem.

Theorem 4.2 ([15]). Let t =2s+1 where s Z 0 and let be any subspace arrangement embedded in .Then ∈ ≥ A Bn χ( ,t)= [ s, s]n . A |− \ A| Proof. We construct two functions f,g : L( ) [Z by defining for each X L( ) A −→ ∈ A f(X)= X [s, s]n , |∩− | g(X)= (X Y) [ s, s]n . | \ ∩ − | Y>X[ Recall that L( ) is ordered by reverse inclusion so that Y X. In particular A Y>X ⊂ g(Rn)= [ s, s]n . Note also that X [ s, s]n is combinatorially just a cube of dimension|− dim\X andA| side t so that f(X∩)=− tdim XS. Finally, directly from our S definitions, f(X)= Y Xg(Y), so by the Theorem 2.4 ≥ P[ s, s]n = g(0)ˆ | − \ A| [ = µ(X)f(X) X 0ˆ X≥ = µ(X)tdim X X L( ) ∈XA = χ( ,t), A which is the desired result.

To see why our theorem implies Zaslavsky’s in the hyperplane case, note that apointc [ s, s]n is just a coloring c : V [ s, s]wheretheith coordinate of the point∈ is the− color of the vertex i. With→ this− viewpoint, a coloring is proper if and only if the corresponding point is not on any hyperplane of . For example, if ij+ E then the coloring must have c(i) = c(j) and so the pointA does not lie on ∈ 6 the hyperplane xi = xj .

License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use 122 BRUCE E. SAGAN

As an application of Theorem 4.2, we consider a set of subspace arrangements that has been arousing a lot of interest lately. The k-equal arrangement of type A is = x = x = ...=x :1 i