Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Combinatorics: The Art of

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

GRADUATE STUDIES IN MATHEMATICS 210

Combinatorics: The Art of Counting

Bruce E. Sagan

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Marco Gualtieri Bjorn Poonen Gigliola Staffilani (Chair) Jeff A. Viaclovsky

2020 Mathematics Subject Classification. Primary 05-01; Secondary 06-01.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-210

Library of Congress Cataloging-in-Publication Data Names: Sagan, Bruce Eli, 1954- author. Title: Combinatorics : the art of counting / Bruce E. Sagan. Description: Providence, Rhode Island : American Mathematical Society, [2020] | Series: Gradu- ate studies in mathematics, 1065-7339 ; 210 | Includes bibliographical references and index. | Identifiers: LCCN 2020025345 | ISBN 9781470460327 (paperback) | ISBN 9781470462802 (ebook) Subjects: LCSH: Combinatorial analysis–Textbooks. | AMS: Combinatorics – Instructional ex- position (textbooks, tutorial papers, etc.). | Order, lattices, ordered algebraic structures – Instructional exposition (textbooks, tutorial papers, etc.). Classification: LCC QA164 .S24 2020 | DDC 511/.6–dc23 LC record available at https://lccn.loc.gov/2020025345

Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected]. c 2020 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America. ∞ The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability. Visit the AMS home page at https://www.ams.org/ 10987654321 252423222120

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

To Sally, for her love and support

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Contents

Preface xi

List of Notation xiii

Chapter 1. Basic Counting 1 §1.1. The Sum and Product Rules for sets 1 §1.2. and words 4 §1.3. and subsets 5 §1.4. Set partitions 10 §1.5. Permutations by cycle structure 11 §1.6. Integer partitions 13 §1.7. Compositions 16 §1.8. The twelvefold way 17 §1.9. Graphs and digraphs 18 §1.10. Trees 22 §1.11. Lattice paths 25 §1.12. Pattern avoidance 28 Exercises 33

Chapter 2. Counting with Signs 41 §2.1. The Principle of Inclusion and Exclusion 41 §2.2. Sign-reversing involutions 44 §2.3. The Garsia–Milne Involution Principle 49 §2.4. The Reflection Principle 52

vii

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

viii Contents

§2.5. The Lindström–Gessel–Viennot Lemma 55 §2.6. The Matrix-Tree Theorem 59 Exercises 64

Chapter 3. Counting with Ordinary Generating Functions 71 §3.1. Generating polynomials 71 §3.2. and 푞-analogues 74 §3.3. The algebra of formal power series 81 §3.4. The Sum and Product Rules for ogfs 86 §3.5. Revisiting integer partitions 89 §3.6. Recurrence relations and generating functions 92 §3.7. Rational generating functions and linear recursions 96 §3.8. Chromatic polynomials 99 §3.9. Combinatorial reciprocity 106 Exercises 109

Chapter 4. Counting with Exponential Generating Functions 117 §4.1. First examples 117 §4.2. Generating functions for Eulerian polynomials 121 §4.3. Labeled structures 124 §4.4. The Sum and Product Rules for egfs 128 §4.5. The Exponential Formula 131 Exercises 134

Chapter 5. Counting with Partially Ordered Sets 139 §5.1. Basic properties of partially ordered sets 139 §5.2. Chains, antichains, and operations on posets 145 §5.3. Lattices 148 §5.4. The Möbius of a poset 154 §5.5. The Möbius Inversion Theorem 157 §5.6. Characteristic polynomials 164 §5.7. Quotients of posets 168 §5.8. Computing the Möbius function 174 §5.9. Binomial posets 178 Exercises 183

Chapter 6. Counting with Actions 189 §6.1. Groups acting on sets 189 §6.2. Burnside’s Lemma 192 §6.3. The cycle index 197

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Contents ix

§6.4. Redfield–Pólya theory 200 §6.5. An application to proving congruences 205 §6.6. The cyclic sieving phenomenon 209 Exercises 213

Chapter 7. Counting with Symmetric Functions 219 §7.1. The algebra of symmetric functions, Sym 219 §7.2. The Schur basis of Sym 224 §7.3. Hooklengths 230 §7.4. 푃-partitions 235 §7.5. The Robinson–Schensted–Knuth correspondence 240 §7.6. Longest increasing and decreasing subsequences 244 §7.7. Differential posets 248 §7.8. The chromatic symmetric function 253 §7.9. Cyclic sieving redux 256 Exercises 259

Chapter 8. Counting with Quasisymmetric Functions 267 §8.1. The algebra of quasisymmetric functions, QSym 267 §8.2. Reverse 푃-partitions 270 §8.3. Chain enumeration in posets 274 §8.4. Pattern avoidance and quasisymmetric functions 276 §8.5. The chromatic quasisymmetric function 279 Exercises 283

Appendix. Introduction to Representation Theory 287 §A.1. Basic notions 287 Exercises 292

Bibliography 293

Index 297

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Preface

Enumerative combinatorics has seen an explosive growth over the last 50 years. The purpose of this text is to give a gentle introduction to this exciting area of research. So, rather than trying to cover many different topics, I have chosen to give a more leisurely treatment of some of the highlights of the field. My goal has been to write the exposition so it could be read by a student at the advanced undergraduate or beginning graduate level, either as part of a course or for independent study. The reader will find it similar in tone to my book on the symmetric group. I have tried to keep the prerequisites to a minimum, assuming only basic courses in linear and abstract algebra as background. Certain recurring themes are emphasized, for example, the existence of sum and prod- uct rules first for sets, then for ordinary generating functions, and finally inthecaseof exponential generating functions. I have also included some recent material from the research literature which, to my knowledge, has not appeared in book form previously, such as the theory of quotient posets and the connection between pattern avoidance and quasisymmetric functions. Most of the exercises should be doable with a reasonable amount of effort. A few unsolved conjectures have been included among the problems in the hope that an in- terested student might wish to tackle one of them. They are, of course, marked as such. A few words about the title are in order. It is in part meant to be a tip of the hat to Donald Knuth’s influential series of books The art of computer programing, Volumes 1–3 [51–53], which, among many other things, helped give birth to the study of pattern avoidance through its connection with stack sorting; see Exercise 36 in Chapter 1. I hope that the title also conveys some of the beauty found in this area of mathemat- ics, for example, the elegance of the Hook Formula (equation (7.10)) for the number of standard Young tableaux. In addition I should mention that, due to my own pref- erences, this book concentrates on the enumerative side of combinatorics and mostly ignores the important extremal and existential parts of the field. The reader interested in these areas can consult the books of Flajolet and Sedgewick [25] and of van Lint [95].

xi

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

xii Preface

This book grew out of the lecture notes which I have compiled over years of teach- ing the graduate combinatorics course at Michigan State University. I would like to thank the students in these classes for all the feedback they have given me about the various topics and their presentation. I am also indebted to the following colleagues, some of whom taught from a preliminary version of this book, who provided me with suggestions as well as catching numerous typographical errors: Matthias Beck, Moussa Benoumhani, Andreas Blass, Seth Chaiken, Sylvie Corteel, Georges Grekos, Richard Hensh, Nadia Lafrenière, Duncan Levear, and Tom Zaslavsky. Darij Grinberg deserves special mention for providing copious comments and corrections as well as providing a number of interesting exercises. I also received valuable feedback from four anony- mous referees. Finally, I wish to express my appreciation of Ina Mette, my editor at the American Mathematical Society. Without her gentle support and persistence, this text would never have seen the light of day. Because I typeset this document myself, all errors can be blamed on my computer.

East Lansing, Michigan, 2020

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

List of Notation

Symbol Definition Page

퐴(퐷) arc set of digraph 퐷 21 퐴(퐺) adjacency matrix of graph 퐺 60 풜(퐺) set of acyclic orientations of 퐺 103 푎(퐺) number of acyclic orientations of 퐺 103 퐴([푛], 푘) set of permutations 휋 in 픖푛 having 푘 descents 121 퐴(푛, 푘) Eulerian number, of 퐴([푛], 푘) 121 퐴푛(푞) Eulerian polynomial 122 풜(푃) atom set of poset 푃 169 Asc 푐 ascent set of a proper coloring 푐 279 asc 푐 ascent number of a proper coloring 푐 279 Asc 휋 ascent set of 휋 76 asc 휋 ascent number of permutation 휋 76 Av푛(휋) the set of permutations in 픖푛 avoiding 휋 29 훼푟 reversal of composition 훼 32 ̄훼 expansion of composition 훼 274 훼(퐶) rank composition of chain 퐶 275 퐵(퐺) incidence matrix of graph 퐺 61 퐵(푇) set of partitions of the set 푇 10 퐵푛 Boolean algebra on [푛] 140 퐵∞ poset of subsets of ℙ 178 퐵(푛) 푛th 10

xiii

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

xiv List of Notation

Symbol Definition Page

ℂ complex numbers 1 푐푖(푔) number of cycles of length 푖 in group element 푔 197 퐶퐿푛 claw poset with 푛 atoms 169 co 푇 content of tableau 푇 225 퐶푛 cycle with 푛 vertices 19 퐶푛 chain poset of length 푛 139 푐푥(푃) column insertion of element 푥 into tableau 푃 245 퐶∞ chain poset on ℕ 178 퐶(푛) Catalan number 26 푐([푛], 푘) set of permutations in 픖푛 with 푘 cycles 12 푐(푛, 푘) signless of the first kind 12 푐표(퐿, 푘) ordered 푘 cycle decompositions of permutations of 퐿 127 ℂ푋 vector space generated by set 푋 over ℂ 248 ℂ[푥] polynomial algebra in 푥 over ℂ 71 ℂ[[푥]] formal power series algebra in 푥 over ℂ 81 풞(휋) set of functions compatible with 휋 236 풞푚(휋) set of functions compatible with 휋 bounded by 푚 236 Des 푃 descent set of tableau 푃 271 Des 휋 descent set of permutation 휋 75 des 휋 descent number of permutation 휋 76 퐷푛 lattice of divisors of 푛 140 퐷∞ divisibility poset on ℙ 181 퐷(푛) derangement number 43 풟(푛) set of Dyck paths of semilength 푛 26 풟(푉) set of all digraphs on vertex set 푉 21 풟(푉, 푘) set of all digraphs on vertex set 푉 with 푘 edges 21 deg 푚 degree of a monomial 219 deg 푣 degree of vertex 푣 in a graph 20 Δ푓(푛) forward difference operator of 푓(푛) 162 훿푥,푦 Kronecker delta 7 훿(푥, 푧) delta function of poset incidence algebra 159 퐸(퐺) edge set of graph 퐺 18 퐸(퐿) set structure on label set 퐿 125 퐸(퐿) nonempty set structure on label set 퐿 125 퐸푛 Euler number 120 푒푛 푛th elementary symmetric function 221 퐸(푡) for elementary symmetric functions 221 Exc 휋 set of excedances of permutation 휋 122 exc 휋 number of excedances of permutation 휋 122

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

List of Notation xv

Symbol Definition Page

Fix 푓 fix point set of a function 푓 44 푓푛 Fibonacci number 3 퐹푛 Fibonacci number 2 픽푞 Galois field with 푞 elements 79 푓(푥) ordinary generating function 81 푓푆(푥) weight-generating function for weighted set 푆 86 퐹(푛) binomial poset 푛-interval function 178 퐹(푥) exponential generating function 117 퐹풮(푥) exponential generating function for structure 풮 125 퐹푆 fundamental quasisymmetric for set 푆 269 퐹훼 fundamental quasisymmetric for composition 훼 269 푓휆 number of standard Young tableaux of shape 휆 225 Φ fundamental map on permutations 122 휙 between subsets and compositions 16 퐺 ⧵ 푒 graph 퐺 with edge 푒 deleted 100 퐺/푒 graph 퐺 with edge 푒 contracted 101 GL(푉) general linear group over vector space 푉 287 풢(푉) set of all graphs on vertex set 푉 20 풢(푉, 푘) set of all graphs on vertex set 푉 with 푘 edges 20 퐺푥 stabilizer of element 푥 under the action of group 퐺 191 퐻푐 = 퐻푖,푗 hook of cell 푐 = (푖, 푗) 230 ℎ푐 = ℎ푖,푗 hooklength of cell 푐 = (푖, 푗) 230 ℋ푛 set of hook diagrams with 푛 cells 278 ℎ푛 푛th complete homogeneous symmetric function 221 퐻(푡) complete homogeneous generating function 221 ideg 푣 in-degree of vertex 푣 in a digraph 21 Inv 휋 inversion set of permutation 휋 74 inv 휋 inversion number of permutation 휋 74 ℐ(푃) incidence algebra of poset 푃 158 퐼(푆) lower-order ideal generated by 푆 in a poset 143 ISF(퐺; 푡) increasing spanning forest generating function of 퐺 105 ISF푚(퐺) set of 푚-edge increasing spanning forests of 퐺 105 isf푚(퐺) number of 푚-edge increasing spanning forests of 퐺 105 푖휆(퐺) number of independent type 휆 partitions in graph 퐺 254 풥(푃) distributive lattice of lower-order ideals of poset 푃 151 퐾푛 complete graph with 푛 vertices 19 퐾푛 lattice of compositions of 푛 140 퐾휆,휇 number of tableaux of shape 휆 and content 휇 225

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

xvi List of Notation

Symbol Definition Page

퐿(퐺) Laplacian of graph 퐺 62 ℒ(퐺) bond lattice of graph 퐺 167 ℒ(푃) set of linear extensions of 푃 238 ℓ(퐶) length of chain 퐶 in a poset 147 ℓ(휆) length of an integer partition 휆 15 ℓ(휋) length of a permutation 휋 4 lim 푓푘(푥) limit of a sequence of formal power series 84 푘→∞ lds 휋 length of a longest decreasing subsequence of 휋 245 lis 휋 length of a longest increasing subsequence of 휋 244 푛 퐿푛(푞) lattice of subspaces of 픽푞 140 퐿∞(푞) poset of subspaces of vector space 푉∞ over 픽푞 178 퐿(푉) lattice of subspaces of 푉 140 휆(퐹) type of partition induced by edge set 퐹 255 휆! multiplicity factorial of partition 휆 254 maj 휋 major index of permutation 휋 76 푀(푛) Mertens function 183 푀(푃) monomial quasisymmeric function for poset 푃 275 푀훼 monomial quasisymmetric function 268 푚휆 monomial symmetric function 220 휇(푃) Möbius function value on a poset 푃 154 휇(푥) one-variable Möbius function evaluated at 푥 154 휇(푥, 푧) two-variable Möbius function on the interval [푥, 푧] 157 ℕ nonnegative integers 1 NBC푘(퐺) set of no broken circuit sets of 푘 edges of 퐺 102 nbc푘(퐺) number of no broken circuit sets of 푘 edges of 퐺 102 풩ℰ(푚, 푛) set of 푁-퐸 lattice paths from (0, 0) to (푚, 푛) 26 odeg 푣 out-degree of vertex 푣 in a digraph 21 풪푥 orbit of an element 푥 under action of a group 190 푂(푔) big oh notation applied to function 푔 182 표(푔) order of a group element 푔 210 ℙ positive integers 1 푃∗ dual of poset 푃 142 풫퐶(퐺) set of proper colorings of 퐺 with the positive integers 279 푃(퐺; 푡) chromatic polynomial of graph 퐺 100 Par 푃 set of 푃-partitions 238 Par푚 푃 set of 푃-partitions bounded by 푚 238 푃푛 path with 푛 vertices 19 푃(푛) set of partitions of the integer 푛 13 푝(푛) number of partitions of the integer 푛 13 푝푛 푛th power sum symmetric function 221 푃(푡) power sum symmetric generating function 221

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

List of Notation xvii

Symbol Definition Page

푃(푛, 푘) set of partitions of 푛 into at most 푘 parts 15 푝(푛, 푘) number of partitions of 푛 into at most 푘 parts 15 푃(푆) permutations of a set 푆 4 푃(푆, 푘) permutations of length 푘 of a set 푆 4 푃((푆, 푘)) words of length 푘 over a set 푆 5 푃(휋) insertion tableau of 휋 242 풫(푢; 푣) set of directed paths from 푢 to 푣 in a digraph 56 Π푛 partition lattice on [푛] 140 Π(풮) partition structure on structure 풮 131 Π푒(풮) even partition structure on structure 풮 133 Π표(풮) odd partition structure on structure 풮 133 ℚ rational numbers 1 푄(푛) set of compositions of the integer 푛 16 푞(푛) number of compositions of the integer 푛 16 푄(푛, 푘) set of compositions of 푛 into 푘 parts 16 푞(푛, 푘) number of partitions of 푛 into 푘 parts 16 QSym algebra of quasisymmetric functions 268

QSym푛 quasisymmetric functions of degree 푛 268 푄(휋) recording tableau of 휋 242 푄푛(Π) quasisymmetric function for patterns Π 277 ℝ real numbers 1 ℛ퐶(휋) set of functions reverse compatible with 휋 270 rk 푃 rank of a ranked poset 푃 147 Rk푘 푃 푘th rank set of a ranked poset 푃 147 rk 푥 rank of an element 푥 in a ranked poset 147 ℛ(푘, 푙) set of partitions contained in a 푘 × 푙 rectangle 79 RPar 푃 set of reverse 푃-partitions 271 ℛ(푃) reduced incidence algebra of a binomial poset 179

rpp푛(휆) number of shape 휆 reverse plane partitions of 푛 233 rpar(푃; 퐱) generating function for reverse 푃-partitions 271 푟푥(푃) row insertion of element 푥 into tableau 푃 241 휌(퐹) vertex partition induced by edge set 퐹 255 휌∶ 퐺 → GL(푉) representation of group 퐺 287 풮(퐿) labeled structure on label set 퐿 124 픖 pattern poset 140 픖푛 symmetric group on [푛] 11 푆푓(푛) operator applied to function 푓(푛) 162 sgn sign function on a signed set 44 sh 푇 shape of tableau 푇 225 푠(푛, 푘) signed Stirling number of the first kind 13

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

xviii List of Notation

Symbol Definition Page

푆(푇, 푘) set of partitions of the set 푇 into 푘 blocks 10 푆(푛, 푘) Stirling number of the second kind 10 푆표(퐿, 푘) set of ordered partitions of the set 퐿 into 푘 blocks 127 풮푇(퐺) set of spanning trees of graph 퐺 59 st statistic on a set 74 std 휎 standardization of the permutation 휎 28 Supp 푥 support set of 푥 in a product of claws 173 supp 푥 size of support set of 푥 in a product of claws 173 Sym algebra of symmetric functions 220

Sym푛 symmetric functions of degree 푛 220 SYT(휆) set of standard Young tableaux of shape 휆 224 SSYT(휆) set of semistandard Young tableaux of shape 휆 225 푠휆 Schur function 225 푇푖,푗 element in cell (푖, 푗) of tableau 푇 225 풯푛 set of monomino-domino tilings of a row of 푛 squares 3 푈(푆) upper-order ideal generated by 푆 in a poset 143 푉(퐷) vertex set of digraph 퐷 21 푉(퐺) vertex set of graph 퐺 18 푉∞ vector space with a countably infinite basis over 픽푞 178 푤푘(푃) Whitney number of the first kind for a poset 푃 156 푊 푘(푃) Whitney number of the second kind for a poset 푃 156 푊푛 walk with 푛 vertices 19 wt weight function on a set 86 퐱 a countably infinite set of variables 219 퐱푐 monomial for a coloring 푐 of a graph 253 퐱푓 monomial for a function 푓 270 퐱푇 monomial for a tableau 푇 225 푋푔 fixed points of group element 푔 acting on set 푋 192 푋(퐺; 퐱) chromatic symmetric function of graph 퐺 253 푋(퐺; 퐱, 푞) chromatic quasisymmetric function of graph 퐺 280 푌 Young’s lattice 140 ℤ set of integers 1 휁(푥, 푧) zeta function in the incidence algebra of a poset 159 휁(푠) Riemann zeta function 182 푧(푔) cycle index of group element 푔 197 푍(퐺) cycle index of group 퐺 197 #푆 cardinality of the set 푆 1 |푓| size (sum of values) of a function 236 |푆| cardinality of the set 푆 1 |푇| sum of entries of tableau 푇 233 푆 ⊎ 푇 disjoint union of sets 푆 and 푇 1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

List of Notation xix

Symbol Definition Page

|휆| sum of the parts of partition 휆 13 휆 ⊢ 푛 휆 is a partition of 푛 13 푆 × 푇 (Cartesian) product of sets 푆 and 푇 1 푃 ⊎ 푄 disjoint union of posets 푃 and 푄 145 푃 ⊕ 푄 ordinal sum of posets 푃 and 푄 146 푃 × 푄 (Cartesian) product of posets 푃 and 푄 146 [푔] linear transformation for group element 푔 287 [푔]퐵 matrix in basis 퐵 for group element 푔 287 [푛] set of integers {1, 2, ... , 푛} 4 [푛]푞 푞-analogue of nonnegative integer 푛 75 [푛]푞! 푞-analogue of 푛! 75 [푥푛]푓(푥) coefficient of 푥푛 in 푓(푥) 83 푛↓푘 푛 falling factorial with 푘 factors 4 2푆 set of subsets of 푆 5 푆 (푘) set of 푘-element subsets of 푆 6 푛 (푘) 7 [푛] 푞-binomial coefficient 77 푘 푞 푉 [푘] 푘-dimensional subspaces of vector space 푉 79 {{푎, 푎, ... }} individual element notation 8 {{푎2, ... }} multiset multiplicity notation 8 푆 ((푘)) set of 푘-element multisubsets of 푆 9 휒(퐺) chromatic number of 퐺 99 휒(푔) character of group element 푔 291 푥 ⋖ 푦 푥 is covered by 푦 in a poset 140 푦 ⋗ 푥 푦 covers 푥 in a poset 140 0̂ the minimum element of a poset 142 1̂ the maximum element of a poset 142 [푥, 푦] closed interval from 푥 to 푦 in a poset 143 푥 ∧ 푦 meet of 푥 and 푦 in a poset 148 ⋀ 푋 meet of the subset 푋 in a poset 149 푥 ∨ 푦 join of 푥 and 푦 in a poset 149 푈 + 푉 sum of subspaces 푈 and 푉 149 푓 ∗ 푔 convolution of 푓 and 푔 in the incidence algebra 158 휒(푃; 푡) characteristic polynomial of a ranked poset 푃 164 푃/ ∼ quotient of poset 푃 by ∼ 169 휔푛 primitive 푛th root of unity 210 RS 휋 ↦ (푃, 푄) Robinson–Schensted map 242 RSK 푀 ↦ (푇, 푈) Robinson–Schensted–Knuth map 244

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 1

Basic Counting

In this chapter we will develop the most elementary techniques for enumerating sets. Even though these methods are relatively basic, they will presage more complicated things to come. We denote the integers by ℤ and parameters such as 푛 and 푘 are always assumed to be integral unless otherwise indicated. We also use the notation ℕ and ℙ for the nonnegative and positive integers, respectively. As usual, ℚ, ℝ, and ℂ stand for the rational numbers, real numbers, and complex numbers, respectively. Finally, whenever taking the cardinality of a set we will assume it is finite.

1.1. The Sum and Product Rules for sets

The Sum and Product Rules for sets are the basis for much of enumeration. And we will see various extensions of them later to ordinary and exponential generating functions. Although the rules are very easy to prove, we will include the demonstrations because the results are so useful. Given a 푆, we will use either of the notations #푆 or |푆| for its cardinality. We will also write 푆 ⊎ 푇 for the disjoint union of 푆 and 푇, and usage of this symbol implies disjointness even if it has not been previously explicitly stated. Finally, our notation for the (Cartesian) product of sets is 푆 × 푇 = {(푠, 푡) ∣ 푠 ∈ 푆, 푡 ∈ 푇}.

Lemma 1.1.1. Let 푆, 푇 be finite sets. (a) (Sum Rule) If 푆 ∩ 푇 = ∅, then |푆 ⊎ 푇| = |푆| + |푇|. (b) (Product Rule) For any finite sets |푆 × 푇| = |푆| ⋅ |푇|.

Proof. Let 푆 = {푠1, ... , 푠푚} and 푇 = {푡1, ... , 푡푛}. For part (a), if 푆 and 푇 are disjoint, then we have 푆 ⊎ 푇 = {푠1, ... , 푠푚, 푡1, ... , 푡푛} so that |푆 ⊎ 푇| = 푚 + 푛 = |푆| + |푇|.

1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2 1. Basic Counting

For part (b), we induct on 푛 = |푇|. If 푇 = ∅, then 푆 × 푇 = ∅ so that |푆 × 푇| = 0 as ′ ′ desired. If |푇| ≥ 1, then let 푇 = 푇 − {푡푛}. We can write 푆 × 푇 = (푆 × 푇 ) ⊎ (푆 × {푡푛}). Also 푆 × {푡푛} = {(푠1, 푡푛), ... , (푠푚, 푡푛)}, which has |푆| = 푚 elements since the second component is constant. Now by part (a) and induction

′ |푆 × 푇| = |푆 × 푇 | + |푆 × {푡푛}| = 푚(푛 − 1) + 푚 = 푚푛, which finishes the proof. □

In combinatorial choice problems, one is often given either the option to do one operation or another, or to do both. Suppose there are 푚 ways of doing the first oper- ation and 푛 ways of doing the second. If there is no common operation, then the Sum Rule tells us that the number of ways to do one or the other is 푚 + 푛. And if doing the first operation has no effect on doing the second, then the Product Rule givesacount of 푚푛 for doing the first and then the second. More generally if there are 푚 ways of doing the first operation and, no matter which ofthe 푚 is chosen, the number of ways to continue with the second operation is 푛, then again there are 푚푛 ways to do both. (The actual 푛 second operations available may depend on the choice of the first, but not their number.) So in practice one translates from English to mathematics by replacing “or” with addition and “and” with multiplication. Another important concept related to is that of a bijection. A bijection between sets 푆, 푇 is a function 푓 ∶ 푆 → 푇 which is both injective (one-to-one) and sur- jective (onto). If 푆, 푇 are finite, then the existence of a bijection between them implies that |푆| = |푇|. (One can extend this notion to infinite sets, but we will have no cause to do so here.) In combinatorics, one often uses to prove that two sets have the same cardinality. See, for just one of many examples, the proof of Theorem 1.1.2 below. We will illustrate these ideas with one of the most famous sequences in all of com- binatorics: the Fibonacci numbers. As is sometimes the case, there is an amusing (if somewhat improbable) story attached to the sequence. One starts at the beginning of time with a pair of immature rabbits, one male and one female. It takes one month for rabbits to mature. In every subsequent month a pair gives birth to another pair of immature rabbits, one male and one female. If rabbits only breed with their birth part- ner and live forever (as I said, the story is somewhat improbable), how many pairs of rabbits are there at the beginning of month 푛? Let us call this number 퐹푛. It will be con- venient to let 퐹0 = 0. Since we begin with only one pair, 퐹1 = 1. And at the beginning of the second month, the pair has matured but produced no offspring, so 퐹2 = 1. In subsequent months, one has all the rabbits from the previous month, counted by 퐹푛−1, together with the newborn pairs. The number of newborn pairs equals the number of mature pairs from the previous month, which equals the total number of pairs from the month before which is 퐹푛−2. Thus, applying the Sum Rule,

(1.1) 퐹푛 = 퐹푛−1 + 퐹푛−2 for 푛 ≥ 2 with 퐹0 = 0 and 퐹1 = 1

where we can start the recursion at 푛 = 2 rather than 푛 = 3 due to letting 퐹0 = 0. The 퐹푛 are called the Fibonacci numbers. It is also important to note that some authors

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.1. The Sum and Product Rules for sets 3

Figure 1.1. 풯3

define this sequence by letting

(1.2) 푓0 = 푓1 = 1 and 푓푛 = 푓푛−1 + 푓푛−2 for 푛 ≥ 2. So it is important to make sure which flavor of Fibonacci is being discussed in agiven context.

One might wonder if there is an explicit formula for 퐹푛 in addition to the recursive one above. We will see that such an expression exists, although it is far from obvious how to derive it from what we have done so far. Indeed, we will need the theory of ordinary generating functions discussed in Chapter 3 to derive it.

Another thing which might be desired is a combinatorial interpretation for 퐹푛.A combinatorial interpretation for a sequence of nonnegative integers 푎0, 푎1, 푎2,... is a sequence of sets 푆0, 푆1, 푆2,... such that #푆푛 = 푎푛 for all 푛. Such interpretations of- ten give rise to very pretty and intuitive proofs about the original sequence and so are highly desirable. One could argue that the story of the rabbits already gives such an interpretation. But we would like something more amenable to mathematical manip- ulation. Suppose we are given a row of squares. We are also given two types of tiles: domi- nos which can cover two squares and monominos which can cover one. A tiling of the row is a set of tiles which covers each square exactly once. Let 풯푛 be the set of tilings of a row of 푛 squares. See Figure 1.1 for a list of the elements of 풯3. There is a simple relationship between tilings and Fibonacci numbers. Theorem 1.1.2. For 푛 ≥ 1 we have

퐹푛 = #풯푛−1.

Proof. It suffices to prove that both sides of this equation satisfy the same initialcon- ditions and recurrence relation. When the row contains no squares, it only has the empty tiling so 풯0 = 1 = 퐹1. And when there is one square, it can only be tiled by a monomino so 풯1 = 1 = 퐹2. For the recursion, the tilings in 풯푛 can be divided into two types: those which end with a monomino and those which end with a domino. Removing the last tile shows that these tilings are in bijection with those in 풯푛−1 and □ those in 풯푛−2, respectively. Thus #풯푛 = #풯푛−1 + #풯푛−2 as desired.

To see the power of a good combinatorial interpretation, we will now give a simple proof of an identity for the 퐹푛. Such identities are legion. See, for example, the book of Benjamin and Quinn [10]. Corollary 1.1.3. For 푚 ≥ 1 and 푛 ≥ 0 we have

퐹푚+푛 = 퐹푚−1퐹푛 + 퐹푚퐹푛+1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4 1. Basic Counting

Proof. By the previous theorem, the left-hand side counts the number of tilings of a row of 푚 + 푛 − 1 squares. So it suffices to show that the same is true of the right. Label the squares 1, ... , 푚 + 푛 − 1 from left to right. We can write 풯푚+푛−1 = 풮 ⊎ 풯 where 풮 contains those tilings with a domino covering squares 푚 − 1 and 푚, and 풯 has the tilings with 푚−1 and 푚 in different tiles. The tilings in 풯 are essentially pairs of tilings, the first covering the first 푚 − 1 square and second covering the last 푛 squares. So the Product Rule gives |풯| = |풯푚−1| ⋅ |풯푛| = 퐹푚퐹푛+1. Removing the given domino from the tilings in 풮 again splits each tiling into a pair with the first covering 푚 − 2 squares and the second 푛 − 1. Taking cardinalities results in |풮| = 퐹푚−1퐹푛. Finally, applying the Sum Rule finishes the proof. □

The demonstration just given is called a combinatorial proof since it involves count- ing discrete objects. We will meet other useful proof techniques as we go along. But combinatorial proofs are often considered to be the most pleasant, in part because they can be more illuminating than demonstrations just involving formal manipulations.

1.2. Permutations and words

It is always important when considering an enumeration problem to determine whether the objects being considered are ordered or not. In this section we will consider the most basic ordered structures, namely permutations and words.

If 푆 is a set with #푆 = 푛, then a permutation of 푆 is a sequence 휋 = 휋1 ... 휋푛 obtained by listing the elements of 푆 in some order. If 휋 is a permutation, we will always use 휋푖 to denote the 푖th element of 휋 and similarly for other ordered structures. We let 푃(푆) denote the set of all permutations of 푆. For example, 푃({푎, 푏, 푐}) = {푎푏푐, 푎푐푏, 푏푎푐, 푏푐푎, 푐푎푏, 푐푏푎}. Clearly #푃(푆) only depends on #푆. So often we choose the canonical 푛-element set [푛] = {1, 2, ... , 푛}.

We can also consider 푘-permutations of 푆 which are sequences 휋 = 휋1 ... 휋푘 obtained by linearly ordering 푘 distinct elements of 푆. Here, 푘 is called the length of the permu- tation and we write ℓ(휋) = 푘. Again, we use the same terminology and notation for other ordered structures. The set of all 푘-permutations of 푆 is denoted 푃(푆, 푘). By way of illustration, 푃({푎, 푏, 푐, 푑}, 2) = {푎푏, 푏푎, 푎푐, 푐푎, 푎푑, 푑푎, 푏푐, 푐푏, 푏푑, 푑푏, 푐푑, 푑푐}. In particular, if #푆 = 푛, then 푃(푆, 푛) = 푃(푆). Also 푃(푆, 푘) = ∅ for 푘 > 푛 since in this case it is impossible to pick 푘 distinct elements from a set with only 푛. And 푃(푆, 0) = {휖} where 휖 is the empty sequence. To count permutations it will be convenient to introduce the following notation. Given nonnegative integers 푛, 푘, we can form the falling factorial

푛↓푘= 푛(푛 − 1) ... (푛 − 푘 + 1). Note that 푘 equals the number of factors in the product.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.3. Combinations and subsets 5

Theorem 1.2.1. For 푛, 푘 ≥ 0 we have

#푃([푛], 푘) = 푛↓푘 . In particular #푃([푛]) = 푛! .

Proof. Since 푃([푛]) = 푃([푛], 푛), it suffices to prove the first formula. Given 휋 = 휋1 ... 휋푘 ∈ 푃([푛], 푘), there are 푛 ways to pick 휋1. Since 휋2 ≠ 휋1, there remains 푛 − 1 choices for 휋2. Since the number of choices for 휋2 does not depend on the actual ele- ment chosen for 휋, one can continue in this way and apply a modified version of the Product Rule to obtain the result. □

Note that when 0 ≤ 푘 ≤ 푛 we can write 푛! (1.3) 푛↓ = . 푘 (푛 − 푘)!

But for 푘 > 푛 the product 푛 ↓푘 still makes sense, even though the product cannot be expressed as a quotient of . Indeed, if 푘 > 푛, then zero is a factor and so 푛 ↓푘= 0, which agrees with the fact that 푃([푛], 푘) = ∅. In the special case 푘 = 0 we have 푛↓푘= 1 because it is an . Again, this reflects the combinatorics in that #푃([푛], 0) = {휖}. One of the other things to keep track of in a combinatorial problem is whether elements are allowed to be repeated or not. In permutations we have no repetitions. But the case when they are allowed is interesting as well. A 푘-word over a set 푆 is a sequence 푤 = 푤1 ... 푤푘 where 푤푖 ∈ 푆 for all 푖. Note that there is no assumption that the 푤푖 are distinct. We denote the set of 푘-words over 푆 by 푃((푆, 푘)). Note the use of the double parentheses to denote the fact that repetitions are allowed. Note also that 푃(푆, 푘) ⊆ 푃((푆, 푘)), but usually the inclusion is strict. To illustrate 푃(({푎, 푏, 푐, 푑}, 2)) = 푃({푎, 푏, 푐, 푑}, 2) ⊎ {푎푎, 푏푏, 푐푐, 푑푑}. The proof of the next result is almost identical to that of Theorem 1.2.1 and so is left to the reader. When a result is given without proof, this is indicated by a box at the end of its statement. Theorem 1.2.2. For 푛, 푘 ≥ 0 we have #푃(([푛], 푘)) = 푛푘. □

1.3. Combinations and subsets

We will now consider unordered versions of the combinatorial objects studied in the last section. These are sometimes called combinations, although the reader may know them by their more familiar name: subsets. Given a set 푆, we let 2푆 denote the set of all subsets of 푆. Notice that 2푆 is a set, not a number. For example, 2{푎,푏,푐} = {∅, {푎}, {푏}, {푐}, {푎, 푏}, {푎, 푐}, {푏, 푐}, {푎, 푏, 푐}}. The reason for this notation should be made clear by the following result.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6 1. Basic Counting

Theorem 1.3.1. For 푛 ≥ 0 we have #2[푛] = 2푛.

Proof. By Theorem 1.2.2 we have 2푛 = #푃(({0, 1}, 푛)). So it suffices to find a bijection 푓 ∶ 2[푛] → 푃(({0, 1}, 푛)),

and there is a canonical one. In particular, if 푆 ⊆ [푛], then we let 푓(푆) = 푤1 ... 푤푛 where, for all 푖, 1 if 푖 ∈ 푆, 푤 = { 푖 0 if 푖 ∉ 푆.

To show that 푓 is bijective, it suffices to find its inverse. If 푤 = 푤1 ... 푤푛 ∈ 푃(({0, 1}, 푛)), −1 then we let 푓 (푤) = 푆 where 푖 ∈ 푆 if 푤푖 = 1 and 푖 ∉ 푆 if 푤푖 = 0 where 1 ≤ 푖 ≤ 푛. It is easy to check that the compositions 푓 ∘ 푓−1 and 푓−1 ∘ 푓 are the identity maps on their respective domains. This completes the proof. □

The proof just given is called a bijective proof and it is a particularly nice kind of combinatorial proof. This is because bijective proofs can relate different types of com- binatorial objects, sometimes revealing unexpected connections. Also note that we proved 푓 bijective by finding its inverse rather than showing directly that it wasone- to-one and onto. This is the preferred method as having a concrete description of 푓−1 can be useful later. Finally, when dealing with functions we will always compose them right-to-left so that (푓 ∘ 푔)(푥) = 푓(푔(푥)). We now want to count subsets by their cardinality. For a set 푆 we will use the notation 푆 ( ) = {푇 ⊆ 푆 ∣ #푇 = 푘}. 푘 As an example, {푎, 푏, 푐} ( ) = {{푎, 푏}, {푎, 푐}, {푏, 푐}}. 2 As expected, we now find the cardinality of this set. Theorem 1.3.2. For 푛, 푘 ≥ 0 we have [푛] 푛↓ #( ) = 푘 . 푘 푘!

Proof. Cross-multiplying and using Theorem 1.2.1 we see that it suffices to prove [푛] #푃([푛], 푘) = 푘! ⋅#( ). 푘

To see this, note that we can get each 휋1 ... 휋푘 ∈ 푃([푛], 푘) exactly once by running through the subsets 푆 = {푠1, ... , 푠푘} ⊆ [푛] and then ordering each 푆 in all possible [푛] ways. The number of choices for 푆 is #( 푘 ) and, by Theorem 1.2.1 again, the number of ways of permuting the elements of 푆 is 푘!. So we are done by the Product Rule. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.3. Combinations and subsets 7

1 1 1 1 2 1 1 3 3 1 1 4 6 4 1

Figure 1.2. Rows 0 through 4 of Pascal’s triangle

Given 푛, 푘 ≥ 0, we define the binomial coefficient 푛 [푛] 푛↓ (1.4) ( ) = #( ) = 푘 . 푘 푘 푘! The reason for this name is that these numbers appear in the binomial expansion which will be studied in Chapter 3. Often you will see the binomial coefficients displayed in 푛 a triangular array called Pascal’s triangle which has (푘) as the entry in the 푛th row and 푘th diagonal. When 푘 > 푛 it is traditional to omit the zeros. See Figure 1.2 for rows 0 through 4. (We apologize to the reader for not writing out the whole triangle, but this page is not big enough.) For 0 ≤ 푘 ≤ 푛 we can use (1.3) to write 푛 푛! (1.5) ( ) = , 푘 푘! (푛 − 푘)! which is pleasing because of its symmetry. We can also extend the binomial coefficients 푛 [푛] to 푘 < 0 by letting (푘) = 0. This is in keeping with the fact that ( 푘 ) = ∅ in this case. In the next theorem, we collect various basic results about binomial coefficients which will be useful in the sequel. In it, we will use the Kronecker delta function defined by 1 if 푥 = 푦, 훿 = { 푥,푦 0 if 푥 ≠ 푦. Also note that we do not specify the range of the summation variable 푘 in (c) and (d) because it can be taken as either 0 ≤ 푘 ≤ 푛 or 푘 ∈ ℤ since the extra terms in the larger sum are all zero. Both viewpoints will be useful on occasion. Theorem 1.3.3. Suppose 푛 ≥ 0. (a) The binomial coefficients satisfy the initial condition 0 ( ) = 훿 푘 푘,0 and recurrence relation 푛 푛 − 1 푛 − 1 ( ) = ( ) + ( ) 푘 푘 − 1 푘 for 푛 ≥ 1. (b) The binomial coefficients are symmetric, meaning that 푛 푛 ( ) = ( ). 푘 푛 − 푘

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8 1. Basic Counting

(c) We have 푛 ∑ ( ) = 2푛. 푘 푘 (d) We have 푛 ∑(−1)푘( ) = 훿 . 푘 푛,0 푘

[푛] Proof. (a) The initial condition is clear. For the recursion let 풮1 be the set of 푆 ∈ ( 푘 ) [푛] [푛] with 푛 ∈ 푆, and let 풮2 be the set of 푆 ∈ ( 푘 ) with 푛 ∉ 푆. Then ( 푘 ) = 풮1 ⊎ 풮2. But [푛−1] [푛−1] if 푛 ∈ 푆, then 푆 − {푛} ∈ ( 푘−1 ). This gives a bijection between 풮1 and ( 푘−1 ) so that 푛−1 [푛−1] 푛−1 #풮1 = (푘−1). On the other hand, if 푛 ∉ 푆, then 푆 ∈ ( 푘 ) and this implies #풮2 = ( 푘 ). Applying the Sum Rule completes the proof.

[푛] [푛] [푛] [푛] (b) It suffices to find a bijection 푓 ∶ ( 푘 ) → (푛−푘). Consider the map 푓 ∶ 2 → 2 by 푓(푆) = [푛] − 푆 where the minus sign indicates difference of sets. Note that the 2 [푛] composition 푓 is the identity map so that 푓 is a bijection. Furthermore 푆 ∈ ( 푘 ) if and [푛] only if 푓(푆) ∈ (푛−푘). So 푓 restricts to a bijection between these two sets.

[푛] [푛] (c) This follows by applying the Sum Rule to the equation 2 = ⨄푘 ( 푘 ). (d) The case 푛 = 0 is easy, so we assume 푛 > 0. We will learn general techniques for dealing with equations involving signs in the next chapter. But for now, we try to prove the equivalent equality 푛 푛 ∑ ( ) = ∑ ( ). 푘 푘 푘 odd 푘 even [푛] [푛] Let 풯1 be the set of 푇 ∈ 2 with #푇 odd and let 풯2 be the set of 푇 ∈ 2 with #푇 even. We wish to find a bijection 푔∶ 풯1 → 풯2. Consider the operation of symmetric difference 푆 Δ 푇 = (푆 − 푇) ⊎ (푇 − 푆). It is not hard to see that (푆 Δ 푇) Δ 푇 = 푆. Now define 푔∶ 2[푛] → 2[푛] by 푔(푇) = 푇 Δ {푛} so that, by the previous sentence, 푔2 is the identity. Furthermore, 푔 reverses parity and so restricts to the desired bijection. □

As with the case of permutations and words, we want to enumerate “sets” where repetitions are allowed. A multiset 푀 is an unordered collection of elements which may be repeated. For example 푀 = {{푎, 푎, 푎, 푏, 푐, 푐}} = {{푐, 푎, 푏, 푎, 푐, 푎}}. Note the use of double curly brackets to denote a multiset. We will also use multiplicity notation where 푎푚 denotes 푚 copies of the element 푎. Continuing our example 푀 = {{푎3, 푏, 푐2}}. As with powers, an exponent of one is optional and an exponent of zero indicates that there are no copies of that element in the multiset. The cardinality of a multiset is its number of elements counted with multiplicity. So in our example #푀 = 2 + 1 + 3 = 6.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.3. Combinations and subsets 9

If 푆 is a set, then 푀 is a multiset on 푆 if every element of 푀 is an element of 푆. We let 푆 ((푘)) be the set of all on 푆 of cardinality 푘 and 푛 [푛] (( )) = #(( )). 푘 푘 To illustrate {푎, 푏, 푐} (( )) = { {{푎, 푎}}, {{푎, 푏}}, {{푎, 푐}}, {{푏, 푏}}, {{푏, 푐}}, {{푐, 푐}} } 2

3 and so ((2)) = 6. Theorem 1.3.4. For 푛, 푘 ≥ 0 we have 푛 푛 + 푘 − 1 (( )) = ( ). 푘 푘

Proof. We wish to find a bijection [푛] [푛 + 푘 − 1] 푓 ∶ (( )) → ( ). 푘 푘

Given a multiset 푀 = {{푚1 ≤ 푚2 ≤ 푚3 ≤ ⋯ ≤ 푚푘}} on [푛], let

푓(푀) = {푚1 < 푚2 + 1 < 푚3 + 2 < ⋯ < 푚푘 + 푘 − 1}.

Now the 푚푖 +푖 − 1 are distinct, and the fact that 푚푘 ≤ 푛 implies 푚푘 +푘−1 ≤ 푛+푘−1. [푛+푘−1] It follows that 푓(푀) ∈ ( 푘 ) and so the map is well-defined. It should now be easy for the reader to construct an inverse, proving that 푓 is bijective. □

푛 As with the binomial coefficients, we extend ((푘)) to negative 푘 by letting it equal zero. In the future we will do the same for other constants whose natural domain of definition is 푛, 푘 ≥ 0 without comment. We do wish to comment on an interesting relationship between counting sets and multisets. Note that definition (1.4) is well-defined for any complex number 푛 since the falling factorial is just a product, and in particular it makes sense for negative integers. In fact, if 푛 ∈ ℕ, then −푛 (−푛)(−푛 − 1) ⋯ (−푛 − 푘 + 1) (1.6) ( ) = 푘 푘! 푛(푛 + 1) ⋯ (푛 + 푘 − 1) = (−1)푘 푘! 푛 = (−1)푘(( )) 푘 by Theorem 1.3.4. This kind of situation where evaluation of an enumerative formula at negative arguments yields, up to sign, another enumerative function is called com- binatorial reciprocity and will be studied in Section 3.9.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

10 1. Basic Counting

1.4. Set partitions

We have already seen that disjoint unions are nice combinatorially. So it should come as no surprise that set partitions also play an important role.

A 푇 is a set 휌 of nonempty subsets 퐵1, ... , 퐵푘 such that 푇 = ⨄푖 퐵푖, written 휌 ⊢ 푇. The 퐵푖 are called blocks and we use the notation 휌 = 퐵1/ ... /퐵푘 leaving out all curly brackets and commas, even though the elements of the blocks, as well as the blocks themselves, are unordered. For example, one set partition of 푇 = {푎, 푏, 푐, 푑, 푒, 푓, 푔} is 휌 = 푎푐푓/푏푒/푑/푔 = 푑/푒푏/푔/푐푓푎. We let 퐵(푇) be the set of all 휌 ⊢ 푇. To illustrate, 퐵({푎, 푏, 푐}) = {푎/푏/푐, 푎푏/푐, 푎푐/푏, 푎/푏푐, 푎푏푐}. The 푛th Bell number is 퐵(푛) = #퐵([푛]). Although there is no known expression for 퐵(푛) as a simple product, there is a recursion. Theorem 1.4.1. The Bell numbers satisfy the initial condition 퐵(0) = 1 and the recur- rence relation 푛 − 1 퐵(푛) = ∑ ( )퐵(푛 − 푘) 푘 − 1 푘 for 푛 ≥ 1.

Proof. The initial condition counts the empty partition of ∅. For the recursion, given 휌 ∈ 퐵([푛]), let 푘 be the number of elements in the block 퐵 containing 푛. Then there 푛−1 are (푘−1) ways to pick the remaining 푘 − 1 elements of [푛 − 1] to be in 퐵. And the number of ways to partition [푛] − 퐵 is 퐵(푛 − 푘). Summing over all possible 푘 finishes the proof. □

We may sometimes want to keep track of the number of blocks in our partitions. So define 푆(푇, 푘) to be the set of all 휌 ⊢ 푇 with 푘 blocks. The Stirling numbers of the second kind are 푆(푛, 푘) = #푆([푛], 푘). We will introduce Stirling numbers of the first kind in the next section. For example 푆({푎, 푏, 푐}, 2) = {푎푏/푐, 푎푐/푏, 푎/푏푐} so 푆(3, 2) = 3. Just as with the binomial coefficients, the 푆(푛, 푘) for 1 ≤ 푘 ≤ 푛 can be displayed in a triangle as in Figure 1.3. And like the binomial coefficients, these Stirling numbers satisfy a simple recurrence relation.

1 1 1 1 3 1 1 7 6 1 1 15 25 10 1

Figure 1.3. Rows 1 through 5 of Stirling’s second triangle

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.5. Permutations by cycle structure 11

Theorem 1.4.2. The Stirling numbers of the second kind satisfy the initial condition

푆(0, 푘) = 훿푘,0 and recurrence relation 푆(푛, 푘) = 푆(푛 − 1, 푘 − 1) + 푘푆(푛 − 1, 푘) for 푛 ≥ 1.

Proof. By now, the reader should be able to explain the initial condition without diffi- culty. For the recursion, the elements 휌 ∈ 푆([푛], 푘) are of two flavors: those where 푛 is in a block by itself and those where 푛 is in a block with other elements. Removing 푛 in the first case leaves a partition in 푆([푛 − 1], 푘 − 1) and this is a bijection. This accounts for the summand 푆(푛−1, 푘−1). Removing 푛 in the second case leaves 휎 ∈ 푆([푛−1], 푘), but this map is not a bijection. In particular, given 휎, one can insert 푛 into any one of its 푘 blocks to recover an element of 푆([푛], 푘). So the total count is 푘푆(푛 − 1, 푘) for this case. □

1.5. Permutations by cycle structure

The ordered analogue of a decomposition of a set into a partition is the decomposition of a permutation of [푛] into cycles. These are counted by the Stirling numbers of the first kind.

The symmetric group is 픖푛 = 푃([푛]). As the name implies, 픖푛 has a group structure defined as follows. If 휋 = 휋1 ... 휋푛 ∈ 픖푛, then we can view this permutation as a bijection 휋∶ [푛] → [푛] where 휋(푖) = 휋푖. From this it follows that 픖푛 is a group where the operation is composition of functions. ℓ Given 휋 ∈ 픖푛 and 푖 ∈ [푛], there is a smallest exponent ℓ ≥ 1 such that 휋 (푖) = 푖. This and various other claims below will be proved using digraphs in Section 1.9. In this case, the elements 푖, 휋(푖), 휋2(푖), ... , 휋ℓ−1(푖) are all distinct and we write 푐 = (푖, 휋(푖), 휋2(푖), ... , 휋ℓ−1(푖)) and call this a cycle of length ℓ or simply an ℓ-cycle of 휋. Cycles of length one are called fixed points. As an example, if 휋 = 6514237 and 푖 = 1, then we have 휋(1) = 6, 휋2(1) = 3, 휋3(1) = 1 so that 푐 = (1, 6, 3) is a cycle of 휋. We now iterate this process: if there is some 푗 ∈ [푛] which is not in any of the cycles computed so far, we find the cycle containing 푗 and continue until every element is in a cycle. The cycle decomposition of 휋 is 휋 = 푐1 ... 푐푘 where the 푐푗 are the cycles found in this process. Continuing our example, we could get 휋 = (1, 6, 3)(2, 5)(4)(7).

To distinguish the cycle decomposition of 휋 from its description as 휋 = 휋1 ... 휋푛 we will call the latter the one-line notation for 휋. This is also distinct from two-line notation, which is where one writes 1 2 ... 푛 (1.7) 휋 = . 휋1 휋2 ... 휋푛

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

12 1. Basic Counting

1 1 1 2 3 1 6 11 6 1 24 50 35 10 1

Figure 1.4. Rows 1 through 5 of Stirling’s first triangle

Note that an ℓ-cycle can be written in ℓ different ways depending on which of its elements one starts with; for example (1, 6, 3) = (6, 3, 1) = (3, 1, 6). Furthermore, the distinct cycles of 휋 are disjoint. So if we think of the cycle 푐 as the permutation of [푛] which agrees with 휋 on the elements of 푐 and has all other elements as fixed points, then the cycles of 휋 = 푐1 ... 푐푘 commute where we consider the product as a composition of permutations. Returning to our running example, we could write 휋 = (1, 6, 3)(2, 5)(4)(7) = (4)(1, 6, 3)(7)(2, 5) = (5, 2)(3, 1, 6)(7)(4). As mentioned above, we defer the proof of the following result until Section 1.9.

Theorem 1.5.1. Every 휋 ∈ 픖푛 has a cycle decomposition 휋 = 푐1 ... 푐푘 which is unique up to the order of the factors and cyclic reordering of the elements within each 푐푖.

We are now in a position to proceed parallel to the development of set partitions with a given number of blocks in the previous section. For 푛 ≥ 0 we denote by 푐([푛], 푘) the set of all permutations in 픖푛 which have 푘 cycles in their decomposition. Note the difference between “푘 cycles” referring to the number of cycles and “푘-cycles” referring to the length of the cycles. The signless Stirling numbers of the first kind are 푐(푛, 푘) = #푐([푛], 푘). So, analogous to what we have seen before, 푐(푛, 푘) = 0 for 푘 < 0 or 푘 > 푛. To illustrate the notation, 푐([4], 1) = {(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 2, 4), (1, 3, 4, 2), (1, 4, 2, 3), (1, 4, 3, 2)} so 푐(4, 1) = 6. In general, as you will be asked to prove in an exercise, 푐([푛], 1) = (푛−1)!. Part of Stirling’s first triangle is displayed in Figure 1.4. We also have a recursion. Theorem 1.5.2. The signless Stirling numbers of the first kind satisfy the initial condition

푐(0, 푘) = 훿푘,0 and recurrence relation 푐(푛, 푘) = 푐(푛 − 1, 푘 − 1) + (푛 − 1)푐(푛 − 1, 푘) for 푛 ≥ 1.

Proof. As usual, we concentrate on the recurrence. Given 휋 ∈ 푐([푛], 푘), we can re- move 푛 from its cycle. If 푛 was a fixed point, then the resulting permutations are counted by 푐(푛 − 1, 푘 − 1). If 푛 was in a cycle of length at least two, then the per- mutations obtained upon removal are in 푐([푛 − 1], 푘). So one must find the number of

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.6. Integer partitions 13

ways to insert 푛 into a cycle of some 휎 ∈ 푐([푛 − 1], 푘). There are ℓ places to insert 푛 in a cycle of length ℓ. So the total number of insertion spots is the sum of the cycle lengths of 휎, which is 푛 − 1. □

The reader may have guessed that there are also (signed) Stirling numbers of the first kind defined by 푠(푛, 푘) = (−1)푛−푘푐(푛, 푘). It is not immediately apparent why one would want to attach signs to these constants. We will see one reason in Chapter 5 where it will be shown that the 푠(푛, 푘) are the Whitney numbers of the first kind for the lattice of set partitions ordered by refinement. Here we will content ourselves with proving an analogue of part (d) of Theorem 1.3.3.

Corollary 1.5.3. For 푛 ≥ 0 we have 1 if 푛 = 0 or 1, ∑ 푠(푛, 푘) = { 0 if 푛 ≥ 2. 푘

Proof. The cases when 푛 = 0 or 1 are easy to verify, so assume 푛 ≥ 2. Since 푠(푛, 푘) = (−1)푛−푘푐(푛, 푘) and (−1)푛 is constant throughout the summation, it suffices to show 푘 that ∑푘(−1) 푐(푛, 푘) = 0. Using Theorem 1.5.2 and induction on 푛 we obtain ∑(−1)푘푐(푛, 푘) = ∑(−1)푘푐(푛 − 1, 푘 − 1) + ∑(−1)푘(푛 − 1)푐(푛 − 1, 푘) 푘 푘 푘 = − ∑(−1)푘−1푐(푛 − 1, 푘 − 1) + (푛 − 1) ∑(−1)푘푐(푛 − 1, 푘) 푘 푘 = −0 + (푛 − 1)0

= 0 as desired. □

Note the usefulness of considering the sums in the preceding proof as over 푘 ∈ ℤ rather than 0 ≤ 푘 ≤ 푛. This does away with having to consider any special cases at the values 푘 = 0 or 푘 = 푛.

1.6. Integer partitions

Just as one can partition a set into blocks, one can partition a nonnegative integer as a sum. Integer partitions play an important role not just in combinatorics but also in number theory and the representation theory of the symmetric group. See the appendix at the end of the book for more information on the latter. An integer partition of 푛 ≥ 0 is a multiset 휆 of positive integers such that the sum of the elements of 휆, denoted |휆|, is 푛. We also write 휆 ⊢ 푛. These elements are called the parts. Since the parts of 휆 are unordered, we will always list them in a canonical order 휆 = (휆1, ... , 휆푘) which is weakly decreasing. We let 푃(푛) denote the set of all

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

14 1. Basic Counting

partitions of 푛 and 푝(푛) = #푃(푛). For example, 푃(4) = {(1, 1, 1, 1), (2, 1, 1), (2, 2), (3, 1), (4)} so that 푝(4) = 5. Note the distinction between 푃([푛]), which is a set of set partitions, and 푃(푛), which is a set of integer partitions. Sometimes we will just say “partition” if the context makes it clear whether we are partitioning sets or integers. We will use multiplicity notation for integer partitions just as we would for any multiset, writing

휆 = (1푚1 , 2푚2 , ... , 푛푚푛 )

where 푚푖 is the multiplicity of 푖 in 휆. There is no known product formula for 푝(푛). In fact, there is not even a simple recurrence relation. One can use generating functions to derive results about these numbers, but that must wait until Chapter 3. Here we will just introduce a useful geometric device for studying 푝(푛). The Ferrers or Young diagram of 휆 = (휆1, ... , 휆푘) ⊢ 푛 is an array of 푛 boxes into left-justified rows such that row 푖 contains 휆푖 boxes. Dots are also sometimes used in place of boxes and in this case some authors use “Ferrers diagram” for the dot variant and “Young diagram” for the corresponding array of boxes. We often make no distinction between a partition and its Young diagram. The Young diagram of 휆 = (5, 5, 2, 1) is shown in Figure 1.5. We should warn the reader that we are writing our Young diagrams in English notation where the rows are numbered from 1 to 푘 from the top down as in a matrix. Some authors prefer French notation where the rows are numbered from bottom to top as in a Cartesian coordinate system. The conjugate or transpose of 휆 is the partition 휆푡 whose Young diagram is obtained by reflecting the diagram of 휆 about its main diagonal. This is done in Figure 1.5, showing that (5, 5, 2, 1)푡 = (4, 3, 2, 2, 2). There is also another way to express the parts of the conjugate.

휆 = (5, 5, 2, 1) = = ••••• ••••• •• •

휆푡 =

Figure 1.5. A partition, its Young diagram, and its conjugate

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.6. Integer partitions 15

푡 푡 푡 Proposition 1.6.1. If 휆 = (휆1, ... , 휆푘) is a partition and 휆 = (휆1, ... , 휆푙 ), then, for 1 ≤ 푗 ≤ 푙, 푡 휆푗 = #{푖 ∣ 휆푖 ≥ 푗}.

푡 Proof. By definition, 휆푗 is the length of the 푗th column of 휆. But that column contains □ a box in row 푖 if and only if 휆푖 ≥ 푗.

The number of parts of a partition 휆 is called its length and is denoted ℓ(휆). At this point the reader is probably expecting a discussion of those partitions of 푛 with ℓ(휆) = 푘. As it turns out, it is a bit simpler to consider 푃(푛, 푘), the set of all partitions 휆 of 푛 with ℓ(휆) ≤ 푘, and 푝(푛, 푘) = #푃(푛, 푘). Note that the number of 휆 ⊢ 푛 with ℓ(휆) = 푘 is just 푝(푛, 푘) − 푝(푛, 푘 − 1). So in some sense the two viewpoints are equivalent. But it will be easier to state our results in terms of 푝(푛, 푘). Note also that 푝(푛, 0) ≤ 푝(푛, 1) ≤ ⋯ ≤ 푝(푛, 푛) = 푝(푛, 푛 + 1) = ⋯ = 푝(푛). Because of this behavior, it is best to display the 푝(푛, 푘) in a matrix, rather than a trian- gle, keeping in mind that the entries in the 푛th row eventually stabilize to an infinite repetition of the constant 푝(푛). Part of this array will be found in Figure 1.6. We also assume that 푝(푛, 푘) = 0 if 푛 < 0 or 푘 < 0. Unlike 푝(푛), one can write down a simple recurrence relation for 푝(푛, 푘). Theorem 1.6.2. The 푝(푛, 푘) satisfy 0 if 푘 < 0, 푝(0, 푘) = { 1 if 푘 ≥ 0 and 푝(푛, 푘) = 푝(푛 − 푘, 푘) + 푝(푛, 푘 − 1) for 푛 ≥ 1

Proof. We skip directly to the recursion. Note that since conjugation is a bijection, 푝(푛, 푘) also counts the partitions 휆 = (휆1, ... , 휆푙) ⊢ 푛 such that 휆1 ≤ 푘. It will be convenient to use this interpretation of 푝(푛, 푘) for the proof. We have two possible cases. If 휆1 = 푘, then 휇 = (휆2, ... , 휆푙) ⊢ 푛 − 푘 and 휆2 ≤ 휆1 = 푘. So these partitions are counted by 푝(푛 − 푘, 푘). The other possibility is that 휆1 ≤ 푘 − 1. And these 휆 are taken care of by the 푝(푛, 푘 − 1) term. □

0 1 2 3 4 5 0 1 1 1 1 1 1 1 0 1 1 1 1 1 2 0 1 2 2 2 2 3 0 1 2 3 3 3 4 0 1 3 4 5 5

Figure 1.6. The values 푝(푛, 푘) for 0 ≤ 푛 ≤ 4 and 0 ≤ 푘 ≤ 5

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

16 1. Basic Counting

1.7. Compositions

Recall that integer partitions are really unordered even though we usually list them in weakly decreasing fashion. This raises the question about what happens if we consid- ered ways to write 푛 as a sum when the summands are ordered. This is the notion of a composition.

A composition of 푛 is a sequence 훼 = [훼1, ... , 훼푘] of positive integers called parts

such that ∑푖 훼푖 = 푛. We write 훼 ⊧ 푛 and use square brackets to distinguish composi- tions from integer partitions. This causes a notational conflict between [푛] as a compo- sition of 푛 and as the integers from 1 to 푛, but the context should make it clear which interpretation is meant. Let 푄(푛) be the set of compositions of 푛 and 푞(푛) = #푄(푛). So the compositions of 4 are 푄(4) = {[1, 1, 1, 1], [2, 1, 1], [1, 2, 1], [1, 1, 2], [2, 2], [3, 1], [1, 3], [4]}. So 푞(4) = 8, which is a power of 2. This, as your author is fond of saying, is not a coincidence.

Theorem 1.7.1. For 푛 ≥ 1 we have 푞(푛) = 2푛−1.

Proof. There is a famous bijection 휙 ∶ 2[푛−1] → 푄(푛), which we will use to prove this result. This map will be useful when working with quasisymmetric functions in Chapter 8. Given 푆 = {푠1, ... , 푠푘} ⊆ [푛 − 1] written in increasing order, we define

(1.8) 휙(푆) = [푠1 − 푠0, 푠2 − 푠1, ... , 푠푘 − 푠푘−1, 푠푘+1 − 푠푘]

where, by definition, 푠0 = 0 and 푠푘+1 = 푛. To show that 휙 is well-defined, suppose 휙(푆) = [훼1, ... , 훼푘+1]. Since 푆 is increasing, 훼푖 = 푠푖 − 푠푖−1 is a positive integer. Fur- thermore 푘+1 푘+1

∑ 훼푖 = ∑(푠푖 − 푠푖−1) = 푠푘+1 − 푠0 = 푛. 푖=1 푖=1 Thus 휙(푆) ∈ 푄(푛) as desired. To show that 휙 is bijective, we construct its inverse 휙−1 ∶ 푄(푛) → 2[푛−1]. Given 훼 = [훼1, ... , 훼푘+1] ∈ 푄(푛), we let −1 휙 (훼) = {훼1, 훼1 + 훼2, 훼1 + 훼2 + 훼3, ... , 훼1 + 훼2 + ⋯ + 훼푘}. It should not be hard for the reader to prove that 휙−1 is well-defined and the inverse of 휙. □

As usual, we wish to make a more refined count by restricting the number of con- stituents of the object under consideration. Let 푄(푛, 푘) be the set of all compositions of 푛 with exactly 푘 parts and let 푞(푛, 푘) = #푄(푛, 푘). Since the 푞(푛, 푘) will turn out to be previously studied constants, we will forgo the usual triangle. The result below fol- lows easily by restricting the function 휙 from the previous proof, so the demonstration is omitted.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.8. The twelvefold way 17

Theorem 1.7.2. The composition numbers satisfy

푞(0, 푘) = 훿푘,0 and 푛 − 1 푞(푛, 푘) = ( ) 푘 − 1 for 푛 ≥ 1. □

1.8. The twelvefold way

We now have all the tools in place to count certain functions. There are 12 types of such functions and so this scheme is called the twelvefold way, an idea which was introduced in a series of lectures by Gian-Carlo Rota. The name was suggested by Joel Spencer and should not be confused with the twelvefold path of Buddhism! We will consider three types of functions 푓 ∶ 퐷 → 푅, namely, arbitrary functions, injections, and surjections. We will also permit the domain 퐷 and range 푅 to be of two types each: either distinguishable, which means it is a set, or indistinguishable, which means it is a multiset consisting of a single element repeated some number of times. Thus the total number of types of functions under consideration is the product of the number of choices for 푓, 퐷, and 푅 or 3 ⋅ 2 ⋅ 2 = 12. Of course, a function where the domain or range is a multiset is not really well-defined, even though the intuitive notion should be clear. To be precise, when 퐷 is a multiset and 푅 is a set, suppose 퐷′ is a set with |퐷′| = |퐷|. Then a function 푓 ∶ 퐷 → 푅 is an of functions 푓 ∶ 퐷′ → 푅 where 푓 and 푔 are equivalent if #푓−1(푟) = #푔−1(푟) for all 푟 ∈ 푅. The reader can come up with the corresponding notions for the other cases if desired. We will assume throughout that |퐷| = 푛 and |푅| = 푘 are both nonnegative integers. We will collect the results in the chart in Table 1.1. We first deal with the case where both 퐷 and 푅 are distinguishable. Without loss of generality, we can assume that 퐷 = [푛]. So a function 푓 ∶ 퐷 → 푅 can be considered as a word 푤 = 푓(1)푓(2) ... 푓(푛). Since there are 푘 choices for each 푓(푖), we have, by Theorem 1.2.2, that the number of such 푓 is #푃(([푘], 푛)) = 푘푛. If 푓 is injective, then 푤 becomes a permutation, giving the count #푃([푘], 푛) = 푘↓푛 from Theorem 1.2.1. For surjective functions, we need a new concept. If 퐷 is a set, then the kernel of a function 푓 ∶ 퐷 → 푅 is the partition ker 푓 of 퐷 whose blocks are the nonempty subsets of the form 푓−1(푟) for 푟 ∈ 푅. For example, if 푓 ∶ {푎, 푏, 푐, 푑} → {1, 2, 3} is given by 푓(푎) = 푓(푐) = 2,

Table 1.1. The twelvefold way

퐷 푅 arbitrary 푓 injective 푓 surjective 푓

푛 dist. dist. 푘 푘↓푛 푘! 푆(푛, 푘) 푛+푘−1 푘 푛−1 indist. dist. ( 푛 ) (푛) (푘−1) 푘 dist. indist. ∑푗=0 푆(푛, 푗) 훿(푛 ≤ 푘) 푆(푛, 푘) indist. indist. 푝(푛, 푘) 훿(푛 ≤ 푘) 푝(푛, 푘) − 푝(푛, 푘 − 1)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

18 1. Basic Counting

푓(푏) = 3, and 푓(푑) = 1, then ker 푓 = 푎푐/푏/푑. If 푓 is to be surjective, then the function can be specified by picking a partition of 퐷 for ker 푓 and then picking a bijection 푔 from the blocks of ker 푓 into 푅. Continuing our example, 푓 is completely determined by its kernel and the bijection 푔(푎푐) = 2, 푔(푏) = 3, and 푔(푑) = 1. The number of ways to choose ker 푓 = 퐵1/ ... /퐵푘 is 푆(푛, 푘) by definition. And, using the injective case with 푛 = 푘, the number of bijections 푔∶ {퐵1, ... , 퐵푘} → 푅 is 푘 ↓푘= 푘!. So the total count is 푘! 푆(푛, 푘). Now suppose 퐷 is indistinguishable and 푅 is distinguishable where we assume 푅 = [푘]. Then one can think of 푓 ∶ 퐷 → 푅 as a multiset 푀 = {{1푚1 , ... , 푘푚푘 }} on 푅 −1 where 푚푖 = #푓 (푖). It follows that ∑푖 푚푖 = #퐷 = 푛. So, by Theorem 1.3.4, the number of all such 푓 is 푘 푛 + 푘 − 1 (( )) = ( ). 푛 푛 If 푓 is to be injective, then we are picking an 푛-element subset of 푅 = [푘] giving a count 푘 of (푛). If 푓 is to be surjective, then 푚푖 ≥ 1 for all 푖 so that [푚1, ... , 푚푘] is a composition 푛−1 of 푛. It follows from Theorem 1.7.2 that the number of functions is 푞(푛, 푘) = (푘−1). To deal with the case when 퐷 = [푛] is distinguishable and 푅 is indistinguishable, we introduce a useful extension of the Kronecker delta. If 푆 is any statement, we let 1 if 푆 is true, (1.9) 훿(푆) = { 0 if 푆 is false. Returning to our counting, 푓 is completely determined by its kernel, which is a parti- tion of [푛]. If we are considering all 푓, then the kernel can have any number of blocks up to and including 푘. Summing the corresponding Stirling numbers gives the corre- sponding entry in Table 1.1. If 푓 is injective, then for such a function to exist we must have 푛 ≤ 푘. And in that case there is only one possible kernel, namely the partition into singleton blocks. This count can be summarized as 훿(푛 ≤ 푘). For surjective 푓 we are partitioning [푛] into exactly 푘 blocks, giving 푆(푛, 푘) possibilities.

If 퐷 and 푅 are both indistinguishable, then the nonzero numbers of the form 푚푖 = #푓−1(푟) for 푟 ∈ 푅 completely determine 푓. And these numbers form a partition of 푛 = #퐷 into at most 푘 = #푅 parts. Recalling the notation of Section 1.6, the total number of such 푓 is 푝(푛, 푘). The line of reasoning for injective functions follows that of the previous paragraph with the same resulting answer. Finally, for surjectivity we need exactly 푘 parts, which is counted by 푝(푛, 푘) − 푝(푛, 푘 − 1).

1.9. Graphs and digraphs

Graph theory is a substantial part of combinatorics. We will use directed graphs to give the postponed proof of the existence and uniqueness of the cycle decomposition of permutations in 픖푛. A labeled graph 퐺 = (푉, 퐸) consists of a set 푉 of elements called vertices and a set 퐸 of elements called edges where an edge consists of an unordered pair of vertices. We will write 푉(퐺) and 퐸(퐺) for the vertex and edge set of 퐺, respectively, if we wish to emphasize the graph involved. Geometrically, we think of the vertices as nodes and the edges as line segments or curves joining them. Conventionally, in graph theory an

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.9. Graphs and digraphs 19

푣 푤

푦 푥

Figure 1.7. A graph 퐺

edge connecting vertices 푣 and 푤 is written 푒 = 푣푤 rather than 푒 = {푣, 푤}. In this case we say that 푒 contains 푣 and 푤, or that 푒 has endpoints 푣 and 푤. We also say that 푣 and 푤 are neighbors. For example, a drawing of the graph 퐺 with vertices 푉 = {푣, 푤, 푥, 푦} and edges 퐸 = {푣푤, 푣푥, 푣푦, 푤푥, 푥푦} is displayed in Figure 1.7. If #푉 = 1, then there is only one graph with vertex set 푉 and such a graph is called trivial. Call graph 퐻 a subgraph of 퐺, written 퐻 ⊆ 퐺, if 푉(퐻) ⊆ 푉(퐺) and 퐸(퐻) ⊆ 퐸(퐺). In this case we also say that 퐺 contains 퐻. There are several types of subgraphs which will play an important role in what follows. A walk of length ℓ in 퐺 is a sequence of vertices 푊 ∶ 푣0, 푣1, ... , 푣ℓ such that 푣푖−1푣푖 ∈ 퐸 for 1 ≤ 푖 ≤ ℓ. We say that the walk is from 푣0 to 푣ℓ, or is a 푣0–푣ℓ walk, or that 푣0, 푣ℓ are the endpoints of 푊. We call 푊 a path if all the vertices are distinct and we usually use letters like 푃 for paths. In particular, we will use 푊푛 or 푃푛 to denote a walk or a path having 푛 vertices, respectively. In our example graph, 푃 ∶ 푦, 푣, 푥, 푤 is a path of length 3 from 푦 to 푤. Notice that length refers to the number of edges in the path, which is one less than the number of vertices. A cycle of length ℓ in 퐺 is a sequence of distinct vertices 퐶 ∶ 푣1, 푣2, ... , 푣ℓ such that we have distinct edges 푣푖−1푣푖 for 1 ≤ 푖 ≤ ℓ, and subscripts are taken modulo ℓ so that 푣0 = 푣ℓ. Returning to our running example, 퐶 ∶ 푣, 푥, 푦 is a cycle in 퐺 of length 3. In a cycle the length is both the number of vertices and the number of edges. The notation 퐶푛 will be used for a cycle with 푛 vertices and we will call this an 푛-cycle. We also 푛 denote by 퐾푛 the complete graph which consists of 푛 vertices and all possible (2) edges between them. A copy of a complete graph in a graph 퐺 is often called a clique. There is a close relationship between some of the parts of a graph which we have just defined.

Lemma 1.9.1. Let 퐺 be a graph and let 푢, 푣 ∈ 푉.

(a) Any walk from 푢 to 푣 contains a path from 푢 to 푣. (b) The union of any two different paths from 푢 to 푣 contains a cycle.

Proof. We will prove (a) and leave (b) as an exercise. Let 푊 ∶ 푣0, ... , 푣ℓ be the walk. We will induct on ℓ, the length of 푊. If ℓ = 0, then 푊 is a path. So assume ℓ ≥ 1. If 푊 is a path, then we are done. If not, then some vertex of 푊 is repeated, say 푣푖 = 푣푗 for ′ 푖 < 푗. Then we have a 푢–푣 walk 푊 ∶ 푣0, 푣1, ... , 푣푖, 푣푗+1, 푣푗+2, ... , 푣ℓ which is shorter than 푊. By induction, 푊 ′ contains a path 푃 and so 푊 contains 푃 as well. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

20 1. Basic Counting

To state our first graphical enumeration result, let 풢(푉) be the set of all graphs on the vertex set 푉. We will also use 풢(푉, 푘) to denote the set of all graphs in 풢(푉) with 푘 edges.

Theorem 1.9.2. For 푛 ≥ 1 and 푘 ≥ 0 we have (푛) #풢([푛]) = 2 2 and (푛) #풢([푛], 푘) = ( 2 ). 푘

Proof. If 푉 = [푛] is given, then a graph 퐺 with vertex set 푉 is completely determined 푛 by its edge set. Since there are 푛 vertices, there are (2) possible edges to choose from. So the number of 퐺 in 풢([푛]) is the number of subsets of these edges, which, by The- orem 1.3.1, is the given power of 2. The proof for 풢([푛], 푘) is similar, just using the definition (1.4). □

A graph is unlabeled if the vertices in 푉 are indistinguishable. If the type of graph is clear from the context or does not matter for the particular application at hand, we will omit the adjectives “labeled” and “unlabeled”. The enumeration of unlabeled graphs is much more complicated than for labeled ones. So this discussion is postponed until Section 6.4 where we will develop the necessary tools. If 퐺 is a graph and 푣 ∈ 푉, then the degree of 푣 is deg 푣 = the number of 푒 ∈ 퐸 containing 푣. In our running example deg 푣 = deg 푥 = 3 and deg 푤 = deg 푦 = 2. There is a nice relationship between vertex degrees and the cardinality of the edge set. The demon- stration of the next result illustrates an important method of proof in combinatorics, counting in pairs.

Theorem 1.9.3. For any graph 퐺 we have ∑ deg 푣 = 2|퐸|. 푣∈푉

Proof. Consider 푃 = {(푣, 푒) | 푣 is contained in 푒}. Then #푃 = ∑ (number of 푒 containing 푣) = ∑ deg 푣. 푣∈푉 푣∈푉 On the other hand #푃 = ∑(number of 푣 contained in 푒) = ∑ 2 = 2|퐸|. 푒∈퐸 푒∈퐸 Equating the two counts finishes the proof. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.9. Graphs and digraphs 21

푣 푤

푦 푥

Figure 1.8. A digraph 퐷

Theorem 1.9.3 is often called the Handshaking Lemma because of the following interpretation. Suppose 푉 is the set of people at a party and we draw an edge between person 푣 and person 푤 if they shake hands during the festivities. Then adding up the number of handshakes given by each person gives twice the total number of hand- shakes. It is often useful to have specified directions along the edges. A labeled directed graph, also called a digraph, is 퐷 = (푉, 퐴) where 푉 is a set of vertices and 퐴 is a set of arcs which are ordered pairs of vertices. We use the notation 푎 = ⃗푣푤 for arcs and say that 푎 goes from 푣 to 푤. To illustrate, the digraph with 푉 = {푣, 푤, 푥, 푦} and 퐴 = { ⃗푣푤, ⃗푤푣, ⃗푤푥, 푦푣,⃗ 푦푥}⃗ is drawn in Figure 1.8. We use 푉(퐷) and 퐴(퐷) to denote the vertex set and arc set, respectively, of a digraph 퐷 when we wish to be more precise. Directed walks, paths, and cycles are defined for digraphs similarly to their undirected cousins in graphs, just insisting the ⃗푣푖−1푣푖 ∈ 퐴 for 푖 in the appropriate range. So, in our example digraph, 푃 ∶ 푦, 푣, 푤, 푥 is a directed path and 퐶 ∶ 푣, 푤 is a directed cycle. Note that 푤, 푥, 푦, 푣 is not a directed path because the arc between 푥 and 푦 goes the wrong way. Let 풟(푉) and 풟(푉, 푘) be the set of digraphs and the set of digraphs with 푘 arcs, respectively, having vertex set 푉. The next result is proved in much the same manner as Theorem 1.9.2 so the demonstration is omitted. Theorem 1.9.4. For 푛 ≥ 1 and 푘 ≥ 0 we have #풟([푛]) = 2푛(푛−1) and 푛(푛 − 1) #풟([푛], 푘) = ( ). □ 푘

In a digraph 퐷 there are two types of degrees. Vertex 푣 ∈ 푉 has out-degree and in-degree odeg 푣 = the number of 푎 ∈ 퐴 of the form 푎 = ⃗푣푤, ideg 푣 = the number of 푎 ∈ 퐴 of the form 푎 = ⃗푤푣, respectively. In Figure 1.8, for example, odeg 푣 = 1 and ideg 푣 = 2. The next result will permit us to finish our leftover business from Section 1.5. The union of digraphs 퐷 ∪ 퐸 is the digraph with vertices 푉(퐷 ∪ 퐸) = 푉(퐷) ∪ 푉(퐸) and arcs 퐴(퐷 ∪ 퐸) = 퐴(퐷) ∪ 퐴(퐸).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

22 1. Basic Counting

Lemma 1.9.5. Let 퐷 = (푉, 퐴) be a digraph. We have odeg 푣 = ideg 푣 = 1 for all 푣 ∈ 푉 if and only if 퐷 is a disjoint union of directed cycles.

Proof. The reverse implication is easy to see since the out-degree and in-degree of any vertex 푣 of 퐷 would be the same as those degrees in the directed cycle containing 푣. But in such a cycle odeg 푣 = ideg 푣 = 1.

For the forward direction, pick any 푣 = 푣1 ∈ 푉. Since odeg 푣1 = 1 there must exist a vertex 푣2 with ⃗푣1푣2 ∈ 퐴. By the same token, there must be a 푣3 with ⃗푣2푣3 ∈ 퐴. Continue to generate a sequence 푣1, 푣2,... in this manner. Since 푉 is finite, there must be two indices 푖 < 푗 such that 푣푖 = 푣푗. Let 푖 be the smallest such index and let 푗 be the first index after 푖 where repetition occurs. Thus 푖 = 1, for if not, then we have ⃗푣푖−1푣푖, ⃗푣푗−1푣푖 ∈ 퐴, contradicting the fact that ideg 푣푖 = 1. By definition of 푗, we have a directed cycle 퐶 ∶ 푣1, 푣2, ... , 푣푗−1. Furthermore, no vertex of 퐶 can be involved in another arc since that would make its out-degree or in-degree too large. Continuing in this manner, we can decompose 퐷 into disjoint directed cycles. □

Sometimes it is useful to allow loops in a graph which are edges of the form 푒 = 푣푣. Similarly, we can permit loops as arcs 푎 = ⃗푣푣 in a digraph. Another possibility is that we would want multiple edges, meaning that one could have more than one edge between a given pair of vertices, making 퐸 into a multiset. Multiple arcs are defined similarly. If we make no specification for our (di)graph, then we are assuming that it has neither loops nor multiple edges. We will now prove Theorem 1.5.1.

Proof (of Theorem 1.5.1). To any 휋 ∈ 픖푛 we associate its functional digraph 퐷휋 which has 푉 = [푛] and an arc ⃗횤횥 ∈ 퐴 if and only if 휋(푖) = 푗. Now 퐷휋 is a digraph with loops. Because 휋 is a function we have odeg 푖 = 1 for all 푖 ∈ [푛]. And because 휋 is a bijection we also have ideg 푖 = 1 for all 푖. The proof of the previous lemma works equally well if one allows loops. So 퐷휋 is a disjoint union of cycles. But cycles of the digraph 퐷휋 correspond to cycles of the permutation 휋. Thus the cycle decomposition of 휋 exists. It is also easy to check that the cycles of 퐷휋 produced by the algorithm in the demonstration of necessity in Lemma 1.9.5 are unique. This implies the uniqueness statement about the cycles of 휋 and so we are done. □

1.10. Trees

Trees are a type of graph which often occurs in practice, even in domains outside of mathematics. For example, trees are used as data structures in computer science, or to model evolution in genetics. A graph 퐺 is connected if, for every pair of vertices 푣, 푤 ∈ 푉, there is a walk in 퐺 from 푣 to 푤. By Lemma 1.9.1(a), this is equivalent to there being a path from 푣 to 푤 in 퐺. The connected components of 퐺 are the maximal connected subgraphs. If 퐺 is connected, there is only one component. Call 퐺 acyclic if it contains no cycles. A forest is another name for an acyclic graph. The connected components of a forest are called trees. So a graph 푇 is a tree if it is both connected and acyclic. Figure 1.9 contains five trees 푇1, ... , 푇5.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.10. Trees 23

1 1 1 1 1

4 6 4 6 4 6 4 4

5 2 3 2 3 2 2

푇1 푇2 푇3 푇4 푇5 푤1 = 4 푤2 = 6 푤3 = 1 푤4 = 4 푤1 = 4 푙1 = 5 푙2 = 3 푙3 = 6 푙4 = 2 푙2 = 3

Figure 1.9. The Prüfer algorithm

A leaf in a graph 퐺 is a vertex 푣 having deg 푣 = 1. The next result will show that nontrivial trees have leaves (regardless of the time of year). Further, it should be clear from this lemma why leaves are a useful tool for induction in trees. In order to state it, we need the following notation. If 퐺 is a graph and 푊 ⊆ 푉, then 퐺 − 푊 is the graph on the vertex set 푉 − 푊 whose edge set consists of all edges in 퐸 with both endpoints in 푉 − 푊. If 푊 = {푣} for some 푣, then we write 퐺 − 푣 for 퐺 − {푣}. In Figure 1.9, 푇2 = 푇1 − 5. Similarly, if 퐹 ⊆ 퐸, then 퐺 − 퐹 is the graph with 푉(퐺 − 퐹) = 푉(퐺) and 퐸(퐺 − 퐹) = 퐸(퐺) − 퐸(퐹). If 퐹 consists of a single edge, then we use a similar abbreviation as for subtracting vertices.

Lemma 1.10.1. Let 푇 be a tree with #푉 ≥ 2. (a) 푇 has (at least) 2 leaves. (b) If 푣 is a leaf of 푇, then 푇′ = 푇 − 푣 is also a tree.

Proof. (a) Let 푃 ∶ 푣0, ... , 푣ℓ be a path of maximum length in 푇. Since 푇 is nontrivial, 푣0 ≠ 푣ℓ. We claim that 푣0, 푣ℓ are leaves and we will prove this for 푣0 as the same proof works for 푣ℓ. Suppose, towards a contradiction, that deg 푣0 ≥ 2. Then there must be a vertex 푤 ≠ 푣1 such that 푣0푤 ∈ 퐸. We now have two possibilities. If 푤 is not a vertex of ′ 푃, then the path 푃 ∶ 푤, 푣0, ... , 푣ℓ is longer than 푃, a contradiction to the definition of 푃. If 푤 = 푣푖 for some 2 ≤ 푖 ≤ ℓ, then the portion of 푃 from 푣0 to 푣푖 together with the edge 푣0푣푖 forms a cycle in 푇, again a contradiction.

(b) It is clear that 푇′ is still acyclic since removing vertices cannot create a cycle. To show it is connected, take 푥, 푦 ∈ 푉(푇′). So 푥, 푦 are also vertices of 푇. Since 푇 is connected, Lemma 1.9.1(a) implies that there is a path 푃 from 푥 to 푦 in 푇. If this path is also in 푇′, then we will be done. But if 푃 goes through 푣, then, since there is a unique vertex 푣′ adjacent to 푣, 푃 would have to pass through 푣′ just before and just after 푣. This contradicts the fact that the vertices of 푃 are distinct. □

There are a number of characterizations of trees. We collect some of them here as they will be useful in the sequel.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

24 1. Basic Counting

Theorem 1.10.2. Let 푇 be a graph with #푉 = 푛 and #퐸 = 푚. The following are equivalent conditions for 푇 to be a tree: (a) 푇 is connected and acyclic. (b) 푇 is acyclic and 푛 = 푚 + 1. (c) 푇 is connected and 푛 = 푚 + 1. (d) For every pair of vertices 푢, 푣 there is a unique path from 푢 to 푣.

Proof. We will prove the equivalence of (a), (b), and (c). The equivalence of (a) and (d) is left as an exercise. To prove that (a) implies (b), it suffices to show by induction on 푛 that 푛 = 푚 + 1. This is trivial if 푛 = 1. If 푛 ≥ 2, then, by Lemma 1.10.1, 푇 has a leaf 푣. Induction applies to 푇′ = 푇 − 푣 so that its vertex and edge cardinalities are related by 푛′ = 푚′ + 1. But 푛 = 푛′ + 1 and 푚 = 푚′ + 1 so that 푛 = 푚 + 1.

To see why (b) implies (c), consider the connected components 푇1, ... , 푇푘 of 푇. Since 푇 is acyclic, each of these components is a tree. Also, from the implication (a) ⟹ (b), we have that 푛푖 = 푚푖 + 1 for 1 ≤ 푖 ≤ 푘 where 푛푖 = #푉(푇푖) and 푚푖 = #퐸(푇푖).

Adding these equations together and using the fact that ∑푖 푛푖 = 푛 and ∑푖 푚푖 = 푚 we obtain 푛 = 푚 + 푘. But we are given that 푛 = 푚 + 1. So we must have 푘 = 1 . This means that 푇 only has one component and so is connected. We prove that (c) implies (a) by contradiction. So suppose that 푇 contains a cycle 퐶 and let 푒 = 푢푣 ∈ 퐸(퐶). We claim that 푇 − 푒 is still connected. For if 푥, 푦 are any two vertices of 푇 − 푒, then there is a walk 푊 from 푥 to 푦 in 푇. If 푊 does not contain 푒, then 푊 is still in 푇 − 푒. If 푊 does contain 푒, then replace 푒 in 푊 with the path 퐶 − 푒 to form a new walk 푊 ′ from 푥 to 푦 in 푇 − 푒. We can keep removing edges in this way until the resulting graph 푇′ is acyclic. Since 푇′ is still connected, it is a tree. And by the first implication we have 푛′ = 푚′ + 1. But 푛′ = 푛 and 푚′ < 푚 so that 푛 < 푚 + 1, the desired contradiction. □

Let 풯(푉) be the set of all trees on the vertex set 푉. There are quite a number of different proofs of the beautiful formula below for #풯(푉), many of which are in Moon’s book on the subject [64]. Theorem 1.10.3. For 푛 ≥ 1 we have #풯([푛]) = 푛푛−2.

Proof. The result is trivial if 푛 = 1, so assume 푛 ≥ 2. By Theorem 1.2.2 it suffices to find a bijection 푓 ∶ 풯([푛]) → 푃(([푛], 푛 − 2)). There is a famous algorithm for con- structing 푓 which is called the Prüfer algorithm. An example will be found in Figure 1.9. Given 푇 ∈ 풯([푛]), to determine 푓(푇) = 푤1 ... 푤푛−2 we will build a sequence of trees 푇 = 푇1, 푇2, ... , 푇푛−1 by removing vertices from 푇 as follows. Since the vertices of 푇 are labeled 1, ... , 푛 it makes sense to talk about, e.g., a maximum vertex because of the ordering on the integers. Given 푇푖, we find the leaf 푙푖 ∈ 푉(푇푖) such that 푙푖 is maximum and let 푇푖+1 = 푇푖 − 푙푖. By the previous lemma, 푇푖+1 will also be a tree. Since 푙푖 is a leaf, it is adjacent to a unique vertex 푤푖 in 푇푖 and we let 푤푖 be the 푖th element of 푓(푇). Now each 푤푖 ∈ [푛] and 푓(푇) has length 푛 − 2 by definition. So 푓(푇) ∈ 푃(([푛], 푛 − 2)).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.11. Lattice paths 25

To show that 푓 is a bijection, we find its inverse. Given 푤 ∈ 푃(([푛], 푛 − 2)), we will first construct a permutation 푙 = 푙1 ... 푙푛−2 ∈ 푃([푛], 푛 − 2) where 푙푖 will turn out to be the leaf removed from 푇푖 to form 푇푖+1. We construct the 푙푖 inductively by letting

(1.10) 푙푖 = max([푛] − {푙1, ... , 푙푖−1, 푤푖, ... , 푤푛−2}). −1 Finally we construct 푓 (푤) = 푇 by letting 푇 have edges 푒푖 = 푙푖푤푖 for 1 ≤ 푖 ≤ 푛 − 2 as well as the edge 푒푛−1 = 푙푛−1푙푛 where [푛] − {푙1, ... , 푙푛−2} = {푙푛−1, 푙푛}. To show that −1 푓 (푤) = 푇 is a tree, note first that 푙1 is a leaf of 푇 because 푙1 is attached to 푤1 but ′ to none of the other vertices of 푇 by (1.10) and the definition of 푒푛−1. Consider 푤 = −1 ′ 푤2 ... 푤푛−2 and apply the algorithm for 푓 to 푤 using the ground set [푛]−{푙1} instead ′ ′ of [푛]. By induction, the result is a tree 푇 . And 푇 is formed by adding 푙1 as a leaf to 푇 , which makes 푇 a tree as well. To show that 푓 and 푓−1 are inverses we will show that 푓−1 ∘ 푓 is the identity map, −1 leaving the proof for 푓 ∘ 푓 to the reader. Suppose 푓(푇) = 푤1 ... 푤푛−2. Also let ′ ′ the sequence of leaves removed during the construction of 푓(푇) be 푙1 ... 푙푛−2. Then ′ by definition of the algorithm, the edges of 푇 are exactly 푙푖푤푖 for 1 ≤ 푖 ≤ 푛 − 2 ′ ′ ′ ′ ′ ′ and 푙푛−1푙푛 where [푛] − {푙1, ... , 푙푛−2} = {푙푛−1, 푙푛}. Comparing this with the definition −1 ′ of 푓 we see that it suffices to show that 푙푖 = 푙푖 for all 푖 and that this will follow if ′ one can prove the equality holds for 1 ≤ 푖 ≤ 푛 − 2. Since 푙푖 is a leaf in 푇푖, it cannot ′ ′ be any of the previously removed leaves 푙1, ... , 푙푖−1. Of the remaining vertices, those which are among 푤푖, ... , 푤푛−2 are not currently leaves since they are attached to fu- ture leaves which are to be removed. And conversely those not among the 푤푖, ... , 푤푛−2 must be leaves; otherwise, they would be listed as some 푤푗 for 푗 ≥ 푖 once all their adjacent leaves were removed. Hence the leaves of 푇푖 are precisely the elements of ′ ′ [푛] − {푙1, ... , 푙푖−1, 푤푖, ... , 푤푛−2}. Since we always remove the leaf of maximum value, ′ ′ we see that the rule for choosing 푙푖 is exactly the same as the one in (1.10). So 푙푖 = 푙푖 as desired. □

1.11. Lattice paths

Lattice paths lead to many interesting counting problems in combinatorics. They are also important in and statistics; see the book of Mohanty [63] for examples. Consider the integer lattice in the plane ℤ2 = {(푥, 푦) | 푥, 푦 ∈ ℤ}. A lattice path is a sequence of elements of ℤ2 written

푃 ∶ (푥0, 푦0), (푥1, 푦1), ... , (푥ℓ, 푦ℓ).

Just as in graph theory, we say the path has length ℓ and goes from (푥0, 푦0) to (푥ℓ, 푦ℓ), which are called its endpoints. Unlike graph-theoretic paths, we do not assume the (푥푖, 푦푖) are distinct. To illustrate the notation, if we assume that the left-hand path in Figure 1.10 starts at the origin, then it would be written 푃 ∶ (0, 0), (0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 3), (3, 4), (4, 4).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

26 1. Basic Counting

Figure 1.10. Dyck paths

The step between (푥푖−1, 푦푖−1) and (푥푖, 푦푖) on 푃 is the vector 푠푖 = [푥푖 − 푥푖−1, 푦푖 − 푦푖−1]. Note the use of square versus round brackets to distinguish steps from vertices of the path. Note that 푃 is determined up to translation by its steps and that it is determined completely by its steps and initial vertex. If no initial vertex is specified, it is assumed to be the origin. We let 퐸 = [1, 0] and 푁 = [0, 1], calling these east and north steps, respec- tively. The path on the left in Figure 1.10 could also be represented 푃 ∶ 푁푁퐸푁퐸퐸푁퐸. For our first enumerative result, we use the notation 풩ℰ(푚, 푛) for the set of lattice paths from (0, 0) to (푚, 푛) only using steps north and east. We call lattice paths using only 푁 and 퐸 steps northeast paths.

Theorem 1.11.1. For 푚, 푛 ≥ 0 we have

푚 + 푛 #풩ℰ(푚, 푛) = ( ). 푚

Proof. Let 푃 be a northeast lattice path from (0, 0) to (푚, 푛). Then 푃 has 푚 + 푛 total steps. And once 푚 of them are chosen to be 퐸, the rest must be 푁. The result follows. □

We will be particularly concerned with a special type of northeast path. A Dyck path of semilength 푛 is a northeast lattice path which begins at (0, 0), ends at (푛, 푛), and never goes below the line 푦 = 푥. The first path in Figure 1.10 is of this type. Note that 푛 is called the semilength because the Dyck path itself has 2푛 steps. We let 풟(푛) denote the set of Dyck paths of semilength 푛. This should cause no confusion with the notation 풟(푉) for the set of digraphs on the vertex set 푉 because in the former notation 푛 is a nonnegative integer while in the latter it is a set. We now define that Catalan numbers to be 퐶(푛) = #풟(푛). The Catalan numbers are ubiquitous in combinatorics. In fact, Stanley has written a book [92] containing 214 different combinatorial interpretations of 퐶(푛). A few of these are listed in the exercises. The Catalan numbers satisfy a nice recursion.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.11. Lattice paths 27

Theorem 1.11.2. We have the initial condition 퐶(0) = 1 and recurrence relation 퐶(푛) = 퐶(0)퐶(푛 − 1) + 퐶(1)퐶(푛 − 2) + 퐶(2)퐶(푛 − 3) + ⋯ + 퐶(푛 − 1)퐶(0) for 푛 ≥ 1.

Proof. The initial condition counts the trivial path of a single vertex. For the recursion, take 푃 ∶ 푣0, ... , 푣2푛 ∈ 풟(푛) where 푣푖 = (푥푖, 푦푖) for all 푖. Let 푗 > 0 be the smallest index such that 푣2푗 is on the line 푦 = 푥. Such an index exists since 푣2푛 = (푛, 푛) satisfies this condition. Also note that no vertex of odd subscript is on 푦 = 푥 since the number of north steps and the number of east steps preceding that vertex cannot be equal. It follows that 푃1, the portion of 푃 from 푣1 to 푣2푗−1, stays above 푦 = 푥 + 1. So the number of choices for 푃1 is 퐶(푗 − 1). Furthermore, if 푃2 is the portion of 푃 from 푣2푗 to 푣2푛, then 푃2 is (a translation of) a Dyck path of semilength 푛 − 푗. So the number of choices for 푃2 is 퐶(푛 − 푗). Thus the total number of such 푃 is 퐶(푗 − 1)퐶(푛 − 푗). Summing over 1 ≤ 푗 ≤ 푛 finishes the proof. □

There is an explicit expression for the Catalan numbers. But to derive this formula it will be convenient to use a second set of paths counted by 퐶(푛). Call the steps 푈 = [1, 1] and 퐷 = [1, −1] up and down, respectively. An updown path is one using only such steps. It should be clear that if we let 풟(푛)̃ be the set of updown lattice paths from (0, 0) to (2푛, 0) never going below the 푥-axis, then #풟(푛)̃ = #풟(푛) = 퐶(푛). In fact one can get from the paths in one set to those in the other by rotation and dilation of the plane. The two paths in Figure 1.10 correspond under this map and the second one would be represented as 푃 ∶ 푈푈퐷푈퐷퐷푈퐷.

Theorem 1.11.3. For 푛 ≥ 0 we have 1 2푛 퐶(푛) = ( ). 푛 + 1 푛

Proof. We rewrite the right-hand side as

1 2푛 (2푛)! 1 2푛 + 1 ( ) = = ( ). 푛 + 1 푛 푛! (푛 + 1)! 2푛 + 1 푛 Let 풫 be the set of all updown paths starting at (0, 0) and ending at (2푛 + 1, −1). Such paths have 2푛 + 1 steps of which 푛 are up (forcing the other 푛 + 1 to be down) so that 2푛+1 #풫 = ( 푛 ). Our strategy will be to find a partition 휌 of 풫 such that (1) #퐵 = 2푛 + 1 for every block 퐵 of 휌 and (2) there is a bijection between the blocks of 휌 and the paths in 풟(푛)̃ . It will then follow that #풟(푛)̃ is equal to the number of blocks of 휌, which is #풫/(2푛 + 1), giving the desired equality.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

28 1. Basic Counting

To determine 휌, we will take any 푃 ∈ 풫 and describe the block 퐵 containing 푃. We will refer to the 푦-coordinate of a vertex 푣 of 푃 as its height, written ht 푣. Suppose 푃 has step representation 푃 ∶ 푠1푠2 ... 푠2푛+1. Define the 푟th rotation of 푃 to be the path

푃푟 ∶ 푠푟+1푠푟+2 ... 푠2푛+1푠1푠2 ... 푠푟

where all paths start at the origin. Let 퐵 = {푃0, ... , 푃2푛}. So to show that 퐵 has the correct cardinality, we must prove that the 푃푖 are all distinct. Suppose to the contrary that two are equal. By renumbering if necessary, we can assume that 푃0 = 푃푗 for some 1 ≤ 푗 ≤ 2푛. Take 푗 be be minimum. Iterating this equality, we get 푃0 = 푃푗 = 푃2푗 = ... . These equalities and the fact that 푗 is as small as possible imply that 푃 = 푃0 is the ′ ′ concatenation of 푃 ∶ 푠1 ... 푠푗 with itself, say 푘 times for some 푘 ≥ 2. Suppose 푃 ends at height ℎ. Then 푃 must end at height 푘ℎ and so 푘ℎ = −1. This forces 푘 = 1, which is a contradiction. To finish the proof, we must show that the blocks of 휌 are in bijection with the paths in 풟(푛)̃ . Let 풟̃ ′(푛) denote the set of paths obtained by appending a down step to each path in 풟(푛)̃ . So 휌 partitions 풫 ⊇ 풟̃ ′(푛). Thus it suffices to show that there isa unique path from 풟̃ ′(푛) in each block 퐵 of 휌. Let 퐵 be generated by rotating a path 푃 as in the previous paragraph and let 푃 ∶ 푣0 ... 푣2푛+1 be the lattice point representation of 푃. Let ℎ be the minimum height of a vertex of 푃, and among all vertices of 푃 of height ̃ ′ ℎ let 푣푟 be the left most. We claim that 푃푟 ∈ 풟 (푛) and no other 푃푠 is in this set for 푠 ∈ {0, 1, ... , 푛} − {푟}. We will prove the first of these two claims and leave the second, whose demonstration is similar, as an exercise. Since 푣푟 is translated to the origin and has smallest height in 푃, the translations of all 푣푖 for 푖 ≥ 푟 will lie weakly above the 푥- axis. As for the 푣푖 with 푖 < 푟, they must be translated so the 푣푟 becomes the last vertex of 푃푟 which is of height −1. But since 푣푟 was the first vertex of minimum height in 푃, the vertices before it must be translated to have height greater than −1 and so must also lie weakly above the 푥-axis. It follows that only the last vertex of 푃푟 is below the 푥-axis, which is what we wished to prove. □

1.12. Pattern avoidance

Pattern avoidance is a relatively recent area of study in combinatorics. It has seen strong growth in part because of its connections to algebraic geometry and computer science. For more information about this topic, see the books of Bóna [18] or Ki- taev [48]. Let 푆 be a set of integers with #푆 = 푘 and consider a permutation 휎 ∈ 푃(푆). The standardization of 휎 is the permutation std 휎 ∈ 푃([푘]) obtained by replacing the smallest element of 휎 by 1, the next smallest by 2, and so on. For example, if 휎 = 263, then std 휎 = 132. Given 휎 ∈ 픖푛 and 휋 ∈ 픖푘 in one-line notation, we say that 휎 contains a copy of 휋 if there is a subsequence 휎′ of 휎 such that std 휎′ = 휋. Note that a subsequence need not consist of consecutive elements of 휋. In this case, 휋 is called the pattern. To illustrate, 휎 = 425613 contains the pattern 휋 = 132 since 휎′ = 263 standardizes to 휋. On the other hand, we say that 휎 avoids 휋 if it has no subsequence 휎′ with std 휎 = 휋. Continuing our example, one can check that 휎 avoids 4321 since 휎 does not contain a decreasing subsequence of length four. There is an equivalent

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.12. Pattern avoidance 29

휎 =

휋 =

Figure 1.11. The diagrams for 휋 = 132 and 휎 = 425613

definition of pattern containment which the reader will see in the literature. If 푆, 푇 are sets with #푆 = #푇 = 푘, then call 휎 = 휎1 ... 휎푘 ∈ 푃(푆) and 휏 = 휏1 ... 휏푘 ∈ 푃(푇) order isomorphic if 휎푖 < 휎푗 is equivalent to 휏푖 < 휏푗 for all 푖, 푗. It is easy to see that 휎 contains a copy of 휋 if and only if 휎 contains a subsequence order isomorphic to 휋. To study patterns, it will be useful to have a geometric model of a permutation analogous to its permutation matrix. Again, the integer lattice will come into play. 2 Given 휎 = 휎1 ... 휎푛 ∈ 픖푛, its diagram is the set of points (푖, 휎푖) ∈ ℤ for 1 ≤ 푖 ≤ 푛. In displaying the diagram, the lower-left corner is always assumed to have coordinates (1, 1). Using our running example, the diagrams for 휋 = 132 and 휎 = 425613 are shown in Figure 1.11. The points corresponding to the copy 263 of 휋 in 휎 have been enlarged to emphasize how easily one can see pattern containment using diagrams. From an enumerative point of view, avoidance often turns out to be easier to work with than containment. So given 휋 ∈ 픖푘, we consider

Av푛(휋) = {휎 ∈ 픖푛 | 휎 avoids 휋}. ′ Note that many authors use 픖푛(휋) instead of Av푛(휋) for this set. Call 휋 and 휋 Wilf ′ ′ equivalent, written 휋 ≡ 휋 , if # Av푛(휋) = # Av푛(휋 ) for all 푛 ≥ 0. It is easy to see that this is an equivalence relation on 픖푛. We will prove that any two permutations in 픖3 are Wilf equivalent, although this is not as startling as it might first sound. Certain Wilf equivalences follow easily from manipulation of diagrams. Consider the dihedral group of the square

(1.11) 퐷 = {휌0, 휌90, 휌180, 휌270, 푟0, 푟1, 푟−1, 푟∞}

where 휌휃 is rotation by 휃 degrees counterclockwise and 푟푚 is reflection across a line of slope 푚. If 휎 contains a copy 휎′ of 휋 and 푓 ∈ 퐷, then 푓(휎) contains a copy 푓(휎′) of 푓(휋). Using 푓−1, we see that the converse of the previous assertion is also true. It follows that 휎 avoids 휋 if and only if 푓(휎) avoids 푓(휋). We have proven the following result. □ Lemma 1.12.1. For any 휋 ∈ 픖푘 and any 푓 ∈ 퐷 we have 휋 ≡ 푓(휋).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

30 1. Basic Counting

휎′

휎 =

휎″

Figure 1.12. Decomposing 휎 ∈ Av푛(132)

The equivalences in this lemma are called trivial Wilf equivalences. In particular, in 픖3 one sees by repeatedly applying 휌90 that 132 ≡ 231 ≡ 213 ≡ 312 and 123 ≡ 321. In fact, all six permutations are Wilf equivalent and their avoidance sets are counted by the Catalan numbers. We start with 132 from the first set of equivalent permutations. Theorem 1.12.2. For 푛 ≥ 0 we have

# Av푛(132) = 퐶(푛).

Proof. We will induct on 푛, using the initial condition and recurrence relation for 퐶(푛) given in Theorem 1.11.2. As usual, we concentrate on the latter. Pick 휎 = 휎1 ... 휎푛 ∈ ′ ″ ′ Av푛(132) and suppose 휎푗 = 푛. So we can write 휎 = 휎 푛휎 where 휎 = 휎1 ... 휎푗−1 ″ ′ ″ and 휎 = 휎푗+1 ... 휎푛. Clearly 휎 and 휎 must avoid 132 since they are subsequences of 휎. We also claim that min 휎′ > max 휎″ so that we can think of the diagram of 휎 decomposing as in Figure 1.12. Indeed, if there is 푠 ∈ 휎′ and 푡 ∈ 휎″ with 푠 < 푡, then 휎 contains 푠푛푡, which is a copy of 132, a contradiction. Thus 휎′ and 휎″ are permutations of {푛−1, 푛−2, ... , 푛−푗+1} and [푛−푗], respectively, both of which avoid 132. Conversely, if the diagram of 휎 has the form given in Figure 1.12 with 휎′, 휎″ avoiding 132, then 휎 must avoid 132. This is a case-by-case proof by contradiction, considering where the elements of a copy of 132 could lie in the diagram if one existed. We leave the details to the reader. To finish the count, from what we have shown and induction there are 퐶(푗 − 1) choices for 휎′ and 퐶(푛 − 푗) for 휎″. Taking their product and summing over 푗 ∈ [푛] shows that there are 퐶(푛) choices for 휎. □

Next we will tackle 123, but to do so we will need some new concepts. The left- right minima of 휎 = 휎1 ... 휎푛 ∈ 픖푛 are the 휎푖 satisfying 휎푖 < min{휎1, 휎2, ... , 휎푖−1}. For example 휎 = 698371542 has left-right minima 휎1 = 6, 휎4 = 3, and 휎6 = 1. The indices 푖 such that 휎푖 is a left-right minimum are called the left-right minimum positions. If necessary to distinguish from the positions, the 휎푗 themselves are called the left-right

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

1.12. Pattern avoidance 31

minimum values. Reading the left-right minima in order from left to right, the positions and values always satisfy

(1.12) 1 = 푖1 < 푖2 < ⋯ < 푖푙 and 푚1 > 푚2 > ⋯ > 푚푙 = 1 for some 푙 ≥ 1. We will need to determine, given a set of values and positions, whether a permu- tation exists with left-right minima having these values and positions. To do this, we introduce the dominance order on compositions, which is also useful in other areas of combinatorics and representation theory. A weak composition of 푛 is a sequence

훼 = [훼1, ... , 훼푙] of nonnegative integers with ∑푖 훼푖 = 푛. So in weak compositions zero is permitted as a part and we will use 0 subscripts on notation for compositions when ⊴ used for weak compositions. If 훼, 훽 ⊧0 푛, then 훼 is dominated by 훽, written 훼 훽, if we have

훼1 + 훼2 + ⋯ + 훼푗 ≤ 훽1 + 훽2 + ⋯ + 훽푗 ⊴ for all 푗 ≥ 1, where 훼푗 = 0 if 푗 > ℓ(훼) and similarly for 훽. To illustrate, [2, 2, 1, 1] [3, 1, 2] because 2 ≤ 3, 2+2 ≤ 3+1, 2+2+1≤3+1+2, and 2+2+1+1 = 3+1+2+0. Since 훼, 훽 ⊧0 푛 the last inequality always becomes an equality. In the next result, the reader will notice a similarity between the construction of 휄 and 휇 and the map 휙 defined by (1.8).

Lemma 1.12.3. Let 휎 ∈ 픖푛.

(a) We have 휎 ∈ Av푛(123) if and only if its subsequence of non-left-right minima is decreasing.

(b) There exists 휎 ∈ Av푛(123) with left-right minima positions and values given by (1.12) if and only if 휄 ⊴ 휇 where

휄 = (푖2 − 푖1 − 1, 푖3 − 푖2 − 1, ... , 푖푙+1 − 푖푙 − 1),

휇 = (푚0 − 푚1 − 1, 푚1 − 푚2 − 1, ... , 푚푙−1 − 푚푙 − 1),

and 푖푙+1 = 푚0 = 푛 + 1. In this case, 휎 is unique.

Proof. (a) We will prove this statement in its contrapositive form. Suppose first that 휎 contains a copy 휎푖휎푗휎푘 of 123. Then 휎푗, 휎푘 cannot be left-right minima since 휎푖 is smaller than both and to their left in 휎. Since 휎푗 < 휎푘, the non-left-right minima subsequence contains an increase. Conversely, suppose 휎푗 < 휎푘 with 푗 < 푘 and both non-left-right minima. Let 휎푖 be the left-right minimum closest to 휎푗 on its left. We have that 휎푖 exists since 휎 begins with a left-right minimum. Then 휎푖 < 휎푗 < 휎푘, giving a copy of 123.

(b) Clearly if 휎 exists, then it must be unique since the positions and values of its left-right minima are given by (1.12) and the rest of the elements can only be arranged in one way by (a). We can attempt to build 휎 satisfying the given conditions as follows. An example will be found following the proof. Start with a row of 푛 blank positions. Now fill in the values 푚1 > ⋯ > 푚푙 at the positions 푖1 < ⋯ < 푖푙. Filling in the rest of the positions with the elements of 푆 = [푛] − {푚1, ... , 푚푙} (the set of non-left-right minima) in decreasing order gives a 휎 avoiding 123 since 휎 is a union of two decreasing subsequences. So the only question is whether doing this will result in a permutation

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

32 1. Basic Counting

having the 푚푗 as its left-right minima. We have that 푚1 is always a left-right minimum regardless of the other entries. Now 푚푗+1 will be the next left-right minimum after 푚푗 if and only if the blanks before position 푖푗+1 are filled with elements larger than 푚푗. Note that 휄푗 = 푖푗+1 − 푖푗 − 1 is the number of spaces between positions 푖푗 and 푖푗+1. Also 휇푗 = 푚푗−1 −푚푗 −1 is the number of 푠 ∈ 푆 with 푚푗 < 푠 < 푚푗−1. It follows that 휄1 +⋯+휄푗 is the number of blanks before position 푖푗+1 and 휇1 + ⋯ + 휇푗 is the number of elements of 푆 greater than 푚푗. So filling in the spaces preserves the left-right minima if andonly if the inequalities for 휄 ⊴ 휇 are satisfied. This completes the proof. □

Suppose we want to see if there is 휎 ∈ Av9(123) with left-right minima 6 > 3 > 1 in positions 1 < 4 < 6. We start off with the diagram (1.13) 휎 = 6 3 1 . We wish to check whether filling the blanks with the remaining elements of [9] in decreasing order will result in a permutation which has the initial elements as left- right minima. One way to do this is just to fill the blanks and verify that the desired elements become left-right minima: 휎=698371542. Another way is to use the 휄 and 휇 compositions. Note that 휄1 = 4 − 1 − 1 = 2 is the number of blanks between 푚1 = 6 and 푚2 = 3 in the original diagram. Similarly 휇1 = 10−6−1 = 3 is the number of elements of 푆 = [9]−{6, 3, 1} greater than 푚1 = 6. In order to fill the blanks between 6 and 3 so that 6 is a left-right minimum, the numbers used must all be greater than 6. This is possible exactly when 휄1 ≤ 휇1. Similarly 휄1 + 휄2 ≤ 휇1 + 휇2 ensures that one can fill the blanks to the left of 푚3 = 1 with numbers greater than 푚2 = 3, and so forth. So checking whether 휄 ⊴ 휇 also determines whether 휎 has the correct left-right minima.

We will need an analogue of Lemma 1.12.3 for elements of Av푛(132). To state it, we define the reversal of a weak composition 훼 = [훼1, 훼2, ... , 훼푙] to be 푟 훼 = [훼푙, 훼푙−1, ... , 훼1].

Lemma 1.12.4. Let 휎 ∈ 픖푛.

(a) We have 휎 ∈ Av푛(132) if and only if, for every left-right minimum 푚, the ele- ments of 휎 to the right of and greater than 푚 form an increasing subsequence.

(b) There exists 휎 ∈ Av푛(132) with left-right minima positions and values given by (1.12) if and only if 휇푟 ⊴ 휄푟 where 휄, 휇 are as given in Lemma 1.12.3. In this case, 휎 is unique

Proof. Much of the proof of this result is similar to the demonstration of Lemma 1.12.3 and so will be left as an exercise. Here we will only present the construction of 휎 ∈ Av푛(132) from its diagram of left-right minima and blanks. Again, an example follows the explanation. We keep the notation of the proof of the previous lemma. We start by filling the blanks to the right of 푚푙 = 1 with the elements 푠 ∈ 푆 such that 푚푙 < 푠 < 푚푙−1 in increasing order and as far left as possible (so they will be consecutive). Next we fill in the remaining blanks to the right of 푚푙−1 with those 푠 ∈ 푆 such that 푚푙−1 < 푠 < 푚푙−2 so that they form an increasing subsequence which is as far left as possible given the spaces already filled. Continue in this manner until all blanks are occupied. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 33

Suppose we wish to fill in the diagram (1.13) so that 휎 avoids 132. If 푚3 < 푠 < 푚2, then 푠 = 2, so we put 2 just to the right of 푚3 = 1 to get 휎 = 6 3 1 2 . Similarly, 푚1 < 푠 < 푚2 is satisfied by 푠 = 4, 5 so we put these elements following 푚2 = 3 in increasing order using the left-most blanks available to obtain 휎 = 6 3 4 1 2 5 . Finally, we do the same for the elements greater than 푚1 = 6 to get the end result 휎=678341259. We need one last observation before we achieve our goal of showing all elements of 픖3 are Wilf equivalent. Suppose 훼 = [훼1, ... , 훼푙] and 훽 = [훽1, ... , 훽푙] are weak compositions of 푛. We claim 훼 ⊴ 훽 if and only if 훽푟 ⊴ 훼푟. To see this, note that the inequality 훼1+⋯+훼푗 ≤ 훽1+⋯+훽푗 is equivalent to 푛−(훽1+⋯+훽푗) ≤ 푛−(훼1+⋯+훼푗). But 푛−(훼1+⋯+훼푗) = 훼푟+훼푟−1+⋯+훼푗+1 and similarly for 훽. Making this substitution we get the necessary inequalities for 훽푟 ⊴훼푟 and all steps are reversible. Finally, we say that a bijection 푓 ∶ 푆 → 푇 preserves property 푃 if 푠 ∈ 푆 having property 푃 is equivalent to 푓(푠) having property 푃 for all 푠 ∈ 푆.

Theorem 1.12.5. For 푛 ≥ 0 and any 휋 ∈ 픖3 we have

# Av푛(휋) = 퐶(푛).

Proof. By Theorem 1.12.2 and the discussion just before it, it suffices to show that we have # Av푛(123) = 퐶(푛). This will be true if we can find a bijection 푓 ∶ Av푛(123) → Av푛(132). In fact, 푓 will preserve the values and positions of left-right minima. Suppose 휎 ∈ Av푛(123) has its positions and values given by (1.12). By Lemma 1.12.3 there is a unique such 휎 and we must also have 휄 ⊴ 휇. But, as noted just before this theorem, this 푟 ⊴ 푟 ′ is equivalent to 휇 휄 . So, using Lemma 1.12.4, there is a unique 휎 ∈ Av푛(132) having the given positions and values of its left-right minima and we let 푓(휎) = 휎′. Because of the existence and uniqueness of 휎 and 휎′, this is a bijection. □

Note that the description of 푓 in the previous proof can be made constructive. Given 휎 ∈ Av푛(123), we remove its non-left-right minima and rearrange them using the algorithm in the proof of Lemma 1.12.4. So, using our running example, 푓(698371542) = 678341259.

Exercises

(1) Prove each of the following identities for 푛 ≥ 1 in two ways: one inductive and one combinatorial. 푛

(a) ∑ 퐹푖 = 퐹푛+2 − 1. 푖=1 푛

(b) ∑ 퐹2푖 = 퐹2푛+1 − 1. 푖=1 푛

(c) ∑ 퐹2푖−1 = 퐹2푛. 푖=1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

34 1. Basic Counting

(2) Prove that if 푘, 푛 ∈ ℙ with 푘 ∣ 푛 (meaning 푘 divides evenly into 푛), then 퐹 푘 ∣ 퐹푛. (3) Given 푚 ∈ ℙ, show that the sequence of Fibonacci numbers is periodic modulo 푚; that is, there exists 푝 ∈ ℙ such that

퐹푛+푝 ≡ 퐹푛 (mod 푚)

for all 푛 ≥ 0. The period modulo 푚 is the smallest 푝 such that this congruence holds. Note that it is an open problem to find the period of the Fibonacci sequence for an arbitrary 푚.

(4) The Lucas numbers are defined by 퐿0 = 2, 퐿1 = 1, and

퐿푛 = 퐿푛−1 + 퐿푛−2 for 푛 ≥ 2.

Prove the following identities for 푚, 푛 ≥ 1. (a) 퐿푛 = 퐹푛−1 + 퐹푛+1. (b) Let 풞푛 be the set of tilings of 푛 boxes arranged in a circle with dominos and monominos. Show that #풞푛 = 퐿푛. (c) 퐿푚+푛 = 퐹푚−1퐿푛 + 퐹푚퐿푛+1. (d) 퐹2푛 = 퐹푛퐿푛. (5) Prove Theorem 1.2.2. (6) Check that the two maps defined in the proof of Theorem 1.3.1 are inverses. (7) (a) Prove Theorem 1.3.3(b) using equation (1.5). (b) Give an inductive proof of Theorem 1.3.3(c). (c) Give an inductive proof of Theorem 1.3.3(d). (8) Let 푆, 푇 be sets. (a) Show that 푆 Δ 푇 = (푆 ∪ 푇) − (푆 ∩ 푇). (b) Show that (푆 Δ 푇) Δ 푇 = 푆.

(9) Given nonnegative integers satisfying 푛1 + 푛2 + ⋯ + 푛푚 = 푛. the corresponding multinomial coefficient is

푛 푛! (1.14) ( ) = . 푛1, 푛2, ... , 푛푚 푛1! 푛2! ... 푛푚!

We extend this definition to negative 푛푖 by letting the multinomial coefficient be zero if any 푛푖 < 0. Note that when 푚 = 2 we recover the binomial coefficients as

푛 푛 ( ) = ( ). 푘, 푛 − 푘 푘

(a) Find and prove analogues of Theorem 1.3.3(a), (b), and (c) for multinomial coefficients. (b) A permutation of a multiset 푀 = {{1푛1 , 2푛2 , ... , 푚푛푚 }} is a linear arrangement of the elements of 푀. Let 푃(푀) denote the set of permutations of 푀. For example

푃({{12, 22}}) = {1122, 1212, 1221, 2112, 2121, 2211}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 35

1 1 1 1 0 1 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 0 1 0 1 0 1 1 1 1 1 1 1 1 1

Figure 1.13. Pascal’s triangle modulo 2

Prove that 푛 #푃({{1푛1 , 2푛2 , ... , 푚푛푚 }}) = ( ) 푛1, 푛2, ... , 푛푚 in three ways: (i) combinatorially, (ii) by induction on 푛, (iii) by proving that 푛 푛 푛 − 푛 ( ) = ( )( 1 ) 푛1, 푛2, ... , 푛푚 푛1 푛2, ... , 푛푚 and then inducting on 푚. (10) (a) Prove the Pascal triangle is fractal modulo 2. Specifically, if one replaces each binomial coefficient by its remainder on division by 2, then, for any 푘 ≥ 0, the triangle consisting of rows 0 through 2푘 − 1 is repeated on the left and on the right in rows 2푘 through 2푘+1 −1 with an inverted triangle of zeros in between. See Figure 1.13 for the first eight rows. Hint: Induct on 푘. (b) Formulate and prove an analogous result modulo 푝 for any prime 푝. (11) Find the inverse for the map in the proof of Theorem 1.3.4, proving that it is well- defined and the inverse to the given function. ! (12) For 푛 ≥ 0 define the 푛th Fibotorial to be the product 퐹푛 = 퐹1퐹2 ... 퐹푛. Also, for 0 ≤ 푘 ≤ 푛 define a Fibonomial coefficient by 푛 퐹! ( ) = 푛 . 푘 ! ! 퐹 퐹푘퐹푛−푘 Note that from this definition it is not clear that this is an integer. (a) Show that the Fibonomial coefficients satisfy the initial conditions (푛) = 0 퐹 (푛) = 1 and recurrence 푛 퐹 푛 푛 − 1 푛 − 1 ( ) = 퐹 ( ) + 퐹 ( ) 푘 푛−푘+1 푘 − 1 푘−1 푘 퐹 퐹 퐹 for 0 < 푘 < 푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

36 1. Basic Counting

(b) Show that (푛) is an integer for all 0 ≤ 푘 ≤ 푛. 푘 퐹 (c) Find a combinatorial interpretation of (푛) . 푘 퐹 (13) For 푛 ≥ 1 show that the Stirling numbers of the second kind have the following values. (a) 푆(푛, 1) = 1. (b) 푆(푛, 2) = 2푛−1 − 1. (c) 푆(푛, 푛) = 1. 푛 (d) 푆(푛, 푛 − 1) = ( ). 2 푛 푛 (e) 푆(푛, 푛 − 2) = ( ) + 3( ). 3 4 (14) For 푛 ≥ 1 show that the signless Stirling numbers of the first kind have the follow- ing values. (a) 푐(푛, 1) = (푛 − 1)!. 푛 1 (b) 푐(푛 + 1, 2) = 푛! ∑ . 푖 푖=1 (c) 푐(푛, 푛) = 1. 푛 (d) 푐(푛, 푛 − 1) = ( ). 2 푛 푛 (e) 푐(푛, 푛 − 2) = 2( ) + 3( ). 3 4 (15) Call an integer partition 휆 self-conjugate if 휆푡 = 휆. Show that the number of self- conjugate 휆 ⊢ 푛 equals the number of 휇 ⊢ 푛 having parts which are distinct (no part can be repeated) and odd. Hint: Use Young diagrams and try to guess a bijec- tion inductively by first seeing what it has to be for small 푛. Then try to construct a bijection for 푛 + 1 which will be consistent in some way with the one for previous values. Finally try to describe your bijection in a noninductive manner. (16) The main diagonal of a Young diagram is the set of squares starting with the one at the top left and moving diagonally right and down. So in Figure 1.5, the main diagonal of 휆 consists of two squares. Prove the following. (a) If 휆 is self-conjugate as defined in the previous exercise, then |휆| ≡ 푑 (mod 2) where 푑 is the length (number of squares) of the main diagonal. (b) Let 푝푑(푛) be the number of partitions of 푛 whose diagonal has length 푑. Then 2 푝푑(푛) = ∑ 푝(푚, 푑)푝(푛 − 푚 − 푑 , 푑). 푚≥0

(17) Define 푝푒(푛, 푘) to be the number of 휆 ⊢ 푛 having exactly 푘 parts. Prove the follow- ing under the assumption that 푛 ≥ 4, where ⌊⋅⌋ is the round-down function. (a) 푝푒(푛, 푘) = 푝(푛 − 푘, 푘). (b) 푝푒(푛, 1) = 1. (c) 푝푒(푛, 2) = ⌊푛/2⌋. (d) 푝푒(푛, 푛 − 2) = 2. (e) 푝푒(푛, 푛 − 1) = 1. (f) 푝푒(푛, 푛) = 1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 37

(18) Finish the proof of Theorem 1.7.1. (19) Prove Theorem 1.7.2. (20) Consider a line of 푛 copies of the integer 1. One can now put slashes in the spaces between the 1’s and count the number of 1’s between each pair of adjacent slashes to form a composition of 푛. For example, if 푛 = 6, then we start with 1 1 1 1 1 1. One way of inserting slashes is 1 1/1/1 1 1, which corresponds to the composition 2 + 1 + 3 = 6. Give alternate proofs of Theorems 1.7.1 and 1.7.2 using this idea. (21) A weak composition of 푛 into 푘 parts is a sequence of 푘 nonnegative integers sum- ming to 푛. Find a formula for the number of weak compositions of 푛 into 푘 parts and then prove it in three different ways: (a) by using a variant of the map 휙 defined in (1.8), (b) by finding a relation between weak compositions and compositions and then using the statement of Theorem 1.7.2 (as opposed to their proofs as in part (a)), (c) by modifying the construction in the previous exercise. (22) Show that the last two columns in Table 1.1 agree when 푓 is bijective, that is, when 푛 = 푘. (23) Prove Lemma 1.9.1(b). (24) A graph 퐺 = (푉, 퐸) is regular if all of its vertices have the same degree. If deg 푣 = 푟 for all vertices 푣, then 퐺 is regular of degree 푟. (a) Show that if 퐺 is regular of degree 푟, then 푟|푉| |퐸| = , 2

(b) Call 퐺 bipartite if there is a set partition of 푉 = 푉1 ⊎ 푉2 such for all 푢푣 ∈ 퐸 we have 푢 ∈ 푉1 and 푣 ∈ 푉2 or vice versa. Show that a bipartite graph regular of degree 푟 ≥ 1 has |푉1| = |푉2|. (25) A graph 퐺 is planar if it can be drawn in the plane ℝ2 without edge crossings. In this case the regions of 퐺 are the topologically connected components of the set- theoretic difference ℝ2 − 퐺. Let 푅 be the set of regions of 퐺. If 푟 ∈ 푅, then let deg 푟 be the number of edges on the boundary of 푟. Show that ∑ deg 푟 ≤ 2|퐸|. 푟∈푅 Find, with proof, a condition on the cycles of 퐺 which is equivalent to having equal- ity. (26) Two graphs 퐺, 퐻 are isomorphic, written 퐺 ≅ 퐻, if they are equal as unlabeled graphs. The complement of a graph 퐺 = (푉, 퐸) is the graph 퐺̄ with vertices 푉 and with 푢푣 an edge of 퐺̄ if and only if 푢푣 ∉ 퐸. Call 퐺 self-complementary if 퐺 ≅ 퐺̄. (a) Show that there exists a self-complementary graph with 푛 vertices if and only if 푛 ≡ 0 (mod 4) or 푛 ≡ 1 (mod 4). (b) Show that in a self-complementary graph with 푛 vertices where 푛 ≡ 1 (mod 4) there must be at least one vertex of degree (푛 − 1)/2. Hint: Show that the number of vertices of degree (푛 − 1)/2 must be odd. (27) Prove Theorem 1.9.4.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

38 1. Basic Counting

(28) Prove that in any digraph 퐷 = (푉, 퐴) we have ∑ ideg 푣 = ∑ odeg 푣 = |퐴|. 푣∈푉 푣∈푉 (29) Prove the equivalence of (a) and (d) in Theorem 1.10.2. Hint: Use Lemma 1.9.1(b).

(30) Consider a sequence of nonnegative integers 푑 ∶ 푑1, ... , 푑푛. Call 푑 a degree se- quence if there is a graph with vertices 푣1, ... , 푣푛 such that deg 푣푖 = 푑푖 for all 푖. (a) Let 푇 be a tree with 푛 vertices and arrange its degree sequence in weakly de- creasing order. Prove that for 1 ≤ 푖 ≤ 푛 we have 푛 − 1 푑 ≤ ⌈ ⌉ . 푖 푖 (b) Let 푇 be a tree with 푛 vertices and let 푘 ≥ 2 be an integer. Suppose the degree sequence of 푇 satisfies 푑푖 = 1 or 푘 for all 푖. Prove that 푑푖 = 푘 for exactly (푛 − 2)/(푘 − 1) indices 푖. (31) Finish the proof of Theorem 1.10.3.

(32) Consider 푛 cars 퐶1, ... , 퐶푛 passing in this order by a line of 푛 parking spaces num- bered 1, ... , 푛. Each car 퐶푖 has a preferred space number 푐푖 in which to park. If 퐶푖 gets to space 푐푖 and it is free, then it parks. Otherwise it proceeds to the next empty space (which will have a number greater than 푐푖) and parks there if such a space exists. If no such space exists, it does not park. Call 푐 = (푐1, ... , 푐푛) a parking function of length 푛 if all the cars end up in a parking space. (a) Show that 푐 is a parking function if and only if its unique weakly increasing rearrangement 푑 = (푑1, ... , 푑푛) satsifies 푑푖 ≤ 푖 for all 푖 ∈ [푛]. (b) Use a counting argument to show that the number of parking functions of length 푛 is (푛 + 1)푛−1. Hint: Consider parking where there are 푛 + 1 spaces arranged in a circular manner and 푛 + 1 is an allowed preference for cars. (c) Reprove (b) by finding a bijection between parking functions of length 푛 and trees on 푛 + 1 vertices. Hint: Let 푇 be a tree on 푛 + 1 vertices labeled 0, ... , 푛 and call vertex 0 the root of the tree. Draw 푇 in the plane so that the vertices connected to the root, called the root’s children, are in increasing order read left to right. Continue to do the same thing for the children of each child of the root, and so forth. Create a permutation 휋 by reading the children of the root left to right, then the grandchildren of the root left to right, etc. Finally, orient each edge of 푇 so that it points from a vertex to its parent and call this set of arcs 퐴. Map 푇 to 푐 = (푐1, ... , 푐푛) where 1 if 횤0⃗ ∈ 퐴, 푐푖 = { 1 + 푗 if ⃗횤휋푗 ∈ 퐴. (33) Consider 퐸푊-lattice paths along the 푥-axis which are paths starting at the origin and using steps 퐸 = [1, 0] and 푊 = [−1, 0]. (a) Show that if an 퐸푊-lattice path has length 푛 and ends at (푘, 0), then 푛 and 푘 have the same parity and |푘| ≤ 푛. (b) Show that the number of 퐸푊-lattice paths of length 푛 ending at (푘, 0) is 푛 ( 푛+푘 ). 2

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 39

(c) Show that the number of 퐸푊-lattice paths of length 2푛 ending at the origin and always staying on the nonnegative side of the axis is 퐶(푛). (34) Show that the Catalan numbers 퐶(푛) also count the following objects: (a) ballot sequences which are words 푤 = 푤1 ... 푤2푛 containing 푛 ones and 푛 twos such that in any prefix 푤1 ... 푤푖 the number of ones is always at least as great as the number of twos, (b) sequences of positive integers

1 ≤ 푎1 ≤ 푎2 ≤ ... ≤ 푎푛

with 푎푖 ≤ 푖 for 1 ≤ 푖 ≤ 푛, (c) triangulations of a convex (푛 + 2)-gon using nonintersecting diagonals, (d) noncrossing partitions 휌 = 퐵1/ ... /퐵푘 ⊢ [푛] where a crossing is 푎 < 푏 < 푐 < 푑 such that 푎, 푐 ∈ 퐵푖 and 푏, 푑 ∈ 퐵푗 for 푖 ≠ 푗. (35) Fill in the details of the proof of Theorem 1.11.3. (36) A stack is a first-in first-out (FIFO) data structure with two operations. Onecan put something on the top of a stack, called pushing, or take something from the top of the stack, called popping. A permutation 휎 = 휎1 ... 휎푛 ∈ 픖푛 is considered sorted if its elements have been rearranged to form the permutation 휏 = 12 ... 푛. Consider the following algorithm for sorting 휎. Start with an empty stack and an empty output permutation 휏. At each stage there are two options. If the stack is empty or the current first element 푠 of 휎 is smaller than the top element of the stack, then one pushes 푠 onto the stack. If 휎 has become empty or the top element 푡 of the stack is smaller than the first element of 휎, then one pops 푡 from the stack and appends it to the end of 휏. An example showing the sorting of 휎 = 3124 will

휏 stack 휎 휖 휖 3124 휖 3 124 1 휖 3 24 1 3 24 2 1 3 4 12 3 4 123 휖 4 123 4 휖 1234 휖 휖

Figure 1.14. A stack-sorting algorithm

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

40 1. Basic Counting

be found in Figure 1.14. Note that the input permutation 휎 is on the right and the output permutation 휏 is on the left so that the head of 휎 and the tail of 휏 are nearest the stack. (a) Show that this algorithm sorts 휎 if and only if 휎 ∈ Av푛(231). (b) Show that if there is a sequence of pushes and pops which sorts 휎, then it must be the sequence given by the algorithm.

(37) Suppose 휋 = 휋1 ... 휋푘 ∈ 픖푘. Prove the following descriptions of actions of ele- ments of 퐷 in terms of one-line notation. 푟 (a) 푟∞(휋) = 휋푘 ... 휋1 ≔ 휋 , the reversal of 휋. 푐 (b) 푟0(휋) = (푘 + 1 − 휋1) ... (푘 + 1 − 휋푘) ≔ 휋 , the complement of 휋. −1 (c) 푟1(휋) = 휋 , the group-theoretic inverse of 휋. (38) Finish the proof of Theorem 1.12.2. (39) Given any set of permutations Π we let

Av푛(Π) = {휎 ∈ 픖푛 ∣ 휎 avoids all 휋 ∈ Π}.

If 휋 = 휋1휋2 ... 휋푚 ∈ 픖푚 is a permutation and 푛 ∈ ℕ, then we can construct a new permutation 휋 + 푛 = 휋1 + 푛, 휋2 + 푛, ... , 휋푚 + 푛. Given permutations 휋, 휎 of disjoint sets, we denote by 휋휎 the permutation obtained by concatenating them. Define two other concatenations on 휋 ∈ 픖푚 and 휎 ∈ 픖푛, the direct sum 휋 ⊕ 휎 = 휋(휎 + 푚) and skew sum 휋 ⊖ 휎 = (휋 + 푛)휎. Finally for 푛 ≥ 0 we use the notation

휄푛 = 12 ... 푛 for the increasing permutation of length 푛, and

훿푛 = 푛 ... 21 for the decreasing one. Prove the following.

(a) Av푛(213, 321) = {휄푘1 ⊕ (휄푘2 ⊖ 휄푘3 ) ∣ 푘1 + 푘2 + 푘3 = 푛}. (b) Av (132, 213) = {휄 ⊖ 휄 ⊖ ⋯ ∣ ∑ 푘 = 푛}. 푛 푘1 푘2 푖 푖

(c) Av푛(132, 213, 321) = {휄푘1 ⊖ 휄푘2 ∣ 푘1 + 푘2 = 푛}.

(d) Av푛(132, 231, 312) = {훿푘1 ⊕ 휄푘2 ∣ 푘1 + 푘2 = 푛}.

(e) Av푛(132, 231, 321) = {(1 ⊖ 휄푘1 ) ⊕ 휄푘2 ∣ 푘1 + 푘2 = 푛 − 1}. (f) Av (123, 132, 213) = {휄 ⊖ 휄 ⊖ ⋯ ∣ ∑ 푘 = 푛 and 푘 ≤ 2 for all 푖}. 푛 푘1 푘2 푖 푖 푖 (40) Finish the proof of Lemma 1.12.4.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 2

Counting with Signs

In the previous chapter, we concentrated on counting formulae where all of the terms were positive. But there are interesting things to say when one permits negative terms as well. This chapter is devoted to some of the principal techniques which one can use in such a situation.

2.1. The Principle of Inclusion and Exclusion

The Principle of Inclusion and Exclusion, or PIE, is one of the classical methods for counting using signs. After presenting the Principle itself, we will give an application to derangements which are permutations having no fixed points. In the Sum Rule, Lemma 1.1.1(a), we assumed that the sets 푆, 푇 are disjoint. Of course, it is easy to see that for any finite sets 푆, 푇 we have

(2.1) |푆 ∪ 푇| = |푆| + |푇| − |푆 ∩ 푇|.

Indeed, |푆| + |푇| counts 푆 ∩ 푇 twice and so to count it only once we must subtract the cardinality of the intersection. But one could ask if there is a similar formula for the union of any number of sets. It turns out that it is often more useful to consider these sets as subsets of some universal set 푆 and count the number of elements in 푆 which are not in any of the subsets, similar to the viewpoint used in pattern avoidance. To set up

notation, let 푆 be a set and let 푆1, ... , 푆푛 ⊆ 푆. We wish to find a formula for |푆 − ⋃푖 푆푖|. When 푛 = 1 we clearly have

|푆 − 푆1| = |푆| − |푆1|.

And for 푛 = 2 equation (2.1) yields

|푆 − (푆1 ∪ 푆2)| = |푆| − |푆1| − |푆2| + |푆1 ∩ 푆2|.

41

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

42 2. Counting with Signs

푆 푆

푆1 푆1 푆2

Figure 2.1. The PIE for 푛 = 1, 2

Venn diagrams showing the shaded region counted for these two cases are given in Fig- ure 2.1. The reader may have already guessed the generalization for arbitrary 푛. This type of enumeration where one alternately adds and subtracts cardinalities is some- times called a sieve.

Theorem 2.1.1 (Principle of Inclusion and Exclusion, PIE). If 푆 is a finite set with sub- sets 푆1, ... , 푆푛, then | 푛 | 푛 (2.2) |푆 − 푆 | = |푆| − ∑ |푆 | + ∑ |푆 ∩ 푆 | − ⋯ + (−1)푛| 푆 |. | ⋃ 푖| 푖 푖 푗 ⋂ 푖 | 푖=1 | 1≤푖≤푛 1≤푖<푗≤푛 푖=1

Proof. For any set 푆 we have |푆| = ∑푠∈푆 1. We will use the notation |푆| = ∑푠∈푆 1푠 so that 1푠 will keep track of the contribution of 푠 to the sum. So it suffices to show that the

coefficient of 1푠 in the alternating sum is one if 푠 ∉ ⋃푖 푆푖 and zero otherwise. In the first case, 1푠 only occurs in |푆|, giving the desired coefficient. In the second case, suppose

푠 ∈ 푆푖 for exactly 푚 ≥ 1 indices 푖. Now 푠 ∈ 푆푖1 ∩ ⋯ ∩ 푆푖푘 precisely when 푆푖1 , ... , 푆푖푘 are 푘 of the 푚 subsets containing 푠. It follows that the number of summands 1푠 in the sum 푚 for 푘-fold intersections is ( 푘 ). So the coefficient of 1푠 to the right-hand side of (2.2) is 푚 푚 푚 ( ) − ( ) + ( ) − ⋯ = 0 0 1 2 by Theorem 1.3.3(d). This completes the proof. □

푛 To simplify notation we will usually write just ⋃ 푆푖 for ⋃푖=1 푆푖. We will also write 푆퐼 in place of ⋂푖∈퐼 푆푖. As an application of the PIE, we will count permutations without fixed points. This problem is sometimes accompanied by the following story. Suppose that 푛 jolly revelers (and it is important that they be jolly) put their 푛 identical bowler hats on a hat stand before dinner at a restaurant. During the meal, the hat stand gets overturned (I told you they were jolly) so that the hats, having no identifying markings, are re- turned at random when the revelers leave. What is the probability that no man gets his own hat back?

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.1. The Principle of Inclusion and Exclusion 43

If one numbers the men 1, ... , 푛 and similarly number the hats where hat 푖 belongs to man 푖, then a way of returning the hats is just a permutation 휋 = 휋1 ... 휋푛 ∈ 픖푛 where 휋푖 = 푗 means that man 푖 gets back hat 푗. So the condition that no man gets his own hat means that 휋푖 ≠ 푖 for all 푖; that is, 휋 has no fixed points. Such a permutation is called a derangement and the number of derangements in 픖푛 is denoted 퐷(푛) and is called the 푛th derangement number. We now wish to set this problem up so that we can use the PIE. In particular, we want to define 푆 and subsets 푆1, ... , 푆푛 so that 퐷(푛) = |푆 − ⋃ 푆푖|. To do this, we think of the problem as counting a set of elements subject to certain restrictions and then let (i) 푆 be the set of objects with no restrictions and

(ii) 푆1, ... , 푆푛 be subsets so that removing 푆푖 from 푆 imposes the 푖th restriction.

We will have chosen 푆 and the 푆푖 correctly if the cardinalities on the right-hand side of (2.2) can be computed. In the case under consideration, we want to count permuta- tions with no fixed points. So we should let 푆 = 픖푛, the set of all permutations without any restriction on their fixed points. We will also let 푆푖 be the set of 휋 ∈ 픖푛 with 휋푖 = 푖 so that we will remove those permutations having 푖 as a fixed point. Note that wedo ′ not choose subsets 푆푖 defined as the set of 휋 ∈ 픖푛 with 푖 fixed points, for if we did so, ′ ′ ′ ′ then the 푆푖 would be disjoint so that |푆 − ⋃ 푆푖| = |푆| − |푆1| − ⋯ − |푆푛|. Because of this, ′ computing the cardinalities of the 푆푖 is about as hard as computing the cardinality of the set difference directly and so one does not gain anything. However, our original choice of subsets will turn out to be very nice.

We now compute the necessary cardinalities. Of course, |푆| = |픖푛| = 푛!. Next, if 휋 ∈ 푆1, then 휋 = 1휋2 ... 휋푛 where 휋2 ... 휋푛 form a permutation of 2, ... , 푛. So |푆1| = (푛 − 1)!. Clearly the same argument could be applied to any 푆푖, so

∑ |푆푖| = 푛 ⋅ (푛 − 1)! = 푛! . 푖

Similarly, 푆1∩푆2∩⋯∩푆푘 is the set of all permutations of the form 휋 = 12 ... 푘휋푘+1 ... 휋푛 and there are (푛 − 푘)! ways to choose 휋푘+1, ... , 휋푛. In fact, all the terms in the 푘-fold 푛 sum have this value and there are (푘) such terms giving a total of 푛 푛! (푛 − 푘)! ( ) = . 푘 푘! Summing up, so to speak, we have proved the following. Theorem 2.1.2. The 푛th derangement number is given by 1 1 1 퐷(푛) = 푛! (1 − + − ⋯ + (−1)푛 ) 1! 2! 푛! for 푛 ≥ 0. □

The reader should recognize the series in the previous result as a truncation of the series for 1/푒. Since the probability that no man gets his hat back is the number of ways this could happen over the total number of permutations for returning the hats, or 퐷(푛)/푛!, we get a very pretty answer to the question originally posed.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

44 2. Counting with Signs

Corollary 2.1.3. In the limit as 푛 → ∞, the probability that no man gets his hat back is 1/푒. □

It is striking that 푒, one of the quintessential transcendental numbers, should occur in the solution to a combinatorial problem which, at the outset, involves only integers.

2.2. Sign-reversing involutions

Sign-reversing involutions are a powerful way of proving identities involving signs, and even identities which do not explicitly have signs in them. As we will see, these maps can be used to prove the PIE itself and play an important role in the Garsia–Milne Involution Principle, which we will study in the next section. Let 푆 be a (not necessarily finite) set. A function 휄∶ 푆 → 푆 is an involution if 휄2 is the identity map on 푆. Equivalently, 휄 is a bijection such that 휄−1 = 휄. There is another nice characterization of involutions which will be crucial once we introduce signs. For any 푓 ∶ 푆 → 푆, its fixed point set is Fix 푓 = {푠 ∈ 푆 ∣ 푓(푠) = 푠}. We also say that distinct elements 푠, 푡 ∈ 푆 form a 2-cycle of 푓 if 푓(푠) = 푡 and 푓(푡) = 푠. In this case we write (푠, 푡) or 푠 ↔ 푡 to denote the 2-cycle.

Lemma 2.2.1. Consider 휄∶ 푆 → 푆. The function 휄 is an involution if and only if 푆 is the disjoint union of the fixed points and 2-cycles of 휄.

Proof. For the forward direction, it suffices to show that if 푠 ∈ 푆 is not a fixed point, then it is in a 2-cycle. So suppose 휄(푠) = 푡. Then 휄(푡) = 휄2(푠) = 푠 as desired. Conversely, suppose that 푆 is such a disjoint union and pick 푠 ∈ 푆. If 푠 ∈ Fix 휄, then 휄2(푠) = 휄(푠) = 푠. Otherwise, 푠 is in a 2-cycle (푠, 푡) so that 휄2(푠) = 휄(푡) = 푠. So 휄2 is the identity map and we are done. □

A signed set is a set 푆 together with a function sgn∶ 푆 → {+1, −1}. In this case we let 푆+ = {푠 ∈ 푆 ∣ sgn 푠 = +1} and similarly for 푆−. If 휄∶ 푆 → 푆 is an involution on 푆, then we say that 휄 is sign reversing if sgn 휄(푠) = − sgn 푠 for every 푠 which is in a 2-cycle of 휄. A pictorial representation of this situation will be found in Figure 2.2. Now suppose that 푆 is finite. It follows that (2.3) ∑ sgn 푠 = ∑ sgn 푠. 푠∈푆 푠∈Fix 휄

Indeed, if 푠 is in a 2-cycle (푠, 휄(푠)), then on the left-hand side we have sgn 푠+sgn 휄(푠) = 0. So all elements in 2-cycles cancel from the sum, which leaves only terms from Fix 휄. This formula can be very useful if the sum on the right has far fewer terms than the one on the left. And if all the fixed points of 휄 have the same sign so that the right- hand side of (2.3) is ±| Fix 휄|, then we may be able to glean even more information. The

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.2. Sign-reversing involutions 45

푆+ Fix 휄 푆−

Figure 2.2. A sign-reversing involution on a set 푆

푘 general method for trying to prove facts about a signed sum ∑푘≥0(−1) 푎푘 for positive integers 푎푘 is as follows:

(i) Find a set 푆 enumerated by the positive sum ∑푘 푎푘. 푘 (ii) Sign 푆 so that the left-hand side of (2.3) equals ∑푘(−1) 푎푘. (iii) Devise a sign-reversing involution 휄 on 푆 with many 2-cycles. As our first application of sign-reversing involutions, we will reprove the formula for the alternating sum of the binomial coefficients in Theorem (1.3.3)(d). In fact, the original demonstration was a closet version of this technique. But now we can present the involution proof in its full glory. We restate the identity here for ease of reference: 푛 (2.4) ∑(−1)푘( ) = 훿 . 푘 푛,0 푘

Proof. As usual, we assume 푛 ≥ 1 since 푛 = 0 is trivial. From the sum with the signs removed, it is clear that we should let 푆 = 2[푛]. And from the way 푘 is being used in the original sum, one would be inclined to let sgn 푠 = (−1)#푠 for 푠 ⊆ [푛]. We now need to check that the left-hand sides of (2.3) and (2.4) agree. The technique we will use, of turning a single sum into a double sum and then grouping terms, is a common one in enumerative combinatorics. In this case ∑ sgn 푠 = ∑ (−1)#푠 푠∈푆 푠⊆[푛]

= ∑ ∑ (−1)푘 푘 [푛] 푠∈( 푘 )

푛 = ∑(−1)푘( ) 푘 푘 as desired. As for the sign-reversing involution, we already saw it in the original demonstra- tion of this result. Define 휄∶ 2[푛] → 2[푛] by 휄(푠) = 푠 Δ {푛}. As noted previously, this is an involution. To see that it is sign reversing, we have that |푠 Δ {푛}| = |푠| ± 1. So

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

46 2. Counting with Signs

sgn 휄(푠) = (−1)|푠|±1 = − sgn 푠. Finally, we just need to determine Fix 휄. But 푠 Δ {푛} ≠ 푠 for all 푠 ⊆ [푛]. Thus the right-hand side of (2.3) is the empty sum. Since this equals zero the proof is complete. □

Given that (2.4) was a crucial tool in proving the PIE, it may not come as a surprise that the principle itself can be proved using a sign-reversion involution. We restate the PIE here, in part so as not to conflict with the notation we have set up for sign-reversing involutions. So given a finite set 퐴 and subsets 퐴1, ... , 퐴푛 we wish to prove | 푛 | | 푛 | (2.5) |퐴 − 퐴 | = |퐴| − ∑ |퐴 | + ∑ |퐴 ∩ 퐴 | − ⋯ + (−1)푛 | 퐴 | . | ⋃ 푖| 푖 푖 푗 |⋂ 푖| | 푖=1 | 1≤푖≤푛 1≤푖<푗≤푛 |푖=1 |

Proof. An example illustrating the proof will be found after the demonstration. We cannot take 푆 = 퐴 since the same element of 퐴 is counted in many of the terms on the right side of (2.5). To take care of these multiplicities, let

[푛] (2.6) 푆 = {(푎, 퐼) ∈ 퐴 × 2 | 푎 ∈ 퐴퐼 },

recalling the notation

(2.7) 퐴 = 퐴 . 퐼 ⋂ 푖 푖∈퐼 Notice how pairs come into play here even though they are not apparent from the orig- inal statement of the result to be proved, just as in the case of the demonstration of Theorem 1.9.3. Note that 퐴∅ = 퐴. So (푎, ∅) is a pair for all 푎 ∈ 퐴, and if 푎 ∉ ⋃ 퐴푖, then this is the only pair in which 푎 appears. Since the signs in (2.5) come from the number of subsets in an intersection, we define

sgn(푎, 퐼) = (−1)#퐼 .

It follows that

∑ sgn 푠 = ∑ (−1)#퐼 푠∈푆 (푎,퐼)∈푆

= ∑ ∑ (−1)#퐼 [푛] 퐼∈2 푎∈퐴퐼

푛 = ∑ ∑ ∑ (−1)푘

푘=0 [푛] 푎∈퐴퐼 퐼∈( 푘 )

푛 푘 = ∑ (−1) ∑ |퐴퐼 | 푘=0 [푛] 퐼∈( 푘 )

as we wished.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.2. Sign-reversing involutions 47

To construct an involution, define for each 푎 ∈ ⋃ 퐴푖 the index

푚(푎) = max{푖 ∣ 푎 ∈ 퐴푖}. Finally, we let

(푎, 퐼 Δ {푚(푎)}) if 푎 ∈ ⋃ 퐴푖, 휄(푎, 퐼) = { (푎, 퐼) otherwise. It is clear from the definition that this is an involution whose fixed points are inbijec- tion with the elements of 퐴 − ⋃ 퐴푖 and whose 2-cycles contain elements of opposite signs. Since elements of 퐴 − ⋃ 퐴푖 each occur in exactly one pair, it follows that the right side of (2.3) is just the cardinality of this set, as desired. □

To illustrate the proof, suppose 퐴 = {푎, 푏, 푐, 푑}, 퐴1 = {푎, 푏}, and 퐴2 = {푏, 푐}. Then, leaving out curly brackets and commas in the index sets 퐼 for readability, 푆 = {(푎, ∅), (푎, 1), (푏, ∅), (푏, 1), (푏, 2), (푏, 12), (푐, ∅), (푐, 2), (푑, ∅)}. Also 푚(푎) = 1 and 푚(푏) = 푚(푐) = 2 so that the involution creates the following 2-cycles: (푎, ∅) ↔ (푎, 1), (푏, ∅) ↔ (푏, 2), (푏, 1) ↔ (푏, 12), (푐, ∅) ↔ (푐, 2).

The only fixed point is (푑, ∅) and 퐴 − (퐴1 ∪ 퐴2) = {푑}. It would be nice to prove something we have not seen before using our new tech- nique. Here is an identity involving Stirling numbers of the second kind. Theorem 2.2.2. For 푛 ≥ 0 we have ∑(−1)푘푘! 푆(푛, 푘) = (−1)푛. 푘≥0

Proof. The first order of business will be to give a combinatorial interpretation tothe summands. A composition of a set 푇 is a sequence of nonempty subsets 휌 = (퐵1, ... , 퐵푘)

such that ⨄푖 퐵푖 = 푇. In this case we write 휌 ⊧ 푇. So the number of 휌 ⊧ [푛] with 푘 blocks is 푘! 푆(푛, 푘) since we can start with any of the 푆(푛, 푘) partitions in 푆([푛], 푘) and order its blocks in 푘! ways. The reader should have enough experience with signed sets at this point to see that we are going to want to take 푆 to be all 휌 ⊧ [푛] with sgn 휌 = (−1)푘 if 휌 has 푘 blocks. Verifying that this gives the correct alternating sum andis easy and is left as an exercise. The involution will be more interesting. We will break it into two cases which will be inverses of each other. As often, an example follows the proof. Given 휌 = (퐵1, ... , 퐵푘) ⊧ [푛], we say that 퐵푗 is splittable if #퐵푗 ≥ 2. In this case the splitting map applied to 퐵푗 is defined by

휎(퐵1, ... , 퐵푘) = (퐵1, ... , 퐵푗−1, {푏}, 퐵푗 − {푏}, 퐵푗+1, ... , 퐵푘)

where 푏 = min 퐵푗. In other words 퐵푗 is replaced by a pair of blocks, the first containing its minimum element and the other all the rest of its elements. Although the notation 휎 does not indicate which block is to be split, this will be made clear from the context. We will now define the part of the involution which will undo splitting. Given 휌, we

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

48 2. Counting with Signs

say that 퐵푗 can be merged with 퐵푗+1 if

(1) 퐵푗 = {푏} for some element 푏 ∈ [푛] and

(2) 푏 < min 퐵푗+1.

In this case the merging map applied to 퐵푗 is defined by

휇(퐵1, ... , 퐵푘) = (퐵1, ... , 퐵푗−1, 퐵푗 ∪ 퐵푗+1, 퐵푗+2, ... , 퐵푘). ′ ′ It should be clear that if 퐵푖 can be split into 퐵푖 and 퐵푖+1, then the primed blocks can be merged back into 퐵푖 and vice versa. To define the involution 휄, suppose we are given 휌 = (퐵1, ... , 퐵푘). We scan 휌 from left to right until we find the first index 푗, if any, such that 퐵푗 can either be split or merged with 퐵푗+1. (Clearly one cannot do both since splitting implies that #퐵푗 ≥ 2 and merging that #퐵푗 = 1.) Now define 휎(휌) if 퐵 can be split, 휄(휌) = { 푗 휇(휌) if 퐵푗 can be merged. If no such index exists, then 휌 will be a fixed point of 휄. We have some work to do to verify that 휄 is an involution. Specifically, we must show that if 휄(휌) = 휌′ is obtained from 휌 by splitting at index 푗, then 휄(휌′) will be ob- tained by merging at the same index and vice versa. We will do the first case and leave the second to the reader. First note that since no 퐵푖, 푖 < 푗, could be split in 휌 we must have 퐵푖 = {푏푖} for some 푏푖 for each 푖 in this range. Furthermore, since none of these 퐵푖 could be merged into 퐵푖+1, we must also have 푏1 > 푏2 > ⋯ > 푏푗−1 > 푏푗 = min 퐵푗. ′ ′ ′ Now in 휌 we have 퐵푖 = {푏푖} for 푖 ≤ 푗 with 푏1 > ⋯ > 푏푗. As a consequence, no 퐵푖 ′ ′ ′ ′ can be split or merged for 푖 < 푗 and so 휄(휌 ) will merge 퐵푗 into 퐵푗+1. Thus 휄(휌 ) = 휌 as desired. It is clear that 휄 is sign reversing since 휄(휌) has one more or one fewer block than 휌. So we just need to find the fixed points. But if 휌 ∈ Fix 휄, then all 휌’s blocks con- tain a single element; otherwise one could be split. It follows that 휌 = ({푏1}, ... , {푏푛}). Furthermore, none of the blocks can be merged and so 푏1 > ⋯ > 푏푛. But this forces our set composition to be 휌 = ({푛}, {푛 − 1}, ... , {1}) and sgn 휌 = (−1)푛, completing the proof. □

To illustrate, suppose 푛 = 8. As we have done previously, we will dispense with brackets and commas in sets. Consider 휌 = (퐵1, ... , 퐵5) = (5, 3, 147, 2, 68). Then 퐵3 is splittable and splitting it results in 휎(휌) = (5, 3, 1, 47, 2, 68). Also, 퐵4 can be merged into 퐵5 in 휌 since 퐵4 = {2} and 2 < min 퐵5 = 6. Merging these two blocks gives 휇(휌) = (5, 3, 147, 268). To decide which operation to use we start with 퐵1. It cannot be split, having only one element. And it cannot be merged with 퐵2 since 5 > min 퐵2 = 3. Similarly 퐵2 cannot be split or merged with 퐵3. But we have already seen that 퐵3 can be split so that 휄(휌) = (5, 3, 1, 47, 2, 68) = 휌′. To check that 휄(휌′) = 휌 is similar. Involutions involving merging and splitting often come up when finding formulae for antipodes in Hopf algebras. One can consult the papers of Benedetti-Bergeron [7], Benedetti-Hallam-Machacek [8], Benedetti-Sagan [9], or Bergeron-Ceballos [12] for ex- amples.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.3. The Garsia–Milne Involution Principle 49

2.3. The Garsia–Milne Involution Principle

So far we have used sign-reversing involutions to explain cancellation in alternating sums. But can they also furnish a bijection for proving that two given sets have the same cardinality? The answer in certain cases is “yes” and the standard technique for doing this is called the Garsia–Milne Involution Principle. Garsia and Milne [30] introduced this method to give the first bijective proof of the Rogers–Remanujan identities, famous formulas which involve certain sets of integer partitions. Since then the Involution Principle has found a number of other applications. See, for example, the articles of Remmel [73] or Wilf [100]. In order to prove the Garsia–Milne result, we will need a version of Lemma 1.9.5 which applies to a slightly wider class of digraphs. Since the demonstration of the next result is similar to that of the earlier one, we leave the pleasure of proving it to the reader.

Lemma 2.3.1. Let 퐷 = (푉, 퐴) be a digraph. We have odeg 푣, ideg 푣 ≤ 1 for all 푣 ∈ 푉 if and only if 퐷 is a disjoint union of directed paths and directed cycles. □

The basic idea of the Involution Principle is that, under suitable conditions, if one has two signed sets each with their own sign-reversing involution, then we can use a bijection between these sets to create a bijection between their fixed-point sets. So let 푆 and 푇 be disjoint signed sets with sign-reversing involutions 휄∶ 푆 → 푆 and 휅∶ 푇 → 푇 such that Fix 휄 ⊆ 푆+ and Fix 휅 ⊆ 푇+. Furthermore, suppose we have a bijection 푓 ∶ 푆 → 푇 which preserves signs in that sgn 푓(푠) = sgn 푠 for all 푠 ∈ 푆. A picture of this setup can be found in Figure 2.3. Note that although all arrows are really double- headed, we have only shown them in one direction because of what is to come. And the circular arrows on the fixed points have been ignored. We now construct amap 퐹 ∶ Fix 휄 → Fix 휅 as follows. To define 퐹(푠) for 푠 ∈ Fix 휄 we first compute 푓(푠) ∈ 푇+. If 푓(푠) ∈ Fix 휅, then we let 퐹(푠) = 푓(푠). If not, we apply the functional composition 휙 = 푓 ∘ 휄 ∘ 푓−1 ∘ 휅 to 푓(푠). Remembering that we compose from right to left, this takes 푓(푠) to 푇−, 푆−, 푆+, and 푇+ in that order. If this brings us to an element of Fix 휅, then we let 퐹(푠) = 휙(푓(푠)). Otherwise we apply 휙 as many times as necessary, say 푚, to arrive at an element of Fix 휅 and define (2.8) 퐹(푠) = 휙푚(푓(푠)). Continuing the example in Figure 2.3 we see that 푓(푠) = 푢 ∉ Fix 휅. So we apply 휙, which takes 푢 to 푣, 푟, 푞, and 푡 in turn. Since 푡 ∈ Fix 휅 we let 퐹(푠) = 푡. Of course, we have to worry whether this is all well-defined; e.g., does 푚 always exist? And we also need to prove that 퐹 is a bijection. This is taken care of by the next theorem.

Theorem 2.3.2 (Garsia–Milne Involution Principle). With the notation of the previous paragraph, the map 퐹 ∶ Fix 휄 → Fix 휅 is a well-defined bijection.

Proof. Recall the notion of a functional digraph as used in the proof from Section 1.9 of Theorem 1.5.1. Define the following functions by restriction of their domains:

−1 푓 = 푓|푆+ , 푔 = 푓 |푇− , 휄 = 휄|푆− , 휅 = 휅|푇+−Fix 휅.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

50 2. Counting with Signs

푆+ 푇+

푞 푓 푓 푢

휄 푠 푡 휅 Fix 휄 Fix 휅

푟 푣 푓−1 푆− 푇−

Figure 2.3. An example of the Garsia–Milne construction

Consider 퐷 which is the union of the functional digraphs for 푓, 푔, 휄, and 휅. It is easy to verify from the definitions that 푥 ∈ 푉(퐷) has in-degrees and out-degrees given by the following table depending on the subset of 푆 ∪ 푇 containing 푥:

subset odeg 푥 ideg 푥 Fix 휄 1 0 Fix 휅 0 1 (푆 − Fix 휄) ∪ (푇 − Fix 휅) 1 1

For example, if 푥 ∈ Fix 휄, then the only arc containing 푥 comes from 푓 and so odeg 푥 = 1 and ideg 푥 = 0. On the other hand, if 푥 ∈ 푆+ − Fix 휄, then 푥 has an arc going out from 푓 and one coming in from 휄 giving odeg 푥 = ideg 푥 = 1. Now 퐷 satisfies the hypothesis of the forward direction ofLemma 2.3.1. It follows that 퐷 is a disjoint union of directed paths and directed cycles. Each directed path must start at a vertex with out-degree 1 and in-degree 0 and end at a vertex with these degrees switched. Furthermore, all other vertices have out-degree and in-degree both 1. From these observations and the chart, it follows that these paths define a 1-to-1 correspondence between the vertices of Fix 휄 and those of Fix 휅. Furthermore, from the definition of 퐷 we see that each path corresponds exactly to a functional composition 휙푚푓(푠) for 푠 ∈ Fix 휄 and some 푚 ≥ 0. So 퐹 is the bijection defined by these paths. □

Before we give an application of the previous theorem, we should mention an ap- proach which can be useful in setting up the necessary sets and bijections. Here is one way to try to find a bijection 퐹 ∶ 푋 → 푌 between two finite sets 푋, 푌.

(i) As with the PIE, construct a set 퐴 with subsets 퐴1, ... , 퐴푛 such that 푋 = 퐴 − ⋃ 퐴푖. Similarly construct 퐵 and 퐵1, ... , 퐵푛 for 푌. (ii) Use the method of our second proof of the PIE to set up a sign-reversing in- volution 휄 on the set 푆 as given by (2.6). Similarly construct 휅 on a set 푇.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.3. The Garsia–Milne Involution Principle 51

(iii) Find a bijection 푓 ∶ 푆 → 푇 of the form

푓(푎, 퐼) = (푏, 퐼)

which is well-defined in that 푎 ∈ 퐴퐼 if and only if 푏 ∈ 퐵퐼 .

+ Recall that Fix 휄 = (푎, ∅) where 푎 ∈ 퐴 − ⋃ 퐴푖. Thus Fix 휄 ⊆ 푆 as needed to apply the Involution Principle, and there is a natural bijection between Fix 휄 and 푋. Note also that 푓 is automatically sign preserving since sgn(푎, 퐼) = (−1)#퐼 = sgn(푏, 퐼). So once these three steps have been accomplished, Theorem 2.3.2 guarantees that we have a bijection 푋 → 푌. As already remarked, the Involution Principle is useful in proving integer partition identities. Say that partition 휆 = (휆1, 휆2, ... , 휆푘) has distinct parts if 휆1 > 휆2 > ⋯ > 휆푘 (as opposed to the usual weakly decreasing condition). On the other hand, say that 휆 has odd parts if all the 휆푖 are odd. The next result is a famous theorem of Euler. As has become traditional, an example follows the proof.

Theorem 2.3.3 (Euler). Let 푃푑(푛) be the set of partitions of 푛 with distinct parts and let 푃표(푛) be the set of partitions of 푛 with odd parts. For 푛 ≥ 0 we have

#푃푑(푛) = #푃표(푛).

Proof. It suffices to show that there is a bijection 푃푑(푛) → 푃표(푛). To apply the PIE to 푃푑(푛) we can take 퐴 = 푃(푛), the set of all partitions of 푛, with subsets 퐴1, ... , 퐴푛 where

퐴푖 = {휆 ⊢ 푛 ∣ 휆 has (at least) two copies of the part 푖}.

Note that 퐴푖 = ∅ if 푖 > 푛/2, but this does no harm and keeps the notation simple. It should be clear from the definitions that 푃푑(푛) = 퐴 − ⋃ 퐴푖. Similarly, for 푃표(푛) we let 퐵 = 푃(푛) with subsets

퐵푖 = {휇 ⊢ 푛 ∣ 휇 has a part of the form 2푖}

for 1 ≤ 푖 ≤ 푛. Again, it is easy to see that 푃표(푛) = 퐵 − ⋃ 퐵푖. The construction of 푆, 휄, 푇, and 휅 are now exactly the same as in the second proof of the PIE. So it suffices to construct an appropriate bijection 푓 ∶ 푆 → 푇. Given (휆, 퐼) ∈ 푆, we replace, for each 푖 ∈ 퐼, a pair of 푖’s in 휆 by a part 2푖 to form 휇. So if 휆 ∈ 퐴푖, then 휇 ∈ 퐵푖 for all 푖 ∈ 퐼 and the map 푓(휆, 퐼) = (휇, 퐼) is well-defined. It is also easy to construct 푓−1, taking an even part 2푖 in 휇 and replacing it with two copies of 푖 to form 휆 as 푖 runs over 퐼. Appealing to Theorem 2.3.2 finishes the proof. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

52 2. Counting with Signs

To illustrate this demonstration, suppose we start with (6, 2, 1) ∈ 푃푑(9). For the pairs in 푆 and 푇, we will dispense with delimiters and commas as usual. So

푓 휅 푓−1 휄 (621, ∅) ↦ (621, ∅) ↦ (621, 3) ↦ (3321, 3) ↦ (3321, ∅)

푓 휅 푓−1 휄 ↦ (3321, ∅) ↦ (3321, 1) ↦ (33111, 1) ↦ (33111, 13)

푓 휅 푓−1 휄 ↦ (621, 13) ↦ (621, 1) ↦ (6111, 1) ↦ (6111, ∅)

푓 휅 푓−1 휄 ↦ (6111, ∅) ↦ (6111, 3) ↦ (33111, 3) ↦ (33111, ∅)

푓 ↦ (33111, ∅).

퐹 It follows that we should map (6, 2, 1) ↦ (3, 3, 1, 1, 1). Clearly one might like to find a more efficient bijection if one exists. This issue will be further explored in the exercises.

2.4. The Reflection Principle

The Reflection Principle is a geometric method for working with certain combinatorial problems involving lattice paths. In particular, it will permit us to give a very simple proof of the binomial coefficient formula for the Catalan numbers. It is also useful in proving unimodality, an interesting property of real number sequences. Consider the integer lattice ℤ2 and northeast paths in this lattice. Suppose we are given a line in the plane of the form 퐿 ∶ 푦 = 푥 + 푏 for some 푏 ∈ ℤ. Note that the reflection in 퐿 of any northeast path is again a northeast path. If 푃 is a path from 푢 to 푃 푣, then we write 푃 ∶ 푢 → 푣 or 푢 → 푣. Suppose 푃 ∶ 푢 → 푣 intersects 퐿 and let 푥 be its

푣 퐿 퐿

푃2

푣′ 푥 푥 Υ퐿 → ′ 푃2

푢 푢 푃1 푃1

Figure 2.4. The map Υ퐿

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.4. The Reflection Principle 53

last (northeast-most) point of intersection. Then 푃 can be written as the concatenation

푃 푃 푃 ∶ 푢 →1 푥 →2 푣.

For example, on the left in Figure 2.4 we have the path 푃 = 퐸퐸퐸푁푁푁푁퐸푁 with 푃1 = 퐸퐸퐸푁푁 and 푃2 = 푁푁퐸푁. We now define a new path

푃 푃′ 1 2 ′ Υ퐿(푃) ∶ 푢 → 푥 → 푣

′ ′ where 푃2 and 푣 are the reflections of 푃2 and 푣 in 퐿, respectively. Returning to our ex- ′ ample, 푃2 = 퐸퐸푁퐸, which is obtained from 푃2 by merely interchanging north and east steps. So Υ퐿(푃) = 퐸퐸퐸푁푁퐸퐸푁퐸 as on the right in Figure 2.4. This is the fundamental map for using the Reflection Principle. To state it precisely, let 풩ℰ(푢; 푣) denote the set of northeast paths from 푢 to 푣 and let 풩ℰ퐿(푢; 푣) be the subset of paths which intersect 퐿. If 푢 is omitted, then it is assumed that 푢 = (0, 0). Also, be sure to distinguish the notation 풩ℰ(푢; 푣) for the northeast paths from 푢 to 푣 and 풩ℰ(푚, 푛) for the northeast paths from (0, 0) to (푚, 푛). The former contains a semicolon where the latter has a comma.

Theorem 2.4.1 (Reflection Principle). Given 퐿 ∶ 푦 = 푥+푏 for 푏 ∈ ℤ and 푣 ∈ ℤ2, we let ′ ′ 푣 be the reflection of 푣 in 퐿. Then the map Υ퐿 ∶ 풩ℰ퐿(푢; 푣) → 풩ℰ퐿(푢; 푣 ) is a bijection.

′ Proof. In fact, we can show that Υ퐿 is an involution on 풩ℰ퐿(푢; 푣) ∪ 풩ℰ퐿(푢; 푣 ). This follows from the fact that reflection in 퐿 is an involution and that the set of intersection □ points does not change when passing from 푃 ∩ 퐿 to Υ퐿(푃) ∩ 퐿.

As a first application of Theorem 2.4.1, we will give a simpler, although not as purely combinatorial, proof of Theorem 1.11.3. We restate the formula here for refer- ence: 1 2푛 퐶(푛) = ( ). 푛 + 1 푛

Proof. Recall that 퐶(푛) counts the set 풟(푛) of northeast Dyck paths from (0, 0) to (푛, 푛). From Theorem 1.11.1 we know that the total number of all northeast paths 푃 from the origin to (푛, 푛) is 2푛 #풩ℰ(푛, 푛) = ( ). 푛 Note that 푃 does not stay weakly above 푦 = 푥 if and only if 푃 intersects the line 퐿 ∶ 푦 = 푥 −1. And by the Reflection Principle, such paths are in bijection with 풩ℰ퐿((0, 0); (푛 + 1, 푛 − 1)) since (푛 + 1, 푛 − 1) is the reflection of (푛, 푛) in 퐿. But all paths from (0, 0) to (푛 + 1, 푛 − 1) cross 퐿 since these two points are on opposite sides of the line. Thus, using Theorem 1.11.1 again,

2푛 #풩ℰ ((0, 0); (푛 + 1, 푛 − 1)) = #풩ℰ(푛 + 1, 푛 − 1) = ( ). 퐿 푛 + 1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

54 2. Counting with Signs

So subtracting the number of non-Dyck paths from the total number of paths in 풩ℰ(푛, 푛) gives 2푛 2푛 퐶(푛) = ( ) − ( ) 푛 푛 + 1

(2푛)! (2푛)! = − 푛! 푛! (푛 + 1)! (푛 − 1)!

푛 2푛 = (1 − )( ) 푛 + 1 푛

1 2푛 = ( ) 푛 + 1 푛 as desired. □

The Reflection Principle can also be used to prove that certain sequences havea property called unimodality. A sequence of real numbers 푎0, 푎1, ... , 푎푛 is said to be unimodal if there is an index 푚 such that

푎0 ≤ 푎1 ≤ ⋯ ≤ 푎푚 ≥ 푎푚+1 ≥ ⋯ ≥ 푎푛. So this is the next most complicated behavior after being weakly increasing or weakly decreasing. In fact the latter are the special cases of unimodality where 푚 = 푛 or 푚 = 0. Many sequences arising in combinatorics, algebra, and geometry are unimodal. See the survey articles of Stanley [89], Brenti [20], or Brändén [19] for more details. The term “unimodal” comes from probability and statistics where one thinks of the 푎푖 as giving you the distribution of a taking values in {0, 1, ... , 푛}. Then a unimodal distribution has only one hump. We have already met a number of unimodal sequences, although we have not re- marked on the fact. Here is the simplest. Theorem 2.4.2. For 푛 ≥ 0 the sequence 푛 푛 푛 ( ), ( ),..., ( ) 0 1 푛 is unimodal.

Proof. Because the binomial coefficients are symmetric, Theorem 1.3.3(b), it suffices to prove that this sequence is increasing up to its halfway point. So we want to show 푛 푛 ( ) ≤ ( ) 푘 푘 + 1 for 푘 < ⌊푛/2⌋. From Theorem 1.11.1, we know that 푛 푛 ( ) = #풩ℰ(푘, 푛 − 푘) and ( ) = #풩ℰ(푘 + 1, 푛 − 푘 − 1). 푘 푘 + 1 So it suffices to find an injection 푖 ∶ 풩ℰ(푘, 푛 − 푘) → 풩ℰ(푘 + 1, 푛 − 푘 − 1). Let 퐿 be the perpendicular bisector of the line segment from (푘, 푛−푘) to (푘+1, 푛−푘−1). It is easy to

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.5. The Lindström–Gessel–Viennot Lemma 55

check that 퐿 has the form 푦 = 푥 + 푏 for 푏 ∈ ℤ. From the Reflection Principle, we have a bijection Υ퐿 ∶ 풩ℰ퐿(푘, 푛 − 푘) → 풩ℰ퐿(푘 + 1, 푛 − 푘 − 1). But since 푘 < ⌊푛/2⌋ the lattice points (0, 0) and (푘, 푛−푘) are on opposite sides of 퐿 so that 풩ℰ퐿(푘, 푛−푘) = 풩ℰ(푘, 푛−푘). Furthermore 풩ℰ퐿(푘 + 1, 푛 − 푘 − 1) ⊆ 풩ℰ(푘 + 1, 푛 − 푘 − 1). So extending the range of □ Υ퐿 provides the desired injection.

It turns out that the Stirling number sequences 푐(푛, 0), 푐(푛, 1), ... , 푐(푛, 푛) and 푆(푛, 0), 푆(푛, 1), ... , 푆(푛, 푛) are also unimodal. But this is not so easy to prove directly. One reason for this is that these sequences are not symmetric like the one for the binomial coefficients. And there is no known simple expression for the index 푚 where they achieve their maxima. Instead it is better to use another property of real sequences, called log-concavity, which can imply unimodality. This is one of the motivations for the next section.

2.5. The Lindström–Gessel–Viennot Lemma

The lemma in question is a powerful technique for dealing with certain determinan- tal identities. It was first discovered by Lindström [57] and then used to great effect by Gessel and Viennot [31] as well as many other authors. Like the Reflection Prin- ciple, this method uses directed paths. On the other hand, it uses multiple paths and is not restricted to the integer lattice. In particular, when there are two paths, then log-concavity results can be obtained.

A sequence of real numbers 푎0, 푎1, ... , 푎푛 is called log-concave if, for all 0 < 푘 < 푛, we have

2 (2.9) 푎푘 ≥ 푎푘−1푎푘+1.

As usual, we can extend this to all 푘 ∈ ℤ by letting 푎푘 = 0 for 푘 < 0 or 푘 > 푛. Log- concave sequences, like unimodal ones, are ubiquitous in combinatorics, algebra, and geometry. See the previously cited survey articles of Stanley, Brenti, and Brändén for details. For example, a row of Pascal’s triangle or either of the Stirling triangles is log- concave. The name “log-concave” comes from the following scenario. Suppose that we have a function 푓 ∶ ℝ → ℝ which is concave down. So if one takes any two points on the graph of 푓, then the line segment connecting them lies weakly below 푓. Taking the points to be (푘 − 1, 푓(푘 − 1)) and (푘 + 1, 푓(푘 + 1)) and comparing the 푦-coordinate of the midpoint of the corresponding line segment with that coordinate on 푓 gives (푓(푘 − 1) + 푓(푘 + 1))/2 ≤ 푓(푘). Now if 푓(푥) > 0 for all 푥 and the function log 푓(푥) is concave down, then substituting into the previous inequality and exponentiating gives 푓(푘 − 1)푓(푘 + 1) ≤ 푓(푘)2 just like the definition of log-concavity for sequences. It turns out that log-concavity and unimodality are related.

Proposition 2.5.1. Suppose that 푎0, 푎1, ... , 푎푛 is a sequence of positive reals. If the se- quence is log-concave, then it is unimodal.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

56 2. Counting with Signs

Proof. To show a sequence is unimodal it suffices to show that after its first strict de- crease, then it continues to weakly decrease. But 푎푘−1 > 푎푘 is equivalent to 푎푘−1/푎푘 > 1 for positive 푎푘. Rewriting (2.9) as 푎푘/푎푘+1 ≥ 푎푘−1/푎푘 we see that if 푎푘−1/푎푘 > 1, then □ 푎푙−1/푎푙 > 1 for all 푙 ≥ 푘. So the sequence is unimodal.

Even though log-concavity implies unimodality for positive sequences, it is para- doxically often easier to prove log-concavity rather than proving unimodality directly. This comes in part from the fact that the log-concave condition is a uniform one for all 푘, as opposed to unimodality where one must know where the maximum of the sequence occurs. 2 We can rewrite (2.9) as 푎푘 − 푎푘−1푎푘+1 ≥ 0, or in terms of determinants as | 푎 푎 | (2.10) | 푘 푘+1 | ≥ 0. | 푎푘−1 푎푘 | To prove that the determinant is nonnegative, we could show that it counts something and that is exactly what the Lindström–Gessel–Viennot Lemma is set up to do. We will first consider the case of 2 × 2 determinants and at the end of the section indicate how to do the general case. As a running example, we will show how to prove log-concavity of the sequence of binomial coefficients considered in Theorem 2.4.2. Let 퐷 be a digraph which is acyclic in that it contains no directed cycles. Given two vertices of 푢, 푣 ∈ 푉(퐷), we let 풫(푢; 푣) denote the set of directed paths from 푢 to 푣. We will assume that 푢, 푣 are always chosen so that 푝(푢; 푣) = #풫(푢; 푣) is finite even if 퐷 itself is not. To illustrate, let 퐷 be the digraph with vertices ℤ2 and arcs from (푚, 푛) to (푚 + 1, 푛) and to (푚, 푛 + 1) for all 푚, 푛 ∈ ℤ. Then 풫(푢; 푣) is just the set of northeast lattice paths from 푢 to 푣, denoted 풩ℰ(푢; 푣) in the previous section. We will continue to use the notation for general paths from that section for any acyclic digraph. We also extend that notation as follows. Given a directed path 푃 ∶ 푢 → 푣 and vertices 푥 coming 푃 before 푦 on 푃, we let 푥 → 푦 be the portion of 푃 between 푥 and 푦.

Continuing the general exposition, suppose we are given 푢1, 푢2 ∈ 푉 called the initial vertices and 푣1, 푣2 ∈ 푉 which are the final vertices. We wish to consider determi- nants of the form

| 푝(푢1; 푣1) 푝(푢1; 푣2) | (2.11) | | = 푝(푢1; 푣1)푝(푢2; 푣2) − 푝(푢1; 푣2)푝(푢2; 푣1). | 푝(푢2; 푣1) 푝(푢2; 푣2) |

Note that 푝(푢1; 푣1)푝(푢2; 푣2) counts pairs of paths

(푃1, 푃2) ∈ 풫(푢1; 푣1) × 풫(푢2; 푣2) ≔ 풫12

and similarly for 푝(푢1; 푣2)푝(푢2; 푣1) and

풫(푢1; 푣2) × 풫(푢2; 푣1) ≔ 풫21. Returning to our example, if we wish to show

2 푛 푛 푛 ( ) − ( )( ) ≥ 0, 푘 푘 − 1 푘 + 1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.5. The Lindström–Gessel–Viennot Lemma 57

푣2 푣2

푣1 푣1

Ω 푥 → 푥

푢2 푢2

푢1 푢1

Figure 2.5. The Lindström–Gessel–Viennot Involution

then we could take

푢1 = (1, 0), 푢2 = (0, 1), 푣1 = (푘 + 1, 푛 − 푘), 푣2 = (푘, 푛 − 푘 + 1).

푛 푛 It follows from Theorem 1.11.1 that 푝(푢1; 푣1) = 푝(푢2; 푣2) = (푘), while 푝(푢1; 푣2) = (푘−1) 푛 and 푝(푢2; 푣1) = (푘+1). More specifically, if 푛 = 7 and 푘 = 3, then in Figure 2.5 we have 7 7 a pair of paths in 풫21 counted by (2)(4) on the left and another pair in 풫12 counted by 7 2 (3) on the right. For readablity, the grid for the integer lattice has been suppressed, leaving only the vertices of ℤ2. To prove that the determinant (2.11) is nonnegative, we will construct a sign- reversing involution Ω on the set 풫 ≔ 풫12 ∪ 풫21 where

+1 if (푃1, 푃2) ∈ 풫12, sgn(푃1, 푃2) = { −1 if (푃1, 푃2) ∈ 풫21.

We will construct Ω so that every pair in 풫21 is in a 2-cycle with a pair in 풫12. Further- more, the remaining fixed points in 풫12 will be exactly the path pairs in 풫 which do not intersect. It follows that (2.11) is just the number of nonintersecting path pairs in 풫 and therefore must be nonnegative.

To define Ω, consider a path pair (푃1, 푃2) ∈ 풫. If 푃1 ∩ 푃2 is empty, then this pair is in 풫12, since every pair in 풫21 intersects. So in this case we let Ω(푃1, 푃2) = (푃1, 푃2), a fixed point. If 푃1 ∩ 푃2 ≠ ∅, then consider the list of intersections 푥1, ... , 푥푡 in the order in which they are encountered on 푃1. We claim they must also be encountered in this order on 푃2. For if there were intersections 푥, 푦 such that 푥 comes before 푦 on 푃1 and 푦 푃1 푃2 comes before 푥 on 푃2, then one can show that the directed walk 푥 → 푦 → 푥 contains a directed cycle, as the reader will be asked to do in the exercises. This contradicts the assumption that 퐷 is acyclic. So there is a well-defined notion of a first intersection

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

58 2. Counting with Signs

′ ′ 푥 = 푥1. We now let Ω(푃1, 푃2) = (푃1 , 푃2 ) where

푃 푃 ′ 1 2 푃1 = 푢1 → 푥 → 푣2, 푃 푃 ′ 2 1 푃2 = 푢2 → 푥 → 푣1,

if (푃1, 푃2) ∈ 풫12, and similarly if (푃1, 푃2) ∈ 풫21 with 푣1 and 푣2 reversed. An illustration of this map is shown in Figure 2.5. ′ ′ Because the set of intersections in (푃1, 푃2) is the same as in (푃1 , 푃2 ), the first in- tersection remains the same and this makes Ω an involution. It is also clear from its definition that it changes sign. We have proved the following lemma and corollary.

Lemma 2.5.2. Let 퐷 be an acyclic digraph. Let 푢1, 푢2, 푣1, 푣2 ∈ 푉(퐷) be such that each pair of paths (푃1, 푃2) ∈ 풫21 intersects. Then

| 푝(푢1; 푣1) 푝(푢1; 푣2) | | | = number of nonintersecting pairs (푃1, 푃2) ∈ 풫12. | 푝(푢2; 푣1) 푝(푢2; 푣2) | In particular, the determinant is nonnegative. □

Corollary 2.5.3. For 푛 ≥ 0 the sequence

푛 푛 푛 ( ), ( ),..., ( ) 0 1 푛

is log-concave.

Lemma 2.5.2 can be extended to 푛 × 푛 determinants as follows. Let 푢1, ... , 푢푛 and 푣1, ... , 푣푛 be 푛-tuples of distinct vertices in an acyclic digraph. For 휋 ∈ 픖푛, we let

풫휋 = {(푃1, ... , 푃푛) ∣ 푃푖 ∶ 푢푖 → 푣휋(푖) for all 푖 ∈ [푛]} and 풫 = 풫 . ⋃ 휋 휋∈픖푛

To make 풫 into a signed set, recall from abstract algebra that the sign of 휋 ∈ 픖푛 is sgn 휋 = (−1)푛−푘

if 휋 has 푘 cycles in its disjoint cycle decomposition. There are other ways to define sgn 휋, but they are all equivalent. One crucial property of this sign function is that if 퐴 = [푎푖,푗] is a matrix, then

det 퐴 = ∑ (sgn 휋)푎1,휋(1)푎2,휋(2) ... 푎푛,휋(푛). 휋∈픖푛

Now if (푃1, ... , 푃푛) ∈ 풫휋, then we let sgn(푃1, ... , 푃푛) = sgn 휋.

To extend the involution Ω, call 푃 = (푃1, ... , 푃푛) intersecting if there is some pair 푃푖, 푃푗 which intersects. Given an intersecting 푃, we find the smallest 푖 such that 푃푖 intersects another path of 푃 and let 푥 be the first intersection of 푃푖 with another path.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.6. The Matrix-Tree Theorem 59

′ Now take the smallest 푗 > 푖 such that 푃푗 goes through 푥. We now let Ω(푃) = 푃 where ′ 푃 is 푃 with 푃푖, 푃푗 replaced by

푃 푃푗 푃′ = 푢 →푖 푥 → 푣 , (2.12) 푖 푖 휋(푗) 푃 푃 ′ 푗 푖 푃푗 = 푢푗 → 푥 → 푣휋(푖), respectively. One now needs to check that Ω is a sign-reversing involution. As before, nonintersecting path families 푃 are fixed points of Ω. Modulo the details about Ω, we have now proved the following.

Lemma 2.5.4 (Lindström–Gessle–Viennot). Let 퐷 be an acyclic digraph. Consider two sequences of vertices 푢1, ... , 푢푛, 푣1, ... , 푣푛 ∈ 푉(퐷) such that every 푃 ∈ 풫휋 is intersecting for 휋 ≠ id, the identity permutation. We have

det[푝(푢푖; 푣푗)]1≤푖,푗≤푛 = number of nonintersecting 푃 ∈ 풫id. In particular, the determinant is nonnegative. □

This theorem also has something to say about real sequences. Any sequence 푎0,..., 푎푛 has a corresponding Toeplitz matrix which is the infinite matrix 퐴 = [푎푗−푖]푖,푗≥0. So 푎 푎 푎 ⋯ 푎 0 0 0 ... ⎡ 0 1 2 푛 ⎤ ⎢ 0 푎0 푎1 푎2 ⋯ 푎푛 0 0 ... ⎥ 퐴 = ⎢ ⎥ . ⎢ 0 0 푎0 푎1 푎2 ⋯ 푎푛 0 ... ⎥ ⎣ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⎦ The sequence is called Pólya frequency, or PF for short, if every square submatrix of 퐴 has a nonnegative determinant. Notice that, in particular, we get the determinants in (2.10) so that PF implies log-concave. Lemma 2.5.4 can be used to prove that a se- quence is PF in much the same way that Lemma 2.5.2 can be used to prove that it is log-concave. The reader should now have no difficulty in proving the following result.

Theorem 2.5.5. For 푛 ≥ 0 the sequence 푛 푛 푛 ( ), ( ),..., ( ) 0 1 푛 is PF. □

2.6. The Matrix-Tree Theorem

We end this chapter with another application of determinants. There are many places where these animals abide in enumerative combinatorics and a good survey will be found in the articles of Krattenthaler [54,55]. Here we will be concerned with counting spanning trees using a famous result of Kirchhoff called the Matrix-Tree Theorem. A subgraph 퐻 ⊆ 퐺 is called spanning if 푉(퐻) = 푉(퐺). So a spanning subgraph is completely determined by its edge set. A spanning tree 푇 of 퐺 is a spanning subgraph which is a tree. Clearly for a spanning tree to exist, 퐺 must be connected. Let 풮푇(퐺) be the set of spanning trees of 퐺. If one considers the graph 퐺 on the left in Figure 2.6,

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

60 2. Counting with Signs

푣 푒 푤

푔 푓 ℎ

푦 푖 푥

푣 푒 푤

푔 푓 ℎ

푦 푖 푥

Figure 2.6. A graph 퐺, its spanning trees, and an orientation

then the list of its eight spanning trees is in the middle of the figure (shrunk to half size so they will fit and without the vertex and edge labels). To develop the tools needed to prove our main theorem, we first need to make some remarks about combinatorial matrices. We will often have occasion to create matrices whose rows and columns are in- dexed by sets rather than numbers. If 푆, 푇 are sets, then an 푆 × 푇 matrix 푀 is con- structed by giving a linear order to the elements of 푆 and to those of 푇 and using them to index the rows and columns of 푀, respectively. So if (푠, 푡) ∈ 푆 × 푇, then 푚푠,푡 is the entry in 푀 in the row indexed by 푠 and the column indexed by 푡. The reader may have noted that such a matrix depends not just on 푆, 푇, but also on their linear orderings. However, changing these orderings merely permutes rows and columns in 푀 which will usually have no effect on the information we wish to extract fromit. If 퐺 = (푉, 퐸) is a graph, then there are several important matrices associated with it. The adjacency matrix of 퐺 is the 푉 × 푉 matrix 퐴 = 퐴(퐺) with

1 if 푣푤 ∈ 퐸, 푎 = { 푣,푤 0 otherwise.

Using the ordering 푣, 푤, 푥, 푦, the graph on the left in Figure 2.6 has adjacency matrix

푣 푤 푥 푦 푣 0 1 1 1 ⎡ ⎤ 퐴 = 푤 ⎢ 1 0 1 0 ⎥ . ⎢ ⎥ 푥 ⎢ 1 1 0 1 ⎥ 푦 ⎣ 1 0 1 0 ⎦

The adjacency matrix is always symmetric since 푣푤 and 푤푣 denote the same edge. It also has zeros on the diagonal since our graphs are (usually) loopless.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.6. The Matrix-Tree Theorem 61

A second matrix associated with 퐺 is its incidence matrix, 퐵 = 퐵(퐺), which is the 푉 × 퐸 matrix with entries 1 if 푣 is an endpoint of 푒, 푏 = { 푣,푒 0 otherwise.

Returning to our example, the graph has

푒 푓 푔 ℎ 푖 푣 1 1 1 0 0 ⎡ ⎤ 퐵 = 푤 ⎢ 1 0 0 1 0 ⎥ . ⎢ ⎥ 푥 ⎢ 0 0 1 1 1 ⎥ 푦 ⎣ 0 1 0 0 1 ⎦ By construction, row 푣 of 퐵 contains deg 푣 ones, and every column contains 2 ones. We will also need the diagonal 푉 × 푉 matrix 퐶(퐺) which has diagonal entries 푐푣,푣 = deg 푣. These three matrices are nicely related.

Proposition 2.6.1. For any graph 퐺 we have

퐵퐵푡 = 퐴 + 퐶.

Proof. The (푣, 푤) entry of 퐵퐵푡 is the inner product of rows 푣 and 푤 of 퐵. If 푣 = 푤, then this is, using the notation (1.9),

2 2 ∑ 푏푣,푒 = ∑ 훿(푣 is an endpoint of 푒) = deg 푣 = 푐푣,푣 푒 푒

since 02 = 0 and 12 = 1. Similarly, if 푣 ≠ 푤, then the entry is

∑ 푏푣,푒푏푤,푒 = ∑ 훿(푣 is an endpoint of 푒) ⋅ 훿(푤 is an endpoint of 푒) 푒 푒 = 훿(푣푤 ∈ 퐸)

= 푎푣,푤, which completes the proof. □

Interestingly, to compute the number of spanning trees of 퐺 we will have to turn 퐺 into a digraph. An orientation of 퐺 is a digragh 퐷 with 푉(퐷) = 푉(퐺) and, for each edge 푣푤 ∈ 퐸(퐺), either the arc ⃗푣푤 or the arc ⃗푤푣 in 퐴(퐷). In this case 퐺 is called the underlying graph of 퐷. The digraph on the right in Figure 2.6 is an orientation of our running example graph 퐺. The adjacency matrix of a digraph is defined just as for graphs and will not concern us here. But we will need the directed incidence matrix, 퐵 = 퐵(퐷), defined by

⎧−1 if 푎 = ⃗푣푤 for some 푤, 푏 = 1 if 푎 = ⃗푤푣 for some 푤, 푣,푎 ⎨ ⎩ 0 otherwise.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

62 2. Counting with Signs

For the digraph in Figure 2.6 we have 푒 푓 푔 ℎ 푖 푣 −1 −1 1 0 0 ⎡ ⎤ 퐵 = 푤 ⎢ 1 0 0 1 0 ⎥ . ⎢ ⎥ 푥 ⎢ 0 0 −1 −1 1 ⎥ 푦 ⎣ 0 1 0 0 −1 ⎦ Here are the two properties of 퐵(퐷) which will be important for us. Proposition 2.6.2. Let 퐷 be a digraph and let 퐵 = 퐵(퐷).

(a) If the rows of 퐵 are 퐛1, ... , 퐛푛, then

퐛1 + ⋯ + 퐛푛 = ퟎ where ퟎ is the zero vector. (b) If 퐷 is an orientation of a graph 퐺, then (2.13) 퐵퐵푡 = 퐶(퐺) − 퐴(퐺).

Proof. For (a), just note that every column of 퐵 contains a single 1 and a single −1, which will cancel in the sum. The proof of (b) is similar to that for Proposition 2.6.1 and so is left to the reader. □

It is interesting to note that although the matrix 퐵 on the left-hand side of (2.13) depends on 퐷, the right-hand side only depends on the underlying graph 퐺. The matrix 퐿(퐺) = 퐶(퐺) − 퐴(퐺) is called the Laplacian of 퐺 and controls many combinatorial aspects of the graph. Returning to our example, we have 3 −1 −1 −1 ⎡ ⎤ ⎢ −1 2 −1 0 ⎥ 퐿(퐺) = ⎢ ⎥ . ⎢ −1 −1 3 −1 ⎥ ⎣ −1 0 −1 2 ⎦ Note that the sum of the rows of 퐿 = 퐿(퐺) is zero since, for all 푣 ∈ 푉, column 푣 contains deg 푣 on the diagonal and then deg 푣 other nonzero entries which are all −1. So det 퐿 = 0. But removing the last row and column of the previous displayed matrix and taking the determinant gives 3 −1 −1 det[ −1 2 −1 ] = 8. −1 −1 3 The reader may recall that 8 was also the number of spanning trees of 퐺. This is not a coincidence! But before we can prove the implied theorem, we need one more result.

Let 푀 be an 푆 × 푇 matrix and let 퐼 ⊆ 푆 and 퐽 ⊆ 푇. Let 푀퐼,퐽 denote the submatrix of 푀 whose rows are indexed by 퐼 and columns by 퐽. In 퐵(퐺) for our example graph 퐺 with 퐼 = {푣, 푥} and 퐽 = {푓, 푔, 푖} we would have 1 1 0 퐵 = [ ] . 퐼,퐽 0 1 1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

2.6. The Matrix-Tree Theorem 63

If 퐼 = 푆−{푠} for some 푠 ∈ 푆 and 퐽 = 푇−{푡} for some 푡 ∈ 푇, then we use the abbreviation 푀푠,̂ 푡̂ for 푀퐼,퐽 . In this case when 푆 = 푇 = [푛], the (푖, 푗) cofactor of 푀 is

푖+푗 푚푖,̂ 푗̂ = (−1) det 푀푖,̂ 푗.̂

We will need the following famous result about determinants called the Cauchy– Binet Theorem. Since this is really a statement about linear algebra rather than com- binatorics, we will just outline a proof in the exercises.

Theorem 2.6.3 (Cauchy–Binet Theorem). Let 푄 be an [푚] × [푛] matrix and let 푅 be [푛] × [푚]. Then

□ det 푄푅 = ∑ det 푄[푚],퐾 ⋅ det 푅퐾,[푚]. [푛] 퐾∈( 푚 )

Note that in the special case 푚 = 푛 this reduces to the well-known statement that det 푄푅 = det 푄 ⋅ det 푅.

Theorem 2.6.4 (Matrix-Tree Theorem). Let 퐺 be a graph with 푉 = [푛], 퐸 = [푚], and let 퐿 = 퐿(퐺). We have for any 푖, 푗 ∈ [푛]

#풮푇(퐺) = ℓ푖,̂ 푗.̂

Proof. We will do the case when 푖 = 푗 = 푛 as the other cases are similar. So

푛+푛 ℓ푛,̂ 푛̂ = (−1) det 퐿푛,̂ 푛̂ = det 퐿푛,̂ 푛̂. Let 퐷 be any orientation of 퐺 and 퐵 = 퐵(퐷). By Proposition 2.6.2(b), we have that 퐿 = 퐶(퐺) − 퐴(퐺) = 퐵퐵푡. It follows that

푡 퐿푛,̂ 푛̂ = 퐵푊,퐸(퐵푊,퐸) where 푊 = [푛 − 1]. Applying the Cauchy–Binet Theorem we get

푡 2 ℓ푛,̂ 푛̂ = ∑ det 퐵푊,퐹 ⋅ det(퐵푊,퐹 ) = ∑ (det 퐵푊,퐹 ) . 퐸 퐸 퐹∈(푛−1) 퐹∈(푛−1) So the theorem will be proved if we can show that

±1 if the edges of 퐹 are a spanning tree of 퐺, (2.14) det 퐵 = { 푊,퐹 0 otherwise.

Note that 퐵푊,퐹 is the incidence matrix of the digraph 퐷퐹 having 푉(퐷퐹 ) = 푉 and 퐴(퐷퐹 ) = 퐹 but with the row of vertex 푛 removed. We say that 퐷퐹 is a tree if its under- lying graph is one.

We first consider the case when 퐷퐹 is not a tree. We know #퐹 = 푛−1 so, by Theo- rem 1.10.2, 퐷퐹 must be disconnected. Thus there is a component of 퐷퐹 not containing the vertex 푛. And the sum of the row vectors of 퐵푊,퐹 corresponding to that component is ퟎ by Proposition 2.6.2(a). Thus det 퐵푊,퐹 = 0 in this case.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

64 2. Counting with Signs

Now suppose that 퐷퐹 is a tree. To prove this case of (2.14), it suffices to permute the rows and columns of 퐵푊,퐹 so that the matrix becomes lower triangular with ±1 on the diagonal. Such a permutation corresponds to a relabeling of the vertices and edges of 퐷퐹 . If 푛 = 1, then 퐵푊,퐹 is the empty matrix which has determinant 1. If 푛 > 1, then, by Lemma 1.10.1, 퐷퐹 has at least two leaves. So in particular there is a leaf in 푊 = [푛 − 1]. By relabeling 퐷퐹 we can assume that 푣 = 1 is the leaf and 푎 = 1 is the sole arc containing 푣. It follows that the first row of 퐵푊,퐹 has ±1 in the (1, 1) position and zeros elsewhere. Now we consider 퐷퐹 − 푣 and recurse to finish constructing the matrix. □

We can use this theorem to rederive Cayley’s result, Theorem 1.10.3, enumerating all trees on a given vertex set. To do so, consider the complete graph 퐾푛 with vertex set 푉 = [푛]. Clearly the number of trees on 푛 vertices is the same as the number of spanning trees of 퐾푛. The Laplacian 퐿(퐾푛) consists of 푛 − 1 down the diagonal with −1 everywhere else. So 퐿푛,̂ 푛̂ is the same matrix but with dimensions (푛 − 1) × (푛 − 1). Add all the rows of this matrix to the first row. The result is a first row which is allones since every column consists of an 푛 − 1 as well as 푛 − 2 minus ones in some order. Next add the first row to each of the other rows. This will cancel all the minus onesinthose rows as well as changing each diagonal entry from 푛 − 1 to 푛. Now the matrix is upper triangular and, since elementary row operations do not change the determinant, we have that ℓ푛,̂ 푛̂ is the product of the diagonal entries which consist of a one and 푛 − 2 copies of 푛. Cayley’s Theorem follows.

Exercises

(1) Let 푛 be a positive integer and let 푝1, ... , 푝푘 be distinct primes. Prove that the number of integers between 1 and 푛 not divisible by any of the 푝푖 is 푛 푛 푛 푛 − ∑ ⌊ ⌋ + ∑ ⌊ ⌋ − ⋯ + (−1)푘 ⌊ ⌋ . 푝 푝 푝 푝 푝 ... 푝 1≤푖≤푘 푖 1≤푖<푗≤푘 푖 푗 1 2 푘 (2) Let 퐴(푛) be the number of 휌 ⊢ [푛] such that 푖 and 푖 + 1 never occur in the same block of 휌 for any 푖 ∈ [푛 − 1]. (a) Show that 푛−1 푛 − 1 퐴(푛) = ∑ (−1)푖( )퐵(푛 − 푖) 푖 푖=0 where 퐵(푛) is the 푛th Bell number. (b) Find and prove a similar identity involving the Stirling numbers of the second kind. (c) Show that part (a) follows from part (b). (3) Fix positive integers 푘 ≤ 푛. Use the Principle of Inclusion and Exclusion to find a formula for the number of compositions 훼 = [훼1, ... , 훼푘] ⊧ 푛 with the property that 훼푖 ≥ 2 for all 푖 ∈ [푘].

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 65

(4) Prove that for 푛 ≥ 3 we have 퐷(푛) = (푛 − 1)(퐷(푛 − 1) + 퐷(푛 − 2)) in two ways: (a) by using Theorem 2.1.2, (b) by a combinatorial argument. (5) Prove that for 푛 ≥ 1 we have 퐷(푛) = 푛퐷(푛 − 1) + (−1)푛. (6) Call two positive integers 푘, 푛 relatively prime if gcd(푘, 푛) = 1 where gcd is the greatest common divisor. The Euler totient function, also called the Euler phi func- tion, is 휙(푛) = #{푘 ∈ [푛] ∣ gcd(푘, 푛) = 1}. Using the PIE, show that

1 휙(푛) = 푛 ∏(1 − ) 푝 푝 where the product is over all primes 푝 dividing 푛. (7) Given another proof of Lemma 2.2.1 when 푆 is finite by using Theorem 1.5.1.

(8) Fix a set 퐴 and subsets 퐴1, ... , 퐴푛 ⊆ 퐴. Define 퐴퐼 for 퐼 ⊆ [푛] by (2.7). Show that

퐴∅ = 퐴. (9) Prove that for the (signed) Stirling numbers of the first kind 1 if 푛 = 0 or 1, ∑ 푠(푛, 푘) = { 0 if 푛 ≥ 2, 푘 using a sign-reversing involution. (10) Fill in the details of the proof of Theorem 2.2.2. (11) Consider permutations 휋 ∈ 푃(푆) and 휎 ∈ 푃(푇) where 푆 ∩푇 = ∅. The set of shuffles of 휋 and 휎 is 휋 ⧢ 휎 = {휏 ∈ 푃(푆 ⊎ 푇) ∣ 휋 and 휎 are subwords of 휏}. For example 31 ⧢ 24 = {3124, 3214, 3241, 2314, 2341, 2431}. We take linear combinations of permutations as if they were vectors. For example 6(3124) − 7(3241) − 9(3124) + (3241) = −3(3124) − 6(3241). And a set of permutations represents the sum of all the elements in the set with coefficient one. So we would also write 31 ⧢ 24 = 3124 + 3214 + 3241 + 2314 + 2341 + 2431 and let the context determine whether 31 ⧢ 24 means the set or the sum. Show

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

66 2. Counting with Signs

that

푘 푛 ∑(−1) ∑ 푤1 ⧢ 푤2 ⧢ ⋯ ⧢ 푤푘 = (−1) (푛 ... 21), 푘≥1 푤1⋅푤2⋅. . .⋅푤푘=12. . .푛

where the sum is over all concatenations 푤1 ⋅ 푤2 ⋅ ... ⋅ 푤푘 = 12 ... 푛. For example, when 푛 = 3, then the concatenations are

123 = 1 ⋅ 2 ⋅ 3 = 1 ⋅ 23 = 12 ⋅ 3 = 123.

Hint: Consider a permutation 푣 contained in a shuffle 푤1 ⧢ 푤2 ⧢ ⋯ ⧢ 푤푘. Find the largest index 푗 ≥ 0, if any, such that (i) |푤1| = |푤2| = ⋯ = |푤푗| = 1 (which implies that 푤푖 = 푖 for 푖 ∈ [푗]) and (ii) 푗 ... 21 is a subword of 푣. Use the relative positions of 푗 and 푗 + 1 in 푣 together with merging and splitting to find a copy of 푣 in another shuffle of opposite sign. (12) Prove Lemma 2.3.1.

(13) Here is a way to obtain a direct bijection 푔∶ 푃푑(푛) → 푃표(푛). Consider 휆 ∈ 푃푑(푛). Each part 푝 of 휆 can be uniquely written as 푝 = 푞2푟 for some odd 푞 and integer 푟 ≥ 0. Replace 푝 by 2푟 copies of 푞 to get a partition 휇 = 푔(휆). For example, if 휆 = (6, 4, 1), then 6 = 3 ⋅ 21 = 3 + 3, 4 = 1 ⋅ 22 = 1 + 1 + 1 + 1, and 1 = 1 ⋅ 20 = 1. So 푔(6, 4, 1) = (3, 3, 1, 1, 1, 1, 1). (a) Prove that 푔 is a bijection. (b) Prove that 푔 is the same as the bijection obtained using the Involution Princi- ple in the proof of Theorem 2.3.3. (14) Call a graph 퐺 rooted if each component has a distinguished vertex called the root of that component. Say that two unlabled, rooted graphs 퐺1 = (푉1, 퐸1) and 퐺2 = (푉2, 퐸2) are equal if there is a bijection 푓 ∶ 푉1 → 푉2 which preserves both the roots (푟 is a root of 퐺1 if and only if 푓(푟) is a root of 퐺2) and edges (푣푤 ∈ 퐸1 if and only if 푓(푣)푓(푤) ∈ 퐸2). Call a rooted tree 푇 even if there is some edge 푟푣, where 푟 is the root, such that removing this edge from 푇 and making 푣 the root of its component results in a graph with two equal components. Call a rooted forest distinct if all of its component trees are not equal. (a) Use the Garsia–Milne Involution principle to find a bijection between the rooted forests on 푛 vertices with no component tree being even and the rooted forests on 푛 vertices which are distinct. (b) Describe a bijection for (a) using the ideas from Exercise 13. (c) Show that the bijections in (a) and (b) are actually the same. (15) One can generalize Theorem 2.3.3 in the following way. Fix a positive integer 푚. Let 푃<푚(푛) be the set of 휆 ⊢ 푛 where each part is repeated fewer than 푚 times. Let 푃≢푚(푛) be the set of 휆 ⊢ 푛 such that none of the parts is divisible by 푚. (a) Show that 푃<2(푛) = 푃푑(푛) and 푃≢2(푛) = 푃표(푛). (b) Prove that #푃<푚(푛) = #푃≢푚(푛) by generalizing the bijection of the previous exercise. (c) Reprove that #푃<푚(푛) = #푃≢푚(푛) using the Involution Principle. (d) Show that the bijections in (b) and (c) are the same.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 67

(16) Let 풮 = (푆; 푆1, ... , 푆푛) where 푆 is a finite set and 푆1, ... , 푆푛 are subsets. Similarly define 풯 = (푇; 푇1, ... , 푇푛). Call 풮 and 풯 sieve equivalent if #푆퐼 = #푇퐼 for all 퐼 ⊆ [푛]. (a) Use the PIE to show that if 풮 and 풯 are sieve equivalent, then

| 푛 | | 푛 | |푆 − 푆 | = |푇 − 푇 | . | ⋃ 푖| | ⋃ 푖| | 푖=1 | | 푖=1 |

(b) Show that if 풮 and 풯 are sieve equivalent, then the Involution Principle can be used to construct a bijection proving (a). (17) (a) Check that the line 퐿 used in the proof of Theorem 2.4.2 has the correct form. Use this equation to verify that (0, 0) and (푘, 푛 − 푘) are on opposite sides of 퐿. (b) Give a second proof of this theorem using the factorial expression for binomial coefficients. (c) Give a third proof of this theorem using induction. (18) Consider lattice paths of length 푛, starting at the origin and ending at (푥, 푦) and using steps 푁, 퐸, 푆, 푊 where 푆 = [0, −1] and 푊 = [−1, 0]. Let 푟 = (푛 − 푥 − 푦)/2 and 푠 = (푛 + 푥 − 푦)/2. (a) Show that the number of such paths is given by

푛 푛 ( )( ). 푟 푠

Hint: Find a bijection with pairs of 퐸푊-lattice paths which are defined in Exercise 33 of Chapter 1. (b) Show that the number of such paths staying weakly above the 푥-axis is

푛 푛 푛 푛 ( )( ) − ( )( ). 푟 푠 푟 − 1 푠 − 1

(c) Show that for integers 푛, 푟 ≥ 0 the sequence

푛 푛 푛 푛 푛 푛 ( )( ), ( )( ),..., ( )( ) 푟 0 푟 − 1 1 0 푟

is unimodal. (19) Let 퐷 be a digraph. (a) Show that any directed walk from 푢 to 푣 with 푢 ≠ 푣 contains a directed path from 푢 to 푣. (b) Show that any directed walk of length at least 2 from 푢 to 푣 with 푢 = 푣 contains a directed cycle. (20) Show that for 푛 ∈ ℕ the sequence

푛 푛 푛 ( ), ( ),..., ( ) 0 1 푛

is log concave by using the formula for a binomial coefficient in terms of factorials.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

68 2. Counting with Signs

(21) Let 푎0, 푎1, ... , 푎푛 be a sequence of positive reals. Show that the sequence is log- concave if and only if for all 0 < 푘 ≤ 푙 < 푛 we have

푎푘푎푙 ≥ 푎푘−1푎푙+1.

Hint: Use the ideas in the proof of Proposition 2.5.1. (22) (a) Let 푡(푛, 푘) be a triangular array of real numbers for 0 ≤ 푘 ≤ 푛. Call the ar- ray log-concave in 푘 if the sequence 푡(푛, 0), ... , 푡(푛, 푛) is log-concave for all 푛. Suppose that the 푡(푛, 푘) satisfy the recursion

푡(푛, 푘) = 푎(푛, 푘)푡(푛 − 1, 푘 − 1) + 푏(푛, 푘)푡(푛 − 1, 푘)

for 푛 ≥ 1 where 푎(푛, 푘), 푏(푛, 푘), 푡(푛, 푘) are nonnegative reals and 푎(푛, 푘) = 푏(푛, 푘) = 푡(푛, 푘) = 0 for 푘 < 0 or 푘 > 푛. Also assume that (i) 푎(푛, 푘) and 푏(푛, 푘) are log-concave in 푘 and (ii) 푎(푛, 푘 − 1)푏(푛, 푘 + 1) + 푎(푛, 푘 + 1)푏(푛, 푘 − 1) ≤ 2푎(푛, 푘)푏(푛, 푘) for 푛 ≥ 1. Prove that 푡(푛, 푘) is log-concave in 푘. 푛 (b) Use part (a) to prove that (푘), 푐(푛, 푘) (unsigned Stirling numbers of the first kind), and 푆(푛, 푘) (Stirling numbers of the second kind) are all log-concave in 푘. (23) Suppose 0 ≤ 푘 < 푛. Prove in two ways that

2 푛 푛 − 1 푛 + 1 ( ) ≥ ( )( ), 푘 푘 푘

by using the expression for binomial coefficients in terms of factorials and by using lattice paths.

(24) Check that Ω as defined for general path families 푃 = (푃1, ... , 푃푛) is a sign-reversing involution. (25) Prove Theorem 2.5.5. (26) Consider the sequence 푐(푛, 0), ... , 푐(푛, 푛) of signless Stirling numbers of the first kind. (a) Use Lemma 2.5.2 to prove that this sequence is log-concave. Hint: Try to con- struct 퐷 with 푉 = ℤ2 such that the number of paths from (0, 0) to (푛, 푘) is 푐(푛, 푘). It will be helpful to use multiple, but distinguishable, arcs. (b) Use Lemma 2.5.4 to show that, in fact, this is a PF sequence. (27) (a) Find a sequence of positive reals which is unimodal but not log-concave. (b) Find a sequence of positive reals which is log-concave but not PF. (28) (a) Show that the (푣, 푤) entry of 퐴(퐺)푛 is the number of walks going from 푣 to 푤 of length 푛. (b) Show that a similar result holds for digraphs. (29) Use the matrix 퐵(퐺) to prove the Handshaking Lemma, Theorem 1.9.3. (30) Prove Proposition 2.6.2(b).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 69

2 4

1 5

3 6

Figure 2.7. The graph 퐺6

(31) Give two proofs of Theorem 2.6.3 as follows. (a) Give one proof using the Lindström–Gessle–Viennot Lemma. (b) Give a second demonstration based on the outline below. (i) Show that if 푚 > 푛, then both sides are zero. (ii) Assume that 푚 ≤ 푛, write out the entries of 푄푅, and expand about the columns of the product using multilinearity to show that

det 푄푅 = ∑ (det 푄•,휋)푟휋1,1푟휋2,2 ... 푟휋푚,푚 휋∈푃(([푛],푚))

where 푄•,휋 is the matrix whose 푗th column is the 휋푗 column of 푄. (iii) Show that in the previous sum, det 푄•,휋 = 0 if 휋 contains a repeated entry. [푛] (iv) Show that if 퐾 ∈ ( 푚 ), then det 푄[푚],퐾 can be factored out of all the terms in the sum where 휋 is a permutation of 퐾 and that what remains sums to det 푅퐾,[푚]. (32) Prove the case of Theorem 2.6.4 where 푖 = 1 and 푗 = 2.

(33) Let 퐺푛 be the graph with vertex set 푉 = [푛] and edge set 퐸 = {12, 13, 14, ... , 1푛, 23}.

Graph 퐺6 is displayed in Figure 2.7. Find the number of spanning trees of 퐺푛 in two ways: by a direct count and by using the Matrix-Tree Theorem.

(34) The complete bipartite graph, 퐾푚,푛, has 푉 = {푣1, ... , 푣푚, 푤1, ... , 푤푛} and edge set consisting of 푣푖푤푗 for all 푖, 푗 (and no other edges). Show that 푛−1 푚−1 #풮푇(퐾푚,푛) = 푚 푛 .

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 3

Counting with Ordinary Generating Functions

This chapter introduces one of the most powerful techniques in the enumerator’s tool- kit: generating functions. Wilf [101] wrote a whole book devoted to their properties. There are several types of generating functions and we will start with the simplest, which are called ordinary generating functions. In Chapters 4, 7, and 8 we will deal with other types. The basic idea in all cases is to take a sequence of numbers in which we are interested and replace it by an algebraic object, namely a polynomial or power series. The advantage of doing this is that one can then bring a host of algebraic tech- niques to bear in order to study the original sequence. This makes it possible to give proofs of results about the sequence which have the following advantages: (1) The proofs can be very short. (2) Many demonstrations can be done by straightforward manipulations which do not require the cleverness of other approaches. (3) Sometimes no other method is known for obtaining a given result.

3.1. Generating polynomials

Let 푥 be a variable. A sequence

(3.1) 푎0, 푎1, 푎2, ... , 푎푛 of complex numbers has ordinary generating polynomial 푛 2 푛 푘 푓(푥) = 푎0 + 푎1푥 + 푎2푥 + ⋯ + 푎푛푥 = ∑ 푎푘푥 . 푘=0 Here, “ordinary” is to distinguish this generating polynomial from other types. Since we will only be dealing with the ordinary case in this chapter, we will usually drop the adjective. Note that 푓(푥) is an element of the algebra ℂ[푥] of polynomials in 푥

71

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

72 3. Counting with Ordinary Generating Functions

with complex coefficients. We will also often call 푓(푥) the generating function for the sequence (3.1) since it is a special case of the generating function for a sequence with a countable, but perhaps not finite, number of terms. This more general setting willbe discussed in Section 3.3. To begin with a simple example, consider the sequence of binomial coefficients found in a row of Pascal’s triangle

푛 푛 푛 푛 ( ), ( ), ( ),..., ( ). 0 1 2 푛

The corresponding generating function is

푛 푛 푓(푥) = ∑ ( )푥푘. 푘 푘=0 In particular, when 푛 = 4 we get

푓(푥) = 1 + 4푥 + 6푥2 + 4푥3 + 푥4 = (1 + 푥)4.

The power of this generating function is that it can be expressed as a product which is just the well-known Binomial Theorem. We will give two proofs of this result, one combinatorial and one using algebraic manipulations.

Theorem 3.1.1 (Binomial Theorem). For 푛 ∈ ℕ we have

푛 푛 ∑ ( )푥푘 = (1 + 푥)푛. 푘 푘=0

Proof (Combinatorial). Consider expanding the product

푛 (1 + 푥)푛 = (1⏞⎴⎴⎴⎴⎴⎴⏞⎴⎴⎴⎴⎴⎴⏞ + 푥)(1 + 푥) ⋯ (1 + 푥)

using the distributive law. One obtains a term 푥푘 in the expansion by picking the 푥 in 푘 of the factors and picking the 1 in the remaining 푛 − 푘. But the number of ways of 푛 푘 choosing 푘 objects from 푛 objects is (푘). So that is the coefficient of 푥 in the product and we are done. □

Proof (Algebraic). We will induct on 푛. The result is clearly true for 푛 = 0 so assume 푛 ≥ 1. Note that, because of our conventions for binomial coefficients, we can write the generating function as

푛 ∞ 푛 푛 ∑ ( )푥푘 = ∑ ( )푥푘. 푘 푘 푘=0 푘=−∞ The advantage of doing this is that we will not have to worry about boundary cases when 푘 = 0 or 푘 = 푛 and so we will suppress the limits. Now using the binomial

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.1. Generating polynomials 73

recursion in Theorem 1.3.3(a), reindexing, and induction 푛 푛 − 1 푛 − 1 ∑ ( )푥푘 = ∑ ( )푥푘 + ∑ ( )푥푘 푘 푘 − 1 푘 푘 푘 푘

푛 − 1 푛 − 1 = 푥 ∑ ( )푥푘−1 + ∑ ( )푥푘 푘 − 1 푘 푘 푘

푛 − 1 푛 − 1 = 푥 ∑ ( )푥푘 + ∑ ( )푥푘 푘 푘 푘 푘 = 푥(1 + 푥)푛−1 + (1 + 푥)푛−1

= (1 + 푥)푛 as desired. □

The first proof illustrates the use of the Product Rule for weight-generating func- tions which will be discussed in Section 3.4. The second proof is an example of the point made in the chapter introduction about how proofs involving generating func- tions can be based on routine manipulations. And the trick of extending the domain of summation is one which we will often use to simplify demonstrations. We now wish to give an illustration of how a generating function, once derived, can be used to give simple proofs of other results. In particular, setting 푥 = 1 in the Binomial Theorem we immediately get 푛 푛 ∑ ( ) = (1 + 1)푛 = 2푛, 푘 푘=0 which is part (c) of Theorem 1.3.3. Similarly, letting 푥 = −1 in Theorem 3.1.1 gives 푛 푛 ∑ (−1)푘( ) = (1 − 1)푛 = 0푛 = 훿 , 푘 0,푛 푘=0 which is Theorem 1.3.3(d). We end this section by stating the generating function for the Stirling numbers of the first kind. This result can be proved similarly to the algebraic proof of theBinomial Theorem so its demonstration will be left as an exercise. Finding a generating func- tion for the Stirling numbers of the second kind will have to wait until after we have discussed formal power series in Section 3.3. Theorem 3.1.2. For 푛 ∈ ℕ we have

푛 ∑ 푐(푛, 푘)푥푘 = 푥(푥 + 1)(푥 + 2) ... (푥 + 푛 − 1). □ 푘=0 Note that by setting 푥 = 1 in the previous displayed equation we obtain the special case #푃([푛]) = ∑ 푐(푛, 푘) = 푛! . 푘

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

74 3. Counting with Ordinary Generating Functions

So this proposition can be considered a generalization of Theorem 1.2.1. Such exten- sions are called 푞-analogues and will be discussed in the next section.

3.2. Statistics and 푞-analogues

One way of constructing generating functions is through the use of statistics and 푞- analogues. Because of connections with the theory of hypergeometic series, the vari- able 푞 is usually used for these generating functions. This is a mnemonic choice since sometimes, as we will see below, 푞 stands for the power of a prime 푝. There is no for- mal definition of a 푞-analogue, so we will start with an example which will illustrate the meta-definition we will eventually give. A statistic on a set 푆 is a function st∶ 푆 → ℕ. Because the range of a statistic is ℕ we can define, for finite 푆, a corresponding generating polynomial 푓(푞) = ∑ 푞st 푠. 푠∈푆 This generating function is sometimes called the distribution of st over 푆 because it can also be written 푘 푓(푞) = ∑ 푎푘푞 푘≥0

where 푎푘 is the number of 푠 ∈ 푆 satisfying st 푠 = 푘 and this parallels the distribution of a random variable in . One of the most famous statistics on permu- tations is the inversion number. A permutation 휋 = 휋1 ... 휋푛 ∈ 푃([푛]) has inversion set

Inv 휋 = {(푖, 푗) ∣ 푖 < 푗 and 휋푖 > 휋푗}. One can think of this as the set of pairs of indices where the corresponding elements of 휋 are out of their natural increasing order. Note that one uses pairs of indices rather than the elements of 휋 because this makes it easier to generalize this concept to words where repetitions are allowed. For example, if 휋 = 휋1휋2휋3휋4휋5 = 41532, then Inv 휋 = {(1, 2), (1, 4), (1, 5), (3, 4), (3, 5), (4, 5)}. The inversion number of 휋 is just inv 휋 = # Inv 휋. We will often use the convention of beginning functions having to do with sets with uppercase letters and their corresponding cardinalities with lowercase. Continuing our example, inv 41532 = 6. Clearly inv∶ 푃([푛]) → ℕ is a statistic and it has a very interesting generating polynomial. Theorem 3.2.1. For 푛 ≥ 0 we have ∑ 푞inv 휋 = (1)(1 + 푞)(1 + 푞 + 푞2) ⋯ (1 + 푞 + 푞2 + ⋯ + 푞푛−1). 휋∈푃([푛])

Proof. We will induct on 푛, omitting the trivial base case. Every 휋 ∈ 푃([푛]) can be obtained uniquely from a 휎 ∈ 푃([푛−1]) by inserting 푛 into one of the 푛 spaces between 푖 the elements of 휎 (including the space before 휎1 and the space after 휎푛−1). Let 휎 be

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.2. Statistics and 푞-analogues 75

the result of placing 푛 in the 푖th space from the right where the space after 휎푛−1 is considered space zero. Then clearly inv 휎푖 = 푖 + inv 휎. Using this equation and induction we see that

푛−1 ∑ 푞inv 휋 = ∑ ∑ 푞inv 휍푖 휋∈푃([푛]) 휍∈푃([푛−1]) 푖=0 푛−1 = ∑ 푞inv 휍 ⋅ ∑ 푞푖 휍∈푃([푛−1]) 푖=0 = (1 + 푞)(1 + 푞 + 푞2) ⋯ (1 + 푞 + 푞2 + ⋯ + 푞푛−1) as we wished to prove. □

Note that by plugging 푞 = 1 into this result one obtains #푃([푛]) = ∑ 1 = 푛! , 휋∈푃([푛]) which is the second statement in Theorem 1.2.1. Now that we have met some 푞-analogues (although they have not been named as such), their meta-definition should make more sense. A 푞-analogue of a combinatorial object 풪 is an object 풪(푞) such that lim 풪(푞) = 풪. 푞→1 Note that 풪 could be many things: a number, a definition, or a theorem. For example, one of the standard 푞-analogues of 푛 ∈ ℕ is the polynomial

2 푛−1 (3.2) [푛]푞 = 1 + 푞 + 푞 + ⋯ + 푞 . 푛 Clearly [푛]1 = 푛. Another possible 푞-analogue of 푛 is the rational function (1 − 푞 )/ (1 − 푞). In this case one cannot just substitute 푞 = 1 but must take a limit. Of course, this quotient and [푛]푞 are equal when 푞 ≠ 1. Another 푞-analogue is the 푞-factorial

[푛]푞! = [1]푞[2]푞 ... [푛]푞. So Theorem 3.2.1 can be restated as

inv 휋 ∑ 푞 = [푛]푞!. 휋∈푃([푛])

Note that we will sometimes write [푛]푞 as just [푛]. This could cause confusion with the use of [푛] as a set, so we will only use this simplification if it is clear which of the two possible meanings is meant. Similarly, we will often drop the 푞 subscript from other 푞-analogues when convenient.

There is another famous statistic which has [푛]푞! as its distribution. The descent set of 휋 ∈ 푃([푛]) is

(3.3) Des 휋 = {푖 ∣ 휋푖 > 휋푖+1}

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

76 3. Counting with Ordinary Generating Functions

with corresponding descent number des 휋 = # Des 휋. Equivalently 푖 ∈ Des 휋 if and only if (푖, 푖 +1) ∈ Inv 휋. We also define the ascent set, Asc 휋, and ascent number, asc 휋, analogously by reversing the inequality in definition (3.3). Using our previous example we have Des 41532 = {1, 3, 4} and des 41532 = 3. The major index of 휋 is maj 휋 = ∑ 푖. 푖∈Des 휋 So maj 41532 = 1 + 3 + 4 = 8. The term “major index” was coined by Dominique Foata [26] in honor of Percy MacMahon who first studied this statistic [61] and was a major in the British army. Theorem 3.2.2. For 푛 ≥ 0 we have maj 휋 ∑ 푞 = [푛]푞!. 휋∈푃([푛])

Proof. We start as in the proof of Theorem 3.2.1 but now number the spaces of 휎 differently. First number the spaces between 휎푖 and 휎푖+1 where 푖 is a descent, as well as the space after 휎푛−1, from right to left starting with zero. Now number the remaining spaces, including the one before 휎1, from left to right with the numbers des 휎 + 1, des 휎 + 2, ... , 푛 − 1. An example follows this proof. Let 휎(푗) denote the result of placing 푛 in space 푗 with this maj labeling. We claim that (3.4) maj 휎(푗) = 푗 + maj 휎. Indeed, if space 푗 is in a descent or at the end of 휎, then inserting 푛 just moves the 푗 descents to the right of and including the given descent one position to the right. By definition of major index, this adds a total of 푗 to maj 휎. If space 푗 is in an ascent or at the beginning of 휎, then inserting 푛 creates a new descent as well as moving descents to the right of the space one position to the right. It is easy to check for these 푗 that if inserting 푛 in space 푗 caused maj 휎 to increase by 푗, then inserting 푛 in place 푗 + 1 increases maj 휎 by 푗 + 1. So, by induction, equation (3.4) continues to hold in this range of 푗. The completion of the proof is now done exactly as in the demonstration of Theorem 3.2.1. □

Continuing on with 휎 = 41532 having maj 휎 = 8, the spaces are labeled using subscripts as follows:

44315523120. Inserting 6 into each space in turn gives 푗 0 1 2 3 4 5 휎(푗) 415326 415362 415632 461532 641532 416532 maj 휎(푗) 8 9 10 11 12 13

It turns out that there are many permutation statistics whose distribution is [푛]푞! and these statistics were dubbed Mahonian by Foata. One can consult the article of Babson and Steingrímsson [3] for a list of Mahonian statistics.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.2. Statistics and 푞-analogues 77

Having found 푞-analogues involving permutations, the reader may suspect that they also exist for combinations. For integers 0 ≤ 푘 ≤ 푛, define the 푞-binomial coeffi- cients or Gaussian polynomials to be

푛 [푛]푞! [ ] = . 푘 [푘] ! [푛 − 푘] ! 푞 푞 푞 As usual, we let this function be zero if 푘 < 0 or 푘 > 푛. For example 4 [4]! [ ] = 2 [2]! [2]! [4][3] = [2][1] (1 + 푞 + 푞2 + 푞3)(1 + 푞 + 푞2) = (1 + 푞) (3.5) = 1 + 푞 + 2푞2 + 푞3 + 푞4. It is not at all clear from the definition just given that this is actually a polynomial in 푞 rather than just a rational function. But this follows easily using induction and our next result. Note that this theorem gives two 푞-analogues for the ordinary binomial re- cursion. This illustrates a general principle that 푞-analogues are not necessarily unique as we have also seen in the inv and maj interpretations of [푛]푞!. Theorem 3.2.3. We have 0 [ ] = 훿 푘 0,푘 푞 and, for 푛 ≥ 1, 푛 푛 − 1 푛 − 1 [ ] = 푞푘 [ ] + [ ] 푘 푘 푘 − 1 푞 푞 푞

푛 − 1 푛 − 1 = [ ] + 푞푛−푘 [ ] . 푘 푘 − 1 푞 푞

Proof. The initial condition is trivial. We will prove the first recursion for the 푞- binomial, leaving the other as an exercise. Using the definition in terms of 푞-factorials and finding a common denominator gives 푛 − 1 푛 − 1 [푛 − 1]! 푞푘 [ ] + [ ] = (푞푘[푛 − 푘] + [푘]) 푘 푘 − 1 [푘]! [푛 − 푘]! [푛 − 1]! = ⋅ [푛] [푘]! [푛 − 푘]! 푛 = [ ] 푘 as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

78 3. Counting with Ordinary Generating Functions

Figure 3.1. The Young diagrams for (5, 5, 2, 1) ⊇ (3, 2, 2)

We will now give a 푞-analogue of the Binomial Theorem (Theorem 3.1.1). Let 푞, 푡 be two variables. Theorem 3.2.4. For 푛 ≥ 0 we have 푛 (푘) 푛 (3.6) (1 + 푡)(1 + 푞푡)(1 + 푞2푡) ⋯ (1 + 푞푛−1푡) = ∑ 푞 2 [ ] 푡푘. 푘 푘=0 푞 Proof. We will induct on 푛 where the case 푛 = 0 is easy to check. For 푛 > 0 we can use the second recursion in the previous result and the induction hypothesis to write

(푘) 푛 (푘) 푛 − 1 (푘)+푛−푘 푛 − 1 ∑ 푞 2 [ ] 푡푘 = ∑ 푞 2 [ ] 푡푘 + ∑ 푞 2 [ ] 푡푘 푘 푘 푘 − 1 푘 푞 푘 푞 푘 푞

(푘−1) 푛 − 1 = (1 + 푡)(1 + 푞푡) ⋯ (1 + 푞푛−2푡) + 푞푛−1푡 ∑ 푞 2 [ ] 푡푘−1 푘 − 1 푘 푞

= (1 + 푡)(1 + 푞푡) ⋯ (1 + 푞푛−2푡) + 푞푛−1푡(1 + 푡)(1 + 푞푡) ⋯ (1 + 푞푛−2푡)

= (1 + 푡)(1 + 푞푡) ⋯ (1 + 푞푛−1푡), which is what we wished to prove. □

There are many combinatorial interpretations of the 푞-binomial coefficients. We will content ourselves with presenting two of them here. If 휆 = (휆1, ... , 휆푘) and 휇 = (휇1, ... , 휇푙) are integer partitions, then we say that 휆 contains 휇, written 휆 ⊇ 휇, if 푘 ≥ 푙 and 휆푖 ≥ 휇푖 for 푖 ≤ 푙. Equivalently, the Young diagram of 휆 contains the Young diagram of 휇 if they are placed so that their northwest corners align. As an example, (5, 5, 2, 1) ⊇ (3, 2, 2) and Figure 3.1 shows the diagram of 휆 with the squares of 휇 shaded inside. The notation 휇 ⊆ 휆 should be self-explanatory. Given 휇 ⊆ 휆, one also has the corresponding skew partition (3.7) 휆/휇 = {(푖, 푗) ∈ 휆 ∣ (푖, 푗) ∉ 휇}. The cells of the skew partition in Figure 3.1 are white.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.2. Statistics and 푞-analogues 79

The 푘 × 푙 rectangle is the integer partition whose multiplicity notation is (푘푙). Con- sider the set of partitions contained in this rectangle

ℛ(푘, 푙) = {휆 ∣ 휆 ⊆ (푘푙)}.

Recalling that |휆| is the sum of the parts of 휆, we consider the generating function |휆| ∑휆∈ℛ(푘,푙) 푞 . For example, if 푘 = 푙 = 2, then we have

휆 ⊆ (22) ∅ (1) (2) (12) (2, 1) (22) , 푞|휆| 1 푞 푞2 푞2 푞3 푞4

which gives ∑ 푞|휆| = 1 + 푞 + 2푞2 + 푞3 + 푞4. 휆∈ℛ(2,2) The reader will have noticed the similarity to (3.5), which is not an accident.

Theorem 3.2.5. For 푘, 푙 ≥ 0 we have 푘 + 푙 ∑ 푞|휆| = [ ] . 푘 휆∈ℛ(푘,푙) 푞

Proof. We induct on 푘 where the case 푘 = 0 is left to the reader. If 푘 > 0 and 휆 ⊆ (푘푙), 푙 then there are two possibilities. Either 휆1 < 푘 in which case 휆 ⊆ ((푘 − 1) ) or 휆1 = 푘 so that 휆 can be written as 휆 = (푘, 휆′) where 휆′ is the partition containing the parts of ′ 푙−1 ′ 휆 other than 휆1. So 휆 ⊆ (푘 ). Notice that in this case |휆| = |휆 | + 푘. We now use induction and Theorem 3.2.3 to obtain

∑ 푞|휆| = ∑ 푞|휆| + ∑ 푞|휆′|+푘 휆∈ℛ(푘,푙) 휆∈ℛ(푘−1,푙) 휆′∈ℛ(푘,푙−1)

푘 + 푙 − 1 푘 + 푙 − 1 = [ ] + 푞푘 [ ] 푘 − 1 푘

푘 + 푙 = [ ] , 푘

which finishes the proof. □

For our second combinatorial interpretation of the Gaussian polynomials we will need some linear algebra. Let 푞 be a prime power and let 픽푞 be the Galois field with 푞 elements. Let 푉 be a vector space of dimension dim 푉 = 푛 over 픽푞. We will use 푊 ≤ 푉 to indicate that 푊 is a subspace of 푉. Let

푉 [ ] = {푊 ≤ 푉 ∣ dim 푊 = 푘}. 푘

The subspaces of dimension 푘 are in bijective correspondence with 푘 × 푛 row-reduced echelon matrices of full rank, that is, with no zero rows. For example, if 푛 = 4 and

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

80 3. Counting with Ordinary Generating Functions

푘 = 2, then the possible matrices are 0 0 1 0 0 1 ∗ 0 0 1 0 ∗ [ ] , [ ] , [ ] , 0 0 0 1 0 0 0 1 0 0 1 ∗

1 ∗ ∗ 0 1 ∗ 0 ∗ 1 0 ∗ ∗ [ ] , [ ] , [ ] , 0 0 0 1 0 0 1 ∗ 0 1 ∗ ∗

where the stars represent arbitrary elements of 픽푞. So the number of subspaces corre- sponding to one of these star diagrams is 푞푠 where 푠 is the number of stars. Thus 픽4 # [ 푞 ] = 1 + 푞 + 2푞2 + 푞3 + 푞4, 2 which should look very familiar at this point! Note however that, in contrast to previ- ous cases, this actually represents an integer rather than a polynomial since 푞 is a prime power. Of course, this example generalizes. Because of this result people sometimes talk half-jokingly about sets being vector spaces over the (nonexistent) Galois field with one element.

Theorem 3.2.6. If 푉 is a vector space over 픽푞 of dimension 푛, then 푉 푛 # [ ] = [ ] . 푘 푘 푞

Proof. Given 푊 ≤ 푉 with dim 푊 = 푘, we first count the number of possible ordered 푛 푛 bases (퐯1, 퐯2, ... , 퐯푘) for 푊. Note that since dim 푉 = 푛 we have #푉 = #픽푞 = 푞 . We 푛 can pick any nonzero vector for 퐯1 so the number of choices is 푞 − 1. For 퐯2 we can 푛 choose any vector in 푉 which is not in the span of 퐯1, which gives 푞 − 푞 possibilities. Continuing in this way, the total count will be (푞푛 − 1)(푞푛 − 푞)(푞푛 − 푞2) ... (푞푛 − 푞푘−1). By a similar argument, the number of different ordered bases which span a given 푊 of dimension 푘 is (푞푘 − 1)(푞푘 − 푞)(푞푘 − 푞2) ... (푞푘 − 푞푘−1). So the number of possible 푊’s is

(푘) (푞푛 − 1)(푞푛 − 푞) ... (푞푛 − 푞푘−1) 푞 2 (푞푛 − 1)(푞푛−1 − 1) ... (푞푛−푘+1 − 1) = (푞푘 − 1)(푞푘 − 푞) ... (푞푘 − 푞푘−1) (푘) 푞 2 (푞푘 − 1)(푞푘−1 − 1) ... (푞 − 1)

(푞 − 1)푘[푛][푛 − 1] ... [푛 − 푘 + 1] = (푞 − 1)푘[푘][푘 − 1] ... [1]

푛 = [ ] , 푘 as advertised. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.3. The algebra of formal power series 81

There is a beautiful proof of this result due to Knuth [50] using row-reduced eche- lon matrices as in the previous example. The reader will be asked to supply the details in the exercises.

3.3. The algebra of formal power series

We now wish to generalize the concept of generating function from finite to count- ably infinite sequences. To do so, we will have to use power series. But we wishto avoid the questions of convergence which come up when using analytic power series. Instead, we will work in the algebra of formal power series. This will mean that we have to be careful since, in an algebra, one is only permitted to apply an operation like addition or multiplication a finite number of times. But there is another concept of convergence which will take care of this issue. We should note that there is a whole branch of combinatorics which uses analytic techniques to extract useful information about a sequence, such as its rate of growth, from the corresponding power series. For information about this approach, see the book of Flajolet and Sedgewick [25]. A formal power series is an expression of the form ∞ 2 푛 푓(푥) = 푎0 + 푎1푥 + 푎2푥 + ⋯ = ∑ 푎푛푥 , 푛=0

where the 푎푛 are complex numbers. We also say that 푓(푥) is the ordinary generating function or ogf for the sequence 푎푛, 푛 ≥ 0. Often we will leave out the adjective “ordi- nary” in this chapter since we will not have met any other type of generating function yet. Note that these series are considered formal in the sense that the powers of 푥 are just place holders and we are not permitted to substitute a value for 푥. Because of this rule, analytic convergence is not an issue and we can happily talk about formal power 푛 series such as ∑푛≥0 푛! 푥 which converge nowhere except at 푥 = 0. We will use the notation

푛 ℂ[[푥]] = {∑ 푎푛푥 ∣ 푎푛 ∈ ℂ for all 푛 ≥ 0} . 푛≥0 The set is an algebra, the algebra of formal power series, under the three operations of addition, scalar multiplication, and multiplication defined by 푛 푛 푛 ∑ 푎푛푥 + ∑ 푏푛푥 = ∑(푎푛 + 푏푛)푥 , 푛≥0 푛≥0 푛≥0

푛 푛 푐 ∑ 푎푛푥 = ∑(푐푎푛)푥 , 푛≥0 푛≥0

푛 푛 푛 ∑ 푎푛푥 ⋅ ∑ 푏푛푥 = ∑ 푐푛푥 , 푛≥0 푛≥0 푛≥0 where 푐 ∈ ℂ and 푛

푐푛 = ∑ 푎푘푏푛−푘. 푘=0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

82 3. Counting with Ordinary Generating Functions

The reader may object that, as mentioned earlier, in an algebra one is only permit- ted a finite number of additions yet the very elements of ℂ[[푥]] seem to involve infin- itely many. But this is an illusion. Remember that 푥 is a formal parameter so that the 푛 expression ∑푛 푎푛푥 is only meant to be a mnemonic device which gives intuition to the definitions of the three algebra operations, especially that of multiplication. Wecould just as easily have defined ℂ[[푥]] to be the set of all complex vectors (푎0, 푎1, 푎2,...) subject to the operation of vector addition

(푎0, 푎1, 푎2, ... ) + (푏0, 푏1, 푏2, ... ) = (푎0 + 푏0, 푎1 + 푏1, 푎2 + 푏2,...)

and similarly for the other two. What is true is that one is only permitted to add or multiply a finite number of elements of ℂ[[푥]]. So one can only perform operations which will alter the coefficient of a given power of 푥 a finite number of times.

Now given a sequence of complex numbers 푎0, 푎1, 푎2,..., we associate with it the ordinary generating function

2 푓(푥) = 푎0 + 푎1푥 + 푎2푥 + ⋯ ∈ ℂ[[푥]].

We will sometimes say that this series counts the objects enumerated by the 푎푛 if appro- priate. As with generating polynomials, the reason for doing so is to exploit properties of ℂ[[푥]] to obtain information about the original sequence. We will often write this 푛 generating function as ∑푛 푎푛푥 , assuming the that range of indices is 푛 ≥ 0. Let us start with a simple example. Consider the sequence 1, 1, 1, ... with generat- 푛 ing function ∑푛 푥 . We would like to simplify this as a geometric series to 1 (3.8) 1 + 푥 + 푥2 + ⋯ = . 1 − 푥 But what does the right-hand side even mean since 1/(1 − 푥) appears to be a rational function and so not an element of ℂ[[푥]]? The way out of this conundrum is to re- member that given an element 푎 in an algebra 퐴, it is possible for 푎 to have an inverse, namely an element 푎−1 such that 푎 ⋅ 푎−1 = 1 where 1 is the identity element of 퐴. So 푛 to prove (3.8) in this setting we must show that ∑푛 푥 and 1 − 푥 are inverses. This is easily done by using the distributive law:

(1 − 푥)(1 + 푥 + 푥2 + ⋯) = (1 + 푥 + 푥2 + ⋯) − 푥(1 + 푥 + 푥2 + ⋯) = (1 + 푥 + 푥2 + ⋯) − (푥 + 푥2 + 푥3 + ⋯) = 1.

This example illustrates a general principle that often well-known results about analytic power series carry over to their formal counterparts, although some work may be required to check that this is true. For the most part, we will assume the truth of a standard formula in this setting without further comment. But it would be wise to also give a couple of examples to show that caution may be needed. One illustration is that the expression 1/푥 has no meaning in ℂ[[푥]] because 푥 does not have an inverse. For suppose we have 푥푓(푥) = 1 for some formal power series 푓(푥). Then on the left-hand side the constant coefficient is 0 while on the right it is 1, a contradiction.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.3. The algebra of formal power series 83

As another example, consider the sequence 1/푛! for 푛 ≥ 0. We would like to write 푥푛 푒푥 = ∑ 푛! 푛≥0 for the corresponding generating function but, again, run into the problem that 푒푥 is not a priori an element of ℂ[[푥]]. The solution this time is to define 푒푥 to be a formal symbol which stands for this power series. Then, of course, to be complete we would need to verify formally that all the usual rules of exponents hold such as 푒2푥 = (푒푥)2. We will not take the time to do this. But we will point out a case where the rules do not hold. In particular, in ℂ[[푥]] one cannot write 푒1+푥 = 푒푒푥. This is because the left-hand side is not well-defined. Indeed, when expanding 푛 ∑푛(1 + 푥) /푛! there are infinitely many additions needed to compute the coefficient of any given power of 푥 which, as we have already noted, is not permitted. Although we will not verify every specific analytic identity needed for formal power series in this text, it would be good to have some general results about which operations are permitted in ℂ[[푥]]. First we deal with the issue of when a formal power series is invertible.

푛 −1 Theorem 3.3.1. If 푓(푥) = ∑푛 푎푛푥 , then 푓(푥) exists in ℂ[[푥]] if and only if 푎0 ≠ 0.

푛 Proof. For the forward direction, suppose 푓(푥)푔(푥) = 1 where 푔(푥) = ∑푛 푏푛푥 . Tak- ing the constant coefficient on both sides gives 푎0푏0 = 1. So 푎0 ≠ 0. 푛 Now assume 푎0 ≠ 0. We will construct an inverse 푔(푥) = ∑푛 푏푛푥 . We want 푓(푥)푔(푥) = 1. Comparing coefficients of 푥푛 on both sides we see that we wish to have 푎0푏0 = 1 and, for 푛 ≥ 1,

푎0푏푛 + 푎1푏푛−1 + ⋯ + 푎푛푏0 = 0.

Since 푎0 ≠ 0 we can take 푏0 = 1/푎0. By the same token, when 푛 ≥ 1 we can solve for 푏푛 in the previous displayed equation giving a recursive formula for its value. Thus we can construct such a 푔(푥) and are done. □

Our example with 푒푥 shows that we also need to be careful about substitution. We 푛 wish to define the substitution of 푔(푥) into 푓(푥) = ∑푛 푎푛푥 to be 푛 푓(푔(푥)) = ∑ 푎푛푔(푥) . 푛≥0 But now the right-hand side is an infinite sum of formal power series, not just formal variables. To be able to talk about such sums, we need to introduce a notion of conver- gence in ℂ[[푥]]. It will be convenient to have the notation that for a formal power series 푓(푥) [푥푛]푓(푥) = the coefficient of 푥푛 in 푓(푥),

which we have usually been calling 푎푛. Suppose that we have a sequence 푓0(푥), 푓1(푥), 푓2(푥), ... of formal power series. We say that this sequence converges to 푓(푥) ∈ ℂ[[푥]]

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

84 3. Counting with Ordinary Generating Functions

and write

lim 푓푘(푥) = 푓(푥), 푘→∞ if, for any 푛, the coefficient of 푥푛 in the sequence is eventually constant and equals the coefficient of 푥푛 in 푓(푥). Formally, given 푛, there exists a corresponding 퐾 such that 푛 푛 [푥 ]푓푘(푥) = [푥 ]푓(푥) for all 푘 ≥ 퐾. Otherwise we say that the sequence diverges or that the limit does not exist. As an illustration, consider the sequence

2 푓0(푥) = 1, 푓1(푥) = 1 + 푥, 푓2(푥) = 1 + 푥 + 푥 ,...

푘 so that 푓푘(푥) = 1 + 푥 + ⋯ + 푥 . Then this sequence has a limit; namely

푛 1 lim 푓푘(푥) = ∑ 푥 = . 푘→∞ 1 − 푥 푛≥0

푛 To prove this, note that given 푛 we can let 퐾 = 푛. So for 푘 ≥ 푛 we have [푥 ]푓푘 = 푛 [푥 ]푓푛 = 1. On the other hand, consider the sequence

푓0(푥) = 1 + 푥, 푓1(푥) = 1/2 + 푥/2, 푓2(푥) = 1/4 + 푥/4, ...

푘 푘 and in general 푓푘(푥) = 1/2 + 푥/2 . This sequence does not converge in ℂ[[푥]] since for any 푛 we have that [푥]푓푘(푥) is always different for different 푘. This is in contrast to the analytic situation where this sequence converges to zero. As in analysis, we now use convergence of sequences to define convergence of series. Given 푓0(푥), 푓1(푥), 푓2(푥), ... , we say that their sum exists and converges to 푓(푥),

written ∑푘≥0 푓푘(푥) = 푓(푥), if

lim 푠푘(푥) = 푓(푥) 푘→∞ where

(3.9) 푠푘(푥) = 푓0(푥) + 푓1(푥) + ⋯ + 푓푘(푥) is the 푘th partial sum. Divergence is defined as expected. Note that this definition is consistent with our notation for formal power series since given a sequence 푎0, 푎1, 푘 푎2,..., we can let 푓푘(푥) = 푎푘푥 and then prove that ∑푘≥0 푓푘(푥) = 푓(푥) where 푓(푥) = 푘 ∑푘≥0 푎푘푥 . To state a criterion for convergence of series, it will be useful to define the minimum 푛 degree of 푓(푥) = ∑푛 푎푛푥 to be

mdeg 푓(푥) = smallest 푛 such that 푎푛 ≠ 0 if 푓(푥) ≠ 0, and let mdeg 푓(푥) = ∞ if 푓(푥) = 0. It turns out that to show a sum of power series converges, it suffices to take a limit of integers.

Theorem 3.3.2. Given 푓0(푥), 푓1(푥), 푓2(푥), ... ∈ ℂ[[푥]], then ∑푘≥0 푓푘(푥) exists if and only if

lim (mdeg 푓푘(푥)) = ∞. 푘→∞

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.3. The algebra of formal power series 85

Proof. We will prove the forward direction, leaving the other implication as an exer- cise. We are given that the sequence 푠푘(푥) as defined by (3.9) converges. So given 푛, there is a 퐾 such that 푛 푛 푛 [푥 ]푠퐾 (푥) = [푥 ]푠퐾+1(푥) = [푥 ]푠퐾+2(푥) = ⋯ . But for 푗 ≥ 0 we have

푠퐾+푗(푥) = 푠퐾 (푥) + 푓퐾+1(푥) + 푓퐾+2(푥) + ⋯ + 푓퐾+푗(푥). 푛 It follows that [푥 ]푓푘(푥) = 0 for 푘 > 퐾. Now given 푛, take 푁 to be the maximum of all the 퐾-values associated to integers less than or equal to 푛. From what we have shown, this forces mdeg 푓푘(푥) > 푛 for 푛 > 푁. But by definition of a limit of real numbers, this □ means lim푘→∞(mdeg 푓푘(푥)) = ∞.

We are now on a firm footing with our definition of substitution as we knowwhat it means for a sum of power series to converge. We can use the previous result to give a simple criterion for convergence when substituting one generating function into an- other. Theorem 3.3.3. Given 푓(푥), 푔(푥) ∈ ℂ[[푥]], then the composition 푓(푔(푥)) exists if and only if (1) 푓(푥) is a polynomial or (2) 푔(푥) has zero constant term.

Proof. If 푓(푥) is a polynomial, then 푓(푔(푥)) is a finite sum and so obviously converges. 푛 So assume 푓(푥) = ∑푛 푎푛푥 is not polynomial. 푛 If 푔(푥) has no constant term, we have mdeg 푎푛푔(푥) ≥ 푛. So the limit in the previ- ous theorem is infinity and 푓(푔(푥)) is well-defined. To finish the proof, consider the remaining case where [푥0]푔(푥) ≠ 0. Since 푓(푥) is not a polynomial, there are an infinite number of 푛 such that 푎푛 ≠ 0. But for these 푛 푛 we have mdeg 푎푛푔(푥) = 0. So the desired limit cannot be infinity and 푓(푔(푥)) does not exist in ℂ[[푥]]. □

We will also find it useful to consider certain infinite products. We approach their convergence just as we did for infinite sums. Given a sequence 푓0(푥), 푓1(푥), 푓2(푥), ... ,

we say that their product exists and converges to 푓(푥), written ∏푘≥0 푓푘(푥) = 푓(푥), if

lim 푝푘(푥) = 푓(푥) 푘→∞ where 푝푘(푥) = 푓0(푥)푓1(푥) ... 푓푘(푥). We have the following result whose proof is similar enough to that of Theorem 3.3.2 that we will leave it to the reader.

Theorem 3.3.4. Let 푓0(푥), 푓1(푥), 푓2(푥), ... be power series with zero constant terms. Then

∏푘≥0(1 + 푓푘(푥)) exists if and only if

□ lim (mdeg 푓푘(푥)) = ∞. 푘→∞

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

86 3. Counting with Ordinary Generating Functions

Let us end this section by showing how the previous result can give simple ver- 푘 ifications that a product does or does not exist. Consider ∏푘≥1(1 + 푥 ). In this case 푘 푘 푓푘(푥) = 푥 and mdeg 푥 = 푘. So the desired limit is infinity and this product exists. As we will see in Section 3.5, it counts integer partitions with distinct parts. By contrast, 푘 푘 the product ∏푘≥0(1 + 푥/2 ) does not converge since mdeg 푥/2 = 1.

3.4. The Sum and Product Rules for ogfs

Just as for sets there is a Sum Rule and a Product Rule for ordinary generating functions. In order to state these results, we need the idea of a weight-generating function. This approach makes it possible to construct generating functions for various sequences in a very combinatorial manner. As a first application, we make a more deep exploration of the Binomial Theorem. Let 푆 be a set. Then a weighting of 푆 is a function wt∶ 푆 → ℂ[[푥]]. Most often if 푠 ∈ 푆, then wt 푠 will just be a monomial reflecting some property of 푠. For example, if st is any statistic on 푆, then we could take wt 푠 = 푥st 푠. For a more concrete illustration which we will continue to use throughout this section, let 푆 = 2[푛] and define for 푇 ∈ 푆 (3.10) wt 푇 = 푥|푇|. Given a weighted set 푆, we can form the corresponding weight-generating function

푓(푥) = 푓푆(푥) = ∑ wt 푠. 푠∈푆 We must be careful that this sum exists in ℂ[[푥]], and if it does, then we say 푆 is a sum- mable set. Of course, when 푆 is finite then it is automatically summable. To illustrate for 푆 = 2[3] we have 푇 ∅ {1} {2} {3} {1, 2} {1, 3} {2, 3} {1, 2, 3} wt 푇 1 푥 푥 푥 푥2 푥2 푥2 푥3 so that 2 3 푓푆(푥) = 1 + 3푥 + 3푥 + 푥 . More generally, for 푆 = 2[푛] we have |푇| 푓푆(푥) = ∑ 푥 푇∈2[푛]

푛 = ∑ ∑ 푥푘 푘=0 [푛] 푇∈( 푘 )

푛 푛 = ∑ ( )푥푘 푘 푘=0 and we have recovered the generating function for a row of Pascal’s triangle. The following theorem will permit us to manipulate weight-generating functions with ease. For the Sum Rule, if 푆, 푇 are disjoint weighted sets, then we weight 푢 ∈ 푆⊎푇

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.4. The Sum and Product Rules for ogfs 87

using 푢’s weight in 푆 or in 푇 depending on whether 푢 ∈ 푆 or 푢 ∈ 푇, respectively. For arbitrary 푆, 푇 we weight 푆 × 푇 by letting wt(푠, 푡) = wt 푠 ⋅ wt 푡. Lemma 3.4.1. Let 푆, 푇 be summable sets. (a) (Sum Rule) The set 푆 ∪ 푇 is summable. If 푆 ∩ 푇 = ∅, then

푓푆⊎푇 (푥) = 푓푆(푥) + 푓푇 (푥). (b) (Product Rule) The set 푆 × 푇 is summable and

푓푆×푇 (푥) = 푓푆(푥) ⋅ 푓푇 (푥).

Proof. (a) Since 푆 is summable, given any 푛 ∈ ℕ, there are only a finite number of 푠 ∈ 푆 such that wt 푠 has a nonzero coefficient of 푥푛. And the same is true of 푇. It follows that only finitely many elements of 푆 ∪ 푇 have such a coefficient, which mean this set is summable. To prove the desired equality, we compute as follows:

푓푆⊎푇 (푥) = ∑ wt 푢 = ∑ wt 푢 + ∑ wt 푢 = 푓푆(푥) + 푓푇 (푥). ᵆ∈푆⊎푇 ᵆ∈푆 ᵆ∈푇 (b) The statement about summability of 푆 × 푇 is safely left as an exercise. Com- puting the weight-generating function gives

푓푆×푇 (푥) = ∑ wt(푠, 푡) = ∑ wt 푠 ⋅ ∑ wt 푡 = 푓푆(푥) ⋅ 푓푇 (푥), (푠,푡)∈푆×푇 푠∈푆 푡∈푇 so we are done. □

We can now use these rules to derive various generating functions in a straight- forward manner. We begin by reproving the Binomial Theorem as stated in Theo- rem 3.1.1. We have already seen that the summation side is the weight-generating function for 푆 = 2[푛]. For the product side, it will be useful to reformulate 푆 in terms of multiplicity notation. Specifically, consider

′ ′ 푚1 푚2 푚푛 푆 = {푇 = (1 , 2 , ... , 푛 ) ∣ 푚푖 = 0 or 1 for all 푖} weighted by ′ ∑ 푚 wt 푇 = 푥 푖 푖 . Clearly we have a bijection 푓 ∶ 푆 → 푆′ given by 푓(푇) = (1푚1 , 2푚2 , ... , 푛푚푛 ) where 0 if 푖 ∉ 푇, 푚 = { 푖 1 if 푖 ∈ 푇. (In fact, this is the map used in the proof of Theorem 1.3.1.) Furthermore, this bijection is weight preserving in that wt 푓(푇) = wt 푇. As a concrete example, if 푛 = 5 and 푇 = {2, 4, 5}, then 푓(푇) = (10, 21, 30, 41, 51) and wt 푓(푇) = 푥3 = wt 푇. The advantage of using 푆′ is that it is clearly a weighted product of the sets {푖0, 푖1} for 푖 ∈ [푛] where wt 푖0 = 1 and wt 푖1 = 푥. So we can write, where for distinct elements 푎 and 푏 we use the shorthand 푎 ⊎ 푏 for {푎} ⊎ {푏}, 푆′ = {10, 11} × {20, 21} × ⋯ × {푛0, 푛1} = (10 ⊎ 11) × (20 ⊎ 21) × ⋯ × (푛0 ⊎ 푛1).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

88 3. Counting with Ordinary Generating Functions

Translating this expression using both parts of Lemma 3.4.1, we obtain 푛 푛 ∑ ( )푥푘 = 푓 (푥) 푘 푆 푘=0

= 푓푆′ (푥) = (wt 10 + wt 11)(wt 20 + wt 21) ⋯ (wt 푛0 + wt 푛1) = (1 + 푥)푛.

Since this was the reader’s first example of the use of weight-generating functions, we were careful to write out all the details. However, in practice one is usually more concise, for example, making no distinction between 푆 and 푆′ with the understanding that they yield the same weight-generating function and so can be considered the same set in this situation. We also usually omit checking summability, assuming that such details could be filled in if necessary. We are now ready for a more substantial example, namely the Binomial Theorem for negative exponents. Theorem 3.4.2. If 푛 ∈ ℕ, then 1 푛 = ∑(( ))푥푘. 푛 푘 (1 − 푥) 푘≥0

Proof. The summation side suggests that we should consider 푆 = {푇 ∣ 푇 is a multiset on [푛]} with weight function given by (3.10). We are rewarded for our choice since 푛 푓 (푥) = ∑ wt 푇 = ∑ ∑ 푥푘 = ∑(( ))푥푘. 푆 푘 푇∈푆 푘≥0 푘≥0 [푛] 푇∈(( 푘 ))

We now write

푚1 푚2 푚푛 푆 = {(1 , 2 , ... , 푛 ) ∣ 푚푖 ≥ 0 for all 푖} = (10 ⊎ 11 ⊎ 12 ⊎ ... ) × (20 ⊎ 21 ⊎ 22 ⊎ ... ) × ⋯ × (푛0 ⊎ 푛1 ⊎ 푛2 ⊎ ... ) with weight function wt 푖푘 = 푥푘. Using Lemma 3.4.1 yields

0 1 2 0 1 2 푓푆(푥) = (wt 1 + wt 1 + wt 1 + ⋯) ⋯ (wt 푛 + wt 푛 + wt 푛 + ⋯)

= (1 + 푥 + 푥2 + ⋯)푛 1 = (1 − 푥)푛 and the theorem is proved. □

There are several remarks which should be made about this result. First of all, con- trast it with our first version of the Binomial Theorem. In Theorem 3.1.1 we are count- ing subsets of [푛] where repeated elements are not allowed and the resulting generating

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.5. Revisiting integer partitions 89

function is (1 + 푥)푛. In Theorem 3.4.2 we are counting multisets on [푛] so that repeti- tions are allowed and these are counted by 1/(1 − 푥)푛. We will see another example of this in the next section. We can also make Theorem 3.4.2 look almost exactly like Theorem 3.1.1. Indeed, if 푛 ≤ 0, then by Theorem 3.4.2 and equation (1.6) (with −푛 substituted for 푛) we have 1 −푛 푛 (1 + 푥)푛 = = ∑(( ))(−푥)푘 = ∑ ( )푥푘. −푛 푘 푘 (1 − (−푥)) 푘≥0 푘≥0 This is exactly like Theorem 3.1.1 except that we have an infinite series whereas for positive 푛 we have a polynomial. Analytically, the Binomial Theorem makes sense for any 푛 ∈ ℂ as long as |푥| < 1 so that the series converges. In ℂ[[푥]] one can make sense of (1 + 푥)푛 for any rational 푛 ∈ ℚ; see Exercise 12 of this chapter, and prove the following. Theorem 3.4.3. For any 푛 ∈ ℚ we have

푛 (1 + 푥)푛 = ∑ ( )푥푘. □ 푘 푘≥0

3.5. Revisiting integer partitions

The theory of integer partitions is one place where ordinary generating functions have played a central role. In this context and others it will be necessary to consider infinite products. But then, as we have seen in the previous section, we must take care that these products converge. There is a corresponding restriction on the sets which we can use to construct weight-generating functions. We begin by discussing this matter. Let 푆 be a weighted set. We say that 푆 is rooted if there is an element 푟 ∈ 푆 called the root satisfying (1) wt 푟 = 1 and (2) if 푠 ∈ 푆 − {푟}, then wt 푠 has zero constant term. For example, the sets (푛0, 푛1, 푛2,...) used in the proof of Theorem 3.4.2 were rooted 0 0 푘 푘 with 푟 = 푛 since wt 푛 = 1 and wt 푛 = 푥 for 푘 ≥ 1. Given a sequence 푆1, 푆2, 푆3 ... of rooted sets with 푆푖 having root 푟푖, their direct sum is defined to be

푆1 ⊕ 푆2 ⊕ 푆3 ⊕ ⋯

= {(푠1, 푠2, 푠3, ... ) ∣ 푠푖 ∈ 푆푖 for all 푖 and 푠푖 ≠ 푟푖 for only finitely many 푖}.

Note that when the number of 푆푖 is finite, then their direct sum is the same as their product. But when their number is infinite the root condition kicks in. Note that, because of this condition, we have a well-defined weighting on ⊕푖≥1푆푖 given by

wt(푠1, 푠2, 푠3, ... ) = ∏ wt 푠푖 푖≥1 since the product has only finitely many factors not equal to 1. In addition, the Product Rule in Lemma 3.4.1 has to be modified appropriately to get convergence. But the proof is similar to the former result and so is left as an exercise.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

90 3. Counting with Ordinary Generating Functions

Theorem 3.5.1. Let 푆1, 푆2, 푆3,... be a sequence of summable, rooted sets such that

lim[mdeg(푓푆 (푥) − 1)] = ∞. 푖→∞ 푖

Then the direct sum 푆1 ⊕ 푆2 ⊕ 푆3 ⊕ ⋯ is summable and

□ 푓푆1⊕푆2⊕푆3⊕⋯(푥) = ∏ 푓푆푖 (푥). 푖≥1

We will now prove a theorem of Euler giving the generating function for 푝(푛), the number of integer partitions of 푛. The reader should contrast the proof with that given for counting multisets in Theorem 3.4.2, which has evident parallels. Theorem 3.5.2. We have 1 ∑ 푝(푛)푥푛 = ∏ . 푖 푛≥0 푖≥1 1 − 푥

Proof. Motivated by the sum side we consider the set 푆 of all integer partitions 휆 of all numbers 푛 ≥ 0 with weight (3.11) wt 휆 = 푥|휆|, recalling that |휆| is the sum of the parts of 휆. It follows that

푛 푛 푓푆(푥) = ∑ wt 휆 = ∑ ∑ 푥 = ∑ 푝(푛)푥 . 휆∈푆 푛≥0 |휆|=푛 푛≥0 We now express 푆 as a direct sum, using multiplicity notation, as

푚1 푚2 푚3 푆 = {(1 , 2 , 3 , ... ) ∣ 푚푖 ≥ 0 for all 푖 and only finitely many 푚푖 ≠ 0} = (10 ⊎ 11 ⊎ 12 ⊎ ⋯) ⊕ (20 ⊎ 21 ⊎ 22 ⊎ ⋯) ⊕ (30 ⊎ 31 ⊎ 32 ⊎ ⋯) ⊕ ⋯ .

Note that since we want the exponent on wt 휆 to be the sum of its parts, and 푖푘 repre- sents a part 푖 repeated 푘 times, we must take wt 푖푘 = 푥푖푘 in contrast to the weight used in the proof of Theorem 3.4.2. Translating into generating functions by using the previous theorem gives

0 1 2 3 푓푆(푥) = ∏(wt 푖 + wt 푖 + wt 푖 + wt 푖 + ⋯) 푖≥1 = ∏(1 + 푥푖 + 푥2푖 + 푥3푖 + ⋯) 푖≥1 1 = ∏ , 푖 푖≥1 1 − 푥 which is the desired result. □

The reader will notice that the previous proof actually shows much more. In par- ticular, the factor 1/(1 − 푥푖) is responsible for keeping track of the parts equal to 푖 in 휆. We can make this precise as follows.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.5. Revisiting integer partitions 91

Proposition 3.5.3. Given 푛 ∈ ℕ and 푃 ⊆ ℙ, let 푝푃(푛) be the number of partitions of 푛 all of whose parts are in 푃. (a) We have 1 ∑ 푝 (푛)푥푛 = ∏ . 푃 푖 푛≥0 푖∈푃 1 − 푥 (b) In particular, for 푘 ∈ ℙ, 1 ∑ 푝 (푛)푥푛 = . [푘] 2 푘 푛≥0 (1 − 푥)(1 − 푥 ) ⋯ (1 − 푥 )

Proof. For (a), one uses the ideas in the proof of Theorem 3.5.2 except that the ele- ments of 푆 only contain components of the form 푟푚푟 for 푟 ∈ 푃. And (b) follows imme- diately from (a). □

Instead of restricting the set of parts of a partition, we can restrict the number of parts. Recall that 푝(푛, 푘) is the number of 휆 ⊢ 푛 with length ℓ(휆) ≤ 푘. Corollary 3.5.4. For 푘 ≥ 0 we have 1 ∑ 푝(푛, 푘)푥푛 = . 2 푘 푛≥0 (1 − 푥)(1 − 푥 ) ⋯ (1 − 푥 )

Proof. From the previous result, it suffices to show that there is a size-preserving bi- jection between the partitions counted by 푝[푘](푛) and those counted by 푝(푛, 푘). The 푡 map 휆 → 휆 is such a map. Indeed, 휆 only uses parts in [푘] if and only if 휆1 ≤ 푘. In terms of Young diagrams, this means that the first row of 휆 has length at most 푘. It follows that the first column of 휆푡 has length at most 푘, which is equivalent to 휆푡 having at most 푘 parts. □

In the previous section we pointed out a relationship between the generating func- tions for sets and for multisets. The same holds for integer partitions. Let 푝푑(푛) be the number of partitions of 푛 into distinct parts as defined in Section 2.3. Theorem 3.5.5. We have 푛 푖 ∑ 푝푑(푛)푥 = ∏(1 + 푥 ). 푛≥0 푖≥1

Proof. Up to now we have been writing out most of the gory details of our proofs by weight-generating function as the reader gets familiar with the method. But by now it should be sufficient just to write out the highlights. We begin by letting 푆 be all partitions of all 푛 ∈ ℕ into distinct parts and using the weighting in (3.11). It is routine to show that 푓푆(푥) results in the sum side of the theorem. To get the product side we write

푚1 푚2 푚3 푆 = {(1 , 2 , 3 , ... ) ∣ 푚푖 = 0 or 1 for all 푖, only finitely many 푚푖 ≠ 0} = (푖0 ⊎ 푖1). ⨁ 푖≥1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

92 3. Counting with Ordinary Generating Functions

The generating function translation is

푖 푓푆(푥) = ∏(1 + 푥 ) 푖≥1 and we are done. □

As mentioned in the introduction to this chapter, one of the reasons for using gen- erating functions is that they can give quick and easy proofs for various results. Here is an example where we reprove Euler’s distinct parts-odd parts result, Theorem 2.3.3, which we restate here for convenience. Let 푝표(푛) be the number of partitions of 푛 into parts all of which are odd. Theorem 3.5.6 (Euler). For all 푛 ≥ 0,

푝표(푛) = 푝푑(푛).

Proof. It suffices to show that these two sequence have the same generating function. Using Theorems 3.5.3(a) and 3.5.5 as well as multiplying by a strange name for one, we get

푛 2 3 ∑ 푝푑(푛)푥 = (1 + 푥)(1 + 푥 )(1 + 푥 ) ⋯ 푛≥0 (1 − 푥)(1 − 푥2)(1 − 푥3) ⋯ = (1 + 푥)(1 + 푥2)(1 + 푥3) ⋯ (1 − 푥)(1 − 푥2)(1 − 푥3) ⋯

(1 − 푥2)(1 − 푥4)(1 − 푥6) ⋯ = (1 − 푥)(1 − 푥2)(1 − 푥3) ⋯

1 = (1 − 푥)(1 − 푥3)(1 − 푥5) ⋯

푛 = ∑ 푝표(푛)푥 , 푛≥0 which completes this short and slick proof. □

3.6. Recurrence relations and generating functions

The reader may have noticed that many of the combinatorial sequences described in Chapter 1 satisfy recurrence relations. If one has a sequence defined by a recursion, then generating functions can often be used to find an explicit expression for the terms of the sequence. It is also possible to glean information from the generating function derived from a recurrence which is hard to extract from the recurrence itself. This section is devoted to exploring these ideas. We start with a simple algorithm using generating functions to solve a recurrence relation. Given a sequence 푎0, 푎1, 푎2,... defined by a recursion and boundary condi- tions, we wish to find a self-contained formula for the 푛th term.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.6. Recurrence relations and generating functions 93

(1) Multiply the recurrence by 푥푛; usually a good choice for 푛 is the largest index of all the terms in the recurrence. Sum over all 푛 ≥ 푑 where 푑 is the smallest index for which the recurrence is valid. (2) Let 푛 푓(푥) = ∑ 푎푛푥 푛≥0 and express the equation in step (1) in terms of 푓(푥) using the boundary con- ditions. (3) Solve for 푓(푥). 푛 (4) Find 푎푛 as the coefficient of 푥 in 푓(푥). We note that partial fraction expansion can be a useful way to accomplish step (4).

For a simple example, suppose that our sequence is defined by 푎0 = 2 and 푎푛 = 2 3 3푎푛−1 for 푛 ≥ 1. Calculating the first few values we get 푎1 = 2⋅3, 푎2 = 2⋅3 , 푎3 = 2⋅3 . 푛 So it is easy to guess and then prove by induction that 푎푛 = 2 ⋅ 3 . We would now like to obtain this result using generating functions. Step (1) is easy as we just write 푛 푛 푎푛푥 = 3푎푛−1푥 and then sum to get 푛 푛 ∑ 푎푛푥 = ∑ 3푎푛−1푥 . 푛≥1 푛≥1 Letting 푓(푥) be as in step (2) we see that

푛 ∑ 푎푛푥 = 푓(푥) − 푎0 = 푓(푥) − 2 푛≥1 and 푛 푛−1 ∑ 3푎푛−1푥 = 3푥 ∑ 푎푛−1푥 = 3푥푓(푥) 푛≥1 푛≥1 where the last equality is obtained by substituting 푛 for 푛 − 1 in the sum. For step (3) we have 2 (3.12) 푓(푥) − 2 = 3푥푓(푥) ⟹ 푓(푥) − 3푥푓(푥) = 2 ⟹ 푓(푥) = . 1 − 3푥 As far as step (4), we can now expand 1/(1 − 3푥) as a geometric series (that is, use (3.8) and substitute 3푥 for 푥) to obtain 푓(푥) = 2 ∑ 3푛푥푛 = ∑ 2 ⋅ 3푛푥푛. 푛≥0 푛≥0 푛 푛 Extracting the coefficient of 푥 we see that 푎푛 = 2 ⋅ 3 as expected.

In the previous example, it was easier to guess the formula for 푎푛 and then prove it by induction rather than use generating functions. However, there are times when it is impossible to guess the solution this way, but generating functions still give a straight- forward method for obtaining the answer. An example of this is given by the Fibonacci sequence. Our result will be slightly nicer if we use the definition of this sequence given by (1.1). Following the algorithm, we write

푛 푛 ∑ 퐹푛푥 = ∑(퐹푛−1 + 퐹푛−2)푥 . 푛≥2 푛≥2

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

94 3. Counting with Ordinary Generating Functions

푛 Writing 푓(푥) = ∑푛≥0 퐹푛푥 we obtain 푛 ∑ 퐹푛푥 = 푓(푥) − 퐹0 − 퐹1푥 = 푓(푥) − 푥 푛≥2 and 푛 2 2 ∑(퐹푛−1 + 퐹푛−2)푥 = 푥(푓(푥) − 퐹0) + 푥 푓(푥) = (푥 + 푥 )푓(푥). 푛≥2 Setting the expressions for the left and right sides equal and solving for 푓(푥) yields 푥 푓(푥) = . 1 − 푥 − 푥2 For the last step we wish to use partial fractions and so must factor 1 − 푥 − 푥2. Using the quadratic formula, we see that the denominator has roots −1 + √5 −1 − √5 푟 = and 푟 = . 1 2 2 2 It follows that 푥 푥 1 − 푥 − 푥2 = (1 − )(1 − ) 푟1 푟2

since both sides vanish at 푥 = 푟1, 푟2 and both sides have constant term 1. So we have the partial fraction decomposition 푥 퐴 퐵 (3.13) 푓(푥) = = + (1 − 푥 )(1 − 푥 ) (1 − 푥 ) (1 − 푥 ) 푟1 푟2 푟1 푟2

for constants 퐴, 퐵. Clearing denominators gives 푥 푥 푥 = 퐴(1 − ) + 퐵(1 − ). 푟2 푟1

Setting 푥 = 푟1 reduces this equation to 푟1 = 퐴(1 − 푟1/푟2) and solving for 퐴 shows that 퐴 = 1/√5. Similarly letting 푥 = 푟2 yields 퐵 = −1/√5. Plugging these values back into (3.13) and expanding the series, 1 푥푛 1 푥푛 푓(푥) = ⋅ ∑ 푛 − ⋅ ∑ 푛 . √5 푛≥0 푟1 √5 푛≥0 푟2

By rationalizing denominators one can check that 1/푟1 = (1 + √5)/2 and 1/푟2 = (1 − √5)/2. So taking the coefficient of 푥푛 on both sides of the previous displayed equation gives 푛 푛 1 1 + √5 1 1 − √5 (3.14) 퐹푛 = ( ) − ( ) . √5 2 √5 2 This example shows the true power of the generating function method. It would be impossible to guess the formula in (3.14) from just computing values of 퐹푛. In fact, it is not even obvious that the right-hand side is an integer! Our algorithm can be used to derive generating functions in the case where one has a triangle of numbers rather than just a sequence. Here we illustrate this using the

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.6. Recurrence relations and generating functions 95

Stirling numbers. Recall that the signless Stirling numbers of the first kind satisfy the recurrence relation and boundary conditions in Theorem 1.5.2. Translating these to the signed version gives 푠(0, 푘) = 훿0,푘 and 푠(푛, 푘) = 푠(푛 − 1, 푘 − 1) − (푛 − 1)푠(푛 − 1, 푘) 푘 for 푛 ≥ 1. We wish to find the generating function 푓푛(푥) = ∑푘 푠(푛, 푘)푥 where we are using the fact that 푠(푛, 푘) = 0 for 푘 < 0 or 푘 > 푛 to sum over all integers 푛. Applying our algorithm we have 푘 푓푛(푥) = ∑ 푠(푛, 푘)푥 푘 = ∑[푠(푛 − 1, 푘 − 1) − (푛 − 1)푠(푛 − 1, 푘)]푥푘 푘

= 푥푓푛−1(푥) − (푛 − 1)푓푛−1(푥)

= (푥 − 푛 + 1)푓푛−1(푥),

giving us a recursion for the sequence of generating functions 푓푛(푥). From the bound- ary condition for 푠(0, 푘) we have 푓0(푥) = 1. It is now easy to guess a formula for 푓푛(푥) by writing out the first few values and proving that pattern holds by induction to obtain the theorem below, which also follows easily from Theorem 3.1.2. Theorem 3.6.1. For 푛 ≥ 0 we have

∑ 푠(푛, 푘)푥푘 = 푥(푥 − 1) ⋯ (푥 − 푛 + 1). □ 푘 In an entirely analogous manner, one can obtain a generating function for the Stirling numbers of the second kind. Because of the similarity, the proof is left to the reader. Theorem 3.6.2. For 푘 ≥ 0 we have

푥푘 ∑ 푆(푛, 푘)푥푛 = . □ 푛 (1 − 푥)(1 − 2푥) ⋯ (1 − 푘푥) Comparing the previous two results, the reader will note a similar relationship as between generating functions for objects without repetitions (sets, distinct partitions) and those where repetitions are allowed (multisets, ordinary partitions). As already mentioned, this will be explained in Section 3.9. So far, all the generating functions we have derived from recurrences have been rational functions. This is because the recursions are linear and we will prove a gen- eral result to this effect in the next section. We will end this section by illustrating that more complicated generating functions, for example algebraic ones, do arise in prac- tice. Let us consider the Catalan numbers 퐶(푛) and the generating function 푐(푥) = 푛 ∑푛≥0 퐶(푛)푥 . Using the recursion and boundary condition in Theorem 1.11.2 and computing in the way we have become accustomed to, we obtain

푐(푥) = 1 + ∑ 퐶(푛)푥푛 = 1 + ∑( ∑ 퐶(푖)퐶(푗))푥푛 = 1 + 푥푐(푥)2. 푛≥1 푛≥1 푖+푗=푛−1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

96 3. Counting with Ordinary Generating Functions

Writing 푥푐(푥)2 − 푐(푥) + 1 = 0 and solving for 푐(푥) using the quadratic formula yields

1 ± √1 − 4푥 푐(푥) = . 2푥

Two things seem to be wrong with this formula for 푐(푥). First of all, we don’t know whether the plus or minus solution is the correct one. And second, we seem to have left the ring of formal power series because we are dividing by 푥 which has no inverse. Both of these can be solved simultaneously by choosing the sign so that the numerator has no constant term. Then one can divide by 푥 simply by reducing the power of each term in the top by one. By Theorem 3.4.3 we see that the generating √ 1/2 1/2 function for 1 − 4푥 = (1 − 4푥) has constant term ( 0 ) = 1. So the correct sign is negative and we have proved the following.

Theorem 3.6.3. We have

1 − √1 − 4푥 ∑ 퐶(푛)푥푛 = . □ 2푥 푛≥0

One can use this generating function to rederive the explicit expression for 퐶(푛) in Theorem 1.11.3, and the reader will be asked to carry out the details in the exercises.

3.7. Rational generating functions and linear recursions

The reader may have noticed in the previous section that, both in the initial example and for the Fibonacci sequence, the solution of the recursion for 푎푛 was a linear combi- nation of functions of the form 푟푛 where 푟 varied over the reciprocals of the roots of the denominator of the corresponding generating function. This happens for a wide vari- ety of recursions which we will study in this section. Before giving a theorem which characterizes this situation, we will study one more example to illustrate what can hap- pen.

Consider the sequence defined by 푎0 = 1, 푎1 = −4, and

(3.15) 푎푛 = 4푎푛−1 − 4푎푛−2 for 푛 ≥ 2.

푛 Following the usual four-step program, we have, for 푓(푥) = ∑푛≥0 푎푛푥 ,

푛 푓(푥) − 1 + 4푥 = ∑ 푎푛푥 푛≥2 푛−1 2 푛−2 = 4푥 ∑ 푎푛−1푥 − 4푥 ∑ 푎푛−2푥 푛≥2 푛≥2 = 4푥(푓(푥) − 1) − 4푥2푓(푥).

Solving for 푓(푥) and evaluating the constants in the partial fraction expansion yields 1 − 8푥 1 − 8푥 4 3 푓(푥) = = = − . 1 − 4푥 + 4푥2 (1 − 2푥)2 1 − 2푥 (1 − 2푥)2

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.7. Rational generating functions and linear recursions 97

Taking the coefficient of 푥푛 on both sides using the Theorem 3.4.2 (interchanging the roles of 푛 and 푘) together with the fact that

푘 푛 + 푘 − 1 푛 + 푘 − 1 (3.16) (( )) = ( ) = ( ) 푛 푛 푘 − 1

gives a final answer of

푛 + 1 (3.17) 푎 = 4 ⋅ 2푛 − 3( )2푛 = (1 − 3푛)2푛. 푛 1 So now, instead of a constant times 푟푛 we have a polynomial in 푛 as the coefficient. And that polynomial has degree less than the multiplicity of 1/푟 as a root of the denominator. These observations generalize.

Consider a sequence of complex numbers 푎푛 for 푛 ≥ 0. We say that the sequence satisfies a (homogeneous) linear recursion of degree 푑 with constant coefficients if there is a 푑 ∈ ℙ and constants 푐1, ... , 푐푑 ∈ ℂ with 푐푑 ≠ 0 such that

(3.18) 푎푛+푑 + 푐1푎푛+푑−1 + 푐2푎푛+푑−2 + ⋯ + 푐푑푎푛 = 0. To simplify things later, we have put all the terms of the recursion on the left-hand side of the equation and made 푎푛+푑 the term of highest index rather than 푎푛. One can also consider the nonhomogeneous case where one has a summand 푐푑+1 which does not multiply any term of the sequence, but we will have no cause to do so here. It turns out that the sequences satisfying a recursion (3.18) are exactly the ones having rational generating functions.

Theorem 3.7.1. Given a sequence 푎푛 for 푛 ≥ 0 and 푑 ∈ ℙ, the following are equivalent. (a) The sequence satisfies (3.18). 푛 (b) The generating function 푓(푥) = ∑푛≥0 푎푛푥 has the form 푝(푥) (3.19) 푓(푥) = 푞(푥) where

2 푑 (3.20) 푞(푥) = 1 + 푐1푥 + 푐2푥 + ⋯ + 푐푑푥 and deg 푝(푥) < 푑. (c) We can write 푘 푛 푎푛 = ∑ 푝푖(푛)푟푖 푖=1

where the 푟푖 are distinct, nonzero complex numbers satisfying

푘 2 푑 푑푖 (3.21) 1 + 푐1푥 + 푐2푥 + ⋯ + 푐푑푥 = ∏(1 − 푟푖푥) 푖=1

and 푝푖(푛) is a polynomial with deg 푝푖(푛) < 푑푖 for all 푖.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

98 3. Counting with Ordinary Generating Functions

Proof. We first prove the equivalence of (a) and (b). Showing that (a) implies (b)is essentially an application of our algorithm. Multiplying (3.18) by 푥푛+푑 and summing over 푛 ≥ 0 gives 푛+푑 푛+푑−1 푑 푛 0 = ∑ 푎푛+푑푥 + 푐1푥 ∑ 푎푛+푑−1푥 + ⋯ + 푐푑푥 ∑ 푎푛푥 푛≥0 푛≥0 푛≥0

푑−1 푑−2 푛 푛 푑 = [푓(푥) − ∑ 푎푛푥 ] + 푐1푥 [푓(푥) − ∑ 푎푛푥 ] + ⋯ + 푐푑푥 푓(푥) 푛=0 푛=0 = 푞(푥)푓(푥) − 푝(푥) where 푞(푥) is given by (3.20) and 푝(푥) is the sum of the remaining terms, which implies deg 푝(푥) < 푑. Solving for 푓(푥) completes this direction. To prove (b) implies (a), cross multiply (3.19) and use (3.20) to write 2 푑 푝(푥) = 푞(푥)푓(푥) = (1 + 푐1푥 + 푐2푥 + ⋯ + 푐푑푥 )푓(푥). Since deg 푝(푥) < 푑 we have that [푥푛+푑]푝(푥) = 0 for all 푛 ≥ 0. So taking the coefficient of 푥푛+푑 on both sides of the previous displayed equation gives the recursion (3.18). We now show that (b) and (c) are equivalent. The fact that (b) implies (c) again follows from the algorithm. Specifically, using equations (3.19), (3.20), and (3.21), as well as partial fraction expansion, we have

푘 푑푖 푝(푥) 퐴푖,푗 (3.22) 푓(푥) = = ∑ ∑ 푘 푗 푑푖 푖=1 푗=1 (1 − 푟푖푥) ∏푖=1(1 − 푟푖푥)

for certain constants 퐴푖,푗. But by Theorem 3.4.2 and equation (3.16) we have that 1 푗 푛 + 푗 − 1 [푥푛] = 푟푛 = 푟푛 푗 (( )) 푖 ( ) 푖 (1 − 푟푖푥) 푛 푗 − 1 where 푛 + 푗 − 1 (푛 + 푗 − 1)(푛 + 푗 − 2) ⋯ (푛 + 1) ( ) = 푗 − 1 (푗 − 1)! is a polynomial in 푛 of degree 푗 − 1 for any given 푗. Now taking the coefficient of 푥푛 on both sides of (3.22) gives

푘 푑 푖 푛 + 푗 − 1 푎 = ∑ [∑ 퐴 ( )] 푟푛. 푛 푖,푗 푗 − 1 푖 푖=1 푗=1

Calling the polynomial inside the brackets 푝푖(푛), we have derived the desired expan- sion. The proof that (c) implies (b) essentially reverses the steps of the forward direction. So it is left as an exercise. □

We note that the preceding theorem is not just of theoretical significance but is also very useful computationally. In particular, because of the equivalence of (a) and (c), one can solve a linear, constant coefficient recursion in a more direct manner without having to deal with generating functions. To illustrate, suppose we have a sequence

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.8. Chromatic polynomials 99

satisfying (3.18). Then we know that the solution in (c) is in terms of the 푟푖 which are the reciprocals of the roots of 푞(푥) as given by (3.20). To simplify things, we consider the polynomial

푑 푑 푑−1 푑−2 푟(푥) = 푥 푞(1/푥) = 푥 + 푐1푥 + 푐2푥 + ⋯ + 푐푑.

Comparison with (3.21) shows that the 푟푖 are the roots of 푟(푥). We now find the 푝푖(푛) by solving for the coefficients of these polynomials using the initial conditions. To bequite concrete, consider again the example (3.15) with which we began this section. Since 2 2 푛 푎푛 −4푎푛−1 +4푎푛−2 = 0 for 푛 ≥ 2 we factor 푟(푥) = 푥 −4푥+4 = (푥−2) . So 푎푛 = 푝(푛)2 where deg 푝(푛) < 2. It follows that 푝(푛) = 퐴+퐵푛 for constants 퐴, 퐵. Plugging in 푛 = 0 0 1 we get 1 = 푎0 = 퐴2 = 퐴. Now letting 푛 = 1 gives −4 = 푎1 = (1 + 퐵)2 or 퐵 = −3. Thus 푎푛 is again given as in (3.17). But this solution is clearly simpler than the first one given. This is called the method of undetermined coefficients. Of course, the advantage of using generating functions is that they can be used to solve recursions even when they are not linear and constant coefficient. There is a striking resemblance between the theory we have developed in this sec- tion and the method of undetermined coefficients for solving linear differential equa- tions with constant coefficients. This is not an accident and the material in thissection may be considered as part of the theory of finite differences which is a discrete analogue of the theory of differential equations. We will have more to say about finite differences when we study Möbius inversion in Section 5.5.

3.8. Chromatic polynomials

Sometimes generating functions or polynomials appear in unexpected ways. We now illustrate this phenomenon using the chromatic polynomial of a graph. Let 퐺 = (푉, 퐸) be a graph. A (vertex) coloring of 퐺 from a set 푆 is a function 푐∶ 푉 → 푆. We refer to 푆 as the color set. Figure 3.2 contains a graph which we will be using as our running example together with two colorings using the set 푆 = {white, gray, black}. We say that 푐 is proper if, for all edges 푢푣 ∈ 퐸 we have 푐(푢) ≠ 푐(푣). The first coloring in Figure 3.2 is proper while the second is not since the edge 푣푥 has both endpoints colored gray. The chromatic number of 퐺, denoted 휒(퐺), is the minimum cardinality of a set 푆 such that there is a proper coloring 푐∶ 푉 → 푆. In our example, 휒(퐺) = 3 because we have displayed a proper coloring with three colors in Figure 3.2 (black, white, and gray), and one cannot use fewer colors because of the triangle 푢푣푥.

푢 푣

퐺 =

푥 푤

Figure 3.2. A graph and two colorings

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

100 3. Counting with Ordinary Generating Functions

The chromatic number is an important invariant in graph theory. But by its defini- tion, it belongs more to extremal combinatorics (which studies structures which mini- mize or maximize a constraint) than the enumerative side of the subject. Although we will not have much more to say about 휒(퐺) here, we would be remiss if we did not state one of the most famous mathematical theorems in which it plays a part. Call a graph planar if it can be drawn in the plane without any pair of edges crossing. Theorem 3.8.1 (The Four Color Theorem). If 퐺 is planar, then

휒(퐺) ≤ 4. □

Note that this result is in stark contrast to ordinary graphs which can have arbitrar- ily large chromatic number. Complete graphs, for example, have 휒(퐾푛) = 푛. The Four Color Theorem caused quite a stir when it was proved in 1977 by Appel and Haken (with the help of Koch) [1,2]. For one thing, it had been the Four Color Conjecture for over 100 years. Also their proof was the first to make heavy use of computers to dothe calculations for all the various cases and the demonstration could not be completely checked by a human. We now turn to the enumerating graph colorings. Let 푡 ∈ ℕ. The chromatic poly- nomial of 퐺 is defined to be 푃(퐺; 푡) = the number of proper colorings 푐∶ 푉 → [푡]. This concept was introduced by George Birkhoff [13]. It is not clear at this point why 푃(퐺; 푡) should be called a polynomial, but let us compute it for the graph in Figure 3.2. Consider coloring the vertices of 퐺 in the order 푢, 푣, 푤, 푥. There are 푡 choices for the color of 푢. This leaves 푡 − 1 possibilities for 푣 since it cannot be the same color as 푢. By the same token, the number of choices for 푤 is 푡 − 1. Finally, 푥 can be colored in 푡 − 2 ways since it cannot have the colors of 푢 or 푣 and these are different. So the final count is (3.23) 푃(퐺) = 푃(퐺; 푡) = 푡(푡 − 1)(푡 − 1)(푡 − 2) = 푡4 − 4푡3 + 5푡2 − 2푡. This is a polynomial in 푡, the number of colors! Before proving that this is always the case, we have a couple of remarks. First of all, there is a close relationship between 푃(퐺; 푡) and 휒(퐺); namely 푃(퐺; 푡) = 0 if 0 ≤ 푡 < 휒(퐺) but 푃(퐺; 휒(퐺)) > 0. This follows from the definitions of 푃 and 휒 since the latter is the smallest nonnegative integer for which proper colorings of 퐺 exist and the former counts such colorings. Secondly, it is not always possible to compute 푃(퐺; 푡) in the manner above and express it as a product of factors 푡 − 푘 for integers 푘. For example, consider the cycle 퐶4 with vertices labeled clockwise as 푢, 푣, 푤, 푥. If we try to use this method to compute 푃(퐶4; 푡), then everything is fine until we get to coloring 푥, for 푥 is adjacent to both 푢 and 푤. But we cannot be sure whether 푢 and 푤 have the same color or not since they themselves are not adjacent. It turns out that the same ideas can be used both for proving that 푃(퐺; 푡) is always a polynomial in 푡 and to rectify the difficulty in computing 푃(퐶4; 푡). Consider a graph 퐺 = (푉, 퐸) and an edge 푒 ∈ 퐸. The graph obtained by deleting 푒 from 퐺 is denoted 퐺 ⧵푒 and has vertices 푉 and edges 퐸 − {푒}. The middle graph in Figure 3.3 is obtained from our running example by deleting 푒 = 푣푥. The graph obtained by contracting 푒 in 퐺 is

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.8. Chromatic polynomials 101

푢 푣 푢 푣

퐺 = 푒 퐺 ⧵ 푒 = 푥 푤 푥 푤 푢

퐺/푒 = 푣푒 푤

Figure 3.3. Deletion and contraction

denoted 퐺/푒 and is obtained by shrinking 푒 to a new vertex 푣푒, making 푣푒 adjacent to the vertices which were adjacent to either endpoint of 푒 and leaving all other vertices and edges of 퐺 the same. Contracting 푣푥 in our example graph results in the graph on the right in Figure 3.3. The next lemma is crucial in the study of 푃(퐺; 푡). It is ideally set up for induction on #퐸 since both 퐺 ⧵ 푒 and 퐺/푒 have fewer edges than 퐺. Lemma 3.8.2 (Deletion-Contraction Lemma). If 퐺 is a graph, then for any 푒 ∈ 퐸 we have 푃(퐺; 푡) = 푃(퐺 ⧵ 푒; 푡) − 푃(퐺/푒; 푡).

Proof. We will prove this in the form 푃(퐺 ⧵ 푒) = 푃(퐺) + 푃(퐺/푒). Suppose 푒 = 푢푣. Since 푒 is no longer present in 퐺 ⧵ 푒, its proper colorings are of two types: those where 푐(푢) ≠ 푐(푣) and those where 푐(푢) = 푐(푣). If 푐(푢) ≠ 푐(푣), then properly coloring 퐺 ⧵ 푒 is the same as properly coloring 퐺. So there are 푃(퐺) colorings of the first type. There is also a bijection between the proper colorings of 퐺 ⧵ 푒 where 푐(푢) = 푐(푣) and those of 퐺/푒; namely color 푣푒 with the common color of 푢 and 푣 and leave all the other colors the same. It follows that there are 푃(퐺/푒) colorings of the second type and the lemma is proved. □

We can now easily show that 푃(퐺; 푡) lives up to its name. Theorem 3.8.3. We have that 푃(퐺; 푡) is a polynomial in 푡 for any graph 퐺.

Proof. We induct on #퐸. If 퐺 has no edges, then clearly 푃(퐺; 푡) = 푡#푉 which is a polynomial in 푡. If #퐸 ≥ 1, then pick 푒 ∈ 퐸. By deletion-contraction 푃(퐺; 푡) = 푃(퐺 ⧵ 푒; 푡) − 푃(퐺/푒; 푡). And by induction we have that both 푃(퐺 ⧵ 푒; 푡) and 푃(퐺/푒; 푡) are polynomials in 푡. Thus the same is true of their difference. □

We can also use Lemma 3.8.2 to compute the chromatic polynomial of 퐶4. Recall our notation of 푃푛 and 퐾푛 for paths and complete graphs on 푛 vertices, respectively. Now picking any edge 푒 ∈ 퐸(퐶4) we can use deletion-contraction and then determine the polynomials of the resulting graphs via coloring vertex by vertex to obtain

3 2 푃(퐶4) = 푃(푃4) − 푃(퐾3) = 푡(푡 − 1) − 푡(푡 − 1)(푡 − 2) = 푡(푡 − 1)(푡 − 3푡 + 3).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

102 3. Counting with Ordinary Generating Functions

Note that the quadratic factor has complex roots, thus substantiating our claim that 푃(퐺) does not always have roots which are integers. One can use induction and Lemma 3.8.2 to prove a host of results about 푃(퐺; 푡). Since these demonstrations are all similar, we will leave them to the reader. We will use a nonstandard way of writing down the coefficients of this polynomial, which will turn out to be convenient later. Theorem 3.8.4. Let 퐺 = (푉, 퐸) and write 푛 푛−1 푛−2 푛 푃(퐺; 푡) = 푎0푡 − 푎1푡 + 푎2푡 − ⋯ + (−1) 푎푛. (a) 푛 = #푉. (b) mdeg 푃(퐺; 푡) = the number of components of 퐺.

(c) 푎푖 ≥ 0 for all 푖 and 푎푖 > 0 for 0 ≤ 푖 ≤ 푛 − mdeg 푃(퐺; 푡). □ (d) 푎0 = 1 and 푎1 = #퐸.

Now that we know 푃(퐺; 푡) is a polynomial we can ask if there is any combinatorial interpretation for its coefficients, the reverse of our approach up to now, whichhas been to start with a sequence and then find its generating function. Put a on the edge set 퐸, writing 푒 < 푓 if 푒 is less than 푓 in this order and similarly for other notation. If 퐶 is a cycle in 퐺, then the corresponding broken circuit 퐵 is the set of edges obtained from 퐸(퐶) by removing the smallest edge in the total order. Returning to the graph in Figure 3.2, let 푏 = 푢푣, 푐 = 푢푥, 푑 = 푣푤, 푒 = 푣푥 and impose the order 푏 < 푐 < 푑 < 푒. The only cycle has edges 푏, 푐, 푒 and the corresponding broken circuit is 퐵 = {푐, 푒} which are the edges of a path. Say that a set of edges 퐴 ⊆ 퐸 contains no broken circuit or is an NBC set if 퐴 6⊇ 퐵 for any broken circuit 퐵. Let

NBC푘 = NBC푘(퐺) = {퐴 ⊆ 퐸 ∣ #퐴 = 푘 and 퐴 is an NBC set}

and nbc푘 = nbc푘(퐺) = # NBC푘(퐺). In our example graph

푘 NBC푘(퐺) nbc푘(퐺) 0 { ∅ } 1 1 { {푏}, {푐}, {푑}, {푒} } 4 2 { {푏, 푐}, {푏, 푑}, {푏, 푒}, {푐, 푑}, {푑, 푒} } 5 3 { {푏, 푐, 푑}, {푏, 푑, 푒} } 2 4 ∅ 0 Comparison of the last column of this table with the coefficients of 푃(퐺; 푡) presages our next result, which is due to Whitney [99]. It is surprising that the conclusion does not depend on the total order given to the edges. The proof we give is based on the demonstration of Blass and Sagan [16]. Theorem 3.8.5. If #푉 = 푛, then, given any ordering of 퐸, 푛 푘 푛−푘 푃(퐺; 푡) = ∑ (−1) nbc푘(퐺) 푡 . 푘=0

Proof. Identify each 퐴 ∈ NBC푘(퐺) with the associated spanning subgraph. Then 퐴 is acyclic since any cycle contains a broken circuit. It follows from Theorem 1.10.2 that

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.8. Chromatic polynomials 103

푢 푣

퐺 =

푥 푤

Figure 3.4. Two orientations of a graph

푛−푘 퐴 is a forest with 푛 − 푘 component trees. Hence nbc푘(퐺) 푡 is the number of pairs (퐴, 푐) where 퐴 ∈ NBC푘(퐺) and 푐∶ 푉 → [푡] is a coloring constant on each component of 퐴. We call such a coloring 퐴-improper. Make the set of such pairs into a signed set by letting sgn(퐴, 푐) = (−1)#퐴. So the theorem will be proved if we exhibit a sign- reversion involution 휄 on these pairs whose fixed points have positive sign and arein bijection with the proper coloring of 퐺. Define the fixed points of 휄 to be the (퐴, 푐) such that 퐴 = ∅ and 푐 is proper. These pairs clearly have the desired characteristics. For any other pair, 푐 is not a proper color- ing so there must be an edge 푒 = 푢푣 with 푐(푢) = 푐(푣). Let 푒 be the smallest such edge in the total order. Now define 휄(퐴, 푐) = (퐴Δ{푒}, 푐) ≔ (퐴′, 푐). It is clear that 휄 reverses signs. And it is an involution because 푐 does not change, and so the smallest monochromatic edge is the same in a pair and its image. We just need to check that 휄 is well-defined. If 퐴′ = 퐴 − {푒}, then obviously 퐴 is still an NBC set and 푐 is 퐴′-improper. If 퐴′ = 퐴 ∪ {푒}, then, since 푒 joined two vertices of the same color, 푐 is still 퐴′-improper. But assume, towards a contradiction, that 퐴′ is no longer NBC. Then 퐴′ ⊇ 퐵 where 퐵 is a broken circuit, and 푒 ∈ 퐵 since 퐴 is NBC. Since 푐 is 퐴′-improper, all edges in 퐵 have vertices colored 푐(푢). But 푒 is the smallest edge having that property, and so the smaller edge removed from a cycle to get 퐵 cannot exist. Thus 퐴′ is NBC, 휄 is well-defined, and we are done with the proof. □

One of the amazing things about the chromatic polynomial is that it often appears in places where a priori it has no business being because no graph coloring is involved. We now give two illustrations of this. Recall from Section 2.6 that an orientation 푂 of a graph 퐺 is a digraph with the same vertex set obtained by replacing each edge 푢푣 of 퐺 by one of the possible arcs ⃗푢푣 or ⃗푣푢. See Figure 3.4 for two orientations of our usual graph. Call 푂 acyclic if it does not contain any directed cycles and let 풜(퐺) and 푎(퐺) denote the set and number of acyclic orientations of 퐺, respectively. The first of the orientations just given is acyclic while the second is not. The total number of orientations of the cycle 푢, 푣, 푥, 푢 is 23 and the number of those which produce a cycle is 2 (clockwise and counterclockwise). Since neither orientation of 푣푤 can produce a cycle, we see that 푎(퐺) = 2(23 − 2) = 12. We now do something very strange and plug 푡 = −1 into the chromatic polynomial (3.23) and get 푃(퐺; −1) = (−1)(−2)(−2)(−3) = 12. Although it is not at all clear what it means to color a graph with −1 colors, we have just seen an example of the following celebrated theorem of Stanley [85]. Theorem 3.8.6. For any graph 퐺 with #푉 = 푛 we have 푃(퐺; −1) = (−1)푛 푎(퐺).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

104 3. Counting with Ordinary Generating Functions

Proof. We induct on #퐸 where the base case is an easy check. It suffices to show that (−1)푛푃(퐺; −1) and 푎(퐺) satisfy the same recursion. Using the Deletion-Contraction Lemma for the former, we see that we need to show 푎(퐺) = 푎(퐺 ⧵ 푒) + 푎(퐺/푒) for a fixed 푒 = 푢푣 ∈ 퐸. Consider the map 휙∶ 퐴(퐺) → 퐴(퐺 ⧵ 푒) which sends 푂 ↦ 푂′ where 푂′ is obtained from 푂 by removing the arc corresponding to 푒. Clearly 푂′ is still acyclic so the function is well-defined. We claim 휙 is onto. Suppose to the contrary that there is some 푂′ ∈ 퐴(퐺 ⧵ 푒) such that adding back ⃗푢푣 creates a directed cycle 퐶, and similarly with ⃗푣푢 creating a cycle 퐶′. Then (퐶 − ⃗푢푣) ∪ (퐶′ − ⃗푣푢) is a closed, directed walk which must contain a directed cycle by Exercise 14(b) of Chapter 1. But this third directed cycle is contained in 푂′, which is the desired contradiction. If 푂′ ∈ 퐴(퐺 ⧵ 푒), then by definition of the map #휙−1(푂′) ≤ 2. And from the previous paragraph #휙−1(푂′) ≥ 1. So 푎(퐺) = 푥 + 2푦 where 푥 = #{푂′ ∣ 휙−1(푂′) = 1} and 푦 = #{푂′ ∣ 휙−1(푂′) = 2}. Since 푎(퐺 ⧵푒) = 푥 +푦 it suffices to show that 푎(퐺/푒) = 푦. We will do this by constructing a bijection 휓∶ {푂′ ∈ 퐴(퐺 ⧵ 푒) ∣ 휙−1(푂′) = 2} → 퐴(퐺/푒).

Let 푌 be the domain of 휓. If there are a pair of edges 푤푢, 푤푣 ∈ 퐸(퐺), then any 푂′ ∈ 푌 contains either both ⃗푤푢 and ⃗푤푣 or both ⃗푢푤 and ⃗푣푤. This is because in all other cases adding back one of the orientations of 푒 would create an orientation of 퐺 with a cycle, contradicting the fact that 휙−1(푂′) = 2. So define 푂″ = 휓(푂′) to be the ′ orientation of 퐺/푒 which agrees with 푂 on all arcs not containing the new vertex 푣푒, and on any edge of the form 푤푣푒 uses the same orientation as either 푤푢 or 푤푣. (As just shown, these two orientations are either both towards or both away from 푒). Proving that 휓 is a well-defined bijection is left as an exercise. □

We should mention that Stanley actually proved a stronger result giving a combi- natorial interpretation to 푃(퐺; −푡) for all negative integers −푡. See Exercise 28(b) for details. So, as we saw with the binomial coefficients in (1.6), we have another instance of combinatorial reciprocity. We will study this phenomenon more generally in the next section. For our second example of the protean nature of the chromatic polynomial, we will need to assume that our graphs 퐺 have vertex set [푛] so that one has a total order on the vertices. Let 퐹 be a spanning forest of 퐺 and root each component tree 푇 of 퐹 at its smallest vertex 푟. Say that 퐹 is increasing if the integers on any path starting at 푟 form an increasing sequence for all roots 푟. In Figure 3.5 the reader will find the usual

1 2 1 2 1 2

퐺 = 퐹1 = 퐹2 = 4 3 4 3 4 3

Figure 3.5. A graph and two spanning forests

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.8. Chromatic polynomials 105

graph, now labeled with [4], and two spanning forests. We have that 퐹1 is increasing as any singleton node is increasing, and in the nontrivial tree the only maximal path from the root is 1, 2, 4, which is an increasing sequence. On the other hand 퐹2 is not increasing because of the path 1, 4, 2. For a graph 퐺 = (푉, 퐸) we define

ISF푚(퐺) = {퐹 ∣ 퐹 is an increasing spanning forest of 퐺 with 푚 edges}

and isf푚(퐺) = # ISF푚(퐺). If #푉 = 푛, then consider the corresponding generating polynomial 푛 푚 푛−푚 isf(퐺) = isf(퐺; 푡) = ∑ (−1) isf푚(퐺)푡 . 푚=0 Let us compute this for our example graph. Any tree with zero or one edge is increasing so that isf0(퐺) = 1 and isf1(퐺) = #퐸 = 4. Any of the pairs of edges of 퐺 form an 4 increasing forest except for the pair giving 퐹2 in Figure 3.5. So isf2(퐺) = (2) − 1 = 5. Similarly one checks that isf3(퐺) = 2. And isf4(퐺) = 0 since 퐺 itself is not a forest. So isf(퐺; 푡) = 푡4 − 4푡3 + 5푡2 − 2푡 = 푡(푡 − 1)2(푡 − 2) = 푃(퐺; 푡). We cannot always have isf(퐺) = 푃(퐺) because the former depends on the label- ing of the vertices (even though our notation conceals that fact) while the latter does not. So we will put aside deciding when they are equal for now and concentrate on the factorization over ℤ which we have just seen and which, as we will see, is not a coincidence. In fact, the roots will be the cardinalities of the edge sets defined by

(3.24) 퐸푘 = {푖푘 ∈ 퐸 ∣ 푖 < 푘}

for 1 ≤ 푘 ≤ 푛. In our example 퐸1 = ∅ (since there is no vertex smaller than 1), 퐸2 = {12}, 퐸3 = {23}, and 퐸4 = {14, 24}. Lemma 3.8.7. If 퐺 has 푉 = [푛], then a spanning subgraph 퐹 is an increasing forest if and only if it is obtained by picking at most one edge from each 퐸푘 for 푘 ∈ [푛].

Proof. For the forward direction assume, towards a contradiction, that 퐹 contains both 푖푘 and 푗푘 with 푖, 푗 < 푘. So if 푟 is the root of the tree containing 푖, 푗, 푘, then, by the increasing condition, 푖 must be the vertex preceding 푘 on the unique 푟–푘 path. But the same must be true of 푗, which is a contradiction. For the reverse implication, we must first verify that 퐹 is acyclic. But if 퐹 contains a cycle 퐶, then let 푘 be its maximum vertex. It follows that there are 푖푘, 푗푘 ∈ 퐸(퐶) and, by the maximum requirement, 푖, 푗 < 푘. This contradicts the assumption in this direction. Similarly one can show that if 퐹 is not increasing, then one can produce two □ edges from the same 퐸푘 and so we are done.

It is now a simple matter to prove the following result of Hallam and Sagan [41]. The proof given here was obtained by Hallam, Martin, and Sagan [40]. Theorem 3.8.8. If 퐺 has 푉 = [푛], then 푛

isf(퐺; 푡) = ∏(푡 − |퐸푘|). 푘=1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

106 3. Counting with Ordinary Generating Functions

Proof. The coefficient of 푡푛−푚 in the product is, up to sign, the sum of all terms of the

form |퐸푖1 ||퐸푖2 | ⋯ |퐸푖푚 | where the 푖푗 are distinct indices. But this product is the number

of ways to pick one edge out of each of the sets 퐸푖1 , 퐸푖2 , ... , 퐸푖푚 . So, by the previous lemma, the sum is the number of increasing forests with 푚 edges, finishing the proof. □

Returning to the question of when the chromatic and increasing spanning forest polynomials are equal, we need the following definition. Graph 퐺 has a perfect elim- ination order (peo) if there is a total ordering of 푉 as 푣1, 푣2, ... , 푣푛 such that, for all 푘, the set of vertices coming before 푣푘 in this order and adjacent to 푣푘 form the vertices of a clique (complete subgraph of 퐺). This definition may seem strange at first glance, but it has been useful in various graph-theoretic contexts. Returning to our running example graph we see that the order 1, 2, 3, 4 is a peo since 1 is adjacent to no earlier vertex, 2 and 3 are both adjacent to a single previous vertex which is a 퐾1, and 4 is ad- jacent to 1 and 2 which form an edge also known as a 퐾2. We can now prove another result from [40].

Lemma 3.8.9. Let 퐺 have 푉 = [푛]. Write the edges of 퐺 as 푖푗 with 푖 < 푗 and order them lexicographically. For all 푚 ≥ 0 we have

ISF푚(퐺) ⊆ NBC푚(퐺). Furthermore, we have equality for all 푚 if and only if the natural order on [푛] is a peo.

Proof. To prove the inclusion we suppose that 퐹 is an increasing spanning forest which contains a broken circuit 퐵 and derive a contradiction. By the lexicographic ordering of the edges, 퐵 must be a path of the form 푣1, 푣2, ... , 푣푙 where 푣1 = min{푣1, ... , 푣푙} and 푣2 > 푣푙. So there must be a smallest index 푖 ≥ 2 such that 푣푖 > 푣푖+1. It follows that 푣푖−1, 푣푖+1 < 푣푖 so that the two corresponding edges of 퐵 contradict Lemma 3.8.7. For the forward direction of the second statement we must show that if 푖, 푗 < 푘 and 푖푘, 푗푘 ∈ 퐸(퐺), then 푖푗 ∈ 퐸(퐺). By Lemma 3.8.7 again, {푖푘, 푗푘} is not the edge set of an increasing spanning forest. So, by the assumed equality, this set must contain a broken circuit. Since there are only two edges, this set must actually be a broken circuit, and 푖푗 ∈ 퐸(퐺) must be the edge used to complete the cycle. The reverse implication is left as an exercise. □

From this result we immediately conclude the following.

Theorem 3.8.10. Let 퐺 have 푉 = [푛]. Then isf(퐺; 푡) = 푃(퐺; 푡) if and only if the natural order on [푛] is a peo. □

3.9. Combinatorial reciprocity

When plugging a negative parameter into a counting function results in a sign times another enumerative function, then this is called combinatorial reciprocity. This con- cept was introduced and studied by Stanley [86]. We have already seen two examples

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

3.9. Combinatorial reciprocity 107

of this in equation (1.6) and Theorem 3.8.6 (and, more generally, Exercise 28 of this chapter). Here we will make a connection with recurrences and rational generating functions. See the text of Beck and Sanyal [5] for a whole book devoted to this subject.

Before stating a general theorem, we return to the example with which we began Section 3.6. This was the sequence defined by 푎0 = 2 and 푎푛 = 3푎푛−1 for 푛 ≥ 1. One can extend the domain of this recursion to all integral 푛, in which case one gets 2 2 = 푎0 = 3푎−1 so that 푎−1 = 2/3. Then 2/3 = 푎−1 = 3푎−2 yielding 푎−2 = 2/3 , 푛 and so forth. An easy induction shows that for 푛 ≤ 0 we have 푎푛 = 2 ⋅ 3 just as for 푛 ≥ 0. We can also compute the generating function for the negatively indexed part of the sequence, where it is convenient to start with 푎−1, which is the geometric series 2푥푛 2푥/3 2푥 ∑ 푎 푥푛 = ∑ = = . −푛 3푛 1 − 푥/3 3 − 푥 푛≥1 푛≥1

푛 Comparing this to 푓(푥) = ∑푛≥0 푎푛푥 as found in (3.12) we see that −2 2푥 −푓(1/푥) = = = ∑ 푎 푥푛. 1 − 3/푥 3 − 푥 −푛 푛≥1

The reader should have some qualms about writing 푓(1/푥) since we have gone to great pains to point out that 푥 has no inverse in ℂ[[푥]]. Indeed, if we use the definition 푛 푛 that 푓(푥) = ∑푛≥0 푎푛푥 , then 푓(1/푥) = ∑푛≥0 푎푛/푥 , which is not a formal power series! But if 푓(푥) can be expressed as a rational function 푓(푥) = 푝(푥)/푞(푥) where deg 푝(푥) ≤ deg 푞(푥) ≔ 푑, then we can make sense of this substitution as follows. Since 푞(푥) has degree 푑 we have that 푥푑푞(1/푥) is also a polynomial and is invertible since its constant coefficient is nonzero (Theorem 3.3.1). Furthermore, 푥푑푝(1/푥) is also a polynomial since 푑 ≥ deg 푝(푥). So we can define 푥푑푝(1/푥) (3.25) 푓(1/푥) = 푥푑푞(1/푥) and stay inside the formal power series ring. With this convention, the following result makes sense.

Theorem 3.9.1. Suppose that 푎푛 is a sequence defined for all 푛 ∈ ℤ and satisfying the linear recurrence relation with constant coefficients (3.18) for all such 푛. Letting 푓(푥) = 푛 ∑푛≥0 푎푛푥 , we have 푛 ∑ 푎−푛푥 = −푓(1/푥). 푛≥1

Proof. By (3.25), to prove the theorem it suffices to show that

푑 푛 푑 푥 푞(1/푥) ∑ 푎−푛푥 = −푥 푝(1/푥). 푛≥1 Note that by (3.20) we have

푑 푑 푑−1 푑−2 푥 푞(1/푥) = 푥 + 푐1푥 + 푐2푥 + ⋯ + 푐푑.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

108 3. Counting with Ordinary Generating Functions

So if 푚 ≥ 1, then, using (3.18) and the fact that 푥푑푝(1/푥) has degree at most 푑,

푚+푑 푑 푛 [푥 ]푥 푞(1/푥) ∑ 푎−푛푥 = 푎−푚 + 푐1푎−푚−1 + 푐2푎−푚−2 + ⋯ + 푐푑푎−푚−푑 푛≥1 = 0

= [푥푚+푑](−푥푑푝(1/푥)). Similarly we can prove that this equality of coefficients continues to hold for −푑 ≤ 푚 ≤ 0. This completes the proof. □

To illustrate this theorem, we consider the negative binomial expansion. So if one fixes 푛 ≥ 1, then using Theorem 3.4.2 and equation (3.16)

1 푛 푛 + 푘 − 1 푓(푥) ≔ = ∑(( ))푥푘 = ∑ ( )푥푘. 푛 푘 푛 − 1 (1 − 푥) 푘≥0 푘≥0

푛+푘−1 Note that since 푛 is fixed we are thinking of ( 푛−1 ) as a function of 푘. Substituting −푘 for 푘 in the binomial coefficient, we wish to consider the corresponding generating function 푛 − 푘 − 1 푔(푥) = ∑ ( )푥푘. 푛 − 1 푘≥1

푛−푘−1 푛 We note that ( 푛−1 ) = 0 for 1 ≤ 푘 < 푛 since then 0≤푛−푘−1<푛−1. So 푥 can be factored out from 푔(푥) and, using (1.6) and the above expression for the negative binomial expansion, 푛 − 푘 − 1 푔(푥) = 푥푛 ∑ ( )푥푘−푛 푛 − 1 푘≥푛

−푗 − 1 = 푥푛 ∑ ( )푥푗 푛 − 1 푗≥0

푗 + 1 = (−1)푛−1푥푛 ∑(( ))푥푗 푛 − 1 푗≥0

푛 + 푗 − 1 = (−1)푛−1푥푛 ∑ ( )푥푗 푛 − 1 푗≥0

(−1)푛−1푥푛 = . (1 − 푥)푛 On the other hand, we could apply Theorem 3.9.1 and write −1 −푥푛 (−1)푛−1푥푛 푔(푥) = = = (1 − 1/푥)푛 (푥 − 1)푛 (1 − 푥)푛 giving the same result but with less computation.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 109

Exercises

(1) Prove that for 푛 ∈ ℕ 푛 푛 ∑ 2푘( ) = 3푛 푘 푘=0 in two ways. (a) Use the Binomial Theorem. (b) Use a combinatorial argument. (c) Generalize this identity by replacing 2푘 by 푐푘 for any 푐 ∈ ℕ, giving both a proof using the Binomial Theorem and one which is combinatorial. (2) For 푚, 푛, 푘 ∈ ℕ show that 푚 + 푛 푚 푛 ( ) = ∑ ( )( ) 푘 푙 푘 − 푙 푙≥0 in three ways: by induction, using the Binomial Theorem, and using a combinato- rial argument.

(3) Let 푥1, ... , 푥푚 be variables. Prove the multinomial coefficient identity

푛 푛1 푛푚 푛 ∑ ( )푥1 ⋯ 푥푚 = (푥1 + ⋯ + 푥푚) , 푛1, ... , 푛푚 푛1+⋯+푛푚=푛 in three ways: (a) by inducting on 푛, (b) by using the Binomial Theorem and inducting on 푚, (c) by a combinatorial argument. (4) (a) Prove Theorem 3.1.2. (b) Use this generating function to rederive Corollary 1.5.3.

(5) (a) Recall that an inversion of 휋 ∈ 푃([푛]) is a pair (푖, 푗) with 푖 < 푗 and 휋푖 > 휋푗 and we call 휋푖 the maximum of the inversion. The inversion table of 휋 is 퐼(휋) = (푎1, 푎2, ... , 푎푛) where 푎푘 is the number of inversions with maximum 푘. Show that 0 ≤ 푎푘 < 푘 for all 푘 and that 푛

inv 휋 = ∑ 푎푘. 푘=1 (b) Let

ℐ푛 = {(푎1, 푎2, ... , 푎푛) ∣ 0 ≤ 푎푘 < 푘 for all 푘}.

Show that the map 휋 ↦ 퐼(휋) is a bijection 푃([푛]) → ℐ푛. (c) Use part (b) and weight-generating functions to rederive Theorem 3.2.1.

(6) Let st∶ 픖푛 → ℕ be a statistic on permutations. Recall that for a permutation 휋 we let Av푛(휋) denote the set of permutations in 픖푛 avoiding 휋. Say that permutations st 휋, 휎 are st-Wilf equivalent, written 휋 ≡ 휎, if for all 푛 ≥ 0 we have equality of the

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

110 3. Counting with Ordinary Generating Functions

generating functions ∑ 푞st 휏 = ∑ 푞st 휏.

휏∈Av푛(휋) 휏∈Av푛(휍) (a) Show that if 휋 and 휎 are st-Wilf equivalent, then they are Wilf equivalent. (b) Show that inv 132 ≡ 213 and inv 231 ≡ 312 and that there are no other inv-Wilf equivalences between two permutations in 픖3. (c) Show that maj 132 ≡ 231 and maj 213 ≡ 312 and that there are no other maj-Wilf equivalences between two permutations in 픖3. (7) (a) Prove that 푛 푛 [ ] = [ ] 푘 푛 − 푘 in three ways: using the 푞-factorial definition, using the interpretation in terms of integer partitions, and using the interpretation in terms of subspaces. (b) Prove the second recursion in Theorem 3.2.3 in two ways: by mimicking the proof of the first recursion and by using the first recursion in with part (a). (c) If 푆 ⊆ [푛], then let Σ푆 be the sum of the elements of 푆. Give two proofs of the [푛] 푛 following 푞-analogue of the fact that #( 푘 ) = (푘):

(푘+1) 푛 ∑ 푞Σ푆 = 푞 2 [ ] , 푘 [푛] 푞 푆∈( 푘 ) one proof by induction and the other using Theorem 3.2.5. (d) Give two other proofs of Theorem 3.2.6: one by inducting on 푛 and one by using Theorem 3.2.5. (8) (a) Reprove Theorem 3.2.4 in two way: using integer partitions and using sub- spaces. (b) The Negative 푞-Binomial Theorem states that 1 푛 + 푘 − 1 = ∑ [ ]푡푘. 2 푛−1 푘 (1 − 푡)(1 − 푞푡)(1 − 푞 푡) ... (1 − 푞 푡) 푘≥0 Give three proofs of this result: inductive, using integer partitions, and using subspaces.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 111

(9) (a) Given 푛1+푛2+⋯+푛푚 = 푛, define the corresponding 푞-multinomial coefficient to be 푛 [푛]푞! [ ] = 푛 , 푛 , ... , 푛 [푛 ] ! [푛 ] ! ... [푛 ] ! 1 2 푚 푞 1 푞 2 푞 푚 푞

if all 푛푖 ≥ 0 or zero otherwise. Prove that 푛 [ ] 푛 , 푛 , ... , 푛 1 2 푚 푞 푚 푛 − 1 = ∑ 푞푛1+푛2+⋯+푛푖−1 [ ] . 푛 , ... , 푛 , 푛 − 1, 푛 , ... , 푛 푖=1 1 푖−1 푖 푖+1 푚 푞 (b) Define inversions, descents, and the major index for permutations (linear or- derings) of the multiset 푀 = {{1푛1 , 2푛2 , ... , 푚푛푚 }} exactly the way as was done for permutations without repetition. Let 푃(푀) be the set of permutations of 푀. Prove that 푛 ∑ 푞inv 휋 = ∑ 푞maj 휋 = [ ] . 푛 , 푛 , ... , 푛 휋∈푃(푀) 휋∈푃(푀) 1 2 푚 푞

(c) Let 푉 be a vector space over 픽푞 of dimension 푛. Assume 푆 = {푠1 < ⋯ < 푠푚} ⊆ {0, 1, ... , 푛}. Then a flag of type 푆 is a chain of subspaces

퐹 ∶ 푊1 < 푊2 < ⋯ < 푊푚 ≤ 푉

such that dim 푊 푖 = 푠푖 for all 푖. The reason for this terminology is that when 푛 = 2 and 푆 = {0, 1, 2}, then 퐹 consists of a point (the origin) contained in a line contained in a plane which could be viewed as a drawing of a physical flag with the point being the hole in the ground, the line being theflagpole, and the plane being the cloth flag itself. Give two proofs that 푛 #{퐹 of type 푆} = [ ] , 푠 , 푠 − 푠 , 푠 − 푠 , ... , 푠 − 푠 , 푛 − 푠 1 2 1 3 2 푚 푚−1 푚 푞 one by mimicking the proof of Theorem 3.2.6 and one by induction on 푚. (10) (a) Prove that in ℂ[[푥]] we have 푒푘푥 = (푒푥)푘 for any 푘 ∈ ℕ. (b) Define formal power series for the trigonometric functions using their usual Taylor expansions. Prove that in ℂ[[푥]] we have sin2 푥 + cos2 푥 = 1 and sin 2푥 = 2 sin 푥 cos 푥. (c) If one is given a sequence 푎0, 푎1, 푎2,... and defines 푘 푓푘(푥) = 푎푘푥 , 푘 then show that ∑푘≥0 푓푘(푥) = 푓(푥) where 푓(푥) = ∑푘≥0 푎푘푥 . (d) Prove the backwards direction of Theorem 3.3.2. (e) Use Theorems 3.3.1 and 3.3.3 to reprove that 1/푥 and 푒1+푥 are not well-defined in ℂ[[푥]]. (f) Prove Theorem 3.3.4 (11) Prove that if 푆, 푇 are summable sets, then so is 푆 × 푇.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

112 3. Counting with Ordinary Generating Functions

(12) Say that 푓(푥) ∈ ℂ[[푥]] has a square root if there is 푔(푥) ∈ ℂ[[푥]] such that 푓(푥) = 푔(푥)2. (a) Prove that 푓(푥) has a square root if and only if mdeg 푓(푥) is even. (b) Show that as formal power series 1/2 (1 + 푥)1/2 = ∑ ( )푥푘. 푘 푘≥0 (c) Show that as formal power series 푥푘 (푒푥)1/2 = ∑ . 푘 푘≥0 2 푘! (d) Generalize the previous parts of this exercise to 푚th roots for 푚 ∈ ℙ. (13) Give a second proof of Theorem 3.4.2 by using induction. (14) Prove Theorem 3.4.3. (15) Prove Theorem 3.5.1. (16) (a) Finish the proof of Theorem 3.5.3(a). (b) Give a second proof of Theorem 3.5.3(b) by using Theorem 3.2.5. (c) Show that the generating function for the number of partitions of 푛 with largest part 푘 equals the generating function for the number of partitions of 푛 with exactly 푘 parts and that both are equal to the product 푥푘 . (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푘) (d) The Durfee square of a Young diagram 휆 is the largest square partition (푑푑) such that (푑푑) ⊆ 휆. Use this concept to prove that 푥푑2 ∑ 푝(푛)푥푛 = ∑ . 2 2 2 푑 2 푛≥0 푑≥0 (1 − 푥) (1 − 푥 ) ⋯ (1 − 푥 )

(17) Let 푎푛 be the number of integer partitions of 푛 such that any part 푖 is repeated fewer than 푖 times and let 푏푛 be the number of integer partitions of 푛 such that no part is a square. Use generating functions to show that 푎푛 = 푏푛 for all 푛. (18) Given 푚 ≥ 2, use generating functions to show that the number of partitions of 푛 where each part is repeated fewer than 푚 times equals the number of partitions of 푛 into parts not divisible by 푚. Note that bijective proofs of this result were given in Exercise 15 of Chapter 2.

(19) (a) Show that 퐹푛 is the closest integer to 푛 1 1 + √5 ( ) √5 2 for 푛 ≥ 1. (b) Prove that

2 퐹푛−1퐹푛+1 + 1 if 푛 is odd, 퐹푛 = { 퐹푛−1퐹푛+1 − 1 if 푛 is even in two ways: using equation (3.14) and by a combinatorial argument.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 113

(20) (a) Use the algorithm in Section 3.6 to rederive Theorem 3.1.1. (b) Complete the proof of Theorem 3.6.1. (c) Give a second proof of Theorem 3.6.1 using Theorem 3.1.2 (d) Prove Theorem 3.6.2. (e) Let 푠 be the infinite matrix with rows and columns indexed by ℕ and with 푠(푛, 푘) being the entry in row 푛 and column 푘. Similarly define 푆 with entries 푆(푛, 푘). Show that 푆푠 = 푠푆 = 퐼 where 퐼 is the ℕ × ℕ identity matrix. Hint: Use Theorems 3.6.1 and 3.6.2. (21) Given 푛 ∈ ℕ and 푘 ∈ ℤ, define the corresponding 푞-Stirling number of the second kind by 푆[0, 푘] = 훿0,푘 and, for 푛 ≥ 1,

푆[푛, 푘] = 푆[푛 − 1, 푘 − 1] + [푘]푞푆[푛 − 1, 푘]. (a) Show that 푥푘 ∑ 푆[푛, 푘]푥푛 = . 푛≥0 (1 − [1]푞푥)(1 − [2]푞푥) ⋯ (1 − [푘]푞푥)

(b) All set partitions 휌 = 퐵1/퐵2/ ... /퐵푘 in the rest of this problem will be written in standard form, which means that

1 = min 퐵1 < min 퐵2 < ⋯ < min 퐵푘.

An inversion of 휌 is a pair (푏, 퐵푗) where 푏 ∈ 퐵푖 for some 푖 < 푗 and 푏 > min 퐵푗. We let inv 휌 be the number of inversions of 휌. For example, 휌 = 퐵1/퐵2/퐵3 = 136/25/4 has inversions (3, 퐵2), (6, 퐵2), (6, 퐵3), and (5, 퐵3) so that inv 휌 = 4. Show that 푆[푛, 푘] = ∑ 푞inv 휌. 휌∈푆([푛],푘)

(c) A descent of a set partition 휌 is a pair (푏, 퐵푖+1) where 푏 ∈ 퐵푖 and 푏 > min 퐵푗. We let des 휌 be the number of descents of 휌. In the previous example, 휌 has descents (3, 퐵2), (6, 퐵2), and (5, 퐵3) so that des 휌 = 3. The descent multiset of 휌 is denoted Des 휌 and is the multiset

푑1 푑2 푑푘 {{1 , 2 , ... , 푘 ∣ for all 푖, 푑푖 = number of descents (푏, 퐵푖+1)}}. The major index of 휌 is

maj 휌 = ∑ 푖 = 푑1 + 2푑2 + ⋯ + 푘푑푘. 푖∈Des 휌 In our running example Des 휌 = {{12, 21}} so that maj 휌 = 1 + 1 + 2 = 4. Show that 푆[푛, 푘] = ∑ 푞maj 휌. 휌∈푆([푛],푘) (22) Given 푛 ∈ ℕ and 푘 ∈ ℤ, define the corresponding signless 푞-Stirling number of the first kind by 푐[0, 푘] = 훿0,푘 and, for 푛 ≥ 1,

푐[푛, 푘] = 푐[푛 − 1, 푘 − 1] + [푛 − 1]푞푐[푛 − 1, 푘] (a) Show that 푘 ∑ 푐[푛, 푘]푥 = 푥(푥 + [1]푞)(푥 + [2]푞) ⋯ (푥 + [푛 − 1]푞). 푘≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

114 3. Counting with Ordinary Generating Functions

(b) The standard form of 휋 ∈ 픖푛 is 휋 = 휅1휅2 ⋯ 휅푘 where the 휅푖 are the cycles of 휋,

min 휅1 < min 휅2 < ⋯ < min 휅푘,

and each 휅푖 is written beginning with min 휅푖. Define the cycle major index of 휋 ′ ′ to be maj푐 휋 = maj 휋 where 휋 is the permutation in one-line form obtained by removing the cycle parentheses in the standard form of 휋. If, for example, ′ 휋 = (1, 7, 2)(3, 6, 8)(4, 5), then 휋 = 17236845 so that maj푐 휋 = 2 + 6 = 8. Show that maj 휋 푐[푛, 푘] = ∑ 푞 푐 . 휋∈푐([푛],푘) (23) Reprove the formula 1 2푛 퐶(푛) = ( ) 푛 + 1 푛 by using Theorem 3.6.3. 푛+푙 (24) (a) Show that if 푘, 푙 ∈ ℕ are constants, then ( 푘 ) is a polynomial in 푛 of degree 푘. (b) Show that the polynomials 푛 푛 + 1 푛 + 2 ( ), ( ), ( ),... 0 1 2 form a basis for the algebra of polynomials in 푛. (c) Use part (a) to complete the proof of Theorem 3.7.1.

(25) Redo the solution for the first recursion in Section 3.6 as well as the one for 퐹푛 using the method of undetermined coefficients. (26) Prove for the 푛-cycle that 푛 푛 푃(퐶푛; 푡) = (푡 − 1) + (−1) (푡 − 1) in two ways: using deletion-contraction and using NBC sets. (27) Prove Theorem 3.8.4 using induction and give a second proof of parts (b)–(d) using NBC sets. (28) (a) Complete the proof of Theorem 3.8.6. (b) Let 퐺 = (푉, 퐸) be a graph and 푡 ∈ ℙ. Call an acyclic orientation 푂 and a (not necessarily proper) coloring 푐∶ 푉 → [푡] compatible if, for all arcs ⃗푢푣 of 푂, we have 푐(푢) ≤ 푐(푣). Show that if #푉 = 푛, then 푃(퐺; −푡) = (−1)푛(number of compatible pairs (푂, 푐)). (c) Show that Theorem 3.8.6 is a special case of part (b). (29) Finish the proofs of Lemma 3.8.7 and Lemma 3.8.9. (30) (a) Call a permutation 휎 which avoids Π = {231, 312, 321} tight. Show that 휎 is tight if and only if 휎 is an involution having only 2-cycles of the form (푖, 푖 + 1) for some 푖. (b) Let 퐺 be a graph with 푉 = [푛]. Call a spanning forest 퐹 of 퐺 tight if the sequence of labels on any path starting at a root of 퐹 avoids Π as in part (a). Let

TSF푚(퐺) = {퐹 ∣ 퐹 is a tight spanning forest of 퐺 with 푚 edges}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 115

Show that if 퐺 has no 3-cycles, then for all 푚 ≥ 0

TSF푚(퐺) ⊆ NBC푚(퐺). (c) A candidate path in a graph 퐺 with no 3-cycles is a path of the form

푎, 푐, 푏, 푣1, 푣2, ... , 푣푚 = 푑

such that 푎 < 푏 < 푐, 푚 ≥ 1, and 푣푚 is the only 푣푖 smaller than 푐. A total order on 푉(퐺) is called a quasiperfect ordering (qpo) if every candidate path satisfies the following condition: either 푎푑 ∈ 퐸(퐺), or 푑 < 푏 and 푐푑 ∈ 퐸(퐺). Consider the generating function 푚 푛−푚 tsf(퐺; 푡) = ∑ (−1) tsf푚(퐺)푡 . 푚≥0 Show that tsf(퐺; 푡) = 푃(퐺; 푡) if and only if the natural order on [푛] is a qpo. (31) Fill in the details of the case −푑 ≤ 푚 ≤ 0 in the proof of Theorem 3.9.1.

(32) (a) Extend the Fibonacci numbers 퐹푛 to all 푛 ∈ ℤ by insisting that their recursion continue to hold for 푛 < 0. Show that if 푛 ≥ 0, then 푛−1 퐹−푛 = (−1) 퐹푛. 푛 (b) Find ∑푛≥1 퐹−푛푥 in two ways: by using part (a) and by using Theorem 3.9.1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 4

Counting with Exponential Generating Functions

Given a sequence 푎0, 푎1, 푎푛,... of complex numbers, one can associate with it an ex- 푛 ponential generating function where 푎푛 is the coefficient of 푥 /푛!. In certain cases it turns out that the exponential generating function is easier to deal with than the ordi- nary one. This is particularly true if the 푎푛 count combinatorial objects obtained from some set of labels. We give a method for dealing with such structures which again give rise to Sum and Product Rules as well as an Exponential Formula unique to this setting.

4.1. First examples

Given 푎0, 푎1, 푎푛,... where 푎푛 ∈ ℂ for all 푛, the corresponding exponential generating function (egf) is 푥 푥2 푥푛 퐹(푥) = 푎 + 푎 + 푎 + ⋯ = ∑ 푎 . 0 1 1! 2 2! 푛 푛! 푛≥0 In order to distinguish these from the ordinary generating functions (ogfs) in the pre- vious chapter, we will often use capital letters for egfs and lowercase ones for ogfs. The use of the adjective “exponential” is because in the simple case when 푎푛 = 1 for all 푛, 푛 푥 the corresponding egf is 퐹(푥) = ∑푛≥0 푥 /푛! = 푒 . To illustrate why egfs may be useful in studying a sequence, consider 푎푛 = 푛! for 푛 푛 ≥ 0. The ogf is 푓(푥) = ∑푛≥0 푛! 푥 and this power series cannot be simplified. On the other hand, the egf is 푥푛 1 퐹(푥) = ∑ 푛! = 푛! 1 − 푥 푛≥0 which can now be manipulated if necessary. To get some practice in using egfs, we will now compute some examples. One technique which occurs often in determining generating functions is the interchange of

117

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

118 4. Counting with Exponential Generating Functions

. As an example, consider the derangement numbers 퐷(푛) and the formula which was given for them in Theorem 2.1.2. So

푛 푥푛 1 푥푛 ∑ 퐷(푛) = ∑ 푛! ( ∑ (−1)푘 ) 푛! 푘! 푛! 푛≥0 푛≥0 푘=0

(−1)푘 = ∑ ∑ 푥푛 푘! 푘≥0 푛≥푘

(−1)푘 푥푘 = ∑ 푘! 1 − 푥 푘≥0

1 (−푥)푘 = ∑ 1 − 푥 푘! 푘≥0 푒−푥 = . 1 − 푥 We will be able to give a much more combinatorial derivation of this formula once we have introduced the theory of labeled structures in Section 4.3. For now, we just record the result for future reference.

Theorem 4.1.1. We have

푥푛 푒−푥 ∑ 퐷(푛) = . □ 푛! 1 − 푥 푛≥0

If the given sequence is defined by a recurrence relation, then one can use aslight modification of the algorithm in Section 3.6 to compute its egf. One just multiplies by 푥푛/푛!, rather than 푥푛, and sums. Note that the largest index in the recursion may not be the best choice to use as the power on 푥 because of the following considerations. 푛 Given 푓(푥) = ∑푛≥0 푎푛푥 , then its formal derivative is the formal power series

′ 푛−1 푓 (푥) = ∑ 푛푎푛푥 . 푛≥0 This derivative enjoys most of the usual properties of the ordinary analytic derivative such as linearity and the product rule. One similarly defines formal integrals. Note 푛 that if one starts with an egf 퐹(푥) = ∑푛≥0 푎푛푥 /푛!, then 푥푛−1 푥푛−1 푥푛 (4.1) 퐹′(푥) = ∑ 푛푎 = ∑ 푎 = ∑ 푎 푛 푛! 푛 푛+1 푛! 푛≥0 푛≥1 (푛 − 1)! 푛≥0 which is just the egf for the same sequence shifted up by one. So it can simplify things if the subscript on an element of the sequence is greater than the exponent of the cor- responding power of 푥. Let us consider the Bell numbers and their recurrence relation given in Theo- 푛 rem 1.4.1. Let 퐵(푥) = ∑푛≥0 퐵푛푥 /푛!. It will be convenient to replace 푛 by 푛 + 1 and 푘 by 푘 + 1 in the recursion before multiplying by 푥푛/푛! and summing. So using (4.1) and

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.1. First examples 119

the summation interchange trick we obtain 푥푛 퐵′(푥) = ∑ 퐵(푛 + 1) 푛! 푛≥0

푛 푛 푥푛 = ∑( ∑ ( )퐵(푛 − 푘)) 푘 푛! 푛≥0 푘=0

푛 1 = ∑ ∑ 퐵(푛 − 푘)푥푛 푛≥0 푘=0 푘! (푛 − 푘)!

푥푘 푥푛−푘 = ∑ ∑ 퐵(푛 − 푘) 푘! 푘≥0 푛≥푘 (푛 − 푘)!

푥푘 = ∑ 퐵(푥) 푘! 푘≥0 = 푒푥퐵(푥).

We now have a differential equation to solve, but we must take some care to make sure this can be done formally. To this end we define the formal natural logarithm by 푥2 푥3 (−1)푛−1푥푛 (4.2) ln(1 + 푥) = 푥 − + − ⋯ = ∑ 2 3 푛 푛≥1 which can be thought of as formally integrating the geometric series for 1/(1+푥). Note that since ln(1+푥) has an infinite number of terms, to have ln(1+푓(푥)) well-defined it must be that 푓(푥) has constant term 0 by Theorem 3.3.3. In other words, for an infinite series 푔(푥), we have ln 푔(푥) is only defined if the constant term of 푔(푥) is 1. Luckily this is true of 퐵(푥) so we can separate variables above to get 퐵′(푥)/퐵(푥) = 푒푥 and then formally integrate to get ln 퐵(푥) = 푒푥 + 푐 for some constant 푐. By definition (4.2) a natural log has no constant term so we must take 푐 = −1. Solving for 퐵(푥) we obtain the following result. Again, a more combinatorial proof will be given later. Theorem 4.1.2. We have

푛 푥 푥 ∑ 퐵(푛) = 푒푒 −1. □ 푛! 푛≥0

We end this section by discussing certain permutations whose descent sets have a nice structure. To compute their egf we will need to define the formal sine power series by 푥3 푥5 푥2푛+1 sin 푥 = 푥 − + − ⋯ = ∑(−1)푛 3! 5! 푛≥0 (2푛 + 1)! and 1 sin 푥 cos 푥 = (sin 푥)′, sec 푥 = , tan 푥 = . cos 푥 cos 푥 Note that sec 푥 and tan 푥 are well-defined by Theorem 3.3.1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

120 4. Counting with Exponential Generating Functions

Call a permutation 휋 ∈ 푃([푛]) alternating if

(4.3) 휋1 > 휋2 < 휋3 > 휋4 < ⋯ or equivalently if Des 휋 consists of the odd number in [푛]. The 푛th Euler number is

퐸푛 = the number of alternating 휋 ∈ 푃([푛]). For example, when 푛 = 4 then the alternating permutations are

2143, 3142, 3241, 4132, 4231

so 퐸4 = 5. A permutation is complement alternating if 휋1 < 휋2 > 휋3 < 휋4 > ⋯ or equivalently 휋푐 is alternating where 휋푐 is the complement of 휋 as defined in Ex- ercise (37)(b) of Chapter 1. Clearly 퐸푛 also counts the number of complement alter- nating 휋 ∈ 푃([푛]). More generally, define any sequence of integers to be alternating using (4.3) and similarly for complement alternating. We have the following result for the Euler numbers.

Theorem 4.1.3. We have 퐸0 = 퐸1 = 1 and, for 푛 ≥ 1,

푛 푛 2퐸 = ∑ ( )퐸 퐸 . 푛+1 푘 푘 푛−푘 푘=0 Also 푥푛 ∑ 퐸 = sec 푥 + tan 푥. 푛 푛! 푛≥0

Proof. To prove the recurrence it will be convenient to consider the set 푆 which is the union of all permutations which are either alternating or complement alternating in 푃([푛 + 1]). So #푆 = 2퐸푛+1. Pick 휋 ∈ 푆 and suppose 휋푘 = 푛 + 1. Then 휋 factors as a word 휋 = 휋′(푛 + 1)휋″. Suppose first that 휋 is alternating. Then 푘 is odd, 휋′ is alternating, and 휋″ is complement alternating. The number of ways of choosing the ′ 푛 ″ elements of 휋 is (푘) and the remaining ones are used for 휋 . The number of ways of ′ ″ arranging the elements for 휋 in alternating order is 퐸푘, and for 휋 it is 퐸푛−푘. So the 푛 total number of such 휋 is (푘)퐸푘퐸푛−푘 where 푘 is odd. Similar considerations show that the same formula holds for even 푘 when 휋 is complement alternating. The summation side of the recursion follows. 푛 Now let 퐸(푥) be the egf for the 퐸푛. Multiplying the recurrence by 푥 /푛! and sum- ming over 푛 ≥ 1 one obtains the differential equation and boundary condition

2퐸′(푥) = 퐸(푥)2 + 1 and 퐸(0) = 1

where, to make things well-defined for formal power series, 퐸(0) is an abbreviation for the constant term of 퐸(푥). One now obtains the unique solution 퐸(푥) = sec 푥 + tan 푥, either by separation of variables or by verifying that this function satisfies the differential equation and initial condition. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.2. Generating functions for Eulerian polynomials 121

4.2. Generating functions for Eulerian polynomials

In Section 3.2 we saw that the inv and maj statistics have the same distribution. It turns out that there are statistics that have the same distribution as des and these are called Eulerian. The polynomials having this distribution have nice corresponding generat- ing functions of both the ordinary and exponential type. We will discuss them in this section. A whole book devoted to this topic has been written by Petersen [69]. Given 푛, 푘 ∈ ℕ with 0 ≤ 푘 < 푛, the corresponding Eulerian number is

퐴(푛, 푘) = #{휋 ∈ 픖푛 ∣ des 휋 = 푘}. As usual, we let 퐴(푛, 푘) = 0 if 푘 < 0 or 푘 ≥ 푛 with the exception 퐴(0, 0) = 1. For example, if 푛 = 3, then we have 푘 0 1 2 des 휋 = 푘 123 132, 213, 231, 312 321 so that 퐴(3, 0) = 1, 퐴(3, 1) = 4, 퐴(3, 2) = 1. Be sure not to confuse these Eulerian numbers with the Euler numbers introduced in the previous section. Also, some authors use 퐴(푛, 푘) to denote the number of permuta- tions in 픖푛 having 푘 − 1 descents. Some elementary properties of the 퐴(푛, 푘) are given in the next result. It will be convenient to let

퐴([푛], 푘) = {휋 ∈ 픖푛 ∣ des 휋 = 푘}. Theorem 4.2.1. Suppose 푛 ≥ 0. (a) The Eulerian numbers satisfy the initial condition

퐴(0, 푘) = 훿푘,0 and recurrence relation 퐴(푛, 푘) = (푘 + 1)퐴(푛 − 1, 푘) + (푛 − 푘)퐴(푛 − 1, 푘 − 1) for 푛 ≥ 1. (b) The Eulerian numbers are symmetric in that 퐴(푛, 푘) = 퐴(푛, 푛 − 푘 − 1). (c) We have ∑ 퐴(푛, 푘) = 푛! . 푘

Proof. We leave all except the recursion as an exercise. Suppose 휋 ∈ 퐴([푛], 푘). Then removing 푛 from 휋 results in 휋′ ∈ 퐴([푛 − 1], 푘) or 휋″ ∈ 퐴([푛 − 1], 푘 − 1) depending on the relative size of the elements to either side of 푛 in 휋. A permutation 휋′ will result if either the 푛 in 휋 is in the space corresponding to a descent of 휋′, or at the end of 휋′. So a given 휋′ will result 푘+1 times by this method, which accounts for the first term in the sum. Similarly, one obtains a 휋″ from 휋 if 푛 is either in the space of an ascent or at the beginning. So the total number of repetitions in this case is 푛 − 푘 and the recurrence is proved. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

122 4. Counting with Exponential Generating Functions

The 푛th Eulerian polynomial is 푘 des 휋 퐴푛(푞) = ∑ 퐴(푛, 푘)푞 = ∑ 푞 . 푘≥0 휋∈픖푛

Any statistic having distribution 퐴푛(푞) is said to be an Eulerian statistic. One of the other famous Eulerian statistics counts excedances. An excedance of a permutation 휋 ∈ 픖푛 is an integer 푖 such that 휋(푖) > 푖. This gives rise to the excedance set Exc 휋 = {푖 ∣ 휋(푖) > 푖} and excedance statistic exc 휋 = # Exc 휋. To illustrate, if 휋 = 3167542, then 휋(1) = 3, 휋(3) = 6, and 휋(4) = 7 while 휋(푖) ≤ 푖 for other 푖. So Exc 휋 = {1, 3, 4} and exc 휋 = 3. Making a chart for 푛 = 3 as we did for des gives 푘 0 1 2 exc 휋 = 푘 123 132, 213, 312, 321 231 so that the number of permutations in each column is given by the 퐴(3, 푘), even though the sets of permutations in the two tables are not necessarily equal. In order to prove that 퐴(푛, 푘) also counts permutations by excedances, we will need a map that is so important in enumerative combinatorics that it is sometimes called the fundamental bijection. Before we can define this function, we will need some more concepts. Similar to what was done in Section 1.12, call an element 휋푖 of 휋 ∈ 픖푛 a left-right maximum if

휋푖 > max{휋1, 휋2, ... , 휋푖−1}.

Note that 휋1 and 푛 are always left-right maxima and that the left-right maxima in- crease left-to-right. To illustrate, the left-right maxima of 휋 = 51327846 are 5, 7, and 8. The left-right maxima of 휋 determine the left-right factorization of 휋 into factors 휋푖휋푖+1 ... 휋푗−1 where 휋푖 is a left-right maximum and 휋푗 is the next such maximum. In our example 휋, the factorization is 5132, 7, and 846. Recall that since disjoint cycles commute, there are many ways of writing the dis- joint cycle decomposition 휋 = 푐1푐2 ⋯ 푐푘. We wish to distinguish one which is analo- gous to the left-right factorization. The canonical cycle decomposition of 휋 is obtained by writing each 푐푖 so that it starts with max 푐푖 and then ordering the cycles so that

max 푐1 < max 푐2 < ⋯ < max 푐푘. To illustrate, the permutation 휋 = (7, 1, 8)(2, 4, 5, 3)(6) written canonically becomes 휋 = (5, 3, 2, 4)(6)(8, 7, 1).

The fundamental map is Φ∶ 픖푛 → 픖푛 where Φ(휋) is obtained by replacing each left-right factor 휋푖휋푖+1 ... 휋푗−1 by the cycle (휋푖, 휋푖+1, ... , 휋푗−1). For example Φ(51327846) = (5, 1, 3, 2)(7)(8, 4, 6) = 35261874. Note that from the definitions it follows that the cycle decomposition of Φ(휋) obtained will be the canonical one. It is also easy to construct an inverse for Φ: given 휎 ∈ 픖푛, we construct its canonical cycle decomposition and then just remove the parentheses

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.2. Generating functions for Eulerian polynomials 123

and commas to get 휋. These maps are inverses since the inequalities defining the left- right factorization and canonical cycle decomposition are the same. We have proved the following. □ Theorem 4.2.2. The fundamental map Φ∶ 픖푛 → 픖푛 is a bijection. Corollary 4.2.3. For 푛, 푘 ≥ 0 we have

퐴(푛, 푘) = number of 휋 ∈ 픖푛 with 푘 excedances.

Proof. A coexcedance of 휋 ∈ 픖푛 is 푖 ∈ [푛] such that 휋(푖) < 푖. Note that the distribution of coexcedances over 픖푛 is the same as for excedances. Indeed, we have a bijection −1 on 픖푛 defined by 휋 ↦ 휋 . And this bijection has the property that the number of excedances of 휋 is the number of coexcedances of 휋−1 because one obtains the two- line notation for 휋−1 (as defined in Section 1.5) by taking the two-line notation for 휋, interchanging the top and bottom lines, and then permuting the columns until the first row is 12 ... 푛. We now claim that if Φ(휋) = 휎 where Φ is the fundamental bijection, then des 휋 is the number of coexcedances of 휎 which will finish the proof. But if we we have a descent 휋푖 > 휋푖+1 in 휋, then 휋푖, 휋푖+1 must be in the same factor of the left-right factor- ization. So in Φ(휋) we have a cycle mapping 휋푖 to 휋푖+1. This makes 휋푖 a coexcedance of 휎. Similar ideas show that no ascent of 휎 gives rise to a coexcedance of 휎 and so we are done. □

We will now derive two generating functions involving the Eulerian polynomials, one ordinary and one exponential. Theorem 4.2.4. For 푛 ≥ 0 we have

퐴푛(푞) 푛 푚 (4.4) 푛+1 = ∑ (푚 + 1) 푞 . (1 − 푞) 푚≥0

Proof. We will count descent partitioned permutations 휋 which consist of a permuta- tion 휋 ∈ 픖푛 which has bars inserted in some of its spaces either between elements or before 휋1 or after 휋푛, subject to the restriction that the space between each de- scent 휋푖 > 휋푖+1 must have a bar. For example, if 휋 = 2451376, then we could have 휋 = 24|5||137|6|. Let 푏(휋) be the number of bars in 휋. We will show that both sides 푏(휋) of (4.4) are the generating function 푓(푞) = ∑휋 푞 . First of all, given 휋, what is its contribution to 푓(푞)? First we must put bars in the descents of 휋 which results in a factor of 푞des 휋. Now we can choose the rest of the bars by putting them in any of the 푛 + 1 spaces of 휋 with repetition allowed which, by Theorem 3.4.2, gives a factor of 1/(1 − 푞)푛+1. So 푞des 휋 퐴 (푞) 푓(푞) = ∑ = 푛 (1 − 푞)푛+1 (1 − 푞)푛+1 휋∈픖푛 which is the first half of what we wished to prove. On the other hand, the coefficient of 푞푚 in 푓(푞) is the number of 휋 which have exactly 푚 bars. One can construct these permutations as follows. Start with 푚 bars

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

124 4. Counting with Exponential Generating Functions

which create 푚 + 1 spaces between them. Now place the numbers 1, ... , 푛 between the bars, making sure that the numbers between two consecutive bars form an increasing sequence. So we are essentially placing 푛 distinguishable balls into 푚 + 1 distinguish- able boxes since the ordering in each box is fixed. By the twelvefold way, there are (푚 + 1)푛 ways of doing this, which completes the proof. □

We can now use the ogf just derived to find the egf for the polynomials 퐴푛(푥). Theorem 4.2.5. We have 푥푛 푞 − 1 (4.5) ∑ 퐴 (푞) = . 푛 푛! (푞−1)푥 푛≥0 푞 − 푒

Proof. Multiply both sides of the equality in the previous theorem by the quantity (1 − 푞)푛푥푛/푛! and sum over 푛. The left-hand side becomes (1 − 푞)푛퐴 (푞)푥푛 1 푥푛 ∑ 푛 = ∑ 퐴 (푞) . 푛+1 1 − 푞 푛 푛! 푛≥0 (1 − 푞) 푛! 푛≥0 And the right side is now (1 − 푞)푛(푚 + 1)푛푥푛 [(1 − 푞)푥(푚 + 1)]푛 ∑ ∑ 푞푚 = ∑ 푞푚 ∑ 푛! 푛! 푛≥0 푚≥0 푚≥0 푛≥0

= ∑ 푞푚푒(1−푞)푥(푚+1) 푚≥0

푒(1−푞)푥 = 1 − 푞푒(1−푞)푥 1 = . 푒(푞−1)푥 − 푞 Setting the two sides equal and solving for the desired generating function completes the proof. □

4.3. Labeled structures

There is a method for working combinatorially with exponential generating functions which we will present in the following sections. It is based on Joyal’s theory of species [47]. His original method used the machinery of categories and functors. But for the type of enumeration we will be doing, it is not necessary to use this level of generality. An exposition of the full theory can be found in the textbook of Bergeron, Labelle, and Leroux [11]. A labeled structure is a function 풮 which assigns to each finite set 퐿 a finite set 풮(퐿) such that (4.6) #퐿 = #푀 ⟹ #풮(퐿) = #풮(푀). We call 퐿 the label set and 풮(퐿) the set of structures on 퐿. We let

푠푛 = #풮(퐿)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.3. Labeled structures 125

for any 퐿 of cardinality 푛, and this is well-defined because of (4.6). We sometimes use 풮(⋅) as an alternative notation for the structure 풮. We also have the corresponding egf 푥푛 퐹 = 퐹 (푥) = ∑ 푠 . 풮 풮(⋅) 푛 푛! 푛≥0 Although these definitions may seem very abstract, we have already seen many exam- ples of labeled structures. It is just that we have not identified them as such. The rest of this section will be devoted to putting these examples in context. A summary can be found in Table 4.1. To start, consider the labeled structure defined by 풮(퐿) = 2퐿. So 풮 assigns to each label set 퐿 the set of subsets of 퐿. To illustrate 풮({푎, 푏}) = {∅, {푎}, {푏}, {푎, 푏}}. [푛] 푛 Clearly 풮 satisfies (4.6) with 푠푛 = #2 = 2 . So the associated generating function is 푛 푛 푥 2푥 (4.7) 퐹 ⋅ (푥) = ∑ 2 = 푒 . 2 푛! 푛≥0 We will also want to specify the size of the subsets under consideration by using 퐿 푛 the structure 풮(퐿) = (푘) for some fixed 푘 ≥ 0. Now we have 푠푛 = (푘) and, using the fact that this binomial coefficient is zero for 푛 < 푘, 푛 푥푛 푛! 푥푛 푥푘 푥푛−푘 푥푘 (4.8) 퐹 (푥) = ∑ ( ) = ∑ ⋅ = ∑ = 푒푥. ( ⋅ ) 푘 푛! 푛! 푘! 푘! 푘 푛≥0 푛≥푘 푘! (푛 − 푘)! 푛≥푘 (푛 − 푘)! It will be convenient to have a map which adds no extra structure to the label set. So define the labeled structure 퐸(퐿) = {퐿}. Note that 퐸 returns the set consisting of 퐿 itself, not the set consisting of the elements of 퐿. Consequently 푠푛 = 1 for all 푛 and 푥 퐹퐸 = 푒 . The use of 퐸 for this labeled structure reflects both the fact that its egf isthe exponential function and also that the French word for “set” is “ensemble”. (Joyal is a francophone.) We will also need to specify that a set be nonempty by defining {퐿} if 퐿 ≠ ∅, 퐸(퐿) = { ∅ if 퐿 = ∅.

Note that 퐸(∅) = {∅} while 퐸(∅) = ∅. For 퐸 we clearly have 푠푛 = 1 for 푛 ≥ 1 and 푠0 = 0. It is also obvious that 푥 (4.9) 퐹퐸 = 푒 − 1. For partitions of sets we will use the structure 퐿 ↦ 퐵(퐿) where 퐵(퐿) is defined as in Section 1.4. So 푠푛 = 퐵(푛) and, by Theorem 4.1.2, the egf is 푛 푥 푥 퐹 = ∑ 퐵(푛) = 푒푒 −1. 퐵 푛! 푛≥0 We will be able to give a combinatorial derivation of this fact once we derive the Expo- nential Formula in Section 4.5, rather than using the recursion and formal manipula- tions as we did before.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

126 4. Counting with Exponential Generating Functions

Table 4.1. Labeled structures

풮(퐿) Counts 푠푛 egf

푥푛 2퐿 subsets 2푛 ∑ 2푛 = 푒2푥 푛! 푛≥0

퐿 푛 푛 푥푛 푥푘푒푥 ( ) 푘-subsets ( ) ∑ ( ) = 푘 푘 푘 푛! 푘! 푛≥0

푥푛 퐸(퐿) sets 1 ∑ = 푒푥 푛! 푛≥0

푥푛 퐸(퐿) nonempty sets 1 − 훿 ∑ = 푒푥 − 1 푛,0 푛! 푛≥1

푛 푥 푥 퐵(퐿) set partitions 퐵 ∑ 퐵 = 푒푒 −1 푛 푛 푛! 푛≥0

set partitions 푥푛 1 푆(퐿, 푘) 푆(푛, 푘) ∑ 푆(푛, 푘) = (푒푥 − 1)푘 with 푘 blocks 푛! 푘! 푛≥0

ordered version 푥푛 푆 (퐿, 푘) 푘! 푆(푛, 푘) 푘! ∑ 푆(푛, 푘) = (푒푥 − 1)푘 표 of 푆(퐿, 푘) 푛! 푛≥0

푥푛 1 픖(퐿) permutations 푛! ∑ 푛! = 푛! 1 − 푥 푛≥0

푘 permutations 푥푛 1 1 푐(퐿, 푘) 푐(푛, 푘) ∑ 푐(푛, 푘) = (ln ) with 푘 cycles 푛! 푘! 1 − 푥 푛≥0

푘 ordered version 푥푛 1 푐 (퐿, 푘) 푘! 푐(푛, 푘) 푘! ∑ 푐(푛, 푘) = (ln ) 표 of 푐(퐿, 푘) 푛! 1 − 푥 푛≥0

permutations 푥푛 1 푐(퐿) (푛 − 1)! ∑(푛 − 1)! = ln with 1 cycle 푛! 1 − 푥 푛≥1

Just as with subsets, we will restrict our attention to partitions with a given number of blocks by using 퐿 ↦ 푆(퐿, 푘). Now we have 푠푛 = 푆(푛, 푘), a Stirling number of the 푛 second kind. But we have yet to find a closed form for the egf ∑푛≥0 푆(푛, 푘)푥 /푛! to verify that entry in Table 4.1. We will be able to do this easily once we have the sum and product rules for egfs presented in the next section.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.3. Labeled structures 127

We will sometimes work with set partitions where there is a specified ordering on the blocks and use the notation

푆표(퐿, 푘) = {(퐵1, 퐵2, ... , 퐵푘) ∣ 퐵1/퐵2/ ... /퐵푘 ⊢ 퐿}.

We call these ordered set partitions or set compositions. Note that the ordering is of the blocks themselves, not of the elements in each block, so that ({1, 3}, {2}) ≠ ({2}, {1, 3}) but ({1, 3}, {2}) = ({3, 1}, {2}). Clearly the labeled structure 퐿 ↦ 푆표(퐿, 푘) has 푠푛 = 푘! 푆(푛, 푘) and a similar statement can be made for the egf. We will also need weak set compositions where we will allow empty blocks. One can look at labeled structures on permutations analogously to what we have just seen for set partitions. In this context, consider a permutation of 퐿 to be a bijec- tion 휋∶ 퐿 → 퐿 decomposed into cycles as we did when 퐿 = [푛] in Section 1.5. Let 풮(퐿) = 픖(퐿) be the labeled structure of all permutations of 퐿 so that 푠푛 = 푛! and 푛 퐹픖 = ∑푛 푛! 푥 /푛! = 1/(1 − 푥). We have the associated structures

푐(퐿, 푘) = {휋 = 푐1푐2 ⋯ 푐푘 ∣ 휋 is a permutation of 퐿 with 푘 cycles 푐푖}

with ordered variant

푐표(퐿, 푘) = {(푐1, 푐2, ... , 푐푘) ∣ the 푐푖 are the cycles of a permutation of 퐿}.

Using the signless Stirling numbers of the first kind we see that the sequences enumer- ating these two structures are 푐(푛, 푘) and 푘! 푐(푛, 푘), respectively. Again, we will wait to evaluate the corresponding egfs. Finally, we will find the special case 푐(퐿) ≔ 푐(퐿, 1) of having only one cycle partic- ularly useful. In this case, the enumerator is easy to compute.

Proposition 4.3.1. We have

(푛 − 1)! if 푛 ≥ 1, #푐([푛]) = { 0 if 푛 = 0.

Proof. The empty permutation has no cycles so that 푐(∅) = 0. Suppose 푛 ≥ 1 and consider (푎1, 푎2, ... , 푎푛) ∈ 푐([푛]). Then the number of such cycles is the number of ways to order the 푎푖 divided by the number of orderings which give the same cycle, namely 푛! /푛 = (푛 − 1)!. □

It follows from the previous proposition that

푥푛 푥푛 1 퐹 = ∑(푛 − 1)! = ∑ = ln 푐 푛! 푛 1 − 푥 푛≥1 푛≥1

by definition (4.2).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

128 4. Counting with Exponential Generating Functions

4.4. The Sum and Product Rules for egfs

Just as with sets and ogfs, there is a Sum Rule and a Product Rule for egfs. To derive these results, we will first need corresponding rules for labeled structures. Suppose 풮 and 풯 are labeled structures. If 풮(퐿) ∩ 풯(퐿) = ∅ for any finite set 퐿, then we say 풮 and 풯 are disjoint. In this case we define their disjoint union structure, 풮 ⊎ 풯, by (풮 ⊎ 풯)(퐿) = 풮(퐿) ⊎ 풯(퐿). It is easy to see, and we will prove in Proposition 4.4.1 below, that 풮 ⊎ 풯 satisfies the definition of a labeled structure. As examples of this concept, suppose #퐿 = 푛. Then 2퐿 can be partitioned into the subsets of 퐿 having size 푘 for 0 ≤ 푘 ≤ 푛. In other words

퐿 퐿 퐿 (4.10) 2퐿 = ( ) ⊎ ( ) ⊎ ⋯ ⊎ ( ). 0 1 푛

Note that to make a statement about all 퐿 regardless of cardinality we can write 2퐿 = 퐿 퐿 ⨄푘≥0 (푘) since (푘) = ∅ for 푘 > #퐿. Similarly, we have (4.11) 퐵(퐿) = 푆(퐿, 0) ⊎ 푆(퐿, 1) ⊎ ⋯ ⊎ 푆(퐿, 푛) and (4.12) 픖(퐿) = 푐(퐿, 0) ⊎ 푐(퐿, 1) ⊎ ⋯ ⊎ 푐(퐿, 푛).

To define products, let 풮 and 풯 be arbitrary labeled structures. Their product, 풮×풯, is defined by (풮 × 풯)(퐿)

= {(푆, 푇) ∣ 푆 ∈ 풮(퐿1), 푇 ∈ 풯(퐿2) with (퐿1, 퐿2) a weak composition of 퐿}. Intuitively, we carve 퐿 up into two subsets in all possible ways and put an 풮-structure on the first subset and a 풯-structure on the second. Again, we will show that this is indeed a labeled structure in Proposition 4.4.1. Strictly speaking, 풮 × 풯 should be a multiset since it is possible that the same pair (푆, 푇) could arise from two different ordered partitions. However, in the examples we will use, this will never be the case. And the theorems we will prove about the product will still be true in the more general context if we count with multiplicity. In order to give some examples using products, we will need a notion of equiva- lence of structures. Say that labeled structures 풮 and 풯 are equivalent, and write 풮 ≡ 풯 if #풮(퐿) = #풯(퐿) for all finite 퐿. Sometimes we will write 풮(퐿) ≡ 풯(퐿) for this concept if the context makes inclusion of a generic label set 퐿 convenient. Clearly if 풮 ≡ 풯, then 퐹풮(푥) = 퐹풯(푥). As a first illustration of these concepts, we claim that (4.13) 2⋅ ≡ (퐸 × 퐸)(⋅).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.4. The Sum and Product Rules for egfs 129

To see this, note that subsets 푆 ∈ 2퐿 are in bijection with weak compositions via the map 푆 ↔ (푆, 퐿 − 푆). So #2퐿 = #(퐸 × 퐸)(퐿) as we wished to show. These same ideas demonstrate that we

have 푆표(⋅, 2) = (퐸 × 퐸)(⋅) and more generally, for any 푘 ≥ 0, 푘 (4.14) 푆표(⋅, 푘) = 퐸 (⋅). In a similar manner, we obtain 푘 (4.15) 푐표(⋅, 푘) = 푐 (⋅). It is time to prove the Sum and Product Rules for labeled structures. In so doing, we will also be showing that they satisfy the definition for a labeled structure (4.6). Proposition 4.4.1. Let 풮, 풯 be labeled structures and let

푠푛 = #풮(퐿), 푡푛 = #풯(퐿) where #퐿 = 푛. (a) (Sum Rule) If 풮 and 풯 are disjoint, then

#(풮 ⊎ 풯)(퐿) = 푠푛 + 푡푛. (b) (Product Rule) For any 풮, 풯 푛 푛 #(풮 × 풯)(퐿) = ∑ ( )푠 푡 . 푘 푘 푛−푘 푘=0

Proof. For part (a) we have

#(풮 ⊎ 풯)(퐿) = #(풮(퐿) ⊎ 풯(퐿)) = #풮(퐿) + #풯(퐿) = 푠푛 + 푡푛. Now consider part (b). In order to construct (푆, 푇) ∈ (풮 × 풯)(퐿) we must first pick a weak composition 퐿 = 퐿1 ⊎ 퐿2. This is equivalent to just picking 퐿1 as then 푛 퐿2 = 퐿 − 퐿1. So if #퐿1 = 푘, then there are (푘) ways to perform this step. Next we must put an 풮-structure on 퐿1 and a 풯-structure on 퐿2 which can be done in 푠푘푡푛−푘 ways. Multiplying together the two counts and summing over all possible 푘 yields the desired formula. □

As application of this result, note that applying the Sum Rule to (4.10) just gives 푛 푛 2 = ∑푘 (푘) which is Theorem 1.3.3(c). And if we apply the Product Rule to (4.13), we get 푛 푛 푛 푛 2푛 = ∑ ( ) ⋅ 1 ⋅ 1 = ∑ ( ) 푘 푘 푘=0 푘=0 again. Somewhat more interesting formulas are derived in Exercise 8(b) of this chapter. We can now translate Proposition 4.4.1 into the corresponding rules for exponen- tial generating functions. This will permit us to fill in the entries in Table 4.1 which were postponed in the previous section.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

130 4. Counting with Exponential Generating Functions

Theorem 4.4.2. Let 풮, 풯 be labeled structures. (a) (Sum Rule) If 풮 and 풯 are disjoint, then

퐹풮⊎풯(푥) = 퐹풮(푥) + 퐹풯(푥). (b) (Product Rule) For any 풮, 풯

퐹풮×풯(푥) = 퐹풮(푥) ⋅ 퐹풯(푥).

Proof. Let 푠푛 = 풮([푛]) and 푡푛 = 풯([푛]). Using the Sum Rule in Proposition 4.4.1 gives 푥푛 푥푛 퐹 (푥) + 퐹 (푥) = ∑ 푠 + ∑ 푡 풮 풯 푛 푛! 푛 푛! 푛≥0 푛≥0 푥푛 = ∑(푠 + 푡 ) 푛 푛 푛! 푛≥0 푥푛 = ∑ #(풮 ⊎ 풯)([푛]) 푛! 푛≥0

= 퐹풮⊎풯(푥). Now using the Product Rule of the same proposition yields

푥푛 푥푛 퐹 (푥)퐹 (푥) = (∑ 푠 )(∑ 푡 ) 풮 풯 푛 푛! 푛 푛! 푛≥0 푛≥0

푛 푠 푡 = ∑( ∑ 푘 ⋅ 푛−푘 )푥푛 푘! 푛≥0 푘=0 (푛 − 푘)!

푛 푛 푥푛 = ∑( ∑ ( )푠 푡 ) 푘 푘 푛−푘 푛! 푛≥0 푘=0 푥푛 = ∑ #(풮 × 풯)([푛]) 푛! 푛≥0

= 퐹풮×풯(푥), which completes the proof. □

As an illustration of how this result can be used, we can apply the Sum Rule to (4.10), keeping in mind the comment following the equation, to write

퐹2⋅(푥) = ∑ 퐹(⋅)(푥). 푘≥0 푘 We can check this using (4.7) and (4.8): 푥푘 푥푘 ∑ 퐹 (푥) = ∑ 푒푥 = 푒푥 ∑ = 푒푥 ⋅ 푒푥 = 푒2푥 = 퐹 ⋅(푥). (⋅) 푘! 푘! 2 푘≥0 푘 푘≥0 푘≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.5. The Exponential Formula 131

One can also apply the Product Rule to (4.13) and obtain

퐹2⋅(푥) = 퐹퐸(푥)퐹퐸(푥). Again, this yields a simple identity; namely 푒2푥 = 푒푥 ⋅ 푒푥. The true power of Theorem 4.4.2 is that it can be used to derive egfs which are more complicated to prove by other means. For example, applying the Product Rule to equation (4.14) along with (4.9) yields 퐹 (푥) = 퐹 (푥)푘 = (푒푥 − 1)푘, 푆표(⋅,푘) 퐸

a new entry for Table 4.1. Furthermore, since 퐹푆표(⋅,푘)(푥) = 푘! 퐹푆(⋅,푘)(푥) we obtain (푒푥 − 1)푘 퐹 (푥) = . 푆(⋅,푘) 푘! This permits us to give another derivation of the egf for 퐵(푛). Using the Sum Rule and (4.11) gives 푥 푘 (푒 − 1) 푥 퐹 (푥) = ∑ 퐹 (푥) = ∑ = 푒푒 −1. 퐵 푆(⋅,푘) 푘! 푘≥0 푘≥0 These same ideas can be used to derive the egfs for permutations with a given number of cycles, as the reader is asked to do in the exercises.

4.5. The Exponential Formula

Often in combinatorics and other areas of mathematics there are objects which can be broken down into components. For example, the components of set partitions are blocks and the components of permutations are cycles. The exponential formula de- termines the egf of a labeled structure in terms of the egf for its components. It can also be considered as an analogue of the Product Rule for egfs where one carves 퐿 into an arbitrary number of subsets (rather than just 2) and the subsets are unordered (rather than ordered). To make these ideas precise, let 풮 be a labeled structure satisfying (4.16) 풮(퐿) ∩ 풮(푀) = ∅ if 퐿 ≠ 푀. The corresponding partition structure, Π(풮), is defined by

(Π(풮))(퐿) = {{푆1, 푆2, ... } ∣ for all 퐿1/퐿2/ ... ⊢ 퐿 with 푆푖 ∈ 풮(퐿푖) for all 푖}. Intuitively, to form (Π(풮))(퐿) we partition the label set 퐿 in all possible ways and then put a structure from 풮 on each block of the partition, again in all possible ways. Condi- tion (4.16) is imposed so that each element of (Π(풮))(퐿) can only arise in one way from this process. To illustrate,

(4.17) 퐵(퐿) = {퐿1/퐿2/ ... ⊢ 퐿} ≡ (Π(퐸))(퐿)

since 퐿푖 is the only element of 퐸(퐿푖) for any 푖. In much the same way, we see that

(4.18) 픖(퐿) = {푐1푐2 ⋯ ∣ 푐푖 a cycle on 퐿푖 for all 푖 for all 퐿1/퐿2/ ... ⊢ 퐿} ≡ (Π(푐))(퐿).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

132 4. Counting with Exponential Generating Functions

There is a simple relationship between the egf for Π(풮) and the egf for 풮 which is the labeled structure defined by 풮(퐿) if 퐿 ≠ ∅, 풮(퐿) = { ∅ if 퐿 = ∅.

So if 푠푛 = #풮([푛]), then 푥푛 퐹 (푥) = ∑ 푠 . 풮 푛 푛! 푛≥1

We need 퐹풮(푥) to have a zero constant term so that the composition in the next result will be well-defined, see Theorem 3.3.3. Theorem 4.5.1 (Exponential Formula). If 풮 is a labeled structure satisfying (4.16), then 퐹 (푥) 퐹Π(풮)(푥) = 푒 풮 .

Proof. We have 푘 퐹 (푥) 퐹풮(푥) 푒 풮 = ∑ . 푘! 푘≥0 푘 From the Product Rule for egfs in Theorem 4.4.2 we see that 퐹풮(푥) is the egf for putting 풮-structures on partitions of the label set into 푘 ordered, nonempty blocks. 푘 So, by (4.16), 퐹풮(푥) /푘! is the egf for putting 풮-structures on partitions of the label set into 푘 unordered, nonempty blocks. Now using the Sum Rule for egfs, again from Theo- 푘 rem 4.4.2, it follows that ∑푘≥0 퐹풮(푥) /푘! is the egf for putting 풮-structures on partitions of the label set into any number of unordered, nonempty blocks. But this is exactly the structure Π(풮) and so we are done. □

As a first application of the Exponential Formula, consider (4.17). In this case 푥 풮 = 퐸 and 퐹퐸(푥) = 푒 − 1. So, applying the previous theorem, 푥 퐹퐸 (푥) 푒 −1 퐹퐵(푥) = 퐹Π(퐸)(푥) = 푒 = 푒 . Even though we already knew this generating function, this proof is definitely the sim- plest both computationally and conceptually. We can use (4.18) in a similar manner. Now 풮 = 푐 and

퐹푐(푥) = ln(1/(1 − 푥)) = 퐹푐(푥) since the original egf already has no constant term. Applying the Exponential Formula gives 1 퐹 (푥) = 퐹 (푥) = 푒퐹푐(푥) = 푒ln(1/(1−푥)) = 픖 Π(푐) 1 − 푥 which at least agrees with what we computed previously for this egf, even though this in now a more roundabout way of getting it. But with Theorem 4.5.1 in hand it is easy to get more refined information about permutations or other labeled structures. For example, suppose we wish to give a simpler and more combinatorial derivation for the egf of the derangement numbers 퐷(푛) found in Theorem 4.1.1. The corresponding structure is defined by 풟(퐿) = derangements on 퐿.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

4.5. The Exponential Formula 133

In order to express 풟(퐿) as a partition structure we need to permit only cycles of length two or greater. So let 푐(퐿) if #퐿 ≥ 2, 풮(퐿) = { ∅ otherwise. It follows that 풟 ≡ Π(풮). Furthermore, (푛 − 1)! if 푛 ≥ 2, 푠 = { 푛 0 otherwise, so that 푥푛 푥푛 1 퐹 (푥) = ∑(푛 − 1)! = ∑ = ln( ) − 푥. 풮 푛! 푛 1 − 푥 푛≥2 푛≥2 Applying the Exponential Formula gives 푥푛 1 푒−푥 (4.19) ∑ 퐷(푛) = 퐹 (푥) = 퐹 (푥) = exp(ln( ) − 푥) = . 푛! 풟 Π(풮) 1 − 푥 1 − 푥 푛≥0 One can mine even more information from Theorem 4.5.1 since the proof shows 푘 that each of the summands 퐹풮(푥) /푘! has a combinatorial meaning. Define the hyper- bolic sine and cosine functions to be the formal power series 푥3 푥5 푥2푛+1 sinh 푥 = 푥 + + + ⋯ = ∑ 3! 5! 푛≥0 (2푛 + 1)! and 푥2 푥4 푥2푛 cosh 푥 = 1 + + + ⋯ = ∑ . 2! 4! 푛≥0 (2푛)! 푛 It is easy to see that for any formal power series 푓(푥) = ∑푛≥0 푎푛푥 we can extract the series of odd or even terms by 푓(푥) − 푓(−푥) 푓(푥) + 푓(−푥) (4.20) ∑ 푎 푥2푛+1 = and ∑ 푎 푥2푛 = . 2푛+1 2 2푛 2 푛≥0 푛≥0 It follows that 푒푥 − 푒−푥 푒푥 + 푒−푥 (4.21) sinh 푥 = and cosh 푥 = . 2 2

Define the odd partition structure Π표(풮) by

(Π표(풮))(퐿)

= {{푆1, 푆2, ... } ∈ (Π(풮))(퐿) ∣ 퐿 partitioned into an odd number of blocks}

and similarly define the even partition structure Π푒(풮). A proof like that of the Expo- nential Formula can be used to demonstrate the following. Theorem 4.5.2. If 풮 is a labeled structure satisfying (4.16), then 퐹 (푥) = sinh 퐹 (푥) Π표(풮) 풮 and

퐹 (푥) = cosh 퐹 (푥). □ Π푒(풮) 풮

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

134 4. Counting with Exponential Generating Functions

Now suppose we wish to find the egf for 푎푛 which is the number of permutations of [푛] that have an odd number of cycles. As before 풮 = 푐 with 퐹푐(푥) = 퐹푐(푥) = ln(1/(1 − 푥)). Using Theorem 4.5.2 and then (4.21) we see that

푥푛 ∑ 푎 = sinh 퐹 (푥) 푛 푛! 푐 푛≥0

푒ln(1/(1−푥)) − 푒− ln(1/(1−푥)) = 2 1 1 = ( − (1 − 푥)) 2 1 − 푥

1 = 푥 + ∑ 푥푛. 2 푛≥2

Extracting the coefficient of 푥푛/푛! from the first and last sums above yields

푛! /2 if 푛 ≥ 2, (4.22) 푎 = { 푛 1 if 푛 = 1.

Of course, once one has obtained such a simple answer, one would like a purely com- binatorial explanation and the reader is encouraged to find one in Exercise 12(c) of this chapter.

Exercises

(1) (a) Use the recursion for the derangement numbers in Exercise 4 of Chapter 2 to reprove Theorem 4.1.1. (b) Use the recursion for the derangement numbers in Exercise 5 of Chapter 2 to reprove Theorem 4.1.1. (2) (a) Finish the proof of Theorem 4.2.1. (b) Finish the proof of Corollary 4.2.3. (c) Give two proofs of the identity

푚 + 푛 − 푘 (푚 + 1)푛 = ∑ 퐴(푛, 푘)( ), 푛 푘≥0

one using equation (4.4) and one using descent partitioned permutations.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 135

(d) Give a combinatorial proof of the following formula for 퐴(푛, 푘): 푘+1 푛 + 1 퐴(푛, 푘) = ∑(−1)푖( )(푘 − 푖 + 1)푛. 푖 푖=0 Hint: Use the Principle of Inclusion and Exclusion and descent partitioned permutations. (e) Give a combinatorial proof of the following recursion for 푛 ≥ 1: 푛−2 푛 − 1 퐴 (푞) = 퐴 (푞) + 푞 ∑ ( )퐴 (푞)퐴 (푞). 푛 푛−1 푖 푖 푛−푖−1 푖=0

Hint: Factor each 휋 ∈ 픖푛 as 휋 = 휎푛휏. (f) Use (e) to give a second proof of (4.5). (3) Let 퐼 ⊂ ℙ be finite and let 푚 = max 퐼 if 퐼 is nonempty or 푚 = 0 if 퐼 = ∅. For 푛 > 푚 define the corresponding descent polynomial 푑(퐼; 푛) to be the number of 휋 ∈ 픖푛 such that Des 휋 = 퐼. 푛−1 (a) Prove that 푑([푘]; 푛) = ( 푘 ). (b) If 퐼 ≠ ∅, then let 퐼− = 퐼 − {푚}. Prove that 푛 푑(퐼; 푛) = ( )푑(퐼−; 푚) − 푑(퐼−; 푛). 푚 − Hint: Consider the set of 휋 ∈ 픖푛 such that Des(휋1휋2 ... 휋푚) = 퐼 and 휋푚+1 < 휋푚+2 < ⋯ < 휋푛. (c) Use part (b) to show that 푑(퐼; 푛) is a polynomial in 푛 having degree deg(퐼; 푛) = 푚. (d) Reprove the fact that 푑(퐼; 푛) is a polynomial in 푛 using the Principle of Inclu- sion and Exclusion. (e) Since 푑(퐼; 푛) is a polynomial in 푛, its domain of definition can be extended to all 푛 ∈ ℂ. Show that if 푖 ∈ 퐼, then 푑(퐼; 푖) = 0. (f) Show that the complex roots of 푑(퐼; 푛) all lie in the circle |푧| ≤ 푚 in the com- plex plane and also all have real part greater than or equal to −1. Note: This seems to be a difficult problem. (g) (Conjecture) Show that the complex roots of 푑(퐼; 푛) all lie in the circle 푚 + 1 푚 − 1 |푧 − | ≤ . | 2 | 2 Note that this conjecture implies part (f). (4) (a) Derive the generating function 푛 푥 푥 ∑ 푆(푛, 푘)푡푘 = 푒푡(푒 −1) 푛! 푛,푘≥0 in two ways: using the recursion for the 푆(푛, 푘) and using the generating func- tions in Table 4.1. (b) Rederive the egf for the Bell numbers 퐵(푛) using part (a). 푘 푛 (5) (a) Find a formula for ∑푛,푘≥0 푐(푛, 푘)푡 푥 /푛! and prove it in two ways: using the recursion for the 푐(푛, 푘) and using the generating functions in Table 4.1. (b) Rederive the egf for the permutation structure 픖(⋅) using part (a).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

136 4. Counting with Exponential Generating Functions

(6) (a) Let 푖푛 be the number of involutions in 픖푛. Show that 푖0 = 푖1 = 1 and for 푛 ≥ 2

푖푛 = 푖푛−1 + (푛 − 1)푖푛−2. (b) Show that 푛 푥 2 ∑ 푖 = 푒푥+푥 /2 푛 푛! 푛≥0 in two ways: using the recursion in part (a) and using the Exponential For- mula. (c) Given 퐴 ⊆ ℙ, let 푆(푛, 퐴) be the number of partitions of [푛] all of whose block sizes are elements of 퐴. Use the Exponential Formula to find and prove a 푛 formula for ∑≥0 푆(푛, 퐴)푥 /푛!. (d) Repeat part (c) for 푐(푛, 퐴), the number of permutations of [푛] all of whose cycles have lengths which are elements of 퐴. (7) Fill in the details for finding the egf and solving the differential equation inthe proof of Theorem 4.1.3. (8) (a) Use (4.14) and the Product Rule for labeled structures to show that 푆(푛, 2) = 2푛−1 − 1. (b) Use (4.15) and the Product Rule for labeled structures to show that 푛 1 푐(푛 + 1, 2) = 푛! ∑ . 푘 푘=1

(9) (a) Use the Theorem 4.4.2 to derive the egfs in Table 4.1 for the structures 푐표(⋅, 푘) and 푐(⋅, 푘). (b) Use part (a) to rederive the egf for the structure 픖(⋅). (10) (a) Suppose 풮 is a labeled structure satisfying (4.16) and 풯 is any labeled structure. Their composition, 풯 ∘ 풮, is the structure such that

(풯 ∘ 풮)(퐿) = {({푆1, 푆2, ... }, 푇) ∣ for all 퐿1/퐿2/ ... ⊢ 퐿

with 푆푖 ∈ 풮(퐿푖) for all 푖 and 푇 ∈ 풯({푆1, 푆2, ... })} . Prove that

퐹풯∘풮(푥) = 퐹풯(퐹풮(푥)). (b) Use (a) to reprove the Exponential Formula. (11) Let ℱ(퐿) be the labeled structures consisting of all forests with 퐿 as vertex set. Show that 푥푛 ∑ #ℱ([푛]) = exp(∑ 푛푛−2푥푛/푛! ). 푛! 푛≥0 푛≥1 (12) (a) Prove the identities (4.20). (b) Prove Theorem 4.5.2. (c) Reprove (4.22) by finding a bijection between the permutations of [푛], 푛 ≥ 2, which have an odd number of cycles and those which have an even number of cycles. (d) Find a formula for the number of permutations of [푛] having an even number of cycles in two ways: by using Theorem 4.5.2 and by using (4.22).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 137

(13) Let 푎푛 be the number of permutations in 픖푛 that have an even number of cycles, all of them of odd length. (a) Use a parity argument to show that if 푛 is odd, then 푎푛 = 0. (b) Use egfs to show that if 푛 is even, then 푛 푛! 푎 = ( ) . 푛 푛/2 2푛 (c) Use part (b) to show that if 푛 is even, then the probability that in tossing a fair coin 푛 times exactly 푛/2 heads occur is the same as the probability that a permutation chosen uniformly at random from 픖푛 has an even number of cycles, all of them of odd length. (d) Reprove part (c) by giving, when 푛 is even, a bijection between pairs (푆, 휋) [푛] [푛] where 푆 ∈ (푛/2) and 휋 ∈ 픖푛 and pairs (푇, 휎) where 푇 ∈ 2 and 휎 ∈ 픖푛 has an even number of cycles, all of them of odd length.

(14) Let 푗푛 be the number of involutions in 픖푛 which have no fixed points. (a) Give a combinatorial proof that 푗2푛+1 = 0 and that 푗2푛 = 1 ⋅ 3 ⋅ 5 ⋯ (2푛 − 1). (b) Use the Exponential Formula to find a simple expression for the exponential 푥푛 generating function ∑푛≥0 푗푛 푛! . (c) Use the exponential generating function from (b) to give a second derivation of the formula in (a).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 5

Counting with Partially Ordered Sets

Partially ordered sets, known as “posets” for short, give a fruitful way of ordering com- binatorial objects. In this way they provide new perspectives on objects we have already studied, such as interpreting various combinatorial invariants as rank functions. They also give us new and powerful tools to do enumeration such as the Möbius Inversion Theorem which generalizes the Principle of Inclusion and Exclusion.

5.1. Basic properties of partially ordered sets

A partially ordered set or poset is a pair (푃, ≤) where “푃” is a set and “≤” is a binary relation on 푃 satisfying the following axioms for all 푥, 푦, 푧 ∈ 푃: (a) (reflexivity) 푥 ≤ 푥, (b) (antisymmetry) if 푥 ≤ 푦 and 푦 ≤ 푥, then 푥 = 푦, and (c) (transitivity) if 푥 ≤ 푦 and 푦 ≤ 푧, then 푥 ≤ 푧. Often we will refer to the poset as just 푃 since the partial order will be obvious from context. We will also use notation for the standard order on ℝ in the setting of posets in the obvious way. For example 푥 < 푦 means 푥 ≤ 푦 but 푥 ≠ 푦, or using 푦 ≥ 푥 as equivalent to 푥 ≤ 푦. We say that 푥, 푦 ∈ 푃 are comparable if 푥 ≤ 푦 or 푦 ≤ 푥. Otherwise they are incomparable. A poset where every pair of elements is comparable is called a total order. There are standard partial orders on many of the combinatorial objects we have already studied and we list some of them here.

• The chain of length 푛 is (퐶푛, ≤) where 퐶푛 = {0, 1, ... , 푛} and 푖 ≤ 푗 is the usual ordering of the integers. So 퐶푛 is a total order. Note that 퐶푛 has 푛+1 elements and in some texts this would be referred to as an (푛 + 1)-chain. Although

139

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

140 5. Counting with Partially Ordered Sets

퐶푛 was also used as the notation for the graphical cycle with 푛 vertices, the context should make it clear which object is meant. [푛] • The Boolean algebra is (퐵푛, ⊆) where 퐵푛 = 2 and 푆 ⊆ 푇 is set containment. Be sure not to confuse 퐵푛 with the 푛th Bell number 퐵(푛).

• The divisor lattice is (퐷푛, ∣) where 퐷푛 consists of all the positive integers which divide evenly into 푛 and 푎 ∣ 푏 means that 푎 divides evenly into 푏 (in that 푏/푎 is an integer). So in 퐷12 we have 2 ≤ 6 but 2 ≰ 3. Note the distinction between 퐷푛 and the derangement number 퐷(푛).

• The lattice of partitions is (Π푛, ≤) where Π푛 is the set of all partitions of [푛] and 휌 ≤ 휏 means that every block of 휌 is contained in some block of 휏, called the refinement ordering. For example, in Π6 we have 14/2/36/5 ≤ 1245/36 because {1, 4}, {2}, and {5} are all contained in {1, 2, 4, 5} and {3, 6} is contained in itself. • Young’s lattice is (푌, ≤) where 푌 is the set of all integer partitions and 휆 ≤ 휇 is containment of Young diagrams as defined in Section 3.2.

• The lattice of compositions is (퐾푛, ≤) where 퐾푛 is the set of all compositions of 푛 and 훼 ≤ 훽 is refinement of compositions: 훼 can be obtained from 훽 by replacing each 훽푖 by a composition [훼푗, 훼푗+1, ... , 훼푘] ⊧ 훽푖. For example, in 퐾11 we have [2, 3, 2, 1, 3] ≤ [2, 5, 4] because the first 2 in [2, 5, 4] was replaced by itself, the 5 was replaced by [3, 2], and the 4 by [1, 3]. On the other hand [2, 3, 1, 2, 3] ≰ [2, 5, 4]. Again, there is a notational overlap with 퐾푛 as the complete graph on 푛 vertices, but the two will never appear together. • The pattern poset is (픖, ≤) where 픖 is the set of all permutations and 휋 ≤ 휎 means that 휎 contains 휋 as a pattern. • The subspace lattice is (퐿(푉), ≤) where 퐿(푉) is the set of subspaces of a finite 푛 vector space 푉 over 픽푞 and 푈 ≤ 푊 means 푈 is a subspace of 푊. If 푉 = 픽푞 , then we often denote this poset by 퐿푛(푞).

There are also important partial orders on permutations which reflect their group structure (strong and weak Bruhat order) and on certain subgraphs of a graph (the bond lattice). We will define the latter later when it is needed. Often a poset (푃, ≤) is represented by a certain (di)graph which can be easier to work with than just using the axioms. If 푥, 푦 ∈ 푃, then we say that 푥 is covered by 푦 or 푦 covers 푥, written either 푥 ⋖ 푦 or 푦 ⋗ 푥, if 푥 < 푦 and there is no 푧 ∈ 푃 with 푥 < 푧 < 푦. The Hasse diagram of 푃 is the graph with vertices 푃 and an edge from 푥 up to 푦 if 푥 ⋖ 푦. Note that this is actually a digraph where all the arcs are directed up and so are just written as edges with this understanding. Hasse diagrams for examples from the above list are given in Figure 5.1. For those which are infinite, only the bottom of the 2 poset is displayed. Note that in the case of the subspace lattice 푉 = 픽3 the subspaces are listed using their row-echelon forms. We will make no distinction between a poset and its Hasse diagram if no confusion will result by blurring the distinction. Finally, certain posets such as the real numbers under their normal ordering have no covers. So in such cases it does not make sense to try to draw a Hasse diagram.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.1. Basic properties of partially ordered sets 141

{1, 2, 3} 3

2 {1, 2} {1, 3} {2, 3} 1 {1} {2} {3} 0 ∅ 퐶3

퐵3

18 123

6 9 12/3 13/2 1/23 2 3

1 1/2/3

퐷18 Π3

⋮ ⋮ [4]

[3, 1] [2, 2] [1, 3]

[2, 1, 1] [1, 2, 1] [1, 1, 2]

∅ [1, 1, 1, 1]

푌 퐾4

1 0 ⋮ [ ] 123 132 213 231 312 321 0 1

12 21 [ 0 1 ] [ 1 2 ] [ 1 0 ] [ 1 1 ] 1

∅ {(0, 0)}

픖 퐿2(3)

Figure 5.1. A zoo of Hasse diagrams

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

142 5. Counting with Partially Ordered Sets

푐 푑

푎 푏

Figure 5.2. Minimal and maximal elements

There are certain parts of a poset 푃 to which we will often refer. A minimal element of 푃 is 푥 ∈ 푃 such that there is no 푦 ∈ 푃 with 푦 < 푥. Note that 푃 can have multiple minimal elements. The poset in Figure 5.2 has minimal elements 푎, 푏. Dually, define a maximal element of 푃 to be 푥 ∈ 푃 with no 푦 > 푥 in 푃. The example poset just cited has maximal elements 푐, 푑. By way of contrast, 푃 has a minimum element if there is 푥 ∈ 푃 such that 푥 ≤ 푦 for every 푦 ∈ 푃. A minimum element is unique if it exists because if 푥 and 푥′ are both minimum elements, then 푥 ≤ 푥′ and 푥′ ≤ 푥 which forces 푥 = 푥′ by antisymmetry. In this case the minimum element is often denoted 0̂. All of the posets in Figure 5.1 have a 0̂. In fact, in 퐷푛 we have 0̂ = 1, a rare instance where one can write that zero equals one and be mathematically correct! Again, there is the dual notion of a maximum element 푥 ∈ 푃 which satisfies 푥 ≥ 푦 for all 푦 ∈ 푃. A maximum is unique if it exists and is denoted 1̂. The following result sums up the existence of minimum and maximum elements for the posets in Figure 5.1. Its proof is sufficiently easy that itis left as an exercise. Proposition 5.1.1. We have the following minimum and maximum elements.

• In 퐶푛 we have 0̂ = 0, 1̂ = 푛.

• In 퐵푛 we have 0̂ = ∅, 1̂ = [푛].

• In 퐷푛 we have 0̂ = 1, 1̂ = 푛.

• In Π푛 we have 0̂ = 1/2/ ... /푛, 1̂ = [푛]. • In 푌 we have 0̂ = ∅ and no 1̂. 푛 • In 퐾푛 we have 0̂ = [1 ], 1̂ = [푛]. • In 픖 we have 0̂ = ∅ and no 1̂. • In 퐿(푉) we have 0̂ is the zero subspace, 1̂ = 푉. □

As one can tell from the previous set of definitions, it is sometimes useful to reverse inequalities in a poset. So if 푃 is a poset, then we define its dual 푃∗ to have the same underlying set with 푥 ≤ 푦 in 푃∗ if and only if 푦 ≤ 푥 in 푃. The Hasse diagram of 푃∗ is thus obtained by reflecting the one for 푃 in a horizontal axis. As is often the case in mathematics, we analyze a structure by looking at its sub- structures. In posets 푃, these come in several varieties. A subposet of 푃 is a subset 푄 ⊆ 푃 with the inherited partial order; namely 푥 ≤ 푦 for 푥, 푦 ∈ 푄 if and only if 푥 ≤ 푦

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.1. Basic properties of partially ordered sets 143

{1, 2, 3, 5, 6}

{1, 2, 3, 6} {1, 2, 3, 5} {1, 3, 5, 6} [{1, 3}, {1, 2, 3, 5, 6}] = {1, 2, 3} {1, 3, 5} {1, 3, 6}

{1, 3}

Figure 5.3. An interval in 퐵7

in 푃. In such a case we will sometimes use a subscript to make precise which poset is being considered, as in 푥 ≤푃 푦. Note that some authors call this an induced subposet and use the term “subposet” when 푄 satisfies the weaker condition that 푥 ≤푄 푦 im- plies 푥 ≤푃 푦 (but not necessarily conversely). There are several especially important subposets. Assume 푥, 푦 ∈ 푃. Then the corresponding closed interval is

[푥, 푦] = {푧 ∈ 푃 ∣ 푥 ≤ 푧 ≤ 푦}.

Note that [푥, 푦] = ∅ unless 푥 ≤ 푦. For example, the Hasse diagram of the interval [{1, 3}, {1, 2, 3, 5, 6}] in 퐵7 is displayed in Figure 5.3. Note that if one removes the la- bels, then this diagram is exactly like the one for 퐵3 in Figure 5.1. We will explain this formally when we introduce the concept of isomorphism below. Open and half-open intervals in a poset are defined as expected. A subset 퐼 ⊆ 푃 is a lower-order ideal if 푥 ∈ 퐼 and 푦 ≤ 푥 imply that 푦 ∈ 퐼. For example, if 푃 has a 0̂, then any interval of the form [0,̂ 푥] is a lower-order ideal. If 푆 ⊆ 푃 is any subset, then the lower-order ideal generated by 푆 is

퐼(푆) = {푦 ∈ 푃 ∣ 푦 ≤ 푥 for some 푥 ∈ 푆}.

We often leave out the set braces in 푆, for example, writing 퐼(푥, 푦) for the ideal generated by 푆 = {푥, 푦} ⊆ 푃. If #푆 = 1, then the order ideal is called principal. If 푃 has a 0̂, then 퐼(푥) = [0,̂ 푥]. In Young’s lattice, 퐼(휆) is just all the partitions contained in 휆. So when we have a rectangle 휆 = (푘푙), then 퐼(휆) = ℛ(푘, 푙), the set which came into play when discussing the 푞-binomial coefficients in Section 3.2. upper-order ideals 푈 as well as those generated by a set, 푈(푆), are defined by reversing all the inequalities. We will sometimes abbreviate “lower-order ideal” to “order ideal” or even just “ideal,” whereas for upper-order ideals both adjectives will always be used. Some simple properties of ideals are given in the next proposition.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

144 5. Counting with Partially Ordered Sets

Proposition 5.1.2. Let 푃 be a poset.

(a) We have that 퐼 ⊆ 푃 is a lower-order ideal if and only if 푃 − 퐼 is an upper-order ideal. (b) If 푃 is finite and 퐼 is a lower-order ideal, then 퐼 = 퐼(푆) where 푆 is the set of maximal elements of 퐼. (c) If 푃 is finite and 푈 is an upper-order ideal, then 푈 = 푈(푆) where 푆 is the set of minimal elements of 푈.

Proof. We will prove (b) and leave the other two parts to the reader. We will prove the equality by proving the two corresponding set containments. Suppose 푥 ∈ 퐼 and consider the set 푋 = {푦 ∈ 퐼 ∣ 푦 ≥ 푥}. This subset of 푃 is nonempty since 푥 ∈ 푋. So 푋 has at least one maximal element 푦 since 푃 is finite. In fact, 푦 must be maximal in 퐼 since, if not, there is 푧 > 푦 with 푧 ∈ 퐼. But then by transitivity 푧 > 푥 so that 푧 ∈ 푋. This contradicts the maximality of 푦 in 푋. So 푦 ∈ 푆 and 푥 ∈ 퐼(푆) showing that 퐼 ⊆ 퐼(푆). To show 퐼(푆) ⊆ 퐼, take 푦 ∈ 퐼(푆). By definition 푦 ≤ 푥 for some 푥 ∈ 푆 and 푆 ⊆ 퐼. So 푦 ≤ 푥 where 푥 ∈ 퐼 which forces 푦 ∈ 퐼 by definition of lower-order ideal. □

To define isomorphism, we need to consider maps on posets. Assume posets 푃, 푄. Then a function 푓 ∶ 푃 → 푄 is order preserving if

푥 ≤푃 푦 ⟹ 푓(푥) ≤푄 푓(푦).

For example, the map 푓 ∶ 퐶푛 → 퐵푛 by 푓(푖) = [푖] is order preserving because 푖 ≤ 푗 implies that 푓(푖) = [푖] ⊆ [푗] = 푓(푗). We say that 푓 is an isomorphism or that 푃 and 푄 are isomorphic, written 푃 ≅ 푄, if 푓 is bijective and both 푓 and 푓−1 are order preserving. It is important to show that 푓−1, and not just 푓, is order preserving. For consider the two posets in Figure 5.4. Define a bijection 푓 ∶ 푃 → 푄 by 푓(푎) = 푎′ and 푓(푏) = 푏′. Then 푓 is order preserving vacuously since there are no order relations in 푃. But clearly we do not want 푓 to be an isomorphism since 푃 and 푄 have different (unlabeled) Hasse diagrams. This is witnessed by the fact that 푓−1 is not order preserving: we have 푎′ ≤ 푏′ but 푓−1(푎′) = 푎 ≰ 푏 = 푓−1(푏′). There are a number of isomorphisms involving the posets in Figure 5.1. Some of them are collected in the next result.

푏′

푃 = 푄 = 푎 푏 푎′

Figure 5.4. Two nonisomorphic posets

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.2. Chains, antichains, and operations on posets 145

Proposition 5.1.3. We have the following isomorphisms. In all cases we assume the intervals used are nonempty. ∗ (a) 퐶푛 ≅ 퐶푛.

(b) If 푖, 푗 ∈ 퐶푛, then [푖, 푗] ≅ 퐶푗−푖. ∗ (c) 퐵푛 ≅ 퐵푛.

(d) If 푆, 푇 ∈ 퐵푛, then [푆, 푇] ≅ 퐵|푇−푆|. ∗ (e) 퐷푛 ≅ 퐷푛.

(f) If 푙, 푚 ∈ 퐷푛, then [푙, 푚] ≅ 퐷푚/푙.

(g) If 푛 is a product of 푘 distinct primes, then 퐷푛 ≅ 퐵푘.

(h) If 휌 ∈ Π푛 has 푘 blocks, then [휌, 1]̂ ≅ Π푘.

(i) For all 푛 we have 퐾푛 ≅ 퐵푛−1.

(j) If 푉, 푊 are vector spaces over 픽푞 of the same dimension, then 퐿(푉) ≅ 퐿(푊). (k) If 푋, 푌 ∈ 퐿(푉), then [푋, 푌] ≅ 퐿(푋/푌) where 푋/푌 is the quotient vector space.

Proof. We will prove the statement about [푆, 푇] in 퐵푛 and leave the rest of the iso- morphisms as exercises. Let 푇 − 푆 = {푡1, 푡2, ... , 푡푛} where 푛 = |푇 − 푆|. Define a map ′ ′ 푓 ∶ [푆, 푇] → 퐵푛 as follows. If 푋 ∈ [푆, 푇], then 푋 = 푆 ⊎ 푋 where 푋 ⊆ 푇 − 푆. If ′ 푋 = {푡푖, ... , 푡푗}, then let 푓(푋) = {푖, ... , 푗}. This is well-defined since, by definition, 푖, ... , 푗 ∈ [푛].

To show that 푓 is a bijection, we construct its inverse. Assume 퐼 = {푖, ... , 푗} ∈ 퐵푛. −1 Then let 푓 (퐼) = 푆 ⊎ {푡푖, ... , 푡푗}. The proof that this is well-defined is similar to that of 푓, and the fact that these are inverses is clear from their definitions. Finally, we need to show that 푓 and 푓−1 are order preserving. If 푋 ≤ 푌 in [푆, 푇], ′ ′ then 푋 ⊆ 푌 in 푇 − 푆. It follows that 푓(푋) ≤ 푓(푌) in 퐵푛. Thus 푓 is order preserving. −1 As far as 푓 , take 퐼 ≤ 퐽 in 퐵푛. Then the corresponding sets 푇퐼 and 푇퐽 in 푇 − 푆 gotten by using 퐼 and 퐽 for subscripts satisfy 푇퐼 ⊆ 푇퐽 . It follows that −1 −1 푓 (퐼) = 푆 ⊎ 푇퐼 ⊆ 푆 ⊎ 푇퐽 = 푓 (퐽) which shows that 푓−1 is order preserving. □

5.2. Chains, antichains, and operations on posets

We will now consider three operations for building posets. Chains and the related notion of antichains will play important roles.

Given posets (푃, ≤푃) and (푄, ≤푄) with 푃 ∩ 푄 = ∅, their disjoint union is the poset whose elements are 푃 ⊎ 푄 with the partial order 푥 ≤푃⊎푄 푦 if

(a) 푥, 푦 ∈ 푃 and 푥 ≤푃 푦 or

(b) 푥, 푦 ∈ 푄 and 푥 ≤푄 푦. So one just takes the relations in 푃 and 푄 and does not add any new ones. To illustrate, the poset on the left in Figure 5.4 is the disjoint union 푃 = {푎} ⊎ {푏}. If one takes both posets in this figure, then 푃 ⊎ 푄 is the poset on {푎, 푏, 푎′, 푏′} with 푎′ < 푏′ being the only

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

146 5. Counting with Partially Ordered Sets

strict order relation. An important example of disjoint union is the 푛-element antichain 퐴푛 which consists of a set of 푛 elements with no strict order relations. So 푃 = {푎} ⊎ {푏} is a copy of 퐴2. Another way to combine disjoint posets is their ordinal sum 푃 + 푄 which has el- ements 푃 ⊎ 푄 and 푥 ≤푃+푄 푦 if one of (a), (b), (c) hold where (a) and (b) are as in the previous paragraph and the third possibility is (c) 푥 ∈ 푃, 푦 ∈ 푄. Intuitively, one takes the relations in 푃 and 푄 and also makes everything in 푃 smaller than everything in 푄. As an example using chains, we have 퐶푚 + 퐶푛 ≅ 퐶푚+푛+1. Note that in general 푃 + 푄 ≇ 푄 + 푃 as the use of the adjective “ordinal” suggests. Our third method to produce new posets from old ones is via products. Given two (not necessarily disjoint) posets (푃, ≤푃) and (푄, ≤푄), their (direct or Cartesian) product has underlying set 푃 × 푄 = {(푥, 푦) ∣ 푥 ∈ 푃, 푦 ∈ 푄} together with the partial order ′ ′ ′ ′ (푥, 푦) ≤푃×푄 (푥 , 푦 ) if 푥 ≤푃 푥 and 푦 ≤푄 푦 . We let 푃푛 denote the 푛-fold product of 푃. One can obtain the Hasse diagram for 푃 × 푄 by replacing each vertex of 푄 with a copy of 푃 and then, for each edge between two vertices of 푄, connecting each pair of vertices having the same first coordinate in the corresponding two copies of 푃. Illustrations of this can be found in Figure 5.1. For 3 example, 퐷18 looks like a rectangle because it is isomorphic to 퐶1 × 퐶2. Also, 퐵3 ≅ 퐶1 , which is why the Hasse diagram looks like the projection of a 3-dimensional cube into the plane. Both of these isomorphisms generalize, and there is one for Π푛 as well. Proposition 5.2.1. We have the following product decompositions. (a) We have 푛 퐵푛 ≅ 퐶1 . 푛1 푛2 푛푘 (b) If the prime factorization of 푛 is 푛 = 푝1 푝2 ⋯ 푝푘 , then

퐷푛 ≅ 퐶푛1 × 퐶푛2 × ⋯ × 퐶푛푘 .

(c) If 휌 ≤ 휏 in Π푛, then

[휌, 휏] ≅ Π푛1 × Π푛2 × ⋯ × Π푛푘

where 휏 = 푇1/푇2/ ... /푇푘 and 푛푖 is the number of blocks of 휌 contained in 푇푖 for all 푖.

Proof. (a) Consider the map 푓 used in the proof of Theorem 1.3.1. Stated in the lan- 푛 guage of posets, we see that 푓 ∶ 퐵푛 → 퐶1 . And we have already shown that 푓 is bijec- tive. So we only need to prove that 푓 and 푓−1 are order preserving. Suppose 푆, 푇 ⊆ [푛] with 푓(푆) = (푣1, ... , 푣푛) and 푓(푇) = (푤1, ... , 푤푛). Now 푆 ⊆ 푇 if and only if 푖 ∈ 푆 implies 푖 ∈ 푇. By the definition of 푓, this is equivalent to 푣푖 = 1 implying 푤푖 = 1. But this means that 푣푖 ≤ 푤푖 for all 푖 since if 푣푖 = 0, then 푣푖 ≤ 푤푖 is automatic. Thus we 푛 have shown that 푆 ≤ 푇 in 퐵푛 if and only if (푣1, ... , 푣푛) ≤ (푤1, ... , 푤푛) in 퐶1 which is what we wished to prove.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.2. Chains, antichains, and operations on posets 147

(b) The map for 퐷 is similar. Define 푔∶ 퐷 → ⨉ 퐶 by 푔(푑) = (푑 , 푑 , ... , 푑 ) 푛 푛 푖 푛푖 1 2 푘 푑푖 where 푑 = ∏푖 푝푖 . The reader can now verify that this is a well-defined isomorphism of posets. (c) The construction of the isomorphism is messy but not conceptually difficult. Consider blocks of 휌 as being single elements and then aggregate all those lying in a given block of 휏 together. Again, the reader can fill in the details. □

So chains can help us understand various posets by taking products. There is an- other important way in which chains can be used to decompose certain posets. Let 푃 be a poset and 퐶 ∶ 푥0 < 푥1 < ⋯ < 푥푛 be a chain in 푃. We say that 퐶 is a chain of length 푛 from 푥0 to 푥푛 and also use the term 푥0–푥푛 chain. We let ℓ(퐶) denote the length of 퐶. Call 퐶 maximal if it is not strictly contained in a larger chain of 푃. If [푥0, 푥푛] is finite, this is equivalent to 푥푖 < 푥푖+1 being a cover for all 0 ≤ 푖 < 푛. The chain 퐶 is saturated if it is not strictly contained in a larger chain from 푥0 to 푥푛. Equivalently, 퐶 is saturated if it is maximal in [푥0, 푥푛]. For example, in 퐷18 the chain 3 < 6 < 18 is a chain of length 2 from 3 to 18 which is saturated since each inequality is a cover. But this chain is not maximal since it is contained in the larger chain 1 < 3 < 6 < 18. Some posets can be written as a disjoint union of certain subposets called ranks as follows. Suppose 푃 is a poset which is locally finite in that the cardinality of any interval [푥, 푦] of 푃 is finite. All the posets in Figure 5.1 are locally finite even though 푌 and 픖 are not finite. The poset of real numbers with its usual total order is not locally finite. Let 푃 be locally finite and have a 0̂. Then 푃 is ranked if for any 푥 ∈ 푃 all saturated chains from 0̂ to 푥 have the same length. In this case we call this common length the rank of 푥 and it is denoted rk 푥 or rk푃 푥 if we wish to be specific about the poset. For 푘 ∈ ℕ, the 푘th rank set of 푃 is

(5.1) Rk푘 푃 = {푥 ∈ 푃 ∣ rk 푥 = 푘}. If 푃 is finite, then we define its rank to be

rk 푃 = max{푘 ∣ Rk푘 푃 ≠ ∅}. All the posets in Figure 5.1 are ranked and we will describe their rank sets shortly. An example of a poset which is not ranked is in Figure 5.5. This is because there are two saturated 0̂–1̂ chains, namely 0̂ < 푏 < 1̂ which is of length 2 and 0̂ < 푎 < 푐 < 1̂ which is of length 3. We will now list the rank information for the posets in Figure 5.1. Note how many of the combinatorial concepts which were introduced in earlier chapters occur natu- rally in this context. These results are easily proved, so the demonstrations can be filled in by the reader.

Proposition 5.2.2. All the posets in Figure 5.1 are ranked with the following rank func- tions.

(a) If 푘 ∈ 퐶푛, then rk(푘) = 푘, so Rk푘(퐶푛) = {푘}. Also, rk(퐶푛) = 푛. [푛] (b) If 푆 ∈ 퐵푛, then rk(푆) = #푆, so Rk푘(퐵푛) = ( 푘 ). Also, rk(퐵푛) = 푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

148 5. Counting with Partially Ordered Sets

푐 푏 푎

Figure 5.5. An unranked poset

푑1 푑푟 (c) If 푑 ∈ 퐷푛 has prime factorization 푑 = 푝1 ⋯ 푝푟 , then rk(푑) = 푑1 + ⋯ + 푑푟. So Rk푘(퐷푛) is the set of 푑 ∣ 푛 with 푘 primes in their prime factorization, counted with multiplicity. Also, rk(퐷푛) is the total number of primes dividing 푛, counted with multiplicity.

(d) If 휌 ∈ Π푛 has 푏 blocks, then rk(휌) = 푛 − 푏, so Rk푘(Π푛) = 푆([푛], 푛 − 푘). Also, rk(Π푛) = 푛 − 1.

(e) If 휆 ∈ 푌, then rk(휆) = |휆|, so Rk푘(푌) = 푃(푘).

(f) If 훼 ∈ 퐾푛 has 푐 parts, then rk(훼) = 푛 − 푐, so Rk푘(퐾푛) = 푄(푛, 푛 − 푘). Also, rk(퐾푛) = 푛 − 1.

(g) If 휋 ∈ 픖, then rk(휋) = |휋|, so Rk푘(픖) = 픖푘. 푉 (h) If 푊 ∈ 퐿(푉), then rk(푊) = dim 푊, so Rk푘(푉) = [푘]. Also, rk(퐿(푉)) = dim 푉. □

5.3. Lattices

The reader will have noticed that several of our example posets are called “lattices”. This is an important class of partially ordered sets with the property that pairs of ele- ments have greatest lower bounds and least upper bounds. It is also common to study lattices whose elements satisfy certain identities using these two operations. In this section we will prove a theorem characterizing the lattices satisfying a distributive law. If 푃 is a poset and 푥, 푦 ∈ 푃, then a lower bound for 푥, 푦 is a 푧 ∈ 푃 such that 푧 ≤ 푥 and 푧 ≤ 푦. For example, if 푆, 푇 ∈ 퐵푛, then any set contained in both 푆 and 푇 is a lower bound. We say that 푥, 푦 have a greatest lower bound or meet if there is an element in 푃, denoted 푥 ∧ 푦, which is a lower bound for 푥, 푦, and 푥 ∧ 푦 ≥ 푧 for all lower bounds 푧 of 푥 and 푦. Returning to 퐵푛, we have 푆 ∧ 푇 = 푆 ∩ 푇. In fact, one can remember the notation for meet as just a squared-off intersection symbol. Note that if the meetof 푥, 푦 exists, then it is unique. Indeed, if 푧, 푧′ are both greatest lower bounds of 푥, 푦, then we have 푧 ≥ 푧′ since 푧′ is a lower bound and 푧 is the greatest lower bound. But interchanging the roles of 푧 and 푧′ also gives 푧′ ≥ 푧. So 푧 = 푧′ by antisymmetry. Note also that it is possible for the meet not to exist. For example, in the poset of Figure 5.2, 푎 ∧ 푏 does

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.3. Lattices 149

not exist because this pair has no lower bound. Also, 푐 ∧ 푑 does not exist but this is because the pair has both 푎 and 푏 as lower bounds, but there is no lower bound larger than both 푎 and 푏. One can extend these definitions in the obvious way from pairs of elements to any nonempty set of elements 푋 = {푥1, ... , 푥푛} ⊆ 푃. In this case the meet is denoted 푋 = 푥. ⋀ ⋀ 푥∈푋 One can also reasonably define the meet of the empty set as longas 푃 has a 1̂. Indeed, for any 푥 ∈ 푃 we would want

푥 ∧ ( ∅) = ({푥} ∪ ∅) = {푥} = 푥. ⋀ ⋀ ⋀ But the only element 푦 of 푃 such that 푥 ∧ 푦 = 푥 for all 푥 is 푦 = 1̂. So we let ∧∅ = 1̂. The concepts of upper bound and least upper bound are obtained by reversing the inequalities in the definitions of the previous paragraph. If the least upper boundof 푥, 푦 exists, then it is denoted 푥 ∨ 푦 and is also called their join.A lattice is a poset such that every pair of elements has a meet and a join. Note that this is different from the use of the term “lattice” as in “lattice path” which in this case refers to the discrete subgroup ℤ2 of ℝ2. The context should make it clear which meaning is intended. All the poset families in Figure 5.1 are lattices except for the pattern poset 픖 which has subposets isomorphic to Figure 5.2 between its second and third ranks. To describe the meets and joins we need the following terminology. If 푐, 푑 ∈ ℙ, then gcd(푐, 푑) and lcm(푐, 푑) denote their greatest common divisor and least common multiple, respec- tively. Given two Young diagrams 휆, 휇, we then take their intersection 휆 ∩ 휇 or union 휆 ∪ 휇 by aligning them as in Figure 3.1 and then taking the intersection or union of their sets of squares, respectively. If 푈, 푊 are vector subspaces of 푉, then their sum is 푈 + 푊 = {푢 + 푤 ∣ 푢 ∈ 푈, 푤 ∈ 푊}. We leave the verification of the next result to the reader.

Proposition 5.3.1. The posets 퐶푛, 퐵푛, 퐷푛,Π푛, 푌, 퐾푛, and 퐿(푉) are all lattices for all 푛 and 푉 of finite dimension over some 픽푞. In addition, we have the following descriptions of their meets and joins.

(a) If 푖, 푗 ∈ 퐶푛, then 푖 ∧ 푗 = min{푖, 푗} and 푖 ∨ 푗 = max{푖, 푗}.

(b) If 푆, 푇 ∈ 퐵푛, then 푆 ∧ 푇 = 푆 ∩ 푇 and 푆 ∨ 푇 = 푆 ∪ 푇.

(c) If 푐, 푑 ∈ 퐷푛, then 푐 ∧ 푑 = gcd(푐, 푑) and 푐 ∨ 푑 = lcm(푐, 푑).

(d) Suppose 휌, 휏 ∈ Π푛. Then 휌 ∧ 휏 is the partition whose blocks are the nonempty intersections of the form 퐵∩퐶 for blocks 퐵 ∈ 휌, 퐶 ∈ 휏. Also, 휌∨휏 is the partition such that 푏, 푐 are in the same block of the join if and only if there is a sequence of blocks 퐷1, ... , 퐷푚 where each 퐷푖 is a block of either 휌 or 휏, 푏 ∈ 퐷1, 푐 ∈ 퐷푚, and 퐷푖 ∩ 퐷푖+1 ≠ ∅ for all 푖. (e) If 휆, 휇 ∈ 푌, then 휆 ∧ 휇 = 휆 ∩ 휇 and 휆 ∨ 휇 = 휆 ∪ 휇. (f) If 푈, 푊 ∈ 퐿(푉), then 푈 ∧ 푊 = 푈 ∩ 푊 and 푈 ∨ 푊 = 푈 + 푊. □

Next we will give a list of some elementary properties of lattices.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

150 5. Counting with Partially Ordered Sets

Proposition 5.3.2. Let 퐿 be a lattice. Then the following are true for all 푥, 푦, 푧 ∈ 퐿.

(a) (Idempotent law) 푥 ∧ 푥 = 푥 ∨ 푥 = 푥. (b) (Commutative law) 푥 ∧ 푦 = 푦 ∧ 푥 and 푥 ∨ 푦 = 푦 ∨ 푥. (c) (Associative law) (푥 ∧ 푦) ∧ 푧 = 푥 ∧ (푦 ∧ 푧) and (푥 ∨ 푦) ∨ 푧 = 푥 ∨ (푦 ∨ 푧). (d) (Absorption law) 푥 ∧ (푥 ∨ 푦) = 푥 = 푥 ∨ (푥 ∧ 푦). (e) 푥≤푦 ⟺ 푥∧푦=푥 ⟺ 푥∨푦=푦. (f) If 푥 ≤ 푦, then 푥 ∧ 푧 ≤ 푦 ∧ 푧 and 푥 ∨ 푧 ≤ 푦 ∨ 푧. (g) 푥 ∧ (푦 ∨ 푧) ≥ (푥 ∧ 푦) ∨ (푥 ∧ 푧) and 푥 ∨ (푦 ∧ 푧) ≤ (푥 ∨ 푦) ∧ (푥 ∨ 푧). (h) If 푋 is a finite, nonempty subset of 퐿, then ⋀ 푋 and ⋁ 푋 exist. (i) If 퐿 is finite, then 퐿 has a 0̂ and a 1̂. (j) The dual 퐿∗ is a lattice. (k) If 푀 is a poset and there is an isomorphism 푓 ∶ 퐿 → 푀, then 푀 is also a lattice. Furthermore, for 푥, 푦 ∈ 퐿,

푓(푥 ∧ 푦) = 푓(푥) ∧ 푓(푦) and 푓(푥 ∨ 푦) = 푓(푥) ∨ 푓(푦).

Proof. The proofs of these results are straightforward. So we will give a demonstration for the first inequality in (g) and leave the rest to the reader. By definition oflower bound, 푥 ∧ 푦 ≤ 푥. And similarly 푥 ∧ 푦 ≤ 푦 ≤ 푦 ∨ 푧. So by definition of greatest lower bound, 푥∧푦 ≤ 푥∧(푦∨푧). Similar reason gives 푥∧푧 ≤ 푥∧(푦∨푧). Using the previous two inequalities and the definition of least upper bound yields (푥 ∧ 푦) ∨ (푥 ∧ 푧) ≤ 푥 ∧ (푦 ∨ 푧) which is what we wished to prove. □

It is certainly possible for the inequalities in (g) of the previous proposition to be strict. For example, using the lattice in Figure 5.5 we have

푐 ∧ (푎 ∨ 푏) = 푐 ∧ 1̂ = 푐

while (푐 ∧ 푎) ∨ (푐 ∧ 푏) = 푎 ∨ 0̂ = 푎.

However, when we have equality in one of these two inequalities, it is forced in the other.

Proposition 5.3.3. Let 퐿 be a lattice. Then

푥 ∧ (푦 ∨ 푧) = (푥 ∧ 푦) ∨ (푥 ∧ 푧) for all 푥, 푦, 푧 ∈ 퐿

if and only if 푥 ∨ (푦 ∧ 푧) = (푥 ∨ 푦) ∧ (푥 ∨ 푧) for all 푥, 푦, 푧 ∈ 퐿.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.3. Lattices 151

Proof. For the forward direction we have

(푥 ∨ 푦) ∧ (푥 ∨ 푧) = [(푥 ∨ 푦) ∧ 푥] ∨ [(푥 ∨ 푦) ∧ 푧]

= 푥 ∨ [(푥 ∨ 푦) ∧ 푧]

= 푥 ∨ [(푥 ∧ 푧) ∨ (푦 ∧ 푧)]

= [푥 ∨ (푥 ∧ 푧)] ∨ (푦 ∧ 푧)

= 푥 ∨ (푦 ∧ 푧),

which is the desired equality. The proof of the other direction is similar. □

A lattice which satisfies either of the two equalities in Proposition 5.3.3 is called a distributive lattice, and these equations are called the distributive laws. Of the lattices in Proposition 5.3.1, Π푛 is not distributive for 푛 ≥ 3. In fact, taking 푧, 푦, 푧 to be the three elements of rank 1 in Π3 gives a strict inequality in the defining relation. Similarly, 퐿(푉) is not distributive for dim 푉 ≥ 2. All the other lattices are distributive as the reader will be asked to show in the exercises. □ Proposition 5.3.4. The posets 퐶푛, 퐵푛, 퐷푛, 푌, 퐾푛 are distributive lattices for all 푛.

We will now prove a beautiful theorem characterizing finite distributive lattices due to Birkhoff [14]. It says that every such lattice is essentially a set of lower-order ideals partially ordered by inclusion. Given a poset 푃, consider

풥(푃) = {퐼 ∣ 퐼 is a lower-order ideal of 푃}.

{푟, 푠, 푡, 푢}

푢 {푟, 푠, 푡} {푟, 푡, 푢} 푠 푡 {푟, 푠} {푟, 푡}

푟 {푟}

풥(푃)

Figure 5.6. A poset and its associated distributive lattice

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

152 5. Counting with Partially Ordered Sets

Turn 풥(푃) into a poset by letting 퐼 ≤ 퐽 if 퐼 ⊆ 퐽. For example, a poset 푃 and the corre- sponding poset 풥(푃) are shown in Figure 5.6. Our first order of business is to show that 풥(푃) is always a distributive lattice. Proposition 5.3.5. If 푃 is any poset, then 풥(푃) is a distributive lattice.

Proof. It suffices to show that if 퐼, 퐽 ∈ 풥(푃), then so are 퐼 ∩ 퐽 and 퐼 ∪ 퐽. Indeed, this suffices because these are the greatest lower bound and least upper bound ifone considers all subsets of 푃, and set intersection and union satisfy the distributive laws. We will demonstrate this for 퐼 ∩ 퐽 since the argument for union is similar. We need to show that 퐼 ∩ 퐽 is a lower-order ideal of 푃. So take 푥 ∈ 퐼 ∩ 퐽 and 푦 ≤ 푥. Now 푥 ∈ 퐼 and 푥 ∈ 퐽. Since both sets are ideals, this implies 푦 ∈ 퐼 and 푦 ∈ 퐽. So 푦 ∈ 퐼 ∩ 퐽 which is what we needed to show. □

The amazing thing is that every finite distributive lattice is of the form 풥(푃) for some poset 푃. In order to prove this, we need a way that, given a distributive lattice 퐿, we can identify the 푃 from which it was built. This is done using certain elements of 푃 which we now define. Let 퐿 be a finite lattice so that 퐿 has a 0̂ by Proposition 5.3.2(i). Call 푥 ∈ 퐿 − {0}̂ join irreducible if 푥 cannot be written as 푥 = 푦 ∨ 푧 where 푦, 푧 < 푥. Equivalently, if 푥 = 푦 ∨ 푧, then 푦 = 푥 or 푧 = 푥. Let Irr(퐿) = {푥 ∈ 퐿 ∣ 푥 is join irreducible}. It turns out the join irreducibles are easy to spot in the Hasse diagram of 퐿. Also, the join irreducibles under a given element join to give that element. Proposition 5.3.6. Let 퐿 be a finite lattice. (a) Element 푥 ∈ 퐿 is join irreducible if and only if 푥 covers exactly one element. (b) For any 푥 ∈ 퐿, if we let

(5.2) 퐼푥 = {푟 ≤ 푥 ∣ 푟 ∈ Irr(퐿)},

then 푥 = ⋁ 퐼푥.

Proof. (a) First note that, by definition, 0̂ is not join irreducible. And 0̂ covers no elements, so the proposition is true in that case. Now assume 푥 ≠ 0̂. For the forward direction suppose, towards a contradiction, that 푥 covers (at least) two elements 푦, 푧. But then 푥 = 푦 ∨ 푧 since 푥 is clearly an upper bound for 푦, 푧 and there can be no smaller bound because of the covering relations. This contradicts the fact that 푥 is join irreducible. For the converse, let 푥 cover a unique element 푥′. If 푥 = 푦 ∨ 푧 with 푦, 푧 < 푥, then we must have 푦, 푧 ≤ 푥′ because 푥 covers no other element. This forces 푦 ∨ 푧 ≤ 푥′ < 푥 which is the desired contradiction. (b) We induct on the number of elements in [0,̂ 푥]. If 푥 = 0̂, then the statement is true because 퐼푥 = ∅ and the empty join equals 0̂ just as the empty meet equals 1̂. If 푥 > 0̂, then there are two cases. If 푥 ∈ Irr(퐿), then 푥 is the maximum element of 퐼푥 so ⋁ 퐼푥 = 푥 by Proposition 5.3.2(e). If 푥 ∉ Irr(퐿), then, by part (a), 푥 covers more than

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.3. Lattices 153

one element. So we can write 푥 = 푦 ∨ 푧 for any pair of elements 푦, 푧 covered by 푥. By induction, 푦 = ⋁ 퐼푦 and 푧 = ⋁ 퐼푧 where 퐼푦, 퐼푧 ⊆ 퐼푥. It follows that

푥 = 푦 ∨ 푧 = ( 퐼 ) ∨ ( 퐼 ) = (퐼 ∪ 퐼 ) ≤ 퐼 ≤ 푥 ⋁ 푦 ⋁ 푧 ⋁ 푦 푧 ⋁ 푥

where the last inequality follows since 푟 ≤ 푥 for all 푟 ∈ 퐼푥. The previous displayed □ inequalities force 푥 = ⋁ 퐼푥 as desired.

Using this proposition, the reader can see immediately in Figure 5.6 that 풥(푃) has four join irreducibles which form a subposet isomorphic to 푃. Birkhoff’s theorem says this always happens.

Theorem 5.3.7 (Fundamental Theorem of Finite Distributive Lattices). If 퐿 is a finite distributive lattice, then 퐿 ≅ 풥(푃) where 푃 = Irr(퐿).

Proof. We need to define two order-preserving maps 푓 ∶ 퐿 → 풥(푃) and 푔∶ 풥(푃) → 퐿 which are inverses of each other. If 푥 ∈ 퐿, then let 푓(푥) = 퐼푥 as defined by (5.2). Note that this is well-defined since 퐼푥 is an ideal: if 푟 ∈ 퐼푥 and 푠 ≤ 푟 is irreducible, then 푠 ≤ 푟 ≤ 푥 so that 푠 ∈ 퐼푥. Also, 푓 is order preserving since 푥 ≤ 푦 implies 퐼푥 ⊆ 퐼푦 by an argument similar to the one just given. For the inverse map, we let 푔(퐼) = ⋁ 퐼 for 퐼 ∈ 풥(푃). Clearly this is an element of 퐿 since 푃 ⊂ 퐿 and so 푔 is well-defined. It is also order preserving since if 퐼 ⊆ 퐽, then

푔(퐽) = 퐽 = ( 퐼) ∨ ( (퐽 − 퐼)) ≥ 퐼 = 푔(퐼). ⋁ ⋁ ⋁ ⋁

It remains to prove that 푓 and 푔 are inverses. If 푥 ∈ 퐿, then, using Proposi- tion 5.3.6(b), 푔(푓(푥)) = 푔(퐼 ) = 퐼 = 푥. 푥 ⋁ 푥

Now consider 퐼 ∈ 풥(푃) and let 푥 = 푔(퐼) = ⋁ 퐼. Then 푓(푔(퐼)) = 퐼푥 and we must show 퐼 = 퐼푥. For the containment 퐼 ⊆ 퐼푥, take 푟 ∈ 퐼. So 푟 ≤ ⋁ 퐼 = 푥. But this means 푟 ∈ 퐼푥 by definition (5.2), which gives the desired subset relation.

To show 퐼푥 ⊆ 퐼, take 푟 ∈ 퐼푥. We have that ⋁ 퐼 = 푥 = ⋁ 퐼푥. So 푟∧(⋁ 퐼) = 푟∧(⋁ 퐼푥) and, applying the distributive law,

(5.3) {푟 ∧ 푠 ∣ 푠 ∈ 퐼} = {푟 ∧ 푠 ∣ 푠 ∈ 퐼 }. ⋁ ⋁ 푥

Since 푟 ∈ 퐼푥, the set on the right in (5.3) contains 푟∧푟 = 푟. Furthermore, every element of that set is of the form 푟 ∧ 푠 ≤ 푟. It follows that the right-hand side of the equality in (5.3) is just 푟. But 푟 is join irreducible, so there must be some element 푠 ∈ 퐼 on the left in (5.3) such that 푟 ∧ 푠 = 푟. By Proposition 5.3.2(e) this forces 푟 ≤ 푠 ∈ 퐼. Since 퐼 is an ideal, we have 푟 ∈ 퐼 which is the final step of the proof. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

154 5. Counting with Partially Ordered Sets

5.4. The Möbius function of a poset

The Möbius function is a fundamental invariant of any locally finite poset. A special case of this function was first studied in number theory. But the generalization topar- tially ordered sets is both more powerful and also in some ways more intuitive. It is of great use to enumerators because, as we will see in the next section, it permits one to invert certain summation formulas to get information about the summands. Let 푃 be a locally finite poset with a 0̂. The (one-variable) Möbius function of 푃 is a map 휇∶ 푃 → ℤ defined inductively by

1 if 푥 = 0̂, (5.4) 휇(푥) = { − ∑ 휇(푦) otherwise. 푦<푥

Note that since 푃 is locally finite, the number of summands is finite sothat 휇 is well- defined. By moving the terms in the sum to the left-hand side of the equation, weget the following equivalent definition: for any 푥 ∈ 푃 we have

(5.5) ∑ 휇(푦) = 훿0,푥̂ 푦≤푥

where 훿0,푥̂ is the Kronecker delta. We will write 휇푃 if we wish to be specific about the poset whose Möbius function is under consideration. Also, if 푃 has a 1̂, then we will write 휇(푃) = 휇(1).̂

Let us now calculate the Möbius function for some of our example posets. First consider 퐶3 as displayed in Figure 5.1. Using (5.4) we see that 휇(0) = 1, 휇(1) = −휇(0) = −1, 휇(2) = −(휇(0) + 휇(1)) = −0 = 0, 휇(3) = −(휇(0) + 휇(1) + 휇(2)) = −0 = 0.

The next result should now be obvious.

Proposition 5.4.1. In 퐶푛 we have

⎧ 1 if 푖 = 0, 휇(푖) = −1 if 푖 = 1, ⎨ ⎩ 0 otherwise. □ So 휇(퐶푛) = 1, −1, or 0 depending on whether we have 푛 = 0, 1, or 푛 ≥ 2, respectively.

Now have a look at 퐵3. Similarly to 퐶3 we have 휇(∅) = 1 and 휇({1}) = 휇({2}) = 휇({3}) = −1. Using (5.4) we see that

휇({1, 2}) = −(휇(∅) + 휇({1}) + 휇({2})) = −(1 − 1 − 1) = 1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.4. The Möbius function of a poset 155

Analogous computations show that 휇({1, 3}) = 휇({2, 3}) = 1. Finally 휇({1, 2, 3}) = − ∑ 휇(푆) = −(1 − 1 − 1 − 1 + 1 + 1 + 1) = −1. 푆⊂{1,2,3} The next result should not be hard to guess.

Proposition 5.4.2. If 푆 ∈ 퐵푛, then (5.6) 휇(푆) = (−1)#푆. 푛 So 휇(퐵푛) = (−1) .

Proof. It will suffice to show that the function (−1)#푆 satisfies (5.5) since that equation uniquely defines 휇. So suppose 푇 ∈ 퐵푛 and let #푇 = 푘. Then, using Theorem 1.3.3(d), 푘 푘 푘 ∑ (−1)#푆 = ∑ ∑ (−1)푖 = ∑ ( )(−1)푖 = 훿 = 훿 푖 0,푘 ∅,푇 푆⊆푇 푖=0 푇 푖=0 푆∈( 푖 ) which is the desired equality. □

In the divisor lattice 퐷18, the reader should now find it easy to verify that 휇(1) = 휇(6) = 1, 휇(2) = 휇(3) = −1, 휇(9) = 휇(18) = 0. Now the pattern is not as clear. To help us, we will need a result about how the Möbius function interacts with the product operation on posets. But first, it will be useful to have a result about isomorphism and 휇. Theorem 5.4.3. Let 푃 be a locally finite poset with 0̂ and let 푓 ∶ 푃 → 푄 be in isomor- phism. Then for all 푥 ∈ 푃 we have

휇푃(푥) = 휇푄(푓(푥)).

Proof. We induct on the cardinality of the ideal 퐼(푥). If #퐼(푥) = 1, then 푥 = 0̂푃 and 푓(푥) = 0̂푄. So 휇푃(푥) = 1 = 휇푄(푓(푥)). Now assume #퐼(푥) > 1 so that 푥 > 0̂푃 and 푓(푥) > 0̂푄. Now, by induction,

휇푃(푥) = − ∑ 휇푃(푦) = − ∑ 휇푄(푓(푦)) = 휇푄(푓(푥)) 푦<푥 푓(푦)<푓(푥) as we wished. □

The Möbius function also plays well with poset products.

Theorem 5.4.4. Let 푃 and 푄 be locally finite posets containing 0̂푃 and 0̂푄, respectively. Then for all 푠 ∈ 푃 and 푥 ∈ 푄 we have

휇푃×푄(푠, 푥) = 휇푃(푠)휇푄(푥).

Proof. It suffices to show that the right-hand side of the displayed equation satis- fies (5.5). But given (푠, 푥) ∈ 푃 × 푄, we have ∑ 휇 (푡)휇 (푦) = ∑ 휇 (푡) ∑ 휇 (푦) = 훿 훿 = 훿 푃 푄 푃 푄 0̂푃 ,푠 0̂푄,푥 (0̂푃 ,0̂푄),(푠,푥) (푡,푦)≤(푠,푥) 푡≤푠 푦≤푥 as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

156 5. Counting with Partially Ordered Sets

We can now compute the Möbius function of the divisor lattice.

Proposition 5.4.5. The Möbius function of 퐷푛 is (−1)푚 if 푑 is a product of 푚 distinct primes, (5.7) 휇(푑) = { 0 otherwise.

Proof. We will use the notation and definitions in Proposition 5.2.1 and its proof as well as letting 푃 = ⨉ 퐶 . Using Theorems 5.4.3 and 5.4.4 푖 푛푖

휇퐷 (푑) = 휇푃(푔(푑)) = ∏ 휇퐶 (푑푖). 푛 푛푖 푖 Recalling the Möbius function for a chain as determined in Proposition 5.4.1, we see 푚 that the product is zero if any 푑푖 ≥ 2 and otherwise equals (−1) where 푚 is the number of 푑푖 = 1. Since the 푑푖 are the exponents in the prime factorization of 푑 (with 푑푖 = 0 if □ 푝푖 is a prime factor of 푛 but not 푑), the proposition follows.

The reader can now see the power of the poset viewpoint in this context. Most number theory texts take (5.7) as the definition of the Möbius function, which is not at all intuitive. But from our perspective, this equation is a natural consequence of the fact that 퐷푛 is a product of chains. We also note that Theorems 5.4.3 and 5.4.4 can be used to rederive the formula for 휇 in 퐵푛 as the reader is asked to do in the exercises. We end this section by noting that in a ranked poset 푃 one can get interesting results by looking at the Möbius values at a given rank. Recalling (5.1), define the Whitney numbers of the second kind for 푃 to be 푊 푘(푃) = # Rk푘(푃). Equivalently

푊 푘(푃) = ∑ 1. 푥∈Rk푘(푃) Also define 푃’s Whitney numbers of the first kind as

푤푘(푃) = ∑ 휇(푥). 푥∈Rk푘(푃) [푛] 푛 For example we have 푊 푘(퐵푛) = #( 푘 ) = (푘) and, by (5.6), 푛 (5.8) 푤 (퐵 ) = (−1)푘( ). 푘 푛 푘 As another illustration, from Proposition 5.2.2(d) we see that

푊 푘(Π푛) = #푆([푛], 푛 − 푘) = 푆(푛, 푛 − 푘) which are the Stirling numbers of the second kind. We will now show that there is a similar relationship between 푤푘(Π푛) and the (signed) Stirling numbers of the first kind. To prove this we need a definition. If 휋 ∈ 픖푛 has cycle decomposition 휋 = 푐1 ⋯ 푐푘, then 휋 has corresponding partition 휌 = 퐵1/ ... /퐵푘 where 퐵푖 is the set of elements in 푐푖 for all 푖. Note that, by Proposition 4.3.1, the number of permutations corresponding to

a given partition 휌 is ∏푖(|퐵푖| − 1)!. Using this fact, the equality

(5.9) 푤푘(Π푛) = 푠(푛, 푛 − 푘) follows immediately from the next proposition.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.5. The Möbius Inversion Theorem 157

Proposition 5.4.6. If 휌 = 퐵1/ ... /퐵푘 ∈ Π푛, then 푛−푘 (5.10) 휇(휌) = (−1) (|퐵1| − 1)! ⋯ (|퐵푘| − 1)! . 푛−1 So 휇(Π푛) = (−1) (푛 − 1)!.

Proof. We induct on 푛, where the case 푛 = 1 is easy to verify. Assume that 휇(Π푚) = (−1)푚−1(푚 − 1)! for 푚 < 푛. It follows from Proposition 5.2.1(c), Theorem 5.4.4, and induction that (5.10) holds for all 휌 < 1̂ in Π푛. To verify that it continues to be true for 휌 = 1̂, it suffices to show that summing the right-hand side of this equation overall of Π푛 satisfies (5.5). Using the observation about the number of permutations corre- sponding to a partition as well as Corollary 1.5.3 gives

푘 푛−푘 푛−푘 ∑ (−1) ∏(|퐵푘| − 1)! = ∑ (−1) 휌=퐵1/. . ./퐵푘∈Π푛 푖=1 휋=푐1⋯푐푘∈픖푛

푛 = ∑ ∑ (−1)푛−푘 푘=0 휋∈푐([푛],푘)

푛 = ∑ 푠(푛, 푘) 푘=0

= 훿0,푛 and this finishes the proof. □

5.5. The Möbius Inversion Theorem

In this section we will prove the Möbius Inversion Theorem, which is a very general method for inverting sums over posets 푃. In fact, we will show that special cases of this result include the Fundamental Theorem of the Difference Calculus (푃 = 퐶푛), the Principle of Inclusion and Exclusion (푃 = 퐵푛), and the Möbius Inversion Theorem in number theory (푃 = 퐷푛). A useful perspective will be to consider a certain alge- bra associated with 푃 called the incidence algebra and which permits linear algebra techniques to be employed. Our first step will be to generalize the Möbius function to a map having twoar- guments. Let 푃 be a locally finite poset and let Int(푃) be the set of closed intervals of 푃. Note that every [푥, 푧] ∈ Int(푃) has a minimum element; namely 0̂[푥,푦] = 푥. The Möbius function of 푃 is the map 휇∶ Int(푃) → ℤ defined inductively on [푥, 푧] by 1 if 푥 = 푧, (5.11) 휇(푥, 푧) = { − ∑ 휇(푥, 푦) otherwise. 푥≤푦<푧 Note that 휇(푥, 푧) denotes the value of 휇 on the closed interval [푥, 푧] even though the square brackets have been dropped in the notation. Also note that this is essentially

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

158 5. Counting with Partially Ordered Sets

the same definition as (5.4). Indeed, as a poset [푥, 푧] has a minimum element 0̂ = 푥 and

(5.12) 휇(푥, 푧) = 휇[푥,푧](푧) where the latter is the Möbius function of one variable. The readers may wish to con- vince themselves of this by computing 휇({1, 3}, {1, 2, 3, 5, 6}) in the poset of Figure 5.3 and then comparing this with the computation done in the previous section for 퐵3. Just as in the one-variable case, we have the alternative definition

(5.13) ∑ 휇(푥, 푦) = 훿푥,푧. 푥≤푦≤푧

It turns out that 휇 is only one of an important family of functions on Int(푃). If 푃 is a locally finite poset, then its incidence algebra, ℐ(푃), is the set of all functions 휙∶ Int(푃) → ℝ under the operations of addition (휙 + 휓)(푥, 푧) = 휙(푥, 푧) + 휓(푥, 푧), scalar multiplication of 푐 ∈ ℝ (푐 ⋅ 휙)(푥, 푧) = 푐(휙(푥, 푧)), and convolution product (휙 ∗ 휓)(푥, 푧) = ∑ 휙(푥, 푦)휓(푦, 푧). 푥≤푦≤푧 It will often be convenient to extend the domain of 휙 ∈ ℐ(푃) to all of 푃 × 푃 by letting 휙(푥, 푧) = 0 whenever 푥 ≰ 푧. Using this convention the sum in the convolution product can take place over all 푦 ∈ 푃. We will now show that the incidence algebra lives up to its name. Theorem 5.5.1. If 푃 is a locally finite poset, then (ℐ(푃), +, ⋅, ∗) is an associative algebra over ℝ.

Proof. We will prove the associative law for convolution, leaving the check of the other algebra axioms as an exercise. If 휙, 휓, 휔 ∈ ℐ(푃) and [푥, 푧] ∈ Int(푃), then, using the fact that ℝ itself is associative, ((휙 ∗ 휓) ∗ 휔)(푥, 푧) = ∑(휙 ∗ 휓)(푥, 푠)휔(푠, 푧) 푠

= ∑(휙(푥, 푟)휓(푟, 푠))휔(푠, 푧) 푟,푠

= ∑ 휙(푥, 푟)(휓(푟, 푠)휔(푠, 푧)) 푟,푠

= ∑ 휙(푥, 푟)(휓 ∗ 휔)(푟, 푧) 푟 = (휙 ∗ (휓 ∗ 휔))(푥, 푧) as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.5. The Möbius Inversion Theorem 159

We have already met one element of ℐ(푃), namely 휇. But there are others which are important. Consider the analogue of the Kronecker delta, which is 훿 ∈ ℐ(푃) defined by 1 if 푥 = 푧, 훿(푥, 푧) = { 0 otherwise.

In other words, 훿(푥, 푧) = 훿푥,푧. Proposition 5.5.2. The incidence algebra ℐ(푃) has identity element 훿; that is, for any 휙 ∈ ℐ(푃) we have 훿 ∗ 휙 = 휙 ∗ 훿 = 휙.

Proof. We will prove 훿 ∗ 휙 = 휙 as the other equality is entirely analogous. Since 훿 is only nonzero when its two arguments are equal, (훿 ∗ 휙)(푥, 푧) = ∑ 훿(푥, 푦)휙(푦, 푧) = 훿(푥, 푥)휙(푥, 푧) = 휙(푥, 푧) 푦 as we wished to show. □

Another useful element of ℐ(푃) is the zeta function which satisfies 휁(푥, 푧) = 1 for all [푥, 푧] ∈ Int(푃). In Section 5.9 we will see how 휁 is related to the Riemann zeta function. Recall that if 퐴 is an associative algebra with identity element 푒, then 푎 ∈ 퐴 has a (multiplicative) inverse if there is an element denoted 푎−1 such that both 푎−1푎 = 푒 and 푎푎−1 = 푒. If 푛 ∈ ℕ, then the algebra of 푛 × 푛 matrices over ℝ has the property that to prove the existence of 푎−1 it suffices to show that it satisfies 푎−1푎 = 푒. As we will see shortly, there is a correspondence between incidence algebras of finite posets and matrix algebras which will show that the same implication holds for ℐ(푃). It turns out that 휁 and 휇 are inverses in ℐ(푃). Proposition 5.5.3. We have 휇 = 휁−1.

Proof. Using (5.13) and the definition of 휁 we see that (휇 ∗ 휁)(푥, 푧) = ∑ 휇(푥, 푦)휁(푦, 푧) = ∑ 휇(푥, 푦) ⋅ 1 = 훿(푥, 푧). 푥≤푦≤푧 푥≤푦≤푧 By the discussion preceding this proposition, this is enough to prove that 휇 = 휁−1. □

By the discussion just before the previous proposition, we can conclude that 휁∗휇 = 훿. Evaluating this equality on an interval gives

∑ 휁(푥, 푦)휇(푦, 푧) = 훿푥,푧 푥≤푦≤푧 or

(5.14) ∑ 휇(푦, 푧) = 훿푥,푧. 푥≤푦≤푧 This looks very much like (5.13) except that in one the first argument of 휇 is fixed while the second varies, while in the other the roles are reversed. So (5.14) could also be used to uniquely define the Möbius function except in a dual manner from the original. This

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

160 5. Counting with Partially Ordered Sets

equation can be used to calculate 휇 in a “top-down” fashion. It is not at all obvious a priori that this computation and the one proceeding “bottom-up” give the same value for 휇(푥, 푧), although they must. One can make the incidence algebra more concrete by identifying it with an al- gebra of matrices. Let 푃 be a finite poset. A linear extension of 푃 is a permutation 퐿 = 푥1푥2 ... 푥푛 of the elements of 푃 such that 푥푖 ≤푃 푥푗 implies 푖 ≤ 푗; that is, 푥푖 comes before 푥푗 in the permutation. One can think of a linear extension as a listing of the ele- ments of 푃 which respects the partial order in that smaller elements must come before larger ones. For example, 퐵2 has two linear extensions, namely ∅, {1}, {2}, {1, 2} and ∅, {2}, {1}, {1, 2}. Given a linear extension 퐿 and 휙 ∈ ℐ(푃), the matrix of 푓 with respect to 퐿 is

푀휙 = (휙(푥푖, 푥푗))1≤푖,푗≤푛 recalling that 휙(푥, 푦) = 0 if 푥 ≰ 푦. Returning to our example and using the first linear extension above we have ∅ {1} {2} {1, 2} ∅ 1 1 1 1 ⎡ ⎤ 푀휁 = {1} ⎢ 0 1 0 1 ⎥ ⎢ ⎥ {2} ⎢ 0 0 1 1 ⎥ {1, 2} ⎣ 0 0 0 1 ⎦

where the elements of 퐵2 indexing the rows and columns are shown in the margins. For any linear extension 퐿 and any 휙 ∈ ℐ(푃) the matrix 푀휙 must be upper triangular since if 푖 > 푗, then 푥푖 ≰ 푥푗 and so (푀휙)푖,푗 = 휙(푥푖, 푥푗) = 0.

Theorem 5.5.4. Let 푃 be a finite poset and fix a linear extension 퐿 = 푥1 ... 푥푛 of 푃. Then ℐ(푃) is isomorphic to the algebra of 푛 × 푛 real matrices 푀 such that 푀푖,푗 = 0 if 푥푖 ≰ 푥푗.

Proof. The map 휙 ↦ 푀휙 is a bijection between the two algebras since 푀휙 contains all the values of 휙. It is easy to prove that this bijection preserves addition and scalar multiplication. For multiplication of algebra elements, we must show that 푀휙푀휓 = 푀휙∗휓 for any 휙, 휓 ∈ ℐ(푃). To do this, it suffices to prove that these matrices have the same (푖, 푗) entry for any 푖, 푗. But

(푀휙푀휓)푖,푗 = ∑ (푀휙)푖,푘(푀휓)푘,푗 1≤푘≤푛

= ∑ 휙(푥푖, 푥푘)휓(푥푘, 푥푗) 푥푘∈푃

= (휙 ∗ 휓)(푥푖, 푥푗)

= (푀휙∗휓)푖,푗 as desired. □

We can now prove the Möbius Inversion Theorem as well as its dual version. Note that each of the four conditions are required to hold for all 푥 ∈ 푃, not just for a specific element.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.5. The Möbius Inversion Theorem 161

Theorem 5.5.5 (Möbius Inversion Theorem). Let 푃 be a finite poset, let 푉 be a real vector space, and let 푓, 푔∶ 푃 → 푉 be two functions. (a) We have 푓(푥) = ∑ 푔(푦) for all 푥 ∈ 푃 ⟺ 푔(푥) = ∑ 휇(푥, 푦)푓(푦) for all 푥 ∈ 푃. 푦≥푥 푦≥푥 (b) We have 푓(푥) = ∑ 푔(푦) for all 푥 ∈ 푃 ⟺ 푔(푥) = ∑ 휇(푦, 푥)푓(푦) for all 푥 ∈ 푃. 푦≤푥 푦≤푥

Proof. We will prove (a), leaving (b) as an exercise. In fact, we will give two proofs of (a), one working directly with the elements of ℐ(푃) and one using linear algebra.

Let us assume that 푓(푥) = ∑푦≥푥 푔(푦) for all 푥 ∈ 푃. Plugging this into summation involving 휇 and using (5.13) yields ∑ 휇(푥, 푦)푓(푦) = ∑ 휇(푥, 푦) ∑ 푔(푧) 푦≥푥 푦≥푥 푧≥푦

= ∑ 푔(푧) ∑ 휇(푥, 푦) 푧≥푥 푥≤푦≤푧

= ∑ 푔(푧)훿푥,푧 푧≥푥 = 푔(푥). The proof of the reverse implication follows the same strategy and so is safely left to the reader.

For the linear algebraic proof, we will fix a linear extension 퐿 = 푥1 ... 푥푛 of 푃. Then any 푓 ∶ 푃 → 푉 has associated column vector

푓(푥1) 푣푓 = [ ⋮ ] . 푓(푥푛)

Note that the first condition in (a) can be written 푓(푥) = ∑푦∈푃 휁(푥, 푦)푔(푦) since 1 if 푥 ≤ 푦, 휁(푥, 푦) = { 0 otherwise.

The summation for 푓 says that the entry in row 푥 of 푣푓 is the same as the entry in row 푥 of the product 푀휁푣푔. And since this must hold for all 푥, the first condition in (a) is equivalent to the matrix equation 푣푓 = 푀휁푣푔. But by Theorem 5.5.4, 푀휁 has an inverse which is 푀휇. So 푣푔 = 푀휇푣푓. This is equivalent to the summation condition for 푔 since we have, taking the entry in row 푥 on both sides, 푔(푥) = ∑ 휇(푥, 푦)푓(푦) = ∑ 휇(푥, 푦)푓(푦) 푦∈푃 푦≥푥 for any 푥. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

162 5. Counting with Partially Ordered Sets

This theorem is useful when the function 푓 is easy to compute, but one is really interested in 푔. If the two maps under consideration are related by an appropriate summation condition, then we can express 푔 in terms of 푓 by inversion. We will now give several examples, starting with the ones mentioned in the first paragraph of this section. Our first application will be to the theory of finite differences, which isadiscrete analogue of the calculus. A function 푓 ∶ ℕ → ℝ has as (forward) difference the func- tion Δ푓 ∶ ℕ → ℝ defined by Δ푓(푛) = 푓(푛 + 1) − 푓(푛). This corresponds to differentiation. Indeed, the derivative of 푓 ∶ ℝ → ℝ is 푓(푥 + 휖) − 푓(푥) 푓′(푥) = lim 휖→0 휖 and at 휖 = 1 the function inside the limit is just 푓(푥 + 1) − 푓(푥). For example, if 푓(푛) = 푛2, then Δ푓(푛) = (푛 + 1)2 − 푛2 = 2푛 + 1 which bears a strong resemblance to (푥2)′ = 2푥. There is also a version of the definite integral in this context. The definite summation of 푓 ∶ ℕ → ℝ is the function 푆푓 ∶ ℕ → ℝ where 푛 푆푓(푛) = ∑ 푓(푖). 푖=0 The analogue of the Fundamental Theorem of Calculus is as follows. It will be con- venient to extend the domain of any 푓 ∶ ℕ → ℝ to ℤ by letting 푓(푖) = 0 for 푖 < 0.

Theorem 5.5.6 (Fundamental Theorem of Difference Calculus). Given two function 푓, 푔∶ ℕ → ℝ, we have 푓(푛) = 푆푔(푛) for all 푛 ≥ 0 ⟺ 푔(푛) = Δ푓(푛 − 1) for all 푛 ≥ 0.

Proof. It is easy to compute that in the chain 퐶푛 we have

⎧1 if 푖 = 푛, 휇(푖, 푛) = −1 if 푖 = 푛 − 1, ⎨ ⎩0 otherwise. Now for all 푛 ≥ 0, the first condition in the theorem can be translated as 푛 푓(푛) = 푆푔(푛) = ∑ 푔(푖) = ∑ 푔(푖) 푖=0 푖≤푛

where the inequality indexing the last summation is taking place in 퐶푛. Using Theo- rem 5.5.5(b) and the Möbius values in 퐶푛 above, this is equivalent to 푔(푛) = ∑ 휇(푖, 푛)푓(푖) = (1)푓(푛) + (−1)푓(푛 − 1) = Δ푓(푛 − 1) 푖≤푛 for all 푛 ≥ 0. □

It turns out that the Principle of Inclusion and Exclusion is just the Möbius Inver- sion Theorem applied to the poset 퐵푛. We restate it here for ease of reference.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.5. The Möbius Inversion Theorem 163

Theorem 5.5.7. Given a finite set 푆 and subsets 푆1, ... , 푆푛, we have | 푛 | | 푛 | |푆 − 푆 | = |푆| − ∑ |푆 | + ∑ |푆 ∩ 푆 | − ⋯ + (−1)푛 | 푆 | . | ⋃ 푖| 푖 푖 푗 |⋂ 푖| | 푖=1 | 1≤푖≤푛 1≤푖<푗≤푛 |푖=1 |

Proof. Define two functions 푓, 푔∶ 퐵푛 → ℕ by | | 푓(퐼) = | 푆 | |⋂ 푖| | 푖∈퐼 | and | | 푔(퐼) = | 푆 − 푆 | . |⋂ 푖 ⋃ 푗| | 푖∈퐼 푗∉퐼 |

In words, 푓(퐼) counts the number of elements of 푆 which are in all the 푆푖 for 푖 ∈ 퐼 and possibly in other 푆푗. On the other hand, 푔(퐼) is the number of elements which are in exactly the 푆푖 for 푖 ∈ 퐼 and no others. From this description we see that, for all 퐼 ∈ 퐵푛, 푓(퐼) = ∑ 푔(퐽) 퐽⊇퐼

since any element in the 푆푖 for 푖 ∈ 퐼 must be in exactly the 푆푗 for the elements 푗 of some 퐽 ⊇ 퐼. Applying Theorem 5.5.5(a) together with Proposition 5.1.3(d) and (5.6) we obtain | | 푔(퐼) = ∑ 휇(퐼, 퐽)푓(퐽) = ∑(−1)|퐽−퐼| | 푆 | . |⋂ 푗| 퐽⊇퐼 퐽⊇퐼 |푗∈퐽 | Specializing to the case 퐼 = ∅ we obtain | 푛 | | | |푆 − 푆 | = 푔(∅) = ∑ (−1)|퐽| | 푆 | | ⋃ 푖| |⋂ 푗| | 푖=1 | 퐽∈퐵푛 |푗∈퐽 | which is what we wished to prove. □

The Möbius Inversion Theorem originated in number theory. Here is that version.

Theorem 5.5.8. Given two functions 푓, 푔∶ ℙ → ℝ, we have 푓(푛) = ∑ 푔(푑) for all 푛 ∈ ℙ ⟺ 푔(푛) = ∑ 휇(푑)푓(푛/푑) for all 푛 ∈ ℙ. 푑∣푛 푑∣푛

Proof. The first condition is an exact translation of the first condition in Theorem 5.5.5(b) for the poset 퐷푛. Inverting using Theorem 5.1.3(f) as well as the relationship (5.12) between the one- and two-variable forms of 휇 gives 푔(푛) = ∑ 휇(푑, 푛)푓(푑) = ∑ 휇(푛/푑)푓(푑) = ∑ 휇(푑′)푓(푛/푑′) 푑∣푛 푑∣푛 푑′∣푛 where 푑′ = 푛/푑. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

164 5. Counting with Partially Ordered Sets

5.6. Characteristic polynomials

As was made abundantly clear in Chapter 3, one way to get insight into a combinatorial object is to study its generating function. This is also true of the Möbius function and the corresponding generating function is called the characteristic polynomial. In par- ticular, we will show that using this polynomial one can get an interesting connection between a particular lattice associated to a graph and its chromatic polynomial. Let 푃 be a finite ranked poset with rk 푃 = 푛. The characteristic polynomial of 푃 is

(5.15) 휒(푃) = 휒(푃; 푡) = ∑ 휇(푥)푡푛−rk 푥. 푥∈푃

where we are using the one variable form of the Möbius function. We also used 휒 for the chromatic number of a graph, but this should cause no confusion since here we are dealing with posets. The quantity 푛 − rk 푥 appearing in the power on 푡 is called the corank of 푥 and the reader may be wondering why we are using this rather than just the rank. One reason is that this makes 휒(푃) monic: the highest power of 푡 appears when 푥 = 0̂ and 휇(0)̂ = 1. Also, as will be seen, this choice of exponent results in 휒(푃) having some interesting properties. Note that collecting terms in (5.15) for 푥 at the same rank shows that

푛 푛−푘 (5.16) 휒(푃) = ∑ 푤푘(푃)푡 푘=0

where the 푤푘(푃) are the Whitney numbers of the first kind for 푃. Let us begin by computing the characteristic polynomials for some of our standard example posets.

Proposition 5.6.1. We have the following characteristic polynomials:

(a) For 퐶푛,

푛−1 휒(퐶푛) = 푡 (푡 − 1).

(b) For 퐵푛,

푛 휒(퐵푛) = (푡 − 1) .

푚1 푚푘 (c) If 푛 has prime factorization 푛 = 푝1 ⋯ 푝푘 and 푚 = ∑푖 푚푖, then

푚−푘 푘 휒(퐷푛) = 푡 (푡 − 1) .

(d) For Π푛,

휒(Π푛) = (푡 − 1)(푡 − 2) ⋯ (푡 − 푛 + 1).

(e) For 퐿푛(푞),

2 푛−1 휒(퐿푛(푞)) = (푡 − 1)(푡 − 푞)(푡 − 푞 ) ⋯ (푡 − 푞 ).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.6. Characteristic polynomials 165

푃 =

Figure 5.7. A poset 푃 with 휒(푃) having complex roots

Proof. We will prove the results for 퐵푛, leaving the others as exercises.

In the case of 퐵푛, one can plug (5.8) into (5.16) and then use reindexing, the sym- metry of the binomial coefficients, as well as the Binomial Theorem to obtain

푛 푛 휒(퐵 ) = ∑ (−1)푘( )푡푛−푘 푛 푘 푘=0

푛 푛 = ∑ (−1)푛−푘( )푡푘 푛 − 푘 푘=0

푛 푛 = (−1)푛 ∑ ( )(−푡)푘 푘 푘=0 = (−1)푛(1 − 푡)푛

= (푡 − 1)푛

which is the desired conclusion. □

It is striking that all the characteristic polynomials in Proposition 5.6.1 have non- negative integer roots. This is not always the case. For example, consider the poset in Figure 5.7. Then an easy computation gives 휒(푃) = 푡2 − 3푡 + 3 which has complex roots. We also note that if all the roots of a polynomial are nonnegative reals, then the coefficient sequence is log-concave. However, we will postpone the proof of thisuntil we have introduced the elementary symmetric functions in Section 7.1. One way to explain some of the factorizations in Proposition 5.6.1 is via the follow- ing result.

Theorem 5.6.2. Let 푃, 푄 be finite ranked posets. (a) If there is an isomorphism 푓 ∶ 푃 → 푄, then 휒(푃) = 휒(푄). (b) We have 휒(푃 × 푄) = 휒(푃)휒(푄).

Proof. (a) From Exercise 8(b) at the end of the chapter we have rk푃 푥 = rk푄 푓(푥) for all 푥 ∈ 푃. In particular, rk 푃 = rk 푄 = 푛 for some 푛. Thus, using Theorem 5.4.3 and

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

166 5. Counting with Partially Ordered Sets

the fact that 푓 is a bijection

푛−rk푃 푥 휒(푃) = ∑ 휇푃(푥)푡 푥∈푃

푛−rk푃 푓(푥) = ∑ 휇푃(푓(푥))푡 푥∈푃

푛−rk푄 푦 = ∑ 휇푄(푦)푡 푦∈푄

= 휒(푄).

(b) The result of Exercise 8(c) is that rk푃×푄(푥, 푦) = rk푃 푥 + rk푄 푦 for all (푥, 푦) ∈ 푃 × 푄. So if rk 푃 = 푚 and rk 푄 = 푛, then rk(푃 × 푄) = 푚 + 푛. Applying Theorem 5.4.4

푚+푛−rk푃×푄 (푥,푦) 휒(푃 × 푄) = ∑ 휇푃×푄(푥, 푦)푡 (푥,푦)∈푃×푄

푚−rk푃 푥 푛−rk푄 푦 = ∑ 휇푃(푥)푡 ∑ 휇푄(푦)푡 푥∈푃 푦∈푄

= 휒(푃)휒(푄) which is what we wished to prove. □

This theorem can be used to explain a couple of the factorizations in Proposi- 푛 tion 5.6.1. For example 퐵푛 ≅ 퐶1 so 푛 푛 푛 휒(퐵푛) = 휒(퐶1 ) = 휒(퐶1) = (푡 − 1) .

However, neither Π푛 nor 퐿푛(푞) decompose as a product of smaller posets. So to un- derstand the factorization of their characteristic polynomials, we will have to use poset quotients as discussed in the next section. We end this section by making a connection between the characteristic polynomial of a lattice associated with a graph 퐺 and the chromatic polynomial of 퐺. A subgraph 퐻 of 퐺 is induced if 푣푤 ∈ 퐸(퐺) implies 푣푤 ∈ 퐸(퐻) for all 푣, 푤 ∈ 푉(퐻). In words, every

푢 푣 푢 푣 푢 푣

퐺 =

푥 푤 푤 푥

Figure 5.8. A graph 퐺 and two subgraphs on the top, as well as two spanning subgraph colorings on the bottom

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.6. Characteristic polynomials 167

Figure 5.9. The bond lattice of the graph 퐺 in Figure 5.8

edge of 퐺 between vertices of 퐻 must be in 퐻. By way of illustration, given the graph 퐺 in Figure 5.8, the first subgraph in that figure is induced but the second is notbecause it is missing the edge 푣푥 ∈ 퐸(퐺).A bond of 퐺 is a spanning subgraph such that each component is induced. The bond lattice of 퐺, denoted ℒ(퐺), is the set of bonds of 퐺 ordered by containment. The bond lattice for the graph 퐺 of Figure 5.8 is displayed in Figure 5.9. It is not hard to show that ℒ(퐺) is a ranked lattice with rank function (5.17) rk 퐻 = 푛 − 푘(퐻) where 푛 is the number of vertices of 퐺 and 푘(퐻) is the number of components of 퐻. We need a couple of other definitions before we can connect bond lattices with chromatic polynomials. Let 푐∶ 푉 → 푆 be a (not necessarily proper) coloring of 퐺. If 퐻 is a spanning subgraph of 퐺, then we say that 푐 is 퐻-improper if every component of 퐻 is monochromatic, that is, has all vertices of the same color. The two rightmost graphs in Figure 5.8 are two subgraphs which are both 퐻-improper for the same coloring 푐. The subgraph induced by 푐 is the spanning subgraph of 퐺 such that 푢푣 ∈ 퐸(퐻) if and only if 푢푣 ∈ 퐸(퐺) and 푐(푢) = 푐(푣). Of the two subgraphs colored white and gray in Figure 5.8, only the second is induced by the coloring of 퐺. We note that the subgraph induced by 푐 is unique and is a bond. Furthermore it follows directly from the definitions that 푐 is proper if and only if its induced subgraph is the spanning subgraph with no edges.

Theorem 5.6.3. Let 퐺 be a graph with 푘(퐺) = 푘 components. We have 푃(퐺) = 푡푘휒(ℒ(퐺)).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

168 5. Counting with Partially Ordered Sets

Proof. Let 퐺 have #푉 = 푛 and consider all possible colorings 푐∶ 푉 → [푡] for 푡 ∈ ℕ. Given a bond 퐻 of 퐺, we will be interested in two associated subsets: 푆(퐻) = {푐 ∣ 푐 is 퐻-improper} and 푇(퐻) = {푐 ∣ 퐻 is induced by 푐}. Note that if 푐 is 퐻-improper, then there is a unique bond 퐾 ⊇ 퐻 such that 퐾 is the bond induced by 푐. It follows that for all 퐻 ∈ ℒ(퐺) we have 푆(퐻) = 푇(퐾). ⨄ 퐾⊇퐻 Now define 푓, 푔∶ ℒ(퐺) → ℕ by 푓(퐻) = #푆(퐻) and 푔(퐻) = #푇(퐻). Note that 푓(퐻) = 푡푘(퐻) since there are 푡 ways to choose the color of each component of 퐻. Also,

the previous displayed equation yields 푓(퐻) = ∑퐾⊇퐻 푔(퐻) for all 퐻 ∈ ℒ(퐺). Applying the Möbius Inversion Theorem gives 푔(퐻) = ∑ 휇(퐻, 퐾)푓(퐾). 퐾⊇퐻 Recall that 푐 is proper if and only if it induces the spanning subgraph of 퐺 with no edges. And this subgraph is the 0̂ element of ℒ(퐺). Thus, using in turn the formula for 푓(퐻) in terms of 푘(퐻), the rank function of ℒ(퐺) as given in (5.17), and the fact that 푃(퐺) counts proper colorings, 푡푘휒(ℒ(퐺)) = 푡푘(퐺) ∑ 휇(퐾)푡(푛−푘(퐺))−(푛−푘(퐾)) 퐾∈ℒ(퐺)

= ∑ 휇(퐾)푡푘(퐾) 퐾∈ℒ(퐺)

= 푔(0)̂

= 푃(퐺) which is what we wanted. □

5.7. Quotients of posets

In many areas of mathematics one studies the objects under consideration by taking quotients which can have a simpler structure than the original entity. In this and the next section we will present a concept of quotient for posets 푃. We will see that it is useful for proving that the characteristic polynomial of 푃 factors even though 푃 may not be a product of smaller posets. This notion can also be used to give inductive proofs of various well-known theorems about 휇. Quotients of the type we will consider first appeared in the work of Hallam and Sagan [41]. A number of techniques have been proposed for proving that the characteristic polynomial factors over the integers. See [78] for a survey. The method we will use proceeds as follows. We wish to show that a poset 푄 has characteristic polynomial

which factors as 휒(푄) = ∏푖 휒푖 for certain polynomials 휒푖. Suppose we can construct

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.7. Quotients of posets 169

posets 푃푖 with 휒(푃푖) = 휒푖 for all 푖 and let 푃 = ⨉푖 푃푖. We wish to find an equivalence relation ∼ on 푃 and a partial order on the set of equivalence classes 푃/ ∼ such that (i) (푃/ ∼) ≅ 푄 and (ii) 휒(푃/ ∼) = 휒(푃). From this and Theorem 5.6.2 we get

휒(푄) = 휒(푃/ ∼) = 휒(푃) = 휒(×푖푃푖) = ∏ 휒푖 푖 as we wished to show. Since we will be particularly interested in the case where 휒(푄) has integer roots, we introduce a simple family of posets with this property. Suppose 푃 has a 0̂. Then the elements covering 0̂ are the atoms of 푃. We will use the notation 풜(푃) = {푥 ∈ 푃 ∣ 푥 is an atom of 푃}. If 푃 is ranked, then the atoms are just the elements of rank one. Define the 푛-claw, 퐶퐿푛, to be the poset consisting solely of a 0̂ and 푛 atoms. For example, in Figure 5.10 the two posets in the direct product are 퐶퐿1 and 퐶퐿2. Clearly

휒(퐶퐿푛; 푡) = 푡 − 푛.

As our running example, we will use the partition lattice Π3 in Figure 5.1 and the prod- uct poset on the right in Figure 5.10. Note that

휒(퐶퐿1 × 퐶퐿2) = 휒(퐶퐿1)휒(퐶퐿2) = (푡 − 1)(푡 − 2) = 휒(Π3).

Unfortunately, Π3 ≇ 퐶퐿1 × 퐶퐿2. But they are close to being isomorphic. In partic- ular, if we could merge the two maximal elements of 퐶퐿1 × 퐶퐿2 into one, then the resulting poset would be a copy of Π3. Poset quotients are designed to make this sort of identification of elements precise. Let 푃 be a finite poset with a 0̂ and let ∼ be an equivalence relation on 푃. Define the quotient 푃/ ∼ to be the set of equivalence classes together with the binary relation 푋 ≤ 푌 in 푃/ ∼ if and only if 푥 ≤푃 푦 for some 푥 ∈ 푋 and 푦 ∈ 푌. It is important to note that this binary relation is not necessarily a partial order. To see what can go wrong, consider the chain 퐶2 with equivalence classes 푋 = {0, 2} and 푌 = {1}. Then we have 푋 ≤ 푌 since 0 ≤ 1. But we also have 푌 ≤ 푋 since 1 ≤ 2. And clearly 푋 ≠ 푌, so the

(푎, 푏) (푎, 푐)

(푎, 0)̂ 푎 푏 푐 (0,̂ 푏) (0,̂ 푐)

퐶퐿1 × 퐶퐿2 = × =

0̂ 0̂ (0,̂ 0)̂

Figure 5.10. A product of claws

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

170 5. Counting with Partially Ordered Sets

antisymmetry relation fails. To fix this, define, for 푃 a finite poset with 0̂, 푃/ ∼ to be a homogeneous quotient if (1) the equivalence class containing 0̂ is {0}̂ and

(2) if 푋 ≤ 푌, then for any 푥 ∈ 푋 there is a 푦 ∈ 푌 with 푥 ≤푃 푦. Homogeneous quotients yield posets. Lemma 5.7.1. If 푃/ ∼ is a homogeneous quotient, then 푃/ ∼ is a poset.

Proof. Verifying the reflexive and transitive laws is easy and so it is left as an exercise. For antisymmetry, suppose 푋 ≤ 푌 and 푌 ≤ 푋. Let 푥 be a maximal element of 푋. Then since 푋 ≤ 푌 and the quotient is homogeneous, there is 푦 ∈ 푌 with 푥 ≤ 푦. Similarly 푌 ≤ 푋 implies there is 푥′ ∈ 푋 with 푦 ≤ 푥′. So 푥 ≤ 푦 ≤ 푥′. But 푥 was picked to be maximal in 푋 which forces 푥 = 푥′. This in turn yields 푦 = 푥. Since 푥 ∈ 푋 and 푦 ∈ 푌 we have found an element of 푋 ∩ 푌. Equivalence classes are either disjoint or equal, so this implies 푋 = 푌 as we wished to prove. □

Returning to the product 푃 = 퐶퐿1 ×퐶퐿2 in Figure 5.10, we impose the equivalence relation ∼ where every element is in an equivalence class by itself except for the two maximal elements which are in a class together. It is easy to see that this is a homoge- neous quotient since any time we have 푋 < 푌, the class 푋 is a singleton. Furthermore (푃/ ∼) ≅ Π3 which is our desired condition (i). As far as (ii), one can verify by direct computation that 휒 does not change in passing from 푃 to 푃/ ∼. In fact, more is true. Note that the equivalence class {(푎, 푏), (푎, 푐)} of 푃 becomes the 1̂ of 푃/ ∼. Furthermore

휇푃(푎, 푏) + 휇푃(푎, 푐) = 1 + 1 = 2 = 휇푃/∼(1).̂ Our next order of business is to give a condition under which this always happens. Lemma 5.7.2. Let 푃/ ∼ be a homogeneous quotient. Suppose that for all nonzero 푋 ∈ 푃/ ∼ we have (5.18) ∑ 휇(푦) = 0 푦∈퐼(푋) where 퐼(푋) is the lower-order ideal generated by 푋 as a subset of 푃. Then for all 푋 ∈ 푃/ ∼ we have 휇(푋) = ∑ 휇(푥). 푥∈푋

Proof. We will induct on the length of the longest 0̂–푋 chain in 푃/ ∼. When the length is zero we have, by the first requirement for a homogeneous quotient, 푋 = {0̂푃} and 휇(푋) = 1 = 휇(0̂푃). For a nonzero 푋 we have, by induction, 휇(푋) = − ∑ 휇(푌) = − ∑ ∑ 휇(푦). 푌<푋 푌<푋 푦∈푌 We claim that {푦 ∈ 푌 ∣ 푌 < 푋} = 퐼(푋)−푋. Indeed, 푦 ∈ 퐼(푋)−푋 means that 푦 ∉ 푋 and 푦 < 푥 for some 푥 ∈ 푋. And by the second condition for a homogeneous quotient, this

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.7. Quotients of posets 171

is equivalent to being in {푦 ∈ 푌 ∣ 푌 < 푋}. So the previous displayed equation becomes 휇(푋) = − ∑ 휇(푦) = ∑ 휇(푥) 푦∈퐼(푋)−푋 푥∈푋 where the second equality comes from solving for the terms when 푥 ∈ 푋 in (5.18). This completes the proof. □

We will call (5.18) the Summation Condition. We also need to know how the rank function behaves when taking a quotient of a ranked poset. This is taken care of by the next result, where (5.19) will be called the Rank Condition. Lemma 5.7.3. Let 푃/ ∼ be a homogeneous quotient of a ranked poset. Suppose that for all 푥, 푦 ∈ 푃 (5.19) 푥 ∼ 푦 ⟹ rk 푥 = rk 푦. Then 푃/ ∼ is ranked and rk 푋 = rk 푥 for all 푥 ∈ 푋.

Proof. First we claim that if we have a cover 푋 ⋖ 푌, then for any 푥 ∈ 푋 there is 푦 ∈ 푌 with 푥 ⋖ 푦. We know that we can pick 푦 with 푥 < 푦. Suppose, towards a contradiction, that there is a 푧 with 푥 < 푧 < 푦. Letting 푍 be the equivalence class of 푧, we must have 푋 < 푍 < 푌 where the inequalities are strict because (5.19) forces an equivalence class to consist of elements at a given rank. But this contradicts 푋 ⋖ 푌.

To show that 푃/ ∼ is ranked, suppose we have two saturated chains 0̂ = 푋0 ⋖ 푋1 ⋖ ⋯ ⋖ 푋푚 and 0̂ = 푌0 ⋖ 푌1 ⋖ ⋯ ⋖ 푌푛 where 푋푚 = 푌푛. Then, by the claim, we have corresponding chains 0̂ = 푥0 ⋖ 푥1 ⋖ ⋯ ⋖ 푥푚 and 0̂ = 푦0 ⋖ 푦1 ⋖ ⋯ ⋖ 푦푛. Since 푥푚 and 푦푛 are in the same equivalence class, they must have the same rank by (5.19). This forces 푚 = 푛 and that 푃/ ∼ must be ranked. This also shows that rk 푋 = rk 푥 for all 푥 ∈ 푋. □

It is now a short step to our desired conclusion. Theorem 5.7.4. Let 푃/ ∼ be a homogeneous quotient of a ranked poset satisfying the Summation Condition (5.18) and Rank Condition (5.19). Then 휒(푃/ ∼) = 휒(푃).

Proof. Using the previous two lemmas gives 휒(푃/ ∼) = ∑ 휇(푋)푡rk(푃/∼)−rk(푋) = ∑ 휇(푥)푡rk(푃)−rk(푥) = 휒(푃) 푋∈푃/∼ 푥∈푃 as desired. □

As an application, we will use this theorem to calculate 휒(Π푛). Before doing so, we will return to the case of Π3 to motivate the equivalence relation used. We will label the atoms of the two claws in Figure 5.10 with atoms of Π3 as follows. A block of a partition of [푛] is trivial if it contains a single element. If 1 ≤ 푖 < 푗 ≤ 푛, then let 푖푗 denote the atom of Π푛 whose only nontrivial block is {푖, 푗}. Now consider the claws as labeled in Figure 5.11. Finally, replace each pair labeling an element of the product with the label which is the join in Π푛 of its two components to obtain the poset on the

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

172 5. Counting with Partially Ordered Sets

(12, 13) (12, 23)

(12, 0)̂ 12 13 23 (0,̂ 13) (0,̂ 23)

× =

0̂ 0̂ (0,̂ 0)̂

123 123

12 13 23

Figure 5.11. A product of claws with partition labels

right in the figure. Note that two elements of the product are in the same equivalence class of 푃/ ∼ if and only if they have the same label in the right-hand poset. This is the key to defining the equivalence relation.

Now consider Π푛 for any 푛. It will be convenient to make the convention just for this proof that Π푛 is the poset of partitions of {0, 1, ... , 푛 − 1}. We let

푃 = 퐶퐿1 × 퐶퐿2 × ⋯ × 퐶퐿푛−1

where for 1 ≤ 푗 ≤ 푛 − 1 the atoms of 퐶퐿푗 are labeled with the atoms 푖푗, 푖 < 푗, of Π푛. Clearly 휒(푃) = (푡 − 1)(푡 − 2) ... (푡 − 푛 + 1).

Put an equivalence relation on elements 푥 = (푥1, ... , 푥푛−1) ∈ 푃 by 푥 ∼ 푦 if and only if ⋁ 푥 = ⋁ 푦 in Π푛. So for an equivalence class 푋, we can define the partition 휌(푋) = ⋁ 푥 for any 푥 ∈ 푋. To finish, we need to show that 푃/ ∼ is a homogeneous quotient satisfying the Summation and Rank Conditions and that (푃/ ∼) ≅ Π푛. We claim that 푋 ≤ 푌 in the binary relation on 푃/ ∼ if and only if 휌(푋) ≤ 휌(푌) in Π푛. Indeed, take 푥 ∈ 푋 and 푦 ∈ 푌 with 푥 ≤푃 푦. Then 푥푖 ≤ 푦푖 in Π푛 for all 푖. It follows that (5.20) 휌(푋) = 푥 ≤ 푦 = 휌(푌). ⋁ ⋁ The proof of the backwards direction involves similar ideas and so is left as an exercise For homogeneity, clearly (0,̂ ... , 0)̂ is the only element whose join is 0̂ which gives the first condition in the definition. For the second condition, we use theassump- tion 푋 ≤ 푌 and (5.20) to induct on the length 푙 of a saturated chain in the interval [휌(푋), 휌(푌)]. If 푙 = 0, then 푋 = 푌 and the conclusion is trivial. So assume 푋 < 푌, 푥 ∈ 푋. Take two blocks 퐵, 퐶 of ⋁ 푥 = 휌(푋) which are contained in the same block

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.7. Quotients of posets 173

of 휌(푌). Let 푖 = min 퐵 and 푗 = min 퐶 where, without loss of generality, 푖 < 푗. We claim that coordinate 푥푗 of 푥 must be 0̂. For suppose that 푥푗 = 푘푗 for some 푘 < 푗. But then 푘, 푗 ∈ 퐶 contradicting 푗 = min 퐶. Now define 푧 to agree with 푥 except in the 푗th coordinate where 푧푗 = 푖푗 so that 푥 < 푧. By construction, ⋁ 푥 < ⋁ 푧 ≤ 휌(푌). So if 푍 is the equivalence class of 푧, then 푋 < 푍 ≤ 푌 by the previous claim. And by induction, there is 푦 ∈ 푌 with 푧 ≤ 푦 so that 푥 < 푧 ≤ 푦 as desired. To obtain the Summation Condition, it is easy to see from Theorem 5.4.4 that for any 푥 ∈ 푃 we have 휇(푥) = (−1)supp 푥 where supp 푥 is the cardinality of the support set of 푥

(5.21) Supp 푥 = {푖 ∣ 푥푖 ≠ 0}.̂ So to get (5.18) it suffices to find a sign-reversing involution on 퐼(푋) which has no fixed points, where the sign of 푥 is 휇(푥). Since 푋 ≠ 0̂, the partition 휌 = 휌(푋) has a nontrivial block 퐵, say 퐵 = {푖 < 푗 < ⋯}. Take 푦 ∈ 퐼(푥). We claim that 푦푗 = 0̂ or 푦푗 = 푖푗. For suppose 푦푗 = 푘푗 for some 푘 ≠ 푖. But then ⋁ 푦 ≰ 휌 since the block containing 푗 in ⋁ 푦 contains 푘 < 푗 and so cannot be 퐵. Now define an involution 휄∶ 퐼(푋) → (푋) so that 휄(푦) is 푦 with 푦푗 either changed from 0̂ to 푖푗 or from 푖푗 to 0̂. It is easy to check that this map has the desired properties.

For the Rank Condition, note that if 휌, 푎 ∈ Π푛 where 푎 = 푖푗 is an atom, then either 휌 ∨ 푎 = 휌 if 푖 and 푗 are in the same block of 휌, or 휌 ∨ 푎 covers 휌 if 푖 and 푗 are in different blocks of 휌 since those two blocks get merged in the join. So given 푥 ∈ 푃, let 푖1푗1, ... , 푖푘푗푘 be the elements of Supp 푥 listed so that 푗1 < ⋯ < 푗푘 and consider ⋁ 푥 = 푖1푗1 ∨⋯∨푖푘푗푘. Let 휌푙 = 푖1푗1 ∨⋯∨푖푙푗푙 for any 0 ≤ 푙 ≤ 푘. For 푙 ≥ 1 we see that 휌푙 must cover 휌푙−1 since the ordering of the 푖’s and 푗’s forces 푗푙 > 푖1, 푗1, ... , 푖푙 so that {푗푙} is a trivial block of 휌푙−1. Thus rk 푥 = 푘 = 푛 − |휌|. Hence if 푥 ∼ 푦, then ⋁ 푥 = 휌 = ⋁ 푦 and so rk 푥 = 푛 − |휌| = rk 푦 which is what we wished to prove.

Finally, we need an isomorphism between (푃/ ∼) and Π푛. The function we have already defined, 휌(푋) = ⋁ 푥 for any 푥 ∈ 푋, is such a map. We leave the proof that this is a well-defined isomorphism to the reader. This completes the proof that 휒(Π푛) has the desired form.

One can motivate the choice of atoms used to label the claws in the Π푛 example as follows. Consider the maximal chain in Π푛 which is 0̂ = 휌1 ⋖ 휌2 ⋖ ⋯ ⋖ 휌푛 = 1̂ where, for all 푗, 휌푗 is the partition with a block [푗] and all other blocks trivial. Considering the set of atoms 푎 such that 푎 ≤ 휌푗 but 푎 ≰ 휌푗−1, we see that these are exactly the atoms used to label the claw 퐶퐿푗. This technique of partitioning the atom set of a poset has been used before in proving that the characteristic polynomial factors over ℤ. See, for example, Stanley’s work on supersolvable lattices [83]. To end this section, we note that there is a quicker method for proving these fac- torizations when the poset is a finite lattice 퐿. Let 퐴퐿 and 퐴푥 be the set of atoms of 퐿 and the set of atoms of 퐿 less than or equal to 푥 ∈ 퐿, respectively. Say that a 0̂–1̂ chain 퐶 ∶ 0̂ = 푥0 < 푥1 < ⋯ < 푥푛 = 1̂ induces a partition 퐴1/ ... /퐴푛 of 퐴퐿 by defining

퐴푖 = {푎 ∈ 퐴퐿 ∣ 푎 ≤ 푥푖 but 푎 ≰ 푥푖−1}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

174 5. Counting with Partially Ordered Sets

For Π푛, this is exactly the partition described in the previous paragraph for the given

saturated chain. Let 퐶퐿퐴푖 be the claw whose atoms are labeled by 퐴푖 and consider

퐭 = (푥1, ... , 푥푛) ∈ 퐶퐿퐴1 × ⋯ × 퐶퐿퐴푛 . The vector 퐭 is called a transversal and it has support defined by equation (5.21) but in this more general setting. For 푥 ∈ 퐿, define | 풯 = {퐭 = (푥 , ... , 푥 ) | 푥 = 푥} 푥 1 푛 ⋁ 푖 | 푖 where the join is taken in 퐿.

Theorem 5.7.5. Let 퐿 be a finite ranked lattice and let 퐴1/ ... /퐴푛 be induced by a 0̂–1̂ chain. Suppose that for all 푥 ∈ 퐿 and 퐭 ∈ 풯푥 we have | Supp 퐭| = rk 푥. Then the following are equivalent.

(1) For every nonzero 푥 ∈ 퐿 there is an index 푖 such that |퐴푖 ∩ 퐴푥| = 1. (2) The polynomial 휒(퐿; 푡) factors with nonegative integral roots as

푛 rk 퐿−푛 □ 휒(퐿; 푡) = 푡 ∏(푡 − |퐴푖|). 푖=1 Note that to check factorization using the previous theorem there are only two conditions to check. And these are usually rather easy to verify. So this is much more efficient than using the previous method. Theorem 5.7.5 is also the only one we know of which gives conditions equivalent to (rather than just implying) factorization of 휒 over the nonnegative integers. Proving this result would consume too much of our time, but the reader can find details in [41].

5.8. Computing the Möbius function

We will now use quotient posets to prove three classic theorems about the Möbius func- tion. Each of these results gives a different way to compute 휇. One of the advantages of using quotients is that all three can be proven inductively using a lemma about a very simple equivalence relation. The proofs in this section are based on the arXiv version of a paper of Hallam [38, 39]. Let 푃 be a poset with a 1̂.A coatom of 푃 is an element 푐 covered by 1̂. We wish to examine what happens when a coatom and 1̂ are identified by an equivalence relation. If 푥 ∈ 푃, then we use [푥] for the equivalence class of 푥. Lemma 5.8.1. Let 푃 be a finite poset with a 0̂ and a 1̂ such that #푃 ≥ 3. Let 푐 be a coatom and let ∼ be the equivalence relation with classes {푐, 1}̂ and all others having only one element. In this case, the following hold. (a) 푃/ ∼ is homogeneous. (b) We have 휇([1])̂ = 휇(푐) + 휇(1).̂ (c) If 푃 is a lattice, then so is 푃/ ∼ with [푥] ∨ [푦] = [푥 ∨ 푦] for all 푥, 푦 ∈ 푃 and [푥] ∧ [푦] = [푥 ∧ 푦] for all [푥], [푦] ≠ [1]̂ .

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.8. Computing the Möbius function 175

Proof. (a) Since #푃 ≥ 3, the definition of ∼ shows that [0]̂ = {0}̂ which is the first condition for a homogeneous quotient. For the second, if 푋 = 푌, then the conclusion is trivial. And if 푋 < 푌, then 푋 = {푥} for some 푥. So 푥 ≤ 푦 for some 푦 ∈ 푌 by the definition of the binary relation. (b) It suffices to show that the Summation Condition (5.18) holds, so assume 푋 ≠ {0}̂ . We either have 푋 = {푥} or 푋 = [푥] where 푥 = 1̂. In either case ∑ 휇(푦) = ∑ 휇(푦) = 0 푦∈퐼(푋) 푦≤푥 by the definition of the Möbius function in 푃. (c) It is not hard to show that (푃/ ∼) ≅ 푃−{푐} and we will use the latter description. Consider three cases: (1) one or both of 푥, 푦 is equal to 푐 or (2) 푥 ∨ 푦 ∈ {푐, 1}̂ with 푥, 푦 ≠ 푐 or (3) 푥 ∨ 푦 ∈ 푃 − {푐, 1}̂ with 푥, 푦 ≠ 푐. We will do the second one and leave the others to the reader. If 푥 ∨ 푦 ∈ {푐, 1}̂ with 푥, 푦 ≠ 푐 , then in 푃 − {푐} we have that 푥, 푦 have a unique upper bound, namely 1̂. It follows that [푥] ∨ [푦] exists and [푥] ∨ [푦] = [1]̂ = [푐] = [푥 ∨ 푦] as desired. Checking the meets is similar. □

We will now demonstrate a theorem of Hall [37] which gives an interesting rela- tionship between the Möbius function of a poset and its chains.

Theorem 5.8.2. Let 푃 be a finite poset with a 0̂ and a 1̂. We have

푖 (5.22) 휇(1)̂ = ∑(−1) 푐푖 푖≥0

where 푐푖 is the number of 0̂–1̂ chains of length 푖 in 푃.

Proof. We will induct on #푃. The result is easy to verify if #푃 ≤ 2 so assume #푃 ≥ 3. Let 푐 be a coatom of 푃 and let ∼ be the equivalence relation of Lemma 5.8.1. Let 푎푖, respectively 푏푖, be the number of 0̂–1̂ chains of 푃 of length 푖 which do not, respectively do, contain 푐. Clearly

푖 푖 푖 (5.23) ∑(−1) 푐푖 = ∑(−1) 푎푖 + ∑(−1) 푏푖. 푖≥0 푖≥0 푖≥0

There is a length-preserving bijection between the 0̂–1̂ chains of 푃 which do not contain 푐 and the [0]̂ –[1]̂ chains of 푃/ ∼ which sends 푥 to [푥] for each 푥 in the chain. Since |푃/ ∼ | < |푃| we have, by induction,

푖 (5.24) 휇([1])̂ = ∑(−1) 푎푖. 푖≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

176 5. Counting with Partially Ordered Sets

There is also a bijection between 0̂–1̂ chains of 푃 containing 푐 and 0̂–푐 chains of [0,̂ 푐] gotten by removing 1̂ from such chains in 푃. Since this bijection changes length by one we have, again by induction, 푖 (5.25) 휇(푐) = − ∑(−1) 푏푖. 푖≥0 Plugging (5.24) and (5.25) into (5.23) and using Lemma 5.8.1 gives 푖 ∑(−1) 푐푖 = 휇([1])̂ − 휇(푐) = 휇(1)̂ 푖≥0 as desired. □

The reader familiar with algebraic topology may have noticed that the sum in equa- tion (5.22) looks like a reduced Euler characteristic. This can be made precise as fol- lows. Let 푃 be a poset with a 0̂ and a 1̂ which are distinct. The set of 0̂–1̂ chains in 푃 are in bijection with the set of all chains in 푃 ≔ 푃 − {0,̂ 1}̂ : merely remove the 0̂ and 1̂ from each chain of 푃. So define the order complex of 푃, Δ(푃), to be the set of all chains in the open interval (0,̂ 1)̂ . This is clearly an (abstract) simplicial complex since a subset of a chain is still a chain. Since the lengths of a 0̂–1̂ chain in 푃 and of its image in Δ(푃) differ by two, the right-hand side of(5.22) is the reduced Euler characteristic ̃휒(Δ(푃)). So one can bring the tools of algebraic topology to bear on questions about the Möbius function. For more information on this approach, see the survey articles of Björner [15] and Wachs [96]. The next result is due to Weisner [98]. It gives an expression for 휇 similar to the one given in its definition but with what could be a substantially smaller numberof terms. Recall from Proposition 5.3.2(i) that a finite lattice has a 0̂ and a 1̂. Theorem 5.8.3. If 퐿 is a finite lattice and 푎 ∈ 퐿 − {0}̂ , then (5.26) 휇(1)̂ = − ∑ 휇(푥). 푥≠1̂ 푥∨푎=1̂

Proof. We induct on #퐿, just doing the induction step when #퐿 ≥ 3. When 푎 = 1̂, the sum in (5.26) is over all 푥 < 1̂. So the equation is true by definition of 휇(1)̂ . Now assume 푎 ≠ 1̂ and pick a coatom 푐 with 푎 ≤ 푐. Let ∼ be the equivalence relation from Lemma 5.8.1. By induction we have 휇([1])̂ = − ∑ 휇([푥]). [푥]≠[1]̂ [푥]∨[푎]=[1]̂ Now [1]̂ = {푐, 1}̂ , [푥] ∨ [푎] = [푥 ∨ 푎] by Lemma 5.8.1, and 휇([푥]) = 휇(푥) for [푥] ≠ [1]̂ . So the previous displayed equation becomes 휇([1])̂ = − ∑ 휇(푥) = − ∑ 휇(푥) − ∑ 휇(푥). 푥≠푐,1̂ 푥≠푐,1̂ 푥≠푐,1̂ 푥∨푎=푐,1̂ 푥∨푎=푐 푥∨푎=1̂ If 푥 ∨ 푎 = 푐, then clearly 푥 ≠ 1̂. And if 푥 ∨ 푎 = 1̂, then 푥 ≠ 푐 since 푎 ≤ 푐. So we can write 휇([1])̂ = − ∑ 휇(푥) − ∑ 휇(푥). 푥≠푐 푥≠1̂ 푥∨푎=푐 푥∨푎=1̂

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.8. Computing the Möbius function 177

If 푥 ∨ 푎 = 푐, then 푥 ≤ 푐. So the first sum above can be viewed as taking place in [0,̂ 푐] which has fewer elements than 푃. By induction 휇([1])̂ = 휇(푐) − ∑ 휇(푥). 푥≠1̂ 푥∨푎=1̂ Rearranging terms and using Lemma 5.8.1 finishes the proof. □

To illustrate how this result can be used to easily compute 휇, consider 퐵푛 for 푛 ≥ 2. Consider the atom 푎 = {푛}. To satisfy 푥 ∪ 푎 = [푛] where 푥 ≠ [푛] we must have 푥 = [푛 − 1]. So using (5.26) and induction gives

푛−1 푛 휇(퐵푛) = −휇([푛 − 1]) = −휇(퐵푛−1) = −(−1) = (−1) .

We end this section with a theorem of Rota [76]. To state it, we need a new defini- tion. Let 푃 be a finite poset with a 0̂ and a 1̂.A crosscut of 푃 is 퐾 ⊂ 푃 with the following properties. (1) 0,̂ 1̂ ∉ 퐾. (2) 퐾 is an antichain. (3) Every maximal chain of 푃 intersects 퐾.

For example, if 푃 is ranked, then Rk푘(푃) is a crosscut for 0 < 푘 < rk 푃. Theorem 5.8.4 (Crosscut Theorem). Let 퐿 be a finite lattice and let 퐾 be a crosscut. Then 휇(1)̂ = ∑ (−1)#퐵 ⋁ 퐵=1̂ ⋀ 퐵=0̂ where the sum is over all 퐵 ⊆ 퐾 satisfying the meet and join conditions.

Proof. First consider the case where every atom of 퐿 is also a coatom. This forces 퐾 = 퐿 − {0,̂ 1}̂ . And the meet and join conditions are satisfied for all 퐵 ⊆ 퐾 with #퐵 ≥ 2. If #퐾 = 푛, then, using Theorem 1.3.3(d),

푛 1 푛 푛 ∑ (−1)#퐵 = ∑ ( )(−1)푘 = − ∑ ( )(−1)푘 = 푛 − 1 = 휇(1).̂ 푘 푘 ⋁ 퐵=1̂ 푘=2 푘=0 ⋀ 퐵=0̂

Now assume that the atom and coatom sets of 퐿 do not coincide. By Exercise 12(c) and 23(c), the theorem holds for 퐿 if and only if it holds for 퐿∗. So, by taking the dual if necessary, we can assume that there is a coatom 푐 ∉ 퐾. We now induct on #퐿. The previous paragraph takes care of the case #퐿 = 3 and smaller lattices do not have crosscuts. Suppose that #퐿 ≥ 4 and let ∼ be the equivalence relation of Lemma 5.8.1. Since 푐, 1̂ ∉ 퐾, the [푥] for 푥 ∈ 퐾 form a crosscut for 푃/ ∼ which we will denote by [퐾] and its subsets by [퐵]. By induction, 휇([1])̂ = ∑ (−1)#[퐵]. ⋁[퐵]=[1]̂ ⋀[퐵]=[0]̂

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

178 5. Counting with Partially Ordered Sets

By Lemma 5.8.1 we have ⋁[퐵] = [1]̂ if and only if ⋁ 퐵 = 푐 or 1̂. By the same token and the fact that 푐, 1̂ ∉ 퐾 we get that ⋀[퐵] = [0]̂ is equivalent to ⋀ 퐵 = 0̂. So the previous displayed equation becomes 휇([1])̂ = ∑ (−1)#퐵 + ∑ (−1)#퐵. ⋁ 퐵=푐 ⋁ 퐵=1̂ ⋀ 퐵=0̂ ⋀ 퐵=0̂ Note that since 퐾 is a crosscut of 퐿 not containing 푐 we have that 퐾′ ≔ 퐾 ∩ [0, 푐] is a crosscut of [0, 푐]. Furthermore, ⋁ 퐵 = 푐 implies that 퐵 ⊆ 퐾′. So applying induction to the sum with this restriction gives 휇([1])̂ = 휇(푐) + ∑ (−1)#퐵. ⋁ 퐵=1̂ ⋀ 퐵=0̂ A rearrangement of terms and Lemma 5.8.1 completes the demonstration. □

As an application of this result, consider 퐵푛, 푛 ≥ 2, with the crosscut 퐾 consisting of its atoms. But for 퐵 ⊆ 퐾 we can only have ⋁ 퐵 = [푛] if 퐵 = 퐾. And in this case #퐾 푛 ⋀ 퐵 = ∅. So 휇(퐵푛) = (−1) = (−1) .

5.9. Binomial posets

Binomial posets were introduced by Doubilet, Rota, and Stanley [23] and further stud- ied in [87]. They provide an explanation about why certain types of generating func- tions arise in practice while others do not. For example, we have already met ordi- 푛 푛 nary generating functions ∑푛 푎푛푡 and exponential generating functions ∑푛 푎푛푡 /푛!. 푛 There are also Eulerian generating functions which are of the form ∑푛 푎푛푡 /[푛]푞!. We have seen one example of this in the 푞-Binomial Theorem where the right-hand side of equation (3.6) can be written 푘 (푘) 푛 (푘) 푡 ∑ 푞 2 [ ] 푡푘 = ∑(푞 2 [푛][푛 − 1] ⋯ [푛 − 푘 + 1]) . 푘 [푘]! 푘 푞 푘 푛 Why do such generating functions appear while others, say of the form ∑푛 푎푛푡 /퐶(푛) where 퐶(푛) is the 푛th Catalan number, do not? Binomial posets provide one possible explanation. A poset 푃 is called binomial if it satisfies the following conditions: BP1 푃 is locally finite and contains arbitrarily long chains. BP2 Every interval [푥, 푧] is ranked. The interval is called an 푛-interval if, con- sidered as a poset, rk[푥, 푧] = 푛. BP3 Any two 푛-intervals contain the same number of maximal chains. This number is denoted 퐹(푛) = 퐹푃(푛) and called the factorial function of 푃. We will consider three examples corresponding to the three types of generating functions mentioned at the beginning of this section. Let 퐶∞ be the nonnegative inte- gers under the usual total order. We also have 퐵∞ which consists of all finite subsets of positive integers partially ordered by set containment. Finally, let 푉∞ be the vec- tor space over 픽푞 with countable basis 푒1, 푒2,... and denote by 퐿∞(푞) the poset of all finite-dimensional subspaces of 푉∞ with containment as the partial order.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.9. Binomial posets 179

Proposition 5.9.1. The posets 퐶∞, 퐵∞, and 퐿∞(푞) are all binomial. Their factorial func- tions are

퐹퐶∞ (푛) = 1, 퐹퐵∞ (푛) = 푛! , and 퐹퐿∞(푞)(푛) = [푛]푞!.

Proof. We will prove this for 퐵∞ and leave the other two cases as exercises. If [푆, 푇] is an interval in 퐵∞, then [푆, 푇] ≅ 퐵푛 for some 푛. So 푃 is locally finite with ranked intervals. Subchains of the infinite chain ∅ ⊂ {1} ⊂ {1, 2} ⊂ ⋯ can be arbitrarily long. Since any two 푛-intervals are isomorphic to 퐵푛, they contain the same number of maximal chains. To find the factorial function, note that a maximal chain in 퐵푛 has the form

∅ ⊂ {푠1} ⊂ {푠1, 푠2} ⊂ ⋯ ⊂ [푛].

There are 푛 choices for 푠1, and after that 푛 − 1 for 푠2, etc. So the total number of chains is 푛!. □

An important property of binomial posets is that the number of elements at a given rank in an 푛-interval [푥, 푧] does not depend on 푥, 푧.

Lemma 5.9.2. If [푥, 푧] is an 푛-interval in a binomial poset 푃 and 0 ≤ 푘 ≤ 푛, then 퐹(푛) # Rk [푥, 푧] = . 푘 퐹(푘)퐹(푛 − 푘)

Proof. Given 푦 ∈ Rk푘[푥, 푧], we first count the number of maximal chains 퐶 of [푥, 푧] passing through 푦. Such a 퐶 must be the concatenation of a maximal chain in [푥, 푦] with a maximal chain in [푦, 푧]. Since [푥, 푦] is a 푘-interval and [푦, 푧] is an (푛−푘)-interval, the number of 퐶 must be 퐹(푘)퐹(푛 − 푘). But this expression is independent of 푦. So the total number of maximal chains in [푥, 푧] is 퐹(푘)퐹(푛 − 푘) ⋅ # Rk푘[푥, 푧]. Since [푥, 푧] is an 푛-interval, this number is also 퐹(푛). Setting the two expressions equal and solving for □ # Rk푘[푥, 푧] completes the proof.

To make the connection between binomial posets 푃 and generating functions, we must consider a subalgebra of the incidence algebra ℐ(푃). The reduced incidence alge- bra of a binomial poset 푃 is

ℛ(푃) = {휙 ∈ ℐ(푃) ∣ 휙 is constant on 푛-intervals}.

Equivalently the 휙 ∈ ℛ(푃) are precisely those such that 휙(푥, 푧) = 휙(푥′, 푧′) whenever [푥, 푧] and [푥′, 푧′] are both 푛-intervals. We let 휙(푛) denote this common value. So, for example, 휁 ∈ ℛ(푃) since 휁(푥, 푧) = 1 on all intervals [푥, 푧]. It is not clear a priori that 휇 ∈ ℛ(푃). But this will follow from the next result.

Theorem 5.9.3. Let 푃 be a binomial poset. (a) ℛ(푃) is a subalgebra of ℐ(푃). (b) If 휙 ∈ ℛ(푃) and 휙−1 exists in ℐ(푃), then 휙−1 ∈ ℛ(푃).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

180 5. Counting with Partially Ordered Sets

Proof. (a) We need to show that ℛ(푃) is closed under addition, scalar multiplication, and convolution. We will prove the last and leave the other two as exercises. Suppose 휙, 휓 ∈ ℛ(푃). Using the previous lemma, we can write

(휙 ∗ 휓)(푥, 푧) = ∑ 휙(푥, 푦)휓(푦, 푧) 푥≤푦≤푧 푛 = ∑ ∑ 휙(푥, 푦)휓(푦, 푧)

푘=0 푦∈Rk푘[푥,푧] 푛 = ∑ ∑ 휙(푘)휓(푛 − 푘)

푘=0 푦∈Rk푘[푥,푧] 푛 퐹(푛) = ∑ 휙(푘)휓(푛 − 푘). 푘=0 퐹(푘)퐹(푛 − 푘) But this last expression is clearly independent of 푥, 푧 and so we are done.

(b) We must show that if [푥, 푧] is an 푛-interval, then 휙−1(푥, 푧) depends only on 푛. We will induct on 푛. If 푛 = 0, then 푥 = 푧 and 휙 ∗ 휙−1(푥, 푥) = 훿(푥, 푥) = 1. So, using the fact that 휙 ∈ ℛ(푃),

1 = ∑ 휙(푥, 푦)휙−1(푦, 푥) = 휙(푥, 푥)휙−1(푥, 푥) = 휙(0)휙−1(푥, 푥). 푥≤푦≤푥

This can be rewritten 휙−1(푥, 푥) = 1/휙(0) and the right-hand side does not depend on 푥 as desired. Now suppose 푛 > 0. Similar to the base case we have

(휙 ∗ 휙−1)(푥, 푧) = 훿(푥, 푧) = 0.

Expanding the convolution and moving the first term outside the sum gives

0 = 휙(푥, 푥)휙−1(푥, 푧) + ∑ 휙(푥, 푦)휙−1(푦, 푧) = 휙(0)휙−1(푥, 푧) + 푆 푥<푦≤푧 where 푆 is the sum in this displayed equation. But, by induction, 푆 only depends on 푛. So 휙−1(푥, 푧) = −푆/휙(0) is also solely a function of 푛 as desired. □

We can now draw a concrete relation between binomial posets and generating functions.

Theorem 5.9.4. If 푃 is binomial, then ℛ(푃) ≅ ℝ[[푡]] as algebras via the map 푡푛 휙 ↦ 퐹휙(푡) ≔ ∑ 휙(푛) . 푛≥0 퐹(푛)

Proof. This function is a bijection since it has an inverse. In particular, if 퐹(푡) ∈ ℝ[[푡]], 푛 then we can write 퐹(푡) = ∑푛 푎푛푡 /퐹(푛) for some 푎푛 ∈ ℝ. So the inverse maps 퐹(푡) to 휙 ∈ ℛ(푃) defined by 휙(푛) = 푎푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

5.9. Binomial posets 181

Showing that the bijection preserves addition and scalar multiplication is left as an exercise. For convolution we want 퐹휙∗휓(푡) = 퐹휙(푡)퐹휓(푡). Using the expression for (휙 ∗ 휓)(푥, 푧) obtained in the proof of the previous theorem 푡푛 푡푛 퐹휙(푥)퐹휓(푡) = ∑ 휙(푛) ⋅ ∑ 휓(푛) 푛≥0 퐹(푛) 푛≥0 퐹(푛) 푛 휙(푘) 휓(푛 − 푘) = ∑( ∑ ⋅ )푡푛 푛≥0 푘=0 퐹(푘) 퐹(푛 − 푘) 푛 퐹(푛) 푡푛 = ∑( ∑ 휙(푘)휓(푛 − 푘)) 푛≥0 푘=0 퐹(푘)퐹(푛 − 푘) 퐹(푛) 푡푛 = ∑(휙 ∗ 휓)(푛) 푛≥0 퐹(푛)

= 퐹휙∗휓(푡) as we wished to conclude. □

푛 Note that in 퐶∞ this map becomes 휙 ↦ ∑푛 휙(푛)푡 which is an ordinary gener- ating function. Similarly, the images in 퐵∞ and 퐿∞(푞) are exponential and Eulerian generating functions, respectively. There is one of our important initial example posets which does not seem to be covered by this theory. Consider 퐷∞ which is the positive integers ordered by divisi- bility. Then 퐷∞ is not binomial. For consider the intervals [1, 4] and [1, 6]. We have rk[1, 4] = rk[1, 6] = 2. But [1, 4] contains a single maximal chain whereas [1, 6] has two. But there is a way around this difficulty. Let 푃1, 푃2, 푃3,... be posets each with a 0̂. We will use subscripts to indicate which poset an element belongs to, for example 0̂푖 is the 0̂ in 푃푖. The direct sum of the 푃푖 has as underlying set

푃 = {(푥 , 푥 , ... ) ∣ 푥 ∈ 푃 for all 푖, 푥 ≠ 0̂ for only finitely many 푖}. ⨁ 푖 1 2 푖 푖 푖 푖 푖≥1

⋮ ⋮ ⋮

23 33 53

22 32 52 퐷∞ ≅ ⊕ ⊕ ⊕ ⋯ 2 3 5

1 1 1

Figure 5.12. 퐷∞ as a direct sum of chains.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

182 5. Counting with Partially Ordered Sets

Letting (푥푖) = (푥1, 푥2, ...), we impose the partial order (푥푖) ≤ (푦푖) if and only if 푥푖 ≤ 푦푖 for all 푖. It is not hard to see that 퐷∞ is isomorphic to the direct sum of chains as illustrated in Figure 5.12.

A Dirichlet poset is 푃 = ⊕푖≥1푃푖 where each 푃푖 is a binomial poset with a 0̂. So 퐷∞ is Dirichlet. An interval [(푥푖), (푧푖)] in 푃 is called an (푛푖)-interval where (푛푖) = (푛1, 푛2, ...) if [푥푖, 푧푖] is an 푛푖-interval in 푃푖 for all 푖. The corresponding reduced incidence algebra is

ℛ(푃) = {휙 ∈ ℐ(푃) ∣ 휙 is constant on (푛푖)-intervals}.

The notation 휙((푛푖)) should be self-explanatory. Note that 휁 ∈ ℛ(푃) for 푃 Dirichlet. Theorem 5.9.3 remains true if 푃 is replaced by a Dirichlet poset. Theorem 5.9.4 gener- alizes as follows, where 퐭 = {푡1, 푡2,...} is a countably infinite sequence of variables.

Theorem 5.9.5. If 푃 = ⨁푖≥1 푃푖 is Dirichlet, then ℛ(푃) ≅ ℝ[[퐭]] as algebras via the map

푛푖 푡푖 휙 ↦ 퐹휙(퐭) ≔ ∑ 휙((푛푖)) ∏ 퐹(푛푖) (푛푖) 푖≥1

where the sum is over all (푛푖) with only finitely many 푛푖 ≠ 0, and 퐹(푛푖) is the factorial □ function of 푃푖.

푠 Now suppose 푃 = 퐷∞ so 퐹(푛푖) = 1 for all 푖. Let 푡푖 = 1/푝푖 where 푝푖 is the 푖th prime and 푠 ∈ ℂ. Using the unique factorization of the integers we see that

푛푖 1 1 1 퐹 휁(퐭) = ∑ 휁((푛푖)) ∏ 푡푖 = ∑ ∏ 푠 = ∑ 푠 = ∑ 푛푖 푛푠 (푛 ) 푖≥1 (푛 ) 푖≥1 (푝푖 ) (푛 ) 푛≥1 푖 푖 푖 푛푖 (∏푖≥1 푝푖 )

푠 where the last sum is the Riemann 휁-function 휁(푠) = ∑푛≥1 1/푛 . For the rest of this section 푠 will be a complex number so that 휁(푠) will always refer to Riemann’s function rather than the value of the reduced incidence algebra element of the same name on an 푛-interval. As a function of a complex variable, one can show that 휁(푠) has zeros at the negative even integers which are sometimes called its trivial zeros. Perhaps the most famous conjecture in all of mathematics is the following by Riemann [74]. Conjecture 5.9.6 (Riemann Hypothesis). All the nontrivial zeros of 휁(푠) have real part 1/2.

One can restate the Riemann Hypothesis in terms of the Möbius function 휇 of 퐷∞. To do so, we need a concept from asymptotic combinatorics. Suppose we have two functions 푓, 푔∶ ℕ → ℝ. We say that 푓 is big oh of 푔, written 푓 = 푂(푔), if there are constants 퐶, 푁 such that |푓(푛)| ≤ 퐶|푔(푛)| for all 푛 ≥ 푁. To see why this might be relevant to Conjecture 5.9.6, note that if 푓(푧) = 1/푞(푧) is a rational function of 푧 ∈ ℂ, then the zeros of the polynomial 푞(푧) are called the poles of 푓(푧). These poles control the growth rate of the coefficients of the (not formal) power series expansion 푓(푧) = 푛 ∑푛≥0 푎푛푧 . For example, if 푓(푧) = 1/(1 − 2푧), then 푓(푧) has a pole at 푟 = 1/2. And 푛 푛 we also know that for sufficiently small |푧| we have 푓(푧) = ∑푛 2 푧 whose coefficients grow like, in fact are exactly, 2푛 = (1/푟)푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 183

It is not hard to show using Theorem 5.9.5 that

1 휇(푛) (5.27) = ∑ 푛푠 휁(푠) 푛≥1

where 휇 is taken in 퐷∞. Given the relation between roots and poles just discussed, it makes sense to consider the Mertens function

푀(푛) = ∑ 휇(푘). 1≤푘≤푛

Then a conjecture equivalent to Conjecture 5.9.6 is as follows.

Conjecture 5.9.7. For all real 휖 > 0 we have

푀(푛) = 푂(푛1/2+휖).

There was an earlier conjecture of Mertens [62] that |푀(푛)| ≤ 푛1/2 for all 푛. But this was disproved by Odlyzko and te Riele [66].

Exercises

(1) This exercise refers to the list of examples just after the definition of a poset. (a) Verify that they satisfy the definition of a poset. (b) Show that the partial order in Π푛 is equivalent to defining 휌 ≤ 휏 if every block of 휏 is a union of blocks of 휌. (c) Describe the cover relations in the list. For example, in 퐶푛 the covers are of the form 푖 < 푖 + 1 for 0 ≤ 푖 < 푛. (2) Prove Proposition 5.1.1. (3) Complete the proof of Proposition 5.1.2. For part (c) give two proofs: one by mim- icking the proof of part (b) and one using 푃∗.

(4) Complete the proof of Proposition 5.1.3. To show that 퐾푛 ≅ 퐵푛−1 it may be simpler ∗ to show that 퐾푛 ≅ 퐵푛−1 using the map 휙 from Section 1.7. (5) Let 푓 ∶ 푃 → 푄 be an isomorphism of posets. (a) Show that 푓 is also an isomorphism of 푃∗ with 푄∗. (b) Show that if 푃 has a 0̂, then so does 푄. (c) Show in two ways that if 푃 has a 1̂, then so does 푄: by mimicking the proof of part (b) and by using the result of (b) together with part (a). (6) (a) Show that the axioms for a partially ordered set are satisfied by 푃 ⊎ 푄, 푃 + 푄, and 푃 × 푄. (b) Show that 푃 × 푄 ≅ 푄 × 푃. (7) Complete the proof of Proposition 5.2.1.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

184 5. Counting with Partially Ordered Sets

(8) (a) Show that if 푃 is a ranked poset, then for any 푘 we have Rk푘(푃) is an antichain. (b) Let 푃 be a ranked poset and assume 푓 ∶ 푃 → 푄 is an isomorphism. Show that 푄 is also ranked and for all 푥 ∈ 푃 we have

rk푃 푥 = rk푄 푓(푥). (c) Show that if 푃, 푄 are ranked posets, then so is 푃 × 푄 with rank function

rk푃×푄(푥, 푦) = rk푃 푥 + rk푄 푦. (9) Prove Proposition 5.2.2. (10) (a) Prove Proposition 5.3.1 (b) Find, with proof, a description of the meet and join operations in 퐾푛. (c) Show that 픖 is not a lattice. (11) Fill in the proof of Proposition 5.3.2. (12) (a) Show that if 퐿 is a lattice, then so is its dual 퐿∗ with

푥 ∧퐿∗ 푦 = 푥 ∨퐿 푦 and 푥 ∨퐿∗ 푦 = 푥 ∧퐿 푦. (b) Show that if 퐿, 푀 are lattices, then so is 퐿 × 푀 with (푎, 푥) ∧ (푏, 푦) = (푎 ∧ 푏, 푥 ∧ 푦) and (푎, 푥) ∨ (푏, 푦) = (푎 ∨ 푏, 푥 ∨ 푦). (c) A meet semilattice is a poset 푃 where every pair of elements has a meet. Show that if a finite meet semilattice has a 1̂, then it is a lattice. Hint: Given 푥, 푦 ∈ 퐿, let 푈 be the set of upper bounds of 푥, 푦. Show that ⋀ 푈 exists and is their join. (d) Let 푃 be a finite poset with a 0̂ and a 1̂. Suppose that for any 푥, 푦 ∈ 푃 which both cover an element 푧, the join 푥 ∨ 푦 exists. Prove that 푃 is a lattice. Hint: Use the previous part in its dual form and induct on #푃.

(13) A set partition 휋 = 퐵1/ ... /퐵푘 of [푛푑] is 푑-divisible if 푑 divides |퐵푖| for all 푖. Let

Π푛푑,푑 = {휋 ∣ 휋 is a 푑-divisible partition of [푛푑] or 휋 = 1/2/ ... /푛푑} be partially ordered by refinement of set partitions. (a) Show that if 휋, 휎 ∈ Π푛푑,푑, then 휋 ∨ 휎 exists and is the same as the join in the ordinary partition lattice Π푛푑. (b) Show that Π푛푑,푑 is a lattice but that 휋 ∧ 휎 may not be the same in Π푛푑,푑 and in the ordinary partition lattice Π푛푑. Hint: Use the dual form of Exercise 12(c). (14) Prove the backward direction of Proposition 5.3.3. (15) Prove Proposition 5.3.4. (16) Finish the proof of Proposition 5.3.5. (17) Let 푃 be a finite poset and let 퐿 = 풥(푃) be the corresponding distributive lattice. If 푋 ⊆ 푃 is a lower-order ideal, then use the corresponding lowercase letter 푥 to denote the associated element of 퐿. (a) Show that 푥 covers 푦 in 퐿 if and only if 푌 = 푋 − {푚} where 푚 is a maximal element of 푋. (b) Show that 푥 is join irreducible in 퐿 if and only if 푋 is a principal ideal of 푃. (18) (a) Given a poset 푃, let 풜(푃) be the set of antichains of 푃. Show that the map 푓 ∶ 풜(푃) → 풥(푃) given by 푓(퐴) = 퐼(퐴) (where 퐼(퐴) is the order ideal generated by 퐴) is a bijection.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 185

(b) Show that 휆 ∈ 푌 is join irreducible if and only if 휆 = (푘푙) for some 푘, 푙 ∈ ℙ. (c) Show that 퐶 ≅ 풥(퐶 ), 퐵 ≅ 풥(퐴 ), and 퐷 ≅ ⨉ 풥(퐶 ) where 푛 = 푛 푛−1 푛 푛 푛 푖 푛푖−1 푛푖 ∏푖 푝푖 is the prime factorization of 푛. (d) Show that if 퐿, 푀 are finite, disjoint lattices, then Irr(퐿 × 푀) ≅ Irr(퐿) ⊎ Irr(푀). (e) Show that if 푃, 푄 are finite, disjoint posets, then 풥(푃 ⊎ 푄) ≅ 풥(푃) × 풥(푄).

Use this to rederive the statements about 퐵푛 and 퐷푛 in part (c). For 퐷푛 you will also need part (d).

(19) (a) Rederive the formula for 휇 in 퐵푛, equation (5.6), in two ways: by mimicking the proof of (5.7) and by constructing an 푚 ∈ ℙ such that 퐷푚 ≅ 퐵푛 and then applying (5.7). 푛 (b) Prove that if 푊 ∈ 퐿(픽푞 ) has dimension 푘, then

(푘) 휇(푊) = (−1)푘푞 2 . Hint: Use the 푞-Binomial Theorem, Theorem 3.2.4. (20) (a) Let 푃 be a locally finite poset with a 0̂. Show that if 푥 covers exactly one ele- ment of 푃, then −1 if 푥 covers 0̂, 휇(푥) = { 0 otherwise. (b) Given any 푛 ∈ ℤ, construct a poset containing an element 푥 with 휇(푥) = 푛. (21) Complete the proof of Theorem 5.5.1. (22) Complete the proof of Theorem 5.5.4. (23) Throughout this exercise, 푃 is a locally finite poset. (a) Show that 휁2(푥, 푧) = #[푥, 푧] where 휁2 = 휁 ∗ 휁. (b) Show in two ways that 푓 has an inverse if and only if 푓(푥, 푥) ≠ 0 for all 푥 ∈ 푃: by working directly with elements of ℐ(푃) and by using linear algebra. ∗ (c) Prove that 휇푃(푥, 푦) = 휇푃∗ (푦, 푥) where, as usual, 푃 is the dual of 푃. (d) Give three proofs of Theorem 5.5.5(b): by working directly with elements of ℐ(푃), by using linear algebra, and by using part (c) of this exercise. (24) (a) Recall that the Euler phi function 휙∶ ℙ → ℙ from Exercise 6 in Chapter 2 is defined by 휙(푛) = #{푚 ∈ [푛] ∣ gcd(푚, 푛) = 1}. Use a counting argument to prove that for all 푛 ∈ ℙ we have 푛 = ∑ 휙(푑). 푑∣푛 Hint: Consider the fractions 1/푛, 2/푛, ... , 푛/푛. Reduce each fraction to lowest terms and show that there will be exactly 휙(푑) fractions with denominator 푑 for each 푑 ∣ 푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

186 5. Counting with Partially Ordered Sets

(b) Show that for all 푛 ∈ ℙ we have 1 휙(푛) = 푛 ∏(1 − ) 푝 푝∣푛 where the product is over all primes 푝 dividing 푛. Hint: Use part (a) of this exercise. (25) Prove that if 푃 is a finite ranked poset with a 0̂ and a 1̂ which are distinct, then 휒(푃) has 푡 − 1 as a factor.

(26) Prove the formula for 휒(퐷푛) in Proposition 5.6.1 in two ways: using the definition of 휒 and using Theorem 5.6.2.

(27) Finish the proof of Proposition 5.6.1. Hint: For Π푛, remember that rk Π푛 = 푛 − 1 and use Theorem 3.6.1. For 퐿푛(푞) use Theorem 3.2.4. (28) (a) Let 퐺 be a graph with 푛 vertices. Show that ℒ(퐺) is a ranked lattice with rank function given by (5.17). (b) Show that the subgraph induced by a coloring of 퐺 is unique and is a bond. (c) Show that if 퐾푛 is the complete graph on 푛 vertices, then

ℒ(퐾푛) ≅ Π푛. (d) Show that if 푇 is a tree on 푛 vertices, then

ℒ(푇) ≅ 퐵푛−1. (29) Fill in the details of the proof of Lemma 5.7.1.

(30) This exercise refers to the proof that 휒(Π푛) = (푡 − 1)(푡 − 2) ⋯ (푡 − 푛 + 1) at the end of Section 5.7. (a) Prove the reverse direction of the equivalence which lead to equation (5.20). (b) Show that the map 휄∶ 퐼(푋) → 퐼(푋) is a well-defined sign-reversing involution without fixed points. (c) Show that the map 휌∶ (푃/ ∼) → Π푛 is a well-defined isomorphism of posets.

(31) Prove that 휒(퐿푛(푞); 푡) factors over the integers in two ways: by using quotient posets and by using Theorem 5.7.5.

(32) Reprove the factorization for 휒(Π푛) using Theorem 5.7.5. (33) This exercise is about Lemma 5.8.1(c). (a) Finish the proof. (b) Show that if [푥] = [1]̂ , then it is possible to have [푥] ∧ [푦] ≠ [푥 ∧ 푦]. (34) (a) State and prove a dual version of Theorem 5.8.3. (b) Using Weisner’s Theorem or its dual from part (a), rederive the formula for 휇(푃) when 푃 = 퐷푛,Π푛, and 퐿푛(푞). (35) (a) Let 퐿 be a finite lattice with atom set 풜. Show that if ⋁ 풜 ≠ 1̂, then 휇(퐿) = 0. (b) Use the Crosscut Theorem, Theorem 5.8.4, to rederive the formula for 휇(퐷푛). (36) Let 퐿 be a finite distributive lattice with atom set 풜(퐿) and join irreducibles Irr(퐿). (a) Show that 풜(퐿) ⊆ Irr(퐿) and that this is in fact true for any finite lattice. (b) Show that if 풜(퐿) ⊂ Irr(퐿) (proper subset), then ⋁ 풜(퐿) ≠ 1̂. (c) Show that if 풜(퐿) = Irr(퐿), then 퐿 ≅ 퐵푛 where 푛 = #풜(퐿).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 187

(d) Show that in 퐿 we have (−1)#풜(퐿) if 풜(퐿) = Irr(퐿), 휇(0,̂ 1)̂ = { 0 if 풜(퐿) ⊂ Irr(퐿). (37) Consider Young’s lattice 푌 of all integer partitions 휆 ordered by containment of Young diagrams. Given 휆, consider the interval 푃휆 = [(1), 휆] as a subposet of 푌.

Recall that |휆| = ∑푖 휆푖 where 휆 = (휆1, ... , 휆푘). (a) Compute 휇(푃휆) = 휇((1), 휆) for all 휆 with 1 ≤ |휆| ≤ 3. (b) Show that 휇(푃휆) = 0 for |휆| ≥ 4 in two ways: by using Weisner’s Theorem (Theorem 5.8.3) and by using the Crosscut Theorem (Theorem 5.8.4). (38) Complete the proof of Proposition 5.9.1. (39) Complete the proof of Theorem 5.9.3(a). (40) Complete the proof of Theorem 5.9.4.

(41) Show by direct computation (without using Theorem 5.9.4) that we have 퐹휇(푡) = −1 퐹 휁(푡) (a) in 퐶∞, (b) in 퐵∞. (42) (a) Prove that the direct sum of posets satisfies the axioms for a poset. (b) Show that 퐷∞ is isomorphic to a direct sum. (43) Prove the analogue of Theorem 5.9.3 in the case when 푃 is Dirichlet. (44) Prove Theorem 5.9.5. (45) Prove (5.27). Hint: Use Theorem 5.9.5.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 6

Counting with Group Actions

Sometimes we want to count objects up to some symmetry. As an example, consider counting necklaces with colored beads where two necklaces are considered the same if one is a rotation of the other. We can deal with such situations by considering a group acting on the underlying set. These tools can also be used in other contexts, such as in proving congruences from number theory or enumeration using roots of unity.

6.1. Groups acting on sets

In this section we introduce the basic notions which will be used throughout this chap- ter. We wish to have a formal way to talk about symmetries of objects. Let 퐺 be a group with identity element 푒, and let 푋 be a set. We say that 퐺 acts on 푋 if, for each 푔 ∈ 퐺, there is a map 푔∶ 푋 → 푋 such that for all 푥 ∈ 푋 (a) ℎ(푔(푥)) = (ℎ푔)(푥) for all ℎ, 푔 ∈ 퐺, (b) 푒(푥) = 푥. The reader should be careful to distinguish between when 푔 is being used to denote a group element and when it is being used as a map on 푋. For example, on the left in property (a), ℎ(푔(푥)) means apply the map 푔 to 푥 and then apply the map ℎ; i.e., compose the two maps. But on the right, ℎ푔 refers to the product in 퐺 and (ℎ푔)(푥) applies the corresponding map to 푥. It is common to write 푔푥 for 푔(푥). This should cause no confusion with group multiplication since 푥 is an element of 푋, not 퐺.

By way of illustration, consider the 4-cycle (1, 2, 3, 4) ∈ 픖4 and the group 퐺 = ⟨(1, 2, 3, 4)⟩ where the angle brackets denote the group generated by the elements in- side. So in our case

(6.1) 퐺 = {푒 = (1)(2)(3)(4), (1, 2, 3, 4), (1, 3)(2, 4), (1, 4, 3, 2)}.

189

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

190 6. Counting with Group Actions

Of course, 퐺 acts on [4] in the usual way. But we wish to consider an action on [4] 푋 = ( ) = {{1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}} 2 given by (6.2) 푔{푥, 푦} = {푔푥, 푔푦} for 푔 ∈ 퐺 and {푥, 푦} ∈ 푋. For example, (1, 2, 3, 4){1, 3} = {(1, 2, 3, 4)1, (1, 2, 3, 4)3} = {2, 4}.

For our first result, we will show that the maps 푔∶ 푋 → 푋 have special properties. Proposition 6.1.1. Suppose 퐺 acts on 푋. Then 푔∶ 푋 → 푋 is a bijection for all 푔 ∈ 퐺 and 푒∶ 푋 → 푋 is the identity map on 푋.

Proof. The statement about the map 푒 follows immediately from part (b) of the defini- tion of a group action. To prove that 푔∶ 푋 → 푋 is always a bijection, it suffices to show that 푔−1 ∶ 푋 → 푋 is its compositional inverse. So we must prove that, for all 푥 ∈ 푋, we have 푔−1(푔(푥)) = 푥 = 푔(푔−1(푥)). We will prove the first equality, leaving the second to the reader. But using require- ments (a) and (b) in the definition of a group action in that order gives 푔−1(푔(푥)) = (푔−1푔)(푥) = 푒(푥) = 푥 as required. □

When counting under the action of a group, all the elements of 푋 which can be obtained by acting on a given 푥 ∈ 푋 by the elements of the group will be considered the same. This leads to the following definition. If 퐺 acts on 푋, then the orbit of 푥 ∈ 푋 is

풪푥 = {푔푥 ∣ 푔 ∈ 퐺}.

Note that 풪푥 ⊆ 푋. It is important to keep in mind when we are talking about elements of 퐺 and when we are talking about elements of 푋. Continuing the example above, if 푥 = {1, 2}, then

풪{1,2} = { {1, 2}, (1, 2, 3, 4){1, 2}, (1, 3)(2, 4){1, 2}, (1, 4, 3, 2){1, 2} } = { {1, 2}, {2, 3}, {3, 4}, {1, 4} }.

Similar computations show that 풪{1,2} = 풪{2,3} = 풪{3,4} = 풪{1,4}. That is, if we let 풪 be the second line of the displayed equations above, then 풪푥 = 풪 for all 푥 ∈ 풪. Now consider what happens if 푥 = {1, 3}:

풪{1,3} = { {1, 3}, (1, 2, 3, 4){1, 3}, (1, 3)(2, 4){1, 3}, (1, 4, 3, 2){1, 3} } = { {1, 3}, {2, 4} }.

As before 풪{1,3} = 풪{2,4}. Also note that we have the set partition 푋 = 풪{1,2} ⊎ 풪{1,3}. And for all orbits 풪 we have #풪 ∣ #푋 where the vertical bar indicates divisibility of integers. These observations will be explained shortly.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.1. Groups acting on sets 191

Another important concept for analyzing group actions is the stabilizer. If 퐺 acts on 푋, then the stabilizer of 푥 ∈ 푋 is

퐺푥 = {푔 ∈ 퐺 ∣ 푔푥 = 푥}.

We clearly have 퐺푥 ⊆ 퐺. Returning to our running example, it is easy to check by considering each 푔 ∈ 퐺 that

퐺{1,2} = {푔 ∣ 푔{1, 2} = {1, 2}} = {푒} and

퐺{1,3} = {푔 ∣ 푔{1, 3} = {1, 3}} = {푒, (1, 3)(2, 4)}. Observe that both of these stabilizers are subgroups of 퐺. Furthermore #퐺 4 = = 2 = #풪{1,3} #퐺{1,3} 2

and similarly for #퐺{1,2} and #풪{1,2}. It is time to explain these patterns. The notation 퐻 ≤ 퐺 means that 퐻 is a subgroup of 퐺. Lemma 6.1.2. Let 퐺 act on 푋. (a) The distinct orbits form a set partition of 푋.

(b) For any 푥 ∈ 푋 we have 퐺푥 ≤ 퐺. (c) If 퐺 and 푋 are finite and 푥 ∈ 푋, then #퐺 #풪푥 = . #퐺푥

Proof. (a) Let 풪(1), ... , 풪(푘) be the distinct orbits of 퐺 acting on 푋. We first need to (푖) (푖) (푖) show 푋 = ⋃푖 풪 . Since 풪 ⊆ 푋 for all 푖 we clearly have ⋃푖 풪 ⊆ 푋. For the reverse (푖) containment, if 푥 ∈ 푋, then 푥 = 푒푥 ∈ 풪푥. Also 풪푥 = 풪 for some 푖. So 푥 is in the union as desired.

To show that the union is disjoint, it suffices to prove that if 풪푥 ∩ 풪푦 ≠ ∅ for two orbits 풪푥, 풪푦, then 풪푥 = 풪푦. We will show that 풪푥 ⊆ 풪푦 as then the reverse containment follows by just interchanging the roles of 풪푥 and 풪푦. So let 푧 ∈ 풪푥. Then there is 푔 ∈ 퐺 with 푧 = 푔푥. By assumption, there is some 푢 ∈ 풪푥 ∩ 풪푦 and thus there exist ℎ, 푘 ∈ 퐺 with 푢 = ℎ푥 and 푢 = 푘푦. It is an easy exercise to show that 푢 = ℎ푥 is equivalent to 푥 = ℎ−1푢. It follows that 푧 = 푔푥 = 푔(ℎ−1푢) = 푔(ℎ−1(푘푦)) = (푔ℎ−1푘)(푦).

This means 푧 ∈ 풪푦.

(b) We must show that 퐺푥 is a group, where the associative law is inherited from 퐺. We have 푒 ∈ 퐺푥 because 푒푥 = 푥. If 푔 ∈ 퐺푥, then 푔푥 = 푥 and so, as noted in (a), −1 −1 푔 푥 = 푥. This gives 푔 ∈ 퐺푥 so we have closure under taking inverses. Finally, if 푔, ℎ ∈ 퐺푥, then 푔푥 = 푥 and ℎ푥 = 푥. This gives (푔ℎ)(푥) = 푔(ℎ(푥)) = 푔(푥) = 푥 which gives closure under taking products.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

192 6. Counting with Group Actions

(c) By part (b) we can apply Lagrange’s Theorem which tells us that #퐺/#퐺푥 = #(퐺/퐺푥) where 퐺/퐺푥 is the set of left cosets of 퐺푥 in 퐺. So it suffices to find a bijection 푓 ∶ 퐺/퐺푥 → 풪푥. Define 푓(푔퐺푥) = 푔푥. We must show that this map is well-defined in that 푔퐺푥 = ℎ퐺푥 implies 푔푥 = ℎ푥. The hypothesis implies that 푔 = ℎ푘 where 푘 ∈ 퐺푥 so that 푘푥 = 푥. Now 푔푥 = (ℎ푘)푥 = ℎ(푘푥) = ℎ푥 −1 as we wished. To show 푓 is bijective, it suffices to construct an inverse. Define 푓 ∶ 풪푥 −1 → 퐺/퐺푥 by 푓 (푔푥) = 푔퐺푥 for each 푔푥 ∈ 풪푥. This is clearly the inverse as long as it is well-defined. So we must show that if 푔푥 = ℎ푥, then 푔퐺푥 = ℎ퐺푥. The hypothesis −1 −1 implies that 푥 = (푔 ℎ)푥 so that 푔 ℎ ∈ 퐺푥. This is equivalent to the desired conclu- sion. □

6.2. Burnside’s Lemma

As remarked in the previous section, when a group 퐺 acts on a set 푋 one sometimes wishes to consider all the elements in a given orbit of 퐺 the same. So it would be useful to have a formula for the number of orbits of the action. This result is usually referred to as Burnside’s Lemma because it was proved in his 1897 book, reprinted in [21], although Burnside himself was aware that the formula was already known. For computations, it is best to express the number of orbits in terms of a concept dual to the notion of a stabilizer of an elements of 푋. If 퐺 acts on 푋, then the fixed point set of 푔 ∈ 퐺 is 푋푔 = {푥 ∈ 푋 ∣ 푔푥 = 푥}. 푔 We have 푋 ⊆ 푋 while for the stabilizer of 푥 ∈ 푋 we have 퐺푥 ≤ 퐺. To remember this notation, note that in both cases the base of the expression (rather than the superscript or subscript) indicates whether we are dealing with a subset of 푋 or 퐺. Continuing the example from the previous section 푋(1,2,3,4) = ∅ and 푋(1,3)(2,4) = {{1, 3}, {2, 4}}. For any 퐺 and 푋 we have (6.3) 푋푒 = 푋.

Lemma 6.2.1 (Burnside’s Lemma). Let 퐺 act on 푋 with 퐺, 푋 finite. Then 1 number of orbits = ∑ #푋푔. #퐺 푔∈퐺

Proof. We will use the fact that for any positive integer 푛 푛 푛 ⏞⎴⎴⎴⎴⏞⎴⎴⎴⎴⏞1 1 1 1 = = + + ⋯ + . 푛 푛 푛 푛 It follows that for any finite set 풪 we have 1 ∑ = 1. #풪 푥∈풪

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.2. Burnside’s Lemma 193

From Lemma 6.1.2(a) and the finiteness hypothesis we can write 푋 = 풪(1) ⊎ ⋯ ⊎ 풪(푘) where 풪(푖), 1 ≤ 푖 ≤ 푘, are the orbits. Using this, the previous displayed equation, and Lemma 6.1.2(c) gives number of orbits = 푘 1 1 = ∑ + ⋯ + ∑ (1) (푘) 푥∈풪(1) #풪 푥∈풪(푘) #풪 1 = ∑ #풪 푥∈푋 푥 1 = ∑ #퐺 . #퐺 푥 푥∈푋 To express this last summation as a sum over 퐺, consider the matrix 푀 with rows indexed by 퐺, columns indexed by 푋, and entries 1 if 푔푥 = 푥, 푀 = { 푔,푥 0 otherwise. 푔 Clearly #퐺푥 is the sum of the entries in column 푥 of 푀. Similarly #푋 is the sum of 푔 the entries in row 푔 of 푀. So ∑푥∈푋 #퐺푥 and ∑푔∈퐺 #푋 are equal since both give the sum of the entries of 푀. This completes the proof. □

One standard application of Burnside’s Lemma is to count colored objects up to symmetry. To do this, we have to consider group actions on functions. Given sets 푋, 푌, we then let 푌 푋 denote the set of all functions 푓 ∶ 푋 → 푌. We know from Table 1.1 that |푌 푋 | = |푌||푋|. Suppose group 퐺 acts on the domain 푋. Then the induced action of 퐺 on 푌 푋 is defined by sending the function 푓 to the function 푔푓 such that (푔푓)(푥) = 푓(푔−1푥) for all 푥 ∈ 푋. Equivalently 푔푓 = 푓 ∘ 푔−1 where the circle indicates composition of functions. The reason that 푓 is composed with 푔−1 rather than 푔 is to make sure that condition (a) in the definition of a group action is satisfied. Indeed, ℎ(푔(푓)) = 푔(푓) ∘ ℎ−1 = 푓 ∘ 푔−1 ∘ ℎ−1 = 푓 ∘ (ℎ푔)−1 = (ℎ푔)(푓). And condition (b) is easy to verify as 푒푓 = 푓 ∘ 푒−1 = 푓 ∘ 푒 = 푓 since 푒∶ 푋 → 푋 is the identity map. Note that if we defined 푔푓 = 푓 ∘ 푔, then the two sides of (a) would not be equal. As our first application of Burnside’s Lemma, we consider colorings ofa 4-bead necklace with two colors: black (퐵) and white (푊). We wish to count the number of different necklaces if two necklaces are considered the same if one is a rotation ofthe other. If rotation is not considered, then each of the 4 beads can be colored in two ways, resulting in 42 = 16 necklaces which are displayed in Figure 6.1. Putting all necklaces which are rotations of a given necklace into a set together, we obtain a partition of the set of all necklaces. So we wish to count the number of blocks, which can easily be seen to be six in this case. But we would like to take an approach that could be generalized to more colors or more beads.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

194 6. Counting with Group Actions

{ }{ } , , ,

{ }{ } , , , ,

{ }{ } , , ,

Figure 6.1. The orbits of 2-colored, 4-bead necklaces under rotation

To get group actions into the act, label the four corners of the necklace using the set 푋 = {1, 2, 3, 4} as shown in Figure 6.2. Note that 퐺 = ⟨(1, 2, 3, 4)⟩ acts on 푋 in a way that corresponds to rotation of the necklace. Letting 푌 = {퐵, 푊}, each necklace coloring defines a function 푓 ∶ 푋 → 푌 and the action of 퐺 on 푌 푋 rotates colored necklaces. The blocks in Figure 6.1 are exactly the orbits of this action. So we can use Burnside’s Lemma as long as we can find a way to compute the fixed points (푌 푋 )푔 for each 푔 ∈ 퐺. To do that, the following lemma will be crucial. It shows that the fixed points of 푔 acting on 푌 푋 are exactly the functions 푓 which are constant on the cycles of 푔 acting on 푋. An example will be found in Figure 6.2.

Lemma 6.2.2. Let 퐺 act on a set 푋 and let 푌 be another set with 퐺, 푋, 푌 finite. Let 푔 ∈ 퐺 and 푐(푔) = number of cycles of 푔 acting on 푋. (a) For 푓 ∈ 푌 푋 we have 푔푓 = 푓 if and only if 푓(푥) = 푓(푥′) whenever 푥, 푥′ are in the same cycle of 푔 acting on 푋. (b) We have #(푌 푋 )푔 = |푌|푐(푔).

Proof. (a) We will prove the forward direction as the reverse is similar. Since 푥, 푥′ are in the same cycle of 푔 there must be an 푖 such that 푔푖푥 = 푥′. And since 푔푓 = 푓 we also have 푔푖푓 = 푓. It follows that

푓(푥) = 푓(푔−푖푥′) = (푔푖푓)(푥′) = 푓(푥′).

(b) From part (a), the fixed points of 푔 acting on 푌 푋 are obtained as follows. Choose an element 푦 for each cycle of 푔 acting on 푋 and let 푔(푥) = 푦 for all 푥 in that cycle. The number of ways of doing this is clearly |푌|푐(푔). □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.2. Burnside’s Lemma 195

2 1 labeling:

3 4

(푌 푋 )(1,3)(2,4) ∶ , , ,

Figure 6.2. Labeling the necklace and the fixed points of 푔 = (1, 3)(2, 4)

Returning to our example where |푌| = 2, we can use part (b) of the previous lemma to construct the following table for the 푔 ∈ ⟨(1, 2, 3, 4)⟩: 푔 푐(푔) #(푌 푋 )푔 (1)(2)(3)(4) 4 24 (1, 2, 3, 4) 1 21 (1, 3)(2, 4) 2 22 (1, 4, 3, 2) 1 21 Applying Burnside’s Lemma, we see that the number of distinct necklaces is 1 1 ∑ #(푌 푋 )푔 = (24 + 21 + 22 + 21) = 6 #퐺 4 푔∈퐺 as before. An immediate benefit of using this approach is that little work is needed todo the more general case where there are 푟 colors for the beads. Because of Lemma 6.2.2 the powers of 2 just get replaced by powers of 푟 so that for 4-bead, 푟-color necklaces under rotation 1 (6.4) number of orbits = (푟4 + 푟2 + 2푟). 4 It follows that 4 must divide evenly into 푟4 + 푟2 + 2푟 for all 푟 ∈ ℙ, a fact that is not obvious a priori. For a 3-dimensional example, let us find the number of distinct colorings of faces of a cube with 푟 colors if two colorings are equivalent when one is a rotation of the other. Label the faces with 푋 = [6] as shown in Figure 6.3 where arrows indicate labels for faces which cannot be seen. Colorings will be functions 푓 ∈ 푌 푋 where #푌 = 푟. Rotations can be classified by the axis of rotation and the angle through which one rotates, the exception being the identity whose cycle structure acting on 푋 is 푒 = (1)(2)(3)(4)(5)(6). If the rotation is through an axis bisecting opposite faces, then one will get the same cycle decomposition for angles of both ±90∘. Using the axis and direction given for the first rotation in Figure 6.3, one gets the permutation (1)(2, 3, 4, 5)(6). Furthermore, there are three possible pairs of opposite faces to use, giving a total of (3 axes)(2 rotations per axis) = 6 rotations of this type. A complete list of possible rotations 푔 is summarized in the following table, where the example

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

196 6. Counting with Group Actions

1 ← 4 3 5 → 2

↑ 6 labeling

face rotation edge rotation vertex rotation

Figure 6.3. Labeling and rotating a cube

permutations are all as indicated in Figure 6.3: type of rotation 푔 number of 푔 example 푔 푐(푔) #(푌 푋 )푔 identity 1 (1)(2)(3)(4)(5)(6) 6 푟6 face by ±90∘ 3 ⋅ 2 = 6 (1)(2, 3, 4, 5)(6) 3 푟3 face by 180∘ 3 ⋅ 1 = 3 (1)(2, 4)(3, 5)(6) 4 푟4 edge by 180∘ 6 ⋅ 1 = 6 (1, 4)(2, 6)(3, 5) 3 푟3 vertex by ±120∘ 4 ⋅ 2 = 8 (1, 2, 5)(3, 6, 4) 2 푟2

From this table we see that the total number of rotations is #퐺 = 24. Applying Burnside’s Lemma to the information given in the second and fourth columns gives the number of colorings as 1 1 ∑ #(푌 푋 )푔 = (푟6 + 3푟4 + 12푟3 + 8푟2). #퐺 24 푔∈퐺 As a check, consider the case 푟 = 2 with 푋 = {퐵, 푊}. Then by inspection of the small number of possibilities one can verify the following data where 퐵푖푊 푗 indicates having 푖 faces colored black and 푗 colored white: color distribution number of colorings 퐵6 or 푊 6 1 + 1 = 2 퐵5푊 or 푊퐵5 1 + 1 = 2 퐵4푊 2 or 푊 2퐵4 2 + 2 = 4 퐵3푊 3 2

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.3. The cycle index 197

So the total number of colorings is 10. On the other hand 1 (26 + 3 ⋅ 24 + 12 ⋅ 23 + 8 ⋅ 22) = 10 24 as well.

6.3. The cycle index

In the previous section we saw in equation (6.4) that the number of orbits of 퐺 = ⟨(1, 2, 3, 4)⟩ acting on 푌 [4] where #푌 = 푟 is given by (푟4 + 푟2 + 2푟)/4, a polynomial in a single variable. By permitting more variables, we can encode more information about the action of a group 퐺 on a set 푋. This will permit us to find generating functions for 푋 the number of orbits of the induced actions of 퐺 on subsets (푘) and on permutations 푃(푋, 푘) for 푘 ≥ 0. Suppose 퐺 is a finite group acting on a set 푋 with #푋 = 푛. If 푔 ∈ 퐺, then let

푐푖 = 푐푖(푔) = number of cycles of 푔 of length 푖.

Given variables 푡1, ... , 푡푛, the associated cycle index of 푔 is the monomial

푐1 푐2 푐푛 푧(푔) = 푧(푔; 푡1, 푡2, ... , 푡푛) = 푡1 푡2 ⋯ 푡푛 . The cycle index or cycle indicator of 퐺 acting on 푋 is 1 푍(퐺) = 푍(퐺; 푡 , 푡 , ... , 푡 ) = ∑ 푧(푔). 1 2 푛 #퐺 푔∈퐺 To illustrate, if 푋 = [4] and 퐺 = ⟨(1, 2, 3, 4)⟩, then by (6.1) we have 푔 푧(푔) 4 (1)(2)(3)(4) 푡1 (1, 2, 3, 4) 푡4 2 (1, 3)(2, 4) 푡2 (1, 4, 3, 2) 푡4 so that 1 (6.5) 푍(퐺) = (푡4 + 푡2 + 2푡 ). 4 1 2 4

Note that setting 푡1 = 푡2 = 푡3 = 푡4 = 푟 in 푍(퐺) we obtain the count in (6.4) for the orbits of 퐺 acting on [푟]푋 . It turns out that other specializations of 푍(퐺) give orbit-generating functions for other actions. 푋 If 퐺 acts on 푋 and 푘 ∈ ℕ, then there is an induced action on (푘) given by

푔{푥1, 푥2, ... , 푥푘} = {푔푥1, 푔푥2, ... , 푔푥푘}. The special case of this action when 퐺 = ⟨(1, 2, 3, 4)⟩ and 푋 = [4] was the running example in Section 6.1. Instead of subsets, one can consider the set 푃(푋, 푘) of 푘-permu- tations of 푋. The induced action on 푃(푋, 푘) is

푔(푥1푥2 ... 푥푘) = 푔(푥1)푔(푥2) ... 푔(푥푘). [4] In Section 6.1, we say that the orbits of 퐺 = ⟨(1, 2, 3, 4)⟩ acting on ( 2 ) were {{1, 2}, {2, 3}, {3, 4}, {1, 4}} and {{1, 3}, {2, 4}}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

198 6. Counting with Group Actions

Similarly, one can compute that the orbits of 퐺 acting on 푃([4], 2) = {12, 21, 13, 31, 14, 41, 23, 32, 24, 42, 34, 43} are {12, 23, 34, 41}, {21, 32, 43, 14}, and {13, 24, 31, 42}. In order to apply Burnside’s Lemma, we need an analogue of Lemma 6.2.2 in this con- text. Lemma 6.3.1. Let 퐺 act on 푋 with 푋 finite and consider 푔 ∈ 퐺. 푋 (a) For 푆 ∈ (푘) we have 푔푆 = 푆 if and only if 푆 is a union of cycles of 푔 (where to take the union we use the underlying set of each cycle).

(b) For 휋 = 푥1 ... 푥푘 ∈ 푃(푋, 푘) we have 푔휋 = 휋 if and only if each 푔푥푖 = 푥푖 for all 푖 ∈ [푘].

Proof. (a) We prove the forward direction and leave the converse as an exercise. It suffices to prove that if 푥 ∈ 푆 and 푥′ is any element of the cycle of 푔 containing 푥, then 푥′ ∈ 푆. Since 푥, 푥′ are in the same cycle of 푔 there is some 푖 with 푔푖푥 = 푥′. Also 푥 ∈ 푆 and 푔푆 = 푆 so that 푥′ = 푔푖푥 ∈ 푔푖푆 = 푆 and we are done.

(b) By definition of the action on 푃(푋, 푘), 푔휋 = 휋 means that we have 푔(푥1) ... 푔(푥푘) = 푥1 ... 푥푘. Since permutations are ordered collections of elements, this is equivalent □ to 푔(푥푖) = 푥푖 for all 푖 as desired.

We can now obtain expressions for the number of orbits of the induced actions of 푋 퐺 on (푘) and on 푃(푋, 푘) from the cycle indicator for 퐺’s action on 푋 itself. Note that the first generating polynomial is ordinary while the second is exponential. Theorem 6.3.2. Let 퐺 be finite acting on 푋 with #푋 = 푛. Also let 푋 푏 = number of orbits of 퐺 acting on ( ), 푘 푘

푝푘 = number of orbits of 퐺 acting on 푃(푋, 푘).

푛 푘 2 푛 (a) ∑ 푏푘푡 = 푍(퐺; 1 + 푡, 1 + 푡 , ... , 1 + 푡 ). 푘=0 푛 푡푘 (b) ∑ 푝 = 푍(퐺; 1 + 푡, 1, ... , 1). 푘 푘! 푘=0

푋 Proof. (a) Applying Burnside’s Lemma to (푘) for each 푘 and interchanging summa- tions gives 푛 푛 푔 1 푋 ∑ 푏 푡푘 = ∑ ∑ #( ) 푡푘. 푘 #퐺 푘 푘=0 푔∈퐺 푘=0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.3. The cycle index 199

So it suffices to show that for all 푔 ∈ 퐺 we have

푛 푔 푋 (6.6) 푧(푔; 1 + 푡, ... , 1 + 푡푛) = ∑ ( ) 푡푘 푘 푘=0 since then 푛 1 ∑ 푏 푡푘 = ∑ 푧(푔; 1 + 푡, ... , 1 + 푡푛) = 푍(퐺; 1 + 푡, ... , 1 + 푡푛). 푘 #퐺 푘=0 푔∈퐺

To prove (6.6), we use weight-generating functions as in Section 3.4. Weight a set 푔 푔 푔 푋 #푆 푋 푋 푆 ∈ (푘) by wt 푆 = 푡 . Then the weight ogf 푓풮(푡) for 풮 = (0) ⊎ ⋯ ⊎ (푛) is precisely the right-hand side of (6.6). To obtain the left side, write 푔 = 푔1푔2 ... 푔푚 where the 푔푖 are the cycles of 푔. By Lemma 6.3.1(a), 푆 ∈ 풮 if and only if 푆 is a union of some of the 푔푖. So 풮 can be thought of as a product 풮1 × 풮2 × ⋯ × 풮푚 where the 푖th coordinate can be either 푔푖 or ∅ if 푆 does or does not contain 푔푖, respectively. And the weighting on 풮 can be obtained by letting the weight of that coordinate be 푡#푔푖 or 푡#∅ = 1 for the two respective cases and then using the usual product weighting. Using the Sum and Product Rules for weight ogfs (Lemma 3.4.1) yields

#푔1 #푔2 #푔푚 푓풮(푡) = (1 + 푡 )(1 + 푡 ) ⋯ (1 + 푡 ) = (1 + 푡)푐1(푔)(1 + 푡2)푐2(푔) ... (1 + 푡푛)푐푛(푔) = 푧(푔; 1 + 푡, 1 + 푡2, ... , 1 + 푡푛) which completes the proof.

푔 (b) Let 푐1 = 푐1(푔). By Lemma 6.3.1(b), 휋 = 푥1 ... 푥푘 ∈ 푃(푋, 푘) if and only if 푥푖 is a fixed point of 푔 for all 푖 ∈ [푘]. Since fixed points are cycles of length one, if we choose the elements of 휋 in the order 푥1, ... , 푥푘, then the number of choices for 푥푖 is 푐1 − 푖 + 1. 푔 It follows that |푃(푋, 푘) | = 푐1 ↓푘 and so

푛 푘 푛 푛 푡 푐1 ↓푘 푐1 ∑ |푃(푋, 푘)푔| = ∑ 푡푘 = ∑ ( )푡푘 = (1 + 푡)푐1 . 푘! 푘! 푘 푘=0 푘=0 푘=0 Finally, applying Burnside’s Lemma similarly to (a) yields

푛 푛 푡푘 1 푡푘 ∑ 푝 = ∑ ∑ |푃(푋, 푘)푔| 푘 푘! #퐺 푘! 푘=0 푔∈퐺 푘=0 1 = ∑ (1 + 푡)푐1(푔) #퐺 푔∈퐺

= 푍(퐺; 1 + 푡, 1, ... , 1) as desired. □

As a reality check, let’s compute the generating functions for 퐺 = ⟨(1, 2, 3, 4)⟩ and 푋 푋 = [4] and compare the results with the computations of the orbits for ( 2 ) and 푃(푋, 2)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

200 6. Counting with Group Actions

above. Using part (a) of the previous result and (6.5) gives 푘 4 ∑ 푏푘푡 = 푍(퐺; 1 + 푡, ... , 1 + 푡 ) 푘 1 = ((1 + 푡)4 + (1 + 푡2)2 + 2(1 + 푡4)) 4 = 1 + 푡 + 2푡2 + 푡3 + 푡4. Note that the coefficient of 푡2 is 2 which is the number of orbits we found previously for this case. Note also that this generating function gives you much more information as it gives the number of orbits for all 푘, not just 푘 = 2. We now make the analogous computation for 퐺’s action on 푃(푋, 푘): 푡푘 1 ∑ 푝 = 푍(퐺; 1 + 푡, 1, 1, 1) = ((1 + 푡)4 + 1 + 2) 푘 푘! 4 푘 3 1 = 1 + 푡 + 푡2 + 푡3 + 푡4 2 4 푡 푡2 푡3 푡4 = 1 + + 3 + 6 + 6 . 1! 2! 3! 4! The coefficient of 푡2/2! is 3 which agrees with our earlier computations.

6.4. Redfield–Pólya theory

We can use the cycle index to give more refined information about orbit counts. For example, it can be used to compute the number of necklaces up to rotation which have a given number of beads of each color. This approach was developed by Redfield [71]. It was rediscovered and popularized by Pólya [70]. Let 퐺 act on 푋 with 퐺, 푋 finite, and let 푌 be a set of variables. The weight of 푓 ∈ 푌 푋 is the monomial wt 푓 = ∏ 푓(푥). 푥∈푋 Consider the example of the 4-bead necklace with two colors 푌 = {퐵, 푊} discussed in Section 6.2. Then the second necklace on the first line of Figure 6.1 would have weight wt 푓 = 푓(1)푓(2)푓(3)푓(4) = 퐵푊푊푊 = 퐵푊 3. Note that every other necklace in the orbit of the given one has the same weight. This is not an accident. Proposition 6.4.1. Let 퐺 act on 푋 with 퐺, 푋 finite, and let 푌 be a set of variables. If 푓, 푓′ are in the same orbit of 퐺 acting on 푌 푋 , then wt 푓 = wt 푓′.

Proof. Since 푓, 푓′ are in the same orbit, there is some 푔 ∈ 퐺 with 푓′ = 푔푓. By defini- tion of 퐺’s action on 푌 푋 and the fact that 푔∶ 푋 → 푋 is a bijection wt 푓′ = wt(푔푓) = ∏(푔푓)(푥) = ∏ 푓(푔−1푥) = ∏ 푓(푥′) = wt 푓 푥∈푋 푥∈푋 푥′∈푋 as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.4. Redfield–Pólya theory 201

Because of this result, if 풪 is an orbit of 퐺 acting on 푌 푋 , then we have a well- defined weight of 풪 given by wt 풪 = wt 푓 for any 푓 ∈ 풪. So the second orbit in Figure 6.1 would have wt 풪 = 퐵푊 3. We can express the weight-generating function for the orbits of 퐺 acting on 푌 푋 by making certain substitutions into the cycle index for 퐺 acting on 푋. The proof will be a weighted version of the demonstration of Burnside’s Lemma combined with some ideas in the proof of Theorem 6.3.2(a). We note that the theory of weight-generating functions from Section 3.4 carries over easily to generating functions with many vari- ables. Theorem 6.4.2 (Redfield–Pólya Theorem). Let 퐺 be a finite group acting on 푋 where #푋 = 푛. Suppose 푌 is a set of variables. Then

∑ wt 풪 = 푍(퐺; ∑ 푦, ∑ 푦2,..., ∑ 푦푛) 풪 푦∈푌 푦∈푌 푦∈푌 where the left-hand sum is over the orbits of 퐺 acting on 푌 푋 .

Proof. Recall from the proof of Burnside’s Lemma that for any orbit 풪 we have 1 ∑ = 1 #풪 푓∈풪 푓

since 풪푓 = 풪 for all 푓 ∈ 풪. It follows that wt 푓 wt 풪 ∑ = ∑ = wt 풪. #풪 #풪 푓∈풪 푓 푓∈풪 푓 Now using Lemma 6.1.2(a) and (c) yields wt 푓 wt 푓 1 (6.7) ∑ wt 풪 = ∑ ∑ = ∑ = ∑ |퐺 | wt 푓. #풪 #풪 #퐺 푓 풪 풪 푓∈풪 푓 푓∈푌 푋 푓 푓∈푌 푋

Again taking a tip from the demonstration of Lemma 6.2.1, consider a matrix 푀 with rows indexed by 퐺, columns indexed by 푌 푋 , and entries wt 푓 if 푔푓 = 푓, 푀 = { 푔,푓 0 otherwise.

푋 where 푔 ∈ 퐺 and 푓 ∈ 푌 . The sum of column 푓 of 푀 is |퐺푓| wt 푓 while the sum of row 푔 is ∑ wt 푓. 푓∈(푌 푋 )푔 Using this and (6.7) gives 1 1 ∑ wt 풪 = ∑ 푀 = ∑ ∑ wt 푓. #퐺 푔,푓 #퐺 풪 푔,푓 푔∈퐺 푓∈(푌 푋 )푔 To finish the proof, we just need to show that

(6.8) 푧(푔; ∑ 푦, ∑ 푦2,..., ∑ 푦푛) = ∑ wt 푓 푦∈푌 푦∈푌 푦∈푌 푓∈(푌 푋 )푔

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

202 6. Counting with Group Actions

because then, combining the previous displayed equations,

1 ∑ wt 풪 = ∑ 푧(푔; ∑ 푦, ∑ 푦2,..., ∑ 푦푛) #퐺 풪 푔∈퐺 푦∈푌 푦∈푌 푦∈푌

= 푍(퐺; ∑ 푦, ∑ 푦2,..., ∑ 푦푛). 푦∈푌 푦∈푌 푦∈푌

Let 풮 = (푌 푋 )푔. By definition, the right side of(6.8) is the weight-generating func- tion of 풮. Let 푔 = 푔1 ⋯ 푔푚 be the decomposition of 푔 into disjoint cycles. Similarly to the proof of Theorem 6.3.2(a), we can decompose 풮 as a product 풮1 × ⋯ × 풮푚 where the 푖th coordinate contains the possible image sets 푓(푔푖) of 푓 on the 푖th cycle. But from Lemma 6.2.2 we have that 푔 fixes 푓 if and only if 푓 is constant on the 푔푖. So 푓(푔푖) must consist of #푔푖 copies of some element of 푌. It follows that the weight-generating #푔푖 function for these sets is ∑푦∈푌 푦 . Now the Product Rule for ogfs yields

∑ wt 푓 = ( ∑ 푦#푔1 )( ∑ 푦#푔2 ) ⋯ ( ∑ 푦#푔푚 ) 푓∈(푌 푋 )푔 푦∈푌 푦∈푌 푦∈푌

푐1(푔) 푐2(푔) 푐푛(푔) = ( ∑ 푦) ( ∑ 푦2) ... ( ∑ 푦푛) 푦∈푌 푦∈푌 푦∈푌

= 푧(푔; ∑ 푦, ∑ 푦2,..., ∑ 푦푛) 푦∈푌 푦∈푌 푦∈푌 which is what we wished to show. □

Let us look again at the 4-bead necklaces under the rotation group 퐺 = ⟨(1, 2, 3, 4)⟩ and with color set 푌 = {퐵, 푊}. The cycle index for 퐺 was computed in (6.5). So, by the result just proved, the weight-generating function for the orbits is 푍(퐺; 퐵 + 푊, 퐵2 + 푊 2, 퐵3 + 푊 3, 퐵4 + 푊 4) 1 = [(퐵 + 푊)4 + (퐵2 + 푊 2)2 + 2(퐵4 + 푊 4)] 4 = 퐵4 + 퐵3푊 + 2퐵2푊 2 + 퐵푊 3 + 푊 4. Of course, in this simple example we could have gotten the same result by just looking at the orbit list in Figure 6.1. For a more substantial example, we return to the problem, first raised in Sec- tion 1.9, of counting unlabeled graphs with 푛 vertices. This can be handled using either Theorem 6.3.2(a) or Theorem 6.4.2. We will use the former as it will make the calcula- tions slightly easier. In particular, we will find the generating function

푚 (6.9) ∑ 푔푚푡 푚≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.4. Redfield–Pólya theory 203

1 1 1 1 ⎧ ⎫ ⎧ ⎫ ⎪ ⎪ ⎪ ⎪ ↔ ↔ ⎨ ⎬ ⎨ ⎬ ⎪ ⎪ ⎪ ⎪ ⎩ 2 3 ⎭ ⎩ 2 3 , 2 3 , 2 3 ⎭

1 1 1 1 ⎧ ⎫ ⎧ ⎫ ⎪ ⎪ ⎪ ⎪ ↔ ↔ ⎨ ⎬ ⎨ ⎬ ⎪ ⎪ ⎪ ⎪ ⎩ 2 3 , 2 3 , 2 3 ⎭ ⎩ 2 3 ⎭

Figure 6.4. The orbits of labeled graphs and the corresponding unlabeled graphs

where 푔푚 is the number of unlabeled graphs with 푛 vertices and 푚 edges. An unlabeled graph on 푛 vertices can be considered as the orbit of a set of labeled graphs with vertex set [푛] under the action of the symmetric group 픖푛 in the following way. If 퐸 is the [푛] edge set of a graph with 푉 = [푛], then 퐸 ⊆ ( 2 ). Since the vertex set is fixed, we can identify the graph with its subset of edges. The orbits of this action when 푛 = 3 are [푛] shown in Figure 6.4. So to use Theorem 6.3.2(a) we need to take 푋 = ( 2 ). In order (2) to distinguish the action of 휋 ∈ 픖푛 on 푉 = [푛] from its action on 푋 we will use 휋 (2) and 픖푛 for the group elements and the group in the latter case, where the action on {푖, 푗} ∈ 푋 is the usual one, 휋{푖, 푗} = {휋(푖), 휋(푗)}. It follows that ([푛]) 푔 = number of orbits of the induced action of 픖(2) acting on ( 2 ) 푚 푛 푚 as needed for the theorem. (2) [푛] We must first compute the cycle index of 픖푛 acting on 푋 = ( 2 ). As usual when dealing with graphs, we will write {푖, 푗} ∈ 푋 as 푖푗. Either 푖, 푗 are in the same cycle of 휋 or they are in different cycles. Consider first when 푖, 푗 ∈ 휅, a cycle of 휋 with |휅| = 푘. Consider the elements of 휅 as lying clockwise on a circle and breaking it up into 푘 arcs of length one. An orbit of 휅(2) consists of all pairs 휅푝(푖)휅푝(푗) for all possible powers 푝. This gives all pairs at the same distance 푑 around the cycle where 1 ≤ 푑 ≤ 푘/2. See the orbit on the left in Figure 6.5 for an example when 휅 = (1, 2, 3, 4, 5, 6) and 푑 = 2. If (2) 1 ≤ 푑 < 푘/2, then this orbit has 푘 elements and so contributes a 푡푘 to 푧(휋 ). Since the number of such orbits is the floor function ⌊(푘−1)/2⌋, these orbits together give a factor ⌊(푘−1)/2⌋ of 푡푘 . If 푘 is even, then the orbit when 푑 = 푘/2 contains 푘/2 edges for a factor of 푡푘/2. The orbit on the right in Figure 6.5 is an illustration. If we make the convention (2) that 푡푞 = 1 when 푞 is not an integer, then we can write the total contribution of 휅 as ⌊(푘−1)/2⌋ 푡푘 푡푘/2 regardless of the parity of 푘. Now consider the case where 푖 ∈ 휅 and 푗 ∈ 훾 for two different cycles of 휋, and suppose #휅 = 푘, #훾 = 푙. Now the orbit consists of edges of the form 휅푝(푖)훾푝(푗). So

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

204 6. Counting with Group Actions

1 1

6 2 6 2

(2) (2) 풪{1,3} = 풪{1,4} =

5 3 5 3

4 4

Figure 6.5. Two orbits of 휅(2) when 휅 = (1, 2, 3, 4, 5, 6)

the number of edges in the orbit will be the smallest positive 푝 such that 휅푝(푖) = 푖 and 훾푝(푗) = 푗. We have 휅푝(푖) = 푖 if and only if 푝 is divisible by 푘 and similarly for the second condition. Thus the smallest 푝 satisfying both is 푝 = lcm(푘, 푙). Since this is independent of 푖, 푗 all such orbits have the same size. The total number of possible pairs 푖푗 in the given two cycles is 푘푙, which means that the number of orbits is 푘푙/ lcm(푘, 푙) = gcd(푘,푙) (2) gcd(푘, 푙). Hence these orbits contribute 푡lcm(푘,푙) to 푧(휋 ). We are now in a position to calculate 푧(휋(2)). Suppose the cycle type of 휋 is given 푚1 푚푛 in multiplicity notation as (1 , ... , 푛 ) so that 휋 has 푚푘 cycles of length 푘. We have

푛 푘 푚푘 (2) 푚푘⌊(푘−1)/2⌋ 푚푘 ( 2 ) 푚푘푚푙 gcd(푘,푙) (6.10) 푧(휋 ) = ∏ 푡푘 푡푘/2 푡푘 ∏ 푡lcm(푘,푙) , 푘=1 1≤푘<푙≤푛

푚푘 where the exponent factors of 푚푘, ( 2 ), and 푚푘푚푙 count the number of ways to choose a single cycle of length 푘, a pair of cycles both of length 푘, and two cycles one of length 푘 and one of length 푙, respectively. Note that this expression depends only on the cycle type of 휋 and not directly on 휋 itself. So we will be able to combine terms with the same index by using the following result.

푚1 푚푛 Proposition 6.4.3. If 휆 = (1 , ... , 푛 ), then the number of 휋 ∈ 픖푛 with cycle type 휆 is 푛! 푘휆 = 푛 . 푚푘 ∏푘=1 푘 푚푘!

Proof. Consider a template of 푛 blank spaces arranged in cycles corresponding to a product of the given cycle type. An example follows this proof. There are 푛! ways to fill the spaces with the elements of [푛]. So to complete the count we just need to divide by the number of fillings that give the same permutation. If we fill ablank 푘-cycle with any of (푎1, 푎2, ... , 푎푘), (푎2, 푎3, ... , 푎푘, 푎1), ... , (푎푘, 푎1, ... , 푎푘−1), then the permutation 푚푘 will not change. This gives a total of 푘 possibilities for the 푚푘 cycles of length 푘. Also, since disjoint cycles commute, we can permute any of these 푚푘 cycles among □ themselves. This explains the factor of 푚푘! and finishes the proof.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.5. An application to proving congruences 205

As an example, suppose 휆 = (32). Then the template would look like ( , , )( , , ). Of the 6! ways to fill the template, (1, 2, 3)(4, 5, 6) would be the same as (2, 3, 1)(4, 5, 6) since (1, 2, 3) = (2, 3, 1). Also (1, 2, 3)(4, 5, 6) corresponds to the same permutation as (4, 5, 6)(1, 2, 3) since cycles commute. Combining (6.10) and Proposition 6.4.3, as well as canceling the 1/푛! in the cycle index into the 푛! in 푘휆, gives 푛 푚 ⌊(푘−1)/2⌋+푘 푚푘 (2) 1 푘 ( 2 ) 푚푘 푚푘푚푙 gcd(푘,푙) (6.11) 푍(픖푛 ) = ∑ ∏ 푡푘 푡푘/2 ∏ 푡 푘푚푘 푚 ! lcm(푘,푙) 휆⊢푛 푘=1 푘 1≤푘<푙≤푛

where 휆 = (1푚1 , ... , 푛푚푛 ). By Theorem 6.4.2, we obtain the generating function (6.9) for unlabeled graphs by number of edges using the substitution

푚 (2) 2 푛 ∑ 푔푚푡 = 푍(픖푛 , 1 + 푡, 1 + 푡 , ... , 1 + 푡 ). 푚≥0

As a check, suppose 푛 = 3. Then there are three summands in (6.11) given in the chart 휆 summand

3 3 (1 ) 푡1 /6 (1, 2) 푡1푡2/2 (3) 푡3/3 so that 1 1 1 푍(픖(2)) = 푡3 + 푡 푡 + 푡 . 푛 6 1 2 1 2 3 3 Thus the generating function for unlabeled graphs on three vertices by number of edges is 1 1 1 (1 + 푡)3 + (1 + 푡)(1 + 푡2) + (1 + 푡3) = 1 + 푡 + 푡2 + 푡3, 6 2 3 a fact which can be easily verified using Figure 6.4.

6.5. An application to proving congruences

We now show how group actions and Möbius inversion can be combined to prove var- ious congruences from number theory. The advantages of this approach are twofold. One is that it can be used to prove a large array of congruences; see [77] for many exam- ples. The other is that these demonstrations can be approached in a uniform manner rather than having to use ad hoc techniques for each of them. Suppose 푎, 푏 ∈ ℤ and 푚 ∈ ℙ. Recall that 푎 and 푏 are congruent modulo 푚, which we write as 푎 ≡ 푏 (mod 푚), if 푎 and 푏 both leave the same remainder on division by 푚. Equivalently, 푚 ∣ 푎 − 푏. We will start with the easiest case, which is when the modulus 푚 is a prime.

Lemma 6.5.1. Let 푝 be a prime. Let 퐺 = ⟨푔⟩ be a group with #퐺 = 푝. Then for any finite set 푋 on which 퐺 acts #푋 ≡ #푋푔 (mod 푝).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

206 6. Counting with Group Actions

Proof. By Lemma 6.1.2(c), for any orbit 풪푥 of the action, we have #퐺푥 = #퐺/#풪푥. So 풪푥 must divide evenly into #퐺 = 푝. Since 푝 is prime, the only possibilities are #풪푥 = 1 푔 or 푝. In the latter case, #풪푥 ≡ 0 (mod 푝). And in the former, 푥 ∈ 푋 . Since the orbits partition 푋 by Lemma 6.1.2(a), we have #푋 = ∑ #풪 ≡ ∑ 1 = #푋푔 (mod 푝) 풪 푥∈푋푔 where the first sum is over the distinct orbits of 퐺 acting on 푋. □

We will now use this lemma to prove three well-known congruences. The first is named Fermat’s Little Theorem. As described in Burton [22, p. 514], Pierre de Fermat stated this theorem in a letter to his friend Frénicle de Bessy in 1640. However, the first published proof seems to be in a 1736 paper of Euler according to Ore’s book [67, p. 273]. The proof we give here, as well as the one for Wilson’s Theorem which follows, are due to Julius Petersen in a paper from 1872 [68]. Theorem 6.5.2 (Fermat’s Little Theorem). Let 푎 ∈ ℤ and let 푝 be prime. Then 푎푝 ≡ 푎 (mod 푝).

Proof. It suffices to prove this for one element out of every congruence class mod- ulo 푝, so we can assume 푎 > 0. Let 푋 = [푝], 푌 = [푎] and consider the action of 퐺 = ⟨(1, 2, ... , 푝)⟩ on 푌 푋 . Of course, we picked this action because |푌 푋 | = 푎푝. And according to Lemma 6.2.2, the fixed points of 푔 = (1, 2, ... , 푝) are the 푓 which are con- stant on this cycle. But this means 푓(1) = 푓(2) = ⋯ = 푓(푝) ∈ [푎]. It follows that #(푌 푋 )푔 = 푎. The congruence now follows from Lemma 6.5.1. □

We next turn to Wilson’s Congruence. It was stated around 1000 AD by Ibn al- Haytham; see O’Connor and Robertson [65]. In 1770, Waring [97] mentioned that the result had been found by his student, Wilson, but neither of them could prove it. A demonstration was give by Lagrange [56] one year later. To prove this result, we will need to consider that action of 픖푛 on itself by conjugation. −1 Lemma 6.5.3. Suppose 휋, 휎 ∈ 픖푛 and let 휏 = 휎휋휎 . Then the cycles of 휏 are exactly those of the form (휎(푖), 휎(푗), ... , 휎(푘)) where (푖, 푗, ... , 푘) is a cycle of 휋.

Proof. Since 휋 is a bijection, it suffices to show that if (푖, 푗, ... , 푘) is a cycle of 휋, then (휎(푖), 휎(푗), ... , 휎(푘)) is a cycle of 휏. Equivalently, we must demonstrate that if 휋(푖) = 푗, then 휏(휎(푖)) = 휎(푗). But 휏(휎(푖)) = 휎휋휎−1(휎(푖)) = 휎휋(푖) = 휎(푗) so we are done. □

It is not hard to show, and so left as an exercise, that there is an action of 픖푛 itself −1 where 휎∶ 픖푛 → 픖푛 sends 휋 to 휎휋휎 . From this definition it is clear that the action can be restricted to any subset of 픖푛 which is a union of conjugacy classes. And by the previous result, a conjugacy class consists of all permutations of the same cycle type since given two cycles, one can always construct a 휎 such that the first is obtained

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.5. An application to proving congruences 207

from the second by applying 휎 to each of its elements. We can also obviously act with a subgroup of 픖푛 rather than the whole group. Theorem 6.5.4 (Wilson’s Congruence). If 푝 is prime, then (푝 − 1)! ≡ −1 (mod 푝).

Proof. Let 퐺 = ⟨휎⟩ where 휎 = (1, 2, ... , 푝) acts on the conjugacy class 푋 of 픖푛 consist- ing of all 푝-cycles. By Proposition 4.3.1 we have #푋 = (푝 − 1)!. It suffices to show that #푋휍 = 푝 − 1 since then, by Lemma 6.5.1 (푝 − 1)! = #푋 ≡ #푋휍 ≡ −1 (mod 푝).

In fact, we claim that 푋휍 = {휎푖 ∣ 0 < 푖 < 푝}. Note that 휎푖 is in 푋 for 0 < 푖 < 푝 since 푝 is prime and so these powers are all 푝-cycles. Also, 휎푖 ∈ 푋푔 since 휎휎푖휎−1 = 휎푖. To show that these are the only elements fixed by 휎, suppose 휋 ∈ 푋휍. Since 휋 is a single cycle we must have 휋(1) = 1 + 푖 for some 푖 with 0 < 푖 < 푝. We will show that 휋 = 휎푖. Since 휋 ∈ 푋휍 we have 휎푗휋휎−푗 = 휋 for any 푗. So, by Lemma 6.5.3, 휋 must also send 휎푗(1) = 1 + 푗 to 휎푗(1 + 푖) = 1 + 푖 + 푗 for any 푗. In particular, 휋 sends 1 + 푖 to 1 + 2푖, 1 + 2푖 to 1 + 3푖, and so forth, where all values are taken modulo 푝. It follows that 휋 = 휎푖 as desired. □

Our next goal is to prove a congruence of Lucas [59]. It gives a way of evaluating a binomial coefficients modulo a prime in terms of the digits inthe 푝-ary expansions of the two arguments. First, we will prove a warm-up result which gives a recursion for the binomial coefficients modulo 푝. Note how this recurrence relation is the same as the one in Theorem 1.3.3(a) except that every −1 has been replaced by a −푝.

Lemma 6.5.5. Let 푝 be prime and let 푛 ≥ 푝. We have 푛 푛 − 푝 푛 − 푝 ( ) ≡ ( ) + ( ) (mod 푝). 푘 푘 − 푝 푘

Proof. When 푘 < 0 or 푘 > 푛, then it is easy to check that both sides are zero. If 0 ≤ 푘 ≤ 푛, then let 휎 = (1, 2, ... , 푝)(푝 + 1)(푝 + 2) ... (푛) and consider the action of [푛] 푛 퐺 = ⟨휎⟩ on 푋 = ( 푘 ). So #푋 = (푘). Because of the cycle structure of 휎, Lemma 6.3.1(a) implies that 푆 ∈ 푋휍 if and only if [푝] ⊆ 푆 or [푝] ⊆ [푛] − 푆. In the first case, the number of ways to choose the remaining 푘 − 푝 elements of 푆 from the elements of [푛] − [푝] is 푛−푝 (푘−푝). In the second, we must choose 푘 elements from [푛] − [푝] to be in 푆 for a total of 푛−푝 □ ( 푘 ) choices. Adding the two counts and using Lemma 6.5.1 finishes the proof.

Theorem 6.5.6 (Lucas’s Congruence). Let 푝 be prime and let 0 ≤ 푘 ≤ 푛. Consider the 푖 푖 base 푝 expansions 푛 = ∑푖≥0 푛푖푝 and 푘 = ∑푖≥0 푘푖푝 where 0 ≤ 푛푖, 푘푖 < 푝 for all 푖. We have 푛 푛 ( ) ≡ ∏ ( 푖) (mod 푝). 푘 푘 푖≥0 푖

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

208 6. Counting with Group Actions

′ ′ Proof. Dividing by 푝 we can write 푛 = 푝푛 + 푛0 and 푘 = 푝푘 + 푘0. We will prove that ′ 푛 푛 푛0 ( ) ≡ ( ′)( ) (mod 푝) 푘 푘 푘0 from which Lucas’s result follows by induction on 푛. Let 퐺 = ⟨휎⟩ for the permutation ′ ′ 휎 = 휎1휎2 ⋯ 휎푛′ (푝푛 + 1)(푝푛 + 2) ⋯ (푛), [푛] where 휎푖 = (푝(푖 − 1) + 1, 푝(푖 − 1) + 2, ... , 푝푖). Then 퐺 acts on 푋 = ( 푘 ). Since 푘0 < 푝, for 푆 ∈ 푋 to be fixed by 휎 we must have that 푆 is the union of 푘0 elements from ′ ′ ′ 푝푛 + 1, 푝푛 + 2, ... , 푛 together with 푘 of the cycles 휎푖. So the number of ways of ′ 푛0 푛 choosing the elements of the first type is ( ), and of the second ( ′). This completes 푘0 푘 the proof. □

To prove congruences with a nonprime modulus, we need to use Möbius inversion. Let 퐺 be a group acting on 푋 with 퐺, 푋 finite. Call 푥 ∈ 푋 aperiodic if 퐺푥 = 푒, the identity element of 퐺. (For sets of one element we sometimes dispense with the curly braces.) By Lemma 6.1.2(c), this is equivalent to 푥 lying in an orbit of size #퐺. Since distinct orbits are disjoint by Lemma 6.1.2(a), the number of aperiodic elements is divisible by #퐺. Now consider 퐿(퐺), the lattice of subgroups of 퐺 ordered by inclusion. If 퐻 ∈ 퐿(퐺), then we define two functions

훼(퐻) = #{푥 ∈ 푋 ∣ 퐺푥 = 퐻} and 훽(퐻) = #{푥 ∈ 푋 ∣ 퐺푥 ≥ 퐻}.

It follows immediately that 훽(퐻) = ∑퐾≥퐻 훼(퐾). Furthermore 훼(푒) is, by definition, the set of aperiodic elements. So, by the previous paragraph, 훼(푒) ≡ 0 (mod #퐺). Applying the Möbius Inversion Theorem, Theorem 5.5.5(a), and using the fact that {푒} is the 0̂ element of 퐿(퐺), we have proved the following result. Theorem 6.5.7. Suppose 퐺 acts on 푋 with 퐺, 푋 finite. We have

∑ 휇(퐻)훽(퐻) ≡ 0 (mod #퐺). □ 퐻∈퐿(퐺)

Before applying this result, we note that it has Lemma 6.5.1 as a corollary. Indeed, by Lagrange’s Theorem, any 퐻 ≤ 퐺 has #퐻 ∣ #퐺. So if #퐺 = 푝 is prime, then #퐻 = 1 or #퐻 = 푝. It follows that 퐻 = 푒 and 퐻 = 퐺 are the only subgroups of 퐺 and 퐿(퐺) ≅ 퐶1, the chain with two elements. Thus 휇(푒) = 1, 휇(퐺) = −1, and Theorem 6.5.7 becomes 훽(푒) − 훽(퐺) ≡ 0 (mod 푝). 푔 But 훽(푒) = #푋 since every 푥 ∈ 푋 satisfies 퐺푥 ≥ 푒. Furthermore, 훽(퐺) = #푋 where 퐺 = ⟨푔⟩. Indeed, 퐺푥 ≥ 퐺 implies 퐺푥 = 퐺, which in turn is equivalent to 푔푥 = 푥 since ⟨푔⟩ = 퐺. Plugging these values into the previous displayed equation yields Lemma 6.5.1. We now give an application of Theorem 6.5.7. We first need to characterize 퐿(퐺) when 퐺 is an arbitrary cyclic group.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.6. The cyclic sieving phenomenon 209

Proposition 6.5.8. If 푛 ∈ ℙ and 퐺 = ⟨(1, 2, ... , 푛)⟩, then 퐿(퐺) ≅ 퐷푛, the lattice of divisors of 푛.

Proof. Let 푔 = (1, 2, ... , 푛). Since 퐺 is cyclic, so is every subgroup. For 푑 ∣ 푛 let 푑 퐻푑 = ⟨푔 ⟩ so that #퐻푑 = 푛/푑. We now have a bijection from the set of 퐻푑 to 퐷푛 given by 퐻푑 ↦ 푛/푑. Clearly this map and its inverse are order preserving. So we will be done if we can show that every subgroup of 퐺 is one of the 퐻푑. Suppose 퐻 ≤ 퐺. Since 퐻 is cyclic, we can write 퐻 = ⟨푔푑⟩ and choose 푑 with 0 ≤ 푑 < 푛 that is minimum over all generators of 퐻. We claim 퐻 = 퐻푑 which will finish the proof. For this, it suffices to show that 푑 ∣ 푛. Suppose, towards a contradiction, that this is not the case. Then dividing 푛 by 푑 gives 푛 = 푞푑 + 푟 where 0 ≤ 푟 < 푑. Now 푔푟 = 푔푛−푞푑 = (푔−푞)푑 ∈ 퐻. But this contradicts the fact that 푑 was the smallest possible exponent of an element of 퐻. So the proof is complete. □

We can now prove an analogue of Lemma 6.5.5 modulo 푝2. Proposition 6.5.9. Let 푝 be prime and 푛 ≥ 푝2. We have 푝 푛 푝 푛 − 푝2 ( ) ≡ ∑ ( )( ) (mod 푝2). 푘 푖 푘 − 푖푝 푖=0

2 Proof. Let 푔 = (1, 2, ... , 푝 ) ∈ 픖푛 where we do not write down any cycles of length one. If 퐺 = ⟨푔⟩, then, by the previous proposition and its proof, 퐿(퐺) consists of three groups 푒, 퐻 = ⟨푔푝⟩, and 퐺, with Möbius values 휇(푒) = 1, 휇(퐻) = −1, and 휇(퐺) = 0. So from Theorem 6.5.7 we have 훽(푒) ≡ 훽(퐻) (mod 푝2) for any 푋 on which 퐺 acts. Let [푛] 푛 푋 = ( 푘 ). Then 훽(푒) = #푋 = (푘). So we will be done if we can show that 훽(퐻) is the right-hand sum in the statement of the proposition. Note that 푔푝 = (1, 1 + 푝, 1 + 2푝, ... )(2, 2 + 푝, 2 + 2푝, ... ) ⋯ (푝, 2푝, 3푝, ... ) and so consists of 푝 cycles each with 푝 elements. By Lemma 6.3.1(a), 푔푝푆 = 푆 if and only if each of these cycles is either entirely in 푆 or in its complement. If 푖 of the cycles 푝 are in 푆, 0 ≤ 푖 ≤ 푝, then there are (푖 ) ways to choose which cycles. Once these cycles are chosen, we must choose 푘 − 푖푝 other elements to be in 푆 from the 푛 − 푝2 elements 2 푛−푝2 □ of [푛] − [푝 ]. Since this can be done in (푘−푖푝) ways, we are done.

6.6. The cyclic sieving phenomenon

As we saw in Chapter 2, one can often express the solution to a counting problem as a sum where the summands have positive and negative coefficients. So one might ex- pect that there are also situations where 푛th roots of unity come into play for 푛 > 2. This is indeed the case with instances of the cyclic sieving phenomenon, so called be- cause these roots of unity form a cyclic group. This concept was introduced by Reiner, Stanton, and White [72]. For a survey of such results see [80].

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

210 6. Counting with Group Actions

[3] Table 6.1. The action of (1, 2, 3) on (( 2 )) (1, 2, 3)11 = 22, (1, 2, 3)22 = 33, (1, 2, 3)33 = 11, (1, 2, 3)12 = 23, (1, 2, 3)13 = 12, (1, 2, 3)23 = 13.

Let 퐺 be a multiplicative group with identity element 푒. If 푔 ∈ 퐺, then we let 표(푔) be the order of 푔, that is, the smallest positive integer such that 푔표(푔) = 푒. In particular, we will be interested in the cyclic group ⟨푒2휋푖/푛⟩ ⊂ ℂ consisting of all 푛th roots of unity. An 푛th root of unity 휔 is primitive if 표(휔) = 푛. So the primitive 푛th roots 2푘휋푖/푛 are exactly those of the form 푒 where gcd(푘, 푛) = 1. We will use the notation 휔푛 for a primitive 푛th root of unity. We need the following facts.

Lemma 6.6.1. Let 휔 ≠ 1 be an 푛th root of unity. (a) 1 + 휔 + 휔2 + ⋯ + 휔푛−1 = 0. (b) If 휔 is primitive, then 1 + 휔 + 휔2 + ⋯ + 휔푖 ≠ 0 for 0 ≤ 푖 < 푛 − 1.

Proof. We will prove (a) and leave (b) as an exercise. Since 휔 is an 푛th root of unity, 휔푛 = 1 which can be rewritten as 0 = 1 − 휔푛 = (1 − 휔)(1 + 휔 + 휔2 + ⋯ + 휔푛−1). Since 휔 ≠ 1 we have 1 − 휔 ≠ 0. It follows that the second factor in the displayed equation above is zero, which completes the proof. □

Suppose we are given a cyclic group 퐺, a set 푋 on which 퐺 acts, and a polynomial 푓(푞) ∈ ℕ[푞]. The triple (푋, 퐺, 푓(푞)) exhibits the cyclic sieving phenomenon or CSP if for all 푔 ∈ 퐺 we have

푔 (6.12) #푋 = 푓(휔표(푔)). So we can count the fixed points of 푔 by plugging into 푓(푞) a root of unity which has the same order. This is quite surprising since there is, a priori, no promise that substituting a complex number into 푓(푞) will yield an integer, much less that it will count some- thing! Nevertheless, many examples of the CSP have been found and we will explore one in this section. Before we do this, note that a special case of (6.12) is 푓(1) = #푋푒 = #푋. So 푓(푞) will be a 푞-analogue of #푋. For our running example we will take 퐺 = ⟨(1, 2, ... , 푛)⟩ acting on the set of mul- [푛] tisets 푋 = (( 푘 )) by 푔{{푥1, ... , 푥푘}} = {{푔푥1, ... , 푔푥푘}}. For ease of notation, in this section we will dispense with the curly braces and commas and just write 푥1 ... 푥푛 for a multiset with the understanding that this is not a permu- tation. To be really concrete, let 푛 = 3 and 푘 = 2. So 푋 = {11, 22, 33, 12, 13, 23}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

6.6. The cyclic sieving phenomenon 211

The action of (1, 2, 3) on 푋 is given in Table 6.1. Recalling Theorem 1.3.4 and the remark at the end of the previous paragraph, a natural choice for our polynomial is 푛 + 푘 − 1 푓(푞) = [ ] . 푘 푞 In the special case under consideration 4 푓(푞) = [ ] = 1 + 푞 + 2푞2 + 푞3 + 푞4. 2

Note that (1, 2, 3) has the same order as a root 휔 = 휔3. And in this case, using Lemma 6.6.1(a), 푓(휔) = 1 + 휔 + 2휔2 + 휔3 + 휔4 = (1 + 휔 + 휔2)(1 + 휔2) = 0 = #푋(1,2,3) where the last equality can be seen from Table 6.1. The rest of this section will be devoted to proving the following result. Another example of the CSP will be found in the exercises.

Theorem 6.6.2. The cyclic sieving phenomenon is exhibited by the triple

[푛] 푛 + 푘 − 1 ((( )), ⟨(1, 2, ... , 푛)⟩, [ ] ). 푘 푘 푞

Theorem 6.6.2 will be proved by a sequence of results which will permit us to ex- plicitly evaluate the two sides of (6.12). We will start on the left. We first need an analogue of Lemma 6.3.1 for multisets. The disjoint union of two multisets is defined by 푎푙푎 ... 푐푙푐 ⊎ 푎푚푎 ... 푐푚푐 = 푎푙푎+푚푎 ... 푐푙푐+푚푐 . Note that this definition also applies to sets, where a set is just a multiset with allmul- tiplicities zero or one. So, for example, 123 ⊎ 123 ⊎ 254 = 12233254. The proof of the next lemma is similar to that of Lemma 6.3.1(a) and so is left as an exercise.

푋 Lemma 6.6.3. Let 퐺 act on 푋 with both finite and let 푔 ∈ 퐺. For 푀 ∈ ((푘)) we have 푔푀 = 푀 if and only if 푀 is a disjoint union of (not necessarily distinct) cycles of 푔. □

It is now easy to count fixed points in this situation. To simplify notation, let 퐶푛 = ⟨(1, 2, ... , 푛)⟩.

[푛] Corollary 6.6.4. Let 푋 = (( 푘 )) and suppose 푔 ∈ 퐶푛 has 표(푔) = 푑. We have 푛/푑 ⎧ (( )) if 푑 ∣ 푘, ⎪ 푘/푑 #푋푔 = ⎨ ⎪ ⎩ 0 otherwise.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

212 6. Counting with Group Actions

Proof. Since 푔 is a power of (1, 2, ... , 푛) its disjoint cycle decomposition must consist of 푛/푖 cycles of length 푖 for some 푖 ∣ 푛. It follows that if 표(푔) = 푑, then 푔 is a product of 푛/푑 cycles of length 푑. So if 푑 does not divide 푘, then a multiset with 푘 elements cannot be written as a disjoint union of cycles of 푔. Thus Lemma 6.6.3 forces #푋푔 = 0. On the other hand, if 푑 ∣ 푘, then, by the same lemma, the fixed points are those 푀 which are a disjoint union of 푘/푑 of the 푛/푑 cycles of 푔. Since cycles can be chosen with repetition, we have now verified the count for #푋푔 in this case as well. □

For the right-hand side of (6.12), we need the following lemma.

Lemma 6.6.5. Suppose 푚 ≡ 푛 (mod 푑) and 휔 = 휔푑. We have 푚 if 푑 ∣ 푛, [푚]푞 푛 (6.13) lim = { 푞→휔 [푛] 푞 1 otherwise.

Proof. By the assumption about 푚, 푛 we can write 푚 = 푘푑 + 푟 and 푛 = 푙푑 + 푟 for some 푘, 푙 ∈ ℕ and 0 ≤ 푟 < 푑. Using the definition of [푛]푞 in equation (3.2), we see that 푟 (6.14) [푛]푞 = [푟]푞 + 푞 [푑]푞[푙]푞푑 where the reader will note the substitution of 푞푑 for 푞 in the last factor. A similar expression holds for [푚]푞. So if 푟 ≠ 0, then Lemma 6.6.1 gives [푚]휔 = [푟]휔 = [푛]휔 where [푟]휔 ≠ 0. The “otherwise” case of the lemma follows immediately. If 푟 = 0, then we can use the displayed equation and the fact that 휔푑 = 1 to write

[푚]푞 [푑]푞[푘]푞푑 [푘] 푘 푚 lim = lim = 1 = = 푞→휔 [푛]푞 푞→휔 [푑]푞[푙]푞푑 [푙]1 푙 푛 as desired. □

As a corollary, we can evaluate certain 푞-binomial coefficients when substituting 휔.

Corollary 6.6.6. Suppose 휔 = 휔푑 where 푑 ∣ 푛. We have 푛/푑 + 푘/푑 − 1 ⎧ ( ) if 푑 ∣ 푘, 푛 + 푘 − 1 ⎪ 푘/푑 [ ] = 푘 ⎨ 휔 ⎪ ⎩ 0 otherwise.

Proof. After canceling [푛 − 1]푞! we have

푛 + 푘 − 1 [푛]푞[푛 + 1]푞 ⋯ [푛 + 푘 − 1]푞 [ ] = . 푘 [1] [2] ⋯ [푘] 푞 푞 푞 푞 Since 푑 ∣ 푛 we have, using Lemma 6.6.1 and equation (6.14), that in the product [푛]휔[푛 + 1]휔 ⋯ [푛 + 푘 − 1]휔 the first factor and every 푑th factor after that is zero while the other factors are nonzero. Furthermore, the factors which become zero have 휔 as a root with multiplicty one. By the same token, in [1]휔[2]휔 ⋯ [푘]휔 the zero factors have period 푑 but one starts with 푑 − 1 nonzero factors, and again each zero factor has 휔 as

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 213

a root exactly once. It follows that the number of times 휔 is a root in the numerator is always greater than or equal to the number in the denominator, with equality if and only if 푑 ∣ 푘. So if 푑 does not divide 푘, then, because the 푞-binomial coefficient is a polynomial in 푞, this polynomial will have a factor whose root is 휔. Thus the second case of the corollary is proved. To see what happens when 푑 ∣ 푘, we use the previous lemma to obtain

푛 + 푘 − 1 [푛]푞 [푛 + 1]푞 [푛 + 2]푞 [푛 + 푘 − 1]푞 [ ] = lim( ⋅ ⋅ ⋯ ) 푘 푞→휔 [푘] [1] [2] [푘 − 1] 휔 푞 푞 푞 푞 푛 푛 + 푑 푛 + 2푑 = ⋅ 1 ⋯ 1 ⋅ ⋅ 1 ⋯ 1 ⋅ ⋅ 1 ⋯ 푘 푑 2푑 푛/푑 푛/푑 + 1 푛/푑 + 2 = ⋅ ⋅ ⋯ 푘/푑 1 2 푛/푑 + 푘/푑 − 1 = ( ) 푘/푑 which is what we wished to prove in this case. □

Comparing Corollaries 6.6.4 and 6.6.6 while remembering Theorem 1.3.4 com- pletes the proof of Theorem 6.6.2. Another way to prove this result using symmetric functions and representation theory will be found in Section 7.9.

Exercises

(1) Complete the proof of Proposition 6.1.1. (2) (a) Prove that (6.2) satisfies the definition of a group action. (b) Consider the action of 퐺 = ⟨(1, 2, 3, 4)⟩ on the set 푃([4], 2) of 2-permutations of [4] given by 푔(푥푦) = 푔(푥)푔(푦). Compute the orbits and stabilizers of this action and verify that the parts of Lemma 6.1.2 are satisfied. (3) Show that if 퐺 acts on 푋, then 푔푥 = 푦 if and only if 푥 = 푔−1푦. (4) (a) Let 퐺 act on 푋 and let 푌 be another set. For 푓 ∈ 푌 푋 , define 푔푓 = 푓 ∘ 푔 and show that this definition does not satisfy part (a) of the definition ofagroup action. (b) Let 퐺 act on 푌 and let 푋 be another set. For 푓 ∈ 푌 푋 , define 푔푓 = 푔 ∘ 푓 and show that this defines a group action on 푌 푋 . (5) Complete the proof of Lemma 6.2.2(a). (6) Prove that 4 ∣ 푟4 + 푟2 + 2푟 for all 푟 ∈ ℤ by considering the possible congruence classes of 푟 modulo 4.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

214 6. Counting with Group Actions

(7) (a) Show that the number of distinct 푛-bead, 푟-color necklaces under rotation is

푛 1 1 ∑ 푟gcd(푖,푛) = ∑ 휙(푛/푑)푟푑 푛 푛 푖=1 푑∣푛

where 휙 is Euler’s function. Hint: For the second sum, use the hint for Exer- cise 24(a) in Chapter 5. (b) Use part (a) to give two new proofs of the formula obtained in the text when 푛 = 4. (8) (a) The group of symmetries of a regular 푛-gon is called a dihedral group and consists of the 푛 rotations and 푛 reflections which map the 푛-gon to itself. Find the number of different 4-bead, 푟-color necklaces if necklaces are considered the same when one is a rotation or reflection of the other. (b) Find an expression for the number of distinct 푛-bead, 푟-color necklaces if two are the same when one is a rotation or a reflection of the other. (9) (a) How many distinct cubes are there under rotation if the edges are colored from a set with 푟 colors? (b) Repeat part (a) if you are coloring the vertices. (10) (a) How many distinct regular tetrahedra are there under rotation if the faces are colored from a set with 푟 colors? (b) Show in two ways that you get the same answer in (a) if you color vertices: one using Burnside’s Lemma and one without using it. (11) Calculate the cycle index of 퐺 acting on 푋 for the following pairs. (a) 퐺 = ⟨(1, 2, ... , 푛)⟩ and 푋 = [푛]. (b) 퐺 is the dihedral group of a regular 푛-gon (see Exercise 8(a)) and 푋 = [푛]. (c) 퐺 is the group of rotations of the cube and 푋 is the faces of the cube. (d) Repeat part (c) for 푋 being the edges and vertices of the cube. (12) Complete the proof of Lemma 6.3.1(a). (13) Using the notation of Theorem 6.3.2, give two proofs of each of the following facts about the 푏푘 and 푝푘, one using their definition in terms of orbits and one using the expression for their generating functions in terms of 푍(퐺). (a) 푏0 = 푝0 = 1. (b) 푝푛 = 푝푛−1. (c) 푏푛 = 1.

(14) Call a sequence 푎0, ... , 푎푛 symmetric or palindromic if 푎푘 = 푎푛−푘 for all 푘 with 푛 푘 0 ≤ 푘 ≤ 푛. In this case also call the associated generating function ∑푘=0 푎푘푡 symmetric or palindromic. 푛 푛 푛 (a) Give three proofs that the sequence (0), (1),..., (푛) is palindromic: one us- ing (1.5), one inductive, and one combinatorial. (b) Prove that the product of palindromic unimodal polynomials is palindromic and unimodal. (c) Use (b) to give another proof of (a). (d) Prove that the generating function in Theorem 6.3.2(a) is palindromic.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 215

(15) (a) Consider 4-bead 푟-colored necklaces under rotation. Find the number of dis- tinct necklaces which have 2 beads of one color and 2 beads of another color in two ways: by using Theorem 6.4.2 and by making a direct count. (b) Consider the cube under rotation where the faces have been colored black and white. Find the generating function for the number of orbits by the number of white and number of black faces in two ways: by using Theorem 6.4.2 and by making a direct count. (c) Repeat part (b) for coloring the edges and for coloring the vertices. (16) Find the generating function for unlabeled digraphs on 푛 vertices by the number of arcs.

(17) Let 퐾푛 denote the unlabeled complete graph on 푛 vertices. (a) Find a polynomial 푝(푟) which counts the number of colorings of the edges of 퐾푛 with 푟 colors. (b) Show that 푝(2) equals the result of plugging 푡 = 1 into equation (6.9). (18) Use the fact that every integer can be written uniquely as a product of primes to show that if 푘, 푙 ∈ ℙ, then 푘푙 lcm(푘, 푙) = . gcd(푘, 푙) (19) Call 푎, 푏 ∈ ℤ relatively prime if gcd(푎, 푏) = 1. Recall that every integer can be written uniquely as a product of primes. (a) Show that if gcd(푎, 푚) = gcd(푏, 푚) = 1, then gcd(푎푏, 푚) = 1. (b) Let 푚 ∈ ℙ and let [푎] denote the congruence class of 푎 modulo 푚. Use part (a) to show that

퐺푚 = {[푎] ∣ gcd(푎, 푚) = 1} is a group. (c) Use part (b) to give two proofs that if 푝 is prime and gcd(푎, 푝) = 1, then 푎푝−1 ≡ 1 (mod 푝), one demonstration using Fermat’s Little Theorem and one using Lagrange’s Theorem from group theory. (d) Prove Euler’s Theorem: if 푎 and 푛 are relatively prime, then 푎휙(푛) ≡ 1 (mod 푛) where 휙 is the Euler phi function from Exercise 6 of Chapter 2. Hint: Use the ideas in the second proof of part (c). (e) Show that Fermat’s Little Theorem is a special case of part (d). (20) (a) Use Theorem 6.5.7 to prove that if 푎, 푛 ∈ ℙ, then ∑ 휇(푛/푑)푎푑 ≡ 0 (mod 푛). 푑∣푛 (b) Show that Fermat’s Little Theorem can be derived from part (a). −1 (21) (a) Show that the map 휎∶ 픖푛 → 픖푛 which sends 휋 to 휎휋휎 defines an action of 픖푛 on itself. (b) Show that the converse of Wilson’s Theorem is true: for 푛 > 1 we have that (푛 − 1)! ≡ −1 (mod 푛) implies 푛 is prime.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

216 6. Counting with Group Actions

(22) (a) Let 푝 be prime and let 푛 ≥ 푝. Show for the Stirling numbers of the first kind that 푐(푛, 푘) ≡ 푐(푛 − 푝, 푘 − 푝) − 푐(푛 − 푝, 푘 − 1) (mod 푝). (b) Let 푝 be prime and let 푛 > 푘푝. Show that 푐(푛, 푘) ≡ 0 (mod 푝). Hint: Use part (a). (c) Let 푝 be prime. Given 푛 ≥ 푘 ≥ 0, write 푛 = 푛″푝 + 푛′ where 0 ≤ 푛′ < 푝 and 푘 = 푘″(푝 − 1) + 푘′ where 0 ≤ 푘′ < 푝 − 1. Then ″ ″ 푛 푐(푛, 푛 − 푘) ≡ (−1)푘 ( )푐(푛′, 푛′ − 푘′) (mod 푝). 푘″ Hint: Use part (a). (d) Consider two polynomials 푓(푥), 푔(푥) ∈ ℤ[푥] and let 푚 ∈ ℙ. We say that 푓(푥) is congruent to 푔(푥) modulo 푚 if every coefficient of 푓(푥) − 푔(푥) is divisible by 푚. If 푝 is prime, show that 푝 푥↓푝≡ 푥 − 푥 (mod 푝)

where 푥↓푝= 푥(푥 − 1) ⋯ (푥 − 푝 + 1). Hint: Use Theorem 3.1.2 and part (a). (23) Finish the proof of Theorem 6.5.6 (24) (a) Show that if 퐺 acts on 푋, then it also acts on 푆(푋, 푘), the set of partitions of 푋 into 푘 blocks. (b) Let 푝 be prime and let 푛 ≥ 푝. Show for the Stirling numbers of the second kind that 푆(푛, 푘) ≡ 푆(푛 − 푝, 푘 − 푝) + 푆(푛 − 푝 + 1, 푘) (mod 푝). (c) If 푝 is prime and 푘 ∈ ℤ, then show 1 if 푘 = 1 or 푝 푆(푝, 푘) ≡ { (mod 푝). 0 otherwise (d) Suppose 푝 is prime and 0 ≤ 푘 ≤ 푛. If 푗 satisfies 푝푗 ≤ 푘 < 푝푗+1, then 푆(푛 + 푝푗(푝 − 1), 푘) ≡ 푆(푛, 푘) (mod 푝). Hint: Use part (b). (25) (a) Recall from Exercise 10 in Chapter 1 that Pascal’s triangle is fractal modulo 2. Give a second proof of this by using Lucas’s Congruence (Theorem 6.5.6) to show that if 0 ≤ 푛 < 2푚 and 0 ≤ 푘 ≤ 푛 + 2푚, then 푛 ⎧( ) if 0 ≤ 푘 ≤ 푛 ⎪ 푘 푚 ⎪ 푛 + 2 푚 ( ) ≡ 0 if 푛 < 푘 < 2 (mod 2). 푘 ⎨ ⎪ 푛 ⎪( ) if 2푚 ≤ 푘 ≤ 푛 + 2푚 ⎩ 푘 − 2푚 (b) Find and prove the analogue of part (a) for an arbitrary prime 푝 ≥ 2. (26) Use Proposition 6.5.8 to prove an analogue of Proposition 6.5.9 for any 푛 ∈ ℙ.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 217

(27) (a) Prove that for any group 퐺 the poset 퐿(퐺) is a lattice. In particular, if 퐻, 퐾 ∈ 퐿(퐺), then 퐻 ∧ 퐾 = 퐻 ∩ 퐾 and 퐻 ∨ 퐾 is the subgroup of 퐺 generated by 퐻, 퐾. 푛 (b) Let ℤ푝 be the integers modulo 푝 and consider the direct sum ℤ푝 of 푛 copies 푛 of ℤ푝. Show that if 푝 is prime, then 퐿(ℤ푝 ) ≅ 퐿푛(푝), the subspace lattice of dimension 푛 over the Galois field with 푝 elements. (c) Use part (b) to prove a congruence modulo 푝2 for the binomial coefficients. (28) Prove Lemma 6.6.1(b). (29) Prove Lemma 6.6.3. (30) Let 푝 ∈ ℙ and 푔 = (1, 2, ... , 푝). Consider the group 퐺 = ⟨푔⟩ acting on the set of [푛] multisets 푋 = (( 푘 )). (a) Suppose 푀 = {{1푚1 , 2푚2 , ... , 푛푚푛 }}. Show that 푔푀 = 푀 if and only if

푚1 = 푚2 = ⋯ = 푚푝. (b) Show that if 푝 is prime, then 푛 푛 − 푝 (( )) ≡ ∑ (( )) (mod 푝). 푘 푘 − 푚푝 푚≥0 (31) (a) Prove that 푒2푘휋푖/푛 is a primitive 푛th root of unity if and only if gcd(푘, 푛) = 1. (b) Prove that the CSP is exhibited by the triple [푛] 푛 (( ), ⟨(1, 2, ... , 푛)⟩, [ ] ). 푘 푘 푞

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 7

Counting with Symmetric Functions

A formal power series is symmetric if it is invariant under permutation of its variables. We have already seen generating functions of this type arise naturally in Theorem 6.4.2. Symmetric functions have many other connections to combinatorics some of which we will discuss in this chapter. In particular, we will see that they arise when studying log-concavity, Young tableaux, various posets, chromatic polynomials, and the cyclic sieving phenomenon. They are also intimately connected with group representations. The appendix to this book contains a summary of the facts we will need in this regard, and more information can be found in Sagan’s text [79]. For a wealth of information about symmetric functions in general, see Macdonald’s book [60].

7.1. The algebra of symmetric functions, Sym

In this section we formally define the algebra of symmetric functions and introduce some of its standard bases. Along the way, we prove the Fundamental Theorem of Symmetric Functions and show how the coefficients of a polynomial can be expressed as a symmetric function of its roots.

Let 퐱 = {푥1, 푥2, 푥3,...} be a countably infinite set of commuting variables. Con- sider the algebra of formal power series ℂ[[퐱]]. A monomial 푚 = 푥휆1 푥휆2 ⋯ 푥휆푙 has 푖1 푖2 푖푙 5 6 degree deg 푚 = ∑푖 휆푖. For example, deg(푥2푥4푥8) = 5 + 1 + 6 = 12. We say that 푓(퐱) ∈ ℂ[[퐱]] is homogeneous of degree 푛 if deg 푚 = 푛 for all monomials 푚 appearing in 푓(퐱). A weaker condition is that 푓(퐱) have bounded degree which means that there 2 is an 푛 with deg 푚 ≤ 푛 for all 푚 appearing in 푓(퐱). To illustrate, 푓(퐱) = ∑푖<푗 푥푖푥푗 is homogeneous of degree 3. On the other hand

(7.1) 푓(퐱) = ∏(1 + 푥푖) 푖≥1 is not of bounded degree.

219

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

220 7. Counting with Symmetric Functions

There is an action of 픖푚 on ℂ[[퐱]]. Specifically, if we have 휋 ∈ 픖푚 and 푓(푥1, 푥2, 푥3, ... ) ∈ ℂ[[퐱]], then we let

(7.2) 휋푓(푥1, 푥2, 푥3, ... ) = 푓(푥휋(1), 푥휋(2), 푥휋(3),...) where 휋(푖) = 푖 for 푖 > 푚. For example, 2 3 4 2 3 4 (7.3) (1, 2)(푥1 + 2푥1푥2 + 5푥1 푥3 − 푥3푥4) = 푥2 + 2푥1푥2 + 5푥2푥3 − 푥3푥4.

Call 푓(퐱) symmetric if 휋푓 = 푓 for all 휋 in every symmetric group 픖푚. Equivalently, any two monomials 푥휆1 푥휆2 ⋯ 푥휆푙 and 푥휆1 푥휆2 ⋯ 푥휆푙 , where 푖 , ... , 푖 are distinct and sim- 푖1 푖2 푖푙 푗1 푗2 푗푙 1 푙 ilarly for 푗1, ... , 푗푙, have the same coefficient in 푓(퐱) since one can always find a per- mutation 휋 such that 휋(푖푘) = 푗푘 for all 푘. Another equivalent description is that any monomial 푥휆1 푥휆2 ⋯ 푥휆푙 has the same coefficient as 푥휆1 푥휆2 ⋯ 푥휆푙 in 푓(퐱). To illustrate, 푖1 푖2 푖푙 1 2 푙 5 5 5 2 2 2 2 2 2 (7.4) 푓(퐱) = 4푥1 + 4푥2 + 4푥3 + ⋯ − 6푥1푥2 − 6푥1푥3 − 6푥2푥3 − ⋯ is symmetric. Let

Sym푛 = Sym푛(퐱) = {푓 ∈ ℂ[[퐱]] ∣ 푓 is symmetric and homogeneous of degree 푛}. The algebra of symmetric functions is Sym = Sym(퐱) = Sym (퐱). ⨁ 푛 푛≥0 Alternatively, Sym(퐱) is the set of all symmetric power series in ℂ[[퐱]] of bounded degree. This is because elements of the direct sum can only have a finite number of components which are nonzero. So the series in (7.4) is in Sym, but the ones in (7.1) and (7.3) are not. There are a number of interesting bases for Sym. We start with those functions obtained by symmetrizing a monomial. If 휆 = (휆1, 휆2, ... , 휆푙) is a partition, then the associated monomial symmetric function is 푚 = 푚 (퐱) = ∑ 푥휆1 푥휆2 ⋯ 푥휆푙 휆 휆 푖1 푖2 푖푙

where the sum is over all distinct monomials having exponents 휆1, 휆2, ⋯ , 휆푙. We will often drop the parentheses and commas in the subscript 휆 as well as use multiplicity notation. As examples, in (7.4) we have 푓 = 4푚(5) − 6푚(2,2) = 4푚5 − 6푚22 , and 2 2 2 2 2 2 (7.5) 푚21 = 푥1푥2 + 푥1푥2 + 푥1푥3 + 푥1푥3 + 푥2푥3 + 푥2푥3 + ⋯ . We must verify that we have defined a basis for Sym.

Theorem 7.1.1. The 푚휆 as 휆 varies over all partitions form a basis for Sym. Conse- quently

dim Sym푛 = 푝(푛), the number of partitions of 푛.

Proof. The second sentence follows immediately from the first. And it is clear that the 푚휆 are independent since no two contain a monomial in common. So it suffices to show that every 푓 ∈ Sym can be written as a linear combination of the 푚휆. Suppose 푥휆1 푥휆2 ⋯ 푥휆푙 is a monomial appearing in 푓 and having coefficient 푐 ∈ ℂ. Without loss 푖1 푖2 푖푙

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.1. The algebra of symmetric functions, Sym 221

of generality we can assume the indices have been arranged so that 휆1 ≥ 휆2 ≥ ⋯ ≥ 휆푙 and let 휆 = (휆1, 휆2, ... , 휆푙). Since 푓 is symmetric, every monomial in 푓 with exponents 휆1, 휆2, ... , 휆푙 appears with coefficient 푐. So 푓 − 푐푚휆 is still symmetric and contains no monomials with these exponents. Since 푓 is of bounded degree, we can repeat this process a finite number of times until we reach the zero power series. It follows that 푓 will be a linear combination of the monomial symmetric functions which appear during this algorithm. □

There are three bases which are formed multiplicatively in that one first defines 푓푛 for 푛 ∈ ℙ and then sets

(7.6) 푓휆 = 푓휆1 푓휆2 ⋯ 푓휆푙

where 휆 = (휆1, 휆2, ... , 휆푙). Specifically, for 푛 ≥ 1 we define the 푛th power sum symmetric function 푛 푝푛 = 푚(푛) = ∑ 푥푖 , 푖≥1 the 푛th elementary symmetric function

푛 푒푛 = 푚(1 ) = ∑ 푥푖1 ⋯ 푥푖푛 , 푖1<⋯<푖푛 and the 푛th complete homogeneous symmetric function

ℎ푛 = ∑ 푚휆 = ∑ 푥푖1 ⋯ 푥푖푛 . 휆⊢푛 푖1≤. . .≤푖푛

We also let 푒0 = ℎ0 = 1 because of the empty product. To illustrate, when 푛 = 3 we have 3 3 3 푝3 = 푥1 + 푥2 + 푥3 + ⋯ ,

푒3 = 푥1푥2푥3 + 푥1푥2푥4 + 푥1푥3푥4 + 푥2푥3푥4 + ⋯ , 3 3 2 2 ℎ3 = 푥1 + 푥2 + ⋯ + 푥1푥2 + 푥1푥2 + ⋯ + 푥1푥2푥3 + 푥1푥2푥4 + ⋯ .

Note that we have already met the power sum symmetric functions 푝푛 since they oc- curred as the substitutions made for the variables of the cycle index polynomial in The- orem 6.4.2. Also note that 푒푛 can be thought of as the sum of all square-free monomials of degree 푛, while ℎ푛 is the sum of all monomials of degree 푛. We now define 푝휆, 푒휆, and ℎ휆 using (7.6). So, for example, 4 4 4 2 2 2 푝(4,2) = (푥1 + 푥2 + 푥3 + ⋯)(푥1 + 푥2 + 푥3 + ⋯). To show that these are bases for Sym, it will be helpful to use generating functions. Define the following elements of ℂ[[퐱, 푡]]: 푡푛 푃(푡) = ∑ 푝 (퐱) , 푛 푛 푛≥1 푛 퐸(푡) = ∑ 푒푛(퐱)푡 , 푛≥0 푛 퐻(푡) = ∑ ℎ푛(퐱)푡 . 푛≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

222 7. Counting with Symmetric Functions

Note that 퐸(푡) and 퐻(푡) are ogfs, while we have not dealt with a generating function like 푃(푡) previously. Proposition 7.1.2. We have the following identities.

(a) 퐸(푡) = ∏(1 + 푥푖푡). 푖≥1 1 (b) 퐻(푡) = ∏ . 1 − 푥 푡 푖≥1 푖 1 (c) 푃(푡) = ln ∏ . 1 − 푥 푡 푖≥1 푖

Proof. (a) We will use weight-generating functions where the set is the same one used in the proof of Theorem 3.5.5; namely 푆 is all partitions 휆 with distinct parts. We weight 휆 = (휆1, 휆2, ... , 휆푛) ∈ 푆 by 푛 wt 휆 = 푡 푥휆1 푥휆2 ⋯ 푥휆푛 . Since 푒푛 is the sum of all square-free monomials of degree 푛, we have the weight- generating function 푛 푛 푓푆(퐱, 푡) = ∑ wt 휆 = ∑ 푡 ∑ 푥휆1 푥휆2 ⋯ 푥휆푛 = ∑ 푒푛(퐱)푡 휆∈푆 푛≥0 푙(휆)=푛 푛≥0 where ℓ(휆) is 휆’s length. On the other hand, we have the decomposition of 푆 in the demonstration of Theorem 3.5.5 푆 = ({10} ⊎ {11}) ⊕ ({20} ⊎ {21}) ⊕ ({30} ⊎ {31}) ⊕ ⋯ . Applying the Sum and Product Rules for weight-generating functions gives

푓푆(퐱, 푡) = (1 + 푥1푡)(1 + 푥2푡)(1 + 푥3푡) ⋯ so we are done. (b) This proof is similar to the one for (a) and so is left to the reader.

1 (c) Using the expansion of ln 1−푥 gives 1 1 (푥 푡)푛 푡푛 푡푛 ln ∏ = ∑ ln = ∑ ∑ 푖 = ∑ ∑ 푥푛 = ∑ 푝 (퐱) 1 − 푥 푡 1 − 푥 푡 푛 푛 푖 푛 푛 푖≥1 푖 푖≥1 푖 푖≥1 푛≥1 푛≥1 푖≥1 푛≥1 as desired. □

In order to prove that the 푝휆 and 푒휆 are bases, we will want to encode their expres- sions as linear combinations of the 푚휆. For that we will need a total order on the 휆 ⊢ 푛 to index the rows and columns of the corresponding matrix. Say that (휆1, ... , 휆푙) < (휇1, ... , 휇푘) in lexicographic order if, for the smallest index 푖 where 휆 and 휇 differ, we have 휆푖 < 휇푖.

Theorem 7.1.3. We have the following bases for Sym푛.

(a) {푝휆 ∣ 휆 ⊢ 푛}.

(b) {푒휆 ∣ 휆 ⊢ 푛}.

(c) {ℎ휆 ∣ 휆 ⊢ 푛}.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.1. The algebra of symmetric functions, Sym 223

Proof. (a) The set of 푝휆 has cardinality 푝(푛) = dim Sym푛 by Theorem 7.1.1. So we only need to show that the 푝휆 span Sym푛. Express each 푝휆 as a linear combination of the 푚휇 basis and let 퐴 = [푎휆,휇] be the matrix of coefficients where the rows and columns are listed in lexicographic order. It suffices to show that 퐴 is upper triangular −1 with nonzero diagonal elements. For then 퐴 exists and so we can write each 푚휇 in 휇1 휇푘 terms of the 푝휆. So consider a monomial 푚 = 푥1 ⋯ 푥푘 occurring in the expansion of

휆1 휆1 휆2 휆2 푝휆 = (푥1 + 푥2 + ⋯)(푥1 + 푥2 + ⋯) ⋯ . (By symmetry, our choice of subscripts for 푚 is without loss of generality.) It follows that each 휇푖 is a sum of 휆푗. But it is easy to see that adding parts of a partition make it larger in lexicographic order. So 푚휆 will have smallest subscript if it occurs. But we can 휆1 휆2 obtain the given monomial by picking 푥1 out of the first factor, 푥2 out of the second, and so forth. So the proof is complete.

(b) This demonstration is similar to the one in part (a) except that one shows

푒휆푡 = 푚휆 + ∑ 푏휆,휇푚휇 휇<휆 where 휆푡 is the conjugate of 휆.

(c) It suffices to show that each 푒푛 can be written as a polynomial in the ℎ푘. Indeed, by multiplicativity this implies that the 푒휇 are linearly spanned by the ℎ휆. And since the number of ℎ휆 is the dimension of Sym푛, they must form a basis. From Proposition 7.1.2 we see that 퐻(푡)퐸(−푡) = 1. Taking the coefficient of 푡푛 on both sides for 푛 ≥ 1 yields 푛 푖 ∑(−1) ℎ푛−푖푒푖 = 0. 푖=0

Solving for 푒푛 gives 푒푛 = ℎ1푒푛−1 − ℎ2푒푛−2 + ⋯ .

By induction on 푛, the 푒푖 on the right in this sum are polynomials in the ℎ푘, so the same □ is true of 푒푛.

We note that (b) of the previous theorem is sometimes called the Fundamental Theorem of Symmetric Functions and is expressed in the following manner: any sym- metric function can be written uniquely as a polynomial in the 푒푛. Also note that since the triangular transition matrix from the 푚휆 to the 푒휆 in (b) has ones down the diag- onal, we have actually proved that any symmetric function with integer coefficients is in fact an integral linear combination of the 푒휆. We wish to examine a corollary of Proposition 7.1.2(a) which shows that the coeffi- cients of a polynomial can be expressed as elementary symmetric functions of its roots. This will be useful for proving a log-concavity result in the next section. This result is well known for quadratic polynomials (and follows easily from the quadratic formula)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

224 7. Counting with Symmetric Functions

but holds in general. In what follows we will specialize our symmetric functions to the first 푚 variables by setting 푥푖 = 0 for 푖 > 푚. For example,

푒2(푥1, 푥2, 푥3, 푥4) = 푥1푥2 + 푥1푥3 + 푥1푥4 + 푥2푥3 + 푥2푥4 + 푥3푥4.

Lemma 7.1.4. Let 푛 푛−1 푛−2 (7.7) 푓(푡) = 푎0푡 + 푎1푡 + 푎2푡 + ⋯ + 푎푛

be a monic polynomial (so 푎0 = 1) with complex coefficients. Let the roots of 푓(푡) be 푟1, ... , 푟푛. Then for all 푘 ≥ 0 we have

푎푘 = 푒푘(−푟1, −푟2, ... , −푟푛).

Proof. From the definitions

푓(푡) = (푡 − 푟1)(푡 − 푟2) ⋯ (푡 − 푟푛). Using this and Proposition 7.1.2(a) gives 푛 푘 푡 푓(1/푡) = (1 − 푟1푡)(1 − 푟2푡) ⋯ (1 − 푟푛푡) = ∑ 푒푘(−푟1, −푟2, ... , −푟푛)푡 . 푘≥0 On the other hand, because of (7.7), 푛 푛 푡 푓(1/푡) = 푎푛푡 + ⋯ + 푎1푡 + 1. Comparing the last two displayed equations finishes the proof. □

7.2. The Schur basis of Sym

There is an important basis for Sym푛 whose elements are called the Schur functions. To construct these functions, we will use certain tableaux built out of Young diagrams. Expressing the Schur functions in the elementary, complete homogeneous, and power sum bases will lead to interesting connections with the Lindström–Gessle–Viennot technique and representations of symmetric groups. This will also permit us to prove a result noted in Section 5.6 relating log-concavity to the roots of the corresponding generating polynomial. Let 휆 be a partition of 푛.A standard Young tableau (SYT) of shape 휆 is a bijective filling 푇 of the boxes of the Young diagram for 휆 with the elements of [푛] such that rows increase from left to right and columns increase from top to bottom. We let SYT(휆) = {푇 ∣ 푇 is a standard Young tableau of shape 휆}

1 2 1 2 1 3 1 3 1 4 3 4 3 5 2 4 2 5 2 5 5 4 5 4 3

Figure 7.1. The standard Young tableaux of shape 휆 = (2, 2, 1)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.2. The Schur basis of Sym 225

휄 1 1 1 1 2 2 2 3 3 3 3 → 2 2 2 3 3 3 4 4 3 4 6

1 1 1 1 2 2 2 2 2 2 3 2 2 3 3 3 3 4 4 3 4 6

Figure 7.2. Two semistandard Young tableaux illustrating a Bender–Knuth inter- change

and 푓휆 = # SYT(휆). We also write sh 푇 for the shape of 푇 and call 푇 a standard 휆-tableau if sh 푇 = 휆. The SYT of shape 휆 = (2, 2, 1) are listed in Figure 7.1 so 푓(2,2,1) = 5. A semistandard Young tableau (SSYT) of shape 휆 is a filling 푇 of the boxes of the Young diagram for 휆 with elements of ℙ such that rows weakly increase and columns strictly increase. As expected, we let SSYT(휆) = {푇 ∣ 푇 is a semistandard Young tableau of shape 휆} and call the elements of this set semistandard 휆-tableaux. Two semistandard Young tableaux of shape (11, 8, 3) are displayed in Figure 7.2. (The reader should ignore the underlines and overlines for now.) We denoted by 푐 = (푖, 푗) the square, also called a cell, in row 푖 and column 푗 of the Young diagram of 휆 where rows and columns are indexed as in a matrix. The entry in cell (푖, 푗) of 푇 is denoted 푇푖,푗. For example, the first tableau in Figure 7.2 has 푇2,7 = 4. The content of an SSYT 푇 is the weak composition 훼 = co 푇 where 훼푖 is the number of occurrences of 푖 in 푇. The first tableau in Figure 7.2 has co 푌 = [4, 6, 8, 3, 0, 1]. Strictly speaking, one could add as many zeros as one liked to the end of co 푇, but usually we will terminate the composition with a positive entry. The Kostka numbers are

퐾휆,훼 = #{푇 ∈ SSYT(휆) ∣ co 푇 = 훼}. If we wish to use a content which is a partition 휇 and not just a weak composition, we 휆 will write 퐾휆,휇. Note that if 휆 ⊢ 푛, then 퐾휆,(1푛) = 푓 . To define the Schur functions, we will weight a 푇 ∈ SSYT(휆) by letting

푇 퐱 = ∏ 푥푇푖,푗 . (푖,푗)∈휆

푇 훼1 훼2 훼푘 Note that if co 푇 = [훼1, 훼2, ... , 훼푘], then 퐱 = 푥1 푥2 ⋯ 푥푘 . The tableau on the left in 푇 4 6 8 3 Figure 7.2 has 퐱 = 푥1 푥2푥3푥4푥6. The Schur function corresponding to a partition 휆 is 푇 푠휆 = ∑ 퐱 . 푇∈SSYT(휆)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

226 7. Counting with Symmetric Functions

For example, if 휆 = (2, 1), then a partial list of the semistandard tableaux of shape 휆 is

푇 ∶ 1 1 1 2 1 1 1 3 ... 1 2 1 3 1 2 1 4 ... 2 2 3 3 3 2 4 2 so that

2 2 2 2 푠(2,1) = 푥1푥2 + 푥1푥2 + 푥1푥3 + 푥1푥3 + ⋯ + 2푥1푥2푥3 + 2푥1푥2푥4 + ⋯ .

Note that it is not obvious from the definition that 푠휆 is even symmetric, but we will prove this shortly. As special cases, if 휆 is a single row, then the corresponding 푇 are just a weakly increasing sequences of integers so that

푠(푛) = ℎ푛. Similarly, if 휆 is a single column, then the 푇 are strictly increasing sequences, which gives

푠(1푛) = 푒푛. We will see generalizations of these equations shortly when we study the Jacobi–Trudi Determinants. For now, we must show that 푠휆 ∈ Sym. We will use a clever combina- torial involution of Bender and Knuth [6] for the proof.

Proposition 7.2.1. The function 푠휆(퐱) is symmetric.

Proof. Since the adjacent transpositions generate the symmetric group, it suffices to show that

(푖, 푖 + 1)푠휆(퐱) = 푠휆(퐱) where the action is the one given by (7.2). To do this, we will define an involution 휄∶ SSYT(휆) → SSYT(휆) such that if 휄(푇) = 푇′, then the number of 푖’s and (푖 + 1)’s are exchanged in passing from 푇 to 푇′ while all other entries are unchanged. If a column of 푇 contains both 푖 and 푖 + 1, then such pairs are called fixed. All other entries equal to 푖 or 푖 + 1 are called free. See Figure 7.2 where 푖 = 2, the fixed entries are underlined, and the free ones are overlined. The map 휄 takes each row containing 푘 free 푖’s followed by 푙 free (푖 + 1)’s and replaces these entries by 푙 free 푖’s followed by 푘 free (푖 + 1)’s. This clearly preserves the weakly increasing condition on the rows. And the columns are still strictly increasing because of the definition of free. Also, the number of 푖’s and (푖 + 1)’s are interchanged since this is true for the free elements by construction and the fixed elements come in pairs. Finally, it is clear that this map is its own inverseand hence an involution. This finishes the proof. □

Theorem 7.2.2. For 휆 ⊢ 푛 we have

푠휆 = ∑ 퐾휆,휇푚휇 휇≤휆

where the sum is over partitions 휇 of 푛 and 퐾휆,휆 = 1. So the set

{푠휆 ∣ 휆 ⊢ 푛}

is a basis for Sym푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.2. The Schur basis of Sym 227

Proof. The second sentence follows from the first in the same way as the proof of Theorem 7.1.3(a). The fact that 퐾휆,휇 is the coefficient of 푚휇 in the expansion of 푠휆 comes from the previous proposition and the definitions of 푠휆 and 퐾휆,휇. Clearly there is only one element of SSYT(휆, 휆), namely the tableau whose 푖th row consists completely of 푖’s for all 푖 ≥ 1. So it remains to show that if 퐾휆,휇 ≠ 0, then 휇 ≤ 휆. Suppose 휆 ≠ 휇 since equality has already been considered, and pick 푇 ∈ SSYT(휆, 휇). Let 푗 be the first index where 휆푗 ≠ 휇푗. Then for 푖 < 푗, the 푖th row of 푇 is all 푖’s. It follows from column strictness that the 푗th row must contain all the 푗’s. So

휇푗 = number of 푗’s < number of boxes in row 푗 = 휆푗 as desired. □

We now wish to find the expansion of 푠휆 in the elementary and complete homo- geneous bases. These are best expressed as determinants which were discovered by Jacobi [44] and subsequently simplified by his student Trudi [94]. For the proof, we will use a weighted version of the Lindström–Gessle–Viennot Lemma, Lemma 2.5.4. In it, all cardinalities are replaced by the corresponding weight-generating function over the set being counted. The only extra information which needs to be checked in this case is that the involution Ω in (2.12) is weight preserving in the sense that ′ ′ wt(푃푖, 푃푗) = wt(푃푖 , 푃푗 ) where, as usual, the weight of a Cartesian product is the prod- uct of the weights.

Theorem 7.2.3 (Jacobi–Trudi Determinants). Suppose 휆 = (휆1, ... , 휆푙).

(a) 푠휆 = det[ℎ휆푖−푖+푗]1≤푖,푗≤푙.

푡 (b) 푠휆 = det[푒휆푖−푖+푗]1≤푖,푗≤푙. Before beginning the proof, we note that a good way of remembering the subscripts in these determinants is to put the parts of 휆 down the diagonal and then in each row add 1 or subtract 1 as one moves right or left, respectively. So, for example,

ℎ7 ℎ8 ℎ9 ℎ7 ℎ8 ℎ9 푠(7,4,1) = det[ ℎ3 ℎ4 ℎ5 ] = det[ ℎ3 ℎ4 ℎ5 ] . ℎ−1 ℎ0 ℎ1 0 1 ℎ1

Proof. (a) We will use northeast lattice paths 푃 in the extension of the integer lattice ℤ2 obtained by adding a vertex (푖, ∞) for each 푖 ∈ ℤ. One can only reach (푖, ∞) by taking an infinite number of north steps along the line 푥 = 푖. We label the east steps of 푃 ∶ 푠1푠2푠3 ... by letting 퐿(푠푖) be the 푦-coordinate of 푠푖 if 푠푖 = 퐸. See, for example, the path on the left in Figure 7.3 where we are assuming the path starts at the point (1, 1). If 푃 only has a finite number of east steps all on or above 푦 = 1, we weight it by

wt 푃 = ∏ 푥퐿(푠푖) 푠푖

where the product is over all 푠푖 which are east steps of 푃. In Figure 7.3 we have wt 푃 = 2 푥1푥3푥4. Now let 푢 = (푖, 1) and 푣 = (푖 + 푛, ∞) where 푛 ≥ 0. Then all 푃 ∈ 풫(푢; 푣) have exactly 푛 east steps. Furthermore, as 푃 varies over all elements of 풫(푢; 푣) we see that

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

228 7. Counting with Symmetric Functions

4 7

3 3 4 5

1 1 (1, 1) (1, 1)

Figure 7.3. A path with the ℎ-labeling on the left and the 푒-labeling on the right

wt 푃 varies over all products 푥푗1 푥푗2 ⋯ 푥푗푛 with 1 ≤ 푗1 ≤ 푗2 ≤ ⋯ ≤ 푗푛. It follows that wt 풫(푢; 푣) = ℎ푛. To apply Lemma 2.5.4, let the initial and final vertices be

푢푖 = (1 − 푖, 1) and 푣푖 = (휆푖 − 푖 + 1, ∞)

for 푖 ∈ [푙]. See Figure 7.4 for an example where 휆 = (3, 3, 1). With this choice of ver- tices, the weighted entries of the Lindström–Gessle–Viennot matrix give (up to trans- position which does not affect the determinant) those on the right in part (a)ofthis theorem. Indeed, if 푃 goes from 푢푖 to 푣푗, then it has

(7.8) (휆푗 − 푗 + 1) − (1 − 푖) = 휆푗 + 푖 − 푗

east steps so that the set of such paths has weight ℎ휆푗 +푖−푗. Also note that since the weight of a step only depends upon its height, the map Ω is weight preserving. We also need to show that any path family 푃 whose associated permutation is not the identity must be intersecting. This will be left as an exercise. Finally, to complete the proof, we merely need to show that the weight-generating function for the nonintersecting path families is 푠휆. For this it suffices to give a weight- preserving bijection from such paths to SSYT(휆). Map such a path family (푃1, ... , 푃푙) to the tableau 푇 whose 푖th row consists of the labels on 푃푖 read left to right. An example is in Figure 7.4. Since 푃푖 goes from 푢푖 to 푣푖 it has 휆푖 east steps by (7.8), so 푇 has shape 휆. Further, the definition of the map and the labeling of the paths show that therows are weakly increasing. To show that the columns are strictly increasing, we need to check that for all 푖 and 푗, the 푗th step on 푃푖 is lower than the 푗th step on 푃푖+1. But this is forced by the nonintersecting condition. It is easy to describe an inverse sending a tableau back to a path family, so we leave this detail to the reader.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.2. The Schur basis of Sym 229

푣3 푣2 푣1 ⋮ ⋮ ⋮

5

4 4 1 1 3 ↔ 2 4 4 3 5

2

1 1

푢3 푢2 푢1

Figure 7.4. Nonintersecting paths and the associated semistandard Young tableau

(b) The proof is similar to that of (a) except that we label the east steps of 푃 by ′ 퐿 (푠푖) = 푖. See the path on the right in Figure 7.3 for an example. The reader will find it a good exercise to fill in the rest of the proof. □

The expansion of 푠휆 in the power sum basis is also important. But the proof is beyond the scope of this book. See [79, Theorem 4.6.4] for a demonstration of the next 휆 result. In it, we let 푝휋 = 푝휆 if 휋 ∈ 픖푛 has cycle type 휆. Also, 휒 is the character of the irreducible representation of 픖푛 corresponding to 휆. Theorem 7.2.4. If 휆 ⊢ 푛, then

1 푠 = ∑ 휒휆(휋)푝 . □ 휆 푛! 휋 휋∈픖푛

We now have the tools we need to prove a result postponed from Section 5.6.

Theorem 7.2.5. Let 푎0, 푎1, ... , 푎푛 be a sequence of real numbers with generating function 푘 푓(푡) = ∑푘≥0 푎푘푡 . If 푓(푡) has only real roots none of which are positive, then the sequence is log-concave.

Proof. Clearly 푎0, 푎1, ... , 푎푛 is log-concave if and only if 푎푛, ... , 푎1, 푎0 is. And 푓(푡) has 푛 푘 only real, nonpositive roots if and only if 푔(푡) = 푡 푓(1/푡) = ∑푘≥0 푎푛−푘푡 does. We can also assume, without loss of generality, that 푎0 ≠ 0. Now write 푔(푡) = 푎0ℎ(푡) where ℎ(푡) is monic and has the same roots 푟1, ... , 푟푛 as 푔(푡). So applying Lemma 7.1.4 to ℎ(푡) and multiplying by 푎0 we get

푎푘 = 푎0 푒푘(−푟1, ... , −푟푛)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

230 7. Counting with Symmetric Functions

for 0 ≤ 푘 ≤ 푛. Now let 휆 = (푘, 푘) and use Theorem 7.2.3(b) to obtain

2 푎푘 푎푘+1 푎푘 − 푎푘−1푎푘+1 = det[ ] 푎푘−1 푎푘

푎 푒 (−푟 , ... , −푟 ) 푎 푒 (−푟 , ... , −푟 ) = det[ 0 푘 1 푛 0 푘+1 1 푛 ] 푎0 푒푘−1(−푟1, ... , −푟푛) 푎0 푒푘(−푟1, ... , −푟푛)

2 = 푎0 푠(푘,푘)푡 (−푟1, ... , −푟푛). 2 Since 푠휆(퐱) has nonnegative coefficients and −푟푖 ≥ 0 for all 푖, we have 푎푘 −푎푘−1푎푘+1 ≥ 0, which is what we wished to show. □

7.3. Hooklengths

In this section we will derive formulae for counting standard Young tableaux and semi- standard Young tableaux of a given shape. These expressions will be based on the sizes of certain subsets of the Young diagram called hooks. Given a Young diagram 휆, a cell 푐 = (푖, 푗) ∈ 휆 has hook ′ ′ ′ ′ 퐻푐 = 퐻푖,푗 = {(푖 , 푗) ∈ 휆 ∣ 푖 ≥ 푖} ∪ {(푖, 푗 ) ∈ 휆 ∣ 푗 ≥ 푗}. The partition 휆 = (7, 7, 6, 6, 4) is displayed on the left in Figure 7.5 and the cells in the hook 퐻2,3 are marked with dotted lines. The hooklength of 푐 = (푖, 푗) is

ℎ푐 = ℎ푖,푗 = #퐻푐.

Using the previous example, we have ℎ2,3 = 8. And on the right in the same figure, we have displayed the hooklengths of all the cells for the shape (2, 2, 1). It is not hard to show that 푡 (7.9) ℎ푖,푗 = 휆푖 + 휆푗 − 푖 − 푗 + 1. There is a beautiful formula for the number of standard Young tableau of given shape due to Frame, Robinson, and Thrall [29]. We will give a probabilistic proof of

4 2

3 1

1

2 2 2 Figure 7.5. The hook 퐻2,3 in 휆 = (7 , 6 , 4) and the hooklengths of 휆 = (2 , 1)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.3. Hooklengths 231

this result discovered by Greene, Nijenhuis, and Wilf [35], which has the added benefit of providing an algorithm for choosing an SYT of shape 휆 uniformly at random. For the demonstration we will need the concept of an inner corner of a Young diagram 휆 which is a cell 푐 at the end of its row and column. Equivalently ℎ푐 = 1. The inner corners of the 휆 = (72, 62, 4) in Figure 7.5 are (2, 7), (4, 6), and (5, 4). Note that in any SYT of shape 훼 ⊢ 푛 one must have 푛 in one of the inner corners of 휆.

Theorem 7.3.1 (Hook Formula). If 휆 ⊢ 푛, then 푛! (7.10) 푓휆 = . ∏(푖,푗)∈휆 ℎ푖,푗

Before proving this result, let us verify it for 휆 = (2, 2, 1) ⊢ 5. Using the hook- lengths in Figure 7.5 we see that 푛! 5! = 2 = 5 ∏(푖,푗)∈휆 ℎ푖,푗 4 ⋅ 3 ⋅ 2 ⋅ 1 which agrees with the count in Figure 7.1.

Proof. Consider the following algorithm for constructing a standard Young tableau 푇 of shape 휆. In it, 퐴 ≔ 퐵 means that 퐴 is to be replace by 퐵. GNW1 Pick 푐 ∈ 휆 with probability 1/푛. ′ GNW2 While 푐 is not an inner corner, pick 푐 ∈ 퐻푐 − {푐} with probability ′ 1/(ℎ푐 − 1) and update 푐 ≔ 푐. GNW3 Let 푇푐 = 푛 and update 푛 ≔ 푛 − 1, 휆 ≔ 휆 − {푐}. If 푛 > 0, then return to GNW1; otherwise terminate. The sequence of cells chosen in GNW2 is called a trial 푡. In Figure 7.6 the solid dots and lines show a possible trial in 휆 = (72, 62, 4) with probability 1 1 1 1 1 Pr(푡) = ⋅ ⋅ ⋅ = . 30 7 4 1 840

푐1 푐2

푐3 푐4

Figure 7.6. A trial in 휆 = (72, 62, 4)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

232 7. Counting with Symmetric Functions

In order to prove (7.10) it suffices to show that for any SYT 푇 of shape 휆 ⊢ 푛, the probability that GNW1–GNW3 will produce 푇 is

∏(푖,푗)∈휆 ℎ푖,푗 (7.11) Pr(푇) = . 푛! We will induct on 푛, where the case 푛 = 1 is trivial. Suppose (훼, 휔) is the cell of 푇 containing 푛. Also let 휆′ = 휆 − {(훼, 휔)} and let 푇′ be the tableau of shape 휆′ obtained by removing 푛 from 푇. Then Pr(푇) = Pr(훼, 휔) ⋅ Pr(푇′) where Pr(훼, 휔) is the probability that a trial ends at (훼, 휔). Note that the hooklengths of 푇′ are the same as those in 푇 except for the ones in row 훼 or column 휔 which have each been decreased by one. Also, we can assume by induction that Pr(푇′) has the desired form. So it suffices to show ′ ′ that, using ℎ푖,푗 for the hooklengths in 푇 ,

∏(푖,푗)∈휆 ℎ푖,푗/푛! Pr(훼, 휔) = ′ ∏(푖,푗)∈휆′ ℎ푖,푗/(푛 − 1)!

1 ℎ ℎ훼,푗 = ∏ 푖,휔 ∏ 푛 ℎ − 1 ℎ − 1 1≤푖<훼 푖,휔 1≤푗<휔 훼,푗

1 1 1 = ∏ (1 + ) ∏ (1 + ) 푛 ℎ − 1 ℎ − 1 1≤푖<훼 푖,휔 1≤푗<휔 훼,푗 1 1 1 = ∑ ∏ ∏ . 푛 ℎ − 1 ℎ − 1 퐼⊆[훼−1] 푖∈퐼 푖,휔 푗∈퐽 훼,푗 퐽⊆[휔−1]

We will prove that this last expression equals Pr(훼, 휔) by giving a probabilistic in- terpretation to each summand as follows. Given a trial 푐1 = (푖1, 푗1), 푐2 = (푖2, 푗2), ... , 푐푚 ′ = (푖푚, 푗푚) = (훼, 휔) we define its row and column projections to be the sets 퐼 = ′ {푖1, 푖2, ... , 푖푚} and 퐽 = {푗1, 푗2, ... , 푗푚}, respectively. For the trial in Figure 7.6 we have 퐼′ = {2, 4} and 퐽′ = {3, 5, 6} corresponding to the solid and dotted lines in the dia- gram. Note that since we assume the trial ends at (훼, 휔) we always have 훼 = max 퐼′ and 휔 = max 퐽′. Let 퐼 = 퐼′ − {훼} and 퐽 = 퐽′ − {휔}. Let Pr(퐼′, 퐽′) be the probability that a trial ending at (훼, 휔) has row and column projections 퐼′ and 퐽′, respectively. We claim that 1 1 1 (7.12) Pr(퐼′, 퐽′) = ∏ ∏ . 푛 ℎ − 1 ℎ − 1 푖∈퐼 푖,휔 푗∈퐽 훼,푗 If this is true, then we are done since, by definition of the which are in- ′ ′ volved, Pr(훼, 휔) = ∑퐼′,퐽′ Pr(퐼 , 퐽 ) which is the same as the sum at the end of the pre- vious paragraph. To prove the claim, we induct on 푚, the number of cells in the trial. If 푚 = 1, then this is clearly true since then 퐼 = 퐽 = ∅ and 1/푛 is the probability of picking (훼, 휔) as the only cell of the trial. If 푚 > 1, then the trial must begin by going from (푖1, 푗1) to either (푖2, 푗1) or (푖1, 푗2). So

′ ′ 1 ′ ′ ′ ′ Pr(퐼 , 퐽 ) = [Pr(퐼 − 푖1, 퐽 ) + Pr(퐼 , 퐽 − 푗1)]. ℎ푖1,푗1 − 1

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.3. Hooklengths 233

Letting 푃 be the right-hand side of (7.12) we have, by induction, that ′ ′ Pr(퐼 − 푖1, 퐽 ) = (ℎ푖1,휔 − 1)푃 and ′ ′ Pr(퐼 , 퐽 − 푗1) = (ℎ훼,푗1 − 1)푃. It is also easy to show, using (7.9), that

(7.13) ℎ푖1,푗1 − 1 = (ℎ푖1,휔 − 1) + (ℎ훼,푗1 − 1). Thus ′ ′ 1 Pr(퐼 , 퐽 ) = [(ℎ푖1,휔 − 1)푃 + (ℎ훼,푗1 − 1)푃] = 푃 ℎ푖1,푗1 − 1 as desired. □

We now derive a generating function for semistandard Young tableaux 푇 of a given

shape 휆 by the sum of the parts which is denoted |푇| = ∑(푖,푗)∈휆 푇푖,푗. It will be conve- nient to consider a related type of array. A reverse plane partition (RPP) of shape 휆 is a filling 푅 of the Young diagram of 휆 with elements of ℕ such that the rows and columns weakly increase. (The term “reverse” comes from the fact that ordinary partitions of 푛 are written in weakly decreasing, rather than increasing, order.) The first row of Fig- ure 7.7 contains a list of six RPPs. We use notation for semistandard Young tableaux

in the obvious way applied to reverse plane partitions. Let rpp푛(휆) be the number of reverse plane partitions 푅 with sh 푅 = 휆 and |푅| = 푛. We note that there is a bijec- tion 푇 ↦ 푅 where 푇 is an SSYT, 푅 is an RPP, and sh 푇 = sh 푅 = 휆, given by letting

푅푖,푗 = 푇푖,푗 − 푖 for all (푖, 푗) ∈ 휆. Notice that in this case |푅| = |푇| + ∑푖 푖휆푖. So finding the generating function for RPPs is equivalent to finding the one for SSYT. The former generating function was first derived by Stanley [84]. The algorithmic proof we give is due to Hillman and Grassl [43]. Theorem 7.3.2. For any partition 휆 we have 1 (7.14) ∑ rpp (휆)푥푛 = ∏ . 푛 ℎ푖,푗 푛≥0 (푖,푗)∈휆 1 − 푥

Proof. By Lemma 3.4.1, the right-hand side of (7.14) is the weight-generating function for the product 푚 푆 = {{ℎ 푖,푗 ∣ 푚 ≥ 0}} ⨉ 푖,푗 푖,푗 (푖,푗)∈휆 ∑ 푚 ℎ where if we have a multiset 푀 ∈ 푆, then wt 푀 = 푥 (푖,푗)∈휆 푖,푗 푖,푗 . Note that even if two hooks have the same length, they contribute to different components of the product. So we need a weight-preserving bijection between RPPs 푅 and multisets of hooklengths 푀. Given 푅, we find the hooklengths in 푀 by producing a series of RPPs

(7.15) 푅 = 푅0, 푅1, ... , 푅푚

where 푅푚 is the all-zero RPP and, at each stage, 푅푘 is obtained from 푅푘−1 by subtracting one from all the elements of 푅푘−1 along a path 푝푘 which is constructed so that |푝푘| =

ℎ푖푘,푗푘 for some (푖푘, 푗푘).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

234 7. Counting with Symmetric Functions

1 2 2 2 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 푅푘 ∶ 3 3 3 2 2 3 1 2 3 1 2 2 1 1 1 0 0 0 3 2 1 1 1 0

ℎ푖푘,푗푘 ∶ ℎ1,1 ℎ1,1 ℎ2,3 ℎ2,2 ℎ2,1

Figure 7.7. The Hillman–Grassl Algorithm

Given 푅, we find the path 푝 = 푝1 as follows.

HG1 Start 푝 at (푎, 푏) which is the northeastmost cell in 푅 such that 푅푎,푏 ≠ 0. HG2 Continue 푝 by (푖, 푗 − 1) ∈ 푝 if 푇 = 푇 , (푖, 푗) ∈ 푝 ⟹ { 푖,푗−1 푖,푗 (푖 + 1, 푗) ∈ 푝 otherwise. HG3 Terminate 푝 when trying to apply HG2 leads to (푖 + 1, 푗) ∉ 휆. Note that HG2 amounts to saying that 푝 moves down unless forced to move left so as not to violate the weakly increasing condition on the rows once the ones are subtracted from 푅. Note also that the termination condition in HG3 forces 푝 to be at the bottom of some column 푐. Since all southwest lattice paths from (푎, 푏) to the bottom of column 푐 have the same length, we must have |푝| = ℎ푎,푐. One also needs to check that the algorithm is well-defined in that the output array is actually an RPP; that is, ithas weakly increasing rows and columns. But this verification will be left as an exercise. To illustrate, let 푅 be the first RPP in Figure 7.7. Then using HG1–HG3 returns the path given by the dots in • • • • • • and upon subtraction one obtains the second RPP in the figure. Notice that we have subtracted a total of ℎ1,1 = 6 as indicated on the bottom line. The rest of the figure illustrates finding the other RPPs in the sequence (7.15). So, in this case, 2 푅 ↦ 푀 = {{ℎ1,1, ℎ2,3, ℎ2,2, ℎ2,1}}. To reverse the procedure and find an inverse map, we first need to determine in what order hooklengths are removed from 푅 to form 푀. We claim that ℎ푖′,푗′ was re- moved before ℎ푖″,푗″ in the hook decomposition of 푅 if and only if (7.16) 푖′ < 푖″ or 푖′ = 푖″ and 푗′ ≥ 푗″. It is easy to see that this is a total order on the cells of 휆, so it suffices to prove the forward direction. And by transitivity, one can reduce to the case when ℎ푖″,푗″ is removed directly ′ ″ after ℎ푖′,푗′ . Let 푅 and 푅 be the reverse plane partitions from which ℎ푖′,푗′ and ℎ푖″,푗″ were removed using paths 푝′ and 푝″, respectively. Since entries decrease in passing from 푅′ to 푅″, the initial condition in HG1 forces 푖′ ≤ 푖″. If this inequality is strict, then we are done. If 푖′ = 푖″, then we assert that every cell on 푝″ is weakly left of a cell of 푝′. So if 푝′ ends in column 푗′, then 푝″ must end in a column 푗″ ≤ 푗′ as claimed.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.4. 푃-partitions 235

The assertion is proven by induction, for suppose (푖, ℎ) ∈ 푝″ is weakly left of (푖, 푗) ∈ 푝′ so that ℎ ≤ 푗. If the next step of 푝′ is to (푖 + 1, 푗), then 푝″ will enter row 푖 + 1 still weakly left of column 푗. If the next step of 푝′ is to (푖, 푗 −1), then the given cell of 푝″ will still be weakly left if ℎ < 푗. And if ℎ = 푗, then 푝″ must move to (푖, 푗 − 1) as well since ′ ′ ′ ″ ″ 푝 only moves left if 푇푖,푗−1 = 푇푖,푗, and after subtraction we will also have 푇푖,푗−1 = 푇푖,푗. So the assertion, and hence (7.16), holds. To construct the inverse map, given a multiset 푀, we arrange its elements in a sequence according to (7.16):

ℎ푖1,푗1 , ℎ푖2,푗2 , ... , ℎ푖푚,푗푚 . From this, we construct a sequence of RPPs

푅푚, 푅푚−1, ... , 푅0 = 푅

where 푅푚 is the all-zero RPP and 푅푘−1 is obtained by adding a one to a total of ℎ푖푘,푗푘 elements of 푅푘 for 푘 = 푚, 푚 − 1, ... , 1. Given an RPP 푅, we add back ℎ푎,푐 ones along a reverse path 푟 defined as follows. GH1 Start 푟 at the bottom cell in column 푐 of 푅. GH2 Continue 푟 by (푖, 푗 + 1) ∈ 푟 if 푇 = 푇 , (푖, 푗) ∈ 푟 ⟹ { 푖,푗+1 푖,푗 (푖 − 1, 푗) ∈ 푟 otherwise. GH3 Terminate 푟 when it passes through the rightmost cell in row 푎. Note that this is a step-by-step reversal of the construction of the (forward) path in HG1–HG3. So this will be an inverse map provided that it is well-defined, that is, pro- vided that in GH3 the reverse path actually reaches the target cell in row 푎. This is forced by (7.16) and the proof of this implication is left to the reader. We also leave as an exercise the check that adding back ones along the reverse path yields an RPP. □

7.4. 푃-partitions

It is natural to wonder if there is any relationship between equations (7.10) and (7.14). In fact, the latter can be used to derive the former. In order to do this, we will need to develop the theory of 푃-partitions, 푃 being a poset, which is due to Stanley [84]. We start with the central concept of compatibility of a permutation and a function. A function 푓 ∶ [푛] → ℕ is compatible with a permutation 휋 = 휋1 ... 휋푛 ∈ 픖푛 if

C1 푓(휋1) ≥ 푓(휋2) ≥ ⋯ ≥ 푓(휋푛) and C2 푓(휋푖) > 푓(휋푖+1) whenever 푖 ∈ Des 휋. By way of example, suppose 휋 = 37814526. It is easy to check that 푓 ∶ [8] → ℕ defined by (7.17) 푓(3) = 푓(7) = 푓(8) = 21, 푓(1) = 푓(4) = 20, 푓(5) = 10, 푓(2) = 푓(6) = 0 is compatible with 휋 since

푓(휋1), 푓(휋2), ... , 푓(휋8) = 21 ≥ 21 ≥ 21 > 20 ≥ 20 ≥ 10 > 0 ≥ 0.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

236 7. Counting with Symmetric Functions

For 휋 ∈ 픖푛, let 풞(휋) = {푓 ∶ [푛] → ℕ ∣ 푓 is compatible with 휋}. These sets partition the set of all functions from [푛] to ℕ.

Lemma 7.4.1. Every 푓 ∶ [푛] → ℕ is compatible with a unique 휋 ∈ 픖푛. Thus (7.18) {푓 ∣ 푓 ∶ [푛] → ℕ} = 풞(휋). ⨄ 휋∈픖푛

Proof. We first show how, given 푓, we can construct a 휋 with which 푓 is compatible. The reader may wish to follow the construction using the example 푓 in (7.17). Let the image of 푓 be the set 푆 = {푠1 > 푠2 > ⋯ > 푠푘} ⊂ ℕ. Since 푓 is weakly decreasing on 휋 by condition C1, those 푟 with 푓(푟) = 푠1 must come first in 휋. Furthermore, such 푟 must be arranged in increasing order since, if not, then there would be a descent which would force two of these 푟 to have distinct images by C2. Similar considerations show that the next elements in 휋 must be those such that 푓(푟) = 푠2 in increasing order, and so forth. Since all of the choices made in constructing 휋 are forced on us by the definition, the permutation is unique and we have proved the first statement of the proposition. As for (7.18), the uniqueness statement just proved shows that the union is disjoint. And existence of a compatible 휋 for each 푓 shows containment of the left-hand side in the right. The other containment is trivial since each 풞(휋) consists of functions 푓 ∶ [푛] → ℕ. □

Define the size of 푓 ∶ [푛] → ℕ to be

푛 (7.19) |푓| = ∑ 푓(푖). 푖=1 Continuing our example |푓| = 21 + 21 + 21 + 20 + 20 + 10 + 0 + 0 = 113. It will also be instructive to consider the following subsets of 풞(휋) where the maximum of a function is the maximum of the values in its image

풞푚(휋) = {푓 ∈ 풞(휋) ∣ max 푓 ≤ 푚}.

There are nice generating functions associated with 풞(휋) and 풞푚(휋).

Lemma 7.4.2. For any 휋 ∈ 픖푛 we have 푥maj 휋 (7.20) ∑ 푥|푓| = (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) 푓∈풞(휋) and des 휋 푚 푥 (7.21) ∑ #풞푚(휋)푥 = 푛+1 . 푚≥0 (1 − 푥)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.4. 푃-partitions 237

Proof. We will prove the first equality as the demonstration of the second is similar and so is left as an exercise. The basic idea behind the proof is the same as used in the demonstration of Theorem 1.3.4 where one adds or subtracts sufficient amounts to turn weak inequalities into strict ones or vice versa. An example will follow the proof.

Let 휋 = 휋1휋2 ... 휋푛 with Des 휋 = {푑1 < 푑2 < ⋯ < 푑푘}. We will construct a bijection 휙∶ 풞(휋) → Λ푛 where Λ푛 is the set of all partitions 휆 satisfying the length restriction ℓ(휆) ≤ 푛. Given 푓 ∈ 풞(휋), we will construct a sequence of functions 푓 = 푓0, ... , 푓푘 where 푓푖 will remove the strict inequality restriction at index 푑푖 from 푓푖−1. So

let 푓1 be obtained from 푓 by subtracting one from each of 푓(휋1), ... , 푓(휋푑1 ) and leaving the other 푓 values the same. Similarly, 푓2 is constructed from 푓1 by subtracting one

from 푓1(휋1), ... , 푓1(휋푑2 ) with other values constant, and so forth. By the end, the only restrictions on 푓푘 are that we have 푓푘(휋1) ≥ ⋯ ≥ 푓푘(휋푛) ≥ 0. So the nonzero images of 푓푘 form a partition 휆 ∈ Λ푛 and we let 휙(푓) = 휆. This is a bijection as its inverse is easy to construct. Furthermore, from the definition of the algorithm, it follows that

|푓| = |푓1| + 푑1 = ⋯ = |휆| + ∑ 푑푖 = |휆| + maj 휋. 푖=1 Now appealing to Corollary 3.5.4 we have

푥maj 휋 ∑ 푥|푓| = ∑ 푥|휆|+maj 휋 = (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) 푓∈풞(휋) 휆∈Λ푛 as desired. □

Using our running example, the vector of values of the initial function on 휋 is given by 푓 = (21, 21, 21, 20, 20, 10, 0, 0). Since Des 휋 = {3, 6} our first step is to subtract one from the first 3 values to obtain 푓1 = (20, 20, 20, 20, 20, 10, 0, 0). Next we subtract one from the first 6 values so that 푓2 = (19, 19, 19, 19, 19, 9, 0, 0). Taking the nonzero components gives 휆 = (19, 19, 19, 19, 19, 9). We now have all the tools needed to find generating functions for partitions whose parts are distributed over a poset. Let 푃 be a partial order on the set [푛]. In order to distinguish the usual total order on integers from the partial order in 푃, we will use 푖 ≤ 푗 for the former and 푖 ⊴ 푗 for the latter. So in the poset on the left in Figure 7.8 we have 3 ◁ 2, but 2 < 3 as integers. If 푃 is a poset on [푛], then a 푃-partition is a map

4 0 4

2 5 2

1 3 5 1 3 7

Figure 7.8. A poset 푃 on [4] on the left and a 푃-partition on the right

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

238 7. Counting with Symmetric Functions

푓 ∶ 푃 → ℕ such that PP1 푖 ⊴ 푗 implies 푓(푖) ≥ 푓(푗) and PP2 푖 ⊴ 푗 and 푖 > 푗 implies 푓(푖) > 푓(푗). So PP1 says that 푓 is weakly decreasing on 푃, while PP2 means that 푓 is strictly de- creasing on “descents” of 푃.A 푃-partition for the poset in Figure 7.8 is shown on the right with the values of 푓 circled. Note that by transitivity, it suffices to assume that PP1 and PP2 hold when 푖 is covered by 푗. Let Par 푃 = {푓 ∶ 푃 → ℕ ∣ 푓 is a 푃-partition}. For the poset in Figure 7.8 we have Par 푃 = {푓 ∶ [4] → ℕ ∣ 푓(1) ≥ 푓(2), 푓(3) > 푓(2), 푓(2) ≥ 푓(4)}. We will also need the Jordan–Hölder set of an (arbitrary) poset 푃 which is ℒ(푃) = {휋 ∣ 휋 is a linear extension of 푃}.

Note that if 푃 is a poset on [푛], then ℒ(푃) ⊆ 픖푛. Continuing the Figure 7.8 example, ℒ(푃) = {1324, 3124}. The next result, while not hard to prove, is crucial, as its name suggests. Lemma 7.4.3 (Fundamental Lemma of 푃-Partitions). Let 푃 be a poset on [푛]. Then we have 푓 ∈ Par 푃 if and only if 푓 ∈ 풞(휋) for some 휋 ∈ ℒ(푃). Thus Par 푃 = 풞(휋). ⨄ 휋∈ℒ(푃)

Proof. We will just prove the forward implication as the reverse is similar. And the proof of the equation for Par 푃 is also omitted as it follows the same lines as for (7.18). So suppose 푓 ∈ Par 푃. We know from the first part of Lemma 7.4.1 that 푓 is compatible with a unique 휋 ∈ 픖푛. So we just need to show that 휋 ∈ ℒ(푃); that is, if 푖 ◁ 푗, then 푖 should appear before 푗 in 휋. Assume, to the contrary, that we have (7.22) 휋=...푗...푖... being the order of the two elements in 휋. Since 푓 ∈ Par 푃 and 푖 ◁ 푗 we must have 푓(푖) ≥ 푓(푗) by condition PP1. But C1 and the form of 휋 in (7.22) force 푓(푗) ≥ 푓(푖). Thus 푓(푖) = 푓(푗). This equality together with (7.22) and C2 imply 푗 < 푖. But now we have 푖 ◁ 푗 and 푖 > 푗 as well as 푓(푖) = 푓(푗) which contradicts P2. This is the desired contradiction. □

We now translate the previous result in terms of generating functions. Just as with compatible functions, use the notation

Par푚 푃 = {푓 ∈ Par 푃 ∣ max 푓 ≤ 푚}. Theorem 7.4.4. For any poset 푃 on [푛] we have ∑ 푥maj 휋 |푓| 휋∈ℒ(푃) (7.23) ∑ 푥 = 2 푛 푓∈Par 푃 (1 − 푥)(1 − 푥 ) ⋯ (1 − 푥 )

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.4. 푃-partitions 239

and ∑ 푥des 휋 푚 휋∈ℒ(푃) (7.24) ∑ | Par푚 푃| 푥 = 푛+1 . 푚≥0 (1 − 푥)

Proof. We prove (7.23), leaving (7.24) as an exercise. Using the previous lemma and then (7.20) yields

maj 휋 ∑휋∈ℒ(푃) 푥 ∑ 푥|푓| = ∑ ∑ 푥|푓| = (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) 푓∈Par 푃 휋∈ℒ(푃) 푓∈풞(휋) which is what we wished to demonstrate. □

As a check, consider the poset 푃 which is the chain 1◁2◁...◁푛. Then the only inequalities satisfied by 푓 ∈ Par 푃 are 푓(1) ≥ 푓(2) ≥ ⋯ ≥ 푓(푛) ≥ 0. So 푓 corresponds to a partition 휆 with at most 푛 parts. On the other hand ℒ(푃) consists of the single permutation 휋 = 12 ... 푛 with maj 휋 = 0. So (7.23) becomes 1 (7.25) ∑ 푥|휆| = (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) ℓ(휆)≤푛 which agrees with Corollary 3.5.4. Of course, this cannot be considered a new proof of the corollary since it was used in the demonstration of (7.20). But at least it suggests we haven’t made any mistakes! Another case is the chain 푛◁푛−1◁...◁1. Now the 푓 ∈ Par 푃 satisfy the inequalities 푓(푛) > 푓(푛 − 1) > ⋯ > 푓(1) ≥ 0. The set ℒ(푃) still contains a unique element, but it is 휋 = 푛 ... 21 which has 푛 maj 휋 = 1 + 2 + ⋯ + (푛 − 1) = ( ). 2 Plugging into (7.23) we see that

푛 (2) |푓| 푥 (7.26) ∑ 푥 = 2 푛 . 푓∈Par 푃 (1 − 푥)(1 − 푥 ) ⋯ (1 − 푥 ) The reader may find it instructive to write down the bijection which permits oneto derive this equation from (7.25). This map is a special case of the one used in the proof of (7.20) but it is easier to see what is going on in this simple case. The time has come to fulfill our promise from the beginning of this section to derive the Hook Formula, (7.10), from the generating function for reverse plane partitions, (7.14). Let 휆 be the Young diagram of a partition of 푛. We turn 휆 into a poset 푃휆 by partially ordering the cells of 휆 componentwise: (푖, 푗) ⊴ (푖′, 푗′) whenever 푖 ≤ 푖′ and ′ 푗 ≤ 푗 . See Figure 7.9 for an example where 휆 = (4, 3, 1). So 푃휆 is formed from the Young diagram of 휆 by rotating 135∘ counterclockwise and imposing a grid of covers. 휆 Note that #ℒ(푃휆) = 푓 because there is a simple bijection between SYT 푇 of shape 휆 and linear extensions of 푃휆: each tableau 푇 corresponds to a linear extension of the cells 푐1, ... , 푐푛 where they are ordered so that 푐푖 is the cell of 푇 containing 푖 for 1 ≤ 푖 ≤ 푛.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

240 7. Counting with Symmetric Functions

∗ 휆 푃휆 푃휆

Figure 7.9. A Young diagram and associated posets

∗ Now consider the poset dual 푃휆 and label its elements with the numbers in [푛] ∗ in any way which corresponds to a linear extension of 푃휆 as described in the previous ∗ paragraph for 푃휆. It is easy to see that the 푃휆 -partitions are precisely the reverse plane partitions of shape 휆. (The use of the dual corresponds to these plane partitions being ∗ 휆 “reverse”.) And clearly we still have #ℒ(푃휆 ) = 푓 . Combining (7.14) and (7.23) yields.

1 푛 푝(푥) ∏ = ∑ rpp푛(휆)푥 = ℎ푖,푗 (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) (푖,푗)∈휆 1 − 푥 푛≥0 maj 휋 where 푝(푥) = ∑ ∗ 푥 . Thus 휋∈ℒ(푃휆) (1 − 푥)(1 − 푥2) ⋯ (1 − 푥푛) 푛! 푓휆 = 푝(1) = lim = 푥→1 ℎ푖,푗 ∏ ℎ ∏(푖,푗)∈휆 1 − 푥 (푖,푗)∈휆 푖,푗 which is the Hook Formula.

7.5. The Robinson–Schensted–Knuth correspondence

This section will be devoted to proving the following important identity. Theorem 7.5.1. For any given 푛 ≥ 0 we have (7.27) ∑(푓휆)2 = 푛! . 휆⊢푛 From the point of view of representation theory this is just the special case of equa- tion (A.9) in the appendix where 퐺 = 픖푛 and the dimensions are given by (A.7). How- ever, we wish to give a bijective proof of (7.27). This map was discovered in two very different forms by Robinson [75] and Schensted [81]. It is the latter description which will be presented here. We will also see that this algorithm and the corresponding identity can be generalized from standard to semistandard Young tableaux. To prove (7.27), it suffices to construct a bijection RS (7.28) 휋 ↦ (푃, 푄)

between permutations 휋 ∈ 픖푛 and pairs of SYT (푃, 푄) of the same shape 휆 ⊢ 푛. The heart of this construction will be a method of inserting a positive integer into a tableau. A partial Young tableau (PYT) is a filling of a shape with distinct positive integers such that the rows and columns increase. A PYT is standard precisely when its entries are

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.5. The Robinson–Schensted–Knuth correspondence 241

푃 = 1 3 6 9 2 8 10 4 7

1 3 6 9 ← ퟓ 1 3 ퟓ 9 1 3 5 91 3 5 9 = 푟5(푃). 2 8 10 2 8 10 ← ퟔ 2 ퟔ 10 2 6 10 4 4 4 ← ퟖ 4 ퟖ 7 7 7 7

Figure 7.10. Inserting 푥 = 5 into a partial tableau 푃

[푛] for some 푛. A partial tableau 푃 is shown at the top in Figure 7.10. Now given a PYT 푃 and a positive integer 푥 ∉ 푃, we insert 푥 into 푃 using the following algorithm: RS1 Set 푅 ≔ the first row of 푃. RS2 While 푥 is less than some element of row 푅, let 푦 be the leftmost such element and replace 푦 by 푥 in 푅. Repeat this step with 푅 ≔ the row below 푅 and 푥 ≔ 푦. RS3 Now 푥 is greater than every element of 푅, so place 푥 at the end of this row and terminate. In step RS2 we say that 푥 bumps 푦. An example of inserting 5 into the PYT in Figure 7.10 is given in the second row of the figure. Elements being bumped are written in boldface and the notation 푅 ← 푥 means that 푥 is being inserted in row 푅. If 푃′ is the result of inserting 푥 into 푃 by rows, then we write ′ 푟푥(푃) = 푃 . The reader should check that this operation is well-defined in that 푃′ is still a PYT. There is a second operation needed to describe the map (7.28) which will be used for the second component. An outer corner of a shape 휆 (or of a tableau of that shape) is a cell (푖, 푗) ∉ 휆 such that 휆 ∪ {(푖, 푗)} is the Young diagram of a partition. The outer corners of 푄 in Figure 7.11 are (1, 5), (2, 3), (3, 2), and (5, 1). Suppose we have a partial tableau 푄, 푦 > max 푄, and (푖, 푗) which is an outer corner of 푄. The tableau 푄′ obtained ′ by placing 푦 in 푄 at (푖, 푗) has all the entries of 푄 together with 푄푖,푗 = 푦. The choice of an outer corner and the condition on 푦 ensure that 푄′ is still a PYT. See Figure 7.11 for an example of a placement. We are now ready to describe (7.28). Consider 휋 as being given in two-line nota- tion (1.7) 1 2 ... 푛 휋 = . 휋1 휋2 ... 휋푛 We will construct a sequence of pairs of tableaux

(7.29) (푃0, 푄0) = (∅, ∅), (푃1, 푄1), (푃2, 푄2), ... , (푃푛, 푄푛) = (푃, 푄)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

242 7. Counting with Symmetric Functions

푄 = 1 2 6 7 푄′ = 1 2 6 7 4 5 4 5 8 8 10 9 9

Figure 7.11. The result 푄′ of placing 10 at (3, 2) in 푄

by starting with the empty pair and then letting

푃푘 = 푟휋푘 (푃푘−1),

푄푘 = place 푘 in 푄푘−1 at the cell where 푟휋푘 terminates,

for 푘 = 1, 2, ... , 푛. We then let (푃, 푄) = (푃푛, 푄푛). Note that by construction we have sh 푃푘 = sh 푄푘 for all 푘. A complete example is worked out in Figure 7.12 where the elements of the lower line of 휋 as well as their counterparts in the 푃푘 are set in bold. If RS(휋) = (푃, 푄), we also write 푃(휋) = 푃 and call 푃 the 푃-tableau or insertion tableau of 휋. Similarly we use the notation 푄(휋) = 푄 for the 푄-tableau of 휋 which is also called the recording tableau. We now come to our main theorem about this procedure. Theorem 7.5.2. The map RS 휋 ↦ (푃, 푄)

is a bijection between permutations 휋 ∈ 픖푛 and pairs (푃, 푄) of SYT of the same shape 휆 ⊢ 푛.

Proof. It suffices to construct the inverse. This will be done by reversing the algorithm step by step. So we will build the sequence (7.29) backwards, starting from (푃푛, 푄푛) = (푃, 푄) and, in the process, recover 휋. Assume that we have reached (푃푘, 푄푘) and let (푖, 푗) be the cell containing 푘 in 푄푘. To obtain 푄푘−1 we merely erase 푘 from 푄푘. As for finding 푃푘−1 and 휋푘, we note that (푖, 푗) must have been the cell at which the insertion into 푃푘−1 terminated. So we use the following deletion procedure to undo this insertion.

SR1 Let 푥 be the (푖, 푗) entry of 푃푘 and erase it from 푃푘. Set 푅 ≔ the (푖 − 1)st row of 푃푘. SR2 While 푅 is not the zeroth row of 푃푘, let 푦 be the rightmost element of 푅 smaller than 푥 and replace 푦 by 푥 in 푃푘. Repeat this step with 푅 ≔ the row above 푅 and 푥 ≔ 푦. SR3 Now 푅 is the zeroth row so let 휋푘 = 푥 and terminate. It should be clear from the constructions that insertion and deletion are inverses of each other. So we are done. □

In order to generalize (7.27), we will consider two sets of variables 퐱 = {푥1, 푥2,...} and 퐲 = {푦1, 푦2,...}. The next result is called Cauchy’s Identity and it can be found in Littlewood’s text [58].

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.5. The Robinson–Schensted–Knuth correspondence 243

1 2 3 4 5 6 7 휋 = . ퟓ ퟐ ퟑ ퟔ ퟒ ퟏ ퟕ

ퟓ ퟐ ퟐ ퟑ ퟐ ퟑ ퟔ ퟐ ퟑ ퟒ ퟏ ퟑ ퟒ ퟏ ퟑ ퟒ ퟕ 퐏 퐤 ∶ ∅ = 퐏 ퟓ ퟓ ퟓ ퟓ ퟔ ퟐ ퟔ ퟐ ퟔ ퟓ ퟓ

1 1 1 3 1 3 4 1 3 4 1 3 4 1 3 4 7 푄푘 ∶ ∅, = 푄 2 2 2 2 5 2 5 2 5 6 6

Figure 7.12. The Robinson–Schensted map

Theorem 7.5.3. We have 1 (7.30) ∑ 푠 (퐱)푠 (퐲) = ∏ 휆 휆 1 − 푥 푦 휆 푖,푗≥1 푖 푗 where the sum is over all partitions 휆 of any nonnegative integer.

To give a bijective proof of this formula, we must interpret each side as a weight- generating function. On the left, we clearly have the weight ogf for pairs (푇, 푈) of semistandard tableaux of the same shape 휆 where wt(푇, 푈) = 퐱푈 퐲푇 .

For the right-hand side, consider the set Mat of all infinite matrices 푀 with rows and columns indexed by ℙ, entries in ℕ, and only finitely many entries nonzero. A matrix 푀 ∈ Mat is shown in the upper-left corner of Figure 7.13 where all entries not shown are zero. Weight these matrices by

푀푖,푗 wt 푀 = ∏(푥푖푦푗) . 푖,푗≥1 Our example matrix has weight 2 3 2 5 3 2 2 4 4 wt 푀 = (푥1푦2) (푥1푦3) (푥2푦1)(푥2푦2) (푥3푦1)(푥3푦3) = 푥1푥2푥3푦1푦2푦3.

0 2 3 0 ⋯ ⎡ ⎤ 1 2 0 0 ⋯ ⎢ ⎥ 1 1 1 1 1 2 2 2 3 3 푀 = ⎢ 1 0 1 0 ⋯ ⎥ ↦ 휋 = ⎢ ⎥ 2 2 3 3 3 1 2 2 1 3 ⎢ 0 0 0 0 ⋯ ⎥ ⎣ ⋮ ⋮ ⋮ ⋮ ⎦ 1 1 2 2 3 3 1 1 1 1 1 3 RSK , ↦ (푇, 푈) = ( 2 2 3 2 2 2 ) 3 3

Figure 7.13. The Robinson–Schensted–Knuth map

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

244 7. Counting with Symmetric Functions

So, by the Sum and Product Rules for weight ogfs 1 ∑ wt 푀 = ∏ ∑(푥 푦 )푘 = ∏ . 푖 푗 1 − 푥 푦 푀∈Mat 푖,푗≥1 푘≥0 푖,푗≥1 푖 푗 Thus we need a weight-preserving bijection RSK (7.31) 푀 ↦ (푇, 푈) between matrices 푀 ∈ Mat and pairs (푇, 푈) ∈ SSYT(휆) × SSYT(휆) as 휆 varies over all partitions. Such a map was given by Knuth [49]. It will be convenient to reinterpret the elements of Mat as two-line arrays. Given 푀 ∈ Mat, we create an array 휋 such that 푖 푀 = the number of times a column occurs in 휋, 푖,푗 푗 and the columns are arranged in lexicographic order with the top row taking prece- dence. The two-line array 휋 associated with the matrix in Figure 7.13 is displayed at the top right. We can now define the map (7.31). Given 푀, construct its two-line array 휋. Now, starting with the empty tableau, insert the elements of the lower row of 휋 sequentially to form 푇 using exactly the same rules RS1–RS3 as before. Note that this algorithm never used the assumption that 푥 ∉ 푃 and so it applies equally well to semistandard tableaux. As one does the insertions, one places the corresponding elements of the upper row of 휋 in 푈 so that the two tableaux always have the same shape. Figure 7.13 displays the final output of this algorithm on the bottom line. The reader shouldnow be able to fill in the details of the proof of the following theorem. Theorem 7.5.4. The map RSK 푀 ↦ (푇, 푈) is a weight-preserving bijection between matrices 푀 ∈ Mat and pairs (푇, 푈) of SSYT such that sh 푇 = st 푈. □

We will use the notation RSK(푀) = RSK(휋) = (푇, 푈) where 휋 is the two-line array corresponding to 푀.

7.6. Longest increasing and decreasing subsequences

One of Schensted’s motivations [81] for introducing the algorithm which bears his name was to study the lengths of longest increasing and decreasing subsequences of a permutation. He proved that these quantities were given by the length of the first row and the length of the first column, respectively, of the associated tableaux. Inthis section, we will prove his result. Along the way we will see what effect reversing a sequence has on its insertion tableau.

Consider a permutation 휋 = 휋1휋2 ... 휋푛 ∈ 픖푛. Then an increasing subsequence of

휋 of length 푙 is 휋푖1 < 휋푖2 < ⋯ < 휋푖푙 where 푖1 < 푖2 < ⋯ < 푖푙.A decreasing subsequence

is defined similarly with the inequalities among the 휋푖푗 reversed. We let lis 휋 = length of a longest increasing subsequence of 휋

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.6. Longest increasing and decreasing subsequences 245

and lds 휋 = length of a longest decreasing subsequence of 휋. If 휋 = 5236417 is the permutation in Figure 7.12, then 휋 has increasing subsequences 2347 and 2367 of length 4 and none longer, so lis(휋) = 4. Similarly, lds(휋) = 3 because of the subsequence 531, among others. The reader will notice from Figure 7.12 that the first row of the insertion tableau 푃 (or of the recording tableau 푄) has length 4 = lis(휋) and the length of the first column is 3 = lds 휋. This is always the case.

RS Theorem 7.6.1. If 휋 ↦ (푃, 푄) with sh 푃 = sh 푄 = 휆, then

lis 휋 = 휆1.

Proof. Let 푃푘−1 be the tableau formed after inserting 휋1 ... 휋푘−1. We claim that if 휋푘 enters 푃푘−1 in column 푗, then the length of a longest increasing subsequence of 휋 end- ing with 휋푘 is 푗. Note that the claim proves the theorem since after inserting all of 휋 we

will have an increasing sequence of length 휆1 ending at the element 푃1,휆1 . And there is no longer subsequence since there is no element of 푃 in cell (1, 휆1 + 1). To prove the claim, we induct on 푘, where the case 푘 = 1 is trivial. For the induc- tion step, suppose 푥 is the element in cell (1, 푗 − 1) of 푃푘−1. Then there is an increasing subsequence 휎 of 휋1 ... 휋푘−1 of length 푗 − 1 ending in 푥. Since 휋푘 entered 푃푘−1 in a column to the right of 푥 we must have 푥 < 휋푘−1 by RS2 and RS3. It follows that the concatenation 휎휋푘 is an increasing subsequence of 휋 of length 푗 ending in 휋푘. To show that 푗 is the length of a longest such subsequence suppose, towards a contradiction, that 휏휋푘 is increasing of length greater than 푗. Let 푦 be the last element of 휏. Then, by induction, when 푦 was inserted it entered in a column 푗′ ≥ 푗. Since 푦 < 휋푘 and rows increase, the element in cell (1, 푗) just after 푦’s insertion must be less than 휋푘. And since elements only bump elements larger than themselves, the element in cell (1, 푗) in 푃푘−1 must still be smaller than 휋푘. But this contradicts the fact that 휋푘 enters in column 푗 since it must bump an element larger than itself. □

Note that the previous proof did not show that the first row of 푃 is actually an increasing subsequence of 휋. In fact, this assertion is false as can be seen in Figure 7.12. To prove our suspicion about lds 휋, we need to do insertion by columns. Define column insertion of 푥 ∉ 푃, where 푃 is a partial tableau, using RS1–RS3 but with “row” replaced by “column” everywhere and “leftmost” by “uppermost”. Denote the result of column insertion by 푐푥(푃). Amazingly, the row and column operators commute. Lemma 7.6.2. Let 푃 be a partial tableau and 푥, 푦 distinct positive integers with 푥, 푦 ∉ 푃. Then

푐푦푟푥(푃) = 푟푥푐푦(푃).

Proof. Let 푚 = max({푥, 푦} ⊎ 푃).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

246 7. Counting with Symmetric Functions

푃 = 푥

푥 푚 푚

Figure 7.14. The case 푦 = 푚 in the proof of Lemma 7.6.2

Note that by RS2 and RS3, 푚 cannot bump any element during the insertion process. There are two cases depending on which set 푚 comes from. Case 1: 푦 = 푚. (The case 푥 = 푚 is similar.) Represent 푃 schematically as on the left in Figure 7.14. Since 푚 is the maximum element, 푐푚 will insert 푚 at the end of the first column of whatever tableau to which the operator is applied. Suppose 푥 is the last element to be bumped during the insertion 푟푥(푃), and suppose 푥 comes to rest in cell 푢. If 푢 is at the end of the first column, then it is easy to check that 푐푚푟푥(푃) and 푟푥푐푚(푃) are both the middle diagram in Figure 7.14. Similarly, if 푢 is not at the end of the first column, then both insertions result in the diagram on the right in Figure 7.14. Case 2: 푚 ∈ 푃. We induct on #푃. The case when #푃 = 1 is easy to check. Let 푃 = 푃 − {푚}, that is, 푃 with 푚 erased from its cell. Using the fact that 푚 never bumps another element as well as induction gives

푐푦푟푥(푃) − {푚} = 푐푦푟푥(푃) = 푟푥푐푦(푃) = 푟푥푐푦(푃) − {푚}.

So to finish the proof, we need to show that 푚 is in the same position in both 푐푦푟푥(푃)

and 푟푥푐푦(푃). Let 푥 be the last element displaced during 푟푥(푃) and let 푢 be the cell it

occupies at the end of the insertion. Similarly define 푦 and 푣 for 푐푥(푃). We now have two subcases depending on the relative locations of 푢 and 푣.

Subcase 2a: 푢 = 푣. The first two schematic diagrams in Figure 7.15 illustrate 푟푥(푃) and 푐푦(푃) in this case. If (1, 푗1), (2, 푗2), ... , (푘, 푗푘) are the cells whose elements change during a row insertion, then it is easy to prove that 푗1 ≥ 푗2 ≥ ⋯ ≥ 푗푘. So during 푟푥(푃) the only columns which are disturbed are those weakly right of the column of 푢. Similarly, the insertion 푐푦(푃) only changes rows which are weakly below the row of 푢. It follows that the insertion paths for 푟푥 and 푐푦 in either order do not intersect until they come to 푢.

If 푥 < 푦 (the case 푦 < 푥 is similar), then 푐푦푟푥(푃) and 푟푥푐푦(푃) will both be as in the last diagram in Figure 7.15. Note also that 푥 and 푦 must be in the same column since

this is clearly true for 푐푦푟푥(푃). Now if 푚 was not in cell 푢 in 푃, then it will not be bumped

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.6. Longest increasing and decreasing subsequences 247

푟푥(푃) = 푥 푐푦(푃) = 푦 푥 푦

Figure 7.15. The subcase 푢 = 푣 in the proof of Lemma 7.6.2

by either insertion and so will remain in its cell. If 푚 is in cell 푢, then it is easy to check that it will be bumped into the column just to the right of 푢 in both orders of insertion. This completes this subcase. Subcase 2b: 푢 ≠ 푣. This subcase is taken care of using arguments similar to those in the rest of the proof, so it is left as an exercise. □

If 휋 = 휋1휋2 ... 휋푛 ∈ 픖푛, then its reversal (as defined in Exercise 37(a) of Chapter 1) 푟 푟 is 휋 = 휋푛휋푛−1 ... 휋1. The insertion tableaux of 휋 and 휋 are intimately related. Theorem 7.6.3. If 푃(휋) = 푃, then 푃(휋푟) = 푃푡 where 푡 denotes transpose.

Proof. Clearly, inserting a single element into an empty tableau gives the same result whether it be by rows or columns. Using this and the previous lemma repeatedly 푟 푃(휋 ) = 푟휋1 ⋯ 푟휋푛−1 푟휋푛 (∅)

= 푟휋1 ⋯ 푟휋푛−1 푐휋푛 (∅)

= 푐휋푛 푟휋1 ⋯ 푟휋푛−1 (∅) ⋮

= 푐휋푛 푐휋푛−1 ⋯ 푐휋1 (∅) = 푃푡 which is the conclusion we seek. □

We can now characterize lds(휋) in terms of the shape of its output tableaux.

RS Corollary 7.6.4. If 휋 ↦ (푃, 푄) with sh 푃 = sh 푄 = 휆, then 푡 lds 휋 = 휆1, where 휆푡 is the transpose of 휆.

Proof. Reversing a permutation interchanges increasing and decreasing subsequences so that lds 휋 = lis 휋푟. By Theorem 7.6.1, lis 휋푟 is the length of the first row of 푃(휋푟). And 푃(휋푟) = 푃푡 by Theorem 7.6.3. So lds 휋 is the length of the first column of 푃, as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

248 7. Counting with Symmetric Functions

We note that Greene [34] has proved the following extension of Schensted’s theo- rem.

Theorem 7.6.5. Let lis푘(휋) be the longest length of a subsequence of 휋 which is a union RS of 푘 disjoint increasing subsequences. If 휋 ↦ (푃, 푄) with sh 푃 = sh 푄 = 휆, then

lis푘(휋) = 휆1 + 휆2 + ⋯ + 휆푘 and similarly for decreasing subsequences.

Interestingly, there does not seem to be an easy interpretation of the individual 휆푖 in the shape of the output tableaux. For example, if we consider 휋 = 247951368, then

푃(휋) = 1 3 5 6 8 . 2 4 9 7

So 휆1 + 휆2 = 5 + 3 = 8 and 2479 ⊎ 1368 is a union of two increasing subsequences of 휋 and is of length 8. But one can check that there is no length 8 subsequence which is a disjoint union of two increasing subsequences of lengths 5 and 3.

7.7. Differential posets

In this section we will give a second proof of (7.27) based on properties of Young’s lattice, 푌. This technique can be generalized to a wider class of posets which were introduced and further studied by Stanley [88, 90]. These posets are called differential because of an identity which they satisfy. To connect the summation side of (7.27) with 푌, we will use a simple bijection be- tween standard Young tableaux of shape 휆 and saturated ∅–휆 chains in 푌. Specifically, a 푇 ∈ SYT(휆) where 휆 ⊢ 푛 will be associated with the chain 퐶 ∶ ∅ = 휆0 ⋖ 휆1 ⋖ ⋯ ⋖ 휆푛 = 휆 where 휆푘 is the shape of the subtableau of 푇 containing the elements [푘] for 0 ≤ 푘 ≤ 푛. An example will be found in Figure 7.16. To go the other way, given a chain 퐶, we define 푇 to be the tableau which has 푘 in the unique cell of the skew partition 휆푘/휆푘−1 where skew partitions were defined in (3.7). It is easy to see that these two maps are inverses of each other. From this discussion, it should be clear that (푓휆)2 is the number of pairs of saturated ∅–휆 chains in 푌. In order to work with this observation, we will use a common technique for turn- ing sets into vector spaces. If 푋 is a set, then consider the set of finite formal linear combinations

(7.32) ℂ푋 = { ∑ 푐푥푥 ∣ 푐푥 ∈ ℂ for all 푥 and only finitely many 푐푥 ≠ 0} . 푥∈푋

푇 = 1 3 퐶 ∶ ∅ ⋖ ⋖ ⋖ ⋖ ⋖ 2 4 5

Figure 7.16. The bijection between SYT and saturated chains in Young’s lattice

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.7. Differential posets 249

+ 퐷( ) =

+ + 푈( ) =

Figure 7.17. The down and up operators in Young’s lattice

Now ℂ푋 is a vector space with vector addition and scalar multiplication given by

∑ 푐푥푥 + ∑ 푑푥푥 = ∑(푐푥 + 푑푥)푥, 푥 푥 푥

푐 ∑ 푐푥푥 = ∑ 푐푐푥푥. 푥 푥 Note that 푋 is a basis for ℂ푋. We will define two linear operators on ℂ푌. The down operator 퐷 ∶ ℂ푌 → ℂ푌 is defined by 퐷(휆) = ∑ 휆− 휆−⋖휆 and linear extension. An example is given in Figure 7.17. It will be useful to think of 퐷(휆) as the sum of all partitions which can be reached by taking a walk of length one downward from 휆 in 푌 viewed as a graph. Note that 퐷(∅) is the empty sum so that 퐷(∅) = 0, the zero vector. Similarly, the up operator 푈 ∶ ℂ푌 → ℂ푌 is 푈(휆) = ∑ 휆+. 휆+⋗휆 Again, Figure 7.17 contains an example and a similar walk interpretation holds. We claim that

(7.33) 퐷푛푈푛(∅) = (∑(푓휆)2)∅. 휆⊢푛 Indeed, the coefficient of 휆 ⊢ 푛 in 푈푛(∅) is the number of walks from ∅ to 휆 in 푌 which always go up. But such a walk is just a saturated ∅–휆 chain so that, by the previous bijection, 푈푛(∅) = ∑ 푓휆휆. 휆⊢푛 By the same token 퐷푛휆 = 푓휆∅ since walks which always go down also follow saturated chains. So applying 퐷푛 to the expression for 푈푛(∅) and using linearity gives the desired equality.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

250 7. Counting with Symmetric Functions

To make use of (7.33), we need a closer investigation of the structure of 푌. Say that 휆 ∈ 푌 covers 푘 elements if #{휆− ∣ 휆− ⋖ 휆} = 푘. Similarly define the phrase “is covered by 푘 elements”. Also say that 휆, 휇 ∈ 푌 cover 푙 elements if #{휈 ∣ 휈 ⋖ 휆, 휇} = 푙 and ditto for being covered.

Proposition 7.7.1. The poset 푌 has the following two properties for all distinct 휆, 휇 ∈ 푌. (a) 휆 covers 푘 elements if and only if it is covered by 푘 + 1 elements. (b) 휆, 휇 cover 푙 elements if and only if they are covered by 푙 elements. In this case 푙 ≤ 1.

Proof. (a) It suffices to prove the forward direction since then the number of elements which cover 휆 is uniquely determined by the number which it covers. The elements which 휆 covers are precisely those obtained by removing an inner corner of 휆. And those which cover 휆 are the partitions obtained by adding an outer corner to 휆. But inner corners and outer corners alternate along the southeast boundary of 휆, beginning with an outer corner at the end of the first row and ending with an outer corner atthe end of the first column. The result follows. (b) Again, we only need to prove the forward implication. There are two cases. If there is an element 휈 ⋖ 휆, 휇, then, since 푌 is ranked, it must be rk 휆 = rk 휇 = 푛 for some 푛 and rk 휈 = 푛 − 1. Since 푌 is a lattice, it follows that 휈 = 휆 ∧ 휇 = 휆 ∩ 휇 is unique and so 푙 = 1. Also |휆 ∩ 휇| = |휈| = 푛 − 1 so that |휆 ∨ 휇| = |휆 ∪ 휇| = 푛 + 1. It follows that 휆, 휇 are covered by a unique element, namely 휆 ∨ 휇. If there is no element covered by both 휆, 휇, then similar considerations show that no element covers 휆, 휇. We leave this verification to the reader. □

We can translate this result in terms of the down and up operators.

Proposition 7.7.2. The operators 퐷, 푈 on 푌 satisfy (7.34) 퐷푈 − 푈퐷 = 퐼 where 퐼 is the identity map.

Proof. By linearity, it suffices to show that this equation is true when applied toabasis element 휆 ∈ ℂ푌. First consider 퐷푈(휆). The coefficient of 휇 in this expression is the number of walks 휆 to 휇 which first go up an edge of 푌 and then come down an edge (possibly the same one). These are precisely the walks of length 2 going through some element covering both 휆 and 휇. From the previous proposition, we get 퐷푈(휆) = (푘 + 1)휆 + ∑ 휇 where 푘 + 1 elements cover 휆 and the sum is over all 휇 ≠ 휆 such that 휇, 휆 are covered by a common element. In a similar way 푈퐷(휆) = 푘휆 + ∑ 휇 where the sum is over the same set of 휇. Subtracting the two equalities gives (7.34). □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.7. Differential posets 251

Note that (7.34) is reminiscent of an identity from calculus. Consider a differen- tiable function 푓(푡). Let 퐷 stand for differentiation and let 푈 be multiplication by 푡. Then 퐷푈(푓(푡)) = (푡푓(푡))′ = 푓(푡) + 푡푓′(푡) = 퐼(푓(푡)) + 푈퐷(푓(푡)) which is just (7.34) with the negative term moved to the other side of the equation. We will need an extension of (7.34) where 푈 is replaced by an arbitrary operator which is a polynomial in 푈. Corollary 7.7.3. For any polynomial 푝(푡) ∈ ℂ[푡] we have 퐷푝(푈) = 푝′(푈) + 푝(푈)퐷 where 푝′(푡) is the derivative of 푝(푡).

Proof. By linearity it suffices to prove this result for the powers 푈푛 for 푛 ≥ 0. The base case 푛 = 0 is easy to check. Assuming the result is true for 푛 and then applying (7.34) we obtain 퐷푈푛+1 = (퐷푈푛)푈 = (푛푈푛−1 + 푈푛퐷)푈 = 푛푈푛 + 푈푛(퐼 + 푈퐷) = (푛 + 1)푈푛 + 푈푛+1퐷 as desired. □

We are now ready to reprove (7.27) which we restate here for ease of reference: (7.35) ∑(푓휆)2 = 푛! . 휆⊢푛

Proof. Because of (7.33), it suffices to show that 퐷푛푈푛(∅) = 푛! ∅. We induct on 푛, where the case 푛 = 0 is trivial since 퐷0푈0 = 퐼. Applying Corollary 7.7.3, the fact that 퐷∅ = 0, and induction gives 퐷푛푈푛(∅) = 퐷푛−1(퐷푈푛)(∅) = 퐷푛−1(푛푈푛−1 + 푈푛퐷)(∅) = 푛퐷푛−1푈푛−1(∅) + 0(∅) = 푛(푛 − 1)! ∅ which is what we wished to show. □

Stanley generalized these ideas using the following definition. Call a poset 푃 dif- ferential if it satisfies the following three properties where 푥, 푦 are distinct elements of 푃. DP1 푃 is ranked. DP2 If 푥 covers 푘 elements, then it is covered by 푘 + 1 elements. DP3 If 푥, 푦 cover 푙 elements, then they are covered by 푙 elements.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

252 7. Counting with Symmetric Functions

From what we have proved, 푌 is a differential poset. Another example is given in Exercise 29. In fact, Stanley defined a more general type of poset called 푟-differential which will be studied in Exercise 30. As with Young’s lattice, we can show that the parameter 푙 must satisfy 푙 ≤ 1. Lemma 7.7.4. If poset 푃 satisfies DP1 and DP3, then 푙 ≤ 1.

Proof. As noted in the proof of Proposition 7.7.1, DP3 implies that its converse is also true. Suppose the lemma is false and pick a pair 푥, 푦 with 푙 ≥ 2. Since 푃 is ranked by DP1, we must have rk 푥 = rk 푦. Pick the counterexample pair 푥, 푦 to be of minimum rank and let 푥′, 푦′ be two of the elements covered by 푥, 푦. But since 푥′, 푦′ are covered by at least two elements, they must cover at least two elements. This contradicts the fact that we took a minimum-rank pair. □

We want to define up and down operators in a differential poset. But to makesure they are well-defined, the sums need to be finite.

Lemma 7.7.5. If 푃 satisfies DP1 and DP2, then its 푛th rank Rk푛 푃 is finite for all 푛 ≥ 0.

Proof. We induct on 푛. Since 푃 is ranked by DP1, it has a 0̂ and so the result holds for 푛 = 0. Assume the lemma through rank 푛. Now any 푥 ∈ Rk푛 푃 covers at most # Rk푛−1 푃 elements. So, by DP2,

# Rk푛+1 푃 ≤ (# Rk푛 푃)(1 + # Rk푛−1 푃) □ which forces Rk푛+1 푃 to be finite.

Thus we can define two operators on ℂ푃 by 퐷(푥) = ∑ 푥− 푥−⋖푥 and 푈(푥) = ∑ 푥+. 푥+⋗푥 The proof of the next result is similar enough to that of Proposition 7.7.2 that it is left as an exercise.

Proposition 7.7.6. Let 푃 be a ranked poset with Rk푛 푃 finite for all 푛 ≥ 0. Then

푃 is differential ⟺ 퐷푈 − 푈퐷 = 퐼. □

Also, the reader should be able to generalize the operator proof of (7.27) to the setting of differential posets and show the following. Theorem 7.7.7. In any differential poset 푃 we have ∑ (푓푥)2 = 푛!

푥∈Rk푛 푃 where 푓푥 is the number of saturated 0̂–푥 chains. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.8. The chromatic symmetric function 253

The reader might wonder if there is a way to give a bijective proof of this theorem just as the Robinson–Schensted algorithm provides a bijection for (7.27). This has been done by Fomin as part of his theory of duality of graded graphs [27, 28].

7.8. The chromatic symmetric function

Stanley [91] defined a symmetric function associated with graph colorings which gen- eralizes the chromatic polynomial. In this section we will prove some of his results about this function, including expressions for its expansion in the monomial and power sum bases for Sym.

Let 퐺 be a graph with vertex set 푉 = {푣1, ... , 푣푛}. Consider a coloring 푐∶ 푉 → ℙ of 퐺 using the positive integers as color set. Then 푐 has monomial 푐 (7.36) 퐱 = 푥푐(푣1)푥푐(푣2) ⋯ 푥푐(푣푛). For example, if 퐺 is the graph on the left in Figure 7.18 and 푐 is the coloring on the right, then 푐 2 퐱 = 푥푐(ᵆ)푥푐(푣)푥푐(푤)푥푐(푧) = 푥1푥2푥4. Now define the chromatic symmetric function of 퐺 to be 푋(퐺) = 푋(퐺; 퐱) = ∑ 퐱푐 푐∶ 푉→ℙ where the sum is over all proper colorings 푐∶ 푉 → ℙ. To illustrate, let us return to the graph of Figure 7.18. Because 퐺 contains a triangle, we must use three or four colors for a proper coloring. If we use four different colors, then this will give rise to a monomial 푥푖푥푗푥푘푥푙 for some distinct 푖, 푗, 푘, 푙. And any of the 4! = 24 ways of assigning these colors to the four vertices is proper. Since this count is independent of which four colors we use, the contribution of such colorings to 푋(퐺) is 24푚14 . If we use three colors, then 2 one of them must be used twice and so correspond to a monomial 푥푖 푥푗푥푘 for distinct 푖, 푗, 푘. One copy of color 푖 must go on vertex 푤 and the other can be on 푢 or 푥, giving two choices. The other two colors can be distributed among the remaining two vertices in two ways. So these colorings give a term 4푚212 . In total 푋(퐺) = 24푚14 + 4푚212 which the reader will note is a symmetric function. We will now prove that this is always the case, as well as showing a connection with the chromatic polynomial of 퐺. Proposition 7.8.1. Let 퐺 be a graph with vertex set 푉.

(a) 푋(퐺) ∈ Sym푛 where 푛 = #푉. 푡 (b) If we set 푥1 = ⋯ = 푥푡 = 1 and 푥푖 = 0 for 푖 > 푡, written 퐱 = 1 , then 푋(퐺; 1푡) = 푃(퐺; 푡).

푢 푣 2 4

퐺 = 푐 =

푧 푤 1 2

Figure 7.18. A graph and a coloring using ℙ

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

254 7. Counting with Symmetric Functions

Proof. (a) It is clear that 퐱푐 has 푛 factors for any coloring 푐 so that 푋(퐺) is homoge- neous of degree 푛. To show that it is symmetric, note that any permutation of the colors of a proper coloring is proper. This means that permuting the subscripts in 푋(퐺) leaves it invariant; that is, 푋(퐺) is symmetric.

(b) The given substitution results in 퐱푐 = 1 if 푐 only uses colors from [푡] and 퐱푐 = 0 otherwise. So 푋(퐺; 1푡) is just the number of proper colorings 푐∶ 푉 → [푡]. But this was the definition of 푃(퐺; 푡). □

Since 푋(퐺) is symmetric, we can expand it in terms of various bases for the sym- metric functions and see if the coefficients have any nice combinatorial interpretation. We start with the monomial basis. To describe the coefficients, we will need some defi- nitions. If 퐺 = (푉, 퐸) is a graph, then 푊 ⊆ 푉 is independent or stable if there is no edge of 퐺 between any pair of vertices of 푊. For example, if 퐺 = 푇1 as in Figure 1.9, then 푊 = {2, 5, 6} is stable but 푊 = {2, 3, 6} is not because of the edge 36. The reason we care about stable sets is that if 푐 is a proper coloring of 퐺, then the set of all vertices with a given color 푟, in other words the vertices in 푐−1(푟), form a stable set. Similarly, call a partition 휌 = 퐵1/ ... /퐵푘 of 푉 independent or stable if each block is. Returning to Fig- ure 1.9, the partition 13/256/4 is stable in 푇1. The type of a set partition 휌 = 퐵1/ ... /퐵푘 is the integer partition 휆(휌) = (휆1, ... , 휆푘) obtained by arranging #퐵1, ... , #퐵푘 in weakly decreasing order. To illustrate, 휆(1456/27/38) = (4, 2, 2). For a graph 퐺, let

푖휆(퐺) = number of independent partitions of 푉 of type 휆.

For the graph in Figure 7.18 we have 푖14 (퐺) = 1, 푖212 (퐺) = 2 and all other 푖휆(퐺) = 0. Any proper coloring 푐 of 퐺 induces a stable partition of 푉 whose blocks are the nonempty 푐−1(푟) for the colors 푟 in the color set. Finally, if 휆 = (1푚1 , 2푚1 , ... , 푛푚푛 ) is an integer partition in multiplicity notation, then let ! 휆 = 푚1! 푚2! ⋯ 푚푛!. Theorem 7.8.2. If graph 퐺 has #푉 = 푛, then ! 푋(퐺) = ∑ 푖휆(퐺)휆 푚휆. 휆⊢푛

휆1 휆푘 Proof. If 휆 = (휆1, ... , 휆푘), then the coefficient of 푥1 ⋯ 푥푘 is the coefficient of 푚휆 since 푋(퐺) is symmetric. And the coefficient of this monomial is the number of proper colorings 푐∶ 푉 → [푘] where 푖 gets used 휆푖 times for all 푖. By the discussion preceding this theorem, these colorings can be obtained by taking an independent partition 휌 ⊢ 푉 and then deciding which colors get assigned to which blocks of 휌. The number of choices for 휌 is 푖휆(퐺). Now any of the 푚푗 colors which are used 푗 times can be used on any of the 푚푗 blocks of 휌 of size 푗. The number of such assignments is 푚푗! and this is true for all 푗. This gives the factor of 휆!. □

The expansion of 푋(퐺) in the power sum basis will be found by Möbius inversion. Let 퐺 = (푉, 퐸) be a graph. Then any spanning subgraph 퐻 can be identified with its set of edges 퐸(퐻) since we have 푉(퐻) = 푉(퐺). To illustrate, for the graph 퐺 in Figure 3.5 we would identify 퐹1 with the edge set {12, 24} and 퐹2 with the edge set {14, 24}. Given

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.8. The chromatic symmetric function 255

퐹 ⊆ 퐸, we get a partition 휌(퐹) of the vertex set where a block of 휌 is the set of vertices in a component of the corresponding spanning subgraph. Returning to our example, 퐹1 and 퐹2 both have partition 휌 = 124/3. Let 휆(퐹) = type of the partition 휌(퐹).

In our running example 휆(퐹1) = 휆(퐹2) = (3, 1). Theorem 7.8.3. If 퐺 = (푉, 퐸) is a graph, then #퐹 푋(퐺) = ∑ (−1) 푝휆(퐹) 퐹⊆퐸 where the sum is over all subsets 퐹 of the edge set of 퐺 and #퐹 is the number of edges in 퐹.

Proof. Consider the Boolean algebra 퐵퐸 of all subsets of 퐸 ordered by containment. Given 퐹 ⊆ 퐸, we define the power series 훼(퐹) = ∑ 퐱푐 푐 where the sum is over all colorings 푐∶ 푉 → ℙ such that 푐(푢) = 푐(푣) for all 푢푣 ∈ 퐹. These are the colorings which are monochromatic on each component of 퐹’s spanning subgraph. So if 휌(퐹) = 퐵1/ ... /퐵푘, then each 퐵푖 can get any of the colors in ℙ. It follows that 푘 #퐵푖 #퐵푖 (7.37) 훼(퐹) = ∏(푥1 + 푥2 + ⋯) = 푝휆(퐹). 푖=1 Also define 훽(퐹) = ∑ 퐱푐 푐 where the sum is over all colorings 푐∶ 푉 → ℙ such that 푐(푢) = 푐(푣) for all 푢푣 ∈ 퐹 and 푐(푢) ≠ 푐(푣) for 푢푣 ∈ 퐸 − 퐹. So these colorings are constant on the components of 퐹 but also cannot have any other edge of 퐸 monochromatically colored. Given any spanning subgraph 퐹′ and a coloring 푐 appearing in 훼(퐹′), one can define a unique spanning subgraph 퐹 by letting 푢푣 ∈ 퐹 if and only if 푐(푢) = 푐(푣). By the definition of 훼 it is clear that 퐹 ⊇ 퐹′. Also, the definition of 퐹 implies that 푐 is a coloring in the sum for 훽(퐹). It follows that 훼(퐹′) = ∑ 훽(퐹) 퐹⊇퐹′ for all 퐹′ ⊆ 퐸. Applying Möbius inversion (Theorem 5.5.5) as well as (5.6) and (7.37) gives #퐹 훽(∅) = ∑ 휇(퐹)훼(퐹) = ∑ (−1) 푝휆(퐹). 퐹∈퐵퐸 퐹⊆퐸 But 훽(∅) is the generating function for all colorings 푑 ∶ 푉 → ℙ such that 푑(푢) ≠ 푑(푣) for 푢푣 ∈ 퐸, and these are exactly the proper colorings. Thus 훽(∅) = 푋(퐺) and we are done. □

We end this section with an open question about 푋(퐺). To appreciate it, we first prove a result about the chromatic polynomial.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

256 7. Counting with Symmetric Functions

Proposition 7.8.4. Let 푇 be a graph with #푉 = 푛. We have that 푇 is a tree if and only if 푃(푇; 푡) = 푡(푡 − 1)푛−1.

Proof. We will prove the forward direction and leave the reverse implication as an exercise. If 푣 ∈ 푉, then we color 푇 by first coloring 푣, then all the neighbors of 푣, then all the uncolored vertices which are neighbors of neighbors of 푣, etc. The number of ways to color 푣 is 푡. Because 푇 is connected and acyclic, when coloring each vertex 푤 ∈ 푉 − {푣} we will have 푤 adjacent to exactly one already colored vertex. So the number of colors available for 푤 is 푡 − 1. The result follows. □

This proposition is sometimes summarized by saying that the chromatic polyno- mial does not distinguish trees since all trees on 푛 vertices have the same polynomial. Stanley asked if the opposite was true for his chromatic symmetric function.

Question 7.8.5. If 푇1 and 푇2 are nonisomorphic trees, is it true that 푋(푇1) ≠ 푋(푇2)?

It has been checked for trees with up to 23 vertices that the answer to this question is “yes” and there are other partial results in the literature.

7.9. Cyclic sieving redux

Cyclic sieving phenomena are often associated to results in representation theory. In this section we will give a second proof of Theorem 6.6.2 using this approach. To do so, we will assume the reader is familiar with the material in the appendix in this book. We start by presenting a general paradigm for proving a CSP by using group actions. Recall that we start with a set 푋, a cyclic group 퐺 acting on 푋, and a polynomial 푓(푞) ∈ ℕ[푞]. Also, for the rest of this section we let 휔 = 푒2휋푖/푛. The triple (푋, 퐺, 푓(푞)) was said to exhibit the cyclic sieving phenomenon if, for all 푔 ∈ 퐺, (7.38) #푋푔 = 푓(훾) where 훾 is a root of unity satisfying 표(훾) = 표(푔). To interpret the left side of (7.38), consider the permutation representation ℂ푋 of 퐺. Since 푔 ∈ 퐺 takes each basis element in 푋 to another basis element, the matrix [푔]푋 consists of zeros and ones. And there is a one on the diagonal precisely when 푔푥 = 푥 for 푥 ∈ 푋. So this representation has character

푔 (7.39) 휒(푔) = tr[푔]푋 = #푋 .

As for the right-hand side of (7.38), let ℎ be a generator of 퐺 where #퐺 = 푛 and (푖) let 휔 = 휔푛. For 푖 ≥ 0, let 푉 be the irreducible 퐺-module such that the matrix of ℎ is 푖 (푖) 푖 [휔 ] and let its character be 휒 . Suppose 푓 = ∑푖≥0 푚푖푞 where 푚푖 ∈ ℕ for all 푖. Since the coefficients are nonnegative integers, we can define a corresponding 퐺-module

푉 = 푚 푉 (푖) 푓 ⨁ 푖 푖≥0

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

7.9. Cyclic sieving redux 257

with character 휒푓. Let 푔 = ℎ푗 and 훾 = 휔푗. Now using (A.1) and the fact that the 푉 (푖) are 1-dimensional, 푓 (푖) 푗 푖푗 푗 휒 (푔) = ∑ 푚푖휒 (ℎ ) = ∑ 푚푖휔 = 푓(휔 ) = 푓(훾) 푖≥0 푖≥0 which is the right side of (7.38). Now appealing to Theorem A.1.4, we have proved the following result or Reiner, Stanton, and White [72]. Theorem 7.9.1. The triple (푋, 퐺, 푓(푞)) exhibits the cyclic sieving phenomenon if and only □ if ℂ푋 ≅ 푉 푓 as 퐺-modules.

We will now give a second proof that the triple [푛] 푛 + 푘 − 1 (7.40) ((( )), ⟨(1, 2, ... , 푛)⟩, [ ] ) 푘 푘 푞 exhibits the CSP. We will write vectors in boldface to distinguish them from scalars. To use the previous theorem, any 퐺-module isomorphic to ℂ푋 will suffice. So we will construct one whose structure is easy to analyze. Let 푉 be a vector space of dimension 푛 푘 and fix a basis 퐵 = {퐛1, 퐛2, ... , 퐛푛} for 푉. Let 푃 (퐵) be the set of all formal polynomials of degree 푘 using the elements of 퐵 as variables and the complex numbers as coefficients. Clearly 푃푘(퐵) is a vector space with basis ′ (7.41) 퐵 = {퐛푖1 퐛푖2 ⋯ 퐛푖푘 ∣ 1 ≤ 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푘 ≤ 푛} In particular, if one takes 푉 = ℂ[푛] with the basis 퐵 = {퐢 ∣ 푖 ∈ [푛]}, then we will write 푃푘(푛) for 푃푘(퐵). To illustrate, 2 푃 (3) = {푐1ퟏퟏ + 푐2ퟐퟐ + 푐3ퟑퟑ + 푐4ퟏퟐ + 푐5ퟏퟑ + 푐6ퟐퟑ ∣ 푐푖 ∈ ℂ for 1 ≤ 푖 ≤ 6}. 푘 One can turn 푃 (푛) into a 퐺푛-module for 퐺푛 = ⟨(1, 2, ... , 푛)⟩ by letting

(7.42) 푔(퐢1퐢2 ⋯ 퐢푘) = 푔(퐢1)푔(퐢2) ⋯ 푔(퐢푘)

for 푔 ∈ 퐺푛 and extending linearly. It should be clear from the definitions that 푛 (7.43) ℂ(( )) ≅ 푃푘(푛) 푘

as 퐺푛-modules. So we will use the latter in establishing the CSP. 푘 The advantage of using 푃 (푛) is that it is also a GL푛-module. In particular, we can define the action of 푔 ∈ GL푛 exactly as in (7.42), where 퐢푗 is thought of as the 푗th coordinate vector and the linear combinations 푔(퐢1), ... , 푔(퐢푘) are multiplied together formally to get an element of 푃푘(푛). For example, if 푛 = 3, 푘 = 2, and 1 2 0 푔 = [ 3 4 0 ] , 0 0 −1 then 푔(ퟏퟑ) = 푔(ퟏ)푔(ퟑ) = (ퟏ + 3 ퟐ)(−ퟑ) = −ퟏퟑ − 3 ퟐퟑ. It is not hard to show that this is an action. We will also need the fact that if 퐵̃ is any other basis for ℂ[푛], then the monomials defined by (7.41) (where one replaces each ̃ 푘 퐛푖푗 by the corresponding element of 퐵) also form a basis for 푃 (푛).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

258 7. Counting with Symmetric Functions

We now compute the character of 푃푘(푛). Notice from Corollary A.1.2 that, since 퐺푛 is cyclic, there is a basis 퐵 = {퐛1, 퐛2, ... , 퐛푛} for ℂ[푛] which diagonalizes [푔] for all 푔 ∈ 퐺푛; say

[푔]퐵 = diag(푥1, 푥2, ... , 푥푛). 푘 ′ To compute the action of 퐺푛 in 푃 (푛), we use the basis 퐵 in (7.41) and see that

푔(퐛푖1 퐛푖2 ⋯ 퐛푖푘 ) = 푔(퐛푖1 )푔(퐛푖2 ) ⋯ 푔(퐛푖푘 ) = 푥푖1 푥푖2 ⋯ 푥푖푘 퐛푖1 퐛푖2 ⋯ 퐛푖푘 . So 퐵′ diagonalizes this action with

′ [푔]퐵 = diag(푥푖1 푥푖2 ⋯ 푥푖푘 ∣ 1 ≤ 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푘 ≤ 푛). This gives the character

′ 휒 (푔) = ∑ 푥푖1 푥푖2 ⋯ 푥푖푘 = ℎ푘(푥1, 푥2, ... , 푥푛) 1≤푖1≤푖2≤. . .≤푖푘≤푛 which is just a complete homogeneous symmetric polynomial in the eigenvalues. For example, if 푛 = 3 and 푘 = 2, then we would have a basis 퐵 = {퐚, 퐛, 퐜} such that 2 [푔]퐵 = diag(푥1, 푥2, 푥3). So in 푃 (3) 2 2 2 푔(퐚퐚) = 푥1퐚퐚, 푔(퐛퐛) = 푥2퐛퐛, 푔(퐜퐜) = 푥3퐜퐜, 푔(퐚퐛) = 푥1푥2퐚퐛, 푔(퐚퐜) = 푥1푥3퐚퐜, 푔(퐛퐜) = 푥2푥3퐛퐜, which gives ′ 2 2 2 휒 (푔) = 푥1 + 푥2 + 푥3 + 푥1푥2 + 푥1푥3 + 푥2푥3. To prove the CSP we will need to relate homogeneous symmetric polynomials to 푞- 푖−1 binomial coefficients. This is done via the principal specialization which sets 푥푖 = 푞 for 푖 ≥ 1.

Proposition 7.9.2. We have the principal specializations

(푘) 푛 푒 (1, 푞, ... , 푞푛−1) = 푞 2 [ ] 푘 푘 푞 and 푛 + 푘 − 1 ℎ (1, 푞, ... , 푞푛−1) = [ ] . 푘 푘 푞

Proof. We will prove the identity for ℎ푘, leaving the one for 푒푘 as an exercise. From the definition of the complete homogeneous symmetric functions we seethat

푛−1 푗1 푗2 푗푘 ℎ푘(1, 푞, ... , 푞 ) = ∑ 푞 푞 ⋯ 푞 . 0≤푗1≤푗2≤⋯≤푗푘≤푛−1

But a sequence 푗1 ≤ 푗2 ≤ ⋯ ≤ 푗푘 corresponds to an integer partition 휆 obtained by listing the nonzero elements of the sequence in weakly decreasing order. Furthermore 푗1 푗2 푗푘 |휆| 푞 푞 ⋯ 푞 = 푞 and the bounds on the 푗푖 imply that 휆 ∈ ℛ(푘, 푛 − 1), the set of partitions contained in a 푘 × (푛 − 1) rectangle. The equality now follows from Theo- rem 3.2.5. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 259

We are now ready to complete the representation-theoretic demonstration that the triple (7.40) exhibits the cyclic sieving phenomenon. Consider [(1, 2, ... , 푛)] acting as a linear transformation on ℂ[푛]. Its characteristic polynomial is 푥푛 − 1 which has roots 1, 휔, 휔2, ... , 휔푛−1. So, by Corollary A.1.2, there is a diagonalizing basis 퐵 for the action of 퐺푛 = ⟨(1, 2, ... , 푛)⟩ with

2 푛−1 [(1, 2, ... , 푛)]퐵 = diag(1, 휔, 휔 , ... , 휔 ).

푗 Since any 푔 ∈ 퐺푛 has the form 푔 = (1, 2, ... , 푛) for some 푗 and since the generator has been diagonalized, we have

푗 푗 2푗 (푛−1)푗 2 푛−1 [푔]퐵 = diag(1 , 휔 , 휔 , ... , 휔 ) = diag(1, 훾, 훾 , ... , 훾 ) where 훾 = 휔푖 is a primitive 표(푔)th root of unity. But the previous theorem and the discussion just preceding it show that 푛 + 푘 − 1 휒′(푔) = ℎ (1, 훾, 훾2, ... , 훾푛−1) = [ ] 푘 푘 훾 where 휒′ is the character of 푃푘(푛). But, by (7.43), 푔 [푛] 휒′(푔) = #(( )) . 푘

Equating the last two displayed equations completes the proof.

Exercises

(1) (a) Show that (7.2) satisfies the definition of a group action. (b) Show that Sym is an algebra; that is, it is a vector space which is closed under multiplication of symmetric functions. (2) Prove Proposition 7.1.2(b). (3) Prove Theorem 7.1.3(b). (4) (a) Prove that lexicographic order is a total order on partitions. (b) Prove that adding parts of a partition makes it larger in lexicographic order. (c) Prove that lexicographic order on partitions is a linear extension of dominance order as introduced in Section 1.12. (d) Prove that the lexicographic order inequalities in the proof of Theorem 7.1.3(a) and (b) can be strengthened to dominance order inequalities. (e) Show that part (d) is also true in the statement of Theorem 7.2.2.

(5) Show that 픖푛 is generated by the adjacent transpositions (푖, 푖 + 1) for 1 ≤ 푖 < 푛. Hint: Induct on inv 휋 for 휋 ∈ 픖푛. (6) Consider the proof of Theorem 7.2.3. (a) Show in part (a) that every path family whose associated permutation is not the identity must intersect.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

260 7. Counting with Symmetric Functions

(b) In part (a), provide a description of the inverse of the map from lattice path families to SSYT. Be sure to prove that it is well-defined and indeed an inverse to the forward map. (c) Prove part (b).

(7) Verify that the real sequence 푎0, ... , 푎푛 is log-concave if and only if 푎0/푟, ... , 푎푛/푟 is, where 푟 ∈ ℝ − {0}. (8) Use Theorem 7.2.5 to prove log-concavity of the following sequences. 푛 푛 푛 (a) ( ), ( ),..., ( ). 0 1 푛

(b) 푐(푛, 0), 푐(푛, 1), ... , 푐(푛, 푛) (signless first kind Stirling numbers). (9) Prove equation (7.9). (10) Prove equation (7.13). (11) A plane partition of shape 휆 is a filling 푃 of the cells of 휆 with positive integers so that rows and columns weakly decrease. Let 푝푝푛 be the number of plane partitions 푃 (of any shape) such that |푃| = 푛. Show that

1 ∑ 푝푝 푥푛 = ∏ 푛 푖 푖 푛≥0 푖≥1 (1 − 푥 )

in two ways: by taking a limit in (7.14) and by providing a proof in the spirit of the Hillman–Grassl algorithm. (12) (a) Prove that the output array of HG1–HG3 is a reverse plane partition. (b) Prove that (7.16) is a total order. (c) Show that in GH3 of the Hillman–Grassl construction, the reverse path 푝 must reach the rightmost cell in row 푎. (d) Prove that after adding ones along the reverse path, the result is still a reverse plane partition. (13) (a) Construct the inverse of the map used in the proof of (7.20). (b) Prove (7.21). (c) Prove (7.24). (14) Complete the proof of Lemma 7.4.3. (15) Derive (7.26) from (7.25) using a bijection. ∗ (16) (a) Show that the 푃휆 -partitions are exactly the reverse plane partitions of shape 휆. (b) Show that for any finite poset 푃 we have #ℒ(푃) = #ℒ(푃∗). (17) (a) Let 휏 be a poset. Call 휏 a rooted tree if it has a 0̂ and its Hasse diagram is a tree in the graph-theoretic sense of the term. If #휏 = 푛, then a natural labeling of 휏 is an order-preserving bijection 휏 → [푛]. See Figure 7.19 for an example of a rooted tree (on the left) and a natural labeling (in the middle). Let 푓휏 be the number of natural labelings of 휏. Define the hooklength of 푣 ∈ 휏 to be

ℎ푣 = #푈(푣)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 261

6 3 5 1 1 1

4 2 2 3

1 6

Figure 7.19. A rooted tree poset, a natural labeling, and its hooklengths

where 푈(푣) is the upper-order ideal generated by 푣. The right-hand tree in Figure 7.19 lists its hooklengths. Prove that if #휏 = 푛, then 푛! 푓휏 = ∏푣∈휏 ℎ푣 in two ways: probabilistically and using induction on 푛. (b) The comb is the infinite poset on the left in Figure 7.20. Let 퐿푛 be the set of lower-order ideals of the comb which have 푛 elements. The three elements of 퐿3 are displayed on the right in Figure 7.20. Note that the last two order ideals are considered distinct even though they are isomorphic as posets. Show that #퐿푛 = 푓푛, the Fibonacci numbers defined in (1.2). (c) Using the notation of part (a), show that

∑ (푓휏)2 = 푛! .

휏∈퐿푛

′ (18) (a) Show that if 푃 is a PYT and 푥 ∉ 푃, then 푃 = 푟푥(푃) is still a PYT; that is, the rows and columns of 푃′ still increase. (b) Show that the RSK map is well-defined in that 푇 and 푈 are both semistandard.

(19) Consider three sets of variables 퐱 = {푥푖}푖≥1, 퐲 = {푦푗}푗≥1, and 퐱퐲 = {푥푖푦푗}푖,푗≥1. Prove that

∑ 푠휆(퐱)푠휆(퐲) = 푠푛(퐱퐲) 휆⊢푛 where the variables in 퐱퐲 can be substituted into the Schur function in any order since 푠푛 is symmetric. Hint: Use the Cauchy identity, equation (7.30).

⋅ ⋅⋅

Figure 7.20. The comb and its three lower-order ideals with 3 elements

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

262 7. Counting with Symmetric Functions

(20) Prove Theorem 7.5.4. Hint: First prove that if 푇 is an SSYT into which one inserts 푥 and 푦 in that order with 푥 ≤ 푦, then the box added to the shape by the insertion of 푦 is strictly to the right of the box for the insertion of 푥.

(21) Show that (7.27) can be derived from (7.30) by taking the coefficient of 푥1 ⋯ 푥푛푦1 ⋯ 푦푛 on both sides. (22) Fill in the details of the proof of Lemma 7.6.2. (23) If 휋 is a two-line array, then let ̂휋 be the upper line and let ̌휋 be the lower line. (a) Show that any two-line array 휋 with entries in ℙ and columns which are lexi- cographically weakly increasing corresponds to a matrix 푀 ∈ Mat. (b) Show that if RSK(휋) = (푇, 푈) with sh 푇 = sh 푈 = 휆, then

휆1 = the length of a longest weakly increasing subsequence of ̌휋 by mimicking the proof of Theorem 7.6.1. (c) Let 푇 be a semistandard Young tableau and suppose that the content of 푇 is co 푇 = [훼1, 훼2, ... , 훼푘]. The standardization of 푇 is the tableau std 푇 obtained by replacing the ones in 푇 by the numbers 1, 2, ... , 훼1 from left to right, replac- ing the twos in 푇 by the numbers 훼1 + 1, 훼1 + 2, ... , 훼1 + 훼2 from left to right, and so on. Show that std 푇 is a standard Young tableau. (d) Standardize a two-line array 휋 by using the same left-to-right replacement as in the previous part on ̂휋 and then doing so again on ̌휋. Clearly std 휋 is a per- mutation in two-line form if the columns of 휋 are lexicographically ordered. Show that in this case RS(std(휋)) = std(RSK(휋)). (e) Use part (d) and the result (rather than the proof) of Theorem 7.6.1 to give a second proof of part (b). (24) (a) Extend column insertion to semistandard tableaux 푇 by having an element 푥 bump the uppermost element in a column greater than or equal to 푥. Show that with this definition 푐푥(푇) is semistandard. (b) Give two proofs that for any semistandard Young tableau 푇 and positive inte- gers 푥, 푦 we have 푐푦(푟푥(푇)) = 푟푥(푐푦(푇)), one by mimicking the proof of Lemma 7.6.2 and one by using the standard- ization operator std. (c) Give two proofs that if RSK(휋) = (푇, 푈) with sh 푇 = sh 푈 = 휆, then 푡 휆1 = the length of a longest decreasing subsequence of 휋, one by using part (b) and one using the standardization operator std from the previous exercise. (d) Prove the identity

∑ 푠휆(퐱)푠휆푡 (퐲) = ∏(1 + 푥푖푦푗). 휆 푖,푗≥1 Hint: Use column insertion to define a weight-preserving bijection 푀 → (푇, 푈) where 푀 ∈ Mat is a matrix with all entries zero or one, sh 푇 = sh 푈, and 푇, 푈푡 semistandard.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 263

⋮ ⋮ ⋮

1111 211 121 112 22

111 21 12

11 2

1

Figure 7.21. Part of the poset ℱ

(25) Finish the proof of Proposition 7.7.1. (26) Show that the base case holds in Corollary 7.7.3. (27) Prove Proposition 7.7.6. (28) Prove Theorem 7.7.7.

(29) (a) Let ℱ푛 be the set of all words 푤 of ones and twos having ∑푖 푤푖 = 푛. Show that for 푛 ≥ 0 we have

#ℱ푛 = 푓푛,

the Fibonacci numbers defined in (1.2).

(b) Put a partial order on ℱ = ⨄푛≥0 ℱ푛 with covers 푣 ⋖ 푤 whenever F1 푣 can be obtained from 푤 by removing the first one in 푤 or F2 푤 can be obtained from 푣 by changing the first one in 푣 to a two. The lower ranks of ℱ are shown in Figure 7.21. Show that ℱ is differential. (c) Show that ℱ is a lattice. Hint: Prove that 푣∧푤 exists by induction on rk 푣+rk 푤 and then use Exercise 12(c) in Chapter 5. (d) Give a second proof that ℱ is a lattice using Exercise 12(d) in Chapter 5. (30) Let 푟 ∈ ℙ. Say poset 푃 is 푟-differential if it satisfies DP1, DP3, and the following axiom for any 푥 ∈ 푃: DP2r If 푥 covers 푘 elements, then it is covered by 푘 + 푟 elements. Prove the following statements. (a) If 푃 is 푟-differential, then # Rk푛 푃 is finite for all 푛. So there are well-defined 퐷 and 푈 operators. (b) Let 푃 be ranked with # Rk푛 푃 finite for all 푛. We have

푃 is 푟-differential ⟺ 퐷푈 − 푈퐷 = 푟퐼.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

264 7. Counting with Symmetric Functions

(c) In any 푟-differential poset we have ∑ (푓푥)2 = 푟푛푛!

푥∈Rk푛 푃 where 푓푥 is the number of saturated 0̂–푥 chains. (d) If 푃 is 푟-differential and 푄 is 푠-differential, then 푃 × 푄 is (푟 + 푠)-differential. In particular, if 푃 is differential, then 푃푟 is 푟-differential. (31) Show that the backwards implication in Proposition 7.8.4 holds. Hint: Use Theo- rem 3.8.4.

(32) (a) The star 푆푛 is the tree with 푉 = {푣1, ... , 푣푛} and with edges 퐸 = {푣1푣2, 푣1푣3, ... , 푣1푣푛}. Find expressions for 푋(푆푛) in the monomial basis and in the power sum basis with coefficients which are, up to sign, products of binomial coef- 푛 ficients. (Of course, every integer is a binomial coefficient since (1) = 푛. But any such factor should come choosing one thing from 푛 things combinatori- ally.) (b) Prove that for any graph 퐺 푃(퐺; 푡) = ∑ (−1)#퐹 푡ℓ(휆(퐹)) 퐹⊆퐸 in three ways: by using deletion-contraction, by using a sign-reversing invo- lution, and by using 푋(퐺). 푘 (33) (a) Show that (7.42) defines an action of GL푛 on 푃 (푛). ̃ ̃ ̃ (b) Show that if 퐵̃ = {퐛1, 퐛2,..., 퐛푛} is any basis for ℂ[푛], then ̃ ̃ ̃ {퐛푖1 퐛푖2 ⋯ 퐛푖푘 ∣ 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푘} is a basis for 푃푘(푛). (34) (a) Let

푒푘(푛) = 푒푘(푥1, 푥2, ... , 푥푛)

and similarly for ℎ푘(푛). Prove that 푒푘(0) = ℎ푘(0) = 훿푘,0 and for 푛 ≥ 1

푒푘(푛) = 푒푘(푛 − 1) + 푥푛푒푘−1(푛 − 1),

ℎ푘(푛) = ℎ푘(푛 − 1) + 푥푛ℎ푘−1(푛). (b) Give three proofs of the first identity in Proposition 7.9.2: one by induction, one using the 푞-Binomial Theorem (Theorem 3.2.4), and one using Exercise 7(c) in Chapter 3. (c) Give two more proofs of the second identity in Proposition 7.9.2: one by in- duction and one which uses the 푞-Binomial Theorem (Theorem 3.2.4). (35) (a) Prove that

푆[푛, 푘] = ℎ푛−푘([1]푞, [2]푞, ... , [푘]푞), where 푆[푛, 푘] is the 푞-Stirling number of the second kind introduced in Chap- ter 3, Exercise 21. Hint: Use part (a) of Exercise 34. (b) Prove that

푐[푛, 푘] = 푒푛−푘([1]푞, [2]푞, ... , [푛 − 1]푞), where 푐[푛, 푘] is the signless 푞-Stirling number of the first kind introduced in Chapter 3, Exercise 22. Hint: Use part (a) of Exercise 34.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 265

(36) Define a relation on the polynomial ring ℝ[푞] by letting 푓(푞) ≤ 푔(푞) if, for all 푖 ∈ ℕ, the coefficient of 푞푖 in 푓(푞) is less than or equal to the coefficient of 푞푖 in 푔(푞). (a) Prove that this relation is a partial order on ℝ[푞], but not a total order. (b) Define a sequence of polynomials 푓0(푞), 푓1(푞), ... to be 푞-log-concave if 2 푓푘(푞) ≥ 푓푘−1(푞)푓푘+1(푞)

for all 푘 ≥ 1. If the polynomial sequence is finite, we let 푓푘(푞) = 0 for all sufficiently large 푘. Prove the following. (i) For any 푛 ∈ ℕ the finite sequence (0) 푛 (1) 푛 (푛) 푛 푞 2 [ ] , 푞 2 [ ] , ... , 푞 2 [ ] 0 1 푛 is 푞-log-concave. Hint: Use the first identity in Proposition 7.9.2. (ii) For any 푛 ∈ ℕ the infinite sequence 푛 푛 + 1 푛 + 2 [ ] , [ ] , [ ] ,... 0 1 2 is 푞-log-concave. Hint: Use the second identity in Proposition 7.9.2. (iii) For any 푛 ∈ ℕ the finite sequence 푐[푛, 0], 푐[푛, 1], ... , 푐[푛, 푛] is 푞-log-concave. Hint: Use part (b) of the previous exercise.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Chapter 8

Counting with Quasisymmetric Functions

While symmetric functions are invariant under arbitrary permutations of variables, quasisymmetric functions only need to be preserved by order-preserving bijections on the variable subscripts. Quasisymmetric functions are implicit in the work of Stanley on 푃-partitions [84] but were first explicitly defined and studied by Gessel [32]. As we will see in this chapter, these functions also have interesting connections with chain enumeration in posets, pattern avoidance, and graph coloring.

8.1. The algebra of quasisymmetric functions, QSym

We start by defining what it means for a power series to be quasisymmetric. Wewill introduce two important bases for the algebra of quasisymmetric functions and discuss their relationship with symmetric functions.

As usual, 퐱 = {푥1, 푥2, 푥3,...} will be a countably infinite variable set. A power series 푓(퐱) ∈ ℂ[[퐱]] is quasisymmetric if any two monomials of the form 푥훼1 푥훼2 ⋯ 푥훼푙 푖1 푖2 푖푙 with 푖 < 푖 < ⋯ < 푖 and 푥훼1 푥훼2 ⋯ 푥훼푙 with 푗 < 푗 < ⋯ < 푗 have the same 1 2 푙 푗1 푗2 푗푙 1 2 푙 coefficient. Note the increasing condition on the subscripts which is not present inthe definition of a symmetric function. So this is more a restrictive condition; that is,every symmetric function is quasisymmetric but not conversely. For example

4 4 4 2 2 2 2 (8.1) 푓(퐱) = 5푥1 푥2 + 5푥1 푥3 + 5푥2푥3 + ⋯ − 푥1푥2푥3 − 푥1푥2푥4 − 푥1푥3푥4 − 푥2푥3푥4 − ⋯

is quasisymmetric but not symmetric. An equivalent way to define 푓(퐱) being qua- sisymmetric is to say that any monomial of the form 푥훼1 푥훼2 ⋯ 푥훼푙 with 푖 < 푖 < ⋯ < 푖 푖1 푖2 푖푙 1 2 푙 훼1 훼2 훼푙 has the same coefficient as 푥1 푥2 ⋯ 푥푙 .

267

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

268 8. Counting with Quasisymmetric Functions

To set up notation, let

QSym푛 = QSym푛(퐱) = {푓(퐱) ∈ ℂ[[퐱]] ∣ 푓 is quasisymmetric and homogeneous of degree 푛}. Then the algebra of quasisymmetric functions is QSym = QSym(퐱) = QSym (퐱). ⨁ 푛 푛≥0 Note that, as with symmetric functions, since QSym is a direct sum the quasisymmetric power series in it are of bounded degree. Note also that, unlike symmetric functions, it is not obvious that this is an algebra, i.e., that QSym is closed under multiplication and not just under taking linear combinations. The proof of this fact will follow from Theorem 8.3.1. As we will see, bases for QSym are indexed by integer compositions. We will be interested in two particular bases.

Given a composition 훼 = [훼1, 훼2, ... , 훼푙], the associated monomial quasisymmetric function is 푀 = ∑ 푥훼1 푥훼2 ⋯ 푥훼푙 . 훼 푖1 푖2 푖푙 푖1<푖1<⋯<푖푙

훼1 훼2 훼푙 So 푀훼 can be thought of as the result of quasisymmetrizing the monomial 푥1 푥2 ⋯ 푥푙 . To illustrate, 3 3 3 푀[1,3] = 푥1푥2 + 푥1푥3 + 푥2푥3 + ⋯ .

We will often drop the square brackets and commas in the subscript of 푀훼. This should cause no confusion with partitions because of the use of capital letters for qua- sisymmetric bases and lowercase ones for symmetric function bases. We will also 푚푖 use multiplicity notation for 훼 where 푖 denotes a string of 푚푖 consecutive 푖’s. Note that the quasisymmetric function in (8.1) can be written as the linear combination 푓(퐱) = 5푀41 −7푀121. This can always be done as the 푀훼 are a basis. This proof follows the same lines as in the demonstration that the 푚휆 form a basis for Sym, Theorem 7.1.1.

Theorem 8.1.1. The 푀훼 as 훼 varies over all compositions form a basis for QSym. Con- sequently, for 푛 ≥ 1, 푛−1 dim QSym푛 = 2 . Proof. The dimension statement follows from the basis claim and Theorem 1.7.1. To prove that the 푀훼 are a basis, note first that they are independent since no two mono- mial quasisymmetric functions contain the same monomial. To show that they span, take an 푓 ∈ QSym. Consider any term in 푓, say 푐푥훼1 푥훼2 ⋯ 푥훼푙 where 푖 < 푖 < ⋯ < 푖 푖1 푖2 푖푙 1 2 푙 and 푐 ∈ ℂ. Then all monomials 푥훼1 푥훼2 ⋯ 푥훼푙 such that 푗 < 푗 < ⋯ < 푗 appear with 푗1 푗2 푗푙 1 2 푙 coefficient 푐. So 푓 − 푐푀훼 is still quasisymmetric and contains no monomials with or- dered exponent sequence 훼. The fact that 푓 is of bounded degree implies that repeating this process a finite number of times will yield zero. Thus 푓 is a linear combination of □ the 푀훼 which were subtracted.

There is a nice relationship between the monomial quasisymmetric functions and their symmetric counterparts. A rearrangement of a partition 휆 is a composition 훼 ob- tained by listing the parts of 휆 in a particular order. For example, the rearrangements of

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.1. The algebra of quasisymmetric functions, QSym 269

휆 = (2, 1, 1) are 훼 = [2, 1, 1], 훼 = [1, 2, 1], and 훼 = [1, 1, 2]. The proof of the following result is easy and so is left to the reader. Proposition 8.1.2. For any partition 휆

푚휆 = ∑ 푀훼 훼 where the sum is over all rearrangements 훼 of 휆. □

To describe the other basis for QSym which will interest us, it will be convenient to remember that there is a simple bijection 휙 between subsets 푆 ⊆ [푛 − 1] and com- positions 훼 ⊧ 푛 defined by (1.8). So we will sometimes write 푀푆 instead of 푀훼 if 휙(푆) = 훼. Strictly speaking, 푀푆 is not well-defined since there will be many 푛 such that 푆 ⊆ [푛 − 1]. But the context will always make it clear which 푛 is meant. To il- lustrate, if 푛 = 2 and 푆 = {1}, then 푀푆 = 푀[1,1], whereas if 푛 = 3 with the same 푆, then 푀푆 = 푀[1,2]. Given 푆 = {푠1, ... , 푠푘} ⊆ [푛 − 1], the corresponding fundamental quasisymmetric function is

퐹푆 = ∑ 푥푖1 푥푖2 ⋯ 푥푖푛 . 푖1≤푖2≤⋯≤푖푛

푖푠<푖푠+1 if 푠∈푆 In words, one sums over all monomials whose indices form a weakly increasing se- quence with strict increases at the positions indexed by 푆. As an example, if 푛 = 4 and 푆 = {1, 3}, then 2 2 (8.2) 퐹푆 = ∑ 푥푖푥푗푥푘푥푙 = 푥1푥2푥3 + 푥1푥2푥4 + ⋯ + 푥1푥2푥3푥4 + 푥1푥2푥3푥5 + ⋯ . 푖<푗≤푘<푙

We let 퐹훼 = 퐹푆 when 휙(푆) = 훼. So if 훼 = [훼1, ... , 훼푙], then the strict inequalities in the subscripts for the monomials in 퐹훼 must occur at positions indexed by partial sums 훼1 + ⋯ + 훼푖 for each 푖 < 푙. To describe the expansion of the 퐹훼 in terms of the 푀훽 we will use the partial order in the composition lattice 퐾푛 described at the beginning of Section 5.1. Proposition 8.1.3. We have

퐹훼 = ∑ 푀훽 훽≤훼

where ≤ is the partial order in the composition lattice 퐾푛.

Proof. The power series 퐹훼 is quasisymmetric since the inequalities impose by 훼 on a sequence 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푛 depend only on the positions in the sequence and not on the actual choice of the 푖푗. Furthermore, each monomial appearing in 퐹훼 has coefficient one. So the same is true of the expansion of 퐹훼 in the 푀훽 basis. The only thing left to prove is that 푀훽 appears in the expansion if and only if 훽 ≤ 훼. The monomials

occurring in 퐹훼 are those which can be expressed as 푥푖1 푥푖2 ⋯ 푥푖푛 where 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푛 and 푖푗 < 푖푗+1 for each 푗 which is a partial sum of 훼. Collecting together variables with the same subscripts, these are exactly the monomials which can be written as 푥훽1 푥훽2 ⋯ 푥훽푙 where 푗 < 푗 < ⋯ < 푗 and 훽 ≤ 훼. This observation completes the 푗1 푗2 푗푙 1 2 푙 proof. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

270 8. Counting with Quasisymmetric Functions

We can use the previous result to show that the 퐹훼 are a basis for QSym. The proof is much the same as that of Theorem 7.1.3(a) and so it is left to the reader. □ Theorem 8.1.4. The set {퐹훼 ∣ 훼 ⊧ 푛} is a basis for QSym푛.

8.2. Reverse 푃-partitions

Fundamental quasisymmetric functions can be used to count 푃-partitions. In fact, they give a more refined generating function which keeps track of the parts used inthe partitions in the same way that Schur functions and chromatic symmetric functions do for semistandard Young tableaux and proper colorings, respectively. This permits us to express a Schur function as a sum over standard Young tableaux and to write down a rule for the multiplication of fundamental quasisymmetric functions. But to make the partition conventions align with those for quasisymmetric functions, we will first define a slight variant of 푃-partitions where the inequalities are reversed. Consider functions 푓 ∶ [푛] → ℙ whose range is the positive integers. Say that 푓 is reverse compatible with permutation 휋 ∈ 픖푛 if

RC1 푓(휋1) ≤ 푓(휋2) ≤ ⋯ ≤ 푓(휋푛) and RC2 푓(휋푖) < 푓(휋푖+1) whenever 푖 ∈ Des 휋. Note that there is a bijection between the functions 푓 ∶ [푛] → ℙ which are reverse compatible with 휋 and the functions 푔∶ [푛] → ℕ which are compatible with 휋’s reverse complement

′ ′ ′ ′ (8.3) 휋 = 휋1휋2 ... 휋푛 = 푛 + 1 − 휋푛, 푛 + 1 − 휋푛−1, ... , 푛 + 1 − 휋1 ′ where 푔(휋푖) = 푓(휋푛+1−푖) − 1 for all 푖 ∈ [푛]. So studying reverse compatibility and compatibility is essentially the same. But, as already mentioned, RC1 and RC2 will play more nicely with fundamental quasisymmetric functions. Let ℛ퐶(휋) = {푓 ∶ [푛] → ℙ ∣ 푓 is reverse compatible with 휋}. The proof of the next result is similar to that of Lemma 7.4.1 and so it is omitted.

Lemma 8.2.1. Every 푓 ∶ [푛] → ℙ is reverse compatible with a unique 휋 ∈ 픖푛. Thus

{푓 ∣ 푓 ∶ [푛] → ℙ} = ℛ퐶(휋). □ ⨄ 휋∈픖푛

To make a connection with quasisymmetric functions, associate with any 푓 ∶ [푛] → ℙ the monomial 푓 퐱 = 푥푓(1)푥푓(2) ⋯ 푥푓(푛). Recall that the fundamental quasisymmetric functions can be indexed by subsets 푆 ⊆ [푛 − 1]. And given a permutation 휋 ∈ 픖푛, we have Des 휋 ⊆ [푛 − 1].

Lemma 8.2.2. For any 휋 ∈ 픖푛 we have 푓 (8.4) ∑ 퐱 = 퐹Des 휋. 푓∈ℛ퐶(휋)

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.2. Reverse 푃-partitions 271

Proof. Since the monomials on both sides of (8.4) all occur with coefficient one, it suffices to find a bijection between the subscripts which can appear for monomialson the two sides of the equation. By definition, the subscripts of monomials in 퐹Des 휋 are precisely those of the form 푖1 ≤ 푖2 ≤ ⋯ ≤ 푖푛 with 푖푘 < 푖푘+1 if 푘 ∈ Des 휋. Associate with this subscript the function 푓 ∶ [푛] → ℙ where 푓(휋푗) = 푖푗 for 푗 ∈ [푛]. We claim that this is well-defined in that 푓 ∈ ℛ퐶(휋). Indeed, the fact that the 푖푗 are weakly increasing is condition RC1 and the placement of the strict inequalities agrees with RC2. It is now an easy matter to construct a well-defined inverse, completing the proof. □

We now bring the 푃-partition definitions through the looking glass into the landof reverse. Given a poset 푃 on [푛], a reverse 푃-partition is a function 푓 ∶ 푃 → ℙ satisfying RPP1 푖 ⊴ 푗 implies 푓(푖) ≤ 푓(푗) and RPP2 푖 ⊴ 푗 and 푖 > 푗 implies 푓(푖) < 푓(푗). We let RPar 푃 = {푓 ∶ 푃 → ℙ ∣ 푓 is a reverse 푃-partition}. The result below follows from Lemma 8.2.1 in much the same way that Lemma 7.4.3 was derived from Lemma 7.4.1.

Lemma 8.2.3 (Fundamental Lemma of Reverse 푃-Partitions). Let 푃 be a poset on [푛]. Then 푓 ∈ RPar 푃 if and only if 푓 ∈ ℛ퐶(휋) for some 휋 ∈ ℒ(푃). Thus

RPar 푃 = ℛ퐶(휋). □ ⨄ 휋∈ℒ(푃)

To derive a generating function identity we define, for 푃 a poset on [푛], the gener- ating function rpar 푃 = rpar(푃; 퐱) = ∑ 퐱푓. 푓∈RPar 푃 Now using the previous lemma and (8.4) we obtain the following.

Theorem 8.2.4. For any poset 푃 on [푛] we have

□ rpar 푃 = ∑ 퐹Des 휋. 휋∈ℒ(푃)

Since every symmetric function is quasisymmetric, one can ask what the expan- sion of a symmetric function is in one of the bases for QSym. We will now answer this question for the expansion of the Schur functions in terms of the fundamental qua- sisymmetrics. To do so we need a notion of descent for standard Young tableaux. If 푄 is an SYT, then let Des 푄 = {푘 ∣ 푘 + 1 is in a row below 푘 in 푄}. For example, the SYT in Figure 7.1 have descent sets {2, 4}, {2, 3}, {1, 3, 4}, {1, 3}, and {1, 2, 4}, respectively.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

272 8. Counting with Quasisymmetric Functions

8 4

7 3 1

6 2

5

휆 푃휆

푄 = 1 2 4 8 ⟷ 휋 = 56273148 3 5 7 6

Figure 8.1. A Young diagram and the associated labeling of 푃휆, along with an SYT 푄 and the associated linear extension

Theorem 8.2.5. For any partition 휆 ⊢ 푛 we have

푠휆 = ∑ 퐹Des 푄. 푄∈SYT(휆)

Proof. Recall from the end of Section 7.4 that associated with 휆 is a poset 푃휆. We will turn 푃휆 into a poset on [푛] in such a way that the reverse 푃휆-partitions are exactly the semistandard Young tableaux of shape 휆. To this end, label the vertices of 푃휆 corre- sponding to the last row of 휆 = (휆1, ... , 휆푙) from 1 to 휆푙 so that the labels increase with the partial order. Now similarly label the vertices in the penultimate row with 휆푙 + 1 to 휆푙 + 휆푙−1, and continue doing so until all of 푃휆 is labeled. An example is in Figure 8.1. Using the usual coordinates on 휆 to also refer to the corresponding nodes of 푃휆, we will let 푙(푖, 푗) be the label of node (푖, 푗) in 푃휆. In Figure 8.1, 푙(2, 3) = 4. ∘ We claim that rotation by 135 is a bijection SSYT(휆) → RPar 푃휆. As with 푃- partitions, it suffices to show that RPP1 and RPP2 hold on covers. Now 푇 ∈ SSYT(휆) if and only if 푇푖,푗 ≤ 푇푖,푗+1 and 푇푖,푗 < 푇푖+1,푗 for all cells (푖, 푗). Since 푙(푖, 푗) < 푙(푖, 푗 + 1) and 푙(푖, 푗) > 푙(푖 + 1, 푗), these two conditions on 푇 become exactly RPP1 and RPP2 for the associated reverse 푃-partition, demonstrating the claim. And from this it follows that

푠휆 = rpar 푃휆.

The SYT 푄 of shape 휆 are in bijection with the 휋 ∈ ℒ(푃휆) by letting 휋푘 = 푙(푖, 푗) where 푄푖,푗 = 푘. Furthermore, if 푄 ⟷ 휋 under this bijection, then Des 푄 = Des 휋. To see this, suppose first that 푘 ∈ Des 푄 where 푘 is in cell 푐. Then 푘 + 1 is in a cell 푐′ in a ′ lower row of 푄. It follows that 휋푘 = 푙(푐) > 푙(푐 ) = 휋푘+1 because of the way the elements of 푃휆 were labeled. Thus 푘 ∈ Des 휋. Showing the reverse inclusion, Des 휋 ⊆ Des 푄, is

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.2. Reverse 푃-partitions 273

similar and so is left as an exercise. Thus if 푄 ⟷ 휋, then

퐹Des 푄 = 퐹Des 휋.

Combining the two previous displayed equations with Theorem 8.2.4 gives

푠휆 = rpar 푃휆 = ∑ 퐹Des 휋 = ∑ 퐹Des 푄 휋∈ℒ(푃휆) 푄∈SYT(휆) which is the desired conclusion. □

We will now show that QSym is an algebra by showing that the product of two fundamental quasisymmetric functions is a linear combination of quasisymmetrics. This will give us a second application of Theorem 8.2.4. First note that to apply RPP1 and RPP2 it is not necessary that 푃 be a poset on the set [푛]: any subset 푆 ⊂ ℙ would do in place of [푛]. Think of such a poset as a pair (푃, 휔) consisting of an underlying poset 푃 and a labeling 휔∶ 푃 → 푆. Two labelings 휔∶ 푃 → 푆 and 휔′ ∶ 푃 → 푆′ satisfying 휔(푥) < 휔(푦) if and only if 휔′(푥) < 휔′(푦) are said to have the same relative order. In this case, the monomials 퐱푓 for reverse (푃, 휔)-partitions are the same as the monomials for reverse (푃, 휔′)-partitions. Thus rpar(푃, 휔) = rpar(푃, 휔′). Given disjoint posets 푃 on a label set 푆 and 푄 on a label set 푇 where 푆 and 푇 are disjoint, one gives 푃 ⊎ 푄 the label set 푆 ⊎ 푇 in the obvious way. The next result follows easily from the definitions.

Proposition 8.2.6. If 푃, 푄 are posets on disjoint subsets of ℙ, then

rpar(푃 ⊎ 푄) = (rpar 푃)(rpar 푄). □

Our final tool involves shuffling sequences of integers. If 휎 ∈ 푃(푈) and 휏 ∈ 푃(푉) where 푈, 푉 are disjoint, then the associated set of shuffles is 휎 ⧢ 휏 = {휋 ∈ 푃(푈 ⊎ 푉) ∣ 휎 and 휏 are subwords of 휋}. To illustrate, if 휎 = 14 and 휏 = ퟓퟐ, then 휎 ⧢ 휏 = {14ퟓퟐ, 1ퟓ4ퟐ, 1ퟓퟐ4, ퟓ14ퟐ, ퟓ1ퟐ4, ퟓퟐ14}. Finally, we note that the definition of a descent set can be extended toany 휋 ∈ 푃(푆) and that Theorem 8.2.4 continues to hold.

Theorem 8.2.7. If 휎 ∈ 푃(푈) and 휏 ∈ 푃(푉) where 푈, 푉 are disjoint subsets of ℙ, then

퐹Des 휍퐹Des 휏 = ∑ 퐹Des 휋. 휋∈휍⧢휏

Proof. Let 푃 be the poset on 푈 which is a chain labeled from bottom to top by the elements of 휎 read left to right. So ℒ(푃) = {휎}. Similarly define 푄 so that ℒ(푄) = {휏}. It follows that ℒ(푃 ⊎ 푄) = 휎 ⧢ 휏. Now applying Theorem 8.2.4 and Proposition 8.2.6

퐹Des 휍퐹Des 휏 = (rpar 푃)(rpar 푄) = rpar(푃 ⊎ 푄) = ∑ 퐹Des 휋 휋∈휍⧢휏 as desired. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

274 8. Counting with Quasisymmetric Functions

The previous theorem is remarkable for it implies that the multiset {{Des 휋 ∣ 휋 ∈ 휎 ⧢ 휏}} depends only on the lengths of 휎 and 휏 and their descent sets. A function on permutations with this property is called shuffle compatible and this concept has been studied by Gessel and Zhuang [33], Grinberg [36], as well as Baker-Jarvis and Sagan [4].

8.3. Chain enumeration in posets

As we have just seen, the fundamental quasisymmetric functions give us information about reverse 푃-partitions. It turns out that the monomial quasisymmetric functions can be used to model chains in posets, as was shown by Ehrenborg [24]. In particular, we will see how multiplication of the 푀훼 corresponds to taking the product of posets. A connection will also be made with the binomial posets studied in Section 5.9. We begin by deriving a formula for the product of two monomial quasisymmet- ric functions. Recall that 훼 is a weak composition if it can include parts equal to zero. An expansion of a composition 훼 is a weak composition ̄훼 such that remov- ing the zeros from ̄훼 one obtains 훼. For example, one expansion of 훼 = [1, 4, 1] is ̄훼= [0, 0, 1, 4, 0, 1, 0]. If 훼, 훽, 훾 are compositions, then we say 훾 is a shuffle sum of the other two compositions if there are expansions ̄훼 and 훽̄ of 훼 and 훽, respectively, which have length ℓ(훾) such that 훾 = ̄훼+훽.̄ Here, addition is componentwise. To illustrate, if 훼 = [1, 2] and 훽 = [1], then there are two ways of writing 훾 = [1, 1, 2] as a shuffle sum of 훼 and 훽, namely [1, 0, 2] + [0, 1, 0] and [0, 1, 2] + [1, 0, 0]. The reader can verify that

푀[1,2]푀[1] = 푀[2,2] + 푀[1,3] + 2푀[1,1,2] + 푀[1,2,1]

where the coefficient 2 of 푀[1,1,2] corresponds to these two shuffle sums. The general result is as follows. Theorem 8.3.1. We have 훾 푀훼푀훽 = ∑ 푐훼,훽푀훾 훾 훾 where 푐훼,훽 is the number of ways of writing 훾 as a shuffle sum of 훼 and 훽.

Proof. Since 푀훼푀훽 is quasisymmetric, it suffices to show that for any 훾 = [훾1, ... , 훾푡] 훾1 훾푡 훾 we have 푥1 ⋯ 푥푡 occurring in the product with coefficient 푐훼,훽. For any, possibly 훼̄ 훼̄1 훼̄푡 weak, composition ̄훼= [훼1, ... , 훼푡] we let 퐱 = 푥1 ⋯ 푥푡 . Let ℓ(훼) = 푟, ℓ(훽) = 푠, and ℓ(훾) = 푡. Given any shuffle sum 훾 = ̄훼+ 훽,̄ we clearly have 퐱훾 = 퐱훼̄ 퐱훽̄. Conversely, suppose 퐱훾 = (푥훼1 ⋯ 푥훼푟 )(푥훽1 ⋯ 푥훽푠 ) = 푥훾1 ⋯ 푥훾푡 푖1 푖푟 푗1 푗푠 푘1 푘푡

where 푖1 < ⋯ < 푖푟 and 푗1 < ⋯ < 푗푠. Define an expansion ̄훼 of 훼 of length 푡 by putting 훼푝 in position 푖푝 of ̄훼 for 푝 ∈ [푟] and placing zeros everywhere else. Similarly define 훽.̄ The previous displayed equation implies that 훾 = ̄훼+ 훽.̄ It is not hard to see that the two maps just described are inverses. So we have bijection between shuffle sum ̄ 훾 decompositions 훾 = ̄훼+ 훽 and ways to write 퐱 as a product of a monomial from 푀훼 □ with a monomial from 푀훽. The theorem follows.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.3. Chain enumeration in posets 275

To make the connection with chain enumeration, let 푃 be a finite, ranked, poset with a 1̂. Recall that in this situation, any interval [푥, 푦] ⊆ 푃 can be considered as a poset of rank

rk[푥, 푦] = rk푃 푦 − rk푃 푥. Associate with any chain

(8.5) 퐶 ∶ 0̂ = 푥0 < 푥1 < ⋯ < 푥푘 = 1̂ the composition

(8.6) 훼(퐶) = [rk[푥0, 푥1], rk[푥1, 푥2], ... , rk[푥푘−1, 푥푘]].

For example, in 퐵6 the chain 퐶 ∶ ∅ < {2, 5} < {1, 2, 4, 5, 6} < [6] has 훼(퐶) = [2, 3, 1]. Also associate with 푃 the generating function

푀(푃) = ∑ 푀훼(퐶) 퐶 where the sum is over all chains 퐶 of the form (8.5). This generating function respects products of posets.

Theorem 8.3.2. Let 푃, 푄 be finite, ranked posets each having a 1̂. Then (8.7) 푀(푃 × 푄) = 푀(푃)푀(푄).

Proof. By definition, the coefficient of 푀훾 on the left side of (8.7) is the number of 0̂–1̂ chains 퐶 in 푃 × 푄 with 훼(퐶) = 훾. And by Theorem 8.3.1, the coefficient of 푀훾 on the 훾 ̂ ̂ right is ∑퐴,퐵 푐훼(퐴),훼(퐵) where the sum is over all 0–1 chains 퐴 ⊆ 푃 and 퐵 ⊆ 푄. So it suffices to find a bijection between 0̂–1̂ chains 퐶 in 푃 × 푄 and ways of writing 훼(퐶) as a shuffle sum of 훼(퐴) and 훼(퐵) for 0̂–1̂ chains 퐴, 퐵 in 푃, 푄, respectively. Suppose first that we are given

(8.8) 퐶 ∶ 0̂ = (푥0, 푦0) < (푥1, 푦1) < ⋯ < (푥푘, 푦푘) = 1.̂ The projection of 퐶 onto 푃 is the multichain

퐴 ∶ 0̂ = 푥0 ≤ 푥1 ≤ ⋯ ≤ 푥푘 = 1.̂ This multichain has underlying chain 퐴 obtained by replacing each maximal string of copies of 푥 in 퐴 with just 푥 itself. Note that the definition in (8.6) can be applied equally well to multichains, except now the result will be a weak composition. Furthermore 훼(퐴) is an expansion of 훼(퐴). Similarly define the projection 퐵 of 퐶 onto 푄 with its underlying chain 퐵. From Exercise 8(c) in Chapter 5 we know that rk(푥, 푦) = rk(푥) + rk(푦) for all (푥, 푦) ∈ 푃 × 푄. It follows that 훼(퐶) = 훼(퐴) + 훼(퐵). This completes the definition of the bijection in one direction. Now suppose we are given 0̂–1̂ chains 퐴 in 푃 and 퐵 in 푄 such that there are expan- ̄ ̄ sions ̄훼 and 훽 of 훼(퐴) and 훼(퐵) satisfying 훾 = ̄훼+ 훽 for some 푀훾 appearing on the left in (8.7). Then there is a unique multichain 퐴 whose underlying chain is 퐴 and whose

composition is 훼(퐴) = ̄훼: each 푥푖 ∈ 퐴 is replaced by 푚푖 + 1 copies of itself where 푚푖 is the number of zeros in ̄훼 between the elements 훼푖 and 훼푖+1 of 훼(퐴). Similarly we

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

276 8. Counting with Quasisymmetric Functions

obtain a multichain 퐵 from 퐵 and 훽.̄ Finally, we construct 퐶 as in (8.8) where the first components are 퐴 and the second 퐵. Verifying that 훼(퐶) = 훼(퐴) + 훼(퐵) and that this is the inverse of the map defined in the previous paragraph is left to the reader. □

To end this section, suppose that 퐼 = [푥, 푧] is an 푛-interval as defined in axiom BP2 of Section 5.9 for a binomial poset 푃. The generating function 푀(퐼) has a very nice form in terms of the factorial function 퐹(푛) for 푃. Given a composition 훼 = [훼1, 훼2, ... , 훼푘] ⊧ 푛, define 푛 퐹(푛) ( ) = . 훼 퐹(훼 )퐹(훼 ) ⋯ 퐹(훼 ) 푃 1 2 푘 Note that when 푃 = 퐵 , then (푛) is just a multinomial coefficient as defined in(1.14). ∞ 훼 푃 Theorem 8.3.3. Let 푃 be a binomial poset and let 퐼 be an 푛-interval in 푃. Then 푛 푀(퐼) = ∑ ( ) 푀 . 훼 훼 훼⊧푛 푃

Proof. Since 푃 is binomial, there is no loss of generality in assuming that 퐼 = [0,̂ 푧] for some 푧. By definition of 푀(퐼), we need to show that the number of chains (8.5) where 1̂ = 푧 and 훼(퐶) = 훼 is given by (푛) . By Lemma 5.9.2, 훼 푃 퐹(푛) # of 푥1 of rank 훼1 in 퐼 = . 퐹(훼1)퐹(푛 − 훼1) Similarly 퐹(푛 − 훼1) # of 푥2 of rank 훼2 in [푥1, 푧] = . 퐹(훼2)퐹(푛 − 훼1 − 훼2) So the number of ways to pick 푥1 and 푥2 is 퐹(푛) 퐹(푛 − 훼 ) 퐹(푛) ⋅ 1 = . 퐹(훼1)퐹(푛 − 훼1) 퐹(훼2)퐹(푛 − 훼1 − 훼2) 퐹(훼1)퐹(훼2)퐹(푛 − 훼1 − 훼2) Continuing in this way, we see that the total count is (푛) as desired. □ 훼 푃

8.4. Pattern avoidance and quasisymmetric functions

Given a formal power series 푓(퐱) which is a priori a quasisymmetric function, it can be intriguing to see if it is actually symmetric. And, in that case, one could further ask whether 푓(퐱) has an expansion with nonnegative coefficients in one of the standard bases for Sym. In this section we are going to be concerned with Schur nonnegativity,

that is, seeing if 푓(퐱) = ∑휆 푐휆푠휆 where 푐휆 ≥ 0 for all 휆. Associated with any set Π ⊆ 픖푛 of permutations, we can define the quasisymmetric function

(8.9) 퐹Π = ∑ 퐹Des 휋 ∈ QSym푛 . 휋∈Π It turns out that by taking 푆 to be the set of permutations which satisfy certain avoid- ance conditions, one gets interesting results. These ideas were first investigated by Hamaker, Pawlowski, and Sagan [42], with further results being obtained by Bloom and Sagan [17].

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.4. Pattern avoidance and quasisymmetric functions 277

If Π is any set of permutations, then we let

Av푛(Π) = {휎 ∈ 픖푛 ∣ 휎 avoids every 휋 ∈ Π}.

So Av푛(Π) = ⋂휋∈Π Av푛(휋). To illustrate, call a permutation 휋 ∈ 픖푛 reverse layered if it is of the form (8.10) 휋 = 푚, 푚 + 1, ... , 푛, 푙, 푙 + 1, ... , 푚 − 1, 푘, 푘 + 1, ... for certain 푛≥푚>푙>푘>⋯>0. This term is used since the reversal 휋푟 is of a form usually called layered. The increasing subsequences 푚, 푚 + 1, ... , 푛 and so forth are called its layers and their lengths are the layer lengths. For example 휋 = 789561234 is reverse layered with layers 789, 56, 1234 and corresponding layer lengths 3, 2, 4. Lemma 8.4.1. We have

Av푛(132, 213) = {휋 ∈ 픖푛 ∣ 휋 is reverse layered}.

Proof. Note that the reverse layered permutations 휋 are exactly the ones such that, for all 푎, 푐 ∈ [푛], if 푎 < 푐 and 푎 is before 푐 in 휋, then every element of [푎, 푐] comes between 푎, 푐 in 휋. Now 휋 contains 132 if and only if 휋 contains a subsequence 푎푐푏 with 푏 ∈ [푎, 푐] coming after 푐. Similarly 휋 containing 213 is equivalent to there being a 푏 ∈ [푎, 푐] appearing before 푎. So 휋 avoids both patterns precisely when 휋 is reverse layered. □

To bring in quasisymmetric functions define, for any set Π of permutations, the pattern quasisymmetric function

푄푛(Π) = ∑ 퐹Des 휍. 휍∈Av푛(Π)

Note that 푄푛(Π) is a sum of fundamental quasisymmetric functions for permutations avoiding Π, while 퐹Π is a sum over the elements of Π itself. There are times when ′ ′ 푄푛(Π) being symmetric implies that the same is true for 푄푛(Π ) for certain other Π , similar to what happens with Wilf equivalence. Consider the dihedral group 퐷 of the square as defined in (1.11). The following lemma is easy to prove and so its demonstra- tion is left as an exercise. Lemma 8.4.2. For any 푓 ∈ 퐷, any set of permutations Π, and any 푛 ≥ 0 we have

□ 푓(Av푛(Π)) = Av푛(푓(Π)).

Recall that if 휋 is a permutation, then its complement and reversal (see Exercise 37 of Chapter 1) are denoted 휋푐 and 휋푟, respectively.

Proposition 8.4.3. Suppose that 푄푛(Π) is symmetric and that it has Schur expansion

푄푛(Π) = ∑ 푐휆푠휆. 휆 푐 (a) We have 푄푛(Π ) is symmetric and 푐 푄푛(Π ) = ∑ 푐휆푠휆푡 . 휆

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

278 8. Counting with Quasisymmetric Functions

푟 (b) We have 푄푛(Π ) is symmetric and 푟 푄푛(Π ) = ∑ 푐휆푠휆푡 . 휆 푐푟 (c) We have 푄푛(Π ) is symmetric and 푐푟 푄푛(Π ) = ∑ 푐휆푠휆. 휆

Proof. (a) Applying the definition of 푄푛(Π) and then Theorem 8.2.5 to the hypothesis we have

(8.11) ∑ 퐹Des 휍 = 푄푛(Π) = ∑ 푐휆푠휆 = ∑ 푐휆 ∑ 퐹Des 푃. 휍∈픖푛(Π) 휆 휆 푃∈SYT(휆) In passing from 휎 to 휎푐, every descent is changed into an ascent and vice versa. So we have (8.12) Des 휎푐 = [푛 − 1] − Des 휎. Also note that in any SYT 푃 with 푘 in cell (푖, 푗) and 푘 + 1 in cell (푖′, 푗′), either we have 푖 < 푖′ and 푗 ≥ 푗′ which implies 푘 ∈ Des 푃, or we have 푖 ≥ 푖′ and 푗 < 푗′ which implies 푘 ∉ Des 푃. It follows that (8.13) Des 푃푡 = [푛 − 1] − Des 푃. Using Lemma 8.4.2, then equations (8.12), (8.11), and (8.13), and then finally Theo- rem 8.2.5 in that order yields

푐 푄푛(Π ) = ∑ 퐹[푛−1]−Des 휍 = ∑ 푐휆 ∑ 퐹[푛−1]−Des 푃 = ∑ 푐휆푠휆푡 . 휍∈픖푛(Π) 휆 푃∈SYT(휆) 휆

(b) The proof is similar to part (a) except that one uses Theorem 7.6.3 in place of equation 8.13. The details are left to the reader. (c) This is implied by the first two parts and Lemma 8.4.2. □

We will now use our results to prove that two particular Π ⊆ 픖3 have 푄푛(Π) sym- metric for all 푛 and determine its Schur expansion. For a complete list of all such Π, as well as examples which are not subsets of 픖3, see [42]. A partition 휆 is a hook if 휆 = (푎, 1푏) for 푎 ≥ 1 and 푏 ≥ 0. Let

ℋ푛 = {휆 ⊢ 푛 ∣ 휆 is a hook}. Theorem 8.4.4. We have

푄푛(132, 213) = 푄푛(231, 312) = ∑ 푠휆. 휆∈ℋ푛

Proof. Since {231, 312} = {132, 213}푐 it suffices, by the previous proposition, to prove that 푄푛(132, 213) has the given Schur expansion. First we claim that

푄푛(132, 213) = ∑ 퐹푆. 푆⊆[푛−1]

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.5. The chromatic quasisymmetric function 279

To prove the claim, it suffices to show that there is a bijection between subsets 푆 ⊆ [푛 − 1] and permutations 휎 ∈ Av푛(132, 213) such that Des 휎 = 푆. Recall from Lemma 8.4.1 that these 휎 are exactly the reverse layered permutations. If 훼 −1 = [훼1, ... , 훼푘] is the composition of layer lengths of 휎, then Des 휎 = 푆 where 푆 = 휙 (훼) and 휙 is the bijection (1.8). So it suffices to prove that for any 훼 ⊧ 푛 there is a unique 휎 ∈ Av푛(132, 213) having 훼 as its composition of layer lengths. But this is clear since we must put the 훼1 largest elements of [푛] in the first layer, the next 훼2 largest elements in the second layer, and so on. From the previous paragraph, we will be done if we can show that

∑ 푠휆 = ∑ 퐹푆. 휆∈ℋ푛 푆⊆[푛−1] Because of Theorem 8.2.5, it suffices to show that there is a bijection between subsets 푆 ⊆ [푛 − 1] and hook tableaux 퐻 such that Des 퐻 = 푆. But given any 푆′ ⊆ [2, 푛], there is a unique hook tableau 퐻 whose first column is 푆′ ∪ {1}. And Des 퐻 = 푆′ − 1, the set obtained from 푆′ by subtracting one from each entry. So the bijection 푆′ ↦ 푆 = 푆′ − 1 completes the proof. □

8.5. The chromatic quasisymmetric function

Chromatic quasisymmetric functions were introduced by Shareshian and Wachs [82] in part to study the (ퟑ + ퟏ)-Free Conjecture of Stanley and Stembridge [93]. These qua- sisymmetric functions refine the chromatic symmetric functions from Section 7.8 and have many interesting properties, including a connection with Hessenberg varieties from algebraic geometry. Here, we consider what happens when such a function is symmetric as well as making a connection with reverse 푃-partitions. Throughout this section, 퐺 = (푉, 퐸) will be a graph with 푉 = [푛]. We let

풫퐶(퐺) = {푐∶ 푉 → ℙ ∣ 푐 is a proper coloring of 퐺}.

The set of ascents of 푐 ∈ 풫퐶(퐺) is

Asc 푐 = {푖푗 ∈ 퐸 ∣ 푖 < 푗 and 푐(푖) < 푐(푗)}

with corresponding ascent number asc 푐 = # Asc 푐. Figure 8.2 displays three proper colorings of a path with edges 13 and 32. They have Asc 푐1 = ∅, Asc 푐2 = {13}, and Asc 푐3 = {13, 23} so that asc 푐1 = 0, asc 푐2 = 1, and asc 푐3 = 2, respectively. Given

푐1 = 7 5 7 푐2 = 4 5 7 푐3 = 4 5 4

1 3 2 1 3 2 1 3 2

Figure 8.2. Three proper colorings of a graph with 푉 = [3]

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

280 8. Counting with Quasisymmetric Functions

variable set 퐱 = {푥1, 푥2,...} as well as another parameter 푞, define the chromatic qua- sisymmetric function of 퐺 to be 푋(퐺; 퐱, 푞) = ∑ 퐱푐푞asc 푐 푐∈풫퐶(퐺) where 퐱푐 is defined by (7.36). To illustrate with the graph in Figure 8.2, if we consider the colorings 푐 such that 푐(1) = 푐(2) = 푖 and 푐(3) = 푗 > 푖, then each such map 2 2 2 contributes 푥푖 푥푗푞 to the sum for a total of 푀21푞 . If the inequality between 푖 and 푗 is reversed, then there are no ascents and the contribution is 푀12. Similar considerations apply to the six ways three distinct colors could be assigned to the vertex set, giving a total of 2 푋(퐺; 퐱, 푞) = (푀12 + 2푀13 ) + (2푀13 )푞 + (푀21 + 2푀13 )푞 . Because of the similarity to our notation 푋(퐺; 퐱) for the chromatic symmetric function, we will always include the 푞 when referring to its quasisymmetric cousin. Also, it will be convenient to define 푐 푋푘(퐺; 퐱) = ∑ 퐱 푐∈풫퐶(퐺) asc 푐=푘 so that 푘 푋(퐺; 퐱, 푞) = ∑ 푋푘(퐺; 퐱)푞 . 푘≥0 We first show that the chromatic quasisymmetric function lives up to its nameand is a refinement of the chromatic symmetric function. Proposition 8.5.1. Let 퐺 be a graph with 푉 = [푛]. We have:

(a) 푋푘(퐺; 퐱) ∈ QSym푛 for all 푘 ≥ 0. (b) 푋(퐺; 퐱, 푞) has degree #퐸 as a polynomial in 푞. (c) 푋(퐺; 퐱, 1) = 푋(퐺; 퐱).

푐 Proof. (a) We have 푋푘(퐺, 퐱) is homogeneous of degree 푛 since in 퐱 there is a factor 푥푖 for each vertex 푖 ∈ 푉. To show it is quasisymmetric consider a monomial of 푋푘(퐺, 퐱), say 퐱푐 = 푥훼1 푥훼2 ⋯ 푥훼푙 with 푖 < 푖 < ⋯ < 푖 which arose from a proper coloring 푖1 푖2 푖푙 1 2 푙 푐∶ 푉 → {푖1, 푖2, ... , 푖푙}. Take any other set of positive integers 푗1 < 푗2 < ⋯ < 푗푙 and ′ ′ consider the coloring 푐 = 푓 ∘ 푐 where 푓(푖푚) = 푗푚 for all 푚. Then 푐 is proper since 푓 is a bijection and asc 푐′ = asc 푐 = 푘 since 푓 is an increasing function. It follows that 푋 (퐺; 퐱) contains a corresponding monomial 퐱푐′ = 푥훼1 푥훼2 ⋯ 푥훼푙 with 푗 < 푗 < ⋯ < 푘 푖푗 푖푗 푖푗 1 2 푗푙. Thus 푋푘(퐺; 퐱) is quasisymmetric. (b) Since asc 푐 counts a subset of 퐸 for any coloring 푐, we have that the degree of 푋(퐺; 퐱, 푞) is no greater than #퐸. And the coloring 푐(푖) = 푖 for all 푖 ∈ 푉 has asc 푐 = #퐸, so that is the degree. (c) This follows trivially from the definitions. □

To explore what happens if 푋(퐺; 퐱, 푞) is in fact a symmetric function in 퐱, we need to discuss reversal and palindromicity of sequences. Given a sequence 푎 ∶ 푎0, 푎1,..., 푟 푎푛, its reversal is then 푎 ∶ 푎푛, 푎푛−1, ... , 푎0. We say that 푎 is palindromic with center 푛/2

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

8.5. The chromatic quasisymmetric function 281

if 푎 = 푎푟. For example, the sequence 0, 7, 2, 2, 7, 0 is palindromic with center 5/2. We 푛 푖 also say that the generating function 푓(푞) = ∑푖=0 푎푖푞 has one of these properties if the corresponding sequence does. So, in our example, 7푞 + 2푞2 + 2푞3 + 7푞4 is palindromic with center 5/2. Note that if 푓(푞) is palindromic with center 푛/2, then 푛 need not be the degree of 푓(푞) because of possible initial zeros. Also, since the 푎푖 may contain other variables, we sometimes say that 푓(푞) is palindromic in 푞 to be specific. There is a simple algebraic test for being a palindrome. We leave its proof to the reader.

Lemma 8.5.2. Let 푎 ∶ 푎0, 푎1, ... , 푎푛 be a sequence with generating function 푓(푞). (a) The generating function for 푎푟 is 푞푛푓(1/푞). (b) The given sequence is a palindrome with center 푛/2 if and only if

푓(푞) = 푞푛푓(1/푞). □

Define an involution 휌 on the monomial quasisymmetric functions by letting

휌(푀훼) = 푀훼푟 . Extend 휌 by linearity to QSym[푞], the polynomials in 푞 whose coefficients are qua- sisymmetric functions, where powers of 푞 are treated as scalars. We will see that this involution reverses 푋(퐺; 퐱, 푞) as a polynomial in 푞. To express the resulting power se- ries, define the descent set and descent number of a coloring 푐 to be Des 푐 = {푖푗 ∈ 퐸 ∣ 푖 < 푗 and 푐(푖) > 푐(푗)} and des 푐 = # Des 푐, respectively. Theorem 8.5.3. Let 퐺 be a graph with 푉 = [푛] and #퐸 = 푚. (a) 휌(푋(퐺; 퐱, 푞)) = 푞푚푋(퐺; 퐱, 푞−1). (b) 휌(푋(퐺; 퐱, 푞)) = ∑ 퐱푐푞des 푐. 푐∈풫퐶(퐺)

훼1 훼푙 푘 Proof. (a) For a composition 훼 = [훼1, ... , 훼푙] and 푘 ∈ ℕ, the coefficient of 푥1 ⋯ 푥푙 푞 on the left side of (a) is

훼1 훼푙 푘 훼푙 훼1 푘 [푥1 ⋯ 푥푙 푞 ]휌(푋(퐺; 퐱, 푞)) = [푥1 ⋯ 푥푙 푞 ]푋(퐺; 퐱, 푞) 푐 훼푙 훼1 (8.14) = #{proper 푐 ∣ 퐱 = 푥1 ⋯ 푥푙 and asc 푐 = 푘}. Similarly

훼1 훼푙 푘 푚 −1 훼1 훼푙 푚−푘 [푥1 ⋯ 푥푙 푞 ](푞 푋(퐺; 퐱, 푞 )) = [푥1 ⋯ 푥푙 푞 ]푋(퐺; 퐱, 푞) ′ 푐′ 훼1 훼푙 ′ (8.15) = #{proper 푐 ∣ 퐱 = 푥1 ⋯ 푥푙 and asc 푐 = 푚 − 푘}. So it suffices to find a bijection between the 푐 counted by (8.14) and the 푐′ counted by (8.15). Define 푓 ∶ [푙] → [푙] by 푓(푖) = 푙 − 푖 + 1 for all 푖. Now given 푐 as in (8.14), we let ′ ′ 푐 = 푓 ∘ 푐. We have that 푐 is still proper since 푓 is a bijection. If 푐 sends 훼푖 vertices to color 푖, then 푐′ sends that many vertices to color 푙 − 푖 + 1. So their monomials are related as desired. And since 푓 is order reversing, asc 푐′ = 푚 − asc 푐. Finally, 푓 induces a bijection on colorings since 푐 = 푓 ∘ 푐′.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

282 8. Counting with Quasisymmetric Functions

1

2 2 4 4 3 퐺 = 푃(퐺) =

1 5 3 5

Figure 8.3. A graph and its poset

(b) For any proper coloring we have Des 푐 = 퐸 − Asc 푐 and so des 푐 = 푚 − asc 푐. It follows that 푞푚푋(퐺; 퐱, 푞−1) = ∑ 퐱푐푞des 푐. 푐∈풫퐶(퐺) The result now follows from part (a). □

Similarly to QSym[푞], define Sym[푞] to be the set of polynomials in 푞 whose coef- ficients are symmetric functions. The reader should find it easy to supply the detailsof the demonstration of the next result.

Corollary 8.5.4. Let 퐺 be a graph with 푉 = [푛] and #퐸 = 푚 such that 푋(퐺; 퐱, 푞) ∈ Sym[푞]. (a) 푋(퐺; 퐱, 푞) = ∑ 퐱푐푞des 푐. 푐∈풫퐶(퐺) (b) 푋(퐺; 퐱, 푞) is palindromic in 푞 with center 푚/2. □

We end by making a connection with reverse 푃-partitions. Given any graph 퐺 with 푉 = [푛], there is an associated poset 푃(퐺) on [푛] defined as follows. Call a path 푖1, 푖2, ... , 푖푙 in 퐺 decreasing if 푖1 > 푖2 > ⋯ > 푖푙. Now define 푖 ⊴ 푗 in 푃(퐺) if there is a decreasing path from 푖 to 푗 in 퐺. See Figure 8.3 for an example. We must make sure that 푃(퐺) satisfies the poset axioms.

Lemma 8.5.5. If 퐺 is a graph with 푉 = [푛], then 푃(퐺) is a poset on [푛].

Proof. We have 푖 ⊴ 푖 for 푖 ∈ [푛] because of the path of length zero from 푖 to itself which is decreasing. If 푖 ⊴ 푗 and 푗 ⊴ 푖, then the decreasing path from 푖 to 푗 forces 푖 ≥ 푗. Similarly 푗 ≥ 푖 so that 푖 = 푗. Finally, if 푖 ⊴ 푗 and 푗 ⊴ 푘, then consider the concatenation 푃 of the decreasing paths from 푖 to 푗 and from 푗 to 푘. Now the vertices on 푃 form a decreasing sequence since the two individual paths are decreasing and the terminal vertex of the first equals the initial vertex of the second. This also showsthat 푃 is a path since the decreasing condition makes it impossible to repeat a vertex. So 푖 ⊴ 푘 which finishes transitivity and the proof. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 283

We can now show that the coefficient of 푞0 in 푋(퐺; 퐱, 푞) is the quasisymmetric generating function for reverse 푃(퐺)-partitions. Theorem 8.5.6. If 퐺 is a graph with 푉 = [푛], then

푋0(퐺; 퐱) = rpar(푃(퐺); 퐱).

Proof. It suffices to show that any proper coloring of 퐺 with no ascents is a reverse 푃(퐺)-partition and conversely. So let 푐∶ 푉 → ℙ be a proper coloring with asc 푐 = 0. Recall that it suffices to prove that RPP1 and RPP2 hold for covers 푖 ◁ 푗. But then 푖 > 푗 must be a decreasing path consisting of a single edge. And since this is a proper coloring with no ascents, 푐(푖) < 푐(푗). So both of the desired conditions hold. For the other direction, let 푓 ∶ 푃(퐺) → ℙ be a reverse partition. By definition of 푃(퐺), any cover 푖 ◁ 푗 comes from an edge 푖푗 with 푖 > 푗. By RPP2, we have 푓(푖) < 푓(푗). So 푓 is a proper coloring since the second inequality is strict and, by comparing the last two inequalities, has no ascents. This completes the proof, as well as the book. □

Exercises

(1) Prove that QSym is an algebra; that is, show it is closed under linear combinations and products. (2) Prove Proposition 8.1.2. 푚 (3) (a) Show that 푀훼 is symmetric if and only if 훼 = [푖 ] for some 푖, 푚. In addition, show that 푀[푖푚] = 푚(푖푚). 푛 (b) Show that 퐹훼 is symmetric if and only if 훼 = [푛] or 훼 = [1 ] for some 푛. In addition, show that 퐹푛 = ℎ푛 and 퐹1푛 = 푒푛. (c) Suppose that 훼, 훽 ⊧ 푛 are distinct compositions where 푛 > 3. Show that 푛 퐹훼 + 퐹 훽 is symmetric if and only if {훼, 훽} = {[푛], [1 ]}.

(4) (a) Show that the 푀훼 can be expressed in terms of the 퐹훼 as ℓ(훼) ℓ(훽) 푀훼 = (−1) ∑ (−1) 퐹 훽 훽≤훼 where ℓ(⋅) is the length function in two ways: using Theorem 1.3.3(d) and using Möbius inversion. (b) Show that part (a) implies Theorem 8.1.4. (5) Prove Lemma 8.2.1. (6) Complete the proof of equation (8.4). (7) (a) Prove that the map between reverse compatible and compatible functions given in the text is a well-defined bijection and that if 푓 maps to 푔, then |푓| = |푔| + 푛 where the permutations come from 픖푛. (b) Given a poset 푃 on [푛], find a poset 푄 on [푛] such that there is a bijection 휓∶ RPar 푃 → Par 푄 satisfying |푓| = |휓(푓)| + 푛 for all 푓 ∈ RPar 푃.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

284 8. Counting with Quasisymmetric Functions

(8) (a) Show that if 푄 is the recording tableau for 휋 under the Robinson–Schensted map, then Des 푄 = Des 휋. (b) Show in the proof of Theorem 8.2.5 that Des 휋 ⊆ Des 푄. (9) Prove Lemma 8.2.3. (10) Prove Theorem 8.2.4. (11) Prove Proposition 8.2.6.

(12) (a) Show that if 휋 ∈ 픖푛, then we have the principal specialization 푞maj 휋′ 퐹 (1, 푞, 푞2, ... ) = Des 휋 (1 − 푞)(1 − 푞2) ⋯ (1 − 푞푛) where 휋′ is the reverse complement of 휋 as given by (8.3) (b) Use part (a) and Theorem 8.2.4 to rederive equation (7.23). (13) Prove that if 휎 ∈ 푃(푈) and 휏 ∈ 푃(푉) where 푈, 푉 are disjoint and #푈 = 푚, #푉 = 푛, then 푚 + 푛 ∑ 푞maj 휋 = 푞maj 휍+maj 휏 [ ] . 푚 휋∈휍⧢휏 푞 Conclude that maj is a shuffle compatible function on permutations. Hint: Use equation (7.23). (14) Show that the two maps in the proof of Theorem 8.3.1 are inverses. (15) (a) Let 푃 be a finite, ranked poset with a 1̂ having rk 1̂ = 푛. Associate with any chain 퐶 ∶ 0̂ = 푥0 < 푥1 < ⋯ < 푥푘 = 1̂ the set 푆(퐶) = {rk 푥1, rk 푥2, ... , rk 푥푘−1} ⊆ [푛 − 1]. Show that 휙(푆(퐶)) = 훼(퐶) where 휙 is the map in (1.8). (b) Complete the proof of Theorem 8.3.2.

(16) (a) Show that for the 푛-chain we have 푀(퐶푛) = ℎ푛, the complete homogeneous symmetric function of degree 푛. 휆1 휆푙 (b) Show that if the prime factorization of 푛 is 푛 = 푝1 ⋯ 푝푙 where 휆 = (휆1, ... , 휆푙) is a partition, then for the divisor lattice 푀(퐷푛) = ℎ휆.

(17) Prove that a permutation 휋 ∈ 픖푛 is reverse layered if and only if we have 휋푖+1 ≤ 휋푖 + 1 for 1 ≤ 푖 < 푛. (18) Prove Lemma 8.4.2. (19) Prove part (b) of Proposition 8.4.3. (20) Prove that part (c) of Proposition 8.4.3 is implied by parts (a) and (b) and Lemma 8.4.2. (21) Show that if 푖 ∈ [푛], then 푠(푖,1푛−푖) = ∑ 퐹푆 푆 [푛−1] where the sum is over 푆 ∈ ( 푛−푖 ). (22) (a) The Knuth class corresponding to an SYT 푃 is the set RS 퐾(푃) = {휋 ∣ 휋 ↦ (푃, 푄) for some SYT 푄}.

Prove that if 퐹Π is as defined in (8.9) and sh 푃 = 휆, then

퐹퐾(푃) = 푠휆.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Exercises 285

(b) The Knuth aggregate corresponding to a partition 휆 is the set 퐾(휆) = 퐾(푃). ⨄ 푃∈SYT(휆) Prove that 휆 퐹퐾(휆) = 푓 푠휆. (c) Show that 휆 푄푛(∅) = ∑ 푓 푠휆. 휆⊢푛

(d) If 휄푘 = 12 ... 푘, then show 휆 푄푛(휄푘) = ∑ 푓 푠휆. 휆1<푘

(e) If 훿푘 = 푘 ... 21, then show 휆 푄푛(훿푘) = ∑ 푓 푠휆. 푡 휆1<푘 (23) (a) One can define 푋(퐺; 퐱, 푞) for any graph whose vertices are distinct positive integers just as we did in the case 푉 = [푛]. With this extended definition, show that for a disjoint union of graphs we have 푋(퐺 ⊎ 퐻; 퐱, 푞) = 푋(퐺; 퐱, 푞)푋(퐻; 퐱, 푞). (b) Show that for the empty graph which has 푛 vertices and no edges

푋(∅; 퐱, 푞) = 푒1푛 (퐱). (c) Show that for the complete graph on 푛 vertices

푋(퐾푛; 퐱, 푞) = 푒푛(퐱)[푛]푞!. (24) Prove Lemma 8.5.2. (25) Prove Corollary 8.5.4.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Appendix

Introduction to Representation Theory

This appendix is designed to give just enough information about representation theory to understand some of the material in the body of the text. As such, most proofs are omitted. The reader wanting more information is encouraged to consult the texts of James [45], James and Kerber [46], or Sagan [79].

A.1. Basic notions

Let 퐺 be a finite group and let 푉 be a finite-dimensional vector space over the complex numbers. We say that 푉 is a 퐺-module or that 푉 affords a representation of 퐺 if there is an action of 퐺 on 푉 such that each map 푔 ∶ 푉 → 푉 is linear. Since each such function is bijective, 푔 is in fact an element of the general linear group, GL(푉), of invertible linear transformations on 푉. So to say that 푉 is a 퐺-module is equivalent to saying that there is a homomorphism of groups 휌 ∶ 퐺 → GL(푉). We will often write [푔] for 휌(푔). By definition, 푔 → [푔] being a homomorphism means

(A.1) [푔ℎ] = [푔][ℎ]

for all 푔, ℎ ∈ 퐺 where the product on the left is in 퐺 and on the right we have composi- tion of linear transformations. The matrix for the linear map [푔] in a basis 퐵 of 푉 will be denoted [푔]퐵, where we may drop the subscript if the basis is clear from the context. The dimension of a representation 푉 is just the usual vector space dimension dim 푉. Every group 퐺 has the trivial representation where 푉 = ℂ and 푔푐 = 푐 for all 푔 ∈ 퐺 and 푐 ∈ ℂ. Equivalently [푔] = [1] for all 푔 ∈ 퐺. For a less trivial example, we can turn any set 푋 on which 퐺 acts into a 퐺-module by considering that vector space ℂ푋 generated by 푋 as defined in (7.32). Indeed, since 퐺 acts on 푋 which is a basis for ℂ푋, the action can be extended to ℂ푋 by linearity. Clearly dim ℂ푋 = #푋. To be even more concrete, consider the action of 픖3 on [3] and

287

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

288 A. Introduction to Representation Theory

hence on

ℂ[3] = {푐1ퟏ + 푐2ퟐ + 푐3ퟑ ∣ 푐1, 푐2, 푐3 ∈ ℂ}. Note the distinction between [푛] for 푛 ∈ ℕ and [푔] for 푔 ∈ 퐺. Note also the use of boldface numbers to denote the corresponding vectors. To find the matrix for the permutation (1, 3, 2) in the basis 푋 = [3] we compute the action on each basis vector (1, 3, 2)ퟏ = ퟑ, (1, 3, 2)ퟐ = ퟏ, (1, 3, 2)ퟑ = ퟐ. This corresponds to the matrix 0 1 0 [(1, 3, 2)]푋 = [ 0 0 1 ] . 1 0 0

The reader should find it easy to write out the matrices for the restof 픖3 and verify that (A.1) holds. If 푋 = [푛], then the representation of 픖푛 afforded by ℂ[푛] is called its defining representation. The matrix [휋]푋 for 휋 ∈ 픖푛 is called its corresponding permutation matrix. More generally, if 퐺 acts on 푋, then the 퐺-module ℂ푋 is called a permutation representation of 퐺. We will be particularly concerned with representations of cyclic groups. Suppose that 퐺 is cyclic with #퐺 = 푛, and let 푔 be a generator of 퐺. Let us find the 1-dimensional representations of 퐺. Suppose that we have a homomorphism 휌 ∶ 퐺 → ℂ which sends 푔 to the matrix [푐] for some 푐 ∈ ℂ. The value of 푐 completely determines 휌 since 푔 generates 퐺 and, by (A.1), [푔푖] = [푔]푖 = [푐]푖 = [푐푖] for any 푖 ≥ 0. Furthermore, since 푔푛 = 푒, we must have [푐푛] = [1] and so 푐 must be an 푛th root of unity. It is now easy to check that one obtains 푛 1-dimensional representa- tions of 퐺 by letting 휌(푔푖) = [휔푖] for each 푛th root of unity 휔. Sometimes we have two 퐺-modules where the action of 퐺 is essentially the same. Two 퐺-modules 푉, 푊 are said to be 퐺-isomorphic or 퐺-equivalent, written 푉 ≅ 푊, if there is an isomorphism of vector spaces 휙 ∶ 푉 → 푊 which respects the action of 퐺 in that (A.2) 푔휙(푣) = 휙(푔푣) for all 푔 ∈ 퐺 and 푣 ∈ 푉. Stated in terms of matrices, this definition means that there are bases 퐵 for 푉 and 퐶 = 휙(퐵) for 푊 such that

[푔]퐵 = [푔]퐶 for all 푔 ∈ 퐺. Otherwise 푉 and 푊 are 퐺-inequivalent. We will drop the “퐺-” modifier if the group is clear from the context.

To illustrate, let us revisit the defining representation of 픖3. Consider the subspace 푊 of ℂ[3] generated by the vector ퟏ + ퟐ + ퟑ: (A.3) 푊 = ℂ{ퟏ + ퟐ + ퟑ} = {푐(ퟏ + ퟐ + ퟑ) ∣ 푐 ∈ ℂ}.

So for any 푤 = 푐(ퟏ + ퟐ + ퟑ) and 휋 ∈ 픖3, the fact that 휋 is linear and permutes the three vectors ퟏ, ퟐ, ퟑ yields 휋(푤) = 휋(푐(ퟏ + ퟐ + ퟑ)) = 푐휋(ퟏ + ퟐ + ퟑ) = 푐(ퟏ + ퟐ + ퟑ) = 푤.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

A.1. Basic notions 289

It follows that [휋] = [1] for all 휋 and so 푊 is equivalent to the trivial representation of 픖3. More generally, given any permutation representation ℂ푋 for a group 퐺, the

subspace 푊 = ℂ{푤} where 푤 = ∑푥∈푋 푥 is equivalent to the trivial representation of 퐺. On the other hand, the 푛 representations of a cyclic group 퐺 = ⟨푔⟩ of order 푛 which we derived above are all inequivalent, for suppose we consider two representations such that (A.4) 휌(푔) = [휔] and 휌′(푔) = [휔′] for two 푛th roots of unity 휔, 휔′. Any vector space isomorphism 휙 ∶ ℂ → ℂ is multipli- cation by some 푐 ∈ ℂ − {0}. So, considering (A.2) with 푣 = 1, 푔휙(푣) = 휔′휙(푣) = 휔′(푐1) = 푐휔′ since 푔 is acting on the representation 휌′ in the range of 휙. On the other hand 휙(푔푣) = 휙(휔푣) = 푐(휔1) = 푐휔 for now 푔 is acting by 휌 in the domain of 휙. Setting these two evaluations equal forces 휔 = 휔′. If 푉, 푊 are 퐺-modules, then it is easy to see that 푉 ⊕ 푊 is also, where the action is defined by 푔(푣 + 푤) = 푔푣 + 푔푤 for 푔 ∈ 퐺, 푣 ∈ 푉, 푤 ∈ 푊. It turns out that all 퐺-modules can be constructed this way from certain building blocks which are called the irreducible modules. If 푉 is a 퐺- module, then a submodule of 푉 is a subspace 푊 ⊆ 푉 which is itself a 퐺-module in that 푔푤 ∈ 푊 for all 푔 ∈ 퐺 and 푤 ∈ 푊. Any 퐺-module 푉 has the trivial submodules consist- ing of the zero subspace and 푉 itself. All other submodules are nontrivial. Note that the usage of the word “trivial” here is different from what we have defined as the trivial representation. Any group 퐺 has a unique trivial representation which has dimension 1. On the other hand, any 퐺-module 푉 has two (not necessarily distinct) submodules which are considered trivial. Call 푉 reducible if it has nontrivial submodules and call it irreducible otherwise. Clearly every 1-dimensional 퐺-module is irreducible. On the other hand the defin- ing module ℂ[푛] for 픖푛 is not irreducible for 푛 ≥ 2 because of the submodule (A.5) 푊 = ℂ{ퟏ + ퟐ + ⋯ + 퐧}. Of course, 푊 is irreducible since it has dimension 1. Consider the orthogonal comple- ⟂ ⟂ ment 푊 using the inner product on ℂ[푛] given by 퐢 ⋅ 퐣 = 훿푖,푗. Now ℂ[푛] = 푊 ⊕ 푊 ⟂ as vector spaces. And one can show that 푊 is an irreducible 픖푛-module. It turns out that one can write any 퐺-module as a direct sum of irreducibles. We would also like to know how many irreducible modules a group can have up to isomorphism. Theorem A.1.1. Let 퐺 be a finite group and consider the 퐺-modules which are finite- dimensional vector spaces over ℂ. (a) The number of pairwise inequivalent irreducible 퐺-modules is finite and equals the number of conjugacy classes of 퐺. (b) (Maschke’s Theorem) Every 퐺-module can be written as a direct sum of irre- ducible 퐺-modules. □

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

290 A. Introduction to Representation Theory

We will use the notation

(A.6) 푉 ≅ 푚 푉 (푖) ⨁ 푖 푖

(푖) to indicate that 푉 is isomorphic to the direct sum of 푚푖 copies of 푉 as 푖 varies. If 퐺 is a cyclic group of order 푛, then, because 퐺 is abelian, 퐺 has 푛 conjugacy classes each consisting of a single element. So from part (a) of the previous theorem, we know that the 푛 inequivalent irreducible 1-dimensional representations we have found for 퐺 are a complete list. Also, because of part (b), any representation of 퐺 can be written 푉 = 푉 (1) ⊕ ⋯ ⊕ 푉 (푘) for irreducible submodules 푉 (1), ... , 푉 (푘) all of dimension 1. (푖) Taking 푣푖 to be a basis of 푉 for 1 ≤ 푖 ≤ 푘, we see that in the basis 퐵 = {푣1, ... , 푣푘} the matrix [푔]퐵 will be diagonal for any 푔 ∈ 퐺. And because the diagonal elements come from the 1-dimensional representations we found, they are all 푛th roots of unity. We record this result for future reference.

Corollary A.1.2. If 퐺 is a cyclic group of order 푛 and 푉 is a 퐺-module, then there is a basis for 푉 which simultaneously diagonalizes [푔] for all 푔 ∈ 퐺. Furthermore, the diagonal elements are 푛th roots of unity. □

In the symmetric group 픖푛, a conjugacy class is just all permutations of a given cycle type 휆 ⊢ 푛. So the irreducible representations of 픖푛 are also indexed by partitions of 푛. If 푉 휆 is the irreducible module corresponding to 휆, then one can show that

(A.7) dim 푉 휆 = 푓휆.

So, for example, 푉 (푛) is the trivial representation and dim 푉 (푛) = 1 which is the number of SYT of shape (푛). As another illustration, consider the irreducible module 푊 ⟂ where 푊 is the submodule (A.5) of ℂ[푛]. Then

dim 푊 ⟂ = dim ℂ[푛] − dim 푊 = 푛 − 1.

In fact, 푊 ⟂ ≅ 푉 (푛−1,1) and it is easy to see that 푓(푛−1,1) = 푛 − 1. It would be nice if there was a natural representation of 퐺 which contained all the irreducible representations. This is the case for the regular representation. Any group 퐺 acts on the set 푋 = 퐺 by left multiplication

(A.8) 푔(ℎ) = 푔ℎ

where on the left we have the action of 푔 on ℎ and on the right the product of 푔 and ℎ in the group. The corresponding 퐺-module ℂ퐺 is called the (left) regular representation of 퐺. To illustrate, consider 퐺 = 픖3 and the ordered basis 퐵 = {퐞, (ퟏ, ퟐ), (ퟏ, ퟑ), (ퟐ, ퟑ), (ퟏ, ퟐ, ퟑ), (ퟏ, ퟑ, ퟐ)}

of the regular representation. For 푔 = (1, 3, 2) we have, remembering that we compose permutations right to left,

(1, 3, 2)퐞 = (ퟏ, ퟑ, ퟐ), (1, 3, 2)(ퟏ, ퟐ) = (ퟐ, ퟑ), (1, 3, 2)(ퟏ, ퟑ) = (ퟏ, ퟐ), (1, 3, 2)(ퟐ, ퟑ) = (ퟏ, ퟑ), (1, 3, 2)(ퟏ, ퟐ, ퟑ) = 퐞, (1, 3, 2)(ퟏ, ퟑ, ퟐ) = (ퟏ, ퟐ, ퟑ),

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

A.1. Basic notions 291

with corresponding matrix 0 0 0 0 1 0 ⎡ ⎤ ⎢ 0 0 1 0 0 0 ⎥ ⎢ 0 0 0 1 0 0 ⎥ [(1, 3, 2)] = ⎢ ⎥ . 퐵 ⎢ 0 1 0 0 0 0 ⎥ ⎢ ⎥ ⎢ 0 0 0 0 0 1 ⎥ ⎣ 1 0 0 0 0 0 ⎦ We point out again (as we did in Chapter 7) that the sum of squares formula in equa- tion (7.27) is the special case of (A.9) below where 퐺 = 픖푛. Theorem A.1.3. Let 퐺 be a finite group and let 푉 (1), ... , 푉 (푘) be all its pairwise inequiv- alent irreducible representations. Then the regular representation satisfies

푘 ℂ퐺 ≅ 푑 푉 (푖) ⨁ 푖 푖=1 (푖) where 푑푖 = dim 푉 for all 푖. In addition 푘 2 (A.9) #퐺 = ∑ 푑푖 . 푖=1

Proof. For a proof of the first statement, see the demonstration of Proposition 1.10.1 in [79]. The second follows directly from the first by taking dimensions on both sides. □

It turns out that a lot of information about a representation can be gleaned from a very simple function. If 푉 is a 퐺-module, then its character is the map 휒 ∶ 퐺 → ℂ given by 휒(푔) = tr[푔] where tr is the trace function. Note that since the trace of a linear transformation is independent of the basis in which it is computed, 휒(푔) is well-defined. Note also that for any representation 푉 of dimension 푑 we must have

휒(푒) = tr 퐼푑 = 푑

where 퐼푑 is the 푑 × 푑 identity matrix. We also have that 휒 is a class function in that 휒(푔) = 휒(ℎ) if 푔, ℎ are in the same conjugacy class to 퐺. This is because we must have 푔 = 푘ℎ푘−1 for some 푘 ∈ 퐺. So, by (A.1) and the fact that the trace is invariant under conjugation, 휒(푔) = 휒(푘ℎ푘−1) = tr[푘ℎ푘−1] = tr([푘][ℎ][푘]−1) = tr[ℎ] = 휒(ℎ).

It is also true that equivalent modules have the same character, for suppose 휙 ∶ 푉 → 푊 is an isomorphism of 퐺-modules with characters 휒푉 and 휒푊 , respectively. Let 퐵 and 퐶 = 휙(퐵) be bases for 푉 and 푊 and suppose that 푇 is the matrix of 휙 with respect to the bases 퐵 and 퐶. Since (A.2) holds for all ∈ 푉 we must have [푔]퐶푇 = 푇[푔]퐵 for all 푔 ∈ 퐺. Since 푇 is invertible we have 푊 −1 푉 휒 (푔) = tr[푔]퐶 = tr(푇[푔]퐵푇 ) = tr[푔]퐵 = 휒 (푔).

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

292 A. Introduction to Representation Theory

Since this holds for all 푔 ∈ 퐺, it follows that 휒푊 = 휒푉 . The surprising thing is that the converse is also true. Theorem A.1.4. Let 푉 and 푊 be 퐺-modules with characters 휒푉 and 휒푊 , respectively. We have 푉 ≅ 푊 if and only if 휒푉 = 휒푊 . □

This theorem can provide a quick way of checking whether two representations are equivalent or not. For example, if we are considering a 1-dimensional representation, then 휒(푔) is just the single entry of the matrix [푔]. So if we have two representations of a cyclic group as in (A.4) for distinct 휔, 휔′, then we must have 휌 and 휌′ inequivalent since 휒(푔) = 휔 ≠ 휔′ = 휒′(푔).

Exercises

(1) Show that 푉 is a 퐺-module if and only if there is a map 휌 ∶ 퐺 → GL(푉) which is a homomorphism of groups. (2) Show that if 퐺 = ⟨푔⟩ is cyclic of order 푛 and 휔 is an 푛th root of unity, then the map 휌 ∶ 퐺 → GL(ℂ) defined by 휌(푔푖) = [휔푖] is a well-defined representation of 퐺. (3) Let 푉, 푊 be 퐺-modules. Show that 푉, 푊 are equivalent if and only if they have bases 퐵, 퐶, respectively, such that

[푔]퐵 = [푔]퐶 for all 푔 ∈ 퐺. (4) Prove that if 퐺 acts on 푋, then the submodule of ℂ푋 defined by 푉 = ⟨푣⟩ where

푣 = ∑푥∈푋 푥 is equivalent to the trivial representation. (5) Show that if 푉, 푊 are 퐺-modules, then so is 푉 ⊕ 푊 with the action 푔(푣 + 푤) = 푔푣 + 푔푤 for 푔 ∈ 퐺, 푣 ∈ 푉, 푤 ∈ 푊. (6) (a) Show that if 푉 is a 퐺-module, then the zero subspace and 푉 itself are submod- ules. (b) Consider the submodule 푊 = ℂ{ퟏ+ퟐ +ퟑ} of the defining representation ℂ[ퟑ] ⟂ of 픖3. Show that 푊 is a submodule of ℂ[ퟑ] and that it is irreducible. (7) Show that (A.8) defines a group action.

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Bibliography

[1] K. Appel and W. Haken, Every planar map is four colorable. I. Discharging, Illinois J. Math. 21 (1977), no. 3, 429–490. MR543792 [2] K. Appel, W. Haken, and J. Koch, Every planar map is four colorable. II. Reducibility, Illinois J. Math. 21 (1977), no. 3, 491–567. MR543793 [3] Eric Babson and Einar Steingrímsson, Generalized permutation patterns and a classification of the Mahonian statistics, Sém. Lothar. Combin. 44 (2000), Art. B44b, 18. MR1758852 [4] Duff Baker-Jarvis and Bruce E. Sagan, Bijective proofs of shuffle compatibility results, Adv. in Appl. Math. 113 (2020), 101973, 29, DOI 10.1016/j.aam.2019.101973. MR4032316 [5] Matthias Beck and Raman Sanyal, Combinatorial reciprocity theorems: An invitation to enumerative geometric com- binatorics, Graduate Studies in Mathematics, vol. 195, American Mathematical Society, Providence, RI, 2018. MR3839322 [6] Edward A. Bender and Donald E. Knuth, Enumeration of plane partitions, J. Combinatorial Theory Ser. A 13 (1972), 40–54, DOI 10.1016/0097-3165(72)90007-6. MR299574 [7] Carolina Benedetti and Nantel Bergeron, The antipode of linearized Hopf monoids, Algebr. Comb. 2 (2019), no. 5, 903–935, DOI 10.5802/alco.53. MR4023571 [8] Carolina Benedetti, Joshua Hallam, and John Machacek, Combinatorial Hopf algebras of simplicial complexes, SIAM J. Discrete Math. 30 (2016), no. 3, 1737–1757, DOI 10.1137/15M1038281. MR3543152 [9] Carolina Benedetti and Bruce E. Sagan, Antipodes and involutions, J. Combin. Theory Ser. A 148 (2017), 275–315, DOI 10.1016/j.jcta.2016.12.005. MR3603322 [10] Arthur T. Benjamin and Jennifer J. Quinn, Proofs that really count: The art of combinatorial proof, The Dolciani Math- ematical Expositions, vol. 27, Mathematical Association of America, Washington, DC, 2003. MR1997773 [11] F. Bergeron, G. Labelle, and P. Leroux, Combinatorial species and tree-like structures, translated from the 1994 French original by Margaret Readdy, with a foreword by Gian-Carlo Rota, Encyclopedia of Mathematics and its Applications, vol. 67, Cambridge University Press, Cambridge, 1998. MR1629341 [12] Nantel Bergeron and Cesar Ceballos, A Hopf algebra of subword complexes, Adv. Math. 305 (2017), 1163–1201, DOI 10.1016/j.aim.2016.10.007. MR3570156 [13] George D. Birkhoff, A determinant formula for the number of ways of coloring a map, Ann. of Math. (2) 14 (1912/13), no. 1-4, 42–46, DOI 10.2307/1967597. MR1502436 [14] George D. Birkhoff, On the combination of subalgebras, Proc. Camb. Phil. Soc. 29 (1933), 441–464. [15] A. Björner, Topological methods, Handbook of combinatorics, Vol. 2, Elsevier Sci. B. V., Amsterdam, 1995, pp. 1819– 1872. MR1373690 [16] Andreas Blass and Bruce Eli Sagan, Bijective proofs of two broken circuit theorems, J. Graph Theory 10 (1986), no. 1, 15–21, DOI 10.1002/jgt.3190100104. MR830053 [17] Jonathan S. Bloom and Bruce E. Sagan, Revisiting pattern avoidance and quasisymmetric functions, Ann. Comb. 24 (2020), no. 2, 337–361, DOI 10.1007/s00026-020-00492-6. MR4110402 [18] Miklós Bóna, Combinatorics of permutations, 2nd ed., with a foreword by Richard Stanley, and its Applications (Boca Raton), CRC Press, Boca Raton, FL, 2012. MR2919720

293

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

294 Bibliography

[19] Petter Brändén, Unimodality, log-concavity, real-rootedness and beyond, Handbook of enumerative combinatorics, Dis- crete Math. Appl. (Boca Raton), CRC Press, Boca Raton, FL, 2015, pp. 437–483. MR3409348 [20] Francesco Brenti, Log-concave and unimodal sequences in algebra, combinatorics, and geometry: an update, Jerusalem combinatorics ’93, Contemp. Math., vol. 178, Amer. Math. Soc., Providence, RI, 1994, pp. 71–89, DOI 10.1090/conm/178/01893. MR1310575 [21] W. Burnside, Theory of groups of finite order, 2nd ed, Dover Publications, Inc., New York, 1955. MR0069818 [22] David M. Burton, The history of mathematics: An introduction, 2nd ed., W. C. Brown Publishers, Dubuque, IA, 1991. MR1223776 [23] Peter Doubilet, Gian-Carlo Rota, and Richard Stanley, On the foundations of combinatorial theory. VI. The idea of generating function, Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Univ. California Press, Berkeley, Calif., 1972, pp. 267–318. MR0403987 [24] Richard Ehrenborg, On posets and Hopf algebras, Adv. Math. 119 (1996), no. 1, 1–25, DOI 10.1006/aima.1996.0026. MR1383883 [25] Philippe Flajolet and Robert Sedgewick, Analytic combinatorics, Cambridge University Press, Cambridge, 2009. MR2483235 [26] Dominique Foata, Distributions eulériennes et mahoniennes sur le groupe des permutations (French), with a comment by Richard P. Stanley, Higher combinatorics (Proc. NATO Advanced Study Inst., Berlin, 1976), NATO Adv. Study Inst. Ser., Ser. C: Math. Phys. Sci., vol. 31, Reidel, Dordrecht-Boston, Mass., 1977, pp. 27–49. MR519777 [27] Sergey Fomin, Duality of graded graphs, J. Algebraic Combin. 3 (1994), no. 4, 357–404, DOI 10.1023/A:1022412010826. MR1293822 [28] Sergey Fomin, Schensted algorithms for dual graded graphs, J. Algebraic Combin. 4 (1995), no. 1, 5–45, DOI 10.1023/A:1022404807578. MR1314558 [29] J. S. Frame, G. de B. Robinson, and R. M. Thrall, The hook graphs of the symmetric groups, Canad. J. Math. 6 (1954), 316–324, DOI 10.4153/cjm-1954-030-1. MR62127 [30] A. M. Garsia and S. C. Milne, A Rogers-Ramanujan bijection, J. Combin. Theory Ser. A 31 (1981), no. 3, 289–339, DOI 10.1016/0097-3165(81)90062-5. MR635372 [31] Ira Gessel and Gérard Viennot, Binomial determinants, paths, and hook length formulae, Adv. in Math. 58 (1985), no. 3, 300–321, DOI 10.1016/0001-8708(85)90121-5. MR815360 [32] Ira M. Gessel, Multipartite 푃-partitions and inner products of skew Schur functions, Combinatorics and alge- bra (Boulder, Colo., 1983), Contemp. Math., vol. 34, Amer. Math. Soc., Providence, RI, 1984, pp. 289–317, DOI 10.1090/conm/034/777705. MR777705 [33] Ira M. Gessel and Yan Zhuang, Shuffle-compatible permutation statistics, Adv. Math. 332 (2018), 85–141, DOI 10.1016/j.aim.2018.05.003. MR3810249 [34] Curtis Greene, An extension of Schensted’s theorem, Advances in Math. 14 (1974), 254–265, DOI 10.1016/0001- 8708(74)90031-0. MR354395 [35] Curtis Greene, Albert Nijenhuis, and Herbert S. Wilf, A probabilistic proof of a formula for the number of Young tableaux of a given shape, Adv. in Math. 31 (1979), no. 1, 104–109, DOI 10.1016/0001-8708(79)90023-9. MR521470 [36] Darij Grinberg, Shuffle-compatible permutation statistics II: the exterior peakset, Electron. J. Combin. 25 (2018), no. 4, Paper No. 4.17, 61. MR3874283 [37] P. Hall, The Eulerian functions of a group, Quart. J. Math. 7 (1936), no. 1, 134–151. [38] Joshua Hallam, Applications of quotient posets, Preprint arXiv:1411.3022. [39] Joshua Hallam, Applications of quotient posets, Discrete Math. 340 (2017), no. 4, 800–810, DOI 10.1016/j.disc.2016.11.019. MR3603561 [40] Joshua Hallam, Jeremy L. Martin, and Bruce E. Sagan, Increasing spanning forests in graphs and simplicial complexes, European J. Combin. 76 (2019), 178–198, DOI 10.1016/j.ejc.2018.09.011. MR3886522 [41] Joshua Hallam and Bruce Sagan, Factoring the characteristic polynomial of a lattice, J. Combin. Theory Ser. A 136 (2015), 39–63, DOI 10.1016/j.jcta.2015.06.006. MR3383266 [42] Zachary Hamaker, Brendan Pawlowski, and Bruce Sagan, Pattern avoidance and quasisymmetric functions, Preprint arXiv:1810.11372. [43] A. P. Hillman and R. M. Grassl, Reverse plane partitions and tableau hook numbers, J. Combinatorial Theory Ser. A 21 (1976), no. 2, 216–221, DOI 10.1016/0097-3165(76)90065-0. MR414387 [44] C. G. J. Jacobi, De functionibus alternantibus earumque divisione per productum e differentiis elementorum conflatum (German), J. Reine Angew. Math. 22 (1841), 360–371, DOI 10.1515/crll.1841.22.360. MR1578283 [45] G. D. James, The representation theory of the symmetric groups, Lecture Notes in Mathematics, vol. 682, Springer, Berlin, 1978. MR513828 [46] Gordon James and Adalbert Kerber, The representation theory of the symmetric group, with a foreword by P. M. Cohn, with an introduction by Gilbert de B. Robinson, Encyclopedia of Mathematics and its Applications, vol. 16, Addison- Wesley Publishing Co., Reading, Mass., 1981. MR644144 [47] André Joyal, Une théorie combinatoire des séries formelles (French, with English summary), Adv. in Math. 42 (1981), no. 1, 1–82, DOI 10.1016/0001-8708(81)90052-9. MR633783

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Bibliography 295

[48] Sergey Kitaev, Patterns in permutations and words, with a foreword by Jeffrey B. Remmel, Monographs in Theoretical Computer Science. An EATCS Series, Springer, Heidelberg, 2011. MR3012380 [49] Donald E. Knuth, Permutations, matrices, and generalized Young tableaux, Pacific J. Math. 34 (1970), 709–727. MR272654 [50] Donald E. Knuth, Subspaces, subsets, and partitions, J. Combinatorial Theory Ser. A 10 (1971), 178–180, DOI 10.1016/0097-3165(71)90022-7. MR270933 [51] Donald E. Knuth, The art of computer programming. Vol. 1, Fundamental algorithms, third edition [of MR0286317], Addison-Wesley, Reading, MA, 1997. MR3077152 [52] Donald E. Knuth, The art of computer programming. Vol. 2, Seminumerical algorithms, third edition [of MR0286318], Addison-Wesley, Reading, MA, 1998. MR3077153 [53] Donald E. Knuth, The art of computer programming. Vol. 3, Sorting and searching, second edition [of MR0445948], Addison-Wesley, Reading, MA, 1998. MR3077154 [54] C. Krattenthaler, Advanced determinant calculus, The Andrews Festschrift (Maratea, 1998), Sém. Lothar. Combin. 42 (1999), Art. B42q, 67. MR1701596 [55] C. Krattenthaler, Advanced determinant calculus: a complement, Linear Algebra Appl. 411 (2005), 68–166, DOI 10.1016/j.laa.2005.06.042. MR2178686 [56] Joseph Louis Lagrange, Demonstration d’un théorème nouveau concernant les nombres premiers, Nouveaux Mémoires de l’Académie Royale des Sciences et Belles-Lettres (Berlin) 2 (1771), 125–137. [57] Bernt Lindström, On the vector representations of induced matroids, Bull. London Math. Soc. 5 (1973), 85–90, DOI 10.1112/blms/5.1.85. MR335313 [58] Dudley E. Littlewood, The theory of group characters and matrix representations of groups, reprint of the second (1950) edition, AMS Chelsea Publishing, Providence, RI, 2006. MR2213154 [59] E. Lucas, Sur les congruences des nombres eulériens et les coefficients différentiels des functions trigonométriques suivant un module premier (French), Bull. Soc. Math. France 6 (1878), 49–54. MR1503769 [60] I. G. Macdonald, Symmetric functions and Hall polynomials, 2nd ed., with contribution by A. V. Zelevinsky and a foreword by Richard Stanley, reprint of the 2008 paperback edition [ MR1354144], Oxford Classic Texts in the Physical Sciences, The Clarendon Press, Oxford University Press, New York, 2015. MR3443860 [61] P. A. MacMahon, The Indices of Permutations and the Derivation Therefrom of Functions of a Single Variable Associated with the Permutations of any Assemblage of Objects, Amer. J. Math. 35 (1913), no. 3, 281–322, DOI 10.2307/2370312. MR1506186 [62] F. Mertens, Über eine zahlentheoretische Funktion, Sem. ber. Kais. Akad. Wiss. Wien 106 (1897), no. 106, 761–830. [63] Sri Gopal Mohanty, Lattice path counting and applications, Probability and Mathematical Statistics, Academic Press [Harcourt Brace Jovanovich, Publishers], New York-London-Toronto, Ont., 1979. MR554084 [64] J. W. Moon, Counting labelled trees, from lectures delivered to the Twelfth Biennial Seminar of the Canadian Mathe- matical Congress (Vancouver), vol. 1969, Canadian Mathematical Congress, Montreal, Que., 1970. MR0274333 [65] John J. O’Connor and Edmund F. Robertson, Abu Ali al-Hasan ibn al-Haytham, MacTutor History of Mathematics archive, University of St Andrews. http://www-history.mcs.st-andrews.ac.uk/Biographies/Al-Haytham.html. [66] A. M. Odlyzko and H. J. J. te Riele, Disproof of the Mertens conjecture, J. Reine Angew. Math. 357 (1985), 138–160, DOI 10.1515/crll.1985.357.138. MR783538 [67] Oystein Ore, Number theory and its history, reprint of the 1948 original, with a supplement, Dover Publications, Inc., New York, 1988. MR939614 [68] Julius Petersen, Beviser for Wilsons og Fermats Theoremer, Tidssk. f. Math. (3) 2 (1872), 64–65. [69] T. Kyle Petersen, Eulerian numbers, with a foreword by Richard Stanley, Birkhäuser Advanced Texts: Basler Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks], Birkhäuser/Springer, New York, 2015. MR3408615 [70] G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen (German), Acta Math. 68 (1937), no. 1, 145–254, DOI 10.1007/BF02546665. MR1577579 [71] J. Howard Redfield, The Theory of Group-Reduced Distributions, Amer. J. Math. 49 (1927), no. 3, 433–455, DOI 10.2307/2370675. MR1506633 [72] V. Reiner, D. Stanton, and D. White, The cyclic sieving phenomenon, J. Combin. Theory Ser. A 108 (2004), no. 1, 17–50, DOI 10.1016/j.jcta.2004.04.009. MR2087303 [73] J. B. Remmel, Bijective proofs of formulae for the number of standard Young tableaux, Linear and Multilinear Algebra 11 (1982), no. 1, 45–100, DOI 10.1080/03081088208817432. MR647727 [74] Bernhard Riemann, Gesammelte mathematische Werke, wissenschaftlicher Nachlass und Nachträge (German), based on the edition by Heinrich Weber and Richard Dedekind, edited and with a preface by Raghavan Narasimhan, BSB B. G. Teubner Verlagsgesellschaft, Leipzig; Springer-Verlag, Berlin, 1990. MR1066697 [75] G. de B. Robinson, On the Representations of the Symmetric Group, Amer. J. Math. 60 (1938), no. 3, 745–760, DOI 10.2307/2371609. MR1507943 [76] Gian-Carlo Rota, On the foundations of combinatorial theory. I. Theory of Möbius functions, Z. Wahrscheinlichkeits- theorie und Verw. Gebiete 2 (1964), 340–368 (1964), DOI 10.1007/BF00531932. MR174487

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

296 Bibliography

[77] Bruce E. Sagan, Congruences via abelian groups, J. Number Theory 20 (1985), no. 2, 210–237, DOI 10.1016/0022- 314X(85)90041-1. MR790783 [78] Bruce E. Sagan, Why the characteristic polynomial factors, Bull. Amer. Math. Soc. (N.S.) 36 (1999), no. 2, 113–133, DOI 10.1090/S0273-0979-99-00775-2. MR1659875 [79] Bruce E. Sagan, The symmetric group: Representations, combinatorial algorithms, and symmetric functions, 2nd ed., Graduate Texts in Mathematics, vol. 203, Springer-Verlag, New York, 2001. MR1824028 [80] Bruce E. Sagan, The cyclic sieving phenomenon: a survey, Surveys in combinatorics 2011, London Math. Soc. Lecture Note Ser., vol. 392, Cambridge Univ. Press, Cambridge, 2011, pp. 183–233. MR2866734 [81] C. Schensted, Longest increasing and decreasing subsequences, Canadian J. Math. 13 (1961), 179–191, DOI 10.4153/CJM-1961-015-3. MR121305 [82] John Shareshian and Michelle L. Wachs, Chromatic quasisymmetric functions, Adv. Math. 295 (2016), 497–551, DOI 10.1016/j.aim.2015.12.018. MR3488041 [83] R. P. Stanley, Supersolvable lattices, Algebra Universalis 2 (1972), 197–217, DOI 10.1007/BF02945028. MR309815 [84] Richard P. Stanley, Ordered structures and partitions, Memoirs of the American Mathematical Society, No. 119, Amer- ican Mathematical Society, Providence, R.I., 1972. MR0332509 [85] Richard P. Stanley, Acyclic orientations of graphs, Discrete Math. 5 (1973), 171–178, DOI 10.1016/0012-365X(73)90108- 8. MR317988 [86] Richard P. Stanley, Combinatorial reciprocity theorems, Advances in Math. 14 (1974), 194–253, DOI 10.1016/0001- 8708(74)90030-9. MR411982 [87] Richard P. Stanley, Binomial posets, Möbius inversion, and permutation enumeration, J. Combinatorial Theory Ser. A 20 (1976), no. 3, 336–356, DOI 10.1016/0097-3165(76)90028-5. MR409206 [88] Richard P. Stanley, Differential posets, J. Amer. Math. Soc. 1 (1988), no. 4, 919–961, DOI 10.2307/1990995. MR941434 [89] Richard P. Stanley, Log-concave and unimodal sequences in algebra, combinatorics, and geometry, Graph theory and its applications: East and West (Jinan, 1986), Ann. New York Acad. Sci., vol. 576, New York Acad. Sci., New York, 1989, pp. 500–535, DOI 10.1111/j.1749-6632.1989.tb16434.x. MR1110850 [90] Richard P. Stanley, Variations on differential posets, Invariant theory and tableaux (Minneapolis, MN, 1988), IMA Vol. Math. Appl., vol. 19, Springer, New York, 1990, pp. 145–165. MR1035494 [91] Richard P. Stanley, A symmetric function generalization of the chromatic polynomial of a graph, Adv. Math. 111 (1995), no. 1, 166–194, DOI 10.1006/aima.1995.1020. MR1317387 [92] Richard P. Stanley, Catalan numbers, Cambridge University Press, New York, 2015. MR3467982 [93] Richard P. Stanley and John R. Stembridge, On immanants of Jacobi-Trudi matrices and permutations with restricted position, J. Combin. Theory Ser. A 62 (1993), no. 2, 261–279, DOI 10.1016/0097-3165(93)90048-D. MR1207737 [94] N. Trudi, Intorno un determinante piu generale di quello che suol dirsi determinante delle radici di una equazione, ed alle funzioni simmetriche complete di queste radici, Rend. Accad. Sci. Fis. Mat. Napoli 3 (1864), 121–134. [95] J. H. van Lint, Introduction to coding theory, 3rd ed., Graduate Texts in Mathematics, vol. 86, Springer-Verlag, Berlin, 1999. MR1664228 [96] Michelle L. Wachs, Poset topology: tools and applications, Geometric combinatorics, IAS/Park City Math. Ser., vol. 13, Amer. Math. Soc., Providence, RI, 2007, pp. 497–615. MR2383132 [97] Edward Waring, Meditationes algebraicæ, translated from the Latin, edited and with a foreword by Dennis Weeks, with an appendix by Franz X. Mayer, translated from the German by Weeks, American Mathematical Society, Providence, RI, 1991. MR1146921 [98] Louis Weisner, Abstract theory of inversion of finite series, Trans. Amer. Math. Soc. 38 (1935), no. 3, 474–484, DOI 10.2307/1989808. MR1501822 [99] Hassler Whitney, A logical expansion in mathematics, Bull. Amer. Math. Soc. 38 (1932), no. 8, 572–579, DOI 10.1090/S0002-9904-1932-05460-X. MR1562461 [100] Herbert S. Wilf, Sieve equivalence in generalized partition theory, J. Combin. Theory Ser. A 34 (1983), no. 1, 80–89, DOI 10.1016/0097-3165(83)90042-0. MR685214 [101] Herbert S. Wilf, generatingfunctionology, 2nd ed., Academic Press, Inc., Boston, MA, 1994. MR1277813

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Index

푘-permutation, 4 poset, 178 푘-word, 5 Theorem, 72 푞-analogue, 75 푞-analogue, 78 binomial coefficient, 77 negative, 88 Binomial Theorem, 78 Birkhoff, George factorial, 75 distributive lattice, 151 natural number, 75 Björner, Anders topological methods, 176 al-Haytham, Ibn Block, 10 Wilson’s Congruence, 206 Bloom, Jonathan Algebra quasisymmetric pattern avoidance, 276 formal power series, 81 Broken circuit, 102 Algorithm Burnside, William Greene–Nijenhuis–Wilf, 231 Lemma, 192 Hillman–Grassl, 234 Burton Prüfer, 24 Fermat’s Little Theorem, 206 Robinson–Schensted, 241 Robinson–Schensted–Knuth, 244 Catalan number, 26 Appel, Kenneth Cauchy–Binet Theorem, 63 Four Color Theorem, 100 Coefficient Ascent, 76 binomial, 7 Avoid a permutation, 28 Coexcedance, 123 Composition, 16 Babson, Eric dominance, 31 Mahonian statistics, 76 expansion, 274 Baker-Jarvis, Duff part, 16 shuffle compatibility, 274 set, 47 Beck, Matthias merge, 48 combinatorial reciprocity, 107 split, 47 Bell number, 10 shuffle sum, 274 Bender, Edward weak, 31 Schur function, 226 Condition Big oh notation, 182 Rank, 171 Bijective proof, 6 Summation, 171 Binomial Congruence, 205 coefficient, 7 Fermat, 206 푞-analogue, 77 Lucas, 207

297

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

298 Index

Möbius function, 208 Fermat, Pierre Wilson, 207 Little Theorem, 206 Conjugate Ferrers diagram, 14 Young diagram, 14 Fibonacci number, 2 Copy of a permutation, 28 Fixed point, 11 Cycle Flajolet, Philippe decomposition, 11 analytic combinatorics, 81 digraph, 21 Foata, Dominique graph, 19 Mahonian statistic, 76 permutation, 11 major index, 76 Cyclic sieving phenomenon (CSP), 210, 256 Forest, 22 Formal power series, 81 Degree algebra, 81 graph vertex, 20 convergence Deletion-Contraction Lemma, 101 product, 85 Derangement, 43 sequence, 83 Descent, 75 sum, 84 Diagram degree, 219 Ferrers, 14 bounded, 219 Hasse, 140 homogeneous, 219 permutation, 29 minimum degree, 84 Young, see also Young, diagram quasisymmetric, 267 Digraph symmetric, 220 acyclic, 56 Formula arc, 21 exponential, 132 multiple, 22 Hook, 231 cycle, 21 Four Color Theorem, 100 functional, 22 Frame, J. Sutherland in-degree, 21 Hook Formula, 230 labeled, 21 Functional digraph, 22 loop, 22 orientation, 103 Garsia, Adriano out-degree, 21 Involution Principle, 49 path, 21 Garsia–Milne Involution Principle, 49 walk, 21 Gaussian polynomial, 77 Dihedral group, 29 Generating function Directed Graph, see also Digraph Eulerian, 178 Distribution, 74 exponential, 117 Doubilet, Peter method of undetermined coefficients, 99 binomial poset, 178 ordinary, 81 Dyck path of semilength 푛, 26 rational, 96 weight, 86 Ehrenborg, Richard Generating polynomial, 71 chains in posts, 274 Gessel, Ira Euler, Leonhard lattice paths, 58 Fermat’s Little Theorem, 206 quasisymmetric function, 267 number, 120 shuffle compatibility, 274 partition theorem, 51, 92 Graph Eulerian acyclic, 22 generating function, 178 bond, 167 number, 121 bond lattice, 167 polynomial, 122 broken circuit, 102 statistic, 121 chromatic Excedance, 122 number, 99 Exponential Formula, 132 polynomial, 100 quasisymmetric function, 280 Falling factorial, 4 symmetric function, 253

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Index 299

clique, 19 cyclic sieving, 210, 256 coloring, 99 fixed point set, 192 complete, 19 induced component, 22 on functions, 193 connected, 22 on permutations, 197 cycle, 19 on subsets, 197 cycle of length ℓ, 19 orbit, 190 digraph, see also Digraph weight, 201 edge, 18 Redfield–Pólya Theorem, 201 multiple, 22 representation, see also Representation endpoints, 19 stabilizer, 191 forest, 22 weight increasing, 104 of an element, 200 Four Color Theorem, 100 of an orbit, 201 labeled, 18 leaf, 23 Haken, Wolfgang loop, 22 Four Color Theorem, 100 matrix Hall, Phillip adjacency, 60 Möbius function, 175 incidence, 61 Hallam, Joshua Laplacian, 62 increasing spanning forest, 105 no broken circuit (NBC), 102 quotient poset, 168, 174 orientation, 103 Hamaker, Zachary path, 19 quasisymmetric pattern avoidance, 276 proper coloring, 99 Handshaking Lemma, 21 ascent, 279 Hillman, Abraham subgraph, 19 reverse plane partition, 233 induced, 166 In-degree, 21 induced by a coloring, 167 Increasing forest, 104 spanning, 59 Integer partition, 13 tree, 22 containment, 78 trivial, 19 distinct parts, 51 unlabeled, 20 hook, 230 generating function, 202 hooklength, 230 vertex, 18 length, 15 degree, 20 lexicographic order, 222 walk of length ℓ, 19 multiplicity notation, 14 Grassl, Richard odd parts, 51 reverse plane partition, 233 part, 13 Greene, Curtis rectangule, 79 and Schensted’s Theorem, 248 skew, 78 Hook Formula, 231 Inversion, 74 Grinberg, Darij Involution, 44 shuffle compatibility, 274 sign reversing, 44 Group dihedral, 29 Jacobi, Carl general linear, 287 determinants, 227 order, 210 James, Gordon representation, see also Representation representation theory, 287 symmetric, 11 Joyal, André Group action, 189 species, 124 Burnside’s Lemma, 192 congruence, 205 Kerber, Adalbert cycle index representation theory, 287 of a group, 197 Knuth, Donald of an element, 197 Robinson–Schensted–Knuth algorithm, 244 cycle indicator, 197 Schur function, 226

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

300 Index

subspace, 81 on a set, 9 Kronecker delta, 7 poset, 159 Nijenhuis, Albert Hook Formula, 231 Labeled structure, 124 No broken circuit (NBC), 102 disjoint union, 128 Number equivalence, 128 Bell, 10 exponential formula, 132 Catalan, 26 partition, 131 derangement, 43 product, 128 Euler, 120 Sum and Product Rules, 129 Eulerian, 121 Lagrange, Joseph Fibonacci, 2 Wilson’s Congruence, 206 Kostka, 225 Lattice path, 25 Stirling of the first kind (signed), 13 Dyck of semilength 푛, 26 Stirling of the first kind (signless), 12 east step, 26 Stirling of the second kind, 10 endpoints, 25 length, 25 O’Connor, Edmund north step, 26 Wilson’s Congruence, 206 northeast, 26 Odlyzko, Andrew Reflection Principle, 53 Mertens Conjecture, 183 step, 26 Operator Leaf, 23 definite summation, 162 Lemma down, 249 Burnside, 192 forward difference, 162 Deletion-Contraction, 101 up, 249 handshaking, 21 Order isomorphic permutations, 29 Lindström–Gessel–Viennot, 58, 59 Orientation, 103 Length Out-degree, 21 integer partition, 15 lattice path, 25 Part permutation, 4 composition, 16 Lindström, Bernt integer partition, 13 lattice paths, 58 Partition Lindström–Gessel–Viennot Lemma, 58, 59 integer, see also Integer partition Linear recursion with constant coefficients, 97 set, see also Set partition Littlewood, John Pascal’s triangle, 7 Cauchy Identity, 242 Path Log-concave sequence, 55 digraph, 21 Lucas, Édouard graph, 19 congruence, 207 lattice, see also Lattice path Pattern in a permutation, 28 Macdonald, Ian and quasisymmetric functions, 276 symmetric function, 219 Pawlowski, Brendan MacMahon, Percy quasisymmetric pattern avoidance, 276 major index, 76 Permutation, 4 Major index, 76 alternating, 120 Martin, Jeremy ascent, 76 increasing spanning forest, 105 avoid, 28 Matrix-Tree Theorem, 63 coexcedance, 123 Mertens, Franz compatible function, 235 conjecture, 183 copy, 28 function, 183 cycle decomposition, 11 Milne, Stephen cycle of length ℓ, 11 Involution Principle, 49 derangement, 43 Multiset, 8 descent, 75 cardinality, 8 diagram, 29

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Index 301

excedance, 122 disjoint union, 145 fixed point, 11 distributive laws, 151 insertion tableau, 242 divisor lattice, 140 inversion, 74 dual, 142 involution, 44 greatest lower bound, 148 Garsia–Milne Principle, 49 Hall’s Theorem, 175 length, 4 Hasse diagram, 140 major index, 76 incidence algebra, 158 matrix, 288 delta function, 159 of a set, 4 Möbius function, 157 one-line notation, 11 reduced, 179, 182 order isomorphic, 29 zeta function, 159, 182 pattern, 28 incomparable elements, 139 recording tableau, 242 isomorphic, 144 representation, 288 isomorphism, 144 reverse compatible function, 270 join, 149 reverse layered, 277 lattice, 149 layer lengths, 277 bond, 167 shuffle, 273 distributive, 151 shuffle compatibility, 274 Fundamental Theorem of Finite standardization, 28 Distributive Lattices, 153 subsequence join irreducible, 152 decreasing, 244 least upper bound, 149 increasing, 244 linear extension, 160 two-line notation, 11 locally finite, 147 Wilf equivalence, 29 lower bound, 148 trivial, 30 lower-order ideal, 143 Petersen, Julius generated by a set, 143 Fermat’s and Wilson’s congruences, 206 maximal element, 142 PIE, 42, 162 maximum element, 142 Pólya, George meet, 148 group action, 200 minimal element, 142 Poset minimum element, 142 푃-partition, 237 Möbius function Fundamental Lemma, 238 and isomorphism, 155 atom, 169 and products, 155 binomial, 178 congruence, 208 Boolean algebra, 140 difference calculus, 162 chain, 139, 147 inclusion and exclusion, 162 maximal, 147 Inversion Theorem, 160 projection, 275 number theory, 163 saturated, 147 one variable, 154 underlying, 275 two variable, 157 characteristic polynomial, 164 operator claw, 169 down, 249 closed interval, 143 up, 249 coatom, 174 order complex, 176 comparable elements, 139 order-preserving map, 144 composition ordinal sum, 146 dominance, 31 partition lattice, 140 lattice, 140 pattern poset, 140 covering relation, 140 product, 146 crosscut, 177 quotient, 169 Crosscut Theorem, 177 homogeneous, 170 definition, 139 Rank Condition, 171 differential, 251 Summation Condition, 171 Dirichlet, 182 support number, 173

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

302 Index

support set, 173 Hypothesis, 182 ranked, 147 zeta function, 182 corank, 164 Robertson, John rank of a poset, 147 Wilson’s Congruence, 206 rank of an element, 147 Robinson, Gilbert de Beauregard rank set, 147 Hook Formula, 230 reverse 푃-partition, 271 Robinson–Schensted algorithm, 240 Fundamental Lemma, 271 Rooted set, 89 subposet, 142 Rota, Gian-Carlo subspace lattice, 140 binomial poset, 178 total order, 139 Crosscut Theorem, 177 upper bound, 149 twelvefold way, 17 upper-order ideal, 143 generated by a set, 143 Sagan, Bruce Weisner’s Theorem, 176 congruence, 205 Whitney numbers cyclic sieving phenomenon, 209 of the first kind, 156 increasing spanning forest, 105 of the second kind, 156 quasisymmetric pattern avoidance, 276 Young’s lattice, 140 quotient poset, 168 Power series, see also Formal power series representation theory, 287 Prüfer algorithm, 24 shuffle compatibility, 274 Principle Sanyal, Raman Inclusion and Exclusion, 42, 162 combinatorial reciprocity, 107 Reflection, 53 Schensted, Craige Proof increasing-decreasing subsequences, 244 bijective, 6 Robinson–Schensted algorithm, 240 Sedgewick, Robert Quasisymmetric function, 267 analytic combinatorics, 81 algebra, 268 Sequence fundamental, 269 log-concave, 55, 229 monomial, 268 palindromic, 281 pattern avoidance, 276 center, 281 reversal, 280 Redfield, J. Howard unimodal, 54 group action, 200 Set partition, 10 Reflection Principle, 53 block, 10 Reiner, Victor Shareshian, John cyclic sieving phenomenon, 209, 257 chromatic quasisymmetric function, 279 Representation, 287 Signed set, 44 character, 291 Skew partition, 78 defining, 288 Species, 124 dimension, 287 Spencer, Joel equivalent, 288 twelvefold way, 17 inequivalent, 288 Standardization of a permutation, 28 irreducible, 289 Stanley, Richard P. isomorphic, 288 푃-partition, 235 module, 287 (ퟑ + ퟏ)-Free Conjecture, 279 permutation, 288 acyclic orientations, 103 reducible, 289 binomial poset, 178 regular, 290 chromatic symmetric function, 253 submodule, 289 combinatorial reciprocity, 106 nontrivial, 289 differential poset, 248 trivial, 289 quasisymmetric function, 267 trivial, 287 Stanton, Dennis Reverse cyclic sieving phenomenon, 209, 257 plane partition, 233 Statistic, 74 Riemann, Bernhard Eulerian, 121

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Index 303

Mahonian, 76 Tiling, 3 Steingrímsson, Einar Transpose Mahonian statistics, 76 Young diagram, 14 Stembridge, John Tree, 22 (ퟑ + ퟏ)-Free Conjecture, 279 Triangle Stirling Pascal, 7 number of the first kind (signed, 13 Stirling of the first kind, 12 number of the first kind (signless), 12 Stirling of the second kind, 10 number of the second kind, 10 Trudi, Nicolò triangle of the first kind, 12 determinants, 227 triangle of the second kind, 10 Twelvefold way, 17 Sum and Product Rules for labeled structures, 129 Unimodal sequence, 54 for sets, 1 Viennot, Xavier for weight-generating functions, 87 lattice path, 58 Symmetric group, 11 Wachs, Michelle Symmetric function, 220 chromatic quasisymmetric function, 279 algebra, 220 topological methods, 176 chromatic, 253 Waring, Edward complete homogeneous, 221 Wilson’s Congruence, 206 elementary, 221 Weak composition, 31 and roots of polynomial, 224 weight-generating function, 86 Fundamental Theorem, 223 Weighting of a set, 86 monomial, 220 Weisner, Louis power sum, 221 Möbius function, 176 Schur, 225 White, Dennis cyclic sieving phenomenon, 209, 257 te Riele, Hermann Wilf, Herbert Mertens Conjecture, 183 equivalence, 29 Theorem generatingfunctionology, 71 푞-Binomial, 78 Hook Formula, 231 Binomial, 72 Word, 5 negative, 88 Cauchy Identity, 242 Young Cauchy–Binet, 63 diagram, 14 Crosscut, 177 cell, 225 Difference Calculus, 162 conjugate or transpose, 14 Euler, 51 outer corner, 241 Fermat’s Little, 206 lattice, 140 Finite Distributive Lattices, 153 tableau Four Color, 100 content, 225 Fundamental Theorem of Symmetric descent set, 271 Functions, 223 Hook Formula, 231 Hall, 175 insertion, 241 Hook Formula, 231 partial, 240 Inclusion and Exclusion, 42, 162 placement, 241 Jacobi–Trudi, 227 Robinson–Schensted algorithm, 241 Maschke, 289 Robinson–Schensted–Knuth algorithm, Matrix-Tree, 63 244 Möbius Inversion, 160 Schensted’s Theorem, 247 Redfield–Pólya, 201 semistandard, 225 Schensted, 247 shape, 224 Weisner, 176 standard, 224 Wilson’s Congruence, 207 Thrall, Robert Zeta function Hook Formula, 230 poset, 159

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

304 Index

Riemann, 182 Zhuang, Yan shuffle compatibility, 274

Not for print or electronic distribution. This file may not be posted electronically. Prepublication copy provided to Dr Bruce Sagan. Please give confirmation to AMS by September 21, 2020.

Selected Published Titles in This Series

210 Bruce E. Sagan, Combinatorics: The Art of Counting, 2020 207 Dmitry N. Kozlov, Organized Collapse: An Introduction to Discrete Morse Theory, 2020 206 Ben Andrews, Bennett Chow, Christine Guenther, and Mat Langford, Extrinsic Geometric Flows, 2020 205 Mikhail Shubin, Invitation to Partial Differential Equations, 2020 204 Sarah J. Witherspoon, Hochschild Cohomology for Algebras, 2019 203 Dimitris Koukoulopoulos, The Distribution of Prime Numbers, 2019 202 Michael E. Taylor, Introduction to Complex Analysis, 2019 201 Dan A. Lee, Geometric Relativity, 2019 200 Semyon Dyatlov and Maciej Zworski, Mathematical Theory of Scattering Resonances, 2019 199 Weinan E, Tiejun Li, and Eric Vanden-Eijnden, Applied Stochastic Analysis, 2019 198 Robert L. Benedetto, Dynamics in One Non-Archimedean Variable, 2019 197 Walter Craig, A Course on Partial Differential Equations, 2018 196 Martin Stynes and David Stynes, Convection-Diffusion Problems, 2018 195 Matthias Beck and Raman Sanyal, Combinatorial Reciprocity Theorems, 2018 194 Seth Sullivant, Algebraic Statistics, 2018 193 Martin Lorenz, A Tour of Representation Theory, 2018 192 Tai-Peng Tsai, Lectures on Navier-Stokes Equations, 2018 191 Theo B¨uhler and Dietmar A. Salamon, Functional Analysis, 2018 190 Xiang-dong Hou, Lectures on Finite Fields, 2018 189 I. Martin Isaacs, Characters of Solvable Groups, 2018 188 Steven Dale Cutkosky, Introduction to Algebraic Geometry, 2018 187 John Douglas Moore, Introduction to Global Analysis, 2017 186 Bjorn Poonen, Rational Points on Varieties, 2017 185 Douglas J. LaFountain and William W. Menasco, Braid Foliations in Low-Dimensional Topology, 2017 184 Harm Derksen and Jerzy Weyman, An Introduction to Quiver Representations, 2017 183 Timothy J. Ford, Separable Algebras, 2017 182 Guido Schneider and Hannes Uecker, Nonlinear PDEs, 2017 181 Giovanni Leoni, A First Course in Sobolev Spaces, Second Edition, 2017 180 Joseph J. Rotman, Advanced Modern Algebra: Third Edition, Part 2, 2017 179 Henri Cohen and Fredrik Str¨omberg, Modular Forms, 2017 178 Jeanne N. Clelland, From Frenet to Cartan: The Method of Moving Frames, 2017 177 Jacques Sauloy, Differential Galois Theory through Riemann-Hilbert Correspondence, 2016 176 Adam Clay and Dale Rolfsen, Ordered Groups and Topology, 2016 175 Thomas A. Ivey and Joseph M. Landsberg, Cartan for Beginners: Differential Geometry via Moving Frames and Exterior Differential Systems, Second Edition, 2016 174 Alexander Kirillov Jr., Quiver Representations and Quiver Varieties, 2016 173 Lan Wen, Differentiable Dynamical Systems, 2016 172 Jinho Baik, Percy Deift, and Toufic Suidan, Combinatorics and Random Matrix Theory, 2016 171 Qing Han, Nonlinear Elliptic Equations of the Second Order, 2016 170 Donald Yau, Colored Operads, 2016

For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/gsmseries/.

Not for print or electronic distribution. This file may not be posted electronically.