
LECTURES 5–6.

SYMMETRIC POLYNOMIALS IN SEVERAL VARIABLES

S. YAKOVENKO

1. Notation

In the situation where we have three variables, the traditional labels for them are the letters x, y, z. However, we will use a different notation, stressing the full democracy of all variables. Throughout this lecture, $n$ will always stand for the number of different variables, and the variables themselves will be labeled $x_1, \dots, x_n$. The letter $x$ will be used to denote the tuple of all variables, so that we will deal with polynomials denoted by $p(x)$, $q(x)$ from the ring $\mathbb{R}[x]$, which is a shorthand for $\mathbb{R}[x_1, \dots, x_n]$, etc.

A monomial is a product of the form $x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$, where $\alpha_i \ge 0$ are the corresponding nonnegative powers; $\alpha_i = 0$ means that $x_i$ is absent from the above product. Following the above logic, we will denote by $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{Z}_+^n$ the tuple of these powers and abbreviate the product above as $x^\alpha$.

If $\alpha, \beta \in \mathbb{Z}_+^n$ are two such tuples (vectors), then obviously the sum $\alpha + \beta$ makes perfect sense as a (componentwise) vector sum. The rule of multiplication of powers means that
$$x^\alpha \cdot x^\beta = x^{\alpha+\beta} \qquad \forall \alpha, \beta \in \mathbb{Z}_+^n. \tag{1}$$
The vector $\alpha$ can also appear as a (multiple) index, so that a general polynomial in $n$ variables can be written as a (finite) sum $p(x) = \sum_\alpha c_\alpha x^\alpha$. To stress the fact that the monomial $x^\alpha$ has degree $d$, we will use the convention that the norm ("length") of $\alpha$ is the sum of its entries:

$$\alpha = (\alpha_1, \dots, \alpha_n) \implies |\alpha| = \alpha_1 + \cdots + \alpha_n.$$
Note that $|\alpha| \ge 0$, and $|\alpha| = 0$ if and only if $\alpha = (0, \dots, 0)$. Using these conventions, we can write a general polynomial in $n$ variables of degree $\le d$ with real coefficients as the sum
$$p(x) = \sum_{|\alpha| \le d} c_\alpha x^\alpha, \qquad c_\alpha \in \mathbb{R}.$$
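The multi-index conventions above can be illustrated by a short sketch (the helper name `mono` is ad hoc, not part of the notes):

```python
# A minimal sketch: a monomial is stored as its exponent tuple (multi-index).
# mono(x, alpha) evaluates x^alpha = x1^a1 * ... * xn^an at a numeric point x.
def mono(x, alpha):
    r = 1
    for xi, ai in zip(x, alpha):
        r *= xi ** ai
    return r

x = (2, 3, 5)
a, b = (1, 0, 2), (0, 2, 1)
# The multiplication rule (1): x^a * x^b = x^(a+b), with componentwise sum a+b.
apb = tuple(ai + bi for ai, bi in zip(a, b))
assert mono(x, a) * mono(x, b) == mono(x, apb)
# |alpha| is the degree of the monomial x^alpha.
assert sum(a) == 3
```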

2. Symmetric polynomials

Definition 1. A polynomial $p \in \mathbb{R}[x]$ is called symmetric if it remains unchanged by any permutation of its variables.

To make this intuitive definition slightly more formal, consider the full permutation group on $n$ symbols: by definition, each permutation is a one-to-one transformation of the set $\{1, 2, \dots, n\}$. Transformations of this type form a group with the operation of composition, denoted usually by $S_n$ (this group is often also called the symmetric group). A permutation $g \in S_n$ acts on the symbol $i \in \{1, 2, \dots, n\}$ and maps it into $g(i) \in \{1, 2, \dots, n\}$, so that $g(i) \ne g(j)$ if $i \ne j$. The standard way to write a permutation explicitly is by a two-row matrix with columns containing $i$ and $g(i)$ for all labels. For instance, the cyclic permutation on $n$ symbols is the permutation
$$\begin{pmatrix} 1 & 2 & \cdots & n-1 & n \\ 2 & 3 & \cdots & n & 1 \end{pmatrix}.$$
A permutation of symbols can be extended to act also on any objects labeled by these symbols. For instance, a permutation $g$ can be extended as a transformation $g_* : \mathbb{R}^n \to \mathbb{R}^n$ of the space into itself, sending the tuple (point) $(x_1, \dots, x_n)$ into the point $g_* x = (x_{g(1)}, x_{g(2)}, \dots, x_{g(n)})$. In the same way the symmetric group acts on monomials by the following rules:
$$g^*(x_i) = x_{g(i)}, \quad g^*(x_i x_j) = x_{g(i)} x_{g(j)}, \quad g^*(x_i x_j x_k) = x_{g(i)} x_{g(j)} x_{g(k)}, \quad \dots, \qquad i, j, k, \dots = 1, \dots, n.$$
In the abbreviated form this action can be described as follows:
$$g^*(x^\alpha) = (g_* x)^\alpha, \qquad g \in S_n, \ \alpha \in \mathbb{Z}_+^n.$$
Finally, having explained how a "permutation of the variables" $g$ acts on monomials, we extend the action to an arbitrary polynomial by linearity:

$$p = \sum_\alpha c_\alpha x^\alpha \implies g^* p = \sum_\alpha c_\alpha (g_* x)^\alpha, \qquad g \in S_n.$$

Lemma 2. The action $g^*$ respects both the addition/subtraction and the multiplication operations on polynomials: for any two polynomials $p, q \in \mathbb{R}[x]$, we have $g^*(p \pm q) = g^* p \pm g^* q$ and $g^*(p \cdot q) = (g^* p) \cdot (g^* q)$. The action preserves the degree: $\deg g^* p = \deg p$.

Proof. Obvious. □

Now we can give a formal definition of a symmetric polynomial.

Definition 3. A polynomial $p \in \mathbb{R}[x]$ is called symmetric if $g^* p = p$ for all permutations $g \in S_n$.

Example 4. The polynomials $\sigma_1(x) = x_1 + \cdots + x_n$ and $\sigma_n(x) = x_1 x_2 \cdots x_n$ are obviously symmetric. Besides that, all sums of powers $s_k(x) = \sum_{i=1}^n x_i^k$, $k = 1, 2, \dots$, are obviously symmetric.
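Definition 3 can be probed numerically: a polynomial (written as a Python function) is symmetric only if its value at a point is unchanged under every permutation of the arguments. This is a sketch, and checking a single point is of course only a necessary condition, not a proof of symmetry:

```python
from itertools import permutations

# Test invariance of f under all permutations of the coordinates of `point`.
# Passing at one point does NOT prove symmetry; failing disproves it.
def is_symmetric(f, point):
    return all(f(*(point[i] for i in g)) == f(*point)
               for g in permutations(range(len(point))))

pt = (2.0, 3.0, 5.0)
assert is_symmetric(lambda x, y, z: x + y + z, pt)            # sigma_1
assert is_symmetric(lambda x, y, z: x * y * z, pt)            # sigma_n
assert is_symmetric(lambda x, y, z: x**2 + y**2 + z**2, pt)   # s_2
assert not is_symmetric(lambda x, y, z: x + 2 * y + z, pt)    # not symmetric
```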

Our global task is to describe all symmetric polynomials in $n \ge 3$ variables and find for them a representation analogous to that in two variables.

3. Construction of symmetric polynomials

After constructing the action of the permutation group $G = S_n$ on the polynomials, we have a situation already familiar from Lecture 3. For any degree $d$ the polynomials of degree $\le d$ have the structure of a (finite-dimensional) linear space over $\mathbb{R}$, and the group acts on this space by linear transformations.

Problem 5. Compute the dimension of the space of homogeneous polynomials of degree exactly $d$ in $n$ independent variables. Compute the dimension of the space of polynomials of degree $\le d$.
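The counts asked for in Problem 5 can be checked by brute-force enumeration of multi-indices against the standard "stars and bars" closed forms $\binom{n+d-1}{d}$ and $\binom{n+d}{d}$ (stated here as a sketch; deriving them is the content of the problem):

```python
from itertools import product
from math import comb

# Count monomials x^alpha in n variables by enumerating exponent tuples.
def count_exact(n, d):     # homogeneous of degree exactly d
    return sum(1 for a in product(range(d + 1), repeat=n) if sum(a) == d)

def count_upto(n, d):      # degree <= d
    return sum(1 for a in product(range(d + 1), repeat=n) if sum(a) <= d)

for n in (1, 2, 3, 4):
    for d in (0, 1, 2, 3, 5):
        assert count_exact(n, d) == comb(n + d - 1, d)
        assert count_upto(n, d) == comb(n + d, d)
```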

The symmetric group $S_n$ is finite: it contains $n!$ distinct permutations. Therefore we can apply the construction of averaging (symmetrization) over the group to construct sufficiently many symmetric polynomials. We recall it.

Definition 6. The symmetrization of a polynomial $p \in \mathbb{R}[x]$ is the polynomial
$$Ep = \frac{1}{|S_n|} \sum_{g \in S_n} g^* p. \tag{2}$$

Remark 7. We use here the notation $Ep$, traditional for Probability Theory, where it denotes the expectation of a random variable. This is not accidental: the polynomial $Ep$ can be considered as the expectation of the polynomial $p$ under a "random permutation" of its variables (all permutations being considered equiprobable). The result is a new polynomial. The operator $E$ is a projection: $E(Ep) = Ep$ for any polynomial $p$.

Theorem 8. The polynomial $p$ is symmetric if and only if $Ep = p$.

Proof. For the sake of better understanding, we repeat here the proof from Lecture 3.

If $p$ is symmetric, then $g^* p = p$ for any $g \in S_n$, so each term in the sum (2) equals $p$, hence $Ep = p$.

Conversely, if $Ep = p$, then, applying an arbitrary $h \in S_n$ to the sum and using the fact that $h$ preserves all operations, we conclude that
$$h^*(Ep) = h^*\Bigl(\frac{1}{|S_n|} \sum_g g^* p\Bigr) = \frac{1}{|S_n|} \sum_g h^*(g^* p) = \frac{1}{|S_n|} \sum_g (hg)^* p.$$

Yet for any fixed $h$, if $g$ runs over the entire group $S_n$, then so does the composition $hg \in S_n$. Thus the application of $h^*$ only permutes the terms of the sum, which does not affect the result. Therefore $h^* p = h^*(Ep) = Ep = p$, i.e., $p$ is symmetric. □

This result gives a way for bulk production of symmetric polynomials:

(1) For any $i = 1, \dots, n$, $E(x_i) = \frac{1}{n!} \sum_{j=1}^n (n-1)!\, x_j = \frac{1}{n} \sum_j x_j = \frac{1}{n} \sigma_1(x)$.

(2) In the same way, $E(x_i^k) = \frac{1}{n} \sum_j x_j^k = \frac{1}{n} s_k(x)$ is (up to the factor $\frac{1}{n}$) the sum of powers.

(3) $E(x_1 \cdots x_{n-1}) = \bigl(\prod_{i=1}^n x_i\bigr) \cdot \frac{1}{n} \sum_i \frac{1}{x_i} = \frac{1}{n} \sigma_n(x)\, s_{-1}(x) = \frac{1}{n} \sigma_{n-1}(x)$.

(4) $E(x_1 x_2) = \frac{2}{n(n-1)} \sum_{i<j} x_i x_j = \frac{2}{n(n-1)} \sigma_2(x)$.

To systematize such computations, it is convenient to encode monomials by Young diagrams (tableaux): the monomial $x^\alpha$ is represented by a diagram whose columns have heights equal to the exponents arranged in non-increasing order $\alpha_1 \ge \alpha_2 \ge \cdots$; columns of height zero are not plotted.

Example 9. The Young diagrams (the pictures are not reproduced in this text version)

correspond to the monomials
$$x_1^3, \qquad x_1 x_2 x_3 x_4, \qquad x_1^4 x_2^3 x_3^2, \qquad x_1^3 x_2^2 (x_3 x_4 x_5)\, x_6 x_7.$$
The total area of the diagram is the degree $d$ of the monomial; the width (number of columns) is less than or equal to the number of variables $n$. The result of application of the averaging operator, $E(x^\alpha)$, depends only on the Young diagram of the monomial. The numeric coefficient can be obtained by solving the corresponding combinatorial problem.

Example 10. The result of averaging a square-free monomial $x_{i_1} \cdots x_{i_d}$ of degree $d \le n$ (a diagram of $d$ columns of height one) is a sum of $\binom{n}{d}$ monomials with the reciprocal coefficient $1/\binom{n}{d}$.

Example 11. The result of averaging $E(x_1^2 x_2)$ is the sum of $2\binom{n}{2} = n(n-1)$ monomials of the form $x_i^2 x_j$, with the reciprocal coefficient. Indeed, the label for the quadratic term can be chosen in $n$ ways, and the remaining label in $(n-1)$ ways. The terms $x_i^2 x_j$ and $x_j^2 x_i$ are different, so there is no need to divide by 2.
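The averaging operator (2) is easy to implement exactly on monomials; the following sketch (with polynomials stored as dicts mapping exponent tuples to coefficients) confirms the count and the coefficient in Example 11, as well as the projection property $E(Ep) = Ep$:

```python
from fractions import Fraction
from itertools import permutations

# Symmetrization (2): average g*p over all n! permutations g, exactly.
def symmetrize(p, n):
    perms = list(permutations(range(n)))
    q = {}
    for g in perms:
        for alpha, c in p.items():
            beta = [0] * n
            for i, ai in enumerate(alpha):   # g*: exponent of x_i moves to x_{g(i)}
                beta[g[i]] = ai
            beta = tuple(beta)
            q[beta] = q.get(beta, 0) + Fraction(c) / len(perms)
    return q

n = 4
Ep = symmetrize({(2, 1, 0, 0): 1}, n)        # E(x1^2 x2)
# Example 11: n(n-1) distinct monomials x_i^2 x_j, coefficient 1/(n(n-1)) each.
assert len(Ep) == n * (n - 1)
assert all(c == Fraction(1, n * (n - 1)) for c in Ep.values())
# E is a projection (Theorem 8): applying it twice changes nothing.
assert symmetrize(Ep, n) == Ep
```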

Problem 12. Assume that the Young diagram consists of $\nu_1 \ge 1$ columns of maximal height, followed by $\nu_2 \ge 1$ columns of the second biggest height, followed by $\nu_3$ columns of the third biggest height, etc., followed by $\nu_0 \ge 0$ (invisible) zero-height columns. Prove that the number of monomials after averaging will be
$$\frac{n!}{\nu_1!\, \nu_2! \cdots \nu_0!}.$$
In the preceding Example we have $\nu_1 = \nu_2 = 1$, $\nu_0 = n-2$, so the answer is indeed $\frac{n!}{1!\,1!\,(n-2)!} = n(n-1)$.
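The count claimed in Problem 12 can be checked by brute force: the distinct monomials produced by averaging $x^\alpha$ are exactly the distinct rearrangements of the exponent tuple $\alpha$ (a sketch, with zero-height columns included in the multiplicities):

```python
from itertools import permutations
from math import factorial
from collections import Counter

# Number of distinct monomials in E(x^alpha): distinct rearrangements of alpha.
def distinct_monomials(alpha):
    n = len(alpha)
    return len({tuple(alpha[g[i]] for i in range(n))
                for g in permutations(range(n))})

# Predicted count n! / (nu_1! nu_2! ... nu_0!), where the nu's are the
# multiplicities of equal column heights (zeros counted as nu_0).
def predicted(alpha):
    denom = 1
    for m in Counter(alpha).values():
        denom *= factorial(m)
    return factorial(len(alpha)) // denom

for alpha in [(2, 1, 0, 0), (1, 1, 1, 0), (3, 3, 1, 0, 0), (2, 2, 2)]:
    assert distinct_monomials(alpha) == predicted(alpha)
```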

The operator $E$ is obviously linear (it respects the operation $+$ on the ring $\mathbb{R}[x]$). Unfortunately, it does not respect the multiplication of polynomials, except for very special cases. This should be compared with the fact that the expectation of a product of random variables is in general different from the product of their expectations, except when the variables are independent (the notion of independence is the cornerstone of Probability theory, which makes it really different from the integration theory).

Example 13. Comparing $E(x_1^p x_2^q)$ with the product $E(x_1^p) E(x_2^q) = E(x_1^p) E(x_1^q)$, we see that the latter is
$$\frac{1}{n^2} \sum_{i=1}^n x_i^p \cdot \sum_{j=1}^n x_j^q = \frac{1}{n^2} \sum_{i=1}^n x_i^{p+q} + \frac{1}{n^2} \sum_{i \ne j} x_i^p x_j^q = \frac{1}{n} E(x_1^{p+q}) + \frac{n-1}{n} E(x_1^p x_2^q),$$
which is in general different from $E(x_1^p x_2^q)$.

4. Elementary symmetric polynomials

Another (direct) way to produce symmetric polynomials is to use the obvious fact that any monic (i.e., with unit leading coefficient) polynomial in one variable $\lambda$ is uniquely determined by its roots in a way insensitive to any permutation of these roots. In particular, all other (non-leading) coefficients of this polynomial are symmetric functions of the roots. These functions can easily be computed and turn out to be polynomials. The result is known as the (collection of) Vieta formulas.

Theorem 14 (Vieta theorem). The product
$$(\lambda - x_1)(\lambda - x_2) \cdots (\lambda - x_n) = \prod_{i=1}^n (\lambda - x_i)$$
is the monic polynomial of degree $n$ in the variable $\lambda$ which can be expanded as
$$\lambda^n - \sigma_1 \lambda^{n-1} + \sigma_2 \lambda^{n-2} - \sigma_3 \lambda^{n-3} + \cdots + (-1)^n \sigma_n. \tag{3}$$

The coefficients $\sigma_i = \sigma_i(x) \in \mathbb{R}[x]$, $i = 1, \dots, n$, are symmetric polynomials of degree $i$ in the parameters (formal variables) $x_1, \dots, x_n$. They can be described as follows:
$$\sigma_k(x) = \binom{n}{k}\, E(x_1 x_2 \cdots x_k), \qquad k = 1, \dots, n$$
($k$ is the area of the corresponding diagram). The proof of this theorem is obvious from combinatorial arguments.
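The expansion (3) can be verified numerically for concrete roots by multiplying out $\prod (\lambda - x_i)$ and comparing the coefficient of $\lambda^{n-k}$ with $(-1)^k \sigma_k$ (a sketch; `sigma` and `poly_from_roots` are ad-hoc helpers):

```python
from itertools import combinations
from math import prod

def sigma(xs, k):
    """Elementary symmetric polynomial sigma_k evaluated at the numbers xs."""
    return sum(prod(c) for c in combinations(xs, k))

def poly_from_roots(xs):
    """Coefficients of prod(lambda - x_i), highest degree first."""
    coeffs = [1]
    for x in xs:
        # multiply the current polynomial by (lambda - x)
        coeffs = [a - x * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

xs = [2, -1, 3, 5]
coeffs = poly_from_roots(xs)
# Expansion (3): the coefficient of lambda^(n-k) equals (-1)^k sigma_k.
for k in range(len(xs) + 1):
    assert coeffs[k] == (-1) ** k * sigma(xs, k)
```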

Definition 15. The coefficients $\sigma_i(x)$ occurring in the expansion (3) are called elementary symmetric polynomials in the variables $x_1, \dots, x_n$. It is convenient to set $\sigma_0 \equiv 1$. It is even more convenient to assume that $\sigma_k(x) \equiv 0$ for all $k > n$: this will simplify many formulas.

Example 16. In three dimensions we have only one new symmetric polynomial: in addition to $\sigma_1 = x + y + z$ and $\sigma_3 = xyz$, we have $\sigma_2 = xy + yz + xz$.

5. Strategic goal

What we want to do now is to find a suitable explicit description of symmetric polynomials, similar to the theorem in 2 dimensions. In other words, we want to construct a (finite) collection of symmetric polynomials $\rho_1(x), \dots, \rho_k(x) \in \mathbb{R}[x]$ which would be:
• sufficiently rich, so that any symmetric polynomial $p$ could be represented through them as a composition, $p(x) = P(\rho_1(x), \dots, \rho_k(x))$;
• non-redundant, so that the representation above is unique.
The Grand Conjecture is that the elementary symmetric polynomials $\sigma_1, \dots, \sigma_n$ fit this rather restrictive demand. We will justify it after some effort.

6. Symmetry break

In a somewhat paradoxical way, the idea of the proof is to break the symmetry and introduce inequality where all variables initially enjoyed "equal rights". Consider the lexicographic ordering of monomials. A monomial $x^\alpha$ is said to be stronger than another monomial $x^\beta$ if:

• $\alpha_1 > \beta_1$, or
• $\alpha_1 = \beta_1$, but $\alpha_2 > \beta_2$, or
• $\alpha_1 = \beta_1$, $\alpha_2 = \beta_2$, but $\alpha_3 > \beta_3$, ...
• $\alpha_i = \beta_i$ for all $i = 1, \dots, n-1$, but $\alpha_n > \beta_n$.

In other words, the variable $x_1$ is considered "most important"; if it cannot distinguish two monomials, then its role is passed to $x_2$, etc. Using this notion, one can describe the result of averaging a monomial as follows:
$$E(x^\alpha) = c\, x^\beta + \text{weaker terms},$$
where $\beta$ is the rearrangement of the components of the vector $\alpha \in \mathbb{Z}_+^n$ such that the components $\beta_1, \beta_2, \dots$ are non-increasing, and $c \in \mathbb{Q}$ is a rational coefficient which depends only on the Young diagram of $\alpha$.

Lemma 17.
$$(x^\alpha + \text{weaker terms})(x^\beta + \text{weaker terms}) = x^{\alpha+\beta} + \text{weaker terms}. \tag{4}$$

Proof. This follows from the following obvious observation: if $x^\alpha$ is stronger than $x^\beta$, then for any monomial $x^\gamma$, $x^{\alpha+\gamma}$ is stronger than $x^{\beta+\gamma}$. □
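Conveniently, Python's built-in comparison of tuples is exactly the lexicographic "stronger" order on exponent tuples, which makes Lemma 17 easy to check symbolically (a sketch; polynomials are dicts from exponent tuples to coefficients):

```python
from itertools import product as iproduct

# Tuple comparison coincides with the lexicographic "stronger" order.
assert (3, 0, 1) > (2, 5, 5)        # alpha_1 decides
assert (2, 2, 0) > (2, 1, 9)        # tie at alpha_1; alpha_2 decides

def mul(p, q):                      # polynomial product via x^a * x^b = x^(a+b)
    r = {}
    for (a, ca), (b, cb) in iproduct(p.items(), q.items()):
        s = tuple(x + y for x, y in zip(a, b))
        r[s] = r.get(s, 0) + ca * cb
    return r

# Lemma 17: the strongest monomial of a product is the componentwise sum
# of the strongest monomials of the two factors.
p = {(2, 1, 0): 1, (1, 1, 1): 4}    # leading monomial (2,1,0)
q = {(1, 0, 0): 1, (0, 2, 0): -3}   # leading monomial (1,0,0)
lead = tuple(x + y for x, y in zip(max(p), max(q)))
assert max(mul(p, q)) == lead == (3, 1, 0)
```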

Can one determine the strongest (leading) term of a monomial in the "variables" $\sigma_i$ after the expansion? Consider a monomial $\sigma^\beta = \sigma_1^{\beta_1} \sigma_2^{\beta_2} \cdots \sigma_n^{\beta_n}$. The leading term of each polynomial $\sigma_k$ is the product $x_1 \cdots x_k$:

$$\sigma_k = x_1 \cdots x_k + \text{weaker terms}.$$
Recycling Lemma 17, we conclude that the leading term of the above monomial $\sigma^\beta$ is

$$x_1^{\beta_1} \cdot (x_1 x_2)^{\beta_2} \cdots (x_1 x_2 \cdots x_n)^{\beta_n} = x_1^{\beta_1 + \cdots + \beta_n}\, x_2^{\beta_2 + \cdots + \beta_n} \cdots x_n^{\beta_n}.$$

Proposition 18. For any monomial $x^\alpha$ with $\alpha_1 \ge \alpha_2 \ge \cdots \ge \alpha_n$, there exists a unique monomial $\sigma^\beta$ such that $\sigma^\beta = x^\alpha + \text{weaker terms}$.

Proof. One has to solve the equations

$$\alpha_1 = \beta_1 + \beta_2 + \cdots + \beta_n,$$
$$\alpha_2 = \beta_2 + \cdots + \beta_n,$$
$$\cdots$$
$$\alpha_n = \beta_n$$
with respect to $\beta_k$: $\beta_1 = \alpha_1 - \alpha_2$, $\beta_2 = \alpha_2 - \alpha_3$, ..., $\beta_n = \alpha_n$. The total degree of this monomial in the variables $x_i$ is $(\alpha_1 - \alpha_2) + 2(\alpha_2 - \alpha_3) + 3(\alpha_3 - \alpha_4) + \cdots + n\alpha_n = \alpha_1 + \cdots + \alpha_n$. □

Problem 19. Consider two "symmetric monomials" $\sigma^\beta$ and $\sigma^\gamma$. Can one, looking only at the multi-exponents $\beta, \gamma$, quickly decide which of these monomials will produce the stronger leading term when expanded in multi-powers of $x$?

The answer can be described in terms of the dual order. Namely, we say that the monomial $\sigma^\beta = \sigma_1^{\beta_1} \cdots \sigma_n^{\beta_n}$ is stronger than $\sigma^\gamma = \sigma_1^{\gamma_1} \cdots \sigma_n^{\gamma_n}$ if:
• $|\beta| > |\gamma|$, or
• $|\beta| = |\gamma|$, but $\beta_2 + \cdots + \beta_n > \gamma_2 + \cdots + \gamma_n$, or
• $|\beta| = |\gamma|$, $\beta_2 + \cdots + \beta_n = \gamma_2 + \cdots + \gamma_n$, but $\beta_3 + \cdots + \beta_n > \gamma_3 + \cdots + \gamma_n$, ...
• $\sum_{i=k}^n \beta_i = \sum_{i=k}^n \gamma_i$ for all $k = 1, \dots, n-1$, but $\beta_n > \gamma_n$.

Theorem 20 (Main theorem on symmetric polynomials). Any symmetric polynomial in the variables $x_1, \dots, x_n$ can be uniquely expressed as a polynomial in the elementary symmetric polynomials $\sigma_1, \dots, \sigma_n$.

Proof. Let $p(x) = c_\alpha x^\alpha + \text{weaker terms}$ be a symmetric polynomial with the leading term explicitly selected. By Proposition 18, there exists a monomial $c_\alpha \sigma^\beta$ which has the same leading term.

The difference $p - c_\alpha \sigma^\beta$ is again a symmetric polynomial of the same degree, yet all monomials in it are by construction weaker than $x^\alpha$, so its leading

term is $c_{\alpha'} x^{\alpha'}$ with $\alpha'$ weaker than $\alpha$. It remains to note that any lexicographically (strictly!) decreasing sequence of vectors $\alpha, \alpha', \alpha'', \dots$ must eventually end with the zero vector, so the process terminates after finitely many steps.

The uniqueness of the expansion follows from the same argument. If a symmetric polynomial $p$ admitted two different representations through the $\sigma_i$, then the difference of these representations would be a nontrivial polynomial in the letters $\sigma_i$ vanishing identically as a polynomial in $x$. But by Proposition 18 distinct monomials $\sigma^\beta$ have distinct leading monomials in $x$, so the strongest of them could not be cancelled by the weaker terms, a contradiction. □

7. Power sums, a.k.a. symmetric powers, a.k.a. Newton sums

The above demonstration of the main theorem is algorithmic (i.e., it describes a process which in finitely many steps transforms any symmetric polynomial in $x$ into a polynomial in the elementary symmetric functions $\sigma$). However, sometimes we need an explicit answer, especially for the simplest symmetric polynomials.

The first case should be, obviously, that of the symmetric (Newton) sums $s_k = \sum_{i=1}^n x_i^k$. Here the index $k$ may take any natural value, in particular greater than $n$.

Theorem 21 (Newton identities). For every $k \ge n$,
$$\sigma_0 s_k - \sigma_1 s_{k-1} + \sigma_2 s_{k-2} - \sigma_3 s_{k-3} \pm \cdots + (-1)^k \sigma_k s_0 = 0. \tag{5}$$

Here we use the conventions that $\sigma_k = 0$ for $k > n$ and $s_0 = n$. (For $k < n$ the computation in the proof yields the analogous identity with the last term $(-1)^k \sigma_k s_0$ replaced by $(-1)^k k \sigma_k$.)

To prove this theorem, we introduce one of the most surprising tricks in mathematics, namely, the notion of generating functions.
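Before the proof, the identity (5) is easy to sanity-check numerically for concrete values of the variables (a sketch, with the conventions $s_0 = n$ and $\sigma_j = 0$ for $j > n$ built in):

```python
from itertools import combinations
from math import prod

def sigma(xs, j):
    """Elementary symmetric polynomial sigma_j at the numbers xs (0 for j > n)."""
    return 0 if j > len(xs) else sum(prod(c) for c in combinations(xs, j))

def s(xs, k):
    """Newton power sum s_k, with the convention s_0 = n."""
    return len(xs) if k == 0 else sum(x ** k for x in xs)

xs = [2, -1, 3]          # n = 3
n = len(xs)
# Identity (5) holds for every k >= n.
for k in range(n, n + 5):
    total = sum((-1) ** j * sigma(xs, j) * s(xs, k - j) for j in range(k + 1))
    assert total == 0
```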

8. Generating functions

Every time we need to answer a combinatorial question or prove an identity about an infinite sequence of numbers or polynomials $\{p_0, p_1, p_2, p_3, \dots\}$, it makes sense to introduce a new fictitious variable, say $t$, and instead of the infinite sequence $p$ consider the "generating function" $P(t)$ defined formally by the Taylor series
$$P(t) = \sum_{k=0}^\infty p_k t^k.$$
This function may depend on additional variables if the $p_k$ are not numbers but functions of these variables.

Example 22. With the sequence of Newton sums one associates the generating function
$$S(t, x) = \sum_{k=0}^\infty s_k(x)\, t^k. \tag{6}$$
With the sequence of elementary symmetric functions one can associate the generating function
$$E(t, x) = \sum_{k=0}^\infty \sigma_k(x)\, t^k. \tag{7}$$

Note that we use the conventions $s_0(x) = n$ and $\sigma_0(x) = 1$, $\sigma_k(x) = 0$ for $k > n$. Thus effectively the function $E(t)$ is a polynomial in $t$. Its roots are the negative reciprocals $t_i = -1/x_i$, $i = 1, \dots, n$.

The reason why generating functions are useful is the following. Standard algebraic operations (multiplication, inversion, differentiation) on power series and rational functions, when performed coefficientwise, produce combinatorial formulas which coincide with the recurrent identities defining the sequences. One example is given by the Vieta formulas above, yet a more natural representation is available for the generating function.

Lemma 23.
$$E(t, x) = \prod_{i=1}^n (1 + x_i t). \tag{8}$$

Proof. Obvious. The coefficient before $t^k$ comes from all possible products of $k$ distinct variables from the list $\{x_1, \dots, x_n\}$. □

Lemma 24.
$$S(t, x) = \sum_{i=1}^n \frac{1}{1 - x_i t}.$$

Proof. Obvious. One has to use the formula for the sum of the geometric progression $1 + z + z^2 + \cdots = 1/(1-z)$ in each variable $z = x_i t$ separately. □

The relationship between the functions $E(t, x)$ and $S(-t, x)$ comes from the observation that the logarithmic derivative operator $f(t) \mapsto f'(t)/f(t) = \frac{d}{dt} \ln f(t)$ takes products into sums. Applying this observation to the function $E$, we note (the prime denotes the derivative in the variable $t$) that

$$\frac{E'}{E} = \frac{d}{dt} \ln \prod_{i=1}^n (1 + x_i t) = \sum_{i=1}^n \frac{d}{dt} \ln(1 + x_i t) = \sum_{i=1}^n \frac{x_i}{1 + x_i t}.$$
The right hand side of the last identity clearly resembles the expression for the function $S$, with some modifications. More precisely, it can be expressed as follows:
$$P(t, x) = \sum_{i=1}^n \frac{x_i}{1 + x_i t} = \frac{S(-t, x) - s_0}{-t} = \sum_{k=1}^\infty s_k(x)\,(-t)^{k-1}.$$
In other words, the function $P(t, x)$ is a generating function (in the variable $-t$) for the "shifted" Newton sums $\tilde{s}_j = s_{j+1}$, $j = 0, 1, \dots$. Comparing these two expressions, we conclude that
$$E'(t) = P(t)\, E(t).$$
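Both Lemma 23 and the key identity $E'(t) = P(t)E(t)$ can be checked with truncated coefficient lists for concrete numbers $x_i$ (a sketch; the truncation of $P$ to degree $n-1$ is enough to recover the first $n$ coefficients of the product):

```python
from itertools import combinations
from math import prod

def poly_mul(p, q):                  # coefficient lists in t, lowest degree first
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

xs = [2, -1, 3]
n = len(xs)
E = [1]
for x in xs:
    E = poly_mul(E, [1, x])          # E(t) = prod(1 + x_i t)
# Lemma 23: the coefficient of t^k in E(t) is sigma_k.
for k in range(n + 1):
    assert E[k] == sum(prod(c) for c in combinations(xs, k))

def s(k):                            # Newton sum s_k
    return sum(x ** k for x in xs)

P = [(-1) ** (k - 1) * s(k) for k in range(1, n + 1)]   # P(t), degree n-1
Eprime = [k * E[k] for k in range(1, n + 1)]            # E'(t)
PE = poly_mul(P, E)[:n]
assert Eprime == PE                  # E'(t) = P(t) E(t), coefficientwise
```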

Proof of the Newton identities (Theorem 21). Substituting the formulas for the generating functions into the identity $E' = PE$, we arrive at the formal identity
$$\sum_{k=1}^\infty k \sigma_k t^{k-1} = \Bigl(\sum_{i=1}^\infty s_i (-t)^{i-1}\Bigr) \Bigl(\sum_{j=0}^\infty \sigma_j t^j\Bigr).$$
Comparing the coefficients before the powers $t^0, t, t^2, \dots$, we obtain the infinite series of identities:

$$\sigma_1 = s_1 \sigma_0,$$
$$2\sigma_2 = s_1 \sigma_1 - s_2 \sigma_0,$$
$$3\sigma_3 = s_1 \sigma_2 - s_2 \sigma_1 + s_3 \sigma_0,$$
$$\cdots$$
$$n\sigma_n = s_1 \sigma_{n-1} - s_2 \sigma_{n-2} + \cdots + (-1)^{n-1} s_n \sigma_0,$$
$$0 = s_1 \sigma_n - s_2 \sigma_{n-1} + s_3 \sigma_{n-2} \mp \cdots,$$
and so on (the left hand side is zero beyond the $n$th step). Since $s_0 = n$, the identities from the $n$th step on coincide with the Newton formulas (5). □

Remark 25. The above calculations were performed without any regard to the convergence of the series. This is justified, since the only property required is that of the formal logarithmic derivative of the polynomial $E$. However, the series all converge for sufficiently small values of $|x| = |x_1| + \cdots + |x_n|$.

The method of generating functions is very powerful in many areas of mathematics (not just in connection with symmetric polynomials).

Example 26. Consider the generating function
$$H(t, x) = \prod_{i=1}^n \frac{1}{1 - x_i t} = \sum_{k=0}^\infty h_k(x)\, t^k, \tag{9}$$
where $h_k(x)$ is the complete (homogeneous) symmetric polynomial of order $k$, the sum of all monomials of degree $k$:
$$h_k(x) = \sum_{|\alpha|=k} x^\alpha, \qquad k = 0, 1, 2, \dots \tag{10}$$

The identity (9) is proved by expanding each fraction $1/(1 - x_i t)$ as the sum of an infinite geometric progression and then applying combinatorial arguments when multiplying these progressions between themselves. Then we have the obvious identity
$$E(-t, x)\, H(t, x) \equiv 1,$$
cf. (8). Computing the coefficients of the product on the left hand side, we conclude with the identities
$$\sum_{k=0}^m (-1)^k \sigma_k h_{m-k} = 0 \qquad \forall m = 1, 2, \dots \tag{11}$$

(recall that $\sigma_{n+1} = \sigma_{n+2} = \cdots = 0$). These identities can be used to express the complete symmetric polynomials through the elementary symmetric functions $\sigma_1, \dots, \sigma_n$ by recursion, with the initial condition $h_0 = 1$, $h_1 = \sigma_1$.
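Rearranging (11) as $h_m = \sum_{k=1}^m (-1)^{k-1} \sigma_k h_{m-k}$ gives the recursion explicitly; the following sketch computes $h_m$ this way for concrete numbers and compares it with the direct definition (10):

```python
from itertools import combinations, product
from math import prod

xs = [2, -1, 3]
n = len(xs)
sigma = [sum(prod(c) for c in combinations(xs, k)) for k in range(n + 1)]
sigma += [0] * 10                  # convention: sigma_k = 0 for k > n

h = [1]                            # h_0 = 1
for m in range(1, 7):              # recursion derived from (11)
    h.append(sum((-1) ** (k - 1) * sigma[k] * h[m - k] for k in range(1, m + 1)))

def h_direct(xs, m):
    """Definition (10): sum of all monomials of degree m."""
    return sum(prod(x ** a for x, a in zip(xs, alpha))
               for alpha in product(range(m + 1), repeat=len(xs))
               if sum(alpha) == m)

for m in range(7):
    assert h[m] == h_direct(xs, m)
```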

Remark 27. The symmetric powers $s_0, \dots, s_n$ can be used as a different basis for the representation of all symmetric functions. Indeed, the same Newton identities can be interpreted as recurrent formulas expressing $\sigma_k$ through $s_0, \dots, s_k$. It may be worth giving this a try sometimes (cf. Problem 32 below).
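The recursion of Remark 27 reads $k\sigma_k = s_1\sigma_{k-1} - s_2\sigma_{k-2} + \cdots + (-1)^{k-1} s_k \sigma_0$ (the identities listed in the proof above); a sketch recovering all $\sigma_k$ exactly from the power sums:

```python
from fractions import Fraction
from itertools import combinations
from math import prod

xs = [2, -1, 3, 5]
n = len(xs)
s = [n] + [sum(x ** k for x in xs) for k in range(1, n + 1)]   # s_0 = n

sigma = [Fraction(1)]                      # sigma_0 = 1
for k in range(1, n + 1):
    # k * sigma_k = s_1 sigma_{k-1} - s_2 sigma_{k-2} + ... + (-1)^{k-1} s_k
    sigma.append(sum((-1) ** (i - 1) * Fraction(s[i]) * sigma[k - i]
                     for i in range(1, k + 1)) / k)

# Compare with the direct definition of the elementary symmetric polynomials.
for k in range(n + 1):
    assert sigma[k] == sum(prod(c) for c in combinations(xs, k))
```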

9. Miscellaneous problems

Problem 28. Factor the polynomial $x^3 + y^3 + z^3 - 3xyz \in \mathbb{R}[x, y, z]$.

Solution. This polynomial expands as $s_3 - 3\sigma_3 = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3 - 3\sigma_3 = \sigma_1(\sigma_1^2 - 3\sigma_2)$. □

Remark 29. Warning! In general, a symmetric polynomial can be factored into non-symmetric factors, which are permuted by the action of the symmetric group $S_n$. The above example is rather specific.

Problem 30. Factor $(x+y)(y+z)(x+z) + xyz$.

Problem 31. Factor $p = (x-y)^3 + (y-z)^3 + (z-x)^3$.

Solution. Denote $X = x-y$, $Y = y-z$, $Z = z-x$. Then $X + Y + Z = 0$, and we need to factor the polynomial $s_3(X, Y, Z) = X^3 + Y^3 + Z^3$. Using the Newton formula, we obtain $p = \sigma_1^3 - 3\sigma_1\sigma_2 + 3\sigma_3$, where the $\sigma_i$ are the elementary symmetric functions of $X, Y, Z$. Since $\sigma_1 = 0$, we have $p = 3\sigma_3 = 3XYZ = 3(x-y)(y-z)(z-x)$. □

Clearly, this approach allows one to simplify all symmetric expressions under symmetric assumptions on the variables. In particular, many symmetric sums $s_k$ can be factorized under the assumption that $\sigma_1 = 0$.

Problem 32. Find $x^4 + y^4 + z^4$ if it is known that $x + y + z = 0$ and $x^2 + y^2 + z^2 = 1$.

Symmetric representation can be useful not just for solving equations and proving identities. Inequalities can also be proved. The central fact is, of course, the inequality between the arithmetic and geometric means,
$$\frac{x_1 + \cdots + x_n}{n} \ge (x_1 \cdots x_n)^{1/n}, \tag{12}$$
which is valid for all positive real values of the symbols $x_1, \dots, x_n$. In the symmetric representation it takes the form
$$\sigma_1^n - n^n \sigma_n \ge 0. \tag{13}$$
The "dual form" of this inequality reads
$$\sigma_{n-1}^n - n^n \sigma_n^{n-1} \ge 0.$$

Prove it by considering the sum $s_{-1}(x) = \sum_i \frac{1}{x_i} = \frac{\sigma_{n-1}}{\sigma_n}$.
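The inequalities (12) and (13) can be sanity-checked numerically on random positive inputs (a sketch only, with floating-point tolerances; random spot checks are of course no substitute for a proof):

```python
import random

random.seed(1)
for _ in range(100):
    n = random.randint(2, 5)
    xs = [random.uniform(0.1, 10) for _ in range(n)]
    am = sum(xs) / n                 # arithmetic mean
    gm = 1.0
    for x in xs:
        gm *= x
    sigma_n = gm                     # sigma_n = product of the x_i
    gm **= 1.0 / n                   # geometric mean
    # (12): AM >= GM for positive reals.
    assert am >= gm - 1e-12
    # (13): sigma_1^n - n^n sigma_n >= 0 (relative tolerance for rounding).
    s1 = sum(xs)
    assert s1 ** n - n ** n * sigma_n >= -1e-9 * s1 ** n
```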

Problem 33. Prove that for real positive $x_1, \dots, x_n$
$$(x_1 + \cdots + x_n) \Bigl(\frac{1}{x_1} + \cdots + \frac{1}{x_n}\Bigr) \ge n^2.$$

Solution. After division by $n^2$, the left hand side is the product of the arithmetic mean of the numbers $x_i$ and the arithmetic mean of their reciprocals $\frac{1}{x_i}$. By the AM–GM inequality (12), this is greater than or equal to the product of the geometric means $\sqrt[n]{x_1 \cdots x_n}$ and $\sqrt[n]{\frac{1}{x_1 \cdots x_n}}$, which is one. □

Problem 34. Prove that for positive real $x, y, z$
$$x^3 + y^3 + z^3 \ge 3xyz.$$

Department of Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel, email: [email protected]