Symmetric Polynomials

KEITH CONRAD

1. Introduction

Let $F$ be a field. A polynomial $f(X_1, \dots, X_n) \in F[X_1, \dots, X_n]$ is called symmetric if it is unchanged by all permutations of its variables:

$$f(X_1, \dots, X_n) = f(X_{\sigma(1)}, \dots, X_{\sigma(n)})$$

for every permutation $\sigma$ of $\{1, \dots, n\}$.

Example 1.1. The sum $X_1 + \cdots + X_n$ and product $X_1 \cdots X_n$ are both symmetric, as are the power sums $X_1^r + \cdots + X_n^r$ for all $r \geq 1$.

Example 1.2. Let $f(X_1, X_2, X_3) = X_1^5 + X_2 X_3$. This polynomial is unchanged if we interchange $X_2$ and $X_3$, but if we interchange $X_1$ and $X_3$ then $f$ becomes $X_3^5 + X_2 X_1$, which is not $f$. This polynomial is only "partially symmetric."

An important collection of symmetric polynomials occurs as the coefficients in the polynomial

$$(T - X_1)(T - X_2) \cdots (T - X_n) = T^n - s_1 T^{n-1} + s_2 T^{n-2} - \cdots + (-1)^n s_n. \tag{1.1}$$

Here $s_1$ is the sum of the $X_i$'s, $s_n$ is their product, and more generally

$$s_k = \sum_{1 \leq i_1 < \cdots < i_k \leq n} X_{i_1} \cdots X_{i_k}$$

is the sum of the products of the $X_i$'s taken $k$ terms at a time. The polynomial $s_k$ is symmetric in $X_1, \dots, X_n$ and is called the $k$th elementary symmetric polynomial, or $k$th elementary symmetric function, in $X_1, \dots, X_n$.

Example 1.3. Let $\alpha = \frac{3 + \sqrt{5}}{2}$ and $\beta = \frac{3 - \sqrt{5}}{2}$. Although $\alpha$ and $\beta$ are not rational, their elementary symmetric polynomials are: $s_1 = \alpha + \beta = 3$ and $s_2 = \alpha\beta = 1$.

Example 1.4. Let $\alpha$, $\beta$, and $\gamma$ be the three roots of $T^3 - T - 1$, so

$$T^3 - T - 1 = (T - \alpha)(T - \beta)(T - \gamma).$$

Multiplying out the right side and equating coefficients on both sides, the elementary symmetric functions of $\alpha$, $\beta$, and $\gamma$ are $s_1 = \alpha + \beta + \gamma = 0$, $s_2 = \alpha\beta + \alpha\gamma + \beta\gamma = -1$, and $s_3 = \alpha\beta\gamma = 1$.

Our goal is to prove the following theorem.

Theorem 1.5. The set of symmetric polynomials in $F[X_1, \dots, X_n]$ is $F[s_1, \dots, s_n]$. That is, every symmetric polynomial in $n$ variables is a polynomial in the elementary symmetric functions of those $n$ variables.

Example 1.6. In two variables, the polynomial $X^3 + Y^3$ is symmetric in $X$ and $Y$.
As a polynomial in $s_1 = X + Y$ and $s_2 = XY$,

$$X^3 + Y^3 = (X + Y)^3 - 3XY(X + Y) = s_1^3 - 3 s_1 s_2.$$

There are several ways to prove Theorem 1.5. For example, it is proved in [4, Chap. IV, Sect. 6] by a double induction on $n$ and on the degree of the polynomial. The theorem can also be proved using Galois theory, transcendental field extensions, and integral ring extensions.$^1$ The proof we will give, based on [1, Sect. 7.1], provides an explicit algorithm that turns a symmetric polynomial in $X_1, \dots, X_n$ into a polynomial in $s_1, \dots, s_n$.

2. Lexicographic ordering on $F[X_1, \dots, X_n]$

In $F[X]$, many theorems are proved using induction on the degree of polynomials. The degree is a nonnegative integer associated to each nonzero polynomial $f(X)$: it is the largest $n \geq 0$ such that $f(X)$ contains a monomial $a_n X^n$ where $a_n \neq 0$ in $F$. The monomials $1, X, X^2, X^3, \dots$ (all with coefficient 1) are ordered by degree as $1 < X < X^2 < \cdots$, and $\deg(f)$ is the exponent of the largest monomial appearing in $f$ with a nonzero coefficient.

On $F[X_1, \dots, X_n]$ we will use a total ordering on the monomials $X_1^{i_1} \cdots X_n^{i_n}$ and associate to that ordering an analogue on $F[X_1, \dots, X_n]$ of the degree on $F[X]$.

Definition 2.1. For $i = (i_1, \dots, i_n)$ and $j = (j_1, \dots, j_n)$ in $\mathbf{N}^n$, set $i < j$ if, for the first index $r$ such that $i_r \neq j_r$, we have $i_r < j_r$. Write $i \leq j$ if $i < j$ or $i = j$.

Example 2.2. In $\mathbf{N}^4$, $(3, 0, 2, 4) < (5, 1, 1, 3)$ and $(3, 0, 2, 4) < (3, 0, 3, 1)$.

Example 2.3. In $\mathbf{N}^n$, $0 < i$ for all $i \neq 0$.

This way of ordering $n$-tuples in $\mathbf{N}^n$ is called the lexicographic (i.e., dictionary) ordering since it resembles the way words are ordered in the dictionary alphabetically if we think of one word as "less" than another if it comes earlier in the dictionary. The "greater" word comes later. Alphabetical order first compares words by the first letter; if the first letters are the same then the words are compared by the second letter, and so on.
While words in a dictionary have varying length, we are using lexicographic ordering only to compare sequences in $\mathbf{N}$ with the same number of terms.

$^1$ Historically, Theorem 1.5 used to be part of the mathematical development leading to Galois theory, which would make a proof of Theorem 1.5 by Galois theory circular, but Galois theory in its modern form does not require Theorem 1.5.

Theorem 2.4. Lexicographic ordering on $\mathbf{N}^n$ has the following properties.
(1) (Total ordering) For all $i$ and $j$, exactly one of $i = j$ or $i < j$ or $j < i$ holds.
(2) (Transitivity) If $i < j$ and $j < k$ then $i < k$. The same is true with $\leq$ in place of $<$.
(3) (Compatibility with addition) If $i \leq i'$ and $j \leq j'$ then $i + j \leq i' + j'$, and if either inequality in the hypothesis is strict then the inequality in the conclusion is strict.

Proof. (1) If $i \neq j$, then there is an $r$ where $i_r \neq j_r$ in $\mathbf{N}$. Let $r$ be the least index where this happens. If $i_r < j_r$ then $i < j$, and if $j_r < i_r$ then $j < i$.

(2) Let $r$ be the least index where $i_r$, $j_r$, and $k_r$ are not all equal. We must have $i_r \neq j_r$ or $j_r \neq k_r$ (if both were equalities then $i_r = j_r = k_r$, which isn't true). Since earlier coordinates in $i$, $j$, and $k$ are all equal, either $i_r < j_r$ or $j_r < k_r$ because $i < j$ and $j < k$. Therefore $i_r \leq j_r \leq k_r$ with at least one inequality being strict, so $i_r < k_r$ and earlier coordinates in $i$ and $k$ are equal. Thus $i < k$.

This result for $\leq$ is the same argument as with $<$ except we have the extra cases where $i$ and $j$ may coincide or $j$ and $k$ may coincide, which makes things easier.

(3) Rather than take cases based on where $i$ and $i'$ may first differ or where $j$ and $j'$ may first differ, observe that $i \leq i' \Rightarrow i + k \leq i' + k$ for all $k$: this is obvious when $i = i'$, and when $i < i'$ the only way $i_r + k_r$ differs from $i'_r + k_r$ is if $i_r \neq i'_r$, and the first time this happens we have $i_r < i'_r$, so $i_r + k_r < i'_r + k_r$. Now we use that twice together with the transitivity in (2).
If $i \leq i'$ and $j \leq j'$, then

$$i + j \leq i' + j = j + i' \leq j' + i' = i' + j' \Longrightarrow i + j \leq i' + j'.$$

The case of $<$ in place of $\leq$ is analogous.

A polynomial $f \in F[X_1, \dots, X_n]$ is a sum of the form

$$f = \sum_{i_1, \dots, i_n \geq 0} c_{i_1, \dots, i_n} X_1^{i_1} \cdots X_n^{i_n}$$

where $c_{i_1, \dots, i_n} \in F$ and only finitely many coefficients can be nonzero. Abbreviate this sum to multi-index form as $\sum_i c_i X^i$, where $X^i := X_1^{i_1} \cdots X_n^{i_n}$ for $i = (i_1, \dots, i_n)$. Note $X^i X^j = X^{i+j}$. (In the notation $\sum_i$, only finitely many terms are nonzero.) When $f \neq 0$, lexicographic ordering lets us compare the different nonzero monomials appearing in $f$, which leads to the following concepts that generalize degree and leading terms on $F[X]$.

Definition 2.5. Write $f$ in $F[X_1, \dots, X_n]$ as $\sum_i c_i X^i$. If $f \neq 0$, the multidegree of $f$ is the lexicographically largest index in $\mathbf{N}^n$ of a nonzero monomial in $f$:

$$\operatorname{mdeg} f = \max\{i : c_i \neq 0\} \in \mathbf{N}^n.$$

The multidegree of the zero polynomial is not defined. If $f \neq 0$ and $\operatorname{mdeg} f = d$, we call $c_d X^d$ the leading term of $f$ and $c_d$ the leading coefficient of $f$, written $c_d = \operatorname{lead} f$.

We can separate the leading term of a nonzero $f$ of multidegree $d$ from its other nonzero terms to get

$$f = c_d X^d + \sum_{i < d} c_i X^i.$$

Because multidegrees are totally ordered, a nonzero polynomial in $F[X_1, \dots, X_n]$ has a unique leading term and a unique leading coefficient in $F$.

Example 2.6. In $\mathbf{Q}[X_1, X_2]$, let $f = 7X_1 X_2^5 + 3X_2^8 + 9$. Since $\max((1, 5), (0, 8), (0, 0)) = (1, 5)$ in $\mathbf{N}^2$, $\operatorname{mdeg}(f) = (1, 5)$ and $\operatorname{lead}(7X_1 X_2^5 + 3X_2^8 + 9) = 7$.

Example 2.7. In $F[X_1, \dots, X_n]$, $\operatorname{mdeg}(X_1) = (1, 0, \dots, 0)$ and $\operatorname{mdeg}(X_n) = (0, 0, \dots, 1)$.

Example 2.8. The polynomials with multidegree $0$ are the nonzero constants.

Example 2.9. The multidegrees of the elementary symmetric polynomials are

$$\begin{aligned}
\operatorname{mdeg}(s_1) &= (1, 0, 0, \dots, 0), \\
\operatorname{mdeg}(s_2) &= (1, 1, 0, \dots, 0), \\
&\;\;\vdots \\
\operatorname{mdeg}(s_n) &= (1, 1, 1, \dots, 1).
\end{aligned}$$

For $k = 1, \dots, n$, the leading term of $s_k$ is $X_1 \cdots X_k$, so the leading coefficient of $s_k$ is 1.
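The definitions of lexicographic order, multidegree, and leading coefficient can be checked by hand on small cases. The following is a minimal Python sketch, not part of the original text. It assumes a polynomial is stored as a dictionary from exponent tuples to coefficients, and the names `lex_less`, `mdeg`, `lead`, and `elementary` are hypothetical helpers chosen only for this illustration. It recomputes Examples 2.2 and 2.6 and, for $n = 4$, Example 2.9.

```python
from itertools import combinations

def lex_less(i, j):
    """Return True if i < j in the lexicographic order of Definition 2.1.

    Assumes i and j are tuples of the same length.
    """
    for a, b in zip(i, j):
        if a != b:            # first index where the tuples differ
            return a < b
    return False              # the tuples are equal, so i < j fails

def mdeg(f):
    """Multidegree of a nonzero polynomial stored as {exponent tuple: coefficient}."""
    support = [i for i, c in f.items() if c != 0]
    if not support:
        raise ValueError("the multidegree of the zero polynomial is not defined")
    best = support[0]
    for i in support[1:]:
        if lex_less(best, i):
            best = i
    return best

def lead(f):
    """Leading coefficient: the coefficient attached to the multidegree."""
    return f[mdeg(f)]

def elementary(k, n):
    """The k-th elementary symmetric polynomial in n variables (illustrative helper)."""
    f = {}
    for subset in combinations(range(n), k):
        exponent = tuple(1 if r in subset else 0 for r in range(n))
        f[exponent] = 1
    return f

# Example 2.2: both comparisons hold.
print(lex_less((3, 0, 2, 4), (5, 1, 1, 3)), lex_less((3, 0, 2, 4), (3, 0, 3, 1)))

# Example 2.6: f = 7*X1*X2^5 + 3*X2^8 + 9.
f = {(1, 5): 7, (0, 8): 3, (0, 0): 9}
print(mdeg(f), lead(f))

# Example 2.9 with n = 4: mdeg(s_k) starts with k ones.
for k in range(1, 5):
    print(k, mdeg(elementary(k, 4)))
```

The dictionary representation is only one possible choice; any structure that lists the exponent tuples with nonzero coefficients would serve equally well for computing a multidegree.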
Our definition of multidegree is specific to calling $X_1$ the "first" variable and $X_n$ the "last" variable. Despite its ad hoc nature (there is nothing intrinsic about making $X_1$ the "first" variable), the multidegree is useful since it permits us to prove theorems about all multivariable polynomials by ordering them according to their multidegree.
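To see concretely how the multidegree depends on which variable is called "first," here is a small Python sketch, not from the original text. It assumes polynomials are stored as dictionaries from exponent tuples to coefficients, and `mdeg` and `reorder` are hypothetical helper names. It relies on the fact that Python compares tuples lexicographically, which agrees with the ordering in the text for tuples of equal length. The polynomial is the one from Example 2.6.

```python
def mdeg(f):
    """Lexicographically largest exponent tuple with a nonzero coefficient.

    Python's built-in tuple comparison is lexicographic, so max() applies directly.
    """
    return max(i for i, c in f.items() if c != 0)

def reorder(f, order):
    """List each exponent tuple with variable order[0] first, order[1] second, etc."""
    return {tuple(i[r] for r in order): c for i, c in f.items()}

# f = 7*X1*X2^5 + 3*X2^8 + 9, exponents listed as (power of X1, power of X2).
f = {(1, 5): 7, (0, 8): 3, (0, 0): 9}

# The same polynomial with X2 treated as the first variable:
# exponents are now listed as (power of X2, power of X1).
g = reorder(f, (1, 0))

print(mdeg(f), f[mdeg(f)])   # X1 first: leading term is 7*X1*X2^5
print(mdeg(g), g[mdeg(g)])   # X2 first: leading term is 3*X2^8
```

With $X_1$ first the leading coefficient is 7, while with $X_2$ first it is 3, so the leading term is a feature of the chosen variable order rather than of the polynomial alone.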
