STUDENT’S COMPANIONS IN MATH: NINTH

Polynomials and Algebraic Equations

A polynomial is a function of the form

$$p(x) = a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n, \tag{1}$$

where $a_0, a_1, a_2, \ldots, a_n$ are constants, called the coefficients of $p$. Using summation notation, we may write

$$p(x) = \sum_{k=0}^{n} a_k x^{n-k}.$$

In (1), when $a_0 \neq 0$, we say that the degree of $p$ is $n$ and we call $a_0$ the leading coefficient. When $p$ is a nonzero constant, we say that the degree of $p$ is zero. When $p$ is equal to the constant zero, it is customary to assign $\infty$ to be the degree of $p$. We are mainly interested in nonconstant polynomials, so there is no need to worry about this convention. By an algebraic equation we mean an equation of the form $p(x) = 0$. By a root of the polynomial $p$, or a solution to the algebraic equation $p(x) = 0$, we mean a (complex) number $a$ such that $p(a) = 0$.
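The definitions above can be tried out on a computer. The following is a minimal sketch, assuming a polynomial is stored as the list of its coefficients $a_0, a_1, \ldots, a_n$ (leading coefficient first); the function names `evaluate` and `is_root` are ours, chosen only for illustration. The evaluation uses the nested form known as Horner's rule.

```python
def evaluate(coeffs, x):
    """Evaluate p(x) = a_0 x^n + a_1 x^(n-1) + ... + a_n by Horner's rule.

    coeffs lists a_0, a_1, ..., a_n, leading coefficient first, as in (1).
    """
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def is_root(coeffs, a):
    """Return True when p(a) = 0 (exact for integer or Fraction inputs)."""
    return evaluate(coeffs, a) == 0


# Illustration: p(x) = x^2 - 3x + 2 = (x - 1)(x - 2)
p = [1, -3, 2]
print(evaluate(p, 3))                                 # p(3) = 9 - 9 + 2
print(is_root(p, 1), is_root(p, 2), is_root(p, 0))
```

With floating-point inputs the test `== 0` is unreliable; the sketch is meant for integers or exact fractions.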

QUESTION 1. If 1 and 2 are roots of $x^3 - x^2 - cx + d$, what are $c$ and $d$?

EXERCISE 2. Prove the following statement: if $\alpha$ and $\beta$ are roots of $x^2 + ax + b$ and $\alpha \neq \beta$, then

$$-a = \alpha + \beta, \qquad b = \alpha\beta. \tag{2}$$

(Actually, the assumption $\alpha \neq \beta$ is unnecessary. We will mention a complete set of identities giving the relation between roots and coefficients of a polynomial, traditionally called Vieta’s formulas; see (12) below.)

If the degree of $p$ is two, we call $p(x) = 0$ a quadratic equation; if three, a cubic equation; if four, a quartic equation; and if five, a quintic equation. Next we discuss the general quadratic equation $p(x) = 0$, where $p(x) = ax^2 + bx + c$ with $a \neq 0$. An important technique for dealing with a polynomial of degree two is called completing the square, which is based on the identity $a^2 + 2ab + b^2 = (a + b)^2$.

EXAMPLE 3. We are asked to do the “completing square” for the following polynomials: $x^2 + 6x + 7$, $2x^2 + 2x$ and $3x^2 - 2x + 1$. Here they are:

$$\begin{aligned}
x^2 + 6x + 7 &= x^2 + 2 \cdot 3x + 3^2 - 3^2 + 7 = (x + 3)^2 - 2, \\
2x^2 + 2x &= 2(x^2 + x) = 2\left(x^2 + 2(1/2)x + (1/2)^2 - (1/2)^2\right) = 2(x + 1/2)^2 - 1/2, \\
3x^2 - 2x + 1 &= 3\left(x^2 - 2(1/3)x + (1/3)^2 - (1/3)^2\right) + 1 = 3(x - 1/3)^2 + 2/3.
\end{aligned}$$

As you can see, “completing square” is like a tailor’s job of fitting a given expression into the form $(x + a)^2$, without worrying about what is left over.

Now we return to the general polynomial of degree two: $p(x) = ax^2 + bx + c$, with $a \neq 0$. In the following discussion we assume that $a, b, c$ are real numbers. We perform “completing square” for this polynomial:

$$\begin{aligned}
p(x) \equiv ax^2 + bx + c &= a\left(x^2 + \frac{b}{a}x\right) + c \\
&= a\left(x^2 + 2\,\frac{b}{2a}\,x + \left(\frac{b}{2a}\right)^2 - \left(\frac{b}{2a}\right)^2\right) + c \\
&= a\left(x^2 + 2\,\frac{b}{2a}\,x + \left(\frac{b}{2a}\right)^2\right) - a\,\frac{b^2}{4a^2} + c,
\end{aligned}$$

or

$$p(x) = a\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a}. \tag{3}$$

For convenience, let us write

$$p(x) = a(x - x_0)^2 + m, \tag{4}$$

where $x_0 = -b/2a$ and $m = (4ac - b^2)/4a$. Notice that $p(x_0) = m$ and $(x - x_0)^2$ is always nonnegative. So, when $a > 0$, we have $p(x) = a(x - x_0)^2 + m \geq 0 + m = m = p(x_0)$.

Thus, when $a > 0$, $p$ has a unique (global) minimum $m$ attained at $x_0$. Similarly, when $a < 0$, we have $p(x) = a(x - x_0)^2 + m \leq 0 + m = p(x_0)$ and hence $p$ has a unique maximum $m$ attained at $x_0$.
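As a quick check of formulas (3) and (4), here is a small sketch that computes $x_0$ and $m$ with exact fractions and applies it to the three polynomials of EXAMPLE 3. The function name `complete_square` is our own choice, not part of the text.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Return (x0, m) with a*x^2 + b*x + c = a*(x - x0)^2 + m, as in (4)."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x0 = -b / (2 * a)
    m = (4 * a * c - b * b) / (4 * a)
    return x0, m


# The three polynomials of EXAMPLE 3.
for a, b, c in [(1, 6, 7), (2, 2, 0), (3, -2, 1)]:
    x0, m = complete_square(a, b, c)
    kind = "minimum" if a > 0 else "maximum"
    print(f"{a}x^2 + {b}x + {c}: x0 = {x0}, {kind} m = {m}")
```

Exact fractions are used so that values such as $1/3$ and $2/3$ come out without rounding.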

QUESTION 4. How do you use calculus to draw the same conclusion?

QUESTION 5. How do you use (3) or (4) to derive the following formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \tag{5}$$

for finding the roots of $ax^2 + bx + c = 0$?

We return to the general theory of polynomials. You are assumed to know long division for polynomials. See if you can do the following

EXERCISE 6. Use long division to divide $f(x) = 2x^4 - 7x^3 + 14x + 4$ by $g(x) = x - 2$.

Now we have to be more careful about the numbers that we are allowed to use. To avoid technicalities, we restrict ourselves to the following three number systems: rationals, reals and complex numbers. The standard notation for them is:

$\mathbb{Q}$ = the set of all rational numbers
$\mathbb{R}$ = the set of all real numbers
$\mathbb{C}$ = the set of all complex numbers.

In what follows, we use the letter $F$ (this is not a standard notation) to stand for one of the above: $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$. We use the letter $F$ because each of these is a field according to a technical definition. When all the coefficients $a_0, a_1, \ldots, a_n$ of a polynomial $p(x) = a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n$ are in $F$, we say that $p$ is a polynomial over $F$. Given polynomials $f(x)$ and $g(x)$ over $F$, with $g$ not identically zero, we can divide $f(x)$ by $g(x)$, using long division, to get a quotient $q(x)$ and a remainder $r(x)$, which are also polynomials over $F$. The relation between $f$, $g$ and $q$, $r$ is given by the identity $f(x) = g(x)q(x) + r(x)$. Here, the degree of $r(x)$ is strictly less than the degree of $g(x)$, or $r(x)$ is identically zero. (Note: the degree of a nonzero constant polynomial is 0 but the degree of the zero polynomial is defined to be infinity.) In case $r(x) \equiv 0$, so that we have $f(x) = g(x)q(x)$, we say that $g$ is a factor of $f$, or $g$ divides $f$. When a nonconstant polynomial $f$ has no factors over $F$ other than the trivial ones, that is, constants or constant multiples of $f$, we say that $f$ is an irreducible polynomial over $F$.
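The division identity $f(x) = g(x)q(x) + r(x)$ can be illustrated by a short program. This is only a sketch over $\mathbb{Q}$, assuming polynomials are stored as coefficient lists (leading coefficient first, no leading zeros, the zero polynomial as an empty list); the name `poly_divmod` is hypothetical.

```python
from fractions import Fraction


def poly_divmod(f, g):
    """Divide f by g; return (q, r) with f = g*q + r and deg r < deg g.

    Polynomials are coefficient lists, leading coefficient first.
    The zero polynomial is represented by the empty list.
    """
    g = [Fraction(c) for c in g]
    if not g or g[0] == 0:
        raise ValueError("divisor needs a nonzero leading coefficient")
    r = [Fraction(c) for c in f]
    q = []
    # Each pass cancels the current leading term of r, as in long division.
    while len(r) >= len(g):
        factor = r[0] / g[0]
        q.append(factor)
        for i in range(len(g)):
            r[i] -= factor * g[i]
        r.pop(0)  # the leading term is now zero
    # Strip any remaining leading zeros from the remainder.
    while r and r[0] == 0:
        r.pop(0)
    return q, r


# x^3 - 1 = (x - 1)(x^2 + x + 1), remainder zero
print(poly_divmod([1, 0, 0, -1], [1, -1]))
# x^2 + 1 = (x - 1)(x + 1) + 2
print(poly_divmod([1, 0, 1], [1, -1]))
```

An empty remainder list means that $g$ is a factor of $f$.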

Notice that reducibility of a polynomial often depends on which field $F$ we choose. For example, $x^2 + 1$ is a polynomial irreducible over $\mathbb{Q}$ or $\mathbb{R}$. But it is reducible over $\mathbb{C}$ because it has the following proper factorization: $x^2 + 1 = (x + i)(x - i)$.

QUESTION 7. Why is $x^2 - 2$ irreducible over $\mathbb{Q}$ but reducible over $\mathbb{R}$?

Now we state the basic theorem concerning factorization of polynomials: A nonconstant polynomial over $F$ can be written as a product of finitely many irreducible polynomials over $F$, which are unique up to multiples of nonzero elements in $F$. The proof of this is similar to that of the unique factorization of integers. A good book on abstract algebra should have a theory (about something called PID, that is, principal ideal domain) covering the unique factorization theorem for both polynomials and integers.

PROBLEM 8. Why is $x^4 + 1$ irreducible over $\mathbb{Q}$?

Recall that a number $a$ is called a root of a polynomial $p$ if $p(a) = 0$. Given a polynomial $p(x)$ and a number $a$, we can divide $p(x)$ by $x - a$ to get $p(x) = (x - a)q(x) + r$, where $q$ is a polynomial with a degree smaller than that of $p$ and the remainder $r$ is a number. Now suppose that $a$ is a root of $p(x)$: $p(a) = 0$. Letting $x = a$ in $p(x) = (x - a)q(x) + r$, we obtain $0 = r$. Hence $p(x) = (x - a)q(x)$. We have shown that if $a$ is a root of $p(x)$, then $x - a$ is a factor of $p(x)$.
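Division by $x - a$ is especially simple. The sketch below uses the scheme usually called synthetic division (our choice of method; the text only assumes long division), with coefficients listed leading first. The name `divide_by_linear` is hypothetical. Putting $x = a$ in $p(x) = (x - a)q(x) + r$ shows that the remainder it returns is $p(a)$.

```python
def divide_by_linear(coeffs, a):
    """Divide p(x) by (x - a); return (q, r) with p(x) = (x - a) q(x) + r.

    coeffs lists the coefficients of p, leading coefficient first.
    The remainder r equals p(a).
    """
    if not coeffs:
        return [], 0
    row = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c   # same recurrence as nested evaluation
        row.append(carry)
    return row[:-1], row[-1]


# p(x) = x^2 - 3x + 2
print(divide_by_linear([1, -3, 2], 1))   # remainder 0, so x - 1 is a factor
print(divide_by_linear([1, -3, 2], 3))   # remainder is p(3)
```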

QUESTION 9. Why is the converse “if x − a is a factor of p(x), then a is a root of p(x)” also true?

Next, suppose that $b$ is another root: $p(b) = 0$ and $b \neq a$. Putting $x = b$ in $p(x) = (x - a)q(x)$, we get $0 = (b - a)q(b)$. As $b - a \neq 0$, we have $q(b) = 0$. Thus $b$ is also a root of $q(x)$ and hence $x - b$ is a factor of $q(x)$, say $q(x) = (x - b)Q(x)$. Thus $p(x) = (x - a)(x - b)Q(x)$, telling us that $(x - a)(x - b)$ is a factor of $p(x)$. More generally, if $a_1, a_2, \ldots, a_m$ are distinct roots of a polynomial $p(x)$, then $(x - a_1)(x - a_2) \cdots (x - a_m)$ is a factor of $p(x)$.

EXERCISE 10. Give a careful proof of the last assertion by induction on $m$.

One consequence of our discussion here is: If $p(x)$ is a polynomial of degree $n$, then $p(x)$ cannot have more than $n$ roots. To see this, suppose that $p(x)$ has more than $n$ roots, say $a_1, a_2, \ldots, a_m$ with $m > n$. Then, according to what we have just learned, $f(x) \equiv (x - a_1)(x - a_2) \cdots (x - a_m)$ is a factor of $p(x)$. This cannot happen because the degree of $f(x)$ is $m$, which is greater than the degree of $p(x)$, which is $n$. Indeed, the degree of a factor of $p$ cannot be greater than the degree of $p$.

Next we study the multiplicity of a root. Suppose that $a$ is a root of a polynomial $p$, that is, $p(a) = 0$. The above discussion tells us that $p(x) = (x - a)q(x)$ for some polynomial $q$. Now we ask the question: is $a$ also a root of $q$? The answer can be found by computing $q(a)$, the value of $q$ at $a$, to see if it is zero. If $q(a) \neq 0$, the answer is No and in that case we say that $a$ is a simple root of $p$. If $q(a) = 0$, the answer is Yes and in that case we say that $a$ is a multiple root of $p$.

EXERCISE 11. Prove that $a$ is a simple root of $p$ if and only if $p(a) = 0$ but $p'(a) \neq 0$; equivalently, $a$ is a multiple root of $p$ if and only if $p(a) = 0$ and $p'(a) = 0$. (Of course, $p'$ here stands for the derivative of $p$.)

Now we extract all factors of $x - a$ from $p$. Say there are $m$ of them, so that we can write $p(x) = (x - a)^m g(x)$ for some polynomial $g$. Now $g$ has no factor of $x - a$ and hence $g(a) \neq 0$. The positive integer $m$ is called the multiplicity of the root $a$. When $m = 1$, $a$ is a simple root. When $m > 1$, $a$ is a multiple root. When we count the number of roots of a polynomial, we should count their multiplicities as well in order to get correct answers to many questions.
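Extracting all factors of $x - a$ can be sketched as a loop: keep dividing by $x - a$ as long as the remainder is zero, and count the divisions. The function name `multiplicity` and the coefficient-list representation (leading coefficient first) are assumptions made for this illustration; integer or exact fraction inputs are assumed so that the zero test is reliable.

```python
def multiplicity(coeffs, a):
    """Count how many times (x - a) divides p, using exact arithmetic."""
    if not any(coeffs):
        raise ValueError("the zero polynomial has no well-defined multiplicity")
    m = 0
    current = list(coeffs)
    while len(current) > 1:
        # Divide current by (x - a); the last entry of row is the remainder.
        row, carry = [], 0
        for c in current:
            carry = carry * a + c
            row.append(carry)
        if row[-1] != 0:      # remainder is current(a), so a is not a root
            break
        current = row[:-1]    # quotient q with current = (x - a) q
        m += 1
    return m


# p(x) = x^3 - 3x + 2 = (x - 1)^2 (x + 2)
p = [1, 0, -3, 2]
print(multiplicity(p, 1), multiplicity(p, -2), multiplicity(p, 0))
```

A result of 0 means that $a$ is not a root at all.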

PROBLEM 12. Prove by induction on $m$ that the multiplicity of a root $a$ of a polynomial $p$ is $m$ if and only if $p^{(k)}(a) = 0$ for $0 \leq k < m$ and $p^{(m)}(a) \neq 0$. ($p^{(k)}$ is the $k$th derivative of $p$.)

Concerning the existence of roots, we have the following very basic theorem:

Fundamental Theorem of Algebra. If $p$ is a polynomial of degree $n \geq 1$ with complex numbers as coefficients, then there is a complex number $a$ (called a root of $p$) such that $p(a) = 0$; in fact, counting multiplicities, $p$ has exactly $n$ roots.

Any proof of this basic theorem is beyond the level here, although there are over a hundred of them (I don’t know the recent record; I guess there are about two hundred proofs).

As we know, a real polynomial (that is, a polynomial with real coefficients) does not necessarily have a real root. For a quadratic polynomial $p(x) = ax^2 + bx + c$ with real coefficients $a\ (\neq 0)$, $b$, $c$, the recipe (5) in QUESTION 5 for its roots tells us that $p$ does not have real roots if its discriminant $b^2 - 4ac$ is negative, and in that case its roots are two complex numbers conjugate to each other. In general, we have the following fact concerning the roots of a real polynomial:
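The behaviour of formula (5) when $b^2 - 4ac$ is negative can be seen numerically. The sketch below uses Python's standard `cmath` module for the complex square root; the function name `quadratic_roots` is our own.

```python
import cmath


def quadratic_roots(a, b, c):
    """Roots of a x^2 + b x + c = 0 for real a != 0, via formula (5)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    disc = b * b - 4 * a * c
    root = cmath.sqrt(disc)  # complex square root, so disc < 0 is allowed
    return (-b + root) / (2 * a), (-b - root) / (2 * a)


# Negative discriminant: the two roots are complex conjugates.
r1, r2 = quadratic_roots(1, 2, 5)
print(r1, r2, abs(r1.conjugate() - r2) < 1e-12)

# Positive discriminant: both roots are real.
print(quadratic_roots(1, -3, 2))
```

The results are floating-point complex numbers, so comparisons should allow for a small rounding error.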

Theorem. Suppose that $p$ is a real polynomial of degree $d \geq 1$. Then: (1) if the degree $d$ is an odd number, then $p$ has at least one real root; (2) if $\omega$ is a nonreal root of $p$, then its complex conjugate $\bar\omega$ is also a root of $p$; (3) $p$ can be written as a product of irreducible factors, each either of the form $rx + s$ or $ax^2 + bx + c$, where $r, s, a, b, c$ are real numbers with $r, a \neq 0$ and $b^2 - 4ac < 0$.

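To prove this theorem, let $p(x) = a_0x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n$, where $a_0, a_1, \ldots, a_n$ are real numbers with $a_0 \neq 0$. If $\omega$ is a nonreal root of $p$, then $p(\omega) = 0$ and hence

$$\begin{aligned}
p(\bar\omega) &= a_0\bar\omega^n + a_1\bar\omega^{n-1} + \cdots + a_{n-1}\bar\omega + a_n \\
&= \overline{a_0\omega^n + a_1\omega^{n-1} + \cdots + a_{n-1}\omega + a_n} \quad \text{(because the coefficients are real)} \\
&= \overline{p(\omega)} = \bar 0 = 0
\end{aligned}$$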

and hence $\bar\omega$ is also a root of $p$. This proves (2). Now both $x - \omega$ and $x - \bar\omega$ are factors of $p$ and they are different (otherwise $\omega = \bar\omega$, contradicting the fact that $\omega$ is nonreal). Thus $(x - \omega)(x - \bar\omega)$ is a factor of $p$. So we have $p(x) = (x - \omega)(x - \bar\omega)q(x)$ for some polynomial $q$ of degree $n - 2$. Now

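$$(x - \omega)(x - \bar\omega) = x^2 - (\omega + \bar\omega)x + \omega\bar\omega \equiv x^2 + bx + c$$

with $b = -(\omega + \bar\omega)$, $c = \omega\bar\omega$ real and $b^2 - 4c = (\omega + \bar\omega)^2 - 4\omega\bar\omega = \omega^2 + 2\omega\bar\omega + \bar\omega^2 - 4\omega\bar\omega = (\omega - \bar\omega)^2 < 0$.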

QUESTION 13. Why are $b$ and $c$ real? Why is $(\omega - \bar\omega)^2$ negative?

Since $p$ and $(x - \omega)(x - \bar\omega) \equiv x^2 + bx + c$ are real polynomials and $p(x) = (x^2 + bx + c)q(x)$, $q$ is also a real polynomial.

QUESTION 14. Why?

We can ask if $q$ has a nonreal root. If the answer is yes, we can get another conjugate pair of nonreal roots and hence another factor of the form $ax^2 + bx + c$ with real $a, b, c$ such that $b^2 - 4ac < 0$. Pulling out this factor from $q$ (and hence from $p$), the degree of the remaining quotient drops again by 2. Continue in this manner until all nonreal roots are exhausted, so that only real roots of $p$ remain. Now part (3) of the above theorem is clear. Part (1) is a direct consequence of part (3).

From the above discussion, we see that irreducible polynomials over $\mathbb{C}$ are of the form $ax + b$, where $a \neq 0$, and irreducible polynomials over $\mathbb{R}$ are of the form $rx + s$ ($r \neq 0$) or $ax^2 + bx + c$ with $b^2 - 4ac < 0$. What about irreducible polynomials over $\mathbb{Q}$, the field of rational numbers? This is hard, very hard. We can produce many of them by using some advanced theorems such as Eisenstein’s criterion. Some of them, such as cyclotomic polynomials, are very important. A complete description of irreducible polynomials over the rationals seems to be impossible.

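EXAMPLE 15. We are asked to factorize the polynomial $p(x) = x^5 - 1$ over the real field $\mathbb{R}$. Let $\omega = e^{2\pi i/5}$. Then $1$, $\omega$, $\omega^2$, $\omega^3 \equiv \bar\omega^2$, $\omega^4 \equiv \bar\omega$ are the roots of $p$. Thus

$$\begin{aligned}
p(x) &= (x - 1)(x - \omega)(x - \omega^2)(x - \omega^3)(x - \omega^4) \\
&= (x - 1)(x - \omega)(x - \bar\omega)(x - \omega^2)(x - \bar\omega^2) \\
&= (x - 1)\left(x^2 - (\omega + \bar\omega)x + 1\right)\left(x^2 - (\omega^2 + \bar\omega^2)x + 1\right),
\end{aligned} \tag{6}$$

in view of $\omega\bar\omega = 1$ and $\omega^2\bar\omega^2 = 1$. Now please read EXAMPLE 17 of the SIXTH COMPANION. From there we get $\omega + \bar\omega = 2\cos(2\pi/5) = (\sqrt{5} - 1)/2$. So

$$\omega^2 + \bar\omega^2 = (\omega + \bar\omega)^2 - 2\omega\bar\omega = \left(\frac{\sqrt{5} - 1}{2}\right)^2 - 2 = \frac{5 - 2\sqrt{5} + 1}{4} - 2 = -\frac{\sqrt{5} + 1}{2}.$$

Substituting the last expressions into (6), we have

$$x^5 - 1 = (x - 1)\left(x^2 + \frac{1 - \sqrt{5}}{2}\,x + 1\right)\left(x^2 + \frac{1 + \sqrt{5}}{2}\,x + 1\right),$$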

which is the required factorization.
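A numerical sanity check of this factorization is easy to sketch: multiply the three factors as coefficient lists and compare with the coefficients of $x^5 - 1$. This uses floating-point arithmetic, so it confirms the identity only up to rounding error and is not a proof; the helper name `poly_mul` is hypothetical.

```python
import math


def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (leading first)."""
    product = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            product[i + j] += fi * gj
    return product


s5 = math.sqrt(5)
factors = [
    [1, -1],                  # x - 1
    [1, (1 - s5) / 2, 1],     # x^2 + ((1 - sqrt 5)/2) x + 1
    [1, (1 + s5) / 2, 1],     # x^2 + ((1 + sqrt 5)/2) x + 1
]

result = [1.0]
for factor in factors:
    result = poly_mul(result, factor)

expected = [1, 0, 0, 0, 0, -1]   # x^5 - 1
print(all(abs(r - e) < 1e-12 for r, e in zip(result, expected)))

# Both quadratic factors have negative discriminant b^2 - 4ac.
print([f[1] ** 2 - 4 * f[0] * f[2] < 0 for f in factors[1:]])
```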

QUESTION 16. Are you sure the answer in the last example is correct? Check the identity

$$x^4 + x^3 + x^2 + x + 1 = \left(x^2 + \frac{1 - \sqrt{5}}{2}\,x + 1\right)\left(x^2 + \frac{1 + \sqrt{5}}{2}\,x + 1\right).$$

EXERCISE 17. Factorize $x^4 + x^2 + 1$ over the real field $\mathbb{R}$.

PROBLEM 18. Consider a polynomial of the form $p(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n$, where the coefficients $a_1, a_2, \ldots, a_n$ are integers. Prove that rational roots of $p$ are integers, that is, if $r$ is a rational number such that $p(r) = 0$, then $r$ is an integer. Use this fact to deduce that, for positive integers $m$ and $n$, $\sqrt[n]{m}$ is either an integer or an irrational number. (In particular, $\sqrt{2}$, $\sqrt{3}$, $\sqrt{6}$ etc. are irrational.)
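The statement of PROBLEM 18 suggests a simple search for all rational roots of such a polynomial. The sketch below takes that statement for granted and additionally uses the standard fact (not stated in the text) that a nonzero integer root must divide the constant term, since $a_n = -r(r^{n-1} + a_1r^{n-2} + \cdots + a_{n-1})$. The function names are hypothetical.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients leading first) at x."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def rational_roots_monic(coeffs):
    """Rational roots of a polynomial with integer coefficients and
    leading coefficient 1. Assumes every rational root is an integer
    (PROBLEM 18) and that a nonzero integer root divides the constant term.
    """
    if not coeffs or coeffs[0] != 1:
        raise ValueError("expected leading coefficient 1")
    roots = set()
    work = list(coeffs)
    # Factor out powers of x so that the constant term becomes nonzero.
    while len(work) > 1 and work[-1] == 0:
        roots.add(0)
        work.pop()
    constant = abs(work[-1])
    for d in range(1, constant + 1):
        if constant % d == 0:
            for candidate in (d, -d):
                if evaluate(work, candidate) == 0:
                    roots.add(candidate)
    return sorted(roots)


print(rational_roots_monic([1, -6, 11, -6]))   # x^3 - 6x^2 + 11x - 6
print(rational_roots_monic([1, 0, -2]))        # x^2 - 2
```

An empty result for $x^2 - 2$ is consistent with the irrationality of $\sqrt{2}$.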

In the rest we briefly describe other important aspects of polynomials. First: finding roots by radicals. Formula (5) in QUESTION 5 is a closed form for the roots of a general quadratic equation in terms of a radical (roughly speaking, a radical is an algebraic expression involving roots, without trigonometric, exponential or other transcendental functions). For solving a general cubic equation $x^3 + ax^2 + bx + c = 0$, Cardano published in 1545 a method described as follows. (This method was originally due to Tartaglia, to whom Cardano had earlier promised to keep it a secret. Here we cannot give more detail of this fascinating story.) First step: eliminate the $x^2$ term by a simple transformation $x = y + k$, where $k$ will be determined. Substituting $x = y + k$ into $x^3 + ax^2 + bx + c$, we get $(y + k)^3 + a(y + k)^2 + \cdots = y^3 + (3k + a)y^2 + \cdots$. Letting $k = -a/3$, the original equation is converted into

$$y^3 + py + q = 0. \tag{7}$$

This step is more or less routine. It works for polynomials of any degree. The next step is VERY SLICK: let $y = u + v$. (Here we replace one unknown $y$ by two unknowns $u$ and $v$. But so far we only have one equation, namely (7). Therefore we have the freedom to choose another equation!) Then (7) becomes $(u + v)^3 + p(u + v) + q = 0$, or $u^3 + 3u^2v + 3uv^2 + v^3 + p(u + v) + q = 0$, that is,

$$u^3 + v^3 + (3uv + p)(u + v) + q = 0. \tag{8}$$

The other equation we choose is $3uv + p = 0$, so that (8) becomes $u^3 + v^3 + q = 0$, or $u^3 + v^3 = -q$. From $3uv + p = 0$ we obtain

$$uv = -p/3, \tag{9}$$

which gives $u^3v^3 = -p^3/27$. So, letting $A = u^3$ and $B = v^3$, we get

$$A + B = -q, \qquad AB = -p^3/27.$$

Solving this system, we get

$$A \equiv u^3 = -\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}, \qquad B \equiv v^3 = -\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}.$$

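We get possible values of $u$ and $v$:

$$u = \sqrt[3]{A},\ \omega\sqrt[3]{A},\ \omega^2\sqrt[3]{A}; \qquad v = \sqrt[3]{B},\ \omega\sqrt[3]{B},\ \omega^2\sqrt[3]{B},$$

where $\omega = e^{2\pi i/3}$. Due to the constraint (9), only three combinations of $u$ and $v$ are possible for $y = u + v$ as the solutions (here the cube roots $\sqrt[3]{A}$ and $\sqrt[3]{B}$ are chosen so that their product is $-p/3$):

$$y = \sqrt[3]{A} + \sqrt[3]{B}, \quad \omega\sqrt[3]{A} + \omega^2\sqrt[3]{B}, \quad \omega^2\sqrt[3]{A} + \omega\sqrt[3]{B}.$$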
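The recipe for the reduced equation $y^3 + py + q = 0$ can be tried numerically. The following is a sketch under stated assumptions, not a robust solver: it works in floating-point complex arithmetic, picks one cube root $u$ of $A$, and then takes $v = -p/(3u)$ so that the constraint $uv = -p/3$ holds automatically. The function name `depressed_cubic_roots` is our own.

```python
import cmath


def depressed_cubic_roots(p, q):
    """Roots of y^3 + p y + q = 0 following the substitution y = u + v."""
    s = cmath.sqrt(q * q / 4 + p ** 3 / 27)
    A = -q / 2 + s
    B = -q / 2 - s
    # Use whichever of A, B is larger in size, to avoid dividing by ~0.
    if abs(B) > abs(A):
        A = B
    if A == 0:
        return [0j, 0j, 0j]   # A = B = 0 happens only when p = q = 0
    u = A ** (1 / 3)          # one cube root of A
    v = -p / (3 * u)          # forced by the constraint uv = -p/3
    w = cmath.exp(2j * cmath.pi / 3)
    return [u + v, w * u + w * w * v, w * w * u + w * v]


# y^3 - 7y + 6 = (y - 1)(y - 2)(y + 3)
for y in depressed_cubic_roots(-7, 6):
    print(y, abs(y ** 3 - 7 * y + 6) < 1e-9)
```

Even though all three roots in this illustration are real, the intermediate quantity $A$ is nonreal, so complex arithmetic is needed along the way.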

Quartic equations (algebraic equations of degree four) can be solved by the method of reduction to cubic equations, which was discovered by Cardano’s student Ferrari. Great efforts were made to solve quintic equations by radicals, but all of them were doomed to fail: in the early nineteenth century Abel proved that this is impossible. In this case, merely using radicals is not enough. To solve quintic equations one needs elliptic functions (such as the Weierstrass P function $\wp(z)$), which are related to elliptic curves, a hot topic in current mathematical research. The most powerful method of studying algebraic equations is Galois theory, which has had tremendous impact on many other areas of mathematics. Galois is a “romantic revolutionary figure” in mathematics who died in a duel at the age of twenty. [Another “romantic figure” in math is Sofya Kowalevskaya, who is one of the two greatest women in math and whose beauty is legendary. I’m just curious why Hollywood didn’t make a movie about her.]

In EXERCISE 2 we have considered the relation between the roots and the coefficients of a quadratic polynomial. Consider a general polynomial $p$ of degree $n$ with leading coefficient 1:

$$p(x) = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n. \tag{10}$$

Let $x_1, x_2, \ldots, x_n$ be the roots of $p$. Then

$$p(x) = (x - x_1)(x - x_2) \cdots (x - x_n). \tag{11}$$

By multiplying out the right hand side of (11) and then comparing to (10), we have

$$a_1 = -s_1, \quad a_2 = s_2, \quad \ldots, \quad a_n = (-1)^n s_n, \tag{12}$$

where

$$\begin{aligned}
s_1 &= x_1 + x_2 + \cdots + x_n \equiv \sum_{k} x_k, \\
s_2 &= x_1x_2 + x_1x_3 + \cdots + x_{n-1}x_n \equiv \sum_{j<k} x_jx_k, \\
&\ \ \vdots \\
s_n &= x_1x_2 \cdots x_n.
\end{aligned} \tag{13}$$

The expressions $s_1, s_2, \ldots, s_n$, considered as functions of $x_1, x_2, \ldots, x_n$, are called elementary symmetric functions.
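The relations (12) can be checked on concrete numbers. The sketch below computes $s_1, \ldots, s_n$ directly from (13) as sums of products over all $k$-element subsets of the roots, expands the product (11), and compares. The function names are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(roots):
    """Return [s_1, ..., s_n] for the given roots, as in (13)."""
    n = len(roots)
    return [sum(prod(c) for c in combinations(roots, k))
            for k in range(1, n + 1)]


def monic_from_roots(roots):
    """Expand (x - x_1)...(x - x_n); return [1, a_1, ..., a_n] as in (10)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        shifted = coeffs + [0]
        for i in range(len(coeffs)):
            shifted[i + 1] -= r * coeffs[i]
        coeffs = shifted
    return coeffs


roots = [2, 3, 5]
s = elementary_symmetric(roots)
a = monic_from_roots(roots)
print(s)   # s_1, s_2, s_3
print(a)   # 1, a_1, a_2, a_3
print(all(a[k] == (-1) ** k * s[k - 1] for k in range(1, len(roots) + 1)))
```

A numerical instance like this illustrates (12) but does not replace the symbolic check asked for in the exercises.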

EXERCISE 19. Check (12) for n = 4.

A polynomial $P = P(x_1, x_2, \ldots, x_n)$ in $n$ variables is called a symmetric polynomial if it is unchanged after any permutation of its variables. For example, for $n = 3$, $P(x_1, x_2, x_3)$ is a symmetric polynomial if and only if it is invariant under the six permutations of $x_1, x_2, x_3$:

$$\begin{aligned}
P(x_1, x_2, x_3) &= P(x_2, x_3, x_1) = P(x_3, x_1, x_2) \\
&= P(x_1, x_3, x_2) = P(x_2, x_1, x_3) = P(x_3, x_2, x_1).
\end{aligned}$$

For example, $s_1, s_2, \ldots, s_n$ given by (13) are symmetric polynomials. Actually, they generate all symmetric polynomials in the following sense:

Theorem. If $P = P(x_1, x_2, \ldots, x_n)$ is a symmetric polynomial, then there is a polynomial $Q$ in $n$ variables such that $P = Q(s_1, s_2, \ldots, s_n)$; in other words, $P$ can be written as a polynomial in elementary symmetric functions. (This basic theorem about symmetric polynomials, stated without proof here, is due to Newton.)

Given two polynomials $f$ and $g$ of degree $m$ and $n$ respectively, how do we know if they have a common factor or they are relatively prime? You can use the Euclidean algorithm to find out, or compute a determinant of size $m + n$ called the resultant (due to Sylvester), denoted by $R(f, g)$, to see if it vanishes. A polynomial $p$ has a multiple root if and only if $p$ and its derivative $p'$ have a common factor. Hence $R(p, p') = 0$ is the criterion for existence of multiple roots. The expression $R(p, p')$ is called the discriminant of $p$ (up to a normalizing factor depending only on the leading coefficient). The discriminant of $ax^2 + bx + c$ turns out to be $b^2 - 4ac$. This should not surprise you! The resultant is also important for polynomials of several variables, especially in a computation method called elimination.
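The Euclidean algorithm route can be sketched in a few lines over $\mathbb{Q}$: repeatedly replace the pair $(f, g)$ by $(g, r)$, where $r$ is the remainder of $f$ divided by $g$, until the remainder vanishes. Applied to $p$ and $p'$, a nonconstant result signals a multiple root. All function names here are hypothetical; polynomials are coefficient lists with the leading coefficient first.

```python
from fractions import Fraction


def strip(poly):
    """Drop leading zeros; the zero polynomial becomes the empty list."""
    i = 0
    while i < len(poly) and poly[i] == 0:
        i += 1
    return poly[i:]


def remainder(f, g):
    """Remainder of f divided by g (g nonzero, no leading zeros)."""
    r = strip(f)
    while len(r) >= len(g):
        factor = r[0] / g[0]
        for i in range(len(g)):
            r[i] -= factor * g[i]
        r = strip(r)
    return r


def poly_gcd(f, g):
    """Greatest common divisor, scaled to have leading coefficient 1."""
    f = strip([Fraction(c) for c in f])
    g = strip([Fraction(c) for c in g])
    while g:
        f, g = g, remainder(f, g)
    if not f:
        return []
    return [c / f[0] for c in f]


def derivative(poly):
    """Coefficients of p' from those of p."""
    n = len(poly) - 1
    return [c * (n - i) for i, c in enumerate(poly[:-1])]


def has_multiple_root(poly):
    """True when p and p' share a nonconstant factor."""
    return len(poly_gcd(poly, derivative(poly))) > 1


p = [1, 0, -3, 2]                       # x^3 - 3x + 2 = (x - 1)^2 (x + 2)
print(poly_gcd(p, derivative(p)))       # common factor of p and p'
print(has_multiple_root(p), has_multiple_root([1, 0, -2]))
```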

We also mention that, for a real polynomial $p$, there is an important method due to Sturm to determine how many real roots of $p$ lie in a given interval.
