Solving Polynomial Equations
Total Page:16
File Type:pdf, Size:1020Kb
Seminar on Advanced Topics in Mathematics Solving Polynomial Equations 5 December 2006 Dr. Tuen Wai Ng, HKU What do we mean by solving an equation ? Example 1. Solve the equation x2 = 1. x2 = 1 x2 ¡ 1 = 0 (x ¡ 1)(x + 1) = 0 x = 1 or = ¡1 ² Need to check that in fact (1)2 = 1 and (¡1)2 = 1. Exercise. Solve the equation p p x + x ¡ a = 2 where a is a positive real number. What do we mean by solving a polynomial equation ? Meaning I: Solving polynomial equations: ¯nding numbers that make the polynomial take on the value zero when they replace the variable. ² We have discovered that x, which is something we didn't know, turns out to be 1 or ¡1. Example 2. Solve the equation x2 = 5. x2 = 5 2 p x p¡ 5 = 0 (x ¡ 5)(x + 5) = 0p p x = 5 or ¡ 5 p p ² But what is 5 ? Well, 5 is the positive real number that square to 5. ² We have "learned" that the positive solution to the equation x2 = 5 is the positive real number that square to 5 !!! ² So there is a sense of circularity in what we have done here. ² Same thing happens when we say that i is a solution of x2 = ¡1. What are \solved" when we solve these equations ? ² The equations x2 = 5 and x2 = ¡1 draw the attention to an inadequacy in a certain number system (it does not contain a solution to the equation). ² One is therefore driven to extend the number system by introducing, or `adjoining', a solution. ² Sometimes, the extended system has the good algebraic properties of the original one, e.g. addition and multiplication can be de¯ned in a natural way. p ² These extended number systems (e.g. Q( 5) or Q(i)) have the added advantage that more equations can be solved. Consider the equation x2 = x + 1: ² By completingp the square,p or by applying the formula, we know that the 1+ 5 1¡ 5 solutions are 2 or 2 . p 1+ 5 ² It is certainly not true by de¯nition that 2 is a solution of the equation. ² What we have done is to take for granted that we can solve the equation x2 = 5 (and similar ones) and to use this interesting ability to solve an equation which is not of such a simple form. ² When we solve the quadratic, what we are actually showing that the problem can be reduced to solving a particularly simple quadratic x2 = c. What do we mean by solving a polynomial equation ? Meaning II: Suppose we can solve the equation xn = c, i.e. taking roots, try to express the the roots of a degree n polynomial using only the usual algebraic operations (addition, subtraction, multiplication, division) and application of taking roots. ² In this sense, one can solve any polynomials of degree 2,3 or 4 and this is in general impossible for polynomials of degree 5 or above. ² The Babylonians (about 2000 B.C.) knew how to solve speci¯c quadratic equations. ² The solution formula for solving the quadratic equations was mentioned in the Bakshali Manuscript written in India between 200 BC and 400 AD. ² Based on the work of Scipione del Ferro and Nicolo Tartaglia, Cardano published the solution formula for solving the cubic equations in his book Ars Magna (1545). ² Lodovico Ferrari, a student of Cardano discovered the solution formula for the quartic equations in 1540 (published in Ars Magna later). ² The formulae for the cubic and quartic are complicated, and the methods to derive them seem ad hoc and not memorable. Solving polynomial equations using circulant matrices D. Kalman and J.E. White, Polynomial Equations and Circulant Matrices, The American Mathematical Monthly, 108, no.9, 821-840, 2001. Circulant matrices. An n £ n circulant matrix is formed from any n-vector by cyclically permuting the entries. For example, starting with [a b c] we can generate the 3 £ 3 circulant matrix 2 3 a b c C = 4c a b5 : (1) b c a ² Circulant matrices have constant values on each downward diagonal, that is, along the lines of entries parallel to the main diagonal. The eigenvalues and eigenvectors of circulant matrices are very easy to compute using the nth roots of unity. ² For the 3 £ 3 matrix C in (1), we need the cube roots of unity: p 1;! = (¡1 + i 3)=2 and !2 = !: ² Direct computations show that the eigenvalues of C are a + b + c; a + b! + c!2, and a + b! + c!2, with corresponding eigenvectors (1; 1; 1)T ; (1; !; !2)T , and (1; !; !2)T . ² This result can be generalized to higher dimensions (n ¸ 3). To begin with, we de¯ne a distinguished circulant matrix W with ¯rst row (0; 1; 0;:::; 0). W is just the identity matrix with its top row moved to the bottom, e.g. for n = 4, 2 3 2 3 2 3 0 1 0 0 0 0 1 0 0 0 0 1 60 0 1 07 60 0 0 17 61 0 0 07 W = 6 7 ;W 2 = 6 7 ;W 3 = 6 7 : 40 0 0 15 41 0 0 05 40 1 0 05 1 0 0 0 0 1 0 0 0 0 1 0 Direct checking shows that i) Note that W T = W ¡1 (i.e. W is an orthogonal matrix). ii) The characteristic polynomial for W is p(t) = det(tI ¡ W ) = tn ¡ 1, and hence the eigenvalues of W are the nth roots of unity. 2 n¡1 iii) For each nth root of unity ¸, v¸ = (1; ¸; ¸ ; : : : ; ¸ ) is an associated eigenvector. If C is any n £ n circulant matrix, use its ¯rst row [a0 a1 a2 : : : an¡1] 2 n¡1 to de¯ne a polynomial q(t) = a0 + a1t + a2t + ¢ ¢ ¢ + an¡1t . 2 n¡1 i) Then C = q(W ) = a0I + a1W + a2W + ¢ ¢ ¢ + an¡1W . For example, when n = 4, 2 3 2 3 a0 0 0 0 0 a1 0 0 6 0 a0 0 0 7 6 0 0 a1 0 7 a0I = 6 7 ; a1W = 6 7 4 0 0 a0 0 5 4 0 0 0 a15 0 0 0 a0 a1 0 0 0 2 3 2 3 0 0 a2 0 0 0 0 a3 6 7 6 7 2 0 0 0 a2 3 a3 0 0 0 a2W = 6 7 ; a3W = 6 7 : 4a2 0 0 0 5 4 0 a3 0 0 5 0 a2 0 0 0 0 a3 0 2 3 Therefore, q(W ) = a0I + a1W + a2W + a3W is equal to 2 3 a0 a1 a2 a3 6a a a a 7 C = 6 3 0 1 27 4a2 a3 a0 a15 a1 a2 a3 a0 ii) For any nth root of unity ¸; q(¸) is an eigenvalue of C = q(W ). [ Indeed, if W v = ¸v, then W kv = ¸kv and hence q(W )v = q(¸)v.] Example 3. Consider the circulant matrix 2 3 1 2 1 3 63 1 2 17 C = 6 7 : 41 3 1 25 2 1 3 1 Read the polynomial q from the ¯rst row of C: q(t) = 1 + 2t + t2 + 3t3: Here, with n = 4, the nth roots of unity are §1 and §i. The eigenvalues of C are now computed as q(1) = 7; q(¡1) = ¡3; q(i) = ¡i; and q(¡i) = i: The corresponding eigenvectors are v(1) = (1; 1; 1; 1); v(¡1) = (1; ¡1; 1; ¡1); v(i) = (1; i; ¡1; ¡i); and v(¡i) = (1; ¡i; ¡1; i): Can check that the characteristic polynomial of C is det (tI ¡ C) = p(t) = t4 ¡ 4t3 ¡ 20t2 ¡ 4t ¡ 21: Summary Start with any circulant matrix C, one can generate both the roots and coe±cients of a polynomial p. Here, the polynomial p is the characteristic polynomial of C; the coe±cients can be obtained from the identity p(t) = det (tI ¡ C); the roots, i.e., the eigenvalues of C, can be found by applying q to the nth roots of unity. This perspective leads to a uni¯ed method for solving general quadratic, cubic, and quartic equations. In fact, given a polynomial p, we try to ¯nd a corresponding circulant C having p as its characteristic polynomial. The ¯rst row of C then de¯nes a di®erent polynomial q, and the roots of p are obtained by applying q to the nth roots of unity. Solving polynomial equations using circulant matrices. Quadratics. Let's consider a general quadratic polynomial, p(t) = t2 + ®t + ¯: We also consider a general 2 £ 2 circulant · ¸ a b C = : b a The characteristic polynomial of C is · ¸ t ¡ a ¡b det = t2 ¡ 2at + a2 ¡ b2: ¡b t ¡ a We must ¯nd a and b so that this characteristic polynomial equals p, so ¡2a = ® a2 ¡ b2 = ¯: p Solving this system gives a = ¡®=2 and b = § ®2=4 ¡ ¯. To proceed, we require only one solution of the system, and for convenience de¯ne b with the positive sign, so · p ¸ ¡®=2 ®2=4 ¡ ¯ C = p ®2=4 ¡ ¯ ¡®=2 and r ¡® ®2 q(t) = + t ¡ ¯: 2 4 The roots of the original quadratic are now found by applying q to the two square roots of unity: q ¡® 2 q(1) = + ® ¡ ¯ 2 q 4 ¡® 2 q(¡1) = ¡ ® ¡ ¯: 2 4 ² Observe that de¯ning b with the opposite sign produces the same roots of p, although the values of q(1) and q(¡1) are exchanged. Cubics. A parallel analysis works for cubic polynomials. n n¡1 We ¯rst notice that by simple algebra, for p(x) = x + ®n¡1x + ¢ ¢ ¢ + ®1x + ®0, the substitution y = x ¡ ®n¡1=n eliminates the term of degree n ¡ 1.