FACTORIZATION of MATRICES
Let's begin by looking at various decompositions of matrices (see also Matrix Factorization). In computer science, these decompositions are used to implement efficient matrix algorithms. Indeed, as Lay says in his book: "In the language of computer science, the expression of A as a product amounts to a pre-processing of the data in A, organizing that data into two or more parts whose structures are more useful in some way, perhaps more accessible for computation." Nonetheless, the origin of these factorizations lies long in the past, when they were first introduced by various mathematicians after whom they are frequently named (though not always, and not always chronologically accurately, either!). Their importance in math, science, engineering and economics quite generally cannot be overstated, however. As Strang says in his book: "many key ideas of linear algebra, when you look at them closely, are really factorizations of a matrix. The original matrix A becomes the product of 2 or 3 special matrices."

But factorization is really what you've done for a long time in different contexts. For example, each positive integer n, say 72, can be factored as a product 72 = 2³ × 3² of primes, while each polynomial such as P(x) = x⁴ − 16 can be factored as a product P(x) = (x − 2)(x + 2)(x² + 4) of linear factors with real roots and a quadratic factor for which no further factorization over the reals can be carried out. In this lecture we shall look at the first of these matrix factorizations - the so-called LU-Decomposition and its refinement, the LDU-Decomposition - where the basic factors are the elementary matrices of the last lecture and the factorization stops at the row echelon form. Let's start.

Some simple hand calculations show that for each 2 × 2 matrix we have the Gauss Decomposition:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ \frac{c}{a} & 1 \end{bmatrix}
\begin{bmatrix} a & 0 \\ 0 & \frac{ad-bc}{a} \end{bmatrix}
\begin{bmatrix} 1 & \frac{b}{a} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ \frac{c}{a} & 1 \end{bmatrix}
\begin{bmatrix} a & b \\ 0 & \frac{ad-bc}{a} \end{bmatrix}, \qquad a \neq 0.$$

Notice that in the 3-term factorization the first and third factors are triangular matrices with 1's along the diagonal, the first L(ower) and the third U(pper), while the middle factor is a D(iagonal) matrix. This is an example of the so-called LDU-decomposition of a matrix. On the other hand, in the 2-term factorization both factors are triangular matrices, the first Lower and the second Upper, but now the second one allows diagonal entries which need not be 1. It is an example of the important LU-decomposition of a matrix. As we shall see shortly, this decomposition - possibly the most important factorization of all - comes from the method of elimination for solving systems of linear equations.

But what information might we get from such decompositions? Well, note the particular factors occurring in the Gauss Decomposition of a 2 × 2 matrix: if ad − bc ≠ 0, then each factor is an elementary matrix as defined in the previous lecture, and so invertible. In fact, by the known inverses given in Lecture 06,

$$\begin{bmatrix} 1 & 0 \\ \frac{c}{a} & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & 0 \\ -\frac{c}{a} & 1 \end{bmatrix}, \qquad
\begin{bmatrix} a & 0 \\ 0 & \frac{ad-bc}{a} \end{bmatrix}^{-1}
= \begin{bmatrix} \frac{1}{a} & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}, \qquad
\begin{bmatrix} 1 & \frac{b}{a} \\ 0 & 1 \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}.$$

On the other hand, (AB)⁻¹ = B⁻¹A⁻¹. Thus by the Gauss Decomposition,

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
= \begin{bmatrix} 1 & -\frac{b}{a} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \frac{1}{a} & 0 \\ 0 & \frac{a}{ad-bc} \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{c}{a} & 1 \end{bmatrix}
= \begin{bmatrix} \frac{d}{ad-bc} & -\frac{b}{ad-bc} \\[2pt] -\frac{c}{ad-bc} & \frac{a}{ad-bc} \end{bmatrix},$$

as a matrix product calculation shows. Why bother? We knew this already from a simple direct calculation! That's the point: except for the 2 × 2 case, there's no comparably simple formula for computing the inverse of a general n × n matrix.
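To make this concrete, here is a minimal NumPy sketch (the helper name gauss_decomposition_2x2 is ours, not from the lectures) that builds the three Gauss factors of a 2 × 2 matrix and checks that multiplying them recovers A, and that inverting them in reverse order recovers A⁻¹:

```python
import numpy as np

def gauss_decomposition_2x2(A):
    """Return the L, D, U factors of a 2x2 matrix [[a, b], [c, d]]
    from the Gauss Decomposition above; requires a != 0."""
    (a, b), (c, d) = A
    if a == 0:
        raise ValueError("Gauss Decomposition requires a != 0")
    L = np.array([[1.0, 0.0], [c / a, 1.0]])
    D = np.array([[a, 0.0], [0.0, (a * d - b * c) / a]])
    U = np.array([[1.0, b / a], [0.0, 1.0]])
    return L, D, U

A = np.array([[2.0, 3.0], [4.0, 5.0]])
L, D, U = gauss_decomposition_2x2(A)
print(np.allclose(L @ D @ U, A))             # True: A = LDU
# (AB)^-1 = B^-1 A^-1, so invert the easy factors in reverse order:
A_inv = np.linalg.inv(U) @ np.linalg.inv(D) @ np.linalg.inv(L)
print(np.allclose(A_inv, np.linalg.inv(A)))  # True
```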
But if we can decompose a general matrix A into simpler components, each of which we know how to invert, then A⁻¹ can be calculated as a product, just as we've shown using the Gauss Decomposition. What's more, as in the 2 × 2 case where the condition ad − bc ≠ 0 was needed for the diagonal matrix to be inverted, use of a decomposition often shows what conditions need to be imposed for A⁻¹ to exist.

As we saw in Lecture 05, each of the elementary matrices in the 2 × 2 Gauss Decomposition determines a particularly special geometric transformation of the plane: in the 3-term factorization, the lower and upper triangular matrices correspond to shearing the xy-plane in the y and x-directions, while the diagonal matrix provides a stretching of the plane away from, or towards, the origin (dilation). But without the Gauss Decomposition, would you have guessed that every invertible 2 × 2 matrix with a ≠ 0 determines a transformation of the plane that can be written as a composition of shearings and a dilation?

To handle the case when a = 0 and c ≠ 0, interchange rows by writing:

$$\begin{bmatrix} 0 & b \\ c & d \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} c & d \\ 0 & b \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} c & 0 \\ 0 & b \end{bmatrix}
\begin{bmatrix} 1 & \frac{d}{c} \\ 0 & 1 \end{bmatrix}.$$

Do the calculation to see why the lower triangular matrix didn't appear! Notice that the first matrix in this decomposition is again an elementary matrix - the one interchanging row 1 and row 2. As a geometric transformation it reflects the xy-plane in the line y = x. So both of the previous interpretations still apply whether a = 0 or a ≠ 0.

What happens, however, if a = c = 0? Well,

$$\begin{bmatrix} 0 & b \\ 0 & d \end{bmatrix}$$

is never invertible, so it cannot be written as a product of invertible elementary matrices at all. This suggests hypotheses for general results:

Fundamental Theorem 1: if an n × n matrix A can be reduced to row echelon form without row interchanges, then A has an LU-decomposition, where L is lower triangular with entries 1 on the diagonal and U is upper triangular.

Fundamental Theorem 2: if an n × n matrix A can be reduced to row echelon form possibly with row interchanges, then A has a PLU-decomposition, where P is a product of row interchange elementary matrices, L is lower triangular with entries 1 on the diagonal and U is upper triangular.

Fundamental Theorem 2 is the version that's most often used in large-scale computations. But rather than prove the existence of either decomposition in generality, let's concentrate on using a given decomposition to solve a system of linear equations.

Example 1: solve the equation Ax = b when

$$A = \begin{bmatrix} 3 & -7 & -2 \\ -3 & 5 & 1 \\ 6 & -4 & 0 \end{bmatrix}, \qquad
b = \begin{bmatrix} -7 \\ 5 \\ 2 \end{bmatrix},$$

and A has the LU-decomposition

$$A = LU = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -5 & 1 \end{bmatrix}
\begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{bmatrix}.$$

Solution: set y = Ux. Then Ax = L(Ux) = Ly = b, and so y = L⁻¹b. Now by the known inverse of 3 × 3 lower triangular matrices given in Lecture 06,

$$L^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 3 & 5 & 1 \end{bmatrix},$$

in which case the equation Ax = b reduces to

$$Ux = L^{-1}b
= \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 3 & 5 & 1 \end{bmatrix}
\begin{bmatrix} -7 \\ 5 \\ 2 \end{bmatrix}
= \begin{bmatrix} -7 \\ -2 \\ 6 \end{bmatrix}.$$

But then

$$Ux = \begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} -7 \\ -2 \\ 6 \end{bmatrix}.$$

Because U is in upper triangular form, this last equation can be solved in several different ways. For example, the associated augmented matrix

$$\left[\begin{array}{ccc|c} 3 & -7 & -2 & -7 \\ 0 & -2 & -1 & -2 \\ 0 & 0 & -1 & 6 \end{array}\right]$$

is in echelon form, so the solutions can be read off by back substitution as we did earlier in Lecture 01:

$$x_3 = -6, \qquad x_2 = 4, \qquad x_1 = 3.$$

Thus solving the equation Ax = b has been reduced to computations with triangular matrices, which are always much simpler to handle than general matrices, even for matrices beyond 3 × 3.
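Since forward and back substitution carry the whole computational content of the solution, here is a minimal NumPy sketch of Example 1 (the helper names forward_substitution and back_substitution are ours): it solves Ly = b by working down the rows, then Ux = y by working up.

```python
import numpy as np

# L, U and b from Example 1.
L = np.array([[1., 0., 0.], [-1., 1., 0.], [2., -5., 1.]])
U = np.array([[3., -7., -2.], [0., -2., -1.], [0., 0., -1.]])
b = np.array([-7., 5., 2.])

def forward_substitution(L, b):
    """Solve Ly = b for lower triangular L, top row first."""
    y = np.zeros_like(b)
    for i in range(len(b)):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve Ux = y for upper triangular U, bottom row first."""
    x = np.zeros_like(y)
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

y = forward_substitution(L, b)  # [-7, -2, 6], matching L^-1 b above
x = back_substitution(U, y)     # [3, 4, -6], i.e. x1 = 3, x2 = 4, x3 = -6
print(x)
```

SciPy packages these same two steps as scipy.linalg.solve_triangular, which is what you'd reach for beyond 3 × 3.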
The price to be paid is that first we must compute LU-decompositions such as

$$A = \begin{bmatrix} 3 & -7 & -2 \\ -3 & 5 & 1 \\ 6 & -4 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -5 & 1 \end{bmatrix}
\begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{bmatrix} = LU$$

in Example 1. How can this be done? The key is to remember that U is in echelon form, so the U term should come from row reduction of A (Lecture 02):

$$A \xrightarrow{R_2 + R_1}
\begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 6 & -4 & 0 \end{bmatrix}
\xrightarrow{R_3 - 2R_1}
\begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 10 & 4 \end{bmatrix}
\xrightarrow{R_3 + 5R_2}
\begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{bmatrix} = U.$$

But this corresponds to left multiplication by the appropriate elementary matrices (Lecture 06), giving

$$E_{23}(5)\, E_{13}(-2)\, E_{12}(1)\, A
= \begin{bmatrix} 3 & -7 & -2 \\ 0 & -2 & -1 \\ 0 & 0 & -1 \end{bmatrix} = U.$$

Since elementary matrices are invertible, we thus see that A = LU where (Lecture 06)

$$L = E_{12}(1)^{-1}\, E_{13}(-2)^{-1}\, E_{23}(5)^{-1}
= \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -5 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & -5 & 1 \end{bmatrix},$$

as a computation shows. These hand computations show how U and L can be computed using only the ideas we've developed in previous lectures; the same bookkeeping is sketched in code below. In practice, of course, computer algebra systems like Mathematica, MATLAB, Maple and Wolfram Alpha all contain routines for carrying out the calculations electronically for m × n matrices far beyond m = n = 3. Use them!! Nonetheless, these hand calculations can be turned into a proof of Fundamental Theorems 1 and 2. We omit the details!!
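Here is a minimal NumPy sketch of that bookkeeping (the helper name lu_no_pivot is ours): eliminate below each pivot and record the multiplier in L, assuming, as in Fundamental Theorem 1, that no row interchanges are needed.

```python
import numpy as np

def lu_no_pivot(A):
    """LU by Gaussian elimination without row interchanges.
    A sketch only: assumes every pivot is nonzero."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = U[i, j] / U[j, j]   # multiplier in R_i -> R_i - m R_j
            L[i, j] = m             # undoing that row operation puts m into L
            U[i, j:] -= m * U[j, j:]
    return L, U

A = np.array([[3., -7., -2.], [-3., 5., 1.], [6., -4., 0.]])
L, U = lu_no_pivot(A)
print(L)                      # [[1, 0, 0], [-1, 1, 0], [2, -5, 1]], as by hand
print(U)                      # [[3, -7, -2], [0, -2, -1], [0, 0, -1]]
print(np.allclose(L @ U, A))  # True
```

For the PLU-decomposition of Fundamental Theorem 2, library routines handle the row interchanges for you: scipy.linalg.lu(A), for example, returns matrices P, L, U with A = PLU.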