Stat 309: Mathematical Computations I Fall 2014 Lecture 8
• we look at LU factorization and some of its variants: condensed LU, LDU, LDL^T, and Cholesky factorizations

1. existence of lu factorization

• the solution method for a linear system Ax = b depends on the structure of A: A may be a sparse or dense matrix, or it may have one of many well-known structures, such as being a banded matrix or a Hankel matrix
• for the general case of a dense, unstructured matrix A, the most common method is to obtain a decomposition A = LU, where L is lower triangular and U is upper triangular
• this decomposition is called the LU factorization or LU decomposition
• we deduce its existence via a constructive proof, namely, Gaussian elimination
• the motivation for this is something you learnt in middle school, i.e., solving Ax = b by eliminating variables:
\[
\begin{aligned}
a_{11}x_1 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{n1}x_1 + \cdots + a_{nn}x_n &= b_n
\end{aligned}
\]
• we proceed by multiplying the first equation by -a_{21}/a_{11} and adding it to the second equation, and in general multiplying the first equation by -a_{i1}/a_{11} and adding it to equation i; this leaves you with the equivalent system
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
0\cdot x_1 + a_{22}'x_2 + \cdots + a_{2n}'x_n &= b_2'\\
&\;\;\vdots\\
0\cdot x_1 + a_{n2}'x_2 + \cdots + a_{nn}'x_n &= b_n'
\end{aligned}
\]
• continuing in this fashion, adding multiples of the second equation to each subsequent equation to make all elements below the diagonal equal to zero, you obtain an upper triangular system and may then solve for x_n, x_{n-1}, \ldots, x_1 by back substitution
• getting the LU factorization A = LU is very similar; the main difference is that you want not just the final upper triangular matrix (which is your U) but also to keep track of all the elimination steps (which is your L)
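The elimination-plus-back-substitution procedure described above can be sketched in a few lines of numpy. This is an illustrative helper (the function name is mine, not from the notes), and it assumes every pivot encountered is nonzero:

```python
import numpy as np

def solve_by_elimination(A, b):
    """Solve Ax = b by Gaussian elimination (no pivoting) and back substitution.

    A sketch of the variable-elimination procedure above; assumes every
    pivot encountered along the way is nonzero.
    """
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    # forward elimination: zero out entries below the diagonal, column by column
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]      # multiplier; row i gets -m times row k added
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    # back substitution on the resulting upper triangular system
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x
```

For instance, on the 2x2 system 2x_1 + x_2 = 3, x_1 + 3x_2 = 5 this returns x_1 = 0.8, x_2 = 1.4.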
2. Gaussian elimination revisited

Date: November 5, 2014, version 1.0. Comments, bug reports: [email protected].

• we are going to look at Gaussian elimination in a slightly different light from what you learnt in your undergraduate linear algebra class
• we think of Gaussian elimination as a process of transforming A into an upper triangular matrix U; this process is equivalent to multiplying A by a sequence of matrices to obtain U
• but instead of elementary matrices, we consider again a rank-1 change to I, i.e., a matrix of the form I - uv^T, where u, v \in R^n
• in Householder QR, we used Householder reflection matrices of the form H = I - 2uu^T
• in Gaussian elimination, we use so-called Gauss transformation or elimination matrices of the form M = I - me_i^T, where e_i = [0, \ldots, 1, \ldots, 0]^T is the ith standard basis vector
• the same trick that led us to the appropriate u in the Householder matrix can be applied to find the appropriate m too: suppose we want M_1 = I - m_1e_1^T to 'zero out' all the entries beneath the first entry of a vector a, i.e.,
\[
M_1 \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix} = \begin{bmatrix} \gamma\\ 0\\ \vdots\\ 0 \end{bmatrix},
\]
i.e.,
\[
(I - m_1e_1^T)a = \gamma e_1, \qquad a - (e_1^Ta)m_1 = \gamma e_1, \qquad a_1m_1 = a - \gamma e_1,
\]
and if a_1 \ne 0, then we may set
\[
\gamma = a_1, \qquad m_1 = \begin{bmatrix} 0\\ a_2/a_1\\ \vdots\\ a_n/a_1 \end{bmatrix}
\]
• so we get
\[
M_1 = I - m_1e_1^T = \begin{bmatrix} 1 & & & \\ -a_2/a_1 & 1 & & \\ \vdots & & \ddots & \\ -a_n/a_1 & & & 1 \end{bmatrix}
\]
and, as required,
\[
M_1a = \begin{bmatrix} 1 & & & \\ -a_2/a_1 & 1 & & \\ \vdots & & \ddots & \\ -a_n/a_1 & & & 1 \end{bmatrix} \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n \end{bmatrix} = \begin{bmatrix} a_1\\ 0\\ \vdots\\ 0 \end{bmatrix} = a_1e_1
\]
• applying this to zero out the entries beneath a_{11} in the first column of a matrix A, we get M_1A = A_2, where
\[
A_2 = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ 0 & a_{22}^{(2)} & \cdots & a_{2n}^{(2)}\\ \vdots & \vdots & & \vdots\\ 0 & a_{n2}^{(2)} & \cdots & a_{nn}^{(2)} \end{bmatrix}
\]
where the superscripts in parentheses denote that the entries have changed
• we will write
\[
M_1 = \begin{bmatrix} 1 & & & \\ -\ell_{21} & 1 & & \\ \vdots & & \ddots & \\ -\ell_{n1} & & & 1 \end{bmatrix}, \qquad \ell_{i1} = \frac{a_{i1}}{a_{11}}, \quad i = 2, \ldots, n
\]
• if we do this recursively, defining M_2 by
\[
M_2 = \begin{bmatrix} 1 & & & & \\ 0 & 1 & & & \\ 0 & -\ell_{32} & 1 & & \\ \vdots & \vdots & & \ddots & \\ 0 & -\ell_{n2} & & & 1 \end{bmatrix}, \qquad \ell_{i2} = \frac{a_{i2}^{(2)}}{a_{22}^{(2)}},
\]
for i = 3, \ldots, n, then
\[
M_2A_2 = A_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n}\\ 0 & a_{22}^{(2)} & a_{23}^{(2)} & \cdots & a_{2n}^{(2)}\\ 0 & 0 & a_{33}^{(3)} & \cdots & a_{3n}^{(3)}\\ \vdots & \vdots & \vdots & & \vdots\\ 0 & 0 & a_{n3}^{(3)} & \cdots & a_{nn}^{(3)} \end{bmatrix}
\]
• in general, we have
\[
M_k = \begin{bmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & -\ell_{k+1,k} & 1 & & \\ & & \vdots & & \ddots & \\ & & -\ell_{nk} & & & 1 \end{bmatrix}, \qquad \ell_{ik} = \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}, \quad i = k+1, \ldots, n,
\]
and
\[
M_{n-1}M_{n-2} \cdots M_1A = A_n \equiv \begin{bmatrix} u_{11} & \cdots & & u_{1n}\\ 0 & u_{22} & \cdots & u_{2n}\\ \vdots & & \ddots & \vdots\\ 0 & \cdots & 0 & u_{nn} \end{bmatrix}
\]
or, equivalently,
\[
A = M_1^{-1}M_2^{-1} \cdots M_{n-1}^{-1}U
\]
• it turns out that M_j^{-1} is very easy to compute; we claim that
\[
M_1^{-1} = \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \vdots & & \ddots & \\ \ell_{n1} & & & 1 \end{bmatrix} \tag{2.1}
\]
• to see this, consider the product
\[
M_1M_1^{-1} = \begin{bmatrix} 1 & & & \\ -\ell_{21} & 1 & & \\ \vdots & & \ddots & \\ -\ell_{n1} & & & 1 \end{bmatrix} \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \vdots & & \ddots & \\ \ell_{n1} & & & 1 \end{bmatrix},
\]
which can easily be verified to equal the identity matrix
• in general, we have
\[
M_k^{-1} = \begin{bmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & \ell_{k+1,k} & 1 & & \\ & & \vdots & & \ddots & \\ & & \ell_{nk} & & & 1 \end{bmatrix} \tag{2.2}
\]
• now, consider the product
\[
M_1^{-1}M_2^{-1} = \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \ell_{31} & 0 & 1 & \\ \vdots & \vdots & & \ddots\\ \ell_{n1} & 0 & & & 1 \end{bmatrix} \begin{bmatrix} 1 & & & \\ 0 & 1 & & \\ 0 & \ell_{32} & 1 & \\ \vdots & \vdots & & \ddots\\ 0 & \ell_{n2} & & & 1 \end{bmatrix} = \begin{bmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \ell_{31} & \ell_{32} & 1 & \\ \vdots & \vdots & & \ddots\\ \ell_{n1} & \ell_{n2} & & & 1 \end{bmatrix}
\]
• so inductively we get
\[
M_1^{-1}M_2^{-1} \cdots M_{n-1}^{-1} = \begin{bmatrix} 1 & & & & \\ \ell_{21} & \ddots & & & \\ \vdots & \ell_{32} & \ddots & & \\ \vdots & \vdots & \ddots & \ddots & \\ \ell_{n1} & \ell_{n2} & \cdots & \ell_{n,n-1} & 1 \end{bmatrix}
\]
• it follows that under proper circumstances, we can write A = LU where
\[
L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0\\ \ell_{21} & 1 & 0 & \cdots & 0\\ \ell_{31} & \ell_{32} & \ddots & & \vdots\\ \vdots & \vdots & \ddots & \ddots & 0\\ \ell_{n1} & \ell_{n2} & \cdots & \ell_{n,n-1} & 1 \end{bmatrix}, \qquad U = \begin{bmatrix} u_{11} & u_{12} & \cdots & \cdots & u_{1n}\\ 0 & u_{22} & \cdots & \cdots & u_{2n}\\ 0 & 0 & \ddots & & \vdots\\ \vdots & & & \ddots & \vdots\\ 0 & 0 & \cdots & 0 & u_{nn} \end{bmatrix}
\]
• what exactly are proper circumstances?
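The recursion above, with L collecting the multipliers \ell_{ik} (signs flipped, as (2.1) and (2.2) show), can be sketched as follows. This is an illustrative implementation (the function name is mine), valid under the assumption that every pivot a_{kk}^{(k)} is nonzero:

```python
import numpy as np

def lu_nopivot(A):
    """LU factorization without pivoting, sketching M_{n-1}...M_1 A = U.

    Returns a unit lower triangular L and an upper triangular U with
    A = L @ U.  Assumes every pivot a_kk^(k) encountered is nonzero.
    """
    A = np.array(A, dtype=float)   # working copy; reduced to U in place
    n = A.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        if A[k, k] == 0:
            raise ZeroDivisionError("zero pivot: LU without pivoting fails")
        # multipliers l_ik = a_ik^(k) / a_kk^(k); these are exactly the
        # subdiagonal entries of M_k^{-1}, hence of L
        L[k+1:, k] = A[k+1:, k] / A[k, k]
        # applying M_k: subtract multiples of row k to zero out column k below it
        A[k+1:, k:] -= np.outer(L[k+1:, k], A[k, k:])
    return L, A
```

Note that L is assembled at no extra cost: by (2.1) and (2.2), inverting each M_k merely flips the signs of its multipliers, and the product of the inverses just stacks them column by column.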
• we must have a_{kk}^{(k)} \ne 0, or we cannot proceed with the decomposition
• for example, if
\[
A = \begin{bmatrix} 0 & 1 & 11\\ 3 & 7 & 2\\ 2 & 9 & 3 \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} 1 & 3 & 4\\ 2 & 6 & 4\\ 7 & 1 & 2 \end{bmatrix},
\]
Gaussian elimination will fail; note that both matrices are nonsingular
• in the first case, it fails immediately; in the second case, it fails after the subdiagonal entries in the first column are zeroed, and we find that a_{22}^{(2)} = 0
• in general, we must have det A_{ii} \ne 0 for i = 1, \ldots, n, where
\[
A_{ii} = \begin{bmatrix} a_{11} & \cdots & a_{1i}\\ \vdots & & \vdots\\ a_{i1} & \cdots & a_{ii} \end{bmatrix},
\]
for the LU factorization to exist
• the existence of the LU factorization (without pivoting) can be guaranteed by several conditions; one example is column diagonal dominance:¹ if a nonsingular A \in R^{n \times n} satisfies
\[
|a_{jj}| \ge \sum_{\substack{i=1\\ i \ne j}}^{n} |a_{ij}|, \qquad j = 1, \ldots, n,
\]
then one can guarantee that Gaussian elimination as described above produces A = LU with |\ell_{ij}| \le 1
• there are necessary and sufficient conditions guaranteeing the existence of the LU decomposition, but those are difficult to check in practice and we do not state them here
• how can we obtain the LU factorization for a general nonsingular matrix?
• if A is nonsingular, then some element of the first column must be nonzero
• if a_{i1} \ne 0, then we can interchange row i with row 1 and proceed
• this is equivalent to multiplying A by a permutation matrix \Pi_1 that interchanges row 1 and row i:
\[
\Pi_1 = \begin{bmatrix}
0 & \cdots & \cdots & 0 & 1 & 0 & \cdots & 0\\
 & 1 & & & & & & \\
 & & \ddots & & & & & \\
 & & & 1 & & & & \\
1 & 0 & \cdots & \cdots & 0 & \cdots & \cdots & 0\\
 & & & & & 1 & & \\
 & & & & & & \ddots & \\
 & & & & & & & 1
\end{bmatrix}
\]
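Both the failure mode and the column diagonal dominance condition are easy to probe numerically. A small sketch (the helper names are mine, for illustration only):

```python
import numpy as np

def is_column_diagonally_dominant(A):
    """Check |a_jj| >= sum over i != j of |a_ij| for every column j."""
    A = np.abs(np.array(A, dtype=float))
    # adding |a_jj| to both sides: 2|a_jj| >= full column sum
    return all(2 * A[j, j] >= A[:, j].sum() for j in range(A.shape[0]))

def first_zero_pivot(A):
    """Run elimination without pivoting; return the step k at which a zero
    pivot a_kk^(k) appears, or None if elimination runs to completion."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for k in range(n - 1):
        if A[k, k] == 0:
            return k
        A[k+1:, k:] -= np.outer(A[k+1:, k] / A[k, k], A[k, k:])
    return None
```

On the two 3x3 examples above, `first_zero_pivot` reports failure at steps k = 0 and k = 1 respectively, matching the discussion, and neither matrix is column diagonally dominant.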
• thus M_1\Pi_1A = A_2 (refer to earlier lecture notes for more information about permutation matrices)
• then, since A_2 is nonsingular, some element of column 2 of A_2 below the diagonal must be nonzero
• proceeding as before, we compute M_2\Pi_2A_2 = A_3, where \Pi_2 is another permutation matrix
• continuing, we obtain
\[
A = (M_{n-1}\Pi_{n-1} \cdots M_1\Pi_1)^{-1}U
\]
• it can easily be shown that \Pi A = LU where \Pi is a permutation matrix; easy but a bit of a pain because the notation is cumbersome
• so we will be informal, but you'll get the idea

¹ the usual type of diagonal dominance, i.e., |a_{ii}| \ge \sum_{j=1, j \ne i}^{n} |a_{ij}|, i = 1, \ldots, n, is called row diagonal dominance

• for example, if after two steps we get (recall that permutation matrices are orthogonal matrices)
\[
A = (M_2\Pi_2M_1\Pi_1)^{-1}A_3 = \Pi_1^TM_1^{-1}\Pi_2^TM_2^{-1}A_3 = \Pi_1^T\Pi_2^T(\Pi_2M_1^{-1}\Pi_2^T)M_2^{-1}A_3 = \Pi^TL_1L_2A_3,
\]
then
  – \Pi = \Pi_2\Pi_1 is a permutation matrix
  – L_2 = M_2^{-1} is a unit lower triangular matrix
  – L_1 = \Pi_2M_1^{-1}\Pi_2^T will always be a unit lower triangular matrix because M_1^{-1} is of the form in (2.1),
\[
M_1^{-1} = \begin{bmatrix} 1 & \\ \ell & I \end{bmatrix},
\]
whereas \Pi_2 must be of the form
\[
\Pi_2 = \begin{bmatrix} 1 & \\ & \hat{\Pi}_2 \end{bmatrix}
\]
for some (n-1) \times (n-1) permutation matrix \hat{\Pi}_2, and so
\[
\Pi_2M_1^{-1}\Pi_2^T = \begin{bmatrix} 1 & 0\\ \hat{\Pi}_2\ell & I \end{bmatrix};
\]
in other words, \Pi_2M_1^{-1}\Pi_2^T also has the form in (2.1)
• if we do one more step, we get
\[
A = (M_3\Pi_3M_2\Pi_2M_1\Pi_1)^{-1}A_4 = \Pi_1^TM_1^{-1}\Pi_2^TM_2^{-1}\Pi_3^TM_3^{-1}A_4 = \Pi_1^T\Pi_2^T\Pi_3^T(\Pi_3\Pi_2M_1^{-1}\Pi_2^T\Pi_3^T)(\Pi_3M_2^{-1}\Pi_3^T)M_3^{-1}A_4 = \Pi^TL_1L_2L_3A_4,
\]
where
  – \Pi = \Pi_3\Pi_2\Pi_1 is a permutation matrix
  – L_3 = M_3^{-1} is a unit lower triangular matrix
  – L_2 = \Pi_3M_2^{-1}\Pi_3^T will always be a unit lower triangular matrix because M_2^{-1} is of the form in (2.2),
\[
M_2^{-1} = \begin{bmatrix} 1 & & \\ & 1 & \\ & \ell & I \end{bmatrix},
\]
whereas \Pi_3 must be of the form
\[
\Pi_3 = \begin{bmatrix} 1 & & \\ & 1 & \\ & & \hat{\Pi}_3 \end{bmatrix}
\]
for some (n-2) \times (n-2) permutation matrix \hat{\Pi}_3, and so
\[
\Pi_3M_2^{-1}\Pi_3^T = \begin{bmatrix} 1 & & \\ & 1 & 0\\ & \hat{\Pi}_3\ell & I \end{bmatrix};
\]
in other words, \Pi_3M_2^{-1}\Pi_3^T also has the form in (2.2)
  – L_1 = \Pi_3\Pi_2M_1^{-1}\Pi_2^T\Pi_3^T will always be a unit lower triangular matrix for the same reason as above, because \Pi_3\Pi_2 must have the form
\[
\Pi_3\Pi_2 = \begin{bmatrix} 1 & & \\ & 1 & \\ & & \hat{\Pi}_3 \end{bmatrix} \begin{bmatrix} 1 & \\ & \hat{\Pi}_2 \end{bmatrix} = \begin{bmatrix} 1 & \\ & \hat{\Pi}_{32} \end{bmatrix}
\]
for
some (n-1) \times (n-1) permutation matrix
\[
\hat{\Pi}_{32} = \begin{bmatrix} 1 & \\ & \hat{\Pi}_3 \end{bmatrix} \hat{\Pi}_2
\]
• more generally, if we keep doing this, then
\[
A = \Pi^TL_1L_2 \cdots L_{n-1}A_n,
\]
where
  – \Pi = \Pi_{n-1}\Pi_{n-2} \cdots \Pi_1 is a permutation matrix
  – L_{n-1} = M_{n-1}^{-1} is a unit lower triangular matrix
  – L_k = \Pi_{n-1} \cdots \Pi_{k+1}M_k^{-1}\Pi_{k+1}^T \cdots \Pi_{n-1}^T is a unit lower triangular matrix for all k = 1, \ldots, n-2
  – A_n = U is an upper triangular matrix
  – L = L_1L_2 \cdots L_{n-1} is a unit lower triangular matrix
• this algorithm with the row permutations is called Gaussian elimination with partial pivoting, or GEPP for short; we will say more in the next section
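The whole GEPP scheme can be sketched as below. This is a hypothetical implementation (the function name is mine) that picks the largest-magnitude entry in the current column as the pivot, the standard partial-pivoting choice, which also yields multipliers with |\ell_{ij}| \le 1; it returns P, L, U with PA = LU:

```python
import numpy as np

def gepp(A):
    """Gaussian elimination with partial pivoting.

    Returns a permutation matrix P, unit lower triangular L, and upper
    triangular U with P @ A = L @ U.  The pivot at step k is the
    largest-magnitude entry in column k on or below the diagonal.
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    U = A.copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        # choose the row with the largest |entry| in column k as pivot row
        p = k + np.argmax(np.abs(U[k:, k]))
        if p != k:
            U[[k, p], k:] = U[[p, k], k:]   # swap rows in the active block of U
            L[[k, p], :k] = L[[p, k], :k]   # swap the already-computed multipliers
            P[[k, p], :] = P[[p, k], :]     # record the interchange in P
        # multipliers are at most 1 in magnitude by choice of pivot
        L[k+1:, k] = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return P, L, U
```

As a sanity check, this succeeds on the first example matrix above, where elimination without pivoting failed immediately on the zero in position (1, 1).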