STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2016 LECTURE 13

1. need for pivoting

• last time we showed that under proper circumstances, we can write A = LU where

 1 0 0 ··· 0 u11 u12 ······ u1n `21 1 0 ··· 0  0 u22 ······ u2n  . .  . .   .. .  .. .  L = `31 `32 . ,U =  0 0 .   . . . .   . . . .   ...... 0  . . .. .  `n1 `n2 ··· `n,n−1 1 0 0 ··· 0 unn • what exactly are proper circumstances? (k) • we must have akk 6= 0, or we cannot proceed with the decomposition • for example, if 0 1 11 1 3 4 A = 3 7 2  or A = 2 6 4 2 9 3 7 1 2 will fail; note that both matrices are nonsingular • in the first case, it fails immediately; in the second case, it fails after the subdiagonal entries (k) in the first column are zeroed, and we find that a22 = 0 • in general, we must have det Aii 6= 0 for i = 1, . . . , n where   a11 ··· a1i  . .  Aii =  . .  ai1 ··· aii for the LU factorization to exist • the existence of LU factorization (without pivoting) can be guaranteed by several conditions, 1 n×n one example is column diagonal dominance: if a nonsingular A ∈ R satisfies n X |ajj| ≥ |aij|, j = 1, . . . , n, i=1 i6=j then one can guarantee that Gaussian elimination as described above produces A = LU with |`ij| ≤ 1 • there are necessary and sufficient conditions guaranteeing the existence of LU decomposition but those are difficult to check in practice and we do not state them here • how can we obtain the LU factorization for a general nonsingular ? • if A is nonsingular, then some element of the first column must be nonzero • if ai1 6= 0, then we can interchange row i with row 1 and proceed

Date: November 12, 2016, version 1.1. Comments, bug reports: [email protected].
¹the usual type of diagonal dominance, i.e., $|a_{ii}| \geq \sum_{j=1,\, j \neq i}^{n} |a_{ij}|$, $i = 1, \dots, n$, is called row diagonal dominance

• this is equivalent to multiplying $A$ by a permutation matrix $\Pi_1$ that interchanges row $1$ and row $i$:
$$\Pi_1 = \begin{bmatrix} 0 & \cdots & \cdots & 0 & 1 & & & \\ & 1 & & & & & & \\ & & \ddots & & & & & \\ & & & 1 & & & & \\ 1 & 0 & \cdots & \cdots & 0 & \cdots & \cdots & 0\\ & & & & & 1 & & \\ & & & & & & \ddots & \\ & & & & & & & 1 \end{bmatrix}$$
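• to see the breakdown concretely, here is a minimal NumPy sketch of elimination without pivoting; the function name lu_nopivot and the error handling are ours, purely for illustration, and the last lines check that the row interchange $\Pi_1$ rescues the first example:

```python
import numpy as np

def lu_nopivot(A):
    """LU factorization WITHOUT pivoting; breaks down on a zero pivot.
    A bare-bones sketch for illustration only."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float)
    for k in range(n - 1):
        if U[k, k] == 0.0:
            raise ZeroDivisionError(f"zero pivot at step {k + 1}")
        L[k+1:, k] = U[k+1:, k] / U[k, k]              # multipliers
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])  # eliminate below pivot
    return L, U

A1 = np.array([[0., 1., 1.], [3., 7., 2.], [2., 9., 3.]])  # fails at step 1
A2 = np.array([[1., 3., 4.], [2., 6., 4.], [7., 1., 2.]])  # fails at step 2
for A in (A1, A2):
    try:
        lu_nopivot(A)
    except ZeroDivisionError as e:
        print(e)

# interchanging rows 1 and 2 of A1 (left-multiplying by Pi_1) rescues step 1
Pi1 = np.eye(3)[[1, 0, 2]]            # permutation matrix swapping rows 1 and 2
L, U = lu_nopivot(Pi1 @ A1)
print(np.allclose(L @ U, Pi1 @ A1))   # True
```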

• thus $M_1\Pi_1 A = A_2$ (refer to earlier lecture notes for more information about permutation matrices)
• then, since $A_2$ is nonsingular, some element of column 2 of $A_2$ below the diagonal must be nonzero
• proceeding as before, we compute $M_2\Pi_2 A_2 = A_3$, where $\Pi_2$ is another permutation matrix
• continuing, we obtain
$$A = (M_{n-1}\Pi_{n-1}\cdots M_1\Pi_1)^{-1} U$$
• it can easily be shown that $\Pi A = LU$ where $\Pi$ is a permutation matrix; easy, but a bit of a pain because the notation is cumbersome
• so we will be informal, but you'll get the idea
• for example, after two steps we get (recall that permutation matrices are orthogonal matrices)
$$A = (M_2\Pi_2 M_1\Pi_1)^{-1} A_3 = \Pi_1^T M_1^{-1}\Pi_2^T M_2^{-1} A_3 = \Pi_1^T\Pi_2^T\,(\Pi_2 M_1^{-1}\Pi_2^T)\,M_2^{-1} A_3 = \Pi^T L_1 L_2 A_3$$
then
– $\Pi = \Pi_2\Pi_1$ is a permutation matrix
– $L_2 = M_2^{-1}$ is a unit lower triangular matrix
– $L_1 = \Pi_2 M_1^{-1}\Pi_2^T$ will always be a unit lower triangular matrix because $M_1^{-1}$ is of the form
$$M_1^{-1} = \begin{bmatrix} 1 & 0\\ \ell & I \end{bmatrix} = \begin{bmatrix} 1 & & &\\ \ell_{21} & 1 & &\\ \vdots & & \ddots &\\ \ell_{n1} & & & 1 \end{bmatrix} \tag{1.1}$$

whereas $\Pi_2$ must be of the form
$$\Pi_2 = \begin{bmatrix} 1 & \\ & \widehat{\Pi}_2 \end{bmatrix}$$

for some $(n-1) \times (n-1)$ permutation matrix $\widehat{\Pi}_2$, and so
$$\Pi_2 M_1^{-1}\Pi_2^T = \begin{bmatrix} 1 & 0\\ \widehat{\Pi}_2\ell & I \end{bmatrix}$$

in other words, $\Pi_2 M_1^{-1}\Pi_2^T$ also has the form in (1.1)
• if we do one more step we get
$$A = (M_3\Pi_3 M_2\Pi_2 M_1\Pi_1)^{-1} A_4 = \Pi_1^T M_1^{-1}\Pi_2^T M_2^{-1}\Pi_3^T M_3^{-1} A_4 = \Pi_1^T\Pi_2^T\Pi_3^T\,(\Pi_3\Pi_2 M_1^{-1}\Pi_2^T\Pi_3^T)(\Pi_3 M_2^{-1}\Pi_3^T)\,M_3^{-1} A_4 = \Pi^T L_1 L_2 L_3 A_4$$
where
– $\Pi = \Pi_3\Pi_2\Pi_1$ is a permutation matrix
– $L_3 = M_3^{-1}$ is a unit lower triangular matrix
– $L_2 = \Pi_3 M_2^{-1}\Pi_3^T$ will always be a unit lower triangular matrix because $M_2^{-1}$ is of the form
$$M_2^{-1} = \begin{bmatrix} 1 & 0\\ 0 & \begin{bmatrix} 1 & 0\\ \ell & I \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 1 & & & &\\ 0 & 1 & & &\\ 0 & \ell_{32} & 1 & &\\ \vdots & \vdots & & \ddots &\\ 0 & \ell_{n2} & & & 1 \end{bmatrix} \tag{1.2}$$
whereas $\Pi_3$ must be of the form
$$\Pi_3 = \begin{bmatrix} 1 & &\\ & 1 &\\ & & \widehat{\Pi}_3 \end{bmatrix}$$

for some $(n-2) \times (n-2)$ permutation matrix $\widehat{\Pi}_3$, and so
$$\Pi_3 M_2^{-1}\Pi_3^T = \begin{bmatrix} 1 & & \\ & 1 & 0\\ & \widehat{\Pi}_3\ell & I \end{bmatrix}$$

in other words, $\Pi_3 M_2^{-1}\Pi_3^T$ also has the form in (1.2)
– $L_1 = \Pi_3\Pi_2 M_1^{-1}\Pi_2^T\Pi_3^T$ will always be a unit lower triangular matrix for the same reason as above, because $\Pi_3\Pi_2$ must have the form
$$\Pi_3\Pi_2 = \begin{bmatrix} 1 & &\\ & 1 &\\ & & \widehat{\Pi}_3 \end{bmatrix}\begin{bmatrix} 1 &\\ & \widehat{\Pi}_2 \end{bmatrix} = \begin{bmatrix} 1 &\\ & \Pi_{32} \end{bmatrix}$$
for some $(n-1) \times (n-1)$ permutation matrix
$$\Pi_{32} = \begin{bmatrix} 1 &\\ & \widehat{\Pi}_3 \end{bmatrix}\widehat{\Pi}_2$$
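• a quick numerical sanity check of the structural fact used above, namely that conjugating $M_1^{-1}$ by a permutation fixing the first row and column preserves unit lower triangularity; the sketch (with an arbitrarily chosen permutation and dimension) is ours:

```python
import numpy as np

# check that Pi_2 M_1^{-1} Pi_2^T is again unit lower triangular when
# Pi_2 only permutes rows/columns 2..n
n = 5
M1inv = np.eye(n)
M1inv[1:, 0] = np.random.randn(n - 1)   # the form (1.1): ell in column 1
Pi2 = np.eye(n)[[0, 3, 2, 1, 4]]        # fixes row 1, permutes the rest
B = Pi2 @ M1inv @ Pi2.T
print(np.allclose(np.diag(B), 1.0))     # unit diagonal
print(np.allclose(np.triu(B, 1), 0.0))  # nothing above the diagonal
```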

• more generally, if we keep doing this, then
$$A = \Pi^T L_1 L_2 \cdots L_{n-1} A_n$$
where
– $\Pi = \Pi_{n-1}\Pi_{n-2}\cdots\Pi_1$ is a permutation matrix
– $L_{n-1} = M_{n-1}^{-1}$ is a unit lower triangular matrix
– $L_k = \Pi_{n-1}\cdots\Pi_{k+1}M_k^{-1}\Pi_{k+1}^T\cdots\Pi_{n-1}^T$ is a unit lower triangular matrix for all $k = 1,\dots,n-2$
– $A_n = U$ is an upper triangular matrix
– $L = L_1L_2\cdots L_{n-1}$ is a unit lower triangular matrix, and so $\Pi A = LU$
• this algorithm with the row permutations is called Gaussian elimination with partial pivoting, or gepp for short; we will say more in the next section

2. pivoting strategies

• the $(k,k)$ entry at step $k$ during Gaussian elimination is called the pivoting entry, or just pivot for short
• in the preceding section, we said that if the pivoting entry is zero, i.e., $a_{kk}^{(k)} = 0$, then we just need to find a nonzero entry below it in the same column, i.e., $a_{jk}^{(k)}$ for some $j > k$, and then permute this entry into the pivoting position before carrying on with the algorithm
• but it is really better to choose the largest entry below the pivot, and not just any nonzero entry
• that is, the permutation $\Pi_k$ is chosen so that row $k$ is interchanged with row $j$, where
$$|a_{jk}^{(k)}| = \max_{i=k,k+1,\dots,n} |a_{ik}^{(k)}|$$
• this guarantees that $|\ell_{jk}| \leq 1$ for all $j$ and $k$
• this strategy is known as partial pivoting, and it is guaranteed to produce an LU factorization if $A \in \mathbb{R}^{m \times n}$ has full row rank, i.e., $\mathrm{rank}(A) = m \leq n$ (see the code sketch below)
• another common strategy is complete pivoting, which uses both row and column interchanges to ensure that at step $k$ of the algorithm, the element $a_{kk}^{(k)}$ is the largest element in absolute value in the entire submatrix obtained by deleting the first $k-1$ rows and columns, i.e.,
$$|a_{kk}^{(k)}| = \max_{\substack{i=k,k+1,\dots,n\\ j=k,k+1,\dots,n}} |a_{ij}^{(k)}|$$
• in this case we need both row and column permutation matrices, i.e., we get

$$\Pi_1 A \Pi_2 = LU$$
when we do complete pivoting
• complete pivoting is necessary when $\mathrm{rank}(A) < \min\{m, n\}$
• there are yet other pivoting strategies, due to considerations such as preserving sparsity (if you're interested, look up the minimum degree algorithm or the Markowitz algorithm) or a tradeoff between partial and complete pivoting (e.g., rook pivoting)
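• to make gepp concrete, here is a minimal NumPy sketch (the function name gepp is ours; in practice one would call a library routine such as scipy.linalg.lu); note the check that the multipliers are bounded by 1:

```python
import numpy as np

def gepp(A):
    """Gaussian elimination with partial pivoting: returns P, L, U with P @ A = L @ U.
    A minimal dense sketch, not production code."""
    A = A.astype(float)
    n = A.shape[0]
    P, L, U = np.eye(n), np.eye(n), A.copy()
    for k in range(n - 1):
        # choose the largest entry (in absolute value) on or below the diagonal of column k
        j = k + np.argmax(np.abs(U[k:, k]))
        U[[k, j]], P[[k, j]] = U[[j, k]], P[[j, k]]   # interchange rows k and j
        L[[k, j], :k] = L[[j, k], :k]                 # permute multipliers already stored
        L[k+1:, k] = U[k+1:, k] / U[k, k]             # multipliers, bounded by 1
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return P, L, U

A = np.array([[0., 1., 1.], [3., 7., 2.], [2., 9., 3.]])
P, L, U = gepp(A)
print(np.allclose(P @ A, L @ U))    # True
print(np.max(np.abs(L)) <= 1.0)     # True: multipliers bounded by 1
```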

3. uniqueness of the LU factorization

• the LU decomposition of a nonsingular matrix, if it exists (i.e., without row or column permutations), is unique
• suppose $A$ has two LU decompositions, $A = L_1U_1$ and $A = L_2U_2$
• from $L_1U_1 = L_2U_2$ we obtain $L_2^{-1}L_1 = U_2U_1^{-1}$
• the inverse of a unit lower triangular matrix is a unit lower triangular matrix, and the product of two unit lower triangular matrices is a unit lower triangular matrix, so $L_2^{-1}L_1$ must be a unit lower triangular matrix
• similarly, $U_2U_1^{-1}$ is an upper triangular matrix
• the only matrix that is both upper triangular and unit lower triangular is the identity matrix $I$, so we must have $L_1 = L_2$ and $U_1 = U_2$

4. Gauss–Jordan elimination

• a variant of Gaussian elimination is called Gauss–Jordan elimination
• it entails zeroing elements above the diagonal as well as below, transforming an $m \times n$ matrix into reduced row echelon form, i.e., a form where all pivoting entries in $U$ are 1 and all entries above the pivots are zeros
• this is what you probably learnt in your undergraduate class, e.g.,
$$A = \begin{bmatrix} 1 & 3 & 1 & 9\\ 1 & 1 & -1 & 1\\ 3 & 11 & 5 & 35 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 1 & 9\\ 0 & -2 & -2 & -8\\ 0 & 2 & 2 & 8 \end{bmatrix} \to \begin{bmatrix} 1 & 3 & 1 & 9\\ 0 & -2 & -2 & -8\\ 0 & 0 & 0 & 0 \end{bmatrix} \to \begin{bmatrix} 1 & 0 & -2 & -3\\ 0 & 1 & 1 & 4\\ 0 & 0 & 0 & 0 \end{bmatrix} = U$$
• the main drawback is that the elimination process can be numerically unstable, since the multipliers can be large
• furthermore, the way it is done in undergraduate linear algebra courses, the elimination matrices (i.e., the $L$ and $\Pi$) are not stored
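• a bare-bones sketch of Gauss–Jordan elimination, with a simple tolerance-based pivot test (all names are our own choices); it reproduces the reduced row echelon form computed by hand above:

```python
import numpy as np

def gauss_jordan(A, tol=1e-12):
    """Reduce A to reduced row echelon form; a minimal sketch with no
    safeguards against large multipliers (hence the potential instability)."""
    R = A.astype(float)
    m, n = R.shape
    row = 0
    for col in range(n):
        if row >= m:
            break
        j = row + np.argmax(np.abs(R[row:, col]))   # candidate pivot in this column
        if np.abs(R[j, col]) < tol:
            continue                                # no pivot in this column
        R[[row, j]] = R[[j, row]]
        R[row] /= R[row, col]                       # scale pivot to 1
        for i in range(m):
            if i != row:
                R[i] -= R[i, col] * R[row]          # zero entries above AND below
        row += 1
    return R

A = np.array([[1, 3, 1, 9], [1, 1, -1, 1], [3, 11, 5, 35]])
print(gauss_jordan(A))   # matches the U computed by hand above
```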

5. condensed LU factorization

• just like the QR and SVD, the LU factorization with complete pivoting has a condensed form too
• let $A \in \mathbb{R}^{m \times n}$ with $\mathrm{rank}(A) = r \leq \min\{m, n\}$; recall that gecp yields

$$\Pi_1 A \Pi_2 = LU = \begin{bmatrix} L_{11} & 0\\ L_{21} & I_{m-r} \end{bmatrix} \begin{bmatrix} U_{11} & U_{12}\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} L_{11}\\ L_{21} \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \end{bmatrix} =: \widetilde{L}\widetilde{U}$$
where $L_{11} \in \mathbb{R}^{r \times r}$ is unit lower triangular (thus nonsingular) and $U_{11} \in \mathbb{R}^{r \times r}$ is also nonsingular
• note that $\widetilde{L} \in \mathbb{R}^{m \times r}$ and $\widetilde{U} \in \mathbb{R}^{r \times n}$ and so

$$A = (\Pi_1^T \widetilde{L})(\widetilde{U}\Pi_2^T)$$
is a rank-retaining factorization
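• a sketch of gecp that returns the condensed factors; the tolerance-based rank decision and all names (gecp_condensed, the test matrix) are our own choices for illustration:

```python
import numpy as np

def gecp_condensed(A, tol=1e-12):
    """LU with complete pivoting, returning the condensed rank-retaining
    factors L~ (m x r) and U~ (r x n), plus the row/column permutations."""
    U = A.astype(float)
    m, n = U.shape
    L = np.eye(m)
    p, q = np.arange(m), np.arange(n)    # permutations stored as index vectors
    r = 0
    for k in range(min(m, n)):
        # largest remaining entry in absolute value, over the trailing block
        i, j = divmod(np.argmax(np.abs(U[k:, k:])), n - k)
        i, j = i + k, j + k
        if np.abs(U[i, j]) < tol:
            break                         # remaining block is (numerically) zero
        U[[k, i]], p[[k, i]] = U[[i, k]], p[[i, k]]            # row interchange
        U[:, [k, j]], q[[k, j]] = U[:, [j, k]], q[[j, k]]      # column interchange
        L[[k, i], :k] = L[[i, k], :k]
        L[k+1:, k] = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
        r = k + 1
    return L[:, :r], U[:r, :], p, q

A = np.array([[1., 2., 3.], [2., 4., 6.], [1., 1., 1.]])   # rank 2
Lt, Ut, p, q = gecp_condensed(A)
print(np.allclose(A[np.ix_(p, q)], Lt @ Ut))   # True: Pi_1 A Pi_2 = L~ U~
```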

6. LDU and $LDL^T$ factorizations

• if $A \in \mathbb{R}^{n\times n}$ has nonsingular principal submatrices $A_{1:k,1:k}$ for $k = 1,\dots,n$, then there exist a unit lower triangular matrix $L \in \mathbb{R}^{n\times n}$, a unit upper triangular matrix $U \in \mathbb{R}^{n\times n}$, and a diagonal matrix $D = \mathrm{diag}(d_{11},\dots,d_{nn}) \in \mathbb{R}^{n\times n}$ such that
$$A = LDU = \begin{bmatrix} 1 & & & 0\\ \ell_{21} & 1 & &\\ \vdots & & \ddots &\\ \ell_{n1} & \ell_{n2} & \cdots & 1 \end{bmatrix}\begin{bmatrix} d_{11} & & &\\ & d_{22} & &\\ & & \ddots &\\ & & & d_{nn} \end{bmatrix}\begin{bmatrix} 1 & u_{12} & \cdots & u_{1n}\\ & 1 & & u_{2n}\\ & & \ddots & \vdots\\ & & & 1 \end{bmatrix}$$
• this is called the LDU factorization of $A$
• if $A$ is furthermore symmetric, then $L = U^T$ and this is called the $LDL^T$ factorization
• if they exist, then both the LDU and $LDL^T$ factorizations are unique (exercise)
• if a symmetric $A$ has an $LDL^T$ factorization and $d_{ii} > 0$ for all $i = 1,\dots,n$, then $A$ is positive definite
• in fact, even though $d_{11},\dots,d_{nn}$ are not the eigenvalues of $A$ (why not?), they must have the same signs as the eigenvalues of $A$, i.e., if $A$ has $p$ positive eigenvalues, $q$ negative eigenvalues, and $z$ zero eigenvalues, then there are exactly $p$, $q$, and $z$ positive, negative, and zero entries in $d_{11},\dots,d_{nn}$; this is a consequence of the Sylvester law of inertia
• unfortunately, both the LDU and $LDL^T$ factorizations are difficult to compute because
– the condition on the principal submatrices is difficult to check in advance
– algorithms for computing them are invariably unstable, because the size of the multipliers cannot be bounded in terms of the entries of $A$
• for example, the $LDL^T$ factorization of a $2\times 2$ symmetric matrix is
$$\begin{bmatrix} a & c\\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0\\ c/a & 1 \end{bmatrix}\begin{bmatrix} a & c\\ 0 & d - (c/a)c \end{bmatrix} = \begin{bmatrix} 1 & 0\\ c/a & 1 \end{bmatrix}\begin{bmatrix} a & 0\\ 0 & d - (c/a)c \end{bmatrix}\begin{bmatrix} 1 & c/a\\ 0 & 1 \end{bmatrix}$$
• so
$$\begin{bmatrix} \varepsilon & 1\\ 1 & \varepsilon \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 1/\varepsilon & 1 \end{bmatrix}\begin{bmatrix} \varepsilon & 0\\ 0 & \varepsilon - 1/\varepsilon \end{bmatrix}\begin{bmatrix} 1 & 1/\varepsilon\\ 0 & 1 \end{bmatrix}$$
and the elements of $L$ and $D$ become arbitrarily large when $|\varepsilon|$ is small (a numerical check follows below)
• nonetheless, there is one special case when the $LDL^T$ factorization not only exists but can be computed in an efficient and stable way: when $A$ is positive definite
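• a quick numerical check of the $2\times 2$ example above, showing the entries of $L$ and $D$ blowing up as $\varepsilon \to 0$ even though the product still reconstructs $A$:

```python
import numpy as np

# the 2x2 example from above: entries of L and D blow up as eps -> 0
for eps in (1e-1, 1e-4, 1e-8):
    A = np.array([[eps, 1.0], [1.0, eps]])
    L = np.array([[1.0, 0.0], [1.0 / eps, 1.0]])
    D = np.diag([eps, eps - 1.0 / eps])
    print(eps, np.max(np.abs(L)), np.allclose(L @ D @ L.T, A))
```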

7. positive definite matrices

• a symmetric matrix $A \in \mathbb{R}^{n\times n}$ is positive definite if $x^TAx > 0$ for all nonzero $x$
• a symmetric positive definite matrix has real and positive eigenvalues, and its leading principal submatrices all have positive determinants
• from the definition, it is easy to see that all diagonal elements are positive (take $x = e_i$, so that $x^TAx = a_{ii} > 0$)
• to solve the system $Ax = b$ where $A$ is symmetric positive definite, we can compute the Cholesky factorization $A = R^TR$ where $R$ is upper triangular
• this factorization exists if and only if $A$ is symmetric positive definite
• in fact, attempting to compute the Cholesky factorization of $A$ is an efficient method for checking whether $A$ is symmetric positive definite
• it is important to distinguish the Cholesky factorization from the square root factorization
• a square root of a matrix $A$ is defined as a matrix $S$ such that $S^2 = SS = A$
• note that the matrix $R$ in $A = R^TR$ is not the square root of $A$, since it does not hold that $R^2 = A$ unless $A$ is a diagonal matrix
• the square root of a symmetric positive definite $A$ can be computed by using the fact that $A$ has an eigendecomposition $A = U\Lambda U^T$, where $\Lambda$ is a diagonal matrix whose diagonal elements are the positive eigenvalues of $A$ and $U$ is an orthogonal matrix whose columns are the eigenvectors of $A$
• it follows that
$$A = U\Lambda U^T = (U\Lambda^{1/2}U^T)(U\Lambda^{1/2}U^T) = SS,$$
and so $S = U\Lambda^{1/2}U^T$ is a square root of $A$
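• a short NumPy check distinguishing the square root $S = U\Lambda^{1/2}U^T$ from the Cholesky factor $R$; the test matrix is an arbitrary symmetric positive definite example of ours:

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])
lam, U = np.linalg.eigh(A)              # A = U diag(lam) U^T, lam > 0
S = U @ np.diag(np.sqrt(lam)) @ U.T     # S = U Lambda^{1/2} U^T
R = np.linalg.cholesky(A).T             # upper-triangular Cholesky factor
print(np.allclose(S @ S, A))            # True: S is a square root of A
print(np.allclose(R.T @ R, A))          # True: the Cholesky factorization
print(np.allclose(R @ R, A))            # False: R itself is NOT a square root
```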

8. cholesky factorization

• the Cholesky factorization can be computed directly from the matrix equation $A = R^TR$, where $R$ is upper triangular
• while it is conventional to write the Cholesky factorization in the form $A = R^TR$, it will be more natural later, when we discuss the vectorized version of the algorithm, to write $F = R^T$ and $A = FF^T$
• we can derive the algorithm for computing $F$ by examining the matrix equation $A = R^TR = FF^T$ on an element-by-element basis, writing
$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{12} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{bmatrix} = \begin{bmatrix} f_{11} & & &\\ f_{21} & f_{22} & &\\ \vdots & \vdots & \ddots &\\ f_{n1} & f_{n2} & \cdots & f_{nn} \end{bmatrix}\begin{bmatrix} f_{11} & f_{21} & \cdots & f_{n1}\\ & f_{22} & & f_{n2}\\ & & \ddots & \vdots\\ & & & f_{nn} \end{bmatrix}$$
• from the above we see that $f_{11}^2 = a_{11}$, from which it follows that $f_{11} = \sqrt{a_{11}}$
• from the relationship $f_{11}f_{i1} = a_{1i}$ and the fact that we already know $f_{11}$, we obtain
$$f_{i1} = \frac{a_{1i}}{f_{11}}, \qquad i = 2,\dots,n$$
• proceeding to the second column of $F$, we see that $f_{21}^2 + f_{22}^2 = a_{22}$
• since we already know $f_{21}$, we have
$$f_{22} = \sqrt{a_{22} - f_{21}^2}$$
• if you know the fact that a positive definite matrix must have positive leading principal minors, then you can deduce that the term above in the square root is positive by examining the $2 \times 2$ principal minor: $a_{11}a_{22} - a_{12}^2 > 0$ and therefore
$$a_{22} > \frac{a_{12}^2}{a_{11}} = f_{21}^2$$
• next, we use the relation $f_{21}f_{i1} + f_{22}f_{i2} = a_{2i}$ to compute
$$f_{i2} = \frac{a_{2i} - f_{21}f_{i1}}{f_{22}}$$
• hence we get
$$a_{11} = f_{11}^2, \qquad a_{i1} = f_{11}f_{i1}, \quad i = 2,\dots,n$$
$$\vdots$$
$$a_{kk} = f_{k1}^2 + f_{k2}^2 + \cdots + f_{kk}^2, \qquad a_{ik} = f_{k1}f_{i1} + \cdots + f_{kk}f_{ik}, \quad i = k+1,\dots,n$$
• the resulting algorithm, which runs for $k = 1,\dots,n$, is
$$f_{kk} = \Bigl(a_{kk} - \sum_{j=1}^{k-1} f_{kj}^2\Bigr)^{1/2}, \qquad f_{ik} = \frac{a_{ik} - \sum_{j=1}^{k-1} f_{kj}f_{ij}}{f_{kk}}, \quad i = k+1,\dots,n$$
• you could use induction to show that the term in the square root is always positive, but we'll soon see a more elegant vectorized version showing that this algorithm never requires taking square roots of negative numbers
• this algorithm requires roughly half as many operations as Gaussian elimination
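• the column-by-column formulas translate directly into code; a sketch (the name cholesky_FFt and the test matrix are ours, and this is unoptimized, not production code):

```python
import numpy as np

def cholesky_FFt(A):
    """Compute the lower-triangular F with A = F F^T, following the
    column-by-column formulas above."""
    n = A.shape[0]
    F = np.zeros_like(A, dtype=float)
    for k in range(n):
        s = A[k, k] - np.dot(F[k, :k], F[k, :k])       # a_kk - sum_j f_kj^2
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        F[k, k] = np.sqrt(s)
        F[k+1:, k] = (A[k+1:, k] - F[k+1:, :k] @ F[k, :k]) / F[k, k]
    return F

A = np.array([[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]])
F = cholesky_FFt(A)
print(np.allclose(F @ F.T, A))                   # True
print(np.allclose(F, np.linalg.cholesky(A)))     # matches NumPy's factor
```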
