
Chapter 1

Triangular Factorization

This chapter deals with the factorization of matrices into products of triangular matrices. Since the solution of a linear n × n system can be easily obtained once the coefficient matrix is factored into a product of triangular matrices, we will concentrate on the factorization of square matrices. Specifically, we will show that an arbitrary n × n matrix A has the factorization P A = LU, where P is an n × n permutation matrix, L is an n × n unit lower triangular matrix, and U is an n × n upper triangular matrix. In connection with this factorization we will discuss pivoting, i.e., row interchange, strategies. We will also explore circumstances for which A may be factored in the forms A = LU or A = LL^T. Our results for a square system will be given for a matrix with real elements but can easily be generalized to complex matrices. The corresponding results for a general m × n matrix will be accumulated in Section 1.4. In the general case an arbitrary m × n matrix A has the factorization P A = LU, where P is an m × m permutation matrix, L is an m × m unit lower triangular matrix, and U is an m × n matrix having row echelon structure.

1.1 Permutation matrices and Gauss transformations

We begin by defining permutation matrices and examining the effect of premultiplying or postmultiplying a given matrix by such matrices. We then define Gauss transformations and show how they can be used to introduce zeros into a vector.

Definition 1.1 An m × m permutation matrix is a matrix whose columns consist of a rearrangement of the m unit vectors e^{(j)}, j = 1, . . . , m, in ℝ^m, i.e., a rearrangement of the columns (or rows) of the m × m identity matrix.

The rows of an n × n permutation matrix consist of a rearrangement of the transposes (e^{(j)})^T, j = 1, . . . , n, of the n unit vectors in ℝ^n. The effect of premultiplying (postmultiplying) a matrix A by a permutation matrix P is to rearrange the rows (columns) of A into the same order as the rows (columns) of the identity matrix are ordered in P. For integers j and k such that 1 ≤ j ≤ k ≤ m, we denote by P_{(j,k)} the special class of permutation matrices that result from the interchange of the j-th and k-th columns of the identity matrix, i.e.,


(1.1)    P_{(j,k)} = ( e^{(1)}, · · · , e^{(j−1)}, e^{(k)}, e^{(j+1)}, · · · , e^{(k−1)}, e^{(j)}, e^{(k+1)}, · · · , e^{(m)} ) .

These permutation matrices, which are often referred to as elementary permutation matrices, have many useful properties such as

(1.2)    P_{(j,k)}^T = P_{(j,k)}^{−1} = P_{(j,k)} .

Of course, P_{(j,j)} = I. We now define a special class of matrices that are rank one perturbations of the identity matrix and that can be used to introduce zeros into a vector.

Definition 1.2 For an integer j such that 1 ≤ j ≤ m − 1, a Gauss transformation is an m × m matrix of the form

(1.3)    M^{(j)} = I − µ (e^{(j)})^T ,

where µ ∈ ℝ^m with µ_i = 0 for i = 1, . . . , j. Note that a Gauss transformation is a unit lower triangular matrix.

The following result shows how a Gauss transformation, together with an elementary permutation matrix, can be used to introduce zeros into all but the first component of a vector.

Proposition 1.1 Given x = (x_1, x_2, . . . , x_m)^T ∈ ℝ^m such that x ≠ 0, choose any integer p such that 1 ≤ p ≤ m and x_p ≠ 0. Define µ ∈ ℝ^m by

    µ_j = 0            if j = 1
    µ_j = x_1 / x_p    if j = p and p ≠ 1
    µ_j = x_j / x_p    if j = 2, . . . , m, j ≠ p .

Then

    M^{(1)} P_{(1,p)} x = x_p e^{(1)} .

Proof. The result follows from

    M^{(1)} P_{(1,p)} x = P_{(1,p)} x − µ (e^{(1)})^T P_{(1,p)} x
                        = P_{(1,p)} x − ( 0, µ_2 x_p, µ_3 x_p, . . . , µ_m x_p )^T
                        = ( x_p, 0, 0, . . . , 0 )^T .     □

Note that if x_1 ≠ 0, then one may choose p = 1. Also, note that p, 1 ≤ p ≤ m, can be any index such that x_p ≠ 0, so that there is not a unique Gauss transformation which transforms x into a constant times e^{(1)}; this is illustrated by the following example.

Example 1.1 Let x = (0, −3, 4)^T. If p = 2 and µ = (0, 0, −4/3)^T, then

    M^{(1)} = [ 1    0  0 ]
              [ 0    1  0 ]
              [ 4/3  0  1 ]

and M^{(1)} P_{(1,2)} x = (−3, 0, 0)^T. On the other hand, if p = 3 and µ = (0, −3/4, 0)^T, then

    M^{(1)} = [ 1    0  0 ]
              [ 3/4  1  0 ]
              [ 0    0  1 ]

and M^{(1)} P_{(1,3)} x = (4, 0, 0)^T.

We can also use Gauss transformations to introduce zeros in any contiguous block of a vector. For example, given x ∈ ℝ^m and integers k and q, 1 ≤ k < q ≤ m, such that x_j, j = k, . . . , q, are not all zero, suppose we want to choose a Gauss transformation that zeros out the (k+1)-st through q-th components of P_{(k,p)} x. Again, an integer p, k ≤ p ≤ q, is chosen so that x_p ≠ 0. We set

(1.4)    µ_j = 0            if j = 1, . . . , k or j = q + 1, . . . , m
         µ_j = x_k / x_p    if j = p and p ≠ k
         µ_j = x_j / x_p    if j = k + 1, . . . , q, j ≠ p

so that, from (1.3),

(1.5)    M^{(k)} = [ I_{k−1}                       ]
                  [           1                    ]
                  [          −µ_{k+1}              ]
                  [              ⋮        I_{m−k}  ]
                  [          −µ_q                  ]
                  [           0                    ]
                  [              ⋮                 ]
                  [           0                    ]

and

    M^{(k)} P_{(k,p)} x = ( x_1, . . . , x_{k−1}, x_p, 0, . . . , 0, x_{q+1}, . . . , x_m )^T .

We end this section by showing how an elementary permutation matrix which premultiplies a Gauss transformation "passes through" the Gauss transformation to produce another Gauss transformation postmultiplied by the same permutation matrix.

Proposition 1.2 Let M^{(k)} be a Gauss transformation and let j and p be indices such that k < j ≤ p ≤ m. Then there exists a Gauss transformation M̂^{(k)} such that

(1.6)    P_{(j,p)} M^{(k)} = M̂^{(k)} P_{(j,p)} .

Proof. The proof is by direct multiplication; M̂^{(k)} is the matrix obtained by interchanging the (j, k) and (p, k) entries of M^{(k)}.     □
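To make the construction of Proposition 1.1 concrete, the following minimal sketch builds M^{(1)} and P_{(1,p)} as dense NumPy matrices and checks them against Example 1.1. The function name gauss_transform and the 0-based indexing are illustrative choices, not part of the text.

    import numpy as np

    def gauss_transform(x, p):
        # Build P_(1,p) and the Gauss transformation M^(1) of Proposition 1.1
        # so that M @ P @ x = x[p] * e^(1); assumes x[p] != 0 (p is 0-based).
        m = len(x)
        P = np.eye(m)
        P[[0, p]] = P[[p, 0]]                 # elementary permutation P_(1,p)
        mu = np.zeros(m)
        mu[1:] = x[1:] / x[p]                 # mu_j = x_j / x_p, j = 2, ..., m
        if p != 0:
            mu[p] = x[0] / x[p]               # mu_p = x_1 / x_p
        M = np.eye(m) - np.outer(mu, np.eye(m)[0])   # M = I - mu (e^(1))^T
        return M, P

    x = np.array([0.0, -3.0, 4.0])
    M, P = gauss_transform(x, 1)              # p = 2 in the text's 1-based indexing
    print(M @ P @ x)                          # [-3.  0.  0.], as in Example 1.1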

1.2 Triangular factorizations of an n × n matrix

The goal of this section is to prove that any n × n matrix A has the factorization P A = LU, where P is an n × n permutation matrix, L is an n × n unit lower triangular matrix, and U is an n × n upper triangular matrix. We will first show how, through the use of elementary permutation matrices and Gauss transformations, a given n × n matrix can be reduced to an upper triangular matrix U. We remark that the construction given in the proof of the following proposition is exactly the classical Gaussian elimination algorithm expressed in matrix notation; we will examine this connection further in the next section.

Proposition 1.3 Let A be a given n × n matrix. Then there exist an integer ℓ, 0 ≤ ℓ ≤ n − 1, Gauss transformation matrices M^{(k)}, k = 1, . . . , ℓ, and elementary permutation matrices P_{(k,p_k)}, k = 1, . . . , ℓ, k ≤ p_k ≤ n, such that

(1.7)    U = A^{(ℓ+1)} = M^{(ℓ)} P_{(ℓ,p_ℓ)} · · · M^{(2)} P_{(2,p_2)} M^{(1)} P_{(1,p_1)} A

is an n × n upper triangular matrix.

Proof. Starting with A^{(1)} = A and the integer counter σ_1 = 1, we assume that at the start of the k-th stage, k ≥ 1, of the procedure we have a matrix of the form

    A^{(k)} = [ U^{(k)}  a^{(k)}  B^{(k)} ]
              [ 0        c^{(k)}  D^{(k)} ] ,

where U^{(k)} is a (k−1) × (σ_k − 1) upper triangular matrix, a^{(k)} ∈ ℝ^{k−1}, c^{(k)} ∈ ℝ^{n−k+1}, B^{(k)} is (k−1) × (n − σ_k), and D^{(k)} is (n−k+1) × (n − σ_k). If c^{(k)} = 0, we increment σ_k by one and move on to the next column, continuing to so increment σ_k until either c^{(k)} ≠ 0 or σ_k > n. In the latter case we are finished since then A^{(k)} is upper triangular. In the former case we use (1.4) with

    x = [ a^{(k)} ]
        [ c^{(k)} ]

and q = n to choose an integer p_k ≥ k and construct a Gauss transformation M^{(k)} such that the components of M^{(k)} P_{(k,p_k)} x with indices j = k + 1, . . . , n vanish. If we write M^{(k)} in the block form

    [ I_{k−1}  0        ]
    [ 0        M̃^{(k)}  ] ,

then M̃^{(k)} is an (n−k+1) × (n−k+1) Gauss transformation formed by setting x = c^{(k)} in Proposition 1.1. We have that

    A^{(k+1)} = M^{(k)} P_{(k,p_k)} A^{(k)} = [ U^{(k)}  a^{(k)}  B^{(k)}           ]
                                             [ 0        c                          ]
                                             [ ⋮        0        M̃^{(k)} D̂^{(k)}  ]
                                             [ ⋮        ⋮                          ]
                                             [ 0        0                          ] ,

where c denotes the (p_k − k + 1)-st component of c^{(k)} and D̂^{(k)} denotes the matrix D^{(k)} after rows k and p_k have been interchanged. Then

    A^{(k+1)} = [ U^{(k+1)}  a^{(k+1)}  B^{(k+1)} ]
                [ 0          c^{(k+1)}  D^{(k+1)} ] ,

where a^{(k+1)} and c^{(k+1)} are the first columns of B^{(k)} and M̃^{(k)} D̂^{(k)}, respectively, U^{(k+1)} is the k × (σ_{k+1} − 1) matrix given by

    U^{(k+1)} = [ U^{(k)}  a^{(k)} ]
                [ 0        c       ] ,

and σ_{k+1} = σ_k + 1. Clearly U^{(k+1)} has upper triangular structure and A^{(k+1)} has the same structure as A^{(k)} with the index k augmented by one, so that the inductive step is complete. The total number of stages ℓ cannot exceed n − 1 and may be smaller if σ_k > k for some k.     □

The process of row interchanges is often referred to as row pivoting. Since, for the most part, we will not consider column interchanges, we will henceforth refer to row pivoting as simply pivoting. The (k, σ_k) entry in the matrix A^{(k)} defined in the above proof is referred to as the pivot element and (k, σ_k) itself is referred to as the pivot position. The pivot element is the denominator appearing in the vector µ that determines the Gauss transformation M^{(k)} used at the k-th stage. Thus, an interchange of rows is, in theory, only necessary whenever a pivot element vanishes. The row interchange process, i.e., premultiplication by the elementary permutation matrix P_{(k,p_k)}, is invoked so that a nonzero entry is brought into the pivot position.

Proposition 1.3, coupled with Proposition 1.2, allows us to prove the major result of this section.

Theorem 1.4 Given any n × n matrix A, there exist an n × n permutation matrix P, an n × n unit lower triangular matrix L, and an n × n upper triangular matrix U such that

(1.8)    P A = LU .

Furthermore, rank(U) = rank(A).

Proof. By Proposition 1.3 we have that

    A^{(ℓ+1)} = M^{(ℓ)} P_{(ℓ,p_ℓ)} M^{(ℓ−1)} P_{(ℓ−1,p_{ℓ−1})} · · · M^{(2)} P_{(2,p_2)} M^{(1)} P_{(1,p_1)} A

is upper triangular. The inverse of a square unit lower triangular matrix is also unit lower triangular, so the proof will be complete if we can show that M^{(ℓ)} P_{(ℓ,p_ℓ)} · · · M^{(1)} P_{(1,p_1)} = M P for some unit lower triangular matrix M and permutation matrix P. Since p_k ≥ k, from Proposition 1.2 we note that for k = 2, . . . , ℓ

    P_{(k,p_k)} M^{(k−1)} = M_1^{(k−1)} P_{(k,p_k)} ,

where M_1^{(k−1)} is a unit lower triangular matrix. Of course, if no row interchanges are required, then P_{(k,p_k)} = I and M_1^{(k−1)} = M^{(k−1)}. Thus we have

    A^{(ℓ+1)} = M^{(ℓ)} M_1^{(ℓ−1)} P_{(ℓ,p_ℓ)} M_1^{(ℓ−2)} P_{(ℓ−1,p_{ℓ−1})} · · · M_1^{(1)} P_{(2,p_2)} P_{(1,p_1)} A .

Again we use Proposition 1.2 to show that for k = 3, . . . , ℓ

(k−2) (k−2) P(k,pk )M1 = M2 P(k,pk ) ,

where M_2^{(k−2)} is a unit lower triangular matrix. Continuing in this manner, we have that U = A^{(ℓ+1)} = M P A, where

    M = M^{(ℓ)} M_1^{(ℓ−1)} M_2^{(ℓ−2)} · · · M_{ℓ−2}^{(2)} M_{ℓ−1}^{(1)}

and

    P = P_{(ℓ,p_ℓ)} · · · P_{(2,p_2)} P_{(1,p_1)} .

Here M is the product of unit lower triangular matrices and thus M itself is unit lower triangular. Furthermore, P is the product of permutation matrices and thus P is a permutation matrix as well. Also, since M and P are invertible, clearly rank(U) = rank(A). If A is real, then the above process involves only real arithmetic so that the end products L and U are real.     □

Let r = rank(U) = rank(A) denote the number of nonzero rows of U. It is, of course, possible for r < n, e.g., whenever the rows of A are linearly dependent. If r < n, we may partition the factorization as

(1.9)    P A = ( L_1  L_2 ) [ U_1 ]
                            [ 0   ]  = L_1 U_1 ,

where L_1 is an n × r unit lower trapezoidal matrix, L_2 is n × (n − r), and U_1 is an r × n upper triangular matrix of full row rank. Thus an arbitrary n × n matrix A with n > r can be factored into the product of an n × r unit lower trapezoidal matrix and an r × n upper triangular matrix. Note that L_2 plays no essential role in the P A = LU factorization of A. Also, if r = n, i.e., A has full column rank, then U_1 is an n × n square, nonsingular, upper triangular matrix.

Example 1.2 Let

    A = [ 0  0   4 ]
        [ 2  1  -1 ]
        [ 6  3   1 ] .

To form the factorization P A = LU we could have the following steps:

    M^{(2)} M^{(1)} P_{(1,2)} A = [ 1   0  0 ] [  1  0  0 ] [ 0  1  0 ] [ 0  0   4 ]
                                 [ 0   1  0 ] [  0  1  0 ] [ 1  0  0 ] [ 2  1  -1 ]
                                 [ 0  -1  1 ] [ -3  0  1 ] [ 0  0  1 ] [ 6  3   1 ]

                               = [ 2  1  -1 ]
                                 [ 0  0   4 ]  = U .
                                 [ 0  0   0 ]

This gives that

    P A = [ 0  1  0 ] [ 0  0   4 ]   [ 1  0  0 ] [ 2  1  -1 ]
          [ 1  0  0 ] [ 2  1  -1 ] = [ 0  1  0 ] [ 0  0   4 ]  = LU .
          [ 0  0  1 ] [ 6  3   1 ]   [ 3  1  1 ] [ 0  0   0 ]

Note that if we partition L and U as in (1.9) then

    L_1 = [ 1  0 ]          [ 0 ]
          [ 0  1 ] ,  L_2 = [ 0 ] ,  and  U_1 = [ 2  1  -1 ]
          [ 3  1 ]          [ 1 ]               [ 0  0   4 ] .
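The factorization in Example 1.2 is easy to check numerically. A minimal sketch using SciPy's LU routine follows; note that scipy.linalg.lu uses the convention A = P L U (with P on the right-hand side of the factors) and its own pivoting strategy, so the factors it returns need not coincide with those chosen in the example, which anticipates the non-uniqueness discussed next.

    import numpy as np
    from scipy.linalg import lu

    A = np.array([[0., 0., 4.],
                  [2., 1., -1.],
                  [6., 3., 1.]])

    # Check the factors chosen in Example 1.2.
    P = np.array([[0., 1., 0.], [1., 0., 0.], [0., 0., 1.]])
    L = np.array([[1., 0., 0.], [0., 1., 0.], [3., 1., 1.]])
    U = np.array([[2., 1., -1.], [0., 0., 4.], [0., 0., 0.]])
    print(np.allclose(P @ A, L @ U))     # True

    # SciPy's factorization of the same matrix (different pivoting).
    Ps, Ls, Us = lu(A)
    print(np.allclose(Ps @ Ls @ Us, A))  # True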

Note that, in general, the P A = LU factorization of a matrix A is not unique. In the first place, if r < n, the block L_2 of L is essentially arbitrary. Moreover, different row interchange strategies, i.e., different permutation matrices P, in general lead to different factors L and U. The following result shows that, once the permutation matrix is fixed, the essential part of the factorization is uniquely determined.

Proposition 1.5 Given an n × n matrix A, partition its P A = LU factorization as in (1.9), where U_1 has full row rank and L_1 is unit lower trapezoidal. Then, once the row interchange strategy is fixed, i.e., the permutation matrix P is fixed, the matrices L_1 and U_1 appearing in the factorization (1.9) are uniquely determined. If r = rank(A) is the number of rows in U_1, then L_2 may be chosen to be any n × (n − r) matrix such that L = (L_1 L_2) is a unit lower triangular matrix.

Proof. Let P A = LU = L̃Ũ be two factorizations of A using the same row interchange strategy. Then U = L^{−1} L̃ Ũ. Further partition L into the form

(1.10)    L = [ L_{11}  0      ]
              [ L_{21}  L_{22} ] ,

where L_{11} is an r × r unit lower triangular matrix, L_{22} is an (n−r) × (n−r) unit lower triangular matrix, and L_{21} is (n−r) × r. Note that

(1.11)    L_1 = [ L_{11} ]          [ 0      ]
                [ L_{21} ]  and  L_2 = [ L_{22} ] .

We will use the analogous partitioning for L̃. Then

(1.12)    L^{−1} L̃ = [ L_{11}^{−1} L̃_{11}                                        0                   ]
                     [ −L_{22}^{−1} L_{21} L_{11}^{−1} L̃_{11} + L_{22}^{−1} L̃_{21}   L_{22}^{−1} L̃_{22} ] .

Then, since U = L^{−1} L̃ Ũ, we have that

(1.13)    U_1 = L_{11}^{−1} L̃_{11} Ũ_1 ,

where we have partitioned U and Ũ as in (1.9). In (1.13), the matrix on the left is upper triangular, while the matrix on the right is the product of the unit lower triangular matrix L_{11}^{−1} L̃_{11} and the upper triangular matrix Ũ_1. By equating the elements to the left of the first nonzero entry in the rows on both sides of (1.13), we conclude that the lower triangular matrix L_{11}^{−1} L̃_{11} is a diagonal matrix; since it is also a unit lower triangular matrix, we conclude that L_{11}^{−1} L̃_{11} = I, or L_{11} = L̃_{11}. Then, (1.13) also yields that U_1 = Ũ_1.

The relation U = L^{−1} L̃ Ũ, (1.12), and the partitioning of U and Ũ of (1.9) also yield that

    L_{22}^{−1} L_{21} L_{11}^{−1} L̃_{11} = L_{22}^{−1} L̃_{21} .

Then, since L_{11} = L̃_{11}, we have that L_{21} = L̃_{21}, and, from (1.11), L_1 = L̃_1.     □

If A has full row rank, i.e., if r = n, then U = U_1 and L = L_1, so in this case, once the row interchange strategy is fixed, the factorization of P A into the product of a unit lower triangular matrix and an upper triangular matrix is unique. In particular, this is the case when A is square and invertible.

There are numerous variants of the factorization P A = LU; the most important is considered in Section 1.2.1. One variant is found by applying Theorem 1.4 to A^*, which leads to the following result: given any n × n matrix A there exist an n × n permutation matrix P̃, an n × n unit upper triangular matrix Ũ, and an n × n lower triangular matrix L̃ such that A P̃ = L̃ Ũ. In fact, L̃ is such that L̃^* has row echelon structure. Another variant of the basic factorization (1.8) is given by

(1.14)    P A = L D Û ,

where P is an n × n permutation matrix, L is an n × n unit lower triangular matrix, Û is an n × n unit upper triangular matrix, and D is an n × n diagonal matrix. If A has full row rank, i.e., rank(A) = n, then, for j = 1, . . . , n, d_{j,j} is equal to the first nonzero entry in the j-th row of U and Û = D^{−1} U, where U denotes the upper triangular matrix of (1.8); D^{−1} exists by virtue of A being full rank. If rank(A) = r < n, then, for j = 1, . . . , r, d_{j,j} is again equal to the first nonzero entry in the j-th row of U, while, for j > r, d_{j,j} may be chosen arbitrarily. Of course, the nontrivial rows of Û may be determined from those of U by dividing the rows of the latter by the corresponding first nonzero entry. The P A = L D Û factorization in the case r < n is illustrated in the following example.

Example 1.3 For the matrix A of Example 1.2 we have

    P_{(1,2)} A = [ 2  1  -1 ]   [ 1  0  0 ] [ 2  1  -1 ]
                  [ 0  0   4 ] = [ 0  1  0 ] [ 0  0   4 ] .
                  [ 6  3   1 ]   [ 3  1  1 ] [ 0  0   0 ]

Even though A is singular, P_{(1,2)} A also has the factorization L D Û given by

    P_{(1,2)} A = [ 2  1  -1 ]   [ 1  0  0 ] [ 2  0  0   ] [ 1  1/2  -1/2 ]
                  [ 0  0   4 ] = [ 0  1  0 ] [ 0  4  0   ] [ 0  0     1   ] .
                  [ 6  3   1 ]   [ 3  1  1 ] [ 0  0  d_3 ] [ 0  0     0   ]

Note that Û has row echelon structure and that the (3, 3) entry of D may be set arbitrarily since the third row of U contains only zero entries.

1.2.1 Triangular factorizations without row interchanges

The appearance of the permutation matrix P in (1.8) results from the need to pivot, i.e., to perform row interchanges, whenever a zero pivot element is encountered. If row interchanges are not required, we may set P = I and then A = LU. The following example gives factorizations for two nonsingular matrices. The first has an LU factorization, i.e., no pivoting is necessary. The second matrix fails to have an LU factorization unless pivoting is performed.

Example 1.4 Let

    A = [  2  -1   0 ]
        [  4  -5   3 ]
        [ -6   6  -2 ] .

Then A has the LU factorization

    A = LU = [  1   0  0 ] [ 2  -1  0 ]
             [  2   1  0 ] [ 0  -3  3 ]
             [ -3  -1  1 ] [ 0   0  1 ] .

Let

    B = [  0   4   1 ]
        [  2  -1   3 ]
        [ -4  14  -2 ] .

Then, since the first pivot element, i.e., b_{1,1}, vanishes, B fails to have an LU factorization, i.e., we cannot write B = LU. However, if we interchange the first and second rows we see that P_{(1,2)} B has the LU factorization

    P_{(1,2)} B = LU = [  1  0  0 ] [ 2  -1  3 ]
                       [  0  1  0 ] [ 0   4  1 ]
                       [ -2  3  1 ] [ 0   0  1 ] .

A characterization of matrices for which zero pivot elements are not encountered, and thus of matrices A that possess a factorization of the form A = LU, is given in the following result. Here, given a matrix A with elements a_{i,j}, i = 1, . . . , n and j = 1, . . . , n, the k-th leading principal submatrix A_k, k ≤ n, is the k × k matrix with elements a_{i,j}, i = 1, . . . , k and j = 1, . . . , k.

Proposition 1.6 Given an n × n matrix A, denote its leading principal k × k submatrices by A_k for k = 1, . . . , n. If A_k is nonsingular for k = 1, . . . , ℓ = n − 1, then there exist an n × n unit lower triangular matrix L and an n × n upper triangular matrix U such that

(1.15)    A = LU ,

where U = A^{(ℓ+1)} and L is given explicitly by

(1.16)    L = ( L_1  L_2 ) ,  where  L_1 = [ 1                                        ]
                                           [ µ_2^{(1)}   1                            ]
                                           [ µ_3^{(1)}   µ_3^{(2)}                    ]
                                           [    ⋮           ⋮         ⋱               ]
                                           [    ⋮           ⋮            1            ]
                                           [    ⋮           ⋮         µ_{ℓ+1}^{(ℓ)}   ]
                                           [ µ_n^{(1)}   µ_n^{(2)}   · · ·  µ_n^{(ℓ)} ]

and where L_2 is any n × (n − ℓ) matrix such that L = (L_1 L_2) is a unit lower triangular matrix. Here A^{(k)}, k = 1, . . . , ℓ + 1, is defined in the proof of Proposition 1.3 and

(1.17)    µ_j^{(k)} = a_{j,k}^{(k)} / a_{k,k}^{(k)} .

Moreover, u_{i,i} ≠ 0 for i = 1, . . . , n − 1. If A is real, then U and L may be chosen to be real as well.

Proof. The proof is merely a specialization of the proofs of Proposition 1.3 and Theorem 1.4. Starting with A^{(1)} = A, suppose we have completed the (k−1)-st stage of the reduction procedure without having encountered any vanishing pivots, i.e., in the proof of Proposition 1.3 we have that P_{(j,p_j)} = I for j = 1, . . . , k − 1. Thus at the start of the k-th stage we assume that we have the partially reduced matrix

    A^{(k)} = M^{(k−1)} · · · M^{(1)} A ,

where the unit lower triangular matrices M^{(j)}, j = 1, . . . , k − 1, are defined by (1.5) with q = n and where the matrix A^{(k)} has its first (k−1) columns in upper triangular form and its first (k−1) diagonal entries do not vanish. We then have that

    A = L^{(k)} A^{(k)} ,

where

    L^{(k)} = ( M^{(k−1)} · · · M^{(1)} )^{−1}

is an n × n unit lower triangular matrix. By using the special structure of L^{(k)} and A^{(k)}, we may partition A = L^{(k)} A^{(k)} into blocks to obtain

    A = [ A_k     A_{12} ]   [ L_{11}^{(k)}  0        ] [ A_{11}^{(k)}  A_{12}^{(k)} ]
        [ A_{21}  A_{22} ] = [ L_{21}^{(k)}  I_{n−k}  ] [ 0             A_{22}^{(k)} ] ,

where A_k, L_{11}^{(k)}, and A_{11}^{(k)} are k × k, L_{11}^{(k)} is a unit lower triangular matrix, and A_{11}^{(k)} is an upper triangular matrix. We equate the upper left-hand blocks to obtain A_k = L_{11}^{(k)} A_{11}^{(k)} and therefore det A_k = det( L_{11}^{(k)} A_{11}^{(k)} ) = det L_{11}^{(k)} det A_{11}^{(k)} = det A_{11}^{(k)}. Since A_{11}^{(k)} is upper triangular, we have that

(1.18)    det A_k = a_{1,1}^{(1)} a_{2,2}^{(2)} · · · a_{k,k}^{(k)} ,

i.e., det A_k is the product of the first k pivot elements. Now, by hypothesis, det A_k ≠ 0. Also, by virtue of the assumption that A^{(k)} is deduced from A without encountering any vanishing pivot elements, a_{j,j}^{(j)} ≠ 0 for j = 1, . . . , k − 1. Then, from (1.18), a_{k,k}^{(k)} ≠ 0 and the inductive step is complete, so that we have

    A^{(ℓ+1)} = M^{(ℓ)} M^{(ℓ−1)} · · · M^{(1)} A ,

where A^{(ℓ+1)} = U is upper triangular. The derivation of the explicit representation (1.16) for

    L = ( M^{(ℓ)} · · · M^{(1)} )^{−1}

is left as an exercise.     □

Some important classes of matrices satisfy the hypotheses of this proposition; these include positive definite and diagonally dominant matrices. We will discuss the former in the following section. The uniqueness of the A = LU factorization is explored in the exercises. It is important to note that Proposition 1.6 gives sufficient, but not necessary, conditions for a matrix to have an LU factorization without pivoting. The following example gives a matrix which does possess an LU factorization without pivoting but fails to satisfy the hypotheses of Proposition 1.6.

Example 1.5 Let

    A = [ 2  1  -1 ]
        [ 0  0   1 ]
        [ 0  0   1 ] .

Then A has the LU factorization

    A = [ 1  0  0 ] [ 2  1  -1 ]
        [ 0  1  0 ] [ 0  0   1 ]
        [ 0  1  1 ] [ 0  0   0 ] .

Note that A does not satisfy the hypotheses of Proposition 1.6.
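The hypotheses of Proposition 1.6 are easy to test numerically by computing the determinants of the leading principal submatrices. The sketch below does this for the matrices of Examples 1.4 and 1.5; the exact zero test on floating point determinants is acceptable here only because the entries are small integers, and the function name has_lu_hypotheses is an illustrative choice.

    import numpy as np

    def has_lu_hypotheses(A):
        # Hypotheses of Proposition 1.6: A_k nonsingular for k = 1, ..., n-1.
        n = A.shape[0]
        return all(np.linalg.det(A[:k, :k]) != 0.0 for k in range(1, n))

    A = np.array([[2., -1., 0.], [4., -5., 3.], [-6., 6., -2.]])   # Example 1.4
    B = np.array([[2., 1., -1.], [0., 0., 1.], [0., 0., 1.]])      # Example 1.5
    print(has_lu_hypotheses(A))   # True
    print(has_lu_hypotheses(B))   # False, although B = LU does exist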

1.2.2 Symmetric positive definite matrices

We now consider an important class of square matrices, namely symmetric positive definite matrices. Recall that a matrix is symmetric if A^T = A and is positive definite if x^T A x > 0 for all x ∈ ℝ^n such that x ≠ 0.

When A is positive definite, we can easily show that A can be factored in the form A = LU by demonstrating that A satisfies the hypotheses of Proposition 1.6. (See the first part of the proof of Proposition 1.7 below.) If A is also symmetric, then it turns out that only one matrix is involved in the factorization, i.e., we can write A = LL^T, where now L is an invertible lower triangular matrix that in general does not have unit diagonal entries. This is the classic result known as the Cholesky factorization of a symmetric positive definite matrix. In addition, we can show that the converse is true, i.e., if A is symmetric and there exists an invertible lower triangular matrix L such that A = LL^T, then A must be positive definite. The following result formalizes these observations.

Proposition 1.7 Let A be an n × n symmetric matrix. Then A is positive definite if and only if there exists an invertible lower triangular matrix L such that

(1.19)    A = L L^T .

Furthermore, one can choose the diagonal elements l_{i,i}, i = 1, . . . , n, of L to be real positive numbers. In this case the factorization (1.19) is unique. If A is real, then L is real as well and we have that A = LL^T.

Proof. If A is positive definite, x^T A x > 0 for x ∈ ℝ^n such that x ≠ 0. Choose x^T = (y^T, 0) for an arbitrary nonzero vector y ∈ ℝ^k. We have that

    0 < ( y^T  0 ) A [ y ]   ( y^T  0 ) [ A_k  × ] [ y ]
                     [ 0 ] =           [ ×    × ] [ 0 ]  = y^T A_k y ,

so that A_k, the k-th leading principal submatrix of A, is positive definite. Furthermore, since A_k is positive definite it is nonsingular. Thus from Proposition 1.6 we have that

    A = LU = [ 1                               ] [ u_{1,1}  u_{1,2}  · · ·  u_{1,n} ]
             [ l_{2,1}  1                      ] [          u_{2,2}  · · ·  u_{2,n} ]
             [    ⋮           ⋱                ] [                    ⋱       ⋮     ]
             [ l_{n,1}  · · ·  l_{n,n−1}   1   ] [                          u_{n,n} ]

and, since det A = Π_{i=1}^{n} u_{i,i} and det A ≠ 0, we have u_{i,i} ≠ 0 for i = 1, . . . , n. Let D be the diagonal matrix diag( u_{1,1}, u_{2,2}, . . . , u_{n,n} ), so that D^{−1} is well defined. Then we can write A = L D Û where Û = D^{−1} U is a unit upper triangular matrix. Now A is also symmetric, so that L D Û = Û^T D^T L^T, or

(1.20)    D Û L^{−T} = L^{−1} Û^T D^T .

Now the product of the matrices on the left-hand side of (1.20) is an upper triangular matrix with diagonal entries given by those of D. The product of the matrices on the right-hand side is a lower triangular matrix with corresponding diagonal entries given by those of D^T. Thus both products are diagonal matrices and we have that D = D Û L^{−T} = L^{−1} Û^T D^T = D^T. Hence we have D = D^T and Û = L^T, i.e., A = L D L^T with L a unit lower triangular matrix and D a diagonal matrix with real entries. Since A is positive definite, we have that x^T A x = x^T L D L^T x = y^T D y > 0, where y = L^T x. Thus we conclude that the diagonal entries of D are positive. Let D^{1/2} = diag( √d_1, √d_2, . . . , √d_n ) and let L̂ = L D^{1/2}. Then A = L D L^T = L D^{1/2} D^{1/2} L^T = L̂ L̂^T, where the diagonal elements of L̂ are real positive numbers. If A is real, only real arithmetic is used to arrive at (1.19), so that L̂ is real.

To show that the factorization is unique when the diagonal elements of L are real positive numbers, we let A = L_1 L_1^T = L_2 L_2^T. Then

(1.21)    L_1^{−1} L_2 = L_1^T L_2^{−T} .

The product of the matrices on the left-hand side of (1.21) is lower triangular while that on the right-hand side is upper triangular; hence L_1^{−1} L_2 = L_1^T L_2^{−T} = G for some diagonal matrix G. Now G = L_1^{−1} L_2 implies that g_{i,i} = l_{2i,i} / l_{1i,i} and G = L_1^T L_2^{−T} implies that g_{i,i} = l_{1i,i} / l_{2i,i}, so that |l_{1i,i}| = |l_{2i,i}|. Since both L_1 and L_2 have real positive diagonal entries, this implies that l_{1i,i} = l_{2i,i}, or that g_{i,i} = 1. Hence G = I and L_1 = L_2.

Now assume that A is symmetric and that (1.19) holds. Then x^T A x = x^T L L^T x = y^T y where y = L^T x. Now y^T y > 0 except for y = 0, or equivalently when x = 0, since L is invertible. Thus x^T A x > 0 unless x = 0 which is, for symmetric matrices, the definition of A being positive definite.     □

Example 1.6 Let A be given by

    [  1   2  -1 ]
    [  2  13  13 ] .
    [ -1  13  42 ]

Then A has the Cholesky factorization

    [  1   2  -1 ]   [  1  0  0 ] [ 1  2  -1 ]
    [  2  13  13 ] = [  2  3  0 ] [ 0  3   5 ] .
    [ -1  13  42 ]   [ -1  5  4 ] [ 0  0   4 ]
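This factorization is easy to confirm with NumPy's built-in Cholesky routine, which returns the lower triangular factor with positive diagonal; a minimal check:

    import numpy as np

    A = np.array([[1., 2., -1.],
                  [2., 13., 13.],
                  [-1., 13., 42.]])
    L = np.linalg.cholesky(A)          # lower triangular, positive diagonal
    print(L)                           # rows (1, 0, 0), (2, 3, 0), (-1, 5, 4)
    print(np.allclose(L @ L.T, A))     # True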

1.3 Systems of algebraic equations

In this section we compare the use of triangular factorizations of a matrix with Gaussian elimination for solving linear systems of algebraic equations. We begin with the Gaussian elimination algorithm, which makes use of elementary row operations to reduce the given n × n coefficient matrix of the linear system to an upper triangular matrix; at the same time, the elementary row operations are applied to the right-hand side of the system. If this system is consistent, it can then be solved by a generalized back substitution algorithm. The Gaussian elimination algorithm we study here uses partial pivoting and implicit row scaling strategies. Examples illustrating the need for pivoting and scaling are given. The triangular factorizations studied in the previous section, coupled with forward and back substitution algorithms, are also used to solve linear systems.

A system of n linear algebraic equations in n unknowns is described by the relations

    Σ_{j=1}^{n} a_{i,j} x_j = b_i    for i = 1, . . . , n,

where the n² numbers a_{i,j} and the n numbers b_i are given and the n numbers x_j are to be determined. The system of equations may be expressed in matrix form as

(1.22)    A x = b ,

where A is an n × n matrix with elements a_{i,j}, i = 1, . . . , n, j = 1, . . . , n, and x and b are n-column vectors with elements x_j, j = 1, . . . , n, and b_i, i = 1, . . . , n, respectively. We will refer to the matrix A as the coefficient matrix and the vector b as the right-hand side vector of the linear system. If A is m × n and m > n, we refer to the system (1.22) as overdetermined, whereas if m < n we refer to it as underdetermined.

1.3.1 Pivoting and scaling

If one computes with infinite precision arithmetic, then row interchanges are necessary only when a pivot element is exactly zero. However, in computations using finite precision arithmetic, small pivot elements can have a disastrous effect on the accuracy of the computed results. The following well known example illustrates this point.

Example 1.7 Consider the system

    [ .001  1 ] [ x_1 ]   [ 1 ]
    [ 1     1 ] [ x_2 ] = [ 2 ]

whose exact solution is x_1 = 1000/999 and x_2 = 998/999. If one reduces this system to triangular form without interchanging rows, using base ten arithmetic rounded to two significant digits, then one obtains the triangular system

    [ .001   1    ] [ x_1 ]   [  1    ]
    [ 0     -1000 ] [ x_2 ] = [ -1000 ] .

A back substitution procedure gives the computed solution x_1 = 0 and x_2 = 1. On the other hand, if we write the system by reordering the equations, i.e., interchanging rows, then we have

    [ 1     1 ] [ x_1 ]   [ 2 ]
    [ .001  1 ] [ x_2 ] = [ 1 ] .

Using two-digit arithmetic one finds the computed solution to be x1 = x2 = 1, which is also the exact solution rounded to two significant digits. Now consider the problem

    [ 10  10,000 ] [ x_1 ]   [ 10,000 ]
    [ 1   1      ] [ x_2 ] = [ 2      ]

which is obtained from the original example by multiplying the first equation by 10,000. Invoking a partial pivoting strategy would not effect an interchange of rows. The solution to this system using two-digit arithmetic is x_1 = 0 and x_2 = 1, which is exactly the answer we obtained for the original system without pivoting. The trouble with the last example is that the matrix is not properly scaled, i.e., the matrix elements vary wildly.

In determining the factorization P A = LU we pivoted, i.e., performed row interchanges, only when a zero pivot element a_{k,σ_k}^{(k)} was encountered. However, as the previous example demonstrates, pivoting may be necessary even when a pivot element is nonzero in order to avoid roundoff error problems. Thus some criteria must be chosen to determine the row interchange strategy. The algorithm we present in Section 1.3.3 for P A = LU employs implicit row scaling and partial pivoting strategies. Here we define these strategies.

We define the scale factors s_i for each row i = 1, . . . , n by

(1.23)    s_i = Σ_{j=1}^{n} | a_{i,j} | .

At the k-th stage of the algorithm we have a matrix A^{(k)} whose first (k−1) rows and (σ_k − 1) columns, for σ_k ≥ k, have upper triangular structure. We search the (n−k+1) entries a_{i,σ_k}^{(k)}, i = k, . . . , n, of A^{(k)} to find the first index p_k such that

(1.24)    | a_{p_k,σ_k}^{(k)} | / s_{p_k} = max_{i=k,...,n, s_i ≠ 0} | a_{i,σ_k}^{(k)} | / s_i .

This procedure of finding the maximal element in the column is called partial pivoting, and the choice of scaling each element during the pivot search by the factors s_i given in (1.23) is called implicit row scaling. (If the search (1.24) indicates that a_{i,σ_k}^{(k)} = 0 for i = k, . . . , n, we increment σ_k by one, i.e., move to the next column, and repeat the search.)

Ideally, i.e., in infinite precision arithmetic, the search (1.24) is needed only if a_{k,σ_k}^{(k)} = 0. In practice one uses finite precision arithmetic, so that the search for p_k and the maximal pivot element a_{p_k,σ_k}^{(k)}, along with the subsequent interchange of rows k and p_k, help to avoid problems due to roundoff errors. This issue was illustrated in the example. We note that the scale factors s_i, i = 1, . . . , n, in (1.23) are determined only once from the elements of the given matrix A and are only used in (1.24) for the determination of the pivoting strategy; the elements of the matrix A are never explicitly scaled. Hence the terminology implicit row scaling.

In the search for the pivot element defined by (1.24) we only considered elements in column σ_k on or below the k-th row. A more complicated process selects the pivot element for the matrix A^{(k)} encountered at the beginning of the k-th stage by a search of the type

    | a_{p,q}^{(k)} | = max_{i=k,...,n; j=k,...,n} | a_{i,j}^{(k)} | ,

where for simplicity we have omitted scalings. Thus we now choose the pivot element to be an element of maximum modulus among all elements of the (n−k+1) × (n−k+1) submatrix with entries a_{i,j}^{(k)}, i = k, . . . , n, j = k, . . . , n. This strategy is known as full or complete pivoting. In the presence of roundoff errors the full pivoting strategy is theoretically more stable than the partial pivoting strategy of (1.24); however, in the great majority of practical cases the latter appears to be adequate with regards to the avoidance of ill effects resulting from roundoff errors.

Returning to our example, we see that if we use the implicit row scaling described here to determine the pivoting strategy, then the row scaling would force the interchange of the rows of the last system, i.e., one solves the system

    [ 1   1      ] [ x_1 ]   [ 2      ]
    [ 10  10,000 ] [ x_2 ] = [ 10,000 ]

which, in two-digit arithmetic, yields the same good answer x_1 = x_2 = 1. To prevent difficulties of the type that occur in Example 1.7, we incorporate a row scaling into our algorithms. There are other types of scalings which would have a similar effect as the implicit row scaling used here.
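The effect described in Example 1.7 can be reproduced directly by simulating two-significant-digit arithmetic. The rounding helper rd and the tiny solver solve2x2 below are illustrative code, not part of the text's algorithms; the sketch assumes a nonsingular 2 × 2 system.

    import math

    def rd(x, sig=2):
        # Round x to `sig` significant decimal digits.
        if x == 0.0:
            return 0.0
        e = math.floor(math.log10(abs(x)))
        return round(x, sig - 1 - e)

    def solve2x2(a, b, pivot):
        # Eliminate and back substitute, rounding after every operation.
        a = [row[:] for row in a]
        b = b[:]
        if pivot and abs(a[1][0]) > abs(a[0][0]):
            a[0], a[1] = a[1], a[0]
            b[0], b[1] = b[1], b[0]
        m = rd(a[1][0] / a[0][0])
        a[1][1] = rd(a[1][1] - rd(m * a[0][1]))
        b[1] = rd(b[1] - rd(m * b[0]))
        x2 = rd(b[1] / a[1][1])
        x1 = rd(rd(b[0] - rd(a[0][1] * x2)) / a[0][0])
        return x1, x2

    A = [[0.001, 1.0], [1.0, 1.0]]
    b = [1.0, 2.0]
    print(solve2x2(A, b, pivot=False))   # (0.0, 1.0): disastrous
    print(solve2x2(A, b, pivot=True))    # (1.0, 1.0): correct to two digits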

1.3.2 Solving linear systems using Gaussian elimination

The goal of the Gaussian elimination method is to transform the n × n coefficient matrix A into an n × n matrix U which is upper triangular; this reduction is accomplished using elementary row operations, i.e., the interchange of two rows, and the replacement of a row by the sum of that row and a scalar multiple of another row. At the same time, the elementary row operations are applied to the right-hand side vector b. The algorithm given here is essentially the method described in Proposition 1.3, where we transformed an n × n matrix A into an upper triangular matrix U by premultiplication by permutation matrices and Gauss transformation matrices. In fact, premultiplication by these matrices effects the elementary row operations of the Gaussian elimination method, i.e., the permutation matrices effect the row interchange operations and the Gauss transformations effect the row replacement operations.

In Gaussian elimination the elementary row operations applied to the coefficient matrix A and right-hand side vector b result in a linear system of the form

(1.25)    U x = c ,

where x is the solution of (1.22), U is exactly the same matrix as that obtained in the P A = LU factorization of A, and c is given by

(1.26)    c = M^{(ℓ)} P_{(ℓ,p_ℓ)} · · · M^{(2)} P_{(2,p_2)} M^{(1)} P_{(1,p_1)} b .

Here M^{(k)} and P_{(k,p_k)}, k = 1, . . . , ℓ, are the Gauss transformations and elementary permutation matrices, respectively, that are used to find the P A = LU factorization of A. Thus, Gaussian elimination and triangular factorization are equivalent; in fact, it is easy to show that c = L^{−1} P b, so that (1.25) is equivalent to P A x = LU x = P b.

Recall that in the algorithm for the triangular factorization of A we did not generate L and U as in Proposition 1.3, but rather we obtained the equations for the components of the matrices by equating entries on each side of equations such as P A = LU. In the Gaussian elimination method we actually carry out the steps found in the proof of Proposition 1.3 without explicitly forming the Gauss transformations M^{(k)} or the permutation matrices P_{(k,p_k)}.

In the following algorithm we use the implicit row scaling and partial pivoting strategies that were described above. As is the case in Algorithm 1.5, we do not physically interchange the rows, but rather keep track of the interchanges in the vector γ. Also, the vector σ stores the column index of the first nonzero entry in each row and the scalar r keeps track of the number of nonzero pivot elements encountered.

Algorithm 1.1 Gaussian elimination with implicit row scaling and partial pivoting. Given an n × n matrix A and an n-vector b, this algorithm transforms the system A x = b into a system of the form (1.25), where U is overwritten onto A and c is overwritten onto b. The algorithm uses the partial pivoting and implicit row scaling strategies discussed in Section 1.3.1 to choose the pivot element; in particular, the pivot search is determined by (1.24). On output, γ and σ provide the same information as in Algorithm 1.5. In particular, if σ_k > n, then row k contains only zero elements.

Set k = 1, σ_1 = 1, and r = 0.

For i = 1, . . . , n, set γ_i = i and s_i = Σ_{j=1}^{n} | a_{i,j} | .

Do while k ≤ n, σ_k ≤ n, and Σ_{i=k}^{n} s_{γ_i} ≠ 0 :

    do while max_{i=k,...,n, s_{γ_i} ≠ 0} | a_{γ_i,σ_k} | / s_{γ_i} = 0 :

        set σ_k ← σ_k + 1 ;

set p equal to the smallest integer such that

        | a_{γ_p,σ_k} | / s_{γ_p} = max_{i=k,...,n, s_{γ_i} ≠ 0} | a_{γ_i,σ_k} | / s_{γ_i} ;

    set r ← r + 1 ;

    if k ≠ p, interchange the contents of γ_k and γ_p ;

    for i = k + 1, . . . , n :

        set t = a_{γ_i,σ_k} / a_{γ_k,σ_k} ;

        for j = σ_k + 1, . . . , n, set a_{γ_i,j} ← a_{γ_i,j} − t a_{γ_k,j} ;

        set b_{γ_i} ← b_{γ_i} − t b_{γ_k} ;

    set σ_{k+1} = σ_k + 1 ;

    set k ← k + 1 .

For i = k, . . . , n, set σ_i = n + 1.     □

For some classes of matrices, e.g., square positive definite matrices, it is known a priori that Gaussian elimination can proceed stably without the need for any row interchanges. In such cases it is wasteful to perform the pivot search and therefore that step is removed from the algorithm. Also, in this case there is no need to introduce the interchange array γ, the pivot position array σ, or the scale factors s_i.

A few comments should be made concerning the algorithm. First, the row replacement loop begins in column (σ_k + 1) even though we are eliminating in column σ_k. This avoids the computation of a_{γ_i,σ_k} for i = k + 1, . . . , n, which are known to vanish. If the n-th stage is reached, then we are working with the last row and therefore we do not need to eliminate any elements. Thus if k = n we exit after ascertaining whether or not the n-th row contains any nonzero elements and the pivot position for the n-th row. We also note that the scale factors s_i, i = 1, . . . , n, are determined once and for all from the elements of the given matrix A and are only used in the determination of the interchange strategy; the elements of the matrix A are not explicitly scaled. Concerning the storage required by the algorithm, we see that through overwriting the elements of A we can implement the algorithm with roughly the same storage as that required to store A itself. Of course, if one overwrites, then the matrix A will be destroyed during the computations. Once again we remark that other pivoting strategies could be incorporated, such as the full pivoting strategy discussed in Section 1.3.1.
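For readers who want to experiment, here is a compact NumPy sketch of the elimination phase. To keep it short it swaps rows explicitly instead of maintaining the γ and σ arrays, and it assumes A is square with no zero rows, so it is a simplification of Algorithm 1.1 rather than a transcription.

    import numpy as np

    def gauss_eliminate(A, b):
        # Reduce A x = b to U x = c using scaled partial pivoting (1.23)-(1.24).
        U = A.astype(float).copy()
        c = b.astype(float).copy()
        n = U.shape[0]
        s = np.abs(U).sum(axis=1)                 # scale factors (1.23)
        for k in range(n - 1):
            p = k + np.argmax(np.abs(U[k:, k]) / s[k:])   # pivot search (1.24)
            if U[p, k] == 0.0:
                continue                          # nothing to eliminate
            U[[k, p]], c[[k, p]], s[[k, p]] = U[[p, k]], c[[p, k]], s[[p, k]]
            for i in range(k + 1, n):
                t = U[i, k] / U[k, k]
                U[i, k] = 0.0
                U[i, k + 1:] -= t * U[k, k + 1:]  # row replacement
                c[i] -= t * c[k]
        return U, c

    A = np.array([[2., 1., 3.], [0., -2., 7.], [4., 4., 5.]])
    b = np.array([1., 2., 4.])
    U, c = gauss_eliminate(A, b)
    x = np.linalg.solve(U, c)                     # back substitution
    print(np.allclose(A @ x, b))                  # True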

A variant of the above algorithm, known as Gauss-Jordan elimination, explicitly scales rows so that the pivot element is set to unity and one eliminates above as well as below the pivot position. The outcome of Gauss-Jordan elimination is a matrix which is in row reduced echelon form; such a matrix is a row echelon matrix whose entries in a column containing a pivot element all vanish excepting, of course, the pivot element itself. In particular, if the given matrix A is square and of full rank, i.e., invertible, the result of Gauss-Jordan elimination is simply the identity matrix. Although Gauss-Jordan elimination is of substantial theoretical importance, it is not as efficient to use as the above version of Gaussian elimination for solving linear systems.

In the following examples the partial pivoting strategy of Algorithm 1.1 is used to reduce the given systems to the form (1.25).

Example 1.8 Let

    A = [ 2   1  3 ]            [ 1 ]
        [ 0  -2  7 ]  and  b =  [ 2 ] .
        [ 4   4  5 ]            [ 4 ]

The reduction of the system to the form (1.25) using the pivoting strategy of Algorithm 1.1 gives the following sequence of matrices

    [ 2   1  3 | 1 ]      [ 4   4  5   |  4 ]      [ 4   4  5   |  4 ]
    [ 0  -2  7 | 2 ]  →   [ 0  -2  7   |  2 ]  →   [ 0  -2  7   |  2 ]
    [ 4   4  5 | 4 ]      [ 2   1  3   |  1 ]      [ 0  -1  1/2 | -1 ]

         [ 4   4   5 |  4 ]
    →    [ 0  -2   7 |  2 ] ,
         [ 0   0  -3 | -2 ]

where we have augmented the coefficient matrix with the right-hand side vector. If the storage scheme of Algorithm 1.1 is used, then the resulting upper triangular matrix is stored as

    [ 0   0  -3 ]
    [ 0  -2   7 ] .
    [ 4   4   5 ]

Example 1.9 Consider the system A x = b where

    A = [ 1   -2    1   -4 ]            [ 1 ]
        [ 1    3    7    2 ]  and  b =  [ 1 ] .
        [ 1  -12  -11  -16 ]            [ 1 ]

The reduction of the system to the form (1.25) using the pivoting strategy of Algorithm 1.1 gives the following sequence of matrices

    [ 1   -2    1   -4 | 1 ]      [ 1    -2    1    -4 | 1 ]
    [ 1    3    7    2 | 1 ]  →   [ 0     5    6     6 | 0 ]
    [ 1  -12  -11  -16 | 1 ]      [ 0   -10  -12   -12 | 0 ]

         [ 1    -2    1    -4 | 1 ]
    →    [ 0   -10  -12   -12 | 0 ] ,
         [ 0     0    0     0 | 0 ]

where we have augmented the coefficient matrix with the right-hand side.

Using Algorithm 1.1 we can reduce any n × n linear system to one of the form (1.25). The case of most interest is when the square matrix A is invertible. In this case the unique solution of the upper triangular system (1.25) may be easily determined by a back substitution procedure. In this procedure we simply solve for the unknowns x_i, i = 1, . . . , n, in reverse order. However, if A is not invertible (or if it is rectangular) we may still use a back substitution procedure to find a particular solution of (1.22) or (1.25). Of course, for a solution to exist these systems must be consistent, i.e., the given right-hand side vector b must belong to the column space of A or, equivalently, must satisfy z^T b = 0 for all n-vectors z such that A^T z = 0. (See Appendix I.) The consistency of the system (1.25) is easily determined since the system A x = b is consistent if and only if the system U x = c is consistent. Indeed, one must only check if c_k = 0 for all values of k such that the k-th row of U contains only zero entries, i.e., such that σ_k > n. Obviously, if c_k ≠ 0 for such a row, the system is inconsistent.

Once the consistency of the system has been verified, one may proceed with the back substitution process. The components of x which correspond to columns of U which do not contain a pivot element may be set to any arbitrary value; we call these variables free variables. Starting with the last such row, the non-trivial rows of U are used to determine the values of the remaining components of x, which we call pivot variables. Specifically, a non-trivial row, say row k, is used to determine the value of the component x_{σ_k}, where σ_k gives the pivot position of the k-th row. The other components x_i, i > σ_k, which can possibly appear in the equation corresponding to the k-th row of (1.25) are either free variables or have been previously determined from the equations corresponding to rows of (1.25) below the k-th row. Clearly, the number of pivot variables equals the rank of A and the number of free variables is n − rank(A). In the algorithm which follows we set the free variables to unity; other choices for the values of the free variables are also easily implemented.

We give the generalized back substitution algorithm in the notation of the generalized Gaussian elimination algorithm 1.1. Here a_{i,j}, b_i, σ_k, and γ_k refer to results of the Gaussian elimination process.

Algorithm 1.2 Generalized back substitution. Given the linear system A x = b, where A is an n × n matrix and b is an n-vector. Assume that Algorithm 1.1 has been used to reduce the system to the form U x = c, where U and c are overwritten onto A and b, respectively. Also let γ and σ denote the vectors generated in Algorithm 1.1. This algorithm finds a particular solution to the system U x = c or, equivalently, A x = b, if it is consistent; all free variables are set to unity.

Set k = n and set x_j = 1 for j = 1, . . . , n.

Do while k ≥ 1 :

    if σ_k > n then:

        if b_{γ_k} ≠ 0, exit and indicate that the system is inconsistent;

    else: set

        x_{σ_k} = ( 1 / a_{γ_k,σ_k} ) ( b_{γ_k} − Σ_{j=σ_k+1}^{n} a_{γ_k,j} x_j ) ;

    set k ← k − 1 .     □

Thus the combination of Algorithm 1.1 and the generalized back substitution given in Algorithm 1.2 enables one to find a particular solution of any consistent linear system of algebraic equations.

Example 1.10 Consider the system A x = b of Example 1.9 which has been reduced to the upper triangular system

    [ 1    -2    1    -4 ] [ x_1 ]   [ 1 ]
    [ 0   -10  -12   -12 ] [ x_2 ] = [ 0 ] .
    [ 0     0    0     0 ] [ x_3 ]   [ 0 ]
                           [ x_4 ]

This is a consistent system, so in the generalized back substitution algorithm we set x_4 = x_3 = 1 and solve for x_2 and then x_1. Thus a particular solution is x_1 = −4/5, x_2 = −12/5, x_3 = 1, and x_4 = 1.
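A short sketch of the generalized back substitution, without the γ/σ bookkeeping (it assumes U is already in echelon form with its rows in order), reproduces the particular solution of Example 1.10; the function name back_substitute is an illustrative choice.

    import numpy as np

    def back_substitute(U, c):
        # Particular solution of a consistent echelon system U x = c,
        # with all free variables set to 1 (cf. Algorithm 1.2).
        m, n = U.shape
        x = np.ones(n)
        for k in range(m - 1, -1, -1):
            nz = np.flatnonzero(U[k])          # pivot column of row k, if any
            if nz.size == 0:
                if c[k] != 0:
                    raise ValueError("inconsistent system")
                continue
            p = nz[0]
            x[p] = (c[k] - U[k, p + 1:] @ x[p + 1:]) / U[k, p]
        return x

    U = np.array([[1., -2., 1., -4.],
                  [0., -10., -12., -12.],
                  [0., 0., 0., 0.]])
    c = np.array([1., 0., 0.])
    print(back_substitute(U, c))               # [-0.8 -2.4  1.   1. ]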

1.3.3 Solving linear systems using triangular factorizations

In this section we first present algorithms for determining triangular factorizations of the types P A = LU, A = LU, and A = LL^T. Then we see how these algorithms can be employed to solve linear systems.

Triangular factorizations without pivoting

The elements of the matrices L and U appearing in (1.15) may be obtained as one proceeds through the reduction process of Proposition 1.6, e.g., L is defined by (1.16) and U = A^{(ℓ+1)} is the end product of the process. However, the elements of L and U may also be obtained directly by equating, element by element, the left- and right-hand sides of the equation A = LU. We have that

(1.27)    a_{i,j} = Σ_{k=1}^{K} l_{i,k} u_{k,j} ,    i = 1, . . . , n and j = 1, . . . , n,

where K = min(i, j), i = 1, . . . , n and j = 1, . . . , n. Noting that l_{i,i} = 1 for i = 1, . . . , n, we see that (1.27) represents n² equations for the n² nontrivial unknowns l_{i,k} and u_{k,j}, where k = 1, . . . , n, i = k + 1, . . . , n, and j = k, . . . , n, i.e., those which are not known a priori to be one or zero. From (1.27) explicit formulas for these unknowns are easily found and are given in the following algorithm for triangular factorization without pivoting. This algorithm is known as the Doolittle reduction of a matrix into triangular factors.

Algorithm 1.3 Triangular factorization without pivoting. Let A be an n × n matrix which possesses an LU factorization A = LU such that u_{k,k} ≠ 0 for k = 1, . . . , n − 1. This algorithm computes the factors L and U whenever no pivoting, i.e., no interchange of rows, is necessary. All elements of L and U that are not explicitly computed vanish. (The algorithm fails if u_{k,k} = 0 for some k such that 1 ≤ k ≤ n − 1; see Example 1.4.)

For k = 1, . . . , n, set l_{k,k} = 1.

For j = 1, . . . , n, set u_{1,j} = a_{1,j} .

For i = 2, . . . , n, set l_{i,1} = a_{i,1} / u_{1,1} . For k = 2, . . . , n, set

    u_{k,j} = a_{k,j} − Σ_{ℓ=1}^{k−1} l_{k,ℓ} u_{ℓ,j}    for j = k, . . . , n

    l_{i,k} = ( 1 / u_{k,k} ) ( a_{i,k} − Σ_{ℓ=1}^{k−1} l_{i,ℓ} u_{ℓ,k} )    for i = k + 1, . . . , n.     □

An examination of the algorithm reveals that one first solves the equations corresponding to the first row of A, then those corresponding to the first column, then the second row, then the second column, etc. Note that u_{k,k} ≠ 0 for k = 1, . . . , n − 1 is a sufficient condition for the algorithm to proceed without row interchanges; these are exactly the denominators needed in the computations of the elements of L. The operation count for Algorithm 1.3 is approximately n³/2 − n³/6 = n³/3 multiplications and a like number of additions.

As mentioned in Section 1.2, there are several variants of the factorization A = LU given in Proposition 1.6, and algorithms analogous to the Doolittle reduction can be generated for these variants. For example, in the Crout reduction algorithm we choose U to be unit upper triangular and L to be lower triangular in the decomposition A = LU. To generate the equations for L and U in this case we solve for a column before solving for the corresponding row. Another variant is discussed in Exercise 3.15.
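A direct transcription of the Doolittle recursion into NumPy takes only a few lines; this sketch assumes no pivoting is needed (on a matrix like B of Example 1.4, whose first pivot vanishes, the division produces warnings and NaNs).

    import numpy as np

    def doolittle(A):
        # LU factorization A = L U with unit lower triangular L (Algorithm 1.3).
        n = A.shape[0]
        L = np.eye(n)
        U = np.zeros((n, n))
        for k in range(n):
            U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]          # k-th row of U
            L[k + 1:, k] = (A[k + 1:, k] - L[k + 1:, :k] @ U[:k, k]) / U[k, k]
        return L, U

    A = np.array([[2., -1., 0.], [4., -5., 3.], [-6., 6., -2.]])  # Example 1.4
    L, U = doolittle(A)
    print(np.allclose(L @ U, A))    # True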

For the sake of clarity, in Algorithm 1.3 we have kept distinct the roles of the elements of A, L, and U. If we do not wish to save the matrix A, then the results of the decomposition of A into triangular factors may be written over the corresponding elements of A. For example, we have, after the k-th stage, the following storage scheme:

    [ u_{1,1}    u_{1,2}    · · ·  u_{1,k}    u_{1,k+1}    · · ·  u_{1,n}   ]
    [ l_{2,1}    u_{2,2}    · · ·  u_{2,k}    u_{2,k+1}    · · ·  u_{2,n}   ]
    [ l_{3,1}    l_{3,2}    · · ·  u_{3,k}    u_{3,k+1}    · · ·  u_{3,n}   ]
    [    ⋮          ⋮                 ⋮           ⋮                   ⋮     ]
    [ l_{k,1}    l_{k,2}    · · ·  u_{k,k}    u_{k,k+1}    · · ·  u_{k,n}   ]
    [ l_{k+1,1}  l_{k+1,2}  · · ·  l_{k+1,k}  a_{k+1,k+1}  · · ·  a_{k+1,n} ]
    [    ⋮          ⋮                 ⋮           ⋮                   ⋮     ]
    [ l_{n,1}    l_{n,2}    · · ·  l_{n,k}    a_{n,k+1}    · · ·  a_{n,n}   ]

Of course, l_{i,i} = 1 need not be stored at all. In the case where A is symmetric and positive definite, the elements of the Cholesky factor L may be computed by the following algorithm for the Cholesky factorization of a positive definite symmetric matrix.

Algorithm 1.4 Cholesky factorization of a positive definite symmetric matrix. Let A be an n × n positive definite symmetric matrix. The following algorithm generates the factor L in the factorization A = LL^T.

Set l_{1,1} = √a_{1,1} and, for i = 2, . . . , n, set l_{i,1} = a_{i,1} / l_{1,1} . For k = 2, . . . , n set

    l_{k,k} = ( a_{k,k} − Σ_{ℓ=1}^{k−1} | l_{k,ℓ} |² )^{1/2}

and for i = k + 1, . . . , n set

    l_{i,k} = ( 1 / l_{k,k} ) ( a_{i,k} − Σ_{ℓ=1}^{k−1} l_{i,ℓ} l̄_{k,ℓ} ) .     □

Because of the square roots involved, it is sometimes preferable to use the L D L^T factorization of a symmetric positive definite matrix instead of the Cholesky factorization LL^T. (The former factorization may be effected without any square roots.) If one attempts to apply the Cholesky Algorithm 1.4 to a symmetric matrix A which is not positive definite, then for some k the argument of the square root will be non-positive. Thus, either l_{k,k} = 0 or l_{k,k} will be imaginary. This can serve as a test of whether or not a given symmetric matrix is positive definite.
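The following sketch implements Algorithm 1.4 for real matrices and, as suggested above, doubles as a positive-definiteness test by raising an error when the argument of the square root fails to be positive.

    import numpy as np

    def cholesky(A):
        # Cholesky factor of a real symmetric positive definite matrix
        # (Algorithm 1.4); raises ValueError if A is not positive definite.
        n = A.shape[0]
        L = np.zeros((n, n))
        for k in range(n):
            d = A[k, k] - L[k, :k] @ L[k, :k]
            if d <= 0.0:
                raise ValueError("matrix is not positive definite")
            L[k, k] = np.sqrt(d)
            L[k + 1:, k] = (A[k + 1:, k] - L[k + 1:, :k] @ L[k, :k]) / L[k, k]
        return L

    A = np.array([[1., 2., -1.], [2., 13., 13.], [-1., 13., 42.]])
    print(cholesky(A))    # reproduces the factor of Example 1.6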

We note that the operation count for the Cholesky factorization is approximately half of the count for the factorization A = LU, i.e., for an n × n matrix the leading term is n³/6 multiplications and a like number of additions. From Algorithm 1.4 we see that for any k = 1, . . . , n,

    Σ_{ℓ=1}^{k} | l_{k,ℓ} |² = a_{k,k} ,

so that

    | l_{k,i} |² ≤ a_{k,k}    for i = 1, . . . , k.

Thus the elements of the Cholesky factor L are bounded in terms of the square roots of the diagonal elements of the given matrix A. This fact has important implications regarding the numerical stability of the Cholesky factorization (1.19).

Triangular factorizations with pivoting

We now consider the direct computation of the factorization P A = LU given in Theorem 1.4. Since the pivoting strategy is not known at the start of the computation, i.e., we do not know which pivot elements will vanish or be small, we cannot simply factor the matrix P A. Indeed, the pivoting strategy is determined as one proceeds through the reduction procedure. We obtain the following algorithm for the Doolittle reduction with partial pivoting and implicit row scaling.

Algorithm 1.5 Triangular factorization with partial pivoting and implicit row scaling. Let A be an n × n matrix. This algorithm computes the factors L and U and the permutation matrix P in the factorization P A = LU and also the rank r of the matrix A. The algorithm uses partial pivoting and implicit row scaling as discussed in Section 1.3.1.

Set k = 1, σ_1 = 1, and r = 0.

For i = 1, . . . , n, set γ_i = i and s_i = Σ_{j=1}^{n} | a_{i,j} | .

Do while k ≤ n, σ_k ≤ n, and Σ_{i=k}^{n} s_{γ_i} ≠ 0 :

    set c = 0 ;

    do while c = 0 :

        for i = k, . . . , n, set d_{γ_i} = a_{γ_i,σ_k} − Σ_{t=1}^{k−1} l_{γ_i,t} u_{γ_t,σ_k} ;

        set c = max_{i=k,...,n, s_{γ_i} ≠ 0} | d_{γ_i} | / s_{γ_i} ;

        if c = 0, set σ_k ← σ_k + 1 ;

    set p = smallest integer such that | d_{γ_p} | / s_{γ_p} = c ;

    set r ← r + 1 ;

    if k ≠ p, interchange the contents of γ_p and γ_k ;

    set u_{γ_k,σ_k} = d_{γ_k} ;

    for j = σ_k + 1, . . . , n, set

        u_{γ_k,j} = a_{γ_k,j} − Σ_{t=1}^{k−1} l_{γ_k,t} u_{γ_t,j} ;

    for i = k + 1, . . . , n, set

        l_{γ_i,k} = d_{γ_i} / d_{γ_k} ;

    set σ_{k+1} = σ_k + 1 ;

    set k ← k + 1 .

For i = k, . . . , n, set σ_i = n + 1.     □

Note that the algorithm does not involve the physical interchange of rows of the partially reduced matrices; the pivoting strategy is recorded in the integers γ_k. At any stage, γ_i = j indicates that the i-th row of the partially determined matrices L and U is found in the j-th row of storage. The integers σ_k are used to keep track of the columns in which the pivot elements, i.e., the leading nonzero entries, appear in any row. For the k-th row, the pivot element is in column σ_k.

Again, in Algorithm 1.5, we have kept distinct the roles of the elements of A, L, and U. However, once again one may overwrite the elements of A with the nontrivial elements of L and U, e.g., by using the overwriting scheme

    d_{γ_i} → (γ_i, σ_k) position in storage,    i = k, . . . , n,

    u_{γ_k,j} → (γ_k, j) position in storage,    j = σ_k + 1, . . . , n,

    l_{γ_i,k} → (γ_i, σ_k) position in storage,    i = k + 1, . . . , n.

In particular, no extra storage is required for the d_{γ_i}'s, and thus Algorithm 1.5 has storage requirements the same as that for the original matrix A, plus two integer arrays of length n for the γ_i's and the σ_i's, and, if scaling is used, another array of length n for the scale factors s_i. These points are illustrated by the following example.

Example 1.11 Let A be an n × n matrix for which we are using Algorithm 1.5 to obtain its P A = LU factorization. We illustrate the storage at the end of the third stage of reduction, i.e., after the calculations with k = 3 are carried out. Suppose

the pivoting strategy of the three steps is found to be p_1 = 3, p_2 = 5, and p_3 = 5, so that γ_1 = 3, γ_2 = 5, γ_3 = 2, γ_4 = 4, γ_5 = 1, and γ_i = i for i > 5. Also, assume that σ_1 = 1, σ_2 = 3, and σ_3 = 4, so that the columns containing the pivot elements in the first three rows have indices 1, 3, and 4, respectively. We then have the following storage scheme at the beginning of the fourth step:

    [ l_{5,1}  l_{5,2}  l_{5,3}  ⊕        a_{5,5}  · · ·  a_{5,n} ]
    [ l_{3,1}  l_{3,2}  ×        u_{3,4}  u_{3,5}  · · ·  u_{3,n} ]
    [ u_{1,1}  u_{1,2}  u_{1,3}  u_{1,4}  u_{1,5}  · · ·  u_{1,n} ]
    [ l_{4,1}  l_{4,2}  l_{4,3}  ×        a_{4,5}  · · ·  a_{4,n} ]
    [ l_{2,1}  ×        u_{2,3}  u_{2,4}  u_{2,5}  · · ·  u_{2,n} ]
    [ l_{6,1}  l_{6,2}  l_{6,3}  ⊕        a_{6,5}  · · ·  a_{6,n} ]
    [    ⋮        ⋮        ⋮        ⋮        ⋮              ⋮     ]
    [ l_{n,1}  l_{n,2}  l_{n,3}  ⊕        a_{n,5}  · · ·  a_{n,n} ]

The elements denoted by the symbol ⊕ will be filled in later with the elements of the fourth column of L. The elements denoted by the symbol × should be zero due to the fact that σ_k > k for k ≥ 2. In Algorithm 1.5 these zeros are not always actually computed (this would be wasteful), so that these locations in storage do not necessarily contain zeros. However, in any subsequent use of the factorization effected by Algorithm 1.5, these storage locations are never accessed.

We note that it is sometimes recommended that the d_{γ_i}'s be accumulated in double precision in order to minimize the effects of roundoff error.

After completing Algorithm 1.5, a pivoting strategy is known, so that a posteriori we may form the matrix P A. Interestingly, if we proceed to factor P A using Algorithm 1.3 (suitably amended as indicated in Exercise 1.17), it can be shown that the factorization so obtained, i.e., the L and the U, is the same as that obtained from Algorithm 1.5. Thus if one knew a good pivoting strategy beforehand, and reordered the rows of the original matrix A following that strategy, then the reordered matrix could be factored without invoking row interchanges.

We note that the leading term in the operation count for the factorization P A = LU is the same as that for A = LU. This is due to the fact that the extra work required for the pivot search is of the order n²/2, which is of lower order than the leading term in the operation count for the factorization.


Using triangular factorizations to solve linear systems

The algorithms for triangular factorizations described above can also be used for solving linear systems. In fact, triangular factorization algorithms are the most efficient algorithms for solving a sequence of several systems of equations having the same coefficient matrix but different right-hand side vectors. In this situation one computes the factorization of the coefficient matrix once and then performs two solves (a forward and a backward) for each right-hand side. This is in contrast to Algorithm 1.1, in which the modifications to the right-hand side are performed at the same time the matrix is being reduced. As an immediate consequence of Theorem 1.4, we have the following proposition.

Proposition 1.8 Let A be a given n × n matrix and b an n-vector. Then the linear system A x = b is equivalent to the linear system

(1.28) LUx = P b , where the matrices L, U, and P are defined by the factorization P A = LU in Theorem 1.4.

Once P, L, and U are determined in the factorization P A = LU, we may solve the system LU x = P b, or equivalently A x = b, by applying a forward solve to

(1.29) Ly = P b to determine y and then determining x by applying a back solve to

(1.30) Ux = y using Algorithm 1.2. The forward solve is given by the following algorithm where we assume that L has been generated by Algorithm 1.5.

Algorithm 1.6 Forward substitution procedure. Let L be an n × n unit lower triangular matrix generated by Algorithm 1.5 and let b ∈ ℝ^n. Then this algorithm solves the unit lower triangular system L y = P b, where the row interchange information is stored in the vector γ, which is output from Algorithm 1.5.

Set y_1 = b_{γ_1}.

For i = 2, . . . , n, set y_i = b_{γ_i} − Σ_{j=1}^{i−1} l_{γ_i,j} y_j .     □

We note that if no pivoting was performed to calculate the LU factorization, then P = I and this forward solve algorithm is still valid since in this case γ_i = i. However, this algorithm is not valid for the Cholesky factorization LL^T since in this case L is not unit lower triangular. The modifications to Algorithm 1.6 for the case of L being an arbitrary lower triangular matrix are left as an exercise.

Example 1.12 Consider the system A x = b where

    A = [  2  -1  0 ]            [ 1 ]
        [  4  -5  3 ]  and  b =  [ 2 ] .
        [ -6   6  2 ]            [ 2 ]

Proceeding as in Example 1.4, A has the LU factorization

    A = LU = [  1   0  0 ] [ 2  -1  0 ]
             [  2   1  0 ] [ 0  -3  3 ]
             [ -3  -1  1 ] [ 0   0  5 ] .

We thus solve the system L y = b to obtain y = (1, 0, 5)^T and the system U x = y to obtain the solution x = (1, 1, 1)^T.
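The two triangular solves of Example 1.12 are easy to code directly; the sketch below takes P = I and omits the γ bookkeeping of Algorithm 1.6, and the function names are illustrative.

    import numpy as np

    def forward_solve(L, b):
        # L y = b for unit lower triangular L (cf. Algorithm 1.6 with P = I).
        y = b.astype(float).copy()
        for i in range(1, len(b)):
            y[i] -= L[i, :i] @ y[:i]
        return y

    def back_solve(U, y):
        # U x = y for a nonsingular upper triangular U.
        n = len(y)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

    L = np.array([[1., 0., 0.], [2., 1., 0.], [-3., -1., 1.]])
    U = np.array([[2., -1., 0.], [0., -3., 3.], [0., 0., 5.]])
    b = np.array([1., 2., 2.])
    y = forward_solve(L, b)     # [1. 0. 5.]
    print(back_solve(U, y))     # [1. 1. 1.]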

Operation counts

We now turn to the operation counts for solving linear systems. We stated that the LU factorization requires approximately n³/3 multiplications and a like number of additions. The number of operations for the Gaussian elimination algorithm is the same as for the LU factorization, except we must also include the work required to transform the right-hand side. However, this requires (n² + n)/2 multiplications and a like number of additions, so it does not affect the leading term. The back substitution requires another (n² − n)/2 multiplications and a like number of additions. Thus the total to solve an n × n system by Gaussian elimination is approximately n³/3 multiplications and a like number of additions. To solve a linear system using the LU factorization we see that we need to do both a forward and a back solve. However, the forward solve requires approximately n²/2 multiplications and the back solve approximately n²/2, so that the leading term in the operation count is the same, which, of course, is what we expect.

At the beginning of this section we mentioned that using triangular factorizations for solving linear systems is especially useful when one wants to solve a sequence of linear systems having the same coefficient matrix. Consider finding x^{(k)} such that A x^{(k)} = b^{(k)} for k = 1, . . . , K, where { b^{(k)} } denotes a sequence of right-hand side vectors that are not known a priori. For example, the right-hand side vector b^{(k)} for the k-th linear system could be a function of the solution vector x^{(k−1)} of the (k−1)-st linear system. Then, for example, if A is a square matrix, re-eliminating for every k requires, for large n, approximately K n³/3 multiplications and a like number of additions or subtractions in order to determine the sequence { x^{(k)} }. However, if we compute the factors L and U and the pivoting strategy P and then solve for x^{(k)} by L y^{(k)} = P b^{(k)} and U x^{(k)} = y^{(k)}, k = 1, . . . , K, the multiplication or addition count reduces to approximately n³/3 + K n², the first term accounting for the determination of the unchanging factors L and U and the second term accounting for the K forward and backward solves.

1.3.4 Calculation of determinants

The determinant of a matrix may be computed as a by-product of triangular factorization or the Gaussian elimination algorithm. From the factorization PA = LU, where L is unit lower triangular, we see that det(PA) = det U. But det(PA) is either equal to det A or equal to −det A since interchanging two rows of a matrix changes only the sign of the determinant. It then follows that

(1.31) $\det A = (-1)^{\xi} \det U = (-1)^{\xi} \prod_{i=1}^{n} u_{i,i}\,,$

where ξ denotes the total number of row interchanges performed in transforming the matrix A into the matrix U. The value of ξ may be easily accumulated during the elimination procedure. Of course, if A = LU, i.e., no row interchanges were performed, then

$\det A = \prod_{i=1}^{n} u_{i,i}\,.$

Also, we note that if A is a Hermitian positive definite matrix, then as a by-product of the Cholesky factorization A = LL* we have

$\det A = |l_{1,1}|^2\, |l_{2,2}|^2 \cdots |l_{n,n}|^2\,,$

where l_{i,j} are the entries of the matrix L. Similarly, in the Gaussian elimination algorithm det A is given by (1.31) since we have obtained U by applying elementary row operations to A. The only effect of a row interchange is to change the sign of the determinant, while a row replacement operation, i.e., the replacement of a row by the sum of that row and a multiple of another row, does not change the value of the determinant.
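The bookkeeping in (1.31) can be checked numerically; in the sketch below (ours), SciPy's pivot vector piv records that row i was interchanged with row piv[i], so ξ is the number of entries with piv[i] ≠ i.

import numpy as np
from scipy.linalg import lu_factor

A = np.array([[2., -1., 0.],
              [4., -5., 3.],
              [-6., 6., 2.]])
lu, piv = lu_factor(A)                      # diag(lu) holds the u_ii
xi = np.sum(piv != np.arange(A.shape[0]))   # number of row interchanges
det = (-1.0) ** xi * np.prod(np.diag(lu))   # formula (1.31)
assert np.isclose(det, np.linalg.det(A))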

1.4 Triangular factorizations for m × n matrices

In this section we state results for an m × n matrix, whose entries can be complex, analogous to those proved in Section 1.2 for a square matrix. The proofs are left to the exercises. We will refer to certain matrices as having the following special structure.

Definition 1.3 A matrix is said to have row echelon structure if it differs from a row echelon matrix only in that the first nonzero entry of any row need not be a 1.

A matrix having row echelon structure is clearly upper trapezoidal; the converse is not necessarily true. The goal of this section is to give the result that any m × n matrix A has the factorization PA = LU, where P is an m × m permutation matrix, L is an m × m unit lower triangular matrix, and U is an m × n matrix having row echelon structure. The first result shows, through the use of elementary permutation matrices and Gauss transformations, how a given m × n matrix A can be reduced to a matrix U having row echelon structure.
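Before stating it, here is an illustration of Definition 1.3 (ours): the matrix with rows (2, −1, 3, 0), (0, 0, 5, 1), and (0, 0, 0, 0) has row echelon structure, since the leading nonzero entries, which need not equal 1, move strictly to the right from row to row. On the other hand, the upper trapezoidal matrix with rows (0, 1, 2), (0, 3, 4), and (0, 0, 5) does not, since its first two rows have their leading nonzero entries in the same column.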

Proposition 1.9 Let A be a given m × n matrix. Then there exist an integer ℓ, 0 ≤ ℓ ≤ min(m−1, n), Gauss transformation matrices M^(k), k = 1, ..., ℓ, and elementary permutation matrices P_(k,p_k), k = 1, ..., ℓ, k ≤ p_k ≤ m, such that

(1.32) $U = A^{(\ell+1)} = M^{(\ell)} P_{(\ell, p_\ell)} \cdots M^{(2)} P_{(2, p_2)} M^{(1)} P_{(1, p_1)} A$

is an m × n matrix having row echelon structure.
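The reduction (1.32) translates directly into code. The following Python sketch (ours; it picks each p_k by partial pivoting on magnitude, one valid choice among many) returns only the matrix U, discarding the transformations.

import numpy as np

def row_echelon_structure(A, tol=1e-12):
    # Reduce A to a matrix U having row echelon structure by applying
    # elementary permutations and Gauss transformations, cf. (1.32).
    U = np.array(A, dtype=float)
    m, n = U.shape
    k = 0                                    # current elimination row
    for j in range(n):
        if k == m:
            break
        p = k + np.argmax(np.abs(U[k:, j]))  # pick a pivot row p >= k
        if abs(U[p, j]) <= tol:
            continue                         # column already zero below row k
        U[[k, p]] = U[[p, k]]                # elementary permutation P_(k,p)
        mu = U[k+1:, j] / U[k, j]            # Gauss transformation multipliers
        U[k+1:, j:] -= np.outer(mu, U[k, j:])
        k += 1
    return U

The following result provides the general factorization.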

Theorem 1.10 Given any m × n matrix A, there exist an m × m permutation matrix P, an m × m unit lower triangular matrix L, and an m × n matrix U having row echelon structure such that

(1.33) PA = LU.

Furthermore, rank(U) = rank(A), and if A is real, then L and U may be chosen to be real as well.

Let r = rank(U) = rank(A) denote the number of nonzero rows of U. It is, of course, possible for r < m, e.g., whenever m > n or in other cases for which the rows of A are linearly dependent. If r < m, we may partition the factorization as

(1.34) $PA = L \begin{pmatrix} U_1 \\ 0 \end{pmatrix} = \begin{pmatrix} L_1 & L_2 \end{pmatrix} \begin{pmatrix} U_1 \\ 0 \end{pmatrix} = L_1 U_1\,,$

where L_1 is an m × r unit lower trapezoidal matrix, L_2 is m × (m−r), and U_1 is an r × n full rank matrix having row echelon structure. Thus (1.34) shows that an arbitrary m × n matrix with m > r can be factored into the product of an m × r unit lower trapezoidal matrix and an r × n matrix with row echelon structure. Note that L_2 plays no essential role in the PA = LU factorization of A. Also, if r = n, i.e., A has full column rank, then U_1 is an n × n square, nonsingular, upper triangular matrix; if r = m, i.e., A has full row rank, then L = L_1 and U = U_1.
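These shapes can be examined with SciPy's general-purpose lu, which returns matrices satisfying A = P L U, so the permutation matrix in the sense of (1.33) is P^T; for rectangular A it also returns trimmed factors (L of size m × min(m,n), U of size min(m,n) × n). The sketch below (ours) uses a 4 × 3 matrix of rank 2 and recovers L_1 and U_1 by discarding the rows of U that are zero to within roundoff.

import numpy as np
from scipy.linalg import lu

A = np.array([[1., 2., 3.],
              [2., 4., 6.],    # = 2 * row 1
              [1., 0., 1.],
              [2., 2., 4.]])   # = row 1 + row 3, so rank(A) = 2
P, L, U = lu(A)                # A = P @ L @ U, i.e., P.T @ A = L @ U
r = int(np.sum(np.any(np.abs(U) > 1e-12, axis=1)))  # nonzero rows of U
L1, U1 = L[:, :r], U[:r, :]    # the factors in (1.34)
assert np.allclose(P.T @ A, L1 @ U1)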

Proposition 1.11 Given an m × n matrix A, partition its PA = LU factorization as in (1.34), where U_1 has full row rank and L_1 is unit lower trapezoidal. Then, once the row interchange strategy is fixed, i.e., the permutation matrix P is fixed, the matrices L_1 and U_1 appearing in the factorization (1.34) are uniquely determined. If r = rank(A), the number of rows in U_1, then L_2 may be chosen to be any m × (m−r) matrix such that L = (L_1 L_2) is a unit lower triangular matrix.

The following result considers the case when we can perform the decomposition without row interchanges.

Proposition 1.12 Given an m × n matrix A, denote its leading principal k × k submatrices by A_k for k = 1, ..., min(m, n). If A_k is nonsingular for k = 1, ..., ℓ = min(m−1, n), then there exists an m × m unit lower triangular matrix L and an m × n upper trapezoidal matrix U such that

(1.35) A = LU,

where U = A^(ℓ+1) and L is given explicitly by

(1.36) $L = (L_1 \; L_2)\,, \quad\text{where}\quad L_1 = \begin{pmatrix} 1 & & & \\ \mu_2^{(1)} & 1 & & \\ \mu_3^{(1)} & \mu_3^{(2)} & \ddots & \\ \vdots & \vdots & & 1 \\ \vdots & \vdots & & \mu_{\ell+1}^{(\ell)} \\ \mu_m^{(1)} & \mu_m^{(2)} & \cdots & \mu_m^{(\ell)} \end{pmatrix}$

and where L_2 is any m × (m−ℓ) matrix such that L = (L_1 L_2) is a unit lower triangular matrix. Here A^(k), k = 1, ..., ℓ+1, is defined in the proof of Proposition 1.3 and

(1.37) $\mu_j^{(k)} = \dfrac{a_{j,k}^{(k)}}{a_{k,k}^{(k)}}\,.$

Moreover, $u_{i,i} \neq 0$ for i = 1, ..., min(m−1, n). If A is real, then U and L may be chosen to be real as well.
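Under the hypotheses of Proposition 1.12 the factorization can be computed by elimination with no interchanges; the multipliers (1.37) fill the columns of L_1 in (1.36). A minimal sketch (ours):

import numpy as np

def lu_no_pivot(A):
    # A = L U without row interchanges, assuming the leading principal
    # submatrices A_k are nonsingular for k = 1, ..., min(m-1, n).
    U = np.array(A, dtype=float)
    m, n = U.shape
    L = np.eye(m)
    for k in range(min(m - 1, n)):
        L[k+1:, k] = U[k+1:, k] / U[k, k]              # multipliers (1.37)
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])  # row replacements
    return L, U

L, U = lu_no_pivot([[2., -1., 0.], [4., -5., 3.], [-6., 6., 2.]])
# recovers the factors of Example 1.12; U is upper trapezoidal when m > n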

Exercises

1.1 Let P_(k,ℓ) be defined by (1.1). Show that (1.2) holds.

1.2 It is not efficient to store an entire m × m permutation matrix P on a computer. Describe an efficient way to give sufficient information to describe a permutation matrix.

1.3 Show that if M^(j) is given by (1.3), then

$\bigl(M^{(j)}\bigr)^{-1} = I + \mu (e^{(j)})^T\,.$

Given that M^(j) and M^(ℓ) have the form (1.5) with possibly different values of q, determine (M^(j))^{-1} (M^(ℓ))^{-1} for j < ℓ.

1.4 Show that a permutation matrix is unitary. Show that the inverse of a permutation matrix is a permutation matrix. Show that the product of two permutation matrices is a permutation matrix.

1.5 Give an example of a nontrivial matrix that fails to have a unique LU factorization even if the factorization can be obtained without any row interchanges. Prove or disprove the assertion that if a square matrix A has an A = LU factorization, then A is invertible.

1.6 Suppose an m × n matrix A has an A = LU factorization. State and prove the most general uniqueness result for the factors L and U.

1.7 Prove that if A has the factorization A = LU and A is nonsingular, then the factorization is unique.

1.8 Show that if an m × n matrix A has an A = LU factorization, then L is given by (1.36).

1.9 Write down the equations for the Crout factorization A = LU, where U is unit upper triangular and L is lower trapezoidal.

1.10 Let A be an n × n Hermitian matrix all of whose leading principal submatrices are nonsingular. Prove that there exists a unique unit lower triangular matrix L and a real diagonal matrix D such that A = LDL*.

1.11 Let A be an n × n positive definite matrix. Prove that there exists a unit lower triangular matrix L, a unit upper triangular matrix U, and a diagonal matrix D such that A = LDU, where the diagonal entries of D have positive real parts and L, D, and U are uniquely determined.

1.12 Write down an algorithm for the LDL* decomposition of a Hermitian matrix.

1.13 Write down an algorithm for the LDU decomposition of a positive definite matrix.

1.14 Derive a storage scheme for Algorithm 1.4 that takes advantage of the Hermitian structure of A and of its Cholesky factors.

1.15 Proposition 1.12 gives sufficient conditions for a matrix to have an LU decomposition without pivoting. Give necessary conditions on the matrix A so that it has a factorization A = LU. Modify Algorithm 1.3 to handle the general case.

1.16 Let A be an m × n matrix. Verify the operation count for the factorization A = LU for the case m > n. Determine the operation count for the case m < n.