Chapter 1
Triangular Factorization
This chapter deals with the factorization of arbitrary matrices into products of triangular matrices. Since the solution of a linear n n system can be easily obtained once the matrix is factored into the product× of triangular matrices, we will concentrate on the factorization of square matrices. Specifically, we will show that an arbitrary n n matrix A has the factorization P A = LU where P is an n n permutation matrix,× L is an n n unit lower triangular matrix, and U is an n ×n upper triangular matrix. In connection× with this factorization we will discuss pivoting,× i.e., row interchange, strategies. We will also explore circumstances for which A may be factored in the forms A = LU or A = LLT . Our results for a square system will be given for a matrix with real elements but can easily be generalized for complex matrices. The corresponding results for a general m n matrix will be accumulated in Section 1.4. In the general case an arbitrary m× n matrix A has the factorization P A = LU where P is an m m permutation× matrix, L is an m m unit lower triangular matrix, and U is an×m n matrix having row echelon structure.× ×
1.1 Permutation matrices and Gauss transformations
We begin by defining permutation matrices and examining the effect of premulti- plying or postmultiplying a given matrix by such matrices. We then define Gauss transformations and show how they can be used to introduce zeros into a vector. Definition 1.1 An m m permutation matrix is a matrix whose columns con- sist of a rearrangement× of the m unit vectors e(j), j = 1,...,m, in RI m, i.e., a rearrangement of the columns (or rows) of the m m identity matrix. × The rows of an n n permutation matrix consist of a rearrangement of the T × transposes (e(j)) , j = 1,...,n, of the n unit vectors in RI n. The effect of premul- tiplying (postmultiplying) a matrix A by a permutation matrix P is to rearrange the rows (columns) of A into the same order as the rows (columns) of the identity matrix are ordered in P . For integers j and k such that 1 j k m, we denote ≤ ≤ ≤ by P(j,k) the special class of permutation matrices that result from the interchange
1 2 1. Triangular Factorization of the j-th and k-th columns of the identity matrix, i.e.,
(1.1) P = e(1), , e(j−1), e(k), e(j+1), , e(k−1), e(j), e(k+1), , e(m) . (j,k) · · · · · · · · · These permutation matrices, which are often referred to as elementary permutation matrices, have many useful properties such as
T −1 (1.2) P(j,k) = P(j,k) = P(j,k) .
Of course, P(j,j) = I. We now define a special class of matrices that are rank one perturbations of the identity matrix and that can be used to introduce zeros into a vector. Definition 1.2 For an integer j such that 1 j
T m Proposition 1.1 Given x = (x1, x2,...,xm) RI such that x = 0. Choose any integer p such that 1 p m and x =0. Define∈ µ RI m by 6 ≤ ≤ p 6 ∈ 0 if j =1
x1 if j = p and p =1 µj = x 6 p x j if j =2,...,m, j = p . xp 6 Then (1) (1) M P(1,p)x = xpe . Proof. The result follows from
0 xp µ2xp 0 (1) (1) T µ3xp 0 M P(1,p)x = P(1,p)x µ(e ) P(1,p)x = P(1,p)x = . − − . . . . µ x 0 m p 1.1. Permutation matrices and Gauss transformations 3
2 Note that if x = 0, then one may choose p = 1. Also, note that p, 1 p m, 1 6 ≤ ≤ can be any index such that xp = 0 so that there is not a unique Gauss transformation which transforms x into a constant6 times e(1); this is illustrated by the following example.
Example 1.1 Let x= (0, 3, 4)T . If p = 2 and µ = (0, 0, 4/3)T , then − − 1 00 M (1) = 0 10 4/3 0 1 (1) T T and M P(1,2)x = ( 3, 0, 0) . On the other hand, if p = 3 and µ = (0, 3/4, 0) , then − − 1 00 M (1) = 3/4 1 0 0 01 (1) T and M P(1,3)x = (4, 0, 0) . We can also use Gauss transformations to introduce zeros in any contiguous block of a vector. For example, given x RI m and integers k and q, 1 k < q m, ∈ ≤ ≤ such that xj , j = k,...,q, are not all zero, suppose we want to choose a Gauss transformation that zeros out the (k + 1)-st through q-th components of P(k,p)x. Again, an integer p, k p q, is chosen so that x = 0. We set ≤ ≤ p 6 0 if j =1,...,k or j = q +1,...,m
xk if j = p and p = k (1.4) µj = x 6 p x j if j = k +1,...,q, j = p x 6 p so that, from (1.3),
Ik−1 1 µ − k+1 . . (1.5) M (k) = µq Im−k − 0 . . 0 4 1. Triangular Factorization and x1 . . x − k 1 xp 0 (k) M P(k,p)x = . . . 0 x q+1 . . x m We end this section by showing how an elementary permutation matrix which premultiplies a Gauss transformation “passes through” the Gauss transformation to produce another Gauss transformation postmultiplied by the same permutation matrix.
Proposition 1.2 Let M (k) be a Gauss transformation and let j and p be indices such that k (k) ˆ (k) (1.6) P(j,p)M = M P(j,p) . Proof. The proof is by direct multiplication; Mˆ (k) is the matrix obtained by inter- changing the (j, k) and (p, k) entries of M (k). 2 1.2 Triangular factorizations of an n n matrix × The goal of this section is to prove that any n n matrix A has the factorization P A = LU where P is an n n permutation× matrix, L is an n n unit lower triangular matrix, and U is an ×n n upper triangular matrix. We will× first show how, through the use of elementary permutation× matrices and Gauss transformations, a given n n can be reduced to an upper triangular matrix U. We remark that the construction× given in the proof of the following proposition is exactly the classical Gaussian elimination algorithm expressed in matrix notation; we will examine this connection further in the next section. Proposition 1.3 Let A be a given n n matrix. Then there exist an integer ℓ, 0 ℓ n 1, Gauss transformation matrices× M (k), k = 1,...,ℓ, and elementary permutation≤ ≤ − matrices P , k =1,...,ℓ, k p n, such that (k,pk ) ≤ k ≤ (1.7) U = A(ℓ+1) = M (ℓ)P M (2)P M (1)P A (ℓ,pℓ) · · · (2,p2) (1,p1) is an n n upper triangular matrix. × 1.2. Triangular factorizations of an n × n matrix 5 (1) Proof. Starting with A = A and the integer counter σ1 = 1, we assume that at the start of the k-th stage, k 1, of the procedure we have a matrix of the form ≥ U (k) a(k) B(k) A(k) = , 0 c(k) D(k) (k) (k) k−1 (k) where U is an (k 1) (σk 1) upper triangular matrix,, a RI , c RI n−k+1, B(k) is (k −1) (×n σ −) and D(k) is (n k + 1) (n σ ).∈ If c(k) = 0, we∈ − × − k − × − k increment σk by one and move on to the next column, continuing to so increment (k) (k) σk until either c = 0 or σk >n. In the latter case we are finished since then A is upper triangular.6 In the former case we use (1.4) with a(k) x = c(k) (k) and q = m to choose an integer pk k and construct a Gauss transformation M (k) ≥ such that the components of M P(k,pk )x with indices j = k +1,...,m vanish. If we write M (k) in the block form I 0 k−1 , 0 M˜ (k) then M˜ (k) is an (m k + 1) (m k + 1) Gauss transformation formed by setting x = c(k) in Proposition− 1.1.× We have− that U (k) a(k) B(k) 0 c (k+1) (k) (k) 0 0 A = M P(k,pk )A = , . . . . M˜ (k)Dˆ (k) 0 0 (k) (k) where c denotes the (pk k +1)-st component of c and Dˆ denotes the matrix (k) − D after rows k and pk have been interchanged. Then U (k+1) a(k+1) B(k+1) A(k+1) = , 0 c(k+1) D(k+1) where a(k+1) and c(k+1) are the first columns of B(k) and M˜ (k)Dˆ (k), respectively, U (k+1) is the k (σ 1) matrix given by × k+1 − U (k) a(k) U (k+1) = , 0 c (k+1) (k+1) and σk+1 = σk + 1. Clearly U has upper triangular structure and A has the same structure as A(k) with the index k augmented by one so that the inductive 6 1. Triangular Factorization step is complete. The total number of stages ℓ cannot exceed (m 1) or n and may − be less than both if σk > k for some k. 2 The process of row interchanges is often referred to as row pivoting. Since, for the most part, we will not consider column interchanges, we will henceforth refer (k) to row pivoting as simply pivoting. The (k, σk) entry in the matrix A defined in the above proof is referred to as the pivot element and (k, σk) itself is referred to as the pivot position. The pivot element is the denominator appearing in the vector µ that determines the Gauss transformation M (k) used at the k-th stage. Thus, an interchange of rows is, in theory, only necessary whenever a pivot element vanishes. The row interchange process, i.e., premultiplication by the elementary permutation matrix P(k,pk ), is invoked so that a nonzero entry is brought into the pivot position. Proposition 1.3, coupled with Proposition 1.2, allows us to prove the major result of this section. Theorem 1.4 Given any n n matrix A there exists an n n permutation matrix P , an n n unit lower triangular× matrix L, and an n n ×upper triangular matrix U such that× × (1.8) P A = LU . Furthermore, rank(U) = rank(A). Proof. By Proposition 1.3 we have that A(ℓ+1) = M (ℓ)P M (ℓ−1)P M (2)P M (1)P A (ℓ,pℓ) (ℓ−1,pℓ−1) · · · (2,p2) (1,p1) is upper triangular. The inverse of a square unit lower triangular matrix is also unit (ℓ) (1) lower triangular so that the proof will be complete if we can show that M P(ℓ,pℓ) M P(1,p1) = MP for some unit lower triangular matrix M and permutation matrix P . Since· · · p k, from Proposition 1.2 we note that for k =2,...,ℓ k ≥ (k−1) (k−1) P(k,pk )M = M1 P(k,pk ) , (k−1) where M1 is a unit lower triangular matrix. Of course, if no row interchanges (k−1) (k−1) are required, then P(k,pk ) = I and M1 = M . Thus we have A(ℓ+1) = M (ℓ)M (ℓ−1)P M (ℓ−2)P M (1)P P A . 1 (ℓ,pℓ) 1 (ℓ−1,pℓ−1) · · · 1 (2,p2) (1,p1) Again we use Proposition 1.2 to show that for k =3,...,ℓ (k−2) (k−2) P(k,pk )M1 = M2 P(k,pk ) , (k−2) where M2 is a unit lower triangular matrix. Continuing in this manner, we have that U = A(ℓ+1) = MPA, where M = M (ℓ)M (ℓ−1)M (ℓ−2) M (2) M (1) 1 2 · · · ℓ−2 ℓ−1 1.2. Triangular factorizations of an n × n matrix 7 and P = P P P . (ℓ,pℓ) · · · (2,p2) (1,p1) Here M is the product of unit lower triangular matrices and thus M itself is unit lower triangular. Furthermore, P is the product of permutation matrices and thus P is a permutation matrix as well. Also, since M and P are invertible, clearly rank(U) = rank(A). If A is real, then the above process involves only real arithmetic so that the end products L and U are real. 2 Let r = rank(U) = rank(A) denote the number of nonzero rows of U. It is, of course, possible for r