
Chapter 9:

• Positive Definite Matrices

Definition: An n × n symmetric real matrix M is positive definite if

    ⟨x, Mx⟩ = x^T M x > 0

for all x ∈ R^n with x ≠ 0.

M is positive semidefinite if

    ⟨x, Mx⟩ = x^T M x ≥ 0   ∀ x ∈ R^n.

M is negative definite if

    ⟨x, Mx⟩ = x^T M x < 0

for all x ∈ R^n with x ≠ 0.

M is negative semidefinite if

    ⟨x, Mx⟩ = x^T M x ≤ 0   ∀ x ∈ R^n.

Definition: A nonzero column vector x is an eigenvector of a matrix M if ∃ a scalar λ such that

    Mx = λx.

Then λ is an eigenvalue of M.

Definition: A principal minor of M is the determinant of any submatrix obtained from M by deleting its last k rows and k columns (k = 0, 1, …, n − 1).

Properties: Given an n × n symmetric real matrix M:

(i) M is positive definite if and only if all its eigenvalues are positive;

(ii) M is positive definite if and only if all its principal minors are positive.
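As a quick numerical illustration of properties (i) and (ii), here is a small sketch of mine (not part of the original notes; it assumes NumPy is available):

```python
import numpy as np

M = np.array([[4.0, 1.0, 2.0],
              [1.0, 0.5, 0.0],
              [2.0, 0.0, 3.0]])   # a small symmetric test matrix

# Property (i): all eigenvalues positive.
print(np.all(np.linalg.eigvalsh(M) > 0))          # True

# Property (ii): all principal minors positive, where the k-th minor is the
# determinant of M with its last k rows and columns deleted (k = 0, ..., n-1).
n = M.shape[0]
print(all(np.linalg.det(M[:n - k, :n - k]) > 0 for k in range(n)))  # True
```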

Some Necessary Conditions for a symmetric positive definite matrix M:

(i) All diagonal elements of M must be positive.

(ii) The element of M having the largest absolute value must lie on the diagonal of M.

(iii) m_ii m_jj > |m_ij|²,  ∀ i ≠ j.

Definition: The square root of M is a matrix M^{1/2} such that

    M = M^{1/2} M^{1/2}.

Properties: If M and M^{1/2} are both required to be positive definite (or positive semidefinite), then M^{1/2} is unique.
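A minimal sketch (mine, assuming NumPy) of computing the unique positive definite square root via the spectral decomposition: diagonalize M = QΛQ^T and set M^{1/2} = QΛ^{1/2}Q^T.

```python
import numpy as np

def sqrt_spd(M):
    """Unique positive definite square root of a symmetric p.d. matrix M."""
    lam, Q = np.linalg.eigh(M)               # M = Q diag(lam) Q^T, lam > 0
    return Q @ np.diag(np.sqrt(lam)) @ Q.T

M = np.array([[4.0, 1.0], [1.0, 3.0]])
R = sqrt_spd(M)
print(np.allclose(R @ R, M))                 # True: R = M^{1/2}
```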

Theorem: Let A be an m × n matrix with full row rank (m ≤ n). Then AA^T is positive definite.

Proof: (i) AA^T is symmetric.

(ii) ⟨x, AA^T x⟩ = x^T A A^T x = ‖A^T x‖² ≥ 0, ∀ x ∈ R^m.

If ⟨x, AA^T x⟩ = 0, then A^T x = 0. Since A has full row rank, we have x = 0.

Hence ⟨x, AA^T x⟩ > 0 for all x ∈ R^m with x ≠ 0.

Corollary: Let A be an m × n matrix with full row rank and D be a diagonal matrix with all diagonal elements positive. Then ADA^T is positive definite.

Proof: ADA^T = A D^{1/2} D^{1/2} A^T = (A D^{1/2})(A D^{1/2})^T. Since D^{1/2} is nonsingular, A D^{1/2} also has full row rank, so the theorem above applies.
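A quick numerical check of the corollary (illustrative only; NumPy assumed). np.linalg.cholesky raises an error unless its argument is (numerically) symmetric positive definite, so its success certifies ADA^T:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))              # 3 x 5: full row rank almost surely
D = np.diag([1.0, 2.0, 0.5, 4.0, 3.0])       # positive diagonal elements

ADA = A @ D @ A.T
L = np.linalg.cholesky(ADA)                  # succeeds only for p.d. input
print(np.allclose(L @ L.T, ADA))             # True
```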

• Cholesky Factorization

Main Theorem: If M is an n × n symmetric positive definite matrix, then it has a unique triangular factorization LL^T, where L is a lower triangular matrix with positive diagonal elements.

Proof: The proof is by induction on the order of the matrix M. The result is certainly true for 1 × 1 matrices, since m11 is positive.

Suppose the assertion is true for matrices of order n − 1, and let M be a symmetric positive definite matrix of order n. It can be partitioned into the form

    M = [ d    v^T ]
        [ v    H   ]

where d is a positive scalar and H is an (n − 1) × (n − 1) submatrix. The partitioned matrix can be written as the product

    M = [ √d        0      ] [ 1   0  ] [ √d   v^T/√d  ]
        [ v/√d    I_{n−1}  ] [ 0   H̄  ] [ 0    I_{n−1} ]

where H̄ = H − v v^T/d. Clearly the matrix H̄ is symmetric. It is also positive definite, since for any nonzero vector x of length n − 1,

    x^T H̄ x = x^T ( H − v v^T/d ) x
            = ( −v^T x/d ,  x^T ) [ d   v^T ] [ −v^T x/d ]
                                  [ v   H   ] [  x       ]
            > 0,

because M is positive definite and the vector (−v^T x/d, x) is nonzero. By the induction assumption, H̄ has a triangular factorization L_H̄ L_H̄^T with positive diagonal elements. Thus, M can be expressed as

    M = [ √d        0      ] [ 1   0    ] [ 1   0      ] [ √d   v^T/√d  ]
        [ v/√d    I_{n−1}  ] [ 0   L_H̄ ] [ 0   L_H̄^T ] [ 0    I_{n−1} ]

      = [ √d       0    ] [ √d   v^T/√d ]
        [ v/√d    L_H̄  ] [ 0    L_H̄^T  ]

      = L L^T.

It is left to the reader to show that the factor L is unique.

• Computing Cholesky Factor

1. Outer Product Form:

    M = A0 = H0 = [ d1   v1^T ]
                  [ v1   H1   ]

      = [ √d1         0      ] [ 1   0                 ] [ √d1   v1^T/√d1 ]
        [ v1/√d1   I_{n−1}   ] [ 0   H1 − v1 v1^T/d1   ] [ 0     I_{n−1}  ]

      = L1 [ 1   0   ] L1^T
           [ 0   H̄1 ]

      = L1 A1 L1^T,   where H̄1 ≜ H1 − v1 v1^T/d1.

Repartitioning H̄1 as [ d2  v2^T ; v2  H2 ],

    A1 = [ 1   0     0    ]
         [ 0   d2    v2^T ]
         [ 0   v2    H2   ]

       = [ 1    0          0      ] [ 1   0   0                ] [ 1   0      0        ]
         [ 0    √d2        0      ] [ 0   1   0                ] [ 0   √d2    v2^T/√d2 ]
         [ 0    v2/√d2   I_{n−2}  ] [ 0   0   H2 − v2 v2^T/d2  ] [ 0   0      I_{n−2}  ]

       = L2 A2 L2^T

    ⋮

    A_{n−1} = Ln In Ln^T.

Here, for 1 ≤ i ≤ n, di is a positive scalar, vi is a vector of length n − i, and Hi is an (n − i) × (n − i) positive definite matrix. After n steps of the algorithm, we have

    M = L1 L2 ⋯ Ln Ln^T ⋯ L2^T L1^T = L L^T,

where it can be shown (see Exercise 2.1.6) that

    L = L1 + L2 + ⋯ + Ln − (n − 1) In.

Thus, the i-th column of L is precisely the i-th column of Li.

In this scheme, the columns of L are computed one by one. At the same time, each step modifies the submatrix Hi by the outer product vi vi^T/di to give H̄i, which is simply the submatrix remaining to be factored. The access to the components of M during the factorization is depicted as follows.
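A sketch of the outer product form in NumPy (my code, not the notes'): step i extracts column i of L and then applies the rank-one update −v_i v_i^T/d_i to the trailing submatrix.

```python
import numpy as np

def cholesky_outer(M):
    """Outer product form: M = L L^T, columns of L computed left to right."""
    A = M.astype(float).copy()
    n = A.shape[0]
    L = np.zeros_like(A)
    for i in range(n):
        d = A[i, i]                          # d_i, positive for p.d. M
        v = A[i + 1:, i].copy()              # v_i
        L[i, i] = np.sqrt(d)
        L[i + 1:, i] = v / np.sqrt(d)        # column i of L (= column i of L_i)
        # Modify the remaining submatrix: H_i <- H_i - v_i v_i^T / d_i
        A[i + 1:, i + 1:] -= np.outer(v, v) / d
    return L

M = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
L = cholesky_outer(M)
print(np.allclose(L @ L.T, M))               # True
```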

2. Bordering Method:

An alternative formulation of the factorization process is the Bordering Method. Suppose the matrix M is partitioned as

    M = [ M̄     u ]
        [ u^T   s ]

where the symmetric factorization L_M̄ L_M̄^T of the (n − 1) × (n − 1) leading principal submatrix M̄ has already been obtained. (Why is M̄ positive definite?) Then the factorization of M is given by

    M = [ L_M̄   0 ] [ L_M̄^T   w ]
        [ w^T   t ] [ 0        t ]

where

    w = L_M̄^{−1} u   and   t = (s − w^T w)^{1/2}.

(Why is s − w^T w positive?)

Note that the factorization L_M̄ L_M̄^T of the submatrix M̄ is itself obtained by the bordering technique, so the scheme can be described as follows.

For i = 1, 2, …, n:

    Solve   [ l_{1,1}                     ] [ l_{i,1}   ]   [ a_{i,1}   ]
            [    ⋮          ⋱             ] [    ⋮      ] = [    ⋮      ]
            [ l_{i−1,1}  ⋯  l_{i−1,i−1}   ] [ l_{i,i−1} ]   [ a_{i,i−1} ]

    Compute   l_{i,i} = ( a_{i,i} − Σ_{k=1}^{i−1} l_{i,k}² )^{1/2}.

In this scheme, the rows of L are computed one at a time, and the part of the matrix remaining to be factored is not accessed until the corresponding part of L is to be computed. The sequence of computations can be depicted as follows.
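A corresponding sketch of the bordering method (mine, for illustration; it assumes NumPy plus SciPy's triangular solver): row i of L comes from one triangular solve against the part of L already computed, and the trailing part of M is never touched.

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_bordering(M):
    """Bordering method: compute L row by row; M = L L^T."""
    n = M.shape[0]
    L = np.zeros((n, n))
    for i in range(n):
        if i > 0:
            # Solve L[:i, :i] w = u for row i of L  (w = L_M^{-1} u).
            L[i, :i] = solve_triangular(L[:i, :i], M[i, :i], lower=True)
        # t = (s - w^T w)^{1/2}; positive when M is positive definite.
        L[i, i] = np.sqrt(M[i, i] - L[i, :i] @ L[i, :i])
    return L

M = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
L = cholesky_bordering(M)
print(np.allclose(L @ L.T, M))       # True
```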

3. Inner Product Form:

The final scheme for computing the components of L is the Inner Product Form of the algorithm. It can be described as follows.

For j = 1, 2, …, n:

    Compute   l_{j,j} = ( a_{j,j} − Σ_{k=1}^{j−1} l_{j,k}² )^{1/2}.

    For i = j + 1, j + 2, …, n:

        Compute   l_{i,j} = ( a_{i,j} − Σ_{k=1}^{j−1} l_{i,k} l_{j,k} ) / l_{j,j}.

These formulae can be derived directly by equating the elements of A to the corresponding elements of the product LL^T.

Like the outer product version of the algorithm, the columns of L are computed one by one, but the part of the matrix remaining to be factored is not accessed during the scheme. The sequence of computations and the relevant access to the components of M (or L) is depicted as follows.
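The inner product form translates almost line for line into NumPy (a sketch of mine; the dot products below are exactly the sums Σ l_{j,k}² and Σ l_{i,k} l_{j,k} in the formulas above):

```python
import numpy as np

def cholesky_inner(A):
    """Inner product form: columns of L computed one by one via inner products."""
    n = A.shape[0]
    L = np.zeros((n, n))
    for j in range(n):
        # l_jj = (a_jj - sum_k l_jk^2)^{1/2}
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
        for i in range(j + 1, n):
            # l_ij = (a_ij - sum_k l_ik l_jk) / l_jj
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
L = cholesky_inner(A)
print(np.allclose(L @ L.T, A))    # True
```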

The latter two formulations can be organized so that only inner products are involved. This can be used to improve the accuracy of the numerical factorization by accumulating the inner products in double precision. On some computers, this can be done at little extra cost.


• Block Cholesky

    [ M11  M12  M13 ]   [ L11            ] [ L11^T  L21^T  L31^T ]
    [ M21  M22  M23 ] = [ L21  L22       ] [        L22^T  L32^T ]
    [ M31  M32  M33 ]   [ L31  L32  L33  ] [               L33^T ]

    M = L L^T

    M11 = L11 L11^T                                    ← L11
    M21 = L21 L11^T                                    ← L21
    M31 = L31 L11^T                                    ← L31

    M22 = L21 L21^T + L22 L22^T
    S22 ≜ M22 − L21 L21^T = L22 L22^T                  ← L22

    M32 = L31 L21^T + L32 L22^T
    S32 ≜ M32 − L31 L21^T = L32 L22^T                  ← L32

    M33 = L31 L31^T + L32 L32^T + L33 L33^T
    S33 ≜ M33 − L31 L31^T − L32 L32^T = L33 L33^T      ← L33

Advantages?

1. Reduced working space (memory requirements).

2. Highly parallelizable on multiprocessors.

[Figure: clock diagram showing the sequence in which the blocks L11, L21, L31, S22, S32, L22, L32, S33, L33 are computed.]

General Case

Given an m × m symmetric positive definite matrix M partitioned into p² sub-blocks:

    M = [ M11  ⋯  M1p ]
        [  ⋮   ⋱   ⋮  ]
        [ Mp1  ⋯  Mpp ]

such that m = pr, where r is referred to as the block size. The Cholesky factor of M can be partitioned accordingly as

    L = [ L11         0   ]
        [  ⋮    ⋱        ]
        [ Lp1   ⋯   Lpp  ]

By directly equating LL^T = M in this block structure, we see that

    L11 L11^T = M11

    Li1 L11^T = Mi1,   for i = 2, …, p.

Moreover, by block multiplication, for p ≥ i ≥ j ≥ 2,

    Mij = Σ_{k=1}^{j} Lik Ljk^T

and hence

    Lij Ljj^T = Mij − Σ_{k=1}^{j−1} Lik Ljk^T.

If we denote Sij = Mij − Σ_{k=1}^{j−1} Lik Ljk^T, then, for p ≥ i ≥ j ≥ 2, Ljj is the Cholesky factor of Sjj, and Lij is the solution of the matrix equation Z Ljj^T = Sij.

Hence a block Cholesky factorization scheme is obtained:

Algorithm C-3

    compute the Cholesky factor of M11 for L11
    for i = 2 to p
        solve Z L11^T = Mi1 for Li1
    end
    for j = 2 to p
        for i = j to p
            S ← Mij
            for k = 1 to j − 1
                S ← S − Lik Ljk^T
            end
            if i = j
                compute the Cholesky factor of S for Ljj
            else
                solve Z Ljj^T = S for Lij
            end
        end
    end

Note that Algorithm C-3 may use Algorithm C-1 or C-2 to find the Cholesky factors of the block submatrices. Also note that since Ljj^T is upper triangular, solving Z Ljj^T = S is relatively simple. Recursive subroutines can be used here quite efficiently, if the compiler supports this feature. One key factor affecting the performance of block Cholesky factorization is the choice of the block size r, which often needs careful thought and experimentation. The development of block Cholesky factorization algorithms and their implementations, especially on vector/parallel processors, is an active research area.
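A compact NumPy/SciPy sketch of the block scheme (mine, for illustration; scipy.linalg.cholesky and solve_triangular stand in for the point algorithms and the triangular solve):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def block_cholesky(M, r):
    """Block Cholesky in the spirit of Algorithm C-3: m = p*r, r x r blocks."""
    m = M.shape[0]
    assert m % r == 0, "block size r must divide m"
    p = m // r
    L = np.zeros((m, m))
    blk = lambda i: slice(i * r, (i + 1) * r)
    for j in range(p):
        for i in range(j, p):
            S = M[blk(i), blk(j)].copy()
            for k in range(j):
                S -= L[blk(i), blk(k)] @ L[blk(j), blk(k)].T   # S_ij
            if i == j:
                L[blk(j), blk(j)] = cholesky(S, lower=True)    # L_jj
            else:
                # Solve Z L_jj^T = S_ij for L_ij via Z^T = L_jj^{-1} S^T.
                L[blk(i), blk(j)] = solve_triangular(
                    L[blk(j), blk(j)], S.T, lower=True).T
    return L

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
M = A @ A.T + 6 * np.eye(6)       # symmetric positive definite
L = block_cholesky(M, r=2)
print(np.allclose(L @ L.T, M))    # True
```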

• LQ Factorization

Objective: Solve Mu = v, i.e., u = M^{−1}v.

Question # 1: We don't want to invert M!

Idea: Consider a scalar case mu = v.

If m = 1 − x for 0 < x < 1,

then m^{−1} = 1/(1 − x) = 1 + x + x² + ⋯,  and  u = (1 + x + x² + ⋯)v.

Similarly, if M = I − B,

B: symmetric p.d. with all eigenvalues being positive and less than 1,

then Σ_{k=0}^{∞} B^k = I + B + B² + ⋯ converges, and

    M^{−1} = (I − B)^{−1} = Σ_{k=0}^{∞} B^k.

Hence

    u = M^{−1}v = [ I + B + B² + ⋯ ]v.
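A small numerical check of the truncated-series idea (my sketch, NumPy assumed): B is built with eigenvalues in (0, 1), and u = M^{−1}v is accumulated term by term, one matrix–vector product per term.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))     # orthogonal Q
eigs = rng.uniform(0.1, 0.9, 5)
B = Q @ np.diag(eigs) @ Q.T        # symmetric p.d., eigenvalues in (0, 1)
M = np.eye(5) - B                  # M = I - B
v = rng.standard_normal(5)

# u = M^{-1} v approximated by the truncated series sum_{k=0}^{r} B^k v.
u, term = np.zeros(5), v.copy()
for _ in range(300):
    u += term
    term = B @ term                # next term: B^{k+1} v

print(np.allclose(u, np.linalg.solve(M, v)))    # True: series has converged
```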

Question # 2:

    M = A Xk² A^T  =?  I − B   for some B?

Idea: Consider a special case: if A = Q is orthonormal, i.e., QQ^T = I, then

    ‖Q‖ ≜ max_{‖x‖=1} ‖Qx‖ ≤ 1   and   ‖Q^T‖ ≜ max_{‖y‖=1} ‖Q^T y‖ ≤ 1.

Moreover, let (with Xk = diag(x_1^k, …, x_n^k))

    λmax ≜ max_i (x_i^k)² > 0,

    λmin ≜ min_i (x_i^k)² > 0.

For any α > λmax, define

    x̄_i = ( α − (x_i^k)² ) / α,   so that   0 < x̄_i < 1.

We have

    Q Xk² Q^T = α[ I − Q X̄ Q^T ] = α[ I − B ],   B ≜ Q X̄ Q^T,

where X̄ = diag(x̄_1, …, x̄_n), and

(1) B is symmetric p.d.;

(2) all eigenvalues of B are positive;

(3) if kmax is the largest eigenvalue of B, then

    kmax = ‖B‖ = ‖Q X̄ Q^T‖ ≤ ‖Q‖ ‖X̄‖ ‖Q^T‖ ≤ (1) · ( (α − λmin)/α ) · (1) < 1.

In this case,

    ( Q Xk² Q^T )^{−1} = (1/α)( I − B )^{−1} = (1/α)( I + B + B² + ⋯ ).

Take a finite-term approximation, say Σ_{k=0}^{r} B^k. Then matrix inversion (O(n³)) becomes a sequence of matrix–vector multiplications (O(n²) each).

Question # 3: In general, A is not orthonormal.

Idea: If A_{m×n} = L_{m×m} Q_{m×n} with Q orthonormal,

then AA^T = (LQ)(LQ)^T = L QQ^T L^T = LL^T.

L is the unique Cholesky factor of AA^T!! So L^{−1} exists, and

    (P)   Min  c^T x               Min  c^T x
          s.t. Ax = b      ⇒       s.t. LQx = b
               x ≥ 0                    x ≥ 0

    (P′)  Min  c^T x               Min  c^T x
          s.t. Qx = b′     ⇐       s.t. Qx = L^{−1}b
               x ≥ 0                    x ≥ 0

where b′ = L^{−1}b.

And everything follows to a happy ending!

Question # 4:

    A =? LQ

Problem # 1: Give a detailed proof of Theorem 10.3 (Fundamental Theorem of LQ Factorization): Let A_{m×n} be a matrix with full row rank (m ≤ n). Then A = LQ for a lower triangular matrix L_{m×m} and an orthonormal matrix Q_{m×n}.

Problem # 2: Write a step-by-step procedure, in detail, to perform LQ factorization for a given A.

Question # 5: A is sparse ⇒? Q is sparse

Problem # 3: Provide a simple example, say with 1 ≤ m < n, in which A is relatively sparse but Q is very dense.

Problem # 4: Given that A is sparse and Q is not, how do you deal with the sparsity issue?

LQ Factorization Theorem: Let A be an m × n matrix with full row rank (m ≤ n). Then A = LQ, where L is a lower triangular m × m matrix with positive diagonals and Q is an m × n orthonormal matrix, i.e., QQ^T = I.

Proof: Since A has full row rank, the Cholesky factor L of AA^T exists, i.e., AA^T = LL^T, where L is a lower triangular m × m matrix with positive diagonals. Hence L^{−1} exists; let Q_{m×n} = L^{−1}A. Then A = LQ and

    QQ^T = (L^{−1}A)(L^{−1}A)^T = L^{−1} AA^T (L^T)^{−1} = L^{−1} L L^T (L^T)^{−1} = I.
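The proof is constructive and translates directly into a short NumPy sketch (mine, for illustration; for numerical robustness one would normally obtain L and Q from a QR factorization of A^T rather than forming AA^T explicitly):

```python
import numpy as np

def lq_factor(A):
    """LQ factorization as in the proof: L = Cholesky factor of A A^T, Q = L^{-1} A."""
    L = np.linalg.cholesky(A @ A.T)    # lower triangular, positive diagonal
    Q = np.linalg.solve(L, A)          # Q = L^{-1} A, so A = L Q
    return L, Q

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))        # full row rank (almost surely)
L, Q = lq_factor(A)
print(np.allclose(A, L @ Q))           # True
print(np.allclose(Q @ Q.T, np.eye(3))) # True: Q is orthonormal
```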

Backward and Forward Solve

Problem: Solve Lx = b where L is a lower triangular matrix.

Method 1 (Inner-product Form)

    [ l11                 ] [ x1 ]   [ b1 ]
    [ l21   l22           ] [ x2 ]   [ b2 ]
    [ l31   l32   l33     ] [ x3 ] = [ b3 ]
    [  ⋮            ⋱     ] [ ⋮  ]   [ ⋮  ]
    [ ln1    ⋯      lnn   ] [ xn ]   [ bn ]

    l11 x1 = b1                   ⇒  x1 = b1/l11
    l21 x1 + l22 x2 = b2          ⇒  x2 = (b2 − l21 x1)/l22
       ⋮
    ln1 x1 + ⋯ + lnn xn = bn      ⇒  xn = ( bn − Σ_{k=1}^{n−1} lnk xk ) / lnn

Sequence of Computation

Method 2 (Outer-product Form)

    [ l11             ] [ x1 ]   [ b1 ]
    [ l21   l22       ] [ x2 ] = [ b2 ]
    [ l31   l32   l33 ] [ x3 ]   [ b3 ]

    l11 x1 = b1                    ⇒  x1 = b1/l11
    l21 x1 + l22 x2 = b2           ⇒  b2 ← b2 − l21 x1,  x2 = b2/l22
    l31 x1 + l32 x2 + l33 x3 = b3  ⇒  b3 ← b3 − l31 x1,
                                      b3 ← b3 − l32 x2,  x3 = b3/l33

Sequence of Computation

For i = 1, 2, …, n:

    xi = bi / lii

    [ b_{i+1} ]     [ b_{i+1} ]   [ l_{i+1,i} ]
    [    ⋮    ]  ←  [    ⋮    ] − [     ⋮     ] xi
    [   bn    ]     [   bn    ]   [  l_{n,i}  ]

Observations

1. Solution methods must be tied to the storage scheme:

   (a) The inner-product form goes with row-by-row storage of the matrix L.

   (b) The outer-product form goes with column-by-column storage of the matrix L.

2. The outer-product form lends itself to exploiting sparsity in the solution x: if bi turns out to be zero at the beginning of the i-th step, then xi = 0 and the entire step can be skipped.

Storage Schemes

Ordering for factorization

Basic Idea: Reordering the rows and columns of a given matrix M may preserve part of its structure in L.

Example: Solving

    [ 4     1     2     1/2   2  ] [ x1 ]   [  7 ]
    [ 1     1/2   0     0     0  ] [ x2 ]   [  3 ]
    [ 2     0     3     0     0  ] [ x3 ] = [  7 ]
    [ 1/2   0     0     5/8   0  ] [ x4 ]   [ −4 ]
    [ 2     0     0     0     16 ] [ x5 ]   [ −4 ]

    L = [ 2                              ]
        [ 0.5     0.5                    ]
        [ 1      −1      1               ]
        [ 0.25   −0.25  −0.5    0.5      ]
        [ 1      −1     −2     −3     1  ]

All of the zeros of M are destroyed: L is completely full.

Reordering the variables:

    x1 → x̃5,   x2 → x̃4,   x3 → x̃3,   x4 → x̃2,   x5 → x̃1

Solving

    [ 16    0     0     0     2   ] [ x̃1 ]   [ −4 ]
    [ 0     5/8   0     0     1/2 ] [ x̃2 ]   [ −4 ]
    [ 0     0     3     0     2   ] [ x̃3 ] = [  7 ]
    [ 0     0     0     1/2   1   ] [ x̃4 ]   [  3 ]
    [ 2     1/2   2     1     4   ] [ x̃5 ]   [  7 ]

    L̃ = [ 4                                    ]
        [ 0       0.791                        ]
        [ 0       0       1.73                 ]
        [ 0       0       0       0.707        ]
        [ 0.500   0.632   1.15    1.41   0.129 ]

Now the zero structure of the reordered matrix is fully preserved in L̃: fill-in occurs only in the last row.
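A quick NumPy check of this effect (my sketch, not from the notes), counting the nonzeros of the Cholesky factor under the original and the reversed ordering of the arrow matrix above:

```python
import numpy as np

# The 5x5 arrow matrix from the example above.
M = np.array([[4.0, 1.0, 2.0, 0.5,   2.0],
              [1.0, 0.5, 0.0, 0.0,   0.0],
              [2.0, 0.0, 3.0, 0.0,   0.0],
              [0.5, 0.0, 0.0, 0.625, 0.0],
              [2.0, 0.0, 0.0, 0.0,   16.0]])

L = np.linalg.cholesky(M)                       # original ordering
perm = [4, 3, 2, 1, 0]                          # reverse the variables
L_perm = np.linalg.cholesky(M[np.ix_(perm, perm)])

print(np.count_nonzero(np.round(L, 12)))        # 15 nonzeros: complete fill-in
print(np.count_nonzero(np.round(L_perm, 12)))   # 9 nonzeros: structure preserved
```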

More Interesting Example: [figures not shown]

Question: which ordering result is the best?

1. Comparisons of ordering strategies must consider the storage scheme to be used.

2. Storage schemes for sparsity involve two components: primary storage and overhead storage.

Examples: [figures not shown]