AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 3: Positive-Definite Systems; Cholesky Factorization
Xiangmin Jiao
Stony Brook University
Symmetric Positive-Definite Matrices
- A symmetric matrix $A \in \mathbb{R}^{n\times n}$ is symmetric positive definite (SPD) if $x^T A x > 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$
- A Hermitian matrix $A \in \mathbb{C}^{n\times n}$ is Hermitian positive definite (HPD) if $x^* A x > 0$ for all $x \in \mathbb{C}^n \setminus \{0\}$
- SPD matrices have positive real eigenvalues and orthogonal eigenvectors
- Note: Most textbooks only talk about SPD or HPD matrices, but a positive-definite matrix does not need to be symmetric or Hermitian! A real matrix $A$ is positive definite iff $A + A^T$ is SPD
- If $x^T A x \ge 0$ for all $x \in \mathbb{R}^n \setminus \{0\}$, then $A$ is said to be positive semidefinite
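The remark about nonsymmetric matrices can be checked numerically. The sketch below is plain Python; the helper names `quad_form` and `symmetric_part` and the 2×2 example matrix are illustrative assumptions, not part of the lecture. It relies on the identity $x^T A x = x^T \tfrac{1}{2}(A + A^T)\, x$: the skew-symmetric part of $A$ contributes nothing to the quadratic form.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def symmetric_part(A):
    """Return (A + A^T) / 2."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]


# Illustrative nonsymmetric matrix: x^T A x = x1^2 + x2^2, so A is positive definite.
A = [[1.0, 1.0],
     [-1.0, 1.0]]
S = symmetric_part(A)  # here the 2x2 identity, which is SPD

for x in ([1.0, 0.0], [0.0, 1.0], [1.0, -2.0], [-3.0, 0.5]):
    # Both quadratic forms agree and are positive for these nonzero vectors.
    assert quad_form(A, x) == quad_form(S, x) > 0.0
```

Testing a handful of vectors is only a sanity check, not a proof; a reliable test for the symmetric part is to attempt a Cholesky factorization of it.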
Properties of Symmetric Positive-Definite Matrices

- SPD matrices often arise as the Hessian matrix of some convex functional
  - E.g., least squares problems; partial differential equations
- If $A$ is SPD, then $A$ is nonsingular
- Let $X$ be any $n \times m$ matrix with full rank and $n \ge m$. Then
  - $X^T X$ is symmetric positive definite, and
  - $X X^T$ is symmetric positive semidefinite
- If $A$ is $n \times n$ SPD and $X \in \mathbb{R}^{n\times m}$ has full rank and $n \ge m$, then $X^T A X$ is SPD
- Any principal submatrix (picking some rows and corresponding columns) of $A$ is SPD and $a_{ii} > 0$
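The difference between $X^T X$ (definite) and $X X^T$ (only semidefinite) follows from $y^T (X^T X)\, y = \|Xy\|^2$ and $z^T (X X^T)\, z = \|X^T z\|^2$. A minimal sketch in plain Python, where the 3×2 matrix and the helper names are assumptions chosen for illustration:

```python
def matvec(M, v):
    """Return the matrix-vector product M v (M is a list of rows)."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def norm_sq(v):
    """Return the squared Euclidean norm of v."""
    return sum(t * t for t in v)


# Illustrative 3x2 matrix with full column rank (n = 3 >= m = 2).
X = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

# y^T (X^T X) y = ||X y||^2, which is positive for y != 0 because X y != 0.
y = [1.0, -1.0]
q_small = norm_sq(matvec(X, y))

# z^T (X X^T) z = ||X^T z||^2 >= 0, and it vanishes whenever X^T z = 0.
z = [1.0, 1.0, -1.0]
q_big = norm_sq(matvec(transpose(X), z))
```

Here `q_small` is positive while `q_big` is exactly zero for a nonzero `z`, which is why $X X^T$ cannot be definite when $n > m$.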
Cholesky Factorization

- If $A$ is symmetric positive definite, then there is a factorization of $A$
  \[ A = R^T R \]
  where $R$ is upper triangular, and all its diagonal entries are positive
- Note: The textbook calls it “Cholesky decomposition”, but “Cholesky factorization” is a better term
- Key idea: take advantage of and preserve symmetry and positive-definiteness during factorization
- Eliminate below the diagonal and to the right of the diagonal:
  \[
  A = \begin{bmatrix} a_{11} & b^T \\ b & K \end{bmatrix}
    = \begin{bmatrix} r_{11} & 0 \\ b/r_{11} & I \end{bmatrix}
      \begin{bmatrix} r_{11} & b^T/r_{11} \\ 0 & K - bb^T/a_{11} \end{bmatrix}
    = \begin{bmatrix} r_{11} & 0 \\ b/r_{11} & I \end{bmatrix}
      \begin{bmatrix} 1 & 0 \\ 0 & K - bb^T/a_{11} \end{bmatrix}
      \begin{bmatrix} r_{11} & b^T/r_{11} \\ 0 & I \end{bmatrix}
    = R_1^T A_1 R_1
  \]
  where $r_{11} = \sqrt{a_{11}}$, which is well defined because $a_{11} > 0$
- $K - bb^T/a_{11}$ is a principal submatrix of the SPD matrix $A_1 = R_1^{-T} A R_1^{-1}$ and therefore is SPD, with positive diagonal entries
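To make the elimination step concrete, here is one step worked out on a small matrix. The 3×3 matrix is an illustrative choice, not taken from the lecture; in the notation above it has $a_{11} = 4$, $b = (2, 2)^T$, and $K$ equal to the trailing 2×2 block.

```latex
\[
A = \begin{bmatrix} 4 & 2 & 2 \\ 2 & 5 & 3 \\ 2 & 3 & 6 \end{bmatrix},
\qquad
r_{11} = \sqrt{4} = 2,
\qquad
\frac{b}{r_{11}} = \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\qquad
K - \frac{bb^T}{a_{11}}
  = \begin{bmatrix} 5 & 3 \\ 3 & 6 \end{bmatrix}
  - \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
  = \begin{bmatrix} 4 & 2 \\ 2 & 5 \end{bmatrix}
\]
```

The trailing block is again symmetric with positive diagonal entries, so the same step can be repeated on it.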
Cholesky Factorization

Apply recursively to obtain

\[ A = \left(R_1^T R_2^T \cdots R_n^T\right)\left(R_n \cdots R_2 R_1\right) = R^T R, \qquad r_{jj} > 0 \]
which is known as Cholesky factorization
How to obtain R from Rn, ... , R2, R1?
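\[
A = \begin{bmatrix} r_{11} & 0 \\ s & I \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & A_1 \end{bmatrix}
    \begin{bmatrix} r_{11} & s^T \\ 0 & I \end{bmatrix}
  = \begin{bmatrix} r_{11} & 0 \\ s & I \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & \tilde{R}^T \end{bmatrix}
    \begin{bmatrix} 1 & 0 \\ 0 & \tilde{R} \end{bmatrix}
    \begin{bmatrix} r_{11} & s^T \\ 0 & I \end{bmatrix}
  = \begin{bmatrix} r_{11} & 0 \\ s & \tilde{R}^T \end{bmatrix}
    \begin{bmatrix} r_{11} & s^T \\ 0 & \tilde{R} \end{bmatrix}
  = R^T R
\]

Here $s = b/r_{11}$, $A_1 = K - bb^T/a_{11}$ denotes the trailing block, and $A_1 = \tilde{R}^T \tilde{R}$ is its Cholesky factorization.

- $R$ is the “union” of the $k$th rows of $R_k$ ($R^T$ is the “union” of the $k$th columns of $R_k^T$)
- The matrix $A_1$ is called the Schur complement of $a_{11}$ in $A$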
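The recursive construction can be written almost literally as code. The following is a minimal sketch in plain Python, assuming the input is a symmetric positive-definite matrix stored as a list of rows; the function name `cholesky_recursive` and the example matrix are illustrative, not from the lecture. Each call produces one row of $R$ and recurses on the Schur complement.

```python
import math


def cholesky_recursive(A):
    """Return upper-triangular R (list of rows) with A = R^T R, for SPD A."""
    n = len(A)
    a11 = A[0][0]
    if a11 <= 0.0:
        raise ValueError("matrix is not positive definite")
    r11 = math.sqrt(a11)
    if n == 1:
        return [[r11]]
    s = [A[0][j] / r11 for j in range(1, n)]  # s = b / r11
    # Schur complement of a11: A1 = K - s s^T = K - b b^T / a11.
    A1 = [[A[i][j] - s[i - 1] * s[j - 1] for j in range(1, n)]
          for i in range(1, n)]
    R_tilde = cholesky_recursive(A1)
    # First row of R is [r11, s^T]; the remaining rows are [0, R_tilde].
    return [[r11] + s] + [[0.0] + row for row in R_tilde]


# Illustrative SPD matrix (an assumption, not from the lecture).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
R = cholesky_recursive(A)
```

Forming each Schur complement as a new list keeps the sketch close to the derivation; a practical implementation would update the matrix in place instead.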
Existence and Uniqueness

- Every SPD matrix has a unique Cholesky factorization
  - Exists because the algorithm for Cholesky factorization always works for SPD matrices
  - Unique because once $\alpha = \sqrt{a_{11}}$ is determined at each step, the entire column $w/\alpha$ is determined ($w$ denotes the column below $a_{11}$, written $b$ earlier)
- Question: How to check whether a symmetric matrix is positive definite?
- Answer: Run Cholesky factorization; it succeeds iff the matrix is positive definite.
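The positive-definiteness test in the answer above can be sketched as follows. This is a plain-Python illustration under the assumption that the input is symmetric (symmetry is not checked); the name `is_spd` is hypothetical. The factorization "fails" exactly when a pivot is not positive, so the function only tracks the pivots.

```python
def is_spd(A):
    """Return True if Cholesky-style elimination succeeds on the symmetric matrix A."""
    n = len(A)
    S = [list(row) for row in A]  # work on a copy
    for k in range(n):
        pivot = S[k][k]
        if pivot <= 0.0:  # the square root would fail or produce a zero diagonal
            return False
        # Replace the trailing block by its Schur complement.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                S[i][j] -= S[i][k] * S[k][j] / pivot
    return True
```

In floating-point arithmetic the outcome for nearly singular matrices can depend on rounding, so a `True` or `False` result close to that boundary should be read with care.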
Algorithm of Cholesky Factorization

Factorize SPD matrix $A \in \mathbb{R}^{n\times n}$ into $A = R^T R$

Algorithm: Cholesky factorization
    R = A
    for k = 1 to n
        for j = k + 1 to n
            r_{j,j:n} ← r_{j,j:n} − (r_{kj}/r_{kk}) r_{k,j:n}
        r_{k,k:n} ← r_{k,k:n} / sqrt(r_{kk})

- Note: $r_{j,j:n}$ denotes the subvector of the $j$th row with columns $j, j+1, \ldots, n$
- Operation count
  \[ \sum_{k=1}^{n} \sum_{j=k+1}^{n} 2(n-j) \approx 2\sum_{k=1}^{n}\sum_{j=1}^{k} j \approx \sum_{k=1}^{n} k^2 \approx \frac{n^3}{3} \]
In practice, R overwrites A, and only upper-triangular part is stored.
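A direct transcription of the algorithm into plain Python might look like the sketch below. The function name `cholesky_upper` and the example matrix are illustrative assumptions, and indices are 0-based rather than 1-based as on the slide. Only the upper triangle is read and updated, mirroring the remark that only the upper-triangular part is stored.

```python
import math


def cholesky_upper(A):
    """Return upper-triangular R with A = R^T R using the row-oriented algorithm.

    Only the upper triangle of A is read. Raises ValueError on a nonpositive pivot.
    """
    n = len(A)
    # Copy the upper triangle; the strictly lower part of R stays zero.
    R = [[float(A[i][j]) if j >= i else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        if R[k][k] <= 0.0:
            raise ValueError("matrix is not positive definite")
        for j in range(k + 1, n):
            factor = R[k][j] / R[k][k]
            for i in range(j, n):
                R[j][i] -= factor * R[k][i]  # r[j, j:n] -= (r[k, j] / r[k, k]) * r[k, j:n]
        d = math.sqrt(R[k][k])
        for i in range(k, n):
            R[k][i] /= d  # r[k, k:n] /= sqrt(r[k, k])
    return R


# Illustrative SPD matrix (an assumption, not from the lecture).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
R = cholesky_upper(A)
```

The sketch returns a separate array for clarity; overwriting `A` in place would save the copy at the cost of destroying the input.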
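Bordered Form of Cholesky Factorization

Another version of Cholesky factorization is the bordered form

Let $A_j$ denote the $j \times j$ leading principal submatrix of $A$.

\[
A_j = \begin{bmatrix} A_{j-1} & c \\ c^T & a_{jj} \end{bmatrix}
    = \begin{bmatrix} R_{j-1}^T & 0 \\ h^T & r_{jj} \end{bmatrix}
      \begin{bmatrix} R_{j-1} & h \\ 0 & r_{jj} \end{bmatrix},
\]

where $A_{j-1} = R_{j-1}^T R_{j-1}$, $c = R_{j-1}^T h$, and $a_{jj} = h^T h + r_{jj}^2$. Therefore,

\[ h = R_{j-1}^{-T} c, \qquad r_{jj} = \sqrt{a_{jj} - h^T h}. \]

$h$ is solved for using forward substitution, since $R_{j-1}^T$ is lower triangular.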
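The bordered form builds $R$ one column at a time. A minimal plain-Python sketch, assuming a symmetric positive-definite input stored as a list of rows (the name `cholesky_bordered` and the example matrix are illustrative): for each $j$, the forward substitution for $R_{j-1}^T h = c$ writes $h$ directly into column $j$ of $R$.

```python
import math


def cholesky_bordered(A):
    """Return upper-triangular R with A = R^T R, computed column by column."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Forward substitution for R_{j-1}^T h = c, with c = A[0:j][j];
        # h is stored in R[0:j][j].
        for i in range(j):
            acc = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = acc / R[i][i]
        # r_jj = sqrt(a_jj - h^T h)
        d = A[j][j] - sum(R[k][j] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        R[j][j] = math.sqrt(d)
    return R


# Illustrative SPD matrix (an assumption, not from the lecture).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
R = cholesky_bordered(A)
```

Unlike the row-oriented algorithm, this version never modifies the trailing part of the matrix, so step $j$ only needs the leading $j \times j$ block.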
Notes on Cholesky Factorization

- Implementations
  - Different versions of Cholesky factorization can all use block-matrix operations to achieve better performance, and actual performance depends on the sizes of blocks
  - Different versions may have different amounts of parallelism
- Hermitian positive definite (HPD) matrices
  - Cholesky factorization $A = R^* R$ exists for HPD matrices, where $R$ is upper triangular and its diagonal entries are positive real values
- Question: What happens if $A$ is symmetric but not positive definite?
$LDL^T$ Factorization

- Cholesky factorization is sometimes given by $A = LDL^T$, where $D$ is a diagonal matrix and $L$ is a unit lower triangular matrix
  - This avoids computing square roots
- Symmetric indefinite systems can be factorized with $PAP^T = LDL^T$, where
  - $P$ is a permutation matrix
  - $D$ is block diagonal with $1 \times 1$ and $2 \times 2$ blocks
  - its cost is similar to Cholesky factorization
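For the positive-definite case, the square-root-free variant can be sketched as below. This is a plain-Python illustration without pivoting, so it covers only the $A = LDL^T$ form with diagonal $D$, not the pivoted factorization for indefinite matrices; the name `ldlt` and the example matrix are assumptions.

```python
def ldlt(A):
    """Return (L, d) with A = L diag(d) L^T and L unit lower triangular; no pivoting."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = A[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0.0:
            raise ZeroDivisionError("zero pivot; pivoting would be required")
        for i in range(j + 1, n):
            acc = A[i][j] - sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = acc / d[j]
    return L, d


# Illustrative SPD matrix (an assumption, not from the lecture).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
L, d = ldlt(A)
```

When $A$ is SPD, all entries of `d` are positive and the Cholesky factor is recovered as $R = D^{1/2} L^T$.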
Banded Positive Definite Systems
A matrix A is banded if there is a narrow band around the main diagonal such that all of the entries of A outside of the band are zero
If $A$ is $n \times n$, and there is an $s \ll n$ such that $a_{ij} = 0$ whenever $|i - j| > s$, then we say $A$ is banded with bandwidth $2s + 1$. For symmetric matrices, only half of the band is stored. We say that $A$ has semi-bandwidth $s$.
Theorem Let A be a banded, symmetric positive definite matrix with semi-bandwidth s. Then its Cholesky factor R also has semi-bandwidth s.
- This is easy to prove using the bordered form of Cholesky factorization
- Total flop count of Cholesky factorization is only $\sim ns^2$
- However, $A^{-1}$ of a banded matrix may be dense, so it is not economical to compute $A^{-1}$
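The theorem and the flop count can be illustrated by restricting the loops of the row-oriented algorithm to the band. The sketch below is plain Python; the name `cholesky_banded` and the tridiagonal test matrix are assumptions, and full $n \times n$ storage is used for readability rather than band storage. Each step $k$ touches at most about $s^2$ entries, which gives the $\sim ns^2$ total.

```python
import math


def cholesky_banded(A, s):
    """Return upper-triangular R with A = R^T R for SPD A with semi-bandwidth s.

    Full storage is used for readability; only entries inside the band are touched.
    """
    n = len(A)
    R = [[float(A[i][j]) if 0 <= j - i <= s else 0.0 for j in range(n)]
         for i in range(n)]
    for k in range(n):
        if R[k][k] <= 0.0:
            raise ValueError("matrix is not positive definite")
        hi = min(k + s + 1, n)  # row k is zero from column hi onward
        for j in range(k + 1, hi):
            factor = R[k][j] / R[k][k]
            for i in range(j, hi):
                R[j][i] -= factor * R[k][i]
        d = math.sqrt(R[k][k])
        for i in range(k, hi):
            R[k][i] /= d
    return R


# Illustrative 5x5 tridiagonal SPD matrix (s = 1): 2 on the diagonal, -1 beside it.
n, s = 5, 1
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
R = cholesky_banded(A, s)
```

Because updates to row $j$ never reach past column $k + s$, no entry outside the band of $R$ is ever written, which is the content of the theorem.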