AMS526: I (Numerical ) Lecture 3: Positive-Definite Systems; Cholesky Factorization

Xiangmin Jiao

Stony Brook University

Xiangmin Jiao Numerical Analysis I 1 / 11 Symmetric Positive-Definite Matrices

n×n A ∈ is symmetric positive definite (SPD) if T n x Ax > 0 for x ∈ R \{0} n×n A ∈ is Hermitian positive definite (HPD) if ∗ n x Ax > 0 for x ∈ C \{0} SPD matrices have positive real eigenvalues and orthogonal eigenvectors Note: Most textbooks only talk about SPD or HPD matrices, but a positive-definite matrix does not need to be symmetric or Hermitian! A real matrix A is positive definite iff A + AT is SPD T n If x Ax ≥ 0 for x ∈ R \{0}, then A is said to be positive semidefinite

Xiangmin Jiao Numerical Analysis I 2 / 11 Properties of Symmetric Positive-Definite Matrices

SPD matrix often arises as of some convex functional

I E.g., least squares problems; partial differential equations If A is SPD, then A is nonsingular Let X be any n × m matrix with full rank and n ≥ m. Then T I X X is symmetric positive definite, and T I XX is symmetric positive semidefinite n×m If A is n × n SPD and X ∈ R has full rank and n ≥ m, then X T AX is SPD Any principal submatrix (picking some rows and corresponding columns) of A is SPD and aii > 0

Xiangmin Jiao Numerical Analysis I 3 / 11 Cholesky Factorization If A is symmetric positive definite, then there is factorization of A A = RT R where R is upper triangular, and all its diagonal entries are positive Note: Textbook calls is “Cholesky decomposition”, but “Cholesky factorization” is a better term Key idea: take advantage and preserve symmetry and positive-definiteness during factorization Eliminate below diagonal and to the right of diagonal  T     T  a11 b r11 0 r11 b /r11 A = = T b K b/r11 I 0 K − bb /a11      T  r11 0 1 0 r11 b /r11 T = T = R1 A1R1 b/r11 I 0 K − bb /a11 0 I √ where r11 = a11, where a11 > 0 T −T −1 K − bb /a11 is principal submatrix of SPD A1 = R1 AR1 and therefore is SPD, with positive diagonal entries

Xiangmin Jiao Numerical Analysis I 4 / 11 Cholesky Factorization Apply recursively to obtain

 T T T  T A = R1 R2 ··· Rn (Rn ··· R2R1) = R R, rjj > 0

which is known as Cholesky factorization

How to obtain R from Rn, ... , R2, R1?

 r 0   1 0   r sT  A = 11 11 s I 0 A1 0 I        T  r11 0 1 0 1 0 r11 s = T s I 0 R˜ 0 R˜1 0 I  r 0   r sT  = 11 11 = RT R s R˜ T 0 R˜

T T R is “union” of kth rows of Rk (R is “union” of columns of Rk ) Matrix A1 is called the Schur complement of a11 in A

Xiangmin Jiao Numerical Analysis I 5 / 11 Existence and Uniqueness

Every SPD matrix has a unique Cholesky factorization

I Exists because algorithm for Cholesky factorization always works for SPD matrices √ I Unique because once α = a11 is determined at each step, entire column w/α is determined Question: How to check whether a symmetric matrix is positive definite? Answer: Run Cholesky factorization and it succeeds iff the matrix is positive definite.

Xiangmin Jiao Numerical Analysis I 6 / 11 Algorithm of Cholesky Factorization n×n T Factorize SPD matrix A ∈ R into A = R R

Algorithm: Cholesky factorization R = A for k = 1 to n for j = k + 1 to n r ← r − (r /r )r j,j:n j√,j:n kj kk k,j:n rk,k:n ← rk,k:n/ rkk

Note: rj,j:n denotes subvector of jth row with columns j, j + 1,..., n Operation count

n n n k n X X X X X n3 2(n − j) ≈ 2 j ≈ k2 ≈ 3 k=1 j=k+1 k=1 j=1 k=1

In practice, R overwrites A, and only upper-triangular part is stored.

Xiangmin Jiao Numerical Analysis I 7 / 11 Bordered Form of Cholesky Factorization

Another version of Cholesky factorization is the bordered form

Let Aj denote the j × j leading principle submatrix of A.

   T    Aj−1 c Rj−1 0 Rj−1 h Aj = T = T , c ajj h rjj 0 rjj

T T T 2 where Aj−1 = Rj−1Rj−1, c = Rj−1h, and ajj = h h + rjj Therefore, q −T T h = Rj−1c, rjj = ajj − h h. h is solved for using forward substitution

Xiangmin Jiao Numerical Analysis I 8 / 11 Notes on Cholesky Factorization

Implementations

I Different versions of Cholesky factorization can all use block-matrix operators to achieve better performance, and actual performance depends on the sizes of blocks I Different versions may have different amount of parallelism Hermitian positive definite (HPD) matrices ∗ I Cholesky factorization A = R R, exists for HPD matrices, where R is upper-triangular and its diagonal entries are positive real values Question: What happens if A is symmetric but not positive definite?

Xiangmin Jiao Numerical Analysis I 9 / 11 LDLT Factorization

Cholesky factorization is sometimes given by A = LDLT where D is and L is unit lower This avoids computing square roots Symmetric indefinite systems can be factorized with PAPT = LDLT , where

I P is a permutation matrix I D is block diagonal with 1 × 1 and 2 × 2 blocks I its cost is similar to Cholesky factorization

Xiangmin Jiao Numerical Analysis I 10 / 11 Banded Positive Definite Systems

A matrix A is banded if there is a narrow band around the main diagonal such that all of the entries of A outside of the band are zero

If A is n × n, and there is an s  n such that aij = 0 whenever |i − j| > s, then we say A is banded with bandwidth 2s + 1 For symmetric matrices, only half of band is stored. We say that A has semi-bandwidth s.

Theorem Let A be a banded, symmetric positive definite matrix with semi-bandwidth s. Then its Cholesky factor R also has semi-bandwidth s.

This is easy to prove using bordered form of Cholesky factorization Total flop count of Cholesky factorization is only ∼ ns2 However, A−1 of a banded matrix may be dense, so it is not economical to compute A−1

Xiangmin Jiao Numerical Analysis I 11 / 11