Lecture 4/26/2013
Jed Duersch
Spd Lecture 4/26/2013 matrices
Cholesky decomposi- tion Jed Duersch Band matrices Applied mathematics PhD candidate, physics MA UC Berkeley
April 26, 2013
Jed Duersch UCB 1/19 Symmetric positive-definite matricesI
Lecture 4/26/2013 Definition Jed n×n Duersch A symmetric matrix A ∈ R is positive definite iff xT Ax > 0 holds ∀x 6= 0 ∈ n. Spd R matrices
Cholesky Such matrices naturally arise in a variety of math, science, decomposi- tion and engineering applications. For example, Band The Hamiltonian operator of a physical system matrices The Hessian of a convex function The covariance matrix of linearly independent random variables
Jed Duersch UCB 2/19 Symmetric positive-definite matricesII
Lecture 4/26/2013 Theorem
Jed The following statements are equivalent: Duersch 1 A is symmetric positive definite. Spd T n matrices 2 (x, y) := x Ay is an inner product on R . Cholesky 3 decomposi- All eigenvalues are positive and eigenvectors can be tion constructed to form an orthonormal basis. Band matrices 4 A is symmetric with positive leading principal minors. 5 The Cholesky decomposition of A exists.
Jed Duersch UCB 3/19 ExistenceI
Lecture 4/26/2013 To prove the existence of the Cholesky decomposition for
Jed symmetric positive definite matrices, we apply induction. Duersch n−1×n−1 Induction hypothesis: Given any spd An−1 ∈ R Spd matrices assume there exists lower-triangular Ln−1 where Cholesky decomposi- T tion An−1 = Ln−1Ln−1 Band matrices . √ √ T Base case: A1 = [a1,1] = [ a1,1][ a1,1] . n×n Consistency: Show that for any spd An ∈ R there exists Ln such that
T An = LnLn .
Jed Duersch UCB 4/19 ExistenceII
Lecture 4/26/2013 Let us represent An as
Jed T Duersch α c An = c B Spd matrices
Cholesky decomposi- where α := a1,1 is a scalar, c := a2:n,1 is a column vector, tion n−1×n−1 and B = a2:n,2:n is the remaining lower-right Band R matrices sub-matrix. Then we have √ √ √1 T α 0 1 0 α α c √1 c I 1 T An = α n−1 0 B − α cc 0 In−1
Jed Duersch UCB 5/19 ExistenceIII
Lecture n−1 4/26/2013 Let y be an arbitrary nonzero column vector in R . We can solve Jed Duersch √ α √1 cT 0 Spd α matrices 0 In−1 x = y Cholesky decomposi- tion
Band It follows matrices 1 yT (B − √ ccT )y = xT A x > 0 α n 1 T Thus B − α cc is spd. From the induction hypothesis 1 B − ccT = L LT α n−1 n−1 .
Jed Duersch UCB 6/19 ExistenceIV
Lecture 4/26/2013 Putting it all together, we have √ √ Jed √1 T Duersch α 0 1 0 α α c 1 1 T An = √ c In−1 0 B − cc 0 I Spd α α n−1 matrices
Cholesky √ √ √1 T decomposi- α 0 1 0 α α c tion 1 T = √ c In−1 0 L L Band α n−1 n−1 0 In−1 matrices √ √ √1 T α 0 α α c √1 c L T = α n−1 0 Ln−1
T = LnLn
Jed Duersch UCB 7/19 ImplementationI
Lecture 4/26/2013 Note that the proof of existence was constructive. That
Jed means we can follow the same procedure to calculate the Duersch Cholesky decomposition. Spd 1 Loop over columns. matrices
Cholesky for k=1:n-1 decomposi- tion 2 Set diagonal element of L.
Band L(k,k)=sqrt(A(k,k)) matrices 3 Finish column. L(k+1:n,k)=A(k+1:n)/L(k,k) 4 Update lower right sub-matrix of A. A(k+1:n,k+1:n)=A(k+1:n,k+1:n)- L(k+1:n,k)*L(k+1:n,k)T 5 End for loop. end
Jed Duersch UCB 8/19 Band matricesI
Lecture 4/26/2013 We can similarly take advantage of the stucture of band
Jed matrices to optimize the calculation of the LU Duersch decomposition.
Spd matrices Definition Cholesky n×n decomposi- A band matrix A ∈ R of band width 2m + 1 has nonzero tion entries only on the main diagonal, m super-diagonals, and m Band matrices sub-diagonals.
Jed Duersch UCB 9/19 Band matricesII
Lecture 4/26/2013 For example, a tridiagonal matrix (band width 3) has the form: Jed Duersch a1,1 a1,2 0 Spd matrices a2,1 a2,2 a2,3 Cholesky a3,2 a3,3 a3,4 decomposi- tion .. A = a4,3 a4,4 . Band matrices .. .. . . an−1,n 0 an,n−1 an,n
Jed Duersch UCB 10/19 Band matricesIII
Lecture 4/26/2013 We can easily show that the LU decomposition will have the form: Jed Duersch 1 0u1,1 u1,2 0 Spd . matrices l2,1 1 .. u2,2 Cholesky A = .. .. . decomposi- . . .. u tion n−1,n 0 ln,n−1 1 Band 0 un,n matrices
Because we know so many entries must be zero, we can optimize the algorithm to avoid calculating them. Similarly, the procedures for forward and backward substitution can also be optimized in these cases.
Jed Duersch UCB 11/19 Diagonal dominanceI
Lecture 4/26/2013 Partial pivoting can be implemented, but it will increase the
Jed band width of U from m + 1 to 2m + 1. However, pivoting Duersch can sometimes be avoided.
Spd matrices Definition Cholesky A matrix A ∈ n×n is column diagonally dominant iff decomposi- R tion X Band |aj,j| > |ai,j| matrices i6=j
We can show that a column diagonally dominant matrix never requires pivoting.
Jed Duersch UCB 12/19 Diagonal dominanceII
Lecture (i) 4/26/2013 Proof: Let ak,j represent element k, j after zeroing column i
Jed the standard LU algorithm. Assume the induction Duersch hypothesis:
Spd matrices n (i) X (i) Cholesky |aj,j| > |ak,j| decomposi- tion k=i+1,6=j Band When column i + 1 is zeroed below the diagonal, the matrices elements of A update as follows:
a(i) a(i) a(i+1) = a(i) − k,i+1 i+1,j k,j k,j (i) ai+1,i+1
Jed Duersch UCB 13/19 Diagonal dominanceIII
Lecture 4/26/2013 (i) (i) (i+1) (i) aj,i+1ai+1,j Jed |a | = |a − | Duersch j,j j,j (i) ai+1,i+1 Spd (i) (i) matrices |a ||a | ≥ |a(i)| − j,i+1 i+1,j Cholesky j,j (i) decomposi- |a | tion i+1,i+1
Band n (i) (i) X (i) |aj,i+1||ai+1,j| matrices > |a | − k,j (i) k=i+1,6=j |ai+1,i+1| n (i) ! X (i) (i) |aj,i+1| = |a | + |a | 1 − k,j i+1,j (i) k=i+2,6=j |ai+1,i+1|
Jed Duersch UCB 14/19 Diagonal dominanceIV
Lecture 4/26/2013 To make further progress we observe:
Jed (i) (i) Duersch |a ||a | |a(i+1)| ≤ |a(i) | + k,i+1 i+1,j k,j k,j (i) Spd |a | matrices i+1,i+1 (i) (i) Cholesky |a ||a | decomposi- (i) (i+1) k,i+1 i+1,j tion |ak,j| ≥ |ak,j | − |a(i) | Band i+1,i+1 matrices
Jed Duersch UCB 15/19 Diagonal dominanceV
Lecture 4/26/2013 Together this gives:
Jed (i) Duersch n n (i+1) X (i+1) (i) X |ak,i+1| |a | > |a | + |a | 1 − j,j k,j i+1,j (i) Spd |a | matrices k=i+2,6=j k=i+2 i+1,i+1
Cholesky n decomposi- X (i+1) tion > |ak,j | Band k=i+2,6=j matrices This shows consistency for the induction hypothesis and the base case easily follows from the definition of diagonal dominance. The desired result immediate follows since the diagonal elements are always largest in magnitude.
Jed Duersch UCB 16/19 ImplementationI
Lecture 4/26/2013 We can follow the standard LU procedure and simply avoid
Jed elements that do not update. If A has band width 2m + 1, Duersch we have Spd 1 Loop over columns. matrices
Cholesky for k=1:n-1 decomposi- tion 2 Construct column k of L.
Band L(k+1:k+m,k)=A(k+1:k+m,k)/A(k,k) matrices 3 Set column k of A to zero. A(k+1:k+m,k)=0 4 Update lower right sub-matrix of A. A(k+1:k+m,k+1:k+m)=A(k+1:k+m,k+1:k+m)- L(k+1:k+m,k)*A(k,k+1:k+m) 5 End for loop. Identify U. end; U=A
Jed Duersch UCB 17/19 ImplementationII
Lecture 4/26/2013 In the case of a tridiagonal matrix, the procedure becomes exceptionally simple and efficient. Let us write the diagonals Jed Duersch of A as vectors. Then L and U can be written as:
Spd d1 e1 0 matrices . c d .. Cholesky 1 2 decomposi- tion A = .. .. . . en−1 Band matrices 0 cn−1 dn
1 0u1 e1 0 l 1 .. 1 u2 . . . = .. .. .. . en−1 0 ln−1 1 0 un
Jed Duersch UCB 18/19 ImplementationIII
Lecture 4/26/2013 Follow the same procedure, but avoid updating A. Build U directly. Note that the super-diagonal doesn’t change and Jed Duersch can be copied directly into U.
Spd 1 Set first row of U matrices u(1)=d(1) Cholesky decomposi- 2 Loop over columns. tion
Band for k=1:n-1 matrices 3 Construct column k of L. l(k)=c(k)/u(k) 4 Update A and place result directly in U. u(k+1)=d(k+1)-l(k)*e(k) 5 End for loop. end
Jed Duersch UCB 19/19