Lecture 4/26/2013

Jed Duersch

Spd Lecture 4/26/2013 matrices

Cholesky decomposi- tion Jed Duersch Band matrices Applied mathematics PhD candidate, physics MA UC Berkeley

April 26, 2013

Jed Duersch UCB 1/19 Symmetric positive-definite matricesI

Lecture 4/26/2013 Definition Jed n×n Duersch A symmetric A ∈ R is positive definite iff xT Ax > 0 holds ∀x 6= 0 ∈ n. Spd R matrices

Cholesky Such matrices naturally arise in a variety of math, science, decomposi- tion and engineering applications. For example, Band The Hamiltonian operator of a physical system matrices The Hessian of a convex function The of linearly independent random variables

Jed Duersch UCB 2/19 Symmetric positive-definite matricesII

Lecture 4/26/2013 Theorem

Jed The following statements are equivalent: Duersch 1 A is symmetric positive definite. Spd T n matrices 2 (x, y) := x Ay is an inner product on R . Cholesky 3 decomposi- All eigenvalues are positive and eigenvectors can be tion constructed to form an orthonormal basis. Band matrices 4 A is symmetric with positive leading principal minors. 5 The Cholesky decomposition of A exists.

Jed Duersch UCB 3/19 ExistenceI

Lecture 4/26/2013 To prove the existence of the Cholesky decomposition for

Jed symmetric positive definite matrices, we apply induction. Duersch n−1×n−1 Induction hypothesis: Given any spd An−1 ∈ R Spd matrices assume there exists lower-triangular Ln−1 where Cholesky decomposi- T tion An−1 = Ln−1Ln−1 Band matrices . √ √ T Base case: A1 = [a1,1] = [ a1,1][ a1,1] . n×n Consistency: Show that for any spd An ∈ R there exists Ln such that

T An = LnLn .

Jed Duersch UCB 4/19 ExistenceII

Lecture 4/26/2013 Let us represent An as

Jed T  Duersch α c An = c B Spd matrices

Cholesky decomposi- where α := a1,1 is a scalar, c := a2:n,1 is a column vector, tion n−1×n−1 and B = a2:n,2:n is the remaining lower-right Band R matrices sub-matrix. Then we have √ √   √1 T  α 0 1 0 α α c √1 c I 1 T An = α n−1 0 B − α cc 0 In−1

Jed Duersch UCB 5/19 ExistenceIII

Lecture n−1 4/26/2013 Let y be an arbitrary nonzero column vector in R . We can solve Jed Duersch √ α √1 cT  0  Spd α matrices 0 In−1 x = y Cholesky decomposi- tion

Band It follows matrices 1 yT (B − √ ccT )y = xT A x > 0 α n 1 T Thus B − α cc is spd. From the induction hypothesis 1 B − ccT = L LT α n−1 n−1 .

Jed Duersch UCB 6/19 ExistenceIV

Lecture 4/26/2013 Putting it all together, we have √ √ Jed   √1 T  Duersch α 0 1 0 α α c 1 1 T An = √ c In−1 0 B − cc 0 I Spd α α n−1 matrices

Cholesky √ √   √1 T  decomposi- α 0 1 0 α α c tion 1 T = √ c In−1 0 L L Band α n−1 n−1 0 In−1 matrices √ √  √1 T  α 0 α α c √1 c L T = α n−1 0 Ln−1

T = LnLn

Jed Duersch UCB 7/19 ImplementationI

Lecture 4/26/2013 Note that the proof of existence was constructive. That

Jed means we can follow the same procedure to calculate the Duersch Cholesky decomposition. Spd 1 Loop over columns. matrices

Cholesky for k=1:n-1 decomposi- tion 2 Set diagonal element of L.

Band L(k,k)=sqrt(A(k,k)) matrices 3 Finish column. L(k+1:n,k)=A(k+1:n)/L(k,k) 4 Update lower right sub-matrix of A. A(k+1:n,k+1:n)=A(k+1:n,k+1:n)- L(k+1:n,k)*L(k+1:n,k)T 5 End for loop. end

Jed Duersch UCB 8/19 Band matricesI

Lecture 4/26/2013 We can similarly take advantage of the stucture of band

Jed matrices to optimize the calculation of the LU Duersch decomposition.

Spd matrices Definition Cholesky n×n decomposi- A band matrix A ∈ R of band width 2m + 1 has nonzero tion entries only on the main diagonal, m super-diagonals, and m Band matrices sub-diagonals.

Jed Duersch UCB 9/19 Band matricesII

Lecture 4/26/2013 For example, a (band width 3) has the form: Jed Duersch a1,1 a1,2 0  Spd matrices a2,1 a2,2 a2,3  Cholesky  a3,2 a3,3 a3,4  decomposi-   tion  ..  A =  a4,3 a4,4 .  Band   matrices  .. ..   . . an−1,n 0 an,n−1 an,n

Jed Duersch UCB 10/19 Band matricesIII

Lecture 4/26/2013 We can easily show that the LU decomposition will have the form: Jed Duersch  1 0u1,1 u1,2 0  Spd . matrices l2,1 1  ..    u2,2  Cholesky A =  .. ..  .  decomposi-  . .  .. u  tion n−1,n 0 ln,n−1 1 Band 0 un,n matrices

Because we know so many entries must be zero, we can optimize the algorithm to avoid calculating them. Similarly, the procedures for forward and backward substitution can also be optimized in these cases.

Jed Duersch UCB 11/19 Diagonal dominanceI

Lecture 4/26/2013 Partial pivoting can be implemented, but it will increase the

Jed band width of U from m + 1 to 2m + 1. However, pivoting Duersch can sometimes be avoided.

Spd matrices Definition Cholesky A matrix A ∈ n×n is column diagonally dominant iff decomposi- R tion X Band |aj,j| > |ai,j| matrices i6=j

We can show that a column diagonally dominant matrix never requires pivoting.

Jed Duersch UCB 12/19 Diagonal dominanceII

Lecture (i) 4/26/2013 Proof: Let ak,j represent element k, j after zeroing column i

Jed the standard LU algorithm. Assume the induction Duersch hypothesis:

Spd matrices n (i) X (i) Cholesky |aj,j| > |ak,j| decomposi- tion k=i+1,6=j Band When column i + 1 is zeroed below the diagonal, the matrices elements of A update as follows:

a(i) a(i) a(i+1) = a(i) − k,i+1 i+1,j k,j k,j (i) ai+1,i+1

Jed Duersch UCB 13/19 Diagonal dominanceIII

Lecture 4/26/2013 (i) (i) (i+1) (i) aj,i+1ai+1,j Jed |a | = |a − | Duersch j,j j,j (i) ai+1,i+1 Spd (i) (i) matrices |a ||a | ≥ |a(i)| − j,i+1 i+1,j Cholesky j,j (i) decomposi- |a | tion i+1,i+1

Band n (i) (i) X (i) |aj,i+1||ai+1,j| matrices > |a | − k,j (i) k=i+1,6=j |ai+1,i+1| n (i) ! X (i) (i) |aj,i+1| = |a | + |a | 1 − k,j i+1,j (i) k=i+2,6=j |ai+1,i+1|

Jed Duersch UCB 14/19 Diagonal dominanceIV

Lecture 4/26/2013 To make further progress we observe:

Jed (i) (i) Duersch |a ||a | |a(i+1)| ≤ |a(i) | + k,i+1 i+1,j k,j k,j (i) Spd |a | matrices i+1,i+1 (i) (i) Cholesky |a ||a | decomposi- (i) (i+1) k,i+1 i+1,j tion |ak,j| ≥ |ak,j | − |a(i) | Band i+1,i+1 matrices

Jed Duersch UCB 15/19 Diagonal dominanceV

Lecture 4/26/2013 Together this gives:

Jed  (i)  Duersch n n (i+1) X (i+1) (i) X |ak,i+1| |a | > |a | + |a | 1 − j,j k,j i+1,j  (i)  Spd |a | matrices k=i+2,6=j k=i+2 i+1,i+1

Cholesky n decomposi- X (i+1) tion > |ak,j | Band k=i+2,6=j matrices This shows consistency for the induction hypothesis and the base case easily follows from the definition of diagonal dominance. The desired result immediate follows since the diagonal elements are always largest in magnitude.

Jed Duersch UCB 16/19 ImplementationI

Lecture 4/26/2013 We can follow the standard LU procedure and simply avoid

Jed elements that do not update. If A has band width 2m + 1, Duersch we have Spd 1 Loop over columns. matrices

Cholesky for k=1:n-1 decomposi- tion 2 Construct column k of L.

Band L(k+1:k+m,k)=A(k+1:k+m,k)/A(k,k) matrices 3 Set column k of A to zero. A(k+1:k+m,k)=0 4 Update lower right sub-matrix of A. A(k+1:k+m,k+1:k+m)=A(k+1:k+m,k+1:k+m)- L(k+1:k+m,k)*A(k,k+1:k+m) 5 End for loop. Identify U. end; U=A

Jed Duersch UCB 17/19 ImplementationII

Lecture 4/26/2013 In the case of a tridiagonal matrix, the procedure becomes exceptionally simple and efficient. Let us write the diagonals Jed Duersch of A as vectors. Then L and U can be written as:

Spd d1 e1 0  matrices . c d ..  Cholesky  1 2  decomposi-   tion A = .. ..  . . en−1 Band matrices 0 cn−1 dn

1 0u1 e1 0  l 1 ..  1  u2 .   . .   =  .. ..  ..   . en−1 0 ln−1 1 0 un

Jed Duersch UCB 18/19 ImplementationIII

Lecture 4/26/2013 Follow the same procedure, but avoid updating A. Build U directly. Note that the super-diagonal doesn’t change and Jed Duersch can be copied directly into U.

Spd 1 Set first row of U matrices u(1)=d(1) Cholesky decomposi- 2 Loop over columns. tion

Band for k=1:n-1 matrices 3 Construct column k of L. l(k)=c(k)/u(k) 4 Update A and place result directly in U. u(k+1)=d(k+1)-l(k)*e(k) 5 End for loop. end

Jed Duersch UCB 19/19