APPLIED MATRIX THEORY

Lecture Notes for Math 464/514, Presented by DR. MONIKA NITSCHE

Typeset and Edited by ERIC M. BENNER

STUDENTS PRESS, December 3, 2013. Copyright © 2013

Contents
1 Introduction to Linear Algebra
  1.1 Lecture 1: August 19, 2013 (About the class; Linear systems; Example: application to boundary value problem; Analysis of error; Solution of the discretized equation)
2 Matrix Inversion
  2.1 Lecture 2: August 21, 2013 (Gaussian elimination; Inner-product based implementation; Office hours and other class notes; Example: Gauss elimination)
  2.2 Lecture 3: August 23, 2013 (Example: Gauss elimination, cont.; Operation cost of forward elimination; Cost of the order of an algorithm; Validation of lower/upper triangular form; Theoretical derivation of lower/upper form)
  2.3 HW 1: Due August 30, 2013
3 Factorization
  3.1 Lecture 4: August 26, 2013 (Elementary matrices; Solution of matrix using the lower/upper factorization; Sparse and banded matrices; Motivation for Gauss elimination with pivoting)
  3.2 Lecture 5: August 28, 2013 (Motivation for Gauss elimination with pivoting, cont.; Discussion of well-posedness; Gaussian elimination with pivoting)
  3.3 Lecture 6: August 30, 2013 (Discussion of HW problem 2; PLU factorization)
  3.4 Lecture 7: September 4, 2013 (PLU factorization; Triangular matrices; Multiplication of lower triangular matrices; Inverse of a lower triangular matrix; Uniqueness of LU factorization; Existence of the LU factorization)
  3.5 Lecture 8: September 6, 2013 (About homeworks; Discussion of ill-conditioned systems; Inversion of lower triangular matrices; Example of LU decomposition of a lower triangular matrix; Banded matrix example)
  3.6 Lecture 9: September 9, 2013 (Existence of the LU factorization (cont.); Rectangular matrices)
  3.7 HW 2: Due September 13, 2013
4 Rectangular Matrices
  4.1 Lecture 10: September 11, 2013 (Rectangular matrices (cont.); Example of RREF of a rectangular matrix)
  4.2 Lecture 11: September 13, 2013 (Solving Ax = b; Example; Linear functions; Example: transpose operator; Example: trace operator; Matrix multiplication; Proof of transposition property)
  4.3 Lecture 12: September 16, 2013 (Inverses; Low rank perturbations of I; The Sherman–Morrison formula; Finite difference example with periodic boundary conditions; Examples of perturbation; Small perturbations of I)
  4.4 Lecture 13: September 18, 2013 (Small perturbations of I (cont.); Matrix norms; Condition number)
  4.5 HW 3: Due September 27, 2013
5 Vector Spaces
  5.1 Lecture 14: September 20, 2013 (Topics in vector spaces; Field; Vector space; Examples of function spaces)
  5.2 Lecture 15: September 23, 2013 (The four subspaces of A_{m×n})
  5.3 Lecture 16: September 25, 2013 (The four subspaces of A; Linear independence)
  5.4 Lecture 17: September 27, 2013 (Linear functions (review); Review for exam; Previous lecture continued)
  5.5 Lecture 18: October 2, 2013 (Exams and points; Continuation of last lecture)
6 Least Squares
  6.1 Lecture 19: October 4, 2013 (Least squares)
  6.2 Lecture 20: October 7, 2013 (Properties of transpose multiplication; The normal equations; Exam 1)
  6.3 Lecture 21: October 9, 2013 (Exam review; Least squares and minimization)
  6.4 HW 4: Due October 21, 2013
7 Linear Transformations
  7.1 Lecture 22: October 14, 2013 (Linear transformations; Examples of linear functions; Matrix representation of linear transformations)
  7.2 Lecture 23: October 16, 2013 (Basis of a linear transformation; Action of a linear transform; Change of basis)
  7.3 Lecture 24: October 21, 2013 (Change of basis (cont.))
  7.4 Lecture 25: October 23, 2013 (Properties of special bases; Invariant subspaces)
  7.5 HW 5: Due November 4, 2013
8 Norms
  8.1 Lecture 26: October 25, 2013 (Definition of norms; Vector norms; The two norm; Matrix norms; Induced norms)
  8.2 Lecture 27: October 28, 2013 (Matrix norms (review); Frobenius norm; Induced matrix norms)
  8.3 Lecture 28: October 30, 2013 (The 2-norm)
9 Orthogonalization with Projection and Rotation
  9.1 Lecture 28 (cont.) (Inner product spaces)
  9.2 Lecture 29: November 1, 2013 (Inner product spaces; Fourier expansion; Orthogonalization process (Gram–Schmidt))
  9.3 Lecture 30: November 4, 2013 (Gram–Schmidt orthogonalization)
  9.4 Lecture 31: November 6, 2013 (Unitary (orthogonal) matrices; Rotation; Reflection)
  9.5 HW 6: Due November 11, 2013
  9.6 Lecture 32: November 8, 2013 (Elementary orthogonal projectors; Elementary reflection; Complementary subspaces of V; Projectors)
  9.7 Lecture 33: November 11, 2013 (Projectors; Representation of a projector)
  9.8 Lecture 34: November 13, 2013 (Projectors; Decompositions of Rⁿ; Range–nullspace decomposition of A_{n×n})
  9.9 HW 7: Due November 22, 2013
  9.10 Lecture 35: November 15, 2013 (Range–nullspace decomposition of A_{n×n}; Corresponding factorization of A)
10 Singular Value Decomposition
  10.1 Lecture 35 (cont.) (Singular value decomposition)
  10.2 Lecture 36: November 18, 2013 (Singular value decomposition; Existence of the singular value decomposition)
  10.3 Lecture 37: November 20, 2013 (Review and correction from last time; Singular value decomposition; Geometric interpretation)
  10.4 Lecture 38: November 22, 2013 (Review for Exam 2; Norms; More major topics)
  10.5 HW 8: Due December 10, 2013
  10.6 Lecture 39: November 27, 2013 (Singular value decomposition; SVD in Matlab)
11 Additional Topics
  11.1 Lecture 39 (cont.) (The determinant)
  11.2 Lecture 40: December 2, 2013 (Further details for class; Diagonalizable matrices; Eigenvalues and eigenvectors)
Index
Other Contents
UNIT 1
Introduction to Linear Algebra
1.1 Lecture 1: August 19, 2013
About the class
The textbook for the class will be Matrix Analysis and Applied Linear Algebra by Meyer. Another highly recommended text is Laub's Matrix Analysis for Scientists and Engineers.
Linear Systems
A linear system may be of the general form
Ax = b. (1.1.1)
This may be represented in several equivalent ways.
2x1 + x2 − 3x3 = 18, (1.1.2a)
−4x1 + 5x3 = −28, (1.1.2b)
6x1 + 13x2 = 37. (1.1.2c)
This also may be put in matrix form:
\[
\begin{pmatrix} 2 & 1 & -3 \\ -4 & 0 & 5 \\ 6 & 13 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 18 \\ -28 \\ 37 \end{pmatrix}. \tag{1.1.3}
\]
Finally, the third common form is the vector form:
\[
\begin{pmatrix} 2 \\ -4 \\ 6 \end{pmatrix} x_1
+ \begin{pmatrix} 1 \\ 0 \\ 13 \end{pmatrix} x_2
+ \begin{pmatrix} -3 \\ 5 \\ 0 \end{pmatrix} x_3
= \begin{pmatrix} 18 \\ -28 \\ 37 \end{pmatrix}. \tag{1.1.4}
\]
Figure 1.1. Finite difference approximation of a 1D boundary value problem: a curve y(t) sampled at the grid points t0, t1, t2, t3, ..., tn.
Example: Application to boundary value problem
We will use finite difference approximations on a uniform grid to solve the problem
−y″(t) = f(t), for t ∈ [0, 1], (1.1.5)
with the boundary conditions
y(0) = 0, (1.1.6a)
y(1) = 0. (1.1.6b)
This is a 1D version of the general Laplace equation represented by,
− ∆u = f (1.1.7) or in more engineering/science form
− ∇2u = f. (1.1.8)
The Laplace operator in Cartesian coordinates is
∇2u = ∇ · (∇u), (1.1.9a)
= uxx + uyy + uzz. (1.1.9b)
Finite Difference Approximation
Let t_j = jΔt, with j = 0, ..., N, and denote the approximate solution values by y_j ≈ y(t_j). We now need to approximate the derivatives using these discrete values. The forward difference approximation is
\[ y'(t_j) \approx \frac{y_{j+1} - y_j}{t_{j+1} - t_j} \tag{1.1.10} \]
or
\[ y'(t_j) \approx \frac{y_{j+1} - y_j}{\Delta t}. \tag{1.1.11} \]
The backward difference approximation is
\[ y'(t_j) \approx \frac{y_j - y_{j-1}}{\Delta t}. \tag{1.1.12} \]
The centered difference approximation is
\[ y'(t_j) \approx \frac{y_{j+1} - y_{j-1}}{2\Delta t}. \tag{1.1.13} \]
Each of these is a useful approximation to the first derivative, with varying properties when applied to specific differential equations. The second derivative may be approximated by combining the approximations of the first derivative:
\begin{align}
(y')'(t_j) &\approx \frac{y'_{j+1/2} - y'_{j-1/2}}{\Delta t} \tag{1.1.14a}\\
&= \frac{\frac{y_{j+1} - y_j}{\Delta t} - \frac{y_j - y_{j-1}}{\Delta t}}{\Delta t} \tag{1.1.14b}\\
&= \frac{y_{j+1} - 2y_j + y_{j-1}}{\Delta t^2}. \tag{1.1.14c}
\end{align}
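As a quick numerical sanity check (not from the lecture), one can verify the second-order accuracy of (1.1.14c) in Matlab on a function with a known second derivative:

% check the centered second difference on y(t) = sin(t), where y''(t) = -sin(t)
t0 = 0.7;                            % arbitrary evaluation point
for dt = 10.^(-1:-1:-6)
    approx = (sin(t0+dt) - 2*sin(t0) + sin(t0-dt))/dt^2;
    fprintf('dt = %8.1e   error = %10.3e\n', dt, abs(approx + sin(t0)));
end
% the error drops by ~100x per 10x reduction in dt, i.e. O(dt^2),
% until roundoff takes over for very small dt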
Analysis of error
To understand the error of this approximation we may utilize the Taylor series. A general Taylor series is
\[ f(x) = f(a) + f'(a)(x-a) + \frac{1}{2}f''(a)(x-a)^2 + \frac{1}{3!}f'''(a)(x-a)^3 + \cdots \tag{1.1.15} \]
By the Taylor remainder theorem, we may capture the error with a special truncation of the series,
\[ f(x) = f(a) + f'(a)(x-a) + \frac{1}{2}f''(a)(x-a)^2 + \frac{1}{3!}f'''(\xi)(x-a)^3, \tag{1.1.16} \]
or simply
\[ f(x) = f(a) + f'(a)(x-a) + \frac{1}{2}f''(a)(x-a)^2 + O\big((x-a)^3\big). \tag{1.1.17} \]
The difference we are interested in, to find the error, is
\[ E = y''(t_j) - \frac{y(t_{j+1}) - 2y(t_j) + y(t_{j-1})}{\Delta t^2}. \tag{1.1.18} \]
The Taylor series
\begin{align}
y(t_{j+1}) &= y(t_j + \Delta t) = y(t_j) + y'(t_j)\Delta t + O(\Delta t^2), \tag{1.1.19a}\\
y(t_{j-1}) &= y(t_j - \Delta t) = y(t_j) - y'(t_j)\Delta t + O(\Delta t^2) \tag{1.1.19b}
\end{align}
will need to be substituted. A function g is said to be of order 2, or g = O(h²), if
\[ |g| \le Ch^2. \tag{1.1.20} \]
Solution of the discretized equation
We now substitute the discrete difference approximation,
\[ -\frac{y_{j+1} - 2y_j + y_{j-1}}{\Delta t^2} = f(t_j), \qquad j = 1, \dots, n-1, \tag{1.1.21} \]
and the boundary conditions become
y0 = 0, (1.1.22a)
yn = 0. (1.1.22b)
This gives the linear system which will need to be solved for the unknowns yi.
\[
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \ddots & \vdots \\
0 & -1 & 2 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -1 \\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-2} \\ y_{n-1} \end{pmatrix}
= \Delta t^2
\begin{pmatrix} f(t_1) \\ f(t_2) \\ \vdots \\ f(t_{n-2}) \\ f(t_{n-1}) \end{pmatrix}. \tag{1.1.23}
\]
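A minimal Matlab sketch of assembling and solving (1.1.23) follows; the grid size n and right-hand side f here are illustrative assumptions, not from the notes:

n = 100;  dt = 1/n;
t = (0:n)'*dt;                        % grid points t_0, ..., t_n
f = @(s) pi^2*sin(pi*s);              % example f with exact solution y = sin(pi t)
e = ones(n-1,1);
A = 2*diag(e) - diag(e(1:end-1),1) - diag(e(1:end-1),-1);
y = zeros(n+1,1);                     % boundary values y_0 = y_n = 0
y(2:n) = A \ (dt^2*f(t(2:n)));        % solve for the interior unknowns
max(abs(y - sin(pi*t)))               % error is O(dt^2)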
UNIT 2
Matrix Inversion
2.1 Lecture 2: August 21, 2013
Last time we came up with a tridiagonal system for the finite difference solution.
Gaussian Elimination
We want to solve Ax = b. Claim: Gaussian elimination produces a factorization A = LU. Notation:
A = [aij] (2.1.1)
In class we use underlines to indicate vectors. In general these vectors are column vectors, and we will use xᵀ to indicate the row vector.
Lower triangular system Lx = b:
\[
\begin{pmatrix}
\ell_{11} & 0 & \cdots & 0 \\
\ell_{21} & \ell_{22} & \ddots & \vdots \\
\vdots & & \ddots & 0 \\
\ell_{n1} & \ell_{n2} & \cdots & \ell_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \tag{2.1.2}
\]
or
`11x1 = b1 (2.1.3a)
`21x1 + `22x2 = b2 (2.1.3b) ··· (2.1.3c)
`n1x1 + `n2x2 + ··· + `nnxn = bn (2.1.3d)
Rearranging to solve the equations,
\begin{align}
x_1 &= \frac{b_1}{\ell_{11}}, \tag{2.1.4a}\\
x_2 &= \frac{b_2 - \ell_{21}x_1}{\ell_{22}}, \tag{2.1.4b}\\
&\ \,\vdots \tag{2.1.4c}\\
x_i &= \frac{b_i - (\ell_{i1}x_1 + \cdots + \ell_{i,i-1}x_{i-1})}{\ell_{ii}}. \tag{2.1.4d}
\end{align}
The basic algorithm for solution of the above system in pseudo code follows:
1: x_1 ← b_1/ℓ_11
2: for i ← 2, n do
3:   x_i ← [b_i − Σ_{k=1}^{i−1} ℓ_ik x_k]/ℓ_ii
4: end for
The operation count, N_ops, becomes
\[ N_{\mathrm{ops}} = 1 + \sum_{i=2}^{n} \Big( \underbrace{1}_{\text{division}} + \underbrace{1}_{\text{subtraction}} + \underbrace{(i-1)}_{\text{multiplications}} + \underbrace{(i-2)}_{\text{additions}} \Big). \tag{2.1.5} \]
Each of these terms arises directly from the steps of the algorithm shown above.
ASIDE: Finite sums
We need the following sums for our derivations of the operation counts,
\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \tag{2.1.6} \]
\[ \sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}. \tag{2.1.7} \]
Evaluating the operation count,
\begin{align}
N_{\mathrm{ops}} &= 1 + \sum_{i=2}^{n} (2i-1) \tag{2.1.8a}\\
&= \sum_{i=1}^{n} (2i-1) \tag{2.1.8b}\\
&= 2\Big(\sum_{i=1}^{n} i\Big) - n \tag{2.1.8c}\\
&= n(n+1) - n \tag{2.1.8d}\\
&= n^2. \tag{2.1.8e}
\end{align}
Implementation of lower triangular solution in Matlab
We give a Matlab code for this solution (note the division by L(i,i), which the algorithm above requires):

function x = Ltrisol(L,b)
% solve Lx = b, assuming L(i,i) ~= 0
n = length(b);
x = zeros(n,1);          % initialize the size of your vector
x(1) = b(1)/L(1,1);
for i = 2:n
    x(i) = b(i);
    for k = 1:i-1
        x(i) = x(i) - L(i,k)*x(k);
    end
    x(i) = x(i)/L(i,i);  % divide by the pivot
end
end
This would be saved as the code Ltrisol.m and would be run as
>> L = ...; b = ...; >> x = Ltrisol(L, b)
Warning: Matlab loops are very slow!
Inner-product based implementation
How do we rewrite the code using inner products? We can replace the second for-loop with an inner product:

function x = Ltrisol(L,b)
% solve Lx = b, assuming L(i,i) ~= 0
n = length(b);
x = zeros(n,1);              % initialize x as a column vector
x(1) = b(1)/L(1,1);
for i = 2:n
    x(i) = (b(i) - L(i,1:i-1)*x(1:i-1))/L(i,i);
end
end

Note that the L(i,1:i-1) term is a row vector and x(1:i-1) is a column vector, so their product is a scalar and this code works fine. Recall that this requires that x be initialized as a column vector. The inner part can also be rewritten more cleanly as:

function x = Ltrisol(L,b)
% solve Lx = b, assuming L(i,i) ~= 0
n = length(b);
x = zeros(n,1);              % initialize x as a column vector
x(1) = b(1)/L(1,1);
for i = 2:n
    k = 1:i-1;
    x(i) = (b(i) - L(i,k)*x(k))/L(i,i);
end
end
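As a quick check (not part of the lecture), one can compare Ltrisol against Matlab's backslash on a random lower triangular system:

n = 200;
L = tril(randn(n)) + n*eye(n);   % random, well-conditioned lower triangular matrix
b = randn(n,1);
norm(Ltrisol(L,b) - L\b)         % should be near machine precision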
Office hours and other class notes
Office hours will be from 12–1 on MWF; the course web address is www.math.unm.edu/~nitsche/math464.html.
Example: Gauss Elimination
Consider the example:
2x1 − x2 + 3x3 = 13 (2.1.9a)
−4x1 + 6x2 − 5x3 = −28 (2.1.9b)
6x1 + 13x2 + 16x3 = 37 (2.1.9c)
Let's perform each step in full equation form. So we execute the steps R2 → R2 − (−2)R1 and R3 → R3 − (3)R1.
2x1 − x2 + 3x3 = 13 (2.1.10a)
4x2 + x3 = −2 (2.1.10b)
16x2 + 7x3 = −2 (2.1.10c)
Next step will be R3 → R3 − (4)R2.
2.2 Lecture 3: August 23, 2013
Example: Gauss Elimination, cont.
Recall the example:
2x1 − x2 + 3x3 = 13 (2.2.1a)
−4x1 + 6x2 − 5x3 = −28 (2.2.1b)
6x1 + 13x2 + 16x3 = 37 (2.2.1c)
Let's perform each step in full equation form. So we execute the steps R2 → R2 − (−2)R1 and R3 → R3 − (3)R1.
2x1 − x2 + 3x3 = 13 (2.2.2a)
4x2 + x3 = −2 (2.2.2b)
16x2 + 7x3 = −2 (2.2.2c)
Next step will be R3 → R3 − (4)R2.
2x1 − x2 + 3x3 = 13 (2.2.3a)
4x2 + x3 = −2 (2.2.3b)
3x3 = 6 (2.2.3c)
Now we begin the backward substitution.
x3 = 2; (2.2.4a)
x2 = (−2 − x3)/4, (2.2.4b) = −1; (2.2.4c)
x1 = (13 + x2 − 3x3)/2, (2.2.4d) = 3. (2.2.4e)
Gauss Elimination is forward elimination and backward substitution. Now we will do the same problem in matrix form,
\[
\left(\begin{array}{ccc|c} 2 & -1 & 3 & 13 \\ -4 & 6 & -5 & -28 \\ 6 & 13 & 16 & 37 \end{array}\right)
\to
\left(\begin{array}{ccc|c} 2 & -1 & 3 & 13 \\ 0 & 4 & 1 & -2 \\ 0 & 16 & 7 & -2 \end{array}\right)
\to
\left(\begin{array}{ccc|c} 2 & -1 & 3 & 13 \\ 0 & 4 & 1 & -2 \\ 0 & 0 & 3 & 6 \end{array}\right). \tag{2.2.5}
\]
Operation Cost of Forward Elimination
Now we want to know the operation count for the forward elimination step when we take A → U without pivoting, for a general n × n matrix A = [a_{ij}]. As an example of each step:
\[
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45} \\
a_{51} & a_{52} & a_{53} & a_{54} & a_{55}
\end{pmatrix}
\to
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
0 & a'_{22} & a'_{23} & a'_{24} & a'_{25} \\
0 & a'_{32} & a'_{33} & a'_{34} & a'_{35} \\
0 & a'_{42} & a'_{43} & a'_{44} & a'_{45} \\
0 & a'_{52} & a'_{53} & a'_{54} & a'_{55}
\end{pmatrix}. \tag{2.2.6a}
\]
These operations are given by row_j → row_j − ℓ_{ij} row_i, where ℓ_{ij} = a_{ij}/a_{ii} if a_{ii} ≠ 0 (a_{ii} should not be close to zero or we will need to use pivoting). For example, in the first column, a_{i1} → a_{i1} − ℓ_{i1}a_{11} = 0. The next step:
\[
\to
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
0 & a'_{22} & a'_{23} & a'_{24} & a'_{25} \\
0 & 0 & a''_{33} & a''_{34} & a''_{35} \\
0 & 0 & a''_{43} & a''_{44} & a''_{45} \\
0 & 0 & a''_{53} & a''_{54} & a''_{55}
\end{pmatrix}. \tag{2.2.6b}
\]
Figure 2.1. One-dimensional discrete grids: (a) an n grid; (b) a 4n grid.
At the ith step (i = 1, ..., n − 1), the active block is updated,
\[ B_{(n-i)\times(n-i)} \to \tilde B_{(n-i)\times(n-i)}, \tag{2.2.7} \]
and the cost of the individual step is
\[ \underbrace{(n-i)}_{\text{computing the } \ell_{ij}} + \underbrace{2(n-i)^2}_{\text{updating the } a_{ij}}. \]
The total cost is thus
\[ N_{\mathrm{ops}} = \sum_{i=1}^{n-1} \big[(n-i) + 2(n-i)^2\big]. \tag{2.2.8a} \]
Let k = n − i; then i = 1 → k = n − 1 and i = n − 1 → k = 1, so
\begin{align}
N_{\mathrm{ops}} &= \sum_{k=1}^{n-1} (k + 2k^2) \tag{2.2.8b}\\
&= \underbrace{\frac{(n-1)n}{2}}_{O(n^2)} + \underbrace{2\,\frac{(n-1)n(2(n-1)+1)}{6}}_{O(n^3)} \tag{2.2.8c}\\
&\approx \frac{2}{3}n^3 = O(n^3). \tag{2.2.8d}
\end{align}
This means that the cost of the algorithm scales with the cube of the problem size.
Cost of the Order of an Algorithm For an order 3 algorithm, if you increase the size of your matrix by a factor of 2, the expense of computer time will increase by a factor of 8. Similarly, if it took one day to solve a boundary value problem in 1D with n = 1000, then it will take 64 days to do n = 4000 (see figure 2.1). Alternatively, if you are doing a 2D simulation, increasing by a factor of 4, as shown in figure 2.2, would increase the domain to 16 and thus the calculations would increase to 163. This gets very expensive! This is one of the reasons that models of phenomena such as the weather is very difficult.
Figure 2.2. Two-dimensional discrete grids: (a) an n × n grid; (b) a 4n × 4n grid.
Validation of Lower/Upper Triangular Form
Consider that we have the Gaussian elimination result A = LU, where L is unit lower triangular, carrying the multipliers ℓ_{ij} below its unit diagonal:
\[ L = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ \ell_{ij} & & 1 \end{pmatrix}. \tag{2.2.9} \]
Check our previous system:
\[
\begin{pmatrix} 2 & -1 & 3 \\ -4 & 6 & -5 \\ 6 & 13 & 16 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 3 & 4 & 1 \end{pmatrix}
\begin{pmatrix} 2 & -1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & 3 \end{pmatrix}. \tag{2.2.10}
\]
This works!
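This check is quick to confirm numerically (matrices copied from (2.2.10)):

A = [2 -1 3; -4 6 -5; 6 13 16];
L = [1 0 0; -2 1 0; 3 4 1];
U = [2 -1 3; 0 4 1; 0 0 3];
norm(L*U - A)      % returns 0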
Theoretical derivation of Lower/Upper Form
We want to show that Gauss elimination naturally leads to the LU form using elementary row operations. The three elementary operations are:
1. Multiply row by α;
2. Switch rowi and rowj;
3. Add multiple of rowi to rowj.
All are equivalent to pre-multiplying A by an elementary matrix. Let’s illustrate these:
1. Multiply a row by α:
\[
\underbrace{\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & \alpha & & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}}_{E_i}
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
a_{31} & a_{32} & \cdots & a_{3n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\alpha a_{31} & \alpha a_{32} & \cdots & \alpha a_{3n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}. \tag{2.2.11a}
\]
2.3 Homework Assignment 1: Due Friday, August 30, 2013
1. Use Taylor series expansions of f(x ± h) about x to show that
\[ f''(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} - \frac{h^2}{12} f^{(4)}(x) + O(h^4). \tag{2.3.1} \]
2. Consider the two-point boundary value problem
\[ y''(x) = e^x, \qquad y(-1) = \frac{1}{e}, \qquad y(1) = e, \tag{2.3.2} \]
where x ∈ [−1, 1]. Divide the interval [−1, 1] into N equal subintervals and apply the finite difference method presented in class to approximate the solution y_j ≈ y(x_j) at the N − 1 interior points j = 1, ..., N − 1, where x_j = a + jh, h = (b − a)/N, and [a, b] = [−1, 1]. Compare the approximate values at the grid points with the exact solution at the grid points. Use N = 2, 4, 8, ..., 2⁹ and report the maximal absolute error for each N in a table. Your writeup should contain:
• the Matlab code;
• a table with two columns: the first contains h, the second contains the corresponding maximal errors. By how much is the error reduced every time N is doubled? Can you conclude whether the error is O(h), O(h²), or O(hᵖ) for some other integer p?
Regarding Matlab: If needed, go over the Matlab tutorial on the course website, items 1–6. This covers more than you need for this problem. In Matlab, type
help diag or help ones
to find what these commands do. The (N −1)×(N −1) matrix with 2s on the diagonal and –1 on the off-diagonals can be constructed by
v=ones(1,n-1); A=2*diag(v)-diag(v(1:n-2),1)-diag(v(1:n-2),-1);
The system Ax = b can be solved in Matlab by x = A\b. The maximal difference between two vectors x and y is error=max(abs(x-y)). Your code should have the following structure
Listing 2.1. Code stub for the tridiagonal solver

disp(sprintf('       h            error'))
a=...; b=...;     % set values of endpoints
ya=...; yb=...;   % set values of y at the endpoints
for n = ...
    h=2/n;
    x=a:h:b;
    % Set matrix A of the linear system to be solved.
    v=ones(1,n-1);
    A=2*diag(v)-diag(v(1:n-2),1)-diag(v(1:n-2),-1);
    % Set right hand side of linear system.
    rhs=...;
    % Solve linear system to find approximate solution.
    y(2:n)=A\rhs; y(1)=ya; y(n+1)=yb;
    % Compute exact solution and approximation error.
    yex=...;      % set exact solution
    plot(x,y,'b-',x,yex,'r-')   % to compare visually
    error=max(abs(y-yex));
    disp(sprintf('%15.10f %20.15f',h,error))
end
Note that in Matlab the index of all vectors starts with 1. Thus, x=-1:h:1, is a vector of length n + 1 and the interior points are x(2:n).
3. Let U be an upper triangular n × n matrix with nonzero entries uij, j ≥ i. (a) Write an algorithm that solves Ux = b for a given right hand side b for the unknown x. (b) Find the number of operations that it takes to solve for x, using your algorithm above. (c) Write a Matlab function function x=utrisol(u,b) that implements your al- gorithm and returns the solution x.
4. Given A, b below,
(a) find the LU factorization of A (using the Gauss Elimination algorithm);
(b) use it to solve Ax = b, where
\[ A = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 5 \end{pmatrix}. \tag{2.3.3} \]
5. Sparsity of L and U, given the sparsity of A = LU. If A, B, C, D have nonzeros in the positions marked by x, which zeros (marked by 0) are still guaranteed to be zero in their factors L and U? (B, C, D are all band matrices with p = 3 bands, but differing sparsity within the bands. The question is how much of this sparsity is preserved.) In each case, highlight the new nonzero entries in L and U.
\[
A = \begin{pmatrix}
x & x & x & x \\
x & x & x & 0 \\
0 & x & x & x \\
0 & 0 & x & x
\end{pmatrix},
\qquad
B = \begin{pmatrix}
x & 0 & x & 0 & 0 & 0 \\
0 & x & 0 & x & 0 & 0 \\
x & 0 & x & 0 & x & 0 \\
0 & x & 0 & x & 0 & 0 \\
0 & 0 & x & 0 & x & 0 \\
0 & 0 & 0 & x & 0 & x
\end{pmatrix},
\]
\[
C = \begin{pmatrix}
x & x & x & 0 & 0 & 0 \\
0 & x & 0 & x & 0 & 0 \\
x & 0 & x & 0 & x & 0 \\
0 & x & 0 & x & 0 & 0 \\
0 & 0 & x & 0 & x & 0 \\
0 & 0 & 0 & x & 0 & x
\end{pmatrix},
\qquad
D = \begin{pmatrix}
x & 0 & 0 & x & 0 & 0 \\
0 & x & 0 & 0 & x & 0 \\
x & 0 & x & 0 & 0 & x \\
0 & x & 0 & x & 0 & 0 \\
0 & 0 & x & 0 & x & 0 \\
0 & 0 & 0 & x & 0 & x
\end{pmatrix}.
\]
6. Consider solving a differential equation in a unit cube, using N points to discretize each dimension. That is, you have a total of N³ points at which you want to approximate the solution. Suppose that at each time step you need to solve a linear system Ax = b, where A is an N³ × N³ matrix, which you solve using Gauss Elimination, and suppose there are no other computations involved. Assume your personal computer runs at 1 gigaFLOPS, that is, it executes 10⁹ floating point operations per second.
(a) How much time does it take to solve your problem for N = 500 for 1000 timesteps? (b) When you double the number of points N, you typically also have to halve the timestep, that is, double the total number of timesteps taken. By what factor does the runtime increase each time you double N? (c) How much time will it take to solve the problem if you use N = 2000?
UNIT 3
Factorization
3.1 Lecture 4: August 26, 2013
For the h in the homework, use n = 2.^(1:1:10). We want to deduce the order of the method from the table of h and the error.
Elementary Matrices
1. Multiply row_i by α:
\[ E_1 = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \alpha & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}. \tag{3.1.1} \]
The inverse is
\[ E_1^{-1} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1/\alpha & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}, \tag{3.1.2} \]
so that
\[ E_1 E_1^{-1} = I. \tag{3.1.3} \]
2. Exchange row_i and row_j: E_2 is the identity with rows i and j exchanged, e.g.
\[ E_2 = \begin{pmatrix} 1 & & & & & \\ & 0 & & 1 & & \\ & & 1 & & & \\ & 1 & & 0 & & \\ & & & & 1 & \\ & & & & & 1 \end{pmatrix}, \tag{3.1.4} \]
which satisfies
\[ E_2^2 = I. \tag{3.1.5} \]
3. Replace row_j by row_j + α row_i:
\[ E_3 = \begin{pmatrix} 1 & & & & & \\ & 1 & & & & \\ & & 1 & & & \\ & & \alpha & 1 & & \\ & & & & 1 & \\ & & & & & 1 \end{pmatrix}, \qquad E_3^{-1} = \begin{pmatrix} 1 & & & & & \\ & 1 & & & & \\ & & 1 & & & \\ & & -\alpha & 1 & & \\ & & & & 1 & \\ & & & & & 1 \end{pmatrix}. \tag{3.1.6, 3.1.7} \]
What happens if we post-multiply by the elementary matrices? The matrices then act on the columns instead of the rows: AE_1 multiplies column i of A by α (3.1.8), and AE_2 exchanges columns i and j of A (3.1.9).
Gaussian Elimination without pivoting
Premultiply by elementary matrices of type 3 repeatedly:
aji `ji = , for j > i (3.1.10) aii
\[ E_{-21}A = \begin{pmatrix} x & x & x & x & x \\ 0 & x & x & x & x \\ x & x & x & x & x \\ x & x & x & x & x \\ x & x & x & x & x \end{pmatrix}, \tag{3.1.11} \]
\[ E_{-31}E_{-21}A = \begin{pmatrix} x & x & x & x & x \\ 0 & x & x & x & x \\ 0 & x & x & x & x \\ x & x & x & x & x \\ x & x & x & x & x \end{pmatrix}. \tag{3.1.12} \]
This sequence continues until we have introduced zeros everywhere below the diagonal:
\[ E_{-n,n-1} \cdots E_{-n1} \cdots E_{-31}E_{-21}A = \begin{pmatrix} x & x & x & x & x \\ 0 & x & x & x & x \\ 0 & 0 & x & x & x \\ 0 & 0 & 0 & x & x \\ 0 & 0 & 0 & 0 & x \end{pmatrix} = U. \tag{3.1.13} \]
Thus,
\[ A = \underbrace{E_{21}E_{31} \cdots E_{n-1,n-2}E_{n,n-2}E_{n,n-1}}_{L}\,U. \tag{3.1.14} \]
For example,
\[ E_{21}E_{31} = \begin{pmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ & & 1 & \\ & & & \ddots \end{pmatrix} \begin{pmatrix} 1 & & & \\ & 1 & & \\ \ell_{31} & & 1 & \\ & & & \ddots \end{pmatrix} = \begin{pmatrix} 1 & & & \\ \ell_{21} & 1 & & \\ \ell_{31} & & 1 & \\ & & & \ddots \end{pmatrix}. \tag{3.1.15} \]
This extends to
\[ \tilde E_1 = E_{21}E_{31} \cdots E_{n1} = \begin{pmatrix} 1 & & & & \\ \ell_{21} & 1 & & & \\ \ell_{31} & & 1 & & \\ \vdots & & & \ddots & \\ \ell_{n1} & & & & 1 \end{pmatrix}, \tag{3.1.16} \]
which further extends to
\[ \tilde E_1\tilde E_2 = \begin{pmatrix} 1 & & & & \\ \ell_{21} & 1 & & & \\ \ell_{31} & \ell_{32} & 1 & & \\ \vdots & \vdots & & \ddots & \\ \ell_{n1} & \ell_{n2} & & & 1 \end{pmatrix}. \tag{3.1.17} \]
Finally we get that
\[ \tilde E_1\tilde E_2 \cdots \tilde E_{n-1} = \begin{pmatrix} 1 & & & & \\ \ell_{21} & 1 & & & \\ \ell_{31} & \ell_{32} & 1 & & \\ \vdots & \vdots & \ddots & \ddots & \\ \ell_{n1} & \ell_{n2} & \cdots & \ell_{n,n-1} & 1 \end{pmatrix} = L. \tag{3.1.18} \]
Solution of Matrix using the Lower/Upper factorization
To use A = LU to solve Ax = b:
1. Find L, U (number of operations: (2/3)n³).
2. Write L(Ux) = b. First solve Ly = b (number of operations: n²), then solve Ux = y (number of operations: n²).
Example: to solve Ax = b_k for k = 1, ..., 10⁴: find L and U once, at cost O((2/3)n³), then solve
Ly = b_k, Ux = y
10⁴ times, at cost O(10⁴ · 2n²).
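A Matlab sketch of this reuse of the factors (the sizes and right-hand sides here are made up for illustration; [L,U,P] = lu(A) is discussed in Lecture 7):

n = 300;  A = randn(n);
[L,U,P] = lu(A);                 % factor once: O(n^3)
for k = 1:10000                  % many right-hand sides
    b = randn(n,1);
    y = L\(P*b);                 % forward substitution: O(n^2)
    x = U\y;                     % backward substitution: O(n^2)
end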
Sparse and Banded Matrices
Given the diagonal pattern
\[ A = \begin{pmatrix} x & & & & \\ & \ddots & & & \\ & & x & & \\ & & & \ddots & \\ & & & & x \end{pmatrix}, \tag{3.1.19} \]
the bandwidth is 1 (a single band). Below,
\[ A = \begin{pmatrix} x & x & & & & \\ x & x & x & & & \\ & x & x & x & & \\ & & x & x & x & \\ & & & x & x & x \\ & & & & x & x \end{pmatrix}, \tag{3.1.20} \]
the bandwidth is 3; this is a tridiagonal matrix. This type of matrix maintains its sparsity when it undergoes LU decomposition:
\[
\begin{pmatrix} x & x & & & & \\ x & x & x & & & \\ & x & x & x & & \\ & & x & x & x & \\ & & & x & x & x \\ & & & & x & x \end{pmatrix}
=
\begin{pmatrix} 1 & & & & & \\ x & 1 & & & & \\ & x & 1 & & & \\ & & x & 1 & & \\ & & & x & 1 & \\ & & & & x & 1 \end{pmatrix}
\begin{pmatrix} x & x & & & & \\ & x & x & & & \\ & & x & x & & \\ & & & x & x & \\ & & & & x & x \\ & & & & & x \end{pmatrix}. \tag{3.1.21}
\]
Motivation for Gauss Elimination with Pivoting
When does Gauss elimination give us a problem? For example:
1. \( A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \): the first pivot is zero.
2. \( A = \begin{pmatrix} \delta & 1 \\ 1 & 1 \end{pmatrix} \): solve \( Ax = \begin{pmatrix} 1+\delta \\ 2 \end{pmatrix} \); the exact solution is \( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \). However, we run into numerical problems.
3.2 Lecture 5: August 28, 2013
Motivation for Gauss Elimination with Pivoting, cont.
When does Gauss elimination give us a problem? Returning to the example problem, \( A = \begin{pmatrix} \delta & 1 \\ 1 & 1 \end{pmatrix} \): solve \( Ax = \begin{pmatrix} 1+\delta \\ 2 \end{pmatrix} \); the exact solution is \( \begin{pmatrix} 1 \\ 1 \end{pmatrix} \), but we run into numerical problems. There are a couple of approaches to this problem. First, solve for x by finding L, U and using them numerically:
\[ A = \begin{pmatrix} \delta & 1 \\ 1 & 1 \end{pmatrix} \to \begin{pmatrix} \delta & 1 \\ 0 & 1 - \frac{1}{\delta} \end{pmatrix} = U \tag{3.2.1} \]
and
\[ L = \begin{pmatrix} 1 & 0 \\ \frac{1}{\delta} & 1 \end{pmatrix}. \tag{3.2.2} \]
Now we want to solve L(Ux) = b:

for j = 1:16
    delta = 10^(-j);
    b = [1+delta, 2];
    L = [1, 0; 1/delta, 1];
    U = [delta, 1; 0, 1-1/delta];
    % Solve Ly = b for y
    y(1) = b(1);  y(2) = b(2) - L(2,1)*y(1);
    % Solve Ux = y for x
    x(2) = y(2)/U(2,2);  x(1) = (y(1) - U(1,2)*x(2))/U(1,1);
    disp(sprintf(' %5.0e %20.15f %20.15f %10.8e', delta, x(1), x(2), norm(x-[1,1])))
end
Note that the norm is the Euclidean norm, ‖x − [1, 1]‖₂ = √((x(1) − 1)² + (x(2) − 1)²). This produces the table of results shown below. Conclusion: Ax = b is a good problem (well-posed); introducing small perturbations (e.g., by roundoff) does not change the solution by much. Matlab's algorithm A\b is a good algorithm (stable); this LU decomposition does not give a good algorithm (unstable).
Table 3.1. Variation of the error with the perturbation variable

δ       x(1)        x(2)    ‖x − [1,1]‖₂
1e-01   1.000       1.000   8e-16
1e-02   1.000       1.000   1e-13
1e-03   0.999...    1.000   6e-12
1e-04   1.000...28  1.000   ...e-11
1e-05   ...         1.000   ...e-10
...     ...         ...     ...
1e-16   0.888       1.000   ...e-0
Discussion of well-posedness
Geometrically, Ax = b,
δx1 + x2 = 1 + δ, (3.2.3a)
x1 + x2 = 2. (3.2.3b)
This is a well-posed system. Rearranging
x2 ≈ 1 − δx1, x2 = 2 − x1. (3.2.4a)
Our other system is Ly = b:
\[ y_1 = 1, \tag{3.2.5a} \]
\[ \frac{1}{\delta}\,y_1 + y_2 = 2. \tag{3.2.5b} \]
This is a very ill-posed system, because the slopes of the two lines are so near each other that small wiggles in δ give much larger errors. Now we consider Ux = y:
\[ \delta x_1 + x_2 = 1, \tag{3.2.6a} \]
\[ \Big(1 - \frac{1}{\delta}\Big)x_2 = y_2. \tag{3.2.6b} \]
This system is ill-posed as well. All of these linear problems are illustrated in Figure 3.1.
Figure 3.1. Plot of the linear problems and their solutions: (a) Ax = b, with the intersection at (1, 1); (b) Ly = b; (c) Ux = y.
Gaussian elimination with pivoting
Pivoting means we exchange rows so that the current pivot satisfies |a_{ii}| = max_{j≥i} |a_{ji}|. Consequently, the multipliers satisfy |ℓ_{ji}| = |a_{ji}/a_{ii}| ≤ 1 for all j > i. Now,
\[
\left(\begin{array}{cc|c} \delta & 1 & 1+\delta \\ 1 & 1 & 2 \end{array}\right)
\to
\left(\begin{array}{cc|c} 1 & 1 & 2 \\ \delta & 1 & 1+\delta \end{array}\right)
\tag{3.2.7a}
\]
\[
\xrightarrow{R_2 \leftarrow R_2 - \delta R_1}
\left(\begin{array}{cc|c} 1 & 1 & 2 \\ 0 & 1-\delta & 1+\delta-2\delta \end{array}\right)
=
\left(\begin{array}{cc|c} 1 & 1 & 2 \\ 0 & 1-\delta & 1-\delta \end{array}\right).
\tag{3.2.7b,c}
\]
PLU always works. Theorem: Gaussian elimination with pivoting yields PA = LU, where P is a permutation matrix. Every matrix has a PLU factorization. To do the pivoting, at each step k, first premultiply A by a permutation matrix P_k (the identity with two rows exchanged), then premultiply by a Gauss transform
\[
L_k = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & -\ell_{k+1,k} & \ddots & \\ & & \vdots & & \\ & & -\ell_{n,k} & & 1 \end{pmatrix},
\tag{3.2.8, 3.2.9}
\]
whose only nonzero off-diagonal entries are the negated multipliers in column k below the diagonal.
We do this in succession,
Ln−1Pn−1 ··· L2P2L1P1A = U (3.2.10)
How do these commute into a useful P and L matrix?
3.3 Lecture 6: August 30, 2013
Discussion of HW problem 2
\[ -y_{j-1} + 2y_j - y_{j+1} = h^2 f(x_j), \qquad j = 1, \dots, n-1. \tag{3.3.1} \]
\[
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \ddots & \vdots \\
0 & -1 & 2 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & -1 \\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-2} \\ y_{n-1} \end{pmatrix}
=
\begin{pmatrix} h^2 f(t_1) + y_0 \\ h^2 f(t_2) \\ \vdots \\ h^2 f(t_{n-2}) \\ h^2 f(t_{n-1}) + y_n \end{pmatrix}. \tag{3.3.2}
\]
So we set up the right-hand side for the (n−1) × (n−1) matrix A as a vector of length n − 1:

x = a:h:b;              % same as linspace(a,b,n+1)
rhs = h^2*f(x(2:n));    % interior points only
rhs(1) = rhs(1) + ya;
rhs(n-1) = rhs(n-1) + yb;

Recall that here f(x) = −eˣ, since
\[ -y'' = -e^x. \tag{3.3.3} \]
PLU factorization
For PLU factorization, we are doing Gauss elimination with pivoting. At each kth step of Gaussian elimination, switch rows so that the pivot a_{kk}^{(k)} is the largest number in magnitude in the kth column. For example,
\[ \begin{pmatrix} 1 & -1 & 3 \\ -1 & 0 & -2 \\ 2 & 2 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -3 \\ 1 \\ 0 \end{pmatrix}, \tag{3.3.4} \]
or
\[
\left(\begin{array}{ccc|c} 1 & -1 & 3 & -3 \\ -1 & 0 & -2 & 1 \\ 2 & 2 & 4 & 0 \end{array}\right)
\to
\left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ -1 & 0 & -2 & 1 \\ 1 & -1 & 3 & -3 \end{array}\right), \quad \text{row}_1 \leftrightarrow \text{row}_3, \tag{3.3.5a}
\]
\[
\to \left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & -2 & 1 & -3 \end{array}\right), \quad \text{row}_2 \leftarrow \text{row}_2 - \big(-\tfrac12\big)\text{row}_1, \ \text{row}_3 \leftarrow \text{row}_3 - \tfrac12\,\text{row}_1, \tag{3.3.5b}
\]
\[
\to \left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ 0 & -2 & 1 & -3 \\ 0 & 1 & 0 & 1 \end{array}\right), \quad \text{row}_2 \leftrightarrow \text{row}_3, \tag{3.3.5c}
\]
\[
\to \left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ 0 & -2 & 1 & -3 \\ 0 & 0 & \tfrac12 & -\tfrac12 \end{array}\right), \quad \text{row}_3 \leftarrow \text{row}_3 - \big(-\tfrac12\big)\text{row}_2. \tag{3.3.5d}
\]
We need to do the back substitution to solve this system. But more importantly, we want to know what the factorization of this system would be. Recall,
\[
L_k = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & \ell_{k+1,k} & \ddots & \\ & & \vdots & & \\ & & \ell_{n,k} & & 1 \end{pmatrix}, \qquad L_{-k} = L_k^{-1}, \tag{3.3.6}
\]
and
L−(n−1)Pn−1 ··· L−2P2L−1P1A = U. (3.3.7)
Reordering,
\[ P_{n-1} \cdots L_{-2}P_2L_{-1}P_1 A = L_{n-1}U. \tag{3.3.8} \]
We want to move each P to be right next to A, and collect all the L's, so that we can form a true L. Claim:
\[ P_jL_{-k} = \tilde L_{-k}P_j, \qquad j > k. \tag{3.3.9} \]
The permutation P_j exchanges rows below the kth row, so it merely reorders the multipliers within the kth column; the result \( \tilde L_{-k} \) is again a Gauss transform. This allows us to move the L's out:
\[ \tilde L_{-k} = P_jL_{-k}P_j, \tag{3.3.10} \]
\[ \tilde L_{-n} \cdots \tilde L_{-1}\,P_{n-1} \cdots P_1 A = U. \tag{3.3.11} \]
Now we can return to our example, but keeping track of the permutations and multipliers:
\[
\left(\begin{array}{ccc|c} 1 & -1 & 3 & -3 \\ -1 & 0 & -2 & 1 \\ 2 & 2 & 4 & 0 \end{array}\right)
\to
\left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ -1 & 0 & -2 & 1 \\ 1 & -1 & 3 & -3 \end{array}\right), \quad \text{row}_1 \leftrightarrow \text{row}_3, \quad P_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \tag{3.3.12a}
\]
\[
\to
\left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ -\tfrac12 & 1 & 0 & 1 \\ \tfrac12 & -2 & 1 & -3 \end{array}\right), \quad \text{row}_2 \leftarrow \text{row}_2 - \big(-\tfrac12\big)\text{row}_1, \ \text{row}_3 \leftarrow \text{row}_3 - \tfrac12\,\text{row}_1, \tag{3.3.12b}
\]
(the multipliers ℓ_{21} = −1/2 and ℓ_{31} = 1/2 are stored in place of the zeros they created),
\[
\to
\left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ \tfrac12 & -2 & 1 & -3 \\ -\tfrac12 & 1 & 0 & 1 \end{array}\right), \quad \text{row}_2 \leftrightarrow \text{row}_3, \quad P_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \tag{3.3.12c}
\]
\[
\to
\left(\begin{array}{ccc|c} 2 & 2 & 4 & 0 \\ \tfrac12 & -2 & 1 & -3 \\ -\tfrac12 & -\tfrac12 & \tfrac12 & -\tfrac12 \end{array}\right), \quad \text{row}_3 \leftarrow \text{row}_3 - \big(-\tfrac12\big)\text{row}_2. \tag{3.3.12d}
\]
Because each P satisfies P = P⁻¹, we should remember that
\[ PA = LU \tag{3.3.13a} \]
is equivalent to
\[ A = PLU. \tag{3.3.13b} \]
3.4 Lecture 7: September 4, 2013
PLU Factorization
Recall that
PA = LU (3.4.1)
always exists by construction, because we can always bring a nonzero entry into the pivot position by a row permutation. This is also equivalent to
A = PLU (3.4.2)
because P = P⁻¹ for these permutations. To use this in an actual solution, write
PAx = Pb, (3.4.3)
or
LUx = Pb. (3.4.4)
So this system is solved by:
24 3.4. Lecture 7: September 4, 2013 Applied Matrix Theory
1. Solving Ly = Pb,
2. Solving Ux = y.
In Matlab, we would use the command [L,U,P] = lu(A) to find these three matrices. This factorization is not unique. We want to show the uniqueness of the LU factorization, and are also interested in when it exists.
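For example, solving the system from Lecture 6 this way (a quick sketch):

A = [1 -1 3; -1 0 -2; 2 2 4];
b = [-3; 1; 0];
[L,U,P] = lu(A);     % P*A = L*U
y = L\(P*b);         % solve Ly = Pb
x = U\y              % solve Ux = y; gives x = [1; 1; -1]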
Triangular Matrices
We are interested in the determinants of lower and upper triangular matrices. Let's discuss det(L). For a lower triangular matrix L = [ℓ_{ij}] (with ℓ_{ij} = 0 for j > i), the determinant is det(L) = ∏_{i=1}^{n} ℓ_{ii}. Thus L is invertible if and only if ℓ_{ii} ≠ 0 for all i. We conjecture that the product of two lower triangular matrices gives us a lower triangular matrix, e.g.
\[ L_1L_2 = L_{12}. \tag{3.4.6} \]
We want to prove this!
Multiplication of lower triangular matrices
Prove that L₁L₂ is lower triangular. Assume A, B are lower triangular; show that C = AB is lower triangular. We know that a_{ij} = b_{ij} = 0 for j > i. In our proof, we first consider the matrix multiplication
\[ c_{ij} = \sum_k a_{ik}b_{kj}. \tag{3.4.7} \]
We know that a_{ik} = 0 for k > i, and b_{kj} = 0 for j > k. If j > i, then when k ≤ i we have k < j, so b_{kj} = 0; alternatively, if k > i, then a_{ik} = 0. Thus in either case one of the two factors is zero, so c_{ij} = 0 for j > i, and we have proved the claim.
Inverse of a lower triangular matrix A lower triangular matrix’s inverse is also a lower triangular matrix; `11 ··· 0 −1 . .. . L = . . . = Lower triangular (3.4.8) `n1 ··· `nn
So, this helps with inversion of the form,
L−n ··· L−2L−1A = U. (3.4.9)
For matrices of the form
\[
L_{-k} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & -\ell_{k+1,k} & \ddots & \\ & & \vdots & & \\ & & -\ell_{n,k} & & 1 \end{pmatrix}, \tag{3.4.10}
\]
the inverse matrix is
\[
L_k = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & \ell_{k+1,k} & \ddots & \\ & & \vdots & & \\ & & \ell_{n,k} & & 1 \end{pmatrix}. \tag{3.4.11}
\]
For a general invertible lower triangular L, to find L⁻¹ apply Gaussian elimination to the augmented system, [L I] → [I L⁻¹], working through one column at a time.
Uniqueness of LU factorization
Theorem: If A is such that no zero pivots are encountered, then A = LU with ℓ_{ii} = 1 and u_{ii} ≠ 0, which are the pivots; here the multipliers ℓ_{ij}, j < i, are determined by construction.
Proof: Assume A = L₁U₁ = L₂U₂; then
\[ L_2^{-1}L_1U_1 = U_2, \tag{3.4.13a} \]
\[ L_2^{-1}L_1 = U_2U_1^{-1}. \tag{3.4.13b} \]
The left-hand side is lower triangular with unit diagonal, while the right-hand side is upper triangular; a matrix that is both is diagonal, and since the diagonal entries on the left are all ones, both sides equal I. If this is the case, then L₂⁻¹L₁ = I gives L₂ = L₁, and similarly U₂ = U₁. Thus the factors are the same and the factorization is unique.
Existence of the LU factorization
Theorem: If A = LU with no zero pivots, then all leading principal submatrices A_k are nonsingular. We define the leading principal submatrices A_k of A_{n×n} as A_k = A_{(1:k),(1:k)}; these are the upper-left square submatrices of the full matrix.
Part 2: conversely, if every A_k is nonsingular, then A = LU. So there are two directions to prove: if A = LU, show that each A_k is invertible; and if each A_k is invertible, show that A = LU.
3.5 Lecture 8: September 6, 2013
About Homeworks
The median score was 50 out of 60. A histogram was shown with the general grade distribution: 1 around 10, 3 around 25, 1 around 40, 4 from 45–50, 4 from 50–55, and 6 from 55–60. Comments: hand in working Matlab code. Also, L must have ones on the diagonal, while U has the pivots on the diagonal. "Computing efficiently" means using the LU decomposition, not inverting the matrix A. For homework 2, we will have applications of finding the inverse of A, i.e., solving
AX = I (3.5.1)
or
\[ A \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} = \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix}. \tag{3.5.2} \]
To find A−1, solve
Axj = ej, (3.5.3)
for all j = 1, 2, . . . , n. Use the LU decomposition.
Discussion of ill-conditioned systems
We define Ax = b as an ill-conditioned system if small changes in A or b introduce large changes in the solution. Geometrically, we showed this interpretation previously on a 2 × 2 system, where we noted that the slopes of the two lines were very similar to each other. Numerically, we get into trouble because of roundoff when we solve Ax = b in floating point arithmetic. We may also compute a condition number, which tells us the amplification factor of errors in the system. In Matlab, the command cond(A) gives you the condition number; this should ideally be modest (say, under a thousand). The condition number essentially tells you how much accuracy you can expect in the final solution. In other words, if your condition number is 1 × 10⁵, then you can only expect about 11 significant digits in the solution in double-precision floating point arithmetic.
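As an illustration (a standard example, not from the lecture), Hilbert matrices have rapidly growing condition numbers, and the accuracy of A\b degrades accordingly:

for n = [4 8 12]
    A = hilb(n);                     % Hilbert matrix, notoriously ill-conditioned
    x = ones(n,1);  b = A*x;
    fprintf('n = %2d  cond(A) = %9.2e  rel. error = %9.2e\n', ...
            n, cond(A), norm(A\b - x)/norm(x));
end
% roughly log10(cond(A)) digits of accuracy are lost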
Inversion of lower triangular matrices
Show that if A is a lower triangular matrix then so is A−1. So let’s solve AX = I with A lower triangular.
\[
[A \mid I] =
\left(\begin{array}{ccccc|ccccc}
x & & & & & 1 & & & & \\
x & x & & & & & 1 & & & \\
x & x & x & & & & & 1 & & \\
x & x & x & x & & & & & 1 & \\
x & x & x & x & x & & & & & 1
\end{array}\right)
\to \cdots \to
\left(\begin{array}{ccccc|ccccc}
x & & & & & 1 & & & & \\
& x & & & & y & 1 & & & \\
& & x & & & y & y & 1 & & \\
& & & x & & y & y & y & 1 & \\
& & & & x & y & y & y & y & 1
\end{array}\right), \tag{3.5.4}
\]
where forward elimination proceeds one column at a time, zeroing the left block below its diagonal while introducing the entries marked y on the right; after each column, the right block remains lower triangular.
We have now shown that we can reduce the lower triangular matrix A to the form [D | M], with D diagonal and M lower triangular. Now we do the backward substitution to get our X; in this case this is simply dividing each row by the value of the pivot of that row. In this way, with D playing the role of U, we have X = D⁻¹M = A⁻¹, which is lower triangular.
Example of LU decomposition of a lower triangular matrix
Given the matrix,
\[
\begin{pmatrix} 2 & 0 & 0 \\ 1 & 3 & 0 \\ 1 & 2 & 4 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & 1 & 0 \\ \tfrac12 & \tfrac23 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{pmatrix}
= LU. \tag{3.5.5}
\]
Banded matrix example
Exercise 3.10.7: a band matrix A with bandwidth w is a matrix with a_{ij} = 0 if |i − j| > w. If w = 0, we have a diagonal matrix,
\[ A_{w=0} = \begin{pmatrix} a_{11} & & & & \\ & a_{22} & & & \\ & & a_{33} & & \\ & & & a_{44} & \\ & & & & a_{55} \end{pmatrix}. \tag{3.5.6} \]
For bandwidth w = 1,
\[ A_{w=1} = \begin{pmatrix} a_{11} & a_{12} & & & \\ a_{21} & a_{22} & a_{23} & & \\ & a_{32} & a_{33} & a_{34} & \\ & & a_{43} & a_{44} & a_{45} \\ & & & a_{54} & a_{55} \end{pmatrix}. \tag{3.5.7} \]
For bandwidth w = 2,
\[ A_{w=2} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & & \\ a_{21} & a_{22} & a_{23} & a_{24} & \\ a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\ & a_{42} & a_{43} & a_{44} & a_{45} \\ & & a_{53} & a_{54} & a_{55} \end{pmatrix}. \tag{3.5.8} \]
In the LU decomposition, these zeros outside the band are preserved. However, there are other cases (as shown in the homework) where zeros may not be preserved. We will return to our theorem on Monday. For the homework: a matrix has an LU decomposition if and only if all leading principal submatrices are invertible.
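A quick Matlab check of this preservation on a tridiagonal matrix (a sketch, assuming no pivoting is triggered):

n = 6;
A = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);   % tridiagonal
[L,U] = lu(A);
disp(abs(L) > 1e-12)    % L has only the diagonal and one subdiagonal
disp(abs(U) > 1e-12)    % U has only the diagonal and one superdiagonal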
3.6 Lecture 9: September 9, 2013
Existence of the LU factorization (cont.)
When does the LU factorization exist?
Theorem: If no zero pivot appears in Gaussian elimination (including the nth one), then A = LU with ℓ_{ii} = 1 and u_{ii} ≠ 0 (the pivots), and L, U are unique.
Theorem: A = LU if and only if every leading principal submatrix A_k is invertible.
Proof: Assume A = LU, and partition into blocks of sizes k × k, k × (n−k), (n−k) × k, and (n−k) × (n−k):
\[
A = LU = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix} \begin{pmatrix} U_{11} & U_{12} \\ 0 & U_{22} \end{pmatrix}
= \begin{pmatrix} L_{11}U_{11} & L_{11}U_{12} \\ L_{21}U_{11} & L_{21}U_{12} + L_{22}U_{22} \end{pmatrix}. \tag{3.6.1–3.6.3}
\]
Now our question: is A_k = L_{11}U_{11}? Comparing the top-left blocks, yes. We know that det L_{11} = ∏_{j=1}^{k} ℓ_{jj} ≠ 0, so L_{11} is invertible. Similarly, det U_{11} = ∏_{j=1}^{k} u_{jj} ≠ 0, so it is also invertible. Since the product of two invertible matrices is invertible, A_k must also be invertible. We will now prove the converse by induction: assume that all A_k are invertible, and show that A = LU.
ASIDE: Example of proof by induction. We want to show
\[ \sum_{j=1}^{n} j^2 = \frac{n(n+1)(2n+1)}{6}. \tag{3.6.4} \]
The steps of proof by induction are:
1. First we show that the statement holds for n = 1;
2. next we assume it holds for n;
3. finally we show that it then holds for n + 1.
Let’s show the third step,
\begin{align}
\sum_{j=1}^{n+1} j^2 &= \sum_{j=1}^{n} j^2 + (n+1)^2 \tag{3.6.5a}\\
&= \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \tag{3.6.5b}\\
&= \frac{n(n+1)(2n+1) + 6(n+1)^2}{6} \tag{3.6.5c}\\
&= \frac{(n+1)\,[\,n(2n+1) + 6(n+1)\,]}{6} \tag{3.6.5d}\\
&= \frac{(n+1)(2n^2 + 7n + 6)}{6} \tag{3.6.5e}\\
&= \frac{(n+1)(n+2)(2n+3)}{6}, \tag{3.6.5f}
\end{align}
which is the formula (3.6.4) with n replaced by n + 1, so we have proved the relation by induction.
So for our system:
1. First we show that the claim holds for n = 1: A = [a₁₁] = [1][a₁₁], where a₁₁ ≠ 0.
2. Assume it is true for n: if A_k, k = 1, ..., n, are invertible, then A_{n×n} = L_{n×n}U_{n×n}.
3. Show it holds for n + 1.
For the third step, assume A_{(n+1)×(n+1)} is such that A_k, k = 1, ..., n + 1, are all invertible. By the induction assumption, A_n = L_nU_n, since A_1, ..., A_n are invertible. Now we need to show that
\[
A_{n+1} = \begin{pmatrix} A_n & b \\ c^\top & \alpha \end{pmatrix}
= \begin{pmatrix} L_nU_n & b \\ c^\top & \alpha \end{pmatrix}
= \begin{pmatrix} L_n & 0 \\ y^\top & 1 \end{pmatrix} \begin{pmatrix} U_n & x \\ 0^\top & \beta \end{pmatrix}. \tag{3.6.6}
\]
We want L_n x = b, so we let x = L_n⁻¹b, which uses the fact that L_n⁻¹ exists. We also want yᵀU_n = cᵀ, so we let yᵀ = cᵀU_n⁻¹. Finally, we want yᵀx + β = α, so we let β = α − yᵀx. We get
\[
A_{n+1} = \begin{pmatrix} L_nU_n & b \\ c^\top & \alpha \end{pmatrix}
= \begin{pmatrix} L_n & 0 \\ c^\top U_n^{-1} & 1 \end{pmatrix}
\begin{pmatrix} U_n & L_n^{-1}b \\ 0^\top & \alpha - c^\top U_n^{-1}L_n^{-1}b \end{pmatrix}. \tag{3.6.7}
\]
Since A_{n+1} is invertible, we must have β ≠ 0: if β = 0, then det(L_{n+1}) det(U_{n+1}) = 0, in which case A_{n+1} would not be invertible. So A_{n+1} has an LU decomposition, and by the principle of induction we have proven the theorem.
Rectangular matrices
For a rectangular matrix A_{m×n} ∈ R^{m×n}, our questions are: is Ax = b solvable? Is the solution unique? We are presented with three options: no solution, a unique solution, or infinitely many solutions. We are going to do Gaussian elimination to reduce the form of the matrix and see how many solutions we will have. So we will do row echelon form (REF) reduction.
Example of row echelon form
\[
A = \begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 2 & 4 & 0 & 4 & 4 \\ 1 & 2 & 3 & 5 & 5 \\ 2 & 4 & 0 & 4 & 7 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 0 & 0 & -2 & -2 & -2 \\ 0 & 0 & 2 & 2 & 2 \\ 0 & 0 & -2 & -2 & 1 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{pmatrix}
\to
\begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \tag{3.6.8}
\]
where we scaled rows and made interchanges to get leading ones in the pivot columns. What do we know about our matrix A from this information? First, we know which columns are linearly independent. We are trying to find the column space of our matrix.
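In Matlab one can confirm this with rref and rank (a quick sketch, using the matrix above):

A = [1 2 1 3 3; 2 4 0 4 4; 1 2 3 5 5; 2 4 0 4 7];
rref(A)      % returns [1 2 0 2 0; 0 0 1 1 0; 0 0 0 0 1; 0 0 0 0 0]
rank(A)      % returns 3: columns 1, 3, and 5 are the basic columns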
3.7 Homework Assignment 2: Due Friday, September 13, 2013
1. Textbook 3.10.1 (a, c): LU and PLU factorizations
Let
\[ A = \begin{pmatrix} 1 & 4 & 5 \\ 4 & 18 & 26 \\ 3 & 16 & 30 \end{pmatrix}. \]
(a) Determine the LU factors of A
(c) Use the LU factors to determine A−1
2. Textbook 3.10.2 Let A and b be the matrices,
\[ A = \begin{pmatrix} 1 & 2 & 4 & 17 \\ 3 & 6 & -12 & 3 \\ 2 & 3 & -3 & 2 \\ 0 & 2 & -2 & 6 \end{pmatrix} \quad \text{and} \quad b = \begin{pmatrix} 17 \\ 3 \\ 3 \\ 4 \end{pmatrix}. \]
(a) Explain why A does not have an LU factorization. (b) Use partial pivoting and find the permutation matrix P as well as the LU factors such that PA = LU. (c) Use the information in P, L, and U to solve Ax = b.
3. Textbook 3.10.3
Determine all values of ξ for which
\[ A = \begin{pmatrix} \xi & 2 & 0 \\ 1 & \xi & 1 \\ 0 & 1 & \xi \end{pmatrix} \]
fails to have an LU factorization.
4. Textbook 3.10.5
If A is a matrix that contains only integer entries and all of its pivots are 1, explain why A⁻¹ must also be an integer matrix. Note: this fact can be used to construct random integer matrices that possess integer inverses, by randomly generating integer matrices L and U with unit diagonals and then constructing the product A = LU.
5. Lower triangular matrices Let A be a 3 × 3 matrix with real entries. We showed that GE is equivalent to finding lower triangular matrices L−1 and L−2 such that L−2L−1A = U where U is upper triangular and,
\[ L_{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -\ell_{21} & 1 & 0 \\ -\ell_{31} & 0 & 1 \end{pmatrix}, \qquad L_{-2} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\ell_{32} & 1 \end{pmatrix}, \tag{3.7.1} \]
32 3.7. HW 2: Due September 13, 2013 Applied Matrix Theory
with
\[ (L_{-1})^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ \ell_{21} & 1 & 0 \\ \ell_{31} & 0 & 1 \end{pmatrix} = L_1, \qquad (L_{-2})^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \ell_{32} & 1 \end{pmatrix} = L_2. \tag{3.7.2} \]
It follows that A = L2L1U. Show that
\[ L_2L_1 = \begin{pmatrix} 1 & 0 & 0 \\ \ell_{21} & 1 & 0 \\ \ell_{31} & \ell_{32} & 1 \end{pmatrix}. \tag{3.7.3} \]
Show by example that generally,
\[ L_2L_1 \ne L_1L_2. \tag{3.7.4} \]
That is, the order in which these lower triangular matrices are multiplied matters.
6. Textbook 1.6.4: Conditioning Using geometric considerations, rank the following three systems according to their condition.
(a)
1.001x − y = 0.235, x + 0.0001y = 0.765.
(b)
1.001x − y = 0.235, x + 0.9999y = 0.765.
(c)
1.001x + y = 0.235, x + 0.9999y = 0.765.
7. Textbook 1.6.5 Determine the exact solution of the following system:
8x + 5y + 2z = 15, 21x + 19y + 16z = 56, 39x + 48y + 53z = 140.
Now change 15 to 14 in the first equation and again solve the system with exact arithmetic. Is the system ill-conditioned?
8. Textbook 1.6.6 Show that the system
v − w − x − y − z = 0,
w − x − y − z = 0,
x − y − z = 0,
y − z = 0,
z = 1,
is ill-conditioned by considering the following perturbed system:
v − w − x − y − z = 0,
−(1/15)v + w − x − y − z = 0,
−(1/15)v + x − y − z = 0,
−(1/15)v + y − z = 0,
−(1/15)v + z = 1.
UNIT 4
Rectangular Matrices
4.1 Lecture 10: September 11, 2013
Rectangular matrices (cont.)
We are interested in a rectangular matrix A_{m×n}. We may apply REF or RREF to find the column dependencies, the basic columns, and the rank of the matrix. This way, for any system Ax = b, we can determine whether the system is consistent and find all of its solutions: what the free variables are, the homogeneous solutions, and the particular solutions. In last time's example, we went from
\[ A = \begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 2 & 4 & 0 & 4 & 4 \\ 1 & 2 & 3 & 5 & 5 \\ 2 & 4 & 0 & 4 & 7 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 1 & 3 & 3 \\ 0 & 0 & 2 & 2 & 2 \\ 0 & 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \tag{4.1.1} \]
The first, third, and fifth columns have pivots and are the basic columns. They correspond to the linearly independent columns of A. How do we write the other two columns (c₂, c₄) as combinations of the other three columns? We can notice that c₂ = 2c₁, and similarly c₄ = 2c₁ + c₃. The reduced row echelon form (RREF) has pivots equal to 1, with zeros below and above
Figure 4.1. Geometric illustration of linear systems and their solutions: (a) intersecting system (one solution); (b) parallel system (no solution); (c) equivalent system (infinitely many solutions).
all pivots. So,
1 2 1 3 3 1 2 1 3 3 0 0 2 2 2 0 0 1 1 1 → , (4.1.2a) 0 0 0 0 3 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 2 1 3 0 0 0 1 1 0 → , (4.1.2b) 0 0 0 0 1 0 0 0 0 0 1 2 0 2 0 0 0 1 1 0 → . (4.1.2c) 0 0 0 0 1 0 0 0 0 0
In this form the basic columns are very clear, and the relations between the dependent columns and the basic columns are also obvious. So again we can see that c₂ = 2c₁ and c₄ = 2c₁ + 1c₃. The rank of the matrix is the number of linearly independent columns, which equals the number of linearly independent rows, and also the number of pivots in a row echelon form of the matrix. A consistent system Ax = b is a system that has at least one solution; it is inconsistent if it has no solutions. To determine whether Ax = b is consistent, consider first a 2 × 2 system Ax = b,
a11x1 + a12x2 = b1, (4.1.3a)
a21x1 + a22x2 = b2. (4.1.3b)
Since this is a linear system, we can see three cases: the lines intersect at one point, are parallel and separated, or are parallel and coincident. Each of these cases is illustrated in Figure 4.1. In general, for any size matrix, we find the row echelon form of the augmented system
\[ [A \mid b] \to [E \mid \tilde b] = \begin{pmatrix} x & x & x & x & x \\ 0 & x & x & x & x \\ 0 & 0 & 0 & 0 & \alpha \end{pmatrix}. \tag{4.1.4} \]
If α ≠ 0, then the system is inconsistent. So Ax = b is consistent if rank([A b]) = rank(A). If α = 0, then b̃ is not a basic column of [A b]. Then we can write b̃ as a linear combination of the basic columns of E, and correspondingly we can write b as a linear combination of the basic columns of A. In our example, c₁, c₃, and c₅ were the basic columns and Ax = b was consistent. If we perform such a reduction, then b = x₁c₁ + x₃c₃ + x₅c₅, or in other words,
\[ A \begin{pmatrix} x_1 \\ 0 \\ x_3 \\ 0 \\ x_5 \end{pmatrix} = b. \tag{4.1.5} \]
Example of RREF of a Rectangular Matrix
Given the augmented matrix,
\[
[A \mid b] =
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
2 & 2 & 4 & 4 & 3 & 1 \\
2 & 2 & 4 & 4 & 2 & 2 \\
3 & 5 & 8 & 6 & 5 & 3
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 2 & 0 & 2 & 0
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
0 & 2 & 2 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right). \tag{4.1.6}
\]
Thus our system is consistent: rank([A b]) = rank(A). Similarly, we observe that we have r = 3 basic columns and n − r = 2 linearly dependent columns. (If n > m, then n > r, so n − r ≠ 0.) Let's continue on to the reduced row echelon form:
\[
\to
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 0 & 2 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 0 & 1 & 2 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right). \tag{4.1.7}
\]
Thus b̃ = c̃₁ + c̃₂ − c̃₅. Therefore b = c₁ + c₂ − c₅, and
\[ x = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}. \tag{4.1.8} \]
So in review,
\[
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
2 & 2 & 4 & 4 & 3 & 1 \\
2 & 2 & 4 & 4 & 2 & 2 \\
3 & 5 & 8 & 6 & 5 & 3
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 0 & 1 & 2 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right). \tag{4.1.9}
\]
We found a particular solution, x_p = (1, 1, 0, 0, −1)ᵀ, of Ax = b. For any solution x_H of Ax = 0, we have A(x_p + x_H) = b + 0, so x_p + x_H also solves Ax = b.
4.2 Lecture 11: September 13, 2013
Solving Ax = b
Ax = b is consistent if rank([A | b]) = rank(A), i.e., b is a nonbasic column of [A | b]. We can express b in terms of the basic columns of A to get a particular solution, Ax_p = b. The set of all solutions is x_p + x_H, where x_p is a particular solution of Ax = b and x_H ranges over all homogeneous solutions of Ax_H = 0. Since we can add these two solutions, A(x_p + x_H) = b. Now, to actually find the particular solution x_p, we write b in terms of the basic columns. To find the homogeneous solutions x_H, we solve Ax = 0 by solving for the basic variables x_i in terms of the n − r free variables. Basic variables correspond to basic columns, while free variables correspond to nonbasic columns. Note that if n > r, then the columns are linearly dependent and we can find x ≠ 0 such that Ax = 0.
Example
From our example,
\[
\left(\begin{array}{ccccc|c}
1 & 1 & 2 & 2 & 1 & 1 \\
2 & 2 & 4 & 4 & 3 & 1 \\
2 & 2 & 4 & 4 & 2 & 2 \\
3 & 5 & 8 & 6 & 5 & 3
\end{array}\right)
\to
\left(\begin{array}{ccccc|c}
1 & 0 & 1 & 2 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right), \tag{4.2.1}
\]
we have that
b = a:1 + a:2 − a:5, (4.2.2a)
= x₁a_{:1} + x₂a_{:2} − x₅a_{:5} (4.2.2b)
= Ax_p, where x_p = (1, 1, 0, 0, −1)ᵀ. (4.2.2c)
Solve,
\[
[A \mid 0] \to
\left(\begin{array}{ccccc|c}
1 & 0 & 1 & 2 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right). \tag{4.2.3}
\]
This gives us the three equations for the homogeneous solutions,
x1 = −x3 − 2x4, (4.2.4a)
x2 = −x3, (4.2.4b)
x5 = 0. (4.2.4c)
This gives us the homogeneous solutions of the form
\[
x_H = \begin{pmatrix} -x_3 - 2x_4 \\ -x_3 \\ x_3 \\ x_4 \\ 0 \end{pmatrix}
= x_3 \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -2 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \tag{4.2.5}
\]
Thus the set of all solutions is
\[
x = x_p + x_H = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}
+ x_3 \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -2 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}. \tag{4.2.6}
\]
This solves Ax = b for any x₃ and x₄; therefore we have infinitely many solutions. Note that we can only have a unique solution if n = r.
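A numeric check of this solution set (a sketch built from the example above):

A  = [1 1 2 2 1; 2 2 4 4 3; 2 2 4 4 2; 3 5 8 6 5];
b  = [1; 1; 2; 3];
xp = [1; 1; 0; 0; -1];              % particular solution
h1 = [-1; -1; 1; 0; 0];             % homogeneous solution (free variable x3)
h2 = [-2; 0; 0; 1; 0];              % homogeneous solution (free variable x4)
norm(A*xp - b)                      % 0
norm(A*(xp + 2*h1 - 5*h2) - b)      % still 0 for any choice of coefficients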
Linear functions
Any function f : D → R, from a domain D to a range R, is a linear function if
1. f(x + y) = f(x) + f(y), 2. f(αx) = αf(x).
For example, f(x) = ax + b, with b ≠ 0:
\begin{align}
f(x) + f(y) &= (ax + b) + (ay + b) \tag{4.2.7a}\\
&= a(x + y) + 2b \tag{4.2.7b}\\
&\ne a(x + y) + b = f(x + y). \tag{4.2.7c}
\end{align}
Thus this is not a linear function. However, when b = 0, the function f(x) = ax can be verified to be linear.
Example: Transpose operator
The transpose operator is f(A) = Aᵀ. Define: if A = [a_{ij}], then Aᵀ = [a_{ji}], and A* = (Ā)ᵀ = [ā_{ji}]. Is this linear?
\begin{align}
f(A + B) &= (A + B)^\top = [a_{ij} + b_{ij}]^\top \tag{4.2.8a,b}\\
&= [a_{ji} + b_{ji}] = A^\top + B^\top. \tag{4.2.8c,d}
\end{align}
To check the second criterion,
\[ f(\alpha A) = [\alpha A]^\top = \alpha [A]^\top = \alpha f(A). \tag{4.2.9} \]
So this operator is linear.
Example: trace operator
The trace operator is f(A) = tr(A) = Σᵢ aᵢᵢ.
\[ f(A + B) = \sum_i (a_{ii} + b_{ii}) = \sum_i a_{ii} + \sum_i b_{ii} = \operatorname{tr}(A) + \operatorname{tr}(B). \tag{4.2.10} \]
The second criterion:
\[ f(\alpha A) = \operatorname{tr}(\alpha A) = \sum_i \alpha a_{ii} = \alpha \sum_i a_{ii} = \alpha \operatorname{tr}(A) = \alpha f(A). \tag{4.2.11} \]
We have therefore shown that this is a linear operator.
Matrix multiplication
Given
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad B = \begin{pmatrix} \tilde a & \tilde b \\ \tilde c & \tilde d \end{pmatrix}, \tag{4.2.12} \]
consider
\[ f(x) = Ax = \begin{pmatrix} ax_1 + bx_2 \\ cx_1 + dx_2 \end{pmatrix}, \qquad g(x) = Bx = \begin{pmatrix} \tilde a x_1 + \tilde b x_2 \\ \tilde c x_1 + \tilde d x_2 \end{pmatrix}. \tag{4.2.13} \]
Take
\[ f(g(x)) = A(Bx) \equiv ABx. \tag{4.2.14} \]
But
\begin{align}
f(g(x)) &= \begin{pmatrix} a(\tilde a x_1 + \tilde b x_2) + b(\tilde c x_1 + \tilde d x_2) \\ c(\tilde a x_1 + \tilde b x_2) + d(\tilde c x_1 + \tilde d x_2) \end{pmatrix} \tag{4.2.15a}\\
&= \begin{pmatrix} (a\tilde a + b\tilde c)x_1 + (a\tilde b + b\tilde d)x_2 \\ (c\tilde a + d\tilde c)x_1 + (c\tilde b + d\tilde d)x_2 \end{pmatrix} \tag{4.2.15b}\\
&= \begin{pmatrix} a\tilde a + b\tilde c & a\tilde b + b\tilde d \\ c\tilde a + d\tilde c & c\tilde b + d\tilde d \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \tag{4.2.15c}
\end{align}
which defines AB. In general we define AB = [A_{i:}B_{:j}], i.e., (AB)_{ij} = Σ_{k=1}^{n} A_{ik}B_{kj}. Matrix multiplication is not generally commutative: AB ≠ BA. Also, AB = 0 does not imply that A = 0 or B = 0, unless A or B is invertible.
A (B + C) = AB + AC, (4.2.16) or (A + B) D = AD + BD, (4.2.17) and the associative property (AB) C = A (BC) . (4.2.18) A property of the transpose operator is,
(AB)ᵀ = BᵀAᵀ, (4.2.19)
tr(AB) = tr(BA). (4.2.20)
Note, however, that tr(ABC) ≠ tr(ACB) in general, as we will demonstrate on the homework.
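A quick numeric illustration of this (not from the notes): the trace is invariant under cyclic permutations of a product, but not under arbitrary ones.

A = randn(4); B = randn(4); C = randn(4);
trace(A*B) - trace(B*A)        % ~ 1e-15 (cyclic permutation)
trace(A*B*C) - trace(B*C*A)    % ~ 1e-15 (cyclic permutation)
trace(A*B*C) - trace(A*C*B)    % generally O(1): not a cyclic permutation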
Proof of transposition property We want to prove the useful property,
(AB)| = B|A|. (4.2.21)
Dealing with our left hand side of the equation,
Dealing with the left-hand side of the equation,
\begin{align}
\text{LHS}: \ (AB)^\top &= [(AB)_{ij}]^\top \tag{4.2.22a}\\
&= [(AB)_{ji}] \tag{4.2.22b}\\
&= [A_{j:}B_{:i}]. \tag{4.2.22c}
\end{align}
Manipulating the right-hand side of the property,
\begin{align}
\text{RHS}: \ B^\top A^\top &= [(B^\top A^\top)_{ij}] \tag{4.2.23a}\\
&= [(B^\top)_{i:}(A^\top)_{:j}] \tag{4.2.23b}\\
&= [(B_{:i})^\top (A_{j:})^\top] \tag{4.2.23c}\\
&= [A_{j:}B_{:i}] \tag{4.2.23d}\\
&= \text{LHS}. \tag{4.2.23e}
\end{align}
Thus, we have proved the identity.
4.3 Lecture 12: September 16, 2013
We will be having an exam on September 30th.
Inverses
We define: A has an inverse if a matrix A⁻¹ exists such that
AA−1 = A−1A = I. (4.3.1)
We also have the properties:
• (AB)⁻¹ = B⁻¹A⁻¹,
• (Aᵀ)⁻¹ = (A⁻¹)ᵀ,
• (A⁻¹)⁻¹ = A.
What about the inverse of sums, (A + B)⁻¹? There are the special cases:
• low rank perturbations of I_{n×n}: (I + CDᵀ)⁻¹, where C, D ∈ R^{n×k}, i.e., the perturbation has rank k;
• small perturbations of I: (I + A)⁻¹, where ‖A‖ is small.
42 4.3. Lecture 12: September 16, 2013 Applied Matrix Theory
We have a rank-1 matrix uvᵀ, with u, v ∈ Rⁿ = R^{n×1}:
\[
uv^\top = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
\begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
= \begin{pmatrix}
u_1v_1 & u_1v_2 & \cdots & u_1v_n \\
u_2v_1 & u_2v_2 & \cdots & u_2v_n \\
\vdots & \vdots & \ddots & \vdots \\
u_nv_1 & u_nv_2 & \cdots & u_nv_n
\end{pmatrix}
= \begin{pmatrix} u_1v^\top \\ u_2v^\top \\ \vdots \\ u_nv^\top \end{pmatrix}. \tag{4.3.2}
\]
Now let's say we have an example where all matrix entries are zero except for a single α at some position (i, j):
\[ \begin{pmatrix} 0 & \cdots & 0 & \cdots & 0 \\ \vdots & & \alpha & & \vdots \\ 0 & \cdots & 0 & \cdots & 0 \end{pmatrix} = \alpha\, e_i e_j^\top. \tag{4.3.3} \]
Low rank perturbations of I We make the claim the if u, v are such that v|u + 1 6= 0 then
uv| I + uv|−1 = I − (4.3.4) 1 + v|u Proof: uv| uv| u (v|u) v| I + uv| I − = I − + uv| − , (4.3.5a) 1 + v|u 1 + v|u 1 + v|u uv| (v|u) = I − + uv| − uv|, (4.3.5b) 1 + v|u 1 + v|u 1 (v|u) = I − uv| + 1 − , (4.3.5c) 1 + v|u 1 + v|u | | 1 +vu = I − uv 1 − , (4.3.5d) 1 + v|u = I. (4.3.5e)
So if c, d ∈ Rⁿ are such that 1 + dᵀA⁻¹c ≠ 0, and we are interested in (A + cdᵀ)⁻¹:
\begin{align}
(A + cd^\top)^{-1} &= \big(A\,(I + A^{-1}cd^\top)\big)^{-1} \tag{4.3.6a}\\
&= \big(I + (A^{-1}c)\,d^\top\big)^{-1} A^{-1} \tag{4.3.6b}\\
&= \left(I - \frac{A^{-1}c\,d^\top}{1 + d^\top A^{-1}c}\right) A^{-1} \tag{4.3.6c}\\
&= A^{-1} - \frac{A^{-1}cd^\top A^{-1}}{1 + d^\top A^{-1}c}. \tag{4.3.6d}
\end{align}
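A numeric sanity check of this rank-one update formula (a sketch with random data):

n = 5;
A = randn(n) + n*eye(n);                % well-conditioned test matrix
c = randn(n,1);  d = randn(n,1);
Ai = inv(A);
B  = Ai - (Ai*c*d'*Ai)/(1 + d'*Ai*c);   % the formula above
norm(B - inv(A + c*d'))                 % ~ 1e-15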
The Sherman–Morrison Formula
The Sherman–Morrison formula states that if A is invertible and C, D ∈ R^{n×k} are such that I + DᵀA⁻¹C is invertible, then,