
Notes on Computation University of Chicago, 2014

Vivak Patel September 7, 2014

Contents

1 Introduction
  1.1 Variations of solving Ax = b
  1.2 Norms
  1.3 Error Analysis
  1.4 Floating Point Numbers

2 Eigenvalue Decomposition
  2.1 Eigenvalues and Eigenvectors
  2.2 Jordan Canonical Form
  2.3 Spectra
  2.4 Spectral Radius
  2.5 Diagonal Dominance and Gerschgorin's Disk Theorem

3 Singular Value Decomposition
  3.1 Theory
  3.2 Applications

4 Rank Retaining Factorization
  4.1 Theory
  4.2 Applications

5 QR & Complete Orthogonal Factorization
  5.1 Theory
  5.2 Applications
  5.3 Givens Rotations
  5.4 Householder Reflections

6 LU, LDU, Cholesky and LDL Decompositions

7 Iterative Methods
  7.1 Overview
  7.2 Splitting Methods
  7.3 Semi-Iterative Methods
  7.4 Krylov Space Methods

1 Introduction

1.1 Variations of solving Ax = b

1. Linear Regression. A is known and b is known but corrupted by some unknown error r. Our goal is to find x such that (a code sketch of variations 1 and 6 follows this list):
\[ x \in \arg\min_{x \in \mathbb{R}^n} \|Ax - b\|_2^2 = \arg\min_{x \in \mathbb{R}^n} \{\|r\|_2^2 : Ax + r = b\} \]

2. Data Least Squares. A is known but corrupted by some unknown error E. We want to determine:
\[ x \in \arg\min_{x \in \mathbb{R}^n} \{\|E\|_F^2 : (A + E)x = b\} \]

3. Total Least Squares. A and b are both corrupted by errors E and r (respectively). We want to determine:
\[ x \in \arg\min_{x \in \mathbb{R}^n} \{\|E\|_F^2 + \|r\|_2^2 : (A + E)x + r = b\} \]

4. Minimum Norm Least Squares. Given any A and b, we want x such that:
\[ x \in \arg\min\{\|z\|_2^2 : z \in \arg\min \|Az - b\|_2^2\} = \arg\min\{\|z\|_2^2 : A^TAz = A^Tb\} \]

5. Robust Regression. The linear regression problem with a different norm for the error r.

6. Regularized Least Squares. Given a matrix $\Gamma$, we want to find:
\[ x \in \arg\min\{\|Ax - b\|_2^2 + \|\Gamma x\|_2^2\} \]

7. Linear Programming. $x \in \arg\min\{c^Tx : Ax \leq b\}$

8. Quadratic Programming. $x \in \arg\min\{0.5\,x^TAx + c^Tx : Bx = d\}$
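As a small illustration of variations 1 and 6, here is a minimal sketch (assuming NumPy; the data and the choice $\Gamma = 0.5 I$ are made-up examples). Both problems can be handed to numpy.linalg.lstsq, the regularized one by stacking A over $\Gamma$:

import numpy as np

# Hypothetical data: A is 100x5, b generated from a known x plus noise.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
b = A @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(100)

# Variation 1: minimize ||Ax - b||_2^2.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Variation 6: minimize ||Ax - b||_2^2 + ||Gamma x||_2^2,
# solved by stacking A over Gamma and b over a zero vector.
Gamma = 0.5 * np.eye(5)
A_aug = np.vstack([A, Gamma])
b_aug = np.concatenate([b, np.zeros(5)])
x_ridge, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)

print(x_ls, x_ridge)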

1.2 Norms

1. Norm.

Definition 1.1. A norm is a real-valued function defined over a vector space, denoted by $\|\cdot\|$, such that:
(a) $\|x\| \geq 0$
(b) $\|x\| = 0$ if and only if $x = 0$
(c) $\|x + y\| \leq \|x\| + \|y\|$
(d) $\|\alpha x\| = |\alpha|\,\|x\|$ for any scalar $\alpha$

2. Vector Norms

(a) p-norms
  i. For $p \geq 1$, the p-norm of $x \in V$ is $\|x\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}$.
  ii. For $p = \infty$, the $\infty$-norm or Chebyshev norm is $\|x\|_\infty = \max_{i=1,\ldots,n} |x_i|$.
  iii. The Chebyshev norm is the limit of the p-norms.

Lemma 1.1. Let $x \in V$. Then $\lim_{p \to \infty} \|x\|_p = \|x\|_\infty$.

Proof. Without loss of generality, let $\|x\|_\infty = 1$. Then $\exists j \in \{1, \ldots, n\}$ such that $|x_j| = 1$. First,
\[ \lim_{p \to \infty} \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p} \leq \lim_{p \to \infty} n^{1/p} = 1. \]
Secondly,
\[ \lim_{p \to \infty} \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p} \geq \lim_{p \to \infty} |x_j| = 1. \]

  iv. Weighted p-norms: add a non-negative weight to each component in the sum.

(b) Mahalanobis norm. Let A be a symmetric positive definite matrix. Then
\[ \|x\|_A = \sqrt{x^*Ax}. \]

3. Matrix Norms

(a) Compatible. Submultiplicative/Consistent.

Definition 1.2. Let $\|\cdot\|_M$ be a matrix norm and $\|\cdot\|_V$ be a vector norm.
  i. A matrix norm is compatible with a vector norm if:
\[ \|Ax\|_V \leq \|A\|_M \|x\|_V \]
  ii. A matrix norm is consistent or submultiplicative if:
\[ \|AB\|_M \leq \|A\|_M \|B\|_M \]

(b) Holder Norms
  i. The Holder p-norm of A is $\|A\|_{H,p} = \left(\sum_{j=1}^n \sum_{i=1}^m |a_{ij}|^p\right)^{1/p}$.
  ii. The Holder 2-norm is called the Frobenius norm.
  iii. The Holder $\infty$-norm is $\|A\|_{H,\infty} = \max_{i,j} |a_{ij}|$.

(c) Induced Norms
  i. Induced Norms. Spectral Norm.

Definition 1.3. Let $\|\cdot\|_\alpha$, $\|\cdot\|_\beta$ be vector norms. The matrix norm $\|\cdot\|_{\alpha,\beta}$ is the induced norm defined by:
\[ \|A\|_{\alpha,\beta} = \max_{x \neq 0} \frac{\|Ax\|_\alpha}{\|x\|_\beta} \]
When $\alpha = \beta = 2$, the induced norm is called the spectral norm.
  ii. Equivalent Definitions

Lemma 1.2. The following are equivalent definitions for an induced norm:
A. $\|A\|_{\alpha,\beta} = \sup\{\|Ax\|_\alpha : \|x\|_\beta = 1\}$
B. $\|A\|_{\alpha,\beta} = \sup\{\|Ax\|_\alpha : \|x\|_\beta \leq 1\}$

Proof. For any $v \neq 0$, let $x = v / \|v\|_\beta$, so that $\|Av\|_\alpha / \|v\|_\beta = \|Ax\|_\alpha$ with $\|x\|_\beta = 1$; hence the definition and the first characterization are equivalent. For the second characterization, the supremum over the unit ball is at least the supremum over the unit sphere. Conversely, for any x with $0 < \|x\|_\beta \leq 1$ we have $\|Ax\|_\alpha = \|x\|_\beta \|A(x/\|x\|_\beta)\|_\alpha \leq \|A(x/\|x\|_\beta)\|_\alpha$, so the supremum over the ball does not exceed the supremum over the sphere. The two characterizations therefore agree.

  iii. Compatibility

Lemma 1.3. Letting $\|\cdot\| = \|\cdot\|_{\alpha,\beta}$:
\[ \|Ax\| \leq \|A\|\,\|x\| \]

Proof. For any $x \neq 0$:
\[ \|A\| \geq \frac{\|Ax\|}{\|x\|} \]
When x = 0, the result holds simply by plugging in values.

  iv. Consistency

Lemma 1.4. Letting $\|\cdot\| = \|\cdot\|_{\alpha,\beta}$:
\[ \|AB\| \leq \|A\|\,\|B\| \]

Proof.
\[ \|AB\| = \max_{x \neq 0} \frac{\|A(Bx)\|}{\|x\|} \leq \max_{x \neq 0} \|A\|\frac{\|Bx\|}{\|x\|} \leq \|A\|\,\|B\| \]

  v. Computing the $\|\cdot\|_{1,1}$ norm. Let $A \in F^{m \times n}$.

Lemma 1.5.
\[ \|A\|_{1,1} = \max_{j = 1, \ldots, n} \sum_{i=1}^m |a_{ij}| \]

Proof. There exists an x such that $\|x\|_1 = 1$ and $\|Ax\|_1 = \|A\|$. Therefore:
\[ \|A\| = \sum_{i=1}^m \Big|\sum_{j=1}^n a_{ij}x_j\Big| \leq \sum_{j=1}^n |x_j| \sum_{i=1}^m |a_{ij}| \leq \Big(\max_j \sum_{i=1}^m |a_{ij}|\Big)\sum_{j=1}^n |x_j| = \max_j \sum_{i=1}^m |a_{ij}| \]
For the other direction, suppose the maximum column sum occurs at the k-th column. Then $\|Ae_k\|_1 \leq \|A\|$, and the left-hand side equals $\max_j \sum_{i=1}^m |a_{ij}|$, which gives the reverse inequality.

  vi. Computing the induced $\infty$-norm.

Lemma 1.6.
\[ \|A\|_{\infty,\infty} = \max_{i = 1, \ldots, m} \sum_{j=1}^n |a_{ij}| \]

Proof. There is an x such that $\|x\|_\infty = 1$ and $\|Ax\|_\infty = \|A\|$. Therefore:
\[ \|A\| = \|Ax\|_\infty = \max_{i=1,\ldots,m} \Big|\sum_{j=1}^n a_{ij}x_j\Big| \leq \max_{i=1,\ldots,m} \sum_{j=1}^n |a_{ij}| \]
For the other direction, let k be the index of the maximizing row. Let x be the vector with $x_i = \mathrm{sgn}(a_{ki})$. Then $\|x\|_\infty = 1$ and $\|A\| \geq \|Ax\|_\infty = \sum_{j=1}^n |a_{kj}|$.

1.3 Error Analysis

1. Types of Error, given a true value x and a computed value $\hat{x}$:
(a) $\|\hat{x} - x\|$ is the absolute error, but it depends on units.
(b) $\|\hat{x} - x\| / \|x\|$ is the relative error, and it does not depend on units.
(c) Pointwise error: compute $\|y\|$ where $y_i = \frac{\hat{x}_i - x_i}{x_i}\,\mathbf{1}[x_i \neq 0]$.

2. Backwards Error Analysis

(a) Notation
  i. Suppose we want to solve Ax = b, and we denote by $\Delta(A, b) = \hat{x}$ the algorithm which produces the estimate.
  ii. The condition number of a matrix is $\kappa(A) = \|A\|\,\|A^{-1}\|$.
  iii. Let $\rho = \|A^{-1}\|\,\|\delta A\|$ for some small perturbation matrix $\delta A$.

(b) Idea: View $\hat{x}$ as the solution to a nearby system $(A + \delta A)\hat{x} = b + \delta b$.

(c) Error bound

Lemma 1.7. Suppose A is an invertible matrix and we have a compatible norm. If $\|\delta A\|/\|A\| \leq \epsilon$ and $\|\delta b\|/\|b\| \leq \epsilon$, then
\[ \frac{\|x - \hat{x}\|}{\|x\|} \leq \frac{2\epsilon}{1 - \rho}\,\kappa(A) \]

Proof. Note that:
\[ (I + A^{-1}\delta A)(\hat{x} - x) = A^{-1}(\delta b - \delta A\,x) \]
Then, since $\|A^{-1}\delta A\| \leq \rho$:
\[ (1 - \rho)\|\hat{x} - x\| \leq \|A^{-1}\|\big(\|\delta b\| + \|\delta A\|\,\|x\|\big) \]
Dividing both sides by $\|x\|$ and multiplying the right-hand side by $1 = \|A\|/\|A\|$:
\[ \frac{\|\hat{x} - x\|}{\|x\|} \leq \frac{\kappa(A)}{1 - \rho}\left(\frac{\|\delta b\|}{\|A\|\,\|x\|} + \frac{\|\delta A\|}{\|A\|}\right) \]
Noting that $\|b\| \leq \|A\|\,\|x\|$, the result follows.
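A small illustration of Lemma 1.7 (a sketch assuming NumPy; the Hilbert-type matrix is a made-up, moderately ill-conditioned example, and the printed bound is the rough $\kappa(A)\epsilon$ form of the lemma with $\rho$ ignored):

import numpy as np

# Hypothetical example: a Hilbert matrix is moderately ill-conditioned.
n = 8
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
x_true = np.ones(n)
b = A @ x_true

x_hat = np.linalg.solve(A, b)

rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
bound = 2 * np.finfo(float).eps * np.linalg.cond(A)   # rough version of Lemma 1.7

# The observed error is typically well below this pessimistic bound.
print(f"relative error = {rel_err:.2e}, 2*eps*kappa(A) ~ {bound:.2e}")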

1.4 Floating Point Numbers

1. Motivation: computers do not have infinite memory, and so they can only store numbers up to a certain precision.

2. Floating Point Numbers, sign, mantissa, exponent, base

Definition 1.4. Floating Point Numbers are $F \subset \mathbb{Q}$ that have the following representation:
\[ \pm a_1a_2\ldots a_k \times b^{e_1\ldots e_l} \]

(a) $\pm$ is called the sign
(b) $a_1\ldots a_k$ is called the mantissa, and the $a_i$ are values in some finite field
(c) $e_1\ldots e_l$ is called the exponent, and the $e_i$ are values in some finite field
(d) b is the base

3. Floating Point Representation Standards

(a) Floating Point Representation. Machine Precision.

Definition 1.5. A floating point representation is a function $fl : \mathbb{R} \to F$ which is characterized by the machine precision, denoted $\epsilon_m$ and defined as:
\[ \epsilon_m = \inf\{x \in \mathbb{R} : x > 0,\ fl(1 + x) \neq 1\} \]

(b) Standard 1: $\forall x \in \mathbb{R}$, $\exists x' \in F$ such that $|x - x'| \leq \epsilon_m|x|$.
(c) Standard 2: $\forall x, y \in \mathbb{R}$, there is an $|\epsilon_1| \leq \epsilon_m$ such that:
\[ fl(x \pm y) = (x \pm y)(1 + \epsilon_1) \]
(d) Standard 3: $\forall x, y \in \mathbb{R}$, there is an $|\epsilon_2| \leq \epsilon_m$ such that:
\[ fl(xy) = (xy)(1 + \epsilon_2) \]
(e) Standard 4: $\forall x, y \in \mathbb{R}$ with $y \neq 0$, there is an $|\epsilon_3| \leq \epsilon_m$ such that:
\[ fl(x/y) = (x/y)(1 + \epsilon_3) \]

4. Floating Point in Computers

(a) Fields: b = 2 and $a_i, e_j \in \{0, 1\}$.
(b) Storage
  i. A floating point number requires $1 + l + k$ bits of storage using the following layout:
\[ \pm\,|\,e_1\,|\,e_2\,|\cdots|\,e_l\,|\,a_1\,|\,a_2\,|\cdots|\,a_k \]
  ii. 32-bit (single precision): 1 bit for the sign, 8 bits for the exponent, and 23 bits for the mantissa.
  iii. 64-bit (double precision): 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa.
(c) Errors and (typical) Handling
  i. Round-off error occurs when a number is more precise than the mantissa allows, and is handled by discarding the lower-order digits.
  ii. Overflow error occurs when the exponent is too large, and is handled by returning a representation of the largest value allowed by the system or $\pm\infty$.
  iii. Underflow error occurs when the exponent is too negative, resulting in 0.

5. Examples

Example 1.1. Computing the $\ell_2$ norm. Suppose
\[ x = \begin{bmatrix} 10^{-49} & 10^{-50} & 10^{-50} & \cdots & 10^{-50} \end{bmatrix} \in \mathbb{R}^{101} \]
This can be stored exactly, but for $i = 2, \ldots, 101$, $fl(x_i^2) = fl(10^{-100}) = 0$. Hence a naive algorithm would compute $\|x\|_{\ell_2} = 10^{-49}$, which has roughly a 29% relative error (the true value is $\sqrt{2}\cdot 10^{-49}$). An improved algorithm uses the naive algorithm on $\hat{x} = x/\|x\|_\infty$ and then rescales by $\|x\|_\infty$. This does not produce underflow errors.
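A minimal sketch of the rescaling idea in Example 1.1 (assuming NumPy; the entries are chosen so that their squares underflow in double precision, which makes the failure of the naive algorithm visible):

import numpy as np

def naive_norm(x):
    # Sums the squares directly; each square may underflow to 0.
    return np.sqrt(np.sum(x * x))

def scaled_norm(x):
    # Scale by the infinity norm first (as in Example 1.1), then rescale.
    s = np.max(np.abs(x))
    if s == 0.0:
        return 0.0
    y = x / s
    return s * np.sqrt(np.sum(y * y))

# Each square (1e-199)**2 = 1e-398 underflows to 0 in double precision.
x = np.array([1e-199] + [1e-200] * 100)
print(naive_norm(x))            # 0.0 -- completely wrong
print(scaled_norm(x))           # ~1.414e-199, the correct value
print(1e-199 * np.sqrt(2.0))    # analytic answer for comparison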

Example 1.2. Sample Variance. There are two methods of computing the sample variance:

(a) First compute $\bar{x}$, then compute $s^2 = \frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})^2$.
(b) Compute $s^2 = \frac{1}{n-1}\Big(\sum_{i=1}^n x_i^2 - \frac{1}{n}\big(\sum_{j=1}^n x_j\big)^2\Big)$.

The first method is more accurate but requires two passes over the data, while the second only requires one.
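A small illustration of the trade-off (a sketch assuming NumPy; the data are synthetic with a large mean and small spread, which makes the cancellation in the one-pass formula visible):

import numpy as np

def two_pass_var(x):
    # Method (a): subtract the mean first, then sum squared deviations.
    xbar = np.mean(x)
    return np.sum((x - xbar) ** 2) / (len(x) - 1)

def one_pass_var(x):
    # Method (b): single pass, but subtracts two nearly equal large numbers.
    n = len(x)
    s = np.sum(x)
    s2 = np.sum(x * x)
    return (s2 - s * s / n) / (n - 1)

# Data with a large mean and tiny spread.
rng = np.random.default_rng(1)
x = 1e8 + rng.standard_normal(10_000)

print(two_pass_var(x))   # close to 1
print(one_pass_var(x))   # often wildly off, possibly even negative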

2 Eigenvalue Decomposition

2.1 Eigenvalues and Eigenvectors

1. Eigenvalue. Eigenvector.

Definition 2.1. An eigenvector $x \neq 0$ and eigenvalue $\lambda$ of a matrix A satisfy $Ax = \lambda x$.

2. Basic Properties

Lemma 2.1. Let $A \in \mathbb{C}^{n \times n}$.
(a) Eigenvectors are scale invariant.
(b) The eigenspace of an eigenvalue $\lambda$, $V_\lambda := \{x \in \mathbb{C}^n : Ax = \lambda x\}$, is a vector space.
(c) Any $n \times n$ matrix has n eigenvalues counted with multiplicity.

3. Eigenvalue Decompositions

(a) Eigenvalue Decomposition

Definition 2.2. A matrix $A \in \mathbb{C}^{n \times n}$ admits an eigenvalue decomposition if there exists an invertible matrix X and a diagonal matrix $\Lambda$ of eigenvalues of A such that:
\[ A = X\Lambda X^{-1} \]

(b) Equivalent conditions

Proposition 2.1. The following three statements are equivalent for $A \in \mathbb{C}^{n \times n}$:
  i. A is diagonalizable
  ii. A has an eigenvalue decomposition
  iii. A has n linearly independent eigenvectors

Proof. We only prove the equivalence between the last two; the first requires more machinery. With
\[ X = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}, \qquad AX = X\Lambda \iff Ax_i = \lambda_ix_i \text{ for all } i, \]
and X is invertible if and only if it has n linearly independent columns.
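A quick numerical check of Definition 2.2 (a sketch assuming NumPy; the matrix is a made-up diagonalizable example):

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, X = np.linalg.eig(A)     # columns of X are eigenvectors
Lam = np.diag(eigvals)

# Verify the decomposition A = X Lam X^{-1}.
assert np.allclose(A, X @ Lam @ np.linalg.inv(X))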

2.2 Jordan Canonical Form

1. Every matrix $A \in \mathbb{C}^{n \times n}$ has a Jordan Canonical Form $A = XJX^{-1}$ where:
(a) J is a bidiagonal matrix: the main diagonal of J contains the eigenvalues of A with their algebraic multiplicity, the superdiagonal of J is either 0 or 1, and the remaining entries of J are 0.
(b) The matrix X contains the eigenvectors of A and some other (generalized eigen-) vectors.
(c) Note that if the superdiagonal is all 0s, then J is diagonal and A is diagonalizable.

2. Jordan Blocks

(a) The matrix J can be written in block-diagonal form with matrices $J_1, \ldots, J_k$:
\[ J = \begin{bmatrix} J_1 & O & \cdots & O \\ O & J_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & O \\ O & \cdots & O & J_k \end{bmatrix} \]

(b) The $J_i$ are the Jordan blocks. Let $\lambda \in \Lambda(A)$; then a Jordan block has the form:
\[ \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & 0 \\ \vdots & & \ddots & \ddots & 1 \\ 0 & \cdots & \cdots & 0 & \lambda \end{bmatrix} \]

(c) The number of Jordan blocks corresponding to a specific $\lambda_i \in \Lambda(A)$ is equal to its geometric multiplicity (that is, the dimension of the eigenspace).

(d) The sum of the dimensions of the Jordan blocks for any $\lambda_i$ is equal to the algebraic multiplicity of that eigenvalue.

3. Basic Properties

(a) Fundamental Lemma

Lemma 2.2. Let $J_r$ be a Jordan block of a Jordan Canonical Form with eigenvalue $\lambda_r$, and let $n_r$ be its dimension. Then for $k = 1, \ldots, n_r$:
\[ (J_r - \lambda_rI)^ke_k = 0 \]

Proof. Let $\mathrm{diag}_k(A)$ be the vector of values of A of the form $A_{i,i+k}$. Then we note the following:
\[ \mathrm{diag}_1(J_r - \lambda_rI) = \mathbf{1}_{n_r - 1} \]
while all other elements of the matrix are 0. In general:
\[ \mathrm{diag}_k\big((J_r - \lambda_rI)^k\big) = \mathbf{1}_{n_r - k} \]
while all other elements of the matrix are 0. In particular, the first k columns of $(J_r - \lambda_rI)^k$ are zero, so multiplying by $e_k$ gives 0.

(b) Powers of the Jordan Canonical Form:

Lemma 2.3. $A^k = XJ^kX^{-1}$

4. Drawback: J cannot be computed stably in floating point arithmetic.

2.3 Spectra

1. Spectra. Principal Eigenvalue.

Definition 2.3. Let $A \in \mathbb{C}^{n \times n}$. The set of eigenvalues of A is called the spectrum of A and is denoted $\Lambda(A)$. The maximum element of $\Lambda(A)$ in absolute value is called the principal eigenvalue of A.

2. Spectral Theorems

(a) Hermitian Adjoint. Normal Matrix. Hermitian Matrix.

Definition 2.4. The Hermitian adjoint $A^*$ of a matrix A is the transpose of A with its elements conjugated. A matrix A is a normal matrix if $A^*A = AA^*$. A is a Hermitian matrix if $A = A^*$.

(b) Spectral Theorem for Normal Matrices

Theorem 2.1. Let $A \in \mathbb{C}^{n \times n}$. Then the following are equivalent:
  i. A is unitarily diagonalizable
  ii. A has an orthonormal eigenbasis
  iii. A is a normal matrix
  iv. A has an EVD of the form $A = V\Lambda V^*$ where V is unitary (i.e. $VV^* = V^*V = I$)

(c) Spectral Theorem for Hermitian Matrices

Theorem 2.2. Let $A \in \mathbb{C}^{n \times n}$ be Hermitian. Then the results of the Spectral Theorem for Normal Matrices apply, and the eigenvalues of A are real.

2.4 Spectral Radius

1. Spectral Radius

Definition 2.5. The spectral radius $\rho(A)$ of a matrix $A \in \mathbb{C}^{n \times n}$ is the largest absolute eigenvalue:
\[ \rho(A) = \max\{|\lambda| : \lambda \in \Lambda(A)\} \]

Note 2.1. The spectral radius is not a norm. Consider $J \in \mathbb{R}^{2 \times 2}$ with all entries zero except $J_{12} = 1$. Then $\rho(J) = 0$ but $J \neq 0$, hence $\rho$ cannot be a norm.

2. Minimality of $\rho$ over norms

Lemma 2.4. If $A \in \mathbb{C}^{n \times n}$ and $\|\cdot\|$ is a compatible norm, then $\rho(A) \leq \|A\|$.

Proof. Let $\lambda$ be an eigenvalue of A and x a corresponding eigenvector. Then:
\[ |\lambda|\,\|x\| = \|Ax\| \leq \|A\|\,\|x\| \]
Since $\lambda$ is arbitrary in $\Lambda(A)$, the result follows.

3. Approximating the Spectral Radius

Theorem 2.3. Let $A \in \mathbb{C}^{n \times n}$ and $\epsilon > 0$. There exists an induced norm $\|\cdot\|_\alpha$, depending on A and $\epsilon$, such that $\|A\|_\alpha \leq \rho(A) + \epsilon$.

4. A limiting property

Lemma 2.5. Let $A \in \mathbb{C}^{n \times n}$. $A^m \to O$ as $m \to \infty$ if and only if $\rho(A) < 1$.

Proof. Let $\lambda$ be an eigenvalue with $|\lambda| = \rho(A)$ and x a corresponding eigenvector. Then $A^mx = \lambda^mx$. If $A^m \to O$ as $m \to \infty$, then taking the limit on both sides implies that $\lambda^m \to 0$ as $m \to \infty$, and so $|\lambda| < 1$. For the other direction, let $\epsilon > 0$ be such that $\rho(A) + 2\epsilon = 1$. By Theorem 2.3 we can find an $\alpha$ such that:
\[ \|A^m\|_\alpha \leq \|A\|_\alpha^m \leq (\rho(A) + \epsilon)^m < 1 \]
Since $\rho(A) + \epsilon < 1$, the right-hand side tends to 0, and hence $A^m \to O$.
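A small illustration of Lemma 2.5 (a sketch assuming NumPy; the matrix is a made-up non-normal example whose 2-norm exceeds 1 even though its spectral radius is 0.9):

import numpy as np

A = np.array([[0.9, 10.0],
              [0.0,  0.5]])

rho = max(abs(np.linalg.eigvals(A)))
print(rho, np.linalg.norm(A, 2))   # rho = 0.9 < 1, but ||A||_2 > 1

# Powers still decay to zero, as Lemma 2.5 predicts.
P = np.eye(2)
for m in range(200):
    P = P @ A
print(np.linalg.norm(P, 2))        # tiny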

2.5 Diagonal Dominance and Gerschgorin's Disk Theorem

1. Diagonally Dominant

Definition 2.6. A matrix $A \in \mathbb{C}^{n \times n}$ is strictly diagonally dominant if $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for all $i \in \{1, \ldots, n\}$. It is diagonally dominant if the inequality is not strict.

2. Diagonal dominance and non-singularity

Lemma 2.6. A strictly diagonally dominant matrix is nonsingular.

Proof. Suppose A is a strictly diagonally dominant matrix and it is singular, that is, $\exists x \neq 0$ such that $Ax = 0$. Let k be the index of the largest element of x in absolute value. Then:
\[ 0 = \sum_{i=1}^n a_{ki}x_i \implies -a_{kk}x_k = \sum_{i \neq k} a_{ki}x_i \]
Then:
\[ |a_{kk}|\,|x_k| \leq \sum_{i \neq k} |a_{ki}|\,|x_i| \leq |x_k|\sum_{i \neq k} |a_{ki}| < |x_k|\,|a_{kk}| \]
which is a contradiction. Hence A cannot be singular.

3. Gerschgorin's Disk Theorem

(a) Gerschgorin's Disks

Definition 2.7. For $A \in \mathbb{C}^{n \times n}$, define for $i = 1, \ldots, n$ the disks $G_i = \{z \in \mathbb{C} : |z - a_{ii}| \leq r_i\}$, where $r_i = \sum_{j \neq i} |a_{ij}|$. The $G_i$ are called Gerschgorin's disks.

(b) Gerschgorin's Disk Theorem

Theorem 2.4. Let $A \in \mathbb{C}^{n \times n}$, let $G_i$ be its disks, and let $\Lambda(A)$ be its spectrum.
  i. $\Lambda(A) \subset \bigcup_{i=1}^n G_i$
  ii. The number of eigenvalues (with multiplicity) in each connected component of $\bigcup_{i=1}^n G_i$ is the number of $G_i$ constituting that component.

Proof. Suppose there is a $\lambda \in \Lambda(A)$ such that $\lambda \notin G_i$ for any i. Then $\forall i$, $|\lambda - a_{ii}| > r_i$. Hence $A - \lambda I$ is a strictly diagonally dominant matrix, which implies $\det(A - \lambda I) \neq 0$; but this is a contradiction since $\lambda$ is an eigenvalue. For the second part, we use a bit of a trick. Let $t \in [0, 1]$ and let $A(t)$ be the matrix obtained from A by multiplying its off-diagonal entries by t. Note that $G_i(t) \subset G_i$, and $A(0)$ is diagonal, so its eigenvalues $a_{ii}$ lie at the centers of the disks. Moreover, since the eigenvalues are continuous functions of t, $A(0)$ and $A(1) = A$ have the same number of eigenvalues in each connected component of $\bigcup_i G_i$.

3 Singular Value Decomposition

3.1 Theory

1. Singular Value Decomposition (SVD)

Definition 3.1. Let $A \in \mathbb{C}^{m \times n}$. The singular value decomposition (SVD) of A is $U\Sigma V^*$, where:
(a) $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary matrices
(b) The columns of U are the left singular vectors of A
(c) The columns of V are the right singular vectors of A
(d) $\Sigma \in \mathbb{R}_{\geq 0}^{m \times n}$ is a diagonal matrix with values $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ on the diagonal, where r is the rank of A
(e) $A = U\Sigma V^*$

2. Properties of Singular Values and Singular Vectors

Lemma 3.1. The left singular vectors of A are eigenvectors of $AA^*$. The right singular vectors of A are eigenvectors of $A^*A$. The squares of the singular values of A are eigenvalues of $AA^*$ and of $A^*A$.

Proof. From $Ay = \sigma x$ and $A^*x = \sigma y$ (for a left singular vector x and right singular vector y with singular value $\sigma$):
(a) $A^*Ay = \sigma A^*x = \sigma^2y$
(b) $AA^*x = \sigma Ay = \sigma^2x$

3. Other Forms of the SVD

(a) Compact/Reduced SVD

Definition 3.2. The compact or reduced SVD of a matrix $A \in \mathbb{C}^{m \times n}$ can be written as $A = U\Sigma V^*$ where, if $r = \mathrm{rank}(A)$:
  i. $U \in \mathbb{C}^{m \times r}$, whose columns are the left singular vectors of A corresponding to non-zero singular values, and $U^*U = I_r$
  ii. $V \in \mathbb{C}^{n \times r}$, whose columns are the right singular vectors of A corresponding to non-zero singular values, and $V^*V = I_r$
  iii. $\Sigma \in \mathbb{R}_{\geq 0}^{r \times r}$ is a diagonal matrix whose diagonal values are $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r$

(b) Rank-1 SVD

Definition 3.3. Letting $u_1, \ldots, u_r$ and $v_1, \ldots, v_r$ be the left and right singular vectors, and $\sigma_1, \ldots, \sigma_r$ be the singular values, the rank-1 (outer product) form of the SVD is:
\[ A = \sigma_1u_1v_1^* + \cdots + \sigma_ru_rv_r^* \]

4. Existence of the SVD

Theorem 3.1. Every matrix has a compact SVD.

Proof. There are three steps:

(a) Constructing the Wielandt matrix and characterizing its eigenvalues. Let
\[ W = \begin{bmatrix} O & A \\ A^* & O \end{bmatrix} = W^* \]
Since W is Hermitian, by the spectral theorem it has an eigenvalue decomposition $W = Z\Lambda Z^*$ with real eigenvalues. Suppose A has rank r; then W has rank 2r, so it has 2r non-zero eigenvalues. Let $z^T = [\,x^T\ y^T\,]$ be the transpose of a column of Z and $\sigma$ the corresponding eigenvalue. Then $Wz = \sigma z$, which implies $Ay = \sigma x$ and $A^*x = \sigma y$. Moreover, $[\,x^T\ {-y}^T\,]^T$ is also an eigenvector of W, with eigenvalue $-\sigma$. Hence $\Lambda = \mathrm{diag}(\sigma_1, \ldots, \sigma_r, -\sigma_r, \ldots, -\sigma_1, 0, \ldots, 0)$.

(b) Normalizing the columns of Z and showing $A = Y\Sigma_rX^*$. Let $\Sigma_r = \mathrm{diag}(\sigma_1, \ldots, \sigma_r)$ and normalize the eigenvectors of W so that $z^*z = 2$. Then $x^*x + y^*y = 2$. Also, since the eigenvectors are orthogonal, we have from the first part that $x^*x - y^*y = 0$. This implies that $x^*x = y^*y = 1$. Rewriting Z in block notation with blocks X, Y, and $-Y$, we have that $A = Y\Sigma_rX^*$.

(c) Orthonormality of X and Y. We now show that the columns of X are orthonormal, and similarly for Y. This follows from the orthonormality of the z's, specifically by also considering the eigenvectors $[\,x^T\ {-y}^T\,]^T$.

3.2 Applications

1. Solving Linear Systems

Example 3.1. Suppose we want to solve $Ax = b$, and the system is consistent (that is, $b \in \mathrm{im}(A)$). If $A = U\Sigma V^*$ is the full SVD of A, then $\Sigma V^*x = U^*b$. Letting $y = V^*x$ and $c = U^*b$, the diagonal system $\Sigma y = c$ can be solved componentwise, and $y_{r+1}, \ldots, y_n$ are free parameters. Then $x = Vy$.

2. Inverting Non-singular Matrices

Example 3.2. If A is non-singular, then $\Sigma$ has no zeros on its diagonal. Therefore:
\[ A^{-1} = (U\Sigma V^*)^{-1} = (V^*)^{-1}\Sigma^{-1}U^{-1} = V\Sigma^{-1}U^* \]
Moreover, if $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n)$, then $\Sigma^{-1} = \mathrm{diag}(\sigma_1^{-1}, \ldots, \sigma_n^{-1})$.
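A minimal sketch of Examples 3.1 and 3.2 (assuming NumPy; the matrix is a made-up non-singular example):

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

U, s, Vh = np.linalg.svd(A)          # A = U @ diag(s) @ Vh

# Solve Ax = b: y = Sigma^{-1} U^* b, then x = V y.
y = (U.T @ b) / s
x = Vh.T @ y
assert np.allclose(A @ x, b)

# Invert A: A^{-1} = V Sigma^{-1} U^*.
A_inv = Vh.T @ np.diag(1.0 / s) @ U.T
assert np.allclose(A_inv @ A, np.eye(2))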

3. Computing the 2-norm

(a) SVD and EVD of Hermitian positive semi-definite matrices

Lemma 3.2. If M is Hermitian positive semi-definite, then its EVD and SVD coincide.

Proof. By the spectral theorem, let $M = X\Lambda X^*$ be the EVD of M, where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and X is unitary. Since M is positive semi-definite, $\lambda_i \geq 0$. Therefore the EVD is an SVD.

(b) The 2-norm is unitarily invariant

Lemma 3.3. Let U, V be unitary matrices. Then $\|UAV\|_2 = \|A\|_2$.

Proof. First:
\[ \|UAV\|_2^2 = \max_{\|x\|_2=1} \|UAVx\|_2^2 = \max_{\|x\|_2=1} x^*V^*A^*AVx = \max_{\|x\|_2=1} \|AVx\|_2^2 \]
Second, for any vector x, $\|Vx\|_2^2 = x^*V^*Vx = \|x\|_2^2$. Since V is invertible:
\[ \|AV\|_2^2 = \max_{\|x\|_2=1} \|AVx\|_2^2 = \max_{\|Vx\|_2=1} \|AVx\|_2^2 = \max_{\|y\|_2=1} \|Ay\|_2^2 = \|A\|_2^2 \]

(c) 2-norm and Singular Values

Corollary 3.1. Let the SVD of A be $U\Sigma V^*$. Then $\|A\|_2 = \sigma_1$.

Proof. $\|A\|_2 = \|\Sigma\|_2$ by unitary invariance. For any unit vector x, $\|\Sigma x\|_2^2 = \sum_i \sigma_i^2|x_i|^2 \leq \sigma_1^2$, and the bound is attained at $x = e_1$.

4. Computing the Frobenius Norm

(a) The Frobenius norm is unitarily invariant

Lemma 3.4. Let U, V be unitary matrices. Then $\|UAV\|_F = \|A\|_F$.

Proof.
\[ \|UAV\|_F^2 = \mathrm{tr}(V^*A^*U^*UAV) = \mathrm{tr}(V^*A^*AV) = \mathrm{tr}(A^*AVV^*) = \mathrm{tr}(A^*A) = \|A\|_F^2 \]

(b) Frobenius Norm and Singular Values

Corollary 3.2. Let the SVD of A be $U\Sigma V^*$. Then
\[ \|A\|_F^2 = \sum_{i=1}^{\mathrm{rank}(A)} \sigma_i^2 \]

Proof. $\|A\|_F^2 = \|\Sigma\|_F^2 = \sum_{i=1}^{\mathrm{rank}(A)} \sigma_i^2$.
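A quick numerical check of Corollaries 3.1 and 3.2 (a sketch assuming NumPy; the matrix is a random example):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
s = np.linalg.svd(A, compute_uv=False)   # singular values, descending

assert np.isclose(np.linalg.norm(A, 2), s[0])                        # Corollary 3.1
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))   # Corollary 3.2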

5. Schatten and Ky Fan Norms

(a) Schatten p-norm

Definition 3.4. For $p \in [1, \infty)$, the Schatten p-norm is
\[ \|A\|_{\sigma,p} = \Big(\sum_i \sigma_i(A)^p\Big)^{1/p} \]
When $p = \infty$, $\|A\|_{\sigma,\infty} = \max_i \sigma_i(A)$.

(b) Examples of Schatten p-norms

Example 3.3. From the definition:
  i. $\|A\|_{\sigma,1} = \sum_i \sigma_i(A)$. This is also denoted $\|A\|_*$ and called the nuclear norm.
  ii. $\|A\|_{\sigma,2} = \|A\|_F$
  iii. $\|A\|_{\sigma,\infty} = \|A\|_2$

(c) Ky Fan (p, k)-norm

Definition 3.5. The Ky Fan (p, k)-norm of A is $\|A\|_{\sigma,p,k} = \big(\sum_{i=1}^k \sigma_i(A)^p\big)^{1/p}$.

6. Computing the Magnitude of the Determinant

(a) The eigenvalues of unitary matrices have modulus 1.

Lemma 3.5. The eigenvalues of a unitary matrix all have absolute value 1.

Proof. Let U be unitary and x an eigenvector with eigenvalue $\lambda$. Then $Ux = \lambda x$, and:
\[ \|x\|_2 = \|Ux\|_2 = \|\lambda x\|_2 = |\lambda|\,\|x\|_2 \]
so $|\lambda| = 1$.

(b) Let A be a matrix with SVD $U\Sigma V^*$. Then:
\[ |\det(A)| = |\det(U)\det(\Sigma)\det(V^*)| = |\det(\Sigma)| = \prod_{i=1}^n \sigma_i(A) \]

7. Existence and Computation of Pseudo-Inverses

Theorem 3.2. For any $A \in \mathbb{C}^{m \times n}$, $\exists A^\dagger \in \mathbb{C}^{n \times m}$ such that:
(a) Symmetries: $(AA^\dagger)^* = AA^\dagger$ and $(A^\dagger A)^* = A^\dagger A$
(b) "Identity": $AA^\dagger A = A$ and $A^\dagger AA^\dagger = A^\dagger$

Proof. Let $\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_r, 0, \ldots, 0) \in \mathbb{C}^{m \times n}$. Then $\Sigma' = \mathrm{diag}(\sigma_1^{-1}, \ldots, \sigma_r^{-1}, 0, \ldots, 0) \in \mathbb{C}^{n \times m}$ satisfies these conditions for $\Sigma$, so we denote $\Sigma^\dagger = \Sigma'$. Now let A have SVD $U\Sigma V^*$ and let $A' = V\Sigma^\dagger U^*$:
(a) Symmetries: $(AA')^* = (U\Sigma\Sigma^\dagger U^*)^* = U\Sigma\Sigma^\dagger U^* = AA'$. The second follows similarly.
(b) "Identity": $AA'A = U\Sigma V^*V\Sigma^\dagger U^*U\Sigma V^* = U\Sigma\Sigma^\dagger\Sigma V^* = A$. The second follows similarly.
Hence $A^\dagger$ exists and $A^\dagger = V\Sigma^\dagger U^*$.

8. Fredholm Alternative

(a) General Linear Group, Kernels and Images

Lemma 3.6. If $A \in \mathbb{C}^{m \times n}$, $S \in GL(n)$, and $T \in GL(m)$, then $\ker(TA) = \ker(A)$ and $\mathrm{im}(AS) = \mathrm{im}(A)$.

Proof. If $y \in \mathrm{im}(A)$, then $\exists x$ such that $Ax = y$. Since S is invertible, $\exists z$ such that $Sz = x$, so $ASz = y$ and $y \in \mathrm{im}(AS)$. In the other direction, if $\exists z$ such that $ASz = y$, then letting $x = Sz$, $Ax = y$. For the kernels: if $x \in \ker(A)$, then $Ax = 0$, so $TAx = 0$, which implies $x \in \ker(TA)$. If $x \in \ker(TA)$, then $TAx = 0$, and since T is invertible, $Ax = 0$.

(b) Co-kernel. Co-image.

Definition 3.6. Let $A \in \mathbb{C}^{m \times n}$. $\ker(A^*)$ is the co-kernel of A and $\mathrm{im}(A^*)$ is its co-image.

(c) SVD and Spans of Kernel, Co-Kernel, Image and Co-Image

Proposition 3.1. Let A be a complex-valued $m \times n$ matrix of rank r. Let $u_1, \ldots, u_m$ and $v_1, \ldots, v_n$ be the left and right singular vectors of A. Then:
  i. $\ker(A) = \mathrm{span}\{v_{r+1}, \ldots, v_n\}$
  ii. $\ker(A^*) = \mathrm{span}\{u_{r+1}, \ldots, u_m\}$
  iii. $\mathrm{im}(A) = \mathrm{span}\{u_1, \ldots, u_r\}$
  iv. $\mathrm{im}(A^*) = \mathrm{span}\{v_1, \ldots, v_r\}$

Proof. Let $U\Sigma V^*$ be the SVD of A and let $r = \mathrm{rank}(A)$. Then:
\[ \ker(A) = \ker(U\Sigma V^*) = \ker(\Sigma V^*) = V\ker(\Sigma) = V\,\mathrm{span}\{e_{r+1}, \ldots, e_n\} = \mathrm{span}\{v_{r+1}, \ldots, v_n\} \]
Similar reasoning gives $\ker(A^*) = \mathrm{span}\{u_{r+1}, \ldots, u_m\}$. For the image of A:
\[ \mathrm{im}(A) = \mathrm{im}(U\Sigma V^*) = \mathrm{im}(U\Sigma) = U\,\mathrm{im}(\Sigma) = U\,\mathrm{span}\{e_1, \ldots, e_r\} = \mathrm{span}\{u_1, \ldots, u_r\} \]
Similar reasoning gives $\mathrm{im}(A^*) = \mathrm{span}\{v_1, \ldots, v_r\}$.

(d) Fredholm Alternative

Corollary 3.3. Let $A \in \mathbb{C}^{m \times n}$. Then:
  i. $\ker(A) \perp \mathrm{im}(A^*)$
  ii. $\ker(A^*) \perp \mathrm{im}(A)$
  iii. $\ker(A) \oplus \mathrm{im}(A^*) = \mathbb{C}^n$
  iv. $\ker(A^*) \oplus \mathrm{im}(A) = \mathbb{C}^m$

9. Projections

(a) Projection. Orthogonal Projection.

Definition 3.7. Let $P \in \mathbb{C}^{n \times n}$.
  i. P is a projection if it is idempotent (i.e. $P^2 = P$).
  ii. P is an orthogonal projection if it is idempotent and Hermitian (i.e. $P^* = P$).

(b) Orthogonal Projections onto Kernel, Co-Kernel, Image and Co-Image

Lemma 3.7. The following are orthogonal projections onto the respective subspaces:
  i. $P_{\mathrm{im}(A)} = AA^\dagger$
  ii. $P_{\mathrm{im}(A^*)} = A^\dagger A$
  iii. $P_{\ker(A^*)} = I - AA^\dagger$
  iv. $P_{\ker(A)} = I - A^\dagger A$

Proof. First we check that the mappings have the correct target spaces. Let $U\Sigma V^*$ be the SVD of A. Then $AA^\dagger = U\Sigma\Sigma^\dagger U^*$. We note two facts:
  i. Since $\Sigma$ is diagonal:
\[ \Sigma\Sigma^\dagger = \begin{bmatrix} I_r & O \\ O & O \end{bmatrix} \]
  ii. Secondly:
\[ \mathrm{im}(AA^\dagger) = \mathrm{im}(U\Sigma\Sigma^\dagger U^*) = \mathrm{im}(U\Sigma\Sigma^\dagger) = \mathrm{span}\{u_1, \ldots, u_r\} \]
Similarly, $\mathrm{im}(A^\dagger A) = \mathrm{span}\{v_1, \ldots, v_r\}$. By the Fredholm alternative, the other two maps project onto the co-kernel and kernel respectively. Idempotence follows from the "identity" property of the Moore-Penrose inverse; orthogonality follows from its symmetry property.

10. Least Squares Problem

(a) Problem: Find x which minimizes $\|b - Ax\|_2^2$.

Solution. Let $U\Sigma V^*$ be the SVD of A. Letting $y = V^*x$ and $c = U^*b$, we can restate the problem as finding y which minimizes:
\[ \|b - U\Sigma y\|_2^2 = \|U^*b - \Sigma y\|_2^2 = \|c - \Sigma y\|_2^2 = \sum_{i=1}^{\mathrm{rank}(A)} (c_i - \sigma_iy_i)^2 + \sum_{i=\mathrm{rank}(A)+1}^{m} c_i^2 \]
This is minimized when $y_i = c_i/\sigma_i$ for $i = 1, \ldots, \mathrm{rank}(A)$, and the remaining $y_i$ are free. We recover a solution via $x = Vy$.

(b) It is clear that unless A has full column rank, the minimizers are not unique.

11. Minimum Length Least Squares Problem

(a) Problem: Find $x \in \arg\min\{\|x\|_2^2 : \|b - Ax\|_2 \leq \|b - Ay\|_2\ \forall y\}$.

Solution. Again, using the fact that $\|x\|_2 = \|Vy\|_2 = \|y\|_2$, we see that $y_i = c_i/\sigma_i$ for $i = 1, \ldots, \mathrm{rank}(A)$ and $y_i = 0$ for all other i recovers the minimum $\|x\|_2$.

(b) Pseudo-Inverse and the Minimum Length Least Squares Problem

Lemma 3.8. The minimum length least squares solution is $z = A^\dagger b$.

Proof. Using the Fredholm Alternative, $\exists b_1, b_2$ such that $b = b_1 + b_2$ with $b_1 \in \mathrm{im}(A)$ and $b_2 \in \ker(A^*)$. Therefore $\|b - Ax\|_2^2 = \|b_1 - Ax\|_2^2 + \|b_2\|_2^2$. Moreover, $\exists x$ such that $b_1 = Ax$. Using the projections, $AA^\dagger b = b_1$, so the least squares solutions are exactly the solutions of $Ax = AA^\dagger b$. Letting $z \in \ker(A)$, that is $z = (I - A^\dagger A)y$, we guess the solution $x = A^\dagger b + (I - A^\dagger A)y$; plugging this in, we see that this describes all of the solutions. Finally, we want to minimize $\|x\|_2^2$:
\[ \|x\|_2^2 = \|A^\dagger b\|_2^2 + 2\langle A^\dagger b, (I - A^\dagger A)y\rangle + \|(I - A^\dagger A)y\|_2^2 = \|A^\dagger b\|_2^2 + 2\langle (I - A^\dagger A)A^\dagger b, y\rangle + \|(I - A^\dagger A)y\|_2^2 = \|A^\dagger b\|_2^2 + \|(I - A^\dagger A)y\|_2^2 \]
since $(I - A^\dagger A)A^\dagger b = 0$. This is minimized when $y = 0$. Therefore $x = A^\dagger b$ is the minimum length least squares solution.
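A quick check of Lemma 3.8 (a sketch assuming NumPy; the rank-deficient matrix is a made-up example). Both calls below use the same relative singular-value cutoff so that they compute the same minimum-norm solution:

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 5))   # rank 3, 5 unknowns
b = rng.standard_normal(8)

x_pinv = np.linalg.pinv(A, rcond=1e-10) @ b         # Lemma 3.8: x = A^+ b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=1e-10)    # minimum-norm least squares

assert np.allclose(x_pinv, x_lstsq)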

12. Rank and Numerical Rank

(a) Because of floating point errors, matrices are almost always of full rank on computers, even if they are not analytically.

(b) We can use the decay rate of the singular values to approximate the numerical rank. If the decay slows too much, then we have likely reached the true rank of the matrix.

(c) Numerical Ranks

Definition 3.8. Let $A \in \mathbb{C}^{m \times n}$ and $\tau > 0$ be a tolerance. Numerical ranks are:
  i. $\rho\text{-rank}(A) = \min\{r \in \mathbb{N} : \sigma_{r+1} \leq \tau\sigma_r\}$
  ii. $\mu\text{-rank}(A) = \min\{r \in \mathbb{N} : \sum_{i \geq r+1} \sigma_i^2 \leq \tau\sum_{i \geq r} \sigma_i^2\}$
  iii. $\nu\text{-rank}(A) = \|A\|_F^2 / \|A\|_2^2$

13. Finding the Closest Unitary/Orthonormal Matrix

(a) Let $U(n) \subset GL(n)$ be the set of all $n \times n$ unitary matrices, and $O(n) \subset U(n)$ the set of all $n \times n$ orthogonal matrices.

(b) Closest Unitary Approximation

Lemma 3.9. Let $A \in \mathbb{C}^{n \times n}$. Then $\min_{X \in U(n)} \|A - X\|_F$ is attained at $X = UV^*$, where $U\Sigma V^*$ is the SVD of A.

Proof. By unitary invariance, write $Z = U^*XV$, so that $\|A - X\|_F = \|\Sigma - Z\|_F$ and Z ranges over $U(n)$. Only the diagonal entries $z_i$ of Z interact with $\Sigma$, and a minimizing Z can be taken diagonal with $|z_i| = 1$, so
\[ \min_{Z \in U(n)} \|\Sigma - Z\|_F^2 = n + \min_{|z_i|=1} \sum_{i=1}^r \big(\sigma_i^2 - 2\sigma_i\mathrm{Re}(z_i)\big) \]
Using Lagrange multipliers with the constraint $\mathrm{Re}(z_i)^2 + \mathrm{Im}(z_i)^2 = 1$, the Lagrangian for each term is
\[ \sigma_i^2 - 2\sigma_i\mathrm{Re}(z_i) + \lambda(\mathrm{Re}(z_i)^2 + \mathrm{Im}(z_i)^2 - 1) \]
Taking derivatives with respect to the real and imaginary parts of $z_i$ and with respect to $\lambda$, we conclude that $\mathrm{Im}(z_i) = 0$ and $\mathrm{Re}(z_i) \in \{-1, 1\}$. The minimum occurs at $z_i = 1$ since $\sigma_i > 0$. So $z_1 = \cdots = z_r = 1$, and $z_{r+1}, \ldots, z_n$ are complex numbers of modulus 1 (when A has full rank, the minimizer is unique). Hence $X = UZV^*$, and in particular $X = UV^*$ is a minimizer.

(c) Procrustes Problem (?)

Lemma 3.10. $X = V_BU_B^*U_AV_A^*$ is a minimizer of $\min_{X \in U(n)} \|A - BX\|_F^2$ given A and B (with SVDs $A = U_A\Sigma_AV_A^*$ and $B = U_B\Sigma_BV_B^*$).

Note 3.1. This does not look to be true.

14. Best rank-r approximation

(a) Problem: find $\arg\min_{X : \mathrm{rank}(X) \leq r} \|A - X\|_2$ given A.

(b) Eckart-Young Theorem:

Theorem 3.3. Let the SVD of A be $\sum_{i=1}^{\mathrm{rank}(A)} \sigma_iu_iv_i^*$. Then for any r, a solution to the problem is $X = \sum_{i=1}^r \sigma_iu_iv_i^*$, and $\min \|A - X\|_2 = \sigma_{r+1}$.

Proof. Suppose $\exists B \in \mathbb{C}^{m \times n}$ such that $\mathrm{rank}(B) \leq r$ and $\|A - B\|_2 < \sigma_{r+1}$.
  i. By the rank-nullity theorem, $\mathrm{nullity}(B) \geq n - r$. Let $w \in \ker(B)$, so $Bw = 0$ and:
\[ \|Aw\|_2 = \|(A - B)w\|_2 \leq \|A - B\|_2\|w\|_2 < \sigma_{r+1}\|w\|_2 \]
  ii. Let $v \in P := \mathrm{span}\{v_1, \ldots, v_{r+1}\}$, where $U\Sigma V^*$ is the SVD of A and the $v_i$ are the columns of V. Then $\exists z$ such that $v = V_{r+1}z$, and
\[ \|Av\|_2^2 = \|U\Sigma V^*V_{r+1}z\|_2^2 = \sum_{i=1}^{r+1} \sigma_i^2|z_i|^2 \geq \sigma_{r+1}^2\|v\|_2^2 \]
  iii. Since $\dim(P) = r + 1$ and $\mathrm{nullity}(B) \geq n - r$, $P \cap \ker(B) \neq \{0\}$. Hence $\exists v \in P \cap \ker(B)$, $v \neq 0$, such that $\sigma_{r+1}\|v\|_2 \leq \|Av\|_2 < \sigma_{r+1}\|v\|_2$, which is a contradiction. Thus $\|A - B\|_2 \geq \sigma_{r+1}$ for any B of rank at most r, and $\|A - X\|_2 = \sigma_{r+1}$ when X is given as in the theorem.

15. Least Squares with Quadratic Constraints

(a) Problem: For $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $\alpha < \|A^\dagger b\|_2$, find
\[ \arg\min\{\|b - Ax\|_2 : \|x\|_2 = \alpha\} \]

Solution. Let $U\Sigma V^*$ be the SVD of A. Let $U^*b = c$ and $V^*x = z$. Then we can restate the problem as:
\[ \arg\min\{\|c - \Sigma z\|_2 : \|z\|_2 = \alpha\} \]
The Lagrangian for this problem is:
\[ L(z, \mu) = \|c - \Sigma z\|_2^2 + \mu(\|z\|_2^2 - \alpha^2) \]
Taking derivatives with respect to z and $\mu$, we have the following system:
\[ \begin{cases} -2\Sigma(c - \Sigma z) + 2\mu z = 0 \\ \|z\|_2^2 = \alpha^2 \end{cases} \]
The first equation gives $z = (\Sigma^2 + \mu I)^{-1}\Sigma c$, or more explicitly:
\[ z = \sum_{j=1}^{\mathrm{rank}(A)} \frac{\sigma_jc_j}{\sigma_j^2 + \mu}\,e_j \]
As long as $\mu > 0$, the matrix $\Sigma^2 + \mu I$ is invertible (it is diagonal with positive entries). Now we can use the second equation to solve for $\mu$:
\[ \alpha^2 = \sum_{j=1}^{\mathrm{rank}(A)} \left(\frac{\sigma_jc_j}{\sigma_j^2 + \mu}\right)^2 \]
Using some numerical root-finding method, we can solve for $\mu$ and use it to compute z (and then $x = Vz$).

(b) Newton-Raphson is one option for solving for $\mu$.

16. Generalized Condition Number

(a) Generalized Condition Number

Definition 3.9. Given $A \in \mathbb{C}^{m \times n}$, its Moore-Penrose inverse $A^\dagger$, and a matrix norm $\|\cdot\|$, the generalized condition number of A is
\[ \kappa(A) = \|A\|\,\|A^\dagger\| \]

(b) If $\|\cdot\|$ is the 2-norm or Frobenius norm, we can compute these values using the SVD. For example:
\[ \kappa_2(A) = \|A\|_2\,\|A^\dagger\|_2 = \frac{\sigma_1}{\sigma_{\mathrm{rank}(A)}} \]

17. Solving Total Least Squares Problems

Problem: Find $\arg\min\{\|E\|_F^2 + \|r\|_2^2 : (A + E)x = b + r\}$.

Solution. Let $C = [A\ b]$, $F = [E\ r]$, and $z^T = [x^T\ {-1}]$. Then the problem becomes
\[ \arg\min\{\|F\|_F^2 : (C + F)z = 0\} \]
Since $z \neq 0$, we want to find F so that $\mathrm{rank}(C + F) < n + 1$ and $\|F\|_F$ is minimized. If the SVD of C is $\sum_{i=1}^{n+1} \sigma_iu_iv_i^*$, then we can reduce the rank of $C + F$ by removing one of the singular values, and we choose the smallest singular value to minimize the norm of F. So $F = -\sigma_{n+1}u_{n+1}v_{n+1}^*$. Now we want to find z such that:
\[ \Big(\sum_{i=1}^n \sigma_iu_iv_i^*\Big)z = 0 \]
A simple choice is $v_{n+1}$, but we need the last element of z to be $-1$, so we let
\[ z = -\frac{1}{v_{n+1,n+1}}\,v_{n+1} \]
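A minimal sketch of this total least squares recipe (assuming NumPy; the data are synthetic and the function name is hypothetical):

import numpy as np

def total_least_squares(A, b):
    # TLS via the SVD of the augmented matrix C = [A b].
    C = np.column_stack([A, b])
    _, _, Vh = np.linalg.svd(C)
    v = Vh[-1, :]                 # right singular vector for the smallest sigma
    return -v[:-1] / v[-1]        # scale so the last component of z is -1

rng = np.random.default_rng(6)
A = rng.standard_normal((50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = (A + 0.01 * rng.standard_normal(A.shape)) @ x_true + 0.01 * rng.standard_normal(50)

print(total_least_squares(A, b))   # close to x_true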

4 Rank Retaining Factorization

4.1 Theory

1. Rank Retaining Factorization

Definition 4.1. Let $A \in \mathbb{C}^{m \times n}$ have rank r. A rank retaining factorization (RRF) of A is $A = GH$, where:
(a) $G \in \mathbb{C}^{m \times r}$ and $H \in \mathbb{C}^{r \times n}$
(b) $\mathrm{rank}(G) = \mathrm{rank}(H) = r$

2. Properties

(a) Non-Singularity

Lemma 4.1. If an RRF of A is GH and $\mathrm{rank}(A) = r$, then $G^*G$ and $HH^*$ are non-singular.

Proof. Let $U\Sigma V^* = G$ be its SVD. Note that $\Sigma = \begin{bmatrix} \Sigma_r \\ O \end{bmatrix}$, where $\Sigma_r$ is an $r \times r$ diagonal matrix of full rank. Then $G^*G = V\Sigma_r^2V^*$, which is an SVD of $G^*G$. Hence $\mathrm{rank}(G^*G) = r$. The argument for $HH^*$ is analogous.

(b) Kernel. Co-kernel. Image. Co-image.

Lemma 4.2. Let the RRF of A be GH and $\mathrm{rank}(A) = r$. Then:
  i. $\mathrm{im}(A) = \mathrm{im}(G)$
  ii. $\ker(A) = \ker(H)$
  iii. $\mathrm{im}(A^*) = \mathrm{im}(H^*)$
  iv. $\ker(A^*) = \ker(G^*)$

Proof. If $A = GH$ is an RRF of A, then $A^* = H^*G^*$ is an RRF of $A^*$, so we need only prove the first two points. If $y \in \mathrm{im}(A)$, then $\exists x$ s.t. $Ax = y$; letting $z = Hx$, $Gz = y$, so $y \in \mathrm{im}(G)$. If $y \in \mathrm{im}(G)$, then $\exists z$ such that $Gz = y$. $H : \mathbb{C}^n \to \mathbb{C}^r$ is onto since it has rank r, therefore $\exists x$ such that $Hx = z$, and $Ax = y$. So $\mathrm{im}(A) = \mathrm{im}(G)$. For the kernel, note that $\mathrm{nullity}(G) = 0$ by the rank-nullity theorem, so $Gw = 0$ only for $w = 0$. Hence $Ax = GHx = 0$ if and only if $Hx = 0$, i.e. $\ker(A) = \ker(H)$.

4.2 Applications

1. Suppose we want to solve $Ax = b$, where the system is consistent and $A \in \mathbb{C}^{m \times n}$ has RRF $A = GH$.

Solution. By the Fredholm alternative, $\mathbb{C}^n = \ker(A) \oplus \mathrm{im}(A^*)$. Therefore $x = x_0 + x_1$, where $x_0 \in \ker(A)$ and $x_1 \in \mathrm{im}(A^*)$. Therefore:
\[ b = Ax = GH(x_0 + x_1) = GHx_1 \]
since $\ker(A) = \ker(H)$. Since $x_1 \in \mathrm{im}(H^*)$, there is a z such that $H^*z = x_1$. So:
\[ b = GHH^*z \]
Finally, $G^*G$ and $HH^*$ are non-singular (Lemma 4.1), so
\[ z = (HH^*)^{-1}(G^*G)^{-1}G^*b \]
Multiplying through, a particular solution is:
\[ x_1 = H^*(HH^*)^{-1}(G^*G)^{-1}G^*b \]

2. Suppose now that $A \in \mathbb{C}^{m \times n}$ has RRF $A = GH$ and we want to find:
\[ \arg\min\{\|x\|_2^2 : \|Ax - b\|_2^2 \leq \|Az - b\|_2^2\ \forall z\} \]

Solution. Since $b \in \mathbb{C}^m$, by the Fredholm alternative, $b = b_0 + b_1$ where $b_0 \in \ker(A^*)$ and $b_1 \in \mathrm{im}(A)$. Since $\ker(A^*) = \ker(G^*)$ and $Ax = b_1$ is consistent, $G^*Ax = G^*b_1 = G^*b$ is consistent. Since $G^*A = G^*GH$, this gives $Hx = (G^*G)^{-1}G^*b$. Now we can split x as we did above (taking the $\ker(A)$ component to be zero to minimize $\|x\|_2$) to see that
\[ x = H^*(HH^*)^{-1}(G^*G)^{-1}G^*b \]

5 QR & Complete Orthogonal Factorization

5.1 Theory

1. Gram-Schmidt Orthogonalization

Lemma 5.1. Let $A \in \mathbb{C}^{m \times n}$. Then there exist a permutation matrix $\Pi$, a unitary $Q \in U(m)$, and an upper triangular $R \in \mathbb{C}^{m \times n}$ such that $A\Pi = QR$.

Proof. Suppose $\mathrm{rank}(A) = s$. Let $q_i$ be the columns of a matrix Q which we will construct, and $r_{ij}$ the entries of a matrix R which we will construct. First, let $a_1$ be the first column of A, let $r_{11} = \|a_1\|$, and set
\[ q_1 = \frac{1}{r_{11}}a_1 \]
Let $P_2$ be a permutation matrix such that $AP_2$ has first column $a_1$ and second column $a_2$ linearly independent of the first column. Such a permutation exists as long as $s \geq 2$. In general, for $j \leq s$, let $P_j$ be a permutation matrix such that the first $j-1$ columns of $AP_2P_3\cdots P_j$ are the same as those of $AP_2\cdots P_{j-1}$ and its j-th column is linearly independent of the first $j-1$ columns. We can continue this process up to $AP_2\cdots P_s$. Let the columns of this matrix be $a_1, \ldots, a_n$. Then for the first s columns:
\[ q_j = \frac{a_j - \sum_{i=1}^{j-1} r_{ij}q_i}{r_{jj}}, \qquad r_{ij} = \langle a_j, q_i\rangle, \qquad r_{jj} = \Big\|a_j - \sum_{i=1}^{j-1} r_{ij}q_i\Big\| \]
Note that $a_{s+1}, \ldots, a_n \in \mathrm{span}\{q_1, \ldots, q_s\}$. Hence we can finish populating R by finding the coefficients of these columns in a matrix S. At this point we have only computed $A\Pi = Q'M'$, where $Q' \in \mathbb{C}^{m \times s}$ and $M' = [R'\ S] \in \mathbb{C}^{s \times n}$ (where $\Pi$ is a product of permutation matrices, so its inverse is $\Pi^T$). To get the remaining columns of Q, we complete the columns of $Q'$ to a basis of $\mathbb{C}^m$ and make them orthonormal by this same process. To complete $M'$ we need only add rows of zeros until the right dimension is achieved. Hence:
\[ A\Pi = \begin{bmatrix} Q' & Q_{s+1,\ldots,m} \end{bmatrix} \begin{bmatrix} R' & S \\ O & O \end{bmatrix} \]

2. Versions of the QR Factorization

(a) Full QR Decomposition. This is the version stated in the Lemma. Note that when $\mathrm{rank}(A) < m \wedge n$, Q is not unique.

(b) Reduced QR Factorization. This version is simply $Q'[R'\ S]$ as calculated in the proof. It is not unique, given that the permutations can occur in several ways.

(c) Complete Orthogonal Factorization. Consider $[R'\ S]^T$, which has full column rank. Then it has a QR decomposition
\[ [R'\ S]^T = Z\begin{bmatrix} U \\ O \end{bmatrix} \]
The complete orthogonal factorization is then:
\[ A = Q\begin{bmatrix} R' & S \\ O & O \end{bmatrix}\Pi^T = Q\begin{bmatrix} U^T & O \\ O & O \end{bmatrix}Z^T\Pi^T = Q\begin{bmatrix} L & O \\ O & O \end{bmatrix}\Omega^T \]
where L is lower triangular and $\Omega = \Pi Z$.

5.2 Applications

1. Full Rank Least Squares. Suppose $A \in \mathbb{C}^{m \times n}$ has full column rank and $n \leq m$. Find
\[ \arg\min \|Ax - b\|_2 \]

Solution 1. The solution is unique, since $x = (A^*A)^{-1}A^*b$ is the analytic solution. We use the full QR decomposition $A = Q\begin{bmatrix} R \\ O \end{bmatrix}$, so that:
\[ \|Ax - b\|_2 = \left\|\begin{bmatrix} R \\ O \end{bmatrix}x - Q^*b\right\|_2 \]
We can partition $Q^*b$ into c and d so that:
\[ \|Ax - b\|_2^2 = \|Rx - c\|_2^2 + \|d\|_2^2 \]
Therefore $x = R^{-1}c$, which we can compute by back substitution.

Solution 2. Alternatively, we solve the normal equations $A^*Ax = A^*b$, which come from taking the derivative of $\|Ax - b\|_2^2$. In this case we can do a QR decomposition of $A^*A$, and since $A^*A$ is of full rank, $Rx = Q^*A^*b$ can be solved by back substitution. Note that $\kappa_2(A^*A) = \kappa_2(A)^2$, so this is less numerically stable, but $A^*A \in \mathbb{C}^{n \times n}$, so this method may be beneficial if $n \ll m$.
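A quick check of Solution 1 (a sketch assuming NumPy; the data are random examples, and np.linalg.solve is used on the small triangular R instead of a dedicated triangular solver to keep the example NumPy-only):

import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((50, 4))
b = rng.standard_normal(50)

# Solution 1: reduced QR, then solve with the triangular factor R.
Q, R = np.linalg.qr(A)                 # A = Q R, Q is 50x4, R is 4x4
x_qr = np.linalg.solve(R, Q.T @ b)

x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_qr, x_ref)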

2. Least Squares with Linear Constraints. Find $\arg\min\{\|Ax - b\|_2^2 : C^Tx = d\}$.

Solution 1. Note that the Lagrangian is:
\[ L(\lambda, x) = \|b - Ax\|_2^2 + 2\lambda^T(C^Tx - d) \]
Differentiating returns the system:
\[ \begin{cases} -2A^T(b - Ax) + 2C\lambda = 0 \\ C^Tx = d \end{cases} \]
The first equation can be rewritten as $A^TAx + C\lambda = A^Tb$. Writing this as an augmented system:
\[ \begin{bmatrix} A^TA & C \\ C^T & O \end{bmatrix}\begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A^Tb \\ d \end{bmatrix} \]
If there is sparsity in A and C, this system can be solved with sparse methods.

Solution 2. Let $QR = C$ be a QR decomposition of C. Let $y = Q^Tx$ and partition y into u and v. The constraint $C^Tx = d$ becomes $R^Tu = d$, which we solve for u. Using this, we can do an ordinary minimization of $\|b - AQy\|_2 = \|b - \tilde{A}_1u - \tilde{A}_2v\|_2$, where $AQ = [\tilde{A}_1\ \tilde{A}_2]$ and only v is unknown.

5.3 Givens Rotations

1. A Givens rotation $G^{(i,j)}$ is a matrix of the form:
\[ G^{(i,j)}_{lk} = \begin{cases} 1 & l = k \notin \{i, j\} \\ \lambda & l = k \in \{i, j\} \\ \sigma & l = i,\ k = j \\ -\sigma & l = j,\ k = i \\ 0 & \text{otherwise} \end{cases} \]
where $\sigma^2 + \lambda^2 = 1$.

2. If $\sigma$ and $\lambda$ are selected correctly for a vector v, $(G^{(i,j)}v)_i$ or $(G^{(i,j)}v)_j$ can be set to 0.

Example 5.1. Suppose $i < j$. We can find $\lambda$ and $\sigma$ as follows if we want to set $(G^{(i,j)}v)_j = 0$. Note that:
\[ (G^{(i,j)}v)_k = \begin{cases} v_k & k \notin \{i, j\} \\ \lambda v_i + \sigma v_j & k = i \\ -\sigma v_i + \lambda v_j & k = j \end{cases} \]
Hence, we need to solve for $\sigma$ and $\lambda$ which satisfy:
\[ \begin{cases} 0 = \lambda v_j - \sigma v_i \\ 1 = \lambda^2 + \sigma^2 \end{cases} \]
Taking the positive options:
\[ \lambda = \frac{v_i}{\sqrt{v_i^2 + v_j^2}}, \qquad \sigma = \frac{v_j}{\sqrt{v_i^2 + v_j^2}} \]

3. For a matrix A, we can find Givens rotations $G_1, \ldots, G_N$ such that $G_N\cdots G_1A = R$, where R is upper triangular. Letting $Q = (G_N\cdots G_1)^T$, we have the QR decomposition of A.

4. Givens rotations are beneficial when A is sparse.

5. Pivoting (partial or complete) can be implemented to ensure that the element on the diagonal of A is the largest in its row and column.
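A minimal sketch of QR by Givens rotations (assuming NumPy; the helper names are hypothetical and no pivoting is performed):

import numpy as np

def givens(v_i, v_j):
    # Coefficients (lambda, sigma) that zero out the j-th component (Example 5.1).
    r = np.hypot(v_i, v_j)
    return v_i / r, v_j / r

def givens_qr(A):
    # Zero the subdiagonal entries column by column with plane rotations.
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for k in range(n):
        for j in range(k + 1, m):
            if R[j, k] != 0.0:
                lam, sig = givens(R[k, k], R[j, k])
                G = np.eye(m)
                G[[k, k, j, j], [k, j, k, j]] = lam, sig, -sig, lam
                R = G @ R
                Q = Q @ G.T
    return Q, R

A = np.array([[6.0, 5.0], [5.0, 1.0], [0.0, 4.0]])
Q, R = givens_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(np.tril(R, -1), 0.0)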

5.4 Householder Reflections

1. A Householder reflection H is a matrix of the form $I + \tau vv^T$, where $\|v\| = 1$ and $\tau \in \mathbb{C}$. With the choice $\tau = -2$ derived below, it reflects any vector across the hyperplane orthogonal to v.

2. For use in QR decomposition, we require that $HH^T = I$ and $H^TH = I$. Since $H^T = H$, we need to check only one:
\[ H^TH = I + 2\tau vv^T + \tau^2\|v\|^2vv^T = I + \tau vv^T(2 + \tau\|v\|^2) = I + \tau vv^T(2 + \tau) \]
This implies $\tau = -2$.

3. Moreover, given a vector z, we want $Hz = \alpha e_1$. Since H causes a reflection, we choose
\[ v = \frac{z - \alpha e_1}{\|z - \alpha e_1\|} \]
Substituting this in, we have that:
\[ (I - 2vv^T)z = z - \frac{2(z - \alpha e_1)(\|z\|^2 - \alpha z_1)}{\|z - \alpha e_1\|^2} = \alpha e_1 \]
which holds if $2(\|z\|^2 - \alpha z_1) = \|z\|^2 - 2\alpha z_1 + \alpha^2$, i.e. if $\alpha = \pm\|z\|$.

4. Taking $\alpha = \|z\|$, the appropriate Householder reflection is:
\[ I - 2\,\frac{(z - \|z\|e_1)(z - \|z\|e_1)^T}{\|z - \|z\|e_1\|^2} \]

5. Householder QR. For a matrix $A \in \mathbb{C}^{m \times n}$, let $A^{(k)}_p = [\,a^{(k)}_{pp} \cdots a^{(k)}_{mp}\,]^T$ denote the p-th column of $A^{(k)}$ from the diagonal down. Then:

(a) Let $A^{(1)} \in \mathbb{C}^{m \times n}$ be the matrix we want to QR factorize.

(b) Let $H_1$ be the Householder matrix such that $H_1A^{(1)}_1 = \|A^{(1)}_1\|e_1 \in \mathbb{C}^m$. Let $A^{(2)} = H_1A^{(1)}$.

(c) Let $\tilde{H}_2$ be the Householder matrix such that $\tilde{H}_2A^{(2)}_2 = \|A^{(2)}_2\|e_1 \in \mathbb{C}^{m-1}$. Let
\[ H_2 = \begin{bmatrix} 1 & 0 \\ 0 & \tilde{H}_2 \end{bmatrix} \]
and $A^{(3)} = H_2A^{(2)}$. Continue in this way.

(d) Then $H_n\cdots H_1A^{(1)} = R$, an upper triangular matrix. Letting $Q = (H_n\cdots H_1)^T$ gives us Q.

(e) If necessary, we can pivot to ensure non-zero entries along the diagonal.

6 LU, LDU, Cholesky and LDL Decompositions

1. Both the LU and LDU factorizations are based on Gaussian elimination.

2. Given a system of equations $Ax = b$, we add multiples of the first row to the subsequent rows to set their first entries equal to zero.

Example 6.1. Let $v = [\,v_1 \cdots v_n\,]^T$. To set all $v_i = 0$ for $i \neq 1$, we multiply by:
\[ L = \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ -v_2/v_1 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ -v_n/v_1 & 0 & \cdots & 0 & 1 \end{bmatrix} \]

3. This requires that the diagonal elements (pivots) are non-zero. Hence, a sequence of pivots (partial or complete) is used, resulting in
\[ M_n\Pi_nM_{n-1}\Pi_{n-1}\cdots M_1\Pi_1A = U \quad \text{(partial)}, \qquad M_n\Pi_nM_{n-1}\Pi_{n-1}\cdots M_1\Pi_1A\Pi_0 = U \quad \text{(complete)}, \]
where U is upper triangular.

4. To recover A (or $A\Pi_0$ in the complete pivoting case):
\[ A = (M_n\Pi_nM_{n-1}\Pi_{n-1}\cdots M_1\Pi_1)^{-1}U = \Pi_1^TM_1^{-1}\Pi_2^T\cdots\Pi_n^TM_n^{-1}U = \Pi_1^T\Pi_2^T(\Pi_2M_1^{-1}\Pi_2^T)\cdots\Pi_n^TM_n^{-1}U = \Pi_1^T\Pi_2^T\cdots\Pi_n^TL_1\cdots L_nU \]

5. If A has non-singular leading principal submatrices, then the LDU factorization can be recovered by performing an LU decomposition on $U^T = (U')^TD$.

6. If A is symmetric, the $LDL^T$ factorization is recovered.

7. It is difficult to check whether A has non-singular leading principal submatrices (unless A is positive definite).

8. Suppose A is symmetric positive definite. Taking L to be lower triangular, we want to find $A = LL^T$ (the Cholesky decomposition), and we can compute the terms of L directly from this relationship.

9. Similarly, we can derive an algorithm for $A = LDL^T$ when A is symmetric positive definite.
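A minimal sketch of item 8 (assuming NumPy; the matrix is a made-up symmetric positive definite example): the entries of L are computed directly from $A = LL^T$, row by row.

import numpy as np

def cholesky(A):
    # Compute the lower-triangular L with A = L L^T entry by entry.
    n = A.shape[0]
    L = np.zeros_like(A, dtype=float)
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]
            if i == j:
                L[i, j] = np.sqrt(s)      # requires A to be positive definite
            else:
                L[i, j] = s / L[j, j]
    return L

A = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 0.5],
              [1.0, 0.5, 2.0]])
L = cholesky(A)
assert np.allclose(L @ L.T, A)
assert np.allclose(L, np.linalg.cholesky(A))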

7 Iterative Methods

7.1 Overview

1. Iterative methods can be used to compute solutions to linear systems, least squares problems, eigenvalue problems, and singular value problems.

2. Suppose we want to solve $Ax = b$. Iterative methods compute a sequence $x_k$ with $x_k \to x = A^{-1}b$, and we can control the accuracy at which to stop the process.

3. Classes of Iterative Methods

(a) Splitting/One-Step Stationary Methods
  i. Split $A = M - N$, where $Mx = c$ is easy to solve for any c.
  ii. Solve $Mx_k = Nx_{k-1} + b$.

(b) Semi-Iterative Methods
  i. Generate, for a choice of B, $y_k = By_{k-1} + c$.
  ii. Then $x_k = \sum_{j=0}^k a_{jk}y_j$.

(c) Krylov Subspace Methods
  i. Find iterates $x_k \in \mathrm{span}\{b, Ab, A^2b, \ldots, A^{k-1}b\}$.
  ii. Uses the fact that, as r increases, the vectors $K_r = \{b, Ab, A^2b, \ldots, A^{r-1}b\}$ eventually become linearly dependent.

7.2 Splitting Methods

1. Overview

(a) Strategy: Suppose A is invertible in $Ax = b$. We find M such that $Mx = c$ is easy to solve for any c, and let $N = M - A$. We then have the following iteration scheme:
\[ Mx_k = Nx_{k-1} + b \]

(b) General Convergence of Errors

Proposition 7.1. Let $e_k = x - x_k$, where x solves $Ax = b$. Then $\|e_k\| \to 0$ for any initial $x_0$ if and only if $\rho(M^{-1}N) < 1$.

Proof. Note that:
\[ x_k = M^{-1}Nx_{k-1} + M^{-1}b \]
Since $x = M^{-1}Nx + M^{-1}b$ (because $(M - N)x = Ax = b$):
\[ e_k = M^{-1}Ne_{k-1} =: Be_{k-1} \]
Therefore $e_k = B^ke_0$. By Lemma 2.5, both directions follow.

2. Jacobi Method

(a) $M = \mathrm{diag}(A)$. As long as the diagonal elements of A are non-zero, the iterates can be explicitly computed as:
\[ x_k^i = \frac{1}{a_{ii}}\Big(b^i - \sum_{j \neq i} a_{ij}x_{k-1}^j\Big) \]

(b) Jacobi Convergence of Errors

Corollary 7.1. If A is strictly diagonally dominant, then $e_k \to 0$.

Proof. We need only show that $\rho(M^{-1}N) < 1$. This matrix is:
\[ M^{-1}N = \begin{bmatrix} 0 & -a_{12}/a_{11} & \cdots & -a_{1n}/a_{11} \\ -a_{21}/a_{22} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & \vdots \\ -a_{n1}/a_{nn} & \cdots & -a_{n,n-1}/a_{nn} & 0 \end{bmatrix} \]
Suppose A is strictly diagonally dominant; then for any row:
\[ \sum_{j \neq i} |a_{ij}| < |a_{ii}| \]
Using the fact that $\rho(B) \leq \|B\|_\infty$, we then have that
\[ \rho(M^{-1}N) \leq \|M^{-1}N\|_\infty < 1 \]
Applying Proposition 7.1, the result follows.
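A minimal sketch of the Jacobi iteration (assuming NumPy; the matrix is a made-up strictly diagonally dominant example, so Corollary 7.1 guarantees convergence):

import numpy as np

def jacobi(A, b, x0=None, iters=50):
    # Splitting method with M = diag(A): x_k = D^{-1}(b - (A - D) x_{k-1}).
    D = np.diag(A)
    R = A - np.diag(D)                  # off-diagonal part
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    for _ in range(iters):
        x = (b - R @ x) / D
    return x

A = np.array([[10.0, 2.0, 1.0],
              [ 1.0, 8.0, 2.0],
              [ 2.0, 1.0, 9.0]])
b = np.array([13.0, 11.0, 12.0])

x = jacobi(A, b)
assert np.allclose(A @ x, b, atol=1e-8)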

3. Gauss-Seidel Method

(a) Notice that in the Jacobi method we can compute $x_k^1, \ldots, x_k^{i-1}$ before we compute $x_k^i$. Gauss-Seidel uses these updated values to compute $x_k^i$. This yields:
\[ x_k^i = \frac{1}{a_{ii}}\Big(b^i - \sum_{j < i} a_{ij}x_k^j - \sum_{j > i} a_{ij}x_{k-1}^j\Big) \]

(b) Gauss-Seidel Convergence of Errors

Corollary 7.2. If A is strictly diagonally dominant, then $e_k \to 0$.

Proof. Let L be the strictly sub-diagonal part of A and U the strictly super-diagonal part of A, with $M = \mathrm{diag}(A)$. Then:
\[ Mx_k = b - Lx_k - Ux_{k-1} \]
Therefore $x_k = -(M + L)^{-1}Ux_{k-1} + (M + L)^{-1}b$. Hence the errors satisfy:
\[ e_k = -(M + L)^{-1}Ue_{k-1} \]
or, explicitly:
\[ e_k^i = -\frac{1}{a_{ii}}\Big(\sum_{j < i} a_{ij}e_k^j + \sum_{j > i} a_{ij}e_{k-1}^j\Big) \]
Let $r_i = \sum_{j \neq i} |a_{ij}|/|a_{ii}|$. By strict diagonal dominance, $\max_i r_i =: r < 1$. We now proceed inductively to show that $|e_k^i| \leq r\|e_{k-1}\|_\infty$. When $i = 1$:
\[ |e_k^1| \leq r\|e_{k-1}\|_\infty \]
Suppose this holds up to $i - 1$. Then:
\[ |e_k^i| \leq \sum_{j < i} \frac{|a_{ij}|}{|a_{ii}|}|e_k^j| + \sum_{j > i} \frac{|a_{ij}|}{|a_{ii}|}|e_{k-1}^j| \leq r\|e_{k-1}\|_\infty\sum_{j < i}\frac{|a_{ij}|}{|a_{ii}|} + \|e_{k-1}\|_\infty\sum_{j > i}\frac{|a_{ij}|}{|a_{ii}|} \leq r\|e_{k-1}\|_\infty \]
Therefore $\|e_k\|_\infty \leq r\|e_{k-1}\|_\infty$, and so $\|e_k\|_\infty \leq r^k\|e_0\|_\infty \to 0$ as $k \to \infty$.

(c) Number of iterations
  i. From the proof of the corollary, we have that $\|e_k\|_\infty \leq r^k\|e_0\|_\infty$.
  ii. Given a tolerance $\epsilon > 0$ and $x_0 = 0$, we need only
\[ k = \left\lceil \frac{\log\epsilon}{\log r} \right\rceil \]
iterations to bound the relative error by $\epsilon$.

4. Successive Over-Relaxation (SOR)

(a) SOR is a generalization of Gauss-Seidel. Let $A = D - L - U$, where D is diagonal, L has non-zero entries only below the diagonal, and U has non-zero entries only above the diagonal. The iteration is derived as follows:
\[ \omega Dx = \omega b + \omega Lx + \omega Ux \]
\[ Dx = \omega b + \omega Lx + \omega Ux + (1 - \omega)Dx \]
Then, using the most recent estimates to compute the iteration as in Gauss-Seidel:
\[ Dx_k = \omega b + \omega Lx_k + \omega Ux_{k-1} + (1 - \omega)Dx_{k-1} \]
and, explicitly:
\[ x_k^i = (1 - \omega)x_{k-1}^i + \frac{\omega}{a_{ii}}\Big(b_i - \sum_{j < i} a_{ij}x_k^j - \sum_{j > i} a_{ij}x_{k-1}^j\Big) \]

(b) When $\omega = 1$ this is the Gauss-Seidel method, and when $\omega > 1$ this is successive over-relaxation.

(c) Ostrowski's Lemma:

Lemma 7.1. Suppose A is symmetric positive definite. Then $e_k \to 0$ for any $x_0$ if and only if $\omega \in (0, 2)$.

7.3 Semi-Iterative Methods

1. Richardson's Method

(a) Richardson's Method is a numerical method, unstable for poorly chosen $\alpha$, which updates the iterates using:
\[ x_k = (I - \alpha A)x_{k-1} + \alpha b \]

Note 7.1. Effectively, Richardson's Method is a line search method for minimizing $\frac{1}{2}x^TAx - x^Tb$. Given a starting point $x_{k-1}$, the direction of steepest descent is $b - Ax_{k-1}$ (by taking derivatives). Therefore $x_k = x_{k-1} + \alpha(b - Ax_{k-1})$, where $\alpha$ is the step length parameter.

(b) Richardson Convergence of Errors

Corollary 7.3. Suppose A is symmetric positive definite with smallest and largest eigenvalues $\mu_{\min}$ and $\mu_{\max}$. Then $e_k \to 0$ for any $x_0$ if and only if $0 < \alpha < 2/\mu_{\max}$. Moreover, the convergence rate is optimized by $\alpha = \frac{2}{\mu_{\min} + \mu_{\max}}$.

Proof. We have that the errors are:
\[ e_k = (I - \alpha A)e_{k-1} \]
Hence, from Proposition 7.1, $e_k \to 0$ if and only if $\rho(I - \alpha A) < 1$. Since A is symmetric positive definite, its eigenvalues are positive. Letting $\mu$ be the vector of eigenvalues and $A = X\mathrm{diag}(\mu)X^*$ the EVD of A, the EVD of $I - \alpha A$ is:
\[ X(I - \alpha\,\mathrm{diag}(\mu))X^* \]
Hence $\rho(I - \alpha A) = \max_i |1 - \alpha\mu_i| = \max\{|1 - \alpha\mu_{\min}|, |1 - \alpha\mu_{\max}|\} < 1$ if and only if $0 < \alpha < 2/\mu_{\max}$. This implies the first result.

For optimality, note that since $I - \alpha A$ is symmetric, $\|I - \alpha A\|_2 = \max_i |1 - \alpha\mu_i| = \|\mathbf{1} - \alpha\mu\|_\infty$. Minimizing this over $\alpha$ balances $|1 - \alpha\mu_{\min}| = |1 - \alpha\mu_{\max}|$, which gives $\alpha = 2/(\mu_{\min} + \mu_{\max})$.

2. Steepest Descent

(a) This method is akin to Richardson's method, except that at every step we optimize $\alpha_k$ so as to minimize a norm of the residual $b - Ax_{k+1}$ (i.e. we bring the gradient closer to zero and hence move closer to the stationary point of $\frac{1}{2}x^TAx - b^Tx$).

(b) Optimal Choice of $\alpha_k$

Lemma 7.2. Letting $r_k = b - Ax_k$, the optimal choice of $\alpha_k$ to minimize $\|r_{k+1}\|_{A^{-1}}$ is:
\[ \alpha_k = \frac{r_k^Tr_k}{r_k^TAr_k} \]

Proof. Note that norms are equivalent in finite-dimensional spaces. Hence we can minimize $\|\cdot\|_2$, which in the computation would require determining $A^2$, or we can minimize $\|\cdot\|_{A^{-1}}$, which avoids this cost. First:
\[ r_{k+1} = b - Ax_{k+1} = b - A(x_k + \alpha_kr_k) = r_k - \alpha_kAr_k \]
Second:
\[ r_{k+1}^TA^{-1}r_{k+1} = r_k^TA^{-1}r_k - 2\alpha r_k^Tr_k + \alpha^2r_k^TAr_k \]
Taking the derivative with respect to $\alpha$ and noting that the quadratic coefficient is positive, we have the result.

3. Chebyshev's Method

(a) Notice that in the Steepest Descent method:
\[ e_k = (I - \alpha_kA)(I - \alpha_{k-1}A)\cdots(I - \alpha_0A)e_0 =: P_k(A)e_0 \]
Instead of optimizing over $\alpha_k$ stepwise, Chebyshev's method tries to optimize over all of $\alpha_0, \ldots, \alpha_k$ at each k.

(b) Since $\|e_k\| \leq \|P_k(A)\|\,\|e_0\|$, we want to minimize $\|P_k(A)\|$.

(c) This is solved using Chebyshev polynomials.

7.4 Krylov Space Methods

1. Overview

(a) Suppose we want to solve $Ax = b$ and A is invertible, so $x = A^{-1}b$. Computing the inverse is expensive, but we can more easily compute $A^kc$ for some c implicitly.

(b) Krylov Subspace

Lemma 7.3. There is an r such that the solution to $Ax = b$, when A is invertible, is in the Krylov subspace $K_r(A) = \mathrm{span}\{b, Ab, A^2b, \ldots, A^{r-1}b\}$.

Proof. From the Cayley-Hamilton theorem, there is a minimal polynomial of degree r,
\[ P_r(x) = \sum_{i=0}^r \alpha_ix^i \]
such that $P_r(A) = 0$. Since A is invertible, $\alpha_0 \neq 0$, and hence:
\[ A^{-1} = -\frac{1}{\alpha_0}\sum_{i=1}^r \alpha_iA^{i-1} \]
Therefore $A^{-1}b \in \mathrm{span}\{b, Ab, A^2b, \ldots, A^{r-1}b\}$.

2. Conjugate Gradients

(a) Our goal is to minimize $\frac{1}{2}x^TAx - b^Tx$ or, equivalently, to solve $Ax = b$ when $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite.

(b) Conjugacy.
  i. Conjugated Vectors.

Definition 7.1. Let $\langle u, v\rangle_A = u^TAv$ and $\|v\|_A^2 = \langle v, v\rangle_A$. A set of vectors $\{p_1, \ldots, p_r\} \subset \mathbb{R}^n$ is conjugated with respect to A if for all $i \neq j$: $\langle p_i, p_j\rangle_A = 0$.

  ii. Conjugated vectors form a basis.

Lemma 7.4. Suppose $A \in \mathbb{R}^{n \times n}$ is symmetric positive definite and $p_1, \ldots, p_r$ are conjugated with respect to A. Then $p_1, \ldots, p_r$ are linearly independent.

Proof. Suppose this is not true. Then there are $\alpha_1, \ldots, \alpha_r$, not all zero, such that:
\[ \alpha_1p_1 + \cdots + \alpha_rp_r = 0 \]
Then:
\[ 0 = (\alpha_1p_1 + \cdots + \alpha_rp_r)^TA(\alpha_1p_1 + \cdots + \alpha_rp_r) = \alpha_1^2p_1^TAp_1 + \cdots + \alpha_r^2p_r^TAp_r \]
Since not all $\alpha_i$ are zero and $p_i^TAp_i > 0$, we have a contradiction.

(c) Conjugated Gradients
  i. Let $x_0, \ldots, x_k$ be a sequence of iterates.
  ii. Their steepest descent directions are $r_0 = b - Ax_0, \ldots, r_k = b - Ax_k$.
  iii. We then create vectors $p_0, \ldots, p_k$, conjugated with respect to A, from the descent directions $r_0, \ldots, r_k$ using a Gram-Schmidt type approach:
\[ p_k = r_k - \sum_{j < k} \frac{\langle r_k, p_j\rangle_A}{\|p_j\|_A^2}\,p_j \]
  iv. We update $x_{k+1} = x_k + \alpha_kp_k$, where
\[ \alpha_k = \arg\min_\alpha\ \tfrac{1}{2}x_{k+1}^TAx_{k+1} - b^Tx_{k+1} \]

(d) Relation to Krylov Spaces: the conjugated directions form an A-orthogonal basis of the Krylov subspace.

(e) Geometric Interpretation: we rotate/dilate the coordinate system with respect to A and b, and each iterate minimizes along one "coordinate" of this new system. Since there are n coordinates, this requires at most n iterates. A minimal sketch of the resulting algorithm is given below.
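The following sketch uses the standard conjugate gradient recurrences (assuming NumPy); it replaces the explicit Gram-Schmidt sum above with the usual short recurrence for $p_k$, which is equivalent in exact arithmetic:

import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    # Standard CG recurrences for symmetric positive definite A.
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b - A @ x            # residual = steepest descent direction
    p = r.copy()             # first conjugate direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)         # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p     # A-conjugate to the previous directions
        rs = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = conjugate_gradient(A, b)
assert np.allclose(A @ x, b)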
