Fundamentals of Numerical Linear Algebra

Seongjai Kim

Department of Mathematics and Statistics Mississippi State University Mississippi State, MS 39762 USA Email: [email protected]

Updated: August 12, 2021

The work of the author is supported in part by NSF grant DMS-1228337.

Prologue

In organizing this lecture note, I am indebted to Demmel [4], Golub and Van Loan [6], Varga [10], and Watkins [12], among others. The lecture note is not yet complete; other useful techniques will be incorporated in due course. Any questions, suggestions, or comments will be deeply appreciated.

Contents

1 Introduction to Matrix Analysis
   1.1. Basic Algorithms and Notation
        1.1.1. Matrix notation
        1.1.2. Matrix operations
        1.1.3. Vector notation
        1.1.4. Vector operations
        1.1.5. Matrix-vector multiplication & "gaxpy"
        1.1.6. The colon notation
        1.1.7. Flops
   1.2. Vector and Matrix Norms
        1.2.1. Vector norms
        1.2.2. Matrix norms
        1.2.3. Condition numbers
   1.3. Numerical Stability
        1.3.1. Stability in numerical linear algebra
        1.3.2. Stability in numerical ODEs/PDEs
   1.4. Homework

2 Gauss Elimination and Its Variants
   2.1. Systems of Linear Equations
        2.1.1. Nonsingular matrices
        2.1.2. Numerical solutions of differential equations
   2.2. Triangular Systems
        2.2.1. Lower-triangular systems
        2.2.2. Upper-triangular systems
   2.3. Gauss Elimination
        2.3.1. Gauss elimination without row interchanges
        2.3.2. Solving linear systems by LU factorization
        2.3.3. Gauss elimination with pivoting
        2.3.4. Calculating A^{-1}
   2.4. Special Linear Systems
        2.4.1. Symmetric positive definite (SPD) matrices
        2.4.2. LDL^T and Cholesky factorizations
        2.4.3. M-matrices and Stieltjes matrices
   2.5. Homework

3 The Least-Squares Problem
   3.1. The Discrete Least-Squares Problem
        3.1.1. Normal equations
   3.2. LS Solution by QR Decomposition
        3.2.1. Gram-Schmidt process
        3.2.2. QR decomposition, by Gram-Schmidt process
        3.2.3. Application to LS solution
   3.3. Orthogonal Matrices
        3.3.1. Householder reflection
        3.3.2. QR decomposition by reflectors
        3.3.3. Solving LS problems by Householder reflectors
        3.3.4. Givens rotation
        3.3.5. QR factorization by Givens rotation
        3.3.6. Error propagation
   3.4. Homework

4 Singular Value Decomposition (SVD)
   4.1. Introduction to the SVD
        4.1.1. Algebraic Interpretation of the SVD
        4.1.2. Geometric Interpretation of the SVD
        4.1.3. Properties of the SVD
        4.1.4. Computation of the SVD
        4.1.5. Numerical rank
   4.2. Applications of the SVD
        4.2.1. Image compression
        4.2.2. Rank-deficient least-squares problems
        4.2.3. Principal component analysis (PCA)
        4.2.4. Image denoising
   4.3. Homework

5 Eigenvalue Problems
   5.1. Definitions and Canonical Forms
   5.2. Perturbation Theory: Bauer-Fike Theorem
   5.3. Gerschgorin's Theorem
   5.4. Taussky's Theorem and Irreducibility
   5.5. Nonsymmetric Eigenvalue Problems
        5.5.1. Power method
        5.5.2. Inverse power method
        5.5.3. Orthogonal/Subspace iteration
        5.5.4. QR iteration
        5.5.5. Making QR iteration practical
        5.5.6. Hessenberg reduction
        5.5.7. Tridiagonal and bidiagonal reduction
        5.5.8. QR iteration with implicit shifts – Implicit Q theorem
   5.6. Symmetric Eigenvalue Problems
        5.6.1. Algorithms for the symmetric eigenproblem – Direct methods
        5.6.2. Tridiagonal QR iteration
        5.6.3. Rayleigh quotient iteration
        5.6.4. Divide-and-conquer
   5.7. Homework

6 Iterative Methods for Linear Systems
   6.1. A Model Problem
   6.2. Relaxation Methods
   6.3. Krylov Subspace Methods
   6.4. Domain Decomposition Methods
   6.5. Multigrid Methods
   6.6. Homework

Chapter 1

Introduction to Matrix Analysis

1.1. Basic Algorithms and Notation

1.1.1. Matrix notation

Let R be the set of real numbers. We denote the vector space of all m × n matrices by R^{m×n}:
\[
A \in \mathbb{R}^{m\times n} \;\Leftrightarrow\; A = (a_{ij}) =
\begin{bmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mn} \end{bmatrix},
\quad a_{ij} \in \mathbb{R}. \tag{1.1}
\]

The subscript ij refers to the (i, j) entry.

1.1.2. Matrix operations

• transpose (Rm×n → Rn×m)

T C = A ⇒ cij = aji

• addition (Rm×n × Rm×n → Rm×n)

C = A + B ⇒ cij = aij + bij

• scalar-matrix multiplication (R × R^{m×n} → R^{m×n})

C = αA ⇒ cij = α aij

• matrix-matrix multiplication (Rm×p × Rp×n → Rm×n) p X C = AB ⇒ cij = aikbkj k=1

These are the building blocks of matrix computations.

1.1.3. Vector notation

Let Rn denote the vector space of real n-vectors:   x1 n  .  x ∈ R ⇔ x =  .  , xi ∈ R. (1.2) xn

n n×1 We refer to xi as the ith component of x. We identify R with R so that the members of Rn are column vectors. On the other hand, the elements of R1×n are row vectors: 1×n y ∈ R ⇔ y = (y1, ··· , yn). If x ∈ Rn, then y = xT is a row vector. 4 CHAPTER 1. INTRODUCTION TO MATRIX ANALYSIS 1.1.4. Vector operations

Assume a ∈ R and x, y ∈ Rn.

• scalar-vector multiplication

z = a x ⇒ zi = a xi

• vector addition

z = x + y ⇒ zi = xi + yi

• dot product (or, inner product)

n T X c = x y(= x · y) ⇒ c = xiyi i=1

• vector multiply (or, Hadamard product)

z = x. ∗ y ⇒ zi = xiyi

• saxpy (“scalar a x plus y"): a LAPACK routine

z = a x + y ⇒ zi = a xi + yi

1.1.5. Matrix-vector multiplication & “gaxpy"

• gaxpy (“generalized saxpy"): a LAPACK routine

\[ z = Ax + y \;\Rightarrow\; z_i = \sum_{j=1}^{n} a_{ij}x_j + y_i \]

1.1.6. The colon notation

A handy way to specify a column or row of a matrix is with the “colon" nota- tion. Let A ∈ Rm×n. Then,   a1j  .  A(i, :) = [ai1, ··· , ain],A(:, j) =  .  amj

Example 1.1. With these conventions, the gaxpy can be written as

for i=1:m z(i)=y(i)+A(i,:)x (1.3) end or, in column version, z=y for j=1:n (1.4) z=z+x(j)A(:,j) end

Matlab-code 1.2. The gaxpy can be implemented as simply as

z = A*x + y;   (1.5)

1.1.7. Flops

A flop (floating point operation) is any mathematical operation (such as +, −, ∗, /) or assignment that involves floating-point numbers. Thus, the gaxpy (1.3) or (1.4) requires 2mn flops. The following will be frequently utilized in counting flops:

\[
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}, \qquad
\sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6}, \qquad
\sum_{k=1}^{n} k^3 = \Big(\sum_{k=1}^{n} k\Big)^2 = \frac{n^2(n+1)^2}{4}. \tag{1.6}
\]

1.2. Vector and Matrix Norms

1.2.1. Vector norms

Definition 1.3. A norm (or, vector norm) on Rn is a function that assigns to each x ∈ Rn a nonnegative real number kxk, called the norm of x, such that the following three properties are satisfied: for all x, y ∈ Rn and λ ∈ R, kxk > 0 if x 6= 0 (positive definiteness) kλxk = |λ| kxk (homogeneity) (1.7) kx + yk ≤ kxk + kyk (triangle inequality)

Example 1.4. The most common norms are

1/p  X p kxkp = |xi| , 1 ≤ p < ∞, (1.8) i which we call the p-norms, and

kxk∞ = max |xi|, (1.9) i which is called the infinity-norm or maximum-norm. Two of frequently used p-norms are

1/2 P  P 2 kxk1 = i |xi|, kxk2 = i |xi|

The 2-norm is also called the Euclidean norm, often denoted by || · ||.
Remark 1.5. One may consider the infinity-norm as the limit of p-norms, as p → ∞; see Homework 1.1.
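For a quick numerical check of these definitions, the p-norms are available in Matlab through the built-in function norm; the following minimal sketch (with an arbitrarily chosen vector x) prints the 1-, 2-, and infinity-norms.

x = [3; -4; 12];                    % an arbitrary example vector
n1   = norm(x,1);                   % |3| + |-4| + |12| = 19
n2   = norm(x,2);                   % sqrt(9 + 16 + 144) = 13
ninf = norm(x,inf);                 % max_i |x_i| = 12
fprintf('1-norm = %g, 2-norm = %g, inf-norm = %g\n', n1, n2, ninf);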

Theorem 1.6. (Cauchy-Schwarz inequality) For all x, y ∈ Rn, n n n  1/2 1/2 X X 2 X 2 xiyi ≤ xi yi (1.10) i=1 i=1 i=1

Note:(1.10) can be rewritten as

|x · y| ≤ kxk kyk (1.11) which is clearly true. Example 1.7. Given a positive definite matrix A ∈ Rn×n, define the A-norm on Rn by T 1/2 kxkA = (x Ax)

Note: When A = I, the A-norm is just the Euclidean norm. The A-norm is indeed a norm; see Homework 1.2. Lemma 1.8. All p-norms on Rn are equivalent to each other. In particular, √ kxk2 ≤ kxk1 ≤ n kxk2 √ kxk∞ ≤ kxk2 ≤ n kxk∞ (1.12)

kxk∞ ≤ kxk1 ≤ n kxk∞

Note: For all x ∈ Rn, √ kxk∞ ≤ kxk2 ≤ kxk1 ≤ n kxk2 ≤ n kxk∞ (1.13) 1.2. Vector and Matrix Norms 9 1.2.2. Matrix norms

Definition 1.9. A on m × n matrices is a vector norm on the mn-dimensional space, satisfying kAk ≥ 0, and kAk = 0 ⇔ A = 0 (positive definiteness) kλ Ak = |λ| kAk (homogeneity) (1.14) kA + Bk ≤ kAk + kBk (triangle inequality)

Example 1.10.

• max |aij| is called the maximum norm. i,j 1/2  X 2 • kAkF ≡ |aij| is called the Frobenius norm. i,j

Definition 1.11. Let k·km×n be a matrix norm on m × n matrices. The norms are called mutually consistent if

kABkm×p ≤ kAkm×n · kBkn×p (1.15)

Definition 1.12. Lat A ∈ m×n and k · k is a vector norm on n. Then R nb R kAxk mb kAkmn = max (1.16) b b x∈ n, x6=0 kxk R nb is called an operator norm or induced norm or subordinate norm. 10 CHAPTER 1. INTRODUCTION TO MATRIX ANALYSIS

Lemma 1.13. Let A ∈ Rm×n and B ∈ Rn×p.

1. An operator norm of A is a matrix norm. 2. For all operator norms and the Frobenius norm, kAxk ≤ kAk kxk (1.17) kABk ≤ kAk kBk

3. For the induced 2-norm and the Frobenius norm,

kPAQk = kAk (1.18)

when P and Q are orthogonal, i.e., P^{-1} = P^T and Q^{-1} = Q^T.
4. The max norm and the Frobenius norm are not operator norms.
5. ||A||_1 ≡ max_{x≠0} ||Ax||_1/||x||_1 = max_j Σ_i |a_{ij}|
6. ||A||_∞ ≡ max_{x≠0} ||Ax||_∞/||x||_∞ = max_i Σ_j |a_{ij}|
7. ||A||_2 ≡ max_{x≠0} ||Ax||_2/||x||_2 = (λ_max(A^T A))^{1/2}, where λ_max denotes the largest eigenvalue.
8. ||A||_2 = ||A^T||_2.
9. ||A||_2 = max_i |λ_i(A)|, when A^T A = A A^T (normal).
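The characterizations in items 5–7 are easy to verify numerically; the sketch below (with an arbitrary 2 × 2 matrix) compares Matlab's norm against the maximum column sum, the maximum row sum, and the square root of the largest eigenvalue of A^T A.

A = [1 -2; 3 4];                    % an arbitrary example matrix
[max(sum(abs(A),1)), norm(A,1)]     % maximum column sum    vs  ||A||_1
[max(sum(abs(A),2)), norm(A,inf)]   % maximum row sum       vs  ||A||_inf
[sqrt(max(eig(A'*A))), norm(A,2)]   % sqrt(lambda_max(A'A)) vs  ||A||_2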

Lemma 1.14. Let A ∈ Rn×n. Then 1 √ √ kAk ≤ kAk ≤ n kAk n 2 1 2 1 √ √ kAk2 ≤ kAk∞ ≤ n kAk2 n (1.19) 1 kAk ≤ kAk ≤ n kAk n ∞ 1 ∞ √ kAk1 ≤ kAkF ≤ n kAk2

Example 1.15. Let u, v ∈ Rn and let A = uvT . This is a matrix of rank one. Let’s prove

kAk2 = kuk2 kvk2 (1.20)

Proof.

1.2.3. Condition numbers

Definition 1.16. Let A ∈ Rn×n. Then

κ(A) ≡ kAk kA−1k is called the condition number of A, associated to the matrix norm.

Lemma 1.17.

1. κ(A) = κ(A−1) 2. κ(cA) = κ(A) for any c 6= 0. 3. κ(I) = 1 and κ(A) ≥ 1, for any induced matrix norm.

Theorem 1.18. If A is nonsingular and kδAk 1 ≤ , (1.21) kAk κ(A) then A + δA is nonsingular. 1.2. Vector and Matrix Norms 13

n×n Theorem 1.19. Let A ∈ R be nonsingular, and x and xb = x + δx be the solutions of A x = b and A xb = b + δb, respectively. Then kδxk kδbk ≤ κ(A) . (1.22) kxk kbk

Proof. The equations

Ax = b and A(x + δx) = b + δb imply Aδx = δb, that is, δx = A−1δb. Whatever vector norm we have chosen, we will use the induced matrix norm to measure matrices. Thus

kδxk ≤ kA−1k kδbk (1.23)

Similarly, the equation b = A x implies kbk ≤ kAk kxk, or equivalently 1 1 ≤ kAk . (1.24) kxk kbk The claim follows from (1.23) and (1.24). 14 CHAPTER 1. INTRODUCTION TO MATRIX ANALYSIS

 1 2 −2  Example 1.20. Let A =  0 4 1 . Then, we have 1 −2 2

 10 0 10   2 0 0  1 A−1 = 1 4 −1 and AT A = 0 24 −4 . 20     −4 4 4 0 −4 9

1. Find kAk1, kAk∞, and kAk2.

2. Compute the `1-condition number κ1(A). 1.2. Vector and Matrix Norms 15
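A sketch in Matlab that can be used to check the hand computations of Example 1.20:

A = [1 2 -2; 0 4 1; 1 -2 2];
norm(A,1)      % maximum column sum: 8
norm(A,inf)    % maximum row sum: 5
norm(A,2)      % sqrt(lambda_max(A'*A)) = sqrt(25) = 5
cond(A,1)      % kappa_1(A) = ||A||_1 * ||A^{-1}||_1 = 8*(3/4) = 6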

Remark 1.21. Numerical analysis involving the condition number is a use- ful tool in scientific computing. The condition number has been involved in analyses of

• Accuracy, due to the error involved in the data • Stability of algebraic systems • Convergence speed of iterative algebraic solvers

The condition number is one of the most frequently used measures for matrices.

1.3. Numerical Stability

Sources of error in numerical computation: Example: Evaluate a function f : R → R at a given x.

• x is not exactly known

– measurement errors – errors in previous computations

• The algorithm for computing f(x) is not exact

– discretization (e.g., it uses a table to look up) – truncation (e.g., truncating a Taylor series) – rounding error during the computation

The condition of a problem: sensitivity of the solution with respect to er- rors in the data

• A problem is well-conditioned if small errors in the data produce small errors in the solution.
• A problem is ill-conditioned if small errors in the data may produce large errors in the solution.

1.3.1. Stability in numerical linear algebra

Theorem 1.22. (Revisit of Theorem 1.19) Let A ∈ Rn×n be nonsingular, and x and xb = x + δx be the solutions of

A x = b and A xb = b + δb, respectively. Then kδxk kδbk ≤ κ(A) , (1.25) kxk kbk where κ(A) ≡ kAk kA−1k. Example 1.23. Let

1  1 1  A = . 2 1 + 10−10 1 − 10−10 Then 1 − 1010 1010  A−1 = . 1 + 1010 −1010

• Solution for b = (1, 1)T is x = (1, 1)T . • If we change b to b + ∆b, then the change in the solution is

" 10 # −1 ∆b1 − 10 (∆b1 − ∆b2) ∆x = A ∆b = 10 . ∆b1 + 10 (∆b1 − ∆b2)

• A small ∆b may lead to an extremely large ∆x.
• The condition number κ(A) = 2 × 10^{10} + 1.
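The sensitivity in Example 1.23 is easy to reproduce; the following Matlab sketch perturbs b by a tiny amount (the perturbation size is chosen arbitrarily) and compares the relative changes.

A  = 0.5*[1, 1; 1+1e-10, 1-1e-10];
b  = [1; 1];
x  = A\b;                  % exact solution is (1,1)^T
db = [1e-7; 0];            % a tiny perturbation of b
dx = A\(b+db) - x;         % resulting change in the solution
cond(A)                    % roughly 2e10
norm(dx)/norm(x)           % large relative change in x (about 1e3)
norm(db)/norm(b)           % tiny relative change in b (about 7e-8)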

1.3.2. Stability in numerical ODEs/PDEs

Physical Definition: A finite difference (FD) scheme is stable if a small change in the initial conditions produces a small change in the state of the system.

• Most aspects of nature are stable.
• Some phenomena in nature can be represented by differential equations (ODEs and PDEs), while they may be solved through difference equations.
• Although the ODEs and PDEs are stable, their approximations (finite difference equations) may not be stable. In this case, the approximation is a failure.

Definition: A differential equation is

• stable if, for every set of initial data, the solution remains bounded as t → ∞.
• strongly stable if the solution approaches zero as t → ∞.

1.4. Homework

1. This problem proves that the infinity-norm is the limit of the p-norms as p → ∞.
   (a) Verify that lim_{p→∞} (1 + x^p)^{1/p} = 1, for each |x| ≤ 1.
   (b) Let x = (a, b)^T with |a| ≥ |b|. Prove
       lim_{p→∞} ||x||_p = |a|.   (1.26)
       Equation (1.26) implies lim_{p→∞} ||x||_p = ||x||_∞ for x ∈ R^2.
   (c) Generalize the above arguments to prove that
       lim_{p→∞} ||x||_p = ||x||_∞ for x ∈ R^n.   (1.27)

2. Let A ∈ R^{n×n} be a positive definite matrix, and R be its Cholesky factor so that A = R^T R.
   (a) Verify that ||x||_A = ||Rx||_2 for all x ∈ R^n.   (1.28)
   (b) Using the fact that the 2-norm is indeed a norm on R^n, prove that the A-norm is a norm on R^n.
3. Prove (1.13).
4. Let A, B ∈ R^{n×n} and C = AB. Prove that the Frobenius norm is mutually consistent, i.e.,
   ||AB||_F ≤ ||A||_F ||B||_F.   (1.29)
   Hint: Use the definition c_{ij} = Σ_k a_{ik}b_{kj} and the Cauchy-Schwarz inequality.
5. The rank of a matrix is the dimension of the space spanned by its columns. Prove that A has rank one if and only if A = uv^T for some nonzero vectors u, v ∈ R^n.
6. A matrix is strictly upper triangular if it is upper triangular with zero diagonal elements. Show that if A ∈ R^{n×n} is strictly upper triangular, then A^n = 0.

7. Let A = diag_n(d_{-1}, d_0, d_1) denote the n-dimensional tridiagonal matrix with a_{i,i-k} = d_k for k = -1, 0, 1. For example,
\[ \mathrm{diag}_4(-1,2,-1) = \begin{bmatrix} 2 & -1 & 0 & 0\\ -1 & 2 & -1 & 0\\ 0 & -1 & 2 & -1\\ 0 & 0 & -1 & 2 \end{bmatrix}. \]

Let B = diag10(−1, 2, −1). Use a computer software to solve the following.

(a) Find the condition number κ_2(B).
(b) Find the smallest (λ_min) and the largest (λ_max) eigenvalues of B and compute the ratio λ_max/λ_min.
(c) Compare the above results.

Chapter 2

Gauss Elimination and Its Variants

One of the most frequently occurring problems in all areas of scientific en- deavor is that of solving a system of n linear equations in n unknowns. The main subject of this chapter is to study the use of Gauss elimination to solve such systems. We will see that there are many ways to organize this funda- mental algorithm.

2.1. Systems of Linear Equations

Consider a system of n linear equations in n unknowns

a11x1 + a12x2 + ··· + a1nxn = b1 a21x1 + a22x2 + ··· + a2nxn = b2 . (2.1) . an1x1 + an2x2 + ··· + annxn = bn

Given the coefficients aij and the source bi, we wish to find x1, x2, ··· , xn which satisfy the equations. Since it is tedious to write (2.1) again and again, we generally prefer to write it as a single matrix equation

\[ Ax = b, \tag{2.2} \]
where
\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix},\quad
x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix},\quad
b = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}.
\]

Solvers for Linear Systems

• Direct algebraic solvers

– LU, LLT , LDLT , QR, SVD, SuperLU, ··· – A is modified (factorized) – Harder to optimize and parallelize – Numerically robust, but higher algorithmic complexity

• Iterative algebraic solvers

– Stationary and nonstationary methods (Jacobi, Gauss-Seidel, SOR, SSOR, CG, MINRES, GMRES, BiCG, QMR, ···)
– A is not changed (read-only)
– Easier to optimize and parallelize
– Low algorithmic complexity, but may not converge

2.1.1. Nonsingular matrices

Definition 2.1. A square matrix A ∈ R^{n×n} is invertible (nonsingular) if there is a matrix B ∈ R^{n×n} such that AB = BA = I. The matrix B is called the inverse of A and is denoted by A^{-1}.
Theorem 2.2. (Invertible Matrix Theorem) Let A ∈ R^{n×n}. Then the following are equivalent.

1. A−1 exists, i.e., A is invertible. 2. There is a matrix B such that AB = I 3. There is a matrix C such that CA = I 4. Ay = 0 implies that y = 0. 5. Given any vector b, there is exactly one vector x such that Ax = b. 6. The columns of A are linearly independent. 7. The rows of A are linearly independent. 8. det(A) 6= 0. 9. Zero (0) is not an eigenvalue of A. 10. A is a product of elementary matrices. 2.1. Systems of Linear Equations 25

Example 2.3. Let A ∈ R^{n×n} and let the eigenvalues of A be λ_i, i = 1, 2, ···, n. Show that
\[ \det(A) = \prod_{i=1}^{n} \lambda_i. \tag{2.3} \]
Thus we conclude that A is singular if and only if 0 is an eigenvalue of A.
Hint: Consider the characteristic polynomial of A, φ(λ) = det(A − λI), and φ(0).

2.1.2. Numerical solutions of differential equations

Consider the following differential equation:

\[
\begin{aligned}
\text{(a)}\;\; & -u_{xx} + cu = f, && x \in (a_x, b_x),\\
\text{(b)}\;\; & -u_x + \beta u = g, && x = a_x,\\
\text{(c)}\;\; & \;\;\, u_x + \beta u = g, && x = b_x,
\end{aligned} \tag{2.4}
\]
where c ≥ 0 and β ≥ 0 (c + β > 0).

Select nx equally spaced grid points on the interval [ax, bx]:

\[ x_i = a_x + i\,h_x, \quad i = 0, 1, \cdots, n_x, \qquad h_x = \frac{b_x - a_x}{n_x}. \]

Let ui = u(xi). It follows from the Taylor’s series expansion that

−ui−1 + 2ui − ui+1 uxxxx(xi) 2 −uxx(xi) = 2 + hx + ··· . hx 12

Thus the central second-order finite difference (FD) scheme for uxx at xi reads

\[ -u_{xx}(x_i) \approx \frac{-u_{i-1} + 2u_i - u_{i+1}}{h_x^2}. \tag{2.5} \]

Apply the FD scheme for (2.4.a) to have

\[ -u_{i-1} + (2 + h_x^2 c)\,u_i - u_{i+1} = h_x^2 f_i, \quad i = 0, 1, \cdots, n_x. \tag{2.6} \]

However, we will meet ghost grid values at the end points. For example, at the point ax = x0, the formula becomes

2 2 −u−1 + (2 + hxc)u0 − u1 = hxf0. (2.7)

Here the value u−1 is not defined and we call it a ghost grid value. Now, let’s replace the value by using the boundary condition (2.4.b). The central FD scheme for ux at x0 can be formulated as

u1 − u−1 uxxx(x0) 2 ux(x0) ≈ , Trunc.Err = − hx + ··· . (2.8) 2hx 6

Thus the equation (2.4.b), −ux + βu = g, can be approximated (at x0)

u−1 + 2hxβu0 − u1 = 2hxg0. (2.9)

Hence it follows from (2.7) and (2.9) that

\[ (2 + h_x^2 c + 2h_x\beta)\,u_0 - 2u_1 = h_x^2 f_0 + 2h_x g_0. \tag{2.10} \]

The same can be considered for the algebraic equation at the point xn. 28 Chapter 2. Gauss Elimination and Its Variants

The problem (2.4) is reduced to finding the solution u satisfying

\[ Au = b, \tag{2.11} \]
where A ∈ R^{(n_x+1)×(n_x+1)},
\[
A = \begin{bmatrix}
2 + h_x^2 c + 2h_x\beta & -2 & & &\\
-1 & 2 + h_x^2 c & -1 & &\\
& \ddots & \ddots & \ddots &\\
& & -1 & 2 + h_x^2 c & -1\\
& & & -2 & 2 + h_x^2 c + 2h_x\beta
\end{bmatrix},
\]
and
\[
b = \begin{bmatrix} h_x^2 f_0\\ h_x^2 f_1\\ \vdots\\ h_x^2 f_{n_x-1}\\ h_x^2 f_{n_x} \end{bmatrix}
+ \begin{bmatrix} 2h_x g_0\\ 0\\ \vdots\\ 0\\ 2h_x g_{n_x} \end{bmatrix}.
\]
Such a technique of removing ghost grid values is called outer bordering.
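For readers who want to experiment, the following Matlab sketch assembles (2.11) with the built-in spdiags and applies the outer bordering by overwriting the two boundary rows. The values of nx, c, beta, f, and g below are arbitrary placeholders.

nx = 8;  ax = 0;  bx = 1;  hx = (bx-ax)/nx;
c = 1;  beta = 1;                        % assumed sample coefficients
f = ones(nx+1,1);  g = zeros(nx+1,1);    % assumed sample data at the grid points
main = (2 + hx^2*c)*ones(nx+1,1);  off = -ones(nx+1,1);
A = spdiags([off main off], -1:1, nx+1, nx+1);
A(1,1) = 2 + hx^2*c + 2*hx*beta;      A(1,2) = -2;        % outer bordering at x = ax
A(end,end) = 2 + hx^2*c + 2*hx*beta;  A(end,end-1) = -2;  % outer bordering at x = bx
b = hx^2*f;  b(1) = b(1) + 2*hx*g(1);  b(end) = b(end) + 2*hx*g(end);
u = A\b;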

Dirichlet Boundary Condition: When the boundary values of the DE are known (Dirichlet boundary condition), the algebraic system does not have to include rows corresponding to the nodal points. However, it is more reusable if the algebraic system incorporates rows for all nodal points. For example, consider

(a) −uxx + cu = f, x ∈ (ax, bx), (b) −ux + βu = g, x = ax, (2.12) (c) u = ud, x = bx. Then, the corresponding algebraic system can be formulated as

\[ A^0 u = b^0, \tag{2.13} \]
where A^0 ∈ R^{(n_x+1)×(n_x+1)},
\[
A^0 = \begin{bmatrix}
2 + h_x^2 c + 2h_x\beta & -2 & & &\\
-1 & 2 + h_x^2 c & -1 & &\\
& \ddots & \ddots & \ddots &\\
& & -1 & 2 + h_x^2 c & -1\\
& & & 0 & 1
\end{bmatrix},
\]
and
\[
b^0 = \begin{bmatrix} h_x^2 f_0\\ h_x^2 f_1\\ \vdots\\ h_x^2 f_{n_x-1}\\ u_d \end{bmatrix}
+ \begin{bmatrix} 2h_x g_0\\ 0\\ \vdots\\ 0\\ 0 \end{bmatrix}.
\]
Note: The system of linear equations (2.13) can be reformulated involving an n_x × n_x matrix.

2.2. Triangular Systems

Definition 2.4.

n×n 1. A matrix L = (`ij) ∈ R is lower triangular if

`ij = 0 whenever i < j.

n×n 2. A matrix U = (uij) ∈ R is upper triangular if

uij = 0 whenever i > j.

Theorem 2.5. Let G be a triangular matrix. Then G is nonsingular if and only if g_{ii} ≠ 0 for i = 1, ···, n.

2.2.1. Lower-triangular systems

Consider the n × n system L y = b (2.14) where L is a nonsingular, lower-triangular matrix (`ii 6= 0). It is easy to see how to solve this system if we write it in detail:

`11 y1 = b1 `21 y1 + `22 y2 = b2 `31 y1 + `32 y2 + `33 y3 = b3 (2.15) . . . . `n1 y1 + `n2 y2 + `n3 y3 + ··· + `nn yn = bn

The first equation involves only the unknown y1, which can be found as

y1 = b1/`11.

With y1 just obtained, we can determine y2 from the second equation:

y2 = (b2 − `21 y1)/`22.

Now with y2 known, we can solve the third equation for y3, and so on. In gen- eral, once we have y1, y2, ··· , yi−1, we can solve for yi using the ith equation:

\[
y_i = (b_i - \ell_{i1}y_1 - \ell_{i2}y_2 - \cdots - \ell_{i,i-1}y_{i-1})/\ell_{ii}
= \frac{1}{\ell_{ii}}\Big(b_i - \sum_{j=1}^{i-1}\ell_{ij}y_j\Big) \tag{2.16}
\]

Algorithm 2.6. (Forward Substitution)

for i=1:n for j=1:i-1 b(i) = b(i)-L(i,j)*b(j) end (2.17) if L(i,i)==0, set error , exit b(i) = b(i)/L(i,i) end

The result is y. Computational complexity: For each i, the forward substitution requires 2(i − 1) + 1 flops. Thus the total number of flops becomes

\[ \sum_{i=1}^{n}\{2(i-1)+1\} = \sum_{i=1}^{n}\{2i-1\} = n(n+1) - n = n^2. \tag{2.18} \]
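A Matlab rendering of Algorithm 2.6 above might look as follows; it is a sketch only, the function name forward_subst is arbitrary, and b is assumed to be a column vector that gets overwritten with the solution y.

function b = forward_subst(L,b)
% Forward substitution for a lower-triangular L; b is overwritten with y.
n = length(b);
for i = 1:n
    b(i) = b(i) - L(i,1:i-1)*b(1:i-1);       % subtract the known terms
    if L(i,i) == 0, error('L is singular'); end
    b(i) = b(i)/L(i,i);
end
end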

2.2.2. Upper-triangular systems

Consider the system
\[ Ux = y \tag{2.19} \]
where U = (u_{ij}) ∈ R^{n×n} is nonsingular, upper-triangular. Writing it out in detail, we get

u11 x1 + u12 x2 + ··· + u1,n−1 xn−1 + u1,n xn = y1 u22 x2 + ··· + u2,n−1 xn−1 + u2,n xn = y2 . . . = . (2.20) un−1,n−1 xn−1 + un−1,n xn = yn−1 un,n xn = yn It is clear that we should solve the system from bottom to top. Algorithm 2.7. (Back Substitution) A Matlab code:

for i=n:-1:1
    if (U(i,i)==0), error('U: singular!'); end
    x(i) = b(i)/U(i,i);                          (2.21)
    b(1:i-1) = b(1:i-1) - U(1:i-1,i)*x(i);
end

Computational complexity: n^2 + O(n) flops.

2.3. Gauss Elimination

— a very basic algorithm for solving Ax = b

The algorithms developed here produce (in the absence of rounding errors) the unique solution of Ax = b whenever A ∈ R^{n×n} is nonsingular.
Our strategy: Transform the system Ax = b to an equivalent system Ux = y, where U is upper-triangular.
It is convenient to represent Ax = b by an augmented matrix [A|b]; each equation in Ax = b corresponds to a row of the augmented matrix.
Transformation of the system: By means of three elementary row operations applied on the augmented matrix.
Definition 2.8. Elementary row operations.

Replacement:  R_i ← R_i + αR_j   (i ≠ j)
Interchange:  R_i ↔ R_j                      (2.22)
Scaling:      R_i ← βR_i   (β ≠ 0)

Proposition 2.9.

1. If [Â | b̂] is obtained from [A | b] by elementary row operations (EROs), then the systems [A | b] and [Â | b̂] have the same solution.
2. Suppose Â is obtained from A by EROs. Then Â is nonsingular if and only if A is.
3. Each ERO corresponds to left-multiplication by an elementary matrix.
4. Each elementary matrix is nonsingular.
5. The elementary matrices corresponding to "Replacement" and "Scaling" operations are lower triangular.

2.3.1. Gauss elimination without row interchanges

Let E1,E2, ··· ,Ep be the p elementary matrices which transform A to an upper-triangular matrix U, that is,

EpEp−1 ··· E2E1 A = U (2.23)

Assume: No “Interchange" is necessary to apply. Then

• E_p E_{p-1} ··· E_2 E_1 is lower-triangular and nonsingular.
• L ≡ (E_p E_{p-1} ··· E_2 E_1)^{-1} is lower-triangular.
• In this case,
  A = (E_p E_{p-1} ··· E_2 E_1)^{-1} U = LU.   (2.24)

2 1 1   9  Example 2.10. Let A = 2 2 −1 and b =  9 . 4 −1 6 16

1. Perform EROs to obtain an upper-triangular system.
2. Identify the elementary matrices for the EROs.
3. What are the inverses of the elementary matrices?
4. Express A as LU.

Theorem 2.11. (LU Decomposition Theorem) The following are equiva- lent.

1. All leading principal submatrices of A are nonsingular. (The jth leading principal submatrix is A(1 : j, 1 : j).) 2. There exists a unique unit lower triangular L and nonsingular upper- triangular U such that A = LU.

Proof. (2) ⇒ (1): A = LU may also be written A A  L 0  U U  11 12 = 11 11 12 A21 A22 L21 L22 0 U22 L U L U  = 11 11 11 12 (2.25) L21U11 L21U12 + L22U22 where A11 is a j × j leading principal submatrix. Thus

j Y det(A11) = det(L11U11) = 1 · det(U11) = (U11)kk 6= 0. k=1

Here we have used the assumption that U is nonsingular and so is U_{11}.
(1) ⇒ (2): It can be proved by induction on n.

2.3.2. Solving linear systems by LU factorization

    1 0 0 0  ∗ ∗ ∗ ∗  ∗ 1 0 0   0  ∗ ∗ ∗  A =      ∗ ∗ 1 0   0 0 0  ∗  ∗ ∗ ∗ 1 0 0 0 0 0 | {z } | {z } L U

Let A ∈ Rn×n be nonsingular. If A = LU, where L is a unit lower triangu- lar matrix and U is an upper triangular matrix, then

 Ly = b Ax = b ⇐⇒ (LU)x = L(Ux) = b ⇐⇒ Ux = y

In the following couple of examples, the LU factorization is given.

 1 4 −2  −12 Example 2.12. Let A =  2 5 −3 , b = −14. −3 −18 16 64  1 0 0   1 4 −2  ∆ A = LU =  2 1 0   0 −3 1  −3 2 1 0 0 8 Use the LU factorization of A to solve Ax = b. Solution. Since

 Ly = b Ax = b ⇐⇒ (LU)x = L(Ux) = b ⇐⇒ Ux = y there are two steps: (1) Solve Ly = b for y; (2) Solve Ux = y for x.

(1) Solve Ly = b for y by row reduction  .   .  1 0 0 . −12 1 0 0 . −12 . .  .   .  [L b] =  2 1 0 . −14  →  0 1 0 . 10  . . −3 2 1 . 64 0 0 1 . 8

−12 So y =  10  8 (2) Solve Ux = y for x by row reduction  .   .  1 4 −2 . −12 1 0 0 . 2 .  .   .  [U . y] =  0 −3 1 . 10  →  0 1 0 . −3  . . 0 0 8 . 8 0 0 1 . 1

 2  x = −3 1 2.3. Gauss Elimination 41

Example 2.13. Let

 5 4 −2 −3  −10  15 13 2 −10  −29 A =   , b =   .  −5 −1 28 3   30  10 10 8 −8 −22  1 0 0 0   5 4 −2 −3  ∆  3 1 0 0   0 1 8 −1  A = LU =      −1 3 1 0   0 0 2 3  2 2 −2 1 0 0 0 6 Use the LU factorization of A to solve Ax = b. Solution. (1) Solve Ly = b for y  .   .  1 0 0 0 . −10 1 0 0 0 . −10  .   .  .  3 1 0 0 . −29   0 1 0 0 . 1  [L . b] =  .  →  .   −1 3 1 0 . 30   0 0 1 0 . 17   .   .  2 2 −2 1 . −22 0 0 0 1 . 30

−10  1  So y =    17  30 (2) Solve Ux = y for x  .   .  5 4 −2 −3 . −10 1 0 0 0 . 3  .   .  .  0 1 8 −1 . 1   0 1 0 0 . −2  [U . y] =  .  →  .   0 0 2 3 . 17   0 0 1 0 . 1   .   .  0 0 0 6 . 30 0 0 0 1 . 5 So  3  −2 x =    1  5 42 Chapter 2. Gauss Elimination and Its Variants LU Factorization Algorithm

• A is reduced to REF ⇐⇒ EpEp−1 ··· E2E1A = U.

−1 • A = (EpEp−1 ··· E2E1) U = LU.

−1 • L = (EpEp−1 ··· E2E1)

How can we get L by row reduction? row replacement EpEp−1 ··· E2E1L = I ⇐⇒ L −−−−−−−−−−−−−−→ I row replacement EpEp−1 ··· E2E1A = U ⇐⇒ A −−−−−−−−−−−−−−→ U

The row operation sequences are the same!

Algorithm for an LU Factorization: row replacement A −−−−−−−−−−−−−−→ U (REF) row replacement L −−−−−−−−−−−−−−→ I 2.3. Gauss Elimination 43

Remark 2.14.

• To get L, theoretically we reduce I to L (I → L) by using the row op- erations used in A → U in BOTH reverse order and reverse operation (different from finding A−1).

• If the size of A ∈ Rm×n and A = LU, then L must be an m × m unit lower triangular matrix, and U is the REF of A by using only row replacement operations. A ∼ U. The size of U is m × n. • To reduce the round-off errors, practical implementations also use par- tial pivoting, where interchange operations may be used. • In practice, to save storage, we can store L and U to the array of A.       ∗∗∗∗∗ 1 0 0 0  ∗∗∗∗  ∗∗∗∗∗   ∗ 1 0 0   0  ∗∗∗    =      ∗∗∗∗∗   ∗∗ 1 0   0 00  ∗  ∗∗∗∗∗ ∗∗∗ 1 0 0 000 | {z } | {z } | {z } A L U 44 Chapter 2. Gauss Elimination and Its Variants

 3 −1 1  Example 2.15. Find the LU factorization of A =  9 1 2 . −6 5 −5 Solution.  3 −1 1   3 −1 1  R2←R2−3R1 A =  9 1 2  −−−−−−−→  0 4 −1  R ←R +2R −6 5 −5 3 3 1 0 3 −3   3 3 −1 1 R3←R3− 4 R2 −−−−−−−→  0 4 −1  = U 9 0 0 −4

3 A → U : R2 ← R2−3R1 =⇒ R3 ← R3+2R1 =⇒ R3 ← R3−4R2 −1 E3E2E1A = U =⇒ A = (E3E2E1) U −1 −1 −1 −1 L = (E3E2E1) I = E1 E2 E3 I −1 −1 −1 E1 E2 E3 I = L =⇒ E3E2E1L = I 3 I → L : R3 ← R3+4R2 =⇒ R3 ← R3−2R1 =⇒ R2 ← R2+3R1 Each step happens “simultaneously"

3 A → U : R2 ← R2−3R1 =⇒ R3 ← R3+2R1 =⇒ R3 ← R3−4R2  3 −1 1   3 −1 1  A =  9 1 2  →  0 4 −1  = U 9 −6 5 −5 0 0 −4 3 I → L : R3 ← R3+4R2 =⇒ R3 ← R3−2R1 =⇒ R2 ← R2+3R1

    1 0 0 3 1 0 0 R3←R3+ 4 R2 I =  0 1 0  −−−−−−−→  0 1 0  3 0 0 1 0 4 1  1 0 0  R2←R2+3R1 −−−−−−−→  3 1 0  = L R3←R3−2R1 3 −2 4 1 2.3. Gauss Elimination 45

 3 −1 1  Example 2.16. Find the LU factorization of A =  9 1 2 . −6 5 −5 Solution.(Practical Implementation):  3 −1 1   3 −1 1  R2←R2−3R1 A =  9 1 2  −−−−−−−→  3 4 −1  R ←R +2R −6 5 −5 3 3 1 -2 3 −3   3 3 −1 1 R3←R3− 4 R2 −−−−−−−→  3 4 −1  3 9 -2 4 −4

 1 0 0   3 −1 1  L =  3 1 0  ,U =  0 4 −1  . 3 9 −2 4 1 0 0 −4 Note: it is easy to verify that A = LU. 46 Chapter 2. Gauss Elimination and Its Variants

Example 2.17. Find the LU factorization of  2 −1   6 5  A =  .  −10 3  12 −2 Solution.  2 −1   2 −1   2 −1  R ←R + 1 R  6 5  R2←R2−3R1  3 8  3 3 4 2  3 8    −−−−−−−→   −−−−−−−→  1  −10 3 R3←R3+5R1 −5 −2 1 −5 −     R4←R4− 2 R2  4  R4←R4−6R1 1 12 −2 6 4 6 2

 1 0 0 0   2 −1   3 1 0 0   0 8  L =  1  ,U =   .  −5 −4 1 0   0 0  6 1 0 1 0 0 2 4×4 4×2 Note: U has the same size as A, that is, its size is 4 × 2. L is a square matrix and is a unit lower triangular matrix. 2.3. Gauss Elimination 47

Numerical Notes: For an n×n dense matrix A (with most entries nonzero) with n moderately large.

• Computing an LU factorization of A takes about 2n3/3 flops† (∼ row re- ducing [A b]), while finding A−1 requires about 2n3 flops.

• Solving Ly = b and Ux = y requires about 2n2 flops, because any n × n triangular system can be solved in about n2 flops.

• Multiplying b by A^{-1} also requires about 2n^2 flops, but the result may not be as accurate as that obtained from L and U (due to round-off errors in computing A^{-1} and A^{-1}b).

• If A is sparse (with mostly zero entries), then L and U may be sparse, too. On the other hand, A−1 is likely to be dense. In this case, a solution of Ax = b with LU factorization is much faster than using A−1.

† A flop is +, −, × or ÷.

Further Comments: LU Algorithms with row interchange opera- tions

1. To reduce the round-off errors.    3 −1 1   9 1 2  9 1 2  17 11   9 1 2  →  3 −1 1  → · · · →  0 3 − 3    −6 5 −5 −6 5 −5 9 0 0 −17 | {z } A | {z } U 0 1 0  1 0 0  2 P ∗ A = L ∗ U, P = 0 0 1 ,L =  −3 1 0  1 4 1 0 0 3 −17 1 For partial pivoting, a pivot in a pivot column is chosen from the entries with largest magnitude. 2. Pivoting is needed if the pivot position entry is 0

 0 2 5   6 2 −7  R1↔R3  1 3 0  −−−−→  1 3 0  6 2 −7 0 2 5

MATLAB: [L, U] = lu(A) stores an upper triangular matrix in U and a "psychologically lower triangular matrix" (i.e., a product of lower triangular and permutation matrices) in L, so that A = L ∗ U. A can be rectangular.
[L, U, P] = lu(A) returns a unit lower triangular matrix L, an upper triangular matrix U, and a permutation matrix P so that P ∗ A = L ∗ U.

2.3.3. Gauss elimination with pivoting

Definition 2.18. A permutation matrix is a matrix that has exactly one 1 in each row and in each column, all other entries being zero.
Example 2.19. Show that if P is a permutation matrix, then P^T P = P P^T = I. Thus P is nonsingular and P^{-1} = P^T.
Proof. (Self study)

Lemma 2.20. Let P and Q be n × n permutation matrices and A ∈ Rn×n. Then

1. PA — A with its rows permuted AP — A with its columns permuted.

2. det(P ) = ±1.

3. PQ is also a permutation matrix. 50 Chapter 2. Gauss Elimination and Its Variants

Example 2.21. Let A ∈ R^{n×n}, and let Â be a matrix obtained by scrambling the rows of A. Show that there is a unique permutation matrix P ∈ R^{n×n} such that Â = PA.

Hint: Consider the row indices in the scrambled matrix Ab, say {k1, k2, ··· , kn}. (This means that for example, the first row of Ab is the same as the k1-th row of A.) Use the index set to define a permutation matrix P . Proof. (Self study) 2.3. Gauss Elimination 51

Theorem 2.22. Gauss elimination with partial pivoting, applied to A ∈ Rn×n, produces a unit lower-triangular matrix L with |`ij| ≤ 1, an upper-triangular matrix U, and a permutation matrix P such that

Â = PA = LU   or, equivalently,   A = P^T LU.   (2.26)
Note: If A is singular, then so is U.

Algorithm 2.23. Solving Ax = b using GE.

1. Factorize A into A = P T LU, where P = permutation matrix, L = unit lower triangular matrix (i.e., with ones on the diagonal), U = nonsingular upper-triangular matrix. 2. Solve P T LUx = b (a) LUx = P b (permuting b) (b) Ux = L−1(P b) (forward substitution) (c) x = U −1(L−1P b) (back substitution)

In practice:

\[
Ax = b \;\Longleftrightarrow\; P^T(LU)x = b \;\Longleftrightarrow\; L(Ux) = Pb \;\Longleftrightarrow\;
\left\{\begin{aligned} Ly &= Pb\\ Ux &= y \end{aligned}\right.
\]
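In Matlab, these steps may be carried out with the built-in lu (a sketch, assuming A and b are already defined):

[L,U,P] = lu(A);     % P*A = L*U, with partial pivoting
y = L\(P*b);         % forward substitution
x = U\y;             % back substitution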

Theorem 2.24. If A is nonsingular, then there exist permutations P1 and P2, a unit lower triangular matrix L, and a nonsingular upper-triangular matrix U such that

P1AP2 = LU.

Only one of P1 and P2 is necessary.

Remark 2.25. P1A reorders the rows of A, AP2 reorders the columns, and

P1AP2 reorders both. Consider       0 0 a11 A12 1 0 u11 U12 P1AP2 = = A21 A22 L21 I 0 A˜22  u U  = 11 12 (2.27) L21u11 L21U12 + A˜22

0 0 We can choose P2 = I and P1 so that a11 is the largest entry in absolute value A21 in its column, which implies L21 = has entries bounded by 1 in modulus. a11 More generally, at step k of Gaussian elimination, where we are computing the kth column of L, we reorder the rows so that the largest entry in the column is on the pivot. This is called “Gaussian elimination with partial pivoting”, or GEPP for short. GEPP guarantees that all entries of L are bounded by one in modulus. 2.3. Gauss Elimination 53

Remark 2.26. We can choose P1 and P2 so that a11 in (2.27) is the largest entry in modulus in the whole matrix. More generally, at step k of Gaussian elimination, we reorder the rows and columns so that the largest entry in the matrix is on the pivot. This is called “Gaussian elimination with complete pivoting”, or GECP for short.

Algorithm 2.27. Factorization with pivoting for i = 1 to n − 1 apply permutations so aii 6= 0

/* compute column i of L (L21 in (2.27))*/ for j = i + 1 to n lji = aji/aii end for

/* compute row i of U (U12 in (2.27) */ for j = i to n uij = aij end for

/* update A22 (to get A˜22 = A22 − L21U12 in (2.27) */ for j = i + 1 to n for k = i + 1 to n ajk = ajk − lji ∗ uik end for end for end for 54 Chapter 2. Gauss Elimination and Its Variants

Algorithm 2.28. LU factorization with pivoting, overwriting L and U on A:

for i = 1 to n − 1
    apply permutations
    for j = i + 1 to n
        a_ji = a_ji / a_ii
    end for
    for j = i + 1 to n
        for k = i + 1 to n
            a_jk = a_jk − a_ji ∗ a_ik
        end for
    end for
end for

Computational Cost: The flop count of LU is done by replacing loops by summations over the same range, and inner loops by their flop counts:

\[
\sum_{i=1}^{n-1}\Big(\sum_{j=i+1}^{n} 1 + \sum_{j=i+1}^{n}\sum_{k=i+1}^{n} 2\Big)
= \sum_{i=1}^{n-1}\big((n-i) + 2(n-i)^2\big)
= \frac{2n^3}{3} + O(n^2) \tag{2.28}
\]
The forward and back substitutions with L and U to complete the solution of Ax = b cost O(n^2). Thus, overall, solving Ax = b with Gaussian elimination costs 2n^3/3 + O(n^2) flops.

 3 −1 1  Example 2.29. Find the LU factorization of A =  9 1 2  −6 5 −5 Solution.(Without pivoting)  3 −1 1   3 −1 1  R2←R2−3R1 A =  9 1 2  −−−−−−−→  3 4 −1  R ←R +2R −6 5 −5 3 3 1 -2 3 −3   3 3 −1 1 R3←R3− 4 R2 −−−−−−−→  3 4 −1  3 9 -2 4 −4

 1 0 0   3 −1 1  L =  3 1 0  ,U =  0 4 −1  . 3 9 −2 4 1 0 0 −4 (With partial pivoting)

 3 −1 1   9 1 2  R1↔R2 A =  9 1 2  −−−−→  3 −1 1  −6 5 −5 −6 5 −5    9 1 2  9 1 2 R ←R − 1 R 2 2 3 1 1 4 1 R2↔R3  2 17 11  −−−−−−−→  3 −3 3  −−−−→ −3 3 − 3 R ←R + 2 R   3 3 3 1 −2 17 −11 1 4 1 3 3 3 3 −3 3  9 1 2  4 R3←R3+ R2 R ↔R R ↔R −−−−−−−−→17  −2 17 −11  ,I −−−−→1 2 E −−−−→2 3 P  3 3 3  1 4 9 3 −17 −17 PA = LU   0 1 0  1 0 0  9 1 2 2  0 17 −11  P = 0 0 1 ,L =  −3 1 0  ,U =  3 3  1 0 0 1 − 4 1  9  3 17 0 0 −17 56 Chapter 2. Gauss Elimination and Its Variants

2.3.4. Calculating A−1

The program to solve Ax = b can be used to calculate the inverse of a matrix. Letting X = A−1, we have AX = I. This equation can be written in partitioned form:

A[x1 x2 ··· xn] = [e1 e2 ··· en] (2.29) where x1, x2, ··· , xn and e1, e2, ··· , en are columns of X and I, respectively. Thus AX = I is equivalent to the n equations

Axi = ei, i = 1, 2, ··· , n. (2.30)

Solving these n systems by Gauss elimination with partial pivoting, we obtain A^{-1}.

Computational complexity. A naive flop count:
    LU-factorization of A:             (2/3)n^3 + O(n^2)
    Solve the n equations in (2.30):   n · 2n^2 = 2n^3
    Total cost:                        (8/3)n^3 + O(n^2)

A modification: The forward-substitution phase requires the solution of

Lyi = ei, i = 1, 2, ··· , n. (2.31)

Some operations can be saved by exploiting the leading zeros in e_i. (For each i, the portion of L to be accessed is triangular.) With these savings, one can conclude that A^{-1} can be computed in 2n^3 + O(n^2) flops; see Homework 2.4.

2.4. Special Linear Systems

2.4.1. Symmetric positive definite (SPD) matrices

Definition 2.30. A real matrix is symmetric positive definite (s.p.d.) if A = AT and xT Ax > 0, ∀x 6= 0.

Proposition 2.31. Let A ∈ Rn×n be a real matrix.

1. A is s.p.d. if and only if A = AT and all its eigenvalues are positive.

2. If A is s.p.d and H is any principal submatrix of A (H = A(j : k, j : k) for some j ≤ k), then H is s.p.d.

3. If X is nonsingular, then A is s.p.d. if and only if XT AX is s.p.d.

4. If A is s.p.d., then all aii > 0, and maxij |aij| = maxi aii > 0.

5. A is s.p.d. if and only if there is a unique lower triangular nonsingular matrix L, with positive diagonal entries, such that A = LLT .

The decomposition A = LL^T is called the Cholesky factorization of A, and L is called the Cholesky factor of A.

Algorithm 2.32. Cholesky algorithm:

for j = 1 to n
    l_jj = ( a_jj − Σ_{k=1}^{j−1} l_jk^2 )^{1/2}
    for i = j + 1 to n
        l_ij = ( a_ij − Σ_{k=1}^{j−1} l_ik l_jk ) / l_jj
    end for
end for

Derivation: See Homework 2.2.

Remark 2.33. The Cholesky factorization is mainly used for the numerical solution of linear systems Ax = b.

• For symmetric linear systems, the Cholesky decomposition (or its LDL^T variant) is the method of choice, for superior efficiency and numerical stability.
• Compared with the LU decomposition, it is roughly twice as efficient (O(n^3/3) flops).
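As a quick illustration of this remark, an s.p.d. system can be solved with Matlab's built-in chol; the small matrix below is an arbitrary s.p.d. example. Note that chol returns the upper-triangular factor R with A = R'*R, i.e., L = R'.

A = [4 2 0; 2 5 2; 0 2 5];  b = [1; 2; 3];
R = chol(A);         % A = R'*R
x = R\(R'\b);        % two triangular solves
norm(A*x - b)        % residual check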

Algorithm 2.34. (Cholesky: Gaxpy Version) [6, p.144] for j = 1 : n if j > 1 A(j : n, j) = A(j : n, j) −A(j : n, 1 : j − 1)A(j, 1 : j − 1)T end A(j : n, j) = A(j : n, j)/pA(j, j) end

(Cholesky: Outer Product Version) for k = 1 : n A(k, k) = pA(k, k) A(k + 1 : n, k) = A(k + 1 : n, k)/A(k, k) for j = k + 1 : n A(j : n, j) = A(j : n, j) − A(j : n, k)A(j, k) end end

• Total cost of the Cholesky algorithms: O(n3/3) flops. • L overwrites the lower triangle of A in the above algorithms. 2.4. Special Linear Systems 61 Stability of Cholesky Process

• i T 2 X 2 2 A = LL =⇒ lij ≤ lik = aii, ||L||2 = ||A||2 k=1 • If xˆ is the computed solution to Ax = b, obtained via any of Cholesky procedures, then xˆ solves the perturbed system (A + E)xˆ = b • Example: [6] If Cholesky algorithm is applied to the s.p.d. matrix  100 15 .01  A =  15 2.26 .01  .01 .01 1.00

For round arithmetic, we get

ˆ ˆ ˆ l11 = 10, l21 = 1.5, l31 = .001,

and ˆ ˆ 2 1/2 l22 = (a22 − l21 ) = .00

The algorithm then breaks down trying to compute l32. By the way: The Cholesky factor of A is  10.0000 0 0  L =  1.5000 0.1000 0  0.0010 0.0850 0.9964 62 Chapter 2. Gauss Elimination and Its Variants Symmetric Positive Semidefinite Matrices

• A matrix is said to be symmetric positive semidefinite (SPS) if A = AT and xT Ax ≥ 0 for all x ∈ Rn. • Cholesky Algorithm for SPS matrices for k = 1 : n if A(k, k) > 0 A(k, k) = pA(k, k) A(k + 1 : n, k) = A(k + 1 : n, k)/A(k, k) for j = k + 1 : n A(j : n, j) = A(j : n, j) − A(j : n, k)A(j, k) end end end

• It breaks down if A(k, k) ≤ 0.

2.4.2. LDLT and Cholesky factorizations

• LDM T Factorization: If all the leading principal submatrices of A ∈ Rn×n are nonsingular, then there exist unique unit lower triangular ma- trices L and M, and a unique diagonal matrix D = diag(d1, ··· , dn), such that A = LDM T . • Symmetric LDM T Factorization: If A is a nonsingular symmetric matrix and has the factorization A = LDM T , where L and M are unit lower triangular matrices and D is a diagonal matrix, then L = M. • LDLT Algorithm: If A ∈ Rn×n is symmetric and has an LU factoriza- tion, then this algorithm computes a unit lower triangular matrix L and T a diagonal matrix D = diag(d1, ··· , dn), so A = LDL . The entry aij is

overwritten with lij if i > j and with di if i = j. • Total cost of LDLT : O(n3/3) flops, about half the number of flops in- volved in GE.

Remark 2.35. Let A = L_0 L_0^T be the Cholesky factorization. Then the Cholesky factor L_0 can be decomposed as
\[ L_0 = L D_0, \tag{2.32} \]
where L is the unit Cholesky factor and D_0 is diagonal:
\[ L(:, j) = L_0(:, j)/L_0(j, j) \quad\text{and}\quad D_0 = \mathrm{diag}(L_0). \]
Then, for the LDL^T variant, we set D = D_0 D_0^T = D_0^2.

LDLT Factorization To implement the LDLT algorithm, define v(1 : j) by the first j-th compo- T T nents of DL ej (j-th column of DL or j-th row of LD). Then   d(1)L(j, 1)  .   d(2)L(j, 2).  v(1 : j) =   d(j − 1)L(j, j − 1) d(j)

Note that v = A(j, j) − L(j, 1 : j − 1)v(1 : j − 1) can be derived from the j-th equation in L(1 : j, 1 : j)v = A(1 : j, j).

      a11 a21 a31 a41 1 0 0 0 d1 d1l21 d1l31 d1l41  a21 a22 a32 a42   l21 1 0 0   0 d2 d2l32 d2l42    =      a31 a32 a33 a43   l31 l32 1 0   0 0 d3 d3l43  a41 a42 a43 a44 l41 l42 l43 1 0 0 0 d4 | {z } | {z } | {z } A L DLT 2.4. Special Linear Systems 65

Algorithm 2.36. (LDLT Algorithm)

for j = 1 : n
    %% Compute v(1:j)
    for i = 1 : j-1
        v(i) = A(j,i)*A(i,i);
    end
    v(j) = A(j,j) - A(j,1:j-1)*v(1:j-1);
    %% Store d(j) and compute L(j+1:n, j)
    A(j,j) = v(j);
    A(j+1:n, j) = (A(j+1:n, j) - A(j+1:n, 1:j-1)*v(1:j-1))/v(j);
end

Example 2.37.

 2 6 8   1 0 0   2 0 0   1 34  A =  6 23 34  =  3 1 0   0 5 0   0 1 2  8 34 56 42 1 0 0 4 0 0 1

So, when LDLT decomposition is applied, A can be overwritten as  2 6 8  A =  35 34  424 66 Chapter 2. Gauss Elimination and Its Variants

Recall the theorem: If A ∈ R^{n×n} is s.p.d., then there exists a unique lower triangular matrix L ∈ R^{n×n} with positive diagonal entries such that A = LL^T. The Cholesky factorization can be easily transformed to the LDL^T form; see Remark 2.35 on page 63.

Example 2.38. (From Cholesky to LDL^T)
\[
\begin{bmatrix} 2 & -2\\ -2 & 5 \end{bmatrix}
= \begin{bmatrix} \sqrt2 & 0\\ -\sqrt2 & \sqrt3 \end{bmatrix}
\begin{bmatrix} \sqrt2 & -\sqrt2\\ 0 & \sqrt3 \end{bmatrix}
= \begin{bmatrix} 1 & 0\\ -1 & 1 \end{bmatrix}
\begin{bmatrix} \sqrt2 & 0\\ 0 & \sqrt3 \end{bmatrix}
\begin{bmatrix} \sqrt2 & 0\\ 0 & \sqrt3 \end{bmatrix}
\begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0\\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0\\ 0 & 3 \end{bmatrix}
\begin{bmatrix} 1 & -1\\ 0 & 1 \end{bmatrix}
\]
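The conversion of Remark 2.35 can be sketched in Matlab as follows, using the matrix of Example 2.38 (the variable names simply mirror the remark):

A  = [2 -2; -2 5];
L0 = chol(A)';              % lower-triangular Cholesky factor, A = L0*L0'
D0 = diag(diag(L0));        % D0 = diag(L0)
L  = L0/D0;                 % unit lower-triangular factor
D  = D0^2;                  % so that A = L*D*L'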

Matlab:

• "chol": Cholesky factorization
• "ldl": LDL^T factorization

2.4.3. M-matrices and Stieltjes matrices

Definition 2.39.

n×n 1. A matrix A = [aij] ∈ R with aij ≤ 0 for all i 6= j is an M-matrix if A is nonsingular and A−1 ≥ O.

n×n 2. A matrix A = [aij] ∈ R with aij ≤ 0 for all i 6= j is a Stieltjes matrix if A is symmetric and positive definite.

Remark 2.40.

• A Stieltjes matrix is an M-matrix. • In the simulation of PDEs, many applications involve M-matrices. • Consider an algebraic system: Ax = b with A−1 ≥ 0. Then the solution

x = A−1b

must be nonnegative for all nonnegative sources b.

We will deal with these matrices in detail when we study iterative methods for linear algebraic systems.

2.5. Homework

1. Consider the finite difference method on uniform meshes to solve
\[
\begin{aligned}
\text{(a)}\;\; & -u_{xx} + u = (\pi^2 + 1)\cos(\pi x), && x \in (0, 1),\\
\text{(b)}\;\; & u(0) = 1 \ \text{ and } \ u_x(1) = 0.
\end{aligned} \tag{2.33}
\]
   (a) Implement a function to construct algebraic systems in the full matrix form, for general n_x ≥ 1.
   (b) Use a direct method (e.g., A\b) to find approximate solutions for n_x = 25, 50, 100.
   (c) The actual solution for (2.33) is u(x) = cos(πx). Measure the maximum errors for the approximate solutions.
   (d) Now, save the coefficient matrices exploring the Matlab built-in functions sparse and/or spdiags; see, for example, matlabtutorials.com/howto/spdiags.
   (e) Then, solve the linear systems and compare the elapsed times with those of the full-matrix solve. (You may check the elapsed time using tic ... toc.)
2. Derive the details of the Cholesky algorithm 2.32 on page 59. Hint: You may first set
\[
L = \begin{bmatrix}
\ell_{11} & 0 & 0 & \cdots & 0\\
\ell_{21} & \ell_{22} & 0 & \cdots & 0\\
\vdots & & \ddots & & \vdots\\
\ell_{n1} & \ell_{n2} & \cdots & & \ell_{nn}
\end{bmatrix} \tag{2.34}
\]
and then try to compare the entries of LL^T with those of A.

3. Let L = [`ij] and M = [mij] be lower-triangular matrices. (a) Prove that LM is lower triangular. (b) Prove that the entries of the main diagonal of LM are

   ℓ_{11}m_{11}, ℓ_{22}m_{22}, ···, ℓ_{nn}m_{nn}.
   Thus the product of two unit lower-triangular matrices is unit lower triangular.
4. Consider the savings explained with equation (2.31) for the computation of A^{-1}. Taking those savings into account, show that A^{-1} can be computed in 2n^3 + O(n^2) flops.

5. Use the LU decomposition obtained with partial pivoting to solve the system Ax = b, where
\[
A = \begin{bmatrix} 1 & -2 & -1 & 3\\ 1 & -2 & 0 & 1\\ -3 & -2 & 1 & 7\\ 0 & -2 & 8 & 5 \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} -12\\ -5\\ -14\\ -7 \end{bmatrix}.
\]

6. Let A be a nonsingular symmetric matrix that has the factorization A = LDM^T, where L and M are unit lower triangular matrices and D is a diagonal matrix. Show that L = M. Hint: You may use the uniqueness argument, the first claim in Section 2.4.2.

Chapter 3

The Least-Squares Problem

In this chapter, we study the least-squares problem, which arises frequently in scientific and engineering computations. After describing the problem, we will develop tools to solve the problem:

• normal equation • reflectors and rotators • the Gram-Schmidt orthonormalization process • the QR decomposition • the singular value decomposition (SVD)

3.1. The Discrete Least-Squares Problem

Definition 3.1. Let A ∈ Rm×n and b ∈ Rm. The linear least-squares problem is to find x ∈ Rn which minimizes

||Ax − b||_2   or, equivalently,   ||Ax − b||_2^2

Remark 3.2.

• The error ||r||2 ≡ ||Ax − b||2 is the distance between b and the vector

Ax = x1a1 + x2a2 + ··· + xnan (3.1)

• If x̂ is the LS solution, then Ax̂ = Proj_{Col(A)} b.

Methods for Solving LS Problems

1. Normal equation — the fastest but least accurate; it is adequate when the condition number is small.
2. QR decomposition — the standard method; it costs up to twice as much as the normal equation.
3. SVD — of most use on an ill-conditioned problem, i.e., when A is not of full rank; it is several times more expensive.
4. Transform to a linear system — lets us do iterative refinement and improve the solution when the problem is ill-conditioned.

• All methods but the 3rd can be adapted to work with sparse matrices.
• We will assume rank(A) = n in methods 1 and 2.

3.1.1. Normal equations

• Normal Equation: AT Ax = AT b (3.2)

• Derivation: Goal: Minimize

2 T ||Ax − b||2 = (Ax − b) (Ax − b)

||A(x + e) − b||2 − ||Ax − b||2 0 = lim 2 2 e→0 ||e||2 2eT (AT Ax − AT b) + eT AT Ae = lim e→0 ||e||2

T T 2 2 2 e A Ae ||Ae||2 ||A||2 · ||e||2 2 2nd Term: 2 = ≤ = ||A||2 · ||e||2 ||e||2 ||e||2 ||e||2 eT AT Ae =⇒ 2 → 0, as e → 0. ||e||2 2 Thus, if we look for x where the gradient of ||Ax − b||2 vanishes, then AT Ax = AT b. 3.1. The Discrete Least-Squares Problem 75

Remark 3.3.

2 ∆ • Why x is the minimizer of ||Ax − b||2 = f(x)? ∂f(x) ∂ (Ax − b)T (Ax − b) = ∇ f(x) = ∂x x ∂x ∂ xT AT Ax − 2xT AT b = = 2AT Ax − 2AT b ∂x ∂ ∂f(x) H (x) = = 2AT A positive definite f ∂x ∂x (under the assumption that rank(A) = n)

So the function f is strictly convex, and any critical point is a global minimum.1

• For A ∈ Rm×n, if rank(A) = n, then AT A is SPD; indeed, – AT A is symmetric since (AT A)T = AT (AT )T = AT A. – Since rank(A) = n, m ≥ n, and the columns of A are linearly indepen- dent. So Ax = 0 has only the trivial solution. Hence, for any x ∈ Rn, if x 6= 0, then Ax 6= 0. n T T T 2 – For any 0 6= x ∈ R , x A Ax = (Ax) (Ax) = ||Ax||2 > 0. So the normal equation AT Ax = AT b has a unique solution x = (AT A)−1AT b.

1Hf (x) is called Hessian matrix of f. 76 Chapter 3. The Least-Squares Problem

• If we let x′ = x + e, where x is the minimizer, one can easily verify the following:
\[ \|Ax' - b\|_2^2 = \|Ax - b\|_2^2 + \|Ae\|_2^2 \ge \|Ax - b\|_2^2. \tag{3.3} \]
  (See Homework 3.1.) Note that since A is of full rank, Ae = 0 if and only if e = 0, which again proves the uniqueness of the minimizer x.
• Numerically, the normal equation can be solved by the Cholesky factorization. Total cost: n^2 m + (1/3)n^3 + O(n^2) flops. Since m ≥ n, the term n^2 m dominates the cost.

Example 3.4. Solve the following least-squares problem

\[
\begin{aligned}
 x_1 + 3x_2 - 2x_3 &= 5\\
3x_1 \;\;\;\;\;\;\;\;\, + x_3 &= 4\\
2x_1 + x_2 + x_3 &= 3\\
2x_1 + x_2 + x_3 &= 2\\
 x_1 - 2x_2 + 3x_3 &= 1
\end{aligned}
\]

Solution. Let the coefficient matrix be denoted by A and the RHS by b. Then
\[
A = \begin{bmatrix} 1 & 3 & -2\\ 3 & 0 & 1\\ 2 & 1 & 1\\ 2 & 1 & 1\\ 1 & -2 & 3 \end{bmatrix}, \quad
b = \begin{bmatrix} 5\\ 4\\ 3\\ 2\\ 1 \end{bmatrix}
\;\Rightarrow\;
A^TA = \begin{bmatrix} 19 & 5 & 8\\ 5 & 15 & -10\\ 8 & -10 & 16 \end{bmatrix}, \quad
A^Tb = \begin{bmatrix} 28\\ 18\\ 2 \end{bmatrix}.
\]
Solving the normal equation A^TAx = A^Tb gives
\[ x = \Big[\frac{7}{5},\ \frac{3}{5},\ -\frac{1}{5}\Big]^T. \]
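The same least-squares solution can be checked numerically; in the sketch below, both lines return (7/5, 3/5, −1/5)^T (Matlab's backslash uses a QR-based method for rectangular A).

A = [1 3 -2; 3 0 1; 2 1 1; 2 1 1; 1 -2 3];
b = [5; 4; 3; 2; 1];
x_ne = (A'*A)\(A'*b)   % normal equations
x_bs = A\b             % backslash (QR-based least squares)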

3.2. LS Solution by QR Decomposition

3.2.1. Gram-Schmidt process

Given a basis {x_1, x_2, ···, x_p} for a nonzero subspace W ⊆ R^n, define
\[
\begin{aligned}
v_1 &= x_1\\
v_2 &= x_2 - \frac{x_2\cdot v_1}{v_1\cdot v_1}v_1\\
v_3 &= x_3 - \frac{x_3\cdot v_1}{v_1\cdot v_1}v_1 - \frac{x_3\cdot v_2}{v_2\cdot v_2}v_2\\
&\;\vdots\\
v_p &= x_p - \frac{x_p\cdot v_1}{v_1\cdot v_1}v_1 - \frac{x_p\cdot v_2}{v_2\cdot v_2}v_2 - \cdots - \frac{x_p\cdot v_{p-1}}{v_{p-1}\cdot v_{p-1}}v_{p-1}
\end{aligned}
\]
Then {v_1, v_2, ···, v_p} is an orthogonal basis for W. In addition,
\[ \mathrm{span}\{v_1, v_2, \cdots, v_k\} = \mathrm{span}\{x_1, x_2, \cdots, x_k\}, \quad 1 \le k \le p.
\]

 1   1   0   0   1   1        Example 3.5. Let x1 =  , x2 =  , x3 =  , and {x1, x2, x3} be a  −1   0   2  1 1 1 basis for a subspace W of R4. Construct an orthogonal basis for W .

Solution. Let v1 = x1. Then W1 = Span{x1} = Span{v1}.      1    1 1 3 1 x · v  1  2  0   1  1  3  v = x − 2 1 v =   −   =   =   , 2 2 v · v 1   3    2  3   1 1  0   −1   3   2  1 1 1 3 1  1   3    v2 ⇐ 3v2 =   (for convenience)  2  1

Then,  0   1   1   −1  x · v x · v  1  −1  0  8  3  1  −3  3 1 3 2         v3 = x3 − v1 − v2 =   −   −   =   , v1 · v1 v2 · v2  1  3  −1  15  2  5  3  1 1 1 4  −1   −3    v3 ⇐ 5v3 =    3  4

Then {v1, v2, v3} is an orthogonal basis for W .

Note: Replacing v2 with 3v2, v3 with 5v3 does not break orthogonality. 80 Chapter 3. The Least-Squares Problem

Definition 3.6. An orthonormal basis is an orthogonal basis {v1, v2, ··· , vp} with ||vi|| = 1, for all i = 1, 2 ··· , p. That is vi · vj = δij.  1   3      Example 3.7. Let v1 =  0 , v2 =  3 , W = Span{v1, v2}. Construct an −1 3 orthonormal basis for W .

Solution. It is easy to see that v1 · v2 = 0. So {v1, v2} is an orthogonal basis for W . Rescale the basis vectors to form a orthonormal basis {u1, u2}:

 −1   −√1  1 1 2 u1 = v1 = √  0  =  0  ||v1|| 2     √1 1 2  3   √1  1 1 3    √1  u2 = v2 = √ 3 = 3 ||v2|| 3 3     √1 3 3 3.2. LS Solution by QR Decomposition 81 3.2.2. QR decomposition, by Gram-Schmidt process

Theorem 3.8. (QR Decomposition) Let A ∈ Rm×n with m ≥ n. Suppose A has full column rank (rank(A) = n). Then there exist a unique m × n orthog- T onal matrix Q (Q Q = In) and a unique n × n upper triangular matrix R with positive diagonals rii > 0 such that A = QR.

Proof. (Sketch) Let A = [a1 ··· an].

1. Construct an orthonormal basis {q1, q2, ··· , qn} for W = Col(A) by Gram-

Schmidt process, starting from {a1, ··· , an}

2. Q = [q1 q2 ··· qn].

Span{a1, a2, ··· , ak} = Span{q1, q2, ··· , qk}, 1 ≤ k ≤ n ak = r1kq1 + r2kq2 + ··· + rkkqk + 0 · qk+1 + ··· + 0 · qn Note: If rkk < 0, multiply both rkk and qk by −1.

T 3. Let rk = [r1k r2k ··· rkk 0 ··· 0] . Then ak = Qrk for k = 1, 2, ··· , n. Let

R = [r1 r2 ··· rn].

4. A = [a1 ··· an] = [Qr1 Qr2 ··· Qrn] = QR. 82 Chapter 3. The Least-Squares Problem

Example 3.9. Let A be a 2 × 2 matrix:   a11 a12   A = = a1 a2 a21 a22

The GS method:

r11 = ||a1||2 q1 = a1/r11 r12 = q1 · a2 q2 = a2 − r12q1 r22 = ||q2||2 q2 = q2/r22

QR Decomposition for a 2 × 2 matrix. 3.2. LS Solution by QR Decomposition 83

Algorithm 3.10. (QR Decomposition). Let

A = [a1 ··· an].

In Gram-Schmidt, qi ← vi/||vi||, then

a1 = (q1 · a1)q1

a2 = (q1 · a2)q1 + (q2 · a2)q2

a3 = (q1 · a3)q1 + (q2 · a3)q2 + (q3 · a3)q3 . . n X an = (qj · an)qj j=1 Thus,

A = [a1 a2 ··· an] = QR, where

Q = [q1 q2 ··· qn]   q1 · a1 q1 · a2 q1 · a3 ··· q1 · an  0 q · a q · a ··· q · a   2 2 2 3 2 n   0 0 q · a ··· q · a  R =  3 3 3 n   ......   . . . . .  0 0 0 ··· qn · an 84 Chapter 3. The Least-Squares Problem

Algorithm 3.11. Classical Gram-Schmidt (CGS).

R(1, 1) = ||A(:, 1)||2 Q(:, 1) = A(:, 1)/R(1, 1) for k = 2 : n R(1 : k − 1, k) = Q(1 : m, 1 : k − 1)T A(1 : m, k) z = A(1 : m, k) − Q(1 : m, 1 : k − 1)R(1 : k − 1, k) R(k, k) = ||z||2 Q(1 : m, k) = z/R(k, k) end

• For A ∈ Rm×n, if rank(A) = n,

k−1 ! k−1 X X qk = ak − rikqi /rkk, zk = ak − rikqi, i=1 i=1

T where rik = qi ak, i = 1 : k − 1 and ⊥ zk ∈ Span{q1, ··· , qk−1} . 3.2. LS Solution by QR Decomposition 85

Algorithm 3.12. Modified Gram-Schmidt (MGS)

for k = 1 : n
    R(k,k) = ||A(1:m, k)||_2
    Q(1:m, k) = A(1:m, k)/R(k,k)
    for j = k + 1 : n
        R(k,j) = Q(1:m, k)^T A(1:m, j)
        A(1:m, j) = A(1:m, j) − Q(1:m, k) R(k,j)
    end
end

• Let A(k) ∈ Rm×(n−k+1). Then it is defined by

k−1 n X T X T (k) (k) A − qiri = qiri [0 A ], A = [ z B ] i=1 i=k 1 n − k

T Then rkk = ||z||2, qk = z/rkk and (rk,k+1 ··· rkn) = qk B. Then compute (k+1) the outer product A = B − qk(rk,k+1 ··· rkn). ... 86 Chapter 3. The Least-Squares Problem

Algorithm 3.13. The classical Gram-Schmidt (CGS) and modified Gram- Schmidt (MGS) Algorithms for factoring A = QR: for i=1 to n /* compute ith columns of Q and R */ qi = ai for j = 1 to i − 1 /*subtract components in qj direction from ai*/  T rji = qj ai CGS T rji = qj qi MGS qi = qi − rjiqj end for rii = ||qi||2 if rii = 0 /* ai is linearly dependent on a1, ··· , ai−1 */ quit else if qi = qi/rii end for

Note: The two formulas for r_ji in the above algorithm are mathematically equivalent.
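For completeness, a compact Matlab sketch of MGS (Algorithm 3.12) is given below; the function name mgs_qr is arbitrary, and A is assumed to have full column rank.

function [Q,R] = mgs_qr(A)
% Modified Gram-Schmidt QR factorization: A = Q*R with Q'*Q = I.
[m,n] = size(A);  Q = A;  R = zeros(n);
for k = 1:n
    R(k,k) = norm(Q(:,k));
    Q(:,k) = Q(:,k)/R(k,k);
    for j = k+1:n
        R(k,j) = Q(:,k)'*Q(:,j);
        Q(:,j) = Q(:,j) - Q(:,k)*R(k,j);
    end
end
end

A quick check is norm(A - Q*R) and norm(Q'*Q - eye(size(A,2))), both of which should be at the level of roundoff for a well-conditioned A.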

Example 3.14. Use the CGS process to find a QR decomposition of

 1 1 0   0 1 1  A =   .  −1 0 2  1 1 1

Solution. Let A ≡ [x1 x2 x3]. Then, it follows from Example 3.5 on page 79 that {v1, v2, v3}, where  1   1   −1   0   3   −3        v1 =   , v2 =   , v3 =   ,  −1   2   3  1 1 4 is an orthogonal basis of Col(A), the column space of A. Normalize the basis vectors, we get       √1 √1 −√1 3 15 35  0   √3   −√3  v1   v2  15  v3  35  q1 = =   , q2 = =   , q3 = =   . ||v ||  −√1  ||v ||  √2  ||v ||  √3  1  3  2  15  3  35  √1 √1 √4 3 15 35 Then, √ r = q · x = 3, r = q · x = √2 , r = q · x = −√1 11 1 1 12 1 2 3 13 1 3 3 r = q · x = √5 , r = q · x = √8 , 22 2 2 15 23 2 3 15 r = q · x = √7 33 3 3 35 Thus,  √1 √1 −√1  √ 3 15 35  3 √2 −√1   0 √3 −√3  3 3 Q =  15 35  ,R =  0 √5 √8  . (3.4)  −√1 √2 √3   15 15   3 15 35  0 0 √7 √1 √1 √4 35 3 15 35 It is easy to check that A = QR. 88 Chapter 3. The Least-Squares Problem

 1 1 1   1 0 −1  Example 3.15. Find a QR decomposition of A =   by MGS.  0 1 −1  0 0 1

Solution. Let A ≡ [a1 a2 a3].

k = 1:
    r11 = ||a1||2 = √2,      a1 ← a1/r11 = (√2/2)[1, 1, 0, 0]^T
    r12 = a1^T a2 = √2/2,    a2 ← a2 − r12 a1 = [1, 0, 1, 0]^T − (1/2)[1, 1, 0, 0]^T = (1/2)[1, −1, 2, 0]^T
    r13 = a1^T a3 = 0,       a3 ← a3 − r13 a1 = a3 = [1, −1, −1, 1]^T

k = 2:
    r22 = ||a2||2 = √6/2,    a2 ← a2/r22 = (√6/6)[1, −1, 2, 0]^T
    r23 = a2^T a3 = 0,       a3 ← a3 − r23 a2 = a3 = [1, −1, −1, 1]^T

k = 3:
    r33 = ||a3||2 = 2,       a3 ← a3/r33 = (1/2)[1, −1, −1, 1]^T

So we get the QR decomposition:

              [ √2/2    √6/6    1/2 ]   [ √2   √2/2   0 ]
    A = QR =  [ √2/2   −√6/6   −1/2 ]   [  0   √6/2   0 ]
              [  0     2√6/6   −1/2 ]   [  0    0     2 ]
              [  0       0      1/2 ]

Note: In the above process, we use the array A to store Q, and R is stored in another array.

Remark 3.16.

• The CGS (classical Gram-Schmidt) method has very poor numerical properties, in that there is typically a severe loss of orthogonality among the computed qi.

• The MGS (modified Gram-Schmidt) method yields a much better computational result by a rearrangement of the calculation.

• CGS is numerically unstable in floating point arithmetic when the columns of A are nearly linearly dependent. MGS is more stable but may still result in Q being far from orthogonal when A is ill-conditioned.

• Both CGS and MGS require about 2mn^2 flops to compute a QR factorization of A ∈ Rm×n.

3.2.3. Application to LS solution

• Let A ∈ Rm×n with m ≥ n. If m > n, we can choose m − n more orthonormal
  vectors Q̃ so that [Q, Q̃] is a square orthogonal matrix. So

    ||Ax − b||_2^2 = ||[Q, Q̃]^T (Ax − b)||_2^2 = || [Q^T; Q̃^T] (QRx − b) ||_2^2
                   = || [I_{n×n}; O_{(m−n)×n}] Rx − [Q^T b; Q̃^T b] ||_2^2
                   = || [Rx − Q^T b; −Q̃^T b] ||_2^2
                   = ||Rx − Q^T b||_2^2 + ||Q̃^T b||_2^2  ≥  ||Q̃^T b||_2^2

• Since rank(A) = rank(R) = n, R is nonsingular. So x = R−1QT b is the solution for Rx = QT b, and

    min_x ||Ax − b||_2^2 = ||Q̃^T b||_2^2

• Second Derivation:

Ax − b = QRx − b = QRx − (QQT + I − QQT )b = Q(Rx − QT b)−(I − QQT )b.

Since T Q(Rx − QT b) (I − QQT )b = (Rx − QT b)T QT (I − QQT )b = (Rx − QT b)T [0]b = 0,

the vectors Q(Rx−QT b) and (I−QQT )b are orthogonal. So by Pythagorean theorem,

2 T 2 T 2 ||Ax − b||2 = ||Q(Rx − Q b)||2 + ||(I − QQ )b||2 T 2 T 2 = ||Rx − Q b||2 + ||(I − QQ )b||2

• Third Derivation:

x = (A^T A)^{−1}A^T b = (R^T Q^T QR)^{−1}R^T Q^T b = (R^T R)^{−1}R^T Q^T b = R^{−1}R^{−T}R^T Q^T b = R^{−1}Q^T b

Example 3.17. Solve the LS problem min_x ||Ax − b||_2^2 when

        [ 2     −1    ]        [   0    ]
    A = [ 0   1.e−6   ] ,  b = [ 2.e−6  ] .
        [ 0      0    ]        [   2    ]

Solution. (1. Normal equation): The normal equation A^T Ax = A^T b reads

    [  4        −2      ] [ x1 ]   [    0    ]
    [ −2   1 + 10^{−12} ] [ x2 ] = [ 2.e−12  ]

Solution: x1 = 1, x2 = 2.
(Round-off issue): The arrays A^T A and A^T b, when rounded to 10 digits, read

    A^T A ≈ [ 4  −2 ; −2  1 ],      A^T b ≈ [ 0 ; 2.e−12 ].

No solution (singular matrix).
(2. QR decomposition): Factor A = QR as

        [ 1  0 ]         [ 2    −1   ]
    Q = [ 0  1 ] ,   R = [ 0  1.e−6  ] .
        [ 0  0 ]

Solution of Rx = Q^T b = [0; 2.e−6]:  x1 = 1, x2 = 2.
Conclusion: QR is better in numerical stability.
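The comparison can be reproduced in Matlab/Octave (a rough double-precision analogue of the example; A'*A is then not exactly singular, but it is dramatically worse conditioned than R):

A = [2 -1; 0 1e-6; 0 0];   b = [0; 2e-6; 2];
x_ne = (A'*A)\(A'*b);      % normal equations: squares the condition number
[Q,R] = qr(A,0);           % thin QR factorization
x_qr = R\(Q'*b);           % solve R*x = Q'*b
disp([x_ne x_qr])          % both are close to [1; 2]; QR is the more reliable route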

3.3. Orthogonal Matrices

Approaches for orthogonal decomposition:

1. Classical Gram-Schmidt (CGS) and Modified Gram-Schmidt (MGS) process (§ 3.2.2) 2. Householder reflection (reflector) 3. Givens rotation (rotator)

This section studies Householder reflection and Givens rotation in detail, in the use for QR decomposition. 3.3. Orthogonal Matrices 95 3.3.1. Householder reflection

The Householder reflection is also called “reflector". In order to introduce the reflector, let’s begin with the projection. Definition 3.18. Let a, b ∈ Rn. The projection of a onto {b} is the vector defined by a · b proj a = b (3.5) b b · b

Pictorial interpretation:

• proj ba = a1

• ka1k = kak cos θ

• a = a1 + a2 is called an orthogonal de- composition of a. 96 Chapter 3. The Least-Squares Problem

Householder reflection (or, Householder transformation, Elementary reflector) is a transformation that takes a vector and reflects it about some plane or hyperplane. Let u be a unit vector which is orthogonal to the hyper- plane. Then, the reflection of a point x about this hyperplane is

x − 2 proj ux = x − 2(x · u)u. (3.6)

Note that x − 2(x · u)u = x − 2(uuT )x = (I − 2uuT )x. 3.3. Orthogonal Matrices 97

Definition 3.19. Let u ∈ Rn with kuk = 1. Then the matrix P = I − 2uuT is called a Householder matrix (or, Householder reflector).

Proposition 3.20. Let P = I − 2uuT with kuk = 1. 1. P u = −u 2. P v = v, if u · v = 0 3. P = P T (P is symmetric) 4. P T = P −1 (P is orthogonal) 5. P −1 = P (P is an involution) 6. PP T = P 2 = I.

Indeed,

    PP^T = (I − 2uu^T)(I − 2uu^T) = I − 4uu^T + 4uu^T uu^T = I.

Example 3.21. Let u = [.6; .8], P = I − 2uu^T, and x = [−2; 4]. Then P x is the
reflection of x about the line through the origin perpendicular to u.

Solution.
    P x = [  .28   −.96 ] [ −2 ]  =  [ −4.4 ]
          [ −.96   −.28 ] [  4 ]     [   .8 ]

One can easily see that ||P x||2 = ||x||2 = √20.
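A quick Matlab/Octave check of Example 3.21 and of the properties in Proposition 3.20 (our own verification, not part of the text):

u = [0.6; 0.8];                % unit vector
P = eye(2) - 2*(u*u');         % Householder reflector
x = [-2; 4];
P*x                            % [-4.4; 0.8]
norm(P*x) - norm(x)            % 0: reflection preserves length
norm(P - P')                   % 0: P is symmetric
norm(P*P - eye(2))             % 0: P is an involution, hence orthogonal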

Claim 3.22. For a nonzero vector v ∈ Rn, define

T 2 P = I − β vv , β = 2 . kvk2 • Then P is a reflector, which satisfies all the properties in Proposition 3.20. • Any nonzero constant multiple of v will generate the same reflector. 3.3. Orthogonal Matrices 99

n Theorem 3.23. Let x, y be two vectors in R with x 6= y but ||x||2 = ||y||2. Then there exists a unique reflector P such that P x = y. T 2 Proof. (Existence) Let v = x − y and P = I − β vv , where β = 2/kvk2. Note that 1 1 x = (x − y) + (x + y). (3.7) 2 2 Then P (x − y) = −(x − y). (3.8)

2 2 Since (x − y) · (x + y) = ||x||2 − ||y||2 = 0, by Proposition 3.20, we have

P (x + y) = (x + y). (3.9)

It follows from (3.7), (3.8), and (3.9) that P x = y. (Uniqueness) It will be a good exercise to prove the uniqueness of the reflec- tor; see Homework 3.2.

Summary: Let x 6= ±y but ||x||2 = ||y||2. Then the reflector P : x 7→ ±y can be formulated by 1. v = x − (±y) = x ∓ y. T 2 2. P = I − β vv , where β = 2/kvk2. 100 Chapter 3. The Least-Squares Problem

 −2   −3      Example 3.24. Let x =  1 , y =  0 . Find the Householder matrix 2 0 P so P x = y. Solution. Let  −2   −3   1  √       v = x − y =  1  −  0  =  1  , ||v|| = 6 2 0 2 2 P = I − vvT 6    1 0 0  1 2 = 0 1 0 −  1   1 1 2    6   0 0 1 2  2 1 2  3 −3 −3 =  −1 2 −2   3 3 3  2 2 1 −3 −3 −3  −3    Now, it is a simple matter to check P x =  0  = y. 0 3.3. Orthogonal Matrices 101

Claim 3.25. For an arbitrary vector x, suppose we want to find a House- holder reflection P in order to zero out all but the first entry of x, i.e.,

T P x = [c 0 ··· 0] = c · e1.

Then, the reflector can be defined as follows.

1. Set c = ±kxk2 (||x||2 = ||P x||2 = |c|)

2. Set v = x − c · e1 such that v 6= 0. T 2 3. Get P = I − β vv , where β = 2/kvk2.

Remarks:

• In the first step of Claim 3.25, we often set

    c = −sign(x1) ||x||2                                                   (3.10)

so that

                     [ x1 + sign(x1)||x||2 ]
                     [         x2          ]                  v
    v = x − c · e1 = [         :           ] ,   with   u = ------ ,       (3.11)
                     [         xn          ]                ||v||2

which avoids cancellation (v ≈ 0).
Definition: We write (3.11) as u = House(x).
• Using the arguments in the proof of Theorem 3.23, one can check

    P x = [c, 0, ··· , 0]^T = [−sign(x1)||x||2, 0, ··· , 0]^T.

 −2    Example 3.26. Let x =  1 . Find the Householder reflector so that all 2 but the first entry of the reflection of x are zero. √ p 2 2 2 Solution. ||x||2 = (−2) + 1 + 2 = 9 = 3. x1 = −2.       x1 + sign(x1)||x||2 −2 − 3 −5       v =  x2  =  1  =  1  , x3 2 3 √ =⇒ ||v|| = 30

P = I − βvvT    1 0 0  −5 2 = 0 1 0 −  1   −5 1 2    30   0 0 1 2  2 1 2  −3 3 3 =  1 14 − 2   3 15 15  2 2 11 3 −15 15 Thus,     −sign(x1)||x||2 3     P x =  0  =  0  0 0 3.3. Orthogonal Matrices 103   x1 − ||x||2  x   2  Question: What if we set v =  . , for x1 > 0?  .  xn

Answer. To avoid cancellation, instead of computing x1 − ||x||2 directly, we may use

                  x1^2 − ||x||2^2     −(x2^2 + x3^2 + ··· + xn^2)       −σ
    x1 − ||x||2 = ---------------  =  ---------------------------  ≡  -------- ,
                    x1 + ||x||2               x1 + ||x||2              x1 + µ

where σ = x2^2 + x3^2 + ··· + xn^2 and µ = ||x||2. Then,

    ||v||2^2 = σ + ( −σ/(x1 + µ) )^2

and
    P = I − β vv^T,   β = 2/||v||2^2 ;      P x = ||x||2 e1.

Algorithm 3.27. (house(x)) Given x ∈ Rn, the function house(x) computes v ∈ Rn with v(1) = 1 and β ∈ R, such that

    P = I_n − βvv^T,    P x = ||x||2 e1,

where P is a reflector. Total cost of house(x): about 3n flops.

function [v, β] = house(x)
    n = length(x);
    σ = x(2 : n)^T x(2 : n);             (∼ 2n flops)
    v = [1; x(2 : n)];
    if σ = 0
        β = 0;
    else
        µ = sqrt(x(1)^2 + σ);            // = ||x||2
        if x(1) ≤ 0
            v(1) = x(1) − µ;
        else
            v(1) = −σ/(x(1) + µ);
        end
        β = 2 v(1)^2/(σ + v(1)^2);
        v = v/v(1);                      (∼ n flops)
    end
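The pseudocode translates line by line into runnable Matlab/Octave (our transcription; it is also what the QR sketch in § 3.3.2 below will call):

function [v,beta] = house(x)
% Householder vector: P = I - beta*v*v' (with v(1) = 1) satisfies P*x = ||x||_2 * e1.
  n = length(x);
  sigma = x(2:n)'*x(2:n);
  v = [1; x(2:n)];
  if sigma == 0
      beta = 0;
  else
      mu = sqrt(x(1)^2 + sigma);        % = ||x||_2
      if x(1) <= 0
          v(1) = x(1) - mu;
      else
          v(1) = -sigma/(x(1) + mu);    % avoids cancellation
      end
      beta = 2*v(1)^2/(sigma + v(1)^2);
      v = v/v(1);
  end
end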

The feature of v(1) = 1 is important in the computation of PA when m is small, and in the computation of AP when n is small. 3.3. Orthogonal Matrices 105

Application of reflectors to a matrix A ∈ Rm×n

• If P = I − βvvT ∈ Rm×m, then

PA = (I − βvvT )A = A − vwT (3.12)

where w = βAT v.

• If P = I − βvv^T ∈ Rn×n, then

AP = A(I − βvvT ) = A − wvT (3.13)

where w = βAv.
• An m-by-n Householder update costs 4mn flops = {a matrix-vector multiplication} + {an outer-product update}.
• Householder updates never require the explicit formation of the Householder matrix.

3.3.2. QR decomposition by reflectors

Example 3.28. Compute the QR decomposition of a 5 × 3 matrix A using Householder Reflection. Solution.  ∗∗∗   ∗ ∗∗   ∗ ∗ ∗   ∗ ∗ ∗   ∗∗∗   0 ∗∗   0 ∗ ∗   0 ∗ ∗    P1   P2   P3    ∗∗∗  −→  0 ∗∗  −→  00 ∗  −→  0 0 ∗           ∗∗∗   0 ∗∗   00 ∗   0 00  ∗∗∗ 0 ∗∗ 00 ∗ 0 00 | {z } | {z } | {z } | {z } A A1=P1A A2=P2P1A A3=P3P2P1A where     1 0 I2 0 P2 = 0 ,P3 = 0 , 0 P2 0 P3 0 0 P1, P2, P3 are Householder matrices.

Let R˜ ≡ A3. Then T T T ˜ A = P1 P2 P3 R = QR, where T T T Q = P1 P2 P3 = P1P2P3, and R is the first 3 rows of A3. 3.3. Orthogonal Matrices 107

Algorithm 3.29. (QR Factorization Using Reflectors)

for i = 1 to min(m − 1, n)

ui = House(A(i : m, i)) 0 T Pi = I − 2uiui 0 A(i : m, i : n) = Pi A(i : m, i : n) (†) end

The computation in (†):

T T  (I − 2uiui )A(i : m, i : n) = A(i : m, i : n) − 2ui ui A(i : m, i : n) (‡) costs much less than the matrix-matrix multiplication. Computational complexity: In Algorithm 3.29, the computations are mainly in (†) or (‡).

    2(m − i)(n − i)   for   ui^T A(i : m, i : n)
     (m − i)(n − i)   for   ui (∗)
     (m − i)(n − i)   for   the subtraction A(i : m, i : n) − 2 ui (∗)

Thus,
    total = Σ_{i=1}^{n} 4(m − i)(n − i) = 4 Σ_{i=1}^{n} [ mn − i(m + n) + i^2 ]
          = 4mn^2 − 4(m + n) n(n + 1)/2 + 4 · n(n + 1)(2n + 1)/6
          ∼ 2mn^2 − 2n^3/3

Algorithm 3.30. (Householder QR)[6] Given A ∈ Rm×n with m ≥ n, the following algorithm finds Householder matrices H1, ··· , Hn such that if T Q = H1 ··· Hn, then Q A = R is upper triangular. The upper triangular part of A is overwritten by the upper triangular part of R and components j + 1 : m of the j-th Householder vector are stored in A(j + 1 : m, j), j < m.

for j = 1 : n
    [v, β] = house(A(j : m, j))
    A(j : m, j : n) = (I_{m−j+1} − βvv^T) A(j : m, j : n)
    if j < m
        A(j + 1 : m, j) = v(2 : m − j + 1)
    end
end
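A Matlab/Octave sketch of Algorithm 3.30, built on the house transcription above; returning Q and R explicitly is our own simplification (the text instead stores the Householder vectors in the lower part of A).

function [Q,R] = house_qr(A)
% Householder QR: for A (m-by-n, m >= n), A = Q*R with Q (m-by-m) orthogonal.
  [m,n] = size(A);
  Q = eye(m);
  for j = 1:n
      [v,beta] = house(A(j:m,j));
      A(j:m,j:n) = A(j:m,j:n) - beta*v*(v'*A(j:m,j:n));  % rank-1 update; P is never formed
      Q(:,j:m)   = Q(:,j:m) - (Q(:,j:m)*v)*(beta*v');    % accumulate Q = H1*H2*...*Hn
  end
  R = triu(A);
end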

How A is overwritten in QR [6, p.225]

 0   r r r r r r   .  11 12 13 14 15 16  .   (1)     v2 r22 r23 r24 r25 r26   0   (1) (2)  (j)    v3 v3 r33 r34 r35 r36  v =  1  ← j ; A =  (1) (2) (3)   (j)   v4 v4 v4 r44 r45 r46   vj+1     .   v(1) v(2) v(3) v(4) r r   .   5 5 5 5 55 56    (1) (2) (3) (4) (5) (j) v6 v6 v6 v6 v6 r66 vm

Upon completion, the upper triangular part of A is overwritten by R, and the other part by Householder vectors.

• If the computation of

    Q = (Hn ··· H1)^T = H1 ··· Hn

  is required, then it can be accumulated using (3.12) or (3.13). This accumulation
  requires 4(m^2 n − mn^2 + n^3/3) flops.

Example 3.31. Let A = [ 0.70000  0.70711 ; 0.70001  0.70711 ]. Compute the factor Q by
Householder reflection and by MGS, and test the orthogonality of Q. Round the
intermediate results to 5 significant digits.
Solution. (MGS) The classical and modified Gram-Schmidt algorithms are identical
in the 2 × 2 case.

    r11 = .98996,    q1 = a1/r11 = [ .70000/.98996 ; .70001/.98996 ] = [ .70710 ; .70711 ]
    r12 = q1^T a2 = .70710 ∗ .70711 + .70711 ∗ .70711 = 1.0000
    q2 = a2 − r12 q1 = [ .70711 ; .70711 ] − [ .70710 ; .70711 ] = [ .00001 ; .00000 ],

    Q = [ .70710  1.0000 ; .70711  .0000 ],      ||Q^T Q − I_2|| ≈ 0.70710

So the Q matrix is not close to any orthogonal matrix! (Householder reflection) Solution by Householder. " # " # .70000 .70711 a = , a = , ||a || = r ≈ .98996. 1 .70001 2 .70711 1 2 11 " # " # .70000 + .98996. 1.6900 v = = , .70001 .70001 T T 2 2v a2 P1 = I2 − 2vv /||v||2,P1a2 = a2 − 2 v ||v||2  −.70710 −.70711  ⇒ Q = P T = 1 −.70711 .70710 T ||Q Q − I2|| = 5.0379e − 6 So Householder reflection is more stable than MGS. 3.3. Orthogonal Matrices 111 3.3.3. Solving LS problems by Householder reflectors

2 • To solve the LS problem min ||Ax−b||2 using A = QR, we need to compute

QT b and Rx = QT b.

T Since A → PnPn−1 ··· P1A = R, Q = (PnPn−1 ··· P1) ,

T Q b = PnPn−1 ··· P1b.

So we can apply exactly the same steps that were applied to A in the Householder QR algorithm: 112 Chapter 3. The Least-Squares Problem

• (Computing QT b) for i = 1 to n T γ = −2 · ui b(i : m) b(i : m) = b(i : m) + γui end for

• (Solving Rx = QT b)

R˜ ≡ [RQT b] for k = n : −1 : 1 R˜(k, n + 1) = R˜(k, n + 1)/R˜(k, k); if k 6= 1 R˜(1 : k − 1, n + 1) = R˜(1 : k − 1, n + 1) − R˜(1 : k − 1, k)R˜(k, n + 1); end if end for

(Solution x overwrites the (n + 1)-th column of R˜.) • Total cost in solving the LS problem:

2n3 2mn2 − 3 3.3. Orthogonal Matrices 113 3.3.4. Givens rotation

Definition 3.32. A Givens rotation

    R(θ) = [ cos θ   −sin θ ]
           [ sin θ    cos θ ]

rotates any vector x ∈ R^2 counter-clockwise by θ:

Linear Transformation: T : R2 → R2, with standard matrix A = R(θ). 114 Chapter 3. The Least-Squares Problem

Givens Rotation:

Any vector x ∈ R2, we can al- ways write " # cos φ x = ||x|| 2 sin φ

• Thus, " #  cos(θ) − sin(θ)  cos φ y = R(θ)x = ||x|| sin(θ) cos(θ) 2 sin φ " # " # cos θ cos φ − sin θ sin φ cos(θ + φ) = ||x|| = ||x|| 2 sin θ cos φ + cos θ sin φ 2 sin(θ + φ)

• R(θ) is an orthogonal matrix, i.e., R(θ)T R(θ) = I 3.3. Orthogonal Matrices 115

 cos(θ) sin(θ)  • Q(θ) = is a reflection matrix. That is, y = Q(θ)x is a sin(θ) − cos(θ) θ reflection of x across the line 2.

y =Q(θ)x " # cos(θ − φ) =||x|| 2 sin(θ − φ)

• Let  1   1     ..   .     cos θ − sin θ  ← i  .  R(i, j, θ) =  ..     sin θ cos θ  ← j    ..   .     1  1 ↑↑ ij

Then, for any vector x ∈ Rn, y = R(i, j, θ)x is a rotation of x counter- clockwise by θ. The matrix R(i, j, θ) is sometimes called a plane rotator. 116 Chapter 3. The Least-Squares Problem

• For any vector x ∈ R^n, y = R(i, j, θ)x rotates x counter-clockwise by θ in the (i, j)-plane:

         / c xi − s xj,   k = i,
    yk = | s xi + c xj,   k = j,
         \ xk,            k ≠ i, j.

  R(i, j, θ)x affects only two entries (the i-th and j-th) of x. In order to transform
  yj to zero, we let

    c = cos θ = xi / sqrt(xi^2 + xj^2),      s = sin θ = −xj / sqrt(xi^2 + xj^2).

  Then
    [ cos θ   −sin θ ] [ xi ]   [ sqrt(xi^2 + xj^2) ]
    [ sin θ    cos θ ] [ xj ] = [         0         ]

Algorithm 3.33. (Givens Rotation Algorithm) Given scalars a and b, the following
function computes c = cos θ and s = sin θ such that

    [ c  −s ] [ a ]   [ r ]
    [ s   c ] [ b ] = [ 0 ] ,      where r = sqrt(a^2 + b^2).

function [c, s] = givens(a, b)
    if b = 0
        c = 1;  s = 0;
    elseif |b| > |a|
        τ = −a/b;   s = 1/sqrt(1 + τ^2);   c = s τ;
    else
        τ = −b/a;   c = 1/sqrt(1 + τ^2);   s = c τ;
    end
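In runnable Matlab/Octave form (our transcription; note that a^2 + b^2 is never formed directly, which avoids overflow):

function [c,s] = givens(a,b)
% Compute c, s so that [c -s; s c]*[a; b] = [r; 0].
  if b == 0
      c = 1;  s = 0;
  elseif abs(b) > abs(a)
      tau = -a/b;  s = 1/sqrt(1 + tau^2);  c = s*tau;
  else
      tau = -b/a;  c = 1/sqrt(1 + tau^2);  s = c*tau;
  end
end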

Note: It does not compute θ and it does not involve inverse trig. functions. Total cost: 5 flops and a single square root. 118 Chapter 3. The Least-Squares Problem

 1   2    Example 3.34. Compute R(2, 4, θ)x, where x =  .  3  4 Solution. Since 2 1 −4 −2 cos θ = √ = √ and sin θ = √ = √ , 22 + 42 5 22 + 42 5 we have    1   1  1 0 0 0 √  0 √1 0 √2   2   20   5 5      R(2, 4, θ)x =     =   .  0 0 1 0   3   3  0 −√2 0 √1 5 5 4 0 3.3. Orthogonal Matrices 119 3.3.5. QR factorization by Givens rotation

Graphical Interpretation  ∗ ∗ ∗   ∗ ∗ ∗   ∗ ∗ ∗   ∗∗∗   ∗ ∗ ∗  (3,4)  ∗ ∗ ∗  (2,3)  ∗ ∗∗  (1,2)  0 ∗∗    −−→   −−→   −−→    ∗ ∗ ∗   ∗ ∗∗   0 ∗∗   0 ∗ ∗  ∗ ∗ ∗ 0 ∗∗ 0 ∗ ∗ 0 ∗ ∗ | {z } | {z } | {z } | {z } A R(3,4,θ)A=A1 R(2,3,θ)A1=A2 R(1,2,θ)A2=A3  ∗ ∗ ∗   ∗ ∗ ∗   ∗ ∗ ∗  (3,4)  0 ∗ ∗  (2,3)  0 ∗∗  (3,4)  0 ∗ ∗  −−→   −−→   −−→    0 ∗ ∗   00 ∗   0 0 ∗  00 ∗ 0 0 ∗ 0 00 | {z } | {z } | {z } R(3,4,θ)A3=A4 R(2,3,θ)A4=A5 R(3,4,θ)A5=A6

• Note: R(2, 3, θ)’s and R(3, 4, θ)’s in different steps are different since they

depend on different xi’s and xj’s. 120 Chapter 3. The Least-Squares Problem

Algorithm 3.35. (Givens rotation: Row operations)

 c −s  A([i, k], :) = A([i, k], :) s c

This requires just 6n flops:

for j = 1 : n

τ1 = A(i, j);

τ2 = A(k, j);

A(i, j) = cτ1 − sτ2;

A(k, j) = sτ1 + cτ2; end

• Recall that A ← R(i, k, θ)A affects the i-th and k-th rows only.

Algorithm 3.36. (Givens QR Algorithm) Given A ∈ Rm×n with m ≥ n, the following algorithm overwrites A with R(= QT A), where R is the upper triangular and Q is orthogonal.

for j = 1 : n                                                        ← n steps
    for i = m : −1 : j + 1                                           ← m − j steps
        [c, s] = givens(A(i − 1, j), A(i, j))                        ← 6 flops
        A(i − 1 : i, j : n) = [ c  −s ; s  c ] A(i − 1 : i, j : n)   ← 6(n − j) flops
    end
end
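A Matlab/Octave sketch of Algorithm 3.36, using the givens transcription above; accumulating Q explicitly is our own addition (in practice one stores only the rotation parameters):

function [Q,R] = givens_qr(A)
% Givens QR: for A (m-by-n, m >= n), A = Q*R, zeroing subdiagonal entries bottom-up.
  [m,n] = size(A);
  Q = eye(m);
  for j = 1:n
      for i = m:-1:j+1
          [c,s] = givens(A(i-1,j), A(i,j));
          G = [c -s; s c];
          A([i-1 i], j:n) = G*A([i-1 i], j:n);   % rotate rows i-1 and i; A(i,j) becomes 0
          Q(:, [i-1 i])   = Q(:, [i-1 i])*G';    % accumulate the orthogonal factor
      end
  end
  R = triu(A);
end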

Computational complexity:

n n X X (m − j)[6(n − j) + 6] = 6 mn − (m + n)j + j2 j=1 j=1 n n  X X  = 6 mn2 − (m + n) j + j2 j=1 j=1 = 3mn2 − n3 − 3mn + n = O(3mn2 − n3).

Note: For the notation Ri,j = R(i, j, θ),

Rn,n+1 ··· Rm−1,m ··· R2,3 ··· Rm−1,m R1,2 ··· Rm−1,m A = R | {z } | {z } | {z } last column 2nd column 1st column Thus, T Q = (Rn,n+1 ··· Rm−1,m ··· R2,3 ··· Rm−1,mR1,2 ··· Rm−1,m) 122 Chapter 3. The Least-Squares Problem 3.3.6. Error propagation

Lemma 3.37. Let P be an exact Householder (or Givens) transformation, and P˜ be its floating point approximation. Then

float(PA˜ ) = P (A + E), ||E||2 = O() · ||A||2

float(AP˜) = (A + F )P, ||F ||2 = O() · ||A||2

In words, this says that applying a single orthogonal matrix is backward sta- ble. Proof. Apply the usual formula

float(a ◦ b) = (a ◦ b)(1 + ) to the formulas for computing and applying P˜.

Definition 3.38. Let alg(x) be a computational algorithm of f(x), including effects of round-off. The algorithm alg(x) is said to be backward stable if for all x there is a “small" δx such that alg(x) = f(x + δx), where δx is called the backward error.

When our employed algorithm is backward stable, we say informally that we get the exact answer (f(x + δx)) for a slightly wrong problem (x + δx). 3.3. Orthogonal Matrices 123

Theorem 3.39. Consider applying a sequence of orthogonal transformations to A0. Then the computed product is an exact orthogonal transformation of

A0 + δA, where ||δA||2 = O()||A||2. In other words, the entire computation is backward stable:

float(P˜jP˜j−1 ··· P˜1A0Q˜1Q˜2 ··· Q˜j) = Pj ··· P1(A0 + E)Q1 ··· Qj with ||E||2 = j · O() · ||A||2. Here, as in Lemma 3.37, P˜i and Q˜i are floating orthogonal matrices and Pi and Qi are exact orthogonal matrices. 124 Chapter 3. The Least-Squares Problem Why Orthogonal Matrices?

• float(XA˜ ) = XA + E = X(A + X−1E) ≡ X(A + F ),

where ||E||2 ≤ O()||X||2 · ||A||2 −1 −1 • ||F ||2 = ||X E||2 ≤ ||X ||2 · ||E||2 ≤ O() · κ2(X) · ||A||2.

• The error ||E||2 is magnified by the condition number κ2(X) ≥ 1.

• In a larger product X˜k ··· X˜1AY˜1 ··· Y˜k, the error would be magnified by

Πiκ2(Xi) · κ2(Yi).

• This factor is minimized if and only if all Xi and Yi are orthogonal (or scalar multiples of orthogonal matrices); the factor is one. 3.4. Homework 125 3.4. Homework

1. Verify (3.3) on page 76. (You may have to use the normal equation (3.2).) 2. Prove the uniqueness part of Theorem 3.23. Hint: Let P,Q be reflectors such that P x = y and Qx = y. Then, use properties in Proposition 3.20 to conclude P = Q. 3. Consider the matrix in Example 3.14. Implement codes to find its QR decomposition by using (a) the modified Gram-Schmidt process, (b) the Householder reflection, and (c) the Givens rotation. Then, compare your results with the decomposition in (3.4). 4. For a given data set x 1 2 3 4 5 y 0.8 2.1 3.3 4.1 4.7

we would like to find the best-fitting line y = a0 + a1x.

(a) Construct the algebraic system of the form Av = y, where A ∈ R5×2, T v = [a0, a1] , and y is a column vector including y-values in the data set. (b) Solve the least-squares problem by i. Normal equation ii. QR factorization operated by the classical Gram-Schmidt process. (c) Find the corresponding errors. 126 Chapter 3. The Least-Squares Problem Chapter 4

Singular Value Decomposition (SVD)

The QR decomposition is a fine tool for the solution of least-squares problems when the matrix is known to have full rank. However, if the matrix does not have full rank, or if the rank is not known, a more powerful tool is required. Such more powerful tools (for rank-deficient cases) are

• The QR decomposition with column pivoting • Singular value decomposition (SVD)

SVD may be the most important of all, for both theoretical and computational purposes.

127 128 Chapter 4. Singular Value Decomposition (SVD) 4.1. Introduction to the SVD

Rank-deficient case: We begin with a discussion on rank-deficient algebraic problems. Definition 4.1. Let A ∈ Rm×n, m ≥ n. The matrix A is called rank- deficient if rank(A) = k < n.

Theorem 4.2. Let A ∈ Rm×n, m ≥ n, with rank(A) = k > 0. Then there exist matrices Ab, Q, and R such that

Ab = QR, (4.1) where Ab is obtained from A by permuting its columns, Q ∈ Rm×m is orthogonal, and R R  R = 11 12 ∈ m×n, 0 0 R k×k R11 ∈ R is nonsingular and upper triangular.

Proposition 4.3. (Rank-deficient LS problems) Let A ∈ Rm×n, m ≥ n, with rank(A) = k < n. Then there exists an (n − k)-dimensional set of vectors x that minimize

||Ax − b||2.

Proof. Let Az = 0. Then if x minimize ||Ax−b||2, so does x+z. dim(Null(A)) = n − rank(A) = n − k by the Rank Theorem. 4.1. Introduction to the SVD 129 Singular Value Decomposition (SVD)

Theorem 4.4. (SVD Theorem) Let A ∈ Rm×n with m ≥ n. Then we can write A = U Σ V T , (4.2) where U ∈ Rm×n and satisfies U T U = I, V ∈ Rn×n and satisfies V T V = I, and Σ = diag(σ1, σ2, ··· , σn), where

σ1 ≥ σ2 ≥ · · · ≥ σn ≥ 0.

Remark 4.5. The matrices are illustrated pictorially as                  A  =  U   Σ   V T         

U is m × n orthogonal (the left singular vectors of A.) Σ is n × n diagonal (the singular values of A.) V is n × n orthogonal (the right singular vectors of A.)

• For some k ≤ n, the singular values may satisfy

σ1 ≥ σ2 ≥ · · · ≥ σk > σk+1 = ··· = σn = 0. | {z } nonzero singular values In this case, rank(A) = k. • If m < n, the SVD is defined by considering AT . 130 Chapter 4. Singular Value Decomposition (SVD)

Proof. (of Theorem 4.4) Use induction on m and n: we assume that the SVD exists for (m − 1) × (n − 1) matrices, and prove it for m × n. We assume A 6= 0; otherwise we can take Σ = 0 and let U and V be arbitrary orthogonal matrices.

• The basic step occurs when n = 1 (m ≥ n). We let A = UΣV T with U = A/||A||2, Σ = ||A||2, V = 1. • For the induction step, choose v so that

||v||2 = 1 and ||A||2 = ||Av||2 > 0.

• Let u = Av , which is a unit vector. Choose U˜, V˜ such that ||Av||2

m×n n×n U = [u U˜] ∈ R and V = [v V˜ ] ∈ R are orthogonal. • Now, we write " # uT  uT Av uT AV˜  U T AV = · A · [v V˜ ] = U˜ T U˜ T Av U˜ T AV˜

Since T 2 T (Av) (Av) ||Av||2 u Av = = = ||Av||2 = ||A||2 ≡ σ, ||Av||2 ||Av||2 T T U˜ Av = U˜ u||Av||2 = 0, we have        T T σ 0 1 0 σ 0 1 0 U AV = T = , 0 U1Σ1V1 0 U1 0 Σ1 0 V1 or equivalently

T  1 0  σ 0   1 0  A = U V . (4.3) 0 U1 0 Σ1 0 V1

Equation (4.3) is our desired decomposition. 4.1. Introduction to the SVD 131 4.1.1. Algebraic Interpretation of the SVD

Let the SVD of A be A = U Σ V T , with

U = [u1 u2 ··· un], Σ = diag(σ1, σ2, ··· , σn), V = [v1 v2 ··· vn], and σk be the smallest positive singular value. Since

A = U Σ V T ⇔ AV = UΣV T V = UΣ, we have

AV = A[v1 v2 ··· vn] = [Av1 Av2 ··· Avn]   σ1  ...      = [u1 ··· uk ··· un]  σk   ..   .  0

= [σ1u1 ··· σkuk 0 ··· 0] 132 Chapter 4. Singular Value Decomposition (SVD)

Therefore,  Av = σ u , j = 1, 2, ··· , k A = U Σ V T ⇔ j j j (4.4) Avj = 0, j = k + 1, ··· , n

Similarly, starting from AT = V Σ U T ,

 T T T A uj = σjvj, j = 1, 2, ··· , k A = V Σ U ⇔ T (4.5) A uj = 0, j = k + 1, ··· , n

Summary 4.6. It follows from (4.4) and (4.5) that

2 T • {σj }, j = 1, 2, ··· , k, are positive eigenvalues of A A.

T T 2 A A vj = A (σjuj) = σj vj, j = 1, 2, ··· , k. (4.6)

So, the singular values play the role of eigenvalues. • Equation (4.6) gives how to find the singular values and the right singular vectors, while (4.4) shows a way to compute the left singular vectors.

• (Dyadic decomposition) The matrix A ∈ Rm×n, with rank(A) = k ≤ n, can be expressed as k X T A = σjujvj . (4.7) j=1 This property has been utilized for various approximations and applica- tions, e.g., by dropping singular vectors corresponding to small singular values. 4.1. Introduction to the SVD 133 4.1.2. Geometric Interpretation of the SVD

The matrix A maps an orthonormal basis

B1 = {v1, v2, ··· , vn} of Rn onto a new “scaled” orthogonal basis

B2 = {σ1u1, σ2u2, ··· , σkuk} for a subspace of Rm.

A B1 = {v1, v2, ··· , vn} −→B2 = {σ1u1, σ2u2, ··· , σkuk} (4.8)

Consider a unit sphere Sn−1 in Rn: n n o n−1 X 2 S = x xj = 1 . j=1

Then, ∀x ∈ Sn−1,

x = x1v1 + x2v2 + ··· + xnvn

Ax = σ1x1u1 + σ2x2u2 + ··· + σkxkuk (4.9)

= y1u1 + y2u2 + ··· + ykuk, (yj = σjxj) So, we have

yj yj = σjxj ⇐⇒ xj = σj n k 2 (4.10) X yj P x2 = 1 (sphere) ⇐⇒ = α ≤ 1 (ellipsoid) j σ2 j=1 j=1 j 134 Chapter 4. Singular Value Decomposition (SVD)

Example 4.7. We build the set A·Sn−1 by multiplying one factor of A = UΣV T at a time. Assume for simplicity that A ∈ R2×2 and nonsingular. Let  3 −2  A = = UΣV T −1 2  −0.8649 0.5019   4.1306 0   −0.7497 0.6618  = 0.5019 0.8649 0 0.9684 0.6618 0.7497

Then, for x ∈ S1, Ax = UΣV T x = U Σ(V T x)

In general,

• V T : Sn−1 → Sn−1 (rotation in Rn) n−1 n • Σ: ej 7→ σjej (scaling from S to R ) • U : Rn → Rm (rotation) 4.1. Introduction to the SVD 135 4.1.3. Properties of the SVD

Theorem 4.8. Let A ∈ Rm×n and p = min(m, n). Let A = UΣV T be the SVD of A, with

σ1 ≥ · · · ≥ σk > σk+1 = ··· = σp = 0. Then,  rank(A) = k    Null(A) = Span{vk+1, ··· , vn}  1. Range(A) = Span{u1, ··· , uk}  k  X  A = σ u vT  i i i i=1   ||A||2 = σ1 (See HW 4.1.)  2 2 2  ||A|| = σ1 + ··· + σ (See HW 4.2.)  F k  ||Ax||2 2. min = σn (m ≥ n) x6=0 ||x||  2  −1 σ1  κ2(A) = ||A||2 · ||A ||2 =  σn  ( when m = n, & ∃A−1) 136 Chapter 4. Singular Value Decomposition (SVD)

Theorem 4.9. Let A ∈ Rm×n with m ≥ n. Then, T 2 ||A A||2 = ||A||2 T 2 −1 (4.11) κ2(A A) = κ2(A) . (when A exists)

Proof. Use the SVD of A to deduce the SVD of AT A.

Theorem 4.10. Let A ∈ Rm×n, m ≥ n, rank(A) = n, with singular values

σ1 ≥ σ2 ≥ · · · σn > 0.

Then T −1 −2 ||(A A) ||2 = σn , T −1 T −1 ||(A A) A ||2 = σn , T −1 −1 (4.12) ||A(A A) ||2 = σn , T −1 T ||A(A A) A ||2 = 1.

Definition 4.11. (AT A)−1AT is called the pseudoinverse of A, while A(AT A)−1 is called the pseudoinverse of AT . Let A = UΣV T be the SVD of A. Then

(AT A)−1AT = V Σ−1U T ≡ A+. (4.13) 4.1. Introduction to the SVD 137

Theorem 4.12. Let A ∈ Rm×n with rank(A) = r > 0. Let A = UΣV T be the SVD of A, with singular values

σ1 ≥ · · · ≥ σr > 0.

Define, for k = 1, ··· , r − 1,

k X T Ak = σjujvj (sum of rank-1 matrices). j=1

Then, rank(Ak) = k and

||A − Ak||2 = min{||A − B||2 rank(B) ≤ k} = σk+1, 2 2 (4.14) ||A − Ak||F = min{||A − B||F rank(B) ≤ k} 2 2 = σk+1 + ··· + σr .

That is, of all matrices of rank ≤ k, Ak is closest to A.

Note: The matrix Ak can be written as

    Ak = U Σk V^T,   where Σk = diag(σ1, ··· , σk, 0, ··· , 0).
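A short Matlab illustration of the theorem (our own example; any matrix will do):

A = randn(8,6);  [U,S,V] = svd(A);  k = 3;
Ak = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';      % best rank-k approximation of A
disp(norm(A - Ak) - S(k+1,k+1))          % ~ 0, i.e., ||A - A_k||_2 = sigma_{k+1}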

Corollary 4.13. Suppose A ∈ Rm×n has full rank. Thus rank(A) = p, where m×n p = min(m, n). Let σ1 ≥ · · · ≥ σp be the singular values of A. Let B ∈ R satisfy

||A − B||2 < σp. Then B also has full rank. 138 Chapter 4. Singular Value Decomposition (SVD) Full SVD

• For A ∈ Rm×n, A = UΣV T ⇐⇒ U T AV = Σ,

where U ∈ Rm×n and Σ,V ∈ Rn×n. • Expand ˜ m×m U → U = [UU2] ∈ R , (orthogonal) " # Σ Σ → Σ˜ = ∈ m×n, O R where O is an (m − n) × n zero matrix. • Then, " # Σ U˜Σ˜V T = [UU ] V T = UΣV T = A (4.15) 2 O 4.1. Introduction to the SVD 139

Proposition 4.14. There are important relation between the SVD of a ma- trix A and the Schur decompositions of the symmetric matrices  OAT  AT A, AAT , and . AO

m×n T Let A ∈ R , m ≥ n. If U AV = diag(σ1, ··· , σn) is the SVD of A, then

T T 2 2 n×n V (A A)V = diag(σ1, ··· , σn) ∈ R T T 2 2 m×m U (AA )U = diag(σ1, ··· , σn, 0 ··· , 0) ∈ R | {z } m−n Separate U as

U = [ U1 U2 ], |{z} |{z} n m−n and define   1 VVO (m+n)×(m+n) Q ≡ √ √ ∈ R . 2 U1 −U1 2U2 Then,  T  T OA Q Q = diag(σ1, ··· , σn, −σ1, ··· , −σn, 0 ··· , 0) (4.16) AO | {z } m−n 140 Chapter 4. Singular Value Decomposition (SVD)

Theorem 4.15. Let A = UΣV T be the SVD of A ∈ Rm×n (m ≥ n).

1. (m = n) Suppose that A is symmetric, with eigenvalues λi and or-

thonormal e-vectors ui. In other words,

A = UΛU T

is an eigendecomposition, with Λ = diag(λ1, ··· , λn). Then an SVD of A is A = UΣV T , where

σi = |λi| and vi = sign(λi)ui, where sign(0) = 1.

T 2 2. The eigenvalues of the symmetric matrix A A are σi . The right singular

vectors vi are corresponding orthonormal eigenvectors. T 2 3. The eigenvalues of the symmetric matrix AA are σi and m − n zeros.

The left singular vectors ui are corresponding orthonormal eigenvectors 2 for the e-values σi . One can take any m − n other orthogonal vectors as eigenvectors for the eigenvalue 0. 0 AT  4. Let H = . A is square and A = UΣV T is the SVD of A. Σ = A 0 diag(σ1, ··· , σn), U = [u1 u2 ··· un], and V = [v1 v2 ··· vn]. Then the 2n " # vi e-values of H are ±σ , with e-vectors √1 . i 2 ±ui −1 T 5. If A has full rank, the solution of minx ||Ax − b||2 is x = V Σ U b. 4.1. Introduction to the SVD 141 4.1.4. Computation of the SVD

For A ∈ Rm×n, the procedure is as follows.

1. Form AT A (AT A – covariance matrix of A).

2. Find the eigendecomposition of AT A by orthogonalization process, i.e.,

Λ = diag(λ1, ··· , λn), AT A = V ΛV T ,

T where V = [v1 ··· vn] is orthogonal, i.e., V V = I.

3. Sort the eigenvalues according to their magnitude and let p σj = λj, j = 1, 2, ··· , n.

4. Form the U matrix as follows, 1 uj = Avj, j = 1, 2, ··· , r. σj If necessary, pick up the remaining columns of U so it is orthogonal. (These additional columns must be in Null(AAT ).)

5. A = UΣV^T = [u1 ··· ur ··· un] · diag(σ1, ··· , σr, 0, ··· , 0) · [v1 ··· vn]^T.
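A bare-bones Matlab/Octave sketch of this procedure for a full-column-rank A (illustration only; in practice one calls svd directly, which is more accurate because it never forms A'*A):

function [U,S,V] = svd_via_eig(A)
% Illustrative SVD through the eigendecomposition of A'*A (assumes rank(A) = n).
  [V,D] = eig(A'*A);
  [lam,idx] = sort(diag(D),'descend');   % sort eigenvalues in decreasing order
  V = V(:,idx);
  sig = sqrt(max(lam,0));                % singular values
  S = diag(sig);
  U = A*V*diag(1./sig);                  % u_j = A*v_j / sigma_j
end

For instance, [U,S,V] = svd_via_eig([1 2; -2 1; 3 2]) reproduces the factors of Example 4.17 below, up to signs of the singular-vector pairs.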

Lemma 4.16. Let A ∈ Rn×n be symmetric. Then (a) all the eigenvalues of A are real and (b) eigenvectors corresponding to distinct eigenvalues are orthogonal. Proof. See Homework 4.3. 142 Chapter 4. Singular Value Decomposition (SVD)

 1 2  Example 4.17. Find the SVD for A =  −2 1 . 3 2 Solution.  14 6  1. AT A = . 6 9 2. Solving det(AT A − λI) = 0 gives the eigenvalues of AT A

λ1 = 18 and λ2 = 5,

of which corresponding eigenvectors are

" # " # " √3 −√2 # 3 −2 13 13 ˜v1 = , ˜v2 = . =⇒ V = 2 3 √2 √3 13 13 √ √ √ √ √ 3. σ1 = λ1 = 18 = 3 2, σ2 = λ2 = 5. So √  18 0  Σ = √ 0 5

 7   √7  " √3 # 234 1 1 13 1 1 √4 4. u1 = Av1 = √ A = √ √  −4  =  −  σ1 18 √2 18 13    234  13 √13 13 234     4 √4 " √−2 # 65 1 1 13 1 1 7 u2 = Av2 = √ A = √ √  7  =  √  . σ2 5 √3 5 13    65  13 0 0

 7 4  √ √ √ 234 65   " √3 √2 #  4 7  18 0 13 13 5. A = UΣV T =  −√ √  √  234 65  0 5 −√2 √3 √13 0 13 13 234 4.1. Introduction to the SVD 143

Example 4.18. Find the pseudoinverse of A,

A+ = (AT A)−1AT = V Σ−1U T ,

 1 2  when A =  −2 1 . 3 2 Solution. From Example 4.17, we have

 7 4  √ √ √ " # 234 65  18 0  √3 √2 A = UΣV T =  −√4 √7  √ 13 13  234 65  0 5 −√2 √3 √13 0 13 13 234 Thus, " #" #" # √3 −√2 √1 0 √7 −√4 √13 A+ = V Σ−1U T = 13 13 18 234 234 234 √2 √3 0 √1 √4 √7 0 13 13 5 65 65  1 4 1  −30 −15 6 = 11 13 1 45 45 9 144 Chapter 4. Singular Value Decomposition (SVD) Computer implementation [6]

• Algorithm is extremely stable.

• The SVD of A ∈ Rm×n (m ≥ n) is A = UΣV T : – Computation of U, V and Σ: 4m2n + 8mn2 + 9n3. – Computation of V , and Σ: 4mn2 + 8n3.

• Matlab: [U,S,V] = svd(A) • Mathematica: {U, S, V} = SingularValueDecomposition[A] • Lapack: dgelss • Eigenvalue problems. 4.1. Introduction to the SVD 145 4.1.5. Numerical rank

In the absence of roundoff errors and uncertainties in the data, the SVD re- veals the rank of the matrix. Unfortunately the presence of errors makes rank determination problematic. For example, consider 1/3 1/3 2/3 2/3 2/3 4/3   A = 1/3 2/3 3/3 (4.17)   2/5 2/5 4/5 3/5 1/5 4/5

• Obviously A is of rank 2, as its third column is the sum of the first two. • Matlab “svd" (with IEEE double precision) produces

−17 σ1 = 2.5987, σ2 = 0.3682, and σ3 = 8.6614 × 10 .

• What is the rank of A, 2 or 3? −13 What if σ3 is in O(10 )? • For this reason we must introduce a threshold T . Then we say that A has numerical rank k if A has k singular values larger than T , that is,

σ1 ≥ σ2 ≥ · · · ≥ σk > T ≥ σk+1 ≥ · · · (4.18) 146 Chapter 4. Singular Value Decomposition (SVD) Matlab

• Matlab has a “rank" command, which computes the numerical rank of the matrix with a default threshold

T = 2 max{m, n}  ||A||2 (4.19)

where  is the unit roundoff error. • In Matlab, the unit roundoff error can be found from the parameter “eps"

eps = 2−52 = 2.2204 × 10−16.

• For the matrix A in (4.17),

T = 2 · 5 · eps · 2.5987 = 5.7702 × 10−15

and therefore rank(A)=2.

See Homework 4.4 for an exercise with Matlab. 4.2. Applications of the SVD 147 4.2. Applications of the SVD

• Solving (rank-deficient) LS problems (pseudoinverse A+ = V Σ+U T , when A = UΣV T )

• Low-rank matrix approximation

– Image/Data compression – Denoising

• Signal processing [1,2,3]

• Other applications

– Principal component analysis (e.g., signal processing and pattern recognition) – Numerical weather prediction – Reduced order modeling [11] – Inverse problem theory [7] 148 Chapter 4. Singular Value Decomposition (SVD) 4.2.1. Image compression

• A ∈ Rm×n is a sum of rank-1 matrices:

V = [v1, ··· , vn],U = [u1, ··· , un], n T X T A = UΣV = σiuivi . i=1 • The approximation k T X T Ak = UΣkV = σiuivi i=1 is closest to A among matrices of rank≤ k, and

||A − Ak||2 = σk+1.

• It only takes m · k + n · k = (m + n) · k words to store u1 through uk, and

σ1v1 through σkvk, from which we can reconstruct Ak.

• We use Ak as our compressed images, stored using (m + n) · k words. 4.2. Applications of the SVD 149

• Matlab code to demonstrate the SVD compression of images: img = imread('Peppers.png'); [m,n,d]=size(img); [U,S,V] = svd(reshape(im2double(img),m,[])); %%---- select k <= p=min(m,n) k = 20; img_k = U(:,1:k)*S(1:k,1:k)*V(:,1:k)'; img_k = reshape(img_k,m,n,d); figure, imshow(img_k)

The “Peppers" image is in [270, 270, 3] ∈ R270×810.

Image compression using k singular values Original (k = 270) k = 1 k = 10

k = 20 k = 50 k = 100 150 Chapter 4. Singular Value Decomposition (SVD) Peppers: Singular values

Peppers: Compression quality  13.7 when k = 1,   20.4 when k = 10,   23.7 when k = 20, PSNR (dB) = 29.0 when k = 50,   32.6 when k = 100,   37.5 when k = 150, where PSNR is “Peak Signal-to-Noise Ratio."

Peppers: Storage: It requires (m + n) · k words. For example, when k = 50,

(m + n) · k = (270 + 810) · 50 = 54,000 , (4.20) which is approximately a quarter the full storage space

270 × 270 × 3 = 218,700 . 4.2. Applications of the SVD 151 4.2.2. Rank-deficient least-squares problems

The LS problem: For the system of equations

Ax = b, (4.21)

m×n m where A ∈ R and b ∈ R , find the best-fitting solution xb as a solution of

min ||Ax − b||2. (4.22) x∈Rn

• If m > n, then the system (4.21) is overdetermined, and we cannot expect to find an exact solution. The solution can be obtained through the LS problem (4.22).

• The solution of the LS problem is sometimes not unique, for example, when rank(A) = r < min{m, n}. In this case, we may consider the following additional problem:

n Of all the x ∈ R that minimizes ||Ax − b||2, find the one for which ||x||2 is minimized.

This problem always has a unique solution, the minimum-norm solu- tion. 152 Chapter 4. Singular Value Decomposition (SVD)

Proposition 4.19. (Rank-deficient LS problems, Proposition 4.3 on page 128). Let A ∈ Rm×n, m ≥ n, with rank(A) = k < n. Then there exists an (n − k)-dimensional set of vectors x that minimize

||Ax − b||2.

Proposition 4.20. Let σmin = σmin(A), the smallest singular value of A.

Assume σmin > 0. Then

T 1. If x minimize ||Ax − b||2, then ||x||2 ≥ |un b|/σmin, where un is the last column of U in A = UΣV T .

2. Changing b to b + δb can change x to x + δx, where ||δx||2 is as large as

||δb||2/σmin.

In other words, if A is nearly rank deficient (σmin is small), then the solution x is ill-conditioned and possibly very large. Proof. Part 1: x = A+b = V Σ−1U T b. So −1 T −1 T  T ||x||2 = ||Σ U b||2 ≥ | Σ U b n | = |un b|/σmin.

Part 2: choose δb parallel to un. 4.2. Applications of the SVD 153

Proposition 4.21. When A ∈ Rm×n is exactly singular, then

x = arg min ||Ax − b||2 x∈Rn can be characterized as follows. Let A = UΣV T have rank r < n, and write the SVD of A as  Σ 0  A = [U U ] 1 [V V ] = U Σ V T , (4.23) 1 2 0 0 1 2 1 1 1 where Σ1 is r × r and nonsingular and U1 and V1 have r columns. Let σ =

σmin(Σ1), the smallest nonzero singular value of A. Then

1. All solutions x can be written

−1 T + x = V1Σ1 U1 b + V2z (= A b), (4.24)

where z is an arbitrary vector.

2. The solution x has a minimal norm ||x||2 precisely when z = 0, in which case −1 T x = V1Σ1 U1 b and ||x||2 ≤ ||b||2/σ. (4.25)

3. Changing b to b + δb can change the minimal norm solution x by at most

||δb||2/σ.

In other words, the norm and condition number of the unique minimal norm solution x depend on the smallest nonzero singular value of A. 154 Chapter 4. Singular Value Decomposition (SVD)

Solving LS problems by the SVD: Let A ∈ Rm×n, m ≥ n, have rank r ≤ n. Let A = UΣV T be the SVD of A.

• Because U is orthogonal,

T T T ||Ax − b||2 = ||U (Ax − b)||2 = ||Σ(V x) − U b||2.

• Letting c = U T b and y = V T x, we have

r m 2 2 X 2 X 2 ||Ax − b||2 = ||Σy − c||2 = |σjyj − cj| + |cj| (4.26) j=1 j=r+1 It is clear that (4.26) is minimized when and only when

cj yj = , j = 1, ··· , r. (4.27) σj

• (When r < n): It is clear to see that ||y||2 is minimized when and only when

yr+1 = ··· = yn = 0.

• Since x = V y, we have ||x||2 = ||y||2. Thus ||x||2 is minimized when

and only when ||y||2 is. This proves that the LS problem has exactly one minimum-norm solution. 4.2. Applications of the SVD 155

Solving LS problems by the SVD: It is useful to repeat the development using partitioned matrices. Let A ∈ Rm×n, m ≥ n, have rank r ≤ n. Let A = UΣV T be the SVD of A given as in (4.23):  Σ 0  A = [U U ] 1 [V V ] = U Σ V T , (4.28) 1 2 0 0 1 2 1 1 1 where Σ1 is r × r and nonsingular and U1 and V1 have r columns.

• Let, with the corresponding partitioning, " # " # c y c = U T b = 1 and y = V T x = 1 . c2 y2

• Then,   " # " # " # Σ 0 y1 c1 Σ1y1 − c1 Σy − c = 1 − = . 0 0 y2 c2 −c2 • So 2 2 2 2 ||Ax − b||2 = ||Σy − c||2 = ||Σ1y1 − c1||2 + ||c2||2. (4.29)

• Equation (4.29) is minimized when and only when

−1 y1 = Σ1 c1. (4.30)

• For the minimum-norm solution, take y2 = 0. 156 Chapter 4. Singular Value Decomposition (SVD)

Summary 4.22. (Solving LS problems by the SVD). For solving the LS problem m×n min ||Ax − b||2,A ∈ R , x∈Rn we summarize the above SVD procedure as follows.

1. Find the SVD of A: A = UΣV T and rank(A) = r. " # c1 T r 2. Calculate = c = U b, where c1 ∈ R . c2

−1 3. Let y1 = Σ1 c1, where Σ1 = Σ(1 : r, 1 : r).

n−r 4. If r < n, choose y2 ∈ R arbitrarily. (For minimum-norm solution, take y2 = 0.) " # y1 5. Let y = ∈ Rn. y2

6. Let x = V y ∈ Rn.

Note:

• The above procedure is equivalent to solving the LS problem by the pseudoinverse of A, particularly when A has full rank.

• The minimum-norm solution reads: x = V1 y1.
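A compact Matlab/Octave sketch of Summary 4.22 for the minimum-norm solution (the rank tolerance tol is our own choice; the result coincides with pinv(A)*b):

function x = ls_minnorm_svd(A,b)
% Minimum-norm least-squares solution of min ||Ax - b||_2 via the SVD.
  [U,S,V] = svd(A,'econ');
  s = diag(S);
  tol = max(size(A))*eps(max(s));   % threshold for the numerical rank r
  r = sum(s > tol);
  c1 = U(:,1:r)'*b;                 % c1 = U1'*b
  y1 = c1./s(1:r);                  % y1 = Sigma1^{-1}*c1
  x  = V(:,1:r)*y1;                 % x = V1*y1  (take y2 = 0)
end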

Example 4.23. Use the pseudoinverse to solve the LS problem min ||Ax−b||2, where    1 2  1   A =  −1 1  and b =  2  . 1 2 3

Solution. The eigenvalues of AT A are √ √ λ1 = 3(2 + 2), λ2 = 3(2 − 2).

T If v1 and v2 are the corresponding eigenvectors of A A, then 1 1 uj = Avj = p Avj, j = 1, 2. σj λj The SVD of A reads  √ √  1+√ 2 1−√ 2 √ √ 2 3 2 3 √  √ 2−1 √ 2+1   √ √   λ 0  √ − √ A = UΣV T =  √2−1 √2+1  1 √ 4−2 2 4+2 2  6 6   √ 1 √ 1   √ √  0 λ2 √ √ 1+√ 2 1−√ 2 4−2 2 4+2 2 2 3 2 3 Thus " 1 −2 1 # A+ = V Σ−1U T = 6 3 6 1 1 1 6 3 6 and therefore   " # 1 " # 1 −2 1 −2 x = A+b = 6 3 6  2  = 3 1 1 1   4 6 3 6 3 3 √ The corresponding least L2-error is 2, that is, √ ||Ax − b||2 = min ||Ay − b||2 = 2. y 158 Chapter 4. Singular Value Decomposition (SVD) 4.2.3. Principal component analysis (PCA)

Definition 4.24. Principal component analysis is a statistical procedure for data (a set of observations) that

(a) uses an orthogonal transformation and (b) converts data of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

PCA

• #{principal components} < #{original variables}. • The transformation is defined in such a way that

– the first principal component has the largest possible variance, – and each succeeding component has the highest variance possible un- der the constraint that it is orthogonal to (i.e., uncorrelated with) the preceding components.

• Principal components are guaranteed to be independent if the data set is jointly normally distributed. 4.2. Applications of the SVD 159

In other words: • PCA is a mathematical algorithm that reduces the dimensionality of the data while retaining most of the variation in the data set. • It accomplishes this reduction by identifying directions, called princi- pal components, along which the variation in the data is maximal. • By using a few components, each sample can be represented by relatively few numbers instead of by values for thousands of variables. 160 Chapter 4. Singular Value Decomposition (SVD)

Example 4.25. Figure 4.1 below indicates an oil-split zone. Find the semi- axes of the minimum-volume enclosing ellipsoid (MVEE) or equal-volume fitting ellipsoid (EVFE) of the oil-splitting.

Figure 4.1: An oil-split zone.

Solution. Let the acquired data set (scattered points) be

k v = (vk,1, vk,2), k = 1, ··· , m.

Let the center of the data points be 1 c = (c , c ) = (v1 + ··· + vm). 1 2 m Then, the direction of the major semi-axis must be the solution of

m X k 2 max |(v − c) · x| , ||x||2 = 1. (4.31) x∈R2 k=1 This constrained optimization problem can be solved by applying the method of Lagrange multipliers. ∇f(x) = λ∇g(x), (4.32) where m X k 2 2 f(x) = |(v − c) · x| , g(x) = ||x||2. k=1 Note that m X 2 f(x) = (vk,1 − c1)x1 + (vk,2 − c2)x2 , (4.33) k=1 4.2. Applications of the SVD 161 where x = (x1, x2). Thus it follows from (4.32) that

Ax = λx, (4.34)

2×2 where A = (aij) ∈ R with m X aij = (vk,i − ci)(vk,j − cj), i, j = 1, 2. k=1 The direction of the major semi-axis of the ellipsoid can be found from the eigenvector of the matrix A corresponding to the larger eigenvalue.

Definition 4.26. Let us collect x- and y-components separately from the data: X = (v1,1, v2,1, ··· , vm,1), Y = (v1,2, v2,2, ··· , vm,2). Then, the matrix A in (4.34) can be rewritten as  Var(X) Cov(X,Y ) A = , (4.35) Cov(X,Y ) Var(Y ) called the covariance matrix of the data points (X,Y ). 162 Chapter 4. Singular Value Decomposition (SVD)

Semi-axes of the ellipsoid: In order to find the minor semi-axis, the maximization problem (4.31) can be replaced with the corresponding minimization problem. This would be re- duced to the same eigenvalue problem given in (4.34). Thus, both major and minor semi-axes of the ellipsoid can be found from the eigenvalue problem.

Theorem 4.27. Let A be the covariance matrix with

Ax` = λ`x`, ||x`||2 = 1.

Then the semi-axes of the MVEE in Rn are found as p η λ` x`, ` = 1, ··· , n, (4.36) where η > 0 is a scaling constant given by

η2 = max(vk − c)T A−1(vk − c), k provided that c is the real center. 4.2. Applications of the SVD 163

Example 4.28. A data set of scattered points can be produced and analyzed by close all; %clear all; n = 100; h=10/n; P = 1:n; X = h*P+randn(1,n); Y = h*P+randn(1,n); %%------plot(X,Y,'ko'); title('Scattered Points & Their Principal Components',... 'fontsize',13) %%------A = zeros(n,2); A(:,1) = X; A(:,2) = Y; [U,S,V]= svd(A); k=1; PC1 = U(:,k)*S(k,k)*V(:,k)'; k=2; PC2 = U(:,k)*S(k,k)*V(:,k)'; hold plot(PC1(:,1),PC1(:,2),'b*'); plot(PC2(:,1),PC2(:,2),'rs'); legend('Scattered Points','PC-1','PC-2','fontsize',12); %%------C = cov(X,Y); [V2,D] = eig(C); [D,order] = sort(diag(D),'descend'); V2 = V2(:,order); L2 = sqrt(D); hold Cntr = [mean(X) mean(Y)]; arrow(Cntr,Cntr+1.5*L2(1)*V2(:,1)','width',3); arrow(Cntr,Cntr+1.5*L2(2)*V2(:,2)','width',3); 164 Chapter 4. Singular Value Decomposition (SVD)

Results:

SVD Analysis Covariance Analysis   82.9046 0  0 8.9832   4.27206 0   0 0  S =   L2 =  . .  0 0.9021  . .  0 0 0.6882 0.7255  0.6826 −0.7308 V = V2 = 0.7255 −0.6882 0.7308 0.6826

• The small mismatch may be due to an approximate setting for the center, for covariance analysis. • In general, they have different interpretations. • Look at the ratio of the two singular/eigen- values. (Covariance analysis is more geometrical, for this example.) 4.2. Applications of the SVD 165 4.2.4. Image denoising

Now, in this subsection, let’s apply the SVD for image denoising. We will see soon that the attempt is a total failure. However, we will learn some important aspects of the SVD through the exercise.

(a) (b)

Figure 4.2: Lena: (a) The original image and (b) a noisy image perturbed by Gaussian noise of PSNR=25.7.

In this exercise, each column of the image is viewed as an observation. As in § 4.2.1, we first compute the SVD of the image array A,

A = UΣV T , and for the denoised image, we consider the low-rank approximation k T X T Ak = UΣkV = σiuivi . i=1 166 Chapter 4. Singular Value Decomposition (SVD) Numerical results (k = 30; PSNR=26.5) (k = 50; PSNR=27.4)

(k = 70; PSNR=27.2) (k = 100; PSNR=26.6)

Figure 4.3: Lena: Denoised.

This is a total failure!! 4.2. Applications of the SVD 167 Why a total failure?

T T TV (uivi ) σi · TV (uivi )

• Singular vectors are more oscillatory for larger singular values. • The noise is not localized for certain singular values. • Columns of an image are not those to be considered as correlated, in the first place. 168 Chapter 4. Singular Value Decomposition (SVD) 4.3. Homework

m×n 1. Let A ∈ R . Prove that ||A||2 = σ1, the largest singular value of A. Hint: Use the following

||Av1||2 σ1||u1||2 = = σ1 =⇒ ||A||2 ≥ σ1 ||v1||2 ||v1||2 and arguments around Equations (4.9) and (4.10) for the opposite direc- tional inequality. 2. Recall that the Frobenius matrix norm is defined by

m n 1/2  X X 2 m×n ||A||F = |aij| ,A ∈ R . i=1 j=1

2 2 1/2 Show that ||A||F = (σ1 + ··· + σk) , where σj are nonzero singular values of A. Hint: First show that if B = UC, where U is orthogonal, then ||B||F = ||C||F .

3. Prove Lemma 4.16. Hint: For (b), let Avi = λivi, i = 1, 2, and λ1 6= λ2. Then (λ1v1) · v2 = (Av1) · v2 = v1 · (Av2) = v1 · (λ2v2). | {z } ∵ A is symmetric For (a), you may use a similar argument, but with the dot product being defined for complex values, i.e.,

u · v = uT v,

where v is the complex conjugate of v. 4. Use Matlab to generate a random matrix A ∈ R8×6 with rank 4. For example, A = randn(8,4); A(:,5:6) = A(:,1:2)+A(:,3:4); [Q,R] = qr(randn(6)); A = A*Q;

(a) Print out A on your computer screen. Can you tell by looking if it has (numerical) rank 4? 4.3. Homework 169

(b) Use Matlab’s “svd" command to obtain the singular values of A. How many are “large?" How many are “tiny?" (You may use the command “format short e" to get a more accurate view of the singular values.) (c) Use Matlab’s “rank" command to confirm that the numerical rank is 4. (d) Use the “rank" command with a small enough threshold that it re- turns the value 6. (Type “help rank" for information about how to do this.)

2×2 5. Let A ∈ R with singular values σ1 ≥ σ2 > 0. Show the set {Ax | ||x||2 = 1} (the image of the unit circle) is an ellipse in R2 whose major and minor semiaxes have lengths σ1 and σ2 respectively. 6. Work this homework problem using pencil and paper (and exact arith- metic). Let   1 2 1   A = 2 4 , b =  1  . 3 6 1 (a) Find the SVD of A. You may use the condensed form given in (4.28). (b) Calculate the pseudoinverse of A, A+. (c) Calculate the minimum-norm solution of the least-squares problem for the overdetermined system Ax = b. (d) Find a basis for Null(A). (e) Find all solutions of the least-squares problem. 170 Chapter 4. Singular Value Decomposition (SVD) Chapter 5

Eigenvalue Problems

For eigenvalue problems, we will begin with basic definitions in matrix alge- bra, including Jordan/Schur canonical forms and diagonalization. Then, we will study perturbation theory in order to understand effects of errors in both data acquisition and computation. After considering Gershgorin’s theorem and its applicability, we will work on nonsymmetric eigenvalue problems and finally symmetric eigenvalue problems.

171 172 Chapter 5. Eigenvalue Problems 5.1. Definitions and Canonical Forms

• For an n × n matrix A, if a scalar λ and a nonzero vector x satisfy

Ax = λx,

we call λ an eigenvalue and x an eigenvector of A; (λ, x) is called an eigenpair of A.
• λ is an eigenvalue of A ⇐⇒ there exists x ≠ 0 such that Ax = λx
  ⇐⇒ (A − λI)x = 0 has nontrivial solutions
  ⇐⇒ p(λ) = det(A − λI) = 0.
  p(λ) is called the characteristic polynomial of A.
• If A has n linearly independent eigenvectors

x1, x2, ··· , xn, then A ∼ Λ (similar) and has an EigenValue Decomposition (EVD):   λ1  λ  −1  2  −1 A = XΛX ≡ [x1 x2 ··· xn]  ..  [x1 x2 ··· xn]  .  λn 5.1. Definitions and Canonical Forms 173 When A is Diagonalizable

• An n × n matrix A is called diagonalizable if A ∼ Λ, a diagonal matrix. Otherwise it is called defective.

• The algebraic multiplicity of λj:

the multiplicity of the root λj in the characteristic equation det(A − λI) = 0.

• The geometric multiplicity of λj:

the dimension of the eigenspace for λj

= dim(Null(A − λjI)). * algebraic multiplicity ≥ geometric multiplicity

• Let A be an n × n matrix and Λ = diag(λ1, ··· , λn). A is diagonalizable ⇐⇒ A has n lin. indep. e-vectors

⇐⇒ ∀λj, algebraic multiplicity = geometric multiplicity ⇐⇒ S−1AS = Λ (⇐⇒ A = S ΛS−1) ⇐⇒ AS = SΛ; col. of S are the right e-vectors of A ⇐⇒ S−1A = ΛS−1; col. of (S−1)∗ are left e-vectors of A −1 • Let S = [x1 ··· xn], the matrix of right e-vectors of A. Then S = ∗ ∗ ∗ ∗ ∗ ∗ T [y1/y1x1 y2/y2x2 ··· yn/ynxn] . 174 Chapter 5. Eigenvalue Problems Jordan Canonical Forms

• The eigenvector x is also called right eigenvector.A nonzero vector y satisfying y∗A = λy∗ is called left eigenvector of A. (Recall that y∗ = yH = (y¯)T is the of y.) • Jordan Canonical Forms Given a square matrix A, there exists a non- singular S such that S−1AS = J, (5.1) where J is in Jordan canonical form. This means that J is block diag- onal, with

J = diag(Jn1 (λ1),Jn2 (λ2), ··· ,Jnk (λk)) and   λj 1  λj 1    J (λ ) =  ......  (Jordan block for λj) nj j   n =algebraic multiplicity  ..  j  . 1  λj nj×nj J is unique, up to permutations of its diagonal blocks. 5.1. Definitions and Canonical Forms 175 Schur Canonical Forms

• For an n × n matrix A, there exists a Q and an upper triangular matrix T such that

Q∗AQ = T. (5.2)

The eigenvalues of A are the diagonal entries of T . • Real Schur Canonical Forms. If A is real, there exists a real orthog- onal matrix V such that

V T AV = T is quasi-upper triangular.

This means that T is block upper triangular with 1 × 1 and 2 × 2 blocks on the diagonal. Its eigenvalues are the eigenvalues of its diagonal blocks. The 1×1 blocks correspond to real eigenvalues, and the 2 × 2 blocks to complex conjugate pairs of eigenvalues.

Theorem 5.1. (Fundamental Theorem of Algebra). Every nonconstant polynomial has at least one root (possibly, in the complex field). Polynomial Factorization: If P is a nonconstant polynomial of real coeffi- cients, then it can be factorized as a multiple of linear and quadratic factors of which coefficients are all real. 176 Chapter 5. Eigenvalue Problems Computing E-vectors from Schur Form

• Let Q∗AQ = T be the Schur form of A. Then if λ is an eigenvalue of T , T x = λx. AQx = QT x = Q(λx) = λQx So Qx is an eigenvector of A. • To find eigenvectors of A, it suffices to find eigenvectors of T . 5.2. Perturbation Theory: Bauer-Fike Theorem 177 5.2. Perturbation Theory: Bauer-Fike Theorem

Theorem 5.2. Let λ be a simple eigenvalue of A with right eigenvector x and left eigenvector y, normalized so that ||x||2 = ||y||2 = 1. Let λ + δλ be the corresponding eigenvalue of A + δA. Then y∗δAx δλ = + O(||δA||2), or y∗x ||δA|| (5.3) |δλ| ≤ + O(||δA||2) = sec Θ(y, x)||δA|| + O(||δA||2) |y∗x| where Θ(y, x) is the acute angle between y and x. Note that sec Θ(y, x) = 1/|y∗x| ≡ c(λ) is the condition number of the eigenvalue λ.[s(λ) = |y∗x| in other texts.]

Note: The perturbation on the eigenvalue (due to a matrix perturbation) is proportional to the condition number of the eigenvalue.

Corollary 5.3. Let A be symmetric (or, more generally, normal: AA∗ = A∗A). Then

    |δλ| ≤ ||δA|| + O(||δA||^2).                                          (5.4)

Note: Theorem 5.2 is useful only for sufficiently small ||δA||. We can remove the O(||δA||2) term and get a simple theorem true for any size perturbation ||δA||, at the cost of increasing the condition number of the eigenvalue by a factor of n.

Theorem 5.4. (Bauer-Fike Theorem). Let A have all simple eigenvalues

(i.e., be diagonalizable). Call them λi, with right and left e-vectors xi and yi, normalized so ||xi||2 = ||yi||2 = 1. Then the e-values of A + δA lie in disks Bi, ||δA|| where Bi has center λi and radius n ∗ . |yi xi| Theorem 5.5. (Bauer-Fike Theorem). If µ is an eigenvalue of A+E ∈ Cn×n and −1 X AX = D = diag(λ1, ··· , λn), then −1 min |λ − µ| ≤ ||X||p · ||X ||p · ||E||p = κp(X) ||E||p (5.5) λ∈σ(A) where || · ||p denotes any of the p-norms. [6, Theorem 7.2.2]. 5.2. Perturbation Theory: Bauer-Fike Theorem 179

Theorem 5.6. Let λ be a simple e-value of A, with unit right and left e- vectors x and y and condition number c = 1/|y∗x|. Then ∃ δA such that A + δA has a multiple e-value at λ, and ||δA|| 1 2 ≤ √ . ||A||2 c2 − 1 If c  1, i.e., the e-value is ill-conditioned, then the upper bound on the dis- tance is √ 1 ≈ 1, the reciprocal of the condition number. c2−1 c Note: Multiple eigenvalues have infinite condition numbers; see Homework 5.1. Being “close to singular" implies ill-conditioning. 180 Chapter 5. Eigenvalue Problems

 1 2 3   0 0 0  Example 5.7. Let A =  0 4 5  ,E =  0 0 0  . Then 0 0 4.001 .001 0 0

σ(A) = {1, 4, 4.001}, σ(A + E) ≈ {1.0001, 4.0582, 3.9427}.

The reciprocals of the condition numbers are s(1) ≈ .8, s(4) ≈ .2e − 3, s(4.001) ≈ .2e − 3, and ||E|| 0.001 2 = ≈ 0.00012 < .2e − 3 ||A||2 8.0762 See Theorem 5.6. 5.2. Perturbation Theory: Bauer-Fike Theorem 181

Note: If any eigenvalue has a large condition number, then the matrix of eigenvectors has to have an approximately equally large condition number.

Theorem 5.8. Let A be diagonalizable with e-values λi and right and left e-vectors xi, yi, respectively, normalized so ||xi|| = ||yi|| = 1. Suppose that S satisfies −1 S AS = Λ = diag(λ1, ··· , λn). Then −1 1 κ2(S) ≡ ||S||2 · ||S ||2 ≥ max ∗ . (5.6) i |yi xi|

If we choose S = [x1, ··· , xn], then 1 κ2(S) ≤ n · max ∗ . (5.7) i |yi xi|

That is, the condition number of S, κ2(S), is within a factor of n of its smallest value. 182 Chapter 5. Eigenvalue Problems 5.3. Gerschgorin’s Theorem

Theorem 5.9. (Gerschgorin’s Theorem [5]). Let B be an arbitrary ma- trix. Then the eigenvalues λ of B are located in the union of the n disks defined by X |λ − bii| ≤ ri ≡ |bij|, for i = 1, ··· , n. (5.8) j6=i

Proof. Let Bx = λx, x 6= 0. By scaling x if necessary, we may assume that

||x||∞ = xk = 1, for some k.

Then n X bkjxj = λxk = λ j=1 and therefore

    |λ − bkk| = | Σ_{j≠k} bkj xj | ≤ Σ_{j≠k} |bkj xj| ≤ Σ_{j≠k} |bkj|.        (5.9)

For other eigenvalues, one can derive a similar inequality as in (5.9).

Gerschgorin Circles

Example 5.10. Let A = [ 1  0  −1 ; −2  7  1 ; −1  1  −2 ]. Find the Gerschgorin circles.

Solution. From Gerschgorin's theorem,

C1 = {z : |z − 1| ≤ 1}, C2 = {z : |z − 7| ≤ 3}, C3 = {z : |z + 2| ≤ 2}.

The eigenvalues of A are 7.1475, 1.1947, and −2.3422. 184 Chapter 5. Eigenvalue Problems
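As a quick check (added here as a sketch, not from the original text), the disk centers and radii in (5.8) can be computed directly in MATLAB:

A = [1 0 -1; -2 7 1; -1 1 -2];
centers = diag(A);
radii   = sum(abs(A), 2) - abs(centers);   % off-diagonal absolute row sums
[centers radii]                            % disks |z-1|<=1, |z-7|<=3, |z+2|<=2
eig(A)'                                    % 7.1475, 1.1947, -2.3422: inside the union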

Example 5.11. The following matrix comes from modeling the heat flow in a long, thin rod, or mechanical vibrations:
    A = [ a_1  −c_1   0    ···   0
         −b_2  a_2  −c_2   ···   0
          ⋱     ⋱     ⋱
          0   ··· −b_{n−1} a_{n−1} −c_{n−1}
          0   ···    0    −b_n   a_n ],
where a_i > 0, b_i > 0 and c_i > 0. The matrix A is strictly diagonally dominant, that is,
    |b_i| + |c_i| < |a_i|,   for i = 1, 2, ···, n.
We may assume that |b_i| + |c_i| = 1 < a_i, for i = 1, 2, ···, n (let b_1 = c_n = 0). Show that all the eigenvalues of A are positive and hence A is nonsingular.

Proof. By Gerschgorin's theorem, every eigenvalue λ of A = (a_ij)_{n×n} satisfies
    |λ − a_ii| ≤ Σ_{j≠i} |a_ij|,   i = 1, 2, ···, n,
that is,
    |λ − a_i| ≤ |b_i| + |c_i| = 1,   i = 1, 2, ···, n,
so that
    0 < a_i − 1 < λ < a_i + 1.
So all the eigenvalues of A are positive and hence A is nonsingular.

5.4. Taussky's Theorem and Irreducibility

Theorem 5.12. (Taussky [8, 1948]) Let B = [bij] be an irreducible n × n complex matrix. Assume that λ, an eigenvalue of B, is a boundary point of the union of the disks |z − bii| ≤ ri. Then, all the n circles |z − bii| = ri must pass through the point λ, i.e.,

|λ − bii| = ri for all 1 ≤ i ≤ n.

Definition 5.13. A permutation matrix is a square matrix in which each row and each column has one entry of unity, all others zero.

Definition 5.14. For n ≥ 2, an n × n complex-valued matrix A is reducible if there is a permutation matrix P such that
    P A P^T = [A_11  A_12; 0  A_22],
where A_11 and A_22 are respectively r × r and (n − r) × (n − r) submatrices, 0 < r < n. If no such permutation matrix exists, then A is irreducible.

Geometrical interpretation of irreducibility

Figure 5.1: The directed paths for nonzero aii and aij.

Figure 5.2: The directed graph G(A) for A in (5.10).

• Given A = (a_ij) ∈ C^{n×n}, consider n distinct points
    P_1, P_2, ···, P_n
in the plane, which we will call nodes or nodal points.
• For any nonzero entry a_ij of A, we connect P_i to P_j by a path directed from the node P_i to the node P_j; a nonzero a_ii is joined to itself by a directed loop, as shown in Figure 5.1.
• In this way, every n × n matrix A can be associated with a directed graph G(A). For example, the matrix
    A = [2 −1 0; −1 2 −1; 0 −1 2]    (5.10)
has the directed graph shown in Figure 5.2.

Definition 5.15. A directed graph is strongly connected if, for any ordered pair of nodes (P_i, P_j), there is a directed path of finite length
    P_i P_{k_1}, P_{k_1} P_{k_2}, ···, P_{k_{r−1}} P_{k_r = j},
connecting P_i to P_j.

The theorems presented in this subsection can be found in [10], along with their proofs.

Theorem 5.16. An n × n complex-valued matrix A is irreducible if and only if its directed graph G(A) is strongly connected.

It is obvious that the directed graphs of the matrices obtained from FD/FE methods for the Poisson equation are strongly connected. Therefore those matrices are irreducible.

Example 5.17. Find the eigenvalue loci for the matrix
    A = [2 −1 0; −1 2 −1; 0 −1 2].
Solution. The matrix A is irreducible, first of all, and
    r_1 = 1,   r_2 = 2,   r_3 = 1.
Since a_ii = 2, for i = 1, 2, 3, the union of the three disks reads
    {z : |z − 2| ≤ 2} ≡ Ω.
Note r_1 = r_3 = 1, and their corresponding disks do not touch the boundary of Ω. So, using Taussky's theorem, we have
    |λ − 2| < 2    (5.11)
for all eigenvalues λ of A. Furthermore, since A is symmetric, its eigenvalues must be real. Thus we conclude
    0 < λ < 4.    (5.12)

5.5. Nonsymmetric Eigenvalue Problems

Residual and Rayleigh Quotient

• Let (λ, x) be an approximation to an eigenpair of A. Then, the residual is defined as r = Ax − λx.

• If x is an approximation to an eigenvector of A, then the Rayleigh quotient
    λ̃ = (x*Ax)/(x*x)
is an approximation to the corresponding eigenvalue.
• The Rayleigh quotient minimizes ||r|| = ||Ax − λx|| over λ.

5.5.1. Power method

Given A ∈ C^{n×n}, let
    X^{−1} A X = diag(λ_1, ···, λ_n),
where X = [x_1, ···, x_n] and |λ_1| > |λ_2| ≥ ··· ≥ |λ_n|. The power method produces a sequence of vectors {q^(k)} as follows:

    choose q^(0) ∈ C^n such that ||q^(0)||_2 = 1
    for k = 1, 2, ···
        z = A q^(k−1)                      (apply A)
        q^(k) = z/||z||_2                  (normalize → q)
        λ^(k) = (q^(k))* A q^(k)           (Rayleigh quotient → λ_1)
    end                                                     (5.13)

Note: q^(k) → q = ±x_1/||x_1|| as k → ∞, where x_1 is the eigenvector corresponding to λ_1.

Convergence of the power method

Note that A^k x_j = λ_j^k x_j. Let
    q^(0) = α_1 x_1 + α_2 x_2 + ··· + α_n x_n,   α_1 ≠ 0.
Then
    A^k q^(0) = Σ_{j=1}^{n} α_j A^k x_j = α_1 λ_1^k [ x_1 + Σ_{j=2}^{n} (α_j/α_1)(λ_j/λ_1)^k x_j ].
Since q^(k) ∈ Span{A^k q^(0)},
    dist( Span{q^(k)}, Span{x_1} ) = O( |λ_2/λ_1|^k ).
Thus we obtain
    λ_1 − λ^(k) = O( |λ_2/λ_1|^k ).    (5.14)

Error estimates for the power method

• An estimate of the error |λ^(k) − λ_1| can be obtained by applying the Bauer-Fike theory, more specifically Theorem 5.2 on p. 177. Let r^(k) = A q^(k) − λ^(k) q^(k). Observe that
    (A + E^(k)) q^(k) = λ^(k) q^(k),   for E^(k) = −r^(k) (q^(k))*.    (5.15)
Thus λ^(k) is an eigenvalue of A + E^(k), and
    |λ^(k) − λ_1| ≈ ||E^(k)||_2 / s(λ_1) = ||r^(k)||_2 / s(λ_1),
where s(λ_1) is the reciprocal of the condition number of λ_1.
• If we use the power method to generate approximate right/left dominant eigenvectors, then it is possible to obtain an estimate of s(λ_1). In particular, if w^(k) is a unit 2-norm vector in the direction of (A*)^k w^(0), then
    s(λ_1) ≈ |(w^(k))* q^(k)|.    (5.16)

 −31 −35 16  Example 5.18. Let A =  −10 −8 4 , then σ(A) = {9, 3, −2}. Apply- −100 −104 49 ing the power method with q(0) = [1, 0, 0]T . Solution. We have k λ(k) Power method converges at k = 10 1 7.6900 (correct to 4 decimal places). 2 9.6377 3 8.9687 Computed result: 4 9.0410 eigenvalue: λ(10) = 9.000, 5 9.0023   6 9.0033 −0.3714 (10)   7 9.0005 eigenvector: x = q =  −0.0000  8 9.0003 −0.9285 9 9.0001 residual: ||r(10)|| = 9.1365e − 06. 10 9.0000 194 Chapter 5. Eigenvalue Problems

Remark 5.19. Power iteration is of limited use, for several reasons.

• It converges only to the eigenpair for the eigenvalue of largest absolute magnitude.
• The usefulness of the power method depends upon the ratio |λ_2|/|λ_1|: the smaller the ratio, the faster the convergence. If the two largest eigenvalues are close in magnitude, the convergence will be very slow.
• For example, if A is real and the largest eigenvalue is complex, then there will be two complex conjugate eigenvalues with the same largest absolute magnitude |λ_1| = |λ_2|. In this case, the power method will not converge. It also fails in the extreme case of an orthogonal matrix, for which all eigenvalues have the same absolute value.
• Overall, the method is useful when A is large and sparse, and when there is a big gap between |λ_1| and |λ_2|.

5.5.2. Inverse power method

To overcome the drawbacks of the power method, we may apply it to (A − σI)^{−1} instead of A, where σ is called a shift. Then it will converge to the eigenvalue closest to σ, rather than just λ_1. This is called inverse iteration or inverse power iteration.

Algorithm 5.20 (Inverse Power Iteration).

    input x_0;                              (initial guess)
    i = 0;
    do repeat
        solve (A − σI) y_{i+1} = x_i
        x_{i+1} = y_{i+1}/||y_{i+1}||_2     (approx. e-vector)
        η_{i+1} = x_{i+1}^T A x_{i+1}       (approx. e-value)
        i = i + 1;
    until convergence                                        (5.17)
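A minimal MATLAB sketch of Algorithm 5.20 (not from the original text). The matrix is the one of Example 5.18, with σ(A) = {9, 3, −2}; the shift σ = 2.5 and the starting vector are illustrative choices, so the iteration converges to the eigenvalue closest to the shift, namely 3.

A = [-31 -35 16; -10 -8 4; -100 -104 49];
sigma = 2.5;                        % illustrative shift
x = [1; 1; 1];  x = x / norm(x);
for i = 1:20
    y = (A - sigma*eye(3)) \ x;     % solve (A - sigma I) y = x
    x = y / norm(y);                % approximate eigenvector
    eta = x' * A * x;               % approximate eigenvalue
end
eta                                  % approx 3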

Convergence of Inverse Iterations

• Let A = S Λ S^{−1}. Then
    A − σI = S(Λ − σI)S^{−1}
and therefore
    (A − σI)^{−1} = S(Λ − σI)^{−1} S^{−1}.
Thus (A − σI)^{−1} has the same eigenvectors s_j as A, with eigenvalues (λ_j − σ)^{−1}.
• If |λ_k − σ| < |λ_i − σ| for all i ≠ k, then
    max_{i≠k} |λ_k − σ| / |λ_i − σ| < 1.
Let x_0 = Σ_i α_i s_i. Then
    (A − σI)^{−j} x_0 = S(Λ − σI)^{−j} S^{−1} S [α_1, ···, α_n]^T
                      = S [α_1(λ_1 − σ)^{−j}, ···, α_n(λ_n − σ)^{−j}]^T
                      = α_k (λ_k − σ)^{−j} S [ (α_1/α_k)((λ_k − σ)/(λ_1 − σ))^j, ···, 1, ···, (α_n/α_k)((λ_k − σ)/(λ_n − σ))^j ]^T
                      → β S e_k = β s_k,                              (5.18)
so that η_j = x_j^T A x_j → λ_k.

Convergence of Inverse Iterations (2)

• Inverse iteration can be used in conjunction with the QR algorithm as follows [6, p.363]:
  * Compute the Hessenberg decomposition
        U_0^T A U_0 = H,
    where U_0^T U_0 = I and H = [h_ij] is a Hessenberg matrix, i.e., h_ij = 0 for i > j + 1.
  * Apply the double implicit shift Francis QR iteration to H without accumulating transformations. (See [6, § 7.5.5].)
  * For each computed eigenvalue λ whose corresponding eigenvector is sought, apply the inverse power iteration, Algorithm 5.20, with A = H and σ = λ to produce a vector z such that
        H z ≈ σ z.
  * Set x = U_0 z.
• Inverse iteration with H is very economical because
  * we do not have to accumulate transformations during the double Francis iteration,
  * we can factor matrices of the form H − λI in O(n²) flops, and
  * only one iteration is typically required to produce an adequate approximate eigenvector.

5.5.3. Orthogonal/Subspace iteration

Orthogonal iteration (or subspace iteration, or simultaneous iteration) converges to a (p > 1)-dimensional invariant subspace, rather than one eigenvector at a time [6, p.363].

Definition 5.21. An invariant subspace of A is a subspace X of Rn, with the property that x ∈ X implies that Ax ∈ X. We also write this as

AX ⊆ X. (5.19)

Algorithm 5.22 (Orthogonal iteration).

    Z_0 — an n × p orthogonal matrix
    i = 0
    do repeat
        Y_{i+1} = A Z_i
        factor Y_{i+1} = Z_{i+1} R_{i+1} using QR      (Z_{i+1} spans an approx. invariant subspace)
        i = i + 1
    until convergence                                                  (5.20)

• If p = 1, the orthogonal iteration becomes the power iteration.
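A minimal MATLAB sketch of Algorithm 5.22 with p = 2, applied for illustration to the matrix of Example 5.18 (the choices of matrix, p, and iteration count are not from the text):

A = [-31 -35 16; -10 -8 4; -100 -104 49];
Z = eye(3, 2);                      % Z_0: an n-by-p orthonormal matrix
for i = 1:50
    Y = A * Z;
    [Z, R] = qr(Y, 0);              % economy-size QR; Z spans an approx. invariant subspace
end
eig(Z' * A * Z)'                    % approx the two dominant eigenvalues, 9 and 3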

Convergence of the orthogonal iteration

Theorem 5.23. Consider running orthogonal iteration on a matrix A with p = n and Z_0 = I. If all the eigenvalues of A have distinct absolute values and if all the principal submatrices S(1:j, 1:j) have full rank, then A_i ≡ Z_i^T A Z_i converges to the Schur form of A, i.e., an upper triangular matrix with the eigenvalues on the diagonal. The eigenvalues will appear in decreasing order of absolute value.

Proof (sketch). Z_i is a square orthogonal matrix, so A and A_i = Z_i^T A Z_i are similar. Write Z_i = [Z_{1i}, Z_{2i}], where Z_{1i} has p columns, so
    A_i = Z_i^T A Z_i = [ Z_{1i}^T A Z_{1i}   Z_{1i}^T A Z_{2i};   Z_{2i}^T A Z_{1i}   Z_{2i}^T A Z_{2i} ].
Since Span{Z_{1i}} converges to an invariant subspace of A, Span{A Z_{1i}} converges to the same subspace. Thus Z_{2i}^T A Z_{1i} converges to Z_{2i}^T Z_{1i} = 0. Since this is true for all p < n, every subdiagonal entry of A_i converges to zero, so A_i converges to upper triangular form, i.e., Schur form.

Remarks:

• In Theorem 5.23, S(1:j, 1:j) must be nonsingular. Suppose that A is diagonal with the eigenvalues not in decreasing order. Then orthogonal iteration yields Z_i = diag(±1) and A_i = A for all i, so the eigenvalues do not move into decreasing order.
• Theorem 5.23 also requires that the eigenvalues of A be distinct in absolute value. If A is orthogonal and all its eigenvalues have absolute value 1, then the algorithm leaves A_i essentially unchanged. (The rows and columns may be multiplied by −1.)

5.5.4. QR iteration

Algorithm 5.24 (QR iteration, without shifts).

    A_0 — original matrix
    i = 0
    do repeat
        factor A_i = Q_i R_i          (QR decomposition of A_i)
        A_{i+1} = R_i Q_i             (recombine Q and R in reverse order)
        i = i + 1
    until convergence                                        (5.21)

• Note that
    A_{i+1} = R_i Q_i = Q_i^T (Q_i R_i) Q_i = Q_i^T A_i Q_i.
Thus, A_{i+1} and A_i are orthogonally similar.
• Lemma 5.25. A_i = Z_i^T A Z_i, where Z_i is the matrix computed from orthogonal iteration (Algorithm 5.22). Thus A_i converges to Schur form if all the eigenvalues have different absolute values.
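A minimal MATLAB sketch of Algorithm 5.24, applied for illustration to the matrix of Example 5.18; since |9| > |3| > |−2|, the iterates approach real Schur (upper triangular) form. The shifted variant, Algorithm 5.26 below, only changes the two lines inside the loop (factor A − σI, then add σI back).

A = [-31 -35 16; -10 -8 4; -100 -104 49];
for i = 1:60
    [Q, R] = qr(A);
    A = R * Q;                      % A_{i+1} = R_i Q_i = Q_i' A_i Q_i
end
diag(A)'                            % approx 9, 3, -2, in decreasing order of |lambda|
norm(tril(A, -1))                   % subdiagonal part is (nearly) zero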

Speedup of QR iteration

Algorithm 5.26 (QR iteration with a shift).

    A_0 — original n × n matrix
    i = 0
    do repeat
        choose a shift σ_i near an eigenvalue of A
        factor A_i − σ_i I = Q_i R_i
        A_{i+1} = R_i Q_i + σ_i I
        i = i + 1
    until convergence                                        (5.22)

• Note that
    A_{i+1} = R_i Q_i + σ_i I = Q_i^T (Q_i R_i) Q_i + σ_i Q_i^T Q_i
            = Q_i^T (Q_i R_i + σ_i I) Q_i = Q_i^T A_i Q_i.
Thus, A_{i+1} ∼ A_i orthogonally.
• If R_i is nonsingular,
    A_{i+1} = R_i Q_i + σ_i I = R_i Q_i R_i R_i^{−1} + σ_i R_i R_i^{−1}
            = R_i (Q_i R_i + σ_i I) R_i^{−1} = R_i A_i R_i^{−1}.
• The QR iteration is still too expensive.
• We may choose σ_i = A_i(n, n) for faster convergence.

5.5.5. Making QR iteration practical

• "Practical" QR Algorithm [9, Part V, p.212]:

    (Q^(0))^T A^(0) Q^(0) = A       (A^(0) is a tridiagonalization/Hessenberg reduction of A)
    for k = 1, 2, ···
        pick a shift µ^(k)          (e.g., µ^(k) = A^(k−1)(n, n))
        factor A^(k−1) − µ^(k) I = Q^(k) R^(k)
        A^(k) = R^(k) Q^(k) + µ^(k) I
        if A^(k)(j, j+1) → 0
            set A(j, j+1) = A(j+1, j) = 0 to obtain
                [A_1  0; 0  A_2] = A^(k)
            and now apply the QR iteration to A_1, A_2
        end if
    end for                                                  (5.23)

Making QR iteration more practical

• Issues:

– The convergence of the algorithm is slow.
– The iteration is very expensive. At each step, the QR decomposition costs O(n³) flops; to find n eigenvalues, we need O(n⁴) flops.

• Improvement:

– Reduce the matrix to upper Hessenberg form and apply QR implicitly. The cost is reduced from O(n⁴) to O(n³).
– Use simultaneous shifts for complex eigenvalues.

– Convergence occurs when subdiagonal entries of Ai are “small enough”.

Details will be described in later subsections.

5.5.6. Hessenberg reduction

Similarity transforms:

Definition 5.27. Two matrices A, B ∈ C^{n×n} are similar if there is a nonsingular matrix S ∈ C^{n×n} such that

B = S−1AS. (5.24)

Equation (5.24) is called a similarity transformation, and S is called the transforming matrix.

Theorem 5.28. Similar matrices have the same eigenvalues.
Proof. See Homework 5.2.

Theorem 5.29. Suppose B = S^{−1}AS. Then v is an eigenvector of A with eigenvalue λ if and only if S^{−1}v is an eigenvector of B with associated eigenvalue λ.

Hessenberg reduction: Find an orthogonal matrix Q such that

    Q*AQ = H,    (5.25)
where H = [h_ij] is an upper Hessenberg matrix, i.e., h_ij = 0 whenever i > j + 1.

Hessenberg reduction – via Householder

• A bad idea. Choose Q_1 (a Householder reflection) to zero out A(2:n, 1), as in QR factorization:

    A = [x x x x x; x x x x x; x x x x x; x x x x x; x x x x x]
    → Q_1* A   = [x x x x x; 0 x x x x; 0 x x x x; 0 x x x x; 0 x x x x]
    → Q_1* A Q_1 = [x x x x x; x x x x x; x x x x x; x x x x x; x x x x x].

  The similarity transform (the right multiplication by Q_1) destroys the zeros just created.

• A good idea. Choose Q_1 to leave the first row unchanged and zero out only A(3:n, 1):

    A → Q_1* A   = [x x x x x; x x x x x; 0 x x x x; 0 x x x x; 0 x x x x]
    → Q_1* A Q_1 = [x x x x x; x x x x x; 0 x x x x; 0 x x x x; 0 x x x x].

  The right multiplication acts only on columns 2:n, so the zeros in the first column survive.

• Continuing, Q_2 leaves the first two rows unchanged and zeroes out A(4:n, 2), and so on:

    Q_2* Q_1* A Q_1 Q_2 = [x x x x x; x x x x x; 0 x x x x; 0 0 x x x; 0 0 x x x]
    → ··· → Q_{n−2}* ··· Q_2* Q_1* A Q_1 Q_2 ··· Q_{n−2} = H (upper Hessenberg),

  with Q = Q_1 Q_2 ··· Q_{n−2}, so the left factor is Q*.

• Reference: [4, 9]

Hessenberg reduction – a general algorithm

• Reduction to upper Hessenberg form via Householder Algorithm 5.30.

    if Q is desired, set Q = I;
    for i = 1 : n − 2
        u_i = House(A(i+1 : n, i));
        P_i = I − 2 u_i u_i^T;                       /* Q_i = diag(I_{i×i}, P_i) */
        A(i+1 : n, i : n) = P_i · A(i+1 : n, i : n);
        A(1 : n, i+1 : n) = A(1 : n, i+1 : n) · P_i;
        if Q is desired
            Q(i+1 : n, i : n) = P_i · Q(i+1 : n, i : n);   /* Q = Q_i · Q */
        end if
    end for

• This algorithm does not form P_i explicitly, but instead multiplies by I − 2 u_i u_i^T via matrix-vector multiplications; the u_i vectors can be stored below the subdiagonal (see the QR part of the lecture notes).
• Proposition 5.31. Hessenberg form is preserved by QR iteration.

Definition 5.32. A Hessenberg matrix H is unreduced if all subdiagonals are nonzero.

• If H is reduced (due to h_{i+1,i} = 0), then its eigenvalues are those of its leading i × i Hessenberg submatrix together with those of its trailing (n − i) × (n − i) Hessenberg submatrix; for example, with h_{5,4} = 0,
    H_{9,9} = [ H_{4,4}  * ;  O  H_{5,5} ].
• We will consider only unreduced Hessenberg matrices, without loss of generality.

Hessenberg reduction via Householder

• Costs: (10/3)n³ + O(n²) flops, or (14/3)n³ + O(n²) when Q = Q_{n−1} ··· Q_1 is also computed.
• Backward error [13]: The computed Hessenberg matrix Ĥ satisfies
    Ĥ = Q^T (A + E) Q,
where Q is orthogonal and
    ||E||_F ≤ c n² u ||A||_F,
with c a small constant.

• The advantage of Hessenberg form under QR iteration is that

– it costs only 6n2 + O(n) flops instead of O(n3), and – its form is preserved so that the matrix remains upper Hessenberg.

• This algorithm is available as
  – hess (MATLAB),
  – sgehrd (LAPACK).
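A short usage sketch of MATLAB's hess (the test matrix here is an arbitrary random one, not from the text):

A = randn(6);
[Q, H] = hess(A);                      % Q'*A*Q = H, Q orthogonal, H upper Hessenberg
norm(Q'*A*Q - H)                       % ~ machine precision
norm(tril(H, -2))                      % zero below the first subdiagonal
sort(abs(eig(H))) - sort(abs(eig(A)))  % eigenvalues are preserved (Theorem 5.28)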

QR via Givens, for Hessenberg matrices

• Pictorially, for a 4 × 4 upper Hessenberg matrix H_1, one QR step looks as follows. Givens rotations applied from the left annihilate the subdiagonal entries one by one (G_1* zeroes the (2,1) entry, G_2* the (3,2) entry, G_3* the (4,3) entry):
    H_1 → G_1* H_1 → G_2* G_1* H_1 → G_3* G_2* G_1* H_1 = R   (upper triangular).
The rotations are then applied from the right, which refills only the first subdiagonal:
    R → R G_1 → R G_1 G_2 → R G_1 G_2 G_3 = H_2   (upper Hessenberg again).
• H_2 = R G_1 G_2 G_3 = G_3* G_2* G_1* H_1 G_1 G_2 G_3 = Q* H_1 Q, with Q = G_1 G_2 G_3.

QR via Givens, for Hessenberg (cont'd)

• Algorithm 5.33. Let A ∈ R^{n×n} be upper Hessenberg. Then the following algorithm overwrites A with
    Q^T A = R,
where Q is orthogonal and R is upper triangular. Q = G_1 ··· G_{n−1} is a product of Givens rotations, where G_j has the form G_j = G(j, j+1, θ_j).

    for j = 1 : n − 1
        [c_j, s_j] = givens(A(j, j), A(j+1, j))     (rotation G_j chosen so that (G_j^T A)(j+1, j) = 0)
        A(j : j+1, j : n) = G_j^T · A(j : j+1, j : n)
    end
    for j = 1 : n − 1
        A(1 : j+1, j : j+1) = A(1 : j+1, j : j+1) · G_j
    end                                                            (5.26)

• This algorithm requires about 2 · 3n² = 6n² flops.

 3 1 2  Example 5.34. Find QR decomposition of H1 =  4 2 3 . 0 .01 1 Solution.  5.00002 .20003 .6000   5.00002 .20003 .6000  0 0 G1· G2· H1 −−→  00 .40000 .2000  −−→  00 .40010 .2249  0 0.0100 1.0000 000 .9947

| {z0 } | 0 {z0 } G1H1 G2G1H1=R  4.7600 −2.68003 .6000   4.7600 −2.58923 .6659  ·G1 ·G2 R −−→  0.32010 .24010 .2249  −−→  0.32010 .24560 .2189  000 .9947 00 .02490 .9944 | {z } | {z } RG1 RG1G2=H2 where  .6 −.8 0   1 0 0  G1 =  .8 .6 0  ,G2 =  0 .9997 −.0250  . 0 0 1 0 .0250 .9997 Then, 0 0 0 H2 = RG1G2 = G2G1H1G1G2 = Q H1Q and Q = G1G2, which does not have to be constructed explicitly in practice. 214 Chapter 5. Eigenvalue Problems 5.5.7. Tridiagonal and bidiagonal reduction
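The computation in Example 5.34 can be reproduced with MATLAB's planerot, which generates 2 × 2 Givens rotations; this is only a sketch, and the signs of the rotations depend on the convention used.

H1 = [3 1 2; 4 2 3; 0 0.01 1];
n = size(H1, 1);
R = H1;  Q = eye(n);
for j = 1:n-1
    G = planerot(R(j:j+1, j));          % 2-by-2 rotation zeroing R(j+1, j)
    R(j:j+1, j:n) = G * R(j:j+1, j:n);
    Q(:, j:j+1)   = Q(:, j:j+1) * G';   % accumulate Q, so that Q'*H1 = R
end
H2 = R * Q                               % = Q'*H1*Q, upper Hessenberg again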

5.5.7. Tridiagonal and bidiagonal reduction

• Tridiagonal reduction: If A is symmetric, the Hessenberg reduction process leaves A symmetric at each step, so zeros are created in symmetric positions; the result is a tridiagonal matrix (nonzeros only on the diagonal and the first sub- and superdiagonals).
• Bidiagonal reduction: Compute orthogonal matrices Q and V such that QAV is bidiagonal (nonzeros only on the diagonal and the first superdiagonal).

Tridiagonal reduction algorithm

• Algorithm 5.35 (Householder tridiagonalization algorithm). If A ∈ R^{n×n} is symmetric, then the following algorithm overwrites A with T = Q^T A Q, where T is tridiagonal and Q = H_1 ··· H_{n−2} is the product of the Householder transformations.

    for k = 1 : n − 2
        [v, β] = house(A(k+1 : n, k))
        p = β A(k+1 : n, k+1 : n) v
        w = p − (β p^T v / 2) v
        A(k+1, k) = ||A(k+1 : n, k)||_2;  A(k, k+1) = A(k+1, k);
        A(k+1 : n, k+1 : n) = A(k+1 : n, k+1 : n) − v w^T − w v^T
    end                                                      (5.27)

• The matrix Q can be stored in factored form in the subdiagonal portion of A.
• Cost: (4/3)n³ + O(n²) flops, or (8/3)n³ + O(n²) to form Q_{n−1} ··· Q_1 as well.
• LAPACK routine: ssytrd.

• Example:
    [1 0 0; 0 .6 .8; 0 .8 −.6]^T [1 3 4; 3 2 8; 4 8 3] [1 0 0; 0 .6 .8; 0 .8 −.6] = [1 5 0; 5 10.32 1.76; 0 1.76 −5.32].

• Note that if T has a zero subdiagonal, then the eigenproblem splits into a pair of smaller eigenproblems. In particular, if t_{k+1,k} = 0, then
    σ(T) = σ(T(1 : k, 1 : k)) ∪ σ(T(k+1 : n, k+1 : n)).
If T has no zero subdiagonal entries, it is unreduced.
• Let T̂ denote the computed version of T obtained by Algorithm 5.35. Then
    T̂ = Q̃^T (A + E) Q̃,
where Q̃ is exactly orthogonal and E is a symmetric matrix satisfying ||E||_F ≤ c u ||A||_F, where c is a small constant [13].

Bidiagonal reduction

• Bidiagonal reduction: Compute orthogonal matrices Q and V such that QAV is bidiagonal; whether A is rectangular (e.g., 7 × 5) or square, only the diagonal and the first superdiagonal of QAV are nonzero.
• If A is n × n, we get orthogonal matrices Q = Q_{n−1} ··· Q_1 and V = V_1 ··· V_{n−2} such that
    Q A V = Ã    (5.28)
is upper bidiagonal.
  – Eigenvalues/singular values:
        Ã^T Ã = V^T A^T Q^T Q A V = V^T A^T A V
    ⟹ Ã^T Ã has the same eigenvalues as A^T A;
    ⟹ Ã has the same singular values as A.
  – Cost: (8/3)n³ + O(n²) flops, and 4n³ + O(n²) to compute Q and V.
  – LAPACK routine: sgebrd.

• Bidiagonal reduction process (illustrated for a 4 × 4 matrix A):
  – Q_1 is a Householder reflection applied from the left, zeroing out A(2:4, 1);
  – V_1 is a Householder reflection applied from the right that leaves the first column of Q_1^T A unchanged and zeroes out (Q_1^T A)(1, 3:4);
  – Q_2^T, V_2, and Q_3^T then eliminate the remaining entries below the diagonal and beyond the first superdiagonal:
        Q_3^T Q_2^T Q_1^T A V_1 V_2 = B.
• Thus A → Q^T A V = B, upper bidiagonal, where
    Q = Q_1 Q_2 Q_3,   V = V_1 V_2.

Algorithm 5.36 (Householder bidiagonalization algorithm) [6]. Given A ∈ R^{m×n} (m ≥ n), the following algorithm overwrites A with
    Q_B^T A V_B = B,
where B is upper bidiagonal, Q_B = Q_1 ··· Q_n, and V_B = V_1 ··· V_{n−2}. The essential part of Q_j's Householder vector is stored in A(j+1 : m, j) and the essential part of V_j's Householder vector is stored in A(j, j+2 : n).

    for j = 1 : n − 1
        [v, β] = house(A(j : m, j))
        A(j : m, j : n) = (I_{m−j+1} − β v v^T) A(j : m, j : n)
        A(j+1 : m, j) = v(2 : m−j+1)
        if j ≤ n − 2
            [v, β] = house(A(j, j+1 : n)^T)
            A(j : m, j+1 : n) = A(j : m, j+1 : n)(I_{n−j} − β v v^T)
            A(j, j+2 : n) = v(2 : n−j)^T
        end
    end                                                      (5.29)

• This algorithm requires about 4mn² − 4n³/3 flops.

Example 5.37. If the Householder bidiagonalization algorithm is applied to A below (entries shown correct to 4 significant digits), then Q^T A V = B with
    A = [1 5 9; 2 6 10; 3 7 11; 4 8 12]  →  B ≈ [5.477 23.80 0; 0 7.264 −.7515; 0 0 0; 0 0 0],
    V ≈ [1 0 0; 0 .5369 .8437; 0 .8437 −.5369],
    Q ≈ [.1826 .8165 −.5400 .0917; .3651 .4082 .7883 .2803; .5477 −.0000 .0434 −.8355; .7303 −.4082 −.2917 .4636].
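A small sanity check (added as a sketch): since Q and V are orthogonal, B must have the same singular values as A; comparing svd of A with svd of the 4-digit B above, the values agree up to the rounding of B.

A = [1 5 9; 2 6 10; 3 7 11; 4 8 12];
B = [5.477 23.80 0; 0 7.264 -0.7515; 0 0 0; 0 0 0];   % B as reported, 4 digits
svd(A)'                 % approx 25.44, 1.72, 0
svd(B)'                 % close to svd(A); differences reflect the rounding of B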

• R-Bidiagonalization: If m ≫ n, we first transform A to an upper triangular matrix,
    Q^T A = [R_1; 0],
and then apply the Householder bidiagonalization algorithm to the square upper triangular matrix R_1. Total cost: 2mn² + 2n³ flops, which is less than 4mn² − 4n³/3 whenever m ≥ 5n/3.

5.5.8. QR iteration with implicit shifts – Implicit Q theorem

In this subsection, we show how to implement QR iteration effectively for an upper Hessenberg matrix H.

• The implementation is implicit in the sense that we do not explicitly compute the QR factorization of H, but rather construct Q as a product of Givens rotations and simple orthogonal matrices.
• The implicit Q theorem shows that this implicitly constructed Q is the Q we want.

Theorem 5.38 (Implicit Q theorem). Suppose that Q^T A Q = H is unreduced upper Hessenberg. Then columns 2 through n of Q are determined uniquely (up to signs) by the first column of Q.

Implication of the theorem: To compute A_{i+1} = Q_i^T A_i Q_i from A_i (in the QR iteration), we only need to

• compute the first column of Q_i (which is parallel to the first column of A_i − σ_i I and so can be obtained just by normalizing this column vector);
• choose the other columns of Q_i such that Q_i is orthogonal and A_{i+1} is unreduced Hessenberg.

Proof (sketch of the proof of the implicit Q theorem). Let
    Q^T A Q = H,   V^T A V = G
be unreduced upper Hessenberg, Q and V orthogonal, and suppose the first columns of Q and V are equal. Define
    W = V^T Q = [w_1, ···, w_n].
Then W is orthogonal and GW = WH, and therefore
    h_{i+1,i} w_{i+1} = G w_i − Σ_{j=1}^{i} h_{ji} w_j,   i = 1, 2, ···, n − 1.
Since w_1 = e_1 and G is upper Hessenberg, W is upper triangular. Since W is also orthogonal, W is diagonal and W = diag(±1, ···, ±1).

Implicit Single Shift QR Algorithm

It is also called the bulge-and-chase algorithm.

Example 5.39. To see how to use the implicit Q theorem to compute A_1 from A_0 = A, we will use a 5 × 5 upper Hessenberg matrix.

1. Choose a Givens rotation G_1, acting on rows/columns 1 and 2, and form A_1 = G_1^T A G_1. (We will discuss how to choose c_1 and s_1 below; for now they may be any Givens rotation.) The similarity fills in the (3,1) position; this entry ⊕ is called a bulge and needs to be eliminated to restore Hessenberg form.
2. Choose G_2, acting on rows/columns 2 and 3, so that A_2 = G_2^T A_1 G_2 annihilates the (3,1) bulge; the right multiplication creates a new bulge at (4,2). Thus the bulge has been chased from (3,1) to (4,2).
3. Choose G_3, acting on rows/columns 3 and 4, so that A_3 = G_3^T A_2 G_3 chases the bulge from (4,2) to (5,3).
4. Choose G_4, acting on rows/columns 4 and 5, so that A_4 = G_4^T A_3 G_4 eliminates the last bulge. So, we are back to upper Hessenberg form.

• Let Q = G_1 G_2 G_3 G_4. Then Q^T A Q = A_4. The first column of Q is [c_1, s_1, 0, ···, 0]^T, which by the implicit Q theorem uniquely determines the other columns of Q (up to signs).
• We now choose the first column of Q to be proportional to the first column of A − σI, i.e., to [a_11 − σ, a_21, 0, ···, 0]^T. Then Q is the same as in the QR factorization of A − σI!
• Cost: 6n² + O(n) for any n × n matrix A.
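The bulge-and-chase step of Example 5.39 can be sketched in a few lines of MATLAB using planerot; with a zero shift (which is what the rotations of Example 5.40 below correspond to), the result can be compared with A_4 of that example, up to the signs of rows and columns, since the rotations are determined only up to sign.

A = [1 6 11 16 21; 2 7 12 17 22; 0 8 13 18 23; 0 0 14 19 24; 0 0 0 20 25];
sigma = 0;                                   % shift
n = size(A, 1);
G = planerot([A(1,1) - sigma; A(2,1)]);      % first rotation, rows/columns 1:2
A(1:2, :) = G * A(1:2, :);
A(:, 1:2) = A(:, 1:2) * G';                  % creates the bulge at (3,1)
for k = 1:n-2
    G = planerot(A(k+1:k+2, k));             % chase the bulge one position down
    A(k+1:k+2, :) = G * A(k+1:k+2, :);
    A(:, k+1:k+2) = A(:, k+1:k+2) * G';
end
A                                            % upper Hessenberg; compare with A4 below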

Example 5.40. Use the implicit single shift QR algorithm to find the QR factorization of
    A = [1 6 11 16 21; 2 7 12 17 22; 0 8 13 18 23; 0 0 14 19 24; 0 0 0 20 25].    (5.30)

Solution.

• Step 1: bulge ⊕ = −7.1554

G1 =
  -0.4472   0.8944        0        0        0
  -0.8944  -0.4472        0        0        0
        0        0   1.0000        0        0
        0        0        0   1.0000        0
        0        0        0        0   1.0000
G1'*A0 =
  -2.2361  -8.9443  -15.6525  -22.3607  -29.0689
        0   2.2361    4.4721    6.7082    8.9443
        0   8.0000   13.0000   18.0000   23.0000
        0        0   14.0000   19.0000   24.0000
        0        0         0   20.0000   25.0000
A1 = G1'*A0*G1 =
   9.0000   2.0000  -15.6525  -22.3607  -29.0689
  -2.0000  -1.0000    4.4721    6.7082    8.9443
  -7.1554  -3.5777   13.0000   18.0000   23.0000
        0        0   14.0000   19.0000   24.0000
        0        0         0   20.0000   25.0000

• Step 2: bulge ⊕ = −13.4832

G2 =
   1.0000        0        0        0        0
        0  -0.2692   0.9631        0        0
        0  -0.9631  -0.2692        0        0
        0        0        0   1.0000        0
        0        0        0        0   1.0000
G2'*A1 =
   9.0000   2.0000  -15.6525  -22.3607  -29.0689
   7.4297   3.7148  -13.7240  -19.1414  -24.5587
        0        0    0.8076    1.6151    2.4227
        0        0   14.0000   19.0000   24.0000
        0        0         0   20.0000   25.0000
A2 = G2'*A1*G2 =
   9.0000   14.5363   6.1397  -22.3607  -29.0689
   7.4297   12.2174   7.2721  -19.1414  -24.5587
        0   -0.7778  -0.2174    1.6151    2.4227
        0  -13.4832  -3.7687   19.0000   24.0000
        0         0        0   20.0000   25.0000

• Step 3: bulge ⊕ = −19.9668

G3 =
   1.0000        0        0        0        0
        0   1.0000        0        0        0
        0        0  -0.0576   0.9983        0
        0        0  -0.9983  -0.0576        0
        0        0        0        0   1.0000
G3'*A2 =
   9.0000   14.5363   6.1397  -22.3607  -29.0689
   7.4297   12.2174   7.2721  -19.1414  -24.5587
        0   13.5056   3.7749  -19.0615  -24.0997
        0         0        0    0.5183    1.0366
        0         0        0   20.0000   25.0000
A3 = G3'*A2*G3 =
   9.0000   14.5363   21.9700   7.4172  -29.0689
   7.4297   12.2174   18.6908   8.3623  -24.5587
        0   13.5056   18.8125   4.8664  -24.0997
        0         0   -0.5174  -0.0298    1.0366
        0         0  -19.9668  -1.1518   25.0000

• Step 4: bulge chasing is completed.

G4 =
   1.0000        0        0        0        0
        0   1.0000        0        0        0
        0        0   1.0000        0        0
        0        0        0  -0.0259   0.9997
        0        0        0  -0.9997  -0.0259
G4'*A3 =
   9.0000   14.5363   21.9700   7.4172  -29.0689
   7.4297   12.2174   18.6908   8.3623  -24.5587
        0   13.5056   18.8125   4.8664  -24.0997
        0         0   19.9735   1.1521  -25.0185
        0         0        0        0    0.3886
A4 = G4'*A3*G4 =
   9.0000   14.5363   21.9700   28.8670   8.1678
   7.4297   12.2174   18.6908   24.3338   8.9957
        0   13.5056   18.8125   23.9655   5.4891
        0         0   19.9735   24.9802   1.7999
        0         0        0   -0.3885   -0.0101

Thus, Q^T A Q = A_4, where
Q = G1*G2*G3*G4 =
  -0.4472  -0.2408  -0.0496  -0.0223   0.8597
  -0.8944   0.1204   0.0248   0.0111  -0.4298
        0  -0.9631   0.0155   0.0070  -0.2687
        0        0  -0.9983   0.0015  -0.0576
        0        0        0  -0.9997  -0.0259
That is, the first column of Q is [c_1, s_1, 0, 0, 0]^T, and the entries in positions (3,2), (4,3), (5,4) are s_2, s_3, s_4, respectively,

and therefore A_4 = RQ, where A = QR.

Implicit Double Shift QR Algorithm

Double shifting has been considered to maintain real arithmetic, for computing the eigenvalues of companion and fellow matrices.

• Companion and fellow matrices are Hessenberg matrices that can be decomposed into the sum of a unitary and a rank-1 matrix.
• The Hessenberg, the unitary, as well as the rank-1 structures are preserved under a step of the QR-method. This makes these matrices suitable for the design of a fast QR-method.
• Several techniques already exist for performing a QR-step. The implementation of these methods is highly dependent on the representation used.

Definition 5.41. Given an nth-order monic polynomial
    a(x) = a_0 + a_1 x + ··· + a_{n−1} x^{n−1} + x^n,    (5.31)
its companion matrix is defined as
    A = [0 0 ··· 0 −a_0;
         1 0 ··· 0 −a_1;
         0 1 ··· 0 −a_2;
         ⋮ ⋮  ⋱ ⋮   ⋮  ;
         0 0 ··· 1 −a_{n−1}].    (5.32)
The eigenvalues of A are the roots of the polynomial a(x).

Definition 5.42. Fellow matrices are rank-1 perturbations of unitary Hessenberg matrices. That is, a fellow matrix F is expressed as
    F = H + u v^T,    (5.33)
where H is a unitary Hessenberg matrix and u and v are vectors.
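A small illustration (added as a sketch) of Definition 5.41: the eigenvalues of a companion matrix are the roots of a(x). MATLAB's compan builds a permuted/transposed form of (5.32), but it has the same eigenvalues.

a = [1 -6 11 -6];          % a(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
C = compan(a);             % companion matrix of a(x)
eig(C)'                    % 1, 2, 3 (in some order): the roots of a(x)
roots(a)'                  % roots is based on the same idea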

5.6. Symmetric Eigenvalue Problems

5.6.1. Algorithms for the symmetric eigenproblem – Direct methods

1. Tridiagonal QR iteration: This algorithm finds all the eigenvalues, and optionally all the eigenvectors, of a symmetric tridiagonal matrix.

– Implemented efficiently, it is currently the fastest practical method to find all the eigenvalues of a symmetric tridiagonal matrix, taking O(n2) flops. – But for finding all the eigenvectors as well, QR iteration takes a little over 6n3 flops on average and is only the fastest algorithm for small matrices, up to about n = 25.

Matlab command: eig.
LAPACK routines: ssyev (for dense matrices), sstev (for tridiagonal matrices).

2. Rayleigh quotient iteration: This algorithm underlies the QR iteration, but it can be used separately (without assuming that the matrix has first been reduced to tridiagonal form).
3. Divide-and-conquer method: This is currently the fastest method to find all the eigenvalues and eigenvectors of symmetric tridiagonal matrices larger than n = 25. (LAPACK routine: sstevd, which defaults to QR iteration for smaller matrices.) In the worst case, divide-and-conquer requires O(n³) flops, but in practice the constant is quite small. Over a large set of random test cases, it appears to take only O(n^{2.3}) flops on average, and as low as O(n²) for some eigenvalue distributions. In theory, divide-and-conquer could be implemented to run in O(n log^p n).

4. Bisection and inverse iteration: Bisection may be used to find just a subset of the eigenvalues of a symmetric tridiagonal matrix, say, those in an interval [a, b] or [α_i, α_{i−j}]. It needs only O(nk) flops, where k is the number of eigenvalues desired. Thus bisection can be much faster than QR iteration when k ≪ n, since QR iteration requires O(n²) flops. Inverse iteration can then be used to find the corresponding eigenvectors. In the best case, when the eigenvalues are "well separated", inverse iteration also costs only O(nk) flops. But in the worst case, when many eigenvalues are clustered close together, inverse iteration takes O(nk²) flops and does not even guarantee the accuracy.
5. Jacobi's method: This method is historically the oldest method for the eigenproblem, dating to 1846. It is usually much slower than any of the above methods, taking O(n³) flops with a large constant. But it is sometimes much more accurate than the above methods. It does not require tridiagonalization of the matrix.

5.6.2. Tridiagonal QR iteration

• Tridiagonal QR iteration:

1. Given A = AT , use a variant of the Householder tridiagonalization algorithm (Algorithm 5.35 on p. 215) to find orthogonal Q so that

QAQT = T

is tridiagonal. 2. Apply QR iteration to T to get a sequence

T = T0,T1,T2, ···

of tridiagonal matrices converging to diagonal form.

• The QR iteration keeps all the T_i tridiagonal. Indeed, since Q A Q^T is symmetric and upper Hessenberg, it must also be lower Hessenberg, i.e., tridiagonal. This keeps each QR iteration step very inexpensive.

• Tridiagonal QR iteration – operation count:

  – Reducing A to symmetric tridiagonal form T costs (4/3)n³ + O(n²) flops, or (8/3)n³ + O(n²) flops if eigenvectors are also desired.
  – One tridiagonal QR iteration with a single shift ("bulge chasing") costs 6n flops.
  – Finding all eigenvalues of T takes only 2 QR steps per eigenvalue on average, for a total of 6n² flops.
  – Finding all eigenvalues and eigenvectors of T costs 6n³ + O(n²) flops.
  – The total cost to find just the eigenvalues of A is (4/3)n³ + O(n²) flops.
  – The total cost to find all the eigenvalues and eigenvectors of A is 8⅔ n³ + O(n²) flops.

• Tridiagonal QR iteration – how to choose the shifts: Denote the ith iterate by the symmetric tridiagonal matrix
    T_i = [a_1 b_1; b_1 a_2 b_2; ⋱ ⋱ ⋱; b_{n−2} a_{n−1} b_{n−1}; b_{n−1} a_n].

  – The simplest choice of shift would be σ_i = a_n; this is the single shift QR iteration (see § 5.5.8). It is cubically convergent for almost all matrices.
  – Wilkinson's shift: let the shift σ_i be the eigenvalue of the trailing 2 × 2 block [a_{n−1} b_{n−1}; b_{n−1} a_n] that is closest to a_n.
  – Wilkinson's Theorem: QR iteration with Wilkinson's shift is globally, and at least linearly, convergent. It is asymptotically cubically convergent for almost all matrices.

Tridiagonal QR iteration with Wilkinson's shift

• Matlab Program: T ∈ Rn×n — symmetric tridiagonal, m — number of iterations

% Dr. James Demmel's code tridiQR.m
% (slightly modified)
%
% Store last subdiagonal and last diagonal entries of QR iterates in NN
NN = [ T(n,n-1), T(n,n) ];
% Perform QR iteration with Wilkinson's shift
for i = 1:m
    % Compute the shift
    lc  = T(n-1:n, n-1:n);
    elc = eig(lc);                 % calculate the e-values of lc
    if ( abs(T(n,n)-elc(1)) < abs(T(n,n)-elc(2)) )
        shift = elc(1);
    else
        shift = elc(2);
    end
    % Perform QR iteration; enforce symmetry explicitly
    [q,r] = qr(T - shift*eye(n));
    T = r*q + shift*eye(n);
    T = tril(triu(T,-1),1);
    T = (T + T')/2;
    % Update NN
    NN = [NN; [T(n,n-1), T(n,n)]];   % append T(n,n-1), T(n,n) to NN
end
NN = [NN, NN(:,2)-NN(m+1,2)];

Example 5.43. Here is an illustration of the convergence of the tridiagonal QR iteration, starting with the tridiagonal matrix T = T_0:
    T_0 = [0.2493 1.2630 0 0; 1.2630 0.9688 −0.8281 0; 0 −0.8281 0.4854 −3.1883; 0 0 −3.1883 −0.9156]
    → T_3 = [1.9871 0.7751 0 0; 0.7751 1.7049 −1.7207 0; 0 −1.7207 0.6421 −0.0000; 0 0 −0.0000 −3.5463].    (5.34)

• The following table shows the last off-diagonal entry of each T_i, the last diagonal entry of T_i, and the difference between the last diagonal entry and its ultimate value (α ≈ −3.54627). The cubic convergence of the error to zero (in the last column) is evident.

    i   T_i(4,3)               T_i(4,4)               T_i(4,4) − α
    1   −3.188300000000000     −0.915630000000000     2.630637193722710
    2   −0.056978959466637     −3.545728453189497     0.000538740533214
    3   −0.000000246512081     −3.546267193722698     0.000000000000012

• By the way, eig(T0) = [−3.5463, −0.7091, 1.7565, 3.2868].

• Algorithm 5.44 (Rayleigh quotient iteration). Given x_0 with ||x_0||_2 = 1 and a user-supplied stopping tolerance tol, we iterate

    ρ_0 = ρ(x_0, A) = (x_0^T A x_0)/(x_0^T x_0)
    i = 0
    repeat
        i = i + 1
        y_i = (A − ρ_{i−1} I)^{−1} x_{i−1}
        x_i = y_i / ||y_i||_2
        ρ_i = ρ(x_i, A)
    until convergence (||A x_i − ρ_i x_i||_2 < tol)                (5.35)
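A minimal MATLAB sketch of Algorithm 5.44 on an illustrative random symmetric matrix (not from the text). Near convergence the linear solve involves a nearly singular matrix, which is expected here; MATLAB may issue a warning.

n = 6;  S = randn(n);  A = (S + S') / 2;     % symmetric test matrix
x = randn(n, 1);  x = x / norm(x);
rho = x' * A * x;
tol = 1e-12;
for i = 1:10
    y = (A - rho*eye(n)) \ x;
    x = y / norm(y);
    rho = x' * A * x;
    if norm(A*x - rho*x) < tol, break, end
end
rho                           % some eigenvalue of A
min(abs(eig(A) - rho))        % ~ machine precision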

• Theorem 5.45. Rayleigh quotient iteration is locally cubically convergent. That is, the number of correct digits triples at each step once the error is small enough and the eigenvalue is simple.

5.6.4. Divide-and-conquer

• Divide-and-conquer is the fastest algorithm so far to find all eigenvalues and eigenvectors of a tridiagonal matrix (of dimension ≥ 25).
• The idea is to compute the Schur decomposition
    Q^T T Q = Λ = diag(λ_1, ···, λ_n),   Q^T Q = I,    (5.36)
for tridiagonal T by
  * "tearing" T in half;
  * computing the Schur decompositions of the two halves;
  * combining the two half-sized Schur decompositions into the required full-size Schur decomposition.
The overall procedure is suitable for parallel computation.

5.7. Homework

1. Let, for n ≥ 2 and for a small ε > 0,
       A = [0 1 0 ··· 0; 0 0 1 ··· 0; ⋮ ⋱ ⋱; 0 0 ··· 0 1; ε 0 ··· 0 0] ∈ R^{n×n},    (5.37)
   i.e., the n × n matrix with ones on the superdiagonal, ε in the (n,1) position, and zeros elsewhere.
   (a) Verify that the characteristic polynomial of A is given by φ(λ) = λ^n − ε = 0, so λ = ε^{1/n} (n possible values).
   (b) Verify that
           dλ/dε |_{ε=0} = ∞.    (5.38)
       This implies that the condition number of λ (a multiple eigenvalue) is infinite.
   Note: The eigenvalue λ (the nth root of ε) grows much faster than any multiple of ε when ε is small. Thus the condition number of λ must be large. More formally, you can prove that the condition number of λ is infinite by using a calculus technique, as in (5.38).
2. Prove Theorem 5.28 on page 205. Hint: Let A and B be similar, i.e., B = S^{−1}AS. To show that A and B have the same eigenvalues, it suffices to show that they have the same characteristic polynomial. Thus what you have to do is to show
       det(B − λI) = det(S^{−1}AS − λI) = det(A − λI).    (5.39)

 1 1 1  3. Let A =  −1 9 2  . 0 −1 2 (a) Using Matlab, apply the inverse power iteration with the shift σ = 9, T starting with x0 = [1 1 1] . For simplicity, just form B = (A − 9I)−1 (B=inv(A-9*eye(3));) and then iterate at least 10 iterations with B. 5.7. Homework 243

(b) Use [V,D]=eig(A) to get the true dominant eigenvector v. Calculate the errors and ratios

||xj+1 − v||2 ||xj − v||2, ; j = 1, 2, ··· . ||xj − v||2 Compute the theoretical convergence rate (from the known eigenval- ues) and compare it with these errors/ratios. 4. Let  −31 −35 16  A =  −10 −8 4  . (5.40) −100 −104 49 Then, we have its spectrum σ(A) = {9, 3, −2}. Applying the orthogonal iteration, Algorithm 5.22, with Z0 = [e1 e2]. 5. This problem is concerned with the QR iteration for the same matrix A in Homework 5.4. (a) Apply Algorithm 5.24 to find eigenvalues of A.

(b) Apply Algorithm 5.26, with the shift σi = Ai(3, 3), to find eigenvalues of A. (c) Compare their convergence speeds. 6. Consider the matrix in Example 5.40  1 6 11 16 21   2 7 12 17 22    A =  0 8 13 18 23  .    0 0 14 19 24  0 0 0 20 1

(a) Use [V,D]=eig(A) to get the true eigenvalues and eigenvectors of A. (b) Use the implicit single shift QR iteration to find eigenvalues of A. Compare the results with the true eigenvalues. 244 Chapter 5. Eigenvalue Problems Chapter 6

Chapter 6. Iterative Methods for Linear Systems

To be filled.

6.1. A Model Problem
6.2. Relaxation Methods
6.3. Krylov Subspace Methods
6.4. Domain Decomposition Methods
6.5. Multigrid Methods
6.6. Homework

Bibliography

[1] O. Alter, P. O. Brown, and D. Botstein, Singular value decomposition for genome-wide expression data processing and modeling, PNAS, 97 (2000), pp. 10101–10106.

[2] O. Alter and G. H. Golub, Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription, PNAS, 101 (2004), pp. 16577–16582.

[3] N. M. Bertagnolli, J. A. Drake, J. M. Tennessen, and O. Alter, SVD identifies transcript length distribution functions from DNA microarray data and reveals evolutionary forces globally affecting GBM metabolism, PLoS One, 8 (2013), p. e78913.

[4] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.

[5] S. Gerschgorin, Über die Abgrenzung der Eigenwerte einer Matrix, Izv. Akad. Nauk SSSR Ser. Mat., 7 (1931), pp. 746–754.

[6] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd Ed., The Johns Hopkins University Press, Baltimore, 1996.

[7] M. G. Shirangi, History matching production data and uncertainty assessment with an efficient TSVD parameterization algorithm, Journal of Petroleum Science and Engineering, 113 (2014), pp. 54–71.

[8] O. Taussky, Bounds for characteristic roots of matrices, Duke Math. J., 15 (1948), pp. 1043–1044.

[9] L. N. Trefethen and D. Bau, Numerical Linear Algebra, SIAM: Society for Industrial and Applied Mathematics, 1997.

[10] R. Varga, Matrix Iterative Analysis, 2nd Ed., Springer-Verlag, Berlin, Heidelberg, 2000.

[11] S. Walton, O. Hassan, and K. Morgan, Reduced order modelling for unsteady fluid flow using proper orthogonal decomposition and radial basis functions, Applied Mathematical Modelling, 37 (2013), pp. 8930–8945.

[12] D. Watkins, Foundations of Matrix Computations, 3rd Ed., John Wiley and Sons, New York, Chichester, Brisbane, Toronto, Singapore, 2010.

[13] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1965 (reprinted 1988).
