
Thomas Finley, [email protected]

Fundamentals
A subspace is a set S ⊆ R^n such that 0 ∈ S and for all x, y ∈ S and α, β ∈ R, αx + βy ∈ S.
The span of {v_1, ..., v_k} is the set of all vectors in R^n that are linear combinations of v_1, ..., v_k.
A basis B of a subspace S, B = {v_1, ..., v_k} ⊂ S, has Span(B) = S with all v_i linearly independent.
The dimension of S is |B| for any basis B of S.
For subspaces S, T with S ⊆ T, dim(S) ≤ dim(T); if further dim(S) = dim(T), then S = T.
A linear transformation T : R^n → R^m satisfies T(αx + βy) = αT(x) + βT(y) for all x, y ∈ R^n and α, β ∈ R. There exists A ∈ R^{m×n} such that T(x) ≡ Ax for all x.
For two linear transformations T : R^n → R^m and S : R^m → R^p, the composition S ∘ T ≡ S(T(x)) is a linear transformation: (T(x) ≡ Ax) ∧ (S(y) ≡ By) ⇒ (S ∘ T)(x) ≡ BAx.
A matrix's row space is the span of its rows, its column space (or range) is the span of its columns, and its rank is the dimension of either of these spaces.
For A ∈ R^{m×n}, rank(A) ≤ min(m, n). A has full row (or column) rank if rank(A) = m (or n).
A diagonal matrix D ∈ R^{n×n} has d_{j,k} = 0 for j ≠ k. The identity matrix I has i_{j,j} = 1.
The upper (or lower) bandwidth of A is max |i − j| among i, j with i ≤ j (or i ≥ j) such that A_{i,j} ≠ 0. A matrix with lower bandwidth 1 is upper Hessenberg.
For A, B ∈ R^{n×n}, B is A's inverse if AB = BA = I. If such a B exists, A is invertible or nonsingular, and B = A^{-1} = [x_1, ..., x_n] where Ax_i = e_i.
For A ∈ R^{n×n} the following are equivalent: A is nonsingular; rank(A) = n; Ax = b has a solution x for any b; Ax = 0 implies x = 0.
The nullspace of A ∈ R^{m×n} is {x ∈ R^n : Ax = 0}.
For A ∈ R^{m×n}, Range(A) and Nullspace(A^T) are orthogonal complements: x ∈ Range(A), y ∈ Nullspace(A^T) ⇒ x^T y = 0, and every p ∈ R^m is p = x + y for unique such x and y.
For a permutation matrix P ∈ R^{n×n}, PA permutes the rows of A and AP the columns of A. P^{-1} = P^T.

Basic Linear Algebra Subroutines
0. Scalar ops, like x + y: O(1) flops, O(1) data.
1. Vector ops, like y = ax + y: O(n) flops, O(n) data.
2. Matrix-vector ops, like the rank-one update A = A + xy^T: O(n^2) flops, O(n^2) data.
3. Matrix-matrix ops, like C = C + AB: O(n^2) data, O(n^3) flops.
Use the highest BLAS level possible. The operations are architecture tuned, e.g., data is processed in cache-sized bites.
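As a rough illustration of the three nontrivial BLAS levels (a minimal numpy sketch; the sizes and arrays are arbitrary, and numpy dispatches to tuned BLAS underneath):

    import numpy as np

    n = 1000
    a = 2.0
    x, y = np.random.rand(n), np.random.rand(n)
    A, B, C = (np.random.rand(n, n) for _ in range(3))

    y = a * x + y            # level 1: O(n) flops on O(n) data (axpy)
    A = A + np.outer(x, y)   # level 2: O(n^2) flops on O(n^2) data (rank-one update)
    C = C + A @ B            # level 3: O(n^3) flops on O(n^2) data (matrix-matrix)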
Norms
A vector norm ||·|| : R^n → R satisfies:
1. ||x|| ≥ 0, and ||x|| = 0 ⇔ x = 0.
2. ||γx|| = |γ| · ||x|| for all γ ∈ R and all x ∈ R^n.
3. ||x + y|| ≤ ||x|| + ||y|| for all x, y ∈ R^n.
Common norms include:
1. ||x||_1 = |x_1| + |x_2| + ··· + |x_n|
2. ||x||_2 = sqrt(x_1^2 + x_2^2 + ··· + x_n^2)
3. ||x||_∞ = lim_{p→∞} (|x_1|^p + ··· + |x_n|^p)^{1/p} = max_{i=1..n} |x_i|
An induced matrix norm is ||A|| = sup_{x≠0} ||Ax|| / ||x||. It satisfies the three properties of norms.
For all x ∈ R^n and A ∈ R^{m×n}, ||Ax|| ≤ ||A|| ||x||. Further, ||AB|| ≤ ||A|| ||B||, called submultiplicativity, and a^T b ≤ ||a||_2 ||b||_2, the Cauchy-Schwarz inequality.
1. ||A||_∞ = max_{i=1..m} Σ_{j=1..n} |a_{i,j}| (max row sum).
2. ||A||_1 = max_{j=1..n} Σ_{i=1..m} |a_{i,j}| (max column sum).
3. ||A||_2 is hard: it takes O(n^3), not O(n^2), operations.
4. ||A||_F = sqrt(Σ_{i=1..m} Σ_{j=1..n} a_{i,j}^2). ||·||_F often replaces ||·||_2.
For A ∈ R^{m×n}, if rank(A) = n, then A^T A is SPD.

Determinant
The determinant det : R^{n×n} → R satisfies:
1. det(AB) = det(A) det(B).
2. det(A) = 0 iff A is singular.
3. det(L) = l_{1,1} l_{2,2} ··· l_{n,n} for triangular L.
4. det(A) = det(A^T).
To compute det(A), factor A = PLU. det(P) = (−1)^s where s is the number of row swaps, and det(L) = 1. When computing det(U) watch out for overflow!
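A minimal sketch of the PLU route to det(A), assuming scipy is available; working with log|u_ii| sidesteps the overflow the note warns about:

    import numpy as np
    from scipy.linalg import lu_factor

    A = np.random.rand(6, 6)                    # any square matrix
    lu, piv = lu_factor(A)                      # A = P L U, L unit lower triangular
    d = np.diag(lu)                             # diagonal of U
    s = np.count_nonzero(piv != np.arange(6))   # number of row swaps, det(P) = (-1)**s
    sign = (-1.0) ** s * np.prod(np.sign(d))
    logdet = np.sum(np.log(np.abs(d)))          # log|det(A)|: the product never overflows
    print(sign, logdet)
    print(np.linalg.slogdet(A))                 # numpy's equivalent overflow-safe routine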
Orthogonal Matrices
For Q ∈ R^{n×n}, these statements are equivalent:
1. Q^T Q = QQ^T = I (i.e., Q is orthogonal).
2. The ||·||_2 of each row and column of Q is 1, and the inner product of any row (or column) with any other is 0.
3. For all x ∈ R^n, ||Qx||_2 = ||x||_2.
A matrix Q ∈ R^{m×n} with m > n has orthonormal columns if its columns are orthonormal, i.e., Q^T Q = I.
The product of orthogonal matrices is orthogonal. For orthogonal Q, ||QA||_2 = ||A||_2 and ||AQ||_2 = ||A||_2.

QR-Factorization
For any A ∈ R^{m×n} with m ≥ n, we can factor A = QR, where Q ∈ R^{m×m} is orthogonal and R = [ R_1 ; 0 ] ∈ R^{m×n} with R_1 upper triangular. rank(A) = n iff R_1 is invertible.
Q's first n (or last m − n) columns form an orthonormal basis for range(A) (or nullspace(A^T)).
A Householder reflection is H = I − 2vv^T / v^T v. H is symmetric and orthogonal. Explicit Householder QR-factorization is:
1: for k = 1 : n do
2:   v = A(k:m, k) ± ||A(k:m, k)||_2 e_1
3:   A(k:m, k:n) = (I − 2vv^T / v^T v) A(k:m, k:n)
4: end for
We get H_n H_{n−1} ··· H_1 A = R, so Q = H_1 H_2 ··· H_n. This takes 2mn^2 − (2/3)n^3 + O(mn) flops.
Givens rotations require 50% more flops, but are preferable for sparse A.
Gram-Schmidt produces a skinny/reduced QR-factorization A = Q_1 R_1, where Q_1 ∈ R^{m×n} has orthonormal columns.
Left looking:
1: for k = 1 : n do
2:   q_k = a_k
3:   for j = 1 : k − 1 do
4:     R(j, k) = q_j^T a_k
5:     q_k = q_k − R(j, k) q_j
6:   end for
7:   R(k, k) = ||q_k||_2
8:   q_k = q_k / R(k, k)
9: end for
Right looking:
1: Q = A
2: for k = 1 : n do
3:   R(k, k) = ||q_k||_2
4:   q_k = q_k / R(k, k)
5:   for j = k + 1 : n do
6:     R(k, j) = q_k^T q_j
7:     q_j = q_j − R(k, j) q_k
8:   end for
9: end for
In left looking, replace line 4 with R(j, k) = q_j^T q_k (project against the partially updated q_k) to get modified Gram-Schmidt, which is backwards stable.
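A compact numpy version of the explicit Householder sweep above (a sketch; accumulating Q as a dense matrix is wasteful but mirrors the pseudocode):

    import numpy as np

    def householder_qr(A):
        """Return Q (m x m) and R (m x n) with A = Q R via Householder reflections."""
        R = A.astype(float)
        m, n = R.shape
        Q = np.eye(m)
        for k in range(n):
            x = R[k:, k]
            v = x.copy()
            sgn = 1.0 if x[0] >= 0 else -1.0
            v[0] += sgn * np.linalg.norm(x)      # sign chosen to avoid cancellation
            nv = np.linalg.norm(v)
            if nv == 0.0:
                continue
            v /= nv
            R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])   # apply H_k on the left
            Q[:, k:]  -= 2.0 * np.outer(Q[:, k:] @ v, v)    # accumulate Q = H_1 H_2 ... H_n
        return Q, R

    A = np.random.rand(7, 4)
    Q, R = householder_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(7)))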
Linear Least Squares
Suppose we have points (u_1, v_1), ..., (u_5, v_5) through which we want to fit a quadratic curve au^2 + bu + c. We want to solve

    [ u_1^2  u_1  1 ] [ a ]   [ v_1 ]
    [  ...   ...  . ] [ b ] = [ ... ]
    [ u_5^2  u_5  1 ] [ c ]   [ v_5 ]

This is overdetermined, so an exact solution is out. Instead, find the least squares solution x that minimizes ||Ax − b||_2.
For the method of normal equations, solve A^T Ax = A^T b for x using Cholesky factorization. This takes mn^2 + n^3/3 + O(mn) flops. It is conditionally but not backwards stable: forming A^T A squares the condition number.
Alternatively, factor A = QR. Let c = [ c_1 ; c_2 ] = Q^T b. The least squares solution is x = R_1^{-1} c_1.
If rank(A) = r with r < n (rank deficient), factor A = UΣV^T, let y = V^T x and c = U^T b. Then min ||Ax − b||_2^2 = min Σ_{i=1..r} (σ_i y_i − c_i)^2 + Σ_{i=r+1..m} c_i^2, so y_i = c_i / σ_i. For i = r + 1 : n, y_i is arbitrary.
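A small numpy sketch of both solution routes for the quadratic fit above (the data points are made up):

    import numpy as np

    u = np.array([0.0, 0.5, 1.0, 1.5, 2.0])        # hypothetical data
    v = np.array([1.0, 1.8, 3.1, 5.2, 7.9])
    A = np.column_stack([u**2, u, np.ones_like(u)])

    # normal equations: A^T A x = A^T b, solved via Cholesky
    G = np.linalg.cholesky(A.T @ A)
    x_ne = np.linalg.solve(G.T, np.linalg.solve(G, A.T @ v))

    # QR: x = R_1^{-1} Q_1^T b
    Q1, R1 = np.linalg.qr(A)                        # reduced factorization
    x_qr = np.linalg.solve(R1, Q1.T @ v)

    print(x_ne, x_qr)                               # agree to about machine precision here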
Singular Value Decomposition
For any A ∈ R^{m×n}, we can express A = UΣV^T such that U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal and Σ = diag(σ_1, ..., σ_p) ∈ R^{m×n}, where p = min(m, n) and σ_1 ≥ σ_2 ≥ ··· ≥ σ_p ≥ 0. The σ_i are the singular values.
1. Matrix 2-norm: ||A||_2 = σ_1.
2. The condition number κ_2(A) = ||A||_2 ||A^{-1}||_2 = σ_1 / σ_n, or the rectangular condition number κ_2(A) = σ_1 / σ_{min(m,n)}. Note that κ_2(A^T A) = κ_2(A)^2.
3. For a rank-k approximation to A, let Σ_k = diag(σ_1, ..., σ_k, 0, ..., 0). Then A_k = UΣ_k V^T. rank(A_k) ≤ k, and rank(A_k) = k iff σ_k > 0. Among matrices of rank k or lower, A_k minimizes ||A − A_k||_2 = σ_{k+1}.
4. Rank determination: rank(A) = r equals the number of nonzero σ, or in machine arithmetic, perhaps the number of σ ≥ ε_mach σ_1.
Writing A = UΣV^T = [ U_1 U_2 ] [ Σ(1:r, 1:r) 0 ; 0 0 ] [ V_1^T ; V_2^T ], see that range(U_1) = range(A). The SVD gives an orthonormal basis for the range and nullspace of A and A^T.
Compute the SVD by using shifted QR on A^T A.

Information Retrieval & LSI
In the bag of words model, w_d ∈ R^m, where w_d(i) is the (perhaps weighted) frequency of term i in document d. The corpus matrix is A = [w_1, ..., w_n] ∈ R^{m×n}. For a query q ∈ R^m, rank documents according to the score q^T w_d / ||w_d||_2.
In latent semantic indexing, you do the same, but in a k-dimensional subspace. Factor A = UΣV^T, then define A* = Σ_{1:k,1:k} V_{:,1:k}^T ∈ R^{k×n}. Each w*_d = A*_{:,d} = U_{:,1:k}^T w_d, and q* = U_{:,1:k}^T q.
In the Ando-Lee analysis, for a corpus with k topics, for t ∈ 1:k and d ∈ 1:n, let R_{t,d} ≥ 0 be document d's relevance to topic t, with ||R_{:,d}||_2 = 1. True document similarity is R^T R, where entry (i, j) is the relevance of i to j. Using LSI, if A contains information about R^T R, then (A*)^T A* will approximate R^T R well. LSI depends on an even distribution of topics, where the distribution is ρ = max_t ||R_{t,:}||_2 / min_t ||R_{t,:}||_2. Great when ρ is near 1, but if ρ ≫ 1, LSI does worse.
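A quick numpy check of the rank-k approximation property above (a sketch on an arbitrary random matrix):

    import numpy as np

    A = np.random.rand(8, 6)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    k = 2
    Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation

    # ||A - A_k||_2 equals sigma_{k+1} (0-indexed: s[k])
    print(np.linalg.norm(A - Ak, 2), s[k])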
Floating Point
Six sources of error in scientific computing: modeling errors, measurement or data errors, blunders, discretization or truncation errors, convergence tolerance, and rounding errors.
A floating point number is ±d_1.d_2 d_3 ··· d_t × β^e, with sign, mantissa, base, and exponent. For single precision t = 24, e ∈ {−126, ..., 127}; for double precision t = 53, e ∈ {−1022, ..., 1023}.
The relative error in x̂ approximating x is |x̂ − x| / |x|.
Unit roundoff or machine epsilon is ε_mach = β^{−t+1}. Basic operations have relative error bounded by ε_mach.
E.g., consider z = x − y with inputs x, y. This program has three roundoff errors: ẑ = ((1 + δ_1)x − (1 + δ_2)y)(1 + δ_3), where δ_1, δ_2, δ_3 ∈ [−ε_mach, ε_mach]. Then
|z − ẑ| / |z| = |(δ_1 + δ_3)x − (δ_2 + δ_3)y + O(ε_mach^2)| / |x − y|
The bad case is δ_1 = ε_mach, δ_2 = −ε_mach, δ_3 = 0, giving |z − ẑ| / |z| = ε_mach |x + y| / |x − y|.
The inaccuracy when |x + y| ≫ |x − y| is called catastrophic cancellation.

Conditioning & Backwards Stability
A problem instance is ill conditioned if the solution is sensitive to perturbations of the data. For example, sin 1 is well conditioned, but sin 12392193 is ill conditioned.
Suppose we perturb Ax = b to (A + E)x̂ = b + e where ||E||/||A|| ≤ δ and ||e||/||b|| ≤ δ. Then ||x̂ − x|| / ||x|| ≤ 2δκ(A) + O(δ^2), where κ(A) = ||A|| ||A^{-1}|| is the condition number of A.
1. For all A ∈ R^{n×n}, κ(A) ≥ 1.
2. κ(I) = 1.
3. For γ ≠ 0, κ(γA) = κ(A).
4. For diagonal D and all p, ||D||_p = max_{i=1..n} |d_{ii}|. So κ(D) = max_{i=1..n} |d_{ii}| / min_{i=1..n} |d_{ii}|.
If κ(A) ≥ 1/ε_mach, A may as well be singular.
An algorithm is backwards stable if in the presence of roundoff error it returns the exact solution to a nearby problem instance.
GEPP solves Ax = b by returning x̂ where (A + E)x̂ = b. It is backwards stable if ||E||_∞ / ||A||_∞ ≤ O(ε_mach). With GEPP, ||E||_∞ / ||A||_∞ ≤ c_n ε_mach + O(ε_mach^2), where c_n is worst case exponential in n, but in practice almost always a low order polynomial.
Combining the stability and conditioning analyses yields ||x̂ − x|| / ||x|| ≤ c_n κ(A) ε_mach + O(ε_mach^2).
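A two-line illustration of catastrophic cancellation (a sketch; 1 − cos x for tiny x is the classic textbook case, not an example from this sheet):

    import numpy as np

    x = 1e-8
    naive  = 1.0 - np.cos(x)         # cos(x) rounds to 1.0, so every digit cancels: 0.0
    stable = 2.0 * np.sin(x / 2)**2  # algebraically equal, ~5e-17, accurate to full precision
    print(naive, stable, np.finfo(float).eps)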
Solving Ax = b
GE produces a factorization A = LU; GEPP produces PA = LU.
Plain GE:
1: for k = 1 to n − 1 do
2:   if a_kk = 0 then stop
3:   l_{k+1:n,k} = a_{k+1:n,k} / a_kk
4:   a_{k+1:n,k:n} = a_{k+1:n,k:n} − l_{k+1:n,k} a_{k,k:n}
5: end for
GEPP:
1: for k = 1 to n − 1 do
2:   γ = argmax_{i ∈ {k,...,n}} |a_{ik}|
3:   a_{[γ,k],k:n} = a_{[k,γ],k:n}           (swap rows k and γ of A)
4:   l_{[γ,k],1:k−1} = l_{[k,γ],1:k−1}       (swap the same rows of L)
5:   p_k = γ
6:   l_{k+1:n,k} = a_{k+1:n,k} / a_kk
7:   a_{k+1:n,k:n} = a_{k+1:n,k:n} − l_{k+1:n,k} a_{k,k:n}
8: end for
To solve Ax = b, factor A = LU (or PA = LU), solve Lw = b (or Lw = b̂ where b̂ = Pb) for w using forward substitution, then solve Ux = w for x using backward substitution. The complexity of GE and GEPP is (2/3)n^3 + O(n^2) flops.
Backward Substitution:
1: x = zeros(n, 1)
2: for j = n to 1 do
3:   x_j = (w_j − u_{j,j+1:n} x_{j+1:n}) / u_{j,j}
4: end for
GEPP encounters an exact 0 pivot iff A is singular.
For banded A, L + U has the same bandwidths as A.

Positive Definite, A = LDL^T
A ∈ R^{n×n} is positive definite (PD) (or semidefinite (PSD)) if x^T Ax > 0 (or x^T Ax ≥ 0) for all x ≠ 0.
When LU-factorizing symmetric A, the result is A = LDL^T; L is unit lower triangular, D is diagonal. A is SPD iff D has all positive entries. The Cholesky factorization is A = LDL^T = LD^{1/2} D^{1/2} L^T = GG^T. It can be computed directly in n^3/3 + O(n^2) flops. If G has an all positive diagonal, A is SPD.
To solve Ax = b for SPD A, factor A = GG^T, solve Gw = b by forward substitution, then solve G^T x = w by backward substitution, which takes n^3/3 + O(n^2) flops in total.
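A numpy sketch of the SPD solve: Cholesky plus the two triangular substitutions written out by hand (the small matrix and right-hand side are made up):

    import numpy as np

    def forward_sub(L, b):
        """Solve L w = b for lower triangular L."""
        n = len(b)
        w = np.zeros(n)
        for i in range(n):
            w[i] = (b[i] - L[i, :i] @ w[:i]) / L[i, i]
        return w

    def backward_sub(U, w):
        """Solve U x = w for upper triangular U."""
        n = len(w)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            x[i] = (w[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    A = np.array([[4.0, 2.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]])  # SPD
    b = np.array([2.0, 1.0, 4.0])
    G = np.linalg.cholesky(A)                   # lower triangular, A = G G^T
    x = backward_sub(G.T, forward_sub(G, b))
    print(np.allclose(A @ x, b))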

Complex Numbers
Complex numbers are written z = x + iy ∈ C for i = sqrt(−1). The real part is x = Re(z); the imaginary part is y = Im(z). The conjugate of z is z̄ = x − iy. The absolute value of z is |z| = sqrt(x^2 + y^2). Conjugation commutes with products: conj(A)conj(x) = conj(Ax) and conj(A)conj(B) = conj(AB). The conjugate transpose of A is A^H = conj(A)^T. A ∈ C^{n×n} is Hermitian or self-adjoint if A = A^H. If Q^H Q = I, Q is unitary.

Eigenvalues & Eigenvectors
For A ∈ C^{n×n}, if Ax = λx where x ≠ 0, then x is an eigenvector of A and λ is the corresponding eigenvalue.
Remember, A − λI is singular iff det(A − λI) = 0. With λ as a variable, det(A − λI) is A's characteristic polynomial.
For nonsingular T ∈ C^{n×n}, T^{-1}AT (a similarity transformation) is similar to A. Similar matrices have the same characteristic polynomial and hence the same eigenvalues (though probably different eigenvectors). This relationship is reflexive, transitive, and symmetric.
A is diagonalizable if A is similar to a diagonal matrix D = T^{-1}AT. A's eigenvalues are D's diagonals, and the eigenvectors are the columns of T since AT_{:,i} = D_{i,i} T_{:,i}. A is diagonalizable iff it has n linearly independent eigenvectors.
For symmetric A ∈ R^{n×n}, A is diagonalizable, has all real eigenvalues, and the eigenvectors may be chosen as the columns of an orthogonal matrix Q. A = QDQ^T is the eigendecomposition of A. Further, for symmetric A:
1. The singular values are the absolute values of the eigenvalues.
2. A is SPD (or SPSD) iff its eigenvalues are > 0 (or ≥ 0).
3. For SPD A, the singular values equal the eigenvalues.
4. For B ∈ R^{m×n}, m ≥ n, the singular values of B are the square roots of B^T B's eigenvalues.
For any A ∈ C^{n×n}, the Schur form of A is A = QTQ^H with unitary Q ∈ C^{n×n} and upper triangular T ∈ C^{n×n}.
In this sheet I denote λ_|max| = max_{λ ∈ {λ_1,...,λ_n}} |λ|.
For B ∈ C^{n×n}, lim_{k→∞} B^k = 0 if λ_|max|(B) < 1.

Power Methods for Eigenvalues
x^(k+1) = Ax^(k) converges to the eigenvector belonging to λ_|max|(A).
Once you find an eigenvector x, find the associated eigenvalue λ through the Rayleigh quotient λ = x^(k)T A x^(k) / x^(k)T x^(k).
The inverse shifted power method is x^(k+1) = (A − σI)^{-1} x^(k). If A has eigenpairs (λ_1, u_1), ..., (λ_n, u_n), then (A − σI)^{-1} has eigenpairs (1/(λ_1 − σ), u_1), ..., (1/(λ_n − σ), u_n).
Factor A = QHQ^T where H is upper Hessenberg. To do so, find successive Householder reflections H_1, H_2, ... that zero out rows 3 and lower of column 1, rows 4 and lower of column 2, etc. Then Q = H_1 ··· H_{n−2}.
Shifted QR iteration (each A^(k) is similar to A by an orthogonal transformation):
1: A^(0) = A
2: for k = 0, 1, 2, ... do
3:   Set A^(k) − σ_k I = Q^(k) R^(k)
4:   A^(k+1) = R^(k) Q^(k) + σ_k I
5: end for
Perhaps choose σ_k as eigenvalues of submatrices of A^(k).
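A minimal power-iteration sketch in numpy, with the Rayleigh quotient recovering the eigenvalue (the symmetric test matrix is chosen arbitrarily):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
    x = np.random.rand(3)
    for _ in range(200):
        x = A @ x
        x /= np.linalg.norm(x)          # normalize to avoid overflow
    lam = x @ A @ x / (x @ x)           # Rayleigh quotient
    print(lam, np.max(np.abs(np.linalg.eigvalsh(A))))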

Arnoldi and Lanczos
Given A ∈ R^{n×n} and a unit length q_1 ∈ R^n, output Q, H such that A = QHQ^T. Use Lanczos for symmetric A.
Arnoldi:
1: for k = 1 : n − 1 do
2:   q̃_{k+1} = A q_k
3:   for l = 1 : k do
4:     H(l, k) = q_l^T q̃_{k+1}
5:     q̃_{k+1} = q̃_{k+1} − H(l, k) q_l
6:   end for
7:   H(k+1, k) = ||q̃_{k+1}||_2
8:   q_{k+1} = q̃_{k+1} / H(k+1, k)
9: end for
Lanczos:
1: β_0 = ||w_0||_2
2: for k = 1, 2, ... do
3:   q_k = w_{k−1} / β_{k−1}
4:   u_k = A q_k
5:   v_k = u_k − β_{k−1} q_{k−1}
6:   α_k = q_k^T v_k
7:   w_k = v_k − α_k q_k
8:   β_k = ||w_k||_2
9: end for
For Lanczos, the α_k and β_k are the diagonal and subdiagonal entries of the Hermitian tridiagonal T_k; in Arnoldi we get H instead. After very few iterations of either method, the eigenvalues of T_k and H will be excellent approximations to the "extreme" eigenvalues of A.
For k iterations, Arnoldi is O(nk^2) time and O(nk) space; Lanczos is O(nk) + k·M time (M is the time for a matrix-vector multiplication) and O(nk) space, or O(n + k) space if old q_k's are discarded.

Iterative Methods for Ax = b
Useful for sparse A where GE would cause fill-in.
In the splitting method, write A = M − N where Mv = c is easily solvable. Then iterate x^(k+1) = M^{-1}(Nx^(k) + b). If it converges, the limit point x* is a solution to Ax = b.
The error is e^(k) = (M^{-1}N)^k e^(0), so splitting methods converge if λ_|max|(M^{-1}N) < 1.
In the Jacobi method, take M to be the diagonal of A. This will fail if A has any zero diagonal entries.

Conjugate Gradient
Conjugate gradient iteratively solves Ax = b for SPD A. It is derived from Lanczos and takes advantage of the fact that if A is SPD then T is SPD. It produces the exact solution after n iterations. Time per iteration is O(n) + M.
1: x^(0) = arbitrary (0 is okay)
2: r_0 = b − Ax^(0)
3: p_0 = r_0
4: for k = 0, 1, 2, ... do
5:   α_k = (r_k^T r_k) / (p_k^T A p_k)
6:   x^(k+1) = x^(k) + α_k p_k
7:   r_{k+1} = r_k − α_k A p_k
8:   β_{k+1} = (r_{k+1}^T r_{k+1}) / (r_k^T r_k)
9:   p_{k+1} = r_{k+1} + β_{k+1} p_k
10: end for
The error is reduced by a factor of (sqrt(κ(A)) − 1) / (sqrt(κ(A)) + 1) per iteration; thus for κ(A) = 1, CG converges after one iteration. To speed up CG, use a preconditioner M such that κ(MA) ≪ κ(A) and solve MAx = Mb instead.
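The CG loop above translates almost line for line into numpy (a sketch: no preconditioner, a fixed iteration count, and an arbitrary SPD test matrix):

    import numpy as np

    def cg(A, b, iters=50):
        """Plain conjugate gradient for SPD A, following the pseudocode above."""
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(iters):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    M = np.random.rand(6, 6)
    A = M @ M.T + 6 * np.eye(6)          # an SPD test matrix
    b = np.random.rand(6)
    print(np.linalg.norm(A @ cg(A, b) - b))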

Multivariate Calculus
Provided f : R^n → R, the gradient is ∇f = (∂f/∂x_1, ..., ∂f/∂x_n)^T and the Hessian ∇^2 f is the n×n matrix with (i, j) entry ∂^2 f / (∂x_i ∂x_j). If f is C^2 (all second partials are continuous), ∇^2 f is symmetric.
The Taylor expansion for f is f(x + h) = f(x) + h^T ∇f(x) + (1/2) h^T ∇^2 f(x) h + O(||h||^3).
Provided f : R^n → R^m, the Jacobian is the m×n matrix ∇f with (i, j) entry ∂f_i/∂x_j. f's Taylor expansion is f(x + h) = f(x) + ∇f(x) h + O(||h||^2).

Nonlinear Solving
Given f : R^n → R^m, we want x such that f(x) = 0.
In fixed point iteration, we choose g : R^n → R^n and iterate x^(k+1) = g(x^(k)). If it converges to x*, then g(x*) − x* = 0. By g(x^(k)) = g(x*) + ∇g(x*)(x^(k) − x*) + O(||x^(k) − x*||^2), for small e^(k) = x^(k) − x* we ignore the last term. If ∇g(x*) has λ_|max| < 1, then x^(k) → x* with ||e^(k)|| ≤ c^k ||e^(0)|| for large k, where c = λ_|max| + ε and ε is the influence of the ignored last term. This indicates a linear rate of convergence.
Suppose ∇g(x*) = QTQ^H with T non-normal, i.e., T's superdiagonal portion is large relative to the diagonal. Then the iteration may not converge, as (∇g(x*))^k initially grows!
In Newton's method, x^(k+1) = x^(k) − (∇f(x^(k)))^{-1} f(x^(k)). This converges quadratically, i.e., ||e^(k+1)|| ≤ c ||e^(k)||^2.
Automatic differentiation takes advantage of the notion that a computer program is nothing but arithmetic operations, and one can apply the chain rule to get the derivative. This may be used to compute Jacobians and Hessians.

Optimization
In continuous optimization, we solve
    min f(x)  s.t.  g(x) = 0,  h(x) ≥ 0
where f : R^n → R is the objective function, g : R^n → R^m holds the equality constraints, and h : R^n → R^p holds the inequality constraints. We did unconstrained optimization, min f(x), in the course.
A ball is a set B(x, r) = {y ∈ R^n : ||x − y|| < r}.
A local minimizer x* is the best in a region, i.e., there exists r > 0 such that f(x*) ≤ f(x) for all x ∈ B(x*, r). A global minimizer is the best local minimizer.
Assume f is C^2. If x* is a local minimizer, then ∇f(x*) = 0 and ∇^2 f(x*) is PSD. Semi-conversely, if ∇f(x*) = 0 and ∇^2 f(x*) is PD, then x* is a local minimizer.

Steepest Descent
Go where the function (locally) decreases most rapidly via x^(k+1) = x^(k) − α_k ∇f(x^(k)); α_k comes from a line search, explained below. SD is stateless: it depends only on the current point. Too slow.

Newton's Method for Unconstrained Min.
Iterate by x^(k+1) = x^(k) − (∇^2 f(x^(k)))^{-1} ∇f(x^(k)), derived by solving for where ∇f(x*) = 0. If ∇^2 f(x^(k)) is PD and ∇f(x^(k)) ≠ 0, the step is a descent direction.
What if the Hessian isn't PD? Use (a) a secant method, (b) a direction of negative curvature h where h^T ∇^2 f(x^(k)) h < 0, stepping along h or −h (doesn't work well in practice), (c) the trust region idea, h = −(∇^2 f(x^(k)) + tI)^{-1} ∇f(x^(k)) (an interpolation of NMUM and SD), or (d) factor ∇^2 f(x^(k)) by Cholesky; when the check for PD detects a nonpositive pivot, modify that diagonal of ∇^2 f(x^(k)) and keep going (unjustified by theory, but works in practice).

Line Search
Line search, given x^(k) and a step h (perhaps derived from SD or NMUM), finds an α > 0 for x^(k+1) = x^(k) + αh.
In exact line search, optimize min_α f(x^(k) + αh). Frowned upon because it's computationally expensive.
In Armijo or backtracking line search, initialize α; while f(x^(k) + αh) > f(x^(k)) + 0.1 α ∇f(x^(k))^T h, halve α.
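A sketch combining Newton's method for unconstrained minimization with the Armijo rule above, on a made-up smooth objective (the 0.1 and halving constants follow the text):

    import numpy as np

    def f(x):                     # hypothetical objective: quadratic plus a quartic bump
        return 0.5 * x @ x + 0.25 * (x[0] - 1.0)**4

    def grad(x):
        g = x.copy()
        g[0] += (x[0] - 1.0)**3
        return g

    def hess(x):
        H = np.eye(2)
        H[0, 0] += 3.0 * (x[0] - 1.0)**2
        return H

    x = np.array([3.0, -2.0])
    for _ in range(20):
        g = grad(x)
        h = -np.linalg.solve(hess(x), g)                    # Newton step
        alpha = 1.0
        while f(x + alpha * h) > f(x) + 0.1 * alpha * g @ h:
            alpha *= 0.5                                    # Armijo backtracking
        x = x + alpha * h
    print(x, np.linalg.norm(grad(x)))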

Secant or quasi-Newton methods use an approximate, always PD, ∇^2 f. In Broyden-Fletcher-Goldfarb-Shanno (BFGS):
1: B_0 = initial approximate Hessian {OK to use I.}
2: for k = 0, 1, 2, ... do
3:   s_k = −B_k^{-1} ∇f(x^(k))
4:   x^(k+1) = x^(k) + α_k s_k   {Use a special line search for α_k!}
5:   y_k = ∇f(x^(k+1)) − ∇f(x^(k))
6:   B_{k+1} = B_k + (y_k y_k^T) / (α_k y_k^T s_k) − (B_k s_k s_k^T B_k) / (s_k^T B_k s_k)
7: end for
By maintaining B_k in factored form, one can iterate in O(n^2) flops. B_k stays SPD provided s_k^T y_k > 0 (use the line search to increase α_k if needed). The secant condition α_k B_{k+1} s_k = y_k holds. If BFGS converges, it converges superlinearly.

Non-linear Least Squares
For g : R^n → R^m, m ≥ n, we want the x minimizing ||g(x)||_2.
In the Gauss-Newton method, x^(k+1) = x^(k) − h where h = (∇g(x)^T ∇g(x))^{-1} ∇g(x)^T g(x). Note that h is the solution to the linear least squares problem min ||∇g(x^(k)) h − g(x^(k))||_2. GN is derived by applying NMUM to g(x)^T g(x) and dropping a resulting tensor term (the derivative of the Jacobian). You keep the quadratic convergence when g(x*) = 0, since the tensor term → 0 as k → ∞.

Ordinary Differential Equations
An ODE (or PDE) has one (or multiple) independent variables. In initial value problems, given dy/dt = f(y, t), y(t) ∈ R^n, and y(0) = y_0, we want y(t) for t > 0. Examples include:
1. Exponential growth/decay, dy/dt = ay, with closed form y(t) = y_0 e^{at}. Growth if a > 0, decay if a < 0.
2. Ecological models, dy_i/dt = f_i(y_1, ..., y_n, t) for species i = 1, ..., n; y_i is a population size, and f_i encodes the species relationships.
3. Mechanics, e.g. wall-spring-block models with F = ma (a = d^2x/dt^2) and F = −kx, so d^2x/dt^2 = −kx/m. This yields d[x, v]/dt = [v, −kx/m] with y_0 the initial position and velocity.
For stability of an ODE, let dy/dt = Ay for A ∈ C^{n×n}. The stable, neutrally stable, or unstable case is where max_i Re(λ_i(A)) < 0, = 0, or > 0 respectively.
In finite difference methods, approximate y(t) by discrete points y_0 (given), y_1, y_2, ... so that y_k ≈ y(t_k) for increasing t_k.
For many IVPs and FDMs, if the local truncation error (the error at each step) is O(h^{p+1}), the global truncation error (the error overall) is O(h^p). Call p the order of accuracy. To find p, substitute the exact solution into the FDM formula, insert a remainder term +R on the RHS, use a Taylor series expansion, solve for R, and keep only the leading term.
In Euler's method, approximate y' by the finite difference (y_{k+1} − y_k)/h_k to get y_{k+1} = y_k + f(y_k, t_k) h_k, where h_k = t_{k+1} − t_k is the step size. p = 1, very low. Explicit!
A stiff problem has widely ranging time scales in the solution, e.g., a transient initial velocity that in the true solution disappears immediately, chemical reaction rates that vary over temperature, or transients in electrical circuits. An explicit method requires h_k to be on the smallest scale!
Backward Euler has y_{k+1} = y_k + h f(y_{k+1}, t_{k+1}). BE is implicit (y_{k+1} appears on the RHS). If the original problem is stable, any h will work!

Miscellaneous
Σ_{k=1..n} k^p = n^{p+1}/(p+1) + O(n^p).
For ax^2 + bx + c = 0, the roots are r_1, r_2 = (−b ± sqrt(b^2 − 4ac)) / (2a), and r_1 r_2 = c/a.
Exact arithmetic is slow and futile for inexact observations; numerical analysis relies on approximate arithmetic.
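Returning to the ODE section above, a tiny sketch contrasting forward and backward Euler on the stiff scalar decay problem dy/dt = ay with a = −1000 (numbers made up; for this linear problem the backward Euler update has a closed form):

    import numpy as np

    a, h, steps = -1000.0, 0.01, 10     # step far above the problem's fast time scale
    y_fe, y_be = 1.0, 1.0
    for _ in range(steps):
        y_fe = y_fe + h * a * y_fe      # forward Euler: factor 1 + h*a = -9, blows up
        y_be = y_be / (1.0 - h * a)     # backward Euler: solve y_new = y + h*a*y_new
    print(y_fe, y_be)                   # ~3.5e9 vs ~4e-11; true value e^{-100} is ~0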