Linear Algebra Gaussian Elimination Norms Numerical Stability

Thomas Finley, [email protected]

Linear Algebra

A subspace is a set $S \subseteq \mathbb{R}^n$ such that $0 \in S$ and $\forall x, y \in S,\ \alpha, \beta \in \mathbb{R}:\ \alpha x + \beta y \in S$.

The span of $\{v_1, \ldots, v_k\}$ is the set of all vectors in $\mathbb{R}^n$ that are linear combinations of $v_1, \ldots, v_k$.

A basis $B$ of subspace $S$, $B = \{v_1, \ldots, v_k\} \subset S$, has $\mathrm{Span}(B) = S$ and all $v_i$ linearly independent.

The dimension of $S$ is $|B|$ for a basis $B$ of $S$.

For subspaces $S, T$ with $S \subseteq T$, $\dim(S) \le \dim(T)$, and further if $\dim(S) = \dim(T)$, then $S = T$.

A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^m$ has $\forall x, y \in \mathbb{R}^n,\ \alpha, \beta \in \mathbb{R}:\ T(\alpha x + \beta y) = \alpha T(x) + \beta T(y)$. Further, $\exists A \in \mathbb{R}^{m \times n}$ such that $\forall x:\ T(x) \equiv Ax$.

For two linear transformations $T : \mathbb{R}^n \to \mathbb{R}^m$, $S : \mathbb{R}^m \to \mathbb{R}^p$, the composition $S \circ T \equiv S(T(x))$ is a linear transformation. $(T(x) \equiv Ax) \wedge (S(y) \equiv By) \Rightarrow (S \circ T)(x) \equiv BAx$.

The matrix's row space is the span of its rows, its column space or range is the span of its columns, and its rank is the dimension of either of these spaces.

For $A \in \mathbb{R}^{m \times n}$, $\mathrm{rank}(A) \le \min(m, n)$. $A$ has full row (or column) rank if $\mathrm{rank}(A) = m$ (or $n$).

A diagonal matrix $D \in \mathbb{R}^{n \times n}$ has $d_{j,k} = 0$ for $j \ne k$. The diagonal identity matrix $I$ has $i_{j,j} = 1$.

The upper (or lower) bandwidth of $A$ is $\max |i - j|$ among $i, j$ where $i \le j$ (or $i \ge j$) such that $A_{i,j} \ne 0$.

A matrix with lower bandwidth 1 is upper Hessenberg.

For $A, B \in \mathbb{R}^{n \times n}$, $B$ is $A$'s inverse if $AB = BA = I$. If such a $B$ exists, $A$ is invertible or nonsingular. $B = A^{-1}$.

The inverse of $A$ is $A^{-1} = [x_1, \cdots, x_n]$ where $Ax_i = e_i$.

For $A \in \mathbb{R}^{n \times n}$ the following are equivalent: $A$ is nonsingular, $\mathrm{rank}(A) = n$, $Ax = b$ has a solution $x$ for any $b$, if $Ax = 0$ then $x = 0$.

The nullspace of $A \in \mathbb{R}^{m \times n}$ is $\{x \in \mathbb{R}^n : Ax = 0\}$.

For $A \in \mathbb{R}^{m \times n}$, $\mathrm{Range}(A)$ and $\mathrm{Nullspace}(A^T)$ are orthogonal complements, i.e., $x \in \mathrm{Range}(A),\ y \in \mathrm{Nullspace}(A^T) \Rightarrow x^T y = 0$, and for all $p \in \mathbb{R}^m$, $p = x + y$ for unique $x$ and $y$.

For a permutation matrix $P \in \mathbb{R}^{n \times n}$, $PA$ permutes the rows of $A$, $AP$ the columns of $A$. $P^{-1} = P^T$.

Gaussian Elimination

GE produces a factorization $A = LU$, GEPP $PA = LU$.

Norms

A vector norm function $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$ satisfies:
1. $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$.
2. $\|\gamma x\| = |\gamma| \cdot \|x\|$ for all $\gamma \in \mathbb{R}$, and all $x \in \mathbb{R}^n$.
3. $\|x + y\| \le \|x\| + \|y\|$, for all $x, y \in \mathbb{R}^n$.

Common norms include:
1. $\|x\|_1 = |x_1| + |x_2| + \cdots + |x_n|$
2. $\|x\|_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$
3. $\|x\|_\infty = \lim_{p \to \infty} \left(|x_1|^p + \cdots + |x_n|^p\right)^{1/p} = \max_{i=1..n} |x_i|$

An induced matrix norm is $\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}$. It satisfies the three properties of norms.

$\forall x \in \mathbb{R}^n,\ A \in \mathbb{R}^{m \times n}:\ \|Ax\| \le \|A\| \, \|x\|$.

$\|AB\| \le \|A\| \, \|B\|$, called submultiplicativity.

$a^T b \le \|a\|_2 \|b\|_2$, called the Cauchy-Schwarz inequality.

1. $\|A\|_\infty = \max_{i=1,\ldots,m} \sum_{j=1}^{n} |a_{i,j}|$ (max row sum).
2. $\|A\|_1 = \max_{j=1,\ldots,n} \sum_{i=1}^{m} |a_{i,j}|$ (max column sum).
3. $\|A\|_2$ is hard: it takes $O(n^3)$, not $O(n^2)$ operations.
4. $\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} a_{i,j}^2}$. $\|\cdot\|_F$ often replaces $\|\cdot\|_2$.

Numerical Stability

Six sources of error in scientific computing: modeling errors, measurement or data errors, blunders, discretization or truncation errors, convergence tolerance, and rounding errors.

A floating point number has the form $\pm\, d_1.d_2 d_3 \cdots d_t \times \beta^e$ (sign, mantissa, base, exponent). For single and double:
single: $t = 24$, $e \in \{-126, \ldots, 127\}$
double: $t = 53$, $e \in \{-1022, \ldots, 1023\}$

The relative error in $\hat{x}$ approximating $x$ is $\frac{|\hat{x} - x|}{|x|}$.

Unit roundoff or machine epsilon is $\epsilon_{mach} = \beta^{-t+1}$.

Arithmetic operations have relative error bounded by $\epsilon_{mach}$.

E.g., consider $z = x - y$ with input $x, y$. This program has three roundoff errors. $\hat{z} = ((1 + \delta_1)x - (1 + \delta_2)y)(1 + \delta_3)$, where $\delta_1, \delta_2, \delta_3 \in [-\epsilon_{mach}, \epsilon_{mach}]$.

$\frac{|z - \hat{z}|}{|z|} = \frac{|(\delta_1 + \delta_3)x - (\delta_2 + \delta_3)y + O(\epsilon_{mach}^2)|}{|x - y|}$

The bad case is where $\delta_1 = \epsilon_{mach}$, $\delta_2 = -\epsilon_{mach}$, $\delta_3 = 0$:

$\frac{|z - \hat{z}|}{|z|} = \epsilon_{mach} \frac{|x + y|}{|x - y|}$

The inaccuracy that results when $|x + y| \gg |x - y|$ is called catastrophic cancellation.

Conditioning & Backwards Stability

A problem instance is ill conditioned if the solution is sensitive to perturbations of the data. For example, $\sin 1$ is well conditioned, but $\sin 12392193$ is ill conditioned.

Determinant

The determinant $\det : \mathbb{R}^{n \times n} \to \mathbb{R}$ satisfies:
1. $\det(AB) = \det(A) \det(B)$.
2. $\det(A) = 0$ iff $A$ is singular.
3. $\det(L) = \ell_{1,1} \ell_{2,2} \cdots \ell_{n,n}$ for triangular $L$.
4. $\det(A) = \det(A^T)$.

To compute $\det(A)$ factor $A = PLU$. $\det(P) = (-1)^s$ where $s$ is the number of swaps, $\det(L) = 1$. When computing $\det(U)$ watch out for overflow!

Basic Linear Algebra Subroutines

0. Scalar ops, like $\sqrt{x^2 + y^2}$. $O(1)$ flops, $O(1)$ data.
1. Vector ops, like $y = ax + y$. $O(n)$ flops, $O(n)$ data.
2. Matrix-vector ops, like rank-one update $A = A + xy^T$. $O(n^2)$ flops, $O(n^2)$ data.
3. Matrix-matrix ops, like $C = C + AB$. $O(n^2)$ data, $O(n^3)$ flops.

Use the highest BLAS level possible. Operators are architecture tuned, e.g., data processed in cache-sized blocks.

Orthogonal Matrices

For $Q \in \mathbb{R}^{n \times n}$, these statements are equivalent:
1. $Q^T Q = QQ^T = I$ (i.e., $Q$ is orthogonal).
2. The $\|\cdot\|_2$ for each row and column of $Q$ is 1. The inner product of any row (or column) with another is 0.
3. For all $x \in \mathbb{R}^n$, $\|Qx\|_2 = \|x\|_2$.

A matrix $Q \in \mathbb{R}^{m \times n}$ with $m > n$ has orthonormal columns if the columns are orthonormal, and $Q^T Q = I$.

The product of orthogonal matrices is orthogonal.

For orthogonal $Q$, $\|QA\|_2 = \|A\|_2$ and $\|AQ\|_2 = \|A\|_2$.

QR-factorization

For any $A \in \mathbb{R}^{m \times n}$ with $m \ge n$, we can factor $A = QR$, where $Q \in \mathbb{R}^{m \times m}$ is orthogonal, and $R = [R_1\ 0]^T \in \mathbb{R}^{m \times n}$ is upper triangular. $\mathrm{rank}(A) = n$ iff $R_1$ is invertible.

$Q$'s first $n$ (or last $m - n$) columns form an orthonormal basis for $\mathrm{span}(A)$ (or $\mathrm{nullspace}(A^T)$).

A Householder reflection is $H = I - \frac{2vv^T}{v^T v}$. $H$ is symmetric and orthogonal. Explicit H.H. QR-factorization is:

1: for k = 1 : n do
2:   $v = A(k:m, k) \pm \|A(k:m, k)\|_2 \, e_1$
3:   $A(k:m, k:n) = \left(I - \frac{2vv^T}{v^T v}\right) A(k:m, k:n)$
4: end for

We get $H_n H_{n-1} \cdots H_1 A = R$, so then, $Q = H_1 H_2 \cdots H_n$. This takes $2mn^2 - \frac{2}{3}n^3 + O(mn)$ flops.

Givens requires 50% more flops. Preferable for sparse $A$.

The Gram-Schmidt produces a skinny/reduced QR-factorization $A = Q_1 R_1$, where $Q_1 \in \mathbb{R}^{m \times n}$ has orthonormal columns. The Gram-Schmidt algorithm is:

Left Looking
1: for k = 1 : n do
2:   $q_k = a_k$
3:   for j = 1 : k − 1 do
4:   …

Right Looking
1: Q = A
2: for k = 1 : n do
3:   $R(k, k) = \|q_k\|_2$
4:   …

Linear Least Squares

Suppose we have points $(u_1, v_1), \ldots, (u_5, v_5)$ that we want to fit a quadratic curve $au^2 + bu + c$ through. We want to solve for

$\begin{bmatrix} u_1^2 & u_1 & 1 \\ \vdots & \vdots & \vdots \\ u_5^2 & u_5 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} v_1 \\ \vdots \\ v_5 \end{bmatrix}$

This is overdetermined so an exact solution is out. Instead, find the least squares solution $x$ that minimizes $\|Ax - b\|_2$.

For $A \in \mathbb{R}^{m \times n}$, if $\mathrm{rank}(A) = n$, then $A^T A$ is SPD.

For the method of normal equations, solve for $x$ in $A^T A x = A^T b$ by using Cholesky factorization. This takes $mn^2 + \frac{n^3}{3} + O(mn)$ flops. It is conditionally but not backwards stable: forming $A^T A$ squares the condition number.

Alternatively, factor $A = QR$. Let $c = [c_1\ c_2]^T = Q^T b$. The least squares solution is $x = R_1^{-1} c_1$.

If $\mathrm{rank}(A) = r$ and $r < n$ (rank deficient), factor $A = U \Sigma V^T$, let $y = V^T x$ and $c = U^T b$. Then, $\min \|Ax - b\|_2 = \min \sqrt{\sum_{i=1}^{r} (\sigma_i y_i - c_i)^2 + \sum_{i=r+1}^{m} c_i^2}$, so $y_i = \frac{c_i}{\sigma_i}$. For $i = r + 1 : n$, $y_i$ is arbitrary.

Singular Value Decomposition

For any $A \in \mathbb{R}^{m \times n}$, we can express $A = U \Sigma V^T$ such that $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal, and $\Sigma = \mathrm{diag}(\sigma_1, \cdots, \sigma_p) \in \mathbb{R}^{m \times n}$ where $p = \min(m, n)$ and $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0$. The $\sigma_i$ are singular values.

1. Matrix 2-norm, where $\|A\|_2 = \sigma_1$.
2. The condition number $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2 = \frac{\sigma_1}{\sigma_n}$, or rectangular condition number $\kappa_2(A) = \frac{\sigma_1}{\sigma_{\min(m,n)}}$. Note that $\kappa_2(A^T A) = \kappa_2(A)^2$.
3. For a rank $k$ approximation to $A$, let $\Sigma_k = \mathrm{diag}(\sigma_1, \cdots, \sigma_k, 0^T)$. Then $A_k = U \Sigma_k V^T$. $\mathrm{rank}(A_k) \le k$ and $\mathrm{rank}(A_k) = k$ iff $\sigma_k > 0$. Among rank $k$ or lower …
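The cancellation estimate for z = x − y can be checked numerically. The following is a minimal sketch in plain Python, assuming double precision (β = 2, t = 53); the function names are illustrative and not part of the original sheet. Exact rational arithmetic from the standard `fractions` module serves as the reference value.

```python
import sys
from fractions import Fraction


def machine_epsilon(beta: int, t: int) -> float:
    """Machine epsilon beta**(-t + 1), as defined in the Numerical Stability section."""
    return float(beta) ** (1 - t)


def subtraction_relative_error(x_exact: Fraction, y_exact: Fraction) -> float:
    """Relative error of computing x - y after rounding both inputs to doubles."""
    x_hat = float(x_exact)  # rounding error delta_1
    y_hat = float(y_exact)  # rounding error delta_2
    z_hat = x_hat - y_hat   # rounding error delta_3
    z_exact = x_exact - y_exact
    return float(abs(Fraction(z_hat) - z_exact) / abs(z_exact))


def cancellation_estimate(x_exact: Fraction, y_exact: Fraction, eps: float) -> float:
    """First-order bad-case estimate eps * |x + y| / |x - y| from the text."""
    return eps * float(abs(x_exact + y_exact) / abs(x_exact - y_exact))


eps = machine_epsilon(2, 53)
x = Fraction(1) + Fraction(1, 3 * 10**10)  # not exactly representable as a double
y = Fraction(1)
rel_err = subtraction_relative_error(x, y)
estimate = cancellation_estimate(x, y, eps)
print(f"eps = {eps:.3e}, relative error = {rel_err:.3e}, estimate = {estimate:.3e}")
```

Because x and y nearly coincide, the relative error of the difference is many orders of magnitude larger than the machine epsilon, even though each input is rounded with relative error at most the machine epsilon.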
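The Gram-Schmidt listing in the source is cut off after its first few steps. As a hedged illustration, here is the standard textbook right-looking (modified) Gram-Schmidt procedure in plain Python; it is not necessarily the author's exact listing. The matrix is passed as a list of columns, and the columns are assumed to be linearly independent, so no division by zero occurs.

```python
import math


def gram_schmidt_qr(columns):
    """Reduced QR by right-looking (modified) Gram-Schmidt.

    columns: list of n column vectors of length m (assumed linearly independent).
    Returns (q, r): q is a list of n orthonormal columns, r is an n x n
    upper triangular matrix stored as nested lists, with A = Q1 R1.
    """
    n = len(columns)
    q = [list(map(float, col)) for col in columns]  # start with Q = A
    r = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Normalize column k.
        r[k][k] = math.sqrt(sum(v * v for v in q[k]))
        q[k] = [v / r[k][k] for v in q[k]]
        # Remove the q_k component from every later column.
        for j in range(k + 1, n):
            r[k][j] = sum(a * b for a, b in zip(q[k], q[j]))
            q[j] = [b - r[k][j] * a for a, b in zip(q[k], q[j])]
    return q, r


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
q, r = gram_schmidt_qr(cols)
```

The right-looking variant updates all remaining columns as soon as each q_k is available, which is the usual reason it is preferred over the classical ordering in floating point arithmetic.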
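The method of normal equations for the quadratic fit can be sketched as follows. This is a small plain-Python illustration, assuming the design matrix has full column rank so that A^T A is SPD and the Cholesky factorization exists; the helper names and the sample data are hypothetical.

```python
import math


def cholesky(m):
    """Lower triangular L with L L^T = m, for a symmetric positive definite m."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_cholesky(low, rhs):
    """Solve (L L^T) x = rhs by forward then back substitution."""
    n = len(low)
    w = [0.0] * n
    for i in range(n):
        w[i] = (rhs[i] - sum(low[i][k] * w[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (w[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_quadratic(us, vs):
    """Least squares coefficients (a, b, c) of a*u^2 + b*u + c via A^T A x = A^T b."""
    rows = [[u * u, u, 1.0] for u in us]
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * v for row, v in zip(rows, vs)) for i in range(3)]
    return solve_cholesky(cholesky(ata), atb)


# Hypothetical sample: five points lying exactly on 2u^2 - 3u + 1.
us = [0.0, 1.0, 2.0, 3.0, 4.0]
vs = [2.0 * u * u - 3.0 * u + 1.0 for u in us]
coeffs = fit_quadratic(us, vs)
```

For ill-conditioned A the QR route described in the text is usually the safer choice, since forming A^T A squares the condition number.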
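The determinant recipe (factor with partial pivoting, track the number of row swaps, multiply the diagonal of U) can be illustrated with a short plain-Python sketch. The function name is illustrative, and the code makes no attempt to guard against the overflow that the text warns about; for large matrices one would typically accumulate logarithms of the pivots instead.

```python
def det_gepp(matrix):
    """Determinant via Gaussian elimination with partial pivoting.

    Computes (-1)**s * u_11 * u_22 * ... * u_nn, where s is the number of
    row swaps and U is the upper triangular factor.
    """
    a = [list(map(float, row)) for row in matrix]  # work on a copy
    n = len(a)
    sign = 1.0
    for k in range(n):
        # Pick the row with the largest magnitude entry in column k.
        pivot = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[pivot][k] == 0.0:
            return 0.0  # singular: whole column is zero below the diagonal
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            sign = -sign  # each swap flips the sign of the determinant
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    det = sign
    for k in range(n):
        det *= a[k][k]
    return det
```

For example, `det_gepp([[0, 1], [1, 0]])` needs one swap, so the result carries a factor of −1.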
