
[email protected], Thomas Finley

Linear Algebra
A subspace is a set S ⊆ R^n such that 0 ∈ S and ∀ x, y ∈ S, α, β ∈ R, αx + βy ∈ S.
x ∈ R^n is a linear combination of v_1, ..., v_k if ∃ β_1, ..., β_k ∈ R such that x = β_1 v_1 + ... + β_k v_k.
The span of {v_1, ..., v_k} is the set of all vectors in R^n that are linear combinations of v_1, ..., v_k.
A basis B of subspace S, B = {v_1, ..., v_k} ⊂ S, has Span(B) = S with all v_i linearly independent.
The dimension of S is |B| for a basis B of S. For subspaces S, T with S ⊆ T, dim(S) ≤ dim(T), and further if dim(S) = dim(T), then S = T.
A linear transformation T : R^n → R^m has, ∀ x, y ∈ R^n and α, β ∈ R, T(αx + βy) = αT(x) + βT(y). Further, ∃ A ∈ R^{m×n} such that ∀ x, T(x) ≡ Ax.
For two linear transformations T : R^n → R^m and S : R^m → R^p, S∘T ≡ S(T(x)) is a linear transformation, and (T(x) ≡ Ax) ∧ (S(y) ≡ By) ⇒ (S∘T)(x) ≡ BAx.
A matrix's row space is the span of its rows, its column space or range is the span of its columns, and its rank is the dimension of either of these spaces.
For A ∈ R^{m×n}, rank(A) ≤ min(m, n). A has full row (or column) rank if rank(A) = m (or n).
A diagonal matrix D ∈ R^{n×n} has d_{j,k} = 0 for j ≠ k. The identity matrix I is diagonal with i_{j,j} = 1.
The upper (or lower) bandwidth of A is max |i − j| among i, j with j ≥ i (or j ≤ i) such that A_{i,j} ≠ 0. A matrix with lower bandwidth 1 is upper Hessenberg.
For A, B ∈ R^{n×n}, B is A's inverse if AB = BA = I. If such a B exists, A is invertible or nonsingular, and B = A^{-1}. The inverse of A is A^{-1} = [x_1, ..., x_n] where A x_i = e_i.
For A ∈ R^{n×n} the following are equivalent: A is nonsingular, rank(A) = n, Ax = b is solvable for any b, Ax = 0 iff x = 0.
The inner product of x, y ∈ R^n is x^T y = Σ_{i=1}^n x_i y_i. Vectors x, y are orthogonal if x^T y = 0.
The nullspace or kernel of A ∈ R^{m×n} is {x ∈ R^n : Ax = 0}. For A ∈ R^{m×n}, Range(A) and Nullspace(A^T) are orthogonal.

Norms
1. ||x||_1 = |x_1| + |x_2| + ... + |x_n|
2. ||x||_2 = sqrt(x_1^2 + x_2^2 + ... + x_n^2)
3. ||x||_∞ = lim_{p→∞} (|x_1|^p + ... + |x_n|^p)^{1/p} = max_{i=1..n} |x_i|
An induced matrix norm is ||A|| = sup_{x≠0} ||Ax|| / ||x||. It satisfies the three properties of norms.
∀ x ∈ R^n, A ∈ R^{m×n}: ||Ax|| ≤ ||A|| ||x||. Also ||AB|| ≤ ||A|| ||B||, called submultiplicativity, and |a^T b| ≤ ||a||_2 ||b||_2, the Cauchy-Schwarz inequality.
1. ||A||_∞ = max_{i=1,...,m} Σ_{j=1}^n |a_{i,j}| (max row sum).
2. ||A||_1 = max_{j=1,...,n} Σ_{i=1}^m |a_{i,j}| (max column sum).
3. ||A||_2 is hard: it takes O(n^3), not O(n^2), operations.
4. ||A||_F = sqrt(Σ_{i=1}^m Σ_{j=1}^n a_{i,j}^2). ||·||_F often replaces ||·||_2.

Numerical Stability
Six sources of error in scientific computing: modeling errors, measurement or data errors, blunders, discretization or truncation errors, convergence tolerance, and rounding errors.
A floating point number has the form ±d_1.d_2d_3···d_t × β^e (sign, mantissa of t base-β digits, exponent e). For single precision, t = 24 and e ∈ {−126, ..., 127}; for double, t = 53 and e ∈ {−1022, ..., 1023}.
The relative error in x̂ approximating x is |x̂ − x| / |x|.
Unit roundoff or machine epsilon is ε_mach = β^{−t+1}. Arithmetic operations have relative error bounded by ε_mach.
E.g., consider z = x − y with inputs x, y. This program has three roundoff errors: ẑ = ((1 + δ_1)x − (1 + δ_2)y)(1 + δ_3), where δ_1, δ_2, δ_3 ∈ [−ε_mach, ε_mach], so
  |z − ẑ| / |z| = |(δ_1 + δ_3)x − (δ_2 + δ_3)y + O(ε_mach^2)| / |x − y|.
The bad case is δ_1 = ε_mach, δ_2 = −ε_mach, δ_3 = 0, giving |z − ẑ| / |z| = ε_mach |x + y| / |x − y|. The result is inaccurate when |x + y| ≫ |x − y|, which is called catastrophic cancellation.

Conditioning & Backwards Stability
A problem instance is ill conditioned if the solution is sensitive to perturbations of the data. For example, sin 1 is well conditioned, but sin 12392193 is ill conditioned.
Suppose we perturb Ax = b to (A + E)x̂ = b + e where ||E|| / ||A|| ≤ δ and ||e|| / ||b|| ≤ δ. Then ||x̂ − x|| / ||x|| ≤ 2δ κ(A) + O(δ^2), where κ(A) = ||A|| ||A^{-1}|| is the condition number.
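To make the perturbation bound above concrete, here is a minimal NumPy sketch (not part of the original notes; the 2×2 system and δ = 1e-8 are made-up illustration values). It compares the observed relative error of a perturbed solve with the first-order bound 2δκ(A).

    import numpy as np

    # Perturb Ax = b to (A + E) x_hat = b + e with ||E|| <= delta*||A|| and
    # ||e|| <= delta*||b||, then compare ||x_hat - x|| / ||x|| with 2*delta*kappa(A).
    rng = np.random.default_rng(0)
    A = np.array([[1.0, 1.0],
                  [1.0, 1.0001]])    # nearly singular, hence ill conditioned
    b = np.array([2.0, 2.0001])
    x = np.linalg.solve(A, b)        # exact solution is [1, 1]

    delta = 1e-8
    E = rng.standard_normal((2, 2))
    E *= delta * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)   # ||E||_2 = delta * ||A||_2
    e = rng.standard_normal(2)
    e *= delta * np.linalg.norm(b, 2) / np.linalg.norm(e, 2)   # ||e||_2 = delta * ||b||_2

    x_hat = np.linalg.solve(A + E, b + e)
    print("relative error     :", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
    print("bound 2*delta*kappa:", 2 * delta * np.linalg.cond(A, 2))

Because κ(A) is roughly 4×10^4 here, a perturbation of size 1e-8 can move the solution by several digits, which the printed bound reflects.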
Positive Definite, A = LDL^T
A ∈ R^{n×n} is positive definite (PD) (or positive semidefinite, PSD) if x^T A x > 0 (or x^T A x ≥ 0) for all x ≠ 0.
When LU-factorizing symmetric A, the result is A = LDL^T; L is unit lower triangular, D is diagonal. A is SPD iff D has all positive entries.
The Cholesky factorization is A = L D^{1/2} D^{1/2} L^T = GG^T. It can be computed directly in n^3/3 + O(n^2) flops. If G's diagonal is positive, A is SPD.
To solve Ax = b for SPD A, factor A = GG^T, solve Gw = b by forward substitution, then solve G^T x = w by backward substitution; this takes n^3/3 + O(n^2) flops.
For A ∈ R^{m×n}, if rank(A) = n, then A^T A is SPD.

QR-factorization
For any A ∈ R^{m×n} with m ≥ n, we can factor A = QR, where Q ∈ R^{m×m} is orthogonal and R = [ R_1 ; 0 ] ∈ R^{m×n} is upper triangular. rank(A) = n iff R_1 is invertible.
Q's first n (or last m − n) columns form an orthonormal basis for span(A) (or nullspace(A^T)).
A Householder reflection is H = I − 2vv^T / (v^T v). H is symmetric and orthogonal. Explicit Householder QR-factorization is:
  1: for k = 1 : n do
  2:   v = A(k:m, k) ± ||A(k:m, k)||_2 e_1
  3:   A(k:m, k:n) = (I − 2vv^T / (v^T v)) A(k:m, k:n)
  4: end for
We get H_n ··· H_2 H_1 A = R, so Q = H_1 H_2 ··· H_n. This takes 2mn^2 − (2/3)n^3 + O(mn) flops. Givens rotations require 50% more flops, but are preferable for sparse A.
Gram-Schmidt produces a skinny/reduced QR-factorization A = Q_1 R_1, where Q_1 ∈ R^{m×n} has orthonormal columns. The Gram-Schmidt algorithm, in its left-looking and right-looking forms, is:
  Left looking:
  1: for k = 1 : n do
  2:   q_k = a_k
  3:   for j = 1 : k − 1 do
  4:     R(j,k) = q_j^T a_k
  5:     q_k = q_k − R(j,k) q_j
  6:   end for
  7:   R(k,k) = ||q_k||_2;  q_k = q_k / R(k,k)
  8: end for
  Right looking:
  1: Q = A
  2: for k = 1 : n do
  3:   R(k,k) = ||q_k||_2
  4:   q_k = q_k / R(k,k)
  5:   for j = k + 1 : n do
  6:     R(k,j) = q_k^T q_j;  q_j = q_j − R(k,j) q_k
  7:   end for
  8: end for
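The Householder loop above maps directly to NumPy. The following is a minimal sketch (not from the original notes) that forms each H_k and accumulates Q explicitly for clarity; a production code would store the reflection vectors v instead of building full matrices.

    import numpy as np

    def householder_qr(A):
        # Explicit Householder QR: returns Q (m x m) and R (m x n) with A = Q R.
        R = A.astype(float).copy()
        m, n = R.shape
        Q = np.eye(m)
        for k in range(n):
            x = R[k:, k]
            v = x.copy()
            # v = x +/- ||x||_2 e_1; taking the sign of x[0] avoids cancellation
            v[0] += np.copysign(np.linalg.norm(x), x[0] if x[0] != 0 else 1.0)
            if np.linalg.norm(v) == 0:
                continue                              # nothing to eliminate in this column
            H = np.eye(m)
            H[k:, k:] -= 2.0 * np.outer(v, v) / (v @ v)
            R = H @ R                                 # R = H_n ... H_2 H_1 A
            Q = Q @ H                                 # Q = H_1 H_2 ... H_n
        return Q, R

    A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    Q, R = householder_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))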
Singular Value Decomposition
For any A ∈ R^{m×n}, we can express A = UΣV^T such that U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal, and Σ = diag(σ_1, ..., σ_p) ∈ R^{m×n} where p = min(m, n) and σ_1 ≥ σ_2 ≥ ... ≥ σ_p ≥ 0. The σ_i are the singular values.
1. Matrix 2-norm: ||A||_2 = σ_1.
2. The condition number κ_2(A) = ||A||_2 ||A^{-1}||_2 = σ_1 / σ_n, or the rectangular condition number κ_2(A) = σ_1 / σ_{min(m,n)}. Note that κ_2(A^T A) = κ_2(A)^2.
3. For a rank-k approximation to A, let Σ_k = diag(σ_1, ..., σ_k, 0, ..., 0). Then A_k = U Σ_k V^T; rank(A_k) ≤ k, and rank(A_k) = k iff σ_k > 0. Among matrices of rank k or lower, A_k minimizes ||A − A_k||_2 = σ_{k+1}.
4. Rank determination: rank(A) = r equals the number of nonzero σ, or in machine arithmetic, perhaps the number of σ ≥ ε_mach × σ_1.
With r = rank(A), partition
  A = UΣV^T = [ U_1  U_2 ] [ Σ(1:r, 1:r)  0 ; 0  0 ] [ V_1^T ; V_2^T ].
See that range(U_1) = range(A). The SVD gives an orthonormal basis for the range and nullspace of A and of A^T.
Compute the SVD by using shifted QR on A^T A.
For least squares via the SVD (with c = U^T b and y = V^T x), minimize Σ_{i=1}^r (σ_i y_i − c_i)^2 + Σ_{i=r+1}^m c_i^2, so y_i = c_i / σ_i. For i = r+1 : n, y_i is arbitrary.

Information Retrieval & LSI
In the bag of words model, w_d ∈ R^m, where w_d(i) is the (perhaps weighted) frequency of term i in document d. The corpus matrix is A = [w_1, ..., w_n] ∈ R^{m×n}. For a query q ∈ R^m, rank documents according to the score q^T w_d / ||w_d||_2.
In latent semantic indexing (LSI), you do the same, but in a k-dimensional subspace. Factor A = UΣV^T, then define A* = Σ_{1:k,1:k} V_{:,1:k}^T ∈ R^{k×n}. Each w*_d = A*_{:,d} = U_{:,1:k}^T w_d, and q* = U_{:,1:k}^T q.
In the Ando-Lee analysis, for a corpus with k topics, for t ∈ 1:k and d ∈ 1:n, let R_{t,d} ≥ 0 be document d's relevance to topic t, with ||R_{:,d}||_2 = 1. True document similarity is R^T R ∈ R^{n×n}, where entry (i, j) is the relevance of document i to document j.
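A short NumPy sketch (not from the original notes; the tiny term-document matrix, query, and k = 2 are made-up illustration values) of the rank-k truncation and the LSI projection described above.

    import numpy as np

    # Toy corpus: A is m terms x n documents; q is a query over the same m terms.
    A = np.array([[2., 0., 1., 0.],
                  [1., 1., 0., 0.],
                  [0., 2., 0., 1.],
                  [0., 0., 1., 2.]])
    q = np.array([1., 1., 0., 0.])
    k = 2                                             # dimension of the latent subspace

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Rank-k approximation A_k = U Sigma_k V^T; check ||A - A_k||_2 = sigma_{k+1}.
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    print(np.isclose(np.linalg.norm(A - A_k, 2), s[k]))

    # LSI: documents and query represented in the k-dimensional subspace.
    A_star = np.diag(s[:k]) @ Vt[:k, :]               # A* = Sigma_{1:k,1:k} V_{:,1:k}^T
    q_star = U[:, :k].T @ q                           # q* = U_{:,1:k}^T q
    scores = (q_star @ A_star) / np.linalg.norm(A_star, axis=0)   # q*^T w*_d / ||w*_d||_2
    print(np.argsort(-scores))                        # documents ranked by LSI score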