
Computing the diagonal of the inverse of a sparse matrix Yousef Saad Department of Computer Science and Engineering University of Minnesota “Sparse Days”, Toulouse, June 15, 2010 Motivation: DMFT ‘Dynamic Mean Field Theory’ - quantum mechanical studies of highly correlated particles ® Equation to be solved (repeatedly) is Dyson’s equation G(!) = [(! + µ)I − V − Σ(!) + T ]−1 • ! (frequency) and µ (chemical potential) are real • V = trap potential = real diagonal • Σ(!) == local self-energy - a complex diagonal • T is the hopping matrix (sparse real). Sparse Days 06/15/2010 2 ® Interested only in diagonal of G(!) – in addition, equation must be solved self-consistently and ... ® ... must do this for many !’s ® Related approach: Non Equilibrium Green’s Function (NEGF) approach used to model nanoscale transistors. ® Many new applications of diagonal of inverse [and related problems.] ® A few examples to follow Sparse Days 06/15/2010 3 Introduction: A few examples Problem 1: Compute Tr[inv[A]] the trace of the inverse. ® Arises in cross validation : k(I − A(θ))gk2 with A(θ) ≡ I−D(DT D+θLLT )−1DT ; Tr (I − A(θ)) D == blurring operator and L is the regularization operator ® In [Huntchinson ’90] Tr[Inv[A]] is stochastically estimated ® Many authors addressed this problem. Sparse Days 06/15/2010 4 Problem 2: Compute Tr [ f (A)], f a certain function Arises in many applications in Physics. Example: ® Stochastic estimations of Tr ( f(A)) extensively used by quan- tum chemists to estimate Density of States, see [Ref: H. Röder, R. N. Silver, D. A. Drabold, J. J. Dong, Phys. Rev. B. 55, 15382 (1997)] Sparse Days 06/15/2010 5 Problem 3: Compute diag[inv(A)] the diagonal of the inverse ® Arises in Dynamic Mean Field Theory [DMFT, motivation for this work]. In DMFT, we seek the diagonal of a “Green’s function” which solves (self-consistently) Dyson’s equation. [see J. Freericks 2005] ® Related approach: Non Equilibrium Green’s Function (NEGF) approach used to model nanoscale transistors. ® In uncertainty quantification, the diagonal of the inverse of a covariance matrix is needed [Bekas, Curioni, Fedulova ’09] Sparse Days 06/15/2010 6 Problem 4: Compute diag[ f (A)]; f = a certain function. ® Arises in any density matrix approach in quantum modeling - for example Density Functional Theory. ® Here, f = Fermi-Dirac operator: 1 Note: when T ! 0 f() = 1 + exp(−µ) then f becomes a step kBT function. Note: if f is approximated by a rational function then diag[f(A)] −1 ≈ a lin. combinaiton of terms like diag[(A − σiI) ] ® Linear-Scaling methods based on approximating f(H) and Diag(f(H)) – avoid ‘diagonalization’ of H Sparse Days 06/15/2010 7 Methods based on the sparse L U factorization ® Basic reference: K. Takahashi, J. Fagan, and M.-S. Chin, Formation of a sparse bus impedance matrix and its application to short circuit study, in Proc. of the Eighth Inst. PICA Conf., Minneapolis, MN, IEEE, Power Engineering Soc., 1973, pp. 63-69. ® Described in [Duff, Erisman, Reid, p. 273] - ® Algorithm used by Erisman and Tinney [Num. Math. 1975] Sparse Days 06/15/2010 8 ® Main idea. If A = LDU and B = A−1 then B = U −1D−1 + B(I − L); B = D−1L−1 + (I − U)B: ® Not all entries are needed to compute selected entries of B ® For example: Consider lower part, i > j; use first equation: X bij = (B(I − L))ij = − biklkj k>j ® Need entries bik of row i where Lkj 6= 0, k > j. ® “Entries of B belonging to the pattern of (L; U)T can be extracted without computing any other entries outside the pat- tern.” Sparse Days 06/15/2010 9 ® More recently exploited in a different form in L. Lin, C. Yang, J. Meza, J. Lu, L. Ying, W. E SelInv – An algorithm for selected inversion of a sparse symmetric matrix, Tech. Report, Princeton Univ. ® An algorithm based on a form of nested dissection is de- scribed in Li, Ahmed, Glimeck, Darve [2008] ® A close relative to this technique is represented in L. Lin , J. Lu, L. Ying , R. Car , W. E Fast algorithm for extracting the diagonal of the inverse matrix with application to the elec- tronic structure analysis of metallic systems Comm. Math. Sci, 2009. ® Difficulty: 3-D problems. Sparse Days 06/15/2010 10 Stochastic Estimator • A = original matrix, B = A−1. • δ(B) = diag(B) [matlab notation] • D(B) = diagonal matrix with diagonal δ(B) Notation: • and : Elementwise multiplication and division of vectors • fvjg: Sequence of s random vectors 2 s 3 2 s 3 X X Result: δ(B) ≈ 4 vj Bvj5 4 vj vj5 j=1 j=1 Refs: C. Bekas , E. Kokiopoulou & YS (’05), Recent: C. Bekas, A. Curioni, I. Fedulova ’09. Sparse Days 06/15/2010 11 ® Let Vs = [v1; v2; : : : ; vs]. Then, alternative expression: > −1 > D(B) ≈ D(BVsVs )D (VsVs ) Question: When is this result exact? Main Proposition n×s n×n • Let Vs 2 R with rows fvj;:g; and B 2 C with elements fbjkg • Assume that: hvj;:; vk;:i = 0, 8j 6= k, s.t. bjk 6= 0 Then: > −1 > D(B)=D(BVsVs )D (VsVs ) ® Approximation to bij exact when rows i and j of Vs are ? Sparse Days 06/15/2010 12 Ideas from information theory: Hadamard matrices ® Consider the matrix V – want the rows to be as ‘orthogonal as possible among each other’, i.e., want to minimize T kI − VV kF T Erms = or Emax = max jVV jij pn(n − 1) i6=j ® Problems that arise in coding: find code book [rows of V = code words] to minimize ’cross-correlation amplitude’ ® Welch bounds: s s n − s n − s Erms ≥ Emax ≥ (n − 1)s (n − 1)s ® Result: 9 a sequence of s vectors vk with binary entries which achieve the first Welch bound iff s = 2 or s = 4k. Sparse Days 06/15/2010 13 ® Hadamard matrices are a special class: n × n matrices with entries ±1 and such that HH> = nI. 2 1 1 1 1 3 1 1 6 1 −1 1 −1 7 Examples : and 6 7 : 1 −1 4 1 1 −1 −1 5 1 −1 −1 1 ® Achieve both Welch bounds ® Can build larger Hadamard matrices recursively: Given two Hadamard matrices H1 and H2, the Kro- necker product H1 ⊗ H2 is a Hadamard matrix. ® Too expensive to use the whole matrix of size n ® Can use Vs = matrix of s first columns of Hn Sparse Days 06/15/2010 14 0 0 20 20 40 40 32 64 60 60 80 80 100 100 120 120 0 20 40 60 80 100 120 0 20 40 60 80 100 120 > Pattern of VsVs , for s = 32 and s = 64. Sparse Days 06/15/2010 15 A Lanczos approach ® Given a Hermitian matrix A - generate Lanczos vectors via: βi+1qi+1 = Aqi − αiqi − βiqi−1 αi; βi+1 selected s.t. kqi+1k2 = 1 and qi+1 ? qi; qi+1 ? qi−1 ® Result: > AQm = QmTm + βm+1qm+1em; > −1 −1 > ® When m = n then A = QnTnQn and A = QnTn Qn : −1 −1 > ® For m < n use the approximation: A ≈ QmTm Qm ! −1 −1 > D(A ) ≈ D[QmTm Qm] Sparse Days 06/15/2010 16 ALGORITHM : 1 diagInv via Lanczos For j = 1; 2; ··· ; Do: βj+1qj+1 = Aqj − αjqj − βjqj−1 [Lanczos step] pj := qj − ηjpj−1 δj := αj − βjηj pj pj dj := dj−1 + [Update of diag(inv(A))] δj βj+1 ηj+1 := δj EndDo −1 ® dk (a vector) will converge to the diagonal of A ® Limitation: Often requires all n steps to converge ® One advantage: Lanczos is shift invariant – so can use this for many !’s ® Potential: Use as a direct method - exploiting sparsity Sparse Days 06/15/2010 17 Using a sparse V : Probing Find Vs such that (1) s is small and (2) Vs Goal: satisfies Proposition (rows i & j orthgonoal for any nonzero bij) Can work only for sparse matrices but B = Difficulty: A−1 is usually dense ® B can sometimes be approximated by a sparse matrix. ® bij; jbijj > Consider for some : (B)ij = 0; jbijj ≤ ® B will be sparse under certain conditions, e.g., when A is diagonally dominant ® In what follows we assume B is sparse and set B := B. ® Pattern will be required by standard probing methods. Sparse Days 06/15/2010 18 Generic Probing Algorithm ALGORITHM : 2 Probing Input: A, s Output: Matrix D (B) Determine Vs := [v1; v2; : : : ; vs] for j 1 to s Solve Axj = vj end Construct Xs := [x1; x2; : : : ; xs] > −1 > Compute D (B) := D XsVs D (VsVs ) ® Note: rows of Vs are typically scaled to have unit 2-norm −1 > =1., so D (VsVs ) = I. Sparse Days 06/15/2010 19 Standard probing (e.g. to compute a Jacobian) ® Several names for same method: “probing”; “CPR”, “Sparse Jacobian estimators”,.. Basis of the method: can compute Jacobian if a coloring of the columns is known so that no two columns of the same color overlap. 1 3 5 12 13 16 20 All entries of same color 1 (1) 1 (3) can be computed with 1 (5) one matvec. Example: For all blue 1 (12) entries multiply B by the 1 (13) blue vector on right. 1 (15) 1 (20) Sparse Days 06/15/2010 20 What about Diag(inv(A))? ® Define vi - probing vector associated with color i: 1 if color(k) == i [v ] = i k 0 otherwise ® Standard probing satisfies requirement of Proposition but... ® ... this coloring is not what is needed! [It is an overkill] Alternative: ® Color the graph of B in the standard graph coloring algo- rithm [Adjacency graph, not graph of column-overlaps] Graph coloring yields a valid set of probing Result: vectors for D(B).
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages36 Page
-
File Size-