Techniques for High-Performance Construction of Fock Matrices
Total Pages: 16
File Type: PDF, Size: 1020 KB
Recommended publications
Solving the Multi-Way Matching Problem by Permutation Synchronization
Solving the multi-way matching problem by permutation synchronization. Deepti Pachauri, Risi Kondor, Vikas Singh (University of Wisconsin Madison; The University of Chicago). [email protected] [email protected] [email protected] http://pages.cs.wisc.edu/~pachauri/perm-sync
Abstract: The problem of matching not just two, but m different sets of objects to each other arises in many contexts, including finding the correspondence between feature points across multiple images in computer vision. At present it is usually solved by matching the sets pairwise, in series. In contrast, we propose a new method, Permutation Synchronization, which finds all the matchings jointly, in one shot, via a relaxation to eigenvector decomposition. The resulting algorithm is both computationally efficient, and, as we demonstrate with theoretical arguments as well as experimental results, much more stable to noise than previous methods.
1 Introduction. Finding the correct bijection between two sets of objects X = {x1, x2, ..., xn} and X' = {x'1, x'2, ..., x'n} is a fundamental problem in computer science, arising in a wide range of contexts [1]. In this paper, we consider its generalization to matching not just two, but m different sets X1, X2, ..., Xm. Our primary motivation and running example is the classic problem of matching landmarks (feature points) across many images of the same object in computer vision, which is a key ingredient of image registration [2], recognition [3, 4], stereo [5], shape matching [6, 7], and structure from motion (SFM) [8, 9]. However, our approach is fully general and equally applicable to problems such as matching multiple graphs [10, 11].
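To make the eigenvector relaxation concrete, here is a minimal, self-contained sketch (not the authors' reference implementation): noisy pairwise permutation matrices are stacked into one large block matrix, its leading n eigenvectors are taken, and each block is rounded back to a permutation with the Hungarian algorithm. The sizes n and m, the noise model in noisy(), and the error metric are illustrative choices, not values from the paper.

```python
# Sketch of permutation synchronization: spectral estimate + Hungarian rounding.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n, m = 8, 6                                  # hypothetical: n landmarks, m images

# Ground-truth matching of each image to a common reference ordering.
P_true = [np.eye(n)[rng.permutation(n)] for _ in range(m)]

def noisy(T):
    """Corrupt a pairwise matching with some probability (toy noise model)."""
    if rng.random() < 0.3:
        T = T[rng.permutation(n)]
    return T

# Observed pairwise matchings T_ij ~ P_i P_j^T stacked into an (mn) x (mn) matrix.
T = np.zeros((m * n, m * n))
for i in range(m):
    for j in range(m):
        block = P_true[i] @ P_true[j].T
        T[i*n:(i+1)*n, j*n:(j+1)*n] = block if i == j else noisy(block)

# Leading n eigenvectors of the (symmetrized) block matrix.
_, V = np.linalg.eigh((T + T.T) / 2)
U = V[:, -n:]
blocks = [U[i*n:(i+1)*n] for i in range(m)]

# Round each block (relative to a reference block) back to a permutation.
P_est = []
for Ui in blocks:
    score = Ui @ blocks[0].T
    r, c = linear_sum_assignment(-score)     # maximize total score
    P = np.zeros((n, n)); P[r, c] = 1
    P_est.append(P)

# Consistency of recovered pairwise matchings against the ground truth.
err = np.mean([np.abs(P_est[i] @ P_est[j].T - P_true[i] @ P_true[j].T).sum() / (2 * n)
               for i in range(m) for j in range(m)])
print(f"average pairwise matching error: {err:.3f}")
```

With clean inputs the recovery is exact; with corrupted pairwise blocks the joint spectral estimate typically degrades gracefully, which is the stability property the abstract emphasizes.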
arXiv:1912.06217v3 [math.NA] 27 Feb 2021
ROUNDING ERROR ANALYSIS OF MIXED PRECISION BLOCK HOUSEHOLDER QR ALGORITHMS∗. L. MINAH YANG†, ALYSON FOX‡, AND GEOFFREY SANDERS‡
Abstract. Although mixed precision arithmetic has recently garnered interest for training dense neural networks, many other applications could benefit from the speed-ups and lower storage cost if applied appropriately. The growing interest in employing mixed precision computations motivates the need for rounding error analysis that properly handles behavior from mixed precision arithmetic. We develop mixed precision variants of existing Householder QR algorithms and show error analyses supported by numerical experiments.
1. Introduction. The accuracy of a numerical algorithm depends on several factors, including numerical stability and well-conditionedness of the problem, both of which may be sensitive to rounding errors, the difference between exact and finite precision arithmetic. Low precision floats use fewer bits than high precision floats to represent the real numbers and naturally incur larger rounding errors. Therefore, error attributed to round-off may have a larger influence over the total error and some standard algorithms may yield insufficient accuracy when using low precision storage and arithmetic. However, many applications exist that would benefit from the use of low precision arithmetic and storage that are less sensitive to floating-point round-off error, such as training dense neural networks [20] or clustering or ranking graph algorithms [25]. As a step towards that goal, we investigate the use of mixed precision arithmetic for the QR factorization, a widely used linear algebra routine. Many computing applications today require solutions quickly and often under low size, weight, and power constraints, such as in sensor formation, where low precision computation offers the ability to solve many problems with improvement in all four parameters.
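As a rough illustration of why such an analysis matters (a sketch under simple assumptions, not the paper's blocked or mixed precision variants): the same Householder QR routine run entirely in single precision attains roughly half the digits of accuracy of the double precision run, which is the kind of gap the rounding error bounds are meant to characterize. The function householder_qr and the test sizes below are hypothetical.

```python
# Compare backward error and loss of orthogonality of Householder QR in
# float32 versus float64 (all arithmetic and storage in the chosen dtype).
import numpy as np

def householder_qr(A, dtype):
    """Plain (unblocked) Householder QR carried out in the given dtype."""
    A = A.astype(dtype, copy=True)
    m, n = A.shape
    Q = np.eye(m, dtype=dtype)
    for k in range(n):
        x = A[k:, k]
        v = x.copy()
        v[0] += np.copysign(np.linalg.norm(x), x[0])   # Householder vector
        v /= np.linalg.norm(v)
        A[k:, k:] -= 2.0 * np.outer(v, v @ A[k:, k:])  # apply reflector to A
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)    # accumulate Q
    return Q, np.triu(A)

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 100))

for dtype in (np.float32, np.float64):
    Q, R = householder_qr(A, dtype)
    backward = np.linalg.norm(A - Q @ R) / np.linalg.norm(A)
    orth = np.linalg.norm(Q.T @ Q - np.eye(200))
    print(dtype.__name__, f"backward error {backward:.2e}, orthogonality loss {orth:.2e}")
```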
arXiv:1809.04476v2 [physics.chem-ph] 18 Oct 2018
Quantum System Partitioning at the Single-Particle Level. Adrian H. Mühlbach and Markus Reiher, ETH Zürich, Laboratorium für Physikalische Chemie, Vladimir-Prelog-Weg 2, CH-8093 Zürich, Switzerland. (Dated: 17 October 2018)
We discuss the partitioning of a quantum system by subsystem separation through unitary block-diagonalization (SSUB) applied to a Fock operator. For a one-particle Hilbert space, this separation can be formulated in a very general way. Therefore, it can be applied to very different partitionings ranging from those driven by features in the molecular structure (such as a solute surrounded by solvent molecules or an active site in an enzyme) to those that aim at an orbital separation (such as core-valence separation). Our framework embraces recent developments of Manby and Miller as well as older ones of Huzinaga and Cantu. Projector-based embedding is simplified and accelerated by SSUB. Moreover, it directly relates to decoupling approaches for relativistic four-component many-electron theory. For a Fock operator based on the Dirac one-electron Hamiltonian, one would like to separate the so-called positronic (negative-energy) states from the electronic bound and continuum states. The exact two-component (X2C) approach developed for this purpose becomes a special case of the general SSUB framework and may therefore be viewed as a system-environment decoupling approach. Moreover, for SSUB there exists no restriction with respect to the number of subsystems that are generated; in the limit, decoupling of all single-particle states is recovered, which represents exact diagonalization of the problem.
A New Scalable Parallel Algorithm for Fock Matrix Construction
A New Scalable Parallel Algorithm for Fock Matrix Construction. Xing Liu, Aftab Patel, Edmond Chow. School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology, Atlanta, Georgia, 30332, USA. [email protected], [email protected], [email protected]
Abstract—Hartree-Fock (HF) or self-consistent field (SCF) calculations are widely used in quantum chemistry, and are the starting point for accurate electronic correlation methods. Existing algorithms and software, however, may fail to scale for large numbers of cores of a distributed machine, particularly in the simulation of moderately-sized molecules. In existing codes, HF calculations are divided into tasks. Fine-grained tasks are better for load balance, but coarse-grained tasks require less communication. In this paper, we present a new parallelization of HF calculations that addresses this trade-off: we use fine-grained tasks to balance the computation among large numbers of cores, but we also use a scheme to assign tasks to processes to reduce communication. We specifically focus on the distributed construction of the Fock matrix arising in the HF algorithm, ...
...number of independent tasks and using a centralized dynamic scheduler for scheduling them on the available processes. Communication costs are reduced by defining tasks that increase data reuse, which reduces communication volume, and consequently communication cost. This approach suffers from a few problems. First, because scheduling is completely dynamic, the mapping of tasks to processes is not known in advance, and data reuse suffers. Second, the centralized dynamic scheduler itself could become a bottleneck when scaling up to a large system [14]. Third, coarse tasks which reduce communication volume may lead to poor load balance when the ratio of tasks to processes is close to one.
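A toy model of the trade-off described in the abstract (illustrative only, not the paper's algorithm): treat each (i, j) block of the Fock matrix as a fine-grained task and compare a round-robin assignment with a 2D tiled assignment of tasks to processes. The tiled mapping touches far fewer distinct density-matrix blocks per process, which is the data-reuse effect the abstract points to; nblocks, p, and comm_volume are made-up names and sizes.

```python
# Compare data reuse of two task-to-process assignments for (i, j) block tasks.
import numpy as np

nblocks, p = 24, 16          # hypothetical: 24x24 task grid, 16 processes (4x4 grid)
tasks = [(i, j) for i in range(nblocks) for j in range(nblocks)]

def comm_volume(assign):
    """Average number of distinct row/column block indices touched per process."""
    needed = {}
    for (i, j), proc in assign.items():
        needed.setdefault(proc, set()).update([("row", i), ("col", j)])
    return np.mean([len(s) for s in needed.values()])

# Round-robin: perfect load balance, but each process touches many blocks.
round_robin = {t: k % p for k, t in enumerate(tasks)}

# 2D tiling: each process owns a contiguous tile of the task grid.
q = int(np.sqrt(p))          # 4x4 process grid
tile = nblocks // q
tiled = {(i, j): (i // tile) * q + (j // tile) for (i, j) in tasks}

print("avg distinct blocks per process, round-robin:", comm_volume(round_robin))
print("avg distinct blocks per process, 2D tiling:  ", comm_volume(tiled))
```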
arXiv:1709.00765v4 [cond-mat.stat-mech] 14 Oct 2019
Number of hidden states needed to physically implement a given conditional distribution. Jeremy A. Owen (Physics of Living Systems Group, Department of Physics, Massachusetts Institute of Technology, 400 Tech Square, Cambridge, MA 02139), Artemy Kolchinsky (Santa Fe Institute), and David H. Wolpert (Santa Fe Institute; Arizona State University; http://davidwolpert.weebly.com). (Dated: October 15, 2019)
We consider the problem of how to construct a physical process over a finite state space X that applies some desired conditional distribution P to initial states to produce final states. This problem arises often in the thermodynamics of computation and nonequilibrium statistical physics more generally (e.g., when designing processes to implement some desired computation, feedback controller, or Maxwell demon). It was previously known that some conditional distributions cannot be implemented using any master equation that involves just the states in X. However, here we show that any conditional distribution P can in fact be implemented—if additional “hidden” states not in X are available. Moreover, we show that it is always possible to implement P in a thermodynamically reversible manner. We then investigate a novel cost of the physical resources needed to implement a given distribution P: the minimal number of hidden states needed to do so. We calculate this cost exactly for the special case where P represents a single-valued function, and provide an upper bound for the general case, in terms of the nonnegative rank of P. These results show that having access to one extra binary degree of freedom, thus doubling the total number of states, is sufficient to implement any P with a master equation in a thermodynamically reversible way, if there are no constraints on the allowed form of the master equation.
Domain Decomposition and Electronic Structure Computations: a Promising Approach
Domain Decomposition and Electronic Structure Computations: A Promising Approach. G. Bencteux(1,4), M. Barrault(1), E. Cancès(2,4), W. W. Hager(3), and C. Le Bris(2,4). (1) EDF R&D, 1 avenue du Général de Gaulle, 92141 Clamart Cedex, France, {guy.bencteux,maxime.barrault}@edf.fr. (2) CERMICS, École Nationale des Ponts et Chaussées, 6 & 8 avenue Blaise Pascal, Cité Descartes, 77455 Marne-La-Vallée Cedex 2, France, {cances,lebris}@cermics.enpc.fr. (3) Department of Mathematics, University of Florida, Gainesville, FL 32611-8105, USA, [email protected]. (4) INRIA Rocquencourt, MICMAC project, Domaine de Voluceau, B.P. 105, 78153 Le Chesnay Cedex, France.
Summary. We describe a domain decomposition approach applied to the specific context of electronic structure calculations. The approach has been introduced in [BCH06]. We survey here the computational context, and explain the peculiarities of the approach as compared to problems of seemingly the same type in other engineering sciences. Improvements of the original approach presented in [BCH06], including algorithmic refinements and effective parallel implementation, are included here. Test cases supporting the interest of the method are also reported. It is our pleasure and an honor to dedicate this contribution to Olivier Pironneau, on the occasion of his sixtieth birthday. With admiration, respect and friendship.
1 Introduction and Motivation. 1.1 General Context. Numerical simulation is nowadays an ubiquitous tool in materials science, chemistry and biology. Design of new materials, irradiation induced damage, drug design, protein folding are instances of applications of numerical simulation. For convenience we now briefly present the context of the specific computational problem under consideration in the present article.
On the Asymptotic Existence of Partial Complex Hadamard Matrices and Related Combinatorial Objects
Discrete Applied Mathematics 102 (2000) 37–45. On the asymptotic existence of partial complex Hadamard matrices and related combinatorial objects. Warwick de Launey, Center for Communications Research, 4320 Westerra Court, San Diego, California, CA 92121, USA. Received 1 September 1998; accepted 30 April 1999.
Abstract: A proof of an asymptotic form of the original Goldbach conjecture for odd integers was published in 1937. In 1990, a theorem refining that result was published. In this paper, we describe some implications of that theorem in combinatorial design theory. In particular, we show that the existence of Paley’s conference matrices implies that for any sufficiently large integer k there is (at least) about one third of a complex Hadamard matrix of order 2k. This implies that, for any ε > 0, the well known bounds for (a) the number of codewords in moderately high distance binary block codes, (b) the number of constraints of two-level orthogonal arrays of strengths 2 and 3 and (c) the number of mutually orthogonal F-squares with certain parameters are asymptotically correct to within a factor of (1/3)(1 − ε) for cases (a) and (b) and to within a factor of (1/9)(1 − ε) for case (c). The methods yield constructive algorithms which are polynomial in the size of the object constructed. An efficient heuristic is described for constructing (for any specified order) about one half of a Hadamard matrix. © 2000 Elsevier Science B.V. All rights reserved.
Triangular Factorization
Chapter 1. Triangular Factorization. This chapter deals with the factorization of arbitrary matrices into products of triangular matrices. Since the solution of a linear n × n system can be easily obtained once the matrix is factored into the product of triangular matrices, we will concentrate on the factorization of square matrices. Specifically, we will show that an arbitrary n × n matrix A has the factorization PA = LU where P is an n × n permutation matrix, L is an n × n unit lower triangular matrix, and U is an n × n upper triangular matrix. In connection with this factorization we will discuss pivoting, i.e., row interchange, strategies. We will also explore circumstances for which A may be factored in the forms A = LU or A = LL^T. Our results for a square system will be given for a matrix with real elements but can easily be generalized for complex matrices. The corresponding results for a general m × n matrix will be accumulated in Section 1.4. In the general case an arbitrary m × n matrix A has the factorization PA = LU where P is an m × m permutation matrix, L is an m × m unit lower triangular matrix, and U is an m × n matrix having row echelon structure.
1.1 Permutation matrices and Gauss transformations. We begin by defining permutation matrices and examining the effect of premultiplying or postmultiplying a given matrix by such matrices. We then define Gauss transformations and show how they can be used to introduce zeros into a vector. Definition 1.1. An m × m permutation matrix is a matrix whose columns consist of a rearrangement of the m unit vectors e(j), j = 1, ..., m, in R^m, i.e., a rearrangement of the columns (or rows) of the m × m identity matrix.
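The factorization PA = LU with row interchanges can be written down in a few lines; the following sketch (naive, for exposition rather than performance) applies one Gauss transformation per column after choosing the largest available pivot, mirroring the pivoting strategy discussed in this chapter.

```python
# PA = LU with partial pivoting: P permutation, L unit lower triangular, U upper triangular.
import numpy as np

def plu(A):
    A = A.astype(float, copy=True)
    n = A.shape[0]
    P = np.eye(n)
    L = np.eye(n)
    U = A
    for k in range(n - 1):
        # Pivot: bring the largest entry in column k (rows k..n-1) to the diagonal.
        piv = k + np.argmax(np.abs(U[k:, k]))
        if piv != k:
            U[[k, piv], :] = U[[piv, k], :]
            P[[k, piv], :] = P[[piv, k], :]
            L[[k, piv], :k] = L[[piv, k], :k]
        # Gauss transformation: eliminate the entries below the pivot.
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]
            U[i, k:] -= L[i, k] * U[k, k:]
    return P, L, np.triu(U)

A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
P, L, U = plu(A)
print(np.allclose(P @ A, L @ U))   # True
```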
Think Global, Act Local When Estimating a Sparse Precision Matrix
Think Global, Act Local When Estimating a Sparse Precision Matrix, by Peter Alexander Lee, A.B., Harvard University (2007). Submitted to the Sloan School of Management in partial fulfillment of the requirements for the degree of Master of Science in Operations Research at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, June 2016. © Peter Alexander Lee, MMXVI. All rights reserved. The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part in any medium now known or hereafter created. Author: Sloan School of Management, May 12, 2016. Certified by: Cynthia Rudin, Associate Professor, Thesis Supervisor. Accepted by: Patrick Jaillet, Dugald C. Jackson Professor, Department of Electrical Engineering and Computer Science and Co-Director, Operations Research Center. Submitted to the Sloan School of Management on May 12, 2016.
Abstract: Substantial progress has been made in the estimation of sparse high dimensional precision matrices from scant datasets. This is important because precision matrices underpin common tasks such as regression, discriminant analysis, and portfolio optimization. However, few good algorithms for this task exist outside the space of L1 penalized optimization approaches like GLASSO. This thesis introduces LGM, a new algorithm for the estimation of sparse high dimensional precision matrices. Using the framework of probabilistic graphical models, the algorithm performs robust covariance estimation to generate potentials for small cliques and fuses the local structures to form a sparse yet globally robust model of the entire distribution.
Matrix Algebra for Quantum Chemistry
Matrix Algebra for Quantum Chemistry. EMANUEL H. RUBENSSON. Doctoral Thesis in Theoretical Chemistry, Stockholm, Sweden 2008. © Emanuel Härold Rubensson, 2008. TRITA-BIO-Report 2008:23, ISBN 978-91-7415-160-2, ISSN 1654-2312. Printed by Universitetsservice US AB, Stockholm, Sweden 2008. Typeset in LaTeX by the author.
Abstract: This thesis concerns methods of reduced complexity for electronic structure calculations. When quantum chemistry methods are applied to large systems, it is important to optimally use computer resources and only store data and perform operations that contribute to the overall accuracy. At the same time, precarious approximations could jeopardize the reliability of the whole calculation. In this thesis, the self-consistent field method is seen as a sequence of rotations of the occupied subspace. Errors coming from computational approximations are characterized as erroneous rotations of this subspace. This viewpoint is optimal in the sense that the occupied subspace uniquely defines the electron density. Errors should be measured by their impact on the overall accuracy instead of by their constituent parts. With this point of view, a mathematical framework for control of errors in Hartree-Fock/Kohn-Sham calculations is proposed. A unifying framework is of particular importance when computational approximations are introduced to efficiently handle large systems. An important operation in Hartree-Fock/Kohn-Sham calculations is the calculation of the density matrix for a given Fock/Kohn-Sham matrix. In this thesis, density matrix purification is used to compute the density matrix with time and memory usage increasing only linearly with system size. The forward error of purification is analyzed and schemes to control the forward error are proposed.
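As a pocket illustration of the purification idea (a sketch, not the thesis' linear-scaling implementation): McWeeny's iteration X_{k+1} = 3X_k^2 - 2X_k^3 drives an initial matrix whose eigenvalues lie in [0, 1] toward the idempotent density matrix, using only matrix multiplications. In this toy the chemical potential and spectral bounds are taken from an exact diagonalization, which a production linear-scaling code would avoid.

```python
# McWeeny purification of a toy "Fock" matrix in an orthonormal basis.
import numpy as np

rng = np.random.default_rng(2)
n, n_occ = 20, 7
F = rng.standard_normal((n, n)); F = (F + F.T) / 2        # toy symmetric Fock matrix

eps, C = np.linalg.eigh(F)
mu = 0.5 * (eps[n_occ - 1] + eps[n_occ])                   # between HOMO and LUMO (toy shortcut)
s = max(mu - eps[0], eps[-1] - mu)

# Initial guess: eigenvalues of F mapped into [0, 1], occupied ones above 1/2.
X = 0.5 * np.eye(n) + (mu * np.eye(n) - F) / (2 * s)

for _ in range(100):
    X2 = X @ X
    if np.linalg.norm(X2 - X) < 1e-10:                     # idempotency reached
        break
    X = 3 * X2 - 2 * X2 @ X                                # McWeeny step

D_exact = C[:, :n_occ] @ C[:, :n_occ].T                    # reference density matrix
print("trace:", np.trace(X), " error vs exact:", np.linalg.norm(X - D_exact))
```

The point of the scheme is that every operation is a matrix product, so with sparse matrices the whole loop can run in time and memory that grow only linearly with system size, as the abstract states.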
Problem Set #1 Solutions
PROBLEM SET #2 SOLUTIONS, by Robert A. DiStasio Jr.
Q1. 1-particle density matrices and idempotency. (a) A matrix M is said to be idempotent if M^2 = M. Show from the basic definition that the HF density matrix is idempotent when expressed in an orthonormal basis.
An element of the HF density matrix is given as (neglecting the factor of two for the restricted closed-shell HF density matrix):
P_μν = Σ_i C_μi C*_νi.   (1)
In matrix form, (1) is simply equivalent to
P = C_occ C_occ†   (2)
where C_occ is the occupied block of the MO coefficient matrix. To reformulate the nonorthogonal Roothaan-Hall equations
F C = S C ε   (3)
into the preferred eigenvalue problem
F̃ C̃ = C̃ ε   (4)
the AO basis must be orthogonalized. There are several ways to orthogonalize the AO basis, but I will only discuss symmetric orthogonalization, as this is arguably the most often used procedure in computational quantum chemistry. In the symmetric orthogonalization procedure, the MO coefficient matrix in (3) is recast as
C = S^(-1/2) C̃  ⇔  C̃ = S^(1/2) C   (5)
which can be substituted into (3), yielding the eigenvalue form of the Roothaan-Hall equations in (4) after the following algebraic manipulation:
F C = S C ε
F (S^(-1/2) C̃) = S (S^(-1/2) C̃) ε
S^(-1/2) F S^(-1/2) C̃ = S^(-1/2) S S^(-1/2) C̃ ε
(S^(-1/2) F S^(-1/2)) C̃ = (S^(-1/2) S S^(-1/2)) C̃ ε
F̃ C̃ = C̃ ε   (6)
In (6), the dressed Fock matrix, F̃, was clearly defined as F̃ ≡ S^(-1/2) F S^(-1/2), and S^(-1/2) S S^(-1/2) = S^0 = 1 was used. In an orthonormal basis, therefore, the density matrix of (2) takes on the following form:
P̃ = C̃_occ C̃_occ†.
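A quick numerical check of the manipulation above (a sketch with made-up matrices, not part of the original problem set): build S^(-1/2) by diagonalizing a toy overlap matrix, form the dressed Fock matrix, assemble the density matrix from its lowest eigenvectors, and confirm that it is idempotent with trace equal to the number of occupied orbitals.

```python
# Symmetric (Loewdin) orthogonalization and idempotency of the resulting density matrix.
import numpy as np

rng = np.random.default_rng(3)
nbf, nocc = 10, 3
B = rng.standard_normal((nbf, nbf))
S = B @ B.T + nbf * np.eye(nbf)            # toy overlap matrix (symmetric positive definite)
F = rng.standard_normal((nbf, nbf)); F = (F + F.T) / 2   # toy Fock matrix

# S^(-1/2) from the eigendecomposition of S.
s, U = np.linalg.eigh(S)
S_inv_half = U @ np.diag(s ** -0.5) @ U.T

F_t = S_inv_half @ F @ S_inv_half          # dressed Fock matrix, as in Eq. (6)
e, C_t = np.linalg.eigh(F_t)               # F~ C~ = C~ eps, Eq. (4)
P_t = C_t[:, :nocc] @ C_t[:, :nocc].T      # P~ = C~_occ C~_occ^dagger

print("idempotent:", np.allclose(P_t @ P_t, P_t))         # True in an orthonormal basis
print("trace = nocc:", np.isclose(np.trace(P_t), nocc))   # True
```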
Relativistic Cholesky-Decomposed Density Matrix MP2
Relativistic Cholesky-decomposed density matrix MP2. Benjamin Helmich-Paris(1,2), Michal Repisky(3), and Lucas Visscher(2). (1) Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, D-45470 Mülheim an der Ruhr; (2) Section of Theoretical Chemistry, Vrije Universiteit Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands; (3) Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiT The Arctic University of Norway, N-9037 Tromsø, Norway. (Dated: 15 October 2018)
In the present article, we introduce the relativistic Cholesky-decomposed density (CDD) matrix second-order Møller–Plesset perturbation theory (MP2) energies. The working equations are formulated in terms of the usual intermediates of MP2 when employing the resolution-of-the-identity approximation (RI) for two-electron integrals. Those intermediates are obtained by substituting the occupied and virtual quaternion pseudo-density matrices of our previously proposed two-component atomic orbital-based MP2 (J. Chem. Phys. 145, 014107 (2016)) by the corresponding pivoted quaternion Cholesky factors. While working within the Kramers-restricted formalism, we obtain a formal spin-orbit overhead of 16 and 28 for the Coulomb and exchange contribution to the 2C MP2 correlation energy, respectively, compared to a non-relativistic (NR) spin-free CDD-MP2 implementation. This compact quaternion formulation could also be easily explored in any other algorithm to compute the 2C MP2 energy. The quaternion Cholesky factors become sparse for large molecules and, with a block-wise screening, block sparse-matrix multiplication algorithm, we observed an effective quadratic scaling of the total wall time for heavy-element containing linear molecules with increasing system size.
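The "Cholesky-decomposed density" idea can be illustrated in a few lines (a sketch of the underlying linear algebra only, not the paper's quaternion or relativistic machinery): the occupied density matrix D = C_occ C_occ^T is positive semidefinite with rank equal to the number of occupied orbitals, so a pivoted Cholesky factorization D ≈ L L^T produces compact factors that can stand in for the canonical MO coefficients. The pivoted_cholesky routine below is a generic textbook version written for the demo.

```python
# Pivoted Cholesky factorization of a rank-deficient occupied density matrix.
import numpy as np

def pivoted_cholesky(A, tol=1e-10):
    """Left-looking pivoted Cholesky; returns L with A ~= L @ L.T (low rank)."""
    n = A.shape[0]
    d = np.diag(A).astype(float).copy()      # residual diagonal
    perm = np.arange(n)
    L = np.zeros((n, n))
    k = 0
    while k < n and d[perm[k:]].max() > tol:
        i = k + int(np.argmax(d[perm[k:]]))  # pivot on the largest residual diagonal
        perm[[k, i]] = perm[[i, k]]
        p = perm[k]
        L[p, k] = np.sqrt(d[p])
        for q in perm[k + 1:]:
            L[q, k] = (A[q, p] - L[q, :k] @ L[p, :k]) / L[p, k]
            d[q] -= L[q, k] ** 2
        k += 1
    return L[:, :k]

rng = np.random.default_rng(4)
nbf, nocc = 12, 4
C = np.linalg.qr(rng.standard_normal((nbf, nbf)))[0]      # toy orthonormal MOs
D = C[:, :nocc] @ C[:, :nocc].T                           # occupied density matrix

L = pivoted_cholesky(D)
print("rank of factor:", L.shape[1])                       # equals nocc
print("D reconstructed:", np.allclose(D, L @ L.T))         # True
```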