arXiv:1311.6244v1 [physics.comp-ph] 25 Nov 2013

An efficient implementation of Slater-Condon rules

Anthony Scemama,∗ Emmanuel Giner

November 26, 2013

∗Laboratoire de Chimie et Physique Quantiques, CNRS-IRSAMC, Université de Toulouse, France.

Abstract

Slater-Condon rules are at the heart of any quantum chemistry method as they allow one to simplify 3N-dimensional integrals as sums of 3- or 6-dimensional integrals. In this paper, we propose an efficient implementation of those rules in order to identify very rapidly which integrals are involved in a matrix element expressed in the determinant basis set. This implementation takes advantage of the bit manipulation instructions on x86 architectures that were introduced in 2008 with the SSE4.2 instruction set. Finding which spin-orbitals are involved in the calculation of a matrix element does not depend on the number of electrons of the system.

In this work we consider wave functions $\Psi$ expressed as linear combinations of Slater determinants $D$ of orthonormal spin-orbitals $\phi(\mathbf{r})$:

\[
\Psi = \sum_i c_i D_i \tag{1}
\]

Using the Slater-Condon rules,[1, 2] the matrix elements of any one-body ($\mathcal{O}_1$) or two-body ($\mathcal{O}_2$) operator expressed in the determinant space have simple expressions involving one- and two-electron integrals in the spin-orbital space. The diagonal elements are given by:

\[
\begin{aligned}
\langle D | \mathcal{O}_1 | D \rangle &= \sum_{i \in D} \langle \phi_i | \mathcal{O}_1 | \phi_i \rangle \\
\langle D | \mathcal{O}_2 | D \rangle &= \frac{1}{2} \sum_{(i,j) \in D} \left[ \langle \phi_i \phi_j | \mathcal{O}_2 | \phi_i \phi_j \rangle - \langle \phi_i \phi_j | \mathcal{O}_2 | \phi_j \phi_i \rangle \right]
\end{aligned}
\tag{2}
\]

For two determinants which differ only by the substitution of spin-orbital $i$ with spin-orbital $j$:

\[
\begin{aligned}
\langle D | \mathcal{O}_1 | D_i^j \rangle &= \langle \phi_i | \mathcal{O}_1 | \phi_j \rangle \\
\langle D | \mathcal{O}_2 | D_i^j \rangle &= \sum_{k \in D} \left[ \langle \phi_i \phi_k | \mathcal{O}_2 | \phi_j \phi_k \rangle - \langle \phi_i \phi_k | \mathcal{O}_2 | \phi_k \phi_j \rangle \right]
\end{aligned}
\tag{3}
\]

For two determinants which differ by two spin-orbitals:

\[
\begin{aligned}
\langle D | \mathcal{O}_1 | D_{ik}^{jl} \rangle &= 0 \\
\langle D | \mathcal{O}_2 | D_{ik}^{jl} \rangle &= \langle \phi_i \phi_k | \mathcal{O}_2 | \phi_j \phi_l \rangle - \langle \phi_i \phi_k | \mathcal{O}_2 | \phi_l \phi_j \rangle
\end{aligned}
\tag{4}
\]

All other matrix elements involving determinants with more than two substitutions are zero.

An efficient implementation of those rules requires:

1. to find the number of spin-orbital substitutions between two determinants
2. to find which spin-orbitals are involved in the substitution
3. to compute the phase factor if a reordering of the spin-orbitals has occurred

This paper proposes an efficient implementation of those three points by using some specific bit manipulation instructions at the CPU level.

1 Algorithm

In this section, we use the convention that the least significant bit of binary integers is the rightmost bit. As the position number of a bit in an integer is the exponent for the corresponding bit weight in base 2, the bit positions are numbered from the right to the left starting at position 0. To be consistent with this convention, we also represent the arrays of 64-bit integers from right to left, starting at position zero. Following the usual notation, the spin-orbitals start with index one.

1.1 Binary representation of the determinants

The molecular spin-orbitals in the determinants are ordered by spin: the α spin-orbitals are placed before the β spin-orbitals. Each determinant is represented as a pair of bit-strings: one bit-string corresponding to the α spin-orbital occupations, and one bit-string for the β spin-orbital occupations. When the $i$-th orbital is occupied by an electron with spin σ in the determinant, the bit at position $(i - 1)$ of the σ bit-string is set to one, otherwise it is set to zero.

The pair of bit-strings is encoded in a 2-dimensional array of 64-bit integers. The first dimension contains $N_{\rm int}$ elements and starts at position zero. $N_{\rm int}$ is the minimum number of 64-bit integers needed to encode the bit-strings:

\[
N_{\rm int} = \lfloor N_{\rm MOs} / 64 \rfloor + 1 \tag{5}
\]

where $N_{\rm MOs}$ is the total number of molecular spin-orbitals with spin α or β (we assume this number to be the same for both spins). The second index of the array corresponds to the α or β spin. Hence, determinant $D_k$ is represented by an array of $N_{\rm int}$ 64-bit integers $I^k_{i,\sigma}$, $i \in [0, N_{\rm int} - 1]$; $\sigma \in \{1, 2\}$.

1.2 Finding the number of substitutions

We propose an algorithm to count the number of substitutions between two determinants $D_1$ and $D_2$ (algorithm 1). This number is equivalent to the degree of excitation $d$ of the operator $\hat{T}_d$ which transforms $D_1$ into $D_2$ ($D_2 = \hat{T}_d D_1$). The degree of excitation is equivalent to the number of holes created in $D_1$ or to the number of particles created in $D_2$. In algorithm 1, the total number of substitutions is calculated as the sum of the number of holes created in each 64-bit integer of $D_1$.

On line 4, ($I^1_{i,\sigma}$ xor $I^2_{i,\sigma}$) returns a 64-bit integer with bits set to one where the bits differ between $I^1_{i,\sigma}$ and $I^2_{i,\sigma}$. Those correspond to the positions of the holes and the particles. The popcnt function returns the number of non-zero bits in the resulting integer. At line 5, two_d contains the sum of the number of holes and particles, so the excitation degree $d$ is half of two_d.

The Hamming weight is defined as the number of non-zero bits in a binary integer. The fast calculation of Hamming weights is crucial in various domains of computer science such as error-correcting codes[3] or cryptography[4]. Therefore, the computation of Hamming weights has appeared in the hardware of processors in 2008 via the popcnt instruction introduced with the SSE 4.2 instruction set. This instruction has a 3-cycle latency and a 1-cycle throughput independently of the number of bits set to one (here, independently of the number of electrons), as opposed to Wegner's algorithm[5] that repeatedly finds and clears the last nonzero bit. The popcnt instruction may be generated by Fortran compilers via the intrinsic popcnt function.

Algorithm 1: Compute the degree of excitation between $D_1$ and $D_2$.
    Function n_excitations($I^1$, $I^2$)
    Input: $I^1$, $I^2$: lists of integers representing determinants $D_1$ and $D_2$.
    Output: d: Degree of excitation.
    1  two_d ← 0;
    2  for σ ∈ {α, β} do
    3      for i ← 0 to $N_{\rm int}$ − 1 do
    4          two_d ← two_d + popcnt($I^1_{i,\sigma}$ xor $I^2_{i,\sigma}$);
    5  d ← two_d / 2;
    6  return d;

1.3 Identifying the substituted spin-orbitals

Algorithm 2 creates the list of spin-orbital indices containing the holes of the excitation from $D_1$ to $D_2$. At line 4, H is set to a 64-bit integer with ones at the positions of the holes. The loop starting at line 5 translates the positions of those bits to spin-orbital indices as follows: when H ≠ 0, the index of the rightmost bit of H set to one is equal to the number of trailing zeros of the integer. This number can be obtained by the x86_64 bsf (bit scan forward) instruction with a latency of 3 cycles and a 1-cycle throughput, and may be generated by the Fortran trailz intrinsic function. At line 7, the spin-orbital index is calculated. At line 8, the rightmost bit set to one is cleared in H.

Algorithm 2: Obtain the list of orbital indices corresponding to holes in the excitation from $D_1$ to $D_2$.
    Function get_holes($I^1$, $I^2$)
    Input: $I^1$, $I^2$: lists of integers representing determinants $D_1$ and $D_2$.
    Output: Holes: List of positions of the holes.
    1  for σ ∈ {α, β} do
    2      k ← 0;
    3      for i ← 0 to $N_{\rm int}$ − 1 do
    4          H ← ($I^1_{i,\sigma}$ xor $I^2_{i,\sigma}$) and $I^1_{i,\sigma}$;
    5          while H ≠ 0 do
    6              position ← trailing_zeros(H);
    7              Holes[k, σ] ← 1 + 64 × i + position;
    8              H ← bit_clear(H, position);
    9              k ← k + 1;
    10 return Holes;

Algorithm 3: Obtain the list of orbital indices corresponding to particles in the excitation from $D_1$ to $D_2$.
    Function get_particles($I^1$, $I^2$)
    Input: $I^1$, $I^2$: lists of integers representing determinants $D_1$ and $D_2$.
    Output: Particles: List of positions of the particles.
    1  for σ ∈ {α, β} do
    2      k ← 0;
    3      for i ← 0 to $N_{\rm int}$ − 1 do
    4          P ← ($I^1_{i,\sigma}$ xor $I^2_{i,\sigma}$) and $I^2_{i,\sigma}$;
    5          while P ≠ 0 do
    6              position ← trailing_zeros(P);
    7              Particles[k, σ] ← 1 + 64 × i + position;
    8              P ← bit_clear(P, position);
    9              k ← k + 1;
    10 return Particles;
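The bit-string encoding of section 1.1 and the xor/popcnt/trailing-zeros steps of algorithms 1 to 3 can be illustrated with a minimal Python sketch. This is not the authors' Fortran implementation: the helper names `n_int`, `encode` and `substituted` are hypothetical, a determinant is stored here as `det[spin][i]` rather than as a 2-dimensional array, and because Python integers are unbounded the hardware popcnt and bsf instructions are only emulated.

```python
WORD = 64                      # bits per integer, as in the paper
MASK64 = (1 << WORD) - 1       # keeps emulated words within 64 bits


def n_int(n_mos):
    """Number of 64-bit words per spin, following Eq. (5)."""
    return n_mos // WORD + 1


def encode(occupied, n_mos):
    """Turn 1-based occupied orbital indices (one spin) into a list of words."""
    words = [0] * n_int(n_mos)
    for orbital in occupied:
        position = orbital - 1                 # orbital i sits at bit i - 1
        words[position // WORD] |= 1 << (position % WORD)
    return words


def popcnt(x):
    """Hamming weight of a 64-bit word (software stand-in for popcnt)."""
    return bin(x & MASK64).count("1")


def trailing_zeros(x):
    """Index of the rightmost set bit of a non-zero word (stand-in for bsf)."""
    return (x & -x).bit_length() - 1


def n_excitations(det1, det2):
    """Algorithm 1: half the number of differing bits over both spins."""
    two_d = 0
    for spin in (0, 1):
        for w1, w2 in zip(det1[spin], det2[spin]):
            two_d += popcnt(w1 ^ w2)
    return two_d // 2


def substituted(det1, det2, particles=False):
    """Algorithms 2 and 3: 1-based indices of holes (or particles) per spin."""
    result = ([], [])
    for spin in (0, 1):
        for i, (w1, w2) in enumerate(zip(det1[spin], det2[spin])):
            # Holes are differing bits set in det1, particles those set in det2.
            bits = (w1 ^ w2) & (w2 if particles else w1)
            while bits:
                position = trailing_zeros(bits)
                result[spin].append(1 + WORD * i + position)
                bits &= bits - 1               # clear the rightmost set bit
    return result


# Illustrative data: 70 orbitals per spin, so two words per spin.
d1 = (encode([1, 2, 3], 70), encode([1, 2, 3], 70))
d2 = (encode([1, 2, 66], 70), encode([1, 3, 5], 70))
print(n_excitations(d1, d2))                 # 2
print(substituted(d1, d2))                   # ([3], [2])
print(substituted(d1, d2, particles=True))   # ([66], [5])
```

In this example the α substitution 3 → 66 crosses a word boundary, which shows that the per-word loop needs no special case for it.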
The list of particles is obtained in a similar way with algorithm 3.

1.4 Computing the phase

In our representation, the spin-orbitals are always ordered by increasing index. Therefore, a reordering may occur during the spin-orbital substitution, involving a possible change of the phase.

As no more than two substitutions between deter- [...]

[...] tained in the integers between the integer containing the lowest orbital (included) and the integer containing the highest orbital (excluded). Line 14 sets all the m rightmost bits of the integer to one and all the other bits to zero. At line 15, the n + 1 rightmost bits of the integer containing the lowest orbital are set to zero. At this point, the bit mask is defined on integers mask[j] to mask[k] (if the substitution occurs on the same 64-bit integer, j = k). mask[j] has zeros on the rightmost bits and mask[k] has zeros on the leftmost bits.
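The mask construction described above can be sketched in Python as follows. The text that defines m, n, j and k is not part of this excerpt, so the sketch assumes that (j, n) and (k, m) are the word index and in-word bit position of the lowest and highest substituted orbitals. It also assumes the standard result that the phase of a single substitution is −1 raised to the number of occupied spin-orbitals lying strictly between the hole and the particle. The function names `between_mask` and `phase_single` are hypothetical.

```python
WORD = 64
ALL_ONES = (1 << WORD) - 1


def between_mask(low, high, n_words):
    """Words with ones strictly between 0-based bit positions low < high."""
    j, n = divmod(low, WORD)    # word and bit offset of the lowest orbital
    k, m = divmod(high, WORD)   # word and bit offset of the highest orbital
    mask = [0] * n_words
    for l in range(j, k):       # lowest word included, highest word excluded
        mask[l] = ALL_ONES
    mask[k] = (1 << m) - 1      # m rightmost bits set to one, others zero
    # Clear the n + 1 rightmost bits of the word holding the lowest orbital.
    mask[j] &= ~((1 << (n + 1)) - 1) & ALL_ONES
    return mask


def phase_single(words, hole, particle):
    """Assumed phase rule for one substitution within a single spin.

    words: occupation words of that spin; hole, particle: 1-based indices.
    """
    low, high = sorted((hole - 1, particle - 1))
    mask = between_mask(low, high, len(words))
    n_perm = sum(bin(w & mk).count("1") for w, mk in zip(words, mask))
    return -1 if n_perm % 2 else 1


# Orbitals 1, 2, 3 occupied; two words per spin (orbital 66 is in word 1).
alpha = [0b111, 0]
print(phase_single(alpha, 2, 66))   # -1: only orbital 3 lies in between
print(phase_single(alpha, 1, 66))   # 1: orbitals 2 and 3 lie in between
```

When the hole and the particle fall in the same word (j = k), the loop body never runs and the two mask operations act on the same integer, leaving ones only between the two positions.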
