Arxiv:1605.05526V2 [Physics.Chem-Ph] 13 Jul 2016 9,14 Accuracy of Other Methods to Obtain Γzzzz

Home , Fock matrix

DMRG-CASPT2 study of the longitudinal static second hyperpolarizability of all-trans polyenes Sebastian Wouters,1 Veronique Van Speybroeck,1 and Dimitri Van Neck1 Center for Molecular Modelling, Ghent University, Technologiepark 903, 9052 Zwijnaarde, Belgium We have implemented internally contracted complete active space second order perturbation theory (CASPT2) with the density matrix renormalization group (DMRG) as active space solver [Y. Kurashige and T. Yanai, J. Chem. Phys. 135, 094104 (2011)]. Internally contracted CASPT2 requires to contract the generalized Fock matrix with the 4-particle reduced density matrix (4-RDM) of the reference wavefunction. The required 4- RDM elements can be obtained from 3-particle reduced density matrices (3-RDM) of diﬀerent wavefunctions, formed by symmetry-conserving single-particle excitations op top of the reference wavefunction. In our spin- adapted DMRG code chemps2 [https://github.com/sebwouters/chemps2], we decompose these excited wavefunctions as spin-adapted matrix product states, and calculate their 3-RDM in order to obtain the required contraction of the generalized Fock matrix with the 4-RDM of the reference wavefunction. In this work, we study the longitudinal static second hyperpolarizability of all-trans polyenes C2nH2n+2 [n = 4 − 12] in the cc-pVDZ basis set. DMRG-SCF and DMRG-CASPT2 yield substantially lower values and scaling with system size compared to RHF and MP2, respectively.

I. INTRODUCTION For realistic systems and Hamiltonians, applying DMRG to the entire system becomes infeasible. DMRG Non-linear optical (NLO) properties of organic con- can then be used as a numerically exact active space jugated materials have been studied extensively due to solver in conjunction with the complete active space self their potential use in optical devices. In computational consistent field method (DMRG-SCF) to treat the static correlation.26–29 Dynamic correlation can be added sub- chemistry, it is well known that approximate methods 30,31 struggle to yield qualitatively accurate NLO properties. sequently by canonical transformation theory, internally contracted perturbation theory,32–34 or a configu- Density functional theory (DFT) with the commonly 35,36 used exchange-correlation functionals significantly over- ration interaction expansion. Alternatively, the perturbation wavefunctions can also be solved within the estimates the longitudinal static second hyperpolarizabil- 37–39 1–3 DMRG framework. Recently DMRG has shown its ity γzzzz and its scaling with system size. Long-range corrected functionals reduce a large part of the over- full potential by resolving the electronic structure of important catalysts, containing multiple transition metal estimation, indicating the importance of long-range ex- 40–42 change for the NLO properties.4–7 atoms. Also in ab initio methods, electron correlation should The MPS ansatz and the DMRG algorithm are re- be included with care.8 Restricted Hartree-Fock (RHF) viewed in Sect. II and III, respectively. Our implementation of the contraction of the generalized Fock ma- theory overestimates γzzzz, and with second order Møller-Plesset perturbation theory (MP2) this over- trix with the 4-particle reduced density matrix (4-RDM) estimation becomes even worse.9 For short polyenes, of the reference wavefunction is described in Sect. IV. coupled-cluster theory with single and double excitations The DMRG-SCF algorithm and internally contracted complete active space second order perturbation the- (CCSD) predicts larger γzzzz values compared to RHF, but their ratio drops below 1 with increasing system ory (CASPT2) are outlined in Sect. V. In Sect. VI we 10 use our implementation of DMRG-CASPT2 to study the length. The effect of electron correlation on γzzzz has since been further explored with various methods.11,12 longitudinal static second hyperpolarizability of all-trans polyenes, and compare the results with lower levels of One particular method, the density matrix renormal- theory. A summary of this work is given in Sect. VII. ization group (DMRG),13 is especially well-suited to study one-dimensional systems such as hydrogen chains and polyenes. It has therefore been used to assess the arXiv:1605.05526v2 [physics.chem-ph] 13 Jul 2016 9,14 accuracy of other methods to obtain γzzzz. II. MPS ANSATZ White invented DMRG in 1992 to solve the breakdown of Wilson’s numerical renormalization group for real- Starting from a full configuration interaction (FCI) space lattice systems.13 A few years later, Ostlund¨ and wavefunction for L spatial orbitals Rommer discovered its underlying wavefunction ansatz, the matrix product state (MPS).15 White and Martin ap- X n n n ...n plied DMRG for the first time to ab initio Hamiltonians |Ψi = C 1↑ 1↓ 2↑ L↓ 16 in 1999. With the effort of several quantum chemistry {niσ } 17–25 n n n n groups, DMRG has since become a standard method † 1↑ † 1↓ † 2↑ † L↓ aˆ aˆ aˆ ... aˆ |−i , (1) in computational chemistry. 1↑ 1↓ 2↑ L↓ 2 the MPS ansatz can be introduced. Without loss of ac- smaller than the virtual dimension it represents: curacy, the FCI coefficient tensor can be decomposed as reduced X reduced n n n n n n ...n n D = D C 1↑ 1↓ 2↑ 2↓ 3↑ 3↓ L↑ L↓ jLNLIL X n n n n n n n n jLNLIL = A[1] 1↑ 1↓ A[2] 2↑ 2↓ A[3] 3↑ 3↓ ...A[L] L↑ L↓ α0,α1 α1,α2 α2,α3 αL−1,αL X < Drepresented = (2j + 1)Dreduced . (4) {αi} L jLNLIL j N I = A[1]n1 A[2]n2 A[3]n3 ...A[L]nL , (2) L L L i L−i Another advantage of the symmetry-adaptation is the with size(αi) = min(4 , 4 ) to allow to represent every state of the Hilbert space. On the last line the short- ability to study excited states as ground states of different symmetry sectors, for example to calculate singlet-triplet hand ni for ni↑ni↓ was introduced. Matrices are written in boldface. The MPS ansatz can now be obtained by gaps. We also want to mention that the decomposition i L−i in Eq. (3) resembles the wavefunction in the multifacet truncating size(αi) to min(4 , 4 ,D). The parameter 53 D is called the virtual or bond dimension of the MPS. graphically contracted function method. With increasing virtual dimension D, a larger corner of When a non-singular matrix G and its inverse are the Hilbert space can be reached, and this parameter is inserted at a particular virtual bond, the MPS tensors therefore a handle on the convergence. change but the wavefunction does not: Because of the virtual dimension truncation, the MPS A[i]ni A[i + 1]ni+1 = (A[i]ni G) G−1A[i + 1]ni+1 ansatz is not invariant with respect to orbital rotations. ni ni+1 With arguments from quantum information theory, it can = Ae [i] Ae [i + 1] . (5) be shown that for one-dimensional orbital spaces, any ac- This gauge freedom allows to left-normalize curacy per length unit can be reached with a finite virtual X † dimension D, independent of the number of orbitals L, (A[i]ni ) A[i]ni = I (6) 43 even if L → ∞. In quantum chemistry, the orbital ac- ni tive spaces are often far from one-dimensional. With a or right-normalize proper choice and ordering of the active space orbitals, X † and by exploiting the symmetry group of the Hamilto- A[i]ni (A[i]ni ) = I (7) nian, efficient MPS decompositions of the FCI coefficient ni tensor can still be obtained.44 The symmetry group of the Hamiltonian typically the MPS tensors. Analogous expressions exist for re- 24 consists of SU(2) spin symmetry, U(1) particle number duced MPS tensors. Left- and right-normalization are symmetry, and the spatial point group symmetry P. used in the DMRG algorithm to remove the overlap ma- In chemps2 all three Hamiltonian symmetries are ex- trix from the local generalized eigenvalue problems. ploited, but P is limited to the abelian point groups with real-valued character tables.24 The non-abelian SU(2) spin symmetry was first exploited in condensed matter III. DMRG ALGORITHM and nuclear structure DMRG calculations,45–48 and later found its way to quantum chemistry.14,21,22,24,49,50 We First, a so-called micro-iteration or local eigenvalue also want to mention Refs. 51 and 52, where the non- problem of the DMRG algorithm is discussed. Subse- abelian point group D∞h is exploited for homonuclear quently, its place in the total DMRG algorithm is ex- diatomic molecules. plained. A micro-iteration corresponds to the optimiza- In order to exploit the symmetry, both the orbital and tion of two neighbouring MPS tensors, while all other virtual indices are written with the correct symmetry la- MPS tensors remain fixed. Suppose the MPS tensors bels. In what follows j and s denote spin, jz and sz de- belong to orbitals i and i + 1. A two-orbital tensor note spin projection, N denotes particle number, and I B[i]nini+1 = A[i]ni A[i + 1]ni+1 (8) denotes an irreducible representation of an abelian point group P with real-valued character table. In that case is constructed by contracting over the common virtual I ⊗ I = Itrivial for all I. Due to the Wigner-Eckart theo- index. It forms an initial guess for the local eigenvalue rem, the MPS tensors factorize into reduced tensors and problem which arises by varying the Lagrangian Clebsch-Gordan coefficients: L = hΨ(B[i]) | Hˆ | Ψ(B[i])i − E hΨ(B[i]) | Ψ(B[i])i (9) ssz NI z z z z z A[i]j j N I α ;j j N I α = hjLjLss | jRjRi nini+1 L L L L L R R R R R with respect to the parameters in B[i] . When all δ δ T [i]sNI . (3) MPS tensors to the left of orbital i are left-normalized NL+N,NR IL⊗I,IR jLNLILαL;jRNRIRαR and all MPS tensors to the right of orbital i+1 are right- The Clebsch-Gordan coefficients introduce block spar- normalized, this variation yields a standard eigenvalue sity and information compression. A reduced basis state sNI problem: (jLNLILαL) in the reduced MPS tensor T[i] repre- z X n n sents an entire multiplet (j j N I α ) in the MPS ten- i i+1 neinei+1 nini+1 L L L L L Hn n × B[i] = EB[i] . (10) ssz NI ei ei+1 sor A[i] . The reduced virtual dimension is therefore neinei+1 3

In practice, the application of the sparse effective Hamil- for mixed distributed and shared memory architectures. tonian on the left-hand side of Eq. (10) is realized by: MPI processes56 become responsible for certain operator and complementary operator pairs. This strategy X n n H i i+1 × B[i]neinei+1 was first introduced in Ref. 57. The contraction over neinei+1 separate reduced symmetry sectors (jLNLIL) is paral- nini+1 e e lelized in chemps2 with OMP threads.24,58 The par- X X n n n ni+1 = O[i, k] i B[i]ei ei+1 C[i, k] , (11) allelization over symmetry sectors was implemented in nei nei+1 k neinei+1 Ref. 22 for distributed memory architectures. The parallelization over operator and complementary operator 2 where the sum over k runs over O(L ) operator and com- pairs is independent from the parallelization over sym- plementary operator pairs. The lowest eigenvalue and metry sectors, and they give independent (multiplica- n n corresponding B[i] i i+1 tensor of the effective Hamilto- tive) speedups. Our hybrid parallelization is illustrated nian are typically obtained with the Lanczos or Davidson in Fig. 1. The caption of Fig. 1 contains all details of algorithm. This tensor is then decomposed with a singu- the calculation. We also want to mention a third par- lar value decomposition into a left-normalized MPS ten- allelization strategy for DMRG, which involves multiple n n sor A[i] i , a right-normalized MPS tensor A[i + 1] i+1 , simultaneous sweeps.59 and 4 × min(size(αi−1), size(αi+1)) Schmidt values Λ:

nini+1 ni ni+1 B[i] = A[i] ΛA[i + 1] . (12) IV. CONTRACTION OF THE GENERALIZED FOCK MATRIX AND THE 4-RDM When the number of Schmidt values is larger than D, only the D largest ones are kept. This local eigenvalue equation is solved during so- The 2-, 3-, and 4-particle reduced density matrices (2-, called left and right sweeps or macro-iterations. During 3-, and 4-RDM) are defined as a left (right) sweep, the D largest Schmidt values are 2 X † † Γpq;wx = haˆpσaˆqτ aˆxτ aˆwσi , (15) ni ni+1 absorbed into the MPS tensor A[i] (A[i + 1] ). In στ the next step, the MPS tensors of orbitals i − 1 and i 3 X † † † (i + 1 and i + 2) are then optimized. The DMRG sweeps Γpqr;wxy = haˆpσaˆqτ aˆrυaˆyυaˆxτ aˆwσi , (16) continue until the energy and/or the wavefunction does στυ 4 X † † † † not change anymore or a maximum number of sweeps is Γpqrs;wxyz = haˆpσaˆqτ aˆrυaˆsφaˆzφaˆyυaˆxτ aˆwσi ,(17) reached. The virtual dimension truncation D is stepwise στυφ increased after several sweeps. where the Greek letters denote spin-projections and the Analogous expressions exist for the reduced two-orbital Latin letters spatial orbitals. The computational cost to tensor, its corresponding effective Hamiltonian equa- obtain the N-RDM as the expectation value of an MPS tion, and the subsequent decomposition into left- and is O(LN+1D3 + L2N D2).27,32–34,60 In chemps2, the 2- 24 right-normalized reduced MPS tensors. For more de- and 3-RDM are implemented as described in these works. tails on the DMRG algorithm we refer the reader to 1 Renormalized operators of three spin- 2 second quantized Refs. 18 and 44. Per sweep, its computational cost is operators (13)-(14) are needed for the 3-RDM. They cou- 3 3 4 2 2 2 O(L D +L D ), its memory requirement O(L D ), and ple to two spin- 1 and one spin- 3 renormalized operators: its disk requirement O(L3D2). In what follows, D will 2 2 1 1 1 1 1 3 always denote the reduced virtual dimension. ⊗ ⊗ = ⊕ ⊕ . (18) The operators 2 2 2 2 2 2 The full twelvefold permutation symmetry of the 3-RDM ˆ† † biσ =a îσ (13) is taken into account, as well as the Hermitian conju- 1 24 ˆ 2 −σ gation equivalence of renormalized operators and the biσ = (−1) aî −σ (14) fermionic anti-commutation relations. The scaling of the are doublet irreducible tensor operators, belonging to computational cost of the 3-RDM with system size is il- 1 z respectively row (s = 2 , s = σ, N = 1,Ii) of ir- lustrated in Fig. 2. The caption of Fig. 2 contains all 1 6 2 reducible representation (s = 2 ,N = 1,Ii) and row details of the calculation. As can be observed, O(L D ) 1 z is the dominant contribution in practice. (s = 2 , s = σ, N = −1,Ii) of irreducible represen- 1 The generalized Fock matrix will be introduced in tation (s = 2 ,N = −1,Ii). The Wigner-Eckart theo- rem can therefore be exploited for both the MPS tensors Sect. V. This symmetric matrix has two spatial orbital and the second quantized operators. The resulting SU(2) indices and is diagonal in the irreducible representations: Clebsch-Gordan coefficients can be removed by summing Fpq = Fqp = FpqδIp,Iq . (19) over the common multiplets, and only Wigner 6-j and 9-j symbols remain in a spin-adapted DMRG code.24 The contraction of the generalized Fock matrix with the We have reviewed the symmetry-adaption and the use 4-RDM of the active space is required for CASPT2: of operator and complementary operator pairs in or- 4 X 4 F.Γ pqr;wxy = FszΓpqrs;wxyz. (20) der to explain the hybrid parallelization in chemps2 sz 4

(a) (b) 16 27 chemps2:omp t = 294 s chemps2:omp [1 mpi process] 14 chemps2:mpi 26 chemps2:hybrid [16 threads/proc] block:mpi block:mpi [without omp] 12 25 t = 65 s 10 24 t = 336 s 8 23 Speedup Speedup 6 t = 817 s t = 640 s 22 4

1 2 2

0 20 0 2 4 6 8 10 12 14 16 20 21 22 23 24 25 26 27 # cores # cores

FIG. 1. Speedup achieved with the hybrid parallelization of chemps2, compared with the MPI parallelization of block 54 version 1.1-alpha. The system under study is H2O in Roos’ ANO DZ basis set (L = 41) with atom positions O(0,0,0) and H(±0.790689766, 0, 0.612217330) in A.˚ 55 All calculations were performed with reduced virtual dimension D = 1000. RHF orbitals were used, ordered per irreducible representation of C2v according to their single-particle energy, and the irreducible representations were ordered as {A1, A2, B1, and B2}. The residual norm tolerance for the Davidson algorithm was set to 10−4. Note that in block the square of this parameter needs to be passed. Each node has a dual Intel Xeon Sandy Bridge E5-2670 (total of 16 cores at 2.6 GHz) and 64 GB of memory. The nodes are connected with FDR InﬁniBand. The renormalized operators were stored on GPFS in order to achieve high disk bandwidths. Both codes and all depending libraries were compiled with the Intel MPI compiler version 2015.1.133. The Intel Math Kernel Library version 11.2.1.133 was used for BLAS and LAPACK routines. (a) Comparison of pure MPI and OMP speedups on a single node. Wall times per sweep are indicated for 16 cores (in seconds). (b) Illustration of the hybrid parallelization of chemps2. For 16 cores and less, one MPI process with several OMP threads is used. For 32 cores and more, several MPI processes each with 16 OMP threads are used. Wall times per sweep are indicated (in seconds).

One way to avoid the implementation and computation of where the spatial orbitals s and z belong to the same irre- the full 4-RDM is to work in the pseudocanonical orbital ducible representation (Is = Iz). With |Ψ0i the reference basis which diagonalizes the generalized Fock matrix: ˆ P † wavefunction and Epq = σ aˆpσaˆqσ, we define ‘excited’ wavefunctions as: Fpq = Fppδp,q. (21) h i Kurashige and Yanai describe an efficient contraction |sz, α, βi = α Eˆsz + Ezs + β |Ψ0i , (23) of the 4-RDM with the pseudocanonical Fock matrix, in which two of the eight 4-RDM indices are always which have the same symmetry as the reference wave- 32 identical. This is in general a good strategy for smaller function if Is = Iz. We denote the 3-RDM of the (un- active spaces, or molecules with a large point group. For normalized) excited wavefunctions as all-trans polyenes, however, it is better to use a localized and ordered orbital basis because the reduced virtual di- Γ(sz, α, β)3 = mension D can then be orders of magnitude smaller. pqr;wxy X † † † Another option to avoid the implementation and hsz, α, β | aˆpσaˆqτ aˆrυaˆyυaˆxτ aˆwσ | sz, α, βi . (24) computation of the full 4-RDM is to use a cumulant στυ 33 approximation. With this approximation, imaginary 3 3 level shifts become necessary, and even then potential In this notation Γ(sz, 0, 1)pqr;wxy = Γpqr;wxy, the 3-RDM energy surfaces can become quite rugged.33 In order to of the reference wavefunction. The following identity compute the longitudinal static second hyperpolarizabil- holds: ity, accurate finite differences need to be calculated, and 2 Γ4 + Γ4 + Γ3 we have observed that the cumulant approximation of pqrs;wxyz pqrz;wxys pqr;wxy 3 3 the 4-RDM yields meaningless results for this purpose. = Γ(sz, 1, 1)pqr;wxy − Γ(sz, 1, 0)pqr;wxy We adopt another strategy to avoid the implementa- 3 3 3 − δs,pΓ − δs,qΓ − δs,rΓ tion of the full 4-RDM. Because the generalized Fock ma- zqr;wxy pzr;wxy pqz;wxy 3 3 3 trix is symmetric, the following sum of 4-RDM elements − δs,wΓpqr;zxy − δs,xΓpqr;wzy − δs,yΓpqr;wxz can be used as well: 3 3 3 − δz,pΓsqr;wxy − δz,qΓpsr;wxy − δz,rΓpqs;wxy 4 4 3 3 3 Γpqrs;wxyz + Γpqrz;wxys, (22) − δz,wΓpqr;sxy − δz,xΓpqr;wsy − δz,yΓpqr;wxs. (25) 5

104 loop over single-particle excitations which generates Data Fit t L 6. 02 ˆ ˆ ˆ ∝ |Ψpw;qxi = Epw |Ψqxi = EpwEqx |Ψ0i . (27)

The 2-RDM elements are obtained by taking overlaps 103 with the reference wavefunction:

2 Γpq,wx = hΨ0 | Ψpw;qxi − δq,w hΨ0 | Ψpxi . (28)

102 Wall time [seconds] V. DMRG-CASPT2

In the complete active space self consistent field 101 method (CASSCF), the spatial orbitals are divided into 6 7 8 9 10 11 12 13 n = L core, active, and virtual orbitals. The core orbitals are 2 doubly occupied, while the virtual orbitals remain empty. By taking the Coulomb and exchange interactions with FIG. 2. Scaling of the wall time to calculate the 3-RDM the electrons in the core orbitals into account, an effec- for the (2n, 2n) active spaces of all-trans polyenes C H 2n 2n+2 tive active space Hamiltonian can be constructed, and for a reduced virtual dimension D = 1000. All computations were performed with chemps2 on a single node with a dual its desired eigenstate can be computed with FCI. The Intel Xeon Sandy Bridge E5-2670 (total of 16 cores at 2.6 gradient and Hessian of the energy with respect to rota- GHz) and 64 GB of memory. The geometry of the polyenes is tions between the three orbital spaces can be computed optimized with B3LYP/cc-pVDZ. The active spaces are fully based on the 2-RDM of the active space solution.62 The converged with DMRG-SCF(2n, 2n)/cc-pVDZ. Localized or- three orbital spaces are then optimized with the Newton- bitals are constructed with the Edmiston-Ruedenberg algo- Raphson algorithm, or its augmented Hessian variant.63 rithm, and ordered according to the topology of the molecule An important question is the selection of the active or- with the Fiedler vector of the exchange matrix. The scal- bital space. We want to mention Refs. 29, 64, and 65, ing of the wall time has to be compared with the theoretical which shed new light on this subject. computational scaling O(L4D3 + L6D2). The equations in Ref. 62 depend solely on the active space 2-RDM, and any method which can compute this quantity to high accuracy can be used as active Eq. (25) shows that the required 4-RDM contributions space solver. Recently, unbiased RDMs have been ob- can be obtained with O(L2) 3-RDM calculations, with tained with FCI quantum Monte Carlo (FCIQMC),66 total computational cost O(L6D3+L8D2). This is higher and a corresponding FCIQMC-CASSCF algorithm was than the theoretical optimum O(L5D3 + L8D2) for the developed.67 We also want to mention a CASSCF variant 4-RDM, but as illustrated in Fig. 2 for the 3-RDM, the without underlying wavefunction ansatz, based on the dominant contribution for our 4-RDM calculation will be variational optimization of the 2-RDM.68 With DMRG as O(L2 × L6D2). active space solver, the method is called DMRG-CASSCF 26–29 The excited wavefunctions |Ψ1i = |sz, α, βi are decom- or DMRG-SCF. posed into spin-adapted MPS in chemps2 by minimiza- Systems exhibit static correlation when multiple Slater tion of the Hylleraas functional determinants are required for a qualitatively accurate de- scription. In quantum chemistry, the set of important L = hΨ | Ψ i − 2 hΨ | α Eˆ + E + β | Ψ i , (26) orbitals, in which the occupation changes over the dom- 1 1 1 sz zs 0 inant Slater determinants, is typically small. These orbitals form the actice space, and the static correlation in a manner entirely similar to Ref. 37. The sweep can be resolved with CASSCF, FCIQMC-CASSCF, or algorithm is only performed between orbitals min(s, z) DMRG-SCF. Due to the Coulomb repulsion between the and max(s, z) as the MPS tensors outside of this range electrons, the core and virtual orbitals show small devia- do not change. The cost of one full optimization is tions from doubly occupied and empty, respectively. The O(LD2 + LD3) = O(LD3), i.e. entirely negligible com- associated energy contribution is called the dynamic cor- pared to the subsequent 3-RDM calculation. After sub- relation. With DMRG as active space solver, it can be mission of our manuscript, a similar strategy in the resolved with canonical transformation theory,30,31 inter- block code to compress the perturber wavefunctions for nally contracted perturbation theory,32–34 or a configu- DMRG-NEVPT2 has been brought to our attention.34,61 ration interaction expansion.35,36 Alternatively, the per- The proposed strategy resembles the strategy in a FCI turbation wavefunctions can also be solved within the code to calculate the N-RDM, which is driven by a rou- DMRG framework.37–39 tine to compute |Ψpw;αi = Eˆpw |Ψαi. The computation We have implemented internally contracted com- of the 2-RDM, for example, is realized by a nested for- plete active space second order perturbation theory 6

69,70 32 (CASPT2) with DMRG as active space solver. It The overlap matrix hΨwx;yz | Ψpq;rsi is block-diagonal in is based on the generalized Fock operator: the different excitation types (A to H). It is diagonalized, small eigenvalues are discarded, and Eq. (43) is trans- X Fˆ = FpqEˆpq, (29) formed to pq X (F − E δ ) C = −V , (44) with matrix elements αβ 0 α,β β α β 1 X h i h i F = haˆ H,ˆ aˆ† − aˆ† H,ˆ aˆ i (30) pq 2 pσ qσ pσ qσ with F diagonal for two excitations |αi and |βi of the σ same type (A to H). In chemps2, the coefficients Cα are X 1 = t + hEˆ i (pq|rs) − (pr|qs) , (31) solved with the conjugate gradient algorithm. We use pq rs 2 rs the initial guess where tpq and (pq|rs) are the usual one- and two-electron ini Vα integrals. Cα = − . (45) Fαα − E0 The full Hilbert space H is split up into four parts: For the excitation types A and C, the contraction of the H = V0 ⊕ VK ⊕ VSD ⊕ VTQ... (32) generalized Fock matrix with 4-RDM of the active space V contains only the CASSCF solution |Ψ i. V is the of the CASSCF solution is needed. If the active space or- 0 0 K bitals in the DMRG algorithm are not pseudocanonical, space spanned by all possible active space excitations on 1 2 3 4 top of |Ψ i which are orthogonal to V . Wavefunctions in we rotate Γ ,Γ ,Γ , and F.Γ to the pseudocanonical 0 0 orbital basis before building the required intermediates VK have the same core and virtual orbitals as |Ψ0i, with the same occupation. V contains all single and double to solve Eq. (44). We also couple the excitation types B, SD E, F, G, and H to singlet and triplet excitations: particle excitations on top of |Ψ0i which are orthogonal to V0 ⊕ VK. With the indices ij for core orbitals, tuv for B singlet : EˆtiEûj + EˆtjEûi |Ψ0i , (46) active orbitals, and ab for virtual orbitals, VSD is spanned by the following excitation types: B triplet : EˆtiEûj − EˆtjEûi |Ψ0i , (47) A: EˆtiEûv |Ψ0i , (33) to make F more sparse.69,70 B: EˆtiEûj |Ψ0i , (34) ˆ ˆ In order to mitigate intruder state problems, we have C: EatEuv |Ψ0i , (35) implemented the imaginary level shift71 and the ioniza- 72 D: EâiEˆtu |Ψ0i , EˆtiEâu |Ψ0i , (36) tion potential - electron affinity (IPEA) shift. For the E: Eˆ Eˆ |Ψ i , (37) latter, the matrix on the left-hand side in Eq. (43) is ti aj 0 shifted with F: EâtEˆbu |Ψ0i , (38) ˆ ˆ G: EaiEbt |Ψ0i , (39) hΨwx;yz | Fˆ | Ψpq;rsi += H: Eˆ Eˆ |Ψ i . (40) IPEA ai bj 0 δ δ δ δ hΨ | Ψ i p,w q,x r,y s,z 2 wx;yz pq;rs And VTQ.. is the remainder of H. The zeroth order Hamiltonian for internally contracted CASPT2 is × 4 + hEˆppi − hEˆqqi + hEˆrri − hEˆssi . (48)

Hˆ0 = Pˆ0FˆPˆ0 +PˆKFˆPˆK +PˆSDFˆPˆSD +PˆTQ..FˆPˆTQ.., (41) where Pˆ is the projector onto V . The first order VI. LONGITUDINAL STATIC SECOND X X HYPERPOLARIZABILITY OF POLYENES wavefunction |Ψ1i for internally contracted CASPT2 is spanned by a linear combination over VSD: Static second hyperpolarizabilities can be obtained X 73 |Ψ1i = Cpq;rsEˆpqEˆrs |Ψ0i with the finite field method. When an electric field E is

pq;rs∈VSD applied in the z-direction, the Hamiltonian becomes X = Cpq;rs |Ψpq;rsi . (42) Hˆ = Hˆ0 + Ez. (49) pq;rs∈VSD The coeﬃcients can be found by solving The static second hyperpolarizability in the z-direction can be calculated as the fourth order derivative of the X eigenvalue E of Hˆ with respect to the electric ﬁeld: hΨwx;yz | Hˆ0 − E0 | Ψpq;rsi Cpq;rs pq;rs∈V SD ∂4E = − hΨwx;yz | Hˆ | Ψ0i . (43) γzzzz = − 4 . (50) ∂E E→0 7

For centrosymmetric all-trans polyenes, the fourth order and the estimated exponent for power law behaviour derivative can be approximated with the finite difference log(γ (n)) − log(γ (n − 1)) a(n) ≈ zzzz zzzz , (54) −6E(0) + 8E(E) − 2E(2E) log(n) − log(n − 1) γ (E) = , (51) zzzz E4 are shown in Figs. 3(a) and 3(b), respectively. when the origin is chosen in the center of the polyene. Of all methods considered, DMRG-SCF and DMRG- We calculate γzzzz(E) for E ∈ {0.001, 0.002, 0.004} a.u. CASPT2 yield the lowest incremental longitudinal static and extrapolate the finite differences to zero field with a second hyperpolarizabilities ∆γzzzz(n) and exponents least-squares fit to a(n). An experimental value of a = 3.2 was determined 2 4 for polyenes with length n = 4 − 16, with a linear fit on γzzzz(E) = γzzzz(0) + c1E + c2E , (52) a double logarithmic scale.80 We have performed a similar analysis for our computational data in Fig. 4. RHF, with c2 c1. The odd powers in the electric field vanish in Eq. (52) because of the centrosymmetry. MP2, CCSD, DMRG-SCF, and DMRG-CASPT2 yield The values of E have to be chosen with care.14 For the exponents a = 4.5, a = 4.5, a = 4.0, a = 3.2, and too large fields, higher order effects come into play in a = 3.7, respectively. While the DMRG-SCF exponent Eq. (52). The ab initio calculations yield energies with a corresponds best to the experimental result, the calcula- certain precision, due to either the convergence threshold tions have been performed in vacuum and with a modest or the finite precision arithmetic. For too small fields, the basis set, and should therefore be treated with care. The ab initio calculations mainly allow to compare dif- relative error on γzzzz(E) will therefore be too large. All calculations were performed in the cc-pVDZ ba- ferent levels of theory. As noted in the introduction, CCSD yields larger (smaller) longitudinal static second sis. The geometries of all-trans polyenes C2nH2n+2 hyperpolarizabilities than RHF for short (long) polyenes. [n = 4 − 12] with C2h symmetry were optimized with the B3LYP functional in psi4.74 The carbon atoms of The RHF and MP2 values and power law exponents the polyenes form two parallel rows, and the electric are substantially reduced with DMRG-SCF and DMRG- fields have the same direction. The electric fields break CASPT2, respectively. Our calculations hence point out the importance of static correlation for the non-linear the C2h symmetry to Cs symmetry, and the latter was used in the ab initio calculations. The RHF, MP2, and optical properties of conjugated molecules. CCSD calculations were performed with pyscf,75 as well The CCSD and DMRG-CASPT2 (incremental) γzzzz as the calculation of the electron integrals for chemps2. differ by less than a factor 2. It remains an open question For the DMRG-SCF and DMRG-CASPT2 calculations, to which extent CCSD covers the static correlation incor- a (2n, 2n) π-electron active space was used. The active porated in DMRG-CASPT2. If CCSD only captures a space orbitals were localized with an augmented Hes- fraction of the static correlation, multi-reference coupled sian Newton-Raphson implementation of the Edmiston- cluster theory (MRCC) can significantly alter the (incre- Ruedenberg algorithm.76,77 They were ordered accord- mental) γzzzz compared to DMRG-CASPT2, in analogy ing to the one-dimensional topology of the polyene, by to the single-reference calculations. means of the Fiedler vector of the exchange matrix.77,78 The DMRG calculations were performed with a reduced VII. SUMMARY virtual dimension D = 1000, and a residual norm threshold 10−10 for the Davidson algorithm. With these pa- In Sect. II and III the matrix product state (MPS) rameters, both the ground state |Ψ0i of the active space Hamiltonian and the corresponding excited wavefunc- ansatz and the density matrix renormalization group tions |sz, α, βi are indistinguishable from FCI. (DMRG) algorithm were reviewed. A hybrid parallelization of DMRG for mixed distributed and shared memory In order to obtain accurate finite differences γzzzz(E) and corresponding extrapolations (52) to zero field, we architectures was also described. Processes become re- have observed that the cumulant approximation, the sponsible for certain operator and complementary oper- imaginary level shift, and the IPEA shift cannot be used. ator pairs, while the contractions over separate reduced symmetry sectors are parallelized by threads. Because The IPEA shift even yields negative γzzzz. a(n) the two parallelization strategies are independent, they The power law behaviour γzzzz(n) ∝ n is often assumed.79 Consider a small local electric field which show independent (multiplicative) speedups. causes a response of a certain length scale. For polyenes In Sect. IV our strategy to contract the generalized smaller than this length scale, the possibility for response Fock matrix with the 4-particle reduced density matrix opens up with polyene length n, and a rapid increase of (4-RDM) of the reference wavefunction was explained. The required 4-RDM elements can be obtained from the γzzzz with n is observed (a(n) 1). For polyenes much 3-RDMs of ‘excited’ wavefunctions, formed by symmetry- larger than this length scale, γzzzz scales linearly with n (a(n) → 1). The incremental longitudinal static second conserving single-particle excitations on top of the ref- hyperpolarizability erence wavefunction. These excited wavefunctions are decomposed into spin-adapted MPSs at negligible cost. ∆γzzzz(n) = γzzzz(n) − γzzzz(n − 1), (53) The total computational cost of our strategy scales as 8

(a) (b) 5.0

4.5 107

4.0 ] . u . ) a [ n

( 3.5 ) a n (

γ 106 ∆ 3.0

RHF DMRG-SCF RHF DMRG-SCF 2.5 MP2 DMRG-CASPT2 MP2 DMRG-CASPT2 CCSD CCSD 105 2.0 4 5 6 7 8 9 10 11 12 13 4 5 6 7 8 9 10 11 12 13 L L n = 2 n = 2

FIG. 3. The longitudinal static second hyperpolarizability of all-trans polyenes C2nH2n+2. The geometries are optimized with B3LYP/cc-pVDZ and all calculations are performed with the cc-pVDZ basis. (a) The incremental longitudinal static second hyperpolarizability ∆γzzzz(n) = γzzzz(n) − γzzzz(n − 1). (b) The estimatation in Eq. (54) for the power law exponent.

108 calculations hence point out the importance of static correlation for the non-linear optical properties of conjugated molecules.

107 ACKNOWLEDGEMENTS ] . u . a [

) S.W. would like to thank Ward Poelmans and Ewan n (

γ Higgs for their help with improving the disk bandwidths 106 in chemps2; Jun Yang and Sandeep Sharma for their help with the installation and execution of block; Toru RHF a=4.5 DMRG-SCF a=3.2 Shiozaki and Takeshi Yanai for insightful conversations MP2 a=4.5 DMRG-CASPT2 a=3.7 on CASPT2; and Peter Limacher for stimulating dis- CCSD a=4.0 105 cussions on the second hyperpolarizability. S.W. also 4 5 6 7 8 9 10 11 12 13 gratefully acknowledges a postdoctoral fellowship from n = L 2 the Research Foundation Flanders (Fonds Wetenschap- pelijk Onderzoek Vlaanderen). The computational re- FIG. 4. The power law scaling of the longitudinal static sources (Stevin Supercomputer Infrastructure) and ser- second hyperpolarizability of all-trans polyenes C2nH2n+2 is vices used in this work were provided by the VSC (Flem- determined with a linear fit on a double logarithmic scale. ish Supercomputer Center), funded by Ghent University, the Hercules Foundation and the Flemish Government - department EWI. 6 3 8 2 O(L D + L D ). In practice, the dominant term is 1B. Champagne, E. A. Perpète, S. J. A. van Gisbergen, E.-J. O(L8D2), which is the same as for the theoretical op- Baerends, J. G. Snijders, C. Soubra-Ghaoui, K. A. Robins, and timum O(L5D3 + L8D2). B. Kirtman, J. Chem. Phys. 109, 10489 (1998). 2S. J. A. van Gisbergen, P. R. T. Schipper, O. V. Gritsenko, E. J. Our implementation of DMRG-CASPT2 was outlined Baerends, J. G. Snijders, B. Champagne, and B. Kirtman, Phys. in Sect. V. We have studied the longitudinal static sec- Rev. Lett. 83, 694 (1999). 3B. Champagne, E. A. Perpète,D. Jacquemin, S. J. A. van Gis- ond hyperpolarizability of all-trans polyenes C2nH2n+2 [n = 4 − 12], obtained in the cc-pVDZ basis with the bergen, , E.-J. Baerends, C. Soubra-Ghaoui, , K. A. Robins, and B. Kirtman, J. Phys. Chem. A 104, 4755 (2000). finite field method, in Sect. VI. The results of three 4P. Mori-Sánchez, Q. Wu, and W. Yang, J. Chem. Phys. 119, single-reference methods (RHF, MP2, and CCSD) were 11001 (2003). compared with the results of DMRG-SCF and DMRG- 5M. Kamiya, H. Sekino, T. Tsuneda, and K. Hirao, J. Chem. CASPT2, using a (2n, 2n) π-electron active space. The Phys. 122, 234111 (2005). 6H. Sekino, Y. Maeda, M. Kamiya, and K. Hirao, J. Chem. Phys. multi-reference methods yield substantially lower values 126, 014107 (2007). and exponents for the longitudinal static second hyperpo- 7J.-W. Song, M. A. Watson, H. Sekino, and K. Hirao, J. Chem. larizability than their single-reference counterparts. Our Phys. 129, 024117 (2008). 9

8R. J. Bartlett and G. D. Purvis, Phys. Rev. A 20, 1313 (1979). 47S. Pittel and N. Sandulescu, Phys. Rev. C 73, 014301 (2006). 9 Q. Li, L. Chen, Q. Li, and Z. Shuai, Chem. Phys. Lett. 457, 276 48A. I. Tóth,C. P. Moca, O. Legeza, and G. Zaránd,Phys. Rev. (2008). B 78, 245109 (2008). 10 P. A. Limacher, Q. Li, and H. P. Lüthi,J. Chem. Phys. 135, 49S. Sharma and G. K.-L. Chan, J. Chem. Phys. 136, 124121 014111 (2011). (2012). 11 M. Nakano, T. Minami, H. Fukui, R. Kishi, Y. Shigeta, and 50S. Keller and M. Reiher, J. Chem. Phys. 144, 134101 (2016). B. Champagne, J. Chem. Phys. 136, 024315 (2012). 51S. Sharma, T. Yanai, G. H. Booth, C. J. Umrigar, and G. K.-L. 12 J. B. Robinson and P. J. Knowles, J. Chem. Phys. 137, 054301 Chan, J. Chem. Phys. 140, 104112 (2014). (2012). 52S. Sharma, J. Chem. Phys. 142, 024107 (2015). 13 S. R. White, Phys. Rev. Lett. 69, 2863 (1992). 53R. Shepard, G. Gidofalvi, and S. R. Brozell, J. Chem. Phys. 14 S. Wouters, P. A. Limacher, D. Van Neck, and P. W. Ayers, J. 141, 064105 (2014). Chem. Phys. 136, 134110 (2012). 54S. Sharma and G. K.-L. Chan, (2016), block version 1.1- 15 ¨ S. Ostlund and S. Rommer, Phys. Rev. Lett. 75, 3537 (1995). alpha, https://github.com/sanshar/block/releases/tag/v1. 16 S. R. White and R. L. Martin, J. Chem. Phys. 110, 4127 (1999). 1-alpha. 17 A. O. Mitrushenkov, G. Fano, F. Ortolani, R. Linguerri, and 55G. K.-L. Chan and M. Head-Gordon, J. Chem. Phys. 118, 8551 P. Palmieri, J. Chem. Phys. 115, 6815 (2001). (2003). 18 G. K.-L. Chan and M. Head-Gordon, J. Chem. Phys. 116, 4462 56Message Passing Interface, http://www.mpi-forum.org/. (2002). 57G. K.-L. Chan, J. Chem. Phys. 120, 3172 (2004). 19 O. Legeza, J. Röder, and B. A. Hess, Phys. Rev. B 67, 125114 58Open Multi-Processing API, http://openmp.org/wp/. (2003). 59E. M. Stoudenmire and S. R. White, Phys. Rev. B 87, 155137 20 G. Moritz, B. A. Hess, and M. Reiher, J. Chem. Phys. 122, (2013). 024107 (2005). 60D. Zgid and M. Nooijen, J. Chem. Phys. 128, 144115 (2008). 21 D. Zgid and M. Nooijen, J. Chem. Phys. 128, 014107 (2008). 61S. Guo and G. K.-L. Chan, (2016), http://sanshar.github.io/ 22 Y. Kurashige and T. Yanai, J. Chem. Phys. 130, 234114 (2009). Block/examples-with-pyscf.html#dmrg-nevpt2. 23 H.-G. Luo, M.-P. Qin, and T. Xiang, Phys. Rev. B 81, 235129 62P. E. M. Siegbahn, J. Almlöf,A. Heiberg, and B. O. Roos, J. (2010). Chem. Phys. 74, 2384 (1981). 24 S. Wouters, W. Poelmans, P. W. Ayers, and D. Van Neck, Com- 63A. Banerjee, N. Adams, J. Simons, and R. Shepard, J. Phys. put. Phys. Commun. 185, 1501 (2014). Chem. 89, 52 (1985). 25 S. Keller, M. Dolfi, M. Troyer, and M. Reiher, J. Chem. Phys. 64S. Keller, K. Boguslawski, T. Janowski, M. Reiher, and P. Pulay, 143, 244118 (2015). J. Chem. Phys. 142, 244104 (2015). 26 D. Zgid and M. Nooijen, J. Chem. Phys. 128, 144116 (2008). 65C. J. Stein and M. Reiher, J. Chem. Theory Comput. 12, 1760 27 D. Ghosh, J. Hachmann, T. Yanai, and G. K.-L. Chan, J. Chem. (2016). Phys. 128, 144117 (2008). 66C. Overy, G. H. Booth, N. S. Blunt, J. J. Shepherd, D. Cleland, 28 T. Yanai, Y. Kurashige, D. Ghosh, and G. K.-L. Chan, Int. J. and A. Alavi, J. Chem. Phys. 141, 244117 (2014). Quant. Chem. 109, 2178 (2009). 67G. L. Manni, S. D. Smart, and A. Alavi, J. Chem. Theory Com- 29 S. Wouters, T. Bogaerts, P. Van Der Voort, V. Van Speybroeck, put. 12, 1245 (2016). and D. Van Neck, J. Chem. Phys. 140, 241103 (2014). 68J. Fosso-Tande, T.-S. Nguyen, G. Gidofalvi, and I. A. Eu- 30 T. Yanai, Y. Kurashige, E. Neuscamman, and G. K.-L. Chan, gene DePrince, J. Chem. Theory Comput. in press (2016), J. Chem. Phys. 132, 024105 (2010). 10.1021/acs.jctc.6b00190. 31 T. Yanai, Y. Kurashige, E. Neuscamman, and G. K.-L. Chan, 69K. Andersson, P. A. Malmqvist, B. O. Roos, A. J. Sadlej, and Phys. Chem. Chem. Phys. 14, 7809 (2012). K. Wolinski, J. Phys. Chem. 94, 5483 (1990). 32 Y. Kurashige and T. Yanai, J. Chem. Phys. 135, 094104 (2011). 70K. Andersson, P. Malmqvist, and B. O. Roos, J. Chem. Phys. 33 Y. Kurashige, J. Chalupský,T. N. Lan, and T. Yanai, J. Chem. 96, 1218 (1992). Phys. 141, 174111 (2014). 71N. Forsberg and P.-A. Malmqvist, Chem. Phys. Lett. 274, 196 34 S. Guo, M. A. Watson, W. Hu, Q. Sun, and G. K.-L. Chan, J. (1997). Chem. Theory Comput. 12, 1583 (2016). 72G. Ghigo, B. O. Roos, and P.-A. Malmqvist, Chem. Phys. Lett. 35 M. Saitow, Y. Kurashige, and T. Yanai, J. Chem. Phys. 139, 396, 142 (2004). 044118 (2013). 73H. A. Kurtz, J. J. P. Stewart, and K. M. Dieter, J. Comput. 36 M. Saitow, Y. Kurashige, and T. Yanai, J. Chem. Theory Com- Chem. 11, 82 (1990). put. 11, 5120 (2015). 74J. M. Turney, A. C. Simmonett, R. M. Parrish, E. G. Hohen- 37 S. Sharma and G. K.-L. Chan, J. Chem. Phys. 141, 111101 stein, F. A. Evangelista, J. T. Fermann, B. J. Mintz, L. A. Burns, (2014). J. J. Wilke, M. L. Abrams, N. J. Russ, M. L. Leininger, C. L. 38 S. Sharma and A. Alavi, J. Chem. Phys. 143, 102815 (2015). Janssen, E. T. Seidl, W. D. Allen, H. F. Schaefer, R. A. King, 39 S. Sharma, G. Jeanmairet, and A. Alavi, J. Chem. Phys. 144, E. F. Valeev, C. D. Sherrill, and T. D. Crawford, WIREs Com- 034103 (2016). putational Molecular Science 2, 556 (2012). 40 Y. Kurashige, G. K.-L. Chan, and T. Yanai, Nat. Chem. 5, 660 75Q. Sun, (2016), pyscf: Python module for quantum chemistry, (2013). https://github.com/sunqm/pyscf. 41 S. Sharma, K. Sivalingam, F. Neese, and G. K.-L. Chan, Nat. 76C. Edmiston and K. Ruedenberg, Rev. Mod. Phys. 35, 457 Chem. 6, 927 (2014). (1963). 42 J. Chalupský, T. A. Rokob, Y. Kurashige, T. Yanai, E. I. 77S. Wouters, W. Poelmans, S. De Baerdemacker, P. W. Ayers, Solomon, L. Rul´ısek, and M. Srnec, J. Am. Chem. Soc. 136, and D. Van Neck, Comput. Phys. Commun. 191, 235 (2015). 15977 (2014). 78G. Barcza, O. Legeza, K. H. Marti, and M. Reiher, Phys. Rev. 43 M. B. Hastings, J. Stat. Mech.: Theory Exp. 2007, P08024 A 83, 012508 (2011). (2007). 79J. L. Bredas, C. Adant, P. Tackx, A. Persoons, and B. M. Pierce, 44 S. Wouters and D. Neck, Eur. Phys. J. D 68, 272 (2014). Chem. Rev. 94, 243 (1994). 45 G. Sierra and T. Nishino, Nucl. Phys. B 495, 505 (1997). 80G. S. W. Craig, R. E. Cohen, R. R. Schrock, R. J. Silbey, G. Puc- 46 I. P. McCulloch and M. Gulácsi,EPL (Europhysics Letters) 57, cetti, I. Ledoux, and J. Zyss, J. Am. Chem. Soc. 115, 860 (1993). 852 (2002).