The HANDE-QMC Project: Open-Source Stochastic Quantum Chemistry from the Ground State Up

James S. Spencer, Nick S. Blunt,§§ Seonghoon Choi,§§ Jiří Etrych,§§ Maria-Andreea Filip,§§ W. M. C. Foulkes,§§ Ruth S.T. Franklin,§§ Will J. Handley,§§ Fionn D. Malone,§§ Verena A. Neufeld,§§ Roberto Di Remigio,§§ Thomas W. Rogers,§§ Charles J.C. Scott,§§ James J. Shepherd,§§ William A. Vigor,§§ Joseph Weston,§§ RuQing Xu,§§ and Alex J.W. Thom

§§ Authors listed alphabetically

Dept. of Physics, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
Dept. of Materials, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
Dept. of Chemistry, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, United Kingdom
St John's College, St John's Street, Cambridge, CB2 1TP, United Kingdom
Gonville & Caius College, Trinity Street, Cambridge, CB2 1TA, UK
Astrophysics Group, Cavendish Laboratory, Cambridge, CB3 0HE, UK
Kavli Institute for Cosmology, Madingley Road, Cambridge, CB3 0HA, UK
Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Tromsø - The Arctic University of Norway, N-9037 Tromsø, Norway
Quantum Simulations Group, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
Department of Modern Physics, University of Science and Technology, Hefei, Anhui, 230026, China
Chemistry Building, University of Iowa, Iowa, 52240, USA

E-mail: [email protected]

arXiv:1811.11679v2 [physics.comp-ph] 4 Dec 2018

Abstract

Building on the success of quantum Monte Carlo techniques such as diffusion Monte Carlo, alternative stochastic approaches to solve electronic structure problems have emerged over the last decade. The full configuration interaction quantum Monte Carlo (FCIQMC) method allows one to systematically approach the exact solution of such problems, for cases where very high accuracy is desired. The introduction of FCIQMC has subsequently led to the development of coupled cluster Monte Carlo (CCMC) and density matrix quantum Monte Carlo (DMQMC), allowing stochastic sampling of the coupled cluster wave function and the exact thermal density matrix, respectively. In this article we describe the HANDE-QMC code, an open-source implementation of FCIQMC, CCMC and DMQMC, including initiator and semi-stochastic adaptations. We describe our code and demonstrate its use on three example systems: a molecule (nitric oxide), a model solid (the uniform electron gas), and a real solid (diamond). An illustrative tutorial is also included.

1 Introduction

Quantum Monte Carlo (QMC) methods, in their many forms, are among the most reliable and accurate tools available for the investigation of realistic quantum systems.1 QMC methods have existed for decades, including notable approaches such as variational Monte Carlo (VMC),2–6 diffusion Monte Carlo (DMC)1,7–10 and auxiliary-field QMC (AFQMC);11 such methods typically have low scaling with system size, efficient large-scale parallelization, and systematic improvability, often allowing benchmark quality results in challenging systems.

A separate hierarchy exists in quantum chemistry, consisting of methods such as coupled cluster (CC) theory,12 Møller-Plesset perturbation theory (MPPT),13 and configuration interaction (CI), with full CI (FCI)14 providing the exact benchmark within a given single-particle basis set. The scaling with the number of basis functions can be steep for these methods: from $N^4$ for MP2 to exponential for FCI. Various approaches to tackle the steep scaling wall have been proposed in the literature: from adaptive selection algorithms15–20 and many-body expansions for CI21 to the exploitation of the locality of the one-electron basis set22 for MP2 and CC.23–25 Such approaches have been increasingly successful, now often allowing chemical accuracy to be achieved for systems comprising thousands of basis functions.

In 2009, Booth, Thom and Alavi introduced the full configuration interaction quantum Monte Carlo (FCIQMC) method.26 The FCIQMC method allows essentially exact FCI results to be achieved for systems beyond the reach of traditional, exact FCI approaches; in this respect, the method occupies a similar space to the density matrix renormalization group (DMRG) algorithm27–29 and selected CI approaches.15–20 Employing a sparse and stochastic sampling of the FCI wave function greatly reduces the memory requirements compared to exact approaches. The introduction of FCIQMC has led to the development of several other related QMC methods, including coupled cluster Monte Carlo (CCMC),30,31 density matrix quantum Monte Carlo (DMQMC),32,33 model space quantum Monte Carlo (MSQMC),34–36 clock quantum Monte Carlo,37 driven-dissipative quantum Monte Carlo (DDQMC),38 and several other variants, including multiple approaches for studying excited-state properties.34,39–41

In this article we present HANDE-QMC (Highly Accurate N-DEterminant Quantum Monte Carlo), an open-source quantum chemistry code that performs several of the above quantum Monte Carlo methods. In particular, we have developed a highly-optimized and massively-parallelized package to perform state-of-the-art FCIQMC, CCMC and DMQMC simulations.

An overview of stochastic quantum chemistry methods in HANDE-QMC is given in Section 2. Section 3 describes the HANDE-QMC package, including implementation details, our development experiences, and analysis tools. Applications of FCIQMC, CCMC and DMQMC methods are contained in Section 4. We conclude with a discussion in Section 5 with views on scientific software development and an outlook on future work. A tutorial on running HANDE is provided in the Supplementary Material.

2 Stochastic quantum chemistry

2.1 Full Configuration Interaction Quantum Monte Carlo

The FCI ansatz for the ground state wavefunction is $|\Psi_\mathrm{CI}\rangle = \sum_i c_i |D_i\rangle$, where $\{D_i\}$ is the set of Slater determinants. Noting that

$(1 - \delta\tau\hat{H})^N |\Psi_0\rangle \propto |\Psi_\mathrm{CI}\rangle$ as $N \to \infty$, where $\Psi_0$ is some arbitrary initial vector with $\langle\Psi_0|\Psi_\mathrm{CI}\rangle \neq 0$ and $\delta\tau$ is sufficiently small,42 the coefficients $\{c_i\}$ can be found via an iterative process derived from a first-order solution to the imaginary-time Schrödinger equation:26

$$c_i(\tau + \delta\tau) = c_i(\tau) - \delta\tau \sum_j \langle D_i|\hat{H}|D_j\rangle\, c_j(\tau). \qquad (1)$$

A key insight is that the action of the Hamiltonian can be applied stochastically rather than deterministically: the wavefunction is discretized by using a set of particles with weight $\pm 1$ to represent the coefficients, and is evolved in imaginary time by stochastically creating new particles according to the Hamiltonian matrix (Section 2.4). By starting with just particles on the Hartree–Fock determinant or a small number of determinants, the sparsity of the FCI wavefunction emerges naturally. The FCIQMC algorithm hence has substantially reduced memory requirements26 and is naturally scalable43 in contrast to conventional Lanczos techniques. The sign problem manifests itself in the competing in-phase and out-of-phase combinations of particles with positive and negative signs on the same determinant;42 this is alleviated by exactly canceling particles of opposite sign on the same determinant, a process termed 'annihilation'. This results in the distinctive population dynamics of an FCIQMC simulation, and a system-specific critical population is required to obtain a statistical representation of the correct FCI wavefunction.42 Once the ground-state FCI wavefunction has been reached, the population is controlled via a diagonal energy offset9,26 and statistics can be accumulated for the energy estimator and, if desired, other properties.

The stochastic efficiency of the algorithm (determined by the size of statistical errors for a given computer time) can be improved by several approaches: using real weights, rather than integer weights, to represent particle amplitudes;44,45 a semi-stochastic propagation, in which the action of the Hamiltonian in a small subspace of determinants is applied exactly;44,46 and more efficient sampling of the Hamiltonian by incorporating information about the magnitude of the Hamiltonian matrix elements into the selection probabilities.47,48

The initiator approximation49 (often referred to as i-FCIQMC) only permits new particles to be created on previously unoccupied determinants if the spawning determinant has a weight above a given threshold. This introduces a systematic error, which is reduced with increasing particle populations, but effectively reduces the severity of the sign problem. This simple modification has proven remarkably successful and permits FCI-quality calculations on Hilbert spaces orders of magnitude beyond exact FCI.

2.2 Coupled Cluster Monte Carlo

The coupled cluster wavefunction ansatz is $|\Psi_\mathrm{CC}\rangle = N e^{\hat{T}} |D_\mathrm{HF}\rangle$, where $\hat{T}$ is the cluster operator containing all excitations up to a given truncation level, $N$ is a normalisation factor and $|D_\mathrm{HF}\rangle$ the Hartree–Fock determinant. For convenience, we rewrite the wavefunction ansatz as $|\Psi_\mathrm{CC}\rangle = t_\mathrm{HF}\, e^{\hat{T}/t_\mathrm{HF}} |D_\mathrm{HF}\rangle$, where $t_\mathrm{HF}$ is a weight on the Hartree–Fock determinant, and define $\hat{T} = \sum_i' t_i \hat{a}_i$, where the prime restricts the sum to be up to the truncation level, $\hat{a}_i$ is an excitation operator (excitor) such that $\hat{a}_i |D_\mathrm{HF}\rangle$ results in $|D_i\rangle$, and $t_i$ is the corresponding amplitude. Using the same first-order Euler approach as in FCIQMC gives a similar propagation equation:

$$t_i(\tau + \delta\tau) = t_i(\tau) - \delta\tau \sum_j \langle D_i|\hat{H}|D_j\rangle\, \tilde{t}_j(\tau). \qquad (2)$$

The key difference between Eqs. (1) and (2) is that $\tilde{t}_j = \langle D_j|\Psi_\mathrm{CC}\rangle$ contains contributions from clusters of excitors,30 whereas the FCI wavefunction is a simple linear combination. This is tricky to evaluate efficiently and exactly each iteration. Instead, $\tilde{t}_j$ is sampled and individual contributions propagated separately.30,50,51 Bar this complication, the coupled cluster wavefunction can be stochastically evolved using the same approach as used in FCIQMC.
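As a deterministic illustration of the first-order update in Eq. (1), the Python sketch below repeatedly applies $c \leftarrow c - \delta\tau \hat{H} c$ to a small dense matrix and reads off the projected energy. It is only a sketch under stated assumptions: the 2×2 matrix `H_TOY`, the function name `project_ground_state`, the time step and the renormalisation at each step are all hypothetical choices made for illustration. It applies the Hamiltonian exactly rather than stochastically and is not taken from HANDE.

```python
import math


def project_ground_state(hamiltonian, dtau=0.05, n_steps=2000):
    """Iterate c <- c - dtau * H c (Eq. (1)) and return (energy, coefficients)."""
    n = len(hamiltonian)
    # Start from the first basis state, standing in for the reference determinant.
    c = [1.0] + [0.0] * (n - 1)
    for _ in range(n_steps):
        h_c = [sum(hamiltonian[i][j] * c[j] for j in range(n)) for i in range(n)]
        c = [c[i] - dtau * h_c[i] for i in range(n)]
        # Renormalise so the vector neither grows nor decays; this is a
        # convenience of the sketch, not part of Eq. (1).
        norm = math.sqrt(sum(x * x for x in c))
        c = [x / norm for x in c]
    # Projected energy onto the reference state: sum_j H_0j c_j / c_0.
    energy = sum(hamiltonian[0][j] * c[j] for j in range(n)) / c[0]
    return energy, c


# Hypothetical 2x2 Hamiltonian whose exact lowest eigenvalue is 1 - sqrt(2).
H_TOY = [[0.0, -1.0], [-1.0, 2.0]]
energy, coefficients = project_ground_state(H_TOY)
print(energy)
```

The time step must be small enough that the ground state is the dominant eigenvector of $1 - \delta\tau\hat{H}$, which is the role of the "sufficiently small" condition on $\delta\tau$ in the text.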

2.3 Density Matrix Quantum Monte Carlo

FCIQMC and CCMC are both ground-state, zero-temperature methods (although excited-state variants of FCIQMC exist34,39–41). The exact thermodynamic properties of a quantum system in thermal equilibrium can be determined from the (unnormalized) N-particle density matrix, $\hat{\rho}(\beta) = e^{-\beta\hat{H}}$, where $\beta = 1/k_B T$. A direct evaluation of $\hat{\rho}(\beta)$ requires knowledge of the full eigenspectrum of $\hat{H}$, a hopeless task for all but trivial systems. To make progress we note that the density matrix obeys the (symmetrized) Bloch equation

$$\frac{d\hat{\rho}}{d\beta} = -\frac{1}{2}\left[\hat{H}\hat{\rho} + \hat{\rho}\hat{H}\right]. \qquad (3)$$

Representing $\hat{\rho}$ in the Slater determinant basis, $\rho_{ij} = \langle D_i|\hat{\rho}|D_j\rangle$, and again using a first-order update scheme results in similar update equations to FCIQMC and CCMC:

$$\rho_{ij}(\beta + \delta\beta) = \rho_{ij}(\beta) - \frac{\delta\beta}{2} \sum_k \left[\langle D_i|\hat{H}|D_k\rangle\, \rho_{kj}(\beta) + \rho_{ik}(\beta)\, \langle D_k|\hat{H}|D_j\rangle\right]. \qquad (4)$$

It follows that elements of the density matrix can be updated stochastically in a similar fashion to FCIQMC and CCMC. $\rho(\beta)$ is a single stochastic measure of the exact density matrix at inverse temperature $\beta$. Therefore, unlike FCIQMC and CCMC, multiple independent simulations must be performed in order to gather statistics at each temperature. The simplest starting point for a simulation is at $\beta = 0$, where $\rho$ is the identity matrix. Each simulation (termed 'β-loop') consists of sampling the identity matrix and propagating to the desired value of $\beta$. Averaging over multiple β-loops gives thermal properties at all temperatures in the range $[0, \beta]$.

While this scheme is exact (except for small and controllable errors due to finite $\delta\beta$), it suffers from the issue that important states at low temperature may not be sampled in the initial ($\beta = 0$) density matrix, where all configurations are equally important.33 To overcome this, we write $\hat{H} = \hat{H}^0 + \hat{V}$ and define the auxiliary density matrix $\hat{f}(\tau) = e^{-(\beta - \tau)\hat{H}^0} \hat{\rho}(\tau)$ with the following properties:

$$\hat{f}(0) = e^{-\beta\hat{H}^0}, \qquad (5)$$

$$\hat{f}(\beta) = \hat{\rho}(\beta), \qquad (6)$$

$$\frac{d\hat{f}}{d\tau} = \hat{H}^0 \hat{f} - \hat{f}\hat{H}. \qquad (7)$$

We see that with this form of density matrix we can begin the simulation from a mean-field solution defined by $\hat{H}^0$, which should (by construction) lead to a distribution containing the desired important states (such as the Hartree–Fock density matrix element) at low temperature. Furthermore, if $\hat{H}^0$ is a good mean field Hamiltonian then $e^{\beta\hat{H}^0}\hat{\rho}$ is a slowly varying function of $\beta$, and is thus easier to sample. Comparing Eqs. (3) and (7), we see that $\hat{f}$ can be stochastically sampled in a similar fashion to DMQMC, with minor modifications relative to using the unsymmetrized Bloch equation:32 i) the choice of $\hat{H}^0$ changes the probability of killing a particle (Section 2.4); ii) the $\tau = 0$ initial configuration must be sampled according to $\hat{H}^0$ rather than the identity matrix; iii) evolving to $\tau = \beta$ gives a sample of the density matrix at inverse temperature $\beta$ only, so independent simulations must be performed to accumulate results at different temperatures. We term this method interaction-picture DMQMC (IP-DMQMC).

2.4 Commonality between FCIQMC, CCMC and DMQMC

FCIQMC, CCMC and DMQMC have more similarities than differences: the amplitudes within the wavefunction or density matrix are represented stochastically by a weight, or particle.52 These stochastic amplitudes are sampled to produce states, which make up the wavefunction or density matrix. For FCIQMC (DMQMC), a state corresponds to a determinant (outer product of two determinants), and for CCMC corresponds to a term sampled from the cluster expansion corresponding to a single determinant. The stochastic representation of

the wavefunction or density matrix is evolved by

spawning: sampling the action of the Hamiltonian on each (occupied) state, which requires random selection of a state connected to the original state. The process of random selection ('excitation generation') is system-dependent, as it depends upon the connectivity of the Hamiltonian matrix; efficient sampling of the Hamiltonian has a substantial impact on the stochastic efficiency of a simulation.44,47,48

death: killing each particle with probability proportional to its diagonal Hamiltonian matrix element.

annihilation: combining particles on the same state and canceling out particles with the same absolute weight but opposite sign.

Energy estimators can be straightforwardly accumulated during the evolution process. A parallel implementation distributes states over multiple processors, each of which need only evolve its own set of states. The annihilation stage then requires an efficient process for determining to which processor a newly spawned particle should be sent.43 For CCMC an additional communication step is required to ensure that the sampling of products of amplitudes is unbiased.50

Hence, FCIQMC, CCMC and DMQMC share the majority of the core algorithms in the HANDE-QMC implementations. The primary difference is the representation of the wavefunction or density matrix, and the action of the Hamiltonian in the representation. These differences reside in the outer-most loop of the algorithm and so do not hinder the re-use of components between the methods. This remains the case even for linked coupled cluster Monte Carlo, which applies the similarity-transformed Hamiltonian, $e^{-T} H e^{T}$, and the interaction picture formulation of DMQMC.

It is important to note that this core paradigm also covers different approaches to propagation,34,44,49,53 the initiator approximation,31,49,54 excitation generators,47,48 excited states and properties,41,45,55 and can naturally be applied to different wavefunction Ansätze,56 which can be added relatively straightforwardly on top of a core implementation of FCIQMC. Due to this, improvements in, say, excitation generators can be immediately used across all methods in HANDE.

3 HANDE-QMC

3.1 Implementation

HANDE-QMC is implemented in Fortran and takes advantage of the increased expressiveness provided by the Fortran 2003 and 2008 standards.57 Parallelization over multiple processors is implemented using OpenMP (CCMC-only for intra-node shared memory communication) and MPI. Parallelization and the reusability of core procedures have been greatly aided by the use of pure procedures and minimal global state, especially for system and calculation data.

We attempt to use best-in-class libraries where possible. This allows for rapid development and a focus on the core QMC algorithms. HANDE-QMC relies upon MurmurHash2 for hashing operations,58 dSFMT for high-quality pseudo-random number generation,59 numerical libraries (cephes,60 LAPACK, ScaLAPACK, TRLan61,62) for special functions, matrix and vector procedures and Lanczos diagonalization, and HDF5 for file I/O.63 The input file to HANDE-QMC is a Lua script;64 Lua is a lightweight scripting language designed for embedding in applications and can easily be used from Fortran codes via the AOTUS library.65 Some of the advantages of using a scripting language for the input file are detailed in Section 5. Calculation, system settings and other metadata are included in the output in the JSON format,66 providing a good compromise between human- and machine-readable output.

HANDE can be compiled either into a standalone binary or into a library, allowing it to be used directly from existing quantum chemistry packages. CMake67 is used for the build system, which allows for auto-detection of compilers, libraries and available settings in most cases. A legacy Makefile is also included for compiling HANDE in more complex environments where direct and fine-grained control over settings is useful.

Integrals for molecular and solid systems can be generated by Hartree–Fock calculations using standard quantum chemistry programs, such as Psi4,68 HORTON,69 PySCF,70 Q-Chem,71 and MOLPRO,72 in the plain-text FCIDUMP format. HANDE can convert the FCIDUMP file into an HDF5 file, which gives a substantial space saving and can be read in substantially more quickly. For example, an all-electron FCIDUMP for coronene in a Dunning cc-pVDZ basis73 is roughly 35 GB in size and takes 1840.88 seconds to read into HANDE and initialise. When converted to HDF5 format, the resulting file is 3.6 GB in size and initialising an identical calculation takes only 60.83 seconds. This is useful in maximizing resource utilization when performing large production-scale calculations on HPC facilities. The memory demands of the integrals are reduced by storing the two-electron integrals only once on each node using either the MPI-3 shared memory functionality or, for older MPI implementations, POSIX shared memory.

In common with several Monte Carlo methods, data points from consecutive iterations are not independent, as the population at a given iteration depends on the population at the previous iteration. This autocorrelation must be removed in order to obtain accurate estimates of the standard error arising from FCIQMC and CCMC simulations74 and is most straightforwardly done via a reblocking analysis.75 This can be performed as a post-processing step76 but is also implemented as an on-the-fly algorithm,77 which enables calculations to be terminated once a desired statistical error has been reached.

It is often useful to continue an existing calculation; for example to accumulate more statistics to reduce the error bar, to save equilibration time when investigating the effect of calculation parameters or small geometry changes, or for debugging when the bug is only evident deep into a calculation. To aid these use cases, calculations can be stored and resumed via the use of restart files. The state of the pseudo-random number generator is included in the restart files such that restarted calculations follow the same Markov chain as if they had been run in a single calculation, assuming the same calculation setup is used. We use the HDF5 format and library for efficient I/O and compact file sizes. A key advantage of this approach is that it abstracts the data layout into a hierarchy (termed groups and datasets). This makes it particularly straightforward to extend the restart file format to include additional information whilst maintaining backward compatibility with previous calculations. Each calculation is labeled with a universally unique identifier (UUID),78 stored in the restart file and included in the metadata of subsequent calculations. This is critical for tracing the provenance of data generated over multiple restarted calculations.

Extensive user-level documentation is included in the HANDE-QMC package79 and details compilation, input options, running HANDE and calculation analysis. The documentation also includes several tutorials on FCIQMC, CCMC and DMQMC, which guide new users through generating the integrals (if required), running a QMC calculation along with enabling options for improving stochastic efficiency, and analysing the calculations. The HANDE source code is also heavily commented and contains extensive explanations on the theories and methods implemented (especially for CCMC), and data structures. Each procedure also begins with a comment block describing its action, inputs and outputs. We find this level of developer documentation to be extremely important for onboarding new developers and making HANDE accessible to modifications by other researchers.

3.2 Development methodology

The HANDE-QMC project is managed using the Git distributed version control system.80 A public Git repository is hosted on GitHub81 and is updated with new features, improvements and bug fixes. We also use a private Git repository for more experimental development and research; this allows for new features to be iterated upon (and potentially changed or even removed) without introducing instability into the more widely available code.82 We regularly update the public version, from which official releases are made, with the changes made in the private repository. Further details of our development practices such as our development philosophy and the extensive continuous integration set up using Buildbot83 are outlined in Ref. 84.

3.3 pyhande

Interpretation and analysis of calculation output is a critical part of computational science. While we wrote scripts for performing common analyses, such as reblocking to remove the effect of autocorrelation from estimates of the standard error, we found that users would write ad-hoc, fragile scripts for extracting other useful data, which were rarely shared and contained overlapping functionality. This additional barrier also hindered curiosity-driven exploration of results. To address this, the HANDE-QMC package includes pyhande, a Python library for working with HANDE calculation outputs. pyhande extracts metadata (including version, system and calculation parameters, calculation UUID) into a Python dictionary and the QMC output into a Pandas85 DataFrame, which provides a powerful abstraction for further analysis. pyhande includes scripts and functions to automate common tasks, including reblocking analysis, plateau and shoulder31 height estimation, stochastic inefficiency estimation86 and reweighting to reduce the bias arising from population control.9,87 We have found that the development of pyhande has aided reproducibility by providing a single, robust implementation for output parsing and common analyses, and has made more complex analyses more straightforward by providing rich access to raw data in a programmable environment. Indeed, many functions included in pyhande began as exploratory analysis in a Python shell or a Jupyter notebook. The HANDE-QMC documentation also details pyhande and the tutorials include several examples of using pyhande for data analysis. pyhande makes extensive use of the Python scientific stack (NumPy,88 SciPy,89 Pandas85 and Matplotlib90).

3.4 License

HANDE-QMC is licensed under the GNU Lesser General Public License, version 2.1. The LGPLv2.1 is a weak copyleft license,91,92 which allows the QMC implementations to be incorporated in both open- and closed-source quantum chemistry codes while encouraging developments and improvements to be contributed back or made available under the same terms.93 pyhande is licensed under the 3-Clause BSD License,94 in keeping with many scientific Python packages.

4 Example results

In this section we present calculations to demonstrate the core functionality included in HANDE-QMC: we consider a small molecule (nitric oxide); the uniform electron gas in the zero-temperature ground state and at finite temperatures; and a periodic solid, diamond, with k-point sampling. The supplementary material includes a tutorial on running and analyzing FCIQMC on the water molecule in a cc-pVDZ basis, which is easily accessible by deterministic methods and can be easily performed on any relatively modern laptop.

4.1 Computational details

All calculations in this section were run with HANDE versions earlier than version 1.3. Integrals were generated using PySCF, Psi4 and Q-Chem. Input, output and analysis scripts are available under a Creative Commons License at https://doi.org/10.17863/CAM.31933 containing specifics on which version is used for some calculations, and which SCF program is used.

4.2 Molecules: Nitric oxide

Nitric oxide is an important molecule, perhaps most notably as a signalling molecule in multiple physiological processes. Here, we consider NO in a cc-pVDZ basis set,73 correlating all 15 electrons. The FCI space size is $\sim 10^{12}$, and so is somewhat beyond the reach of exact FCI approaches. We consider initiator FCIQMC, using a walker population of $8 \times 10^6$, which is more than sufficient to achieve an accuracy of $\sim 0.1\,\mathrm{m}E_\mathrm{h}$. This is then compared to CCMC results for the CCSD, CCSDT and CCSDTQ Ansätze. An unrestricted Hartree–Fock (UHF) molecular orbital basis is used. The computational resources to perform this study are modest compared to state-of-the-art FCIQMC simulations, never using more than about 100 processing cores.

In Figure 1 and Table 1, results are presented for this system at varying internuclear distances. Remarkably good agreement between CCSDTQ-MC and the i-FCIQMC is achieved, with CCSDT-MC also performing extremely well. Statistical errors do not pose any issue in these results, as is typically the case for FCIQMC and CCMC simulations; all such error bars are naturally of order $0.1\,\mathrm{m}E_\mathrm{h}$ or less. For i-FCIQMC results the semi-stochastic adaptation was used,44,46 choosing the deterministic space by the approach of Ref. 46. Fig. (2) demonstrates such simulations before and after enabling semi-stochastic propagation, and the benefits are clear. Indeed, i-FCIQMC results here have statistical errors of order $\sim 1\,\mu E_\mathrm{h}$ or smaller.

CCMC calculations were performed with real weights using the even selection algorithm.51 For the largest calculations, CCSDTQ-MC, heatbath excitation generators were used with up to $4.5 \times 10^6$ occupied excitors, parallelizing over 96 cores. For comparison, deterministic single reference CCSDTQ calculations performed with the MRCC program package95 required storage of $2.1 \times 10^7$ amplitudes, but did not converge beyond $R = 1.7$ Å.

Table (1) also shows the percentage of correlation energy captured by the various levels of CC, compared to i-FCIQMC. CCSD and CCSDT capture > 92% and > 98% of the correlation energy, respectively, with CCSDTQ essentially exact, and the percentage decreasing with increasing bond length as expected. The CCMC approach is particularly appropriate for such high-order CC calculations, where stochastic sampling naturally takes advantage of the sparse nature of the CC amplitudes.

[Figure 1: two panels plotting Energy/$E_\mathrm{h}$ against $R$/Å. Upper panel: CCSD-MC, CCSDT-MC and CCSDTQ-MC. Lower panel: CCSDTQ-MC and i-FCIQMC.]

Figure 1: The binding curve of NO in a cc-pVDZ basis set, correlating all electrons. Stochastic error bars are not visible on this scale, but all are smaller than $1\,\mathrm{m}E_\mathrm{h}$. For better resolution in the differences between methods, see Table (1).

4.3 Model Solid: Uniform electron gas

HANDE also has built-in capability to perform calculations of model systems commonly used in condensed matter physics, specifically the uniform electron gas (UEG),96–98 the Hubbard model,99–101 and the Heisenberg model.32,102 Such model systems have formed the foundation of our understanding of simple solids and strongly correlated materials, and are a useful testing ground for new computational approaches. Studying the UEG, for example, has provided insight into the accuracy of many-body electronic structure methods and has been a critical ingredient for the development of many of the exchange-correlation kernels used

Table 1: CCMC and i-FCIQMC results for the NO molecule in a cc-pVDZ basis set, correlating all electrons, as plotted in Fig. (1). UHF orbitals were used. Numbers in parentheses show statistical error bars, not systematic initiator error, which is estimated to be $\sim 0.1\,\mathrm{m}E_\mathrm{h}$ for i-FCIQMC results. i-FCIQMC results used the semi-stochastic adaptation with a deterministic space of size $2 \times 10^4$. The results of such a semi-stochastic approach are demonstrated in Fig. (2). The final three columns show the percentage of correlation energy recovered by CCSD-MC, CCSDT-MC and CCSDTQ-MC, compared to i-FCIQMC. i-FCIQMC calculations were performed with $8 \times 10^6$ walkers, and CCMC calculations used at most $7 \times 10^6$ excips.

         (Total energy + 129 E_h)/E_h                                    Correlation energy recovered (%)
R/Å      CCSD            CCSDT          CCSDTQ         i-FCIQMC          CCSD          CCSDT        CCSDTQ
0.9      -0.328507(1)    -0.3346(1)     -0.33523(4)    -0.335225(2)      97.7330(6)    99.78(5)     100.00(1)
1.0      -0.5162(2)      -0.52478(2)    -0.525448(6)   -0.525470(2)      97.06(8)      99.779(6)    99.993(2)
1.1      -0.582684(9)    -0.59317(8)    -0.59447(3)    -0.594565(3)      96.435(3)     99.58(2)     99.973(9)
1.154    -0.5904(5)      -0.6018(3)     -0.6035(2)     -0.603772(2)      96.1(2)       99.43(9)     99.92(5)
1.2      -0.58653(3)     -0.6005(4)     -0.6018(2)     -0.602136(3)      95.541(8)     99.5(1)      99.89(7)
1.3      -0.5622(2)      -0.5782(4)     -0.5790(6)     -0.580833(3)      94.67(5)      99.2(1)      99.5(2)
1.4      -0.5256(2)      -0.5451(10)    -0.5471(7)     -0.548340(3)      93.34(7)      99.1(3)      99.6(2)
1.7      -0.43299(10)    -0.4503(5)     -0.4543(1)     -0.455765(4)      92.13(3)      98.1(2)      99.48(4)
2.0      -0.39816(6)     -0.40800(9)    -0.41010(6)    -0.411350(2)      94.45(2)      98.59(4)     99.47(2)
2.5      -0.39132(5)     -0.39371(8)    -0.39434(2)    -0.3954786(4)     98.05(2)      99.17(4)     99.467(8)

in Kohn–Sham density functional theory.103–105 The UEG has been used recently as a means to benchmark and test performance of new methods, such as modifications to diffusion Monte Carlo (DMC), as well as low orders of coupled cluster theory31,106–113 and FCIQMC.114–119

[Figure 2: two panels plotting correlation energy against iteration. (a) Correlation energy/$E_\mathrm{h}$. (b) Correlation energy/$t$.]

Figure 2: Example simulations in HANDE-QMC using the semi-stochastic FCIQMC approach of Umrigar and co-workers.44 Vertical dashed lines show the iteration where the semi-stochastic adaptation is begun, and the resulting reduction in noise is clear thereafter. (a) NO in a cc-pVDZ basis set, with all electrons correlated, at an internuclear distance of 1.154 Å. The deterministic space is of size $2 \times 10^4$. (b) A half-filled two-dimensional 18-site Hubbard model at $U/t = 1.3$, using a deterministic space of size $10^4$.

A recent CCMC study116 employing coupled cluster levels up to CCSDTQ5 used HANDE to compute the total energy of the UEG at $r_s = [0.5, 5]\,a_0$, the range relevant to electron densities in real solids.98 The results suggest that CCSDTQ might be necessary at low densities beyond $r_s = 3\,a_0$ in order to achieve chemical accuracy,116 whilst CCSDTQ5 was necessary to reproduce FCIQMC to within error bars.114–117

HANDE was also used in the resolution of a discrepancy between restricted path-integral Monte Carlo and configuration path-integral Monte Carlo data for the exchange-correlation energy of the UEG necessary to parametrize DFT functionals at finite temperature.54,120–128 The UEG at finite temperatures is parametrized by the density and the degeneracy temperature, $\Theta = T/T_F$, where $T_F$ is the Fermi temperature.129 When both $r_s \approx 1$

and Θ ≈ 1 the system is said to be in the warm dense regime, a state of matter which is found in planetary interiors130 and can be created experimentally in inertial confinement fusion experiments.131

Table 2: Ground-state exchange-correlation energies (E_xc) for the UEG at r_s = 1 a_0, comparing various levels of coupled cluster theory with FCIQMC. Exchange-correlation energies were calculated using data from Ref. 116.

Method          E_xc/E_h
CCSD-MC         -0.551128(6)
CCSDT-MC        -0.55228(1)
CCSDTQ-MC       -0.55231(1)
CCSDTQ5-MC      -0.55232(1)
FCIQMC          -0.55233(1)

Figure 3: The exchange-correlation energy (E_xc) for the UEG at r_s = 1 a_0 as a function of temperature Θ using DMQMC (Ref. 54). The horizontal lines represent basis set extrapolated CCSD, CCSDT and FCIQMC exchange-correlation energies (Ref. 116). Error bars on CCMC and FCIQMC results are too small to be seen on this scale. CCSDT and FCIQMC values cannot be distinguished on this scale. See Table (2) for numerical values for CCSD to CCSDTQ5 in the ground state.

Here, we show that use of HANDE can facilitate straightforward benchmarking of model systems at both zero and finite temperature. In Fig. (3) we compare DMQMC data for the 14-electron, spin-unpolarized UEG at finite Θ to zero temperature (Θ = 0) energies found using CCMC and FCIQMC for r_s = 1 a_0.116 We compute the exchange-correlation internal energy

\[ E_{\mathrm{XC}}(\Theta) = E_{\mathrm{QMC}}(\Theta) - T_0(\Theta), \tag{8} \]

where E_QMC(Θ) is the QMC total energy of the UEG and T_0 is the ideal kinetic energy of the same UEG.

Even at r_s = 1 a_0, coupled cluster requires contributions from triple excitations to obtain FCI-quality energies; CCSD differs by about 1 mE_h. DMQMC results tend to the expected zero temperature limit given by both FCI and CC. Ground-state values from coupled cluster and FCIQMC are presented in Table (2), to make the small differences between high-accuracy methods clearer.

4.4 Solids: Diamond

Finally, we apply HANDE-QMC to a real periodic solid, diamond, employing k point sampling. CCMC has been applied to 1×1×1 (up to CCSDTQ), 2×1×1 (up to CCSDT), 2×2×1 and 2×2×2 (up to CCSD) k point meshes and non-initiator FCIQMC to a 1×1×1 k point mesh in a GTH-DZVP132† basis, and a GTH-pade pseudo-potential.134,135 There were 2 atoms, 8 electrons in 52 spinorbitals per k point. Integral files have been generated with PySCF70 using Gaussian density fitting.136 Orbitals were obtained from density functional theory using the LDA Slater-Vosko-Vilk-Nusair (SVWN5) exchange-correlation functional137 to write out complex valued integrals at different k points, and HANDE's read-in functionalities were adapted accordingly. Details of this will be the subject of a future publication on solid-state calculations. The heat bath uniform singles47,48 or the heat bath Power–Pitzer ref. excitation generator48 and even selection51 or multi-spawn50 sampling were used.

†As used in PySCF,70 and CP2K,133 https://www.cp2k.org/.

Deterministic coupled cluster has been applied to diamond previously; Booth et al.138 have investigated diamond with CCSD, CCSD(T)139 and FCIQMC in a basis of plane waves with the projector augmented wave method;140 McClain et al.141 studied diamond with CCSD using GTH pseudo-potentials in

DZV, DZVP, TZVP basis sets132,134,135†; Gruber et al.142 used CCSD with (T) corrections in an MP2 natural orbital basis.143

The lattice constant was fixed to 3.567 Å, as in the study by McClain et al.141 Figure 4 shows the correlation energy as a function of number of k points, comparing the CCMC and FCIQMC results to the CCSD results obtained using PySCF and the CCSD results of McClain et al.141 The correlation energy given here is calculated with respect to the HF energy: it is the correlation energy obtained with DFT orbitals, plus the difference between the energy of the reference determinant built from DFT orbitals and the HF SCF energy. Differences in convergence are due to the use of differently optimized orbitals, and a different treatment of the exchange integral (which will feature in a future publication). In the case of CCMC, FCIQMC and CCSD-PySCF the k point mesh has been shifted to contain the Γ point, while McClain et al.141 used Γ point centered (not shifted) meshes, which explains the larger difference between CCSD-McClain et al. and the rest of the data. An accuracy of (0.01–0.1) eV/unit ((0.00037–0.0037) E_h/unit) might be required to accurately predict, for example, crystal structures,144 so these limited k-point mesh results suggest that at least CCSDT level is required for reasonable accuracy, possibly CCSDTQ. Nonetheless, we have not considered larger basis sets, additional k points, and other important aspects required for an exhaustive study.

Figure 4: Difference between the total and Hartree-Fock energy per k point for diamond using CCMC (CCSD to CCSDTQ) and (non-initiator) FCIQMC based on DFT orbitals. The CCSDTQ and the FCIQMC data points overlap to a large extent. The CCSD-PySCF data was run with Hartree-Fock orbitals. In the case of CCMC, FCIQMC and CCSD-PySCF the mesh has been shifted to contain the Γ point. CCSD-McClain et al. is data from Figure 1 in McClain et al.141 using PySCF; we show only their data up to 12 k-points for comparison. Both studies used the DZVP basis set and GTH pseudopotentials.

5 Discussion

This article has presented the key functionality included in HANDE-QMC: efficient, extensible implementations of the full configuration interaction quantum Monte Carlo, coupled cluster Monte Carlo and density matrix quantum Monte Carlo methods. Advances such as semi-stochastic propagation in FCIQMC44,46 and efficient excitation generators47,48 are also implemented. HANDE-QMC can be applied to model systems – the Hubbard, Heisenberg and uniform electron gas models – as well as molecules and solids.

We have found using a scripting language (Lua) in the input file145 to be extremely beneficial – for example, in running multi-stage calculations, enabling semi-stochastic propagation after the most important states have emerged, irregular output of restart files, or for enabling additional output for debugging at a specific point in the calculation. As with (e.g.) Psi4, PySCF and HORTON, we find this approach far more flexible and powerful than a custom declarative input format used in many other scientific codes.

We are strong supporters of open-source software in scientific research and are glad that the HANDE-QMC package has been used in others' research in ways we did not envisage, including in the development of Adaptive Sampling Configuration Interaction (ASCI),53 understanding the inexact power iteration method146 and in selecting the P subspace in the CC(P;Q) method.147 We believe one reason for this is

that the extensive user- and developer-level documentation makes learning and developing HANDE-QMC rather approachable. Indeed, five of the authors of this paper made their first contributions to HANDE-QMC as undergraduates with little prior experience in software development or computational science. In turn, HANDE-QMC has greatly benefited from existing quantum chemistry software, in particular integral generation from Hartree–Fock calculations in Psi4,68 Q-Chem71 and PySCF.70 We hope in future to couple HANDE-QMC to such codes to make running stochastic quantum chemistry calculations simpler and more convenient. To this end, some degree of standardization of data formats to make it simple to pass data (e.g. wavefunction amplitudes) between codes would be extremely helpful in connecting libraries, developing new methods147 and reproducibility.

We close by echoing the views of the Psi4 developers:68 'the future of quantum chemistry software lies in a more modular approach in which small, independent teams develop reusable software components that can be incorporated directly into multiple quantum chemistry packages' and hope that this leads to an increased vibrancy in method development.

Acknowledgement

JSS and WMCF received support under EPSRC Research Grant EP/K038141/1 and acknowledge the stimulating research environment provided by the Thomas Young Centre under Grant No. TYC-101. NSB acknowledges St John's College, Cambridge, for funding through a Research Fellowship, and Trinity College, Cambridge for an External Research Studentship during this work. JE acknowledges Trinity College, Cambridge, for funding through a Summer Studentship during this work. RSTF acknowledges CHESS for a studentship. WH acknowledges Gonville & Caius College, Cambridge for funding through a Research Fellowship during this work. NSB and WH are grateful for Undergraduate Research Opportunities Scholarships in the Centre for Doctoral Training on Theory and Simulation of Materials at Imperial College funded by EPSRC under Grant No. EP/G036888/1. FDM was funded by an Imperial College President's scholarship and part of this work was performed under the auspices of the U.S. Department of Energy (DOE) by LLNL under Contract No. DE-AC52-07NA27344. VAN acknowledges the EPSRC Centre for Doctoral Training in Computational Methods for Materials Science for funding under grant number EP/L015552/1 and the Cambridge Philosophical Society for a studentship. RDR acknowledges partial support by the Research Council of Norway through its Centres of Excellence scheme, project number 262695 and through its Mobility Grant scheme, project number 261873. CJCS acknowledges the Sims Fund for a studentship. JJS is currently supported by an Old Gold Summer Fellowship from the University of Iowa. JJS also gratefully acknowledges the prior support of a Research Fellowship from the Royal Commission for the Exhibition of 1851 and a production project from the Swiss National Supercomputing Centre (CSCS) under project ID s523. WAV acknowledges EPSRC for a PhD studentship. AJWT acknowledges Imperial College London for a Junior Research Fellowship, the Royal Society for a University Research Fellowship (UF110161 and UF160398), Magdalene College for summer project funding for M-AF, and EPSRC for an Archer Leadership Award (project e507). We acknowledge contributions from J. Weston during an Undergraduate Research Opportunities Scholarship in the Centre for Doctoral Training on Theory and Simulation of Materials at Imperial College funded by EPSRC under Grant No. EP/G036888/1. The HANDE-QMC project acknowledges a rich ecosystem of open-source projects, without which this work would not have been possible.

A An introductory tutorial to HANDE-QMC

In the following we present an introductory tutorial, demonstrating how to perform basic FCIQMC and i-FCIQMC simulations with the HANDE-QMC code. More extensive tutorials,

including for CCMC and DMQMC, exist in the HANDE-QMC documentation. Here we take the water molecule at its equilibrium geometry, in a cc-pVDZ basis set73 and correlating all electrons. This is a simple example, but has a Hilbert space dimension of ∼ 5 × 10^8, making an exact FCI calculation non-trivial to perform.

A.1 A basic i-FCIQMC simulation

The input file for HANDE-QMC is a Lua script. The basic structure of such an input file is shown in Fig. (5).

    sys = read_in {
        int_file = "INTDUMP",
    }

    fciqmc {
        sys = sys,
        qmc = {
            tau = 0.01,
            tau_search = true,
            rng_seed = 8,
            init_pop = 500,
            mc_cycles = 5,
            nreports = 3*10^3,
            target_population = 10^4,
            excit_gen = "heat_bath",
            initiator = true,
            real_amplitudes = true,
            spawn_cutoff = 0.1,
            state_size = -1000,
            spawned_state_size = -100,
        },
    }

Figure 5: An example input file for an i-FCIQMC simulation on a molecular system. The results of such a simulation are presented in Fig. (6).

In this example the system is entirely determined by the integral file, "INTDUMP", which stores all of the necessary 1- and 2-body molecular integrals. For this tutorial, the integral file was generated through the Psi4 code.68 Both the "INTDUMP" file, and the Psi4 script used to generate it, are available in additional material. As discussed in the main text, the integral file may be generated by multiple other quantum chemistry packages.69–72

In general, the system may be defined by specifying additional parameters, including the number of electrons, the spin quantum number (M_s), the point group symmetry label, and a CAS subspace, for example:

    sys = read_in {
        int_file = "INTDUMP",
        nel = 10,
        ms = 0,
        sym = 0,
        CAS = {8, 23},
    }

The input file then calls the fciqmc{...} function, which performs an FCIQMC simulation with the provided system and parameters. There are several options here; most are self-evident and are described in detail in the HANDE-QMC documentation. tau specifies the time step size, and tau_search=true updates this time step to an optimal value during the simulation. init_pop specifies the initial particle population, and target_population the value at which this population will attempt to stabilize. excit_gen specifies the excitation generator to be used. This option is not required, although the heat-bath algorithm of Umrigar and co-workers47 that we have adapted for HANDE-QMC as explained in Ref. 48, as used here, is a sensible choice in small systems. initiator=true ensures that the initiator adaptation, i-FCIQMC, is used. real_amplitudes=true ensures that non-integer particle weights are used. This leads to improved stochastic efficiency, and so is always recommended. Lastly, state_size and spawned_state_size specify the memory allocated to the particle and spawned particle arrays, respectively; a negative sign is used to specify these values in megabytes (thus 1GB and 100MB, here).

The input file is run with

    $ mpiexec hande.x hande.lua > hande.out

with the MPI command varying between implementations in the usual way. The results of the

Figure 6: The results of running the input file in Fig. (5). (a) shows the particle population, stabilizing slightly above the targeted value of 10^4. (b) shows the numerator of the energy estimator, $\sum_{i \neq 0} C_i \langle D_0|\hat{H}|D_i\rangle$, as discussed in the main text. (c) shows the energy denominator, which is the number of particles on the Hartree–Fock determinant. (d) shows the correlation energy estimates themselves.
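Panel (d) is obtained by dividing the numerator in (b) by the denominator in (c). As a rough illustration using synthetic, serially correlated data (not HANDE-QMC output; the AR(1) noise model and every parameter value below are assumptions chosen only for demonstration), the following Python sketch forms the estimate as a ratio of means after discarding an initial equilibration period:

```python
import random


def ar1_series(mean, sigma, rho, n, rng):
    """Generate n points of an AR(1) series fluctuating around `mean`."""
    x = 0.0
    out = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, sigma)
        out.append(mean + x)
    return out


def ratio_of_means(numerator, denominator, start):
    """Average numerator and denominator separately from `start`, then divide."""
    num = numerator[start:]
    den = denominator[start:]
    return (sum(num) / len(num)) / (sum(den) / len(den))


rng = random.Random(8)
n_reports = 3000
# Hypothetical values, loosely on the scale of panels (b) and (c).
denominator = ar1_series(480.0, 2.0, 0.9, n_reports, rng)
noise = ar1_series(0.0, 0.2, 0.9, n_reports, rng)
numerator = [-0.2166 * d + e for d, e in zip(denominator, noise)]

estimate = ratio_of_means(numerator, denominator, start=500)
print(f"ratio-of-means estimate: {estimate:.4f}")
```

The `start` argument plays the role of the iteration from which averaging begins. A proper error bar additionally requires a blocking analysis of the correlated series, which this sketch does not attempt.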

input file in Fig. (5) are presented in Fig. (6).

Because of the correlated nature of the QMC data, care must be taken when estimating error bars; a large number of iterations must typically be performed, allowing data to become sufficiently uncorrelated. This task can be error-prone for new users (and old ones). HANDE-QMC includes a Python script, reblock_hande.py, which performs a rigorous blocking analysis of the simulation data, automatically detecting if sufficient iterations have been performed and, if so, choosing the optimal block length to provide final estimates.

This final energy estimate can be obtained by

    $ reblock_hande.py --quiet hande.out

The usual estimator for the correlation energy (E_corr) is the Hartree–Fock projected estimator:

\begin{align}
E_{\mathrm{corr}} &= \frac{\langle D_0|(\hat{H} - E_{\mathrm{HF}})|\Psi_0\rangle}{\langle D_0|\Psi_0\rangle}, \tag{9} \\
&= \frac{\sum_{i \neq 0} C_i \langle D_0|\hat{H}|D_i\rangle}{C_0}, \tag{10}
\end{align}

where |D_0⟩ is the Hartree–Fock determinant and E_HF is the Hartree–Fock energy. C_i are the particle amplitudes, with C_0 being the Hartree–Fock amplitude. Because both the numerator and denominator are random variables, they should be averaged separately, before performing division. It is therefore important that data be averaged from the point where both the numerator and denominator have converged individually; in some cases the energy itself may appear converged while the numerator and denominator are still converging. This does not occur in the current water molecule case, as can be seen in Fig. (6), where the numerator and denominator are plotted in (b) and (c), respectively. Here, all relevant estimates appear converged by iteration ∼ 1000.

The reblock_hande.py script will automatically detect when the required quantities have converged, in order to choose the iteration from which to start averaging data. However, a starting iteration may be manually provided using --start. In general it is good practice to manually plot simulation data, as in Fig. (6), to check that behavior is sensible. In this case, the reblock_hande.py script automatically begins averaging from iteration number 1463, which is appropriate.

A.2 Converging initiator error

After running the reblock_hande.py script, the correlation energy estimate can be read off simply as E_corr = −0.2166(2) E_h. This compares well to the exact FCI energy of E_FCI = −0.217925 E_h, in error by ∼ 1.3 mE_h, despite using only ∼ 10^4 particles to sample a space of dimension ∼ 5 × 10^8.

Nonetheless, an important feature of i-FCIQMC is the ability to converge to the exact result by varying only one parameter, the particle population. This is possible by running multiple i-FCIQMC simulations independently. However, one can make use of the Lua input file with HANDE-QMC to perform an arbitrary number of simulations with a single input file, as shown by example in Fig. (7). Here, targets is a table containing particle populations from 2 × 10^3, and doubling until 1.28 × 10^5. We loop over all target populations and perform an FCIQMC simulation for each.

    sys = read_in {
        int_file = "INTDUMP",
    }

    targets = {2*10^3, 4*10^3, 8*10^3,
               1.6*10^4, 3.2*10^4, 6.4*10^4,
               1.28*10^5}

    for i,target in ipairs(targets) do
        fciqmc {
            sys = sys,
            qmc = {
                tau = 0.01,
                rng_seed = 8,
                init_pop = target/20,
                mc_cycles = 5,
                nreports = 3*10^3,
                tau_search = true,
                target_population = target,
                excit_gen = "heat_bath",
                initiator = true,
                real_amplitudes = true,
                spawn_cutoff = 0.1,
                state_size = -1000,
                spawned_state_size = -100,
            },
        }
    end

Figure 7: An example input file showing how to use Lua features to perform multiple simulations in a single input file, with particle populations from 2000 to 128,000.

Figure 8: Initiator convergence for the water molecule in a cc-pVDZ basis set, with all electrons correlated. Results were obtained by running the input file of Fig. (7).

Running the reblock_hande.py script on the subsequent output file gives the results in Table (3). The final column gives the projected energy estimate of the correlation energy, and is plotted in Fig. (8), with comparison to the FCI energy. Accuracy within 1 mE_h is reached with N_w = 2 × 10^4, and an accuracy of 0.1 mE_h by N_w = 2 × 10^5.

It is simple to perform a semi-stochastic i-FCIQMC simulation. To do this, as well as passing sys and qmc parameters to the fciqmc function, one should also pass a semi_stoch table. The simplest form for this table, which

Table 3: Output of the HANDE-QMC reblocking script, on the simulation with the input file of Fig. (7). The final column gives the estimates of the correlation energy, as determined from the projected energy estimator.

               Block from        # H psips      Σ_j H_0j N_j    N_0          Shift         Proj. Energy
hande.out  0   1.83000000e+03    2292(4)        -36.77(8)       172.6(5)     -0.210(3)     -0.2131(3)
           1   1.81800000e+03    4602(5)        -56.4(1)        262.5(6)     -0.213(2)     -0.2148(2)
           2   1.47300000e+03    9108(7)        -88.29(9)       408.2(5)     -0.213(1)     -0.2163(2)
           3   1.78100000e+03    19050(10)      -151.5(1)       697.0(6)     -0.217(1)     -0.2173(2)
           4   1.97200000e+03    38150(10)      -276.7(1)       1270.0(6)    -0.2188(5)    -0.21784(6)
           5   2.06500000e+03    74310(30)      -528.3(2)       2428(1)      -0.2193(6)    -0.21761(8)
           6   1.82500000e+03    152900(30)     -1081.4(4)      4964(2)      -0.2186(4)    -0.21787(5)
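The projected energy in the last column of Table 3 is the ratio of the third and fourth columns, and its deviation from the FCI value quoted in the text (−0.217925 E_h) gives the initiator error at each population. A short sketch of this arithmetic follows, with central values copied from the table and statistical errors ignored for brevity (the variable names are illustrative only):

```python
E_FCI = -0.217925  # FCI correlation energy quoted in the text, in Eh

# (mean particle number, sum_j H_0j N_j, N_0, projected energy) from Table 3
rows = [
    (2292, -36.77, 172.6, -0.2131),
    (4602, -56.4, 262.5, -0.2148),
    (9108, -88.29, 408.2, -0.2163),
    (19050, -151.5, 697.0, -0.2173),
    (38150, -276.7, 1270.0, -0.21784),
    (74310, -528.3, 2428, -0.21761),
    (152900, -1081.4, 4964, -0.21787),
]

for n_psips, numerator, n0, e_proj in rows:
    ratio = numerator / n0  # numerator and denominator averaged separately
    error_mEh = 1e3 * (e_proj - E_FCI)
    print(f"{n_psips:>7d}  ratio = {ratio:.4f}  "
          f"tabulated = {e_proj:.5f}  initiator error = {error_mEh:.2f} mEh")
```

The recomputed ratios agree with the tabulated projected energies to within the rounding of the quoted digits, and the deviation from the FCI value shrinks as the population grows, up to statistical noise.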

is almost always appropriate, is the following:

    semi_stoch = {
        size = 10^4,
        start_iteration = 2*10^3,
        space = "high",
    },

The "high" option generates a deterministic space by choosing the most highly-weighted determinants in the FCIQMC wave function at the given iteration (which in general should be an iteration where the wave function is largely converged), 2 × 10^3 in this case. The total size of the deterministic space is given by the size parameter, 10^4 in this case.

B Parallelization

In this appendix, we describe two techniques that can optimize the FCIQMC parallelization, load balancing and non-blocking communication. Parallelization of CCMC has been explained in Ref. 50 but does not yet make use of non-blocking communication.

By and large, HANDE's FCIQMC implementation follows the standard parallel implementation of the FCIQMC algorithm, a more complete description of which can be found in Ref. 43. In short, each processor stores a sorted main list of instantaneously occupied determinants containing the determinant's bit string representation, the walker's weight as well as any simulation dependent flags. For each iteration every walker is given the chance to spawn to another connected determinant, with newly spawned walkers being added to a second spawned walker array. After evolution a collective MPI_Alltoallv is set up to communicate the spawned walker array to the appropriate processors. The annihilation step is then carried out by merging the subsequently sorted spawned walker array with the main list.

During the simulation every walker needs to know which processor a connected determinant resides on but naturally cannot store this mapping. In order to achieve a relatively uniform distribution of determinants at a low computational cost, each walker is assigned to a processor p as

\[ p(|D_i\rangle) = \mathrm{hash}(|D_i\rangle) \bmod N_p, \tag{11} \]

where N_p is the number of processors and hash is a hash function.58

B.1 Load Balancing

The workload of the algorithm is primarily determined by the number of walkers on a given processor, but the above hashing procedure distributes work to processors on a determinant basis. For the hashing procedure to be effective we require that the average population for a random set of determinants be roughly uniform. Generally hashing succeeds in this regard and one finds a fairly even distribution of both walkers and determinants. When scaling a problem of a fixed size to more processors, i.e. strong scaling, one observes that the distribution loses some of its uniformity with certain processors becoming significantly under

and over populated, which negatively affects the parallelism.43 This is to be expected as in the limit N_p → N_Dets there would be quite a pronounced load imbalance unless each determinant's coefficient was of a similar magnitude (which can often be the case for strongly correlated systems). Naturally this limit is never reached, but the observed imbalance is largely a consequence of this increased refinement.

In HANDE we optionally use dynamic load balancing to achieve better parallel performance. In practice, we define an array p_map as

\[ p_{\mathrm{map}}(i) = i \bmod N_p, \tag{12} \]

so that its entries cyclically contain the processor IDs, 0, ..., N_p − 1. Determinants are then initially mapped to processors as

\[ p(|D_i\rangle) = p_{\mathrm{map}}\big(\mathrm{hash}(|D_i\rangle) \bmod (N_p \times M)\big), \tag{13} \]

where M is the bin size. Eq. (13) reduces to Eq. (11) when M = 1.

The walker population in each of these M bins on each processor can be determined and communicated to all other processors. In this way, every processor knows the total distribution of walkers across all processors. In redistributing the N_p × M bins we adopt a simple heuristic approach by only selecting bins belonging to processors whose populations are either above or below a certain user defined threshold. By redistributing bins in order of increasing population we can, in principle, isolate highly populated determinants while also allowing for a finer distribution.

This procedure translates to a simple modification of p_map so that its entries now contain the processor IDs which give the determined optimal distribution of bins.

Finally, the walkers which reside in the chosen bins have to be moved to their new processor, which can simply be achieved using a communication procedure similar to that used for the annihilation stage. Some care needs to be taken that all determinants are on their correct processors at a given iteration so that annihilation takes place correctly.

Once the population of walkers has stabilised the distribution across processors should be roughly constant, although small fluctuations will persist. With this in mind redistribution should only occur after this stabilisation has occurred and also should not need to occur too frequently. This ensures that the computational cost associated with performing load balancing is fairly minor in a large calculation. Additionally as M is increased the optimal distribution of walkers should be approached, although with an increase in computational effort.

B.2 Non-blocking communication

HANDE also makes use of non-blocking asynchronous communication to alleviate latency issues when scaling to large processor counts.148 Using asynchronous communications is non-trivial in HANDE due to the annihilation stage of FCIQMC-like algorithms. We use the following algorithm: Consider the evolution of walkers from τ to τ + ∆τ, then for each processor the following steps are carried out:

1. Initialise the non-blocking receive of walkers spawned onto the current processor from time τ.

2. Evolve the main list to time τ + ∆τ.

3. Complete the receive of walkers.

4. Evolve the received walkers to τ + ∆τ.

5. Annihilate walkers spawned from the evolution of the two lists as well as the evolved received list with the main list on this processor.

6. Send remaining spawned walkers to their new processors.

While this requires more work per iteration, it should result in improved efficiency if the time taken to complete this work is less than the latency time. This also ensures faster processors can continue doing work, i.e. evolving the main list, while waiting for other processors to finish evolving their main lists. For communications to be truly overlapping the slowest processor

17 would need to complete the steps above before sponse in quantum Monte Carlo. J. the fastest processor reaches step (3), otherwise Chem. Phys. 2016, 145, 081103. there will be latency as the received list cannot be evolved before all walkers spawned onto a (7) Grimm, R. C.; Storer, R. G. Monte- given processor are received. Carlo solution of Schrödinger’s equation. It should be pointed out that walkers spawned J. Comput. Phys. 1971, 7, 134. onto a processor at time τ are only annihilated (8) Anderson, J. B. A random-walk simula- with the main list after evolution to τ + ∆τ, + tion of the Schrodinger equation: H3 . J. which differs from the normal algorithm. While Chem. Phys. 1975, 63, 1499. annihilation is vital to attaining converged re- sults26,42 the times at which it takes place is (9) Umrigar, C. J.; Nightingale, M. P.; somewhat arbitrary, once walkers are annihi- Runge, K. J. A diffusion Monte Carlo al- lated at the same point in simulation time. gorithm with very small time-step errors. Communication between processors is also re- J. Chem. Phys. 1993, 99, 2865. quired when collecting statistics, however the usual collectives required for this can simply (10) Kim, J.; Baczewski, A. D.; be replaced by the corresponding non-blocking Beaudet, T. D. et al. QMCPACK: procedures. This does require that information an open source ab initio quantum is printed out in a staggered fashion but this is Monte Carlo package for the electronic of minor concern. structure of atoms, molecules and solids. J. Phys. Condsens. Matter 2018, 30, 195901. References (11) Zhang, S.; Krakauer, H. Quantum Monte (1) Foulkes, W. M. C.; Mitas, L.; Carlo Method using Phase-Free Random Needs, R. J. et al. Quantum Monte Walks with Slater Determinants. Phys. Carlo simulations of solids. Rev. Mod. Rev. Lett. 2003, 90, 136401. Phys. 2001, 73, 33–83. (12) Ciˇzek,ˇ J. On the Correlation Problem (2) McMillan, W. L. Ground State of Liquid in Atomic and Molecular Systems. Cal- 4He. Phys. Rev. 
1965, 138, A442–A451. culation of Wavefunction Components in UrsellType Expansion Using Quan- (3) Umrigar, C. J.; Wilson, K. G.; tumField Theoretical Methods. J. Chem. Wilkins, J. W. Optimized Trial Wave Phys. 1966, 45, 4256–4266. Functions for Quantum Monte Carlo Calculations. Phys. Rev. Lett. 1988, 60, (13) Møller, C.; Plesset, M. S. Note on 1719. an Approximation Treatment for Many- Electron Systems. Phys. Rev. 1934, 46, (4) Umrigar, C. J.; Toulouse, J.; Filippi, C. 618–622. et al. Alleviation of the Fermion-Sign Problem by Optimization of Many-Body (14) Knowles, P. J.; Handy, N. C. A New Wave Functions. Phys. Rev. Lett. 2007, Determinant-based Full Configuration 98, 110201. Interaction Method. Chem. Phys. Letters 1984, 111, 315–321. (5) Neuscamman, E.; Umrigar, C. J.; Chan, G. K.-L. Optimizing large param- (15) Huron, B.; Malrieu, J. P.; Rancurel, P. eter sets in variational quantum Monte Iterative perturbation calculations of Carlo. Phys. Rev. B 2012, 85, 045103. ground and excited state energies from multiconfigurational zeroth-order wave- (6) Neuscamman, E. Variation after re- functions. J. Chem. Phys. 1973, 58, 5745–5759.

(16) Giner, E.; Scemama, A.; Caffarel, M. Using perturbatively selected configuration interaction in quantum Monte Carlo calculations. Can. J. Chem. 2013, 91, 879.
(17) Schriber, J. B.; Evangelista, F. A. Communication: An adaptive configuration interaction approach for strongly correlated electrons with tunable accuracy. J. Chem. Phys. 2016, 144, 161106.
(18) Tubman, N. M.; Lee, J.; Takeshita, T. Y. et al. A deterministic alternative to the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. 2016, 145, 044112.
(19) Holmes, A. A.; Tubman, N. M.; Umrigar, C. J. Heat-Bath Configuration Interaction: An Efficient Selected Configuration Interaction Algorithm Inspired by Heat-Bath Sampling. J. Chem. Theory Comput. 2016, 12, 3674–3680.
(20) Garniron, Y.; Scemama, A.; Loos, P.-F. et al. Hybrid stochastic-deterministic calculation of the second-order perturbative contribution of multireference perturbation theory. J. Chem. Phys. 2017, 147, 034101.
(21) Eriksen, J. J.; Lipparini, F.; Gauss, J. Virtual Orbital Many-Body Expansions: A Possible Route towards the Full Configuration Interaction Limit. J. Phys. Chem. Lett. 2017, 4633–4639.
(22) Saebo, S.; Pulay, P. Local Treatment of Electron Correlation. Annu. Rev. Phys. Chem. 1993, 44, 213–236.
(23) Hampel, C.; Werner, H.-J. Local treatment of electron correlation in coupled cluster theory. J. Chem. Phys. 1996, 104, 6286–6297.
(24) Riplinger, C.; Neese, F. An efficient and near linear scaling pair natural orbital based local coupled cluster method. J. Chem. Phys. 2013, 138, 034106.
(25) Ziółkowski, M.; Jansík, B.; Kjaergaard, T. et al. Linear scaling coupled cluster method with correlation energy based error control. J. Chem. Phys. 2010, 133, 014107.
(26) Booth, G. H.; Thom, A. J. W.; Alavi, A. Fermion Monte Carlo without fixed nodes: A game of life, death, and annihilation in Slater determinant space. J. Chem. Phys. 2009, 131, 054106–1–10.
(27) White, S. R. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 1992, 69, 2863.
(28) Chan, G. K.-L. An algorithm for large scale density matrix renormalization group calculations. J. Chem. Phys. 2004, 120, 3172.
(29) Olivares-Amaya, R.; Hu, W.; Nakatani, N. et al. The ab-initio density matrix renormalization group in practice. J. Chem. Phys. 2015, 142, 034102.
(30) Thom, A. J. W. Stochastic Coupled Cluster Theory. Phys. Rev. Lett. 2010, 105, 263004–1–4.
(31) Spencer, J. S.; Thom, A. J. W. Developments in stochastic coupled cluster theory: The initiator approximation and application to the uniform electron gas. J. Chem. Phys. 2016, 144, 084108.
(32) Blunt, N. S.; Rogers, T. W.; Spencer, J. S. et al. Density-matrix quantum Monte Carlo method. Phys. Rev. B 2014, 89, 245124.
(33) Malone, F. D.; Blunt, N. S.; Shepherd, J. J. et al. Interaction picture density matrix quantum Monte Carlo. J. Chem. Phys. 2015, 143, 044116.
(34) Ten-no, S. Stochastic determination of effective Hamiltonian for the full configuration interaction solution of quasi-degenerate electronic states. J. Chem. Phys. 2013, 138, 164126.

(35) Ohtsuka, Y.; Ten-no, S. A study of potential energy curves from the model space quantum Monte Carlo method. J. Chem. Phys. 2015, 143, 214107.
(36) Ten-no, S. Multi-state effective Hamiltonian and size-consistency corrections in stochastic configuration interactions. J. Chem. Phys. 2017, 147, 244107.
(37) McClean, J. R.; Aspuru-Guzik, A. Clock quantum Monte Carlo technique: An imaginary-time method for real-time quantum dynamics. Phys. Rev. A 2015, 91.
(38) Nagy, A.; Savona, V. Driven-dissipative quantum Monte Carlo method for open quantum systems. Phys. Rev. A 2018, 97, 052129.
(39) Booth, G. H.; Chan, G. K.-L. Communication: Excited states, dynamic correlation functions and spectral properties from full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2012, 137, 191102.
(40) Humeniuk, A.; Mitrić, R. Excited states from quantum Monte Carlo in the basis of Slater determinant. J. Chem. Phys. 2014, 141, 194104.
(41) Blunt, N. S.; Smart, S. D.; Booth, G. H. et al. An excited-state approach within full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2015, 143, 134117.
(42) Spencer, J. S.; Blunt, N. S.; Foulkes, W. M. The sign problem and population dynamics in the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. 2012, 136, 054110–1–10.
(43) Booth, G. H.; Smart, S. D.; Alavi, A. Linear-scaling and parallelisable algorithms for stochastic quantum chemistry. Mol. Phys. 2014, 112, 1855–1869.
(44) Petruzielo, F. R.; Holmes, A. A.; Changlani, H. J. et al. Semistochastic Projector Monte Carlo Method. Phys. Rev. Lett. 2012, 109, 230201.
(45) Overy, C.; Booth, G. H.; Blunt, N. S. et al. Unbiased reduced density matrices and electronic properties from full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2014, 141, 244117.
(46) Blunt, N. S.; Smart, S. D.; Kersten, J. A. F. et al. Semi-stochastic full configuration interaction quantum Monte Carlo: Developments and application. J. Chem. Phys. 2015, 142, 184107.
(47) Holmes, A. A.; Changlani, H. J.; Umrigar, C. J. Efficient Heat-Bath Sampling in Fock Space. J. Chem. Theory Comput. 2016, 12, 1561–1571.
(48) Neufeld, V. A.; Thom, A. J. W. Exciting determinants in Quantum Monte Carlo: Loading the dice with fast, low memory weights. arXiv [physics.chem-ph] 2018.
(49) Cleland, D.; Booth, G. H.; Alavi, A. Communications: Survival of the fittest: Accelerating convergence in full configuration-interaction quantum Monte Carlo. J. Chem. Phys. 2010, 132, 041103.
(50) Spencer, J. S.; Neufeld, V. A.; Vigor, W. A. et al. Large Scale Parallelization in Stochastic Coupled Cluster. arXiv [physics.chem-ph] 2018.
(51) Scott, C. J. C.; Thom, A. J. W. Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion. J. Chem. Phys. 2017, 147, 124105.
(52) The original algorithm used integer weights. It was subsequently shown that floating-point weights greatly reduce the stochastic noise. HANDE-QMC uses a fixed-precision representation for the weights so that both approaches can be handled straightforwardly.

(53) Tubman, N. M.; Lee, J.; Takeshita, T. Y. et al. A deterministic alternative to the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. 2016, 145, 044112.
(54) Malone, F. D.; Blunt, N.; Brown, E. W. et al. Accurate Exchange-Correlation Energies for the Warm Dense Electron Gas. Phys. Rev. Lett. 2016, 117, 115701.
(55) Blunt, N. S.; Booth, G. H.; Alavi, A. Density matrices in full configuration interaction quantum Monte Carlo: Excited states, transition dipole moments, and parallel distribution. J. Chem. Phys. 2017, 146, 244105.
(56) Shepherd, J. J.; Henderson, T. M.; Scuseria, G. E. Using full configuration interaction quantum Monte Carlo in a seniority zero space to investigate the correlation energy equivalence of pair coupled cluster doubles and doubly occupied configuration interaction. J. Chem. Phys. 2016, 144, 094112.
(57) The use of Fortran 2003 and 2008 imposes a need for a recent Fortran compiler. Indeed, we have found bugs in both open-source and proprietary compilers and worked around them where possible.
(58) Appleby, A. SMHasher. https://github.com/aappleby/smhasher.
(59) Saito, M.; Matsumoto, M. dSFMT. https://github.com/MersenneTwister-Lab/dSFMT.
(60) Moshier, S. R. cephes. http://www.netlib.org/cephes/.
(61) Wu, K.; Simon, H. TRLan. https://codeforge.lbl.gov/projects/trlan.
(62) Yamazaki, I.; Bai, Z.; Simon, H. et al. Adaptive Projection Subspace Dimension for the Thick-Restart Lanczos Method. ACM Trans. Math. Softw. 2010, 37, 27.
(63) The HDF Group, Hierarchical Data Format, version 5. 1997-2018; http://www.hdfgroup.org/HDF5/.
(64) Ierusalimschy, R. Programming in Lua, 4th ed.; Lua.org, 2016.
(65) AOTUS: Advanced Options and Tables in Universal Scripting. https://geb.sts.nt.uni-siegen.de/doxy/aotus/index.html.
(66) JSON. https://www.json.org/, Accessed: 2018-10-31.
(67) CMake. https://cmake.org/, Accessed: 2018-10-31.
(68) Parrish, R. M.; Burns, L. A.; Smith, D. G. A. et al. Psi4 1.1: An Open-Source Electronic Structure Program Emphasizing Automation, Advanced Libraries, and Interoperability. J. Chem. Theory Comput. 2017, 13, 3185–3197.
(69) Toon Verstraelen, Pawel Tecmer, Farnaz Heidar-Zadeh, Cristina E. González-Espinoza, Matthew Chan, Taewon D. Kim, Katharina Boguslawski, Stijn Fias, Steven Vandenbrande, Diego Berrocal, and Paul W. Ayers HORTON 2.1.0, http://theochem.github.com/horton/, 2017.
(70) Sun, Q.; Berkelbach, T. C.; Blunt, N. S. et al. PySCF: the Python-based simulations of chemistry framework. Wiley Interdisciplinary Reviews: Computational Molecular Science 8, e1340.
(71) Shao, Y.; Gan, Z.; Epifanovsky, E. et al. Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Molecular Physics 2015, 113, 184–215.
(72) Werner, H.-J.; Knowles, P. J.; Knizia, G. et al. MOLPRO, version 2015.1, a package of ab initio programs; 2015.
(73) Dunning, T. H., Jr. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon

and hydrogen. J. Chem. Phys. 1989, 90, 1007–1023.
(74) DMQMC averages over independent calculations and so does not suffer from a correlation issue.
(75) Flyvbjerg, H.; Petersen, H. G. Error estimates on averages of correlated data. J. Chem. Phys. 1989, 91, 461–466.
(76) https://github.com/jsspencer/pyblock.
(77) Kent, D. R.; Muller, R. P.; Anderson, A. G. et al. Efficient algorithm for on-the-fly error analysis of local or distributed serially correlated data. Journal of Computational Chemistry 2007, 28, 2309–2316.
(78) Leach, P.; Mealling, M.; Salz, R. A universally unique identifier (UUID) URN namespace; 2005.
(79) https://hande.readthedocs.io.
(80) Git is a very powerful distributed version control system and has become the de facto standard in software development. Its decentralized model has undoubtedly contributed to the huge growth of open-source software. The official documentation is available online https://git-scm.com/, but, as with any powerful tool, it is not an easy task to become familiar with it. A large number of tutorials are available online. We can recommend http://gitimmersion.com/ and https://coderefinery.github.io/git-intro/.
(81) https://github.com/hande-qmc/hande.
(82) We note that access to the private repository is liberally granted.
(83) Buildbot. https://buildbot.net/, Accessed: 2018-11-1.
(84) Spencer, J. S.; Blunt, N. S.; Vigor, W. A. et al. Open-Source Development Experiences in Scientific Software: The HANDE Quantum Monte Carlo Project. J. Open Res. Softw. 2015, 3, 1–6.
(85) McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, 2nd ed.; O’Reilly Media, Incorporated, 2017.
(86) Vigor, W. A.; Spencer, J. S.; Bearpark, M. J. et al. Understanding and improving the efficiency of full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2016, 144, 094110.
(87) Vigor, W. A.; Spencer, J. S.; Bearpark, M. J. et al. Minimising biases in full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2015, 142, 104101.
(88) Oliphant, T. E. Guide to NumPy, 2nd ed.; CreateSpace Independent Publishing Platform: USA, 2015.
(89) Jones, E.; Oliphant, T.; Peterson, P. et al. SciPy: Open source scientific tools for Python. 2001; http://www.scipy.org/.
(90) Hunter, J. D. Matplotlib: A 2D graphics environment. Computing In Science & Engineering 2007, 9, 90–95.
(91) Rosen, L. Open Source Licensing: Software Freedom and Intellectual Property Law; Prentice Hall PTR: Upper Saddle River, NJ, USA, 2004.
(92) St. Laurent, A. M. Understanding Open Source and Free Software Licensing; O’Reilly Media, 2004.
(93) The full legal text of the license is available from the Free Software Foundation: https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html.
(94) Legal text available from the Open Source Initiative: https://opensource.org/licenses/BSD-3-Clause.

(95) MRCC, a quantum chemical program suite written by M. Kállay, Z. Rolik, J. Csontos, P. Nagy, G. Samu, D. Mester, J. Csóka, B. Szabó, I. Ladjánszki, L. Szegedy, B. Ladóczki, K. Petrov, M. Farkas, P. D. Mezei, and B. Hégely. See also Z. Rolik, L. Szegedy, I. Ladjánszki, B. Ladóczki, and M. Kállay, J. Chem. Phys. 139, 094105 (2013), as well as: www.mrcc.hu.
(96) Loos, P.-F.; Gill, P. M. W. The uniform electron gas. WIREs Comput. Mol. Sci. 2016, 6, 410–429.
(97) Giuliani, G.; Vignale, G. Quantum Theory Electron Liq.; Cambridge University Press: Cambridge, 2005; pp 1–68.
(98) Martin, R. M. Electron. Struct.; Cambridge University Press: Cambridge, 2004; pp 100–118.
(99) Gutzwiller, M. C. Effect of Correlation on the Ferromagnetism of Transition Metals. Phys. Rev. Lett. 1963, 10, 159–162.
(100) Hubbard, J. Electron Correlations in Narrow Energy Bands. Proc. R. Soc. London A Math. Phys. Eng. Sci. 1963, 276, 238–257.
(101) Kanamori, J. Electron Correlation and Ferromagnetism of Transition Metals. Prog. Theor. Phys. 1963, 30, 275–289.
(102) Altland, A.; Simons, B. Condensed Matter Field Theory; Cambridge University Press, 2010.
(103) Ceperley, D. M.; Alder, B. J. Ground State of the Electron Gas by a Stochastic Method. Phys. Rev. Lett. 1980, 45, 566–569.
(104) Perdew, J. P.; Zunger, A. Self-interaction correction to density-functional approximations for many-electron systems. Phys. Rev. B 1981, 23, 5048–5079.
(105) Giuliani, G.; Vignale, G. Quantum Theory Electron Liq.; Cambridge University Press: Cambridge, 2005; pp 327–404.
(106) Freeman, D. L. Coupled-cluster expansion applied to the electron gas: Inclusion of ring and exchange effects. Phys. Rev. B 1977, 15, 5512–5521.
(107) Bishop, R. F.; Lührmann, K. H. Electron correlations: I. Ground-state results in the high-density regime. Phys. Rev. B 1978, 17, 3757–3780.
(108) Bishop, R. F.; Lührmann, K. H. Electron correlations. II. Ground-state results at low and metallic densities. Phys. Rev. B 1982, 26, 5523–5557.
(109) Shepherd, J. J.; Grüneis, A.; Booth, G. H. et al. Convergence of many-body wavefunction expansions using a plane wave basis: from the homogeneous electron gas to the solid state. 2012.
(110) Shepherd, J. J.; Grüneis, A. Many-body quantum chemistry for the electron gas: Convergent perturbative theories. Phys. Rev. Lett. 2013, 110, 226401.
(111) Roggero, A.; Mukherjee, A.; Pederiva, F. Quantum Monte Carlo with coupled-cluster wave functions. Phys. Rev. B 2013, 88, 115138.
(112) McClain, J.; Lischner, J.; Watson, T. et al. Spectral functions of the uniform electron gas via coupled-cluster theory and comparison to the GW and related approximations. Phys. Rev. B 2016, 93, 235139.
(113) Shepherd, J. J. Communication: Convergence of many-body wave-function expansions using a plane-wave basis in the thermodynamic limit. J. Chem. Phys. 2016, 145, 031104.
(114) Shepherd, J. J.; Booth, G.; Grüneis, A. et al. Full configuration interaction perspective on the homogeneous electron gas. Phys. Rev. B Condens. Matter 2012, 85.

(115) Shepherd, J. J.; Booth, G. H.; Alavi, A. Investigation of the full configuration interaction quantum Monte Carlo method using homogeneous electron gas models. J. Chem. Phys. 2012, 136, 244101.
(116) Neufeld, V. A.; Thom, A. J. W. A study of the dense uniform electron gas with high orders of coupled cluster. J. Chem. Phys. 2017, 147, 194105.
(117) Luo, H.; Alavi, A. Combining the Transcorrelated Method with Full Configuration Interaction Quantum Monte Carlo: Application to the Homogeneous Electron Gas. J. Chem. Theory Comput. 2018, 14, 1403–1411.
(118) Ruggeri, M.; Ríos, P. L.; Alavi, A. Correlation energies of the high-density spin-polarized electron gas to meV accuracy. arXiv [cond-mat.str-el] 2018.
(119) Blunt, N. S. Communication: An efficient and accurate perturbative correction to initiator full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2018, 148, 221101.
(120) Brown, E. W.; Clark, B. K.; DuBois, J. L. et al. Path-Integral Monte Carlo Simulation of the Warm Dense Homogeneous Electron Gas. Phys. Rev. Lett. 2013, 110, 146405.
(121) Schoof, T.; Groth, S.; Vorberger, J. et al. Ab Initio Thermodynamic Results for the Degenerate Electron Gas at Finite Temperature. Phys. Rev. Lett. 2015, 115, 130402.
(122) Groth, S.; Schoof, T.; Dornheim, T. et al. Ab initio quantum Monte Carlo simulations of the uniform electron gas without fixed nodes. Phys. Rev. B 2016, 93, 085102.
(123) Dornheim, T.; Groth, S.; Schoof, T. et al. Ab initio quantum Monte Carlo simulations of the uniform electron gas without fixed nodes: The unpolarized case. Phys. Rev. B 2016, 93, 205134.
(124) Dornheim, T.; Groth, S.; Malone, F. D. et al. Ab initio quantum Monte Carlo simulation of the warm dense electron gas. Phys. Plasmas 2017, 24, 056303.
(125) Brown, E. W.; DuBois, J. L.; Holzmann, M. et al. Exchange-correlation Energy for the Three-Dimensional Homogeneous Electron Gas at Arbitrary Temperature. Phys. Rev. B 2013, 88, 081102.
(126) Karasiev, V. V.; Sjostrom, T.; Dufty, J. et al. Accurate Homogeneous Electron Gas Exchange-Correlation Free Energy for Local Spin-Density Calculations. Phys. Rev. Lett. 2014, 112, 076403.
(127) Dornheim, T.; Groth, S.; Sjostrom, T. et al. Ab Initio Quantum Monte Carlo Simulation of the Warm Dense Electron Gas in the Thermodynamic Limit. Phys. Rev. Lett. 2016, 117, 156403.
(128) Groth, S.; Dornheim, T.; Sjostrom, T. et al. Ab initio Exchange-Correlation Free Energy of the Uniform Electron Gas at Warm Dense Matter Conditions. Phys. Rev. Lett. 2017, 119, 135001.
(129) Dornheim, T.; Groth, S.; Bonitz, M. The uniform electron gas at warm dense matter conditions. Physics Reports 2018, 744, 1.
(130) Fortney, J. J.; Glenzer, S. H.; Koenig, M. et al. Frontiers of the physics of dense plasmas and planetary interiors: Experiments, theory, and applications. Phys. Plasmas 2009, 16, 041003.
(131) Hu, S. X.; Militzer, B.; Goncharov, V. N. et al. First-principles equation-of-state table of deuterium for inertial confinement fusion applications. Phys. Rev. B 2011, 84, 224109.
(132) VandeVondele, J.; Krack, M.; Mohamed, F. et al. Quickstep: Fast and accurate density functional calculations using a mixed Gaussian and plane waves approach. Comput. Phys. Commun. 2005, 167, 103–128.

(133) Hutter, J.; Iannuzzi, M.; Schiffmann, F. et al. CP2K: atomistic simulations of condensed matter systems. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2014, 4, 15–25.
(134) Goedecker, S.; Teter, M.; Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 1996, 54, 1703–1710.
(135) Hartwigsen, C.; Goedecker, S.; Hutter, J. Relativistic separable dual-space Gaussian pseudopotentials from H to Rn. Phys. Rev. B 1998, 58, 3641–3662.
(136) Sun, Q.; Berkelbach, T. C.; McClain, J. D. et al. Gaussian and plane-wave mixed density fitting for periodic systems. J. Chem. Phys. 2017, 147, 164119.
(137) Vosko, S. H.; Wilk, L.; Nusair, M. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 1980, 58, 1200–1211.
(138) Booth, G. H.; Grüneis, A.; Kresse, G. et al. Towards an exact description of electronic wavefunctions in real solids. Nature 2013, 493, 365–370.
(139) Raghavachari, K.; Trucks, G. W.; Pople, J. A. et al. A fifth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett. 1989, 157, 479–483.
(140) Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 1994, 50, 17953–17979.
(141) McClain, J.; Sun, Q.; Chan, G. K.-L. et al. Gaussian-Based Coupled-Cluster Theory for the Ground-State and Band Structure of Solids. J. Chem. Theory Comput. 2017, 13, 1209–1218.
(142) Gruber, T.; Liao, K.; Tsatsoulis, T. et al. Applying the Coupled-Cluster Ansatz to Solids and Surfaces in the Thermodynamic Limit. Phys. Rev. X 2018, 8, 021043.
(143) Grüneis, A.; Booth, G. H.; Marsman, M. et al. Natural Orbitals for Wave Function Based Correlated Calculations Using a Plane Wave Basis Set. J. Chem. Theory Comput. 2011, 7, 2780–2785.
(144) Wagner, L. K.; Ceperley, D. M. Discovering correlated fermions using quantum Monte Carlo. Reports Prog. Phys. 2016, 79, 094501.
(145) We note we probably would have chosen, like Psi4, to use Python via the excellent pybind11 library had HANDE-QMC been written in C++, in part due to Python already being used extensively in scientific research.
(146) Lu, J.; Wang, Z. The full configuration interaction quantum Monte Carlo method in the lens of inexact power iteration. arXiv:1711.09153 [physics] 2017, arXiv: 1711.09153.
(147) Deustua, J. E.; Shen, J.; Piecuch, P. Converging High-Level Coupled-Cluster Energetics by Monte Carlo Sampling and Moment Expansions. Phys. Rev. Lett. 2017, 119, 223003.
(148) Gillan, M. J.; Towler, M. D.; Alfe, D. Petascale computing opens new vistas for quantum Monte Carlo. Psi-k Newsletter 2011, 103, 32.