The HANDE-QMC project: open-source stochastic quantum chemistry from the ground state up

James S. Spencer,†,‡ Nick S. Blunt,¶,§,∥∥ Seonghoon Choi,¶,∥∥ Jiří Etrych,¶,∥∥ Maria-Andreea Filip,¶,∥∥ W. M. C. Foulkes,†,∥,∥∥ Ruth S.T. Franklin,¶,∥∥ Will J. Handley,⊥,#,@,∥∥ Fionn D. Malone,†,△,∥∥ Verena A. Neufeld,¶,∥∥ Roberto Di Remigio,∇,††,∥∥ Thomas W. Rogers,†,∥∥ Charles J.C. Scott,¶,∥∥ James J. Shepherd,‡‡,∥∥ William A. Vigor,¶¶,∥∥ Joseph Weston,†,∥∥ RuQing Xu,§§,∥∥ and Alex J.W. Thom∗,¶,¶¶

†Dept. of Physics, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
‡Dept. of Materials, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
¶University Chemical Laboratory, Lensfield Road, Cambridge CB2 1EW, United Kingdom
§St John’s College, St John’s Street, Cambridge, CB2 1TP, United Kingdom
∥Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
⊥Astrophysics Group, Cavendish Laboratory, Cambridge, CB3 0HE, UK
#Kavli Institute for Cosmology, Madingley Road, Cambridge, CB3 0HA, UK
@Gonville & Caius College, Trinity Street, Cambridge, CB2 1TA, UK
△Quantum Simulations Group, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
∇Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, University of Tromsø - The Arctic University of Norway, N-9037 Tromsø, Norway
††Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24061, United States
‡‡Chemistry Building, University of Iowa, Iowa, 52240, USA
¶¶Dept. of Chemistry, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
§§Department of Modern Physics, University of Science and Technology, Hefei, Anhui, 230026, China
∥∥Authors listed alphabetically

E-mail: [email protected]

Abstract

Building on the success of Quantum Monte Carlo techniques such as diffusion Monte Carlo, alternative stochastic approaches to solve electronic structure problems have emerged over the last decade. The full configuration interaction quantum Monte Carlo (FCIQMC) method allows one to systematically approach the exact solution of such problems, for cases where very high accuracy is desired. The introduction of FCIQMC has subsequently led to the development of coupled cluster Monte Carlo (CCMC) and density matrix quantum Monte Carlo (DMQMC), allowing stochastic sampling of the coupled cluster wave function and the exact thermal density matrix, respectively. In this article we describe the HANDE-QMC code, an open-source implementation of FCIQMC, CCMC and DMQMC, including initiator and semi-stochastic adaptations. We describe our code and demonstrate its use on three example systems: a molecule (nitric oxide), a model solid (the uniform electron gas), and a real solid (diamond). An illustrative tutorial is also included.

1 Introduction

Quantum Monte Carlo (QMC) methods, in their many forms, are among the most reliable and accurate tools available for the investigation of realistic quantum systems.1 QMC methods have existed for decades, including notable approaches such as variational Monte Carlo (VMC),2–6 diffusion Monte Carlo (DMC)1,7–10 and auxiliary-field QMC (AFQMC);11 such methods typically have low scaling with system size, efficient large-scale parallelization, and systematic improvability, often allowing benchmark quality results in challenging systems.

A separate hierarchy exists in quantum chemistry, consisting of methods such as coupled cluster (CC) theory,12 Møller-Plesset perturbation theory (MPPT),13 and configuration interaction (CI), with full CI (FCI)14 providing the exact benchmark within a given single-particle basis set. The scaling with the number of basis functions can be steep for these methods: from $N^4$ for MP2 to exponential for FCI. Various approaches to tackle the steep scaling wall have been proposed in the literature: from adaptive selection algorithms15–20 and many-body expansions for CI21 to the exploitation of the locality of the one-electron basis set22 for MP2 and CC.23–25 Such approaches have been increasingly successful, now often allowing chemical accuracy to be achieved for systems comprising thousands of basis functions.

In 2009, Booth, Thom and Alavi introduced the full configuration interaction quantum Monte Carlo (FCIQMC) method.26 The FCIQMC method allows essentially exact FCI results to be achieved for systems beyond the reach of traditional, exact FCI approaches; in this respect, the method occupies a similar space to the density matrix renormalization group (DMRG) algorithm27–29 and selected CI approaches.15–20 Employing a sparse and stochastic sampling of the FCI wave function greatly reduces the memory requirements compared to exact approaches. The introduction of FCIQMC has led to the development of several other related QMC methods, including coupled cluster Monte Carlo (CCMC),30,31 density matrix quantum Monte Carlo (DMQMC),32,33 model space quantum Monte Carlo (MSQMC),34–36 clock quantum Monte Carlo,37 driven-dissipative quantum Monte Carlo (DDQMC),38 and several other variants, including multiple approaches for studying excited-state properties.34,39–41

In this article we present HANDE-QMC (Highly Accurate N-DEterminant Quantum Monte Carlo), an open-source quantum chemistry code that performs several of the above quantum Monte Carlo methods. In particular, we have developed a highly-optimized and massively-parallelized package to perform state-of-the-art FCIQMC, CCMC and DMQMC simulations.

An overview of stochastic quantum chemistry methods in HANDE-QMC is given in Section 2. Section 3 describes the HANDE-QMC package, including implementation details, our development experiences, and analysis tools. Applications of FCIQMC, CCMC and DMQMC methods are contained in Section 4. We conclude with a discussion in Section 5 with views on scientific software development and an outlook on future work. A tutorial on running HANDE is provided in the Supplementary Material.

2 Stochastic quantum chemistry

2.1 Full Configuration Interaction Quantum Monte Carlo

The FCI ansatz for the ground state wavefunction is $|\Psi_{\mathrm{CI}}\rangle = \sum_i c_i |D_i\rangle$, where $\{|D_i\rangle\}$ is the set of Slater determinants. Noting that $(1 - \delta\tau \hat{H})^N |\Psi_0\rangle \propto |\Psi_{\mathrm{CI}}\rangle$ as $N \to \infty$, where $\Psi_0$ is some arbitrary initial vector with $\langle \Psi_0 | \Psi_{\mathrm{CI}} \rangle \neq 0$ and $\delta\tau$ is sufficiently small,42 the coefficients $\{c_i\}$ can be found via an iterative process derived from a first-order solution to the imaginary-time Schrödinger equation:26

$$c_i(\tau + \delta\tau) = c_i(\tau) - \delta\tau \sum_j \langle D_i|\hat{H}|D_j\rangle\, c_j(\tau). \tag{1}$$

A key insight is that the action of the Hamiltonian can be applied stochastically rather than deterministically: the wavefunction is discretized by using a set of particles with weight ±1 to represent the coefficients, and is evolved in imaginary time by stochastically creating new particles according to the Hamiltonian matrix (Section 2.4). By starting with just particles on the Hartree–Fock determinant or a small number of determinants, the sparsity of the FCI wavefunction emerges naturally. The FCIQMC algorithm hence has substantially reduced memory requirements26 and is naturally scalable43 in contrast to conventional Lanczos techniques. The sign problem manifests itself in the competing in-phase and out-of-phase combinations of particles with positive and negative signs on the same determinant;42 this is alleviated by exactly canceling particles of opposite sign on the same determinant, a process termed ‘annihilation’. This results in the distinctive population dynamics of an FCIQMC simulation, and a system-specific critical population is required to obtain a statistical representation of the correct FCI wavefunction.42 Once the ground-state FCI wavefunction has been reached, the population is controlled via a diagonal energy offset9,26 and statistics can be accumulated for the energy estimator and, if desired, other properties.

The stochastic efficiency of the algorithm (determined by the size of statistical errors for a given computer time) can be improved by several approaches: using real weights, rather than integer weights, to represent particle amplitudes;44,45 a semi-stochastic propagation, in which the action of the Hamiltonian in a small subspace of determinants is applied exactly;44,46 and more efficient sampling of the Hamiltonian by incorporating information about the magnitude of the Hamiltonian matrix elements into the selection probabilities.47,48

The initiator approximation49 (often referred to as i-FCIQMC) only permits new particles to be created on previously unoccupied determinants if the spawning determinant has a weight above a given threshold; this introduces a systematic error which is reduced with increasing particle populations, but effectively reduces the severity of the sign problem. This simple modification has proven remarkably successful and permits FCI-quality calculations on Hilbert spaces orders of magnitude beyond exact FCI.

2.2 Coupled Cluster Monte Carlo

The coupled cluster wavefunction ansatz is $|\Psi_{\mathrm{CC}}\rangle = N e^{\hat{T}} |D_{\mathrm{HF}}\rangle$, where $\hat{T}$ is the cluster operator containing all excitations up to a given truncation level, $N$ is a normalisation factor and $|D_{\mathrm{HF}}\rangle$ the Hartree–Fock determinant. For convenience, we rewrite the wavefunction ansatz as $|\Psi_{\mathrm{CC}}\rangle = t_{\mathrm{HF}}\, e^{\hat{T}/t_{\mathrm{HF}}} |D_{\mathrm{HF}}\rangle$, where $t_{\mathrm{HF}}$ is a weight on the Hartree–Fock determinant, and define $\hat{T} = \sum_i' t_i \hat{a}_i$, where the prime restricts the sum to be up to the truncation level, $\hat{a}_i$ is an excitation operator (excitor) such that $\hat{a}_i |D_{\mathrm{HF}}\rangle$ results in $|D_i\rangle$ and $t_i$ is the corresponding amplitude. Using the same first-order Euler approach as in FCIQMC gives a similar propagation equation:

$$t_i(\tau + \delta\tau) = t_i(\tau) - \delta\tau \sum_j \langle D_i|\hat{H}|D_j\rangle\, \tilde{t}_j(\tau). \tag{2}$$

The key difference between Eqs. (1) and (2) is that $\tilde{t}_j = \langle D_j|\Psi_{\mathrm{CC}}\rangle$ contains contributions from clusters of excitors30 whereas the FCI wavefunction is a simple linear combination. This is tricky to

evaluate efficiently and exactly each iteration. Instead, $\tilde{t}_j$ is sampled and individual contributions propagated separately.30,50,51 Bar this complication, the coupled cluster wavefunction can be stochastically evolved using the same approach as used in FCIQMC.

2.3 Density Matrix Quantum Monte Carlo

FCIQMC and CCMC are both ground-state, zero-temperature methods (although excited-state variants of FCIQMC exist34,39–41). The exact thermodynamic properties of a quantum system in thermal equilibrium can be determined from the (unnormalized) $N$-particle density matrix, $\hat{\rho}(\beta) = e^{-\beta\hat{H}}$, where $\beta = 1/k_B T$. A direct evaluation of $\hat{\rho}(\beta)$ requires knowledge of the full eigenspectrum of $\hat{H}$, a hopeless task for all but trivial systems. To make progress we note that the density matrix obeys the (symmetrized) Bloch equation

$$\frac{d\hat{\rho}}{d\beta} = -\frac{1}{2}\left[\hat{H}\hat{\rho} + \hat{\rho}\hat{H}\right]. \tag{3}$$

Representing $\hat{\rho}$ in the Slater determinant basis, $\rho_{ij} = \langle D_i|\hat{\rho}|D_j\rangle$, and again using a first-order update scheme results in similar update equations to FCIQMC and CCMC:

$$\rho_{ij}(\beta + \delta\beta) = \rho_{ij}(\beta) - \frac{\delta\beta}{2} \sum_k \left[ \langle D_i|\hat{H}|D_k\rangle\, \rho_{kj}(\beta) + \rho_{ik}(\beta)\, \langle D_k|\hat{H}|D_j\rangle \right]. \tag{4}$$

It follows that elements of the density matrix can be updated stochastically in a similar fashion to FCIQMC and CCMC. $\rho(\beta)$ is a single stochastic measure of the exact density matrix at inverse temperature $\beta$. Therefore, unlike FCIQMC and CCMC, multiple independent simulations must be performed in order to gather statistics at each temperature. The simplest starting point for a simulation is at $\beta = 0$, where $\rho$ is the identity matrix. Each simulation (termed ‘β-loop’) consists of sampling the identity matrix and propagating to the desired value of $\beta$. Averaging over multiple β-loops gives thermal properties at all temperatures in the range $[0, \beta]$.

While this scheme is exact (except for small and controllable errors due to finite $\delta\beta$), it suffers from the issue that important states at low temperature may not be sampled in the initial ($\beta = 0$) density matrix, where all configurations are equally important.33 To overcome this, we write $\hat{H} = \hat{H}^0 + \hat{V}$ and define the auxiliary density matrix $\hat{f}(\tau) = e^{-(\beta-\tau)\hat{H}^0} \hat{\rho}(\tau)$ with the following properties:

$$\hat{f}(0) = e^{-\beta\hat{H}^0}, \tag{5}$$
$$\hat{f}(\beta) = \hat{\rho}(\beta), \tag{6}$$
$$\frac{d\hat{f}}{d\tau} = \hat{H}^0\hat{f} - \hat{f}\hat{H}. \tag{7}$$

We see that with this form of density matrix we can begin the simulation from a mean-field solution defined by $\hat{H}^0$, which should (by construction) lead to a distribution containing the desired important states (such as the Hartree–Fock density matrix element) at low temperature. Furthermore, if $\hat{H}^0$ is a good mean field Hamiltonian then $e^{\beta\hat{H}^0}\hat{\rho}$ is a slowly varying function of $\beta$, and is thus easier to sample. Comparing Eqs. (3) and (7), we see that $\hat{f}$ can be stochastically sampled in a similar fashion to DMQMC, with minor modifications relative to using the unsymmetrized Bloch equation:32 i) the choice of $\hat{H}^0$ changes the probability of killing a particle (Section 2.4); ii) the $\tau = 0$ initial configuration must be sampled according to $\hat{H}^0$ rather than the identity matrix; iii) evolving to $\tau = \beta$ gives a sample of the density matrix at inverse temperature $\beta$ only; independent simulations must be performed to accumulate results at different temperatures. We term this method interaction-picture DMQMC (IP-DMQMC).

2.4 Commonality between FCIQMC, CCMC and DMQMC

FCIQMC, CCMC and DMQMC have more similarities than differences: the amplitudes within the wavefunction or density matrix are represented stochastically by a weight, or particle.52 These stochastic amplitudes are sampled to produce states, which make up the wavefunction or density matrix. For FCIQMC (DMQMC), a state corresponds to a determinant (outer product of two determinants), and for CCMC corresponds to a term sampled from the cluster expansion corresponding to a single determinant. The stochastic representation of the wavefunction or density matrix is evolved by

spawning: sampling the action of the Hamiltonian on each (occupied) state, which requires random selection of a state connected to the original state. The process of random selection (‘excitation generation’) is system-dependent, as it depends upon the connectivity of the Hamiltonian matrix; efficient sampling of the Hamiltonian has a substantial impact on the stochastic efficiency of a simulation.44,47,48

death: killing each particle with probability proportional to its diagonal Hamiltonian matrix element.

annihilation: combining particles on the same state and canceling out particles with the same absolute weight but opposite sign.

Energy estimators can be straightforwardly accumulated during the evolution process. A parallel implementation distributes states over multiple processors, each of which need only evolve its own set of states. The annihilation stage then requires an efficient process for determining to which processor a newly spawned particle should be sent.43 For CCMC an additional communication step is required to ensure that the sampling of products of amplitudes is unbiased.50

Hence, FCIQMC, CCMC and DMQMC share the majority of the core algorithms in the HANDE-QMC implementations. The primary difference is the representation of the wavefunction or density matrix, and the action of the Hamiltonian in the representation. These differences reside in the outer-most loop of the algorithm and so do not hinder the re-use of components between the methods. This remains the case even for linked coupled cluster Monte Carlo, which applies the similarity-transformed Hamiltonian, $e^{-T} H e^{T}$, and the interaction picture formulation of DMQMC.

It is important to note that this core paradigm also covers different approaches to propagation,34,44,49,53 the initiator approximation,31,49,54 excitation generators,47,48 excited states and properties,41,45,55 and can naturally be applied to different wavefunction Ansätze,56 which can be added relatively straightforwardly on top of a core implementation of FCIQMC. Due to this, improvements in, say, excitation generators can be immediately used across all methods in HANDE.

3 HANDE-QMC

3.1 Implementation

HANDE-QMC is implemented in Fortran and takes advantage of the increased expressiveness provided by the Fortran 2003 and 2008 standards.57 Parallelization over multiple processors is implemented using OpenMP (CCMC-only for intra-node shared memory communication) and MPI. Parallelization and the reusability of core procedures have been greatly aided by the use of pure procedures and minimal global state, especially for system and calculation data.

We attempt to use best-in-class libraries where possible. This allows for rapid development and a focus on the core QMC algorithms. HANDE-QMC relies upon MurmurHash2 for hashing operations,58 dSFMT for high-quality pseudo-random number generation,59 numerical libraries (cephes,60 LAPACK, ScaLAPACK, TRLan61,62) for special functions, matrix and vector procedures and Lanczos diagonalization, and HDF5 for file I/O.63 The input file to HANDE-QMC is a Lua script;64 Lua is a lightweight scripting language designed for embedding in applications and can easily be used from Fortran codes via the AOTUS library.65 Some of the advantages of using a scripting language for the input file are detailed in Section 5. Calculation, system settings and other metadata are included in the output in the JSON format,66 providing a good compromise between human- and machine-readable output.

HANDE can be compiled either into a standalone binary or into a library, allowing it to be used directly from existing quantum chemistry packages. CMake67 is used for the build system, which allows for auto-detection of compilers, libraries and available settings in most cases. A legacy Makefile is also included for compiling HANDE in more complex environments where direct and fine-grained control over settings is useful.

Integrals for molecular and solid systems can be generated by Hartree–Fock calculations using standard quantum chemistry programs, such as Psi4,68 HORTON,69 PySCF,70 Q-Chem,71 and MOLPRO,72 in the plain-text FCIDUMP format. HANDE can convert the FCIDUMP file into an HDF5 file, which gives a substantial space saving and can be read in substantially more quickly. For example, an all-electron FCIDUMP for coronene in a Dunning cc-pVDZ basis73 is roughly 35GB in size and takes 1840.88 seconds to read into HANDE and initialise. When converted to HDF5 format, the resulting file is 3.6GB in size and initialising an identical calculation takes only 60.83 seconds. This is useful in maximizing resource utilization when performing large production-scale calculations on HPC facilities. The memory demands of the integrals are reduced by storing the two-electron integrals only once on each node using either the MPI-3 shared memory functionality or, for older MPI implementations, POSIX shared memory.

In common with several Monte Carlo methods, data points from consecutive iterations are not independent, as the population at a given iteration depends on the population at the previous iteration. This autocorrelation must be removed in order to obtain accurate estimates of the standard error arising from FCIQMC and CCMC simulations74 and is most straightforwardly done via a reblocking analysis.75 This can be performed as a post-processing step76 but is also implemented as an on-the-fly algorithm,77 which enables calculations to be terminated once a desired statistical error has been reached.

It is often useful to continue an existing calculation; for example to accumulate more statistics to reduce the error bar, to save equilibration time when investigating the effect of calculation parameters or small geometry changes, or for debugging when the bug is only evident deep into a calculation. To aid these use cases, calculations can be stored and resumed via the use of restart files. The state of the pseudo-random number generator is included in the restart files such that restarted calculations follow the same Markov chain as if they had been run in a single calculation assuming the same calculation setup is used. We use the HDF5 format and library for efficient I/O and compact file sizes. A key advantage of this approach is that it abstracts the data layout into a hierarchy (termed groups and datasets). This makes extending the restart file format to include additional information whilst maintaining backward compatibility with previous calculations particularly straightforward. Each calculation is labeled with a universally unique identifier (UUID),78 stored in the restart file and included in the metadata of subsequent calculations. This is critical for tracing the provenance of data generated over multiple restarted calculations.

Extensive user-level documentation is included in the HANDE-QMC package79 and details compilation, input options, running HANDE and calculation analysis. The documentation also includes several tutorials on FCIQMC, CCMC and DMQMC, which guide new users through generating the integrals (if required), running a QMC calculation along with enabling options for improving stochastic efficiency, and analysing the calculations. The HANDE source code is also heavily commented and contains extensive explanations on the theories and methods implemented (especially for CCMC), and data structures. Each procedure also begins with a comment block describing its action, inputs and outputs. We find this level of developer documentation to be extremely important for onboarding new developers and making HANDE accessible to modifications by other researchers.

3.2 Development methodology

The HANDE-QMC project is managed using the Git distributed version control system.80 A public Git repository is hosted on GitHub81 and is updated with new features, improvements and bug fixes. We also use a private Git repository for more experimental development and research; this allows for new features to be iterated upon (and potentially changed or even removed) without introducing instability into the more widely available code.82 We regularly update the public version, from which official releases are made, with the changes made in the private repository. The code contains a comprehensive test suite83 and when new features are implemented they are benchmarked against existing codes wherever possible. Further details of our development practices such as our development philosophy and the extensive continuous integration set up using Buildbot84 are outlined in Ref. 85.

3.3 pyhande

Interpretation and analysis of calculation output is a critical part of computational science. While we wrote scripts for performing common analyses, such as reblocking to remove the effect of autocorrelation from estimates of the standard error, we found that users would write ad-hoc, fragile scripts for extracting other useful data, which were rarely shared and contained overlapping functionality. This additional barrier also hindered curiosity-driven exploration of results. To address this, the HANDE-QMC package includes pyhande, a Python library for working with HANDE calculation outputs. pyhande extracts metadata (including version, system and calculation parameters, calculation UUID) into a Python dictionary and the QMC output into a Pandas86 DataFrame, which provides a powerful abstraction for further analysis. pyhande includes scripts and functions to automate common tasks, including reblocking analysis, plateau and shoulder height31 estimation, stochastic inefficiency estimation87 and reweighting to reduce the bias arising from population control.9,88 We have found that the development of pyhande has aided reproducibility by providing a single, robust implementation for output parsing and common analyses, and has made more complex analyses more straightforward by providing rich access to raw data in a programmable environment. Indeed, many functions included in pyhande began as exploratory analysis in a Python shell or a Jupyter notebook. The HANDE-QMC documentation also details pyhande and the tutorials include several examples of using pyhande for data analysis. pyhande makes extensive use of the Python scientific stack (NumPy,89 SciPy,90 Pandas86 and Matplotlib91).

3.4 License

HANDE-QMC is licensed under the GNU Lesser General Public License, version 2.1. The LGPLv2.1 is a weak copyleft license,92,93 which allows the QMC implementations to be incorporated in both open- and closed-source quantum chemistry codes while encouraging developments and improvements to be contributed back or made available under the same terms.94 pyhande is licensed under the 3-Clause BSD License,95 in keeping with many scientific Python packages.

4 Example results

In this section we present calculations to demonstrate the core functionality included in HANDE-QMC: we consider a small molecule (nitric oxide); the uniform electron gas in the zero-temperature ground state and at finite temperatures; and a periodic solid, diamond, with k-point sampling. The supplementary material includes a tutorial on running and analyzing FCIQMC on the water molecule in a cc-pVDZ basis, which is easily accessible by deterministic methods and can be easily performed on any relatively modern laptop.

4.1 Computational details

All calculations in this section were run with HANDE versions earlier than version 1.3. Integrals were generated using PySCF, Psi4 and Q-Chem. Input, output and analysis scripts are available under a Creative Commons License at https://doi.org/10.17863/CAM.31933 containing specifics on which version is used for some calculations, and which SCF program is used. Orbital visualizations were made with IboView.96

4.2 Molecules: Nitric oxide

Nitric oxide is an important molecule, perhaps most notably as a signalling molecule in multiple physiological processes. Here, we consider NO in a cc-pVDZ basis set,73 correlating all 15 electrons. The FCI space size is $\sim 10^{12}$, and so is somewhat beyond the reach of exact FCI approaches. We consider initiator FCIQMC, using a walker population of $8 \times 10^6$, which is more than sufficient to achieve an accuracy of $\sim 0.1$ m$E_h$. This is then compared to CCMC results for the CCSD, CCSDT and CCSDTQ Ansätze. An unrestricted Hartree–Fock (UHF) molecular orbital basis is used. The computational resources to perform this study are modest compared to state-of-the-art FCIQMC simulations, never using more than about 100 processing cores.

In Figure 1 and Table 1, results are presented for this system at varying internuclear distances. Remarkably good agreement between CCSDTQ-MC and i-FCIQMC is achieved, with CCSDT-MC also performing extremely well. Statistical errors do not pose any issue in these results, as is typically the case for FCIQMC and CCMC simulations; all such error bars are naturally of order 0.1 m$E_h$ or less. For i-FCIQMC results the semi-stochastic adaptation was used,44,46 choosing the deterministic space by the approach of Ref. 46. Fig. 2 demonstrates such simulations before and after enabling semi-stochastic propagation, and the benefits are clear. Indeed, i-FCIQMC results here have statistical errors of order $\sim 1$ µ$E_h$ or smaller.

CCMC calculations were performed with real weights using the even selection algorithm.51 For the largest calculations, CCSDTQ-MC, heatbath excitation generators were used with up to $4.5 \times 10^6$ occupied excitors, parallelizing over 96 cores. For comparison, deterministic single reference CCSDTQ calculations performed with the MRCC program package97 required storage of $2.1 \times 10^7$ amplitudes, but did not converge beyond $R = 1.7$ Å.

Table 1 also shows the percentage of correlation energy captured by the various levels of CC, compared to i-FCIQMC. CCSD and CCSDT capture > 92% and > 98% of the correlation energy, respectively, with CCSDTQ essentially exact, and the percentage decreasing with increasing bond length as expected. The CCMC approach is particularly appropriate for such high-order CC calculations, where stochastic sampling naturally takes advantage of the sparse nature of the CC amplitudes.

[Figure 1 plot: two stacked panels of Energy/$E_h$ (offset $-1.293 \times 10^2$, range −0.3 to 0.0) against $R$/Å (0.8 to 2.4). Upper panel: CCSD-MC, CCSDT-MC, CCSDTQ-MC. Lower panel: CCSDTQ-MC, i-FCIQMC.]

Figure 1: The binding curve of NO in a cc-pVDZ basis set, correlating all electrons. Stochastic error bars are not visible on this scale, but all are smaller than 1 m$E_h$. For better resolution in the differences between methods, see Table 1.

4.3 Model Solid: Uniform electron gas

HANDE also has built-in capability to perform calculations of model systems commonly used in condensed matter physics, specifically the uniform electron gas (UEG),98–100 the Hubbard model,101–103 and the Heisenberg model.32,104
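A very small Hubbard model also gives a compact setting in which to illustrate the projection behind Eq. (1). The following is a minimal deterministic sketch, not HANDE's stochastic implementation: it applies the first-order update to the two-site Hubbard model at half filling and compares the result with the analytic ground-state energy of the dimer. The basis ordering and sign convention of the matrix, the parameter values ($t = 1$, $U = 4$), the time step, and the per-step renormalisation (which stands in for population control) are all illustrative assumptions.

```python
import numpy as np

# Two-site Hubbard model at half filling, S_z = 0 sector.
# Assumed basis ordering: |ud,0>, |0,ud>, |u,d>, |d,u>. The signs of the
# hopping elements depend on the fermionic ordering convention; this choice
# reproduces the standard dimer spectrum.
t, U = 1.0, 4.0
H = np.array([
    [U,   0.0, -t,  -t],
    [0.0, U,   -t,  -t],
    [-t,  -t,  0.0, 0.0],
    [-t,  -t,  0.0, 0.0],
])


def project(hamiltonian, c0, dtau, n_steps):
    """Apply c <- c - dtau * H c repeatedly (Eq. (1)), renormalising each step.

    The renormalisation is not part of Eq. (1); it only keeps the vector
    finite, playing the role that population control plays in a QMC run.
    """
    c = np.array(c0, dtype=float)
    for _ in range(n_steps):
        c = c - dtau * (hamiltonian @ c)
        c /= np.linalg.norm(c)
    return c


# Start from a single basis state with non-zero overlap with the ground state.
c = project(H, c0=[0.0, 0.0, 1.0, 0.0], dtau=0.05, n_steps=2000)
energy = c @ H @ c  # Rayleigh quotient (c is normalised)
exact = 0.5 * (U - np.sqrt(U**2 + 16.0 * t**2))
```

For these parameters the projected energy agrees with the analytic value $(U - \sqrt{U^2 + 16t^2})/2$. The stochastic methods replace the matrix-vector product with the spawning, death and annihilation steps of Section 2.4, so that the full vector never has to be stored.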

Table 1: CCMC and i-FCIQMC results for the NO molecule in a cc-pVDZ basis set, correlating all electrons, as plotted in Fig. 1. UHF orbitals were used. Numbers in parentheses show statistical error bars, not systematic initiator error, which is estimated to be $\sim 0.1$ m$E_h$ for i-FCIQMC results. i-FCIQMC results used the semi-stochastic adaptation with a deterministic space of size $2 \times 10^4$. The results of such a semi-stochastic approach are demonstrated in Fig. 2. The final three columns show the percentage of correlation energy recovered by CCSD-MC, CCSDT-MC and CCSDTQ-MC, compared to i-FCIQMC. i-FCIQMC calculations were performed with $8 \times 10^6$ walkers, and CCMC calculations used at most $7 \times 10^6$ excips.

        (Total energy + 129 E_h)/E_h                                   Correlation energy recovered (%)
R/Å     CCSD            CCSDT          CCSDTQ         i-FCIQMC         CCSD         CCSDT       CCSDTQ
0.9     -0.328507(1)    -0.3346(1)     -0.33523(4)    -0.335225(2)     97.7330(6)   99.78(5)    100.00(1)
1.0     -0.5162(2)      -0.52478(2)    -0.525448(6)   -0.525470(2)     97.06(8)     99.779(6)   99.993(2)
1.1     -0.582684(9)    -0.59317(8)    -0.59447(3)    -0.594565(3)     96.435(3)    99.58(2)    99.973(9)
1.154   -0.5904(5)      -0.6018(3)     -0.6035(2)     -0.603772(2)     96.1(2)      99.43(9)    99.92(5)
1.2     -0.58653(3)     -0.6005(4)     -0.6018(2)     -0.602136(3)     95.541(8)    99.5(1)     99.89(7)
1.3     -0.5622(2)      -0.5782(4)     -0.5790(6)     -0.580833(3)     94.67(5)     99.2(1)     99.5(2)
1.4     -0.5256(2)      -0.5451(10)    -0.5471(7)     -0.548340(3)     93.34(7)     99.1(3)     99.6(2)
1.7     -0.43299(10)    -0.4503(5)     -0.4543(1)     -0.455765(4)     92.13(3)     98.1(2)     99.48(4)
2.0     -0.39816(6)     -0.40800(9)    -0.41010(6)    -0.411350(2)     94.45(2)     98.59(4)    99.47(2)
2.5     -0.39132(5)     -0.39371(8)    -0.39434(2)    -0.3954786(4)    98.05(2)     99.17(4)    99.467(8)

[Figure 2 plot: (a) correlation energy/$E_h$ (−0.3450 to −0.3420) against iteration (0 to 150000); (b) correlation energy/$t$ (−0.3300 to −0.3292) against iteration (0 to 100000).]

Figure 2: Example simulations in HANDE-QMC using the semi-stochastic FCIQMC approach of Umrigar and co-workers.44 Vertical dashed lines show the iteration where the semi-stochastic adaptation is begun, and the resulting reduction in noise is clear thereafter. (a) NO in a cc-pVDZ basis set, with all electrons correlated, at an internuclear distance of 1.154 Å. The deterministic space is of size $2 \times 10^4$. (b) A half-filled two-dimensional 18-site Hubbard model at $U/t = 1.3$, using a deterministic space of size $10^4$.

Such model systems have formed the foundation of our understanding of simple solids and strongly correlated materials, and are a useful testing ground for new computational approaches. Studying the UEG, for example, has provided insight into the accuracy of many-body electronic structure methods and has been a critical ingredient for the development of many of the exchange-correlation kernels used in Kohn–Sham density functional theory.105–107 The UEG has been used recently as a means to benchmark and test performance of new methods, such as modifications to diffusion Monte Carlo (DMC), as well as low orders of coupled cluster theory31,108–115 and FCIQMC.116–121

A recent CCMC study118 employing coupled cluster levels up to CCSDTQ5 used HANDE to compute the total energy of the UEG at $r_s = [0.5, 5]\,a_0$, the range relevant to electron densities in real solids.100 The results suggest that CCSDTQ might be necessary at low densities beyond $r_s = 3\,a_0$ in order to achieve chemical accuracy,118 whilst CCSDTQ5 was necessary to reproduce FCIQMC to within error bars.116–119

HANDE was also used in the resolution

9 Table 2: Ground-state exchange-correlation en- −0 425 . −0.550 ergies (Exc) for the UEG at rs = 1a0, compar- ing various levels of coupled cluster theory with −0.450 −0.555 FCIQMC. Exchange-correlation energies were −0.475 h 118 /E −0 500 calculated using data from Ref. . 0.120 0.125 0.130 xc E −0.525 Method Exc/Eh CCSD CCSDT −0.550 CCSD-MC -0.551128(6) FCIQMC DMQMC CCSDT-MC -0.55228(1) −0.575 CCSDTQ-MC -0.55231(1) 10−1 100 101 CCSDTQ5-MC -0.55232(1) Θ FCIQMC -0.55233(1) Figure 3: The exchange-correlation energy (Exc) for the UEG at rs = 1a0 as a function of a discrepancy between restricted path- of temperature Θ using DMQMC (Ref. 54). integral Monte Carlo and configuration path- The horizontal lines represent basis set extrap- integral Monte Carlo data for the exchange- olated CCSD, CCSDT and FCIQMC exchange- correlation energy of the UEG necessary to correlation energies energies (Ref. 118). Error parametrize DFT functionals at finite tempera- bars on CCMC and FCIQMC results are too ture.54,122–130. The UEG at finite temperatures small to be seen on this scale. CCSDT and is parametrized by the density and the degen- FCIQMC values cannot be distinguished on this scale. See Table. (2) for numerical values for eracy temperature, Θ = T/TF , where TF is 131 CCSD to CCSDTQ5 in the ground state. the Fermi temperature . When both rs ≈ 1 and Θ ≈ 1 the system is said to be in the warm dense regime, a state of matter which is 4.4 Solids: Diamond to found in planetary interiors132 and can be created experimentally in inertial confinement Finally, we apply HANDE-QMC to a real peri- fusion experiments.133 odic solid, diamond, employing k point sam- Here, we show that use of HANDE can fa- pling. CCMC has been applied to 1×1×1 cilitate straightforward benchmarking of model (up to CCSDTQ), 2×1×1 (up to CCSDT), systems at both zero and finite temperature. 2×2×1 and 2×2×2 (up to CCSD) k point In Fig. 
3 we compare DMQMC data for the 14- meshes and non-initiator FCIQMC to a 1×1×1 † electron, spin-unpolarized UEG at finite Θ to k point mesh in a GTH-DZVP134 basis, and a 136,137 zero temperature (Θ = 0) energies found using GTH-pade pseudo-potential. There were 118 k CCMC and FCIQMC for rs = 1a0. We com- 2 atoms, 8 electrons in 52 spinorbitals per k pute the exchange-correlation internal energy point. Integral files have been generated with PySCF70 using density fitting.138 Or- EXC(Θ) = EQMC(Θ) − T0(Θ), (8) bitals were obtained from density functional theory using the LDA Slater-Vosko-Vilk-Nusair 139 where EQMC(Θ) is the QMC total energy of the (SVWN5) exchange-correlation functional to UEG and T0 is the ideal kinetic energy of the write out complex valued integrals at different same UEG. Even at rs = 1a0, coupled cluster k points, and HANDE’s read–in functionali- requires contributions from triple excitations to ties were adapted accordingly. Details of this obtain FCI-quality energies; CCSD differs by will be the subject of a future publication on about 1mHa. DMQMC results tend to the ex- solid-state calculations. The heat bath uniform pected zero temperature limit given by both singles 47,48 or the heat bath Power–Pitzer ref. FCI and CC. Ground-state values from cou- excitation generator48 and even selection51 or pled cluster and FCIQMC are presented in Ta- † 70 135 ble. (2), to make the small differences between As used in PySCF, and CP2K, https://www. .org/. high-accuracy methods clearer.

10 multi-spawn50 sampling were used. Deterministic coupled cluster has been applied to diamond previously; Booth et al.140 have investigated diamond with CCSD, CCSD(T)141 and FCIQMC in a basis of plane waves with the projector augmented wave method;142 McClain et al.143 studied diamond with CCSD using GTH pseudo-potentials in DZV, DZVP, TZVP basis sets134,136,137†; Gru- ber et al.144 used CCSD with (T) corrections in an MP2 natural orbital basis.145 The lattice constant was fixed to 3.567A,˚ as 143 in the study by McClain et al. Figure 4 −0.225 CCSD-PySCF shows the correlation energy as a function of −0.230 CCSD-McClain et al. CCSD-MC number of k points comparing the CCMC and h −0.235 CCSDT-MC CCSDTQ-MC /E FCIQMC results to the CCSD results obtained − FCIQMC

HF 0.240 using PySCF and the CCSD results of Mc- E − −0.245 143 .

Clain et al. The correlation energy given here tot −

E 0.250 is calculated with respect to the HF energy, − as the correlation energy from using DFT or- 0.255 −0.260 bitals, added to the difference of energy of ref- 1.00 1.25 1.50 1.75 2.00 2.25 2.50 erence determinant consisting of DFT orbitals (#k points)1/3 and HF SCF energy. Differences in conver- gences are due to the use of differently op- Figure 4: Difference between the total and timized orbitals, and a different treatment of Hartree-Fock energy per k point for diamond the exchange integral (which will feature in a using CCMC (CCSD to CCSDTQ) and (non- future publication). In the case of CCMC, initiator) FCIQMC based on DFT orbitals. FCIQMC and CCSD-PySCF the k point mesh The CCSDTQ and the FCIQMC data point has been shifted to contain the Γ point, while overlap to a large extent. The CCSD-PySCF McClain et al.143 used Γ point centered (not data was run with Hartree-Fock orbitals. In the shifted) meshes, which explains the larger dif- case of CCMC, FCIQMC and CCSD-PySCF ference between CCSD-McClain et al. and the mesh has been shifted to contain the Γ the rest of the data. An accuracy of (0.01- point. CCSD-McClain et al. is data from Fig- 143 0.1) eV/unit ((0.00037-0.0037) Eh/unit) might ure 1 in McClain et al. using PySCF; we be required to accurately predict, for example show only their data up to 12 k-points for com- structures,146 so these limited k-point parison. Both studies used the DZVP basis set mesh results suggest that at least CCSDT level and GTH pseudopotentials. is required for reasonable accuracy, possibly CCSDTQ. Nonetheless, we have not considered larger basis sets, additional k points, and other important aspects required for an exhaustive study.
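The accuracy window quoted above is given in both eV and Eh. As a small sketch of that unit conversion, the snippet below uses a rounded CODATA value for the Hartree energy; the constant and the function names are choices made for this illustration.

```python
# Hartree energy in electronvolts (CODATA value, rounded to six decimals).
HARTREE_IN_EV = 27.211386

def ev_to_hartree(energy_ev):
    """Convert an energy from eV to Eh."""
    return energy_ev / HARTREE_IN_EV

def hartree_to_ev(energy_eh):
    """Convert an energy from Eh to eV."""
    return energy_eh * HARTREE_IN_EV

# The two ends of the (0.01-0.1) eV/unit accuracy window.
for target_ev in (0.01, 0.1):
    print(f"{target_ev} eV/unit = {ev_to_hartree(target_ev):.5f} Eh/unit")
```

To two significant figures this reproduces the (0.00037-0.0037) Eh/unit range given in the text.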

5 Discussion

This article has presented the key functionality included in HANDE-QMC: efficient, extensible implementations of the full configuration interaction quantum Monte Carlo, coupled cluster Monte Carlo and density matrix quantum Monte Carlo methods. Advances such as semi-stochastic propagation in FCIQMC44,46 and efficient excitation generators47,48 are also implemented. HANDE-QMC can be applied to model systems – the Hubbard, Heisenberg and uniform electron gas models – as well as molecules and solids.

We have found using a scripting language (Lua) in the input file147 to be extremely beneficial – for example, in running multi-stage calculations, enabling semi-stochastic propagation after the most important states have emerged, irregular output of restart files, or for enabling additional output for debugging at a specific point in the calculation. As with (e.g.) Psi4, PySCF and HORTON, we find this approach far more flexible and powerful than a custom declarative input format used in many other scientific codes.

We are strong supporters of open-source software in scientific research and are glad that the HANDE-QMC package has been used in others' research in ways we did not envisage, including in the development of Adaptive Sampling Configuration Interaction (ASCI),53 understanding the inexact power iteration method148 and in selecting the P subspace in the CC(P;Q) method.149 We believe one reason for this is that the extensive user- and developer-level documentation makes learning and developing HANDE-QMC rather approachable. Indeed, five of the authors of this paper made their first contributions to HANDE-QMC as undergraduates with little prior experience in software development or computational science. In turn, HANDE-QMC has greatly benefited from existing quantum chemistry software, in particular integral generation from Hartree–Fock calculations in Psi4,68 Q-Chem71 and PySCF.70 We hope in future to couple HANDE-QMC to such codes to make running stochastic quantum chemistry calculations simpler and more convenient. To this end, some degree of standardization of data formats to make it simple to pass data (e.g. wavefunction amplitudes) between codes would be extremely helpful in connecting libraries, developing new methods149 and reproducibility.

We close by echoing the views of the Psi4 developers:68 'the future of quantum chemistry software lies in a more modular approach in which small, independent teams develop reusable software components that can be incorporated directly into multiple quantum chemistry packages' and hope that this leads to an increased vibrancy in method development.

Acknowledgement JSS and WMCF received support under EPSRC Research Grant EP/K038141/1 and acknowledge the stimulating research environment provided by the Thomas Young Centre under Grant No. TYC-101. WMCF also received support under US DoE grant NA 0002911. NSB acknowledges St John's College, Cambridge, for funding through a Research Fellowship, and Trinity College, Cambridge for an External Research Studentship during this work. JE acknowledges Trinity College, Cambridge, for funding through a Summer Studentship during this work. RSTF acknowledges CHESS for a studentship. WH acknowledges Gonville & Caius College, Cambridge for funding through a Research Fellowship during this work. NSB and WH are grateful for Undergraduate Research Opportunities Scholarships in the Centre for Doctoral Training on Theory and Simulation of Materials at Imperial College funded by EPSRC under Grant No. EP/G036888/1. FDM was funded by an Imperial College President's scholarship and part of this work was performed under the auspices of the U.S. Department of Energy (DOE) by LLNL under Contract No. DE-AC52-07NA27344. VAN acknowledges the EPSRC Centre for Doctoral Training in Computational Methods for Materials Science for funding under grant number EP/L015552/1 and the Cambridge Philosophical Society for a studentship. RDR acknowledges partial support by the Research Council of Norway through its Centres of Excellence scheme, project number 262695 and through its Mobility Grant scheme, project number 261873. CJCS acknowledges the Sims Fund for a studentship. JJS is currently supported by an Old Gold Summer Fellowship from the University of Iowa. JJS also gratefully acknowledges the prior support of a Research Fellowship from the Royal Commission for the Exhibition of 1851 and a production project from the Swiss National Supercomputing Centre (CSCS) under project ID s523. WAV acknowledges EPSRC for a PhD studentship. AJWT acknowledges Imperial College London for a Junior Research Fellowship, the Royal Society for a University Research Fellowship (UF110161 and UF160398), Magdalene College for summer project funding for M-AF, and EPSRC for an Archer Leadership Award (project e507). We acknowledge contributions from J. Weston during an Undergraduate Research Opportunities Scholarship in the Centre for Doctoral Training on Theory and Simulation of Materials at Imperial College funded by EPSRC under Grant No. EP/G036888/1. The HANDE-QMC project acknowledges a rich ecosystem of open-source projects, without which this work would not have been possible.

A An introductory tutorial to HANDE-QMC

In the following we present an introductory tutorial, demonstrating how to perform basic FCIQMC and i-FCIQMC simulations with the HANDE-QMC code. More extensive tutorials, including for CCMC and DMQMC, exist in the HANDE-QMC documentation. Here we take the water molecule at its equilibrium geometry, in a cc-pVDZ basis set73 and correlating all electrons. This is a simple example, but has a Hilbert space dimension of ∼ 5 × 10^8, making an exact FCI calculation non-trivial to perform.

    sys = read_in {
        int_file = "INTDUMP",
    }

    fciqmc {
        sys = sys,
        qmc = {
            tau = 0.01,
            tau_search = true,
            rng_seed = 8,
            init_pop = 500,
            mc_cycles = 5,
            nreports = 3*10^3,
            target_population = 10^4,
            excit_gen = "heat_bath",
            initiator = true,
            real_amplitudes = true,
            spawn_cutoff = 0.1,
            state_size = -1000,
            spawned_state_size = -100,
        },
    }

Figure 5: An example input file for an i-FCIQMC simulation on a molecular system. The results of such a simulation are presented in Fig. (6).
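The quoted Hilbert space dimension for water can be roughly reproduced with a counting argument. The sketch below is an estimate under stated assumptions, not the way HANDE-QMC determines the space: it assumes 24 spatial orbitals for H2O in cc-pVDZ (14 on O, 5 on each H), 10 electrons with Ms = 0, and that the determinants are spread evenly over the 4 irreducible representations of C2v.

```python
from math import comb

def n_determinants(n_orbitals, n_alpha, n_beta):
    """Number of Slater determinants with fixed numbers of alpha and beta electrons."""
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)

# Assumed: 24 spatial orbitals, 5 alpha and 5 beta electrons (Ms = 0).
total = n_determinants(24, 5, 5)

# Assumed: C2v symmetry with 4 irreps, determinants shared roughly equally.
per_irrep = total / 4

print(f"all symmetries: {total:.2e}, per irrep (approx.): {per_irrep:.1e}")
```

Under these assumptions the symmetry-restricted estimate is about 4.5 × 10^8, consistent with the order of magnitude stated above.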

A.1 A basic i-FCIQMC simulation

The input file for HANDE-QMC is a Lua script. The basic structure of such an input file is shown in Fig. (5). In this the system is entirely determined by the integral file, "INTDUMP", which stores all of the necessary 1- and 2-body molecular integrals. For this tutorial, the integral file was generated through the Psi4 code.68 Both the "INTDUMP" file, and the Psi4 script used to generate it, are available in additional material. As discussed in the main text, the integral file may be generated by multiple other quantum chemistry packages.69–72

In general, the system may be defined by specifying additional parameters, including the number of electrons, the spin quantum number (Ms), the point group symmetry label, and a CAS subspace, for example:

    sys = read_in {
        int_file = "INTDUMP",
        nel = 10,
        ms = 0,
        sym = 0,
        CAS = {8, 23},
    }

The input file then calls the fciqmc{...} function, which performs an FCIQMC simulation with the provided system and parameters. There are several options here; most are self-evident and are described in detail in the HANDE-QMC documentation. tau specifies the time step size, and tau_search=true updates this time step to an optimal value during the simulation. init_pop specifies the initial particle population, and target_population the value at which this population will attempt to stabilize. excit_gen specifies the excitation generator to be used. This option is not required, although the heat-bath algorithm of Umrigar and co-workers47 that we have adapted for HANDE-QMC as explained in Ref. 48, as used here, is a sensible choice in small systems. initiator=true ensures that the initiator adaptation, i-FCIQMC, is used. real_amplitudes=true ensures that non-integer particle weights are used. This leads to improved stochastic efficiency, and so is always recommended. Lastly, state_size and spawned_state_size specify the memory allocated to the particle and spawned particle arrays, respectively; a negative sign is used to specify these values in megabytes (thus 1GB and 100MB, here).

The input file is run with

    $ mpiexec hande.x hande.lua > hande.out

with the MPI command varying between implementations in the usual way. The results of the input file in Fig. (5) are presented in Fig. (6).

Because of the correlated nature of the QMC data, care must be taken when estimating error bars; a large number of iterations must typically be performed, allowing data to become sufficiently uncorrelated. This task can be error-prone for new users (and old ones). HANDE-QMC includes a Python script, reblock_hande.py, which performs a rigorous blocking analysis of the simulation data, automatically detecting if sufficient iterations have been performed and, if so, choosing the optimal block length to provide final estimates.

This final energy estimate can be obtained by

    $ reblock_hande.py --quiet hande.out

The usual estimator for the correlation energy (Ecorr) is the Hartree–Fock projected estimator:

    E_{\mathrm{corr}} = \frac{\langle D_0 | (\hat{H} - E_{\mathrm{HF}} 1) | \Psi_0 \rangle}{\langle D_0 | \Psi_0 \rangle},    (9)
                      = \frac{\sum_{i \neq 0} C_i \langle D_0 | \hat{H} | D_i \rangle}{C_0},    (10)

where |D0⟩ is the Hartree–Fock determinant and EHF is the Hartree–Fock energy. Ci are the particle amplitudes, with C0 being the Hartree–Fock amplitude. Because both the numerator and denominator are random variables, they should be averaged separately, before performing division. It is therefore important that data be averaged from the point where both the numerator and denominator have converged individually; in some cases the energy itself may appear converged while the numerator and denominator are still converging. This does not occur in the current water molecule case, as can be seen in Fig. (6), where the numerator and denominator are plotted in (b) and (c), respectively. Here, all relevant estimates appear converged by iteration ∼ 1000.

The reblock_hande.py script will automatically detect when the required quantities have converged, in order to choose the iteration from which to start averaging data. However, a starting iteration may be manually provided using --start. In general it is good practice to manually plot simulation data, as in Fig. (6), to check that behavior is sensible. In this case, the reblock_hande.py script automatically begins averaging from iteration number 1463, which is appropriate.

Figure 6: The results of running the input file in Fig. (5). (a) shows the particle population, stabilizing slightly above the targeted value of 10^4. (b) shows the numerator of the energy estimator, \sum_{i \neq 0} C_i \langle D_0 | \hat{H} | D_i \rangle, as discussed in the main text. (c) shows the energy denominator, which is the number of particles on the Hartree–Fock determinant. (d) shows the correlation energy estimates themselves.

A.2 Converging initiator error

After running the reblock_hande.py script, the correlation energy estimate can be read off simply as Ecorr = −0.2166(2)Eh. This compares well to the exact FCI energy of EFCI = −0.217925Eh, in error by ∼ 1.3mEh, despite using only ∼ 10^4 particles to sample a space of dimension ∼ 5 × 10^8.

Nonetheless, an important feature of i-FCIQMC is the ability to converge to the exact result by varying only one parameter, the particle population. This is possible by running multiple i-FCIQMC simulations independently. However, one can make use of the Lua input file with HANDE-QMC to perform an arbitrary number of simulations with a single input file, as shown by example in Fig. (7). Here, targets is a table containing particle populations from 2 × 10^3, and doubling until 1.28 × 10^5. We loop over all target populations and perform an FCIQMC simulation for each.

    sys = read_in {
        int_file = "INTDUMP",
    }

    targets = {2*10^3, 4*10^3, 8*10^3, 1.6*10^4, 3.2*10^4, 6.4*10^4, 1.28*10^5}

    for i,target in ipairs(targets) do
        fciqmc {
            sys = sys,
            qmc = {
                tau = 0.01,
                rng_seed = 8,
                init_pop = target/20,
                mc_cycles = 5,
                nreports = 3*10^3,
                tau_search = true,
                target_population = target,
                excit_gen = "heat_bath",
                initiator = true,
                real_amplitudes = true,
                spawn_cutoff = 0.1,
                state_size = -1000,
                spawned_state_size = -100,
            },
        }
    end

Running the reblock_hande.py script on the subsequent output file gives the results in Table (3). The final column gives the projected energy estimate of the correlation energy, and is plotted in Fig. (8), with comparison to the FCI energy. Accuracy within 1mEh is reached with Nw = 2 × 10^4, and an accuracy of 0.1mEh by Nw = 2 × 10^5.

Table 3: Output of the HANDE-QMC reblocking script, on the simulation with the input file of Fig. (7). The final column gives the estimates of the correlation energy, as determined from the projected energy estimator.

                  Block from        # H psips      ∑ H0j Nj      N0           Shift         Proj. Energy
    hande.out 0   1.83000000e+03    2292(4)        -36.77(8)     172.6(5)     -0.210(3)     -0.2131(3)
              1   1.81800000e+03    4602(5)        -56.4(1)      262.5(6)     -0.213(2)     -0.2148(2)
              2   1.47300000e+03    9108(7)        -88.29(9)     408.2(5)     -0.213(1)     -0.2163(2)
              3   1.78100000e+03    19050(10)      -151.5(1)     697.0(6)     -0.217(1)     -0.2173(2)
              4   1.97200000e+03    38150(10)      -276.7(1)     1270.0(6)    -0.2188(5)    -0.21784(6)
              5   2.06500000e+03    74310(30)      -528.3(2)     2428(1)      -0.2193(6)    -0.21761(8)
              6   1.82500000e+03    152900(30)     -1081.4(4)    4964(2)      -0.2186(4)    -0.21787(5)

Figure 8: Initiator convergence for the water molecule in a cc-pVDZ basis set, with all electrons correlated. Results were obtained by running the input file of Fig. (7).

It is simple to perform a semi-stochastic i-FCIQMC simulation. To do this, as well as passing sys and qmc parameters to the fciqmc function, one should also pass a semi_stoch table. The simplest form for this table, which is almost always appropriate, is the following:

    semi_stoch = {
        size = 10^4,
        start_iteration = 2*10^3,
        space = "high",
    },

The "high" option generates a deterministic space by choosing the most highly-weighted determinants in the FCIQMC wave function at the given iteration (which in general should be an iteration where the wave function is largely converged), 2 × 10^3 in this case. The total size of the deterministic space is given by the size parameter, 10^4 in this case.
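To illustrate why correlated QMC data needs a blocking analysis before error bars are quoted, the following is a minimal sketch of a pairwise reblocking procedure in the style of Flyvbjerg and Petersen. It is not the implementation used by reblock_hande.py; the function name `reblock` and the synthetic AR(1) series standing in for simulation output are assumptions made for this example.

```python
import random
import statistics

def reblock(data):
    """Yield (block_length, standard_error) for successive pairwise blockings."""
    block_length = 1
    while len(data) >= 2:
        # Standard error of the mean, treating the current blocks as independent.
        std_err = statistics.stdev(data) / len(data) ** 0.5
        yield block_length, std_err
        # Average neighbouring pairs, halving the number of data points.
        data = [0.5 * (data[i] + data[i + 1]) for i in range(0, len(data) - 1, 2)]
        block_length *= 2

# Synthetic correlated series (an AR(1) process), standing in for QMC output.
rng = random.Random(8)
series, x = [], 0.0
for _ in range(2**14):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    series.append(x)

errors = list(reblock(series))
for block_length, std_err in errors:
    print(f"block length {block_length:5d}: standard error {std_err:.4f}")
```

The naive estimate at block length 1 underestimates the error for correlated data; the estimate grows with block length and levels off once blocks are longer than the correlation time, before becoming noisy when only a few blocks remain. Choosing the block length on that plateau is the step that an automated script has to perform.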

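The statement that the numerator and denominator of the projected estimator are averaged separately and then divided can be cross-checked against Table 3. The sketch below copies the mean values of ∑ H0j Nj and N0 from the table (dropping the bracketed errors) and compares their ratio with the reported "Proj. Energy" column; the function name is illustrative only.

```python
# (sum_j H_0j N_j, N_0, reported projected energy) for each row of Table 3,
# with the bracketed error bars dropped.
rows = [
    (-36.77, 172.6, -0.2131),
    (-56.4, 262.5, -0.2148),
    (-88.29, 408.2, -0.2163),
    (-151.5, 697.0, -0.2173),
    (-276.7, 1270.0, -0.21784),
    (-528.3, 2428.0, -0.21761),
    (-1081.4, 4964.0, -0.21787),
]

def projected_energy(mean_numerator, mean_denominator):
    """Ratio of the separately averaged numerator and denominator of Eq. (10)."""
    return mean_numerator / mean_denominator

for numerator, denominator, reported in rows:
    estimate = projected_energy(numerator, denominator)
    print(f"{estimate:.5f} (table: {reported})")
```

The ratios agree with the tabulated projected energies to within the rounding of the tabulated inputs.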
Figure 7: An example input file showing how B Parallelization to use Lua features to perform multiple simula- tions in a single input file, with particle popu- In this appendix, we describe two techniques lations from 2000 to 128, 000. that can optimize the FCIQMC parallelization, load balancing and non-blocking communica- tion. Parallelization of CCMC has been ex-

16 plained in Ref. 50 but does not yet make use and over populated which negatively affects the of non-blocking communication. parallelism43. This is to be expected as in the By and large, HANDE’s FCIQMC implemen- limit Np → NDets there would be quite a pro- tation follows the standard parallel implemen- nounced load imbalance unless each determi- tation of the FCIQMC algorithm, a more com- nant’s coefficient was of a similar magnitude plete description of which can be found in Ref. (which can often be the case for strongly cor- 43. In short, each processor stores a sorted related systems). Naturally this limit is never main list of instantaneously occupied determi- reached, but the observed imbalance is largely nants containing the determinant’s bit string a consequence of this increased refinement. representation, the walker’s weight as well as In HANDE we optionally use dynamic load any simulation dependent flags. For each it- balancing to achieve better parallel perfor- eration every walker is given the chance to mance. In practice, we define an array pmap spawn to another connected determinant, with as newly spawned walkers being added to a sec- pmap(i) = i mod Np, (12) ond spawned walker array. After evolution a so that its entries cyclically contain the proces- collective MPI_AlltoAllv is set up to commu- sor IDs, 0,...,N − 1. Determinants are then nicate the spawned walker array to the appro- p initially mapped to processors as priate processors. The annihilation step is then ( ) carried out by merging the subsequently sorted p(|D ⟩) = p hash(|D ⟩) mod N × M , spawned walker array with the main list. i map i p During the simulation every walker needs to (13) know which processor a connected determinant where M is the bin size. Eq. (13) reduces to resides on but naturally can not store this map- Eq. (11) when M = 1. ping. 
In order to achieve a relatively uniform The walker population in each of these M bins distribution of determinants at a low computa- on each processor can be determined and com- tional cost, each walker is assigned to a proces- municated to all other processors. In this way, sors p as every processor knows the total distribution of walkers across all processors. In redistributing × p(|Di⟩) = hash(|Di⟩) mod Np, (11) the Np M bins we adopt a simple heuristic approach by only selecting bins belonging to where Np is the number of processors and hash processors whose populations are either above is a hash function58. or below a certain user defined threshold. By redistributing bins in order of increasing popu- B.1 Load Balancing lation we can, in principle, isolate highly pop- ulated determinants while also allowing for a The workload of the algorithm is primarily de- finer distribution. termined by the number of walkers on a given This procedure translates to a simple modifi- processor, but the above hashing procedure dis- cation of pmap so that its entries now contain the tributes work to processors on a determinant processor IDs which give the determined opti- basis. For the hashing procedure to be effective mal distribution of bin. we require that the average population for a Finally, the walkers which reside in the chosen random set of determinants to be roughly uni- bins have to be moved to their new processor, form. Generally hashing succeeds in this re- which can simply be achieved using a commu- gard and one finds a fairly even distribution nication procedure similar to that used for the of both walkers and determinants. When scal- annihilation stage. Some care needs to be taken ing a problem of a fixed size to more proces- that all determinants are on their correct pro- sors, i.e. strong scaling, one observes that the cessors at a given iteration so that annihilation distribution loses some of its uniformity with takes place correctly. 
certain processors becoming significantly under Once the population of walkers has stabilised

17 the distribution across processors should be would need to complete the steps above before roughly constant, although small fluctuations the fastest processor reaches step (3), otherwise will persist. With this in mind redistribution there will be latency as the received list cannot should only occur after this stabilisation has oc- be evolved before all walkers spawned onto a curred and also should not need to occur too fre- given processor are received. quently. This ensures that the computational It should be pointed out that walkers spawned cost associated with performing load balancing onto a processor at time τ are only annihilated is fairly minor in a large calculation. Addition- with the main list after evolution to τ + ∆τ, ally as M is increased the optimal distribution which differs from the normal algorithm. While of walkers should be approached, although with annihilation is vital to attaining converged re- an increase in computational effort. sults26,42 the times at which it takes place is somewhat arbitrary, once walkers are annihi- B.2 Non-blocking communica- lated at the same point in simulation time. Communication between processors is also re- tion quired when collecting statistics, however the HANDE also makes use of non-blocking asyn- usual collectives required for this can simply chronous communication to alleviate latency is- be replaced by the corresponding non-blocking sues when scaling to large processor counts.150 procedures. This does require that information Using asynchronous communications is non- is printed out in a staggered fashion but this is trivial in HANDE due to the annihilation stage of minor concern. of FCIQMC-like algorithms. We use the follow- ing algorithm: Consider the evolution of walk- ers from τ to τ + ∆τ, then for each processor Notes and References the following steps are carried out: (1) Foulkes, W. M. C.; Mitas, L.; 1. Initialise the non-blocking receive of walk- Needs, R. J. et al. 
Quantum Monte ers spawned onto the current processor Carlo simulations of solids. Rev. Mod. from time τ. Phys. 2001, 73, 33–83.

2. Evolve the main list to time τ + ∆τ. (2) McMillan, W. L. Ground State of Liquid 4He. Phys. Rev. 1965, 138, A442–A451. 3. Complete the receive of walkers. (3) Umrigar, C. J.; Wilson, K. G.; 4. Evolve the received walkers to τ + ∆τ. Wilkins, J. W. Optimized Trial Wave Functions for Quantum Monte Carlo 5. Annihilate walkers spawned from the evo- Calculations. Phys. Rev. Lett. 1988, 60, lution of the two lists as well as the 1719. evolved received list with the main list on this processor. (4) Umrigar, C. J.; Toulouse, J.; Filippi, C. et al. Alleviation of the Fermion-Sign 6. Send remaining spawned walkers to their Problem by Optimization of Many-Body new processors. Wave Functions. Phys. Rev. Lett. 2007, While this requires more work per iteration, it 98, 110201. should result in improved efficiency if the time (5) Neuscamman, E.; Umrigar, C. J.; take to complete this work is less than the la- Chan, G. K.-L. Optimizing large param- tency time. This also ensures faster processors eter sets in variational quantum Monte can continue doing work, i.e. evolving the main Carlo. Phys. Rev. B 2012, 85, 045103. list, while waiting for other processors to finish evolving their main lists. For communications (6) Neuscamman, E. Variation after re- to be truly overlapping the slowest processor

18 sponse in quantum Monte Carlo. J. (16) Giner, E.; Scemama, A.; Caffarel, M. Us- Chem. Phys. 2016, 145, 081103. ing perturbatively selected configuration interaction in quantum Monte Carlo cal- (7) Grimm, R. C.; Storer, R. G. Monte- culations. Can. J. Chem. 2013, 91, 879. Carlo solution of Schr¨odinger’sequation. J. Comput. Phys. 1971, 7, 134. (17) Schriber, J. B.; Evangelista, F. A. Com- munication: An adaptive configuration (8) Anderson, J. B. A random-walk simula- interaction approach for strongly corre- + tion of the Schrodinger equation: H3 . J. lated electrons with tunable accuracy. J. Chem. Phys. 1975, 63, 1499. Chem. Phys. 2016, 144, 161106.

(9) Umrigar, C. J.; Nightingale, M. P.; (18) Tubman, N. M.; Lee, J.; Takeshita, T. Y. Runge, K. J. A diffusion Monte Carlo al- et al. A deterministic alternative to the gorithm with very small time-step errors. full configuration interaction quantum J. Chem. Phys. 1993, 99, 2865. Monte Carlo method. J. Chem. Phys. (10) Kim, J.; Baczewski, A. D.; 2016, 145, 044112. Beaudet, T. D. et al. QMCPACK: (19) Holmes, A. A.; Tubman, N. M.; Umri- an open source ab initio quantum gar, C. J. Heat-Bath Configuration In- Monte Carlo package for the electronic teraction: An Efficient Selected Configu- structure of atoms, molecules and solids. ration Interaction Algorithm Inspired by J. Phys. Condsens. Matter 2018, 30, Heat-Bath Sampling. J. Chem. Theory 195901. Comput. 2016, 12, 3674–3680.

(11) Zhang, S.; Krakauer, H. Quantum Monte (20) Garniron, Y.; Scemama, A.; Loos, P.-F. Carlo Method using Phase-Free Random et al. Hybrid stochastic-deterministic cal- Walks with Slater Determinants. Phys. culation of the second-order perturbative Rev. Lett. 2003, 90, 136401. contribution of multireference perturba- (12) Ciˇzek,ˇ J. On the Correlation Problem tion theory. J. Chem. Phys. 2017, 147, in Atomic and Molecular Systems. Cal- 034101. culation of Wavefunction Components (21) Eriksen, J. J.; Lipparini, F.; Gauss, J. in UrsellType Expansion Using Quan- Virtual Orbital Many-Body Expansions: tumField Theoretical Methods. J. Chem. A Possible Route towards the Full Con- Phys. 1966, 45, 4256–4266. figuration Interaction Limit. J. Phys. (13) Møller, C.; Plesset, M. S. Note on Chem. Lett. 2017, 4633–4639. an Approximation Treatment for Many- (22) Saebo, S.; Pulay, P. Local Treatment of Electron Systems. Phys. Rev. 1934, 46, Electron Correlation. Annu. Rev. Phys. 618–622. Chem. 1993, 44, 213–236.

(14) Knowles, P. J.; Handy, N. C. A New (23) Hampel, C.; Werner, H.-J. Local treat- Determinant-based Full Configuration ment of electron correlation in coupled Interaction Method. Chem. Phys. Letters cluster theory. J. Chem. Phys. 1996, 1984, 111, 315–321. 104, 6286–6297.

(15) Huron, B.; Malrieu, J. P.; Rancurel, P. (24) Riplinger, C.; Neese, F. An efficient and Iterative perturbation calculations of near linear scaling pair natural orbital ground and excited state energies from based local coupled cluster method. J. multiconfigurational zeroth-order wave- Chem. Phys. 2013, 138, 034106. functions. J. Chem. Phys. 1973, 58, 5745–5759.

(25) Ziółkowski, M.; Jansík, B.; Kjaergaard, T. et al. Linear scaling coupled cluster method with correlation energy based error control. J. Chem. Phys. 2010, 133, 014107.

(26) Booth, G. H.; Thom, A. J. W.; Alavi, A. Fermion Monte Carlo without fixed nodes: A game of life, death, and annihilation in Slater determinant space. J. Chem. Phys. 2009, 131, 054106–1–10.

(27) White, S. R. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 1992, 69, 2863.

(28) Chan, G. K.-L. An algorithm for large scale density matrix renormalization group calculations. J. Chem. Phys. 2004, 120, 3172.

(29) Olivares-Amaya, R.; Hu, W.; Nakatani, N. et al. The ab-initio density matrix renormalization group in practice. J. Chem. Phys. 2015, 142, 034102.

(30) Thom, A. J. W. Stochastic Coupled Cluster Theory. Phys. Rev. Lett. 2010, 105, 263004–1–4.

(31) Spencer, J. S.; Thom, A. J. W. Developments in stochastic coupled cluster theory: The initiator approximation and application to the uniform electron gas. J. Chem. Phys. 2016, 144, 084108.

(32) Blunt, N. S.; Rogers, T. W.; Spencer, J. S. et al. Density-matrix quantum Monte Carlo method. Phys. Rev. B 2014, 89, 245124.

(33) Malone, F. D.; Blunt, N. S.; Shepherd, J. J. et al. Interaction picture density matrix quantum Monte Carlo. J. Chem. Phys. 2015, 143, 044116.

(34) Ten-no, S. Stochastic determination of effective Hamiltonian for the full configuration interaction solution of quasi-degenerate electronic states. J. Chem. Phys. 2013, 138, 164126.

(35) Ohtsuka, Y.; Ten-no, S. A study of potential energy curves from the model space quantum Monte Carlo method. J. Chem. Phys. 2015, 143, 214107.

(36) Ten-no, S. Multi-state effective Hamiltonian and size-consistency corrections in stochastic configuration interactions. J. Chem. Phys. 2017, 147, 244107.

(37) McClean, J. R.; Aspuru-Guzik, A. Clock quantum Monte Carlo technique: An imaginary-time method for real-time quantum dynamics. Phys. Rev. A 2015, 91.

(38) Nagy, A.; Savona, V. Driven-dissipative quantum Monte Carlo method for open quantum systems. Phys. Rev. A 2018, 97, 052129.

(39) Booth, G. H.; Chan, G. K.-L. Communication: Excited states, dynamic correlation functions and spectral properties from full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2012, 137, 191102.

(40) Humeniuk, A.; Mitrić, R. Excited states from quantum Monte Carlo in the basis of Slater determinant. J. Chem. Phys. 2014, 141, 194104.

(41) Blunt, N. S.; Smart, S. D.; Booth, G. H. et al. An excited-state approach within full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2015, 143, 134117.

(42) Spencer, J. S.; Blunt, N. S.; Foulkes, W. M. The sign problem and population dynamics in the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. 2012, 136, 054110–1–10.

(43) Booth, G. H.; Smart, S. D.; Alavi, A. Linear-scaling and parallelisable algorithms for stochastic quantum chemistry. Mol. Phys. 2014, 112, 1855–1869.

(44) Petruzielo, F. R.; Holmes, A. A.; Changlani, H. J. et al. Semistochastic Projector Monte Carlo Method. Phys. Rev. Lett. 2012, 109, 230201.

(45) Overy, C.; Booth, G. H.; Blunt, N. S. et al. Unbiased reduced density matrices and electronic properties from full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2014, 141, 244117.

(46) Blunt, N. S.; Smart, S. D.; Kersten, J. A. F. et al. Semi-stochastic full configuration interaction quantum Monte Carlo: Developments and application. J. Chem. Phys. 2015, 142, 184107.

(47) Holmes, A. A.; Changlani, H. J.; Umrigar, C. J. Efficient Heat-Bath Sampling in Fock Space. J. Chem. Theory Comput. 2016, 12, 1561–1571.

(48) Neufeld, V. A.; Thom, A. J. W. Exciting Determinants in Quantum Monte Carlo: Loading the Dice with Fast, Low-Memory Weights. J. Chem. Theory Comput. 2019, 15, 127–140.

(49) Cleland, D.; Booth, G. H.; Alavi, A. Communications: Survival of the fittest: Accelerating convergence in full configuration-interaction quantum Monte Carlo. J. Chem. Phys. 2010, 132, 41103.

(50) Spencer, J. S.; Neufeld, V. A.; Vigor, W. A. et al. Large scale parallelization in stochastic coupled cluster. J. Chem. Phys. 2018, 149, 204103.

(51) Scott, C. J. C.; Thom, A. J. W. Stochastic coupled cluster theory: Efficient sampling of the coupled cluster expansion. J. Chem. Phys. 2017, 147, 124105.

(52) The original algorithm used integer weights. It was subsequently shown that floating-point weights greatly reduce the stochastic noise. HANDE-QMC uses fixed-precision for the weights such that both approaches can be straightforwardly handled.

(53) Tubman, N. M.; Lee, J.; Takeshita, T. Y. et al. A deterministic alternative to the full configuration interaction quantum Monte Carlo method. J. Chem. Phys. 2016, 145, 044112.

(54) Malone, F. D.; Blunt, N.; Brown, E. W. et al. Accurate Exchange-Correlation Energies for the Warm Dense Electron Gas. Phys. Rev. Lett. 2016, 117, 115701.

(55) Blunt, N. S.; Booth, G. H.; Alavi, A. Density matrices in full configuration interaction quantum Monte Carlo: Excited states, transition dipole moments, and parallel distribution. J. Chem. Phys. 2017, 146, 244105.

(56) Shepherd, J. J.; Henderson, T. M.; Scuseria, G. E. Using full configuration interaction quantum Monte Carlo in a seniority zero space to investigate the correlation energy equivalence of pair coupled cluster doubles and doubly occupied configuration interaction. J. Chem. Phys. 2016, 144, 094112.

(57) The use of Fortran 2003 and 2008 imposes a need for a recent Fortran compiler. Indeed, we have found bugs in both open-source and proprietary compilers and worked around them where possible.

(58) Appleby, A. SMHasher. https://github.com/aappleby/smhasher.

(59) Saito, M.; Matsumoto, M. dSFMT. https://github.com/MersenneTwister-Lab/dSFMT.

(60) Moshier, S. R. cephes. http://www.netlib.org/cephes/.

(61) Wu, K.; Simon, H. TRLan. https://codeforge.lbl.gov/projects/trlan.

(62) Yamazaki, I.; Bai, Z.; Simon, H. et al. Adaptive Projection Subspace Dimension for the Thick-Restart Lanczos Method. ACM Trans. Math. Softw. 2010, 37, 27.

(63) The HDF Group, Hierarchical Data Format, version 5. 1997-2018; http://www.hdfgroup.org/HDF5/.

(64) Ierusalimschy, R. Programming in Lua, 4th ed.; Lua.org, 2016.

(65) AOTUS: Advanced Options and Tables in Universal Scripting. https://geb.sts.nt.uni-siegen.de/doxy/aotus/index.html.

(66) JSON. https://www.json.org/, Accessed: 2018-10-31.

(67) CMake. https://cmake.org/, Accessed: 2018-10-31.

(68) Parrish, R. M.; Burns, L. A.; Smith, D. G. A. et al. Psi4 1.1: An Open-Source Electronic Structure Program Emphasizing Automation, Advanced Libraries, and Interoperability. J. Chem. Theory Comput. 2017, 13, 3185–3197.

(69) Toon Verstraelen, Pawel Tecmer, Farnaz Heidar-Zadeh, Cristina E. González-Espinoza, Matthew Chan, Taewon D. Kim, Katharina Boguslawski, Stijn Fias, Steven Vandenbrande, Diego Berrocal, and Paul W. Ayers HORTON 2.1.0, http://theochem.github.com/horton/, 2017.

(70) Sun, Q.; Berkelbach, T. C.; Blunt, N. S. et al. PySCF: the Python-based simulations of chemistry framework. Wiley Interdisciplinary Reviews: Computational Molecular Science 8, e1340.

(71) Shao, Y.; Gan, Z.; Epifanovsky, E. et al. Advances in molecular quantum chemistry contained in the Q-Chem 4 program package. Mol. Phys. 2015, 113, 184–215.

(72) Werner, H.-J.; Knowles, P. J.; Knizia, G. et al. MOLPRO, version 2015.1, a package of ab initio programs; 2015.

(73) Dunning, T. H., Jr. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007–1023.

(74) DMQMC averages over independent calculations and so does not suffer from a correlation issue.

(75) Flyvbjerg, H.; Petersen, H. G. Error estimates on averages of correlated data. J. Chem. Phys. 1989, 91, 461–466.

(76) https://github.com/jsspencer/pyblock.

(77) Kent, D. R.; Muller, R. P.; Anderson, A. G. et al. Efficient algorithm for on-the-fly error analysis of local or distributed serially correlated data. Journal of 2007, 28, 2309–2316.

(78) Leach, P.; Mealling, M.; Salz, R. A universally unique identifier (UUID) URN namespace; 2005.

(79) https://hande.readthedocs.io.

(80) Git is a very powerful distributed version control system and has become the de facto standard in software development. Its decentralized model has undoubtedly contributed to the huge growth of open-source software. The official documentation is available online https://git-scm.com/, but, as with any powerful tool, it is not an easy task to become familiar with it. A large number of tutorials are available online. We can recommend http://gitimmersion.com/ and https://coderefinery.github.io/git-intro/.

(81) https://github.com/hande-qmc/hande.

(82) We note that access to the private repository is liberally granted.

(83) J. S. Spencer, testcode, https://github.com/jsspencer/testcode 2017.

(84) Buildbot. https://buildbot.net/, Accessed: 2018-11-1.

(85) Spencer, J. S.; Blunt, N. S.; Vigor, W. A. et al. Open-Source Development Experiences in Scientific Software: The HANDE Quantum Monte Carlo Project. J. Open Res. Softw. 2015, 3, 1–6.

(86) McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, 2nd ed.; O’Reilly Media, Incorporated, 2017.

(87) Vigor, W. A.; Spencer, J. S.; Bearpark, M. J. et al. Understanding and improving the efficiency of full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2016, 144, 094110.

(88) Vigor, W. A.; Spencer, J. S.; Bearpark, M. J. et al. Minimising biases in full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2015, 142, 104101.

(89) Oliphant, T. E. Guide to NumPy, 2nd ed.; CreateSpace Independent Publishing Platform: USA, 2015.

(90) Jones, E.; Oliphant, T.; Peterson, P. et al. SciPy: Open source scientific tools for Python. 2001; http://www.scipy.org/.

(91) Hunter, J. D. Matplotlib: A 2D graphics environment. Computing In Science & Engineering 2007, 9, 90–95.

(92) Rosen, L. Open Source Licensing: Software Freedom and Intellectual Property Law; Prentice Hall PTR: Upper Saddle River, NJ, USA, 2004.

(93) St. Laurent, A. M. Understanding Open Source and Free Software Licensing; O’Reilly Media, 2004.

(94) The full legal text of the license is available from the Free Software Foundation: https://www.gnu.org/licenses/old-licenses/lgpl-2.1.en.html.

(95) Legal text available from the Open Source Initiative: https://opensource.org/licenses/BSD-3-Clause.

(96) Knizia, G. IboView. http://www.iboview.org.

(97) MRCC, a quantum chemical program suite written by M. Kállay, Z. Rolik, J. Csontos, P. Nagy, G. Samu, D. Mester, J. Csóka, B. Szabó, I. Ladjánszki, L. Szegedy, B. Ladóczki, K. Petrov, M. Farkas, P. D. Mezei, and B. Hégely. See also Z. Rolik, L. Szegedy, I. Ladjánszki, B. Ladóczki, and M. Kállay, J. Chem. Phys. 139, 094105 (2013), as well as: www.mrcc.hu.

(98) Loos, P.-F.; Gill, P. M. W. The uniform electron gas. WIREs Comput. Mol. Sci. 2016, 6, 410–429.

(99) Giuliani, G.; Vignale, G. Quantum Theory Electron Liq.; Cambridge University Press: Cambridge, 2005; pp 1–68.

(100) Martin, R. M. Electron. Struct.; Cambridge University Press: Cambridge, 2004; pp 100–118.

(101) Gutzwiller, M. C. Effect of Correlation on the Ferromagnetism of Transition Metals. Phys. Rev. Lett. 1963, 10, 159–162.

(102) Hubbard, J. Electron Correlations in Narrow Energy Bands. Proc. R. Soc. London A Math. Phys. Eng. Sci. 1963, 276, 238–257.

(103) Kanamori, J. Electron Correlation and Ferromagnetism of Transition Metals. Prog. Theor. Phys. 1963, 30, 275–289.

(104) Altland, A.; Simons, B. Condensed Matter Field Theory; Cambridge University Press, 2010.

(105) Ceperley, D. M.; Alder, B. J. Ground State of the Electron Gas by a Stochastic Method. Phys. Rev. Lett. 1980, 45, 566–569.

(106) Perdew, J. P.; Zunger, A. Self-interaction correction to density-functional approximations for many-electron systems. Phys. Rev. B 1981, 23, 5048–5079.

(107) Giuliani, G.; Vignale, G. Quantum Theory Electron Liq.; Cambridge University Press: Cambridge, 2005; pp 327–404.

(108) Freeman, D. L. Coupled-cluster expansion applied to the electron gas: Inclusion of ring and exchange effects. Phys. Rev. B 1977, 15, 5512–5521.

(109) Bishop, R. F.; Lührmann, K. H. Electron correlations: I. Ground-state results in the high-density regime. Phys. Rev. B 1978, 17, 3757–3780.

(110) Bishop, R. F.; Lührmann, K. H. Electron correlations. II. Ground-state results at low and metallic densities. Phys. Rev. B 1982, 26, 5523–5557.

(111) Shepherd, J. J.; Grüneis, A.; Booth, G. H. et al. Convergence of many-body wavefunction expansions using a plane wave basis: from the homogeneous electron gas to the solid state. 2012,

(112) Shepherd, J. J.; Grüneis, A. Many-body quantum chemistry for the electron gas: Convergent perturbative theories. Phys. Rev. Lett. 2013, 110, 226401.

(113) Roggero, A.; Mukherjee, A.; Pederiva, F. Quantum Monte Carlo with coupled-cluster wave functions. Phys. Rev. B 2013, 88, 115138.

(114) McClain, J.; Lischner, J.; Watson, T. et al. Spectral functions of the uniform electron gas via coupled-cluster theory and comparison to the GW and related approximations. Phys. Rev. B 2016, 93, 235139.

(115) Shepherd, J. J. Communication: Convergence of many-body wave-function expansions using a plane-wave basis in the thermodynamic limit. J. Chem. Phys. 2016, 145, 031104.

(116) Shepherd, J. J.; Booth, G.; Grüneis, A. et al. Full configuration interaction perspective on the homogeneous electron gas. Phys. Rev. B Condens. Matter 2012, 85.

(117) Shepherd, J. J.; Booth, G. H.; Alavi, A. Investigation of the full configuration interaction quantum Monte Carlo method using homogeneous electron gas models. J. Chem. Phys. 2012, 136, 244101.

(118) Neufeld, V. A.; Thom, A. J. W. A study of the dense uniform electron gas with high orders of coupled cluster. J. Chem. Phys. 2017, 147, 194105.

(119) Luo, H.; Alavi, A. Combining the Transcorrelated Method with Full Configuration Interaction Quantum Monte Carlo: Application to the Homogeneous Electron Gas. J. Chem. Theory Comput. 2018, 14, 1403–1411.

(120) Ruggeri, M.; Ríos, P. L.; Alavi, A. Correlation energies of the high-density spin-polarized electron gas to meV accuracy. Phys. Rev. B 2018, 98, 161105.

(121) Blunt, N. S. Communication: An efficient and accurate perturbative correction to initiator full configuration interaction quantum Monte Carlo. J. Chem. Phys. 2018, 148, 221101.

(122) Brown, E. W.; Clark, B. K.; DuBois, J. L. et al. Path-Integral Monte Carlo Simulation of the Warm Dense Homogeneous Electron Gas. Phys. Rev. Lett. 2013, 110, 146405.

(123) Schoof, T.; Groth, S.; Vorberger, J. et al. Ab Initio Thermodynamic Results for the Degenerate Electron Gas at Finite Temperature. Phys. Rev. Lett. 2015, 115, 130402.

(124) Groth, S.; Schoof, T.; Dornheim, T. et al. Ab initio quantum Monte Carlo simulations of the uniform electron gas without fixed nodes. Phys. Rev. B 2016, 93, 085102.

(125) Dornheim, T.; Groth, S.; Schoof, T. et al. Ab initio quantum Monte Carlo simulations of the uniform electron gas without fixed nodes: The unpolarized case. Phys. Rev. B 2016, 93, 205134.

(126) Dornheim, T.; Groth, S.; Malone, F. D. et al. Ab initio quantum Monte Carlo simulation of the warm dense electron gas. Phys. Plasmas 2017, 24, 056303.

(127) Brown, E. W.; DuBois, J. L.; Holzmann, M. et al. Exchange-correlation Energy for the Three-Dimensional Homogeneous Electron Gas at Arbitrary Temperature. Phys. Rev. B 2013, 88, 081102.

(128) Karasiev, V. V.; Sjostrom, T.; Dufty, J. et al. Accurate Homogeneous Electron Gas Exchange-Correlation Free Energy for Local Spin-Density Calculations. Phys. Rev. Lett. 2014, 112, 076403.

(129) Dornheim, T.; Groth, S.; Sjostrom, T. et al. Ab Initio Quantum Monte Carlo Simulation of the Warm Dense Electron Gas in the Thermodynamic Limit. Phys. Rev. Lett. 2016, 117, 156403.

(130) Groth, S.; Dornheim, T.; Sjostrom, T. et al. Ab initio Exchange-Correlation Free Energy of the Uniform Electron Gas at Warm Dense Matter Conditions. Phys. Rev. Lett. 2017, 119, 135001.

(131) Dornheim, T.; Groth, S.; Bonitz, M. The uniform electron gas at warm dense matter conditions. Physics Reports 2018, 744, 1.

(132) Fortney, J. J.; Glenzer, S. H.; Koenig, M. et al. Frontiers of the physics of dense plasmas and planetary interiors: Experiments, theory, and applications. Phys. Plasmas 2009, 16, 041003.

(133) Hu, S. X.; Militzer, B.; Goncharov, V. N. et al. First-principles equation-of-state table of deuterium for inertial confinement fusion applications. Phys. Rev. B 2011, 84, 224109.

(134) VandeVondele, J.; Krack, M.; Mohamed, F. et al. Quickstep: Fast and accurate density functional calculations using a mixed Gaussian and plane waves approach. Comput. Phys. Commun. 2005, 167, 103–128.

(135) Hutter, J.; Iannuzzi, M.; Schiffmann, F. et al. CP2K: atomistic simulations of condensed matter systems. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2014, 4, 15–25.

(136) Goedecker, S.; Teter, M.; Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 1996, 54, 1703–1710.

(137) Hartwigsen, C.; Goedecker, S.; Hutter, J. Relativistic separable dual-space Gaussian pseudopotentials from H to Rn. Phys. Rev. B 1998, 58, 3641–3662.

(138) Sun, Q.; Berkelbach, T. C.; McClain, J. D. et al. Gaussian and plane-wave mixed density fitting for periodic systems. J. Chem. Phys. 2017, 147, 164119.

(139) Vosko, S. H.; Wilk, L.; Nusair, M. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 1980, 58, 1200–1211.

(140) Booth, G. H.; Grüneis, A.; Kresse, G. et al. Towards an exact description of electronic wavefunctions in real solids. Nature 2013, 493, 365–370.

(141) Raghavachari, K.; Trucks, G. W.; Pople, J. A. et al. A fifth-order perturbation comparison of electron correlation theories. Chem. Phys. Lett. 1989, 157, 479–483.

(142) Blöchl, P. E. Projector augmented-wave method. Phys. Rev. B 1994, 50, 17953–17979.

(143) McClain, J.; Sun, Q.; Chan, G. K.-L. et al. Gaussian-Based Coupled-Cluster Theory for the Ground-State and Band Structure of Solids. J. Chem. Theory Comput. 2017, 13, 1209–1218.

(144) Gruber, T.; Liao, K.; Tsatsoulis, T. et al. Applying the Coupled-Cluster Ansatz to Solids and Surfaces in the Thermodynamic Limit. Phys. Rev. X 2018, 8, 021043.

(145) Grüneis, A.; Booth, G. H.; Marsman, M. et al. Natural Orbitals for Wave Function Based Correlated Calculations Using a Plane Wave Basis Set. J. Chem. Theory Comput. 2011, 7, 2780–2785.

(146) Wagner, L. K.; Ceperley, D. M. Discovering correlated fermions using quantum Monte Carlo. Reports Prog. Phys. 2016, 79, 094501.

(147) We note we probably would have chosen, like Psi4, to use Python via the excellent pybind11 library had HANDE-QMC been written in C++, in part due to Python already being used extensively in scientific research.

(148) Lu, J.; Wang, Z. The full configuration interaction quantum Monte Carlo method in the lens of inexact power iteration. arXiv:1711.09153 [physics] 2017, arXiv: 1711.09153.

(149) Deustua, J. E.; Shen, J.; Piecuch, P. Converging High-Level Coupled-Cluster Energetics by Monte Carlo Sampling and Moment Expansions. Phys. Rev. Lett. 2017, 119, 223003.

(150) Gillan, M. J.; Towler, M. D.; Alfe, D. Petascale computing opens new vistas for quantum Monte Carlo. Psi-k Newsletter 2011, 103, 32.
