Arxiv:2008.05592V3 [Quant-Ph] 17 Nov 2020 Model 6 1

Density functionals and Kohn-Sham potentials with minimal wavefunction preparations on a quantum computer

Thomas E. Baker1 and David Poulin1, 2, 3, ‡ 1Institut quantique & Département de physique, Universitéde Sherbrooke, Sherbrooke, Québec J1K 2R1 Canada 2Quantum Architecture and Computation Group, Microsoft Research, Redmond, WA 98052 USA 3Canadian Institute for Advanced Research, Toronto, Ontario, M5G 1Z8 Canada (Dated: November 18, 2020) One of the potential applications of a quantum computer is solving quantum chemical systems. It is known that one of the fastest ways to obtain somewhat accurate solutions classically is to use approximations of density functional theory. We demonstrate a general method for obtaining the exact functional as a machine learned model from a sufficiently powerful quantum computer. Only existing assumptions for the current feasibility of solutions on the quantum computer are used. Several known algorithms including quantum phase estimation, quantum amplitude estimation, and quantum gradient methods are used to train a machine learned model. One advantage of this combination of algorithms is that the quantum wavefunction does not need to be completely re- prepared at each step, lowering a sizable prefactor. Using the assumptions for solutions of the ground state algorithms on a quantum computer, we demonstrate that finding the Kohn-Sham potential is not necessarily more difficult than the ground state density. Once constructed, a classical user can use the resulting machine learned functional to solve for the ground state of a system self- consistently, provided the machine learned approximation is accurate enough for the input system. It is also demonstrated how the classical user can access commonly used time- and temperature- dependent approximations from the ground state model. Minor modifications to the algorithm can learn other types of functional theories including exact time- and temperature-dependence. Several other algorithms–including quantum machine learning–are demonstrated to be impractical in the general case for this problem.

CONTENTS IV. Limitations of known algorithms 9 A. Feasibility of obtaining the starting state 9 I. Introduction 2 1. General considerations for real time evolution 9 II. Algorithm for the functional 3 2. Choice of basis sets for algorithms 9 A. Quantities of interest 4 3. Limits on solutions in superposition 10 4. Alternative algorithms to real time 1. Density functionals 4 evolution 10 2. Kohn-Sham potentials 5 B. Other methods and quantum machine B. Example for the Kohn-Sham potential 5 learning 10 1. Quantum machine learning 11 III. Additional considerations and use of the C. Summary 11 functional 6 A. Scaling and resources required 6 V. Conclusion 11 1. Algorithmic scaling 6 2. Prefactor for convergence of the VI. Acknowledgements 12 wavefunction 6 3. Convergence of the machine learned A. Quantum chemistry 12 arXiv:2008.05592v3 [quant-ph] 17 Nov 2020 model 6 1. Many-body problems 12 4. Resources required 6 2. Basis sets 12 B. Structure of the neural network 7 a. The curse of dimensionality and other C. Comment on the universal vs. exact limitations in point-like basis sets 12 functional 7 B. Density functional theory 13 D. Comparing the search for the density and 1. Kohn-Sham density functional theory 14 Kohn-Sham potential 7 a. Finding the Kohn-Sham potential 14 E. Use of the functional 8 b. Components of the functional in the F. Other types of functional theories 8 Kohn-Sham system 14 1. Time evolution 8 c. Kohn-Sham potential by functional 2. Finite temperature calculations 8 derivatives 14 G. Non-density functional quantities 9 d. Variational principle for the Kohn-Sham H. Reduced examples for testing 9 potential 14 2

e. v-representability 15 wavefunction can be extremely costly,22–25 taking months f. Minimization of the Kohn-Sham or years even for moderately sized systems,24,25 and that functional 15 the wavefunction cannot be copied.26 The wavefunction g. Relationship between the energies of the is therefore a valuable commodity and measurements Kohn-Sham and the fully interacting should be minimized.27 system 15 One option to encode many solutions into one mea- 2. Potential functional theory 15 surement is to use a machine learned (ML) model.28 In a. Why the functional derivative is also general, ML models can interpolate remarkably well be- necessary 15 tween input data to give access to many systems, includ- 3. Time-dependent density functional theory 15 ing those not already solved. In principle, the ML model 4. Density functional theory at finite can be constructed directly on the quantum computer or temperature 16 from classical data generated by the quantum computer. 5. Comment on density functional Using ML models would also allow for users of the approximations 16 quantum computer to export solutions to classical users. The results could then be quickly retrieved classically C. Training a machine learning model with from the model and are generally accurate over a training stochastic gradient descent 16 manifold on which the model was constructed in a desire 1. Gradient-free methods 17 to generate the best machine learned model possible. Finding the full wavefunction would require exponen- D. Quantum algorithms 17 tially many measurements, so this can be difficult to im- 1. Quantum Fourier transform 18 plement on a quantum computer. But the same informa- 2. Quantum phase estimation 19 tion can be expressed in a more compact form. So, we 3. Quantum gradient algorithm 19 can look at alternative formulations of quantum physics a. Functional derivatives with quantum for the most descriptive model. gradient algorithms 20 The route pursued here is with density functional the- 4. Real time evolution 20 ory (DFT).29 Hohenberg and Kohn established that the 5. Quantum amplitude estimation (quantum one-body density, n(r), is one-to-one with the external counting) 21 potential v(r) up to a constant shift of the potential. In essence, the density can replace the wavefunction, but it References 21 has fewer variables. In order to use DFT, we must find some other means of obtaining the energy, since the Hamiltonian is not used I. INTRODUCTION in DFT. Instead, the universal functional, F [n], must be found. It was proven that the universal functional ex- Quantum computing has been proposed as an alterna- ists and is common to all problems of the same electron- tive to classical computing,1 and there are some prob- electron interaction.29,30 lems which can be solved faster than known classical The quantities required for the classical user to find algorithms.2–7 One of the most sought after and poten- self-consistent solutions are the exact functional (deter- tially far reaching applications on a quantum computer mining the energy) and the functional derivative.31 So, is the solution of quantum chemistry problems.8–13 Ob- in addition to finding F [n], we also must find some other taining exact solutions from a quantum computer effi- quantity such as the density, n(r), or the Kohn-Sham 30 ciently could revolutionize modern applications includ- (KS) potential, vs(r). With these components, we can ing the creation of new medicines, fertilizers, batteries, fully characterize a quantum ground state and solve for superconductors, and more.14–20 other measurable quantities. To do this, one important quantity to determine is the It has already been established that the density func- ground state energy. The energy is a highly useful quan- tional can be successfully modeled with ML methods tity for determining properties such as the equilibrium on the classical computer.31–45 Exact quantities at sev- geometry of a molecule. Yet, the energy is not descrip- eral different external potentials must be found for tive enough to fully characterize all desired properties the ML models to be trained. The number of training of a system. For example, the band structure can be a points needed to construct accurate models are not pro- useful tool to characterize a material, but this requires hibitively large. From the ML functionals, self-consistent measurements at several k-points.21 So, many measure- solutions can be obtained.38,46–48 Numerically accurate ments of the wavefunction would be required for some ML functionals satisfy all exact conditions of F [n]33,49 simple quantities. and escape the common errors of approximated density Measurement on the quantum computer is expensive functionals.50,51 because the wavefunction must often be re-prepared be- To apply the classical ML-DFT methods on a quantum fore a second measurement is performed. It has already computer, some additional constraints must be minded. been shown on a quantum computer that obtaining the Previous attempts to obtain functionals from the quan- 3 tum computer have relied on many measurements of the demonstrated by us in the references and the algorithms wavefunction for each system of interest.13,52–56 In our performed accurately. This section will contain all the el- view, a worthy goal is to avoid both excessive measure- ements of the algorithm and assumes only a background ments and re-preparations of the wavefunction especially of algorithms in quantum computing. in the case of time-dependent quantities.53,57 We provide Fig. 1 to illustrate the steps necessary for This work proposes a feasible algorithm that finds the one iteration of the algorithm, which we will refer to as ML model for F [n] on the quantum computer if a ground a recycled wavefunction for minimal prefactor (RWMP) state wavefunction is available. The algorithm leaves the method. Although many quantities could be produced wavefunction largely undisturbed so it can be used as from this algorithm, we will focus on the components the starting state for another system, greatly reducing useful for the density functional. The inputs are the ex- the prefactor required to solve other systems. This is ac- ternal potential for some system (|v(r)i) and initial guess complished by using a state-preserving quantum count- weights for the ML model (|w(i)i) represented as classical ing algorithm to extract descriptive quantities such as variables throughout. The following steps are required to the density.58,59 Much of the algorithm is kept entirely obtain the solution for a given system and then update on the quantum computer to motivate future improve- the parameters of the ML model. ments for speed, but the counting algorithm does allow for information to be output classically. 1. We prepare a ground state wavefunction |Ψi for This algorithm is an alternative to running one very a given external potential, |v(r)i, and number of long computation and just measuring one energy, in that electrons, Ne. each step of the wavefunction preparation is proposed This can be done by real-time evolution (RTE, see to solve another system. Thus, no step from the ground Apx D 4). In Fig. 1, this is denoted by a box for RTE. state solver is wasted when using the algorithm here. The subroutine that obtains the wavefunction here does We also demonstrate that the Kohn-Sham potential not have to be RTE. If another, more advanced solver is can be solved using a similar strategy as the wavefunc- developed and used, then this can be substituted with no tion. A gradient evaluated on a cost function for the change to the rest of the RWMP algorithm KS system allows for the determination of the exact Note that the methods to obtain quantities from the KS potential. This strategy can be faster than obtain- result of classical computations are not available since the ing the density. Further, we demonstrate how access to quantum wavefunction has coefficients that are stored in the functional can be used to find approximate time- and superposition (a linear combination of |0i and |1i). This temperature-dependent behavior in a system from the means that the coefficients of the wavefunction cannot ground state functional, and modifications can be added be found except with many measurements. We must first to obtain exact results. find the energy before obtaining other relevant quantities. One temptation would be to use quantum machine learning, but long-known bounds on the efficiency of 2. We obtain the ground state energy |Ei from the these methods preclude their use here.60 This agrees ground state wavefunction |Ψi given in the previous with recent demonstrations that some known quan- step. tum machine learning algorithms are not universally advantageous.61–64 We also discuss general limitations on Access to the ground state energy is provided by quantum phase estimation (QPE, Appendix. D 2) or with known algorithms such as quantum machine learning and 65 why these algorithms are expected to be inefficient here. some other method like qubitization. On Fig. 1, this Section II presents the algorithm Section III will dis- step is denoted by QPE. cuss several uses of the resulting functional and considerations in choosing quantities to solve for. Section IV will discuss known limitations and justify why the algorithm is constructed as presented. Necessary background information on quantum chemistry, DFT, ML, and quantum computing algorithms is given in the appendixes.

II. ALGORITHM FOR THE FUNCTIONAL

In order to establish an algorithm for the quantum computer that gives results in quantum chemistry, knowl- FIG. 1. One step of the RWMP algorithm for the density. The edge of both ﬁelds must be understood. To avoid a next iteration uses the output wavefunction as the starting lengthy summary in the main text, we have included rel- state for the next system to reduce the prefactor. A similar evant background knowledge in the appendixes in case procedure can be used to ﬁnd the KS potential, replacing n(r) they are needed. Nearly all of the computational steps with vs(r) (with a QGA as an oracle query for the QAE). The (e.g., machine learning the functional) have already been ML model may need inputs from other registers. 4

The next task is to determine some quantities of inter- a configuration where the initial wavefunction is accu- est without requiring a full measurement of the wavefunc- rate, then this can become the first data point for the tion. The counting algorithm used allows for the wave- ML model. Each subsequent time step could be another function to be used again on the next iteration. potential data point for the model. The second advantage to this strategy is that the se- 3. Given the energy |Ei and wavefunction |Ψi, we gen- quence of potentials in the RWMP algorithm can allow erate some quantity that is denoted here as |n(r)i for the ordering of the next potential to be close by. This from a quantum amplitude estimation (QAE, Ap- allows for the best starting state for the next system to pendix. D 5), a name which we use interchangeably be used and reduce the total amount of RTE steps that with quantum counting throughout. must be run over all systems. In summary, the RWMP algorithm here could make use of the preparation time The symbol used, n(r), is for the density (Ap- for a hard to solve system by finding data from the inter- pendix. B), but we can substitute this quantity for others. mediate steps and reduce the prefactor in the solution. For example, one could also determine the KS potential The RWMP algorithm repeats until all systems are |vs(r)i. This step is denoted as QAE in Fig. 1. Note that visited. The final step is to measure the parameters of the this step may involve an oracle query such as a quantum ML model, |wi. The model can then be used classically. gradient algorithm (QGA, Appendix. D 3) or have other In what follows, we discuss which quantities should subroutines. This is a point we will expand on in the next be obtained by the QAE for the best description of the section, Sec. II A, when discussing how to obtain either ground state. Then in Sec. III, we discuss aspects of this n(r) or vs(r). For now, we focus on what to do once the strategy that were not absolutely necessary in defining quantity is obtained. the algorithm and how a classical user can use the result. The output wavefunction is slightly modified by the Finally, we expand on many points that were not crucial QAE but remains nearly the same state with a small for the basic understanding of the RWMP algorithm and amount of error. The procedure to return the wavefunc- explain limits that ultimately lead to this algorithm (and tion to its original state does not completely re-prepare why other subroutines were not used) in Sec. IV. the wavefunction and instead has an iterative set of steps to repair the wavefunction as explained in Ref. 58 (see also Appendix. D 5). A. Quantities of interest 4. The output of the QAE can be used to update the ML model’s parameters |w(i)i to the next iteration, We move on to discussing which quantities are best to |w(i+1)i. More than one ML model can be trained determine from the wavefunction via the QAE. There are here (e.g., a ML model for |Ei and for |n(r)i). two main options: the coefficients of the density matrix and the KS potential. When choosing the quantities of The typical steps in a stochastic gradient descent interest, both the functional and the functional deriva- (SGD, Appendix. C) of forward and backward propaga- tive must be determined with the ML model in order to tion can be used. For backward propagation, the output find solutions on the classical computer. There are several of a QGA can be used to update the ML parameters. This types of functionals that can be trained for this, some of step is marked as ML in Fig. 1. This ML operation may which are presented here. need to be controlled on |v(r)i depending on which quantity is being trained. Note that the QAE output could also be stored clasically and then machine learned not 1. Density functionals on the quantum computer, skipping this step but elim- inating the opportunity for an improvement in the ML box from quantum advantages. In the RWMP algorithm, one option is to find the density, n(r), since this is proven to be a suitable replacement 5. Another external potential is provided (|v + ∆vi for the wavefunction from the Hohenberg-Kohn theorem 2 (not shown in Fig. 1) and the wavefunction is re- in Ref. 29. The N elements of the density matrix are ex- † used as the starting state for the next RTE. pectation values of the operatorc î cˆj (not just diagonal elements; see Apx A 1) wherec ˆ is a fermionic operator Here, ∆v is chosen by the user and simply added to defined in Appendix. A 1. Spin indices are ignored for the coefficients of v(r) to give the next potential. Many simplicity for now. Since the expectation value is on the † passes through a set of potentials must occur to obtain interval [−1, 1], we can use the operator (ˆci cˆj + 1)/2 (de- an accurate ML model. fined on [0,1]) and afterward shift the result back to the Recall that the resource estimate given in Ref. 24, sev- original interval. Using this shifted operator is a neces- eral months may be required for a small molecule with sary component to using the QAE because the expec- RTE. The RWMP algorithm allows for a sequence of in- tation value can now be related to a probability. The termediate systems to be visited, effectively making use number of rounds required for the QAE relates to the of that time to obtain data. So, if the system starts in inverse of the probability of failure requested.58 5

There are several options for the ML model. The ML coefficients of the KS potential and the process is re- step can train directly from n(r) to E or we can take peated sufficient times until the true KS potential is as an input v(r) to the model and train both n[v](r) obtained. An initial guess for the parameters could be and E[v]. The first option gives a pure density func- taken from existing semi-local approximations such as tional (Appendix. B). The second option gives a poten- local density approximations, etc.83,84 or using a classi- tial functional, which is a dual functional to DFT (Ap- cal method.72–79,85 The other classical methods where a pendix. B 2).66 Both of these theories can be solved self- gradient is used to evolve the potential may be useful, consistently. One can also train the bifunctional E[n, v].34 but the density does not need to be constructed to use Eq. (1). In order to construct Eq. (1) on a quantum computer, 2. Kohn-Sham potentials the eigenvalues of free Hamiltonians, such as the KS Hamiltonian (Tˆs +Vˆs) can be mapped to the interval [0,1] for the QAE. The operator must also be scaled by a con- The other main quantity, which has more value, is the stant, but we also note that shifting the potential by a KS potential, v (r). The defining feature of the KS system s constant, v (r) → v (r)+C, is allowed to ensure all eigen- is a noninteracting system that has the same density as s s values are positive without changing the eigenvectors.29 a given interacting system (see more discussion and ways Further, identifying some upper bound on the expecta- to realize this potential in Appendix. B 1). The potential tion value, ε , the scaled operator could appear as defining this noninteracting system is defined as v (r) max s hΨ|Tˆ + Vˆ + C|Ψi/ε . which is described by N parameters. s s max The expectation value hΦ|Tˆ + Vˆ |Φi can be evaluated Two potentials can have the same density if the sys- s s in one of two ways. First, it can be computed by diag- tems do not have the same electron-electron interactions, ˆ ˆ 29 onalizing Ts + Vs (determining hk) and taking the sum so as not to violate the Hohenberg-Kohn theorem. The ˜ P † ˜ ˜ KS potential is highly valuable since it can be applied in hΦ| k hkcˆkcˆk|Φi in the diagonalized basis with Φ is now several other instances. This includes finding time- and used. This procedure uses QAE to find the expectation value analogously to finding the coefficients of the den- temperature-dependent calculations from vs(r) or the KS band structure.57,67,68 sity. Alternatively, one could apply a QPE (or qubitization) directly for the KS system, noting that the gate A central question is whether vs(r) exists for a given interacting system. This is known as the problem of v- count is drastically less for the noninteracting system. representability. Since it is proven that the KS potential The KS potential as encoded in the ML model can be always exists on a lattice, v (r) always exists here.69–72 expressed as either vs[v](r) or as vs[n](r), where both are s well-defined.86 The KS potential must be converged to, just as we had to evolve an initial state in RTE to a final state. The difference in finding the KS potential is that a gradient B. Example for the Kohn-Sham potential is applied instead of a time evolution operator for the wavefunction. To illustrate further some of the more abstract quan- The KS potential, v , will satisfy the s tities in the RWMP algorithm in Sec. II, we provide an minimization,73–75 expanded example of the RWMP algorithm of how to obtain the ML-KS potential. min hΨ[v]|Tˆ + Vˆs|Ψ[v]i − hΦ[vs]|Tˆ + Vˆs|Φ[vs]i (1) vs 1. An initial potential v(r) is chosen. This can be done by assuming the external poten- where Ψ is the interacting wavefunction and Φ is the tial is a set of nuclei with a Coulombic interaction, KS wavefunction (see Appendix. B 1 d for more infor- v(r) = P (−Z/|r − r |), where a indexes the posi- mation). There are other methods to obtain the KS a a tions of the nuclei, r with atomic number Z. potential,76–79 but the method used here is straightfor- a ward on a quantum computer given a close enough start- 2. A basis set is chosen, ϕk(r). ing guess or small enough molecule (i.e., those solvable The model can then be discretized in this basis and by RTE). Many other methods to find the Kohn-Sham the resulting model has fermionic operators (see potential would either require numerous measurements Appendix. A 1). of the wavefunction or large overhead in terms of qubits for operations that are simple on the classical computer 3. RTE is run and the ground state is obtained. but costly on the quantum computer including addition, 4. QPE is used to find the ground state energy, E. division, etc.80 Equation 1 is used as the output of the oracle query 5. Ψ and E are used in the QAE to find the first term in QAE for the QGA.81,82 Note that the QGA is partic- in Eq. (1). The energy of the noninteracting system ularly useful for finding the functional derivatives here, is also obtained for the second term in Eq. (1) us- notably taking the variation of all possible vs(r) in one ing some initial guess for vs(r). This constructs the oracle query.81 The resulting gradient is applied on the oracle in the QGA. 6

6. The QGA result is added to the coefficients of the 1. Algorithmic scaling KS potential and the last step is repeated until the vs(r) a certain number of times to find the mini- The scaling of the RWMP algorithm for the density mum. functional in terms of the number of basis sets is O(N 2) 7. The values E and v (r) are input into a ML model. for one system asymptotically in the number of basis s functions due to time evolution. The actual complexity The gradient of the parameters in the model are 2 4 updated with a QGA by repeatedly computing the in practice is between O(N ) and O(N ) for intermedi- gradient and adding it to the current coefficients. ate system size (see Appendix. A 2 and D 4). The other subroutines scale at most as O(N 2) (see the appendixes). We can also use the QGA to output E and vs(r) to the classical user and learn the final set of potentials classically. 2. Prefactor for convergence of the wavefunction 0 8. A new set of atomic coordinates are provided, ra, the difference ∆v between the previous and this Even though the scaling in terms of the number of basis potential is computed, and the result is added to functions is polynomial, the actual cost is significant due the old potential. to a prefactor.24 The prefactor of the algorithmic scaling is problem de- We now have the next potential and can start again pendent and will be the dominant cost to obtaining a at step 3 until all systems are visited. ground state wavefunction. The prefactor will depend on Note that this computation can be restarted at any the time used to prepare Ψ, the number of steps that the time (at the significant cost of re-preparing the wavefunc- QAE algorithm must be run, and the required number tion). Note that it has been assumed that all systems are of systems to be visited. Note that the number of steps run for the same basis set, although this condition could that must be run to obtain the correct time evolution be relaxed in principle. is dependent on how close the starting state is to the One advantage of using this method is that the Kohn- system’s solution, the number of electrons, how strongly Sham potential is characterized by N coefficients κk, correlated the electrons are, and many other variables. PN Note that if the first system in the RWMP algorithm is vs(r) = k=1 κkϕk(r). This is a factor N less than the density required. The evolution of the Kohn-Sham po- exactly equal to the initial wavefunction (e.g., the well- tential from an initial guess potential by gradients is very separated limit for a neutral molecule where each sep- similar to how the wavefunction is evolved with RTE. arated piece is a hydrogen atom with one electron and where Hartree-Fock is exact for one-electron systems), then the time to make the first solution is zero and each III. ADDITIONAL CONSIDERATIONS AND subsequent motion of the atoms closer together will be USE OF THE FUNCTIONAL another data point that may not require too many time steps to obtain. In the previous section, an RWMP algorithm for finding the density functional and Kohn-Sham potential were detailed. It was also noted that other variations of the 3. Convergence of the machine learned model density functional could be found using the same technology. In order to aid the convergence of the ML model, it Several points that could extend the RWMP would be useful to have several quantum computers run- algorithm–and other details about what was introduced– ning at once. This would mean that more than one sys- are discussed here. These include resource costs, a com- tem can be used to construct a mini-batch update for parison between finding the density and the KS potential, the cost function of the gradient descent, making the re- the universality of the functional, how the functional can sulting update to the ML model more accurate. One also be used, applications to other types of functional theo- can save the output quantities as future training points at ries, and opportunities for near-term studies. Some points the cost of extra registers. Once an accurate ML model is relating to a quantum advantage that are discussed here generated, the problem does not need to be solved again. are continued in Sec. IV A.

4. Resources required A. Scaling and resources required The RWMP algorithm presented in this work requires The algorithm will require a fully scalable quantum 4+ζ qubit registers for ζ quantities of interest that are computer, probably with self-correcting memory. Near- not the energy (e.g., learning E and vs(r) implies ζ = 1). term examples are available for density functionals, and a There will also be overhead for the QAE and other parts full discussion of the known features of finding the density of the RWMP algorithm One can also divide the param- functional with this method is given here. eters in the ML model into separate registers for each 7 additional quantity of interest since the model spaces be- algorithm as presented in Sec. II would still avoid ex- tween them should be sparse. cessive measurement; meanwhile, outputting to classical The number of qubits required is clearly large and de- variables will reduce the amount of overhead needed to pendent on the specific steps of the RWMP algorithm store ML parameters. used. This means the RWMP algorithm is best suited for a fully working quantum computer that is expected in the future. There are many ways to reduce the steps neces- C. Comment on the universal vs. exact functional sary. For example, in one way, by outputting the QAE to a classical user. However, we want to maintain the flex- ML functionals are very accurate over the training ibility that a quantum advantage could be realized here 47 as discussed next in Sec. III B. manifold on which they were constructed. We have chosen the appropriate adjective (exact or universal) describ- ing the functional very carefully in each use here. The procedure we describe here is not truly universal since B. Structure of the neural network the accuracy of the ML functional is limited to the training manifold of potentials that we have explored for the In a neural network, the form of the nonlinear function, ML model. For example, trying to solve a model trained S (see Appendix. C), is often chosen so that it is easily for solutions with Ne electrons with a new potential in differentiable. On the quantum computer, determining the training manifold with Ne + 1 electrons will try to the gradients are accomplished in one oracle query (see project the solution back onto the Ne solutions. Appendix D 3), so the traditionally used functions on a The resulting ML functional can be numerically ex- classical computer (e.g., sigmoids, etc.) can be swapped act, however, for a given problem. This functional will be out for another function that may be lead to better per- called the exact functional, implying that it is accurate formance or faster convergence. It is not clear if this will for some systems but not all possible systems. produce any detectable advantage. To complete the discussion, one can have a universal The number of coefficients required to obtain an ac- but not exact functional. For example, if one estimated curate ML model can be very high, but some guidelines 87 the energy to be a fixed value for all systems, this would can reduce this number for a given quantity. For the be universal but not very useful. problem at hand, note that the training is done in the same basis as the problem is expressed. In this case, the connectivity of the neural network can be constrained on physical arguments. Known bounds on the structure of D. Comparing the search for the density and local correlations as proved by Hastings in Ref. 88 apply Kohn-Sham potential to densities that was originally shown by Kohn, et. al. in Ref. 89 and 90 This means that perturbations on the To actually compare the true cost of the O(N 2) opera- density decay exponentially with distance for gapped sys- tions to find the density with the O(T N) coefficients for tems and as a power law for gapless systems.91 Then, the KS potential, one would need to know the number the neural network models for the density do not need of times T a gradient must be applied. This depends on to account for arbitrarily long ranged connections and the system studied and starting KS potential. It is there- can remain local. In essence, the connectivity of edges on fore not clear as a general guideline if directly finding the the graph for the neural network would be similar to the coefficients of the density matrix is always better than structure of a multi-scale entanglement renormalization starting from a KS potential that is close and applying ansatz92 (scaling as N log N) but in three-dimensions. the gradients. The key difference in the two cases is that Note that this estimate is a minimum and it may still the KS potential relies on a suitable starting state and the be advantageous to add more hidden layers or connectiv- density does not. The assumption for the KS potential– ity. Note that we will expect that gradient-free methods that a good starting state is required–is very similar to (Appendix. C 1) will not necessarily be more useful in the restriction on RTE itself to find the ground state, so training the ML model. all assumptions are consistent. Not all quantities will have a local structure. The vs(r) The fact that the KS potential is competitive with find- is fully nonlocal due to the Hartree potential (see Ap- ing the density here is more a comment on the overhead pendix. B 1 a) and can require all-to-all connectivity be- required for the density (which is far greater than on tween subsequent hidden layers in the neural network. the classical computer) than the KS potential being any Note that it is not required to train the ML model on easier to find. There is one advantage to using the min- the quantum computer. One can determine the quanti- imization for vs(r) here in that the QGA is much more ties of interest for each system and output the results efficient than the classical computer. with the QAE from the quantum computer to the clas- In addition to comparing the scaling, the nature of the sical computer.58 But we give the option here to train KS problem requires a large basis set to obtain the proper the neural network on the quantum computer in case a KS potential and avoid any Gibbs oscillations that would quantum advantage can be realized. Using the RWMP appear in a truncated basis set.78 So, the N required for 8 an accurate density or energy might be smaller than for (TD-DFT),57,97–99 thermal properties,100 supercon- an accurate KS potential in practice. ducting functionals,101,102 quantum electrodynamics- 103–105 106,107 66,108–112 Regardless, any effort to find vs(r) is worth it since DFT, ensembles, and others. In vs(r) is far more descriptive and gives access to useful each case, the solution on the quantum computer is mod- quantities. ified in some way. For example, a superconducting functional theory can be found if the pairing potential113 is also computed and learned. E. Use of the functional Two of the methods in the above list (time- and temperature-dependent methods) deserve extended dis- The algorithmic cost in Sec. III A is paid only once. cussion in the subsequent sub-sections since both exact When the model is given to a classical user, a sepa- and approximate functionals can be found from both. rate cost must be paid to solve it for the ground state. Note that the approximate methods only require the Given the ML model, the classical user can solve the ground state functional in both cases. ML functional self-consistently by determining the Euler- Lagrange equation to minimize the functional (see Ap- pendix. B 1 f). This requires that a projection back into the training manifold must occur to ensure the functional 1. Time evolution derivative is accurate.34,46 This projection is estimated to scale as O(N 2) but can be machine learned separately to speed up computation.33,34,46 Given an accurate enough ML approximation for the 44 Finding the pure density functional has better scaling KS potential, one can time evolve the system. If the for the classical user, scaling like the number of basis system is adiabatically time evolved with a weak enough functions, N. Note that even though we can learn F [n], perturbation, this would be available immediately from the external potentials (as well as particle numbers and just the KS system at zero time (see Eq. (B25) in Ap- polarizations) over which the ML model is accurate must pendix. B 3). The adiabatic approximation is often suffi- also be given to the classical user so it is understood what cient for many physical processes. the training manifold is. For time evolution beyond the adiabatic limit, an The KS system scales as the cube of the number of exchange-correlation kernel denoted as fxc (see Ap- basis functions formally, N 3, since it is a noninteracting pendix. B 3) would need to be found. Full evalua- problem. Note that when solving the KS system with the tion of fxc could be accomplished with the QGA (Ap- exact functional, convergence is proven.72,93 pendix. D 3 a); however, one can estimate the kernel If a bifunctional, E[n, v], is used then a self-consistent via back propagation in the neural network (see Ap- solution is not necessary.34 pendix. C). An accurate functional derivative will be re- Once the model is trained sufficiently accurately, one quired in this case.46 One can also time evolve on the can always re-train for another type of functional (e.g., quantum computer and use expectation values at vari- from the KS potential, the density can be obtained and ous times for other quantities.56,65,114,115 a pure density functional can be trained). Finding the density functional–and using it to compute quantities–is preferable to generating a library of system properties and machine learning those properties. From 2. Finite temperature calculations the DFT model, the other quantities can be constructed. As an example application, if trained over enough external potentials, this method could efficiently evaluate For temperature-dependent processes, one approxima- molecular dynamics problems giving accurate results on tion that is available from the ground state KS po- 116 laptops instead of supercomputers.94 tential is the Fermi-weighted KS technique. In this method, occupied and unoccupied orbitals found at zero- temperature can be weighted with a Fermi-Dirac distri- F. Other types of functional theories bution to obtain a finite temperature density. For exact computations at finite temperature, Ap- There are many other types of functional theories that pendix. B 4 discusses the temperature-dependent KS po- are related to DFT that can be obtained with these meth- tential (which is very similar to the ground state case) ods. The most easily extended method is density matrix and can be found if a finite temperature wavefunction is 114,117 functional theory (DMFT).95 Since our method of ob- provided. taining the density was to actually obtain the coefficients Other theories may require computation of other quan- of the density matrix (see Apx A), the density matrix tities, including those not relevant to quantum chemistry. functional is learned. Yet, they can follow the same strategy as the RWMP al- Extensions of DFT can also be solved with this method gorithm (i.e., re-use of the wavefunction and QAE) to including the motion of nuclei,96 time-dependence find the relevant quantities. 9

G. Non-density functional quantities 1. General considerations for real time evolution

Note that the RWMP algorithm is not specific to the One of the primary motivations for making the RWMP density, the KS potential, or even a density functional. algorithm for the density functional is to recycle the Many quantities of interest can be obtained. To see how ground state wavefunction, reducing the cost of RTE. It is the continued fraction representation of the Green’s func- true that RTE scales only polynomially (Appendix. A 2) tion can be obtained, see Ref. 59. with the number of orbitals. In comparison, the full configuration interaction (FCI) gives the exact results but 136 scales as O(Ne!). So, RTE has a scaling advantage H. Reduced examples for testing over FCI; however, the prefactor matters. The exact same RTE algorithm can be run on a classi- Throughout, it is implied that many qubits will be re- cal computer. The reason that quantum chemistry computations are not run with RTE is that the prefactor quired, but we view this as similar to requiring sufficient 24,137 memory on the classical computer. However, there are equal to the number of time steps is large. This is test systems that are used to provide insight into DFT especially problematic for large systems and those where and may be useful for proofs of principle.118–126 A sim- a near-degeneracy must be avoided, necessitating a small ple model that may be within reach of existing quantum step size. But this is exactly where quantum computers computers is the two-site Hubbard model, which has been are hoped to be applied. used to study simplified DFT.127–135 Note also that the Trotterization of the time evo- Each individual algorithm has been tested in cases of lution operator is suited for planar molecules since relevant interest for other problems, many of which by interactions are comparatively localized, which is the same reason that matrix product states prefer these the authors. Citations to many of the tests of these algo- 91,138,139 rithms are included near their description. geometries further limiting the usefulness of RTE. Further comment becomes far more complicated because these alternative, smaller wavefunction ansatzs on IV. LIMITATIONS OF KNOWN ALGORITHMS the classical computer may display all the same features of the true ground state and accurate energy. These This section discusses limitations on the types of algo- solvers typically have systematic errors that are stud- rithms that could have been used to machine learn the ied, can solve systems faster, and have led to solutions functional and how future improvements in algorithms of large systems140 and new discoveries.141 So, it is not must continue to improve the performance on quantum clear if a quantum computation can beat all classical rep- computers. We hope that a clear and complete discussion resentations in terms of efficiency. Note that on the clas- of the current hurdles for quantum computation motivate sical computer, another solver can be used (e.g. tensor future algorithms and provide a consistent account. networks,91,138 quantum Monte Carlo,142 random phase approximation,143,144 or many, many other methods145). The number of choices here is vast and storied, so we A. Feasibility of obtaining the starting state leave more discussion for others.145,146 In summary, the time complexity of solving the quan- In the previous section, we omitted a discussion of how tum computation should be expected to be larger than to prepare the initial state in superposition and how it is classical solvers. Quantum computers can represent a re- converged. The time it takes to obtain a wavefunction is duction in the amount of space required to store a wave- known to be inefficient with current techniques. For some function, but it is not clear if this will always beat every algorithms, the ideal input would have been a superpo- classical representation. Improved methods of finding the sition of solutions that would include all combinations of ground state would be highly valuable. particle numbers and spin polarizations for all external potentials. In this section, we consider the complications for even 2. Choice of basis sets for algorithms preparing a suitable state in superposition and known limits. We will discuss the RTE algorithm in the context It is true that the quantum computer provides a dif- of its scaling, why the large prefactor can prohibit its ferent representation of the wavefunction. In some repre- use on some systems, comparing the implementation of sentations for classical algorithms, the memory will grow RTE on the classical and quantum computer, and limita- considerably with system size. For example, a coupled tions to constructing a superposition of all solutions with cluster calculation can have many coefficients as in the any method. This last point will address the feasibility number of operations may become exponentially large.147 of finding the original starting state for the QML. This is due to the need to store more coefficients for a Some other algorithms that were not used as subrou- given basis. The quantum wavefunction on the quantum tines in the RWMP method are also discussed. computer may have some advantage here since coeffi- 10 cients are stored in superposition on each qubit. How- ing the initial state required for algorithms that need this ever, there are many wavefunction ansatzs to consider superposition of solutions. in comparison because some methods can find accurate answers with considerably less coefficients, and the time Theorem IV.1. A quantum computer cannot efficiently needed for the quantum computer’s wavefunction can be generate a superposition of solutions for all potentials lengthy. necessary for F [n] unless BQP=QMA-complete. Some recent efforts on the quantum computer have Proof. It is also known that at least one of the systems sought to impose specific conditions to reduce the amount contained in the universal functional is in the computa- of operations required. The strategies that are pursued tional class QMA-complete to solve.154 This is not effi- are to use different basis sets and generate criteria for cient on the quantum computer to compute objects in removing terms from the Hamiltonian. For example, this complexity class. So, no algorithm should be able to plane-waves lead to a sparse Hamiltonian (see also Ap- obtain all solutions efficiently. pendix. A 2).148 Such systematic methods for this should be expected to be as difficult as (or more so than) solving Thus, some elements should be expected to be uncon- the wavefunction outright. verged in a superposition unless the algorithm is run for The proper comparison between RTE on a quantum an impractically long time. Note that algorithms requir- computer with a plane-wave basis set and RTE on a clas- ing a superposition generally require more than one so- sical computer is that that the classical computer can lution, so the superposition is not run just once. handle any basis. So, a comparison with a plane-wave basis set on a quantum computer should be compared to a RTE on the classical computer with some other ba- 4. Alternative algorithms to real time evolution sis set (e.g. Gaussians). The difference in the number of orbitals required to accurately simulate matter for plane- We do not rule out that some improvement may allow waves can be much higher than other basis sets due to RTE (or some other method) to receive a quantum ad- cusps in the wavefunction that require large numbers of vantage. If a superior algorithm is developed (e.g., the 145,149–151 plane-waves to resolve. So, restricting the basis tools exist to make imaginary time evolution,155 pertur- to plane-waves at best matches the classical equivalent bation theory,156 preparation of projected entangled pair in terms of operations required (see also Appendix. A 2). states,157 or a very expensive version of the density ma- Wavelets have appeared in some references recently trix renormalization group138,158,159 at the present time), suggesting that these functions can provide a system- it is likely that it will rely on converging from some ini- atic way to compress a Hamiltonian, but in the 30 year tial state. Also, any algorithm like exact diagonalization existence of these functions (for more information, see for general systems is not expected to be efficient since the references in Ref. 152), they have been demonstrated this problem is not contained in the BQP complexity to scale poorly to large system sizes due to the curse of class.154,160,161 This creates more motivation to focus on 153 dimensionality. These methods typically give access to algorithms that converge. 1–3 electrons when not approximating the Fock opera- We do note that progress through the decades on quan- tor. To use these functions, a compression ratio of nearly tum chemistry has been difficult to find a unifying prin- 100% would be required. Using these functions beyond ciple that would help in algorithm design,146 but perhaps one-dimension faces significant hurdles in the general, in- quantum computing may motivate a new way to look at teracting case. the problem for cases of interest since the prospects of In conclusion, restrictions to a particular basis set or finding a general algorithm are prohibitive. procedures to remove terms in a Hamiltonian is not a cure-all for computation in general. The task of identifying which terms of the Hamiltonian without first solving B. Other methods and quantum machine learning the problem is typically complicated, and the resulting answer is not necessarily exact anymore. Approximated In regards to other methods to train the ML model, we calculations are essentially the strategy of classical meth- had investigated using alternative subroutines using a su- ods which leave out some effect or terms, and it is not perposition of solutions. Algorithms that we considered clear how to do this systematically in general, nor if any included Grover’s algorithm,4,7 quantum walks,162,163 simple strategy can be expected. A single method or and others. Each of these requires a solution of a super- change in basis is not a panacea to making the solution position of systems, some means of identifying the correct on a quantum computer less complex. solutions through a phase kickback, undoing the superposition of systems, and then repeating the process until the error is low enough to ensure the correct solution is 3. Limits on solutions in superposition determined to some high probability. The limitations described in Theorem IV.1 are one hur- With regards to applying any generic algorithm to find dle, but in our view, these strategies of uninformed search a superposition of solutions, we can place a limit on find- were too lengthy even on the quantum computer. Since 11 the improvement in the number of steps required is only This is a consequence of the hardness of finding the by a square root factor, searching the exponentially sized functional. If this were not true, we could find a QML database causes the algorithm to run for far longer than algorithm that could discover F [n] in exponentially fewer other classical algorithms for electronic structure. steps, which is a violation of both Theorem IV.2 and The variational quantum eigensolver (VQE)164 might Ref. 60. So, there can be no exponential speedup for QML be adapted to finding, for example, the Kohn-Sham po- under the PAC model here. tential with a method from Refs. 72–79, but it is not While there may be cases that lie outside of the PAC clear if errors can be kept small in a reasonable amount model,170–174 we have we no evidence that the functional of time- and wish to keep focus on finding the exact KS is not subject to the PAC assumptions. There is also potential. The VQE would also require the exact den- good evidence to suggest that simple systems–at least– sity from another method, but it could in principle be obey the limits of the PAC model.165 adapted. The main question is how well this suggestion In summary, the difficulty of finding the universal func- would perform on a quantum computer based on current tional on a quantum computer places limits on the abil- hardware. ity for many-body solvers on the quantum computer and QML as well. Note the generality of the statements here for all QML algorithms. Recent works have shown that 1. Quantum machine learning some existing QML algorithms are not as efficient as once thought,61,62 but the arguments here apply to QML in Quantum machine learning (QML) algorithms were general for this problem. also not suitable since known bounds on the number These general arguments do not prohibit a quantum of oracle queries imply that there is only a polynomial advantage if another algorithm can be found to solve speedup for QML algorithms. See Ref. 60 and experi- systems in a more specific case or restricted class of sys- mental proof on a simple case in Ref. 165. A confirming tems. Recent progress on finding classical algorithms that statement is also found in in Ref. 166. are superior to quantum algorithms illustrate the need Lacking an exponential speedup, it is likely too expen- for caution when proposing an efficient QML algorithm, sive for the quantum computer to run in a reasonable however.64 Still, QML does not seem to be a feasible way amount of time here. forward for the problem of interest here. Existing statements in the literature on the hardness of determining the universal functional can lead to limits on the types of QML algorithms we expect can exist. We C. Summary formalize the relevant limits in some statements here.167

Theorem IV.2. No QML algorithm can discover the The main take-away from these statements is that an universal functional in polynomial time unless QMA- exponentially more efficient algorithm for the most gen- complete reduces to the complexity class BQP. eral case is ruled out when discussing the solution on a quantum computer. Reducing the prefactor, therefore, Proof. The functional is proven to be QMA-complete to becomes highly beneficial. This does not preclude algo- learn.161 If an algorithm determined the functional in rithms on more restricted systems, but there is no hint of polynomial time, then BQP=QMA-complete. how exactly to construct such an algorithm or that this Note that this says nothing about whether QML can is any easier than a straight-forward solution. do slightly better than the classical algorithm, but typ- The RWMP method avoids repeated measurement, re- ically a step in a QML algorithm is to re-prepare the duces the prefactor to solve each system iteratively and wavefunction and this is one of the main issues we want allows for more systems to be solved with RTE or some to avoid here. Because the functional is known to be other method. QMA-complete to learn, finding the exact and universal functional with the RWMP algorithm would require an exponential amount of time to visit all systems, as V. CONCLUSION expected.154,160,161,168 We continue on to make a connection with some statements about learnability. Theorem IV.2 also implies limits on how learnable the It has been demonstrated that a combination of al- universal functional is by any method. In Ref. 60, under gorithms applied to a wavefunction on the quantum the probably approximately correct (PAC)169 model of computer can yield the Kohn-Sham potential, energy, ML, only a polynomial reduction in oracle queries (train- and density matrix coefficients without completely re- ing points) is possible with QML. preparing the ground state wavefunction each time. The determined quantities can be used to train a machine Lemma IV.3. The assumed limitations on the number learned model using gradient-based methods either on of oracle queries (quantum and classical learning differ the quantum computer or classically. The ground state by only polynomial factors) required to discover F [n] with wavefunction was used as the starting point for the next QML are the same as those under the PAC model.60 system, reducing the prefactor and avoiding an expensive 12 computation of the ground state at each step. This effi- indices.175,176 The one-electron integral is ciency was also used for the Kohn-Sham potential with Z a minimization condition. ∗ 1 2 tij = ϕ (r) − ∇ + v(r) ϕj(r) dr (A2) Once a model is created, a classical user can extract i 2 the relevant quantities from the machine learned model and use it for ground state, time-, and temperature- which is the kinetic plus external potential terms. The dependent calculations. Finding the Kohn-Sham poten- two-electron integral is tial is especially useful here since it gives access to many ZZ 1 ∗ ∗ 0 0 0 0 properties of the ground state; in addition, there was Vijk` = ϕi (r)ϕj (r )vee(r − r )ϕk(r)ϕ`(r ) dr dr some indication that the Kohn-Sham potential might 2 (A3) scale better in some cases as opposed to finding the den- where v (r − r0) = 1/|r − r0| for the case of a Coulomb sity. Known limitations on the complexity of finding the ee interaction and that this expression assumes the orbitals universal functional and quantum machine learning have for both spin-up and spin-down electrons are the same. constrained the choice of subroutines in the algorithm Note that a Hubbard model is an approximation with here. A better method to solve for the ground state on the only the most diagonally dominant terms of the Coulomb quantum computer must be a focus of future research to operator, V = Uδ δ δ for Hubbard interaction make quantum chemistry studies feasible, but this algo- ijk` ij jk k` U.177 We have restricted our consideration to the Born- rithm will allow for solutions to exported to many users. Oppenheimber approximation,178 even though the discussion can be generalized to the motion of nuclei.96 Solving the entire many-body problem is known to be VI. ACKNOWLEDGEMENTS difficult if not impossible. However, approximate methods can yield results that are accurate to what is known T.E.B. acknowledges funding provided by the post- as chemical accuracy (1 mHa) or a stricter limit applies doctoral fellowship from Institut quantique and Institut in some cases.145 Transdisciplinaire d’Information Quantique (INTRIQ). This research was undertaken thanks in part to funding from the Canada First Research Excellence Fund 2. Basis sets (CFREF). We thank useful discussions at the 2018 New Trends in Quantum Error Correction workshop We can note that Eq. (A 1) has been written in the at Universitéde Sherbrooke. The authors are thankful second quantized form since we expect to need a basis for useful discussions with Frank Verstraete, Guillaume to truncate the problem to a more manageable size. One Duclos-Cianci, Anirban Narayan Chowdhury, Jonathan may, for example, choose Gaussian orbitals179 so that A. Gross, Colin Trout, Li Li (李力), Yehua Liu, Vamsee Eqs. (A2) and (A3) can be evaluated analytically and Voora, Shane Parker, Raphael Ribeiro, Agustin Di Paolo, chemical accuracy can be obtained with only a few func- Anirudh Krishna, Maxime Tremblay, Jessica Lemieux, tions. Other basis functions can also be chosen.145 Benjamin Bourassa, Stuart Clark, Rex Godby, and Niki- It is known that Eq. (A3), when represented in a local tas Gidopoulos. basis, reduces to145 ZZ 1 0 2 0 0 2 lim Vijk` = |γik(r, r )| vee(r − r ) dr dr ∼ O(N ) Appendix A: Quantum chemistry loc. 2 (A4) for a density matrix γ where the limit is taken for well In this section, we review some background informa- separated, local orbitals at large distances. This reduces tion on quantum chemistry. the computational complexity from O(N 4) to O(N 2) in the asymptotic limit, although the true scaling lies somewhere in-between depending on the details of the 1. Many-body problems system.24 This argument only applies to orbitals that drop off sufficiently quickly with distance from the origin. The problem of interest is to solve the many-body problem expressed by the Hamiltonian175 a. The curse of dimensionality and other limitations in ! point-like basis sets X † X † † H = tijcîσcˆjσ + Vijk`cîσcˆjσ0 cˆ`σ0 cˆkσ (A1) ijσ k`σ0 We note that the reduction in Eq. (A4) happens immediately when using purely local basis sets, with no spatial with fermionic operatorsc ˆ on discretized lattice sites extent, is used. In that case, we would reduce to a sum (or basis functions) indexed by i, j, k, ` ∈ {1,...,N} over only the diagonal elements of the two-electron in- (for N basis functions) with spin σ. Note the order of tegral, Vijk`, if the orbitals were point-like. However, us- 13 ing only these localized orbitals (e.g., grid points, plane- and is related to the density in the limit where r → r0 waves, wavelets, etc.) comes at a steep price. 1 X In particular, note that wavelets are very expensive n(r) = ϕ∗(r)ρ ϕ (r) (B3) 2 i ij j for large scale quantum chemistry problems. It has been ij known for some time now that a curse of dimensionality shows that the number of functions in one dimension where ρ = hΨ|cˆ†cˆ |Ψi. A spin index has been sup- d ij i j scales as N1D for d dimensions with a number of functions pressed, signifying a spin degenerate ground state. How- 153 in one-dimension, N1D. Due to the large number of ba- ever, extensions to ground states without spin degeneracy sis functions, wavelet based functions have only been able are also available.51 to solve 2 and 3 electron systems maximum in the gen- Having replaced the wavefunction for the more com- 180 eral case, although these functions can be efficient for pact density, the Hamiltonian must be substituted for larger noninteracting or single Slater-determinant theo- another mathematical object that acts on the density. In 181 ries or other cases of very particular interest. Wavelets general, an object that maps a function to a scalar value are simply not expected to be efficient for real three- is known as a functional.182 In DFT, a functional maps dimensional systems of any meaningful size based on pre- the one-body density to a scalar energy value. existing works unless the problem is converted to a non- In order to find the ground state energy, we can use a interacting theory or a special geometry is chosen. For minimization over all densities183 more information, see the references in Ref. 152. For plane-wave functions, many thousands of functions Z E = min F [n] + n(r)v(r) dr (B4) are required to resolve the electron-electron cusp (e.g., n the behavior of the wavefunction at the nucleus in a hydrogen 1s orbital).145,149–151 We will not consider point- although it is impractical to search for the ground like basis functions further here to concentrate on the state density with this formulation. The second term in general case, although plane-waves can be useful for pe- Eq. (B4) is the external potential functional (often de- riodic systems. With respect to the density matrix (which noted as V [n]) and has a known form. Contrastingly, the is a highly important quantity in Sec. B), the full double universal functional, F [n], is defined as the search over sum will be taken (see Eq. (B3)) and not just diagonal all wavefunctions Ψ constrained to give the density,50,51 elements. F [n] = minhΨ[n]|Tˆ + Vêe|Ψ[n]i. (B5) In summary, even though the scaling of the two- Ψ→n electron operator is better for point-like basis sets, many more basis functions will be required to obtain accurate and is common to all systems since it does not depend results except in special cases. So, choosing a point-like on the external potential. Clearly, the minimization is basis function will not represent a general strategy for all not an efficient way to find the functional, but it is useful types of quantum chemical problems that we may wish as a mathematical tool. to solve. Because F [n] is unknown explicitly (its existence is proven by contradiction), it requires approximation to use.30 Some limiting cases are known, such as one- or Appendix B: Density functional theory two-electron cases, the uniform gas via a fitting procedure, and one-dimension.184–187 Many exact properties of the functional are known from rigorous mathemati- The foundations of density functional theory (DFT), cal statements188,189 or limited test cases.93,125 One com- including the Kohn-Sham system and other variants, are mon way to design new functionals is to build in exact introduced here. conditions.190–192 A compact representation of the quantum ground state Note that to solve a problem with F [n], the functional is the one-body density. In DFT, the ground state wave- derivative can be used and is defined as50,51 function is replaced with the density. It was proven in Ref. 29 that the one-body density, defined as Z δF [g(x)] F [g(x) + ηΥ(x)] − F [g(x)] Υ(x) dx ≡ lim Z Z δg(x) η→0 η 2 n(r) = ... |Ψ(r, r2,..., rNe )| dr2 ... drNe (B1) (B6) d = F [g(x) + ηΥ(x)] is sufficient to characterize the ground state. Note that dη η=0 in order to obtain this quantity on the lattice, the one- (B7) body reduced density matrix must be obtained for Ne electrons, where Υ is an arbitrary test function, g is the function Z Z we wish to evaluate F around, and η is a small param- 0 ∗ eter. The first functional derivative is most well-known ρˆ(r, r ) = ... Ψ (r, r2,..., rNe ) (B2) from classical physics where it is used to minimize the 0 193 ×Ψ(r , r2,..., rNe ) dr2 ... drNe Lagrangian via Euler-Lagrange minimization. 14

In order to ﬁnd the minimal density, a functional b. Components of the functional in the Kohn-Sham system derivative can be taken. This is synonymous with the 194 Euler-Lagrange equations in this case The form of the universal functional for the KS case is

δF [n] F [n] = T [n] + U[n] + E [n] (B11) + v(r) = µ (B8) s xc δn(r) This form is known from perturbative expansions of the many-body system.30,175 Note that the subscripted ”s” where a constant chemical potential µ was added as a on the kinetic energy is to signify that T is evaluated Lagrange multiplier for the total particle number. This s over noninteracting wavefunctions, φ, but has the same equation is then used to solve for orbital-free DFT. form as the same kinetic energy operator in Eq. (A2). This term shows that the KS scheme is not a pure density functional but one that relies the auxiliary noninteracting 1. Kohn-Sham density functional theory orbitals. The cost to solve the noninteracting system is larger than pure-DFT, but still smaller than many other approximations. One useful alternative formulation of F [n] is KS-DFT. In addition to the kinetic energy, another known energy This reformulation of DFT proposes an external poten- in Eq. (B11) is the Hartree energy, tial whose solution resulting one-body density is equivalent to obtaining the one-body density of the fully inter- 1 ZZ n(r)n(r0) acting system. The original goal of DFT was to propose U[n] = dr dr0 (B12) 2 |r − r0| a purely wavefunction-free method to characterize the quantum ground state, but it is difficult to find suitable which is fully nonlocal. approximations that are accurate enough. The unknown term in Eq. (B11) is the exchange- It can be noted that approximating F [n] is a large ap- correlation energy, Exc, which is not known as a density proximation on the total energy. In this alternative for- functional and requires approximation in practice. If the mulation, one introduces an easy to solve, noninteracting, exact Exc is used, then the theory is exact. The useful- auxiliary system to make the required approximation a ness of defining the KS system is that the approximation smaller fraction of the overall energy. Obtaining the KS to the total energy is small for many systems of practical potential gives insight to many more physical quantities interest. than just the density, and the orbitals of the noninteracting system can be used in a variety of other contexts. c. Kohn-Sham potential by functional derivatives

The KS potential is explicitly a. Finding the Kohn-Sham potential δ Z vs(r) = U[n] + Exc[n] + n(r)v(r) dr (B13) To formalize the KS system, what is known as the δn adiabatic connection can be used to transform from the = vH[n](r) + vxc[n](r) + v(r) original problem to the final noninteracting problem.195 Equation (A1) can be rewritten as196 where a functional derivative is taken over the relevant energy terms, for example, Hλ = Tˆ + λVˆ + Vˆ (λ) (B9) δU[n] Z n(r0) ee v [n](r) = = dr0 (B14) H δn |r − r0| where an express dependence on the coupling constant λ has been introduced. The tuning parameter λ can vary for the Hartree potential, and the form of vxc[n](r) is not (λ=0) between the KS system (λ = 0, where Vˆ = Vˆs) known explicitly. In summary, by re-grouping the non- and the original system (λ = 1). Note that the external kinetic energy terms in the Hamiltonian, U[n] + V [n] + potential operator (Vˆ = v(r)) has received an implicit Exc[n], the resulting system will appear as noninteract- coupling constant dependence, but there is no simple an- ing. The electron-electron term is contained in the result- alytic form for Vˆ (λ). The constraint given in this problem ing potential of the noninteracting system. is that the density must be the same for any λ, d. Variational principle for the Kohn-Sham potential n(r) ≡ n(λ=1)(r) =! n(λ=0)(r) (B10) The Kohn-Sham potential also satisfies the minimiza- which is difficult to construct in practice. Note that the tion of the quantity73 other limit of λ → ∞ can also be used to base a functional 197 theory. TΨ[vs] = hΨ[v]|Tˆ + Vˆs|Ψ[v]i − hΦ[vs]|Tˆ + Vˆs|Φ[vs]i (B15) 15 where we follow Refs. 73–75 closely. Note that for Hartree energy U, exchange correlation energy ˆ ˆ Ψ(r, r2,..., rNe ) is not an eigenstate of T + Vs but that Exc, and exchange-correlation potential vxc, Note that Φ(r) is. So, Eq. (B20) shows it is not sufficient to have only the KS potential to find E, although perturbation theory on the ˆ ˆ ˆ ˆ 199 hΨ[v]|T + Vs|Ψ[v]i > hΦ[vs]|T + Vs|Φ[vs]i (B16) density can be used. by the variational principle.198 The functional derivative 73 of Eq. (B15) with respect to vs(r) is 2. Potential functional theory

δTΨ[vs] = nΨ(r) − nΦ(r) (B17) When examining Eq. (B4), it is natural to ask if a δvs dual theory can be formulated based on v(r) instead of and equals the difference in the densities of the two sys- n(r) since both are one-body quantities. This question tems, one computed from Ψ (nΨ) and the other density stems from noticing that functional derivatives of n(r) 73 from Φ (nΦ). When this difference is zero, the condition yield equations that can be solved for the density, result- for the Kohn-Sham potential is found given in Eq. (B10). ing in the Euler-Lagrange minimization for the density functional from Eq. (B8). To the question: can we instead take a functional e. v-representability derivative with respect to v(r) instead? The answer is yes. It was proven in Ref. 66 that the dependence on the The KS scheme is exactly defined provided that v- functional in terms of the external potential was sufficient representability is satisfied. In common practice, this is to describe the ground state. In this theory, the density not a concern since it was proven on a grid that the sys- must be determined from v(r) directly as n(r) → n[v](r). tem must be v-representable since the kinetic energy is The resulting energy becomes regularized.70,71 So, we always expect v-representability here. Z E = min F [v] + n[v](r)v(r) dr (B21) n[v]

f. Minimization of the Kohn-Sham functional where

F [v] = min hΨ[v]|Tˆ + Vêe|Ψ[v]i (B22) Note that the Euler-Lagrange minimization of the Ψ→n[v] functional yields the KS equations which is similar to Eq. (B5). ∇2 − + v (r) φ (r) = φ (r) (B18) 2 s j j j a. Why the functional derivative is also necessary for some KS energy eigenvalues j and KS orbitals φj(r). The density is then the sum over occupied orbitals equivalent to Very importantly, one cannot determine the entire character of the ground state (i.e., find n(r) or equiv- X 2 n(r) = |φj(r)| (B19) alent) with only E[v]. To see this, note that the Euler- 112 j∈occ. Lagrange equation for potential functional theory is which can be found from Eq. (B3) by noting that the δF [v] Z δn[v](r) + v(r0) dr0 = 0 (B23) excitations are orthogonal. One recovers Eq. (B3) with δv δv(r0) an additional index for the excitations when φ is decom- posed into a chosen basis. where a derivative of the density with respect to the external potential in the second term of the left-hand side must be determined to solve this equation and find the g. Relationship between the energies of the Kohn-Sham and density. In summary, one can formulate potential func- the fully interacting system tionals, E[v], provided that n[v](r) is known.86,111 In other words, the functional derivative is also necessary The adiabatic connection from Sec. B 1 a does not con- to perform self-consistent calculations if only the energy serve energy. The relation between the ground state en- is known for a given potential. ergy of the interacting system, E, and the energy of the KS system (the sum of eigenvalues of the noninteracting P 50 system, j∈occ. j) is 3. Time-dependent density functional theory X Z E = j − U[n] + Exc[n] − n(r)vxc(r)dr (B20) In a time-dependent DFT (TD-DFT), one may simply j∈occ. propagate the KS potential according to Schrödinger’s 16 equation for time evolution57,194 and

2 τ n o ∂ ∇ F [n] ≡ min T [Γ]ˆ + Vee[Γ]ˆ − τS[Γ]ˆ (B32) i φ (r, t) = − + v [n, Φ ](r, t) φ (r, t) (B24) ˆ ∂t j 2 s 0 j Γ→n In summary, if a system is solved at a given temper- with an initial starting state Φ . A formal justification 0 ature, one can solve for the KS potential analogously to for the existence of TD-DFT is available.97,98 the ground state with an extra term −τSˆ (and a term Computing response functions is also necessary if a for the particle number) representing the entropy in the perturbation to v (r) is applied. Knowing just the KS s functional. Note that the ground state density is replaced orbitals is sufficient to determine the response function by the Γˆ object which is akin to a density matrix and that for the KS system,57 extra weights must be solved. In order to find this, several ∞ ∗ ∗ 0 0 states ψNe,i must be used. 0 X φk(r)φj(r)φj (r )φk(r ) χs(r, r , ω) = lim (ξk − ξj) η→0+ ω − (j − k) + iη k,j=1 (B25) 5. Comment on density functional approximations with occupation numbers ξj, eigenvalues j, frequency ω, There are many functionals that can be used to approx- and small parameter η. Hence, knowing all eigenvalues of imate Exc. Each performs with its own set of systematic the vs(r) at t = 0 gives the KS response function. One deficiencies. can also relate χs to the interacting response function χ The most pertinent approximations for this paper via a kernel (ng.s. is the ground state density) are the ML functionals.31,33–38,41–45 The general strategy of fitting a functional may be unpalatable,83 but 0 0 δvxc[n](r, t) fxc[n](r, t, r , t ) = (B26) the generic strategy of fitting exact data is not unique δn(r0, t0) n=ng.s. to ML functionals. The simplest approximation to the and the relation functional–known as the local density approximation–is a fit of highly accurate quantum Monte Carlo data.201,202 0 −1 0 −1 0 0 fxc(r, r , ω) = χs (r, r , ω) − χ (r, r , ω) − vee(r − r ) Further, coefficients present in hybrid functionals are (B27) also fit to existing data,203 among other examples. The which is similar to a Dyson’s equation. Many cases of ML functionals simply represent a more robust approx- interest obtain sufficiently accurate answers with only the imation that can interpolate well provided the system adiabatic approximation, however. TD-DFT can be used solved is close to the training manifold. This strat- to find excited states.99 egy would capture the exact conditions of the exact functional.33,49–51,204

4. Density functional theory at finite temperature Appendix C: Training a machine learning model In order to incorporate finite temperature effects into with stochastic gradient descent the density functional, an entropy term can be added following the original treatment by Mermin,100 one can Machine learning methods rely on the minimization write the grand canonical free energy as of some cost function. Here we describe the most basic version of this, the stochastic gradient descent (SGD). If Ωˆ = H − τSˆ − µNˆ (B28) a function must be minimized, some procedure for the minimization is necessary. We can define a cost function, for temperature τ, chemical potential µ, number operator g, that could take the form Nˆ, and entropy operator X 2 g = n ˆw(x(ι)) − n(ι) (C1) Sˆ = −kB ln Γˆ (B29) ι,x where200 for some observable n with known values of n(ι) (called a X training set) indexed by ι with some coefficients for the Γˆ = pN ,i|ψN ,iihψN ,i| (B30) e e e weight ˆw and bias b in the form Ne,i with P p = 1, 0 ≤ p ≤ 1, and states ψ x(i+1) = S ˆw(i) · x(i) + b(i) (C2) {Ne,i} Ne,i Ne,i Ne,i indexing excitations over a particular number of particles for a level i of the neural network with a nonlinear func- Ne. The minimum of Ωˆ is (and adding descriptive indices) tion S with a given input x(ι). The final level of the neural network will be the final quantity of interest, n ˆw(x(ι)). Z ˆ τ τ In order to minimize g and therefore construct the best Ωv,µ = min F [n] + n(r)(v(r) − µ)dr (B31) n approximation to the known values, a gradient descent 17 can be performed. The basic idea is to ensure that any the two,216,217 although this may require that the wave- evolution of the coefficients ˆw occur along the steepest function is re-solved to implement on the quantum com- negative gradient in the system. In order to ensure that puter. We were not able to find any evidence that random the gradient is negative, we can start from a consequence walks can reliably compete with gradient-based training of Gauss’s law which states that a gradient of a scalar methods; however, if one could use such an algorithm, a (here, g) along the direction of changing ˆw (denoted δ ˆw) quadratic speedup is available on the quantum computer 162 is equivalent to the Laplacian of g, ∆g = ∇ ˆwg·δ ˆw where for training with the random walk.

∇ ˆwg = (∂w1 g, ∂w2 g,...). We also note that kernel based methods can be used to If the form of the update from iteration over a time train the neural network but that they are not as “choice- δt (which can also be expressed in the discrete case as free” since a functional form of the kernel must be se- earlier) is chosen as lected. The minimization of the coefficients with a kernel does not require gradients.48 ˆwt+δt = ˆwt − η∇ ˆwg (C3) for some small parameter η (called a learning rate), then Appendix D: Quantum algorithms then applying ∇ ˆwg to each side with a dot product gives 2 ∆g = −η |∇ ˆwg| < 0. In other words, the Laplacian of g is guaranteed to be negative for a small enough η such On a quantum computer, both the method of manip- that g is linear on a small neighborhood. The argument ulating and storing information is different from a clas- was shown for the weights of the neural network, but the sical computer. On a classical computer, electrons are same argument applies to the biases. moved around and represent different information based Evaluating Eq. (C1) for all ι provided can be very on where they are placed. costly since the resulting gradient is very noisy. To In a quantum computer, the quantity that we manipulate is the spin of an electron or some other quantity that speedup the gradient descent, a randomly sampled subset 1 of the provided training data, called a mini-batch. This is allowed to exist in a superposition of states. We will can be one or more selected points. Since the points are refer to qubits in this work but note that one can extend randomly selected at each step, the gradient descent is the ideas to qudits where more than two states are pos- taken stochastically. sible. It is not possible to determine all coefficients of a Even though the cost function evaluated over the mini- wavefunction in a superposition without an exponential batch must be negative, the entire cost function evaluated number of measurements to find them (i.e., measuring over all training points does not need to decrease. How- the spin for the both |0i and |1i states for each qubit in- ever, on average, the observable’s value is lowered over dividually). To manipulate the state of a quantum wave- several SGD steps. The condition is h∂ gi < 0 for a mini- function, we can apply operators, specifically operators t that are unitary. Often, the operators are applied to a batch in the general case but would be ∂tg < 0 if all points are used. specified qubit and also another auxiliary qubit to keep There are two steps required when training a ML required operations unitary. model with SGD: forward and backward propagation. Sequences of these unitary operators can be cast as a In forward propagation, Eq. (C2) is applied straightfor- tensor network diagram. Each line of the diagram repre- wardly from the input to the output layers of the neu- sents a single qubit or a group of qubits called a register. ral network. The backward propagation requires that a Each block is a unitary operation that manipulates the derivative of Eq. (C2) from the output to input layers. state of one or more qubits. Note that the allowed opera- Repeating this constructs the cost function and then ap- tions on the quantum computer maintain the number of plies the gradient to all weights and biases in the network lines and do not involve truncations of the space at any until the model is more converged. point. This is qualitatively different from other uses of More advanced algorithms can also be used to converge tensor network diagrams used for “tensor network meth- gradient descents as well.205 ods” which can refer to a class of algorithms that solve for the ground state of a quantum system on classical computer.91 So, there is a subtle distinction between the 1. Gradient-free methods tensor network methods and the tensor network diagrams we draw here even though the general properties of the diagrams is the same in both. The main difference here is This section has been focused on training ML models that no form of truncation or a connection with a renor- with gradients. Another class of method, one that uses malization group is taken explicitly. random walks,206–215 is also available. However, these One example of a useful gate that will appear in several methods of training that involve a random walk typi- places is the Hadamard gate, cally only perform well on small numbers of parameters. In fact, these methods perform better than gradients 1 1 1 in some cases for these small systems. If the problem H = √ (D1) 2 1 −1 is too large, then a gradient-based method is generally better.211 There may also be opportunities to combine which is written in the {|0i, |1i} qubit basis states. Ap- 18 plying this gate on a |0i state will give an equal superposition over the |0i and |1i states, √ H|0i = (|0i + |1i) / 2 (D2) which can be applied identically to more qubits as denoted by the ⊗ operator. We present some common algorithms in the context of 218 FIG. 2. Circuit diagram of the quantum Fourier transform. solving quantum chemistry problems. The quantum Note that the rotation operator is about the z axis, and the gradient algorithm (QGA) is used to find derivatives of Hadamard is about the x axis. So, they cannot commute nor a function, generically. Meanwhile, the quantum phase can they be placed in a different order. estimation (QPE) determines the phase of a given state. An important sub-algorithm in the QGA and QPE is the quantum Fourier transform (QFT) which is detailed find1 first.1 Real time evolution (RTE) in addition to quantum N−1 1 amplitude estimation (QAE) are also discussed. 1 O X |xi = exp iπxy /2` |y i (D6) Note that everywhere we use the symbol N for the 2N/2 ` ` number of qubits in this section. Many times in the liter- `=0 y`=0 ature, the number of qubits may also implicitly mean N where the sums in Eq. (D3) are rewritten for the binary registers with r qubits each for r digits of precision. If this representation as is the case, the same concepts would apply, regardless. 2N −1 N−1 1 1 1 1 X O X X X X ≡ = ... (D7)

1. Quantum Fourier transform y=0 `=0 y`=0 y0=0 y1=0 yN−1=0

The QFT begins with a set of input data recorded on for simplicity. Noticing that each term is factorizable, this an initial set of registers, which we denote as |yi. The becomes end result of the QFT is to change the data |yi into |xi N−1 ( ) 219 1 O according to the discrete Fourier transform |xi = |0i + exp iπx/2` |1i (D8) 2N/2 `=0 2N −1 1 X |xi = exp(i2πxy/2N )|yi (D3) where for a given `, the division by a power of 2 in the 2N/2 y=0 argument of the exponential will reveal one digit of the binary representation of the integer x.1 for N qubits with a normalization factor 2−N/2 com- The operators necessary to perform a QFT on a quan- ing from the prefactor in Eq. (D1). The key first step tum computer can be identified in Eq. (D8). Note that is to understand how Eq. (D3) can be re-expressed in applying the Hadamard gate on a qubit written in an binary instead of integers integer y. A number y can arbitrary binary form is be re-expressed in base 2 with the digits (assuming √ value either 0 or 1, corresponding to the states of a H|xki = (|0i + exp(iπxk)|1i) / 2 (D9) qubit) y0y1 . . . yN−1 where N is the maximum number of (qu)bits we allow for the binary number. In full, y as can be verified by noting that xk is either 0 or 1. So, relates to the binary digits as a Hadamard will rotate each bit of x into a basis that is only different from an individual term in Eq. (D8) by N−1 N−2 a phase on the |1i state. If we apply the phase rotation y = 2 y0 + 2 y1 + ... + 2yN−2 + yN−1 (D4) gate of the form A similar form can be written for using a qudit where a 1 0 different basis is used (e.g., trinary). R` = iπ/2` (D10) To express the binary representation of |yi as qubits, 0 e let us express the state as after the Hadamard, then this will constitute the most N−1 basic operation in the QFT. That is, applying H and O then R a certain number of times will generate terms |yi = |y0 . . . yN−1i = |y0i⊗...⊗|yN−1i ≡ |yì (D5) ` `=0 required in Eq. (D8). More than one R` gate may need to be applied depending on which bit of x we are acting on. with each of the sub-indices representing another digit of The structure of gates is shown in Fig. 2. The next qubit y’s binary representation (y` ∈ {0, 1}). has all but the first register’s rotation matrix applied. We We can substitute y for its binary representation as then continue to the next qubit, applying one less set of using the previously defined expressions in Eq. (D4) to gates, and continue until we apply only the Hadamard on 19 the last qubit. If a swap operation is applied, the qubits transform is applied on the last step, we obtain |ϕi and will appear in order and this completes the QFT. have then represented the phase on a set of qubits. Note that the overall cost of this algorithm scales In order to construct the unitary operator U, we sim- as O(N 2) but that a cheaper O(N log N) is also ply must obtain the representation of the exponentiated available.220 Note also that since we cannot efficiently Hamiltonian as exp(−iHt).11 The energy of the wave- obtain all coefficients from the superposition, the algo- function (ϕ related to E) is then related to the time ap- rithm does not provide a useful speedup over the classi- plied and may involve other pre-determined constants.11 cal algorithm which scales exponentially, not polynomi- In order to apply the exponentiated Hamiltonian, one ally. However, the time to measure all elements is expo- option is to use the Trotter-Suzuki decomposition221 to nentially long, so this quantum advantage is not truly decompose the exponential into a product of exponentials advantageous. Still, the QFT can be useful as a tool in with fewer terms.24 other subroutines. By expanding the number of qubits used in the auxiliary register, we can increase the accuracy of the final result. We will defer detailed analysis of this point to 2. Quantum phase estimation Ref. 1, but it is worth noting that some error in this algorithm can be reduced with more resources. The total Given an input state, |ψi, we want to determine the error can be reduced arbitrarily to 1-η for some small associated eigenvalue of the form exp(i2πϕ) for some real number η. ϕ denoting a phase. For this algorithm, we must first have Note that improvements can be applied to the algo- the initial state, |ψi, and a number of auxiliary qubits at rithm to generate more methods of QPE.222 A recent least equal to the number of digits that the phase must improvement known as qubitization can bring down the be accurate to. gate-count for the determining the phase. We will defer The strategy will be to generate, from an initial wave- to the discussion in Ref. 65. function ψ, the binary digits of the phase of ϕ–as defined in Appendix. D 1–and then perform an inverse QFT to obtain the phase on an auxiliary register. 3. Quantum gradient algorithm Let a gate U[= R1 from Eq. (D10)] be one such that when applied to ψ, and controlled on one of the auxiliary Given an oracle for a function f, its gradients can be qubits, it produces a phase that is the jth value of the computed in one query of the oracle instead of m + 1 binary representation, classically for m grid points.81 2j 1 i2π2t−j ϕ We start with three multi-qubit registers. One is used U H|0ij|ψi = √ |0i + e |1i |ψi (D11) 2 j for the computation of f. The other contains an equal superposition over all states (representing the infinitesimal th where |0ij is the j qubit and t is the total number of directions that f can be shifted for an eventual gradient). qubits that this operator will be applied to. The last receives a QFT (on initial register set to 1 while The gate U is applied as in Eq. (D11) to a register of the others are set to zero). This final register will be used auxiliary qubits and to ψ as in Fig. 3. The resulting state for a phase kickback. The purpose of the phase kickback is then is to modify the phases of the QFT. When the inverse

t 2t−1 QFT is applied, we obtain the gradient similar to how 1 O t−j 1 X |0i + e2πi2 ϕ|1i = ei2πϕy|yi the phase was obtained for the QPE. 2t/2 2t/2 The steps of this algorithm are shown in Fig. 4. The j=1 y=0 (D12) Hadamard gate produces where the same conversion to and from a binary represen- N 1 tation in Appendix. D 1 is used. Once an inverse Fourier ⊗N ⊗N 1 O X 1 X H |0i = √ |δ`i ≡ √ |δi (D13) 2N 2N `=1 δ`=0 δ

where N is the number of qubits contained in the second register. When calling the oracle query as controlled on the equal superposition in Eq. (D13), f is evaluated on all arguments δ` giving f(c+δ), having perturbed an initial coordinate by δ or some similar function.82 FIG. 3. Circuit diagram of the quantum phase estimation On the third register in Fig. 4, a QFT has been applied algorithm The input is an auxiliary register and the initial on an initial value of 1, giving an output of wavefunction. The output is the phase corresponding to the eigenvalue and the original waevfunction. Essentially, one ob- 1 X Np ei2πw/2 |wi (D14) tains the associated energy for an input wavefunction. 2Np/2 w 20 which is similar to Eq. (D3). The number of qubits in Recall that, by inspection from Eq. (D3), applying the this register, Np, are enough to allow a pending bitwise inverse Fourier transform will give addition to be carried out properly. 2N The state of the full quantum wavefunction is then O (∇xk f) (D20) M c k 1 X X Np |ψi = √ ei2πw/2 |wi|f(c + δ˜)i|δi (D15) N Np 2 2 δ w So, given a continuous function f, we obtain a gradient ∇f. The representation on the qubits for both f and ∇f where is in the binary representation of their continuous values. Improvements to this algorithm have been noted.82 δ˜ = L(δ − N/2)/2N (D16) and N is a vector of (2N , 2N ,...) and provides the off- a. Functional derivatives with quantum gradient algorithms set factor to convert integers to real numbers. M and L are assigned meaning in the following. The factor M The QGA can be used to evaluate the functional scales the maximum amount of ∇f to keep this quantity derivative, Eq. (B6). The test field Υ(x) can be pro- expressed as an integer. The parameter L is a neighbor- vided by hand or by Hadamard transformation with each hood over which the derivative is accurate to first order, resulting state giving the same result. Evaluating the for example in one-dimension derivative as before on η produces the functional derivative. This could be applied to a variety of functionals, f(x + L/2) − f(x − L/2) such as the density functional or the partition function ∂xf ≈ (D17) L (although this last object would be very difficult to compute before taking the functional derivative due again to is the decomposition in one-dimension. the curse of dimensionality).223 There are other ways to The next step is to add (bitwise addition denoted by get the functional derivative, such as with a chain rule if ⊕) the first register to the third under the addition w → an alternative form is required. w ⊕ (2N 2Np f)/(ML) mod 2Np so that the register and phase are shifted as

N Np 4. Real time evolution 1 X X i2π w+ 2 2 f(c+δ˜) /2Np |ψi = √ e ML 2N 2Np δ w While it is possible in theory to implement a more × |wi|f(c + δ˜)i|δi (D18) advanced classical algorithm on the quantum computer, there may be sizable overhead. It is generally accepted where this trick is often called a phase kickback. In a in the literature to use the real time evolution (RTE) small neighborhood, L, around the central point c of the method which we choose to introduce here.224 oracle query the vectors δ can be thought of as pertur- The initial state of qubits can be initialized into a state bations on this point. Expanding the function according consisting of a single-particle Hamiltonian’s, H0, eigen- to a Taylor expansion gives state. This could be the Hartree-Fock solution for some 11 number of electrons, Ne. The Hamiltonian is then given N X i2π 2 (f(c)+ L (δ− N )·∇f) a time-dependence such that t = 0 is H0 and t = tmax. is e ML 2N 2 |f(c + δ˜)i|δi (D19) the full Hamiltonian, δ N i2π 2 f(c) i 2π N ·∇f X i 2π δ·∇f = e ML e M 2 e M |f(c + δ˜)i|δi H(t) = H0 + λ(t)H1 + C (D21) δ for some time-dependent function λ(t) and interaction term H1. By tuning the time parameter slowly enough, the new ground state can be found. A constant C is added in anticipation of the QPE and is simply taken into account when converting the output phase to the energy.224 The final state of the RTE must be a close approximation to the true ground state for QPE to work properly.1,225 In order to evolve the initial state to the ground state, the error in the Trotter step must be no more than the allowed accuracy for a computation.23 For the case of molecular systems, this is 1 mHa (although FIG. 4. Circuit diagram of the quantum gradient algorithm this can be even lower for some applications). A vari- One could write the function f with a control to a fourth able number of steps is required to fully evolve the initial register with input c for the point that f is evaluated on. wavefunction to the ground state, but this may be on the 21 order of a number of thousands and the entire process can 3. The newly found energy is compared to the saved take months or much, much longer.24 energy of the original wavefunction There are also other algorithms that could be used,138,155–159 but these may have a large overhead. 4. The difference is represented as a single bit (called Just as with phase estimation, we present the most a pointer qubit) widely known algorithm here for ease of presentation. The RWMP method of the main text can interchange 5. (Accept) The pointer qubit is measured. The al- subroutines for the best algorithm gorithm then branches: if the measurement of the single pointer qubit gives a success, implying the energies match and the original Ψ was recovered, 5. Quantum amplitude estimation (quantum we repeat the steps starting from step 1 here after counting) returning the state to the original configuration. A separate counter is incremented by one each time A way to obtain useful quantities from a wavefunction this step is reached. without measuring it is to use QAE226 as described by Ref. 5 (although we follow the state-preserving quantum 6. (Reject) If the pointer measurement results in a counting algorithm used in Ref. 58 and also note Refs. 59 failure, then the wavefunction found is not the orig- and 227). One can envision the application of an operator inal. We must recover the original wavefunction by ˆ † O onto the wavefunction (e.g.,c î cˆj onto |Ψi) as being undoing the QPE, undoing the operator applied represented in a superposition of the original function Ψ (Oˆ†), applying the operator again, and again find- and all other states that are perpendicular, Ψ⊥, as ing the energy difference. We return to step 1. More details and diagrams can be found in Refs. 5, 58, ˆ ⊥ ⊥ O|Ψi = α0|Ψi + α |Ψ i. (D22) and 228. Intuitively, we are counting the number of times the starting wavefunction is recovered when applying an We will simply assume some process exists to apply the operator. The ratio of accepted counts to the total num- operator of interest is available. The goal is to estimate ber of times the operator is applied is related to the co- α which is the expectation value. A series of steps is 0 efficient on the ground state, α0. While the description required to find this coefficient. The fraction of times that here involves a measurement and therefore gives classical the following is successful will give α0. Before performing data, the algorithm can be used as an oracle query for any of the subsequent steps, we assume that the energy of the QGA (as stated in Ref. 58 at the cost of additional Ψ has been obtained via phase estimation beforehand on auxiliary qubits). a separate register. In total, four registers are required: In order to see how the algorithm will converge when one for Ψ, one for the saved ground state energy, one for the reject step is activated, we can analogize with the the check energy, and one for the pointer qubit. half-life of radioactive isotopes to envision when the re- One iteration of the full algorithm is the following: jection procedure must eventually recover the correct 1. An operator is applied to Ψ ground state. A detailed analysis of the convergence of the rejection step shows the number of steps required to 2. The energy of the resulting state is determined; the find the original Ψ is related to 1/ε for some probability energy is in a superposition over all states of failure, ε and is found in Refs. 5 and 58.

‡ Néle 1er décembre 1976, décédéle 25 juin 2020. 6 Peter W Shor, “Polynomial-time algorithms for prime fac- 1 Michael A Nielsen and Isaac L Chuang, Quantum Compu- torization and discrete logarithms on a quantum com- tation and Quantum Information (Cambridge University puter,” SIAM review 41, 303–332 (1999). Press, 2010). 7 Lov K Grover, “From Schrödinger’sequation to the quan- 2 David Deutsch and Richard Jozsa, “Rapid solution of tum search algorithm,” Pramana 56, 333–348 (2001). problems by quantum computation,” Proceedings of the 8 AlánAspuru-Guzik, Anthony D Dutoi, Peter J Love, and Royal Society of London. Series A: Mathematical and Martin Head-Gordon, “Simulated quantum computation Physical Sciences 439, 553–558 (1992). of molecular energies,” Science 309, 1704–1707 (2005). 3 Peter W Shor, “Proceedings of the 35th annual sympo- 9 Katherine L Brown, William J Munro, and Vivien M sium on foundations of computer science,” IEE Computer Kendon, “Using quantum computers for quantum simu- society press, Santa Fe, NM (1994). lation,” Entropy 12, 2268–2307 (2010). 4 Lov K Grover, “Quantum mechanics helps in searching for 10 Benjamin P Lanyon, James D Whitfield, Geoff G Gillett, a needle in a haystack,” Phys. Rev. Lett. 79, 325 (1997). Michael E Goggin, Marcelo P Almeida, Ivan Kassal, Ja- 5 Gilles Brassard, Peter Hoyer, Michele Mosca, and Alain cob D Biamonte, Masoud Mohseni, Ben J Powell, Marco Tapp, “Quantum amplitude amplification and estima- Barbieri, et al., “Towards quantum chemistry on a quan- tion,” Contemporary Mathematics 305, 53–74 (2002). tum computer,” Nature Chemistry 2, 106–111 (2010). 22

11 James D Whitfield, Jacob Biamonte, and AlánAspuru- mechanics,” Foundations of Physics 1, 23–33 (1970). Guzik, “Simulation of electronic structure hamiltonians 27 Scott Aaronson, “Shadow tomography of quantum using quantum computers,” Mol. Phys. 109, 735–750 states,” in Proceedings of the 50th Annual ACM SIGACT (2011). Symposium on Theory of Computing (2018) pp. 325–338. 12 Yudong Cao, Jonathan Romero, Jonathan P Olson, 28 Giuseppe Carleo, Ignacio Cirac, Kyle Cranmer, Lau- Matthias Degroote, Peter D Johnson, MáriaKieferová, rent Daudet, Maria Schuld, Naftali Tishby, Leslie Vogt- Ian D Kivlichan, Tim Menke, Borja Peropadre, Nico- Maranto, and Lenka Zdeborová,“Machine learning and las PD Sawaya, et al., “Quantum chemistry in the age the physical sciences,” Rev. Mod. Phys. 91, 045002 of quantum computing,” Chemical reviews 119, 10856– (2019). 10915 (2019). 29 Pierre Hohenberg and Walter Kohn, “Inhomogeneous 13 Sam McArdle, Suguru Endo, Alan Aspuru-Guzik, Si- electron gas,” Physical Review 136, B864 (1964). mon C Benjamin, and Xiao Yuan, “Quantum computa- 30 Walter Kohn and Lu Jeu Sham, “Self-consistent equations tional chemistry,” Reviews of Modern Physics 92, 015003 including exchange and correlation effects,” Physical Re- (2020). view 140, A1133 (1965). 14 Kisuk Kang, Ying Shirley Meng, Julien Bréger,Clare P 31 John C. Snyder, Matthias Rupp, Katja Hansen, Klaus- Grey, and Gerbrand Ceder, “Electrodes with high power Robert Müller,and Kieron Burke, “Finding density func- and high capacity for rechargeable lithium batteries,” Sci- tionals with machine learning,” Phys. Rev. Lett. 108, ence 311, 977–980 (2006). 253002 (2012). 15 Michael G Mavros, Takashi Tsuchimochi, Tim Kowal- 32 JörgBehler and Michele Parrinello, “Generalized neural- czyk, Alexandra McIsaac, Lee-Ping Wang, and Troy Van network representation of high-dimensional potential- Voorhis, “What can density functional theory tell us energy surfaces,” Physical review letters 98, 146401 about artificial catalytic water splitting?” Inorganic chem- (2007). istry 53, 6386–6397 (2014). 33 Li Li, Thomas E. Baker, Steven R. White, and Kieron 16 P Cudazzo, Gianni Profeta, Antonio Sanna, A Floris, Burke, “Pure density functional for strong correlation and A Continenza, S Massidda, and EKU Gross, “Ab ini- the thermodynamic limit from machine learning,” Phys. tio description of high-temperature superconductivity in Rev. B 94, 245129 (2016). dense molecular hydrogen,” Phys. Rev. Lett. 100, 257001 34 Felix Brockherde, Leslie Vogt, Li Li, Mark E Tuckerman, (2008). Kieron Burke, and Klaus-Robert Müller,“Bypassing the 17 Thomas Holm Rod, Ashildur Logadottir, and Jens Kehlet Kohn-Sham equations with machine learning,” Nature Nørskov, “Ammonia synthesis at low temperatures,” Communications 8, 872 (2017). J. Chem. Phys. 112, 5343–5347 (2000). 35 Mihail Bogojeski, Leslie Vogt-Maranto, Mark E Tucker- 18 Jens Kehlet Nørskov, Matthias Scheffler, and Hervé man, Klaus-Robert Mueller, and Kieron Burke, “Den- Toulhoat, “Density functional theory in surface science sity functionals with quantum chemical accuracy: From and heterogeneous catalysis,” MRS Bulletin 31, 669–674 machine learning to molecular dynamics,” Preprint at (2006). ChemRxiv https://doi. org/10.26434/chemrxiv 8079917, 19 JoséA Flores-Livas, Antonio Sanna, and EKU Gross, v1 (2019). “High temperature superconductivity in sulfur and sele- 36 Mihail Bogojeski, Felix Brockherde, Leslie Vogt-Maranto, nium hydrides at high pressure,” The European Physical Li Li, Mark E Tuckerman, Kieron Burke, and Journal B 89, 63 (2016). Klaus-Robert Müller,“Efficient prediction of 3D elec- 20 Patrik Rydberg, Flemming Steen Jørgensen, and tron densities using machine learning,” arXiv preprint Lars Olsen, “Use of density functional theory in drug arXiv:1811.06255 (2018). metabolism studies,” Expert opinion on drug metabolism 37 Ryo Nagai, Ryosuke Akashi, Shu Sasaki, and & toxicology 10, 215–227 (2014). Shinji Tsuneyuki, “Neural-network kohn-sham exchange- 21 N.W. Ashcroft and N.D. Mermin, Solid State Physics correlation potential and its out-of-training transferabil- (Saunders College, Philadelphia, 1976). ity,” J. Chem. Phys. 148, 241737 (2018). 22 Sabine Jansen, Mary-Beth Ruskai, and Ruedi Seiler, 38 Li Li, John C Snyder, Isabelle M Pelaschier, Jessica “Bounds for the adiabatic approximation with appli- Huang, Uma-Naresh Niranjan, Paul Duncan, Matthias cations to quantum computation,” J. Math. Phys. 48, Rupp, Klaus-Robert Müller, and Kieron Burke, “Un- 102111 (2007). derstanding machine-learned density functionals,” Inter- 23 Dave Wecker, Bela Bauer, Bryan K Clark, Matthew B national Journal of Quantum Chemistry 116, 819–833 Hastings, and Matthias Troyer, “Gate-count estimates for (2016). performing quantum chemistry on small quantum com- 39 Andrea Grisafi, Alberto Fabrizio, Benjamin Meyer, puters,” Phys. Rev. A 90, 022305 (2014). David M Wilkins, Clemence Corminboeuf, and Michele 24 David Poulin, Matthew B Hastings, Dave Wecker, Nathan Ceriotti, “Transferable machine-learning model of the Wiebe, Andrew C Doherty, and Matthias Troyer, “The electron density,” ACS central science 5, 57–64 (2018). trotter step size required for accurate quantum simula- 40 Alberto Fabrizio, Andrea Grisafi, Benjamin Meyer, tion of quantum chemistry,” Quantum Information and Michele Ceriotti, and Clemence Corminboeuf, “Electron Computation 15, 0361–0384 (2015). density learning of non-covalent systems,” Chemical sci- 25 Jessica Lemieux, Guillaume Duclos-Cianci, David ence 10, 9424–9432 (2019). Sénéchal, and David Poulin, “Resource estimate for 41 M Michael Denner, Mark H Fischer, and Titus Neupert, quantum many-body ground state preparation on a “Active learning a one-dimensional density functional the- quantum computer,” arXiv preprint arXiv:2006.04650 ory,” arXiv preprint arXiv:2005.03014 (2020). (2020). 42 Ryo Nagai, Ryosuke Akashi, and Osamu Sugino, “Com- 26 James L Park, “The concept of transition in quantum pleting density functional theory by machine learning hid- 23

den messages from molecules,” npj Computational Mate- 61 Ewin Tang, “A quantum-inspired classical algorithm for rials 6, 1–8 (2020). recommendation systems,” in Proceedings of the 51st An- 43 Sergei Manzhos, “Machine learning for the solution of nual ACM SIGACT Symposium on Theory of Computing the schrödingerequation,” Machine Learning: Science and (2019) pp. 217–228. Technology 1, 013002 (2020). 62 Ewin Tang, “Quantum-inspired classical algorithms for 44 Yasumitsu Suzuki, Ryo Nagai, and Jun Haruyama, “Ma- principal component analysis and supervised clustering,” chine learning exchange-correlation potential in time- arXiv preprint arXiv:1811.00414 (2018). dependent density functional theory,” arXiv preprint 63 AndrásGilyén,Seth Lloyd, and Ewin Tang, “Quantum- arXiv:2002.06542 (2020). inspired low-rank stochastic regression with logarith- 45 Jack Wetherell, Andrea Costamagna, Matteo Gatti, and mic dependence on the dimension,” arXiv preprint Lucia Reining, “Insights into one-body density matrices arXiv:1811.04909 (2018). using deep learning,” Faraday Discussions (2020). 64 Nai-Hui Chia, AndrásGilyén,Tongyang Li, Han-Hsuan 46 John C. Snyder, Matthias Rupp, Katja Hansen, Leo Lin, Ewin Tang, and Chunhao Wang, “Sampling-based Blooston, Klaus-Robert Müller, and Kieron Burke, sublinear low-rank matrix arithmetic framework for de- “Orbital-free bond breaking via machine learning,” J. quantizing quantum machine learning,” in Proceedings of Chem. Phys. 139, 224104 (2013). the 52nd Annual ACM SIGACT Symposium on Theory of 47 John C. Snyder, Matthias Rupp, Klaus-Robert Müller, Computing (2020) pp. 387–400. and Kieron Burke, “Nonlinear gradient denoising: Finding 65 Guang Hao Low and Isaac L Chuang, “Hamiltonian sim- accurate extrema from inaccurate functional derivatives,” ulation by qubitization,” Quantum 3, 163 (2019). Int. J. Quant. Chem. 115, 1102–1114 (2015). 66 Weitao Yang, Paul W Ayers, and Qin Wu, “Potential 48 Kevin Vu, John C Snyder, Li Li, Matthias Rupp, Bran- functionals: dual to density functionals and solution to the don F Chen, Tarek Khelif, Klaus-Robert Müller, and v-representability problem,” Phys. Rev. Lett. 92, 146404 Kieron Burke, “Understanding kernel ridge regression: (2004). Common behaviors from simple functions to density func- 67 John P Perdew, “What do the kohn-sham orbital energies tionals,” Int. J. Quant. Chem. 115, 1115–1128 (2015). mean? how do atoms dissociate?” in Density Functional 49 Jacob Hollingsworth, Li Li, Thomas E Baker, and Kieron Methods in Physics, NATO ASI Series (Springer, 1985) Burke, “Can exact conditions improve machine-learned pp. 265–308. density functionals?” J. Chem. Phys. 148, 241743 (2018). 68 John P Perdew and Mel Levy, “Comment on ”signifi- 50 Eberhard Engel and Reiner M Dreizler, Density functional cance of the highest occupied kohn-sham eigenvalue”,” theory: an advanced course (Springer Science & Business Phys. Rev. B 56, 16021 (1997). Media, 2011). 69 Mel Levy, “Universal variational functionals of electron 51 Eberhard KU Gross and Reiner M Dreizler, Density func- densities, first-order density matrices, and natural spin- tional theory, NATO ASI Series, Vol. 337 (Springer Sci- orbitals and solution of the v-representability problem,” ence & Business Media, 2013). Proceedings of the National Academy of Sciences 76, 52 Ryan Hatcher, Jorge A Kittl, and Christopher Bowen, 6062–6065 (1979). “A method to calculate correlation for density func- 70 Walter Kohn, “v-Representability and density functional tional theory on a quantum processor,” arXiv preprint theory,” Phys. Rev. Lett. 51, 1596 (1983). arXiv:1903.05550 (2019). 71 JT Chayes, L Chayes, and Mary Beth Ruskai, “Den- 53 James D Whitfield, MH Yung, David Gabriel Tempel, sity functional approach to quantum lattice systems,” S Boixo, and AlánAspuru-Guzik, “Computational com- J. Stat. Phys. 38, 497–518 (1985). plexity of time-dependent density functional theory,” 72 Lucas O Wagner, Thomas E Baker, EM Stoudenmire, New. J. Phys. 16, 083035 (2014). Kieron Burke, and Steven R White, “Kohn-Sham cal- 54 James Brown, Jun Yang, and James D Whitfield, culations with the exact functional,” Phys. Rev. B 90, “Solver for the electronic V-representation problem of 045109 (2014). time-dependent density functional theory,” arXiv preprint 73 N.I. Gidopoulos, “Progress at the interface of wave- arXiv:1904.10958 (2019). function and density-functional theories,” Phys. Rev. A 55 Jun Yang, James Brown, and James Daniel Whitfield, 83, 040502(R) (2011). “Measurement on quantum devices with applications to 74 Timothy J Callow and Nikitas I Gidopoulos, “Optimal time-dependent density functional theory,” arXiv preprint power series expansions of the Kohn–Sham potential,” arXiv:1909.03078 (2019). The European Physical Journal B 91, 209 (2018). 56 Patrick Rall, “Quantum algorithms for estimating phys- 75 Timothy J Callow, Nektarios N Lathiotakis, and Nikitas I ical quantities using block-encodings,” arXiv preprint Gidopoulos, “Density-inversion method for the kohn– arXiv:2004.06832 (2020). sham potential: Role of the screening density,” The Jour- 57 Carsten A Ullrich, Time-dependent density-functional nal of Chemical Physics 152, 164114 (2020). theory: concepts and applications (OUP Oxford, 2011). 76 Daniel S Jensen and Adam Wasserman, “Numerical 58 Kristan Temme, Tobias J Osborne, Karl G Vollbrecht, density-to-potential inversions in time-dependent density David Poulin, and Frank Verstraete, “Quantum metropo- functional theory,” Phys. Chem. Chem. Phys. 18, 21079– lis sampling,” Nature 471, 87 (2011). 21091 (2016). 59 Thomas E Baker, “Lanczos recursion on a quantum com- 77 Daniel S Jensen and Adam Wasserman, “Numerical meth- puter for the Green’s function,” arXiv preprint arXiv: ods for the inverse problem of density functional theory,” 2008.05593 (2020). Int. J. Quant. Chem. 118, e25425 (2018). 60 Rocco A Servedio and Steven J Gortler, “Equivalences 78 Bikash Kanungo, Paul M Zimmerman, and Vikram and separations between quantum and classical learnabil- Gavini, “Exact exchange-correlation potentials from ity,” SIAM Journal on Computing 33, 1067–1092 (2004). ground-state electron densities,” Nature communications 24

10, 1–9 (2019). adiabatic states: towards a density functional theory 79 Ashish Kumar and Manoj K Harbola, “A general penalty beyond the Born–Oppenheimer approximation,” Phil. method for density-to-potential inversion,” arXiv preprint Trans. R. Soc. A 372, 20130059 (2014). arXiv:2004.03219 (2020). 97 Erich Runge and Eberhard KU Gross, “Density- 80 Thomas G Draper, “Addition on a quantum computer,” functional theory for time-dependent systems,” arXiv preprint quant-ph/0008033 (2000). Phys. Rev. Lett. 52, 997 (1984). 81 Stephen P Jordan, “Fast quantum algorithm for numer- 98 Robert van Leeuwen, “Causality and symmetry in time- ical gradient estimation,” Phys. Rev. Lett. 95, 050501 dependent density-functional theory,” Phys. Rev. Lett. (2005). 80, 1280 (1998). 82 AndrásGilyén,Srinivasan Arunachalam, and Nathan 99 Peter Elliott, Filipp Furche, and Kieron Burke, “3 ex- Wiebe, “Optimizing quantum optimization algorithms via cited states from time-dependent density functional the- faster quantum gradient computation,” in Proceedings of ory,” Reviews in computational chemistry 26, 91 (2009). the Thirtieth Annual ACM-SIAM Symposium on Discrete 100 N David Mermin, “Thermal properties of the inhomoge- Algorithms (Society for Industrial and Applied Mathe- neous electron gas,” Physical Review 137, A1441 (1965). matics, 2019) pp. 1425–1444. 101 W Kohn, EKU Gross, and LN Oliveira, “Orbital mag- 83 Michael G Medvedev, Ivan S Bushmarinov, Jianwei Sun, netism in the density functional theory of superconduc- John P Perdew, and Konstantin A Lyssenko, “Density tors,” Journal de physique 50, 2601–2612 (1989). functional theory is straying from the path toward the 102 K Capelle and EKU Gross, “Density functional theory for exact functional,” Science 355, 49–52 (2017). triplet superconductors,” Int. J. Quant. Chem. 61, 325– 84 Thomas Edward Baker, Methods of Calculation with 332 (1997). the Exact Density Functional using the Renormaliza- 103 Michael Ruggenthaler, F Mackenroth, and Dieter Bauer, tion Group, Ph.D. thesis, University of California, Irvine “Time-dependent Kohn-Sham approach to quantum elec- (2017); John P Perdew and Thomas E Baker, “IPAM trodynamics,” Phys. Rev. A 84, 042107 (2011). Book of DFT: The Generalized Gradient Approximation,” 104 IV Tokatly, “Time-dependent density functional theory (2017) Chap. 6. for many-electron systems interacting with cavity pho- 85 Ashish Kumar and Manoj K Harbola, “Using random tons,” Phys. Rev. Lett. 110, 233001 (2013). numbers to obtain kohn-sham potential for a given den- 105 Michael Ruggenthaler, Johannes Flick, Camilla Pelle- sity,” arXiv preprint arXiv:2006.00324 (2020). grini, Heiko Appel, Ilya V Tokatly, and Angel Ru- 86 EKU Gross and CR Proetto, “Adiabatic Connection and bio, “Quantum-electrodynamical density-functional the- the Kohn- Sham Variety of Potential- Functional Theory,” ory: Bridging quantum optics and electronic-structure Journal of chemical theory and computation 5, 844–849 theory,” Phys. Rev. A 90, 012508 (2014). (2009). 106 Michael Filatov, “Ensemble dft approach to excited states 87 Marcel F Langer, Alex Goeßmann, and Matthias Rupp, of strongly correlated molecular systems,” in Density- “Representations of molecules and materials for inter- functional methods for excited states, Topics in Current polation of quantum-mechanical simulations via machine Chemistry (Springer, 2015) pp. 97–124. learning,” arXiv preprint arXiv:2003.12081 (2020). 107 Pejman Jouzdani, Stefan Bringuier, and Mark Kostuk, 88 Matthew B Hastings, “Locality in quantum and markov “A method of determining excited-states for quantum dynamics on lattices and networks,” Phys. Rev. Lett. 93, computation,” arXiv preprint arXiv:1908.05238 (2019). 140402 (2004). 108 Morrel H Cohen and Adam Wasserman, “On hardness 89 Walter Kohn, “Density functional and density matrix and electronegativity equalization in chemical reactivity method scaling linearly with the number of atoms,” theory,” J. Stat. Phys. 125, 1121–1139 (2006). Phys. Rev. Lett. 76, 3168 (1996). 109 Morrel H Cohen and Adam Wasserman, “On the founda- 90 Emil Prodan and Walter Kohn, “Nearsightedness of elec- tions of chemical reactivity theory,” The Journal of Phys- tronic matter,” Proceedings of the National Academy of ical Chemistry A 111, 2229–2242 (2007). Sciences of the United States of America 102, 11635– 110 Peter Elliott, Kieron Burke, Morrel H Cohen, and 11638 (2005). Adam Wasserman, “Partition density-functional theory,” 91 Thomas E Baker, Samuel Desrosiers, Maxime Trem- Phys. Rev. A 82, 024501 (2010). blay, and Martin P Thompson, “Méthodes de cal- 111 Attila Cangi, Donghyung Lee, Peter Elliott, Kieron cul avec réseaux de tenseurs en physique (basic ten- Burke, and E.K.U. Gross, “Electronic structure via po- sor network computations in physics),” arXiv preprint tential functional approximations,” Phys. Rev. Lett. 106, arXiv:1911.11566 (2019). 236404 (2011). 92 GuifréVidal, “Class of quantum many-body states that 112 Attila Cangi, EKU Gross, and Kieron Burke, “Potential can be efficiently simulated,” Phys. Rev. Lett. 101, functionals versus density functionals,” Phys. Rev. A 88, 110501 (2008). 062505 (2013). 93 Lucas O Wagner, EM Stoudenmire, Kieron Burke, and 113 JR Schrieffer, “Theory of superconductivity,” (1963). Steven R White, “Guaranteed convergence of the Kohn- 114 Anirban Narayan Chowdhury and Rolando D Somma, Sham equations,” Phys. Rev. Lett. 111, 093003 (2013). “Quantum algorithms for gibbs sampling and hitting-time 94 Zhenwei Li, James R Kermode, and Alessandro De Vita, estimation,” Quantum Information & Computation 17, “Molecular dynamics with on-the-fly machine learning of 41–64 (2017). quantum-mechanical forces,” Physical review letters 114, 115 Guang Hao Low and Isaac L Chuang, “Optimal hamilto- 096405 (2015). nian simulation by quantum signal processing,” Physical 95 TL Gilbert, “Hohenberg-Kohn theorem for nonlocal ex- review letters 118, 010501 (2017). ternal potentials,” Phys. Rev. B 12, 2111 (1975). 116 Thomas R Mattsson and Michael P Desjarlais, “Phase 96 Nikitas I Gidopoulos and EKU Gross, “Electronic non- diagram and electrical conductivity of high energy-density 25

water from density functional theory,” Phys. Rev. Lett. Physical Journal B 91, 142 (2018). 97, 017801 (2006). 132 Justin C Smith and Kieron Burke, “Thermal stitching: 117 David Poulin and Pawel Wocjan, “Sampling from the Combining the advantages of different quantum fermion thermal quantum gibbs state and evaluating partition solvers,” Phys. Rev. B 98, 075148 (2018). functions with a quantum computer,” Phys. Rev. Lett. 133 Francisca Sagredo and Kieron Burke, “Accurate dou- 103, 220502 (2009). ble excitations from ensemble density functional calcu- 118 EM Stoudenmire, Lucas O Wagner, Steven R White, lations,” J. Chem. Phys. 149, 134103 (2018). and Kieron Burke, “One-dimensional continuum elec- 134 Justin C Smith, Aurora Pribram-Jones, and Kieron tronic structure with the density-matrix renormalization Burke, “Exact thermal density functional theory for group and its implications for density-functional theory,” a model system: Correlation components and accuracy Phys. Rev. Lett. 109, 056402 (2012). of the zero-temperature exchange-correlation approxima- 119 Luke Shulenburger, Michele Casula, Gaetano Sena- tion,” Phys. Rev. B 93, 245131 (2016). tore, and Richard M Martin, “Spin resolved energy 135 Marcela Herrera, Krissia Zawadzki, and Irene D’Amico, parametrization of a quasi-one-dimensional electron gas,” “Melting a hubbard dimer: benchmarks of ‘alda’for quan- Journal of Physics A: Mathematical and Theoretical 42, tum thermodynamics,” The European Physical Journal B 214021 (2009). 91, 248 (2018). 120 Lucas O Wagner, EM Stoudenmire, Kieron Burke, and 136 WA Vigor, JS Spencer, MJ Bearpark, and AJW Thom, Steven R White, “Reference electronic structure calcula- “Minimising biases in full configuration interaction quan- tions in one dimension,” Phys. Chem. Chem. Phys. 14, tum monte carlo,” J. Chem. Phys. 142, 104101 (2015). 8581–8590 (2012). 137 Jarrod R McClean, Ryan Babbush, Peter J Love, and 121 T.E. Baker, E.M. Stoudenmire, L.O. Wagner, K. Burke, AlánAspuru-Guzik, “Exploiting locality in quantum com- and S.R. White, “One-dimensional mimicking of elec- putation for quantum chemistry,” The journal of physical tronic structure: The case for exponentials,” Phys. Rev. B chemistry letters 5, 4368–4380 (2014). 91, 235141 (2015); “Erratum: One-dimensional mim- 138 Ulrich Schollwöck, “The density-matrix renormalization icking of electronic structure: The case for exponentials group in the age of matrix product states,” Annals of [Phys. Rev. B 91, 235141 (2015)],” 93, 119912(E) (2016). Physics 326, 96–192 (2011). 122 N Helbig, IV Tokatly, and Angel Rubio, “Exact Kohn– 139 Garnet Kin-Lic Chan and Sandeep Sharma, “The density Sham potential of strongly correlated finite systems,” matrix renormalization group in quantum chemistry,” An- J. Chem. Phys. 131, 224105 (2009). nual review of physical chemistry 62, 465–481 (2011). 123 N Helbig, Johanna I Fuks, M Casula, Matthieu J Ver- 140 Jun Yang, Weifeng Hu, Denis Usvyat, Devin Matthews, straete, MAL Marques, IV Tokatly, and Angel Rubio, Martin Schütz, and Garnet Kin-Lic Chan, “Ab initio “Density functional theory beyond the linear regime: determination of the crystalline benzene lattice energy Validating an adiabatic local density approximation,” to sub-kilojoule/mole accuracy,” Science 345, 640–643 Phys. Rev. A 83, 032503 (2011). (2014). 124 Peter Elliott, Johanna I Fuks, Angel Rubio, and 141 Olayide A Arodola and Mahmoud ES Soliman, “Quantum Neepa T Maitra, “Universal dynamical steps in the mechanics implementation in drug-design workflows: does exact time-dependent exchange-correlation potential,” it really help?” Drug design, development and therapy 11, Phys. Rev. Lett. 109, 266404 (2012). 2551 (2017). 125 Johanna I Fuks, Kai Luo, Ernesto D Sandoval, and 142 WMC Foulkes, Lubos Mitas, RJ Needs, and G Ra- Neepa T Maitra, “Time-resolved spectroscopy in time- jagopal, “Quantum Monte Carlo simulations of solids,” dependent density functional theory: An exact condition,” Rev. Mod. Phys. 73, 33 (2001). Phys. Rev. Lett. 114, 183002 (2015). 143 Henk Eshuis, Jefferson E Bates, and Filipp Furche, “Elec- 126 N.A. Lima, M.F. Silva, L.N. Oliveira, and K. Capelle, tron correlation methods based on the random phase ap- “Density functionals not based on the electron gas: proximation,” Theoretical Chemistry Accounts 131, 1084 local-density approximation for a Luttinger liquid,” (2012). Phys. Rev. Lett. 90, 146402 (2003). 144 Guo P Chen, Vamsee K Voora, Matthew M Agee, 127 Johanna I Fuks and Neepa T Maitra, “Challenging adi- Sree Ganesh Balasubramani, and Filipp Furche, abatic time-dependent density functional theory with “Random-phase approximation methods,” Annual review a hubbard dimer: the case of time-resolved long-range of physical chemistry 68, 421–445 (2017). charge transfer,” Phys. Chem. Chem. Phys. 16, 14504– 145 Trygve Helgaker, Poul Jorgensen, and Jeppe Olsen, 14513 (2014). Molecular electronic-structure theory (John Wiley & Sons, 128 J.I. Fuks and N.T. Maitra, “Charge transfer in time- 2014). dependent density-functional theory: Insights from the 146 John A Pople, “Nobel lecture: Quantum chemical mod- asymmetric Hubbard dimer,” Phys. Rev. A 89, 062502 els,” Rev. Mod. Phys. 71, 1267 (1999). (2014). 147 Rodney J Bartlett and Monika Musia l, “Coupled-cluster 129 Aron J Cohen and Paula Mori-Sánchez, “Landscape of an theory in quantum chemistry,” Rev. Mod. Phys. 79, 291 exact energy functional,” Phys. Rev. A 93, 042511 (2016). (2007). 130 DJ Carrascal, Jaime Ferrer, Justin C Smith, and Kieron 148 Ryan Babbush, Nathan Wiebe, Jarrod McClean, James Burke, “The Hubbard dimer: a density functional case McClain, Hartmut Neven, and Garnet K.L. Chan, “Low- study of a many-body problem,” Journal of Physics: Con- depth quantum simulation of materials,” Phys. Rev. X 8, densed Matter 27, 393001 (2015). 011044 (2018). 131 Diego J Carrascal, Jaime Ferrer, Neepa Maitra, and 149 JC Kimball, “Short-range correlations and the structure Kieron Burke, “Linear response time-dependent density factor and momentum distribution of electrons,” Journal functional theory of the Hubbard dimer,” The European of Physics A: Mathematical and General 8, 1513 (1975). 26

150 Bruno Klahn and John D Morgan III, “Rates of conver- (2018). gence of variational calculations and of expectation val- 167 Theorems listed here are so heavily dependent on pre- ues,” J. Chem. Phys. 81, 410–433 (1984). existing proofs that they are probably more accurately 151 Robert Nyden Hill, “Rates of convergence and error called a corollary of those theorems, but this naming con- estimation formulas for the Rayleigh–Ritz variational vention is used in several physics works and here to match. method,” J. Chem. Phys. 83, 1173–1196 (1985). 168 James Daniel Whitfield, Peter John Love, and Alan 152 Thomas E. Baker, Kieron Burke, and Steven R. White, Aspuru-Guzik, “Computational complexity in electronic “Accurate correlation energies in one-dimensional systems structure,” Physical Chemistry Chemical Physics 15, from small system-adapted basis functions,” Phys. Rev. B 397–411 (2013). 97, 085139 (2018). 169 Leslie G Valiant, “A theory of the learnable,” Communi- 153 Gregory Beylkin and Martin J Mohlenkamp, “Numeri- cations of the ACM 27, 1134–1142 (1984). cal operator calculus in higher dimensions,” Proceedings 170 Jonathan Blair Amsterdam, The valiant learning model: of the National Academy of Sciences 99, 10246–10251 Extensions and assessment, Ph.D. thesis, Massachusetts (2002). Institute of Technology, Department of Electrical Engi- 154 Roberto Oliveira and Barbara M Terhal, “The complex- neering (1988). ity of quantum spin systems on a two-dimensional square 171 F Bergadano and L Saitta, “On the error probability of lattice,” Quant. Inf. Comput. 8, 900–924 (2008). boolean concept descriptions,” in Proceedings of the 1989 155 Sam McArdle, Tyson Jones, Suguru Endo, Ying Li, Si- European Working Session on Learning (1989) pp. 25–35. mon C Benjamin, and Xiao Yuan, “Variational ansatz- 172 David Haussler, “Probably approximately correct learn- based quantum simulation of imaginary time evolution,” ing,” in Encyclopedia of Machine Learning and Data Min- npj Quantum Information 5, 1–6 (2019). ing, edited by Claude Sammut and Geoffrey I. Webb 156 A Hackl and S Kehrein, “Real time evolution in quantum (Springer US, Boston, MA, 2017) pp. 1017–1017. many-body systems with unitary perturbation theory,” 173 Wray Lindsay Buntine, A theory of learning classifica- Phys. Rev. B 78, 092303 (2008). tion rules, Ph.D. thesis, University of Technology, Sydney 157 Martin Schwarz, Kristan Temme, and Frank Verstraete, (1990). “Preparing projected entangled pair states on a quantum 174 Michael J Pazzani and Wendy Sarrett, “A framework for computer,” Phys. Rev. Lett. 108, 110502 (2012). average case analysis of conjunctive learning algorithms,” 158 AndrásGilyén,Yuan Su, Guang Hao Low, and Nathan Machine Learning 9, 349–372 (1992). Wiebe, “Quantum singular value transformation and 175 Alexander L Fetter and John Dirk Walecka, Quantum beyond: exponential improvements for quantum matrix theory of many-particle systems (Courier Corporation, arithmetics,” in Proceedings of the 51st Annual ACM 2012). SIGACT Symposium on Theory of Computing (2019) pp. 176 Stanley Raimes, Many-electron theory (North-Holland, 193–204. 1972). 159 Noson S Yanofsky and Mirco A Mannucci, Quantum 177 John Hubbard, “Electron correlations in narrow energy computing for computer scientists (Cambridge University bands,” Proceedings of the Royal Society of London. Se- Press, 2008). ries A. Mathematical and Physical Sciences 276, 238–257 160 Yi-Kai Liu, Matthias Christandl, and Frank Ver- (1963). straete, “Quantum computational complexity of the n- 178 Max Born and Robert Oppenheimer, “Zur quantentheorie representability problem: Qma complete,” Physical re- der molekeln,” Annalen der physik 389, 457–484 (1927). view letters 98, 110503 (2007). 179 S Francis Boys, “Electronic wave functions-I. A general 161 Norbert Schuch and Frank Verstraete, “Computational method of calculation for the stationary states of any complexity of interacting electrons and fundamental lim- molecular system,” Proceedings of the Royal Society of itations of density functional theory,” Nature Physics 5, London. Series A. Mathematical and Physical Sciences 732–735 (2009). 200, 542–554 (1950). 162 Mario Szegedy, “Quantum speed-up of markov chain 180 Florian A Bischoff, Robert J Harrison, and Ed- based algorithms,” in 45th Annual IEEE symposium on ward F Valeev, “Computing many-body wave functions foundations of computer science (IEEE, 2004) pp. 32–41. with guaranteed precision: The first-order Møller-Plesset 163 Jessica Lemieux, Bettina Heim, David Poulin, Krysta wave function for the ground state of helium atom,” Svore, and Matthias Troyer, “Efficient quantum walk cir- J. Chem. Phys. 137, 104103 (2012). cuits for metropolis-hastings algorithm,” Quantum 4, 287 181 Robert J Harrison, Gregory Beylkin, Florian A Bischoff, (2020). Justus A Calvin, George I Fann, Jacob Fosso-Tande, 164 Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man- Diego Galindo, Jeff R Hammond, Rebecca Hartman- Hong Yung, Xiao-Qi Zhou, Peter J Love, AlánAspuru- Baker, Judith C Hill, et al., “MADNESS: A multiresolu- Guzik, and Jeremy L O’brien, “A variational eigenvalue tion, adaptive numerical environment for scientific simu- solver on a photonic quantum processor,” Nature commu- lation,” SIAM Journal on Scientific Computing 38, S123– nications 5, 4213 (2014). S142 (2016). 165 Diego Ristè,Marcus P Da Silva, Colm A Ryan, Andrew W 182 Michael Reed and Barry Simon, Methods of modern math- Cross, Antonio D Córcoles,John A Smolin, Jay M Gam- ematical physics: Functional analysis (Elsevier, 2012). betta, Jerry M Chow, and Blake R Johnson, “Demon- 183 Mel Levy and John P Perdew, “The constrained search stration of quantum advantage in machine learning,” npj formulation of density functional theory,” in Density func- Quantum Information 3, 16 (2017). tional methods in physics (Springer, 1985) pp. 11–30. 166 Srinivasan Arunachalam and Ronald De Wolf, “Optimal 184 Enrico Fermi, “Un metodo statistico per la determi- quantum sample complexity of learning algorithms,” The nazione di alcune priorieta dell’atome,” Rend. Accad. Journal of Machine Learning Research 19, 2879–2878 Naz. Lincei 6, 32 (1927). 27

185 Llewellyn H Thomas, “The calculation of atomic fields,” Review Letters 49, 1691 (1982). in Mathematical Proceedings of the Cambridge Philosoph- 205 Sebastian Ruder, “An overview of gradient descent op- ical Society, Vol. 23 (Cambridge University Press, 1927) timization algorithms,” arXiv preprint arXiv:1609.04747 pp. 542–548. (2016). 186 CF v Weizsäcker, “Zur theorie der kernmassen,” 206 Vladim´ır Cern`y,ˇ “Thermodynamical approach to the Zeitschrift fürPhysik A Hadrons and Nuclei 96, 431–458 traveling salesman problem: An efficient simulation algo- (1935). rithm,” Journal of optimization theory and applications 187 Raphael F Ribeiro, Donghyung Lee, Attila Cangi, Peter 45, 41–51 (1985). Elliott, and Kieron Burke, “Corrections to Thomas-Fermi 207 Dimitris Bertsimas and John Tsitsiklis, “Simulated an- densities at turning points and beyond,” Phys. Rev. Lett. nealing,” Statistical science 8, 10–15 (1993). 114, 050401 (2015). 208 Luke Tierney, “Markov chains for exploring posterior dis- 188 Elliott H Lieb and Stephen Oxford, “Improved tributions,” the Annals of Statistics 22, 1701–1728 (1994). lower bound on the indirect coulomb energy,” 209 SVN Vishwanathan, Karsten M Borgwardt, and Nicol N Int. J. Quant. Chem. 19, 427–439 (1981). Schraudolph, “Fast computation of graph kernels,” in 189 Stefano Pittalis, CR Proetto, A Floris, A Sanna, NIPS, Vol. 19 (2006) pp. 131–138. C Bersier, K Burke, and Eberhard KU Gross, “Exact con- 210 Radford M Neal, Bayesian learning for neural networks, ditions in finite-temperature density-functional theory,” Lecture Notes in Statistics, Vol. 118 (Springer Science & Phys. Rev. Lett. 107, 163001 (2011). Business Media, 2012). 190 Jianwei Sun, Adrienn Ruzsinszky, and John P Perdew, 211 Luis Miguel Rios and Nikolaos V Sahinidis, “Derivative- “Strongly constrained and appropriately normed semilo- free optimization: a review of algorithms and compari- cal density functional,” Phys. Rev. Lett. 115, 036402 son of software implementations,” Journal of Global Op- (2015). timization 56, 1247–1293 (2013). 191 Paula Mori-Sánchez, Aron J Cohen, and Weitao Yang, 212 Natesh S Pillai, Andrew M Stuart, and Alexandre H “Localization and delocalization errors in density func- Thiéry, “Noisy gradient flow from a random walk in tional theory and implications for band-gap prediction,” hilbert space,” Stochastic Partial Differential Equations: Physical review letters 100, 146401 (2008). Analysis and Computations 2, 196–232 (2014). 192 Aron J Cohen, Paula Mori-Sánchez, and Weitao Yang, 213 Bryan Perozzi, Vivek Kulkarni, and Steven Skiena, “Insights into current limitations of density functional “Walklets: Multiscale graph embeddings for interpretable theory,” Science 321, 792–794 (2008). network classification,” arXiv preprint arXiv:1605.02115 193 Herbert Goldstein, Charles P Poole, and John L Safko, (2016). Classical Mechanics (Pearson Higher Ed, 2014). 214 Vincent Tejedor, “A dimensional acceleration of gradient 194 Eberhard KU Gross and Neepa T Maitra, “Introduction descent-like methods, using persistent random walkers,” to TDDFT,” in Fundamentals of Time-Dependent Den- arXiv preprint arXiv:1801.04532 (2018). sity Functional Theory (Springer, 2012) pp. 53–99. 215 Swarnendu Ghosh, Nibaran Das, Teresa Gon¸calves, Paulo 195 Michael Seidl, “Strong-interaction limit of density- Quaresma, and Mahantapas Kundu, “The journey of functional theory,” Phys. Rev. A 60, 4387 (1999). graph kernels through two decades,” Computer Science 196 David C Langreth and John P Perdew, “The exchange- Review 27, 88–111 (2018). correlation energy of a metallic surface,” Solid State Com- 216 Vincent Tejedor, Raphael Voituriez, and Olivier munications 17, 1425–1429 (1975). Bénichou, “Optimizing persistent random searches,” 197 Michael Seidl, John P Perdew, and Mel Levy, “Strictly Phys. Rev. Lett. 108, 088103 (2012). correlated electrons in density-functional theory,” Physi- 217 Dougal Maclaurin, David Duvenaud, and Ryan Adams, cal Review A 59, 51 (1999). “Gradient-based hyperparameter optimization through 198 John S Townsend, A modern approach to quantum me- reversible learning,” in International Conference on Ma- chanics (University Science Books, 2000). chine Learning (2015) pp. 2113–2122. 199 Andreas Görling and Mel Levy, “Exact Kohn-Sham 218 Alexandre Blais, “Algorithmes et architectures pour or- scheme based on perturbation theory,” Phys. Rev. A 50, dinateurs quantiques supraconducteurs,” in Annales de 196 (1994). Physique, Vol. 28 (EDP Sciences, 2003) pp. 1–147. 200 Aurora Pribram-Jones, Stefano Pittalis, EKU Gross, and 219 Charles Kittel, Quantum theory of solids (Wiley, 1987). Kieron Burke, “Thermal density functional theory in con- 220 Lisa Hales and Sean Hallgren, “An improved quantum text,” in Frontiers and Challenges in Warm Dense Matter fourier transform algorithm and applications,” in Foun- (Springer, 2014) pp. 25–60. dations of Computer Science, 2000. Proceedings. 41st An- 201 David M Ceperley and Berni J Alder, “Ground state of nual Symposium on (IEEE, 2000) pp. 515–525. the electron gas by a stochastic method,” Physical Review 221 Masuo Suzuki, “Improved trotter-like formula,” Letters 45, 566 (1980). Phys. Lett. A 180, 232–234 (1993). 202 Seymour H Vosko, Leslie Wilk, and Marwan Nusair, “Ac- 222 David Poulin, Alexei Kitaev, Damian S Steiger, curate spin-dependent electron liquid correlation energies Matthew B Hastings, and Matthias Troyer, “Quantum for local spin density calculations: a critical analysis,” algorithm for spectral measurement with a lower gate Canadian Journal of physics 58, 1200–1211 (1980). count,” Phys. Rev. Lett. 121, 010501 (2018). 203 Axel D Becke, “A new mixing of hartree–fock and lo- 223 Erich Novak and Henryk Wo´zniakowski, Tractability of cal density-functional theories,” The Journal of chemical Multivariate Problems: Standard information for func- physics 98, 1372–1377 (1993). tionals, Vol. 12 (European Mathematical Society, 2008). 204 John P Perdew, Robert G Parr, Mel Levy, and Jose L Bal- 224 Sangchul Oh, “Quantum computational method of find- duz Jr, “Density-functional theory for fractional particle ing the ground-state energy and expectation values,” number: derivative discontinuities of the energy,” Physical Phys. Rev. A 77, 012326 (2008). 28

225 Ivan Kassal, James D Whitﬁeld, Alejandro Perdomo- chose to also use abbreviations for those. Ortiz, Man-Hong Yung, and Al´anAspuru-Guzik, “Sim- 227 Emanuel Knill, Gerardo Ortiz, and Rolando D Somma, ulating chemistry using quantum computers,” Annual re- “Optimal quantum measurements of expectation values view of physical chemistry 62, 185–207 (2011). of observables,” Phys. Rev. A 75, 012328 (2007). 226 This algorithm can go by other names, notably quan- 228 Chris Marriott and John Watrous, “Quantum Arthur– tum counting, but we use ‘QAE’ instead of the abbrevi- Merlin games,” Computational Complexity 14, 122–152 ated ‘QC’ for quantum counting since this would overlap (2005). with ‘quantum chemistry’ and ‘quantum computing’ if we