Sampling, rates, and reaction currents through reverse stochastic quantization on quantum computers
Guglielmo Mazzola IBM Quantum, IBM Research – Zurich, 8803 R¨uschlikon,Switzerland (Dated: August 27, 2021) The quest for improved sampling methods to solve statistical mechanics problems of physical and chemical interest proceeds with renewed efforts since the invention of the Metropolis algorithm, in 1953. In particular, the understanding of thermally activated rare-event processes between long-lived metastable states, such as protein folding, is still elusive. In this case, one needs both the finite-temperature canonical distribution function and the reaction current between the reactant and product states, to completely characterize the dynamic. Here we show how to tackle this problem using a quantum computer. We use the connection between a classical stochastic dynamics and the Schroedinger equation, also known as stochastic quantization, to variationally prepare quantum states allowing us to unbiasedly sample from a Boltzmann distribution. Similarly, reaction rate constants can be computed as ground state energies of suitably transformed operators, following the supersymmetric extension of the formalism. Finally, we propose a hybrid quantum-classical sampling scheme to escape local minima, and explore the configuration space in both real-space and spin hamiltonians. We indicate how to realize the quantum algorithms constructively, without assuming the existence of oracles, or quantum walk operators. The quantum advantage concerning the above applications is discussed.
I. INTRODUCTION decorrelate the walk, but generally implies a low accep- tance rate, such that most of the computational time is spent in proposing transitions x x0, which are never The task of sampling from a multidimensional finite- → temperature Boltzmann probability distribution, ρ(x), is accepted. a central problem in numerical simulations of physics, The total CPU runtime of a Metropolis simulation can chemistry1, and beyond the traditional boundaries of be estimated by multiplying the cost to generate each it- Natural sciences. For example in optimization by sim- eration of the Markov chain with the number of steps re- ulated annealing, the physical potential is replaced by quired to i) overcome the thermalization transient regime a suitable cost function, and the temperature becomes and, ii) gather sufficient statistics to evaluate the target an effective parameter that decreases during the opti- operator. The error in its estimate scales with 1/√M, mization2. Perhaps the most important occurences of where M is the number of un-correlated samples gen- critical slowing down of sampling methods in presence of erated. Markov chain Monte Carlo algorithms can be complex energy landscapes include optimization of spin- therefore computationally expensive, as long sequences glasses, neural-networks, and protein folding. All in all, can be necessary to obtain precise estimates of statistical despite these being classical problems, their solution is quantities The shape of the potential energy landscape, far from being simple or efficient with classical methods. v(x), plays a major role in controlling the efficiency of the The celebrated Metropolis algorithm3,4, one of the top algorithm. In short, a sampling scheme, T (x, x0), based ten most important algorithms from the last century5, on local updates likely fails to visit all the local minima enabled countless applications in classical and quantum of the potential during a finite length simulation. On the statistical mechanics6. In short, the Metropolis algo- other hand, the choice of an effective global update rule rithm aims to sample from a probability distribution is heavily system-dependent. While in unfrustrated spin ρ(x), by constructing a random walk for the a variable x. systems, global (or cluster) update rules have been effec- arXiv:2108.11410v1 [quant-ph] 25 Aug 2021 Each iteration of this Markov chain consists of a proposal tive in overcoming critical slowing down of simulations 7,8 move x x0 defined by a transition matrix T (x, x0), fol- at phase transitions , a general solution to this problem → lowed by an acceptance step, with probability A(x, x0). has yet to be found, especially in continuous systems. A practical sampling scheme must be efficient in ex- Molecular dynamics, perhaps the most common local- ploring the huge configuration’s space and escaping the updates based sampling scheme, fails when the poten- local minima of a potential v(x). The latter can be either tial displays several local minima separated by large a chemical energy surface or a cost function for optimiza- barriers6. These conditions are ubiquitous in structural 9–11 tion problems. Smaller average displacements x x0 phase transitions , and conformal reactions in solu- lead to increased acceptance rate, yet the samples→ be- tions, such as the well known protein folding problem12. come statistically correlated, such that CPU is wasted In these cases, a simulation initialized in one minimum in generating a lot of very similar configurations without will hardly visit spontaneously other minima, as this pro- a real improvement in the estimate. On the contrary, cess involves the occurrence of a thermally activated rare a highly non-local proposal move would be effective to event 13. In general, dynamics that are characterized by 2 a time-scale gap between fast local relaxations and slow tum walk operators32, and we indicate how to realize the activation processes are difficult to simulate. These con- algorithm constructively, provided some reasonable ap- ditions arise in systems belonging to physics, chemistry, proximations on the functional form of v(x). and biology. We also notice that dynamics-based ap- proaches, such as Hamiltonian Monte Carlo can be used 14,15 also in absence of a real physical systems . II. LANGEVIN AND FOKKER-PLANCK While several revolutionary enhanced sampling algo- EQUATIONS rithms have been devised to escape such free energy min- 16 17 ima, e.g. parallel tempering , metadynamics , and um- The starting point of our discussion is the first-order 18 brella sampling , to name a few, they also come with Langevin equation. On one hand, this dynamics can limitations. For example, the efficiency of the first is mimic the microscopical behavior of particles in a solu- controlled by the algorithm’s hyper-parameters, while tion, on the other, it is probably the simplest thermostat the latter requires the definition of a reaction coordinate used to achieve canonical sampling at finite temperature which is in general hard to get a priori. In particular, if T . In one dimension, this equation reads we consider a rare-event dynamic, i.e. a transition pro- cess that take place on a long time scale compared to time x˙ = f(x) + η, (1) scale characterizing local relaxations in the local minima, finding the reaction pathway it is a huge problem in itself. where f(x) = ∂v(x)/∂x represents the classical force Given the rapid development of digital quantum hard- acting on the particles,− x ˙ is the usual time derivative of ware applications, it is timely to revisit the task of per- the position x, and η is a gaussian distributed random forming quantum computing simulations of a continuous variable with zero mean, and defined by the following variable diffusion process, with the following goals: pro- correlator: pose a quantum algorithm to (i) unbiasedly sample from a canonical finite-temperature distribution ρ(x), and (ii) < ηtηt0 >= 2T δ(t t0), (2) compute the thermalization rate k, together with the re- − action current j(x), which are essential pieces of informa- where we also set the Boltzmann constant kB and the tion to model reactive processes. Moreover, on a more friction γ to unity (the latter choice only rescales the unit heuristic take, (iii) study the possible origin of quantum of time). The Markov chain generated by the Langevin speed-up in visiting the configuration space, escaping lo- dynamics, where the transition matrix T (x, x0) is given cal minima with global quantum updates. by Eq.1, allows us to sample from the canonical distri- Before addressing these points, let us review some bution quantum computing efforts along this broad line of re- search, to better contextualize the present work. Quan- ρ(x) = exp( v(x)/T ), (3) − tum versions of the Metropolis algorithm have been in- vented, but to achieve the task to sample from a finite in the limit of vanishing integration time-step1,33. temperature quantum canonical distribution19,20. Other It is well known that the probability density distri- related work concerns the task of loading distribution bution P (x, t), for the stochastic process x(t) generated functions in a quantum register. Most methods operate by a first-order Langevin equation (Eq.1) satisfies the under the assumption that the normalization is known a Fokker-Planck’s equation priori21,22. Once that this distribution is loaded, Monte Carlo integration can be performed with a quadratic ∂P (x, t) ∂(f(x)P (x, t)) ∂2P (x, t) = + T (4) speedup, as shown by Montanaro23. ∂t − ∂x ∂x2 A similar quantum speed-up can be achieved in ap- proximating partition functions of classical lattice mod- The stationary solution of this stochastic differential 23,24 equation is exactly the desired equilibrium probability els, under certain conditions . Even more pertinent 33 to our research is the possibility of achieving acceler- density ρ(x) . ated sampling through quantum walks25–27, or a quanti- zation of a Markov chains using parent quantum hamil- tonians and quantum annealing in lattice models28. Fi- III. REVERSE STOCHASTIC QUANTIZATION nally, Ref.29 translates the problem of finding the reac- tion pathways as an optimization problem to be solved In this Section we bridge the classical statistical me- by quantum annealing. chanics with an effective quantum problem. In the early In this manuscript we will adopt a different strat- 80’s, Parisi discovered that there is a deep relation- egy, as we do not use quantum walks or Grover-like ship between the Fokker-Planck equation (4), and the approaches. The computational primitive used in this Schroedinger equation34. This is obtained by searching framework is instead the hamiltonian simulation in con- for a solution of the following type: tinuous space representation30,31. Moreover, we do not assume the existence of an oracle, black-box, or quan- P (x, t) = ψ0(x)Ψ(x, t). (5) 3
p If we write ψ0(x) = ρ(x), then Ψ(x, t) satisfies a . The multidimensional generalization of Eq. 10 is dis- Schroedinger equation in imaginary time cussed in Ref.36. ∂ Ψ(x, t) = HΨ(x, t) (6) ∂t − A. A double well potential where H is an effective Hamiltonian that reads Before going further, it could be beneficial to familiar- H = K + V, (7) ize ourselves with this formalism on a well-studied bench- ∂2 mark, the one-dimensional double well model. In this K = T (8) case a valid potential reads − ∂x2 2 2 1 ∂v(x) 1 ∂ v(x) v(x) = h(x2 x2)2, (11) V = (9) − 0 4T ∂x − 2 ∂x2 and features two local minima at positions x0, sepa- Interestingly, the ground state of H is exactly given by rated by an energy barrier with height h (see± Fig.1). ψ0, with eigenvalue E0 = 0. This connection takes the If T h the hopping process becomes a thermally acti- name of stochastic quantization, because a quantum evo- vated rare event, which rate is well described by Kramers lution in imaginary time can be recast as a classical theory stochastic process, and it can be used to solve quantum ωx0 ω0 h/T physics problems through classical numerical methods. k e− , (12) Here instead, we take the reverse route, using this con- ≈ 2π nection to map a classical statistical mechanics problem where ωx and ω0 are the characteristic frequencies of the into a quantum formalism, to be eventually solved on a harmonic approximation of the potential at the bottom quantum computing machine. x = x and at the barrier x = 013. The time scale For example, we already notice that sampling from the 0 2 1/k represents the relaxation time of any local update ground state ψ0(x) means to sample from the equilib- Markov chain simulations, namely a fully ergodic simu- | | every rium density function ρ(x), at finite temperature lations is achieved if both wells are visited multiple times T , that is one of our targets. To conclude, we provide an (one each, to the very least) during the simulation. one-to-one mapping between a classical potential surface We construct the effective potentials V (x) and V S(x), v(x) ( and a temperature T ) and an effective quantum and we numerically solve the associated Schrodinger operator H. This simple observation marks already a dif- equations. In Fig.1.a we plot these potentials and the 28 parent ference with the recent Ref. , where the obtained first seven eigenfunctions of Eq.7. The two lowest-lying lattice Hamiltonian depends on the specific choice of the states are the symmetric and antisymmetryc combina- classical Markov chain to be quantized. tions of the two distributions localized at the left and the right wells, namely ψ0(x) = 1/√2(ψL(x) + ψR(x)) and ψ (x) = 1/√2(ψ (x) ψ (x)). The energy gap ∆ IV. SUPERSYMMETRIC HAMILTONIAN AND 1 L − R REACTION RATES separating these two states decreases exponentially with the inverse temperature and the height of the potential energy barrier, in perfect agreement with Eq. 12, while Before detailing the quantum computing approach to the gap between the first and the second excited states tackle this quantum problem, let us further explore the remains (1) (see also panel c). This numerically con- possibilities that this formalism unlocks. Indeed, we can firms thatO the gap ∆ of Eq.7, or equivalently the ground extract additional information from the spectrum of H. state energy ES of the supersymmetric partner of Eq.7, The gap between the fundamental and the first excited 0 gives the thermalization rate of the system at finite tem- state ∆ = E E = E provides the relaxation time 1 0 1 perature. towards equilibrium,− which is the dominant time scale at Let us notice again that, despite the fact we are solving which the diffusion process takes place33. While the cal- a Schroedinger equation, the rate obtained is the one culation of excited state energies can be quite elusive, corresponding to a purely classical thermally activated luckily this formalism reserves an additional surprise. process, not to a quantum tunneling event37–40. Since the Hamiltonian H defines a ground state with In Fig.1.b we also numerically demonstrates that a zero eigenvalue, we can construct its super-symmetric 2 S S ψ0(x) = ρ(x). partner, H , such that its ground state E0 directly pro- | | vides the fundamental gap of H35,36. For example, if we consider a simple one-dimensional potential, this new op- S S V. QUANTUM COMPUTING AND QUANTUM erator H = K + V is readily obtained by adding the ADVANTAGE second derivative of v to the effective potential V ,
∂2v(x) The last and essential step of the framework is to V S = V + (10) ∂x2 propose several quantum computing implementations to 4
2 Rate
V,V