arXiv:1606.05365v3 [physics.chem-ph] 8 Sep 2016

Improved sampling and validation of frozen Gaussian approximation with surface hopping algorithm for nonadiabatic dynamics

Jianfeng Lu$^{1,2,*}$ and Zhennan Zhou$^{1}$
$^1$ Department of Mathematics, Duke University, Box 90320, Durham, NC 27708, USA
$^2$ Department of Physics and Department of Chemistry, Duke University, Durham, NC 27708, USA
(Dated: September 9, 2016)

In the spirit of the fewest switches surface hopping, the frozen Gaussian approximation with surface hopping (FGA-SH) method samples a path integral representation of the non-adiabatic dynamics in the semiclassical regime. An improved sampling scheme is developed in this work for FGA-SH based on birth and death branching processes. The algorithm is validated for the standard test examples of non-adiabatic dynamics.

$^*$ [email protected]

I. INTRODUCTION

The surface hopping algorithms, pioneered in [1] and revamped as the fewest switches surface hopping algorithm (FSSH) in [2], are widely used for mixed quantum-classical dynamics in the non-adiabatic regime. The surface hopping algorithms have been successfully applied to various scenarios where the non-adiabatic effect is important [3–9]. Due to this huge popularity, the development of surface hopping algorithms, which focuses on improving the approximation to the exact quantum dynamics, further reducing the computational cost, or taking into account the interaction with the environment, just to name a few, has been a very active research area [3–28].

The underlying idea of the surface hopping algorithms is to use classical trajectories with hopping between adiabatic energy surfaces to approximate the exact Schrödinger dynamics, which is impractical to solve directly due to the curse of dimensionality. While the intuition behind the FSSH type algorithm is quite convincing, the understanding of its systematic derivation from the exact Schrödinger dynamics remains rather poor, despite huge progress in recent years [6, 9, 11, 13, 15, 16, 22, 23, 28–30].

In our previous work [31], we gave a mathematically rigorous derivation of a surface hopping type algorithm, called frozen Gaussian approximation with surface hopping (FGA-SH), starting from the nuclei-electron Schrödinger equation in the non-adiabatic regime. The algorithm can be viewed as a natural extension of the Herman-Kluk propagator [32–34], which is a consistent approximation to the single surface Schrödinger equation in the semiclassical regime, to the non-adiabatic dynamics, and hence it takes into account hopping between different energy surfaces.

The key observation behind the FGA-SH method is that the idea of surface hopping leads to a path integral representation of the semiclassical Schrödinger equations for multiple adiabatic surfaces (referred to as the matrix Schrödinger equation in the sequel), for which the path is given by the classical paths with hops between adiabatic surfaces, in the same spirit as those used in FSSH. To prevent any possible confusion, we emphasize that this is rather different from the Feynman path integral; in particular, the paths are similar to those in FSSH, and trajectories of nuclei may carry different weights. This path integral representation not only provides a clear interpretation of the FSSH type algorithms in terms of a Monte Carlo sampling scheme of the path integral, but also naturally leads to further improvement of the numerical schemes to approximate the path integral, which the current work is devoted to.

The main new ingredient of the improved algorithm in this work is that, instead of using independent trajectories (as a direct Monte Carlo sampling scheme), we reduce the sampling variance by adopting birth-death branching processes, which are typically used in other path integral sampling, e.g., in the context of the diffusion Monte Carlo algorithms [35–37]. As far as we know, this is the first time such a strategy is used for surface hopping type algorithms, thanks to the novel path integral interpretation. In fact, we expect that the connection of the surface hopping algorithms with path integral based methods (see e.g., [38–45]) would be quite fruitful, and that the path integral point of view would fertilize better algorithms for non-adiabatic dynamics. The birth-death branching processes result in an adaptive pruning and splitting of surface hopping trajectories, which bears some similarity with the multiple spawning method [46, 47] in the context of non-adiabatic dynamics, while the latter spawns Gaussians as a set of basis functions for a wave function approach, rather than semiclassical trajectories.

Besides the improved sampling scheme of FGA-SH, in this work we also further elaborate the initial sampling of trajectories in the FGA-SH method and the calculation of observables based on averaging of trajectories. We validate the improved FGA-SH method by various numerical tests, which explore the dependence of the numerical error on the semiclassical parameter $\varepsilon$, the long time accuracy and energy conservation, the impact of the weighting factor, etc., for the model test problems by Tully [2] for nonadiabatic dynamics.

The rest of the paper is organized as follows. In Section II, we review the path integral representation for the matrix Schrödinger equation in the semiclassical regime, which leads to the FGA-SH method introduced in [31]. The improved sampling algorithm for FGA-SH is discussed in Section III. We validate the method by numerical tests in Section IV. The paper is concluded in Section V.

II. THEORY

A. Path integral representation for semiclassical matrix Schrödinger equations

We consider the matrix Schrödinger equation with two electronic levels in the adiabatic basis:

$$
i\varepsilon \partial_t \begin{pmatrix} u_0 \\ u_1 \end{pmatrix}
= -\frac{\varepsilon^2}{2} \Delta_x \begin{pmatrix} u_0 \\ u_1 \end{pmatrix}
+ \begin{pmatrix} E_0 & \\ & E_1 \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \end{pmatrix}
- \frac{\varepsilon^2}{2} \begin{pmatrix} D_{00} & D_{01} \\ D_{10} & D_{11} \end{pmatrix} \begin{pmatrix} u_0 \\ u_1 \end{pmatrix}
- \varepsilon^2 \sum_{j=1}^{m} \begin{pmatrix} d_{00} & d_{01} \\ d_{10} & d_{11} \end{pmatrix}_j \partial_{x_j} \begin{pmatrix} u_0 \\ u_1 \end{pmatrix},
\tag{1}
$$

where for $k, l = 0, 1$ and $j = 1, \ldots, m$,

$$
D_{kl}(x) = \langle \Psi_k(r; x), \Delta_x \Psi_l(r; x) \rangle_r, \tag{2}
$$

$$
\bigl(d_{kl}(x)\bigr)_j = \langle \Psi_k(r; x), \partial_{x_j} \Psi_l(r; x) \rangle_r. \tag{3}
$$

Here $m$ is the spatial dimension of the nuclei degree of freedom $x$ and $\Psi_k(r; x)$ are adiabatic states, where $r$ denotes the electronic degree of freedom. Note that our index convention of $D_{kl}$ and $d_{kl}$ is flipped from that of Tully [2]; we choose the current convention so that the terms in (1) follow the usual index convention of the matrix-vector product.

The matrix Schrödinger equation is obtained from the nuclei-electron Schrödinger equation by expanding the total wave function in the adiabatic basis. After rescaling, the semiclassical nuclei-electron Schrödinger equation reads (see e.g., [48, 49])

$$
i\varepsilon \partial_t \Phi(t, x, r) = -\frac{\varepsilon^2}{2} \Delta_x \Phi(t, x, r) + H_e(r, x) \Phi(t, x, r), \tag{4}
$$

where $\Phi(t, x, r)$ denotes the nuclei-electron wave function and $H_e(r, x)$ denotes the electronic Hamiltonian (in a diabatic representation). The parameter $\varepsilon$ is proportional to the square root of the ratio of the electron mass to that of the nuclei and is thus a small parameter (for simplicity of notation, we have assumed that all nuclei share the same mass). The adiabatic states $\Psi_k(r; x)$ are the eigenstates of $H_e(r, x)$ with eigenvalues $E_k(x)$. Assuming that the first two adiabatic states are well isolated from the rest of the states, we take the following ansatz for the total wave function:

$$
\Phi(t, x, r) = u_0(t, x) \Psi_0(r; x) + u_1(t, x) \Psi_1(r; x). \tag{5}
$$

We obtain (1) by inserting (5) into (4) and writing the equations in terms of $u_0$ and $u_1$. While it is possible to generalize the method to account for more than two adiabatic surfaces, we restrict to the case of two for simplicity.

Solving the matrix Schrödinger equations (1) using conventional numerical methods is impractical due to the potentially high dimensionality of the nuclei degree of freedom, and hence we resort to semiclassical methods which exploit the limiting behavior as $\varepsilon \to 0$. In previous work [31], we derived the frozen Gaussian approximation with surface hopping (FGA-SH) from the matrix Schrödinger equation with rigorous error bounds for the approximate nuclei wave function. The algorithm follows the same spirit as Tully's fewest switches surface hopping (FSSH) method [2], except that the hopping rule is different from FSSH, as will be explained in subsection II B.

The FGA-SH method can be understood as a path integral formulation of the matrix Schrödinger equation in the spirit of surface hopping. It approximates the solution $u = \begin{pmatrix} u_0 \\ u_1 \end{pmatrix}$ as

$$
u(T, x) = u_{\mathrm{FGA\text{-}SH}}(T, x) + \mathcal{O}(\varepsilon)
= \mathbb{E}_{\tilde z}\, \mathcal{F}\bigl(x; \{\tilde z(s)\}_{0 \le s \le T}\bigr) + \mathcal{O}(\varepsilon), \tag{6}
$$

where the average is taken over an ensemble of trajectories we describe below in Section II B and the functional

$\mathcal{F}$ (expression given below in Section II C) depends on the trajectory $\tilde z$ up to time $T$. Our surface hopping algorithm can be viewed as a Monte Carlo sampling scheme for this path integral. As proved in [31], $u_{\mathrm{FGA\text{-}SH}}$ gives an approximation to the exact solution with error $\mathcal{O}(\varepsilon)$ (in the $L^2$ metric) for any finite $T$. For completeness, we provide in the Appendix a brief discussion of the derivation of (6). The readers may refer to [31] for the detailed asymptotic derivation.

B. Surface hopping trajectories

Let us first specify the ensemble of trajectories used in (6), which largely follows the spirit of surface hopping trajectories. The trajectory $\tilde z(t)$ in (6) evolves on the extended phase space, which consists of the classical phase space on two energy surfaces: we write $\tilde z(t) = (z(t), l(t)) \in \mathbb{R}^{2m} \times \{0, 1\}$, where $l(t) \in \{0, 1\}$ indicates the energy surface that the trajectory lies on at time $t$. The position and momentum $z(t) = (p(t), q(t))$ evolve by a Hamiltonian flow on the energy surface $l(t)$:

$$
\begin{cases}
\dot q(t) = p(t); \\
\dot p(t) = -\nabla_q E_{l(t)}(q(t)).
\end{cases}
\tag{7}
$$

The trajectory hops between surfaces according to a Markov jump process, with infinitesimal transition rate over the time period $(t, t + \delta t)$:

$$
\mathbb{P}\bigl( l(t + \delta t) = m \mid l(t) = n,\ z(t) = z \bigr) = \delta_{nm} + \lambda_{nm}(z)\, \delta t + o(\delta t) \tag{8}
$$

for $m, n \in \{0, 1\}$, where the rate matrix is given by

$$
\begin{pmatrix} \lambda_{00}(z) & \lambda_{01}(z) \\ \lambda_{10}(z) & \lambda_{11}(z) \end{pmatrix}
= \begin{pmatrix} -|p \cdot d_{10}(q)| & |p \cdot d_{10}(q)| \\ |p \cdot d_{01}(q)| & -|p \cdot d_{01}(q)| \end{pmatrix}.
\tag{9}
$$

That is, if the trajectory is on surface 0 at time $t$, then during the time interval $(t, t + \delta t)$, the probability that the trajectory hops to surface 1 is given by $|p(t) \cdot d_{10}(q(t))|\, \delta t$ for sufficiently small $\delta t$. We remark that $p \cdot d_{10}(q)$ is in general complex, and hence we take its modulus in the rate matrix; also note that the rate is state dependent (on $z(t)$). The trajectory $\tilde z(t)$ thus follows a Markov switching process, which is piecewise deterministic. The sampled trajectories follow the Hamiltonian dynamics on each energy surface, with random hops to the other energy surface, and thus are very similar in spirit to those trajectories used in the FSSH method (though with different hopping rules).

According to (7), the position and momentum part $z(t) = (p(t), q(t))$ of the trajectory $\tilde z(t)$ is continuous and piecewise differentiable, while $l(t)$ is piecewise constant with almost surely finitely many jumps during any finite time interval. Given a realization of the trajectory $\tilde z(t) = (z(t), l(t))$ starting from the initial condition $\tilde z(0) = (z(0), l(0))$, we denote by $n$ the number of jumps $l(t)$ has in the time interval $[0, T]$ (thus $n$ is a random variable) and the discontinuity set of $l(t)$ by $\{t_1, \cdots, t_n\}$, which is an increasingly ordered random sequence. At each of those times, the trajectory switches from one energy surface to the other; thus $t_k$, $k = 1, \ldots, n$, are referred to as hopping times in the sequel.

The starting point of the trajectory $\tilde z(0) = (z(0), l(0))$ is sampled according to a distribution on the extended phase space determined by the initial condition of the matrix Schrödinger equation. Given the initial wave function $u_k(0, x)$ for $k = 0, 1$, we denote the Gaussian packet transform of $u$ as

$$
A_0(z, l) = 2^{m/2} \int_{\mathbb{R}^m} u_l(0, y)\, e^{\frac{i}{\varepsilon}\left( -p \cdot (y - q) + \frac{i}{2} |y - q|^2 \right)}\, dy. \tag{10}
$$

Then $\tilde z(0) = (z(0), l(0))$ is sampled from the measure $\mathbb{P}_0(z(0), l(0))$ with probability density on $\mathbb{R}^{2m} \times \{0, 1\}$ proportional to $|A_0(z(0), l(0))|$. Here we assume that $A_0(\cdot, k)$ is integrable on $\mathbb{R}^{2m}$ for each $k = 0, 1$ and we denote the normalizing factor as

$$
\mathcal{Z}_0 = \frac{1}{(2\pi\varepsilon)^{3m/2}} \sum_{k = 0, 1} \int_{\mathbb{R}^{2m}} |A_0(z, k)|\, dz. \tag{11}
$$

Thus, the initial point of the trajectory $\tilde z(t)$ follows the distribution

$$
\mathbb{P}_0(z, l) = \mathcal{Z}_0^{-1} \frac{1}{(2\pi\varepsilon)^{3m/2}} |A_0(z, l)|. \tag{12}
$$

We will discuss the numerical sampling of initial points in Section III B.

C. Ensemble average of surface hopping trajectories

Given the description of the ensemble of paths $\tilde z(t)$, $t \in [0, T]$, we now specify the functional $\mathcal{F}$ in the path integral (6). Recall that we denote by $n$ the number of hops of the trajectory and by $\{t_1, \cdots, t_n\}$ the sequence of hopping times. For convenience, we also denote $t_0 = 0$ and $t_{n+1} = T$, the starting and final time of the trajectory.
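The trajectory ensemble of Section II B (Hamiltonian flow (7) on the current surface, interrupted by hops with rate $|p \cdot d_{10}(q)|$ as in (8) and (9)) can be sketched in a few lines. The block below is only an illustration under stated assumptions: the one-dimensional two-level model `toy_model`, its parameters `a` and `c`, the velocity Verlet integrator, and the function name `hopping_trajectory` are all hypothetical choices and are not taken from the paper.

```python
import numpy as np


def toy_model(q, a=0.1, c=0.02):
    """Hypothetical 1D two-level model H(q) = [[V, c], [c, -V]], V = a*tanh(q).

    Returns the adiabatic energies (E_0, E_1), their derivatives in q,
    and the derivative coupling d_10(q) (up to a sign convention; only
    its modulus enters the hopping rate).
    """
    V = a * np.tanh(q)
    dV = a / np.cosh(q) ** 2
    gap = np.sqrt(V ** 2 + c ** 2)
    E = (-gap, gap)
    dE = (-V * dV / gap, V * dV / gap)
    d10 = 0.5 * c * dV / gap ** 2  # derivative of the mixing angle
    return E, dE, d10


def hopping_trajectory(q0, p0, l0, T, dt, rng, model=toy_model):
    """Sample one trajectory (q, p, l) on [0, T] and record its hopping times."""
    q, p, l = float(q0), float(p0), int(l0)
    hop_times = []
    for k in range(int(round(T / dt))):
        # Velocity Verlet step for q' = p, p' = -dE_l/dq on the current surface.
        _, dE, _ = model(q)
        p_half = p - 0.5 * dt * dE[l]
        q += dt * p_half
        _, dE, d10 = model(q)
        p = p_half - 0.5 * dt * dE[l]
        # Hop with probability |p * d_10(q)| * dt; (q, p) stay continuous.
        if rng.random() < abs(p * d10) * dt:
            l = 1 - l
            hop_times.append((k + 1) * dt)
    return q, p, l, hop_times


rng = np.random.default_rng(0)
q_T, p_T, l_T, hops = hopping_trajectory(-3.0, 0.5, 0, T=20.0, dt=0.01, rng=rng)
```

Note that, unlike in FSSH, no momentum rescaling is applied at a hop in this sketch, since the text states that $(p(t), q(t))$ is continuous along the trajectory.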

The functional $\mathcal{F}$ is then given by

$$
\mathcal{F}\bigl(x; \{\tilde z(t)\}_{0 \le t \le T}\bigr)
= |l(T)\rangle\, \frac{\mathcal{Z}_0}{|A_0(\tilde z(0))|}\, A(T)
\exp\Bigl( \frac{i}{\varepsilon} \Theta(T, x) \Bigr) \exp\bigl( w(T) \bigr)
\prod_{k=1}^{n} \frac{\tau^{(k)}}{|\tau^{(k)}|}, \tag{13}
$$

where $|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ denote the electronic state associated with each surface, $\mathcal{Z}_0$ is defined in (11), $A_0(\tilde z(0)) = A_0(z(0), l(0))$ is given in (10), and all the other terms depend on the trajectory, in particular on the sequence of hopping times (we suppress the dependence in the notation for simplicity). An outline of the argument leading to (13) is provided in the Appendix, with the detailed asymptotic derivation provided in [31]. It comes from a stochastic representation of a high dimensional integral involving all possible hopping times of a surface hopping trajectory.

Let us explain the terms appearing in (13). $|l(T)\rangle$ gives the electronic state of the trajectory at time $T$, and the factor $\mathcal{Z}_0 / |A_0(\tilde z(0))|$ results from the initial sampling. The term

$$
A(T) \exp\Bigl( \frac{i}{\varepsilon} \Theta(T, x) \Bigr)
$$

resembles the familiar amplitude ($A(T)$) and phase ($\Theta(T, x)$) expression from the Herman-Kluk propagator [32–34, 50]; in particular, the phase term $\Theta$ takes the following form:

$$
\Theta(t, x) = S(t) + \frac{i}{2} |x - q(t)|^2 + p(t) \cdot (x - q(t)), \tag{14}
$$

where $S(t)$ is the classical action associated with the trajectory; recall that $z(t) = (p(t), q(t))$ is the momentum and position of the trajectory. The amplitude $A$ and action $S$ solve the ODEs

$$
\dot S(t) = \frac{1}{2} |p(t)|^2 - E_{l(t)}(q(t)), \tag{15}
$$

$$
\dot A(t) = \frac{1}{2} A\, \mathrm{tr}\Bigl( Z(t)^{-1} \bigl( \partial_z p(t) - i\, \partial_z q(t)\, \nabla_q^2 E_{l(t)}(q(t)) \bigr) \Bigr) - A\, d_{l(t) l(t)}(q(t)) \cdot p(t), \tag{16}
$$

with initial conditions $S(0; \tilde z(0)) = 0$ and $A(0; \tilde z(0)) = A_0(\tilde z(0))$. Here $\partial_z$ is short for $\partial_z = \partial_{q(0)} - i \partial_{p(0)}$ and $Z(t) = \partial_z (q(t) + i p(t))$. These equations are similar to the evolution equations in the Herman-Kluk propagator, except that, like the evolution equations for $(p(t), q(t))$, the above ODEs also depend on the current surface $l(t)$ of the trajectory. Also note that it is clear from the ODEs that the values of $S(t)$ and $A(t)$ are determined by the trajectory up to time $t$.

The last term in (13) involves the hopping coefficients $\tau^{(k)}$ at each hop, given by

$$
\tau^{(k)} = -p(t_k) \cdot d_{l(t_k^+)\, l(t_k^-)}(q(t_k)), \tag{17}
$$

where $l(t_k^-)$ and $l(t_k^+)$ give the energy surface index before and after the hop at hopping time $t_k$ (so that $l(t_k^-) \ne l(t_k^+)$). Recall that this is exactly related to the jumping intensity used for surface hopping of the trajectories. Since $\tau^{(k)}$ is complex valued in general and we have chosen its modulus as the jumping intensity $\lambda$ in the surface hopping trajectory, the term $\tau^{(k)} / |\tau^{(k)}|$ in (13) is used to correct the phase factor due to the complex value.

Finally, the weighting factor $w$ in (13) solves the ODE

$$
\dot w(t) = \bigl| p(t) \cdot d_{(1 - l(t))\, l(t)}(q(t)) \bigr|, \tag{18}
$$

with initial condition $w(0) = 0$. Thus, it is the accumulated jumping intensity of the trajectory. The appearance of the weighting term in (13) is due to the fact that the infinitesimal transition rate (8) of the trajectory $\tilde z$ is non-homogeneous and depends on the current position and momentum. Therefore, we need to reweight the terms in the path integral representation such that the average gives the correct wave function. Without the weighting term, the path integral formulation is no longer an accurate approximation [31]; we also show in Section IV D the crucial role of such weighting terms for calculating observables associated with the non-adiabatic dynamics.

For algorithmic purposes, we remark that we can combine $A$ with the weighting factor $w$ as

$$
\Gamma(t) = \frac{A(t)}{|A(0)|} \exp\bigl( w(t) \bigr), \tag{19}
$$

which solves the ODE

$$
\dot \Gamma(t) = \frac{1}{2} \Gamma\, \mathrm{tr}\Bigl( Z(t)^{-1} \bigl( \partial_z p(t) - i\, \partial_z q(t)\, \nabla_q^2 E_{l(t)}(q(t)) \bigr) \Bigr)
+ \Gamma \Bigl( \bigl| p(t) \cdot d_{(1 - l(t))\, l(t)}(q(t)) \bigr| - p(t) \cdot d_{l(t) l(t)}(q(t)) \Bigr) \tag{20}
$$

with initial condition $\Gamma(0; \tilde z(0)) = A_0(\tilde z(0)) / |A_0(\tilde z(0))|$. The quantity $|\Gamma(t)|$ will be treated as the weight of the trajectory in our algorithm. Thus we will prune trajectories with small weights, and branch trajectories with larger weights, to reduce the variance of the stochastic sampling algorithm.
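To make the structure of (13) and (14) concrete, the following sketch evaluates the scalar prefactor multiplying $|l(T)\rangle$ for a single trajectory, given its final state. It is a direct transcription of the formulas under the assumption that the quantities $q$, $p$, $S$, $A$, $w$ and the hopping coefficients $\tau^{(k)}$ have already been computed along the trajectory; the function names `theta` and `path_weight` and their argument names are hypothetical.

```python
import numpy as np


def theta(x, q, p, S):
    """Phase function (14): S + (i/2)|x - q|^2 + p . (x - q)."""
    dx = np.asarray(x, dtype=float) - np.asarray(q, dtype=float)
    return S + 0.5j * np.dot(dx, dx) + np.dot(p, dx)


def path_weight(x, q, p, S, A, w, taus, Z0, A0_abs, eps):
    """Scalar factor of |l(T)> in (13) for one trajectory (illustrative).

    taus   : hopping coefficients tau^(k), one per hop (may be empty)
    A0_abs : |A_0| at the initial point, Z0 : normalizing factor (11)
    """
    phase = 1.0 + 0.0j
    for tau in taus:
        phase *= tau / abs(tau)  # phase correction tau / |tau| per hop
    return (Z0 / A0_abs) * A * np.exp(1j * theta(x, q, p, S) / eps) * np.exp(w) * phase
```

Since $\operatorname{Im} \Theta = \tfrac{1}{2} |x - q|^2$, the modulus of the factor $\exp(i\Theta/\varepsilon)$ is the Gaussian $\exp(-|x - q|^2 / (2\varepsilon))$, so each trajectory contributes a wave packet of width of order $\sqrt{\varepsilon}$ centered at $q(T)$.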

III. ALGORITHM

A. FGA-SH sampling based on birth/death branching process

As discussed before, the path integral representation readily suggests a direct Monte Carlo sampling strategy: an ensemble of independent trajectories is generated as in Section II B, and then the average of $\mathcal{F}$ is calculated as in Section II C to approximate the time-dependent wave function or the associated observables. This is the algorithm used in our previous work [31].

Here we present an improved sampling strategy based on a birth/death branching process to reduce the sampling variance of the ensemble average, borrowing a familiar variance reduction method from the context of diffusion Monte Carlo algorithms. The basic idea is to prune the trajectories with small weights ($|\Gamma(t)|$), while duplicating those with larger weights, in a consistent way such that the mean is preserved, while avoiding the situation that a few trajectories carry huge weights, so as to reduce the variance of the sampling. Note that the trajectories are no longer independent; the dependence comes in through the branching step.

The algorithm starts by sampling a collection of initial points for the trajectories and estimating $\mathcal{Z}_0$ defined in (11), which will be discussed in more detail in Section III B. We denote by $M(0)$ the number of trajectories initially. Each trajectory carries the information of position $q_\alpha$, momentum $p_\alpha$, the index of the energy surface $l_\alpha$, classical action $S_\alpha$, and a weight $\gamma_\alpha$; initially we set $S_\alpha = 0$ and $\gamma_\alpha = A_0(\tilde z_\alpha(0)) / |A_0(\tilde z_\alpha(0))|$ for each trajectory. We use $\gamma_\alpha$ here to distinguish it from $\Gamma$, since during the branching process the weight $\gamma_\alpha$ of a trajectory will be redistributed among the offspring and hence differs from $\Gamma$, as will be discussed in the algorithm below.

The propagation of the trajectories is carried out as follows. For each time step of size $\Delta t$, the following steps are performed in order:

1. Evolve the position and momentum $p(t)$, $q(t)$ by the Hamiltonian dynamics (7) on the current surface $l(t)$ for the time interval $\Delta t$ for each trajectory (we omit the index of the trajectory in the algorithm description for simplicity of notation).

2. Evolve the quantities $S$ and $\gamma$ following (15) and (20) respectively, according to the current surface of the trajectory $l(t)$. Note that, as discussed in Section II C, the quantities $S$ and $\gamma$ at time $t$ only depend on the portion of the trajectory $\tilde z$ up to time $t$. Therefore we may calculate them for each trajectory on the fly. These ODEs can be numerically solved by standard ODE integrators; for example, a fourth order Runge-Kutta scheme is chosen in our implementation.

3. Hopping attempts. The probability that a surface hop occurs within the time interval $(t, t + \Delta t)$ is given by $\Delta t\, |\lambda_{(1 - l(t))\, l(t)}|$. For $\Delta t$ sufficiently small, we can neglect the event that two hops happen within the time interval. With this probability, the trajectory is hopped to the other surface, so that the label of the energy surface is changed, $l(t + \Delta t) = 1 - l(t)$, and the phase change $\tau / |\tau|$ is recorded.

4. Birth/death branching. Every $N_{\mathrm{branch}}$ steps we do the branching according to the weight that the trajectories carry at time $t + \Delta t$: $\gamma = \gamma(t + \Delta t)$. For each trajectory, we generate a random number $\xi$ uniformly distributed on $[0, 1]$. Denoting by $[|\gamma|]$ the integer part of $|\gamma|$, the birth/death is given by
• If $\xi < |\gamma| - [|\gamma|]$, we replace the current trajectory with $[|\gamma|] + 1$ trajectories identical to the parent trajectory except that the weight is reset to be $\gamma / |\gamma|$;
• If $\xi \ge |\gamma| - [|\gamma|]$, we replace the current trajectory with $[|\gamma|]$ trajectories identical to the parent trajectory except that the weight is reset to be $\gamma / |\gamma|$. If $[|\gamma|] = 0$, this means that we kill the parent trajectory without any offspring.

Note that the branching rule is designed so that on average the total weight of the offspring is equal to the weight of the parent trajectory (the expected number of offspring is $|\gamma|$), whereas the weight of each offspring has modulus 1.

After propagating the trajectories until $t = T$, the wave function can be reconstructed following (6) with a modification to take into account the birth/death branching process. The path integral is approximated by

$$
u_{\mathrm{FGA\text{-}SH}}(T, x) = \frac{\mathcal{Z}_0}{M(0)} \sum_{\alpha = 1}^{M(T)} |l_\alpha(T)\rangle\, \gamma_\alpha(T)
\exp\Bigl( \frac{i}{\varepsilon} \Theta_\alpha(T, x) \Bigr) \prod_{k=1}^{n_\alpha} \frac{\tau_\alpha^{(k)}}{|\tau_\alpha^{(k)}|}, \tag{21}
$$

where $M(T)$ denotes the number of trajectories at time $T$ and we use the subscript $\alpha$ explicitly to emphasize the dependence of the quantities on the right hand side on each trajectory. We also remark that $|\gamma_\alpha| \approx 1$ due to the branching process, so it mainly contributes a phase factor in the summation.
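The birth/death step can be sketched as below. This is a minimal illustration, not the authors' implementation: the helper names `branch` and `branch_population` are hypothetical, and the rule is written with the convention that an extra copy is created with probability equal to the fractional part of $|\gamma|$, which is what preservation of the mean requires.

```python
import numpy as np


def branch(gamma, xi):
    """Return (number of offspring, weight of each offspring) for one trajectory.

    gamma : complex weight of the parent; xi : uniform random number in [0, 1).
    The expected number of offspring equals |gamma|.
    """
    mag = abs(gamma)
    n = int(np.floor(mag))
    if xi < mag - n:  # extra copy with probability frac(|gamma|)
        n += 1
    unit = gamma / mag if mag > 0 else 0.0
    return n, unit


def branch_population(gammas, rng):
    """Apply the branching rule to all trajectories.

    Returns the parent index of every offspring (to copy q, p, l, S from)
    and the reset weights gamma / |gamma|.
    """
    parents, weights = [], []
    for idx, g in enumerate(gammas):
        n, unit = branch(g, rng.random())
        parents.extend([idx] * n)
        weights.extend([unit] * n)
    return np.array(parents, dtype=int), np.array(weights, dtype=complex)
```

With this rule the expected total weight of the offspring is $|\gamma| \cdot \gamma / |\gamma| = \gamma$, so the estimator (21) remains unbiased with respect to the branching, while every surviving trajectory carries a weight of unit modulus.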

We emphasize that, except for the step of birth/death branching, there is no exchange of information between the trajectories; moreover, the branching history does not contribute to the modified average of trajectories (21), and hence does not need to be stored. Therefore, the computational cost to generate one trajectory in the current algorithm is almost the same as that with fully independent trajectories, e.g., the direct Monte Carlo sampling of the path integral [31].

B. Initial sampling

We now further elaborate the initial sampling of the trajectories. For simplicity, let us assume that initially the wave function is only non-zero on surface 0, so that the trajectories will all initiate on that surface. Recall that we aim to sample $z(0)$ according to the probability distribution

$$
\mathbb{P}_0(q, p, 0) = \frac{1}{\mathcal{Z}_0} \frac{1}{(2\pi\varepsilon)^{3m/2}} |A_0(z(0), 0)|, \tag{22}
$$

where $A_0$ is defined in (10) and $\mathcal{Z}_0$ in (11). We assume that $A_0(z(0), 0)$ can be obtained with some accuracy. Explicit calculation is unfortunately only possible in low dimensions for general initial data, due to the curse of dimensionality in numerical quadrature. A special case is when $u_0(y)$ is a Gaussian wave packet, for which $A_0$ and $\mathcal{Z}_0$ can be obtained explicitly. We now give the expression of $\mathbb{P}_0$ for Gaussian initial data, as this is used in our numerical tests. For example, if we consider a semiclassical wave packet centered at $y = \tilde q$ with momentum $\tilde p$,

$$
u_0(y) = \exp\Bigl( \frac{i}{\varepsilon} \tilde p \cdot (y - \tilde q) \Bigr) \exp\Bigl( -\sum_{j=1}^{m} \frac{a_j}{2\varepsilon} (y_j - \tilde q_j)^2 \Bigr),
$$

where $a_j$ are positive constants, then by direct calculation we have

$$
A_0(q, p, 0) = 2^m (\pi\varepsilon)^{\frac{m}{2}} \prod_{j=1}^{m} (1 + a_j)^{-\frac{1}{2}}
\exp\Bigl( -\frac{(\tilde p_j - p_j)^2 + a_j (\tilde q_j - q_j)^2}{2 (1 + a_j) \varepsilon} \Bigr)
\exp\Bigl( \frac{i (a_j \tilde q_j + q_j)(\tilde p_j - p_j)}{(1 + a_j) \varepsilon} + \frac{i (p_j q_j - \tilde p_j \tilde q_j)}{\varepsilon} \Bigr). \tag{23}
$$
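As a sanity check of the closed form (23), one can compare it against a direct numerical evaluation of the transform (10) in one dimension, and use its modulus to draw initial phase space points. The sketch below does both under stated assumptions: it treats only $m = 1$, the function names `A0_closed`, `A0_quadrature` and `sample_initial` and the arguments `q_c`, `p_c` (standing for $\tilde q$, $\tilde p$) are hypothetical, and the quadrature is a plain equispaced sum, which is adequate here because the integrand is a rapidly decaying Gaussian. The sampler uses the observation that the modulus of (23) is Gaussian in $(q, p)$ with variance $(1 + a)\varepsilon / a$ in $q$ and $(1 + a)\varepsilon$ in $p$.

```python
import numpy as np


def A0_closed(q, p, q_c, p_c, a, eps):
    """Closed form (23) for m = 1; (q_c, p_c) is the center of the packet."""
    pref = 2.0 * np.sqrt(np.pi * eps / (1.0 + a))
    decay = -((p_c - p) ** 2 + a * (q_c - q) ** 2) / (2.0 * (1.0 + a) * eps)
    phase = (a * q_c + q) * (p_c - p) / ((1.0 + a) * eps) + (p * q - p_c * q_c) / eps
    return pref * np.exp(decay + 1j * phase)


def A0_quadrature(q, p, q_c, p_c, a, eps, n=8001):
    """Direct evaluation of the transform (10) for m = 1 by an equispaced sum."""
    half = 12.0 * np.sqrt(eps)  # integrand is negligible beyond this range
    y = np.linspace(min(q, q_c) - half, max(q, q_c) + half, n)
    u0 = np.exp(1j * p_c * (y - q_c) / eps - a * (y - q_c) ** 2 / (2.0 * eps))
    kernel = np.exp((1j / eps) * (-p * (y - q) + 0.5j * (y - q) ** 2))
    return np.sqrt(2.0) * np.sum(u0 * kernel) * (y[1] - y[0])


def sample_initial(n, q_c, p_c, a, eps, rng):
    """Draw n points (q, p) with density proportional to |A_0| and unit-modulus weights."""
    q = rng.normal(q_c, np.sqrt((1.0 + a) * eps / a), size=n)
    p = rng.normal(p_c, np.sqrt((1.0 + a) * eps), size=n)
    A0 = A0_closed(q, p, q_c, p_c, a, eps)
    return q, p, A0 / np.abs(A0)


eps = 1.0 / 32.0
err = abs(A0_closed(0.1, 1.2, 0.0, 1.0, 2.0, eps) - A0_quadrature(0.1, 1.2, 0.0, 1.0, 2.0, eps))
```

The returned weights $A_0 / |A_0|$ carry only the phase of (23), which matches the role of the initial weight in the algorithm of Section III A.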

Taking the modulus of (23), we obtain

$$
|A_0(q, p, 0)| = 2^m (\pi\varepsilon)^{\frac{m}{2}} \prod_{j=1}^{m} (1 + a_j)^{-\frac{1}{2}}
\exp\Bigl( -\frac{(\tilde p_j - p_j)^2 + a_j (\tilde q_j - q_j)^2}{2 (1 + a_j) \varepsilon} \Bigr). \tag{24}
$$

The resulting probability distribution for the initial position and momentum is a multivariate normal distribution centered at $q = \tilde q$ and $p = \tilde p$ with standard deviation $\mathcal{O}(\sqrt{\varepsilon})$. The normalizing constant $\mathcal{Z}_0$ can also be calculated analytically as

$$
\mathcal{Z}_0 = \frac{1}{(2\pi\varepsilon)^{3m/2}} \int_{\mathbb{R}^{2m}} |A_0(z, 0)|\, dz = 2^{\frac{m}{2}} \prod_{j=1}^{m} \Bigl( \frac{1 + a_j}{a_j} \Bigr)^{\frac{1}{2}}.
$$

Therefore, the probability distribution (22) is obtained explicitly. In the special case that $a_j = 1$, $\forall j$, we have $\mathcal{Z}_0 = 2^m$ and the initial probability measure is given by

$$
\mathbb{P}(p, q, 0) = 2^{-2m} (\varepsilon\pi)^{-m} \exp\Bigl( -\frac{|\tilde p - p|^2 + |\tilde q - q|^2}{4\varepsilon} \Bigr).
$$

The initial momentum and position of the trajectories $(p_\alpha(0), q_\alpha(0))$ are sampled independently from this distribution, and we set $l_\alpha = 0$, $S_\alpha = 0$ and the initial weight

$$
\gamma_\alpha = \frac{A_0(z_\alpha(0), 0)}{|A_0(z_\alpha(0), 0)|}.
$$

For the Gaussian initial data, we can directly sample $(p, q)$ from $\mathbb{P}_0$, which is a Gaussian distribution on the phase space. For more general $\mathbb{P}_0$, some importance sampling might be needed, which we will not discuss in detail here.

C. Computing observables

For many applications, the goal is not to approximate the wave function itself, but rather the time evolution of certain observables. For the purpose of calculating an observable, it is often the case that we do not need to reconstruct the wave function on a mesh, which is not feasible in high dimensions anyway.

We take the mass on each energy surface as an example, denoted by

$$
m_k = \int_{\mathbb{R}^m} |u_k(T, x)|^2\, dx, \qquad k = 0, 1; \tag{25}
$$

other macroscopic quantities can be treated in a similar fashion. From the approximation of the wave function in FGA-SH, we have

$$
|u_k(T, x)|^2 = u_k(T, x)\, \bar u_k(T, x).
$$

Using (21) for both $u_k$ and the complex conjugate $\bar u_k$, we arrive at

$$
|u_k(T, x)|^2 \approx \frac{\mathcal{Z}_0^2}{(M(0))^2} \sum_{\alpha, \beta = 1}^{M(T)} \Biggl[ \delta_{k, l_\alpha(T)}\, \delta_{k, l_\beta(T)}\, \gamma_\alpha \bar\gamma_\beta\;
g_{\alpha, \beta}(x) \prod_{k=1}^{n_\alpha} \frac{\tau_\alpha^{(k)}}{|\tau_\alpha^{(k)}|} \prod_{k=1}^{n_\beta} \frac{\bar\tau_\beta^{(k)}}{|\tau_\beta^{(k)}|} \Biggr], \tag{26}
$$

where $\alpha$ and $\beta$ are indices for two trajectories (for $u_k$ and $\bar u_k$ respectively), and we have used the short-hand notation

$$
g_{\alpha, \beta}(x) = \exp\Bigl( \frac{i}{\varepsilon} \Theta_\alpha - \frac{i}{\varepsilon} \bar\Theta_\beta \Bigr).
$$

We remark that while the formula (26) appears to be complex, the resulting $m_k$ is in fact real due to the symmetry under interchanging the indices $\alpha$ and $\beta$.

We observe that the only dependence on $x$ of the right hand side of (26) is through $g_{\alpha, \beta}(x)$, and since it is a Gaussian function, the integration in $x$ can be carried out analytically:

$$
\int_{\mathbb{R}^m} g_{\alpha, \beta}(x)\, dx = (\pi\varepsilon)^{\frac{m}{2}}\, e^{\frac{i}{\varepsilon} (S_\alpha(T) - S_\beta(T))}
\times e^{-\frac{i}{2\varepsilon} (q_\alpha(T) - q_\beta(T)) \cdot (p_\alpha(T) + p_\beta(T))}
\times e^{-\frac{1}{4\varepsilon} |q_\alpha(T) - q_\beta(T)|^2 - \frac{1}{4\varepsilon} |p_\alpha(T) - p_\beta(T)|^2}.
$$

Therefore, given the sampled FGA-SH trajectories (and hence $p_\alpha(T)$, $q_\alpha(T)$, etc.), we can estimate $m_k$ without reconstruction and numerical quadrature of the wave functions.

IV. RESULTS AND DISCUSSIONS

A. Tully's three examples

We now revisit standard model examples similar to those of Tully [2], which are test cases widely used for non-adiabatic dynamics. The algorithm is not limited to one dimension; the comparison for higher dimensional systems (for instance the spin-boson system) will be considered in future works. All the examples below have two adiabatic states, which means that the Hilbert space corresponding to the electronic degree of freedom is equivalent to $\mathbb{C}^2$, and hence the electronic Hamiltonian $H_e$ (in a diabatic representation) is equivalent to a $2 \times 2$ matrix.

Example 1. Simple avoided crossing.
For the simple avoided crossing example, we will vary the energy gap (controlled by the small parameter $\delta$) between the two surfaces as $\varepsilon \to 0$, which is a more interesting scenario. For this, we consider a class of electronic Hamiltonians given by a product of a scalar function $F^\delta(x)$ and a $2 \times 2$ matrix $M(x)$ independent of $\delta$, namely

$$
H_e^\delta(x) = F^\delta(x) M(x).
$$

Since $F^\delta$ is a scalar, $H_e^\delta$ and $M$ share the same eigenfunctions. Denoting the eigenvalues of $M$ by $\lambda_k$, we have

$$
E_k^\delta(x) = F^\delta(x) \lambda_k(x). \tag{27}
$$

Then, we obtain that, for $k \ne l$,

$$
d_{lk}^\delta = \frac{\langle \Psi_l, \nabla_x H_e^\delta \Psi_k \rangle}{E_k^\delta - E_l^\delta} = \frac{\langle \Psi_l, \nabla_x M \Psi_k \rangle}{\lambda_k - \lambda_l},
$$

and similarly,

$$
D_{lk}^\delta = \langle \Psi_l, \Delta_x \Psi_k \rangle = \frac{\langle \Psi_l, \Delta_x M \Psi_k \rangle - 2 \nabla_x \lambda_k \cdot d_{kl}}{\lambda_k - \lambda_l}.
$$

Therefore, $d_{lk}^\delta$ and $D_{lk}^\delta$ are independent of $F^\delta$ (and of $\delta$ in particular), so we suppress the superscript $\delta$ from now on. Meanwhile, we can choose $F^\delta$ such that the energy surfaces become close and even touch each other as $\delta \to 0$.

For the simple avoided crossing, we choose $M$ to be

$$
M = \begin{pmatrix} \dfrac{\tanh(wx)}{2\pi} & 0.1 \\ 0.1 & -\dfrac{\tanh(wx)}{2\pi} \end{pmatrix},
$$

where $w$ is a stretching parameter. The eigenvalues of $M$ are

$$
\pm \sqrt{\frac{\tanh^2(wx)}{4\pi^2} + 0.01},
$$

plotted in Figure 1. We observe that the two eigenvalue surfaces are close around $x = 0$. From the plots of $d_{01}$ and $D_{01}$ in Figure 1, we see that the coupling is significant around $x = 0$ as well. To control the energy gap, we introduce the following function $F^\delta$,

$$
F^\delta(x) = C_g \bigl( 1 + (\delta - 1) e^{-x^2} \bigr),
$$

such that $F^\delta(x) = \mathcal{O}(\delta)$ around $x = 0$ and $F^\delta(0) = C_g \delta$, so that the energy gap vanishes at $x = 0$ as $\delta \to 0$. Here $C_g$

is a parameter we will use to adjust the global energy gap of the adiabatic surfaces. We introduce $F^\delta$ to get a family of examples for different $\varepsilon$, which facilitates the study of the algorithm for different semiclassical parameters $\varepsilon$, as in Section IV B. We will focus on the most interesting regime of parameter choice $\delta = \mathcal{O}(\varepsilon)$, so that the energy gap is comparable to the semiclassical parameter.

FIG. 1. Example 1. Left: Eigenvalues of $M$ (related to the eigenvalues of $H_e^\delta$ by $F^\delta$). Right: the coupling vectors of $H_e^\delta$, which are invariant with respect to $\delta$ (see text for further discussion).

Example 2. Dual avoided crossing. We choose $H_e$ to be

$$
H_e = \frac{1}{20} \begin{pmatrix} 0 & 0.1 e^{-0.06 x^2} \\ 0.1 e^{-0.06 x^2} & -0.5 e^{-x^2} + 0.25 \end{pmatrix}.
$$

Observe that the two eigenvalues are closest to each other when $x = \pm 1$. The coupling vectors around these points $x = \pm 1$ are significantly larger than their values elsewhere, as shown in Figure 2. This explains why the model is often referred to as the dual avoided crossing. The energy surfaces and coupling vectors are illustrated in Figure 2.

FIG. 2. Example 2. Left: Eigenvalues of $H_e$. Right: the coupling vectors of $H_e$.

Example 3. Extended coupling with reflection. In this example, $H_e$ is set to be

$$
H_e = F \begin{pmatrix} \dfrac{1}{20} & \dfrac{1}{20} \Bigl( \arctan(2x) + \dfrac{\pi}{2} \Bigr) \\ \dfrac{1}{20} \Bigl( \arctan(2x) + \dfrac{\pi}{2} \Bigr) & -\dfrac{1}{20} \end{pmatrix},
$$

where

$$
F = \frac{1}{20} \Bigl( \arctan(5x) + \frac{\pi}{2} + \delta \Bigr).
$$

Hence, as $x \to \infty$, the eigenvalues of $H_e$ satisfy $E_{0/1}(x) \to \mp \pi^2 / 400$, and as $x \to -\infty$, the eigenvalues of $H_e$ satisfy $E_{0/1}(x) \to \mp 0$. As shown in Figure 3, this model involves an extended region of strong nonadiabatic coupling when $x < 0$. Moreover, for $x > 0$, the upper energy surface is repulsive, so that it is likely that trajectories moving from left to right on the excited energy surface will be reflected while those on the ground energy surface will be transmitted. The energy surfaces and coupling vectors are illustrated in Figure 3. Compared to Tully's extended coupling example in [2], we slightly modify the potential energy surface so that the eigenvalues $E_{0/1}(x)$ as functions of $x$ become smooth.

FIG. 3. Example 3. Left: Eigenvalues of $H_e$. Right: the coupling vectors of $H_e$.

B. Convergence test with various $\varepsilon$

In this test, we test the algorithm on Example 1 for $\varepsilon = \frac{1}{32}$, $\frac{1}{64}$ and $\frac{1}{128}$ with $w = 2$, $\delta = 5\varepsilon$ and $C_g = 1$. The purpose is to understand the performance of the algorithm for systems with different semiclassical parameters $\varepsilon$. The corresponding energy surfaces are plotted in Figure 4, while the coupling vectors stay unchanged. The initial condition is chosen concentrated on the lower surface only, and assumed to be the Gaussian wave packet

$$
u_0(0, x) = (32\varepsilon)^{-1/4}\, e^{\frac{i}{\varepsilon} k_0 \cdot (x - y_0)}\, e^{-\frac{1}{2\varepsilon} (x - y_0)^2}, \tag{28}
$$

where $y_0$ and $k_0$ denote the initial position and momentum of the semiclassical wave packet respectively; the packet is normalized so that the $L^2$ norm is 1. In this set of tests, we fix $y_0 = -1.5$ and $k_0 = 1.5$. For the initial trajectory numbers $M = 25$, 50, 100, 200, 400, 800, 1600 and 3200, the numerical test is repeated 100 times in each case, and the empirical averages of the $L^2$ errors of the wave functions with confidence intervals are plotted in Figure 4. We observe that the numerical error is dominated by the stochastic sampling error, which is of order $\mathcal{O}(M^{-1/2})$, which is consistent with the Monte

FIG. 4. Top: Eigenvalues of $H_e$ for $\varepsilon = \frac{1}{32}$, $\frac{1}{64}$ and $\frac{1}{128}$. Bottom: Empirical average of the error in the wave functions with confidence intervals, plotted against the initial sample size $M$ together with an $O(M^{-1/2})$ reference line.

FIG. 5. Snapshot of the numerical solutions at $t = 4$ (two components of the solution are plotted, FGA-SH against the reference); note in particular that part of the wave function has bounced back and propagates to the left.

Carlo nature of the algorithm. The stochastic sampling error also appears to be rather independent of the small parameter $\varepsilon$.

C. Error growth in time and conservation of energy in a bouncing back test

We now study the performance of the FGA-SH algorithm in a bouncing back test, to validate the long time accuracy, especially when the wave propagation switches direction (is bounced back by the energy surface). We choose to focus on Example 1 for the test. We fix $\varepsilon = \frac{1}{32}$, $w = 2$, $\delta = \varepsilon$ and $C_g = 5$. The potential is steeper compared to that of the previous subsection, and therefore the wave packet tends to be bounced back by the steep potential. The initial condition is concentrated on the lower surface only, and takes the same form as (28) with $k_0 = 1.7$ and $y_0 = -1.5$. We choose these parameters so that the transmitted wave packet will switch direction when propagating on the top energy surface and experience a second significant non-adiabatic transition when it goes through $x = 0$ again. We compare the numerical results with initial number of trajectories $M = 3200$ to the reference solution until $T = 4$. The wave functions obtained by the FGA-SH method and the reference solutions are plotted in Figure 5, from which we observe very good agreement.

We plot the error in the wave function versus time in Figure 6 (the error is empirically averaged over 100 independent runs). It seems that the error grows roughly linearly in time, so the growth is mild. Also, we plot the discrepancy of the total energy between the FGA-SH solutions and the reference solutions. In this test, the initial energy of the wave packet is 0.1935. Besides the averaged error shown in Figure 6, we also note that among the 100 runs, the largest deviation in energy is 0.0115. Therefore, the total energy is roughly conserved even in the worst scenario (relative error 5.94%). Also, we report the average number of trajectories versus time in Figure 6, which suggests a mild linear growth as the simulation proceeds. This is a good indication that the variance of the algorithm does not grow rapidly in time; a rigorous variance bound is beyond the scope of the current work and is left for future work.

Let us remark that a major difference between FGA-SH and other FSSH type algorithms is that the trajectory $\tilde z(t)$ is continuous in time on the phase space $\mathbb{R}^{2m}$, while in FSSH and other versions of surface hopping, a momentum shift is usually introduced to conserve the classical energy along the trajectory when hopping occurs (if hopping occurs from energy surface 0 to 1, it is required that $E_0(p, q) = E_1(p', q)$, where $p'$ is the momentum after hopping). Note that, as in the FGA for the single surface Schrödinger equation, each Gaussian evolved in the FGA-SH does not solve the matrix Schrödinger equation, and only the average of trajectories gives an approximation to the solution. Therefore, it is not necessary for each trajectory to conserve the classical energy. As we have shown, the ensemble average of the trajectories gives good conservation of the energy of the wave packet.

FIG. 6. From top to bottom: The error in the wave functions measured in the $L^2$ metric, the error in total energy, and the number of trajectories as a function of time. The averages and confidence intervals are estimated empirically using 100 runs.

D. Effect of weighting factor w

We now numerically demonstrate the role of the weighting factor $\exp(w)$, which is important to get the correct ensemble average of the wave function, since the hopping is driven by a non-homogeneous Poisson process (the jumping rate depends on the position and momentum of the trajectory). We choose to focus on Example 1 for the test. We fix $\varepsilon = \frac{1}{32}$, $w = 1$, $\delta = \varepsilon$ and $C_g = 1$. The initial condition is concentrated on the lower surface only, and takes the same form as (28) with $k_0 = 1.5$ and $y_0 = -1.5$. We run the algorithm with and without the weighting factor $w$ (the latter meaning that we always set $w = 0$) 100 times each. Some snapshots of the numerical solutions are plotted in Figure 7, from which we observe that the weighting factors are crucial for accurate approximations of the wave functions.

FIG. 7. Top: FGA-SH solutions with the weighting factor. Bottom: FGA-SH solutions without the weighting factor. (Each panel compares FGA-SH with the reference solution.)

Also, we summarize the empirical averages and the corresponding variances of the wave function errors and the transition rates (abbreviated by TR) in Table I. We observe that the weighting factor $w$ is also crucial to get good approximations of the transition rates.

            avg. error   Variance     TR mean   Variance
w/o w.f.    0.2695       9.5780e-03   0.1772    1.0339e-02
with w.f.   0.0678       7.5759e-03   0.2386    1.2106e-02

TABLE I. Numerical error in the wave functions, and average transition rates with and without the weighting factor (w.f.). The reference transition rate is 0.2443; the inclusion of the weighting factor reduces the relative error from 27.5% to 2.33%.

E. Initial condition with different momentum

Finally, we test all three examples in Section IV A with initial conditions of different momentum; we aim to compare the FGA-SH method with reference solutions in terms of transition rates. We fix $\varepsilon = \frac{1}{64}$ and initial sample size $M = 1600$. The initial condition is concentrated on the lower surface only, and takes the following wave packet form

$$u_0(0, x) = (32\varepsilon)^{-1/4}\, e^{\frac{i}{\varepsilon} K \cdot (x - y_0)}\, e^{-\frac{1}{2\varepsilon}(x - y_0)^2},$$

where $y_0 = -1.5$ and various $K$ are used so that different momentum is considered for the initial wave packet.

For Example 1, we choose $\delta = 5\varepsilon$ and test two cases: the small global gap scenario $C_g = \frac{1}{20}$ and the large gap scenario $C_g = 1$. Note that, when $C_g = 1$, many classically forbidden hops will happen. The FGA-SH algorithm is repeated for 100 independent trials in each case, and empirical averages of the hopping rates with confidence intervals are plotted in Figure 8, together with the typical number of trajectories at the end of the simulation time. The corresponding results for Example 2 and Example 3 are plotted in Figure 9. We observe that the FGA-SH results give accurate approximations in the tests. It is also worth pointing out that the error seems to be rather uniform for different values of the initial momentum $K$. We also remark that the birth/death processes adaptively choose the number of trajectories needed, which helps to maintain the uniform accuracy over different initial momentum. From the numerical results, it can be seen that a smaller initial momentum ends up requiring more trajectories.

FIG. 8. Numerical results of the transition rate of the FGA-SH method compared with the reference solution for Example 1, as functions of the momentum $K$. Top: Transition rates for Example 1 with $C_g = \frac{1}{20}$ (smaller gap); middle: Transition rates for Example 1 with $C_g = 1$ (larger gap); bottom left: typical number of trajectories at final time for Example 1 with small $C_g$; bottom right: typical number of trajectories at final time for Example 1 with large $C_g$.

FIG. 9. Numerical results of the transition rate of the FGA-SH method compared with the reference solution for Example 2 and Example 3, as functions of the momentum $K$. Top: Transition rates for Example 2; middle: Transition rates for Example 3; bottom left: typical number of trajectories at final time for Example 2; bottom right: typical number of trajectories at final time for Example 3.

V. CONCLUSION

In this work we further develop the FGA-SH method, introduced in [31], by proposing an improved sampling algorithm using birth/death branching processes. The algorithm is validated in various numerical tests for the standard test cases for non-adiabatic dynamics.

The path integral interpretation of the fewest switches surface hopping type of algorithm may lead to further development of algorithms for non-adiabatic dynamics. Some interesting future directions include validation of the algorithm for higher dimensional problems, non-adiabatic thermal sampling using surface hopping dynamics, and also the calculation of time-correlation functions in the non-adiabatic regime.

Appendix A: Asymptotic derivation of the path integral semiclassical approximation

For completeness, we provide a brief explanation of the path integral approximation (6). Please refer to [31] for the detailed asymptotic derivation and mathematical proofs. For simplicity of notation, we assume the initial condition is on the energy surface $E_0$, and hence the trajectory starts from that energy surface.

We assume the following deterministic ansatz, referred to as the surface hopping ansatz, for the solution to (1):

$$u_{\mathrm{FGA}}(T, x) = |0\rangle \left( u^{(0)}(T, x) + u^{(2)}(T, x) + \cdots \right) + |1\rangle \left( u^{(1)}(T, x) + u^{(3)}(T, x) + \cdots \right). \tag{A1}$$

This ansatz is similar to the one proposed by Wu and Herman [13, 16, 17], which is also based on the Herman-Kluk propagators. The two approaches are, however, different in several essential ways, as elaborated in [31].

The wave function $u^{(n)}$ stands for the contribution with $n$ surface hops before time $t$, starting from surface $E_0$. In particular, for trajectories with an even number of hops, the electronic state ends at $|0\rangle$, and trajectories with an odd number of hops contribute to $|1\rangle$. This explains the linear combination in (A1). We denote by $\{t_k\}_{k=1}^{n}$ a sequence of hopping times satisfying

$$0 \le t_1 \le t_2 \le \cdots \le t_n \le T,$$

at which the trajectory switches from one energy surface to the other. The ansatz for $u^{(n)}$ is given by

$$u^{(n)}(T, x) = \frac{1}{(2\pi\varepsilon)^{3m/2}} \int dz_0 \int_{0 \le t_1 \le \cdots \le t_n \le T} \tau^{(1)} \cdots \tau^{(n)} A^{(n)} \exp\left( \frac{i}{\varepsilon} \Theta^{(n)} \right) dT_{n:1}, \tag{A2}$$

where $\tau^{(k)}$ is defined in (17) and $dT_{n:1} = dt_1 \cdots dt_n$. Note that in (A2), we integrate over all possible hopping times for $n$ hops in the time interval $[0, T]$. Given $\{t_k\}_{k=1}^{n}$ and $z_0$, the trajectory $\tilde z(t)$ for $0 \le t \le T$ is specified.

Substituting the ansatz into the matrix Schrödinger equations and carrying out asymptotic calculations as in [31], we arrive at the conclusion that the evolution of $A^{(n)}$ and $\Theta^{(n)}$ should be exactly as described in Section II C, such that $u_{\mathrm{FGA}}(T, x)$ is a good approximation to the true solution. Indeed, the asymptotic analysis can be turned into a rigorous error analysis [31], which shows that $u_{\mathrm{FGA}}(T, x)$ is an approximation of the exact solution with $O(\varepsilon)$ error in the $L^2$ metric.

Let us now link the deterministic ansatz to the path integral representation. As we discussed in Section II B, given $T > 0$, the number of jumps $n$ of the stochastic trajectory $\tilde z(t)$ for $0 \le t \le T$ is a random variable. In particular, by the properties of the associated counting process, the probability that there is no jump ($n = 0$) is given by

$$\mathbb{P}(n = 0) = e^{-\int_0^T |\tau^{(1)}|\, ds}. \tag{A3}$$

And, more generally, we have

$$\mathbb{P}(n = k) = \int_{0 \le \cdots} dT_{k:1}\, \cdots\, \tau^{(j)} \cdots$$

… for $t_1 \le t_2 \le \cdots \le t_n$, and 0 otherwise.

Using the above probabilities, we may calculate explicitly the expectation with respect to the trajectory $\tilde z$. We verify that (6) is exactly a stochastic representation of the FGA ansatz given in (A1)–(A2), where the integrals with respect to $t_1, \ldots, t_n$ are replaced by the averaging of trajectories. In particular, for the functional (13), we observe that the term $A(T) \exp\left( \frac{i}{\varepsilon} \Theta(T, x) \right)$ comes from the
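The no-jump probability (A3) can be checked numerically in a toy setting. The sketch below assumes a hypothetical hopping rate `rate(t)` that depends on time only (in FGA-SH the rate $|\tau^{(1)}|$ is evaluated along the trajectory), discretizes $[0, T]$ into small steps, lets a hop occur in each step with probability `rate * dt`, and compares the fraction of hop-free paths with $\exp\bigl(-\int_0^T \mathrm{rate}\, ds\bigr)$. The function names and the specific rate are illustrative assumptions.

```python
import numpy as np

def no_hop_fraction(rate, T, n_steps, n_paths, seed=0):
    """Fraction of simulated paths with no hop on [0, T].
    In each time step a hop occurs with probability rate(t) * dt (needs rate * dt << 1)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t_mid = (np.arange(n_steps) + 0.5) * dt
    p_hop = rate(t_mid) * dt
    hops = rng.uniform(size=(n_paths, n_steps)) < p_hop  # one row per path
    return float(np.mean(~hops.any(axis=1)))

def exact_no_hop(rate, T, n_grid=20001):
    """exp(-int_0^T rate(s) ds), with the integral from the trapezoidal rule."""
    s = np.linspace(0.0, T, n_grid)
    f = rate(s)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s))
    return float(np.exp(-integral))

# Hypothetical positive rate, used only for illustration.
rate = lambda t: 0.5 + 0.4 * np.sin(3.0 * t)
T = 2.0
simulated = no_hop_fraction(rate, T, n_steps=400, n_paths=10000)
predicted = exact_no_hop(rate, T)
print(simulated, predicted)
```

The time-stepping scheme is only first-order accurate in `dt`, so the two numbers agree up to a small discretization bias plus the Monte Carlo sampling error.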

[1] J. Tully and R. Preston, J. Chem. Phys. 55, 562 (1971).
[2] J. Tully, J. Chem. Phys. 93, 1061 (1990).
[3] S. Hammes-Schiffer and J. Tully, J. Chem. Phys. 101, 4657 (1994).
[4] P. Barbara, T. Meyer, and M. Ratner, J. Phys. Chem. 100, 13148 (1996).
[5] J. Tully, Faraday Discussions 110, 407 (1998).
[6] R. Kapral, Annu. Rev. Phys. Chem. 57, 129 (2006).
[7] N. Shenvi, S. Roy, and J. C. Tully, Science 326, 829 (2009).
[8] M. Barbatti, WIREs Comput. Mol. Sci. 1, 620 (2011).
[9] J. E. Subotnik, A. Jain, B. Landry, A. Petit, W. Ouyang, and N. Bellonzi, Annu. Rev. Phys. Chem. 67, 387 (2016).
[10] O. V. Prezhdo and P. J. Rossky, J. Chem. Phys. 107, 825 (1997).
[11] I. Horenko, C. Salzmann, B. Schmidt, and C. Schütte, J. Chem. Phys. 117, 11075 (2002).
[12] A. W. Jasper, S. N. Stechmann, and D. G. Truhlar, J. Chem. Phys. 116, 5424 (2002).
[13] Y. Wu and M. Herman, J. Chem. Phys. 123, 144106 (2005).
[14] M. Bedard-Hearn, R. Larsen, and B. Schwartz, J. Chem. Phys. 123, 234106 (2005).
[15] G. Hanna and R. Kapral, J. Chem. Phys. 122, 244505 (2005).
[16] Y. Wu and M. Herman, J. Chem. Phys. 125, 154116 (2006).
[17] Y. Wu and M. Herman, J. Chem. Phys. 127, 044109 (2007).
[18] G. Hanna, H. Kim, and R. Kapral, in Quantum Dynamics of Complex Molecular Systems, Vol. 83, edited by D. A. Micha and I. Burghardt (Springer, 2007) pp. 295–319.
[19] J. R. Schmidt, P. V. Parandekar, and J. C. Tully, J. Chem. Phys. 129, 044104 (2008).
[20] J. Subotnik and N. Shenvi, J. Chem. Phys. 134, 024105 (2011).
[21] B. Landry and J. Subotnik, J. Chem. Phys. 137, 22A513 (2011).
[22] V. Gorshkov, S. Tretiak, and D. Mozyrsky, Nat. Commun. 4 (2013).
[23] J. Subotnik, W. Ouyang, and B. Landry, J. Chem. Phys. 139, 214107 (2013).
[24] B. R. Landry, M. J. Falk, and J. E. Subotnik, J. Chem. Phys. 139, 211101 (2013).
[25] G. Hanna and R. Kapral, in Reaction Rate Constant Computations: Theories and Applications, Vol. 6, edited by K. Han and T. Chu (Royal Society of Chemistry, 2013) p. 233.
[26] A. Jain, M. F. Herman, W. Ouyang, and J. E. Subotnik, J. Chem. Phys. 143, 134106 (2015).
[27] A. Jain and J. E. Subotnik, J. Chem. Phys. 143, 134107 (2015).
[28] R. Kapral, Chem. Phys. (in press).
[29] M. F. Herman, J. Chem. Phys. 81, 754 (1984).
[30] R. Kapral and G. Ciccotti, J. Chem. Phys. 110, 8919 (1999).
[31] J. Lu and Z. Zhou, "Frozen Gaussian approximation with surface hopping for mixed quantum-classical dynamics: A mathematical justification of fewest switches surface hopping algorithms," (2016), preprint, arXiv:1602.06459.
[32] M. Herman and E. Kluk, Chem. Phys. 91, 27 (1984).
[33] K. Kay, J. Chem. Phys. 100, 4377 (1994).
[34] K. Kay, Chem. Phys. 322, 3 (2006).
[35] R. C. Grimm and R. G. Storer, J. Comput. Phys. 4, 230 (1969).
[36] R. C. Grimm and R. G. Storer, J. Comput. Phys. 7, 134 (1971).
[37] J. B. Anderson, J. Chem. Phys. 63, 1499 (1975).
[38] J. Cao and G. A. Voth, J. Chem. Phys. 100, 5106 (1994).
[39] X. Sun and W. Miller, J. Chem. Phys. 106, 6346 (1997).
[40] S. Jang and G. A. Voth, J. Chem. Phys. 111, 2371 (1999).
[41] N. Makri, Annu. Rev. Phys. Chem. 50, 167 (1999).
[42] W. H. Miller, J. Phys. Chem. A 105, 2942 (2001).
[43] I. R. Craig and D. E. Manolopoulos, J. Chem. Phys. 121, 3368 (2004).
[44] R. Lambert and N. Makri, J. Chem. Phys. 137 (2012).
[45] S. Habershon, D. Manolopoulos, T. Markland, and T. Miller III, Annu. Rev. Phys. Chem. 64 (2013).
[46] T. J. Martinez, M. Ben-Nun, and R. D. Levine, J. Phys. Chem. 100, 7884 (1996).
[47] M. Ben-Nun and T. J. Martinez, J. Chem. Phys. 108, 7244 (1998).
[48] G. A. Hagedorn, Ann. Math. 124, 571 (1986).
[49] G. Panati, H. Spohn, and S. Teufel, ESAIM Math. Model. Numer. Anal. 41, 297 (2007).
[50] S. Swart and V. Rousse, Commun. Math. Phys. 286, 725 (2009).