arXiv:2107.08633v1 [cond-mat.stat-mech] 19 Jul 2021

Stochastic dynamics without detailed balance condition connecting simple gradient method and Hamiltonian Monte Carlo

Akihisa Ichiki (1, ∗) and Masayuki Ohzeki (2, 3, 4, †)

1 Institutes of Innovation for Future Society, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan
2 Graduate School of Information Sciences, Tohoku University, Sendai 980-8579, Japan
3 Institute of Innovative Research, Tokyo Institute of Technology, Oh-okayama, Meguro-ku, Tokyo 152-8550, Japan
4 Sigma-i Co., Ltd., Konan, Minato-ku, Tokyo 108-0075, Japan

(Dated: July 20, 2021)

∗ [email protected]
† [email protected]

Sampling occupies an important position in theories of various scientific fields, and Markov chain Monte Carlo (MCMC) provides the most common technique of sampling. In the progress of MCMC, a huge number of studies have aimed at the acceleration of convergence to the target distribution. Hamiltonian Monte Carlo (HMC) is such a variant of MCMC. In the recent development of MCMC, another approach based on the violation of the detailed balance condition has attracted much attention. Historically, these two approaches have been proposed independently, and their relationship has not been clearly understood. In this paper, the two approaches are seamlessly understood in the framework of a generalized Monte Carlo method that violates the detailed balance condition. Furthermore, we propose an efficient Monte Carlo method based on our framework.

I. INTRODUCTION

Recently, sampling techniques have become of increasing importance in various fields of science and engineering. The sampling methods have been developed to numerically examine the equilibrium behaviors of complex systems such as macromolecules like proteins [1, 2], glasses [3], and glass transitions [4]. In addition to these traditional applications, with the background of the recent development of machine learning, sampling has become widely used for various purposes such as model training and its evaluation, and stochastic inference [5, 6].

The most common technique for sampling is provided by the Markov chain Monte Carlo (MCMC) method. MCMC is required to quickly sample random variables that follow an arbitrary target distribution starting from a given initial state. Since Metropolis et al. successfully introduced MCMC to investigate complex systems [7], many variants have been proposed to accelerate the convergence to the target distribution. The speed-up techniques have been constructed mainly based on two concepts. One is the extended ensemble method [8]. In the extended ensemble method, the state space is extended to a higher dimension by introducing auxiliary variables; the extended space is called the extended ensemble. Convergence is accelerated by allowing the proposal of a rapid transition path to the target distribution. The extended ensemble techniques are roughly categorized into three groups: the exchange Monte Carlo [9], the simulated tempering [10, 11], and the multicanonical method [12] with the help of the Wang-Landau algorithm [13]. Hamiltonian Monte Carlo (HMC), which introduces momenta as auxiliary variables [14], is also classified as an extended ensemble method. The alternative concept for acceleration is based on an efficient proposal of candidates for the updated state. Such efficient candidates are generated via the concept of coarse-graining. The Swendsen-Wang algorithm [15] makes efficient state updates by using clusters of spins in the Ising model. This algorithm was later extended by Wolff to be applied to the XY model [16], and is now extended to an arbitrary target distribution [17].

Recently, in addition to the two concepts mentioned above for acceleration, the possibility of violation of detailed balance has been intensively investigated [18–21]. Conventional acceleration algorithms have been developed within the range of the detailed balance condition. However, it has been shown that the violation of detailed balance accelerates convergence to the target distributions [22]. Based on this result, Ohzeki and Ichiki proposed a systematic construction of the dynamics violating detailed balance that converges to any target distribution in a continuous system [23]. The Ohzeki-Ichiki method duplicates the original system and introduces a probability current between the two systems. The driving force producing the probability current causes rotational evolution of the state in the duplicated state space. This is similar to the symplectic behavior of the Hamiltonian dynamics. In this paper, the Ohzeki-Ichiki method will be generalized, and it will be explained that the generalized Ohzeki-Ichiki method is indeed seamlessly connected to the Hamiltonian dynamics.

The generalized Ohzeki-Ichiki method provides a family of dynamics including the gradient method and HMC. To show this fact, after reviewing the gradient method in section II and the HMC in section III, we will see in section IV that the generalized Ohzeki-Ichiki method contains the gradient method and the HMC as specific limits. In section V, the generalized Ohzeki-Ichiki method is numerically compared to other methods with respect to the speed of convergence to the target distribution. Section VI is devoted to a summary and discussion.

II. GRADIENT METHOD

The simplest dynamics converging to the target distribution is given by a gradient method. The gradient method satisfies the so-called detailed balance condition. Physically, the dynamics with the detailed balance condition is relaxed to a steady state in which no macroscopic heat is generated. Such a special steady state is called an equilibrium state. By the gradient method, the Gibbs distribution

$$\pi(x) = \exp\left[-U(x)/T\right]/Z \tag{1}$$

with a partition function $Z$ is achieved with the balance between the energy gradient and the diffusion due to noise. The following dynamics gives the simplest gradient method in which the $N$-dimensional continuous state $x$ converges to the Gibbs distribution:

$$dx_i(t) = -\frac{\partial U}{\partial x_i}\,dt + \sqrt{2T}\,dW_i(t), \tag{2}$$

where $dx_i$ is the displacement of $x_i$ during an infinitesimal time $dt$, and $U(x)$ and $T$ correspond to the potential and temperature, respectively. $W_i(t)$ is a standard Wiener process that satisfies

$$\langle dW_i(t) \rangle = 0, \tag{3}$$

$$\langle dW_i(t)\,dW_j(t') \rangle = \delta_{ij}\,\delta(t-t')\,dt, \tag{4}$$

where $\delta_{ij}$ and $\delta(t)$ denote Kronecker and Dirac delta functions, respectively, and $\langle\cdot\rangle$ represents an expectation. The Fokker-Planck equation corresponding to the Langevin equation (2) is given as

$$\frac{\partial P(x,t)}{\partial t} = -\sum_i \frac{\partial}{\partial x_i}\left[-\frac{\partial U(x)}{\partial x_i} - T\frac{\partial}{\partial x_i}\right] P(x,t). \tag{5}$$

It is straightforwardly confirmed that the Gibbs distribution (1) is the steady solution satisfying the Fokker-Planck equation (5).

It is guaranteed by the H-theorem that the dynamics (2) converges to a unique steady distribution (1) as an equilibrium distribution regardless of the initial condition. Therefore, the target Gibbs distribution can be obtained by providing $U(x)$ and $T$ in the simple gradient dynamics (2). However, since the simple gradient method updates the state along the gradient of the potential $U$, the update becomes inefficient when the state is trapped in a local minimum of the potential, where the gradient vanishes. To escape from such a local minimum, noise is exploited in MCMC algorithms. However, if the potential around the local minimum is steep, it takes a long time to escape from the local minimum. In the history of MCMC studies, various techniques have been proposed to avoid such a bottleneck restricting the relaxation to the target distribution.

III. HAMILTONIAN MONTE CARLO

We have seen that, in the simple gradient method, the state is updated in the direction along the gradient of the potential, which is normal to the energy surface. With such a method, it is difficult to avoid being trapped in a local minimum of the potential. To overcome this difficulty, it has been proposed to add extra degrees of freedom to the original system to make new directions to escape from the local minimum of the potential. This idea is called an extended ensemble method. A method called Hamiltonian Monte Carlo (HMC) is one of the realizations of the extended ensemble methods. In the HMC, in addition to the original state variable $x$, a momentum $p$ is introduced as an auxiliary variable. By introducing the momentum, the dimension of the dynamical system doubles, and it becomes easier to escape from the local minimum of the potential. In other words, when the kinetic energy exceeds the energy gap between the local minimum and the local maximum of the potential $U(x)$, the state can escape from the local minimum of the potential. The basic concept of the HMC is that the Gibbs distribution

$$\pi_{x,p}(x,p) = \exp\left[-H(x,p)/T\right]/Z_{x,p}, \tag{6}$$

$$H(x,p) = U(x) + \sum_i \frac{p_i^2}{2m_i} \tag{7}$$

is invariant under the Hamiltonian dynamics

$$\dot{x}_i = \frac{p_i}{m_i}, \tag{8}$$

$$\dot{p}_i = -\frac{\partial U(x)}{\partial x_i}, \tag{9}$$

where $Z_{x,p} := \int dx\,dp\,\exp\left[-H(x,p)/T\right]$ is a partition function. Here, $m_i$ represents the mass of the $i$-th degree of freedom. The target Gibbs distribution $\pi(x) = \exp\left[-U(x)/T\right]/Z$ is acquired as a marginal distribution $\pi(x) = \int dp\,\pi_{x,p}(x,p)$ via the Gibbs distribution (6).

The algorithm of the HMC consists of the following steps.

(i) Sample the momentum $p'_i$ ($i = 1, \cdots, N$) from the Gaussian distribution

$$P_G(p'_i) = \frac{1}{\sqrt{2\pi m_i T}} \exp\left[-\frac{p_i'^2}{2 m_i T}\right]. \tag{10}$$

This procedure changes the state from $(x,p)$ to $(x,p')$.

(ii) Evolve the state for waiting time $\tau$ starting from the initial state $(x,p')$ according to the Hamiltonian dynamics (8) and (9). We denote the obtained state as $(x'',p'')$.

(iii) According to the Metropolis-Hastings rule [7, 24], the state obtained in step (ii), $(x'',p'')$, is accepted with the acceptance rate $\min\left[1, \exp\left\{-\left[H(x'',p'') - H(x,p')\right]/T\right\}\right]$. Otherwise, the state remains at $(x,p')$.

The algorithm of the HMC consists of a repetition of these three steps.

Note that the Gibbs distribution (6) is invariant under the Hamiltonian dynamics (8) and (9). In particular, the
Gaussian distribution (10) gives the steady state distribution for the momentum. In step (i), the momentum $p$ is sampled from this invariant distribution. The advantage of the HMC is that Gaussian random variables can be easily generated numerically. In step (ii), the state update is ballistic on the energy surface. Even if the state is located at a local minimum of the potential $U(x)$, it is possible to escape from it by the effect of kinetic energy. The rejection in step (iii) is exploited to eliminate nonphysical time evolution [25]. Since the total energy is conserved under the Hamiltonian dynamics, the acceptance rate is theoretically always unity. However, naive numerical calculations have been reported to show an increase in total energy. Step (iii) is introduced to eliminate this possibility and to guarantee the calculation accuracy. Thus, step (iii) is extra and can be omitted when the time evolution of the Hamiltonian dynamics is calculated with sufficiently high accuracy.

In the simple gradient method (2), the state update in the normal direction of the energy surface is ballistic. The update on the energy surface is diffusive, since the state update on the energy surface is caused only by noise. On the other hand, in the HMC, the update in the normal direction of the energy surface is caused only by the random sampling of momentum. However, the update on the energy surface is ballistic since the state evolves according to the Hamiltonian dynamics. The Gibbs distribution obeys the principle of equal a priori weights for states with equal energy. The HMC is expected to quickly satisfy the principle of equal a priori weights by the ballistic state updates on the energy surface.

IV. OHZEKI-ICHIKI METHOD

The violation of the detailed balance condition was shown to accelerate relaxation to the steady state due to the eigenvalue shift for the Fokker-Planck operator [22]. In order to systematically introduce the violation of the detailed balance condition, Ohzeki and Ichiki have proposed to duplicate the original system to introduce a rotating probability current between the two duplicated systems:

$$dx_i(t) = -\frac{\partial U(x)}{\partial x_i}\,dt + \gamma \frac{\partial U(y)}{\partial y_i}\,dt + \sqrt{2T}\,dW_i^x(t), \tag{11}$$

$$dy_i(t) = -\frac{\partial U(y)}{\partial y_i}\,dt - \gamma \frac{\partial U(x)}{\partial x_i}\,dt + \sqrt{2T}\,dW_i^y(t), \tag{12}$$

where $x_i$ and $y_i$ are degrees of freedom belonging to the original and the replicated system, respectively. $W_i^x$ and $W_i^y$ are independent standard Wiener processes:

$$\langle dW_i^x(t)\,dW_j^x(t') \rangle = \delta_{ij}\,\delta(t-t')\,dt, \tag{13}$$

$$\langle dW_i^y(t)\,dW_j^y(t') \rangle = \delta_{ij}\,\delta(t-t')\,dt, \tag{14}$$

$$\langle dW_i^x(t)\,dW_j^y(t') \rangle = 0. \tag{15}$$

This system has the steady state distribution of Gibbsian form

$$\pi_{x,y}(x,y) = \exp\left\{-\beta\left[U(x) + U(y)\right]\right\}/Z_{x,y}, \tag{16}$$

where $\beta = 1/T$, and $Z_{x,y}$ is a partition function. Then, the target distribution $\pi(x) = \exp\left[-U(x)/T\right]/Z$ is acquired as the marginal distribution $\pi(x) = \int dy\,\pi_{x,y}(x,y)$. Note that this system violates the detailed balance condition, but satisfies the balance condition

$$\sum_i \frac{\partial}{\partial x_i}\left[u_i^x\,\pi(x,y)\right] + \sum_i \frac{\partial}{\partial y_i}\left[u_i^y\,\pi(x,y)\right] = 0, \tag{17}$$

where the driving force

$$u_i^x = \gamma \frac{\partial U(y)}{\partial y_i}, \tag{18}$$

$$u_i^y = -\gamma \frac{\partial U(x)}{\partial x_i} \tag{19}$$

yields the probability current characteristic of the violation of the detailed balance. The introduction of a driving force satisfying the balance condition keeps the Gibbs distribution (16) as the steady state distribution. Although the two duplicated systems affect each other via the driving force, the steady state distribution for each system is independent.

In the Ohzeki-Ichiki dynamics (11) and (12), the same form of the potential in the original $x$-system is chosen as that in the duplicated $y$-system. However, there is arbitrariness in the choice of the potential in the $y$-system, since $y$ is an auxiliary variable and the target distribution is given as the marginal distribution $\pi(x) = \int dy\,\pi_{x,y}(x,y)$. Therefore, the potential in the $y$-system does not have to be the same as that of the $x$-system. Consider the following dynamics:

$$dx_i(t) = \left[-\frac{\partial H_x(x)}{\partial x_i} + \gamma \frac{\partial H_y(y)}{\partial y_i}\right] dt + \sqrt{2T}\,dW_i^x(t), \tag{20}$$

$$dy_i(t) = \left[-\frac{\partial H_y(y)}{\partial y_i} - \gamma \frac{\partial H_x(x)}{\partial x_i}\right] dt + \sqrt{2T}\,dW_i^y(t), \tag{21}$$

where $H_x(x) = U(x)$ is the potential in the original $x$-system, and the energy $H_y(y)$ in the $y$-system can be in the form of an arbitrary function. This system has the following steady state distribution independent of the value of $\gamma$:

$$\pi_{x,y}(x,y) = \exp\left\{-\beta\left[H_x(x) + H_y(y)\right]\right\}/Z_{x,y}. \tag{22}$$

Therefore, the target distribution is obtained as a marginal distribution $\pi(x) = \int dy\,\pi_{x,y}(x,y)$ for an arbitrary form of $H_y$.

Consider the change of variables in dynamics (20) and (21) as $\gamma = \tilde{\gamma} T$, $\tilde{t} = \tilde{\gamma} T t$. Then the dynamics

$$dx_i(\tilde{t}) = \frac{\partial H_y(y)}{\partial y_i}\,d\tilde{t}, \tag{23}$$

$$dy_i(\tilde{t}) = -\frac{\partial H_x(x)}{\partial x_i}\,d\tilde{t} \tag{24}$$
is obtained in the limit of $\tilde{\gamma} \to \infty$. Note that $H_x$ and $H_y$ play the roles of potential and kinetic energies in this dynamics, respectively. In fact, the choice of $H_y(y) = \sum_i y_i^2/2m_i$ reproduces the Hamiltonian dynamics (8) and (9). In dynamics (20) and (21), the driving force proportional to $\gamma$ causes the violation of the detailed balance condition. The case of $\gamma = 0$ corresponds to the simple gradient method. On the other hand, the dynamics in the limit $\gamma \to \infty$ corresponds to the Hamiltonian dynamics. Thus, it is concluded that the dynamics (20) and (21) seamlessly connects the gradient method and the Hamiltonian dynamics that is the basis of the HMC.
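The claim that the steady state (22) does not depend on $\gamma$ can be checked directly from the Fokker-Planck equation associated with (20) and (21). The following is a short sketch, assuming $H_x$ and $H_y$ are smooth and confining so that (22) is normalizable; the current components $J_i^x$ and $J_i^y$ are notation introduced here for illustration only.

```latex
% Fokker-Planck equation for the dynamics (20) and (21), written with currents
\begin{align}
\frac{\partial P}{\partial t}
  &= -\sum_i \left[ \frac{\partial J_i^x}{\partial x_i}
                  + \frac{\partial J_i^y}{\partial y_i} \right], \\
J_i^x &= \left( -\frac{\partial H_x}{\partial x_i}
               + \gamma \frac{\partial H_y}{\partial y_i} \right) P
        - T \frac{\partial P}{\partial x_i}, \\
J_i^y &= \left( -\frac{\partial H_y}{\partial y_i}
               - \gamma \frac{\partial H_x}{\partial x_i} \right) P
        - T \frac{\partial P}{\partial y_i}.
\end{align}
% Insert P = \pi_{x,y} \propto \exp[-\beta (H_x + H_y)] with \beta = 1/T.
% Since T \partial_{x_i} \pi_{x,y} = -(\partial_{x_i} H_x) \pi_{x,y},
% the gradient and diffusion parts cancel and only the driving part remains:
\begin{align}
J_i^x = \gamma \frac{\partial H_y}{\partial y_i}\, \pi_{x,y}, \qquad
J_i^y = -\gamma \frac{\partial H_x}{\partial x_i}\, \pi_{x,y}.
\end{align}
% The remaining current is divergence-free, so \pi_{x,y} is stationary:
\begin{align}
\sum_i \left[ \frac{\partial J_i^x}{\partial x_i}
            + \frac{\partial J_i^y}{\partial y_i} \right]
 &= \gamma \sum_i \left[ \frac{\partial H_y}{\partial y_i}
                         \frac{\partial \pi_{x,y}}{\partial x_i}
                       - \frac{\partial H_x}{\partial x_i}
                         \frac{\partial \pi_{x,y}}{\partial y_i} \right] \\
 &= -\beta\gamma \sum_i \left[ \frac{\partial H_y}{\partial y_i}
                               \frac{\partial H_x}{\partial x_i}
                             - \frac{\partial H_x}{\partial x_i}
                               \frac{\partial H_y}{\partial y_i} \right] \pi_{x,y}
  = 0 .
\end{align}
```

The steady current is nonzero for $\gamma \neq 0$, which is the violation of detailed balance, while its vanishing divergence is the balance condition of the form (17).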

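The interpolation between the two limits can be explored numerically. The following is a minimal sketch, not the authors' implementation: it integrates (20) and (21) in one dimension with a plain Euler-Maruyama step (an assumption made here for brevity), using an illustrative double-well potential $U(x) = x^4/4 - x^2/2$ for $H_x$ and a harmonic $H_y(y) = y^2/2m$. The names `grad_U` and `simulate` and all default parameter values are hypothetical.

```python
import numpy as np

def grad_U(x):
    """Gradient of the illustrative double-well potential U(x) = x**4/4 - x**2/2."""
    return x**3 - x

def simulate(gamma, T=1.0, m=1.0, dt=1e-4, n_steps=30000, n_runs=500, seed=0):
    """Euler-Maruyama integration of dynamics (20)-(21) in one dimension.

    H_x(x) = U(x) and H_y(y) = y**2 / (2*m) (harmonic choice).
    gamma = 0 decouples x from y, so x follows the simple gradient method (2).
    Returns the final x and y of n_runs independent trajectories
    started at x = 1, y = 0.
    """
    rng = np.random.default_rng(seed)
    x = np.ones(n_runs)
    y = np.zeros(n_runs)
    noise = np.sqrt(2.0 * T * dt)  # standard deviation of sqrt(2T) dW over one step
    for _ in range(n_steps):
        gx = grad_U(x)  # dH_x/dx
        gy = y / m      # dH_y/dy
        # Both updates use the state at the beginning of the step.
        x_new = x + (-gx + gamma * gy) * dt + noise * rng.standard_normal(n_runs)
        y_new = y + (-gy - gamma * gx) * dt + noise * rng.standard_normal(n_runs)
        x, y = x_new, y_new
    return x, y

if __name__ == "__main__":
    for g in (0.0, 10.0):
        x, y = simulate(gamma=g)
        print(f"gamma={g}: <x>={x.mean():.3f}, var(y)={y.var():.3f}")
```

Under the steady state (22) the auxiliary variable is Gaussian with variance $mT$ for any $\gamma$, which gives a simple sanity check on the output. A higher-order integrator would reduce the discretization bias of this sketch.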

V. HYBRID USE OF GRADIENT METHOD AND HAMILTONIAN DYNAMICS

In the previous section, we have introduced the dynamics which incorporates the simple gradient method and the Hamiltonian dynamics. By the simple gradient method, the state update on the energy surface is realized diffusively, and it takes a long time to satisfy the principle of equal a priori weights. On the other hand, in the HMC, the state update on the energy surface is so ballistic that the principle of equal a priori weights is quickly satisfied. However, since the total energy is conserved under the Hamiltonian dynamics, transitions between energy surfaces are prohibited. For this reason, the HMC requires resampling of momentum from the Gaussian distribution (10) which is realized in the steady state.

Consider the case of finite $\gamma$ in the dynamics (20) and (21) with harmonic $H_y$ that connects the simple gradient method and the Hamiltonian dynamics. In such a dynamics, the state update on the energy surface, which has been a bottleneck of relaxation to the steady state in the simple gradient method, becomes ballistic. In addition, the effects of gradients and noise automatically enhance transitions between energy surfaces. Therefore, it is not required to resample the momentum, unlike the case of conventional HMC.

To demonstrate the performance of our proposed method, i.e., the dynamics with harmonic $H_y$, we first deal with a toy model of a one-dimensional double-well potential:

$$U(x) = \frac{1}{4}x^4 - \frac{1}{2}x^2. \tag{25}$$

The initial condition is set to be in one of the potential wells at $x = 1$. Thus, the system must go beyond the potential hill at $x = 0$ to realize the steady state. In our numerical calculations, we set the temperature as $T = 1.0$. The infinitesimal time-step is set to be $dt = 1.0 \times 10^{-4}$. We compare the performance of the simple gradient method, the conventional HMC, the conventional Ohzeki-Ichiki method, namely, the dynamics (20) and (21) with $H_x(x) = H_y(x) = U(x)$, and the proposed dynamics (20) and (21) with $H_x(x) = U(x)$ and $H_y(y) = y^2/2m$. The time evolution of the Langevin equations is calculated by applying the Heun scheme [26]. The time evolution of the Hamiltonian dynamics in the HMC is calculated using the leapfrog method [25]. Other parameters are set as follows. In the HMC, the particle mass is set as $m = 1$. In the algorithm of the HMC, it is necessary to evolve the Hamiltonian dynamics by a certain waiting time $\tau_{\mathrm{wait}}$ before resampling the momentum. We set the waiting time as $\tau_{\mathrm{wait}} = 0.01$. In the Ohzeki-Ichiki method, the parameter $\gamma$ characterizing the violation of the detailed balance condition is set as $\gamma = 10.0$. In the generalized Ohzeki-Ichiki method where $H_y(y)$ is harmonic, the particle mass is set as $m = 1.0$. The value of $\gamma = 10.0$ is also chosen in this dynamics. Figure 1 shows the numerical results averaged over $N_{\mathrm{sample}} = 1000$ independent runs taking the time average during $\Delta t = 0.1$. The Ohzeki-Ichiki method shows faster convergence to the steady state than the simple gradient method because of the detailed balance violation. Furthermore, it can be seen that the convergence of the proposed dynamics with harmonic potential for $y$ is faster than the Ohzeki-Ichiki dynamics, since the potential of the $y$-system is complicated in the conventional Ohzeki-Ichiki method. In the HMC, relaxation depends on the waiting time $\tau_{\mathrm{wait}}$. The larger $\tau_{\mathrm{wait}}$ is, the fewer Monte Carlo steps are required for convergence. However, as seen in Fig. 1, it requires a longer calculation time, which is given by the product of $\tau_{\mathrm{wait}}$ and the Monte Carlo steps in HMC, than other methods. As seen in the previous section, the timescale conversion in the Ohzeki-Ichiki dynamics reproduces the Hamiltonian dynamics. Due to the limit of this timescale conversion, it is difficult to make a direct comparison between the HMC and the Ohzeki-Ichiki method. In fact, in the limit of $\tilde{\gamma} \to \infty$, $d\tilde{t}$ corresponding to the infinitesimal time step $dt$ diverges. This means that one Monte Carlo step in the Ohzeki-Ichiki method should be compared with the result of the HMC with the limit of long waiting time $\tau_{\mathrm{wait}} \to \infty$.

FIG. 1. (Color online) Time evolution of the state $\langle x \rangle$ (left panel) and the internal energy (right panel). The black cross marks, blue dots, green triangles, and red circles indicate the results of the simple gradient method, conventional HMC, conventional Ohzeki-Ichiki method, and the proposed method, respectively. The error bars indicate variances.

We also evaluate the integrated auto-correlation time

$$\tau_{\mathrm{int}} := \int_0^\infty dt'\, \frac{\langle x(t)\,x(t+t') \rangle - \langle x \rangle^2}{\langle x^2 \rangle - \langle x \rangle^2}.$$

The integrated auto-correlation time for each dynamics is evaluated by the empirical average after the convergence to the steady state. We obtain $\tau_{\mathrm{int}} = 2.00$ for the simple gradient method, which corresponds to the dynamics with $\gamma = 0$, $\tau_{\mathrm{int}} = 0.19$ for the conventional Ohzeki-Ichiki method with $\gamma = 10.0$, and $\tau_{\mathrm{int}} = 0.14$ for the proposed hybrid use of the gradient method and the Hamiltonian dynamics with $\gamma = 10.0$, respectively. In addition to the convergence of $\langle x \rangle$ and $\langle U(x) \rangle$ shown in Fig. 1, these results imply that the proposed method leads to a significant reduction of the relaxation time to the steady state.

To demonstrate the removal of the critical slowing down in our method, we next deal with the two-dimensional XY model on a square lattice:

$$U(x) = -\sum_{\langle i,j \rangle} \cos(x_i - x_j), \tag{26}$$

where the sum is taken over all pairs of the nearest neighboring sites. The two-dimensional XY model exhibits the Kosterlitz-Thouless transition at $T_c = 0.89213(10)$ [27]. At temperatures below $T_c$, the magnetization $m = \sum_{i=1}^N \sin x_i / N$ exhibits slow relaxation following a power law decay [28]. Since the critical slowing down is a bottleneck for convergence to the targeted steady state, it is preferred to avoid such slowing down behaviors.

We compare the convergence performance of the gradient method, the Ohzeki-Ichiki method, and the proposed method, in which the potential of the $y$-system is given by $H_y(y) = \sum_{i=1}^N y_i^2/2m_i$. In our numerical calculations, the number of spins is set to be $N = 10 \times 10$. According to the finite size correction, the effective critical temperature for this system is evaluated as $T_c^{\mathrm{eff}} \sim 0.975$ [29]. To demonstrate the removal of the critical slowing down, the temperature is set to be $T = 0.5$ [...]

FIG. 2. (Color online) Time evolution of the magnetization (left panel) and internal energy (right panel) of the XY model. The black cross marks, green triangles, and red circles indicate the results of the gradient method, conventional Ohzeki-Ichiki method, and the proposed method, respectively. The error bars indicate variances.

[...]ment is achieved by the proposed method. In the proposed method, both magnetization and internal energy rapidly converge to the steady state values, and the critical slowing down appears to be eliminated.

VI. SUMMARY AND DISCUSSION

We have seen that the Ohzeki-Ichiki method seamlessly connects the simple gradient method with the Hamiltonian dynamics. The Hamiltonian dynamics corresponds to a specific limit of the generalized Ohzeki-Ichiki method. The HMC does not satisfy the detailed balance condition in general. In the HMC, the candidate of the updated state depends on the waiting time, which defines a leapfrog operator $\hat{L}$. Even if the updated state $(x',p') = \hat{L}(x,p)$ is proposed starting from the current state $(x,p)$ by the leapfrog operator, the reverse transi[...]

[1] C. Schütte, A. Fischer, W. Huisinga, and P. Deuflhard, A direct approach to conformational dynamics based on hybrid Monte Carlo, J. Comput. Phys. 151, 146–168 (1999).
[2] A. Mitsutake, Y. Sugita, and Y. Okamoto, Generalized-ensemble algorithms for molecular simulations of biopolymers, Biopolymers 60, 96 (2001).
[3] A. T. Ogielski, Dynamics of three-dimensional Ising spin glasses in thermal equilibrium, Phys. Rev. B 32, 7384 (1985).
[4] R. Yamamoto and W. Kob, Replica-exchange molecular dynamics simulation for supercooled liquids, Phys. Rev. E 61, 5473 (2000).
[5] C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, Machine Learning 50, 5 (2003).
[6] D. J. C. MacKay, Information Theory, Inference & Learning Algorithms (Cambridge University Press, USA, 2002).
[7] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Equation of state calculations by fast computing machines, The Journal of Chemical Physics 21, 1087 (1953).
[8] Y. Iba, Extended ensemble Monte Carlo, International Journal of Modern Physics C 12, 623 (2001).
[9] K. Hukushima and K. Nemoto, Exchange Monte Carlo method and application to spin glass simulations, Journal of the Physical Society of Japan 65, 1604 (1996).
[10] E. Marinari and G. Parisi, Simulated tempering: A new Monte Carlo scheme, Europhysics Letters (EPL) 19, 451 (1992).
[11] A. P. Lyubartsev, A. A. Martsinovski, S. V. Shevkunov, and P. N. Vorontsov-Velyaminov, New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles, The Journal of Chemical Physics 96, 1776 (1992).
[12] B. A. Berg and T. Neuhaus, Multicanonical ensemble: A new approach to simulate first-order phase transitions, Phys. Rev. Lett. 68, 9 (1992).
[13] F. Wang and D. P. Landau, Efficient, multiple-range random walk algorithm to calculate the density of states, Phys. Rev. Lett. 86, 2050 (2001).
[14] S. Duane, A. Kennedy, B. J. Pendleton, and D. Roweth, Hybrid Monte Carlo, Physics Letters B 195, 216 (1987).
[15] R. H. Swendsen and J.-S. Wang, Nonuniversal critical dynamics in Monte Carlo simulations, Phys. Rev. Lett. 58, 86 (1987).
[16] U. Wolff, Collective Monte Carlo updating for spin systems, Phys. Rev. Lett. 62, 361 (1989).
[17] A. Barbu and S.-C. Zhu, Generalizing Swendsen-Wang to sampling arbitrary posterior probabilities, IEEE Transactions on Pattern Analysis and Machine Intelligence 27, 1239 (2005).
[18] H. Suwa and S. Todo, Markov chain Monte Carlo method without detailed balance, Phys. Rev. Lett. 105, 120603 (2010).
[19] K. S. Turitsyn, M. Chertkov, and M. Vucelja, Irreversible Monte Carlo algorithms for efficient sampling, Physica D: Nonlinear Phenomena 240, 410 (2011).
[20] H. C. Fernandes and M. Weigel, Non-reversible Monte Carlo simulations of spin models, Computer Physics Communications 182, 1856 (2011), Computer Physics Communications Special Edition for Conference on Computational Physics Trondheim, Norway, June 23-26, 2010.
[21] Y. Sakai and K. Hukushima, Dynamics of one-dimensional Ising model without detailed balance condition, Journal of the Physical Society of Japan 82, 064003 (2013), https://doi.org/10.7566/JPSJ.82.064003.
[22] A. Ichiki and M. Ohzeki, Violation of detailed balance accelerates relaxation, Phys. Rev. E 88, 020101 (2013).
[23] M. Ohzeki and A. Ichiki, Langevin dynamics neglecting detailed balance condition, Phys. Rev. E 92, 012105 (2015).
[24] W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57, 97 (1970).
[25] E. Hairer, C. Lubich, and G. Wanner, Geometric numerical integration illustrated by the Störmer–Verlet method, Acta Numerica 12, 399–450 (2003).
[26] P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations (Springer Berlin Heidelberg, 1992).
[27] P. Olsson, Monte Carlo analysis of the two-dimensional XY model. II. Comparison with the Kosterlitz renormalization-group equations, Phys. Rev. B 52, 4526 (1995).
[28] H. Nishimori and G. Ortiz, Elements of Phase Transitions and Critical Phenomena (Oxford University Press, 2010).
[29] Y. Komura and Y. Okabe, Large-scale Monte Carlo simulation of two-dimensional classical XY model using multiple GPUs, Journal of the Physical Society of Japan 81, 113001 (2012), https://doi.org/10.1143/JPSJ.81.113001.
[30] M. Okudo and H. Suzuki, Hamiltonian Monte Carlo with explicit, reversible, and volume-preserving adaptive step size control, JSIAM Letters 9, 33 (2017).
[31] M. D. Hoffman and A. Gelman, The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo, Journal of Machine Learning Research 15, 1593 (2014).
[32] A. Ichiki and M. Ohzeki, Full-order fluctuation-dissipation relation for a class of nonequilibrium steady states, Phys. Rev. E 91, 062105 (2015).
[33] F. Coghi, R. Chetrite, and H. Touchette, Role of current fluctuations in nonreversible samplers, Physical Review E 103, 10.1103/physreve.103.062142 (2021).