Skewed Thermodynamic Geometry and Optimal Free Energy Estimation

Steven Blaber∗ and David A. Sivak† Dept. of Physics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada (Dated: January 5, 2021) Free energy differences are a central quantity of interest in physics, chemistry, and biology. We develop design principles that improve the precision and accuracy of free energy estimators, which has potential applications to screening for targeted drug discovery. Specifically, by exploiting the connection between the work of time-reversed protocol pairs, we develop near-equilibrium approximations for moments of the excess work and analyze the dominant contributions to the precision and accuracy of standard nonequilibrium free-energy estimators. Within linear response, minimum-dissipation protocols follow geodesics of the Riemannian metric induced by the Stokes’ friction tensor. We find the next-order contribution arises from the rank-3 supra-Stokes’ tensor that skews the geometric structure such that minimum-dissipation protocols follow geodesics of a generalized cubic Finsler metric. Thus, near equilibrium the supra-Stokes’ tensor determines the leading-order contribution to the bias of bidirectional free-energy estimators.

I. INTRODUCTION (equaling the dissipation), and hence the error, can be reduced by following a different path through control- parameter space and varying the velocity along the path Free energy differences determine the equilibrium 14–16 phases of thermodynamic systems as well as the relative while keeping the protocol duration fixed. Near equi- reaction rates and binding affinities of chemical species.1 librium, this can be mapped onto the thermodynamic- Computational and experimental techniques which accu- geometry framework: for the mean-work estimator, the rately and precisely predict free energy differences are bias and from finite-time protocols is improved therefore highly desirable. One important application is by following geodesics of a thermodynamic metric, the in pharmaceutical drug discovery, where computation of force-variance (FV) metric, a Riemannian metric de- fined by the matrix of conjugate-force fluctu- free energy differences can aid in the identification and 17–21 design of ligands for targeted protein binding.2–5 Current ations; similarly, for BAR, the variance (but not the methods rely primarily on costly and time-consuming ex- bias) is minimized if the protocol follows geodesics of the FV metric.22 This has been used to improve the precision perimentation, which can be reduced through screening 23–26 with efficient computational techniques.1–8 of calculated binding potentials of mean force. Nonequilbrium measurements of free energy differences The FV metric only minimizes the variance of the are often estimated by measuring the work incurred from mean-work estimator and BAR if the relaxation time a protocol (a dynamic variation in control parameters) is independent of the control parameters. The friction- 27,28 that drives the system between control-parameter end- tensor metric, the product of the covariance and inte- points corresponding to each state. A unidirectional es- gral relaxation time of conjugate-force fluctuations (19), timator calculates the free energy difference using only is more general than the FV and provides a relatively sim- the work done by a protocol driving from initial to final ple prescription for reducing excess work in more general states (forward protocol), while a bidirectional estimator near-equilibrium processes, where minimum-dissipation also uses a protocol that drives from final to initial states protocols follow geodesics of the Riemannian metric in- (reverse protocol). The mean-work estimator9 equates duced by the friction tensor. For common unidirectional the mean work with the free energy difference and yields estimators near equilibrium, the bias and variance are a biased estimate for any non-quasistatic (finite-speed) proportional to the first (mean) of the excess protocol. The Jarzynski estimator for free energy differ- work, and hence minimum-dissipation protocols defined ences (derived from the Jarzynski equality relating the by the friction tensor minimize both the bias and vari- exponentially averaged work to the free energy change10) ance. is unbiased for a large number of samples. The mean- We show that for bidirectional estimators, the variance work and Jarzynski estimators can be used as either uni- is proportional to the sum of the second moments of the

arXiv:2009.14354v2 [cond-mat.stat-mech] 2 Jan 2021 or bidirectional estimators; however, if bidirectional data excess work from forward and reverse protocols (26a), is available the maximum log-likelihood estimate is Ben- while the bias is proportional to the difference of the nett’s acceptance ratio (BAR)11 which yields (for a large first moments (27a).29 With this in mind, we extend the number of samples) the minimum variance of any unbi- thermodynamic-geometry framework to higher-order mo- ased estimator.12,13 ments of the excess work and beyond the leading-order Regardless of the estimator, non-quasistatic control- friction-tensor approximation. We find that for bidi- parameter protocols result in excess work (work in ex- rectional estimators near equilibrium, minimum-variance cess of the free energy difference), which increases the protocols follow geodesics of the Riemannian metric in- bias and variance (decreases the accuracy and precision, duced by the friction tensor, while minimum-bias pro- respectively) of free energy estimates. This excess work tocols follow geodesics of a cubic Finsler metric. For 2 the simple model system of a Brownian particle in a In analogy with fluid dynamics, this rank-two tensor is quadratic trap with time-dependent stiffness (a breathing the Stokes’ friction, since it produces a drag force that de- harmonic trap), the minimum-variance and minimum- pends linearly on velocity. We denote the Stokes’ friction bias protocols can improve variance by a factor of 3 4 with superscript (1) since it is the leading-order contri- and bias by over a factor of 10 (Fig. 2). − bution to dissipation.27 Following parallel arguments, we approximate the third moment of the excess-work distribution as II. DERIVATION 3 d Wex Λ 3 (2) ˙ ˙ ˙ h i 2 ζijk[λ(t)]λi(t)λj(t)λk(t) , (6) Consider a system in the canonical ensemble, in dt ≈ β thermal equilibrium with a heat bath at temper- for the rank-three tensor ature T . The probability distribution over mi- crostates x at control parameters λ is π(x λ) = ζ(2)[λ(t)] (7) β F λ E x, λ E x, λ | ijk ≡ exp [ ( ) ( )], with energy ( ) and free en- Z ∞ Z ∞ − P 2 00 000 00 000 ergy F (λ) kBT ln x exp [ βE(x, λ)]. Here β β dt dt δf (0)δf (t )δf (t ) . −1 ≡ − − ≡ i j k λ(t) (kBT ) for Boltzmann’s constant kB. We define the − 0 0 h i work done in a single realization of an external agent changing the control parameters λ according to a proto- (The factor of three in (6) results from grouping the in- t dex permutations ijk, jik, kij into one term, using the col Λ as W R dt0 f λ˙ (t0), which implies the average 0 i i invariance of the{ sum under exchange} of indices, e.g., excess work≡ (W − W ∆F ) is ex (2) ˙ ˙ ˙ (2) ˙ ˙ ˙ ≡ − ζijkλiλjλk = ζjikλiλjλk.) We call the rank-three tensor Z t [Eq. (7)] the supra-Stokes’ tensor and index it by super- 0 ˙ 0 Wex Λ = dt δfi Λλi(t ) , (1) script (2), as it corresponds to the leading-order correc- h i − 0 h i tion to dissipation beyond the Stokes’ friction (15). where we adopt the Einstein summation convention of For fourth and higher moments, Appendix B shows implied summation over repeated indices. A dot denotes that the integration bounds must remain finite since the the time derivative λ˙ i dλi/dt, fi ∂λ U is the gen- n-time covariance functions do not decay to zero, so there ≡ ≡ − i eralized force conjugate to λi, and δfi fi fi eq is the is no clear analogy to frictional drag forces. Nevertheless, difference from the equilibrium average.≡ Angle−h i brackets we can still approximate them by Λ denote an average over the nonequilibrium ensem- h· · · i n n ble of system responses to control-parameter protocol Λ. d Wex Λ (n−1) Y h i n ν ···ν [λ(t), t] λ˙ ν (t) , (8) The time derivative of the second moment of the dt ≈ C 1 n i excess-work distribution is i=1 where we use index notation ν , ν , ν instead of i, j, k, d W 2 Z t 1 2 3 ex Λ ˙ 0 0 ˙ 0 and define integral n-time covariance··· functions h i = 2λi(t) dt δfi(t)δfj(t ) Λ λj(t ) . (2) dt 0 h i (n−1) ν ···ν [λ(t), t] (9) For a sufficiently slow protocol, we replace the nonequi- C 1 n ≡ n t * n + librium average Λ with the equilibrium average Y Z Y h· · · i ( β)n dt δf (0)δf (t ) . λ(t) at fixed control parameters λ(t): i ν1 νj j − 0 h· · · i i=2 j=2 λ(t) 2 Z t d Wex Λ 00 00 h i 2λ˙ i(t)λ˙ j(t) dt δfi(0)δfj(t ) , (3) This implies that higher-order moments of the excess dt ≈ h iλ(t) 0 work are higher order in control-parameter velocity and where we used the stationarity of the equilibrium aver- are therefore smaller for slow protocols. The approxima- age, defined t00 t0 t, and assumed smooth protocols tions of (4), (6), and (8) are the leading-order contribu- to expand the control-parameter≡ − velocity to zeroth or- tions to each moment of the excess work. der, λ˙ (t t00) λ˙ (t). Finally, we assume correlations in To derive the next-order contribution we exploit the the conjugate− ≈ forces relax quickly relative to the proto- connection between time-reversed protocols through the 30,31 col duration and replace the integration bound t with Crooks relation, which constrains the probability of (Appendix B), simplifying the approximation to ∞ forward and reverse work measurements as

−βWex 2 PΛ(Wex)e = P † ( Wex) . (10) d Wex Λ 2 (1) Λ h i ζ [λ(t)]λ˙ i(t)λ˙ j(t) , (4) − dt ≈ β ij Λ† is the time-reversed protocol, starting at equilibrium for the friction tensor in the end state of the forward protocol. Integrating over W produces Jarzynski’s equality10 Z ∞ ex (1) 00 00 ζij [λ(t)] β dt δfi(0)δfj(t ) λ(t) . (5) −βWex ≡ h i e Λ = 1 , (11) 0 h i 3 while first multiplying by Wex then integrating leads to (Eq. (13) of Ref. 27) and is positive for all protocols. The Stokes’ friction (5) is an autocovariance function, −βWex Wexe Λ = Wex Λ† . (12) h i −h i (1) (1) ζ = δfiδfj τ , (19) Taylor expanding the exponential in (11) and (12) to ij h iλ(t) ◦ ij third order gives the Hadamard (entry-by-entry) product, , of the 1 2 1 2 3 δf δf ◦ Wex Λ β W Λ β W Λ (13a) conjugate-force covariance i j λ(t) and integral re- h i ≈ 2 h exi − 6 h exi h i 2 1 2 3 laxation time Wex Λ† Wex Λ + β W Λ β W Λ . (13b) h i ≈ −h i h exi − 2 h exi Z ∞ 00 (1) 00 δfi(0)δfj(t ) λ(t) According to (8), near equilibrium the higher-order mo- τij dt h i . (20) ≡ δfiδfj λ(t) ments of the excess work are higher order in control- 0 h i parameter velocity and therefore can be neglected for The Stokes’ friction is largest when the conjugate-force slow protocols. The leading-order contributions to the fluctuations are largest (large covariance) and most per- sum and difference of the excess work are sistent (long relaxation time). In contrast to the posi- 2 tive contribution from the Stokes’ friction, the contribu- Wex Λ + Wex † β W Λ (14a) h i h iΛ ≈ h exi tion from the supra-Stokes’ tensor (second term in (15)) 1 2 3 Wex Λ Wex Λ† 6 β Wex Λ . (14b) changes sign under time reversal because it is cubic in h i − h i ≈ h i control-parameter velocity. The supra-Stokes’ tensor (7) Differentiating (14a) and (14b) with respect to time, sub- stituting in (4) and (6) respectively, then adding the two (2) (2) ζ = δfiδfjδfk λ(t) τ (21) equations gives ijk −h i ◦ ijk   is not positive semidefinite, and is the Hadamard product d Wex Λ (1) 1 (2) ˙ ˙ ˙ of the unnormalized coskewness δfiδfjδfk (related h i ζij + ζijkλk λiλj , (15) λ(t) dt ≈ 4 to as covariance is relatedh to variance)i and the where henceforth we drop the explicit dependence of the integral double relaxation time friction and control-parameter velocity on λ and t. Dif- (2) τ (22) ferentiating (13a) and substituting (15) and (6) gives ijk ≡ ∞ ∞ 00 000 Z Z δfi(0)δfj(t )δfk(t ) λ(t) 2   t00 t000 . d Wex Λ 2 (1) 3 (2) d d h i h i ζ + ζ λ˙ k λ˙ iλ˙ j . (16) 0 0 δfiδfjδfk λ(t) dt ≈ β ij 4 ijk h i For a protocol with positive velocity, the contribution To derive an approximation for higher-order moments, n to dissipation from the supra-Stokes’ tensor quantifies we follow parallel arguments. Multiplying (10) by Wex the increase (decrease) in excess work from negatively and integrating over Wex, then either adding or subtract- n (positively) skewed conjugate-force fluctuation. ing ( Wex) † , the leading-order contributions to the h − iΛ The leading-order friction-tensor approximation (first sum and difference for n > 2 are term in (15)) endows the control-parameter space with a n n n Riemannian metric, such that minimum-work protocols W Λ + ( Wex) † 2 ( Wex) Λ (17a) h exi h − iΛ ≈ h − i n n n+1 follow geodesics of the Stokes’ friction tensor. With the W Λ ( Wex) † β W Λ , (17b) h exi − h − iΛ ≈ h ex i addition of the next-order contribution (7), the excess which implies work can be expressed as n Z ∆t n−1 d Wex Λ tot ˙ ˙ β h i (18) Wex Λ dt ζij λiλj , (23) dt ≈ h i ≈ 0   n (n−1) n + 1 (n) Y for total friction tensor n ν ···ν + ν ···ν λ˙ ν λ˙ ν . C 1 n 2 C 1 n+1 n+1 i i=1 tot (1) 1 (2) ζ ζ + ζ λ˙ k (24) ij ≡ ij 4 ijk Equations (15), (16), and (18) are the central results of this article, which have applications to designing that explicitly depends on the protocol velocity. minimum-dissipation and optimal free energy estimation Minimum-dissipation protocols follow geodesics of the ζtot protocols. generalized cubic Finsler metric ij , an extension of the (1) Riemannian metric ζij . A Finsler metric defines a space where distances depend not only on positions (points) III. NEXT-ORDER CONTRIBUTION TO but also on directions (tangent vectors). Nevertheless, EXCESS WORK the usual concepts of length, curvature, and geodesics from Riemannian geometry generalize, and there are The approximation of (15) can improve near- standard procedures for calculating geodesics32 (see Ap- equilibrium estimates of the excess work. The leading- pendix C). For small control-parameter velocities, the to- order term is the usual linear-response approximation tal friction is positive semi-definite; however, for large 4 a b IV. PRECISION AND ACCURACY OF FREE 1 0.5 1 0.25 ENERGY ESTIMATES

0 0.25 0 0.125 We quantify the precision of a free energy estimator by its variance. In the limit of many samples, the expected CP velocity 12

force variance force variance of any unbiased estimator ∆F is bounded by

Stokes’ friction d -1 0 -1 0 1 1.5 2 1 1.5 2  2 c d   2 −1 −1 4 δ∆dF [1 + cosh(βWex)] , (25) 1 0.1 1 0.25 ≥ N h i − N

where for simplicity we assume an equal number of for- 0 0 0 0.125 ward and reverse work measurements, and the average is over a total of N work measurements. BAR sat- CP velocity total friction h·urates · · i this bound for large N. This bound demonstrates -1 -0.1 -1 0 an explicit connection between the minimum variance of 1 1.5 2 1 1.5 2 supra-Stokes’ contribution free energy estimators and the excess work. The variance control parameter (CP) control parameter (CP) is minimized at 0 when the excess-work distribution is a delta function at W = 0 (achieved in the quasistatic FIG. 1: Friction as a function of control-parameter ve- ex locity λ˙ ∗ and control parameter λ∗ for the breath- limit). For any finite-time protocol, the average excess ing harmonic trap. Scaled (a) force variance hδf 2i∗ ≡ work is positive. 2 2 2 (1)∗ 2 3 (1) Assuming small excess work, we expand the variance β ki hδf i, (b) Stokes’ friction ζ ≡ β ki ζ /γ, (c) supra- 1 ˙ ∗ (2)∗ 1 2 3 ˙ (2) Stokes’ contribution 4 λ ζ ≡ 4 β ki λζ /γ, and (d) total  2 2 2 tot∗ 2 3 tot   Wex Λ + Wex Λ† friction ζ ≡ β ki ζ /γ, for initial spring constant ki and δ∆dF h i h i (26a) damping coefficient γ. The control parameter is the trap stiff- ≈ 2N ∗ Z ∆t ness, given in dimensionless form as λ ≡ k/ki with velocity 2 (1) ˙ ∗ 2 dt ζ λ˙ iλ˙ j , (26b) λ ≡ γk/ki . Geodesics (solid curves) minimize the magni- ≈ βN ij tude of the corresponding metric’s contribution to the excess 0 work, with distinct curves representing different average pro- where the second line follows from (16). Equation (26a) tocol velocities. also holds for very few samples, since then BAR is equiv- alent to the average of the sum of the forward and reverse work measurements.29 Thus the protocol designed to re- control-parameter velocities the approximation breaks (1) duce the variance follows geodesics of ζ , and for one- down and no longer guarantees positive semidefiniteness, ij (1)−1/2 leading to the unphysical possibility of negative excess dimensional control proceeds at velocity λ˙ ζ . ∝ work. Unlike the variance, the protocol that maximizes To illustrate some general properties of the friction, the accuracy (minimum-bias) is different for unidirec- Fig. 1 shows, for the model system of a breathing har- tional and bidirectional estimators. For unidirectional monic trap (Brownian particle in a harmonic trap where Jarzynski and mean-work estimators, near equilibrium the control parameter is the time-dependent stiffness, see the minimum-bias protocol is simply the minimum- Appendix A), the (a) force variance, (b) Stokes’ friction, dissipation protocol (protocol that minimizes (15)) and (c) supra-Stokes’ contribution, and (d) total friction. The therefore to leading order is optimized by the same pro- force variance and Stokes’ friction are independent of tocol that minimizes (26b). For BAR, for small excess control-parameter velocity. In general, the force vari- work (or few samples), the bias instead is proportional ance differs from the Stokes’ friction whenever integral to the difference between the forward and reverse excess 29 relaxation time depends on the control parameters, as work: for the breathing harmonic trap. The contribution from D E 1 the supra-Stokes’ tensor (c) is antisymmetric in control- δ∆dF ( Wex Λ Wex Λ† ) (27a) ≈ 2N h i − h i parameter velocity, becoming negative for negative ve- Z ∆t locity. This antisymmetric contribution skews the total 1 (2) ˙ ˙ ˙ dt ζijk λiλjλk , (27b) friction (d), which depends on the control-parameter ve- ≈ 4N 0 locity and lacks any symmetry under time reversal. where the second line follows from (15). The protocol Geodesics (solid curves in Fig. 1) are protocols that designed to reduce the (magnitude of) bias thus follows minimize the contribution from the corresponding metric ζ(2) to the excess work. For relatively fast protocols (average geodesics of the cubic Finsler metric ijk, simplifying for ∗ (2)−1/3 scaled velocities of λ˙ k/ki > 0.5 or < 0.5, for initial one-dimensional control to λ˙ ζ . ≡ − ∝ spring constant ki and damping∼ coefficient∼ γ), the veloc- To illustrate the potential benefit of protocols designed ity is significantly smaller in regions of high friction and to reduce variance and bias, Fig. 2 shows approximations larger in regions of low friction (or force variance). for the variance (26a) and bias (27a) of designed and 5

102 a b beyond linear response, which asymmetrically skews the 100 Riemannian metric (5) into a generalized cubic Finsler 100 metric (24). The Stokes’ friction tensor (5) controls the naive leading-order contribution to the variance of both uni- bias 10 2 force variance ° directional and bidirectional free energy estimators (16), 2 precise variance (var) 10° accurate such that minimum-variance protocols follow geodesics 4 10° of the Stokes’ friction. For unidirectional estimators, 2 0 2 2 0 2 10° 10 10 10° 10 10 these same protocols also minimize the bias; however, 2 c 2 d for bidirectional estimators such as BAR the leading-

naive order contribution to the bias is from the supra-Stokes’ naive 1 1/2 tensor (27b), and therefore minimum-bias protocols in-

/ var / bias stead follow geodesics of the supra-Stokes’ tensor. From 1/2 1/8 des des these near-equilibrium approximations, we design proto- var

bias cols that increase the precision and accuracy of standard 1/4 1/32 2 0 2 2 0 2 nonequilibrium free energy estimators (Fig. 2). 10° 10 10 10° 10 10 protocol duration protocol duration The addition of the supra-Stokes’ tensor has sev- 2 FIG. 2: (a) Scaled variance h(β δ∆dF ) i/N (26a) and (b) eral physical implications. Notably, it accounts for scaled bias hβ δ∆dF i/N (27a) as functions of scaled protocol time-reversal asymmetric dissipation, which arises from (1) (1) duration ∆t/τf (for slowest relaxation time τf = γ/(2kf )), skewed conjugate-force fluctuations. Since the supra- for the breathing harmonic trap. Ratios of designed and Stokes’ contribution is antisymmetric, it cancels out for 2 2 naive (c) h(δ∆dF ) ides/h(δ∆dF ) inaive and (d) biases equal numbers of forward and reverse protocols, which hδ∆dF ides/hδ∆dF inaive. Subscript ‘naive’ denotes the constant- has implications for the design principles of molecu- velocity protocol, and ‘des’ the designed protocols, which are lar machines: many molecular machines (e.g., kinesin the force-variance-optimized ‘force variance’, the minimum- walking toward a microtubule’s plus end33 or ATP syn- variance ‘accurate’, and the minimum-bias ‘precise’ protocols. thase synthesizing ATP34) achieve directed behavior; the Dash-dotted lines denote the limits for short (black) and long duration (colors). Final control-parameter value (spring con- supra-Stokes’ tensor quantifies the leading-order ener- getic cost of directed operation compared to a coequal stant) is kf /ki = 1/16. forward and reverse process. naive protocols, as well as their ratio, as a function of protocol duration, for the model system of a breathing In some cases the minimum-work protocols can be calculated exactly,35,36 or minimum-variance protocols harmonic trap (Appendix A). For a slow protocol (pro- 37,38 tocol duration longer than the slowest relaxation time, solved numerically, which would yield more accurate and precise estimators; however, to date these optimiza- t/τ (1) > ∆ f 1), the designed protocols reduce the vari- tions are limited to simple systems and do not permit ance by a∼ factor of 3 4 (Fig. 2c), with the precise pro- − straightforward generalization. The present formalism tocol (designed to reduce variance) performing the best can be applied to more general settings than exact cal- (smallest ratio). For the bias, the accurate protocol (de- culations and makes fewer approximations than Shenfeld signed to reduce bias) performs the best (smallest ratio), et al.,22 who only examine the force-variance metric and with reductions by an order of magnitude for the slow- do not consider minimizing bias. est protocols shown (Fig. 2d). For fast protocols (pro- tocol duration shorter than the slowest relaxation time, (1) < ∆t/τf 1), the approximations break down, and the Here we considered continuous protocols, but free naive protocol∼ can outperform the designed in both bias energies are often estimated from sampling discrete and variance (ratio larger than one). For this system the states.39–42 Additionally, for discrete state sampling in minimum-variance and minimum-bias protocols achieve replica-exchange43 or parallel-tempering44 simulations it similar amounts of bias and variance in all cases, likely is important to consider the acceptance probability be- due to their similar functional form (Fig. 3). tween states.45–53 The force-variance metric has been ap- plied in this context,22 and the linear-response framework of Ref. 27 has been generalized to discrete control.54 As V. DISCUSSION Ref. 22 has shown, spacing the states along geodesics of the force variance not only reduces the variance of the We have developed near-equilibrium approximations free energy estimator but also increases the acceptance for moments of the excess work, (15), (16), and (18), probability. We expect that analogous principles hold that incorporate time-reversal symmetric and antisym- for designing minimum-variance and minimum-bias pro- metric contributions. The antisymmetric contribution to tocols for discrete state sampling using the milder ap- the first moment (15) yields the next-order contribution proximations of our approach. 6

Acknowledgments 1 Naive The authors thank Steve Large, Emma Lathouwers, Force Variance and Joseph Lucero (SFU Physics), Miranda Louwerse Precise (SFU Chemistry), John Chodera and Josh Fass (Memo- Accurate rial Sloan Kettering Computational Biology), and Gavin i Crooks (Berkeley Institute for Theoretical Science) for k

/ 0.5 feedback on the manuscript. This work is supported by k an SFU Graduate Deans Entrance Scholarship (SB), an NSERC Discovery Grant and Discovery Accelerator Sup- plement (DAS), and a Tier-II Canada Research Chair (DAS). 0 0 0.5 1 t/∆t Data Availability FIG. 3: The control parameter (spring constant k) as a func- The data that support the findings of this study are tion of time for three different protocols in the model system available from the corresponding author upon reasonable of a breathing harmonic trap. The control parameter drives the system between relatively strongly and weakly confined request. states (kf /ki = 1/16). ‘Naive’ denotes the constant-velocity protocol, ‘Force Variance’ the force-variance-optimized proto- col proceeding according to k˙ ∝ k, ‘Accurate’ the minimum- variance protocol with k˙ ∝ k3/2, and ‘Precise’ the minimum- bias protocol with k˙ ∝ k5/3. Appendix A: Breathing Harmonic Trap

A simple non-trivial system for designing and testing function (9) are minimum-variance and minimum-bias protocols is the 2 1 breathing harmonic trap, since its correlation functions δf k(t) = (A3a) can be calculated analytically and it has a non-Gaussian h i 2k2 βγ work distribution. Consider a colloidal particle in a har- ζ[k(t)] = (A3b) monic trap with variable stiffness. The particle position 4k3 x obeys the overdamped Langevin equation, β2γ2 ζ(2)[k(t)] = (A3c) 2k5   p (3) 2 3t 51γ γx˙ = kx + 2γkBT η, (A1) [k(t), t] β γ + , (A3d) − C ≈ 16k6 32k7 in a trap of time-dependent strength k, damping coeffi- respectively. In (A3d), we neglect terms of order t−n cient γ, environmental temperature T , Boltzmann’s con- for n 1. Since we only consider one-dimensional ≥ stant kB, and Gaussian white noise η. The control pa- control, we can analytically determine the minimum- rameter is the trap strength λ = k, so the conjugate force variance (precise) and minimum-bias (accurate) proto- is f ∂U/∂k = 1 x2. The joint probability distribu- cols that minimize Eq. (26b) and (27b), respectively. ≡ − − 2 tion of the particle position and work obeys38,55 The force-variance-optimized protocol proceeds at veloc- ity k˙ k, minimum-variance protocol at k˙ k3/2, and minimum-bias∝ protocol at k˙ k5/3. Figure 3∝ plots these ∂p(x, w, t) 1 ∂p ∂  1 ∂p  γ = γkx˙ 2 + kxp + . protocols alongside the naive∝ (constant-velocity) proto- ∂t −2 ∂w ∂x β ∂x col. (A2) As a simple test of the approximations of Eqs. (15), (16), and (18), Fig. 4 plots the sum and difference of the From this, we can solve for any moment of either first three moments of the excess work from forward and the work or position distribution, subject to an initial reverse protocols. We choose the forward protocol to be equilibrium condition p(x, w, t = 0) = π(x ki)δ(w) = a decrease in the control parameter (decreasing k), and −1/2 2 | [2π/(βki)] exp βkix /2 δ(w), with δ(w) the Dirac therefore the reverse increases the control parameter (in- delta function. {− } creasing k). In all cases the exact calculations agree with (1) The force variance (19), the Stokes’ (5) and the supra- the approximation for slow protocols (large ∆t/τf for (1) Stokes’ coefficients (7), and the third integral covariance slowest relaxation time τf = γ/(2kf )), and overestimate 7

leading-order contributions) for the moments. In the for-

) 0 mer, one must treat the bound as finite since the n-time

† 10

L covariance functions do not decay to zero for some sub- i space of large time arguments. n ex In more detail, consider the four-time covariance W (kurtosis) δfi(0)δfj(t2)δfk(t3)δf`(t4) λ(t). When any 2 h i + ± h − 10° one of the four times 0, t2, t3, t4 significantly dif- ( + ) { } L fers from the others, the conjugate forces decorre- i Approximation

n late, and the kurtosis decays to zero; however, when ex t3 t4 t2 0, any variables separated by sig-

W n = 1 ( − ) h nificant∼ time decorrelate,∼ and the kurtosis approaches ( 4 10° n = 2 n δfi(0)δfj(t2) λ(t) δfk(t3)δf`(t4) λ(t). This limit repre-

b h i h i n = 3 sents a plane in the (t2, t3, t4) parameter space where + ± n − ( + )

a the kurtosis asymptotes to a finite value even for large 1 0 1 2 10° 10 10 10 time arguments. Integrating the above and multiply- (1) Dt/t ing by three (to account for permutations of the indices f 1, 2, 3, 4) yields the integral four-time covariance in the limit t3 t4 t2 0, FIG. 4: The sum (black, +) and the difference (red, −) of the ∼  ∼ moments of excess work for forward and reverse protocols of (3) [λ(t), t] = 3t (1)[λ(t), t] (1)[λ(t), t] (B1a) the breathing harmonic trap, as a function of protocol dura- Cijk` Cij Ck` (1) (1) (1) tion ∆t (scaled by the slowest relaxation time τf = γ/(2kf )). = 3t ζ [λ(t)]ζ [λ(t)] , t . (B1b) Solid lines are the near-equilibrium approximations, given by ij k` → ∞ Eqs. (15), (16), and (18). Dashed (n = 1), dotted (n = 2), and Since this is the only case that remains finite as t , dash-dotted (n = 3) curves show exact results. The protocol the second line is the approximation (to highest order→ ∞ in k(t) is linear with k /k = 1/2. The coefficients α+ = 1/2, f i 1 t) for the integral four-time covariance. Parallel argu- α− = 2, α+ = 1/4, α− = 1/3, α+ = 1/2 and α− = 1/4 1 2 2 3 3 ments hold for the higher-order moments. are chosen such that any moment approximated by the same friction coefficient collapses onto a single curve. In all cases When approximating the average excess work (23) the exact calculations agree with the approximation for large with both the Stokes’ and supra-Stokes’ tensors (24), fi- ∆t/τ (1). nite integration bounds on the Stokes’ friction may be f necessary. For finite integration bounds on the Stokes’ friction we replace (24) with (1) (1) for fast protocols (small ∆t/τf ). For large ∆t/τf , the (1) 1 (2) ij[λ(t), λ˙ (t), t] [λ(t), t] + ζ [λ(t)]λ˙ k(t) . (B2) first and second moments are approximated by (15) and C ≡ Cij 4 ijk (16) so the sums (n = 1 and 2 (+)) are both approxi- mated by the Stokes’ friction (5) and decrease as 1/∆t, For systems with weakly skewed conjugate-force fluctua- while the differences (n = 1 and 2 ( )) are proportional tion, the contribution from the finite bound on the first to the supra-Stokes’ tensor (7) and− decrease as 1/(∆t)2. term can be comparable in magnitude to the contribution The third moment is approximated by (18), so the sum from the supra-Stokes’ tensor. (n = 3 (+)) is approximated by the third integral covari- Figures 5 shows the first and second moments of the ex- ance (9) and decays as 1/(∆t)2 (due to the term linear cess work for forward and reverse protocols, compared to in t in (A3d)) while the difference (n = 3 ( )) is propor- different approximations. For the mean excess work, the tional to the supra-Stokes tensor and decays− as 1/(∆t)2. supra-Stokes’ tensor contributes with opposite sign from the Stokes’ friction for forward (decreasing k) protocols and with same sign for reverse (increasing k). Since the Stokes’-friction approximation overestimates the excess Appendix B: Finite Integration Bounds work in both cases (a,c), adding the supra-Stokes’ tensor reduces the accuracy of the approximation for the reverse The infinite integration bound on the friction assumes excess work. This effect is an artifact of the infinite inte- that correlations relax quickly relative to the protocol gral bound; indeed, if the bound is kept finite ( (1) rather speed. For the first two moments, despite the finite in- than ζ(1)), the first integral autocovariance overestimatesC tegration bound generally yielding a more accurate ap- for forward protocols and underestimates for reverse pro- proximation, the infinite bound considerably simplifies tocols, and adding the supra-Stokes’ tensor improves the the approximation and allows for straightforward proto- approximation in both cases. Finally, we note that the col optimization. Stokes’ friction does not always overestimate for both for- The effect of finite integration bounds is significant ward and reverse protocols. For the second moment in for two calculations: approximation of fourth- and (b,d), the Stokes’ friction overestimates for forward pro- higher-order moments of the excess work, and next-order tocols and underestimates for reverse protocols, and the approximations (including both leading- and next-to- supra-Stokes’ tensor improves the approximation when 8

0.10.1 0.10.1 Riemannian geometry (notably curvature, length, and Exact (1) + 3 k˙z(2) a C 4 b (1) z L geodesics) generalize. There are therefore standard pro- L i L i L (1) i 2 i ex cedures to find the geodesics despite the more complex 2

C ex ex (1) 1 ˙ (2)

ex + kz 0.1 0.050.1 4 W 0.05

0.05 C W 0.05 h W (1) 3 ˙ (2) landscapes induced by the generalized cubic Finsler met- h W 2 Exact h + 4 kz 2 h C b (1) b b

b ric (24). L

z 1 2 1 2 i L (1) i 2 C ex Finsler geometry has several applications in both ex (1) + 1 k˙z(2) 0 0 32 0.05 C 4 W 0.050 0 1 2 0 0 1 2 h W 10 0 10 1 10 2 10 0 10 1 10 2 physics and biology. In the thermodynamic context, the 2 h 10 10 10 10 10 10 56 b b flexion tensor 1 2 0.10.1 0.10.1

c † d †

† 3 0 0 L   †

L ∂ π x λ i 0 1 2 L 0 1 2 ln ( ) i 10 10 10 L 10 10 10 i

2 λ t i ex ijk[ ( )] | (C1) 2 ex ex ∂λ ∂λ ∂λ

ex 0.05 0.05 F ≡ i j k 0.1 0.050.1 W 0.05 λ(t) W h W h W 2 h 2 † h b b † b b L 1 2 1 2 i L is a Finsler metric arising as the third-order contribution i 2 ex ex 0.05 0.050 0 to the near-equilibrium expansion of the relative entropy W 0 0 1 2 0 0 1 2 h W 10100 10101 10102 10100 10101 10102 2 h (1) (1) (Kullback-Leibler divergence) of a probability distribu- b b Dt/t (1) Dt/t (1) 1 2 32 Dt/tff Dt/tff tion relative to the equilibrium distribution: 0 0 100 101 102 100 101 102 Z ∞ π x0 λ (1) (1) 0 ( ) Dt/t FIG. 5: The firstDt/t and second moments of excess works for D(π(x λ) π(x λ) ) dx π(x λ) ln | (C2a) f f | || | | ≡ | π(x λ) forward (a,b) and reverse (c,d) protocols of the breathing har- −∞ | monic trap as a function of protocol time ∆t (scaled by the ij∆λi∆λj + ijk∆λi∆λj∆λk . (1) ≈ I F slowest relaxation time τf = γ/(2kf )). The dashed, dot- (C2b) ted, and dash-dotted lines are different forms of the near- The coskewness tensor (21) is related to the flexion tensor equilibrium approximation. The protocol k(t) is linear with by kf /ki = 1/2.

ijk[λ(t)] = δfiδfjδfk (C3) F h iλ(t) X ∂2 ln π(x λ) ∂ ln π(x λ) either the Stokes’ friction or first integral covariance are + | | , used. ∂λi∂λj ∂λk σijk λ(t)

where the sum is over permutations of the indices σijk = Appendix C: Finsler Geometry 123, 132, 321 . In weak gravitational lensing57,58 the flexion{ tensor} is the third-order correction to the shapes The addition of the supra-Stokes’ tensor (7) comes at of images, which is described as flexing the shape of im- the cost of a more complex geometric structure. In con- ages from an ellipse towards a banana shape. Similarly, trast to Riemannian geometry, Finsler geometry is not re- the supra-Stokes’ tensor (product of coskewness tensor stricted to a quadratic norm. In general, the inner prod- and integral double relaxation time (21)) skews the ther- ucts are not characterized by points, but rather by points modynamic geometry of minimum-dissipation protocols, and directions. Despite this, several useful concepts from as demonstrated in Fig. 1.

∗ Electronic address: [email protected] Sci. 4, 1708 (2018). † Electronic address: [email protected] 7 J. D. Chodera, D. L. Mobley, M. R. Shirts, R. W. Dixon, 1 V. Gapsys, S. Michielssens, J. H. Peters, B. L. de Groot, K. Branson, and V. S. Pande, Curr. Opin. Struc. Biol. 21, and H. Leonov, Calculation of Binding Free Energies 150 (2011). (Springer New York, New York, NY, 2015), pp. 173–209. 8 A. Pohorille, C. Jarzynski, and C. Chipot, J. Phys. Chem. 2 C. E. Schindler, H. Baumann, A. Blum, D. B¨ose, H.- B 114, 10235 (2010). P. Buchstaller, L. Burgdorf, D. Cappel, E. Chekler, 9 J. Gore, F. Ritort, and C. Bustamante, Proc. Natl. Acad. P. Czodrowski, D. Dorsch, et al., J. Chem. Inf. Model 60, Sci. U.S.A 100, 12564 (2003). 5457 (2020). 10 C. Jarzynski, Phys. Rev. Lett. 78, 2690 (1997). 3 B. Kuhn, M. Tich´y,L. Wang, S. Robinson, R. E. Mar- 11 C. H. Bennett, J. Comput. Phys. 22, 245 (1976). tin, A. Kuglstatter, J. Benz, M. Giroud, T. Schirmeister, 12 M. R. Shirts, E. Bair, G. Hooker, and V. S. Pande, Phys. R. Abel, et al., J. Med. Chem 60, 2485 (2017). Rev. Lett. 91, 140601 (2003). 4 M. Ciordia, L. P´erez-Benito,F. Delgado, A. A. Trabanco, 13 P. Maragakis, M. Spichty, and M. Karplus, Phys. Rev. and G. Tresadern, J. Chem. Inf. Model 56, 1856 (2016). Lett. 96, 100602 (2006). 5 L. Wang, Y. Wu, Y. Deng, B. Kim, L. Pierce, G. Krilov, 14 W. P. Reinhardt and J. E. Hunter III, J. Chem. Phys. 97, D. Lupyan, S. Robinson, M. K. Dahlgren, J. Greenwood, 1599 (1992). et al., J. Am. Chem. Soc. 137, 2695 (2015). 15 J. E. Hunter III, W. P. Reinhardt, and T. F. Davis, J. 6 M. Aldeghi, V. Gapsys, and B. L. de Groot, ACS Cent. Chem. Phys. 99, 6856 (1993). 9

16 C. Jarque and B. Tidor, J. Phys. Chem. B 101, 9402 39 D. A. Pearlman and P. A. Kollman, J. Chem. Phys 90, (1997). 2460 (1989). 17 F. Weinhold, J. Chem. Phys. 63, 2479 (1975). 40 N. Lu and D. A. Kofke, J. Chem. Phys 111, 4414 (1999). 18 P. Salamon and R. S. Berry, Phys. Rev. Lett. 51, 1127 41 R. J. Radmer and P. A. Kollman, J. Comput. Chem 18, (1983). 902 (1997). 19 J. C. Sch¨on,J. Chem. Phys. 105, 10072 (1996). 42 D. Wu and D. A. Kofke, J. Chem. Phys 123, 084109 20 M. A. Miller and W. P. Reinhardt, J. Chem. Phys. 113, (2005). 7035 (2000). 43 Y. Sugita, A. Kitao, and Y. Okamoto, J. Chem. Phys 113, 21 G. E. Crooks, Phys. Rev. Lett. 99, 100602 (2007). 6042 (2000). 22 D. K. Shenfeld, H. Xu, M. P. Eastwood, R. O. Dror, and 44 D. J. Earl and M. W. Deem, Phys. Chem. Chem. Phys 7, D. E. Shaw, Phys. Rev. E 80, 046705 (2009). 3910 (2005). 23 D. D. L. Minh, J. of Comp. Chem. 41, 715 (2020). 45 D. A. Kofke, J. Chem. Phys 117, 6911 (2002). 24 T. T. Pham and M. R. Shirts, J. Chem. Phys. 135, 034114 46 C. Predescu, M. Predescu, and C. V. Ciobanu, J. Chem. (2011). Phys 120, 4119 (2004). 25 T. T. Pham and M. R. Shirts, J. Chem. Phys. 136, 124120 47 A. Kone and D. A. Kofke, J. Chem. Phys 122, 206101 (2012). (2005). 26 S. Park and W. Im, J. Chem. Theory Comput. 10, 2719 48 N. Rathore, M. Chopra, and J. J. de Pablo, J. Chem. Phys (2014). 122, 024111 (2005). 27 D. A. Sivak and G. E. Crooks, Phys. Rev. Lett. 108, 49 C. Predescu, M. Predescu, and C. V. Ciobanu, J. Phys. 190602 (2012). Chem. B 109, 4189 (2005). 28 S. Deffner and M. V. S. Bonan¸ca,EPL 131, 20001 (2020). 50 W. Nadler and U. H. Hansmann, Phys. Rev. E 76, 065701 29 S. Kim, Y. W. Kim, P. Talkner, and J. Yi, Phys. Rev. E (2007). 86, 041130 (2012). 51 J. D. Chodera and M. R. Shirts, J. Chem. Phys. 135, 30 G. E. Crooks, Phys. Rev. E 60, 2721 (1999). 194110 (2011). 31 G. E. Crooks, Phys. Rev. E 61, 2361 (2000). 52 R. M. Dirks, H. Xu, and D. E. Shaw, J. Chem. Theory 32 P. L. Antonelli, R. S. Ingarden, and M. Matsumoto, The Comput. 8, 162 (2012). theory of sprays and Finsler spaces with applications in 53 J. L. MacCallum, M. I. Muniyat, and K. Gaalswyk, J. physics and biology, vol. 58 (Springer Science & Business Phys. Chem. B 122, 5448 (2018). Media, 2013). 54 S. J. Large and D. A. Sivak, J. Stat. Mech.: Theory Exp. 33 N. Hirokawa, Y. Noda, Y. Tanaka, and S. Niwa, Nat. Rev. 2019, 083212 (2019). Mol. Cell Biol. 10, 682 (2009). 55 A. Imparato, L. Peliti, G. Pesce, G. Rusciano, and 34 W. Junge and N. Nelson, Annu. Rev. Biochem. 84, 631 A. Sasso, Phys. Rev. E 76, 050101 (2007). (2015). 56 E. Sellentin, M. Quartin, and L. Amendola, Mon. Not. R. 35 T. Schmiedl and U. Seifert, Phys. Rev. Lett. 98, 108301 Astron. Soc. 441, 1831 (2014). (2007). 57 D. M. Goldberg and D. J. Bacon, Astrophys. J 619, 741 36 A. Gomez-Marin, T. Schmiedl, and U. Seifert, J. Chem. (2005). Phys. 129, 024114 (2008). 58 D. J. Bacon, D. M. Goldberg, B. T. P. Rowe, and A. N. 37 P. Geiger and C. Dellago, Phys. Rev. E 81, 021127 (2010). Taylor, Mon. Not. R. Astron. Soc. 365, 414 (2006). 38 A. P. Solon and J. M. Horowitz, Phys. Rev. Lett. 120, 180605 (2018).