Fisher Information and Statistical Mechanics

Tech. Note 008v4 http://threeplusone.com/fisher

Gavin E. Crooks January 4, 2012

Fisher information is an important concept in statistical estimation and information theory, but it has received relatively little consideration in statistical physics. In order to rectify this oversight, in this brief note I will review the correspondence between Fisher information and fluctuations at thermodynamic equilibrium, and discuss various applications of Fisher information to equilibrium and non-equilibrium statistical mechanics.

Given a family of probability distributions p(x|λ) that vary smoothly with a parameter λ, the Fisher information [1] is defined as

    I(λ) = ∑_x p(x|λ) ( ∂ ln p(x|λ)/∂λ )²  =  ⟨ ( ∂ ln p(x|λ)/∂λ )² ⟩ .   (1)

Since the mean of the "score function" ⟨∂ ln p/∂λ⟩ is zero (see appendix), the Fisher information is the variance of the score. The Fisher information is also equal, under certain conditions (see appendix), to the negative mean second derivative of the log-probability.

    I(λ) = − ⟨ ∂²/∂λ² ln p(x|λ) ⟩   (2)

The probability distribution of a system at thermal equilibrium is given by the canonical ensemble [2, 3]

    ρ(x|λ) = exp( βF(λ) − βE(x, λ) ) ,   (3)

where β = 1/k_B T is the inverse temperature T of the environment in natural units (k_B is the Boltzmann constant), E(x, λ) is the energy of the system, which depends both on the internal state x and the external control parameter λ, and F is the free energy

    βF = − ln ∑_x exp{ −βE(x, λ) } .   (4)

Alternatively, instead of the free energy, we could write equivalent equations using the free entropy ψ = −βF or the partition function Z = exp{−βF}.

If we plug the canonical ensemble (3) into the Fisher information (1), we find that

    I(λ) = β² ⟨ ( dF/dλ − ∂E/∂λ )² ⟩ .   (5)

From Eq. (4) it follows that the derivative of the free energy with respect to the control parameter, dF/dλ, is equal to the average derivative of the energy with respect to the same parameter,

    dF/dλ = ⟨ ∂E/∂λ ⟩ .   (6)

Therefore, the Fisher information of a canonical ensemble with respect to a parameter λ that smoothly (but otherwise arbitrarily) controls the energy is the variance of the infinitesimal change in energy with respect to a change in the control parameter.

    I(λ) = β² ⟨ ( ⟨∂E/∂λ⟩ − ∂E/∂λ )² ⟩   (7)

If we consider the inverse temperature as the controllable parameter, then the Fisher information is equal to the energy fluctuations.

    I(β) = ⟨ ( ⟨E⟩ − E )² ⟩   (8)

In other words, the Fisher information of a thermodynamic system tells us the size of fluctuations about equilibrium.

Let us consider two examples. First, suppose the system under examination is a single polymer, and the parameter under control is the end-to-end distance L. The instantaneous tension T = ∂E/∂L is the force exerted by the polymer on the apparatus constraining the distance between the polymer ends. The Fisher information for this system is equal to the variance of the tension at equilibrium.

    I(L) = β² ⟨ ( ⟨T⟩ − T )² ⟩   (9)

On the other hand, suppose control is exerted by applying a constant tension to the ends of the polymer. The total energy is then a linear function of length and tension,

    E(x, T) = U(x) − T L(x) ,

and the Fisher information is equal to the variance of the end-to-end polymer length at equilibrium.

    I(T) = β² ⟨ ( ⟨L⟩ − L )² ⟩   (10)

In each case the Fisher information has a simple physical interpretation in terms of the equilibrium fluctuations.

When the energy is a linear function of the control parameter, as in the last example, the variance (and therefore the Fisher information) is equal to the negative second derivative of the free entropy [4, 5]. However, in general the relation between free energy and Fisher information contains an additional term.

    −∂²βF/∂λ² = −∑_x ∂/∂λ [ exp( βF(λ) − βE(x, λ) ) ∂βE/∂λ ]
              = β² ⟨ (∂E/∂λ)² ⟩ − β² ⟨ ∂E/∂λ ⟩² − β ⟨ ∂²E/∂λ² ⟩
              = I(λ) − β ⟨ ∂²E/∂λ² ⟩   (11)

The last term will be zero if the control parameter is intensive (e.g. the pressure), and will be very small and entirely unimportant for an extensive parameter (e.g. volume) in the macroscopic thermodynamic limit. In these cases the Fisher information can be calculated given the free energy as a function of the control parameter.

When we have many control parameters λ ≡ {λ^1, λ^2, ..., λ^N} we can construct a Fisher information matrix I_ij (also commonly written g_ij),

    I_ij = ∑_x p(x|λ) ( ∂ ln p(x|λ)/∂λ^i ) ( ∂ ln p(x|λ)/∂λ^j ) .   (12)

If we plug the canonical ensemble (3) into the definition of the Fisher information matrix (12), then we find that the Fisher matrix is a covariance matrix of fluctuations around equilibrium.

    I_ij = β² ⟨ ( ⟨∂E/∂λ^i⟩ − ∂E/∂λ^i ) ( ⟨∂E/∂λ^j⟩ − ∂E/∂λ^j ) ⟩   (13)

Fisher information is central to the field of 'information geometry'. Since the Fisher information matrix I_ij is essentially a covariance matrix, it follows that the matrix is symmetric and positive semi-definite (all the eigenvalues are non-negative), and can therefore be used as a metric tensor to define a notion of distance between points in the space of parameters. This equips the manifold of parameters with a Riemannian metric [6, 7]. The length of a curve through parameter space, measured using this Fisher metric (also known as the Rao metric or entropy differential metric), is

    L = ∫₀¹ √( ∑_ij (dλ^i/ds) I_ij (dλ^j/ds) ) ds .   (14)

Recall that a metric provides a measure of 'distance' between points. It is a real function f(a, b) such that (1) distances are non-negative, f(a, b) ⩾ 0, with equality if, and only if, a = b; (2) it is symmetric, f(a, b) = f(b, a); and (3) it is generally shorter to go directly from point a to c than to go by way of b, f(a, b) + f(b, c) ⩾ f(a, c) (the triangle inequality). Moreover, a Riemannian metric defines a length space; there is always a point b 'between' a and c such that the equality f(a, b) + f(b, c) = f(a, c) holds.

We can also define a corresponding divergence, or entropic action,

    J = ∫₀¹ ∑_ij (dλ^i/ds) I_ij (dλ^j/ds) ds ,   (15)

which is related to the length (14) by the inequality

    J ⩾ L² ,   (16)

which can be derived as a consequence of the Cauchy–Schwarz inequality ∫₀^τ f² dt ∫₀^τ g² dt ⩾ [∫₀^τ f g dt]² with g(t) = 1. The value of the divergence depends on the parametrization. The minimum value L² is attained only when the integrand is constant along the path.

This Fisher information metric has been extensively applied to thermodynamic systems under the term "thermodynamic length" [8–11], although the connection to Fisher information has not been widely appreciated [4, 5]. The divergence measures the number of natural fluctuations along a path, and the thermodynamic length is the cumulative root-mean-square deviation measured along the path.

Another application of Fisher information is the Cramér–Rao inequality [1]. Suppose that we have a function of state λ̂(x) that is an unbiased estimator for the parameter λ, i.e.

    ∫ p(x|λ) λ̂(x) dx = λ ;   (17)

then the Cramér–Rao inequality states that the variance of the estimate is greater than or equal to the inverse Fisher information,

    ⟨ (λ̂ − λ)² ⟩ ⩾ 1 / I(λ) .   (18)

For a system in thermodynamic equilibrium, the Cramér–Rao inequality can be interpreted as a thermodynamic uncertainty relation [12] (in rough analogy to the quantum uncertainty relations). For instance, a single measurement of a system determines the instantaneous energy, but this is not sufficient to infer the temperature with certainty. The Cramér–Rao inequality then relates the standard deviation of the energy fluctuations ∆E to the minimum uncertainty of the temperature measurement ∆β = ⟨(β̂ − β)²⟩^(1/2),

    ∆E ∆β ⩾ 1 .   (19)
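These relations are easy to check numerically for a small discrete system. The sketch below is illustrative and not part of the original note; the two-level system and the straight path in β are my assumptions. It verifies Eq. (8), that the Fisher information I(β) computed from the score equals the equilibrium energy variance, and Eq. (16), that the divergence J of a path bounds the squared thermodynamic length L².

```python
import numpy as np

# Hypothetical two-level system with energies 0 and 1.
E = np.array([0.0, 1.0])

def probs(beta):
    """Canonical ensemble p(x|beta) = exp(-beta E_x) / Z, Eq. (3)."""
    w = np.exp(-beta * E)
    return w / w.sum()

def fisher_from_score(beta, h=1e-5):
    """I(beta) = <(d ln p / d beta)^2>, Eq. (1), score by central differences."""
    p = probs(beta)
    score = (np.log(probs(beta + h)) - np.log(probs(beta - h))) / (2 * h)
    return float(np.sum(p * score**2))

def energy_variance(beta):
    """<(<E> - E)^2> at equilibrium, the right-hand side of Eq. (8)."""
    p = probs(beta)
    return float(np.sum(p * (E - np.sum(p * E)) ** 2))

# Eq. (8): the Fisher information of beta equals the energy fluctuations.
assert abs(fisher_from_score(1.0) - energy_variance(1.0)) < 1e-6

def trapezoid(y, x):
    """Simple trapezoid rule, to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))

# Thermodynamic length (14) and divergence (15) along beta(s) = 0.5 + 1.5 s.
s = np.linspace(0.0, 1.0, 1001)
betas = 0.5 + 1.5 * s
dbeta_ds = 1.5
I = np.array([energy_variance(b) for b in betas])  # I(beta) = Var(E)
L = trapezoid(np.sqrt(I) * dbeta_ds, s)            # Eq. (14)
J = trapezoid(I * dbeta_ds**2, s)                  # Eq. (15)
assert J >= L**2                                   # Eq. (16)
```

Equality in (16) would require the integrand √I(β) dβ/ds to be constant along the path; since the energy variance of this system varies with β, the inequality here is strict.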

Appendix: Miscellaneous Mathematics

It is useful to recall the behavior of logarithms under differentiation,

    ∂ ln p(x|λ)/∂λ = (1/p(x|λ)) ∂p(x|λ)/∂λ ,

and that integration and differentiation can be interchanged, provided that certain mild technical conditions are met (see "Leibniz integral rule"):

    ∂/∂λ ∫ p(x|λ) dx = ∫ ∂p(x|λ)/∂λ dx .

Since probability distributions are normalized,

    ∫ p(x|λ) dx = 1 ,

it follows that

    ∫ ∂ⁿp(x|λ)/∂λⁿ dx = ∂ⁿ/∂λⁿ ∫ p(x|λ) dx = ∂ⁿ/∂λⁿ 1 = 0 .

Consequently, the mean of the "score function" ∂ ln p/∂λ is zero,

    ⟨ ∂ ln p(x|λ)/∂λ ⟩ = ∫ ∂p(x|λ)/∂λ dx = 0 ,

and it follows that the Fisher information is the variance of the score.

Again, if we can interchange differentiation and integration, the Fisher information is also equal to the negative mean second derivative of the log-probability:

    ⟨ −∂²/∂λ² ln p(x|λ) ⟩ = −∫ p(x|λ) ∂/∂λ [ (1/p(x|λ)) ∂p(x|λ)/∂λ ] dx
        = ∫ (1/p(x|λ)) [ ∂p(x|λ)/∂λ ]² dx − ∫ ∂²p(x|λ)/∂λ² dx
        = ∫ p(x|λ) [ ∂ ln p(x|λ)/∂λ ]² dx − 0
        = I(λ)

Acknowledgments: Thanks to David Sivak, Lawrence Berkeley National Laboratory (numerous comments, observations and corrections), and Neri Merhav, Technion – Israel Institute of Technology (for pointing out the connection between the Cramér–Rao inequality and thermodynamic uncertainty relations).

References

[1] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, New York, 2nd edition, 2006.

[2] J. W. Gibbs. Elementary Principles in Statistical Mechanics. Charles Scribner's Sons, New York, 1902.

[3] D. Chandler. Introduction to Modern Statistical Mechanics. Oxford University Press, Oxford, 1987.

[4] D. Brody and N. Rivier. Geometrical aspects of statistical mechanics. Phys. Rev. E, 51(2):1006–1011, 1995. doi:10.1103/PhysRevE.51.1006.

[5] G. E. Crooks. Measuring thermodynamic length. Phys. Rev. Lett., 99:100602, 2007. doi:10.1103/PhysRevLett.99.100602.

[6] C. R. Rao. Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc., 37:81–91, 1945.

[7] J. Burbea and C. R. Rao. Entropy differential metric, distance and divergence measures in probability spaces - a unified approach. J. Multivariate Anal., 12(4):575–596, 1982. doi:10.1016/0047-259X(82)90065-3.

[8] F. Weinhold. Metric geometry of equilibrium thermodynamics. J. Chem. Phys., 63(6):2479–2483, 1975. doi:10.1063/1.431689.

[9] G. Ruppeiner. Thermodynamics: A Riemannian geometric model. Phys. Rev. A, 20(4):1608–1613, 1979. doi:10.1103/PhysRevA.20.1608.

[10] F. Schlögl. Thermodynamic metric and stochastic measures. Z. Phys. B, 59(4):449–454, 1985. doi:10.1007/BF01328857.

[11] P. Salamon, J. Nulton, and E. Ihrig. On the relation between entropy and energy versions of thermodynamic length. J. Chem. Phys., 80(1):436–437, 1984. doi:10.1063/1.446467.

[12] R. Gilmore. Uncertainty relations of statistical mechanics. Phys. Rev. A, 31(5):3237–3239, 1985. doi:10.1103/PhysRevA.31.3237.
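The two score identities derived in the appendix can also be confirmed numerically. The sketch below is an illustrative check, not part of the original note; the Bernoulli family p(1|λ) = 1/(1 + e^(−λ)) and the tolerances are my assumptions. It verifies that the mean score vanishes and that the variance of the score equals the negative mean second derivative of ln p, so that Eqs. (1) and (2) give the same I(λ).

```python
import numpy as np

def log_p(x, lam):
    """Log-probability of a Bernoulli family: p(1|lam) = sigmoid(lam)."""
    return x * lam - np.log1p(np.exp(lam))

xs = np.array([0.0, 1.0])  # the two states

def check_score_identities(lam, h=1e-5):
    """Return (mean score, score variance, negative mean d^2 ln p / d lam^2)."""
    p = np.exp(log_p(xs, lam))
    # Score (first derivative of ln p) by central differences.
    score = (log_p(xs, lam + h) - log_p(xs, lam - h)) / (2 * h)
    # Second derivative of ln p by central differences.
    d2 = (log_p(xs, lam + h) - 2 * log_p(xs, lam) + log_p(xs, lam - h)) / h**2
    mean_score = float(np.sum(p * score))   # should vanish (appendix)
    var_score = float(np.sum(p * score**2)) # I(lam) from Eq. (1)
    neg_mean_d2 = float(-np.sum(p * d2))    # I(lam) from Eq. (2)
    return mean_score, var_score, neg_mean_d2

m, v, n = check_score_identities(0.3)
assert abs(m) < 1e-8   # mean score is zero
assert abs(v - n) < 1e-4  # both expressions for I(lam) agree
```

For this family both expressions reduce analytically to I(λ) = σ(λ)(1 − σ(λ)), the variance of the Bernoulli state, matching the fluctuation interpretation in the main text.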

Version History: v3 (2011-12-17) Minor typo and grammar corrections; added the Fisher information of the inverse temperature and the relation between the Cramér–Rao inequality and thermodynamic uncertainty relations. v4 (2012-01-05) Fixed several sign errors and other minor corrections (kudos: David Sivak).
