<<

Large-deviation principles, stochastic effective actions, path , and the structure and meaning of thermodynamic descriptions

Eric Smith Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA (Dated: February 18, 2011) The meaning of thermodynamic descriptions is found in large-deviations scaling [1, 2] of the prob- abilities for fluctuations of averaged quantities. The central function expressing large-deviations scaling is the , which is the basis for both fluctuation theorems and for characterizing the thermodynamic interactions of systems. Freidlin-Wentzell theory [3] provides a quite general for- mulation of large-deviations scaling for non-equilibrium stochastic processes, through a remarkable representation in terms of a Hamiltonian dynamical system. A number of related methods now exist to construct the Freidlin-Wentzell Hamiltonian for many kinds of stochastic processes; one method due to Doi [4, 5] and Peliti [6, 7], appropriate to integer counting statistics, is widely used in reaction-diffusion theory. Using these tools together with a path-entropy method due to Jaynes [8], this review shows how to construct entropy functions that both express large-deviations scaling of fluctuations, and describe system-environment interactions, for discrete stochastic processes either at or away from equilibrium. A collection of variational methods familiar within quantum field theory, but less commonly applied to the Doi-Peliti construction, is used to define a “stochastic effective ”, which is the large-deviations for arbitrary non-equilibrium paths. We show how common principles of entropy maximization, applied to different ensembles of states or of histories, lead to different entropy functions and different sets of thermodynamic state variables. Yet the relations among all these levels of description may be constructed explicitly and understood in terms of information conditions. Although the example systems considered are limited, they are meant to provide a self-contained introduction to methods that may be used to systematically construct descriptions with all the features familiar from equilibrium thermodynamics, for a much wider range of systems describable by stochastic processes.

I. INTRODUCTION which each of these may be seen in different forms that are appropriate to equilibrium and non-equilibrium sta- tistical mechanics. It also brings together construction Thermodynamics is not fundamentally a theory of en- methods (based on generating functions [4–7]), scaling ergy distribution, but a theory of statistical inference relations (based on ray approximations [3, 21]), and vari- about incompletely constrained or measured systems [9]. ational methods (based on functional Legendre trans- While most of our experience and intuition about ther- forms [22]), which will enable the reader to systematically modynamics is drawn from equilibrium statistical me- construct the fluctuation theorems of thermodynamic de- chanics [10–12], which emphasizes the role of energy, we scriptions from their underlying stochastic processes. should expect that its fundamental principles apply in much wider domains, outside equilibrium, and even out- side mechanics. A. Key concepts, and source of examples Within the last 50 years, a clear conceptual un- derstanding of the nature of thermodynamic descrip- 1. Entropy underlies large-deviations scaling and reflects tions [1, 2] has combined with new methods to analyze a system structure wide variety of stochastic processes, for continuous [13] and discrete systems [4–7], in quantum [14, 15] and clas- The entropy, as a logarithmic measure of the aggre- sical mechanics [16, 17], both at and away from equilib- gate probability of coarse-grained states or processes, is rium, and even in other areas using the central quantity in thermodynamics. It arises as the such as optimal data compression and reliable commu- leading term in the log-probability for fluctuations of av- nication [18], and via these, robust molecular recogni- eraged quantities. At the same time, however, when sub- tion [19, 20]. These developments confirm that thermo- system components, or a system and its environment, dynamics is indeed not restricted to equilibrium or to me- interact, they discover their most probable joint config- chanics. They give us insight into when thermodynamic uration through fluctuations. Therefore the competition descriptions should exist, and they provide systematic among entropy terms also reflects the structure of system methods to construct such descriptions in a wide variety interactions at the macroscale. We are reminded of the of situations. importance of this structural role of entropies, by the fact This paper reviews the aspects of large-numbers scal- that pure “classical” thermodynamics [10] is entirely de- ing and structural decomposition that are essential to voted to the analysis of entropy gradients. Therefore we thermodynamic descriptions, and presents examples from wish to insist on being able to decompose entropy into 2 its sub-system components as a criterion for any fully- formulae and entropy decompositions. For this, we intro- developed thermodynamic description. duce the notion of the stochastic effective action, which is The key property that defines the existence of thermo- defined by variational methods that generalize the famil- dynamic limits, and the form of their fluctuation theo- iar Legendre transform of equilibrium thermodynamics. rems, is large-deviations scaling [1, 2]. It is the precise The meaning of this quantity, and the way it is used, will statement of the simplification provided by the law of be most easily understood by following its construction in large numbers, not only in the infinite limit of aggre- the body of the paper, so we postpone further discussion gation, but in the asymptotic approach to infinity. It until that point. is well-known that averaging over ensembles of configu- rations for large systems removes almost all (irrelevant) degrees of freedom, and leaves only summary statistics, 2. The principles in a simple progression: from equilibrium which are the state variables of the thermodynamic de- to non-equilibrium scription.1 In finite systems, these summary statistics can still show sample fluctuation, but in the thermody- We will develop examples whose purpose is to show namic limit, the fluctuation probability takes a simple that the definition and properties of the entropy do not form. change as we extend thermodynamics beyond equilib- For an ensemble that possesses large-deviations scal- rium statistical mechanics. Rather, what changes is the ing, it is possible to describe classes of fluctuations as state space to which we assign probabilities. having the same structure under different degrees of ag- A direct example comes from comparing an equilib- gregation. (An example would be fractional density fluc- rium thermodynamic system, to a non-equilibrium de- tuations in regions of a gas, whose sample volume may be scription constructed for the same system. A frequent varied). The log-probability for any such fluctuation then approach [23–26] to non-equilibrium statistical mechan- factors, into a term that depends on overall system scale, ics (NESM) continues to use the equilibrium state vari- and a scale-invariant coefficient that depends only on the ables and equilibrium entropy, but considers their time structure of the fluctuation. The scale-invariant coeffi- rates of change. We will see that such an approach, fo- cient is called the rate function of the large-deviations cused on retaining the functional form of the equilibrium scaling relation [2]. (An example would be the specific entropy, sacrifices its meaning as a large-deviations rate entropy of a gas.) function. Large-deviations scaling presumes the existence of a Instead, we will make the transition from equilibrium separation of scales – between microscopic processes and to non-equilibrium statistical mechanics by replacing an their macroscopic descriptions – over which aggregation ensemble of states (in equilibrium) by an ensemble in does not lead to qualitative changes in the kinds of fluctu- which entire time-dependent trajectories – termed histo- ations that can occur. We expect thermodynamic limits ries – are the elementary entities (for NESM), and then to exist where this separation of scales is large. Examples we will construct the appropriate large-deviations limits of structural change that can interrupt simple scaling un- for the ensemble of histories. The functional form of the der the include phase transitions, entropy will necessarily change. More importantly, the which can change the space of accessible excitations. inventory of state variables will necessarily be enlarged, Large-deviations scaling permits us to combine fluc- to include not only the configuration variables of equilib- tuation statistics with entropy decompositions that re- rium, but also a collection of currents that relate to the flect system structure. In equilibrium thermodynam- changes in configuration. Both kinds of variables will be ics, the result is the classical fluctuation theorem for needed as summary statistics for an ensemble of histories, macrostates [1]: the log-probability for fluctuations (of and both will enter as arguments in the non-equilibrium energy, volume, particles, etc.) between sub-systems entropy. with well-defined entropies is the difference between the NESM is not so far removed from equilibrium thermo- sum of entropies at the fluctuating value and the maxi- dynamics that it can really do justice to the generality of mum value for this sum, which is the equilibrium value. large-deviations principles and thermodynamic descrip- If we wish to consider the probabilities of fluctuations tions. However, it allows us to begin with a completely with more complex structure (whether in equilibrium familiar (equilibrium) construction, then to compare it thermodynamics or in more complicated cases), we will to a construction with rather different functional forms, need more flexible methods to compute large-deviations and finally to derive the complete set of relations that connect the two descriptions.

1 In the language of Kolmogorov, the state variables are minimum B. Markovian stochastic processes, the two-state sufficient statistics for predicting the state of a system sampled model as an example, and the approach of the paper from an ensemble. No more detailed summary statistic provides better predictions over the whole ensemble. At the same time, no further reduction makes it possible to write any state variable Markovian stochastic processes [27] provide a general, as a function of the others. substrate-independent framework within which to study 3 the aggregation properties of coarse-grained probabilities in which they appear below. We return to these in the and large-deviations scaling. They include models from discussion, making use of examples from the text. statistical mechanics, but they may also be used to repre- sent many other random processes, whose structure may Numerous, diverse literatures now contribute to the have different constraints and interpretations from those understanding of methods closely related to those used of mechanics. below. A brief summary of the history and connections among the ideas, and a broader set of citations, are pro- This review will use the two-state , in ei- vided in App. A. ther discrete or continuous time, as a sample system for which equilibrium and non-equilibrium thermodynamic descriptions will be built and then compared. The two- state random walk is the simplest discrete stochastic pro- cess, and most thermodynamic quantities of interest in both ensembles can be computed for this system in closed C. Bringing together three perspectives on form. However, the constructions in the examples imme- non-equilibrium thermodynamics diately generalize to more complicated cases, and several generalizations and approximation methods will be cov- ered either in the main text or in appendices. The extension of large-deviations scaling to ensembles Sec. II will introduce large-deviations scaling one level of histories is given by Freidlin-Wentzell (F-W) the- “below” the discrete random walk, by supposing that the ory [3]. This approach is widely applied to the computa- discrete model emerges as a coarse-grained description tion of escape trajectories and first-passage times [28–36]. from the continuous random walk in a double-well po- It is remarkable for the way it reduces both problems of tential. The continuum model illustrates the concept inference, and the description of multiscale dynamics, to of concentration of measure from large-deviations the- a representation which is a Hamiltonian dynamical sys- ory, and sets the parameters (both explicit and implicit) tem [16, 37]. that define the discrete model. It also clarifies the nature and origin of the “local-equilibrium” approximation [26] A convenient method for arriving at the Hamiltonian for coarse-grained descriptions of motion on free-energy description of F-W theory is the Doi-Peliti (D-P) con- landscapes, and illustrates graphically the dual roles that struction [4–7] based on generating functionals. The D- charges and currents must have in non-equilibrium en- P construction is only one of many closely-related meth- tropy principles. ods based on expansions in coherent states, which are re- viewed in App. A. These methods [13, 21, 38, 39] have the Sections III and IV present the equilibrium thermody- common feature that the field representing sample-mean namics of the two-state model, and its most direct gen- values of observables is augmented by a conjugate mo- eralization through the master equation of the stochastic mentum that generates the change in those observables. process. Sec. III uses the exact solution of the equilib- This value/momentum pair leads to the F-W Hamil- rium distribution to introduce all basic quantities of the tonian description. The D-P method is particular to large-deviations theory, and derives these using gener- stochastic processes with independent, discrete number ating functions and their associated variational methods. counts, but may readily be generalized to continuous or Sec. IV then presents the same construction for the time- non-independent observables [40], as well as having many dependent probability distribution, in which histories of parallels in dissipative [14, 15, 41]. particle counts rather than counts at a single time are the elementary entities. Neither of these ensembles dis- The F-W method directly gives the scale factors and tinguishes particle identities, either in states or in tra- rate functions of the large-deviations limit for histories. jectories. Sec. V presents an alternative construction of However, it does not generally decompose fluctuation a thermodynamics of histories based on the entropy of probabilities into separate terms representing sub-system distinguishable-particle trajectories, and this second con- components or system-environment interactions, which struction naturally separates the path entropy from prob- we want as part of a thermodynamic description. To pro- ability terms due to the environment, in a form exactly duce that decomposition, we introduce a path-entropy analogous to the Gibbs free energy for equilibrium. method due to Jaynes known as maximum caliber [8], The remainder of the introduction lists sources for which has its roots in much older analysis of the en- the particular methods used in later sections, and ex- tropy rates of stochastic processes and chaotic dynam- plains why they capture different aspects of a full non- ical systems due to Kolmogorov and others. In addition equilibrium thermodynamics. Many aspects of the fol- to making the F-W representation more recognizable as lowing derivations – the naturalness of the generating- a direct counterpart to the constructions in equilibrium function representation, the role of operator algebras and thermodynamics, the maximum-caliber method will sep- linear algebra more generally, or the information condi- arate those events that involve energy exchange with the tions and counting statistics that relate one ensemble to environment from those that do not, allowing us to un- another – may be understood in conceptual terms that derstand the role of energy dissipation in large-deviations are more fundamental than the particular constructions formulae for paths. 4

D. A glossary of notation 3. Whether we are describing absolute particle num- bers distributed among states, or are separating to- In characterizing thermodynamic ensembles, we will tal number as a scaling variable, from the fractional need to make three choices about the level of representa- distributions of particles that have a scale-invariant tion, and the basis used: meaning. Different choices will lead to objects with quite differ- 1. Whether we are referring to ensemble averages ent mathematical behavior, even if all of them represent (hence, deterministic summary statistics), or quan- particle numbers in one way or another. The notation in tities that fluctuate stochastically to represent the the following sections is chosen to reflect the three choices process of sampling; above, while still permitting readable equations. Ensemble averages of any quantities are denoted by 2. Whether we are considering discrete samples that overbars. This convention is easier to incorporate in change in the integer basis of particle counts, or complicated equations than the E (for expectation value) modes of collective fluctuation, which we describe commonly used in statistics. with the continuous mean values of Poisson distri- The remaining distinctions in the notation are summa- butions;2 rized in Table I.

Position Domain Measure Dynamics Meaning and usage Conjugate variable momentum N [0, ∞] Integer Fixed Total population number not used Scale factor in large-deviations property ni [0,N] Integer Stochastic Values of sample points −∂/∂ni ni [0,N] Real Langevin Mean of Poisson distribution ηi

ˆni ≡ ni/N [0, 1] Rational Stochastic Relative values of sample points − (1/N) ∂/∂ˆni

νi ≡ ni/N [0, 1] Real Langevin Distribution normalized by N ηi Structure factor in large-deviations property

TABLE I. Notations for particle number in different bases. ni are integer sample values of particle counts in state i. ni are the corresponding continuous mean values of Poisson distributions used as a basis for collective fluctuations. ˆni are integer sample values normalized by total number N to isolate large-deviations scaling, and νi are the corresponding normalized mean values of Poisson distributions. Each representation of population numbers is accompanied by a conjugate momentum or shift operator, shown in the last column. Total population N will be kept fixed in all examples, and its corresponding shift operator is therefore not used.

It will be important, in following the Doi-Peliti con- shifts probability among indices; 2) the Liouville opera- struction below, to understand that math-italic ni are tor that acts on generating functions, shifting probability the mean values of Poisson distributions that are used as among modes; and 3) the action functional of the Doi- a basis in which to expand the actual distribution that Peliti field-theoretic expansion in Poisson distributions, evolves under the . They remain ran- which generates the covariance for Langevin statistics. dom variables that are sampled from the ensemble, but We will show that, with a suitable choice of dynamical they fluctuate with Langevin statistics rather than the variables, one may skip over the laborious task of in- Poisson statistics of the integer particle counts ni. The terconverting these representations, and simply copy the ensemble mean will be denotedn ¯i. functional forms from one representation to another. The The dynamics of a stochastic process may be repre- number variables and shift operators in Table I are those sented in three ways associated with these different bases, that substitute for one another under changes of repre- which contain equivalent information and which will all sentation. be illustrated in the following sections. These are: 1) the transfer matrix of the discrete stochastic process, which

2 This distinction is similar to the distinction between the position basis and the wavenumber basis exchanged by Fourier-Laplace transform. 5

II. COARSE GRAINING, EMERGENCE OF species of particles in fixed proportion, which occurs in THE NEAR-EQUILIBRIUM APPROXIMATION, chemical reaction networks (considered in App. D). The AND THE DUAL STRUCTURE OF origin of free energies need not be mechanical; such a NON-EQUILIBRIUM STATE VARIABLES representation may be found for any stochastic process that satisfies detailed balance in its stationary state. In later sections, the two-state random walk will be the Free energies will also provide a way to interpret the microscopic model, whose thermodynamic descriptions generating functions and functionals used in later sec- we seek. In this section we will take an even lower-level tions. A generating function distorts the free energy random walk in a continuum potential to be the micro- landscape by changing the one-particle free energies of scopic model, for which the two-state model is the coarse- stable states and transitions. We will see that the mo- grained thermodynamic description. Starting in the con- mentum variables that appear in the Freidlin-Wentzell tinuum will allow us to relate the dual (charge/current) theory are precisely the distortions of free energy land- character of non-equilibrium thermodynamic state vari- scapes that extract the past histories most likely to lead ables to graphical features of the underlying free-energy to fluctuation conditions imposed at any moment. landscape. The particular class of continuum models we will consider – landscapes with basins and barriers – also lead to the asymmetry between states and kinetics in B. Concentration of measure in the classical discrete stochastic processes, and to the nature large-deviations limit, and fixed points on the of the local-equilibrium approximation in such models.3 free-energy landscape The main points of the section will be: 1) the way large-deviations scaling isolates the fixed points of a Consider then the stochastic motion of a particle in stochastic dynamical system as its control points; 2) the the double-well continuum potential illustrated in Fig. 1. difference between the relevant sets of fixed points for Large-deviations theory for the continuum [3, 21, 28, 38] equilibrium versus non-equilibrium ensembles; and 3) the applies at low temperatures, and characterizes three as- way the static and kinetic properties of the underlying pects of the probability of trapped motions and escapes: system become encoded in the discrete representation. We will also cover the origin of the natural scale for a discrete stochastic process, which is never expressed di- µ rectly within the discrete model, but which can be neces- sary to regulate and to understand divergences when its stochastic behavior is analyzed. µ1a µ1b

A. The free-energy landscape representation FIG. 1. A continuum double-well potential that might under- bridges scales, and applies to general stochastic lie the two-state model. The fine-grained description follows processes satisfying detailed balance particle trajectories over the continuum. The coarse-grained description is dominated by three chemical potentials: at the Entropy characterizes probabilities for systems that we minima of the well (attracting fixed points) and the barrier describe completely (called “closed” systems), while free maximum (the one-dimensional version of a saddle point). energies characterize probabilities for systems coupled to incompletely-described environments. A review of the basic relations of probability to entropy and free energy in classical equilibrium thermodynamics is given in App. B. 1. The probability for a random walker to be found In this and later sections, we will write probabilities away from the attracting fixed points (the minima of states, and transition rates, in terms of Gibbs free en- of the continuum free-energy potential) decreases T T ergies [11]. The free energy representation is not linked exponentially in β = 1/kB , where is tempera- to any particular scale, and so provides the map from a ture and kB is Boltzmann’s constant. In particu- continuum potential for a stochastic particle to the pa- lar, escapes from one well to the other are expo- rameters of its discrete-state approximation. This rep- nentially improbable. The leading term in the log resentation becomes particularly powerful for extensions of the escape rate is given by the quasipotential of from single-particle motion, to the conversion of several Freidlin-Wentzell theory [28]. 2. Among escape events, the majority occur along a particular spatio-temporal history of thermal exci- tations leading uphill from the minima of the po- 3 This asymmetry is not inherent in the classical limit, however, nor is it a common feature of quantum statistical ensembles, tential toward the interior maximum. More pre- where positions and momenta may have much more symmetric cisely, conditional on having observed an escape roles. The continuum model therefore provides a point of depar- event, the probability that the escape trajectory ture for several, different macroscopic approximations. deviated from this most-probable form decreases 6

exponentially in β = 1/kBT, which is a large- The continuum large-deviations theory for escapes [28] deviations result for histories. The most-likely es- gives the escape rate from either well, to leading expo- cape trajectory is the stationary path derived from nential order, as a function of the chemical potentials in the Freidlin-Wentzell dynamical system [28]. For the wells and at the local maximum of the continuum one-dimensional systems, this trajectory is always potential between the wells (indicated in Fig. 1), in the the time-reverse of the classical diffusion trajectory form from the maximum to the minimum. 1 −β(‡− ) k+ = e a , 3. The most-probable trajectory for any escape re- −β −1 k = e ( ‡ b ). (2) quires a specific finite time, which will limit the − frequencies at which we can use coarse-grained ap- Note that the absolute magnitudes of probabilities which proximations. In one dimension, the escape time are kept to describe transitions are generally (exponen- equals the classical diffusive relaxation time over tially) smaller than corrections to the mean properties of the path whose time-reverse is the escape. the fixed points, which are omitted in the large-deviations approximation.5 They are, however, the leading terms The large-deviations scaling parameter in these contin- in the conditional probability, and therefore the leading T uum formulae is the inverse temperature β = 1/kB . β contribution to dynamics. will continue to be a scaling parameter in large-deviations The escape rates appear in the discrete-state, formulae for the discrete approximation, but other pa- continuous-time approximation as the parameters in its rameters such as the number of random walkers in the master equation, whose form is system will also enter. ∂ρna,nb ∂/∂na−∂/∂nb The exponential suppression of deviations from either = k+ e 1 na fixed points or stereotypical escape trajectories is the ∂t − phenomenon known as concentration of measure for the ∂/∂nb−∂/∂na + k− e 1 nb ρna,nb . (3) random walk in the potential. It is the property that − 4 leads us to make only a finite error even if we replace Here, na and nb are possible values for the numbers of the infinitely many degrees of freedom of a random walk particles found in the right and left wells at any instant on the continuum, by a (zero-dimensional!) discrete ran- of time. ρna,nb is a time-dependent probability density dom walk between two states, written for an ensemble of such observations. The master equa- tion (3) describes the flow of probability among different ⇀ b ↽ a. (1) values of the indices (na, nb) as particles hop with rates k± per particle. Here b and a are coarse-grained labels standing for the Often the right-hand side of a master equation such as left-hand and right-hand wells. Eq. (3) is written as a sum over all values of (na, nb). For The states in the discrete walk are associated approx- stochastic processes in which almost all change results imately with the positions of the local minima of the from independent single-particle transitions, only adja- continuum potential, as shown in Fig. 2. Each state also cent values (na 1, nb 1) contribute to the change in has a Gibbs free energy per particle – known in chemi- ± ∓ ρn ,n . Therefore we have written the shifts of indices in cal thermodynamics as the chemical potential [11], and a b the sum as the action of operators e∂/∂nb−∂/∂na , treating written (for a single particle) as 1 – near the value at the indices (n , n ) formally as if they lived on a con- the well minimum. a b tinuum, even though only values separated by integers

ever appear in the time evolution of ρna,nb . The opera- tor in square brackets in Eq. (3) is called the transfer matrix of the stochastic process. Its functional form will k- re-appear throughout the subsequent analyses, in the op- k+ erators that govern the time evolution of generating func- tions, and as the Hamiltonian in the Freidlin-Wentzell a dynamical system. b

FIG. 2. Discrete hops between well minima account for almost C. The discrete-state projections for equilibrium all structure in the probability distribution for the continuous versus non-equilibrium systems random walk at low temperature. The role of the barrier potential (saddle point in Fig. 1) as the determiner of kinetics The net effect of concentration of measure in the con- is absorbed in the transition rates k±. tinuum random walk is to extract the parameters of its

4 5 The magnitude of this error decays as powers of 1/kB T. See Ref. [42], Ch. 7 for more on approximations of this kind. 7 discrete approximation from the fixed points of the mean erty of charges: their value would not change if we ran the flow on the free-energy landscape. The concentration to- dynamics in time-reverse. Away from equilibrium, new ward the attracting fixed points is purely spatial, and is state variables are introduced, which live on the saddle easy to visualize for landscapes in any number of dimen- points, and these have the property of currents: their sions. The concentration of escape trajectories is spatio- value would change sign if we ran the dynamics in time- temporal, and is not directly illustrated in the static po- reverse. tential of Fig. 1. Concentration of trajectories takes on a spatial aspect for landscapes in more than one dimen- Non-equilibrium ensembles require the introduction of sion, where transitions are exponentially likely to pass additional sets of current-valued state variables [40, 43], through saddle points, as shown in Fig. 3. which do not arise in equilibrium, and which have their origin in properties of the saddle points of the underlying free-energy landscape.

D. The nature of the local-equilibrium approximation

Free energy landscapes with basins and barriers lead to an extreme asymmetry in the way charge-valued and current-valued state variables are represented in the dis- crete model. The asymmetry comes, as noted above, FIG. 3. A Free energy landscape in more than one dimension, from the fact that the leading probabilities for dynamics and its approximation by a discrete-state system. The charge- are exponentially smaller than corrections to the prop- valued state variables of the equilibrium ensemble live at the erties of states that are dropped in the large-deviations attracting fixed points (black), where the principle axes of approximation. Therefore, in such systems, the charge- curvature (crosses) point in the same direction, so the scalar valued state variables, even in the non-equilibrium en- curvature of the landscape is positive. Current-valued state semble, are nearly identical to the state variables of equi- variables of the kinetic ensemble live on the saddle points librium. The one-particle chemical potentials 1, and (white), where principle axes of curvature point in opposite their generalizations to concentration-dependent chemi- directions, and the scalar curvature is negative. Trajectories cal potentials [11], all have the same relation to Gibbs mostly sit at fixed points for long time intervals, but when transitions happen, they are exponentially likely to follow spe- free energy as in equilibrium. This property defines the cific escape paths between basins, indicated by narrow curved local-equilibrium approximation for this class of models. paths. We note two things about the local-equilibrium ap- proximation, to emphasize its limits. First, the equiva- lence of the non-equilibrium charge-valued state variables We may now graphically characterize the difference be- with their counterparts in equilibrium does not extend to tween static and dynamic ensembles, for random walks the entropy [43]. The non-equilibrium entropy depends equilibrium on landscapes with basins and barriers. The on both charges and currents, even at a single time. For thermodynamic description for any such system is fully 1 ensembles of histories, it depends on the rate of tran- specified by the chemical potentials at the attracting sitions as well as the state-occupancy statistics, as we fixed points (filled dots in Fig. 3). In this ensemble, there will show in multiple examples in Sec. IV and Sec. V. is no role for barriers, and no appearance of their chem- ‡ This fact will be essential to understanding “maximum- ical potentials . The probabilities for state occupancy entropy production” [26, 44–47] as an approximation but only are determined by their free energies, because all not a principle for non-equilibrium systems. waiting times for escapes are (by assumption) surpassed. For systems away from equilibrium, transition rates be- Second, we note that the asymmetry between states come essential to determining state occupancies, as well and kinetics need not be a property of free energy land- as transition frequencies between pairs of states. These scapes if they do not have basins and barriers. In par- rates are controlled by the saddle points (white dots in ticular, it may not be a property of free diffusion, and it Fig. 3). is generally not a property of driven dissipative quantum With each kind of fixed point we associate a state vari- ensembles [43]. Therefore, if the underlying continuum able in the thermodynamic description. (See App. B for model does not have the features that create asymmetry, discussion of the origin and role of state variables as con- we have no ground to expect that even the charge-valued straints.) The state variables of the equilibrium theory, state variables in the non-equilibrium theory will closely which live on attracting fixed points, all have the prop- resemble those in the corresponding equilibrium limit. 8

E. The implicit “natural scale” for a a sub-distribution within which (na, nb) is the most-likely coarse-grained description value. The sub-distribution, when normalized, would be the equilibrium distribution in a two-state system with a Finally we mention an implicit limit on the use of the shifted chemical potential; the ratio between the gener- discrete-state approximation. Escapes in the continuum ating function and the normalized distribution measures model are rare within the intervals that particles spend the overlap of the original distribution with the one ap- in either basin. However, they do require a non-zero propriate to (na, nb). time, comparable to the diffusive relaxation time. In It will be the Legendre transform of this - the non-equilibrium discrete-state model, we will probe generating function that gives the fluctuation probabil- transition probabilities with time-dependent distortions ity to observe (na, nb) in the original equilibrium dis- of the chemical potentials 1 and ‡. The model will per- tribution, and expresses this probability as a difference mit these probes to have arbitrarily high-frequency time- of entropies. The Legendre transform of the cumulant- dependence. However, we will see when we consider path generating function is known, in some domains of quan- entropies in Sec. V that such sources lead to divergences tum field theory, as the quantum effective action, and we in individual entropy terms that should remain finite to will adapt that term here, calling it the “stochastic ef- be meaningful. fective action”. (For more context and the relation to The solution to the problem of divergences is to rec- literature, see App. A.) Though it is only a difference of ognize that the discrete model has a natural scale [48], static entropies in the equilibrium ensemble, the stochas- which is an upper bound on the frequencies that may tic effective action will become dynamical in ensembles sensibly appear in probes of the theory. The natural of histories, where it will be the strict counterpart to the scale is the diffusive relaxation frequency in the con- quantum effective action. tinuum model. For probes with higher frequencies, the constants k± describing transition rates no longer retain their meaning or values. They were defined as lumped- A. The equilibrium distribution of the two-state stochastic process parameter representations of escape paths, in a potential which was assumed to be fixed during the period of the escape. Faster probes “melt” the discrete approximation, At an equilibrium steady state the solution to the mas- and require that the description revert to the underlying ter equation (3) is the binomial distribution continuum. n kna k b N ρeq = − + na,nb N (k+ + k−) na III. LARGE-DEVIATIONS ANALYSIS OF THE EQUILIBRIUM TWO-STATE STOCHASTIC na nb N =ν ¯a ν¯b . (4) PROCESS na Total particle number N = n +n is conserved by all We now begin the analysis of the equilibrium distribu- a b terms in the transfer matrix of Eq. (3). The equilibrium tion for the two-state system. Since the entire equilib- occupation fractions are rium distribution may be written down analytically (it 1 is a binomial distribution), the purpose of this “analy- −βa k− e sis” is to introduce the key quantities expressing large- ν¯a = ≡ k + k Z deviations scaling, along with systematic ways to com- + − 1 −β1 pute them using generating functions. The methods k+ e b ν¯b = , (5) and the asymptotics will generalize immediately to time- ≡ k+ + k− Z1 dependent systems that are not exactly solvable, or at least very inconvenient to write in closed form. in which The large-deviation result we will derive is that, for any −β1 −β1 Z1 e a + e b (6) apportionment (na, nb) of N particles to the two states, ≡ the log-probability of this apportionment in the equilib- is the called the partition function [11] for a one-particle rium distribution is the difference of the joint entropy ensemble in this two-state system. from its maximizing value. A more extensive taxon- The Gibbs free energies for non-interacting particles omy of large-deviation results for equilibrium ensembles scale linearly (that is, they are “extensive”) in parti- is given in Ref. [1]. Non-maximizing values of (na, nb) are cle number [11], so the structural terms in the large- termed fluctuations, and the relation between entropies deviations formulae for fluctuation probabilities will be and log-probability for sample values is therefore called functions of the ratios a fluctuation theorem. na We will isolate this leading-exponential term in the ˆna ≡ N log-probability by using the cumulant-generating func- n ˆn b . (7) tion to shift the distribution, effectively projecting onto b ≡ N 9

In both steady-state and time-dependent probability dis- The one-particle minimum expressed in terms of frac- tributions, Roman na, nb, ˆna, ˆnb will be used for dis- tional occupancies, which is the descaled version at any crete particle-number indices, respectively un-normalized N, gives an N-independent relation between the chemi- or normalized by N. When the description of distribu- cal potential per particle, and the one-particle partition tions is transferred to the generating function, the corre- function, sponding continuous indices will be n , n for absolute a b 1 T 1 T min ˆna a + kB log ˆna + ˆnb b + kB log ˆnb number, and νa, νb for relative numbers corresponding ˆn to the definitions (7). = k T log Z . (12) For dynamical as for static systems, it is convenient − B 1 to study open-system properties by considering an open These are the standard relations for ideal gases or dilute system and its environment to be components in a larger solutions. closed system. Here the closed system will be defined by N as a parameter, and we introduce the stochastic variable under the reaction n (n n ) /2, so that ≡ b − a 1. The aggregation of state variables and the fluctuation N probability n n a ≡ 2 − N The local-equilibrium approximation for the two-state n +n. (8) b ≡ 2 system allows us to approximate the log-probabilities for non-equilibrium configurations of (n , n ) by the sum of When only n is denoted explicitly, the distribution ρ a b na,nb free energies for the individual wells. At the minimizing will be indexed ρn. The relative particle number asym- value of (n , n ) for this sum, the equilibrium free energy metry is likewise defined as ˆn (ˆn ˆn ) /2. Its counter- a b ≡ b − a for the composite system is attained. We may therefore part in continuous variables will be denoted ν. Equilib- express the minimum joint free energy per particle, using rium values for all numbers will be indicated with over- Eq. (11), as a definition for the single-particle chemical bars. potential for the equilibrated system, in a form equivalent The equilibrium distribution (4) may be cast in a va- to to Eq. (10): riety of instructive forms. Using the relation (2) of the rate constants to one-particle chemical potentials, and 1 min[G + G ] = k T log Z + k T log N Stirling’s formula for the logarithms of factorials, the fol- N a b ≡ a∪b − B 1 B lowing expressions re equivalent: 1 + k T log N. (13) ≡ a∪b B 1 ¯ ¯ ρeq ena(log ˆna−log ˆna)+nb(log ˆnb−log ˆnb) 1 na,nb Here a∪b is the whole-system counterpart to the one- ≈ √2πN ˆnaˆnb 1 1 particle chemical potentials a and b for subsystems in 1 ¯ e−ND(ˆnˆn) the local-equilibrium approximation. ≡ √2πN ˆnaˆnb The density values in the equilibrium distribu- 1 1 1 1 −βna(a+τ log ˆna)−βnb(b +τ log ˆnb) tion (9) may then be written as exponentials of the = N e √2πN ˆnaˆnb Z1 (system + environment) entropies of Eq. (B5), as

1 N log(N/Z1) −βnaa−βnbb = e e na +nb √2πN ˆn ˆn eq βNa∪b −β(Ga+Gb) a b ρna,nb e e ≈ 2πnanb na +nb = eN log(N/Z1) e−β(Ga+Gb) (9) βGa∪b na +nb −β(Ga+Gb) 2πnanb e e . (14) ≈ 2πnanb In the second line, D ˆn ˆn¯ is the Kullback-Leibler di- vergence [49], in which ˆnand ˆnstand¯ for the distribu- As explained in App. B, all three Gibbs free energies are functions of intensive kBT and p, and of extensive argu- tions with coefficients (ˆna, ˆnb), ˆn¯a, ˆn¯b respectively. The ments na, nb, and N. Since total energy is controlled by na- and nb-particle chemical potentials add concentration corrections to the entropy in the one-particle potentials, the environment temperature, it is the entropy compo- as nents of these G values for the subsystems, as functions of na and nb, which control the difference of Ga +Gb from 1 T a = a + kB logna Ga∪b. 1 T b = b + kB lognb. (10)

By comparing the second with the last-two lines of 2. Entropies of equilibrium and residual fluctuations Eq. (9), we see that the minimum of the Gibbs free en- ergies of the subsystems over n is The leading-exponential approximation of large- N deviations scaling separates extensive entropies of the β min[Ga + Gb] β min[naa +nbb]= N log . n ≡ Z1 subsystems, which were defined into the parameters of (11) the two-state stochastic process from coarse-graining the 10 continuum model, from entropies due to chemical fluc- Here new normalized fractions in the presence of q – tuation, which are sub-extensive. To see this, following which will be referred to as a source – are defined by any standard thermodynamics text [11], we write any of 1 1 1 1 ¯ −q/2 −(βa+q/2) the one-particle chemical potentials h kBTs , in ˆnae e ˆn˜a = which h1 is the specific enthalpy and s≡1 is− the specific ≡ ˆn¯ e−q/2 + ˆn¯ eq/2 (q) a b Z1 entropy. This decomposition yields a relation between − β1−q/2 ˆn¯ eq/2 e ( b ) subsystem and whole system free energies at equilibrium, ˆn˜ b = , (20) b ≡ ˆn¯ e−q/2 + ˆn¯ eq/2 (q) which is a b Z1 1 ¯ 1 ¯ 1 ha∪b = ˆnaha + ˆnbhb and the associated one-particle partition function with s1 = ˆn¯ s1 + ˆn¯ s1 ˆn¯ log ˆn¯ ˆn¯ log ˆn¯ . (15) source q is a∪b a a b b − a a − b b (q) − β1 +q/2 − β1−q/2 The free energy in the thermodynamic limit is then given Z e ( a ) + e ( b ). (21) by 1 ≡ The sum in the third line of Eq. (19) evaluates to unity, G = N h1 k Ts1 + k T log N , (16) a∪b a∪b − B a∪b B as for any normalized binomial, but it is instructive to which extensive in particle number, if we think of the use what was learned in forming Eq. (14) to recast the log N term as being set by pressure. In comparison, the sum and prefactor together in terms of a normalized dis- Shannon entropy of the full distribution over fluctuations tribution and a “penalty” term, as contains a term from the normalization of the exponential (q) (q) βG −βG(q) βG(q) na +nb −β G +G in Eq. (14), due to its width, ψ(z) e a∪b a∪b e a∪b e a b ≈ 2πnanb 1 n ρeq log ρeq 1 + log 2πN ˆn¯ ˆn¯ . (17) (22) − n n ≈ 2 a b n In Eq. (22) the subsystem free energies are defined at any This correction, being only logarithmic, is sub-extensive na, nb as (q) T in N. Ga = Ga +nakB q/2 G(q) = G n k Tq/2. (23) b b − b B B. Generating functions and the stochastic (q) (q) (q) effective action We wish to introduce Ga∪b as the minimum of Ga +Gb over n, as before. However, for discrete n, this gives a dis- continuous function of q. The thermodynamic usage of A moment-generating function – or “ordinary power- G(q) is as a macroscopic function of its intensive state series generating function” [50] – for a distribution in- a∪b variables, and therefore it will save intermediate steps dexed on the two numbers na and nb has two complex (q) arguments, and is written and notation simply to define Ga∪b as the minimum over n treated as a continuous variable, whose minimizing ar- ψ(z ,z ) zna znb ρ . (18) gument will then be a continuous function of q. a b ≡ a b na,nb na,nb For such binomial distributions or their multinomial generalizations, sources of the form kBTq = q/β behave To study the properties of ρn at fixed N, recognizing that N/2 n as shifts in chemical potential, in this case split evenly zna zna = (z z ) (z /z ) , we may set z z 1 and a b a b b a a b ≡ between subsystems a and b. The “penalty” term is ex- denote zb/za z. At equilibrium we will be interested ≡ pressed as a function of G G(q) : the minimum only in the one-argument function of z. However, as we a∪b − a∪b pass to dynamical descriptions, it will remain convenient chemical work needed to convert the system with poten- ¯ in some cases to retain both variables za and zb, even if tial difference b a and equilibrium ˆnto one with po- − (q) they are applied to a distribution with support on only tential difference q/β and an equilibrium ˆn¯ b − a − one value of N. determined by the new potentials. β G G(q) is The weight factors zn have an effect similar to shift a∪b − a∪b also the log-probability to obtain the shifted distributio n in the subsystem chemical potentials, which will recur n repeatedly in our analysis. Therefore we denote z = eq, from the original through the set of weights z . and write the one-variable generating function as The penalty function is referred to as the cumulant- generating function and denoted Γ(q). It is defined from n eq ψ(z)= z ρn the moment-generating function ψ by the relation n ψ(z) e−Γ(log z) = e−Γ(q). (24) na nb N ≡ = ˆn¯ /√z ˆn¯ √z a b From the definition (19) of the moment-generating func- n na tion it follows that (q) N Z1 na nb N = ˆn˜ ˆn˜ . (19) d log ψ(z) dΓ(q) (q) (q) Z a b = = n n , (25) 1 n na d log z − dq ≡ 11 in which we introduce the continuous counterpart n(q) as The definition is both the gradient of Γ and the expectation of n under the equilibrium distribution shifted by z. eff S (n) [Γ(q)+ nq] q=q(n) , (30) From Eq. (22), the cumulant-generating function is ≡ | just the Gibbs free energy difference in which q(n) denotes the inverse function of n(q) from Eq. (27). 6 From the gradient relation (25) it follows Γ(q)= βG(q) βG . (26) a∪b − a∪b that Recasting Eq. (25), dSeff(n) = q(n). (31) (q) dn dΓ(q) dG = a∪b = n(q), (27) − dq d ( k Tq) eff − B Therefore S must vanish at the mean value of n in the distribution ρ for which it is computed, because no source we recover the usual relation from equilibrium thermo- (i.e., q(n) = 0) is required to yield the equilibrium value dynamics [11], that the gradient of the Gibbs free energy as the mean. with respect to the chemical potential is the particle num- Addition of the term n(q)q to Γ in Eq. (30) cancels only ber. (q) For reference in later sections, we may compute closed the explicit q/β term in the shifted free energy Ga∪b, leav- 1 1 (q) (q) forms for the various quantities. Define ¯ b a, dual ing the particle assignments na , nb unchanged. The to the relative particle-number difference ≡ˆn.¯ Then− terms that remain are precisely those in the fluctuation theorem: they are the subsystem free energies evaluated N 1 n(q) = tanh (q β¯) , (28) at shifted ns. The effective action may be written 2 2 − Seff(n)= βG (n )+ βG (n ) βG (N) . (32) and a a b b − a∪b β¯ 1 In Eq. (32) the free energies G are now regarded as func- Γ(q)= N log cosh log cosh (q β¯) . (29) 2 − 2 − tions of the continuous-valued na and nb rather than the discrete indices na and nb on which the distribution is In continuum field theories with many particles and defined. nonlinear interactions among them, it is often necessary Referring back to the distribution over fluctuations in to approximate the moment- and cumulant-generating eq eff the original ρns,np of Eq. (14), we see that exp S functions by series expansions in the variance of the directly extracts the leading exponential dependence− of Gaussian approximation to fluctuations. In such an ex- the fluctuation probability. It omits sub-extensive nor- pansion, The gradient of Γ yields all of the connected malization factors related to the width of the distribu- graphs giving n(q), while the gradient of ψ gives a sum tion. Therefore the effective action directly expresses the over all graphs (see Ref. [22] Ch. 16). large-deviations scaling for fluctuation probabilities. The log-probability approximated by Eq. (32) scales linearly in N, and its structure is determined by the remaining 1. The stochastic effective action for single-time N-invariant fractions νa and νb. fluctuations For reference in comparison to later expressions for the effective actions of fluctuating histories, the effective ac- App B reviews the fact that the Gibbs free energy is tion for the equilibrium distribution evaluates to a Legendre transform of the entropy. Thus, the entropy is the converse Legendre transform of the Gibbs free en- β¯ 1 Seff(n)= N log cosh log cosh q(n) β¯ ergy. We may therefore expect that by Legendre trans- 2 − 2 − forming Γ(q), we will arrive at a direct expression for the q(n) 1 entropy differences that govern internal fluctuations of + tanh q(n) β¯ , (33) 2 2 − particle number, without reference to external tempera- ture or pressure. The Legendre transform of the cumulant generating in which q(n) is evaluated by inversion of Eq. (28). function Γ is known as an effective action [22]. When it is computed for the single-time binomial distribution, it hardly seems to justify its name, if we expect an ac- tion in the sense of Hamiltonian dynamics. However, it 6 For the equilibrium distribution, and indeed all other distribu- will become just such an action in the time-dependent tions computed below, this inverse will be unique. For problems ensemble. Here we introduce the Legendre transform for in which it is not unique, such as occur along the co-existence its statistical meaning, and later we introduce the dy- curves of first-order phase transitions, the Legendre transform is namical version. replaced by the Legendre-Fenchel transform [1, 2]. 12

IV. THE DOI-PELITI CONSTRUCTION FOR the starting point for methods of Gaussian integration, THE DYNAMICAL TWO-STATE SYSTEM Langevin equations, or other commonly used approxima- tion schemes. We now construct a second generating function and ensemble, for the time-dependent distribution ρ evolving under the master equation (3), rather than merely for its equilibrium steady state. The equilibrium fluctuation re- 1. The master equation expressed as a function of sults of Sec. III will reappear within this larger construc- mass-action rates tion, as probabilities for histories conditioned on their behavior at a single instant but on no other information. The two-state master equation (3) is an instance of Here, however, the single-time fluctuation probabilities a wider class of equations describing single-particle ex- will no longer be computed as primitive quantities. In- changes. A notation for the more general class is intro- stead, they will arise as integrals of log-probability along duced here, both because it cleanly divides the particle- the entire histories which are most-likely, conditioned on hopping terms associated with stochasticity from the achieving a particular deviation at a single time. probabilities which coincide with mass-action rates, and In keeping with the shift from states to histories, because it immediately generalizes to more complex reac- the cumulant-generating function and effective action in tion stoichiometries, such as those required by chemical this section will be functionals, whose arguments are se- reaction networks. The master equation for the two-state quences of occupation numbers (na, nb). The description system is written in this general form as derived from the master equation will still only count particle numbers, and will not resolve identities. Un- ∂ρn = e∂/∂ni−∂/∂nj 1 r ρ . (34) like the equilibrium constructions, the path-space large- ∂t − ji n deviations formulae will not allow us to identify the nat- i,j∈{a,b} ural path-entropy functions, for which we will require the caliber formulation of Sec. V. However, this section will Here n is a vector whose components give the number of introduce the remarkable Freidlin-Wentzell formulation particles of each type i 1,...,I (here n=(na, nb)). For ∈ 8 of large-deviations theory as a Hamiltonian dynamical the linear reaction model, the mass-action rates rji system, in which the stochastic effective action will be- come a genuine Lagrange-Hamilton action functional.7 rba = k+na

rab = k−nb (35) A. Liouville time evolution and the coherent-state expansion for its quadrature describe the probability for particle transfer from state i to state j. The form (34) extends immediately to ar- The reason to study generating functions at equilib- bitrary many-particle exchanges with rates that may in- rium is that the moments of the equilibrium distribution volve powers of many different components of n, and to may be more relevant and robust descriptions of fluctua- arbitrary network connections. The general methods de- rived below extend immediately to all those cases. Even tions than individual probability values ρna,nb . Similarly, the reason to use generating functions to study dynamics the shortcuts we will introduce, to pass directly from the is that the basis of single-particle hops is often less in- form of the master equation to the action functional of 9 formative than a basis in the collective motions of many Freidlin-Wentzell theory, apply with no changes. particles. The master equation formalism acts on probabilities in the basis of single-particle hops. The counterpart, which acts on the moment-generating function, is known 8 We write the mass-action rates as going from i to j with this as a Liouville equation. When specified in complete de- index ordering, because they will appear again later as entries in tail, the Liouville equation has the same information a transfer matrix, which is naturally written as acting from the as its corresponding master equation. However, low- left on column vectors of occupation numbers [ni] order approximations to the Liouville form will often 9 The reader is cautioned that in cases where multiple reactants of contain more of the total information than similarly the same type participate in a reaction, the assumption of sepa- ration of timescales may be inconsistent with using equilibrium reduced approximations in the basis of single-particle state variables in the minimal way they are used here. An exam- hops. Such approximations are sometimes called the hy- ple is provided in Ref. [53], where the local-equilibrium approx- drodynamic limit [52]. Therefore the Liouville form is imation may still be used, but it requires replacing the discrete master equation with a full Boltzmann transport equation for continuous-valued positions and momenta. An alternative solu- tion is to add time-local corrections known as “contact terms” to represent the correct counting of many-particle states. For 7 For an introduction to action functionals and Hamiltonian dy- the linear system such complications do not arise, so we will not namics, see Ref. [51]. digress on them further here. 13

2. Definition of the Liouville operator from the transfer eδt L is written as the exponential of a sum or integral. matrix Such expressions are known as time-ordered exponential integrals. For the Liouville operator with constant co- This section sketches the conversion from number efficients, time-ordering serves no explicit purpose, but terms and index shifts in the master equation, to the when time-dependent sources are introduced below, it equivalent polynomials and differential operators in the will become necessary. Liouville equation. It then converts the representation on We may efficiently approximate the product (38) by polynomials to the representation introduced by Masao interposing a complete distribution of generating func- Doi [4, 5], in terms of raising and lowering operators in tions between each of the small increments of time evo- an abstract linear vector space. The systematic construc- lution in the second line, which we develop in the next tion of Liouville forms from master equations is now stan- sub-section. Doing so, however, introduces a distinct set dard [16, 37], and its form for the linear mass-action ki- of arguments z at each time, which are cumbersome to netics of Eq. (35) is the simplest possible. work with. Therefore, at intermediate times between 0 In the vector notation, the generating function at a and T , it is conventional to adopt an abstract notation single time becomes a function of a vector z whose com- for the complex vectors and derivatives [4, 5], which more ponents correspond to those of n. Here the moment- clearly reflect the nature of the generating function as a generating function is vector in a normed linear space (a Hilbert space). The correspondence of the elementary monomials, ni ψ(z)= zi ρn. (36) † zj a , (39) n i∈{a,b} ↔ j ∂ ai, (40) Time evolution under Eq. (34) implies time evolution of ∂zi ↔ ψ according to reflects the property that complex variables and their ∂ψ(z) n ∂ρn derivatives satisfy the algebra of raising and lowering op- = z k 10 ∂t k ∂t erators familiar from the quantum harmonic oscillator, n k∈{a,b} ∂ † zj nk i i = 1 z r (n ) ρ , ,zj a ,aj = δj. (41) k ji i n ∂z ↔ zi − i i,j∈{a,b} n k∈{a,b} Polynomials built from these variables live in a func- zj ∂ nk = 1 rji zi zk ρn, tion space built on a “vacuum” which is simply the num- zi − ∂zi i,j∈{a,b} n k∈{a,b} ber 1, denoted ∂ z, ψ(z) . (37) ≡ −L ∂z 1 0) . (42) ↔| Here we have written out the functional dependence of The conjugate operation to creating a generating func- rij on the number indices of the particle species enter- tion is the evaluation of the trace – generally with ing half-reactions explicitly, to show how those number moment-sampling operators to extract its moments – indices are replaced by products zi∂/∂zi acting on the so that the conjugate operator to the right-hand vac- basis polynomials of the generating function. Successive uum (42) is an integral, lines of Eq. (37) reflect the action of successive terms in the master equation (34). The extension of these steps to dIzδ(z) (0 . (43) ↔ | multi-species reactions also follows a standard form [16], of which our example is representative. The differen- From these definitions an entire function space with inner tial operator (z, ∂/∂z) is the Liouville operator for this product can be built up, as reviewed in Refs. [16, 17].11 stochastic process.L Time evolution of the density ρn over an interval [0,T ] is performed by taking the quadrature of Eq. (37), which we may think of as being performed over a sequence of 10 Of course, in the approach of “quantizing” the classical Har- short time intervals of length δt: monic oscillator to a differential operator on Schr¨odinger wave

T functions, this map from polynomials to operators is the origin dt L of the raising and lowering operator algebra. ψT = e 0 ψ0 T 11 It is conventional to denote the inner product for classical T/δt stochastic processes with round braces, as in |0), to distinguish δt L = e ψ . (38) them from the quantum bra-ket notation |0 [54]. This notation T 0 k=1 reflects the fact that it is not in the operator algebra, but in the space of functions and the inner product, that the distinc- denotes time-ordering in the product, and we will use tion between classical and quantum-mechanical systems lies, as Tthe same notation when the product of exponential terms discussed in the introduction. 14

i 3. The coherent-state expansion and the fact that all a ∂/∂zi annihilate the right vac- uum (42), it follows that↔ Eq. (44) does define an eigen- i The transformation from discrete number-states to state of all the lowering operators a with corresponding i generating functions is the Laplace transform. It may eigenvalues φ , be applied within the discrete time slices of the quadra- ture (38), where it makes use of the Laplace transforms ai φ = φi φ . (45) of intermediate Poisson distributions, which are known as coherent states. It has become conventional in the The left-hand coherent state dual to Eq. (44) is given Doi-Peliti literature to denote the mean values of these by Poisson distributions with the field φ; for a multi-state † † system, we index them with a complex-valued vector field φ† = (0 eφ ae−(φ φ)/2 | φ. The conjugate vectors to these Poisson distributions M are moment-sampling operators, weighted by Hermitian ∞ φ† a −(φ†φ)/2 conjugate field variables denoted φ†. = (0 e . (46) | M! † M=0 The φ will have the property that, dynamically, they propagate information backward in time from final data Eq. (46) is likewise checked to be an eigenstate of the imposed by the weights za, zb. The interpretation of this raising operators, with eigenvalues φ†, property is that the moment-sampling operators capture j conditional distributions at earlier times, conditioned on † † † † the final values they produce. The fields φ and φ† will φ aj = φ φj. (47) turn out to be the position fields and their conjugate- momentum fields in one version of the Freidlin-Wentzell The presence of a corresponding to ∂/∂z in Eq. (46) theory, and the Hamiltonian under which they evolve gives the left-hand coherent state the interpretation of will be the Liouville operator. The way evolution un- a moment-sampling operator. This state displaces any † der the dynamical system is made consistent with the argument aj in the state that it acts on (through the in- “backward” propagation of information, by the conju- † ner product), by a summand φj. It then sets the original gate momentum, is that the momentum variables will a† to zero because all left-hand states include the trace typically have unstable evolution forward in time, while j the position variables will evolve in stable directions. defined by the left vacuum (43). In this section we construct the coherent-state expan- The sum of outer products of a left-hand coherent state sion for the quadrature (38), which is due to Peliti [6, 7]. and its dual right-hand coherent state gives a representa- The Liouville operator is a function of raising and low- tion of the identity operator as an over-complete integral i † over states, ering operators a , aj so the convenient expansion of Eq. (38) will be given by a basis of eigenstates of these dIφ†dIφ φ φ† = I. (48) operators. The coherent states are constructed to be such πI eigenstates. The right-hand coherent state is defined as The result of taking the inner product of Eq. (48) with † † φ = e−(φ φ)/2ea φ 0) | any state ψ on the right is that the moment-sampling M † ∞ a† φ operator φ simply transposes the weights that it ex- −(φ†φ)/2 = e 0) M! | tracts from ψ onto the left-coherent states φ in the M=0 I ∞ expansion (48). nm † φ nm −(φmφm)/2 m † If we insert a complete set of states in this form be- = e am 0) . (44) nm! | tween every term in the product (38), we express the m=1 nm=0 moment-generating function at time T as an integral over The components of its complex parameter vector φ corre- all intermediate values of the field variables φ, φ†, of the spond to those of z. From the commutator algebra (41), form

T −Γ (log z ,log z ) † † z −φ† φ + z −φ† φ −S−Γ log φ† ,log φ† ψ(z ,z ) e T a b = φ φ φ φ e( a aT ) aT ( b b T ) bT 0( a0 b 0). (49) a b ≡ D aD aD bD b 0

It requires some algebra, but no conceptual difficulties, to show that the action in Eq (49) is given by

S = dt ∂ φ† φ + ∂ φ†φ + . (50) − t a a t b b L 15

† The terms ∂tφaφa come from the inner products of states and the Freidlin-Wentzell quasipotential, in turn in this φ† and φ at successive times, and the evaluation of section. All introductions to Freidlin-Wentzell theory that I the Liouville operator sandwiched between these states have seen in the reaction-diffusion literature compute the converts ˆ into the quantity serving as the Hamiltonian. quasipotential by studying the stationary points in the Its form,L field variables φ, φ† [16, 17], or in the equivalent operator expectations [37]. For the two-state system, this choice = k φ† φ† φ + k φ† φ† φ . (51) L + a − b a − b − a b is technically the easiest, because the bilinear functional integral (49) is manifestly the “simple harmonic oscil- is obtained directly from the penultimate line of Eq. (37) lator” of stochastic processes. However, field variables with the mass-action rates (35). It is a general feature, are the least intuitive choice, because individually nei- which extends to non-linear rate equations, that factors ther φ nor φ† represents an observable quantity. (Recall of zi in the rate functions on the reactant side cancel that the function of φ as the expectation value of a Pois- factors 1/zi from the shift operators (the leading paren- son distribution is only propagated through time by the thesized term in Eq. (37)). moment-sampling weight given by φ†.) An elementary canonical transformation to action-angle variables [51] will produce fields which, unlike φ and φ†, correspond B. Standard computations of the Freidlin-Wentzell directly to observed average number occupancies, and to quasipotential a conjugate momentum that has the physical interpreta- tion of a chemical potential. Equations (49-51) are the stochastic-process reformu- lation of the single-time generation function (18). The Legendre transform log ψ(za,zb) continues to be the equi- eff 1. Diagonalization and descaling in field variables librium effective action S (na, nb) of Eq. (30). The field integral (49) for these quantities will in general be dominated by the stationary paths of the action (50). The quasipotential provides the leading log-probability The value of S in Eq. (50), on the stationary path, is for fluctuations. When large-deviations scaling holds, known in Freidlin-Wentzell theory as the quasipotential, the value of the quasipotential on its stationary path because it generalizes the Gibbs free energy from equi- should factor into an overall scale factor, and a scale- librium thermodynamics.12 The quasipotential is widely independent rate function. This scaling is particularly used to approximate the solutions to diffusion equations easy to achieve for the the Gaussian functional inte- on the boundaries of trapping domains [3, 21, 38]. gral (49) of the two-state model. The field variable φ The field integral of the last section has converted an ir- is rescaled to remove a factor N, which may be moved reversible stochastic process into a deterministic dynam- outside the entire action (50), leaving a bilinear func- ical system perturbed by fluctuations. Any such Hamil- tional of normalized fields that do not depend on system tonian system offers several choices of representation, scale, at any field values. In standard treatments using related under the changes of (field) variables known in coherent-state variables [16, 37], all scale factors come Hamiltonian dynamics as canonical transformations [51]. from the Poisson field φ, and their Hamiltonian-conjugate In addition to the canonical variables of the Hamiltonian momenta φ† are unchanged. dynamics, this action functional also admits an unusual rescaling is one of several changes of variable that “kinematic” interpretation, in which escape trajectories may be made in the bilinear field basis given by φ, and are represented as rolling in an energy potential, even φ†. Another is a rotation of components from (φ ,φ ) when they move opposite to the direction of classical dif- a b to a basis in which the independent dynamical quan- fusion.13 Finally, the leading expansion about the mean tity n (n n ) /2 may be separated from the non- behavior in the integral (49) comes from Gaussian fluctu- b a dynamical≡ conserved− quantity N. However, this rotation ations, which are readily converted to standard approx- must be used with care, as certain terms needed to define imation methods such as the Langevin formulation. For correlation functions in terms of the equilibrium distri- the bilinear action, the Langevin approximation in the bution disappear from the na¨ıve continuous-time limit, original fields φ, φ† is exact, though in other variables or and must be reconstructed from the explicit discrete-time for more complex reactions it will not be. We will con- forms [39]. We will also see that correct calculation of sider each of these aspects of the Doi-Peliti field integral, the cumulant-generating functional can rely on contri- butions from the apparently non-dynamical number N, depending on how total derivatives are handled in the action (50). 12 The Gibbs free energy and other Legendre transforms of the en- tropy are conventionally known as thermodynamic potentials, by In order to work out the correct treatment of such tech- yet a further analogy with the potential energy in mechanics. nical issues, we will begin in this section with the gener- 13 A thorough treatment of this kinematic potential is provided in ating function ψ(za,zb) of Eq. (18), with its apparently Ref. [55]. superfluous second complex argument. Once the two- 16

variable generating function has been understood, we will as the physically relevant time variable. Note that this perform the more complicated, but more direct, evalua- natural timescale for rare events relates the transition- tion of the single-argument form ψ(z) from Eq. (19). state chemical potential to that for the equilibrated joint We will see in this section that the coherent-state vari- system, defined from Eq. (13). ables φ, and φ† provide a powerful and simple route to In these new variables the action (50) becomes scaling and a variety of exact solutions in the Gaussian † † † model. However, they are not directly related to ob- S = N dτ ∂τ Φ Φˆ ∂τ φ φˆ + φ φˆ ν¯Φˆ servable quantities, and their simplicity in the case of − − − non-interacting particles can quickly give way to quite † † N dτ ∂τ Φ Φˆ ∂τ φ φˆ + ˆ . (55) complicated correlation functions if particle interactions ≡ − − L must be considered. They also produce a continuum limit which, although simple in form, does not explicitly rep- resent all quantities needed to compute correlation func- 2. Calculation of the two-argument cumulant-generating function tions. Therefore we will abandon the field basis of φ, and φ† as soon as the structure of its generating function has been understood, and move to a set of transformed vari- The leading large-N exponential dependence of the ables that lead to slightly more complex algebra, but do functional integral (49) comes from the stationary point not suffer from any of these shortcomings. of the action and the boundary terms at times 0 and T . The stationary-point conditions are vanishing of the first The field components φa, φb correspond to expectation 14 variational derivatives of the action (55), which take the values na, nb defined as in Eq. (25). Therefore, as for the equilibrium distribution, we may define a rotated form ˆ basis corresponding in a similar manner to n and N, by † ∂ † ∂τ Φ = L = νφ¯ the orthogonal transformation ∂Φˆ − ∂ ˆ φ (φb φa) /2 ˆ ≡ − ∂τ Φ= L† = 0 † −∂Φ φ† φ φ† b a ˆ ≡ − † ∂ † Φ φb + φa ∂τ φ = L = φ ≡ ∂φˆ † † † Φ φb + φa /2. (52) ˆ ≡ ˆ ∂ ˆ ˆ ∂τ φ = L† = φ ν¯Φ (56) A scale factor proportional to total particle number may −∂φ − − then be removed from the Poisson fields φ, Φ only, defin- Eq. (56) is the promised conversion of the first-moment ing normalized fields dynamics of the stochastic process (34) into a deter- ministic dynamical system with Hamiltonian ˆ. It fol- φˆ φ/N L ≡ lows immediately from these equations and from the Φˆ Φ/N, (53) lack of explicit time-dependence in ˆ, that this Liouville- ≡ Hamiltonian is a constant of the motionL along the sta- Here the relation of absolute to relative number fields tionary path, repeats the notation used to relate absolute to relative number indices n and ˆn n/N. Even though the relation d ˆ † ≡ L 0. (57) between variables φ, φ and the expected number indices dτ ≡ is not simple, in general it is the φ variables that carry ˆ the scaling with total system size. The boundary terms at time T T (k+ + k−) in the ≡ † A final coordinate transformation reflects the fact that variation of the functional integral (49) vanish at φaTˆ = † † the coordinate timescale dt is not characteristic of the za, φbTˆ = zb. The intermediate time-dependence of φ is dynamics for either average behavior or fluctuations. We thus therefore introduce a rescaled time variable τ, with Ja- ˆ φ† =(z z ) eτ−T , (58) cobean τ b − a † −β −1 and the corresponding solution for Φ is dτ e ( ‡ a∪b) e−β‡ k + k = = , (54) dt ≡ + − ν¯ ν¯ Z ν¯ ν¯ 1 a b 1 a b Φ† = (z + z ) +ν ¯ (z z ) 1 eτ−T . (59) τ 2 b a b − b − The field Φˆ is non-dynamical, and a steady-state so- 14 For the case φ† ≡ 1 representing thermal equilibrium and all lution for the relative number field φˆ in terms of Φˆ is classical diffusion solutions, φa = na and φb = nb as an expec- therefore give by tation in the functional integral, but not as a general insertion in higher-order correlation functions. φˆ =ν ¯Φˆ. (60) 17

† † The product φaφa + φbφb is the expectation of the total it again renders the correct answer in a most cryptic fash- number operator, and therefore must equal N. In the ion. rotated and descaled basis, this equality takes the form We begin with the conservation law (57). To obtain ˆ 1 = Φ†Φ+ˆ φ†φˆ a reference value for we note that in the equilibrium distribution where theL mass-action equations balance, 1 ˆ = Φˆ (zb + za) +ν ¯ (zb za) , (61) = 0. The action (55) is bilinear in the fields φ and 2 − L† φ , so that it has a simple relation to either of its gradi- which may therefore be used to assign a value to Φˆ in ents, terms of the arguments za, zb of the generating function. The equivalent expression for the relative number- ∂ ˆ ∂ ˆ ˆ = φ† L = L φˆ . (65) † † i † ˆ i operator difference φbφb φaφa /N is then L ∂φi ∂φi − i=a,b i=a,b 1 2ν = φ†Φˆ + 2Φ†φ.ˆ (62) Therefore, the stationary point conditions, together with 2 ˆ 0, imply S 0 when evaluated on any stationary L ≡ ≡ Eq. (62) is immediately solved from the boundary val- path. Thus the only contribution to ΓT (log za, log zb) ues (58,59) at time Tˆ to recover Eq. (28) as † † must come from the boundary term Γ0 log φa0, log φb0 1 in Eq. (49). Since the initial distribution has support only 2νT = tanh (log z β¯) . (63) 2 − on a single value na + nb = N, from the definition (18) In terms of this final value for the function ν, the func- we may write tional dependence at earlier times is given by −Γ log φ† ,log φ† , N log φ† φ† −Γ log φ† /φ† 0( a0 b 0 ) 2 ( a0 b 0) 0( ( b 0 a0)) ˆ e = e e , ν ν¯ =(ν ν¯) eτ−T . (64) τ − T − (66) The algebra of the preceding solution is simple and lin- in which the single-argument cumulant-generating func- ear, and yet it is profoundly obscure as a representation of tion corresponds to the logarithm of ψ(z) in Eq. (19). ˆ † † the physical distributions of random walkers in the two- From Eq. (58) at τ 0 and large T , φb φa → † † − → state model. This obscurity is the reason we will soon 0, and as we will see φb and φa remain nonzero, so abandon coherent-state variables. Note that the field φˆ † † log φb /φa0 0. Therefore, for the equilibrium dis- – na¨ıvely corresponding to the observable n – is actu- 0 → † † ally constant by Eq. (60). All particle number-dynamics tribution Γ0 log φb0/φa0 0 as well. The only con- † † → in this solution comes from the fields φ and Φ . These tribution to the generating function at time Tˆ, even with fields, known as response fields in the Doi-Peliti litera- z z 1, is the term N log φ† φ† corresponding to ture [39], are acting to propagate information from the b a ≡ 2 a0 b0 boundary condition at the final time Tˆ backward into the the total number, which is not even dynamical! −Tˆ interior of the functional integral, by selecting moments In the same limit e 0, setting zazb = 1, we have of the intermediate states with time-dependent weights. → The stationary-path solution (64) is the least- 1 † † cosh (log z β¯) improbable sequence of fluctuations to have led to the φ† φ Φ 2 − . (67) a0 → b0 → 0 → cosh 1 β¯ final value ντ . The single-time probability of ντ in turn 2 results from the accumulation of probabilities for succes- sive accumulating fluctuations in the integral (55) for S. The resulting evaluation for the cumulant-generating However, to identify the particle distribution in this solu- function is then tion requires incorporating the response fields, including the (strangely dynamical) response field Φ† associated † † ΓTˆ(log za, log zb)=Γ0 log φa0, log φb0 with the (unchanging) total particle number N. To show that, despite the difficulties of interpretation, N † † = log φa0φb the quasipotential recovers the single-time fluctuation − 2 0 1 1 theorems, we evaluate the cumulant-generating function N log cosh β¯ log cosh (log z β¯) , Γ . The calculation is easiest if we take Tˆ 1 so that the → 2 − 2 − Tˆ ≫ initial condition Γ0 may be evaluated on the equilibrium (68) distribution.15 For the exact bilinear action (50) a con- servation law greatly simplifies the calculations, although recovering Eq. (29). We obtain the correct answer from a collection of surface terms, while the single-argument cumulant-generating functional that should have con- 15 Any other initial condition would decay toward the equilibrium trolled dynamics dropped out. How shall we understand distribution for sufficiently large Tˆ. this? 18

3. Cancellation of surface terms, and the quasipotential as tional integrals extremely useful, but it is also the feature an integral of time-local log likelihoods that requires us to go through the full exercise of defining the generating functional of continuous-time sources, and The puzzle of the Gaussian evaluation of the cumulant- then computing the Legendre transform to a Stochastic generating function is solved by taking care with the free- Effective Action, to identify the history-dependent ob- dom we have to include total derivatives in the field ac- servations for which we are actually computing probabil- tion. The particular choice that moves all fluctuation ities. probabilities into the expected, single-variable cumulant- generating function Γ(log z) also sets up the action-angle change of variables in the next section, which will isolate 4. Action-angle variables only the dynamical observables. The property S 0, together with the general feature ˆ 0, implies for a≡ bilinear action that the sum of time- We next compute the same generating function using a L≡derivative terms equals zero independently. In particular, set of transformed variables in which the expected num- we may decompose these as ber indices n are given by elementary fields rather than by bilinear forms. In these variables the response field has † ˆ † ˆ 0= ∂tφaφa + ∂tφbφb the physically simple interpretation of a chemical poten- 1 tial. The transformed variables have advantages and dis- = ∂ log φ†φ† φ†φˆ + φ† φˆ 2 t b a b b a a advantages relative to the coherent-state field variables. † The Liouville/Hamiltonian operator in the transformed φ b 1 † ˆ † ˆ variables will no longer be bilinear, making solutions to + ∂t log † φbφb φaφa . (69) φa 2 − the less obvious, even though they may still be found exactly. On the other hand, the kine- † ˆ † ˆ The combination φbφb + φaφa is nothing more matic nature of the two-field model as a Hamiltonian than the conserved number 1, while the combina- system will be clearer. Also – although we have not considered fluctuations yet and have only talked about tion φ†φˆ φ† φˆ /2 is precisely the observable ex- b b − a a mean values – the transformed Hamiltonian will show pected number asymmetry ν. The total derivative explicitly all terms needed to compute fluctuations, in- † † (N/2) ∂t log φbφa could have been removed from the cluding those that vanished in the continuous-time limit − † action to cancel the term in the final generating func- in coherent-state variables φ and φ . In the transformed † variables it is no longer necessary to appeal to the un- tion, leaving only the argument Γ log φ /φ† and 0 b0 a0 derlying discrete-time form to compute correlation func- causing the magnitude of ΓTˆ(log z) to originate from the tions. remaining term in the action rather than from a bound- The canonical transformation, from coherent-state ary term. fields to action-angle fields (including the now-familiar Let us summarize what has been accomplished so far: factoring out of the scale factor N), is defined by We have recovered the value of the single-time generat- ing function and the mean value of its associated distri- φ† eηi bution, but we have also derived a new inference about i ≡ −ηi −ηi the most probable path of previous observations condi- φi e ni e Nνi, (70) ≡ ≡ tioned on those values at a time Tˆ. The fact that the stationary value ντ =ν ¯ for τ < Tˆ does not reflect the in- for i a,b . The fields ni now correspond to expecta- fluence of sources, or in fact any causal influence at times tion values∈ { } from Eq. (25), not only as single insertions, τ < Tˆ, but rather the conditional probability structure but in general products of fields in the Doi-Peliti func- generated by the stationary points of the field functional tional integral (49). In the transformed variables, this integral. This very powerful feature makes such func- integral becomes

T −ηaT −ηbT ψ(z ,z ) e−ΓT (log za,log zb) = η n η n e(zae −1)naT +(zbe −1)nbT −S−Γ0(ηa0,ηb0). (71) a b ≡ D aD aD bD b 0

The action in transformed variables becomes = N dτ (∂ η ν + ∂ η ν )+ ˆ , (72) − τ a a τ b b L in which the Liouville/Hamiltonian has the form

S = dt [ (∂tηana + ∂tηbnb)+ ] ηb−ηa ηa−ηb − L = k+ 1 e na + k− 1 e nb, or L − − 19

ηb−ηa ηa−ηb ˆ =ν ¯b 1 e νa +ν ¯a 1 e νb. (73) Again ˆ 0, but now a non-trivial relation exists be- L − − tween theL ≡ generating function and the stationary-path action, which may be written We now observe an important and general relation be- ˆ tween the field action (73) and the transfer-matrix terms T ∂ ˆ Γ (log z)= N dτ Lν + ˆ +Γ (0) . (76) in the original master equation (3). (Recall that the Li- Tˆ − ∂ν L 0 0 ouville operator is defined with a minus sign relative to Here ν and (the implicitly present) η are evaluated the transfer matrix.) The original index shifting opera- over a stationary path. Of course the second factor tors ∂/∂ni have been replaced by momentum fields ηi, ˆ − 0 need not have been written, but it serves to and the discrete indices ni have been replaced by field emphasizeL ≡ the Legendre-dual relation between the Li- variables ni, whose expectation values in the functional ouville/Hamiltonian operator and its stochastic “La- integral are those of the indices n in the time-dependent i grangian” ∂ ην + ˆ, which is similar to the duality density ρ . Thus we see that the tedious task of ex- τ na,nb between the− generatingL functional and the effective ac- pressing the operator algebra and coherent-state expan- tion. sion may be bypassed by a simple notational replacement, to arrive at the functional integral directly. The glossary indicating which number terms are substituted in pass- 5. The canonical versus the kinematic description, and its ing between the master equation and Liouville operator consequences for time-reversal in one-dimensional systems. is given in Table. I. The same forms hold for the mas- ter equation (34) with more general non-linear reaction The cumulant-generating function, evaluated as a rates, and the forms for large-deviation rates in the more stationary-path integral in Eq. (76), is the quasipoten- general case are provided in App. D. tial of Freidlin-Wentzell theory. For single-time fluctua- The basis rotation in action-angle variables is now tions, it is a difference of Gibbs free energies, by Eq. (26). identical to that performed in Sec. IIIA for the particle With respect to the underlying dynamical system, how- numbers and the chemical potentials, with η ηb ηa ever, the quasipotential is an action, with the Liouville ≡ − transforming as the chemical potential dual to n. The operator acting as its conserved Hamiltonian. The Liou- resulting action becomes ville operator, however, still masks the deep simplicity of one-dimensional stochastic processes, which is provided ˆ by yet another level of description, in terms of a kinematic N T S = (η + η ) potential. The kinematic potential is the counterpart to − 2 b a the familiar potential energy in Hamiltonian dynamics, 0 and it provides the most intuitive connection to finite- η −η + N dτ ∂τ ην +ν ¯bνa (1 e ) +ν ¯aνb 1 e . temperature instanton methods [42]. Here we show how − − − the kinematic description is extracted, and use it to prove (74) that, for one-dimensional systems, the most-likely path to arrive at any fluctuation is the time-reverse of the clas- The first term, arising from the total derivative discussed sical diffusive-relaxation path from that fluctuation. This at the end of the preceding subsection, is precisely the one extension of Onsager’s near-equilibrium results [23, 24] is independent of the magnitude of the fluctuation or the needed to cancel a factor of (N/2) (ηb + ηa) in the initial form of the potential. The ability to prove it so easily in generating function Γ0(ηa ,ηb ) in Eq. (71), leaving only 0 0 the general case is an example of the power of Freidlin- the single-argument function Γ0(η0) for the response field conjugate to ν. The net effect of the boundary terms at Wentzell methods in some circumstances. In dynamical-systems terms, ν is a field with canonical time Tˆ is to set ηa ˆ = log za and ηb ˆ = log zb with the T T momentum η with respect to the Hamiltonian ˆ. While result that ΓTˆ(log za, log zb)=ΓTˆ(log z) ΓTˆ(q) from L Eq. (24). We therefore proceed to solve≡ for the classi- η is a canonical momentum, however, it does not play cal stationary paths as in the preceding section, and dis- the role of a kinematic momentum in the stationary-path pense with further consideration of these extra, canceling solutions. To express the kinematic variables, we recall β¯ terms. thatν ¯a/ν¯b = e from Sec. IIIA, or equivalently 2¯ν = tanh(β/¯ 2). Correspondingly, as a purely notational The stationary-path equations, which are the equa- −device, for any instantaneous value of ν, we may denote tions of motion with respect to the dynamical system, in action-angle variables are νa eβ, (77) νb ≡ ∂ ˆ so that is the chemical potential for which that value ∂ η = L τ ∂ν of ν would result at an equilibrium. With this notation ∂ ˆ we may recast Eq. (74) – as promised, now ignoring the ∂ ν = L. (75) surface term – as τ − ∂η 20

β β S = N dτ ∂ ην + 2√ν¯ ν¯ ν ν cosh ( ¯) cosh η + ( ¯) − τ a b a b 2 − − 2 − sinh(η/2) β¯ η β¯ η = N dτ ∂ ην + sinh − + 2ν cosh − . (78) − τ cosh(β/¯ 2) 2 2

The first line of Eq. (78) shows us that η β (¯ ) /2 then obtain is, to leading quadratic order about zero,− the term− that ˆ 1 log z creates the “kinetic energy” in the Hamiltonian . Its Γ (log z) Γ (0) = N log cosh (η β¯) , (81) vanishing results in the stationary solution ∂ ν =L 0 for T − 0 − 2 − τ 0 its conjugate number field. The η-independent part of the Hamiltonian, 2√ν¯aν¯bνaνb cosh[β (¯ ) /2] 1 , which again reproduces Eq. (29). defines the kinematic− potential for{ the dynamical− system.− } 6. Langevin approximation and the magnitude of The fields ν, η β (¯ ) /2 follow a familiar Hamilto- fluctuations about time-dependent solutions nian phase-space− dynamics− in this potential, with the kinematic momentum vanishing at the turning points It is important that the equilibrium generating func- ˆ = 0, and the potential vanishing only where the tion (19) produces not only an offset equilibrium value, L mass-action rates satisfy conditions of detailed balance. but an entire distribution corresponding to an effective These total stationary points are all unstable, just as for shift in the chemical potential between the a and b states. problems of barrier penetration or escape in equilibrium The width of this distribution is identical to the fluctua- Hamiltonian statistical mechanics [42]. A more detailed tion variance in ν produced by the Gaussian integral with treatment, for problems with multiple basins of attrac- action (78), as we now show. The easiest demonstra- tion and an interesting instanton structure as a result, is tion is by means of the Langevin approximation, which provided in Ref. [55]. is also useful to show how the classical Langevin equa- The very strong consequence of the form in the first tion is generated from the more complete Doi-Peliti func- line, in one dimension, is that the conservation law ˆ 0 tional integral. This construction is particularly elegant has only two solutions at any value of ν: η = 0L≡ and in action-angle variables, which lead to a Langevin equa- η = β (¯ ), by symmetry of the second cosh. Then, tion directly for the particle numbers. The coherent-state − by antisymmetry of the gradient of the same cosh, ∂τ ν is variables in which the Langevin equation is usually con- equal in magnitude and opposite in sign at these two so- structed [56] only offer this interpretation for fluctuations lutions. Thus, the stationary-path solution at nonzero η about classical diffusion paths, where φ† 1, and they ≡ is the time-reverse in ν of the classical diffusion solution. can become quite complicated to compute about other Moreover, we have β = β¯ η at this solution, which backgrounds. In action-angle variables, the same inter- together with the form in the− second line of Eq. (78) im- pretation is valid in for all paths, including those that mediately gives propagate conditions about single-time fluctuation back- ward in time. 1 1 The fact that the Gaussian integral delivers the fluc- ν = tanh (η β¯) . (79) 2 2 − tuation magnitude (and more generally, the correct cor- relation structure) even for time-dependent solutions is For these solutions we may recover the value of an important improvement over an approach often taken the single-time cumulant-generating function from the with Langevin equations, which is simply to “guess” a action-angle equations of motion. Vanishing ˆ gives the fluctuation magnitude as an independent input to mod- action as an integral over its kinetic terms alongL the sta- els [56, 57]. The variance produced is also the unique tionary path, which no longer vanish, value for which the terms in the path entropies of Sec. V will cancel to produce the continuous-time hydrodynamic Tˆ limit of Ref. [52]. ΓT (log z) Γ0(0) = N dτ ∂τ ην The Langevin stochastic differential equation encodes − − 0 exactly the same approximations as the Gaussian approx- log z imation to fluctuations in the functional integral. About = N dη ν. (80) cl cl − 0 any solutions η , ν , to the stationary-point equations, we write general η = ηcl + η′, ν = νcl + ν′, and expand S Using the expression (79) in the last line of Eq. (80), we from Eq. (74) to second order in primes.

cl cl 1 cl cl S = Scl + N dτ η′ ∂ + ν¯ eη +ν ¯ e−η ν′ ν¯ νcleη +ν ¯ νcle−η η′2 (82) τ b a − 2 b a a b 21

In Eq. (82) we denote by Scl = N dτ ∂ ηclνcl + ˆ ηcl,νcl the action of the classical stationary path. Terms − τ L ′ ′ linear in η and ν vanish as the condition for the stationary-point solutions to hold. We could complete the square in η′ in Eq. (82), leading to the construction of Onsager and Machlup [58], and we will do this in a later section. An alternative approach, pursued here, is to perform the Hubbard-Stratonovich trans- cl cl 1/2 Aux formation [22, 59] by introducing an auxiliary field λ, with a functional integral Det ν¯bνa +ν ¯aνb λ exp S which is just a representation of unity, in which the auxiliary field action is given by D 2 Aux N 1 cl ηcl cl −ηcl ′ S dτ λ ν¯bνa e +ν ¯aνb e η (83) ≡ 2 ν¯ νcleηcl +ν ¯ νcle−ηcl − b a a b The sum of actions for the original and auxiliary fields becomes

2 Aux N λ ′ ηcl −ηcl ′ S + S = dτ + 2η ∂τ + ν¯be +ν ¯ae ν λ (84) 2 ν¯ νcleηcl +ν ¯ νcle−ηcl − b a a b The integral η′ in the original functional integral, if rotated to an imaginary contour of integration,16 simply produces the functionalD δ-function

Aux 2 ′ − S+S ηcl −ηcl ′ N λ η e = δ ∂τ + ν¯be +ν ¯ae ν λ exp dτ . (85) D − − 2 ν¯ νcleηcl +ν ¯ νcle−ηcl   b a a b   

The last of the original functional integrals, over ν′ ad- We will return in Sec. IVF to check that this result agrees mits only those solutions which satisfy the Langevin equa- with the variance of the single-time distribution given by tion the equilibrium generating function (22).

ηcl −ηcl ′ We have given a thorough treatment of the single-time ∂τ + ν¯be +ν ¯ae ν = λ. (86) generating function to provide multiple points of refer- ence for terms that cause φ† fields to deviate from unity, λ, known as the Langevin field, has the correlation func- or η to deviate from zero, and to interpret their effect tion at any two times on stationary points. It will now be straightforward to cl ηcl cl −ηcl place all such terms entirely within the functional inte- ν¯bνa e +ν ¯aνb e ′ λτ λτ ′ = δ(τ τ ) , (87) gral rather than in boundary terms. We will do so first N − for a discrete source with an identical effect to the single- as a result of the Gaussian kernel of integration. time generating function, and examine the future as well A case of special interest, for its relation to the single- as past properties of the stationary paths. We will then time distribution – at or away from equilibrium – is the define exact solutions for the more general case of con- ′ 2 fluctuation variance at equal times (ντ ) about a con- tinuous sources, and finally study the analytic structure stant or long-time persistent background. We will not of the small-fluctuation limit, which is simply a linear provide details about the inversion of Eq. (86) to express expansion in point sources. ν′ in terms of λ and a retarded Green’s function, which may be found in Ref. [39]. However, the general result is that for backgrounds that persist much longer than the C. Generating functionals and arbitrary time-dependent sources ηcl −ηcl decay time ν¯be +ν ¯ae , the single-time variance is given by When sources interior to functional integrals are used cl ηcl cl −ηcl to compute the probabilities of complicated histories, the ′ 2 ν¯bνa e +ν ¯aνb e (ντ ) = . (88) construction is usually done in two disconnected stages. 2N ν¯ eηcl +ν ¯ e−ηcl First, the final-time arguments z are set to one, so that b a i the moment-generating function ψ(za,zb) becomes sim- ply the trace of the probability distribution at time T . That is, the elaborate field integrals (49,71) are simply 16 Note that η′ is always integrated along an imaginary contour, as the condition for stability of the Gaussian integral. This is true in complicated expressions for the number 1. New terms the Onsager-Machlup construction, and it is also the imaginary involving the integration variables and external fields are part of η′ that defines the quantity behaving as a momentum in then simply inserted into the action, with the under- the kinematic description. standing that these are “sources” that perturb the par- 22 ticle motion. will write Here we will introduce sources systematically as part of the construction of the functional integral, to make clear qτ = j(τ) δτ, (90) their connection to the original construction of the gener- and take j(τ) j , called a current, to be a smooth ating function. The construction involves modifying the τ function of τ in≡ the limit δτ 0. Because both relax- map by which we have, up to now, simply re-interpreted ation and fluctuation effects are→ governed by timescales intermediate variables z and ∂/∂z as operators a† and a. in ∆τ 1, we may readily consider sources jτ that have A generating functional is the time-evolved product of large magnitude∼ over some range in τ that is 1. Cur- a sequence of generating functions produced by adding rents of this form may be constructed to approximate≪ the small continuous sources to all previous distributions. point source jτ qδ(τ τC ) for some particular time τC . That is, we replace the identification (40) with one in ≈ − Finally we set values za = 1, zb = 1 at time Tˆ, keeping which the abstract operators differ from variables z by only sources from nonzero j(τ) in the range 0 <τ < Tˆ. the addition of sources. At any time τ, we identify The field integral at τ = T therefore simply computes a trace, and all correlations are studied internally in the † qτ functional integral. The notation ψ(za,zb) is no longer zj aje , keeping ↔ needed, and we simply refer explicitly to the cumulant- ∂ ai. (89) generating functional Γ[j] whose argument is the function ∂zi ↔ j. Both coherent-state field variables and action-angle Moreover, to separate the discretization time δτ from variables will be of interest, and so we provide both forms characteristic timescales associated with the sources, we here. Eq. (49) is replaced by

T −Γ[j] † † 1−φ† φ + 1−φ† φ −S −Γ log φ† ,log φ† e = φ φ φ φ e( aT ) aT ( b T ) bT j 0( a0 b 0), (91) D aD aD bD b 0 in which j S N dτ ∂ φ† φˆ + ∂ φ†φˆ φ†φˆ φ† φˆ + ˆ . (92) j ≡ − τ a a τ b b − 2 b b − a a L

Likewise, Eq. (71) – removing the un-needed integration is the Hamiltonian of the dynamical system in the pres- variable ηa + ηb conjugate to N, and writing the single- ence of sources. Because ˆj now depends explicitly on argument generating function alone – becomes time through j, Eq. (57) isL replaced by

T −η ˆ e−Γ[j] = η ne(e T −1)nT −Sj −Γ0(η0), (93) d dj D D L ν . (97) 0 dτ ≡− dτ in which We also have as boundary conditions that η∞ = 0 (more generally ηa∞ = 0 and ηb∞ = 0 independently if Sj = N dτ ∂τ ην jν + ˆ (94) we had kept both sets of integration variables), or equiv- − − L † † alently φa∞ = 1 and φb∞ = 1. To understand how these with Liouville operator conditions lead to stationary-point solutions, which in- volves the stability structure of the functional integrals, η −η ˆ ν¯bνa (1 e ) +ν ¯aνb 1 e . (95) it is easiest to solve some examples. The general exact L≡ − − solution, in coherent-state fields, is given in App. C. The upper time limit T may now be taken to infinity and dropped from the notation. If we consider sources jτ 0 at both τ 0 and τ and which are smooth D. General solutions on→ the scale of the→ discrete→ sum ∞ (not a very restrictive condition) it follows from the stationary-path conditions in the presence of the source j that In all the following cases, we will suppose that the ini- tial distribution is the equilibrium distribution (4). Other ˆ ˆ † ˆ † ˆ initial conditions can easily be considered, but all con- j j φ φb φaφa /2 L ≡ L− b − verge to the equilibrium exponentially in τ, and so may = ˆ jν (96) be decoupled to any desired degree from the influence of j L− 23 at later times. We begin with a point source which recov- ers the single-time generating function, and then consider ντ − ν sources extended in time.

−5 −4 −3 −2 −1 0 1 2 3 4 5 τ 1. Point sources

FIG. 4. The time-symmetric, exponentially decaying station- Suppose that j qδ(τ τ ) for some time τ in the τ → − C C ary path associated with single-time δ-function sources. This sense of convergence of smooth distributions with com- solution illustrates the embedding of the single-time equilib- pact support and fixed area q. The exact solution in this rium large-deviations theory within the richer time-dependent case is most simply expressed in action-angle variables, large-deviations theory of the stochastic process. In the for- which can then be converted to coherent-state field vari- ward direction, classical diffusion is responsible for the decay, ables if desired. and the response field η ≡ 0. In the reverse direction, η From the action form (94) with this source, stationary back-propagates the constraint that a fluctuation of magni- tude (ντ − ν¯) is observed at time τC , and the dynamical- solutions are just those of the unperturbed action except C at τ , where η has the discontinuity system representation generates the least-improbable trajec- C tory of fluctuations that achieve that constraint. η = η q (98) τC +ǫ τC −ǫ − as ǫ 0. To understand what this implies, we return to → cl ′ The generating functional for the point source has the the kinematic form for (78) for the action, using η + η value already computed, as in Sec. IVB6. β is a function of ν from Eq. (77), ∞ cl cl cl which we indicate on the stationary path by writing . Γ[j]=Γ0(0) N dτ ∂τ η ν For free solutions we already know that ˆ = 0 requires − 0 L q either ηcl = 0 or ηcl = β ¯ cl . Therefore we may 1 − = N log cosh ηcl β¯ , (101) write − 2 − 0 β β cosh cl ¯ cosh η + cl ¯ = again recovering Eq. (29). Note that although νcl = 0 2 − − 2 − at all times, the integral receives contributions only from β τ <τ where η = 0. cosh cl ¯ (1 cosh η′) C 2 − − The effective action is numerically just that given in Eq. (33), because it differs from the generating functional cl β cl ′ + sinh η + ¯ sinh η . (99) by subtraction of the point source qντ as before. How- 2 − C ever, as a Legendre transform on a space of histories, we Our concern is with the second line of Eq. (99). This now regard it as the functional quantity appears with positive sign in ˆ and so describes Seff[n]= Seff Nνcl , (102) divergent fluctuations for η′ integratedL along a real con- cl tour. As in the case of the Hubbard-Stratonovich trans- whose argument ν is the whole exponential history ′ formation, the convergent contour of integration for η is given by Eq. (100) with ντC fixed by Eq. (28). The en- imaginary, where it behaves like an ordinary momentum tropy difference of Eq. (32) is now expressed not only as variable in finite-temperature field theory [60]. a probability of a fluctuation of given magnitude, but of The instability that governs the integration contour the most-likely sequence of previous and subsequent con- for fluctuations also implies that time evolution in the figurations consistent with that observed fluctuation but forward direction for real-valued η is unstable in gen- with no other information given. eral. This should not be surprising, as we have already To understand the meaning of the stochastic effective seen that the evolution equations for η are stable going action as a probability on histories, it is necessary to ap- backward in time, as η propagates context from observ- preciate the non-local relation between histories and the ables to their most-likely prior causes. The implication structure of the moment-sampling currents j which con- cl tain the minimal information to specify them. To extend for our stationary path is that if η∞ = 0 and jτ 0 for cl ≡ the formulae for the effective action to more general histo- τ >τC +ǫ, we must have ητ 0 for all τ >τC +ǫ. Hence cl ≡ ries, a variety of methods are known which produce both ητC −ǫ = q and for τ <τC + ǫ we are back to the station- ary path evaluated in Sec. IVB5. The consequence for non-local and exact, or local but approximate, solutions. the number field is that its magnitude at τ = τC is just that of Eq. (28), and that for times either before or after τ , νcl satisfies the generalization of Eq. (64) to 2. Continuous sources; linear-response approximation and C analytic structure

cl cl −|τ−τC | ντ ν¯ = ντC ν¯ e . (100) − − We now return to the second-order expansion of The time-dependence ofνcl is shown in Fig. 4. Eq. (94) in fluctuations, from Sec. IVB6, but this time 24 we complete the square directly rather than through E. The relation to Onsager’s “minimum entropy a Hubbard-Stratonovich transformation. Following On- production” property sager [23, 24, 58], we will expand about the equilibrium background where νcl ν¯ and ηcl 0, and will simply Eq. (108) is a relation between an offset n n¯ and a ′ ≡ ≡ − write η = η. (The more general problem of expanding transport current ∂τ n, which we might call v, a reaction about non-classical backgrounds conditioned on sources velocity in appropriate units of time. Recalling Eq. (32) could of course be considered as well.) The quadratic for the log-probability of single-time fluctuations, we may 2 expansion of the action takes the simple form recognize the combination (n n¯) / (2Nν¯aν¯b) as none ∞ other than Seff(n), the entropy− deficit from the equilib- Sj = Nν¯ dτjτ rium large-deviations principle. Immediately we would − 0 eff ∞ then recognize (n n¯) / (Nν¯aν¯b) as ∂S (n) /∂n, and 2 − + dτ (ν ν¯)[( ∂ + 1) η j] ν¯ ν¯ η . (n n¯) v/ (Nν¯aν¯b) as the rate of change in the equi- − − τ − − q b − 0 librium entropy of quasi-equilibrated subsystems. From (103) these associations we could rewrite Eq. (108) in the form The stationary-point equations are now linear-response 1 ∞ ∂Seff(n) 1 ∂2Seff(n) Seff[n]= dτ Seff(n)+ v + v2 , equations, 2 ∂n 2 ∂n2 0 ( ∂τ + 1) η = j (109) − (∂ +1)(ν ν¯)=2¯ν ν¯ η. (104) 2 eff 2 τ − a b in which the term ∂ S (n) /∂n gives the linear- η is the response to j through the advanced Green’s func- response coefficients 1/ (Nν¯aν¯b). tion: If we chose to regard minimization of Seff[n] as an inte- ∞ gral over time of two separate criteria – the first being the ′ τ−τ ′ eff ητ = dτ e jτ ′ , (105) probability of a fluctuation to n from S (n) and the sec- τ ond being a minimization over v – the minimized function while ν ν¯ responds through the symmetric Green’s func- of v given n would be the “entropy production” relative tion: − to a bilinear form v2∂2Seff(n) /∂n2 parametrized by the ∞ near-equilibrium response coefficients. In two papers in ′ − τ−τ ′ ν ν¯ =ν ¯ ν¯ dτ e j ′ . (106) 1931 [23, 24], this was the result derived by Onsager as a τ − a b | | τ 0 consequence of microscopic reversibility and referred to Eq. (106) reproduces the exact time dependence of as a “minimum entropy production” property. Onsager Eq. (100) for a point source, and gives the linear small-q referred to the bilinear form of currents v2∂2Seff(n) /∂n2 approximation to the magnitude (28). as the “dissipation function”. We now have a closed-form local expression for the Note three things: 1) It is not the “entropy produc- generating functional in terms of the response field η and tion” per se that is minimized, but only its value rel- the source j, because in these variables Γ[j] is just Sj ative to the dissipation function, which from the form evaluated on the stationary solution. From Eq. (104), of the system entropy may be arbitrary; 2) there is no this is obvious reason to interpret the dissipation function as a Tˆ Tˆ “constraint” on the entropy production, and thus entropy N 2 production is not “minimized subject to constraints”, as Γ[j]= Nν¯ dτ jτ dτ (2¯νaν¯b) η , (107) − − 2 is 0 0 an analogy to the way the entropy maximized subject with η given by Eq. (105). We recognize from Eq. (9), to constraints in equilibrium; and 3) the presence of the response coefficients in this form results from the near- the quantity Nν¯aν¯b as the variance of the equilibrium distribution and hence the susceptibility to perturbations equilibrium expansion, as is well-known. in chemical potential. We see, then, how we may speak precisely about the From the definition (C11) and use of the equations of range of assumptions needed to make the “production” motion (104), the effective action is similarly expressed of the equilibrium entropy a quantity that is informative as a local functional, about dynamics. We have required the particular form of the local-equilibrium approximation that comes from ∞ 2 N [(∂τ +1)(ν ν¯)] a basins-and-barriers model, so that the dependence of Seff[n]= dτ − 2 2¯ν ν¯ the stochastic effective action Seff[n] on its instantaneous 0 a b 1 ∞ [(∂ +1)(n n¯)]2 configuration variables reduces at leading order to the = dτ τ − . (108) equilibrium quantity Seff(n). We have assumed small 2 2Nν¯ ν¯ 0 a b fluctuations in order to expand the linear velocity de- Eq. (108) is the expression first due to Onsager and pendence in terms of ∂Seff(n) /∂n, and to approximate Machlup [58]. The first line, as usual, makes explicit the linear response coefficients by their equilibrium val- the separation of large-deviations scaling in N from the ues. Had all of these conditions not been satisfied, the integral that is the rate function, which is a function only expansion (109) need not have been valid. Yet, within of fractional displacements ν ν¯. the more general framework of the stochastic process, we − 25 could readily have derived the correct alternative form from the exact solution, which is derived as Eq. (C11) in App. C. In the more general regime of non-linear ντ − ν response and large perturbations from the equilibrium j distribution, the relation between a source j(τ) and the τ path ν(τ) that it induces will generally be non-local in τ.

0 τC τ F. A second example: fixed dis-equilibria and an entropy rate for a large-deviation rate function FIG. 5. The source current (dashed) and resulting stationary (q) solution (solid), for a source that raises nτ to n at τ = 0, We now consider a second problem, aimed at relating holds it at this value until τ = τC , and then releases it to free dynamics to reference static distributions. Suppose that, diffusive decay. rather than assuming free relaxation immediately after a fluctuation of magnitude (28), we ask for the probability of a history which remains at that value for a fixed time, and and then freely decays. The difference of the effective action for this fixed-disequilibrium trajectory from that Seff[n]= N dτ ∂ ην + ˆ for a single-time fluctuation will give the entropy rate − τ L difference, between the stochastic process about the non- q eff N cosh 2 1 equilibrium steady state and the same process acting on = S nτ={0,τC } + τC − 2 cosh β¯ cosh q−β¯ the equilibrium distribution. 2 2 eff 2 For this example, it is convenient to put the initial dis- = S nτ={0,τC } + τC N √ν¯bνa √ν¯aνb . tribution at time τ , to put the initial fluctuation − → −∞ (113) at τ = 0 and hold it until τC , and then to permit free decay after τC as before. From the constructions above, ˆj may be of either sign, while ˆ 0 always. it is easy to define a source protocol that will produce L We may evaluate the q 1L≥ limit of Eq. (113) for this history. The source current takes the form comparison to both the equilibrium≪ effective action and q q the Onsager-Machlup linear approximation. Noting that jτ = Θ(τC τ)Θ(τ 0) sinh + 2¯ν cosh 1 in the linear-response regime (ν ν¯) ν¯ ν¯ q, the small- − − 2 2 − − → a b q q expansion becomes + [δ(τ τC )+ δ(τ 0)] . (110) 2 − − 2 2 Nν¯aν¯bq Nν¯aν¯bq cl Seff[n] + τ . (114) The stationary-point background evolves η unstably → 2 C 4 from its initial asymptotic value of zero at τ to value q at τ = 0 ǫ. The δ-function in the source→ −∞ then The probability of an initial fluctuation to non- lowers ηcl to value− q/2 at τ = 0+ ǫ, which solves the equilibrium number occupancies is given by the equilib- steady-state condition rium effective action, while the probability of persistence at that value is governed by a new term with the struc- ∂ ˆ ture of an entropy rate [49]. We will see in the Sec. V L = j, (111) ∂ν that this part of the effective action is a free-energy-rate difference, just as the equilibrium Seff(n) of Eq. (32) is and holds it there until τC ǫ, at which point the second cl − a free-energy difference. The entropy-rate difference for δ-function term takes η 0 at τ = τC +ǫ. These results are easy to check from→ the exact equations of motion persistent dis-equilibrium may be extracted as in action-angle coordinates, and the time-dependence of ∂ eff cl S [n]= N ˆ both j and the associated ν are shown in Fig. 5. τ∈[0+,τC ] τ ∂τC L Because of the time-dependence of this source protocol q N cosh 2 1 j, neither ˆ nor ˆj vanishes. Indeed, because ∂τ η = 0 = − L L 2 cosh β¯ cosh q−β¯ except at the support of the δ-functions of jτ , and be- 2 2 2 cause these two terms cancel in the action, the nonzero = N √ν¯ ν √ν¯ ν . (115) eff b a a b values of ˆj and ˆ determine Γ and S . We can check − that theseL potentialsL evaluate to Eq. (115) is a new result. The combination of square-root dependence on quantities that equal the expected escape Γ[j]= N dτ ∂ ην + ˆ rates –ν ¯bνa andν ¯aνb – and the squared difference of these − τ Lj in the exact entropy rate, extend to general multiparticle β¯ 1 = N log cosh log cosh (q β¯) chemical reactions as shown in App. D. 2 − 2 − To make contact with the Langevin treatment in N q q Sec. IVB6, we check that in the persistent non- τ cosh 1 + 2¯ν sinh (112) − C 2 2 − 2 equilibrium domain where η = q/2, the fluctuation of 26 the Langevin field in Eq. (87) is given by in this limit; more generally they are distinct functions altogether. We can check from Eq. (115) that in the cl cl cl η cl −η cl cl interval [0,τ ], ˆ has the limits ˆ ν¯ as tanh(q/2) 1 ν¯bνa e +ν ¯aνb e = 2 ν¯aν¯bνa νb . (116) C s ˆ L L → → and ν¯p as tanh(q/2) 1. Thus the entropy rate The decay rate in the Langevin equation (86) evaluates of theL stochastic → process is bounded→ − by leaving-rates from to the two respective states, while the corresponding rate of entropy change in Eq. (122) is unbounded. cl cl ν¯ ν¯ ν¯ eη +ν ¯ e−η = a b . (117) We may summarize this section as follows: It should b a cl cl νa νb not be surprising that the stochastic effective action for histories includes entropy-rate terms that have no simple Hence the single-time fluctuation expression (88) evalu- relation to rates of change in the equilibrium entropy of ates to states. The former measures uncertainty about rates of transition, while the latter measures uncertainty about νclνcl (ν′ )2 = a b , (118) the occupation frequencies for states. It may be that, in τ N some cases, dynamics is near enough to equilibrium that the state occupancy at successive moments of time places in agreement with the variance obtained from Eq. (88) for tight constraints on the possible entropy of transitions. the distribution produced by the equilibrium generating However, this is not a general result, and it is not even function. implied by the local-equilibrium approximation of the form produced by the double-well potential. To say more 1. The non-equilibrium entropy rate in relation to gradients about the origin of the rate term in the stochastic effec- of the equilibrium entropy tive action (113), however, requires a separation between system and environment terms that the continuous-time two-state model does not readily provide. For that sepa- How does the entropy-rate difference defined from the ration we turn to the method of maximum caliber. dynamical stochastic effective action relate to changes in the equilibrium entropy of subsystems under free decay? We may recast Eq. (32) as V. PATH ENTROPIES AND MAXIMUM Seff(n)= ND(ν ν¯) CALIBER N νbνa = log + q (νb νa) . (119) The construction of equilibrium Gibbs free energies 2 ν¯bν¯a − from the entropy, reviewed in App. B, begins with an For the period of free decay after τC , the change in equi- explicit division between entropy terms for the system librium entropy with n is and its environment. The Freidlin-Wentzell construction of the preceding section does not naturally provide such a ∂ Seff(n)= q. (120) decomposition for paths, because the contribution of the ∂n “environment” comes from the form of the transition-rate The time-dependence of the equilibrium-form entropy terms k±, which are embedded in the master equation. follows from the time-dependence of n. Its gradient at To separate these two contributions in the ensemble over the moment after the source perturbation is turned off is histories, we require an explicit combinatorial formula for given by properties of paths, which is defined independently of the probabilities given to such paths by the transition rates ∂τ n = N (ν ν¯) set by the environment. The method of maximum cal- − − q iber, as formulated in Refs [8, 57, 61, 62], provides such N sinh 2 = β¯ q−β¯ . (121) a decomposition. − 2 cosh cosh 2 2 This section will introduce two changes of representa- Therefore, under free diffusion from a chemical-potential tion. One is the consideration of histories of an individ- perturbation by q, the initial rate of subsystem entropy ual random walker, as explained in the introduction. The change is given by other, which will be more important to the ability to iso- late system-environment interactions, is the replacement eff eff ∂S (n) of the continuous-time stochastic process of the master ∂τ S (n)= ∂τ n ∂n equation (3), by a discrete-time two-state Markov pro- q N q sinh cess, shown in Fig. 6. The continuous-time and discrete- = 2 . (122) − 2 cosh β¯ cosh q−β¯ time models, in appropriate limits, represent the same 2 2 stochastic process. However, the discrete-time model has eff eff At small q, ∂τ S (n) 4∂S [n] /∂τC from Eq. (115). the feature that, in every time interval of length ∆t, some Thus the stochastic-process− → entropy rates, and the rate transition of the particle’s state must occur, even if it is of change of equilibrium entropies, are not equal even a transition back to the same state. 27

a b equilibrium. It does not consider time-dependent dis- placements, for which the Doi-Peliti formalism may pro- FIG. 6. Representation of the coarse-grained two-state sys- vide easier constructions. However, the steady-state non- tem as a discrete finite-state model with forced transitions on equilibrium problem does make contact directly with the every interval of length ∆t. constant-j solutions of the preceding section.

For the stochastic process of Fig. 6 and a finite time interval T , consider four observables M αβ, for α,β It may come as a surprise when we find, below, that the a,b . These are the number of transitions from state∈ entropy rate in the effective action (113) draws its form β{ to} state α between 0 and T . If we specify no more entirely from the probabilities of no change. When we than these as constraints on a path entropy, they will separate the large-deviations formula into contributions tend toward uniform distribution on a long interval T , from the path entropy and the environmental probabili- irrespective of the distribution we use to initialize paths. ties, other terms associated with changes will also appear. However, these cancel exactly in the hydrodynamic limit The idea is to label each path with an index j, and to represented by Eq. (113). Thus we will see that, among compute a path entropy the many terms that could have been used to probe the generating functional of the previous section, the partic- ular coupling to the field ν that we studied was the form that produces the hydrodynamic limit. We return at the Spath p log p , (123) end of this section to what such a decomposition implies ≡− j j j about the role of “energy dissipation” as an explanation subject to some set of appropriate constraints on prop- for large-deviations probabilities of histories. erties of the paths, such as its M αβ values, which will In the next two sections, we will solve for the path en- αβ be denoted M . It will be convenient here to divide semble first in its stationary distribution, and then per- j the path specification into the state in which the path turbed with sources. With appropriate choices for the starts, labeled with index σ, and the remainder of the transition parameters in the discrete model, the path en- path conditioned on that starting state, indexed j σ. semble will be constructed from the transfer matrix for We could think, thus, of j as a string denoting the| or- the single-particle continuous-time model. With a partic- dered states through which the trajectory passed, and ular choice of perturbing sources in the generating func- j σ as the string excluding its first letter. In this index- tional for this distribution, we will be able to show that ing| p = p p . the continuous-time Langevin equation gives the Gaus- j j|σ σ sian approximation to the fluctuating single-particle tra- It simplifies the treatment, without omitting any re- jectory. sults that matter here, to exclude the distribution over over initial states from the variational problem, and to A. Variational approach for a path ensemble in its consider only variations of pj|σ. For any path j, the αβ stationary distribution, and connection to the numbers Mj count the state-transitions from state β to continuous-time transfer matrix state α along that path. Denote by M¯ αβ the constraint values on maximum entropy for this path ensemble. Then The implementation of maximum-caliber in Refs. [61, the Lagrangian for the path-entropy maximization prob- 62] considers only steady trajectories displaced from the lem becomes

path p p log p p λαβ p p M αβ M¯ αβ η p 1 . (124) L ≡− j|σ σ j|σ σ −  j|σ σ j −  − σ  j|σ −  σ σ σ j|σ αβ j|σ j|σ    

Lagrangians of the form (124) produce Gibbs (expo- iber by Jaynes [8], to reflect the fact that it is a logarith- nential) distributions as their maximizing distributions. mic measure of the width of a “tube” of microhistories Therefore, in terms of the Lagrange multipliers λαβ , which are typical within the distribution that produces the maximum-entropy probabilities have the form: macrohistories characterized by the M¯ αβ . Following Refs. [61, 62] (apart from a sign convention aa ab ba bb 1 M Mj Mj Mj αβ p = (γaa) j γab γba γbb p . (125) for the λ ) we define j Q σ αβ The entropy on this distribution was given the name cal- γαβ e−λ , (126) ≡ 28

αβ and choose pσ to be the starting probability appropriate The relation of the γ to the pβ may now be to whichever is the first state in history j. For this set made explicit. The assumption behind{ } the use of a of observables, the order in which transitions occur does discrete stochastic-process model is that ∆t is a scal- not affect the weight function, though it does restrict the ing variable that may be changed in the discrete model set of possible paths. while leaving important physical observables unchanged, Q is the partition function on histories, defined as the though rescaling ∆t may introducing overall renormal- sum over j of the other terms in Eq. (125). Because ization constants associated with the discretization. In histories are in 1-1 correspondence with monomials that general the γαβ will then depend on the ∆t but the sta- arise from the Mth power of a 2 2 matrix, the partition tionary distribution of the stochastic process should not. function is immediately expressed× as the usual trace of a Denote the stationary distribution that is self-consistent transfer matrix, just as we could have computed working with the γαβ appropriate to the given M¯ αβ by directly from Eq. (34) above: [π ].19 Then the most general stochastic matrix γαβ β M with [πβ] as the dominant eigenvector is aa ab 1 1 γ γ pa Q = ba bb . (127) aa ab γ γ p γ γ 1 πb πa −R b = + − 1 r . γba γbb 1 π π − b − a Now we may make some simplifying assumptions with- (130) out loss of generality, to bring the expression (127) into The constant R sets the frequency of transition events direct correspondence with the transfer matrix of the M ab = M ba. To model ensure that the discrete stochastic continuous-time model. The matrix γ does not actu- process has the same continuum limit as the continuous- ally have four independent values γαβ , because the time model of the previous sections, we require that for original Lagrangian (124) did not have four indepen- sufficiently short ∆t, R must converge to dent constraints M¯ αβ . The sum M¯ αβ M, be- αβ ≡ cause for each possible path also M αβ M. More- R =(k+ + k−) ∆t. (131) αβ j ≡ over, for large M, it is not possible to set M¯ ab = M¯ ba The form (130) is chosen so that under refinement or by more than 1, so effectively these must be chosen coarsening of the time increment, the definition (131) 17 ± αβ equal. Therefore two normalizations of the γ may remains consistent. be chosen arbitrarily. These may be chosen so that The interpretation of k± as rate constants is completed aa ba ab bb γ + γ = 1, γ + γ = 1, effectively choosing unit if we set their ratio from the fractions of time intervals normalization for the partition function Q. The matrix in which the system starts, respectively, in states a or b: γαβ is then called a stochastic matrix. Next, in keeping with the omission of the initial state k− πa = from the variational problem, suppose pσ is chosen to be k+ + k− the stationary distribution of largest eigenvalue preserved k+ by whatever values of γαβ solve the variational problem. πb = , (132) k+ + k− Then it follows immediately – by collapse of the matrix T αβ corresponding to Eq. (5). Values of k± are then chosen γ with 1 1 on the left, and with pa pb on to satisfy the constraints (129). With these choices the αβ the right – that matrix γ becomes precisely the transfer matrix for ∂ log Q the continuous-time one-particle problem. = M αβ = Mγαβp . (128) Note that the relation between the rate constants, equi- ∂ log γαβ j β librium frequencies, and transition numbers may be writ- We solve for the γαβ by setting Eq. (128) equal to ten M¯ αβ.18 It follows from our choice to normalize γαβ to ¯ ba ¯ ab M M be a stochastic matrix that = = πak+ πbk−. (133) T T M¯ αβ = p . (129) This form – a variant expression for either πak+ or πbk− M β α – which characterizes the equilibrium distribution, will arise again when we use the generating function to probe steady non-equilibrium states. 17 Note that if we were imposing more finely resolved time- From the variational property (128) and the original dependent constraints, the same might not be true, but for this Lagrangian formulation of the problem, we recognize that finite set of four observables covering indefinite M, we have no other choice consistent with only two states. 18 If we had not supposed that {pα} was the stationary distribution, the same result would be obtained asymptotically in large M, 19 The distribution π clearly corresponds to the number fraction through the approximation of log Q by its largest log-eigenvalue, β ν of the preceding sections. denoted log λ+. This is the approach taken in Refs. [61, 62]. β 29 log Q will be the negative of a Gibbs free energy. From M¯ jointly satisfying the fact that log Q 0 when the γαβ are evaluated at the above conditions,≡ we know that the free energy may ∂Spath M¯ αβ = λαβ. (137) be offset by a constant at the entropy-maximizing, equi- ∂M¯ αβ librium distribution. We may derive the exact relation by calculating the entropy of paths (the caliber) on the From the developments of the single-time equilibrium, solution (125), we may anticipate that the effective action for other con- figurations M¯ αβ than the most-probable ones will have αβ ¯ αβ Spath M¯ αβ = π log π + M αβ λαβ the same form as Eq. (136), with λ fixed and M − α α α αβ permitted to vary independently of them. We will re- turn to that and to its physical and combinatorial eval- ¯ αβ αβ M = πα log πα M¯ log uation in a moment, but first we do the construction − − Mπβ α αβ methodically through a generating function for the path- probability distribution. = π log π M π γαβ log γαβ. − α α − β α α β (134) B. Generating function for steady non-equilibrium states, and connection to the continuous-time The first line of Eq. (134) is simply the evaluation of the Langevin equation definition (123), recalling that log Q = 0 in the normal- ization of pj. The second line is the proper extensive-form A generating function for the discrete-time path en- dependence of the entropy on the constraint variables semble may be constructed as it was in the Doi-Peliti which are its macroscopic arguments. The third line, in- method. Because we will consider only steady non- terpreting the γαβ as transition probabilities in the equilibrium distributions, we may do this in a direct and state diagram of Fig. (6), expresses the path entropy as simple way, as a modification of the underlying trans- a sum of the entropy of the initial distribution, with the fer matrix by shifts in the chemical potential. There are time-integral of the entropy rate of the stochastic process many perturbations that could be chosen for the discrete- in that distribution [49]. This expression is the first in time model, but only the one equivalent to the coupling the paper, in which the entropy rate is given its familiar to the number field in Sec. IVC produces the same hy- form from the stochastic process of a finite-state system. drodynamic limit as ∆t 0. The combinatorial interpretation of the second line in Because the discrete model→ has four parameters γαβ Eq. (134) is immediate. Using S0( πα ) to denote the for transition rates, we introduce four complex variables single-time entropy of the initial distribution,{ } we may αβ express zαβ eζ (138) ≡ as arguments of the moment-generating function. Sup- Mπa Mπb Spath M¯ αβ = S0( πα )+log +log . αβ αβ ¯ ba ¯ ab pose that, like the λ , all ζ scale (k+ + k−) ∆t 1. { } M M ba ab ∼ ≪ The terms ζ and ζ modify the transition currents, (135) while terms ζaa and ζbb modify the persistence probabil- The combinatorial entropy is the Stirling approxima- ities. The counterpart to the current j of Sec. IVC, which tion for the logs of the two independent binomial factors ¯ ba effectively modifies the chemical potentials of the states, for distributing M transitions a b among Mπa = is the combination ζaa ζbb. Therefore, to compare the M¯ aa + M¯ ba total exits from state →a, and similarly for − ab path entropy to this perturbation in the Doi-Peliti for- distributing M¯ exits b a among Mπ total exits ba ab → b mulation, we set ζ = ζ 0, and consider only the from b. Note that the entropy rate term in Eq. (134) remaining two perturbations.≡ will diverge logarithmically in ∆t if a finite number of For constant sources zaa and zbb, the path-generating transitions M¯ ba = M¯ ab is fixed in a growing number of function for non-equilibrium, asymptotically-steady tran- opportunities M. sition numbers may simply be written in terms of a Finally, from the explicit expression for the entropy ei- transfer matrix that has the same modification in every ¯ αβ ther in terms of M or the associated Lagrange mul- timestep, as tipliers λαβ = log γαβ , we may identify the relation − path aa bb between the partition function Q and a free energy that ψpath zaa,zbb e−Γ (ζ ,ζ ) leads to the relations (128) ≡ aa M bb aa Mj bb j = (z ) z pj|σπσ log Q = λαβM¯ αβ Spath M¯ αβ + S0( πα ) . σ j|σ − λ − λ { } αβ M 1 1 1 aa aa ab (136) γ z γ πa = ba bb bb .(139) ¯ αβ Q γ γ z πb Here each Mλ denotes the αβ-component of the set of 30

The starting probabilities πσ in Eq. (139) will no longer the correlation function, it is then the classical relax- be those preserved by the{ transfer} matrix at nonzero ation time of these fluctuations, given by Eq. (117), that ζαβ , so the function Γpath ζaa,ζbb will generally dif- causes the single-time fluctuations (88) to agree with fer from the form (136) given for log Q. The difference those in the equilibrium generating-function distribution will be associated with transient decay− from π to the from Eq. (22). We thus establish the mutual consistency { σ} perturbed stationary state, and will scale as (T )0 at large of all of the approaches used. T . If we denote this constant as Λ0, the asymptotic ex- pression for the generating function becomes C. The effective action from caliber and its Γpath ζaa,ζbb = (λ + ζ)αβM¯ αβ Spath M¯ αβ interpretation (λ+ζ) − (λ+ζ) αβ +Λ0 + log Q. (140) From here we compute the effective action by Legendre transform, just as in Sec. III for the single-time equilib- The asymptotic stationary valuesp ¯a,p ¯b may be com- rium distribution. The result is puted as functions of λ + ζ as before, and the functional αβ Seff M¯ αβ =Γpath ζaa,ζbb ζaaM¯ aa ζbb M¯ bb relation may then be inverted to assign values to the ζ . M¯ M¯ − M¯ − M¯ aa bb = λαβM¯ αβ Spath M¯ αβ +Λ0 + log Q It is possible to check that only the combination ζ ζ − appears in any of the transition-number expressions,− so αβ aa bb we could originally have set z = 1/z as a single ar- Mp¯ gument to ψpath. This construction exactly follows the = λαaM¯ αa log a − M¯ ba construction of the two-argument and the one-argument α moment-generating functions in Sec. IIIB. Mp¯b An important secondary relation that is not merely a + λαbM¯ αb log − M¯ ab definition but results from the choice ζba = ζab 0, is α ≡ that the off-equilibrium transition rates satisfy +Λ0 + log Q. (142)

M¯ ba M¯ ba In the last expression we recall the decomposition (135) (λ+ζ) = (λ+ζ) = p¯ k p¯ k . (141) T T a + b − of the stationary-state path entropy into the logarithms of two independent binomials. For the equilibrium tran- Eq. (141) is the counterpart to Eq. (133) for the equilib- sition rates Λ0 S0( πα ) and the expression vanishes rium distribution. This property will be responsible for as required. Recalling→ { that} S0( πα )+log Q in Eq. (136) the cancellation of terms between the path entropy and is simply the maximizer of the{ terms} written explicitly external weight factors which could diverge as ∆t 0, in Eq. (142), we recover exactly the difference of Gibbs and recovery of the hydrodynamic limit. → free energies which was the stochastic effective action in The transition rates (141) equal the rate at which the Eq. (32). Langevin field correlation function (87) injects noise into It now remains only to check that the expression (142) the number-field correlation function (88). Thus we con- recovers the non-divergent entropy rate computed from firm that, for this simple example of non-interacting par- the master equation, and to assign physical meanings to ticles, the Langevin equation (86) describes the fluctu- the terms that appear. ating trajectories of single-particle states. The Langevin Consider the limit in which M ba = M ab remains finite ′ description is indirect, as ν is formally the expectation as ∆t 0 and M , and suppose all k±∆t 1. value of a Poisson distribution, and not a discrete single- The external→ source→ term ∞ that appears in the effective≪ particle trajectory. Following the injection of noise by action (142) evaluates to

αβ ¯ αβ a ba b ab ba ab αβ λ M Mp¯ M Mp¯ M M M = − log (1 k+∆t) − log (1 k−∆t) log(k+∆t)+ log(k−∆t) T − M∆t − − M∆t − − T T =p ¯ k +p ¯ k p¯ k p¯ k log k k ∆t2 . (143) a + b − − a + b − + − The first two terms in the second line, although proportional to the expected escape rates from the two wells with occupanciesp ¯a,p ¯b, actually come from the accumulation of time intervals in which the particle does not escape and whose likelihoods differ from unity only by k±∆t. This accumulation of terms would not be affected even if we made finite changes in M ba or M ab, as long as total M was sufficiently large. The other term, containing the logarithm, comes from the unlikely events in which sufficient energy accumulates in a fluctuation to induce a barrier crossing. It is simply the product of Boltzmann factors on a set of independent intervals of length ∆t, whose total number is M ba + M ab. 31

The terms in the purely combinatorial path entropy evaluate to Spath M¯ αβ Mp¯a M ba M ba Mp¯b M ab M ab M ba M ba M ab M ab = − log 1 − log 1 log log M − M − Mp¯a − M − Mp¯b − M Mp¯a − M Mp¯b M ba M baM ab 2 log ≈ M − M 2p¯ p¯ a b = ∆t p¯ k p¯ k 2 log k k ∆t2 . (144) a + b − − + −

We have placed the terms in the same order as those lowing Ref. [48] – the arguments of the logarithms in in the external linear function (143), and the term with Equations (143,144) become the logarithm – again arising from the events where the 1 −β(‡− ) state changes – is in fact identical in the entropy and ∆tk± 2πe a/b . (146) in Eq. (143). The two log terms exactly cancel in the ≈ subtraction that defines the effective action (142), leaving The exponential waiting times in k± are not affected, but a finite remainder, which we will see is the hydrodynamic the factor ∆t cancels the dimensional prefactors in k±, entropy rate. which have dimensions 1/time, and have a characteristic It is the first two, non-divergent term, which differ be- scale set by the diffusive relaxation rate. tween the path entropy and the external probabilities. If we plug the approximation (146) into the for- The total numbers M ba and M ab are frequencies entering mula (143) for the environment terms in the path-space the binomial factors in the path entropy, and both terms Gibbs free energy, this term evaluates approximately to contribute identical factors proportional to M ba = M ab. αβ ¯ αβ The difference of Eq. (143) and Eq. (144) gives the non- αβ λ M p¯ak+ +p ¯bk− divergent effective action by Eq. (142), which goes at T ≈ 0 ¯ ba ¯ ab large T (ignoring the contribution of terms (T ) ) to M 1 M 1 ∼ + β ‡ + ‡ . T − a T − b eff ¯ αβ S M (147) p¯ak+ +p ¯bk− 2 p¯ak+p¯bk− T → − 2 We have thus arrived at a mathematical expression = p¯ k p¯ k . (145) a + − b − of the graphical heuristic for thinking about dynamical ensembles given in Fig. 3 of Sec. IIC. The attracting Eq. (145) recovers precisely the 1-particle coefficient of fixed points of the underlying free-energy landscape carry tC in Eq. (113), when converted to descaled time τC by charge-valued, extensive state variables such as the occu- the factor k+ + k−. pation frequenciesp ¯a,p ¯b for states. They are related Thus we again find that T is the scaling variable in under Legendre transform to dual, intensive entropy gra- the large-deviations principle associated with indepen- dients from the environment – here k+ and k− respec- dent path fluctuations for a single particle. This scaling tively – which determine the probabilities for each state. applies only to the extended-time partition function. It The extensive, current-valued state variables – here the acts independently of the scaling with N, which is the numbers M¯ ba and M¯ ab of transition events – live on the same in the extended-time and single-time levels of ther- saddle points of the landscape. Under Legendre trans- modynamic description. form, they should be coupled to the dual chemical poten- tials that determine the probability of transitions: here 1 1 ‡ a and ‡ b . The asymmetry between states and D. Discretization divergences and the natural scale transitions− arises− because chemical potentials appear ex- ponentially in the leaving rates k±, and only linearly in As was explained in Sec. IIE, the discrete time interval the transition-state terms involving ‡. ∆t may be taken only down to a certain scale, before the approximation by a discrete two-state theory becomes in- appropriate. The lower limit to ∆t from the continuum E. Connection to energy dissipation double-well model will come from the diffusive relaxation time, for a trajectory starting at the saddle point and The path-space Gibbs free energy (142) has separated ending in whichever well is most-quickly reached. On events into categories, which we can associate with dif- timescales shorter than this, it becomes invalid to char- ferent kinds of energy transfers. The events in which ba acterize escapes as elementary events counted by Mj or the particle exchanges wells could correctly be associ- ab Mj . ated with either absorption of energy from the bath, or We expect that when the discretization scale is set to dissipation to it, depending on the sign of the free-energy this value – termed the natural scale of the problem fol- difference in the starting and ending well. However, these 32 terms entirely cancel between the path entropy and the the entire sub-distribution within which that sample fluc- environment-mediated probability, and do not contribute tuation is typical. This is the shifted binomial distribu- to the entropy rate expression in Eq. (113) in the hydro- tion appearing in Eq. (19). The constant of proportion- dynamic limit. This cancelling term may be ascribed to ality between the two distributions, whose logarithm is “entropy production” in the path ensemble which is ex- the cumulant-generating function, is then precisely the ported to the environment, but it does not appear in the leading exponential difference of probabilities, because large-deviations formula. Moreover, we are lucky that we normalization terms in the original distribution and the do not need it to compute path-probabilities, because it projected distribution have canceled. appears to depend on the model resolution ∆t, which is not physically meaningful. The terms that survive in the entropy rate, and con- B. Operator algebras in the Doi formalism; tribute to its distinctive form (115), come from all the two-field methods in general, and the difference moments when thermal fluctuations do not lead to es- between classical and quantum systems capes. For all these, energy is neither absorbed nor dis- sipated. In summary, we acknowledge that there are The Doi [4, 5] construction of time-dependent ensem- cases [63] in which a collection of non-equilibrium transi- bles uses a further property of generating functions which tions is constrained by consistency with the equilibrium can lead to confusion on first exposure, but which is distribution that is their starting or ending point. Since important to understand in order to assign the correct the equilibrium distribution is dictated by energy, these meaning to the operators and states used in the con- constrained transformations must also have characteriza- struction. tions in terms of energy. However, for a large collection Any probability distribution may be written as a sum of non-equilibrium fluctuation theorems, energy alone is of modes, and the generating function associated with not serving as the primary constraint on kinetics, and that probability distribution may likewise be written as we must look for the explanation of path probabilities in a sum of polynomials. Therefore, generating functions terms of other parameters of the stochastic process. satisfy a property of superposition, similar in many re- spects to the superposition satisfied by states in quan- tum mechanics. Time evolution acts on superpositions, VI. DISCUSSION AND CONCLUSIONS either of quantum states, or of the modes of a generating function, independently. Much of the time-dependence A. Why generating functions for large-deviations in either quantum mechanics or stochastic processes can formulae? therefore be understood from the sequencing of the time- evolution operators, without reference to the particular We remarked in Sec. IVA that generating func- function-spaces in which they act. Doi’s contribution was tions provide a more natural representation for large- to distill the dependence in the order of time-evolution deviation formulae, and better approximation methods, operators into an operator algebra, which has the same than the discrete probability distribution, because gen- form as the algebra of the simple harmonic oscillators of erating functions expand in moments of collective many- quantum mechanics or quantum field theory. particle motion, rather than in single-particle hops. We For this reason, Doi-Peliti methods are sometimes [16] have also seen, comparing Eq. (14) to Eq. (32), that the termed “” methods for stochas- cumulant-generating function extracts the leading expo- tic processes. However, the operator algebra itself car- nential term in from the probability distribution, bypass- ries no implication of a physical property of quantum- ing sub-leading terms from normalization. But how does superposition of distinct eigenstates of observables, and it perform this separation? We may understand that the derives only from the linearity of time evolution. The isolation of leading exponentials is in fact a direct re- physical commitment to the classical or quantum nature flection of the way the generating function expands in of a system comes from specifying the space of functions moments. representing states and their observable properties, and In elementary statistical mechanics [11], we argue that on the inner product in that space. For further discussion the microcanonical and canonical ensembles produce the of the many similarities implied, solely by superposition same entropy, because the probability of the most-likely and causality, between classical and quantum systems, element in a sharply peaked distribution is equal at lead- see Ref. [39]. It is under the correspondence of opera- ing exponential order to the weight of all typical elements tor algebras that the stochastic effective action and the in that distribution. The two probabilities differ only by quantum effective action are equivalent objects. the width of the distribution, which is sub-leading as we The use of time-dependent sources to probe complex saw in Eq. (17). histories with the stochastic effective action does not The moment-generating function may be understood seem common in reaction-diffusion applications of Doi- as extracting, not only the probability to observe a sin- Peliti theory. We hope that such methods can come to gle sample-fluctuation from the equilibrium distribution, be used to ask a wider range of structurally explicit ques- but in fact the projection of the original distribution onto tions about non-equilibrium paths. 33

As the simplest non-trivial examples, this paper has the domain of stochastic processes, to the incompleteness demonstrated new entropy-rate formulae for steady non- of spatial geometry as a representation for mechanics, as equilibrium paths, both for independent-particle random expressed in Zeno’s paradox of the arrow. If a geometric walks, and for stoichiometrically non-trivial chemical re- description can assign only a position as a property of actions. An area for future work is to extend such for- the arrow at any single instant, this discription offers no mulae to study the large-deviations behavior of random way to distinguish states of motion that the arrow may walks on graphs or the fluctuations of realistic chemical- have, which in a dynamical theory are required as initial reaction networks. conditions. Unlike the other Zeno paradoxes which em- phasize subdivision of space or time, the arrow paradox highlights the inadequacy of dynamical descriptions that C. The information conditions that embed cannot represent both position and momentum as inde- single-time ensembles within ensembles of histories pendent but inherent properties at a single instant. The dynamical system of Freidlin-Wentzell theory, with its The equilibrium (canonical) thermodynamic ensemble conjugate momentum field, completes this dual formula- was recovered in Sec. IVC as the projection of the larger tion for thermodynamics much as Hamiltonian dynamics ensemble over histories onto a single time slice, in a re- completes it for mechanics. gion where the histories had settled into their stationary In quantum mechanics the independence of position distribution over both states and transitions. “Fluctua- and momentum as both inherent properties is manifest tions” in the equilibrium ensemble have no dynamical – traveling waves are distinct superpositions from stand- connotation, and are simply sample values. Histories ing waves – but it comes at the price that position and have both dynamical correlation – a property of a path it- momentum are non-commuting observables [40, 43]. The self – and stochastic fluctuation that represents sampling correspondence between operator algebras for stochastic from the ensemble over paths. processes and dissipative quantum systems (annotated Fluctuation values in the equilibrium ensemble, when bibliography in App. A), shows that the Doi-Peliti con- embedded within the larger ensemble of histories, have struction is revealing essentially the same form of in- the interpretation of instantaneous conditions on histo- herent duality as the standing/traveling wave duality ries accompanied by no other information. These infor- in quantum mechanics. The remarkable insight from mation conditions imply distributions over histories con- Freidlin-Wentzell theory, not obvious beforehand, is that ditioned on the single-time fluctuation, which appears as this dual momentum variable for classical stochastic pro- an independent variable. Freidlin-Wentzell theory repro- cesses should be an “inference field”, and that in re- duces the probabilities for such single-time fluctuations versible random walks it takes the form of a chemical from the integrals, over the most-probable histories, of potential. conditional probabilities for each successive increment of change. The equilibrium distribution is not the only ensem- E. The master equation versus maximum caliber ble that can be recovered from an ensemble of histo- ries by projection onto single instants. Projections onto Refs. [57, 61, 62] emphasize the potential for mas- slices where paths have not settled into their stationary ter equations to give only a coarse-grained description distribution result in non-equilibrium entropies at sin- compared to maximum-caliber methods that track dis- gle times. These entropies require both charge and cur- tinguishable particle trajectories. For interacting parti- rent state variables as arguments [40, 43], and they can cles, this difference may be important. However, for non- be written in terms of the Wigner distribution, consid- interacting particles, the N-particle moment-generating ered in Ref. [55]. We may say, then that there is a hi- function is simply the N th power of the one-particle erarchy of ensembles, from the most-complete ensemble moment-generating function [50]. Indeed, this relation is over histories, to single-time non-equilibrium ensembles, what causes the field normalizations (53,70) to lead to an which lose the time-correlation of histories but preserve exact (and not merely asymptotic) factorization of scale- current-valued state variables, finally reaching the equi- factor N from the remaining large-deviation rate func- librium ensemble, which projects onto only the charge- tions for relative numbers, not only for the two-state sys- valued state variables. By connecting state variables to tem, but for any non-interacting particle system. We also fixed points of free-energy landscapes in Fig. 3 , and to see that the Doi-Peliti construction “contains” a descrip- Legendre duality in Eq. (147), we have attempted to pro- tion of single-particle trajectories, from the agreement of vide several perspectives to understand this duality. the transition rates (141) in the discrete-time model with the correlation function (87) of the Langevin stochastic differential equation, whose solution is the Gaussian ap- D. Thermodynamics beyond Zeno proximation to discrete random walk. A useful further exercise would be to map the generat- The incompleteness of the time-reversal-invariant state ing functions for the master equation of Sec. IV and the variables of equilibrium thermodynamics is analogous, in discrete-time transfer matrix of Sec. V term-by-term, to 34

understand why the latter readily yields a decomposition are phenomena linked inherently to equilibrium, we may with the structure of the Gibbs free energy, while in the ask how many non-equilibrium cases of symmetry break- former this is hidden. In this way it might be possible to ing and large-deviations scaling are responsible for com- extend the flexibility of Doi-Peliti effective action meth- plex dynamics in nature that are not yet well-understood ods directly to maximum caliber. in terms of collective phenomena [70].

F. On the necessity of thermodynamic limits Acknowledgments

Our rather conservative extension of large-deviations I am grateful to acknowledge Insight Venture Partners calculations, from equilibrium to non-equilibrium statis- for support of this work, and Cosma Shalizi for many tical mechanics, has the advantage of starting in well- helpful conversations and references over several years. known territory, and taking a small step across which we can construct all relations between the different ensem- bles explicitly. It does not do justice to the generality of Appendix A: Survey of approaches to large-deviations formulae, emphasized in Ref. [2], and up large-deviations theory for non-equilibrium to now we have not adequately expressed the dependence processes throughout science on the existence of thermodynamic descriptions. The main text brings together Doi-Peliti construc- The ability to coarse-grain has been our most impor- tion methods, the larger framework of Freidlin-Wentzell tant tool to bring an infinitely complicated world within theory, and Jaynes’s methods of defining combina- reach of analysis and comprehension. In the early years torial entropies for paths. Mathematical differences of statistical mechanics, when the formulations of Boltz- among the methods are small; Doi-Peliti theory may be mann [64, 65] and Gibbs [66] were seen as “replacing” seen simply as a construction method for the Freidlin- classical thermodynamics, Fermi [10] was quick to em- Wentzell quasipotential, and the only important addi- phasize that thermodynamic descriptions have their own tion from path entropies is the finer resolution it gives to internal consistency and, at the appropriate level of scale, distinguishable-particle trajectories. are self-contained with respect to empirical validation. What will not be apparent in the main text is that We have learned over the last half-century that this point several distinct literatures exist for these nearly-identical is true and important to a degree that Fermi could not mathematical methods, which have been repeatedly dis- have foreseen. As renormalization methods have become covered for quantum mechanics, and for discrete- and well-understood conceptually [67], we have learned that continuous-variable classical stochastic processes. From all descriptions are – or may as well be – thermodynamic the collection of different special cases, it has become descriptions [48, 59].20 clear [39] that a single framework of two-field functional Our view of the meaning of fundamental theory has integrals subsumes all the different approaches. More- changed accordingly. All theories once regarded as fun- over, all integrals of this type share particular structural damental have been effective theories, and at the same features, which reflect superposition or causality, inde- time, from criteria of scale-dependent statistical reduc- pendently of the physical substrate to which they apply. tion, each has been fundamental. Whether the series We summarize in order the key ideas behind two-field terminates is now understood not to be critical; any level methods, functional Legendre transforms, and entropy- at which one can form a thermodynamic description is a rate methods, and finally mention a parallel treatment valid starting point for deduction at that scale and larger within the mathematics literature that emphasizes ob- scales of aggregation. servables rather than states. Hierarchies of nested phase transitions [68, 69], con- nected by regions of large-deviations scaling, govern the behavior of equilibrium matter at all scales now known, 1. Two-field functional integral methods within theories that are already understood. Recognizing that neither large-deviations scaling nor phase transition The most widely-used (formally exact)21 methods in the physics literature to compute both deterministic and fluctuation properties of non-equilibrium systems are a class of generating-functional methods, all of which have 20 Properly they are called effective theories. The class of equilib- rium mechanical theories generally recognized within the con- ventional term “thermodynamics” are a class of effective theo- ries. The defining properties used in this review are, however, properties of the whole class, and so the terms “thermodynamic 21 With this condition we exclude very widely used but intention- description” and “effective theory” could be used interchange- ally approximate methods such as Langevin equations or the van ably to refer to them. This concept of “effective” theory is also Kampen expansion [71]. Any of the full methods cited here may the source of the name for the quantum and stochastic effective be reduced to Langevin or van Kampen approximations by stan- actions. dard methods of Gaussian integration. 35

expressions as two-field, coherent-state functional inte- 2. Functional Legendre transform and stochastic grals. The earliest of these were the density-matrix meth- effective actions ods of Schwinger [14] and Keldysh [15, 41] to combine dissipation and dynamics in quantum systems.22 The Generating-functional methods for dynamical systems Schwinger-Keldysh form was later proposed as a gen- – especially emphasizing efficient approximation – have eral framework to handle dissipative dynamics in clas- been extensively refined in vacuum quantum field theory sical or quantum systems by Martin, Siggia, and Rose (QFT) and to some extent in condensed-matter physics. (MSR) [13]. It was observed by Doi [4, 5] that the su- They are used in these domains to extract the lead- perposition principle for generating functions for multi- ing deterministic approximation to fluctuating quantum- particle classical stochastic processes is similar enough to mechanical observables, a use very close to the extrac- that for quantum density matrices that the same opera- tion of leading exponential dependence of probabilities in tor algebra may be used to describe both, and only their large-deviations theory. The functional Legendre trans- Hilbert spaces differ. The Doi operator algebra for dis- form of the cumulant-generating functional is known in crete Markov processes was later expanded as a coherent- QFT as the quantum effective action [22]. It is the basis state functional integral of MSR form by Peliti [6, 7]. of background-field methods used to identify the ground Alex Kamenev has recently emphasized [39] that the states or vacua of field theories, and it is the starting causal structure originally recognized by Keldysh, and point for a variety of renormalization schemes. Efficient reflected in a characteristic “tri-diagonal” form of the and elegant graphical methods have been derived to ap- MSR action functional, is the unifying physical commit- proximate it within the moment expansions of Gaussian ment that all field-theoretic methods of this kind share. functional integrals. Explanations of the common elements of superposi- In the Doi-Peliti construction for discrete stochas- tion and operator algebras are given in the main text in tic processes, the counterpart to the quantum effec- Sec. IVC, and the tri-diagonal form of Gaussian kernels tive action is the deficit in entropy for any config- and Green’s functions is illustrated in App. C. uration from the maximum entropy attainable under the same constraints. This entropy deficit is precisely Among the two-field methods, the one that most the large-deviations rate function for fluctuations in explicitly emphasizes the large-deviations structure of macrostates [1]. For closed equilibrium systems this rate the macroscopic approximation is the ray-theoretic, or function is simply a difference between two entropies. For eikonal-based, approach of Freidlin and Wentzell [3, 21, open equilibrium systems the contribution from entropies 37, 38, 73]. This approach, introduced in basic form by of the environment leads to the expression of the rate Eyring [74] to understand chemical reaction rates, has function as a difference of Gibbs free energies. The latter been most extensively developed for escape and first- case leads to the forms developed in the main text. passage problems [28–36]. It refines and generalizes In the large-deviations limit for non-equilibrium the equilibrium barrier-penetration formulae of Wentzel, stochastic processes, where probabilities are defined on Kramers, and Brillouin (WKB) [54]. More generally the histories rather than on instantaneous states, the rate Freidlin-Wentzell method is of interest to mathematicians function becomes the difference between a combinatorial for the solution of boundary-value problems for diffu- entropy defined on a path, and an entropy rate from the sion in potentials. Freidlin-Wentzell theory introduces a Markov process that generates the path in a stationary quantity known as the quasipotential, so-named because environment, as we have illustrated in Sec. V. it is thought of as a generalization of the equilibrium ther- 23 We have termed this large-deviations rate function, modynamic potentials The quasipotential is a large- counterpart to the quantum effective action in QFT, the deviations rate function, appropriately scaled for system stochastic effective action. The quasipotential used in size. It may be evaluated as an action-functional integral Freidlin-Wentzell theory to compute boundary values of along a ray or “eikonal” of the diffusion operator [21, 38], diffusion equations in potentials is an instance of the which has the interpretation of the most probable trajec- stochastic effective action. For a one-dimensional sys- tory leading to an escape or boundary value. tem it is the form produced by the moment-generating function for observations at a single time. The gen- eral stochastic effective action, for arbitrary path fluc- tuations, is computed as the self-consistent expectation value of the Freidlin-Wentzell field action, in which the Liouville operator acts as Hamiltonian. 22 These methods, in turn, build on the earlier work of Matsub- ara [60, 72] using field-theoretic representations of the finite- temperature partition function to compute time-dependent cor- relation functions for fluctuations about thermal equilibrium. 3. Path entropies and maximum caliber 23 “Thermodynamic potential” as used here is often a somewhat indefinite reference to the classical thermodynamic entropy or any of its Legendre transforms. It will be shown quite precisely The approach to non-equilibrium thermodynamics below, exactly what combination of such potentials the Freidlin- taken by E. T. Jaynes [8] is a direct outgrowth of methods Wentzell quasipotential generalizes. based on the entropy rate of a stochastic process. (Re- 36 lated methods were developed for chaotic deterministic for which that entropy would be maximum, dynamical systems, where the entropy rate was replaced by the Kolmogorov-Sinai, or metric, entropy.) Probabil- eS(U,V,{ni}). ities are assigned to histories, and the measure on the space of histories, together with any constraints, then In this respect, the relative probability is a more funda- determines the path-space entropy. Jaynes coined the mental quantity than the normalized probability. The S term caliber for the path entropy, to suggest the fluctua- normalization is the sum or integral of terms e over tion width of a tube of micro-histories about any coarse- whichever states the system can take, which can depend grained macro-history. on the setting. For independent systems, the relative In the text, we have used a formulation of the probabilities multiply, so that the sum of the entropies is maximum-caliber principle developed in Refs. [57, 61, maximized at the overall-most-likely state. It is this rela- 62], suitable for computing the probability of steady non- tion between entropy and relative probability that defines equilibrium distributions. In cases where the transition the macroscopic fluctuation properties of ensembles [1]. probabilities (including those altered by insertions from The way entropy governs the structure of interacting generating functions) are stationary, the calculation of systems comes from the properties of its gradients. The the entropy rate simplifies using the chain rule for condi- manipulation of these gradients is the main enterprise of tional entropies [49]. The chain rule separates the path classical thermodynamics [10]. entropy into an equilibrium entropy on the stationary dis- The equation of state is merely a definition of the gra- tribution, and a conditional entropy of transitions from dients of the entropy in terms of observable quantities: that distribution. This compact representation appears ∂S ∂S ∂S in Sec. VA. δS = δU + δV + δni ∂U ∂V ∂n i i βδU + βpδV β δn . (B1) ≡ − i i 4. Large-deviations theory within the probability i literature In chemical thermodynamics 1/β = kBT corresponds to the temperature in energy units, p is the pressure, and The methods used in this paper are those common are the chemical potentials. These gradients of S in the physics literature, particularly the literature from i with{ } respect to its extensive arguments are the intensive reaction-diffusion theory, of which Refs. [16, 17] are repre- state variables in the thermodynamic description. sentative. Parallel to this physics literature is a very large Eq. (B1) may be re-arranged to express entropy change literature on stochastic processes and large-deviations in the units of any of the constraining state variables that theory, framed within the language of modern probabil- are shared between system sub-components, or between ity theory [75–78]. The approaches omitted in this re- the system and its environment. When energy is chosen view have in common a starting point in the reverse Kol- as the unit of measure – a natural choice since it is con- mogorov equation, which emphasizes the time-evolution served among nearly all forms of systems in contact – the of operators corresponding to observables, rather than result is the usual thermodynamic statement of “conser- the forward-Kolmogorov or master equation used here, vation of energy”, which models the time-evolution of probability distribu- tions. Rigorous proofs of convergence, which have not δU = k T δS pδV + δn , (B2) been the main concern in this article, are sometimes made B − i i i easier by using the observable representation.

In this representation pδV and i δni are interpreted respectively as mechanical and{ chemical} increments of Appendix B: State variables, entropy, and Gibbs work, and kBT δS labeled “heat”. Eq. (B2) is only a free energy in classical thermodynamics “conservation law” in the sense that δU is a lower bound on the energy change required to permit a change δS, in The entropy is the state function maximized subject to the context where volume also changes by δV and particle the constraints on the system configurations. The values numbers by δni . of the constraints are given by the state variables that The expression{ } of competition of entropy terms in the are the arguments of the entropy. relative probability, when a small system is coupled to a much larger environment, is that we may often expand S(U,V, ni ) . the entropy in the environment to linear order in the { } exchanged amounts of any conserved quantity that the Common sources of constraint are internal energy U, vol- system and environment dynamically apportion between ume V , or the numbers ni of different kinds of particles. them. The result for the total relative probability of an { } The entropy arises as the leading exponential term in apportionment that leaves U and V in the system is the relative probabilities of different states of the system, grouped according to the values of the constraints U, V , eS(U,V,{ni})−βE U−βE pE V . (B3) 37

We have supposed in the expression (B3) that energy If we wish to express classical thermodynamic relations and volume can be exchanged, but not particle number, in the form that relates most directly to the fluctuation- and that intra-system equilibration leads to the relative origins of the entropy, we may simply work directly with probability eS that would be maximum with U and V the logarithm of the relative probability, which is as constraints. Here βE and pE are regarded as fixed de- T scriptors of the environment, by supposing that it is large βG(kB ,p, ni ) max [S(U,V, ni ) β (U + pV )] . − { } ≡ U,V { } − compared to the system, so that only the first derivative (B5) of its entropy need be known. βE and pE therefore are We will take the sign and normalization from Eq. (B5) not functions of U or V . to define a standard Legendre transform of S. The expected or stable state of the system is the one The fluctuation theorems in Sec. III, for particle ex- that maximizes the joint probability (B3) over U and V . change, are similar in structure to the intermediate ex- Ordinarily this maximization is regarded to take place pression (B3) for exchanges of energy or volume. βG within the fluctuations of the ensemble, and values other takes the place of S, because it is assumed that energy− than the maximizers are never used in the combination and volume equilibrate on times much shorter than the S (U,V, n ) β U β p V in classical thermodynam- { i} − E − E E equilibration times for particle exchange. In addition, the ics. two states of the system exchange particles only with each If we wish to use the classical theory to understand the other and not with the environment, so neither Gibbs free effects of constraints on system states, we think of using energy is evaluated only to linear order. βE and pE as control variables, imposed through the in- teraction between the system and the environment. The log-probability as a function of these control variables Appendix C: Exact solution for continuous sources then becomes maxU,V [S(U,V, ni ) βEU βEpEV ]. in the time-dependent generating functional { } − − At the maximizers, β = βE and p = pE. If we con- tinue to denominate entropy changes in units of energy, The non-local relation between current sources and the and if we choose signs so that the U, V extremum is a 24 non-equilibrium paths they produce makes computation minimum, then the log-probability is proportional to of the functional Legendre transform technically chal- Gibbs Free Energy the lenging for all but simple cases. However, the simplicity of the Gaussian integral in coherent-state variables makes G(kBT,p, ni ) min [U + pV kBTS(U,V, ni )] . { } ≡ U,V − { } it possible to understand this non-locality with exact so- (B4) lutions. The coherent-state variables in this appendix Now p and β have been set equal in the system and en- will follow the diagonalization of Sec. IVB vironment. The subscripts pE, βE have therefore been The coherent-state field action (92) may be recast in dropped, and the minimizing U and V are now functions the matrix form [39] characteristic of all two-field action of p and β. functionals,25 as

ˆ ˆ † φb φa ∂τ +ν ¯a j/2 ν¯a φb Sj = N dτ − − − † ν¯b ∂τ +ν ¯b j/2 φ − − − a √2φˆ 1/2Φˆ 1 1 j 1/2φ† = N dτ ∂τ + σ0 + σ3 ν¯ + σ1 + iνσ¯ 2 . (C1) − 2 2 − 2 √2Φ†

In the second line, σ designates the 2 2 identity matrix and the other σ are the Pauli matrices [54] 0 × i 1 σ1 = 1 24 This convention is arbitrary, of course, but it helps to realize i σ2 = − that the choice to minimize the free energy is a relic of the i notion from 19th-century mechanics that energy minimization identifies stable states. Keeping entropy maximization foremost emphasizes that stability comes from the relative measures of 1 σ3 = . (C2) groups of microstates partitioned according to different values 1 for a few macroscopic averages, and we will use such a conven- − tion throughout the paper. The solution to the stationary point conditions, and for 25 For relations among these, see the literature review in App. A. the effective action more generally, follows closely what 38 has already been done for the single-time generating func- In the same manner, for continuous sources (no δ- tion. The major difference is that we must replace the functions, so we do not have to worry about exponential solution of scalar differential equations with the time- tails) j 0, and the equilibrium state as the initial τ<0 ≡ ordered matrix solution of a linear differential equation distribution, the lower boundary condition for the φˆ fields expressed in terms of the matrix in Eq. (C1). The 2 2 is matrix kernel, which depends on parameters of the× po- tential and on j, will be denoted by Φˆ 0 √2φˆ 1/2Φˆ = 2¯ν 1 , (C6) 0 √ 1 1 j 2 σ[¯ν,j] σ + σ ν¯ + σ + iνσ¯ . (C3) ≡ 2 0 2 3 − 2 1 2 and the general time-dependent solution is The expression of the boundary conditions at time T ˆ τ (here again T will be assumed finite but large) may be Φ0 −1 − dτσ[¯ν,jτ ] √2φˆ 1/2Φˆ = 2¯ν 1 e 0 . written τ √2 T (C7) 1/2φ† 0 Eq. (C7) generalizes the constant solution (60) due to = √2 , (C4) † nonzero j . √2Φ 1 τ T The constraint of total number fixes the normalization ˆ and in terms of these, the general solution for the φ† fields of Φ0 as before, through the relation propagates this constraint backward in time, as √2φˆ 1/2Φˆ 1/2φ† τ † Tˆ 1= † (at any τ) 1/2φ −1 − dτσ[¯ν,jτ ] 0 √2Φ = √2 e τ . (C5) τ √ † T 2Φ 1 ˆ τ T −1 − dτσ[¯ν,jτ ] 0 = Φˆ 0 2¯ν 1 e 0 , (C8) Here −1 denotes the inverse time-ordering operator T 1 whichT arranges terms in the exponential from right to left in order of decreasing time. The solution (C5) is the corresponding to Eq. (61). The general time-dependent direct generalization of the solutions (58,59). number expectation nτ is immediately seen to satisfy

√2φˆ 1/2Φˆ 1 1/2φ† 2nτ = N τ 1 √2Φ† τ

τ Tˆ −1 − dτσ[¯ν,jτ ] −1 − dτσ[¯ν,jτ ] 0 = NΦˆ 0 2¯ν 1 e 0 σ1 e τ T T 1 Tˆ δ −1 − dτσ[¯ν,jτ ] 0 = 2 N log 2¯ν 1 e 0 δjτ T 1 δ = 2 Γ[j] . (C9) − δjτ

The functional form in the first two lines of Eq. (C9) di- Drawing ντ = nτ /N from Eq. (C9), the stochastic ef- rectly generalizes Eq. (100) for the point source. The last fective action is two lines are the functional equivalent, with a variational Tˆ eff derivative replacing the partial derivative, of the single- S [n]= N dτjτ ντ + Γ[j] . (C11) time relation (25). And indeed, by exactly the evaluation 0 leading to Eq. (68), we find that the cumulant-generating Eq. (C11) is the promised functional which cor- functional is given by rectly generalizes the notion of an entropy difference from Eq. (32) to a space of histories. It represents a Tˆ −1 − dτσ[¯ν,jτ ] 0 large-deviations principle for the stochastic process from Γ[j]= N log 2¯ν 1 e 0 . ˆ − T 1 Eq. (3), with a rate function T dτj ν +Γ[j] /N which is 0 τ τ (C10) strictly N-independent for Gaussian fluctuations. In this We now have an exact expression for the effective ac- sense the linear reaction is the analog of the Gaussian tion which generalizes the single-time expression (33). distribution, and its large-deviations principle is simply 39 the for a system with finite vari- defined from the transition-state and single-species one- ance. The residual functional determinant for fluctua- particle chemical potentials as tions about the mean value – though it will not be com- ab m 1 −β ‡ − a puted here – would introduce sub-extensive corrections in kba = e i=1 i N just like those of the residual entropy in Eq. (17). (See ab n 1 −β‡ − b Refs [22] Ch. 16, [39], and [42] Ch. 7 for good treatments kab = e j=1 j . (D2) of such calculations). The forward and backward half-reaction rates associated with the reaction (D1), with the simplified dilute-solution Appendix D: Entropy-rate formulae for assumption that activities are proportional to concentra- multi-particle reactions tions, are then given by m

Independent random walks between wells, by a par- rba = kba nai ticles of a single type, may be modeled in the discrete- i=1 ab m 1 state approximation by diffusion on an ordinary graph. If −β (β +log na ) = e ‡ e i=1 ai i the the mass-action rates satisfy detailed balance in the ab m −β − a steady state, the graph may be considered undirected; e ‡ i=1 i ≡ n otherwise it is directed. Chemical reactions in which multiple particles jointly move through a transition state rab = kab nbj . j=1 from reactants to products generalize this relation from ab n 1 graphs to hypergraphs. A graph need not be directed in −β β +log nb = e ‡ e j=1 bj j the case of detailed balance, because the nodes at either ab n −β − b end of a link are naturally complementary. Hypergraphs, e ‡ j=1 j . (D3) ≡ in contrast, are defined to permit arbitrary collections of nodes as the “boundary” of a hyper-edge. We therefore In the succeeding lines of each rate law, dropping the require directed hyper-edges to model chemistry, which superscript 1 on the species chemical potentials reflects distinguish a subset of nodes as inputs and a second sub- the absorption of number factors to form the n-particle set as outputs. chemical potentials, as in Eq. (10). The surprising formula (115) for the probability rate of persistent non-equilibrium paths may immediately be extended to arbitrary networks of chemical reactions, a. Master equation, Liouville operator, and hence, from graphs to directed hypergraphs. Global net- Freidlin-Wentzell action work topology will not be considered here, but the rate formulae for arbitrary multiparticle reactions at steady- The master equation for the stochastic reaction net- state dis-equilibria will be given. work immediately generalizes Eq. (34) to Consider a general collection of bi-directional reactions m n indexed by their reactant and product sides as r and p. ∂ρn ∂/∂na − ∂/∂nb = e i=1 i j=1 j 1 r ρ . Individual species participating in the reaction generalize ∂t − ba n ab the wells a and b of the two-state system, to a many- (D4) well problem where indices ai and bj are drawn from a Note that the reaction order matters in this expression, common set of species. The reaction formula is so that the same exponential of shift operators occurs ba in two terms but with opposing sign, acting on different pb + + pb ↽⇀ ra + + ra . (D1) products of the species numbers n or n . 1 n 1 m ai bj The construction of the field functional integral pro- To simplify the notation, rather than writing stoichio- ceeds just as for the linear reaction. Skipping directly to metric coefficients other than unity, it will be assumed action-angle variables, the Liouville operator generalizing that the coefficients ai or bj can repeatedly take the same Eq. (73) becomes values as needed. The order in which the indices ab are written defines the relation between the sign of the cur- m n m ηbj − ηai rent and the changes of concentrations, so that ab and = kba na 1 e j=1 i=1 . (D5) L i − i=1 ba denote the same reaction but with opposite sign con- ba ventions for the current. When only the pair matters, without respect to sign, the pair-index will be denoted The corresponding action functional generalizing by ab . Eq. (74) (and dropping surface terms associated with The complexes of reactants and products behave in total number) is then the transition state as a single “particle”, for which the I one-particle free energy (or chemical potential) will be S = dt ∂ η n + . (D6) ab − t k k L denoted ‡ . The half-reaction rate constants are then k=1 40

For the general network, which may have multiple char- A sequence of evaluations identical to those leading to acteristic timescales in different reactions, it is less con- Eq. (113) then produces an equivalent form for the set venient to find an overall non-dimensionalization of time of many-particle reactions, now expressed as a sum over than to simply work in real-time coordinates, so we will reactions only (so unordered pairs ab ) write all integrals with measure dt. If, as for the double-well, sources are coupled only to (arbitrary polynomials in) the nk, the nk equation of mo- tion will continue to depend explicitly only on the η vari- ables. Hence, the response to the sources will be through n 2 ab 1 m 1 −β βa βb these variables. The nk stationary-point equations there- = e ‡ e 2 i=1 i e 2 j=1 j fore continue to determine quantities such as the value of L − ab appearing in the steady non-equilibrium effective ac- 2 tion.L From these, the forms of sources needed to produce m n such backgrounds can then be inferred. The equation for = kba na kab nb . (D9)  i − j  n is ab i=1 j=1 k   n m ab m ∂ η − η −β ‡ − ai bj ai ∂tnk = e i=1 e j=1 i=1 . ∂ηk ab (D7) Again, the sum is symmetric under exchange of a and b, Eq. (D9) generalizes Eq. (115) for the single linear re- so wherever ∂/∂ηk acts, it acts equally and with oppo- action. Other terms in the effective action vanish on site sign in two terms. Cancellation of all such pairs of a constant-n history, and so this is the expression from terms corresponds to detailed balance, except that the η Freidlin-Wentzell theory that governs fluctuations in the variables in the exponential can take nonzero values. manner of an entropy-rate difference. A topic for future Eq. (D7) can be solved with ∂tnk 0 for any configu- work is to extend the Jaynes path-entropy methods to ration of the n by setting ≡ { k} many-particle reactions, to determine whether combina- 1 1 torial and environment entropy rates may be isolated for η = β = β1 + log n . (D8) these as for the double-well. k 2 k 2 k k

[1] Richard S. Ellis. Entropy, Large Deviations, and Statis- J. Math. Phys., 2:407–32, 1961. tical Mechanics. Springer-Verlag, New York, 1985. [15] L. V. Keldysh. Diagram technique for nonequilibrium [2] Hugo Touchette. The large deviation approach to statis- processes. Sov. Phys. JETP, 20:1018, 1965. tical mechanics. Phys. Rep., 478:1–69, 2009. [16] Daniel C. Mattis and M. Lawrence Glasser. The uses [3] M. I. Freidlin and A. D. Wentzell. Random perturbations of quantum field theory in diffusion-limited reactions. in dynamical systems. Springer, New York, second edi- Rev. Mod. Phys, 70:979–1001, 1998. tion, 1998. [17] J. Cardy. Field theory and non-equilibrium [4] M. Doi. Second quantization representation for classical statistical mechanics. 1999. http://www- many-particle system. J. Phys. A, 9:1465–1478, 1976. thphys.physics.ox.ac.uk/users/JohnCardy/home.html. [5] M. Doi. Stochastic theory of diffusion-controlled reaction. [18] Claude Elwood Shannon and Warren Weaver. The Math- J. Phys. A, 9:1479–, 1976. ematical Theory of Communication. U. Illinois Press, Ur- [6] L. Peliti. Path-integral approach to birth-death processes bana, Ill., 1949. on a lattice. J. Physique, 46:1469, 1985. [19] Thomas D. Schneider. Theory of Molecular Ma- [7] L. Peliti. Renormalization of fluctuation effects in a+a → chines I: Channel Capacity of Molecular Machines. a reaction. J. Phys. A, 19:L365, 1986. J. Theor. Biol., 148:83–123, 1991. [8] E. T. Jaynes. The minimum entropy production princi- [20] Thomas D. Schneider. Theory of Molecular Ma- ple. Annu. Rev. Phys. Chem., 31:579–601, 1980. chines II: Energy Dissipation from Molecular Machines. [9] E. T. Jaynes. : the logic of science. J. Theor. Biol., 148:125–137, 1991. Cambridge U. Press, New York, 2003. [21] R. Graham. Path integral formulation of general diffusion [10] Enrico Fermi. Thermodynamics. Dover, New York, 1956. processes. Z. Phys. B, 26:281–290, 1977. [11] Charles Kittel and Herbert Kroemer. Thermal Physics. [22] Steven Weinberg. The quantum theory of fields, Vol. II: Freeman, New York, second edition, 1980. Modern applications. Cambridge, New York, 1996. [12] Kerson Huang. Statistical mechanics. Wiley, New York, [23] Lars Onsager. Reciprocal Relations in Irreversible Pro- 1987. cesses. I. Phys. Rev., 37:405–426, 1931. [13] P. C. Martin, E. D. Siggia, and H. A. Rose. Statistical [24] Lars Onsager. Reciprocal Relations in Irreversible Pro- dynamics of classical systems. Phys. Rev. A, 8:423–437, cesses. II. Phys. Rev., 38:2265–2279, 1931. 1973. [25] S. R. de Groot and P. Mazur. Non-equilibrium Thermo- [14] J. Schwinger. of a quantum oscillator. dynamics. Dover, New York, 1984. 41

[26] Dilip Kondepudi and Ilya Prigogine. Modern Thermo- tion’ principle. J. Phys. A: Math. Theor., 40:9717–9720, dynamics: from Heat Engines to Dissipative Structures. 2007. Wiley, New York, 1998. [48] Joseph G. Polchinski. and effec- [27] Rick Durrett. Essentials of stochastic processes. Springer, tive lagrangians. Nuclear Physics B, 231:269–295, 1984. New York, 1999. [49] Thomas M. Cover and Joy A. Thomas. Elements of In- [28] Robert S. Maier and D. L. Stein. Escape problem for formation Theory. Wiley, New York, 1991. irreversible systems. Phys. Rev. E, 48:931–938, 1993. [50] Herbert S. Wilf. Generatingfunctionology. A K Peters, [29] Robert S. Maier and D. L. Stein. Transition-rate theory Wellesley, MA, third edition, 2006. for nongradient drift fields. Phys. Rev. Lett., 69:3691– [51] Herbert Goldstein, Charles P. Poole, and John L. Safko. 3695, 1992. . Addison Wesley, New York, third [30] Robert S. Maier and Daniel L. Stein. Asymptotic exit lo- edition, 2001. cation distributions in the stochastic exit problem. SIAM [52] Pierre Gaspard. Time-reversed dynamical entropy J. App. Math., 57:752, 1997. and irreversibility in Markovian random processes. [31] Robert S. Maier and D. L. Stein. Effect of focusing and J. Stat. Phys., 117:599–615, 2004. caustics on exit phenomena in systems lacking detailed [53] G. Nicolis and I. Prigogine. Fluctuations in nonequlib- balance. Phys. Rev. Lett., 71:1783–1786, 1993. rium systems. Proc. Nat. Acad. Sci. USA, 68:2102–2107, [32] Robert S. Maier and Daniel L. Stein. A scaling theory of 1971. bifurcations in the symmetric weak-noise escape problem. [54] J. J. Sakurai. Modern quantum mechanics. Benjamin J. Stat. Phys., 83:291, 1996. Cummings, Reading, MA, 85. [33] Robert S. Maier and D. L. Stein. How an anomalous cusp [55] Eric Smith and Supriya Krishnamurthy. Collective fluc- bifurcates. Phys. Rev. Lett., 85:1358–1361, 2000. tuations in evolutionary games. in preparation, 2010. [34] Robert S. Maier and D. L. Stein. Oscillatory behavior [56] Supriya Krishnamurthy, Eric Smith, David C. Krakauer, of the rate of escape through an unstable limit cycle. and Walter Fontana. The stochastic behavior of a molec- Phys. Rev. Lett., 77:4860–4863, 1996. ular switching circuit with feedback. Biology Direct, 2:13, [35] Robert S. Maier and D. L. Stein. Noise-activated escape 2007. doi:10.1186/1745-6150-2-13. from a sloshing potential well. Phys. Rev. Lett., 86:3942– [57] Kingshuk Ghosh, Ken A. Dill, Mandar M. Inamdar, Ef- 3945, 2001. frosyni Seitaridou, and Rob Phillips. Teaching the prin- [36] Robert S. Maier and D. L. Stein. Droplet nucle- ciples of statistical dynamics. Am. J. Phys., 74:123–133, ation and domain wall motion in a bounded interval. 06. Phys. Rev. Lett., 87:270601, 2001. [58] L. Onsager and S. Machlup. Fluctuations and irreversible [37] Gregory L. Eyink. Action principle in nonequilibrium processes. Phys. Rev., 91:1505, 1953. statistical dynamics. Phys. Rev. E, 54:3419–3435, 1996. [59] Steven Weinberg. The quantum theory of fields, Vol. I: [38] R. Graham and T. T´el. Existence of a potential for dis- Foundations. Cambridge, New York, 1995. sipative dynamical systems. Phys. Rev. Lett., 52:9–12, [60] Gerald D. Mahan. Many-particle physics. Plenum, New 1984. York, second edition, 1990. [39] Alex Kamenev. Keldysh and doi-peliti techniques [61] Gerhard Stock, Kingshuk Chosh, and Ken A. Dill. Maxi- for out-of-equilibrium systems. pages arXiv:cond– mum caliber: a variational approach applied to two-state mat/0109316v2, 2001. Lecture notes presented at Wind- dynamics. J. Chem. Phys., 128:194192:1–12, 2009. sor NATO school on “Field Theory of Strongly Corre- [62] David Wu, Kingshuk Ghosh, Mandar Inamdar, Heun Jin lated Fermions and Bosons in Low-Dimensional Disor- Lee, Scott Fraser, Ken Dill, and Rob Phillips. Tra- dered Systems” (August 2001). jectory approach to two-state kinetics of single parti- [40] Eric Smith. Quantum-classical correspondence principles cles on sculpted energy landscapes. Phys. Rev. Lett., for locally non-equilibrium driven systems. Phys. Rev. E, 103:050603:1–4, 2009. 77:021109, 2008. [63] C. Jarzynski. Nonequilibrium work relations: founda- [41] E. M. Lifshitz and L. P. Pitaevskii. Statistical Physics, tions and applications. Eur. Phys. J. B, 64:331–340, Part II. Pergamon, New York, 1980. 2008. [42] Sidney Coleman. Aspects of symmetry. Cambridge, New [64] Ludwig Boltzmann. Uber¨ die Zustand des York, 1985. W¨armegleichgewichtes eines Systems von K¨orpern [43] Eric Smith. Thermodynamic dual structure of linearly mit R¨ucksicht auf die Schwerkraft. Weiner Berichte, dissipative driven systems. Phys. Rev. E, 72:36130, 2005. 76:373–435, 1877. [44] Roderick Dewar. Information theory explanation of the [65] Ludwig Boltzmann. Popul¨are Schriften. J. A. Barth, fluctuation theorem, maximum entropy production and Leipzig, 1905. re-issued Braunschweig: F. Vieweg, 1979. self-organized criticality in non-equilibrium stationary [66] J. Willard Gibbs. The Scientific Papers of J. Willard states. J. Phys. A: Math. Gen., 36:631–641, 2003. Gibbs, Vol. 1: Thermodynamics. Ox Bow Press, New [45] Roderick C. Dewar. Maximum entropy production and Haven, CT, 1993. non-equilibrium statistical mechanics. In A. Kleidon [67] K. G. Wilson and J. Kogut. The renormalization group and R. Lorenz, editors, Non-Equilibrium Thermodynam- and the ε expansion. Phys. Rep., Phys. Lett., 12C:75– ics and Entropy Production: Life, Earth and Beyond, 200, 1974. pages 41–55. Springer-Verlag, Heidelberg, 2004. [68] Steven Weinberg. The first three minutes. Basic Books, [46] R. C. Dewar. Maximum entropy production and the fluc- New York, second edition, 1993. tuation theorem. J. Phys. A: Math. Gen., 38:L371–L381, [69] Steven Weinberg. Dreams of a final theory. Pantheon, 2005. New York, 1992. [47] G. Grinstein and R. Linsker. Comments on a deriva- [70] Nigel Goldenfeld and Carl Woese. Life is physics: evo- tion and application of the ‘maximum entropy produc- lution as a collective phenomenon far from equilibrium. 42

Ann. Rev. Cond. Matt. Phys., 2010. [75] Thomas G. Kurtz. Approximation of population pro- [71] N. G. van Kampen. Stochastic Processes in Physics and cesses. Soc. Ind. Appl. Math., Philadelphia, PA, cbms-nsf Chemistry. Elsevier, Amsterdam, third edition, 2007. vol.36 edition, 1981. [72] Takeo Matsubara. A new approach to quantum- [76] Stewart N. Ethier and Thomas G. Kurtz. Markov Pro- statistical mechanics. Prog. Theor. Phys. (Kyoto), cesses: Characterization and Convergence. Wiley, New 14:351–378, 1955. York, 1986. [73] R. Graham and T. T´el. Weak-noise limit of fokker-planck [77] Bert Fristedt and Lawrence Gray. A modern approach to models and nondifferential potentials for dissipative dy- probability theory. Birkh¨auser, Boston, 1997. namical systems. Phys. Rev. A, 31:1109–1122, 1985. [78] Jin Feng and Thomas G. Kurtz. Large Deviations for [74] S. Glasstone, K. J. Laidler, and H. Eyring. The Theory Stochastic Processes. Providence, Rhode Island, 2006. of Rate Processes. McGraw Hill, New York, 1941.