PHYSICAL REVIEW RESEARCH 2, 033442 (2020)

Field master equation theory of the self-excited Hawkes process

Kiyoshi Kanazawa (1) and Didier Sornette (2,3,4)
(1) Faculty of Engineering, Information and Systems, University of Tsukuba, Tennodai, Tsukuba, Ibaraki 305-8573, Japan
(2) ETH Zurich, Department of Management, Technology, and Economics, Zurich 8092, Switzerland
(3) Tokyo Tech World Research Hub Initiative, Institute of Innovative Research, Tokyo Institute of Technology, Tokyo 152-8550, Japan
(4) Institute of Risk Analysis, Prediction, and Management, Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China

(Received 9 January 2020; accepted 25 August 2020; published 21 September 2020)

A field theoretical framework is developed for the Hawkes self-excited point process with arbitrary memory kernels by embedding the original non-Markovian one-dimensional dynamics onto a Markovian infinite-dimensional one. The corresponding Langevin dynamics of the field variables is given by stochastic partial differential equations that are Markovian. This is in contrast to the Hawkes process, which is non-Markovian (in general) by construction as a result of its (long) memory kernel. We derive the exact solutions of the Lagrange-Charpit equations for the hyperbolic master equations in the Laplace representation in the steady state, close to the critical point of the Hawkes process. The critical condition of the original Hawkes process is found to correspond to a transcritical bifurcation in the Lagrange-Charpit equations. We predict a power law scaling of the probability density function (PDF) of the intensities in an intermediate asymptotic regime, which crosses over to an asymptotic exponential function beyond a characteristic intensity that diverges as the critical condition is approached. We also discuss the formal relationship between quantum field theories and our formulation. Our field theoretical framework provides a way to tackle complex generalizations of the Hawkes process, such as nonlinear Hawkes processes previously proposed to describe the multifractal properties of earthquake seismicity and of financial volatility.

DOI: 10.1103/PhysRevResearch.2.033442

I. INTRODUCTION

The self-excited conditional Poisson process introduced by Hawkes [1–3] has progressively been adopted as a useful first-order model of intermittent processes with time (and space) clustering, such as those occurring in seismicity and financial markets. The Hawkes process was first used and extended in statistical seismology and remains probably the most successful parsimonious description of earthquake statistics [4–9]. More recently, the Hawkes model has known a burst of interest in finance (see, e.g., Ref. [10] for a short review), as it was realized that some of the stochastic processes in financial markets can be well represented by this class of models [11], for which the triggering and branching processes capture the herding nature of market participants (be they due to psychological or rational imitation of human traders or as a result of machine learning and adapting). In the field of financial economics, the Hawkes process has been successfully involved in issues as diverse as estimating the volatility at the level of transaction data, estimating the market stability [11–13], accounting for systemic risk contagion, devising optimal execution strategies, and capturing the dynamics of the full order book [14]. Another domain of intense use of the Hawkes model and its many variations is found in the field of social dynamics on the internet, including instant messaging and blogging such as on Twitter [15] as well as the dynamics of book sales [16], video views [17], success of movies [18], and so on.

The present article, together with Ref. [19], can be considered as a sequel complementing a series of papers devoted to the analysis of various statistical properties of the Hawkes process [20–24]. These papers have extended the general theory of point processes [25] to obtain general results on the distributions of total number of events, total number of generations, and so on, in the limit of large time windows. Here, we consider the opposite limit of very small time windows, and characterize the distribution of "intensities," where the intensity $\nu(t)$ of the Hawkes process at time $t$ is defined as the probability per unit time that an event occurs [more precisely, $\nu(t)dt$ is the probability that an event occurs between $t$ and $t + dt$]. We propose a natural formulation of the Hawkes process in the form of a field theory of probability density functionals taking the form of a field master equation. This formulation is found to be ideally suited to investigate the hitherto ignored properties of the distribution of Hawkes intensities, which we analyze in depth in a series of increasingly sophisticated forms of the memory kernel characterizing how past events influence the triggering of future events.

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

2643-1564/2020/2(3)/033442(28)

This paper is organized as follows. Section II presents the Hawkes process in its original definition. In addition, we provide a comprehensive review of the previous literature on non-Markovian stochastic processes of diffusive transport developed in traditional statistical physics. Historically, the analytical properties of the generalized Langevin equation (GLE) have been intensively studied and we provide a brief review on the similarity and dissimilarity between the GLE and the Hawkes process from the viewpoint of the Markov embedding of the original non-Markovian one-dimensional dynamics onto a Markovian field dynamics. We then proceed to develop a stochastic Markovian partial differential equation equivalent to the Hawkes process in Sec. III. This is done first for the case where the memory kernel is a single exponential, then made of two exponentials, an arbitrary finite number of exponentials, and finally for general memory kernels. It is in Sec. III that the general field master equations are derived. Section IV presents the analytical treatment and provides the solutions of the master equations, leading to the derivation of the probability density function of the Hawkes intensities for the various above-mentioned forms of the memory kernel. We also discuss a formal relationship between quantum field theories and our formulation in Sec. V. Section VI summarizes and concludes by outlining future possible extensions of the formalism. These sections are complemented by seven Appendixes, in which the detailed analytical derivations are provided.

II. MODEL AND LITERATURE REVIEW

A. Notation

First, we explain our mathematical notation for stochastic processes. By convention, we denote stochastic variables with a hat symbol, such as $\hat{A}$, to distinguish them from the nonstochastic real numbers $A$, corresponding, for instance, to a specific realization of the random variable. The ensemble average of any stochastic variable $\hat{A}$ is also denoted by $\langle \hat{A} \rangle$. The probability density function (PDF) of any stochastic variable $\hat{A}$ is denoted by $P_t(\hat{A}(t) = A) = P_t(A)$. The PDF characterizes the probability that $\hat{A}(t) \in [A, A + dA)$ as $P_t(A)dA$. Using the notation of the PDF, the ensemble average reads $\langle \hat{A}(t) \rangle := \int A P_t(A)\,dA$.

We also make the following remark on notations used for our functional analysis. For any function $z(x) \in S_F$ defined for $x \in \mathbb{R}_+ = (0, \infty)$ with a function space $S_F$, we can consider a functional $f[\{z(x)\}_{x \in \mathbb{R}_+}]$, i.e., $f: S_F \to \mathbb{R}$. Functionals in this paper are often abbreviated as $f[z] := f[\{z(x)\}_x]$, with the square brackets emphasized to distinguish from ordinary functions.

B. Definition of the Hawkes conditional Poisson process

The Hawkes process is the simplest self-excited point process, which describes with a linear intensity function $\hat{\nu}$ how past events influence the triggering of future events. Its structure is particularly well suited to address the general and important question occurring in many complex systems of disentangling the exogenous from the endogenous sources of observed activity. It has been and continues to be a very useful model in geophysical, social, and financial systems.

The Hawkes process, as any other point process, deals with events ("points" along the time axis). The theory of point processes indeed considers events as being characterized by a time of occurrence but vanishing duration (the duration of events is very small compared to the interevent times). Thus, to a given event $i$ is associated a time $t_i$ of occurrence. In the case where one deals with spatial point processes, the event has also a position $r_i$. A "mark" $m_i$ can be included to describe the event's size, or its "fertility," i.e., the average number of events it can trigger directly.

The stochastic dynamics of the Hawkes process is defined as follows. Let us introduce a state variable $\hat{\nu}$, called the intensity. The intensity $\hat{\nu}$ is a statistical measure of the frequency of events per unit time (i.e., a shock occurs during $[t, t + dt)$ with the probability of $\hat{\nu}dt$). In the Hawkes process, the intensity satisfies the following stochastic sum equation (see Fig. 1):

$$ \hat{\nu}(t) = \nu_0 + n \sum_{i=1}^{\hat{N}(t)} h(t - \hat{t}_i), \tag{1} $$

where $\nu_0$ is the background intensity, $\{\hat{t}_i\}_i$ represent the time series of events, $n$ is a positive number called the branching ratio, $h(t)$ is a normalized positive function [i.e., $\int_0^\infty h(t)dt = 1$], and $\hat{N}(t)$ is the number of events during the interval $[0, t)$ (called "counting process"). One often refers to $\hat{\nu}(t)$ as a conditional intensity in the sense that, conditional on the realized sequence of $\hat{N}(t) = k$ (with $k \geq 0$) events, the probability that the $(k + 1)$-th event occurs during $[t, t + dt)$, such that $\hat{t}_{k+1} \in [t, t + dt)$, is given by $\hat{\nu}(t)dt$. The pulse (or memory) kernel $h(t)$ represents the non-Markovian influence of a given event and is non-negative.

FIG. 1. Schematic representation of the Hawkes process (1) with an exponential kernel (13). The event occurring at a given time stamp is represented by a jump in the intensity $\hat{\nu}(t)$, the probability per unit time that a next event will occur.

The branching ratio $n$ is a very fundamental quantity, which is the average number of events of first generation ("daughters") triggered by a given event [8,25]. This definition results from the fact that the Hawkes model, as a consequence of the linear structure of its intensity (1), can be mapped exactly onto a branching process, making unambiguous the concept of generations: More precisely, a given realization of the Hawkes process can be represented by the set of all possible tree combinations, each of them weighted by a certain probability
derived from the intensity function [26]. The branching ratio is the control parameter separating three different regimes: (i) $n < 1$, subcritical; (ii) $n = 1$, critical; and (iii) $n > 1$, supercritical or explosive (with a finite probability). The branching ratio $n$ can be shown to be also the fraction of events that are endogenous, i.e., that have been triggered by previous events [27].

C. Review of non-Markovian stochastic processes in the framework of the generalized Langevin equation

This subsection provides the background of previous methods for non-Markovian stochastic processes in statistical physics, by focusing on the diffusive dynamics of Brownian particles (e.g., see Refs. [28] for detailed reviews). While this class of physical stochastic processes exhibits dynamical characteristics that are quite different from those of the Hawkes processes, our framework can be formally related to such standard theories (in particular for the Markov embedding techniques for non-Markovian processes). We thus offer a comprehensive review to prepare the reader to better understand our theoretical developments. This subsection is written in a self-contained way and is not needed to understand our main results; readers only interested in our formulation can skip this section.

1. Markovian Langevin equations

a. Langevin equation. In the context of statistical physics, non-Markovian stochastic processes have been studied from the viewpoint of diffusion processes [29,30]. One of the typical diffusive models is the Langevin equation,

$$ M\frac{d\hat{v}(t)}{dt} = -\frac{dU(\hat{x})}{d\hat{x}} - \gamma\hat{v}(t) + \hat{\eta}(t), \qquad \frac{d\hat{x}(t)}{dt} = \hat{v}(t), \tag{2} $$

where $\hat{v}(t)$ and $\hat{x}(t)$ are the velocity and the position of the Brownian particle, $M$ is its mass, $U(\hat{x})$ is the confining potential, $\gamma$ is the viscous friction coefficient, and $\hat{\eta}(t)$ represents the thermal fluctuation modeled by the zero-mean white Gaussian noise. Equation (2) is one of the stochastic differential equations (SDE) first written down in the history of physics. The noise term embodying the presence of thermal fluctuation satisfies the fluctuation-dissipation relation (FDR)

$$ \langle \hat{\eta}(t)\hat{\eta}(t') \rangle = 2\gamma T\delta(t - t'), \tag{3} $$

where $T$ is the temperature. In this paper, the Boltzmann constant is taken unity: $k_B = 1$. The FDR must hold for relaxation dynamics near equilibrium states because both viscous friction and thermal fluctuation come from the same thermal environment [30].

The standard analytical solution to this Langevin equation can be obtained via the time-evolution equation for the joint PDF $P_t(v, x)$, which is given by the Fokker-Planck (FP) equation [31,32]:

$$ \frac{\partial P_t(v, x)}{\partial t} = \mathcal{L}_{\rm FP} P_t(v, x), \qquad \mathcal{L}_{\rm FP} := -\frac{\partial}{\partial x}v + \frac{\partial}{\partial v}\left[\frac{\gamma}{M}v + \frac{1}{M}\frac{dU(x)}{dx}\right] + \frac{\gamma T}{M^2}\frac{\partial^2}{\partial v^2}. \tag{4} $$

It is remarkable that the FP equation is always linear in terms of the PDF even if the Langevin equation has nonlinear terms in general. While there is a one-to-one correspondence between the Langevin equation and the FP equation, the FP equation is often analyzed because the standard methods based on linear algebra are available, such as the eigenfunction expansion.

The Langevin equation (2) can be interpreted as the equation of motion for a Brownian particle, obtained after integrating out the many degrees of freedom of the original microscopic dynamics except for those of the Brownian particles. For example, Eq. (2) can be systematically derived from the Hamiltonian dynamics of the Brownian particle surrounded by a dilute gas (see the kinetic frameworks [32–34] for examples).

b. Non-Markovian nature as a result of variable elimination. Let us focus on the case of a harmonic potential $U(\hat{x}) = \frac{1}{2}k\hat{x}^2$ and eliminate the velocity [30] to change the system descriptions from $(\hat{v}, \hat{x})$ to $\hat{x}$. By integrating out the velocity degree, we obtain the non-Markovian dynamics for the position $\hat{x}(t)$:

$$ \frac{d\hat{x}(t)}{dt} = -\int_0^\infty ds\,K(s)\hat{x}(t - s) + \hat{\zeta}(t), \qquad \hat{\zeta}(t) := \frac{1}{M}\int_0^\infty ds\,e^{-\gamma s/M}\hat{\eta}(t - s). \tag{5} $$

Here, the memory kernel $K(t) := ke^{-\gamma t/M}/M$ represents the retarded potential effect and the noise term $\hat{\zeta}(t)$ is the colored Gaussian noise with zero mean $\langle \hat{\zeta}(t) \rangle = 0$ and autocorrelation $\langle \hat{\zeta}(t)\hat{\zeta}(t') \rangle = (T/M)e^{-\gamma|t - t'|/M}$. Remarkably, the non-Markovian nature has appeared as a result of variable elimination. Here the non-Markovian version of the FDR holds, $\langle \hat{\zeta}(t)\hat{\zeta}(t') \rangle = \langle x^2 \rangle_{\rm eq}K(|t - t'|)$, with equilibrium average $\langle x^2 \rangle_{\rm eq} := k_B T/k$.

2. Generalized non-Markovian Langevin equation

While the Markovian Langevin description (2) is reasonable for dilute thermal environments [35,36], such a Markovian description is not available for dense thermal environments, such as liquids [35,37,38]. Indeed, Eq. (2) is not valid even for a Brownian particle in water, which is one of the most historically important cases. For such cases, the Langevin description must be modified to accommodate non-Markovian effects originating from hydrodynamic interactions in liquids. The minimal model for such systems is given by the generalized Langevin equation (GLE) [29,30]:

$$ M\frac{d\hat{v}(t)}{dt} = -\frac{dU(\hat{x})}{d\hat{x}} - \int_0^\infty ds\,K(s)\hat{v}(t - s) + \hat{\eta}(t), \tag{6} $$

where $K(t)$ is the memory kernel for viscous friction and $\hat{\eta}(t)$ is the thermal fluctuation modeled by a colored Gaussian noise with zero mean $\langle \hat{\eta}(t) \rangle = 0$. Since both viscous friction and thermal fluctuation share the same origin, they are related to each other via the FDR for non-Markovian processes:

$$ \langle \hat{\eta}(t)\hat{\eta}(t') \rangle = TK(|t - t'|). \tag{7} $$

For typical three-dimensional liquid systems, the memory kernel has a long tail due to the hydrodynamic retardation
effect, such that $K(t) \propto t^{-3/2}$, which was verified by direct experiments in Refs. [35,37,38].

The GLE (6) can be derived from microscopic dynamics by rearrangement of the Liouville operator by the method of the projection operators [30,39]. Assuming that the initial state of the system is sufficiently close to equilibrium and that a sufficient number of macroscopic variables are accessible via experiments, the projection operator formalism provides a microscopic foundation of the non-Markovian stochastic processes with intuitive physical interpretation.

The solution of the GLE (6) is much harder to obtain than that of the Markovian Langevin equations due to its non-Markovian nature. However, due to the linearity of the dynamics and the Gaussianity of the thermal fluctuations [while they are specific to the GLE equation (6)], all the moments and correlation functions can be analytically calculated based on the Laplace transformation of the SDE [40] and the response functions can be expanded as sums of exponentials through the residue theorems [41]. In addition, the recurrence methods of Ref. [42] are also available for this model.

3. Origin of the non-Markovian property and Markov embedding

We have seen that the non-Markovian property appears as the result of variable elimination through the example of Eq. (5). This shows that some non-Markovian systems with an exponential memory kernel can be converted back to a Markovian system by adding an auxiliary variable [30]. This procedure is called Markov embedding [43,44] and can be generalized to transform the GLE (6) into simultaneous Markovian SDEs when the kernel is given by a sum of exponentials:

$$ K(t) = \sum_{k=1}^{K}\kappa_k e^{-t/\tau_k} \;\Longrightarrow\; M\frac{d\hat{v}(t)}{dt} = \sum_{k=1}^{K}\hat{u}_k(t), \qquad \frac{d\hat{u}_k(t)}{dt} = -\frac{\hat{u}_k(t)}{\tau_k} - \kappa_k\hat{v}(t) + \sqrt{\frac{2\kappa_k T}{\tau_k}}\,\hat{\xi}^G_k(t) \tag{8} $$

with independent standard Gaussian noises $\hat{\xi}^G_k(t)$, satisfying $\langle \hat{\xi}^G_k(t) \rangle = 0$ and $\langle \hat{\xi}^G_k(t)\hat{\xi}^G_j(t') \rangle = \delta_{kj}\delta(t - t')$. Here we have taken the free potential case $U(\hat{x}) = 0$ for simplicity. We thus find that the exponential-sum memory case is Markovian by introducing the new system-variable set $\hat{\Gamma} := (\hat{v}, \hat{u}_1, \ldots, \hat{u}_K)$, while it was non-Markovian in the original variable representation $\hat{v}(t)$.

In this sense, whether a system is regarded as Markovian or non-Markovian crucially depends on which variable set is taken for the description of the system. This story is actually consistent with the projection operator formalism: While the original Hamiltonian dynamics obeys the Markovian dynamics (i.e., the time evolution of the phase-space distribution is given by the Liouville equation), the reduced dynamics of macroscopic variables obeys the non-Markovian Langevin dynamics due to integrating out irrelevant variables. Furthermore, a systematic method of Markov embedding [45] was proposed on the basis of the continued-fraction expansion [39].

4. Fokker-Planck descriptions of non-Markovian processes

Compared with Markovian stochastic processes, there are few mathematical techniques that can be applied to the Fokker-Planck (master) equations associated with general non-Markovian processes. One of the formal methods is to use the time-convolution Fokker-Planck equation for macroscopic variables $x$: $\partial P_t(x)/\partial t = \int_0^t ds\,\mathcal{L}_{\rm GFP}(x, s)P_{t-s}(x)$, which was derived from the projection operator formalism [30]. Also, some specific class of non-Markovian SDEs can be formulated as a Fokker-Planck equation with time-dependent coefficients via the functional stochastic calculus [46]. While these equations are formally correct, they are not easy to exploit for practical calculations due to their genuine non-Markovian nature.

One of the most powerful approaches to tackle GLE is to use Markov embedding, thus making the system Markovian. This research direction was proposed by Ref. [45] and there are a variety of ways to select the auxiliary variables. By taking the selection in Eq. (8) according to Ref. [43], we obtain the complete Fokker-Planck equation for the GLE with the exponential-sum memory $K(t) = \sum_{k=1}^{K}\kappa_k e^{-t/\tau_k}$ in the absence of potential $U(\hat{x}) = 0$,

$$ \frac{\partial P_t(\Gamma)}{\partial t} = \sum_{k=1}^{K}\left[-\frac{\partial}{\partial v}\frac{u_k}{M} + \frac{\partial}{\partial u_k}\left(\frac{u_k}{\tau_k} + \kappa_k v\right) + \frac{\kappa_k T}{\tau_k}\frac{\partial^2}{\partial u_k^2}\right]P_t(\Gamma) \tag{9} $$

for the extended phase point $\Gamma := (v, u_1, \ldots, u_K)$ and the corresponding PDF $P_t(\Gamma)$.

5. Field description for an infinite number of auxiliary variables

There are two classes for Markov embedding: One requires a finite number of auxiliary variables [e.g., the GLE with a finite discrete sum of exponential terms to describe the memory (8)], and the other one requires an infinite number of auxiliary variables. The latter class is essentially similar to the classical field theory of stochastic processes, represented by stochastic partial differential equations (SPDE). Indeed, the continuous version of the Markov embedding (8) can be expressed in terms of SPDEs as follows. For the continuous decomposition

$$ K(t) = \int_0^\infty dx\,\kappa(x)e^{-t/x}, \tag{10} $$

we obtain an equivalent Markov embedding representation in the absence of the potential $U(\hat{x}) = 0$,

$$ \frac{d\hat{v}(t)}{dt} = \frac{1}{M}\int_0^\infty dx\,\hat{u}(t, x), \qquad \frac{\partial\hat{u}(t, x)}{\partial t} = -\frac{\hat{u}(t, x)}{x} - \kappa(x)\hat{v}(t) + \sqrt{\frac{2\kappa(x)T}{x}}\,\hat{\xi}^G(t, x), \tag{11} $$

with the spatial white Gaussian noise term $\hat{\xi}^G(t, x)$ satisfying $\langle \hat{\xi}^G(t, x) \rangle = 0$ and $\langle \hat{\xi}^G(t, x)\hat{\xi}^G(t', x') \rangle = \delta(x - x')\delta(t - t')$. This is a simple equation in terms of the time derivative but it can be regarded as a SPDE, since $\hat{u}(t, x)$ is spatially distributed over the auxiliary field $x \in (0, \infty)$. Indeed, this interpretation enables us to apply the functional calculus historically developed for the analytical solution of SPDEs [31].
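
To make the continuous decomposition (10) concrete, the following minimal sketch discretizes it on a lattice of decay times $x_k$, giving a finite exponential sum of the form used in Eq. (8) with $\tau_k = x_k$. The weight density `kappa` is a hypothetical choice made only for illustration (it is not taken from the paper): for $\kappa(x) = x^{-2}e^{-1/x}$, the substitution $s = 1/x$ in Eq. (10) gives $K(t) = 1/(1 + t)$, a slowly decaying kernel. The function names, the log-spaced grid, and its bounds are likewise assumptions of this sketch.

```python
import math

def kappa(x):
    # Hypothetical weight density, chosen for illustration only:
    # with kappa(x) = exp(-1/x) / x**2, Eq. (10) gives K(t) = 1 / (1 + t).
    return math.exp(-1.0 / x) / (x * x)

def exponential_sum_kernel(x_min=1e-2, x_max=1e6, num=2000):
    """Discretize Eq. (10) on a log-spaced grid of decay times x_k.

    Returns (x_k, kappa_k) with kappa_k = kappa(x_k) * dx_k, so that
    K(t) is approximated by sum_k kappa_k * exp(-t / x_k).
    """
    du = (math.log(x_max) - math.log(x_min)) / (num - 1)
    xs, weights = [], []
    for k in range(num):
        x = x_min * math.exp(k * du)
        dx = x * du                  # dx = x d(ln x) on a log-spaced grid
        if k == 0 or k == num - 1:
            dx *= 0.5                # trapezoidal end-point weights
        xs.append(x)
        weights.append(kappa(x) * dx)
    return xs, weights

def kernel(t, xs, weights):
    # Finite exponential sum approximating the continuous superposition.
    return sum(w * math.exp(-t / x) for x, w in zip(xs, weights))

xs, weights = exponential_sum_kernel()
for t in (0.0, 1.0, 10.0, 100.0):
    # Compare the discretized sum with the closed form 1 / (1 + t).
    print(t, kernel(t, xs, weights), 1.0 / (1.0 + t))
```

Each pair $(x_k, \kappa_k)$ corresponds to one auxiliary variable $\hat{u}_k$ in the finite embedding, so refining the grid is the discrete counterpart of passing from Eq. (8) to the field description (11).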


TABLE I. Summary table to compare the generalized Langevin equation and the Hawkes process. Here FDR stands for the fluctuation-dissipation relation.

Model                          GLE (exponential memory)     GLE (general memory)                 Hawkes process (general memory)
Character                      Diffusive transport          Diffusive transport                  Point process
Fundamental equation           Fokker-Planck equation (9)   Field Fokker-Planck equation (12)    Field master equation (40)
Typical systems                Near equilibrium             Near equilibrium                     Out-of-equilibrium
Phenomena                      Relaxation                   Relaxation                           Critical bursts
FDR                            Yes                          Yes                                  No
Non-Markovian representation   SDE (6)                      SDE (6)                              SDE (1)
Markovian representation       SDE (8)                      SPDE (11)                            SPDE (36)
# of auxiliary variables       Finite                       Infinite (field description)         Infinite (field description)

In the domain of mathematics dealing with SPDEs, the Fokker-Planck description is based on the functional calculus for the field variable (called the functional Fokker-Planck equation in Ref. [31]). The corresponding field Fokker-Planck equation can be derived by replacing the discrete sum with the functional derivative:

$$ \frac{\partial P_t[\Gamma]}{\partial t} = \int_0^\infty dx\left[-\frac{\partial}{\partial v}\frac{u(x)}{M} + \frac{\delta}{\delta u(x)}\left(\frac{u(x)}{x} + \kappa(x)v\right) + \frac{\kappa(x)T}{x}\frac{\delta^2}{\delta u^2(x)}\right]P_t[\Gamma] \tag{12} $$

for the extended phase point $\Gamma := (v, \{u(x)\}_{x \in (0,\infty)})$ and the corresponding probability functional $P_t[\Gamma]$.

We note that the mathematically precise definition of such a Fokker-Planck description has not been established yet [31]. One can see a formal divergent term in the FP field equation, such as $[\delta/\delta u(x)]u(x) = \delta(0)$. This divergence is irrelevant for physical observables as shown in Sec. V A 2 since the SPDE is linear. However, general nonlinear SPDEs can have divergences even for observables (see the example of the nonlinear stochastic reaction-diffusion model in Ref. [31], Chap. 13.3.3). According to convention, the safest interpretation is that all the procedures are implicitly discrete and the continuous notation is regarded as a useful abbreviation of the discrete underlying model. By introducing the finite lattice constant $dx$ for the auxiliary variable $x$, we have the variable transformations $u_k \to u(x_k)dx$ and $\kappa_k \to \kappa(x_k)dx$. The functional derivative is then introduced as the formal limit $\delta F[\Gamma]/\delta u(x_k) := \lim_{dx \to 0}(\partial F[\Gamma]/\partial u_k)/dx$ (see Ref. [31], Chap. 13.1.1). In this sense, the derivation of the field Fokker-Planck equation (12) here follows this convention: We first confirm that the discrete Markov embedding (8) works well and then generalize it to its general continuous version (11).

6. Relation to the non-Markovian Hawkes processes

The above historical discussion of non-Markovian diffusive processes provides a clear guideline for different classes of stochastic processes, including the Hawkes process, as follows. The summary highlighting dissimilarities and similarities between the GLE and the Hawkes process is presented in Table I.

a. Dissimilarities. The Hawkes process is an example of non-Markovian point processes, triggering finite-size jumps along a sample trajectory. This is in contrast to the non-Markovian Langevin equation, which is based on infinitesimal Gaussian noise to describe the diffusive local transport, and thus does not include trajectory jumps. This difference should be reflected in the form of the time-evolution equation of the PDF. Indeed, the time evolution of the PDF for the Hawkes process will be shown to obey the master equation (integrodifferential equations), while that for the Langevin dynamics obeys the Fokker-Planck equations (second-order derivative equations).

Another dissimilarity comes from the fact that the Hawkes process is an out-of-equilibrium model typically describing a branching process requiring immigrants or background events to drive the whole sequence of events, while the Langevin equations are near-equilibrium models in the sense that initial distributions of the thermal baths are characterized by small perturbations from the Gibbs distribution. Note that the physical validity of the projection operator formalism is not guaranteed in general for out-of-equilibrium systems [30] characterized by non-Gibbs initial distributions. This conceptual difference is important because the FDR (7) is not necessarily assumed for the Hawkes processes. Indeed, the master equation for the Hawkes process does not satisfy the detailed balance condition [31] (or the time-reversal symmetry) as shown later.

We should stress that the word "nonequilibrium" is used here for systems driven by external forces, which leads to non-Gibbs initial distributions for thermal baths in contact with the system. While we did not review these cases in detail, there are various understandings of the FDR. In the historical context of stochastic processes, the FDR is defined as being closely associated with the time-reversal symmetry of the stochastic processes. In this sense, one can mathematically prove that the Hawkes process is actually "out of equilibrium," in the sense that the corresponding master equation does not satisfy the symmetry condition in Gardiner's textbook [31]. We note that such out-of-equilibrium processes are physically reasonable in general nonequilibrium setups. Indeed, the shot noise process [31] and the Lévy flights dynamics [47] can be observed in out-of-equilibrium systems (e.g., see Refs. [48,49] for their statistical physics derivation from microscopic dynamics), while they do not satisfy the detailed balance condition. In the historical context of the projection operators, the near-equilibrium condition is defined such that the initial distribution for the noise space is characterized by linear response around the Gibbs distribution. In the context of fluctuation theorems [50], the FDR is derived from the assumptions of (1) time-reversal symmetry of the microscopic
dynamics and (2) validity of the Gibbs distribution for the thermal bath in contact with the target system. The common understanding is that the noise source (i.e., the thermal bath in contact with the systems) is characterized by a distribution close to the Gibbs distribution. Therefore, there should not be any confusion when considering the Hawkes process for which the FDR is irrelevant.

b. Similarities. The GLE (6) can be mapped onto a Markovian process (11) by adding a sufficient number of auxiliary variables. As we show below, the same Markov embedding technique is available even for the non-Markovian Hawkes processes (1), by adding a sufficient number of auxiliary variables. Since the memory kernel can be a continuous sum of exponential kernels, the most general description of the Hawkes process should be based on an infinite number of auxiliary variables. As discussed above, such systems are typically described as a classical field theory driven by stochastic terms and thus the dynamics of the original Hawkes process can be finally mapped onto an SPDE (36) and the field master equations (40).

III. MASTER EQUATIONS

We now formulate the master equation for the model (1) and provide its asymptotic solution around the critical point.

A. Markov embedding: Introduction of auxiliary variables

As discussed in Sec. II C 3, non-Markovian properties in many stochastic systems often arise as a result of variable eliminations. This suggests that, reciprocally, it might be possible to map a non-Markovian system onto a Markovian one by adding auxiliary variables (i.e., Markov embedding). While the Hawkes process (1) is non-Markovian in its original representation only based on the intensity $\hat{\nu}(t)$, it can also be transformed onto a Markovian process by selecting an appropriate set of system variables. In this section, we formulate such a Markov embedding procedure to derive the corresponding master equations.

B. The single exponential kernel case

1. Mapping to Markovian dynamics

Before developing the general framework for arbitrary memory kernel $h(t)$, we consider the simplest case of an exponential memory kernel:

$$ h(t) = \frac{1}{\tau}e^{-t/\tau}, \tag{13} $$

satisfying the normalization $\int_0^\infty h(t)dt = 1$. The decay time $\tau$ quantifies how long an event can typically trigger events in the future. This special case is Markovian as discussed in Refs. [51,52], because the lack of memory of exponential distributions ensures that the number of events after time $t$ defines a Markov process in continuous time [53].

As shown in the example (5), some non-Markovian processes can be mapped onto a Markovian stochastic system if the memory function is exponential. Here we show that, likewise, this single exponential case (13) can be mapped onto an SDE driven by a state-dependent Markovian Poisson noise. By decomposing the intensity as

$$ \hat{z} := \hat{\nu} - \nu_0, \tag{14} $$

let us consider the Langevin dynamics

$$ \frac{d\hat{z}}{dt} = -\frac{1}{\tau}\hat{z} + \frac{n}{\tau}\hat{\xi}^P_{\hat{\nu}} \tag{15} $$

with a state-dependent Poisson noise $\hat{\xi}^P_{\hat{\nu}}$ with intensity given by $\hat{\nu} = \hat{z} + \nu_0$ and initial condition $\hat{z}(0) = 0$. The introduction of $\hat{z}$ is similar to the trick proposed in Ref. [54] for an efficient estimation of the maximum likelihood of the Hawkes process. By expressing the state-dependent Poisson noise as

$$ \hat{\xi}^P_{\hat{\nu}}(t) = \sum_{i=1}^{\hat{N}(t)}\delta(t - \hat{t}_i), \tag{16} $$

we obtain the formal solution of Eq. (15):

$$ \hat{\nu}(t) = \nu_0 + \hat{z}(t) = \nu_0 + \frac{n}{\tau}\int_0^t dt'\,e^{-(t - t')/\tau}\hat{\xi}^P_{\hat{\nu}}(t') = \nu_0 + n\sum_{i=1}^{\hat{N}(t)}h(t - \hat{t}_i). \tag{17} $$

This solution shows that the SDE (15) is equivalent to the Hawkes process (1). Equation (15) together with (16) is therefore a short-hand notation for

$$ \hat{z}(t + dt) - \hat{z}(t) = \begin{cases} -\dfrac{1}{\tau}\hat{z}(t)\,dt & (\text{no jump during } [t, t + dt);\ \text{probability} = 1 - \hat{\nu}(t)dt) \\[2mm] \dfrac{n}{\tau} & (\text{jump in } [t, t + dt);\ \text{probability} = \hat{\nu}(t)dt) \end{cases} \tag{18} $$

for the probabilistic time evolution during $[t, t + dt)$ [see Fig. 2(a) for a schematic representation]. Note that the event probability explicitly depends on $\hat{\nu}(t)$, which reflects the endogenous nature of the Hawkes process. This is the first example of the Markov embedding of the Hawkes process. Note that this procedure corresponds to Eq. (8) for the case of the GLE with $K = 1$.
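
The update rule (18) translates almost directly into a time-stepping scheme. The sketch below is a hedged illustration rather than the authors' numerical method: it advances $\hat{z}$ on a grid of step `dt`, and then checks that the resulting intensity coincides with the direct sum (1) for the exponential kernel (13), as stated by Eq. (17). Two choices are assumptions of this sketch: the relaxation over one step is applied exactly through the factor $e^{-dt/\tau}$ instead of the first-order term $-\hat{z}\,dt/\tau$, and each event is time-stamped at the end of its step. The function names and the parameter values (borrowed from the caption of Fig. 2) are illustrative.

```python
import math
import random

def simulate_excess_intensity(nu0, n, tau, dt, steps, seed=0):
    """Time-discretized sketch of Eq. (18) for the exponential kernel (13).

    Assumption: within a step the decay is integrated exactly
    (factor exp(-dt/tau)); a jump of size n/tau is then added with
    probability (nu0 + z) * dt, evaluated at the start of the step.
    Returns the grid values z(k * dt) and the list of event times.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)
    z, zs, events = 0.0, [0.0], []
    for k in range(steps):
        jump = rng.random() < (nu0 + z) * dt   # event probability nu(t) dt
        z *= decay                              # relaxation during the step
        if jump:
            z += n / tau                        # jump placed at the end of the step
            events.append((k + 1) * dt)
        zs.append(z)
    return zs, events

def hawkes_intensity(t, events, nu0, n, tau):
    """Direct evaluation of Eq. (1) with h(t) = exp(-t/tau) / tau.

    Events stamped exactly at time t are included, matching the
    end-of-step convention used in the simulation above.
    """
    return nu0 + n * sum(math.exp(-(t - ti) / tau) / tau
                         for ti in events if ti <= t)

nu0, n, tau, dt, steps = 0.1, 0.5, 1.0, 0.01, 20000
zs, events = simulate_excess_intensity(nu0, n, tau, dt, steps)
t_end = steps * dt
# The Markovian state nu0 + z and the non-Markovian sum (1) should agree.
print(len(events), nu0 + zs[-1], hawkes_intensity(t_end, events, nu0, n, tau))
```

The point of the comparison is that the simulation only ever stores the single number $\hat{z}(t)$, whereas the direct sum needs the whole event history: this is the Markov embedding at work for $K = 1$.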

2. Master equation

By introducing Eq. (15) together with (16), we have transformed a non-Markovian point process into a Markovian SDE. This allows us to derive the corresponding master equation for the probability density function (PDF) $P_t(z)$ of the excess intensity $\hat z$ (14),

$$\frac{\partial P_t(z)}{\partial t} = \frac{1}{\tau}\frac{\partial}{\partial z}\left[zP_t(z)\right] + \left[(\nu_0 + z - n/\tau)P_t(z - n/\tau) - (\nu_0 + z)P_t(z)\right], \tag{19}$$


FIG. 2. (a) Schematic representation of a typical trajectory for $\hat z(t)$ defined by (14) and (15). A jump of size $n/\tau$ may occur in the time interval $[t, t+dt)$ with probability $\hat\nu(t)dt$. (b) Let us consider an arbitrary function $f(\hat z)$. At the time $t = \hat t_i$ of the jump, there is a corresponding jump in the trajectory of $f(\hat z)$, which is characterized by $df(\hat z(t)) := f(\hat z(\hat t_i - 0) + n/\tau) - f(\hat z(\hat t_i - 0))$. The plots shown here are based on a numerical simulation with the following parameters: $\tau = 1$, $n = 0.5$, $\nu_0 = 0.1$, and $f(z) = \exp[5(z - 1)] + 0.1$.

with the boundary condition

$$P_t(z)\big|_{z=0} = 0. \tag{20}$$
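As a quick consistency check of Eq. (19), one can compute the evolution of the mean excess intensity $\langle \hat z(t)\rangle$ by multiplying (19) by $z$ and integrating over $z \in [0, \infty)$. The following is a sketch under two assumptions that are not spelled out above: $P_t(z)$ decays fast enough at large $z$ for the boundary terms to vanish, and $P_t(z) = 0$ for $z < 0$.

```latex
\begin{aligned}
\frac{d\langle \hat z\rangle}{dt}
 &= \frac{1}{\tau}\int_0^\infty dz\, z\,\frac{\partial}{\partial z}\left[zP_t(z)\right]
  + \int_0^\infty dz\, z\left[(\nu_0+z-n/\tau)P_t(z-n/\tau)-(\nu_0+z)P_t(z)\right] \\
 &= -\frac{1}{\tau}\langle \hat z\rangle
  + \int_0^\infty dz\,\left[(z+n/\tau)-z\right](\nu_0+z)P_t(z) \\
 &= -\frac{1-n}{\tau}\,\langle \hat z\rangle + \frac{n\nu_0}{\tau}.
\end{aligned}
```

The first term follows from an integration by parts and the second from the shift $z \to z + n/\tau$ in the gain term. For $n < 1$ the stationary value is $\langle \hat z\rangle = n\nu_0/(1-n)$, so that $\langle\hat\nu\rangle = \nu_0 + \langle \hat z\rangle = \nu_0/(1-n)$, with relaxation time $\tau/(1-n)$.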

$P_t(z)dz$ is thus the probability that $\hat z(t)$ takes a value in the interval $[z, z+dz)$ at time $t$. We note that this master equation after Markov embedding corresponds to the FP equation (9) for the case of the GLE with $K = 1$.

The master equation (19) is derived as follows. Let us consider an arbitrary function $f(\hat z)$. Using (18), its time evolution during $[t, t+dt)$ is given by

$$f(\hat z(t+dt)) - f(\hat z(t)) = \begin{cases} -\dfrac{\hat z(t)}{\tau}\dfrac{\partial f(\hat z(t))}{\partial \hat z}\,dt & (\text{probability} = 1 - \hat\nu(t)dt) \\[2mm] f(\hat z(t) + n/\tau) - f(\hat z(t)) & (\text{probability} = \hat\nu(t)dt) \end{cases} \tag{21}$$

as schematically illustrated in Fig. 2(b). By taking the ensemble average of both sides, we obtain

$$\begin{aligned} dt\int dz\,\frac{\partial P_t(z)}{\partial t} f(z) &= \int dz\,P_t(z)\left[-\frac{z}{\tau}\frac{\partial f(z)}{\partial z}dt + (z + \nu_0)\{f(z + n/\tau) - f(z)\}dt\right], \\ \Longrightarrow\ \int dz\,\frac{\partial P_t(z)}{\partial t} f(z) &= \int dz\,f(z)\left[\frac{1}{\tau}\frac{\partial}{\partial z}zP_t(z) + \{(\nu_0 + z - n/\tau)P_t(z - n/\tau) - (\nu_0 + z)P_t(z)\}\right]. \end{aligned} \tag{22}$$

This result (22) is obtained by (i) using the identity

$$\langle f(\hat z(t+dt)) - f(\hat z(t))\rangle = \int dz\,[P_{t+dt}(z) - P_t(z)]f(z) = dt\int dz\,\frac{\partial P_t(z)}{\partial t}f(z), \tag{23}$$

(ii) by performing a partial integration of the first term on the right-hand side of Eq. (22), and (iii) by introducing the change of variable $z \to z - n/\tau$ for the second term. Since (22) is an identity holding for arbitrary $f(z)$, the integrands of the left-hand side and right-hand side must be equal, which yields the master equation (19).

Note that the above derivation of the master equation is not restricted to the exponential shape of the memory kernel. We are going to use the same derivation in the more complex examples discussed below. We also note that the master equation (19) does not satisfy the detailed balance condition of Ref. [31], which reflects the fact that the Hawkes process is a model for out-of-equilibrium systems.

C. Discrete sum of exponential kernels

1. Mapping to Markovian dynamics

The above formulation can be readily generalized to the case of a memory kernel expressed as a discrete sum of exponential functions:

$$h(t) = \frac{1}{n}\sum_{k=1}^{K}\frac{n_k}{\tau_k}e^{-t/\tau_k}. \tag{24}$$

In this case, each coefficient $n_k$ quantifies the contribution of the $k$th exponential with memory length $\tau_k$ to the branching ratio $n = \sum_{k=1}^{K} n_k$, satisfying the normalization $\int_0^\infty h(t)dt = 1$. We note that this representation (24) is quite general, as it can approximate well the case of a power-law kernel with cutoff up to a constant [55].

Harris suggested the intuitive notion that it is possible to map this case to a Markovian dynamics if the state of the system at time $t$ is made to include the list of the ages of all events [56]. The problem is that this conceptual approach is unworkable in practice due to the exorbitant size of the required information. By introducing an auxiliary age pyramid process, Ref. [57] identified some key components to add to the Hawkes process and its intensity to make the dynamics


FIG. 3. Case where the memory kernel is the sum of two exponentials. Panels (a) and (b) show the schematic trajectories of the two excess intensities $\hat z_1$ and $\hat z_2$ and panel (c) that of the resulting total intensity $\hat\nu := \nu_0 + \hat z_1 + \hat z_2$. The parameters are $K = 2$, $\tau_1 = 1$, $n_1 = 0.3$, $\tau_2 = 3$, $n_2 = 0.5$ (and thus $n = 0.8$), and $\nu_0 = 0.1$.

Markovian. Here, in order to map model (1) onto a Markovian stochastic process, we propose a more straightforward Markov embedding approach, which generalizes the previous case of a single exponential memory function. We decompose the intensity into a sum of $K$ excess intensities $\{\hat z_k\}_{k=1}^{K}$ as follows:

$$\hat\nu(t) = \nu_0 + \sum_{k=1}^{K}\hat z_k(t). \tag{25}$$

Each excess intensity $\hat z_k$ is the solution of a Langevin equation driven by a state-dependent Markovian Poisson shot noise

$$\frac{d\hat z_k}{dt} = -\frac{\hat z_k}{\tau_k} + \frac{n_k}{\tau_k}\hat\xi^{\mathrm P}_{\hat\nu}. \tag{26}$$

Note that the same state-dependent Poisson noise $\hat\xi^{\mathrm P}_{\hat\nu}(t)$ defined by expression (16) acts on the Langevin equation for each excess intensity $\{\hat z_k\}_{k=1,\dots,K}$. In other words, each shock event simultaneously impacts the trajectories of all excess intensities $\{\hat z_k\}_{k=1,\dots,K}$ [see the vertical broken line in Figs. 3(a) and 3(b) and the resulting trajectory of $\hat\nu(t)$ in Fig. 3(c)]. Note that this Markov embedding corresponds to Eq. (8) for the case of the GLE.

2. Master equation

As the set of SDEs for $\hat{\boldsymbol z} := (\hat z_1, \hat z_2, \dots, \hat z_K)^{\mathrm T}$ are standard Markovian stochastic processes, we obtain the corresponding master equation:

$$\frac{\partial P_t(\boldsymbol z)}{\partial t} = \sum_{k=1}^{K}\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P_t(\boldsymbol z) + \left[\nu_0 + \sum_{k=1}^{K}(z_k - n_k/\tau_k)\right]P_t(\boldsymbol z - \boldsymbol h) - \left[\nu_0 + \sum_{k=1}^{K}z_k\right]P_t(\boldsymbol z). \tag{27}$$

The jump-size vector is given by $\boldsymbol h := (n_1/\tau_1, n_2/\tau_2, \dots, n_K/\tau_K)^{\mathrm T}$. The PDF $P_t(\boldsymbol z)$ obeys the following boundary condition,

$$P_t(\boldsymbol z)\big|_{\boldsymbol z \in \partial\mathbb R_+^K} = 0, \tag{28}$$

on the boundary $\partial\mathbb R_+^K := \{\boldsymbol z \,|\, z_k = 0 \text{ for some } k\}$. Equation (27) can be derived following the procedure used for the single exponential case that led us to the master equation (19) (see Appendix A1 for an explicit derivation). Note that this master equation corresponds to the FP equation (9) for the case of the GLE.

3. Laplace representation of the master equation

The master equation (27) takes a simplified form under the Laplace representation,

$$\tilde P_t(\boldsymbol s) := \mathcal L_K[P_t(\boldsymbol z); \boldsymbol s], \tag{29}$$

where the Laplace transformation in the $K$-dimensional space is defined by

$$\mathcal L_K[f(\boldsymbol z); \boldsymbol s] := \int_0^\infty d\boldsymbol z\, e^{-\boldsymbol s\cdot\boldsymbol z}f(\boldsymbol z) \tag{30}$$

with volume element $d\boldsymbol z := \prod_{k=1}^{K}dz_k$. The wave vector $\boldsymbol s := (s_1, \dots, s_K)^{\mathrm T}$ is the conjugate of the excess intensity vector $\boldsymbol z := (z_1, \dots, z_K)^{\mathrm T}$.

The Laplace representation of the master equation (27) is then given by

$$\frac{\partial\tilde P_t(\boldsymbol s)}{\partial t} = -\sum_{k=1}^{K}\frac{s_k}{\tau_k}\frac{\partial\tilde P_t(\boldsymbol s)}{\partial s_k} + \left(e^{-\boldsymbol h\cdot\boldsymbol s} - 1\right)\left(\nu_0 - \sum_{k=1}^{K}\frac{\partial}{\partial s_k}\right)\tilde P_t(\boldsymbol s). \tag{31}$$

Then, the Laplace representation (29) of $P_t(\boldsymbol z)$, which is the solution of (31), allows us to obtain the Laplace representation


$\tilde Q_t(s)$ of the intensity PDF $P_t(\nu)$ according to

$$\tilde Q_t(s) := \mathcal L_1[P_t(\nu); s] = \left\langle e^{-s\left(\nu_0 + \sum_{k=1}^{K}\hat z_k\right)}\right\rangle = e^{-\nu_0 s}\,\tilde P_t\left(\boldsymbol s = (s, s, \dots, s)^{\mathrm T}\right). \tag{32}$$

D. General kernels

1. Mapping to Markovian dynamics

The above formulation can be generalized to general forms of the memory kernel. Let us decompose the kernel as a continuous superposition of exponential kernels,

$$h(t) = \frac{1}{n}\int_0^\infty\frac{n(x)}{x}e^{-t/x}dx, \qquad n = \int_0^\infty n(x)dx, \tag{33}$$

where we have introduced the set of continuous auxiliary variables $x \in \mathbb R_+$. This decomposition satisfies the normalization condition $\int_0^\infty h(t)dt = 1$. Here we use the notation $x$ for these auxiliary variables to emphasize the formal connection with the usual field theory of classical stochastic (or quantum) systems. Indeed, the observables will be defined over the "field" $x \in \mathbb R_+$. The function $n(x)$ quantifies the contribution of the $x$th exponential with memory length $x$ to the branching ratio. We can then interpret $n(x)/n$ as a normalized distribution of timescales present in the memory kernel of the Hawkes process. As we show below, an important condition for solvability will be the existence of its first-order moment

$$\frac{\alpha}{n} := \langle\tau\rangle := \int_0^\infty x\,\frac{n(x)}{n}dx < \infty. \tag{34}$$

This condition (34) means that $n(x)$ should decay faster than $1/x^2$ at large $x$. Hence, the representation (33) implies that the memory kernel has to decay at large times faster than $1/t^2$. This covers situations where the variance of the timescales embedded in the memory kernel diverges. But it excludes the cases $h(t) \sim 1/t^{1+\theta}$ with $0 < \theta < 1$ that are relevant to the Omori law for earthquakes [16,17] and to the response to social shocks [20]. This case $0 < \theta < 1$, for which $\alpha$ diverges, needs to be treated separately and is beyond the scope of the present work.

We then decompose the intensity of the Hawkes process as a continuous sum of excess intensities $\hat z(t, x)$,

$$\hat\nu(t) = \nu_0 + \int_0^\infty dx\,\hat z(t, x). \tag{35}$$

Each excess intensity $\hat z(t, x)$ is the solution of the following dynamical equation,

$$\frac{\partial\hat z(t, x)}{\partial t} = -\frac{\hat z(t, x)}{x} + \hat\eta(t, x), \qquad \hat\eta(t, x) := \frac{n(x)}{x}\hat\xi^{\mathrm P}_{\hat\nu}, \tag{36}$$

where, as for the previous case of a discrete sum of exponentials, the same state-dependent Poisson noise $\hat\xi^{\mathrm P}_{\hat\nu}(t)$ defined by expression (16) acts on the Langevin equation for each excess intensity $\hat z(t, x)$. While Eq. (36) is a simple equation involving only the time derivative, it can be regarded as an SPDE since $\hat z(t, x)$ is distributed over the auxiliary variable field $x \in (0, \infty)$. Furthermore, this interpretation allows us to use functional calculus, historically developed for the solution of SPDEs (such as the stochastic reaction-diffusion equations [31]). Note that this SPDE corresponds to Eq. (11) for the case of the GLE.

The set of SPDEs (36) expresses the fact that the continuous field of excess intensity $\{\hat z(t, x)\}_{x\in\mathbb R_+}$ tends to relax to zero, but is intermittently and simultaneously shocked by the shared shot noise term $\hat\xi^{\mathrm P}_{\hat\nu}$, with an $x$-dependent jump size $n(x)/x$. This is in contrast to the SPDE representation (11) for the GLE (6), where the noise term at $x$ has no correlation with that at another point $x'$: $\langle\hat\xi^{\mathrm G}(t, x)\hat\xi^{\mathrm G}(t, x')\rangle = 0$ for $x \neq x'$. In a more conventional expression, we can rewrite the long-range nature of the spatial correlation in $\hat\eta(t, x)$ as

$$\langle\hat\eta(t, x)\hat\eta(t', x')\rangle_{\mathrm{ss}} = K(x, x')\delta(t - t'), \qquad K(x, x') = \frac{n(x)n(x')}{xx'}\langle\hat\nu\rangle_{\mathrm{ss}}. \tag{37}$$

If $K(x, x')$ were a $\delta$ function, the SPDE could be regarded as a noninteracting system of infinitely many variables. This long-range nature means that the excess intensities $\hat z(t, x)$ at different points are strongly correlated through this noise term, even though the SPDE (36) has no spatial derivatives.

2. Field master equation

The master equation corresponding to the SDE (36) can be derived by following the same procedure presented for the simple exponential case and for the discrete sum of exponentials. There is, however, a technical difference since the state of the system is now specified by the continuous field variable $\{\hat z(t, x)\}_{x\in\mathbb R_+}$. Thus, the probability density function is replaced with the probability density functional $P[\{\hat z(t, x) = z(x)\}_{x\in\mathbb R_+}] = P_t[\{z(x)\}_{x\in\mathbb R_+}]$. In other words, the probability that the system is in the state specified by $\{z(x)\}_{x\in\mathbb R_+}$ at time $t$ is characterized by $P_t[\{z(x)\}_{x\in\mathbb R_+}]\mathcal Dz$ with functional integral volume element $\mathcal Dz$.

We use the notational convention that any mapping with square brackets $A[\{f(x)\}_{x\in\mathbb R_+}]$ indicates that the map $A$ is a functional of $\{f(x)\}_{x\in\mathbb R_+}$. In addition, we sometimes abbreviate the functional $P_t[\{z(x)\}_{x\in\mathbb R_+}]$ by $P_t[z] := P_t[\{z(x)\}_{x\in\mathbb R_+}]$ for the sake of brevity.

The presence of a continuous field variable leads to several technical issues, such as the correct application of the Laplace transform. The functional Laplace transformation $\mathcal L_{\mathrm{path}}$ of an arbitrary functional $f[z]$ is defined by a functional integration (i.e., a path integral):

$$\mathcal L_{\mathrm{path}}\big[f[z]; s\big] := \int\mathcal Dz\, e^{-\int_0^\infty dx\,s(x)z(x)}f[z]. \tag{38}$$

This allows us to define the Laplace representation of the probability density functional by

$$\tilde P_t[s] := \mathcal L_{\mathrm{path}}\big[P_t[z]; s\big] \tag{39}$$

for an arbitrary nonnegative function $\{s(x)\}_{x\in\mathbb R_+}$.

As the natural extension of Eq. (27), the master equation for the probability density functional is given by

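$$\frac{\partial P_t[z]}{\partial t} = \int_0^\infty dx\,\frac{\delta}{\delta z(x)}\frac{z(x)}{x}P_t[z] + \left[\nu_0 + \int_0^\infty(z - n/x)dx\right]P_t[z - n/x] - \left[\nu_0 + \int_0^\infty z\,dx\right]P_t[z] \tag{40}$$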

with the boundary condition

$$P_t[z]\big|_{z\in\partial\mathbb R_+^\infty} = 0, \tag{41}$$

where the boundary of the function space is $\partial\mathbb R_+^\infty := \{z \,|\, z(x) = 0 \text{ for some } x \in [0, \infty)\}$. This field master equation after Markov embedding corresponds to the FP field equation (12) for the GLE.

Interpretation. As discussed in Sec. II C 5, one of the safest interpretations is to regard this functional description as the formal continuous limit of the discrete description in Sec. III C. By introducing a finite lattice interval $dx > 0$ and rewriting $\tau_k \to x_k$, $\hat z_k(t) \to \hat z(t, x_k)dx$, and $n_k \to n(x_k)dx$, Eqs. (24)–(26) can be rewritten as

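$$h(t) = \frac{1}{n}\sum_{k=1}^{K}\frac{n(x_k)}{x_k}e^{-t/x_k}dx, \qquad \hat\nu(t) = \nu_0 + \sum_{k=1}^{K}\hat z(t, x_k)dx, \qquad \frac{\partial\hat z(t, x_k)}{\partial t} = -\frac{\hat z(t, x_k)}{x_k} + \frac{n(x_k)}{x_k}\hat\xi^{\mathrm P}_{\hat\nu}(t). \tag{42}$$

The master equation (27) is rewritten as

$$\frac{\partial P_t(\boldsymbol z)}{\partial t} = \sum_{k=1}^{K}dx\,\frac{1}{dx}\frac{\partial}{\partial z(x_k)}\frac{z(x_k)}{x_k}P_t(\boldsymbol z) + \left[\nu_0 + \sum_{k=1}^{K}\big(z(x_k) - n(x_k)/x_k\big)dx\right]P_t(\boldsymbol z - \boldsymbol h) - \left[\nu_0 + \sum_{k=1}^{K}z(x_k)dx\right]P_t(\boldsymbol z). \tag{43}$$

According to the convention in Ref. [31], we take a formal limit $K \to \infty$ and $dx \to 0$ to apply the formal replacement

$$\int_0^\infty dx\,[\dots] := \lim_{dx\to 0}\sum_{k=1}^{K}dx\,[\dots], \qquad \frac{\delta}{\delta z(x)} := \lim_{dx\to 0}\frac{1}{dx}\frac{\partial}{\partial z(x_k)}. \tag{44}$$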
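The lattice form of the kernel in Eq. (42) can be checked numerically against a known closed form. The sketch below does this for the power-law kernel listed in Table II, $h(t) = (\beta/\tau^*)(1 + t/\tau^*)^{-(\beta+1)}$ with $n(x) = (n/x)(\tau^*/x)^\beta e^{-\tau^*/x}/\Gamma(\beta)$. The midpoint lattice sites, the lattice spacing `dx`, the truncation `x_max`, and the function names are illustrative assumptions, not part of the original formulation.

```python
import math


def n_power_law(x, n, beta, tau_star):
    """n(x) for the power-law kernel (Table II entry)."""
    return (n / x) * (tau_star / x) ** beta * math.exp(-tau_star / x) / math.gamma(beta)


def kernel_lattice(t, n, beta, tau_star, dx=0.01, x_max=200.0):
    """Lattice sum of Eq. (42): h(t) ~ (1/n) sum_k n(x_k)/x_k exp(-t/x_k) dx.

    Midpoint sites x_k = (k + 1/2) dx and the cutoff x_max are assumptions
    made for this illustration only.
    """
    num_sites = int(x_max / dx)
    total = 0.0
    for k in range(num_sites):
        xk = (k + 0.5) * dx
        total += n_power_law(xk, n, beta, tau_star) / xk * math.exp(-t / xk) * dx
    return total / n


def kernel_exact(t, beta, tau_star):
    """Closed-form power-law kernel h(t) = (beta/tau*) / (1 + t/tau*)^(beta+1)."""
    return (beta / tau_star) / (1.0 + t / tau_star) ** (beta + 1)


for t in (0.0, 1.0, 5.0):
    approx = kernel_lattice(t, n=0.8, beta=1.0, tau_star=1.0)
    exact = kernel_exact(t, beta=1.0, tau_star=1.0)
    print(f"t = {t}: lattice {approx:.5f}, exact {exact:.5f}")
```

The residual discrepancy is dominated by the truncation of the lattice at `x_max`, which cuts off the slowest timescales of the kernel.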

The master equation (40) for the field variables $\{\hat z(t, x)\}_{x\in(0,\infty)}$ is thus derived. While this derivation is based on a formal limit of a discrete description, Eq. (40) can also be derived by direct continuous operations based on functional Taylor expansions (see Appendix A2 for the detailed derivation based on functional calculus).

We remark that the SPDE (36) is linear and that serious divergence problems did not appear, at least for our main results on physical observables. In addition, our final results have been confirmed to be robust for both discrete and continuous cases, as shown later. This strategy is consistent with the safe prescription suggested in Ref. [31]: In the beginning, the functional descriptions for field variables should be based on a discrete formulation. The continuous description should be introduced afterward as a formal limit of zero lattice interval. In this sense, our analysis has successfully avoided the delicate divergence problems of general SPDEs.

3. Laplace representation of the master equation

In the functional Laplace representation (39), the master equation (40) takes the form of the following simple first-order functional differential equation:

$$\frac{\partial\tilde P_t[s]}{\partial t} = \nu_0\left(e^{-\int_0^\infty dx'\,s(x')n(x')/x'} - 1\right)\tilde P_t[s] - \int_0^\infty dx\left(e^{-\int_0^\infty dx'\,s(x')n(x')/x'} - 1 + \frac{s(x)}{x}\right)\frac{\delta\tilde P_t[s]}{\delta s(x)}. \tag{45}$$

4. General formulation

All the above forms of the memory kernel can be unified by remarking that the variable transformation (33) is equivalent to a Laplace transform, since it can be rewritten as

$$h(t) = \frac{1}{n}\int_0^\infty\frac{1}{s}\,n\!\left(\frac{1}{s}\right)e^{-st}ds = \frac{1}{n}\mathcal L_1\!\left[\frac{1}{s}\,n\!\left(\frac{1}{s}\right); t\right] \iff \frac{n(x)}{n} = \frac{1}{x}\,\mathcal L_1^{-1}[h(t); s]\Big|_{s=1/x}. \tag{46}$$

This allows us to reformulate the several examples discussed above in a unified way, presented in Table II.

TABLE II. Examples of various memory kernels $h(t)$ and the corresponding $n(x)$ defined in expression (33).

$$\begin{array}{lll} \text{Case} & h(t) & n(x) \\ \hline \text{Single exponential kernel} & \dfrac{1}{\tau_1}e^{-t/\tau_1} & n_1\delta(x - \tau_1) \\[2mm] \text{Discrete superposition of exponential kernels} & \dfrac{1}{n}\displaystyle\sum_{k=1}^{K}\dfrac{n_k}{\tau_k}e^{-t/\tau_k} & \displaystyle\sum_{k=1}^{K}n_k\delta(x - \tau_k) \\[2mm] \text{Power-law kernel } (\beta > 0) & \dfrac{\beta}{\tau^*}\dfrac{1}{(1 + t/\tau^*)^{\beta+1}} & \dfrac{n}{x}\left(\dfrac{\tau^*}{x}\right)^{\beta}\dfrac{e^{-\tau^*/x}}{\Gamma(\beta)} \end{array}$$

IV. SOLUTION

A. Main results

In Sec. III, we have derived the master equations and their Laplace representations for the Hawkes processes with arbitrary memory kernels. Remarkably, the Laplace representations are first-order partial (functional) differential equations. Because first-order partial (functional) differential equations can be formally solved by the method of characteristics (see Appendix C for a brief review), various analytical properties of the Hawkes process can be studied in detail.

In this section, we present novel properties of the Hawkes process unearthed from the solution of the master equations by the method of characteristics. In particular, we focus on the behavior of the PDF of the steady-state intensity near the

critical point $n = 1$. Under the condition of the existence of the first-order moment (34), an asymptotic analysis of the master equations shows that the PDF $P_{\mathrm{ss}}(\nu) := \lim_{t\to\infty}P_t(\nu)$ exhibits a power-law behavior with a nonuniversal exponent:

$$P_{\mathrm{ss}}(\nu) \propto \frac{1}{\nu^{1 - 2\nu_0\alpha}}, \quad \text{with } \alpha = n\langle\tau\rangle \tag{47}$$

for large $\nu$, up to an exponential truncation, which is pushed toward $\nu \to \infty$ as $n \to 1$. As the tail exponent is smaller than 1, the steady-state PDF $P_{\mathrm{ss}}(\nu)$ is not normalizable without the exponential cutoff. However, the characteristic scale of the exponential tail diverges as the system approaches the critical point $n = 1$, and the power-law tail (47) can be observed over many orders of magnitude of the intensity for near-critical systems, as we illustrate below.

The parameter $\alpha = n\langle\tau\rangle$ entering the tail exponent in expression (47) has been defined by expression (34). Since $\nu_0$ is the background intensity of the Hawkes process as defined in (1), the exponent of $P_{\mathrm{ss}}(\nu)$ depends on $\nu_0\alpha = n\nu_0\langle\tau\rangle$, which is $n$ times the average number of background events (or immigrants) occurring during a time equal to the average timescale $\langle\tau\rangle$ of the memory kernel. Thus, the exponent $1 - 2\nu_0\alpha$ becomes smaller as the memory $\langle\tau\rangle$, the background intensity $\nu_0$, or the branching ratio $n$ becomes larger. Note that $1 - 2\nu_0\alpha$ can even turn negative for $\nu_0\alpha > 1/2$, which corresponds to a nonmonotonic PDF $P_{\mathrm{ss}}(\nu)$ that first grows according to the power law (47) before decaying exponentially at very large $\nu$.

In simple terms, the PDF (47) describes the distribution of the number $\nu dt$ of events in the limit of infinitely small time windows $[t, t+dt]$. We should contrast this limit with the other, previously studied, limit of infinitely large (or finite but very large) time windows. Standard results of branching processes (of which the Hawkes model is a subset) give the total number of events generated by a given triggering event (see Ref. [21] for a detailed derivation). In Eq. (1), this corresponds to counting all the events over an infinitely large time window that are triggered by a single source event $\nu_0 = \delta(t)$ occurring at the origin of time. Reference [22] addresses the distribution of "seismic rates" in the limit of large time windows which, in our current formulation, corresponds to the distribution of $N(t) := \int_t^{t+T}\nu(\tau)d\tau$ in the limit of large $T$. The corresponding probability density distributions are totally different from (47), which corresponds to the other limit $T \to 0$.

We derive our main result (47) first for the single exponential form of the memory kernel, then for the discrete sum of exponentials, and then for the general case.

B. Single exponential kernel

As the first example, we focus on the single exponential kernel (13). While this special case is analytically tractable without the need to refer to the master equation approach [52], we nevertheless derive its exact solution via the master equation approach, because the methodology will be readily generalized to the more complex cases.

1. Steady-state solution

Let us first study the steady solution of the PDF $P_{\mathrm{ss}}(\nu)$. By setting $K = 1$ in Eq. (31), we obtain the Laplace transform of the steady state, $\tilde P_{\mathrm{ss}}(s) := \int_0^\infty dz\,e^{-sz}P_{\mathrm{ss}}(z)$, of the master equation (19) in the form of a first-order ordinary differential equation

$$\left(e^{-ns/\tau} - 1 + \frac{s}{\tau}\right)\frac{d\tilde P_{\mathrm{ss}}(s)}{ds} = \nu_0\left(e^{-ns/\tau} - 1\right)\tilde P_{\mathrm{ss}}(s). \tag{48}$$

By solving this equation, we obtain the exact steady solution below the critical point $n < 1$,

$$\log\tilde Q_{\mathrm{ss}}(s) = -s\nu_0 + \log\tilde P_{\mathrm{ss}}(s) = -\frac{\nu_0}{\tau}\int_0^s\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau} \tag{49}$$

with the normalization condition

$$\int_0^\infty d\nu\,P_{\mathrm{ss}}(\nu) = \tilde Q_{\mathrm{ss}}(s = 0) = 1. \tag{50}$$

a. Near the critical point. Let us evaluate the asymptotic behavior of $\tilde Q_{\mathrm{ss}}(s)$ for large $\nu$ by assuming that the system is in the near-critical state, such that

$$\varepsilon := 1 - n \ll 1. \tag{51}$$

By performing an expansion in the small parameter $\varepsilon$, we obtain an asymptotic formula for small $s$ (large $\nu$),

$$\log\tilde Q_{\mathrm{ss}}(s) \simeq -\frac{\nu_0}{\tau}\int_0^s\frac{ds'}{\varepsilon/\tau + s'/2\tau^2} = -2\nu_0\tau\log\left(1 + \frac{s}{2\tau\varepsilon}\right), \tag{52}$$

which implies a power-law behavior with a nonuniversal exponent, up to an exponential truncation:

$$P_{\mathrm{ss}}(\nu) \propto \nu^{-1 + 2\nu_0\tau}e^{-2\tau\varepsilon\nu}, \tag{53}$$

for large $\nu$. This is a special case of expression (47) obtained for $n \to 1$ and $\langle\tau\rangle = \tau$.

It is remarkable that the power-law exponent is less than 1, and thus the PDF is not normalizable without the exponential truncation. This means that the power-law scaling actually corresponds to an intermediate asymptotics of the PDF, according to the classification of Barenblatt [58]. In addition, while this scaling can be regarded as a heavy "tail" for $2\nu_0\tau < 1$, the exponent can be negative when $2\nu_0\tau > 1$ (i.e., the PDF is a power-law increasing function until the exponential tail takes over and ensures the normalization of the PDF).

The characteristic scale of the exponential truncation is defined by

$$\nu_{\mathrm{cut}} := \frac{1}{2\tau\varepsilon} = \frac{1}{2\tau(1 - n)}, \tag{54}$$

which diverges as the system approaches the critical condition $n = 1$. This means that (i) if the system is in a near-critical state $\varepsilon \ll 1$ and (ii) the background intensity is sufficiently small, $\nu_0 < 1/(2\tau)$, one can actually observe the power-law intermediate asymptotics for a wide range $\nu \ll \nu_{\mathrm{cut}} = O(\varepsilon^{-1})$, up to the exponential truncation.


FIG. 4. Numerical evaluation of the steady-state PDF of the intensity $\hat\nu$ for the following parameter sets near the critical point: $n = 0.999$ (blue bars) and $n = 0.99$ (red bars). The theoretical power law is shown by the green straight line. (a) Background intensity $\nu_0 = 0.01$, relaxation time $\tau = 1$, leading to the power-law exponent 0.98. (b) $\nu_0 = 0.2$, $\tau = 1$, leading to the power-law exponent 0.6. (c) $\nu_0 = 1.0$, $\tau = 1$, leading to the negative (i.e., growing) power-law exponent $-1.0$. For all simulations, the sampling time interval and total sampling time are $dt = 0.001$ and $T_{\mathrm{tot}} = 10000$ with the initial condition $\hat z(0) = 0$. The initial 10% of the sample trajectories were discarded from the statistics.
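For readers who wish to reproduce this kind of simulation without a dedicated package, the single exponential case can be simulated exactly by a thinning (rejection) scheme, because the intensity only decays between events. The sketch below is a stand-in written for illustration and is not the code used to produce Fig. 4; the function name, the seed, and the choice of a far-from-critical test case ($n = 0.5$) are assumptions made so that the check against the mean intensity $\nu_0/(1 - n)$ converges quickly.

```python
import math
import random


def simulate_hawkes_exp(nu0, n, tau, t_max, seed=0):
    """Thinning-based simulation for the kernel h(t) = exp(-t/tau)/tau.

    Hypothetical helper: the state is the excess intensity z, with total
    intensity nu = nu0 + z. Requires nu0 > 0. Returns the event times.
    """
    rng = random.Random(seed)
    t, z = 0.0, 0.0
    events = []
    while True:
        bound = nu0 + z                  # valid upper bound until the next event
        wait = rng.expovariate(bound)    # candidate waiting time at rate `bound`
        t += wait
        if t > t_max:
            break
        z *= math.exp(-wait / tau)       # deterministic relaxation of z
        if rng.random() * bound <= nu0 + z:
            events.append(t)             # candidate accepted as an event
            z += n / tau                 # jump of size n/tau
    return events


events = simulate_hawkes_exp(nu0=1.0, n=0.5, tau=1.0, t_max=2000.0, seed=1)
rate = len(events) / 2000.0
print(f"empirical rate {rate:.3f} vs nu0/(1 - n) = {1.0 / (1.0 - 0.5):.3f}")
```

Sampling $\hat\nu = \nu_0 + \hat z$ on a regular time grid along such a trajectory, and discarding an initial transient, gives histograms of the kind shown in Fig. 4; close to criticality much longer runs are needed because the relaxation time grows as $\tau/(1 - n)$.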

b. Numerical verification. We now present numerical confirmations of our theoretical prediction, in particular for the intermediate asymptotics, as shown in Fig. 4. There is a well-developed literature on the numerical simulation of Hawkes processes [59,60]. We have used an established simulation package for PYTHON called "tick" (version 0.6.0.0) with 32-thread parallel computing. The total simulation time was $10^4$ and the sampling time interval was 0.001. We note that the initial 10% of the sampled trajectories were discarded for initialization. For small background intensity $\nu_0 \ll 1/\tau$, we obtain an approximate universal exponent $-1$. For $2\nu_0\tau < 1$, we observe a decaying power law of exponent $1 - 2\nu_0\tau$, while we observe a growing power law for $2\nu_0\tau > 1$. The power-law intermediate asymptotics is truncated by the exponential function, as predicted and also discussed in Ref. [54] (albeit with the error of missing the 1 in the exponent and thus failing to describe the intermediate asymptotics), ensuring the normalization of the PDF of the Hawkes intensities.

2. Time-dependent solution

We now present the exact solution of the time-dependent master equation. In the Laplace representation, the dynamics of the PDF of the intensities is given by the following first-order PDE,

$$\frac{\partial\tilde P_t(s)}{\partial t} + \left(e^{-ns/\tau} - 1 + \frac{s}{\tau}\right)\frac{\partial\tilde P_t(s)}{\partial s} = \nu_0\left(e^{-ns/\tau} - 1\right)\tilde P_t(s). \tag{55}$$

This equation can be solved by the method of characteristics (see Appendix C for a brief review). The corresponding Lagrange-Charpit equations are given by

$$\frac{ds}{dt} = e^{-ns/\tau} - 1 + \frac{s}{\tau}, \qquad \frac{d\Phi}{dt} = \nu_0\left(e^{-ns/\tau} - 1\right), \quad \text{with } \Phi := \log\tilde P. \tag{56}$$

These equations can be solved explicitly,

$$t = \mathcal F(s) + C_1, \qquad \Phi = \nu_0 s - \frac{\nu_0}{\tau}\int_0^s\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau} + C_2 \tag{57}$$

with

$$\mathcal F(s) := \int_{s_0}^{s}\frac{ds'}{e^{-ns'/\tau} - 1 + s'/\tau}. \tag{58}$$

$C_1$ and $C_2$ are constants of integration and $s_0$ is a positive constant chosen to satisfy several convenient properties discussed below.

a. Summary of the properties of $\mathcal F$. We present several analytical properties of $\mathcal F(s)$ (see Appendix D for their proof):

($\alpha$1) $\mathcal F(s)$ is a monotonically increasing function by choosing $s_0 > 0$ appropriately.
($\alpha$2) The inverse function $\mathcal F^{-1}(s)$ can be defined uniquely.

In addition, for the subcritical case $n < 1$, the following properties hold true:

($\alpha$3) $s_0$ can be set to any positive value.
($\alpha$4) $\lim_{s\to+0}\mathcal F(s) = -\infty$.
($\alpha$5) $\lim_{s\to\infty}\mathcal F(s) = +\infty$.
($\alpha$6) $\mathcal F(s)$ can take all real values: $\mathcal F(s) \in (-\infty, \infty)$ for $s > 0$.

b. Regularization of $\mathcal F(s)$. In the following, we assume the subcritical condition $n < 1$. It is useful to decompose $\mathcal F(s)$ into regular and singular parts:

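$$\mathcal F(s) = \int_{s_0}^{s}\frac{ds'}{e^{-ns'/\tau} - 1 + s'/\tau}\ \underbrace{-\int_{s_0}^{s}\frac{ds'}{(1 - n)s'/\tau} + \frac{\tau}{1 - n}\log\frac{s}{s_0}}_{\text{totally zero as an identity}} = \underbrace{\frac{\tau}{1 - n}\int_{s_0}^{s}ds'\,\frac{1 - e^{-ns'/\tau} - ns'/\tau}{s'\left(e^{-ns'/\tau} - 1 + s'/\tau\right)}}_{\text{regular part}} + \underbrace{\frac{\tau}{1 - n}\log\frac{s}{s_0}}_{\text{singular part}}, \tag{59}$$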

where the regular part is well defined even for $s_0 \to 0$. This expression is useful since the divergent factor in $\mathcal F(s)$ can be renormalized into the integral constant $C_1$, such that

$$C_1 - \frac{\tau}{1 - n}\log s_0 \to C_1. \tag{60}$$

We then take the formal limit $s_0 \to 0$ and use the following regularized expression for the subcritical condition $n < 1$,

$$\mathcal F(s) = \mathcal F_R(s) + \frac{\tau}{1 - n}\log s, \qquad \mathcal F_R(s) := \frac{\tau}{1 - n}\int_0^s ds'\,\frac{1 - e^{-ns'/\tau} - ns'/\tau}{s'\left(e^{-ns'/\tau} - 1 + s'/\tau\right)}, \tag{61}$$

where the $s_0$ dependence is removed as the result of the renormalization. The regular part has no singularity at $s \simeq 0$, where it reads $\mathcal F_R(s) \simeq -n^2 s/\{2(1 - n)^2\}$.

c. Explicit solution. Building on the above, we now provide the solution of the master equation (55). According to the method of characteristics (see Appendix C for a brief review), the general solution is given by

$$C_2 = \mathcal H(C_1) \tag{62}$$

with an arbitrary function $\mathcal H(\cdot)$. The time-dependent solution $\log\tilde Q_t(s) = \log\tilde P_t(s) - \nu_0 s$ is thus given by

$$\log\tilde Q_t(s) = -\frac{\nu_0}{\tau}\int_0^s\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau} + \mathcal H(t - \mathcal F(s)). \tag{63}$$

The function $\mathcal H(\cdot)$ is determined by the initial condition. Let us assume that the initial PDF and its Laplace representation are given by $P_{t=0}(\nu)$ and $\tilde P_{t=0}(s)$, respectively. Then, we obtain

$$\mathcal H(-\mathcal F(s)) = \log\tilde P_{t=0}(s) + \frac{\nu_0}{\tau}\int_0^s\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau} \tag{64}$$

or equivalently,

$$\mathcal H(x) = \log\tilde P_{t=0}(S(x)) + \frac{\nu_0}{\tau}\int_0^{S(x)}\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau}, \qquad S(x) = \mathcal F^{-1}(-x). \tag{65}$$

Note that the time-dependent solution (63) is consistent with the steady solution (49),

$$\lim_{t\to\infty}\tilde Q_t(s) = \tilde Q_{\mathrm{ss}}(s), \tag{66}$$

since $\lim_{x\to+\infty}\mathcal H(x) = 0$ (see Appendix E1 for the proof). We also note that, from the time-dependent solution (63), we can derive the dynamics of the mean intensity $\langle\hat\nu(t)\rangle$ for finite $t$ as

$$\langle\hat\nu(t)\rangle = \nu_{\mathrm{ini}}e^{-(1 - n)t/\tau} + \nu_0\frac{1 - e^{-(1 - n)t/\tau}}{1 - n} \tag{67}$$

with the initial condition $\langle\hat\nu(0)\rangle = \nu_{\mathrm{ini}}$ (see Appendix E2 for the derivation). This expression (67) shows that the mean intensity converges at long times $t \to +\infty$ to $\langle\hat\nu(t)\rangle \to \nu_0/(1 - n)$, which is a well-known result [8,25]. Expression (67) also shows that an initial impulse decays exponentially with a renormalized decay time $\tau/(1 - n)$, which is also consistent with previous reports [18]. This diverging timescale $\tau/(1 - n)$, as $n \to 1$, reflects the occurrence of all the generations of triggered events that renormalize the "bare" memory function into a "dressed" memory kernel with much longer memory.

d. Asymptotic relaxation dynamics for large $t$. The time-dependent asymptotic solution is given for large $t$ by

$$\log\tilde P_t(s) \simeq -\frac{\nu_0}{\tau}\int_{S(t,s)}^{s}\frac{s'ds'}{e^{-ns'/\tau} - 1 + s'/\tau} + \log\tilde P_{t=0}(S(t, s)), \qquad S(t, s) = s\exp\left[-\frac{1 - n}{\tau}\big(t - \mathcal F_R(s)\big)\right], \tag{68}$$

assuming that $t \gg \mathcal F(s)$ for a given $s$. As a corollary of this formula, an asymptotic prediction for the distribution conditional on the initial intensity is given by the following formula,

$$P(\nu, t\,|\,\nu_{\mathrm{ini}}, t = 0) = \mathcal L_1^{-1}[\tilde P_t(s); \nu], \qquad \log\tilde P_{t=0}(s) = -\nu_{\mathrm{ini}}s, \tag{69}$$

for large $t$. Note that the asymptotic convergence of these formulas is not uniform in $s$; indeed, the convergence of the Laplace representation for large $s$ is slower than that for small $s$.

3. Another derivation of the power law exponent: Linear stability analysis of the Lagrange-Charpit equation

We have presented both the steady-state and time-dependent solutions of the master equations, based on exact or asymptotic methods. While these formulations are already clear, here we revisit the power-law behavior (53) of the steady-state PDF and present another derivation based on the linear stability analysis of the Lagrange-Charpit equation, which has the advantage of being generalizable to memory kernels defined as superpositions of exponential functions. Indeed, while the derivation based on the exact solution (49) is clear and powerful, it is not easy to extend this kind of calculation to general cases, such as superpositions of exponential kernels. In contrast, the derivation that we now present can be extended to arbitrary forms of the memory kernel of the Hawkes processes, as will be shown later. Moreover, we have found additional distinct derivations of (53) and we refer the interested reader to Appendix B.

While the steady-state master equation (48) is an ordinary differential equation which can be solved exactly, let us consider its corresponding Lagrange-Charpit equations,

$$\frac{ds}{dl} = -e^{-ns/\tau} + 1 - \frac{s}{\tau}, \qquad \frac{d}{dl}\log\tilde P_{\mathrm{ss}} = -\nu_0\left(e^{-ns/\tau} - 1\right), \tag{70}$$

where we introduce the parameter $l$ of the characteristic curve. These equations can be regarded as describing a "dynamical system" in terms of the auxiliary "time" $l$. This formulation is useful because the well-developed theory of dynamical systems is applicable even to more general cases, as shown later.

a. Subcritical condition $n < 1$. Let us first focus on the subcritical case $n < 1$ and consider the expansion of Eq. (70) around $s = 0$, which leads to

$$\frac{ds}{dl} \simeq -\frac{1 - n}{\tau}s - \frac{n^2 s^2}{2\tau^2} + \cdots, \qquad \frac{d}{dl}\log\tilde P_{\mathrm{ss}} \simeq \frac{n\nu_0}{\tau}s + \cdots. \tag{71}$$

The corresponding flow of this effective dynamical system $s(l)$ along the $s$ axis as a function of "time" $l$ is illustrated in


Fig. 5. Near the critical condition $\varepsilon = 1 - n \ll 1$, this "dynamical system" has two fixed points $V(s) = 0$ at

$$ s = 0, \qquad s \simeq -2\tau\varepsilon. \tag{72} $$

The former is a stable attractor whereas the latter is an unstable repeller [see Fig. 5(a)]. Remarkably, the critical condition $n = 1$ for the Hawkes process corresponds to the condition of a transcritical bifurcation [i.e., the repeller merges with the attractor; see Fig. 5(b)] for the "dynamical system" described by the Lagrange-Charpit equations. This picture is useful because it can be straightforwardly generalized to more general memory kernels $h(t)$, as shown later.

FIG. 5. Schematic illustration of the vector field $V(s) := ds/dl = -e^{-ns/\tau} + 1 - s/\tau$ along the $s$ dimension as a function of "time" $l$. (a) Near the critical condition $\varepsilon := 1 - n \ll 1$, two fixed points exist at $s = 0$ (attractor) and $s \simeq -2\tau\varepsilon$ (repeller). (b) At the critical condition $n = 1$, the repeller merges with the attractor, which corresponds to a transcritical bifurcation.

Let us neglect the subleading contribution to obtain the general solution as

$$ s = e^{-(1-n)(l-l_0)/\tau}, \qquad \log \tilde{P}_{\rm ss} \simeq \frac{n\nu_0}{\tau}\int dl\, s(l) + C, \tag{73} $$

with constants of integration $l_0$ and $C$. In the following, we set the initial "time" (i.e., the initial point on the characteristic curve) as $l_0 = 0$. We then obtain

$$ \log \tilde{P}_{\rm ss} \simeq -\frac{n\nu_0 s}{1-n} + C \tag{74} $$

with constant of integration $C$. This constant is fixed by the condition of normalization of the PDF, given by $\log \tilde{P}_{\rm ss} = 0$ for $s = 0$, which imposes $C = 0$. We thus obtain

$$ \log \tilde{Q}_{\rm ss}(s) = -\nu_0 s + \log \tilde{P}_{\rm ss}(s) \simeq -\frac{\nu_0 s}{1-n}, \tag{75} $$

which is consistent with the asymptotic mean intensity in the steady state [see the long time limit of Eq. (67)].

b. At criticality n = 1. For $n = 1$, the lowest-order contribution in the Lagrange-Charpit equation is given by

$$ \frac{ds}{dl} \simeq -\frac{s^2}{2\tau^2} \;\Longrightarrow\; s = \frac{2\tau^2}{l - l_0} \tag{76} $$

with constant of integration $l_0$. In the following, we set $l_0 = 0$ as the initial point on the characteristic curve. We then obtain

$$ \log \tilde{P}_{\rm ss} \simeq \frac{\nu_0}{\tau}\int dl\, s(l) + C \simeq -2\nu_0\tau \log|s| + C \tag{77} $$

with the constant of integration $C$. This constant is a "divergent" constant since it has to compensate the diverging logarithm to ensure that $\log \tilde{P}_{\rm ss}(s = 0) = 0$. This divergent constant appears as a result of neglecting the ultraviolet (UV) cutoff for small $s$ (which corresponds to neglecting the exponential tail of the PDF of intensities). By ignoring the divergent constant $C$, we obtain the intermediate asymptotics,

$$ P_{\rm ss}(\nu) \propto \nu^{-1+2\nu_0\tau}, \tag{78} $$

which recovers the leading power law intermediate asymptotic (53), which is a special case of the general solution (47).

C. Double exponential kernel

We now consider the case where the memory function (24) is made of $K = 2$ exponential functions. Since the Laplace representation of the master equation is still a first-order partial differential equation, its solution can be formally obtained by the method of characteristics (see Appendix C for a short review). Unfortunately, the time-dependent Lagrange-Charpit equation cannot be exactly solved in explicit form anymore. We therefore focus on the steady-state solution of the master equation, with a special focus on the regime close to the critical point. We develop the stability analysis of the Lagrange-Charpit equations following the same approach as in Sec. IV B 3.

Let us start from the Lagrange-Charpit equations, which are given by

$$ \frac{ds_1}{dl} = -e^{-(n_1 s_1/\tau_1 + n_2 s_2/\tau_2)} + 1 - \frac{s_1}{\tau_1}, \tag{79a} $$

$$ \frac{ds_2}{dl} = -e^{-(n_1 s_1/\tau_1 + n_2 s_2/\tau_2)} + 1 - \frac{s_2}{\tau_2}, \tag{79b} $$

$$ \frac{d\Phi}{dl} = -\nu_0\left(e^{-(n_1 s_1/\tau_1 + n_2 s_2/\tau_2)} - 1\right), \tag{79c} $$

with $\Phi := \log \tilde{P}_{\rm ss}$, and $l$ is the auxiliary "time" parameterizing the position on the characteristic curve. Let us develop the stability analysis around $\mathbf{s} = \mathbf{0}$ (i.e., for large $\nu$'s) for this pseudodynamical system.

a. Subcritical case n < 1. Assuming $n := n_1 + n_2 < 1$, let us first consider the linearized dynamics of system (79) as

$$ \frac{d\mathbf{s}}{dl} \simeq -\mathbf{H}\mathbf{s}, \qquad \frac{d\Phi}{dl} \simeq \nu_0 \mathbf{K}\mathbf{s} \tag{80} $$

with

$$ \mathbf{s} := \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}, \qquad \mathbf{H} := \begin{pmatrix} \dfrac{1-n_1}{\tau_1} & -\dfrac{n_2}{\tau_2} \\ -\dfrac{n_1}{\tau_1} & \dfrac{1-n_2}{\tau_2} \end{pmatrix}, \qquad \mathbf{K} := \left(\frac{n_1}{\tau_1}, \frac{n_2}{\tau_2}\right). \tag{81} $$

Regarding this system as a dynamical system with the auxiliary "time" $l$, its qualitative dynamics can be illustrated by its phase space depicted in Fig. 6. In the subcritical case $n < 1$, the origin $\mathbf{s} = (0, 0)$ is "attractive" since all the eigenvalues of $\mathbf{H}$ are positive [Fig. 6(a)].

Let us introduce the eigenvalues $\lambda_1, \lambda_2$ and eigenvectors $\mathbf{e}_1, \mathbf{e}_2$ of $\mathbf{H}$, such that

$$ \mathbf{P} := (\mathbf{e}_1, \mathbf{e}_2), \qquad \mathbf{P}^{-1}\mathbf{H}\mathbf{P} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \tag{82} $$

Because all eigenvalues are real (see Appendix F 1 for the proof), we denote $\lambda_1 \le \lambda_2$. The determinant of $\mathbf{H}$ is given by

$$ \det \mathbf{H} = \frac{1-n}{\tau_1\tau_2}. \tag{83} $$
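The linear-stability statements around Eqs. (81)-(83) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it builds the 2x2 matrix H of Eq. (81) for the parameter values quoted in the caption of Fig. 6 and compares its determinant and eigenvalues with the stated results. The helper names `hawkes_matrix_2exp`, `det2`, and `eigenvalues2` are hypothetical and introduced only for illustration.

```python
import math


def hawkes_matrix_2exp(tau1, tau2, n1, n2):
    """Linearized Lagrange-Charpit matrix H of Eq. (81) for K = 2."""
    return [[(1.0 - n1) / tau1, -n2 / tau2],
            [-n1 / tau1, (1.0 - n2) / tau2]]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def eigenvalues2(m):
    """Eigenvalues of a 2x2 matrix from trace and determinant, ascending."""
    tr = m[0][0] + m[1][1]
    disc = tr * tr - 4.0 * det2(m)
    if disc < 0.0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return ((tr - root) / 2.0, (tr + root) / 2.0)


# Parameter values taken from the caption of Fig. 6
tau1, tau2 = 1.0, 3.0
sub = hawkes_matrix_2exp(tau1, tau2, 0.3, 0.1)   # n = 0.4 < 1 (subcritical)
crit = hawkes_matrix_2exp(tau1, tau2, 0.3, 0.7)  # n = 1 (critical)

det_sub = det2(sub)
det_formula = (1.0 - 0.4) / (tau1 * tau2)        # Eq. (83)
lam_sub = eigenvalues2(sub)
lam_crit = eigenvalues2(crit)

print("det H (subcritical):", det_sub, "vs (1-n)/(tau1*tau2):", det_formula)
print("eigenvalues (subcritical):", lam_sub)
print("eigenvalues (critical):", lam_crit)
```

In the subcritical case both eigenvalues come out positive, whereas at n = 1 the smaller one vanishes up to rounding, which is the marginal direction discussed in the text.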


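This means that the zero eigenvalue $\lambda_1 = 0$ appears at the critical point $n = 1$. Below the critical point $n < 1$, all the eigenvalues are positive ($\lambda_1, \lambda_2 > 0$).

FIG. 6. Qualitative representation of the Lagrange-Charpit equations in phase space. By rewriting $d\mathbf{s}/dl := \mathbf{V}(\mathbf{s}) \simeq -\mathbf{H}\mathbf{s}$, the "velocity" vector field $\mathbf{V}(\mathbf{s})$ is plotted in the phase space $(s_1, s_2)$. (a) Subcritical case with $(\tau_1, \tau_2, n_1, n_2) = (1, 3, 0.3, 0.1)$, showing that $\mathbf{s} = \mathbf{0}$ is a stable attractor. (b) Critical case with $(\tau_1, \tau_2, n_1, n_2) = (1, 3, 0.3, 0.7)$, showing that the $\mathbf{e}_1$ direction is marginal in terms of the linear stability analysis (i.e., the repeller merges with the attractor, which corresponds to a transcritical bifurcation in dynamical systems).

For $n < 1$, the dynamics can be rewritten as

$$ \frac{d}{dl}\mathbf{P}^{-1}\mathbf{s} = -\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}\mathbf{P}^{-1}\mathbf{s} \;\Longrightarrow\; \mathbf{s}(l) = \mathbf{P}\begin{pmatrix} e^{-\lambda_1(l-l_0)} \\ e^{-\lambda_2(l-l_0)}/C_1 \end{pmatrix} \tag{84} $$

with constants of integration $l_0$ and $C_1$. We can assume $l_0 = 0$ as the initial point of the characteristic curve without loss of generality.

Integrating the second equation in (80), we obtain

$$ \Phi = \nu_0\mathbf{K}\int \mathbf{s}(l)\,dl + C_2 = -\nu_0\mathbf{K}\mathbf{P}\begin{pmatrix} 1/\lambda_1 & 0 \\ 0 & 1/\lambda_2 \end{pmatrix}\mathbf{P}^{-1}\mathbf{s} + C_2 = -\nu_0\mathbf{K}\mathbf{H}^{-1}\mathbf{s} + C_2. \tag{85} $$

The general solution is given by

$$ \mathcal{H}(C_1) = C_2 \tag{86} $$

with a function $\mathcal{H}$ determined by the initial condition on the characteristic curve. Let us introduce

$$ \bar{\mathbf{s}} := \mathbf{P}^{-1}\mathbf{s} = \begin{pmatrix} \bar{s}_1 \\ \bar{s}_2 \end{pmatrix} \;\Longrightarrow\; C_1 = (\bar{s}_1)^{\lambda_2/\lambda_1}(\bar{s}_2)^{-1}. \tag{87} $$

This means that the solution is given by the following form:

$$ \Phi(\mathbf{s}) = -\nu_0\mathbf{K}\mathbf{H}^{-1}\mathbf{s} + \mathcal{H}\!\left((\bar{s}_1)^{\lambda_2/\lambda_1}(\bar{s}_2)^{-1}\right). \tag{88} $$

Because of the normalization of the PDF, the following relation

$$ \lim_{\mathbf{s}\to\mathbf{0}} \Phi(\mathbf{s}) = 0 \tag{89} $$

must hold for any path in the $(s_1, s_2)$ space ending on the origin (limit $\mathbf{s} \to \mathbf{0}$). Let us consider the specific limit such that $\bar{s}_1 \to 0$ with $\bar{s}_2 = x^{-1}(\bar{s}_1)^{\lambda_2/\lambda_1}$ for an arbitrary positive $x$:

$$ \lim_{\bar{s}_1\to 0} \Phi(\mathbf{s}) = \mathcal{H}(x). \tag{90} $$

Since the left-hand side (LHS) is zero for any $x$, the function $\mathcal{H}(\cdot)$ must be identically zero. With $\Phi := \log \tilde{P}_{\rm ss}$ as defined in (79), this leads to

$$ \log \tilde{P}_{\rm ss}(\mathbf{s}) = -\nu_0\mathbf{K}\mathbf{H}^{-1}\mathbf{s}. \tag{91} $$

By substituting the special choice $\mathbf{s} = (s_1 = s, s_2 = s)^T$, we obtain

$$ \log \tilde{Q}_{\rm ss}(s) = -\nu_0 s + \Phi_{t=\infty}\!\left(s(1,1)^T\right) \simeq -\frac{\nu_0 s}{1-n} \tag{92} $$

for small $s$, which recovers expression (75) derived above.

b. Critical case n = 1. In this case, the eigenvalues and eigenvectors of $\mathbf{H}$ are given by

$$ \lambda_1 = 0, \qquad \lambda_2 = \frac{n_1\tau_1 + n_2\tau_2}{\tau_1\tau_2}, \qquad \mathbf{e}_1 = \begin{pmatrix} \tau_1 \\ \tau_2 \end{pmatrix}, \qquad \mathbf{e}_2 = \begin{pmatrix} -n_2 \\ n_1 \end{pmatrix}. \tag{93} $$

This means that the eigenvector matrix $\mathbf{P}$ and its inverse are respectively given by

$$ \mathbf{P} = \begin{pmatrix} \tau_1 & -n_2 \\ \tau_2 & n_1 \end{pmatrix}, \qquad \mathbf{P}^{-1} = \frac{1}{\alpha}\begin{pmatrix} n_1 & n_2 \\ -\tau_2 & \tau_1 \end{pmatrix}, \qquad \alpha := \det \mathbf{P} = \tau_1 n_1 + \tau_2 n_2. \tag{94} $$

This value of $\alpha$ is the special case for two exponentials of the general definition (34). Accordingly, let us introduce

$$ \mathbf{X} = (X, Y)^T = \mathbf{P}^{-1}\mathbf{s} \;\Longleftrightarrow\; X = \frac{n_1 s_1 + n_2 s_2}{\alpha}, \qquad Y = \frac{-\tau_2 s_1 + \tau_1 s_2}{\alpha}. \tag{95} $$

We then obtain

$$ \frac{dX}{dl} \simeq 0, \qquad \frac{dY}{dl} \simeq -\lambda_2 Y \tag{96} $$

at the leading linear order in expansions in powers of $X$ and $Y$. Since the first linear term is zero in the dynamics of $X$, corresponding to a transcritical bifurcation for the Lagrange-Charpit equations (79), we need to take into account the second-order term in $X$, namely

$$ e^{-(n_1 s_1/\tau_1 + n_2 s_2/\tau_2)} \simeq 1 - X + \frac{X^2}{2} + n_1 n_2\left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)Y + O(XY, X^2Y, Y^2), \tag{97} $$

where we have dropped terms of the order $Y^2$, $XY$, and $X^2Y$. We then obtain the dynamical equations at the transcritical bifurcation [see Fig. 6(b)] to leading order

$$ \frac{dY}{dl} \simeq -\lambda_2 Y, \qquad \frac{dX}{dl} \simeq -\frac{X^2}{2\alpha}, \tag{98} $$

whose solutions are given by

$$ X(l) = \frac{2\alpha}{l - l_0}, \qquad Y(l) = C_1 e^{-\lambda_2(l-l_0)} \tag{99} $$

with constants of integration $l_0$ and $C_1$. We can assume $l_0 = 0$ as the initial point on the characteristic curve. Remarkably, only the contribution along the $X$ axis is dominant for the large $l$ limit (i.e., $|X| \gg |Y|$ for $l \to \infty$), which corresponds to the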


asymptotic limit $\mathbf{s} \to \mathbf{0}$. We then obtain

$$ \Phi \simeq \nu_0\int dl\left(\frac{n_1 s_1(l)}{\tau_1} + \frac{n_2 s_2(l)}{\tau_2}\right) \simeq -2\nu_0\alpha\log|X| + \frac{\nu_0 n_1 n_2}{\lambda_2}\left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)Y + C_2 \tag{100} $$

with constant of integration $C_2$. The general solution is given by

$$ \mathcal{H}(C_1) = C_2 \tag{101} $$

with a function $\mathcal{H}$, which is determined by the initial condition. Considering that

$$ C_1 = Y\exp\!\left(\frac{2\lambda_2\alpha}{X}\right), \tag{102} $$

the solution is given by the following form:

$$ \Phi(\mathbf{s}) = -2\nu_0\alpha\log|X| + \frac{\nu_0 n_1 n_2}{\lambda_2}\left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)Y + \mathcal{H}\!\left(Y\exp\!\left(\frac{2\lambda_2\alpha}{X}\right)\right). \tag{103} $$

Because we have neglected the UV cutoff for small $s$, there is an artificial divergent term $-2\nu_0\alpha\log|X|$ for small $X$. Except for this divergent term, $\Phi(\mathbf{s})$ must be constant for $\mathbf{s} \to \mathbf{0}$. The function $\mathcal{H}(\cdot)$ is thus constant because

$$ \lim_{Y\to 0}\left[\Phi(\mathbf{s}) + 2\nu_0\alpha\log|X|\right] = \mathcal{H}(Z) = \text{const.} \tag{104} $$

with the choice of $X = 2\lambda_2\alpha/\log(Z/Y)$ for any positive constant $Z$. Therefore, we obtain the steady solution

$$ \log \tilde{P}_{\rm ss}(\mathbf{s}) \simeq -2\nu_0\alpha\log|X| + \frac{\nu_0 n_1 n_2}{\lambda_2}\left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)Y \tag{105} $$

for small $X$ and $Y$, by ignoring the UV cutoff and the constant contribution. This recovers the power law formula of the intermediate asymptotics of the PDF of the Hawkes intensities:

$$ \log \tilde{Q}_{\rm ss}(s) := -\nu_0 s + \log \tilde{P}_{\rm ss}(s, s) \simeq -2\nu_0\alpha\log|s| \quad (s \sim 0) \;\Longleftrightarrow\; P(\nu) \sim \nu^{-1+2\nu_0\alpha} \quad (\nu \to +\infty), \tag{106} $$

with $\alpha = \tau_1 n_1 + \tau_2 n_2$ as defined in (94).

c. Numerical verification. We have numerically confirmed our theoretical prediction (106), a special case of (47) for a memory kernel with two exponentials, as shown in Fig. 7. The main properties are the same as those shown in Fig. 4, implying that our prediction is verified for memory kernels with one and two exponentials.

FIG. 7. Numerical evaluation of the steady-state PDF of the Hawkes intensity $\hat{\nu}$ for the double exponential case $K = 2$ (24) with $(\tau_1, \tau_2) = (1, 3)$, $(n_1, n_2) = (0.5, 0.499)$, or $(n_1, n_2) = (0.5, 0.49)$, near the critical point: (a) Background intensity $\nu_0 = 0.01$, leading to the power law exponent 0.96. (b) $\nu_0 = 0.1$, leading to the power law exponent 0.6. (c) $\nu_0 = 0.75$, leading to the negative (i.e., growing PDF) power law exponent $-2.0$. For all simulations, the sampling time interval and total sampling time are $dt = 0.001$ and $T_{\rm tot} = 10000$ from the initial condition $\hat{z}(0) = 0$. The initial 10% of the sample was discarded from the statistics.

D. Discrete superposition of exponential kernels

We now consider the case where the memory kernel is the sum of an arbitrary finite number $K$ of exponentials according to expression (24). Our treatment follows the method presented for the case $K = 2$.

The corresponding Lagrange-Charpit equations read

$$ \frac{ds_k}{dl} = -e^{-\sum_{j=1}^K n_j s_j/\tau_j} + 1 - \frac{s_k}{\tau_k}, \qquad \frac{d\Phi}{dl} = -\nu_0\left(e^{-\sum_{j=1}^K n_j s_j/\tau_j} - 1\right). \tag{107} $$

The derivation of the PDF of the Hawkes intensities boils down to a stability analysis of these equations around $\mathbf{s} = \mathbf{0}$ in the neighborhood of the critical condition $n = 1$.

a. Subcritical case n < 1. We linearize the Lagrange-Charpit equations to obtain

$$ \frac{d\mathbf{s}}{dl} \simeq -\mathbf{H}\mathbf{s}, \qquad \frac{d\Phi}{dl} \simeq \nu_0\mathbf{K}\mathbf{s} \tag{108} $$

with

$$ \mathbf{H} := \begin{pmatrix} \dfrac{1-n_1}{\tau_1} & -\dfrac{n_2}{\tau_2} & \dots & -\dfrac{n_K}{\tau_K} \\ -\dfrac{n_1}{\tau_1} & \dfrac{1-n_2}{\tau_2} & \dots & -\dfrac{n_K}{\tau_K} \\ \vdots & \vdots & \ddots & \vdots \\ -\dfrac{n_1}{\tau_1} & -\dfrac{n_2}{\tau_2} & \dots & \dfrac{1-n_K}{\tau_K} \end{pmatrix}, \qquad \mathbf{K} := \left(\frac{n_1}{\tau_1}, \dots, \frac{n_K}{\tau_K}\right). \tag{109} $$

Considering that all eigenvalues $\{\lambda_k\}_{k=1,\dots,K}$ of $\mathbf{H}$ are real (see Appendix F 1 for its proof), we order them according to $\lambda_i < \lambda_j$ for $i < j$. We denote the corresponding eigenvectors as $\{\mathbf{e}_k\}_{k=1,\dots,K}$. The matrix $\mathbf{H}$ can thus be diagonalized as

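follows:

$$ \mathbf{P} := (\mathbf{e}_1, \dots, \mathbf{e}_K), \qquad \mathbf{P}^{-1}\mathbf{H}\mathbf{P} = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_K \end{pmatrix}. \tag{110} $$

The critical case $n = 1$ corresponds to the existence of a zero eigenvalue. Therefore, at the critical point, the determinant of $\mathbf{H}$ is zero (see Appendix F 2 for the derivation of the explicit form of its determinant):

$$ \det \mathbf{H} = \frac{1 - \sum_{k=1}^K n_k}{\prod_{k=1}^K \tau_k} = 0 \;\Longleftrightarrow\; n := \sum_{k=1}^K n_k = 1. \tag{111} $$

Following calculations similar to those presented in Sec. IV B, we obtain

$$ \Phi(\mathbf{s}) \simeq -\nu_0\mathbf{K}\mathbf{H}^{-1}\mathbf{s}, \tag{112} $$

where the inverse matrix $\mathbf{H}^{-1}$ is explicitly given in Appendix F 3. We finally obtain

$$ \log \tilde{Q}_{\rm ss}(s) = -\nu_0 s + \Phi\!\left(s(1, \dots, 1)^T\right) = -\frac{\nu_0 s}{1-n}, \tag{113} $$

again recovering (92) and (75) derived above.

b. Critical case n = 1. At the critical point, the smallest eigenvalue of $\mathbf{H}$ is zero ($\lambda_1 = 0$). By direct substitution, its corresponding eigenvector is

$$ \mathbf{e}_1 = (\tau_1, \dots, \tau_K)^T, \tag{114} $$

as seen from

$$ \mathbf{H}\mathbf{e}_1 = \begin{pmatrix} \dfrac{1-n_1}{\tau_1} & -\dfrac{n_2}{\tau_2} & \dots & -\dfrac{n_K}{\tau_K} \\ -\dfrac{n_1}{\tau_1} & \dfrac{1-n_2}{\tau_2} & \dots & -\dfrac{n_K}{\tau_K} \\ \vdots & \vdots & \ddots & \vdots \\ -\dfrac{n_1}{\tau_1} & -\dfrac{n_2}{\tau_2} & \dots & \dfrac{1-n_K}{\tau_K} \end{pmatrix}\begin{pmatrix} \tau_1 \\ \tau_2 \\ \vdots \\ \tau_K \end{pmatrix} = \begin{pmatrix} 1-n \\ 1-n \\ \vdots \\ 1-n \end{pmatrix} = \mathbf{0} \quad \text{for } n = 1. \tag{115} $$

We now introduce a new set of variables (i.e., a representation based on the eigenvectors)

$$ \mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_K \end{pmatrix} := \mathbf{P}^{-1}\mathbf{s}, \qquad \mathbf{P}^{-1} = \begin{pmatrix} \mathbf{g}_1^T \\ \mathbf{g}_2^T \\ \vdots \\ \mathbf{g}_K^T \end{pmatrix}. \tag{116} $$

The linearized Lagrange-Charpit equations are given by

$$ \frac{dX_1}{dl} \simeq 0, \qquad \frac{dX_j}{dl} \simeq -\lambda_j X_j \quad \text{for } j \ge 2. \tag{117} $$

Similarly to Eq. (99), the leading-order contribution comes from the $X_1$ direction because $|X_1| \gg |X_j|$ for $j \ge 2$ in the asymptotic limit $l \to \infty$. We thus neglect the other contributions by assuming $X_j \sim 0$ for $j \ge 2$. It is therefore necessary to take the second-order contribution along the $X_1$ direction,

$$ e^{-\sum_{k=1}^K n_k s_k/\tau_k} - 1 = -\sum_{k=1}^K \frac{n_k s_k}{\tau_k} + \frac{1}{2}\left(\sum_{k=1}^K \frac{n_k s_k}{\tau_k}\right)^2 + \cdots. \tag{118} $$

We note that $X_1$ is given by

$$ X_1 = \mathbf{g}_1\cdot\mathbf{s} = \frac{1}{\alpha}\sum_{k=1}^K n_k s_k, \qquad \mathbf{g}_1 = \left(\frac{n_1}{\alpha}, \dots, \frac{n_K}{\alpha}\right)^T, \tag{119} $$

where $\alpha := \sum_{k=1}^K \tau_k n_k$, which is a special case for a discrete sum of exponentials of the general definition (34).

Taking the derivative of (119) and using Eq. (107), we obtain

$$ \frac{dX_1}{dl} = \frac{1}{\alpha}\sum_{k=1}^K n_k\frac{ds_k}{dl} = 0 - \frac{1}{2\alpha}\left(\sum_{k=1}^K \frac{n_k s_k}{\tau_k}\right)^2 + \cdots. \tag{120} $$

This means that $\mathbf{g}_1$ is a correct representation. Note that the value of $\alpha$ given by (34) ensures consistency with the following identity:

$$ \mathbf{P}^{-1}\mathbf{P} = \begin{pmatrix} \mathbf{g}_1^T \\ \mathbf{g}_2^T \\ \vdots \\ \mathbf{g}_K^T \end{pmatrix}(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_K) = \begin{pmatrix} n_1/\alpha & n_2/\alpha & \dots & n_K/\alpha \\ * & * & \dots & * \\ \vdots & \vdots & \ddots & \vdots \\ * & * & \dots & * \end{pmatrix}\begin{pmatrix} \tau_1 & * & \dots & * \\ \tau_2 & * & \dots & * \\ \vdots & \vdots & \ddots & \vdots \\ \tau_K & * & \dots & * \end{pmatrix} = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}, \tag{121} $$

where $*$ represents some unspecified value. Since the contribution of $X_2, \dots, X_K$ can be ignored for the description of the leading behavior along $X_1$, let us set $X_2 = X_3 = \cdots = X_K = 0$, which leads to

$$ \mathbf{s} = \mathbf{P}\mathbf{X} \simeq (\mathbf{e}_1, \dots, \mathbf{e}_K)\begin{pmatrix} X_1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = X_1\mathbf{e}_1 = \begin{pmatrix} X_1\tau_1 \\ X_1\tau_2 \\ \vdots \\ X_1\tau_K \end{pmatrix}. \tag{122} $$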
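The critical-case construction of Eqs. (114)-(122) lends itself to a small numerical experiment. The sketch below is illustrative only and rests on assumed parameter values (K = 3, with hypothetical `taus` and `ns` chosen so that n = 1). It checks that e_1 = (tau_1, ..., tau_K)^T is annihilated by H and that g_1 . e_1 = 1. It then integrates the Lagrange-Charpit equations (107) with a hand-written fourth-order Runge-Kutta step, to see whether 1/X_1 grows linearly in l with slope close to 1/(2 alpha), as the leading-order reduction dX_1/dl ≃ -X_1^2/(2 alpha) suggests.

```python
import math

# Hypothetical parameters (not from the paper): K = 3 and n = sum(ns) = 1
taus = [1.0, 2.0, 4.0]
ns = [0.5, 0.3, 0.2]
K = len(taus)
alpha = sum(n * t for n, t in zip(ns, taus))

# Matrix H of Eq. (109): H_ij = (delta_ij - n_j) / tau_j
H = [[((1.0 if i == j else 0.0) - ns[j]) / taus[j] for j in range(K)]
     for i in range(K)]
e1 = list(taus)                     # candidate zero eigenvector, Eq. (114)
g1 = [n / alpha for n in ns]        # first row of P^{-1}, Eq. (119)

H_e1 = [sum(H[i][j] * e1[j] for j in range(K)) for i in range(K)]
g1_dot_e1 = sum(g * e for g, e in zip(g1, e1))


def rhs(s):
    """Right-hand side of ds_k/dl in Eq. (107)."""
    u = sum(n * sk / t for n, sk, t in zip(ns, s, taus))
    common = 1.0 - math.exp(-u)
    return [common - sk / t for sk, t in zip(s, taus)]


def rk4_step(s, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(s)
    k2 = rhs([x + 0.5 * h * k for x, k in zip(s, k1)])
    k3 = rhs([x + 0.5 * h * k for x, k in zip(s, k2)])
    k4 = rhs([x + h * k for x, k in zip(s, k3)])
    return [x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for x, a, b, c, d in zip(s, k1, k2, k3, k4)]


def integrate(s, length, h=0.1):
    """Advance the autonomous system by `length` units of the parameter l."""
    for _ in range(int(round(length / h))):
        s = rk4_step(s, h)
    return s


def x1(s):
    """Slow coordinate X_1 = g_1 . s of Eq. (119)."""
    return sum(g * sk for g, sk in zip(g1, s))


s_a = integrate([0.05 * t for t in taus], 2000.0)   # state at l = 2000
s_b = integrate(s_a, 2000.0)                        # state at l = 4000
slope = (1.0 / x1(s_b) - 1.0 / x1(s_a)) / 2000.0

print("H e1 =", H_e1, " g1.e1 =", g1_dot_e1)
print("measured slope of 1/X1:", slope, " expected 1/(2 alpha):", 1.0 / (2.0 * alpha))
```

The agreement is only asymptotic: corrections of relative order X_1 remain at finite l, so the measured slope approaches 1/(2 alpha) from nearby values rather than matching it exactly.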


We thus obtain the second-order contribution along the $X_1$ axis by ignoring the nonlinear contribution from $X_2, \dots, X_K$:

$$ \frac{dX_1}{dl} \simeq -\frac{X_1^2}{2\alpha}. \tag{123} $$

With calculations that follow step by step those in Sec. IV B, we obtain

$$ \log \tilde{Q}_{\rm ss}(s) \simeq -2\nu_0\alpha\log|s| \;\Longleftrightarrow\; P(\nu) \sim \nu^{-1+2\nu_0\alpha}, \qquad \text{with } \alpha := \sum_{k=1}^K n_k\tau_k. \tag{124} $$

This recovers the power law formula of the intermediate asymptotics of the PDF of the Hawkes intensities given by (47).

E. General case

We are now prepared to study the general case where the memory kernel of the Hawkes process is a continuous superposition of exponential functions (33). Introducing the steady-state cumulant functional

$$ \Phi[s] := \log \tilde{P}_{\rm ss}[s], \tag{125} $$

and from the master equation in its functional Laplace representation Eq. (45), we obtain the following first-order functional differential equation in the steady state,

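$$ \int_0^\infty dx\left(e^{-\int_0^\infty dx'\, s(x')n(x')/x'} - 1 + \frac{s(x)}{x}\right)\frac{\delta\Phi[s]}{\delta s(x)} = \nu_0\left(e^{-\int_0^\infty dx'\, s(x')n(x')/x'} - 1\right). \tag{126} $$

The corresponding Lagrange-Charpit equations are the following partial integro-differential equations,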

$$ \frac{\partial s(l; x)}{\partial l} = 1 - e^{-\int_0^\infty dx'\, s(x')n(x')/x'} - \frac{s(x)}{x}, \qquad \frac{\partial\Phi(l)}{\partial l} = -\nu_0\left(e^{-\int_0^\infty dx'\, s(x')n(x')/x'} - 1\right), \tag{127} $$

where $l$ is the curvilinear parameter indexing the position along the characteristic curve. We now perform the stability analysis of Eq. (127) in the neighborhood of $s = 0$ close to the critical condition $n = 1$.

a. Subcritical case n < 1. We linearize the Lagrange-Charpit equation (127) to obtain

$$ \frac{\partial s(l; x)}{\partial l} \simeq -\int_0^\infty dx'\, H(x, x')s(x'), \qquad \frac{\partial\Phi(l)}{\partial l} \simeq \nu_0\int_0^\infty dx'\, K(x')s(x') \tag{128} $$

with

$$ H(x, x') := \frac{\delta(x - x') - n(x')}{x'}, \qquad K(x) := \frac{n(x)}{x}. \tag{129} $$

Let us introduce the eigenvalues $\lambda \ge \lambda_{\min}$ and eigenfunctions $e(x; \lambda)$, satisfying

$$ \int_0^\infty dx'\, H(x, x')e(x'; \lambda) = \lambda e(x; \lambda). \tag{130} $$

Appendix G 1 shows that all the eigenvalues are real. The inverse matrix of $H(x, x')$, denoted by $H^{-1}(x, x')$, can be explicitly obtained as shown in Appendix G 2. Since the inverse matrix $H^{-1}(x, x')$ has a singularity at $n = 1$, the critical condition of this Hawkes process is given by $n = 1$ as expected. Using calculations that are analogous to those in Sec. IV B, we obtain

$$ \Phi[s] \simeq -\nu_0\int_0^\infty dx\int_0^\infty dx'\, K(x)H^{-1}(x, x')s(x'), \tag{131} $$

from which we state that

$$ \log \tilde{Q}_{\rm ss}(s) = -\nu_0 s + \Phi[s\mathbf{1}(x)] = -\frac{\nu_0 s}{1-n}, \tag{132} $$

where $\mathbf{1}(x)$ is an indicator function defined by $\mathbf{1}(x) = 1$ for any $x$. This recovers (75), (92), and (113) derived above.

b. Critical case n = 1. At criticality, the smallest eigenvalue vanishes: $\lambda_{\min} = 0$. Indeed, we obtain the zero eigenfunction

$$ e(x; \lambda = 0) = x, \tag{133} $$

which can be checked by direct substitution:

$$ \int_0^\infty dx'\, H(x, x')e(x'; \lambda = 0) = \int_0^\infty dx'\, \frac{\delta(x - x') - n(x')}{x'}\, x' = 1 - n = 0, \quad \text{for } n = 1. \tag{134} $$

We now introduce a set of variables to obtain a new representation based on the eigenfunctions,

$$ s(x) = \sum_\lambda e(x; \lambda)X(\lambda) \;\Longleftrightarrow\; X(\lambda) = \int_0^\infty dx\, e^{-1}(\lambda; x)s(x), \tag{135} $$

with the inverse matrix $e^{-1}(\lambda; x)$ satisfying

$$ \int_0^\infty dx\, e^{-1}(\lambda; x)e(x; \lambda') = \delta_{\lambda,\lambda'}. \tag{136} $$

We assume the existence of the inverse matrix, which is equivalent to the assumption that the set of all eigenfunctions is complete. $H(x, x')$ can be diagonalized:

$$ \int_0^\infty dx\int_0^\infty dx'\, e^{-1}(\lambda; x)H(x, x')e(x'; \lambda') = \lambda\delta_{\lambda,\lambda'}. \tag{137} $$

We then obtain the linearized Lagrange-Charpit equations,

$$ \frac{\partial X(\lambda)}{\partial l} \simeq -\lambda X(\lambda). \tag{138} $$

The dominant contribution comes from the vanishing eigenvalue. We therefore focus on $X(0)$ by setting $X(\lambda) = 0$ for


$\lambda > 0$. We then form the expansion

$$ e^{-\int_0^\infty dx'\, s(x')n(x')/x'} - 1 = -\int_0^\infty dx'\, \frac{n(x')s(x')}{x'} + \frac{1}{2}\left(\int_0^\infty dx'\, \frac{n(x')s(x')}{x'}\right)^2 + \cdots. \tag{139} $$

The explicit representation of $X(0)$ is given by

$$ X(\lambda = 0) = \int_0^\infty dx\, e^{-1}(\lambda = 0; x)s(x) = \frac{1}{\alpha}\int_0^\infty dx\, n(x)s(x), \qquad e^{-1}(\lambda = 0; x) = \frac{n(x)}{\alpha}, \tag{140} $$

where $\alpha$ is defined by expression (34). Expression (140) can be checked to be valid by direct substitution since, from Eq. (127), we have

$$ \frac{\partial X(0)}{\partial l} = \int_0^\infty dx\, e^{-1}(0; x)\frac{\partial s(x)}{\partial l} = 0 - \frac{1}{2\alpha}\left(\int_0^\infty dx\, \frac{n(x)s(x)}{x}\right)^2 + \cdots, \tag{141} $$

showing that the first-order contribution is actually null in this representation. The parameter $\alpha$ (34) ensures consistency with the following identity:

$$ \int_0^\infty dx\, e^{-1}(\lambda = 0; x)e(x; \lambda = 0) = \frac{1}{\alpha}\int_0^\infty dx\, n(x)x = \delta_{\lambda=0,\lambda=0} = 1. \tag{142} $$

Since we ignore the contribution from $X(\lambda)$ except for $\lambda = 0$, let us set $X(\lambda) = 0$ for $\lambda > 0$, which yields

$$ s(x) = \sum_\lambda e(x; \lambda)X(\lambda) = e(x; 0)X_0 = X_0 x, \tag{143} $$

where we have written $X_0 := X(0)$. We then obtain the second-order contribution along the $X_0$ axis,

$$ \frac{\partial X_0}{\partial l} \simeq -\frac{X_0^2}{2\alpha}. \tag{144} $$

From calculations mimicking those in Sec. IV B, we obtain

$$ \log \tilde{Q}_{\rm ss}(s) \simeq -2\nu_0\alpha\log|s| \;\Longleftrightarrow\; P(\nu) \sim \nu^{-1+2\nu_0\alpha}, \qquad \alpha := \int_0^\infty dx\, n(x)x. \tag{145} $$

This recovers the power law formula of the intermediate asymptotics of the PDF of the Hawkes intensities given by (47).

V. DISCUSSION: FORMAL RELATION TO QUANTUM FIELD THEORY

We have formulated the non-Markovian Hawkes process as a classical field theory associated with stochastic excitation. While the formulation is entirely classical, it is interesting to point out its formal relationship with quantum field theory.

A. Formal equivalence between the generalized Langevin equation and quantum field theory

We first discuss the formal relationship between the GLE and quantum field theory. In the beginning, let us study the case of the discrete-exponential-sum memory kernel (8). It is well known that the Fokker-Planck (FP) or master equations have a structure that is quite similar to that of the Schrödinger equation [61], and here we reformulate the Fokker-Planck or master equations according to this classical idea.

1. Schrödinger-like representation for the Fokker-Planck equation

After rewriting $v \to V$ and $u_k \to \phi_k$ (the symbol that replaces $v$ and the symbol of its conjugate "momentum" are not legible in this copy; they are written here as $V$ and $\Pi$), and $|P_t\rangle := \int dV\prod_{k=1}^K d\phi_k\, P_t(V, \phi_1, \dots, \phi_K)|V, \phi_1, \dots, \phi_K\rangle$, the FP equation can be rewritten in a quantum-mechanics-like form:

$$ \frac{\partial}{\partial t}|P_t\rangle = H_{\rm orgn}|P_t\rangle, \qquad H_{\rm orgn} := \sum_{k=1}^K\left[-\frac{i}{M}\phi_k\Pi + i\pi_k\left(\frac{\phi_k}{\tau_k} + \kappa_k V\right) - \frac{\kappa_k T}{\tau_k}\pi_k^2\right] \tag{146} $$

with the "momentum" operators $\Pi := -i\partial/\partial V$ and $\pi_k := -i\partial/\partial\phi_k$ satisfying the commutation relations

$$ [V, \Pi] = i, \qquad [\phi_k, \pi_{k'}] = i\delta_{kk'}. \tag{147} $$

While the Hamiltonian $H_{\rm orgn}$ is non-Hermitian (i.e., the evolution operator is not self-adjoint), this quantum-mechanical formulation is sometimes useful since analytical methods developed in quantum mechanics are available and can be formally transposed. In particular, on the condition that the detailed balance is satisfied, there exists a mathematically better mapping [62]. The steady solution of the FP equation is given by

$$ P_{\rm ss}(V, \phi_1, \dots, \phi_K) \propto \exp\left[-\frac{1}{T}\left(\frac{M}{2}V^2 + \sum_{k=1}^K\frac{1}{2\kappa_k}\phi_k^2\right)\right]. \tag{148} $$

Here we make a transformation $|\psi_t\rangle := P_{\rm ss}^{-1/2}|P_t\rangle$ to obtain

$$ -\frac{\partial}{\partial t}|\psi_t\rangle = (H_H + H_A)|\psi_t\rangle, \qquad H_H := \sum_{k=1}^K\left(\frac{\pi_k^2}{2m_k} + \frac{m_k\omega_k^2}{2}\phi_k^2 - \frac{\omega_k}{2}\right), \qquad H_A := i\sum_{k=1}^K\left(\kappa_k\pi_k V - \frac{1}{M}\phi_k\Pi\right) \tag{149} $$

with $m_k := \tau_k/(2T\kappa_k)$ and $\omega_k := 1/\tau_k$. Here, $H_H$ and $H_A$ are Hermitian and anti-Hermitian operators, respectively (i.e., $H_H^\dagger = H_H$ and $H_A^\dagger = -H_A$). Equation (149) can be regarded as a non-Hermitian Schrödinger equation based on the harmonic-oscillator Hamiltonian $H_H$. This decomposition is known to be mathematically useful in particular for the eigenfunction expansions [62].
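The identification $m_k = \tau_k/(2T\kappa_k)$, $\omega_k = 1/\tau_k$ stated with Eq. (149) can be probed for a single auxiliary variable. The sketch below is an illustration under stated assumptions, not the authors' computation: it keeps only the dissipative part of the FP operator for one variable $u$, namely $\partial_u[(u/\tau)\,\cdot\,] + (\kappa T/\tau)\partial_u^2$, and leaves out the anti-Hermitian coupling. It then compares, by central finite differences, the action of this operator on $P_{\rm ss}^{1/2}\psi$ with $P_{\rm ss}^{1/2}(-H_H\psi)$ for an arbitrary smooth test function $\psi$. The parameter values and the test function are hypothetical.

```python
import math

# Hypothetical parameter values for one auxiliary variable
T, kappa, tau = 0.7, 1.3, 2.0
D = kappa * T / tau               # diffusion constant of the assumed FP operator
m = tau / (2.0 * T * kappa)       # mass m_k quoted with Eq. (149)
omega = 1.0 / tau                 # frequency omega_k quoted with Eq. (149)
H_STEP = 1e-3                     # finite-difference step


def sqrt_pss(u):
    """Square root of the steady state exp(-u^2 / (2 kappa T))."""
    return math.exp(-u * u / (4.0 * kappa * T))


def psi(u):
    """Arbitrary smooth test function (an assumption of this sketch)."""
    return math.exp(-0.3 * u * u) * math.cos(u)


def d1(f, u):
    """Central first derivative."""
    return (f(u + H_STEP) - f(u - H_STEP)) / (2.0 * H_STEP)


def d2(f, u):
    """Central second derivative."""
    return (f(u + H_STEP) - 2.0 * f(u) + f(u - H_STEP)) / H_STEP ** 2


def fokker_planck(p, u):
    """Dissipative FP operator: d/du[(u/tau) p] + D d^2 p/du^2."""
    return d1(lambda x: (x / tau) * p(x), u) + D * d2(p, u)


def minus_hh(f, u):
    """-H_H f with H_H = pi^2/(2m) + m omega^2 u^2/2 - omega/2, pi = -i d/du."""
    return d2(f, u) / (2.0 * m) - 0.5 * m * omega ** 2 * u * u * f(u) + 0.5 * omega * f(u)


points = [-1.5, -0.4, 0.0, 0.8, 2.0]
lhs = [fokker_planck(lambda x: sqrt_pss(x) * psi(x), u) for u in points]
rhs = [sqrt_pss(u) * minus_hh(psi, u) for u in points]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

print("largest mismatch between the two sides:", max_gap)
```

Both sides agree up to the discretization error of the finite differences, which supports the quoted mass and frequency, including the constant shift $-\omega_k/2$.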


We can introduce the creation and annihilation operators and their commutation relation

$$ a_k^\dagger := \sqrt{\frac{m_k\omega_k}{2}}\left(\phi_k - \frac{i}{m_k\omega_k}\pi_k\right), \qquad a_k := \sqrt{\frac{m_k\omega_k}{2}}\left(\phi_k + \frac{i}{m_k\omega_k}\pi_k\right), \qquad [a_k, a_{k'}^\dagger] = \delta_{k,k'} \tag{150} $$

to obtain the Hamiltonian of the harmonic oscillators

$$ H_H = \sum_{k=1}^K \omega_k a_k^\dagger a_k. \tag{151} $$

2. Schrödinger-like representation for the field Fokker-Planck equation

This calculation can be generalized to the stochastic field theory of the Fokker-Planck equation. Indeed, by introducing the field operator and the corresponding commutation relation

$$ \pi(x) := -i\frac{\delta}{\delta\phi(x)}, \qquad [\phi(x), \pi(x')] = i\delta(x - x'), \tag{152} $$

we obtain the operator form of the FP field equation for the state vector $|P_t\rangle := \int dV\, \mathcal{D}\phi\, P_t(V, \phi)|V, \phi\rangle$ (the symbol of the variable accompanying the field $\phi$ and that of its conjugate "momentum" are not legible in this copy; they are written here as $V$ and $\Pi$):

$$ \frac{\partial}{\partial t}|P_t\rangle = H_{\rm orgn}|P_t\rangle, \qquad H_{\rm orgn} := \int dx\left[-\frac{i}{M}\phi(x)\Pi + i\pi(x)\left(\frac{\phi(x)}{x} + \kappa(x)V\right) - \frac{\kappa(x)T}{x}\pi(x)^2\right]. \tag{153} $$

The FP field equation can be transformed into a non-Hermitian quantum-field theory composed of harmonic oscillators on the field:

$$ -\frac{\partial}{\partial t}|\psi_t\rangle = (H_H + H_A)|\psi_t\rangle \tag{154} $$

with Hermitian and anti-Hermitian operators

$$ H_H := \int dx\left[\frac{\pi^2(x)}{2m(x)} + \frac{m(x)\omega^2(x)}{2}\phi^2(x) - \frac{\omega(x)}{2}\delta(0)\right], \qquad H_A := i\int dx\left(\kappa(x)\pi(x)V - \frac{1}{M}\phi(x)\Pi\right) \tag{155} $$

by defining

$$ |\psi_t\rangle := P_{\rm ss}^{-1/2}|P_t\rangle, \qquad P_{\rm ss}(V, \{\phi(x)\}_x) \propto \exp\left[-\frac{1}{T}\left(\frac{M}{2}V^2 + \int dx\, \frac{1}{2\kappa(x)}\phi^2(x)\right)\right], \qquad m(x) := \frac{x}{2T\kappa(x)}, \qquad \omega(x) := \frac{1}{x}. \tag{156} $$

While the divergent term $\delta(0)$ appears due to the singularity of the commutation relation (152) at $x = x'$, this term is irrelevant to the physical observables. Indeed, by introducing the field creation and annihilation operators

$$ a^\dagger(x) := \sqrt{\frac{m(x)\omega(x)}{2}}\left(\phi(x) - \frac{i}{m(x)\omega(x)}\pi(x)\right), \qquad a(x) := \sqrt{\frac{m(x)\omega(x)}{2}}\left(\phi(x) + \frac{i}{m(x)\omega(x)}\pi(x)\right) \tag{157} $$

satisfying the field commutation relation

$$ [a(x), a^\dagger(x')] = \delta(x - x'), \tag{158} $$

we obtain the field harmonic-oscillator representation

$$ H_H = \int dx\, \omega(x)a^\dagger(x)a(x), \tag{159} $$

where the divergent term $\delta(0)$ cancels out of the final results. This technical procedure is essentially the same as that in quantum electrodynamics, where renormalization of the energy is required to avoid divergence by removing the zero-point energy of harmonic oscillators.

The formal correspondence between the FP field equation and non-Hermitian quantum field theory itself is not surprising. For example, it is known that a stochastic chemical reaction system, characterized by an SPDE, is formally equivalent to non-Hermitian quantum field theory (see Sec. I E in Ref. [63]). Since the formal relationship between classical and quantum mechanics is a recent popular topic in terms of non-Hermitian physics [64], it might be interesting to further seek this mathematical relationship in understanding general non-Markovian processes.

B. Quantum-field-like representation for the Hawkes process

We can apply this formal procedure to the field master equation for the Hawkes process. Let us apply the functional Kramers-Moyal expansion to obtain

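$$ \frac{\partial P_t[z]}{\partial t} = \left[\int dx\, \frac{\delta}{\delta z(x)}\frac{z(x)}{x} + \sum_{k=1}^\infty \frac{(-1)^k}{k!}\left(\int dx\, \frac{n(x)}{x}\frac{\delta}{\delta z(x)}\right)^k\left(\nu_0 + \int z(x')dx'\right)\right]P_t[z], \tag{160} $$

where we have used the functional Taylor expansion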

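$$ \left(\nu_0 + \int\left(z(x) - \frac{n(x)}{x}\right)dx\right)P_t\!\left[z - \frac{n}{x}\right] = \sum_{k=0}^\infty \frac{1}{k!}\left(-\int dx\, \frac{n(x)}{x}\frac{\delta}{\delta z(x)}\right)^k\left(\nu_0 + \int z(x')dx'\right)P_t[z] = \exp\left[-\int dx\, \frac{n(x)}{x}\frac{\delta}{\delta z(x)}\right]\left(\nu_0 + \int z(x')dx'\right)P_t[z]. \tag{161} $$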
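The resummation in Eq. (161) is the statement that the exponential of a derivative acts as a shift. A one-variable analogue can be verified exactly: for a polynomial $g(z)$, the series $\sum_k \frac{(-a)^k}{k!}\frac{d^k g}{dz^k}$ terminates and equals $g(z - a)$. The sketch below uses exact rational arithmetic; the polynomial `P_toy`, the shift `a` (playing the role of $n/\tau$), and all helper names are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def derivative(coeffs):
    """Derivative of a polynomial [c0, c1, c2, ...] in ascending powers."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, z):
    """Evaluate the polynomial at z."""
    return sum(c * z ** k for k, c in enumerate(coeffs))


def apply_shift_series(coeffs, a):
    """Apply sum_k (-a)^k / k! d^k/dz^k; the sum is finite for a polynomial."""
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]
    k = 0
    while term:
        factor = (-a) ** k / factorial(k)
        for i, c in enumerate(term):
            result[i] += factor * c
        term = derivative(term)
        k += 1
    return result


nu0 = Fraction(1, 10)                       # background intensity (toy value)
a = Fraction(1, 3)                          # jump size, analogue of n / tau
P_toy = [Fraction(1), Fraction(2), Fraction(0), Fraction(1)]   # toy "density"
g = poly_mul([nu0, Fraction(1)], P_toy)     # g(z) = (nu0 + z) * P_toy(z)
shifted = apply_shift_series(g, a)

for z in (Fraction(0), Fraction(5, 2), Fraction(-7, 3)):
    print(z, evaluate(shifted, z), evaluate(g, z - a))
```

The two printed columns coincide exactly, which is the finite-dimensional counterpart of replacing the shifted argument $z - n/x$ by the exponential operator in Eq. (161).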


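We then introduce the field operators and the corresponding commutation relations,

$$ \phi(x) := z(x), \qquad \pi(x) := -i\frac{\delta}{\delta\phi(x)}, \qquad [\phi(x), \pi(x')] = i\delta(x - x'). \tag{162} $$

We then obtain the Schrödinger-like representation for the state vector $|P_t\rangle := \int\mathcal{D}\phi\, P_t[\phi]|\phi\rangle$,

$$ \frac{\partial}{\partial t}|P_t\rangle = H|P_t\rangle \tag{163} $$

with the non-Hermitian Hamiltonian $H$ defined by

$$ H := \int\frac{dx}{x}\, i\pi(x)\phi(x) + \left[\exp\left(\frac{1}{i}\int dx\, \frac{n(x)}{x}\pi(x)\right) - 1\right]\left(\nu_0 + \int\phi(x')dx'\right). \tag{164} $$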

Since the Hamiltonian includes infinite-order "momentum" operators, the Hamiltonian is classified as a nonlocal operator, which reflects trajectory jumps in the point processes. A similar nonlocal Hamiltonian representation for the master equation can be seen in Ref. [65] in the context of path-integral representations of Lévy processes. In this sense, the field master equation can be formally regarded as a non-Hermitian quantum field theory without locality. The exponential operator $T[y] := \exp[-i\int dx\, y(x)\pi(x)]$ naturally appears because it is a translation operator: $T[y]P_t[z] = P_t[z - y]$.

We note that the Hamiltonian reduces to a local operator if we can approximately truncate the Kramers-Moyal expansion up to the second order. While the validity of such an approximation is not obvious for this linear Hawkes process, we can actually formulate such a formal approximation for the field master equation in the case of nonlinear Hawkes processes [66].

VI. CONCLUSION

We have presented an analytical framework of the Hawkes process for an arbitrary memory kernel, based on the master equation governing the behavior of auxiliary field variables. We have derived systematically the corresponding functional master equation for the auxiliary field variables. While the Hawkes point process is non-Markovian by construction, the introduction of auxiliary field variables provides a formulation in terms of linear stochastic partial differential equations that are Markovian. For the case of a memory kernel decaying as a single exponential, we presented the exact time-dependent and steady-state solutions for the probability density function (PDF) of the Hawkes intensities, using the Laplace representation of the master equation. For memory kernels represented as arbitrary sums of exponentials (discrete and continuous sums), we derived the asymptotic solutions of the Lagrange-Charpit equations for the hyperbolic master equations in the Laplace representation in the steady state, close to the critical point $n = 1$ of the Hawkes process, where $n$ is the branching ratio. Our theory predicts a power law scaling of the PDF of the intensities in an intermediate asymptotic regime, which crosses over to an asymptotic exponential function beyond a characteristic intensity that diverges as the critical condition is approached ($n \to 1$). The exponent of the PDF is nonuniversal and a function of the background intensity $\nu_0$ of the Hawkes intensity and of the parameter $\alpha = n\langle\tau\rangle$, where $\langle\tau\rangle$ is the first-order moment of the distribution of timescales of the memory function of the Hawkes process. We found that the larger the memory $\langle\tau\rangle$, the larger the background intensity $\nu_0$, and the larger the branching ratio $n$, the smaller is the exponent $1 - 2\nu_0\alpha$ of the PDF of Hawkes intensities.

This work provides the basic analytical tools to analyze Hawkes processes from a different angle than hitherto developed and will be useful to study more general and complex models derived from the Hawkes process. For instance, it is straightforward to extend our treatment to the case where each event has a mark quantifying its impact or "fertility," thus defining the more general Hawkes process with intensity $\hat{\nu}(t) = \nu_0 + n\sum_{i=1}^{\hat{N}(t)}\hat{\rho}_i h(t - \hat{t}_i)$ with independent and identically distributed random numbers $\{\hat{\rho}_i\}_i$. Our framework is also well suited to nonlinear generalizations of the Hawkes process, for instance with the intensity taking the form $\hat{\nu}(t) = g(\hat{\omega}(t)) > 0$, where the auxiliary variable $\hat{\omega}$ is given by $\hat{\omega}(t) = \omega_0 + n\sum_{i=1}^{\hat{N}(t)}\hat{\rho}_i h(t - \hat{t}_i)$ and where the times $\{\hat{t}_i\}_i$ of the events are determined from the intensity $\hat{\nu}(t)$. In this nonlinear version, the positivity of $\hat{\omega}(t)$ and $\hat{\rho}_i$ is not required anymore. This nonlinear Hawkes process is more complex than the linear Hawkes process but our framework can be applied to derive its most important analytical properties [66]. We note that such nonlinear Hawkes processes include several models that have been proposed in the past, with applications to explain the multifractal properties of earthquake seismicity [67] and financial volatility [68].

ACKNOWLEDGMENTS

This work was supported by the Japan Society for the Promotion of Science KAKENHI (Grants No. 16K16016 and No. 20H05526) and Intramural Research Promotion Program in the University of Tsukuba. We are grateful for useful remarks on the manuscript provided by M. Schatz, A. Wehrli, S. Wheatley, H. Takayasu, and M. Takayasu.


APPENDIX A: EXPLICIT DERIVATION OF THE MASTER EQUATIONS

1. Derivation of Eqs. (27) and (31)

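Given the dynamical equations (26) for the excess intensities $\{\hat{z}_k\}_{k=1,\dots,K}$, which are short-hand notations for the dynamics given by Eq. (18), the stochastic time evolution of an arbitrary function $f(\hat{\mathbf{z}})$ reads

$$ df(\hat{\mathbf{z}}(t)) = f(\hat{\mathbf{z}}(t+dt)) - f(\hat{\mathbf{z}}(t)) = \begin{cases} -\displaystyle\sum_{k=1}^K\frac{\hat{z}_k}{\tau_k}\frac{\partial f(\hat{\mathbf{z}})}{\partial\hat{z}_k}dt & (\text{no jump during } [t, t+dt);\ \text{probability} = 1 - \hat{\nu}(t)dt) \\ f(\hat{\mathbf{z}}(t) + \mathbf{h}) - f(\hat{\mathbf{z}}(t)) & (\text{jump in } [t, t+dt);\ \text{probability} = \hat{\nu}(t)dt) \end{cases} \tag{A1} $$

with jump size vector $\mathbf{h}$ and Hawkes intensity $\hat{\nu}$, defined by

$$ \mathbf{h} := \left(\frac{n_1}{\tau_1}, \frac{n_2}{\tau_2}, \dots, \frac{n_K}{\tau_K}\right)^T, \qquad \hat{\nu}(t) := \nu_0 + \sum_{k=1}^K\hat{z}_k(t). \tag{A2} $$

Taking the ensemble average of both sides of (A1) and after partial integration of the left-hand side, we get

$$ \int d\mathbf{z}\, \frac{\partial P_t(\mathbf{z})}{\partial t}f(\mathbf{z})dt = \int d\mathbf{z}\left[-\sum_{k=1}^K\frac{z_k}{\tau_k}\frac{\partial f(\mathbf{z})}{\partial z_k}dt + \left(\nu_0 + \sum_{k=1}^K z_k\right)dt\{f(\mathbf{z} + \mathbf{h}) - f(\mathbf{z})\}\right]P_t(\mathbf{z}). \tag{A3} $$

After partial integration of the right-hand side and making the change of variable $\mathbf{z} + \mathbf{h} \to \mathbf{z}$, we obtain

$$ \int d\mathbf{z}\, \frac{\partial P_t(\mathbf{z})}{\partial t}f(\mathbf{z}) = \int d\mathbf{z}\left[\sum_{k=1}^K\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P(\mathbf{z}) + \left\{\nu_0 + \sum_{k=1}^K(z_k - h_k)\right\}P(\mathbf{z} - \mathbf{h}) - \left\{\nu_0 + \sum_{k=1}^K z_k\right\}P(\mathbf{z})\right]f(\mathbf{z}). \tag{A4} $$

Since this is an identity for an arbitrary $f(\mathbf{z})$, we obtain Eq. (27). We derive the corresponding Laplace representation (31) as follows. Let us apply the Laplace transform to both sides of Eq. (27),

$$ \mathcal{L}_K\left[\frac{\partial P_t(\mathbf{z})}{\partial t}\right] = \mathcal{L}_K\left[\sum_{k=1}^K\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P(\mathbf{z}) + \left\{\nu_0 + \sum_{k=1}^K(z_k - h_k)\right\}P(\mathbf{z} - \mathbf{h}) - \left\{\nu_0 + \sum_{k=1}^K z_k\right\}P(\mathbf{z})\right]. \tag{A5} $$

The left-hand side is given by

$$ \mathcal{L}_K\left[\frac{\partial P_t(\mathbf{z})}{\partial t}\right] = \frac{\partial\tilde{P}_t(\mathbf{s})}{\partial t}. \tag{A6} $$

For the right-hand side, let us consider the following two relations:

$$ \begin{aligned} \mathcal{L}_K\left[\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P(\mathbf{z})\right] &= \int d\mathbf{z}\, e^{-\mathbf{s}\cdot\mathbf{z}}\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P(\mathbf{z}) = \int\prod_{i|i\neq k}dz_i\int dz_k\, e^{-\mathbf{s}\cdot\mathbf{z}}\frac{\partial}{\partial z_k}\frac{z_k}{\tau_k}P(\mathbf{z}) \\ &= \int\prod_{i|i\neq k}dz_i\left\{\left[e^{-\mathbf{s}\cdot\mathbf{z}}\frac{z_k}{\tau_k}P(\mathbf{z})\right]_{z_k=0}^{z_k=\infty} + s_k\int dz_k\, e^{-\mathbf{s}\cdot\mathbf{z}}\frac{z_k}{\tau_k}P(\mathbf{z})\right\} = s_k\int\prod_{i|i\neq k}dz_i\int dz_k\, e^{-\mathbf{s}\cdot\mathbf{z}}\frac{z_k}{\tau_k}P(\mathbf{z}) \\ &= s_k\int\prod_{i|i\neq k}dz_i\left(-\frac{1}{\tau_k}\frac{\partial}{\partial s_k}\right)\int dz_k\, e^{-\mathbf{s}\cdot\mathbf{z}}P(\mathbf{z}) = -\frac{s_k}{\tau_k}\frac{\partial}{\partial s_k}\tilde{P}_t(\mathbf{s}), \end{aligned} \tag{A7} $$

where we have used the partial integration on the second line and have used the boundary condition (28) on the third line, and

$$ \begin{aligned} \mathcal{L}_K\left[\left\{\nu_0 + \sum_{k=1}^K(z_k - h_k)\right\}P(\mathbf{z} - \mathbf{h})\right] &= \int d\mathbf{z}\, e^{-\mathbf{s}\cdot\mathbf{z}}\left\{\nu_0 + \sum_{k=1}^K(z_k - h_k)\right\}P(\mathbf{z} - \mathbf{h}) \\ &= e^{-\mathbf{s}\cdot\mathbf{h}}\int d\mathbf{z}\, e^{-\mathbf{s}\cdot(\mathbf{z} - \mathbf{h})}\left\{\nu_0 + \sum_{k=1}^K(z_k - h_k)\right\}P(\mathbf{z} - \mathbf{h}) = e^{-\mathbf{s}\cdot\mathbf{h}}\int d\mathbf{z}\, e^{-\mathbf{s}\cdot\mathbf{z}}\left\{\nu_0 + \sum_{k=1}^K z_k\right\}P(\mathbf{z}) \\ &= e^{-\mathbf{s}\cdot\mathbf{h}}\left(\nu_0 - \sum_{k=1}^K\frac{\partial}{\partial s_k}\right)\int d\mathbf{z}\, e^{-\mathbf{s}\cdot\mathbf{z}}P(\mathbf{z}) = e^{-\mathbf{s}\cdot\mathbf{h}}\left(\nu_0 - \sum_{k=1}^K\frac{\partial}{\partial s_k}\right)\tilde{P}_t(\mathbf{s}), \end{aligned} \tag{A8} $$

where we have applied the change of variable $\mathbf{z} - \mathbf{h} \to \mathbf{z}$ on the second line. By applying these two relations to the right-hand side of Eq. (A5), we obtain Eq. (31).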
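The two Laplace-transform relations (A7) and (A8) can be sanity-checked for $K = 1$ on a concrete density. The sketch below is only an illustration under stated assumptions: it takes the hypothetical choice $P(z) = z e^{-z}$, whose transform $\tilde{P}(s) = 1/(1+s)^2$ is known in closed form and which satisfies the vanishing boundary term, and compares both sides by composite Simpson quadrature. The numerical values of `tau`, `nu0`, `h_jump`, and `s` are arbitrary.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


# Hypothetical values for the K = 1 check
tau, nu0, h_jump, s = 2.0, 0.3, 0.4, 0.7
Z_MAX = 60.0          # upper cutoff; the integrands are negligible beyond it


def P(z):
    """Toy density z * exp(-z) on z > 0 (an assumption of this sketch)."""
    return z * math.exp(-z) if z > 0.0 else 0.0


def P_tilde(sv):
    """Closed-form Laplace transform of the toy density."""
    return 1.0 / (1.0 + sv) ** 2


def dP_tilde(sv):
    """Derivative of the Laplace transform with respect to s."""
    return -2.0 / (1.0 + sv) ** 3


# (A7): L[d/dz (z/tau) P] = -(s/tau) dP~/ds, using d/dz[z^2 e^{-z}] = (2z - z^2) e^{-z}
lhs_A7 = simpson(lambda z: math.exp(-s * z) * (2.0 * z - z * z) * math.exp(-z) / tau,
                 0.0, Z_MAX)
rhs_A7 = -(s / tau) * dP_tilde(s)

# (A8): L[(nu0 + z - h) P(z - h)] = exp(-s h) (nu0 - d/ds) P~(s)
lhs_A8 = simpson(lambda z: math.exp(-s * z) * (nu0 + z - h_jump) * P(z - h_jump),
                 h_jump, h_jump + Z_MAX)
rhs_A8 = math.exp(-s * h_jump) * (nu0 * P_tilde(s) - dP_tilde(s))

print("A7:", lhs_A7, rhs_A7)
print("A8:", lhs_A8, rhs_A8)
```

The shifted integral in (A8) starts at $z = h$ because the toy density vanishes for negative arguments, which mirrors the change of variable used on the second line of (A8).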


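2. Derivation of Eq. (40)

The Hawkes intensity $\hat{\nu}$ is defined by

$$ \hat{\nu}(t) := \nu_0 + \int_0^\infty\hat{z}(t, x)dx, \tag{A9} $$

in terms of the continuous field of excess intensities $\{\hat{z}(t, x)\}_{x\in(0,\infty)}$. For an arbitrary functional $f[\hat{z}]$, let us consider its stochastic time evolution:

$$ df[\hat{z}] = f[\{\hat{z}(t+dt, x)\}_x] - f[\{\hat{z}(t, x)\}_x] = \begin{cases} -dt\displaystyle\int_0^\infty dx\, \frac{\hat{z}(t, x)}{x}\frac{\delta f[\hat{z}]}{\delta\hat{z}(x)} & (\text{no jump during } [t, t+dt);\ \text{probability} = 1 - \hat{\nu}(t)dt) \\ f[\hat{z} + n/x] - f[\hat{z}] & (\text{jump in } [t, t+dt);\ \text{probability} = \hat{\nu}(t)dt), \end{cases} \tag{A10} $$

where we have used the functional Taylor expansion

$$ f[z + \eta] - f[z] = \sum_{k=1}^\infty\frac{1}{k!}\int dx_1\dots dx_k\, \frac{\delta^k f[z]}{\delta z(x_1)\dots\delta z(x_k)}\eta(x_1)\dots\eta(x_k) \tag{A11} $$

up to first order. Taking the ensemble average of both sides of (A10) yields

$$ \int\mathcal{D}z\, f[z]\frac{\partial P_t[z]}{\partial t}dt = \int\mathcal{D}z\left[-\int_0^\infty dx\, \frac{z(x)}{x}\frac{\delta f[z]}{\delta z(x)}dt + \left(\nu_0 + \int_0^\infty z(x)dx\right)dt\{f[z + n/x] - f[z]\}\right]P_t[z]. \tag{A12} $$

By partial integration and with the change of variable $z + n/x \to z$, we obtain

$$ \int\mathcal{D}z\, f[z]\frac{\partial P_t[z]}{\partial t} = \int\mathcal{D}z\left[\int_0^\infty dx\, \frac{\delta}{\delta z}\frac{z}{x}P_t[z] + \left\{\nu_0 + \int_0^\infty\left(z - \frac{n}{x}\right)dx\right\}P_t[z - n/x] - \left\{\nu_0 + \int_0^\infty z\,dx\right\}P[z]\right]f[z]. \tag{A13} $$

Since this is an identity for arbitrary $f[z]$, we obtain Eq. (40).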

APPENDIX B: DERIVATIONS OF THE POWER LAW PDF OF HAWKES INTENSITIES FOR THE EXPONENTIAL MEMORY KERNEL

Here, we provide two different derivations of the power law PDF (53) of Hawkes intensities for the exponential memory kernel (13).

1. Introduction of a UV cutoff

We now investigate the steady solution of the master equation (19) for the probability density function (PDF) $P_t(z)$ of the excess intensity $\hat{z}$ (14), at the critical point $n = 1$.

Let us introduce a UV cutoff $s_{\rm uv}$ to address the singularity at $s \to 0$ so that we can express

$$ \tilde{Q}_{\rm ss}(s) \simeq \exp\left[-\frac{\nu_0}{\tau}\int_{s_{\rm uv}}^s\frac{s'ds'}{e^{-s'/\tau} - 1 + s'/\tau}\right] = \exp\left[-\nu_0(s - s_{\rm uv}) - \nu_0\tau\log\frac{e^{-s/\tau} - 1 + s/\tau}{e^{-s_{\rm uv}/\tau} - 1 + s_{\rm uv}/\tau}\right]. \tag{B1} $$

Recall that $\log\tilde{Q}_{\rm ss}(s) = -s\nu_0 + \log\tilde{P}_{\rm ss}(s)$ and that $\tilde{P}_{\rm ss}(s) := \int_0^\infty dz\, e^{-sz}P_{\rm ss}(z)$ is the Laplace transform of the steady state of the master equation (19). The introduction of this UV cutoff $s_{\rm uv}$ amounts to introducing a cutoff in the memory function $h(t)$ at large timescale [i.e., there exists $t_{\rm cut}$ such that $h(t)$ is negligible for $t > t_{\rm cut}$]. The validity of this approximation is confirmed by considering the time-dependent solution (see Sec. IV B 2). At the critical point $n = 1$, it has an asymptotic form for small $s_{\rm uv} < s \ll \tau$,

$$ \log\tilde{Q}_{\rm ss}(s) \simeq -\frac{\nu_0}{\tau}\int_{s_{\rm uv}}^s\frac{s'ds'}{e^{-s'/\tau} - 1 + s'/\tau} \sim -\frac{\nu_0}{\tau}\int_{s_{\rm uv}}^s\frac{2\tau^2 ds'}{s'} = -2\nu_0\tau\log\frac{s}{s_{\rm uv}}, \tag{B2} $$

which implies the power-law relation for the tail distribution:

$$ P_{\rm ss}(\nu) \propto \nu^{-1+2\nu_0\tau} \tag{B3} $$

for $0 \ll \nu \ll \nu_{\max} := 1/s_{\rm uv}$.

2. Kramers-Moyal approach

We can derive relation (53) using the Kramers-Moyal expansion of the master equation (19). Let us consider the expansion

$$ (\nu_0 + z - n/\tau)P_t(z - n/\tau) = \sum_{k=0}^\infty\frac{1}{k!}\left(-\frac{n}{\tau}\right)^k\frac{\partial^k}{\partial z^k}(\nu_0 + z)P_t(z). \tag{B4} $$

By truncating the series at the second order, we obtain the Fokker-Planck equation at the critical point $n = 1$ in the steady state:

$$ \left[-\frac{\nu_0}{\tau}\frac{\partial}{\partial z} + \frac{1}{2\tau^2}\frac{\partial^2}{\partial z^2}(\nu_0 + z)\right]P_{\rm ss}(z) \simeq 0, \tag{B5} $$

for $z \to \infty$. We thus obtain an asymptotic formula

$$ P_{\rm ss}(z) \sim (\nu_0 + z)^{-1+2\nu_0\tau}, \quad \text{for large } z. \tag{B6} $$
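Both routes of this appendix can be cross-checked numerically. The sketch below is a hedged illustration with arbitrary parameter values (`tau`, `nu0`, `s_uv`, `s_top` are assumptions). It first compares the integral in Eq. (B1) with the closed form $\tau(s - s_{\rm uv}) + \tau^2\log[D(s)/D(s_{\rm uv})]$, where $D(s) := e^{-s/\tau} - 1 + s/\tau$, and then verifies by finite differences that the power law (B6) makes the left-hand side of the truncated equation (B5) vanish.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


# Hypothetical parameter values
tau, nu0 = 1.5, 0.2
s_uv, s_top = 0.01, 2.0


def D(sv):
    """exp(-s/tau) - 1 + s/tau, evaluated stably with expm1 for small s."""
    return math.expm1(-sv / tau) + sv / tau


# Check of the closed form behind Eq. (B1)
numeric_integral = simpson(lambda sv: sv / D(sv), s_uv, s_top)
closed_form = tau * (s_top - s_uv) + tau ** 2 * math.log(D(s_top) / D(s_uv))

# Check that (B6) solves the steady Fokker-Planck equation (B5)
beta = -1.0 + 2.0 * nu0 * tau


def P_tail(z):
    """Candidate solution (nu0 + z)^(-1 + 2 nu0 tau) of Eq. (B6)."""
    return (nu0 + z) ** beta


def residual(z, h=1e-3):
    """Left-hand side of Eq. (B5) evaluated with central differences."""
    dP = (P_tail(z + h) - P_tail(z - h)) / (2.0 * h)
    G = lambda x: (nu0 + x) * P_tail(x)
    d2G = (G(z + h) - 2.0 * G(z) + G(z - h)) / h ** 2
    return -(nu0 / tau) * dP + d2G / (2.0 * tau ** 2)


residuals = [residual(z) for z in (5.0, 20.0, 80.0)]

print("integral:", numeric_integral, " closed form:", closed_form)
print("residuals of (B5):", residuals)
```

For these parameters the power law is an exact zero of the truncated operator, so the residuals reflect only the finite-difference error.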


This solution is consistent with the truncation of the Kramers-Moyal series at the second order, which consists in removing negligible higher-order terms. Indeed, for $l \geq 3$, we obtain

$$\left|\frac{\partial^2}{\partial z^2}(\nu_0+z)P_{ss}(z)\right| \gg \left|\frac{\partial^l}{\partial z^l}(\nu_0+z)P_{ss}(z)\right|, \quad \text{for large } z. \tag{B7}$$

APPENDIX C: ELEMENTARY SUMMARY OF THE METHOD OF CHARACTERISTICS

The method of characteristics is a standard method to solve first-order PDEs. Here we focus on linear first-order PDEs that are relevant to the derivation of the PDF of Hawkes intensities. Let us consider the following PDE:

$$a(x,y,z)\frac{\partial z(x,y)}{\partial x} + b(x,y,z)\frac{\partial z(x,y)}{\partial y} = c(x,y,z). \tag{C1}$$

According to the method of characteristics, we consider the corresponding Lagrange-Charpit equations:

$$\frac{dx}{dl} = -a(x,y,z), \tag{C2a}$$

$$\frac{dy}{dl} = -b(x,y,z), \tag{C2b}$$

$$\frac{dz}{dl} = -c(x,y,z), \tag{C2c}$$

with the parameter $l$ encoding the position along the characteristic curves. These equations are equivalent to an invariant form in terms of $l$:

$$\frac{dx}{a(x,y,z)} = \frac{dy}{b(x,y,z)} = \frac{dz}{c(x,y,z)}. \tag{C3}$$

Let us write their formal solutions as $C_1 = F_1(x,y,z)$ and $C_2 = F_2(x,y,z)$ with constants of integration $C_1$ and $C_2$. The general solution of the original PDE (C1) is given by

$$\phi(F_1(x,y,z), F_2(x,y,z)) = 0 \tag{C4}$$

with an arbitrary function $\phi(C_1, C_2)$, which is determined by the initial or boundary condition of the PDE (C1). This method can be readily generalized to systems with many variables.

APPENDIX D: ANALYTICAL DERIVATION OF SOME MAIN PROPERTIES OF $\mathcal{F}(s)$ (58)

Here, we derive properties ($\alpha$1)–($\alpha$6) of $\mathcal{F}(s)$ (58). First, the following relations hold true:

$$s > \frac{\tau}{n}\log n \;\Longrightarrow\; \frac{d}{ds}\left(e^{-ns/\tau}-1+s/\tau\right) = -\frac{n}{\tau}e^{-ns/\tau} + \frac{1}{\tau} > 0 \tag{D1}$$

and

$$\lim_{s\to\infty}\left(e^{-ns/\tau}-1+s/\tau\right) = \infty. \tag{D2}$$

These relations guarantee that there exists $s_0 > 0$ such that $e^{-ns/\tau}-1+s/\tau > 0$ for $s > s_0$. Therefore, the integrand is positive definite,

$$\frac{1}{e^{-ns/\tau}-1+s/\tau} > 0 \tag{D3}$$

for $s > s_0$ by choosing an appropriate positive $s_0$. Since the integrand is positive definite, the statement ($\alpha$1) is proved. As a corollary of ($\alpha$1), the statement ($\alpha$2) is proved.

We next study $\mathcal{F}(s)$ in the subcritical regime ($n < 1$). For $n < 1$, the statement ($\alpha$3) is true because $e^{-ns/\tau}-1+s/\tau > 0$ for any $s > 0$ and $(\tau/n)\log n < 0$. The statement ($\alpha$4) is true because, for $0 < s < s_0 \ll \tau/n$, we obtain

$$\mathcal{F}(s) \simeq \frac{\tau}{1-n}\log\frac{s}{s_0} \to -\infty \tag{D4}$$

for $s \to +0$. The statement ($\alpha$5) is correct because

$$\mathcal{F}(s) \simeq \int_{c_0}^{s}\frac{ds}{s/\tau} + c_1 = \tau\log\frac{s}{c_0} + c_1 \to +\infty \tag{D5}$$

for $s \to +\infty$ with constants $c_0$ and $c_1$. As a corollary of ($\alpha$1), ($\alpha$4), and ($\alpha$5), the statement ($\alpha$6) is proved.

APPENDIX E: ANALYTICAL DERIVATION OF THE TIME-DEPENDENT SOLUTION (63)

1. Consistency check 1: Convergence to the steady solution

Let us check that the time-dependent solution (63) is consistent with the steady solution (49) for $n < 1$. To prove this, it is sufficient to show that

$$\lim_{x\to\infty}\mathcal{H}(x) = 0. \tag{E1}$$

Using ($\alpha$1), ($\alpha$4), and ($\alpha$5), we obtain

$$\lim_{x\to\infty}\mathcal{S}(x) = \lim_{x\to-\infty}\mathcal{F}^{-1}(x) = 0, \tag{E2}$$

and thus

$$\lim_{x\to\infty}\mathcal{H}(x) = \lim_{\mathcal{S}\to 0}\left[\log\tilde{P}_{t=0}(\mathcal{S}) + \frac{\nu_0}{\tau}\int_0^{\mathcal{S}}\frac{s\,ds}{e^{-ns/\tau}-1+s/\tau}\right] = 0. \tag{E3}$$

2. Consistency check 2: Relaxation dynamics of the average $\langle\hat{\nu}(t)\rangle$ at finite times

Let us now study the dynamics of the average intensity $\langle\hat{\nu}(t)\rangle$ via the time-dependent formula (63). Below the critical point, we can use the renormalized expression (61). Then the integral $\mathcal{F}(s)$ can be asymptotically evaluated for small $0 < s \ll \tau/n$:

$$\mathcal{F}(s) \simeq \frac{\tau}{1-n}\log s \to -\infty \quad (s \to 0). \tag{E4}$$

This means that the argument $x(t,s) = t - \mathcal{F}(s)$ shows the divergence

$$x(t,s) = t - \mathcal{F}(s) \simeq t - \frac{\tau}{1-n}\log s \to +\infty. \tag{E5}$$

From Eq. (E4), the inverse function shows the asymptotic behavior for large $x$:

$$\mathcal{S}(x) = \mathcal{F}^{-1}(-x) \simeq \exp\left[-\frac{1-n}{\tau}x\right]. \tag{E6}$$

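By substituting the relation (E5), we obtain

$$\mathcal{S}(x(t,s)) = \mathcal{F}^{-1}(-x(t,s)) \simeq \exp\left[-\frac{1-n}{\tau}\left(t - \frac{\tau}{1-n}\log s\right)\right] = s\,e^{-(1-n)t/\tau}. \tag{E7}$$

We now assume the initial condition $\hat{\nu}(0) = \nu_{\rm ini}$, or equivalently $\log\tilde{P}_{t=0}(s) = -\nu_{\rm ini}s$. From Eq. (63), we thus obtain the relaxation dynamics for the tail $s \simeq 0$,

$$\log\tilde{P}_t(s) \simeq -s\,e^{-(1-n)t/\tau}\nu_{\rm ini} - \frac{\nu_0}{\tau}\int_{s e^{-(1-n)t/\tau}}^{s}\frac{s\,ds}{e^{-ns/\tau}-1+s/\tau} \simeq -\left[\nu_{\rm ini}e^{-(1-n)t/\tau} + \frac{\nu_0\left(1-e^{-(1-n)t/\tau}\right)}{1-n}\right]s, \tag{E8}$$

where the last step uses $e^{-ns/\tau}-1+s/\tau \simeq (1-n)s/\tau$ for small $s$. This means that the average of $\hat{\nu}(t)$ is given by

$$\langle\hat{\nu}(t)\rangle = -\frac{d}{ds}\log\tilde{P}_t(s)\bigg|_{s=0} = \nu_{\rm ini}e^{-(1-n)t/\tau} + \frac{\nu_0\left(1-e^{-(1-n)t/\tau}\right)}{1-n}. \tag{E9}$$

3. Derivation of the asymptotic formula (68)

Let us derive the asymptotic relaxation formula (68) for sufficiently large $t$, satisfying

$$t \gg \mathcal{F}(s) \tag{E10}$$

for a given $s$. Under such a condition, the asymptotic relation for the inverse function $\mathcal{F}^{-1}(-x(t,s))$ is available as Eq. (E6) with $x(t,s) = t - \mathcal{F}(s)$. We then obtain

$$\mathcal{F}^{-1}(-x(t,s)) \simeq \exp\left[-\frac{1-n}{\tau}\left(t - \mathcal{F}_R(s) - \frac{\tau}{1-n}\log s\right)\right] = s\exp\left[-\frac{1-n}{\tau}\left(t - \mathcal{F}_R(s)\right)\right] \tag{E11}$$

for sufficiently large $t$. By substituting this into Eq. (63), we obtain Eq. (68).

APPENDIX F: PROOFS OF MATHEMATICAL PROPERTIES OF $H$ (109)

Here, we summarize the proofs of the main mathematical properties of $H$ (109) for arbitrary values of $K$.

1. Proof that its eigenvalues are real

All eigenvalues of $H$ are real numbers for the following reasons. $H$ can be symmetrized as $\bar{H}$, defined by

$$\bar{H} := AHA^{-1} = \begin{pmatrix} \dfrac{1-n_1}{\tau_1} & -\dfrac{\sqrt{n_1 n_2}}{\sqrt{\tau_1\tau_2}} & \dots & -\dfrac{\sqrt{n_1 n_K}}{\sqrt{\tau_1\tau_K}} \\ -\dfrac{\sqrt{n_2 n_1}}{\sqrt{\tau_2\tau_1}} & \dfrac{1-n_2}{\tau_2} & \dots & -\dfrac{\sqrt{n_2 n_K}}{\sqrt{\tau_2\tau_K}} \\ \vdots & \vdots & \ddots & \vdots \\ -\dfrac{\sqrt{n_K n_1}}{\sqrt{\tau_K\tau_1}} & -\dfrac{\sqrt{n_K n_2}}{\sqrt{\tau_K\tau_2}} & \dots & \dfrac{1-n_K}{\tau_K} \end{pmatrix}, \quad A := \begin{pmatrix} \sqrt{n_1/\tau_1} & 0 & \dots & 0 \\ 0 & \sqrt{n_2/\tau_2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sqrt{n_K/\tau_K} \end{pmatrix}. \tag{F1}$$

Indeed, by representing all the matrices by their elements $\bar{H} := (\bar{H}_{ij})$, $H := (H_{ij})$, and $A := (A_{ij})$, we obtain

$$\bar{H}_{ij} = \sum_{k,l}A_{ik}H_{kl}A^{-1}_{lj} = \sum_{k,l}\sqrt{\frac{n_i}{\tau_i}}\,\delta_{ik}\left(\frac{\delta_{kl}}{\tau_k} - \frac{n_l}{\tau_l}\right)\sqrt{\frac{\tau_j}{n_j}}\,\delta_{lj} = \frac{\delta_{ij} - \sqrt{n_i n_j}}{\sqrt{\tau_i\tau_j}}. \tag{F2}$$

We therefore obtain

$$H e_i = \lambda_i e_i \iff \bar{H}(A e_i) = \lambda_i (A e_i), \tag{F3}$$

implying that any eigenvalue of $H$ is the same as that of $\bar{H}$. Because $\bar{H}$ is a symmetric matrix, all the eigenvalues of $\bar{H}$ are real. Therefore, all the eigenvalues of $H$ are also real.

2. Determinant

Here, we derive the determinant $\det H$ for arbitrary values of $K$. Let us recall the following identity, showing the invariance of determinants under the addition of a multiple of one row to another:

$$\det H = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 \\ \vdots \\ \boldsymbol{a}_j \\ \vdots \\ \boldsymbol{a}_K \end{pmatrix} = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 \\ \vdots \\ \boldsymbol{a}_j + c\,\boldsymbol{a}_k \\ \vdots \\ \boldsymbol{a}_K \end{pmatrix}, \tag{F4}$$

where $\boldsymbol{a}_j$ denotes the $j$th row vector of $H$, $c$ is an arbitrary constant, and $k \neq j$. This implies

$$\det H = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 \\ \boldsymbol{a}_3 \\ \vdots \\ \boldsymbol{a}_K \end{pmatrix} = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 - \boldsymbol{a}_1 \\ \boldsymbol{a}_3 \\ \vdots \\ \boldsymbol{a}_K \end{pmatrix} = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 - \boldsymbol{a}_1 \\ \boldsymbol{a}_3 - \boldsymbol{a}_1 \\ \vdots \\ \boldsymbol{a}_K \end{pmatrix} = \dots = \det\begin{pmatrix} \boldsymbol{a}_1 \\ \boldsymbol{a}_2 - \boldsymbol{a}_1 \\ \boldsymbol{a}_3 - \boldsymbol{a}_1 \\ \vdots \\ \boldsymbol{a}_K - \boldsymbol{a}_1 \end{pmatrix} =: \det\begin{pmatrix} \boldsymbol{a}'_1 \\ \boldsymbol{a}'_2 \\ \boldsymbol{a}'_3 \\ \vdots \\ \boldsymbol{a}'_K \end{pmatrix} \tag{F5}$$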

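and

$$\det H = \det\begin{pmatrix} \boldsymbol{a}'_1 \\ \boldsymbol{a}'_2 \\ \vdots \\ \boldsymbol{a}'_K \end{pmatrix} = \det\begin{pmatrix} \boldsymbol{a}'_1 + n_2\boldsymbol{a}'_2 \\ \boldsymbol{a}'_2 \\ \vdots \\ \boldsymbol{a}'_K \end{pmatrix} = \det\begin{pmatrix} \boldsymbol{a}'_1 + n_2\boldsymbol{a}'_2 + n_3\boldsymbol{a}'_3 \\ \boldsymbol{a}'_2 \\ \vdots \\ \boldsymbol{a}'_K \end{pmatrix} = \dots = \det\begin{pmatrix} \boldsymbol{a}'_1 + \sum_{k=2}^{K}n_k\boldsymbol{a}'_k \\ \boldsymbol{a}'_2 \\ \vdots \\ \boldsymbol{a}'_K \end{pmatrix} \tag{F6}$$

with constants $\{n_k\}_k$. Using these relations, the determinant of $H$ is given by

$$\begin{aligned} \det H &= \det\begin{pmatrix} -n_1/\tau_1 + 1/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K \\ -n_1/\tau_1 & -n_2/\tau_2 + 1/\tau_2 & \dots & -n_K/\tau_K \\ \vdots & \vdots & \ddots & \vdots \\ -n_1/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K + 1/\tau_K \end{pmatrix} \begin{matrix} \leftarrow \boldsymbol{a}_1 \\ \leftarrow \boldsymbol{a}_2 \\ \vdots \\ \leftarrow \boldsymbol{a}_K \end{matrix} \\[1ex] &= \det\begin{pmatrix} (1-n_1)/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K \\ -1/\tau_1 & 1/\tau_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -1/\tau_1 & 0 & \dots & 1/\tau_K \end{pmatrix} \begin{matrix} \leftarrow \boldsymbol{a}'_1 = \boldsymbol{a}_1 \\ \leftarrow \boldsymbol{a}'_2 = \boldsymbol{a}_2 - \boldsymbol{a}_1 \\ \vdots \\ \leftarrow \boldsymbol{a}'_K = \boldsymbol{a}_K - \boldsymbol{a}_1 \end{matrix} \\[1ex] &= \det\begin{pmatrix} \left(1-\sum_{k=1}^{K}n_k\right)/\tau_1 & 0 & \dots & 0 \\ -1/\tau_1 & 1/\tau_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -1/\tau_1 & 0 & \dots & 1/\tau_K \end{pmatrix} \begin{matrix} \leftarrow \boldsymbol{a}''_1 = \boldsymbol{a}'_1 + \sum_{k=2}^{K}n_k\boldsymbol{a}'_k \\ \leftarrow \boldsymbol{a}''_2 = \boldsymbol{a}'_2 \\ \vdots \\ \leftarrow \boldsymbol{a}''_K = \boldsymbol{a}'_K \end{matrix} \\[1ex] &= \frac{1-\sum_{k=1}^{K}n_k}{\tau_1\dots\tau_K}. \end{aligned} \tag{F7}$$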
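The closed form (F7) can be spot-checked with exact rational arithmetic. The sketch below is illustrative only: the helper names `build_H`, `det`, and `det_formula` are hypothetical, and the matrix elements $H_{ij} = (\delta_{ij} - n_j)/\tau_j$ are read off from the first matrix in (F7). The parameters are assumed to be passed as `Fraction` objects so that the comparison is exact.

```python
from fractions import Fraction


def build_H(ns, taus):
    """H_ij = (delta_ij - n_j) / tau_j, as in the first matrix of (F7)."""
    K = len(ns)
    return [[(Fraction(int(i == j)) - ns[j]) / taus[j] for j in range(K)]
            for i in range(K)]


def det(M):
    """Exact determinant by Gaussian elimination on a copy of M."""
    A = [row[:] for row in M]
    K, sign, result = len(A), 1, Fraction(1)
    for c in range(K):
        # Find a row with a nonzero entry in column c to use as pivot.
        pivot = next((r for r in range(c, K) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= A[c][c]
        for r in range(c + 1, K):
            factor = A[r][c] / A[c][c]
            A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return sign * result


def det_formula(ns, taus):
    """Closed form (F7): (1 - sum_k n_k) / (tau_1 ... tau_K)."""
    prod = Fraction(1)
    for t in taus:
        prod *= t
    return (1 - sum(ns)) / prod
```

For example, with $n_k = (1/5, 3/10, 1/10)$ and $\tau_k = (1, 2, 5)$, both `det(build_H(ns, taus))` and `det_formula(ns, taus)` evaluate to $1/25$; when the $n_k$ sum to one, the determinant vanishes.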

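3. Inverse matrix

Here we derive the inverse matrix of $H$ for arbitrary values of $K$, writing $n := \sum_{k=1}^{K}n_k$. The inverse matrix is derived by row reduction of the augmented matrix:

$$\begin{aligned} &\left(\begin{array}{cccc|cccc} -n_1/\tau_1 + 1/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K & 1 & 0 & \dots & 0 \\ -n_1/\tau_1 & -n_2/\tau_2 + 1/\tau_2 & \dots & -n_K/\tau_K & 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -n_1/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K + 1/\tau_K & 0 & 0 & \dots & 1 \end{array}\right) \begin{matrix} \leftarrow \boldsymbol{b}_1 \\ \leftarrow \boldsymbol{b}_2 \\ \vdots \\ \leftarrow \boldsymbol{b}_K \end{matrix} \\[1ex] \to{} &\left(\begin{array}{cccc|cccc} (1-n_1)/\tau_1 & -n_2/\tau_2 & \dots & -n_K/\tau_K & 1 & 0 & \dots & 0 \\ -1/\tau_1 & 1/\tau_2 & \dots & 0 & -1 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -1/\tau_1 & 0 & \dots & 1/\tau_K & -1 & 0 & \dots & 1 \end{array}\right) \begin{matrix} \leftarrow \boldsymbol{b}'_1 = \boldsymbol{b}_1 \\ \leftarrow \boldsymbol{b}'_2 = \boldsymbol{b}_2 - \boldsymbol{b}_1 \\ \vdots \\ \leftarrow \boldsymbol{b}'_K = \boldsymbol{b}_K - \boldsymbol{b}_1 \end{matrix} \\[1ex] \to{} &\left(\begin{array}{cccc|cccc} (1-n)/\tau_1 & 0 & \dots & 0 & 1-\sum_{k=2}^{K}n_k & n_2 & \dots & n_K \\ -1/\tau_1 & 1/\tau_2 & \dots & 0 & -1 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -1/\tau_1 & 0 & \dots & 1/\tau_K & -1 & 0 & \dots & 1 \end{array}\right) \begin{matrix} \leftarrow \boldsymbol{b}''_1 = \boldsymbol{b}'_1 + \sum_{k=2}^{K}n_k\boldsymbol{b}'_k \\ \leftarrow \boldsymbol{b}''_2 = \boldsymbol{b}'_2 \\ \vdots \\ \leftarrow \boldsymbol{b}''_K = \boldsymbol{b}'_K \end{matrix} \\[1ex] \to{} &\left(\begin{array}{cccc|cccc} 1 & 0 & \dots & 0 & \tau_1 + \tau_1 n_1/(1-n) & \tau_1 n_2/(1-n) & \dots & \tau_1 n_K/(1-n) \\ -\tau_2/\tau_1 & 1 & \dots & 0 & -\tau_2 & \tau_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ -\tau_K/\tau_1 & 0 & \dots & 1 & -\tau_K & 0 & \dots & \tau_K \end{array}\right) \begin{matrix} \leftarrow \boldsymbol{b}'''_1 = \frac{\tau_1}{1-n}\boldsymbol{b}''_1 \\ \leftarrow \boldsymbol{b}'''_2 = \tau_2\boldsymbol{b}''_2 \\ \vdots \\ \leftarrow \boldsymbol{b}'''_K = \tau_K\boldsymbol{b}''_K \end{matrix} \\[1ex] \to{} &\left(\begin{array}{cccc|cccc} 1 & 0 & \dots & 0 & \tau_1 + \tau_1 n_1/(1-n) & \tau_1 n_2/(1-n) & \dots & \tau_1 n_K/(1-n) \\ 0 & 1 & \dots & 0 & \tau_2 n_1/(1-n) & \tau_2 + \tau_2 n_2/(1-n) & \dots & \tau_2 n_K/(1-n) \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 & \tau_K n_1/(1-n) & \tau_K n_2/(1-n) & \dots & \tau_K + \tau_K n_K/(1-n) \end{array}\right) \begin{matrix} \leftarrow \boldsymbol{b}''''_1 = \boldsymbol{b}'''_1 \\ \leftarrow \boldsymbol{b}''''_2 = \boldsymbol{b}'''_2 + \frac{\tau_2}{\tau_1}\boldsymbol{b}'''_1 \\ \vdots \\ \leftarrow \boldsymbol{b}''''_K = \boldsymbol{b}'''_K + \frac{\tau_K}{\tau_1}\boldsymbol{b}'''_1 \end{matrix} \end{aligned} \tag{F8}$$

which implies

$$H^{-1} = \begin{pmatrix} \tau_1 + \tau_1 n_1/(1-n) & \tau_1 n_2/(1-n) & \dots & \tau_1 n_K/(1-n) \\ \tau_2 n_1/(1-n) & \tau_2 + \tau_2 n_2/(1-n) & \dots & \tau_2 n_K/(1-n) \\ \vdots & \vdots & \ddots & \vdots \\ \tau_K n_1/(1-n) & \tau_K n_2/(1-n) & \dots & \tau_K + \tau_K n_K/(1-n) \end{pmatrix} \tag{F9}$$

or equivalently

$$H^{-1}_{ij} = \tau_i\delta_{ij} + \frac{\tau_i n_j}{1-n} \tag{F10}$$

in the representation by matrix elements. As a check of the above calculation, we can directly confirm the following relation, defining the inverse matrix: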

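$$HH^{-1} = I \iff \sum_{j=1}^{K}H_{ij}H^{-1}_{jk} = \sum_{j=1}^{K}\left(-\frac{n_j}{\tau_j} + \frac{1}{\tau_j}\delta_{ij}\right)\left(\tau_j\delta_{jk} + \frac{\tau_j n_k}{1-n}\right) = \delta_{ik}. \tag{F11}$$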
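The identity (F11) can also be confirmed numerically for sample parameters. The following is a minimal sketch with exact rational arithmetic; the function names are hypothetical, the elements of $H$ are taken as $H_{ij} = (\delta_{ij} - n_j)/\tau_j$, and $n$ is assumed to denote $\sum_k n_k$.

```python
from fractions import Fraction


def build_H(ns, taus):
    """H_ij = (delta_ij - n_j) / tau_j."""
    K = len(ns)
    return [[(Fraction(int(i == j)) - ns[j]) / taus[j] for j in range(K)]
            for i in range(K)]


def build_H_inv(ns, taus):
    """(F10): (H^{-1})_ij = tau_i delta_ij + tau_i n_j / (1 - n), with n = sum_k n_k."""
    K = len(ns)
    n = sum(ns)
    return [[taus[i] * int(i == j) + taus[i] * ns[j] / (1 - n) for j in range(K)]
            for i in range(K)]


def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    K = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(K)) for k in range(K)]
            for i in range(K)]


def identity(K):
    """K x K identity matrix with exact rational entries."""
    return [[Fraction(int(i == k)) for k in range(K)] for i in range(K)]
```

Both `matmul(H, H_inv)` and `matmul(H_inv, H)` should reproduce the identity matrix for any subcritical choice of parameters ($n < 1$).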

The inverse matrix has a singularity at $n = 1$, which corresponds to the critical regime of the Hawkes process.

APPENDIX G: PROOFS OF THE MATHEMATICAL PROPERTIES OF $H(x,x')$ (129)

1. Proof that the eigenvalues are real

Considering the analogy to the eigenvalue problem of the finite-dimensional matrix $H$ (109), it is obvious by taking the continuous limit that all the eigenvalues of $H(x,x')$ are positive. In this Appendix, we remark that all eigenvalues can be proved real for $H(x,x')$ by making some specific assumptions. For example, let us assume that $n(x)$ has a finite cutoff, such that

$$n(x) = \begin{cases} \tilde{n}(x) & (x < \tau^*) \\ 0 & (x \geq \tau^*) \end{cases} \tag{G1}$$

with a positive continuous function $\tilde{n}(x) > 0$ and a cutoff $\tau^* > 0$. In this case, the eigenvalue problem for $H(x,x')$ can be rewritten as

$$\int_0^\infty dx'\,H(x,x')e(x';\lambda) = \int_0^{\tau^*}dx'\,H(x,x')e(x';\lambda) = \lambda e(x;\lambda). \tag{G2}$$

While $H(x,x')$ itself is not a symmetric kernel, $H(x,x')$ can be symmetrized by introducing

$$\bar{H}(x,x') := \frac{\delta(x-x') - \sqrt{n(x)n(x')}}{\sqrt{xx'}}, \tag{G3}$$

such that

$$\int_0^{\tau^*}dx'\,\bar{H}(x,x')\bar{e}(x';\lambda) = \lambda\bar{e}(x;\lambda), \quad \bar{e}(x;\lambda) := \sqrt{\frac{n(x)}{x}}\,e(x;\lambda) \tag{G4}$$

or equivalently,

$$\int_0^{\tau^*}dx'\,\sqrt{\frac{n(x)n(x')}{xx'}}\,\bar{e}(x';\lambda) = (1/x - \lambda)\,\bar{e}(x;\lambda). \tag{G5}$$

This implies that all the eigenvalues of $H(x,x')$ are identical to those of $\bar{H}(x,x')$. Since Eq. (G5) is a homogeneous Fredholm integral equation of the second kind with a continuous and symmetric kernel $\sqrt{n(x)n(x')/(xx')}$ and with a finite interval $[0,\tau^*]$, all the eigenvalues of $\bar{H}(x,x')$ are real according to the Hilbert-Schmidt theory [69]. Therefore, all the eigenvalues of $H(x,x')$ are also real.

2. Inverse matrix

The inverse matrix of $H(x,x')$ is given by

$$H^{-1}(x,x') := x\,\delta(x-x') + \frac{x\,n(x')}{1-n}, \tag{G6}$$

which is the continuous counterpart of Eq. (F10) with $n := \int_0^\infty n(x)\,dx$. Indeed, we verify that

$$\int_0^\infty dx'\,H(x,x')H^{-1}(x',x'') = \int_0^\infty dx'\,\frac{\delta(x-x') - n(x')}{x'}\left[x'\delta(x'-x'') + \frac{x'\,n(x'')}{1-n}\right] = \delta(x-x''). \tag{G7}$$

The inverse matrix has a singularity at $n = 1$, corresponding to the critical regime of the Hawkes process.
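The verification (G7) can be illustrated on a grid. The sketch below is only a quadrature analogue under stated assumptions: the kernel is taken as $H(x,x') = [\delta(x-x') - n(x')]/x'$ (the form implied by (G3) and (G4)), the delta function is represented by $\delta_{ij}/\Delta x$ on a uniform grid of width `dx`, and $n$ is approximated by $\sum_j n(x_j)\,\Delta x$. The names `discretized_kernels` and `compose` are hypothetical.

```python
from fractions import Fraction


def discretized_kernels(xs, ns, dx):
    """Sample H(x, x') and the inverse kernel (G6) on a uniform grid.

    delta(x - x') is represented by delta_ij / dx, and n = sum_j n_j * dx.
    """
    K = len(xs)
    n = sum(ns) * dx
    H = [[(Fraction(int(i == j)) / dx - ns[j]) / xs[j] for j in range(K)]
         for i in range(K)]
    H_inv = [[xs[i] * int(i == j) / dx + xs[i] * ns[j] / (1 - n) for j in range(K)]
             for i in range(K)]
    return H, H_inv


def compose(A, B, dx):
    """Quadrature version of the integral over x' of A(x, x') B(x', x'')."""
    K = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(K)) * dx for k in range(K)]
            for i in range(K)]
```

With exact rational inputs, `compose(H, H_inv, dx)` should return $\delta_{ik}/\Delta x$, the grid representation of $\delta(x-x'')$.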

[1] A. Hawkes, J. Royal Stat. Soc., Ser. B 33, 438 (1971).
[2] A. Hawkes, Biometrika 58, 83 (1971).
[3] A. Hawkes and D. Oakes, J. Appl. Prob. 11, 493 (1974).
[4] Y. Y. Kagan and L. Knopoff, J. Geophys. Res. 86, 2853 (1981).
[5] Y. Y. Kagan and L. Knopoff, Science 236, 1563 (1987).
[6] Y. Ogata, J. Am. Stat. Assoc. 83, 9 (1988).
[7] Y. Ogata, Pure Appl. Geophys. 155, 471 (1999).
[8] A. Helmstetter and D. Sornette, J. Geophys. Res. 107, 2237 (2002).
[9] S. Nandan, G. Ouillon, D. Sornette, and S. Wiemer, Seismol. Res. Lett. 90, 1650 (2019).
[10] A. G. Hawkes, Quant. Finance 18, 193 (2018).
[11] V. Filimonov and D. Sornette, Phys. Rev. E 85, 056108 (2012).
[12] V. Filimonov and D. Sornette, Quant. Finance 15, 1293 (2015).
[13] S. Wheatley, A. Wehrli, and D. Sornette, Quant. Finance 19, 1165 (2019).
[14] E. Bacry, I. Mastromatteo, and J.-F. Muzy, Market Microstr. Liquidity 1, 1550005 (2015).
[15] Q. Zhao, M. A. Erdogdu, H. Y. He, A. Rajaraman, and J. Leskovec, in Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2015), pp. 1513.
[16] D. Sornette, F. Deschatres, T. Gilbert, and Y. Ageon, Phys. Rev. Lett. 93, 228701 (2004).
[17] R. Crane and D. Sornette, Proc. Nat. Acad. Sci. USA 105, 15649 (2008).
[18] J. V. Escobar and D. Sornette, PLoS One 10, e0116811 (2015).
[19] K. Kanazawa and D. Sornette, Phys. Rev. Lett., arXiv:2001.01195.
[20] A. I. Saichev and D. Sornette, Phys. Rev. E 70, 046123 (2004).
[21] A. Saichev, A. Helmstetter, and D. Sornette, Pure Appl. Geophys. 162, 1113 (2005).
[22] A. I. Saichev and D. Sornette, Eur. Phys. J. B 49, 377 (2006).
[23] A. Saichev and D. Sornette, J. Geophys. Res. 112, B04313 (2007).
[24] A. Saichev and D. Sornette, Phys. Rev. E 89, 012104 (2014).
[25] D. J. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes (Springer, Heidelberg, 2003), Vol. 1.
[26] J. Zhuang, Y. Ogata, and D. Vere-Jones, J. Am. Stat. Assoc. 97, 369 (2002).
[27] A. Helmstetter and D. Sornette, Geophys. Res. Lett. 30, 1576 (2003).
[28] P. Hänggi, P. Talkner, and M. Borkovec, Rev. Mod. Phys. 62, 251 (1990); R. Metzler and J. Klafter, Phys. Rep. 339, 1 (2000).
[29] R. Kubo, M. Toda, and N. Hashitsume, Statistical Physics II, 2nd ed. (Springer-Verlag, Berlin, 1991).
[30] R. Zwanzig, Nonequilibrium Statistical Mechanics (Oxford University Press, New York, 2001).
[31] C. W. Gardiner, Handbook of Stochastic Methods, 4th ed. (Springer, Berlin, 2009).
[32] N. G. van Kampen, Stochastic Processes in Physics and Chemistry, 3rd ed. (Elsevier, Amsterdam, 2007).
[33] J.-P. Hansen and I. McDonald, Theory of Simple Liquids, 3rd ed. (Academic Press, Amsterdam, 2006).
[34] H. Spohn, Rev. Mod. Phys. 52, 569 (1980).
[35] T. Li and M. G. Raizen, Ann. Phys. (Berlin) 525, 281 (2013).
[36] T. Li, S. Kheifets, D. Medellin, and M. G. Raizen, Science 328, 1673 (2010).
[37] R. Huang, I. Chavez, K. M. Taute, B. Lukić, S. Jeney, M. G. Raizen, and E.-L. Florin, Nat. Phys. 7, 576 (2011).
[38] T. Franosch, M. Grimm, M. Belushkin, F. M. Mor, G. Foffi, L. Forró, and S. Jeney, Nature (London) 478, 85 (2011).
[39] H. Mori, Prog. Theor. Phys. 33, 423 (1965); 34, 399 (1965).
[40] R. Morgado, F. A. Oliveira, G. G. Batrouni, and A. Hansen, Phys. Rev. Lett. 89, 100601 (2002).
[41] J.-D. Bao, Y.-Z. Zhuo, F. A. Oliveira, and P. Hänggi, Phys. Rev. E 74, 061111 (2006).
[42] M. H. Lee, Phys. Rev. B 26, 2547 (1982).
[43] I. Goychuk, Phys. Rev. E 80, 046125 (2009).
[44] R. Kupferman, J. Stat. Phys. 114, 291 (2004).
[45] F. Marchesoni and P. Grigolini, J. Chem. Phys. 78, 6287 (1983); M. Ferrario and P. Grigolini, J. Math. Phys. 20, 2567 (1979).
[46] P. Hänggi and P. Jung, Adv. Chem. Phys. 89, 239 (1995).
[47] J. Klafter and I. M. Sokolov, First Steps in Random Walks: From Tools to Applications (Oxford University Press, Oxford, UK, 2011).
[48] K. Kanazawa, T. G. Sano, T. Sagawa, and H. Hayakawa, Phys. Rev. Lett. 114, 090601 (2015).
[49] K. Kanazawa, T. G. Sano, A. Cairoli, and A. Baule, Nature (London) 579, 364 (2020).
[50] C. Jarzynski, J. Stat. Phys. 98, 77 (2000).
[51] D. Oakes, J. Appl. Prob. 12, 69 (1975).
[52] A. Dassios and H. Zhao, Adv. Appl. Probab. 43, 814 (2011).
[53] D. Sornette and L. Knopoff, Bull. Seismol. Soc. Am. 87, 789 (1997).
[54] J.-P. Bouchaud, J. Bonart, J. Donier, and M. Gould, Trades, Quotes, and Prices (Cambridge University Press, Cambridge, UK, 2018), Sec. 9.3.4.
[55] S. J. Hardiman, N. Bercot, and J.-P. Bouchaud, Eur. Phys. J. B 86, 442 (2013).
[56] T. E. Harris, The Theory of Branching Processes (Springer, Berlin, 1963).
[57] A. Boumezoued, Adv. Appl. Probab. 48, 463 (2016).
[58] G. I. Barenblatt, Scaling, Self-similarity, and Intermediate Asymptotics (Cambridge University Press, Cambridge, UK, 1996).
[59] A. Dassios and H. Zhao, Electron. Commun. Probab. 18, 1 (2013).
[60] D. Harte, J. Stat. Software 35, 1 (2010).
[61] K. Kawasaki, Prog. Theor. Phys. 51, 1064 (1974); H. Leschke and M. Schmutz, Z. Phys. B 27, 85 (1977); M. Doi, J. Phys. A: Math. Gen. 9, 1465 (1976); 9, 1479 (1976); L. Peliti, J. Phys. 46, 1469 (1985).
[62] H. Risken, The Fokker-Planck Equation, 2nd ed. (Springer-Verlag, Berlin, 1989).
[63] G. Ódor, Rev. Mod. Phys. 76, 663 (2004).
[64] Y. Ashida, Z. Gong, and M. Ueda, arXiv:2006.01837.
[65] H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, and Polymer Physics, 5th ed. (World Scientific, Singapore, 2009).
[66] K. Kanazawa and D. Sornette, Phys. Rev. Lett. 125, 138301 (2020).
[67] D. Sornette and G. Ouillon, Phys. Rev. Lett. 94, 038501 (2005).
[68] V. A. Filimonov and D. Sornette, EPL 9, 46003 (2011).
[69] G. B. Arfken and H. J. Weber, Mathematical Methods for Physicists (Academic Press, San Diego, 1995).
