arXiv:1801.02846v1 [math.DS] 9 Jan 2018

Data assimilation and parameter estimation for a multiscale stochastic system with $\alpha$-stable Lévy noise

Yanjie Zhang$^1$, Zhuan Cheng$^2$, Xinyong Zhang$^3$, Xiaoli Chen$^1$, Jinqiao Duan$^2$ and Xiaofan Li$^2$

$^1$ Center for Mathematical Sciences & School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China
$^2$ Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL 60616, USA
$^3$ Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China

E-mail: [email protected], [email protected], xlmath804@163.com, [email protected], [email protected] and [email protected]

This work was partly supported by the NSF grant 1620449, and NSFC grants 11531006, 11371367, and 11271290.

Abstract. This work is about low dimensional reduction for a slow-fast data assimilation system with non-Gaussian $\alpha$-stable Lévy noise via stochastic averaging. When the observations are only available for the slow components, we show that the averaged, low dimensional filter approximates the original filter, by examining the corresponding Zakai stochastic partial differential equations. Furthermore, we demonstrate that the low dimensional slow system approximates the slow dynamics of the original system, by examining parameter estimation and most probable paths.

Keywords: Multiscale systems, non-Gaussian Lévy noise, averaging principle, Zakai equation, parameter estimation, most probable paths

1. Introduction

Data assimilation is a procedure to extract system state information with the help of observations [18]. The state evolution and the observations are usually under random fluctuations. The general idea is to gain the best estimate for the true system state, in terms of the probability distribution for the system state, given only some noisy observations of the system. It provides a recursive algorithm for estimating a signal or the state of a random dynamical system based on noisy measurements. It is very important in many practical applications, from inertial guidance of aircrafts and spacecrafts to weather and climate prediction. Most of the existing works on the effects of multiscale signal and observation processes in data assimilation are conducted in the context of Gaussian random fluctuations.

A slow-fast filtering system with Gaussian random fluctuations has been considered by Park et al. (see [1]), who showed that the probability density of the original system converges to that of the reduced system, by a Fourier analysis method. Imkeller et al. [2] further proved convergence in distribution for the optimal filter, via backward stochastic differential equations and asymptotic techniques. However, random fluctuations are often non-Gaussian (in particular, of Lévy type) in nonlinear systems, for example in geosciences [21] and biosciences [3, 4, 5, 14, 15, 16, 32]. There are experimental demonstrations of Lévy fluctuations in optimal foraging theory, in the rapid geographical spread of emergent infectious diseases, and in switching currents. Humphries et al. [25] used GPS to track wandering black-browed albatrosses around an island in the Southern Indian Ocean in order to study their food-searching movement patterns. By fitting the data of the movement steps, they found that the movement patterns obey a power law with exponent α = 1.25. La Cognata et al. [15] considered a Lotka-Volterra system of two competing species subject to multiplicative α-stable Lévy noise, and analyzed the role of the Lévy noise sources. Lisowski et al. [14] studied a model of a stepping molecular motor consisting of two connected heads, and examined its dynamics as the noise parameters varied; the speed and direction appear to depend very sensitively on the characteristics of the noise. Guarcello et al. [16] explored the effect of noise on ballistic graphene-based small Josephson junctions in the framework of the resistively and capacitively shunted model, and found that an analysis of the switching current distribution makes it possible to efficiently detect a non-Gaussian noise component in a Gaussian background. Lévy motions are appropriate models for a class of important non-Gaussian processes with jumps or bursts [3, 4, 22]. It is therefore desirable to consider data assimilation when the system evolution is driven by Lévy motions. This has recently been considered by one of the authors and other researchers, but not in the multiscale context (see [20, 28, 29, 35]).

Multiscale stochastic dynamical systems arise widely in finance and biology. For example, there are two kinds of mutual securities in financial markets. One is low-risk bonds, which can be characterized by ordinary differential equations; the other is high-risk stocks, whose prices undergo two different kinds of changes. On the one hand, the marginal changes caused by normal supply and demand can be characterized by Gaussian noise. On the other hand, upon the arrival of important information about a stock, there will be a finite jump in the stock price. Such perturbations can be characterized by non-Gaussian noise. In general, stock prices change at all times, while bond prices change over months or even years. Thus, the prices of these two securities can be characterized by a two-scale system with non-Gaussian noise (see [9]). Moreover, a large number of observations from biological experiments have shown that the production of mRNA and proteins occurs in a bursty, unpredictable, and intermittent manner, which creates variation or noise in individual cells and in cell-to-cell interactions. Such burst-like events appear to be appropriately modeled by non-Gaussian noise. Since the mRNA synthesis process is faster than the protein dynamics, this leads to a two-time-scale system (see [23]).

Here $\varepsilon$ represents the ratio between the natural time scales of the protein and the mRNA.

Parameter estimation for continuous time stochastic models is an increasingly important part of the overall modeling strategy in a wide variety of applications. It is quite often the case that the data to be fitted to a diffusion process have a multiscale character with Gaussian noise. One example is molecular dynamics, where it is desirable to find effective models for low dimensional phenomena (such as conformational dynamics, vacancy diffusion and so forth) which are embedded within higher dimensional time series. We are often interested in the parameter (see [23]) which represents the degradation or production rates of the protein and the mRNA. In this paper, we develop a parameter estimation method for multiscale diffusions with non-Gaussian noise. The results established here may be used to examine the change rate for low-risk bonds (see [9]).

In the present paper, we consider a slow-fast data assimilation system under Lévy noise, in which only the slow component $X_t$ is observable.

By averaging out the fast component via an invariant measure, we reduce the system dimension by focusing on the slow system evolution. We prove that the reduced, lower dimensional filter effectively approximates (in the sense of $p$-th moments) the original filter. We also demonstrate that a system parameter may be estimated via the low dimensional slow system, utilizing only observations on the slow component. The accuracy of this estimation is quantified by the $p$-th moment, with $p \in (1, \alpha)$. We apply the stochastic Nelder-Mead method [31] for optimization in the search for the estimated parameter value. Furthermore, we illustrate the low dimensional approximation by comparing the most probable paths of the slow system and of the original system. Finally, we make some remarks in Section 6.

This paper is organized as follows. After recalling some basic facts about Lévy motions and generalized solutions in the next section, we address the effects of the multiscale signal and observation processes in the context of Lévy random fluctuations in Sections 3 to 5. We illustrate the low dimensional slow approximation by examining the Zakai equation, parameter estimation and most probable paths. Finally, we give some discussions and comments in a more biological context.

2. Preliminaries

We recall some basic definitions for Lévy motions (or Lévy processes).

Definition 1. A stochastic process $L_t$ is a Lévy process if

(1) $L_0 = 0$ (a.s.);

(2) $L_t$ has independent and stationary increments; and

(3) $L_t$ has stochastically continuous sample paths, i.e., for every $s \ge 0$, $L(t) \to L(s)$ in probability as $t \to s$.

A Lévy process $L_t$ taking values in $\mathbb{R}^n$ is characterized by a drift vector $b \in \mathbb{R}^n$, an $n \times n$ non-negative-definite, symmetric covariance matrix $Q$ and a Borel measure $\nu$ defined on $\mathbb{R}^n \setminus \{0\}$. We call $(b, Q, \nu)$ the generating triplet of the Lévy motion $L_t$. Moreover, we have the Lévy-Itô decomposition for $L_t$:
$$ L_t = bt + B_Q(t) + \int_{\|y\| < 1} y \, \widetilde{N}(t, dy) + \int_{\|y\| \ge 1} y \, N(t, dy), \qquad (2.1) $$
where $N(dt, dy)$ is the Poisson random measure, $\widetilde{N}(dt, dy) = N(dt, dy) - \nu(dy)\, dt$ is the compensated Poisson random measure, $\nu(S) = \mathbb{E} N(1, S)$ is the jump measure, and $B_Q(t)$ is an independent standard $n$-dimensional Brownian motion. The characteristic function of $L_t$ is given by
$$ \mathbb{E}[\exp(i \langle u, L_t \rangle)] = \exp(t \psi(u)), \qquad u \in \mathbb{R}^n, \qquad (2.2) $$
where the function $\psi : \mathbb{R}^n \to \mathbb{C}$ is the characteristic exponent
$$ \psi(u) = i \langle u, b \rangle - \frac{1}{2} \langle u, Q u \rangle + \int_{\mathbb{R}^n \setminus \{0\}} \big( e^{i \langle u, z \rangle} - 1 - i \langle u, z \rangle I_{\{\|z\| < 1\}} \big) \, \nu(dz). \qquad (2.3) $$
The Borel measure $\nu$ is called the jump measure. Here $\langle \cdot, \cdot \rangle$ denotes the scalar product in $\mathbb{R}^n$.

Definition 2. For $\alpha \in (0, 2)$, an $n$-dimensional symmetric $\alpha$-stable Lévy process $L^\alpha_t$ is a Lévy process with characteristic exponent
$$ \psi(u) = -C_1(n, \alpha) |u|^\alpha, \qquad u \in \mathbb{R}^n, \qquad (2.4) $$
with $C_1(n, \alpha) := \pi^{-1/2} \, \Gamma((1+\alpha)/2) \, \Gamma(n/2) / \Gamma((n+\alpha)/2)$.

For an $n$-dimensional symmetric $\alpha$-stable Lévy process, the diffusion matrix $Q = 0$, the drift vector $b = 0$, and the Lévy measure $\nu$ is given by
$$ \nu(du) = \frac{C_2(n, \alpha)}{\|u\|^{n+\alpha}} \, du, \qquad (2.5) $$
where $C_2(n, \alpha) := \alpha \Gamma((n+\alpha)/2) / (2^{1-\alpha} \pi^{n/2} \Gamma(1 - \alpha/2))$.

For every function $\phi \in C_0^2(\mathbb{R}^n)$, the generator of this symmetric $\alpha$-stable Lévy process in $\mathbb{R}^n$ is
$$ (\mathcal{A}_\alpha \phi)(x) = \int_{\mathbb{R}^n \setminus \{0\}} [\phi(x + u) - \phi(x)] \, \nu(du). \qquad (2.6) $$
It is known [6] that $\mathcal{A}_\alpha$ extends uniquely to a self-adjoint operator on its domain. By the Fourier inverse transform,
$$ \mathcal{A}_\alpha \phi = \theta_{\alpha, n} (-\Delta)^{\alpha/2} \phi(x), \qquad (2.7) $$
where
$$ \theta_{\alpha, n} = \int_{\mathbb{R}^n \setminus \{0\}} (\cos(\langle e, y \rangle) - 1) \, \nu(dy) \qquad (2.8) $$
with $e$ being a unit vector in $\mathbb{R}^n$.

Fix $\alpha \in (1, 2)$ and $\eta \in [0, \alpha)$, and set
$$ \rho(x) = (1 + |x|)^\eta. \qquad (2.9) $$

Let $L^p_\rho(\mathbb{R}^n)$ be the weighted $L^p$-space with norm
$$ \|f\|_{p;\rho} = \Big( \int_{\mathbb{R}^n} |f(x)|^p \rho(x) \, dx \Big)^{1/p}. \qquad (2.10) $$
For $l \in \mathbb{N}$, let $W^{l,p}_\rho$ be the $l$-th order weighted Sobolev space with norm
$$ \|f\|_{l,p;\rho} = \sum_{i=0}^{l} \|\nabla^i f\|_{p;\rho}, \qquad (2.11) $$
where $\nabla^l$ denotes the $l$-th order gradient.

Definition 3. A backward predictable stochastic process $v \in L^p(\Omega; L_{loc}(\mathbb{R}; W^{1,p}_\rho))$ is called a generalized solution of the equation
$$ v(t, x) = \varphi(x) + \int_{[t,T]} L v(s, x) \, ds + \int_{[t,T]} h^l v(s, x) \, dw^l(s), \qquad (2.12) $$
if for every $y \in C_0^\infty(\mathbb{R}^d)$ it satisfies the equation
$$ (v(t, x), y) = (\varphi, y) + \int_{[t,T]} (v(s, x), L^* y) \, ds + \int_{[t,T]} (h^l v(s, x), y) \, dw^l(s), \qquad (2.13) $$
with $L$ being the generator of some Lévy process.
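To make the α-stable ingredients above concrete, the following short Python sketch simulates a one-dimensional symmetric α-stable Lévy path by summing independent stationary increments. It is only an illustration: it relies on SciPy's levy_stable distribution, whose scale convention differs from the constant $C_1(n, \alpha)$ in (2.4) by a fixed factor, and the time horizon and step number are assumptions of convenience.

import numpy as np
from scipy.stats import levy_stable

def symmetric_alpha_stable_path(alpha, T=1.0, n_steps=1000, seed=0):
    """Sample a 1-d symmetric alpha-stable Levy path on [0, T].

    Increments over a step dt are symmetric alpha-stable (beta = 0) with
    scale dt**(1/alpha), reflecting the self-similarity of the process.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    increments = levy_stable.rvs(alpha, 0.0, scale=dt**(1.0 / alpha),
                                 size=n_steps, random_state=rng)
    t = np.linspace(0.0, T, n_steps + 1)
    L = np.concatenate(([0.0], np.cumsum(increments)))   # L_0 = 0 a.s.
    return t, L

t, L = symmetric_alpha_stable_path(alpha=1.5)
print("largest jump on [0, 1]:", np.max(np.abs(np.diff(L))))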

3. A slow-fast filtering system

Let us consider the following slow-fast signal-observation system:

$$ dX_t = f_1(X_t, Y_t) \, dt + g_1(X_t, Y_t) \, dB^1_t + \sigma_1 \, dL^{\alpha_1}_1, \qquad x \in \mathbb{R}^n, \qquad (3.14) $$
$$ dY_t = \frac{1}{\varepsilon} f_2(X_t, Y_t) \, dt + \frac{1}{\sqrt{\varepsilon}} g_2(X_t, Y_t) \, dB^2_t + \frac{1}{\varepsilon^{1/\alpha_2}} \, dL^{\alpha_2}_2, \qquad y \in \mathbb{R}^m, \qquad (3.15) $$
$$ dZ_t = h(X_t) \, dt + dW_t, \qquad z \in \mathbb{R}^d. \qquad (3.16) $$
Here $(X_t, Y_t)$ is an $\mathbb{R}^n \times \mathbb{R}^m$-valued signal process which represents the slow and fast components. The constant $\sigma_1$ represents the noise intensity for the slow variable. The observation process $Z_t$ is $\mathbb{R}^d$-valued. The standard Brownian motions $B^1_t$, $B^2_t$, $W_t$ are independent. The non-Gaussian processes $L^{\alpha_1}_1$, $L^{\alpha_2}_2$ (with $1 < \alpha_1, \alpha_2 < 2$) are independent symmetric $\alpha$-stable Lévy processes with triplets $(0, 0, \nu_1)$ and $(0, 0, \nu_2)$, respectively. The parameter $\varepsilon$ is the ratio of the slow time scale to the fast time scale. We make the following assumptions on this filtering system.

Hypothesis H.1. The functions f1,g1, f2,g2 satisfy the global Lipschitz conditions, i.e., there exists a positive constant γ such that

$$ \|f_1(x_1, y_1) - f_1(x_2, y_2)\|^2 + \|g_1(x_1, y_1) - g_1(x_2, y_2)\|^2 + \|f_2(x_1, y_1) - f_2(x_2, y_2)\|^2 + \|g_2(x_1, y_1) - g_2(x_2, y_2)\|^2 \le \gamma \, [\|x_1 - x_2\|^2 + \|y_1 - y_2\|^2] $$
for all $x_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}^m$, $i = 1, 2$.

Remark 1. Note that, with the help of the global Lipschitz condition, it follows that there is a positive constant $K$ such that
$$ \|f_1(x, y)\|^2 + \|g_1(x, y)\|^2 + \|f_2(x, y)\|^2 + \|g_2(x, y)\|^2 \le K (1 + \|x\|^2 + \|y\|^2) $$
for all $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$.

Hypothesis H.2. The coefficients $f_2, g_2$ are of class $C^2(\mathbb{R}^n \times \mathbb{R}^m)$ with the first and second order derivatives bounded.

Hypothesis H.3. The sensor function $h$ is of class $C_b(\mathbb{R}^n)$, i.e., bounded and continuous on $\mathbb{R}^n$.

Hypothesis H.4. There exists a positive constant $C$ such that
$$ \sup_x \{ \mathrm{Tr}[g_2(x, y) g_2(x, y)^T] + 2 \langle y, f_2(x, y) \rangle \} \le -C (1 + |y|^2). \qquad (3.17) $$

This hypothesis (H.4) ensures the existence of an invariant measure $\mu_x(dy)$ (see [12]) for the fast component $Y_t$. With the special scaling exponent $1/\alpha_2$ for the fast component $Y_t$, this invariant measure is independent of $\varepsilon$ (see [11, 12]).

The infinitesimal generator of the slow-fast stochastic system is
$$ \mathcal{L} = \mathcal{L}_1 + \frac{1}{\varepsilon} \mathcal{L}_2, \qquad (3.18) $$
where
$$ \mathcal{L}_1 \Phi(x, y, t) = f_1 \cdot \nabla_x \Phi(x, y, t) + \frac{1}{2} \mathrm{Tr}[G_1 \cdot H_x(\Phi(x, y, t))] + \mathrm{P.V.} \int_{\mathbb{R}^n \setminus \{0\}} \big( \Phi(x + \sigma_1 z_1, y, t) - \Phi(x, y, t) \big) \, \nu_1(dz_1), $$
$$ \mathcal{L}_2 \Phi(x, y, t) = f_2 \cdot \nabla_y \Phi(x, y, t) + \frac{1}{2} \mathrm{Tr}[G_2 \cdot H_y(\Phi(x, y, t))] + \mathrm{P.V.} \int_{\mathbb{R}^m \setminus \{0\}} \big( \Phi(x, y + z_2, t) - \Phi(x, y, t) \big) \, \nu_2(dz_2). \qquad (3.19) $$
Here $G_i = g_i g_i^T$ ($i = 1, 2$), $\nabla$ is the gradient, and $H$ is the Hessian matrix (with respect to $x$ and $y$, respectively).

Let
$$ \mathcal{Z}_t = \sigma(Z_s : 0 \le s \le t) \vee \mathcal{N}, \qquad (3.20) $$
where $\mathcal{N}$ is the collection of all $P$-negligible sets of $(\Omega, \mathcal{F})$. Define
$$ \mathcal{Z} = \bigvee_{t \in \mathbb{R}^+} \mathcal{Z}_t, \qquad (3.21) $$
where $\bigvee_t \mathcal{Z}_t$ denotes the $\sigma$-algebra generated by the union $\bigcup_t \mathcal{Z}_t$. That is,
$$ \mathcal{Z} = \sigma\Big( \bigcup_t \mathcal{Z}_t \Big). \qquad (3.22) $$
By a version of Girsanov's change of measure theorem, we obtain a new probability measure $\widetilde{P}$, such that the observation $Z_t$ becomes $\widetilde{P}$-independent of the signal variables $(X_t, Y_t)$. This can be done through
$$ \frac{d\widetilde{P}}{dP} = \exp\Big( - \sum_{i=1}^m \int_0^t h^i(X_s) \, dW^i_s - \frac{1}{2} \sum_{i=1}^m \int_0^t h^i(X_s)^2 \, ds \Big). \qquad (3.23) $$
Denote
$$ \widetilde{R}_t = \frac{dP}{d\widetilde{P}} \bigg|_{\mathcal{F}_t}. \qquad (3.24) $$


Then by the Kallianpur-Striebel formula, for every bounded differentiable function φ, we have the following representation:

$$ E[\phi(X_t, Y_t) \mid \mathcal{Z}_t] = \frac{\widetilde{E}[\widetilde{R}_t \phi(X_t, Y_t) \mid \mathcal{Z}_t]}{\widetilde{E}[\widetilde{R}_t \mid \mathcal{Z}_t]} = \frac{\widetilde{E}[\widetilde{R}_t \phi(X_t, Y_t) \mid \mathcal{Z}]}{\widetilde{E}[\widetilde{R}_t \mid \mathcal{Z}]}. \qquad (3.25) $$
The unnormalized conditional distribution of $\phi(X_t, Y_t)$, given $\mathcal{Z}_t$, is defined as $P_t(\phi) = \widetilde{E}[\widetilde{R}_t \phi(X_t, Y_t) \mid \mathcal{Z}]$. Thus, we have the following Zakai equation:
$$ \begin{cases} dP_t(\phi) = P_t(\mathcal{L} \phi) \, dt + P_t(\phi h^T) \, dZ_t, \\ P_0(\phi) = E[\phi(X_0, Y_0)]. \end{cases} \qquad (3.26) $$

The x-marginal of Pt(φ) is defined as

$$ \rho_t(\phi) = \int_{\mathbb{R}^{n+m}} \phi(x) \, P_t(dx, dy). \qquad (3.27) $$
Now we define a reduced, low dimensional signal-observation system:

$$ \begin{cases} d\bar{X}_t = \bar{f}_1(\bar{X}_t) \, dt + \bar{g}_1(\bar{X}_t) \, dB^1_t + \sigma_1 \, dL^{\alpha_1}_1, \\ d\bar{Z}_t = h(\bar{X}_t) \, dt + dB_t. \end{cases} \qquad (3.28) $$
Here
$$ \bar{f}_1(x) = \int_{\mathbb{R}^m} f_1(x, y) \, \mu_x(dy), \qquad \bar{G}(x) = \int_{\mathbb{R}^m} g_1(x, y) g_1(x, y)^T \, \mu_x(dy). \qquad (3.29) $$
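The averaged coefficients in (3.29) are rarely available in closed form. As a rough numerical illustration, the sketch below approximates $\bar{f}_1(x)$ by a long-time (ergodic) average of the fast equation with the slow variable frozen at $x$; time is rescaled so that the $\varepsilon$-factors drop out. The particular coefficients $f_1$, $f_2$, $g_2$ and all step sizes are assumptions chosen for illustration, not the paper's.

import numpy as np
from scipy.stats import levy_stable

def averaged_f1(x, f1, f2, g2, alpha2=1.5, dt=1e-3, n_steps=200000,
                burn_in=20000, seed=0):
    """Approximate f1_bar(x) = int f1(x, y) mu_x(dy) by an ergodic average.

    The fast variable is simulated with the slow variable frozen at x,
    using an Euler-Maruyama step with Gaussian and symmetric alpha-stable
    increments; after a burn-in, f1(x, y_k) is averaged along the path.
    """
    rng = np.random.default_rng(seed)
    dB = rng.normal(scale=np.sqrt(dt), size=n_steps)
    dL = levy_stable.rvs(alpha2, 0.0, scale=dt**(1.0 / alpha2),
                         size=n_steps, random_state=rng)
    y, acc, count = 0.0, 0.0, 0
    for k in range(n_steps):
        y = y + f2(x, y) * dt + g2(x, y) * dB[k] + dL[k]
        if k >= burn_in:
            acc += f1(x, y)
            count += 1
    return acc / count

# Illustrative coefficients (assumptions, not from the paper):
f1 = lambda x, y: np.cos(x) * np.exp(-y**2)
f2 = lambda x, y: -y
g2 = lambda x, y: 1.0
print(averaged_f1(0.5, f1, f2, g2))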

The unnormalized conditional distribution $\bar{\rho}_t(\phi)$ corresponding to the filter for the reduced system (3.28) satisfies the following (reduced) Zakai equation

$$ \begin{cases} d\bar{\rho}_t(\phi) = \bar{\rho}_t(\bar{\mathcal{L}} \phi) \, dt + \bar{\rho}_t(\phi h^T) \, d\bar{Z}_t, \\ \bar{\rho}_0(\phi) = E[\phi(X_0)], \end{cases} \qquad (3.30) $$
where
$$ \bar{\mathcal{L}} \phi(x, t) = \bar{f}_1(x) \cdot \nabla_x \phi(x, t) + \frac{1}{2} \mathrm{Tr}[H_x(\bar{G}(x) \phi(x, t))] + \int_{\mathbb{R}^n} \big( \phi(x + \sigma_1 z_1, t) - \phi(x, t) \big) \, \nu_1(dz_1). \qquad (3.31) $$
However, we are more interested in the reduced filtering problem with the actual observation $Z_t$. This leads us to rewrite the reduced Zakai equation (3.30) as follows:

$$ \begin{cases} d\bar{\rho}_t(\phi) = \bar{\rho}_t(\bar{\mathcal{L}} \phi) \, dt + \bar{\rho}_t(\phi h^T) \, dZ_t, \\ \bar{\rho}_0(\phi) = E[\phi(X_0)]. \end{cases} \qquad (3.32) $$
Lemma 1. Assume that the following conditions are satisfied:
(i) The functions $\sigma^{il}$, $b^i$, and $h^l$ for $i, j = 1, 2, \dots, k$ and $l = 1, 2, \dots, k_1$, and their derivatives of first order (in $x$) as well as their derivatives of second order (in $x$), are uniformly bounded by the constant $N_0$. The functions $\sigma^{il}$, $b^i$ are locally uniformly bounded in $t$.
(ii) $\phi \in W^{1,p}_\rho(\mathbb{R}^k)$.

Let $u(t, x, \omega)$ be a generalized solution of the problem
$$ \begin{cases} du(t, x, \omega) = L u(t, x, \omega) \, dt + h^l(t, x, \omega) u(t, x, \omega) \, dB^l(t), & (t, x, \omega) \in [T_0, T] \times \mathbb{R}^k \times \Omega, \\ u(T_0, x, \omega) = \varphi(x, \omega), & (x, \omega) \in \mathbb{R}^k \times \Omega, \end{cases} \qquad (3.33) $$
where
$$ L u(t, x) = \frac{1}{2} \sigma^{ik}(t, x, \omega) \sigma^{jk}(t, x, \omega) \partial^2_{ij} u(t, x, \omega) + b^i \partial_i u(t, x, \omega) + \frac{1}{2} \int_{\mathbb{R}^k} \big( u(t, x + z) + u(t, x - z) - 2 u(t, x) \big) \frac{dz}{|z|^{k+\alpha}}, \qquad (3.34) $$
and $v(t, x, \omega)$ is a generalized solution of the problem
$$ \begin{cases} dv(t, x, \omega) = -[L v(t, x, \omega)] \, dt + [h^l(t, x, \omega) v(t, x, \omega)] \, dB^l(t), & (t, x, \omega) \in [T_0, T] \times \mathbb{R}^k \times \Omega, \\ v(T, x, \omega) = \varphi(x, \omega), & (x, \omega) \in \mathbb{R}^k \times \Omega. \end{cases} \qquad (3.35) $$
Then the following formulas hold:
$$ u(t, x) = E\big( \varphi(Y(t, x, T_0)) \, \chi_1(t, T_0) \mid \mathcal{F}^{T_0}_t \big), \qquad (3.36) $$
and
$$ v(t, x) = E\big( \varphi(X(T, x, t)) \, \chi_2(T, t) \mid \mathcal{F}^{T}_t \big), \qquad (3.37) $$
where $X(t, x, s)$ is the solution of the forward stochastic differential equation with generator $L$ and $Y(t, x, s)$ is the solution of the backward stochastic differential equation with generator $L$. Here $\chi_1(t, s)$ and $\chi_2(t, s)$ satisfy the following equations:
$$ \chi_1(t, s) = \exp\Big( \int_{[s,t]} h^l(\tau, Y(t, x, \tau)) \, dB^l(\tau) - \frac{1}{2} \int_{[s,t]} h^l h^l(\tau, Y(t, x, \tau)) \, d\tau \Big), \qquad (3.38) $$
and
$$ \chi_2(s, t) = \exp\Big( \int_{[t,s]} h^l(\tau, X(\tau, x, t)) \, dB^l(\tau) - \frac{1}{2} \int_{[t,s]} h^l h^l(\tau, X(\tau, x, t)) \, d\tau \Big). \qquad (3.39) $$
Proof. Denote $a = (a^{ij}) = \frac{1}{2} \sigma(t, x, \omega) \sigma^T(t, x, \omega)$. Before we proceed to the proof of the lemma, let us consider the problem
$$ \begin{cases} dv(t, x, \omega) = \Big[ a^{ij}(T - t, x, \omega) \partial^2_{ij} v(t, x, \omega) + b^i(T - t, x, \omega) \partial_i v(t, x, \omega) \\ \qquad\qquad + \frac{1}{2} \int_{\mathbb{R}^k} \big( v(t, x + z) + v(t, x - z) - 2 v(t, x) \big) \frac{dz}{|z|^{k+\alpha}} \Big] dt + h^l(T - t, x, \omega) v(t, x, \omega) \, dB^l_T(t), \\ \qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad (t, x, \omega) \in [0, T - T_0] \times \mathbb{R}^k \times \Omega, \\ v(0, x, \omega) = \varphi(x, \omega), \qquad (x, \omega) \in \mathbb{R}^k \times \Omega. \end{cases} \qquad (3.40) $$
By the definition of a backward predictable stochastic process, the coefficients in equation (3.40) are predictable relative to the family of $\sigma$-algebras $\mathcal{F}_{T-t}$ with $t \in [0, T - T_0]$. The process $B_T$ is a Wiener martingale with respect to the same family, and the initial condition $\varphi$ is measurable with respect to the minimal $\sigma$-algebra of this family.

Let $v(t, x)$ be a generalized solution of this problem. Then, by the definition of a generalized solution, for every $y \in C_0^\infty(\mathbb{R}^k)$ the following equality holds on $[T_0, T] \times \Omega$:
$$ (v(T - s, x), y) = (\varphi, y) + \int_{[0, T - s]} \Big[ -\big( a^{ij}(T - s) \partial_i v(s, x), y_j \big) + \big( (b^i(T - s) - a^{ij}_j(T - s)) \partial_i v(s, x), y \big) + \frac{1}{2} \Big( \int_{\mathbb{R}^k} \big( v(s, x + z) + v(s, x - z) - 2 v(s, x) \big) \frac{dz}{|z|^{k+\alpha}}, y \Big) \Big] ds + \int_{[0, T - s]} h^l(T - t) v(t, x) \, dB^l_T(s). \qquad (3.41) $$
Denoting $u(s) = v(T - s)$ in equation (3.41), we find that $u(t)$ satisfies the following equation on $[T_0, T] \times \Omega$, for every $y \in C_0^\infty(\mathbb{R}^k)$:
$$ (u(t, x), y) = (\varphi, y) + \int_{[t, T]} \Big[ -\big( a^{ij} \partial_i u(s, x), y_j \big) + \big( (b^i - a^{ij}_j) \partial_i u(s, x), y \big) + \frac{1}{2} \Big( \int_{\mathbb{R}^k} \big( u(s, x + z) + u(s, x - z) - 2 u(s, x) \big) \frac{dz}{|z|^{k+\alpha}}, y \Big) \Big] ds + \int_{[t, T]} h^l u(s, x) \, dB^l(s). \qquad (3.42) $$
Thus $u$ is a generalized solution of problem (3.35). On the other hand, changing the variables in equality (3.42), we obtain that $v(t, x)$ is a generalized solution of problem (3.40). Thus we have proved that problems (3.35) and (3.40) are equivalent. Therefore all the results obtained for problem (3.35) carry over naturally to problem (3.40). For problem (3.40), we can use a similar method to obtain the existence and uniqueness of a $W^{1,p}_\rho$-solution to stochastic fractal equations by a purely probabilistic argument (see [19, 30]). This completes the proof of this lemma.

Now we show that the reduced system approximates the original system, by examining the corresponding Zakai equations. Before stating the theorem, we recall the probabilistic representation for semi-linear stochastic fractal equations. We proceed under the probability measure $\widetilde{P}$. Note that the process $Z_t$ is a Brownian motion under $\widetilde{P}$. For convenience, we rewrite $\widetilde{P}$ and $Z_t$ as $P$ and $B_t$, respectively. By Lemma 1, we know that $v^\phi(t, x, y) = \widetilde{E}_{x,y}[\phi(X_T) \widetilde{R}_{t,T} \mid \mathcal{Y}_{t,T}]$ solves a stochastic partial differential equation (SPDE):

$$ \begin{cases} dv^\phi(t, x, y) = -\mathcal{L} v^\phi(t, x, y) \, dt + h(x)^T v^\phi(t, x, y) \, d\overleftarrow{B}_t, \\ v^\phi(T, x, y) = \phi(x). \end{cases} \qquad (3.43) $$

Here $d\overleftarrow{B}_t$ denotes Itô's backward integral. Likewise, we introduce $\bar{v}^\phi(t, x) = \widetilde{E}_x[\phi(\bar{X}_T) \widetilde{R}_{t,T} \mid \mathcal{Y}_{t,T}]$, and then $\bar{v}^\phi(t, x)$ solves the following SPDE:
$$ \begin{cases} d\bar{v}^\phi(t, x) = -\bar{\mathcal{L}} \bar{v}^\phi(t, x) \, dt + h(x)^T \bar{v}^\phi(t, x) \, d\overleftarrow{B}_t, \\ \bar{v}^\phi(T, x) = \phi(x). \end{cases} \qquad (3.44) $$

By the version of Girsanov's change of measure theorem and the Markov property of $(X_t, Y_t)$, we know that for any $t \in [0, T]$: $P_t(v^\phi(t)) = \rho_T(\phi)$. In particular, we have
$$ \rho_T(\phi) = \int v^\phi(0, x, y) \, P_{(X_0, Y_0)}(dx, dy). \qquad (3.45) $$

Similarly, we have

$$ \bar{\rho}_T(\phi) = \int \bar{v}^\phi(0, x) \, P_{X_0}(dx). \qquad (3.46) $$
We can now state our main result on the comparison between the original filter and the reduced filter. This comparison is desirable when only the slow component $X_t$ is observable.

Theorem 1. Assume the hypotheses (H.1)-(H.4) hold. Then for every $p$ with $1 < p < \alpha_1$ and every bounded Lipschitz test function $\phi$, there exists a positive constant $C$ such that
$$ E[\, |\rho_T(\phi) - \bar{\rho}_T(\phi)|^p \,] \le C \varepsilon. \qquad (3.47) $$
That is, the reduced low dimensional filter converges to the original filter as the scale parameter $\varepsilon$ tends to zero.

Proof.
$$ \begin{aligned} E[\, |\rho_T(\phi) - \bar{\rho}_T(\phi)|^p \,] &= E\Big[ \Big| \int \big( v^\phi(0, x, y) - \bar{v}^\phi(0, x) \big) \, P_{(X_0, Y_0)}(dx, dy) \Big|^p \Big] \\ &\le E\Big[ \int \big| v^\phi(0, x, y) - \bar{v}^\phi(0, x) \big|^p \, P_{(X_0, Y_0)}(dx, dy) \Big] \\ &= \int E \big| v^\phi(0, x, y) - \bar{v}^\phi(0, x) \big|^p \, P_{(X_0, Y_0)}(dx, dy) \\ &= \int E \Big| \widetilde{E}_{x,y}[\phi(X_T) \widetilde{R}_{0,T} \mid \mathcal{Y}_{0,T}] - \widetilde{E}_x[\phi(\bar{X}_T) \widetilde{R}_{0,T} \mid \mathcal{Y}_{0,T}] \Big|^p \, P_{(X_0, Y_0)}(dx, dy) \\ &\le C \int E \Big| \widetilde{E}_{x,y}\big[ (\phi(X_T) - \phi(\bar{X}_T)) \widetilde{R}_{0,T} \mid \mathcal{Y}_{t,T} \big] \Big|^p \, P_{(X_0, Y_0)}(dx, dy) \\ &= C \int E \Big| E[\phi(X_T) - \phi(\bar{X}_T) \mid \mathcal{Y}_T] \, \widetilde{E}_x[\widetilde{R}_{0,T} \mid \mathcal{Y}_{0,T}] \Big|^p \, P_{(X_0, Y_0)}(dx, dy) \\ &\le C \int E \big| E[X_T - \bar{X}_T \mid \mathcal{Y}_T] \big|^p \, P_{(X_0, Y_0)}(dx, dy). \end{aligned} \qquad (3.48) $$
Using Jensen's inequality, we have
$$ E[\, |\rho_T(\phi) - \bar{\rho}_T(\phi)|^p \,] \le \int E |X_T - \bar{X}_T|^p \, P_{(X_0, Y_0)}(dx, dy). \qquad (3.49) $$
Applying a similar argument as in [26], we obtain
$$ \lim_{\epsilon \to 0} E |X_T - \bar{X}_T|^p = 0. \qquad (3.50) $$
Finally, we conclude that
$$ E[\, |\rho_T(\phi) - \bar{\rho}_T(\phi)|^p \,] \le C \varepsilon. \qquad (3.51) $$
This completes the proof of Theorem 1.
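Theorem 1 compares the two filters at the level of their Zakai equations; the effect can already be seen at the level of sample paths. The sketch below generates, with illustrative scalar coefficients that are our own assumptions (an Ornstein-Uhlenbeck fast variable whose stationary law is N(0, 2), g_1 = 0 and h(x) = x), an Euler-Maruyama path of the slow-fast signal (3.14)-(3.15), the corresponding observation (3.16), and the averaged slow signal (3.28) driven by the same slow Lévy increments.

import numpy as np
from scipy.stats import levy_stable

def slow_fast_vs_averaged(eps=0.01, alpha1=1.5, sigma1=0.1,
                          T=1.0, n=20000, seed=1):
    """Euler-Maruyama paths of a slow-fast system and its averaged slow system.

    Illustrative coefficients: f1(x, y) = 0.1 (x - x^3) y^2, g1 = 0, h(x) = x;
    fast variable dY = -(Y/eps) dt + (2/sqrt(eps)) dB with stationary law
    N(0, 2), so that f1_bar(x) = 0.2 (x - x^3).
    """
    rng = np.random.default_rng(seed)
    dt = T / n
    dL = sigma1 * levy_stable.rvs(alpha1, 0.0, scale=dt**(1.0 / alpha1),
                                  size=n, random_state=rng)
    dB = rng.normal(scale=np.sqrt(dt), size=n)   # fast Brownian noise
    dW = rng.normal(scale=np.sqrt(dt), size=n)   # observation noise
    X = np.empty(n + 1); Y = np.empty(n + 1); Xbar = np.empty(n + 1)
    Z = np.zeros(n + 1)
    X[0], Y[0], Xbar[0] = 1.0, 0.0, 1.0
    for k in range(n):
        X[k + 1] = X[k] + 0.1 * (X[k] - X[k]**3) * Y[k]**2 * dt + dL[k]
        Y[k + 1] = Y[k] - Y[k] / eps * dt + 2.0 / np.sqrt(eps) * dB[k]
        Xbar[k + 1] = Xbar[k] + 0.2 * (Xbar[k] - Xbar[k]**3) * dt + dL[k]
        Z[k + 1] = Z[k] + X[k] * dt + dW[k]      # observation of the slow part
    return X, Y, Xbar, Z

X, Y, Xbar, Z = slow_fast_vs_averaged()
print("max |X - Xbar| on [0, T]:", np.max(np.abs(X - Xbar)))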

4. Parameter estimation

In this section, we consider parameter estimation in the following slow-fast system:
$$ \begin{cases} dx(t) = \{ A x(t) + f_1(x(t), y(t), \theta) \} \, dt, & x \in \mathbb{R}^n, \\ dy(t) = \frac{1}{\varepsilon} \{ -B y(t) + f_2(x(t), y(t)) \} \, dt + \frac{1}{\varepsilon^{1/\alpha}} \, dL^\alpha_t, & y \in \mathbb{R}^m, \end{cases} \qquad (4.52) $$

where $A$ is an $n \times n$ matrix, $B$ is a positive definite matrix with eigenvalues $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_m$, and $\theta$ is an unknown parameter defined in a compact set $\Theta$ of $\mathbb{R}^n$.

Hypothesis H.5. For all $\theta \in \Theta$, the function $f_1$ is uniformly bounded and Lipschitz with respect to $x$ and $y$. The function $f_2$ is smooth, with its first order derivatives bounded by $\lambda_1$.

Define an averaged, low dimensional slow stochastic dynamical system in $\mathbb{R}^n$ (see [27]), as in the previous section:
$$ d\bar{x}(t) = \{ A \bar{x}(t) + \bar{f}_1(\bar{x}(t), \theta) \} \, dt, \qquad (4.53) $$
where

$$ \bar{f}_1(\bar{x}, \theta) \triangleq \int_{\mathbb{R}^m} f_1(\bar{x}, y, \theta) \, \mu_{\bar{x}}(dy). $$
We will estimate the unknown parameter $\theta$ based on this low dimensional slow system (4.53), but using the observations on the slow component $x$ only. Denote the solution

of the original slow-fast system (4.52) with the actual parameter $\theta_0$ by $(x(t, \theta_0), y(t, \theta_0))$, and the solution of the slow system (4.53) with parameter $\theta$ by $\bar{x}(t, \theta)$. We recall the following lemma.

Lemma 2. Under hypothesis (H.5), the following strong convergence holds:

$$ \lim_{\varepsilon \to 0} E |x(t, \theta_0) - \bar{x}(t, \theta_0)|^p = 0, \qquad t \in [0, T], \ p \in (1, \alpha). \qquad (4.54) $$
Proof. The proof of this averaging principle is similar to that of the infinite dimensional case studied in [26]; thus we omit it here.

We take $F(\theta) = \int_0^T |x(t, \theta_0) - \bar{x}(t, \theta)|^p \, dt$ as the objective function and assume there is a unique $\hat{\theta}^\varepsilon \in \Theta$ such that $F(\hat{\theta}^\varepsilon) = \min_{\theta \in \Theta} F(\theta)$. This $\hat{\theta}^\varepsilon$ is our estimated parameter value. Then we can state our main result as follows.

Theorem 2. Under hypothesis (H.5), the estimated parameter value converges to the true parameter value as the scale parameter $\varepsilon$ approaches zero. That is,
$$ \lim_{\varepsilon \to 0} \hat{\theta}^\varepsilon = \theta_0. \qquad (4.55) $$
Proof. Note that
$$ E |\bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon)|^p \le C \big( E |x(t, \theta_0) - \bar{x}(t, \theta_0)|^p + E |x(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon)|^p \big), \qquad (4.56) $$
for some positive constant $C$. Integrating both sides with respect to time, we get
$$ \int_0^T E |\bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon)|^p \, dt \le C \big( F(\theta_0) + F(\hat{\theta}^\varepsilon) \big) \le 2 C F(\theta_0). \qquad (4.57) $$
We calculate the difference between $\bar{x}(t, \theta_0)$ and $\bar{x}(t, \hat{\theta}^\varepsilon)$ to obtain
$$ \dot{\bar{x}}(t, \theta_0) - \dot{\bar{x}}(t, \hat{\theta}^\varepsilon) = A [\bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon)] + [\bar{f}_1(\bar{x}(t, \theta_0), \theta_0) - \bar{f}_1(\bar{x}(t, \hat{\theta}^\varepsilon), \hat{\theta}^\varepsilon)]. $$

By the variation of constants formula, we have
$$ \bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon) = \int_0^t e^{A(t-s)} \big[ \bar{f}_1(\bar{x}(s, \theta_0), \theta_0) - \bar{f}_1(\bar{x}(s, \hat{\theta}^\varepsilon), \hat{\theta}^\varepsilon) \big] \, ds. $$
Using the mean value theorem, we obtain
$$ \bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon) = (\theta_0 - \hat{\theta}^\varepsilon) \int_0^t e^{A(t-s)} \nabla_\theta \bar{f}_1(\bar{x}(s, \theta'), \theta') \, ds, \qquad (4.58) $$
for some $\theta' = \hat{\theta}^\varepsilon + l (\theta_0 - \hat{\theta}^\varepsilon)$ with $l \in (0, 1)$. Denote
$$ G(\theta') = \int_0^T \Big| \int_0^t e^{A(t-s)} \nabla_\theta \bar{f}_1(\bar{x}(s, \theta'), \theta') \, ds \Big|^p \, dt. $$

If G(θ′) > 0, we have by (4.57) and (4.58)

$$ |\theta_0 - \hat{\theta}^\varepsilon|^p \le \frac{C}{G(\theta')} F(\theta_0), \qquad (4.59) $$
which implies the desired result (4.55) due to Lemma 2.

If $G(\theta') = 0$, we have $\int_0^T E |\bar{x}(t, \theta_0) - \bar{x}(t, \hat{\theta}^\varepsilon)|^p \, dt = 0$. Then $F(\theta_0) = F(\hat{\theta}^\varepsilon)$, which implies $\theta_0 = \hat{\theta}^\varepsilon$, since $F(\theta)$ admits a unique $\hat{\theta}^\varepsilon$ such that $F(\hat{\theta}^\varepsilon) = \min_{\theta \in \Theta} F(\theta)$. This completes the proof of this theorem.

We now consider a two dimensional slow-fast stochastic dynamical system in $\mathbb{R}^2$ to verify the parameter estimation method.

Example 1. Consider the following slow-fast stochastic dynamics:
$$ \begin{cases} dx(t) = -x(t) \, dt + \cos(\theta x) e^{-y^2} \, dt, \\ dy(t) = -\frac{y(t)}{\varepsilon} \, dt + \frac{1}{\varepsilon^{1/\alpha}} \, dL^\alpha_t. \end{cases} \qquad (4.60) $$
Using a result in [11], we find the invariant measure $\mu(dx) = \rho(x) \, dx$ with density

$$ \rho(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{i x \xi} e^{-\frac{1}{\alpha} |\xi|^\alpha} \, d\xi. \qquad (4.61) $$
Then the averaged, one dimensional, slow equation is
$$ d\bar{x}(t) = \{ -\bar{x}(t) + a \cos(\theta \bar{x}(t)) \} \, dt, \qquad (4.62) $$
where

$$ a = \frac{1}{\sqrt{4\pi}} \int_{\mathbb{R}} e^{-\frac{\xi^2}{4} - \frac{1}{\alpha} |\xi|^\alpha} \, d\xi. $$
In the following numerical simulations, we apply the Euler-Maruyama method to generate $M$ different slow paths $x^{i,j} = x^{i,j}(t_i, \theta_0)$, $i = 1, 2, \dots, N$, $j = 1, 2, \dots, M$, as the observations. Then we minimize the objective function $F(\theta) = \sum_{j=1}^{M} \sum_{i=0}^{N} |x^{i,j} - \bar{x}^i(\theta)|^p$ by the stochastic Nelder-Mead method [31] to obtain the estimated parameter value $\hat{\theta}$. The deterministic Nelder-Mead method is a geometric search method for finding a minimizer of an objective function, originally devised for nonlinear, deterministic optimization.

However, when noise dominates the true differences in the function values, the relative ranking of the function values in the simplex can be wrong due to sampling error, leading the algorithm in the wrong direction. The stochastic Nelder-Mead method overcomes this shortcoming.


(a) The sample paths $x$ vs. $\bar{x}$. (b) The estimated value $\hat{\theta}$.

Figure 1. The sample paths $x$, $\bar{x}$ (online color) and the estimated value $\hat{\theta}$ (shown as $*$): $\theta = 9$ (true parameter value), $\varepsilon = 0.01$, $\alpha = 1.9$, $\Theta = (0, 100)$, $M = 1000$.

As shown in Figure 1, the averaged slow system $\bar{x}$ is a good approximation of the slow component $x$ of the original system. After only about 8 iterations of the stochastic Nelder-Mead search, we obtain the estimated parameter $\hat{\theta} \approx 9$ (the true parameter value).
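A minimal version of this experiment can be sketched as follows. The averaged coefficient $a$ is evaluated by numerical quadrature of the formula above, the observed slow paths are generated from (4.60) by Euler-Maruyama, and SciPy's deterministic Nelder-Mead routine is used as a stand-in for the stochastic Nelder-Mead method of [31] (so a reasonable initial guess is needed, which the stochastic variant is designed to avoid). The step sizes, initial conditions, number of paths and the choice p = 1.5 are assumptions made for brevity.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize
from scipy.stats import levy_stable

alpha, eps, theta0, T, N, M, p = 1.9, 0.01, 9.0, 1.0, 1000, 20, 1.5
dt = T / N
rng = np.random.default_rng(0)

# Averaged coefficient a = (4 pi)^(-1/2) * int exp(-xi^2/4 - |xi|^alpha / alpha) dxi
a = quad(lambda xi: np.exp(-xi**2 / 4 - abs(xi)**alpha / alpha),
         -np.inf, np.inf)[0] / np.sqrt(4 * np.pi)

def observed_paths():
    """Euler-Maruyama paths of the slow component of (4.60) with theta = theta0."""
    x, y = np.full(M, 1.0), np.zeros(M)
    out = [x.copy()]
    for _ in range(N):
        dL = levy_stable.rvs(alpha, 0.0, scale=dt**(1.0 / alpha),
                             size=M, random_state=rng)
        x = x + (-x + np.cos(theta0 * x) * np.exp(-y**2)) * dt
        y = y - y / eps * dt + dL / eps**(1.0 / alpha)
        out.append(x.copy())
    return np.array(out)                      # shape (N + 1, M)

def averaged_path(theta):
    """Deterministic averaged slow equation (4.62)."""
    xbar = np.empty(N + 1)
    xbar[0] = 1.0
    for i in range(N):
        xbar[i + 1] = xbar[i] + (-xbar[i] + a * np.cos(theta * xbar[i])) * dt
    return xbar

obs = observed_paths()
F = lambda th: np.sum(np.abs(obs - averaged_path(th[0])[:, None])**p)
theta_hat = minimize(F, x0=[8.0], method="Nelder-Mead").x[0]
print("estimated theta:", theta_hat)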

5. Most probable paths

We further verify that the averaged system captures the behaviour of the original system, by examining the most probable paths. We do this in a concrete example.

Example 2. Consider the slow-fast filtering system
$$ \begin{cases} dX_t = 0.1 (X_t - X_t^3) Y_t^2 \, dt + (0.01)^{1/\alpha} \, dL^\alpha_t, \\ dY_t = -\frac{Y_t}{\varepsilon} \, dt + \frac{2}{\sqrt{\varepsilon}} \, dB_t, \\ dZ_{1t} = X_t \, dt + \sqrt{0.2} \, dW^1_t, \\ dZ_{2t} = Y_t \, dt + \sqrt{0.2} \, dW^2_t. \end{cases} \qquad (5.63) $$
The invariant measure $\rho_\infty(y; x)$ for the fast variable $Y_t$ is the normal distribution with mean $0$ and variance $2$, and it does not depend on $x$. So we can compute the averaged drift
$$ \bar{f}(x) = \int_{\mathbb{R}} 0.1 (x - x^3) y^2 \rho_\infty(y; x) \, dy = 0.1 (x - x^3) \int_{\mathbb{R}} y^2 \rho_\infty(y) \, dy = 0.2 (x - x^3), \qquad (5.64) $$
and obtain the averaged, one dimensional, slow system with the original observation:
$$ \begin{cases} d\bar{X}_t = 0.2 (\bar{X}_t - \bar{X}_t^3) \, dt + (0.01)^{1/\alpha} \, dL^\alpha_t, \\ dZ_{1t} = X_t \, dt + \sqrt{0.2} \, dW^1_t. \end{cases} \qquad (5.65) $$
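As a quick numerical sanity check on (5.64): the second moment of the N(0, 2) invariant law is indeed 2, so the averaged drift is 0.2(x - x^3). The quadrature below is merely a convenience.

import numpy as np
from scipy.integrate import quad

# E[Y^2] under the invariant law N(0, 2) of the fast variable
density = lambda y: np.exp(-y**2 / 4.0) / np.sqrt(4.0 * np.pi)
second_moment = quad(lambda y: y**2 * density(y), -np.inf, np.inf)[0]
print(second_moment)   # approximately 2, hence f_bar(x) = 0.1 * 2 * (x - x^3)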

We can use partial differential equation methods to simulate the solution of the Zakai equation (see [18]); this is carried out in the following two steps.


Figure 2. The most probable paths $X$, $\bar{X}$ (online color) with $X_0 = 1$, $\bar{X}_0 = 1$, $\varepsilon = 0.01$, $\alpha = 1.5$.

The first step, called the prediction step, consists of solving the Fokker-Planck equation (see [10]). The second step, called the correction step, uses the new observation to update the predicted density. Then we can infer the most probable phase portrait (see [8]) at each time $t$. Likewise, we can obtain the most probable path of the averaged system with the same observations. The detailed illustration is shown in Figure 2.
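For readers who prefer a Monte Carlo implementation, the prediction-correction cycle for the reduced filter associated with (5.65) can be approximated with particles instead of a Fokker-Planck solver: particles are propagated under the averaged slow dynamics (prediction), reweighted by the likelihood of the new observation increment (correction), and the most probable state at each time is read off as the maximizer of a weighted kernel density estimate. This is only a stand-in for the PDE-based scheme described above; the particle number, kernel bandwidth, grid and resampling rule are all assumptions for illustration.

import numpy as np
from scipy.stats import levy_stable, norm

def most_probable_path(Z, dt, alpha=1.5, n_particles=5000, x0=1.0, seed=0):
    """Particle approximation of the reduced filter for (5.65).

    Z : observations Z_{1,t} on a uniform time grid with spacing dt.
    Returns the most probable state at each observation time, i.e. the
    argmax of a weighted Gaussian kernel density estimate of the filter.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_particles, x0)
    w = np.full(n_particles, 1.0 / n_particles)
    grid = np.linspace(-3.0, 3.0, 601)
    mpp = []
    for k in range(len(Z) - 1):
        # prediction: propagate particles under the averaged slow dynamics
        dL = levy_stable.rvs(alpha, 0.0, scale=dt**(1.0 / alpha),
                             size=n_particles, random_state=rng)
        x = x + 0.2 * (x - x**3) * dt + 0.01**(1.0 / alpha) * dL
        # correction: reweight by the likelihood of dZ = x dt + sqrt(0.2) dW
        dZ = Z[k + 1] - Z[k]
        w = w * norm.pdf(dZ, loc=x * dt, scale=np.sqrt(0.2 * dt))
        w = w / np.sum(w)
        # resample when the weights degenerate (small effective sample size)
        if 1.0 / np.sum(w**2) < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=w)
            x, w = x[idx], np.full(n_particles, 1.0 / n_particles)
        # most probable state: mode of a weighted kernel density estimate
        kde = np.sum(w * norm.pdf(grid[:, None], loc=x, scale=0.05), axis=1)
        mpp.append(grid[np.argmax(kde)])
    return np.array(mpp)

Feeding in an observation record generated from the full system (5.63), and repeating the construction for the original two-dimensional signal, would produce a comparison of the kind displayed in Figure 2.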

6. Conclusion and discussion

We developed a dimension reduction method for a slow-fast stochastic dynamical system by using the stochastic averaging principle. We demonstrated that the low dimensional slow system approximates the slow dynamics of the original system, by examining the Zakai equation, parameter estimation and most probable paths. Rather than solving the original system, we can accurately estimate the unknown parameter $\theta$ using only observations of the slow variable, which considerably reduces the computational cost. The results established in this paper can be used to examine financial markets or stochastic chemical kinetics, where we are more interested in the change rate $\theta$ for low-risk bonds or for mRNA. Finally, we illustrated the low dimensional approximation by comparing the most probable paths for the slow system and the original system. This provides a deterministic geometric tool to visualize stochastic dynamics.

Acknowledgements. We would like to thank Peter Imkeller (Berlin, Germany) and Yanzhao Cao (Auburn, USA) for helpful discussions.

References

[1] J. H. Park, R. B. Sowers and N. Sri Namachchivaya 2010 Dimensional reduction in nonlinear filtering Nonlinearity. 23 305-324
[2] H. C. Yeong, N. Sri Namachchivaya, N. Perkowski and P. Imkeller 2013 Dimensional reduction in nonlinear filtering: A homogenization approach Annals of Applied Probability. 23 2290-2326
[3] Y. T. Lin and C. R. Doering 2016 Gene expression dynamics with stochastic bursts: Construction and exact results for a coarse-grained model Phys. Rev. E. 93 022409
[4] C. Jia, M. Q. Zhang and H. Qian 2017 Emergent Lévy behavior of stochastic burst phenomenon in single-cell gene expression. Preprint
[5] J. A. Roberts, T. W. Boonstra and M. Breakspear 2014 The heavy tail of the human brain Current Opinion in Neurobiology. 31 164-172
[6] D. Applebaum 2009 Lévy Processes and Stochastic Calculus Cambridge: Cambridge University Press
[7] J. P. N. Bishwal 2008 Parameter Estimation in Stochastic Differential Equations Berlin: Springer-Verlag
[8] Z. Cheng, J. Duan and L. Wang 2016 Most probable dynamics of some nonlinear systems under noisy fluctuations Communications in Nonlinear Science and Numerical Simulation. 30 108-114
[9] D. Givon 2007 Strong convergence rate for two-time-scale jump-diffusion stochastic differential systems SIAM Journal on Multiscale Modeling and Simulation. 6 577-594
[10] T. Gao, J. Duan and X. Li 2016 Fokker-Planck equations for stochastic dynamical systems with symmetric Lévy motions Applied Mathematics and Computation. 278 1-20
[11] S. Albeverio, B. Rüdiger and J. Wu 2000 Invariant measures and symmetry property of Lévy type operators Potential Analysis. 13 147-168
[12] S. Albeverio, Z. Brzeźniak and J. Wu 2010 Existence of global solutions and invariant measures for stochastic differential equations driven by Poisson type noise with non-Lipschitz coefficients Journal of Mathematical Analysis and Applications. 371 309-322
[13] D. Brockmann 2009 Human mobility and spatial disease dynamics Reviews of Nonlinear Dynamics and Complexity. 2 1-24
[14] B. Lisowski, D. Valenti, B. Spagnolo, M. Bier and E. Gudowska-Nowak 2015 Stepping molecular motor amid Lévy white noise Physical Review E. 91 042713
[15] A. La Cognata, D. Valenti, A. A. Dubkov and B. Spagnolo 2010 Dynamics of two competing species in the presence of Lévy noise sources Physical Review E. 82 011121
[16] C. Guarcello, D. Valenti, B. Spagnolo, V. Pierro and G. Filatrella 2017 Anomalous transport effects on switching currents of graphene-based Josephson junctions Nanotechnology. 28 134001
[17] D. Brockmann and I. M. Sokolov 2002 Lévy flights in external force fields: from models to equations Chemical Physics. 284 409-421
[18] A. Bain and D. Crisan 2009 Fundamentals of Stochastic Filtering New York: Springer
[19] B. L. Rozovskii 1983 Stochastic Evolution Systems: Linear Theory and Applications to Non-linear Filtering Moscow: Nauka Publishers
[20] C. Ceci and K. Colaneri 2014 The Zakai equation of nonlinear filtering for jump-diffusion observations: existence and uniqueness Applied Mathematics and Optimization. 69 47-82
[21] P. D. Ditlevsen 1999 Observation of α-stable noise induced millennial climate changes from an ice-core record Geophysical Research Letters. 26 1441-1444
[22] J. Duan 2015 An Introduction to Stochastic Dynamics New York: Cambridge University Press
[23] F. Wu, T. Tian, J. B. Rawlings and G. Yin 2016 Approximate method for stochastic chemical kinetics with two-time scales by chemical Langevin equations Journal of Chemical Physics. 144 174112
[24] S. Popa and S. S. Sritharan 2009 Nonlinear filtering of Itô-Lévy stochastic differential equations with continuous observations Communications on Stochastic Analysis. 3 313-330

[25] N. E. Humphries, H. Weimerskirch, N. Queiroz, E. J. Southall and D. W. Sims 2012 Foraging success of biological Lévy flights recorded in situ Proceedings of the National Academy of Sciences. 109 7169-7174
[26] J. Bao, G. Yin and C. Yuan 2017 Two-time-scale stochastic partial differential equations driven by α-stable noises: Averaging principles Bernoulli. 23 645-669
[27] G. A. Pavliotis and A. M. Stuart 2008 Multiscale Methods: Averaging and Homogenization New York: Springer
[28] S. Popa and S. S. Sritharan 2009 Nonlinear filtering of Itô-Lévy stochastic differential equations with continuous observations Communications on Stochastic Analysis. 3 313-330
[29] H. Qiao and J. Duan 2015 Nonlinear filtering of stochastic dynamical systems with Lévy noises Adv. Appl. Prob. 47 902-918
[30] Y. C. Xie, Q. Zhang and X. C. Zhang 2014 Probabilistic approach for semi-linear stochastic fractal equations Stochastic Processes and their Applications. 124 3948-3964
[31] K. H. Chang 2012 Stochastic Nelder-Mead simplex method - a new globally convergent direct search method for simulation optimization European Journal of Operational Research. 220 684-694
[32] Y. Zheng, J. Duan, L. Serdukova and J. Kurths 2016 Transitions in a genetic transcriptional regulatory system under Lévy motion Scientific Reports. 6 29274
[33] G. A. Pavliotis and A. M. Stuart 2007 Parameter estimation for multiscale diffusions Journal of Statistical Physics. 127 741-781
[34] J. Ren, J. Duan and X. Wang 2013 A parameter estimation method based on random slow manifolds Applied Mathematical Modelling. 39 3721-3732
[35] X. Sun, J. Duan, X. Li and X. Wang 2015 State estimation under non-Gaussian Lévy noise: A modified Kalman filtering method Banach Center Publications. 105 239-246