International Journal of Pure and Applied Mathematics, Volume 119, No. 3, 2018, 461-473
ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu
doi: 10.12732/ijpam.v119i3.6

ON THE DISTRIBUTION OF THE RUNNING AVERAGE OF A SKELLAM PROCESS

Weixuan Xia
Mathematical Finance, Boston University Questrom School of Business
595 Commonwealth Ave, Boston, MA 02215, USA

Abstract: It is shown that the distribution of the running average of a Skellam process is of compound Poisson type, which gives rise to double-uniform-sum distributions. The average process's characteristic function, moments, probability density function, and cumulative distribution function are derived in explicit form.

AMS Subject Classification: 60E10, 60G55, 60J75
Key Words: probability distribution, running average, Skellam process, double-uniform sum

1. Introduction

A Skellam distribution is a discrete-valued probability distribution initially proposed in Skellam (1946) [6], and is well known to be the distribution of the difference of two independent Poisson random variables, with respective rate parameters $\lambda_1 > 0$ and $\lambda_2 > 0$. In continuous time $t \ge 0$, a Skellam process $K \equiv (K_t)$ is hence defined to be a Lévy process admitting a Skellam distribution. In light of the conventional definition of Lévy processes (e.g., see Schoutens ((2003), pages 44–45) [5]), the Skellam process is defined by the following conditions.

• $K_0 = 0$, a.s., i.e., $\Pr[K_0 = 0] = 1$.

• For any partition of time $P = \{t_n\}_{n\in\mathbb{N}}$ with $t_0 = 0$ and $t_n < t_{n+1}$, $\forall n$, it holds true that the increment random variables $K_{t_{n+1}} - K_{t_n}$ are mutually independent and stationary with $K_{t_{n+1}} - K_{t_n} \overset{\text{law}}{=} K_{t_{n+1}-t_n} \sim \text{Skellam}(\lambda_1(t_{n+1}-t_n), \lambda_2(t_{n+1}-t_n))$.

• The mapping $t \in \mathbb{R}_+ \mapsto K_t \in \mathbb{Z}$ is càdlàg with probability 1, i.e., $\lim_{h\searrow 0}\Pr[|K_{t+h} - K_t| > \epsilon] = 0$, $\forall\epsilon > 0$.

Received: May 18, 2107. Revised: July 11, 2018. Published: July 12, 2018. © 2018 Academic Publications, Ltd. url: www.acadpubl.eu

Conditional on $t$, the Skellam process has the following probability mass function,

$$
p_{K|t}(x) \equiv \Pr[K_t = x] = e^{-(\lambda_1+\lambda_2)t}\left(\frac{\lambda_1}{\lambda_2}\right)^{x/2} I_x\!\left(2t\sqrt{\lambda_1\lambda_2}\right), \quad x\in\mathbb{Z}, \tag{1}
$$

where $I_\cdot(\cdot)$ is the modified Bessel function of the first kind. For details refer to Abramowitz and Stegun ((1972), pages 375–378) [1]. The characteristic function of $K$ is hence given by

$$
\phi_{K|t}(u) := \mathrm{E}\!\left[e^{iuK_t}\right] \equiv \sum_{x\in\mathbb{Z}} e^{iux}p_{K|t}(x) = \exp\!\left(\lambda_1 t\,(e^{iu}-1) + \lambda_2 t\,(e^{-iu}-1)\right), \quad u\in\mathbb{R}, \tag{2}
$$

where $i = \sqrt{-1}$. Clearly, (2) is infinitely divisible in that $\phi_{K|t}(u) = (\phi_{K|1}(u))^t$, so that the Lévy properties are meaningful.

Several works have discussed the properties and applications of the Skellam process. For instance, Barndorff-Nielsen et al. (2010) [2] considered the scaled Skellam process and a generalization using negative binomial distributions when modeling low-latency financial data, while Kerss et al. (2014) [4] analyzed, by means of time change, fractional Skellam processes, of which the Skellam process is a special case.

In this paper our interest lies in analyzing the following time-scaled integral of the path of the Skellam process,

$$
\tilde K_t := \frac{1}{t}\int_0^t K_s\,\mathrm{d}s. \tag{3}
$$

This stochastic process, notably, can be identified as the running average of the Skellam process $K$. In equivalent differential form, we can write

$$
\mathrm{d}\tilde K_t = \left(\frac{1}{t}K_t - \frac{1}{t^2}\int_0^t K_s\,\mathrm{d}s\right)\mathrm{d}t, \tag{4}
$$

with $\tilde K_0 = 0$, a.s. This indicates that $\tilde K$ has continuous sample paths of bounded total variation. In the following sections the distributional information of $\tilde K$ is thoroughly explored, while comparison is also made with the original distributional properties of $K$.
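As an informal illustration of definition (3), the following sketch simulates a Skellam path as the superposition of up-jumps (rate $\lambda_1$) and down-jumps (rate $\lambda_2$) and integrates the piecewise-constant path exactly. The function names, the seed, and the sample size are our own illustrative choices and are not part of the paper.

```python
import random

def simulate_running_average(lam1, lam2, t, rng):
    """Draw one sample of (K_t, (1/t) * integral of K_s over [0, t])."""
    total_rate = lam1 + lam2          # jump intensity of the Skellam process
    p_up = lam1 / total_rate          # probability that a jump equals +1
    time, level, integral = 0.0, 0, 0.0
    while True:
        wait = rng.expovariate(total_rate)
        if time + wait >= t:
            # No further jump before t: close the last constant piece.
            integral += level * (t - time)
            break
        integral += level * wait      # area under the current constant piece
        time += wait
        level += 1 if rng.random() < p_up else -1
    return level, integral / t

def monte_carlo_means(lam1, lam2, t, n_paths=100_000, seed=12345):
    """Sample means of K_t and of its running average over n_paths paths."""
    rng = random.Random(seed)
    sum_k = sum_avg = 0.0
    for _ in range(n_paths):
        k, avg = simulate_running_average(lam1, lam2, t, rng)
        sum_k += k
        sum_avg += avg
    return sum_k / n_paths, sum_avg / n_paths
```

Since $\mathrm{E}[K_s] = (\lambda_1-\lambda_2)s$, the two sample means should be close to $(\lambda_1-\lambda_2)t$ and $(\lambda_1-\lambda_2)t/2$, respectively, up to Monte Carlo error.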

2. Characteristic function

In general, the distribution of $\tilde K$ is analyzed conditional on $t > 0$ and the parametrization $\{\lambda_1 > 0, \lambda_2 > 0\}$. In an attempt to derive the characteristic function of $\tilde K$, we introduce the following lemma, which applies quite conveniently to the general class of Lévy processes.

Lemma 1. If $X \equiv (X_t)$ is a Lévy process and $Y \equiv (Y_t)$ its Riemann integral defined by

$$
Y_t := \int_0^t X_s\,\mathrm{d}s, \tag{5}
$$

then it holds for the respective characteristic functions of $X$ and $Y$ that

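$$
\phi_{Y|t}(u) := \mathrm{E}\!\left[e^{iuY_t}\right] = \exp\!\left(t\int_0^1 \ln\phi_{X|1}(tuz)\,\mathrm{d}z\right), \quad u\in\mathbb{R}. \tag{6}
$$

Proof. By the independent and stationary increments of $X$, the decomposition

$$
Y_t = \lim_{n\to\infty}\frac{t}{n}\sum_{k=1}^{n}X_{kt/n} = \lim_{n\to\infty}\frac{t}{n}\sum_{k=1}^{n}(n-k+1)\left(X_{kt/n} - X_{(k-1)t/n}\right) \tag{7}
$$

immediately leads to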

$$
\phi_{Y|t}(u) = \exp\!\left(\lim_{n\to\infty}\sum_{k=1}^{n}\ln\phi_{X|t/n}\!\left(\frac{ktu}{n}\right)\right), \tag{8}
$$

from which the lemma follows by infinite divisibility.
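The last step of the proof can be spelled out as follows. This is only a sketch, under the assumption that $\ln\phi_{X|1}$ denotes the continuous (distinguished) logarithm, so that the Riemann sum below converges to the integral in (6).

```latex
\begin{align*}
\ln\phi_{X|t/n}(v) &= \frac{t}{n}\,\ln\phi_{X|1}(v), \qquad v\in\mathbb{R}, \\
\sum_{k=1}^{n}\ln\phi_{X|t/n}\!\left(\frac{ktu}{n}\right)
  &= t\cdot\frac{1}{n}\sum_{k=1}^{n}\ln\phi_{X|1}\!\left(tu\,\frac{k}{n}\right)
  \;\xrightarrow[n\to\infty]{}\; t\int_{0}^{1}\ln\phi_{X|1}(tuz)\,\mathrm{d}z .
\end{align*}
```

The first line is the infinite divisibility of the Lévy process $X$; the second line recognizes the sum in (8) as a Riemann sum over the nodes $z = k/n$.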

The next theorem hence gives the characteristic function of $\tilde K$.

Theorem 2.

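$$
\phi_{\tilde K|t}(u) := \mathrm{E}\!\left[e^{iu\tilde K_t}\right] = \exp\!\left(\lambda_1 t\left(\frac{e^{iu}-1}{iu} - 1\right) + \lambda_2 t\left(\frac{1-e^{-iu}}{iu} - 1\right)\right), \quad u\in\mathbb{R}. \tag{9}
$$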

Proof. This follows from a direct application of Lemma 1 to (2) after scaling by 1/t.
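Theorem 2 can be sanity-checked numerically. The sketch below compares the closed form (9) with a direct evaluation of Lemma 1, where the integral in (6) is approximated by the midpoint rule and the argument is rescaled by $1/t$. The function names and the number of integration nodes are illustrative assumptions, not the paper's.

```python
import cmath

def skellam_log_cf(u, lam1, lam2):
    """ln phi_{K|1}(u), read off from (2) with t = 1."""
    return lam1 * (cmath.exp(1j * u) - 1) + lam2 * (cmath.exp(-1j * u) - 1)

def avg_cf_closed_form(u, t, lam1, lam2):
    """Characteristic function (9); at u = 0 the removable limit 1 is used."""
    if u == 0:
        return 1 + 0j
    up = (cmath.exp(1j * u) - 1) / (1j * u) - 1
    down = (1 - cmath.exp(-1j * u)) / (1j * u) - 1
    return cmath.exp(lam1 * t * up + lam2 * t * down)

def avg_cf_via_lemma(u, t, lam1, lam2, n_steps=2000):
    """phi_{Y|t}(u / t) from (6), with the z-integral done by the midpoint rule."""
    h = 1.0 / n_steps
    # Replacing u by u / t in (6) turns the integrand into ln phi_{K|1}(u z).
    integral = h * sum(skellam_log_cf(u * (k + 0.5) * h, lam1, lam2)
                       for k in range(n_steps))
    return cmath.exp(t * integral)
```

For moderate $|u|$ the two functions should agree up to the quadrature error of the midpoint rule.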

Notice that (9) has a removable singularity at $u = 0$, as it is easily observable that the Taylor expansion of the function $(e^z - 1)/z$ about 0 contains all nonnegative integer-valued powers of $z$. Upon removal we can define that $\phi_{\tilde K|t}(0) = 1$. As a consequence, $\tilde K$'s moment generating function, $\phi_{\tilde K|t}(-iu)$, is uniformly well-defined on the real line, which allows the next section to expatiate on the moment properties. Obviously, like (2), (9) is still infinitely divisible. An important implication from (9) is that the running average process has a compound Poisson distribution, as stated below.

Corollary 3.

$$
\tilde K_t \overset{\text{law}}{=} \sum_{n=1}^{N_t} J_n, \tag{10}
$$

where $(N_t)$ is a Poisson process with intensity parameter $\lambda_1 + \lambda_2 > 0$ and $\{J_n\}_{n\in\mathbb{N}_{++}}$ are i.i.d. random variables admitting a double uniform distribution.

Proof. Some elementary transformations from (9) lead to

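$$
\begin{aligned}
\phi_{\tilde K|t}(u) &= \exp\!\left((\lambda_1+\lambda_2)t\left(\frac{(e^{iu}-1)e^{-iu}(\lambda_1 e^{iu}+\lambda_2)}{i(\lambda_1+\lambda_2)u} - 1\right)\right) \\
&= \exp\!\left((\lambda_1+\lambda_2)t\left(\frac{\lambda_1}{\lambda_1+\lambda_2}\,\frac{e^{iu}-1}{iu} + \frac{\lambda_2}{\lambda_1+\lambda_2}\,\frac{1-e^{-iu}}{iu} - 1\right)\right),
\end{aligned}
\tag{11}
$$

which conveniently points to a compound Poisson structure with rate $(\lambda_1+\lambda_2)t$. The mixing distribution is understood from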

$$
\phi_J(u) := \mathrm{E}\!\left[e^{iuJ_1}\right] = \frac{\lambda_2}{\lambda_1+\lambda_2}\,\frac{1-e^{-iu}}{iu} + \frac{\lambda_1}{\lambda_1+\lambda_2}\,\frac{e^{iu}-1}{iu} \tag{12}
$$

to be a weighted average of the characteristic functions of two uniform distributions supported over $[-1, 0]$ and $[0, 1]$, respectively.

In other words, the running average of the Skellam process is equivalent in law to a compound Poisson process with intensity $\lambda_1+\lambda_2$ and double-uniformly distributed jumps. Nevertheless, the resulting distribution is no longer Skellam, in the absence of α-stability.
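The representation (10) suggests a simple way of sampling $\tilde K_t$ without simulating any path. The following sketch draws a Poisson number of jumps and sums double-uniform variates with weight $\lambda_1/(\lambda_1+\lambda_2)$ on $[0,1]$. The helper names, the Poisson sampler based on exponential arrivals, and the seed are our own assumptions for illustration.

```python
import random

def sample_double_uniform(w, rng):
    """One jump J: uniform on [0, 1] with probability w, else uniform on [-1, 0]."""
    u = rng.random()
    return u if rng.random() < w else -u

def sample_poisson(mean, rng):
    """Poisson draw obtained by counting unit-rate exponential arrivals before `mean`."""
    count, clock = 0, rng.expovariate(1.0)
    while clock < mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count

def sample_running_average(lam1, lam2, t, rng):
    """One draw from the compound Poisson law on the right-hand side of (10)."""
    n_jumps = sample_poisson((lam1 + lam2) * t, rng)
    w = lam1 / (lam1 + lam2)
    return sum(sample_double_uniform(w, rng) for _ in range(n_jumps))

def sample_mean_and_variance(lam1, lam2, t, n_samples=50_000, seed=2018):
    """Sample mean and unbiased sample variance of n_samples draws."""
    rng = random.Random(seed)
    draws = [sample_running_average(lam1, lam2, t, rng) for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    var = sum((d - mean) ** 2 for d in draws) / (n_samples - 1)
    return mean, var
```

By Wald-type identities for compound Poisson sums, the sample mean and variance should approach $(\lambda_1-\lambda_2)t/2$ and $(\lambda_1+\lambda_2)t/3$, up to sampling error.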

3. Moments

For succinctness, denote by $m_r$ the $r$th moment of the running average $\tilde K$, with the definition

$$
m_r := \mathrm{E}\!\left[\tilde K_t^{\,r}\right] = (-i)^r\,\frac{\mathrm{d}^r\phi_{\tilde K|t}(u)}{\mathrm{d}u^r}\bigg|_{u=0}. \tag{13}
$$

The uniform existence of the moments is as aforementioned, and they can be found by the following recursive formula.

Theorem 4.

$$
m_0 = 1, \qquad m_{r+1} = t\sum_{k=0}^{r}\binom{r}{k}\frac{\lambda_1 + (-1)^{k+1}\lambda_2}{k+2}\,m_{r-k}, \quad r\in\mathbb{N}. \tag{14}
$$

Proof. Based on (9), the Taylor expansion of the characteristic exponent, $\ln\phi_{\tilde K|t}(u)$, around 0, gives that

$$
\frac{e^{iu}-1}{iu} = 1 + \sum_{r=1}^{\infty}\frac{(iu)^r}{(r+1)!} \quad\text{and}\quad \frac{1-e^{-iu}}{iu} = 1 + \sum_{r=1}^{\infty}\frac{(-iu)^r}{(r+1)!}. \tag{15}
$$

Then, we apply the famous exponential formula in combinatorics, a.k.a. Faà di Bruno's formula in the context of exponentials (see Stanley ((1999), pages 1–10) [7]), in order to calculate the coefficients in

$$
\phi_{\tilde K|t}(u) = \sum_{r=0}^{\infty}\frac{m_r}{r!}(iu)^r. \tag{16}
$$

As a result,

$$
m_{r+1} = \sum_{k=0}^{r}\binom{r}{k}(k+1)!\left(\frac{\lambda_1 t}{(k+2)!} + \frac{(-1)^{k+1}\lambda_2 t}{(k+2)!}\right)m_{r-k}, \quad r\in\mathbb{N}, \tag{17}
$$

with $m_0 = 1$, and the theorem follows.

In connection with this, the mean, variance, skewness, and excess kurtosis can be calculated in proper order as

$$
\mathrm{E}[\tilde K_t] = m_1 = \frac{(\lambda_1-\lambda_2)t}{2}, \tag{18}
$$

$$
\mathrm{Var}[\tilde K_t] = m_2 - m_1^2 = \frac{(\lambda_1+\lambda_2)t}{3}, \tag{19}
$$

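$$
\mathrm{Skew}[\tilde K_t] = \frac{m_3 - 3m_2m_1 + 2m_1^3}{\left(m_2 - m_1^2\right)^{3/2}} = \frac{3\sqrt{3}\,(\lambda_1-\lambda_2)}{4\sqrt{(\lambda_1+\lambda_2)^3\,t}}, \tag{20}
$$

$$
\mathrm{EKurt}[\tilde K_t] = \frac{m_4 - 4m_3m_1 + 6m_2m_1^2 - 3m_1^4}{\left(m_2 - m_1^2\right)^{2}} - 3 = \frac{9}{5(\lambda_1+\lambda_2)t}. \tag{21}
$$

We remark that, compared to the Skellam process,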

$\mathrm{E}[\tilde K_t]/\mathrm{E}[K_t] = 1/2$, $\mathrm{Var}[\tilde K_t]/\mathrm{Var}[K_t] = 1/3$, $\mathrm{Skew}[\tilde K_t]/\mathrm{Skew}[K_t] = 3\sqrt{3}/4$, and $\mathrm{EKurt}[\tilde K_t]/\mathrm{EKurt}[K_t] = 9/5$.¹ In comparison, the running average is characterized by smaller variance but a higher asymmetric leptokurtic level. The mean and variance are still linear in time while the skewness and kurtosis generally decrease with the passage of time.
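The recursion (14) is straightforward to evaluate exactly. The sketch below uses rational arithmetic and returns the squared skewness instead of the skewness, so that every quantity stays rational and can be compared with (18)–(21) without rounding. Function names are illustrative, and the inputs are assumed to be integers or `Fraction` objects.

```python
from fractions import Fraction
from math import comb

def running_average_moments(lam1, lam2, t, r_max):
    """Raw moments m_0, ..., m_{r_max} of the running average via recursion (14)."""
    m = [Fraction(1)]
    for r in range(r_max):
        total = Fraction(0)
        for k in range(r + 1):
            weight = (lam1 + (-1) ** (k + 1) * lam2) / Fraction(k + 2)
            total += comb(r, k) * weight * m[r - k]
        m.append(t * total)
    return m

def summary_statistics(lam1, lam2, t):
    """Mean, variance, squared skewness and excess kurtosis built from m_1..m_4."""
    _, m1, m2, m3, m4 = running_average_moments(lam1, lam2, t, 4)
    var = m2 - m1 ** 2
    third_central = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    fourth_central = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    skew_squared = third_central ** 2 / var ** 3   # squared to avoid a square root
    excess_kurtosis = fourth_central / var ** 2 - 3
    return m1, var, skew_squared, excess_kurtosis
```

For instance, with $\lambda_1 = 2$, $\lambda_2 = 1$ and $t = 3$, formulas (18)–(21) give mean $3/2$, variance $3$, squared skewness $1/48$ and excess kurtosis $1/5$, which is what the recursion should reproduce.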

4. Probability Distribution Functions

In this section we derive the explicit distribution of the running average $\tilde K$, still subject to $t > 0$ and the same parametrization. We also need the following lemma which introduces a relatively new double-uniform-sum distribution, namely the $n$-fold convolution of double-uniform distribution functions.

Lemma 5. Given $n$ i.i.d. double-uniformly distributed random variables $(U_i)_{i\in\mathbb{N}_{++}|n}$ with density function

$$
f_U(x) = (1-w)\,\mathbb{1}_{[-1,0]}(x) + w\,\mathbb{1}_{(0,1]}(x), \quad -1\le x\le 1, \tag{22}
$$

where $0 < w < 1$ is the weight parameter and $\mathbb{1}_\cdot(\cdot)$ the indicator function, the density function of the sum $V := \sum_{i=1}^{n}U_i$ is given by

$$
f_V(x) = \frac{1}{(n-1)!}\sum_{k=-n}^{\lfloor x\rfloor}\sum_{j=0}^{n-k}(-1)^{k+j}\binom{n}{k+j}\binom{n}{j}\,w^{n-j}(1-w)^{j}(x-k)^{n-1}, \quad -n\le x\le n. \tag{23}
$$

¹ These proportional relations (between the moments of $\tilde K$ and $K$ in Section 3) hold quite generally for the class of Lévy processes, which can be easily verified through (6).

Proof. We first note that (22) is a direct implication from the proof of Corollary 3 with the replacement of $\lambda_1/(\lambda_1+\lambda_2)$ by $w$. For convenience we start with the bilateral Laplace transform,

$$
\bar f_V(l) = \int_{-\infty}^{\infty}e^{-lx}f_V(x)\,\mathrm{d}x = \left((1-w)\frac{e^{l}-1}{l} + w\,\frac{1-e^{-l}}{l}\right)^{n}, \quad l\in\mathbb{C}, \tag{24}
$$

according to the convolution theorem, and then with $n\in\mathbb{N}_{++}$, applying the binomial expansion we have

$$
\begin{aligned}
\bar f_V(l) &= \sum_{j=0}^{n}\binom{n}{j}(1-w)^{j}w^{n-j}\,\frac{(e^{l}-1)^{n}}{l^{n}e^{(n-j)l}} \\
&= \sum_{k,j=0}^{n}\binom{n}{j}\binom{n}{k}(1-w)^{j}w^{n-j}(-1)^{n-k}\,\frac{e^{(k+j-n)l}}{l^{n}} \\
&= \sum_{j=0}^{n}\sum_{k=-j}^{n-j}\binom{n}{j}\binom{n}{k+j}(1-w)^{j}w^{n-j}(-1)^{k+j}\,\frac{e^{-kl}}{l^{n}}.
\end{aligned}
\tag{25}
$$

We note that (25) can be termwise inverted because each summand converges absolutely in an arbitrarily small neighborhood $\gamma$ along the imaginary axis, and has only one pole of order $n$ at the origin. By considering the poles to either side of $\gamma$ we apply Cauchy's residue theorem to obtain

$$
\begin{aligned}
\frac{1}{2\pi i}\oint_{\gamma}\frac{e^{(x-k)l}}{l^{n}}\,\mathrm{d}l &= \frac{(k-x)^{n-1}\,\mathbb{1}_{(-\infty,k]}(x)}{2(-1)^{n}(n-1)!} + \frac{(x-k)^{n-1}\,\mathbb{1}_{[k,\infty)}(x)}{2(n-1)!} \\
&= \frac{(x-k)^{n-1}\,\mathrm{sgn}(x-k)}{2(n-1)!},
\end{aligned}
\tag{26}
$$

where $\mathrm{sgn}(\cdot)$ denotes the sign function, thus yielding

$$
f_V(x) = \frac{1}{2(n-1)!}\sum_{j=0}^{n}\sum_{k=-j}^{n-j}(-1)^{k+j}\binom{n}{k+j}\binom{n}{j}\,w^{n-j}(1-w)^{j}(x-k)^{n-1}\,\mathrm{sgn}(x-k). \tag{27}
$$

For the terms containing $k$, by splitting the summation we observe that

$$
\begin{aligned}
&\frac{1}{2(n-1)!}\sum_{k=-n}^{n}(-1)^{k}\binom{n}{k+j}(x-k)^{n-1}\,\mathrm{sgn}(x-k) \\
&\quad= \frac{1}{2(n-1)!}\left(\sum_{k=-n}^{\lfloor x\rfloor}(-1)^{k}\binom{n}{k+j}(x-k)^{n-1} - \sum_{k=\lfloor x\rfloor+1}^{n}(-1)^{k}\binom{n}{k+j}(x-k)^{n-1}\right) \\
&\quad= \frac{1}{(n-1)!}\sum_{k=-n}^{\lfloor x\rfloor}(-1)^{k}\binom{n}{k+j}(x-k)^{n-1} - \frac{1}{2(n-1)!}\sum_{k=-n}^{n}(-1)^{k}\binom{n}{k+j}(x-k)^{n-1},
\end{aligned}
\tag{28}
$$

where the second sum in the last equality obviously equals 0 given $j\ge 0$, being an $n$th-order finite difference of a polynomial of degree $n-1$. Dropping the redundant terms with $n-k+1\le j\le n$ hence leads to (23). Clearly, for $|x| > n$ (23) universally yields 0.

Some interesting statistical properties of the double-uniform-sum distribution are summarized in Appendix A. Since it has been proven that $\tilde K$ admits a compound Poisson-double-uniform distribution, the next theorem presents a convenient series representation for its density function.

Theorem 6.

$$
f_{\tilde K|t}(x) = e^{-(\lambda_1+\lambda_2)t}\left(\delta_{\{0\}}(x) + \sum_{n=1}^{\infty}\frac{t^{n}}{n!\,(n-1)!}\sum_{k=-n}^{\lfloor x\rfloor}\sum_{j=0}^{n-k}(-1)^{k+j}\binom{n}{k+j}\binom{n}{j}\,\lambda_1^{n-j}\lambda_2^{j}\,(x-k)^{n-1}\right), \quad x\in\mathbb{R}, \tag{29}
$$

where $\delta_{\{0\}}(\cdot)$ denotes the Dirac delta function.

Proof. The density function is written in terms of a Poisson-weighted sum of double-uniform-sum densities, i.e.,

$$
f_{\tilde K|t}(x) = \sum_{n=0}^{\infty}\frac{((\lambda_1+\lambda_2)t)^{n}e^{-(\lambda_1+\lambda_2)t}}{n!}\,f_{\tilde K|t}(x\,|\,N_t = n), \tag{30}
$$

where recall that the Poisson process $N$ has intensity $\lambda_1+\lambda_2$. For $n\ge 1$, $f_{\tilde K|t}(x\,|\,N_t = n)$ is identical to $f_V(x)$ with weight parameter $\lambda_1/(\lambda_1+\lambda_2)$. For $n = 0$, on the other hand, it becomes a unit mass at the origin. Clearly, (29) signifies a mixed discrete-continuous distribution.
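Formulas (23) and (30) can be evaluated as written. The sketch below uses exact rational arithmetic for the alternating sums, which are prone to cancellation in floating point, and converts to a float only when the exponential factor is applied. The function names and the default truncation level are our own assumptions; at integer points the code takes the right-continuous value of the density.

```python
from fractions import Fraction
from math import comb, exp, factorial, floor

def double_uniform_sum_pdf(x, n, w):
    """Density (23) of the sum of n double-uniform variables (exact arithmetic)."""
    x, w = Fraction(x), Fraction(w)
    total = Fraction(0)
    for k in range(-n, min(floor(x), n) + 1):
        # Only indices with 0 <= k + j <= n and 0 <= j <= n have nonzero binomials.
        for j in range(max(0, -k), min(n, n - k) + 1):
            sign = -1 if (k + j) % 2 else 1
            total += (sign * comb(n, k + j) * comb(n, j)
                      * w ** (n - j) * (1 - w) ** j * (x - k) ** (n - 1))
    return total / factorial(n - 1)

def running_average_pdf(x, t, lam1, lam2, n_max=25):
    """Continuous part of (29): the Poisson mixture (30) truncated at n_max terms."""
    t, lam1, lam2 = Fraction(t), Fraction(lam1), Fraction(lam2)
    rate = (lam1 + lam2) * t
    w = lam1 / (lam1 + lam2)
    mixture = sum(rate ** n / factorial(n) * double_uniform_sum_pdf(x, n, w)
                  for n in range(1, n_max + 1))
    return exp(-float(rate)) * float(mixture)
```

As a small check, for $n = 2$ the density at the origin is $\int f_U(y)f_U(-y)\,\mathrm{d}y = 2w(1-w)$, and for $w = 1/2$ the sum of two variables has the triangular density on $[-2, 2]$.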

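In a similar fashion, the cumulative distribution function of $\tilde K$ is as below.

Theorem 7.

$$
\begin{aligned}
F_{\tilde K|t}(x) &\equiv \Pr[\tilde K_t\le x] = \int_{-\infty}^{x}f_{\tilde K|t}(y)\,\mathrm{d}y \\
&= e^{-(\lambda_1+\lambda_2)t}\left(\mathbb{1}_{[0,\infty)}(x) + \sum_{n=1}^{\infty}\frac{t^{n}}{(n!)^{2}}\sum_{k=-n}^{\lfloor x\rfloor}\sum_{j=0}^{n-k}(-1)^{k+j}\binom{n}{k+j}\binom{n}{j}\,\lambda_1^{n-j}\lambda_2^{j}\,(x-k)^{n}\right), \quad x\in\mathbb{R}.
\end{aligned}
\tag{31}
$$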

Proof. The proof is easily completed by the termwise integration of (27) and some simplification analogous to the case of (29).
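The series (31) can be coded directly. In the sketch below the triple sum is accumulated in exact rational arithmetic and truncated at a user-chosen level; the function name and the default truncation level are illustrative assumptions rather than part of the paper.

```python
from fractions import Fraction
from math import comb, exp, factorial, floor

def running_average_cdf(x, t, lam1, lam2, n_max=30):
    """Cumulative distribution function (31), truncated after n_max series terms."""
    x, t, lam1, lam2 = (Fraction(v) for v in (x, t, lam1, lam2))
    series = Fraction(1 if x >= 0 else 0)      # unit mass at the origin
    for n in range(1, n_max + 1):
        inner = Fraction(0)
        for k in range(-n, min(floor(x), n) + 1):
            # Restrict j to the indices where both binomial coefficients are nonzero.
            for j in range(max(0, -k), min(n, n - k) + 1):
                sign = -1 if (k + j) % 2 else 1
                inner += (sign * comb(n, k + j) * comb(n, j)
                          * lam1 ** (n - j) * lam2 ** j * (x - k) ** n)
        series += t ** n * inner / factorial(n) ** 2
    return exp(-float((lam1 + lam2) * t)) * float(series)
```

Two features of (31) are easy to observe with this function: the value tends to 1 for large $x$ once enough terms are kept, and there is a jump of size $e^{-(\lambda_1+\lambda_2)t}$ at the origin.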

The density function (29) has some eccentric features. In fact, regarding smoothness, its $r$th order derivative $\mathrm{d}^r f_{\tilde K|t}(x)/\mathrm{d}x^r$ is undefined for all $x\in\mathbb{Z}\cap[-r-1, r+1]$, with $r\in\mathbb{N}$. We notice that indeterminateness can only occur when the term $x-k$ has non-positive power. It then suffices to see that if $n = 1$, for $c\in\{-1, 0, 1\}$,

$$
\begin{aligned}
&\lim_{x\nearrow c}\sum_{k=-1}^{\lfloor x\rfloor}\sum_{j=0}^{1-k}(-1)^{k+j}\binom{1}{k+j}\binom{1}{j}\,\lambda_1^{1-j}\lambda_2^{j}(x-k)^{0} \\
&\quad= \Big(\lambda_1\big(\mathbb{1}_{\{1\}}(c) - \mathbb{1}_{\{0\}}(c)\big) + \lambda_2\big(\mathbb{1}_{\{0\}}(c) - \mathbb{1}_{\{-1\}}(c)\big)\Big) + \lim_{x\searrow c}\sum_{k=-1}^{\lfloor x\rfloor}\sum_{j=0}^{1-k}(-1)^{k+j}\binom{1}{k+j}\binom{1}{j}\,\lambda_1^{1-j}\lambda_2^{j}(x-k)^{0},
\end{aligned}
\tag{32}
$$

since for $n = 1$ the double sum equals $\lambda_2$ on $[-1, 0)$, $\lambda_1$ on $[0, 1)$, and 0 elsewhere. Therefore, it can be implied that, without regard to the disconnected mass at the origin, the density function (29) has three discontinuities, with directions (as $x$ increases) $+1$, $\mathrm{sgn}(\lambda_1-\lambda_2)$, and $-1$, and with magnitudes $\lambda_2 te^{-(\lambda_1+\lambda_2)t}$, $|\lambda_1-\lambda_2|te^{-(\lambda_1+\lambda_2)t}$, and $\lambda_1 te^{-(\lambda_1+\lambda_2)t}$, at $x = -1$, $x = 0$, and $x = 1$, respectively, the second of which exists provided that $\lambda_1\ne\lambda_2$. Also, at $x = \pm 2$, the density curve is non-smooth. Apparently, all these breaks diminish as the intensity parameters or time increase, along with the weight of the mass. In particular, Appendix B discusses the extreme case as we force $\lambda_2\searrow 0$, in which the Skellam process reduces to a simple Poisson process.

Under the compound Poisson structure, the series (29) converges rapidly for most applications. According to (30), each conditional density $f_{\tilde K|t}(x\,|\,N_t = n)$ as a double-uniform-sum density is nonnegative and, by consulting the central limit theorem, must be convergent to normality as $n$ grows to infinity. This ensures that if $f_{\tilde K|t}(x) = e^{-(\lambda_1+\lambda_2)t}\big(\delta_{\{0\}}(x) + \sum_{n=1}^{\infty}g_n(x)\big)$ then for any $x\in[-n, n]$ $g_n(x)$ is positive decreasing as $n\to\infty$, and therefore a rough geometric series bound for the remainder estimate of

$$
\hat f_{\tilde K|t,n^*}(x) = e^{-(\lambda_1+\lambda_2)t}\left(\delta_{\{0\}}(x) + \sum_{n=1}^{n^*}g_n(x)\right), \tag{33}
$$

for some sufficiently large $n^*$, reads, for $x\in[-n^*, n^*]$,

$$
E_{n^*}(x) < \frac{e^{-(\lambda_1+\lambda_2)t}\,g_{n^*+1}(x)}{1 - g_{n^*+1}(x)/g_{n^*}(x)}.
$$
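The geometric bound above is cheap to evaluate once the summands $g_n(x)$ of (29) are available. The following is a minimal sketch in exact rational arithmetic; the function names are hypothetical, rational inputs (integers or `Fraction` objects) are assumed, and the code presumes $g_{n^*}(x) > 0$ at the chosen point.

```python
from fractions import Fraction
from math import comb, exp, factorial, floor

def series_term(x, n, t, lam1, lam2):
    """g_n(x): the n-th summand of (29) without the factor exp(-(lam1 + lam2) t)."""
    x, t, lam1, lam2 = (Fraction(v) for v in (x, t, lam1, lam2))
    total = Fraction(0)
    for k in range(-n, min(floor(x), n) + 1):
        # Keep only indices where both binomial coefficients are nonzero.
        for j in range(max(0, -k), min(n, n - k) + 1):
            sign = -1 if (k + j) % 2 else 1
            total += (sign * comb(n, k + j) * comb(n, j)
                      * lam1 ** (n - j) * lam2 ** j * (x - k) ** (n - 1))
    return t ** n * total / (factorial(n) * factorial(n - 1))

def remainder_bound(x, n_star, t, lam1, lam2):
    """Geometric-series bound on the truncation error of (33) at the point x."""
    g_next = series_term(x, n_star + 1, t, lam1, lam2)
    g_last = series_term(x, n_star, t, lam1, lam2)
    ratio = g_next / g_last                     # assumed to lie in (0, 1)
    rate = float((Fraction(lam1) + Fraction(lam2)) * Fraction(t))
    return exp(-rate) * float(g_next / (1 - ratio))
```

Calling `remainder_bound(Fraction(-5), 20, 1, Fraction(1, 5), Fraction(3, 5))` evaluates the bound near the mode of the first parametrization discussed in the text.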

To give some illustrations, fixing $t = 1$ and taking $n^* = 20$, under $\{\lambda_1 = 0.2, \lambda_2 = 0.6\}$, $\{\lambda_1 = 1, \lambda_2 = 0.5\}$, $\{\lambda_1 = 2, \lambda_2 = 1.2\}$, and $\{\lambda_1 = 4, \lambda_2 = 6\}$, we obtain that $E_{20}(x) < 2\times 10^{-23}$, $E_{20}(x) < 4\times 10^{-18}$, $E_{20}(x) < 6\times 10^{-12}$, and $E_{20}(x) < 3\times 10^{-4}$, respectively, for $|x|\le 10$. Understanding the rapid convergence of (31) is similar. In this connection, Figure 1 below plots the density function approximations $\hat f_{\tilde K|t,20}(x)$ of $f_{\tilde K|t}(x)$ for these four examples, in proper order, where we also include the mass function of the Skellam process $K$, $p_{K|t}(x)$, for visual comparison.

Fig 1. Comparison of distributions of $K$ and $\tilde K$ at $t = 1$

In each plot, a solid square mark stands for the probability mass of $\tilde K$ at the origin, which, notably, is significantly smaller than that of $K$ at 0. Although this is directly due to the fact that $I_0(z)\ge 1$, $\forall z\in\mathbb{R}$, another more intuitive explanation is from the path behavior: note that $K$ is a pure-jump Lévy process with jumps of size either 1 or $-1$, whereas after the first jump of $K$ the running average $\int_0^t K_s\,\mathrm{d}s/t$ is nonzero with probability 1. It is also clear that the density function of $\tilde K$ can have three breaks, whose magnitudes are negatively influenced by the intensities and time elapsed, as aforementioned.

In particular, using $\lambda_1 = 1$, $\lambda_2 = 0.5$ and $t = 1$, we have the next plot of both the probability density function and cumulative distribution function of $\tilde K$. Observably, wherever the density function has breaks, the cumulative distribution function is non-smooth or has abrupt turns. The mass at 0 has led to a jump in the cumulative distribution function.

Fig 2. Probability density function and cumulative distribution function of $\tilde K$ with $\lambda_1 = 1$ and $\lambda_2 = 0.5$ at $t = 1$

Appendix A - Statistical Properties of Double-uniform-sum Distributions

Given $n\in\mathbb{N}_{++}$ and $w\in(0, 1)$, we already know that the Fourier transform of (23) is given by

$$
\phi_V(u) = \int_{\mathbb{R}}e^{iux}f_V(x)\,\mathrm{d}x = \left((1-w)\frac{1-e^{-iu}}{iu} + w\,\frac{e^{iu}-1}{iu}\right)^{n}, \quad u\in\mathbb{R}. \tag{34}
$$

Expanding $\ln\phi_V(u)$ around 0 we can directly obtain that

$$
\mathrm{E}[V] = n\left(w - \frac{1}{2}\right), \tag{35}
$$

$$
\mathrm{Var}[V] = n\left(w - w^{2} + \frac{1}{12}\right), \tag{36}
$$

$$
\mathrm{Skew}[V] = \frac{24\sqrt{3}\,(2w^{3} - 3w^{2} + w)}{\sqrt{n\,(12w - 12w^{2} + 1)^{3}}}, \tag{37}
$$

$$
\mathrm{EKurt}[V] = -\frac{6\,(720w^{4} - 1440w^{3} + 840w^{2} - 120w + 1)}{5n\,(12w - 12w^{2} + 1)^{2}}. \tag{38}
$$

With $w$ fixed, the mean and variance increase whereas the skewness and excess kurtosis decrease with growing $n$. On the other hand, the weight $w$ places a large impact on the asymmetric leptokurtic feature. With equal $n$, the following facts are easily justifiable:

• $\mathrm{Skew}[V] > 0$ if $w\in(0, 1/2)$ and $\mathrm{Skew}[V] < 0$ if $w\in(1/2, 1)$, while if $w = 1/2$ or in the limit $w\nearrow 1$ or $w\searrow 0$, $\mathrm{Skew}[V] = 0$;

• $\max_{w\in(0,1)}\mathrm{Skew}[V] = -\min_{w\in(0,1)}\mathrm{Skew}[V] = \sqrt{1/n}$;

• with $\omega_{\pm\pm} = \big(1 \pm (2(5\pm\sqrt{5})/15)^{1/2}\big)/2$, the four roots being labeled so that $\omega_{--} < \omega_{-+} < \omega_{+-} < \omega_{++}$, $\mathrm{EKurt}[V] > 0$ if $w\in(\omega_{--}, \omega_{-+})\cup(\omega_{+-}, \omega_{++})$, $\mathrm{EKurt}[V] < 0$ if $w\in(0, \omega_{--})\cup(\omega_{-+}, \omega_{+-})\cup(\omega_{++}, 1)$, and $\mathrm{EKurt}[V] = 0$ at each $w = \omega_{\pm\pm}$;

• $\max_{w\in(0,1)}\mathrm{EKurt}[V] = 3/(2n)$ and $\min_{w\in(0,1)}\mathrm{EKurt}[V] = -6/(5n)$; in the limit $w\nearrow 1$ or $w\searrow 0$, $\mathrm{EKurt}[V] = -6/(5n)$.

Recall that $V$'s distribution is supported by $x\in[-n, n]$. The figure below presents, with $w = 0.7$, the density functions $f_V(x)$ for all $n\le 5$.

Fig 3. Double-uniform-sum probability density functions with w = 0.7

As the number of convolutions increases, the density curves become smoother and will eventually tend to normality.
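The closed forms (35)–(38) can be cross-checked against the cumulants of a single double-uniform variable, whose raw moments are $\mathrm{E}[U^r] = (w + (-1)^r(1-w))/(r+1)$. The sketch below does this in exact rational arithmetic, again comparing the squared skewness to avoid square roots; all function names are illustrative.

```python
from fractions import Fraction

def double_uniform_cumulants(w):
    """First four cumulants of one double-uniform variable with weight w."""
    w = Fraction(w)
    # Raw moments E[U^r] = (w + (-1)^r (1 - w)) / (r + 1) for r = 1, ..., 4.
    m1, m2, m3, m4 = ((w + (-1) ** r * (1 - w)) / (r + 1) for r in range(1, 5))
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    k4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return k1, k2, k3, k4

def sum_statistics(n, w):
    """Mean, variance, squared skewness, excess kurtosis of V (cumulants add up)."""
    k1, k2, k3, k4 = double_uniform_cumulants(w)
    return n * k1, n * k2, (n * k3) ** 2 / (n * k2) ** 3, (n * k4) / (n * k2) ** 2

def closed_forms(n, w):
    """The same four quantities computed from (35)-(38)."""
    w = Fraction(w)
    d = 12 * w - 12 * w ** 2 + 1
    mean = n * (w - Fraction(1, 2))
    var = n * (w - w ** 2 + Fraction(1, 12))
    skew_squared = 24 ** 2 * 3 * (2 * w ** 3 - 3 * w ** 2 + w) ** 2 / (n * d ** 3)
    ekurt = -6 * (720 * w ** 4 - 1440 * w ** 3 + 840 * w ** 2 - 120 * w + 1) / (5 * n * d ** 2)
    return mean, var, skew_squared, ekurt
```

For any rational weight, e.g. $w = 7/10$ as in Fig 3, the two functions should return identical tuples.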

Appendix B - Selected Formulas for the Case of Poisson Processes

In the extreme case where $\lambda_2\searrow 0$, we denote by $\big(K^{(0)}_t\big)$ and $\big(\tilde K^{(0)}_t\big)$ the Poisson process and its running average, to avoid confusion. Based on (29), note that the sum over $j$ is nonzero only for $j = 0$, as $\lim_{\lambda_2\searrow 0}\lambda_2^{0} = 1$, and then simplification leads to

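$$
f_{\tilde K^{(0)}|t}(x) = e^{-\lambda_1 t}\left(\delta_{\{0\}}(x) + \sum_{n=1}^{\infty}\frac{(\lambda_1 t)^{n}}{n!\,(n-1)!}\sum_{k=0}^{\lfloor x\rfloor}(-1)^{k}\binom{n}{k}(x-k)^{n-1}\right), \quad x\ge 0, \tag{39}
$$

and, similarly,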

$$
F_{\tilde K^{(0)}|t}(x) = e^{-\lambda_1 t}\left(1 + \sum_{n=1}^{\infty}\frac{(\lambda_1 t)^{n}}{(n!)^{2}}\sum_{k=0}^{\lfloor x\rfloor}(-1)^{k}\binom{n}{k}(x-k)^{n}\right), \quad x\ge 0. \tag{40}
$$

Simultaneously, the characteristic function reduces to

$$
\phi_{\tilde K^{(0)}|t}(u) = \exp\!\left(\lambda_1 t\left(\frac{e^{iu}-1}{iu} - 1\right)\right), \quad u\in\mathbb{R}. \tag{41}
$$

We remark that in this case $\tilde K^{(0)}$ is equivalent in law to a compound Poisson process with intensity $\lambda_1 > 0$ and standard uniform jumps, for which the convoluted law is the limit of (23) as $w\nearrow 1$, and is known as the Irwin-Hall distribution proposed in Hall (1927) [3].

Needless to say, in this case the density function (39) has only one downward break of size $\lambda_1 te^{-\lambda_1 t}$ at $x = 1$. In addition, the point mass at the origin coincides with that of the distribution of the Poisson process $K^{(0)}$, which has nondecreasing sample paths. On the other hand, now $p_{K^{(0)}|t}(0) \equiv \Pr\big[K^{(0)}_t = 0\big] = e^{-\lambda_1 t}$.

References

[1] M. Abramowitz, I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 10th printing, U.S. National Bureau of Standards, Washington, D.C., USA (1972), 375–378. DOI: 10.1063/1.3047921

[2] O. E. Barndorff-Nielsen, D. Pollard, N. Shephard, Integer-valued Lévy processes and low latency financial econometrics, Quantitative Finance, 12, No. 4 (2010), 587–605. DOI: 10.1080/14697688.2012.664935

[3] P. Hall, The distribution of means for samples of size N drawn from a population in which the variate takes values between 0 and 1, all such values being equally probable, Biometrika, 19, No. 3 (1927), 240–245. DOI: 10.2307/2331961

[4] A. Kerss, N. N. Leonenko, A. Sikorskii, Fractional Skellam processes with applications to finance, Fractional Calculus and Applied Analysis, 17, No. 2 (2014), 532–551. DOI: 10.2478/s13540-014-0184-2

[5] W. Schoutens, Lévy Processes in Finance: Pricing Financial Derivatives, Wiley Series in Probability and Statistics, John Wiley & Sons Ltd, Chichester, West Sussex, England (2003), 44–45. DOI: 10.1002/0470870230

[6] J. G. Skellam, The frequency distribution of the difference between two Poisson variables belonging to different populations, Journal of the Royal Statistical Society, Ser. A (1946), 109–296. DOI: 10.2307/2981372

[7] R. P. Stanley, Enumerative Combinatorics, Cambridge Studies in Advanced Mathematics, Cambridge University Press, Cambridge, England, Ser. 2 (1999), 1–10. DOI: 10.1017/CBO9780511609589