Stationary Gamma Processes

Robert L. Wolpert

July 28, 2016 (Draft 2.6)

1 Introduction

For fixed α, β > 0 these notes present six different stationary processes, each with Gamma X_t ∼ Ga(α, β) univariate marginal distributions and autocorrelation function ρ^{|s−t|} for X_s, X_t. Each will be defined on some time index set T, either T = Z or T = R. Five of the six constructions can be applied to other Infinitely Divisible (ID) distributions as well, both continuous ones (normal, α-stable, etc.) and discrete (Poisson, negative binomial, etc.). For the Poisson and Gaussian distributions specifically, all but one of them (the Markov change-point construction) coincide: essentially, there is just one "AR(1)-like" Gaussian process (namely, the AR(1) process in discrete time, or the Ornstein-Uhlenbeck process in continuous time), and there is just one AR(1)-like Poisson process. For other ID distributions, however, and in particular for the Gamma, each of these constructions yields a process with the same univariate marginal distributions and the same autocorrelation but with different joint distributions at three or more times.

First, by "Ga(α, β)" we mean the Gamma distribution with mean α/β and variance α/β², i.e., with shape parameter α and rate parameter β. The pdf and chf are given by:

    f(x | α, β) = (β^α / Γ(α)) x^{α−1} e^{−βx} 1_{x>0}
    χ(ω | α, β) = (1 − iω/β)^{−α}
                = exp{ ∫_{R_+} (e^{iωu} − 1) α e^{−βu} u^{−1} du }.          (1)

The sum ξ_+ := Σ ξ_j of independent random variables ξ_j ∼ Ga(α_j, β) with the same rate parameter also has a Gamma distribution, ξ_+ ∼ Ga(α_+, β), if {α_j} ⊂ R_+ are summable with sum α_+ := Σ α_j < ∞. Eqn (1) shows that the "Lévy measure" for this distribution is ν(du) = α e^{−βu} u^{−1} 1_{u>0} du. From the Poisson representation of ID distributions

    X = ∫_{R_+} u N(du)

for N(du) ∼ Po(ν(du)) we can show that the extremal properties of ID random variables depend only on the Lévy measure ν(du): for large u ∈ R_+,

    P[X > u] ≈ P[ N((u, ∞)) ≠ ∅ ]
             = 1 − exp{ −ν((u, ∞)) }
             ≈ ν((u, ∞)) ≈ (α/βu) e^{−βu},          (2)

from which one might mount a study of the multivariate extreme properties of the six processes presented below.
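As a quick numerical illustration of (2), the sketch below (in R; the parameter values are arbitrary choices of mine) compares the exact Gamma tail with the Lévy tail-mass approximation. For the Gamma the agreement is on the logarithmic scale, both sides decaying at the exponential rate βu, though the ratio of the tails themselves does not tend to one.

    ## Exact Gamma tail P[X > u] versus the approximation
    ## nu((u, Inf)) ~ (alpha/(beta*u)) * exp(-beta*u) from (2).
    alpha <- 2; beta <- 1
    u <- c(5, 10, 20, 40)
    exact  <- pgamma(u, shape = alpha, rate = beta, lower.tail = FALSE)
    approx <- (alpha / (beta * u)) * exp(-beta * u)
    cbind(u, exact, approx, log.ratio = log(exact) / log(approx))  # log.ratio -> 1, slowly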

2 Six Stationary Gamma Processes

2.1 The Gamma AR(1) Process

Fix α, β > 0 and 0 ≤ ρ < 1. Let X_0 ∼ Ga(α, β) and for t ∈ N define X_t recursively by

    X_t := ρ X_{t−1} + ζ_t          (3)

for iid {ζ_t} with chf E e^{iωζ_t} = (1 − iω/β)^{−α} (1 − iρω/β)^{α} = [(β − iω)/(β − iρω)]^{−α} (easily seen to be positive-definite, with Lévy measure ν_ζ(du) = α (e^{−βu} − e^{−βu/ρ}) u^{−1} 1_{u>0} du). A simple way of generating ζ_t with this distribution is presented in the Appendix. The process {X_t} has Gamma univariate marginal distributions X_t ∼ Ga(α, β) for every t and, at consecutive times 0, 1, joint chf

    χ(s, t) = E exp(isX_0 + itX_1)
            = E exp(i(s + ρt)X_0 + itζ_1)
            = [ (1 − i(s + ρt)/β)(1 − it/β) / (1 − itρ/β) ]^{−α}.          (4)

Since this is asymmetric in s and t, the process is not time-reversible; this is also evident from the observation that

    P[X_t ≥ ρX_{t−1}] = P[ζ_t ≥ 0] = 1 > P[ζ_t ≤ X_{t−1}(1 − ρ²)/ρ] = P[X_{t−1} ≥ ρX_t].

This process has marginal distribution X_t ∼ Ga(α, β) at all times t, and has autocorrelation Corr(X_s, X_t) = ρ^{|s−t|} (easily found from either (3) or (4)). It is clearly Markov (from (3)), and can be shown to have infinitely-divisible (ID) multivariate marginal distributions of all orders, since the {ζ_t} are ID.
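Proposition 1 in the Appendix makes this process easy to simulate; a minimal sketch in R (the function names are mine, not from the author's code):

    ## One innovation zeta_t for the Gamma AR(1) process (3), drawn by
    ## Walker's (2000) mixture: lambda ~ Ga(alpha, 1),
    ## N | lambda ~ Po(lambda*(1-rho)/rho), zeta | N ~ Ga(N, beta/rho).
    rzeta <- function(alpha, beta, rho) {
      lambda <- rgamma(1, shape = alpha, rate = 1)
      N <- rpois(1, lambda * (1 - rho) / rho)
      if (N == 0) 0 else rgamma(1, shape = N, rate = beta / rho)
    }

    ## nt steps of X_t = rho * X_{t-1} + zeta_t, started in stationarity.
    rgamma_ar1 <- function(nt, alpha, beta, rho) {
      X <- numeric(nt)
      X[1] <- rgamma(1, shape = alpha, rate = beta)
      for (t in 2:nt) X[t] <- rho * X[t - 1] + rzeta(alpha, beta, rho)
      X
    }

    x <- rgamma_ar1(10000, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), var(x), cor(x[-1], x[-length(x)]))  # ~ alpha/beta, alpha/beta^2, rho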

2.2 Thinned Gamma Process

If X ∼ Ga(α_1, β) and Y ∼ Ga(α_2, β) are independent, then Z := X + Y ∼ Ga(α_+, β) and U := X/Z ∼ Be(α_1, α_2) are also independent (where α_+ := α_1 + α_2), because under the change of variables x = uz, y = (1 − u)z with Jacobian J(u, z) = z,

    f(u, z) = [ (β^{α_1}/Γ(α_1)) x^{α_1−1} e^{−βx} ] [ (β^{α_2}/Γ(α_2)) y^{α_2−1} e^{−βy} ] J(u, z)
            = [ β^{α_+} / (Γ(α_1)Γ(α_2)) ] u^{α_1−1} (1 − u)^{α_2−1} z^{α_+−2} e^{−βz} z
            = [ (Γ(α_+)/(Γ(α_1)Γ(α_2))) u^{α_1−1} (1 − u)^{α_2−1} ] [ (β^{α_+}/Γ(α_+)) z^{α_+−1} e^{−βz} ].

Thus if Z ∼ Ga(α_1 + α_2, β) and U ∼ Be(α_1, α_2) are independent then X = UZ ∼ Ga(α_1, β) and Y = (1 − U)Z ∼ Ga(α_2, β) are independent too. Let

    X_0 ∼ Ga(α, β)

and, for t ∈ N (or t ∈ −N, resp.), set

    X_t := ξ_t + ζ_t,    where

    ξ_t := B_t · X_{t−1},    B_t ∼ Be(αρ, αρ̄),    ζ_t ∼ Ga(αρ̄, β)

(or ξ_t = B_t · X_{t+1}, resp.), where ρ̄ := (1 − ρ) and all the {B_t} and {ζ_t} are independent.

Then, by induction, ξ_t ∼ Ga(αρ, β) and ζ_t ∼ Ga(αρ̄, β) are independent, with sum X_t ∼ Ga(α, β). Thus {X_t} is a Markov process with Gamma univariate marginal distribution X_t ∼ Ga(α, β), now with symmetric joint chf

    χ(s, t) = E exp(isX_0 + itX_1)
            = E exp{ is(X_0 − ξ_1) + i(s + t)ξ_1 + itζ_1 }
            = (1 − is/β)^{−αρ̄} (1 − i(s + t)/β)^{−αρ} (1 − it/β)^{−αρ̄}          (5)

and autocorrelation Corr(X_s, X_t) = ρ^{|s−t|}. The process of passing from X_{t−1} to ξ_t = X_{t−1}B_t is called thinning, so {X_t} is called the thinned gamma process. A similar construction is available for any ID marginal distribution.
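A matching simulation sketch in R for the thinned construction (again, the name is mine):

    ## Thinned Gamma process: X_t = B_t * X_{t-1} + zeta_t with
    ## B_t ~ Be(alpha*rho, alpha*(1-rho)) and zeta_t ~ Ga(alpha*(1-rho), beta).
    rgamma_thin <- function(nt, alpha, beta, rho) {
      X <- numeric(nt)
      X[1] <- rgamma(1, shape = alpha, rate = beta)
      for (t in 2:nt) {
        B    <- rbeta(1, alpha * rho, alpha * (1 - rho))
        zeta <- rgamma(1, shape = alpha * (1 - rho), rate = beta)
        X[t] <- B * X[t - 1] + zeta
      }
      X
    }

    x <- rgamma_thin(10000, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), var(x), cor(x[-1], x[-length(x)]))  # ~ alpha/beta, alpha/beta^2, rho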

2.3 Random Measure Gamma Process

Let G(dx dy) be a countably additive random measure that assigns independent random variables G(A_i) ∼ Ga(α|A_i|, β) to disjoint Borel sets A_i ∈ B(R²) of finite area |A_i| (this is possible by the Kolmogorov consistency conditions, and is illustrated in the Appendix) and, for λ := −log ρ, consider the collection of sets

    G_t := { (x, y) : x ∈ R, 0 ≤ y < λ e^{−2λ|t−x|} }

shown in Figure 1, whose intersections have area |G_s ∩ G_t| = e^{−λ|s−t|}. For t ∈ T = R, set

[Figure 1: Random measure construction of the process X_t = G(G_t); the sets G_1, G_2, G_3 are shown as regions under the curves y = λe^{−2λ|t−x|}, plotted against time t.]

    X_t := G(G_t).          (6)

For any n times t_1 < t_2 < ··· < t_n the sets {G_{t_i}} partition R² into n(n + 1)/2 sets of finite area (and one with infinite area, (∪ G_{t_i})^c), so each X_{t_i} can be written as the sum of some subset of n(n + 1)/2 independent Gamma random variables. In particular, any n = 2 variables X_s and X_t can be written as

    X_s = G(G_s \ G_t) + G(G_s ∩ G_t),    X_t = G(G_t \ G_s) + G(G_s ∩ G_t)

just as in the thinning approach of Section 2.2, so both 1-dimensional and 2-dimensional marginal distributions for the random measure process coincide with those for the thinning process. Again the joint chf is

    χ(s, t) = E exp(isX_0 + itX_1)
            = (1 − is/β)^{−αρ̄} (1 − i(s + t)/β)^{−αρ} (1 − it/β)^{−αρ̄}          (7)

and the autocorrelation is Corr(X_s, X_t) = exp(−λ|s − t|) or, for integer times, ρ^{|s−t|} for ρ := exp(−λ). The distribution for consecutive triplets differs from that of the Thinned Gamma Process, however, an illustration that the thinning process is Markov but the random measure process is not. The Random Measure process does feature infinitely-divisible (ID) marginal distributions of all orders, while the thinned process does not.
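One crude way to simulate this construction is to discretize the strip R × [0, λ) into small cells, give each cell an independent Ga(α · area, β) mass, and sum the cells lying under the curve bounding each G_t. A sketch in R; the grid resolution and the truncation of the x-range are my choices, and both introduce approximation error:

    ## Approximate X_t = G(G_t): G assigns independent Ga(alpha*area, beta)
    ## masses to disjoint cells of the strip R x [0, lambda).
    rgamma_measure <- function(times, alpha, beta, rho, dx = 0.01, ny = 100, pad = 10) {
      lambda <- -log(rho)
      xs <- seq(min(times) - pad, max(times) + pad, by = dx)  # truncated x-range
      dy <- lambda / ny
      ys <- (seq_len(ny) - 0.5) * dy                          # cell mid-heights
      cells <- matrix(rgamma(length(xs) * ny, shape = alpha * dx * dy, rate = beta),
                      nrow = length(xs))
      sapply(times, function(t) {
        height <- lambda * exp(-2 * lambda * abs(t - xs))     # boundary of G_t
        sum(cells[outer(height, ys, ">")])                    # cells inside G_t
      })
    }

    x <- rgamma_measure(0:50, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), var(x))  # roughly alpha/beta and alpha/beta^2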

2.4 The Markov change-point Gamma Process

Let {ζ_n : n ∈ Z} be iid Ga(α, β) random variables and let N_t be a standard Poisson process indexed by t ∈ R (so N_0 = 0 and (N_t − N_s) ∼ Po(t − s) for all s < t), and set

    X_t := ζ_n,    n = N_{λt}.

Then each X_t ∼ Ga(α, β) and, for s, t ∈ R, X_s and X_t are either identical (with probability ρ^{|s−t|}) or independent, reminiscent of a Metropolis MCMC chain. The chf is

    χ(s, t) = E exp(isX_0 + itX_1)
            = ρ (1 − i(s + t)/β)^{−α} + ρ̄ (1 − is/β)^{−α} (1 − it/β)^{−α}          (8)

and once again the marginal distribution is X_t ∼ Ga(α, β) and the autocorrelation function is Corr(X_s, X_t) = ρ^{|s−t|}.
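At integer times this process is immediate to simulate, since the probability of at least one change-point in a unit interval is 1 − e^{−λ} = 1 − ρ; a sketch in R (name mine):

    ## Markov change-point Gamma process at integer times: hold the current
    ## value until the Poisson clock jumps, then redraw from Ga(alpha, beta).
    rgamma_cp <- function(nt, alpha, beta, rho) {
      X <- numeric(nt)
      X[1] <- rgamma(1, shape = alpha, rate = beta)
      for (t in 2:nt) {
        X[t] <- if (runif(1) < rho) X[t - 1]                # no change-point
                else rgamma(1, shape = alpha, rate = beta)  # redraw
      }
      X
    }

    x <- rgamma_cp(10000, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), cor(x[-1], x[-length(x)]))  # ~ alpha/beta and rho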

2.5 The Squared O-U Gamma Diffusion

Let {Z_i} iid ∼ OU(λ/2, 1) be independent Ornstein-Uhlenbeck velocity processes, mean-zero Gaussian processes with covariance Cov(Z_i(s), Z_j(t)) = exp(−(λ/2)|s − t|) δ_ij, and set

    X_t := (1/2β) Σ_{i=1}^{n} Z_i(t)²

for n ∈ N and β ∈ R_+. Note Z_i(t) ∼ No(0, 1), so E Z_i(t)² = 1 and E Z_i(t)⁴ = 3; it follows that E Z_i(s)² Z_i(t)² = 1 + 2 exp(−λ|s − t|). Then X_t ∼ Ga(α, β) for α = n/2, with E X_t = α/β and

    E X_s X_t = (1/4β²) [ n (1 + 2 exp(−λ|s − t|)) + n(n − 1) ]
              = [ α² + α exp(−λ|s − t|) ] / β²,

so the autocovariance is Cov(X_s, X_t) = (α/β²) e^{−λ|s−t|} and the autocorrelation at integer times is Corr(X_s, X_t) = ρ^{|s−t|} for ρ := exp(−λ).
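The construction can be simulated exactly at integer times, since each coordinate Z_i is then a stationary Gaussian AR(1) with lag-one correlation e^{−λ/2} = ρ^{1/2}; a sketch in R (name mine):

    ## Squared O-U construction: X_t = sum_i Z_i(t)^2 / (2*beta), each Z_i a
    ## stationary Gaussian AR(1) with Corr(Z_i(t), Z_i(t+1)) = sqrt(rho).
    rgamma_sqou <- function(nt, alpha, beta, rho) {
      n <- round(2 * alpha)                     # the construction needs alpha = n/2
      a <- sqrt(rho)
      z <- rnorm(n)                             # stationary start: iid No(0, 1)
      X <- numeric(nt)
      X[1] <- sum(z^2) / (2 * beta)
      for (t in 2:nt) {
        z <- a * z + sqrt(1 - a^2) * rnorm(n)   # exact O-U transition, one step
        X[t] <- sum(z^2) / (2 * beta)
      }
      X
    }

    x <- rgamma_sqou(10000, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), var(x), cor(x[-1], x[-length(x)]))  # ~ alpha/beta, alpha/beta^2, rho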

The chf at consecutive integer times is

    χ(s, t) = E exp(isX_0 + itX_1)
            = [ 1 − i(s + t)/β − st(1 − ρ)/β² ]^{−α},          (9)

distinct from (4), (5) = (7), and (8), so this process is new. Itô's formula is used in (Wolpert, 2011) to show that X_t has stochastic differential equation (SDE) representation

    X_t = X_0 − 2λ ∫_0^t (X_s − α/β) ds + 2 ∫_0^t √(λ/β) √(X_s) dW_s          (10)

and hence has generator A φ(x) = (∂/∂ε) E[φ(X_{t+ε}) | X_t = x] |_{ε=0} given by

    A φ(x) = −2λ (x − α/β) φ′(x) + (2λ/β) x φ″(x),          (11)

which we will use to distinguish this process from that of Section 2.6. While the construction above required half-integer values for α, (9) is positive-definite and the SDE (10) has a unique strong solution for all α > 0, so a time-reversible stationary Markov diffusion process exists with this distribution. Cox et al. (1985, Eqn (17)), building on (Feller, 1951), found the transition kernel for this process:

    f(y, t | x, s) = c e^{−u−v} (v/u)^{(α−1)/2} I_{α−1}(2√(uv)),

where c := β/(1 − e^{−2λ|t−s|}), u := c x e^{−2λ|t−s|}, v := c y, and I_q(·) is the modified Bessel function of the first kind of order q. With this one can construct likelihood functions and generate posterior samples, MLEs, etc. for the parameters of this process. Cox et al. show that the process X_t is strictly positive at all times if α ≥ 1, but occasionally reaches zero for α < 1.
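A direct R transcription of this transition kernel, usable for likelihood-based inference (the function name is mine; the exponentially scaled Bessel function guards against overflow when |t − s| is small):

    ## Transition density f(y, t | x, s) of the squared O-U / CIR diffusion,
    ## with dt = |t - s|; besselI(z, q, expon.scaled = TRUE) = exp(-z) * I_q(z).
    dcir <- function(y, x, dt, alpha, beta, lambda) {
      g  <- exp(-2 * lambda * dt)
      cc <- beta / (1 - g)
      u  <- cc * x * g
      v  <- cc * y
      z  <- 2 * sqrt(u * v)
      cc * exp(-u - v + z) * (v / u)^((alpha - 1) / 2) *
        besselI(z, alpha - 1, expon.scaled = TRUE)
    }

    ## Sanity check: the kernel integrates to ~1 in y.
    integrate(dcir, 0, Inf, x = 1.3, dt = 1, alpha = 2, beta = 1, lambda = 0.35)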

2.6 Continuously Thinned Gamma Process

Pick a large integer n and set ε := 1/n, q := exp(−λε), and p := 1 − q = λε + o(ε). Draw X_0 ∼ Ga(α, β) and, for integers i, j ∈ N, draw independently

    ζ_i ∼ Ga(αp, β),    b_j ∼ Be(αp, αq).

Set:

    X_0  = X_0
    X_ε  = X_0 (1 − b_1) + ζ_1
    X_2ε = X_ε (1 − b_2) + ζ_2
         = X_0 (1 − b_1)(1 − b_2) + ζ_1 (1 − b_2) + ζ_2
    X_3ε = X_0 (1 − b_1)(1 − b_2)(1 − b_3) + ζ_1 (1 − b_2)(1 − b_3) + ζ_2 (1 − b_3) + ζ_3

and, in general,

    X_kε = X_0 ∏_{j=1}^{k} (1 − b_j) + Σ_{i=1}^{k} ζ_i { ∏_{i<j≤k} (1 − b_j) }.          (12a)
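The finite-n recursion (12a) is itself easy to run, and approximates the continuous-time limit described next; a sketch in R (name mine):

    ## Continuously thinned Gamma process on the epsilon-grid: each micro-step
    ## is X <- X*(1 - b) + zeta, b ~ Be(alpha*p, alpha*q), zeta ~ Ga(alpha*p, beta).
    rgamma_cthin <- function(nt, alpha, beta, rho, n = 100) {
      q <- rho^(1 / n)                  # exp(-lambda*eps) with lambda = -log(rho)
      p <- 1 - q
      X <- numeric(nt + 1)
      X[1] <- x <- rgamma(1, shape = alpha, rate = beta)
      for (t in 1:nt) {
        for (k in 1:n) {                # n thinning micro-steps per unit time
          x <- x * (1 - rbeta(1, alpha * p, alpha * q)) +
               rgamma(1, shape = alpha * p, rate = beta)
        }
        X[t + 1] <- x
      }
      X
    }

    x <- rgamma_cthin(2000, alpha = 2, beta = 1, rho = 0.7)
    c(mean(x), var(x), cor(x[-1], x[-length(x)]))  # ~ alpha/beta, alpha/beta^2, rho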

In the limit as n → ∞ and kε → t the products converge to the product integral of the beta process introduced by Hjort (1990, §3) (and described lucidly by Thibaux and Jordan, 2007, §2) and the sum to an ordinary gamma stochastic integral,

    X_t = X_0 ∏_{s∈(0,t]} [1 − dB(s)] + ∫_0^t { ∏_{s∈(r,t]} [1 − dB(s)] } ζ(dr),          (12b)

where ζ(dr) ∼ Ga(αλ dr, β) is a Gamma random measure and B(s) ∼ BP(α, λ ds) is a Beta process, i.e., an SII Lévy process with Lévy measure

    ν_B(du) = λα u^{−1} (1 − u)^{α−1} 1_{0<u<1} du.

The product integral may be written explicitly as

    ∏_{s∈(r,t]} [1 − dB(s)] = (1 − F(t)) / (1 − F(r)),

where F(t) := 1 − ∏_{s∈(0,t]} [1 − dB(s)] satisfies

    dF(t)/(1 − F(t)) = dB(t),    B_t = ∫_{(0,t]} dF(s)/(1 − F(s)).          (13)

2.6.1 Generator

The Gamma process of Eqn (12b) is stationary and Markov, with generator

    A φ(x) = ∫_0^∞ [φ(x + u) − φ(x)] αλ u^{−1} e^{−βu} du
           + ∫_0^x [φ(x − u) − φ(x)] αλ u^{−1} (1 − u/x)^{α−1} du.          (14)

Because this differs from (11) (and in particular because it is a non-local operator, showing that X_t has jumps), this process is new. Once again the one-dimensional marginal distributions are X_t ∼ Ga(α, β) and the autocorrelation is Corr(X_s, X_t) = exp(−λ|s − t|).

3 Discussion

We have now constructed six distinct processes that share the same univariate marginal distribution and autocorrelation function, but which all differ in their n-variate marginal distributions for n ≥ 3. Some are Markov, some not; some are time-reversible, some not; some have ID marginal distributions of all orders, some don't. Similar methods can be used to construct AR(1)-like processes with any ID marginal distribution, such as those listed

in the Appendix; only in the two cases of Gaussian and Poisson do all these constructions coincide. Many of these are useful for modeling time-dependent phenomena whose dependence falls off over time, but for which traditional Gaussian methods are unsuitable because of heavy tails, or integer values, or positivity, or for other reasons. I know of very little work (yet!) exploring inference for processes like these; a beginning appears in (Wang, 2013, §3). I have written code in R to generate samples from each of these six processes, available on request.

Appendix

Proposition 1 (Walker (2000)). The innovations ζ_t in Eqn (3) can be constructed successively as follows:

    λ_t ∼ Ga(α, 1),    N_t | λ_t ∼ Po( (1−ρ)λ_t/ρ ),    ζ_t | N_t ∼ Ga(N_t, β/ρ).

Proof. For ω ∈ R,

    E[e^{iζ_tω} | N_t] = (1 − iωρ/β)^{−N_t}
    E[e^{iζ_tω} | λ_t] = Σ_{n=0}^{∞} (1 − iωρ/β)^{−n} ((1−ρ)λ_t/ρ)^n exp(−(1−ρ)λ_t/ρ) / n!
                       = exp{ i(1 − ρ)ω λ_t / (β − iρω) }
    E[e^{iζ_tω}] = [ 1 − i(1 − ρ)ω/(β − iρω) ]^{−α} = [ (β − iω)/(β − iρω) ]^{−α}.

Poisson and Gamma SII Processes

The chf for a Poisson random variable X ∼ Po(ν) is

    χ_X(θ) = E[e^{iθX}] = Σ_{k=0}^{∞} e^{iθk} (ν^k/k!) e^{−ν} = e^{(e^{iθ}−1)ν},

so for any u ∈ R the re-scaled random variable Y := uX has chf

    χ_{uX}(θ) = E[e^{iθuX}] = χ_X(uθ) = e^{(e^{iθu}−1)ν},

and a linear combination Y := Σ_j u_j X_j of independent X_j ∼ Po(ν_j) has chf

    χ_Y(θ) = ∏_j e^{(e^{iθu_j}−1)ν_j}
           = exp{ Σ_j (e^{iθu_j} − 1) ν_j }          (15a)
           = exp{ ∫_R (e^{iθu} − 1) ν(du) }          (15b)

for the discrete measure

    ν(du) = Σ_j ν_j δ_{u_j}(du)

that assigns mass ν_j to each point u_j, provided the sum in (15a) converges. Of course the sum converges if it has only finitely-many terms, or even if there are infinitely-many with Σ_j ν_j < ∞ (because |e^{iθu} − 1| ≤ 2), but that condition isn't actually necessary. Since also |e^{iθu} − 1| ≤ |θu|, the random variable Y will be well-defined and finite provided

    Σ_j (1 ∧ |u_j|) ν_j < ∞          (16a)

or, in integral form,

    ∫_R (1 ∧ |u|) ν(du) < ∞.          (16b)

A random variable with chf of form (15a) for a sequence satisfying (16a) is said to have a "compound Poisson" distribution; one with the more general chf of form (15b) for a measure satisfying (16b) is called Infinitely Divisible, or ID.
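When the sum in (15a) has only finitely many terms, a compound Poisson draw takes one line per term; a sketch in R (name mine):

    ## Y = sum_j u_j X_j with independent X_j ~ Po(nu_j): the compound
    ## Poisson form (15a) with finitely many terms.
    rcpois <- function(nsim, u, nu) {
      X <- matrix(rpois(nsim * length(nu), rep(nu, each = nsim)), nrow = nsim)
      as.vector(X %*% u)                # one row of Poisson counts per draw
    }

    y <- rcpois(10000, u = c(0.5, 1, 2), nu = c(3, 1, 0.2))
    c(mean(y), sum(c(0.5, 1, 2) * c(3, 1, 0.2)))  # E[Y] = sum_j u_j nu_j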

ID Distributions

For any σ-finite measure ν satisfying (16b) it's easy to make a stochastic process with stationary independent increments of the more general form (15b), beginning with a Poisson random measure N(du ds) on R × R_+ with intensity measure E N(du ds) = ν(du) ds:

    X_t := ∫∫_{R×(0,t]} u N(du ds).          (17)

This is a right-continuous independent-increment process (nondecreasing when ν is concentrated on R_+) that begins at X_0 = 0 and has jumps ∆_t = [X_t − X_{t−}] ≡ [X_t − lim_{s↗t} X_s] of magnitudes ∆_t ∈ E at rate ν(E) for any Borel E ⊂ R. The Poisson process itself is the special case where ν(E) = ν 1_{1∈E}, with jumps of magnitude ∆_t = 1 at rate ν ∈ R_+.

Khinchine and Lévy (1936) showed that a random variable Y has a chf of the form (15b) if and only¹ if, for every n ∈ N, one can write Y = ζ_1 + ··· + ζ_n as the sum of n iid random variables ζ_j. This property is called "Infinite Divisibility" (abbreviated ID), and the processes we have constructed whose increments have this property are called "SII" processes, for their stationary independent increments. Examples of ID distributions (or SII processes) and their Lévy measures include:

    Poisson              Po(λ)             ν(du) = λ δ_1(du)
    Negative Binomial    NB(α, p)          ν(du) = Σ_{k∈N} (α q^k / k) δ_k(du),  q := (1 − p)
    Gamma                Ga(α, β)          ν(du) = α u^{−1} e^{−βu} 1_{u>0} du
    α-Stable             St_0(α, β, γ, δ)  ν(du) = (αγ/π) Γ(α) sin(πα/2) |u|^{−α−1} (1 + β sgn u) du
    Symmetric α-Stable   SαS(α, γ)         ν(du) = (αγ/π) Γ(α) sin(πα/2) |u|^{−α−1} du
    Cauchy               Ca(δ, γ)          ν(du) = (γ/π) |u|^{−2} du

¹ For nonnegative random variables this is true as stated, but a slightly more general form is necessary for real-valued ID random variables, with a condition on ν(du) somewhat weaker than (16b) (see (18c)); if you get interested, ask me about "compensation". This is needed for the Ca(δ, γ) example and, for α ≥ 1, the St_0(α, β, γ, δ) and SαS(α, γ) examples below.

The defining condition for a random variable Y to be "Infinitely Divisible" (ID) is that for each n ∈ N there must exist iid random variables {ζ_j : 1 ≤ j ≤ n} such that Y and Σ_{j=1}^n ζ_j have the same distribution. This is clearly equivalent to the condition that every power χ^α(θ) of the characteristic function χ(θ) := E exp(iθY) must also be a characteristic function (i.e., must be positive-definite) for each inverse integer α = 1/n, because we can just take the {ζ_j} to be iid with chf χ^{1/n}(θ) and verify that their sum has chf χ(θ). Less obvious but also true is that Y is ID if and only if χ^α(θ) is positive-definite for all real α > 0, and even less obvious is the Khinchine and Lévy theorem that χ must take the specific form

    χ(θ) = exp{ iθδ − θ²σ²/2 + ∫_R (e^{iθu} − 1) ν(du) }          (18a)

for some δ ∈ R, σ² ≥ 0, and Borel measure ν satisfying ν({0}) = 0 and (16b) or, a little more generally, the form

    χ(θ) = exp{ iθδ − θ²σ²/2 + ∫_R (e^{iθu} − 1 − iθh(u)) ν(du) }          (18b)

for any bounded function h that satisfies h(u) = u + O(u²) near u ≈ 0 (like arctan u or u 1_{|u|<1} or u/(1 + u²)) and a Borel measure ν satisfying the weaker restriction

    ∫_R (1 ∧ u²) ν(du) < ∞.          (18c)

Some properties of ID distributions beyond our scope, but covered in (Steutel and van Harn, 2004) (see also (Bose et al., 2002)), include:

Theorem 1 (S&vH, Thm 2.13). Let χ(θ) be the chf of an Infinitely Divisible distribution. Then (∀θ ∈ R) {χ(θ) ≠ 0}. Also, if χ(θ) is analytic in some open domain Ω ⊂ C, then (∀θ ∈ Ω) {χ(θ) ≠ 0}. Thus, ID chfs do not vanish on R, or anywhere in C that χ(θ) is analytic.

Theorem 2 (S&vH, Thm 9.8). Let X be an infinitely divisible random variable that is not normal or degenerate. Then the two-sided tail of X satisfies

    lim_{x→∞} −log P[|X| > x] / (x log x) = c

for a number 0 ≤ c < ∞ given by c = 1/r, where r := sup{|u| : u ∈ supp ν} is the largest jump magnitude (so c = 0 when ν has unbounded support). An ID random variable that is not degenerate has a normal distribution if and only if the same limit is c = ∞.

Theorem 3 (S&vH, Prop 2.3). No non-degenerate bounded random variable is ID.

This one's easy enough to prove: if ‖X‖_∞ = B < ∞ and X has the same distribution as Σ_{j=1}^n ζ_j for iid {ζ_j}, then ‖ζ_j‖_∞ ≤ B/n and σ² := V(X) = n V(ζ_j) ≤ n E(ζ_j²) ≤ B²/n for every n, so σ² = 0 and X must be degenerate.

Gamma Variables & Processes

The Gamma distribution X ∼ Ga(α, β) with mean α/β and variance α/β² has chf

    χ_X(θ) = E[e^{iθX}]
           = ∫_0^∞ e^{iθx} (β^α/Γ(α)) x^{α−1} e^{−βx} dx
           = (1 − iθ/β)^{−α}
           = exp{ −α log(1 − iθ/β) }
           = exp{ ∫_0^∞ (e^{iθu} − 1) α u^{−1} e^{−βu} du },

exactly of the form of (15b) with Lévy measure

    ν(du) = α u^{−1} e^{−βu} 1_{u>0} du.

This measure, while infinite, does satisfy condition (16b):

    ∫_R (1 ∧ |u|) ν(du) = ∫_0^1 u · α u^{−1} e^{−βu} du + ∫_1^∞ 1 · α u^{−1} e^{−βu} du
                        ≤ ∫_0^∞ α e^{−βu} du = α/β < ∞.

Thus gamma-distributed random variables are ID, and the SII gamma process

    X_t = ∫∫_{R×(0,t]} u N(du ds)

has infinitely-many non-negative "jumps" ∆_t = X_t − X_{t−} in any time interval a < t ≤ b (since ν(R_+) = ∞ there are infinitely many tiny jumps, though only finitely many of magnitude ε or more, because ν((ε, ∞)) < ∞).
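Although its jumps are too numerous to enumerate, the SII gamma process is trivial to simulate on a time grid, since its increments are independent with X_{t+∆t} − X_t ∼ Ga(α ∆t, β); a sketch in R (name mine):

    ## SII gamma process path on a grid: independent Ga(alpha*dt, beta) increments.
    rgamma_sii <- function(tmax, dt, alpha, beta) {
      incr <- rgamma(round(tmax / dt), shape = alpha * dt, rate = beta)
      c(0, cumsum(incr))                # X at times 0, dt, 2*dt, ..., tmax
    }

    path <- rgamma_sii(tmax = 10, dt = 0.01, alpha = 2, beta = 1)
    c(tail(path, 1), 10 * 2 / 1)        # X_10 has mean alpha*tmax/beta = 20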

References

Bose, A., DasGupta, A. and Rubin, H. (2002) A contemporary review and bibliography of infinitely divisible distributions and processes. Sankhyā, Ser. A, 64, 763–819. Special issue in memory of D. Basu.

Cox, J. C., Ingersoll, J. E., Jr. and Ross, S. A. (1985) A theory of the term structure of interest rates. Econometrica, 53, 385–408.

Feller, W. (1951) Two singular diffusion problems. Annals of Mathematics, 54, 173–182. doi:10.2307/1969318.

Hjort, N. L. (1990) Nonparametric Bayes estimators based on beta processes in models for life history data. Ann. Stat., 18, 1259–1294.

Khinchine, A. Y. and Lévy, P. (1936) Sur les lois stables. Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences, Paris, 202, 374–376.

Steutel, F. W. and van Harn, K. (2004) Infinite Divisibility of Probability Distributions on the Real Line, vol. 259 of Monographs and Textbooks in Pure and Applied Mathematics. New York, NY: Marcel Dekker.

Thibaux, R. and Jordan, M. I. (2007) Hierarchical beta processes and the Indian buffet process. In Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS 2007), March 21–24, San Juan, Puerto Rico (eds. M. Meila and X. Shen).

Walker, S. G. (2000) A note on the innovation distribution of a gamma distributed autoregressive process. Scand. J. Stat., 27, 575–576.

Wang, J. (2013) Bayesian Modeling and Adaptive Monte Carlo with Geophysics Applications. Ph.D. thesis, Duke University Statistical Science Department.

Wolpert, R. L. (2011) Negative binomial & gamma processes. Research notes, 2011-10-05, Draft 69.

Wolpert, R. L. and Ickstadt, K. (1998) Poisson/gamma random field models for spatial statistics. Biometrika, 85, 251–267.

Last edited: July 28, 2016
