Lévy based Cox point processes

Gunnar Hellmund

PhD Thesis

Department of Mathematical Sciences, University of Aarhus
February 2009

To Katharina and Johanna.

Acknowledgments

I would like to thank my family for their love and support during my three years as a PhD student. Especially I want to thank my wife Lene: I am humbled by your love.

I wish to thank my supervisor Eva B. Vedel Jensen for originally encouraging me to initiate the PhD study and introducing me to the theory of point processes. I also wish to thank Rasmus Plenge Waagepetersen and Svend-Erik Graversen for their kind advice and commitment to science.

Preface

This thesis consists of three papers. The first paper, Lévy based Cox point processes, includes background material, basic definitions and other core material. The second paper, Completely random signed measures, answers some fundamental probability-theoretical questions regarding Lévy bases, while the third paper, Strong mixing with a view toward spatio-temporal estimating functions, is concerned with estimation theory for (non-stationary) Cox point processes. Before the actual papers, I provide an informal reader's guide as an introduction, where I put emphasis on the findings I favor.

Contents

• A reader's guide

• Paper 1: Lévy based Cox point processes (30 pages)

• Paper 2: Completely random signed measures (8 pages)

• Paper 3: Strong mixing with a view toward spatio-temporal estimating functions (15 pages)

Lévy based Cox point processes has been published in Advances in Applied Probability, vol. 40, no. 3, pp. 603–629.

Completely random signed measures will be published in Statistics and Probability Letters (in press).

Strong mixing with a view toward spatio-temporal estimating functions will appear as a Thiele Research Report and will be submitted to Statistica Sinica for the upcoming special issue on composite likelihood methods.

A reader's guide

Below, I give an informal reader's guide to the three papers of the thesis. Section numbers etc. refer to the paper in question. References can be found immediately after the reader's guide.

Paper 1: Lévy based Cox point processes

Cox point processes are point processes with random intensity, i.e. given a realization Λ = λ of the generating random field, the resulting point process is a Poisson process with intensity λ. The most widely used Cox point process models are log-Gaussian Cox processes (LGCPs), driven by log-Gaussian random fields, and shot-noise Cox processes (SNCPs), driven by shot-noise random fields. The introduction of Paper 1 provides a short discussion of the literature and points out the major result that both LGCPs and SNCPs can be constructed through kernel smoothings of so-called Lévy bases. A Lévy basis L defined on R^d is a collection of random variables indexed by the bounded Borel subsets of R^d such that

• Its values on disjoint sets are independent

• The values of L are infinitely divisible

• Given any disjoint sequence of bounded Borel sets (A_n) such that ∪_n A_n is a bounded Borel set, L(∪_n A_n) = Σ_n L(A_n), P-a.s.
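The three defining properties can be seen concretely in the simplest example, a Poisson Lévy basis. The sketch below is our own illustration (not part of the thesis): the basis on [0, 1] is built from an underlying homogeneous Poisson point process, so L(A) ~ Po(rate·|A|), values on disjoint sets are independent, and additivity on disjoint sets holds exactly by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# A Poisson Levy basis on [0, 1] built from an underlying homogeneous
# Poisson point process with intensity `rate`: L(A) = number of points in A.
rate = 50.0
n_points = rng.poisson(rate)               # total count on [0, 1)
points = rng.uniform(0.0, 1.0, n_points)

def L(a, b):
    """Value of the basis on the interval A = [a, b)."""
    return int(np.sum((points >= a) & (points < b)))

# Countable additivity on disjoint sets holds exactly, by construction:
parts = [L(0.0, 0.25), L(0.25, 0.5), L(0.5, 1.0)]
total = L(0.0, 1.0)
additive = (sum(parts) == total)
```

Independence of the values on disjoint sets is inherited from the independence properties of the Poisson process; infinite divisibility holds because each L(A) is Poisson distributed.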

Section 2 provides theoretical background material which also enables the reader to understand integration with respect to Lévy bases. Since the random fields driving (log) Lévy based Cox point processes are kernel smoothings of Lévy bases, this is an important section. Lemma 1 gives sufficient conditions for integrability of a kernel with respect to a Lévy basis. Necessary and sufficient conditions for integrability are given in [2], but their conditions are not as user-friendly as ours. For readers unfamiliar with Cox point processes, some key concepts are summarized in Section 3. The definition of Lévy driven Cox processes is given in Section 4.1. Section 4.2 provides nth order product densities (Proposition 3) and the probability generating functional (Proposition 6). A mixing result (Proposition 7) that

might be interesting for inference of stationary point processes is given in Section 4.3, while Section 4.4 provides examples of Lévy driven Cox processes, including the important SNCPs. The definition of log Lévy driven Cox processes is given in Section 5.1. As shown in Section 5.4.2, these include log-Gaussian Cox point processes driven by stationary Gaussian random fields, but the class also contains the log shot-noise Cox processes and combinations of these two classes, as discussed in Section 5.1. A traditional argument for using LGCPs is that the Gaussian random field models random events that make the intensity increase or decrease. Using a stationary Gaussian random field will, however, on average provide equal increases and decreases of the intensity. Log shot-noise random fields can be used to separate (and quantify) the amount of increases and decreases in the intensity generated by random events: a positive shot-noise random field in a log Lévy driven Cox process can describe the positive effect of random events on the intensity, and a negative shot-noise random field can be used to describe the negative effect (see Figure 3 on page 21 of Paper 1). Section 5.2 provides details on nth order product densities, while a mixing result (useful in a stationary setting) is provided by Proposition 11 in Section 5.3. Finally, in Section 6 combinations of Lévy driven Cox processes and log Lévy driven Cox processes are given. Section 7 discusses inhomogeneous (log) Lévy driven Cox processes. In the concluding discussion in Section 8, spatio-temporal extensions are given in Section 8.2, and inference based on summary statistics is mentioned in Section 8.3.
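The separation of positive and negative random events can be sketched numerically. The toy example below is our own illustration (all names, kernels and parameters are made up): two independent non-negative shot-noise fields enter the exponent with opposite signs, so "up" events raise the intensity and "down" events carve holes in it, while the resulting intensity stays strictly positive.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative log Levy driven intensity  Lambda(x) = exp(S_plus(x) - S_minus(x)),
# where S_plus and S_minus are independent non-negative shot-noise fields.
x = np.linspace(0.0, 10.0, 501)

def shot_noise(centres, weight, bandwidth):
    """Sum of Gaussian bumps: a simple non-negative shot-noise field."""
    field = np.zeros_like(x)
    for c in centres:
        field += weight * np.exp(-0.5 * ((x - c) / bandwidth) ** 2)
    return field

up_events = rng.uniform(0, 10, rng.poisson(5))    # centres of positive events
down_events = rng.uniform(0, 10, rng.poisson(5))  # centres of negative events

s_plus = shot_noise(up_events, weight=1.0, bandwidth=0.5)
s_minus = shot_noise(down_events, weight=2.0, bandwidth=0.3)
lam = np.exp(s_plus - s_minus)   # strictly positive intensity field
```

The two fields can be given different weights and bandwidths, which is exactly the point: the amounts of increase and decrease are modelled and quantified separately.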

Paper 2: Completely random signed measures

A Lévy basis can be seen as a 'generalized' random measure. It is, however, not a random (signed) measure in the usual sense, since its realizations may not be (signed) Radon measures. The questions to be answered are: How fundamental is the concept of a Lévy basis, and how is it related to the usual notion of random measures? These questions are answered in the paper Completely random signed measures. According to Lemma 3.1, the assumption of infinitely divisible values of Lévy bases (which might seem odd at first) is a natural consequence of the independence assumption, if the values of the Lévy basis on bounded Borel sets are assumed to have finite variance or the Lévy basis is dominated by Lebesgue measure. The proof of Lemma 3.1 uses a result in [1] and a translation of the problem to a one-dimensional stochastic process setting, a trick that is also used in other proofs. The question 'When is a Lévy basis a (signed) random measure?' is answered by Corollary 5.5 (in conjunction with Lemma 4.6 and Definition 5.4). As a side effect of these investigations, we were able to give a characterization of completely random signed measures that extends known results for completely random measures (Theorem 4.7).

Paper 3: Strong mixing with a view toward spatio-temporal estimating functions

As described in Lévy based Cox point processes, the Lévy based Cox processes constitute a large and flexible class of point processes driven by random fields. Most often, second-order reweighted stationarity or even other kinds of non-stationary models are natural to consider in an applied situation. As noted in Section 1.1, the recent literature on Cox point processes focuses on second-order reweighted stationary processes of the form

Λ(x) = exp(β · z(x)) Ψ(x),   (1)

where β is a parameter vector, z is covariate information and Ψ is a stationary random field. The literature on Cox point processes defined on (rectangular) observation windows in R² has so far used minimum contrast methods for summary statistics to obtain parameter estimates, most recently in [3]. A major drawback of these methods is that the summary statistics are only well-defined if the Cox point process under investigation is stationary or second-order reweighted stationary. Instead of using summary statistics, the Bernoulli composite likelihood introduced in [4] can be used in much more general settings (see Section 2.1). However, departing from stationarity introduces difficulties when proving asymptotic results. The concept of strong mixing, which is essential for consistency and asymptotic normality (Section 2.2) of parameter estimates in the non-stationary setting, is introduced in Section 1.1. For the first time in the literature on Cox point processes defined on (rectangular) observation windows in R², we obtain parameter estimates for a process which is neither stationary nor second-order reweighted stationary (see the short simulation study in Section 2.3). The fact that parameter estimates can be obtained in a non-stationary setting will make Cox point processes more attractive when modeling the dynamics of natural phenomena. Furthermore, we introduce a flexible class of shot-noise Cox processes based on the (in a geostatistical context) popular Whittle-Matérn family of covariance functions (Section 1.2). The flexible second-order structure of this family of stationary shot-noise random fields makes it a good choice for the field Ψ in formula (1), when Cox point processes are used to obtain information on the effect of covariates.
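A minimal numerical sketch of the intensity model (1) may help fix ideas. The covariate z, the parameter β and the stand-in field Ψ below are all our own illustrations (a crude shot-noise field takes the place of the paper's Whittle-Matérn based field).

```python
import numpy as np

rng = np.random.default_rng(2)

# Sketch of the non-stationary intensity (1): Lambda(x) = exp(beta . z(x)) * Psi(x).
grid = np.linspace(0.0, 1.0, 201)
beta = np.array([0.5, -1.0])                   # made-up parameter vector
z = np.stack([np.ones_like(grid), grid])       # intercept + linear trend covariate

# Psi: a simple stationary shot-noise field (sum of Gaussian kernel bumps)
centres = rng.uniform(0.0, 1.0, rng.poisson(20))
psi = np.zeros_like(grid)
for c in centres:
    psi += np.exp(-0.5 * ((grid - c) / 0.05) ** 2)

lam = np.exp(beta @ z) * psi                   # intensity on the grid
```

The deterministic factor exp(β · z(x)) carries the covariate effect; all the randomness and clustering comes from Ψ, which is why the second-order structure of Ψ matters for inference on β.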

References

[1] S. Karlin and W. J. Studden. Tchebycheff Systems: With Applications in Analysis and Statistics. Pure and Applied Mathematics. New York: Wiley, 1966.

[2] B. S. Rajput and J. Rosinski. Spectral representations of infinitely divisible processes. Probab. Theory Relat. Fields, 82:451–487, 1989.

[3] R. Waagepetersen and Y. Guan. Two-step estimation for inhomogeneous spatial point processes. submitted, 2008.

[4] R. P. Waagepetersen. An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics, 63(1):252–258, March 2007.

Lévy based Cox point processes

Gunnar Hellmund, Michaela Prokešová and Eva B. Vedel Jensen

T.N. Thiele Centre for Applied Mathematics in Natural Science
Department of Mathematical Sciences, University of Aarhus, Denmark

Abstract

In this paper, we introduce Lévy driven Cox point processes (LCPs) as Cox point processes with driving intensity function Λ defined by a kernel smoothing of a Lévy basis (an independently scattered infinitely divisible random measure). We also consider log Lévy driven Cox point processes (LLCPs) with Λ equal to the exponential of such a kernel smoothing. Special cases are shot noise Cox processes, log Gaussian Cox processes and log shot noise Cox processes. We study the theoretical properties of Lévy based Cox processes, including moment properties described by nth order product densities, mixing properties, specification of inhomogeneity and spatio-temporal extensions.

Keywords: Cox process; infinitely divisible distribution; inhomogeneity; kernel smoothing; Lévy basis; log Gaussian Cox process; mixing; product density; shot noise Cox process

AMS 2000 Subject Classification: Primary 60D05, 60G55; Secondary 60K35

1 Introduction

Cox point processes constitute one of the most important and versatile classes of point process models for clustered point patterns [12, 13, 30, 31, 33, 41]. During the last decades several new classes of Cox point process models have appeared in the literature – e.g. shot noise Cox processes defined by means of generalized gamma measures [5], log Gaussian Cox processes [9, 29] and shot noise Cox processes [27]. These models share some common properties and differ in others, depending on how the driving intensity measure of the Cox process is constructed. One of the aims of this paper is to introduce a unified framework which is able to include all the different models mentioned above, thus showing them in a new light, investigate their relationships and define further natural extensions of those models. The starting point for us will be the notion of a Lévy basis L – an independently scattered infinitely divisible random measure. The short terminology of a Lévy basis has been introduced in [2, 3]. Independently scattered infinitely divisible random measures have been studied in detail in [36]. Lévy bases include Poisson random measures, mixed Poisson random measures, Gaussian

random measures as well as so-called G-measures [5]. Thus, having in mind the construction of the shot noise Cox processes, the second step in defining the driving intensity Λ of the Cox process should be a kernel smoothing of the Lévy basis,

Λ(ξ) = ∫ k(ξ, η) L(dη),

where k is a kernel (weight) function. By this we arrive at the definition of the Lévy driven Cox processes (LCPs) – i.e. Cox processes with the random driving intensity function defined by an integral of a weight function with respect to a Lévy basis. This construction has earlier been discussed by Robert L. Wolpert under the name of Lévy moving average processes [48], see also [49, 50]. It will be shown that LCPs are, under regularity conditions, shot noise Cox processes with additional random noise. Furthermore, it is also possible to define the driving intensity as the exponential of a kernel smoothing of a Lévy basis (now allowing for non-positive weight functions and non-positive Lévy bases), thus arriving at the log Lévy driven Cox processes (LLCPs). It will be shown that LLCPs have, under regularity conditions, a driving field of the form Λ = Λ₁ · Λ₂, where Λ₁ and Λ₂ are independent, Λ₁ is a log Gaussian field and Λ₂ is a log shot noise field. The latter process may describe clustered point patterns with randomly placed empty holes. Shot noise Cox processes, log Gaussian Cox processes and log shot noise Cox processes will appear as natural building blocks in a modelling framework for Cox processes. Different types of combinations of the building blocks (corresponding to thinning and superposition) will be discussed in the present paper. Having defined the framework, the second aim is to study the theoretical properties of Lévy based Cox processes, including moment properties described by nth order product densities, mixing properties, specification of inhomogeneity and spatio-temporal extensions. Examples where the new models are needed do already appear in the literature.
In [47], an LCP (Cox process with additional random noise) is used in the modelling of tropical rain forest. In [13, pp. 92–100], a point pattern from forestry is described by a shot noise Cox process thinned by a random field, taking unexplained large scale environmental heterogeneity into account. If this field is log Gaussian, the resulting process is one of the combinations of LCPs and LLCPs to be described in the present paper. In the very recent review paper [33], tropical rain forest is modelled by inhomogeneous shot noise Cox processes, obtained by thinning of a homogeneous shot noise Cox process with a log-linear deterministic field depending on explanatory variables. If the deterministic field is substituted with a log Gaussian field, we again arrive at a combination of an LCP and an LLCP. The present paper is organized as follows. In Section 2 we give a short overview of the theory of Lévy bases and integration with respect to such bases. In Section 3 we recall standard results about Cox processes. In Section 4 we introduce and study the Lévy driven Cox processes and in Section 5 the log Lévy driven Cox processes. Combinations of LCPs and LLCPs are discussed in Section 6, while inhomogeneous LCPs and LLCPs are considered in Section 7. We conclude with a discussion.

2 Lévy bases

This section provides a brief overview of the general theory of Lévy bases, in particular the theory of integration with respect to Lévy bases. For a more detailed exposition, see [3, 36] and references therein. Let R be a Borel subset of ℝ^d, B(R) the Borel sets contained in R, and A the δ-ring of bounded Borel subsets of R. Following [36], we consider a collection of real-valued random variables L = {L(A), A ∈ A} with the following properties:

• for every sequence {A_n} of disjoint sets in A, L(A_1), ..., L(A_n), ... are independent random variables and L(∪_n A_n) = Σ_n L(A_n) a.s. provided ∪_n A_n ∈ A,

• for every A in A, L(A) is infinitely divisible.

If L has these properties, L will be called a Lévy basis, cf. [3]. If L(A) ≥ 0 for all A ∈ A, L is called a non-negative Lévy basis. For a random variable X, the logarithm of the characteristic function log E(e^{ivX}) will be called the cumulant function and will be denoted by C(v, X). This notation has been used in the paper [3] where the terminology of a Lévy basis was introduced. When L is a Lévy basis, the cumulant function of L(A) can by the Lévy-Khintchine representation be written as

C(v, L(A)) = iva(A) − ½ v² b(A) + ∫_ℝ (e^{ivr} − 1 − ivr 1_{[−1,1]}(r)) U(dr, A),   (1)

where a is a σ-additive set function on A, b is a measure on B(R), U(dr, A) is a measure on B(R) for fixed dr and a Lévy measure on B(ℝ) for each fixed A ∈ B(R), i.e. U({0}, A) = 0 and ∫_ℝ (1 ∧ r²) U(dr, A) < ∞, where ∧ denotes minimum. In fact U is a measure on B(ℝ) × B(R), cf. [36, Lemma 2.3]. This measure is referred to as the generalized Lévy measure and L is said to have characteristic triplet (a, b, U). If b = 0 then L is called a Lévy jump basis, if U = 0 then L is a Gaussian basis, see the examples below. A general Lévy basis L can always be written as a sum of a Gaussian basis and an independent Lévy jump basis. Note that the term iva(A) corresponds to a nonrandom shift of the values of L. The nonrandom shift may be included in the Gaussian component or in the jump component of the Lévy basis or may be shared amongst them. Let |a| = a⁺ + a⁻. Then, there exists a unique non-negative measure µ on B(R), satisfying

µ(A) = |a|(A) + b(A) + ∫_ℝ (1 ∧ r²) U(dr, A),

for A ∈ A, cf. [36, Proposition 2.1, Definition 2.2]. We will call µ the control measure. In [36, Lemma 2.3] it has been shown that the generalized Lévy measure U factorizes as

U(dr, dη) = V(dr, η) µ(dη),   (2)

where V(dr, η) is a Lévy measure for fixed η. Moreover a and b are absolutely continuous with respect to µ, i.e.

a(dη) = ã(η) µ(dη),   b(dη) = b̃(η) µ(dη),   (3)

and obviously |ã|, b̃ ≤ 1 µ-a.s. Let L′(η) be a random variable with the cumulant function

C(v, L′(η)) = ivã(η) − ½ v² b̃(η) + ∫_ℝ (e^{ivr} − 1 − ivr 1_{[−1,1]}(r)) V(dr, η).   (4)

Then, we get the representation

C(v, L(A)) = ∫_A C(v, L′(η)) µ(dη).   (5)

The random variables L′(η) will play an important role in the following and will be called spot variables. Note that L′(η) characterizes the behaviour of L at location η. For later use, note that if E(L′(η)) and Var(L′(η)) exist, then

E(L′(η)) = ã(η) + ∫_{[−1,1]ᶜ} r V(dr, η),

Var(L′(η)) = b̃(η) + ∫_ℝ r² V(dr, η).

By (1) – (3) it is no restriction if we for modelling purposes only consider Lévy bases with characteristic triplet (a, b, U) of the form

a(dη) = ã_ν(η) ν(dη),   (6)

b(dη) = b̃_ν(η) ν(dη),   (7)

U(dr, dη) = V_ν(dr, η) ν(dη),   (8)

where ν is a non-negative measure on B(R), ã_ν : R → ℝ and b̃_ν : R → [0, ∞) are measurable functions and V_ν(dr, η) is a Lévy measure for fixed η. The random variable satisfying (5) with µ replaced by ν will be denoted by L′_ν(η). For simplicity, we write L′_µ(η) = L′(η), ã_µ = ã, b̃_µ = b̃ and V_µ(dr, η) = V(dr, η). If V_ν(·, η), ã_ν(η) and b̃_ν(η) do not depend on η, neither does the distribution of L′_ν(η), and the Lévy basis L is called ν-factorizable. If moreover the measure ν is proportional to the Lebesgue measure, L is called homogeneous and all the finite dimensional distributions of L are translation invariant. Let us now consider integration of a measurable function f on (R, B(R)) with respect to a Lévy basis L. The function f is said to be integrable with respect to L, cf. [36], if there exists a sequence of simple functions f_n converging to f µ-a.e. and such that ∫_A f_n dL converges in probability, as n → ∞, for all A ∈ B(R). The limit is denoted ∫_A f dL. The integral of a simple function f_n = Σ_{j=1}^{k_n} x_j 1_{A_j} with respect to L is defined in the obvious manner

∫_A f_n dL = Σ_{j=1}^{k_n} x_j L(A ∩ A_j).

The following lemma gives conditions for integrability and characterizes the distribution of the resulting integral.

Lemma 1 Let f be a measurable function on (R, B(R)) and L a Lévy basis on R with characteristic triplet (a, b, U). If the following conditions

(i) ∫_R |f(η)| |a|(dη) < ∞,

(ii) ∫_R f(η)² b(dη) < ∞,

(iii) ∫_R ∫_ℝ |f(η) r| V(dr, η) µ(dη) < ∞,

are satisfied, then the function f is integrable with respect to L and ∫_R f dL is a well defined random variable with the cumulant function

C(v, ∫_R f dL) = iv ∫_R f(η) a(dη) − ½ v² ∫_R f(η)² b(dη) + ∫_R ∫_ℝ (e^{if(η)vr} − 1 − if(η)vr 1_{[−1,1]}(r)) V(dr, η) µ(dη).   (9)

Proof. It suffices to check that the regularity conditions of [36, Theorem 2.7] are satisfied under the assumptions of Lemma 1. More specifically we need to check that

(a) ∫_R |h(f(η), η)| µ(dη) < ∞,

(b) ∫_R f(η)² b̃(η) µ(dη) < ∞,

(c) ∫_R ∫_ℝ min{1, (rf(η))²} V(dr, η) µ(dη) < ∞,

where

h(u, η) = u ã_τ(η) + ∫_ℝ (τ(ru) − u τ(r)) V(dr, η),

τ(r) = r 1_{[−1,1]}(r) + (r/|r|) 1_{[−1,1]ᶜ}(r)

and

ã_τ(η) = ã(η) + ∫_{[−1,1]ᶜ} (r/|r|) V(dr, η).

To prove (a), note that |τ(ru)| ≤ |ur|. Therefore,

|h(f(η), η)| ≤ |f(η) ã_τ(η)| + 2 ∫_ℝ |f(η) r| V(dr, η).

Using (i) and (iii) of Lemma 1, it follows that

∫_R |h(f(η), η)| µ(dη) ≤ ∫_R |f(η) ã(η)| µ(dη) + 3 ∫_R ∫_ℝ |f(η) r| V(dr, η) µ(dη) < ∞.

Condition (b) is the same as (ii), and (c) follows from (iii) and

min{1, (rf(η))²} ≤ |rf(η)|.  □

The conclusions of Lemma 1 hold under weaker assumptions, see [20, Proposition 5.6] or [36, Theorem 2.7]. The assumptions in Lemma 1 are simple to check and suffice for our purposes. The master's thesis [20] also contains new self-contained proofs of a number of other results concerning integration with respect to a Lévy basis. Using equation (4) we can rewrite (9) as

C(v, ∫_R f dL) = ∫_R C(vf(η), L′(η)) µ(dη).   (10)

The logarithm of the Laplace transform of a random variable X will be called the kumulant function and denoted by K(v, X) = log E(e^{−vX}) for v ∈ ℝ, in accordance with the notation used in [3]. If the kumulant function of the integral ∫_R f dL exists, then

K(v, ∫_R f dL) = ∫_R K(vf(η), L′(η)) µ(dη).   (11)

Example 1 (Gaussian Lévy basis). If L is a Gaussian Lévy basis with characteristic triplet (a, b, 0), then L(A) is N(a(A), b(A)) distributed for each set A ∈ A. If (6) and (7) hold, we obtain L′_ν(η) ∼ N(ã_ν(η), b̃_ν(η)). Furthermore,

C(v, ∫_R f dL) = iv ∫_R f(η) a(dη) − ½ v² ∫_R f(η)² b(dη).

It follows that

∫_R f dL ∼ N(∫_R f(η) a(dη), ∫_R f(η)² b(dη)).

The basis is ν-factorizable when ã_ν and b̃_ν are constant. A concrete example of a Gaussian Lévy basis is obtained by attaching independent Gaussian random variables {X_i} to a locally finite sequence {η_i} of fixed points and letting

L(A) = Σ_{η_i ∈ A} X_i,   A ∈ A.

Another example of a Gaussian Lévy basis is given in [24, Section 1.3].  □
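As a numerical illustration of Example 1 (our own sketch: we take a(dη) = m dη and b(dη) = s2 dη for made-up constants m and s2), one can discretise [0, 1] into cells, simulate the independent Gaussian basis values cell by cell, and check that ∫ f dL indeed has the stated mean ∫ f da and variance ∫ f² db.

```python
import numpy as np

rng = np.random.default_rng(3)

# Gaussian Levy basis on [0, 1] with a(d.eta) = m*d.eta, b(d.eta) = s2*d.eta,
# discretised into n_cells cells of width h. Values on disjoint cells are
# independent N(m*h, s2*h); the integral of f against L is then a sum.
m, s2 = 2.0, 0.5
n_cells, n_rep = 200, 20000
h = 1.0 / n_cells
eta = (np.arange(n_cells) + 0.5) * h
f = np.sin(2 * np.pi * eta)               # integrand

# n_rep independent realisations of (L(cell_1), ..., L(cell_n))
L = rng.normal(m * h, np.sqrt(s2 * h), size=(n_rep, n_cells))
integrals = L @ f                          # each row: sum_j f(eta_j) L(cell_j)

mean_theory = m * h * f.sum()              # ~ int f(eta) a(d.eta)
var_theory = s2 * h * (f ** 2).sum()       # ~ int f(eta)^2 b(d.eta)
mean_mc, var_mc = integrals.mean(), integrals.var()
```

With f = sin(2πη) the theoretical mean is (numerically) zero and the theoretical variance is s2 · ∫ sin²(2πη) dη = 0.25, which the Monte Carlo estimates reproduce.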

Example 2 (Poisson Lévy basis). The simplest Lévy jump basis is the Poisson basis for which L(A) ∼ Po(ν(A)), where ν is a non-negative measure on B(R). Clearly, L is a non-negative Lévy basis. This basis has characteristic triplet (ν, 0, δ₁(dr)ν(dη)), where δ_c denotes the Dirac measure concentrated at c. Note that ã_ν(η) ≡ 1 and V_ν(dr, η) = δ₁(dr). This basis is always ν-factorizable. The random variable L′_ν(η) has a Po(1) distribution.  □

Example 3 (generalized G-Lévy basis). A broad and versatile class of (non-negative) Lévy jump bases are the so-called generalized G-Lévy bases with characteristic triplet of the form (a, 0, U) depending on a non-negative measure ν on B(R). The measures a and U satisfy (6) and (8) with

V_ν(dr, η) = (r^{−α−1} / Γ(1−α)) e^{−θ(η)r} 1_{ℝ₊}(r) dr   and   ã_ν(η) = ∫₀¹ (r^{−α} / Γ(1−α)) e^{−θ(η)r} dr,

where α ∈ (−∞, 1) and θ : R → (0, ∞) is a measurable function. Γ denotes the gamma function. The class includes two important special cases – the gamma Lévy basis for α = 0 with L′_ν(η) ∼ Γ(1, θ(η)), and the inverse Gaussian Lévy basis for α = ½ with L′_ν(η) ∼ IG(√2, √(2θ(η))). In case the function θ is constant, θ(η) = θ, we get that L(A) ∼ G(α, ν(A), θ), i.e. L is a G-measure as defined in [5, Section 2].  □

The following theorem is a special case of the Lévy-Itô decomposition. This theorem will play a crucial role for the interpretation of some of the Lévy driven Cox processes to be considered in the subsequent sections.

Theorem 2 Suppose that the Lévy basis L has no Gaussian part (b = 0) and its generalized Lévy measure U satisfies the following conditions:

• U({(r, η)}) = 0 for all (r, η) ∈ ℝ × R (U is diffuse),

• ∫_{[−1,1]×A} |r| U(dr, dη) < ∞ for all A ∈ A.

Then

L(A) = a₀(A) + ∫_ℝ r N(dr, A),   A ∈ A,   (12)

where

a₀(A) = a(A) − ∫_{[−1,1]} r U(dr, A),   A ∈ A,

and N is a Poisson measure on ℝ × R with intensity measure U.

The conditions of Theorem 2 are satisfied for a Poisson Lévy basis and a generalized G-Lévy basis if ν is a diffuse locally finite measure on B(R).

3 Cox processes

Let S be a Borel subset of ℝ^d and suppose that {Λ(ξ) : ξ ∈ S} is a non-negative random field which is almost surely integrable (with respect to the Lebesgue measure) on bounded Borel subsets of S. A point process X on S is a Cox process with the driving field Λ if, conditionally on Λ, X is a Poisson process with intensity Λ, cf. [11, 12, 31]. The driving measure Λ_M of the Cox process X is defined by

Λ_M(B) = ∫_B Λ(ξ) dξ,   B ∈ B_b(S),

where B_b(S) is the bounded Borel subsets of S. In the following, the intensity function of X will be denoted by ρ(ξ) and, more generally, ρ⁽ⁿ⁾(ξ) is the nth order product density of X. It follows from the conditional structure of X that ρ⁽ⁿ⁾ can be computed from Λ by

ρ⁽ⁿ⁾(ξ₁, ..., ξ_n) = E ∏_{i=1}^n Λ(ξ_i),   ξ_i ∈ S   (13)

(for a proof, using moment measures, see e.g. [12]). A useful characteristic of a point process is the pair correlation function defined by

g(ξ₁, ξ₂) = ρ⁽²⁾(ξ₁, ξ₂) / (ρ⁽¹⁾(ξ₁) ρ⁽¹⁾(ξ₂)),   ξ₁, ξ₂ ∈ S,

provided that ρ⁽¹⁾(ξ_i) > 0 for i = 1, 2. (We let g(ξ₁, ξ₂) = 0 if ρ⁽¹⁾(ξ₁) ρ⁽¹⁾(ξ₂) = 0, cf. [31, p. 31].) Note that for a Cox process, the pair correlation function can be calculated as

g(ξ₁, ξ₂) = E(Λ(ξ₁) Λ(ξ₂)) / (E Λ(ξ₁) E Λ(ξ₂)).

It can be shown that a Cox process is overdispersed relative to the Poisson process, i.e.

Var(X(B)) ≥ E X(B),

where X(B) denotes the number of points from X falling in B. Examples of Cox processes include shot noise Cox processes (SNCPs, see [5, 27, 49]) with driving field of the form

Λ(ξ) = Σ_{(r,η)∈Φ} r k(ξ, η),

where k is a probability kernel (k(·, η) is a probability density) and Φ is the set of atoms of a Poisson measure on ℝ₊ × R, say. Concrete examples of probability kernels are the uniform kernel

k(ξ, η) = (1 / (ω_d R^d)) 1_{[0,R]}(‖ξ − η‖),

where R > 0 and ω_d = π^{d/2} / Γ(1 + d/2) is the volume of the unit ball in ℝ^d, and the Gaussian kernel

k(ξ, η) = (2πσ²)^{−d/2} exp(−‖ξ − η‖² / (2σ²)),   (14)

where σ² > 0. Another important class of Cox processes are the log Gaussian Cox processes (LGCPs, see [29]) driven by the exponential of a Gaussian field Ψ,

Λ(ξ) = exp(Ψ(ξ)).
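A crude grid-based simulation may help visualise a shot noise Cox process. Everything in the sketch below (kernel width, mark distribution, grid resolution) is our own choice, not from the paper; given Λ, the conditional Poisson process is approximated by independent Poisson counts in small cells.

```python
import numpy as np

rng = np.random.default_rng(4)

# Shot noise driving field on [0,1]^2: Lambda(xi) = sum over (r, eta) in Phi
# of r * k(xi, eta), with a Gaussian probability kernel k and Phi a Poisson
# number of (mark, centre) pairs.
sigma, n_grid = 0.05, 64
cell_area = (1.0 / n_grid) ** 2
xs = (np.arange(n_grid) + 0.5) / n_grid
gx, gy = np.meshgrid(xs, xs)

n_events = rng.poisson(15)                     # number of cluster centres
centres = rng.uniform(0.0, 1.0, size=(n_events, 2))
marks = rng.gamma(2.0, 5.0, size=n_events)     # positive marks r

lam = np.zeros((n_grid, n_grid))
for (cx, cy), r in zip(centres, marks):
    d2 = (gx - cx) ** 2 + (gy - cy) ** 2
    lam += r * np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

# Discrete approximation of the conditional Poisson process:
counts = rng.poisson(lam * cell_area)          # Cox points per grid cell
n_points = int(counts.sum())
```

Each mark r scales how many offspring a cluster centre contributes, which is exactly why the mark distribution (here gamma, by our choice) controls the cluster sizes.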

4 Lévy driven Cox processes (LCPs)

4.1 Definition

A point process X on S is called a Lévy driven Cox process (LCP) if X is a Cox process with a driving field of the form

Λ(ξ) = ∫_R k(ξ, η) L(dη),   ξ ∈ S,   (15)

where L is a non-negative Lévy basis on R. Furthermore, k is a non-negative function on S × R such that k(ξ, ·) is integrable with respect to L for each ξ ∈ S and k(·, η) is integrable with respect to the Lebesgue measure on S for each η ∈ R. Note that it is always possible for each pair (k, L) to construct an associated pair (k̃, L̃) generating the same driving field Λ, where now k̃(·, η) is a probability kernel. We may simply let

k̃(ξ, η) = k(ξ, η)/α(η),   L̃(dη) = α(η) L(dη),

where

α(η) = ∫_S k(ξ, η) dξ

is assumed to be strictly positive. In the formulation and analysis of the models it is however convenient not always to restrict to probability kernels. It is important to note that from the non-negativity of the Lévy basis L and [12, Theorem 6.1.VI], we get that L is equivalent to a random measure on R. Thus, the measurability of Λ defined in (15) follows from measurability of k as a function of η and ξ and Tonelli's theorem. Therefore, Λ is a well-defined random field and (under the condition of local integrability – see below) the driving measure ∫_B Λ(ξ) dξ, B ∈ B_b(S), is also a well-defined random measure determined by the finite-dimensional distributions of L. It will be assumed that the function k and the Lévy basis L have been chosen such that Λ is almost surely locally integrable, i.e. ∫_B Λ(ξ) dξ < ∞ with probability 1 for B ∈ B_b(S). A sufficient condition for the last property is that, cf. [31, Remark 5.1],

E ∫_B Λ(ξ) dξ < ∞,   B ∈ B_b(S).   (16)

If L is factorizable, then (16) is satisfied if the following conditions hold:

∫₁^∞ r V(dr) < ∞,

∫_B ∫_R k(ξ, η) µ(dη) dξ < ∞,   B ∈ B_b(S).

4.2 The nth order product densities of an LCP

It is possible to derive a number of properties of LCPs, using the theory of Lévy bases presented in Section 2. Below, the nth order product densities, the generating functional and the void probabilities of an LCP are considered. In the proposition below, (complete) Bell polynomials, well known in combinatorics, are used, see [10].

Proposition 3 Suppose that

E (∫_R k(ξ, η) L(dη))ⁿ < ∞

and

∫_R ∫_{ℝ₊} (k(ξ, η) r)ⁿ V(dr, η) µ(dη) < ∞,

for all ξ ∈ S. Then, the nth order product density of an LCP is given by

ρ⁽ⁿ⁾(ξ₁, ..., ξ_n) = (1 / (2ⁿ n!)) Σ_{t∈T_n} (∏_{j=1}^n t_j) B_n(κ₁(t), ..., κ_n(t)),

ξ₁, ..., ξ_n ∈ S, where T_n denotes the set of all functions from {1, ..., n} to {−1, 1}, B_n is the nth complete Bell polynomial evaluated at

κ_j(t) = ∫_R (Σ_{i=1}^n t_i k(ξ_i, η))^j κ_j(L′(η)) µ(dη),   j = 1, ..., n,

and κ_j(L′(η)) is the jth cumulant moment of the spot variable L′(η).

Proof. First we rewrite ρ⁽ⁿ⁾(ξ₁, ..., ξ_n) = E ∏_{i=1}^n Λ(ξ_i), using the polarization formula, cf. [15, p. 43],

E ∏_{i=1}^n Λ(ξ_i) = (1 / (2ⁿ n!)) Σ_{t∈T_n} (∏_{i=1}^n t_i) E (Σ_{i=1}^n t_i Λ(ξ_i))ⁿ.   (17)

The terms

E (Σ_{i=1}^n t_i Λ(ξ_i))ⁿ

can be computed by evaluating the nth complete Bell polynomial in the first n cumulants of Σ_{i=1}^n t_i Λ(ξ_i) = ∫_R (Σ_{i=1}^n t_i k(ξ_i, η)) L(dη). Thus, we have

E (Σ_{i=1}^n t_i Λ(ξ_i))ⁿ = B_n(κ₁(t), ..., κ_n(t)),

where κ_j(t) is the jth cumulant of

∫_R (Σ_{i=1}^n t_i k(ξ_i, η)) L(dη).

Under the assumptions of the proposition, κ_j(t) can be calculated by differentiating (11) j times with f(η) = Σ_{i=1}^n t_i k(ξ_i, η). We get

κ_j(t) = ∫_R (Σ_{i=1}^n t_i k(ξ_i, η))^j κ_j(L′(η)) µ(dη).

Note that

E (∫_R (Σ_{i=1}^n t_i k(ξ_i, η)) L(dη))^j < ∞

and

∫_R ∫_{ℝ₊} (Σ_{i=1}^n t_i k(ξ_i, η) r)^j V(dr, η) µ(dη) < ∞,

j = 1, ..., n, under the assumptions of the proposition.  □

Corollary 4 Suppose that k(ξ, ·) satisfies the assumptions of Lemma 1 for each ξ ∈ S. Then, the intensity function of the LCP exists and is given by

ρ(ξ) = ∫_R k(ξ, η) E(L′(η)) µ(dη)   (18)

for all ξ ∈ S. Furthermore, if

E (∫_R k(ξ, η) L(dη))² < ∞   (19)

and

∫_R ∫_ℝ (k(ξ, η) r)² V(dr, η) µ(dη) < ∞   (20)

for each ξ ∈ S, the pair correlation function of the process exists and is given by

g(ξ, ζ) = 1 + (∫_R k(ξ, η) k(ζ, η) Var(L′(η)) µ(dη)) / (ρ(ξ) ρ(ζ)),   (21)

for all ξ, ζ ∈ S.

Proof. The result follows from Proposition 3, using that the first and second complete Bell polynomials are given by B₁(x₁) = x₁ and B₂(x₁, x₂) = x₁² + x₂. Also recall that κ₁(L′(η)) = E(L′(η)) and κ₂(L′(η)) = Var(L′(η)).  □
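The kernel convolution ∫ k(ξ − η) k(ζ − η) dη entering the numerator of (21) in the homogeneous case can be checked numerically for the Gaussian kernel (14): in one dimension (our choice for the check below) it equals a centred Gaussian density with variance 2σ² evaluated at ζ − ξ.

```python
import numpy as np

# Numerical check of the convolution integral behind the pair correlation
# of a homogeneous 1-d LCP with the Gaussian kernel (14).
sigma = 0.3
eta = np.linspace(-10.0, 10.0, 20001)
de = eta[1] - eta[0]

def k(u):
    """Gaussian probability kernel (14) in dimension d = 1."""
    return np.exp(-u ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

xi, zeta = 0.2, 0.7
numeric = np.sum(k(xi - eta) * k(zeta - eta)) * de

# Closed form: convolution of two N(0, sigma^2) densities is a N(0, 2 sigma^2)
# density evaluated at the separation zeta - xi.
closed_form = np.exp(-(zeta - xi) ** 2 / (4 * sigma ** 2)) / np.sqrt(4 * np.pi * sigma ** 2)
```

So for a homogeneous Gaussian-kernel LCP, g(ξ, ζ) − 1 decays like a Gaussian in the separation ζ − ξ, with range governed by σ.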

Corollary 5 (Stationary LCP) Let S = R = ℝ^d and assume that k is a homogeneous kernel in the sense that

k(ξ, η) = k(ξ − η) for all ξ, η ∈ ℝ^d.   (22)

Let ∫ k(η) dη = α. Assume that L is a homogeneous Lévy basis with control measure µ(dη) = c dη for some c > 0. Then, (18) and (21) take the following simplified form:

ρ = c E(L′) α,

g(ξ, ζ) = 1 + (Var(L′) I_k(ζ − ξ)) / ((E L′)² c),

where I_k only depends on the kernel k,

I_k(ζ − ξ) = ∫_{ℝ^d} (k(ζ − ξ + η) k(η) / α²) dη.

Note that the fraction Var(L′)/(E L′)² is equal to 1/E L′, 1 and E L′ for the Poisson, gamma and inverse Gaussian basis, respectively. The choice of the Lévy basis changes substantially the correlations in the LCP and the overall variability in the point pattern, even when the corresponding LCPs are stationary and all other parameters of the model are the same. As an illustration, Figure 1 shows 3 stationary LCPs observed on a [0, 100] × [0, 200] window with c = 0.003, E L′ = 2 and a Gaussian kernel obtained as 10 times the kernel (14) with σ = 4. The spot variable L′ is distributed as E L′ times a Po(1)-distributed variable, as a Γ(1, E L′)-distributed variable and as an IG(1, 1/E L′)-distributed variable, respectively. From left to right, an increasing irregularity is clearly visible. The distribution of a point process X on S can be characterized by the probability generating functional G_X. This functional is defined by

\[
G_X(u) = \mathrm{E} \prod_{\xi \in X} u(\xi),
\]
for functions $u : S \to [0,1]$ with $\{\xi \in S : u(\xi) < 1\}$ bounded. As proved e.g. in [12], the probability generating functional of a Cox process can be computed by
\[
G_X(u) = \mathrm{E} \exp\Big( -\int_S (1 - u(\xi))\, \Lambda(\xi)\, d\xi \Big). \tag{23}
\]
Void probabilities can be calculated as

\[
v(B) := P(X \cap B = \emptyset) = \mathrm{E} \exp\Big( -\int_B \Lambda(\xi)\, d\xi \Big), \quad B \in \mathcal{B}_b(S).
\]

Below, we give the expressions for GX and v for an LCP.

Figure 1: Examples of realizations of homogeneous LCPs with Poisson (left), gamma (middle) and inverse Gaussian (right) Lévy bases. For details, see the text.
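The panels of Figure 1 can be imitated with a crude grid approximation of the driving field. The sketch below is our construction, with several simplifying assumptions: unit grid cells, the gamma Lévy basis approximated by independent $\Gamma(c\,a,\ \mathrm{E}\,L')$ cell increments ($a$ the cell area), and a unit-mass Gaussian kernel rather than the 10-fold kernel used in the figure. The field $\Lambda$ is computed by FFT convolution and the Cox step then draws Poisson counts cell by cell.

```python
import numpy as np

rng = np.random.default_rng(0)
c, m, sigma, h = 0.003, 2.0, 4.0, 1.0   # control rate c, E L', kernel s.d., cell side
ny, nx = 100, 200                        # unit grid over a [0,200] x [0,100] window

# gamma Levy basis on the grid: independent increments L(cell) ~ Gamma(c*h^2, m)
L = rng.gamma(c * h * h, m, size=(ny, nx))

# driving field: circular convolution with a unit-mass Gaussian kernel via FFT
fx = np.fft.fftfreq(nx, d=h)
fy = np.fft.fftfreq(ny, d=h)
K = np.exp(-2 * np.pi**2 * sigma**2 * (fx[None, :]**2 + fy[:, None]**2))
Lam = np.fft.ifft2(np.fft.fft2(L / (h * h)) * K).real

# Cox step: independent Poisson counts per cell with mean Lam * cell area
counts = rng.poisson(np.clip(Lam, 0.0, None) * h * h)
```

With a unit-mass kernel the mean of $\Lambda$ equals $c\,\mathrm{E}\,L' = 0.006$, in line with the formula $\rho = c\,\mathrm{E}(L')\,\alpha$ for $\alpha = 1$.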

Proposition 6 The probability generating functional of an LCP has the following form:

\[
G_X(u) = \exp\bigg( -\int_R \int_{\mathbb{R}_+} \Big[ 1 - \exp\Big( -\int_S (1 - u(\xi))\, k(\xi,\eta)\, r\, d\xi \Big) \Big] U(dr,d\eta) - \int_R \int_S (1 - u(\xi))\, k(\xi,\eta)\, d\xi\, a_0(d\eta) \bigg),
\]
while the void probabilities are given by

\[
v(B) = \exp\bigg( -\int_R \int_{\mathbb{R}_+} \Big[ 1 - \exp\Big( -r \int_B k(\xi,\eta)\, d\xi \Big) \Big] U(dr,d\eta) - \int_R \int_B k(\xi,\eta)\, d\xi\, a_0(d\eta) \bigg), \quad B \in \mathcal{B}_b(S).
\]
Proof. Since $\Lambda(\xi)$ is almost surely locally integrable,

\[
\int_S (1 - u(\xi))\, \Lambda(\xi)\, d\xi \le \int_S 1_{\mathrm{supp}(1-u)}(\xi)\, \Lambda(\xi)\, d\xi < \infty \tag{24}
\]
is a well-defined non-negative random variable and its cumulant transform exists. (In (24), the support of the function $1-u$ is denoted $\mathrm{supp}(1-u)$.)

Using the key relation (11) for the cumulant function, we get
\begin{align*}
\log G_X(u) &= \log \mathrm{E} \exp\Big( -\int_S (1-u(\xi))\, \Lambda(\xi)\, d\xi \Big) \\
&= K\Big( 1, \int_S (1-u(\xi))\, \Lambda(\xi)\, d\xi \Big) \\
&= K\Big( 1, \int_S (1-u(\xi)) \int_R k(\xi,\eta)\, L(d\eta)\, d\xi \Big) \\
&= K\Big( 1, \int_R \Big( \int_S (1-u(\xi))\, k(\xi,\eta)\, d\xi \Big) L(d\eta) \Big) \\
&= \int_R K\Big( \int_S (1-u(\xi))\, k(\xi,\eta)\, d\xi,\ L'(\eta) \Big) \mu(d\eta) \\
&= -\int_R \int_S (1-u(\xi))\, k(\xi,\eta)\, d\xi\, a_0(d\eta) \\
&\quad + \int_R \int_{\mathbb{R}_+} \Big[ \exp\Big( -\int_S (1-u(\xi))\, k(\xi,\eta)\, r\, d\xi \Big) - 1 \Big] V(dr,\eta)\, \mu(d\eta).
\end{align*}

The result for the void probabilities is obtained by choosing $u(\xi) = 1_{B^c}(\xi)$. $\square$

4.3 Mixing properties

The following proposition gives conditions for stationarity and mixing of an LCP. Mixing and ergodicity are important e.g. for establishing the consistency of model parameter estimates, including nonparametric estimates of the $n$th order product density $\rho^{(n)}$ and the pair correlation function $g$. Mixing [12, Definition 10.3.I] implies ergodicity [12, p. 341]. The case of an LCP with G-Lévy basis has been treated in [5, Proposition 2.2].

Proposition 7 Let $S = R = \mathbb{R}^d$ and assume that the Lévy basis $L$ and the kernel $k$ are homogeneous. Then, an LCP with driving field $\Lambda$ of the form (15) is stationary and mixing.

Proof. Note that a Cox process is stationary/mixing if and only if the driving field of the Cox process has the same property [12, Proposition 10.3.VII]. Using the assumptions of the proposition, it is easily seen that $\{\Lambda(\xi + x) : \xi \in \mathbb{R}^d\}$ has the same distribution as $\{\Lambda(\xi) : \xi \in \mathbb{R}^d\}$ for all $x \in \mathbb{R}^d$. According to [12, Proposition 10.3.VI(a)], $\Lambda$ is mixing if and only if

\[
L_\Lambda[h_1 + T_x h_2] \to L_\Lambda[h_1]\, L_\Lambda[h_2]
\]
as $\|x\| \to \infty$. Here, $h_1$ and $h_2$ are arbitrary non-negative bounded functions on $\mathbb{R}^d$ of bounded support, and $L_\Lambda$ is the Laplace functional defined by

\[
L_\Lambda[h] = \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} h(\xi)\, \Lambda(\xi)\, d\xi \Big),
\]

and $T_x h(\xi) = h(\xi + x)$, $\xi, x \in \mathbb{R}^d$. We get
\begin{align*}
L_\Lambda[h_1 + T_x h_2]
&= \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} (h_1(\xi) + h_2(\xi + x)) \int_{\mathbb{R}^d} k(\xi - \eta)\, L(d\eta)\, d\xi \Big) \\
&= \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} \Big( \int_{\mathbb{R}^d} h_1(\xi)\, k(\xi - \eta)\, d\xi + \int_{\mathbb{R}^d} h_2(\xi)\, k(\xi - \eta - x)\, d\xi \Big) L(d\eta) \Big) \\
&= \mathrm{E} \Big[ \exp\Big( -\int_{\mathbb{R}^d} \tilde h_1(\eta)\, L(d\eta) \Big) \cdot \exp\Big( -\int_{\mathbb{R}^d} \tilde h_2(\eta + x)\, L(d\eta) \Big) \Big],
\end{align*}
where

\[
\tilde h_i(\eta) = \int_{\mathbb{R}^d} h_i(\xi)\, k(\xi - \eta)\, d\xi.
\]
If $k$ has bounded support, then we can find a $C > 0$ such that for $\|x\| > C$
\[
\{\eta \in \mathbb{R}^d : \tilde h_1(\eta) > 0\} \cap \{\eta \in \mathbb{R}^d : \tilde h_2(\eta + x) > 0\} = \emptyset.
\]
It follows that for $\|x\| > C$

\begin{align*}
L_\Lambda[h_1 + T_x h_2]
&= \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} \tilde h_1(\eta)\, L(d\eta) \Big) \cdot \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} \tilde h_2(\eta + x)\, L(d\eta) \Big) \\
&= L_\Lambda[h_1]\, L_\Lambda[h_2],
\end{align*}
since $L$ is independently scattered. If $k$ does not have bounded support, we define a sequence of functions with bounded support
\[
k_n(\xi - \eta) = k(\xi - \eta)\, 1_{[0,n)}(\|\xi - \eta\|), \quad n = 1, 2, \dots,
\]
that converges monotonically from below to $k$. It follows that $\tilde h_{i,n}$ defined by

\[
\tilde h_{i,n}(\eta) = \int_{\mathbb{R}^d} h_i(\xi)\, k_n(\xi - \eta)\, d\xi
\]
converges monotonically from below to $\tilde h_i(\eta)$, and for fixed $n$ we can find $C_n$ such that for $\|x\| > C_n$

\begin{align*}
&\mathrm{E} \Big[ \exp\Big( -\int_{\mathbb{R}^d} \tilde h_{1,n}(\eta)\, L(d\eta) \Big) \cdot \exp\Big( -\int_{\mathbb{R}^d} \tilde h_{2,n}(\eta + x)\, L(d\eta) \Big) \Big] \\
&\quad = \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} \tilde h_{1,n}(\eta)\, L(d\eta) \Big) \cdot \mathrm{E} \exp\Big( -\int_{\mathbb{R}^d} \tilde h_{2,n}(\eta + x)\, L(d\eta) \Big).
\end{align*}
Using the reasoning just after [12, Proposition 10.3.VI], it follows that
\[
L_\Lambda[h_1 + T_x h_2] \to L_\Lambda[h_1]\, L_\Lambda[h_2]
\]
for the original functions $h_1$ and $h_2$. $\square$

4.4 Examples of LCPs

4.4.1 Shot noise Cox processes (SNCPs) with random noise

Under the assumptions of Theorem 2, the driving field of an LCP takes the form

\[
\Lambda(\xi) = \int_R k(\xi,\eta)\, a_0(d\eta) + \sum_{(r,\eta) \in \Phi} r\, k(\xi,\eta), \tag{25}
\]
where $\Phi$ is the atoms of a Poisson measure on $\mathbb{R}_+ \times R$ with intensity measure $U$. An LCP $X$ with such a driving field is distributed as a superposition $X_1 \cup X_2$, where $X_1$ and $X_2$ are independent, $X_1$ is a Poisson point process with intensity function

\[
\rho_1(\xi) = \int_R k(\xi,\eta)\, a_0(d\eta)
\]
and $X_2$ is a shot noise Cox process as defined in [27] with driving field

\[
\Lambda_2(\xi) = \sum_{(r,\eta) \in \Phi} r\, k(\xi,\eta).
\]
An LCP with driving field $\Lambda$ of the form (25) is therefore an SNCP with additional random noise. Simulation of the associated Lévy basis can be performed using the algorithm introduced in [16] if $L$ is factorizable; otherwise the algorithm developed in [50] may be used, see also [46]. A third option is the method used in [27]. An overview of available methods for simulating Lévy processes can be found in [39].

For $a_0 \equiv 0$, we get the familiar SNCPs. In [27], three specific examples of stationary SNCPs are considered. Using the notion of a Lévy basis, they are specified by $U(dr, d\eta) = V(dr,\eta)\, \nu(d\eta)$, where $\nu(d\eta) \propto d\eta$ and

• $V$ is concentrated in a single point $c > 0$, i.e. $V(dr) = \delta_c(dr)$. If $c = 1$, the corresponding Lévy basis is Poisson. If $c \neq 1$, $L(A) \sim c\, Po(\nu(A))$. LCPs of this type are the well-known Matérn cluster process [25] and the Thomas process [42].

• $V((0,\infty)) < \infty$. In this case, $\Phi$ can be represented as a marked Poisson point process. Examples of LCPs with such a Lévy basis are the Neyman-Scott processes, cf. [34].

• $V(dr) = 1_{\mathbb{R}_+}(r)\, \frac{r^{-\alpha-1} e^{-\theta r}}{\Gamma(1-\alpha)}\, dr$, corresponding to a G-Lévy basis. The resulting LCP is a so-called shot noise G Cox process [5].

In Figure 2, we show an example of an SNCP with a homogeneous Poisson process ($a_0$ proportional to Lebesgue measure) as additional random noise. More precisely, the process $X = X_1 \cup X_2$ is defined on $[0,200] \times [0,100]$, $X_1$ is a Poisson process with intensity 0.01 and $X_2$ is an SNCP with Gaussian kernel (14) with $\sigma = 2$ and an intensity measure $U$ of the form $U(dr, d\eta) = \delta_{25}(r) \cdot 0.0025\, d\eta$. The process $X_2$ is thereby a Thomas process.

4.4.2 LCPs driven by smoothed discrete random fields

We suppose that $\{\eta_i\}$ is a locally finite sequence of fixed points and let
\[
L(A) = \sum_{\eta_i \in A} X_i,
\]
where $\{X_i\}$ is a sequence of independent and identically distributed non-negative random variables with infinitely divisible distribution.
If, for instance, $X_i$ is gamma or inverse Gaussian distributed, then $L$ is a special case of a gamma or inverse Gaussian Lévy basis, respectively. The driving intensity of the associated LCP will take the form

\[
\Lambda(\xi) = \sum_{i} k(\xi, \eta_i)\, X_i.
\]
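A minimal sketch of such a driving intensity (three fixed points, gamma-distributed marks and a Gaussian kernel, all chosen here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
eta = np.array([[20.0, 20.0], [50.0, 30.0], [80.0, 60.0]])  # fixed points eta_i
X = rng.gamma(2.0, 1.0, size=len(eta))  # i.i.d. infinitely divisible (gamma) marks

def Lam(xi, sigma=5.0):
    """Driving intensity Lam(xi) = sum_i k(xi, eta_i) * X_i, Gaussian kernel."""
    d2 = ((eta - xi) ** 2).sum(axis=1)
    k = np.exp(-d2 / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return float(k @ X)
```

Since the marks are non-negative, the field is non-negative everywhere, peaks near the fixed points $\eta_i$ and decays like the kernel away from them.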

Figure 2: Example of a shot noise Cox process with extra noise. For details, see the text.
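The construction behind Figure 2 can be sketched directly from the cluster representation. The code below is an illustrative approximation (edge effects are ignored, so offspring may fall outside the window); it uses the parameters quoted in the text, each centre receiving a Poisson(25) number of Gaussian offspring so that $X_2$ is a Thomas process.

```python
import numpy as np

rng = np.random.default_rng(1)
W = np.array([200.0, 100.0])      # window [0,200] x [0,100]
rho1 = 0.01                        # intensity of the Poisson noise X1
nu, mu, sigma = 0.0025, 25, 2.0    # centre intensity, delta_25 mark, kernel s.d.

# X1: homogeneous Poisson noise
n1 = rng.poisson(rho1 * W.prod())
X1 = rng.uniform(0, W, size=(n1, 2))

# X2: Thomas process -- Poisson centres, Poisson(mu) Gaussian offspring each
n_c = rng.poisson(nu * W.prod())
centres = rng.uniform(0, W, size=(n_c, 2))
clusters = [c + sigma * rng.standard_normal((rng.poisson(mu), 2)) for c in centres]
X2 = np.vstack(clusters) if clusters else np.empty((0, 2))

X = np.vstack([X1, X2])            # the superposition X = X1 u X2
```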

5 Log Lévy driven Cox processes (LLCPs)

5.1 Definition

A point process $X$ on $S$ is called a log Lévy driven Cox process (LLCP) if $X$ is a Cox process with intensity field of the form

\[
\Lambda(\xi) = \exp\Big( \int_R k(\xi,\eta)\, L(d\eta) \Big), \tag{26}
\]
where $L$ is a Lévy basis and $k$ is a kernel such that $k(\xi,\cdot)$ is integrable with respect to $L$ for each $\xi \in S$, $k(\cdot,\eta)$ is integrable with respect to Lebesgue measure on $S$ for each $\eta \in R$, and $\Lambda$ is almost surely locally integrable.

Since the driving intensity field of an LLCP is always non-negative because of the exponential function, we can generally use kernels and Lévy bases which also take negative values. Moreover, using the Lévy-Khintchine representation (1), we see that each Lévy basis $L$ is equal to a sum of two independent parts: a Lévy jump part (denoted $L_J$) and a Gaussian part (denoted $L_G$). Thus we can represent the driving intensity of an LLCP as a product of two independent driving fields

\[
\Lambda(\xi) = \exp\Big( \int_R k(\xi,\eta)\, L_J(d\eta) \Big) \exp\Big( \int_R k(\xi,\eta)\, L_G(d\eta) \Big) = \Lambda_J(\xi)\, \Lambda_G(\xi). \tag{27}
\]
If $L_J \equiv 0$, $\Lambda$ is the driving field of a log Gaussian Cox process (LGCP) [9, 29]; if $L_G \equiv 0$, $\Lambda$ is, under regularity conditions, the driving field of a log shot noise Cox process, see the examples in Section 5.4 below.

Because of the exponential function in the definition of $\Lambda(\xi)$, stronger conditions on $k$ and $L$ are needed in order to ensure that $\Lambda$ is almost surely locally integrable. A sufficient condition is that the cumulant transform $K(-k(\xi,\eta), L'(\eta))$ exists for all $\xi \in S$ and $\eta \in R$, and that
\[
\int_B \exp\Big( \int_R K(-k(\xi,\eta), L'(\eta))\, \mu(d\eta) \Big)\, d\xi < \infty \quad \text{for all } B \in \mathcal{B}_b(S). \tag{28}
\]
This result follows from the definition of the cumulant function and from the key relation (11) for the cumulant transform. In particular, we use that

\[
\mathrm{E}\,\Lambda(\xi) = \exp K\Big( -1, \int_R k(\xi,\eta)\, L(d\eta) \Big) = \exp\Big( \int_R K(-k(\xi,\eta), L'(\eta))\, \mu(d\eta) \Big).
\]
Note that
\[
K(-k(\xi,\eta), L'(\eta)) = k(\xi,\eta)\, \tilde a(\eta) + \tfrac{1}{2}\, k(\xi,\eta)^2\, \tilde b(\eta) + \int_{\mathbb{R}} \big( e^{k(\xi,\eta) r} - 1 - k(\xi,\eta)\, r\, 1_{[-1,1]}(r) \big)\, V(dr,\eta).
\]
If $L$ is factorizable, then (28) is satisfied if either there exist $B > 0$, $C > 0$ and $D > 0$ such that

\begin{align}
&|k(\xi,\eta)| \le C \quad \text{for all } \xi \in S,\ \eta \in R, \tag{29} \\
&\int_R |k(\xi,\eta)|^i\, \mu(d\eta) \le B D^i \quad \text{for all } \xi \in S \text{ and } i = 1, 2, \dots, \tag{30} \\
&\int_{\mathbb{R}} \big( e^{C|r|} - 1 - C|r|\, 1_{[-1,1]}(r) \big)\, V(dr) < \infty, \tag{31}
\end{align}
or there exist $C > 0$ and $R > 0$ such that

\begin{align}
&|k(\xi,\eta)| \le C \quad \text{for all } \xi \in S,\ \eta \in R, \tag{32} \\
&k(\xi,\eta) = 0 \quad \text{for } \|\xi - \eta\| > R, \tag{33} \\
&\mu \text{ is locally finite,} \tag{34}
\end{align}

\[
\int_{\mathbb{R}} \big( e^{C|r|} - 1 - C|r|\, 1_{[-1,1]}(r) \big)\, V(dr) < \infty. \tag{35}
\]
Note that (29) and (30) are satisfied for the Gaussian kernel if $\mu$ is Lebesgue measure, while (32) and (33) hold for the uniform kernel. In the case of a purely Gaussian basis, (30) is only needed for $i = 2$, and conditions (31) and (35) are trivially fulfilled since $V \equiv 0$.

5.2 The nth order product densities of an LLCP

The $n$th order product densities of LLCPs are easily derived, using Lévy theory.

Proposition 8 The nth order product density is given by

\[
\rho^{(n)}(\xi_1,\dots,\xi_n) = \exp\Big( \int_R K\Big( -\sum_{i=1}^n k(\xi_i,\eta),\ L'(\eta) \Big)\, \mu(d\eta) \Big), \tag{36}
\]
$\xi_1,\dots,\xi_n \in S$, provided the right-hand side exists.

Proof. The formula follows directly from the definition of the cumulant function and from the key relation (11). We get

\begin{align*}
\rho^{(n)}(\xi_1,\dots,\xi_n) &= \mathrm{E} \prod_{i=1}^n \Lambda(\xi_i) = \mathrm{E} \exp\Big( \sum_{i=1}^n \int_R k(\xi_i,\eta)\, L(d\eta) \Big) \\
&= \exp K\Big( -1, \sum_{i=1}^n \int_R k(\xi_i,\eta)\, L(d\eta) \Big) \\
&= \exp\Big( \int_R K\Big( -\sum_{i=1}^n k(\xi_i,\eta),\ L'(\eta) \Big)\, \mu(d\eta) \Big). \qquad \square
\end{align*}

Corollary 9 The intensity function of an LLCP $X$ is given by

\[
\rho(\xi) = \exp\Big( \int_R K(-k(\xi,\eta), L'(\eta))\, \mu(d\eta) \Big), \tag{37}
\]
provided the right-hand side exists. When the second order product density exists, the pair correlation function of an LLCP takes the following form:
\begin{align*}
g(\xi,\zeta) &= \exp\Big( \int_R \big[ K(-k(\xi,\eta) - k(\zeta,\eta), L'(\eta)) - K(-k(\xi,\eta), L'(\eta)) - K(-k(\zeta,\eta), L'(\eta)) \big]\, \mu(d\eta) \Big) \\
&= \exp\Big( \int_R k(\xi,\eta)\, k(\zeta,\eta)\, b(d\eta) + \int_R \int_{\mathbb{R}} \big[ e^{(k(\xi,\eta)+k(\zeta,\eta)) r} - e^{k(\xi,\eta) r} - e^{k(\zeta,\eta) r} + 1 \big]\, V(dr,\eta)\, \mu(d\eta) \Big).
\end{align*}
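For a purely Gaussian basis, $\int_R k(\xi,\eta)\, L(d\eta)$ is a Gaussian variable, so (37) reduces to the familiar log-normal mean $\mathrm{E}\exp N(\mu,\sigma^2) = \exp(\mu + \sigma^2/2)$. A quick Monte Carlo sanity check of this reduction (the values of $\mu$ and $\sigma^2$ below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, s2 = 0.3, 0.8                        # mean and variance of the Gaussian integral
exact = np.exp(mu + s2 / 2)              # rho = exp(mu + sigma^2 / 2)
mc = np.exp(rng.normal(mu, np.sqrt(s2), size=1_000_000)).mean()
```

With $10^6$ samples the Monte Carlo mean agrees with the closed form to a fraction of a percent.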

Corollary 10 (Stationary LLCP) Let $S = R = \mathbb{R}^d$. Assume that $k$ is a homogeneous kernel and $L$ a homogeneous Lévy basis with $\mu(d\eta) = c\, d\eta$ for some $c > 0$. Then,
\[
\rho = \exp\Big( c \int_{\mathbb{R}^d} K(-k(\eta), L')\, d\eta \Big)
\]
and
\[
g(\xi,\zeta) = \exp\Big( \tilde b\, c \int_{\mathbb{R}^d} k(\xi - \zeta + \eta)\, k(\eta)\, d\eta + c \int_{\mathbb{R}^d} \int_{\mathbb{R}} \big( e^{(k(\xi-\zeta+\eta)+k(\eta)) r} - e^{k(\xi-\zeta+\eta) r} - e^{k(\eta) r} + 1 \big)\, V(dr)\, d\eta \Big).
\]

5.3 Mixing properties

Proposition 11 Let $S = R = \mathbb{R}^d$ and assume that the Lévy basis $L$ is homogeneous and the kernel $k$ is homogeneous. Then, an LLCP with driving field of the form (26) is stationary and mixing.

Proof. As in the proof of Proposition 7, we immediately get the stationarity. The method of proving mixing has to be modified compared to the one used in Proposition 7. First, rewrite

\begin{align*}
L_\Lambda[h_1 + T_x h_2]
&= \mathrm{E} \Big[ \exp\Big( -\int_{\mathbb{R}^d} h_1(\xi) \exp\Big( \int_{\mathbb{R}^d} k(\xi - \eta)\, L(d\eta) \Big)\, d\xi \Big) \\
&\qquad \cdot \exp\Big( -\int_{\mathbb{R}^d} h_2(\xi + x) \exp\Big( \int_{\mathbb{R}^d} k(\xi - \eta)\, L(d\eta) \Big)\, d\xi \Big) \Big] \\
&= \mathrm{E}[A \cdot B_x],
\end{align*}
say. If $k$ has bounded support, $A$ and $B_x$ will be independent if $\|x\|$ is large enough. If $k$ does not have bounded support, we use a sequence of functions $k_n$

with bounded support that converges to $k$. To be precise, let as in Proposition 7
\[
k_n(u) = k(u)\, 1_{[0,n)}(\|u\|), \quad u \in \mathbb{R}^d,
\]
$n = 1, 2, \dots$. We have $k_n \to k$ and $|k_n| \le |k|$. Now, let

\[
A_n = \exp\Big( -\int_{\mathbb{R}^d} h_1(\xi) \exp\Big( \int_{\mathbb{R}^d} k_n(\xi - \eta)\, L(d\eta) \Big)\, d\xi \Big)
\]
and

\[
B_{x,n} = \exp\Big( -\int_{\mathbb{R}^d} h_2(\xi + x) \exp\Big( \int_{\mathbb{R}^d} k_n(\xi - \eta)\, L(d\eta) \Big)\, d\xi \Big).
\]
Note that $0 \le A, A_n, B_x, B_{x,n} \le 1$. Now, consider the following inequality:
\begin{align*}
|\mathrm{E}[A \cdot B_x] - \mathrm{E} A \cdot \mathrm{E} B_x|
&\le |\mathrm{E}[A \cdot B_x] - \mathrm{E}[A \cdot B_{x,n}]| + |\mathrm{E}[A \cdot B_{x,n}] - \mathrm{E}[A_n \cdot B_{x,n}]| \\
&\quad + |\mathrm{E}[A_n \cdot B_{x,n}] - \mathrm{E} A_n \cdot \mathrm{E} B_{x,n}| + |\mathrm{E} A_n \cdot \mathrm{E} B_{x,n} - \mathrm{E} A_n \cdot \mathrm{E} B_x| \\
&\quad + |\mathrm{E} A_n \cdot \mathrm{E} B_x - \mathrm{E} A \cdot \mathrm{E} B_x| \\
&= \delta_{1xn} + \delta_{2xn} + \delta_{3xn} + \delta_{4xn} + \delta_{5xn},
\end{align*}
say. Let us evaluate each of these five terms. Using that $0 \le A \le 1$ and that $L$ is homogeneous, we get

\begin{align*}
\delta_{1xn} &= |\mathrm{E}[A \cdot (B_x - B_{x,n})]| \\
&\le \mathrm{E}[A \cdot |B_x - B_{x,n}|] \\
&\le \mathrm{E}\, |B_x - B_{x,n}| \\
&= \mathrm{E}\, |B_0 - B_{0,n}|.
\end{align*}
Now, since $k_n \to k$ and $|k_n| \le |k|$, where $k$ is $L$-integrable, it follows that

\[
\int_{\mathbb{R}^d} k_n(\xi - \eta)\, L(d\eta) \to \int_{\mathbb{R}^d} k(\xi - \eta)\, L(d\eta)
\]
almost surely. We can therefore find $n_1$ (not dependent on $x$) such that for $n \ge n_1$, $\delta_{1xn} \le \epsilon$, say. Using the same type of arguments, we can find $n_2, n_4, n_5$ such that for $n \ge n_i$, $\delta_{ixn} \le \epsilon$, $i = 2, 4, 5$. Now choose a fixed $n \ge \max(n_1, n_2, n_4, n_5)$ and consider
\[
\delta_{3xn} = |\mathrm{E}[A_n \cdot B_{x,n}] - \mathrm{E} A_n \cdot \mathrm{E} B_{x,n}|.
\]
Using the previous results for bounded functions of bounded support, we finally find a constant $C > 0$ such that for $x$ with $\|x\| > C$ we have $\delta_{3xn} \le \epsilon$. This completes the proof. $\square$

5.4 Examples of LLCPs

5.4.1 Log shot noise Cox processes (LSNCPs)

Under the assumptions of Theorem 2, the driving field of an LLCP takes the form

\[
\Lambda(\xi) = \exp\Big( d(\xi) + \sum_{(r,\eta) \in \Phi} r\, k(\xi,\eta) \Big), \tag{38}
\]
where $d(\xi)$ is a deterministic function and $\Phi$ is the atoms of a Poisson measure on $\mathbb{R} \times R$ with intensity measure $U$. Such a process is called a log shot noise Cox process (LSNCP). It is important to realize that SNCPs and LSNCPs are quite different model classes. An SNCP $X$ with driving field of the form

\[
\Lambda(\xi) = \sum_{(r,\eta) \in \Phi} r\, k(\xi,\eta)
\]
is a superposition of independent Poisson processes $X_{(r,\eta)}$, $(r,\eta) \in \Phi$, where $X_{(r,\eta)}$ has intensity function $r\, k(\cdot,\eta)$. (The process $\{\eta : (r,\eta) \in \Phi\}$ is usually called the centre process (although it is not necessarily locally finite), while $X_{(r,\eta)}$ is called a cluster around $\eta$.) The presence of a particular cluster $X_{(r,\eta)}$ will not affect the presence of the other clusters. In contrast to this, the driving field of an LSNCP takes the form

\[
\Lambda(\xi) = \exp(d(\xi)) \prod_{(r,\eta) \in \Phi} \exp(r\, k(\xi,\eta)).
\]
A cluster $X_{(r,\eta)}$ with negative, numerically large values of $r\, k(\cdot,\eta)$ will very likely contain 0 points and, moreover, wipe out points from other clusters in the neighbourhood of $\eta$. In the resulting point pattern, empty holes may therefore be present. Examples of such point patterns are shown in Figure 3. Here, $\{\eta\}$ is a homogeneous Poisson process on $[0,100] \times [0,200]$ with intensity $c = 0.003$, $V(dr) = \frac{1}{3}\delta_1(r) + \frac{2}{3}\delta_{-1}(r)$ and the kernel is (left) $k(\xi) = 1(|\xi| \le R)$ and (right) $k(\xi) = \big(1 - \frac{|\xi|^3}{R^3}\big) 1(|\xi| \le R)$, respectively, with $R = 10$.

5.4.2 Log Gaussian Cox processes (LGCPs)

In this subsection, we consider LLCPs with driving field of the form

\[
\Lambda(\xi) = \exp\Big( \int_R k(\xi,\eta)\, L(d\eta) \Big), \tag{39}
\]
where $L$ is a Gaussian Lévy basis. Clearly, the resulting process is an LGCP [9, 29]. If $k$ and $L$ are homogeneous, the process is stationary. In this case, the random intensity function $\Lambda(\xi)$ is well defined for all $\xi \in \mathbb{R}^d$ and almost surely integrable if
\[
k(\xi) \le C,\ \xi \in \mathbb{R}^d, \quad \text{and} \quad \int_{\mathbb{R}^d} k(\xi)^2\, d\xi < \infty. \tag{40}
\]
The covariance function of the Gaussian field

\[
\Psi(\xi) = \int_{\mathbb{R}^d} k(\xi - \eta)\, L(d\eta)
\]
takes the form

\[
\mathrm{Cov}(\Psi(\xi_1), \Psi(\xi_2)) = \int_{\mathbb{R}^d} k(\xi_1 - \xi_2 + \eta)\, k(\eta)\, d\eta = c(\xi_1 - \xi_2),
\]
say. Note that under (40), $c$ is integrable. Under the mild additional assumption that the set of discontinuity points of $k$ has Lebesgue measure 0, $c$ is also

Figure 3: Examples of log shot noise Cox processes. Notice the circular empty holes in the point patterns. For details, see the text.

continuous. In the proposition below, we show that any stationary LGCP with a continuous and integrable covariance function can indeed be obtained as a kernel smoothing (39) of a Gaussian L´evy basis. The proposition is a generalization of a result mentioned in [21].

Proposition 12 Any stationary Gaussian random field with continuous and integrable covariance function can be generated by a kernel smoothing of a homogeneous Lévy basis.

Proof. Let $\{\Psi(\xi) : \xi \in \mathbb{R}^d\}$ be an arbitrary stationary zero mean Gaussian field. Let $c(\xi_1,\xi_2) = c(\xi_1 - \xi_2)$ denote its covariance function, which is a function of $\xi_1 - \xi_2$ due to the stationarity. Since $c$ is continuous and positive definite, it follows from Bochner's Theorem that

\[
c(\xi) = \int_{\mathbb{R}^d} e^{i \xi \cdot \eta}\, \tau(d\eta)
\]
for some non-negative measure $\tau$. Since $c$ is integrable and symmetric, $\tau$ has a symmetric density $f$, which can be found using the inverse Fourier transform. $\sqrt{f}$ is continuous and a member of $L^2(\mathbb{R}^d)$. Note: for a symmetric function defined on $\mathbb{R}^d$, the Fourier transform and its inverse are the same up to multiplication/division by the constant $(2\pi)^{d/2}$. By the convolution theorem for the Fourier(-Plancherel) transform we get

\[
\widehat{\sqrt{f}} * \widehat{\sqrt{f}} = \widehat{\sqrt{f} \cdot \sqrt{f}} = \hat f,
\]
thus
\[
\big( \widehat{\sqrt{f}} * \widehat{\sqrt{f}} \big)(\xi) = c(\xi).
\]
Put $k = \widehat{\sqrt{f}}$ and let $L$ denote a homogeneous Lévy basis with characteristic triplet $(0, 1, 0)$. Then, since the covariance function for $\int k\, dL$ is equal to $k * k$, our proof is complete. $\square$

In [29, Theorem 3], conditions for ergodicity are given in the special case of a stationary LGCP. Note that under (40), $c(\xi) \to 0$ for $\|\xi\| \to \infty$, and the conditions for ergodicity stated in [29, Theorem 3b] are satisfied.
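The proof is constructive and can be mimicked numerically. In the sketch below (one-dimensional, periodic grid, Gaussian target covariance; all of these choices are ours, not the paper's), the kernel is obtained as the inverse Fourier transform of the square root of the spectral density, and circular convolution of the kernel with itself recovers the covariance up to round-off:

```python
import numpy as np

n = 256
x = np.arange(n) - n // 2
c = np.exp(-(x / 10.0) ** 2 / 2)             # target covariance (Gaussian)

f = np.fft.fft(np.fft.ifftshift(c)).real      # spectral density (Bochner)
f = np.clip(f, 0.0, None)                     # guard tiny negative round-off
k = np.fft.fftshift(np.fft.ifft(np.sqrt(f)).real)   # kernel = FT of sqrt(f)

# circular convolution k * k should reproduce c
c_rec = np.fft.fftshift(np.fft.ifft(np.fft.fft(np.fft.ifftshift(k)) ** 2).real)
```

Smoothing white noise with this kernel would then yield a stationary Gaussian field with (approximately) the prescribed covariance.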

6 Combinations of LCPs and LLCPs

The driving field of an LLCP has the form

\[
\Lambda(\xi) = \exp\Big( \int_R k(\xi,\eta)\, L_J(d\eta) \Big) \exp\Big( \int_R k(\xi,\eta)\, L_G(d\eta) \Big) = \Lambda_J(\xi)\, \Lambda_G(\xi).
\]
It seems natural to extend the model such that the kernels used in the jump part and the Gaussian part need not be the same. We thereby arrive at Cox processes with a driving field of the form

\[
\Lambda(\xi) = \exp\Big( \int_R k(\xi,\eta)\, L_J(d\eta) \Big)\, \Lambda_G(\xi) \tag{41}
\]
with $\Lambda_G$ an arbitrary log Gaussian random field. If $L_J$ satisfies the regularity conditions of Theorem 2, we get

\[
\Lambda(\xi) = \exp\Big( d(\xi) + \sum_{(r,\eta) \in \Phi} r\, k(\xi,\eta) + Y(\xi) \Big),
\]
where $d(\xi)$ is a deterministic function, $\Phi$ is the atoms of a Poisson measure with intensity measure $U$ and $Y$ is an independent Gaussian field.

A related model can be found in [37] for modelling the positions of offspring in a long-leaf pine forest, given the positions of the parents and information about the topography. The model in [37] is formulated conditional on the positions $\eta$ of the parents.

There are, of course, other possibilities for combining shot noise components and log Gaussian components in the driving field than the one suggested above. For instance, if $L_J$ is a non-negative Lévy jump basis, we may consider Cox processes driven by

Λ(ξ) = k(ξ, η)LJ (dη) ΛG(ξ) ZR 

= k(ξ, η)a0(dη)+ rk(ξ, η) ΛG(ξ), (42)  R  Z (r,ηX)∈N   cf. [45]. In [13, pp. 92-100], a Cox process model of the type described in (42) has been considered but now with the Gaussian field replaced by a Boolean field. Such a model will be able to produce shot noise point patterns with empty holes generated by the Boolean field.

7 Inhomogeneous LCPs and LLCPs

In [33], it has recently been suggested to introduce inhomogeneity into a Cox process such that the resulting process becomes second-order intensity reweighted stationary, see [1] for details. In this section, we describe four types of inhomogeneity. Only Type 3 leads to second-order intensity reweighted stationary processes.

We concentrate on SNCPs with $a_0 \equiv 0$, cf. Section 4.4.1. The interpretation of the type of inhomogeneity introduced may be facilitated by using the cluster representation of a shot noise Cox process $X$. It is not needed that the process of cluster centres (mothers) is locally finite in order to use this interpretation.

Example 4 (Type 1). The kernel is assumed to be homogeneous, $k(\xi,\eta) = k(\xi - \eta)$, while the Lévy basis satisfies $V(dr,\eta) = V(dr)$, $\nu(d\eta) = c f(\eta)\, d\eta$. If the function $f$ is non-constant, mothers will be unevenly distributed (according to $\nu$), but the distribution of the clusters will not depend on the location in the sense that the distribution of $X_{(r,\eta)} - \eta$ does not depend on $\eta$. $\square$

Example 5 (Type 2). The kernel is assumed to be homogeneous, $k(\xi,\eta) = k(\xi - \eta)$, while the Lévy basis satisfies $V(dr,\eta) = V\big(d\big(\tfrac{r}{f(\eta)}\big)\big)$ and $\nu(d\eta) = c\, d\eta$. In this case, the mothers will be evenly distributed while the distribution of the clusters may be location dependent. A model with $(k, V)$ replaced by $k(\xi,\eta) = k(\xi - \eta) f(\eta)$ and $V(dr,\eta) = V(dr)$ will result in the same type of LCP. $\square$

Example 6 (Type 3). The kernel is inhomogeneous, of the form $k(\xi,\eta) = k(\xi - \eta) f(\xi)$, while the Lévy basis is homogeneous, $V(dr,\eta) = V(dr)$ and $\nu(d\eta) = c\, d\eta$. The resulting LCP will be a location dependent thinning of a stationary LCP. This option has been discussed in [33, 44] with the following log-linear specification of the function $f$:

\[
f(\xi) = \exp(z(\xi) \cdot \beta).
\]
Here, $z(\xi)$ is a list of explanatory variables and $\beta$ a parameter vector. Note that Type 2 and Type 3 inhomogeneity will typically have a similar appearance. The reason is that they can be regarded as differing only in the specification of the kernel, as either of the form

\[
k(\xi,\eta) = k(\xi - \eta) f(\eta) \quad \text{(Type 2)}
\]
or
\[
k(\xi,\eta) = k(\xi - \eta) f(\xi) \quad \text{(Type 3)},
\]
and $k(\xi - \eta)(f(\eta) - f(\xi))$ is only non-negligible if $\xi$ and $\eta$ are close enough that $k(\xi - \eta)$ is non-negligible and, at the same time, there is an abrupt change in $f$ between $\xi$ and $\eta$. $\square$

Example 7 (Type 4). Inhomogeneity may also be introduced into the process by a local scaling mechanism [17, 18]. Here, the kernel is inhomogeneous,

\[
k(\xi,\eta) = \frac{1}{f(\eta)^d}\, k\Big( \frac{\xi - \eta}{f(\eta)} \Big),
\]
while $V(dr,\eta) = V(dr)$ and $\nu(d\eta) = c\, d\eta / f(\eta)^d$. The inhomogeneity of the resulting point process can be explained by local scaling. $\square$

In Figure 4, examples of inhomogeneous LCPs of Type 1, 2 and 4 are given on $S = R = [0,100] \times [0,200]$. Here, $k$ is the Gaussian kernel (14) with $\sigma = 2$, $c = 1/200$ and $V$ is concentrated in $r = 18$. The inhomogeneity function $f$ is linear in all three cases, $f(x,y) = y/100$.

Figure 4: Examples of realizations of inhomogeneous LCPs. From left to right, Type 1, 2 and 4, respectively. For details, see the text.

Inhomogeneity may be introduced into an LSNCP by changing L or k as indicated in the examples above. Compared to LCPs, the effects are now multiplicative.

8 Discussion

During the last years, there has been some debate concerning which of the two model classes (SNCP or LGCP) is the more appropriate [28, 32, 38, 49]. The modelling framework described in the present paper provides the possibility of using models involving both SNCP and LGCP components and subsequently testing whether it is appropriate to reduce the model to a pure SNCP model or a pure LGCP model. Figure 5 summarizes the most important model classes treated in the present paper. Below, we discuss a few additional issues.


Figure 5: Overview of L´evy based Cox point processes. The abbreviations DF and LGF are used for a discrete random field and a log Gaussian random field, respectively. For further details, see the text.

8.1 Probability densities of LCPs and LLCPs

It is possible to derive an expression for the density of an LCP or an LLCP using the methodology of Lévy bases. For instance, in the case of an LCP with $a_0 \equiv 0$, the density of $X_B$ for $B \in \mathcal{B}_b(S)$ can be written as an expansion involving complete Bell polynomials evaluated at certain cumulants. The derivation of this result utilizes (17). Unfortunately, the expansion seems to be too complicated to be of practical use for inference. The same type of conclusion has been reached for likelihood inference in G-shot-noise Cox processes and in log Gaussian Cox processes, see [5, Section 4.2.1] and [29]. Closed form expressions for densities of other types of Cox processes are available, see [26].

8.2 Spatio-temporal extensions

The LCPs and LLCPs extend easily to spatio-temporal Cox processes. The set $S$ on which the process lives is now a Borel subset of $\mathbb{R}^d \times \mathbb{R}$, where the last copy of $\mathbb{R}$ indicates time. The dependency on the past at time $t$ and position $x$ may be modelled using an ambit set

\[
A_t(x), \quad x \in \mathbb{R}^d,\ t \in \mathbb{R},
\]
satisfying

\[
(x,t) \in A_t(x), \qquad A_t(x) \subseteq \mathbb{R}^d \times (-\infty, t].
\]
A spatio-temporal LCP is then defined by a driving intensity of the form

\[
\Lambda(x,t) = \int_{A_t(x)} k((x,t),(y,s))\, L(d(y,s)),
\]
where $L$ is a non-negative Lévy basis on $R \subseteq \mathbb{R}^d \times \mathbb{R}$ and $k$ is a non-negative weight function. Likewise, a spatio-temporal LLCP has driving field of the form
\[
\Lambda(x,t) = \exp\Big( \int_{A_t(x)} k((x,t),(y,s))\, L(d(y,s)) \Big),
\]
where $L$ and $k$ no longer need to be non-negative. Using the tools of Lévy theory, it is possible to derive moment relations as shown in the present paper for the purely spatial case [35]. This approach to spatio-temporal modelling is expected to be very flexible and has earlier been used with success in growth modelling [23], see also [22]. It will be interesting to investigate how it performs compared to earlier methods described in [6, 7, 8, 14].

8.3 Statistical inference

Statistical inference for Cox processes has been studied earlier in a number of papers, including [4, 19, 28, 32, 33, 43]. It remains to investigate to what degree known procedures, based on summary statistics, likelihood or Bayesian reasoning, can be adjusted to deal with LCPs and LLCPs.

For a stationary LCP with $\Lambda = \rho_1 + \Lambda_2$, it is easy to determine the summary statistics $F$, $G$ and $J$ in terms of the corresponding characteristics $F_2$, $G_2$ and $J_2$ of the shot noise component with intensity field $\Lambda_2$. Thus,
\begin{align*}
1 - F(r) &= \exp(-\rho_1 |B(0,r)|)\, (1 - F_2(r)), \\
1 - G(r) &= \exp(-\rho_1 |B(0,r)|) \Big( \frac{\rho_1}{\rho_1 + \rho_2}\, (1 - F_2(r)) + \frac{\rho_2}{\rho_1 + \rho_2}\, (1 - G_2(r)) \Big), \\
J(r) &= \frac{\rho_1}{\rho_1 + \rho_2} + \frac{\rho_2}{\rho_1 + \rho_2}\, J_2(r).
\end{align*}
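Since $J = (1 - G)/(1 - F)$, the third relation follows from the first two by cancelling the common factor $\exp(-\rho_1 |B(0,r)|)$; a numeric spot check with arbitrary values (our choice, for illustration only):

```python
import numpy as np

rho1, rho2 = 0.4, 1.1        # intensities of the Poisson and shot noise parts
B = 3.7                      # |B(0, r)|
F2, G2 = 0.35, 0.60          # arbitrary values of the SNCP summary statistics
J2 = (1 - G2) / (1 - F2)

one_F = np.exp(-rho1 * B) * (1 - F2)                            # 1 - F(r)
one_G = np.exp(-rho1 * B) * (rho1 / (rho1 + rho2) * (1 - F2)
                             + rho2 / (rho1 + rho2) * (1 - G2))  # 1 - G(r)

J_direct = one_G / one_F                                         # J = (1-G)/(1-F)
J_formula = rho1 / (rho1 + rho2) + rho2 / (rho1 + rho2) * J2
```

The two values of $J(r)$ agree to machine precision, whatever values are chosen.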

However, in general, simple expressions for $G_2$ and $J_2$ in terms of model parameters are not available. Likewise, it does not seem to be possible to derive general closed form expressions for $F$, $G$ and $J$ in the case of an LLCP.

In order to evaluate whether both a jump part and a Gaussian part are needed in an LLCP, we may consider a third order summary statistic, suggested in [29] (for stationary point processes),
\[
z(t) = \frac{1}{\pi^2 t^4} \int_{\|\xi\| \le t} \int_{\|\zeta\| \le t} \frac{\rho^{(3)}(\xi,\zeta)}{(\rho^{(1)})^3\, g(\xi)\, g(\zeta)\, g(\xi - \zeta)}\, d\xi\, d\zeta, \quad t > 0, \tag{43}
\]
where the following abbreviated notation is used due to the stationarity:
\[
g(\xi_1,\xi_2) = g(\xi_2 - \xi_1), \qquad \rho^{(3)}(\xi_1,\xi_2,\xi_3) = \rho^{(3)}(\xi_2 - \xi_1, \xi_3 - \xi_1).
\]
When computing the integrand in $z(t)$ for an LLCP, we obtain
\[
\frac{\rho^{(3)}(\xi_1,\xi_2,\xi_3)}{(\rho^{(1)})^3\, g(\xi_1,\xi_2)\, g(\xi_2,\xi_3)\, g(\xi_1,\xi_3)} = \frac{\mathrm{E}\big( \prod_{i=1}^3 \Lambda_J(\xi_i) \big)\, \big( \prod_{i=1}^3 \mathrm{E}\, \Lambda_J(\xi_i) \big)}{\mathrm{E}(\Lambda_J(\xi_1)\Lambda_J(\xi_2))\, \mathrm{E}(\Lambda_J(\xi_2)\Lambda_J(\xi_3))\, \mathrm{E}(\Lambda_J(\xi_1)\Lambda_J(\xi_3))}, \tag{44}
\]
where $\Lambda_J(\xi) = \exp\big( \int_R k(\xi,\eta)\, L_J(d\eta) \big)$ is the part of the driving intensity originating from the pure jump part of the Lévy basis. Thus, this characteristic of $X$ is not influenced by the Gaussian part of the model. In particular, $z \equiv 1$ for log Gaussian Cox processes. A non-parametric unbiased estimator of $z(t)$ has been derived in [29, Theorem 2].

Assessment of the full potential of the new modelling framework described in the present paper will also require more detailed studies of inhomogeneity and practical experience with concrete applications of the models.

Acknowledgements

We are grateful to Ole E. Barndorff-Nielsen for sharing his thoughts with us. This research has been supported by grants from the Carlsberg Foundation and the Danish Natural Science Research Council.

References

[1] Baddeley, A. J.; Moller, J.; Waagepetersen, R. (2000) Non- and semi- parametric estimation of interaction in inhomogeneous point patterns. Statist. Neerlandica 54, 329–350. [2] Barndorff-Nielsen, O.E., Blæsild, P. and Schmiegel, J. (2004). A parsi- monious and universal description of turbulent velocity increments. Eur. Phys. J. B 41, 345 –363. [3] Barndorff-Nielsen, O.E. and Schmiegel, J. (2004). L´evy based tempo- spatial modelling; with applications to turbulence. Uspekhi Mat. Nauk 159, 63–90. [4] Best, N.G., Ickstadt, K. and Wolpert, R.L. (2000). Spatial Poisson re- gression for health and exposure data measured at disparate resolutions. J. Amer. Statist. Assoc. 95, 1076–1088. [5] Brix, A. (1999). Generalized gamma measures and shot-noise Cox pro- cesses. Adv. Appl. Prob. 31, 929–953. [6] Brix, A. and Chadoeuf, J. (2002). Spatio-temporal modeling of weeds by shot-noise G Cox processes. Biom. J. 44, 83–99. [7] Brix, A. and Diggle, P.J. (2001). Spatio-temporal prediction for log- Gaussian Cox processes. J. Roy. Statist. Soc. B 63, 823–841. [8] Brix, A. and Møller, J. (2001). Space-time multitype log Gaussian Cox processes with a view to modelling weed data. Scand. J. Statist. 28, 471–488. [9] Coles, P. and Jones, B. (1991). A lognormal model for the cosmological mass distribution. Monthly Notes of the Royal Astronomical Society 248, 1–13. [10] Comtet, L. (1970). Analyse Combinatoire. Tomes I, II. Presses Univer- sitaires de France, Paris. [11] Cox, D.R. (1955). Some statistical models related with series of events. J. Roy. Statist. Soc. B 17, 129–164. [12] Daley, D.J. and Vere-Jones, D. (1988). An Introduction to the Theory of Point Processes. Springer Verlag, New York. [13] Diggle, P.J. (2003). Statistical Analysis of Spatial Point Patterns. Second Edition. Hodder Arnold, London. [14] Diggle, P., Rowlingson, B. and Su, T. (2005). Point process methodology for on-line spatio-temporal disease surveillance. Environmetrics 16, 423– 434. [15] Federer, H. (1969). 
Geometric Measure Theory. Springer Verlag, Berlin.

27 [16] Ferguson, T.S. and Klass, M.J. (1972). A representation of independent increment processes without Gaussian components. Ann. Math. Statist. 43, 1634–43. [17] Hahn, U. (2007). Global and Local Scaling in the Statistics of Spatial Point Processes. Habilitationsschrift, Institut f¨ur Mathematik, Univer- sit¨at Augsburg. [18] Hahn, U., Jensen, E.B.V., van Lieshout, M.N.M. and Nielsen, L.S. (2003). Inhomogeneous spatial point processes by location dependent scaling. Adv. Appl. Prob. (SGSA) 35, 319–336. [19] Heikkinen, J. and Arjas, E. (1998). Non-parametric Bayesian estimation of a spatial Poisson intensity. Scand. J. Statist. 25, 435–450. [20] Hellmund, G. (2005). L´evy driven Cox processes with a View towards Modelling Tropical Rain Forest. Master Thesis, Department of Mathe- matical Sciences, University of Aarhus, Denmark. [21] Higdon, D. (2002). Space and space-time modeling using process con- volutions. In Quantitative Methods for Current Environmental Issues, eds. C.W. Anderson, V. Barnett, P.C. Chatwin and A.H. El-Shaarawi, Springer, London, pp. 37–56. [22] Jensen, E.B.V., J´onsd´ottir, K.Y., Schmiegel, J. and Barndorff- Nielsen, O.E. (2007). Spatio-temporal modelling – with a view to biolog- ical growth. In Statistics of Spatio-Temporal Systems, eds. B.F. Finken- stadt, L. Held and V. Isham. Monographs on Statistics and Applied Prob- ability, Chapman & Hall/CRC, pp. 47-75. [23] J´onsd´ottir, K.Y., Schmiegel, J. and Jensen, E.B.V. (2008). L´evy-based growth models. Bernoulli 14, 62–90. [24] Khoshnevisan, D. (2002). Multiparameter Processes. An Introduction to Random Fields. Springer Monographs in Mathematics. Springer-Verlag, New York. [25] Mat´ern, B. (1960). Spatial Variation. Meddelanden 49, Statens Skogs- forskningsinstitut, Stockholm. Second edition. Lectures Notes Statist. 36, Springer, Berlin, 1986. [26] McCullagh, P. and Møller, J. (2006). The permanental process. Adv. Appl. Prob. (SGSA) 38, 873–888. [27] Møller, J. (2003a). 
Shot noise Cox processes. Adv. Appl. Prob. (SGSA) 35, 614–640. [28] Møller, J. (2003b). A comparison of spatial point process models in epi- demiological applications. In Highly Structured Stochastic Systems, eds. P.J. Green, N.L. Hjort and S. Richardson. Oxford University Press, Ox- ford, pp. 264–268. [29] Møller, J., Syversveen, A.R. and Waagepetersen, R.P. (1998). Log Gaus- sian Cox processes. Scand. J. Statist. 25, 451–482. [30] Møller, J. and Torrisi, G.L. (2005). Generalised shot noise Cox processes. Adv. Appl. Prob. (SGSA) 37, 48–74. [31] Møller, J. and Waagepetersen, R.P. (2003). Statistical Inference and Sim- ulation for Spatial Point Processes. Chapman & Hall/CRC, Boca Raton.

[32] Møller, J. and Waagepetersen, R.P. (2002). Statistical inference for Cox processes. In Spatial Cluster Modelling, eds A.B. Lawson and D. Denison. Chapman & Hall/CRC, Boca Raton, pp. 37–60.

[33] Møller, J. and Waagepetersen, R.P. (2007). Modern statistics for spatial point processes. Scand. J. Statist. 34, 643–684.

[34] Neyman, J. and Scott, E.L. (1958). Statistical approach to problems of cosmology. J. R. Statist. Soc. B 20, 1–43.

[35] Prokešová, M., Hellmund, G. and Jensen, E.B.V. (2006). On spatio-temporal Lévy based Cox processes. In Proceedings S4G, International Conference on Stereology, Spatial Statistics and Stochastic Geometry, Prague, June 26 to 29 2006, eds. R. Lechnerová, I. Saxl and V. Beneš. Union of Czech Mathematicians and Physicists, Prague, pp. 111–116.

[36] Rajput, B.S. and Rosinski, J. (1989). Spectral representation of infinitely divisible processes. Probab. Theory Relat. Fields 82, 451–487.

[37] Rathbun, S.L. and Cressie, N. (1994). A space-time survival point process for longleaf pine forest in southern Georgia. J. Amer. Statist. Assoc. 89, 1164–1174.

[38] Richardson, S. (2003). Spatial models in epidemiological applications. In Highly Structured Stochastic Systems, eds. P.J. Green, N.L. Hjort and S. Richardson. Oxford University Press, Oxford, pp. 237–259.

[39] Rosinski, J. (2007). Simulation of Lévy processes. To appear in Encyclopedia of Statistics in Quality and Reliability: Computational Intensive Methods and Simulation. Wiley.

[40] Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge.

[41] Stoyan, D., Kendall, W.S. and Mecke, J. (1995). Stochastic Geometry and its Applications. Second edition. Wiley, Chichester.

[42] Thomas, M. (1949). A generalization of Poisson's binomial limit for use in ecology. Biometrika 36, 18–25.

[43] Van Lieshout, M.N.M. and Baddeley, A.J. (1996). A nonparametric measure of spatial interaction in point patterns. Statist. Neerlandica 50, 344–361.

[44] Waagepetersen, R.P. (2005). Discussion of the paper by Baddeley, Turner, Møller & Hazelton (2005). J. Roy. Statist. Soc. B 67, 662.

[45] Waagepetersen, R.P. (2007). Personal communication.

[46] Walker, S. and Damien, P. (2000). Representations of Lévy processes without Gaussian components. Biometrika 87(2), 477–483.

[47] Wiegand, T., Gunatilleke, S., Gunatilleke, N. and Okuda, T. (2007). Analyzing the spatial structure of a Sri Lankan tree species with multiple scales of clustering. Ecology 88, 3088–3102.

[48] Wolpert, R.L. (2001). Lévy moving averages and spatial statistics. Lecture. Slides at //citeseer.ist.psu.edu/wolpert01leacutevy.html.

[49] Wolpert, R.L. and Ickstadt, K. (1998a). Poisson/gamma random field models for spatial statistics. Biometrika 85, 251–267.

[50] Wolpert, R.L. and Ickstadt, K. (1998b). Simulation of Lévy random fields. In Practical Nonparametric and Semiparametric Bayesian Statistics, eds. D. Dey, P. Müller and D. Sinha. Springer Lecture Notes in Statistics Vol. 133, Springer, New York, pp. 227–242.

Completely random signed measures

Gunnar Hellmund

T.N. Thiele Centre for Applied Mathematics in Natural Science Department of Mathematical Sciences University of Aarhus, Denmark

Abstract

Completely random signed measures are defined, characterized and related to Lévy random measures and Lévy bases.

1 Introduction

Completely random measures were defined in Kingman (1993). As described in Kingman (1993); Karr (1991) and more recently in Daley and Vere-Jones (2003, 2008), completely random measures are related to point process models, in particular Poisson cluster point processes. We make a natural extension of completely random measures to completely random signed measures and give a characterization of this class of signed random measures. It is shown that the class of Lévy random measures, introduced and used in Lévy adaptive regression kernel models Tu et al. (2006), and the class of Lévy bases, defined in Barndorff-Nielsen and Schmiegel (2004) and used in spatio-temporal modeling in Barndorff-Nielsen and Schmiegel (2004); Hellmund et al. (2008); Jónsdóttir et al. (2008), are natural extensions of completely random signed measures. Furthermore we show that the assumption of infinite divisibility in the definition of Lévy random measures and Lévy bases can be replaced by other very mild assumptions. The most basic concept involved in the definition of Lévy random measures and Lévy bases is thus independence.

2 Signed random measures

We let X denote a Borel subset of ℝ^d for some d ≥ 1 and B = B(X) denote the trace of the Borel sigma algebra on X. By B_b we denote the set of bounded Borel subsets of X; a subset of X is bounded if the closure of the set is compact. Let M denote the set of signed Radon measures on B, i.e. an element of M is a σ-additive set function which takes finite values on compact sets, in particular on all bounded Borel subsets of X. We let M+ denote the subset of M consisting of positive Radon measures. There are several different definitions of signed Radon measures in the literature; we use what we believe is the most general, see Ash (1972).

We define a random signed measure M as a measurable mapping from a probability space (Ω, E, P) into (M, F), where

F = σ{π_f | f ∈ C_c(X)},   π_f : M → ℝ, µ ↦ ∫_X f(x) µ(dx),

and C_c(X) is the set of continuous functions on X with compact support. Let F+ denote the trace of F on M+; then F+ = B(M+). A random measure is defined as a measurable mapping from a probability space into (M+, F+). The lemma below, used in the sequel, concerns the fixed atoms of a signed random measure. We say x ∈ X is a fixed atom of M if and only if

P (|M ({x})| > 0) > 0.

Lemma 2.1. A signed random measure has at most countably many fixed atoms.

Proof. Assume there are more than countably many fixed atoms. Then there exists a sequence {x_n | n ≥ 1} of fixed atoms contained in a bounded set such that

∃b > 0, a > 0 ∀n ≥ 1 : P(|M({x_n})| > b) > a.

Thus P(lim sup_n {|M({x_n})| > b}) ≥ a and Σ_n M({x_n}) cannot be convergent, which is a contradiction.

3 A result on infinite divisibility

Lemma 3.1. Let M denote a stochastic process with index set B_b such that (M(B_n))_{n≥1} are independent and

M(∪_n B_n) = Σ_n M(B_n)

P-a.s. for disjoint sets (B_n)_{n≥1} ⊂ B_b with ∪_n B_n ∈ B_b. Then M(B) is infinitely divisible for any B ∈ B_b if the cumulant function of M(A), A ∈ B_B, is of the form

C{M(A) ‡ t} = log E[e^{itM(A)}] = ∫_A f_{t,B}(x) λ_B(dx)   (1)

for some measurable function f_{t,B} : X → ℂ, for all t ∈ ℝ\{0}, and an atom-less finite measure λ_B on (B, B_B), where B_B = B(B).

Remark 3.2. For given t ∈ ℝ and B ∈ B_b the cumulant transform, because of independence, defines a complex measure on (B, B_B). If M(A) is zero with probability one on all sets in B_B with Lebesgue measure zero, then Lebesgue measure dominates the complex measure generated by the cumulant transform for all t ∈ ℝ and thus, by Radon–Nikodym, condition (1) in the above lemma is fulfilled.

Proof. Let B ∈ B_b be given. Since λ_B(B) is finite, by Lemma 12.2, p. 268 in Karlin and Studden (1966) we can choose (B_s)_{0≤s≤1} such that B_0 = ∅, B_{s′} ⊆ B_s for s′ ≤ s, B_1 = B and λ_B(B_s) = s·λ_B(B) for s ∈ [0, 1]. It is clear that the stochastic process (M_s)_{s≥0} = M(B_{s∧1}) has independent increments and M(B_0) = 0 P-a.s. Furthermore (M_s) is stochastically continuous: Let 0 ≤ s < 1 be given; then for any s′ ∈ (s, 1), M_{s′} − M_s = M(B_{s′}\B_s). Since

λ_B(B_{s′}\B_s) = (s′ − s) λ_B(B) → 0, s′ ↓ s,

the cumulant transform (1) of M(B_{s′}\B_s) tends to zero, so M(B_{s′}\B_s) converges to 0 in distribution and hence in probability as s′ ↓ s. Left continuity for 0 < s ≤ 1 is proved similarly.

By Definition 1.6 in Sato (2005), (M_s) is an additive process in law. Therefore M_1 = M(B) is infinitely divisible, see Theorem 9.1 in Sato (2005).

4 Completely random signed measures

Definition 4.1. A completely random (signed) measure is a random (signed) measure M with independent values on disjoint sets, i.e. {M(A_n)} are independent whenever {A_n} is a family of disjoint sets.

Corollary 4.2. A completely random signed measure with cumulant transform satisfying condition (1) in Lemma 3.1 has infinitely divisible values.

In the sequel we use the definition of a Poisson point process found in Kingman (1993).

Definition 4.3. A Poisson point process Φ on Y, a Borel subset of ℝ^l for some l ≥ 1, is a random countable subset of Y such that

• The number of points N(A) in a Borel subset A of Y is Poisson distributed with mean value µ(A), where µ is a measure on B(Y) (µ may be infinite on bounded sets!).

• Given disjoint sets A_1, ..., A_n the random variables N(A_1), ..., N(A_n) are independent.

The theorem below is stated in Kingman (1993), which also provides a sketch of a proof. We give a detailed proof, since we use important elements of the proof in the sequel.

Theorem 4.4. Given a completely random measure M fulfilling condition (1) in Lemma 3.1, there exist a Radon measure m, an at most countable set of fixed atoms {x_i}_{i∈I} ⊂ X, independent positive random variables {W_i} and a Poisson point process Φ on X × ℝ_+, such that for any B ∈ B_b

M(B) ∼ m(B) + Σ_{i∈I} W_i·1_B(x_i) + Σ_{(X_j,U_j)∈Φ} U_j·1_B(X_j).   (2)

Proof. Using Lemma 2.1 we note the set of fixed atoms is at most countable. In the remaining part of the proof we assume M has no fixed atoms. By Lemma 3.1, M(B) is infinitely divisible for any B ∈ B_b. For any B in B_b there exist a constant m(B) ∈ ℝ_+ and a measure ν_B on ℝ_+ such that ∫(|x| ∧ 1) ν_B(dx) is finite and for any t ∈ ℝ_+:

λ_t(B) = log E[e^{itM(B)}] = m(B)·it + ∫_{(0,∞]} (e^{itz} − 1) ν_B(dz),

see Exc. 11, Chap. 15 in Kallenberg (2002).

By the properties of λ_t, (A, B) ↦ ν_B(A) is a bi-measure. There exists a unique σ-finite measure ν on (X × ℝ_+, B ⊗ B((0, ∞])) satisfying

ν(B × C) = ν_B(C)

for all B ∈ B_b and C ∈ B((0, ∞]), see (9.17) in Sato (2005). We see that m defines a Radon measure on B. Assume without loss of generality that m ≡ 0. Let Φ denote a Poisson point process on X × ℝ_+ with mean measure ν. Notice the number of points from Φ in a bounded set need not be finite. Define

M′(B) = Σ_{(X_j,U_j)∈Φ} U_j·1_B(X_j)

for any B ∈ B_b. Then for any B ∈ B_b, using that ν is σ-finite and applying Campbell's Theorem (Kingman, 1993), we get

E[exp(itM′(B))] = exp( ∫_{B×(0,∞]} (e^{itz} − 1) ν(dx × dz) ) = exp( ∫_{(0,∞]} (e^{itz} − 1) ν_B(dz) ).

Assuming without loss of generality that M′ is a random measure, M and M′ are equal in distribution.

We were not able to find a reference for the lemma below, concerning additive processes, thus a short proof is provided.

Lemma 4.5. Given a continuous additive process (X_s), s ≥ 0, with paths of bounded variation, X_0 = 0 and characteristic triplet of the form (A_s, 0, γ(s)), we have A_s ≡ 0 and X_s = γ(s), i.e. the process has no Gaussian part and is deterministic; see Sato (2005) for notation.

Proof. The total variation process (V_s^X) of (X_s) is an increasing, continuous additive process. Set

Y_s = e^{−V_s^X} / E[e^{−V_s^X}], s ≥ 0.

Then (Y_s) is a continuous martingale of bounded variation and thus constant; therefore (V_s^X) is deterministic, implying (X_s) is integrable and γ(s) is of bounded variation. X_s − γ(s) therefore defines a continuous martingale of bounded variation, thus X_s = γ(s).

Lemma 4.6. Let M denote a completely random signed measure and suppose the condition on the cumulant transform in Lemma 3.1 is fulfilled. Then M has no Gaussian part.

Proof. Let B ∈ B_b be given and let (M_s) denote the additive process in law constructed in the proof of Lemma 3.1. In the proof below, all references are to Sato (2005).

By Theorem 11.5 we can choose a càdlàg modification of (M_s). Given a càdlàg modification M̃ of M, n ≥ 1 and the partition 0 = s_0 < s_1 = 1/n < ··· < s_{n−1} = (n − 1)/n < s_n = 1 of the interval [0, 1], we have P-a.s.

Σ_i |M̃_{s_i} − M̃_{s_{i−1}}| = Σ_i |M_{s_i} − M_{s_{i−1}}| ≤ |M|(B) < ∞,

since M is a random signed measure. Without loss of generality we assume M is an additive process of bounded variation (see Lemma 21.8 (i)). Using Theorem 9.8, the law of M is uniquely determined by a characteristic triplet (A_s, ν_s, γ(s)), each component satisfying conditions given in the theorem. For every s ≥ 0, ν_s is a Lévy measure on ℝ. Define H = (0, ∞) × ℝ\{0} and let B(H) denote the Borel subsets of H. By (19.1) we define

J(D, ω) = #{s > 0 | (s, M_s − M_{s−}) ∈ D}, D ∈ B(H).

Because of bounded variation we have (Lemma 21.8) for any s > 0

∫_{(0,s]×ℝ\{0}} |x| J(d(t, x), ω) < ∞,

and, as shown on pages 141–142, ∫_{|x|≤1} |x| ν_s(dx) < ∞ for all s > 0. Using Theorem 19.3 and Lemma 21.8 there exist processes M^J and M^G such that M = M^J + M^G and M^G is P-a.s. an additive process, continuous in s, of bounded variation and with characteristic triplet (A_s, 0, γ(s) − ∫_{|x|≤1} x ν_s(dx)); thus A_s ≡ 0 (see Lemma 4.5).

Theorem 4.7. Given a completely random signed measure M fulfilling condition (1) in Lemma 3.1, then for all B ∈ B_b:

M(B) ∼ µ^+(B) − µ^−(B) + Σ_{i∈I} W_i·1_B(x_i) + Σ_{(X_j,U_j)∈Φ} U_j·1_B(X_j),

where Φ is a Poisson point process on X × ℝ, the W_i are independent random variables, the x_i, i ∈ I, form an at most countable set of points in X and µ^+, µ^− are Radon measures.

Proof. Assume without loss of generality that M has no fixed atoms. Applying Proposition 2.1 in Rajput and Rosinski (1989) (see this reference for notation), for every B ∈ B_b:

C{M(B) ‡ t} = it·( a(B) − ∫_{|z|≤1} z U_B(dz) ) + ∫_ℝ (e^{itz} − 1) U_B(dz).

From the proof of the previous lemma we have that

∫_{|z|≤1} |z| U_B(dz) < ∞.   (3)

Thus we can apply Campbell's Theorem (Kingman, 1993). By an argument similar to the one found in the proof of Theorem 4.4, we can construct a Poisson point process Φ on X × ℝ such that

C{ Σ_{(X_j,U_j)∈Φ} U_j·1_B(X_j) ‡ t } = ∫_ℝ (e^{itz} − 1) U_B(dz)

(see Lemma 2.3 in Rajput and Rosinski (1989) for existence of a (mean) measure on X × ℝ with the required properties). It remains to note that a(B) − ∫_{|z|≤1} z U_B(dz) is finite for all bounded sets B.

5 Lévy bases

Definition 5.1. A stochastic process L indexed by B_b is called a Lévy basis if L(B) is infinitely divisible for all B in B_b, and L(B_n), n ≥ 1, are independent with

L(∪_n B_n) = Σ_n L(B_n)

P-a.s. for any sequence of disjoint sets (B_n)_{n≥1} ⊆ B_b with ∪_n B_n ∈ B_b.

Remark 5.2. The condition of infinite divisibility can be left out if condition (1) in Lemma 3.1 is fulfilled.

It is proved in Rajput and Rosinski (1989), Lemma 2.3, that the cumulant transform of a Lévy basis L can be written as

C{L(dx) ‡ t} = { it a(x) − (1/2) t² b(x) + ∫_ℝ (e^{itz} − 1 − itz·1_{[−1,1]}(z)) ρ(x, dz) } λ(dx)   (4)

where λ, called the control measure, is σ-finite, a is a Borel measurable mapping into the real numbers, b is a density wrt. λ of a measure, and ρ(x, ·) is a Lévy measure for given x. A Lévy basis such that a ≡ 0 and ρ ≡ 0 is called a purely Gaussian Lévy basis.
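As a concrete special case (our illustration, not part of the paper's development), a gamma Lévy basis on the real line with Lebesgue control measure has L(B) ∼ Gamma(shape = c·|B|, scale) for intervals B; values on disjoint sets are independent, and σ-additivity holds in distribution because gamma shape parameters add over disjoint sets. A minimal stdlib Python sketch (all names are ours):

```python
import random

def gamma_levy_basis(intervals, c=1.0, scale=2.0, rng=random):
    """Values of a gamma Levy basis on disjoint intervals of the real line.

    L(B) ~ Gamma(shape = c * |B|, scale); independence over disjoint sets is
    built in by drawing each value separately, and additivity holds in
    distribution because gamma shape parameters add over disjoint sets.
    """
    return [rng.gammavariate(c * (b - a), scale) for a, b in intervals]

random.seed(0)
n = 20000
# L([0,1]) directly versus L([0,1/2]) + L([1/2,1]): equal in distribution.
direct = [gamma_levy_basis([(0.0, 1.0)])[0] for _ in range(n)]
split = [sum(gamma_levy_basis([(0.0, 0.5), (0.5, 1.0)])) for _ in range(n)]
print(sum(direct) / n, sum(split) / n)  # both estimate E L([0,1]) = c*scale = 2
```

Since the gamma distribution is infinitely divisible and the construction has no Gaussian part, such a basis is also a Lévy random measure in the sense of Definition 5.4.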

Theorem 5.3. Let L denote a Lévy basis. L has the same distribution as the sum of a purely Gaussian Lévy basis and a completely random signed measure restricted to B_b (all terms being independent) if and only if for any B ∈ B_b:

∫_B ∫_{|z|≤1} |z| ρ(x, dz) λ(dx) < ∞.

Proof. From the proof of Lemma 4.6 we see the condition is necessary.

Assume the condition is fulfilled. Using the representation (4) of the cumulant transform of the Lévy basis we can make the following rearrangement:

C{L(dx) ‡ t} = { it( a(x) − ∫_ℝ z·1_{[−1,1]}(z) ρ(x, dz) ) − (1/2) t² b(x) + ∫_ℝ (e^{itz} − 1) ρ(x, dz) } λ(dx).   (5)

The non-Gaussian part of L has cumulant transform

{ it( a(x) − ∫_ℝ z·1_{[−1,1]}(z) ρ(x, dz) ) + ∫_ℝ (e^{itz} − 1) ρ(x, dz) } λ(dx).   (6)

Following the proof of Theorem 4.7 there is a completely random signed measure with cumulant transform (6).

Definition 5.4. Lévy random measures are Lévy bases with no Gaussian part.

Corollary 5.5. A Lévy random measure is a completely random signed measure if and only if for any B ∈ B_b:

∫_B ∫_{|z|≤1} |z| ρ(x, dz) λ(dx) < ∞.

References

Ash, R. B., 1972. Real Analysis and Probability. Probability and Mathematical Statistics. New York: Academic Press.

Barndorff-Nielsen, O., Schmiegel, J., 2004. Lévy-based spatial-temporal modelling, with applications to turbulence. Uspekhi Matematicheskikh Nauk 59 (1), 63–90.

Daley, D., Vere-Jones, D., 2003. An Introduction to the Theory of Point Processes, Vol. I. New York: Springer-Verlag.

Daley, D. J., Vere-Jones, D., 2008. An Introduction to the Theory of Point Processes, Vol. II. New York: Springer-Verlag.

Hellmund, G., Prokešová, M., Jensen, E.B.V., September 2008. Lévy based Cox point processes. Advances in Applied Probability 40 (3), 603–629.

Jónsdóttir, K., Schmiegel, J., Jensen, E., 2008. Lévy-based growth models. Bernoulli 14 (1), 62–90.

Kallenberg, O., 2002. Foundations of Modern Probability, second edition. Probability and Its Applications. New York: Springer-Verlag.

Karlin, S. J., Studden, W. J., 1966. Tchebycheff Systems: With Applications in Analysis and Statistics. Pure and Applied Mathematics. New York: Wiley.

Karr, A. F., 1991. Point Processes and Their Statistical Inference. Boca Raton: CRC Press.

Kingman, J. F. C., 1993. Poisson Processes. Oxford Studies in Probability. Oxford: Oxford University Press.

Rajput, B., Rosinski, J., 1989. Spectral representations of infinitely divisible processes. Probability Theory and Related Fields 82, 451–487.

Sato, K., 2005. Lévy Processes and Infinitely Divisible Distributions. Cambridge Studies in Advanced Mathematics. Cambridge: Cambridge University Press.

Tu, C., Clyde, M., Wolpert, R., 2006. Lévy adaptive regression kernels. Discussion Paper 2006-08, Duke University Department of Statistical Science. URL ftp://ftp.stat.duke.edu/pub/WorkingPapers/06-08.html

Strong mixing with a view toward spatio-temporal estimating functions

Gunnar Hellmund, Rasmus Plenge Waagepetersen

T.N. Thiele Centre for Applied Mathematics in Natural Science Department of Mathematical Sciences University of Aarhus, Denmark

Department of Mathematical Sciences University of Aalborg, Denmark

Abstract

We provide conditions for strong mixing of point processes on ℝ^d, d ≥ 1, driven by shot-noise random fields. We discuss how the results provided can be readily applied in estimation of spatio-temporal models.

Introduction

Mixing is often an essential prerequisite for establishing theoretical properties of inference procedures for spatial processes. In this paper we establish results regarding strong mixing of spatial point processes. We give particular attention to point processes driven by shot-noise random fields. For a particular class of estimating functions we discuss how the strong mixing results can be applied to establish consistency and asymptotic normality of the estimating function parameter estimates. This class of estimating functions contains e.g. estimating functions obtained from certain composite likelihood functions [Møller and Waagepetersen, 2007], which provide computationally easier alternatives to maximum likelihood estimation for spatial point processes. The discussion of mixing and estimating functions is focused on the spatial case; generalizations to spatio-temporal settings are however obvious. So far, inference for point processes driven by random fields has mainly been restricted to stationary or second-order re-weighted stationary processes [Diggle, 2002, Møller and Waagepetersen, 2007, Waagepetersen and Guan, 2008]. However, our mixing results are applicable in a wider context of e.g. shot-noise processes with an inhomogeneous cluster center process. Such processes are neither stationary nor second-order re-weighted stationary. We conclude the paper by considering a specific example of estimation in a non-second-order re-weighted stationary model, motivated by a point pattern dataset from the Yasuni tropical rain forest plot in Ecuador.

1 Point processes driven by random fields

A point process X driven by a random field on a subset of ℝ^d is defined in terms of a random intensity function Λ, such that given Λ = λ, X is a Poisson point process with intensity λ. Realizations of these point processes are most often clustered point patterns. The intensity m and the second-order product density m² for a point process with random intensity Λ are given by

m(x) = E[Λ(x)] and m²(x, y) = E[Λ(x)Λ(y)], x, y ∈ ℝ^d.

A sufficient condition for second-order re-weighted stationarity Baddeley et al. [2000] is that the pair correlation function g(x, y) = m²(x, y)/(m(x)m(y)) is translation invariant, i.e. g(x, y) only depends on x − y.
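The definition suggests a generic two-step simulation scheme: draw a realisation of Λ, then generate a Poisson process with that intensity, e.g. by thinning a dominating homogeneous Poisson process when Λ is bounded by some λ_max on the window. A stdlib Python sketch of ours (the constant-level random intensity used for the check is only a toy choice, the simplest possible Cox/mixed Poisson case):

```python
import math, random

def sample_poisson(mean, rng):
    # Knuth's multiplication method; adequate for moderate means
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_cox(intensity, lam_max, side, rng):
    """Simulate a point process driven by a random field on [0, side]^2.

    `intensity` is a realisation of Lambda bounded by lam_max; given this
    realisation we thin a dominating homogeneous Poisson process, which
    yields a Poisson process with intensity Lambda, as in the definition.
    """
    n = sample_poisson(lam_max * side * side, rng)
    proposals = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    return [p for p in proposals if rng.random() < intensity(*p) / lam_max]

rng = random.Random(1)
# toy random intensity: Lambda(x, y) = A with A ~ Uniform(0, 50)
A = rng.uniform(0.0, 50.0)
pts = simulate_cox(lambda x, y: A, lam_max=50.0, side=1.0, rng=rng)
print(len(pts))
```

Conditionally on the realised field the point count is Poisson with mean ∫_W Λ, which is what makes the intensity and product density formulas above straightforward to verify by Monte Carlo.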

1.1 Mixing for point processes

For a point process X and given W_0 ⊆ ℝ^d and τ > 0, define

W_τ = { x ∈ ℝ^d | ∀y ∈ W_0 : ‖x − y‖ ≥ τ }

and

F_0 = σ(X ∩ W_0), F_τ = σ(X ∩ W_τ).

The strong mixing coefficient between the σ-algebras F0 and Fτ is given as

α(τ) = sup { |P(A_τ ∩ A_0) − P(A_τ)P(A_0)| : A_τ ∈ F_τ, A_0 ∈ F_0 }.

The following theorem states that the mixing coefficient of a point process driven by a random field is bounded above by the mixing coefficient of the driving random field Λ.

Theorem 1.1. For a given point process, let Λ denote the underlying random intensity. Set

α_Λ(τ) = sup { |P(A_τ ∩ A_0) − P(A_τ)P(A_0)| : A_τ ∈ G_τ, A_0 ∈ G_0 },

where G_τ = σ(Λ(x), x ∈ W_τ) and G_0 = σ(Λ(x), x ∈ W_0). Then

α (τ) ≤ αΛ (τ) .

Proof. Let Aτ ∈ Fτ ,A0 ∈ F0 be given. Let Λ|Wτ and Λ|W0 denote the restrictions of Λ to Wτ and W0. Then because X given Λ is a Poisson point process:

|P (Aτ ∩ A0) − P (Aτ ) P (A0)| = |E [P (Aτ ∩ A0|Λ)] − E [P (Aτ |Λ)] E [P (A0|Λ)]|

= |E[P(A_τ|Λ) P(A_0|Λ)] − E[P(A_τ|Λ)] E[P(A_0|Λ)]| ≤ α_Λ(τ),

using Doukhan [1994, (1'), p. 3], since P(A_τ|Λ) ≤ 1 is measurable wrt. G_τ, and P(A_0|Λ) ≤ 1 is measurable wrt. G_0.

Frequently used random fields are log-Gaussian random fields and shot-noise random fields. Recently the literature on point processes with random intensity has focused on second-order reweighted stationary point processes with random intensity of the form Λ(x) = exp(β · z(x))Ψ(x), where z(x) denotes covariate information, β is a vector of parameters and Ψ is a stationary random field. In the case of log-Gaussian random fields, popular choices of log(Ψ) are Gaussian random fields with exponential or Gaussian covariance function. In the case of the inhomogeneous Thomas point process [Waagepetersen, 2007]

Ψ(x) = Σ_{c∈Φ} (1/(2πσ²)) exp(−|x − c|²/(2σ²)),

where Φ is a homogeneous Poisson point process. The random field Ψ for an inhomogeneous Thomas point process has Gaussian covariance function. According to [Belyaev, 1959], random fields with Gaussian covariance function have smooth realizations and are not strong mixing: Using a simple power-series expansion it can easily be seen that the σ-algebra generated by the restriction of the random field to any open ball is identical to the σ-algebra generated by the field restricted to any set containing an open ball. However, as shown in the sequel, in the case of point processes driven by shot-noise random fields an analytic random field does not rule out strong mixing of the resulting point process.
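For intuition (our sketch, not part of the paper), the Thomas field Ψ can be evaluated directly from a simulated centre process Φ. Since each Gaussian kernel integrates to one, E[Ψ(x)] equals the intensity α of Φ; the sketch checks this by Monte Carlo, simulating centres on an enlarged window so that edge effects at the evaluation point are negligible:

```python
import math, random

def sample_poisson(mean, rng):
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def psi_thomas(x, centers, sigma):
    """Shot-noise field of the planar Thomas process: a sum of Gaussian
    kernels (each integrating to one) placed at the centre points."""
    s2 = sigma * sigma
    return sum(
        math.exp(-((x[0] - cx) ** 2 + (x[1] - cy) ** 2) / (2.0 * s2)) / (2.0 * math.pi * s2)
        for cx, cy in centers
    )

rng = random.Random(2)
alpha, sigma, half = 10.0, 0.05, 1.0  # centre intensity, kernel sd, window half-width
vals = []
for _ in range(4000):
    n = sample_poisson(alpha * (2.0 * half) ** 2, rng)
    centers = [(rng.uniform(-half, half), rng.uniform(-half, half)) for _ in range(n)]
    vals.append(psi_thomas((0.0, 0.0), centers, sigma))
print(sum(vals) / len(vals))  # Monte Carlo estimate of E[Psi(0)] = alpha
```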

1.2 The Whittle-Matérn class

Guttorp and Gneiting [2006] and Stein [1999] discuss the widely used and very flexible Whittle-Matérn class of isotropic covariance functions for random fields on ℝ^d, d ≥ 1. An often used parametrization of this family is

c(r) = σ / (2^{ν−1} Γ(ν)) · (2ν^{1/2} r/ρ)^ν K_ν(2ν^{1/2} r/ρ); ρ, σ, ν > 0,

where K_ν denotes the modified Bessel function of the second kind of order ν. For fixed ρ and σ the limiting covariance function of this family, as ν goes to infinity, is a Gaussian covariance function of the form

c(r) = σ exp(−r²/ρ²).

For ν = 1/2 we obtain the exponential covariance function. In Matérn [1960] it is noted (p. 17) that the Whittle-Matérn covariance function with ν ≥ 2 can be seen as the correct generalization to d dimensions of type III distributions. Furthermore, using the results in Matérn [1960] we see that point processes on ℝ^d driven by shot-noise random fields of the form

Ψ(x) = ϕγ Σ_{c∈Φ} (κ|x − c|)^ν K_ν(κ|x − c|); ϕ, ν, κ > 0,

where γ^{−1} = 2^{ν+d} π^{d/2} κ^{−1} Γ(ν + d/2) and Φ is a homogeneous Poisson point process, have covariance functions belonging to the Whittle-Matérn family. The intensity function of these processes is equal to ϕ·α, where α denotes the intensity of Φ. Furthermore, for r = |x − y|,

Cov(Ψ(x), Ψ(y)) / (ϕ²γ²α) = (2^{1−2ν−d/2} √π) / (Γ(2ν + (d+1)/2) κ^{2ν+d/2}) · r^{2ν+d/2} K_{2ν+d/2}(κr).

Since K_ν is exponentially decaying, it is a consequence of Theorem 1.2 below that this flexible class of shot-noise random fields can be used to generate point processes that are strong mixing.
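For half-integer ν the Bessel function K_ν has elementary closed forms, which gives a quick sanity check of the parametrization above: using the standard identity K_{1/2}(z) = √(π/(2z)) e^{−z}, the case ν = 1/2 reduces algebraically to the exponential covariance σ exp(−√2 r/ρ). The stdlib sketch below (our check; the function name is ours and it deliberately covers only ν = 1/2) verifies this numerically:

```python
import math

def matern_cov(r, sigma, rho, nu):
    """Whittle-Matern covariance in the parametrization of the text,
    c(r) = sigma / (2**(nu-1) * Gamma(nu)) * z**nu * K_nu(z),  z = 2*sqrt(nu)*r/rho,
    implemented for nu = 1/2 only via K_{1/2}(z) = sqrt(pi/(2z)) * exp(-z)."""
    if nu != 0.5:
        raise NotImplementedError("sketch covers nu = 1/2 only")
    z = 2.0 * math.sqrt(nu) * r / rho
    k_half = math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)
    return sigma / (2.0 ** (nu - 1.0) * math.gamma(nu)) * z ** nu * k_half

sigma, rho = 1.3, 0.7
for r in (0.1, 0.5, 1.0):
    exponential = sigma * math.exp(-math.sqrt(2.0) * r / rho)
    assert abs(matern_cov(r, sigma, rho, 0.5) - exponential) < 1e-12
print("nu = 1/2 reduces to the exponential covariance")
```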

1.3 Cluster processes

In this section we consider cluster point processes X with cluster centers generated from a Poisson process. Specifically we assume X is a point process with random intensity function

x ↦ Σ_{(ξ,ζ)∈Φ} k(ξ, ζ, x),   (1.1)

where k is a positive function and Φ is a Poisson point process on ℝ^d × ℝ^l with intensity function χ : ℝ^d × ℝ^l → ℝ_+, l ≥ 1. In order to have a well-defined point process it is necessary that the mean intensity function exists and is locally integrable, i.e. for all x ∈ ℝ^d

∫_{ℝ^d} ∫_{ℝ^l} k(ξ, ζ, x) χ(ξ, ζ) dζ dξ < ∞,

and for all bounded Borel subsets B ⊂ ℝ^d

∫_B ∫_{ℝ^d} ∫_{ℝ^l} k(ξ, ζ, x) χ(ξ, ζ) dζ dξ dx < ∞.

For introductions to cluster processes see [Daley and Vere-Jones, 2003, 2008, Møller, 2003, Møller and Torrisi, 2005]. The class of shot-noise Cox processes Møller [2003] is obtained with l = 1 and k(ξ, ζ, x) of the form ζ f(ξ, x). The generalized shot-noise Cox processes in Møller and Torrisi [2005] also have random intensity functions of the form (1.1), with l = 2 and k(ξ, ζ, x) = ζ₁ f(ξ/ζ₂, x/ζ₂)/ζ₂^d for some kernel f, but Φ is not restricted to be Poisson. Recall the notation for mixing coefficients and define

W_{c,τ} = { x ∈ ℝ^d | ∀y ∈ W_0 : ‖x − y‖ > τ/2 }

and

W_{c,0} = ℝ^d \ W_{c,τ}.

The following theorem generalizes a result in Waagepetersen and Guan [2008]. A proof is given in the Appendix.

Theorem 1.2.

α(τ) ≤ 4 ∫_{W_0} ∫_{W_{c,τ}×ℝ^l} k(ξ, ζ, x) χ(ξ, ζ) d(ζ, ξ) dx + 4 ∫_{W_τ} ∫_{W_{c,0}×ℝ^l} k(ξ, ζ, x) χ(ξ, ζ) d(ζ, ξ) dx.

In particular, if h(z) is O(e^{−γ·z}) for some positive constant γ and

∫_{ℝ^l} k(ξ, ζ, x) χ(ξ, ζ) dζ ≤ h(‖ξ − x‖),

then α(τ) is O(e^{−ν·τ}) for some positive constant ν if W_0 is bounded.

The condition in the theorem trivially holds if k has compact support. Another example is a shot-noise Cox process with k(ξ, ζ, x) = ζ · f(x) · f_k(‖x − ξ‖), where f is bounded, f_k is a Laplace or Gaussian kernel and ∫ ζ · χ(ξ, ζ) dζ < ∞. Certain strong mixing results [Guyon, 1995] only require that α(τ) ∈ O(τ^{−p}).
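The cluster form (1.1) also suggests a direct simulation scheme that never evaluates Λ: conditionally on Φ, the offspring of different parents form independent Poisson processes with intensities k(ξ, ζ, ·). A stdlib sketch of ours for a shot-noise Cox process with l = 1 and the hypothetical, compactly supported kernel k(ξ, ζ, x) = ζ·1{‖x − ξ‖ ≤ R}/(πR²), for which the mixing condition above holds trivially:

```python
import math, random

def sample_poisson(mean, rng):
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def uniform_disk(center, R, rng):
    r = R * math.sqrt(rng.random())
    t = rng.uniform(0.0, 2.0 * math.pi)
    return (center[0] + r * math.cos(t), center[1] + r * math.sin(t))

def shot_noise_cox(chi, zeta_mean, R, half, rng):
    """Cluster process with random intensity (1.1), l = 1 and the kernel
    k(xi, zeta, x) = zeta * 1{|x - xi| <= R} / (pi R^2).

    Since the kernel integrates to zeta over x, each parent xi contributes
    an independent Poisson(zeta) number of offspring, uniform in its disk.
    """
    points = []
    for _ in range(sample_poisson(chi * (2.0 * half) ** 2, rng)):
        xi = (rng.uniform(-half, half), rng.uniform(-half, half))
        zeta = rng.expovariate(1.0 / zeta_mean)  # exponential mark
        for _ in range(sample_poisson(zeta, rng)):
            points.append(uniform_disk(xi, R, rng))
    return points

rng = random.Random(3)
pts = shot_noise_cox(chi=5.0, zeta_mean=4.0, R=0.1, half=1.0, rng=rng)
print(len(pts))  # mean count over the window is chi * zeta_mean * (2*half)^2 = 80
```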

2 Estimating functions

Maximum likelihood estimation is in general difficult for point processes with random intensity due to intractable likelihood functions. Instead parameter estimates are often obtained using minimum contrast estimation based on the K-function, which is well-defined in the stationary or second-order re-weighted stationary case [see Diggle, 2002, Møller and Waagepetersen, 2007, 2004, Waagepetersen and Guan, 2008]. In the case of cluster processes which are not second-order re-weighted stationary, minimum contrast estimation based on the K-function is not possible, and instead one may use certain composite likelihood functions as discussed below. Furthermore we discuss how strong mixing results become useful for establishing asymptotic results for a general class of estimating functions, including those obtained by differentiating log composite likelihood functions.

2.1 Composite likelihoods

We assume the distribution P_X of X is defined in terms of a vector of parameters θ ∈ Θ ⊆ ℝ^q for some q ≥ 1. The intensity m and second-order product density m² are related to probabilities of occurrence of points or pairs of points, and this leads to composite likelihoods of the form

ℓ(θ) = (1/|W|) ( Σ_{x∈X∩W} ln m¹(x) − ∫_W m¹(x) dx )   (2.1)

[Schoenberg, 2005, Waagepetersen, 2007] and

ℓ_pairwise(θ) = (1/|W|) ( Σ≠_{x,y∈X∩W: ‖x−y‖<r} ln m²(x, y) − ( Σ≠_{x,y∈X∩W: ‖x−y‖<r} 1 ) ln ∫_{W×W: ‖x−y‖<r} m²(x, y) d(x, y) )   (2.2)

[Guan, 2006]. Guan [2006] considered the stationary case, but (2.2) may be applied in a much wider context of non-stationary point processes. In (2.2), pairs of points with large inter-point distance tend to add more noise than information, and that is the reason for using only pairs of points whose inter-point distance is less than the user-specified parameter r. In Waagepetersen [2007] the Bernoulli composite log likelihood was introduced,

ℓ_Bernoulli(θ) = (1/|W|) ( Σ≠_{x,y∈X∩W: ‖x−y‖<r} ln m²(x, y) − ∫_{W×W: ‖x−y‖<r} m²(x, y) d(x, y) ).   (2.3)

Each of the composite log likelihoods (2.1)–(2.3), based on a partition of ℝ^d into cells S_p, p ∈ ℤ^d, may be written as a sum

(1/a) Σ_p ℓ_p(θ),   (2.4)

where ℓ_p(θ) only depends on X through X ∩ C_p for a bounded neighborhood C_p of S_p (see Figure 2.1). The scaling factor a is often the size |W| of W or the number of points in W.
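As a small illustration of (2.1) (our sketch, not from the paper), consider the homogeneous toy model m¹(x) ≡ θ on W = [0, 1]², for which ∫_W m¹ = θ|W| and the maximiser of (2.1) is the familiar n/|W|. The sketch recovers this by grid search (for clarity, not efficiency):

```python
import math

def composite_ll(points, intensity, integral_W, area_W):
    """First-order composite log likelihood (2.1):
    (1/|W|) * ( sum over observed points of ln m1(x) - integral of m1 over W )."""
    return (sum(math.log(intensity(x)) for x in points) - integral_W) / area_W

# homogeneous toy model m1(x) = theta on W = [0,1]^2, so the integral is theta*|W|
points = [(0.2, 0.3), (0.7, 0.1), (0.4, 0.9), (0.8, 0.6), (0.1, 0.5)]
area = 1.0
grid = [k / 10.0 for k in range(1, 200)]
best = max((composite_ll(points, lambda x, th=th: th, th * area, area), th) for th in grid)
print(best[1])  # maximiser of (2.1) here is n / |W| = 5.0
```

For a non-trivial intensity model the integral over W would be computed numerically, and the grid search replaced by a proper optimiser.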

Figure 2.1: In a partition of ℝ^d into squares, p marks the center of a cell S_p and C_p marks a neighborhood.

Regarding (2.3), for example,

ℓ_p(θ) = Σ≠_{x∈X∩S_p, y∈X: ‖x−y‖<r} ln m²(x, y) − ∫_{(S_p∩W)×W: ‖x−y‖<r} m²(x, y) d(x, y)

and

C_p = { y ∈ ℝ^d | ∃x ∈ S_p : ‖x − y‖ ≤ r }.

2.2 Mixing and asymptotic results

In this section we present a set of theorems useful for establishing asymptotic properties of estimates obtained e.g. from composite likelihoods as discussed in the previous section. Consider an increasing sequence of observation windows W_n and a sequence of estimating functions u(n, θ) of the form

u(n, θ) = (1/a_n) Σ_p u_p(θ)   (2.5)

using the notation of the previous section, where u_p(θ) only depends on X through X ∩ C_p and C_p is a bounded neighborhood of S_p. The estimating function u(n, θ) might for instance be given by the gradient of one of the composite likelihoods defined in the previous section. An estimate θ̂_n is obtained by solving u(n, θ) = 0. The primary application of strong mixing is to obtain a central limit theorem for a suitably scaled version of (2.5). Define the mixing coefficient

α_{2,∞}(m) = sup_{A,B} { α( σ(u_p(θ), p ∈ A), σ(u_p(θ), p ∈ B) ) },

where A, B ⊆ ℤ^d, |A| ≤ 2, dist(A, B) ≥ m,

α( σ(u_p(θ), p ∈ A), σ(u_p(θ), p ∈ B) ) = sup_{M_A,M_B} |P(M_A ∩ M_B) − P(M_A)P(M_B)|

and M_A ∈ σ(u_p(θ), p ∈ A), M_B ∈ σ(u_p(θ), p ∈ B). Assume there exists δ > 0 such that for all θ ∈ Θ:

sup_p E[ ‖u_p(θ)‖^{2+δ} ] < ∞   (2.6)

and

Σ_{m≥1} m^{d−1} α_{2,∞}(m)^{δ/(2+δ)} < ∞.   (2.7)

It then follows from Theorem 3.3.1 in Guyon [1995] that the variance of a_n^{1/2} u(n, θ) is bounded and that a_n^{1/2} u(n, θ) is asymptotically normal, provided the variance matrix of a_n^{1/2} u(n, θ) stays positive definite as n tends to infinity.

In applications u_p(θ) is a measurable mapping of X|_{C_p}, thus

α( σ(u_p, p ∈ A), σ(u_p, p ∈ B) ) ≤ α( σ(X|_{∪C_p}, p ∈ A), σ(X|_{∪C_p}, p ∈ B) ),

where

α( σ(X|_{∪C_p}, p ∈ A), σ(X|_{∪C_p}, p ∈ B) ) = sup_{M_A^X,M_B^X} | P(M_A^X ∩ M_B^X) − P(M_A^X) P(M_B^X) |

and M_A^X ∈ σ(X|_{∪C_p}, p ∈ A), M_B^X ∈ σ(X|_{∪C_p}, p ∈ B). Mixing results for the process X are therefore crucial for establishing mixing of u_p(θ), p ∈ ℤ^d.

The following theorem is an example of how mixing of u_p(θ), p ∈ ℤ^d may be used to establish consistency of roots θ̂_n of u(n, ·). Define

R = { θ ∈ Θ | lim_{n→∞} ‖E_{θ₀}[u(n, θ)]‖ = 0 }

and

R^ε = { θ ∈ Θ | ∃θ_R ∈ R : ‖θ − θ_R‖ < ε },

where ‖·‖ is the Euclidean norm. We will assume R = {θ₀}, i.e. asymptotically there is at most one root.

Theorem 2.1. Suppose u(n, θ) is continuous in θ, Θ is compact and a_n ↑ ∞ as n ↑ ∞. Given (1)–(3) below, θ̂_n is consistent, i.e. in probability

lim_{n→∞} θ̂_n = θ₀.

1. ∀ε > 0 ∃c > 0, N ≥ 1 ∀n ≥ N : inf_{Θ\R^ε} ‖E_{θ₀}[u(n, θ)]‖ ≥ c;

2. the conditions (2.6) and (2.7) are satisfied;

3. ∃h with h(x) ↓ 0 as x ↓ 0 and ∃B_n = O_P(1) :

‖u(n, θ) − u(n, θ′)‖ ≤ B_n · h(‖θ − θ′‖), P-a.s.,

where B_n = O_P(1) means that (B_n) is a sequence of stochastic variables not depending on θ and bounded in probability, i.e.

∀ε > 0 ∃δ > 0 : lim sup_{n→∞} P(B_n ≥ δ) ≤ ε.

Proof. By Theorem 3.1 in Crowder [1986] it suffices to prove that

sup_{Θ\R^ε} ‖u(n, θ) − E_{θ₀}[u(n, θ)]‖ →_{n→∞} 0.   (2.8)

Note that

Var(u(n, θ)_i) = Var( (1/a_n) Σ_{p∈W_n} u_p(θ)_i ) = (1/a_n²) ( Σ_p σ_p² + Σ≠_{p₁,p₂} σ_{p₁p₂} ),

where i refers to the i'th estimating function and

σ_p² = Var(u_p(θ)_i), σ_{p₁p₂} = Cov(u_{p₁}(θ)_i, u_{p₂}(θ)_i).

Using condition (2), the first part of Theorem 3.3.1 in [Guyon, 1995] and the accompanying Remark (1), we obtain for all θ

sup_{n≥1} a_n Var(u(n, θ)_i) < ∞.

Since a_n ↑ ∞, this gives Var(u(n, θ)_i) → 0; thus in L² and therefore in probability

‖u(n, θ) − E_{θ₀}[u(n, θ)]‖ → 0, n → ∞.   (2.9)

We therefore obtain (2.8) from condition (3), since this condition, by Theorem 21.10 (i) in Davidson [1994], ensures stochastic equicontinuity, which combined with (2.9) and Theorem 21.9 in [Davidson, 1994] provides uniform convergence.

Theorem 3.3 in [Crowder, 1986] provides conditions under which

V_n(θ₀)^{−1/2} M_n(θ₀) (θ̂_n − θ₀)

has the same asymptotic distribution as

−V_n(θ₀)^{−1/2} u(n, θ₀)   (2.10)

where

V_n(θ) = Var_{θ₀}(u(n, θ)), M_n(θ) = E_{θ₀}[ ∂/∂θ^T u(n, θ) ].

Asymptotic normality of θ̂_n then follows from asymptotic normality of u(n, θ). Asymptotic existence of θ̂_n is ensured by the following condition:

∀ε > 0 ∃δ > 0, N ≥ 1 ∀n ≥ N : inf_{‖θ−θ₀‖=ε} (θ − θ₀)^T E_{θ₀}[u(n, θ)] > δ

[Theorem 3.2, Crowder, 1986]. A related route to existence of a consistent sequence of roots is given by Theorem 2 in [Waagepetersen and Guan, 2008]. This approach also relies on a central limit theorem for a_n^{1/2} u(n, θ).

2.3 Estimation in non-stationary cluster point processes

In this example we provide some details for estimation in shot-noise Cox point processes X with random intensity of the form

Λ(x) = Σ_{c∈Φ} k(‖c − x‖),

where k is an Epanechnikov type kernel, i.e.

k(‖c − x‖) = (2δγ/π)(1 − γ‖c − x‖²) if ‖c − x‖ < 1/√γ, and 0 otherwise,

where δ, γ > 0 and Φ is a Poisson point process on ℝ² with intensity function

λ(c) = exp(β0 + β · z(c)),

β₀ ∈ ℝ, β ∈ ℝ^l and z(c) is a set of l covariate fields. The intensity function is given as

m¹(x) = (2δγ/π) ∫_{‖c−x‖<1/√γ} e^{β₀+β·z(c)} (1 − γ‖c − x‖²) dc; x ∈ ℝ².

This cluster point process is neither stationary nor second-order re-weighted stationary. Strong mixing of the point process is trivial since the kernel has bounded support, thus using a first order Bernoulli composite log likelihood (2.1) we can estimate the parameters β, γ and the quantity δ · exp(β₀). We assume an enlarged observation window W contains covariate information.
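A sketch of ours for simulating this model in the simplest setting z ≡ 0, so that parents have constant intensity exp(β₀): the Epanechnikov kernel above integrates to δ over ℝ², so each parent receives a Poisson(δ) number of offspring, drawn from the normalised kernel density by rejection sampling (parameter values below are arbitrary toy choices):

```python
import math, random

def sample_poisson(mean, rng):
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def offspring(center, gamma, rng):
    """One draw from the normalised Epanechnikov kernel k/delta, i.e. the
    density (2*gamma/pi)*(1 - gamma*r^2) on r < 1/sqrt(gamma), by rejection
    from the uniform distribution on the bounding square."""
    R = 1.0 / math.sqrt(gamma)
    while True:
        dx, dy = rng.uniform(-R, R), rng.uniform(-R, R)
        r2 = dx * dx + dy * dy
        if r2 < R * R and rng.random() < 1.0 - gamma * r2:
            return (center[0] + dx, center[1] + dy)

def epanechnikov_cox(beta0, delta, gamma, half, rng):
    """Shot-noise Cox process of this section with z = 0: parents are Poisson
    with intensity exp(beta0); since the kernel integrates to delta, each
    parent gets a Poisson(delta) number of offspring around it."""
    points = []
    for _ in range(sample_poisson(math.exp(beta0) * (2.0 * half) ** 2, rng)):
        c = (rng.uniform(-half, half), rng.uniform(-half, half))
        for _ in range(sample_poisson(delta, rng)):
            points.append(offspring(c, gamma, rng))
    return points

rng = random.Random(4)
pts = epanechnikov_cox(beta0=3.0, delta=3.0, gamma=100.0, half=1.0, rng=rng)
print(len(pts))  # mean count is exp(beta0) * (2*half)^2 * delta, about 241
```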

2.3.1 Data

As a brief practical example we will look at positions of trees of the species Rinorea lindeniana in the Yasuni tropical forest plot in the Ecuadorian rain forest; data are kindly provided by professor Henrik Balslev, Department of Biology, University of Aarhus, Denmark. The distribution of species in the 500m × 500m plot in relation to habitat information was discussed in Valencia et al. [2004]. Specifically the observation window was divided into six different habitats as shown in figure 2.2. We will use an indicator of each of the six different habitats as covariate information. Furthermore, since we need covariate information in an enlarged observation window, we will only look at trees within the [60, 440] × [60, 440] sub window - see the right hand side of figure 2.2. In this specific case β becomes a four-dimensional vector - the sixth habitat is ignored, since it has only a minor effect due to its small size and location outside W, and the fifth habitat is used as baseline. The estimated values were: β = (−3.98, −0.35, −0.05, −1.87), γ = exp(−6.27) and δ · exp(β₀) = exp(−3.53).

Figure 2.2: Left: Habitats in the Yasuni tropical forest plot. Right: Positions of Rinorea lindeniana in the [60, 440] × [60, 440] sub window.
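The fitting step above can be sketched as follows: partition the window into small square cells, record for each cell whether it contains a point of X, and maximise the resulting sum of Bernoulli log-likelihood terms over the identifiable parameters (β, γ, δ · exp(β0)). The snippet below is a generic version of a first order Bernoulli composite log likelihood; the precise definition (2.1) used in the paper may differ in detail, and the function and argument names are ours.

```python
import numpy as np

def bernoulli_composite_loglik(points, m, cell_centres, cell_side):
    """Generic first-order Bernoulli composite log-likelihood.  Each square
    cell C_i (centre x_i, side length cell_side) contributes a Bernoulli term
    for presence/absence of points of X, with presence probability
    p_i = 1 - exp(-m(x_i) * |C_i|), where m is the intensity function.
    Sketch only; the exact definition in the paper may differ."""
    half = cell_side / 2.0
    area = cell_side ** 2
    p = 1.0 - np.exp(-np.array([m(x) for x in cell_centres]) * area)
    # Presence indicator: does any observed point fall in the cell?
    occupied = np.array([
        np.any((np.abs(points[:, 0] - x[0]) < half) &
               (np.abs(points[:, 1] - x[1]) < half))
        for x in cell_centres])
    eps = 1e-12  # guard against log(0)
    return np.sum(np.where(occupied, np.log(p + eps), np.log1p(-p + eps)))
```

To obtain estimates, one would maximise this function numerically (e.g. by applying a derivative-free optimiser to its negative) over the parameters entering the intensity m.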

2.3.2 Simulation

We simulated 100 point patterns of the above type using habitat covariate information from the Yasuni plot and parameters β = (−3.98, −0.35, −0.05, −1.87), γ = exp(−6.27), δ = exp(6.5 − 3.53) and β0 = −6.5. See figure 2.3 for plots of four simulated point patterns. For each point pattern we estimated the parameters β and γ and the quantity δ · exp(β0). As expected, the estimates are well-behaved (see figure 2.4). The empirical mean values obtained from these simulations were β = (−4.04, −0.39, −0.08, −1.78), ln(γ) = −5.48 and ln(δ · exp(β0)) = −3.54. Although the estimates of γ seem to be significantly biased, they are within a reasonable distance of the true value.

Figure 2.3: Simulated point patterns, using parameters β = (−3.98, −0.35, −0.05, −1.87), γ = exp(−6.27), δ = exp(6.5 − 3.53) and β0 = −6.5.

Figure 2.4: QQ-plots of parameter estimates obtained using first order Bernoulli composite likelihood. From top left: β, the four-dimensional covariate parameter vector; ln(γ); ln(δ · exp(β0)).

Discussion

We have derived useful conditions for strong mixing of cluster point processes. A very important application of the mixing results was given in section 2, through a discussion of asymptotic results for estimating functions for spatial data. The results in section 2 may be further generalized and adapted to other dependence structures using Guyon [1995] and Davidson [1994].

The asymptotic results found above give a theoretical justification for method-of-moments and composite likelihood estimation methods in (non-stationary) point processes driven by random fields. The shot-noise random fields introduced in section 1.2 may provide a flexible class for use in descriptive statistics based on observed point patterns and covariate information, while models that are not second-order re-weighted stationary may occur when trying to describe the dynamics behind some clustered point patterns.

Shot-noise random fields are fairly simple and very fast to simulate. Fast estimation procedures and fast simulation routines provide evident opportunities for simulation-based inference. Subsampling methods as described in Heagerty and Lumley [2000] can easily be adopted.

Throughout our discussion we focused on spatial estimating functions in order to provide a clear presentation. However, the setup can obviously be generalized to spatio-temporal settings if time is interpreted as one of the spatial dimensions.

Appendix

Proof of theorem 1.2: It is well known that X can be seen as a union of independent Poisson point processes, X = ∪_{(ξ,ζ)∈Φ} X_{(ξ,ζ)}, where X_{(ξ,ζ)} has intensity function k(ξ, ζ, ·). Let X_τ and X_0 denote Poisson cluster processes such that

X_τ = ∪_{(ξ,ζ)∈Φ|_{W_{c,τ}×R^l}} X_{(ξ,ζ)}

and

X_0 = ∪_{(ξ,ζ)∈Φ|_{W_{c,0}×R^l}} X_{(ξ,ζ)}.

Then X = X_τ ∪ X_0, and furthermore X_τ and X_0 are independent. Let G_τ and G_0 denote elements in the sigma-algebra N_lf such that for given A_τ ∈ F_τ and A_0 ∈ F_0

A_τ = {X ∩ W_τ ∈ G_τ} and A_0 = {X ∩ W_0 ∈ G_0},

where

N_lf = σ({M ⊂ R^d | ∀B ∈ B_b(R^d) ∃m ≥ 0 : #(M ∩ B) = m})

and B_b(R^d) are the bounded Borel subsets of R^d. Define

A*_τ = {X_τ ∩ W_τ ∈ G_τ},  A*_0 = {X_0 ∩ W_0 ∈ G_0}

and

B_τ = {X_τ ∩ W_0 = ∅},  B_0 = {X_0 ∩ W_τ = ∅}.

Because of the independence between X_τ and X_0 we see that

P(A*_τ ∩ A*_0 ∩ B_τ ∩ B_0) = P(A*_τ ∩ B_τ) P(A*_0 ∩ B_0),
P(A*_τ ∩ B_τ ∩ B_0) = P(A*_τ ∩ B_τ) P(B_0),
P(A*_0 ∩ B_0 ∩ B_τ) = P(A*_0 ∩ B_0) P(B_τ).

Thus

|P(A_τ ∩ A_0) − P(A_τ) P(A_0)|
  = |P(A*_τ ∩ B_τ) P(A*_0 ∩ B_0) + P(A_τ ∩ A_0 ∩ (B_τ^C ∪ B_0^C)) − P(A_τ) P(A_0)|
  ≤ P(B_τ^C) + P(B_0^C) + P(A*_τ ∩ B_τ) P(A*_0 ∩ B_0)(1 − P(B_0) P(B_τ))
    + P(A_τ ∩ (B_τ^C ∪ B_0^C)) + P(A_0 ∩ (B_τ^C ∪ B_0^C))
  ≤ 4 P(B_τ^C) + 4 P(B_0^C),

since

1 − P(B_0) P(B_τ) = P(B_0^C ∪ B_τ^C).

Using Markov's inequality,

P(B_τ^C) = P(#(X_τ ∩ W_0) ≥ 1) ≤ E[#(X_τ ∩ W_0)] = ∫_{W_{c,τ}×R^l} ∫_{W_0} k(ξ, ζ, x) χ(ξ, ζ) d(ζ, ξ, x)

and

P(B_0^C) ≤ ∫_{W_{c,0}×R^l} ∫_{W_τ} k(ξ, ζ, x) χ(ξ, ζ) d(ζ, ξ, x).

References

A. J. Baddeley, J. Møller, and R. Waagepetersen. Non- and semi-parametric estimation of interaction in inhomogeneous point patterns. Statistica Neerlandica, 54:329–350, 2000.

Yu. K. Belyaev. Analytic random processes. Theory of Probability and its Applica- tions, 4(4):402–409, 1959.

M. Crowder. On consistency and inconsistency of estimating equations. Econometric Theory, 2(3):305–330, 1986.

D.J. Daley and D. Vere-Jones. An Introduction to the Theory of Point Processes Vol I+II. Springer (New York), 2003, 2008.

J. Davidson. Stochastic limit theory. Oxford University Press (Oxford), 1994.

P.J. Diggle. Statistical Analysis of Spatial Point Patterns. Hodder Arnold (London), 2002.

14 P. Doukhan. Mixing: Properties and examples. Springer-Verlag (New York), 1994.

Y. T. Guan. A composite likelihood approach in fitting spatial point process models. Journal of the American Statistical Association, 101(476):1502–1512, 2006.

P. Guttorp and T. Gneiting. Studies in the history of probability and statistics XLIX: On the Matérn correlation family. Biometrika, 93(4):989–995, 2006.

X. Guyon. Random Fields on a Network: Modeling, Statistics, and Applications. Springer-Verlag (New York), 1995.

P. J. Heagerty and T. Lumley. Window subsampling of estimating functions with application to regression models. Journal of the American Statistical Association, 95:197–211, 2000.

B. Matérn. Spatial Variation. Statens Skogsforskningsinstitut (Stockholm), 1960.

J. Møller. Shot noise Cox processes. Advances in Applied Probability, 35(3):614–640, 2003.

J. Møller and G. L. Torrisi. Generalised shot noise Cox processes. Advances in Applied Probability, 37(1):48–74, 2005.

J. Møller and R. P. Waagepetersen. Modern statistics for spatial point processes. Scandinavian Journal of Statistics, 34(4):643–684, 2007.

J. Møller and R. P. Waagepetersen. Statistical Inference and Simulation for Spatial Point Processes. Chapman & Hall/CRC (Boca Raton), 2004.

F. P. Schoenberg. Consistent parametric estimation of the intensity of a spatial-temporal point process. Journal of Statistical Planning and Inference, 128(1):79–93, 2005.

M.L. Stein. Interpolation of Spatial Data: Some Theory for Kriging. Springer-Verlag (New York), 1999.

R. Valencia, R. B. Foster, G. Villa, R. Condit, J. C. Svenning, C. Hernandez, K. Romoleroux, E. Losos, E. Magard, and H. Balslev. Tree species distributions and local habitat variation in the Amazon: large forest plot in eastern Ecuador. Journal of Ecology, 92(2):214–229, 2004.

R. Waagepetersen and Y. Guan. Two-step estimation for inhomogeneous spatial point processes. Submitted, 2008.

R. P. Waagepetersen. An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics, 63(1):252–258, 2007.
