arXiv:1205.3918v1 [stat.ME] 17 May 2012 oacaso eiuldiagnostics. generalized residual further of class and a residuals, to as viewed Be- pseudo- be and from can score derived The pseudo-likelihood. test sag’s generalized “pseudo-score” is a likelihood, process to point statistic, the test analysis on score based the The . in pattern point validation spatial of model informal and ence K92 abr ,Dnake-mail: Denmark Ø, 7G, Aalborg Vej DK-9220 Bajers Fredrik University, Mathematical Aalborg of Sciences, Department , of Professor Møller Jesper and Rubak Ege Baddeley, Adrian Point Models Spatial Process for Residual Diagnostics and Pseudo-Score Score, nvriy rdi aesVj7,D-20AlogØ, Aalborg e-mail: DK-9220 Denmark 7G, Vej Bajers Aalborg Fredrik Sciences, University, Mathematical of Department Scholar, 01 o.2,N.4 613–646 4, DOI: No. 26, Vol. 2011, Science Statistical [email protected] nvriyo etr utai e-mail: Australia Professor, Western Adjunct of and University Australia 5, 6913, Bag WA Private Wembley Statistics, and Informatics Mathematics,

dinBdee sRsac cets,CSIRO Scientist, Research is Baddeley Adrian c ttsia Science article Statistical original the the of by reprint published electronic an is This ern iesfo h rgnli aiainand pagination in detail. original typographic the from differs reprint hspprdvlp e ol o omlinfer- formal for tools new develops paper This nttt fMteaia Statistics Mathematical of Institute 10.1214/11-STS367 .INTRODUCTION 1. [email protected] nttt fMteaia Statistics Mathematical of Institute , ieo sn ucinlsmaysaitc,sc sRipley as esta such statistics, the summary to functional po using support on of theoretical tice based lend da diagnostics f results of derived pattern The class test residuals. point a “pseudo-score” to spatial a and to of pseudo-likelihood, generalized analysis is test the score in validation model Abstract. oe aiain on rcs eiul,pseudo-likeli residuals, process point validation, model phrases: and words Key co for Software provided. s plots. is methods residual diagnostics localization smoothed and support statistic also scan results The models. ted hntsigfrcmlt pta admes n hyprov they and the ; as spatial such complete tools for testing when 01 o.2,N.4 613–646 4, No. 26, Vol. 2011, g ua sPostdoctoral is Rubak Ege . 2011 , edvlpnwtosfrfra neec n informal and inference formal for tools new develop We eprMle is Møller Jesper . compensator [email protected] opnao,fntoa umr statistics, summary functional Compensator, This . fthe of in . 1 aysaitc uha Ripley’s as sum- such functional statistics using of mary practice established the for data. these (localized in “” points) between and dependence points) of in check variation abundance to (spatial “inhomogeneity” possible it for make separately techniques Our paper. the pattern. point clustered hot- a ex- localized in (for or spots effects possible covariate for it check making to ample) processes, hy- point wider alternative much of a and class to this null extends approach the Our both potheses. under Poisson is n ls fmdl,tesoets ttsi sequiv- empirical is statistic the test to score the alent models, of In class points. between one inhibition and clustering study to effects covariate about inference [ formal mainly used support been has to test score the statistics, spatial [ odeso-tpoeuebsdo oprn the customary comparing the on empirical to based related procedure closely goodness-of-fit is procedure test eto n oe aiain[ se- validation model model for and diagnostics lection provide to statistics plied 13 22 K u prahas rvdstertclsupport theoretical provides also approach Our Figure h ieiodsoeadtesoets [ test score the and score likelihood The , ap- in frequently used are 324, and 315 pages ], fnto o etn te fit- other testing for -function 47 , 76 1 K suigteudryn on process point underlying the assuming ] hw he xml aast tde in studied sets data example three shows fnto ihisnl xetdvalue. expected null its with -function hood. ’s lse prac- blished ptn the mputing o Besag’s rom K n process int c sthe as uch K -function, d new ide a The ta. fnto,adtescore the and -function, 2 , K 15 fnto [ -function , 19 , 60 , 61 63 77 , , .In ]. 75 64 ], ] 2 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b) (c) Fig. 1. Point pattern data sets. (a) Japanese black pine seedlings and saplings in a 10 × 10 metre quadrat [53, 54]. Reprinted by kind permission of Professors M. Numata and Y. Ogata. (b) Simulated realization of inhomogeneous Strauss process showing strong inhibition and spatial trend [7], Figure 4 b. (c) Simulated realization of homogeneous Geyer saturation process showing moderately strong clustering without spatial trend [7], Figure 4 c.

Similar statements apply to the nearest neighbor vides a new class of residual that distance distribution function G and the empty space can be used as informal diagnostics for model fit, function F . for a wide of point process models, in close For computational efficiency, especially in large analogy with current practice. The diagnostics ap- data sets, the point process likelihood is often re- ply under very general conditions, including the case placed by Besag’s [14] pseudo-likelihood. The result- of inhomogeneous point process models, where ex- ing “pseudo-score” is a possible surrogate for the ploratory methods are underdeveloped or inapplica- likelihood score in the . In one model, this ble. For instance, Figure 2 shows the compensator pseudo-score test statistic is equivalent to a resid- of K(r) for an inhomogeneous Strauss process. ual version of the empirical K-function, yielding Section 2 introduces basic definitions and assump- a new, efficient diagnostic for model fit. However, in tions. Section 3 describes the score test for a gen- general, the interpretation of the pseudo-score test eral point process model, and Section 4 develops statistic is conceptually more complicated than that of the likelihood score test statistic, and hence diffi- cult to employ as a diagnostic. In classical settings the score test statistic is a weighted sum of residuals. For point processes the pseudo-score test statistic is a weighted point process residual in the sense of [4, 7]. This sug- gests a simplification, in which the pseudo-score test statistic is replaced by another residual diagnostic that is easier to interpret and to compute. In special cases this diagnostic is a residual version of one of the classical functional summary statis- tics K, G or F obtained by subtracting a “com- pensator” from the functional summary statistic. The compensator depends on the fitted model, and may also depend on the observed data. For exam- ple, suppose the fitted model is the homogeneous Poisson process. Then (ignoring some details) the compensator of the empirical K-function Kˆ (r) is 2 its expectation K0(r)= πr under the model, while Fig. 2. Empirical K-function (thick grey line) for the point the compensator of the empirical nearest neighbor pattern data in Figure 1(b), compensator of the K-function function Gˆ(r) is the empirical empty space func- (solid black line) for a model of the correct form, and expected tion Fˆ(r) for the same data. This approach pro- K-function for a homogeneous Poisson process (dashed line). DIAGNOSTICS FOR SPATIAL POINT PROCESSES 3 the important case of Poisson point process mod- Poisson(W, ρ) is els. Section 5 gives examples and technical tools for non-Poisson point process models. Section 6 devel- (2) f(x)=exp (1 ρ(u)) du ρ(xi). − ops the general theory for our diagnostic tools. Sec- W i Z  Y tion 7 applies these tools to tests for first order trend We assume that f is hereditary, that is, f(x) > 0 and hotspots. Sections 8–11 develop diagnostics for implies f(y) > 0 for all finite y x W . Processes interaction between points, based on pairwise dis- satisfying these assumptions include⊂ ⊂ (under integra- tances, nearest neighbor distances and empty space bility conditions) inhomogeneous Poisson processes distances, respectively. The tools are demonstrated with an intensity function, finite Gibbs processes on data in Sections 12–15. Further examples of di- contained in W , and Cox processes driven by ran- agnostics are given in Appendix A. Appendices B–E dom fields. See [38], Chapter 3, for an overview of provide technical details. finite point processes including these examples. In practice, our methods require the density to have 2. ASSUMPTIONS a tractable form, and are only developed for Pois- 2.1 Fundamentals son and Gibbs processes. A spatial point pattern data set is a finite set 2.3 Conditional Case x = x ,...,xn of points xi W , where the num- { 1 } ∈ In the conditional case, we assume X = Y W ber of points n(x)= n 0 is not fixed in advance, where Y is a point process. Thus, X may depend∩ and the domain of observation≥ W Rd is a fixed, ⊂ on unobserved points of Y lying outside W . The known region of d-dimensional space with finite pos- density of X may be unknown or intractable. Under itive volume W . We take d = 2, but the results | | suitable conditions (explained in Section 5.4) mod- generalize easily to all dimensions. eling and inference can be based on the conditional A point process model assumes that x is a realiza- distribution of X◦ = X W ◦ given X+ = X W + = tion of a finite point process X in W without mul- x+, where W + W is∩ a subregion, typically∩ a re- tiple points. We can equivalently view X as a ran- gion near the boundary⊂ of W , and only the points in dom finite subset of W . Much of the literature on W ◦ = W W + are treated as random. We assume spatial statistics assumes that X is the restriction that the\ conditional distribution of X◦ = X W ◦ X = Y W of a stationary point process Y on the given X+ = X W + = x+ has an hereditary∩ den- ∩ R2 entire space . We do not assume this; there is no sity f(x◦ x+) with∩ respect to Poisson(W ◦, 1). Pro- assumption of stationarity, and some of the models cesses satisfying| these assumptions include Markov considered here are intrinsically confined to the do- point processes [74], [50], Section 6.4, together with main W . For further background material including all processes covered by the unconditional case. Our measure theoretical details, see, for example, [50], methods are only developed for Poisson and Markov Appendix B. point processes. Write X Poisson(W, ρ) if X follows the Poisson ∼ For ease of exposition, we focus mainly on the un- process on W with intensity function ρ, where we as- conditional case, with occasional comments on the X sume ν = W ρ(u)du is finite. Then n( ) is Poisson conditional case. For Poisson point process models, distributed with ν, and, conditional on n(X), we always take W = W ◦ so that the two cases agree. the pointsR in X are i.i.d. with density ρ(u)/ν. Every point process model considered here is as- 3. SCORE TEST FOR POINT PROCESSES sumed to have a probability density with respect to Poisson(W, 1), the unit rate Poisson process, under In principle, any technique for likelihood-based in- one of the following scenarios. ference is applicable to point process likelihoods. In practice, many likelihood computations require ex- 2.2 Unconditional Case tensive Monte Carlo simulation [31, 50, 51]. To min- In the unconditional case we assume X has a den- imize such difficulties, when assessing the goodness sity f with respect to Poisson(W, 1). Then the den- of fit of a fitted point process model, it is natural to sity is characterized by the property choose the score test which only requires computa- E E tions for the null hypothesis [61, 75]. (1) [h(X)] = [h(Y)f(Y)] Consider any parametric family of point process for all nonnegative measurable functionals h, where models for X with density fθ indexed by a k-dimen- Y Poisson(W, 1). In particular, the density of sional vector parameter θ Θ Rk. For a simple ∼ ∈ ⊆ 4 A. BADDELEY, E. RUBAK AND J. MØLLER null hypothesis H0 : θ = θ0 where θ0 Θ is fixed, the dexed by θ Θ. The score test statistic is (3), where score test against any alternative H∈: θ Θ , where ∈ 1 ∈ 1 Θ Θ θ , is based on the score test statistic U(θ)= κθ(xi) κθ(u)ρθ(u)du, 1 0 − ⊆ \ { } i W ([22], page 315), X Z (3) T 2 = U(θ )⊤I(θ )−1U(θ ). ⊤ 0 0 0 I(θ)= κθ(u) κθ(u) ρθ(u)du ∂ ⊤ W Here U(θ)= log fθ(x) and I(θ)= Eθ[U(θ)U(θ) ] Z ∂θ ∂ are the score function and , re- with κθ(u)= ∂θ log ρθ(u). Asymptotic results are giv- spectively, and the expectation is with respect to fθ. en in [45, 62]. Here and throughout, we assume that the order of 4.1 Log-Linear Alternative integration and differentation with respect to θ can be interchanged. Under suitable conditions, the null The score test is commonly used in spatial epi- distribution of T 2 is χ2 with k degrees of freedom. demiology to assess whether disease incidence de- In the case k = 1 it may be informative to evaluate pends on environmental exposure. As a particular the signed square root case of (5), suppose the Poisson model has a log- linear intensity function (4) T = U(θ0)/ I(θ0), (7) ρ (u) = exp(α + βZ(u)), which is asymptotically Np(0, 1) distributed under (α,β) the same conditions. where Z(u), u W , is a known, real-valued and non- For a composite null hypothesis H : θ Θ where constant covariate∈ function, and α and β are real 0 ∈ 0 Θ Θ is an m-dimensional submanifold with 0 < parameters. Cox [21] noted that the uniformly most 0 ⊂ m < k, the score test statistic is defined in [22], powerful test of H0 : β = 0 (the homogeneous Pois- page 324. However, we shall not use this version son process) against H1 : β> 0 is based on the statis- of the score test, as it assumes differentiability of tic the likelihood with respect to nuisance parameters, (8) S(x)= Z(x ). which is not necessarily applicable here (as exempli- i i fied in Section 4.2). X In the sequel we often consider models of the form Recall that, for a point process X on W with inten- sity function ρ, we have Campbell’s Formula ([24], x x x (5) f(α,β)( )= c(α, β)hα( ) exp(βS( )), page 163), where the parameter β and the statistic S(x) are one dimensional, and the null hypothesis is H0 : β = 0. (9) E h(xi) = h(u)ρ(u)du For fixed α, this is a linear and (4) x X W  Xi∈  Z becomes for any Borel function h such that the on the E V T (α) = (S(x) (α,0)[S(X)])/ ar(α,0)[S(X)]. right-hand side exists; and for the Poisson process − Poisson(W, ρ), In practice, when α is unknown,q we replace α by its MLE under H so that, with a slight abuse of 2 0 (10) Var h(xi) = h(u) ρ(u)du notation, the signed square root of the score test X W xi∈  Z statistic is approximated by X for any Borel function h such that the integral on T = T (ˆα) the right-hand side exists. Hence, the standardized (6) version of (8) is = (S(x) E [S(X)])/ Var [S(X)]. − (ˆα,0) (ˆα,0) Under suitable conditions, T in (q6) is asymptotically (11) T = S(x) κˆ Z(u)du κˆ Z(u)2 du, − W s W equivalent to T in (4), and so a standard Normal  Z . Z approximation may still apply. whereκ ˆ = n/ W is the MLE of the intensity κ = exp(α) under| the| null hypothesis. This is a direct 4. SCORE TEST FOR POISSON PROCESSES application of the approximation (6) of the signed Application of the score test to Poisson point pro- square root of the score test statistic. cess models appears to originate with Cox [21]. Con- Berman [13] proposed several tests and diagnos- sider a parametric family of Poisson processes, tics for spatial association between a point process X Poisson(W, ρθ), where the intensity function is in- and a covariate function Z(u). Berman’s Z1 test is DIAGNOSTICS FOR SPATIAL POINT PROCESSES 5 equivalent to the Cox score test described above. sample paths of T (z) will be governed by Poisson Waller et al. [76] and Lawson [47] proposed tests for behavior where A(z) is small. the dependence of disease incidence on environmen- In this paper, our approach is simply to plot the tal exposure, based on data giving point locations of score test statistic as a function of the nuisance pa- disease cases. These are also applications of the score rameter. This turns the score test into a graphical test. Berman conditioned on the number of points exploratory tool, following the approach adopted in when making inference. This is in accordance with many other areas [2, 15, 19, 60, 77]. A second style the observation that the statistic n(x) is S-ancillary of plot based on S(x, z) κAˆ (z) against z may be for β, while S(x) is S-sufficient for β. more appropriate visually.− Such a plot is the lurking variable plot of [7]. Berman [13] also proposed a plot 4.2 Threshold Alternative and of S(x, z) against z, together with a plot ofκA ˆ (z) Nuisance Parameters against z, as a diagnostic for dependence on Z. This Consider the Poisson process with an intensity is related to the Kolmogorov–Smirnov test since, un- function of “threshold” form, der H0, the values Yi = Z(xi) are i.i.d. with distri- bution function P(Y y)= A(y)/ W . κ exp(φ) if Z(u) z, ρ (u)= ≤ | | z,κ,φ κ if Z(u) ≤> z, 4.3 Hot Spot Alternative  where z is the threshold level. If z is fixed, this Consider the Poisson process with intensity model is a special case of (7) with Z(u) replaced (12) ρκ,φ,v(u)= κ exp(φk(u v)), by I Z(u) z , and so (8) is replaced by − { ≤ } where k is a kernel (a probability density on R2), S(x)= S(x, z)= I Z(xi) z , R2 { ≤ } κ> 0 and φ are real parameters, and v is a nui- i ∈ X sance parameter. This process has a “hot spot” of where I denotes the indicator function. By (11) elevated intensity in the vicinity of the location v. {·} the (approximate) score test of H0 : φ = 0 against By (11) and (9)–(10) the score test of H0 : φ = 0 H : φ = 0 is based on against H : φ = 0 is based on 1 6 1 6 T = T (z) = (S(x, z) κAˆ (z))/ κAˆ (z), T = T (v) = (S(x, v) κMˆ 1(v))/ κMˆ 2(v), − − where A(z)= u W : Z(u) z isp the area of the where p corresponding|{ level∈ set of Z.≤ }| S(x, v)= k(xi v) If z is not fixed, then it plays the role of a nuisance − i parameter in the score test: the value of z affects X inference about the canonical parameter φ, which is is the usual nonparametric kernel estimate of point the parameter of primary interest in the score test. process intensity [28] evaluated at v without edge Note that the likelihood is not differentiable with correction, and respect to z. i Mi(v)= k(u v) du, i = 1, 2. In most applications of the score test, a nuisance W − parameter would be replaced by its MLE under the Z The numerator S(x, v) κMˆ (v) is the smoothed null hypothesis. However, in this context, z is not − 1 identifiable under the null hypothesis. Several solu- residual field [7] of the null model. In the special case where k(u) I u h is the uniform density tions have been proposed [18, 25, 26, 33, 68]. They on a disc of radius∝ {kh,k≤ the maximum} max T (v) is include replacing z by its MLE under the alterna- v closely related to the scan statistic [1, 44]. tive [18], maximizing T (z) or T (z) over z [25, 26], and finding the maximum p-value| of| T (z) or T (z) 5. NON-POISSON MODELS over a confidence region for z under the alterna-| | tive [68]. The remainder of the paper deals with the case These approaches appear to be inapplicable to the where the alternative (and perhaps also the null) is current context. While the null distribution of T (z) not a Poisson process. Key examples are stated in is asymptotically N(0, 1) for each fixed z as κ , Section 5.1. Non-Poisson models require additional this convergence is not uniform in z. The null distri-→∞ tools including the Papangelou conditional intensity bution of S(x, z) is Poisson with parameter κA(z); (Section 5.2) and pseudo-likelihood (Section 5.3). 6 A. BADDELEY, E. RUBAK AND J. MØLLER

5.1 Point Process Models with Interaction tion (negative association) when φ< 0. The Geyer and area-interaction models are well-defined point We shall frequently consider densities of the form processes for any value of φ [11, 31], but the Strauss density is integrable only when φ 0 [43]. (13) f(x)= c λ(xi) exp(φV (x)), ≤ i 5.2 Conditional Intensity Y  where c is a normalizing constant, the first order Consider a parametric model for a point process X term λ is a nonnegative function, φ is a real inter- in R2, with parameter θ Θ. Papangelou [59] de- action parameter, and V (x) is a real nonadditive fined the conditional intensity∈ of X as a nonnega- function which specifies the interaction between the tive λθ(u, X) indexed by locations points. We refer to V as the interaction potential. u R2 and characterized by the property that In general, apart from the Poisson density (2) corre- ∈ sponding to the case φ = 0, the normalizing constant Eθ h(xi, X xi ) X \ { } is not expressible in closed form. xi∈  Often the definition of V can be extended to all (17) X R2 finite point patterns in so as to be invariant un- = Eθ h(u, X)λθ(u, X)du R2 der rigid motions (translations and rotations). Then Z  the model for X is said to be homogeneous if λ is for all measurable functions h such that the left constant on W , and inhomogeneous otherwise. or right-hand side exists. Equation (17) is known Let as the Georgii–Nguyen–Zessin (GNZ) formula [[30, 41], [42, 52]]; see also Section 6.4.1 in [50]. Adapting d(u, x)=min u xj j k − k a term from stochastic process theory, we will call denote the distance from a location u to its nearest the random integral on the right-hand side of (17) neighbor in the point configuration x. For n(x)= the (Papangelou) compensator of the random sum n 1 and i = 1,...,n, define on the left-hand side. ≥ Consider a finite point process X in W . In the x−i = x xi . unconditional case (Section 2.2) we assume X has \ { } density fθ(x) which is hereditary for all θ Θ. We In many places in this paper we consider the fol- ∈ lowing three motion-invariant interaction potentials may simply define V (x) = V (x, r) depending on a parameter r > 0 (18) λθ(u, x)= fθ(x u )/fθ(x) which specifies the range of interaction. The Strauss ∪ { } for all locations u W and point configurations x process [73] has interaction potential ∈ ⊂ W such that u / x. Here we take 0/0=0. For xi x ∈ ∈ (14) VS(x, r)= I xi xj r , we set λθ(xi, x) = λθ(xi, x−i), and for u / W we {k − k≤ } ∈ i 0 and spatial inhibi- X+ = x+. DIAGNOSTICS FOR SPATIAL POINT PROCESSES 7

It is convenient to rewrite (18) in the form sity λθ(u, Y) of Y satisfies λθ(u, Y) = λθ(u, Y B(u, R)), where B(u, R) is the ball of radius R cen-∩ λθ(u, x) = exp(∆u log f(x)), tered at u. Define the erosion of W by distance R, where ∆ is the one-point difference operator W⊖R = u W : B(u, R) W , (20) ∆uh(x)= h(x u ) h(x u ). { ∈ ⊂ } ∪ { } − \ { } and assume this has nonzero area. Let B = W W⊖R Note the Poincar´einequality for the Poisson pro- be the border region. The process satisfies a spatial\ cess X, Markov property: the processes Y W⊖R and Y c ∩ ∩ V E 2 W are conditionally independent given Y B. (21) ar[h(X)] [∆uh(X)] ρ(u)du ∩ ≤ W In this situation we shall invoke the conditional Z ◦ + ◦ holding for all measurable functionals h such that case with W = W⊖R and W = W W . The con- ◦ \ + + the right-hand side is finite; see [46, 79]. ditional distribution of X W given X W = x has Papangelou conditional∩ intensity ∩ 5.3 Pseudo-Likelihood and Pseudo-Score ◦ + ◦ ◦ + λθ(u, x x ) if u W , (24) λθ(u, x x )= To avoid computational problems with point pro- | 0∪ otherwise.∈ cess likelihoods, Besag [14] introduced the pseudo- Thus, the unconditionaln and conditional versions of a Markov point process have the same Papangelou conditional intensity at locations in W ◦. PL(θ)= λθ(xi, x) ◦ For x = x1,...,xn◦ , the conditional probability  i  { } (22) Y density given x+ becomes x◦ x+ exp λθ(u, x)du . fθ( ) · − |  ZW  n◦ This is of the same functional form as the likelihood + ◦ + = cθ(x )λθ(x1, x ) λθ(xi, x1,...,xi−1 x ) function of a Poisson process (2), but has the Papan- { }∪ i=2 gelou conditional intensity in place of the Poisson Y ◦ ∅ x+ x+ ∅ intensity. The corresponding pseudo-score if n > 0, and fθ( )= cθ( ), where denotes the empty configuration,| and the inverse normaliz- ∂ + + PU(θ)= log PL(θ) ing constant cθ(x ) depends only on x . ∂θ (23) For example, instead of (13) we now consider ∂ ∂ n◦ = log λθ(xi, x) λθ(u, x)du ◦ + + ◦ + ∂θ − ∂θ f(x x )= c(x ) λ(xi) exp(φV (x x )), i ZW | ∪ X "i=1 # is an unbiased estimating function, E PU(θ)=0, by Y θ R2 virtue of (17). In practice, the pseudo-likelihood is assuming V (y) is defined for all finite y such R2 ⊂ applicable only if the Papangelou conditional inten- that for any u y, ∆uV (y) depends only on u and y B(u, R∈). This\ condition is satisfied by the sity λθ(u, x) is tractable. ∩ The pseudo-likelihood function can also be defined interaction potentials (14)–(16); note that the range in the conditional case [39]. In (22) the product is of interaction is R = r for the Strauss process, and ◦ R = 2r for both the Geyer and the area-interaction instead over points xi x and the integral is in- stead over W ◦; in (23) the∈ sum is instead over points models. x◦ ◦ xi and the integral is instead over W ; and 6. SCORE, PSEUDO-SCORE AND in∈ both places x = x◦ x+. The Papangelou con- ∪ RESIDUAL DIAGNOSTICS ditional intensity λθ(u, x) must also be replaced by ◦ + λθ(u, x x ). This section develops the general theory for our | diagnostic tools. 5.4 Markov Point Processes By (6) in Section 3 it is clear that comparison For a point process X constructed as X = Y W of a summary statistic S(x) to its predicted va- where Y is a point process in R2, the density∩ and lue ES(X) under a null model is effectively equiv- Papangelou conditional intensity of X may not be alent to the score test under an exponential family available in simple form. Progress can be made if Y model where S(x) is the canonical sufficient statis- is a Markov point process of interaction range R< tic. Similarly, the use of a functional summary statis- ; see [30, 52, 66, 74] and [50], Section 6.4.1. Briefly, tic S(x, z), depending on a function argument z, is ∞this that the Papangelou conditional inten- related to the score test under an exponential family 8 A. BADDELEY, E. RUBAK AND J. MØLLER model where z is a nuisance parameter and S(x, z) mation and simulation from the null model, which in is the canonical sufficient statistic for fixed z. In this both cases may be computationally expensive. Var- section we construct the corresponding exponential ious approximations for the score or the score test family models, apply the score test, and propose sur- statistic can be constructed, as discussed in the se- rogates for the score test statistic. quel. 6.1 Models 6.2 Pseudo-Score of Extended Model

Let fθ(x) be the density of any point process X The extended model (25) is an exponential family on W governed by a parameter θ. Let S(x, z) be with respect to φ, having Papangelou conditional a functional summary statistic of the point pattern intensity data set x, with function argument z belonging to κθ,φ,z(u, x)= λθ(u, x) exp(φ∆uS(x, z)), any space. Consider the extended model with density where λθ(u, x) is the Papangelou conditional inten- sity of the null model. The pseudo-score function (25) fθ,φ,z(x)= cθ,φ,zfθ(x) exp(φS(x, z)), with respect to φ, evaluated at φ =0, is where φ is a real parameter, and cθ,φ,z is the normal- PU(θ, z)= ∆x S(x, z) ∆uS(x, z)λθ(u, x)du, izing constant. The density is well-defined provided i − i W X Z M(θ, φ, z)= E[fθ(Y) exp(φS(Y, z))] < , ∞ where the first term where Y Poisson(W, 1). The extended model is ∼ (28) Σ∆S(x, z)= ∆xi S(x, z) constructed by “exponential tilting” of the original i model by the statistic S. By (6), for fixed θ and z, X ˆ assuming differentiability of M with respect to φ in will be called the pseudo-sum of S. If θ is the maxi- a neighborhood of φ = 0, the signed root of the score mum pseudo-likelihood estimate (MPLE) under H0, test statistic is approximated by the second term with θ replaced by θˆ becomes E V x x x (26) T = (S(x, z) θˆ[S(X, z)])/ arθˆ[S(X, z)], (29) ∆S( , z)= ∆uS( , z)λθˆ(u, )du − C W q Z where θˆ is the MLE under the null model, and the and will be called the (estimated) pseudo-compensator expectation and are with respect to the null of S. We call model with density fθˆ. ∆S(x, z)= PU(θ,ˆ z) Insight into the qualitative behavior of the ex- (30) R tended model (25) can be obtained by studying the = Σ∆S(x, z) ∆S(x, z) −C perturbing model the pseudo-residual since it is a weighted residual in (27) gφ,z(x)= kφ,z exp(φS(x, z)), the sense of [7]. The pseudo-residual serves as a surrogate for the provided this is a well-defined density with respect numerator in the score test statistic (26). For the to Poisson(W, 1), where kφ,z is the normalizing con- denominator, we need the variance of the pseudo- stant. When the null hypothesis is a homogeneous residual. Appendix B gives an exact formula (66) Poisson process, the extended model is identical to for the variance of the pseudo-score PU(θ, z), which the perturbing model, up to a change in the first or- can serve as an approximation to the variance of der term. In general, the extended model is a quali- the pseudo-residual ∆S(x, z). This is likely to be tative hybrid between the null and perturbing mod- an overestimate, becauseR the effect of parameter es- els. timation is typically to deflate the residual vari- In this context the score test is equivalent to naive ance [7]. comparison of the observed and null-expected val- The first term in the variance formula (66) is ues of the functional summary statistic S. The test 2 2 statistic T in (26) may be difficult to evaluate; typi- (31) ∆S(x, z)= [∆uS(x, z)] λ (u, x)du, C θˆ cally, apart from Poisson models, the moments (par- ZW ticularly the variance) of S would not be available which we shall call the Poincar´epseudo-variance be- in closed form. The null distribution of T would also cause of its similarity to the Poincar´eupper bound typically be unknown. Hence, implementation of the in (21). It is easy to compute this quantity along- score test would typically require approxi- side the pseudo-residual. Rough calculations in Sec- DIAGNOSTICS FOR SPATIAL POINT PROCESSES 9 tions 9.4 and 10.3 suggest that the Poincar´epseudo- strated in the examples in Sections 9, 10 and 11 and variance is likely to be the dominant term in the in Appendix A. variance, except at small r values. The variance of Consider a null model with Papangelou conditional residuals is also studied in [17]. intensity λθ(u, x). Following [7], define the (s-weight- For computational efficiency we propose to use the ed) innovation by square root of (31) as a surrogate for the denomi- nator in (26). This yields a “standardized” pseudo- (34) S(x, r)= S(x, z) s(u, x, z)λθ(u, x)du, I − residual ZW (32) ∆S(x, z)= ∆S(x, z)/ 2∆S(x, z). which by the GNZ formula (17) has mean zero un- T R C der the null model. In practice, we replace θ by an We emphasize that this quantityp is not guaranteed estimate θˆ (e.g., the MPLE) and consider the (s- to have zero mean and unit variance (even approxi- weighted) residual mately) under the null hypothesis. It is merely a com- putationally efficient surrogate for the score test sta- (35) S(x, z)= S(x, z) s(u, x, z)λ (u, x)du. R − θˆ tistic; its null distribution must be investigated by ZW other means. Asymptotics of ∆S(x, z) under The residual shares many properties of the score T a large-domain limit [69] could be studied, but limit function and can serve as a computationally efficient results are unlikely to hold uniformly over r. In this surrogate for the score. The data-dependent integral paper we evaluate null distributions using Monte Carlo methods. (36) S(x, z)= s(u, x, z)λ (u, x)du C θˆ The pseudo-sum (28) can be regarded as a func- ZW tional summary statistic for the data in its own is the (estimated) Papangelou compensator of S. right. Its definition depends only on the choice of the The variance of S(x, z) can be approximated by statistic S, and it may have a meaningful interpre- the innovation variance,R given by the general vari- tation as a nonparametric of a property ance formula (65) of Appendix B. The first term of the point process. The pseudo-compensator (29) in (65) is the Poincar´evariance might also be regarded as a functional summary statistic, but its definition involves the null model. (37) 2S(x, z)= s(u, x, z)2λ (u, x)du. C θˆ If the null model is true, we may expect the pseudo- ZW residual to be approximately zero. Sections 9–11 and Rough calculations reported in Sections 9.4 and 10.3 Appendix A study particular instances of pseudo- suggest that the Poincar´evariance is likely to be the residual diagnostics based on (28)–(30). largest term in the variance for sufficiently large r. In the conditional case, the Papangelou condi- By analogy with (31) we propose to use the Poincar´e tional intensity λ (u, x) must be replaced by λ (u, variance as a surrogate for the variance of S(x, z), θˆ θˆ R x◦ x+) given in (19) or (24). The integral in the and thereby obtain a “standardized” residual | definition of the pseudo-compensator (29) must be (38) S(x, z)= S(x, z)/ 2S(x, z). restricted to the domain W ◦, and the summation T R C over data points in (28) must be restricted to points Once again S(x, z) is not exactlyp standardized, be- ◦ ◦ 2 T V xi W , that is, to summation over points of x . cause S(x, z) is an approximation to ar[ S(x, z)] ∈ and becauseC the numerator and denominatorR of (38) 6.3 Residuals are dependent. The null distribution of S(x, z) A simpler surrogate for the score test is available must be investigated by other means. T when the canonical sufficient statistic S of the per- In the conditional case, the integral in the defi- turbing model is naturally expressible as a sum of nition of the compensator (36) must be restricted local contributions to the domain W ◦, and the summation over data ◦ (33) S(x, z)= s(x , x , z). points in (33) must be restricted to points xi W , i −i ◦ ∈ i that is, to summation over points of x . X Note that any statistic can be decomposed in this 7. DIAGNOSTICS FOR FIRST ORDER TREND way unless some restriction is imposed on s; such a decomposition is not necessarily unique. We call Consider any null model with density fθ(x) and the decomposition “natural” if s(u, x, z) only de- Papangelou conditional intensity λθ(u, x). By anal- pends on points of x that are close to u, as demon- ogy with Section 4 we consider alternatives of the 10 A. BADDELEY, E. RUBAK AND J. MØLLER form (25) where seen in [9, 23, 29, 50]. Simple empirical of these functions are of the form S(x, z)= s(x , z) i ˆ ˆ i K(r)= Kx(r) X (40) for some function s. The perturbing model (27) is 1 I a Poisson process with intensity exp(φs( , z)), = 2 eK (xi,xj) xi xj r , · ρˆ (x) W {k − k≤ } where z is a nuisance parameter. The score test is | | Xi6=j a test for the presence of an (extra) first order trend. Gˆ(r)= Gˆx(r) The pseudo-score and residual diagnostics are both (41) equal to 1 = eG(xi, x i, r)I d(xi, x i) r , n(x) − { − ≤ } S(x, z)= s(xi, z) i R X i ˆ ˆ (39) X F (r)= Fx(r) (42) s(u, z)λθˆ(u, x)du. 1 − W = eF (u, r)I d(u, x) r du, Z W W { ≤ } This is the s-weighted residual described in [7]. The | | Z where eK (u, v), eG(u, x, r) and eF (u, r) are edge cor- variance of (39) can be estimated by simulation, or 2 approximated by the Poincar´evariance (37). rection weights, and typicallyρ ˆ (x)= n(x)(n(x) 1)/ W 2. − If Z is a real-valued covariate function on W , then | | we may take s(u, z)= I Z(u) z for z R, cor- 8.2 Score Test Approach responding to a threshold{ effect≤ (cf.} Section∈ 4.2). A plot of (39) against z was called a lurking vari- The classical approach fits naturally into the able plot in [7]. scheme of Section 6. In order to test for depen- If s(u, z)= k(u z) for z R2, where k is a density dence between points, we choose a perturbing model function on R2, then− ∈ that exhibits dependence. Three interesting exam- ples of perturbing models are the Strauss process, S(x, z)= k(xi z) k(u z)λ (u, x)du, the Geyer saturation model with saturation thresh- R − − − θˆ i ZW old 1 and the area-interaction process, with interac- X tion potentials V (x, r), V (x, r) and V (x, r) given which was dubbed the smoothed residual field in [7]. S G A in (14)–(16). The nuisance parameter r 0 deter- Examples of application of these techniques have ≥ been discussed extensively in [7]. mines the range of interaction. It is interesting to note that, although the Strauss density is integrable only when φ 0, the extended model obtained by 8. INTERPOINT INTERACTION ≤ perturbing fθ by the Strauss density may be well- In the remainder of the paper we concentrate on defined for some φ> 0. This extended model may diagnostics for interpoint interaction. support alternatives that are clustered relative to 8.1 Classical Summary Statistics the null, as originally intended by Strauss [73]. The potentials of these three models are closely re- Following Ripley’s influential paper [64], it is stan- lated to the summary statistics K,ˆ Gˆ and Fˆ in (40)– dard practice, when investigating association or de- (42). Ignoring the edge correction weights e( ), we pendence between points in a spatial point pattern, have · to evaluate functional summary statistics such as 2 W the K-function, and to compare graphically the em- (43) Kˆx(r) | | VS(x, r), ≈ n(x)(n(x) 1) pirical summaries and theoretical predicted values − under a suitable model, often a stationary Poisson 1 (44) Gˆx(r) VG(x, r), process (“Complete Spatial Randomness,” CSR) [23, ≈ n(x) 29, 64]. 1 The three most popular functional summary sta- (45) Fˆx(r) VA(x, r). ≈− W tistics for spatial point processes are Ripley’s K- | | function, the nearest neighbor distance distribution To draw the closest possible connection with the function G and the empty space function (spheri- score test, instead of choosing the Strauss, Geyer or cal contact distance distribution function) F . Defi- area-interaction process as the perturbing model, we nitions of K, G and F and their estimators can be shall take the perturbing model to be defined through DIAGNOSTICS FOR SPATIAL POINT PROCESSES 11

(27) where S is one of the statistics Kˆ , Gˆ or Fˆ. We 9. RESIDUAL DIAGNOSTICS FOR call these the (perturbing) Kˆ -model, Gˆ-model and INTERACTION USING PAIRWISE DISTANCES Fˆ-model, respectively. The score test is then pre- This section develops residual (35) and pseudo- cisely equivalent to comparing Kˆ , Gˆ or Fˆ with its residual (30) diagnostics derived from a summary predicted expectation using (6). statistic S which is a sum of contributions depending Essentially Kˆ , Gˆ, Fˆ are renormalized versions on pairwise distances. of VS , VG, VA as shown in (43)–(45). In the case of Fˆ the renormalization is not data-dependent, so 9.1 Residual Based on Perturbing Strauss Model the Fˆ-model is virtually an area-interaction model, ignoring edge correction. For Kˆ , the renormaliza- 9.1.1 General derivation Consider any statistic of tion depends only on n(x), and so, conditionally the general “pairwise interaction” form ˆ on n(x)= n, the K-model and the Strauss process (46) S(x, r)= q( xi,xj , r). { } are approximately equivalent. Similarly for Gˆ, the i

1 2 This section develops residual and pseudo-residual I1 = E[t(u, X, r) λ(u, X)] du 4 W diagnostics derived from summary statistics based Z on nearest neighbor distances. ρ = E[t(u, X, r)2]du 4 10.1 Residual Based on Perturbing Geyer Model ZW and The Geyer interaction potential VG(x, r) given 1 by (15) is clearly a sum of local statistics (33), and I = E[I u v r λ (u, v, X)] du dv 2 4 {k − k≤ } 2 its compensator is ZW ZW ρ2 I I VG(x, r)= d(u, x) r λθˆ(u, x)du. = u v r du dv C W { ≤ } 4 W W {k − k≤ } Z Z Z The Poincar´evariance is equal to the compensator as λ(u, X) = ρ and λ (u, v, X) = λ(u, X)λ(v, 2 in this case. Ignoring edge effects, V (x, r) is ap- X u )= ρ2. This is reminiscent of expressions G ∪ { } proximately n(x)Gˆx(r); cf. (41). for the large-domain limiting variance of Kˆ under If the null model is CSR with estimated intensity CSR obtained using the methods of U-statistics [16, κˆ = n(x)/ W , then 48, 65], summarized in [29], page 51 ff. Now Y = | | t(u, X, r) is Poisson distributed with mean µ = VG(x, r) =κ ˆ W B(xi, r) ; ρ B(u, r) W so that E(Y 2)= µ + µ2. For u W C ∩ ⊖r i we| have µ∩= ρπr| 2, so ignoring edge effects ∈ [ ˆ ρ ρ ignoring edge effects, this is approximately κ ˆ W F (r); I (ν + ν2) W and I ν W , | | 1 ≈ 4 | | 2 ≈ 4 | | cf. (42). Thus, the residual diagnostic is approxi- mately n(x)(Gˆ(r) Fˆ(r)). This is a reasonable di- where ν = ρπr2. Note that since ν is the expected agnostic for departure− from CSR, since F G under number of points within distance r of a given point, CSR. This argument lends support to Diggle’s≡ [27], a value of ν = 1 corresponds to the scale of nearest- equation (5.7), proposal to judge departure from neighbor distances in the pattern, rnn = 1/√πρ. For CSR using the quantity sup Gˆ Fˆ . the purposes of the K function this is a “short” | − | distance. Hence, it is reasonable to describe I as This example illustrates the important point that 1 the compensator of a functional summary statistic S the “leading term” in the variance, since I1 I2 for ν 1. ≫ should not be regarded as an alternative parametric ≫Meanwhile, the Poincar´evariance (37) is estimator of the same quantity that S is intended to estimate. In the example just given, under CSR 2 n(x) 2 the compensator of Gˆ is approximately Fˆ, a quali- VS(x, r)= t(u, x, r) du, C 4 W tatively different and in some sense “opposite” sum- | | ZW which is an approximately unbiased estimator of I1 mary of the point pattern. by Fubini’s Theorem. Hence, We have observed that the interaction potential VG ˆ 2 2 of the Geyer saturation model is closely related to G. E VS(x, r) E VS(x, r) C C However, the pseudo-residual associated to VG is Var[ VS (X, r)] ≈ Var[ VS(X, r)] R I a more complicated statistic, since a straightforward I 1+ ν calculation shows that the pseudo-sum is 1 . ≈ I1 + I2 ≈ 2+ ν Σ∆VG(x, r) Thus, as a rule of thumb, the Poincar´evariance un- = VG(x, r)+ I xi xj r and derestimates the true variance; the ratio of means is {k − k≤ i (1 + ν)/(2 + ν) 1/2. The ratio falls to 2/3 when X jX:j6=i ≥ ν =1, that is, when r = rnn = 1/√πρ. We can take d(xj, x−i) > r , } 14 A. BADDELEY, E. RUBAK AND J. MØLLER and the pseudo-compensator is we can again apply the variance formula (65) of Appendix B, which gives Var[ VG(X, r)] = L + L , I 1 2 ∆VG(x, r)= I d(u, x) r λˆ(u, x)du where C { ≤ } θ ZW P X + I d(xi, x−i) > r L1 = ρ d(u, ) r du { } W { ≤ } i Z X and I u xi r λθˆ(u, x)du. · W {k − k≤ } 2 P Z L2 = ρ u v r, W W {k − k≤ 10.2 Residual Based on Perturbing Gˆ-Model Z Z d(u, X) > r, d(v, X) > r du dv. The empirical G-function (41) can be written } The Poincar´evariance is equal to the compensator (53) Gˆx(r)= g(xi, x−i, r), i in this case, and is X where 2 I VG(x, r)= d(u, x) r λθˆ(u, x)du 1 C W { ≤ } g(u, x, r)= eG(u, x, r) Z n(x) + 1 n(x) (54) = W U(x, r) , I d(u, x) r , u / x, W | ∩ | · { ≤ } ∈ | | so that the Papangelou compensator of the empirical where U(x, r)= i b(xi, r). The Poincar´evariance is G-function is an approximately unbiased estimator of the term L1. For u W Swe have P d(u, X) r = 1 ˆ ⊖r Gx(r) exp( ρπr∈2) so that { ≤ } − C − 2 = g(u, x, r)λθˆ(u, x)du L1 ρ W (1 exp( ρπr )), W ≈ | | − − Z 2 1 ignoring edge effects. Again, let ν = ρπr so that = e (u, x, r)λ (u, x)du. G θˆ L1 ρ W (1 exp( ν)). Meanwhile, n(x) + 1 W B x ,r Z ∩Si ( i ) ≈ | | − − P The residual diagnostics obtained from the Geyer d(u, X) > r, d(v, X) > r ˆ { } and G-models are very similar, and we choose to = exp( ρ b(u, r) b(v, r) ). use the diagnostic based on the popular Gˆ-function. − | ∪ | As with the K-function, we typically use the com- This probability lies between exp( ν) and exp( 2ν) − − pensator(s) of the fitted model(s) rather than the for all u, v. Thus (ignoring edge effects), residual(s), to visually maintain the close connec- L ρ2πr2 W exp( (1 + δ)ν) tion to the empirical G-function. 2 ≈ | | − The expressions for the pseudo-sum and pseudo- = ρν W exp( (1 + δ)ν), compensator of Gˆ are not of simple form, and we | | − where 0 δ 1. Hence, refrain from explicitly writing out these expressions. ≤ ≤ For both the Gˆ- and Geyer models, the pseudo-sum −ν L2 νe and pseudo-compensator are not directly related to . L ≤ 1 e−ν a well-known summary statistic. We prefer to plot 1 − the pseudo-residual rather than the pseudo-sum and Let f(ν)= νe−ν /(1 e−ν ). Then f(ν) is strictly − pseudo-compensator(s). decreasing and f(ν) < 1 for all ν > 0 so that L1/ (L + L ) 1 , that is, the variance is underesti- 10.3 Residual Variance Under CSR 1 2 2 mated by at≥ most a factor of 2. Note that f(1.25) Again assume a Poisson process of intensity ρ as 0.5, so L /(L +L ) 2 when r r , where r =≈ 1 1 2 ≥ 3 ≤ crit crit the null model. Since VG is a sum of local statistics, 1.25/πρ. The conclusions and rule-of-thumb for Gˆ are similar to those obtained for Kˆ in Sec- VG(x, r)= I d(xi, x xi) r , p { \ ≤ } T T i tion 9.4. X DIAGNOSTICS FOR SPATIAL POINT PROCESSES 15

11. DIAGNOSTICS FOR INTERACTION 11.2 Pseudo-Residual Based on BASED ON EMPTY SPACE DISTANCES Perturbing Fˆ-Model 11.1 Pseudo-Residual Based on Perturbing Alternatively, one could use a standard empirical Area-Interaction Model estimator Fˆ of the empty space function F as the summary statistic in the pseudo-residual. The pseu- When the perturbing model is the area-interaction do-sum associated with the perturbing Fˆ-model is process, it is convenient to reparametrize the den- sity, such that the canonical sufficient statistic VA Σ∆Fˆx(r)= n(x)Fˆx(r) Fˆx (r), − −i given in (16) is redefined as i X 1 with pseudo-compensator VA(x, r)= W B(xi, r) . W ∩ | | i ∆Fˆx(r)= (Fˆx (r) Fˆx(r))λ (u, x)du. [ C ∪{u} − θˆ This summary statistic is not naturally expressed as W Z a sum of contributions from each point as in (33), Ignoring edge correction weights, Fˆx (r) Fˆx(r) ∪{u} − so we shall only construct the pseudo-residual. Let is approximately equal to ∆uVA(x, r), so the pseudo- sum and pseudo-compensator associated with the U(x, r)= W B(xi, r). ∩ perturbing Fˆ-model are approximately equal to the i [ pseudo-sum and pseudo-compensator associated with The increment the perturbing area-interaction model. Here, we usu- ∆uVA(x, r) ally prefer graphics using the pseudo-compensator(s) and the pseudo-sum since this has an intuitive in- 1 = ( U(x u , r) U(x, r) ), u / x, terpretation as explained above. W | ∪ { } | − | | ∈ | | can be thought of as “unclaimed space”—the pro- 12. TEST CASE: TREND WITH INHIBITION portion of space around the location u that is not In Sections 12–14 we demonstrate the diagnostics “claimed” by the points of x. The pseudo-sum on the point pattern data sets shown in Figure 1. This section concerns the synthetic point pattern in Σ∆VA(x, r)= ∆xi VA(x, r) i Figure 1(b). X is the proportion of the window that has “single 12.1 Data and Models coverage”—the proportion of locations in W that Figure 1(b) shows a simulated realization of the are covered by exactly one of the balls B(x , r). This i inhomogeneous Strauss process with first order term can be used in its own right as a functional summary λ(x,y) = 200 exp(2x + 2y + 3x2), interaction range statistic, and it corresponds to a raw (i.e., not edge R = 0.05, interaction parameter γ = exp(φ) = 0.1 corrected) empirical estimate of a summary func- and W equal to the unit square; see (13) and (14). tion F (r) defined by 1 This is an example of extremely strong inhibition F1(r)= P(# x X d(u, x) r = 1) (negative association) between neighboring points, { ∈ | ≤ } for any stationary point process X, where u R2 is combined with a spatial trend. Since it is easy to arbitrary. Under CSR with intensity ρ we have∈ recognize spatial trend in the data (either visually or using existing tools such as kernel smoothing [28]), F (r)= ρπr2 exp( ρπr2). 1 − the main challenge here is to detect the inhibition This summary statistic does not appear to be treated after accounting for the trend. in the literature, and it may be of interest to study We fitted four point process models to the data in it separately, but we refrain from a more detailed Figure 1(b). They were (A) a homogeneous Poisson study here. process (CSR); (B) an inhomogeneous Poisson pro- The pseudo-compensator corresponding to this cess with the correct form of the first order term, pseudo-sum is that is, with intensity 2 (55) ρ(x,y) = exp(β0 + β1x + β2y + β3x ), ∆VA(x, r)= ∆uVA(x, r)λ (u, x)du. C θˆ ZW where β0,...,β3 are real parameters; (C) a homoge- This integral does not have a particularly simple neous Strauss process with the correct interaction interpretation even when the null model is CSR. range R = 0.05; and (D) a process of the correct 16 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b) Fig. 3. Residual diagnostics based on pairwise distances, for a model of the correct form fitted to the data in Figure 1(b). (a) Residual Kˆ -function and two-standard-deviation limits under the fitted model of the correct form. (b) Standardized residual Kˆ -function under the fitted model of the correct form. form, that is, inhomogeneous Strauss with the cor- Figure 2 in Section 1 shows Kˆ along with its com- rect interaction range R = 0.05 and the correct form pensator for the fitted model, together with the the- of the first order potential (55). oretical K-function under CSR. The empirical K- 12.2 Software Implementation function and its compensator coincide very closely, suggesting correctly that the model is a good fit. Fig- The diagnostics defined in Sections 9–11 were im- ure 3(a) shows the residual Kˆ -function and the two- plemented in the R language, and has been publicly standard-deviation limits, where the surrogate stan- released in the spatstat library [6]. Unless other- dard deviation is the square root of (37). Figure 3(b) wise stated, models were fitted by approximate max- shows the corresponding standardized residual Kˆ - imum pseudo-likelihood using the algorithm of [5] function obtained by dividing by the surrogate stan- with the default quadrature scheme in spatstat, dard deviation. having an m m grid of dummy points where m = × Although this model is of the correct form, the max(25, 10[1+2 n(x)/10]) was equal to 40 for most standardized residual exceeds 2 for small values of r. of our examples. over the domain W were p This is consistent with the prediction in Section 9.4 approximated by finite sums over the quadrature that the variance approximation would be inaccu- points. Some models were refitted using a finer grid rate for small r. The null model is a nonstationary of dummy points, usually 80 80. In addition to × Poisson process; the minimum value of the inten- maximum pseudo-likelihood estimation, the software sity is 200. Taking ρ = 200 and applying the rule of also supports the Huang–Ogata [37] approximate thumb in Section 9.4 gives maximum likelihood. 1 12.3 Application of Kˆ Diagnostics rnn = = 0.04, √200π 12.3.1 Diagnostics for correct model First we fit- suggesting that the Poincar´evariance estimate be- ted a point process model of the correct form (D). comes unreliable for r 0.04 approximately. The fitted parameter values wereγ ˆ = 0.217 and βˆ = Formal significance≤ interpretation of the critical (5.6, 0.46, 3.35, 2.05) using the coarse grid of dum- bands in Figure 3(b) is limited, because the null dis- − my points, andγ ˆ = 0.170 and βˆ = (5.6, 0.64, 4.06, tribution of the standardized residual is not known 2.44) using the finer grid of dummy points,− as against exactly, and the values 2 are approximate point- the true values γ = 0.1 and β = (5.29, 2, 2, 3). wise critical values, that± is, critical values for the DIAGNOSTICS FOR SPATIAL POINT PROCESSES 17

(a) (b) Fig. 4. Null distribution of standardized residual of Kˆ . Pointwise 2.5% and 97.5% quantiles (grey shading) and sample mean (dotted lines) of T Kˆ from 1000 simulated realizations of model (D) with estimated parameter values (a)γ ˆ = 0.217 and βˆ = (5.6, −0.46, 3.35, 2.05) using a 40 × 40 grid of dummy points; (b)γ ˆ = 0.170 and βˆ = (5.6, −0.64, 4.06, 2.44) using a 80 × 80 grid. score test based on fixed r. The usual problems of to capture spatial trend. The correct model (D) is multiple testing arise when the test statistic is con- judged to be a good fit. sidered as a function of r; see [29], page 14. For very The interpretation of this example requires some small r there are small-sample effects so that a nor- caution, because the residual Kˆ -function of the fit- mal approximation to the null distribution of the ted Strauss models (C) and (D) is constrained to standardized residual is inappropriate. be approximately zero at r = R = 0.05. The max- To confirm this, Figure 4 shows the pointwise 2.5% imum pseudo-likelihood fitting algorithm solves an and 97.5% quantiles of the null distribution of Kˆ , estimating equation that is approximately equiva- obtained by extensive simulation. The sample meanT lent to this constraint, because of (43). of the simulated Kˆ is also shown, and indicates It is debatable which of the presentations in Fig- that the expectedT standardized residual is nonzero ure 5 is more effective at revealing lack of fit. A com- for small values of r. Repeating the computation pensator plot such as Figure 5(a) seems best at cap- with a finer grid of quadrature points (for approx- turing the main differences between competing mod- imating integrals over W involved in the pseudo- els. It is particularly useful for recognizing a gross likelihood and the residuals) reduces the bias, sug- lack of fit. A residual plot such as Figure 5(b) seems gesting that this is a discretization artefact. better for making finer comparisons of model fit, for example, assessing models with slightly different 12.3.2 Comparison of competing models Figu- ranges of interaction. A standardized residual plot re 5(a) shows the empirical K-function and its com- such as Figure 5(c) tends to be highly irregular for pensator for each of the models (A)–(D) in Sec- small values of r, due to discretization effects in the tion 12.1. Figure 5(b) shows the corresponding resid- computation and the inherent nondifferentiability of ual plots, and Figure 5(c) the standardized residuals. the empirical statistic. In difficult cases we may ap- A positive or negative value of the residual suggests ply smoothing to the standardized residual. that the data are more clustered or more inhibited, Gˆ respectively, than the model. The clear inference is 12.4 Application of Diagnostics that the Poisson models (A) and (B) fail to capture 12.4.1 Diagnostics for correct model Consider interpoint inhibition at range r 0.05, while the ho- again the model of the correct form (D). The resid- mogeneous Strauss model (C) is≈ less clustered than ual and compensator of the empirical nearest neigh- the data at very large scales, suggesting that it fails bor function Gˆ for the fitted model are shown in 18 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b)

(c)

Fig. 5. Model diagnostics based on pairwise distances, for each of the models (A)–(D) fitted to the data in Figure 1(b). (a) Kˆ and its compensator under each model. (b) Residual Kˆ -function (empirical minus compensator) under each model. (c) Standardized residual Kˆ -function under each model.

Figure 6. The residual plot suggests a marginal lack that the 2 limits are not trustworthy for r< 0.05 ± of fit for r< 0.025. This may be correct, since the fit- approximately. ted model parameters (Section 12.3.1) are marginally Figure 7 shows the pointwise 2.5% and 97.5% quan- poor estimates of the true values, in particular, of tiles of the null distribution of Gˆ. Again, there is T the interaction parameter. This was not reflected so a suggestion of bias for small values of r which ap- strongly in the Kˆ diagnostics. This suggests that the pears to be a discretization artefact. residual of Gˆ may be particularly sensitive to lack of fit of interaction. 12.4.2 Comparison of competing models For each Applying the rule of thumb in Section 10.3, we of the four models, Figure 8(a) shows Gˆ and its have rcrit = 0.044, agreeing with the interpretation Papangelou compensator. This clearly shows that DIAGNOSTICS FOR SPATIAL POINT PROCESSES 19

(a) (b)

Fig. 6. Residual diagnostics obtained from the perturbing Gˆ-model when the data pattern is a realization of an inhomogeneous Strauss process. (a) Gˆ and its compensator under a fitted model of the correct form, and theoretical G-function for a Poisson process. (b) Residual Gˆ-function and two-standard-deviation limits under the fitted model of the correct form. the Poisson models (A) and (B) fail to capture in- Figure 8(b) shows the standardized residual of Gˆ, terpoint inhibition in the data. The Strauss mod- and Figure 8(c) the pseudo-residual of VG (i.e., the els (C) and (D) appear virtually equivalent in Fig- pseudo-residual based on the pertubing Geyer mo- ure 8(a). del), with spline smoothing applied to both plots.

(a) (b)

Fig. 7. Null distribution of standardized residual of Gˆ. Pointwise 2.5% and 97.5% quantiles (grey shading) and sam- ple mean (dotted lines) from 1000 simulated realizations of model (D) with estimated parameter values (a)γ ˆ = 0.217 and βˆ = (5.6, −0.46, 3.35, 2.05) using a 40 × 40 grid of dummy points; (b)γ ˆ = 0.170 and βˆ = (5.6, −0.64, 4.06, 2.44) using a 80 × 80 grid. 20 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b)

(c)

Fig. 8. Diagnostics based on nearest neighbor distances, for the models (A)–(D) fitted to the data in Figure 1(b). (a) Compen- sator for Gˆ. (b) Smoothed standardized residual of Gˆ. (c) Smoothed pseudo-residual derived from a perturbing Geyer model.

The Strauss models (C) and (D) appear virtually of the models (C)–(D) provide a better fit. Despite equivalent in Figure 8(c). The standardized residual the close connection between the area-interaction plot Figure 8(b) correctly suggests a slight lack of process and the Fˆ-model, the diagnostic in Figu- fit for model (C) while model (D) is judged to be re 9(b) based on the Fˆ-model performs better in a reasonable fit. this particular example and correctly shows (D) is the best fit to data. In both cases it is noticed that 12.5 Application of Fˆ Diagnostics the pseudo-sum has a much higher peak than the Figure 9 shows the pseudo-residual diagnostics pseudo-compensators for the Poisson models (A)– based on empty space distances. Both diagnostics (B), correctly suggesting that these models do not clearly show models (A)–(B) are poor fits to data. capture the strength of inhibition present in the However, in Figure 9(a) it is hard to decide which data. DIAGNOSTICS FOR SPATIAL POINT PROCESSES 21

(a) (b)

Fig. 9. Pseudo-sum and pseudo-compensators for the models (A)–(D) fitted to the data in Figure 1(b) when the perturbing model is (a) the area-interaction process (null fitted on a fine grid) and (b) the Fˆ-model (null fitted on a coarse grid).

13. TEST CASE: CLUSTERING WITHOUT 13.2 Application of Kˆ Diagnostics TREND A plot (not shown) of the Kˆ -function and its com- 13.1 Data and Models pensator, under each of the three models (E)–(G), demonstrates clearly that the homogeneous Poisson Figure 1(c) is a realization of a homogeneous Geyer model (E) is a poor fit, but does not discriminate saturation process [31] on the unit square, with first between the other models. order term λ = exp(4), saturation threshold s = 4.5 Figure 10 shows the residual Kˆ and the smoothed and interaction parameters r = 0.05 and γ = standardized residual Kˆ for the three models. These exp(0.4) 1.5, that is, the density is diagnostics show that the homogeneous Poisson mo- ≈ del (E) is a poor fit, with a positive residual suggest- (56) f(x) exp(n(x) log λ + VG,s(x, r) log γ), ∝ ing correctly that the data are more clustered than where the Poisson process. The plots suggests that both models (F) and (G) are considerably better fits to VG,s(x, r)= min s, I xi xj r . the data than a Poisson model. They show that (G) {k − k≤ } i  j:j6=i  is a better fit than (F) over a range of r values, and X X suggest that (G) captures the correct form of the This is an example of moderately strong clustering interaction. (with interaction range R = 2r = 0.1) without trend. Gˆ The main challenge here is to correctly identify the 13.3 Application of Diagnostics range and type of interaction. Figure 11 shows Gˆ and its compensator, and the We fitted three point process models to the data: corresponding residuals and standardized residuals, (E) a homogeneous Poisson process (CSR); (F) a ho- for each of the models (E)–(G) fitted to the clus- mogeneous area-interaction process with disc radius tered point pattern in Figure 1(c). The conclusions r = 0.05; (G) a homogeneous Geyer saturation pro- obtained from Figure 11(a) are the same as those cess of the correct form, with interaction parame- in Section 13.2 based on Kˆ and its compensator. ter r = 0.05 and saturation threshold s = 4.5 while Figure 12 shows the smoothed pseudo-residual di- the parameters λ and γ in (56) are unknown. The agnostics based on the nearest neighbor distances. parameter estimates for (G) were log λˆ = 4.12 and The message from these diagnostics is very similar logγ ˆ = 0.38. to that from Figure 11. 22 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b)

Fig. 10. Model diagnostics based on pairwise distances for each of the models (E)–(G) fitted to the data in Figure 1(c). (a) Residual Kˆ ; (b) smoothed standardized residual Kˆ .

Models (F) and (G) have the same range of in- 13.4 Application of Fˆ Diagnostics teraction R = 0.1. Comparing Figures 10 and 11, we might conclude that the Gˆ-compensator appears Figure 13 shows the pseudo-residual diagnostics less sensitive to the form of interaction than the Kˆ - based on the empty space distances, for the three compensator. Other suggest that Gˆ is models fitted to the clustered point pattern in Fig- more sensitive than Kˆ to discrepancies in the range ure 1(c). In this case diagnostics based on the area- of interaction. interaction process and the Fˆ-model are very simi-

(a) (b)

Fig. 11. Model diagnostics based on nearest neighbor distances for each of the models (E)–(G) fitted to the data in Figure 1(c). (a) Gˆ and its compensator under each model; (b) smoothed standardized residual Gˆ. DIAGNOSTICS FOR SPATIAL POINT PROCESSES 23

(a) (b) Fig. 12. Smoothed pseudo-residuals for each of the models (E)–(G) fitted to the clustered point pattern in Figure 1(c) when the perturbing model is (a) the Geyer saturation model with saturation 1 and (b) the Gˆ-model. lar, as expected due to the close connection between a real parameter; (J) the Ogata–Tanemura mo- the two diagnostics. Here it is very noticeable that del (57). For more detail on the data set and the the pseudo-compensator for the Poisson model has fitted inhomogeneous soft core model, see [7, 56]. a higher peak than the pseudo-sum, which correctly A complication in this case is that the soft core indicates that the data is more clustered than a Pois- process (57) is not Markov, since the pair poten- son process. tial c(u, v) = exp( σ4/ u v 4) is always positive. Nevertheless, since− thisk function− k decays rapidly, it 14. TEST CASE: JAPANESE PINES seems reasonable to apply the residual and pseudo- 14.1 Data and Models residual diagnostics, using a cutoff distance R such that log c(u, v) ε when u v R, for a spec- Figure 1(a) shows the locations of seedlings and ified| tolerance ε|≤. The cutoffk depends− k≤ on the fitted saplings of Japanese black pine, studied by Numata parameter value σ2. We chose ε = 0.0002, yielding [53, 54] and analyzed extensively by Ogata and Tane- R = 1. Estimated interaction parameters wereσ ˆ2 = mura [55, 56]. In their definitive analysis [56] the fit- 0.11 for model (I) andσ ˆ2 = 0.12 for model (J). ted model was an inhomogeneous “soft core” pair- 14.2 Application of Kˆ Diagnostics wise interaction process with log-cubic first order A plot (not shown) of Kˆ and its compensator for term λβ(x,y) = exp(Pβ(x,y)), where Pβ is a cubic polynomial in x and y with coefficient vector β, and each of the models (H)–(J) suggests that the homo- density geneous soft core model (I) is inadequate, while the inhomogeneous models (H) and (J) are reasonably good fits to the data. However, it does not discrim- f(β,σ2)(x)= c(β,σ2) exp Pβ(xi)  i inate between the models (H) and (J). (57) X Figure 14 shows smoothed versions of the residual 4 4 and standardized residual of Kˆ for each model. The (σ / xi xj ) , − k − k Ogata–Tanemura model (J) is judged to be the best i

(a) (b) Fig. 13. Pseudo-sum and pseudo-compensators for the models (E)–(G) fitted to the clustered point pattern in Figure 1(c) when the perturbing model is (a) the area-interaction process and (b) the Fˆ-model. the same as those based on Kˆ shown in Figure 14. ering Figures 14, 15 and 16, the Ogata–Tanemura Figure 16 shows the pseudo-residuals when using model (J) provides the best fit. either a perturbing Geyer model [Figure 16(a)] or 14.4 Application of Fˆ diagnostics a perturbing Gˆ-model [Figure 16(b)]. Figures 16(a)– (b) tell almost the same story: the inhomogeneous Finally, the empty space pseudo-residual diagnos- Poisson model (H) provides the worst fit, while it tics are shown in Figure 17 for the Japanese Pines is difficult to discriminate between the fit for the data in Figure 1(a). This gives a clear indication soft core models (I) and (J). In conclusion, consid- that the Ogata–Tanemura model (J) is the best fit

(a) (b) Fig. 14. Model diagnostics based on pairwise distances for each of the models (H)–(J) fitted to the Japanese pines data in Figure 1(a). (a) Smoothed residual Kˆ ; (b) smoothed standardized residual Kˆ . DIAGNOSTICS FOR SPATIAL POINT PROCESSES 25

(a) (b)

(c) Fig. 15. Model diagnostics based on nearest neighbour distances for each of the models (H)–(J) fitted to the Japanese pines data in Figure 1(a). (a) Gˆ and its compensator; (b) smoothed residual Gˆ; (c) smoothed standardised residual Gˆ. to the data, and the data pattern appears to be one diagnostic when validating a model. It is well too regular compared to the Poisson model (H) and known that Kˆ is sensitive to features at a larger scale not regular enough for the homogeneous softcore than Gˆ and Fˆ. Compensator and pseudo-compensa- model (I). tor plots are informative for gaining an overall pic- ture of model validity, and tend to make it easy 15. SUMMARY OF TEST CASES to recognize a poor fit when comparing competing In this section we discuss which of the diagnostics models. To compare models which fit closely, it may we prefer to use based on their behavior for the three be more informative to use (standardized) residu- test cases in Sections 12–14. als or pseudo-residuals. We prefer to use the stan- Typically, the various diagnostics supplement each dardized residuals, but it is important not to over- other well, and it is recommended to use more than interpret the significance of departure from zero. 26 A. BADDELEY, E. RUBAK AND J. MØLLER

(a) (b) Fig. 16. Smoothed pseudo-residuals for each of the models (H)–(J) fitted to the Japanese pines data in Figure 1(a) when the perturbing model is (a) the Geyer saturation model with saturation 1 (null fitted on a fine grid) and (b) the Gˆ-model.

Based on the test cases here, it is not clear whether instance, for the first test case (Section 12) the best diagnostics based on pairwise distances, nearest compensator plot is that in Figure 5(a) based on neighbor distances, or empty space distances are pairwise distances (Kˆ and Kˆ ) which makes it easy preferable. However, for each of these we prefer to to identify the correct model.C On the other hand, work with compensators and residuals rather than in this test case the best residual type plot is that pseudo-compensators and pseudo-residuals when pos- in Figure 8(b) based on nearest neighbor distances sible (i.e., it is only necessary to use pseudo-versions ( Gˆ) where the correct model is the only one within for diagnostics based on empty space distances). For theT critical bands. However, in the third test case

(a) (b) Fig. 17. Pseudo-sum and pseudo-compensators for the models (H)–(J) fitted to the real data pattern in Figure 1(a) when the perturbing model is (a) the area-interaction process and (b) the Fˆ-model. DIAGNOSTICS FOR SPATIAL POINT PROCESSES 27

(Section 14) the best compensator plot is one of which therefore are not accompanied by experimen- the plots in Figure 17 with pseudo-compensators tal results. based on empty space distances (Σ∆VA and ∆VA ˆ ˆ C A.1 Third and Higher Order Functional or Σ∆F and ∆F , respectively) which clearly indi- Summary Statistics cates which modelC is correct. In the first and third test cases (Sections 12 While the intensity and K-function are frequently- and 14), which both involve inhomogeneous models, used summaries for the first and second order mo- it is clear that Kˆ and its compensator are more sen- ment properties of a spatial point process, third and sitive to lack of fit in the first order term than Gˆ and higher order summaries have been less used [49, 67, its compensator [compare, e.g., the results for the 70, 72]. homogeneous model (C) in Figures 5(a) and 8(b)]. Statistic of order k For a functional summary sta- It is our general experience that diagnostics based tistic of kth order, say, on Kˆ are particularly well suited to assess the pres- ence of interaction and to identify the general form (58) S(x, r)= q( xi ,...,xi , r), { 1 k } of interaction. Diagnostics based on Kˆ and, in par- {xi ,...,xi }⊆x 1 Xk ticular, on Gˆ seem to be good for assessing the range we obtain of interaction. Finally, it is worth mentioning the computational Σ∆S(x, r) difference between the various diagnostics (timed on (59) = k!S(x, r) a 2.5 GHz laptop). The calculations for Kˆ and Kˆ C used in Figure 2 are carried out in approximately = k! q( xi ,...,xi , r), { 1 k } five seconds, whereas the corresponding calculations {x ,...,x }⊆x i1 Xik for Gˆ and Gˆ only take a fraction of a second. For Σ∆Fˆ and C∆Fˆ, for example, the calculations take ∆S(x, r) C C about 45 seconds. = k! S(x, r) C (60) = (k 1)! 16. POSSIBLE EXTENSIONS − The definition of residuals and pseudo-residuals q( xi1 ,...,xik−1 , u , r) should extend immediately to marked point pro- · W x { } Z {xi ,...,xi }⊆ cesses. For space–time point processes, residual di- 1 Xk−1 agnostics can be defined using the spatiotemporal λ (u, x)du, conditional intensity (i.e., given the past history). · θˆ Pseudo-residuals are unnecessary because the likeli- ˆ hood of a general space–time point process is a prod- PU(θ, r) (61) uct integral (Mazziotto–Szpirglas identity). In the = k! S(x, r)= k!S(x, r) k! S(x, r), space–time case there is a martingale structure in R − C time, which gives more hope of rigorous asymptotic where i1, i2,... are pairwise distinct in the sums results in the temporal (long-run) limit regime. in (59)–(60). In this case again, pseudo-residual di- Residuals can be derived from many other sum- agnostics are equivalent to those based on residuals. mary statistics. Examples include third-order and Third order example For a stationary and isotropic higher-order moments (Appendix A.1), tessellation point process (i.e., when the distribution of X is statistics (Appendix A.2), and various combinations invariant under translations and rotations), the in- of F , G and K. tensity and K-function of the process completely In the definition of the extended model (25) the determine its first and second order moment prop- canonical statistic S could have been allowed to de- erties. However, even in this case, the simplest de- pend on the nuisance parameter θ, but this would scription of third order moments depends on a three- have complicated our notation and some analysis. dimensional vector specified from triplets (xi,xj,xk) of points from X such as the lengths and angle be- APPENDIX A: FURTHER DIAGNOSTICS tween the vectors xi xj and xj xk. This is often In this appendix we present other diagnostics considered too complex,− and instead− one considers which we have not implemented in software, and a certain one-dimensional property of the triangle 28 A. BADDELEY, E. RUBAK AND J. MØLLER

T (xi,xj,xk) as exemplified below, where L(xi,xj,xk) depends not only on the points in x which are Dirich- denotes the largest side in T (xi,xj,xk). let neighbors to u (with respect to x∪{u}) but also Let the canonical sufficient statistic of the per- on the Dirichlet neighbors to those∼ points (with re- turbing density (27) be spect to x∪{u} or with respect to x\{u}). In other words, if∼ we define the iterated Dirichlet∼ neighbor re- S(x, r)= VT (x, r) 2 lation by that xi x xj if there exists some xk such (62) ∼ x x x = I L(xi,xj,xk) r . that xi xk and xj xk, then t(u, ) depends on { ≤ } ∼ ∼ i

2 VarI = E[h(u, X) λ(u, X)] du (67) SW (x, r)= sW (xi, x xi , r) \ { } W x ∈x (65) Z Xi + E[A(u, v, X)+ B(u, v, X)] du dv, with compensator (in the unconditional case) 2 ZW SW (x, r)= sW (u, x, r)λ (u, x)du. where C θˆ ZW A(u, v, X)=∆uh(v, X)∆vh(u, X)λ2(u, v, X), We explore two different strategies for modifying the edge correction. B(u, v, X)= h(u, X)h(v, X) In the restriction approach, we replace W by W ◦ ◦ ◦ λ(u, X)λ(v, X) λ2(u, v, X) , and x by x = x W , yielding · { − } ∩ ◦ where λ2(u, v, x)= λ(u, x)λ(v, x u ) is the second SW ◦ (x, r)= sW ◦ (xi, x xi , r), ∪{ } \ { } order Papangelou conditional intensity. Note that x ∈x◦ Xi for a Poisson process B(u, v, X) is identically zero. (68) ◦ ◦ + SW ◦ (x, r)= sW ◦ (u, x , r)λθˆ(u, x x )du. B.2 Pseudo-Score C ◦ | ZW Let S(x, z) be a functional summary statistic with Data points in the boundary region W + are ignored function argument z. Take h(u, X)=∆uS(x, z). in the calculation of the empirical statistic SW ◦ . The Then the innovation I is the pseudo-score (23), and boundary configuration x+ = x W + contributes ∩ the variance formula (65) becomes only to the estimation of θ and the calculation of the Papangelou conditional intensity λ ( , x+). This has V θˆ ar[PU(θ)] the advantage that the modified empirical· ·| statis- tic (68) is identical to the standard statistic S com- E 2 = [(∆uS(X, z)) λ(u, X)] du puted on the subdomain W ◦; it can be computed W Z using existing software, and requires no new theo- 2 retical justification. The disadvantage is that we lose (66) + E[(∆u∆vS(X, z)) λ2(u, v, X)] du dv W 2 information by discarding some of the data. Z In the reweighting approach we retain the bound- + E[∆uS(x, z)∆vS(x, z) ary points and compute 2 ZW SW ◦,W (x, r)= sW ◦,W (xi, x xi , r), λ(u, X)λ(v, X) λ2(u, v, X) ]du dv, \ { } · { − } x ∈x◦ Xi where for u = v and u, v x = ∅, 6 { }∩ ◦ + SW ◦,W (x, r)= sW ◦,W (u, x, r)λθˆ(u, x x )du, ∆u∆vS(x, z)= S(x u, v , z) S(x u , z) C W ◦ | ∪ { } − ∪ { } Z S(x v , z)+ S(x, z) where sW ◦,W ( ) is a modified version of sW ( ). Bound- − ∪ { } ary points contribute· to the computation of· the mod- satisfies ∆u∆vS(x, z)=∆v∆uS(x, z). ified summary statistic SW ◦,W and its compensator. 30 A. BADDELEY, E. RUBAK AND J. MØLLER

The modification is designed so that SW ◦,W has Estimators of the form (69) satisfy the local decom- properties analogous to SW . position (67) where The K-function and G-function of a point pro- 2 1 cess Y in R are defined [63, 64] under the assump- sW (u, x, r)= ρˆ2(x u ) W tion that Y is second order stationary and strictly ∪ { } | | stationary, respectively. The standard estima- eW (u, xj)I u xj r , u / x. · {k − k≤ } ∈ tors KˆW (r) and Gˆx(r) of the K-function and G- j function, respectively, are designed to be approxi- X Now we wish to modify (69) so that the outer sum- mately pointwise unbiased estimators when applied ◦ mation is restricted to data points xi in W W , to X = Y W . ⊂ We do∩ not necessarily assume stationarity, but while retaining the property of unbiasedness for sta- when constructing modified summary statistics tionary and isotropic point processes. The restric- tion estimator is KˆW ◦,W (r) and GˆW ◦,W (r), we shall require that they are also approximately pointwise unbiased estima- KˆW ◦ (r) tors of K(r) and G(r), respectively, when Y is sta- 1 tionary. This greatly simplifies the interpretation of (72) = ˆ ˆ ρˆ2(x◦) W ◦ plots of KW ◦,W (r) and GW ◦,W (r) and their compen- | | sators. eW ◦ (xi,xj)I xi xj r , · ◦ ◦ {k − k≤ } xi∈x xj ∈x APPENDIX D: MODIFIED EDGE X X−i CORRECTIONS FOR THE K-FUNCTION where the edge correction weight is given by (70) or (71) with W replaced by W ◦. A more efficient D.1 Horvitz–Thompson Estimators alternative is to replace (69) by the reweighting es- The most common nonparametric estimators of timator the K-function [9, 57, 63] are continuous Horvitz– KˆW ◦,W (r) Thompson type estimators [8, 20] of the form 1 ˆ ˆ (73) = K(r)= KW (r) ρˆ2(x) W ◦ (69) | | 1 I eW ◦,W (xi,xj)I xi xj r , = 2 eW (xi,xj) xi xj r . x · ◦ {k − k≤ } ρˆ ( ) W {k − k≤ } xi∈x xj ∈x−i | | Xi6=j X X 2 2 where eW ◦,W (u, v) is a modified version of eW ( ) Hereρ ˆ =ρ ˆ (x) should be an approximately unbi- · ased estimator of the squared intensity ρ2 under constructed so that the double sum in (73) is unbi- stationarity. Usuallyρ ˆ2(x)= n(n 1)/ W 2 where ased for Y (r). Compared to the restriction estima- n = n(x). − | | tor (72), the reweighting estimator (73) contains ad- The term eW (u, v) is an edge correction weight, ditional contributions from point pairs (xi,xj) where ◦ + xi x and xj x . depending on the geometry of W , designed so that ∈ ∈ ˆ 2 ˆ The modified edge correction factor eW ◦,W ( ) the double sum in (69), say, Y (r) =ρ ˆ (x) W K(r), · is an unbiased estimator of Y (r)= ρ2 W K|(r).| Pop- for (73) is the Horvitz–Thompson weight [9] in an ular examples are the Ohser–Stoyan translation| | edge appropriate context. Ripley’s [63, 64] iso- correction with tropic correction (71) is derived assuming isotropy, by Palm conditioning on the location of the first trans eW (u, v)= eW (u, v) point x , and determining the probability that x (70) i j W would be observed inside W after a random rotation = | | about x . Since the constraint on x is unchanged, W (W + (u v)) i j | ∩ − | no modification of the edge correction weight is re- and Ripley’s isotropic correction with quired, and we take eW ◦,W ( )= eW ( ) as in (71). · · iso Note, however, that the denominator in (73) is eW (u, v)= e (u, v) W changed from W to W ◦ . (71) | | | | 2π u v The Ohser–Stoyan [58] translation correction (70) = k − k . length(∂B(u, u v ) W ) is derived by considering two-point sets (xi,xj) sam- k − k ∩ DIAGNOSTICS FOR SPATIAL POINT PROCESSES 31

◦ ◦ pled under the constraint that both xi and xj are in- Usually, W = W⊖R, so W W⊖r is equal to ◦ ∩ side W . Under the modified constraint that xi W W . From this we conclude that when using ∈ ⊖ max(R,r) and xj W , the appropriate edge correction weight border correction we should always use the reweight- ∈ is ing estimator since the restriction estimator discards additional information and neither the implementa- eW ◦,W (u, v)= eW ◦,W (u v) − tion nor the interpretation is easier. W (W ◦ + (u v)) = | ∩ − | W ◦ APPENDIX E: MODIFIED EDGE | | CORRECTIONS FOR NEAREST NEIGHBOR so that 1/eW ◦,W (z) is the fraction of locations u in W ◦ such that u + z W . FUNCTION G ∈ D.2 Border Correction E.1 Hanisch Estimators A slightly different creature is the border corrected Hanisch [32] considered estimators for G(r) of the x estimator [using usual intensity estimatorρ ˆ = n( )/ form GˆW (r)= Dˆ x(r)/ρˆ, whereρ ˆ is some estimator W ] | | of the intensity ρ, and W I I Kˆ (r)= xi W⊖di di r W | | (74) Dˆ x(r)= { ∈ } { ≤ }, n(x)n(x W⊖r) W ∩ x ∈x ⊖di Xi | | I xi W⊖r I xi xj r · x x { ∈ } {k − k≤ } where di = d(xi, x xi ) is the nearest neighbor dis- xi∈ xj ∈ −i \{ } X X tance for xi. Ifρ ˆ were replaced by ρ, then GˆW (r) with compensator (in the unconditional case) would be an unbiased, Horvitz–Thompson estima- W x I u xj r tor of G(r). See [71], pages 128–129, [9]. Hanisch’s ˆ | | xj ∈ {k − k≤ } KW (r)= recommended estimator D4 is the one in whichρ ˆ is C (n(x) + 1)(n(x W r) + 1) ZW⊖r P ∩ ⊖ taken to be x◦ x+ λθˆ(u, )du. I xi W⊖d · | Dˆ x( )= { ∈ i }. The restriction estimator is ∞ x W⊖di xi∈ | | W ◦ X ˆ ◦ KW (r)= ◦ | | ◦ This is sensible because Dˆ x( ) is an unbiased esti- n(x )n(x W⊖r) ∞ ∩ mator of ρ and is positively correlated with Dˆx(r). I ◦ I xi W⊖r xi xj r The resulting estimator GˆW (r) can be decomposed · ◦ ◦ { ∈ } {k − k≤ } xi∈x x ∈x X jX−i in the form (67) where I I and the compensator is u W⊖d(u,x) d(u, x) r ◦ sW (u, x, r)= { ∈ } { ≤ } W x◦ I u xj r ˆ xj ∈ Dx∪{u}( ) W⊖d(u,x) ˆ ◦ | | {k − k≤ } KW (r)= ◦ ◦ ∞ | | C W ◦ (n(x ) + 1)(n(x W r) + 1) Z ⊖r P ∩ ⊖ for u / x, where d(u, x) is the (“empty space”) dis- ∈ ◦ + tance from location u to the nearest point of x. λˆ(u, x x )du. · θ | Hence, the corresponding compensator is Typically, W ◦ = W , so W ◦ is equal to W . ⊖R ⊖r ⊖(R+r) I I The reweighting estimator is u W⊖d(u,x) d(u, x) r Gˆ (r)= { ∈ } { ≤ } W ˆ W C W Dx∪{u}( ) W⊖d(u,x) ˆ ◦ Z ∞ | | KW ,W (r)= | ◦ | n(x)n(x W⊖r) λ (u, x)du. ∩ · θˆ I xi W⊖r I xi xj r This is difficult to evaluate, since the denominator · { ∈ } {k − k≤ } x ∈x◦ x ∈x Xi jX−i of the integrand involves a summation over all data points: Dx ( ) is not related in a simple way to and the compensator is ∪{u} ∞ I Dx( ). Instead, we chooseρ ˆ to be the conventional W xj ∈x u xj r ∞ Kˆ ◦ (r)= | | {k − k≤ } estimatorρ ˆ = n(x)/ W . Then W ,W ◦ | | C ◦ (n(x)+1)(n(x W⊖r)+1) ZW ∩W⊖r P ∩ W ◦ + GˆW (r)= | | Dˆ x(r), λˆ(u, x x )du. n(x) · θ | 32 A. BADDELEY, E. RUBAK AND J. MØLLER which can be decomposed in the form (67) with E.2 Border Correction Estimator I I W u W⊖d(u,x) d(u, x) r The classical border correction estimate of G is s (u, x, r)= { ∈ } { ≤ } W | | 1 n(x)+1 W⊖d(u,x) ˆ | | GW (r)= n(x W⊖r) for u / x, so that the compensator is (78) ∩ ∈ I xi W r I d(xi, x i) r ˆ · { ∈ ⊖ } { − ≤ } GW (r) x ∈x C Xi W I u W⊖d(u,x) I d(u, x) r (75) = | | { ∈ } { ≤ } with compensator (in the unconditional case) n(x) + 1 W W⊖d(u,x) Z | | ˆ 1 x GW (r)= λθˆ(u, )du. C 1+ n(x W⊖r) · (79) ∩ In the restriction estimator we exclude the bound- I d(u, x) r λ (u, x)du. ary points and take d◦ = d(x , x◦ ), effectively re- θˆ i i −i · W⊖r { ≤ } placing the data set x by its restriction x◦ = x W ◦: Z ∩ In the conditional case, the Papangelou conditional ◦ ◦ ◦ + ◦ I xi W ◦ I d r intensity λ (u, x) must be replaced by λ (u, x x ) W ⊖di i θˆ θˆ ˆ ◦ { ∈ } { ≤ } | GW (r)= | ◦| ◦ . given in (24). The restriction estimator is obtained n(x ) W ◦ x ∈x◦ ⊖d ◦ ◦ Xi | i | by replacing W by W and x by x in (78)–(79), The compensator is (75) but computed for the point yielding ◦ ◦ pattern x in the window W : ˆ 1 GW ◦ (r)= ◦ n(x W⊖r) GˆW ◦ (r) ∩ C I ◦ I ◦ ◦ I ◦ I x◦ xi W⊖r d(xi, x−i) r , W u W⊖d(u,x◦) d(u, ) r · { ∈ } { ≤ } { ∈ } { ≤ } x ∈x◦ = | ◦ | ◦ i n(x ) + 1 ◦ W X ZW | ⊖d(u,x◦)| ˆ 1 ◦ + GW ◦ (r)= ◦ λˆ(u, x x )du. C 1+ n(x W ) · θ | ∩ ⊖r ◦ ◦ In the usual case W = W⊖R, we have W⊖d = I ◦ ◦ + d(u, x ) r λθˆ(u, x x )du. W . · W ◦ { ≤ } | ⊖(R+d) Z ⊖r In the reweighting estimator we take di = d(xi, x ◦ ◦ \ Typically, W = W⊖R so that W⊖r = W⊖(R+r). The xi ). To retain the Horvitz–Thompson property, { } reweighting estimator is obtained by restricting xi we must replace the weights 1/ W⊖di in (74) by ◦ ◦ | | and u in (78)–(79) to lie in W , yielding 1/ W W⊖di . Thus, the modified statistics are | ∩ | 1 I I GˆW ◦,W (r)= W xi W⊖di di r ◦ (76) Gˆ ◦ (r)= | | { ∈ } { ≤ } n(x W⊖r) W ,W ◦ ∩ n(x) x◦ W W⊖di xi∈ | ∩ | X I xi W r I d(xi, x i) r , · { ∈ ⊖ } { − ≤ } x ∈x◦ and Xi 1 Gˆ ◦ (r) ˆ ◦ W ,W GW ,W (r)= ◦ C C 1+ n(x W⊖r) W I u W d u,x I d(u, x) r ∩ { ∈ ⊖ ( )} { ≤ } (77) = | | ◦ I n(x) + 1 W ◦ W W⊖d(u,x) d(u, x) r ◦ Z | ∩ | · W ∩W⊖r { ≤ } ◦ + Z λˆ(u, x x )du. ◦ + · θ | λˆ(u, x x )du. · θ | ◦ ◦ In the usual case where W = W R, we have W ◦ ◦ ⊖ ∩ In the usual case where W = W⊖R, we have W W d = W . ∩ ⊖ i ⊖ max(R,di) W⊖r = W⊖ max(R,r). Again, the reweighting approach Optionally, we may also replace W /n(x) in (76) is preferable to the restriction approach. ◦ ◦ | | by W /n(x W ), and, correspondingly, replace The border corrected estimator Gˆ(r) has relatively |W|| | ∩ in (77) by W ◦ /(n(x W ◦) + 1). poor performance and sample properties [38], pa- n(x)+1 | | ∩ DIAGNOSTICS FOR SPATIAL POINT PROCESSES 33 ge 209. Its main advantage is its computational ef- [12] Barnard, G. (1963). Discussion of “The spectral analy- ficiency in large data sets. Similar considerations sis of point processes” by M. S. Bartlett. J. R. Stat. should apply to its compensator. Soc. Ser. B Stat. Methodol. 25 294. [13] Berman, M. (1986). Testing for spatial association be- ACKNOWLEDGMENTS tween a point process and another stochastic pro- cess. J. Roy. Statist. Soc. Ser. C 35 54–62. This paper has benefited from very fruitful discus- [14] Besag, J. (1978). Some methods of statistical analysis sions with Professor Rasmus P. Waagepetersen. We for spatial data. Bull. Int. Statist. Inst. 44 77–92. also thank the referees for insightful comments. The [15] Chen, C. (1983). Score tests for regression models. J. Amer. Statist. Assoc. 78 158–161. research was supported by the University of West- [16] Chetwynd, A. G. and Diggle, P. J. (1998). On es- ern Australia, the Danish Natural Science Research timating the reduced second moment measure of Council (Grants 272-06-0442 and 09-072331, Point a stationary spatial point process. Aust. N. Z. J. process modeling and ), the Dan- Stat. 40 11–15. MR1628212 ish Agency for Science, Technology and Innovation [17] Coeurjolly, J. F. and Lavancier, F. (2010). Resid- (Grant 645-06-0528, International Ph.D. student) uals for stationary marked Gibbs point processes. and by the Centre for Stochastic Geometry and Ad- Preprint. Available at http://arxiv.org/abs/ 1002.0857 vanced Bioimaging, funded by a grant from the Vil- . [18] Conniffe, D. (2001). Score tests when a nuisance pa- lum Foundation. rameter is unidentified under the null hypothesis. J. Statist. Plann. Inference 97 67–83. MR1851375 REFERENCES [19] Cook, R. D. and Weisberg, S. (1983). Diagnostics for in regression. Biometrika 70 1– [1] Alm, S. E. (1998). Approximation and simulation of 10. MR0742970 the distributions of scan statistics for Poisson pro- [20] Cordy, C. B. (1993). An extension of the Horvitz– cesses in higher dimensions. Extremes 1 111–126. Thompson theorem to point sampling from a con- MR1652932 tinuous universe. Statist. Probab. Lett. 18 353–362. [2] Atkinson, A. C. (1982). Regression diagnostics, trans- MR1247446 formations and constructed variables (with dis- [21] Cox, D. R. (1972). The statistical analysis of depen- 44 cussion). J. Roy. Statist. Soc. Ser. B 1–36. dencies in point processes. In Stochastic Point Pro- MR0655369 cesses: Statistical Analysis, Theory, and Applica- [3] Baddeley, A. (1980). A limit theorem for statistics tions (Conf., IBM Res. Center, Yorktown Heights, 12 of spatial data. Adv. in Appl. Probab. 447–461. N.Y., 1971) 55–66. Wiley-Interscience, New York. MR0569436 MR0375705 Baddeley, A. Møller, J. Pakes, A. G. [4] , and (2008). [22] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Properties of residuals for spatial point processes. Statistics. Chapman & Hall, London. MR0370837 Ann. Inst. Statist. Math. 60 627–649. MR2434415 [23] Cressie, N. A. C. (1991). Statistics for Spatial Data. [5] Baddeley, A. and Turner, R. (2000). Practical max- Wiley, New York. MR1127423 imum pseudolikelihood for spatial point patterns [24] Daley, D. J. and Vere-Jones, D. (2003). An Intro- (with discussion). Aust. N. Z. J. Stat. 42 283–322. duction to the Theory of Point Processes. Volume I: MR1794056 Elementary Theory and Methods, 2nd ed. Springer, [6] Baddeley, A. and Turner, R. (2005). Spatstat: An New York. MR1950431 R package for analyzing spatial point patterns. Davies, R. B. J. Statist. Software 12 1–42. [25] (1977). Hypothesis testing when a nui- [7] Baddeley, A., Turner, R., Møller, J. and Hazel- sance parameter is present only under the alterna- 64 ton, M. (2005). Residual analysis for spatial point tive. Biometrika 247–254. MR0501523 processes (with discussion). J. R. Stat. Soc. Ser. B [26] Davies, R. B. (1987). Hypothesis testing when a nui- Stat. Methodol. 67 617–666. MR2210685 sance parameter is present only under the alterna- 74 [8] Baddeley, A. J. (1993). Stereology and survey sam- tive. Biometrika 33–43. MR0885917 pling theory. Bull. Int. Statist. Inst. 50 435–449. [27] Diggle, P. J. (1979). On parameter estimation and [9] Baddeley, A. J. (1999). Spatial sampling and cen- goodness-of-fit testing for spatial point patterns. soring. In Stochastic Geometry (Toulouse, 1996). Biometrika 35 87–101. Monogr. Statist. Appl. Probab. 80 37–78. Chapman [28] Diggle, P. J. (1985). A kernel method for smoothing & Hall/CRC, Boca Raton, FL. MR1673114 point process data. J. Roy. Statist. Soc. Ser. C 34 [10] Baddeley, A. J. and Møller, J. (1989). Nearest- 138–147. neighbour Markov point processes and random sets. [29] Diggle, P. J. (2003). Statistical Analysis of Spatial Int. Stat. Rev. 57 89–121. Point Patterns, 2nd ed. Hodder Arnold, London. [11] Baddeley, A. J. and van Lieshout, M. N. M. (1995). [30] Georgii, H.-O. (1976). Canonical and grand canonical Area-interaction point processes. Ann. Inst. Statist. Gibbs states for continuum systems. Comm. Math. Math. 47 601–619. MR1370279 Phys. 48 31–51. MR0411497 34 A. BADDELEY, E. RUBAK AND J. MØLLER

[31] Geyer, C. (1999). Likelihood inference for spatial [47] Lawson, A. B. (1993). On the analysis of mortality point processes. In Stochastic Geometry (Toulouse, events around a prespecified fixed point. J. Roy. 1996). Monogr. Statist. Appl. Probab. 80 79– Statist. Soc. Ser. A 156 363–377. 140. Chapman & Hall/CRC, Boca Raton, FL. [48] Lotwick, H. W. and Silverman, B. W. (1982). Meth- MR1673118 ods for analysing spatial processes of several types [32] Hanisch, K. H. (1984). Some remarks on estimators of of points. J. Roy. Statist. Soc. Ser. B 44 406–413. the distribution function of nearest neighbour dis- MR0693241 tance in stationary spatial point processes. Math. [49] Møller, J., Syversveen, A. R. and Waagepeter- Operationsforsch. Statist. Ser. Statist. 15 409–412. sen, R. P. (1998). Log Gaussian Cox processes. MR0756346 Scand. J. Statist. 25 451–482. MR1650019 [33] Hansen, B. E. (1996). Inference when a nuisance pa- [50] Møller, J. and Waagepetersen, R. P. (2004). rameter is not identified under the null hypothesis. Statistical Inference and Simulation for Spatial Econometrica 64 413–430. MR1375740 Point Processes. Monogr. Statist. Appl. Probab. [34] Heinrich, L. (1988). Asymptotic behaviour of an em- 100. Chapman & Hall/CRC, Boca Raton, FL. pirical nearest-neighbour distance function for sta- MR2004226 tionary Poisson cluster processes. Math. Nachr. 136 [51] Møller, J. and Waagepetersen, R. P. (2007). Mod- 131–148. MR0952468 ern statistics for spatial point processes. Scand. J. [35] Heinrich, L. (1988). Asymptotic Gaussianity of some Statist. 34 643–684. MR2392447 estimators for reduced factorial moment measures [52] Nguyen, X.-X. and Zessin, H. (1979). Integral and and product densities of stationary Poisson cluster differential characterizations of the Gibbs process. processes. Statistics 19 87–106. MR0921628 Math. Nachr. 88 105–115. MR0543396 [36] Hope, A. C. A. (1968). A simplified Monte Carlo signif- [53] Numata, M. (1961). Forest vegetation in the vicinity icance test procedure. J. R. Stat. Soc. Ser. B Stat. of Choshi—Coastal flora and vegetation at Choshi, Methodol. 30 582–598. Chiba prefecture, IV (in Japanese). Bull. Choshi [37] Huang, F. and Ogata, Y. (1999). Improvements of Mar. Lab. 3 28–48. the maximum pseudo-likelihood estimators in var- [54] Numata, M. (1964). Forest vegetation, particularly pine ious spatial statistical models. J. Comput. Graph. stands in the vicinity of Choshi—Flora and vegeta- Statist. 8 510–530. tion in Choshi, Chiba prefecture, VI (in Japanese). [38] Illian, J., Penttinen, A., Stoyan, H. and Stoyan, D. Bull. Choshi Mar. Lab. 6 27–37. (2008). Statistical Analysis and Modelling of Spatial [55] Ogata, Y. and Tanemura, M. (1981). Estimation Point Patterns. Wiley, Chichester. MR2384630 of interaction potentials of spatial point patterns [39] Jensen, J. L. and Møller, J. (1991). Pseudolikelihood through the maximum likelihood procedure. Ann. for exponential family models of spatial point pro- Inst. Statist. Math. 33 315–338. cesses. Ann. Appl. Probab. 1 445–461. MR1111528 [56] Ogata, Y. and Tanemura, M. (1986). Likelihood [40] Jolivet, E. (1981). and conver- estimation of interaction potentials and external gence of empirical processes for stationary point fields of inhomogeneous spatial point patterns. processes. In Point Processes and Queuing Prob- In Pacific Statistical Congress (I. S. Francis, lems (Colloq., Keszthely, 1978). Colloq. Math. Soc. B. J. F. Manly and F. C. Lam, eds.) 150–154. J´anos Bolyai 24 117–161. North-Holland, Amster- Elsevier, Amsterdam. dam. MR0617406 [57] Ohser, J. (1983). On estimators for the reduced sec- [41] Kallenberg, O. (1978). On conditional intensities of ond moment measure of point processes. Math. point processes. Z. Wahrsch. Verw. Gebiete 41 205– Operationsforsch. Statist. Ser. Statist. 14 63–71. 220. MR0461654 MR0697340 [42] Kallenberg, O. (1984). An informal guide to the the- [58] Ohser, J. and Stoyan, D. (1981). On the second-order ory of conditioning in point processes. Internat. and orientation analysis of planar stationary point Statist. Rev. 52 151–164. MR0967208 processes. Biometrical J. 23 523–533. MR0635658 [43] Kelly, F. P. and Ripley, B. D. (1976). A note on [59] Papangelou, F. (1974). The conditional intensity of Strauss’s model for clustering. Biometrika 63 357– general point processes and an application to line 360. MR0431375 processes. Z. Wahrsch. Verw. Gebiete 28 207–226. [44] Kulldorff, M. (1999). Spatial scan statistics: Mod- MR0373000 els, calculations, and applications. In Scan Statis- [60] Pregibon, D. (1982). Score tests in GLIM with applica- tics and Applications 303–322. Birkh¨auser, Boston, tions. In GLIM 82: Proceedings of the International MA. MR1697758 Conference on Generalized Linear Models. Lecture [45] Kutoyants, Y. A. (1998). Statistical Inference for Spa- Notes in Statist. 14. Springer, New York. tial Poisson Processes. Lecture Notes in Statist. 134. [61] Radhakrishna Rao, C. (1948). Large sample tests of Springer, New York. MR1644620 statistical hypotheses concerning several parame- [46] Last, G. and Penrose, M. (2011). Poisson process Fock ters with applications to problems of estimation. space representation, chaos expansion and covari- Proc. Cambridge Philos. Soc. 44 50–57. MR0024111 ance inequalities. Probab. Theory Related Fields 150 [62] Rathbun, S. L. and Cressie, N. (1994). Asymptotic 663–690. properties of estimators for the parameters of spa- DIAGNOSTICS FOR SPATIAL POINT PROCESSES 35

tial inhomogeneous Poisson point processes. Adv. in [71] Stoyan, D., Kendall, W. S. and Mecke, J. (1987). Appl. Probab. 26 122–154. MR1260307 Stochastic Geometry and Its Applications. Wiley, [63] Ripley, B. D. (1976). The second-order analysis of sta- Chichester. MR0895588 tionary point processes. J. Appl. Probab. 13 255– [72] Stoyan, D. and Stoyan, H. (1995). Fractals, Random 266. MR0402918 Shapes and Point Fields. Wiley, Chichester. [64] Ripley, B. D. (1977). Modelling spatial patterns (with [73] Strauss, D. J. (1975). A model for clustering. 62 discussion). J. Roy. Statist. Soc. Ser. B 39 172–212. Biometrika 467–475. MR0383493 MR0488279 [74] van Lieshout, M. N. M. (2000). Markov Point Pro- [65] Ripley, B. D. (1988). Statistical Inference for Spa- cesses and Their Applications. Imperial College tial Processes. Cambridge Univ. Press, Cambridge. Press, London. MR1789230 MR0971986 [75] Wald, A. (1941). Some examples of asymptotically most 12 [66] Ripley, B. D. and Kelly, F. P. (1977). Markov point powerful tests. Ann. Math. Statist. 396–408. MR0006683 processes. J. Lond. Math. Soc. (2) 15 188–192. [76] Waller, L., Turnbull, B., Clark, L. C. and MR0436387 Nasca, P. (1992). Chronic Disease Surveillance [67] Schladitz, K. and Baddeley, A. J. (2000). A third and testing of clustering of disease and expo- order point process characteristic. Scand. J. Statist. 27 sure: Application to leukaemia incidence and TCE- 657–671. MR1804169 contaminated dumpsites in upstate New York. En- Silvapulle, M. J. [68] (1996). A test in the presence of nui- vironmetrics 3 281–300. 91 sance parameters. J. Amer. Statist. Assoc. 1690– [77] Wang, P. C. (1985). Adding a variable in gener- 1693. MR1439111 alized linear models. Technometrics 27 273–276. [69] Stein, M. L. (1995). An approach to asymptotic in- MR0797565 ference for spatial point processes. Statist. Sinica [78] Widom, B. and Rowlinson, J. S. (1970). New model 5 221–234. MR1329294 for the study of liquid–vapor phase transitions. [70] Stillinger, D. K., Stillinger, F. H., Torquato, S., J. Chem. Phys. 52 1670–1684. Truskett, T. M. and Debenedetti, P. G. [79] Wu, L. (2000). A new modified logarithmic Sobolev in- (2000). Triangle distribution and equation of state equality for Poisson point processes and several ap- for classical rigid disks. J. Statist. Phys. 100 49–72. plications. Probab. Theory Related Fields 118 427– MR1791553 438. MR1800540