Independent Component Models for Replicated Point Processes
Total Page:16
File Type:pdf, Size:1020Kb
Independent component models for replicated point processes Daniel Gervini September 25, 2018 Abstract We propose a semiparametric independent-component model for the intensity func- tions of a point process. When independent replications of the process are available, we show that the estimators are consistent and asymptotically normal. We study the finite-sample behavior of the estimators by simulation, and as an example of application we analyze the spatial distribution of street robberies in the city of Chicago. Key words: Cox process; latent-variable model; Poisson process; spline smoothing. 1 Introduction Point processes in time and space have a broad range of applications, in diverse areas such as neuroscience, ecology, finance, astronomy, seismology, and many others. Exam- ples are given in classic textbooks like Cox and Isham (1980), Diggle (2013), Møller and Waagepetersen (2004), Streit (2010), and Snyder and Miller (1991), and in the papers cited below. However, the point-process literature has mostly focused on the single-realization case, such as the distribution of trees in a single forest (Jalilian et al., 2013) or the dis- tribution of cells in a single tissue sample (Diggle et al., 2006). But situations where several replications of a process are available are increasingly common. Among the few papers proposing statistical methods for replicated point processes we can cite Diggle et arXiv:1506.00137v1 [stat.ME] 30 May 2015 al. (1991), Baddeley et al. (1993), Diggle et al. (2000), Bell and Grunwald (2004), Landau et al. (2004), Wager et al. (2004), and Pawlas (2011). However, these papers mostly propose estimators for summary statistics of the processes (the so-called F , G and K statistics) rather than for the intensity functions that characterize the processes. When several replications of a process are available, it is possible to estimate the inten- sity functions directly thanks to the possibility of “borrowing strength” across replications. This is the basic idea behind many functional data methods that are applied to stochastic processes which, individually, are only sparsely sampled (James et al., 2000; Yao et al., 2005; Gervini, 2009). Functional Data Analysis has mostly focused on continuous-time processes; little work has been done on discrete point processes. We can mention Bouzas 1 et al. (2006, 2007) and Fern´andez-Alcal´aet al. (2012), which have rather limited scopes since they only estimate the mean of Cox processes, and Wu et al. (2013), who estimate the mean and the principal components of independent and identically distributed realizations of a Cox process, but the method of Wu et al. (2013) is based on kernel estimators of the covariance function that are not easily extended beyond i.i.d. replications and time- dependent processes. The semiparametric method we propose in this paper, on the other hand, can be applied just as easily to temporal or spatial processes, and can be extended to non-i.i.d. situations like ANOVA models or more complex data structures like marked point processes and multivariate processes, even though we will not go as far in this paper. As an example of application we will analyze the spatial distribution of street robberies in Chicago in the year 2014, where each day is seen as a replication of the process. The method we propose in this paper fits a non-negative independent-component model for the intensity functions, and provides estimators of the intensity functions for the individual replications as by-products. We establish the consistency and asymptotic normality of the component estimators in Section 3, study their finite-sample behavior by simulation in Section 4, and analyze the Chicago crime data in Section 5. 2 Independent-component model for intensity process A point process X is a random countable set in a space , where is usually R for S S temporal processes and R2 or R3 for spatial processes (Møller and Waagepetersen, 2004, ch. 2; Streit, 2010, ch. 2). Locally finite processes are those for which #(X B) < ∩ ∞ with probability one for any bounded B . For such processes we can define the count ⊆ S function N(B) = #(X B). Given a locally integrable function λ : [0, ), i.e. a ∩ S → ∞ function such that λ(t)dt < for any bounded B , the process X is a Poisson B ∞ ⊆ S process with intensityR function λ(t) if (i) N(B) follows a Poisson distribution with rate λ(t)dt for any bounded B , and (ii) conditionally on N(B) = m, the m points in B ⊆ S X B are independent and identically distributed with density λ˜(t) = λ(t)/ λ for any R ∩ B bounded B . ⊆ S R Let X = X B for a given B. Then the density function of X at x = t ,...,t B ∩ B B { 1 m} is f(x ) = f(m)f(t ,...,t m) (1) B 1 m| λ(t)dt m m = exp λ(t)dt { B } λ˜(t ). − m! × j B R j=1 Z Y Since the realizations of XB are sets, not vectors, what we mean by ‘density’ is the following: 2 if is the family of locally finite subsets of , i.e. N S = A : #(A B) < for all bounded B , N { ⊆ S ∩ ∞ ⊆ S} then for any F ⊆N ∞ P (X F ) = I( t ,...,t F )f( t ,...,t )dt dt B ∈ · · · { 1 m}∈ { 1 m} 1 · · · m m=0 ZB ZB X∞ exp λ(t)dt m = − B I( t ,...,t F ) λ(t ) dt dt , m! · · · { 1 m}∈ { j } 1 · · · m m=0 R B B j=1 X Z Z Y and, more generally, for any function h : [0, ) N → ∞ ∞ E h(X ) = h( t ,...,t )f( t ,...,t )dt dt . (2) { B } · · · { 1 m} { 1 m} 1 · · · m m=0 B B X Z Z A function h on is, essentially, a function well defined on m for any integer m which is N S invariant under permutation of the coordinates; for example, h( t ,...,t )= m t /m. { 1 m} j=1 j Single realizations of point processes are often modeled as Poisson processesP with fixed λs. But for replicated point processes a single intensity function λ rarely provides an ade- quate fit for all replications. It is more reasonable to assume that the λs are subject-specific and treat them as latent random effects. These processes are called doubly stochastic or Cox processes (Møller and Waagepetersen, 2004, ch. 5; Streit, 2010, ch. 8). We can think of a doubly stochastic process as a pair (X, Λ) where X Λ = λ is a Poisson process | with intensity λ, and Λ is a random function that takes values on the space of non- F negative locally integrable functions on . This provides an appropriate framework for the S type of applications we have in mind: the n replications can be seen as i.i.d. realizations (X1, Λ1),..., (Xn, Λn) of a doubly stochastic process (X, Λ), where X is observable but Λ is not. In this paper we will assume that all X s are observed on a common region B of ; i S the method could be extended to Xis observed on non-conformal regions Bi at the expense of a higher computational complexity. For many applications we can think of the intensity process Λ as a finite combination of independent components, p Λ(t)= Ukφk(t), (3) Xk=1 where φ ,...,φ are functions in and U ,...,U are independent nonnegative random 1 p F 1 p variables. This is the functional equivalent of the multivariate Independent Component decomposition (Hyv¨arinen et al., 2001). For identifiability we assume that B φk(t)dt = 1 R 3 p for all k, i.e. that the φks are density functions. Then B Λ(t)dt = k=1 Uk and, if we p define S = k=1 Uk and Wk = Uk/S, we have R P P p ˜ ˜ Λ(t)= SΛ(t), with Λ(t)= Wkφk(t). (4) Xk=1 The intensity process Λ(t), then, can be seen as the product of an intensity factor S ˜ and a scatter factor Λ(t) which is a convex combination of the φks. Conditionally on Λ(t) = λ(t) the count function N(B) has a Poisson distribution with rate S = s, and conditionally on N(B) = m the m points in XB are independent identically distributed realizations with density λ˜(t) which is determined by the Wks. In general S and W are not independent, but if the Uks are assumed to follow independent Gamma distributions, say U Gamma(α , β) with a common β, then S and W are independent with S k ∼ k ∼ Gamma( α , β) and W Dirichlet(α ,...,α ) (Bilodeau and Brenner, 1999, ch. 3). k ∼ 1 p t The marginalP density of XB, with a slight abuse of notation in writing xB = (m, ), can be expressed as f(xB) = f(m, t, s, w) ds dw (5) Z Z = f(t m, w)f(m s)f(s, w) ds dw | | Z Z with m p f(t m, w)= w φ (t ). | k k j jY=1 Xk=1 ˜ Since λ(t) is a mixture of the φks, for computational convenience we introduce indicator ′ vectors y ,..., y such that, for each j, y = 1 for some k and y ′ = 0 for all k = k, 1 m jk jk 6 with P (Y = 1 m, w)= w . In words, y indicates which φ the point t is coming from. jk | k j k j Conditionally on m and w, the yjs are independent Multinomial(1, w). Then, collecting all yjs into a single y for notational simplicity, (5) can also be written as f(xB) = f(m, t, y, s, w) ds dw y Z Z X = f(t m, y)f(y m, w)f(m s)f(s, w) ds dw y | | | Z Z X with m p f (t m, y)= φ (t )yjk | k j jY=1 kY=1 4 and m p y f(y m, w)= w jk .