Gaussian Process Modulated Cox Processes Under Linear Inequality Constraints
Total Page:16
File Type:pdf, Size:1020Kb
Gaussian Process Modulated Cox Processes under Linear Inequality Constraints Andr´esF. L´opez-Lopera ST John Nicolas Durrande Mines Saint-Etienne´ ∗ PROWLER.io PROWLER.io Abstract Poisson processes are the foundation for modelling point patterns (Kingman, 1992). Their extension Gaussian process (GP) modulated Cox pro- to stochastic intensity functions, known as doubly cesses are widely used to model point pat- stochastic Poisson processes or Cox processes (Cox, terns. Existing approaches require a mapping 1955), enables non-parametric inference on the in- (link function) between the unconstrained tensity function and allows expressing uncertainties GP and the positive intensity function. This (Møller and Waagepetersen, 2004). Moreover, pre- commonly yields solutions that do not have a vious studies have shown that other classes of point closed form or that are restricted to specific processes may also be seen as Cox processes. For ex- covariance functions. We introduce a novel ample, Yannaros (1988) proved that Gamma renewal finite approximation of GP-modulated Cox processes are Cox processes under non-increasing con- processes where positiveness conditions can ditions. A similar analysis was made later for Weibull be imposed directly on the GP, with no re- processes (Yannaros, 1994). strictions on the covariance function. Our ap- Gaussian processes (GPs) form a flexible prior over proach can also ensure other types of inequal- functions, and are widely used to model the intensity ity constraints (e.g. monotonicity, convexity), process Λ( ) (Møller et al., 2001; Adams et al., 2009; resulting in more versatile models that can Teh and Rao,· 2011; Gunter et al., 2014; Lasko, 2014; be used for other classes of point processes Lloyd et al., 2015; Fernandez et al., 2016; Donner and (e.g. renewal processes). We demonstrate Opper, 2018). However, to ensure positive intensities, on both synthetic and real-world data that this commonly requires link functions between the our framework accurately infers the intensity intensity process and the GP g( ). Typical examples functions. Where monotonicity is a feature of mappings are Λ(x) = exp(g(x))· (Møller et al., 2001; of the process, our ability to include this in Diggle et al., 2013; Flaxman et al., 2015) or Λ(x) = the inference improves results. g(x)2 (Lloyd et al., 2015; Kozachenko et al., 2016). The exponential transformation has the drawback that there is no closed-form expression for some of the 1 INTRODUCTION integrals required to compute the likelihood. Although the square inverse link function allows closed-form Point processes are used in a variety of real-world expressions for certain kernels, it leads to models problems for modelling temporal or spatiotemporal exhibiting \nodal lines" with zero intensity due to point patterns in fields such as astronomy, geogra- the non-monotonicity of the transformation (see John phy, and ecology (Baddeley et al., 2015; Møller and and Hensman, 2018, for a discussion). Furthermore, Waagepetersen, 2004). In reliability analysis, they are current approaches to Cox process inference cannot used as renewal processes to model the lifetime of items be used in applications such as renewal processes that or failure (hazard) rates (Cha and Finkelstein, 2018). require both positivity and monotonicity constraints. Here, we introduce a novel approximation of GP- ∗Part of this work was completed during an internship of modulated Cox processes that does not rely on a A. F. L´opez-Lopera at PROWLER.io. mapping to obtain the intensity. In our approach we impose the constraints (e.g. non-negativeness or Proceedings of the 22nd International Conference on Ar- monotonicity) directly on Λ( ) by sampling from a tificial Intelligence and Statistics (AISTATS) 2019, Naha, · Okinawa, Japan. PMLR: Volume 89. Copyright 2019 by truncated Gaussian vector. This has the advantage the author(s). that the likelihood can be computed in closed form. Gaussian Process Modulated Cox Processes under Inequality Constraints Moreover, our approach can ensure any type of linear 3 APPROXIMATION OF GP inequality constraint everywhere, which allows mod- MODULATED COX PROCESSES elling of a broader range of point processes. This paper is organised as follows. In Section 2, In this work, we approximate the intensity Λ( ) of the Cox process by a finite-dimensional GP Λ· ( ) we briefly describe inhomogeneous Poisson processes m · and some of their extensions. In Sections 3 and 4, subject to some inequality constraints (e.g. bounded- we introduce a finite representation of GP-modulated ness, monotonicity, convexity). Since positiveness con- straints are imposed directly on Λ ( ), a link function Cox processes and the corresponding Cox process m · inference under inequality constraints. In Section is no longer necessary. This has two main advantages. 5, we apply our framework to 1D and 2D inference First, the likelihood (1) can be computed analytically. examples under different inequality conditions. We Second, as our approach ensures any linear inequality also test its performance in reliability applications constraint, it can be used for modelling a broader with hazard rates exhibiting monotonic behaviours. range of point processes. Finally, in Section 6, we summarise our results and outline potential future work. 3.1 Finite Approximation of 1D GPs Let Λ( ) be a zero-mean GP on R with arbitrary 2 POISSON POINT PROCESSES covariance· function k. Consider x , with compact space = [0; 1], and a set of knots2 St ; ; t . S 1 ··· m 2 S A Poisson process X is a random countable subset of Here we consider equispaced knots tj = (j 1)∆m Rd where points occur independently (Baddeley with ∆ = 1=(m 1). We define Λ ( ) as the− finite- S ⊆ m m et al., 2006). Let N N be a random variable (r.v.) dimensional approximation− of Λ( ) consisting· of its 2 · denoting the number of points in X. Let X1; ;Xn piecewise-linear interpolation at knots t ; ; t , i.e., ··· 1 ··· m be a set of n independent and identically distributed m (i.i.d.) r.v.'s on . The likelihood of (N = n; X = X S 1 Λm(x) = φj(x)ξj; (3) x1; ;Xn = xn) under an inhomogeneous Poisson j=1 process··· with non-negative intensity λ( ) is given by (Møller and Waagepetersen, 2004) · where ξj := Λ(tj) for j = 1; ; m, and φ1; ; φm are hat basis functions given by··· ··· n exp( µ) Y ( x tj x tj f(N;X ; ;X )(n; x1; ; xn) = − λ(xi); (1) 1 − if − 1; 1 n ∆m ∆m ··· ··· n! φj(x) := − ≤ (4) i=1 0 otherwise: where Z Similarly to spline-based approaches (e.g., Sleeper and µ = λ(s) ds (2) Harrington, 1990), we assume that Λ( ) is piecewise · S defined by (first-order) polynomials. The striking is the intensity measure or overall intensity. property of this basis is that satisfying the inequality When is the real line, the distance (inter-arrival constraints (e.g. boundedness, monotonicity, convex- time) betweenS consecutive points of a Poisson process ity) at the knots implies that the constraints are follows an exponential distribution. Renewal processes satisfied everywhere in the input space (Maatouk and are a generalisation of Poisson processes where inter- Bay, 2017). Although it is tempting to generalise arrival times are i.i.d. but not necessarily exponentially the above construction to smoother basis functions, distributed. An example is the Weibull process where it makes this property difficult to enforce. inter-arrival times are distributed following λ(x) = We aim at computing the distribution of Λ ( ) under αβxβ 1 (Cha and Finkelstein, 2018). m − the condition that it belongs to a convex· set of Cox processes (Cox, 1955) are a natural extension functions defined by some inequality constraints (e.g. E of inhomogeneous Poisson processes where λ( ) is positivity). This piecewise-linear representation has · sampled from a non-negative stochastic process Λ( ). the benefit that satisfying Λm( ) is equivalent · 2 E Previous studies have shown that many classes· of to satisfying only a finite number of inequality con- point processes can be seen as Cox processes under straints. More precisely, certain conditions (Møller and Waagepetersen, 2004; Λ ( ) ξ ; (5) Yannaros, 1988, 1994). For example, Weibull renewal m · 2 E , 2 C m processes are Cox processes for β (0; 1] (Yannaros, where ξ = [ξ ; ; ξ ]>, and is a convex set on R . 1 ··· m C 1994). This motivates the construction2 of GPs with For non-negativeness conditions , is given by non-negative and monotonic constraints, so that they E+ C can be used as intensities Λ( ) of Cox processes. := c Rm; j = 1; ; m : c 0 ; (6) · C+ f 2 8 ··· j ≥ g Andr´esF. L´opez-Lopera, ST John, Nicolas Durrande 2 2 2 1 1 1 ↓ + + m 0 ∈ E 0 ∈ E 0 Λ m m -1 Λ -1 Λ -1 -2 -2 -2 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 # (a) unconstrained GP (b) GP with C+ constraints (c) GP with C+ constraints Figure 1: Samples from the prior Λm( ) under (a) no constraints, (b) non-negativeness constraints, (c) both non-negativeness and non-increasing constraints.· The grey region shows the 95% confidence interval. and for non-increasing conditions , is given by Since (8) depends on r.v.'s ξ1; ; ξm, it can be E# C approximated using samples of ξ···. To estimate the Rm := c ; j = 2; ; m : cj 1 cj : (7) covariance parameters θ of the vector ξ, we can use C# f 2 8 ··· − ≥ g stochastic global optimisation (Jones et al., 1998). Constraints can be composed, e.g. the convex set of non-negativeness and non-increasing conditions is 3.3 Extension to Higher Dimensions given by +# = + . C C \C# Assuming that ξ is zero-mean Gaussian-distributed The approximation in (3) can be extended to grids in d dimensions by tensorisation.