
Mathematical background of the p3r2 test

Niklas Hohmann∗ December 14, 2017

1 Motivation

There are two pivotal principles for the construction of the test. The first one is

• probability density functions (pdfs) are needed to construct certain statistical methods and to prove their optimality.

In the case of the p3r2 test, the statistical method used is a likelihood-ratio test. The notion of optimality for tests used here is that of a uniformly most powerful (UMP) test. The second principle is that

• a Poisson point process (PPP) can be modeled as a random measure.

At first sight, this complicates things. For example, the observation times

$$y = (y_1, y_2, \ldots, y_N) \tag{1}$$
of a PPP are not taken as elements of $\mathbb{R}^N$, but are interpreted as a measure $\mu_y$ on the real numbers, which is the sum of Dirac measures:

$$\mu_y = \sum_{j=1}^{N} \delta_{y_j} . \tag{2}$$
But this approach has advantages that will become clear soon. The general goal in the next few pages is to establish a pdf for PPPs in order to construct a UMP test for PPPs.

Ignoring more technical details, a random measure $X_i$ is a random element with values in $\mathcal{M}(\mathbb{R})$, the set of all measures on $\mathbb{R}$. Since a PPP can be modeled as a random measure, it can be imagined as a procedure that randomly chooses a measure similar to the one in equation (2), which can again be interpreted as observation times. Since $X_i$ is a random element with values in $\mathcal{M}(\mathbb{R})$, the distribution $P_i$ of $X_i$ is a probability measure on $\mathcal{M}(\mathbb{R})$. So $P_i$ assigns probabilities to sets of measures. Given a second random measure $X_j$ with distribution $P_j$, the relationship between the distributions $P_i$ and $P_j$ can be examined by applying methods of measure theory. It is known from the Radon-Nikodym theorem that the existence of a pdf of a (probability) measure $P_i$ with respect to a (probability) measure $P_j$ is equivalent to absolute continuity of $P_i$ with respect to $P_j$, which is written as $P_i \ll P_j$. So to establish a pdf for PPPs, absolute continuity between the distributions of the relevant random measures has to be established.

∗ FAU Erlangen-Nuremberg, email: [email protected]
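As a small illustration of equation (2), the following Python sketch treats a list of observation times as the measure $\mu_y$: the measure of an interval is the number of observation times in it, and integrating a function against $\mu_y$ is the sum of its values at the observation times. The function names and the sample values are assumptions made for this example only.

```python
from typing import Callable, Sequence


def point_measure(y: Sequence[float], a: float, b: float) -> int:
    """mu_y([a, b]): number of observation times in the closed interval [a, b]."""
    return sum(1 for t in y if a <= t <= b)


def integrate_against_point_measure(f: Callable[[float], float],
                                    y: Sequence[float]) -> float:
    """Integral of f with respect to mu_y = sum_j delta_{y_j}, i.e. sum_j f(y_j)."""
    return sum(f(t) for t in y)


# Hypothetical observation times (illustrative values only).
y = [0.5, 1.25, 1.5, 3.0]
count_in_interval = point_measure(y, 1.0, 2.0)
integral_of_identity = integrate_against_point_measure(lambda t: t, y)
```

Working with the measure instead of the vector $y$ means that the number of points $N$ does not have to be fixed in advance.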

2 Basic idea

For simplification, look at the measures on the positive real numbers. Assume $\mu_r$ is a finite measure on $[0, \infty)$. Then it is uniquely determined by its Laplace transform
$$L_r(t) = \int \exp(-tx)\, \mu_r(dx) . \tag{3}$$

If $\mu_r \ll \lambda_M$, there exists a density function $f_r$ of $\mu_r$ with respect to $\lambda_M$. So we get the alternative representation
$$L_r(t) = \int f_r(x) \exp(-tx)\, \lambda_M(dx) \tag{4}$$
for the Laplace transform. Now, assume it is unknown whether or not $\mu_r \ll \lambda_M$, but it can be shown that
$$L_r(t) = \int g(x) \exp(-tx)\, \lambda_M(dx) \tag{5}$$
for all $t$ and a given function $g$. So $g$ behaves like a density in combination with the Laplace transform. If this property can be expanded and used to show that $g$ behaves like a density in combination with all indicator functions of measurable sets, then $g$ really is a density, which means $\mu_r \ll \lambda_M$ and $f_r = g$ almost everywhere.

The same approach is used for random measures: first, a function is identified that behaves like a density in combination with the Laplace transform. This is done in section 4. Then, it is shown that this property expands and the function really is a density. This is done in section 5.

3 Modelling and Notation

In terms of notation, I follow [2] and use the conventions $\ln(0) := -\infty$ and $\exp(-\infty) = 0$. Choose $f_r \in L^1(\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda) =: L^1(\lambda)$ satisfying

1. there is a compact, connected set $M \subset \mathbb{R}$ with $\operatorname{supp}(f_r) \subset M$ for all $r$

2. $f_r(x) \geq 0$ for all $x \in \mathbb{R}$

3. $\int f_r(x)\, \lambda(dx) > 0$

Here, $r$ is an arbitrary index from an index set $R$ and $\lambda$ is the Lebesgue measure. Since everything outside the set $M$ is irrelevant for modelling, I will use $\lambda_M$, the Lebesgue measure restricted to $M$, instead of $\lambda$. The $f_r$ will serve as the rate functions of the PPPs. Define the measures
$$\mu_r(A) := \int_A f_r(x)\, \lambda_M(dx) . \tag{6}$$
Then $\mu_r \ll \lambda_M$ and $f_r$ is the density function¹ of $\mu_r$ with respect to $\lambda_M$. Last, we denote the PPP with intensity measure $\mu_r$ by $X_r$ and the PPP with intensity measure $\lambda_M$ by $X_M$.
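To make the objects $X_r$ more concrete, here is a minimal Python sketch that draws the observation times of a PPP on $M = [a, b]$ with a bounded rate function. It uses thinning, a standard simulation technique that is not part of the text above: points of a homogeneous PPP with rate `rate_max` are generated from exponential waiting times, and each point $t$ is kept with probability $f_r(t)/\texttt{rate\_max}$. The function name, the example rate function and the bound `rate_max` are assumptions made for this illustration.

```python
import math
import random


def simulate_ppp(rate, a, b, rate_max, rng):
    """Sample the observation times of a PPP on [a, b] by thinning.

    Assumes 0 <= rate(x) <= rate_max for all x in [a, b].
    """
    points = []
    t = a
    while True:
        # Next point of a homogeneous PPP with constant rate rate_max.
        t += rng.expovariate(rate_max)
        if t > b:
            break
        r = rate(t)
        if r < 0.0 or r > rate_max:
            raise ValueError("rate must satisfy 0 <= rate(x) <= rate_max on [a, b]")
        # Keep the candidate point with probability rate(t) / rate_max.
        if rng.random() * rate_max < r:
            points.append(t)
    return points


# Hypothetical rate function on M = [0, 10], bounded by 3.
rng = random.Random(0)
observation_times = simulate_ppp(lambda x: 2.0 + math.sin(x), 0.0, 10.0, 3.0, rng)
```

The returned list plays the role of the vector $y$ in equation (1).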

4 Identifying a possible pdf candidate

In the case of a random measure $X_r$, the Laplace transform is given by
$$L_r(f) = E\left[\exp\left(-\int f(x)\, X_r(dx)\right)\right] . \tag{7}$$

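Here, $f \in C_c^+$, where $C_c^+$ is the set of positive continuous functions with compact support. If $X_r$ is a PPP with intensity measure $\mu_r$, this simplifies to
\begin{align}
L_r(f) &= \exp\left(\int \big(\exp(-f(x)) - 1\big)\, \mu_r(dx)\right) \tag{8} \\
&= \exp\left(\int \big(\exp(-f(x)) - 1\big) f_r(x)\, \lambda_M(dx)\right) . \tag{9}
\end{align}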

Here, $f_r$ is the density of $\mu_r$ with respect to $\lambda_M$. Like in section 2, we need a function $d_r$ that behaves like a density in combination with the Laplace transform, meaning
$$L_r(f) = E\left[d_r(X_M) \exp\left(-\int f(x)\, X_M(dx)\right)\right] . \tag{10}$$

¹ It is not a probability density function.

I will show that the function
$$d_r(X_M) := c_r \cdot \exp\left(\int \ln(f_r(x))\, X_M(dx)\right) \tag{11}$$
with
$$c_r := \exp\left(\int \big(1 - f_r(x)\big)\, \lambda_M(dx)\right) \tag{12}$$
has that property. First, assume the rate function $f_r$ is a simple function, so

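$$f_r(x) = \sum_{i=1}^{n} \alpha_i 1_{A_i}(x) \tag{13}$$
with $\alpha_i > 0$ and measurable, pairwise disjoint sets $A_i$. For technical reasons, add the set $A_0 = M \setminus \bigcup_{i=1}^{n} A_i$ with $\alpha_0 = 0$, so
$$f_r(x) = \sum_{i=0}^{n} \alpha_i 1_{A_i}(x) . \tag{14}$$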

Now, take any $f \in C_c^+$. Then for every $A_i$, there is a sequence of simple functions $(\tilde{s}_{i,k})_{k \in \mathbb{N}}$ with $\tilde{s}_{i,k}(x) \nearrow f(x) 1_{A_i}(x)$ almost everywhere for $k \to \infty$. Then

$$s_{i,k}(x) = \big(\ln(\alpha_i) - \tilde{s}_{i,k}(x)\big) 1_{A_i}(x) \searrow \big(\ln(\alpha_i) - f(x)\big) 1_{A_i}(x) \tag{15}$$
for $k \to \infty$. The function
$$s_k(x) = \sum_{i=0}^{n} s_{i,k}(x) \tag{16}$$
is also a simple function and has a representation

$$s_k(x) = \sum_{j=1}^{n_k} b_{j,k} 1_{B_{j,k}}(x) \tag{17}$$
with measurable $B_{j,k}$. Then

$$s_k(x) \searrow \ln(f_r(x)) - f(x) \tag{18}$$
almost everywhere for $k \to \infty$. Further, the sets $B_{j,k}$ can be chosen to be compatible with the sets $A_i$, meaning that for all $i$, there is a $J_i \subset \{1, 2, 3, \ldots, n_k\}$ satisfying
$$\bigcup_{j \in J_i} B_{j,k} = A_i . \tag{19}$$

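So we get an alternative representation of $f_r$:
$$f_r(x) = \sum_{j=1}^{n_k} \tilde{\alpha}_j 1_{B_{j,k}}(x) \tag{20}$$
with $\alpha_i = \tilde{\alpha}_j$ for all $j \in J_i$. Furthermore, $b_{j,k} = \ln(\alpha_i) - \beta_{j,k}$ for $j \in J_i$. So
$$\tilde{s}_{i,k}(x) = \sum_{j \in J_i} \beta_{j,k} 1_{B_{j,k}}(x) \tag{21}$$
and
$$\tilde{s}_k(x) = \sum_{i=0}^{n} \tilde{s}_{i,k}(x) = \sum_{j=1}^{n_k} \beta_{j,k} 1_{B_{j,k}}(x) \nearrow f(x) \tag{22}$$
almost everywhere for $k \to \infty$. With $f_r$ being a simple function, equation (12) reduces to
\begin{align}
c_r &= \exp\left(\int \Big(1 - \sum_{j=1}^{n_k} \tilde{\alpha}_j 1_{B_{j,k}}(x)\Big)\, \lambda_M(dx)\right) \tag{23} \\
&= \exp\left(\sum_{j=1}^{n_k} (1 - \tilde{\alpha}_j)\, \lambda_M(B_{j,k})\right) \tag{24} \\
&= \prod_{j=1}^{n_k} \exp\big((1 - \tilde{\alpha}_j)\, \lambda_M(B_{j,k})\big) . \tag{25}
\end{align}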

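Now, we start showing equation (10). First, fix any $f \in C_c^+$ and an approximating sequence $\tilde{s}_k$ as discussed above. Then
\begin{align}
& E\left[d_r(X_M) \exp\left(-\int \tilde{s}_k(x)\, X_M(dx)\right)\right] \tag{26} \\
&= E\left[c_r \cdot \exp\left(\int \big(\ln(f_r(x)) - \tilde{s}_k(x)\big)\, X_M(dx)\right)\right] \tag{27} \\
&= E\left[c_r \cdot \exp\left(\int s_k(x)\, X_M(dx)\right)\right] \tag{28} \\
&= E\left[c_r \cdot \prod_{j=1}^{n_k} \exp\big(b_{j,k} X_M(B_{j,k})\big)\right] . \tag{29}
\end{align}
Now, note that for a random variable $Y$ having a Poisson distribution with parameter $p > 0$ and any $a \in \mathbb{R}$, we have
$$E[\exp(aY)] = \exp\big(p(\exp(a) - 1)\big) . \tag{30}$$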
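Identity (30) can be checked numerically. The Python sketch below sums $E[\exp(aY)] = \sum_{n \geq 0} e^{an}\, e^{-p} p^n / n!$ directly from the Poisson probability mass function and compares the result with the closed form. The function names, the truncation at 200 terms and the sample parameters are assumptions made for this example.

```python
import math


def poisson_mgf_series(a, p, terms=200):
    """E[exp(a*Y)] for Y ~ Poi(p), summed term by term from the pmf."""
    total = 0.0
    term = math.exp(-p)  # n = 0 summand: P(Y = 0) * exp(a * 0)
    for n in range(terms):
        total += term
        # Ratio of summand n+1 to summand n is p * exp(a) / (n + 1).
        term *= p * math.exp(a) / (n + 1)
    return total


def poisson_mgf_closed(a, p):
    """Right-hand side of equation (30)."""
    return math.exp(p * (math.exp(a) - 1.0))


# Negative exponents a occur whenever b_{j,k} < 0, so both signs are checked.
for a, p in [(-0.7, 2.5), (0.4, 1.0)]:
    assert math.isclose(poisson_mgf_series(a, p), poisson_mgf_closed(a, p),
                        rel_tol=1e-12)
```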

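By using the independent increments of the PPP $X_M$ and the fact that $X_M(B) \sim \operatorname{Poi}(\lambda_M(B))$, continue the chain of equations with
\begin{align}
\text{eq. (29)} &= c_r \cdot \prod_{j=1}^{n_k} E\left[\exp\big(b_{j,k} X_M(B_{j,k})\big)\right] \tag{31} \\
&= c_r \cdot \prod_{j=1}^{n_k} \exp\big(\lambda_M(B_{j,k})(\exp(b_{j,k}) - 1)\big) \tag{32} \\
&= \prod_{j=1}^{n_k} \exp\big(\lambda_M(B_{j,k}) \cdot (1 - \tilde{\alpha}_j - 1 + \tilde{\alpha}_j \exp(-\beta_{j,k}))\big) \tag{33} \\
&= \exp\left(\sum_{j=1}^{n_k} \tilde{\alpha}_j \lambda_M(B_{j,k}) (\exp(-\beta_{j,k}) - 1)\right) \tag{34} \\
&= \exp\left(\int \sum_{j=1}^{n_k} \tilde{\alpha}_j 1_{B_{j,k}}(x) (\exp(-\beta_{j,k}) - 1)\, \lambda_M(dx)\right) \tag{35} \\
&= \exp\left(\int \big(\exp(-\tilde{s}_k(x)) - 1\big) f_r(x)\, \lambda_M(dx)\right) \tag{36} \\
&= L_r(\tilde{s}_k) . \tag{37}
\end{align}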

By standard approximation arguments, this also holds for any $f_r$ as defined in section 3 and all $f \in C_c^+$.
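The chain (26) to (37) can be verified numerically in its simplest instance: a single block $B = M$ with $\lambda_M(M) = \ell$, a constant rate $f_r \equiv \alpha$ on $M$ and a constant value $\beta$ of the approximating simple function. In this case $X_M(M) \sim \operatorname{Poi}(\ell)$, $c_r = \exp((1 - \alpha)\ell)$ and $d_r(X_M) = c_r\, \alpha^{X_M(M)}$, so both sides of the identity are explicit. The Python sketch below is only a sanity check under these simplifying assumptions; the names `alpha`, `beta` and `ell` and the sample values are chosen for the example.

```python
import math


def density_side(alpha, beta, ell, terms=200):
    """E[d_r(X_M) * exp(-beta * N)] with N = X_M(M) ~ Poi(ell), as in (26)."""
    c_r = math.exp((1.0 - alpha) * ell)  # equation (12) for a constant rate
    total = 0.0
    pmf = math.exp(-ell)  # P(N = 0)
    for n in range(terms):
        # d_r = c_r * alpha**n when the point measure has n points in M.
        total += pmf * c_r * alpha ** n * math.exp(-beta * n)
        pmf *= ell / (n + 1)  # P(N = n + 1) from P(N = n)
    return total


def laplace_side(alpha, beta, ell):
    """L_r evaluated at the constant function beta, as in (36)."""
    return math.exp(alpha * ell * (math.exp(-beta) - 1.0))


for alpha, beta, ell in [(3.0, 0.5, 2.0), (0.4, 1.3, 1.5)]:
    assert math.isclose(density_side(alpha, beta, ell),
                        laplace_side(alpha, beta, ell), rel_tol=1e-12)
```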

5 Expanding the pdf property

I will first show the expansion of the density behaviour for measures on the positive real numbers (see section 2). This is just to give an idea of the principles that will be used for random measures, so the fact that $[0, \infty)$ is not compact will be ignored. The underlying idea is to essentially reverse engineer theorem 15.5 from [2]. Once the expansion has been shown to work for measures on the positive real numbers, I will show that it also works for random measures.

5.1 Measures on the positive real numbers

So assume we know that
$$L_r(t) = \int g(x) \exp(-tx)\, \lambda_M(dx) \quad (t \geq 0) \tag{38}$$
with $L_r$ being the Laplace transform of a finite measure $\mu_r$ on the positive real numbers and $g$ the assumed density. The family $(f_t)_{t \in [0,\infty)}$ of functions given by
$$f_t(x) = \begin{cases} \exp(-tx) & \text{for } t > 0 \\ 1 & \text{for } t = 0 \end{cases} \tag{39}$$
is closed under multiplication. Because of the linearity of the integral,
$$\int \tilde{f}(x)\, \mu_r(dx) = \int g(x) \tilde{f}(x)\, \lambda_M(dx) \tag{40}$$

holds for all $\tilde{f}$ belonging to the algebra generated by $(f_t)_{t \in [0,\infty)}$. From the Stone-Weierstrass theorem it follows that
$$\int f(x)\, \mu_r(dx) = \int g(x) f(x)\, \lambda_M(dx) \tag{41}$$
for all bounded continuous functions $f$ (corollary 15.3 in [2] reverse). Approximating the indicator functions of compact sets $A \in \mathcal{B}([0, +\infty))$ with bounded continuous functions yields
$$\int 1_A(x)\, \mu_r(dx) = \int g(x) 1_A(x)\, \lambda_M(dx) . \tag{42}$$

Since the compact sets generate the Borel σ-field and are closed under intersection, this expands to all measurable sets (Theorem 13.11 in [2] reverse). So $g$ really is a density of $\mu_r$ with respect to $\lambda_M$.

5.2 Random measures

Now apply this approach to random measures. For this, look at the set $\mathcal{M}(M)$ of locally finite measures on $M$, equipped with the vague topology. From Prohorov's theorem ([1], Theorem 4.2), it is known that $\mathcal{M}(M)$ is Polish. Since $M$ is compact, all measures in $\mathcal{M}(M)$ are finite.

The space $\mathcal{M}(M)$ is also locally compact. To see this, fix a $\mu \in \mathcal{M}(M)$. Then $\mu(M) = c_\mu < \infty$. The topology is initial for the mappings $\pi_B$ defined as
$$\nu \mapsto \nu(B) \quad (\nu \in \mathcal{M}(M)) \tag{43}$$
for bounded measurable $B$. So $\pi_M$ is by definition continuous, and $m_M = \pi_M^{-1}((-\infty, c_\mu + \varepsilon))$ with $\varepsilon > 0$ is an open set in the vague topology that contains $\mu$. Applying Prohorov's theorem again shows that $m_M$ is a relatively compact neighbourhood of $\mu$, so $\mathcal{M}(M)$ is locally compact.

Since $\mathcal{M}(M)$ is also separable, it is countable at infinity, so its Alexandroff compactification $\mathcal{M}^*$ is well defined, metrizable and compact. Denote the infinitely distant point in $\mathcal{M}^*$ by $\infty$, so $\mathcal{M}^* \setminus \mathcal{M}(M) = \{\infty\}$. Let $F^+$ be the set of all positive measurable functions. For $f \in F^+$ define
$$\int f(x)\, \infty(dx) := \begin{cases} 0 & \text{if } f \equiv 0 \\ +\infty & \text{else} \end{cases} \tag{44}$$

Then for every $f \in C_c^+$ define
$$\varphi_f : \mathcal{M}^* \to \mathbb{R} \tag{45}$$
through
$$\varphi_f(\nu) = \exp\left(-\int f(x)\, \nu(dx)\right) . \tag{46}$$

This family is closed under multiplication and contains the constant function 1 (set $f \equiv 0$). Since $M$ is Polish, all locally finite measures are regular (theorem 13.6 [2]), so by the Riesz-Markov-Kakutani representation theorem, the family separates points in $\mathcal{M}(M)$, so it separates points in $\mathcal{M}^*$ as well. It is also continuous, since for $c > 0$ and with $\tilde{c} := -\ln(c)$,
\begin{align}
\varphi_f^{-1}((-\infty, c)) &= \left\{\nu \in \mathcal{M}^* \,\middle|\, -\int f(x)\, \nu(dx) < \ln(c)\right\} \tag{47} \\
&= \left\{\nu \in \mathcal{M}^* \,\middle|\, \int f(x)\, \nu(dx) > -\ln(c)\right\} \tag{48} \\
&= \left\{\nu \in \mathcal{M}^* \,\middle|\, \int f(x)\, \nu(dx) \leq \tilde{c}\right\}^C \tag{49}
\end{align}
which is the complement of a compact set and therefore open in the topology of the compactification.

With all this, the tools are ready to generalize the argument used above. $\mathcal{M}^*$ is compact and metrizable, so all requirements for the Stone-Weierstrass theorem are met. So the expansion to all bounded continuous functions works, and the approximation of compact sets is also just like above. The compact sets generate the Borel σ-field and are closed under intersection, so they determine a measure uniquely. Last, note that $\mathcal{M}(M)$ is Borel, so restricting the measurable sets to $\mathcal{M}(M)$ gets rid of the additional structure of $\mathcal{M}^*$. So $d_r$ really is a density of $P_r$ with respect to $P_M$.

6 Statistical modeling

From here on, statistical modeling is straightforward. Choose a rate function $f_0$ as the null hypothesis and a rate function $f_1$ as the alternative that both meet the restrictions in section 3. Define the measures $\mu_0, \mu_1$ following equation (6) and PPPs $X_0 \sim P_0$, $X_1 \sim P_1$ with intensity measures $\mu_0, \mu_1$. Then define the binary statistical model

$$(\mathcal{M}(M), \mathcal{B}(\mathcal{M}(M)), \{P_0, P_1\}) . \tag{50}$$

What was shown in the previous sections is that this is a dominated model with $\{P_0, P_1\} \ll P_M$. As a likelihood ratio, we get
$$R(\nu) := \frac{d_1(\nu)}{d_0(\nu)} = \frac{c_1}{c_0} \cdot \exp\left(\int \ln\big(f_1(x)/f_0(x)\big)\, \nu(dx)\right) . \tag{51}$$

Since the measures of a point process are always of the type (2), this reduces to²
$$R(\nu) = \frac{c_1}{c_0} \prod_{j=1}^{N} \frac{f_1(y_j)}{f_0(y_j)} . \tag{52}$$
This is the test statistic as described in the paper. Optimality of the likelihood-ratio test $f_0$ vs. $f_1$ ($P_0$ vs. $P_1$) comes from [3] p. 91.
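The statistic (52) is easiest to evaluate on the log scale, where $\ln R(\nu) = \int (f_0(x) - f_1(x))\, \lambda_M(dx) + \sum_{j=1}^{N} \big(\ln f_1(y_j) - \ln f_0(y_j)\big)$; the first term is $\ln(c_1/c_0)$ by equation (12). The Python sketch below is one possible way to compute it, assuming $M = [a, b]$, rate functions that are strictly positive at all observation times, and a midpoint rule for the integral. The function names, the quadrature rule and the sample values are assumptions of this example and are not taken from the p3r2 implementation.

```python
import math


def midpoint_integral(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def log_likelihood_ratio(y, f0, f1, a, b):
    """ln R(nu) from equation (52) for observation times y in M = [a, b].

    Assumes f0(t) > 0 and f1(t) > 0 at every observation time t.
    """
    # ln(c1 / c0) = int (1 - f1) - int (1 - f0) = int (f0 - f1)
    log_c_ratio = midpoint_integral(lambda x: f0(x) - f1(x), a, b)
    return log_c_ratio + sum(math.log(f1(t)) - math.log(f0(t)) for t in y)


# Hypothetical example: constant rates 1 (null) and 2 (alternative) on [0, 1].
log_r = log_likelihood_ratio([0.2, 0.7], lambda x: 1.0, lambda x: 2.0, 0.0, 1.0)
```

For constant rates the value can be checked by hand: with two observation times the result is $2 \ln 2 - 1$.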

References

[1] Olav Kallenberg. Random Measures. Springer, New York, 2017.

[2] Achim Klenke. Probability Theory. A Comprehensive Course. Springer, London, 2008.

[3] F. Liese and Klaus-J. Miescke. Statistical Decision Theory. Springer, New York, 2008.

² Since integrating $f$ with respect to $\delta_t$ is equal to evaluating $f(t)$.
