arXiv:1906.06736v3 [math.PR] 22 Sep 2020

On quasi-infinitely divisible random measures

Riccardo Passeggeri
LPSM, Sorbonne University. Email: riccardo.passeggeri@gmail.com

September 23, 2020

Abstract

Quasi-infinitely divisible (QID) distributions have been recently introduced by Lindner, Pan and Sato (Trans. Amer. Math. Soc. 370, 8483-8520 (2018)). A random variable X is QID if and only if there exist two infinitely divisible (ID) random variables Y and Z s.t. $X + Y \overset{d}{=} Z$ and Y is independent of X. In this work, we show that a family of QID completely random measures (CRMs) is dense in the space of all CRMs with respect to convergence in distribution. We further demonstrate that the elements of this family possess a Lévy-Khintchine formulation and that there exists a one to one correspondence between their law and certain characteristic pairs. We prove the same results also for the class of point processes with independent increments. In the second part of the paper, we show the relevance of these results in the general Bayesian nonparametric framework based on CRMs developed by Broderick, Wilson and Jordan (Bernoulli 24, 3181-3221 (2018)).

Keywords: quasi-infinite divisibility, completely random measure, dense class, nonparametric Bayesian analysis, automatic conjugacy.

MSC (2010): 60E07, 60G57, 60A10, 62F15

Contents

1 Introduction
2 Notation and Preliminaries
3 The density result for QID CRMs
3.1 The density result for QID point processes
4 Properties of the dense class $\mathcal{A}$
4.1 Properties of the dense class
5 A Nonparametric Bayesian example

1 Introduction

A random measure ξ on S, with underlying probability space $(\Omega,\mathcal{F},P)$, is a function Ω × S → [0,∞] s.t. ξ(ω,B) is $\mathcal{F}$-measurable in ω ∈ Ω for fixed B and a locally finite measure in B ∈ $\mathcal{S}$ for fixed ω. Completely random measures (CRMs) have the additional property that for any disjoint B_1,...,B_k ∈ $\mathcal{S}$, k ∈ N, the random variables ξ(B_1), ξ(B_2),...,ξ(B_k) are independent. CRMs, also called independently scattered random measures or random measures with independent increments, have a fundamental role in nonparametric Bayesian analysis; as Ghosal and van der Vaart affirm in their recent book (see [7]) CRMs “arise as priors, or building blocks for priors, in many Bayesian nonparametric applications”.

Completely random measures have a long history which is inextricably linked with the one of infinitely divisible distributions. In 1967, Kingman [14] proved a very appealing and useful representation theorem for all CRMs. He showed that any CRM ξ is almost surely given by the sum of three components: one deterministic, one concentrated on a fixed set of atoms, and one concentrated on a random set of atoms. He further showed that the last component, which he called the ordinary component, is fully determined by a Poisson point process: $\xi_{ord}(B) = \int_{(0,\infty)} x\,\eta(B \times dx)$, where η is a Poisson point process on $S \times (0,\infty)$. The Poisson point process is the prime example of infinitely divisible CRM.

Infinitely divisible (ID) distributions have an even longer history that goes back to the work of Lévy, Kolmogorov and De Finetti, among others. They constitute one of the most studied classes of probability distributions.
One of their most attractive properties is that their characteristic functions have an explicit formulation, called the Lévy-Khintchine formulation, written in terms of three mathematical objects. These are the drift, which is a real valued constant, the Gaussian component, which is a non-negative constant, and the Lévy measure, which is a measure on R satisfying an integrability condition and with no mass at {0}. Gaussian and Poisson distributions are examples of this class.

In 2018, Lindner, Pan and Sato introduced in [15] the class of quasi-infinitely divisible (QID) distributions. A QID distribution is defined as follows: a random variable X is QID (namely has a QID distribution) if and only if there exist two ID random variables Y and Z s.t. $X + Y \overset{d}{=} Z$ and Y is independent of X. QID distributions are like ID distributions except for the fact that the Lévy measure is now allowed to take negative values. In other words, a QID distribution has a Lévy-Khintchine formulation which is uniquely determined by a drift, a Gaussian component and by a ‘signed measure’ (more precisely a real valued set function) called the quasi-Lévy measure. Any ID distribution is QID, but the converse is not always true.

In [15], the authors show that QID distributions are dense in the space of all probability distributions with respect to weak convergence and that distributions concentrated on the integers (or any shift and dilation of them) are QID if and only if their characteristic functions have no zeros, among other results. Further theoretical results have been achieved in [1, 13, 19, 20]. In [19], the QID framework is extended to real-valued random noises and stochastic processes. QID distributions have already been shown to have an impact in various fields: from mathematical physics, see [4] and [6], to number theory, see [16] and [17], and to insurance mathematics, see [21].

The first main contribution of this paper is the density result for QID random measures.
We prove that a certain class of QID completely random measures (CRMs), which we denote by $\mathcal{A}$, is dense with respect to convergence in distribution (precisely in both weak and vague convergence) in the space of all CRMs, also known as random measures with independent increments or as independently scattered random measures. This result extends the density result in [15] to the infinite dimensional setting of CRMs.

The class $\mathcal{A}$ has quite remarkable features. First, like any CRM, its elements have an almost sure representation in terms of an ‘atomless’ ID component and an ‘atomic’ one. Second, the number of atoms is finite. Third, these random measures are almost surely finite and, even more, their atomless component has finite Lévy measure. Moreover, for the elements of this class, we are able to show an explicit spectral representation, namely the Lévy-Khintchine formulation, and prove that there exists a unique one-to-one correspondence between them and pairs of deterministic measures satisfying certain conditions, which we call characteristic pairs. We prove all these results also for the class of point processes with independent increments, of which the Poisson point process is an example.

With these results this paper shows that the fixed component of a CRM, which has been left out in Kingman’s analysis and in the theory of CRMs in general, has exactly the same nice representation results as the widely studied ordinary component. Thus, not only is there no real need to leave the fixed component out of the analysis (as Kingman graphically says, fixed atoms can be removed by simple surgery), but doing so might also be dangerous, since in many applications the fixed component has an irreplaceable role. This will also be evident in the Bayesian setting we discuss in this work (see also [2]).

In the last section we investigate the impact of these results in the nonparametric Bayesian statistical framework presented by Broderick, Wilson and Jordan in [2] based on CRMs (see also [3]). In particular, we consider priors to be given by elements in $\mathcal{A}$ (with quasi-Lévy measure having a particular structure). First, we show that they are dense in the space of priors considered in [2] and [3] with respect to convergence in distribution, thus showing also that our density result is flexible enough to adjust to various assumptions/settings. Second, we present explicit formulations for their posterior distributions. Third, when focusing on point processes, we prove automatic conjugacy for all the elements of $\mathcal{A}$ under the only condition that the characteristic function of the posterior distribution has no zeros. This condition is satisfied in many situations and the result is more general than the one of [2], which is based on the exponential structure of the likelihood. We remark that the general nature of our results allows them to be applied in many Bayesian settings. Thus, the choice of the work of Broderick, Wilson and Jordan [2] represents a first easy example.

The paper is structured as follows. Section 2 is concerned with the notation and some preliminaries. In Section 3 we provide the density results for CRMs and in Subsection 3.1 the one for point processes with independent increments. In Section 4, we show various properties for the classes of QID random measures and QID point processes presented in Section 3. In particular we present the Lévy-Khintchine formulation and the one-to-one correspondence of these random measures with their unique characteristic pair. In Section 5, we present the Bayesian setting and the related results: computation of the posterior, convergence results for the posterior, and automatic conjugacy.

2 Notation and Preliminaries

By a measure on a measurable space $(X,\mathcal{G})$ we always mean a positive measure on $(X,\mathcal{G})$, i.e. a [0,∞]-valued σ-additive set function on $\mathcal{G}$ that assigns the value 0 to the empty set. For a non-empty set X, by $\mathcal{B}(X)$ we mean the Borel σ-algebra of X, unless stated differently. The law and the characteristic function of a random variable X will be denoted by $\mathcal{L}(X)$ and by $\hat{\mathcal{L}}(X)$, respectively. For two measurable spaces $(X,\mathcal{G})$ and $(Y,\mathcal{F})$, we denote by $\mathcal{G} \otimes \mathcal{F}$ the product σ-algebra of $\mathcal{G}$ and $\mathcal{F}$, and by $\mathcal{G} \times \mathcal{F}$ their Cartesian product. Let us recall some definitions.

Definition 2.1 (extended signed measure). Given a measurable space (X,Σ), that is, a set X with a σ-algebra Σ on it, an extended signed measure is a function µ : Σ → R ∪ {∞,−∞} s.t. µ(∅) = 0 and µ is σ-additive, that is, it satisfies the equality $\mu\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} \mu(A_n)$, where the series on the right must converge in R ∪ {∞,−∞} absolutely (namely the value of the series is independent of the order of its elements), for any sequence A_1, A_2, ... of disjoint sets in Σ.

As a consequence any extended signed measure can take plus or minus infinity as value, but not both. In this work, we use the term ‘signed measure’ for an extended signed measure. Further, the total variation of a signed measure µ is defined as the measure |µ| : Σ → [0,∞] given by

$$|\mu|(A) := \sup \sum_{j=1}^{\infty} |\mu(A_j)|, \tag{1}$$

where the supremum is taken over all the partitions {A_j} of A ∈ Σ. The total variation |µ| is finite if and only if µ is finite. Let us recall the definition of a signed bimeasure.

Definition 2.2 (Signed bimeasure). Let (X,Σ) and (Y,Γ) be two measurable spaces. A signed bimeasure is a function M : Σ × Γ → [−∞,∞] such that:
(i) the function A ↦ M(A,B) is a signed measure on Σ for every B ∈ Γ,
(ii) the function B ↦ M(A,B) is a signed measure on Γ for every A ∈ Σ.

Let S be a separable and complete metric space with Borel σ-algebra $\mathcal{S}$ and let $\hat{\mathcal{S}}$ be the ring composed of bounded Borel sets in S. The triplet $(S,\mathcal{S},\hat{\mathcal{S}})$ is called a localised Borel space (see page 19 in [12]).

Definition 2.3 (random measure). A random measure ξ on S, with underlying probability space $(\Omega,\mathcal{F},P)$, is a function Ω × S → [0,∞], such that ξ(ω,B) is $\mathcal{F}$-measurable in ω ∈ Ω for fixed B and a locally finite measure in B ∈ $\mathcal{S}$ for fixed ω.

Definition 2.4 (completely random measure). A completely random measure (CRM) ξ is a random measure s.t. for any disjoint B_1, B_2, ..., B_k ∈ $\mathcal{S}$, k ∈ N, the random variables ξ(B_1), ξ(B_2), ..., ξ(B_k) are independent. CRMs are also called independently scattered random measures or random measures with independent increments.

Definition 2.5 (diffuse random measure). Using the notation of the previous definition, we say that a random measure ξ on S is diffuse if ξ(ω,B) is a locally finite diffuse measure in B ∈ $\mathcal{S}$ for fixed ω.

Remark 2.6. The term ‘finite’ for random measures stands for a.s. finite. Thus, by a finite random measure we mean an a.s. finite random measure. For a random measure ξ on a Polish space X, x ∈ X is a fixed atom of ξ if and only if P(|ξ({x})| > 0) > 0. Further, a random measure ξ is called atomless if $\xi(\{x\}) \overset{a.s.}{=} 0$ for every x ∈ X. The atomless condition is for random measures what continuity in probability is for continuous time stochastic processes. We remark that an atomless random measure is not necessarily a diffuse random measure (see Corollary 12.11 in [11]).
For example, think of a Poisson point process with E[ξ({s})] ≡ 0, like the homogeneous Poisson point process, which has no fixed atoms but is not diffuse.

Now, we introduce the concept of a quasi-Lévy type measure. We start with the following definition, which we recall from [15]:

Definition 2.7. Let $\mathcal{B}_r(\mathbb{R}) := \{B \in \mathcal{B}(\mathbb{R}) \mid B \cap (-r,r) = \emptyset\}$ for r > 0 and let $\mathcal{B}_0(\mathbb{R}) := \bigcup_{r>0} \mathcal{B}_r(\mathbb{R})$ be the class of all Borel sets that are bounded away from zero. Let $\nu : \mathcal{B}_0(\mathbb{R}) \to \mathbb{R}$ be a function such that $\nu|_{\mathcal{B}_r(\mathbb{R})}$ is a finite signed measure for each r > 0 and denote the total variation, positive and negative part of $\nu|_{\mathcal{B}_r(\mathbb{R})}$ by $|\nu|_{\mathcal{B}_r(\mathbb{R})}|$, $\nu^+_{|\mathcal{B}_r(\mathbb{R})}$ and $\nu^-_{|\mathcal{B}_r(\mathbb{R})}$ respectively. Then the total variation |ν|, the positive part $\nu^+$ and the negative part $\nu^-$ of ν are defined to be the unique measures on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ satisfying

$$|\nu|(\{0\}) = \nu^+(\{0\}) = \nu^-(\{0\}) = 0$$

and

$$|\nu|(A) = |\nu|_{\mathcal{B}_r(\mathbb{R})}|(A), \quad \nu^+(A) = \nu^+_{|\mathcal{B}_r(\mathbb{R})}(A), \quad \nu^-(A) = \nu^-_{|\mathcal{B}_r(\mathbb{R})}(A),$$

for $A \in \mathcal{B}_r(\mathbb{R})$, for some r > 0.

As mentioned in [15], ν is not a signed measure because it is defined on $\mathcal{B}_0(\mathbb{R})$, which is not a σ-algebra. In the case it is possible to extend the definition of ν to $\mathcal{B}(\mathbb{R})$ such that ν is a signed measure, we will identify ν with its extension to $\mathcal{B}(\mathbb{R})$ and speak of ν as a signed measure. Moreover, the uniqueness of |ν|, $\nu^+$ and $\nu^-$ is ensured by an application of Carathéodory’s extension theorem (see Lemma 2.14 in [19]). Further, notice that $\mathcal{B}_0(\mathbb{R}) = \{B \in \mathcal{B}(\mathbb{R}) : 0 \notin \overline{B}\} \neq \{B \in \mathcal{B}(\mathbb{R}) : 0 \notin B\}$ (see Remark 2.6 in [19]).

Definition 2.8 (quasi-Lévy type measure, quasi-Lévy measure, QID distribution, from [15]). A quasi-Lévy type measure is a function $\nu : \mathcal{B}_0(\mathbb{R}) \to \mathbb{R}$ satisfying the condition in Definition 2.7 and such that its total variation |ν| satisfies $\int_{\mathbb{R}} (1 \wedge x^2)\,|\nu|(dx) < \infty$. Let µ be a probability distribution on R. We say that µ is quasi-infinitely divisible if its characteristic function has a representation

$$\hat{\mu}(\theta) = \exp\left( i\theta\gamma - \frac{\theta^2}{2}a + \int_{\mathbb{R}} \left( e^{i\theta x} - 1 - i\theta\tau(x) \right) \nu(dx) \right)$$

where a, γ ∈ R and ν is a quasi-Lévy type measure. The characteristic triplet (γ,a,ν) of µ is unique, and a and γ are called the Gaussian variance and the drift of µ, respectively. A quasi-Lévy type measure ν is called a quasi-Lévy measure if additionally there exist a quasi-infinitely divisible distribution µ and some a, γ ∈ R such that (γ,a,ν) is the characteristic triplet of µ. We call ν the quasi-Lévy measure of µ.

The above definition extends to the $\mathbb{R}^d$ case (for d > 1) as shown in Remark 2.4 in [15]. As pointed out in Example 2.9 of [15], a quasi-Lévy measure is always a quasi-Lévy type measure, while the converse is not true. Moreover, we say that a function f is integrable with respect to a quasi-Lévy type measure ν if it is integrable with respect to |ν|. Then, we define:

$$\int_B f\,d\nu := \int_B f\,d\nu^+ - \int_B f\,d\nu^-, \quad B \in \mathcal{B}(\mathbb{R}).$$

In this work we always keep the same order for the elements in the characteristic triplet: the first element is the drift, the second one is the Gaussian variance, and the third one is the (quasi) Lévy measure.
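The definitions of |ν|, ν⁺, ν⁻ and of the integral with respect to ν can be illustrated on a purely atomic signed measure. The sketch below is only an illustration: the atoms, the weights, the test function and the set B are hypothetical choices that do not come from the text.

```python
# Illustrative sketch: a purely atomic quasi-Levy type measure nu = sum_i w_i * delta_{x_i}.
# The atoms and weights are hypothetical and chosen only for illustration.
import math

atoms = [-2.0, 0.5, 1.0, 3.0]      # locations x_i (none equal to 0)
weights = [0.4, -0.3, 1.2, -0.1]   # signed masses w_i = nu({x_i})

# Jordan decomposition of an atomic signed measure: split the weights by sign.
nu_plus = [max(w, 0.0) for w in weights]
nu_minus = [max(-w, 0.0) for w in weights]
nu_abs = [p + m for p, m in zip(nu_plus, nu_minus)]   # total variation |nu|

def integrate(f, masses, in_B=lambda x: True):
    """Integral of f over B with respect to the atomic measure with the given masses."""
    return sum(f(x) * m for x, m in zip(atoms, masses) if in_B(x))

# Integrability condition of Definition 2.8: the integral of (1 ^ x^2) against |nu| is finite.
levy_integral = integrate(lambda x: min(1.0, x * x), nu_abs)

# Integral of f over B with respect to nu, as the difference of two ordinary integrals.
f = math.cos
B = lambda x: x > 0
int_nu = integrate(f, nu_plus, B) - integrate(f, nu_minus, B)
direct = sum(f(x) * w for x, w in zip(atoms, weights) if B(x))
```

For an atomic measure the difference of the two integrals coincides with the direct signed sum, which is what `int_nu` and `direct` compare.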

Definition 2.9 (QID random measure). Let Λ be a random measure. If Λ(A) is a QID random variable, for every A ∈ $\mathcal{S}$, then we call Λ a QID random measure.

We conclude with the following result on QID distributions.

Theorem 2.10 (Theorem 4.3.4 in [5]). Let d ∈ N. The characteristic triplet (γ,0,ν), where ν is a finite quasi-Lévy type measure, is the characteristic triplet of a QID distribution on $\mathbb{R}^d$ if and only if $\exp(\nu) := \sum_{n=0}^{\infty} \frac{\nu^{*n}}{n!}$ (with $\nu^{*0} := \delta_0$) is a measure. In that case, µ ∼ (γ,0,ν) is given by

$$\mu = \frac{\delta_\gamma * \exp(\nu)}{\exp(\nu(\mathbb{R}^d))}.$$
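The criterion of Theorem 2.10 can be checked numerically on the integer lattice. The sketch below is a hedged illustration, not part of the original argument. It assumes γ = 0, reads the triplet without compensator (possible because ν is finite), and takes the exponential series from n = 0 with ν^{*0} = δ_0. The example is the Bernoulli(p) law with p < 1/2, for which the power series of log(1 + rz), r = p/(1−p), suggests the signed measure ν = Σ_{m≥1} (−1)^{m+1} r^m/m δ_m. The function name `signed_exp` and the truncation level M are choices made here.

```python
import numpy as np

def signed_exp(nu, n_terms=80):
    """Truncated exp(nu) = sum_{n>=0} nu^{*n}/n! for nu supported on {0,...,M}.

    nu[m] is the signed mass at the integer m; masses of the result beyond M are dropped.
    """
    M = len(nu) - 1
    term = np.zeros(M + 1)
    term[0] = 1.0                                    # nu^{*0} = delta_0
    total = term.copy()
    for n in range(1, n_terms):
        term = np.convolve(term, nu)[: M + 1] / n    # nu^{*n}/n!, restricted to {0,...,M}
        total += term
    return total

p = 0.3
r = p / (1 - p)
M = 60
m = np.arange(1, M + 1)
nu = np.zeros(M + 1)
nu[1:] = (-1.0) ** (m + 1) * r ** m / m   # signed masses; the mass at 2 is negative

# mu = delta_0 * exp(nu) / exp(nu(R)), i.e. the formula of the theorem with gamma = 0.
mu = signed_exp(nu) / np.exp(nu.sum())
```

Up to floating-point error, `mu` puts mass 1 − p at 0 and p at 1 and nothing elsewhere, so exp(ν) is a measure even though ν takes negative values.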

3 The density result for QID CRMs

In this section we present the density results for QID CRMs in the space of all CRMs with respect to convergence in distribution. Let us start with some preliminaries. Let S be a separable and complete metric space with Borel σ-algebra $\mathcal{S}$ and let $\hat{\mathcal{S}}$ be the ring composed of bounded Borel sets in S.

Let $\hat{C}_S$ be the space of all bounded continuous functions $f : S \to \mathbb{R}_+$ with bounded support. Let $\mathcal{M}_S$ be the space of locally finite measures, namely $\mu \in \mathcal{M}_S$ if µ(B) < ∞ for every B ∈ $\hat{\mathcal{S}}$. The space $\mathcal{M}_S$ may be endowed with the vague topology, denoted by $B_{\mathcal{M}_S}$, generated by the integration maps $\pi_f : \mu \mapsto \int f(x)\mu(dx)$, for all $f \in \hat{C}_S$. The vague topology is the coarsest topology making all $\pi_f$ continuous. The measurable space $(\mathcal{M}_S, B_{\mathcal{M}_S})$ is a Polish space. The associated notion of vague convergence, denoted by $\mu_n \overset{v}{\to} \mu$, is defined by the condition $\int f(x)\mu_n(dx) \to \int f(x)\mu(dx)$ for all $f \in \hat{C}_S$.

An equivalent definition of random measure (see Definition 2.3) is the following: a random measure ξ is a measurable mapping from $(\Omega,\mathcal{F},P)$ to $(\mathcal{M}_S, \mathcal{B}_{\mathcal{M}_S})$, where $\mathcal{B}_{\mathcal{M}_S}$ is the topology generated by all projection maps $\pi_B : \mu \mapsto \mu(B)$ with B ∈ $\mathcal{S}$, or, equivalently, by all integration maps $\pi_f$ with measurable f ≥ 0. From Lemma 4.1 in [10] or Theorem 4.2 in [12], we know that $\mathcal{B}_{\mathcal{M}_S}$ and $B_{\mathcal{M}_S}$ coincide. Hence it is equivalent to consider a random measure as a measurable mapping from $(\Omega,\mathcal{F},P)$ to $(\mathcal{M}_S, B_{\mathcal{M}_S})$ or to $(\mathcal{M}_S, \mathcal{B}_{\mathcal{M}_S})$.

The convergence in distribution of $\xi_n$ to ξ means that $E[g(\xi_n)] \to E[g(\xi)]$ for every bounded continuous function g on $\mathcal{M}_S$, or equivalently that $\mathcal{L}(\xi_n) \overset{w}{\to} \mathcal{L}(\xi)$, where for any bounded measures $\mu_n$ and µ, the weak convergence $\mu_n \overset{w}{\to} \mu$ stands for $\int g(y)\mu_n(dy) \to \int g(y)\mu(dy)$ for all g as above. We write $\xi_n \overset{vd}{\to} \xi$ to stress that the convergence in distribution is for random measures considered as random elements in the space $\mathcal{M}_S$ with the vague topology. As mentioned in the previous section, in this setting a fixed atom of a random measure ξ is an element s ∈ S such that P(ξ({s}) > 0) > 0. We recall now a fundamental result by Harris, see [8].

Theorem 3.1 (see Theorem 4.11 in [12]). Let ξ, ξ_1, ξ_2, ... be random measures on S. Then these conditions are equivalent:
(i) $\xi_n \overset{vd}{\to} \xi$,
(ii) $\int f(x)\xi_n(dx) \overset{d}{\to} \int f(x)\xi(dx)$ for all $f \in \hat{C}_S$,
(iii) $E\left[\exp\left(-\int f(x)\xi_n(dx)\right)\right] \to E\left[\exp\left(-\int f(x)\xi(dx)\right)\right]$ for all $f \in \hat{C}_S$ with f ≤ 1.

The following density result extends Theorem 4.1 in [15].

Theorem 3.2. Let A be a connected interval of the real line. The class of QID distributions with finite quasi-Lévy measure, zero Gaussian variance and with support on A is dense in the class of probability distributions with support on A with respect to weak convergence.

Proof. Some arguments of the proof are similar in nature to those of the proof of Theorem 4.1 in [15]. First, we prove the result when A is bounded.

Let A be a finite closed interval, thus A = [k,c] for some k, c ∈ R. Let µ be a probability distribution with support [k,c]. For n ∈ N, let $b_{j,n} = k + (c-k)j/(2n^2)$, $j \in \{0,\dots,2n^2\}$, and define the discrete distribution $\mu_n$ concentrated on the lattice $\{b_{0,n},\dots,b_{2n^2,n}\}$ by

$$\mu_n(\{b_{j,n}\}) = \begin{cases} \mu((-\infty,b_{0,n}]), & j = 0, \\ \mu((b_{j-1,n},b_{j,n}]), & j = 1,\dots,2n^2-1, \\ \mu((b_{2n^2-1,n},\infty)), & j = 2n^2. \end{cases} \tag{2}$$

Then, $\mu_n \overset{w}{\to} \mu$ as n → ∞. Observe that $\mu_n$ is the probability distribution of a random variable with values on $\{b_{0,n},\dots,b_{2n^2,n}\} \subset [k,c]$. It remains to prove that each $\mu_n$ is a weak limit of QID distributions with finite quasi-Lévy measure, zero Gaussian variance and with support on [k,c]. W.l.o.g. assume that the approximating sequence of distributions σ is such that $\sigma(\{b_{j,n}\}) > 0$ for every $j \in \{0,\dots,2n^2\}$.
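The discretisation in (2) can be sketched numerically. The code below is an illustration under assumptions made here: µ is described through its distribution function, and Uniform[0,1] is used as a hypothetical example of µ; the function name `lattice_approximation` is not from the text.

```python
import numpy as np

def lattice_approximation(cdf, k, c, n):
    """Return the lattice b_{j,n} and the masses of mu_n from equation (2).

    cdf is the distribution function of mu, assumed to have support in [k, c].
    """
    N = 2 * n * n
    b = k + (c - k) * np.arange(N + 1) / N        # b_{0,n}, ..., b_{2n^2,n}
    F = np.array([cdf(x) for x in b])
    masses = np.empty(N + 1)
    masses[0] = F[0]                              # mu((-inf, b_0])
    masses[1:N] = F[1:N] - F[0:N - 1]             # mu((b_{j-1}, b_j]), j = 1, ..., N-1
    masses[N] = 1.0 - F[N - 1]                    # mu((b_{N-1}, inf))
    return b, masses

uniform_cdf = lambda x: min(max(x, 0.0), 1.0)     # Uniform[0, 1], an assumed example

b, masses = lattice_approximation(uniform_cdf, 0.0, 1.0, n=5)
# On the lattice points the distribution functions of mu_n and mu agree by construction.
gap = np.max(np.abs(np.cumsum(masses) - np.array([uniform_cdf(x) for x in b])))
```

Since the two distribution functions coincide on the lattice and the mesh is (c − k)/(2n²), the approximation becomes finer as n grows, in line with the weak convergence stated in the proof.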

Assume that the characteristic function $\hat{\sigma}$ has zeros (in the other case we can directly use Corollary 3.10 in [15] to conclude). Let X be a random variable with distribution σ and define $Y = \frac{(X-k)2n^2}{c-k}$. Then, Y is concentrated on $\{0,\dots,2n^2\}$ with masses $a_j = P(Y = j) > 0$ for $j = 0,\dots,2n^2$, and its characteristic function has zeros. Then, the polynomial $f(w) = \sum_{j=0}^{2n^2} a_j w^j$ has zeros on the unit circle. Factorizing, we obtain $f(w) = a_{2n^2}\prod_{j=1}^{2n^2}(w - \xi_j)$, where $\xi_j$, $j = 1,\dots,2n^2$, denote the complex roots. Let $f_h(w) = a_{2n^2}\prod_{j=1}^{2n^2}(w - \xi_j - h)$, where w ∈ C and h > 0. Then, for small enough h, $f_h$ is a polynomial with real coefficients, namely $f_h(w) = \sum_{j=0}^{2n^2} a_{h,j} w^j$ with $a_{h,j} \in \mathbb{R}$. Observe that for small enough h, $a_{h,j}$ and $a_j$ will be close, so $a_{h,j} > 0$. Now, let $Z_h$ be a random variable with distribution $\sigma_h = \left(\sum_{j=0}^{2n^2} a_{h,j}\right)^{-1} \sum_{j=0}^{2n^2} a_{h,j}\delta_j$ and let $X_h = \frac{Z_h(c-k)}{2n^2} + k$. Observe that, for every small enough h > 0, $X_h$ is a random variable with values on the lattice $\{b_{0,n},\dots,b_{2n^2,n}\}$ and its characteristic function has no zeros, and that $X_h \overset{d}{\to} X$ as h ↘ 0. Finally, by Corollary 3.10 in [15] we know that $X_h$ is QID with finite quasi-Lévy measure and zero Gaussian variance.

Observe that if A is a bounded open interval, say A = (k′,c′) for some k′, c′ ∈ R, then the above arguments apply. Let µ be a probability distribution with support (k′,c′). For any n ∈ N let $k'_n = k' + \frac{c'-k'}{2n^2}$ and $c'_n = c' - \frac{c'-k'}{2n^2}$, let $b_{j,n} = k'_n + (c'_n - k'_n)j/(2n^2)$, $j \in \{0,\dots,2n^2\}$, and define the discrete distribution $\mu_n$ concentrated on the lattice $\{b_{0,n},\dots,b_{2n^2,n}\}$ as in (2). Then, $\mu_n \overset{w}{\to} \mu$ as n → ∞ and, applying the same remaining arguments (in which n is fixed) for $k'_n$ and $c'_n$ instead of k and c, we obtain the result for A bounded and open.

Let now A be an unbounded interval of the form A = [k,∞) for some k ∈ R. Let µ be a probability distribution with support on [k,∞). For n ∈ N, let $b_{j,n} = k + j/n$, $j \in \{0,\dots,2n^2\}$, and define the discrete distribution $\mu_n$ concentrated on the lattice $\{b_{0,n},\dots,b_{2n^2,n}\}$ as in (2). Then, $\mu_n \overset{w}{\to} \mu$ as n → ∞. Using the notation above, let X be a random variable with distribution σ and define Y = (X − k)n. Then, Y is concentrated on $\{0,\dots,2n^2\}$ with masses $a_j$ and its characteristic function has zeros by assumption. We proceed as before. Thus, for small enough h, we obtain a polynomial with real coefficients $f_h$, namely $f_h(w) = \sum_{j=0}^{2n^2} a_{h,j} w^j$ with $a_{h,j} \in \mathbb{R}$ and $a_{h,j} > 0$. Then, let $Z_h$ be a random variable with distribution $\sigma_h = \left(\sum_{j=0}^{2n^2} a_{h,j}\right)^{-1} \sum_{j=0}^{2n^2} a_{h,j}\delta_j$ and let $X_h = \frac{Z_h}{n} + k$. Then, $X_h$ is a random variable with support on $\{b_{0,n},\dots,b_{2n^2,n}\} \subset [k,\infty)$, its characteristic function has no zeros, and $X_h \overset{d}{\to} X$ as h ↘ 0. Hence, by Corollary 3.10 in [15] we obtain the result. Similarly we obtain the result for (k′,∞), for (−∞,c] and for (−∞,c′), where k′, c, c′ ∈ R.
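The root-shifting step of the proof can be tried on a small lattice law. The following sketch is illustrative only: the uniform law on {0,1,2,3} and the value h = 0.05 are assumptions made here, chosen because the generating polynomial then has all its zeros (−1, i, −i) on the unit circle.

```python
import numpy as np

# Assumed example: Y uniform on {0, 1, 2, 3}; f(w) = (1 + w + w^2 + w^3)/4 has zeros -1, i, -i.
a = np.full(4, 0.25)                       # a_0, ..., a_3 (ascending powers)
roots = np.roots(a[::-1])                  # np.roots expects descending powers

h = 0.05
shifted = roots + h                        # zeros of f_h(w) = a_N * prod_j (w - xi_j - h)
a_h = np.real(a[-1] * np.poly(shifted))[::-1]   # coefficients a_{h,j}, ascending powers
sigma_h = a_h / a_h.sum()                  # normalised masses of Z_h on {0, 1, 2, 3}

def char_fn_modulus(masses, theta):
    """|E exp(i*theta*Z)| for Z on {0, 1, ...} with the given masses."""
    j = np.arange(len(masses))
    return np.abs(np.exp(1j * np.outer(theta, j)) @ masses)

theta = np.linspace(0.0, 2.0 * np.pi, 2001)
min_before = char_fn_modulus(a, theta).min()        # numerically zero (zero at theta = pi)
min_after = char_fn_modulus(sigma_h, theta).min()   # stays away from zero on the grid
```

In this example the shifted coefficients remain positive and the perturbed zeros have moduli different from one, so the characteristic function of the perturbed law has no zeros while the masses move only slightly.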

Recall that the Lévy-Prokhorov metric (or better just Lévy metric since we work on R) for two probability distributions F and G on R is defined as

ρ(F,G) := inf{ε > 0|F(x − ε) − ε ≤ G(x) ≤ F(x + ε)+ ε for all x ∈ R}.

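Lemma 3.3. Let F and G be any two probability distributions on R and let $F_c(x) := F(\frac{x}{c})$ and $G_c(x) := G(\frac{x}{c})$ where c ∈ R \ {0}. For every positive constant c ≤ 1 we have that $\rho(F_c,G_c) \le \rho(F,G)$.

Proof. Let c be any positive constant with c ≤ 1. Observe that $F_c(x-\varepsilon) = F(\frac{x-\varepsilon}{c}) \le F(\frac{x}{c} - \varepsilon)$ and similarly we have that $F_c(x+\varepsilon) \ge F(\frac{x}{c} + \varepsilon)$. This implies that if ε > 0 satisfies F(x − ε) − ε ≤ G(x) ≤ F(x + ε) + ε for all x ∈ R, then it also satisfies $F_c(x-\varepsilon) - \varepsilon \le G_c(x) \le F_c(x+\varepsilon) + \varepsilon$ for all x ∈ R. Then, we have

$$\rho(F_c,G_c) = \inf\{\varepsilon > 0 \mid F_c(x-\varepsilon) - \varepsilon \le G_c(x) \le F_c(x+\varepsilon) + \varepsilon \text{ for all } x \in \mathbb{R}\}$$
$$\le \inf\{\varepsilon > 0 \mid F(x-\varepsilon) - \varepsilon \le G(x) \le F(x+\varepsilon) + \varepsilon \text{ for all } x \in \mathbb{R}\} = \rho(F,G).$$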

Observe that for two real valued random variables X and Y the above lemma affirms that for any 0 < c ≤ 1 we have that ρ(cX,cY) ≤ ρ(X,Y). Another useful property of the Lévy metric is the following. From condition 3) of the section “Lévy metric” in [22] (page 405), given any probability distributions $F_1,\dots,F_k,G_1,\dots,G_k$ on R, where k ∈ N, we have that

$$\rho(F_1 * \cdots * F_k,\, G_1 * \cdots * G_k) \le \sum_{j=1}^{k} \rho(F_j,G_j). \tag{3}$$
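The scaling property ρ(cX,cY) ≤ ρ(X,Y) can be checked numerically for discrete laws. The sketch below is a hedged illustration: the representation of a law as a list of (atom, mass) pairs, the bisection tolerance and the two example laws are choices made here. For step distribution functions the two inequalities in the definition of ρ only need to be checked just to the left of the jump points, which is what `feasible` does.

```python
def cdf_left(law, x):
    """P(X < x) for a discrete law given as a list of (atom, mass) pairs."""
    return sum(m for a, m in law if a < x)

def feasible(F, G, eps, slack=1e-12):
    """True if F(x-eps)-eps <= G(x) <= F(x+eps)+eps for all real x.

    For step functions it is enough to look just to the left of the jump points.
    """
    lower = all(cdf_left(F, g - eps) - eps <= cdf_left(G, g) + slack for g, _ in G)
    upper = all(cdf_left(G, f - eps) <= cdf_left(F, f) + eps + slack for f, _ in F)
    return lower and upper

def levy_metric(F, G, tol=1e-10):
    """Bisection for the infimum of the feasible eps (eps = 1 is always feasible)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(F, G, mid):
            hi = mid
        else:
            lo = mid
    return hi

def scale(law, c):
    """Law of c*X when X has the given law."""
    return [(c * a, m) for a, m in law]

F = [(0.0, 0.2), (1.0, 0.5), (2.0, 0.3)]   # hypothetical example laws
G = [(0.5, 0.6), (3.0, 0.4)]
rho = levy_metric(F, G)
rho_half = levy_metric(scale(F, 0.5), scale(G, 0.5))
```

A simple sanity check is the pair of point masses at 0 and at a ∈ (0,1], whose Lévy distance is a; scaling by c ≤ 1 moves the second atom to ca and the distance shrinks accordingly.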

For the next two results denote by $S_n$ a sequence of bounded sets (i.e. $S_n \in \hat{\mathcal{S}}$) s.t. $S_n \uparrow S$. Notice that such a sequence exists by the definition of $\hat{\mathcal{S}}$, see page 19 in [12].

Proposition 3.4. Consider an atomless CRM α with corresponding unique pair (γ,F). Let $\gamma_n(A) = \gamma(S_n \cap A)$ and let $F_n(C) = F(C \cap (S_n \times (\frac{1}{n},\infty)))$, for every A ∈ $\mathcal{S}$, C ∈ $\mathcal{S} \otimes \mathcal{B}((0,\infty))$ and n ∈ N. Then, $\gamma_n$ and $F_n$ are finite measures and there exists a sequence of atomless finite CRMs $\alpha_n$ with pair $(\gamma_n,F_n)$ s.t. $\alpha_n \overset{d}{\to} \alpha$.

Proof. From Kingman’s representation theorem (see [14] and see also Corollary 12.11 in [11] and Corollary 3.21 in [12]), we have that every atomless CRM α has the following representation:

$$\alpha = \gamma + \int_0^\infty \int_S x\,\delta_s\,\eta(ds\,dx), \quad \text{a.s.} \tag{4}$$

for some non-random diffuse measure $\gamma \in \mathcal{M}_S$ and a Poisson process η on S × (0,∞) with intensity F satisfying

$$\int_0^\infty (1 \wedge x)\,F(A \times dx) < \infty, \tag{5}$$

for every A ∈ $\hat{\mathcal{S}}$. In particular, for every B ∈ $\mathcal{S}$ we have that α(B) < ∞ if and only if γ(B) < ∞ and condition (5) holds for B ∈ $\mathcal{S}$ (see Corollary 12.11 in [11]). Further, notice that the above formulation implies that for every A ∈ $\mathcal{S}$ and $f \in \hat{C}_S$

$$\alpha(A) = \gamma(A) + \int_0^\infty x\,\eta(A \times dx) \quad \text{and} \quad \alpha f = \gamma f + \int_0^\infty \int_S x f(s)\,\eta(ds\,dx), \quad \text{a.s.}$$

Moreover, the unique one to one correspondence between α and (γ,F) is shown in Theorem 3.20 of [12].

It is possible to see that $\gamma_n$ and $F_n$ are measures on $\mathcal{S}$ and on $\mathcal{S} \otimes \mathcal{B}((0,\infty))$, respectively. In particular, since α(A) < ∞ for every A ∈ $\hat{\mathcal{S}}$ then $\gamma(S_n) < \infty$ and

$$\int_0^\infty (1 \wedge x)\,F(S_n \times dx) < \infty \ \Rightarrow\ F\left(S_n \times \left(\tfrac{1}{n},\infty\right)\right) < \infty,$$

for every n ∈ N. Thus, $\gamma_n$ and $F_n$ are finite measures, for every n ∈ N. Now, for every n ∈ N, let $\eta_n$ be a Poisson process on S × (0,∞) with intensity $F_n$ and let

$$\alpha_n = \gamma_n + \int_0^\infty \int_S x\,\delta_s\,\eta_n(ds\,dx).$$

Then, we have that $\alpha_n$ is an atomless CRM and since $\gamma_n$ and $F_n$ are finite then $\alpha_n$ is finite, for every n ∈ N (see Corollary 12.11 in [11]).

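Concerning the stated convergence we have the following. From Lemma 12.2 in [11] (or from Lemma 3.1 in [12]) we have that for every $f \in \hat{C}_S$

$$-\log E\left[\exp\left(-\int f(s)\alpha(ds)\right)\right] = \gamma f + \int_0^\infty \int_S \left(1 - e^{-x\delta_s f}\right) F(ds\,dx).$$

Hence, by assumption we have that for every $f \in \hat{C}_S$

$$-\log E\left[\exp\left(-\int f(s)\alpha(ds)\right)\right] + \log E\left[\exp\left(-\int f(s)\alpha_n(ds)\right)\right]$$
$$= \int_S f(s)\gamma(ds) + \int_0^\infty \int_S \left(1 - e^{-x f(s)}\right) F(ds\,dx) - \int_{S_n} f(s)\gamma(ds) - \int_{\frac{1}{n}}^\infty \int_{S_n} \left(1 - e^{-x f(s)}\right) F(ds\,dx)$$
$$= \int_{S \setminus S_n} f(s)\gamma(ds) + \int_0^{\frac{1}{n}} \int_{S_n} \left(1 - e^{-x f(s)}\right) F(ds\,dx) + \int_0^\infty \int_{S \setminus S_n} \left(1 - e^{-x f(s)}\right) F(ds\,dx) \to 0,$$

as n → ∞. Each of the three terms tends to zero by dominated convergence, since γf and $\int_0^\infty \int_S (1 - e^{-x f(s)}) F(ds\,dx)$ are finite and the domains of integration shrink to the empty set. Then, by point (iii) in Theorem 3.1 (see also Lemma 4.24 in [12]) we obtain that $\alpha_n \overset{d}{\to} \alpha$, as n → ∞.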
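The effect of the truncation at jump size 1/n on the Laplace exponent can be seen in a concrete case. The sketch below rests on assumptions that are not in the text: S = [0,1] (so that S_n = S and a constant f = t is admissible), γ = 0, and the gamma-type intensity F(ds dx) = ds · e^{−x}/x dx, for which the full exponent equals log(1 + t) by Frullani's integral.

```python
import numpy as np

def truncated_exponent(t, n, upper=60.0, grid=200_001):
    """Laplace exponent of alpha_n at f = t for the assumed intensity ds * exp(-x)/x dx.

    Only jumps of size x > 1/n are kept; the integral over (1/n, upper) is computed
    with the trapezoidal rule after the substitution x = exp(u).
    """
    u = np.linspace(np.log(1.0 / n), np.log(upper), grid)
    x = np.exp(u)
    y = (1.0 - np.exp(-t * x)) * np.exp(-x)   # integrand (1-e^{-tx}) e^{-x}/x times dx/du = x
    return float(np.sum((y[1:] + y[:-1]) * np.diff(u)) / 2.0)

t = 1.0
exact = np.log(1.0 + t)                       # exponent of the untruncated CRM
levels = (1, 10, 100, 1000)
errors = [exact - truncated_exponent(t, n) for n in levels]
```

The discarded part is the integral of (1 − e^{−tx})e^{−x}/x over (0, 1/n], which is bounded by t/n, so the entries of `errors` decrease to zero, mirroring the convergence of the Laplace functionals used in the proof.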

Now, let us denote by $\mathcal{I}$ the set of all CRMs on S (considered as random elements in $\mathcal{M}_S$ endowed with the vague topology) and recall that $\mathbb{Z}_+ = \mathbb{N} \cup \{0\}$. From Theorem 7.1 in [10] we know that an element ξ of $\mathcal{I}$ has the following representation

$$\xi \overset{a.s.}{=} \alpha + \sum_{j=1}^{K} \beta_j\delta_{s_j}$$

with $K \in \mathbb{Z}_+ \cup \{\infty\}$, where $\{s_j : j \ge 1\}$ is the set of fixed atoms of ξ in S, α is an atomless CRM, and $\beta_j$, j ≥ 1, are $\mathbb{R}_+$-valued random variables, which are mutually independent and independent of α. We call $\sum_{j=1}^{K} \beta_j\delta_{s_j}$ the fixed component of ξ. We remark that in Kingman’s representation α is the sum of a deterministic and an ordinary component, as shown in the proof of Proposition 3.4 in eq. (4). Consider the following class of QID random measures:

$$\mathcal{A} := \Big\{ \xi \in \mathcal{I} \ :\ \xi \overset{a.s.}{=} \alpha + \sum_{j=1}^{K} \beta_j\delta_{s_j}, \text{ with } \alpha \text{ an atomless CRM with finite Lévy measure, } \{s_j : j = 1,\dots,K\} \text{ a finite set of fixed atoms in } S, \text{ and } \beta_j,\ j = 1,\dots,K,\ \mathbb{R}_+\text{-valued QID random variables with finite quasi-Lévy measure and zero Gaussian variance that are mutually independent and independent of } \alpha \Big\}.$$

First, notice that α is ID, because any atomless random measure with independent increments is ID. Second, observe that, in contrast with the usual representation of CRMs, for the elements of $\mathcal{A}$ the atomless random measure α has finite Lévy measure (thus, α is finite), the number of fixed atoms K is finite, and the $\beta_j$, j = 1,...,K, are $\mathbb{R}_+$-valued QID random variables with finite quasi-Lévy measure and zero Gaussian variance. Notice that the elements of $\mathcal{A}$ are almost surely finite on S. Thus, $\mathcal{A}$ is strictly smaller than the class of QID CRMs, which in turn is strictly smaller than the class of all CRMs (namely $\mathcal{I}$). We are ready to present the main result of this section.

Theorem 3.5. $\mathcal{A}$ is dense in the space of all CRMs with respect to convergence in distribution.

Proof. From Theorem 7.1 in [10] we know that any CRM has the following unique representation

$$\xi \overset{a.s.}{=} \alpha + \sum_{j=1}^{K} \beta_j\delta_{s_j} \tag{6}$$

with K ≤ ∞, where $\{s_j : j \ge 1\}$ is the set of fixed atoms of ξ, α is a random measure without fixed atoms with independent increments (hence, α is an atomless ID random measure), and $\beta_j$, j ≥ 1, are $\mathbb{R}_+$-valued random variables, which are mutually independent and independent of α.

From Theorem 3.2 with A = [0,∞), we know that for each $\beta_j$ there exists a sequence of non-negative QID random variables with zero Gaussian variance and finite quasi-Lévy measure that converges in distribution to $\beta_j$, for every j ∈ N. Denote by $\beta_{n,j}$ such a sequence.

Denote by $S_n$ the sequence of bounded sets s.t. $S_n \uparrow S$ and by (γ,F) the pair associated to α. Let $\gamma_n(A) = \gamma(S_n \cap A)$ and $F_n(C) = F(C \cap (S_n \times (\frac{1}{n},\infty)))$, for every A ∈ $\mathcal{S}$, C ∈ $\mathcal{S} \otimes \mathcal{B}((0,\infty))$ and n ∈ N, as in Proposition 3.4. Then, by Proposition 3.4 there exists a sequence of finite CRMs $\alpha_n$ with pair $(\gamma_n,F_n)$ s.t. $\alpha_n \overset{d}{\to} \alpha$.

The first step is to show the existence of random measures $\xi_n \in \mathcal{A}$ with ID random measure equal in distribution to $\alpha_n$, with fixed atoms in $\{s_j : j \ge 1\}$ and weights equal in distribution to $\beta_{n,j}$. The existence is not immediate because we do not know whether the $\beta_{n,j}$ are mutually independent and independent of $\alpha_n$ in the underlying probability space of ξ. This is a classical problem in probability and the solution lies in the construction of a probability space under which these conditions are satisfied, which is given by the ‘product’ of the probability spaces.

For the sake of clarity and completeness let us write here the arguments. Fix n ∈ N. Denote the underlying probability spaces of $\alpha_n$ by $(\Omega,\mathcal{F},P)$ and of the random variable $\beta_{n,j}$ by $(\Omega_j,\mathcal{F}_j,P_j)$, for j = 1,...,n. Consider the probability space $(\Omega',\mathcal{F}',P')$ where $\Omega' = \Omega \times \Omega_1 \times \cdots \times \Omega_n$, $\mathcal{F}' = \mathcal{F} \otimes \mathcal{F}_1 \otimes \cdots \otimes \mathcal{F}_n$ and P′ is the product probability measure of $P,P_1,\dots,P_n$.

Let $\alpha'_n(\cdot)(\omega,\omega_1,\dots,\omega_n) := \alpha_n(\cdot)(\omega)$ and let $\beta'_{n,j}(\omega,\omega_1,\dots,\omega_n) := \beta_{n,j}(\omega_j)$, where j = 1,...,n, for every $(\omega,\omega_1,\dots,\omega_n) \in \Omega'$. Observe that for every $B_1,\dots,B_k \in \mathcal{S}$ and $x_1,\dots,x_k,x_1^{(1)},\dots,x_k^{(1)},\dots,x_1^{(n)},\dots,x_k^{(n)} \in \mathbb{R}_+$ we have that

$$P'\Big(\alpha'_n(B_1) < x_1,\dots,\alpha'_n(B_k) < x_k,\ \delta_{s_1}(B_1)\beta'_{n,1} < x_1^{(1)},\dots,\delta_{s_1}(B_k)\beta'_{n,1} < x_k^{(1)},\ \dots,\ \delta_{s_n}(B_1)\beta'_{n,n} < x_1^{(n)},\dots,\delta_{s_n}(B_k)\beta'_{n,n} < x_k^{(n)}\Big)$$
$$= P\Big(\alpha_n(B_1) < x_1,\dots,\alpha_n(B_k) < x_k\Big)\, P_1\Big(\delta_{s_1}(B_1)\beta_{n,1} < x_1^{(1)},\dots,\delta_{s_1}(B_k)\beta_{n,1} < x_k^{(1)}\Big) \cdots P_n\Big(\delta_{s_n}(B_1)\beta_{n,n} < x_1^{(n)},\dots,\delta_{s_n}(B_k)\beta_{n,n} < x_k^{(n)}\Big).$$

Now, let

$$\xi_n(\cdot)(\omega') := \alpha'_n(\cdot)(\omega') + \sum_{j=1}^{n} \beta'_{n,j}(\omega')\delta_{s_j}(\cdot), \quad \forall \omega' \in \Omega', \tag{7}$$

where $s_1,\dots,s_n$ are the same as the ones in (6). It is possible to see that, for every ω′ ∈ Ω′, $\xi_n(\cdot)(\omega')$ is a measure because it is the sum of measures and that, for every B ∈ $\mathcal{S}$, $\xi_n(B)(\cdot)$ is a measurable function because it is the sum of measurable functions. Thus, $\xi_n$ is a random measure on S and from its definition it is possible to see that it belongs to $\mathcal{A}$.

Since $\beta_{n,j} \overset{d}{\to} \beta_j$ we can choose a subsequence of $\beta_{n,j}$, which by abuse of notation we denote by $\beta_{n,j}$, such that $\rho(\beta_{n,j},\beta_j) < \frac{1}{n^2}$ for every j = 1,...,n and n ∈ N. From the above arguments there exists a sequence of random measures in $\mathcal{A}$ (with possibly different underlying probability spaces) such that $\xi_n = \alpha'_n + \sum_{j=1}^{n} \beta'_{n,j}\delta_{s_j}$. Thus, using that $\beta'_{n,j} \overset{d}{=} \beta_{n,j}$ we obtain that $\rho(\beta'_{n,j},\beta_j) < \frac{1}{n^2}$ for every j = 1,...,n and n ∈ N.

Now, we need to show that $\xi_n \overset{vd}{\to} \xi$. From Theorem 3.1, it is sufficient to show that $\int f(x)\xi_n(dx) \overset{d}{\to} \int f(x)\xi(dx)$ for all $f \in \hat{C}_S$. Since $\alpha'_n \overset{d}{=} \alpha_n$ for every n ∈ N and $\alpha_n \overset{d}{\to} \alpha$, then $\alpha'_n \overset{d}{\to} \alpha$. Further, since $\alpha'_n$ and α are independent of the corresponding fixed component, this reduces the goal to proving that $\sum_{j=1}^{n} f(s_j)\beta'_{n,j} \overset{d}{\to} \sum_{j=1}^{\infty} f(s_j)\beta_j$ for all $f \in \hat{C}_S$.

Let $f \in \hat{C}_S$; hence, f is bounded and has bounded support, and by denoting by B the support of f we have that B ∈ $\hat{\mathcal{S}}$ and so that almost surely $\xi_n(B) < \infty$, n ∈ N, and ξ(B) < ∞. Thus, for each n ∈ N, $\sum_{j=1}^{n} f(s_j)\beta'_{n,j} < \infty$ a.s. and $\sum_{j=1}^{\infty} f(s_j)\beta_j < \infty$ a.s.

Moreover, notice that it is sufficient to prove the result for any $f \in \hat{C}_S$ with f(s) ≤ 1 for every s ∈ S. Indeed, consider any $f \in \hat{C}_S$ and let $\bar{C} \in \mathbb{R}_+$ be its bound; then $\sum_{j=1}^{n} f(s_j)\beta'_{n,j} = \bar{C}\sum_{j=1}^{n} \frac{f(s_j)}{\bar{C}}\beta'_{n,j}$ and so if $\sum_{j=1}^{n} \frac{f(s_j)}{\bar{C}}\beta'_{n,j} \overset{d}{\to} \sum_{j=1}^{\infty} \frac{f(s_j)}{\bar{C}}\beta_j$ then $\sum_{j=1}^{n} f(s_j)\beta'_{n,j} \overset{d}{\to} \sum_{j=1}^{\infty} f(s_j)\beta_j$.

Now, consider any $f \in \hat{C}_S$ with f(s) ≤ 1 for every s ∈ S. By the triangle inequality we have that

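$$
\rho\Big(\sum_{j=1}^{n} f(s_j)\beta_{n,j}',\ \sum_{j=1}^{\infty} f(s_j)\beta_j\Big)
\le \rho\Big(\sum_{j=1}^{n} f(s_j)\beta_{n,j}',\ \sum_{j=1}^{n} f(s_j)\beta_j\Big)
+ \rho\Big(\sum_{j=1}^{n} f(s_j)\beta_j,\ \sum_{j=1}^{\infty} f(s_j)\beta_j\Big).
$$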

The last term converges to zero as $n\to\infty$ because $\sum_{j=1}^{n} f(s_j)\beta_j \overset{a.s.}{\to} \sum_{j=1}^{\infty} f(s_j)\beta_j$ as $n\to\infty$. For the other term, by (3) and by Lemma 3.3 we obtain that

$$
\rho\Big(\sum_{j=1}^{n} f(s_j)\beta_{n,j}',\ \sum_{j=1}^{n} f(s_j)\beta_j\Big)
\le \sum_{j=1}^{n} \rho\big(f(s_j)\beta_{n,j}',\ f(s_j)\beta_j\big)
\le \sum_{j=1}^{n} \rho\big(\beta_{n,j}',\ \beta_j\big) < \frac{1}{n}.
$$
Thus, we have that $\sum_{j=1}^{n} f(s_j)\beta_{n,j}' \overset{d}{\to} \sum_{j=1}^{\infty} f(s_j)\beta_j$ as $n\to\infty$, which concludes the proof.

Remark 3.6. We could alternatively consider an almost sure equality in (7) and then use the existence and uniqueness results for random measures (see Theorem 2.15 and Corollary 2.16 in [11]) to obtain a random measure almost surely equal to $\xi_n$. In addition, by the Kolmogorov extension theorem, the arguments of the first part of the above proof also hold for the case of $n$ 'equal' to infinity, namely $\xi_n = \alpha_n' + \sum_{j=1}^{\infty} \beta_{n,j}'\delta_{s_j}$. Further, we point out that if $\xi$ is such that the number of fixed atoms in any bounded set (i.e. in any $B\in\hat{\mathcal{S}}$) is finite, then the number of fixed atoms in the support of every $f\in\hat{C}_S$ is finite, namely $\{s_j : j\ge 1\}\cap \mathrm{supp}(f)$ has finite cardinality. In that case the stated result follows directly from the mutual independence of the $\beta_{n,j}'$, $j=1,\dots,n$, from the fact that $\beta_{n,j}' \overset{d}{\to} \beta_j$ as $n\to\infty$ for every $j$, and from the continuous mapping theorem.

Remark 3.7. Let $\mathcal{A}_\infty$ be a class of random measures like $\mathcal{A}$, but such that the ID component is not necessarily finite, i.e. the '$\alpha$' is not necessarily finite. Then, trivially, $\mathcal{A}_\infty$ is dense in $\mathcal{I}$ w.r.t. the convergence in distribution. Indeed, let $\xi = \alpha + \sum_{j=1}^{K} \beta_j\delta_{s_j}$ be any CRM on $S$. If we know the ID component of $\xi$, i.e. $\alpha$, and for modelling or theoretical reasons we can take an approximating sequence of unbounded $\xi_n$, then we can define the $\xi_n$ s.t.
$$
\xi_n(\cdot)(\omega') := \tilde{\alpha}_n'(\cdot)(\omega') + \sum_{j=1}^{n} \beta_{n,j}'(\omega')\,\delta_{s_j}(\cdot), \qquad \forall\,\omega'\in\Omega',
$$
where $\tilde{\alpha}_n'(\cdot)(\omega,\omega_1,\dots,\omega_n) := \alpha(\cdot)(\omega)$. Then, $\xi_n\in\mathcal{A}_\infty$ and, from the arguments of the proof of Theorem 3.5, it is possible to see that $\xi_n \overset{d}{\to} \xi$.

It is also possible to consider the set of bounded measures, denoted by $\hat{\mathcal{M}}_S$, which can be endowed with the vague topology, as for $\mathcal{M}_S$, but also with the weak topology. The weak topology on $\hat{\mathcal{M}}_S$ is the topology generated by the integration maps $\pi_f$ for all bounded continuous functions $f$. Then, for random measures $\xi,\xi_1,\xi_2,\dots$ considered as random elements in $\hat{\mathcal{M}}_S$ endowed with the weak topology, we denote the convergence in distribution by $\xi_n \overset{wd}{\to} \xi$. Observe that in this setting the QID random measures, as defined previously, are QID random measures on $(S,\mathcal{S})$ (hence we do not need to extend them) because for every $B\in\mathcal{S}$ they are all a.s. bounded. We will use the following result of Kallenberg to prove our next result.

Theorem 3.8 (see Theorem 4.19 in [12]). Let $\xi,\xi_1,\xi_2,\dots$ be a.s. bounded random measures on $S$. Then the following conditions are equivalent:
(i) $\xi_n \overset{wd}{\to} \xi$,
(ii) $\xi_n \overset{vd}{\to} \xi$ and $\xi_n(S) \overset{d}{\to} \xi(S)$.

We are now ready to present our next result, which is similar to Theorem 3.5, but applies to $\hat{\mathcal{M}}_S$ and involves both the vague and the weak topology.

Theorem 3.9. $\mathcal{A}$ is dense in the space of all CRMs, considered as random elements in $\hat{\mathcal{M}}_S$ endowed with either the vague topology or the weak topology, with respect to the convergence in distribution.

Proof. Consider first the case of $\hat{\mathcal{M}}_S$ endowed with the vague topology. Then the result follows by the same arguments as the ones used in the proof of Theorem 3.5. For the weak topology, the same arguments give $\xi_n \overset{vd}{\to} \xi$. Hence, according to Theorem 3.8, it remains to prove that $\xi_n(S) \overset{d}{\to} \xi(S)$, namely that $\alpha_n'(S) + \sum_{j=1}^{n} \beta_{n,j}' \overset{d}{\to} \alpha(S) + \sum_{j=1}^{\infty} \beta_j$. However, this has been shown in the proof of Theorem 3.5: indeed, consider $f\equiv 1$ and notice that $\xi_n(S)$ and $\xi(S)$ are a.s. finite since $\xi_n$ and $\xi$ are almost surely bounded. Thus, the proof is complete.

3.1 The density result for QID point processes

In this subsection we answer positively the following question: given any point process with independent increments, is it possible to find a sequence of QID point processes with independent increments which converges in distribution to it? Thus, in this subsection we restrict our focus to point processes with independent increments and check that the density result holds. There are two main reasons for doing this. First, point processes with independent increments are one of the most studied classes of completely random measures, due to their nice theoretical properties and their importance in applications. Second, we have an explicit formulation for the quasi-Lévy measure and the drift of QID random variables supported on finite subsets of $\mathbb{Z}_+$ (see Theorem 3.9 in [15]). Let us first show the density result for random variables supported on $\mathbb{Z}_+$.

Proposition 3.10. The class of QID distributions supported on finite subsets of $\mathbb{Z}_+$ is dense in the class of probability distributions with support on $\mathbb{Z}_+$ with respect to weak convergence.

Proof. Let $\mu$ be a probability distribution with support on $\mathbb{N}$. For $n\in\mathbb{N}$, define the discrete distribution $\mu_n$ concentrated on the lattice $\{0,\dots,2n\}$ by

$$\mu_n(\{j\}) = \mu(\{j\}), \qquad j\in\{0,\dots,2n\}.$$

Then, $\mu_n \overset{w}{\to} \mu$ as $n\to\infty$. It remains to prove that each $\mu_n$ is a weak limit of QID distributions with support on $\{0,\dots,2n\}$. W.l.o.g. assume that the distribution $\sigma$ to be approximated is such that $\sigma(\{j\})>0$ for every $j\in\{0,\dots,2n\}$. Assume that the characteristic function $\hat{\sigma}$ has zeros (in the other case we can directly use Theorem 3.9 in [15] to conclude). Let $X$ be a random variable with distribution $\sigma$ and let $a_j = \mathbb{P}(X=j) > 0$ for $j=0,\dots,2n$. Then, the polynomial $f(w) = \sum_{j=0}^{2n} a_j w^j$ has zeros on the unit circle. Factorizing, we obtain $f(w) = a_{2n}\prod_{j=1}^{2n}(w-\xi_j)$, where $\xi_j$, $j=1,\dots,2n$, denote the complex roots. Let $f_h(w) = a_{2n}\prod_{j=1}^{2n}(w-\xi_j-h)$, where $w\in\mathbb{C}$ and $h>0$. Then, for small enough $h$, $f_h$ is a polynomial with real coefficients, namely $f_h(w) = \sum_{j=0}^{2n} a_{h,j}w^j$ with $a_{h,j}\in\mathbb{R}$. Observe that for small enough $h$, $a_{h,j}$ and $a_j$ will be close, so $a_{h,j}>0$. Now, let $X_h$ be a random variable with distribution $\sigma_h = \big(\sum_{j=0}^{2n} a_{h,j}\big)^{-1}\sum_{j=0}^{2n} a_{h,j}\delta_j$. We conclude by noticing that, for every $h>0$, $X_h$ is a random variable with values on the lattice $\{0,\dots,2n\}$ whose characteristic function has no zeros (thus it is QID by Theorem 3.9 in [15]), and that $X_h \overset{d}{\to} X$ as $h\searrow 0$.

From Corollary 3.21 in [12], for an atomless point process with independent increments the corresponding unique pair, which we denote by $(\gamma,F)$, is such that $\gamma=0$ and $F$ is restricted to $S\times\mathbb{N}$. Let $\mathcal{A}'$ be the set of all the point processes in $\mathcal{A}$. In other words, let

$$
\mathcal{A}' := \Big\{\xi\in\mathcal{I} \;:\; \xi \overset{a.s.}{=} \alpha + \sum_{j=1}^{K}\beta_j\delta_{s_j}\Big\},
$$
with $\alpha$ an atomless point process with independent increments and finite Lévy measure, $\{s_j : j=1,\dots,K\}$ a finite set of fixed atoms in $S$, and $\beta_j$, $j\ge 1$, QID random variables concentrated on finite subsets of $\mathbb{Z}_+$ that are mutually independent and independent of $\alpha$. Obviously, we have $\mathcal{A}' \subsetneq \mathcal{A} \subsetneq \mathcal{I}$.

Theorem 3.11. A ′ is dense in the space of all point processes with independent increments with respect to the convergence in distribution.

Proof. It follows from the same arguments as the ones used in the proof of Theorem 3.5. In particular, now we need to use Proposition 3.10 instead of Theorem 3.2. Further, now γn = γ = 0 and Fn and F are concentrated on S × N. Then, following the same arguments as the ones used in the proof of Theorem 3.5 we obtain the result.
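Proposition 3.10, used in the proof above, rests on shifting the roots of the probability generating polynomial off the unit circle. The following is a minimal numerical sketch of that step, assuming NumPy; the function names `perturb_roots` and `min_distance_to_unit_circle` and the example distribution are illustrative choices, not part of the paper.

```python
import numpy as np

def perturb_roots(a, h):
    """Shift every root of w -> sum_j a[j] w^j by h and renormalise.

    a : probabilities a[0..N] with a[N] > 0 (lowest degree first).
    Returns the normalised coefficients of f_h(w) = a[N] * prod_j (w - xi_j - h).
    """
    a = np.asarray(a, dtype=float)
    roots = np.roots(a[::-1])            # np.roots expects highest degree first
    shifted = roots + h                  # the roots of f_h are xi_j + h
    coeffs = a[-1] * np.poly(shifted)    # rebuild the polynomial, highest degree first
    coeffs = np.real(coeffs)[::-1]       # real shift of conjugate roots: imaginary parts are noise
    return coeffs / coeffs.sum()

def min_distance_to_unit_circle(a):
    """Smallest | |root| - 1 | over the roots of w -> sum_j a[j] w^j."""
    roots = np.roots(np.asarray(a, dtype=float)[::-1])
    return np.min(np.abs(np.abs(roots) - 1.0))

# Illustrative distribution: f(w) = (1 + w)^2 / 4 has a double root at w = -1.
a = np.array([0.25, 0.5, 0.25])
a_h = perturb_roots(a, 0.1)
print(min_distance_to_unit_circle(a))        # close to 0: a root sits on the circle
print(a_h, min_distance_to_unit_circle(a_h)) # positive weights, roots off the circle
```

In this example the perturbed weights stay positive and approach the original ones as `h` decreases, which is the behaviour the proof relies on for small enough $h$.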

We conclude this subsection with the density result for finite point processes (for which the weak topology might also be used), namely the equivalent of Theorem 3.9 for point processes with independent increments.

Proposition 3.12. $\mathcal{A}'$ is dense in the space of point processes with independent increments, considered as random elements in $\hat{\mathcal{M}}_S$ endowed with either the vague topology or the weak topology, with respect to the convergence in distribution.

Proof. It follows from the same arguments as the ones used in the proof of Theorem 3.9, with Theorem 3.11 instead of Theorem 3.5.

4 Properties of the dense class A

In this section we explore some of the properties of the random measures in A , with a particular focus on spectral representations.

Consider the same notation as in the previous section. Let $\alpha$ be an atomless CRM (hence, ID). Using Theorem 12.10 and Corollary 12.11 in [11] we have that

$$
\hat{\mathcal{L}}(\alpha(A))(\theta) = \exp\Big(-i\theta\gamma^{(1)}(A) + \int_0^\infty \big(e^{i\theta x}-1\big)\,F_A^{(1)}(dx)\Big), \tag{8}
$$
for every $\theta\in\mathbb{R}$ and $A\in\mathcal{S}$, where $\gamma^{(1)}$ is a finite diffuse measure on $S$ and $F^{(1)}$ is a finite measure on $\mathcal{S}\otimes\mathcal{B}((0,\infty))$ with diffuse projections onto $S$. Observe that we can extend $F^{(1)}$ to a finite measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$ by assigning value zero outside $\mathcal{S}\otimes\mathcal{B}((0,\infty))$; by abuse of notation, we call this finite measure $F^{(1)}$.

Further, consider $\sum_{j=1}^{n}\delta_{s_j}\beta_j$, where $n\in\mathbb{N}$, $s_j\in S$, $j=1,\dots,n$, and where the $\beta_j$'s are mutually independent QID random variables with finite quasi-Lévy measure and zero Gaussian variance. With centering function equal to zero (as in (8)), denote by $c_j$ and $b_j$ the drift and the quasi-Lévy measure of $\beta_j$, for $j=1,\dots,n$. Notice that we can use such a centering function because the $\beta_j$'s have finite quasi-Lévy measure. Then, the Lévy-Khintchine formulation of $\sum_{j=1}^{n}\delta_{s_j}\beta_j$ is given by
$$
\hat{\mathcal{L}}\Big(\sum_{j=1}^{n}\delta_{s_j}(A)\beta_j\Big)(\theta) = \exp\Big(i\theta\gamma^{(2)}(A) + \int_{\mathbb{R}} \big(e^{i\theta x}-1\big)\,F_A^{(2)}(dx)\Big),
$$

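for every $\theta\in\mathbb{R}$ and $A\in\mathcal{S}$, where $\gamma^{(2)}(A) = \sum_{j=1}^{n}\delta_{s_j}(A)c_j$ and $F_A^{(2)}(\cdot) = \sum_{j=1}^{n}\delta_{s_j}(A)b_j(\cdot)$. Then, $\xi = \alpha + \sum_{j=1}^{n}\delta_{s_j}\beta_j$ has the following formulation

$$
\hat{\mathcal{L}}(\xi(A))(\theta) = \exp\Big(i\theta\nu_0(A) + \int_{\mathbb{R}} \big(e^{i\theta x}-1-i\theta\tau(x)\big)\,F_A(dx)\Big), \tag{9}
$$
for every $\theta\in\mathbb{R}$ and $A\in\mathcal{S}$, where $\nu_0(A) = \gamma^{(2)}(A)-\gamma^{(1)}(A)$ and $F_A(\cdot) = F_A^{(1)}(\cdot)+F_A^{(2)}(\cdot)$.

Proposition 4.1. Let $\xi\in\mathcal{A}$ and adopt the notation above. Then, $F$ extends uniquely to a finite signed measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$.

Proof. Consider the notation above. We need to show that $F$ is a finite signed measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$. Since $F^{(1)}$ is a finite measure on $\mathcal{S}\otimes\mathcal{B}((0,\infty))$, it remains to show that $F^{(2)}$ is a finite signed measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$. We know that $F_A^{(2)}(\cdot) = \sum_{j=1}^{n}\delta_{s_j}(A)b_j(\cdot)$, where the $b_j(\cdot)$ are finite signed measures on $\mathcal{B}(\mathbb{R})$. It is possible to see that $F^{(2)}$ is a bimeasure on $\mathcal{S}\times\mathcal{B}(\mathbb{R})$ and that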

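$$
\sup_{I}\sum_{i\in I}\big|F^{(2)}_{A_i}(B_i)\big| = \sum_{j=1}^{n}|b_j|(\mathbb{R}) < \infty,
$$
where the supremum is taken over all the finite families of disjoint elements of $\mathcal{S}\times\mathcal{B}(\mathbb{R})$. Then, by Theorem 5.18 in [19] (see also Theorem 4 in [9]), $F^{(2)}$ extends to a finite signed measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$. Thus, $F$ is a finite signed measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$.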

Following the notation of the ID case (see [12] page 89), we call $F$ the quasi-Lévy measure of $\xi$. Observe that in the ID case the Lévy measure might not even be $\sigma$-finite (see [14] pages 82-83), while here our quasi-Lévy measure is a finite signed measure. Further, we remark that a similar result to Proposition 4.1 holds for $\xi\in\mathcal{A}_\infty$ (see Remark 3.7). In this case, $F^{(1)}$ is a measure (not necessarily $\sigma$-finite) on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$ (see Corollary 3.21 in [12]), and $F^{(2)}$ is the same as in the proof of Proposition 4.1. Thus, in this case $F = F^{(1)}+F^{(2)}$ is a signed measure that is not necessarily finite.

In the following result we show the existence of a unique correspondence between any element in $\mathcal{A}$ and a characteristic pair.

Theorem 4.2. Let $\xi\in\mathcal{A}$. Then, there exists a pair $(\nu_0,F)$ s.t. (9) holds, where $\nu_0$ and $F$ are finite signed measures on $\mathcal{S}$ and $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$, respectively, s.t. for every $A\in\mathcal{S}$ and $B\in\mathcal{B}(\mathbb{R})$:
(i) $\nu_0(A) = -\gamma(A) + \sum_{j=1}^{n}\delta_{s_j}(A)c_j$, for some diffuse finite measure $\gamma$ on $\mathcal{S}$, $c_1,\dots,c_n\in\mathbb{R}$, and finitely many atoms $s_1,\dots,s_n\in S$,
(ii) $F(A\times B) = \tilde{G}(A\times B) + \sum_{j=1}^{n}\delta_{s_j}(A)b_j(B)$, for some finite measure $\tilde{G}$ on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$, which is the extension by zero of some measure $G$ on $\mathcal{S}\otimes\mathcal{B}((0,\infty))$ with diffuse projections onto $S$, and for some finite signed measures $b_j$ on $\mathcal{B}(\mathbb{R})$ such that $\exp(b_1),\dots,\exp(b_n)$ are measures.
Conversely, for every such pair $(\nu_0,F)$ there exists a unique random measure $\xi\in\mathcal{A}$ s.t. (9) holds.

Proof. Concerning the atomless component of $\xi$, from Corollary 12.11 in [11] and Theorem 3.20 in [12] we know that there exists a one-to-one correspondence between an ID atomless random measure with independent increments and a characteristic pair, composed of a diffuse measure on $\mathcal{S}$ and a measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$ with diffuse projections onto $S$. In our case the components of the characteristic pair are finite measures by definition. For the fixed component of $\xi$, by Theorem 2.10 we know that a characteristic triplet where the Gaussian component is zero and the quasi-Lévy measure is finite is the characteristic triplet of a QID random variable if and only if the exponential of the finite quasi-Lévy measure is a measure. Then, by the definition of $\xi$ and by the discussion and the computations at the beginning of this section on the characteristic functions of the components of $\xi$, we immediately obtain the result. Notice that for the converse direction we also need to show the independence of the fixed and atomless components, but this follows immediately from the linear structure of $\nu_0$ and $F$.

Remark 4.3. Notation: instead of using the characteristic pair we could have equivalently used the characteristic set $(\{s_j\}_{j=1}^{n},\gamma,\{c_j\}_{j=1}^{n},G,\{b_j\}_{j=1}^{n})$, with the above structure, in order to have a one-to-one identification with $\xi\in\mathcal{A}$.

4.1 Properties of the dense class $\mathcal{A}'$

Since $\mathcal{A}'\subsetneq\mathcal{A}$, all the results presented in the previous section hold for $\xi\in\mathcal{A}'$. In this subsection, we show that even better results hold for the elements in $\mathcal{A}'$. This is mainly due to the fact that we have more information about the structure of these random measures. Let us recall Theorem 3.9 in [15]. Although we have used this theorem before, we state it here to make the results of this subsection easier to follow.

Theorem 4.4 (Theorem 3.9 in [15]). Let $\mu$ be a discrete distribution concentrated on $\{0,1,2,\dots,n\}$ for some $n\in\mathbb{N}$, i.e., $\mu = \sum_{j=0}^{n} a_j\delta_j$, where $a_0,\dots,a_{n-1}\ge 0$, $a_n>0$, and $a_0+\cdots+a_n=1$. Then the following are equivalent:
(i) $\mu$ is quasi-infinitely divisible.
(ii) The characteristic function of $\mu$ has no zeros.
(iii) The polynomial $w\mapsto\sum_{j=0}^{n} a_j w^j$ in the complex variable $w$ has no roots on the unit circle, i.e. $\sum_{j=0}^{n} a_j w^j\neq 0$ for all $w\in\mathbb{C}$ with $|w|=1$.
Further, if one of the equivalent conditions (i)-(iii) holds, then the quasi-Lévy measure of $\mu$ is finite and concentrated on $\mathbb{Z}$, the drift lies in $\{0,1,\dots,n\}$, and the Gaussian variance of $\mu$ is $0$. More precisely, if $\xi_1,\dots,\xi_n$ denote the $n$ complex roots of $w\mapsto\sum_{j=0}^{n} a_j w^j$, counted with multiplicity, then the quasi-Lévy measure of $\mu$ is given by

$$
\nu = -\sum_{m=1}^{\infty} m^{-1}\Big(\sum_{j:|\xi_j|<1}\xi_j^{m}\Big)\delta_{-m} \;-\; \sum_{m=1}^{\infty} m^{-1}\Big(\sum_{j:|\xi_j|>1}\xi_j^{-m}\Big)\delta_{m}, \tag{10}
$$

and the drift is equal to the number of those zeros of this polynomial which lie inside the unit circle (counted with multiplicity), i.e., have modulus less than 1.
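Criterion (iii) and formula (10) can be checked numerically for small examples. The following is a sketch only, assuming NumPy and assuming that the drift enters the characteristic function as $\exp(i z\,\mathrm{drift} + \sum_m (e^{izm}-1)\nu(\{m\}))$, i.e. with centering function zero; the function names and the truncation level `M` are illustrative choices.

```python
import numpy as np

def quasi_levy_from_pmf(a, M=200, tol=1e-6):
    """Drift and truncated quasi-Levy measure of mu = sum_j a[j] delta_j.

    Returns (drift, nu_neg, nu_pos) with nu_neg[m-1] = nu({-m}) and
    nu_pos[m-1] = nu({m}) for m = 1..M, or None if a root of the
    polynomial lies (numerically) on the unit circle.
    """
    a = np.asarray(a, dtype=float)
    roots = np.roots(a[::-1])                 # np.roots expects highest degree first
    mod = np.abs(roots)
    if np.any(np.abs(mod - 1.0) < tol):
        return None                           # criterion (iii) fails: not QID
    inside, outside = roots[mod < 1.0], roots[mod > 1.0]
    m = np.arange(1, M + 1)
    # Formula (10); the sums over roots are real because roots come in conjugate pairs.
    nu_neg = -np.array([np.sum(inside ** k) for k in m]).real / m
    nu_pos = -np.array([np.sum(outside ** (-k)) for k in m]).real / m
    return len(inside), nu_neg, nu_pos

def char_fn_from_triplet(z, drift, nu_neg, nu_pos):
    """exp(i z drift + sum_m (e^{izm}-1) nu({m}) + sum_m (e^{-izm}-1) nu({-m}))."""
    m = np.arange(1, len(nu_pos) + 1)
    exponent = (1j * z * drift
                + np.sum((np.exp(1j * z * m) - 1.0) * nu_pos)
                + np.sum((np.exp(-1j * z * m) - 1.0) * nu_neg))
    return np.exp(exponent)

a = [0.5, 0.3, 0.2]                           # illustrative distribution on {0, 1, 2}
drift, nu_neg, nu_pos = quasi_levy_from_pmf(a)
z = 0.7
direct = sum(p * np.exp(1j * z * j) for j, p in enumerate(a))
print(drift, abs(char_fn_from_triplet(z, drift, nu_neg, nu_pos) - direct))
```

For a Bernoulli law with success probability above one half, the single root lies inside the unit circle, so the sketch returns drift 1 and a quasi-Lévy measure carried by the negative integers.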

In the following theorem we adopt the following notation. Let $\xi\in\mathcal{A}'$. We denote by $\alpha$ its a.s. atomless component and by $\beta_j$, $j=1,\dots,n$, the QID random variables of its fixed component, i.e. $\xi = \alpha + \sum_{j=1}^{n}\beta_j\delta_{s_j}$. Further, for every $j=1,\dots,n$, we denote the law of $\beta_j$ by $\sum_{l=0}^{k_j} a_{j,l}\delta_l$, namely $\mathcal{L}(\beta_j) = \sum_{l=0}^{k_j} a_{j,l}\delta_l$, and denote by $\zeta_{j,1},\dots,\zeta_{j,k_j}$ the $k_j$ complex roots of $w\mapsto\sum_{l=0}^{k_j} a_{j,l}w^l$. Finally, we denote by $b_j$ the quasi-Lévy measure of $\beta_j$, i.e.

$$
b_j = -\sum_{m=1}^{\infty} m^{-1}\Big(\sum_{l:|\zeta_{j,l}|<1}\zeta_{j,l}^{m}\Big)\delta_{-m} \;-\; \sum_{m=1}^{\infty} m^{-1}\Big(\sum_{l:|\zeta_{j,l}|>1}\zeta_{j,l}^{-m}\Big)\delta_{m}, \tag{11}
$$

and by $c_j$ its drift, i.e. $c_j = \#\{|\zeta_{j,l}|<1,\ l=1,\dots,k_j\}$.

Theorem 4.5. Let $\xi\in\mathcal{A}'$. Then, there exists a pair $(\nu_0,F)$ s.t. (9) holds, where $\nu_0$ and $F$ are finite signed measures on $\mathcal{S}$ and $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$, respectively, s.t. for every $A\in\mathcal{S}$ and $B\in\mathcal{B}(\mathbb{R})$:
(i) $\nu_0(A) = \sum_{j=1}^{n}\delta_{s_j}(A)c_j$, where $n\in\mathbb{N}$, $s_j\in S$ is an atom, and $c_j = \#\{|\zeta_{j,l}|<1,\ l=1,\dots,k_j\}$, for $j=1,\dots,n$,
(ii) $F(A\times B) = \tilde{G}(A\times B) + \sum_{j=1}^{n}\delta_{s_j}(A)b_j(B)$, where $\tilde{G}$ is a finite measure on $\mathcal{S}\otimes\mathcal{B}(\mathbb{R})$ restricted to $S\times\mathbb{N}$ and with diffuse projections onto $S$, and where $b_j$ satisfies (11), for $j=1,\dots,n$.

Conversely, for every such pair $(\nu_0,F)$, where $\zeta_{j,1},\dots,\zeta_{j,k_j}$ denote the $k_j$ complex roots of some polynomial $w\mapsto\sum_{l=0}^{k_j} a_{j,l}w^l$ for $j=1,\dots,n$, there exists a unique random measure $\xi\in\mathcal{A}$ s.t. (9) holds.

Proof. It follows from the same arguments as the ones used in Theorem 4.2 and from Theorem 4.4. In particular, the first direction is trivial. For the other direction, we have the following. As mentioned in the proof of Theorem 4.2, we have a one-to-one correspondence between the atomless part of $\xi$ and its characteristic pair. Concerning the fixed component, let us assume that there exist $c_j$ and $b_j$ which are functions of the complex roots of some polynomial $w\mapsto\sum_{l=0}^{k_j} a_{j,l}w^l$ with no roots on the unit circle, where $k_j\in\mathbb{N}$, $a_{j,0},\dots,a_{j,k_j-1}\ge 0$, $a_{j,k_j}>0$, and $a_{j,0}+\cdots+a_{j,k_j}=1$. Then, by Theorem 4.4 there exists a QID probability distribution $\mathcal{L}(\beta_j) = \sum_{l=0}^{k_j} a_{j,l}\delta_l$. Since this holds for every $j=1,\dots,n$, from the set of atoms $s_1,\dots,s_n\in S$ we obtain the fixed component $\sum_{j=1}^{n}\delta_{s_j}\beta_j$ of a random measure in $\mathcal{A}'$.

The same comment in Remark 4.3 for Theorem 4.2 holds here for Theorem 4.5. In addition, we refer to [18] for further properties of certain subclasses of point processes with quasi-L´evy measures.

5 A Nonparametric Bayesian example

In this section we show how the setting and the results presented in Sections 3 and 4 apply to a particular class of nonparametric prior distributions. The framework is the one of the paper by Broderick, Wilson and Jordan [2]. This framework is also explored in subsequent papers, see [3] among others. In their work they analyse Bayesian nonparametric priors and likelihoods based on CRMs. In particular, they model the prior as

$$\Theta := \sum_{k=1}^{K}\theta_k\delta_{\psi_k},$$

where the cardinality $K$ may be either finite or infinite and where $(\theta_k,\psi_k)$ is a pair consisting of the frequency (or rate) $\theta_k$ of the $k$-th trait together with the trait $\psi_k$ itself, which belongs to some space $\Psi$ of traits. Further, they model the data point for the $m$-th individual as

$$X_m := \sum_{k=1}^{K_m} x_{m,k}\delta_{\psi_k},$$

where $x_{m,k}$ represents the degree to which the $m$-th data point belongs to the trait $\psi_k$. This setting can be applied to many real-world problems. In particular, in topic modelling $\psi_k$ represents a topic; that is, $\psi_k$ is a distribution over words in a vocabulary. Further, $\theta_k$ might represent the frequency with which the topic $\psi_k$ occurs in a corpus of documents. Finally, $x_{j,k}$ represents the number of words in topic $\psi_k$ that occur in the $j$-th document, so the $j$-th document has a total length of $\sum_{k=1}^{K} x_{j,k}$ words. In this case, the actual observation consists of the words in each of the $m$ documents, and the topics of the whole corpus of documents are latent.

From a mathematical (and formal) point of view, $\Theta$ and $X_m$ are defined as CRMs. In particular, for the data $X_m$, we let $x_{m,k}$ be drawn according to some distribution $H$ that takes $\theta_k$ as a parameter and has support on $\mathbb{Z}_+$, that is $x_{m,k}\overset{indep}{\sim} h(x_{m,k}\mid\theta_k)$, independently across $m$ and $k$. We assume that $X_1,\dots,X_m$ are i.i.d. conditional on $\Theta$. Moreover, [2] consider the following assumptions for $\Theta$ and $X_m$:

Assumption A00: the atomless component of $\Theta$ has characteristic pair $(\gamma,F)$ s.t. $\gamma=0$ and $F(d\theta\times d\psi) = \nu(d\theta)\cdot G(d\psi)$, where $\nu$ is any $\sigma$-finite measure on $\mathbb{R}_+$ and $G$ is a proper distribution on $\Psi$ with no atoms.

Assumptions A0, A1, and A2: $\Theta$ has a finite number of fixed atoms, $\nu(\mathbb{R}_+)=\infty$, and $\sum_{x=1}^{\infty}\int_{\mathbb{R}_+} h(x\mid\theta)\,\nu(d\theta) < \infty$, respectively.

We remark that by Assumption A00 the locations $\psi_k$ of the non-fixed atoms and the frequencies $\theta_k$ are stochastically independent. We call $\nu$ the weights rate measure of $\Theta$. Moreover, the assumptions A0, A1 and A2 come from a modelling need. By assuming A0 we are saying that we initially know certain traits, by A1 that there are countably infinitely many possible traits, and by A2 that the amount of information from finitely represented data is finite (because by A2 the number of non-fixed atoms is finite).

The first main result in [2] is Theorem 3.1, which gives explicit formulations for the posterior distribution $\Theta\mid X_1$, and it is extended in Corollary 3.2 to the posterior $\Theta\mid X_{1:m}$. In the following result we show that similar results hold for any random measure in $\mathcal{A}$ without assuming A0, A1 or A2.

Notice that we can write $\Theta = \sum_{k=1}^{K}\theta_k\delta_{\psi_k}$, where $K = K_{fix}+K_{ord}$, namely $K$ is the sum of the numbers of fixed and non-fixed atoms; thus $K$ is random. Following the notation of [2], we denote the fixed component of $\Theta$ by $\Theta_{fix} = \sum_{k=1}^{K_{fix}}\theta_{fix,k}\delta_{\psi_{fix,k}}$ and the law of $\theta_{fix,k}$ by $F_{fix,k} := \mathcal{L}(\Theta(\{\psi_{fix,k}\}))$.

Proposition 5.1. Let $\Theta\in\mathcal{A}$ satisfy A00. Write $\Theta = \sum_{k=1}^{K}\theta_k\delta_{\psi_k}$, and let $X_1,\dots,X_m$ be generated conditional on $\Theta$ according to $X_1 := \sum_{k=1}^{K} x_{1,k}\delta_{\psi_k}$ with $x_{1,k}\overset{indep}{\sim} h(x\mid\theta_k)$ for a proper, discrete probability mass function $h$. It is enough to make the assumption for $X_1$ since $X_1,\dots,X_m$ are i.i.d. conditional on $\Theta$. Let $\Theta_{post}$ be a random measure with the distribution of $\Theta\mid X_{1:m}$ (i.e. $\Theta\mid X_1,\dots,X_m$). Then $\Theta_{post}$ is a CRM with three parts.
1. For each $k\in[K_{fix}]$, $\Theta_{post}$ has a fixed atom at $\psi_{fix,k}$ with weight $\theta_{post,fix,k}$ distributed according to the finite-dimensional posterior $F_{post,fix,k}(d\theta)$ that comes from prior $F_{fix,k}$, likelihood $h$, and observation $X(\psi_{fix,k})$. Moreover, $F_{fix,k}$ is QID with no Gaussian component and finite quasi-Lévy measure, and $F_{post,fix,k}(d\theta)\propto F_{fix,k}(d\theta)\prod_{j=1}^{m} h(x_{fix,j,k}\mid\theta)$.
2. Let $\{\psi_{new,k} : k\in[K_{new}]\}$ be the union of atom locations across $X_1,X_2,\dots,X_m$ minus the fixed locations in the prior of $\Theta$. $K_{new}$ is finite. Let $x_{new,j,k}$ be the weight of the atom in $X_j$ located at $\psi_{new,k}$, for $j=1,\dots,m$. Then $\Theta_{post}$ has a fixed atom at $\psi_{new,k}$ with random weight $\theta_{post,new,k}$, whose distribution satisfies $F_{post,new,k}(d\theta)\propto\nu(d\theta)\prod_{j=1}^{m} h(x_{new,j,k}\mid\theta)$.
3. The ordinary component of $\Theta_{post}$ has finite weights rate measure $\nu_{post,m}(d\theta) := \nu(d\theta)\,h(0\mid\theta)^{m}$.

Remark 5.2. Observe that since $\Theta\in\mathcal{A}$, it has finitely many fixed atoms, so assumption A0 is satisfied. Moreover, since $\nu$ is also finite and $h(x\mid\theta)\le 1$, assumption A2 is also satisfied. The only difference with Theorem 3.1 and Corollary 3.2 in [2] is that we do not necessarily satisfy assumption A1. However, A1 is a modelling assumption rather than a technical one. Indeed, the proof of this result follows from arguments similar to the ones used in the proofs of Theorem 3.1 and Corollary 3.2 in [2]. We write them for completeness.

Proof. Let us first prove the result for $\Theta\mid X$. Any fixed atom $\theta_{fix,k}\delta_{\psi_{fix,k}}$ in the prior is independent of the other fixed atoms and of the ordinary component. Thus, all of $X$ except $x_{fix,k} := X(\{\psi_{fix,k}\})$ is independent of $\theta_{fix,k}$. Thus, $\Theta\mid X$ has a fixed atom at $\psi_{fix,k}$ and $\mathcal{L}(\theta_{post,fix,k})\propto F_{fix,k}(d\theta)\,h(x_{fix,k}\mid\theta)$. Recall that since $G$ is continuous, all the fixed and non-fixed atoms of $\Theta$ are at a.s. distinct locations.

Observe that by letting $\Psi_{fix} := \{\psi_{fix,1},\dots,\psi_{fix,K_{fix}}\}$ we can define the fixed and ordinary components of $X$ by $X_{fix}(A) := X(A\cap\Psi_{fix})$ and $X_{ord}(A) := X(A\cap(\Psi\setminus\Psi_{fix}))$, respectively.

Let $x\in\mathbb{Z}_+$ and let $\{\psi_{new,x,1},\dots,\psi_{new,x,K_{new,x}}\}$ be all the locations of atoms in $X_{ord}$ of size $x$; this set is finite and is a subset of the locations of the atoms of $\Theta_{ord}$. Further, let $\theta_{new,x,k} := \Theta(\{\psi_{new,x,k}\})$. Observe that the values $\{\theta_{new,x,k}\}_{k=1}^{K_{new,x}}$ are generated from a thinned Poisson point process with rate measure (also known as intensity measure) $\nu_x(d\theta) = \nu(d\theta)\,h(x\mid\theta)$; this is due to the $h(x\mid\theta)$-thinning of the Poisson point process $\{\theta_{ord,k}\}_{k=1}^{K_{ord}}$, which has rate measure $\nu$. Moreover, given that $\nu_x(\mathbb{R}_+)<\infty$, we have that $\mathcal{L}(\theta_{new,x,k})\propto\nu(d\theta)\,h(x\mid\theta)$. Finally, observe that some atoms in $\Theta_{ord}$ may not be observed in $X_{ord}$; this happens when the likelihood draw returns a zero. These atom weights form a Poisson point process with rate measure $\nu(d\theta)\,h(0\mid\theta)$.

Considering $\Theta\mid X_1$ as the new prior, we obtain the formulation for the posterior $\Theta\mid X_1,X_2$ by observing that the assumptions are still satisfied by $\Theta\mid X_1$. Then, by induction we conclude the proof.
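The thinning step of the proof can be made concrete when $\nu$ is a finite discrete measure. The sketch below assumes NumPy, a Poisson likelihood $h(x\mid\theta)$, and a three-atom $\nu$; all of these, together with the function names, are hypothetical choices made for illustration only.

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(x, theta):
    """Assumed likelihood h(x | theta): Poisson with mean theta (illustrative choice)."""
    return exp(-theta) * theta ** x / factorial(x)

def thinned_rate(atoms, masses, x):
    """Masses of nu_x(d theta) = nu(d theta) h(x | theta), carried by the same atoms."""
    return np.array([w * poisson_pmf(x, t) for t, w in zip(atoms, masses)])

def posterior_ordinary_rate(atoms, masses, m):
    """Masses of nu_post,m(d theta) = nu(d theta) h(0 | theta)^m after m observations."""
    return np.array([w * poisson_pmf(0, t) ** m for t, w in zip(atoms, masses)])

atoms = [0.5, 1.0, 2.0]      # hypothetical support of a finite rate measure nu
masses = [1.5, 0.7, 0.2]     # hypothetical masses of nu at those atoms

# The thinned measures nu_x, x = 0, 1, 2, ..., split nu into disjoint pieces.
total = sum(thinned_rate(atoms, masses, x) for x in range(60))
print(total)                                     # approximately equal to masses
print(posterior_ordinary_rate(atoms, masses, 3)) # unobserved atoms after 3 data points
```

Summing $\nu_x$ over $x$ recovers $\nu$ because $h(\cdot\mid\theta)$ is a proper probability mass function, and the posterior rate of the unobserved atoms shrinks with each additional observation.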

In the next result, we show that random measures in A satisfying A00 are dense in the space of all CRMs satisfying A0, A1 and A2, namely all the random measures considered in [2] (and in [3]). Further, we show how this result translates into a convergence for the ordinary component of the posterior of these random measures.

Proposition 5.3. Consider any random measure $\Theta$ satisfying A00, A0, A1 and A2, namely as in Theorem 3.1 in [2]. Then, there exists a sequence of random measures $(\Theta_n)_{n\in\mathbb{N}}$ in $\mathcal{A}$ satisfying A00 such that $\Theta_n\overset{d}{\to}\Theta$ as $n\to\infty$. Further, $\Theta_{n,post,ord}\overset{d}{\to}\Theta_{post,ord}$ as $n\to\infty$.

Proof. The first part of this proof consists in realising that the arguments in the proofs of Proposition 3.4 and Theorem 3.5 adapt to the present case. Denote by $F(d\theta\times d\psi) = \nu(d\theta)\cdot G(d\psi)$ the Lévy measure of $\Theta$. Following the proofs of Proposition 3.4 and Theorem 3.5, it is possible to see that the approximating sequence $\Theta_n$ should have Lévy measure $\nu_n(d\theta)\cdot G_n(d\psi)$, where $\nu_n(d\theta) := \nu((\tfrac{1}{n},\infty)\cap d\theta)$ and $G_n(d\psi) := G(S_n\cap d\psi)$. However, given the assumptions on $F$, namely that $G$ is a finite measure, we can (and we do) take the Lévy measure of $\Theta_n$ to be given by $F_n(d\theta\times d\psi) := \nu_n(d\theta)\cdot G(d\psi)$. Then, applying the same arguments as the ones used in the proofs of Proposition 3.4 and Theorem 3.5, we obtain that the ordinary component of $\Theta_n$ converges in distribution to the one of $\Theta$. The convergence of the fixed component follows directly from Theorem 3.5. Since $F_n$ is finite, $\Theta_n$ is in $\mathcal{A}$ and satisfies A00.

For the convergence of the posteriors, consider $\Theta_n$ with its respective data points $X_{n,1},\dots,X_{n,m}$, which are defined conditional on $\Theta_n$ as in Proposition 5.1 and belong to some probability spaces possibly different from the one of the other data points. From Proposition 5.1 we have that $\Theta_{n,post}$ has finite weights rate measure $\nu_{n,post,m}(d\theta) := \nu_n(d\theta)\,h(0\mid\theta)^{m}$, while from Corollary 3.2 in [2] we know that $\Theta_{post}$ has finite weights rate measure $\nu_{post,m}(d\theta) := \nu(d\theta)\,h(0\mid\theta)^{m}$. Since $\nu_{n,post,m}(\cdot) = \nu_{post,m}((\tfrac{1}{n},\infty)\cap\cdot)$, we obtain the result by Proposition 3.4.

We summarise our findings so far in words. First, we obtain an explicit expression for the posterior of any random measure in $\mathcal{A}$ satisfying A00. Second, such random measures are dense with respect to convergence in distribution in the space of all priors considered in [2]. Third, when approximating such a prior in distribution, the ordinary components of the posteriors of these random measures converge to the one of the posterior of the prior. Thus, these results give a random truncation procedure: the number of non-fixed atoms of the truncated prior is random and almost surely finite for every $n\in\mathbb{N}$. The present truncation procedure therefore extends the one of [3]. Indeed, we do not arbitrarily fix the number of non-fixed atoms of the truncated prior, and we are able to keep explicit formulations for the posterior of the truncated prior.
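The random truncation described above can be simulated directly, since the ordinary component of $\Theta_n$ is a Poisson point process with finite rate measure $\nu_n\cdot G$. The sketch below assumes NumPy and makes several hypothetical choices that are not in the paper: $\nu(d\theta)=\theta^{-1}\mathbf{1}_{(0,1]}(\theta)\,d\theta$ (so that $\nu_n$ has total mass $\log n$), $G$ uniform on $[0,1]$, and a Poisson likelihood for the data.

```python
import numpy as np

def sample_truncated_prior(n, rng):
    """Draw the ordinary component of Theta_n for an assumed rate measure.

    Assumed (hypothetical) choices: nu(d theta) = theta^{-1} 1_{(0,1]} d theta
    and G uniform on [0, 1]. Then nu_n = nu restricted to (1/n, 1] has mass log(n).
    """
    total_mass = np.log(n)
    k = rng.poisson(total_mass)              # random, almost surely finite number of atoms
    u = 1.0 - rng.random(k)                  # uniform on (0, 1]
    theta = np.exp((u - 1.0) * np.log(n))    # inverse CDF of nu_n / log(n) on (1/n, 1]
    psi = rng.random(k)                      # atom locations drawn from G
    return theta, psi

def sample_data(theta, rng):
    """One data point: counts x_k ~ Poisson(theta_k), an assumed likelihood h."""
    return rng.poisson(theta)

rng = np.random.default_rng(0)
theta, psi = sample_truncated_prior(100, rng)
x = sample_data(theta, rng)
print(len(theta), theta, x)
```

Under these assumptions the number of non-fixed atoms is Poisson with mean $\log n$, so it grows without bound as $n\to\infty$ while remaining finite for each fixed $n$.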
In the next result we show that, under certain conditions, we have automatic conjugacy for random measures in A ′ satisfying A00.

Proposition 5.4. Let $\Theta\in\mathcal{A}'$ satisfy A00 and have a weights rate measure with finite support. Let $X$ be generated conditional on $\Theta$ according to $X := \sum_{k=1}^{K} x_k\delta_{\psi_k}$ with $x_k\overset{indep}{\sim} h(x\mid\theta_k)$ for a proper, discrete probability mass function $h$. Assume that the characteristic functions of the random variables of the fixed component of $\Theta_{post}$ have no zeros, namely assume that for every $x\in\mathbb{N}$, $z\in\mathbb{R}$ and $k\in[K_{fix}]$

$$
\int_0^\infty e^{iz\theta}h(x\mid\theta)\,F_{fix,k}(d\theta)\neq 0 \quad\text{and}\quad \int_0^\infty e^{iz\theta}h(x\mid\theta)\,\nu(d\theta)\neq 0. \tag{12}
$$
Then, $\Theta_{post}\in\mathcal{A}'$, satisfies A00 and has a weights rate measure with finite support.

Proof. Assumption (12) implies that the characteristic functions of $F_{post,fix,k}$ and of $F_{post,new,j}$ have no zeros. Further, they are also supported on a finite subset of $\mathbb{Z}_+$. Then, by Theorem 4.4 we obtain the result.
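For a fixed atom whose prior has finite support in $\mathbb{Z}_+$, the posterior update and the root criterion of Theorem 4.4 can be checked numerically for a given observation. The sketch below assumes NumPy and a Poisson likelihood with mean $\theta$ (a hypothetical choice, with $h(\cdot\mid 0)$ the point mass at $0$); it checks a single observed value $x$ with positive prior predictive probability, whereas assumption (12) is required for every $x$.

```python
import numpy as np
from math import exp, factorial

def poisson_pmf(x, theta):
    """Assumed likelihood h(x | theta): Poisson with mean theta; h(. | 0) is a point mass at 0."""
    return exp(-theta) * theta ** x / factorial(x)

def fixed_atom_posterior(prior, x):
    """Posterior pmf on {0,...,n}: proportional to prior[j] * h(x | j)."""
    weights = np.array([poisson_pmf(x, j) * p for j, p in enumerate(prior)])
    return weights / weights.sum()

def has_no_roots_on_unit_circle(pmf, tol=1e-6):
    """Criterion (iii) of Theorem 4.4 for w -> sum_j pmf[j] w^j."""
    roots = np.roots(np.asarray(pmf, dtype=float)[::-1])   # highest degree first
    return bool(np.all(np.abs(np.abs(roots) - 1.0) > tol))

prior = [0.2, 0.5, 0.3]          # hypothetical F_fix,k on {0, 1, 2}
post = fixed_atom_posterior(prior, x=1)
print(post, has_no_roots_on_unit_circle(post))
```

Normalising the weights does not move the roots of the polynomial, so the check can be applied either to the unnormalised weights $h(x\mid j)\,a_j$ or to the posterior probabilities.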

Remark 5.5. Let $\Theta$ and $X$ be as in Proposition 5.4. Notice that we can write $F_{fix,k} = \sum_{j=0}^{n^{(k)}} a_j^{(k)}\delta_j$, where $a_0^{(k)},\dots,a_{n^{(k)}-1}^{(k)}\ge 0$, $a_{n^{(k)}}^{(k)}>0$, and $a_0^{(k)}+\cdots+a_{n^{(k)}}^{(k)}=1$, for $k\in[K_{fix}]$. Further, we can write $\nu = \sum_{j=1}^{K_\nu} b_j\delta_j$, where $K_\nu\in\mathbb{N}$ indicates the highest value in $\mathrm{supp}(\nu)$, $b_1,\dots,b_{K_\nu-1}\ge 0$ and $b_{K_\nu}>0$. Assumption (12) can be rewritten as: for every $x\in\mathbb{N}$, $z\in\mathbb{R}$ and $k\in[K_{fix}]$, assume that

$$
\sum_{j=0}^{n^{(k)}} e^{izj}h(x\mid j)\,a_j^{(k)}\neq 0 \quad\text{and}\quad \sum_{j=1}^{K_\nu} e^{izj}h(x\mid j)\,b_j\neq 0.
$$

Moreover, by Theorem 4.4 this assumption (and so assumption (12)) is equivalent to the following assumption: for every $x\in\mathbb{N}$ and $k\in[K_{fix}]$, the polynomials $w\mapsto\sum_{j=0}^{n^{(k)}} h(x\mid j)\,a_j^{(k)}w^j$ and $w\mapsto\sum_{j=1}^{K_\nu} h(x\mid j)\,b_j w^j$ in the complex variable $w$ have no roots on the unit circle.

Remark 5.6. The results presented in this section also hold if the weights rate measure is infinite, namely $\nu(\mathbb{R}_+)=\infty$ (under the additional assumptions A1 and A2). In particular, the equivalent of Proposition 5.1 would be identical to Corollary 3.2 except for the result of point 1, because here we additionally know that $F_{fix,k}$ is QID with no Gaussian component and finite quasi-Lévy measure. Further, the equivalent of Proposition 5.3 would follow from the arguments presented, taking into consideration Remark 3.7. The equivalent of Proposition 5.4 is more subtle and is presented below.

Consider the following class of QID random measures:

$$
\mathcal{A}'' := \Big\{\xi\in\mathcal{I} \;:\; \xi\overset{a.s.}{=}\alpha+\sum_{j=1}^{K}\beta_j\delta_{s_j}\Big\},
$$
with $\alpha$ an atomless point process with independent increments and finite Lévy measure, $\{s_j : j=1,\dots,K\}$ a finite set of fixed atoms in $S$, and $\beta_j$, $j\ge 1$, $\mathbb{Z}_+$-valued QID random variables that are mutually independent and independent of $\alpha$. Let $\mathcal{A}''_\infty$ indicate the set of random measures like in $\mathcal{A}''$ but with $\alpha$ being any atomless point process with independent increments. As a side comment, we remark that a result similar to Theorem 4.2 and Theorem 4.5 holds for the elements in $\mathcal{A}''$, where thanks to Theorem 8.1 in [15] we are able to know the structure of their Lévy-Khintchine representation in more detail.

Proposition 5.7. Let $\Theta\in\mathcal{A}''_\infty$ and assume A00, A0, A1 and A2. Let $X$ be generated conditional on $\Theta$ according to $X := \sum_{k=1}^{\infty} x_k\delta_{\psi_k}$ with $x_k\overset{indep}{\sim} h(x\mid\theta_k)$ for a proper, discrete probability mass function $h$. Assume that the characteristic functions of the random variables of the fixed component of $\Theta_{post}$ have no zeros, namely assume that for every $x\in\mathbb{N}$, $z\in\mathbb{R}$ and $k\in[K_{fix}]$

$$
\int_0^\infty e^{iz\theta}h(x\mid\theta)\,F_{fix,k}(d\theta)\neq 0 \quad\text{and}\quad \int_0^\infty e^{iz\theta}h(x\mid\theta)\,\nu(d\theta)\neq 0. \tag{13}
$$
Then, $\Theta_{post}\in\mathcal{A}''_\infty$ and satisfies A00, A0, A1 and A2.

Proof. Assumption (13) implies that the characteristic functions of $F_{post,fix,k}$ and of $F_{post,new,j}$ have no zeros. Further, they are also supported on $\mathbb{Z}_+$. Then, by Theorem 8.1 in [15] we obtain the result.

Observe that assumption (13) can be rewritten more explicitly, as done in Remark 5.5 for assumption (12).

Acknowledgement

The author would like to thank Almut Veraart, Fabio Bernasconi and Ismael Castillo for useful discussions. The research developed in this paper is supported by the EPSRC (award ref. 1643696) at Imperial College London and by the Fondation Sciences Mathématiques de Paris (FSMP) fellowship, held at LPSM (Sorbonne University).

References

[1] Berger, D. On Quasi-Infinitely Divisible Distributions with a Point Mass. In press in Mathematische Nachrichten, (2019+).
[2] Broderick, T., Wilson, A.C. and Jordan, M.I. Posteriors, conjugacy, and exponential families for completely random measures. Bernoulli, 24, 3181-3221, (2018).
[3] Campbell, T., Huggins, J.H., How, J. and Broderick, T. Truncated random measures. Bernoulli, 25, 1256-1288, (2019).
[4] Chaiba, H., Demni, N. and Mouayn, Z. Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels. Journal of Mathematical Physics, 57, 072-103, (2016).
[5] Cuppens, R. Decomposition of Multivariate Probabilities. Academic Press, New York, (1975).
[6] Demni, N. and Mouayn, Z. Analysis of generalized Poisson distributions associated with higher Landau levels. Infinite Dimensional Analysis, Quantum Probability and Related Topics, 18(04), (2015).
[7] Ghosal, S. and Van der Vaart, A. Fundamentals of Nonparametric Bayesian Inference. Cambridge University Press, (2017).
[8] Harris. Random measures and motions of point processes. Z. Wahrsch. verw. Geb., 18, 85-115, (1971).
[9] Horowitz, J. Une remarque sur les bimesures. Séminaire de probabilités (Strasbourg), 11, 59-64, (1977).
[10] Kallenberg, O. Random Measures, 3rd edn. Akademie-Verlag, Berlin, (1983).
[11] Kallenberg, O. Foundations of Modern Probability. Second Edition. Springer, (2001).
[12] Kallenberg, O. Random Measures, Theory and Applications. Springer, (2017).
[13] Khartov, A.A. Compactness criteria for quasi-infinitely divisible distributions on the integers. Statistics and Probability Letters, 153, 1-6, (2019).
[14] Kingman, J.F.C. Poisson Processes. Oxford University Press, Oxford, (1992).
[15] Lindner, A., Pan, L. and Sato, K. On quasi-infinitely divisible distributions. Trans. Amer. Math. Soc., 370, 8483-8520, (2018).
[16] Nakamura, T. A complete Riemann zeta distribution and the Riemann hypothesis. Bernoulli, 21(1), 604-617, (2015).
[17] Nakamura, T. Zeta distributions generated by Dirichlet series and their (quasi) infinitely divisibility, private communication.
[18] Nehring, B., Poghosyan, S. and Zessin, H. On the construction of point processes in statistical mechanics. Journal of Mathematical Physics, 54(6), 063302, (2013).
[19] Passeggeri, R. Spectral representations of quasi-infinitely divisible processes. Stochastic Processes and their Applications, 130(3), 1735-1791, (2020).
[20] Passeggeri, R. On the extension and kernels of signed bimeasures and their role in stochastic integration. ArXiv, (2020).
[21] Zhang, H., Liu, Y. and Li, B. Notes on discrete compound Poisson model with applications to risk theory. Insurance Math. Econom., 59, 325-336, (2014).
[22] Zolotarev, V.M. "Lévy metric", in Hazewinkel, Michiel (ed.), Encyclopedia of Mathematics, Volume 5, Kluwer Academic Publishers, (2001).