
arXiv:1302.0968v1 [math.PR] 5 Feb 2013

The Annals of Probability
2013, Vol. 41, No. 1, 385–443
DOI: 10.1214/11-AOP702
© Institute of Mathematical Statistics, 2013

LOCAL CONDITIONING IN DAWSON–WATANABE SUPERPROCESSES

By Olav Kallenberg

Auburn University

Consider a locally finite Dawson–Watanabe superprocess ξ = (ξ_t) in R^d with d ≥ 2. Our main results include some recursive formulas for the moment measures of ξ, with connections to the uniform Brownian tree, a Brownian snake representation of Palm measures, continuity properties of conditional moment densities, leading by duality to strongly continuous versions of the multivariate Palm distributions, and a local approximation of ξ_t by a stationary cluster η̃ with nice continuity and scaling properties. This all leads up to an asymptotic description of the conditional distribution of ξ_t for a fixed t > 0, given that ξ_t charges the ε-neighborhoods of some points x_1, ..., x_n. In the limit as ε → 0, the restrictions to those sets are conditionally independent and given by the pseudo-random measures ξ̃ or η̃, whereas the contribution to the exterior is given by the Palm distribution of ξ_t at x_1, ..., x_n. Our proofs are based on the Cox cluster representation of the historical process and involve some delicate estimates of moment densities.

Received February 2010; revised February 2011.
AMS 2000 subject classifications. 60G57, 60J60, 60J80.
Key words and phrases. Measure-valued branching diffusions, moment measures and Palm distributions, local and global approximation, historical process, cluster representation, Brownian snake.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Probability, 2013, Vol. 41, No. 1, 385–443. This reprint differs from the original in pagination and typographic detail.

1. Introduction. This paper may be regarded as a continuation of [19], where we considered some local properties of a Dawson–Watanabe superprocess (henceforth referred to as a DW-process) ξ at a fixed time t > 0. Recall that a DW-process in R^d is a vaguely continuous, measure-valued diffusion process with Laplace functionals

    E_μ e^{−ξ_t f} = e^{−μ v_t},   t > 0,

where v = (v_t) is the unique solution to the evolution equation v̇ = (1/2)Δv − v² with initial condition v_0 = f. (This amounts to choosing the branching rate γ = 2; for general γ > 0, we may reduce to this case by a suitable scaling.) We assume the initial measure ξ_0 = μ to be such that ξ_t is a.s. locally finite for every t > 0. (The precise criteria from [19] are quoted in Lemma 4.1.)

Our motivating result is Theorem 9.1, which describes asymptotically the conditional distribution of ξ_t for a fixed t > 0, given that ξ_t charges the ε-neighborhoods of some points x_1, ..., x_n ∈ R^d, where the approximation is in terms of total variation. In the limit, the restrictions to those sets are conditionally independent and given by some universal pseudo-random measures ξ̃ or η̃, whereas the contribution to the exterior region is given by the multivariate Palm distribution of ξ_t at x_1, ..., x_n.

The present work may be regarded as part of a general research program outlined in [18], where we consider some random objects with similar local hitting and conditioning properties arising in different contexts. Examples identified so far include the simple point processes [10, 12, 15, 20, 30], local times of regenerative and related random sets [13, 14, 16], measure-valued diffusion processes [19], and intersection or self-intersection measures on random paths [24]. We are especially interested in cases where the local hitting probabilities are proportional to the appropriate moment densities, and the simple or multivariate Palm distributions can be approximated by elementary conditional distributions.

Our proofs, here as in [19], are based on the representation of each ξ_t as a countable sum of conditionally independent clusters of age h ∈ (0, t], where the generating ancestors at time s = t − h form a Cox process ζ_s directed by h^{−1} ξ_s (cf. [4, 26]). Typically we let h → 0 at a suitable rate depending on ε. In particular, the multivariate, conditional Slivnyak formula from [22] yields an explicit representation of the Palm distributions of ξ_t in terms of the Palm distributions for the individual clusters. Our arguments also rely on a detailed study of moment measures and Palm distributions, as well as on various approximation and scaling properties associated with the pseudo-processes ξ̃ and η̃, all topics of independent interest covered by Sections 4–8.
Here our analysis often goes far beyond what is needed in Section 9.

Moment measures of DW-processes play a crucial role in this paper, along with suitable versions of their densities. Thus, they appear in our asymptotic formulas for multivariate hitting probabilities, which extend the univariate results of Dawson et al. [3] and Le Gall [27]; cf. Lemma 7.2. They further form a convenient tool for the construction and analysis of multivariate Palm distributions, via the duality theory developed in [13]. Finally, they enter into a variety of technical estimates throughout the paper. In Theorem 4.2 we give a basic cluster decomposition of moment measures, along with a forward recursion (implicit in Dynkin [5]), a backward recursion and a related identity. In Theorem 4.4 we explore the fundamental connection, first noted by Etheridge [7], between moment measures and certain uniform Brownian trees, and we provide several recursive constructions of the latter. The mentioned results enable us in Section 5 to establish some useful local estimates and continuity properties for ordinary and conditional moment densities.

Palm measures form another recurrent theme throughout the paper. After providing some general results on this topic in Section 3, we prove in Theorem 4.8 that the Palm distributions of a single DW-cluster can be obtained by ordinary conditioning from a suitably extended version of Le Gall's Brownian snake [26]. In Theorem 6.3 we use the cluster representation along with duality theory to establish some strong continuity properties of the multivariate Palm distributions.

Local approximations of DW-processes of dimension d ≥ 3 were studied already in [19], where we introduced a universal, stationary and scaling invariant (self-similar) pseudo-random measure ξ̃, providing a local approximation of ξ_t for every t > 0, regardless of the initial measure μ. (The prefix "pseudo" signifies that the underlying probability measure is not normalized and may be unbounded.) Though no such object exists for d = 2, we show in Section 8 that the stationary cluster η̃ has similar approximation properties for all d ≥ 2 and satisfies some asymptotic scaling relations, which makes it a good substitute for ξ̃.

A technical complication when dealing with cluster representations is the possibility of multiple hits. More specifically, a single cluster may hit (charge) several of the ε-neighborhoods of x_1, ..., x_n, or one of those neighborhoods may be hit by several clusters. To minimize the effect of such multiplicities, we need the cluster age h to be sufficiently small. (On the other hand, it needs to be large enough for the mentioned hitting estimates to apply to the individual clusters.) Probability estimates for multiple hits are derived in Section 7. Here we also estimate the effects of decoupling, where components of ξ_t involving possibly overlapping sets of clusters are replaced by conditionally independent measures.
Palm distributions of historical, spatial branching processes were first introduced in [11] under the name of backward trees, where they were used to derive criteria for persistence or extinction. The methods and ideas of [11] were extended to continuous time and more general processes in [8, 9, 29]. Further discussions of Palm distributions for superprocesses appear in [2, 4, 7, 19, 37]. In particular, a probabilistic (pathwise) description of the univariate Palm distributions of a DW-process is given in [2, 4]. More generally, there is a vast literature on conditioning in superprocesses (cf. [7], Sections 3.3–4). In particular, Salisbury and Verzani [33, 34] consider the conditional distribution of a DW-process in a bounded domain, given that the exit measure hits n given points on the boundary. However, their methods and results are entirely different from ours.

General surveys of superprocesses include the excellent monographs and lecture notes [2, 6, 7, 28, 32]. The literature on general random measures and the associated Palm kernels is vast; see [1, 12, 30] for some basic facts and further references.

For the sake of economy and readability, we are often taking slight liberties with the notation and associated terminology. Thus, for the DW-process we are often using the same symbol ξ to denote the measure-valued diffusion process itself, the associated historical process and the entire random evolution, involving complete information about the cluster structure for arbitrary times s < t.

Given a process or cluster η with initial measure μ, we write P_μ or L_μ for the associated distributions, so that

    P_μ{η ∈ ·} = L_μ(η) = ∫ μ(dx) L_x(η) = ∫ μ(dx) L_0(θ_x η),

where the shift operators θ_x are defined by (θ_x μ)B = μ(B − x) or (θ_x μ)f = μ(f ∘ θ_x), and we are writing L_x instead of L_{δ_x}. When we use the notation P_μ or L_μ, it is implicitly understood that μ p_t < ∞ for all t > 0.

Let M_d denote the space of locally finite measures on R^d, endowed with the σ-field generated by all evaluation maps π_B: μ ↦ μB for arbitrary B ∈ B^d, the Borel σ-field in R^d. Write B̂^d or M̂_d for the classes of bounded Borel sets in R^d or bounded measures on R^d, respectively. For abstract measure spaces S, the meaning of M_S or M̂_S is similar, except that we require the measures μ ∈ M_S to be uniformly σ-finite, in the sense that μB_k < ∞ for some fixed measurable partition B_1, B_2, ... of S.

We use double bars ‖·‖ to denote the supremum norm when applied to functions, the operator norm when applied to matrices, and the total variation norm when applied to signed measures. In the latter case, we define ‖μ‖_B = ‖1_B μ‖, where 1_B μ denotes the restriction of μ to the set B. For any measure space S and measurable set B ⊂ S, we consider the hitting set H_B = {μ ∈ M_S; μB > 0}, equipped with the σ-field generated by the restriction map μ ↦ 1_B μ, and we often write ‖·‖_B instead of ‖·‖_{H_B} for convenience, referring to this as the total variation on B. Thus, in Section 8, we may write

(1)    ‖L(ξ̃) − L(η̃)‖_B = ‖L(1_B ξ̃) − L(1_B η̃)‖_{H_B},

even when the pseudo-distributions of ξ̃ and η̃ are unbounded. In Sections 7 and 9 we use a similar notation for signed measures on finite product spaces X_i M_{S_i}, in which case (1) needs to be replaced by its counterpart for sequences of random measures.

For any space S, the components of x ∈ S^n are denoted by x_i, and we write S^{(n)} for the set of n-tuples x ∈ S^n with distinct components x_i. For functions f, we distinguish between ordinary powers f^n and tensor powers f^{⊗n}, whereas for measures μ the symbols μ^{⊗n} and μ^{⊗J} mean product measures. For point processes ζ on S, ζ^{(n)} and ζ^{(J)} denote the corresponding factorial measures, which for simple point processes agree with the restrictions of ζ^n and ζ^J to S^{(n)} and S^{(J)}, respectively. Convolutions and convolution products are written as ∗ and (∗)^J. For suitable functions f and g, f ∼ g means f/g → 1, whereas f ≈ g means f − g → 0, unless otherwise specified. The relation f ≲ g means f ≤ cg for some constant c > 0, f ≍ g means f ≲ g and g ≲ f, and f ≪ g means f/g → 0.

Let P_J be the class of partitions of the set J, and write P_n when J = {1, ..., n}. Define the scaling and shift operators S_x^r by S_x^r B = rB + x, and put S_r = S_0^r. Thus, μS_x^r is the measure obtained by magnifying μ around x by a factor r^{−1}. The open ε-ball around x is denoted by B_x^ε. Indicator functions are written as 1{·} or 1_B, and δ_s denotes the unit mass at s, so that δ_s B = 1_B(s). We write R_+ = [0, ∞), Z_+ = {0, 1, ...} and N = {1, 2, ...}. The symbols ⫫ and ⫫_γ mean independence or conditional independence given γ, and we use L(ξ) for the distribution of ξ and =_d for equality in distribution. We often write μf = ∫ f dμ and (f · μ)B = μ[f; B] = ∫_B f dμ. Conditional probabilities and distributions are written as P[·|·] and L[·|·], Palm measures and distributions as P[·‖·] and L[·‖·], respectively. We sometimes write L[·‖·]_0 for the Palm measure at 0.

2. Gaussian, binomial and Poisson preliminaries. Here we collect some properties of Gaussian measures and binomial or Poisson processes needed in subsequent sections. We begin with a simple exercise in linear algebra. By the principal variances of a random vector, we mean the positive eigenvalues of the associated covariance matrix.

Lemma 2.1. For any π ∈ P_n, consider some uncorrelated random vectors ξ_J, J ∈ π, in R^d with uncorrelated entries of variance σ², and put ξ_j = ξ_J for j ∈ J ∈ π. Then the array (ξ_1, ..., ξ_n) has principal variances σ²|J|, J ∈ π, each with multiplicity d.

Proof. By scaling we may take σ² = 1, and since each ξ_J has uncorrelated components, we may further take d = 1. Defining J_j by j ∈ J_j ∈ π, we get Cov(ξ_i, ξ_j) = δ_{J_i, J_j}. It remains to note that the m × m matrix with entries a_{ij} ≡ 1 has eigenvalues m, 0, ..., 0. □

We proceed with a simple comparison of normal distributions.
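As a quick numerical sanity check of Lemma 2.1 (an illustration, not part of the original argument), the following Python sketch builds the covariance matrix for n = 3, π = {{1,2},{3}} and σ² = 1, and confirms that the principal variances are σ²|J| with multiplicity d:

```python
import numpy as np

# n = 3, pi = {{1,2},{3}}, sigma^2 = 1, d = 1:
# xi_1 = xi_2 = xi_{1,2} and xi_3 = xi_{3}, the two vectors uncorrelated,
# so Cov(xi_i, xi_j) = delta_{J_i, J_j} as in the proof.
cov = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
eig = np.sort(np.linalg.eigvalsh(cov))
print(eig)   # positive eigenvalues 2 = |{1,2}| and 1 = |{3}|, plus one zero

# For d = 2, each scalar covariance entry becomes a 2x2 diagonal block,
# and every eigenvalue above acquires multiplicity d = 2.
eig2 = np.sort(np.linalg.eigvalsh(np.kron(cov, np.eye(2))))
print(eig2)
```

The zero eigenvalues reflect the degeneracy caused by the repeated coordinates ξ_1 = ξ_2.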

Lemma 2.2. Write ν_Λ for the centered normal distribution on R^n with covariance matrix Λ. Then

    ν_Λ ≤ (‖Λ‖^n / det Λ)^{1/2} ν_{‖Λ‖}^{⊗n}.

Proof. Let Λ have eigenvalues λ_1 ≤ ··· ≤ λ_n, and let x_1, ..., x_n be the associated coordinates of x ∈ R^n. Then ν_Λ has density

    ∏_{k≤n} p_{λ_k}(x_k) = ∏_{k≤n} (2πλ_k)^{−1/2} e^{−|x_k|²/2λ_k}
                         ≤ ∏_{k≤n} (λ_n/λ_k)^{1/2} p_{λ_n}(x_k)
                         = (λ_n^n / (λ_1 ··· λ_n))^{1/2} p_{λ_n}^{⊗n}(x),

and the assertion follows since ‖Λ‖ = λ_n and det Λ = λ_1 ··· λ_n. □
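The bound in Lemma 2.2 can be verified pointwise on the densities. The following sketch (added for illustration; the matrix Λ below is an arbitrary choice) compares the density of ν_Λ with the stated multiple of the product density at many random points:

```python
import numpy as np

rng = np.random.default_rng(0)
Lam = np.array([[2.0, 0.8],
                [0.8, 1.0]])                 # an arbitrary covariance matrix
n = Lam.shape[0]
lam_max = np.linalg.eigvalsh(Lam).max()      # operator norm ||Lam||
det = np.linalg.det(Lam)
inv = np.linalg.inv(Lam)

def density(x):                              # density of nu_Lam at x
    return np.exp(-x @ inv @ x / 2) / np.sqrt((2*np.pi)**n * det)

def bound(x):                                # (||Lam||^n/det)^(1/2) * tensor power of p_{||Lam||}
    prod = np.prod(np.exp(-x**2 / (2*lam_max)) / np.sqrt(2*np.pi*lam_max))
    return np.sqrt(lam_max**n / det) * prod

xs = rng.normal(scale=3.0, size=(1000, n))
worst = max(density(x) / bound(x) for x in xs)
print(worst)                                 # never exceeds 1
```

Indeed, the ratio reduces to exp(−(xᵀΛ⁻¹x − |x|²/‖Λ‖)/2) ≤ 1, so the check cannot fail up to rounding.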

Now let p_t denote the continuous density of the symmetric Gaussian distribution on R^d with variances t > 0.

Lemma 2.3. The normal densities p_t on R^d satisfy

    p_s(x) ≤ (1 ∨ td|x|^{−2})^{d/2} p_t(x),   0 < s ≤ t.

Proof. For fixed t > 0 and x ≠ 0, the maximum of p_s(x) for s ∈ (0, t] occurs when s = (|x|²/d) ∧ t. This gives p_s(x) ≤ p_t(x) for |x|² ≥ td, and for |x|² ≤ td we have

    p_s(x) ≤ (2π|x|²/d)^{−d/2} e^{−d/2}
           ≤ (2π|x|²/d)^{−d/2} e^{−|x|²/2t} = (td|x|^{−2})^{d/2} p_t(x). □

For convenience, we also quote the elementary Lemma 3.1 from [19].
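The bound in Lemma 2.3 is also easy to test numerically. The sketch below (an illustration, not part of the original article; the choices d = 3, t = 2 are arbitrary) samples random pairs (s, x) with 0 < s ≤ t:

```python
import numpy as np

def p(t, x, d):                              # symmetric Gaussian density on R^d, variance t
    return (2*np.pi*t)**(-d/2) * np.exp(-np.dot(x, x) / (2*t))

d, t = 3, 2.0
rng = np.random.default_rng(1)
worst = 0.0
for _ in range(2000):
    x = rng.normal(size=d) * rng.uniform(0.1, 3.0)
    s = rng.uniform(1e-4, t)                 # any 0 < s <= t
    c = max(1.0, t*d / np.dot(x, x))**(d/2)  # (1 v td|x|^{-2})^{d/2}
    worst = max(worst, p(s, x, d) / (c * p(t, x, d)))
print(worst)                                 # stays below 1
```

Equality is approached only when |x|² = td and s = t, so random samples give ratios strictly below 1.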

Lemma 2.4. For fixed d and T > 0, the normal densities p_t on R^d satisfy

    p_t(x + y) ≲ p_{t+h}(x),   x ∈ R^d, |y| ≤ h ≤ t ≤ T.

Given a measure μ on R^d and some measurable functions f_1, ..., f_n ≥ 0 on R^d, we introduce the convolution

    (μ ∗ ⊗_{k≤n} f_k)(x) = ∫ μ(du) ∏_{k≤n} f_k(x_k − u),   x = (x_1, ..., x_n) ∈ (R^d)^n.

Lemma 2.5. Let μ be a measure on R^d with μp_t < ∞ for all t > 0. Then for any n ∈ N, the function (μ ∗ p_t^{⊗n})(x) is finite and jointly continuous in (x, t) ∈ R^{nd} × (0, ∞).

Proof. Letting t > 0 and x ∈ R^{nd} with c^{−1} < t < c and |x| < c for some constant c > 0, we see from Lemma 2.4 that

    ∏_{k≤n} p_t(x_k − u) ≲ ∏_{k≤n} p_c(x_k − u) ≲ p_{2c}^n(u) ≲ p_{2c/n}(u),

uniformly for u ∈ R^d. Since μp_{2c/n} < ∞, we get (μ ∗ p_t^{⊗n})(x) < ∞ for any t > 0, and the asserted continuity follows by dominated convergence from the fact that p_t(x) is jointly continuous in (x, t) ∈ R^d × (0, ∞). □

Given a probability measure μ on some space S, let σ_1, ..., σ_n be i.i.d. random elements in S with distribution μ. Then the point process ξ = Σ_k δ_{σ_k} on S (or any process with the same distribution) is called a binomial process based on μ. We say that ξ is a uniform binomial process on an interval I if μ is the uniform distribution on I. We begin with a simple sampling property of binomial processes.

Lemma 2.6. Let τ_1 < ··· < τ_n form a uniform binomial process on [0, 1], and consider an independent, uniformly distributed subset ϕ ⊂ {1, ..., n} of fixed cardinality |ϕ| = k. Then the times τ_r with r ∈ ϕ or r ∉ ϕ form independent, uniform binomial processes on [0, 1] of orders k and n − k, respectively.

Proof. We may assume that ϕ = {π_1, ..., π_k}, where π_1, ..., π_n form a uniform permutation of 1, ..., n independent of τ_1, ..., τ_n. The random variables σ_r = τ_{π_r}, r = 1, ..., n, are then i.i.d. U(0, 1), and we have

    {τ_r; r ∈ ϕ} = {σ_1, ..., σ_k},   {τ_r; r ∉ ϕ} = {σ_{k+1}, ..., σ_n}. □

This leads to a simple domination property for binomial processes:

Lemma 2.7. For each n ∈ Z_+, let ξ_n be a uniform binomial process on [0, 1] with ‖ξ_n‖ = n. Then for any point process η ≤ ξ_n with fixed ‖η‖ = k ≤ n, we have

    L(η) ≤ C(n, k) L(ξ_k).

Proof. Let τ_1 < ··· < τ_n be the points of ξ_n. Writing ξ_n^J = Σ_{j∈J} δ_{τ_j} when J ⊂ {1, ..., n}, we have η = ξ_n^ϕ for some random subset ϕ ⊂ {1, ..., n} with |ϕ| = k a.s. Choosing ψ ⊂ {1, ..., n} to be independent of ξ_n and uniformly distributed with |ψ| = k, we get

    L(η) = L(ξ_n^ϕ) = Σ_J P{ξ_n^J ∈ ·, ϕ = J}
         ≤ Σ_J L(ξ_n^J) = C(n, k) L(ξ_n^ψ) = C(n, k) L(ξ_k),

where C(n, k) denotes the binomial coefficient and the last equality holds by Lemma 2.6. □

We also need a conditional independence property of binomial processes.

Lemma 2.8. Let σ_1 < ··· < σ_n form a uniform binomial process on [0, t], and fix any k ∈ {1, ..., n}. Then:

(i) P{σ_k ∈ ds} = k C(n, k) s^{k−1} (t − s)^{n−k} t^{−n} ds, s ∈ (0, t);

(ii) given σ_k, the times σ_1, ..., σ_{k−1} and σ_{k+1}, ..., σ_n form independent, uniform binomial processes on [0, σ_k] and [σ_k, t], respectively.

Proof. Part (i) is elementary and classical. For part (ii) we note that, by Proposition 1.27 in [17], the times σ_1, ..., σ_{k−1} form a uniform binomial process on [0, σ_k], conditionally on σ_k, ..., σ_n. By symmetry, the times σ_{k+1}, ..., σ_n form a uniform binomial process on [σ_k, t], conditionally on σ_1, ..., σ_k. The conditional independence holds since the conditional distributions depend only on σ_k; cf. Proposition 6.6 in [15]. □
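As an elementary check of the density in Lemma 2.8(i) (added for illustration), the Beta integral ∫₀ᵗ s^{k−1}(t − s)^{n−k} ds = tⁿ (k−1)!(n−k)!/n! shows that the total mass is exactly 1, with the factor tⁿ cancelling; this can be confirmed exactly in integer arithmetic:

```python
from math import comb, factorial

# Total mass of k*C(n,k)*s^(k-1)*(t-s)^(n-k)*t^(-n) over (0, t):
# by the Beta integral it equals k*C(n,k)*(k-1)!*(n-k)!/n!, independent of t.
totals = [
    k * comb(n, k) * factorial(k - 1) * factorial(n - k) / factorial(n)
    for n in range(1, 10)
    for k in range(1, n + 1)
]
print(set(totals))  # {1.0}
```

The cancellation k · C(n, k) · (k−1)! (n−k)! = n! is exact, so every entry is exactly 1.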

We proceed with a useful identity for homogeneous Poisson processes, stated in terms of the tetrahedral sets

    tΔ_n = {s ∈ R_+^n; s_1 < ··· < s_n < t},   t > 0, n ∈ N.

Lemma 2.9. Let τ_1 < τ_2 < ··· form a Poisson process on R_+ with constant rate c > 0. Then for any measurable function f ≥ 0 on R_+^{n+1} with n ∈ N, we have

(2)    E f(τ_1, ..., τ_{n+1}) = c^n E ∫_{τ_1 Δ_n} f(s_1, ..., s_n, τ_1) ds_1 ··· ds_n.

Proof. Since

    E τ_1^n = ∫_0^∞ t^n c e^{−ct} dt = n! c^{−n},

the right-hand side of (2) defines the joint distribution of some random variables σ_1, ..., σ_{n+1}. Noting that L(τ_{n+1}) has density

    g_{n+1}(s) = c^{n+1} s^n e^{−cs} / n!,   s ≥ 0,

we get for any measurable function f ≥ 0 on R_+

    E f(τ_{n+1}) = ∫_0^∞ f(s) g_{n+1}(s) ds = (c^{n+1}/n!) ∫_0^∞ s^n f(s) e^{−cs} ds
                 = (c^n/n!) E[τ_1^n f(τ_1)] = c^n E ∫_{τ_1 Δ_n} f(τ_1) ds_1 ··· ds_n,

which shows that σ_{n+1} =_d τ_{n+1}. We also see from (2) that σ_1, ..., σ_n form a uniform binomial process on [0, σ_{n+1}], conditionally on σ_{n+1}. Since the corresponding property holds for τ_1, ..., τ_{n+1}, for example by Proposition 1.28 in [17], we obtain

    (σ_1, ..., σ_{n+1}) =_d (τ_1, ..., τ_{n+1}),

as required. □

Say that a measurable space S is additive if it is closed under an associative, commutative and measurable operation "+" and contains a unique element 0 with s + 0 = s for all s ∈ S. Define l(s) ≡ s, and take ξl = ∫ s ξ(ds) to be 0 when ξ = 0. We need a simple estimate for Poisson processes on an additive space. Recall that a Borel space is a measurable space S that is Borel isomorphic to a Borel set B ⊂ [0, 1], so that there exists a one-to-one, bimeasurable map f: S ↔ B.

Lemma 2.10. On an additive Borel space S, consider a Poisson process ξ and a measurable function f ≥ 0, where both f and Eξ are bounded. Then

    |E f(ξl) − Eξf| ≤ ‖f‖ ‖Eξ‖².

Proof. Writing p = ‖Eξ‖, we get

    E[f(ξl); ‖ξ‖ > 1] ≤ ‖f‖ P{‖ξ‖ > 1} = (1 − (1 + p) e^{−p}) ‖f‖ ≤ (1/2) p² ‖f‖.

Since ξ is a mixed binomial process on S based on Eξ (cf. Proposition 1.28 in [17]), we also have

    E[f(ξl); ‖ξ‖ = 1] = P{‖ξ‖ = 1} E[f(ξl) | ‖ξ‖ = 1] = p e^{−p} (Eξf / p) = e^{−p} Eξf,

and so

    0 ≤ Eξf − E[f(ξl); ‖ξ‖ = 1] = (1 − e^{−p}) Eξf ≤ p ‖Eξ‖ ‖f‖ = p² ‖f‖.

Noting that

    E f(ξl) − Eξf = E[f(ξl); ‖ξ‖ > 1] + E[f(ξl); ‖ξ‖ = 1] − Eξf,

we get by combination

    |E f(ξl) − Eξf| ≤ ((1/2) p² ‖f‖) ∨ (p² ‖f‖) = p² ‖f‖. □

We conclude with an elementary inequality needed in Section 7.

Lemma 2.11. For any n ∈ N and k_1, ..., k_n ∈ Z_+, we have

    (∏_{j≤n} k_j − 1)_+ ≤ (1/2) Σ_{i≤n} (k_i − 1) ∏_{j≤n} k_j.

Proof. Clearly

    (hk − 1)_+ ≤ h(k − 1)_+ + k(h − 1)_+,   h, k ∈ Z_+.

Proceeding by induction, we obtain

    (∏_{j≤n} k_j − 1)_+ ≤ Σ_{i≤n} (k_i − 1)_+ ∏_{j≠i} k_j.

It remains to note that (k − 1)_+ ≤ k(k − 1)/2 for k ∈ Z_+. □

3. Measure, kernel and Palm preliminaries. Here we collect some general propositions about measures, kernels and Palm distributions, needed in subsequent sections. The first few results are easy and probably known, though no references could be found.

Lemma 3.1. For any measurable space S, the space M̂_S is complete in total variation.

Proof. Let μ_1, μ_2, ... ∈ M̂_S with ‖μ_m − μ_n‖ → 0 as m, n → ∞. Assuming μ_n ≠ 0, we may define ν = Σ_n 2^{−n} μ_n/‖μ_n‖ and choose some measurable functions f_1, f_2, ... ∈ L¹(ν) with μ_n = f_n · ν. Then ν|f_m − f_n| = ‖μ_m − μ_n‖ → 0, which means that (f_n) is Cauchy in L¹(ν). Since L¹ is complete (cf. [15], page 16), we have convergence f_n → f in L¹, and so the measure μ = f · ν satisfies ‖μ − μ_n‖ = ν|f − f_n| → 0. □

For any measure μ on a topological space S, we define supp μ as the intersection of all closed sets F ⊂ S with μF^c = 0.

Lemma 3.2. Fix a measure μ on a Polish space S. Then μ(supp μ)^c = 0, and s ∈ supp μ iff μG > 0 for every neighborhood G of s.

Proof. Choose a countable base B_1, B_2, ... of S, and define I = {i ∈ N; μB_i = 0}. Any open set G ⊂ S can be written as ∪_{i∈J} B_i for some J ⊂ N, and we note that μG = 0 iff J ⊂ I. Hence, (supp μ)^c = ∪_{i∈I} B_i. If s ∉ supp μ, then s ∈ B_i for some i ∈ I, and so μG = 0 for some neighborhood G of s. Conversely, the latter condition implies s ∈ B_i for some i ∈ I, and so s ∉ supp μ. □

We continue with a simple measurability property.

Lemma 3.3. Let S and T be Borel spaces. For any μ ∈ M̂_{S×T} and t ∈ T^d, let μ_t denote the restriction of μ to S × {t_1, ..., t_d}^c. Then the mapping (μ, t) ↦ μ_t is product-measurable.

Proof. We may take T = R. Put I_{nj} = 2^{−n}(j − 1, j], n, j ∈ Z, and define

    U_n(t) = ∪_j {I_{nj}; t_1, ..., t_d ∉ I_{nj}},   n ∈ N, t ∈ R^d.

Then the restriction μ_t^n of μ to S × U_n(t) is product-measurable, and μ_t^n ↑ μ_t by monotone convergence. □

Given two measurable spaces (S, 𝒮) and (T, 𝒯), a kernel from S to T is defined as a function μ ≥ 0 on S × 𝒯 such that μ_s B = μ(s, B) is measurable in s ∈ S for fixed B ∈ 𝒯 and a measure in B for fixed s. For any measure ν on S and kernel μ from S to T, we define the composition ν ⊗ μ and product νμ as the measures on S × T and T, respectively, given by

    (ν ⊗ μ)f = ∫ ν(ds) ∫ μ_s(dt) f(s, t),   νμ = (ν ⊗ μ)(S × ·).

Conversely, when T is Borel, any σ-finite measure M on S × T admits a disintegration M = ν ⊗ μ into a σ-finite supporting measure ν on S and a kernel μ from S to T, where the latter is again σ-finite, in the sense that μ_s f(s, ·) < ∞ for some measurable function f > 0 on S × T. When M(· × T) is σ-finite, we may take ν = M(· × T), in which case μ can be chosen to be a probability kernel, in the sense that ‖μ_s‖ = 1 for all s. In general, the measures μ_s are unique, s ∈ S a.e. ν, up to normalizations.

Some basic properties of kernels and their compositions are given in [15]. Here we first consider the total variation ‖ν ⊗ μ‖, where μ is a signed kernel, defined as the difference between two a.e. bounded kernels.

Lemma 3.4. For any measurable space S and Borel space T, let ν ∈ M̂_S, and consider a signed kernel μ from S to T. Then ‖μ‖ is measurable and ‖ν ⊗ μ‖ = ν‖μ‖.

Proof. Assuming μ = μ′ − μ″ for some bounded kernels μ′ and μ″ from S to T, we define μ̂ = μ′ + μ″. Since T is Borel, Proposition 7.26 in [15] yields a measurable function f: S × T → [−1, 1] with μ = f · μ̂. Then ‖μ‖ = μ̂|f|, which is measurable by Lemma 1.41 in [15]. Furthermore,

    ‖ν ⊗ μ‖ = ‖f · (ν ⊗ μ̂)‖ = (ν ⊗ μ̂)|f| = ν(μ̂|f|) = ν‖μ‖. □

We proceed with a simple projection property.

Lemma 3.5. For any Borel spaces S, T and U, consider some σ-finite measures ν and ν̂ on S × U and S and some signed kernels μ and μ̂ from S × U or S to T, such that ν and ν ⊗ μ have projections ν̂ and ν̂ ⊗ μ̂ onto S and S × T, respectively. Then ‖μ̂_s‖ ≤ sup_u ‖μ_{s,u}‖ a.e. ν̂.

Proof. Since ν̂ is σ-finite and U is Borel, we have ν = ν̂ ⊗ ρ for some probability kernel ρ from S to U. Writing π_{S×T} for the projection onto S × T, we obtain

    ν̂ ⊗ μ̂ = (ν ⊗ μ) ∘ π_{S×T}^{−1} = (ν̂ ⊗ ρ ⊗ μ) ∘ π_{S×T}^{−1} = ν̂ ⊗ ρμ,

and so μ̂ = ρμ a.e. ν̂. Hence, for any measurable function f on T with |f| ≤ 1, we get a.e.

    |μ̂f| = |(ρμ)f| = |ρ(μf)| ≤ ρ|μf| ≤ ρ‖μ‖,

which implies ‖μ̂_s‖ ≤ ρ_s‖μ_{s,·}‖ ≤ sup_u ‖μ_{s,u}‖ a.e. ν̂. □

The following technical result plays a crucial role in Section 6. For any G_1, G_2, ... ⊂ S, put lim sup_n G_n = ∩_k ∪_{n≥k} G_n = {s ∈ S; s ∈ G_n i.o.}.

Lemma 3.6. Let ν be a kernel from R to a Polish space S with supp ν ≡ S, let μ, μ_1, μ_2, ... be bounded kernels from S × R to a Borel space U, where each μ_n is continuous in total variation on S × G_n for some open set G_n ⊂ R. Assume ν{‖μ − μ_n‖ > h_n} ≡ 0 for some measurable functions h_n: S × R → R_+ with h_n → 0 uniformly on bounded sets. Then μ = μ′ a.e. ν, where μ′ is continuous in total variation on S × lim sup_n G_n.

Proof. First let the kernels ν, μ, μ_1, μ_2, ... and functions h_1, h_2, ... be independent of the real parameter, hence kernels or functions on S. Let S′ ⊂ S be the set where ‖μ − μ_n‖ ≤ h_n for all n, so that ν(S′)^c = 0. For any t, t′ ∈ S′, we have

    ‖μ_t − μ_{t′}‖ ≤ ‖μ_t − μ_t^n‖ + ‖μ_t^n − μ_{t′}^n‖ + ‖μ_{t′}^n − μ_{t′}‖,

where μ^n = μ_n. Fixing any s ∈ S, we may let t, t′ → s and then n → ∞ to get ‖μ_t − μ_{t′}‖ → 0. Hence, Lemma 3.1 yields a bounded measure μ′_s on U with ‖μ_t − μ′_s‖ → 0. Note that μ′_s is well defined for every s ∈ S and that μ′_s = μ_s when s ∈ S′.

To prove the required continuity of μ′, suppose that s_k → s in S. Fixing any metrization d of S, we may choose t_1, t_2, ... ∈ S′ with

    d(s_k, t_k) + ‖μ′_{s_k} − μ_{t_k}‖ < 2^{−k},   k ∈ N.

In particular t_k → s, and so

    ‖μ′_{s_k} − μ′_s‖ ≤ ‖μ′_{s_k} − μ_{t_k}‖ + ‖μ_{t_k} − μ′_s‖ → 0,

as desired. The continuity of μ′ implies measurability, which means that μ′ is again a locally bounded kernel from S to U. Further note that ‖μ_n − μ′‖ ≤ h_n on S′, which extends by continuity to ‖μ_n − μ′‖ ≤ h′_n on S, where the functions h′_n are upper semi-continuous versions of h_n, satisfying the same convergence condition.

We now allow ν, μ and μ_1, μ_2, ... to depend on a parameter x ∈ R. Constructing μ′ as before for each x, we get ‖μ_n − μ′‖ → 0 uniformly on bounded sets in S × R. Since each μ_n is continuous in total variation on S × G_n, the same continuity holds for μ′ on the set S × lim sup_n G_n. □

A random measure ξ on a measurable space S is defined as a kernel from the basic probability space Ω into S. The intensity Eξ is the measure on S given by (Eξ)f = E(ξf). For any random element η in a measurable space T, we define the associated Campbell measure M on S × T by Mf = E ∫ ξ(ds) f(s, η). When T is Borel and M is σ-finite, we may form the disintegration M = ν ⊗ μ, where μ is a σ-finite kernel of Palm measures μ_s on T. If Eξ = M(· × T) is σ-finite, we may take ν = Eξ and choose μ to be a probability kernel from S to T, in which case the μ_s are called Palm distributions of η with respect to ξ. For convenience, we write

    μ_s = P[η ∈ ·‖ξ]_s = L[η‖ξ]_s,   μ_s f(s, ·) = E[f(s, η)‖ξ]_s.

Alternatively, P[·‖ξ] may be regarded as a kernel from S to the basic probability space Ω with σ-field generated by η. The multivariate Palm distributions are defined as the kernels L[η‖ξ^{⊗n}] from S^n to T, for arbitrary n ∈ N.

The following conditioning approach to Palm distributions (cf. [18, 21]) is often useful. On a measure space with pseudo-probability P̃, we introduce a random pair (σ, η̃) in S × T with

    Ẽ f(σ, η̃) = E ∫ f(s, η) ξ(ds).

Then the Palm distributions of η with respect to ξ are given by

    E[f(η)‖ξ]_σ = Ẽ[f(η̃) | σ] a.s.

The duality between Palm measures and conditional moment densities was first noted in [13]. The following versions of the main results (with subsequent clarifications) are convenient for our present purposes.

Lemma 3.7. On a filtered probability space (Ω, F, P), consider a random measure ξ on a Polish space S with σ-finite intensity Eξ and some (F_t ⊗ 𝒮)-measurable processes M_t on S. Then:

(i) E[ξ|F_t] = M_t · Eξ a.s. iff P[·‖ξ]_s = M_{s,t} · P a.e. on F_t; in this case, the versions P[·‖ξ]_{s,t} = M_{s,t} · P on F_t are such that:

(ii) for fixed t, the measure P[·‖ξ]_{s,t} is continuous in s ∈ S, in total variation on F_t, if and only if M_{s,t} is L¹-continuous in s;

(iii) for fixed s ∈ S, the measures P[·‖ξ]_{s,t} on F_t are consistent in t if and only if M_{s,t} is a martingale in t;

(iv) if the F_t are countably generated and the continuity in (ii) holds for every t, then the consistency in (iii) holds for all s ∈ supp Eξ.

Here M is a function on S × R_+ × Ω such that M(s, t, ω) is product-measurable in (s, ω) for each t, and we are writing M_t = M(·, t, ·) and M_{s,t} = M(s, t, ·). The Palm distributions P[·‖ξ]_s form a kernel from S to Ω, endowed with any of the σ-fields F_t, and in (ii)–(iv) we consider some special versions P[·‖ξ]_{s,t} of those measures on F_t.

Proof. (i) If E[ξ|F_t] = M_t · Eξ a.s., then for any A ∈ F_t and B ∈ 𝒮,

    ∫_B Eξ(ds) P[A‖ξ]_s = E[ξB; A] = E[E[ξB|F_t]; A]
                        = E[∫_B M_{s,t} Eξ(ds); A] = ∫_B Eξ(ds) E[M_{s,t}; A],

and so

    P[A‖ξ]_s = E[M_{s,t}; A] = (M_{s,t} · P)A,   s ∈ S a.e. Eξ,

which shows that we can choose P[·‖ξ]_s = M_{s,t} · P on F_t. Conversely, if P[·‖ξ]_s = M_{s,t} · P a.e. on F_t, then a similar calculation yields

    E[ξB|F_t] = ∫_B M_{s,t} Eξ(ds) = (M_t · Eξ)B a.s.,

which means that we can choose E[ξ|F_t] = M_t · Eξ on 𝒮.

(ii) This follows from the L^∞/L¹-isometry

    ‖P[·‖ξ]_{s,t} − P[·‖ξ]_{s′,t}‖_t = ‖(M_{s,t} − M_{s′,t}) · P‖_t = E|M_{s,t} − M_{s′,t}|,

where ‖·‖_t denotes the total variation on F_t.

(iii) Since M_{s,t} is F_t-measurable, we get for any A ∈ F_t and t ≤ t′,

    P[A‖ξ]_{s,t} − P[A‖ξ]_{s,t′} = E[M_{s,t} − M_{s,t′}; A] = E[M_{s,t} − E[M_{s,t′}|F_t]; A],

which vanishes for every A if and only if M_{s,t} = E[M_{s,t′}|F_t] a.s.

(iv) For any t ≤ t′ and A ∈ F_t, we have P[A‖ξ]_{s,t} = P[A‖ξ]_{s,t′} a.e. by the uniqueness of the Palm disintegration. Since F_t is countably generated, a monotone-class argument gives P[·‖ξ]_{s,t} = P[·‖ξ]_{s,t′} a.e. on F_t, and so M_{s,t} = E[M_{s,t′}|F_t] a.s. for s ∈ S a.e. Eξ, as in (iii). By the L¹-continuity of M_{s,t} and M_{s,t′} and the L¹-contractivity of conditional expectations, the latter relation extends to supp Eξ, and so by (iii) the measures P[·‖ξ]_{s,t} are consistent for all s ∈ supp Eξ. □

Often in applications, Eξ = p · λ for some σ-finite measure λ on S and continuous function p > 0 on S. Assuming E[ξ|F_t] = X_t · λ a.s. for some (F_t ⊗ 𝒮)-measurable processes X_t, we may choose M_t = X_t/p in (i). Note that the L¹-continuity in (ii) and the martingale property in (iii) hold simultaneously for X and M.

A point process on a measure space T is defined as a random measure

ζ of the form Σ_i δ_{τ_i}, where the τ_i are random elements in T. Given ζ and a probability kernel ν from T to a measure space M_S, we may form a cluster process ξ = Σ_i η_i, where the η_i are conditionally independent random measures on S with distributions ν_{τ_i}. We assume ζ and ν to be such that ξB_k < ∞ a.s. for some measurable partition B_1, B_2, ... of S. If ζ is Poisson or Cox, we call ξ a Poisson or Cox cluster process generated by ζ and ν.

We need the following representation for the Palm measures of a Cox cluster process, quoted from [22]. For Poisson processes and dimension n = 1, the result goes back to [12, 23, 31, 35], and applications to superprocesses appear in [2, 4]. When ξ is a Poisson cluster process generated by a measure μ ∈ M_T and a probability kernel ν from T to M_S, let ξ̃ denote a random measure on S with pseudo-distribution μ̃ = μν. Given a σ-finite measure μ = Σ_n μ_n, we call dμ_n/dμ the relative density of μ_n with respect to μ. Write L_η and E_η for the conditional distributions and expectations given η.

Lemma 3.8. Let ξ be a cluster process on S generated by a Cox process on T with directing measure η. Then

(3)    E_η ξ^{⊗n} = Σ_{π∈P_n} ⊗_{J∈π} E_η ξ̃^{⊗J},   n ∈ N,

which yields $E\xi^{\otimes n}$ by integration with respect to $\mathcal L(\eta)$. Assuming $E_\eta\xi^{\otimes n}$ to be a.s. $\sigma$-finite and writing $p_\eta^\pi$ for the relative densities in (3), we have
$$(4)\qquad \mathcal L_\eta[\xi\,\|\,\xi^{\otimes n}]_s=\mathcal L_\eta(\xi)*\sum_{\pi\in\mathcal P_n}p_\eta^\pi(s)\mathop{*}_{J\in\pi}\mathcal L_\eta[\tilde\xi\,\|\,\tilde\xi^{\otimes J}]_{s_J}\quad\text{a.e.},$$
which yields $\mathcal L[\xi\,\|\,\xi^{\otimes n}]$ by integration with respect to $\mathcal L[\eta\,\|\,\xi^{\otimes n}]_s$.

Here $\mathcal L[\xi\,\|\,\xi^{\otimes n}]$ and $\mathcal L[\eta\,\|\,\xi^{\otimes n}]$ need to be based on the same supporting measure for $\xi^{\otimes n}$, even when $E\xi^{\otimes n}$ fails to be $\sigma$-finite. For a probabilistic interpretation in the Poisson case, consider for any $s\in S^n$ a random partition $\pi_s\perp\!\!\!\perp\xi$ in $\mathcal P_n$ with distribution given by the relative densities $p_s^\pi$ in (3), along with some independent Palm versions $\tilde\xi_{s_J}^J$ of $\tilde\xi$. Then (4) is equivalent to
$$\xi_s\stackrel d=\xi+\sum_{J\in\pi_s}\tilde\xi_{s_J}^J,\qquad s\in S^n\ \text{a.e. }E_\mu\xi^{\otimes n}.$$
The result extends immediately to Palm measures of the form $\mathcal L[\xi'\,\|\,(\xi'')^{\otimes n}]_s$, where $\xi'$ and $\xi''$ are random measures on $S'$ and $S''$ such that the pair $(\xi',\xi'')$ forms a Cox cluster process directed by $\eta$. Indeed, assuming $S'$ and $S''$ to be disjoint, we may apply Lemma 3.8 to the cluster process $\xi=\xi'+\xi''$ on $S=S'\cup S''$. This more general version is needed for the proof of Lemma 6.2 below.

Next we show how the moment and Palm measures of a random measure $\eta$ are affected by a shift by a fixed measure $\mu$. Here we define $(\theta_s\mu)f=\mu(f\circ\theta_s)$, where $\theta_st=s+t$.
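To make the cluster representation concrete, here is a minimal simulation sketch. All parameters (intensity 3, cluster standard deviation 0.1, single-point clusters, the window $B=[0,1]$) are hypothetical choices for illustration and are not from the paper; the check is the $n=1$ case of (3), namely $E\xi(B)=\int\mu(d\tau)\,\nu_\tau(B)$ for a Poisson cluster process.

```python
# Hypothetical example, not from the paper: a Poisson cluster process xi on R
# with parents tau_i forming a Poisson process of intensity lam on [0,1], each
# cluster nu_tau a single N(tau, sd^2) point.  The n = 1 moment formula reads
# E xi(B) = int lam * nu_tau(B) dtau, verified here by Monte Carlo.
import math
import random

random.seed(1)
lam, sd = 3.0, 0.1
B = (0.0, 1.0)

def Phi(z):                      # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def nu_tau_B(tau):               # mass that the cluster rooted at tau puts in B
    return Phi((B[1] - tau) / sd) - Phi((B[0] - tau) / sd)

def rpois(mean):                 # Knuth's Poisson sampler (stdlib only)
    L, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

# exact side, by midpoint-rule integration over the parent space [0,1]
m = 10000
exact = sum(lam * nu_tau_B((k + 0.5) / m) for k in range(m)) / m

# simulated side, averaging xi(B) over many independent realizations
reps, total = 40000, 0
for _ in range(reps):
    for _ in range(rpois(lam)):
        point = random.uniform(0.0, 1.0) + random.gauss(0.0, sd)
        if B[0] <= point <= B[1]:
            total += 1
est = total / reps
```

The two sides agree up to Monte Carlo error; the higher-order cases of (3) would be checked the same way, with the partition sum accounting for points falling in a common cluster.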

Lemma 3.9. Let $\mathcal L_\mu(\eta)=\int\mu(ds)\,\mathcal L(\theta_s\eta)$ for some random measure $\eta$ on $\mathbf R^d$ and a $\mu\in\mathcal M_d$ such that $E\eta^{\otimes n}=p_n\cdot\lambda^{\otimes nd}$ with $\mu*p_n<\infty$ a.e. Then $E_\mu\eta^{\otimes n}=(\mu*p_n)\cdot\lambda^{\otimes nd}$, and
$$(5)\qquad \mathcal L_\mu[\eta\,\|\,\eta^{\otimes n}]_s=\int\frac{p_n(s-r)\,\mu(dr)}{(\mu*p_n)(s)}\,\mathcal L[\theta_r\eta\,\|\,\eta^{\otimes n}]_{s-r}\quad\text{a.e. }E_\mu\eta^{\otimes n}.$$

Proof. Fixing $n$, we may write $p_n=p$ and $\lambda^{\otimes nd}(ds)=ds$, and let $\mu_s$ denote the mixing measure in (5). Then

\begin{align*}
E_\mu\int\eta^{\otimes n}(ds)\,f(s,\eta)&=\int\mu(dr)\,E\int\eta^{\otimes n}(ds)\,f(s+r,\theta_r\eta)\\
&=\int\mu(dr)\int E\eta^{\otimes n}(ds)\,E[f(s+r,\theta_r\eta)\,\|\,\eta^{\otimes n}]_s\\
&=\int\mu(dr)\int p(s-r)\,ds\,E[f(s,\theta_r\eta)\,\|\,\eta^{\otimes n}]_{s-r}\\
&=\int E_\mu\eta^{\otimes n}(ds)\int\mu_s(dr)\,E[f(s,\theta_r\eta)\,\|\,\eta^{\otimes n}]_{s-r},
\end{align*}
where the first step holds by the definition of $P_\mu$, the second step holds by Palm disintegration, and the third step holds by the definition of $p$ and the invariance of $\lambda$. This gives $E_\mu\eta^{\otimes n}=(\mu*p)\cdot\lambda^{\otimes nd}$, and so the fourth step holds by Fubini's theorem. Now (5) follows by the uniqueness of the Palm disintegration. $\Box$

We turn to the Palm measures of a random measure of the form $\xi\otimes\delta_\tau$.

Lemma 3.10. For any random measure $\xi$ on $S$ and random element $\tau$ in a Borel space $T$, we have $P[\tau\ne t\,\|\,\xi\otimes\delta_\tau]_{s,t}=0$ for $(s,t)\in S\times T$ a.e. $E(\xi\otimes\delta_\tau)$.

Proof. Since $T$ is Borel, the diagonal $\{(t,t);\,t\in T\}$ in $T^2$ is measurable. Letting $\nu$ be the associated supporting measure for $\xi\otimes\delta_\tau$, we get by Palm disintegration
$$\iint\nu(ds\,dt)\,P[\tau\ne t\,\|\,\xi\otimes\delta_\tau]_{s,t}=E\iint(\xi\otimes\delta_\tau)(ds\,dt)\,1\{\tau\ne t\}=E\int\xi(ds)\,1\{\tau\ne\tau\}=0,$$
and the assertion follows. $\Box$

The following continuity property of Palm distributions extends Lemma 2.2 in [19]. For any random measure $\xi$ on $\mathbf R^d$, we define the centered Palm distributions by $P^s=\mathcal L[\theta_{-s}\xi\,\|\,\xi]_s$. Recall that $E[\xi;A]=E(\xi 1_A)$.

Lemma 3.11. Let $\xi_n$ and $\eta_n$ be random measures on $\mathbf R^d$ with centered Palm distributions $P_n^s$ and $Q_n^s$, respectively, and fix any $B\in\hat{\mathcal B}^d$. Then the following conditions imply $\sup_{s\in B}\|P_n^s-Q_n^s\|\to0$:
(i) $E\xi_nB\asymp1$;
(ii) $\|E[\xi_nB;\,\xi_n\in\cdot\,]-E[\eta_nB;\,\eta_n\in\cdot\,]\|\to0$;
(iii) $\sup_{r,s\in B}\|P_n^r-P_n^s\|+\sup_{r,s\in B}\|Q_n^r-Q_n^s\|\to0$.

Proof. Writing $f_A(\mu)=(\mu B)^{-1}\int_B\mu(ds)\,1_A(\theta_{-s}\mu)$, we get
$$(6)\qquad\int_BE\xi_n(ds)\,P_n^sA=E(\xi_nB)f_A(\xi_n)=\int E[\xi_nB;\,\xi_n\in d\mu]\,f_A(\mu),$$
and similarly for $\eta_n$. For any $s\in B$, we have, by (i),
$$\|P_n^s-Q_n^s\|\lesssim E(\xi_nB)\|P_n^s-Q_n^s\|\le\|E(\xi_nB)P_n^s-E(\eta_nB)Q_n^s\|+|E\xi_nB-E\eta_nB|.$$

Here the second term on the right tends to 0 by (ii), and (6) shows that the first term is bounded by
\begin{align*}
&\Bigl\|E(\xi_nB)P_n^s-\int_BE\xi_n(dr)\,P_n^r\Bigr\|+\Bigl\|E(\eta_nB)Q_n^s-\int_BE\eta_n(dr)\,Q_n^r\Bigr\|\\
&\qquad+\Bigl\|\int_BE\xi_n(dr)\,P_n^r-\int_BE\eta_n(dr)\,Q_n^r\Bigr\|\\
&\quad\le E(\xi_nB)\sup_{r,s\in B}\|P_n^r-P_n^s\|+E(\eta_nB)\sup_{r,s\in B}\|Q_n^r-Q_n^s\|\\
&\qquad+\|E[\xi_nB;\,\xi_n\in\cdot\,]-E[\eta_nB;\,\eta_n\in\cdot\,]\|,
\end{align*}
which tends to 0 by (i)–(iii). $\Box$

For any diffuse and locally finite random measure $\xi$ on $\mathbf R^d$, we say that the kernel $\mathcal L[\xi\,\|\,\xi^{\otimes n}]_x$ is tight if $E[\xi U_x^r\wedge1\,\|\,\xi^{\otimes n}]_x\to0$ as $r\to0$ for every $x\in(\mathbf R^d)^{(n)}$, where $U_x^r=\bigcup_iB_{x_i}^r$. In particular, $E[\xi\{x_i\}\,\|\,\xi^{\otimes n}]_x=0$ for all $i$. For probability measures on suitable measure spaces $\mathcal M_S$, weak continuity or convergence is defined with respect to the vague topology; cf. [12, 15]. The following result extends Lemmas 3.4 and 3.5 in [13].

Lemma 3.12. Let $\xi$ be a diffuse random measure on $S=\mathbf R^d$, such that:
(i) $E\xi^{\otimes n}$ is locally finite on $S^{(n)}$;
(ii) for any open set $G\subset S$, the kernel $\mathcal L[1_{G^c}\xi\,\|\,\xi^{\otimes n}]_x$ has a version that is continuous in total variation in $x\in G^{(n)}$;
(iii) for any compact set $K\subset S^{(n)}$,
$$\lim_{r\to0}\sup_{x\in K}\liminf_{\varepsilon\to0}\frac{E[\xi^{\otimes n}B_x^\varepsilon\,(\xi U_x^r\wedge1)]}{E\xi^{\otimes n}B_x^\varepsilon}=0.$$
Then the kernel $\mathcal L[\xi\,\|\,\xi^{\otimes n}]_x$ has a tight, weakly continuous version on $S^{(n)}$, satisfying the property in (ii) for every $G$.

The result remains true with (ii) restricted to open sets G with Gc com- pact. It also extends with the same proof to measure-valued processes ξt, where (i) and (iii) hold uniformly for t> 0 in compacts, and we consider joint continuity in the pair (x,t).

Proof. Proceeding as in [13], we may construct a version of the kernel $\mathcal L[\xi\,\|\,\xi^{\otimes n}]_x$ on $S^{(n)}$ satisfying the continuity in (ii) for any open set $G$. Now fix any $x\in S^{(n)}$, and let $l(r)$ denote the liminf in (iii). By Palm disintegration under condition (i), we may choose some $x_k\to x$ in $S^{(n)}$ such that $E[\xi U_x^r\wedge1\,\|\,\xi^{\otimes n}]_{x_k}\le2l(r)$. Then by (ii) we have, for any open $G\subset\mathbf R^d$ with $x\in G^{(n)}$,
$$E[\xi(U_x^r\cap G^c)\wedge1\,\|\,\xi^{\otimes n}]_x=\lim_{k\to\infty}E[\xi(U_x^r\cap G^c)\wedge1\,\|\,\xi^{\otimes n}]_{x_k}\le2l(r).$$

Since $G$ is arbitrary and $l(r)\to0$ as $r\to0$, we conclude that $P[\xi\in\cdot\,\|\,\xi^{\otimes n}]_x$ is tight at $x$. Using the full strength of (iii), we get in the same way the uniform tightness
$$(7)\qquad\lim_{r\to0}\sup_{x\in K}E[\xi U_x^r\wedge1\,\|\,\xi^{\otimes n}]_x=0$$
for any compact set $K\subset S^{(n)}$. For open $G$ and $x_k\to x$ in $G^{(n)}$, we get, by (ii),
$$\mathcal L[1_{G^c}\xi\,\|\,\xi^{\otimes n}]_{x_k}\stackrel w\to\mathcal L[1_{G^c}\xi\,\|\,\xi^{\otimes n}]_x.$$
By a simple approximation based on (7) (cf. Theorem 4.28 in [15] or Theorem 4.9 in [12]), the latter convergence remains valid with $1_{G^c}\xi$ replaced by $\xi$, as required. $\Box$

Next we show how the Palm distributions can be extended, with preserved continuity properties, to conditionally independent random elements.

Lemma 3.13. Let $\xi$ be a random measure on a Polish space $S$, consider some random elements $\alpha$ and $\beta$ in Borel spaces, and choose a kernel $\nu$ such that $\nu_{\alpha,\xi}=\mathcal L[\beta\,|\,\alpha,\xi]$ a.s. Then
$$(8)\qquad\mathcal L[\xi,\alpha,\beta\,\|\,\xi]_s=\mathcal L[\xi,\alpha\,\|\,\xi]_s\otimes\nu,\qquad s\in S\ \text{a.e. }E\xi.$$
When $\beta\perp\!\!\!\perp_\alpha\xi$, this simplifies to
$$(9)\qquad\mathcal L[\alpha,\beta\,\|\,\xi]_s=\mathcal L[\alpha\,\|\,\xi]_s\otimes\nu,\qquad s\in S\ \text{a.e. }E\xi,$$
where $\nu_\alpha=\mathcal L[\beta\,|\,\alpha]$ a.s. In the latter case, if $\mathcal L[\alpha\,\|\,\xi]$ has a version that is continuous in total variation, then so does $\mathcal L[\beta\,\|\,\xi]$.

Proof. First we prove (9) when $\xi$ is $\alpha$-measurable and $\nu_\alpha=\mathcal L[\beta\,|\,\alpha]$ a.s. Assuming $E\xi$ to be $\sigma$-finite, we get
\begin{align*}
\int E\xi(ds)\,E[f(s,\alpha,\beta)\,\|\,\xi]_s&=E\int\xi(ds)\,f(s,\alpha,\beta)\\
&=E\int\xi(ds)\int\nu_\alpha(dt)\,f(s,\alpha,t)\\
&=\int E\xi(ds)\,E\Bigl[\int\nu_\alpha(dt)\,f(s,\alpha,t)\,\Big\|\,\xi\Bigr]_s,
\end{align*}
and so, for $s\in S$ a.e. $E\xi$,
\begin{align*}
\mathcal L[\alpha,\beta\,\|\,\xi]_sf&=E[f(\alpha,\beta)\,\|\,\xi]_s=E\Bigl[\int\nu_\alpha(dt)\,f(\alpha,t)\,\Big\|\,\xi\Bigr]_s\\
&=\int P[\alpha\in dr\,\|\,\xi]_s\int\nu_r(dt)\,f(r,t)=(\mathcal L[\alpha\,\|\,\xi]_s\otimes\nu)f,
\end{align*}
as required. To obtain (8), it suffices to replace $\alpha$ in (9) by the pair $(\xi,\alpha)$. If $\beta\perp\!\!\!\perp_\alpha\xi$, then $\nu_{\xi,\alpha}=\nu_\alpha$ a.s., and (9) follows from (8). In particular,
$$(10)\qquad\mathcal L[\beta\,\|\,\xi]_s=E[\nu_\alpha\,\|\,\xi]_s,\qquad s\in S\ \text{a.e. }E\xi.$$
Assuming $\mathcal L[\alpha\,\|\,\xi]$ to be continuous in total variation and defining $\mathcal L[\beta\,\|\,\xi]$ by (10), we get for any $s,s'\in S$,
\begin{align*}
\|\mathcal L[\beta\,\|\,\xi]_s-\mathcal L[\beta\,\|\,\xi]_{s'}\|&=\|E[\nu_\alpha\,\|\,\xi]_s-E[\nu_\alpha\,\|\,\xi]_{s'}\|\\
&\le\|\mathcal L[\alpha\,\|\,\xi]_s-\mathcal L[\alpha\,\|\,\xi]_{s'}\|,
\end{align*}
and the asserted continuity follows. $\Box$

4. Moment measures and Palm kernels. We are now ready to begin our study of DW-processes $\xi$ in $\mathbf R^d$, starting from a $\sigma$-finite measure $\mu$ on $\mathbf R^d$, always assumed to be such that $\xi_t$ is a.s. locally finite for all $t>0$ (cf. Lemma 4.1 below). This condition is clearly stronger than requiring only $\mu$ to be locally finite. The distribution of $\xi$ is denoted by $\mathcal L_\mu(\xi)=P_\mu\{\xi\in\cdot\}$, and we write $\mathcal L_x(\xi)=P_x\{\xi\in\cdot\}$ when $\mu=\delta_x$.

We will make constant use of the fact that $\xi_t$ is infinitely divisible, hence a countable sum of conditionally independent clusters, equally distributed apart from shifts and rooted at the points of a Poisson process $\zeta_0$ of ancestors with intensity measure $t^{-1}\mu$; cf. [4, 26]. Indeed, allowing the cluster distribution to be unbounded but $\sigma$-finite, we obtain a similar cluster representation of the historical process, and we may introduce an associated canonical cluster $\eta$ with pseudo-distributions $\mathcal L_x(\eta)$, normalized such that $P_x\{\eta_t\ne0\}=t^{-1}$, where the subscript $x$ signifies that $\eta$ starts at $x\in\mathbf R^d$. For measures $\mu$ on $\mathbf R^d$ we write $\mathcal L_\mu(\eta)=\int P_x\{\eta\in\cdot\}\,\mu(dx)$, and we define $E_\mu f(\eta)$ accordingly.

By the Markov property of $\xi$, we have a similar representation of $\xi_t$ for every $s=t-h\in(0,t)$ as a countable sum of conditionally independent $h$-clusters (clusters of age $h$), rooted at the points of a Cox process $\zeta_s$ directed by $h^{-1}\xi_s$. In other words, $\zeta_s$ is conditionally Poisson given $\xi_s$, with intensity measure $h^{-1}\xi_s$.

We first state the criteria for the random measures $\xi_t$ to be locally finite, quoted from Lemma 3.2 in [19]. Recall that $p_t$ denotes the continuous density of the symmetric Gaussian distribution on $\mathbf R^d$ with variances $t>0$.

Lemma 4.1. Let $\xi$ be a DW-process in $\mathbf R^d$ with $\sigma$-finite initial measure $\mu$. Then these conditions are equivalent:
(i) $\xi_t$ is a.s. locally finite for every $t\ge0$;
(ii) $E_\mu\xi_t$ is locally finite for every $t\ge0$;
(iii) $\mu p_t<\infty$ for all $t>0$;
in which case also
(iv) $E_\mu\xi_t=E_\mu\eta_t$ has the continuous density $\mu*p_t$.
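To see condition (iii) in action, here is a small numerical illustration with a hypothetical initial measure not taken from the paper: in $d=1$, choose $\mu(dx)=e^{x^2}\,dx$, so that $\mu p_t=\int e^{x^2}p_t(x)\,dx$ is finite precisely when $t<1/2$.

```python
# Hypothetical example, not from the paper: in d = 1 take mu(dx) = exp(x^2) dx.
# Then mu p_t = int exp(x^2) p_t(x) dx converges iff the exponent
# x^2 (1 - 1/(2t)) is negative, i.e. iff t < 1/2, so by Lemma 4.1 the process
# stays locally finite only on that time interval.
import math

def mu_p_t(t, R=10.0, m=200000):
    """Midpoint-rule approximation of int_{-R}^{R} exp(x^2) p_t(x) dx."""
    h = 2.0 * R / m
    s = 0.0
    for k in range(m):
        x = -R + (k + 0.5) * h
        s += math.exp(x * x) * math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
    return s * h

finite_case = mu_p_t(0.25)        # exponent -x^2: the integral converges to sqrt(2)
grows_5 = mu_p_t(0.75, R=5.0)     # exponent +x^2/3: partial integrals blow up
grows_10 = mu_p_t(0.75, R=10.0)   # with R, reflecting mu p_t = infinity
```

For $t=1/4$ the integral equals $\sqrt2$ exactly, which the quadrature reproduces; for $t=3/4$ the truncated integrals grow without bound as the cutoff $R$ increases.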

We turn to the moment measures of a DW-process $\xi$ with canonical cluster $\eta$. Define $\nu_t^n=E_0\eta_t^{\otimes n}$ and $\nu_t^J=E_0\eta_t^{\otimes J}$, and note that $\nu_t^1=\nu_t^{\{1\}}=p_t\cdot\lambda^{\otimes d}$. Write $\sum_{I\subset J}'$ for summation over all nonempty, proper subsets $I\subset J$. Given any elements $i\ne j$ in $J$, we form a new set $J_{ij}=J_{ji}$ by combining $i$ and $j$ into a single element $\{i,j\}$. The moment measures of a DW-process may be obtained by an initial cluster decomposition, followed by a recursive construction for the individual clusters, as specified by the compact and suggestive formulas below. Some more explicit versions are given after the statement of the theorem.

Theorem 4.2. Let $\xi$ be a DW-process in $\mathbf R^d$ with canonical cluster $\eta$, and write $\nu_t^J=E_0\eta_t^{\otimes J}$. Then for any $t>0$ and $\mu$:
$$(\mathrm i)\qquad E_\mu\xi_t^{\otimes n}=\sum_{\pi\in\mathcal P_n}\bigotimes_{J\in\pi}(\mu*\nu_t^J),\qquad n\in\mathbf N;$$
$$(\mathrm{ii})\qquad\nu_t^J=\sum_{I\subset J}{}'\int_0^t\nu_s*(\nu_{t-s}^I\otimes\nu_{t-s}^{J\setminus I})\,ds,\qquad|J|\ge2;$$
$$(\mathrm{iii})\qquad\nu_t^J=\sum_{i\ne j}\int_0^t(\nu_s^{J_{ij}}*\nu_{t-s}^{\otimes J})\,ds,\qquad|J|\ge2;$$
$$(\mathrm{iv})\qquad\nu_{s+t}^n=\sum_{\pi\in\mathcal P_n}\Bigl(\nu_s^\pi*\bigotimes_{J\in\pi}\nu_t^J\Bigr),\qquad s,t>0,\ n\in\mathbf N.$$

Note that $*$ denotes convolution in the space variables; (ii) and (iii) also involve convolution in the time variable. To state our more explicit versions of (i)–(iv), let $f_1,\dots,f_n$ be any nonnegative, measurable functions on $\mathbf R^d$, and write $x_J=(x_j;\,j\in J)\in(\mathbf R^d)^J$. For $u_{J_{ij}}\in(\mathbf R^d)^{J_{ij}}$, take $u_k=u_{ij}$ when $k\in\{i,j\}$.
$$(\mathrm i')\qquad E_\mu\prod_{i\le n}\xi_tf_i=\sum_{\pi\in\mathcal P_n}\prod_{J\in\pi}\int\mu(du)\int\nu_t^J(dx_J)\prod_{i\in J}f_i(u+x_i);$$
$$(\mathrm{ii}')\qquad\nu_t^J\bigotimes_{i\in J}f_i=\sum_{I\subset J}{}'\int_0^tds\int\nu_s(du)\int\nu_{t-s}^I(dx_I)\int\nu_{t-s}^{J\setminus I}(dx_{J\setminus I})\prod_{i\in J}f_i(u+x_i);$$
$$(\mathrm{iii}')\qquad\nu_t^J\bigotimes_{i\in J}f_i=\sum_{i\ne j}\int_0^tds\int\nu_s^{J_{ij}}(du_{J_{ij}})\prod_{k\in J}\int\nu_{t-s}(dx_k)\,f_k(u_k+x_k);$$
$$(\mathrm{iv}')\qquad\nu_{s+t}^n\bigotimes_{i\le n}f_i=\sum_{\pi\in\mathcal P_n}\int\nu_s^\pi(du_\pi)\prod_{J\in\pi}\int\nu_t^J(dx_J)\prod_{i\in J}f_i(u_J+x_i).$$
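For orientation, the case $n=2$ of (i) and (ii) can be written out in full; this is just a specialization of (i$'$) and (ii$'$) above, with the correlation term coming from pairs of particles descended from a common ancestor that branches at some time $s<t$. For $J=\{1,2\}$ the sum $\sum'_{I\subset J}$ has the two terms $I=\{1\}$ and $I=\{2\}$, which coincide, whence

```latex
E_\mu\xi_t^{\otimes2}=(\mu*\nu_t^1)^{\otimes2}+\mu*\nu_t^{\{1,2\}},\qquad
\nu_t^{\{1,2\}}=2\int_0^t\nu_s*(\nu_{t-s}^1\otimes\nu_{t-s}^1)\,ds,
```

so that, with $\nu_t^1=p_t\cdot\lambda^{\otimes d}$ and $(p_t*f)(u)=\int p_t(x)f(u+x)\,dx$,

```latex
E_\mu[\xi_tf_1\,\xi_tf_2]
=\prod_{i=1,2}\mu(p_t*f_i)
+2\int_0^t\mu\bigl(p_s*[(p_{t-s}*f_1)(p_{t-s}*f_2)]\bigr)\,ds.
```

The factor 2 is the same doubling that the proof below attributes to the constant $q_2=2$ in Dynkin's formula.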

The cluster decomposition (i) and forward recursion (ii) are implicit in Dynkin [5], who works in a very general setting, using series expansions of Laplace transforms; cf. Section 2.2 in [7]. The backward recursion (iii) and Markov property (iv) are believed to be new. First we prove (i), (ii) and (iv).

Partial proof. (i) Use the first assertion in Lemma 3.8.

(ii) In Theorem 1.7 of [5], take $K=\lambda$, $\psi_t(z)=z^2$ and $\eta_0=\delta_0\otimes\mu$, and let $\Pi_x$ be the distribution of a standard Brownian motion starting at $x$. Each term in (ii) appears twice, which accounts for the factor $q_2=2$ in formula 1.6.B of [5]. Alternatively, we may use a probabilistic approach based on Le Gall's snake [28], or we may apply Itô's formula to the martingales $\xi_s(\nu_{t-s}*f)-\mu(\nu_t*f)$, $s\in[0,t]$, as explained for $|J|=2$ in [7], page 39.

(iv) Using (i) repeatedly, along with the Markov property at $s$, we get
\begin{align*}
\sum_{\pi\in\mathcal P_n}\bigotimes_{J\in\pi}(\mu*\nu_{s+t}^J)&=E_\mu\xi_{s+t}^{\otimes n}=E_\mu E_\mu[\xi_{s+t}^{\otimes n}\,|\,\xi_s]=E_\mu E_{\xi_s}\xi_t^{\otimes n}\\
&=E_\mu\sum_{\pi\in\mathcal P_n}\bigotimes_{J\in\pi}(\xi_s*\nu_t^J)=\sum_{\pi\in\mathcal P_n}E_\mu\xi_s^{\otimes\pi}*\bigotimes_{J\in\pi}\nu_t^J\\
&=\sum_{\pi\in\mathcal P_n}\sum_{\kappa\in\mathcal P_\pi}\bigotimes_{I\in\kappa}(\mu*\nu_s^I)*\bigotimes_{J\in\pi}\nu_t^J.
\end{align*}
Now take $\mu=c\delta_0$ with $c>0$, divide by $c$, and let $c\to0$. $\Box$

Part (iii) will be deduced from Theorem 4.4 below, which in turn depends on the following discrete constructions. Say that a tree or branch is defined on $[s,t]$ if it is rooted at time $s$ and all leaves extend to time $t$. It is said to be simple if it has only one leaf, and binary if exactly two branches emanate from each vertex. It is further said to be geometric if its graph in the plane has no self-intersections. Furthermore, we say that a tree or set of trees is marked if distinct marks are assigned to the leaves. A random permutation is called uniform if it is exchangeable, and we say that the marks are random if they are conditionally exchangeable, given the underlying tree structure. Siblings are defined as leaves originating from the same vertex.

Lemma 4.3. There are $n!(n-1)!2^{1-n}$ marked, binary trees on $[0,n]$ with distinct splitting times $1,\dots,n-1$. The following constructions are equivalent and give the same probability to all such trees:

(i) Forward recursion: Proceed in $n-1$ steps, starting from a simple tree on $[0,1]$. After $k-1$ steps, we have a binary tree on $[0,k]$ with $k$ leaves and distinct splitting times $1,\dots,k-1$. Now divide a randomly chosen leaf into two, and extend all leaves to time $k+1$. After the final step, attach random marks to the leaves.

(ii) Backward recursion: Proceed in $n-1$ steps, starting from $n$ simple, marked trees on $[n-1,n]$. After $k-1$ steps, we have $n-k+1$ binary trees on $[n-k,n]$ with totally $n$ leaves and distinct splitting times $n-k+1,\dots,n-1$. Now join two randomly chosen roots, and extend all roots to time $n-k-1$. Continue until all roots are connected.

(iii) Sideways recursion: Let $\tau_1,\dots,\tau_{n-1}$ be a uniform permutation of $1,\dots,n-1$. Proceed in $n-1$ steps, starting from a simple tree on $[0,n]$. After $k-1$ steps, we have a binary tree on $[0,n]$ with $k$ leaves and distinct splitting times $\tau_1,\dots,\tau_{k-1}$. Now attach a new branch on $[\tau_k,n]$ to the last available path. After the final step, attach random marks to the leaves.
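The tree count in Lemma 4.3 can be checked mechanically: the backward recursion chooses an unordered pair of roots at each of the $n-1$ joining steps, so the number of trees is $\binom n2\binom{n-1}2\cdots\binom22$, which should equal the closed form $n!(n-1)!2^{1-n}$. A stdlib-only sketch:

```python
# Verify the count of marked binary trees with distinct splitting times:
# product of pair choices in the backward recursion vs. the closed form
# n!(n-1)!/2^(n-1) stated in Lemma 4.3.
from math import comb, factorial

def backward_count(n):
    c = 1
    for k in range(n, 1, -1):    # n-1 joining steps: k roots -> k-1 roots
        c *= comb(k, 2)
    return c

def closed_form(n):
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)

checks = [backward_count(n) == closed_form(n) for n in range(2, 10)]
```

For instance $n=3$ gives $\binom32\binom22=3$ trees, matching $3!\,2!/2^2=3$.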

Proof. (ii) By the obvious one-to-one correspondence between marked trees and selections of pairs, this construction gives the same probability to all possible trees. The total number of choices is clearly
$$\binom n2\binom{n-1}2\cdots\binom22=\frac{n(n-1)}2\cdot\frac{(n-1)(n-2)}2\cdots\frac{2\cdot1}2=\frac{n!\,(n-1)!}{2^{n-1}}.$$

(i) Before the final marking of leaves, the resulting tree can be realized as a geometric one, which yields a one-to-one correspondence between the $(n-1)!$ possible constructions and the set of all geometric trees. Now any binary tree with $n$ distinct splitting times and with $m$ pairs of siblings can be realized as a geometric tree in $2^{n-m-1}$ different ways. Furthermore, any geometric tree with $n$ leaves and $m$ pairs of siblings can be marked in $n!\,2^{-m}$ nonequivalent ways. Hence, any given tree of this type has probability
$$\frac{2^{n-m-1}}{(n-1)!}\cdot\frac{2^m}{n!}=\frac{2^{n-1}}{n!\,(n-1)!},$$
which is independent of $m$ and hence is the same for all trees. (Note that this agrees with the probability in (ii).)

(iii) Before the final marking, this construction yields a binary, geometric tree with distinct splitting times $1,\dots,n-1$. Conversely, any geometric tree can be realized in this way for a suitable permutation $\tau_1,\dots,\tau_{n-1}$ of $1,\dots,n-1$. The correspondence is one-to-one, since both sets of trees have the same cardinality $(n-1)!$. The proof may now be completed as in case (i). $\Box$

Given a uniform, discrete random tree, as described in Lemma 4.3, we may form a uniform, marked Brownian tree in $\mathbf R^d$ on the time interval $[0,t]$ by a suitable choice of random splitting times and spatial motion. Note that this uniform tree is entirely different from the historical Brownian tree considered in [26] or in Section 3.4 of [7]. Elaborating on the insight of Etheridge [7], Sections 2.1–2.2, we show how the moment measures of a single cluster admit a probabilistic interpretation in terms of such a tree.

Theorem 4.4. Form a marked, binary random tree in $\mathbf R^d$, rooted at the origin at time 0, with $n$ leaves extending to time $t>0$, with branching structure as in Lemma 4.3, with splitting times $\tau_1,\dots,\tau_{n-1}$ given by an independent, uniform binomial process on $[0,t]$, and with spatial motion given by independent Brownian motions along the branches. Then the joint distribution $\mu_t^n$ of the leaves at time $t$ and the cluster moment measure $\nu_t^n$ in Theorem 4.2 are related by $\nu_t^n=n!\,t^{n-1}\mu_t^n$.
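As a quick sanity check on the tree construction (this simulation is an illustration, not part of the paper, and it probes $\mu_t^2$ rather than $\nu_t^2$ itself): for $n=2$ in $d=1$, the tree splits at a single time $\tau\sim U(0,t)$, the two leaves share the root motion up to $\tau$ and then move independently, so under $\mu_t^2$ each leaf has variance $t$ while the pair has covariance $E\tau=t/2$.

```python
# Monte Carlo check of the n = 2 uniform Brownian tree in d = 1 on [0,t]:
# leaves X1, X2 share a Brownian path up to the U(0,t) split time tau, then
# branch independently, giving Var(Xi) = t and Cov(X1, X2) = E[tau] = t/2.
import random

random.seed(3)
t, reps = 1.0, 200000
sx = sxx = sxy = 0.0
for _ in range(reps):
    tau = random.uniform(0.0, t)
    common = random.gauss(0.0, tau ** 0.5)           # shared motion up to tau
    x1 = common + random.gauss(0.0, (t - tau) ** 0.5)
    x2 = common + random.gauss(0.0, (t - tau) ** 0.5)
    sx += x1
    sxx += x1 * x1
    sxy += x1 * x2
mean, var, cov = sx / reps, sxx / reps, sxy / reps
```

With $t=1$ the estimates should be close to $0$, $1$ and $1/2$, respectively.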

Proof. The assertion is obvious for $n=1$. Proceeding by induction, assume that the statement holds for trees up to order $n-1$, where $n\ge2$, and turn to trees of order $n$ marked by $J=\{1,\dots,n\}$. For any $I\subset J$ with $|I|=k\in[1,n)$, Lemma 4.3 shows that the number of marked, discrete trees of order $n$ such that $J$ first splits into $I$ and $J\setminus I$ equals
$$k!\,(k-1)!\,2^{1-k}\,(n-k)!\,(n-k-1)!\,2^{1-n+k}\binom{n-2}{k-1}=(n-2)!\,k!\,(n-k)!\,2^{2-n},$$
where the last factor on the left arises from the choice of $k-1$ splitting times for the $I$-component, among the remaining $n-2$ splitting times for the original tree. Since the total number of trees is $n!(n-1)!2^{1-n}$, the probability that $J$ first splits into $I$ and $J\setminus I$ equals
$$\frac{(n-2)!\,k!\,(n-k)!\,2^{2-n}}{n!\,(n-1)!\,2^{1-n}}=\frac2{n-1}\binom nk^{-1}.$$
Since all genealogies are equally likely, the discrete subtrees marked by $I$ and $J\setminus I$ are conditionally independent and uniformly distributed, and the remaining branching times $2,\dots,n-1$ are divided uniformly between the two trees. Since the splitting times $\tau_1<\cdots<\tau_{n-1}$ of the continuous tree form a uniform binomial process on $[0,t]$, Lemma 2.8 shows that $\mathcal L(\tau_1)$ has density $(n-1)(t-s)^{n-2}t^{1-n}$, whereas $\tau_2,\dots,\tau_{n-1}$ form a binomial process on $[\tau_1,t]$, conditionally on $\tau_1$. Furthermore, Lemma 2.6 shows that the splitting times of the two subtrees form independent binomial processes on $[\tau_1,t]$, conditionally on $\tau_1$ and the initial split of $J$ into $I$ and $J\setminus I$. Combining these facts with the conditional independence of the spatial motion, we see that the continuous subtrees marked by $I$ and $J\setminus I$ are conditionally independent uniform Brownian trees on $[\tau_1,t]$, given the spatial motion up to time $\tau_1$ and the split at time $\tau_1$ of the original index set into $I$ and $J\setminus I$.

Conditioning as indicated and using the induction hypothesis and Theorem 4.2(ii), we get
$$\mu_t^n=\sum_{I\subset J}{}'\binom nk^{-1}t^{1-n}\int_0^t(t-s)^{n-2}\,\mu_s*(\mu_{t-s}^I\otimes\mu_{t-s}^{J\setminus I})\,ds$$

$$=\sum_{I\subset J}{}'\binom nk^{-1}\frac{t^{1-n}}{k!\,(n-k)!}\int_0^t\nu_s*(\nu_{t-s}^I\otimes\nu_{t-s}^{J\setminus I})\,ds
=\frac{t^{1-n}}{n!}\sum_{I\subset J}{}'\int_0^t\nu_s*(\nu_{t-s}^I\otimes\nu_{t-s}^{J\setminus I})\,ds
=\frac{t^{1-n}}{n!}\,\nu_t^n,$$
where $|I|=k$. Note that a factor 2 cancels out in the first step, since every partition $\{I,J\setminus I\}$ is counted twice. This completes the induction. $\Box$

Proof of Theorem 4.2(iii). Let $\tau_1<\cdots<\tau_{n-1}$ denote the splitting times of the Brownian tree in Theorem 4.4. By Lemma 2.8, $\tau_{n-1}$ has density $(n-1)s^{n-2}t^{1-n}$ (in $s$ for fixed $t$), and given $\tau_{n-1}$, the remaining times $\tau_1,\dots,\tau_{n-2}$ form a uniform binomial process on $[0,\tau_{n-1}]$. By Lemma 4.3, the entire structure up to time $\tau_{n-1}$ is then conditionally a uniform Brownian tree of order $n-1$, independent of the last branching and the motion up to time $t$. Defining $\mu_t^J$ as before and conditioning on $\tau_{n-1}$, we get
$$\mu_t^J=(n-1)\binom n2^{-1}\sum_{\{i,j\}\subset J}\int_0^t\frac{s^{n-2}}{t^{n-1}}\,(\mu_s^{J_{ij}}*\mu_{t-s}^{\otimes J})\,ds.$$
By Theorem 4.4 we may substitute

$$\mu_t^J=\frac{\nu_t^J}{n!\,t^{n-1}},\qquad\mu_s^{J_{ij}}=\frac{\nu_s^{J_{ij}}}{(n-1)!\,s^{n-2}},\qquad\mu_{t-s}=\nu_{t-s},$$
and the assertion follows. Here again a factor 2 cancels out in the last computation, since every pair $\{i,j\}$ appears twice in the summation $\sum_{i\ne j}$. $\Box$
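The density of $\tau_{n-1}$ used above is just that of the largest of $n-1$ i.i.d. $U(0,t)$ variables, so its CDF is $(s/t)^{n-1}$. A seeded stdlib-only check of this fact (the particular values $t=2$, $n=5$, $s_0=1.2$ are arbitrary choices for the test):

```python
# Check the density (n-1) s^(n-2) t^(1-n) of the last splitting time: it is
# the density of the maximum of n-1 i.i.d. U(0,t) points, whose CDF at s0
# is (s0/t)^(n-1).  Compare with a Monte Carlo estimate.
import random

random.seed(7)
t, n, s0, reps = 2.0, 5, 1.2, 100000
hits = sum(
    1 for _ in range(reps)
    if max(random.uniform(0.0, t) for _ in range(n - 1)) <= s0
)
empirical = hits / reps
theory = (s0 / t) ** (n - 1)     # = 0.6**4 = 0.1296
```

The empirical frequency matches the closed-form CDF up to sampling error, and integrating the density over $[0,t]$ gives 1, as it must.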

To describe the Palm distributions of a single cluster, we begin with some basic properties of the Brownian excursion and snake. Given a process $X$ in a space $S$ and some random times $\sigma\le\tau$, we define the restriction of $X$ to $[\sigma,\tau]$ as the process $Y_s=X_{\sigma+s}$ for $s\le\tau-\sigma$ and $Y_s=\Delta$ for $s>\tau-\sigma$, where $\Delta\notin S$. By a Markov time for $X$ we mean a random time $\sigma$ such that the restrictions of $X$ to $[0,\sigma]$ and $[\sigma,\infty]$ are conditionally independent, given $X_\sigma$. We quote some distributional facts for the Brownian excursion, first noted by Williams [36]; cf. [25].

Lemma 4.5. Given a Brownian excursion X, conditioned to reach height t> 0, let σ and τ be the first and last times that X visits t, and write ρ for the first time X attains its minimum on [σ, τ]. Then Xρ is U(0,t), and σ, ρ and τ are Markov times for X.

Some induced properties of the Brownian snake are implicit in Le Gall [26, 28]:

Lemma 4.6. Given X, σ, ρ and τ as in Lemma 4.5, let Y be a Brownian snake with contour process X. Then Yσ and Yτ are Brownian motions on [0,t] extending Yρ on [0, Xρ], both are independent of Xρ, and σ, ρ and τ are Markov times for Y .

Proof. The corresponding properties are easily verified for the approximating discrete snake based on a simple random walk (cf. [7], Section 3.6), and they extend in the limit to the continuous snake. Alternatively, we may approximate the Brownian snake, as in [26], by a discrete tree under the Brownian excursion, for which the corresponding properties are again obvious. $\Box$

We now form an extended Brownian excursion, generating an extended Brownian snake, related to the uniform Brownian tree in Theorem 4.4. The unmarked tree, constructed as in Lemma 4.3(iii), though with Brownian spatial motion and with $\tau_1,\dots,\tau_{n-1}$ chosen to be i.i.d. $U(0,t)$, is referred to below as a discrete Brownian snake on $[0,t]$ of order $n$.

Lemma 4.7. Given a Brownian excursion $X$, conditioned to reach height $t>0$, form $X^n$ by inserting $n$ independent copies of the path between the first and last visits to $t$, let $\tau_1<\cdots<\tau_n$ be the connection times of those $n+1$ paths, and form a Brownian snake $Y^n$ with contour process $X^n$. Then the paths $Y^n_{\tau_1},\dots,Y^n_{\tau_n}$ form a discrete Brownian snake on $[0,t]$.

Proof. Use Lemma 4.6 and its proof. 

The extended Brownian snake $Y^n$ generates a measure-valued process $\eta^n$, in the same way as the ordinary snake $Y$ generates a single cluster $\eta$; cf. [26, 28] or [7], page 69. We show how the $n$th order Palm distributions with respect to $\eta_t$ can be obtained from $\eta^n$ by suitable conditioning. The "cluster terms" in Lemma 3.8 can then be obtained by simple averaging, based on the elementary Lemma 3.9.

Theorem 4.8. Let $Y^n$ be an extended Brownian snake with connection times $\tau_1,\dots,\tau_n$, generating a measure-valued process $\eta^n$. Choose an independent, uniform permutation $\pi$ of $1,\dots,n$, and define $\beta_k=Y^n(\tau_{\pi_k},t)$, $k\le n$. Then for any initial measure $\mu$ on $\mathbf R^d$, the $n$th order Palm distributions of $\eta$ with respect to $\eta_t$ are given a.e. $\lambda^{\otimes nd}$ by
$$(11)\qquad\mathcal L_\mu[\eta\,\|\,\eta_t^{\otimes n}]_\beta=\mathcal L_\mu[\eta^n\,|\,\beta]\quad\text{a.s.}$$

Proof. Let $\tau_0$ and $\tau_1$ be the first and last times that $X$ visits $t$, and let $\tau_0,\dots,\tau_{n+1}$ be the endpoints of the corresponding $n+1$ paths for the extended process $X^n$. Introduce the associated local times $\sigma_0,\dots,\sigma_{n+1}$ of $X^n$ at $t$, so that $\sigma_0=0$ and the differences $\sigma_k-\sigma_{k-1}$ are independent and exponentially distributed with rate $c=t^{-1}$. Writing $\sigma\circ\pi=(\sigma_{\pi_1},\dots,\sigma_{\pi_n})$, we get, by Lemma 2.9,
$$(12)\qquad Ef(\sigma\circ\pi,\sigma_{n+1})=\frac{c^n}{n!}\,E\int_{[0,\sigma_1]^n}f(s,\sigma_1)\,ds.$$

By excursion theory (cf. [15], pages 432–442), the shifted path $\theta_{\tau_0}X$ on $[0,\tau_1-\tau_0]$ is generated by a Poisson process $\zeta\perp\!\!\!\perp\sigma_1$ of excursions from $t$, restricted to the set of paths not reaching level 0. By the strong Markov property, we can use the same process $\zeta$ to encode the excursions of $X^n$ on the extended interval $[\tau_0,\tau_{n+1}]$, provided that we choose $\zeta\perp\!\!\!\perp(\sigma_1,\dots,\sigma_{n+1})$. By Lemma 4.5, the restrictions of $X$ to the intervals $[0,\tau_0]$ and $[\tau_1,\infty]$ are independent of the intermediate path, and the corresponding property holds for the restrictions of $X^n$ to $[0,\tau_0]$ and $[\tau_{n+1},\infty]$, by the construction in Lemma 4.7. For convenience we may extend $\zeta$ to a point process $\zeta'$ on $[0,\infty]$, using points at 0 and $\infty$ to encode the initial and terminal paths of $X$ or $X^n$. From (12) we get, by independence,
$$(13)\qquad Ef(\sigma\circ\pi,\sigma_{n+1},\zeta')=\frac{c^n}{n!}\,E\int_{[0,\sigma_1]^n}f(s,\sigma_1,\zeta')\,ds.$$

The inverse processes $T$ of $X$ and $T_n$ of $X^n$ are obtained from the pairs $(\sigma_1,\zeta')$ or $(\sigma_{n+1},\zeta')$, respectively, by a common measurable construction, and we note that $T(\sigma_k)=\tau_k$ for $k=0,1$ and $T_n(\sigma_k)=\tau_k$ for $k=0,\dots,n+1$. Furthermore, $\xi=\lambda\circ T^{-1}$ and $\xi_n=\lambda\circ T_n^{-1}$ are the local time random measures of $X$ and $X^n$, respectively, at height $t$. Since the entire excursions $X$ and $X^n$ may be recovered from the same pairs by a common measurable mapping, we get, from (13),
$$(14)\qquad Ef(\tau\circ\pi,X^n)=\frac{c^n}{n!}\,E\int_{[0,\sigma_1]^n}f(T\circ s,X)\,ds=\frac{c^n}{n!}\,E\int f(r,X)\,\xi^{\otimes n}(dr),$$
where $T\circ s=(T_{s_1},\dots,T_{s_n})$, and the second equality holds by the substitution rule for Lebesgue–Stieltjes integrals.

Now introduce some random snakes $Y$ and $Y^n$ with contour processes $X$ and $X^n$, respectively, with initial distribution $\mu$, and with Brownian spatial motion in $\mathbf R^d$. By Le Gall's path-wise construction of the snake [26], or alternatively by the discrete approximation described in [7], the conditional distributions $\mathcal L_\mu[Y\,|\,X]$ and $\mathcal L_\mu[Y^n\,|\,X^n]$ are given by a common probability kernel. The same constructions justify the conditional independence $Y^n\perp\!\!\!\perp_{X^n}(\tau\circ\pi)$, and so, by (14),
$$E_\mu f(\tau\circ\pi,Y^n)=\frac{c^n}{n!}\,E_\mu\int f(r,Y)\,\xi^{\otimes n}(dr).$$

Since $\beta_k=Y^n(\tau_{\pi_k},t)$ for all $k$, and $\eta_t$ is the image of $\xi$ under the map $Y(\cdot,t)$, the substitution rule for integrals yields
$$E_\mu f(\beta,Y^n)=\frac{c^n}{n!}\,E_\mu\int f(Y(r,t),Y)\,\xi^{\otimes n}(dr)=\frac{c^n}{n!}\,E_\mu\int f(x,Y)\,\eta_t^{\otimes n}(dx).$$

Finally, the entire clusters $\eta$ and $\eta^n$ are generated by $Y$ and $Y^n$, respectively, through a common measurable mapping, and so
$$E_\mu f(\beta,\eta^n)=\frac{c^n}{n!}\,E_\mu\int f(x,\eta)\,\eta_t^{\otimes n}(dx),$$
which extends by monotone convergence to any initial measure. The assertion now follows by direct disintegration, or by the conditioning approach to Palm distributions described in Section 3. $\Box$

5. Moment densities. Here we collect some technical estimates and continuity properties for the moment densities of a DW-process, useful in subsequent sections. We begin with a result for general Brownian trees, defined as random trees in $\mathbf R^d$ with spatial motion given by independent Brownian motions. For $x\in(\mathbf R^d)^n$, let $r_x$ denote the distance from $x$ to the diagonal set $D_n=((\mathbf R^d)^{(n)})^c$.

Lemma 5.1. For any marked Brownian tree on $[0,s]$ with $n$ leaves and paths in $\mathbf R^d$, the joint distribution at time $s$ has a continuous density $q$ on $(\mathbf R^d)^{(n)}$ satisfying
$$q(x)\le(1\vee tdn\,r_x^{-2})^{nd/2}\,p_{nt}^{\otimes n}(x),\qquad x\in(\mathbf R^d)^{(n)},\ s\le t.$$

Proof. Conditionally on tree structure and splitting times, the joint distribution is a convolution of centered Gaussian distributions $\mu_1,\dots,\mu_n$, supported by some linear subspaces $S_1\subset\cdots\subset S_n=\mathbf R^{nd}$ of dimensions $d,2d,\dots,nd$. The tree structure is specified by a nested sequence of partitions $\pi_1,\dots,\pi_n$ of the index set $\{1,\dots,n\}$, and we write $h_1,\dots,h_n$ for the times between branchings. Then Lemma 2.1 shows that $\mu_k$ has principal variances $h_k|J|$, $J\in\pi_k$, each with multiplicity $d$. Writing $\nu_t^{(m)}=p_t\cdot\lambda^{\otimes m}$ and noting that $|J|\le n-k+1$ for $J\in\pi_k$, we get, by Lemma 2.2,
$$\mu_k\le(n-k+1)^{(k-1)d/2}\,\nu_{(n-k+1)h_k}^{(kd)}\otimes\delta_0^{\otimes(n-k)d},\qquad k\le n.$$
Putting

$$c=\prod_{k\le n}(n-k+1)^{(k-1)d/2}\le n^{n^2d/2},\qquad s_k=(n-k+1)h_k,\quad t_k=s_k+\cdots+s_n,\qquad k\le n,$$
we get
$$c^{-1}\mathop{*}_{k\le n}\mu_k\le\mathop{*}_{k\le n}(\nu_{s_k}^{(kd)}\otimes\delta_0^{\otimes(n-k)d})=\bigotimes_{k\le n}\mathop{*}_{j\ge k}\nu_{s_j}^{(d)}=\bigotimes_{k\le n}\nu_{t_k}^{(d)}.$$
Consider the orthogonal decomposition $x=x_1+\cdots+x_n$ in $\mathbf R^{nd}$ with $x_k\in S_k\ominus S_{k-1}$, and write $x'=x-x_n$. Since $|x_n|$ equals the orthogonal distance of $x$ to the subspace $S_{n-1}\subset\bar D_n$, we get $|x_n|\ge r_x$. Using Lemma 2.3 and noting that $h_n=t_n\le t_k\le nt$, we see that the continuous density of $c^{-1}(\mu_1*\cdots*\mu_n)$ at $x$ is bounded by
$$\prod_{k\le n}p_{t_k}^{\otimes d}(x_k)=\prod_{k\le n}(2\pi t_k)^{-d/2}e^{-|x_k|^2/2t_k}\le(2\pi h_n)^{-nd/2}e^{-|x_n|^2/2h_n}\prod_{k<n}e^{-|x_k|^2/2nt},$$
from which the asserted bound follows by the choice of $c$. $\Box$

This yields a useful estimate for the moment densities of a single cluster.

Lemma 5.2. For a DW-process in $\mathbf R^d$, the cluster moment measures $\nu_t^n=E_0\eta_t^{\otimes n}$ have densities $q_t^n(x)$ that are jointly continuous in $(x,t)\in(\mathbf R^d)^{(n)}\times(0,\infty)$ and satisfy the uniform bounds
$$\sup_{s\le t}q_s^n(x)\lesssim(1\vee r_x^{-2}t)^{nd/2}\,p_{nt}^{\otimes n}(x),\qquad x\in(\mathbf R^d)^{(n)},\ t>0.$$

Proof. By Theorem 4.4 it is equivalent to consider the joint endpoint distribution of a uniform, $n$th order Brownian tree in $\mathbf R^d$ on the interval $[0,t]$, where the stated estimate holds by Lemma 5.1. To prove the asserted continuity, we may condition on tree structure and splitting times to get a nonsingular Gaussian distribution, for which the assertion is obvious. The unconditional statement then follows by dominated convergence, based on the uniform bound in Lemma 5.1. $\Box$

We proceed with a density version of Theorem 4.2.

Theorem 5.3. For a DW-process in $\mathbf R^d$ with initial measure $\mu\ne0$, the moment measures $E_\mu\xi_t^{\otimes n}$ and $\nu_t^n=E_0\eta_t^{\otimes n}$ have positive, jointly continuous densities on $(\mathbf R^d)^{(n)}\times(0,\infty)$, satisfying density versions of the identities in Theorem 4.2(i)–(iv).

Proof. Let $q_t^n$ denote the jointly continuous densities of $\nu_t^n$ obtained in Lemma 5.2. As mixtures of normal densities, they are again strictly positive. Inserting the versions $q_t^n$ into the convolution formulas of Theorem 4.2(i), we get some strictly positive densities of the measures $E_\mu\xi_t^{\otimes n}$, and the joint continuity of those densities follows by extended dominated convergence (cf. [15], page 12) from the estimates in Lemma 5.2 and the joint continuity in Lemma 2.5.

Inserting the continuous densities $q_t^n$ into the expressions on the right of Theorem 4.2(ii)–(iv), we obtain densities of the measures on the left. If the latter functions can be shown to be continuous on $(\mathbf R^d)^{(J)}$ or $(\mathbf R^d)^{(n)}$, respectively, they must agree with the continuous densities $q_t^J$ or $q_{s+t}^n$, and the desired identities follow. By Lemma 5.2 and extended dominated convergence, it is enough to prove the required continuity with $q_s$ and $q_s^n$ replaced by the normal densities $p_t$ and $p_{nt}^{\otimes n}$, respectively. Hence, we need to show that the convolutions $p_t*p_{nt}^{\otimes n}$, $p_{(n-1)t}^{\otimes(n-1)}*p_{nt}^{\otimes n}$ and $p_{nt}^{\otimes\pi}*p_{nt}^{\otimes n}$ are continuous, which is clear since they are all nonsingular Gaussian. $\Box$

We turn to the conditional moment densities.

Theorem 5.4. Let $\xi$ be a DW-process in $\mathbf R^d$ with $\xi_0=\mu$. Then for every $n$ there exist some processes $M_s^t$ on $\mathbf R^{nd}$, $0\le s<t$, such that:
(i) $E_\mu[\xi_t^{\otimes n}\,|\,\mathcal F_s]=M_s^t\cdot\lambda^{\otimes nd}$ a.s., $s<t$;
(ii) $M_s^t(x)$ is a martingale in $s$ for fixed $x\in(\mathbf R^d)^{(n)}$ and $t>0$;
(iii) $M_s^t(x)$ is continuous, a.s. and in $L^1$, in $(x,t)\in(\mathbf R^d)^{(n)}\times(s,\infty)$ for fixed $s\ge0$.

Proof. Write $S_n=(\mathbf R^d)^{(n)}$, let $q_t^n$ denote the continuous densities in Lemma 5.2, and let $x_J$ be the projection of $x\in\mathbf R^{nd}$ onto $(\mathbf R^d)^J$. By the Markov property of $\xi$ and Theorem 4.2(i), the random measures $E_\mu[\xi_t^{\otimes n}\,|\,\xi_s]$ have a.s. densities
$$(15)\qquad M_s^t(x)=\sum_{\pi\in\mathcal P_n}\prod_{J\in\pi}(\xi_s*q_{t-s}^J)(x_J),\qquad x\in S_n,$$
which are a.s. continuous in $(x,t)\in S_n\times(s,\infty)$ for fixed $s\ge0$ by Theorem 5.3. Indeed, the previous theory applies with $\mu$ replaced by $\xi_s$, since $E_\mu\xi_sp_t=\mu p_{s+t}<\infty$ and hence $\xi_sp_t<\infty$ for every $t>0$ a.s.

To prove the $L^1$-continuity in (iii), it suffices, by Lemma 1.32 in [15], to show that $E_\mu M_s^t(x)$ is continuous in $(x,t)\in S_n\times(s,\infty)$. By Lemma 5.2 and extended dominated convergence, it is then enough to prove the a.s. and $L^1$ continuity in $x\in S_n$ alone, for the processes in (15) with $q_{t-s}^J$ replaced by $p_t^{\otimes J}$. Here the a.s. convergence holds by Lemma 2.5, and so by Theorem 4.2(i) it remains to show that $\mu*q_s^n*p_t^{\otimes n}$ is continuous on $S_n$ for fixed $s$, $t$, $\mu$ and $n$. Since $q_s^n*p_t^{\otimes n}=\nu_s^n*p_t^{\otimes n}$ is continuous on $S_n$, by Theorem 4.4 and Lemma 5.1, it suffices, by Lemma 5.2 and extended dominated convergence, to show that $\mu*p_t^{\otimes n}$ is continuous on $\mathbf R^{nd}$ for fixed $t$, $\mu$ and $n$, which holds by Lemma 2.5.

To prove (ii), let $B\subset\mathbf R^{nd}$ be measurable, and note that
$$\lambda^{\otimes nd}[M_0^t;B]=E_\mu\xi_t^{\otimes n}B=E_\mu E_\mu[\xi_t^{\otimes n}B\,|\,\xi_s]=E_\mu\lambda^{\otimes nd}[M_s^t;B]=\lambda^{\otimes nd}[E_\mu M_s^t;B],$$
which implies $M_0^t=E_\mu M_s^t$ a.e. Since both sides are continuous on $S_n$, they agree identically on the same set, and so by (15)
$$E_\mu\sum_{\pi\in\mathcal P_n}\prod_{J\in\pi}(\xi_s*q_{t-s}^J)(x_J)=\sum_{\pi\in\mathcal P_n}\prod_{J\in\pi}(\mu*q_t^J)(x_J),\qquad s<t,\ x\in S_n.$$
Replacing $\mu$ by $\xi_r$ for arbitrary $r>0$ and using the Markov property at $r$, we obtain
$$E_\mu[M_{r+s}^{r+t}(x)\,|\,\mathcal F_r]=M_r^{r+t}(x)\quad\text{a.s.},\qquad x\in S_n,\ r>0,\ 0\le s<t,$$
which yields the martingale property in (ii). $\Box$

We turn to a simple truncation property of the conditional densities.

Lemma 5.5. Let $\xi$ be a DW-process in $\mathbf{R}^d$ with initial measure $\mu$, fix some disjoint, open sets $B_1,\dots,B_n\subset\mathbf{R}^d$, and put $B=\times_kB_k$ and $U=\bigcup_kB_k$. Then as $h\to0$ we have, uniformly for $(x,t)\in B\times(0,\infty)$ in compacts,
\[ \sum_{\pi\in\mathcal P_n}E_\mu\prod_{J\in\pi}\bigl((1_{U^c}\xi_{t-h}) * q^J_h\bigr)(x_J)\to0. \]

Proof. Writing $t=s+h$ and using the notation and results of Theorem 5.4, we see that the left-hand side is bounded by $E_\mu M^t_s(x)=M^t_0(x)$. By Lemma 5.2, this is locally bounded by a sum of products of convolutions $(\mu * p^{\otimes J}_t)(x_J)$, and Lemma 2.4 yields a similar uniform bound, valid in some neighborhood of every fixed pair $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$. Letting $\mu\downarrow0$ locally and using Lemma 2.5 and dominated convergence, we get $E_\mu M^t_s(x)\to0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. This reduces the proof to the case of bounded $\mu$. We may then estimate the expression on the left by
\[ \sum_{\pi\in\mathcal P_n}\bigl\|E_\mu\xi^{\otimes\pi}_{t-h}\bigr\|\prod_{J\in\pi}\sup_{u\in U^c}q^J_h(x_J-u). \]

By Theorems 4.2(i) and 4.4, the norms $\|E_\mu\xi^{\otimes\pi}_{t-h}\|$ are bounded for bounded $t-h$. Furthermore, Lemma 5.2 shows that the functions $q^J_h$ may be estimated by the corresponding normal densities $p^{\otimes J}_h$, for which the desired uniform convergence is obvious. □

We need some more precise estimates of the moment densities near the diagonals. Here $q^n_{\mu,t}$ denotes the continuous density of $E_\mu\xi_t^{\otimes n}$ in Theorem 5.3.

Lemma 5.6. Let $\xi$ be a DW-process in $\mathbf{R}^d$. Then $E_\mu\xi_s^{\otimes n} * p^{\otimes n}_h$ is continuous on $\mathbf{R}^{nd}$ and such that, for fixed $t>0$, uniformly on $\mathbf{R}^{nd}$ and in $s\le t$, $h>0$ and $\mu$,
\[ E_\mu\xi_s^{\otimes n} * p^{\otimes n}_h \lesssim (1\vee h^{-1}t)^{nd/2}\sum_{\pi\in\mathcal P_n}\bigotimes_{J\in\pi}(\mu * p^{\otimes J}_{nt+h}) < \infty. \]
Furthermore, $E_\mu\xi_s^{\otimes n} * p^{\otimes n}_h\to q^n_{\mu,t}$ on $(\mathbf{R}^d)^{(n)}$ as $s\to t$ and $h\to0$.

Proof. By Theorem 4.2(i) it suffices to show that $\nu^n_s * p^{\otimes n}_h\lesssim(1\vee h^{-1}t)^{nd/2}p^{\otimes n}_{nt+h}$, uniformly for $s\le t$. By Theorem 4.4 we may replace $\nu^n_s$ by the distribution of the endpoint vector $\gamma^n_s$ of a uniform Brownian tree. Conditioning on tree structure and splitting times, we see from Lemma 2.1 that $\gamma^n_s$ becomes centered Gaussian with principal variances bounded by $nt$. Convolving with $p^{\otimes n}_h$ gives a centered Gaussian density with principal variances in $[h,nt+h]$, and Lemma 2.2 yields the required bound for the latter density in terms of the rotationally symmetric version $p^{\otimes n}_{nt+h}$. Taking expected values yields the corresponding unconditional bound. The asserted continuity may now be proved as in the case of Lemma 2.5.

To prove the last assertion, consider first the corresponding statement for a single cluster. Here both sides are mixtures of similar normal densities, obtained by conditioning on splitting times and branching structure in the equivalent Brownian trees of Theorem 4.4, and the statement results from an elementary approximation of the uniform binomial process on $[0,t]$ by a similar process on $[0,s]$. The general result now follows by dominated convergence from the density version of Theorem 4.2(i) established in Theorem 5.3. □

To state the next result, we use, for $x=(x_1,\dots,x_n)\in(\mathbf{R}^d)^n$ and $k\in[1,n]$, the notation $x^k=(x_1,\dots,x_k)$.

Lemma 5.7. For any $\mu$ and $1\le k\le n$ we have, uniformly for $0<h\le r\le(t\wedge\frac12)$ and $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts,
\[ \bigl(E_\mu\xi_t^{\otimes(n+k)} * (p^{\otimes n}_h\otimes p^{\otimes k}_r)\bigr)(x,x^k) \lesssim \begin{cases} r^{k(1-d/2)}, & d\ge3,\\ |\log r|^k, & d=2.\end{cases} \]

Proof. First we prove a similar estimate for the moment measures $\nu^{n+k}_t$ of a single cluster. By Theorem 4.4 it is equivalent to consider the distribution $\mu^{n+k}_t$ for the endpoint vector $(\gamma_1,\dots,\gamma_{n+k})$ of a uniform Brownian tree on $[0,t]$. Then let $\tau_i$ and $\alpha_i$ be the time and place where leaf number $n+i$ is attached, and put $\tau=(\tau_i)$ and $\alpha=(\alpha_i)$. Let $\mu^n_{t|\tau}$ and $\mu^n_{t|\tau,\alpha}$ denote the conditional distributions of $(\gamma_1,\dots,\gamma_n)$, given $\tau$ or $(\tau,\alpha)$, respectively, and put $u=t+r$. Then we have, uniformly for $h$ and $r$ as above and $x\in(\mathbf{R}^d)^{(n)}$,
\[ \bigl(\mu^{n+k}_t * (p^{\otimes n}_h\otimes p^{\otimes k}_r)\bigr)(x,x^k) = E(\mu^n_{t|\tau,\alpha} * p^{\otimes n}_h)(x)\prod_{i\le k}p_{u-\tau_i}(x_i-\alpha_i) \]
\[ \le E(\mu^n_{t|\tau,\alpha} * p^{\otimes n}_h)(x)\prod_{i\le k}(u-\tau_i)^{-d/2} \lesssim q^n_u(x)\Bigl(\int_r^us^{-d/2}\,ds\Bigr)^k \lesssim q^n_u(x)\begin{cases} r^{k(1-d/2)},\\ |\log r|^k,\end{cases} \]
when $d\ge3$ or $d=2$, respectively. Here the first equality holds since $(\gamma_1,\dots,\gamma_n)$ and $\gamma_{n+1},\dots,\gamma_{n+k}$ are conditionally independent, given $\tau$ and $\alpha$. The second relation holds since $\|p_r\|\le r^{-d/2}$. We may now use the chain rule for conditional expectations to replace $\mu^n_{t|\tau,\alpha}$ by $\mu^n_{t|\tau}$. Next we apply Lemma 2.7 twice, first to replace $\mu^n_{t|\tau}$ by $\mu^n_t$, then to replace $\tau_{n+1},\dots,\tau_{n+k}$ by a uniform binomial process on $[0,t]$. We also note that $\mu^n_t * \nu^{\otimes n}_h\lesssim\mu^n_u$ by the corresponding property of the binomial process. This implies $\mu^n_t * p^{\otimes n}_h\lesssim q^n_u$, since both sides have continuous densities outside the diagonals by Lemma 5.2, justifying the third step. The last step is elementary calculus.

Since the previous estimate is uniform in $x\in(\mathbf{R}^d)^{(n)}$, it remains valid for the measures $\mu * \nu^{n+k}_t$, with $q^n_u$ replaced by the convolution $\mu * q^n_u$, which is bounded on compacts in $(\mathbf{R}^d)^{(n)}\times(0,\infty)$ by Lemma 5.2. Finally, Theorem 4.2(i) shows, as before, that $(\mu * \nu^n_t * p^{\otimes n}_h)(x)\lesssim(\mu * q^n_u)(x)$ for $h\le t$, which is again locally bounded. □

We conclude with two technical results, needed in the next section.
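Returning briefly to the elementary calculus step at the end of the proof of Lemma 5.7: the dichotomy between $r^{1-d/2}$ for $d\ge3$ and $|\log r|$ for $d=2$ comes from $\int_r^u s^{-d/2}\,ds$. A quick numerical confirmation (an aside, not from the paper):

```python
import math

def tail_integral(r, u, d, n=100000):
    # Midpoint-rule approximation of the integral of s**(-d/2) over [r, u].
    h = (u - r) / n
    return sum((r + (i + 0.5) * h) ** (-d / 2.0) for i in range(n)) * h

u, r = 1.0, 1e-3
val3 = tail_integral(r, u, 3)   # d >= 3: of order r**(1 - d/2)
val2 = tail_integral(r, u, 2)   # d = 2:  equals log(u/r) = |log r| here
print(val3, 2.0 * (r ** -0.5 - 1.0))   # exact antiderivative check for d = 3
print(val2, math.log(u / r))
```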

Lemma 5.8. For any initial measure $\mu\in\mathcal M_d$ and a partition $\pi\in\mathcal P_n$ with $|\pi|<n$, we have
\[ \mu * \nu^\pi_t * \bigotimes_{J\in\pi}\nu^J_h = q_{t,h}\cdot\lambda^{\otimes nd},\qquad t,h>0, \]
where $q_{t,h}(x)\to0$ as $h\to0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts.

Proof. For $x=(x_1,\dots,x_n)\in(\mathbf{R}^d)^{(n)}$, write $\Delta=\min_{i\ne j}|x_i-x_j|$, and note that
\[ \inf_{u\in(\mathbf{R}^d)^\pi}\sum_{J\in\pi}\sum_{i\in J}|x_i-u_J|^2 \ge (\Delta/2)^2\sum_{J\in\pi}(|J|-1) \ge \Delta^2/4. \]

Letting $q_h$ denote the continuous density of $\bigotimes_{J\in\pi}\nu^J_h$ on $(\mathbf{R}^d)^{(n)}$ and using Lemma 5.2, we get as $h\to0$
\[ \sup_{u\in(\mathbf{R}^d)^\pi}q_h(x-u) \lesssim \sup_{u\in(\mathbf{R}^d)^\pi}\prod_{J\in\pi}\prod_{i\in J}p_{nh}(x_i-u_J) \lesssim h^{-nd/2}e^{-\Delta^2/8nh}\to0, \]
uniformly for $x\in(\mathbf{R}^d)^{(n)}$ in compacts. Since $\|\nu^\pi_t\| = |\pi|!\,t^{|\pi|-1}$ by Theorem 4.4, we conclude that
\[ \sup_{u\in(\mathbf{R}^d)^\pi}(\nu^\pi_t * q_h)(x-u)\to0,\qquad h\to0, \]
uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times\mathbf{R}_+$ in compacts.

Since the densities $q^n_t$ of $\nu^n_t$ satisfy $\nu^\pi_t * \bigotimes_{J\in\pi}q^J_h\le q^n_{t+h}$ by Theorems 4.2(iv) and 5.3, we have $\nu^\pi_t * q_h\le q^n_{t+h}$. Using Lemmas 2.4 and 5.2 and writing $\bar u=(u,\dots,u)$, we get
\[ (\nu^\pi_t * q_h)(x-u) \lesssim p^{\otimes n}_b(-\bar u) = p^{\otimes n}_b(\bar u),\qquad u\in\mathbf{R}^d, \]
for some constant $b>0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. Here $\int p^{\otimes n}_b(\bar u)\,\mu(du)<\infty$, since $\mu p_t<\infty$ for all $t>0$. Letting $h=h_n\to0$ and restricting $(x,t)=(x_n,t_n)$ to a compact subset of $(\mathbf{R}^d)^{(n)}\times(0,\infty)$, we get by dominated convergence
\[ q_{t,h}(x) = (\mu * \nu^\pi_t * q_h)(x) = \int\mu(du)\,(\nu^\pi_t * q_h)(x-u)\to0, \]
which yields the required uniform convergence. □
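A numerical aside (not from the paper): in the bound above, the Gaussian factor $e^{-\Delta^2/8nh}$ eventually beats the polynomial blow-up $h^{-nd/2}$ as $h\to0$; the product is not monotone, but it does tend to 0:

```python
import math

def bound(h, n=2, d=3, delta=1.0):
    # the right-hand side h**(-n*d/2) * exp(-delta**2 / (8*n*h))
    return h ** (-n * d / 2.0) * math.exp(-delta ** 2 / (8.0 * n * h))

vals = [bound(10.0 ** -k) for k in range(1, 8)]
print(vals)  # rises at first, then collapses to 0
```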

For the clusters $\eta$ of a DW-process in $\mathbf{R}^d$, we define
\[ \nu_{h,\varepsilon}(dx) = E_0\bigl[\eta_h(dx);\ \sup\nolimits_t\eta_t(B^\varepsilon_x)^c>0\bigr],\qquad h,\varepsilon>0,\ x\in\mathbf{R}^d. \]

Lemma 5.9. For any initial measure $\mu\in\mathcal M_d$, we have
\[ \mu * \nu^n_t * (\nu_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h) = q^\varepsilon_{t,h}\cdot\lambda^{\otimes nd},\qquad t,h,\varepsilon>0, \]
where $q^\varepsilon_{t,h}(x)\to0$ as $\varepsilon^{2+r}\ge h\to0$ for some $r>0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts.

Proof. Let $\rho$ be the span of $\eta$ from 0, and put $T(r)=\|P_0[\rho>r\,\|\,\eta_1]\|$ and $h'=h^{1/2}$. By Palm disintegration and Lemma 8.1(iv), $\nu_{h,\varepsilon}$ has a density bounded by
\[ p_h(x)\,T\Bigl(\frac{\varepsilon-|x|}{h'}\Bigr) \le \begin{cases} p_h(x)\,T(\varepsilon/2h'), & |x|\le\varepsilon/2,\\ p_h(x), & |x|>\varepsilon/2.\end{cases} \]
Letting $\nu'_{h,\varepsilon}$ and $\nu''_{h,\varepsilon}$ denote the restrictions of $\nu_{h,\varepsilon}$ to $B^{\varepsilon/2}_0$ and $(B^{\varepsilon/2}_0)^c$, respectively, we conclude that $\nu'_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h$ has a density $\le T(\varepsilon/2h')\,p^{\otimes n}_h(x)$. Hence, for $0<h\le t$, Theorem 4.4 yields a density $\lesssim T(\varepsilon/2h')\,q^n_{t+h}(x)$ of $\nu^n_t * (\nu'_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h)$. Here $T(\varepsilon/2h')\to0$ as $h/\varepsilon^2\to0$, since $\rho<\infty$ a.s., and Lemma 5.2 gives $\sup_uq^n_{t+h}(x-u)\lesssim1$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts.

Next we note that $\nu''_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h$ has a density bounded by
\[ p^{\otimes n}_h(x)\,1\{|x_1|>\varepsilon/2\} \lesssim h^{-nd/2}e^{-\varepsilon^2/8h}\to0. \]
Since $\|\nu^n_t\|\lesssim1$ for bounded $t>0$ by Theorem 4.4, even $\nu^n_t * (\nu''_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h)$ has a density that tends to 0, uniformly for $x\in\mathbf{R}^{nd}$ and bounded $t>0$. Combining the results for $\nu'_{h,\varepsilon}$ and $\nu''_{h,\varepsilon}$, we conclude that $\nu^n_t * (\nu_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h)$ has a density $q^\varepsilon_{t,h}$ satisfying $\sup_uq^\varepsilon_{t,h}(x-u)\to0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. To deduce the stated result for general $\mu$, we may argue as in the previous proof, using dominated convergence based on the relations $\nu_{h,\varepsilon}\le\nu_h$ and $\nu^n_t * \nu^{\otimes n}_h\le\nu^n_{t+h}$, with associated density versions, valid by Theorems 4.2(iv) and 5.3. □

6. Palm continuity and approximation. Here we establish some continuity and approximation properties for the multivariate Palm distributions of a DW-process, needed in Section 9. We begin with a continuity property of the transition kernels, which might be known, though no reference could be found.

Lemma 6.1. Let $\xi$ be a DW-process in $\mathbf{R}^d$, and fix any $\mu\in\mathcal M_d$ and $B\in\mathcal B^d$, where either $\mu$ or $B$ is bounded. Then $\mathcal L_\mu(1_B\xi_t)$ is continuous in total variation in $t>0$.

Proof. First let $\|\mu\|<\infty$. For any $t>0$, the ancestors of $\xi_t$ at time 0 form a Poisson process $\zeta_0$ with intensity $t^{-1}\mu$. By Lemma 4.5 the ancestors splitting before time $s\in(0,t)$ form a Poisson process with intensity $st^{-2}\mu$, and so such a split occurs with probability $1-\exp(-st^{-2}\|\mu\|)\le st^{-2}\|\mu\|$. Hence, the process $\zeta_s$ of ancestors at time $s$ agrees, up to a set of probability $st^{-2}\|\mu\|$, with a Poisson process with intensity $t^{-1}\mu * p_s$. Replacing $s$ and $t$ by $s+h$ and $t+h$ for small $|h|$ and comparing the two Poisson approximations, we may estimate $\|\mathcal L_\mu(1_B\xi_{t+h})-\mathcal L_\mu(1_B\xi_t)\|$ in terms of $s$, $t$ and $h$.

Choosing $s=|h|^{1/2}$, we get convergence to 0 as $h\to0$, uniformly for $t\in(0,\infty)$ in compacts, which proves the continuity in $t$.

Now let $\mu$ be arbitrary, and assume instead that $B$ is bounded. Let $\mu_r$ and $\mu'_r$ denote the restrictions of $\mu$ to $B^r_0$ and $(B^r_0)^c$. Then $P_{\mu'_r}\{\xi_tB>0\}\lesssim(\mu'_r * \nu_t)B<\infty$, uniformly for $t\in(0,\infty)$ in compacts (cf. Lemma 7.2 below). As $r\to\infty$, we get $P_{\mu'_r}\{\xi_tB>0\}\to0$ by dominated convergence, in the same uniform sense. Finally, by the version for bounded $\mu$, $\mathcal L_{\mu_r}(1_B\xi_t)$ is continuous in total variation in $t>0$ for fixed $r>0$. □
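An aside, not part of the paper: the Poisson comparison above rests on the elementary coupling bound — two Poisson variables (or processes) with nearby intensities can be coupled to agree except with probability $1-e^{-|\lambda-\lambda'|}\le|\lambda-\lambda'|$ — which dominates their total variation distance. A numerical check for one-dimensional Poisson distributions:

```python
import math

def poisson_pmf(lam, kmax=200):
    pmf, term = [], math.exp(-lam)
    for k in range(kmax):
        pmf.append(term)
        term *= lam / (k + 1)
    return pmf

def tv(lam1, lam2):
    # total variation distance between Poisson(lam1) and Poisson(lam2)
    return 0.5 * sum(abs(a - b) for a, b in zip(poisson_pmf(lam1), poisson_pmf(lam2)))

pairs = [(1.0, 1.2), (5.0, 5.5), (0.3, 0.31)]
print([(round(tv(a, b), 4), abs(b - a)) for a, b in pairs])
```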

A similar argument, based on the estimate $\|\delta_x * p_s-p_s\|\lesssim|x|\,s^{-1/2}$, yields continuity in the same sense even under spatial shifts. We proceed with a uniform bound for the associated Palm distributions.

Lemma 6.2. For any $\mu$, $t>0$ and open $G\subset\mathbf{R}^d$, there exist some functions $p_h$ on $G^{(n)}$ with $p_h\to0$ as $h\to0$, uniformly for $(x,t)\in G^{(n)}\times(0,\infty)$ in compacts, such that, a.e. $E_\mu\xi_s^{\otimes n}$ on $G^{(n)}$ and for $r<s\le t$,
\[ \bigl\|\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_s^{\otimes n}]_x - E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_s^{\otimes n}]_x\bigr\| \le p_{t-r}. \]

Proof. The random measures $\xi_s$ and $\xi_t$ may be regarded as Cox cluster processes generated by the random measure $h^{-1}\xi_r$ and the probability kernel $\mathcal L_u(\eta_h)$ from $\mathbf{R}^d$ to $\mathcal M_d$, where $h=s-r$ or $h=t-r$, respectively. To keep track of the cluster structure, we introduce some marked versions $\tilde\xi_s$ or $\tilde\xi_t$ on $\mathbf{R}^d\times[0,1]$, where each cluster $\xi_i$ is replaced by $\tilde\xi_i=\xi_i\otimes\delta_{\sigma_i}$ for some i.i.d. $U(0,1)$ random variables $\sigma_i$ independent of the $\xi_i$. Note that $\tilde\xi_s$ and $\tilde\xi_t$ are again cluster processes with generating kernels $\tilde\nu_h=\mathcal L(\xi_h\otimes\delta_\sigma)$ from $\mathbf{R}^d$ to $\mathcal M(\mathbf{R}^d\times[0,1])$, where $\mathcal L(\xi_h,\sigma)=\nu_h\otimes\lambda$. By the transfer theorem (cf. [15], page 112), we may assume that $\tilde\xi_t(\cdot\times[0,1])=\xi_t$ a.s.

For any $v=(v_1,\dots,v_n)\in[0,1]^n$, we may write $\tilde\xi_t=\tilde\xi_{t,v}+\tilde\xi'_{t,v}$, where $\tilde\xi_{t,v}$ denotes the restriction of $\tilde\xi_t$ to $\mathbf{R}^d\times\{v_1,\dots,v_n\}^c$, which is product-measurable in $(\omega,v)$ by Lemma 3.3. Writing $D=([0,1]^{(n)})^c$, we get, for any $B\in\hat{\mathcal B}^{nd}$ and for measurable functions $f\colon(\mathbf{R}^d\times[0,1])^n\times\mathcal M(\mathbf{R}^d\times[0,1])\to[0,1]$ with $f_{x,v}=0$ for $x\in B^c$,
\[ \iint E_\mu\tilde\xi_s^{\otimes n}(dx\,dv)\,\bigl\|E_\mu[f_{x,v}(1_{G^c}\tilde\xi_t)\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}-E_\mu[f_{x,v}(1_{G^c}\tilde\xi_{t,v})\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}\bigr\| \]
\[ \le E_\mu\iint 1_B(x)\,\tilde\xi_s^{\otimes n}(dx\,dv)\,\bigl|f_{x,v}(1_{G^c}\tilde\xi_t)-f_{x,v}(1_{G^c}\tilde\xi_{t,v})\bigr| \le E_\mu\iint 1_B(x)\,\tilde\xi_s^{\otimes n}(dx\,dv)\,1\{\tilde\xi'_{t,v}G^c>0\} \]
\[ \le E_\mu\tilde\xi_s^{\otimes n}(B\times D) + E_\mu\iint_{B\times D^c}\tilde\xi_s^{\otimes n}(dx\,dv)\sum_{i\le n}1\{\tilde\xi'_{t,v_i}G^c>0\}. \]

To estimate the first term on the right, we define $\mathcal P'_n=\{\pi\in\mathcal P_n;\,|\pi|<n\}$ and write
\[ E_\mu\tilde\xi_s^{\otimes n}(\cdot\times D) = \sum_{\pi\in\mathcal P'_n}E_\mu\int\zeta^{(\pi)}_r(du)\bigotimes_{J\in\pi}\eta^{\otimes J}_{h,u_J} = \sum_{\pi\in\mathcal P'_n}E_\mu\xi_r^{\otimes\pi} * \bigotimes_{J\in\pi}\nu^J_h = \sum_{\pi\in\mathcal P'_n}\sum_{\kappa\prec\pi}\bigotimes_{I\in\kappa}\Bigl(\mu * \nu^{\pi_I}_r * \bigotimes_{J\in\pi_I}\nu^J_h\Bigr), \]
and similarly for the associated densities, where $\zeta^{(\pi)}_r$ denotes the factorial measure of $\zeta_r$ on $(\mathbf{R}^d)^{(\pi)}$. For each term on the right, we have $|\pi_I|<|I|$ for at least one $I\in\kappa$, and then Lemma 5.8 yields a corresponding density that tends to 0 as $h\to0$, uniformly for $(x,r)\in(\mathbf{R}^d)^{(I)}\times(0,\infty)$ in compacts. The remaining factors have locally bounded densities on $(\mathbf{R}^d)^{(I)}\times(0,\infty)$, for example by Lemma 5.6. Hence, by combination, $E_\mu\tilde\xi_s^{\otimes n}(\cdot\times D)$ has a density that tends to 0 as $h\to0$, uniformly for $(x,s)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts.

Turning to the second term on the right, let $B=\times_iB_i\subset G^{(n)}$ be compact, and write $B_J=\times_{i\in J}B_i$ for $J\subset\{1,\dots,n\}$. Using the previous notation and defining $\nu_{h,\varepsilon}$ as in Lemma 5.9, we get, for $\varepsilon>0$ small enough,
\[ E_\mu\iint_{B\times D^c}\tilde\xi_s^{\otimes n}(dx\,dv)\,1\{\tilde\xi'_{t,v_1}G^c>0\} = E_\mu\int\zeta^{(n)}_r(du)\int_{B_1}\eta^{u_1}_h(dx_1)\,1\{\eta^{u_1}_hG^c>0\}\prod_{i>1}\eta^{u_i}_hB_i \]
\[ \le E_\mu\xi_r^{\otimes n} * (\nu_{h,\varepsilon}\otimes\nu^{\otimes(n-1)}_h)B = \sum_{\pi\in\mathcal P_n}\bigl(\mu * \nu^{J_1}_r * (\nu_{h,\varepsilon}\otimes\nu^{\otimes J_1'}_h)\bigr)B_{J_1}\prod_{J\in\pi'}\bigl(\mu * \nu^J_r * \nu^{\otimes J}_h\bigr)B_J, \]
where $1\in J_1\in\pi$, $J_1'=J_1\setminus\{1\}$ and $\pi'=\pi\setminus\{J_1\}$. For each term on the right, Lemma 5.9 yields a density of the first factor that tends to 0 as $h\to0$ for fixed $\varepsilon>0$, uniformly for $(x_{J_1},t)\in(\mathbf{R}^d)^{J_1}\times(0,\infty)$ in compacts. Since the remaining factors have locally bounded densities on the sets $(\mathbf{R}^d)^J\times(0,\infty)$ by Lemma 5.6, the entire sum has a density that tends to 0 as $h\to0$ for fixed $\varepsilon>0$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. Combining the previous estimates and using Lemma 3.4, we obtain
\[ (16)\qquad \bigl\|\mathcal L_\mu[1_{G^c}\tilde\xi_t\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}-\mathcal L_\mu[1_{G^c}\tilde\xi_{t,v}\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}\bigr\|\le p_h\quad\text{a.e. }E_\mu\tilde\xi_s^{\otimes n}, \]
for some measurable functions $p_h$ on $(\mathbf{R}^d)^{(n)}$ with $p_h\to0$ as $h\to0$, uniformly on compacts.

We now apply the probabilistic form of Lemma 3.8 to the pair $(\tilde\xi_s,\tilde\xi_t)$, regarded as a Cox cluster process generated by $\xi_r$. Under $\mathcal L_\mu[\tilde\xi_t\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}$, the leading term agrees with the nondiagonal component $\tilde\xi_{t,v}$. To see this, we first condition on $\xi_r$, so that $\tilde\xi_t$ becomes a Poisson cluster process generated by a nonrandom measure at time $r$. By Fubini's theorem, the leading term is a.e. restricted to $\mathbf{R}^d\times\{v_1,\dots,v_n\}^c$, whereas by Lemma 3.10 the remaining terms are a.e. restricted to $\mathbf{R}^d\times\{v_1,\dots,v_n\}$. Hence, Lemma 3.8 yields a.e.
\[ \mathcal L_\mu[\tilde\xi_{t,v}\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}=E_\mu[\mathcal L_{\xi_r}(\tilde\xi_{t-r})\,\|\,\tilde\xi_s^{\otimes n}]_{x,v},\qquad x\in(\mathbf{R}^d)^{(n)},\ v\in[0,1]^n, \]
and so, by (16),
\[ \bigl\|\mathcal L_\mu[1_{G^c}\tilde\xi_t\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}-E_\mu[\mathcal L_{\xi_r}(1_{G^c}\tilde\xi_{t-r})\,\|\,\tilde\xi_s^{\otimes n}]_{x,v}\bigr\|\le p_h\quad\text{a.e. }E_\mu\tilde\xi_s^{\otimes n}. \]
The assertion now follows by Lemma 3.5. □

We may now establish some basic regularity properties for the Palm distributions of a DW-process. Here again, weak continuity is defined with respect to the vague topology.

Theorem 6.3. For a DW-process $\xi$ in $\mathbf{R}^d$, there exist versions of the Palm kernels $\mathcal L_\mu[\xi_t\,\|\,\xi_t^{\otimes n}]_x$ such that:

(i) $\mathcal L_\mu[\xi_t\,\|\,\xi_t^{\otimes n}]_x$ is tight and weakly continuous in $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$;
(ii) for any $t>0$ and open $G\subset\mathbf{R}^d$, $\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x$ is continuous in total variation in $x\in G^{(n)}$;
(iii) for any open $G$ and bounded $\mu$ or $G^c$, $\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x$ is continuous in total variation in $(x,t)\in G^{(n)}\times(0,\infty)$.

Proof. (ii) For the conditional moment measure $E_\mu[\xi_t^{\otimes n}\,|\,\xi_r]$ with $r\ge0$ fixed, Theorem 5.4 yields a Lebesgue density that is $L^1$-continuous in $(x,t)\in(\mathbf{R}^d)^{(n)}\times(r,\infty)$. Since the continuous density of $E_\mu\xi_t^{\otimes n}$ is even strictly positive by Theorem 5.3, the $L^1$-continuity extends to the density of $E_\mu[\xi_t^{\otimes n}\,|\,\xi_r]$ with respect to $E_\mu\xi_t^{\otimes n}$. Hence, by Lemma 3.7 the Palm kernels $\mathcal L_\mu[\xi_r\,\|\,\xi_t^{\otimes n}]_x$ have versions that are continuous in total variation in $(x,t)\in(\mathbf{R}^d)^{(n)}\times(r,\infty)$ for fixed $r\ge0$. Fixing any $t>r\ge0$ and $G\subset\mathbf{R}^d$, we see in particular that the kernel $E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_t^{\otimes n}]_x$ has a version that is continuous in total variation in $x\in(\mathbf{R}^d)^{(n)}$. Choosing arbitrary $r_1,r_2,\dots\in(0,t)$ with $r_k\to t$, using Lemma 6.2 with $r=r_k$ and $s=t$, and invoking Lemma 3.6, we obtain a similar continuity property for the kernel $\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x$.

(iii) Let $\mu$ or $G^c$ be bounded, and fix any $r\ge0$. As before, we may choose the kernels $\mathcal L_\mu[\xi_r\,\|\,\xi_t^{\otimes n}]_x$ to be continuous in total variation in $(x,t)\in(\mathbf{R}^d)^{(n)}\times(r,\infty)$. For any $x,x'\in(\mathbf{R}^d)^{(n)}$ and $t,t'>r$, write
\[ \bigl\|E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_t^{\otimes n}]_x - E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t'-r})\,\|\,\xi_{t'}^{\otimes n}]_{x'}\bigr\| \]
\[ \le E_\mu\bigl[\|\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})-\mathcal L_{\xi_r}(1_{G^c}\xi_{t'-r})\|\,\big\|\,\xi_t^{\otimes n}\bigr]_x + \bigl\|\mathcal L_\mu[\xi_r\,\|\,\xi_t^{\otimes n}]_x-\mathcal L_\mu[\xi_r\,\|\,\xi_{t'}^{\otimes n}]_{x'}\bigr\|. \]
As $x'\to x$ and $t'\to t$ for fixed $r$, the first term on the right tends to 0 in total variation by Lemma 6.1 and dominated convergence, whereas the second term tends to 0 in the same sense by the continuous choice of kernels. This shows that $E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_t^{\otimes n}]_x$ is continuous in total variation in $(x,t)\in(\mathbf{R}^d)^{(n)}\times(r,\infty)$ for fixed $r$, $\mu$ and $G$.

Now choose $h_1,h_2,\dots>0$ to be rationally independent¹ with $h_n\to0$, and define
\[ r_k(t) = h_k(\lfloor h_k^{-1}t\rfloor-1),\qquad t>0,\ k\in\mathbf{N}. \]
Then Lemma 6.2 applies with $r=r_k(t)$ and $s=t$, for some functions $p_k$ with $p_k\to0$, uniformly for $(x,t)\in G^{(n)}\times(0,\infty)$ in compacts. Since the sets $U_k=h_k\bigcup_j(j-1,j)$ satisfy $\limsup_kU_k=(0,\infty)$, Lemma 3.6 yields a version of the kernel $\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x$ that is continuous in total variation in $(x,t)\in G^{(n)}\times(0,\infty)$.

(i) Writing $U^r_x=\bigcup_iB^r_{x_i}$ and using Theorem 5.3 and Lemma 5.7, we get
\[ \frac{E_\mu\xi_t^{\otimes n}B^\varepsilon_x\,(\xi_tU^r_x\wedge1)}{E_\mu\xi_t^{\otimes n}B^\varepsilon_x} \lesssim r^d\sum_{i\le n}\bigl(E_\mu\xi_t^{\otimes(n+1)} * (p^{\otimes n}_{\varepsilon^2}\otimes p_{r^2})\bigr)(x,x_i) \lesssim \begin{cases} r^2, & d\ge3,\\ r^2|\log r|, & d=2,\end{cases} \]
uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. Now use part (iii), along with a uniform version of Lemma 3.12 for random measures $\xi_t$. □

The last result yields a similar continuity property for the forward Palm kernels $\mathcal L_\mu[\xi_t\,\|\,\xi_s^{\otimes n}]_x$ with $s<t$.

Corollary 6.4. For fixed $t>s>0$, $\mathcal L_\mu[\xi_t\,\|\,\xi_s^{\otimes n}]_x$ has a version that is continuous in total variation in $x\in(\mathbf{R}^d)^{(n)}$.

Proof. Let $\zeta_s$ denote the ancestral process of $\xi_t$ at time $s=t-h$. Since $\xi_t\perp\!\!\!\perp_{\zeta_s}\xi_s^{\otimes n}$, it suffices by Lemma 3.13 to prove the continuity in total variation of $\mathcal L_\mu[\zeta_s\,\|\,\xi_s^{\otimes n}]_x$. Since $\zeta_s$ is a Cox process directed by $h^{-1}\xi_s$, we see from [22] that $\mathcal L_\mu[\zeta_s\,\|\,\xi_s^{\otimes n}]_x$ is a.e. the distribution of a Cox process directed by $\mathcal L_\mu[h^{-1}\xi_s\,\|\,\xi_s^{\otimes n}]_x$. Hence, for any $G\subset\mathbf{R}^d$,
\[ \bigl\|\mathcal L_\mu[\zeta_s\,\|\,\xi_s^{\otimes n}]_x-\mathcal L_\mu[1_{G^c}\zeta_s\,\|\,\xi_s^{\otimes n}]_x\bigr\| \le P_\mu[\zeta_sG>0\,\|\,\xi_s^{\otimes n}]_x \]
\[ = E_\mu[1-e^{-h^{-1}\xi_sG}\,\|\,\xi_s^{\otimes n}]_x \le E_\mu[h^{-1}\xi_sG\wedge1\,\|\,\xi_s^{\otimes n}]_x. \]

Choosing versions of $\mathcal L_\mu[\xi_s\,\|\,\xi_s^{\otimes n}]_x$ as in Theorem 6.3 and using part (i) of that result, we conclude that $\mathcal L_\mu[\zeta_s\,\|\,\xi_s^{\otimes n}]_x$ can be approximated in total variation by kernels $\mathcal L_\mu[1_{G^c}\zeta_s\,\|\,\xi_s^{\otimes n}]_x$ with open $G\subset\mathbf{R}^d$, uniformly for $x\in G^{(n)}$ in compacts. It is then enough to choose the latter kernel to be continuous in total variation on $G^{(n)}$. Since $1_{G^c}\zeta_s\perp\!\!\!\perp_{1_{G^c}\xi_s}1_G\xi_s$, such a version exists by Lemma 3.13, given the corresponding property of $\mathcal L_\mu[1_{G^c}\xi_s\,\|\,\xi_s^{\otimes n}]_x$ from Theorem 6.3(ii). □

¹Meaning that no nontrivial linear combination with integral coefficients vanishes.

The following approximation plays a crucial role in Section 9. Here the Palm kernels are assumed to be continuous, in the sense of Theorem 6.3 and Corollary 6.4.

Lemma 6.5. Fix any $\mu$, $t>0$ and open $G\subset\mathbf{R}^d$. Then as $s\uparrow t$ and $u\to x\in G^{(n)}$, we have
\[ \bigl\|\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_s^{\otimes n}]_u-\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x\bigr\|\to0. \]

Proof. Letting $r<s<t$, we have, by the triangle inequality,
\[ (17)\qquad \bigl\|\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_s^{\otimes n}]_u-\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x\bigr\| \le \bigl\|\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_s^{\otimes n}]_u-E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_s^{\otimes n}]_u\bigr\| \]
\[ +\,\bigl\|\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x-E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_t^{\otimes n}]_x\bigr\| + \bigl\|E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_s^{\otimes n}]_u-E_\mu[\mathcal L_{\xi_r}(1_{G^c}\xi_{t-r})\,\|\,\xi_t^{\otimes n}]_x\bigr\|. \]
Lemma 6.2 shows that the first two terms on the right of (17) are bounded by some functions $p_{t-r}$, where $p_h\downarrow0$ as $h\to0$, uniformly for $(x,t)\in G^{(n)}\times(0,\infty)$ in compacts. Next, by Lemma 3.7 and Theorem 5.4, the last term in (17) tends to 0 as $s\to t$ for fixed $r$ and $t$, uniformly for $(x,t)\in(\mathbf{R}^d)^{(n)}\times(0,\infty)$ in compacts. Letting $s\to t$ and then $r\to t$, we conclude that the left-hand side of (17) tends to 0 as $s\uparrow t$, uniformly for $x\in G^{(n)}$ in compacts. Since $\mathcal L_\mu[1_{G^c}\xi_t\,\|\,\xi_t^{\otimes n}]_x$ is continuous in total variation in $x\in G^{(n)}$ by Theorem 6.3(ii), we obtain the required joint convergence as $s\uparrow t$ and $u\to x$. □

We conclude with a continuity property of the one-dimensional Palm distributions, quoted from Lemma 3.5 in [19] and its proof. Here $\mathcal L_\mu[\xi_t\,\|\,\xi_t]_x$ and $\mathcal L_\mu[\eta_t\,\|\,\eta_t]_x$ denote the continuous versions of the Palm distributions of $\xi_t$ and $\eta_t$, as constructed explicitly in [2, 4].

Lemma 6.6. Let $\xi$ be a DW-process in $\mathbf{R}^d$ with canonical cluster $\eta$. Then for fixed $t>0$ and $\mu$, the shifted Palm distributions $\mathcal L_\mu[\theta_{-x}\xi_t\,\|\,\xi_t]_x$ and $\mathcal L_\mu[\theta_{-x}\eta_t\,\|\,\eta_t]_x$ are continuous in $x\in\mathbf{R}^d$, in total variation on any compact set $B\subset\mathbf{R}^d$. When $\|\mu\|<\infty$ we may even take $B=\mathbf{R}^d$.

7. Hitting, multiplicities and decoupling. Here we derive some estimates of hitting probabilities needed in subsequent sections. We begin with the basic hitting estimates, quoted in the form of Lemma 4.2 in [19]. The statement also defines the function $t(\varepsilon)$ that occurs frequently below. Note that the definitions differ for $d\ge3$ and $d=2$.

Lemma 7.1. Let $\eta$ be the canonical cluster of a DW-process in $\mathbf{R}^d$. Then:

(i) for $d\ge3$, we have, with $t(\varepsilon)=t+\varepsilon^2$, uniformly in $\mu$, $t$ and $\varepsilon$ with $0<\varepsilon\le\sqrt t$,
\[ \mu p_t \lesssim \varepsilon^{2-d}P_\mu\{\eta_tB^\varepsilon_0>0\} \lesssim \mu p_{t(\varepsilon)}; \]
(ii) for $d=2$, we may choose $t(\varepsilon)=t\,l(\varepsilon/\sqrt t)$ with $0\le l_\varepsilon-1\lesssim|\log\varepsilon|^{-1/2}$, such that, uniformly for $\mu$, $t$ and $\varepsilon$ with $0<2\varepsilon<\sqrt t$,
\[ \mu p_t \lesssim \log(t/\varepsilon^2)\,P_\mu\{\eta_tB^\varepsilon_0>0\} \lesssim \mu p_{t(\varepsilon)}. \]

We proceed with a uniform limit theorem for hitting probabilities, quoted from Lemma 5.1 and Theorem 5.3 in [19]. Define
\[ c_d = \lim_{\varepsilon\to0}\varepsilon^{2-d}P_0\{\xi_tB^\varepsilon_0>0\}/p_t(0),\qquad d\ge3, \]
\[ m(\varepsilon) = |\log\varepsilon|\,P_{\lambda^{\otimes2}}\{\eta_1B^\varepsilon_0>0\},\qquad d=2, \]
where the constants $c_d$ exist by [3], and the function $\log m(\varepsilon)$ is bounded for $\varepsilon\ll1$ by Lemma 5.1 in [19].

Lemma 7.2. Let $\xi$ be a DW-process in $\mathbf{R}^d$ with canonical cluster $\eta$. Then as $\varepsilon\to0$ for fixed $t>0$, $\mu$ and bounded $B$:

(i) $\|\varepsilon^{2-d}P_\mu\{\xi_tB^\varepsilon_{(\cdot)}>0\} - c_d(\mu * p_t)\|_B\to0$, $d\ge3$;
(ii) $\||\log\varepsilon|\,P_\mu\{\xi_tB^\varepsilon_{(\cdot)}>0\} - m(\varepsilon)(\mu * p_t)\|_B\to0$, $d=2$,

and similarly with $\xi_t$ replaced by $\eta_t$. When $\mu$ is bounded, we may take $B=\mathbf{R}^d$.

We also quote from Lemma 4.4 in [19] some estimates for multiple hits, later to be extended in Lemma 7.5. Let $\kappa^\varepsilon_h$ denote the number of $h$-clusters of $\xi_t$ hitting $B^\varepsilon_0$ at time $t$.

Lemma 7.3. Let $\xi$ be a DW-process in $\mathbf{R}^d$. Then:

(i) for $d\ge3$, we have, with $t(\varepsilon)=t+\varepsilon^2$, as $\varepsilon^2\ll h\le t$,
\[ E_\mu\kappa^\varepsilon_h(\kappa^\varepsilon_h-1) \lesssim \varepsilon^{2(d-2)}\bigl\{h^{1-d/2}\mu p_t + (\mu p_{t(\varepsilon)})^2\bigr\}; \]
(ii) for $d=2$, there exists a function $t_{h,\varepsilon}>t$ with $0<\dots$

For a DW-process $\xi$ in $\mathbf{R}^d$ and for any $t>h>0$, let $\eta^1_h,\eta^2_h,\dots$ denote the $h$-clusters in $\xi_t$, and write $\zeta_s$ for the ancestral process of $\xi_t$ at time $s=t-h$. When $\zeta_s=\sum_i\delta_{u_i}$, we also write $\eta^{u_i}_h$ for the $h$-cluster rooted at $u_i$. Put $(\mathbf{N}^n)'=\mathbf{N}^n\setminus\mathbf{N}^{(n)}$. Define $h(\varepsilon)$ as in Lemma 7.1, but with $t$ replaced by $h$.

First we estimate the probability for a single $h$-cluster in $\xi_t$ to hit several $\varepsilon$-balls around $x_1,\dots,x_n\in\mathbf{R}^d$. We will refer repeatedly to the conditions
\[ (18)\qquad \varepsilon^2\ll h\le\varepsilon,\quad d\ge3;\qquad h\le|\log\varepsilon|^{-1}\ll|\log h|^{-1},\quad d=2. \]

Lemma 7.4. Let $\xi$ be a DW-process in $\mathbf{R}^d$, and fix any $\mu$, $t>0$ and $x\in(\mathbf{R}^d)^{(n)}$. Then as $\varepsilon,h\to0$, subject to (18),
\[ P_\mu\bigcup_{k\in(\mathbf{N}^n)'}\bigcap_{j\le n}\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\} \ll \begin{cases}\varepsilon^{n(d-2)}, & d\ge3,\\ |\log\varepsilon|^{-n}, & d=2.\end{cases} \]

Proof. We need to show that, for any $i\ne j$ in $\{1,\dots,n\}$,
\[ P_\mu\bigcup_{k\in\mathbf{N}}\{\eta^k_hB^\varepsilon_{x_i}\wedge\eta^k_hB^\varepsilon_{x_j}>0\} \ll \begin{cases}\varepsilon^{n(d-2)}, & d\ge3,\\ |\log\varepsilon|^{-n}, & d=2.\end{cases} \]
Writing $\bar x=\frac12(x_i+x_j)$ and $\Delta x=|x_i-x_j|$, and using Cauchy's inequality, Lemmas 4.1 and 7.1(i) and the parallelogram identity, we get for $d\ge3$
\[ P_\mu\bigcup_{k\in\mathbf{N}}\{\eta^k_hB^\varepsilon_{x_i}\wedge\eta^k_hB^\varepsilon_{x_j}>0\} \le E_\mu\sum_{k\in\mathbf{N}}1\{\eta^k_hB^\varepsilon_{x_i}\wedge\eta^k_hB^\varepsilon_{x_j}>0\} = E_\mu\int\zeta_s(du)\,1\{\eta^u_hB^\varepsilon_{x_i}\wedge\eta^u_hB^\varepsilon_{x_j}>0\} \]
\[ = \int E_\mu\xi_s(du)\,P_u\{\eta_hB^\varepsilon_{x_i}\wedge\eta_hB^\varepsilon_{x_j}>0\} \le \int(\mu * p_s)_u\,du\,\bigl(P_u\{\eta_hB^\varepsilon_{x_i}>0\}\,P_u\{\eta_hB^\varepsilon_{x_j}>0\}\bigr)^{1/2} \]
\[ \lesssim \varepsilon^{d-2}\int(\mu * p_s)_u\,du\,\bigl(p_{h(\varepsilon)}(x_i-u)\,p_{h(\varepsilon)}(x_j-u)\bigr)^{1/2} = \varepsilon^{d-2}\int(\mu * p_s)_u\,p_{h(\varepsilon)}(\bar x-u)\,du\;e^{-|\Delta x|^2/8h'} \]
\[ \lesssim \varepsilon^{d-2}(\mu * p_{2t})(\bar x)\,e^{-|\Delta x|^2/8h'}, \]
which tends to 0 faster than any power of $\varepsilon$. If instead $d=2$, we get, from Lemma 7.1(ii), the bound
\[ \bigl(\log(h/\varepsilon^2)\bigr)^{-1}(\mu * p_{t(\varepsilon)})(\bar x)\,e^{-|\Delta x|^2/8h'} \lesssim |\log\varepsilon|^{-1}(\mu * p_{2t})(\bar x)\,\varepsilon^{|\Delta x|^2/16}, \]
which tends to 0 faster than any power of $|\log\varepsilon|^{-1}$. □

We turn to the possibility for a single ball $B^\varepsilon_{x_j}$ to be hit by several $h$-clusters of $\xi_t$, thus providing a multivariate version of Lemma 7.3:

Lemma 7.5. Let $\xi$ be a DW-process in $\mathbf{R}^d$, and fix any $\mu$, $t>0$ and $x\in(\mathbf{R}^d)^{(n)}$. Then as $\varepsilon,h\to0$, subject to (18),
\[ E_\mu\Bigl(\sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ \ll \begin{cases}\varepsilon^{n(d-2)}, & d\ge3,\\ |\log\varepsilon|^{-n}, & d=2.\end{cases} \]
To compare with Lemma 7.3, note that the estimates for $n=1$ reduce to
\[ E_\mu(\kappa^\varepsilon_h-1)_+ \ll \begin{cases}\varepsilon^{d-2}, & d\ge3,\\ |\log\varepsilon|^{-1}, & d=2.\end{cases} \]

Proof. On the sets
\[ A_{h,\varepsilon} = \bigcap_{j\le n}\Bigl\{\sum_{k\in\mathbf{N}}1\{\eta^k_hB^\varepsilon_{x_j}>0\}\le n\Bigr\},\qquad h,\varepsilon>0, \]
we have
\[ \Bigl(\sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ \le \sum_{k\in\mathbf{N}^n}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\} = \prod_{j\le n}\sum_{k\in\mathbf{N}}1\{\eta^k_hB^\varepsilon_{x_j}>0\} \le n^n. \]
On $A^c_{h,\varepsilon}$ we note that $\bigcap_{j\le n}\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}$ implies $\eta^l_hB^\varepsilon_{x_i}>0$ for some $i\le n$ and $l\ne k_1,\dots,k_n$, and so
\[ \Bigl(\sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ \le \sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\} \]
\[ (19)\qquad \le \sum_{i\le n}\sum_{(k,l)\in\mathbf{N}^{(n+1)}}1\{\eta^l_hB^\varepsilon_{x_i}>0\}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\} = \sum_{i\le n}\iint\zeta^{(n+1)}_s(du\,dv)\,1\{\eta^v_hB^\varepsilon_{x_i}>0\}\prod_{j\le n}1\{\eta^{u_j}_hB^\varepsilon_{x_j}>0\}, \]
where $u=(u_1,\dots,u_n)\in\mathbf{R}^{nd}$ and $v\in\mathbf{R}^d$. Finally, writing $U_{h,\varepsilon}$ for the union in Lemma 7.4 and using Lemma 2.11, we get on $U^c_{h,\varepsilon}$
\[ \Bigl(\sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ = \Bigl(\prod_{j\le n}\sum_{k\in\mathbf{N}}1\{\eta^k_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ \]
\[ \le \sum_{i\le n}\Bigl(\sum_{l\in\mathbf{N}}1\{\eta^l_hB^\varepsilon_{x_i}>0\}-1\Bigr)\prod_{j\le n}\sum_{k\in\mathbf{N}}1\{\eta^k_hB^\varepsilon_{x_j}>0\} = \sum_{i\le n}\sum_{(k,l)\in\mathbf{N}^{(n+1)}}1\{\eta^l_hB^\varepsilon_{x_i}>0\}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}, \]
which agrees with the bound in (19). Now let $q^m_{\mu,s}$ denote the continuous density of $E_\mu\xi_s^{\otimes m}$ in Theorem 5.3. Since $\Omega=(U_{h,\varepsilon}\cap A_{h,\varepsilon})\cup U^c_{h,\varepsilon}\cup A^c_{h,\varepsilon}$, we may combine the previous estimates and use Lemmas 5.6 and 7.1(i) to get, for $d\ge3$,
\[ E_\mu\Bigl(\sum_{k\in\mathbf{N}^{(n)}}\prod_{j\le n}1\{\eta^{k_j}_hB^\varepsilon_{x_j}>0\}-1\Bigr)_+ - n^nP_\mu U_{h,\varepsilon} \le E_\mu\sum_{i\le n}\iint\zeta^{(n+1)}_s(du\,dv)\,1\{\eta^v_hB^\varepsilon_{x_i}>0\}\prod_{j\le n}1\{\eta^{u_j}_hB^\varepsilon_{x_j}>0\} \]
\[ = \sum_{i\le n}\iint E_\mu\xi_s^{\otimes(n+1)}(du\,dv)\,P_v\{\eta_hB^\varepsilon_{x_i}>0\}\prod_{j\le n}P_{u_j}\{\eta_hB^\varepsilon_{x_j}>0\} \lesssim \varepsilon^{(n+1)(d-2)}\sum_{i\le n}\iint q^{n+1}_{\mu,s}(u,v)\,du\,dv\;p_{h(\varepsilon)}(x_i-v)\prod_{j\le n}p_{h(\varepsilon)}(x_j-u_j) \]
\[ = \varepsilon^{(n+1)(d-2)}\sum_{i\le n}\bigl(q^{n+1}_{\mu,s} * p^{\otimes(n+1)}_{h(\varepsilon)}\bigr)(x,x_i) \lesssim \varepsilon^{(n+1)(d-2)}h^{1-d/2} \ll \varepsilon^{n(d-2)}. \]
The term $n^nP_\mu U_{h,\varepsilon}$ on the left is of the required order by Lemma 7.4. For $d=2$, a similar argument, based on Lemma 7.1(ii), yields the bound $|\log\varepsilon|^{-n-1}|\log h|\ll|\log\varepsilon|^{-n}$. □

We may combine the last two lemmas into a useful approximation:

Corollary 7.6. Fix any $t$, $x$ and $\mu$, let $\xi$ be a DW-process in $\mathbf{R}^d$ with $h$-clusters $\eta^k_h$ at time $t$, and let $\gamma$ be a random element in a space $T$. Put $B=(B^1_0)^n$. Then as $\varepsilon,h\to0$ subject to (18), and for $d\ge3$ or $d=2$, respectively,
\[ \Bigl\|\mathcal L_\mu\bigl((\xi_tS^\varepsilon_{x_j})_{j\le n},\gamma\bigr) - E_\mu\int\zeta^{(n)}_s(du)\,1\bigl\{\bigl((\eta^{u_j}_hS^\varepsilon_{x_j})_{j\le n},\gamma\bigr)\in\cdot\bigr\}\Bigr\|_B \ll \begin{cases}\varepsilon^{n(d-2)},\\ |\log\varepsilon|^{-n}.\end{cases} \]

Proof. It is enough to establish the corresponding bounds for
\[ E_\mu\Bigl|f\bigl((\xi_tS^\varepsilon_{x_j})_{j\le n},\gamma\bigr) - \int\zeta^{(n)}_s(du)\,f\bigl((\eta^{u_j}_hS^\varepsilon_{x_j})_{j\le n},\gamma\bigr)\Bigr|, \]
uniformly for measurable functions $f$ on $\mathcal M^n_d\times T$ with $0\le f\le1_{H_B}$. Writing $\Delta^\varepsilon_h$ for the absolute value on the left, $U_{h,\varepsilon}$ for the union in Lemma 7.4 and $\kappa^\varepsilon_h$ for the sum in Lemma 7.5, we note that
\[ \Delta^\varepsilon_h \le \kappa^\varepsilon_h\,1\{\kappa^\varepsilon_h>1\} + 1\{\kappa^\varepsilon_h\le1;\ U_{h,\varepsilon}\}. \]
Since $k\,1\{k>1\}\le2(k-1)_+$ for any $k\in\mathbf{Z}_+$, we get
\[ E_\mu\Delta^\varepsilon_h \le 2E_\mu(\kappa^\varepsilon_h-1)_+ + P_\mu U_{h,\varepsilon}, \]
which is of the required order by Lemmas 7.4 and 7.5. □
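The combinatorial inequality in the last step, $k\,1\{k>1\}\le2(k-1)_+$ for $k\in\mathbf{Z}_+$, is easy to verify directly (a trivial aside, not from the paper):

```python
# check k*1{k > 1} <= 2*(k - 1)+ over a range of nonnegative integers
ks = list(range(0, 50))
lhs = [k if k > 1 else 0 for k in ks]
rhs = [2 * max(k - 1, 0) for k in ks]
print(all(l <= r for l, r in zip(lhs, rhs)))  # True; equality at k = 0, 1, 2
```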

We proceed to estimate the contribution of $\xi_t$ to the balls $B^\varepsilon_{x_i}$ from distantly rooted $h$-clusters. Define $B^r_x=\times_{j\le n}B^r_{x_j}$ for $x=(x_j)\in(\mathbf{R}^d)^n$.

Lemma 7.7. Fix any $t,r>0$, $x\in(\mathbf{R}^d)^{(n)}$ and $\mu$, and put $B=\times_jB^r_{x_j}$. Then as $\varepsilon,h\to0$, with $\varepsilon^2\le h$ for $d\ge3$ and $\varepsilon\le h$ for $d=2$,
\[ E_\mu\int_{B^c_x}\zeta^{(n)}_s(du)\prod_{j\le n}1\{\eta^{u_j}_hB^\varepsilon_{x_j}>0\} \ll \begin{cases}\varepsilon^{n(d-2)}, & d\ge3,\\ |\log\varepsilon|^{-n}, & d=2.\end{cases} \]

Proof. Let $q^n_{\mu,s}$ denote the jointly continuous density of $E_\mu\xi_s^{\otimes n}$ from Theorem 5.3. For $d\ge3$ we may use the conditional independence and Lemma 7.1(i) to write the ratio between the two sides as
\[ \varepsilon^{n(2-d)}E_\mu\int_{B^c_x}\xi_s^{\otimes n}(du)\prod_{j\le n}P_{u_j}\{\eta_hB^\varepsilon_{x_j}>0\} \lesssim \int_{B^c_x}q^n_{\mu,s}(u)\,p^{\otimes n}_{h_\varepsilon}(x-u)\,du = (q^n_{\mu,s} * p^{\otimes n}_{h_\varepsilon})(x) - \int_{B_0}q^n_{\mu,s}(x-u)\,p^{\otimes n}_{h_\varepsilon}(u)\,du, \]
where $h_\varepsilon=h+\varepsilon^2$. Here the first term on the right tends to $q^n_{\mu,t}(x)$ as $h_\varepsilon\to0$ by Lemma 5.6, and the same limit is obtained for the second term by the joint continuity in Theorem 5.3 and elementary estimates. Hence, the difference tends to 0. For $d=2$, the same argument yields a similar bound with $\varepsilon^{n(d-2)}$ replaced by $|\log(\varepsilon^2/h)|^{-n}$. Since $0<\varepsilon\le h\to0$, we have $|\log(\varepsilon^2/h)|\ge|\log\varepsilon|$, and the assertion follows. □

Next we estimate the probability that a closed set $G^c\subset\mathbf{R}^d$ is hit by some $h$-cluster in $\xi_t$ rooted within a compact subset $B\subset G$. For our present purposes, it suffices to have a bound that tends to 0 as $h\to0$ faster than any power of $h$.

Lemma 7.8. Let $t=s+h$ with $0<h\le s$, let $G\subset\mathbf{R}^d$ be open with a compact subset $B$, and let $r$ denote the minimum distance between $B$ and $G^c$. Write $\zeta_s$ for the ancestral process of $\xi_t$ at time $s$. Then
\[ P_\mu\Bigl\{\int_B\zeta_s(du)\,\eta^u_hG^c>0\Bigr\} \lesssim r^dh^{-1-d/2}e^{-r^2/2h}\,(\mu * \nu_t)B. \]

Proof. Letting $h'=h^{1/2}<r/2$, we get by Theorem 3.3(b) in [3]
\[ P_\mu\Bigl\{\int_B\zeta_s(du)\,\eta^u_hG^c>0\Bigr\} \le P_\mu\Bigl\{\int_B\zeta_s(du)\,\eta^u_h(B^r_u)^c>0\Bigr\} = E_\mu P_{(\xi_sB)\delta_0}\{\xi_h(B^r_0)^c>0\} \]
\[ \lesssim E_\mu\xi_sB\;r^{-2}(r/h')^{d+2}e^{-r^2/2h} \lesssim (\mu * \nu_t)B\;r^dh^{-1-d/2}e^{-r^2/2h}. \]
Here the first step holds by the definition of $r$, the second step holds by conditional independence and shift invariance, and the last step holds since $E_\mu\xi_s=\mu * \nu_s$ and $p_s\lesssim p_t$ when $s\in[t/2,t]$. □

The last two lemmas yield a useful decoupling approximation:

Corollary 7.9. For a DW-process $\xi$ in $\mathbf{R}^d$ and times $t$ and $s=t-h$, choose $\tilde\xi_t$ with $\tilde\xi_t\perp\!\!\!\perp_{\xi_s}\xi_t$ and $(\xi_t,\xi_s)\overset{d}{=}(\tilde\xi_t,\xi_s)$. Let $G\subset\mathbf{R}^d$ be open, put $B=(B^1_0)^n\times G^c$, and fix any $x\in G^{(n)}$. Then as $\varepsilon,h\to0$ subject to (18),
\[ \bigl\|\mathcal L_\mu\bigl((\xi_tS^\varepsilon_{x_j})_{j\le n},\xi_t\bigr) - \mathcal L_\mu\bigl((\xi_tS^\varepsilon_{x_j})_{j\le n},\tilde\xi_t\bigr)\bigr\|_B \ll \begin{cases}\varepsilon^{n(d-2)}, & d\ge3,\\ |\log\varepsilon|^{-n}, & d=2.\end{cases} \]

Proof. Letting $U^r_x=\bigcup_jB^r_{x_j}$, fix any $r>0$ with $U^{2r}_x\subset G$, and write $\xi_t=\xi'_t+\xi''_t$ and $\tilde\xi_t=\tilde\xi'_t+\tilde\xi''_t$, where $\xi'_t$ is the sum of clusters in $\xi_t$ rooted in $U^r_x$, and similarly for $\tilde\xi'_t$. Putting $D_n=\mathbf{R}^{nd}\setminus(\mathbf{R}^d)^{(n)}$ and letting $\zeta_s$ be the ancestral process of $\xi_t$ at time $s$, we get
\[ \bigl\|\mathcal L_\mu\bigl((\xi_tS^\varepsilon_{x_j})_{j\le n},\xi_t\bigr)-\mathcal L_\mu\bigl((\xi'_tS^\varepsilon_{x_j})_{j\le n},\xi''_t\bigr)\bigr\|_B \le P_\mu\bigcup_{j\le n}\{\xi_tB^\varepsilon_{x_j}>\xi'_tB^\varepsilon_{x_j}\} + P_\mu\{\xi'_tG^c>0\} \]
\[ \le P_\mu\Bigl\{\int_{D_n}\zeta_s^{\otimes n}(du)\prod_{j\le n}1\{\eta^{u_j}_hB^\varepsilon_{x_j}>0\}>0\Bigr\} + E_\mu\int_{(B^r_x)^c}\zeta^{(n)}_s(du)\prod_{j\le n}1\{\eta^{u_j}_hB^\varepsilon_{x_j}>0\} \]
\[ +\,P_\mu\Bigl\{\int_{U^r_x}\zeta_s(du)\,\eta^u_hG^c>0\Bigr\} \ll \begin{cases}\varepsilon^{n(d-2)},\\ |\log\varepsilon|^{-n},\end{cases} \]
by Lemmas 7.4, 7.7 and 7.8. Since the last estimate only depends on the marginal distributions of the pairs $((\xi_tS^\varepsilon_{x_j})_{j\le n},\xi_t)$ and $((\xi'_tS^\varepsilon_{x_j})_{j\le n},\xi''_t)$, the same bound applies to
\[ \bigl\|\mathcal L_\mu\bigl((\tilde\xi_tS^\varepsilon_{x_j})_{j\le n},\tilde\xi_t\bigr)-\mathcal L_\mu\bigl((\tilde\xi'_tS^\varepsilon_{x_j})_{j\le n},\tilde\xi''_t\bigr)\bigr\|_B. \]
It remains to note that $(\xi'_t,\xi''_t)\overset{d}{=}(\tilde\xi'_t,\tilde\xi''_t)$. □

8. Scaling limits and local approximation. Here we study the pseudo-random measures $\tilde\xi$ and $\tilde\eta$, which provide local approximations of the DW-process $\xi$ in $\mathbf{R}^d$ and its canonical cluster $\eta$. We begin with some scaling properties of $\xi$ and $\eta$. Given a suitable measure-valued process $\eta$ in $\mathbf{R}^d$, we define the span of $\eta$ from 0 as the random variable
\[ \rho = \inf\bigl\{r>0;\ \sup\nolimits_t\eta_t(B^r_0)^c=0\bigr\}. \]
Recall that a DW-cluster has a.s. finite span. For any process $X$ on $\mathbf{R}_+$, we define the process $\hat X_h$ on $\mathbf{R}_+$ by $(\hat X_h)_t=X_{ht}$. Let $\mathcal L_x[\eta\,\|\,\eta_h]_y$ denote the continuous versions of the Palm distributions described in [2, 4].

Lemma 8.1. Let $\xi$ be a DW-process in $\mathbf{R}^d$ with canonical cluster $\eta$, and let $\rho$ denote the span of $\eta$ from 0. Then for any $\mu$, $x\in\mathbf{R}^d$ and $r,c>0$:

(i) $\mathcal L_{r^{-2}\mu S_r}(\xi_1) = \mathcal L_\mu(r^{-2}\xi_{r^2}S_r)$;
(ii) $\mathcal L_{r^{-2}\mu S_r}(\eta) = \mathcal L_\mu(r^{-2}\hat\eta_{r^2}S_r)$;
(iii) $\mathcal L_0[r^2\eta\,\|\,\eta_1]_x = \mathcal L_0[\hat\eta_{r^2}S_r\,\|\,\eta_{r^2}]_{rx}$;
(iv) $P_0[\rho>c\,\|\,\eta_{r^2}]_x \le P_0[r\rho+|x|>c\,\|\,\eta_1]_0$.

Though (i) and (ii) are probably known, they are included here with short proofs for easy reference.

Proof. (i) If v solves the evolution equation for ξ, then so doesv ˜(t,x)= 2 2 ˜ 2 2 ˜ 2 r v(r t,rx). Writing ξt = r− ξr2tSr,µ ˜ = r− µSr, and f(x)= r f(rx), we get

ξ˜tf˜ ξ 2 f µv 2 µ˜v˜t ξtf˜ Eµe− = Eµe− r t = e− r t = e− = Eµ˜e− , and so (ξ˜)= (ξ), which is equivalent to (i); cf. [7], page 51. Lµ Lµ˜ d (ii) Define the cluster kernel ν by νx = x(η), x R , and consider the cluster decomposition ξ = mζ(dm), whereLζ is a Poisson∈ process with in- R 48 O. KALLENBERG tensity µν when ξ0 = µ. Here 2 2 r− ξ 2 Sr = (r− m 2 Sr)ζ(dm), r,t> 0. r t Z r t 2 Using (i) and the uniqueness of the L´evy measure, we obtain (r− µSr)ν = 2 µ(ν r− mˆ r2 Sr ), which is equivalent to { ∈ ·} 2 2 r− (η)= −2 (η)= (r− ηˆ 2 S ). LµSr Lr µSr Lµ r r (iii) By Palm disintegration, we get, from (ii),

$$E_0\int \eta_1(dx)\,E_0[f(x, r^2\eta) \,\|\, \eta_1]_x = E_0\int \eta_1(dx)\,f(x, r^2\eta) = r^2 E_0\int \eta_{r^2}(dx)\,f(x, \hat\eta_{r^2} S_r) = r^2 E_0\int \eta_{r^2}(dx)\,E_0[f(x, \hat\eta_{r^2} S_r) \,\|\, \eta_{r^2}]_x,$$
and (iii) follows by the continuity of the Palm kernel.

(iv) By (iii) we have $\mathcal L_0[\rho \,\|\, \eta_{r^2}]_x = \mathcal L_0[r\rho \,\|\, \eta_1]_x$, and (iv) follows for $x = 0$. For general $x$ it is enough to take $r = 1$. Then recall from Corollary 4.1.6 in [4] or Theorem 11.7.1 in [2] that, under $\mathcal L_0[\eta \,\|\, \eta_1]_x$, the cluster $\eta$ is a countable sum of conditionally independent subclusters, rooted along the path of a Brownian bridge on $[0,1]$ from 0 to $x$. In particular, the evolution of the process after time 1 is the same as under the original distribution $\mathcal L(\eta)$ (cf. Lemma 3.13), hence independent of $x$. We may now construct a cluster with distribution $\mathcal L_0[\eta \,\|\, \eta_1]_x$ from one with distribution $\mathcal L_0[\eta \,\|\, \eta_1]_0$, simply by shifting every subcluster born at time $s \le 1$ by an amount $(1-s)x$. Since all mass of $\eta$ is then shifted by at most $|x|$, the span from 0 of the entire cluster is increased by at most $|x|$, and the assertion follows. $\Box$

We may now summarize the basic properties of the pseudo-random measure $\tilde\xi$, introduced in Section 8 of [19]. Here and below, we write
$$\mathcal L^0_\mu(\xi_t) = \mathcal L_\mu[\xi_t \,\|\, \xi_t]_0, \qquad \mathcal L^0_\mu(\eta_t) = \mathcal L_\mu[\eta_t \,\|\, \eta_t]_0,$$
where $\mathcal L_\mu[\xi_t \,\|\, \xi_t]_x$ and $\mathcal L_\mu[\eta_t \,\|\, \eta_t]_x$ denote the continuous versions of the Palm distributions constructed in [2, 4]. Since the pseudo-random measures $\tilde\xi$ and $\tilde\eta$ below are stationary by definition, we may further take $\mathcal L^0(\tilde\xi)$ and $\mathcal L^0(\tilde\eta)$ to be the unique invariant versions of the associated Palm distributions $\mathcal L[\theta_{-x}\tilde\xi \,\|\, \tilde\xi]_x$ and $\mathcal L[\theta_{-x}\tilde\eta \,\|\, \tilde\eta]_x$, respectively.

Theorem 8.2. Let $\xi$ be a DW-process in $\mathbb R^d$ with $d \ge 3$. Then there exists a pseudo-random measure $\tilde\xi$ on $\mathbb R^d$, such that:

(i) as $\varepsilon \to 0$ for fixed $\mu$ and $t > 0$,
$$\|\mathcal L_\mu(\xi_t) - \mu p_t\,\mathcal L(\tilde\xi)\|_{B_0^\varepsilon} \ll \varepsilon^{d-2}, \qquad \|\mathcal L^0_\mu(\xi_t) - \mathcal L^0(\tilde\xi)\|_{B_0^\varepsilon} \to 0,$$
and similarly with $\xi_t$ replaced by $\eta_t$;

(ii) for any $r > 0$,
$$\mathcal L(\tilde\xi S_r) = r^{d-2}\,\mathcal L(r^2\tilde\xi), \qquad \mathcal L^0(\tilde\xi S_r) = \mathcal L^0(r^2\tilde\xi);$$
(iii) $\tilde\xi$ is stationary with $E\tilde\xi = \lambda^{\otimes d}$;
(iv) $\mathcal L(\tilde\xi)$ is an invariant measure for $\xi$;
(v) as $r \to \infty$, we have in total variation on $\mathcal H_B$ for bounded $B$,
$$r^{d-2}\,\mathcal L_{r^{2-d}\lambda^{\otimes d}}(\xi_{r^2}) \to \mathcal L(\tilde\xi).$$

Proof. (i)-(ii) In Theorems 8.1-8.2 of [19] we proved the existence of a stationary pseudo-random measure $\tilde\xi$ on $\mathbb R^d$ satisfying (ii), and such that as $\varepsilon \to 0$ for bounded $B \in \mathcal B^d$,
(20)
$$\|\varepsilon^{2-d}\,\mathcal L_\mu(\varepsilon^{-2}\xi_t S_\varepsilon) - \mu p_t\,\mathcal L(\tilde\xi)\|_B \to 0, \qquad \|\mathcal L^0_\mu(\varepsilon^{-2}\xi_t S_\varepsilon) - \mathcal L^0(\tilde\xi)\|_B \to 0,$$
and similarly with $\xi_t$ replaced by $\eta_t$. Under (ii), the latter properties are equivalent to (i).

(iii) In our proof of Theorem 8.2 in [19] (display (20) in [19], page 2210) we showed that for any $B \in \hat{\mathcal B}^d$,
$$\|\varepsilon^{2-d}\,E_\mu[\varepsilon^{-2}\xi_t B_0^1;\ \varepsilon^{-2}\xi_t S_\varepsilon \in \cdot\,] - \mu p_t\,E[\tilde\xi B_0^1;\ \tilde\xi \in \cdot\,]\|_B \to 0.$$
Taking $B = B_0^1$ and $\mu = \lambda^{\otimes d}$, we get, in particular, $\varepsilon^{-d} E_{\lambda^{\otimes d}}\xi_t S_\varepsilon B_0^1 \to E\tilde\xi B_0^1$, which extends by stationarity to arbitrary $B$. Hence,
$$\lambda^{\otimes d} = \varepsilon^{-d}\lambda^{\otimes d} S_\varepsilon = \varepsilon^{-d} E_{\lambda^{\otimes d}}(\xi_t S_\varepsilon) \to E\tilde\xi,$$
and so $E\tilde\xi = \lambda^{\otimes d}$.

(iv) Let $(\tilde\xi_t)$ denote the DW-process $\xi$ with initial measure $\tilde\xi$. Using (ii) and Lemma 8.1(i), we get for any $r > 0$,
$$\mathcal L(\tilde\xi_{r^2} S_r) = E\,\mathcal L_{\tilde\xi}(\xi_{r^2} S_r) = E\,\mathcal L_{r^{-2}\tilde\xi S_r}(r^2\xi_1) = r^{d-2}\,E\,\mathcal L_{\tilde\xi}(r^2\xi_1) = r^{d-2}\,\mathcal L(r^2\tilde\xi_1) = \mathcal L(\tilde\xi S_r),$$
which implies $\tilde\xi_{r^2} S_r \stackrel{d}{=} \tilde\xi S_r$. Hence, $\tilde\xi_t \stackrel{d}{=} \tilde\xi$ for all $t \ge 0$.

(v) Using Lemma 8.1(i) and (20) above and noting that $\lambda^{\otimes d} S_r = r^d\lambda^{\otimes d}$, we get, as $r \to \infty$,
$$r^{d-2}\,\mathcal L_{r^{2-d}\lambda^{\otimes d}}(\xi_{r^2}) = r^{d-2}\,\mathcal L_{\lambda^{\otimes d}}(r^2\xi_1 S_{1/r}) \to \mathcal L(\tilde\xi). \qquad\Box$$

For $d = 2$, there is no random measure $\tilde\xi$ with the stated properties. However, a similar role is then played by the stationary cluster $\tilde\eta_t$ with pseudo-distribution $\mathcal L(\tilde\eta_t) = \mathcal L_{\lambda^{\otimes d}}(\eta_t)$. Writing $\tilde\eta = \tilde\eta_1$, we have the following approximation and scaling properties:
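As a sanity check (not part of the proof), the scaling law in (ii) is consistent with the intensity identity in (iii): taking intensities on both sides of (ii) and using $\lambda^{\otimes d} S_r = r^d \lambda^{\otimes d}$,

```latex
\[
  E(\tilde\xi S_r) = r^{d-2}\, E(r^2\tilde\xi) = r^d\, E\tilde\xi,
\]
% while directly, for E\tilde\xi = \lambda^{\otimes d},
\[
  E(\tilde\xi S_r) = (E\tilde\xi) S_r
    = \lambda^{\otimes d} S_r = r^d \lambda^{\otimes d} = r^d\, E\tilde\xi,
\]
```

so the two computations agree precisely because the intensity is a multiple of Lebesgue measure.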

Theorem 8.3. Let $\xi$ be a DW-process in $\mathbb R^d$ with canonical cluster $\eta$. Then as $\varepsilon \to 0$ for fixed $\mu$ and $t > 0$:

(i) $\displaystyle \|\mathcal L_\mu(\xi_t) - \mu p_t\,\mathcal L(\tilde\eta)\|_{B_0^\varepsilon} \ll \begin{cases} \varepsilon^{d-2}, & d \ge 3, \\ |\log\varepsilon|^{-1}, & d = 2, \end{cases}$

(ii) $\|\mathcal L^0_\mu(\xi_t) - \mathcal L^0(\tilde\eta)\|_{B_0^\varepsilon} \to 0$, $d \ge 2$,

and similarly with $\xi_t$ replaced by $\eta_t$. Furthermore,

(iii) for any $r > 0$,
$$\mathcal L(\tilde\eta_{r^2} S_r) = r^{d-2}\,\mathcal L(r^2\tilde\eta), \qquad \mathcal L^0(\tilde\eta_{r^2} S_r) = \mathcal L^0(r^2\tilde\eta),$$

(iv) for $d \ge 3$ and as $t \to \infty$, in total variation on $\mathcal H_B$ for bounded $B$,
$$\mathcal L(\tilde\eta_t) \to \mathcal L(\tilde\xi).$$

Though for $d \ge 3$ the pseudo-random measures $\tilde\xi$ and $\tilde\eta$ have many similar properties, we note that $\tilde\eta$ has weaker scaling properties. The Palm distributions $\mathcal L^0_\mu(\xi_t)$ and $\mathcal L^0_\mu(\eta_t)$ both exist, since the common intensity measure $E_\mu\xi_t = E_\mu\eta_t$ is locally finite. Our proof of Theorem 8.3 requires a simple comparison of $\mathcal L_\mu(\xi_t)$ and $\mathcal L_\mu(\eta_t)$.

Lemma 8.4. Let $\xi$ be a DW-process in $\mathbb R^d$ with canonical cluster $\eta$. Then for any $\mu$, $B$ and $t > 0$:

(i) $P_\mu\{\eta_t B > 0\} = -\log(1 - P_\mu\{\xi_t B > 0\})$;
(ii) $\|\mathcal L_\mu(\xi_t) - \mathcal L_\mu(\eta_t)\|_B \le (P_\mu\{\eta_t B > 0\})^2$.

In particular, $P_\mu\{\xi_t B > 0\} \sim P_\mu\{\eta_t B > 0\}$ as either side tends to 0.

Proof. (i) See Lemma 4.1 in [19].

(ii) Using the cluster representation $\xi_t = \int m\,\zeta_t(dm)$, where $\zeta_t$ is a Poisson process on $\mathcal M_d$ with intensity measure $\mu\mathcal L(\eta_t)$, we get
$$1_B\xi_t = \int 1_B m\,\zeta_t(dm) = \int 1_B m\,\zeta_t^B(dm),$$
where $\zeta_t^B$ denotes the restriction of $\zeta_t$ to the set of measures $m$ with $mB > 0$. For any measurable function $f \ge 0$ on $\mathcal M_d$ with $f(0) = 0$, Lemma 2.10 yields
$$|E_\mu f(1_B\xi_t) - E_\mu f(1_B\eta_t)| \le \|f\|\,(P_\mu\{\eta_t B > 0\})^2,$$
and the assertion follows since $f$ is arbitrary. $\Box$
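For orientation, here is how (i) follows from the Poisson structure just used in the proof of (ii) (a sketch of the argument behind Lemma 4.1 in [19]): since $\zeta_t$ is Poisson with intensity $\mu\mathcal L(\eta_t)$, and $\xi_t B = 0$ means that no cluster charges $B$,

```latex
\[
  P_\mu\{\xi_t B = 0\}
    = P\{\zeta_t\{m;\ mB > 0\} = 0\}
    = \exp\bigl(-\mu\mathcal L(\eta_t)\{m;\ mB > 0\}\bigr)
    = \exp\bigl(-P_\mu\{\eta_t B > 0\}\bigr),
\]
% since the pseudo-probability P_mu{eta_t B > 0} is by definition
% the intensity mass mu L(eta_t){m; mB > 0}.
% Taking logarithms gives (i).
```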

Proof of Theorem 8.3. Some crucial ideas in the following proof are adapted from the corresponding arguments in Section 8 of [19].

(i) For $d \ge 3$, the assertion follows from Theorem 8.2(i), applied to $\xi_t$ under $P_\mu$ and to $\eta_1$ under $P_{\lambda^{\otimes d}}$. Now let $d = 2$. Fixing $t > 0$, $\mu \in \mathcal M_2$ and $B \in \hat{\mathcal B}^2$, writing $\eta_h^i$ for the $h$-clusters of $\xi_t$ with associated point process $\zeta_s$ of ancestors at time $s = t - h$, and letting $\varepsilon, h \to 0$ with $|\log h| \ll |\log\varepsilon|$, we get as in [19] [display (12)], for any $\mathcal H_B$-measurable function $f$ with $0 \le f \le 1_{\mathcal H_B}$,
$$E_\mu f(\xi_t S_\varepsilon) = E_\mu f\Big(\sum_i \eta_h^i S_\varepsilon\Big) \approx E_\mu \sum_i f(\eta_h^i S_\varepsilon)$$

$$= E_\mu\int \zeta_s(dx)\,f(\eta_h^x S_\varepsilon) = E_\mu\int \xi_s(dx)\,E_x f(\eta_h S_\varepsilon)$$

$$= \int \mu(dy)\int p_s(x - y)\,E_x f(\eta_h S_\varepsilon)\,dx$$

$$\approx \mu p_t \int E_x f(\eta_h S_\varepsilon)\,dx = \mu p_t\,E f(h\tilde\eta S_{\varepsilon/\sqrt h}),$$
where the third equality holds by the conditional independence of the clusters and the Cox nature of $\zeta_s$, and the last equality holds by Lemma 8.1(ii). As for the first approximation, we get, by Lemma 7.3,

$$E_\mu\Big| f\Big(\sum_i \eta_h^i S_\varepsilon\Big) - \sum_i f(\eta_h^i S_\varepsilon)\Big| \le E_\mu[\kappa_h^{c\varepsilon};\ \kappa_h^{c\varepsilon} > 1]$$

$$\lesssim \frac{\log(t/h)\,\mu p_t + (\mu p_{t(h,\varepsilon)})^2}{|\log\varepsilon|^2} \lesssim \frac{|\log h| + 1}{|\log\varepsilon|^2} \ll |\log\varepsilon|^{-1},$$
where $\kappa_h^\varepsilon$ denotes the number of clusters $\eta_h^i$ hitting $B_0^\varepsilon$. For the second approximation, we get, by Lemma 7.1(ii) as $\varepsilon \le h \to 0$,

$$\int \mu(dy)\int (p_s(y - x) - p_t(y))\,E_x f(\eta_h S_\varepsilon)\,dx$$

$$\lesssim |\log(\varepsilon^2/h)|^{-1}\int \mu(dy)\int |p_s(y - x) - p_t(y)|\,p_{h_{c\varepsilon}}(x)\,dx$$

$$\lesssim |\log\varepsilon|^{-1}\int \mu(dy)\,E|p_s(y - \gamma h_{c\varepsilon}^{1/2}) - p_t(y)|,$$
where $\gamma$ denotes a standard normal random vector in $\mathbb R^d$. Since $p_s(y - \gamma h_{c\varepsilon}^{1/2}) \to p_t(y)$ by the joint continuity of $p_t(x)$, and
$$E p_s(y - \gamma h_{c\varepsilon}^{1/2}) = (p_s * p_{h_{c\varepsilon}})(y) = p_{s + h_{c\varepsilon}}(y) \to p_t(y),$$
the last expectation tends to 0 by Lemma 1.32 in [15], and so the integral on the right tends to 0 by dominated convergence.

In summary, noting that both approximations are uniform in $f$, we get as $\varepsilon, h \to 0$ with $|\log h| \ll |\log\varepsilon|$,
(21)
$$\|\mathcal L_\mu(\xi_t S_\varepsilon) - \mu p_t\,\mathcal L(h\tilde\eta S_{\varepsilon/\sqrt h})\|_B \ll |\log\varepsilon|^{-1},$$
which extends to unbounded $\mu$ by an easy truncation argument. Furthermore, Lemmas 7.2(ii) and 8.4 yield, as $\varepsilon \to 0$,
(22)
$$|E_\mu f(\xi_t S_\varepsilon) - E_\mu f(\eta_t S_\varepsilon)| \lesssim (P_\mu\{\eta_t B_0^{c\varepsilon} > 0\})^2 \lesssim |\log\varepsilon|^{-2}.$$
Hence, (21) remains valid with $\xi_t$ replaced by $\eta_t$. Now (i) follows as we take $t = 1$ and $\mu = \lambda^{\otimes 2}$ and combine with (21). The corresponding result for $\eta_t$ follows by means of (22).

(ii) Once again, the statement for $d \ge 3$ follows from Theorem 8.2(i). For $d = 2$, we see from Lemma 7.2(ii) above and Lemma 3.4 in [19] that as $\varepsilon \to 0$, for fixed $t > 0$,
$$P_\mu\{\xi_t B_0^\varepsilon > 0\} \asymp |\log\varepsilon|^{-1}\mu p_t, \qquad E_\mu(\xi_t B_0^\varepsilon)^2 \asymp \varepsilon^4|\log\varepsilon|\,(\lambda^{\otimes 2} B_0^1)^2\,\mu p_t \asymp \varepsilon^4|\log\varepsilon|\,\mu p_t,$$
and similarly for $\eta_t$. Hence,
$$E_\mu[(\xi_t B_0^\varepsilon)^2 \mid \xi_t B_0^\varepsilon > 0] = \frac{E_\mu(\xi_t B_0^\varepsilon)^2}{P_\mu\{\xi_t B_0^\varepsilon > 0\}} \asymp \varepsilon^4|\log\varepsilon|^2,$$
and similarly for $\eta_t$. Thus, $\xi_t B_0^\varepsilon/\varepsilon^2|\log\varepsilon|$ is uniformly integrable, conditionally on $\xi_t B_0^\varepsilon > 0$, and correspondingly for $\eta_t$ under both $P_\mu$ and $P_{\lambda^{\otimes 2}}$. Noting that, by (i),
$$\|\mathcal L_\mu[\xi_t S_\varepsilon \mid \xi_t B_0^\varepsilon > 0] - \mathcal L[\tilde\eta S_\varepsilon \mid \tilde\eta B_0^\varepsilon > 0]\|_{B_0^1} \to 0,$$
we obtain
$$\|E_\mu[\xi_t B_0^\varepsilon;\ \xi_t S_\varepsilon \in \cdot \mid \xi_t B_0^\varepsilon > 0] - E[\tilde\eta B_0^\varepsilon;\ \tilde\eta S_\varepsilon \in \cdot \mid \tilde\eta B_0^\varepsilon > 0]\|_{B_0^1} \ll \varepsilon^2|\log\varepsilon|,$$
and so, by (i),
$$\|E_\mu[\xi_t B_0^\varepsilon;\ \xi_t S_\varepsilon \in \cdot\,] - \mu p_t\,E[\tilde\eta B_0^\varepsilon;\ \tilde\eta S_\varepsilon \in \cdot\,]\|_{B_0^1} \ll \varepsilon^2,$$
and similarly for $\eta_t$. Next, Lemma 4.1 yields
$$E_\mu\xi_t B_0^\varepsilon = \lambda^{\otimes 2}(\mu * p_t)1_{B_0^\varepsilon} \asymp \varepsilon^2\mu p_t.$$
Combining the last two estimates with Lemma 6.6, and using Lemma 3.11 in a version for pseudo-random measures, we obtain the desired convergence.

(iii) Use Lemma 8.1(ii)-(iii).

(iv) From (iii) and Theorem 8.2(i)-(ii), we get, as $r \to \infty$,
$$\mathcal L(\tilde\eta_{r^2}) = r^{d-2}\,\mathcal L(r^2\tilde\eta S_{1/r}) \to \mathcal L(\tilde\xi). \qquad\Box$$

Though the scaling properties of $\tilde\eta$ are weaker than those of $\tilde\xi$ when $d \ge 3$, $\tilde\eta$ does satisfy a strong continuity property under scaling, which extends Lemma 5.1 in [19].

Theorem 8.5. Let $\tilde\eta$ be the stationary cluster of a DW-process in $\mathbb R^2$, and define a kernel $\nu$ from $(0,\infty)$ to $\mathcal M_2$ by
$$\nu(r) = |\log r|\,\mathcal L(r^{-2}\tilde\eta S_r), \qquad r > 0.$$
Then the kernel $t \mapsto \nu(\exp(-e^t))$ is uniformly continuous on $[1,\infty)$, in total variation on $\mathcal H_B$ for bounded $B$.

Proof. For any $\varepsilon, r, h \in (0,1)$, let $\zeta_s$ denote the ancestral process of $\xi_1$ at time $s = 1 - h$, and let $\eta_h^u$ be the $h$-clusters rooted at the associated atoms at $u$. Then

$$|\log\varepsilon|^{-1}\nu(\varepsilon) \approx r^{-1}\mathcal L_{r\lambda^{\otimes 2}}(\varepsilon^{-2}\xi_1 S_\varepsilon) \approx r^{-1} E_{r\lambda^{\otimes 2}}\int \zeta_s(du)\,1\{\varepsilon^{-2}\eta_h^u S_\varepsilon \in \cdot\}$$

$$= h^{-1}\int \mathcal L_u(\varepsilon^{-2}\eta_h S_\varepsilon)\,du = \mathcal L(\varepsilon^{-2}h\tilde\eta S_{\varepsilon/\sqrt h}) \approx |\log\varepsilon|^{-1}\nu(\varepsilon/\sqrt h),$$
with all relations explained and justified below. The first equality holds by the conditional independence of the clusters and the fact that $E_{r\lambda^{\otimes 2}}\zeta_s = (r/h)\lambda^{\otimes 2}$. The second equality follows from Lemma 8.1(i) by an elementary substitution.

To estimate the error in the first approximation, we see from Lemmas 7.1(ii) and 8.4 that for $\varepsilon < \frac12$,
$$\|r|\log\varepsilon|^{-1}\nu(\varepsilon) - \mathcal L_{r\lambda^{\otimes 2}}(\varepsilon^{-2}\xi_1 S_\varepsilon)\|_B = \|\mathcal L_{r\lambda^{\otimes 2}}(\eta_1 S_\varepsilon) - \mathcal L_{r\lambda^{\otimes 2}}(\xi_1 S_\varepsilon)\|_B \lesssim (r P\{\tilde\eta(\varepsilon B) > 0\})^2 \lesssim r^2|\log\varepsilon|^{-2},$$
where $\eta_h^1, \eta_h^2, \ldots$ are the $h$-clusters of $\xi$ at time $t$. As for the second approximation, we get, by Lemma 7.3 for small enough $\varepsilon/h$,

$$\Big\| E_{r\lambda^{\otimes 2}}\sum_k 1\{\eta_h^k S_\varepsilon \in \cdot\} - \mathcal L_{r\lambda^{\otimes 2}}(\xi_1 S_\varepsilon) \Big\|_B$$

$$\lesssim E_{r\lambda^{\otimes 2}}\Big[\sum_k 1\{\eta_h^k(\varepsilon B) > 0\} - 1\Big]_+ \lesssim \frac{|\log h|\,r\lambda^{\otimes 2}p_1 + (r\lambda^{\otimes 2}p_{t(h,\varepsilon)})^2}{|\log\varepsilon|^2} = r\,\frac{|\log h| + r}{|\log\varepsilon|^2}.$$
The third approximation relies on the estimate
$$\Big|\frac{|\log\varepsilon|}{|\log(\varepsilon/\sqrt h)|} - 1\Big|\,\|\nu(\varepsilon/\sqrt h)\|_B \lesssim \frac{|\log h|}{|\log\varepsilon|},$$

which holds for $\varepsilon \le h$ by the boundedness of $\nu$. Combining those estimates and letting $r \to 0$ gives
(23)
$$\|\nu(\varepsilon) - \nu(\varepsilon/\sqrt h)\|_B \lesssim \frac{|\log h|}{|\log\varepsilon|}, \qquad \varepsilon \ll h < 1.$$
Putting $\varepsilon = e^{-u}$ and $\varepsilon/\sqrt h = e^{-v}$ and writing $\nu_A(x) = \nu(x, A)$ for measurable sets $A \subset \mathcal H_B$, we get for $u - v \ll u$ (with $0/0 = 1$),
$$\Big|\log\frac{\nu_A(e^{-u})}{\nu_A(e^{-v})}\Big| \lesssim \frac{|\nu_A(e^{-u}) - \nu_A(e^{-v})|}{\nu_A(e^{-v})} \lesssim \frac{u - v}{u} \le \log\frac uv,$$
and so, for $u = e^s$ and $v = e^t$ (with $\infty - \infty = 0$),
$$|\log\nu_A(\exp(-e^t)) - \log\nu_A(\exp(-e^s))| \lesssim |t - s|,$$
which extends immediately to arbitrary $s, t \ge 1$. Since $\nu$ is bounded on $\mathcal H_B$ by Lemma 7.2, the function $\nu_A(\exp(-e^t))$ is again uniformly continuous on $[1,\infty)$, and the assertion follows since all estimates are uniform in $A$. $\Box$

The exact scaling properties of $\tilde\eta$ in Theorem 8.3(iii) may be supplemented by the following asymptotic age invariance, which may be compared with the exact age invariance of $\tilde\xi$ in Theorem 8.2(iv).
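The step from the ratio bound to the logarithmic bound in the preceding argument uses only the elementary inequality, valid for $0 < v \le u$:

```latex
\[
  \log\frac{u}{v} \;=\; -\log\Bigl(1 - \frac{u-v}{u}\Bigr) \;\ge\; \frac{u-v}{u},
\]
% so an error of order (u - v)/u, as supplied by (23), is dominated
% by log(u/v), which under the substitutions u = e^s, v = e^t
% becomes the Lipschitz bound |s - t|.
```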

Corollary 8.6. Let $\varepsilon \to 0$ and $h > 0$ with $\varepsilon^2 \ll h \ll \varepsilon^{-2}$ for $d \ge 3$, and $|\log\varepsilon| \gg |\log h|$ for $d = 2$. Then, as $\varepsilon \to 0$,
$$\|\mathcal L(\tilde\eta_h) - \mathcal L(\tilde\eta_1)\|_{B_0^\varepsilon} \ll \begin{cases} \varepsilon^{d-2}, & d \ge 3, \\ |\log\varepsilon|^{-1}, & d = 2. \end{cases}$$

Proof. Fix any $B \in \hat{\mathcal B}^d$. For $d \ge 3$, we get by Theorems 8.2 and 8.3,
$$\|\mathcal L(\tilde\eta S_\varepsilon) - r^{d-2}\mathcal L(r^2\tilde\eta S_{\varepsilon/r})\|_B \le \|\mathcal L(\tilde\eta S_\varepsilon) - \mathcal L(\tilde\xi S_\varepsilon)\|_B + r^{d-2}\|\mathcal L(\tilde\eta S_{\varepsilon/r}) - \mathcal L(\tilde\xi S_{\varepsilon/r})\|_B \ll \varepsilon^{d-2} + r^{d-2}(\varepsilon/r)^{d-2} \lesssim \varepsilon^{d-2}.$$
When $d = 2$, we may use (23) instead to get
$$|\log\varepsilon|\,\|\mathcal L(\tilde\eta S_\varepsilon) - \mathcal L(r^2\tilde\eta S_{\varepsilon/r})\|_B \lesssim \|\nu(\varepsilon) - \nu(\varepsilon/r)\|_B + \Big|\frac{|\log\varepsilon|}{|\log(\varepsilon/r)|} - 1\Big|\,\|\nu(\varepsilon/r)\|_B$$

$$\lesssim \frac{|\log r|}{|\log\varepsilon|} + \frac{|\log r|}{|\log\varepsilon| - |\log r|} \to 0.$$
It remains to note that $r^{d-2}\,\mathcal L(r^2\tilde\eta S_{1/r}) = \mathcal L(\tilde\eta_{r^2})$ by Theorem 8.3(iii). $\Box$

9. Local conditioning and global approximation. Here we state and prove our main approximation theorem, which contains multivariate versions of the local approximations in Section 8 and shows how the multivariate Palm distributions of a DW-process can be approximated by elementary conditional distributions.

Given a DW-process $\xi$ in $\mathbb R^d$, let $\tilde\xi$ and $\tilde\eta$ be the associated pseudo-random measures from Theorems 8.2 and 8.3. Let $q_{\mu,t}$ denote the continuous versions of the moment densities of $E_\mu\xi_t^{\otimes n}$ from Theorem 5.3, and write $\mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x$ for the regular, multivariate Palm distributions considered in Theorem 6.3. Define $c_d$ and $m_\varepsilon$ as in Lemma 7.2. Write $f \sim g$ for $f/g \to 1$, $f \approx g$ for $f - g \to 0$, and $f \ll g$ for $f/g \to 0$. The notation $\|\cdot\|_B$ with associated terminology is explained in Section 1 above.

Theorem 9.1. Let $\xi$ be a DW-process in $\mathbb R^d$ with $d \ge 2$, and let $\varepsilon \to 0$ for fixed $\mu$, $t > 0$ and open $G \subset \mathbb R^d$. Then:

(i) for any $x \in (\mathbb R^d)^{(n)}$,
$$P_\mu\{\xi_t^{\otimes n} B_x^\varepsilon > 0\} \sim q_{\mu,t}(x) \begin{cases} c_d^n\,\varepsilon^{n(d-2)}, & d \ge 3, \\ m_\varepsilon^n\,|\log\varepsilon|^{-n}, & d = 2; \end{cases}$$
(ii) for any $x \in G^{(n)}$, in total variation on $(B_0^1)^n \times G^c$,
$$\mathcal L_\mu[(\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] \approx \mathcal L^{\otimes n}[\tilde\eta S_\varepsilon \mid \tilde\eta B_0^\varepsilon > 0] \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x;$$
(iii) for $d \ge 3$ we have, in the same sense,
$$\mathcal L_\mu[(\varepsilon^{-2}\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] \to \mathcal L^{\otimes n}[\tilde\xi \mid \tilde\xi B_0^1 > 0] \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x.$$

Here (i) extends some asymptotic results for $n = 1$ from [3, 19, 27]. Parts (ii) and (iii) show that, asymptotically as $\varepsilon \to 0$, the contributions of $\xi_t$ to the sets $B_{x_1}^\varepsilon, \ldots, B_{x_n}^\varepsilon$ and $G^c$ are conditionally independent. They further imply the multivariate Palm approximation
$$\mathcal L_\mu[1_{G^c}\xi_t \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] \to \mathcal L_\mu[1_{G^c}\xi_t \,\|\, \xi_t^{\otimes n}]_x, \qquad x \in G^{(n)},$$
and they contain the asymptotic equivalence or convergence on $B_0^1$, for any $x \in (\mathbb R^d)^{(n)}$,
$$\mathcal L_\mu[\xi_t S_{x_j}^\varepsilon \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] \approx \mathcal L[\tilde\eta S_\varepsilon \mid \tilde\eta B_0^\varepsilon > 0], \qquad d \ge 2,$$
$$\mathcal L_\mu[\varepsilon^{-2}\xi_t S_{x_j}^\varepsilon \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] \to \mathcal L[\tilde\xi \mid \tilde\xi B_0^1 > 0], \qquad d \ge 3,$$
extending the versions for $n = 1$ implicit in Theorems 8.2 and 8.3. Analogous results for simple point processes and regenerative sets appear in [16, 22].

Given (i), assertions (ii) and (iii) are essentially equivalent to the following estimate, which we prove first. Here and below, $q_{\mu,t}(x) = q_{\mu,t}^x$.
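To see what (i) says in the simplest case $n = 1$ with $d \ge 3$, note that $q_{\mu,t}$ is then the density $(\mu * p_t)(x)$ of the first moment measure $E_\mu\xi_t$, so (i) reduces to the one-point hitting asymptotics going back to [3, 27]:

```latex
\[
  P_\mu\{\xi_t B_x^\varepsilon > 0\}
    \;\sim\; c_d\, \varepsilon^{d-2}\, (\mu * p_t)(x),
  \qquad \varepsilon \to 0,
\]
% with c_d as in Lemma 7.2; for d = 2 the power \varepsilon^{d-2}
% is replaced by the slowly varying factor m_\varepsilon |\log\varepsilon|^{-1}.
```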

Lemma 9.2. Let $\xi$ be a DW-process in $\mathbb R^d$, fix any $\mu$, $t > 0$ and open $G \subset \mathbb R^d$, and put $B = (B_0^1)^n \times G^c$. Then, as $\varepsilon \to 0$ for fixed $x \in G^{(n)}$,
$$\big\|\mathcal L_\mu\big((\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t\big) - q_{\mu,t}^x\,\mathcal L^{\otimes n}(\tilde\eta S_\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x\big\|_B \ll \begin{cases} \varepsilon^{n(d-2)}, & d \ge 3, \\ |\log\varepsilon|^{-n}, & d = 2. \end{cases}$$

Proof. We may regard $\xi_t$ as a sum of conditionally independent clusters $\eta_h^u$ of age $h \in (0,t)$, rooted at the points $u$ of the ancestral process $\zeta_s$ at time $s = t - h$. Choose the random measure $\xi_t'$ to satisfy
$$\xi_t' \perp\!\!\!\perp_{\xi_s} (\xi_t, \zeta_s, (\eta_h^u)), \qquad (\xi_s, \xi_t) \stackrel{d}{=} (\xi_s, \xi_t').$$
Our argument can be summarized as follows:

$$\mathcal L_\mu\big((\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t\big) \approx E_\mu\int \zeta_s^{(n)}(du)\,1_{(\cdot)}\big((\eta_h^{u_j} S_{x_j}^\varepsilon)_{j\le n},\,\xi_t'\big)$$

$$= E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u$$
(24)
$$\approx (E_\mu\xi_s^{\otimes n} * p_h^{\otimes n})(x)\,\mathcal L^{\otimes n}(\tilde\eta_h S_\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x \approx q_{\mu,t}^x\,\mathcal L^{\otimes n}(\tilde\eta S_\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x,$$
where $h$ and $\varepsilon$ are related as in (8.18), and the approximations hold in the sense of total variation on $\mathcal H_B$, of the order $\varepsilon^{n(d-2)}$ or $|\log\varepsilon|^{-n}$, respectively. Detailed justifications are given below.

The first relation in (24) is immediate from Corollaries 7.6 and 7.9. To justify the second relation, we provide some intermediate steps:

$$E_\mu\int \zeta_s^{(n)}(du)\,1_{(\cdot)}\big((\eta_h^{u_j} S_{x_j}^\varepsilon)_{j\le n},\,\xi_t'\big)$$

$$= E_\mu\int \zeta_s^{(n)}(du)\,\mathcal L_\mu\big[(\eta_h^{u_j} S_{x_j}^\varepsilon)_{j\le n},\,\xi_t' \,\big|\, \xi_s, \zeta_s\big]$$

$$= h^n\,E_\mu\int \zeta_s^{(n)}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes \mathcal L_\mu[\xi_t' \mid \xi_s]$$
$$= E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes \mathcal L_\mu[\xi_t \mid \xi_s]$$
$$= E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes 1_{(\cdot)}(\xi_t)$$
$$= E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u.$$

Here the first and fourth equalities hold by disintegration and Fubini's theorem. The second relation holds by the conditional independence of the $h$-clusters and the process $\xi_t'$, along with the normalization of $\mathcal L(\eta)$. The third relation holds by the choice of $\xi_t'$ and the moment relation $E_\mu[\zeta_s^{(n)} \mid \xi_s] = h^{-n}\xi_s^{\otimes n}$ from [22]. The fifth relation holds by Palm disintegration.

To justify the third relation in (24), we first consider a change in the last factor. By Lemmas 3.4 and 7.1,

$$\Big\| E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \otimes \big(\mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u - \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x\big) \Big\|_B$$
$$\lesssim \begin{cases} \varepsilon^{n(d-2)}, \\ |\log\varepsilon|^{-n}, \end{cases} \int E_\mu\xi_s^{\otimes n}(du)\,p_{h_\varepsilon}^{\otimes n}(x - u)\,\big\|\mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u - \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x\big\|_{G^c},$$
with $h_\varepsilon$ defined as in Lemma 7.1 with $t$ replaced by $h$. Choosing $r > 0$ with $B_x^{2r} \subset G^{(n)}$, we may estimate the integral on the right by
$$(E_\mu\xi_s^{\otimes n} * p_{h_\varepsilon}^{\otimes n})(x)\,\sup_{u \in B_x^r}\big\|\mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u - \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x\big\|_{G^c} + \int_{(B_x^r)^c} E_\mu\xi_s^{\otimes n}(du)\,p_{h_\varepsilon}^{\otimes n}(x - u).$$
Here the first term tends to 0 by Lemmas 5.6 and 6.5, whereas the second term tends to 0 as in the proof of Lemma 7.7. Hence, in the second line of (24), we may replace $\mathcal L_\mu[\xi_t \,\|\, \xi_s^{\otimes n}]_u$ by $\mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x$.

By a similar argument based on Lemma 3.4 and Corollary 8.6, we may next replace $\mathcal L(\tilde\eta S_\varepsilon)$ in the last line by $\mathcal L(\tilde\eta_h S_\varepsilon)$. It is then enough to prove that

$$E_\mu\int \xi_s^{\otimes n}(du)\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_{x_j}^\varepsilon) \approx q_t(x)\,\mathcal L^{\otimes n}(\tilde\eta_h S_\varepsilon),$$
where $q_t$ denotes the continuous density of $E_\mu\xi_t^{\otimes n}$ in Theorem 5.3. Here the total variation distance may be expressed in terms of densities as

$$\Big\|\int (q_s(x - u) - q_t(x))\,\bigotimes_{j\le n}\mathcal L_{u_j}(\eta_h S_\varepsilon)\,du\Big\|_B \lesssim \begin{cases} \varepsilon^{n(d-2)}, \\ |\log\varepsilon|^{-n}, \end{cases} \int |q_s(x - u) - q_t(x)|\,p_{h_\varepsilon}^{\otimes n}(u)\,du.$$
Letting $\gamma$ be a standard normal random vector in $\mathbb R^{nd}$, we may write the integral on the right as $E|q_s(x - \gamma h_\varepsilon^{1/2}) - q_t(x)|$. Here $q_s(x - \gamma h_\varepsilon^{1/2}) \to q_t(x)$ a.s. by the joint continuity of $q_s(u)$, and Lemma 5.6 yields
$$E q_s(x - \gamma h_\varepsilon^{1/2}) = (q_s * p_{h_\varepsilon}^{\otimes n})(x) \to q_t(x).$$

The former convergence then extends to $L^1$ by Lemma 1.32 in [15], and the required approximation follows. $\Box$

Proof of Theorem 9.1. (i) For any $x \in (\mathbb R^d)^{(n)}$, Lemma 9.2 yields
$$\big|P_\mu\{\xi_t^{\otimes n} B_x^\varepsilon > 0\} - q_{\mu,t}^x\,(P\{\tilde\eta B_0^\varepsilon > 0\})^n\big| \ll \begin{cases} \varepsilon^{n(d-2)}, & d \ge 3, \\ |\log\varepsilon|^{-n}, & d = 2. \end{cases}$$
It remains to note that, by Lemma 7.2,
$$P\{\tilde\eta B_0^\varepsilon > 0\} \sim \begin{cases} c_d\,\varepsilon^{d-2}, & d \ge 3, \\ m_\varepsilon\,|\log\varepsilon|^{-1}, & d = 2. \end{cases}$$
(ii) Assuming $x \in G^{(n)}$ and using (i) and Lemma 9.2, we get, in total variation on $(B_0^1)^n \times G^c$,
$$\mathcal L_\mu[(\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t \mid \xi_t^{\otimes n} B_x^\varepsilon > 0] = \frac{\mathcal L_\mu((\xi_t S_{x_j}^\varepsilon)_{j\le n},\,\xi_t)}{P_\mu\{\xi_t^{\otimes n} B_x^\varepsilon > 0\}} \approx \frac{q_{\mu,t}^x\,\mathcal L^{\otimes n}(\tilde\eta S_\varepsilon) \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x}{q_{\mu,t}^x\,(P\{\tilde\eta B_0^\varepsilon > 0\})^n} = \mathcal L^{\otimes n}[\tilde\eta S_\varepsilon \mid \tilde\eta B_0^\varepsilon > 0] \otimes \mathcal L_\mu[\xi_t \,\|\, \xi_t^{\otimes n}]_x.$$
(iii) When $d \ge 3$, Theorem 8.2(i) yields
$$\varepsilon^{2-d}\,\mathcal L(\varepsilon^{-2}\tilde\eta S_\varepsilon) \to \mathcal L(\tilde\xi),$$
in total variation on $B_0^1$. Hence,
$$\mathcal L[\varepsilon^{-2}\tilde\eta S_\varepsilon \mid \tilde\eta B_0^\varepsilon > 0] \to \mathcal L[\tilde\xi \mid \tilde\xi B_0^1 > 0],$$
and the assertion follows by means of (ii). $\Box$

Acknowledgment. My sincere thanks to the referees for their careful reading and many helpful remarks.

REFERENCES

[1] Daley, D. J. and Vere-Jones, D. (2008). An Introduction to the Theory of Point Processes. Vol. II: General Theory and Structure, 2nd ed. Springer, New York. MR2371524
[2] Dawson, D. A. (1993). Measure-valued Markov processes. In École d'Été de Probabilités de Saint-Flour XXI—1991. Lecture Notes in Math. 1541 1–260. Springer, Berlin. MR1242575
[3] Dawson, D. A., Iscoe, I. and Perkins, E. A. (1989). Super-Brownian motion: Path properties and hitting probabilities. Probab. Theory Related Fields 83 135–205. MR1012498
[4] Dawson, D. A. and Perkins, E. A. (1991). Historical processes. Mem. Amer. Math. Soc. 93 (454) iv+179. MR1079034

[5] Dynkin, E. B. (1991). Branching particle systems and superprocesses. Ann. Probab. 19 1157–1194. MR1112411
[6] Dynkin, E. B. (1994). An Introduction to Branching Measure-valued Processes. CRM Monograph Series 6. Amer. Math. Soc., Providence, RI. MR1280712
[7] Etheridge, A. M. (2000). An Introduction to Superprocesses. University Lecture Series 20. Amer. Math. Soc., Providence, RI. MR1779100
[8] Gorostiza, L. G., Roelly-Coppoletta, S. and Wakolbinger, A. (1990). Sur la persistance du processus de Dawson–Watanabe stable. L'interversion de la limite en temps et de la renormalisation. In Séminaire de Probabilités, XXIV, 1988/89. Lecture Notes in Math. 1426 275–281. Springer, Berlin. MR1071545
[9] Gorostiza, L. G. and Wakolbinger, A. (1991). Persistence criteria for a class of critical branching particle systems in continuous time. Ann. Probab. 19 266–288. MR1085336
[10] Jagers, P. (1973). On Palm probabilities. Z. Wahrsch. Verw. Gebiete 26 17–32. MR0339330
[11] Kallenberg, O. (1977). Stability of critical cluster fields. Math. Nachr. 77 7–43. MR0443078
[12] Kallenberg, O. (1986). Random Measures, 4th ed. Akademie-Verlag, Berlin. MR0854102
[13] Kallenberg, O. (1999). Palm measure duality and conditioning in regenerative sets. Ann. Probab. 27 945–969. MR1698987
[14] Kallenberg, O. (2001). Local hitting and conditioning in symmetric interval partitions. Stochastic Process. Appl. 94 241–270. MR1840832
[15] Kallenberg, O. (2002). Foundations of Modern Probability, 2nd ed. Springer, New York. MR1876169
[16] Kallenberg, O. (2003). Palm distributions and local approximation of regenerative processes. Probab. Theory Related Fields 125 1–41. MR1952454
[17] Kallenberg, O. (2005). Probabilistic Symmetries and Invariance Principles. Springer, New York. MR2161313
[18] Kallenberg, O. (2007). Some problems of local hitting, scaling, and conditioning. Acta Appl. Math. 96 271–282. MR2327541
[19] Kallenberg, O. (2008). Some local approximations of Dawson–Watanabe superprocesses. Ann. Probab.
36 2176–2214. MR2478680
[20] Kallenberg, O. (2009). Some local approximation properties of simple point processes. Probab. Theory Related Fields 143 73–96. MR2449123
[21] Kallenberg, O. (2010). Commutativity properties of conditional distributions and Palm measures. Commun. Stoch. Anal. 4 21–34. MR2643111
[22] Kallenberg, O. (2011). Iterated Palm conditioning and some Slivnyak-type theorems for Cox and cluster processes. J. Theoret. Probab. 24 875–893. MR2822485
[23] Kummer, G. and Matthes, K. (1970). Verallgemeinerung eines Satzes von Sliwnjak. II–III. Rev. Roumaine Math. Pures Appl. 15 845–870, 1631–1642. MR0270416
[24] Le Gall, J.-F. (1986). Sur la saucisse de Wiener et les points multiples du mouvement brownien. Ann. Probab. 14 1219–1244. MR0866344
[25] Le Gall, J.-F. (1986). Une approche élémentaire des théorèmes de décomposition de Williams. In Séminaire de Probabilités XX. Lecture Notes in Math. 1204 447–464. Springer, Berlin.
[26] Le Gall, J.-F. (1991). Brownian excursions, trees and measure-valued branching processes. Ann. Probab. 19 1399–1439. MR1127710

[27] Le Gall, J.-F. (1994). A lemma on super-Brownian motion with some applications. In The Dynkin Festschrift. Progress in Probability 34 237–251. Birkhäuser, Boston, MA. MR1311723
[28] Le Gall, J.-F. (1999). Spatial Branching Processes, Random Snakes and Partial Differential Equations. Birkhäuser, Basel. MR1714707
[29] Liemant, A., Matthes, K. and Wakolbinger, A. (1988). Equilibrium Distributions of Branching Processes. Mathematical Research 42. Akademie-Verlag, Berlin. MR0969126
[30] Matthes, K., Kerstan, J. and Mecke, J. (1978). Infinitely Divisible Point Processes. Wiley, Chichester. MR0517931
[31] Mecke, J. (1967). Stationäre zufällige Maße auf lokalkompakten Abelschen Gruppen. Z. Wahrsch. Verw. Gebiete 9 36–58. MR0228027
[32] Perkins, E. (2002). Dawson–Watanabe superprocesses and measure-valued diffusions. In Lectures on Probability Theory and Statistics (Saint-Flour, 1999). Lecture Notes in Math. 1781 125–324. Springer, Berlin. MR1915445
[33] Salisbury, T. S. and Verzani, J. (1999). On the conditioned exit measures of super Brownian motion. Probab. Theory Related Fields 115 237–285. MR1720367
[34] Salisbury, T. S. and Verzani, J. (2000). Nondegenerate conditionings of the exit measures of super Brownian motion. Stochastic Process. Appl. 87 25–52. MR1751163
[35] Slivnyak, I. M. (1962). Some properties of stationary flows of homogeneous random events. Theory Probab. Appl. 7 336–341; 9 168.
[36] Williams, D. (1974). Path decomposition and continuity of local time for one-dimensional diffusions. I. Proc. London Math. Soc. (3) 28 738–768. MR0350881
[37] Zähle, U. (1988). The fractal character of localizable measure-valued processes. II. Localizable processes and backward trees. Math. Nachr. 137 35–48. MR0968985

Dept. of Mathematics and Statistics
Auburn University
221 Parker Hall
Auburn, Alabama 36849
USA
E-mail: [email protected]