An Introduction to Lévy and Feller Processes

– Advanced Courses in Mathematics - CRM Barcelona 2014 –

René L. Schilling

TU Dresden, Institut für Mathematische Stochastik, 01062 Dresden, Germany

[email protected] http://www.math.tu-dresden.de/sto/schilling

These course notes will be published, together with Davar Khoshnevisan’s notes on Invariance and Comparison Principles for Parabolic Stochastic Partial Differential Equations, as From Lévy-Type Processes to Parabolic SPDEs by the CRM, Barcelona and Birkhäuser, Cham 2017 (ISBN: 978-3-319-34119-4). The arXiv-version and the published version may differ in layout, pagination and wording, but not in content. arXiv:1603.00251v2 [math.PR] 17 Oct 2016

Contents

Preface

Symbols and notation

1. Orientation

2. Lévy processes

3. Examples

4. On the

5. A digression: semigroups

6. The generator of a Lévy process

7. Construction of Lévy processes

8. Two special Lévy processes

9. Random measures

10. A digression: stochastic integrals

11. From Lévy to Feller processes

12. Symbols and

13. Dénouement

A. Some classical results

Bibliography


Preface

These lecture notes are an extended version of my lectures on Lévy and Lévy-type processes given at the Second Barcelona Summer School on Stochastic Analysis organized by the Centre de Recerca Matemàtica (CRM). The lectures are aimed at advanced graduate and PhD students. In order to read these notes, one should have sound knowledge of measure-theoretic probability and some background in stochastic processes, as it is covered in my books Measures, Integrals and Martingales [54] and [56].

My purpose in these lectures is to give an introduction to Lévy processes, and to show how one can extend this approach to space inhomogeneous processes which behave locally like Lévy processes. After a brief overview (Chapter 1) I introduce Lévy processes, explain how to characterize them (Chapter 2) and discuss the quintessential examples of Lévy processes (Chapter 3). The Markov (loss of memory) property of Lévy processes is studied in Chapter 4. A short analytic interlude (Chapter 5) gives an introduction to operator semigroups, resolvents and their generators from a probabilistic perspective. Chapter 6 brings us back to generators of Lévy processes which are identified as pseudo differential operators whose symbol is the characteristic exponent of the Lévy process. As a by-product we obtain the Lévy–Khintchine formula. Continuing this line, we arrive at the first construction of Lévy processes in Chapter 7. Chapter 8 is devoted to two very special Lévy processes: (compound) Poisson processes and Brownian motion. We give elementary constructions of both processes and show how and why they are special Lévy processes, indeed. This is also the basis for the next chapter (Chapter 9) where we construct a random measure from the jumps of a Lévy process. This can be used to provide a further construction of Lévy processes, culminating in the famous Lévy–Itô decomposition and yet another proof of the Lévy–Khintchine formula. A second interlude (Chapter 10) embeds these random measures into the larger theory of random orthogonal measures. We show how we can use random orthogonal measures to develop an extension of Itô’s theory of stochastic integrals for square-integrable (not necessarily continuous) martingales, but we restrict ourselves to the bare bones, i.e. the $L^2$-theory.
In Chapter 11 we introduce Feller processes as the proper spatially inhomogeneous brethren of Lévy processes, and we show how our proof of the Lévy–Khintchine formula carries over to this setting. We will see, in particular, that Feller processes have a symbol which is the state-space dependent analogue of the characteristic exponent of a Lévy process. The symbol describes the process and its generator. A probabilistic way to calculate the symbol and some first consequences (in particular the decomposition of Feller processes) is discussed in Chapter 12; we also show that the symbol contains information on global properties of the process, such as conservativeness. In the final Chapter 13, we summarize (mostly without proofs) how other path properties of a Feller process can be obtained via the symbol. In order to make these notes self-contained, we collect in the appendix some material which is not always included in standard graduate probability courses.

It is now about time to thank many individuals who helped to bring this enterprise on the way. I am grateful to the scientific board and the organizing committee for the kind invitation to deliver these lectures at the Centre de Recerca Matemàtica in Barcelona. The CRM is a wonderful place to teach and to do research, and I am very happy to acknowledge their support and hospitality. I would like to thank the students who participated in the CRM course as well as all students and readers who were exposed to earlier (temporally & spatially inhomogeneous. . . ) versions of my lectures; without your input these notes would look different! I am greatly indebted to Ms. Franziska Kühn for her interest in this topic; her valuable comments pinpointed many mistakes and helped to make the presentation much clearer. And, last and most, I thank my wife for her love, support and forbearance while these notes were being prepared.

Dresden, September 2015
René L. Schilling

Symbols and notation

This index is intended to aid cross-referencing, so notation that is specific to a single chapter is generally not listed. Some symbols are used locally, without ambiguity, in senses other than those given below; numbers following an entry are page numbers. Unless otherwise stated, functions are real-valued and binary operations between functions such as $f \pm g$, $f \cdot g$, $f \wedge g$, $f \vee g$, comparisons $f \leq g$, $f < g$ or limiting relations $f_n \xrightarrow{n\to\infty} f$, $\lim_n f_n$, $\liminf_n f_n$, $\limsup_n f_n$, $\sup_n f_n$ or $\inf_n f_n$ are understood pointwise.

General notation: analysis

positive                  always in the sense $\geq 0$
negative                  always in the sense $\leq 0$
$\mathbb{N}$              $1, 2, 3, \dots$
$\inf\emptyset$           $\inf\emptyset = +\infty$
$a \vee b$                maximum of $a$ and $b$
$a \wedge b$              minimum of $a$ and $b$
$\lfloor x \rfloor$       largest integer $n \leq x$
$|x|$                     norm in $\mathbb{R}^d$: $|x|^2 = x_1^2 + \dots + x_d^2$
$x \cdot y$               scalar product in $\mathbb{R}^d$: $\sum_{j=1}^d x_j y_j$
$\mathbf{1}_A$            $\mathbf{1}_A(x) = 1$ if $x \in A$, and $= 0$ if $x \notin A$
$\delta_x$                point mass at $x$
$\mathcal{D}$             domain
$\Delta$                  Laplace operator
$\partial_j$              partial derivative $\partial/\partial x_j$
$\nabla, \nabla_x$        gradient $(\partial/\partial x_1, \dots, \partial/\partial x_d)^\top$
$\mathcal{F}f, \hat f$    Fourier transform $(2\pi)^{-d} \int e^{-ix\cdot\xi} f(x)\,dx$
$\mathcal{F}^{-1}f, \check f$   inverse Fourier transform $\int e^{ix\cdot\xi} f(x)\,dx$
$e_\xi(x)$                $e^{-ix\cdot\xi}$

General notation: probability

$\sim$                    ‘is distributed as’
$\perp\!\!\!\perp$        ‘is stochastically independent’
a.s.                      almost surely (w. r. t. $\mathbb{P}$)
iid                       independent and identically distributed
N, Exp, Poi               normal, exponential, Poisson distribution
$\mathbb{P}, \mathbb{E}$  probability, expectation
$\mathbb{V}, \mathrm{Cov}$   variance, covariance
(L0)–(L3)                 definition of a Lévy process, 7
(L2′)                     12

Sets and σ-algebras

$A^c$                     complement of the set $A$
$\overline{A}$            closure of the set $A$
$A \mathbin{\dot\cup} B$  disjoint union, i.e. $A \cup B$ for disjoint sets $A \cap B = \emptyset$
$B_r(x)$                  open ball, centre $x$, radius $r$
$\operatorname{supp} f$   support, $\overline{\{f \neq 0\}}$
$\mathcal{B}(E)$          Borel sets of $E$
$\mathcal{F}_t^X$         canonical filtration $\sigma(X_s : s \leq t)$
$\mathcal{F}_\infty$      $\sigma\big(\bigcup_{t \geq 0} \mathcal{F}_t\big)$
$\mathcal{F}_\tau$        75
$\mathcal{F}_{\tau+}$     29
$\mathcal{P}$             predictable σ-algebra, 101

Stochastic processes

$\mathbb{P}^x, \mathbb{E}^x$   law and mean of a Markov process starting at $x$, 24
$X_{t-}$                  left limit $\lim_{s \uparrow t} X_s$
$\Delta X_t$              jump at time $t$: $X_t - X_{t-}$
$\sigma, \tau$            stopping times: $\{\sigma \leq t\} \in \mathcal{F}_t$, $t \geq 0$
$\tau_r^x, \tau_r$        $\inf\{t > 0 : |X_t - X_0| \geq r\}$, first exit time from the open ball $B_r(x)$ centered at $x = X_0$
càdlàg                    right continuous on $[0,\infty)$ with finite left limits on $(0,\infty)$

Spaces of functions

$B(E)$                    Borel functions on $E$
$B_b(E)$                  – – , bounded
$C(E)$                    continuous functions on $E$
$C_b(E)$                  – – , bounded
$C_\infty(E)$             – – , $\lim_{|x|\to\infty} f(x) = 0$
$C_c(E)$                  – – , compact support
$C^n(E)$                  $n$ times continuously diff’ble functions on $E$
$C_b^n(E)$                – – , bounded (with all derivatives)
$C_\infty^n(E)$           – – , 0 at infinity (with all derivatives)
$C_c^n(E)$                – – , compact support
$L^p(E,\mu), L^p(\mu), L^p(E)$   $L^p$ space w. r. t. the measure space $(E, \mathcal{A}, \mu)$
$\mathcal{S}(\mathbb{R}^d)$      rapidly decreasing smooth functions on $\mathbb{R}^d$, 36

1. Orientation

Stochastic processes with stationary and independent increments are classical examples of Markov processes. Their importance both in theory and for applications justifies studying these processes and their history. The origins of processes with independent increments reach back to the late 1920s and they are closely connected with the notion of infinite divisibility and the genesis of the Lévy–Khintchine formula. Around this time, the limiting behaviour of sums of independent random variables

\[ X_0 := 0 \quad\text{and}\quad X_n := \xi_1 + \xi_2 + \dots + \xi_n, \quad n \in \mathbb{N}, \]

was well understood through the contributions of Borel, Markov, Cantelli, Lindeberg, Feller, de Finetti, Khintchine, Kolmogorov and, of course, Lévy; two new developments emerged, on the one hand the study of dependent random variables and, on the other, the study of continuous-time analogues of sums of independent random variables. In order to pass from $n \in \mathbb{N}$ to a continuous parameter $t \in [0,\infty)$ we need to replace the steps $\xi_k$ by increments $X_t - X_s$. It is not hard to see that $X_t$, $t \in \mathbb{N}$, with iid (independent and identically distributed) steps $\xi_k$ enjoys the following properties:

\[ X_0 = 0 \quad\text{a.s.} \tag{L0} \]
\[ \text{stationary increments:}\quad X_t - X_s \sim X_{t-s} - X_0 \quad \forall s \leq t \tag{L1} \]
\[ \text{independent increments:}\quad X_t - X_s \perp\!\!\!\perp \sigma(X_r, r \leq s) \quad \forall s \leq t \tag{L2} \]

where ‘$\sim$’ stands for ‘same distribution’ and ‘$\perp\!\!\!\perp$’ for stochastic independence. In the non-discrete setting we will also require a mild regularity condition

\[ \text{continuity in probability:}\quad \lim_{t\to 0} \mathbb{P}(|X_t - X_0| > \epsilon) = 0 \quad \forall \epsilon > 0 \tag{L3} \]

which rules out fixed discontinuities of the path $t \mapsto X_t$. Under (L0)–(L2) one has that

\[ X_t = \sum_{k=1}^n \xi_{k,n}(t) \quad\text{and}\quad \xi_{k,n}(t) = \big(X_{\frac{kt}{n}} - X_{\frac{(k-1)t}{n}}\big) \text{ are iid} \tag{1.1} \]

for every $n \in \mathbb{N}$. Letting $n \to \infty$ shows that $X_t$ arises as a suitable limit of (a triangular array of) iid random variables which transforms the problem into a question of limit theorems and infinite divisibility.


This was first observed in 1929 by de Finetti [15] who introduces (without naming it, the name is due to Bawly [6] and Khintchine [29]) the concept of infinite divisibility of a random variable $X$

\[ \forall n \quad \exists \text{ iid random variables } \xi_{i,n} \::\: X \sim \sum_{i=1}^n \xi_{i,n} \tag{1.2} \]

and asks for the general structure of infinitely divisible random variables. His paper contains two remarkable results on the characteristic function $\chi(\xi) = \mathbb{E}\,e^{i\xi\cdot X}$ of an infinitely divisible random variable (taken from [39]):

De Finetti’s first theorem. A random variable $X$ is infinitely divisible if, and only if, its characteristic function is of the form $\chi(\xi) = \lim_{n\to\infty} \exp\big[-p_n(1 - \phi_n(\xi))\big]$ where $p_n \geq 0$ and $\phi_n$ is a characteristic function.

De Finetti’s second theorem. The characteristic function of an infinitely divisible random variable $X$ is the limit of finite products of Poissonian characteristic functions

\[ \chi_n(\xi) = \exp\big[-p_n(1 - e^{i h_n \xi})\big], \]

and the converse is also true. In particular, all infinitely divisible laws are limits of convolutions of Poisson distributions.

Because of (1.1), $X_t$ is infinitely divisible and as such one can construct, in principle, all independent-increment processes $X_t$ as limits of sums of Poisson random variables. The contributions of Kolmogorov [31], Lévy [37] and Khintchine [28] show the exact form of the characteristic function of an infinitely divisible random variable

\[ -\log \mathbb{E}\,e^{i\xi\cdot X} = -i\,l\cdot\xi + \frac{1}{2}\,\xi\cdot Q\xi + \int_{y\neq 0} \Big(1 - e^{iy\cdot\xi} + i\xi\cdot y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(dy) \tag{1.3} \]

where $l \in \mathbb{R}^d$, $Q \in \mathbb{R}^{d\times d}$ is a positive semidefinite symmetric matrix, and $\nu$ is a measure on $\mathbb{R}^d\setminus\{0\}$ such that $\int_{y\neq 0} \min\{1, |y|^2\}\,\nu(dy) < \infty$. This is the famous Lévy–Khintchine formula. The exact knowledge of (1.3) makes it possible to find the approximating Poisson variables in de Finetti’s theorem explicitly, thus leading to a construction of $X_t$.

A little later, and without knowledge of de Finetti’s results, Lévy came up in his seminal paper [37] (see also [38, Chap. VII]) with a decomposition of $X_t$ in four independent components: a deterministic drift, a Gaussian part, the compensated small jumps and the large jumps $\Delta X_s := X_s - X_{s-}$. This is now known as Lévy–Itô decomposition:

\[ X_t = tl + \sqrt{Q}\,W_t + \lim_{\epsilon\to 0}\Bigg(\sum_{\substack{0 < s \leq t \\ \epsilon \leq |\Delta X_s| < 1}} \Delta X_s - t\int_{\epsilon \leq |y| < 1} y\,\nu(dy)\Bigg) + \sum_{\substack{0 < s \leq t \\ |\Delta X_s| \geq 1}} \Delta X_s \tag{1.4} \]
\[ \phantom{X_t} = tl + \sqrt{Q}\,W_t + \iint_{(0,t]\times B_1(0)} y\,\big(N(ds,dy) - ds\,\nu(dy)\big) + \iint_{(0,t]\times B_1(0)^c} y\,N(ds,dy). \tag{1.5} \]

Lévy uses results from the convergence of random series, notably Kolmogorov’s three series theorem, in order to explain the convergence of the series appearing in (1.4). A rigorous proof based on the representation (1.5) is due to Itô [23] who completed Lévy’s programme to construct $X_t$. The coefficients $l, Q, \nu$ are the same as in (1.3), $W$ is a $d$-dimensional standard Brownian motion, and $N_\omega((0,t]\times B)$ is the random measure $\#\{s \in (0,t] : X_s(\omega) - X_{s-}(\omega) \in B\}$ counting the jumps of $X$; it is a Poisson random variable with intensity $\mathbb{E}N((0,t]\times B) = t\nu(B)$ for all Borel sets $B \subset \mathbb{R}^d\setminus\{0\}$ such that $0 \notin \overline{B}$.

Nowadays there are at least six possible approaches to constructing processes with (stationary and) independent increments $X = (X_t)_{t\geq 0}$.

The de Finetti–Lévy(–Kolmogorov–Khintchine) construction. The starting point is the observation that each $X_t$ satisfies (1.1) and is, therefore, infinitely divisible. Thus, the characteristic exponent $-\log \mathbb{E}\,e^{i\xi\cdot X_t}$ is given by the Lévy–Khintchine formula (1.3), and using the triplet $(l, Q, \nu)$ one can construct a drift $lt$, a Brownian motion $\sqrt{Q}\,W_t$ and compound Poisson processes, i.e. Poisson processes whose intensities $y \in \mathbb{R}^d$ are mixed with respect to the finite measure $\nu_\epsilon(dy) := \mathbf{1}_{[\epsilon,\infty)}(|y|)\,\nu(dy)$. Using a suitable compensation (in the spirit of Kolmogorov’s three series theorem) of the small jumps, it is possible to show that the limit $\epsilon \to 0$ exists locally uniformly in $t$. A very clear presentation of this approach can be found in Breiman [10, Chapter 14.7–8], see also Chapter 7.
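The ingredients of this construction can be sketched numerically for a fixed truncation level $\epsilon$. The following Python fragment is only an illustration under assumptions that are not part of these notes: dimension $d = 1$, the one-sided Lévy measure $\nu(dy) = c\,y^{-1-\alpha}\,dy$ on $(0,\infty)$ with $\alpha \in (0,2)$, $\alpha \neq 1$, and the hypothetical function name `simulate_truncated_levy`. It adds a drift, a Brownian part and the compound Poisson process of all jumps of size $\geq \epsilon$, and subtracts the compensator of the jumps in $[\epsilon, 1)$.

```python
import numpy as np

def simulate_truncated_levy(t_grid, l, q, c, alpha, eps, rng):
    """Values on t_grid of  t*l + sqrt(q)*W_t + (jumps >= eps) - t * int_{eps<=y<1} y nu(dy)
    for the assumed Levy measure nu(dy) = c * y**(-1-alpha) dy on (0, inf), alpha != 1."""
    t_grid = np.asarray(t_grid, dtype=float)
    horizon = t_grid[-1]
    # Brownian part: independent Gaussian increments over the grid
    dt = np.diff(t_grid, prepend=0.0)
    w = np.cumsum(rng.normal(0.0, np.sqrt(dt)))
    # Compound Poisson part: nu_eps has finite total mass lam_eps
    lam_eps = c * eps ** (-alpha) / alpha
    n_jumps = rng.poisson(lam_eps * horizon)
    times = rng.uniform(0.0, horizon, size=n_jumps)
    # Jump sizes with law nu_eps / lam_eps (Pareto tail), by inversion
    sizes = eps * (1.0 - rng.random(n_jumps)) ** (-1.0 / alpha)
    jump_sum = np.array([sizes[times <= t].sum() for t in t_grid])
    # Compensator of the small jumps: int_eps^1 y * c * y**(-1-alpha) dy
    comp = c * (1.0 - eps ** (1.0 - alpha)) / (1.0 - alpha)
    return t_grid * l + np.sqrt(q) * w + jump_sum - t_grid * comp

grid = np.linspace(0.0, 1.0, 101)
path = simulate_truncated_levy(grid, l=0.5, q=1.0, c=1.0, alpha=1.5, eps=0.01,
                               rng=np.random.default_rng(7))
```

The sketch stops at a fixed $\epsilon$; the statement that the limit $\epsilon \to 0$ exists locally uniformly in $t$ is the content of the construction and is not checked by the code.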

The Lévy–Itô construction. This is currently the most popular approach to independent-increment processes, see e.g. Applebaum [2, Chapter 2.3–4] or Kyprianou [36, Chapter 2]. Originally the idea is due to Lévy [37], but Itô [23] gave the first rigorous construction. It is based on the observation that the jumps of a process with stationary and independent increments define a Poisson random measure $N_\omega([0,t]\times B)$ and this can be used to obtain the Lévy–Itô decomposition (1.5). The Lévy–Khintchine formula is then a corollary of the pathwise decomposition. Some of the best presentations can be found in Gikhman–Skorokhod [18, Chapter VI], Itô [24, Chapter 4.3] and Bretagnolle [11]. A proof based on additive functionals and martingale stochastic integrals is due to Kunita & Watanabe [35, Section 7]. We follow this approach in Chapter 9.

Variants of the Lévy–Itô construction. The Lévy–Itô decomposition (1.5) is, in fact, the semimartingale decomposition of a process with stationary and independent increments. Using the general theory of semimartingales – which heavily relies on general random measures – we can identify processes with independent increments as those semimartingales whose semimartingale characteristics are deterministic, cf. Jacod & Shiryaev [27, Chapter II.4c]. A further interesting derivation of the Lévy–Itô decomposition is based on stochastic integrals driven by martingales. The key is Itô’s formula and, again, the fact that the jumps of a process with stationary and independent increments define a Poisson random measure which can be used as a good stochastic integrator; this unique approach¹ can be found in Kunita [34, Chapter 2].

Kolmogorov’s construction. This is the classic construction of stochastic processes starting from the finite-dimensional distributions. For a process with stationary and independent increments these are given as iterated convolutions of the form

\[ \mathbb{E} f(X_{t_0}, \dots, X_{t_n}) = \int\!\cdots\!\int f(y_0, y_0 + y_1, \dots, y_0 + \dots + y_n)\, p_{t_0}(dy_0)\, p_{t_1 - t_0}(dy_1) \cdots p_{t_n - t_{n-1}}(dy_n) \]

with $p_t(dy) = \mathbb{P}(X_t \in dy)$ or $\int e^{i\xi\cdot y}\,p_t(dy) = \exp[-t\psi(\xi)]$ where $\psi$ is the characteristic exponent (1.3). Particularly nice presentations are those of Sato [51, Chapter 2.10–11] and Bauer [5, Chapter 37].

The invariance principle. Just as for a Brownian motion, it is possible to construct Lévy processes as limits of (suitably interpolated) random walks. For finite dimensional distributions this is done in Gikhman & Skorokhod [18, Chapter IX.6]; for the whole trajectory, i.e. in the space of càdlàg² functions $D[0,1]$ equipped with the Skorokhod topology, the proper references are Prokhorov [42] and Grimvall [19].

Random series constructions. A series representation of an independent-increment process $(X_t)_{t\in[0,1]}$ is an expression of the form

\[ X_t = \lim_{n\to\infty} \sum_{k=1}^n \big(J_k \mathbf{1}_{[0,t]}(U_k) - t c_k\big) \quad\text{a.s.} \]

The random variables $J_k$ represent the jumps, $U_k$ are iid uniform random variables and $c_k$ are suitable deterministic centering terms. Compared with the Lévy–Itô decomposition (1.4), the main difference is the fact that the jumps are summed over a deterministic index set $\{1, 2, \dots, n\}$ while the summation in (1.4) extends over the random set $\{s : |\Delta X_s| > 1/n\}$. In order to construct a process with characteristic exponent (1.3) where $l = 0$ and $Q = 0$, one considers a disintegration

\[ \nu(dy) = \int_0^\infty \sigma(r, dy)\,dr. \]

It is possible, cf. Rosiński [47], to choose $\sigma(r, dy) = \mathbb{P}(H(r, V_k) \in dy)$ where $V = (V_k)_{k\in\mathbb{N}}$ is any sequence of $d$-dimensional iid random variables and $H : (0,\infty)\times\mathbb{R}^d \to \mathbb{R}^d$ is measurable. Now let $\Gamma = (\Gamma_k)_{k\in\mathbb{N}}$ be a sequence of partial sums of iid standard exponential random variables and $U = (U_k)_{k\in\mathbb{N}}$ iid uniform random variables on $[0,1]$ such that $U, V, \Gamma$ are independent. Then

\[ J_k := H(\Gamma_k, V_k) \quad\text{and}\quad c_k = \int_{k-1}^{k}\int_{|y|<1} y\,\sigma(r, dy)\,dr \]

is the sought-for series representation, cf. Rosiński [47] and [46]. This approach is important if one wants to simulate independent-increment processes. Moreover, it still holds for Banach space valued random variables.

¹ It is reminiscent of the elegant use of Itô’s formula in Kunita and Watanabe’s proof of Lévy’s characterization of Brownian motion, see e.g. Schilling & Partzsch [56, Chapter 18.2].
² A French acronym meaning ‘right-continuous and finite limits from the left’.

2. Lévy processes

Throughout this chapter, $(\Omega, \mathcal{A}, \mathbb{P})$ is a fixed probability space, $t_0 = 0 \leq t_1 \leq \dots \leq t_n$ and $0 \leq s < t$ are positive real numbers, and $\xi_k, \eta_k$, $k = 1, \dots, n$, denote vectors from $\mathbb{R}^d$; we write $\xi\cdot\eta$ for the Euclidean scalar product.

Definition 2.1. A Lévy process $X = (X_t)_{t\geq 0}$ is a stochastic process $X_t : \Omega \to \mathbb{R}^d$ satisfying (L0)–(L3); this is to say that $X$ starts at zero, has stationary and independent increments and is continuous in probability.

One should understand Lévy processes as continuous-time versions of sums of iid random variables. This can easily be seen from the telescopic sum

\[ X_t - X_s = \sum_{k=1}^n \big(X_{t_k} - X_{t_{k-1}}\big), \quad s < t,\ n \in \mathbb{N}, \tag{2.1} \]

where $t_k = s + \frac{k}{n}(t - s)$. Since the increments $X_{t_k} - X_{t_{k-1}}$ are iid random variables, we see that all $X_t$ of a Lévy process are infinitely divisible, i.e. (1.2) holds. Many properties of a Lévy process will, therefore, resemble those of sums of iid random variables. Let us briefly discuss the conditions (L0)–(L3).

Remark 2.2. We have stated (L2) using the canonical filtration $\mathcal{F}_t^X := \sigma(X_r, r \leq t)$ of the process $X$. Often this condition is written in the following way

\[ X_{t_n} - X_{t_{n-1}}, \dots, X_{t_1} - X_{t_0} \text{ are independent random variables for all } n \in \mathbb{N},\ t_0 = 0 < t_1 < \dots < t_n. \tag{L2′} \]

It is easy to see that this is actually equivalent to (L2): From

\[ (X_{t_1}, \dots, X_{t_n}) \xleftrightarrow{\ \text{bi-measurable}\ } (X_{t_1} - X_{t_0}, \dots, X_{t_n} - X_{t_{n-1}}) \]

it follows that

\[ \begin{aligned} \mathcal{F}_t^X &= \sigma\big((X_{t_1}, \dots, X_{t_n}),\ 0 \leq t_1 \leq \dots \leq t_n \leq t\big) \\ &= \sigma\big((X_{t_1} - X_{t_0}, \dots, X_{t_n} - X_{t_{n-1}}),\ 0 = t_0 \leq t_1 \leq \dots \leq t_n \leq t\big) \\ &= \sigma\big(X_u - X_v,\ 0 \leq v \leq u \leq t\big), \end{aligned} \tag{2.2} \]

and we conclude that (L2) and (L2′) are indeed equivalent. The condition (L3) is equivalent to either of the following

• ‘$t \mapsto X_t$ is continuous in probability’;

• ‘$t \mapsto X_t$ is a.s. càdlàg¹’ (up to a modification of the process).

The equivalence with the first claim, and the direction ‘⇐’ of the second claim are easy:

\[ \lim_{u\to t} \mathbb{P}(|X_u - X_t| > \epsilon) = \lim_{|t-u|\to 0} \mathbb{P}(|X_{|t-u|}| > \epsilon) \leq \lim_{h\to 0} \frac{1}{\epsilon}\,\mathbb{E}(|X_h| \wedge \epsilon), \tag{2.3} \]

but it takes more effort to show that continuity in probability (L3) guarantees that almost all paths are càdlàg.² Usually, this is proved by controlling the oscillations of the paths of a Lévy process, cf. Sato [51, Theorem 11.1], or by the fundamental regularization theorem for submartingales, see Revuz & Yor [44, Theorem II.(2.5)] and Remark 11.2; in contrast to the general martingale setting [44, Theorem II.(2.9)], we do not need to augment the natural filtration because of (L1) and (L3). Since our construction of Lévy processes gives directly a càdlàg version, we do not go into further detail.

The condition (L3) has another consequence. Recall that the Cauchy–Abel functional equations have unique solutions if, say, $\phi$, $\psi$ and $\theta$ are (right-)continuous:

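\[ \left.\begin{aligned} \phi(s+t) &= \phi(s)\cdot\phi(t) \\ \psi(s+t) &= \psi(s) + \psi(t) \\ \theta(st) &= \theta(s)\cdot\theta(t) \end{aligned}\right\} \ (s, t \geq 0) \implies \left\{\begin{aligned} \phi(t) &= \phi(1)^t, \\ \psi(t) &= \psi(1)\cdot t, \\ \theta(t) &= t^c,\ c \geq 0. \end{aligned}\right. \tag{2.4} \]

The first equation is treated in Theorem A.1 in the appendix. For a thorough discussion on conditions ensuring uniqueness we refer to Aczél [1, Chapter 2.1].

Proposition 2.3. Let $(X_t)_{t\geq 0}$ be a Lévy process in $\mathbb{R}^d$. Then

\[ \mathbb{E}\,e^{i\xi\cdot X_t} = \big[\mathbb{E}\,e^{i\xi\cdot X_1}\big]^t, \quad t \geq 0,\ \xi \in \mathbb{R}^d. \tag{2.5} \]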

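Proof. Fix $s, t \geq 0$. We get

\[ \mathbb{E}\,e^{i\xi\cdot(X_{t+s} - X_s) + i\xi\cdot X_s} \overset{\text{(L2)}}{=} \mathbb{E}\,e^{i\xi\cdot(X_{t+s} - X_s)}\;\mathbb{E}\,e^{i\xi\cdot X_s} \overset{\text{(L1)}}{=} \mathbb{E}\,e^{i\xi\cdot X_t}\;\mathbb{E}\,e^{i\xi\cdot X_s}, \]

or $\phi(t+s) = \phi(t)\cdot\phi(s)$, if we write $\phi(t) = \mathbb{E}\,e^{i\xi\cdot X_t}$. Since $x \mapsto e^{i\xi\cdot x}$ is continuous, there is for every $\epsilon > 0$ some $\delta > 0$ such that

\[ |\phi(t) - \phi(s)| \leq \mathbb{E}\big|e^{i\xi\cdot(X_t - X_s)} - 1\big| \leq \epsilon + 2\,\mathbb{P}(|X_t - X_s| \geq \delta) = \epsilon + 2\,\mathbb{P}(|X_{|t-s|}| \geq \delta). \]

Thus, (L3) guarantees that $t \mapsto \phi(t)$ is continuous, and the claim follows from (2.4).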
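Identity (2.5) can be checked numerically in a simple case. The sketch below assumes that $X_t \sim \mathrm{Poi}(\lambda t)$, which is the law of the Poisson process of Example 3.2.c); the parameter values and the helper name `poisson_cf` are illustrative choices only.

```python
import numpy as np
from math import lgamma

def poisson_cf(u, lam, t, kmax=200):
    """Characteristic function of N_t ~ Poi(lam*t), summed directly from the pmf."""
    k = np.arange(kmax)
    log_pmf = -lam * t + k * np.log(lam * t) - np.array([lgamma(j + 1.0) for j in k])
    return np.sum(np.exp(log_pmf + 1j * u * k))

lam, u, t = 2.0, 0.7, 2.5
chi_1 = poisson_cf(u, lam, 1.0)     # E exp(i u X_1)
chi_t = poisson_cf(u, lam, t)       # E exp(i u X_t)
# chi_1 ** t uses the principal branch of the complex logarithm; this is
# harmless here because |arg chi_1| = lam * sin(u) < pi for these parameters.
power = chi_1 ** t
```

For parameters with $\lambda |\sin u| > \pi$ the principal branch would give the wrong power, so the comparison is only meaningful in the range stated in the comment.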

Notice that any solution f (t) of (2.4) also satisfies (L0)–(L2); by Proposition 2.3 Xt + f (t) is a Lévy process if, and only if, f (t) is continuous. On the other hand, Hamel, cf. [1, p. 35], constructed discontinuous (non-measurable and locally unbounded) solutions to (2.4). Thus, (L3) means that t 7→ Xt has no fixed discontinuities, i.e. all jumps occur at random times.

¹ ‘Right-continuous and finite limits from the left’
² More precisely: that there exists a modification of $X$ which has almost surely càdlàg paths.

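Corollary 2.4. The finite-dimensional distributions $\mathbb{P}(X_{t_1} \in dx_1, \dots, X_{t_n} \in dx_n)$ of a Lévy process are uniquely determined by

\[ \mathbb{E}\exp\Big(i\sum_{k=1}^n \xi_k\cdot X_{t_k}\Big) = \prod_{k=1}^n \Big[\mathbb{E}\exp\big(i(\xi_k + \dots + \xi_n)\cdot X_1\big)\Big]^{t_k - t_{k-1}} \tag{2.6} \]

for all $\xi_1, \dots, \xi_n \in \mathbb{R}^d$, $n \in \mathbb{N}$ and $0 = t_0 \leq t_1 \leq \dots \leq t_n$.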

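Proof. The left-hand side of (2.6) is just the characteristic function of $(X_{t_1}, \dots, X_{t_n})$. Consequently, the assertion follows once (2.6) is established. Using Proposition 2.3, we have

\[ \begin{aligned} \mathbb{E}\exp\Big(i\sum_{k=1}^n \xi_k\cdot X_{t_k}\Big) &= \mathbb{E}\exp\Big(i\sum_{k=1}^{n-2} \xi_k\cdot X_{t_k} + i(\xi_n + \xi_{n-1})\cdot X_{t_{n-1}} + i\xi_n\cdot(X_{t_n} - X_{t_{n-1}})\Big) \\ &\overset{\text{(L2), (L1)}}{=} \mathbb{E}\exp\Big(i\sum_{k=1}^{n-2} \xi_k\cdot X_{t_k} + i(\xi_n + \xi_{n-1})\cdot X_{t_{n-1}}\Big)\,\big[\mathbb{E}\,e^{i\xi_n\cdot X_1}\big]^{t_n - t_{n-1}}. \end{aligned} \]

Since the first half of the right-hand side has the same structure as the original expression, we can iterate this calculation and obtain (2.6).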

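It is not hard to invert the Fourier transform in (2.6). Writing $p_t(dx) := \mathbb{P}(X_t \in dx)$ we get

\[ \mathbb{P}(X_{t_1} \in B_1, \dots, X_{t_n} \in B_n) = \int\!\cdots\!\int \prod_{k=1}^n \mathbf{1}_{B_k}(x_1 + \dots + x_k)\, p_{t_k - t_{k-1}}(dx_k) \tag{2.7} \]
\[ \phantom{\mathbb{P}(X_{t_1} \in B_1, \dots, X_{t_n} \in B_n)} = \int\!\cdots\!\int \prod_{k=1}^n \mathbf{1}_{B_k}(y_k)\, p_{t_k - t_{k-1}}(dy_k - y_{k-1}). \tag{2.8} \]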
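Formula (2.7) says that a sample of $(X_{t_1}, \dots, X_{t_n})$ is obtained by adding independent increments with laws $p_{t_k - t_{k-1}}$. A minimal Python sketch of this recipe follows; the names `sample_fdd`, `draw_increment` and `brownian_with_drift` are illustrative assumptions and not part of these notes, and the Gaussian sampler corresponds to a one-dimensional Brownian motion with drift.

```python
import numpy as np

def sample_fdd(times, draw_increment, rng):
    """Sample (X_{t_1}, ..., X_{t_n}) by summing independent increments, cf. (2.7).

    draw_increment(dt, rng) must return one draw from p_dt, the law of X_dt
    (a hypothetical callback supplied by the user).
    """
    times = np.asarray(times, dtype=float)
    gaps = np.diff(times, prepend=0.0)            # t_k - t_{k-1} with t_0 = 0
    increments = np.array([draw_increment(dt, rng) for dt in gaps])
    return np.cumsum(increments)                  # x_1 + ... + x_k

def brownian_with_drift(l, q):
    """Increment sampler for X_t = t*l + sqrt(q)*W_t in dimension one."""
    return lambda dt, rng: l * dt + np.sqrt(q * dt) * rng.normal()

rng = np.random.default_rng(0)
path = sample_fdd([0.5, 1.0, 2.0], brownian_with_drift(l=1.0, q=0.25), rng)
```

Any other Lévy process can be plugged in by replacing the increment sampler, provided one can sample from $p_t$ for every $t$.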

Let us discuss the structure of the characteristic function $\chi(\xi) = \mathbb{E}\,e^{i\xi\cdot X_1}$ of $X_1$. From (2.1) we see that each random variable $X_t$ of a Lévy process is infinitely divisible. Clearly, $|\chi(\xi)|^2$ is the (real-valued) characteristic function of the symmetrization $\widetilde{X}_1 = X_1 - X_1'$ ($X_1'$ is an independent copy of $X_1$) and $\widetilde{X}_1$ is again infinitely divisible:

\[ \widetilde{X}_1 = \sum_{k=1}^n \big(\widetilde{X}_{\frac{k}{n}} - \widetilde{X}_{\frac{k-1}{n}}\big) = \sum_{k=1}^n \Big[\big(X_{\frac{k}{n}} - X_{\frac{k-1}{n}}\big) - \big(X'_{\frac{k}{n}} - X'_{\frac{k-1}{n}}\big)\Big]. \]

In particular, $|\chi|^2 = |\chi_{1/n}|^{2n}$ where $|\chi_{1/n}|^2$ is the characteristic function of $\widetilde{X}_{1/n}$. Since everything is real and $|\chi(\xi)| \leq 1$, we get

\[ \theta(\xi) := \lim_{n\to\infty} |\chi_{1/n}(\xi)|^2 = \lim_{n\to\infty} |\chi(\xi)|^{2/n}, \quad \xi \in \mathbb{R}^d, \]

which is 0 or 1 depending on $|\chi(\xi)| = 0$ or $|\chi(\xi)| > 0$, respectively. As $\chi(\xi)$ is continuous at $\xi = 0$ with $\chi(0) = 1$, we have $\theta \equiv 1$ in a neighbourhood $B_r(0)$ of 0. Now we can use Lévy’s continuity theorem (Theorem A.5) and conclude that the limiting function $\theta(\xi)$ is continuous everywhere, hence $\theta \equiv 1$. In particular, $\chi(\xi)$ has no zeroes.

Corollary 2.5. Let $(X_t)_{t\geq 0}$ be a Lévy process in $\mathbb{R}^d$. There exists a unique continuous function $\psi : \mathbb{R}^d \to \mathbb{C}$ such that

\[ \mathbb{E}\exp(i\xi\cdot X_t) = e^{-t\psi(\xi)}, \quad t \geq 0,\ \xi \in \mathbb{R}^d. \]

The function $\psi$ is called the characteristic exponent.

Proof. In view of Proposition 2.3 it is enough to consider $t = 1$. Set $\chi(\xi) := \mathbb{E}\exp(i\xi\cdot X_1)$. An obvious candidate for the exponent is $\psi(\xi) = -\log\chi(\xi)$, but with complex logarithms there is always the trouble which branch of the logarithm one should take. Let us begin with the uniqueness:

\[ e^{-\psi} = e^{-\phi} \implies e^{-(\psi - \phi)} = 1 \implies \psi(\xi) - \phi(\xi) = 2\pi i k_\xi \]

for some integer $k_\xi \in \mathbb{Z}$. Since $\phi, \psi$ are continuous and $\phi(0) = \psi(0) = 0$, we get $k_\xi \equiv 0$.

To prove the existence of the logarithm, it is not sufficient to take the principal branch of the logarithm. As we have seen above, $\chi(\xi)$ is continuous and has no zeroes, i.e. $\inf_{|\xi|\leq r} |\chi(\xi)| > 0$ for any $r > 0$; therefore, there is a ‘distinguished’, continuous³ version of the argument $\arg^\circ\chi(\xi)$ such that $\arg^\circ\chi(0) = 0$.

This allows us to take a continuous version of $\log\chi(\xi) = \log|\chi(\xi)| + i\arg^\circ\chi(\xi)$.
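The difference between the principal branch and the ‘distinguished’ continuous logarithm is easy to see numerically. The following sketch makes several assumptions that are not in the text: dimension one, a Poisson-type characteristic function $\chi(\xi) = \exp[-\lambda(1 - e^{i\xi})]$ with $\lambda = 5$, and a grid fine enough that the argument changes by less than $\pi$ between neighbouring points, so that `numpy.unwrap` can remove the artificial $2\pi$ jumps.

```python
import numpy as np

def distinguished_log(chi_values):
    """Continuous logarithm of chi along a grid that starts at xi = 0 (where chi = 1)."""
    arg = np.unwrap(np.angle(chi_values))     # remove the 2*pi jumps of the principal argument
    arg = arg - arg[0]                        # normalise so that arg(chi(0)) = 0
    return np.log(np.abs(chi_values)) + 1j * arg

lam = 5.0
xi = np.linspace(0.0, 2.0 * np.pi, 2001)
chi = np.exp(-lam * (1.0 - np.exp(1j * xi)))

psi_numeric = -distinguished_log(chi)         # continuous version of -log chi
psi_exact = lam * (1.0 - np.exp(1j * xi))     # known exponent for this chi
psi_principal = -np.log(chi)                  # principal branch, discontinuous here
```

Here the imaginary part of the exponent is $-\lambda\sin\xi$, which leaves $(-\pi, \pi]$ as soon as $\lambda > \pi$; this is why the principal branch fails while the unwrapped argument recovers the exponent.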

Corollary 2.6. Let $Y$ be an infinitely divisible random variable. Then there exists at most one⁴ Lévy process $(X_t)_{t\geq 0}$ such that $X_1 \sim Y$.

Proof. Since $X_1 \sim Y$, infinite divisibility is a necessary requirement for $Y$. On the other hand, Proposition 2.3 and Corollary 2.4 show how to construct the finite-dimensional distributions of a Lévy process, hence the process, from $X_1$.

So far, we have seen the following one-to-one correspondences

\[ (X_t)_{t\geq 0} \text{ Lévy process} \ \overset{1:1}{\longleftrightarrow}\ \mathbb{E}\,e^{i\xi\cdot X_1} \ \overset{1:1}{\longleftrightarrow}\ \psi(\xi) = -\log\mathbb{E}\,e^{i\xi\cdot X_1} \]

and the next step is to find all possible characteristic exponents. This will lead us to the Lévy–Khintchine formula.

³ A very detailed argument is given in Sato [51, Lemma 7.6], a completely different proof can be found in Dieudonné [16, Chapter IX, Appendix 2].
⁴ We will see in Chapter 7 how to construct this process. It is unique in the sense that its finite-dimensional distributions are uniquely determined by $Y$.

3. Examples

We begin with a useful alternative characterisation of Lévy processes.

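Theorem 3.1. Let $X = (X_t)_{t\geq 0}$ be a stochastic process with values in $\mathbb{R}^d$, $\mathbb{P}(X_0 = 0) = 1$ and $\mathcal{F}_t = \mathcal{F}_t^X = \sigma(X_r, r \leq t)$. The process $X$ is a Lévy process if, and only if, there exists an exponent $\psi : \mathbb{R}^d \to \mathbb{C}$ such that

\[ \mathbb{E}\big(e^{i\xi\cdot(X_t - X_s)} \,\big|\, \mathcal{F}_s\big) = e^{-(t-s)\psi(\xi)} \quad\text{for all } s < t,\ \xi \in \mathbb{R}^d. \tag{3.1} \]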

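Proof. If $X$ is a Lévy process, we get

\[ \mathbb{E}\big(e^{i\xi\cdot(X_t - X_s)} \,\big|\, \mathcal{F}_s\big) \overset{\text{(L2)}}{=} \mathbb{E}\,e^{i\xi\cdot(X_t - X_s)} \overset{\text{(L1)}}{=} \mathbb{E}\,e^{i\xi\cdot X_{t-s}} \overset{\text{Cor. 2.5}}{=} e^{-(t-s)\psi(\xi)}. \]

Conversely, assume that $X_0 = 0$ a.s. and (3.1) holds. Then

\[ \mathbb{E}\,e^{i\xi\cdot(X_t - X_s)} = e^{-(t-s)\psi(\xi)} = \mathbb{E}\,e^{i\xi\cdot(X_{t-s} - X_0)} \]

which shows $X_t - X_s \sim X_{t-s} - X_0 = X_{t-s}$, i.e. (L1).

For any $F \in \mathcal{F}_s$ we find from the tower property of conditional expectation

\[ \mathbb{E}\big(\mathbf{1}_F\cdot e^{i\xi\cdot(X_t - X_s)}\big) = \mathbb{E}\Big(\mathbf{1}_F\,\mathbb{E}\big[e^{i\xi\cdot(X_t - X_s)} \,\big|\, \mathcal{F}_s\big]\Big) = \mathbb{E}\mathbf{1}_F\cdot e^{-(t-s)\psi(\xi)}. \tag{3.2} \]

Observe that $e^{iu\mathbf{1}_F} = \mathbf{1}_{F^c} + e^{iu}\mathbf{1}_F$ for any $u \in \mathbb{R}$; since both $F$ and $F^c$ are in $\mathcal{F}_s$, we get

\[ \begin{aligned} \mathbb{E}\big(e^{iu\mathbf{1}_F}\,e^{i\xi\cdot(X_t - X_s)}\big) &= \mathbb{E}\big(\mathbf{1}_{F^c}\,e^{i\xi\cdot(X_t - X_s)}\big) + \mathbb{E}\big(\mathbf{1}_F\,e^{iu}\,e^{i\xi\cdot(X_t - X_s)}\big) \\ &\overset{(3.2)}{=} \mathbb{E}\big(\mathbf{1}_{F^c} + e^{iu}\mathbf{1}_F\big)\,e^{-(t-s)\psi(\xi)} \\ &\overset{(3.2)}{=} \mathbb{E}\,e^{iu\mathbf{1}_F}\;\mathbb{E}\,e^{i\xi\cdot(X_t - X_s)}. \end{aligned} \]

Thus, $\mathbf{1}_F \perp\!\!\!\perp (X_t - X_s)$ for any $F \in \mathcal{F}_s$, and (L2) follows.

Finally, $\lim_{t\to 0}\mathbb{E}\,e^{i\xi\cdot X_t} = \lim_{t\to 0} e^{-t\psi(\xi)} = 1$ proves that $X_t \to 0$ in distribution, hence in probability. This gives (L3).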

Theorem 3.1 allows us to give concrete examples of Lévy processes.

Example 3.2. The following processes are Lévy processes.

a) Drift in direction $l/|l|$, $l \in \mathbb{R}^d$, with speed $|l|$: $X_t = tl$ and $\psi(\xi) = -i\,l\cdot\xi$.


b) Brownian motion with (positive semi-definite) covariance matrix $Q \in \mathbb{R}^{d\times d}$: Let $(W_t)_{t\geq 0}$ be a standard Brownian motion on $\mathbb{R}^d$ and set $X_t := \sqrt{Q}\,W_t$. Then $\psi(\xi) = \frac{1}{2}\,\xi\cdot Q\xi$ and, if $Q$ is invertible, $\mathbb{P}(X_t \in dy) = (2\pi t)^{-d/2}(\det Q)^{-1/2}\exp(-y\cdot Q^{-1}y/2t)\,dy$.

c) Poisson process in $\mathbb{R}$ with jump height 1 and intensity $\lambda$. This is an integer-valued counting process $(N_t)_{t\geq 0}$ which increases by 1 after an independent exponential waiting time with mean $1/\lambda$. Thus,

\[ N_t = \sum_{k=1}^\infty \mathbf{1}_{[0,t]}(\tau_k), \quad \tau_k = \sigma_1 + \dots + \sigma_k, \quad \sigma_k \sim \mathrm{Exp}(\lambda) \text{ iid}. \]

Using this definition, it is a bit messy to show that $N$ is indeed a Lévy process (see e.g. Çinlar [12, Chapter 4]). We will give a different proof in Theorem 3.4 below. Usually, the first step is to show that its law is a Poisson distribution

\[ \mathbb{P}(N_t = k) = e^{-t\lambda}\,\frac{(\lambda t)^k}{k!}, \quad k = 0, 1, 2, \dots \]

(thus the name!) and from this one can calculate the characteristic exponent

\[ \mathbb{E}\,e^{iuN_t} = \sum_{k=0}^\infty e^{iuk}\,e^{-t\lambda}\,\frac{(\lambda t)^k}{k!} = e^{-t\lambda}\exp\big[\lambda t e^{iu}\big] = \exp\big[-t\lambda(1 - e^{iu})\big], \]

i.e. $\psi(u) = \lambda(1 - e^{iu})$. Mind that this is strictly weaker than (3.1) and does not prove that $N$ is a Lévy process.

d) Compound Poisson process in $\mathbb{R}^d$ with jump distribution $\mu$ and intensity $\lambda$. Let $N = (N_t)_{t\geq 0}$ be a Poisson process with intensity $\lambda$ and replace the jumps of size 1 by independent iid jumps of random height $H_1, H_2, \dots$ with values in $\mathbb{R}^d$ and $H_1 \sim \mu$. This is a compound Poisson process:

\[ C_t = \sum_{k=1}^{N_t} H_k, \quad H_k \sim \mu \text{ iid and independent of } (N_t)_{t\geq 0}. \]

We will see in Theorem 3.4 that compound Poisson processes are Lévy processes.
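The definitions in Example 3.2.c) and d) translate directly into a simulation. The sketch below is an illustration only: the function names, the Gaussian jump law and the truncation of the number of waiting times per path are assumptions made for this example. It builds $N_t$ from iid $\mathrm{Exp}(\lambda)$ waiting times and $C_t$ from the first $N_t$ jumps.

```python
import numpy as np

def poisson_counts(t, lam, n_paths, rng):
    """N_t = #{k : tau_k <= t}, tau_k = sigma_1 + ... + sigma_k, sigma_k ~ Exp(lam) iid."""
    # Number of waiting times per path; chosen so large that tau_m > t except
    # with negligible probability (a truncation assumption of this sketch).
    m = int(lam * t + 10.0 * np.sqrt(lam * t) + 20.0)
    sigma = rng.exponential(scale=1.0 / lam, size=(n_paths, m))   # mean 1/lam
    tau = np.cumsum(sigma, axis=1)
    return (tau <= t).sum(axis=1)

def compound_poisson(t, lam, draw_jumps, n_paths, rng):
    """C_t = H_1 + ... + H_{N_t}; draw_jumps(rng, shape) samples iid jumps H_k ~ mu."""
    n = poisson_counts(t, lam, n_paths, rng)
    m = int(n.max())
    h = draw_jumps(rng, (n_paths, m))
    keep = np.arange(m) < n[:, None]              # use only the first N_t jumps
    return n, (h * keep).sum(axis=1)

rng = np.random.default_rng(42)
counts, values = compound_poisson(
    t=3.0, lam=2.0,
    draw_jumps=lambda rng, shape: rng.normal(1.0, 0.5, size=shape),
    n_paths=20000, rng=rng)
```

With these parameters the empirical mean and variance of `counts` should both be close to $\lambda t = 6$, in line with the Poisson law of $N_t$, and the empirical mean of `values` should be close to $\lambda t\,\mathbb{E}H_1 = 6$.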

Let us show that the Poisson and compound Poisson processes are Lévy processes. For this we need the following auxiliary result. Since $t \mapsto C_t$ is a step function, the Riemann–Stieltjes integral $\int f(u)\,dC_u$ is well-defined.

Lemma 3.3 (Campbell’s formula). Let $C_t = H_1 + \dots + H_{N_t}$ be a compound Poisson process as in Example 3.2.d) with iid jumps $H_k \sim \mu$ and an independent Poisson process $(N_t)_{t\geq 0}$ with intensity $\lambda$. Then

\[ \mathbb{E}\exp\Big(i\int_0^\infty f(t+s)\,dC_t\Big) = \exp\Big(\lambda\int_0^\infty\!\!\int_{y\neq 0} \big(e^{iyf(s+t)} - 1\big)\,\mu(dy)\,dt\Big) \tag{3.3} \]

holds for all $s \geq 0$ and bounded measurable functions $f : [0,\infty) \to \mathbb{R}^d$ with compact support.

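Proof. Set $\tau_k = \sigma_1 + \dots + \sigma_k$ where $\sigma_k \sim \mathrm{Exp}(\lambda)$ are iid. Then

\[ \begin{aligned} \phi(s) &:= \mathbb{E}\exp\Big(i\int_0^\infty f(s+t)\,dC_t\Big) \\ &= \mathbb{E}\exp\Big(i\sum_{k=1}^\infty f(s + \sigma_1 + \dots + \sigma_k)H_k\Big) \\ &\overset{\text{iid}}{=} \int_0^\infty \underbrace{\mathbb{E}\exp\Big(i\sum_{k=2}^\infty f(s + x + \sigma_2 + \dots + \sigma_k)H_k\Big)}_{=\phi(s+x)}\;\underbrace{\mathbb{E}\exp\big(if(s+x)H_1\big)}_{=:\gamma(s+x)}\;\underbrace{\mathbb{P}(\sigma_1 \in dx)}_{=\lambda e^{-\lambda x}\,dx} \\ &= \lambda\int_0^\infty \phi(s+x)\gamma(s+x)e^{-\lambda x}\,dx \\ &= \lambda e^{\lambda s}\int_s^\infty \gamma(t)\phi(t)e^{-\lambda t}\,dt. \end{aligned} \]

This is equivalent to

\[ e^{-\lambda s}\phi(s) = \lambda\int_s^\infty \big(\phi(t)e^{-\lambda t}\big)\gamma(t)\,dt \]

and $\phi(\infty) = 1$ since $f$ has compact support. This integral equation has a unique solution; it is now a routine exercise to verify that the right-hand side of (3.3) is indeed a solution.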

Theorem 3.4. Let $C_t = H_1 + \dots + H_{N_t}$ be a compound Poisson process as in Example 3.2.d) with iid jumps $H_k \sim \mu$ and an independent Poisson process $(N_t)_{t\geq 0}$ with intensity $\lambda$. Then $(C_t)_{t\geq 0}$ (and also $(N_t)_{t\geq 0}$) is a $d$-dimensional Lévy process with characteristic exponent

\[ \psi(\xi) = \lambda\int_{y\neq 0} \big(1 - e^{iy\cdot\xi}\big)\,\mu(dy). \tag{3.4} \]

Proof. Since the trajectories of $t \mapsto C_t$ are càdlàg step functions with $C_0 = 0$, the properties (L0) and (L3), see (2.3), are satisfied. We will show (L1) and (L2). Let $\xi_k \in \mathbb{R}^d$, $0 = t_0 \leq \dots \leq t_n$, and $a < b$. Then the Riemann–Stieltjes integral

\[ \int_0^\infty \mathbf{1}_{(a,b]}(t)\,dC_t = \sum_{k=1}^\infty \mathbf{1}_{(a,b]}(\tau_k)H_k = C_b - C_a \]

exists. We apply the Campbell formula (3.3) to the function

\[ f(t) := \sum_{k=1}^n \xi_k\,\mathbf{1}_{(t_{k-1},t_k]}(t) \]

and with $s = 0$. Then the left-hand side of (3.3) becomes the characteristic function of the increments

\[ \mathbb{E}\exp\Big(i\sum_{k=1}^n \xi_k\cdot(C_{t_k} - C_{t_{k-1}})\Big), \]

while the right-hand side is equal to

\[ \begin{aligned} \exp\Big[\lambda\int_{y\neq 0}\sum_{k=1}^n\int_{t_{k-1}}^{t_k} \big(e^{i\xi_k\cdot y} - 1\big)\,dt\,\mu(dy)\Big] &= \prod_{k=1}^n \exp\Big[\lambda(t_k - t_{k-1})\int_{y\neq 0} \big(e^{i\xi_k\cdot y} - 1\big)\,\mu(dy)\Big] \\ &= \prod_{k=1}^n \mathbb{E}\exp\big[i\xi_k\cdot C_{t_k - t_{k-1}}\big] \end{aligned} \]

(use Campbell’s formula with $n = 1$ for the last equality). This shows that the increments are independent, i.e. (L2′) holds, as well as (L1): $C_{t_k} - C_{t_{k-1}} \sim C_{t_k - t_{k-1}}$.

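If $d = 1$ and $H_k \sim \delta_1$, $C_t$ is a Poisson process.

Denote by $\mu^{*k}$ the $k$-fold convolution of the measure $\mu$; as usual, $\mu^{*0} := \delta_0$.

Corollary 3.5. Let $(N_t)_{t\geq 0}$ be a Poisson process with intensity $\lambda$ and $C_t = H_1 + \dots + H_{N_t}$ a compound Poisson process with iid jumps $H_k \sim \mu$. Then, for all $t \geq 0$,

\[ \mathbb{P}(N_t = k) = e^{-\lambda t}\,\frac{(\lambda t)^k}{k!}, \quad k = 0, 1, 2, \dots \tag{3.5} \]
\[ \mathbb{P}(C_t \in B) = e^{-\lambda t}\sum_{k=0}^\infty \frac{(\lambda t)^k}{k!}\,\mu^{*k}(B), \quad B \subset \mathbb{R}^d \text{ Borel}. \tag{3.6} \]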

Proof. If we use Theorem 3.4 for d = 1 and µ = δ1, we see that the characteristic function of Nt is iu χt (u) = exp[−λt(1−e )]. Since this is also the characteristic function of the Poisson distribution

(i.e. the r.h.s. of (3.5)), we get Nt ∼ Poi(λt).

Since (Hk)k∈N ⊥⊥(Nt )t>0, we have for any Borel set B

∞ P P (Ct ∈ B) = ∑ (Ct ∈ B, Nt = k) k=0 ∞ P P P = δ0(B) (Nt = 0) + ∑ (H1 + ··· + Hk ∈ B) (Nt = k) k=1 ∞ (λt)k = e−λt ∑ µ∗k(B). k=0 k!
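Theorem 3.4 and Corollary 3.5 can be checked by simulation. The following is only a minimal sketch under stated assumptions: the jump law $\mu = N(0,1)$, the parameter values and all function names are illustrative choices, not part of the notes. The paths are built from iid $\mathrm{Exp}(\lambda)$ waiting times, exactly as $\tau_k = \sigma_1+\dots+\sigma_k$ in the proof of Campbell's formula.

```python
import cmath
import math
import random


def sample_compound_poisson(t, lam, draw_jump, rng):
    """Return (N_t, C_t), built from iid Exp(lam) waiting times."""
    n_jumps, value = 0, 0.0
    arrival = rng.expovariate(lam)          # tau_1 = sigma_1
    while arrival <= t:
        n_jumps += 1
        value += draw_jump(rng)             # add the jump H_k at time tau_k
        arrival += rng.expovariate(lam)     # tau_{k+1} = tau_k + sigma_{k+1}
    return n_jumps, value


def poisson_pmf(k, mean):
    """Right-hand side of (3.5) with mean = lam * t."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def compare_with_theory(t=2.0, lam=1.5, xi=0.7, n_paths=20000, seed=0):
    rng = random.Random(seed)
    draw_jump = lambda r: r.gauss(0.0, 1.0)   # assumption: mu = N(0, 1)
    paths = [sample_compound_poisson(t, lam, draw_jump, rng) for _ in range(n_paths)]

    # Largest gap between the empirical law of N_t and (3.5) for k = 0..5.
    pmf_gap = max(
        abs(sum(1 for n, _ in paths if n == k) / n_paths - poisson_pmf(k, lam * t))
        for k in range(6)
    )

    # Empirical characteristic function of C_t against exp(-t psi(xi)), psi as in (3.4).
    empirical_cf = sum(cmath.exp(1j * xi * c) for _, c in paths) / n_paths
    psi = lam * (1.0 - math.exp(-xi**2 / 2.0))   # integral of (1 - e^{i y xi}) against N(0, 1)
    cf_gap = abs(empirical_cf - cmath.exp(-t * psi))
    return pmf_gap, cf_gap


if __name__ == "__main__":
    print(compare_with_theory())
```

Both gaps are Monte Carlo errors of order $n^{-1/2}$ in the number of paths, so they shrink slowly; the sketch illustrates the statements and does not replace the proof.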

Example 3.2 contains the basic Lévy processes which will also be the building blocks for all Lévy processes. In order to define more specialized Lévy processes, we need further assumptions on the distributions of the random variables Xt .

Definition 3.6. Let $(X_t)_{t\geq 0}$ be a stochastically continuous process in $\mathbb{R}^d$. It is called self-similar, if
\[ \forall a > 0\ \ \exists b = b(a)\ :\ (X_{at})_{t\geq 0}\sim(bX_t)_{t\geq 0} \tag{3.7} \]
in the sense that both sides have the same finite-dimensional distributions.

Lemma 3.7 (Lamperti). If $(X_t)_{t\geq 0}$ is self-similar and non-degenerate, then there exists a unique index of self-similarity $H\geq 0$ such that $b(a) = a^H$. If $(X_t)_{t\geq 0}$ is a self-similar Lévy process, then $H\geq\frac12$.

Proof. Since $(X_t)_{t\geq 0}$ is self-similar, we find for $a,a'>0$ and each $t>0$
\[ b(aa')X_t\sim X_{aa't}\sim b(a)X_{a't}\sim b(a)b(a')X_t, \]
and so $b(aa') = b(a)b(a')$ as $X_t$ is non-degenerate.¹ By the convergence of types theorem (Theorem A.6) and the continuity in probability of $t\mapsto X_t$ we see that $a\mapsto b(a)$ is continuous. Thus, the Cauchy functional equation $b(aa') = b(a)b(a')$ has the unique continuous solution $b(a) = a^H$ for some $H\geq 0$.

Assume now that $(X_t)_{t\geq 0}$ is a Lévy process. We are going to show that $H\geq\frac12$. Using self-similarity and the properties (L1), (L2) we get (primes always denote iid copies of the respective random variables)
\[ (n+m)^H X_1\sim X_{n+m} = (X_{n+m}-X_m)+X_m\sim X_n''+X_m'\sim n^H X_1''+m^H X_1'. \tag{3.8} \]

Any standard normal random variable $X_1$ satisfies (3.8) with $H = \frac12$. On the other hand, if $X_1$ has a second moment, we get $(n+m)\mathbb{V}X_1 = \mathbb{V}X_{n+m} = \mathbb{V}X_n''+\mathbb{V}X_m' = n\mathbb{V}X_1''+m\mathbb{V}X_1'$ by Bienaymé's identity for variances, i.e. (3.8) can only hold with $H = \frac12$. Thus, any self-similar $X_1$ with finite second moment has to satisfy (3.8) with $H = \frac12$. If we can show that $H<\frac12$ implies the existence of a second moment, we have reached a contradiction.

If $X_n$ is symmetric and $H<\frac12$, we find because of $X_n\sim n^H X_1$ some $u>0$ such that
\[ \mathbb{P}(|X_n|>un^H) = \mathbb{P}(|X_1|>u) < \frac14. \]
By the symmetrization inequality (Theorem A.7),
\[ \frac12\Big(1-\exp\big\{-n\,\mathbb{P}(|X_1|>un^H)\big\}\Big)\leq\mathbb{P}(|X_n|>un^H)<\frac14, \]
which means that $n\,\mathbb{P}(|X_1|>un^H)\leq c$ for all $n\in\mathbb{N}$. Thus, $\mathbb{P}(|X_1|>x)\leq c'x^{-1/H}$ for all $x>u+1$, and so
\[ \mathbb{E}|X_1|^2 = 2\int_0^\infty x\,\mathbb{P}(|X_1|>x)\,dx\leq(u+1)^2+2c'\int_{u+1}^\infty x^{1-1/H}\,dx<\infty \]
as $H<\frac12$. If $X_n$ is not symmetric, we use its symmetrization $X_n-X_n'$ where $X_n'$ are iid copies of $X_n$.

Definition 3.8. A random variable $X$ is called stable if
\[ \forall n\in\mathbb{N}\ \ \exists b_n>0,\ c_n\in\mathbb{R}^d\ :\ X_1'+\dots+X_n'\sim b_nX+c_n \tag{3.9} \]
where $X_1',\dots,X_n'$ are iid copies of $X$. If (3.9) holds with $c_n = 0$, the random variable is called strictly stable. A Lévy process $(X_t)_{t\geq 0}$ is (strictly) stable if $X_1$ is a (strictly) stable random variable.

¹ We use here that $bX\sim cX\implies b = c$ if $X$ is non-degenerate. To see this, set $\chi(\xi) = \mathbb{E}e^{i\xi\cdot X}$ and notice
\[ |\chi(\xi)| = \big|\chi\big(\tfrac bc\,\xi\big)\big| = \dots = \big|\chi\big(\big(\tfrac bc\big)^n\xi\big)\big|. \]
If $b<c$, the right-hand side converges for $n\to\infty$ to $\chi(0) = 1$, hence $|\chi|\equiv 1$, contradicting the fact that $X$ is non-degenerate. Since $b,c$ play symmetric roles, we conclude that $b = c$.

Note that the symmetrization $X-X'$ of a stable random variable is strictly stable. Setting $\chi(\xi) = \mathbb{E}e^{i\xi\cdot X}$ it is easy to see that (3.9) is equivalent to
\[ \forall n\in\mathbb{N}\ \ \exists b_n>0,\ c_n\in\mathbb{R}^d\ :\ \chi(\xi)^n = \chi(b_n\xi)\,e^{ic_n\cdot\xi}. \tag{3.9′} \]

Example 3.9. a) Stable processes. By definition, any stable random variable is infinitely divisible, and for every stable $X$ there is a unique Lévy process on $\mathbb{R}^d$ such that $X_1\sim X$, cf. Corollary 2.6.

A Lévy process $(X_t)_{t\geq 0}$ is stable if, and only if, all random variables $X_t$ are stable. This follows at once from (3.9′) if we use $\chi_t(\xi) := \mathbb{E}e^{i\xi\cdot X_t}$:
\[ \chi_t(\xi)^n\overset{(2.5)}{=}\chi_1(\xi)^{nt}\overset{(3.9')}{=}\chi_1(b_n\xi)^t\,e^{i(tc_n)\cdot\xi}\overset{(2.5)}{=}\chi_t(b_n\xi)\,e^{i(tc_n)\cdot\xi}. \]
It is possible to determine the characteristic exponent of a stable process, cf. Sato [51, Theorem 14.10] and (3.10) further down.

b) Self-similar processes. Assume that $(X_t)_{t\geq 0}$ is a self-similar Lévy process. Then
\[ \forall n\in\mathbb{N}\ :\ b(n)X_1\sim X_n = \sum_{k=1}^n(X_k-X_{k-1})\sim X_{1,n}'+\dots+X_{n,n}' \]
where the $X_{k,n}'$ are iid copies of $X_1$. This shows that $X_1$, hence $(X_t)_{t\geq 0}$, is strictly stable. In fact, the converse is also true:

c) A strictly stable Lévy process is self-similar. We have already seen in b) that self-similar Lévy processes are strictly stable. Assume now that $(X_t)_{t\geq 0}$ is strictly stable. Since $X_{nt}\sim b_nX_t$ we get
\[ e^{-nt\psi(\xi)} = \mathbb{E}e^{i\xi\cdot X_{nt}} = \mathbb{E}e^{ib_n\xi\cdot X_t} = e^{-t\psi(b_n\xi)}. \]
Taking $n = m$, $t\rightsquigarrow t/m$ and $\xi\rightsquigarrow b_m^{-1}\xi$ we see
\[ e^{-\frac tm\psi(\xi)} = e^{-t\psi(b_m^{-1}\xi)}. \]
From these equalities we obtain for $q = n/m\in\mathbb{Q}^+$ and $b(q) := b_n/b_m$
\[ e^{-qt\psi(\xi)} = e^{-t\psi(b(q)\xi)}\implies X_{qt}\sim b(q)X_t\implies X_{at}\sim b(a)X_t \]
for all $t>0$ because of the continuity in probability of $(X_t)_{t\geq 0}$. Since, by Corollary 2.4, the finite-dimensional distributions are determined by the one-dimensional distributions, we conclude that (3.7) holds.

This means, in particular, that strictly stable Lévy processes have an index of self-similarity $H\geq\frac12$. It is common to call $\alpha = 1/H\in(0,2]$ the index of stability of $(X_t)_{t\geq 0}$, and we have $X_{nt}\sim n^{1/\alpha}X_t$.

If $X$ is 'only' stable, its symmetrization is strictly stable and, thus, every stable Lévy process has an index $\alpha\in(0,2]$. It plays an important role for the characteristic exponent. For a general stable process the characteristic exponent is of the form
\[ \psi(\xi) = \begin{cases}\displaystyle\int_{S^d}|z\cdot\xi|^\alpha\Big(1-i\,\mathrm{sgn}(z\cdot\xi)\tan\frac{\alpha\pi}{2}\Big)\,\sigma(dz)-i\mu\cdot\xi, & (\alpha\neq 1),\\[2ex] \displaystyle\int_{S^d}|z\cdot\xi|\Big(1+\frac2\pi\,i\,\mathrm{sgn}(z\cdot\xi)\log|z\cdot\xi|\Big)\,\sigma(dz)-i\mu\cdot\xi, & (\alpha = 1),\end{cases} \tag{3.10} \]
where $\sigma$ is a finite measure on $S^d$ and $\mu\in\mathbb{R}^d$. The strictly stable exponents have $\mu = 0$ (if $\alpha\neq 1$) and $\int_{S^d}z_k\,\sigma(dz) = 0$, $k = 1,\dots,d$ (if $\alpha = 1$). These formulae can be derived from the general Lévy–Khintchine formula; a good reference is the monograph by Samorodnitsky & Taqqu [48, Chapters 2.3–4].
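The scaling relation $X_{nt}\sim n^{1/\alpha}X_t$ is equivalent to $n\psi(\xi) = \psi(n^{1/\alpha}\xi)$, and this can be verified numerically for the exponent (3.10). The sketch below is restricted to $d = 1$, $\alpha\neq 1$ and $\mu = 0$; in one dimension the unit sphere consists of the two points $\pm1$, and the masses `sigma_plus`, `sigma_minus` as well as all function names are hypothetical choices made for this illustration.

```python
import math


def stable_exponent(xi, alpha, sigma_plus, sigma_minus, mu=0.0):
    """One-dimensional version of (3.10) for alpha != 1.

    The 'sphere' in R consists of the points +1 and -1, which carry the
    masses sigma_plus and sigma_minus (illustrative parameter names).
    """
    if alpha == 1:
        raise ValueError("this sketch only covers alpha != 1")
    tan_term = math.tan(alpha * math.pi / 2.0)
    total = 0j
    for z, mass in ((1.0, sigma_plus), (-1.0, sigma_minus)):
        s = z * xi
        sign = (s > 0) - (s < 0)
        total += mass * abs(s) ** alpha * (1.0 - 1j * sign * tan_term)
    return total - 1j * mu * xi


def scaling_gap(alpha, n, xi, sigma_plus=0.8, sigma_minus=0.3):
    """|n psi(xi) - psi(n^{1/alpha} xi)| in the strictly stable case mu = 0."""
    lhs = n * stable_exponent(xi, alpha, sigma_plus, sigma_minus)
    rhs = stable_exponent(n ** (1.0 / alpha) * xi, alpha, sigma_plus, sigma_minus)
    return abs(lhs - rhs)


if __name__ == "__main__":
    for alpha in (0.5, 1.5, 2.0):
        print(alpha, scaling_gap(alpha, n=5, xi=-1.3))
```

Up to rounding errors the gap vanishes for every admissible $\alpha$, whereas a non-zero drift $\mu$ breaks the identity unless $\alpha = 1$, in line with the characterization of the strictly stable exponents.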

If $X$ is strictly stable such that the distribution of $X_t$ is rotationally invariant, it is clear that $\psi(\xi) = c|\xi|^\alpha$. If $X_t$ is symmetric, i.e. $X_t\sim-X_t$, then $\psi(\xi) = \int_{S^d}|z\cdot\xi|^\alpha\,\sigma(dz)$ for some finite, symmetric measure $\sigma$ on the unit sphere $S^d\subset\mathbb{R}^d$.

Let us finally show Kolmogorov's proof of the Lévy–Khintchine formula for one-dimensional Lévy processes admitting second moments. We need the following auxiliary result.

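Lemma 3.10. Let $(X_t)_{t\geq 0}$ be a Lévy process on $\mathbb{R}$. If $\mathbb{V}X_1<\infty$, then $\mathbb{V}X_t<\infty$ for all $t>0$ and
\[ \mathbb{E}X_t = t\,\mathbb{E}X_1 =: t\mu\quad\text{and}\quad\mathbb{V}X_t = t\,\mathbb{V}X_1 =: t\sigma^2. \]

Proof. If $\mathbb{V}X_1<\infty$, then $\mathbb{E}|X_1|<\infty$. With Bienaymé's identity, we get
\[ \mathbb{V}X_m = \sum_{k=1}^m\mathbb{V}(X_k-X_{k-1}) = m\,\mathbb{V}X_1\quad\text{and}\quad\mathbb{V}X_1 = n\,\mathbb{V}X_{1/n}. \]
In particular, $\mathbb{V}X_m,\mathbb{V}X_{1/n}<\infty$. This, and a similar argument for the expectation, show
\[ \mathbb{V}X_q = q\,\mathbb{V}X_1\quad\text{and}\quad\mathbb{E}X_q = q\,\mathbb{E}X_1\quad\text{for all } q\in\mathbb{Q}^+. \]
Moreover, $\mathbb{V}(X_q-X_r) = \mathbb{V}X_{q-r} = (q-r)\mathbb{V}X_1$ for all rational numbers $r\leq q$, and this shows that $X_q-\mathbb{E}X_q = X_q-q\mu$ converges in $L^2$ as $q\to t$. Since $t\mapsto X_t$ is continuous in probability, we can identify the limit and find $X_q-q\mu\to X_t-t\mu$. Consequently, $\mathbb{V}X_t = t\sigma^2$ and $\mathbb{E}X_t = t\mu$.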
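For the Poisson process the statement of Lemma 3.10 can be read off from (3.5): both the mean and the variance of $N_t$ equal $\lambda t$. The following sketch sums the moments of the law (3.5) term by term and compares them with $t\,\mathbb{E}N_1$ and $t\,\mathbb{V}N_1$; the intensity, the time points and the function names are assumptions made for the illustration.

```python
import math


def poisson_moments(mean, n_terms=200):
    """Mean and variance of the law (3.5), summed term by term."""
    weight = math.exp(-mean)            # P(N_t = 0)
    first, second = 0.0, 0.0
    for k in range(1, n_terms):
        weight *= mean / k              # P(N_t = k) obtained from P(N_t = k - 1)
        first += k * weight
        second += k * k * weight
    return first, second - first**2


def check_linearity(lam=1.7, times=(0.25, 0.5, 1.0, 3.0)):
    """Return (t, |E N_t - t E N_1|, |V N_t - t V N_1|) for each t."""
    mu, sigma_sq = poisson_moments(lam * 1.0)      # E N_1 and V N_1
    return [
        (t, abs(m - t * mu), abs(v - t * sigma_sq))
        for t in times
        for m, v in [poisson_moments(lam * t)]
    ]


if __name__ == "__main__":
    for row in check_linearity():
        print(row)
```

The truncation after 200 terms is harmless for the small parameters used here; for large $\lambda t$ more terms would be needed.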

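We have seen in Proposition 2.3 that the characteristic function of a Lévy process is of the form
\[ \chi_t(\xi) = \mathbb{E}e^{i\xi X_t} = \big[\mathbb{E}e^{i\xi X_1}\big]^t = \chi_1(\xi)^t. \]
Let us assume that $X$ is real-valued and has finite (first and) second moments $\mathbb{V}X_1 = \sigma^2$ and $\mathbb{E}X_1 = \mu$. By Taylor's formula
\begin{align*}
\mathbb{E}e^{i\xi(X_t-t\mu)} &= \mathbb{E}\left[1+i\xi(X_t-t\mu)-\int_0^1\xi^2(X_t-t\mu)^2(1-\theta)e^{i\theta\xi(X_t-t\mu)}\,d\theta\right]\\
&= 1-\mathbb{E}\left[\xi^2(X_t-t\mu)^2\int_0^1(1-\theta)e^{i\theta\xi(X_t-t\mu)}\,d\theta\right].
\end{align*}
Since
\[ \left|\int_0^1(1-\theta)e^{i\theta\xi(X_t-t\mu)}\,d\theta\right|\leq\int_0^1(1-\theta)\,d\theta = \frac12, \]
we get
\[ \big|\mathbb{E}e^{i\xi X_t}\big| = \big|\mathbb{E}e^{i\xi(X_t-t\mu)}\big|\geq 1-\frac{\xi^2}{2}\,t\sigma^2. \]
Thus, $\chi_{1/n}(\xi)\neq 0$ if $n\geq N(\xi)\in\mathbb{N}$ is large, hence $\chi_1(\xi) = \chi_{1/n}(\xi)^n\neq 0$. For $\xi\in\mathbb{R}$ we find (using a suitable branch of the complex logarithm)
\begin{align*}
\psi(\xi) := -\log\chi_1(\xi) &= -\frac{\partial}{\partial t}\big[\chi_1(\xi)\big]^t\Big|_{t=0}\\
&= \lim_{t\to 0}\frac{1-\mathbb{E}e^{i\xi X_t}}{t}\\
&= \lim_{t\to 0}\frac1t\int_{-\infty}^\infty\big(1-e^{iy\xi}+iy\xi\big)\,p_t(dy)-i\xi\mu\\
&= \lim_{t\to 0}\int_{-\infty}^\infty\frac{1-e^{iy\xi}+iy\xi}{y^2}\,\pi_t(dy)-i\xi\mu \tag{3.11}
\end{align*}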

where $p_t(dy) = \mathbb{P}(X_t\in dy)$ and $\pi_t(dy) := y^2t^{-1}p_t(dy)$. Yet another application of Taylor's theorem shows that the integrand in the above integral is bounded, vanishes at infinity, and admits a continuous extension onto the whole real line if we choose the value $\frac12\xi^2$ at $y = 0$. The family $(\pi_t)_{t\in(0,1]}$ is uniformly bounded,
\[ \frac1t\int y^2\,p_t(dy) = \frac1t\,\mathbb{E}(X_t^2) = \frac1t\Big(\mathbb{V}X_t+\big[\mathbb{E}X_t\big]^2\Big) = \sigma^2+t\mu^2\xrightarrow[t\to 0]{}\sigma^2, \]
hence sequentially vaguely relatively compact (see Theorem A.3). We conclude that every sequence $(\pi_{t(n)})_{n\in\mathbb{N}}\subset(\pi_t)_{t\in(0,1]}$ with $t(n)\to 0$ as $n\to\infty$ has a vaguely convergent subsequence. But since the limit (3.11) exists, all subsequential limits coincide² which means that $\pi_t$ converges vaguely to a finite measure $\pi$ on $\mathbb{R}$. This proves that
\[ \psi(\xi) = -\log\chi_1(\xi) = \int_{-\infty}^\infty\frac{1-e^{iy\xi}+iy\xi}{y^2}\,\pi(dy)-i\xi\mu \]
for some finite measure $\pi$ on $(-\infty,\infty)$ with total mass $\pi(\mathbb{R}) = \sigma^2$. This is sometimes called the de Finetti–Kolmogorov formula. If we set $\nu(dy) := y^{-2}\mathbf{1}_{\{y\neq 0\}}\,\pi(dy)$ and $\sigma_0^2 := \pi\{0\}$, we obtain the Lévy–Khintchine formula
\[ \psi(\xi) = -i\mu\xi+\frac12\sigma_0^2\xi^2+\int_{y\neq 0}\big(1-e^{iy\xi}+iy\xi\big)\,\nu(dy) \]
where $\sigma^2 = \sigma_0^2+\int_{y\neq 0}y^2\,\nu(dy)$.
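As a sanity check, consider the Poisson process with intensity $\lambda$: here $\mu = \sigma^2 = \lambda$, and since $p_t\{1\}\approx\lambda t$ for small $t$ one expects $\pi = \lambda\delta_1$, i.e. $\nu = \lambda\delta_1$ and $\sigma_0^2 = 0$. The sketch below confirms that the Lévy–Khintchine formula then reproduces the exponent (3.4), and that the difference quotient in (3.11) approaches it. The representation of $\nu$ as a list of atoms and the function names are assumptions made for this example only.

```python
import cmath


def levy_khintchine(xi, mu, sigma0_sq, nu_atoms):
    """-i mu xi + sigma0^2 xi^2 / 2 + sum over atoms (y, mass) of (1 - e^{i y xi} + i y xi) * mass."""
    jump_part = sum(
        mass * (1.0 - cmath.exp(1j * y * xi) + 1j * y * xi) for y, mass in nu_atoms
    )
    return -1j * mu * xi + 0.5 * sigma0_sq * xi**2 + jump_part


def poisson_exponent(xi, lam):
    """Exponent (3.4) for d = 1 and jump law delta_1."""
    return lam * (1.0 - cmath.exp(1j * xi))


def difference_quotient(xi, lam, t, n_terms=60):
    """(1 - E e^{i xi N_t}) / t, with the expectation summed from (3.5)."""
    weight, cf = cmath.exp(-lam * t), 0j
    for k in range(n_terms):
        cf += weight * cmath.exp(1j * xi * k)   # add P(N_t = k) e^{i xi k}
        weight *= lam * t / (k + 1)             # move on to P(N_t = k + 1)
    return (1.0 - cf) / t


if __name__ == "__main__":
    lam, xi = 2.0, 0.9
    print(levy_khintchine(xi, mu=lam, sigma0_sq=0.0, nu_atoms=[(1.0, lam)]))
    print(poisson_exponent(xi, lam))
    print(difference_quotient(xi, lam, t=1e-6))
```

The drift term $-i\mu\xi$ cancels the compensator $iy\xi$ under the integral, which is why the compound Poisson form (3.4) has no compensator at all.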

² Note that $e^{iy\xi} = \partial_\xi^2\big(1-e^{iy\xi}+iy\xi\big)/y^2$, i.e. the kernel appearing in (3.11) is indeed measure-determining.

4. On the Markov property

Let $(\Omega,\mathcal{A},\mathbb{P})$ be a probability space with some filtration $(\mathcal{F}_t)_{t\geq 0}$ and a $d$-dimensional adapted stochastic process $X = (X_t)_{t\geq 0}$, i.e. each $X_t$ is $\mathcal{F}_t$ measurable. We write $\mathcal{B}(\mathbb{R}^d)$ for the Borel sets and set $\mathcal{F}_\infty := \sigma\big(\bigcup_{t\geq 0}\mathcal{F}_t\big)$. The process $X$ is said to be a simple Markov process, if
\[ \mathbb{P}(X_t\in B\mid\mathcal{F}_s) = \mathbb{P}(X_t\in B\mid X_s),\quad s\leq t,\ B\in\mathcal{B}(\mathbb{R}^d), \tag{4.1} \]
holds true. This is pretty much the most general definition of a Markov process, but it is usually too general to work with. It is more convenient to consider Markov families.

Definition 4.1. A (temporally homogeneous) Markov transition function is a measure kernel $p_t(x,B)$, $t\geq 0$, $x\in\mathbb{R}^d$, $B\in\mathcal{B}(\mathbb{R}^d)$ such that

a) $B\mapsto p_s(x,B)$ is a probability measure for every $s\geq 0$ and $x\in\mathbb{R}^d$;

b) $(s,x)\mapsto p_s(x,B)$ is a Borel measurable function for every $B\in\mathcal{B}(\mathbb{R}^d)$;

c) the Chapman–Kolmogorov equations hold
\[ p_{s+t}(x,B) = \int p_t(y,B)\,p_s(x,dy)\quad\text{for all } s,t\geq 0,\ x\in\mathbb{R}^d,\ B\in\mathcal{B}(\mathbb{R}^d). \tag{4.2} \]

Definition 4.2. A stochastic process $(X_t)_{t\geq 0}$ is called a (temporally homogeneous) Markov process with transition function if there exists a Markov transition function $p_t(x,B)$ such that
\[ \mathbb{P}(X_t\in B\mid\mathcal{F}_s) = p_{t-s}(X_s,B)\quad\text{a.s. for all } s\leq t,\ B\in\mathcal{B}(\mathbb{R}^d). \tag{4.3} \]

Conditioning w.r.t. σ(Xs) and using the tower property of conditional expectation shows that (4.3) implies the simple Markov property (4.1). Nowadays the following definition of a Markov process is commonly used.

Definition 4.3. A (universal) Markov process is a tuple $(\Omega,\mathcal{A},\mathcal{F}_t,X_t,t\geq 0,\mathbb{P}^x,x\in\mathbb{R}^d)$ such that $p_t(x,B) = \mathbb{P}^x(X_t\in B)$ is a Markov transition function and $(X_t)_{t\geq 0}$ is for each $\mathbb{P}^x$ a Markov process in the sense of Definition 4.2 such that $\mathbb{P}^x(X_0 = x) = 1$. In particular,
\[ \mathbb{P}^x(X_t\in B\mid\mathcal{F}_s) = \mathbb{P}^{X_s}(X_{t-s}\in B)\quad\mathbb{P}^x\text{-a.s. for all } s\leq t,\ B\in\mathcal{B}(\mathbb{R}^d). \tag{4.4} \]

We are going to show that a Lévy process is a (universal) Markov process. Assume that $(X_t)_{t\geq 0}$ is a Lévy process and set $\mathcal{F}_t := \mathcal{F}_t^X = \sigma(X_r,\ r\leq t)$. Define probability measures
\[ \mathbb{P}^x(X_\bullet\in\Gamma) := \mathbb{P}(X_\bullet+x\in\Gamma),\quad x\in\mathbb{R}^d, \]
where $\Gamma$ is a Borel set of the path space $(\mathbb{R}^d)^{[0,\infty)} = \{w\mid w:[0,\infty)\to\mathbb{R}^d\}$.¹ We set $\mathbb{E}^x := \int\dots\,d\mathbb{P}^x$. By construction, $\mathbb{P} = \mathbb{P}^0$ and $\mathbb{E} = \mathbb{E}^0$.

Note that $X_t^x := X_t+x$ satisfies the conditions (L1)–(L3), and it is common to call $(X_t^x)_{t\geq 0}$ a Lévy process starting from $x$.

Lemma 4.4. Let $(X_t)_{t\geq 0}$ be a Lévy process on $\mathbb{R}^d$. Then
\[ p_t(x,B) := \mathbb{P}^x(X_t\in B) := \mathbb{P}(X_t+x\in B),\quad t\geq 0,\ x\in\mathbb{R}^d,\ B\in\mathcal{B}(\mathbb{R}^d), \]
is a Markov transition function.

Proof. Since $p_t(x,B) = \mathbb{E}\mathbf{1}_B(X_t+x)$, (the proof of) Fubini's theorem shows that $x\mapsto p_t(x,B)$ is a measurable function and $B\mapsto p_t(x,B)$ is a probability measure. The Chapman–Kolmogorov equations follow from
\begin{align*}
p_{s+t}(x,B) &= \mathbb{P}(X_{s+t}+x\in B) = \mathbb{P}\big((X_{s+t}-X_t)+x+X_t\in B\big)\\
&\overset{\text{(L2)}}{=}\int_{\mathbb{R}^d}\mathbb{P}(y+X_t\in B)\,\mathbb{P}\big((X_{s+t}-X_t)+x\in dy\big)\\
&\overset{\text{(L1)}}{=}\int_{\mathbb{R}^d}\mathbb{P}(y+X_t\in B)\,\mathbb{P}(X_s+x\in dy)\\
&= \int_{\mathbb{R}^d}p_t(y,B)\,p_s(x,dy).
\end{align*}

Remark 4.5. The proof of Lemma 4.4 shows a bit more: From
\[ p_t(x,B) = \int\mathbf{1}_B(x+y)\,\mathbb{P}(X_t\in dy) = \int\mathbf{1}_{B-x}(y)\,\mathbb{P}(X_t\in dy) = p_t(0,B-x) \]
we see that the kernels $p_t(x,B)$ are invariant under shifts in $\mathbb{R}^d$ (translation invariant). In slight abuse of notation we write $p_t(x,B) = p_t(B-x)$. From this it becomes clear that the Chapman–Kolmogorov equations are convolution identities $p_{t+s}(B) = p_t*p_s(B)$, and $(p_t)_{t\geq 0}$ is a convolution semigroup of probability measures; because of (L3), this semigroup is weakly continuous at $t = 0$, i.e. $p_t\to\delta_0$ as $t\to 0$, cf. Theorem A.3 et seq. for the weak convergence of measures.

Lévy processes enjoy an even stronger version of the above Markov property.
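For the Poisson process the convolution identity $p_{t+s} = p_t*p_s$ of Remark 4.5 says that $\mathrm{Poi}(\lambda t)*\mathrm{Poi}(\lambda s) = \mathrm{Poi}(\lambda(t+s))$. A minimal numerical sketch, with assumed parameter values and illustrative function names, compares the first atoms of both sides.

```python
import math


def poisson_law(mean, size):
    """First `size` weights of the law (3.5) with parameter mean = lam * t."""
    weights = [math.exp(-mean)]
    for k in range(1, size):
        weights.append(weights[-1] * mean / k)
    return weights


def convolve(p, q):
    """Convolution of two measures on {0, 1, 2, ...}, restricted to the first len(p) atoms."""
    return [sum(p[j] * q[k - j] for j in range(k + 1)) for k in range(len(p))]


def chapman_kolmogorov_gap(lam=1.3, s=0.4, t=1.1, size=40):
    """Largest difference between p_{s+t} and p_t * p_s on the atoms 0, ..., size - 1."""
    lhs = poisson_law(lam * (s + t), size)
    rhs = convolve(poisson_law(lam * t, size), poisson_law(lam * s, size))
    return max(abs(a - b) for a, b in zip(lhs, rhs))


if __name__ == "__main__":
    print(chapman_kolmogorov_gap())
```

Since the mass at $k$ of a convolution on the non-negative integers only involves the atoms $0,\dots,k$, restricting both laws to finitely many atoms introduces no truncation error on the atoms that are compared.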

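Theorem 4.6 (Markov property for Lévy processes). Let $X$ be a $d$-dimensional Lévy process and set $Y := (X_{t+a}-X_a)_{t\geq 0}$ for some fixed $a>0$. Then $Y$ is again a Lévy process satisfying

a) $Y\perp\!\!\!\perp(X_r)_{r\leq a}$, i.e. $\mathcal{F}_\infty^Y\perp\!\!\!\perp\mathcal{F}_a^X$.

b) $Y\sim X$, i.e. $X$ and $Y$ have the same finite dimensional distributions.

¹ Recall that $\mathcal{B}((\mathbb{R}^d)^{[0,\infty)})$ is the smallest σ-algebra containing the cylinder sets $Z = \times_{t\geq 0}B_t$ where $B_t\in\mathcal{B}(\mathbb{R}^d)$ and only finitely many $B_t\neq\mathbb{R}^d$.

Proof. Observe that $\mathcal{F}_s^Y = \sigma(X_{r+a}-X_a,\ r\leq s)\subset\mathcal{F}_{s+a}^X$. Using Theorem 3.1 and the tower property of conditional expectation yields for all $s\leq t$
\[ \mathbb{E}\Big(e^{i\xi\cdot(Y_t-Y_s)}\,\Big|\,\mathcal{F}_s^Y\Big) = \mathbb{E}\Big[\mathbb{E}\Big(e^{i\xi\cdot(X_{t+a}-X_{s+a})}\,\Big|\,\mathcal{F}_{s+a}^X\Big)\,\Big|\,\mathcal{F}_s^Y\Big] = e^{-(t-s)\psi(\xi)}. \]
Thus, $(Y_t)_{t\geq 0}$ is a Lévy process with the same characteristic function as $(X_t)_{t\geq 0}$. The property (L2′) for $X$ gives
\[ X_{t_n+a}-X_{t_{n-1}+a},\ X_{t_{n-1}+a}-X_{t_{n-2}+a},\ \dots,\ X_{t_1+a}-X_a\ \perp\!\!\!\perp\ \mathcal{F}_a^X. \]
As $\sigma(Y_{t_1},\dots,Y_{t_n}) = \sigma(Y_{t_n}-Y_{t_{n-1}},\dots,Y_{t_1}-Y_{t_0})\perp\!\!\!\perp\mathcal{F}_a^X$ for all $t_0 = 0<t_1<\dots<t_n$, we get
\[ \mathcal{F}_\infty^Y = \sigma\Big(\bigcup_{t_1<\dots<t_n,\ n\in\mathbb{N}}\sigma(Y_{t_1},\dots,Y_{t_n})\Big)\perp\!\!\!\perp\mathcal{F}_a^X. \]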

Using the Markov transition function $p_t(x,B)$ we can define a linear operator on the bounded Borel measurable functions $f:\mathbb{R}^d\to\mathbb{R}$:
\[ P_tf(x) := \int f(y)\,p_t(x,dy) = \mathbb{E}^xf(X_t),\quad f\in B_b(\mathbb{R}^d),\ t\geq 0,\ x\in\mathbb{R}^d. \tag{4.5} \]
For a Lévy process, cf. Remark 4.5, we have $p_t(x,B) = p_t(B-x)$ and the operators $P_t$ are actually convolution operators:
\[ P_tf(x) = \mathbb{E}f(X_t+x) = \int f(y+x)\,p_t(dy) = f*\widetilde{p}_t(x)\quad\text{where}\quad\widetilde{p}_t(B) := p_t(-B). \tag{4.6} \]

Definition 4.7. Let $P_t$, $t\geq 0$, be defined by (4.5). The operators are said to be

a) acting on $B_b(\mathbb{R}^d)$, if $P_t:B_b(\mathbb{R}^d)\to B_b(\mathbb{R}^d)$.

b) an operator semigroup, if $P_{t+s} = P_t\circ P_s$ for all $s,t\geq 0$ and $P_0 = \mathrm{id}$.

c) sub-Markovian if $0\leq f\leq 1\implies 0\leq P_tf\leq 1$.

d) contractive if $\|P_tf\|_\infty\leq\|f\|_\infty$ for all $f\in B_b(\mathbb{R}^d)$.

e) conservative if $P_t1 = 1$.

f) Feller operators, if $P_t:C_\infty(\mathbb{R}^d)\to C_\infty(\mathbb{R}^d)$.²

g) strongly continuous on $C_\infty(\mathbb{R}^d)$, if $\lim_{t\to 0}\|P_tf-f\|_\infty = 0$ for all $f\in C_\infty(\mathbb{R}^d)$.

h) strong Feller operators, if $P_t:B_b(\mathbb{R}^d)\to C_b(\mathbb{R}^d)$.

² $C_\infty(\mathbb{R}^d)$ denotes the space of continuous functions vanishing at infinity. It is a Banach space when equipped with the uniform norm $\|f\|_\infty = \sup_{x\in\mathbb{R}^d}|f(x)|$.

Lemma 4.8. Let $(P_t)_{t\geq 0}$ be defined by (4.5). The properties 4.7.a)–e) hold for any Markov process, 4.7.a)–g) hold for any Lévy process, and 4.7.a)–h) hold for any Lévy process such that all transition probabilities $p_t(dy) = \mathbb{P}(X_t\in dy)$, $t>0$, are absolutely continuous w.r.t. Lebesgue measure.

Proof. We only show the assertions about Lévy processes $(X_t)_{t\geq 0}$.

a) Since $P_tf(x) = \mathbb{E}f(X_t+x)$, the boundedness of $P_tf$ is obvious, and the measurability in $x$ follows from (the proof of) Fubini's theorem.

b) By the tower property of conditional expectation, we get for $s,t\geq 0$
\[ P_{t+s}f(x) = \mathbb{E}^xf(X_{t+s}) = \mathbb{E}^x\big(\mathbb{E}^x[f(X_{t+s})\mid\mathcal{F}_s]\big)\overset{(4.4)}{=}\mathbb{E}^x\big(\mathbb{E}^{X_s}f(X_t)\big) = P_s\circ P_tf(x). \]
For the Markov transition functions this is the Chapman–Kolmogorov identity (4.2).

c), d) and e) follow directly from the fact that $B\mapsto p_t(x,B)$ is a probability measure.

f) Let $f\in C_\infty(\mathbb{R}^d)$. Since $x\mapsto f(x+X_t)$ is continuous and bounded, the claim follows from dominated convergence as $P_tf(x) = \mathbb{E}f(x+X_t)$.

g) $f\in C_\infty$ is uniformly continuous, i.e. for every $\varepsilon>0$ there is some $\delta>0$ such that
\[ |x-y|\leq\delta\implies|f(x)-f(y)|\leq\varepsilon. \]
Hence,
\begin{align*}
\|P_tf-f\|_\infty &\leq\sup_{x\in\mathbb{R}^d}\int|f(X_t)-f(x)|\,d\mathbb{P}^x\\
&= \sup_{x\in\mathbb{R}^d}\left(\int_{|X_t-x|\leq\delta}|f(X_t)-f(x)|\,d\mathbb{P}^x+\int_{|X_t-x|>\delta}|f(X_t)-f(x)|\,d\mathbb{P}^x\right)\\
&\leq\varepsilon+2\|f\|_\infty\sup_{x\in\mathbb{R}^d}\mathbb{P}^x(|X_t-x|>\delta)\\
&= \varepsilon+2\|f\|_\infty\,\mathbb{P}(|X_t|>\delta)\xrightarrow[t\to 0]{\text{(L3)}}\varepsilon.
\end{align*}
Since $\varepsilon>0$ is arbitrary, the claim follows. Note that this proof shows that uniform continuity in probability is responsible for the strong continuity of the semigroup.

h) see Lemma 4.9.

Lemma 4.9 (Hawkes). Let $X = (X_t)_{t\geq 0}$ be a Lévy process on $\mathbb{R}^d$. Then the operators $P_t$ defined by (4.5) are strong Feller if, and only if, $X_t\sim p_t(y)\,dy$ for all $t>0$.

Proof. '⇐': Let $X_t\sim p_t(y)\,dy$. Since $p_t\in L^1$ and since convolutions have a smoothing property (e.g. [54, Theorem 14.8] or [55, Satz 18.9]), we get with $\widetilde{p}_t(y) = p_t(-y)$
\[ P_tf = f*\widetilde{p}_t\in L^\infty*L^1\subset C_b(\mathbb{R}^d). \]
'⇒': We show that $p_t(dy)\ll dy$. Let $N\in\mathcal{B}(\mathbb{R}^d)$ be a Lebesgue null set, i.e. $\lambda^d(N) = 0$, and $g\in B_b(\mathbb{R}^d)$. Then, by the Fubini–Tonelli theorem
\begin{align*}
\int g(x)P_t\mathbf{1}_N(x)\,dx &= \iint g(x)\mathbf{1}_N(x+y)\,p_t(dy)\,dx\\
&= \int\underbrace{\int g(x)\mathbf{1}_N(x+y)\,dx}_{=0}\,p_t(dy) = 0.
\end{align*}
Take $g = P_t\mathbf{1}_N$, then the above calculation shows
\[ \int\big(P_t\mathbf{1}_N(x)\big)^2\,dx = 0. \]
Hence, $P_t\mathbf{1}_N = 0$ Lebesgue-a.e. By the strong Feller property, $P_t\mathbf{1}_N$ is continuous, and so $P_t\mathbf{1}_N\equiv 0$, hence
\[ p_t(N) = P_t\mathbf{1}_N(0) = 0. \]

Remark 4.10. The existence and smoothness of densities for a Lévy process are time-dependent properties, cf. Sato [51, Chapter V.23]. The typical example is the Gamma process. This is a (one-dimensional) Lévy process with characteristic exponent
\[ \psi(\xi) = \frac12\log(1+|\xi|^2)-i\arctan\xi,\quad\xi\in\mathbb{R}, \]
and this process has the transition density
\[ p_t(x) = \frac{1}{\Gamma(t)}\,x^{t-1}e^{-x}\,\mathbf{1}_{(0,\infty)}(x),\quad t>0. \]
The factor $x^{t-1}$ gives a time-dependent condition for the property $p_t\in L^p(dx)$. One can show, cf. [30], that
\[ \lim_{|\xi|\to\infty}\frac{\operatorname{Re}\psi(\xi)}{\log(1+|\xi|^2)} = \infty\implies\forall t>0\ \ \exists p_t\in C^\infty(\mathbb{R}^d). \]
The converse direction remains true if $\psi(\xi)$ is rotationally invariant or if it is replaced by its symmetric rearrangement.
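The exponent and the density of Remark 4.10 fit together because $\frac12\log(1+\xi^2)-i\arctan\xi = \log(1-i\xi)$ for the principal branch, so that $e^{-t\psi(\xi)} = (1-i\xi)^{-t}$ is the characteristic function of the Gamma distribution with shape $t$. The following sketch checks this by numerical integration; the midpoint rule, the integration range and the function names are choices made for this illustration, and the quadrature is only reliable for $t\geq 1$, where the density is bounded near the origin.

```python
import cmath
import math


def gamma_exponent(xi):
    """psi(xi) = log(1 + xi^2) / 2 - i arctan(xi), as in Remark 4.10."""
    return 0.5 * math.log(1.0 + xi * xi) - 1j * math.atan(xi)


def gamma_density(x, t):
    """Transition density x^{t-1} e^{-x} / Gamma(t) on (0, infinity)."""
    return x ** (t - 1.0) * math.exp(-x) / math.gamma(t) if x > 0 else 0.0


def characteristic_function(xi, t, upper=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of e^{i xi x} p_t(x) over (0, upper)."""
    h = upper / steps
    return h * sum(
        cmath.exp(1j * xi * (j + 0.5) * h) * gamma_density((j + 0.5) * h, t)
        for j in range(steps)
    )


if __name__ == "__main__":
    xi, t = 0.8, 2.5
    print(characteristic_function(xi, t))
    print(cmath.exp(-t * gamma_exponent(xi)))
```

For $t<1$ the density has an integrable singularity at $x = 0$, which is the time-dependence mentioned in the remark; a naive quadrature then converges slowly.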

Remark 4.11. If $(P_t)_{t\geq 0}$ is a Feller semigroup, i.e. a semigroup satisfying the conditions 4.7.a)–g), then there exists a unique stochastic process (a Feller process) with $(P_t)_{t\geq 0}$ as transition semigroup. The idea is to use Kolmogorov's consistency theorem for the following family of finite-dimensional distributions
\[ p^x_{t_1,\dots,t_n}(B_1\times\dots\times B_n) = P_{t_1}\Big(\mathbf{1}_{B_1}P_{t_2-t_1}\big(\mathbf{1}_{B_2}P_{t_3-t_2}(\dots P_{t_n-t_{n-1}}(\mathbf{1}_{B_n}))\big)\Big)(x). \]
Here $X_{t_0} = X_0 = x$ a.s. Note: It is not enough to have a semigroup on $L^p$ as we need pointwise evaluations.

If the operators $P_t$ are not a priori given on $B_b(\mathbb{R}^d)$ but only on $C_\infty(\mathbb{R}^d)$, one still can use the Riesz representation theorem to construct Markov kernels $p_t(x,B)$ representing and extending $P_t$ onto $B_b(\mathbb{R}^d)$, cf. Lemma 5.2.

Recall that a stopping time is a random time $\tau:\Omega\to[0,\infty]$ such that $\{\tau\leq t\}\in\mathcal{F}_t$ for all $t\geq 0$. It is not hard to see that $\tau_n := (\lfloor 2^n\tau\rfloor+1)2^{-n}$, $n\in\mathbb{N}$, is a sequence of stopping times with values $k2^{-n}$, $k = 1,2,\dots$, such that
\[ \tau_1\geq\tau_2\geq\dots\geq\tau_n\downarrow\tau = \inf_{n\in\mathbb{N}}\tau_n. \]
This approximation is the key ingredient to extend the Markov property (Theorem 4.6) to random times.
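The dyadic approximation is easy to experiment with for a fixed value $\tau(\omega)$. The sketch below, with function names chosen for this illustration, computes $\tau_n$ for one finite value of $\tau$ and shows the three properties used later: $\tau_n>\tau$, $\tau_n-\tau\leq 2^{-n}$, and monotonicity in $n$.

```python
import math


def dyadic_upper(tau, n):
    """tau_n = (floor(2^n tau) + 1) 2^{-n}: the smallest point of the grid 2^{-n} N strictly above tau."""
    return (math.floor(2**n * tau) + 1) / 2**n


def approximations(tau, levels=10):
    """The values tau_1, ..., tau_levels for one fixed (finite) value of tau."""
    return [dyadic_upper(tau, n) for n in range(1, levels + 1)]


if __name__ == "__main__":
    print(approximations(0.3, 3))   # [0.5, 0.5, 0.375]
```

Note that $\{\tau_n = k2^{-n}\} = \{(k-1)2^{-n}\leq\tau<k2^{-n}\}$, so $\tau_n$ only takes countably many values; this is what reduces statements about $\tau$ to statements about deterministic times.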

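Theorem 4.12 (Strong Markov property for Lévy processes). Let $X$ be a Lévy process on $\mathbb{R}^d$ and set $Y := (X_{t+\tau}-X_\tau)_{t\geq 0}$ for some a.s. finite stopping time $\tau$. Then $Y$ is again a Lévy process satisfying

a) $Y\perp\!\!\!\perp(X_r)_{r\leq\tau}$, i.e. $\mathcal{F}_\infty^Y\perp\!\!\!\perp\mathcal{F}_{\tau+}^X := \big\{F\in\mathcal{F}_\infty^X\ :\ F\cap\{\tau<t\}\in\mathcal{F}_t^X\ \ \forall t\geq 0\big\}$.

b) $Y\sim X$, i.e. $X$ and $Y$ have the same finite dimensional distributions.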

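Proof. Let $\tau_n := (\lfloor 2^n\tau\rfloor+1)2^{-n}$. For all $0\leq s<t$, $\xi\in\mathbb{R}^d$ and $F\in\mathcal{F}_{\tau+}^X$ we find by the right-continuity of the sample paths (or by the continuity in probability (L3))
\begin{align*}
\mathbb{E}\Big[e^{i\xi\cdot(X_{t+\tau}-X_{s+\tau})}\mathbf{1}_F\Big] &= \lim_{n\to\infty}\mathbb{E}\Big[e^{i\xi\cdot(X_{t+\tau_n}-X_{s+\tau_n})}\mathbf{1}_F\Big]\\
&= \lim_{n\to\infty}\sum_{k=1}^\infty\mathbb{E}\Big[e^{i\xi\cdot(X_{t+k2^{-n}}-X_{s+k2^{-n}})}\mathbf{1}_{\{\tau_n = k2^{-n}\}}\cdot\mathbf{1}_F\Big]\\
&= \lim_{n\to\infty}\sum_{k=1}^\infty\mathbb{E}\Big[e^{i\xi\cdot(X_{t+k2^{-n}}-X_{s+k2^{-n}})}\mathbf{1}_{\{(k-1)2^{-n}\leq\tau<k2^{-n}\}}\cdot\mathbf{1}_F\Big]\\
&= \lim_{n\to\infty}\sum_{k=1}^\infty\mathbb{E}\Big[e^{i\xi\cdot X_{t-s}}\Big]\,\mathbb{P}\big(\{(k-1)2^{-n}\leq\tau<k2^{-n}\}\cap F\big)\\
&= \mathbb{E}\Big[e^{i\xi\cdot X_{t-s}}\Big]\,\mathbb{P}(F).
\end{align*}
For the fourth equality we use that $F\cap\{(k-1)2^{-n}\leq\tau<k2^{-n}\}\in\mathcal{F}_{k2^{-n}}^X$ is independent of the increment $X_{t+k2^{-n}}-X_{s+k2^{-n}}\sim X_{t-s}$, see (L2′) and (L1). In the last equality we use that the disjoint union $\bigcup_{k=1}^\infty\{(k-1)2^{-n}\leq\tau<k2^{-n}\}$ equals $\{\tau<\infty\}$ for all $n\geq 1$.

The same calculation applies to finitely many increments. Let $F\in\mathcal{F}_{\tau+}^X$, $t_0 = 0<t_1<\dots<t_n$ and $\xi_1,\dots,\xi_n\in\mathbb{R}^d$. Then
\[ \mathbb{E}\Big[e^{i\sum_{k=1}^n\xi_k\cdot(X_{t_k+\tau}-X_{t_{k-1}+\tau})}\mathbf{1}_F\Big] = \prod_{k=1}^n\mathbb{E}\Big[e^{i\xi_k\cdot X_{t_k-t_{k-1}}}\Big]\,\mathbb{P}(F). \]
This shows that the increments $X_{t_k+\tau}-X_{t_{k-1}+\tau}$ are independent and distributed like $X_{t_k-t_{k-1}}$. Moreover, all increments are independent of $F\in\mathcal{F}_{\tau+}^X$.

Therefore, all random vectors of the form $(X_{t_1+\tau}-X_\tau,\dots,X_{t_n+\tau}-X_{t_{n-1}+\tau})$ are independent of $\mathcal{F}_{\tau+}^X$, and we conclude that $\mathcal{F}_\infty^Y = \sigma(X_{t+\tau}-X_\tau,\ t\geq 0)\perp\!\!\!\perp\mathcal{F}_{\tau+}^X$.

5. A digression: semigroups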

We have seen that the Markov kernel $p_t(x,B)$ of a Lévy or Markov process induces a semigroup of linear operators $(P_t)_{t\geq 0}$. In this chapter we collect a few tools from functional analysis for the study of operator semigroups. By $B_b(\mathbb{R}^d)$ we denote the bounded Borel functions $f:\mathbb{R}^d\to\mathbb{R}$, and $C_\infty(\mathbb{R}^d)$ are the continuous functions vanishing at infinity, i.e. $\lim_{|x|\to\infty}f(x) = 0$; when equipped with the uniform norm $\|\cdot\|_\infty$ both sets become Banach spaces.

Definition 5.1. A Feller semigroup is a family of linear operators $P_t:B_b(\mathbb{R}^d)\to B_b(\mathbb{R}^d)$ satisfying the properties a)–g) of Definition 4.7: $(P_t)_{t\geq 0}$ is a semigroup of conservative, sub-Markovian operators which enjoy the Feller property $P_t(C_\infty(\mathbb{R}^d))\subset C_\infty(\mathbb{R}^d)$ and which are strongly continuous on $C_\infty(\mathbb{R}^d)$.

Notice that $(t,x)\mapsto P_tf(x)$ is continuous for every $f\in C_\infty(\mathbb{R}^d)$. This follows from
\begin{align*}
|P_tf(x)-P_sf(y)| &\leq|P_tf(x)-P_tf(y)|+|P_tf(y)-P_sf(y)|\\
&\leq|P_tf(x)-P_tf(y)|+\|P_{|t-s|}f-f\|_\infty,
\end{align*}
the Feller property 4.7.f) and the strong continuity 4.7.g).

Lemma 5.2. If $(P_t)_{t\geq 0}$ is a Feller semigroup, then there is a Markov transition function $p_t(x,dy)$ (Definition 4.1) such that $P_tf(x) = \int f(y)\,p_t(x,dy)$.

Proof. By the Riesz representation theorem we see that the operators $P_t$ are of the form
\[ P_tf(x) = \int f(y)\,p_t(x,dy) \]
where $p_t(x,dy)$ is a Markov kernel. The tricky part is to show the joint measurability of the transition function $(t,x)\mapsto p_t(x,B)$ and the Chapman–Kolmogorov identities (4.2). For every compact set $K\subset\mathbb{R}^d$ the functions defined by
\[ f_n(x) := \frac{d(x,U_n^c)}{d(x,K)+d(x,U_n^c)},\quad d(x,A) := \inf_{a\in A}|x-a|,\quad U_n := \{y : d(y,K)<1/n\}, \]
are in $C_\infty(\mathbb{R}^d)$ and $f_n\downarrow\mathbf{1}_K$. By monotone convergence, $p_t(x,K) = \inf_{n\in\mathbb{N}}P_tf_n(x)$ which proves the joint measurability in $(t,x)$ for all compact sets.

By the same argument, the semigroup property $P_{t+s}f_n = P_sP_tf_n$ entails the Chapman–Kolmogorov identities for compact sets: $p_{t+s}(x,K) = \int p_t(y,K)\,p_s(x,dy)$. Since
\[ \mathcal{D} := \left\{B\in\mathcal{B}(\mathbb{R}^d)\ \middle|\ (t,x)\mapsto p_t(x,B)\text{ is measurable}\ \ \&\ \ p_{t+s}(x,B) = \int p_t(y,B)\,p_s(x,dy)\right\} \]
is a Dynkin system containing the compact sets, we have $\mathcal{D} = \mathcal{B}(\mathbb{R}^d)$.

To get an intuition for semigroups it is a good idea to view the semigroup property
\[ P_{t+s} = P_s\circ P_t\quad\text{and}\quad P_0 = \mathrm{id} \]
as an operator-valued Cauchy functional equation. If $t\mapsto P_t$ is continuous in a suitable sense, the unique solution will be of the form $P_t = e^{tA}$ for some operator $A$. This can be easily made rigorous for matrices $A,P_t\in\mathbb{R}^{n\times n}$ since the matrix exponential is well defined by the uniformly convergent series
\[ P_t = \exp(tA) := \sum_{k=0}^\infty\frac{t^kA^k}{k!}\quad\text{and}\quad A = \frac{d}{dt}P_t\Big|_{t=0} \]
with $A^0 := \mathrm{id}$ and $A^k = A\circ A\circ\dots\circ A$ ($k$ times). With a bit more care, this can be made to work also in general settings.
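The matrix case can be tried out directly. The sketch below sums the exponential series for a $2\times 2$ matrix and exposes helpers to test the semigroup property and the derivative at $t = 0$. The matrix `A` is a hypothetical generator of a two-state Markov chain (rows sum to zero), and all function names are illustrative.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(a, t, n_terms=40):
    """exp(tA) = sum_k t^k A^k / k!, truncated after n_terms terms."""
    n = len(a)
    term = [[float(i == j) for j in range(n)] for i in range(n)]   # t^0 A^0 / 0! = id
    total = [row[:] for row in term]
    for k in range(1, n_terms):
        term = [[t / k * x for x in row] for row in mat_mul(term, a)]   # t^k A^k / k!
        total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total


def max_gap(a, b):
    """Largest entrywise difference of two matrices."""
    return max(abs(x - y) for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


# Hypothetical generator of a two-state Markov chain (rows sum to zero).
A = [[-2.0, 2.0], [0.5, -0.5]]

if __name__ == "__main__":
    print(max_gap(mat_mul(mat_exp(A, 0.3), mat_exp(A, 0.9)), mat_exp(A, 1.2)))
```

Because the rows of `A` sum to zero and the off-diagonal entries are non-negative, the matrices $e^{tA}$ are stochastic, which is the finite-state analogue of a conservative sub-Markovian semigroup.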

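Definition 5.3. Let $(P_t)_{t\geq 0}$ be a Feller semigroup. The (infinitesimal) generator is a linear operator defined by
\begin{align*}
D(A) &:= \left\{f\in C_\infty(\mathbb{R}^d)\ \middle|\ \exists g\in C_\infty(\mathbb{R}^d)\ :\ \lim_{t\to 0}\left\|\frac{P_tf-f}{t}-g\right\|_\infty = 0\right\} \tag{5.1}\\
Af &:= \lim_{t\to 0}\frac{P_tf-f}{t},\quad f\in D(A). \tag{5.2}
\end{align*}

The following lemma is the rigorous version for the symbolic notation '$P_t = e^{tA}$'.

Lemma 5.4. Let $(P_t)_{t\geq 0}$ be a Feller semigroup with infinitesimal generator $(A,D(A))$. Then $P_t(D(A))\subset D(A)$ and
\[ \frac{d}{dt}P_tf = AP_tf = P_tAf\quad\text{for all } f\in D(A),\ t\geq 0. \tag{5.3} \]
Moreover, $\int_0^tP_sf\,ds\in D(A)$ for any $f\in C_\infty(\mathbb{R}^d)$, and
\begin{align*}
P_tf-f &= A\int_0^tP_sf\,ds,\quad f\in C_\infty(\mathbb{R}^d),\ t>0 \tag{5.4}\\
&= \int_0^tP_sAf\,ds,\quad f\in D(A),\ t>0. \tag{5.5}
\end{align*}

Proof. Let $0<\varepsilon<t$ and $f\in D(A)$. The semigroup and contraction properties give
\begin{align*}
\left\|\frac{P_tf-P_{t-\varepsilon}f}{\varepsilon}-P_tAf\right\|_\infty &\leq\left\|P_{t-\varepsilon}\frac{P_\varepsilon f-f}{\varepsilon}-P_{t-\varepsilon}Af\right\|_\infty+\big\|P_{t-\varepsilon}Af-P_{t-\varepsilon}P_\varepsilon Af\big\|_\infty\\
&\leq\left\|\frac{P_\varepsilon f-f}{\varepsilon}-Af\right\|_\infty+\big\|Af-P_\varepsilon Af\big\|_\infty\xrightarrow[\varepsilon\to 0]{}0
\end{align*}
where we use the strong continuity in the last step. This shows $\frac{d^-}{dt}P_tf = AP_tf = P_tAf$; a similar (but simpler) calculation proves this also for $\frac{d^+}{dt}P_tf$.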

Let $f\in C_\infty(\mathbb{R}^d)$ and $t,\varepsilon>0$. By Fubini's theorem and the representation of $P_t$ with a Markov transition function (Lemma 5.2) we get
\[ P_\varepsilon\int_0^tP_sf(x)\,ds = \int_0^tP_\varepsilon P_sf(x)\,ds, \]
and so,
\begin{align*}
\frac{P_\varepsilon-\mathrm{id}}{\varepsilon}\int_0^tP_sf(x)\,ds &= \frac1\varepsilon\int_0^t\big(P_{s+\varepsilon}f(x)-P_sf(x)\big)\,ds\\
&= \frac1\varepsilon\int_t^{t+\varepsilon}P_sf(x)\,ds-\frac1\varepsilon\int_0^\varepsilon P_sf(x)\,ds.
\end{align*}
Since $t\mapsto P_tf(x)$ is continuous, the fundamental theorem of calculus applies, and we get
\[ \lim_{\varepsilon\to 0}\frac1\varepsilon\int_r^{r+\varepsilon}P_sf(x)\,ds = P_rf(x) \]
for $r\geq 0$. This shows that $\int_0^tP_sf\,ds\in D(A)$ as well as (5.4). If $f\in D(A)$, then we deduce (5.5) from
\[ \int_0^tP_sAf(x)\,ds\overset{(5.3)}{=}\int_0^t\frac{d}{ds}P_sf(x)\,ds = P_tf(x)-f(x)\overset{(5.4)}{=}A\int_0^tP_sf(x)\,ds. \]

Remark 5.5 (Consequences of Lemma 5.4). Write $C_\infty := C_\infty(\mathbb{R}^d)$.

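a) (5.4) shows that $D(A)$ is dense in $C_\infty$, since $D(A)\ni t^{-1}\int_0^tP_sf\,ds\xrightarrow[t\to 0]{}f$ for any $f\in C_\infty$.

b) (5.5) shows that $A$ is a closed operator, i.e.
\[ f_n\in D(A),\ (f_n,Af_n)\xrightarrow[n\to\infty]{\text{uniformly}}(f,g)\in C_\infty\times C_\infty\implies f\in D(A)\ \ \&\ \ Af = g. \]

c) (5.3) means that $A$ determines $(P_t)_{t\geq 0}$ uniquely.

Let us now consider the Laplace transform of $(P_t)_{t\geq 0}$.

Definition 5.6. Let $(P_t)_{t\geq 0}$ be a Feller semigroup. The resolvent is a linear operator on $B_b(\mathbb{R}^d)$ given by
\[ R_\lambda f(x) := \int_0^\infty e^{-\lambda t}P_tf(x)\,dt,\quad f\in B_b(\mathbb{R}^d),\ x\in\mathbb{R}^d,\ \lambda>0. \tag{5.6} \]
The following formal calculation can easily be made rigorous. Let $(\lambda-A) := (\lambda\,\mathrm{id}-A)$ for $\lambda>0$ and $f\in D(A)$. Then
\begin{align*}
(\lambda-A)R_\lambda f &= (\lambda-A)\int_0^\infty e^{-\lambda t}P_tf\,dt\\
&\overset{(5.4),(5.5)}{=}\int_0^\infty e^{-\lambda t}(\lambda-A)P_tf\,dt\\
&= \lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\int_0^\infty e^{-\lambda t}\Big(\frac{d}{dt}P_tf\Big)\,dt\\
&\overset{\text{parts}}{=}\lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\lambda\int_0^\infty e^{-\lambda t}P_tf\,dt-\big[e^{-\lambda t}P_tf\big]_{t=0}^\infty = f.
\end{align*}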

A similar calculation for $R_\lambda(\lambda-A)$ gives

Theorem 5.7. Let $(A,D(A))$ and $(R_\lambda)_{\lambda>0}$ be the generator and the resolvent of a Feller semigroup. Then
\[ R_\lambda = (\lambda-A)^{-1}\quad\text{for all } \lambda>0. \]

Since $R_\lambda$ is the Laplace transform of $(P_t)_{t\geq 0}$, the properties of $(R_\lambda)_{\lambda>0}$ can be found from $(P_t)_{t\geq 0}$ and vice versa. With some effort one can even invert the (operator-valued) Laplace transform which leads to the familiar expression for $e^x$:
\[ \Big(\frac nt\,R_{\frac nt}\Big)^n = \Big(\mathrm{id}-\frac tn\,A\Big)^{-n}\xrightarrow[n\to\infty]{\text{strongly}}e^{tA} = P_t \tag{5.7} \]
(the notation $e^{tA} = P_t$ is, for unbounded operators $A$, formal), see Pazy [41, Chapter 1.8].
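In finite dimensions the approximation (5.7) can be observed numerically. The sketch below uses a hypothetical two-state generator $A$ with rows summing to zero, for which $A^2 = -cA$ with $c = -\operatorname{tr}A$, so that $e^{tA} = \mathrm{id}+\frac{1-e^{-ct}}{c}A$ is available in closed form for comparison. The function names and the matrix are assumptions made for this illustration.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def euler_approximation(gen, t, n):
    """(id - (t/n) A)^{-n}, i.e. the n-th power of (n/t) R_{n/t} in (5.7)."""
    step = inverse_2x2([[float(i == j) - t / n * gen[i][j] for j in range(2)] for i in range(2)])
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = mat_mul(result, step)
    return result


def two_state_semigroup(gen, t):
    """Closed form e^{tA} = id + (1 - e^{-ct}) / c * A, valid because A^2 = -c A here."""
    c = -(gen[0][0] + gen[1][1])
    factor = (1.0 - math.exp(-c * t)) / c
    return [[float(i == j) + factor * gen[i][j] for j in range(2)] for i in range(2)]


if __name__ == "__main__":
    gen = [[-2.0, 2.0], [0.5, -0.5]]   # hypothetical generator, rows sum to zero
    for n in (10, 100, 1000):
        print(n, euler_approximation(gen, 1.0, n))
    print(two_state_semigroup(gen, 1.0))
```

The error decays like $1/n$, as for the scalar limit $(1-x/n)^{-n}\to e^x$; the point of (5.7) is that only the resolvent, not the semigroup, enters the left-hand side.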

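Lemma 5.8.¹ Let $(R_\lambda)_{\lambda>0}$ be the resolvent of a Feller semigroup $(P_t)_{t\geq 0}$. Then
\[ \frac{d^n}{d\lambda^n}R_\lambda = n!\,(-1)^nR_\lambda^{n+1},\quad n\in\mathbb{N}_0. \tag{5.8} \]

Proof. Using a symmetry argument we see
\[ t^n = \int_0^t\dots\int_0^tdt_1\dots dt_n = n!\int_0^t\int_0^{t_n}\dots\int_0^{t_2}dt_1\dots dt_n. \]
Let $f\in C_\infty(\mathbb{R}^d)$ and $x\in\mathbb{R}^d$. Then
\begin{align*}
(-1)^n\frac{d^n}{d\lambda^n}R_\lambda f(x) &= (-1)^n\int_0^\infty\frac{d^n}{d\lambda^n}e^{-\lambda t}P_tf(x)\,dt = \int_0^\infty t^ne^{-\lambda t}P_tf(x)\,dt\\
&= n!\int_0^\infty\int_0^t\int_0^{t_n}\dots\int_0^{t_2}e^{-\lambda t}P_tf(x)\,dt_1\dots dt_n\,dt\\
&= n!\int_0^\infty\int_{t_n}^\infty\dots\int_{t_1}^\infty e^{-\lambda t}P_tf(x)\,dt\,dt_1\dots dt_n\\
&= n!\int_0^\infty\int_0^\infty\dots\int_0^\infty e^{-\lambda(t+t_1+\dots+t_n)}P_{t+t_1+\dots+t_n}f(x)\,dt\,dt_1\dots dt_n\\
&= n!\,R_\lambda^{n+1}f(x).
\end{align*}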

The key result identifying the generators of Feller semigroups is the following theorem due to Hille, Yosida and Ray. A proof can be found in Pazy [41, Chapter 1.4] or Ethier & Kurtz [17, Chapter 4.2]; a probabilistic approach is due to Itô [25].

Theorem 5.9 (Hille–Yosida–Ray). A linear operator $(A,D(A))$ on $C_\infty(\mathbb{R}^d)$ generates a Feller semigroup $(P_t)_{t\geq 0}$ if, and only if,

a) $D(A)\subset C_\infty(\mathbb{R}^d)$ dense.

b) $A$ is dissipative, i.e. $\|\lambda f-Af\|_\infty\geq\lambda\|f\|_\infty$ for some (or all) $\lambda>0$.

c) $(\lambda-A)(D(A)) = C_\infty(\mathbb{R}^d)$ for some (or all) $\lambda>0$.

d) $A$ satisfies the positive maximum principle:
\[ f\in D(A),\ f(x_0) = \sup_{x\in\mathbb{R}^d}f(x)\geq 0\implies Af(x_0)\leq 0. \tag{PMP} \]

¹ This Lemma only needs that the operators $P_t$ are strongly continuous and contractive, Definition 4.7.g), d).

This variant of the Hille–Yosida theorem is not the standard version from functional analysis since we are interested in positivity preserving (sub-Markov) semigroups. Let us briefly discuss the role of the positive maximum principle.

Remark 5.10. Let $(P_t)_{t\geq 0}$ be a strongly continuous contraction semigroup on $C_\infty(\mathbb{R}^d)$, i.e.
\[ \|P_tf\|_\infty\leq\|f\|_\infty\quad\text{and}\quad\lim_{t\to 0}\|P_tf-f\|_\infty = 0, \]
cf. Definition 4.7.d),g).²

1° Sub-Markov ⇒ (PMP). Assume that $f\in D(A)$ is such that $f(x_0) = \sup f\geq 0$. Then
\[ P_tf(x_0)-f(x_0)\overset{f\leq f^+}{\leq}P_tf^+(x_0)-f^+(x_0)\leq\|f^+\|_\infty-f^+(x_0) = 0 \]
\[ \implies Af(x_0) = \lim_{t\to 0}\frac{P_tf(x_0)-f(x_0)}{t}\leq 0. \]
Thus, (PMP) holds.

2° (PMP) ⇒ dissipativity. Assume that (PMP) holds and let $f\in D(A)$. Since $f\in C_\infty(\mathbb{R}^d)$, we may assume that $f(x_0) = |f(x_0)| = \sup|f|$ (otherwise $f\rightsquigarrow-f$). Then
\[ \|\lambda f-Af\|_\infty\geq\lambda f(x_0)-\underbrace{Af(x_0)}_{\leq 0}\geq\lambda f(x_0) = \lambda\|f\|_\infty. \]

3° (PMP) ⇒ sub-Markov. Since $P_t$ is contractive, we have $P_tf(x)\leq\|P_tf\|_\infty\leq\|f\|_\infty\leq 1$ for all $f\in C_\infty(\mathbb{R}^d)$ such that $|f|\leq 1$. In order to see positivity, let $f\in C_\infty(\mathbb{R}^d)$ be non-negative. We distinguish between two cases:

1° $R_\lambda f$ does not attain its infimum. Since $R_\lambda f\in C_\infty(\mathbb{R}^d)$ vanishes at infinity, we have necessarily $R_\lambda f\geq 0$.

2° $\exists x_0 : R_\lambda f(x_0) = \inf R_\lambda f$. Because of the (PMP) we find
\begin{align*}
&\lambda R_\lambda f(x_0)-f(x_0) = AR_\lambda f(x_0)\geq 0\\
&\implies\lambda R_\lambda f(x)\geq\inf\lambda R_\lambda f = \lambda R_\lambda f(x_0)\geq f(x_0)\geq 0.
\end{align*}

This proves that $f\geq 0\implies\lambda R_\lambda f\geq 0$. From (5.8) we see that $\lambda\mapsto R_\lambda f(x)$ is completely monotone, hence it is the Laplace transform of a positive measure. Since $R_\lambda f(x)$ has the integral representation (5.6), we conclude that $P_tf(x)\geq 0$ (for all $t\geq 0$ as $t\mapsto P_tf$ is continuous).

Using the Riesz representation theorem (as in Lemma 5.2) we can extend $P_t$ as a sub-Markov operator onto $B_b(\mathbb{R}^d)$.

² These properties are essential for the existence of a generator and the resolvent on $C_\infty(\mathbb{R}^d)$.

In order to determine the domain $D(A)$ of the generator the following 'maximal dissipativity' result is handy.

Lemma 5.11 (Dynkin, Reuter). Assume that $(A,D(A))$ generates a Feller semigroup and that $(\bar A,D(\bar A))$ extends $A$, i.e. $D(A)\subset D(\bar A)$ and $\bar A|_{D(A)} = A$. If
\[ u\in D(\bar A),\ u-\bar Au = 0\implies u = 0, \tag{5.9} \]
then $(A,D(A)) = (\bar A,D(\bar A))$.

Proof. Since $A$ is a generator, $(\mathrm{id}-A):D(A)\to C_\infty(\mathbb{R}^d)$ is bijective. On the other hand, the relation (5.9) means that $(\mathrm{id}-\bar A)$ is injective, but $(\mathrm{id}-A)$ cannot have a proper injective extension.

Theorem 5.12. Let $(P_t)_{t\geq 0}$ be a Feller semigroup with generator $(A,D(A))$. Then
\[ D(A) = \left\{f\in C_\infty(\mathbb{R}^d)\ \middle|\ \exists g\in C_\infty(\mathbb{R}^d)\ \ \forall x\ :\ \lim_{t\to 0}\frac{P_tf(x)-f(x)}{t} = g(x)\right\}. \tag{5.10} \]

Proof. Denote by $D(\bar A)$ the right-hand side of (5.10) and define
\[ \bar Af(x) := \lim_{t\to 0}\frac{P_tf(x)-f(x)}{t}\quad\text{for all } f\in D(\bar A),\ x\in\mathbb{R}^d. \]
Obviously, $(\bar A,D(\bar A))$ is a linear operator which extends $(A,D(A))$. Since (PMP) is, essentially, a pointwise assertion (see Remark 5.10, 1°), $\bar A$ inherits (PMP); in particular, $\bar A$ is dissipative (see Remark 5.10, 2°): $\|\bar Af-\lambda f\|_\infty\geq\lambda\|f\|_\infty$. This implies (5.9), and the claim follows from Lemma 5.11.

6. The generator of a Lévy process

We want to study the structure of the generator of (the semigroup corresponding to) a Lévy process $X = (X_t)_{t\geq 0}$. This will also lead to a proof of the Lévy–Khintchine formula.

Our approach uses some Fourier analysis. We denote by $C_c^\infty(\mathbb{R}^d)$ and $\mathcal{S}(\mathbb{R}^d)$ the smooth, compactly supported functions and the smooth, rapidly decreasing 'Schwartz functions'.¹ The Fourier transform is denoted by
\[ \widehat{f}(\xi) = \mathcal{F}f(\xi) := (2\pi)^{-d}\int_{\mathbb{R}^d}f(x)\,e^{-i\xi\cdot x}\,dx,\quad f\in L^1(dx). \]
Observe that $\mathcal{F}f$ is chosen in such a way that the characteristic function becomes the inverse Fourier transform.

We have seen in Proposition 2.3 and its Corollaries 2.4 and 2.5 that $X$ is completely characterized by the characteristic exponent $\psi:\mathbb{R}^d\to\mathbb{C}$
\[ \mathbb{E}e^{i\xi\cdot X_t} = \big[\mathbb{E}e^{i\xi\cdot X_1}\big]^t = e^{-t\psi(\xi)},\quad t\geq 0,\ \xi\in\mathbb{R}^d. \]
We need a few more properties of $\psi$ which result from the fact that $\chi(\xi) = e^{-\psi(\xi)}$ is a characteristic function.

Lemma 6.1. Let $\chi(\xi)$ be any characteristic function of a probability measure $\mu$. Then

\[ |\chi(\xi+\eta) - \chi(\xi)\chi(\eta)|^2 \le (1 - |\chi(\xi)|^2)(1 - |\chi(\eta)|^2), \quad \xi, \eta \in \mathbb{R}^d. \tag{6.1} \]

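Proof. Since $\mu$ is a probability measure, we find from the definition of $\chi$

\[ \begin{aligned} \chi(\xi+\eta) - \chi(\xi)\chi(\eta) &= \iint \left( e^{ix\cdot\xi} e^{ix\cdot\eta} - e^{ix\cdot\xi} e^{iy\cdot\eta} \right) \mu(dx)\,\mu(dy) \\ &= \frac12 \iint \left( e^{ix\cdot\xi} - e^{iy\cdot\xi} \right)\left( e^{ix\cdot\eta} - e^{iy\cdot\eta} \right) \mu(dx)\,\mu(dy). \end{aligned} \]

In the last equality we use that the integrand is symmetric in $x$ and $y$, which allows us to interchange the variables. The elementary formula $|e^{ia} - e^{ib}|^2 = 2 - 2\cos(b-a)$ and the Cauchy–Schwarz inequality yield

\[ \begin{aligned} |\chi(\xi+\eta) - \chi(\xi)\chi(\eta)| &\le \frac12 \iint \left| e^{ix\cdot\xi} - e^{iy\cdot\xi} \right| \cdot \left| e^{ix\cdot\eta} - e^{iy\cdot\eta} \right| \mu(dx)\,\mu(dy) \\ &= \iint \sqrt{1 - \cos (y-x)\cdot\xi}\, \sqrt{1 - \cos (y-x)\cdot\eta}\; \mu(dx)\,\mu(dy) \\ &\le \sqrt{\iint \big(1 - \cos (y-x)\cdot\xi\big)\, \mu(dx)\,\mu(dy)}\; \sqrt{\iint \big(1 - \cos (y-x)\cdot\eta\big)\, \mu(dx)\,\mu(dy)}. \end{aligned} \]

This finishes the proof as

\[ \iint \cos (y-x)\cdot\xi \; \mu(dx)\,\mu(dy) = \operatorname{Re}\left[ \int e^{iy\cdot\xi}\,\mu(dy) \int e^{-ix\cdot\xi}\,\mu(dx) \right] = |\chi(\xi)|^2. \]

¹ To be precise, $f \in S(\mathbb{R}^d)$, if $f \in C^\infty(\mathbb{R}^d)$ and if $\sup_{x\in\mathbb{R}^d} (1 + |x|^N)|\partial^\alpha f(x)| \le c_{N,\alpha}$ for any $N \in \mathbb{N}_0$ and any multiindex $\alpha \in \mathbb{N}_0^d$.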
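Inequality (6.1) is easy to test numerically. The following is a minimal sketch, assuming a hypothetical discrete probability measure on $\mathbb{R}$ (the atoms and weights are illustrative choices, not taken from the text); it evaluates both sides of (6.1) at random frequencies and records the smallest gap.

```python
import cmath
import random

def char_fn(atoms, weights, xi):
    """chi(xi) = sum_j w_j * exp(i * x_j * xi) for a discrete law on R."""
    return sum(w * cmath.exp(1j * x * xi) for x, w in zip(atoms, weights))

def both_sides(atoms, weights, xi, eta):
    """Return (lhs, rhs) of inequality (6.1) at the frequencies xi, eta."""
    chi_xi = char_fn(atoms, weights, xi)
    chi_eta = char_fn(atoms, weights, eta)
    chi_sum = char_fn(atoms, weights, xi + eta)
    lhs = abs(chi_sum - chi_xi * chi_eta) ** 2
    rhs = (1 - abs(chi_xi) ** 2) * (1 - abs(chi_eta) ** 2)
    return lhs, rhs

# Hypothetical example measure: four atoms, weights summing to one.
atoms = [-1.5, 0.0, 0.7, 2.0]
weights = [0.1, 0.4, 0.3, 0.2]

rng = random.Random(0)
gaps = []
for _ in range(1000):
    lhs, rhs = both_sides(atoms, weights, rng.uniform(-10, 10), rng.uniform(-10, 10))
    gaps.append(rhs - lhs)
worst_gap = min(gaps)  # nonnegative up to rounding if (6.1) holds
print(worst_gap >= -1e-12)
```

Such a check is no substitute for the proof, but it helps to catch sign or normalization slips when working with concrete characteristic functions.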

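Theorem 6.2. Let $\psi : \mathbb{R}^d \to \mathbb{C}$ be the characteristic exponent of a Lévy process. Then the function $\xi \mapsto \sqrt{|\psi(\xi)|}$ is subadditive and

\[ |\psi(\xi)| \le c_\psi (1 + |\xi|^2), \quad \xi \in \mathbb{R}^d. \tag{6.2} \]

Proof. We use (6.1) with $\chi = e^{-t\psi}$, divide by $t^2$ (with $t > 0$) and let $t \to 0$. Since $|\chi| = e^{-t \operatorname{Re}\psi}$, this gives

\[ |\psi(\xi+\eta) - \psi(\xi) - \psi(\eta)|^2 \le 4 \operatorname{Re}\psi(\xi) \operatorname{Re}\psi(\eta) \le 4 |\psi(\xi)| \cdot |\psi(\eta)|. \]

By the lower triangle inequality,

\[ |\psi(\xi+\eta)| - |\psi(\xi)| - |\psi(\eta)| \le 2 \sqrt{|\psi(\xi)|}\, \sqrt{|\psi(\eta)|} \]

and this is the same as subadditivity: $\sqrt{|\psi(\xi+\eta)|} \le \sqrt{|\psi(\xi)|} + \sqrt{|\psi(\eta)|}$. In particular, $|\psi(2\xi)| \le 4|\psi(\xi)|$. For any $\xi \ne 0$ there is some integer $n = n(\xi) \in \mathbb{Z}$ such that $2^{n-1} \le |\xi| \le 2^n$, so $2^{2n} \le 4|\xi|^2$ and

\[ |\psi(\xi)| = |\psi(2^n 2^{-n} \xi)| \le \max\{1, 2^{2n}\} \sup_{|\eta|\le 1} |\psi(\eta)| \le 4 \sup_{|\eta|\le 1} |\psi(\eta)|\, (1 + |\xi|^2). \]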
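The subadditivity of $\sqrt{|\psi|}$ can be illustrated on concrete exponents. The sketch below assumes two one-dimensional examples chosen for illustration only: the symmetric stable exponent $|\xi|^\alpha$ and a Poisson process with drift, $\psi(\xi) = -il\xi + \lambda(1 - e^{i\xi})$; the parameter values are arbitrary.

```python
import cmath
import math
import random

def psi_stable(xi, alpha=1.5):
    """Exponent |xi|^alpha of a symmetric alpha-stable process (alpha illustrative)."""
    return abs(xi) ** alpha

def psi_poisson_drift(xi, rate=2.0, drift=0.7):
    """Exponent of t*drift + N_t, N a Poisson process with intensity `rate`."""
    return -1j * drift * xi + rate * (1 - cmath.exp(1j * xi))

def subadditivity_gap(psi, xi, eta):
    """sqrt|psi(xi)| + sqrt|psi(eta)| - sqrt|psi(xi + eta)|, which Theorem 6.2 says is >= 0."""
    root = lambda z: math.sqrt(abs(psi(z)))
    return root(xi) + root(eta) - root(xi + eta)

rng = random.Random(1)
pairs = [(rng.uniform(-20, 20), rng.uniform(-20, 20)) for _ in range(2000)]
min_gap_stable = min(subadditivity_gap(psi_stable, a, b) for a, b in pairs)
min_gap_poisson = min(subadditivity_gap(psi_poisson_drift, a, b) for a, b in pairs)
print(min_gap_stable >= -1e-9, min_gap_poisson >= -1e-9)
```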

Lemma 6.3. Let $(X_t)_{t\ge 0}$ be a Lévy process and denote by $(A, D(A))$ its infinitesimal generator. Then $C_c^\infty(\mathbb{R}^d) \subset D(A)$.

Proof. Let $f \in C_c^\infty(\mathbb{R}^d)$. By definition, $P_t f(x) = \mathbb{E} f(X_t + x)$. Using the differentiation lemma for parameter-dependent integrals (e.g. [54, Theorem 11.5] or [55, 12.2]) it is not hard to see that $P_t : S(\mathbb{R}^d) \to S(\mathbb{R}^d)$. Obviously,

\[ e^{-t\psi(\xi)} = \mathbb{E} e^{i\xi\cdot X_t} = \mathbb{E}^x e^{i\xi\cdot(X_t - x)} = e_{-\xi}(x)\, P_t e_\xi(x) \tag{6.3} \]

for $e_\xi(x) := e^{i\xi\cdot x}$. Recall that the Fourier transform of $f \in S(\mathbb{R}^d)$ is again in $S(\mathbb{R}^d)$. From

\[ \begin{aligned} P_t f = P_t \int \hat f(\xi)\, e_\xi(\cdot)\, d\xi &= \int \hat f(\xi)\, P_t e_\xi(\cdot)\, d\xi \\ &\overset{(6.3)}{=} \int \hat f(\xi)\, e_\xi(\cdot)\, e^{-t\psi(\xi)}\, d\xi \end{aligned} \tag{6.4} \]

we conclude that $\widehat{P_t f} = \hat f e^{-t\psi}$. Hence,

\[ P_t f = F^{-1}(\hat f e^{-t\psi}). \tag{6.5} \]

Consequently,

\[ \frac{\widehat{P_t f} - \hat f}{t} = \frac{e^{-t\psi} \hat f - \hat f}{t} \xrightarrow[t\to 0]{} -\psi \hat f \quad \overset{\hat f \in S(\mathbb{R}^d)}{\implies} \quad \frac{P_t f(x) - f(x)}{t} \xrightarrow[t\to 0]{} g(x) := F^{-1}(-\psi \hat f)(x). \]

Since $\psi$ grows at most polynomially (Theorem 6.2) and $\hat f \in S(\mathbb{R}^d)$, we see $\psi \hat f \in L^1(dx)$ and, by the Riemann–Lebesgue lemma, $g \in C_\infty(\mathbb{R}^d)$. Using Theorem 5.12 it follows that $f \in D(A)$.
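Relation (6.3) and the pointwise limit used in the proof can be made concrete for a Poisson process, where $P_t f(x) = \mathbb{E} f(x + N_t)$ is an explicit series. The sketch below assumes an intensity `RATE` and test points chosen purely for illustration; it uses the standard facts that the exponent of a Poisson process is $\psi(\xi) = \lambda(1 - e^{i\xi})$ and that its generator is $\lambda(f(x+1) - f(x))$.

```python
import cmath
import math

RATE = 1.5  # jump intensity lambda (illustrative value)

def semigroup(f, t, x, terms=80):
    """P_t f(x) = E f(x + N_t) for a Poisson process with intensity RATE (truncated series)."""
    total = 0.0
    weight = math.exp(-RATE * t)        # P(N_t = 0)
    for k in range(terms):
        total += weight * f(x + k)
        weight *= RATE * t / (k + 1)    # P(N_t = k + 1) from P(N_t = k)
    return total

def psi(xi):
    """Characteristic exponent of the Poisson process."""
    return RATE * (1 - cmath.exp(1j * xi))

xi, t, x = 0.9, 0.8, -0.3
e_xi = lambda y: cmath.exp(1j * xi * y)
symbol_side = cmath.exp(-1j * xi * x) * semigroup(e_xi, t, x)  # e_{-xi}(x) P_t e_xi(x)
exponent_side = cmath.exp(-t * psi(xi))                        # e^{-t psi(xi)}

# Difference quotient (P_h f - f)/h against the generator value at x.
f = lambda y: math.exp(-y * y)
h = 1e-6
difference_quotient = (semigroup(f, h, x) - f(x)) / h
generator_value = RATE * (f(x + 1) - f(x))
print(abs(symbol_side - exponent_side), abs(difference_quotient - generator_value))
```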

Definition 6.4. Let $L : C_b^2(\mathbb{R}^d) \to C_b(\mathbb{R}^d)$ be a linear operator. Then

\[ L(x,\xi) := e_{-\xi}(x)\, L_x e_\xi(x) \tag{6.6} \]

is the symbol of the operator $L = L_x$, where $e_\xi(x) := e^{i\xi\cdot x}$.

The proof of Lemma 6.3 actually shows that we can recover an operator $L$ from its symbol $L(x,\xi)$ if, say, $L : C_b^2(\mathbb{R}^d) \to C_b(\mathbb{R}^d)$ is continuous:² Indeed, for all $u \in C_c^\infty(\mathbb{R}^d)$

\[ \begin{aligned} Lu(x) &= L \int \hat u(\xi)\, e_\xi(x)\, d\xi \\ &= \int \hat u(\xi)\, L_x e_\xi(x)\, d\xi \\ &= \int \hat u(\xi)\, L(x,\xi)\, e_\xi(x)\, d\xi = F^{-1}\big(L(x,\cdot)\, F u(\cdot)\big)(x). \end{aligned} \]

Example 6.5. A typical example would be the Laplace operator (i.e. the generator of a Brownian motion)

\[ \tfrac12 \Delta f(x) = -\tfrac12 \left(\tfrac1i \partial_x\right)^2 f(x) = \int \hat f(\xi) \left( -\tfrac12 |\xi|^2 \right) e^{i\xi\cdot x}\, d\xi, \quad \text{i.e. } L(x,\xi) = -\tfrac12 |\xi|^2, \]

or the fractional Laplacian of order $\tfrac12\alpha \in (0,1)$ which generates a rotationally symmetric $\alpha$-stable Lévy process

\[ -(-\Delta)^{\alpha/2} f(x) = \int \hat f(\xi) \left( -|\xi|^\alpha \right) e^{i\xi\cdot x}\, d\xi, \quad \text{i.e. } L(x,\xi) = -|\xi|^\alpha. \]

More generally, if $P(x,\xi)$ is a polynomial in $\xi$, then the corresponding operator is obtained by replacing $\xi$ by $\tfrac1i \nabla_x$ and formally expanding the powers.

Definition 6.6. An operator of the form

\[ L(x,D) f(x) = \int \hat f(\xi)\, L(x,\xi)\, e^{ix\cdot\xi}\, d\xi, \quad f \in S(\mathbb{R}^d), \tag{6.7} \]

is called (if defined) a pseudo differential operator with (non-classical) symbol $L(x,\xi)$.

² As usual, $C_b^2(\mathbb{R}^d)$ is endowed with the norm $\|u\|_{(2)} = \sum_{0\le|\alpha|\le 2} \|\partial^\alpha u\|_\infty$.
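Formula (6.6) can be tried out directly: apply an operator to $e_\xi$ and multiply by $e_{-\xi}(x)$. The sketch below works in $d = 1$ and assumes a central finite-difference approximation of $\tfrac12 \frac{d^2}{dx^2}$ (the step size `h` and the variable coefficient $1 + x^2$ are illustrative choices); the second operator shows a symbol which genuinely depends on $x$.

```python
import cmath

def symbol(op, x, xi):
    """L(x, xi) = e_{-xi}(x) * (L e_xi)(x), cf. (6.6), in dimension d = 1."""
    e_xi = lambda y: cmath.exp(1j * xi * y)
    return cmath.exp(-1j * xi * x) * op(e_xi, x)

def half_laplacian(u, x, h=1e-4):
    """Central-difference approximation of (1/2) u''(x)."""
    return 0.5 * (u(x + h) - 2 * u(x) + u(x - h)) / h**2

def variable_coefficient(u, x, h=1e-4):
    """Approximation of (1/2)(1 + x^2) u''(x), an operator with x-dependent symbol."""
    return (1 + x * x) * half_laplacian(u, x, h)

x, xi = 0.4, 1.3
sym_const = symbol(half_laplacian, x, xi)        # close to -xi^2 / 2
sym_var = symbol(variable_coefficient, x, xi)    # close to -(1 + x^2) xi^2 / 2
print(sym_const, sym_var)
```

For the Laplacian the computed value does not change with $x$, in line with Example 6.5.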

Remark 6.7. The symbol of a Lévy process does not depend on $x$, i.e. $L(x,\xi) = L(\xi)$. This is a consequence of the spatial homogeneity of the process which is encoded in the translation invariance of the semigroup (cf. (4.6) and Lemma 4.4):

\[ P_t f(x) = \mathbb{E} f(X_t + x) \implies P_t f(x) = \vartheta_x (P_t f)(0) = P_t(\vartheta_x f)(0) \]

where $\vartheta_x u(y) = u(y + x)$ is the shift operator. This property is obviously inherited by the generator, i.e.

\[ A f(x) = \vartheta_x (A f)(0) = A(\vartheta_x f)(0), \quad f \in D(A). \]

As a matter of fact, the converse is also true: If $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$ is a linear operator satisfying $\vartheta_x (L f) = L(\vartheta_x f)$, then $L f = f * \lambda$ where $\lambda$ is a distribution, i.e. a continuous linear functional $\lambda : C_c^\infty(\mathbb{R}^d) \to \mathbb{R}$, cf. Theorem A.10.

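Theorem 6.8. Let $(X_t)_{t\ge 0}$ be a Lévy process with generator $A$. Then

\[ A f(x) = l \cdot \nabla f(x) + \frac12 \nabla \cdot Q \nabla f(x) + \int_{y\ne 0} \Big( f(x+y) - f(x) - \nabla f(x) \cdot y\, 1_{(0,1)}(|y|) \Big)\, \nu(dy) \tag{6.8} \]

for any $f \in C_c^\infty(\mathbb{R}^d)$, where $l \in \mathbb{R}^d$, $Q \in \mathbb{R}^{d\times d}$ is a positive semidefinite matrix, and $\nu$ is a measure on $\mathbb{R}^d \setminus \{0\}$ such that $\int_{y\ne 0} \min\{1, |y|^2\}\, \nu(dy) < \infty$. Equivalently, $A$ is a pseudo differential operator

\[ A u(x) = -\psi(D) u(x) = -\int \hat u(\xi)\, \psi(\xi)\, e^{ix\cdot\xi}\, d\xi, \quad u \in C_c^\infty(\mathbb{R}^d), \tag{6.9} \]

whose symbol is the characteristic exponent $-\psi$ of the Lévy process. It is given by the Lévy–Khintchine formula

\[ \psi(\xi) = -i l \cdot \xi + \frac12 \xi \cdot Q \xi + \int_{y\ne 0} \Big( 1 - e^{iy\cdot\xi} + i \xi \cdot y\, 1_{(0,1)}(|y|) \Big)\, \nu(dy) \tag{6.10} \]

where the triplet $(l, Q, \nu)$ is as above.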
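To get a feeling for (6.10), one can evaluate it for a Lévy measure with finitely many atoms. The following sketch is restricted to $d = 1$ and assumes a hypothetical triplet (drift, variance and atoms are illustrative values). It checks $\psi(0) = 0$, $\operatorname{Re}\psi \ge 0$, and the scaling limit $\psi(n\xi)/n^2 \to \tfrac12 q \xi^2$ (cf. Remark 6.12), which isolates the Gaussian part.

```python
import cmath

def levy_khintchine(xi, drift, variance, jumps):
    """psi(xi) from (6.10) in d = 1 for nu = sum_j mass_j * delta_{y_j}.

    jumps: list of pairs (y_j, mass_j) with y_j != 0 (hypothetical discrete Levy measure).
    """
    value = -1j * drift * xi + 0.5 * variance * xi * xi
    for y, mass in jumps:
        # The compensating term i*xi*y is only present for small jumps |y| < 1.
        compensator = 1j * xi * y if abs(y) < 1 else 0.0
        value += mass * (1 - cmath.exp(1j * y * xi) + compensator)
    return value

drift, variance = 0.3, 2.0
jumps = [(0.5, 1.0), (-2.0, 0.4)]   # one small and one large jump size

psi_zero = levy_khintchine(0.0, drift, variance, jumps)
min_real = min(levy_khintchine(0.1 * k, drift, variance, jumps).real for k in range(-200, 201))

xi, n = 1.3, 10**4
scaled = levy_khintchine(n * xi, drift, variance, jumps) / n**2
print(psi_zero, min_real, scaled)
```

The scaled value differs from $\tfrac12 q \xi^2$ by a term of order $1/n$, since drift and jump part grow at most linearly in $n$ when the Lévy measure is finite.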

I learned the following proof from Francis Hirsch; it is based on arguments by Courrège [13] and Herz [20]. The presentation below follows the version in Böttcher, Schilling & Wang [9, Section 2.3].

Proof. The proof is divided into several steps.

1° We have seen in Lemma 6.3 that $C_c^\infty(\mathbb{R}^d) \subset D(A)$.

2° Set $A_0 f := (A f)(0)$ for $f \in C_c^\infty(\mathbb{R}^d)$. This is a linear functional on $C_c^\infty$. Observe that

\[ f \in C_c^\infty(\mathbb{R}^d),\ f \ge 0,\ f(0) = 0 \overset{\text{(PMP)}}{\implies} A_0 f \ge 0. \]

3° By 2°, $f \mapsto A_{00} f := A_0(|\cdot|^2 \cdot f)$ is a positive linear functional on $C_c^\infty(\mathbb{R}^d)$. Therefore it is bounded. Indeed, let $f \in C_c^\infty(K)$ for a compact set $K \subset \mathbb{R}^d$ and let $\phi \in C_c^\infty(\mathbb{R}^d)$ be a cut-off function such that $1_K \le \phi \le 1$. Then

\[ \|f\|_\infty \phi \pm f \ge 0. \]

By linearity and positivity $\|f\|_\infty A_{00}\phi \pm A_{00} f \ge 0$ which shows $|A_{00} f| \le C_K \|f\|_\infty$ with the constant $C_K = A_{00}\phi$. By Riesz’ representation theorem, there exists a Radon measure³ $\mu$ such that

\[ A_0(|\cdot|^2 f) = \int f(y)\, \mu(dy) = \int |y|^2 f(y)\, \frac{\mu(dy)}{|y|^2} = \int |y|^2 f(y)\, \nu(dy), \quad \nu(dy) := \frac{\mu(dy)}{|y|^2}. \]

This implies that

\[ A_0 f_0 = \int_{y\ne 0} f_0(y)\, \nu(dy) \quad \text{for all } f_0 \in C_c^\infty(\mathbb{R}^d \setminus \{0\}); \]

since any compact subset of $\mathbb{R}^d \setminus \{0\}$ is contained in an annulus $B_R(0) \setminus B_\varepsilon(0)$, we have $\operatorname{supp} f_0 \cap B_\varepsilon(0) = \emptyset$ for some sufficiently small $\varepsilon > 0$. The measure $\nu$ is a Radon measure on $\mathbb{R}^d \setminus \{0\}$.

4° Let $f, g \in C_c^\infty(\mathbb{R}^d)$, $0 \le f, g \le 1$, $\operatorname{supp} f \subset B_1(0)$, $\operatorname{supp} g \subset B_1(0)^c$ and $f(0) = 1$. From

\[ \sup_{y\in\mathbb{R}^d} \big( \|g\|_\infty f(y) + g(y) \big) = \|g\|_\infty = \|g\|_\infty f(0) + g(0) \]

and (PMP), it follows that $A_0(\|g\|_\infty f + g) \le 0$. Consequently,

\[ A_0 g \le -\|g\|_\infty A_0 f. \]

If $g \uparrow 1 - 1_{B_1(0)}$, then this shows

\[ \int_{|y|>1} \nu(dy) \le -A_0 f < \infty. \]

Hence, $\int_{y\ne 0} (|y|^2 \wedge 1)\, \nu(dy) < \infty$.

5° Let $f \in C_c^\infty(\mathbb{R}^d)$ and $\phi(y) = 1_{(0,1)}(|y|)$. Define

\[ S_0 f := \int_{y\ne 0} \Big( f(y) - f(0) - y \cdot \nabla f(0)\, \phi(y) \Big)\, \nu(dy). \tag{6.11} \]

By Taylor’s formula, there is some $\theta \in (0,1)$ such that

\[ f(y) - f(0) - y \cdot \nabla f(0)\, \phi(y) = \frac12 \sum_{k,l=1}^d \frac{\partial^2 f(\theta y)}{\partial x_k \partial x_l}\, y_k y_l. \]

Using the elementary inequality $2 y_k y_l \le y_k^2 + y_l^2 \le |y|^2$, we obtain

\[ |f(y) - f(0) - y \cdot \nabla f(0)\, \phi(y)| \le \begin{cases} \frac14 \sum_{k,l=1}^d \left\| \frac{\partial^2 f}{\partial x_k \partial x_l} \right\|_\infty |y|^2, & |y| < 1, \\ 2 \|f\|_\infty, & |y| \ge 1, \end{cases} \;\le\; 2 \|f\|_{(2)} (|y|^2 \wedge 1). \]

³ A Radon measure on a topological space $E$ is a Borel measure which is finite on compact subsets of $E$ and regular: for all open sets $\mu(U) = \sup_{K \subset U} \mu(K)$ and for all Borel sets $\mu(B) = \inf_{U \supset B} \mu(U)$ ($K$, $U$ are generic compact and open sets, respectively).

This means that S0 defines a distribution (generalized function) of order 2.

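6° Set $L_0 := A_0 - S_0$. The steps 2° and 5° show that $A_0$ is a distribution of order 2. Moreover,

\[ L_0 f_0 = \int_{y\ne 0} \Big( f_0(0) + y \cdot \nabla f_0(0)\, \phi(y) \Big)\, \nu(dy) = 0 \]

for any $f_0 \in C_c^\infty(\mathbb{R}^d)$ with $f_0|_{B_\varepsilon(0)} = 0$ for some $\varepsilon > 0$. Hence, $\operatorname{supp}(L_0) \subset \{0\}$.

Let us show that $L_0$ is almost positive (also: ‘fast positiv’, ‘presque positif’):

\[ f_0 \in C_c^\infty(\mathbb{R}^d),\ f_0(0) = 0,\ f_0 \ge 0 \implies L_0 f_0 \ge 0. \tag{PP} \]

Indeed: Pick $0 \le \phi_n \in C_c^\infty(\mathbb{R}^d \setminus \{0\})$, $\phi_n \uparrow 1_{\mathbb{R}^d \setminus \{0\}}$ and let $f_0$ be as in (PP). Then

\[ \begin{aligned} L_0 f_0 &= L_0[(1 - \phi_n) f_0] && (\operatorname{supp} L_0 \subset \{0\}) \\ &= A_0[(1 - \phi_n) f_0] - S_0[(1 - \phi_n) f_0] \\ &= A_0[(1 - \phi_n) f_0] - \int_{y\ne 0} (1 - \phi_n(y)) f_0(y)\, \nu(dy) && (f_0(0) = 0,\ \nabla f_0(0) = 0) \\ &\ge -\int_{y\ne 0} (1 - \phi_n(y)) f_0(y)\, \nu(dy) \xrightarrow[n\to\infty]{} 0 && (\text{by } 2^\circ \text{ and (PMP)}) \end{aligned} \]

by the monotone convergence theorem.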

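7° As in 5° we find with Taylor’s formula for $f \in C_c^\infty(\mathbb{R}^d)$, $\operatorname{supp} f \subset K$ and $\phi \in C_c^\infty(\mathbb{R}^d)$ satisfying $1_K \le \phi \le 1$

\[ \big( f(y) - f(0) - \nabla f(0) \cdot y \big) \phi(y) \le 2 \|f\|_{(2)} |y|^2 \phi(y). \]

(As usual, $\|f\|_{(2)} = \sum_{0\le|\alpha|\le 2} \|\partial^\alpha f\|_\infty$.) Therefore,

\[ 2 \|f\|_{(2)} |y|^2 \phi(y) + f(0) \phi(y) + \nabla f(0) \cdot y\, \phi(y) - f(y) \ge 0, \]

and (PP) implies

\[ L_0 f \le f(0) L_0 \phi + |\nabla f(0)|\, L_0(|\cdot| \phi) + 2 \|f\|_{(2)} L_0(|\cdot|^2 \phi) \le C_K \|f\|_{(2)}. \]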

8° We have seen in 6° that $L_0$ is of order 2 and $\operatorname{supp} L_0 \subset \{0\}$. Therefore,

\[ L_0 f = \frac12 \sum_{k,l=1}^d q_{kl} \frac{\partial^2 f(0)}{\partial x_k \partial x_l} + \sum_{k=1}^d l_k \frac{\partial f(0)}{\partial x_k} - c f(0). \tag{6.12} \]

We will show that $(q_{kl})_{k,l}$ is positive semidefinite. Set $g(y) := (y \cdot \xi)^2 f(y)$ where $f \in C_c^\infty(\mathbb{R}^d)$ is such that $1_{B_1(0)} \le f \le 1$. By (PP), $L_0 g \ge 0$. It is not difficult to see that this implies

\[ \sum_{k,l=1}^d q_{kl}\, \xi_k \xi_l \ge 0, \quad \text{for all } \xi = (\xi_1, \dots, \xi_d) \in \mathbb{R}^d. \]

9° Since Lévy processes and their semigroups are invariant under translations, cf. Remark 6.7, we get $A f(x) = A_0[f(x + \cdot)]$. If we replace $f$ by $f(x + \cdot)$, we get

\[ \begin{aligned} A f(x) = {} & c f(x) + l \cdot \nabla f(x) + \frac12 \nabla \cdot Q \nabla f(x) \\ & + \int_{y\ne 0} \Big( f(x+y) - f(x) - y \cdot \nabla f(x)\, 1_{(0,1)}(|y|) \Big)\, \nu(dy). \end{aligned} \tag{6.8'} \]

We will show in the next step that $c = 0$.

10° So far, we have seen in 5°, 7° and 9° that

\[ \|A f\|_\infty \le C \|f\|_{(2)} = C \sum_{|\alpha|\le 2} \|\partial^\alpha f\|_\infty, \quad f \in C_c^\infty(\mathbb{R}^d), \]

which means that $A$ (has an extension which) is continuous as an operator from $C_b^2(\mathbb{R}^d)$ to $C_b(\mathbb{R}^d)$. Therefore, $(A, C_c^\infty(\mathbb{R}^d))$ is a pseudo differential operator with symbol

\[ -\psi(\xi) = e_{-\xi}(x)\, A_x e_\xi(x), \quad e_\xi(x) = e^{i\xi\cdot x}. \]

Inserting $e_\xi$ into (6.8') proves (6.10) and, as $\psi(0) = 0$, $c = 0$.

Remark 6.9. In step 8° of the proof of Theorem 6.8 one can use the (PMP) to show that the coefficient $c$ appearing in (6.8') is positive. For this, let $(f_n)_{n\in\mathbb{N}} \subset C_c^\infty(\mathbb{R}^d)$, $f_n \uparrow 1$ and $f_n|_{B_1(0)} = 1$. By (PMP), $A_0 f_n \le 0$. Moreover, $\nabla f_n(0) = 0$ and, therefore,

\[ S_0 f_n = -\int (1 - f_n(y))\, \nu(dy) \xrightarrow[n\to\infty]{} 0. \]

Consequently,

\[ \limsup_{n\to\infty} L_0 f_n = \limsup_{n\to\infty} (A_0 f_n - S_0 f_n) \le 0 \implies c \ge 0. \]

For Lévy processes we have $c = \psi(0) = 0$ and this is a consequence of the infinite life-time of the process:

\[ \mathbb{P}(X_t \in \mathbb{R}^d) = P_t 1 = 1 \quad \text{for all } t \ge 0, \]

and we can use the formula $P_t f - f = A \int_0^t P_s f\, ds$, cf. Lemma 5.4, for $f \equiv 1$ to show that

\[ c = A 1 = 0 \iff P_t 1 = 1. \]

Definition 6.10. A Lévy measure is a Radon measure $\nu$ on $\mathbb{R}^d \setminus \{0\}$ s.t. $\int_{y\ne 0} (|y|^2 \wedge 1)\, \nu(dy) < \infty$. A Lévy triplet is a triplet $(l, Q, \nu)$ consisting of a vector $l \in \mathbb{R}^d$, a positive semi-definite matrix $Q \in \mathbb{R}^{d\times d}$ and a Lévy measure $\nu$.

The proof of Theorem 6.8 incidentally shows that the Lévy triplet defining the exponent (6.10) or the generator (6.8) is unique. The following corollary can easily be checked using the representation (6.8).

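Corollary 6.11. Let $A$ be the generator and $(P_t)_{t\ge 0}$ the semigroup of a Lévy process. Then the Lévy triplet is given by

\[ \begin{aligned} \int f_0\, d\nu &= A f_0(0) = \lim_{t\to 0} \frac{P_t f_0(0)}{t} \quad \forall f_0 \in C_c^\infty(\mathbb{R}^d \setminus \{0\}), \\ l_k &= A \phi_k(0) - \int y_k \big( \phi(y) - 1_{(0,1)}(|y|) \big)\, \nu(dy), \quad k = 1, \dots, d, \\ q_{kl} &= A(\phi_k \phi_l)(0) - \int_{y\ne 0} \phi_k(y) \phi_l(y)\, \nu(dy), \quad k, l = 1, \dots, d, \end{aligned} \tag{6.13} \]

where $\phi \in C_c^\infty(\mathbb{R}^d)$ satisfies $1_{B_1(0)} \le \phi \le 1$ and $\phi_k(y) := y_k \phi(y)$. In particular, $(l, Q, \nu)$ is uniquely determined by $A$ or the characteristic exponent $\psi$.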

We will see an alternative uniqueness proof in the next Chapter 7.

Remark 6.12. Setting $p_t(dy) = \mathbb{P}(X_t \in dy)$, we can recast the formula for the Lévy measure as

\[ \nu(dy) = \lim_{t\to 0} \frac{p_t(dy)}{t} \quad \text{(vague limit of measures on the set } \mathbb{R}^d \setminus \{0\}\text{)}. \]

Moreover, a direct calculation using the Lévy–Khintchine formula (6.10) gives the following alternative representation for the $q_{kl}$:

\[ \frac12 \xi \cdot Q \xi = \lim_{n\to\infty} \frac{\psi(n\xi)}{n^2}, \quad \xi \in \mathbb{R}^d. \]

7. Construction of Lévy processes

Our starting point is now the Lévy–Khintchine formula for the characteristic exponent $\psi$ of a Lévy process

\[ \psi(\xi) = -i l \cdot \xi + \frac12 \xi \cdot Q \xi + \int_{y\ne 0} \Big( 1 - e^{iy\cdot\xi} + i \xi \cdot y\, 1_{(0,1)}(|y|) \Big)\, \nu(dy) \tag{7.1} \]

where $(l, Q, \nu)$ is a Lévy triplet in the sense of Definition 6.10; a proof of (7.1) is contained in Theorem 6.8, but the exposition below is independent of this proof, see however Remark 7.7 at the end of this chapter. What will be needed is that a compound Poisson process is a Lévy process with càdlàg paths and characteristic exponent of the form

\[ \phi(\xi) = \int_{y\ne 0} \big( 1 - e^{iy\cdot\xi} \big)\, \rho(dy) \tag{7.2} \]

($\rho$ is any finite measure), see Example 3.2.d), where $\rho(dy) = \lambda \cdot \mu(dy)$.

Let $\nu$ be a Lévy measure and denote by $A_{a,b} = \{ y : a \le |y| < b \}$ an annulus. Since we have $\int_{y\ne 0} (|y|^2 \wedge 1)\, \nu(dy) < \infty$, the measure $\rho(B) := \nu(B \cap A_{a,b})$ is a finite measure, and there is a corresponding compound Poisson process. Adding a drift with $l = -\int y\, \rho(dy)$ shows that for every exponent

\[ \psi^{a,b}(\xi) = \int_{a\le|y|<b} \big( 1 - e^{iy\cdot\xi} + i y \cdot \xi \big)\, \nu(dy) \tag{7.3} \]

there is a Lévy process $X^{a,b} = (X_t^{a,b})_{t\ge 0}$. In fact,

Lemma 7.1. Let $0 < a < b \le \infty$ and $\psi^{a,b}$ given by (7.3). Then the corresponding Lévy process $X^{a,b}$ is an $L^2(\mathbb{P})$-martingale with càdlàg paths such that

\[ \mathbb{E}\big[ X_t^{a,b} \big] = 0 \quad \text{and} \quad \mathbb{E}\big[ X_t^{a,b} \cdot (X_t^{a,b})^\top \big] = \left( t \int_{a\le|y|<b} y_k y_l\, \nu(dy) \right)_{k,l}. \]

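Proof. Set $X_t := X_t^{a,b}$, $\psi := \psi^{a,b}$ and $F_t := \sigma(X_r, r \le t)$. Using the differentiation lemma for parameter-dependent integrals we see that $\psi$ is twice continuously differentiable and

\[ \frac{\partial \psi(0)}{\partial \xi_k} = 0 \quad \text{and} \quad \frac{\partial^2 \psi(0)}{\partial \xi_k \partial \xi_l} = \int_{a\le|y|<b} y_k y_l\, \nu(dy). \]

Hence the characteristic function $e^{-t\psi}$ is twice continuously differentiable, $X_t$ has first and second moments, and these are obtained by differentiation at $\xi = 0$:

\[ \mathbb{E} X_t^{(k)} = \frac1i \frac{\partial}{\partial \xi_k} \mathbb{E} e^{i\xi\cdot X_t} \Big|_{\xi=0} = i t\, \frac{\partial \psi(0)}{\partial \xi_k} = 0 \]

and

\[ \begin{aligned} \mathbb{E}\big( X_t^{(k)} X_t^{(l)} \big) = -\frac{\partial^2}{\partial \xi_k \partial \xi_l} \mathbb{E} e^{i\xi\cdot X_t} \Big|_{\xi=0} &= t\, \frac{\partial^2 \psi(0)}{\partial \xi_k \partial \xi_l} - t\, \frac{\partial \psi(0)}{\partial \xi_k}\, t\, \frac{\partial \psi(0)}{\partial \xi_l} \\ &= t \int_{a\le|y|<b} y_k y_l\, \nu(dy). \end{aligned} \]

The martingale property now follows from the independence of the increments: Let $s \le t$, then

\[ \mathbb{E}(X_t \mid F_s) = \mathbb{E}(X_t - X_s + X_s \mid F_s) = \mathbb{E}(X_t - X_s \mid F_s) + X_s \overset{\text{(L2), (L1)}}{=} \mathbb{E}(X_{t-s}) + X_s = X_s. \]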
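The two moment formulas of Lemma 7.1 can be checked by simulation. The sketch below is a Monte Carlo experiment in $d = 1$, assuming a hypothetical Lévy measure with three atoms inside an annulus (sizes, masses, seed and sample size are illustrative); the compound Poisson sum is built from exponential waiting times and then compensated by its mean.

```python
import random

def sample_compensated_cp(t, jumps, rng):
    """One sample of X_t^{a,b}: compound Poisson sum minus its mean t * int y nu(dy).

    jumps: list of (y_j, mass_j) describing nu = sum_j mass_j * delta_{y_j}
    (a hypothetical discrete Levy measure supported in a <= |y| < b).
    """
    sizes = [y for y, _ in jumps]
    masses = [m for _, m in jumps]
    rate = sum(masses)                        # total mass = jump intensity
    count, clock = 0, rng.expovariate(rate)   # count the jump times in [0, t]
    while clock <= t:
        count += 1
        clock += rng.expovariate(rate)
    total = sum(rng.choices(sizes, weights=masses, k=count))
    return total - t * sum(y * m for y, m in jumps)

rng = random.Random(2014)
jumps = [(0.5, 2.0), (-0.8, 1.0), (1.5, 0.5)]
t, runs = 2.0, 20000
samples = [sample_compensated_cp(t, jumps, rng) for _ in range(runs)]
mean = sum(samples) / runs
variance = sum((x - mean) ** 2 for x in samples) / (runs - 1)
target_variance = t * sum(y * y * m for y, m in jumps)   # t * int y^2 nu(dy)
print(mean, variance, target_variance)
```

The empirical mean should be close to zero and the empirical variance close to `target_variance`, up to Monte Carlo error.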

We will use the processes from Lemma 7.1 as main building blocks for the Lévy process. For this we need some preparations.

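Lemma 7.2. Let $(X_t^k)_{t\ge 0}$, $k = 1, 2$, be Lévy processes with characteristic exponents $\psi_k$. If $X^1 \perp\!\!\!\perp X^2$, then $X := X^1 + X^2$ is a Lévy process with characteristic exponent $\psi = \psi_1 + \psi_2$.

Proof. Set $F_t^k := \sigma(X_s^k, s \le t)$ and $F_t = \sigma(F_t^1, F_t^2)$. Since $X^1 \perp\!\!\!\perp X^2$, we get for $F = F_1 \cap F_2$, $F_k \in F_s^k$,

\[ \begin{aligned} \mathbb{E}\big( e^{i\xi\cdot(X_t - X_s)} 1_F \big) &= \mathbb{E}\big( e^{i\xi\cdot(X_t^1 - X_s^1)} 1_{F_1} \cdot e^{i\xi\cdot(X_t^2 - X_s^2)} 1_{F_2} \big) \\ &= \mathbb{E}\big( e^{i\xi\cdot(X_t^1 - X_s^1)} 1_{F_1} \big)\, \mathbb{E}\big( e^{i\xi\cdot(X_t^2 - X_s^2)} 1_{F_2} \big) \\ &\overset{\text{(L2)}}{=} e^{-(t-s)\psi_1(\xi)}\, \mathbb{P}(F_1) \cdot e^{-(t-s)\psi_2(\xi)}\, \mathbb{P}(F_2) \\ &= e^{-(t-s)(\psi_1(\xi) + \psi_2(\xi))}\, \mathbb{P}(F). \end{aligned} \]

As $\{ F_1 \cap F_2 : F_k \in F_s^k \}$ is a $\cap$-stable generator of $F_s$, we find

\[ \mathbb{E}\big( e^{i\xi\cdot(X_t - X_s)} \,\big|\, F_s \big) = e^{-(t-s)\psi(\xi)}. \]

Observe that $F_s$ could be larger than the canonical filtration $F_s^X$. Therefore, we first condition w.r.t. $\mathbb{E}(\cdots \mid F_s^X)$ and then use Theorem 3.1, to see that $X$ is a Lévy process with characteristic exponent $\psi = \psi_1 + \psi_2$.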

Lemma 7.3. Let $(X^n)_{n\in\mathbb{N}}$ be a sequence of Lévy processes with characteristic exponents $\psi_n$. Assume that $X_t^n \to X_t$ converges in probability for every $t \ge 0$. If

either: the convergence is uniform in probability, i.e.

\[ \forall \varepsilon > 0\ \forall t \ge 0 : \lim_{n\to\infty} \mathbb{P}\Big( \sup_{s\le t} |X_s^n - X_s| > \varepsilon \Big) = 0, \]

or: the limiting process $X$ has càdlàg paths,

then $X$ is a Lévy process with characteristic exponent $\psi := \lim_{n\to\infty} \psi_n$.

Proof. Let $0 = t_0 < t_1 < \dots < t_m$ and $\xi_1, \dots, \xi_m \in \mathbb{R}^d$. Since the $X^n$ are Lévy processes,

\[ \mathbb{E} \exp\left[ i \sum_{k=1}^m \xi_k \cdot (X_{t_k}^n - X_{t_{k-1}}^n) \right] \overset{\text{(L2), (L1)}}{=} \prod_{k=1}^m \mathbb{E} \exp\left[ i \xi_k \cdot X_{t_k - t_{k-1}}^n \right] \]

and, because of convergence in probability, this equality is inherited by the limiting process $X$. This proves that $X$ has independent (L2') and stationary (L1) increments. The condition (L3) follows either from the uniformity of the convergence in probability or the càdlàg property. Thus, $X$ is a Lévy process. From

\[ \lim_{n\to\infty} \mathbb{E} e^{i\xi\cdot X_1^n} = \mathbb{E} e^{i\xi\cdot X_1} \]

we get that the limit $\lim_{n\to\infty} \psi_n = \psi$ exists.

Lemma 7.4 (Notation of Lemma 7.1). Let $(a_n)_{n\in\mathbb{N}}$ be a sequence $a_1 > a_2 > \dots$ decreasing to zero and assume that the processes $(X^{a_{n+1},a_n})_{n\in\mathbb{N}}$ are independent Lévy processes with characteristic exponents $\psi^{a_{n+1},a_n}$. Then $X := \sum_{n=1}^\infty X^{a_{n+1},a_n}$ is a Lévy process with exponent $\psi := \sum_{n=1}^\infty \psi^{a_{n+1},a_n}$ and càdlàg paths. Moreover, $X$ is an $L^2(\mathbb{P})$-martingale.

Proof. Lemmas 7.1, 7.2 show that $X^{a_{n+m},a_n} = \sum_{k=1}^m X^{a_{n+k},a_{n+k-1}}$ is a Lévy process with characteristic exponent $\psi^{a_{n+m},a_n} = \sum_{k=1}^m \psi^{a_{n+k},a_{n+k-1}}$, and $X^{a_{n+m},a_n}$ is an $L^2(\mathbb{P})$-martingale. By Doob’s inequality and Lemma 7.1

\[ \mathbb{E}\Big( \sup_{s\le t} |X_s^{a_{n+m},a_n}|^2 \Big) \le 4\, \mathbb{E}\big( |X_t^{a_{n+m},a_n}|^2 \big) = 4t \int_{a_{n+m}\le|y|<a_n} |y|^2\, \nu(dy) \xrightarrow[m,n\to\infty]{\text{dom. convergence}} 0. \]

Hence, the limit $X = \lim_{n\to\infty} X^{a_n,a_1}$ exists (uniformly in $t$) in $L^2$, i.e. $X$ is an $L^2(\mathbb{P})$-martingale; since the convergence is also uniform in probability, Lemma 7.3 shows that $X$ is a Lévy process with exponent $\psi = \sum_{n=1}^\infty \psi^{a_{n+1},a_n}$. Taking a uniformly convergent subsequence, we also see that the limit inherits the càdlàg property from the approximating Lévy processes $X^{a_n,a_1}$.

We can now prove the main result of this chapter.

Theorem 7.5. Let (l,Q,ν) be a Lévy triplet and ψ be given by (7.1). Then there exists a Lévy process X with càdlàg paths and characteristic exponent ψ.

Proof. Because of Lemma 7.1, 7.2 and 7.4 we can construct X piece by piece.

1° Let $(W_t)_{t\ge 0}$ be a Brownian motion and set

\[ X_t^c := t l + \sqrt{Q}\, W_t \quad \text{and} \quad \psi_c(\xi) := -i l \cdot \xi + \frac12 \xi \cdot Q \xi. \]

2° Write $\mathbb{R}^d \setminus \{0\} = \bigcup_{n=0}^\infty A_n$ (disjoint union) with $A_0 := \{ |y| \ge 1 \}$ and $A_n := \{ \frac{1}{n+1} \le |y| < \frac1n \}$; set $\lambda_n := \nu(A_n)$ and $\mu_n := \nu(\cdot \cap A_n)/\nu(A_n)$.

3° Construct, as in Example 3.2.d), a compound Poisson process comprising the large jumps

\[ X_t^0 := X_t^{1,\infty} \quad \text{and} \quad \psi_0(\xi) := \int_{1\le|y|<\infty} \big( 1 - e^{iy\cdot\xi} \big)\, \nu(dy) \]

and compensated compound Poisson processes taking account of all small jumps

\[ X_t^n := X_t^{a_{n+1},a_n}, \quad a_n := \frac1n \quad \text{and} \quad \psi_n(\xi) := \int_{A_n} \big( 1 - e^{iy\cdot\xi} + i y \cdot \xi \big)\, \nu(dy). \]

We can construct the processes $X^n$ stochastically independent (just choose independent jump time processes and independent iid jump heights when constructing the compound Poisson processes) and independent of the Wiener process $W$.

4° Setting $\psi = \psi_0 + \psi_c + \sum_{n=1}^\infty \psi_n$, Lemma 7.2 and 7.4 prove the theorem. Since all approximating processes have càdlàg paths, this property is inherited by the sums and the limit (Lemma 7.4).
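The construction in steps 1° to 4° suggests a simple simulation scheme when the Lévy measure is finite, so that no limit over $n$ is needed. The sketch below works in $d = 1$ on an equidistant time grid and assumes a hypothetical discrete Lévy measure; `poisson_count` and `simulate_path` are illustrative helper names, not notation from the text. Jumps with $|y| < 1$ are compensated, in line with (7.1).

```python
import math
import random

def poisson_count(mean, rng):
    """Number of arrivals of a rate-1 Poisson process in [0, mean] (exponential waiting times)."""
    count, clock = 0, rng.expovariate(1.0)
    while clock <= mean:
        count += 1
        clock += rng.expovariate(1.0)
    return count

def simulate_path(drift, variance, jumps, horizon, steps, rng):
    """Values X_0, X_dt, ..., X_horizon of a Levy process with triplet (drift, variance, nu).

    jumps: list of (y_j, mass_j) for nu = sum_j mass_j * delta_{y_j} (finite, hypothetical).
    """
    dt = horizon / steps
    compensator = sum(y * m for y, m in jumps if abs(y) < 1)   # int_{|y|<1} y nu(dy)
    path = [0.0]
    for _ in range(steps):
        increment = drift * dt + math.sqrt(variance * dt) * rng.gauss(0.0, 1.0)
        for y, mass in jumps:
            increment += y * poisson_count(mass * dt, rng)     # jumps of size y in this step
        increment -= compensator * dt
        path.append(path[-1] + increment)
    return path

rng = random.Random(7)
path = simulate_path(0.3, 2.0, [(0.5, 1.0), (-2.0, 0.4)], horizon=1.0, steps=200, rng=rng)
pure_drift = simulate_path(0.5, 0.0, [], horizon=2.0, steps=4, rng=rng)
print(len(path), pure_drift)
```

With vanishing variance and no jumps the path reduces to the deterministic drift $t \mapsto tl$, which gives a quick sanity check of the bookkeeping.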

The proof of Theorem 7.5 also implies the following pathwise decomposition of a Lévy process.

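We write $\Delta X_t := X_t - X_{t-}$ for the jump at time $t$. From the construction we know that

\[ \begin{aligned} &\text{(large jumps)} & J_t^{[1,\infty)} &= \sum_{s\le t} \Delta X_s\, 1_{[1,\infty)}(|\Delta X_s|) & (7.4) \\ &\text{(small jumps)} & J_t^{[1/n,1)} &= \sum_{s\le t} \Delta X_s\, 1_{[1/n,1)}(|\Delta X_s|) & (7.5) \\ &\text{(compensated small jumps)} & \tilde J_t^{[1/n,1)} &= J_t^{[1/n,1)} - \mathbb{E} J_t^{[1/n,1)} & (7.6) \\ & & &= \sum_{s\le t} \Delta X_s\, 1_{[1/n,1)}(|\Delta X_s|) - t \int_{\frac1n \le |y| < 1} y\, \nu(dy) \end{aligned} \]

are Lévy processes and $J^{[1,\infty)} \perp\!\!\!\perp \tilde J^{[1/n,1)}$.

Corollary 7.6. Let $\psi$ be a characteristic exponent given by (7.1) and let $X$ be the Lévy process constructed in Theorem 7.5. Then

\[ \begin{aligned} X_t = {} & \sqrt{Q}\, W_t + \lim_{n\to\infty} \left( \sum_{s\le t} \Delta X_s\, 1_{[1/n,1)}(|\Delta X_s|) - t \int_{[1/n,1)} y\, \nu(dy) \right) && =: M_t,\ L^2\text{-martingale} \\ & + t l + \sum_{s\le t} \Delta X_s\, 1_{[1,\infty)}(|\Delta X_s|) && =: A_t,\ \text{bdd. variation} \end{aligned} \]

where $\sqrt{Q}\, W_t$ and $tl$ form the continuous Gaussian part, the two jump sums form the pure jump part, and all appearing processes are independent.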

Proof. The decomposition follows directly from the construction in Theorem 7.5. By Lemma 7.4, $\lim_{n\to\infty} X^{1/n,1}$ is an $L^2(\mathbb{P})$-martingale, and since the (independent!) Wiener process $W$ is also an $L^2(\mathbb{P})$-martingale, so is their sum $M$.

The paths $t \mapsto A_t(\omega)$ are a.s. of bounded variation since, by construction, on any time-interval $[0,t]$ there are $N_t^0(\omega)$ jumps of size $\ge 1$. Since $N_t^0(\omega) < \infty$ a.s., the total variation of $A_t(\omega)$ is less than or equal to $|l| t + \sum_{s\le t} |\Delta X_s|\, 1_{[1,\infty)}(|\Delta X_s|) < \infty$ a.s.

Remark 7.7. A word of caution: Theorem 7.5 associates with any $\psi$ given by the Lévy–Khintchine formula (7.1) a Lévy process. Unless we know that all characteristic exponents are of this form (this was proved in Theorem 6.8), it does not follow that we have constructed all Lévy processes.

On the other hand, Theorem 7.5 shows that the Lévy triplet determining $\psi$ is unique. Indeed, assume that $(l, Q, \nu)$ and $(l', Q', \nu')$ are two Lévy triplets which yield the same exponent $\psi$. Now we can associate, using Theorem 7.5, with each triplet a Lévy process $X$ and $X'$ such that

\[ \mathbb{E} e^{i\xi\cdot X_t} = e^{-t\psi(\xi)} = \mathbb{E} e^{i\xi\cdot X_t'}. \]

Thus, $X \sim X'$ and so these processes have (in law) the same pathwise decomposition, i.e. the same drift, diffusion and jump behaviour. This, however, means that $(l, Q, \nu) = (l', Q', \nu')$.

8. Two special Lévy processes

We will now study the structure of the paths of a Lévy process. We begin with two extreme cases: Lévy processes which only grow by jumps of size 1 and Lévy processes with continuous paths.

Throughout this chapter we assume that all paths $[0,\infty) \ni t \mapsto X_t(\omega)$ are right-continuous with finite left-hand limits (càdlàg). This is a bit stronger than (L3), but it is always possible to construct a càdlàg version of a Lévy process (see the discussion on page 13). This allows us to consider the jumps of the process $X$

\[ \Delta X_t := X_t - X_{t-} = X_t - \lim_{s\uparrow t} X_s. \]

Theorem 8.1. Let $X$ be a one-dimensional Lévy process which moves only by jumps of size 1. Then $X$ is a Poisson process.

Proof. Set $F_t^X := \sigma(X_s, s \le t)$ and let $T_1 = \inf\{ t > 0 : \Delta X_t = 1 \}$ be the time of the first jump. Since $\{ T_1 > t \} = \{ X_t = 0 \} \in F_t^X$, $T_1$ is a stopping time. Let $T_0 = 0$ and $T_k = \inf\{ t > T_{k-1} : \Delta X_t = 1 \}$ be the time of the $k$th jump; this is also a stopping time. By the Markov property ((4.4) and Lemma 4.4),

\[ \begin{aligned} \mathbb{P}(T_1 > s + t) &= \mathbb{P}(T_1 > s,\ T_1 > s + t) \\ &= \mathbb{E}\big( 1_{\{T_1 > s\}}\, \mathbb{P}^{X_s}(T_1 > t) \big) \\ &= \mathbb{E}\big( 1_{\{T_1 > s\}}\, \mathbb{P}^0(T_1 > t) \big) \\ &= \mathbb{P}(T_1 > s)\, \mathbb{P}(T_1 > t) \end{aligned} \]

where we use that $X_s = 0$ if $T_1 > s$ (the process hasn’t yet moved!) and $\mathbb{P} = \mathbb{P}^0$.

Since $t \mapsto \mathbb{P}(T_1 > t)$ is right-continuous, $\mathbb{P}(T_1 > t) = \exp[t \log \mathbb{P}(T_1 > 1)]$ is the unique solution of this functional equation (Theorem A.1). Thus, the sequence of inter-jump times $\sigma_k := T_k - T_{k-1}$, $k \in \mathbb{N}$, is an iid sequence of exponential times. This follows immediately from the strong Markov property (Theorem 4.12) for Lévy processes and the observation that

\[ T_{k+1} - T_k = T_1^Y \quad \text{where} \quad Y = (X_{t+T_k} - X_{T_k})_{t\ge 0} \]

and $T_1^Y$ is the first jump time of the process $Y$.

Obviously, $X_t = \sum_{k=1}^\infty 1_{[0,t]}(T_k)$, $T_k = \sigma_1 + \dots + \sigma_k$, and Example 3.2.c) (and Theorem 3.4) show that $X$ is a Poisson process.
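The representation $X_t = \sum_{k} 1_{[0,t]}(T_k)$ with iid exponential inter-jump times is easy to simulate. The sketch below assumes an intensity and a time horizon chosen for illustration; it checks the functional equation $\mathbb{P}(T_1 > s+t) = \mathbb{P}(T_1 > s)\,\mathbb{P}(T_1 > t)$ for the exponential survival function and compares the empirical mean of $X_t$ with $\lambda t$.

```python
import math
import random

def jump_times(rate, horizon, rng):
    """Jump times T_k = sigma_1 + ... + sigma_k in [0, horizon], sigma_k iid Exp(rate)."""
    times, clock = [], rng.expovariate(rate)
    while clock <= horizon:
        times.append(clock)
        clock += rng.expovariate(rate)
    return times

def poisson_value(times, t):
    """X_t = sum_k 1_{[0,t]}(T_k): the number of jumps up to time t."""
    return sum(1 for T in times if T <= t)

def survival(rate, t):
    """P(T_1 > t) for an exponential first jump time."""
    return math.exp(-rate * t)

rate, t, runs = 1.5, 2.0, 20000
rng = random.Random(8)
counts = [poisson_value(jump_times(rate, t, rng), t) for _ in range(runs)]
mean_count = sum(counts) / runs            # should be close to rate * t
memoryless_gap = abs(survival(rate, 0.4 + 1.1) - survival(rate, 0.4) * survival(rate, 1.1))
print(mean_count, memoryless_gap)
```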

A Lévy process with uniformly bounded jumps admits moments of all orders.

Lemma 8.2. Let $(X_t)_{t\ge 0}$ be a Lévy process such that $|\Delta X_t(\omega)| \le c$ for all $t \ge 0$ and some constant $c > 0$. Then $\mathbb{E}(|X_t|^p) < \infty$ for all $p \ge 0$.

Proof. Let $F_t^X := \sigma(X_s, s \le t)$ and define the stopping times

\[ \tau_0 := 0, \quad \tau_n := \inf\big\{ t > \tau_{n-1} : |X_t - X_{\tau_{n-1}}| \ge c \big\}. \]

Since $X$ has càdlàg paths, $\tau_0 < \tau_1 < \tau_2 < \dots$. Let us show that $\tau_1 < \infty$ a.s. For fixed $t > 0$ and $n \in \mathbb{N}$ we have

\[ \begin{aligned} \mathbb{P}(\tau_1 = \infty) \le \mathbb{P}(\tau_1 > nt) &\le \mathbb{P}\big( |X_{kt} - X_{(k-1)t}| \le 2c,\ \forall k = 1, \dots, n \big) \\ &\overset{\text{(L2')}}{=} \prod_{k=1}^n \mathbb{P}\big( |X_{kt} - X_{(k-1)t}| \le 2c \big) \overset{\text{(L1)}}{=} \mathbb{P}(|X_t| \le 2c)^n. \end{aligned} \]

Letting $n \to \infty$ we see that $\mathbb{P}(\tau_1 = \infty) = 0$ if $\mathbb{P}(|X_t| \le 2c) < 1$ for some $t > 0$. (In the alternative case, we have $\mathbb{P}(|X_t| \le 2c) = 1$ for all $t > 0$ which makes the lemma trivial.)

By the strong Markov property (Theorem 4.12) $\tau_n - \tau_{n-1} \sim \tau_1$ and $\tau_n - \tau_{n-1} \perp\!\!\!\perp F_{\tau_{n-1}}^X$, i.e. $(\tau_n - \tau_{n-1})_{n\in\mathbb{N}}$ is an iid sequence. Therefore,

\[ \mathbb{E} e^{-\tau_n} = \big( \mathbb{E} e^{-\tau_1} \big)^n = q^n \]

for some $q \in [0,1)$. From the very definition of the stopping times we infer

\[ |X_{t\wedge\tau_n}| \le \sum_{k=1}^n |X_{\tau_k} - X_{\tau_{k-1}}| \le \sum_{k=1}^n \Big( \underbrace{|\Delta X_{\tau_k}|}_{\le c} + \underbrace{|X_{\tau_k-} - X_{\tau_{k-1}}|}_{\le c} \Big) \le 2nc. \]

Thus, $|X_t| > 2nc$ implies that $\tau_n < t$, and by Markov’s inequality

\[ \mathbb{P}(|X_t| > 2nc) \le \mathbb{P}(\tau_n < t) \le e^t\, \mathbb{E} e^{-\tau_n} = e^t q^n. \]

Finally,

\[ \begin{aligned} \mathbb{E}|X_t|^p &= \sum_{n=0}^\infty \mathbb{E}\big( |X_t|^p\, 1_{\{2nc < |X_t| \le 2(n+1)c\}} \big) \\ &\le (2c)^p \sum_{n=0}^\infty (n+1)^p\, \mathbb{P}(|X_t| > 2nc) \le (2c)^p e^t \sum_{n=0}^\infty (n+1)^p q^n < \infty. \end{aligned} \]

Recall that a Brownian motion $(W_t)_{t\ge 0}$ on $\mathbb{R}^d$ is a Lévy process such that $W_t$ is a normal random variable with mean 0 and covariance matrix $t\,\mathrm{id}$. We will need Paul Lévy’s characterization of Brownian motion which we state without proof. An elementary proof can be found in [56, Chapter 9.4].

Theorem 8.3 (Lévy). Let $M = (M_t, F_t)$, $M_0 = 0$, be a one-dimensional martingale with continuous sample paths such that $(M_t^2 - t, F_t)_{t\ge 0}$ is also a martingale. Then $M$ is a one-dimensional standard Brownian motion.

Theorem 8.4. Let $(X_t)_{t\ge 0}$ be a Lévy process in $\mathbb{R}^d$ whose sample paths are a.s. continuous. Then $X_t \sim t l + \sqrt{Q}\, W_t$ where $l \in \mathbb{R}^d$, $Q$ is a positive semidefinite symmetric matrix, and $W$ is a standard Brownian motion in $\mathbb{R}^d$.

We will give two proofs of this result.

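Proof (using Theorem 8.3). By Lemma 8.2, the process $(X_t)_{t\ge 0}$ has moments of all orders. Therefore, $M_t := M_t^\xi := \xi \cdot (X_t - \mathbb{E} X_t)$ exists for any $\xi \in \mathbb{R}^d$ and is a martingale for the canonical filtration $F_t := \sigma(X_s, s \le t)$. Indeed, for all $s \le t$

\[ \mathbb{E}(M_t \mid F_s) = \mathbb{E}(M_t - M_s \mid F_s) + M_s \overset{\text{(L2), (L1)}}{=} \mathbb{E} M_{t-s} + M_s = M_s. \]

Moreover

\[ \begin{aligned} \mathbb{E}(M_t^2 - M_s^2 \mid F_s) &= \mathbb{E}\big( (M_t - M_s)^2 + 2 M_s (M_t - M_s) \mid F_s \big) \\ &\overset{\text{(L2)}}{=} \mathbb{E}\big( (M_t - M_s)^2 \big) + 2 M_s\, \mathbb{E}(M_t - M_s) \\ &\overset{\text{Lemma 3.10}}{=} (t - s)\, \mathbb{E} M_1^2 = (t - s)\, \mathbb{V} M_1, \end{aligned} \]

and so $(M_t^2 - t\, \mathbb{V} M_1)_{t\ge 0}$ and $(M_t)_{t\ge 0}$ are martingales with continuous paths.

Now we can use Theorem 8.3 and deduce that $\xi \cdot (X_t - \mathbb{E} X_t)$ is a one-dimensional Brownian motion with variance $\xi \cdot Q \xi$ where $tQ$ is the covariance matrix of $X_t$ (cf. the proof of Lemma 7.1 or Lemma 3.10). Thus, $X_t - \mathbb{E} X_t = \sqrt{Q}\, W_t$ where $W_t$ is a $d$-dimensional standard Brownian motion. Finally, $\mathbb{E} X_t = t\, \mathbb{E} X_1 =: t l$.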

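Standard proof (using the CLT). Fix $\xi \in \mathbb{R}^d$ and set $M(t) := \xi \cdot (X_t - \mathbb{E} X_t)$. Since $X$ has moments of all orders, $M$ is well-defined and it is again a Lévy process. Moreover,

\[ \mathbb{E} M(t) = 0 \quad \text{and} \quad t \sigma^2 = \mathbb{V} M(t) = \mathbb{E}\big[ (\xi \cdot (X_t - \mathbb{E} X_t))^2 \big] = t\, \xi \cdot Q \xi \]

where $Q$ is the covariance matrix of $X$, cf. the proof of Lemma 7.1. We proceed as in the proof of the CLT: Using a Taylor expansion we get

\[ \mathbb{E} e^{iM(t)} = \mathbb{E} e^{i \sum_{k=1}^n [M(tk/n) - M(t(k-1)/n)]} \overset{\text{(L2'), (L1)}}{=} \big( \mathbb{E} e^{iM(t/n)} \big)^n = \Big( 1 - \frac12 \mathbb{E} M^2(t/n) + R_n \Big)^n. \]

The remainder term $R_n$ is estimated by $\frac16 \mathbb{E}|M^3(\frac tn)|$. If we can show that $|R_n| \le \varepsilon \frac tn$ for large $n = n(\varepsilon)$ and any $\varepsilon > 0$, we get because of $\mathbb{E} M^2(\frac tn) = \frac tn \sigma^2$

\[ \mathbb{E} e^{iM(t)} = \lim_{n\to\infty} \Big( 1 - \frac12 (\sigma^2 + 2\varepsilon) \frac tn \Big)^n = e^{-\frac12 (\sigma^2 + 2\varepsilon) t} \xrightarrow[\varepsilon\to 0]{} e^{-\frac12 \sigma^2 t}. \]

This shows that $\xi \cdot (X_t - \mathbb{E} X_t)$ is a centered Gaussian random variable with variance $t\sigma^2$. Since $\mathbb{E} X_t = t\, \mathbb{E} X_1$ we conclude that $X_t$ is Gaussian with mean $tl$ and covariance $tQ$.

We will now estimate $\mathbb{E}|M^3(\frac tn)|$. For every $\varepsilon > 0$ we can use the uniform continuity of $s \mapsto M(s)$ on $[0,t]$ to get

\[ \lim_{n\to\infty} \max_{1\le k\le n} \big| M(\tfrac kn t) - M(\tfrac{k-1}{n} t) \big| = 0. \]

Thus, we have for all $\varepsilon > 0$

\[ \begin{aligned} 1 &= \lim_{n\to\infty} \mathbb{P}\Big( \max_{1\le k\le n} \big| M(\tfrac kn t) - M(\tfrac{k-1}{n} t) \big| \le \varepsilon \Big) \\ &= \lim_{n\to\infty} \mathbb{P}\Big( \bigcap_{k=1}^n \big\{ \big| M(\tfrac kn t) - M(\tfrac{k-1}{n} t) \big| \le \varepsilon \big\} \Big) \\ &\overset{\text{(L2), (L1)}}{=} \lim_{n\to\infty} \prod_{k=1}^n \mathbb{P}\big( |M(\tfrac tn)| \le \varepsilon \big) \\ &= \lim_{n\to\infty} \Big( 1 - \mathbb{P}\big( |M(\tfrac tn)| > \varepsilon \big) \Big)^n \\ &\le \lim_{n\to\infty} e^{-n \mathbb{P}(|M(t/n)| > \varepsilon)} \le 1 \end{aligned} \]

where we use the inequality $1 + x \le e^x$. This proves $\lim_{n\to\infty} n\, \mathbb{P}(|M(t/n)| > \varepsilon) = 0$. Therefore,

\[ \begin{aligned} \mathbb{E}|M^3(\tfrac tn)| &\le \varepsilon\, \mathbb{E} M^2(\tfrac tn) + \int_{|M(t/n)| > \varepsilon} |M^3(\tfrac tn)|\, d\mathbb{P} \\ &\le \varepsilon \frac tn \sigma^2 + \sqrt{\mathbb{P}\big( |M(\tfrac tn)| > \varepsilon \big)}\, \sqrt{\mathbb{E} M^6(\tfrac tn)}. \end{aligned} \]

It is not hard to see that $\mathbb{E} M^6(s) = a_1 s + \dots + a_6 s^6$ (differentiate $\mathbb{E} e^{iuM(s)} = e^{-s\psi(u)}$ six times at $u = 0$), and so

\[ \mathbb{E}|M^3(\tfrac tn)| \le \varepsilon \frac tn \sigma^2 + c\, \frac tn \sqrt{\frac{\mathbb{P}(|M(t/n)| > \varepsilon)}{t/n}} = \frac tn \big( \varepsilon \sigma^2 + o(1) \big). \]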

We close this chapter with Paul Lévy’s construction of a standard Brownian motion $(W_t)_{t\ge 0}$. Since $W$ is a Lévy process which has the Markov property, it is enough to construct a Brownian motion $W(t)$ only for $t \in [0,1]$, then produce independent copies $(W^n(t))_{t\in[0,1]}$, $n = 0, 1, 2, \dots$, and join them continuously:

\[ W_t := \begin{cases} W^0(t), & t \in [0,1), \\ W^0(1) + \dots + W^{n-1}(1) + W^n(t - n), & t \in [n, n+1). \end{cases} \]

Since each $W_t$ is normally distributed with mean 0 and variance $t$, we will get a Lévy process with characteristic exponent $\frac12 \xi^2$, $\xi \in \mathbb{R}$. In the same vein we get a $d$-dimensional Brownian motion by making a vector $(W_t^{(1)}, \dots, W_t^{(d)})_{t\ge 0}$ of $d$ independent copies of $(W_t)_{t\ge 0}$. This yields a Lévy process with exponent $\frac12 (\xi_1^2 + \dots + \xi_d^2)$, $\xi_1, \dots, \xi_d \in \mathbb{R}$.

Denote a one-dimensional normal distribution with mean $m$ and variance $\sigma^2$ as $N(m, \sigma^2)$. The motivation for the construction is the observation that a Brownian motion satisfies the following mid-point law (cf. [56, Chapter 3.4]):

\[ \mathbb{P}\big( W_{(s+t)/2} \in \bullet \mid W_s = x,\ W_t = y \big) = N\big( \tfrac12 (x + y),\ \tfrac14 (t - s) \big), \quad s \le t,\ x, y \in \mathbb{R}. \]

Figure 8.1. Interpolation of order four in Lévy’s construction of Brownian motion.

This can be turned into the following construction method:

Algorithm. Set $W(0) = 0$ and let $W(1) \sim N(0,1)$. Let $n \ge 1$ and assume that the random variables $W(k 2^{-n})$, $k = 1, \dots, 2^n - 1$ have already been constructed. Then

\[ W(l 2^{-n-1}) := \begin{cases} W(k 2^{-n}), & l = 2k, \\ \frac12 \big( W(k 2^{-n}) + W((k+1) 2^{-n}) \big) + \Gamma_{2^n + k}, & l = 2k + 1, \end{cases} \]

where $\Gamma_{2^n + k}$ is an independent (of everything else) $N(0, 2^{-n}/4)$ Gaussian random variable, cf. Figure 8.1. In-between the nodes we use piecewise linear interpolation:

\[ W_{2^n}(t, \omega) := \text{Linear interpolation of } \big( W(k 2^{-n}, \omega),\ k = 0, 1, \dots, 2^n \big), \quad n \ge 1. \]

At the dyadic points $t = k 2^{-j}$ we get the ‘true’ value of $W(t, \omega)$, while the linear interpolation is an approximation, see Figure 8.1.
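The refinement step of the algorithm translates almost literally into code. The following sketch assumes the recursion is started from the two nodes $W(0)$ and $W(1)$ (refinement level $n = 0$) and uses a seeded generator for reproducibility; `levy_construction` and `interpolate` are illustrative names.

```python
import math
import random

def levy_construction(levels, rng):
    """Return [W(k 2^-levels), k = 0..2^levels] obtained by repeated midpoint refinement."""
    values = [0.0, rng.gauss(0.0, 1.0)]            # W(0) = 0, W(1) ~ N(0, 1)
    for n in range(levels):
        sd = math.sqrt(2.0 ** (-n) / 4.0)          # Gamma ~ N(0, 2^-n / 4)
        refined = []
        for left, right in zip(values, values[1:]):
            refined.append(left)                                       # even nodes: old value
            refined.append(0.5 * (left + right) + rng.gauss(0.0, sd))  # odd nodes: midpoint + Gamma
        refined.append(values[-1])
        values = refined
    return values

def interpolate(values, t):
    """Piecewise linear interpolation of the nodes at t in [0, 1]."""
    cells = len(values) - 1
    k = min(int(t * cells), cells - 1)
    frac = t * cells - k
    return (1 - frac) * values[k] + frac * values[k + 1]

coarse = levy_construction(2, random.Random(7))
fine = levy_construction(3, random.Random(7))
print(len(coarse), len(fine), fine[::2] == coarse)
```

Running the construction twice with the same seed shows that a further refinement keeps the values at the existing dyadic nodes and only adds new midpoints.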

Theorem 8.5 (Lévy 1940). The series

\[ W(t, \omega) := \sum_{n=0}^\infty \big( W_{2^{n+1}}(t, \omega) - W_{2^n}(t, \omega) \big) + W_1(t, \omega), \quad t \in [0,1], \]

converges a.s. uniformly. In particular $(W(t))_{t\in[0,1]}$ is a one-dimensional Brownian motion.

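Proof. Set $\Delta_n(t, \omega) := W_{2^{n+1}}(t, \omega) - W_{2^n}(t, \omega)$. By construction,

\[ \Delta_n\big( (2k-1) 2^{-n-1}, \omega \big) = \Gamma_{2^n + (k-1)}(\omega), \quad k = 1, 2, \dots, 2^n, \]

are iid $N(0, 2^{-(n+2)})$ distributed random variables. Therefore,

\[ \mathbb{P}\Big( \max_{1\le k\le 2^n} \big| \Delta_n\big( (2k-1) 2^{-n-1} \big) \big| > \frac{x_n}{\sqrt{2^{n+2}}} \Big) \le 2^n\, \mathbb{P}\Big( \big| \sqrt{2^{n+2}}\, \Delta_n\big( 2^{-n-1} \big) \big| > x_n \Big), \]

and the right-hand side equals

\[ \frac{2 \cdot 2^n}{\sqrt{2\pi}} \int_{x_n}^\infty e^{-r^2/2}\, dr \le \frac{2^{n+1}}{\sqrt{2\pi}} \int_{x_n}^\infty \frac{r}{x_n}\, e^{-r^2/2}\, dr = \frac{2^{n+1}}{x_n \sqrt{2\pi}}\, e^{-x_n^2/2}. \]

Choose $c > 1$ and $x_n := c \sqrt{2n \log 2}$. Then

\[ \begin{aligned} \sum_{n=1}^\infty \mathbb{P}\Big( \max_{1\le k\le 2^n} \big| \Delta_n\big( (2k-1) 2^{-n-1} \big) \big| > \frac{x_n}{\sqrt{2^{n+2}}} \Big) &\le \sum_{n=1}^\infty \frac{2^{n+1}}{c \sqrt{2\pi}}\, e^{-c^2 \log 2^n} \\ &= \frac{2}{c \sqrt{2\pi}} \sum_{n=1}^\infty 2^{-(c^2 - 1) n} < \infty. \end{aligned} \]

Using the Borel–Cantelli lemma we find a set $\Omega_0 \subset \Omega$ with $\mathbb{P}(\Omega_0) = 1$ such that for every $\omega \in \Omega_0$ there is some $N(\omega) \ge 1$ with

\[ \max_{1\le k\le 2^n} \big| \Delta_n\big( (2k-1) 2^{-n-1} \big) \big| \le c \sqrt{\frac{n \log 2}{2^{n+1}}} \quad \text{for all } n \ge N(\omega). \]

$\Delta_n(t)$ is the distance between the polygonal arcs $W_{2^{n+1}}(t)$ and $W_{2^n}(t)$; the maximum is attained at one of the midpoints of the intervals $[(k-1) 2^{-n}, k 2^{-n}]$, $k = 1, \dots, 2^n$, see Figure 8.1. Thus,

\[ \sup_{0\le t\le 1} \big| W_{2^{n+1}}(t, \omega) - W_{2^n}(t, \omega) \big| \le \max_{1\le k\le 2^n} \big| \Delta_n\big( (2k-1) 2^{-n-1}, \omega \big) \big| \le c \sqrt{\frac{n \log 2}{2^{n+1}}}, \]

for all $n \ge N(\omega)$ which means that the limit

\[ W(t, \omega) := \lim_{N\to\infty} W_{2^N}(t, \omega) = \sum_{n=0}^\infty \big( W_{2^{n+1}}(t, \omega) - W_{2^n}(t, \omega) \big) + W_1(t, \omega) \]

exists for all $\omega \in \Omega_0$ uniformly in $t \in [0,1]$. Therefore, $t \mapsto W(t, \omega)$, $\omega \in \Omega_0$, inherits the continuity of the polygonal arcs $t \mapsto W_{2^n}(t, \omega)$. Set

\[ \tilde W(t, \omega) := W(t, \omega)\, 1_{\Omega_0}(\omega). \]

By construction, we find for all $0 \le k \le l \le 2^n$

\[ \begin{aligned} \tilde W(l 2^{-n}) - \tilde W(k 2^{-n}) &= W_{2^n}(l 2^{-n}) - W_{2^n}(k 2^{-n}) \\ &= \sum_{j=k+1}^l \big( W_{2^n}(j 2^{-n}) - W_{2^n}((j-1) 2^{-n}) \big) \\ &\overset{\text{iid}}{\sim} N(0, (l - k) 2^{-n}). \end{aligned} \]

Since $t \mapsto \tilde W(t)$ is continuous and the dyadic numbers are dense in $[0,1]$, we conclude that the increments $\tilde W(t_k) - \tilde W(t_{k-1})$, $0 = t_0 < t_1 < \dots < t_N \le 1$ are independent $N(0, t_k - t_{k-1})$ distributed random variables. This shows that $(\tilde W(t))_{t\in[0,1]}$ is a Brownian motion.

9. Random measures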

We continue our investigations of the paths of càdlàg Lévy processes. Independently of Chapters 5 and 6 we will show in Theorem 9.12 that the processes constructed in Theorem 7.5 are indeed all Lévy processes; this gives also a new proof of the Lévy–Khintchine formula, cf. Corollary 9.13.

As before, we denote the jumps of (Xt )t>0 by

∆Xt := Xt − Xt− = Xt − limXs. s↑t Definition 9.1. Let X be a Lévy process. The counting measure

d Nt (B,ω) := #{s ∈ (0,t] : ∆Xs(ω) ∈ B}, B ∈ B(R \{0}) (9.1) is called the jump measure of the process X.

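Since a càdlàg function $x : [0,\infty)\to\mathbb{R}^d$ has on any compact interval $[a,b]$ at most finitely many jumps $|\Delta x_t| > \varepsilon$ exceeding a fixed size, we see that
\[ N_t(B,\omega) < \infty \quad \forall t > 0,\ B\in\mathcal{B}(\mathbb{R}^d) \text{ such that } 0\notin\overline{B}. \]

Notice that $0\notin\overline{B}$ is equivalent to $B_\varepsilon(0)\cap B = \emptyset$ for some $\varepsilon > 0$. Thus, $B\mapsto N_t(B,\omega)$ is for every $\omega$ a locally finite Borel measure on $\mathbb{R}^d\setminus\{0\}$.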

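Definition 9.2. Let $N_t(B)$ be the jump measure of the Lévy process $X$. For every Borel function $f : \mathbb{R}^d\to\mathbb{R}$ with $0\notin\operatorname{supp} f$ we define
\[ N_t(f,\omega) := \int f(y)\,N_t(dy,\omega). \tag{9.2} \]

Since $0\notin\operatorname{supp} f$, it is clear that $N_t(\operatorname{supp} f,\omega) < \infty$, and for every $\omega$
\[ N_t(f,\omega) = \sum_{0<s\le t} f(\Delta X_s(\omega)) = \sum_{n=1}^\infty f(\Delta X_{\tau_n}(\omega))\,\mathbf{1}_{(0,t]}(\tau_n(\omega)). \tag{9.2$'$} \]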

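Both sums are finite sums, extending only over those $s$ where $\Delta X_s(\omega)\ne 0$. This is obvious in the second sum where $\tau_1(\omega),\tau_2(\omega),\tau_3(\omega),\dots$ are the jump times of $X$.

Lemma 9.3. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$, $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$ (i.e. $f$ takes values in $\mathbb{R}^m$), $s<t$ and $t_{k,n} := s+\frac kn(t-s)$. Then
\[ N_t(f,\omega) - N_s(f,\omega) = \sum_{s<u\le t} f\big(\Delta X_u(\omega)\big) = \lim_{n\to\infty}\sum_{k=0}^{n-1} f\big(X_{t_{k+1,n}}(\omega) - X_{t_{k,n}}(\omega)\big). \tag{9.3} \]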


Proof. Throughout the proof $\omega$ is fixed and we will omit it in our notation. Since $0\notin\operatorname{supp} f$, there is some $\varepsilon>0$ such that $B_\varepsilon(0)\cap\operatorname{supp} f=\emptyset$; therefore, we need only consider jumps of size $|\Delta X_t|\ge\varepsilon$. Denote by $J=\{\tau_1,\dots,\tau_N\}$ those jumps. For sufficiently large $n$ we can achieve that

• $\#\big(J\cap(t_{k,n},t_{k+1,n}]\big)\le 1$ for all $k=0,\dots,n-1$;

• $|X_{t_{\kappa+1,n}} - X_{t_{\kappa,n}}| < \varepsilon$ if $\kappa$ is such that $J\cap(t_{\kappa,n},t_{\kappa+1,n}]=\emptyset$.

Indeed: Assume this is not the case, then we could find sequences $s<s_k<t_k\le t$ such that $t_k-s_k\to 0$, $J\cap(s_k,t_k]=\emptyset$ and $|X_{t_k}-X_{s_k}|\ge\varepsilon$. Without loss of generality we may assume that $s_k\uparrow u$ and $t_k\downarrow u$ for some $u\in(s,t]$; $u=s$ can be ruled out because of right-continuity. By the càdlàg property of the paths, $|\Delta X_u|\ge\varepsilon$, i.e. $u\in J$, which is a contradiction.

Since we have $f(X_{t_{\kappa+1,n}} - X_{t_{\kappa,n}}) = 0$ for intervals of the ‘second kind’, only the intervals containing some jump contribute to the (finite!) sum (9.3), and the claim follows.
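The mechanism of Lemma 9.3 can be seen on a piecewise constant path. The sketch below is only an illustration under our own assumptions: the path is a right-continuous pure-jump path given by hypothetical lists of jump times and sizes, and `f` is a continuous function supported in $\{0.25\le|y|\le 3\}$. On a fine grid every jump sits alone in its interval and both sides of (9.3) agree; on a coarse grid jumps cancel inside one interval and the sums differ.

```python
def path_value(jump_times, jump_sizes, t):
    """Value at time t of a right-continuous pure-jump path started at 0."""
    return sum(y for u, y in zip(jump_times, jump_sizes) if u <= t)

def sum_over_jumps(f, jump_times, jump_sizes, s, t):
    """Middle term of (9.3): sum of f(Delta X_u) over jump times u in (s, t]."""
    return sum(f(y) for u, y in zip(jump_times, jump_sizes) if s < u <= t)

def sum_over_increments(f, jump_times, jump_sizes, s, t, n):
    """Right-hand side of (9.3) before the limit: f summed over n grid increments."""
    grid = [s + k * (t - s) / n for k in range(n + 1)]
    values = [path_value(jump_times, jump_sizes, u) for u in grid]
    return sum(f(values[k + 1] - values[k]) for k in range(n))

def f(y):
    """Continuous 'tent' function with support {0.25 <= |y| <= 3}, so 0 is not in supp f."""
    return max(0.0, min(abs(y) - 0.25, 3.0 - abs(y)))

times = [0.3, 0.7, 1.2, 2.5]      # illustrative jump times
sizes = [0.5, -2.0, 1.5, 0.1]     # illustrative jump sizes
exact = sum_over_jumps(f, times, sizes, 0.0, 3.0)
fine = sum_over_increments(f, times, sizes, 0.0, 3.0, 1000)
coarse = sum_over_increments(f, times, sizes, 0.0, 3.0, 2)
```

Here `fine` equals `exact`, while `coarse` does not, because the first three jumps cancel inside the interval $(0, 1.5]$.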

Lemma 9.4. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$.

a) $(N_t(f))_{t\ge 0}$ is a Lévy process on $\mathbb{R}^m$ for all $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$.

b) $(N_t(B))_{t\ge 0}$ is a Poisson process for all $B\in\mathcal{B}(\mathbb{R}^d)$ such that $0\notin\overline{B}$.

c) $\nu(B) := \mathbb{E}N_1(B)$ is a locally finite measure on $\mathbb{R}^d\setminus\{0\}$.

Proof. Set $\mathcal{F}_t := \sigma(X_s,\ s\le t)$.

a) Let $f\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^m)$. From Lemma 9.3 and (L2$'$) we see that $N_t(f)$ is $\mathcal{F}_t$ measurable and $N_t(f)-N_s(f)\perp\!\!\!\perp\mathcal{F}_s$, $s\le t$. Moreover, if $N_t^Y(\cdot)$ denotes the jump measure of the Lévy process $Y=(X_{t+s}-X_s)_{t\ge 0}$, we see that $N_t(f)-N_s(f)=N^Y_{t-s}(f)$. By the Markov property (Theorem 4.6), $X\sim Y$, and we get $N^Y_{t-s}(f)\sim N_{t-s}(f)$. Since $t\mapsto N_t(f)$ is càdlàg, $(N_t(f))_{t\ge 0}$ is a Lévy process.

b) By definition, $N_0(B)=0$ and $t\mapsto N_t(B)$ is càdlàg. Since $X$ is a Lévy process, we see as in the proof of Theorem 8.1 that the jump times
\[ \tau_0 := 0,\quad \tau_1 := \inf\{t>0 : \Delta X_t\in B\},\quad \tau_k := \inf\{t>\tau_{k-1} : \Delta X_t\in B\} \]
satisfy $\tau_1\sim\operatorname{Exp}(\nu(B))$, and the inter-jump times $(\tau_k-\tau_{k-1})_{k\in\mathbb{N}}$ are an iid sequence. The condition $0\notin\overline{B}$ ensures that $N_t(B)<\infty$ a.s., which means that the intensity $\nu(B)$ is finite. Indeed, we have
\[ 1-e^{-t\nu(B)} = \mathbb{P}(\tau_1\le t) = \mathbb{P}(N_t(B)>0) \xrightarrow[t\to 0]{} 0; \]
this shows that $\nu(B)<\infty$. Thus,
\[ N_t(B) = \sum_{k=1}^\infty \mathbf{1}_{(0,t]}(\tau_k) \]
is a Poisson process (Example 3.2) and, in particular, a Lévy process (Theorem 3.4).

c) The intensity of $(N_t(B))_{t\ge 0}$ is $\nu(B)=\mathbb{E}N_1(B)$. By Fubini's theorem it is clear that $\nu$ is a measure.

Definition 9.5. Let Nt (·) be the jump measure of a Lévy process X. The intensity measure is the measure ν(B) := EN1(B) from Lemma 9.4.

We will see in Corollary 9.13 that $\nu$ is the Lévy measure of $(X_t)_{t\ge 0}$ appearing in the Lévy–Khintchine formula.

Lemma 9.6. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure. For every $f\in L^1(\nu)$, $f:\mathbb{R}^d\to\mathbb{R}^m$, the random variable $N_t(f) := \int f(y)\,N_t(dy)$ exists as $L^1$-limit of integrals of simple functions and satisfies
\[ \mathbb{E}N_t(f) = \mathbb{E}\int f(y)\,N_t(dy) = t\int_{y\ne 0} f(y)\,\nu(dy). \tag{9.4} \]

Proof. For any step function $f$ of the form $f(y)=\sum_{k=1}^M\phi_k\mathbf{1}_{B_k}(y)$ with $0\notin\overline{B}_k$ the formula (9.4) follows from $\mathbb{E}N_t(B_k)=t\nu(B_k)$ and the linearity of the integral. Since $\nu$ is defined on $\mathbb{R}^d\setminus\{0\}$, any $f\in L^1(\nu)$ can be approximated by a sequence of step functions $(f_n)_{n\in\mathbb{N}}$ in $L^1(\nu)$-sense, and we get
\[ \mathbb{E}|N_t(f_n)-N_t(f_m)| \le t\int|f_n-f_m|\,d\nu \xrightarrow[m,n\to\infty]{} 0. \]

Because of the completeness of $L^1(\mathbb{P})$, the limit $\lim_{n\to\infty}N_t(f_n)$ exists, and with a routine argument we see that it is independent of the approximating sequence $f_n\to f\in L^1(\nu)$. This allows us to define $N_t(f)$ for $f\in L^1(\nu)$ as $L^1(\mathbb{P})$-limit of stochastic integrals of simple functions; obviously, (9.4) is preserved under this limiting procedure.
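Formula (9.4) is easy to test by simulation in the simplest situation. The sketch below assumes, purely for illustration, a compound Poisson process with a given jump rate whose jump heights are drawn uniformly from a finite set; then $\nu$ is the rate times the uniform distribution on that set. The function names, the seed and the parameters are ours.

```python
import random

def sample_jump_sizes(rng, rate, t, jump_values):
    """Jump sizes in (0, t] of a compound Poisson process with the given rate.

    Heights are drawn uniformly from jump_values (a hypothetical finite set
    avoiding 0), so nu = rate / len(jump_values) * (sum of Dirac masses).
    """
    sizes = []
    s = rng.expovariate(rate)             # first jump time
    while s <= t:
        sizes.append(rng.choice(jump_values))
        s += rng.expovariate(rate)        # iid exponential waiting times
    return sizes

def monte_carlo_mean(f, rate, t, jump_values, n_paths, seed=0):
    """Average of N_t(f) = sum_{s <= t} f(Delta X_s) over n_paths simulated paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += sum(f(y) for y in sample_jump_sizes(rng, rate, t, jump_values))
    return total / n_paths

def exact_mean(f, rate, t, jump_values):
    """Right-hand side of (9.4): t times the integral of f with respect to nu."""
    return t * rate * sum(f(y) for y in jump_values) / len(jump_values)

jump_values = [-2.0, 1.0, 3.0]
square = lambda y: y * y
estimate = monte_carlo_mean(square, rate=1.5, t=2.0, jump_values=jump_values, n_paths=20000)
exact = exact_mean(square, rate=1.5, t=2.0, jump_values=jump_values)
```

The estimate fluctuates around the exact value $t\int f\,d\nu$; the agreement is statistical, not exact.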

Theorem 9.7. Let Nt (·) be the jump measure of a Lévy process X and ν the intensity measure.

a) $N_t(f) := \int f(y)\,N_t(dy)$ is a Lévy process for every $f\in L^1(\nu)$, $f:\mathbb{R}^d\to\mathbb{R}^m$. In particular, $(N_t(B))_{t\ge 0}$ is a Poisson process for every $B\in\mathcal{B}(\mathbb{R}^d)$ such that $0\notin\overline{B}$.

b) $X_t^B := N_t(y\mathbf{1}_B(y))$ and $X_t - X_t^B$ are for every $B\in\mathcal{B}(\mathbb{R}^d)$, $0\notin\overline{B}$, Lévy processes.

Proof. a) Note that $\nu$ is a locally finite measure on $\mathbb{R}^d\setminus\{0\}$. This means that, by standard density results from integration theory, the family $C_c(\mathbb{R}^d\setminus\{0\})$ is dense in $L^1(\nu)$. Fix $f\in L^1(\nu)$ and choose $f_n\in C_c(\mathbb{R}^d\setminus\{0\})$ such that $f_n\to f$ in $L^1(\nu)$. Then, as in Lemma 9.6,
\[ \mathbb{E}|N_t(f)-N_t(f_n)| \le t\int|f-f_n|\,d\nu \xrightarrow[n\to\infty]{} 0. \]
Since $\mathbb{P}(|N_t(f)|>\varepsilon)\le\frac{t}{\varepsilon}\int|f|\,d\nu\to 0$ for every $\varepsilon>0$ as $t\to 0$, the process $N_t(f)$ is continuous in probability. Moreover, it is the limit (in $L^1$, hence in probability) of the Lévy processes $N_t(f_n)$ (Lemma 9.4); therefore it is itself a Lévy process, see Lemma 7.3.

In view of Lemma 9.4.c), the indicator function $\mathbf{1}_B\in L^1(\nu)$ whenever $B$ is a Borel set satisfying $0\notin\overline{B}$. Thus, $N_t(B)=N_t(\mathbf{1}_B)$ is a Lévy process which has only jumps of unit size, i.e. it is by Theorem 8.1 a Poisson process.

b) Set $f(y) := y\mathbf{1}_B(y)$ and $B_n := B\cap B_n(0)$. Then $f_n(y)=y\mathbf{1}_{B_n}(y)$ is bounded and $0\notin\operatorname{supp} f_n$, hence $f_n\in L^1(\nu)$. This means that $N_t(f_n)$ is for every $n\in\mathbb{N}$ a Lévy process. Moreover,
\[ N_t(f_n) = \int_{B_n} y\,N_t(dy) \xrightarrow[n\to\infty]{} \int_B y\,N_t(dy) = N_t(f) \quad\text{a.s.} \]

Since $N_t(f)$ changes its value only by jumps,
\[ \mathbb{P}(|N_t(f)|>\varepsilon) \le \mathbb{P}(X \text{ has at least one jump with size in } B \text{ in } [0,t]) = \mathbb{P}(N_t(B)>0) = 1-e^{-t\nu(B)}, \]
which proves that the process $N_t(f)$ is continuous in probability. Lemma 7.3 shows that $N_t(f)$ is a Lévy process.

Finally, approximate $f(y) := y\mathbf{1}_B(y)$ by a sequence $\phi_l\in C_c(\mathbb{R}^d\setminus\{0\},\mathbb{R}^d)$. Now we can use Lemma 9.3 to get
\[ X_t - N_t(\phi_l) = \lim_{n\to\infty}\sum_{k=0}^{n-1}\Big[(X_{t_{k+1,n}}-X_{t_{k,n}}) - \phi_l(X_{t_{k+1,n}}-X_{t_{k,n}})\Big]. \]

The increments of $X$ are stationary and independent, and so we conclude from the above formula that $X-N(\phi_l)$ has also stationary and independent increments. Since both $X$ and $N(\phi_l)$ are continuous in probability, so is their difference, i.e. $X-N(\phi_l)$ is a Lévy process. Finally,
\[ N_t(\phi_l) \xrightarrow[l\to\infty]{} N_t(f) \quad\text{and}\quad X_t-N_t(\phi_l) \xrightarrow[l\to\infty]{} X_t-N_t(f), \]
and since $X$ and $N(f)$ are continuous in probability, Lemma 7.3 tells us that $X-N(f)$ is a Lévy process.

We will now show that Lévy processes with ‘disjoint jump heights’ are independent. For this we need the following immediate consequence of Theorem 3.1:

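Lemma 9.8 (Exponential martingale). Let $(X_t)_{t\ge 0}$ be a Lévy process. Then
\[ M_t := \frac{e^{i\xi\cdot X_t}}{\mathbb{E}e^{i\xi\cdot X_t}} = e^{i\xi\cdot X_t}\,e^{t\psi(\xi)}, \quad t\ge 0, \]
is a martingale for the filtration $\mathcal{F}_t^X = \sigma(X_s,\ s\le t)$ such that $\sup_{s\le t}|M_s|\le e^{t\operatorname{Re}\psi(\xi)}$.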

Theorem 9.9. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $U,V\in\mathcal{B}(\mathbb{R}^d)$, $0\notin\overline{U}$, $0\notin\overline{V}$ and $U\cap V=\emptyset$. Then the processes
\[ X_t^U := N_t(y\mathbf{1}_U(y)),\quad X_t^V := N_t(y\mathbf{1}_V(y)),\quad X_t - X_t^{U\cup V} \]
are independent Lévy processes in $\mathbb{R}^d$.

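Proof. Set $W := U\cup V$. By Theorem 9.7, $X^U$, $X^V$ and $X-X^W$ are Lévy processes. In fact, a slight variation of that argument even shows that $(X^U,X^V,X-X^W)$ is a Lévy process in $\mathbb{R}^{3d}$. In order to see their independence, fix $s>0$ and define for $t>s$ and $\xi,\eta,\theta\in\mathbb{R}^d$ the processes
\[
C_t := \frac{e^{i\xi\cdot(X_t^U-X_s^U)}}{\mathbb{E}\big[e^{i\xi\cdot(X_t^U-X_s^U)}\big]} - 1,\quad
D_t := \frac{e^{i\eta\cdot(X_t^V-X_s^V)}}{\mathbb{E}\big[e^{i\eta\cdot(X_t^V-X_s^V)}\big]} - 1,\quad
E_t := \frac{e^{i\theta\cdot(X_t-X_t^W-X_s+X_s^W)}}{\mathbb{E}\big[e^{i\theta\cdot(X_t-X_t^W-X_s+X_s^W)}\big]} - 1.
\]

By Lemma 9.8, these processes are bounded martingales satisfying $\mathbb{E}C_t=\mathbb{E}D_t=\mathbb{E}E_t=0$. Set $t_{k,n} = s+\frac kn(t-s)$. Observe that
\[
\begin{aligned}
\mathbb{E}(C_tD_tE_t)
&= \mathbb{E}\Big(\sum_{k,l,m=0}^{n-1}(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{l+1,n}}-D_{t_{l,n}})(E_{t_{m+1,n}}-E_{t_{m,n}})\Big)\\
&= \mathbb{E}\Big(\sum_{k=0}^{n-1}(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{k+1,n}}-D_{t_{k,n}})(E_{t_{k+1,n}}-E_{t_{k,n}})\Big).
\end{aligned}
\]

In the second equality we use that martingale increments $C_t-C_s$, $D_t-D_s$, $E_t-E_s$ are independent of $\mathcal{F}_s^X$, and by the tower property
\[ \mathbb{E}\big[(C_{t_{k+1,n}}-C_{t_{k,n}})(D_{t_{l+1,n}}-D_{t_{l,n}})(E_{t_{m+1,n}}-E_{t_{m,n}})\big] = 0 \quad\text{unless } k=l=m. \]

An argument along the lines of Lemma 9.3 gives
\[ \mathbb{E}(C_tD_tE_t) = \mathbb{E}\Big(\sum_{s<u\le t}\Delta C_u\,\Delta D_u\,\Delta E_u\Big) = 0. \]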

Since all processes are Lévy processes, (9.5) already proves the independence of $X^U$, $X^V$ and $Y = X-X^W$. Indeed, we find for $0=t_0<t_1<\dots<t_m=t$ and $\xi_k,\eta_k,\theta_k\in\mathbb{R}^d$
\[
\begin{aligned}
&\mathbb{E}\Big(e^{i\sum_k\xi_k\cdot(X^U_{t_{k+1}}-X^U_{t_k})}\,e^{i\sum_k\eta_k\cdot(X^V_{t_{k+1}}-X^V_{t_k})}\,e^{i\sum_k\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)\\
&\qquad= \mathbb{E}\Big(\prod_k e^{i\xi_k\cdot(X^U_{t_{k+1}}-X^U_{t_k})}\,e^{i\eta_k\cdot(X^V_{t_{k+1}}-X^V_{t_k})}\,e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)\\
&\qquad\overset{(L2')}{=} \prod_k\mathbb{E}\Big(e^{i\xi_k\cdot(X^U_{t_{k+1}}-X^U_{t_k})}\,e^{i\eta_k\cdot(X^V_{t_{k+1}}-X^V_{t_k})}\,e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big)\\
&\qquad\overset{(9.5)}{=} \prod_k\mathbb{E}\Big(e^{i\xi_k\cdot(X^U_{t_{k+1}}-X^U_{t_k})}\Big)\,\mathbb{E}\Big(e^{i\eta_k\cdot(X^V_{t_{k+1}}-X^V_{t_k})}\Big)\,\mathbb{E}\Big(e^{i\theta_k\cdot(Y_{t_{k+1}}-Y_{t_k})}\Big).
\end{aligned}
\]
The last equality follows from (9.5); the second equality uses (L2$'$) for the Lévy process $(X_t^U,X_t^V,X_t-X_t^W)$.

This shows that the families $(X^U_{t_{k+1}}-X^U_{t_k})_k$, $(X^V_{t_{k+1}}-X^V_{t_k})_k$ and $(Y_{t_{k+1}}-Y_{t_k})_k$ are independent, hence the $\sigma$-algebras $\sigma(X_t^U,\ t\ge 0)$, $\sigma(X_t^V,\ t\ge 0)$ and $\sigma(X_t-X_t^W,\ t\ge 0)$ are independent.

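Corollary 9.10. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure.

a) $(N_t(U))_{t\ge 0}\perp\!\!\!\perp(N_t(V))_{t\ge 0}$ for $U,V\in\mathcal{B}(\mathbb{R}^d)$, $0\notin\overline{U}$, $0\notin\overline{V}$, $U\cap V=\emptyset$.

b) For all measurable $f:\mathbb{R}^d\to\mathbb{R}^m$ satisfying $f(0)=0$ and $f\in L^1(\nu)$
\[ \mathbb{E}\big(e^{i\xi\cdot N_t(f)}\big) = \mathbb{E}\big(e^{i\int\xi\cdot f(y)\,N_t(dy)}\big) = e^{-t\int_{y\ne 0}[1-e^{i\xi\cdot f(y)}]\,\nu(dy)}. \tag{9.6} \]

c) $\displaystyle\int_{y\ne 0}\big(|y|^2\wedge 1\big)\,\nu(dy) < \infty$.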

Proof. a) Since $(N_t(U))_{t\ge 0}$ and $(N_t(V))_{t\ge 0}$ are completely determined by the independent processes $(X_t^U)_{t\ge 0}$ and $(X_t^V)_{t\ge 0}$, cf. Theorem 9.9, the independence is clear.

b) Let us first prove (9.6) for step functions $f(x)=\sum_{k=1}^n\phi_k\mathbf{1}_{U_k}(x)$ with $\phi_k\in\mathbb{R}^m$ and disjoint sets $U_1,\dots,U_n\in\mathcal{B}(\mathbb{R}^d)$ such that $0\notin\overline{U}_k$. Then
\[
\begin{aligned}
\mathbb{E}\exp\Big[i\int\xi\cdot f(y)\,N_t(dy)\Big]
&= \mathbb{E}\exp\Big[i\sum_{k=1}^n\int\xi\cdot\phi_k\mathbf{1}_{U_k}(y)\,N_t(dy)\Big]\\
&\overset{\text{a)}}{=} \prod_{k=1}^n\mathbb{E}\exp\big[i\xi\cdot\phi_k\,N_t(U_k)\big]\\
&\overset{\text{9.7.a)}}{=} \prod_{k=1}^n\exp\Big[t\nu(U_k)\big(e^{i\xi\cdot\phi_k}-1\big)\Big]\\
&= \exp\Big[t\sum_{k=1}^n\big(e^{i\xi\cdot\phi_k}-1\big)\nu(U_k)\Big]\\
&= \exp\Big[-t\int\big(1-e^{i\xi\cdot f(y)}\big)\,\nu(dy)\Big].
\end{aligned}
\]

For any $f\in L^1(\nu)$ the integral on the right-hand side of (9.6) exists. Indeed, the elementary inequality $|1-e^{iu}|\le|u|\wedge 2$ and $\nu\{|y|\ge 1\}<\infty$ (Lemma 9.4.c)) yield
\[ \Big|\int_{y\ne 0}\big(1-e^{i\xi\cdot f(y)}\big)\,\nu(dy)\Big| \le |\xi|\int_{0<|y|<1}|f(y)|\,\nu(dy) + 2\int_{|y|\ge 1}\nu(dy) < \infty. \]
Therefore, (9.6) follows with a standard approximation argument and dominated convergence.

c) We have already seen in Lemma 9.4.c) that $\nu\{|y|\ge 1\}<\infty$. Let us show that $\int_{0<|y|<1}|y|^2\,\nu(dy)<\infty$. For this we take $U=\{\delta<|y|<1\}$. Again by Theorem 9.9, the processes $X_t^U$ and $X_t-X_t^U$ are independent, and we get

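\[ 0 < \big|\mathbb{E}e^{i\xi\cdot X_t}\big| = \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big|\cdot\big|\mathbb{E}e^{i\xi\cdot(X_t-X_t^U)}\big| \le \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big|. \]
Since $X_t^U$ is a compound Poisson process (use part b) with $f(y)=y\mathbf{1}_U(y)$) we get for all $|\xi|\le 1$
\[ 0 < \big|\mathbb{E}e^{i\xi\cdot X_t}\big| \le \big|\mathbb{E}e^{i\xi\cdot X_t^U}\big| = e^{-t\int_U(1-\cos\xi\cdot y)\,\nu(dy)} \le e^{-t\int_{\delta<|y|<1}\frac14(\xi\cdot y)^2\,\nu(dy)}. \]

For the equality we use $|e^z|=e^{\operatorname{Re}z}$, the inequality follows from $\frac14u^2\le 1-\cos u$ if $|u|\le 1$. Letting $\delta\to 0$ we see that $\int_{0<|y|<1}|y|^2\,\nu(dy)<\infty$.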
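The two elementary inequalities used in this proof, $|1-e^{iu}|\le|u|\wedge 2$ and $\frac14u^2\le 1-\cos u$ for $|u|\le 1$, can be sanity-checked on a grid. This is only a numerical illustration with grids and tolerances of our own choosing, not a proof.

```python
import cmath
import math

def check_cosine_bound(n_points=2001):
    """Check u^2/4 <= 1 - cos(u) on an equidistant grid of [-1, 1]."""
    grid = [-1.0 + 2.0 * k / (n_points - 1) for k in range(n_points)]
    return all(u * u / 4.0 <= 1.0 - math.cos(u) + 1e-15 for u in grid)

def check_exponential_bound(n_points=4001, radius=10.0):
    """Check |1 - exp(iu)| <= min(|u|, 2) on an equidistant grid of [-radius, radius]."""
    grid = [-radius + 2.0 * radius * k / (n_points - 1) for k in range(n_points)]
    return all(abs(1.0 - cmath.exp(1j * u)) <= min(abs(u), 2.0) + 1e-12 for u in grid)

cosine_ok = check_cosine_bound()
exponential_ok = check_exponential_bound()
```

The small tolerances only absorb floating-point rounding; both inequalities hold exactly.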

Corollary 9.11. Let $N_t(\cdot)$ be the jump measure of a Lévy process $X$ and $\nu$ the intensity measure. For all $f:\mathbb{R}^d\to\mathbb{R}^m$ satisfying $f(0)=0$ and $f\in L^2(\nu)$ we have²
\[ \mathbb{E}\Big(\Big|\int f(y)\,\big(N_t(dy)-t\nu(dy)\big)\Big|^2\Big) = t\int_{y\ne 0}|f(y)|^2\,\nu(dy). \tag{9.7} \]

Proof. It is clearly enough to show (9.7) for step functions of the form
\[ f(x) = \sum_{k=1}^n\phi_k\mathbf{1}_{B_k}(x),\quad B_k \text{ disjoint},\ 0\notin\overline{B}_k,\ \phi_k\in\mathbb{R}^m, \]
and then use an approximation argument.

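Since the processes $N_t(B_k,\cdot)$ are independent Poisson processes with mean $\mathbb{E}N_t(B_k)=t\nu(B_k)$ and variance $\mathbb{V}N_t(B_k)=t\nu(B_k)$, we find
\[
\mathbb{E}\big[(N_t(B_k)-t\nu(B_k))(N_t(B_l)-t\nu(B_l))\big]
= \begin{cases} 0, & \text{if } B_k\cap B_l=\emptyset, \text{ i.e. } k\ne l,\\ \mathbb{V}N_t(B_k)=t\nu(B_k), & \text{if } k=l,\end{cases}
\;= t\nu(B_k\cap B_l).
\]

Therefore,
\[
\begin{aligned}
\mathbb{E}\Big(\Big|\int f(y)\,\big(N_t(dy)-t\nu(dy)\big)\Big|^2\Big)
&= \mathbb{E}\iint f(y)\cdot f(z)\,\big(N_t(dy)-t\nu(dy)\big)\big(N_t(dz)-t\nu(dz)\big)\\
&= \sum_{k,l=1}^n\phi_k\cdot\phi_l\,\underbrace{\mathbb{E}\big[(N_t(B_k)-t\nu(B_k))(N_t(B_l)-t\nu(B_l))\big]}_{=\,t\nu(B_k\cap B_l)}\\
&= t\sum_{k=1}^n|\phi_k|^2\,\nu(B_k) = t\int|f(y)|^2\,\nu(dy).
\end{aligned}
\]

In contrast to Corollary 7.6 the following theorem does not need (but constructs) the Lévy triplet $(l,Q,\nu)$.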

Theorem 9.12 (Lévy–Itô decomposition). Let $X$ be a Lévy process and denote by $N_t(\cdot)$ and $\nu$ the jump and intensity measures. Then
\[
X_t = \underbrace{\sqrt{Q}\,W_t + \int_{0<|y|<1} y\,\big(N_t(dy)-t\nu(dy)\big)}_{=:\,M_t,\ L^2\text{-martingale}}
\;+\; \underbrace{t\,l + \int_{|y|\ge 1} y\,N_t(dy)}_{=:\,A_t,\ \text{bdd. variation}} \tag{9.8}
\]
where $\sqrt{Q}\,W_t$ and $t\,l$ form the continuous Gaussian part and the two integrals form the pure jump part, $l\in\mathbb{R}^d$ and $Q\in\mathbb{R}^{d\times d}$ is a positive semidefinite symmetric matrix and $W$ is a standard Brownian motion in $\mathbb{R}^d$. The processes on the right-hand side of (9.8) are independent Lévy processes.

² This is a special case of an Itô isometry, cf. (10.9) in the following chapter.

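Proof. 1° Set $U_n := \{\frac1n<|y|<1\}$, $V=\{|y|\ge 1\}$, $W_n := U_n\,\dot\cup\,V$ and define
\[ X_t^V := \int_V y\,N_t(dy) \quad\text{and}\quad \widetilde X_t^{U_n} := \int_{U_n} y\,N_t(dy) - t\int_{U_n} y\,\nu(dy). \]

By Theorem 9.9 $(X_t^V)_{t\ge 0}$, $(\widetilde X_t^{U_n})_{t\ge 0}$ and $\big(X_t-X_t^{W_n}+t\int_{U_n}y\,\nu(dy)\big)_{t\ge 0}$ are independent Lévy processes. Since
\[ X = (X-\widetilde X^{U_n}-X^V) + \widetilde X^{U_n} + X^V, \]
the theorem follows if we can show that the three terms on the right-hand side converge separately as $n\to\infty$.

2° Lemma 9.6 shows $\mathbb{E}\widetilde X_t^{U_n}=0$; since $\widetilde X^{U_n}$ is a Lévy process, it is a martingale: for $s\le t$
\[
\mathbb{E}\big(\widetilde X_t^{U_n}\,\big|\,\mathcal{F}_s\big)
= \mathbb{E}\big(\widetilde X_t^{U_n}-\widetilde X_s^{U_n}\,\big|\,\mathcal{F}_s\big) + \widetilde X_s^{U_n}
\overset{(L2),\,(L1)}{=} \mathbb{E}\big(\widetilde X_{t-s}^{U_n}\big) + \widetilde X_s^{U_n} = \widetilde X_s^{U_n}.
\]

($\mathcal{F}_s$ can be taken as the natural filtration of $X^{U_n}$ or $X$.) By Doob's $L^2$ martingale inequality we find for any $t>0$ and $m<n$
\[
\mathbb{E}\Big(\sup_{s\le t}\big|\widetilde X_s^{U_n}-\widetilde X_s^{U_m}\big|^2\Big)
\le 4\,\mathbb{E}\Big(\big|\widetilde X_t^{U_n}-\widetilde X_t^{U_m}\big|^2\Big)
= 4t\int_{\frac1n<|y|\le\frac1m}|y|^2\,\nu(dy) \xrightarrow[m,n\to\infty]{} 0.
\]

Therefore, the limit $\int_{0<|y|<1}y\,\big(N_t(dy)-t\nu(dy)\big) = L^2\text{-}\lim_{n\to\infty}\widetilde X_t^{U_n}$ exists locally uniformly (in $t$). The limit is still an $L^2$ martingale with càdlàg paths (take a locally uniformly a.s. convergent subsequence) and, by Lemma 7.3, also a Lévy process.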

3° Observe that
\[ (X-\widetilde X^{U_n}-X^V) - (X-\widetilde X^{U_m}-X^V) = \widetilde X^{U_m}-\widetilde X^{U_n}, \]
and so $X_t^c := L^2\text{-}\lim_{n\to\infty}(X_t-\widetilde X_t^{U_n}-X_t^V)$ exists locally uniformly (in $t$). Since, by construction, $|\Delta(X_t-\widetilde X_t^{U_n}-X_t^V)|\le\frac1n$, it is clear that $X^c$ has a.s. continuous sample paths. By Lemma 7.3 it is a Lévy process. From Theorem 8.4 we know that all Lévy processes with continuous sample paths are of the form $tl+\sqrt{Q}\,W_t$ where $W$ is a Brownian motion, $Q\in\mathbb{R}^{d\times d}$ a symmetric positive semidefinite matrix and $l\in\mathbb{R}^d$.

4° Since independence is preserved under $L^2$-limits, the decomposition (9.8) follows. Finally,
\[ \int_{|y|\ge 1} y\,N_t(dy,\omega) = \sum_{0<s\le t}\Delta X_s(\omega)\,\mathbf{1}_{\{\Delta X_s(\omega)\ge 1\}} - \sum_{0<s\le t}|\Delta X_s(\omega)|\,\mathbf{1}_{\{\Delta X_s(\omega)\le -1\}} \]
is the difference of two increasing processes, i.e. it is of bounded variation.
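On the level of a single path with finitely many jumps, the jump terms of (9.8) amount to sorting the jumps by size and compensating the small ones. The following one-dimensional sketch assumes that the jumps are given as hypothetical lists and that the number $\int_{0<|y|<1}y\,\nu(dy)$ is finite and known (true for finite jump activity, but not in general); it is not the $L^2$-limit construction of the proof.

```python
def levy_ito_jump_parts(jump_times, jump_sizes, t, small_jump_drift):
    """Return (compensated small-jump part, large-jump part) at time t.

    small_jump_drift: assumed value of the integral of y over 0 < |y| < 1
    with respect to nu; it is an input here, not something estimated.
    """
    small = sum(y for s, y in zip(jump_times, jump_sizes)
                if 0 < s <= t and 0 < abs(y) < 1)
    large = sum(y for s, y in zip(jump_times, jump_sizes)
                if 0 < s <= t and abs(y) >= 1)
    return small - t * small_jump_drift, large

times = [0.3, 0.7, 1.2, 2.5]      # illustrative jump times
sizes = [0.5, -2.0, 1.5, 0.1]     # illustrative jump sizes
martingale_part, large_part = levy_ito_jump_parts(times, sizes, 3.0, small_jump_drift=0.2)
```

Adding back `3.0 * 0.2` to `martingale_part + large_part` recovers the plain sum of all jumps up to time $3$.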

Corollary 9.13 (Lévy–Khintchine formula). Let $X$ be a Lévy process. Then the characteristic exponent $\psi$ is given by
\[ \psi(\xi) = -il\cdot\xi + \frac12\,\xi\cdot Q\xi + \int_{y\ne 0}\Big[1-e^{iy\cdot\xi}+i\xi\cdot y\,\mathbf{1}_{(0,1)}(|y|)\Big]\,\nu(dy) \tag{9.9} \]
where $\nu$ is the intensity measure, $l\in\mathbb{R}^d$ and $Q\in\mathbb{R}^{d\times d}$ is symmetric and positive semidefinite.
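As a sanity check of (9.9) in the simplest case, take $d=1$, $l=0$, $Q=0$ and $\nu=\lambda\delta_a$ with $|a|\ge 1$, so that $X_t=aN_t$ for a Poisson process $N$ of intensity $\lambda$ and the compensating term vanishes. The sketch below (function names and parameter values are ours) compares $e^{-\psi(\xi)}$ with the characteristic function summed directly from the Poisson weights.

```python
import cmath
import math

def psi_large_atoms(xi, atoms):
    """Exponent (9.9) with l = 0, Q = 0 and nu = sum of w * delta_y over atoms.

    Assumes every atom satisfies |y| >= 1, so the term i*xi*y*1_(0,1)(|y|) is zero.
    """
    return sum(w * (1 - cmath.exp(1j * xi * y)) for y, w in atoms)

def poisson_char_function(xi, a, lam, terms=100):
    """E exp(i*xi*a*N_1) for N_1 ~ Poisson(lam), summed from the Poisson weights."""
    total = 0j
    weight = math.exp(-lam)               # P(N_1 = 0)
    for k in range(terms):
        total += weight * cmath.exp(1j * xi * a * k)
        weight *= lam / (k + 1)           # P(N_1 = k + 1)
    return total

lhs = poisson_char_function(0.7, a=2.0, lam=1.5)
rhs = cmath.exp(-psi_large_atoms(0.7, [(2.0, 1.5)]))
```

Both numbers agree up to rounding, in line with $\mathbb{E}e^{i\xi X_1}=e^{-\psi(\xi)}$.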

Proof. Since the processes appearing in the Lévy–Itô decomposition (9.8) are independent, we see
\[ e^{-\psi(\xi)} = \mathbb{E}e^{i\xi\cdot X_1} = \mathbb{E}e^{i\xi\cdot(l+\sqrt{Q}W_1)}\cdot\mathbb{E}e^{i\int_{0<|y|<1}\xi\cdot y\,(N_1(dy)-\nu(dy))}\cdot\mathbb{E}e^{i\int_{|y|\ge 1}\xi\cdot y\,N_1(dy)}. \]

Since $W$ is a standard Brownian motion,
\[ \mathbb{E}e^{i\xi\cdot(l+\sqrt{Q}W_1)} = e^{il\cdot\xi-\frac12\xi\cdot Q\xi}. \]

Using (9.6) with $f(y)=y\mathbf{1}_{U_n}(y)$, $U_n=\{\frac1n<|y|<1\}$, subtracting $\int_{U_n}y\,\nu(dy)$ and letting $n\to\infty$ we get
\[ \mathbb{E}\exp\Big[i\int_{0<|y|<1}\xi\cdot y\,\big(N_1(dy)-\nu(dy)\big)\Big] = \exp\Big[-\int_{0<|y|<1}\big(1-e^{iy\cdot\xi}+i\xi\cdot y\big)\,\nu(dy)\Big]; \]

finally, (9.6) with $f(y)=y\mathbf{1}_V(y)$, $V=\{|y|\ge 1\}$, once again yields
\[ \mathbb{E}\exp\Big[i\int_{|y|\ge 1}\xi\cdot y\,N_1(dy)\Big] = \exp\Big[-\int_{|y|\ge 1}\big(1-e^{iy\cdot\xi}\big)\,\nu(dy)\Big] \]
finishing the proof.

10. A digression: stochastic integrals

In this chapter we explain how one can integrate with respect to (a certain class of) random measures. Our approach is based on the notion of random orthogonal measures and it will include the classical Itô integral with respect to square-integrable martingales. Throughout this chapter, $(\Omega,\mathcal{A},\mathbb{P})$ is a probability space, $(\mathcal{F}_t)_{t\ge 0}$ some filtration, $(E,\mathcal{E})$ is a measurable space and $\mu$ is a (positive) measure on $(E,\mathcal{E})$. Moreover, $\mathcal{R}\subset\mathcal{E}$ is a semiring, i.e. a family of sets such that $\emptyset\in\mathcal{R}$, for all $R,S\in\mathcal{R}$ we have $R\cap S\in\mathcal{R}$, and $R\setminus S$ can be represented as a finite union of disjoint sets from $\mathcal{R}$, cf. [54, Chapter 6] or [55, Definition 5.1]. It is not difficult to check that
\[ \mathcal{R}_0 := \{R\in\mathcal{R} : \mu(R)<\infty\} \]
is again a semiring.

Definition 10.1. Let $\mathcal{R}$ be a semiring on the measure space $(E,\mathcal{E},\mu)$. A random orthogonal measure with control measure $\mu$ is a family of random variables $N(\omega,R)\in\mathbb{R}$, $R\in\mathcal{R}_0$, such that
\[ \mathbb{E}\big[|N(\cdot,R)|^2\big] < \infty \quad \forall R\in\mathcal{R}_0 \tag{10.1} \]
\[ \mathbb{E}\big[N(\cdot,R)N(\cdot,S)\big] = \mu(R\cap S) \quad \forall R,S\in\mathcal{R}_0. \tag{10.2} \]

The following Lemma explains why N(R) = N(ω,R) is called a (random) measure.

Lemma 10.2. The random set function $R\mapsto N(R) := N(\omega,R)$, $R\in\mathcal{R}_0$, is countably additive in $L^2$, i.e.
\[ N\Big(\dot{\bigcup_{n=1}^\infty} R_n\Big) = L^2\text{-}\lim_{n\to\infty}\sum_{k=1}^n N(R_k) \quad\text{a.s.} \tag{10.3} \]
for every sequence $(R_n)_{n\in\mathbb{N}}\subset\mathcal{R}_0$ of mutually disjoint sets such that $R := \dot\bigcup_{n=1}^\infty R_n\in\mathcal{R}_0$. In particular, $N(R\,\dot\cup\,S)=N(R)+N(S)$ a.s. for disjoint $R,S\in\mathcal{R}_0$ such that $R\,\dot\cup\,S\in\mathcal{R}_0$ and $N(\emptyset)=0$ a.s. (notice that the exceptional set may depend on the sets $R,S$).

Proof. From $R=S=\emptyset$ and $\mathbb{E}[N(\emptyset)^2]=\mu(\emptyset)=0$ we get $N(\emptyset)=0$ a.s. It is enough to prove (10.3) as finite additivity follows if we take $(R_1,R_2,R_3,R_4,\dots)=(R,S,\emptyset,\emptyset,\dots)$. If $R_n\in\mathcal{R}_0$ are mutually disjoint sets such that $R := \dot\bigcup_{n=1}^\infty R_n\in\mathcal{R}_0$, then
\[
\begin{aligned}
\mathbb{E}\Big[\Big(N(R)-\sum_{k=1}^n N(R_k)\Big)^2\Big]
&= \mathbb{E}N^2(R) + \sum_{k=1}^n\mathbb{E}N^2(R_k) - 2\sum_{k=1}^n\mathbb{E}[N(R)N(R_k)] + \sum_{\substack{j,k=1\\ j\ne k}}^n\mathbb{E}[N(R_j)N(R_k)]\\
&\overset{(10.2)}{=} \mu(R) - \sum_{k=1}^n\mu(R_k) \xrightarrow[n\to\infty]{} 0
\end{aligned}
\]
where we use the $\sigma$-additivity of the measure $\mu$.

Example 10.3. a) Let $\mathcal{R}=\{(s,t] : 0\le s<t<\infty\}$ and $\mu=\lambda$ be Lebesgue measure on $(0,\infty)$. Clearly, $\mathcal{R}=\mathcal{R}_0$ is a semiring. Let $W=(W_t)_{t\ge 0}$ be a one-dimensional standard Brownian motion. The random set function
\[ N(\omega,(s,t]) := W_t(\omega)-W_s(\omega),\quad 0\le s<t<\infty \]
is a random orthogonal measure with control measure $\lambda$. This follows at once from
\[ \mathbb{E}\big[(W_t-W_s)(W_v-W_u)\big] = (t\wedge v - s\vee u)^+ = \lambda\big((s,t]\cap(u,v]\big) \]
for all $0\le s<t<\infty$ and $0\le u<v<\infty$.

Mind, however, that $N$ is not $\sigma$-additive. To see this, take $R_n := (1/(n+1),1/n]$, where $n\in\mathbb{N}$, and observe that $\dot\bigcup_n R_n=(0,1]$. Since $W$ has stationary and independent increments, and scales like $W_t\sim\sqrt{t}\,W_1$, we have
\[
\begin{aligned}
\mathbb{E}\exp\Big[-\sum_{n=1}^\infty|N(R_n)|\Big]
&= \mathbb{E}\exp\Big[-\sum_{n=1}^\infty|W_{1/(n+1)}-W_{1/n}|\Big]\\
&= \prod_{n=1}^\infty\mathbb{E}\exp\Big[-(n(n+1))^{-1/2}|W_1|\Big]\\
&\overset{\text{Jensen's ineq.}}{\le} \prod_{n=1}^\infty\alpha^{(n(n+1))^{-1/2}},\quad \alpha := \mathbb{E}e^{-|W_1|}\in(0,1).
\end{aligned}
\]

As the series $\sum_{n=1}^\infty(n(n+1))^{-1/2}$ diverges, we get $\mathbb{E}\exp[-\sum_{n=1}^\infty|N(R_n)|]=0$ which means that $\sum_{n=1}^\infty|N(\omega,R_n)|=\infty$ for almost all $\omega$. This shows that $N(\cdot)$ cannot be countably additive. Indeed, countable additivity implies that the series
\[ N\Big(\omega,\bigcup_{n=1}^\infty R_n\Big) = \sum_{n=1}^\infty N(\omega,R_n) \]
converges. The left-hand side, hence the summation, is invariant under rearrangements. This, however, entails absolute convergence of the series, $\sum_{n=1}^\infty|N(\omega,R_n)|<\infty$, which does not hold as we have seen above.

b) (2nd order orthogonal noise) Let $X=(X_t)_{t\in T}$ be a complex-valued stochastic process defined on a bounded or unbounded interval $T\subset\mathbb{R}$. We assume that $X$ has a.s. càdlàg paths. If $\mathbb{E}(|X_t|^2)<\infty$, we call $X$ a second-order process; many properties of $X$ are characterized by the correlation function $K(s,t)=\mathbb{E}\big(X_s\overline{X_t}\big)$, $s,t\in T$.

If $\mathbb{E}\big((X_t-X_s)\overline{(X_v-X_u)}\big)=0$ for all $s\le t\le u\le v$, $s,t,u,v\in T$, then $X$ is said to have orthogonal increments. Fix $t_0\in T$ and define for all $t\in T$
\[ F(t) := \begin{cases}\mathbb{E}(|X_t-X_{t_0}|^2), & \text{if } t\ge t_0,\\ -\mathbb{E}(|X_{t_0}-X_t|^2), & \text{if } t\le t_0.\end{cases} \]

Clearly, $F$ is increasing and, since $t\mapsto X_t$ is a.s. right-continuous, it is also right-continuous. Moreover,
\[ F(t)-F(s) = \mathbb{E}(|X_t-X_s|^2) \quad\text{for all } s\le t,\ s,t\in T. \tag{10.4} \]

To see this, we consider the case $t_0\le s\le t$; the other cases are similar. We have
\[
\begin{aligned}
F(t)-F(s) &= \mathbb{E}(|X_t-X_{t_0}|^2) - \mathbb{E}(|X_s-X_{t_0}|^2)\\
&= \mathbb{E}(|(X_t-X_s)+(X_s-X_{t_0})|^2) - \mathbb{E}(|X_s-X_{t_0}|^2)\\
&\overset{\text{orth. incr.}}{=} \mathbb{E}(|X_t-X_s|^2).
\end{aligned}
\]

This shows that $\mu(s,t] := F(t)-F(s)$ defines a measure on $\mathcal{R}=\mathcal{R}_0=\{(s,t] : -\infty<s<t<\infty,\ s,t\in T\}$, which is the control measure of $N(\omega,(s,t]) := X_t(\omega)-X_s(\omega)$. In fact, for $s<t$, $u<v$, $s,t,u,v\in T$, we have
\[
\begin{aligned}
X_t-X_s &= \big(X_t-X_{t\wedge v}\big) + \big(X_{t\wedge v}-X_{s\vee u}\big) + \big(X_{s\vee u}-X_s\big)\\
X_v-X_u &= \big(X_v-X_{t\wedge v}\big) + \big(X_{t\wedge v}-X_{s\vee u}\big) + \big(X_{s\vee u}-X_u\big).
\end{aligned}
\]

Using the orthogonality of the increments we get
\[ \mathbb{E}\big((X_t-X_s)\overline{(X_v-X_u)}\big) = \mathbb{E}\big((X_{t\wedge v}-X_{s\vee u})\overline{(X_{t\wedge v}-X_{s\vee u})}\big) = F(t\wedge v)-F(s\vee u) = \mu\big((s,t]\cap(u,v]\big), \]
i.e. $N(\omega,\bullet)$ is a random orthogonal measure.

c) (Martingale noise) Let $M=(M_t)_{t\ge 0}$ be a square-integrable martingale with respect to the filtration $(\mathcal{F}_t)_{t\ge 0}$, $M_0=0$, and with càdlàg paths. Denote by $\langle M\rangle$ the predictable quadratic variation, i.e. the unique ($\langle M\rangle_0 := 0$) increasing predictable process such that $M^2-\langle M\rangle$ is a martingale. The random set function $N(\omega,(s,t]) := M_t(\omega)-M_s(\omega)$, $s\le t$, is a random orthogonal measure on $\mathcal{R}=\{(s,t] : 0\le s<t<\infty\}$ with control measure $\mu(s,t]=\mathbb{E}(\langle M\rangle_t-\langle M\rangle_s)$. This follows immediately from the tower property of conditional expectation

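\[ \mathbb{E}[M_tM_v] \overset{\text{tower}}{=} \mathbb{E}\big[M_t\,\mathbb{E}(M_v\mid\mathcal{F}_t)\big] = \mathbb{E}[M_t^2] = \mathbb{E}\langle M\rangle_t \quad\text{if } t\le v \]
which, in turn, gives for all $0\le s<t$ and $0\le u<v$
\[
\begin{aligned}
\mathbb{E}[(M_t-M_s)(M_v-M_u)] &= \mathbb{E}\langle M\rangle_{t\wedge v} - \mathbb{E}\langle M\rangle_{s\wedge v} - \mathbb{E}\langle M\rangle_{t\wedge u} + \mathbb{E}\langle M\rangle_{s\wedge u}\\
&= \mu\big((s,t]\cap(0,v]\big) - \mu\big((s,t]\cap(0,u]\big) = \mu\big((s,t]\cap(u,v]\big).
\end{aligned}
\]

d) (Poisson random measure) Let $X$ be a $d$-dimensional Lévy process,
\[ \mathcal{S} := \big\{B\in\mathcal{B}(\mathbb{R}^d) : 0\notin\overline{B}\big\},\quad \mathcal{R} := \big\{(s,t]\times B : 0\le s<t<\infty,\ B\in\mathcal{S}\big\}, \]
and $N_t(B)$ the jump measure (Definition 9.1). The random set function
\[ \widetilde N(\omega,(s,t]\times B) := [N_t(\omega,B)-t\nu(B)] - [N_s(\omega,B)-s\nu(B)],\quad R=(s,t]\times B\in\mathcal{R}, \]
is a random orthogonal measure with control measure $\lambda\times\nu$ where $\lambda$ is Lebesgue measure on $(0,\infty)$ and $\nu$ is the Lévy measure of $X$. Indeed, by definition $\mathcal{R}=\mathcal{R}_0$, and it is not hard to see that $\mathcal{R}$ is a semiring¹. Set $\widetilde N_t(B) := \widetilde N((0,t]\times B)$ and let $B,C\in\mathcal{S}$, $t,v>0$. As in the proof of Corollary 9.11 we have
\[ \mathbb{E}\big[\widetilde N_t(B)\widetilde N_t(C)\big] = t\nu(B\cap C). \]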

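Since $\mathcal{S}$ is a semiring, we get $B=(B\cap C)\,\dot\cup\,(B\setminus C)=(B\cap C)\,\dot\cup\,B_1\,\dot\cup\cdots\dot\cup\,B_n$ with finitely many mutually disjoint $B_k\in\mathcal{S}$ such that $B_k\subset B\setminus C$. The processes $\widetilde N(B_k)$ and $\widetilde N(C)$ are independent (Corollary 9.10) and centered. Therefore we have for $t\le v$
\[
\begin{aligned}
\mathbb{E}\big[\widetilde N_t(B)\widetilde N_v(C)\big]
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_v(C)\big] + \sum_{k=1}^n\mathbb{E}\big[\widetilde N_t(B_k)\widetilde N_v(C)\big]\\
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_v(C)\big] + \sum_{k=1}^n\mathbb{E}\widetilde N_t(B_k)\cdot\mathbb{E}\widetilde N_v(C)\\
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_v(C)\big].
\end{aligned}
\]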

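Use the same argument over again, as well as the fact that $\widetilde N_t(B\cap C)$ has independent and centered increments (Lemma 9.4), to get
\[
\begin{aligned}
\mathbb{E}\big[\widetilde N_t(B)\widetilde N_v(C)\big]
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_v(B\cap C)\big]\\
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_t(B\cap C)\big] + \underbrace{\mathbb{E}\big[\widetilde N_t(B\cap C)\big(\widetilde N_v(B\cap C)-\widetilde N_t(B\cap C)\big)\big]}_{=\,\mathbb{E}\widetilde N_t(B\cap C)\,\mathbb{E}[\widetilde N_v(B\cap C)-\widetilde N_t(B\cap C)]\,=\,0}\\
&= \mathbb{E}\big[\widetilde N_t(B\cap C)\widetilde N_t(B\cap C)\big]\\
&= t\nu(B\cap C) = \lambda\big((0,t]\cap(0,v]\big)\,\nu(B\cap C).
\end{aligned}
\tag{10.5}
\]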
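Relation (10.5) can be tested by simulation in a toy situation. The sketch below again assumes, only for illustration, a compound Poisson process whose jump heights are uniform on a finite set, so that $\nu(A)$ is the rate times the fraction of atoms in $A$; all names, the seed and the parameters are ours.

```python
import random

def compensated_counts(rng, rate, jump_values, t, v, B, C):
    """One sample of (N~_t(B), N~_v(C)) for t <= v, with N~_t(B) = N_t(B) - t*nu(B)."""
    nu = lambda A: rate * len(A) / len(jump_values)   # A must be a subset of jump_values
    n_t_B = n_v_C = 0
    s = rng.expovariate(rate)                         # first jump time
    while s <= v:
        y = rng.choice(jump_values)                   # jump height
        if s <= t and y in B:
            n_t_B += 1
        if y in C:
            n_v_C += 1
        s += rng.expovariate(rate)
    return n_t_B - t * nu(B), n_v_C - v * nu(C)

def covariance_estimate(rate, jump_values, t, v, B, C, n_paths, seed=0):
    """Monte Carlo estimate of E[N~_t(B) N~_v(C)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        a, b = compensated_counts(rng, rate, jump_values, t, v, B, C)
        total += a * b
    return total / n_paths

jump_values = [-2.0, 1.0, 3.0]
B, C = {1.0, 3.0}, {-2.0, 3.0}                        # B and C share the atom 3.0
estimate = covariance_estimate(1.5, jump_values, t=1.0, v=2.0, B=B, C=C, n_paths=40000)
expected = min(1.0, 2.0) * 1.5 * len(B & C) / len(jump_values)   # (t ^ v) * nu(B n C)
```

The estimate should be close to $(t\wedge v)\,\nu(B\cap C)=0.5$, up to Monte Carlo error.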

For $s\le t$, $u\le v$ and $B,C\in\mathcal{S}$ a lengthy, but otherwise completely elementary, calculation based on (10.5) shows
\[ \mathbb{E}\big[\widetilde N((s,t]\times B)\,\widetilde N((u,v]\times C)\big] = \lambda\big((s,t]\cap(u,v]\big)\,\nu(B\cap C). \]

e) (Space-time white noise) Let $\mathcal{R} := \{(0,t]\times B : t>0,\ B\in\mathcal{B}(\mathbb{R}^d)\}$ and $\mu=\lambda$ Lebesgue measure on the half-space $\mathbb{H}^+ := [0,\infty)\times\mathbb{R}^d$. Consider the mean-zero, real-valued Gaussian process $(W(R))_{R\in\mathcal{B}(\mathbb{H}^+)}$ whose covariance function is given by $\operatorname{Cov}(W(R),W(S))=\lambda(R\cap S)$.² By its very definition $W(R)$ is a random orthogonal measure on $\mathcal{R}_0$ with control measure $\lambda$.

¹ Both $\mathcal{S}$ and $\mathcal{I} := \{(s,t] : 0\le s<t<\infty\}$ are semirings, and so is their cartesian product $\mathcal{R}=\mathcal{I}\times\mathcal{S}$, see [54, Lemma 13.1] or [55, Lemma 15.1] for the straightforward proof.

² The map $(R,S)\mapsto\lambda(R\cap S)$ is positive semidefinite, i.e. for $R_1,\dots,R_n\in\mathcal{B}(\mathbb{H}^+)$ and $\xi_1,\dots,\xi_n\in\mathbb{R}$
\[ \sum_{j,k=1}^n\xi_j\xi_k\,\lambda(R_j\cap R_k) = \sum_{j,k=1}^n\int\xi_j\mathbf{1}_{R_j}(x)\,\xi_k\mathbf{1}_{R_k}(x)\,\lambda(dx) = \int\Big(\sum_{k=1}^n\xi_k\mathbf{1}_{R_k}(x)\Big)^2\lambda(dx) \ge 0. \]

We will now define a stochastic integral in the spirit of Itô’s original construction.

Definition 10.4. Let $\mathcal{R}$ be a semiring and $\mathcal{R}_0=\{R\in\mathcal{R} : \mu(R)<\infty\}$. A simple function is a deterministic function of the form
\[ f(x) = \sum_{k=1}^n c_k\mathbf{1}_{R_k}(x),\quad n\in\mathbb{N},\ c_k\in\mathbb{R},\ R_k\in\mathcal{R}_0. \tag{10.6} \]

Intuitively, $I_N(f)=\sum_{k=1}^n c_kN(R_k)$ should be the stochastic integral of a simple function $f$. The only problem is the well-definedness. Since a random orthogonal measure is a.s. finitely additive, the following lemma has exactly the same proof as the usual well-definedness result for the Lebesgue integral of a step function, see e.g. Schilling [54, Lemma 9.1] or [55, Lemma 8.1]; note that finite unions of null sets are again null sets.

Lemma 10.5. Let $f$ be a simple function and assume that $f=\sum_{k=1}^n c_k\mathbf{1}_{R_k}=\sum_{j=1}^m b_j\mathbf{1}_{S_j}$ has two representations as step-function. Then
\[ \sum_{k=1}^n c_kN(R_k) = \sum_{j=1}^m b_jN(S_j) \quad\text{a.s.} \]
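For the white noise of Example 10.3.a), where $N((s,t])=W_t-W_s$, Lemma 10.5 is visible on a single path: refining the intervals of a representation does not change the sum. The sketch below uses a hypothetical sampled path stored in a dictionary; names and numbers are ours.

```python
def simple_integral(representation, W):
    """I_N(f) = sum of c_k * N(R_k) with N((s, t]) = W(t) - W(s).

    representation: list of pairs (c_k, (s_k, t_k)) describing f = sum c_k 1_(s_k, t_k];
    W: callable returning the path value at a given time (hypothetical interface).
    """
    return sum(c * (W(t) - W(s)) for c, (s, t) in representation)

path = {0.0: 0.0, 1.0: 0.5, 2.0: -0.25, 3.0: 1.0}     # illustrative values W_t(omega)
W = path.__getitem__

# Two representations of the same step function f = 2 on (0, 2], 5 on (2, 3].
rep_coarse = [(2.0, (0.0, 2.0)), (5.0, (2.0, 3.0))]
rep_fine = [(2.0, (0.0, 1.0)), (2.0, (1.0, 2.0)), (5.0, (2.0, 3.0))]

value_coarse = simple_integral(rep_coarse, W)
value_fine = simple_integral(rep_fine, W)
```

Both calls return the same number because the increments over $(0,1]$ and $(1,2]$ telescope to the increment over $(0,2]$.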

Definition 10.6. Let $N(R)$, $R\in\mathcal{R}_0$, be a random orthogonal measure with control measure $\mu$. The stochastic integral of a simple function $f$ given by (10.6) is the random variable
\[ I_N(\omega,f) := \sum_{k=1}^n c_kN(\omega,R_k). \tag{10.7} \]

The following properties of the stochastic integral are more or less immediate from the definition.

Lemma 10.7. Let N(R),R ∈ R0, be a random orthogonal measure with control measure µ, f ,g simple functions, and α,β ∈ R.

a) IN(1R) = N(R) for all R ∈ R0;

b) $S\mapsto I_N(\mathbf{1}_S)$ extends $N$ uniquely to $S\in\rho(\mathcal{R}_0)$, the ring³ generated by $\mathcal{R}_0$;

c) $I_N(\alpha f+\beta g)=\alpha I_N(f)+\beta I_N(g)$; (linearity)

d) $\mathbb{E}\big[I_N(f)^2\big]=\int f^2\,d\mu$. (Itô's isometry)

Proof. The properties a) and c) are clear. For b) we note that ρ(R0) can be constructed from R0 by adding all possible finite unions of (disjoint) sets (see e.g. [54, Proof of Theorem 6.1, Step 2]).

³ A ring is a family of sets which contains $\emptyset$ and which is stable under unions and differences of finitely many sets. Since $R\cap S=R\setminus(R\setminus S)$, it is automatically stable under finite intersections. The ring generated by $\mathcal{R}_0$ is the smallest ring containing $\mathcal{R}_0$.

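In order to see d), we use (10.6) and the orthogonality relation $\mathbb{E}[N(R_j)N(R_k)]=\mu(R_j\cap R_k)$ to get
\[
\begin{aligned}
\mathbb{E}\big[I_N(f)^2\big] &= \sum_{j,k=1}^n c_jc_k\,\mathbb{E}[N(R_j)N(R_k)]\\
&= \sum_{j,k=1}^n c_jc_k\,\mu(R_j\cap R_k)\\
&= \int\sum_{j,k=1}^n c_j\mathbf{1}_{R_j}(x)\,c_k\mathbf{1}_{R_k}(x)\,\mu(dx)\\
&= \int f^2(x)\,\mu(dx).
\end{aligned}
\]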
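The computation behind d) is purely deterministic once (10.2) is granted: the double sum $\sum_{j,k}c_jc_k\mu(R_j\cap R_k)$ equals $\int f^2\,d\mu$ even when the $R_k$ overlap. The following sketch checks this for Lebesgue measure on half-open intervals; the helper names and the example are ours.

```python
def overlap(a, b):
    """Lebesgue measure of (s, t] intersected with (u, v]."""
    (s, t), (u, v) = a, b
    return max(0.0, min(t, v) - max(s, u))

def isometry_double_sum(rep):
    """sum_{j,k} c_j c_k mu(R_j n R_k) for rep = [(c_k, (s_k, t_k)), ...]."""
    return sum(cj * ck * overlap(Rj, Rk) for cj, Rj in rep for ck, Rk in rep)

def integral_of_square(rep):
    """Integral of f^2 for f = sum c_k 1_(s_k, t_k], computed cell by cell.

    f is constant between consecutive interval endpoints, so it suffices to
    evaluate it at the midpoint of each cell.
    """
    points = sorted({p for _, R in rep for p in R})
    total = 0.0
    for left, right in zip(points, points[1:]):
        mid = 0.5 * (left + right)
        value = sum(c for c, (s, t) in rep if s < mid <= t)
        total += value * value * (right - left)
    return total

rep = [(2.0, (0.0, 2.0)), (-1.0, (1.0, 3.0))]   # overlapping intervals on purpose
double_sum = isometry_double_sum(rep)
square_integral = integral_of_square(rep)
```

Here $f$ takes the values $2$, $1$, $-1$ on $(0,1]$, $(1,2]$, $(2,3]$, so both quantities equal $4+1+1=6$.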

Itô's isometry now allows us to extend the stochastic integral to the $L^2(\mu)$-closure of the simple functions: $L^2(E,\sigma(\mathcal{R}),\mu)$. For this take $f\in L^2(E,\sigma(\mathcal{R}),\mu)$ and any approximating sequence $(f_n)_{n\in\mathbb{N}}$ of simple functions, i.e.
\[ \lim_{n\to\infty}\int|f-f_n|^2\,d\mu = 0. \]
In particular, $(f_n)_{n\in\mathbb{N}}$ is an $L^2(\mu)$ Cauchy sequence, and Itô's isometry shows that the random variables $(I_N(f_n))_{n\in\mathbb{N}}$ are a Cauchy sequence in $L^2(\mathbb{P})$:
\[ \mathbb{E}\big[(I_N(f_n)-I_N(f_m))^2\big] = \mathbb{E}\big[I_N(f_n-f_m)^2\big] = \int(f_n-f_m)^2\,d\mu \xrightarrow[m,n\to\infty]{} 0. \]
Because of the completeness of $L^2(\mathbb{P})$, the limit $\lim_{n\to\infty}I_N(f_n)$ exists and, by a standard argument, it does not depend on the approximating sequence.

Definition 10.8. Let $N(R)$, $R\in\mathcal{R}_0$, be a random orthogonal measure with control measure $\mu$. The stochastic integral of a function $f\in L^2(E,\sigma(\mathcal{R}),\mu)$ is the random variable
\[ \int f(x)\,N(\omega,dx) := L^2(\mathbb{P})\text{-}\lim_{n\to\infty}I_N(\omega,f_n) \tag{10.8} \]
where $(f_n)_{n\in\mathbb{N}}$ is any sequence of simple functions which approximate $f$ in $L^2(\mu)$.

It is immediate from the definition of the stochastic integral, that $f\mapsto\int f\,dN$ is linear and enjoys Itô's isometry
\[ \mathbb{E}\Big[\Big(\int f(x)\,N(dx)\Big)^2\Big] = \int f^2(x)\,\mu(dx). \tag{10.9} \]

Remark 10.9. Assume that the random orthogonal measure $N$ is of space-time type, i.e. $E=(0,\infty)\times X$ where $(X,\mathcal{X})$ is some measurable space, and $\mathcal{R}=\{(0,t]\times B : B\in\mathcal{S}\}$ where $\mathcal{S}$ is a semiring in $\mathcal{X}$. If for $B\in\mathcal{S}$ the stochastic process $N_t(B) := N((0,t]\times B)$ is a martingale w.r.t. the filtration $\mathcal{F}_t := \sigma(N((0,s]\times B),\ s\le t,\ B\in\mathcal{S})$, then
\[ N_t(f) := \iint\mathbf{1}_{(0,t]}(s)f(x)\,N(ds,dx),\quad \mathbf{1}_{(0,t]}\otimes f\in L^2(\mu),\ t\ge 0, \]
is again a(n $L^2$-)martingale. For simple functions $f$ this follows immediately from the fact that sums and differences of finitely many martingales (with a common filtration) are again a martingale. Since $L^2(\mathbb{P})$-limits preserve the martingale property, the claim follows.

At first sight, the stochastic integral defined in Definition 10.8 looks rather restrictive since we can only integrate deterministic functions $f$. As all randomness can be put into the random orthogonal measure, we have considerable flexibility, and the following construction shows that Definition 10.8 covers pretty much the most general stochastic integrals.

From now on we assume that

• the random measure $N(dt,dx)$ on $(E,\mathcal{E})=((0,\infty)\times X,\ \mathcal{B}(0,\infty)\otimes\mathcal{X})$ is of space-time type, cf. Remark 10.9, with control measure $\mu(dt,dx)$;

• $(\mathcal{F}_t)_{t\ge 0}$ is some filtration in $(\Omega,\mathcal{A},\mathbb{P})$.

Let $\tau$ be a stopping time; the set $]\!]0,\tau]\!] := \{(\omega,t) : 0<t\le\tau(\omega)\}$ is called stochastic interval. We define

• E◦ := Ω × (0,∞) × X;

• E ◦ := P ⊗ X where P is the predictable σ-algebra in Ω × (0,∞), see Definition A.8 in the appendix;

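• $\mathcal{R}^\circ := \{\,]\!]0,\tau]\!]\times B : \tau \text{ bounded stopping time},\ B\in\mathcal{S}\}$;

• $\mu^\circ(d\omega,dt,dx) := \mathbb{P}(d\omega)\,\mu(dt,dx)$ as control measure;

• $N^\circ(\omega,\,]\!]0,\tau]\!]\times B) := N(\omega,(0,\tau(\omega)]\times B)$ as random orthogonal measure⁴.

Lemma 10.10. Let $N^\circ$, $\mathcal{R}^\circ_0$ and $\mu^\circ$ be as above. The $\mathcal{R}^\circ_0$-simple processes
\[ f(\omega,t,x) := \sum_{k=1}^n c_k\,\mathbf{1}_{]\!]0,\tau_k]\!]}(\omega,t)\,\mathbf{1}_{B_k}(x),\quad c_k\in\mathbb{R},\ ]\!]0,\tau_k]\!]\times B_k\in\mathcal{R}^\circ_0 \]
are $L^2(\mu^\circ)$-dense in $L^2(E^\circ,\mathcal{P}\otimes\sigma(\mathcal{S}),\mu^\circ)$.

Proof. This follows from standard arguments from measure and integration; notice that the predictable $\sigma$-algebra $\mathcal{P}\otimes\sigma(\mathcal{S})$ is generated by sets of the form $]\!]0,\tau]\!]\times B$ where $\tau$ is a bounded stopping time and $B\in\mathcal{S}$, cf. Theorem A.9 in the appendix.

Observe that for simple processes appearing in Lemma 10.10
\[ \iint f(\omega,t,x)\,N(\omega,dt,dx) := \sum_{k=1}^n c_k\,N^\circ(\omega,\,]\!]0,\tau_k]\!]\times B_k) = \sum_{k=1}^n c_k\,N(\omega,(0,\tau_k(\omega)]\times B_k) \]
is a stochastic integral which satisfies
\[ \mathbb{E}\Big[\Big(\iint f(\cdot,t,x)\,N(\cdot,dt,dx)\Big)^2\Big] = \sum_{k=1}^n c_k^2\,\mu^\circ(]\!]0,\tau_k]\!]\times B_k) = \sum_{k=1}^n c_k^2\,\mathbb{E}\mu((0,\tau_k]\times B_k). \]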

Just as above we can now extend the stochastic integral to L2(E◦,P ⊗ σ(S ), µ◦).

⁴ To see that it is indeed a random orthogonal measure, use a discrete approximation of $\tau$.

Corollary 10.11. Let $N(\omega,dt,dx)$ be a random orthogonal measure on $E$ of space-time type (cf. Remark 10.9) with control measure $\mu(dt,dx)$ and $f:\Omega\times(0,\infty)\times X\to\mathbb{R}$ be an element of $L^2(E^\circ,\mathcal{P}\otimes\sigma(\mathcal{S}),\mu^\circ)$. Then the stochastic integral
\[ \iint f(\omega,t,x)\,N(\omega,dt,dx) \]
exists and satisfies the following Itô isometry
\[ \mathbb{E}\Big[\Big(\iint f(\cdot,t,x)\,N(\cdot,dt,dx)\Big)^2\Big] = \mathbb{E}\iint f^2(\cdot,t,x)\,\mu(dt,dx). \tag{10.10} \]

Let us show that the stochastic integral w.r.t. a space-time random orthogonal measure extends the usual Itô integral. To do so we need the following auxiliary result.

Lemma 10.12. Let N(ω,dt,dx) be a random orthogonal measure on E of space-time type (cf. Remark 10.9) with control measure µ(dt,dx) and τ a stopping time. Then ZZ φ(ω)1 τ,∞ (ω,t) f (ω,t,x)N(ω,dt,dx) K J ZZ (10.11) = φ(ω) 1 τ,∞ (ω,t) f (ω,t,x)N(ω,dt,dx) K J ∞ 2 ◦ ◦ for all φ ∈ L (Fτ ) and f ∈ L (E ,P ⊗ σ(S ), µ ). 1 Proof. Since t 7→ τ,∞ (ω,t) is adapted to the filtration (Ft )t>0 and left-continuous, the inte- K J grands appearing in (10.11) are predictable, hence all stochastic integrals are well-defined.

◦ 1 Assume that φ(ω) = 1F (ω) for some F ∈ Fτ and f (ω,t,x) = 1 0,σ (ω,t)1B(x) for some ◦ K K bounded stopping time σ and 0,σ × B ∈ R0 . Define K K ◦ Nσ (ω,B) := N(ω,(0,σ(ω)] × B) = N (ω, 0,σ × B) K K ◦ 1 1 5 for any 0,σ × B ∈ R0 . The random time τF := τ F + ∞ Fc is a stopping time , and we have K K 1 1 1 1 1 1 1 1 1 1 φ τ,∞ f = F τ,∞ 0,σ B = τF ,∞ 0,σ B = τF ∧σ,σ B. K J K J K K K J K K K K From this we get (10.11) for our choice of φ and f : ZZ ZZ 1 1 1 φ τ,∞ (t) f (t,x)N(dt,dx) = τF ∧σ,σ (t) B(x)N(dt,dx) K J K K

= Nσ (B) − NτF ∧σ (B)

= 1F · (Nσ (B) − Nτ∧σ (B)) ZZ = 1F 1 τ∧σ,σ (t)1B(x)N(dt,dx) K K ZZ = 1F 1 τ,∞ 1 0,σ (t)1B(x)N(dt,dx) K J K K ZZ = φ 1 τ,∞ (t) f (t,x)N(dt,dx). K J ( /0, τ > t) 5 Indeed, {τF 6 t} = {τ 6 t} ∩ F = ∈ Ft for all t > 0. F, τ 6 t 72 R. L. Schilling: An Introduction to Lévy and Feller Processes

2° If $\phi=\mathbb{1}_F$ for some $F\in\mathcal{F}_\tau$ and $f$ is a simple process, then (10.11) follows from 1° because of the linearity of the stochastic integral.

3° If $\phi=\mathbb{1}_F$ for some $F\in\mathcal{F}_\tau$ and $f\in L^2(\mu^\circ)$, then (10.11) follows from 2° and Itô's isometry: Let $f_n$ be a sequence of simple processes which approximate $f$. Then
$$\begin{aligned}
\mathbb{E}\left[\left(\iint\big(\phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f_n(t,x) - \phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big)N(dt,dx)\right)^2\right]
&= \iint\mathbb{E}\left[\big(\phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f_n(t,x) - \phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big)^2\right]\mu(dt,dx)\\
&\le \iint\mathbb{E}\left[\big(f_n(t,x)-f(t,x)\big)^2\right]\mu(dt,dx)\xrightarrow[n\to\infty]{}0.
\end{aligned}$$

4° If $\phi$ is an $\mathcal{F}_\tau$ measurable step-function and $f\in L^2(\mu^\circ)$, then (10.11) follows from 3° because of the linearity of the stochastic integral.

5° Since we can approximate $\phi\in L^\infty(\mathcal{F}_\tau)$ uniformly by $\mathcal{F}_\tau$ measurable step functions $\phi_n$, (10.11) follows from 4° and Itô's isometry because of the following inequality:
$$\begin{aligned}
\mathbb{E}\left[\left(\iint\big(\phi_n\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x) - \phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big)N(dt,dx)\right)^2\right]
&= \iint\mathbb{E}\left[\big(\phi_n\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x) - \phi\mathbb{1}_{\rrbracket\tau,\infty\llbracket}(t)f(t,x)\big)^2\right]\mu(dt,dx)\\
&\le \|\phi_n-\phi\|^2_{L^\infty(\mathbb{P})}\iint\mathbb{E}\left[f^2(t,x)\right]\mu(dt,dx).
\end{aligned}$$

We will now consider ‘martingale noise’ random orthogonal measures, see Example 10.3.c), which are given by (the predictable quadratic variation of) a square-integrable martingale $M$. For these random measures our definition of the stochastic integral coincides with Itô's definition. Recall that the Itô integral driven by $M$ is first defined for simple, left-continuous processes of the form
$$f(\omega,t) := \sum_{k=1}^n\phi_k(\omega)\mathbb{1}_{\rrbracket\tau_k,\tau_{k+1}\rrbracket}(\omega,t),\quad t\ge 0, \tag{10.12}$$
where $0\le\tau_1\le\tau_2\le\dots\le\tau_{n+1}$ are bounded stopping times and $\phi_k$ are bounded $\mathcal{F}_{\tau_k}$ measurable random variables. The Itô integral for such simple processes is
$$\int f(\omega,t)\,dM_t(\omega) := \sum_{k=1}^n\phi_k(\omega)\big(M_{\tau_{k+1}}(\omega)-M_{\tau_k}(\omega)\big)$$
and it is extended by Itô's isometry to all integrands from $L^2(\Omega\times(0,\infty),\mathcal{P},d\mathbb{P}\otimes d\langle M\rangle_t)$. For details we refer to any standard text on Itô integration, e.g. Protter [43, Chapter II] or Revuz & Yor [44, Chapter IV].

We will now use Lemma 10.12 in the particular situation where the space component $dx$ is not present.

Theorem 10.13. Let $N(dt)$ be a ‘martingale noise’ random orthogonal measure induced by the square-integrable martingale $M$ (Example 10.3). The stochastic integral w.r.t. the random orthogonal measure $N(dt)$ and Itô's stochastic integral w.r.t. $M$ coincide.

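Proof. Let $0\le\tau_1\le\tau_2\le\dots\le\tau_{n+1}$ be bounded stopping times, $\phi_k\in L^\infty(\mathcal{F}_{\tau_k})$ bounded random variables and $f(\omega,t)$ be a simple stochastic process of the form (10.12). From Lemma 10.12 we get
$$\begin{aligned}
\int f(t)\,N(dt)
&= \sum_{k=1}^n\int\phi_k\mathbb{1}_{\rrbracket\tau_k,\tau_{k+1}\rrbracket}(t)\,N(dt)\\
&= \sum_{k=1}^n\int\phi_k\mathbb{1}_{\rrbracket\tau_k,\infty\llbracket}(t)\mathbb{1}_{\rrbracket 0,\tau_{k+1}\rrbracket}(t)\,N(dt)\\
&= \sum_{k=1}^n\phi_k\int\mathbb{1}_{\rrbracket\tau_k,\infty\llbracket}(t)\mathbb{1}_{\rrbracket 0,\tau_{k+1}\rrbracket}(t)\,N(dt)\\
&= \sum_{k=1}^n\phi_k\big(M_{\tau_{k+1}}-M_{\tau_k}\big).
\end{aligned}$$
This means that both stochastic integrals coincide on the simple stochastic processes. Since both integrals are extended by Itô's isometry, the assertion follows.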

Example 10.14. Using random orthogonal measures we can re-state the Lévy-Itô decomposition appearing in Theorem 9.12. For this, let Ne(dt,dx) be the Poisson random orthogonal measure (Ex- ample 10.3.d) on E = (0,∞)×(Rd \{0}) with control measure dt ×ν(dx) (ν is a Lévy measure). Additionally, we define for all deterministic functions h : (0,∞) × Rd → R ZZ h(s,x)N(ω,ds,dx) := ∑ h(s,∆Xs(ω)) ∀ω ∈ Ω 0

6 provided that the sum ∑0

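$$\begin{aligned}
X_t &= \sqrt{Q}\,W_t + \iint\mathbb{1}_{(0,t]}(s)\,y\,\mathbb{1}_{(0,1)}(|y|)\,\widetilde{N}(ds,dy) && \big(=: M_t,\ L^2\text{-martingale}\big)\\
&\quad + tl + \iint\mathbb{1}_{(0,t]}(s)\,y\,\mathbb{1}_{\{|y|\ge 1\}}\,N(ds,dy). && \big(=: A_t,\ \text{bdd. variation}\big)
\end{aligned}$$
In each line the first term ($\sqrt{Q}\,W_t$ and $tl$, respectively) belongs to the continuous Gaussian part, and the double integral belongs to the pure jump part.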

Example 10.14 is quite particular in the sense that $N(\cdot,dt,dx)$ is a bona fide positive measure, and the control measure $\mu(dt,dx)$ is also the compensator, i.e. a measure such that
$$\widetilde{N}((0,t]\times B) = N((0,t]\times B) - \mu((0,t]\times B)$$
is a square-integrable martingale.

6 This is essentially an ω-wise Riemann–Stieltjes integral. A sufficient condition for the absolute convergence is, e.g., that $h$ is continuous and $h(t,\cdot)$ vanishes uniformly in $t$ in some neighbourhood of $x=0$. The reason for this is the fact that $N_t(\omega,B_\varepsilon^c(0)) = N(\omega,(0,t]\times B_\varepsilon^c(0))<\infty$, i.e. there are at most finitely many jumps of size exceeding $\varepsilon>0$.

Following Ikeda & Watanabe [22, Chapter II.4] we can generalize the set-up of Example 10.14 in the following way: Let $N(\omega,dt,dx)$ be for each $\omega$ a positive measure of space-time type. Since $t\mapsto N(\omega,(0,t]\times B)$ is increasing, there is a unique compensator $\widehat{N}(\omega,dt,dx)$ such that for all $B$ with $\mathbb{E}\widehat{N}((0,t]\times B)<\infty$
$$\widetilde{N}(\omega,(0,t]\times B) := N(\omega,(0,t]\times B) - \widehat{N}(\omega,(0,t]\times B),\quad t\ge 0,$$
is a square-integrable martingale. If $t\mapsto\widehat{N}((0,t]\times B)$ is continuous and $B\mapsto\widehat{N}((0,t]\times B)$ a σ-finite measure, then one can show that the angle bracket satisfies
$$\big\langle\widetilde{N}((0,\cdot]\times B),\widetilde{N}((0,\cdot]\times C)\big\rangle_t = \widehat{N}\big((0,t]\times(B\cap C)\big).$$
This means, in particular, that $\widetilde{N}(\omega,dt,dx)$ is a random orthogonal measure with control measure $\mu((0,t]\times B)=\mathbb{E}\widehat{N}((0,t]\times B)$, and we are back in the theory which we have developed in the first part of this chapter. It is possible to develop a fully-fledged stochastic calculus for this kind of random measures.

Definition 10.15. Let $(\Omega,\mathcal{A},\mathbb{P})$ be a probability space with a filtration $(\mathcal{F}_t)_{t\ge 0}$. A semimartingale is a stochastic process $X$ of the form
$$X_t = X_0 + A_t + M_t + \int_0^t\!\!\int f(\cdot,s,x)\,\widetilde{N}(ds,dx) + \int_0^t\!\!\int g(\cdot,s,x)\,N(ds,dx)$$
($\int_0^t := \int_{(0,t]}$) where

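• $X_0$ is an $\mathcal{F}_0$ measurable random variable,

• $M$ is a continuous square-integrable martingale (w.r.t. $\mathcal{F}_t$),

• $A$ is a continuous $\mathcal{F}_t$ adapted process of bounded variation,

• $N(ds,dx)$, $\widetilde{N}(ds,dx)$ and $\widehat{N}(ds,dx)$ are as described above,

• $f\mathbb{1}_{\rrbracket 0,\tau_n\rrbracket}\in L^2(\Omega\times(0,\infty)\times X,\mathcal{P}\otimes\mathcal{X},\mu^\circ)$ for some increasing sequence $\tau_n\uparrow\infty$ of bounded stopping times,

• $g$ is such that $\int_0^t\!\int g(\omega,s,x)\,N(\omega,ds,dx)$ exists as an ω-wise integral,

• $f(\cdot,s,x)g(\cdot,s,x)\equiv 0$.

In this case, we even have Itô's formula, see [22, Chapter II.5], for any $F\in C^2(\mathbb{R},\mathbb{R})$:
$$\begin{aligned}
F(X_t)-F(X_0) &= \int_0^t F'(X_{s-})\,dA_s + \int_0^t F'(X_{s-})\,dM_s + \frac12\int_0^t F''(X_{s-})\,d\langle M\rangle_s\\
&\quad + \int_0^t\!\!\int\big[F(X_{s-}+f(s,x))-F(X_{s-})\big]\widetilde{N}(ds,dx)\\
&\quad + \int_0^t\!\!\int\big[F(X_{s-}+g(s,x))-F(X_{s-})\big]N(ds,dx)\\
&\quad + \int_0^t\!\!\int\big[F(X_s+f(s,x))-F(X_s)-f(s,x)F'(X_s)\big]\widehat{N}(ds,dx)
\end{aligned}$$
where we use again the convention that $\int_0^t := \int_{(0,t]}$.

11. From Lévy to Feller processes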

We have seen in Lemma 4.8 that the semigroup $P_tf(x) := \mathbb{E}^xf(X_t) = \mathbb{E}f(X_t+x)$ of a Lévy process $(X_t)_{t\ge 0}$ is a Feller semigroup. Moreover, the convolution structure of the transition semigroup $\mathbb{E}f(X_t+x) = \int f(x+y)\,\mathbb{P}(X_t\in dy)$ is a consequence of the spatial homogeneity (translation invariance) of the Lévy process, see Remark 4.5 and the characterization of translation invariant linear functionals (Theorem A.10). Lemma 4.4 shows that the translation invariance of a Lévy process is due to the assumptions (L1) and (L2).

It is, therefore, a natural question to ask what we get if we consider stochastic processes whose semigroups are Feller semigroups which are not translation invariant. Since every Feller semigroup admits a Markov transition kernel (Lemma 5.2), we can use Kolmogorov's construction to obtain a Markov process. Thus, the following definition makes sense.

Definition 11.1. A Feller process is a càdlàg Markov process $(X_t)_{t\ge 0}$, $X_t:\Omega\to\mathbb{R}^d$, $t\ge 0$, whose transition semigroup $P_tf(x)=\mathbb{E}^xf(X_t)$ is a Feller semigroup.

Remark 11.2. It is no restriction to require that a Feller process has càdlàg paths. By a fundamental result in the theory of stochastic processes we can construct such modifications. Usually, one argues like this: It is enough to study the coordinate processes, i.e. $d=1$. Rather than looking at $t\mapsto X_t$ we consider a (countable, point-separating) family of functions $u:\mathbb{R}\to\mathbb{R}$ and show that each $t\mapsto u(X_t)$ has a càdlàg modification. One way of achieving this is to use martingale regularization techniques (e.g. Revuz & Yor [44, Chapter II.2]) which means that we should pick $u$ in such a way that $u(X_t)$ is a supermartingale. The usual candidate for this is the resolvent $e^{-\lambda t}R_\lambda f(X_t)$ for some $f\in C_\infty^+(\mathbb{R})$. Indeed, if $\mathcal{F}_t=\sigma(X_s,\,s\le t)$ is the natural filtration, $f\ge 0$ and $s\le t$, then
$$\begin{aligned}
\mathbb{E}^x\big(R_\lambda f(X_t)\mid\mathcal{F}_s\big)
&= \mathbb{E}^{X_s}\int_0^\infty e^{-\lambda r}P_rf(X_{t-s})\,dr = \int_0^\infty e^{-\lambda r}P_rP_{t-s}f(X_s)\,dr\\
&= e^{\lambda(t-s)}\int_{t-s}^\infty e^{-\lambda u}P_uf(X_s)\,du \le e^{\lambda(t-s)}\int_0^\infty e^{-\lambda u}P_uf(X_s)\,du\\
&= e^{\lambda(t-s)}R_\lambda f(X_s).
\end{aligned}$$

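Let $\mathcal{F}_t=\mathcal{F}_t^X := \sigma(X_s,\,s\le t)$ be the canonical filtration.

Lemma 11.3. Every Feller process $(X_t)_{t\ge 0}$ is a strong Markov process, i.e.
$$\mathbb{E}^x\big(f(X_{t+\tau})\mid\mathcal{F}_\tau\big) = \mathbb{E}^{X_\tau}f(X_t),\quad\mathbb{P}^x\text{-a.s. on }\{\tau<\infty\},\ t\ge 0, \tag{11.1}$$
holds for any stopping time $\tau$, $\mathcal{F}_\tau := \{F\in\mathcal{F}_\infty : F\cap\{\tau\le t\}\in\mathcal{F}_t\ \forall t\ge 0\}$ and $f\in C_\infty(\mathbb{R}^d)$.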


A routine approximation argument shows that (11.1) extends to $f(y)=\mathbb{1}_K(y)$ (where $K$ is a compact set) and then, by a Dynkin-class argument, to any $f(y)=\mathbb{1}_B(y)$ where $B\in\mathcal{B}(\mathbb{R}^d)$.

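Proof. To prove (11.1), approximate $\tau$ from above by discrete stopping times $\tau_n = \big(\lfloor 2^n\tau\rfloor+1\big)2^{-n}$ and observe that for $F\in\mathcal{F}_\tau\cap\{\tau<\infty\}$
$$\mathbb{E}^x\big[\mathbb{1}_Ff(X_{t+\tau})\big] \overset{\text{(i)}}{=} \lim_{n\to\infty}\mathbb{E}^x\big[\mathbb{1}_Ff(X_{t+\tau_n})\big] \overset{\text{(ii)}}{=} \lim_{n\to\infty}\mathbb{E}^x\big[\mathbb{1}_F\mathbb{E}^{X_{\tau_n}}f(X_t)\big] \overset{\text{(iii)}}{=} \mathbb{E}^x\big[\mathbb{1}_F\mathbb{E}^{X_\tau}f(X_t)\big].$$

Here we use that $t\mapsto X_t$ is right-continuous, plus (i) dominated convergence and (iii) the Feller continuity 4.7.f); (ii) is the strong Markov property for discrete stopping times which follows directly from the Markov property: Since $\{\tau_n<\infty\}=\{\tau<\infty\}$, we get
$$\begin{aligned}
\mathbb{E}^x\big[\mathbb{1}_Ff(X_{t+\tau_n})\big]
&= \sum_{k=1}^\infty\mathbb{E}^x\big[\mathbb{1}_{F\cap\{\tau_n=k2^{-n}\}}f(X_{t+k2^{-n}})\big]\\
&= \sum_{k=1}^\infty\mathbb{E}^x\big[\mathbb{1}_{F\cap\{\tau_n=k2^{-n}\}}\mathbb{E}^{X_{k2^{-n}}}f(X_t)\big]\\
&= \mathbb{E}^x\big[\mathbb{1}_F\mathbb{E}^{X_{\tau_n}}f(X_t)\big].
\end{aligned}$$

In the last calculation we use that $F\cap\{\tau_n=k2^{-n}\}\in\mathcal{F}_{k2^{-n}}$ for all $F\in\mathcal{F}_\tau$.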
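The dyadic approximation $\tau_n = (\lfloor 2^n\tau\rfloor+1)2^{-n}$ used in the proof of Lemma 11.3 is easy to experiment with. The following sketch evaluates it for a fixed numerical value of $\tau$ (for a single ω, so to speak); the function name `dyadic_upper` and the sample value are hypothetical and only serve as an illustration.

```python
import math

def dyadic_upper(tau: float, n: int) -> float:
    """Return tau_n = (floor(2^n * tau) + 1) * 2^(-n), a dyadic number > tau."""
    return (math.floor(2**n * tau) + 1) / 2**n

# Hypothetical value of the stopping time at one fixed omega.
tau = 0.3
approximations = [dyadic_upper(tau, n) for n in range(1, 11)]
```

The values lie on the grid $2^{-n}\mathbb{N}$, are strictly larger than $\tau$ (even if $\tau$ itself is dyadic), decrease in $n$ and satisfy $\tau_n-\tau\le 2^{-n}$. Strictness is what makes $\{\tau_n=k2^{-n}\}=\{(k-1)2^{-n}\le\tau<k2^{-n}\}$ an element of $\mathcal{F}_{k2^{-n}}$.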

Once we know the generator of a Feller process, we can construct many important martingales with respect to the canonical filtration of the process.

Corollary 11.4. Let $(X_t)_{t\ge 0}$ be a Feller process with generator $(A,D(A))$ and semigroup $(P_t)_{t\ge 0}$. For every $f\in D(A)$ the process
$$M_t^{[f]} := f(X_t) - \int_0^tAf(X_r)\,dr,\quad t\ge 0, \tag{11.2}$$
is a martingale for the canonical filtration $\mathcal{F}_t^X := \sigma(X_s,\,s\le t)$ and any $\mathbb{P}^x$, $x\in\mathbb{R}^d$.

Proof. Let $s\le t$, $f\in D(A)$ and write, for short, $\mathcal{F}_s := \mathcal{F}_s^X$ and $M_t := M_t^{[f]}$. By the Markov property
$$\begin{aligned}
\mathbb{E}^x\big[M_t-M_s\mid\mathcal{F}_s\big]
&= \mathbb{E}^x\left[f(X_t)-f(X_s)-\int_s^tAf(X_r)\,dr\,\Big|\,\mathcal{F}_s\right]\\
&= \mathbb{E}^{X_s}f(X_{t-s}) - f(X_s) - \mathbb{E}^{X_s}\int_0^{t-s}Af(X_u)\,du.
\end{aligned}$$
On the other hand, we get from the semigroup identity (5.5)
$$\mathbb{E}^{X_s}\int_0^{t-s}Af(X_u)\,du = \int_0^{t-s}P_uAf(X_s)\,du = P_{t-s}f(X_s)-f(X_s) = \mathbb{E}^{X_s}f(X_{t-s})-f(X_s)$$
which shows that $\mathbb{E}^x[M_t-M_s\mid\mathcal{F}_s]=0$.
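The only analytic input in the proof of Corollary 11.4 is the identity $P_tf-f=\int_0^tP_uAf\,du$. As a sanity check, here is a minimal sketch for a toy model which is not covered by the text: a continuous-time Markov chain on three states, where the generator is a matrix $Q$ with zero row sums and $P_t=e^{tQ}$. The matrix, the function $f$ and all helper names are hypothetical; the matrix exponential is computed by a plain Taylor series and the integral by Simpson's rule.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(a, v):
    """Matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def mat_exp(q, t, terms=40):
    """P_t = exp(t*Q) via the Taylor series sum_k (tQ)^k / k!."""
    n = len(q)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, [[t * x / k for x in row] for row in q])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def identity_gap(q, f, t, steps=200):
    """max_x |P_t f(x) - f(x) - int_0^t P_u Q f(x) du|, Simpson's rule (even steps)."""
    qf = mat_vec(q, f)
    h = t / steps
    integral = [0.0] * len(f)
    for m in range(steps + 1):
        weight = 1 if m in (0, steps) else (4 if m % 2 else 2)
        values = mat_vec(mat_exp(q, m * h), qf)
        integral = [s + weight * v * h / 3 for s, v in zip(integral, values)]
    lhs = [p - x for p, x in zip(mat_vec(mat_exp(q, t), f), f)]
    return max(abs(l - r) for l, r in zip(lhs, integral))

# Hypothetical generator matrix (rows sum to zero) and test function.
Q = [[-1.0, 1.0, 0.0], [0.5, -1.0, 0.5], [0.0, 2.0, -2.0]]
f = [1.0, 0.0, -2.0]
gap = identity_gap(Q, f, t=1.0)
```

Up to quadrature error the gap vanishes, which is the finite-state version of the cancellation in the last display of the proof.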

Our approach from Chapter 6 to prove the structure of a Lévy generator ‘only’ uses the positive maximum principle. Therefore, it can be adapted to Feller processes provided that the domain $D(A)$ is rich in the sense that $C_c^\infty(\mathbb{R}^d)\subset D(A)$. All we have to do is to take into account that Feller processes are not any longer invariant under translations. The following theorem is due to Courrège [14] and von Waldenfels [61, 62].

Theorem 11.5 (von Waldenfels, Courrège). Let $(A,D(A))$ be the generator of a Feller process such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$. Then $A|_{C_c^\infty(\mathbb{R}^d)}$ is a pseudo differential operator
$$Au(x) = -q(x,D)u(x) := -\int q(x,\xi)\,\widehat{u}(\xi)\,e^{ix\cdot\xi}\,d\xi \tag{11.3}$$
whose symbol $q:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{C}$ is a measurable function of the form
$$q(x,\xi) = \underbrace{q(x,0)}_{\ge 0} - il(x)\cdot\xi + \frac12\xi\cdot Q(x)\xi + \int_{y\ne 0}\Big[1-e^{iy\cdot\xi}+iy\cdot\xi\,\mathbb{1}_{(0,1)}(|y|)\Big]\nu(x,dy) \tag{11.4}$$
and $(l(x),Q(x),\nu(x,dy))$ is a Lévy triplet (see footnote 1) for every fixed $x\in\mathbb{R}^d$.

If we insert (11.4) into (11.3) and invert the Fourier transform we obtain the following integro-differential representation of the Feller generator $A$:
$$Af(x) = l(x)\cdot\nabla f(x) + \frac12\nabla\cdot Q(x)\nabla f(x) + \int_{y\ne 0}\Big[f(x+y)-f(x)-\nabla f(x)\cdot y\,\mathbb{1}_{(0,1)}(|y|)\Big]\nu(x,dy). \tag{11.5}$$

This formula obviously extends to all functions $f\in C_b^2(\mathbb{R}^d)$. In particular, we may use the function $f(x)=e_\xi(x)=e^{ix\cdot\xi}$, and get (see footnote 2)
$$e_{-\xi}(x)Ae_\xi(x) = -q(x,\xi). \tag{11.6}$$

Proof of Theorem 11.5 (sketch). For a worked-out version see [9, Chapter 2.3]. In the proof of Theorem 6.8 use, instead of $A_0$ and $A_{00}$,
$$A_0f \rightsquigarrow A_xf := (Af)(x)\quad\text{and}\quad A_{00}f \rightsquigarrow A_{xx}f := A_x\big(|\cdot-x|^2f\big)$$
for every $x\in\mathbb{R}^d$. This is needed since $P_t$ and $A$ are not any longer translation invariant, i.e. we cannot shift $A_0f$ to get $A_xf$. Then follow the steps 1°–4° to get $\nu(dy)\rightsquigarrow\nu(x,dy)$ and 6°–9° for $(l(x),Q(x))$. Remark 6.9 shows that the term $q(x,0)$ is non-negative.

The key observation is, as in the proof of Theorem 6.8, that we can use in steps 3° and 7° the positive maximum principle (see footnote 3) to make sure that $A_xf$ is a distribution of order 2, i.e.
$$|A_xf| = |L_xf+S_xf| \le C_K\|f\|_{(2)}\quad\text{for all }f\in C_c^\infty(K)\text{ and all compact sets }K\subset\mathbb{R}^d.$$

Here $L_x$ is the local part with support in $\{x\}$ accounting for $(q(x,0),l(x),Q(x))$, and $S_x$ is the non-local part supported in $\mathbb{R}^d\setminus\{x\}$ giving $\nu(x,dy)$.

With some abstract functional analysis we can show some (local) boundedness properties of x 7→ A f (x) and (x,ξ) 7→ q(x,ξ).

1 Cf. Definition 6.10.
2 This should be compared with Definition 6.4 and the subsequent comments.
3 To be precise: its weakened form (PP), cf. page 41.

Corollary 11.6. In the situation of Theorem 11.5, the condition (PP) shows that
$$\sup_{|x|\le r}|Af(x)| \le C_r\|f\|_{(2)}\quad\text{for all }f\in C_c^\infty(B_r(0)) \tag{11.7}$$
and the positive maximum principle (PMP) gives
$$\sup_{|x|\le r}|Af(x)| \le C_{r,A}\|f\|_{(2)}\quad\text{for all }f\in C_c^\infty(\mathbb{R}^d),\ r>0. \tag{11.8}$$

Proof. In the above sketched analogue of the proof of Theorem 6.8 we have seen that the family of linear functionals
$$\big\{C_c^\infty(B_r(0))\ni f\mapsto A_xf\ :\ x\in B_r(0)\big\}\quad\text{where }A_xf := (Af)(x)$$
satisfies
$$|A_xf| \le c_{r,x}\|f\|_{(2)},\quad f\in C_c^\infty(B_r(0)),$$
i.e. $A_x:(C_b^2(B_r(0)),\|\cdot\|_{(2)})\to(\mathbb{R},|\cdot|)$ is bounded. By the Banach–Steinhaus theorem (uniform boundedness principle)
$$\sup_{|x|\le r}|A_xf| \le C_r\|f\|_{(2)}.$$
Since $A$ also satisfies the positive maximum principle (PMP), we know from step 4° of the (suitably adapted) proof of Theorem 6.8 that
$$\int_{|y|\ge 1}\nu(x,dy) \le A\phi_0(x)\quad\text{for some }\phi_0\in C_c(B_1(0)).$$
Let $r>1$, pick $\chi=\chi_r\in C_c^\infty(\mathbb{R}^d)$ such that $\mathbb{1}_{B_{2r}(0)}\le\chi\le\mathbb{1}_{B_{3r}(0)}$. We get for $|x|\le r$
$$\begin{aligned}
Af(x) &= A[\chi f](x) + A[(1-\chi)f](x)\\
&= A[\chi f](x) + \int_{|y|\ge r}\Big[(1-\chi(x+y))f(x+y) - \underbrace{(1-\chi(x))f(x)}_{=0}\Big]\nu(x,dy),
\end{aligned}$$
and so
$$\sup_{|x|\le r}|Af(x)| \le C_r\|\chi f\|_{(2)} + \|f\|_\infty\|A\phi_0\|_\infty \le C_{r,A}\|f\|_{(2)}.$$

Corollary 11.7. In the situation of Theorem 11.5 there exists a locally bounded nonnegative function $\gamma:\mathbb{R}^d\to[0,\infty)$ such that
$$|q(x,\xi)| \le \gamma(x)(1+|\xi|^2),\quad x,\xi\in\mathbb{R}^d. \tag{11.9}$$

Proof. Using (11.8) we can extend $A$ by continuity to $C_b^2(\mathbb{R}^d)$ and, therefore,
$$-q(x,\xi) = e_{-\xi}(x)Ae_\xi(x),\quad e_\xi(x)=e^{ix\cdot\xi}$$
makes sense. Moreover, we have $\sup_{|x|\le r}|Ae_\xi(x)|\le C_{r,A}\|e_\xi\|_{(2)}$ for any $r\ge 1$; since $\|e_\xi\|_{(2)}$ is a polynomial of order 2 in the variable $\xi$, the claim follows.

For a Lévy process we have $\psi(0)=0$ since $\mathbb{P}(X_t\in\mathbb{R}^d)=1$ for all $t\ge 0$, i.e. the process $X_t$ does not explode in finite time. For Feller processes the situation is more complicated. We need the following technical lemmas.

Lemma 11.8. Let $q(x,\xi)$ be the symbol of (the generator of) a Feller process as in Theorem 11.5 and $F\subset\mathbb{R}^d$ be a closed set. Then the following assertions are equivalent.

a) $|q(x,\xi)|\le C(1+|\xi|^2)$ for all $x,\xi\in\mathbb{R}^d$ where $C = 2\sup_{|\xi|\le 1}\sup_{x\in F}|q(x,\xi)|$.

b) $\displaystyle\sup_{x\in F}q(x,0) + \sup_{x\in F}|l(x)| + \sup_{x\in F}\|Q(x)\| + \sup_{x\in F}\int_{y\ne 0}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy) < \infty$.

If $F=\mathbb{R}^d$, then the equivalent properties of Lemma 11.8 are often referred to as ‘the symbol has bounded coefficients’.

Outline of the proof (see [57, Appendix] for a complete proof). The direction b)⇒a) is proved as Theorem 6.2. Observe that $\xi\mapsto q(x,\xi)$ is for fixed $x$ the characteristic exponent of a Lévy process. The finiteness of the constant $C$ follows from the assumption b) and the Lévy–Khintchine formula (11.4).

For the converse a)⇒b) we note that the integrand appearing in (11.4) can be estimated by $c\,\frac{|y|^2}{1+|y|^2}$ which is itself a Lévy exponent:
$$\frac{|y|^2}{1+|y|^2} = \int\big[1-\cos(y\cdot\xi)\big]g(\xi)\,d\xi,\qquad g(\xi) = \frac12\int_0^\infty(2\pi\lambda)^{-d/2}e^{-|\xi|^2/2\lambda}e^{-\lambda/2}\,d\lambda.$$
Therefore, by Tonelli's theorem,
$$\begin{aligned}
\int_{y\ne 0}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy)
&= \iint_{y\ne 0}\big[1-\cos(y\cdot\xi)\big]\nu(x,dy)\,g(\xi)\,d\xi\\
&= \int g(\xi)\Big(\operatorname{Re}q(x,\xi) - \frac12\xi\cdot Q(x)\xi - q(x,0)\Big)d\xi.
\end{aligned}$$

Lemma 11.9. Let $A$ be the generator of a Feller process, assume that $C_c^\infty(\mathbb{R}^d)\subset D(A)$ and denote by $q(x,\xi)$ the symbol of $A$. For any cut-off function $\chi\in C_c^\infty(\mathbb{R}^d)$ satisfying $\mathbb{1}_{B_1(0)}\le\chi\le\mathbb{1}_{B_2(0)}$ and $\chi_r(x) := \chi(x/r)$ one has
$$\big|q(x,D)(\chi_re_\xi)(x)\big| \le 4\sup_{|\eta|\le 1}|q(x,\eta)|\int_{\mathbb{R}^d}\big[1+r^{-2}|\rho|^2+|\xi|^2\big]\big|\widehat{\chi}(\rho)\big|\,d\rho, \tag{11.10}$$
$$\lim_{r\to\infty}A(\chi_re_\xi)(x) = -e_\xi(x)q(x,\xi)\quad\text{for all }x,\xi\in\mathbb{R}^d. \tag{11.11}$$

Proof. Observe that $\widehat{\chi_re_\xi}(\eta) = r^d\widehat{\chi}(r(\eta-\xi))$ and
$$\begin{aligned}
q(x,D)(\chi_re_\xi)(x) &= \int q(x,\eta)e^{ix\cdot\eta}\,\widehat{\chi_re_\xi}(\eta)\,d\eta\\
&= \int q(x,\eta)e^{ix\cdot\eta}\,r^d\widehat{\chi}(r(\eta-\xi))\,d\eta\\
&= \int q(x,\xi+r^{-1}\rho)\,e^{ix\cdot(\xi+\rho/r)}\,\widehat{\chi}(\rho)\,d\rho.
\end{aligned} \tag{11.12}$$

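Therefore we can use the estimate (11.9) with the optimal constant $\gamma(x) = 2\sup_{|\eta|\le 1}|q(x,\eta)|$ and the elementary estimate $(a+b)^2\le 2(a^2+b^2)$ to obtain
$$\begin{aligned}
\big|q(x,D)(\chi_re_\xi)(x)\big| &\le \int\big|q(x,\xi+r^{-1}\rho)\big|\,\big|\widehat{\chi}(\rho)\big|\,d\rho\\
&\le 4\sup_{|\eta|\le 1}|q(x,\eta)|\int\big[1+r^{-2}|\rho|^2+|\xi|^2\big]\big|\widehat{\chi}(\rho)\big|\,d\rho.
\end{aligned}$$
This proves (11.10); it also allows us to use dominated convergence in (11.12) to get (11.11). Just observe that $\widehat{\chi}\in\mathcal{S}(\mathbb{R}^d)$ and $\int\widehat{\chi}(\rho)\,d\rho=\chi(0)=1$.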
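The representation of $|y|^2/(1+|y|^2)$ as a Lévy exponent, which is used in the proof of Lemma 11.8, can be checked numerically in dimension $d=1$. The sketch below rests on an assumption which is not spelled out in the text: for $d=1$ the λ-integral defining $g$ is a Laplace transform of the heat kernel and evaluates to $g(\xi)=\frac12e^{-|\xi|}$. With this, the identity becomes $\int_{\mathbb{R}}[1-\cos(y\xi)]\frac12e^{-|\xi|}\,d\xi = y^2/(1+y^2)$. The helper names, the cut-off and the step count are hypothetical choices.

```python
import math

def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for m in range(1, steps):
        total += (4 if m % 2 else 2) * func(a + m * h)
    return total * h / 3

def levy_exponent_integral(y, cutoff=40.0, steps=40000):
    """Approximate int_R (1 - cos(y*xi)) * 0.5*exp(-|xi|) d(xi).

    By symmetry this equals int_0^infty (1 - cos(y*xi)) * exp(-xi) d(xi);
    the tail beyond the cut-off is of order exp(-cutoff) and is ignored.
    """
    integrand = lambda xi: (1.0 - math.cos(y * xi)) * math.exp(-xi)
    return simpson(integrand, 0.0, cutoff, steps)

# Compare with the closed form y^2 / (1 + y^2) for a few sample values of y.
errors = [abs(levy_exponent_integral(y) - y * y / (1.0 + y * y))
          for y in (0.5, 1.0, 3.0)]
```

In particular $\int g(\xi)\,d\xi=1$ and $\int(1+|\xi|^2)g(\xi)\,d\xi<\infty$ are visible from the explicit form of $g$ in this one-dimensional case.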

Lemma 11.10. Let q(x,ξ) be the symbol of (the generator of ) a Feller process. Then the following assertions are equivalent:

a) x 7→ q(x,ξ) is continuous for all ξ.

b) x 7→ q(x,0) is continuous.

c) Tightness: $\displaystyle\lim_{r\to\infty}\sup_{x\in K}\nu(x,\mathbb{R}^d\setminus B_r(0))=0$ for all compact sets $K\subset\mathbb{R}^d$.

d) Uniform continuity at the origin: $\displaystyle\lim_{|\xi|\to 0}\sup_{x\in K}|q(x,\xi)-q(x,0)|=0$ for all compact $K\subset\mathbb{R}^d$.

Proof. Let $\chi:[0,\infty)\to[0,1]$ be a decreasing $C^\infty$-function satisfying $\mathbb{1}_{[0,1)}\le\chi\le\mathbb{1}_{[0,4)}$. The functions $\chi_n(x) := \chi(|x|^2/n^2)$, $x\in\mathbb{R}^d$ and $n\in\mathbb{N}$, are radially symmetric, smooth functions with $\mathbb{1}_{B_n(0)}\le\chi_n\le\mathbb{1}_{B_{2n}(0)}$. Fix any compact set $K\subset\mathbb{R}^d$ and some sufficiently large $n_0$ such that $K\subset B_{n_0}(0)$.

a)⇒b) is obvious.

b)⇒c) For $m\ge n\ge 2n_0$ the positive maximum principle implies
$$\mathbb{1}_K(x)A(\chi_n-\chi_m)(x)\ge 0.$$
Therefore, $-\mathbb{1}_K(x)q(x,0) = \lim_{n\to\infty}\mathbb{1}_K(x)A\chi_{n+n_0}(x)$ is a decreasing limit of continuous functions. Since the limit function $q(x,0)$ is continuous, Dini's theorem implies that the limit is uniform on the set $K$. From the integro-differential representation (11.5) of the generator we get
$$\mathbb{1}_K(x)\,|A\chi_m(x)-A\chi_n(x)| = \mathbb{1}_K(x)\int_{n-n_0\le|y|\le 2m+n_0}\big(\chi_m(x+y)-\chi_n(x+y)\big)\nu(x,dy).$$
Letting $m\to\infty$ yields
$$\begin{aligned}
\mathbb{1}_K(x)\,|q(x,0)+A\chi_n(x)| &= \mathbb{1}_K(x)\int_{|y|\ge n-n_0}\big(1-\chi_n(x+y)\big)\nu(x,dy)\\
&\ge \mathbb{1}_K(x)\int_{|y|\ge n-n_0}\big(1-\mathbb{1}_{B_{2n+n_0}(0)}(y)\big)\nu(x,dy)\\
&\ge \mathbb{1}_K(x)\,\nu\big(x,B^c_{2n+n_0}(0)\big),
\end{aligned}$$
where we use that $K\subset B_{n_0}(0)$ and
$$\chi_n(x+y)\le\mathbb{1}_{B_{2n}(0)}(x+y) = \mathbb{1}_{B_{2n}(0)-x}(y)\le\mathbb{1}_{B_{2n+n_0}(0)}(y).$$
Since the left-hand side converges uniformly to 0 as $n\to\infty$, c) follows.

c)⇒d) Since the function $x\mapsto q(x,\xi)$ is locally bounded, we conclude from Lemma 11.8 that $\sup_{x\in K}|l(x)|+\sup_{x\in K}\|Q(x)\|<\infty$. Thus, $\lim_{|\xi|\to 0}\sup_{x\in K}\big(|l(x)\cdot\xi|+|\xi\cdot Q(x)\xi|\big)=0$, and we may safely assume that $l\equiv 0$ and $Q\equiv 0$. If $|\xi|\le 1$ we find, using (11.4) and Taylor's formula for the integrand,
$$\begin{aligned}
|q(x,\xi)-q(x,0)| &= \left|\int_{y\ne 0}\Big[1-e^{iy\cdot\xi}+iy\cdot\xi\,\mathbb{1}_{(0,1)}(|y|)\Big]\nu(x,dy)\right|\\
&\le \int_{0<|y|^2<1}\frac12|y|^2|\xi|^2\,\nu(x,dy) + \int_{1\le|y|^2<1/|\xi|}|y||\xi|\,\nu(x,dy) + \int_{|y|^2\ge 1/|\xi|}2\,\nu(x,dy)\\
&\le \int_{0<|y|^2<1/|\xi|}\frac{|y|^2}{1+|y|^2}\,\nu(x,dy)\,\big(1+|\xi|^{-1/2}\big)|\xi| + 2\nu\big(x,\{y:|y|^2\ge 1/|\xi|\}\big).
\end{aligned}$$
Since this estimate is uniform for $x\in K$, we get d) as $|\xi|\to 0$.

d)⇒c) As before, we may assume that $l\equiv 0$ and $Q\equiv 0$. For every $r>0$
$$\begin{aligned}
\frac12\nu\big(x,B_r^c(0)\big) &\le \int_{|y|\ge r}\frac{|y/r|^2}{1+|y/r|^2}\,\nu(x,dy)\\
&= \int_{|y|\ge r}\int_{\mathbb{R}^d}\Big(1-\cos\frac{\eta\cdot y}{r}\Big)g(\eta)\,d\eta\,\nu(x,dy)\\
&\le \int_{\mathbb{R}^d}\Big(\operatorname{Re}q\big(x,\tfrac{\eta}{r}\big)-q(x,0)\Big)g(\eta)\,d\eta,
\end{aligned}$$
where $g(\eta)$ is as in the proof of Lemma 11.8. Since $\int(1+|\eta|^2)g(\eta)\,d\eta<\infty$, we can use (11.9) and find
$$\nu\big(x,B_r^c(0)\big) \le c_g\sup_{|\eta|\le 1/r}\big|\operatorname{Re}q(x,\eta)-q(x,0)\big|.$$
Taking the supremum over all $x\in K$ and letting $r\to\infty$ proves c).

c)⇒a) From Lemma 11.9 we know that $\lim_{n\to\infty}e_{-\xi}(x)A[\chi_ne_\xi](x) = -q(x,\xi)$. Let us show that this convergence is uniform for $x\in K$. Let $m\ge n\ge 2n_0$. For $x\in K$
$$\begin{aligned}
\big|e_{-\xi}(x)A[e_\xi\chi_n](x)-e_{-\xi}(x)A[e_\xi\chi_m](x)\big|
&= \left|\int_{y\ne 0}\big[e_\xi(y)\chi_n(x+y)-e_\xi(y)\chi_m(x+y)\big]\nu(x,dy)\right|\\
&\le \int_{y\ne 0}\big[\chi_m(x+y)-\chi_n(x+y)\big]\nu(x,dy)\\
&\le \int_{n-n_0\le|y|\le 2m+n_0}\big[\chi_m(x+y)-\chi_n(x+y)\big]\nu(x,dy)\\
&\le \nu\big(x,B^c_{n-n_0}(0)\big).
\end{aligned}$$
In the penultimate step we use that, because of the definition of the functions $\chi_n$,
$$\operatorname{supp}\big(\chi_m(x+\cdot)-\chi_n(x+\cdot)\big)\subset B_{2m}(x)\setminus B_n(x)\subset B_{2m+n_0}(0)\setminus B_{n-n_0}(0)$$
for all $x\in K\subset B_{n_0}(0)$. The right-hand side tends to 0 uniformly for $x\in K$ as $n\to\infty$, hence $m\to\infty$.

Remark 11.11. The argument used in the first three lines of the step b)⇒c) in the proof of Lemma 11.10 shows, incidentally, that $x\mapsto q(x,\xi)$ is always upper semicontinuous since it is (locally) a decreasing limit of continuous functions.

Remark 11.12. Let $A$ be the generator of a Feller process, and assume that $C_c^\infty(\mathbb{R}^d)\subset D(A)$; although $A$ maps $D(A)$ into $C_\infty(\mathbb{R}^d)$, this is not enough to guarantee that the symbol $q(x,\xi)$ is continuous in the variable $x$. On the other hand, if the Feller process $X$ has only bounded jumps, i.e. if the support of the Lévy measure $\nu(x,\cdot)$ is uniformly bounded, then $q(\cdot,\xi)$ is continuous. This is, in particular, true for diffusions.

This follows immediately from Lemma 11.10.c) which holds if $\nu(x,B_r^c(0))=0$ for some $r>0$ and all $x\in\mathbb{R}^d$.

We can also give a direct argument: pick $\chi\in C_c^\infty(\mathbb{R}^d)$ satisfying $\mathbb{1}_{B_{3r}(0)}\le\chi\le 1$. From the representation (11.5) it is not hard to see that
$$Af(x) = A[\chi f](x)\quad\text{for all }f\in C_b^2(\mathbb{R}^d)\text{ and }x\in B_r(0);$$
in particular, $A[\chi f]$ is continuous.

If we take $f(x) := e_\xi(x)$, then, by (11.6), $-q(x,\xi) = e_{-\xi}(x)Ae_\xi(x) = e_{-\xi}(x)A[\chi e_\xi](x)$ which proves that $x\mapsto q(x,\xi)$ is continuous on every ball $B_r(0)$, hence everywhere.

We can now discuss the role of $q(x,\xi)$ for the conservativeness of a Feller process.

Theorem 11.13. Let $(X_t)_{t\ge 0}$ be a Feller process with infinitesimal generator $(A,D(A))$ such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$, symbol $q(x,\xi)$ and semigroup $(P_t)_{t\ge 0}$.

a) If $x\mapsto q(x,\xi)$ is continuous for all $\xi\in\mathbb{R}^d$ and $P_t1=1$, then $q(x,0)=0$.

b) If $q(x,\xi)$ has bounded coefficients and $q(x,0)=0$, then $x\mapsto q(x,\xi)$ is continuous for all $\xi\in\mathbb{R}^d$ and $P_t1=1$.

Proof. Let $\chi$ and $\chi_r$, $r\in\mathbb{N}$, be as in Lemma 11.9. Then $e_\xi\chi_r\in D(A)$ and, by Corollary 11.4,
$$M_t := e_\xi\chi_r(X_t) - e_\xi\chi_r(x) - \int_{[0,t)}A(e_\xi\chi_r)(X_s)\,ds,\quad t\ge 0,$$
is a martingale. Using optional stopping for the stopping time
$$\tau := \tau_R^x := \inf\{s>0 : |X_s-x|>R\},\quad x\in\mathbb{R}^d,\ R>0,$$
the stopped process $(M_{t\wedge\tau})_{t\ge 0}$ is still a martingale. Since $\mathbb{E}^xM_{t\wedge\tau}=0$, we get
$$\mathbb{E}^x(\chi_re_\xi)(X_{t\wedge\tau}) - \chi_re_\xi(x) = \mathbb{E}^x\int_{[0,t\wedge\tau)}A(\chi_re_\xi)(X_s)\,ds.$$

Note that the integrand is evaluated only for times $s<t\wedge\tau$ where $|X_s|\le R+|x|$. Since $A(e_\xi\chi_r)(x)$ is locally bounded, we can use dominated convergence and Lemma 11.9 and we find, as $r\to\infty$,
$$\mathbb{E}^xe_\xi(X_{t\wedge\tau}) - e_\xi(x) = -\mathbb{E}^x\int_{[0,t\wedge\tau)}e_\xi(X_s)q(X_s,\xi)\,ds.$$

a) Set $\xi=0$ and observe that $P_t1=1$ implies that $\tau=\tau_R^x\to\infty$ a.s. as $R\to\infty$. Therefore,
$$\mathbb{P}^x(X_{t\wedge\tau}\in\mathbb{R}^d) - 1 = -\mathbb{E}^x\int_{[0,\tau\wedge t)}q(X_s,0)\,ds,$$
and with Fatou's Lemma we can let $R\to\infty$ to get
$$0 = \liminf_{R\to\infty}\mathbb{E}^x\int_{[0,\tau\wedge t)}q(X_s,0)\,ds \ge \mathbb{E}^x\left[\liminf_{R\to\infty}\int_{[0,\tau\wedge t)}q(X_s,0)\,ds\right] = \mathbb{E}^x\int_0^tq(X_s,0)\,ds.$$
Since $x\mapsto q(x,0)$ is continuous and $q(x,0)$ non-negative, we conclude with Tonelli's theorem that
$$q(x,0) = \lim_{t\to 0}\frac1t\mathbb{E}^x\int_0^tq(X_s,0)\,ds = 0.$$

b) Set $\xi=0$ and observe that the boundedness of the coefficients implies that
$$\mathbb{P}^x(X_t\in\mathbb{R}^d) - 1 = -\mathbb{E}^x\int_{[0,t)}q(X_s,0)\,ds$$
as $R\to\infty$. Since the right-hand side is 0, we get $P_t1=\mathbb{P}^x(X_t\in\mathbb{R}^d)=1$.

Remark 11.14. The boundedness of the coefficients in Theorem 11.13.b) is important. If the coefficients of $q(x,\xi)$ grow too rapidly, we may observe explosion in finite time even if $q(x,0)=0$. A typical example in dimension 3 is the diffusion process given by the generator
$$Lf(x) = \frac12a(x)\Delta f(x)$$
where $a(x)$ is continuous, rotationally symmetric $a(x)=\alpha(|x|)$ for a suitable function $\alpha(r)$, and satisfies $\int_1^\infty 1/\alpha(\sqrt{r})\,dr<\infty$, see Stroock & Varadhan [60, p. 260, 10.3.3]; the corresponding symbol is $q(x,\xi)=\frac12a(x)|\xi|^2$. This process explodes in finite time. Since this is essentially a time-changed Brownian motion (see Böttcher, Schilling & Wang [9, Chapter 4.1]), this example works only if Brownian motion is transient, i.e. in dimensions $d=3$ and higher.

A sufficient criterion for conservativeness in terms of the symbol is
$$\liminf_{r\to\infty}\sup_{|y-x|\le 2r}\sup_{|\eta|\le 1/r}|q(y,\eta)|<\infty\quad\text{for all }x\in\mathbb{R}^d,$$
see [9, Theorem 2.34].

12. Symbols and semimartingales

So far, we have been treating the symbol $q(x,\xi)$ of (the generator of) a Feller process $X$ as an analytic object. On the other hand, Theorem 11.13 indicates that there should be some probabilistic consequences. In this chapter we want to follow this lead, show a probabilistic method to calculate the symbol and link it to the semimartingale characteristics of a Feller process. The blueprint for this is the relation of the Lévy–Itô decomposition (which is the semimartingale decomposition of a Lévy process, cf. Theorem 9.12) with the Lévy–Khintchine formula for the characteristic exponent (which coincides with the symbol of a Lévy process, cf. Corollary 9.13).

For a Lévy process $X_t$ with semigroup $P_tf(x)=\mathbb{E}^xf(X_t)=\mathbb{E}f(X_t+x)$ the symbol can be calculated in the following way:
$$\lim_{t\to 0}\frac{e_{-\xi}(x)P_te_\xi(x)-1}{t} = \lim_{t\to 0}\frac{\mathbb{E}^xe^{i\xi\cdot(X_t-x)}-1}{t} = \lim_{t\to 0}\frac{e^{-t\psi(\xi)}-1}{t} = -\psi(\xi). \tag{12.1}$$
For a Feller process a similar formula is true.
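The limit (12.1) can be watched numerically for a concrete exponent. The sketch below assumes a one-dimensional Lévy process with drift $l$, Gaussian variance $Q$ and Lévy measure $\nu=\lambda\delta_1$ (jumps of size 1 with intensity λ), so that the formula (11.4) without $x$-dependence reduces to $\psi(\xi) = -il\xi+\frac12Q\xi^2+\lambda(1-e^{i\xi})$; the parameter values and function names are hypothetical.

```python
import cmath

def psi(xi, drift=0.5, variance=1.0, intensity=3.0):
    """Characteristic exponent -i*l*xi + Q*xi^2/2 + lambda*(1 - exp(i*xi)).

    The jump size 1 is not in (0, 1), so no compensating term appears.
    """
    return (-1j * drift * xi + 0.5 * variance * xi**2
            + intensity * (1.0 - cmath.exp(1j * xi)))

def difference_quotient(xi, t):
    """(E exp(i*xi*X_t) - 1) / t, using E exp(i*xi*X_t) = exp(-t*psi(xi))."""
    return (cmath.exp(-t * psi(xi)) - 1.0) / t

xi = 2.0
# Distance between the difference quotient and -psi(xi) for t = 10^-1, ..., 10^-6.
gaps = [abs(difference_quotient(xi, 10.0**-k) + psi(xi)) for k in range(1, 7)]
```

The gap is of order $t|\psi(\xi)|^2/2$, i.e. the convergence in (12.1) is linear in $t$ for fixed ξ.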

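Theorem 12.1. Let $X=(X_t)_{t\ge 0}$ be a Feller process with transition semigroup $(P_t)_{t\ge 0}$ and generator $(A,D(A))$ such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$. If $x\mapsto q(x,\xi)$ is continuous and $q$ has bounded coefficients (Lemma 11.8 with $F=\mathbb{R}^d$), then
$$-q(x,\xi) = \lim_{t\to 0}\frac{\mathbb{E}^xe^{i\xi\cdot(X_t-x)}-1}{t}. \tag{12.2}$$

Proof. Pick $\chi\in C_c^\infty(\mathbb{R}^d)$, $\mathbb{1}_{B_1(0)}\le\chi\le\mathbb{1}_{B_2(0)}$ and set $\chi_n(x) := \chi\big(\frac{x}{n}\big)$. Obviously, $\chi_n\to 1$ as $n\to\infty$. By Lemma 5.4,
$$\begin{aligned}
P_t[\chi_ne_\xi](x) - \chi_n(x)e_\xi(x) &= \int_0^tAP_s[\chi_ne_\xi](x)\,ds\\
&= \mathbb{E}^x\int_0^tA[\chi_ne_\xi](X_s)\,ds\\
&= -\mathbb{E}^x\int_0^t\int_{\mathbb{R}^d}e_\eta(X_s)q(X_s,\eta)\underbrace{\widehat{\chi_ne_\xi}(\eta)}_{=n^d\widehat{\chi}(n(\eta-\xi))}\,d\eta\,ds\\
&\xrightarrow[n\to\infty]{} -\int_0^tP_s\big[e_\xi q(\cdot,\xi)\big](x)\,ds.
\end{aligned}$$
Since $x\mapsto q(x,\xi)$ is continuous, we can divide by $t$ and let $t\to 0$; this yields
$$\lim_{t\to 0}\frac1t\big(P_te_\xi(x)-e_\xi(x)\big) = -e_\xi(x)q(x,\xi).$$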


Theorem 12.1 is a relatively simple probabilistic formula to calculate the symbol. We want to relax the boundedness and continuity assumptions. Here Dynkin’s characteristic operator becomes useful.

Lemma 12.2 (Dynkin's formula). Let $(X_t)_{t\ge 0}$ be a Feller process with semigroup $(P_t)_{t\ge 0}$ and generator $(A,D(A))$. For every stopping time $\sigma$ with $\mathbb{E}^x\sigma<\infty$ one has
$$\mathbb{E}^xf(X_\sigma) - f(x) = \mathbb{E}^x\int_{[0,\sigma)}Af(X_s)\,ds,\quad f\in D(A). \tag{12.3}$$

Proof. From Corollary 11.4 we know that $M_t^{[f]} := f(X_t)-f(X_0)-\int_0^tAf(X_s)\,ds$ is a martingale; thus (12.3) follows from the optional stopping theorem.

Definition 12.3. Let $(X_t)_{t\ge 0}$ be a Feller process. A point $a\in\mathbb{R}^d$ is an absorbing point, if
$$\mathbb{P}^a(X_t=a,\ \forall t\ge 0) = 1.$$

Denote by $\tau_r := \inf\{t>0 : |X_t-X_0|>r\}$ the first exit time from the ball $B_r(x)$ centered at the starting position $x=X_0$.

Lemma 12.4. Let $(X_t)_{t\ge 0}$ be a Feller process and assume that $b\in\mathbb{R}^d$ is not absorbing. Then there exists some $r>0$ such that $\mathbb{E}^b\tau_r<\infty$.

Proof. 1° If $b$ is not absorbing, then there is some $f\in D(A)$ such that $Af(b)\ne 0$. Assume the contrary, i.e. $Af(b)=0$ for all $f\in D(A)$.

By Lemma 5.4, $P_sf\in D(A)$ for all $s\ge 0$, and
$$P_tf(b)-f(b) = \int_0^tA(P_sf)(b)\,ds = 0.$$

So, $P_tf(b)=f(b)$ for all $f\in D(A)$. Since the domain $D(A)$ is dense in $C_\infty(\mathbb{R}^d)$ (Remark 5.5), we get $P_tf(b)=f(b)$ for all $f\in C_\infty(\mathbb{R}^d)$, hence $\mathbb{P}^b(X_t=b)=1$ for any $t\ge 0$. Therefore,
$$\mathbb{P}^b(X_q=b,\ \forall q\in\mathbb{Q},\ q\ge 0)=1\quad\text{and}\quad\mathbb{P}^b(X_t=b,\ \forall t\ge 0)=1,$$
because of the right-continuity of the sample paths. This means that $b$ is an absorbing point, contradicting our assumption.

2° Pick $f\in D(A)$ such that $Af(b)>0$. Since $Af$ is continuous, there exist $\varepsilon>0$ and $r>0$ such that $Af|_{B_r(b)}\ge\varepsilon>0$. From Dynkin's formula (12.3) with $\sigma=\tau_r\wedge n$, $n\ge 1$, we deduce
$$\varepsilon\,\mathbb{E}^b(\tau_r\wedge n) \le \mathbb{E}^b\int_{[0,\tau_r\wedge n)}Af(X_s)\,ds = \mathbb{E}^bf(X_{\tau_r\wedge n}) - f(b) \le 2\|f\|_\infty.$$

Finally, monotone convergence shows that $\mathbb{E}^b\tau_r\le 2\|f\|_\infty/\varepsilon<\infty$.

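Definition 12.5 (Dynkin's operator). Let $X$ be a Feller process. The linear operator $(\mathfrak{A},D(\mathfrak{A}))$ defined by
$$\mathfrak{A}f(x) := \begin{cases}\displaystyle\lim_{r\to 0}\frac{\mathbb{E}^xf(X_{\tau_r})-f(x)}{\mathbb{E}^x\tau_r}, & \text{if }x\text{ is not absorbing},\\[2mm] 0, & \text{otherwise},\end{cases}$$
$$D(\mathfrak{A}) := \big\{f\in C_\infty(\mathbb{R}^d) : \text{the above limit exists pointwise}\big\},$$
is called Dynkin's (characteristic) operator.

Lemma 12.6. Let $(X_t)_{t\ge 0}$ be a Feller process with generator $(A,D(A))$ and characteristic operator $(\mathfrak{A},D(\mathfrak{A}))$.

a) $\mathfrak{A}$ is an extension of $A$, i.e. $D(A)\subset D(\mathfrak{A})$ and $\mathfrak{A}|_{D(A)}=A$.

b) $(\mathfrak{A},\mathfrak{D})=(A,D(A))$ if $\mathfrak{D}=\{f\in D(\mathfrak{A}) : f,\ \mathfrak{A}f\in C_\infty(\mathbb{R}^d)\}$.

Proof. a) Let $f\in D(A)$ and assume that $x\in\mathbb{R}^d$ is not absorbing. By Lemma 12.4 there is some $r=r(x)>0$ with $\mathbb{E}^x\tau_r<\infty$. Since we have $Af\in C_\infty(\mathbb{R}^d)$, there exists for every $\varepsilon>0$ some $\delta>0$ such that
$$|Af(y)-Af(x)|<\varepsilon\quad\text{for all }y\in B_\delta(x).$$

Without loss of generality let $\delta<r$. Using Dynkin's formula (12.3) with $\sigma=\tau_\delta$, we see
$$\big|\mathbb{E}^xf(X_{\tau_\delta})-f(x)-Af(x)\,\mathbb{E}^x\tau_\delta\big| \le \mathbb{E}^x\int_{[0,\tau_\delta)}\underbrace{|Af(X_s)-Af(x)|}_{\le\varepsilon}\,ds \le \varepsilon\,\mathbb{E}^x\tau_\delta.$$
Thus, $\lim_{r\to 0}\big(\mathbb{E}^xf(X_{\tau_r})-f(x)\big)\big/\mathbb{E}^x\tau_r = Af(x)$.

If $x$ is absorbing and $f\in D(A)$, then $Af(x)=0$ and so $Af(x)=\mathfrak{A}f(x)$.

b) Since $(\mathfrak{A},\mathfrak{D})$ satisfies the (PMP), the claim follows from Lemma 5.11.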

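Theorem 12.7. Let $(X_t)_{t\ge 0}$ be a Feller process with infinitesimal generator $(A,D(A))$ such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$ and $x\mapsto q(x,\xi)$ is continuous (see footnote 1). Then
$$-q(x,\xi) = \lim_{r\to 0}\frac{\mathbb{E}^xe^{i(X_{\tau_r}-x)\cdot\xi}-1}{\mathbb{E}^x\tau_r} \tag{12.4}$$
for all $x\in\mathbb{R}^d$ (as usual, $\frac{1}{\infty}:=0$). In particular, $q(a,\xi)=0$ for all absorbing states $a\in\mathbb{R}^d$.

Proof. Let $\chi_n\in C_c^\infty(\mathbb{R}^d)$ such that $\mathbb{1}_{B_n(0)}\le\chi_n\le 1$. By Dynkin's formula (12.3)
$$e_{-\xi}(x)\,\mathbb{E}^x\big[\chi_n(X_{\tau_r\wedge t})e_\xi(X_{\tau_r\wedge t})\big] - \chi_n(x) = \mathbb{E}^x\int_{[0,\tau_r\wedge t)}e_{-\xi}(x)A[\chi_ne_\xi](X_s)\,ds.$$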

1 For instance, if $X$ has bounded jumps, see Lemma 11.10 and Remark 11.12. Our proof will show that it is actually enough to assume that $s\mapsto q(X_s,\xi)$ is right-continuous.

Observe that $A[\chi_ne_\xi](X_s)$ is bounded if $s<\tau_r$, see Corollary 11.6. Using the dominated convergence theorem, we can let $n\to\infty$ to get
$$\begin{aligned}
e_{-\xi}(x)\,\mathbb{E}^xe_\xi(X_{\tau_r\wedge t}) - 1 &= \mathbb{E}^x\int_{[0,\tau_r\wedge t)}e_{-\xi}(x)Ae_\xi(X_s)\,ds\\
&\overset{(11.6)}{=} -\mathbb{E}^x\int_{[0,\tau_r\wedge t)}e_\xi(X_s-x)\,q(X_s,\xi)\,ds.
\end{aligned} \tag{12.5}$$

If $x$ is absorbing, we have $q(x,\xi)=0$, and (12.4) holds trivially. For non-absorbing $x$, we pass to the limit $t\to\infty$ and get, using $\mathbb{E}^x\tau_r<\infty$ (see Lemma 12.4),
$$\frac{e_{-\xi}(x)\,\mathbb{E}^xe_\xi(X_{\tau_r})-1}{\mathbb{E}^x\tau_r} = -\frac{1}{\mathbb{E}^x\tau_r}\,\mathbb{E}^x\int_{[0,\tau_r)}e_\xi(X_s-x)\,q(X_s,\xi)\,ds.$$

Since $s\mapsto q(X_s,\xi)$ is right-continuous at $s=0$, the limit $r\to 0$ exists, and (12.4) follows.
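For a one-dimensional standard Brownian motion the ratio in (12.4) can be written down explicitly. The following sketch relies on two classical facts which are assumptions here, since they are not derived in the text: starting from $x$, the exit from $(x-r,x+r)$ happens at $x\pm r$ with probability $\frac12$ each, and $\mathbb{E}^x\tau_r=r^2$. Hence $\mathbb{E}^xe^{i(X_{\tau_r}-x)\xi}=\cos(r\xi)$ and the ratio equals $(\cos(r\xi)-1)/r^2$, which should converge to $-q(x,\xi)=-\frac12\xi^2$. The function name is hypothetical.

```python
import math

def dynkin_ratio_brownian(xi: float, r: float) -> float:
    """(E exp(i*xi*(X_tau - x)) - 1) / E tau for 1d Brownian motion.

    Uses the exit distribution (x - r or x + r, each with probability 1/2)
    and the mean exit time r^2; both are classical facts assumed here.
    """
    return (math.cos(r * xi) - 1.0) / r**2

xi = 2.0
symbol = 0.5 * xi**2  # q(x, xi) = xi^2 / 2 for standard Brownian motion
ratios = [dynkin_ratio_brownian(xi, 10.0**-k) for k in range(0, 4)]
```

The error behaves like $r^2\xi^4/24$, so in this example the limit $r\to 0$ in (12.4) is approached quadratically.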

A small variation of the above proof yields

Corollary 12.8 (Schilling, Schnurr [57]). Let $X$ be a Feller process with generator $(A,D(A))$ such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$ and $x\mapsto q(x,\xi)$ is continuous. Then
$$-q(x,\xi) = \lim_{t\to 0}\frac{\mathbb{E}^xe^{i(X_{t\wedge\tau_r}-x)\cdot\xi}-1}{t} \tag{12.6}$$
for all $x\in\mathbb{R}^d$ and $r>0$.

Proof. We follow the proof of Theorem 12.7 up to (12.5). This relation can be rewritten as
$$\frac{\mathbb{E}^xe^{i(X_{t\wedge\tau_r}-x)\cdot\xi}-1}{t} = -\frac1t\,\mathbb{E}^x\int_0^te_\xi(X_s-x)\,q(X_s,\xi)\,\mathbb{1}_{[0,\tau_r)}(s)\,ds.$$

Observe that $X_s$ is bounded if $s<\tau_r$ and that $s\mapsto q(X_s,\xi)$ is right-continuous. Therefore, the limit $t\to 0$ exists and yields (12.6).

Every Feller process $(X_t)_{t\ge 0}$ such that $C_c^\infty(\mathbb{R}^d)\subset D(A)$ is a semimartingale. Moreover, the semimartingale characteristics can be expressed in terms of the Lévy triplet $(l(x),Q(x),\nu(x,dy))$ of the symbol. Recall that a ($d$-dimensional) semimartingale is a stochastic process of the form
$$X_t = X_0 + X_t^c + \int_0^t\!\!\int y\,\mathbb{1}_{(0,1)}(|y|)\big(\mu^X(\cdot,ds,dy)-\nu(\cdot,ds,dy)\big) + \sum_{s\le t}\mathbb{1}_{[1,\infty)}(|\Delta X_s|)\Delta X_s + B_t$$
where $X^c$ is the continuous martingale part, $B$ is a previsible process with paths of finite variation (on compact time intervals) and with the jump measure
$$\mu^X(\omega,ds,dy) = \sum_{s:\Delta X_s(\omega)\ne 0}\delta_{(s,\Delta X_s(\omega))}(ds,dy)$$
whose compensator is $\nu(\omega,ds,dy)$. The triplet $(B,C,\nu)$ with the (predictable) quadratic variation $C=[X^c,X^c]$ of $X^c$ is called the semimartingale characteristics.

Theorem 12.9 (Schilling [53], Schnurr [59]). Let $(X_t)_{t\geq 0}$ be a Feller process with infinitesimal generator $(A,D(A))$ such that $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$ given by (11.4). If $q(x,0) = 0$,² then $X$ is a semimartingale whose semimartingale characteristics can be expressed by the Lévy triplet $(l(x),Q(x),\nu(x,dy))$:

\[
B_t = \int_0^t l(X_s)\,ds, \qquad C_t = \int_0^t Q(X_s)\,ds, \qquad \nu(\cdot,ds,dy) = \nu(X_s(\cdot),dy)\,ds.
\]

Proof. 1° Combining Corollary 11.4 with the integro-differential representation (11.5) of the generator shows that

\[
\begin{aligned}
M_t^{[f]} &= f(X_t) - \int_{[0,t)} A f(X_s)\,ds \\
&= f(X_t) - \int_{[0,t)} l(X_s)\cdot\nabla f(X_s)\,ds - \frac{1}{2}\int_{[0,t)} \nabla\cdot Q(X_s)\nabla f(X_s)\,ds \\
&\quad - \int_{[0,t)}\int_{y\neq 0} \Big(f(X_s+y) - f(X_s) - \nabla f(X_s)\cdot y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(X_s,dy)\,ds
\end{aligned}
\]

is for all $f \in C_b^2(\mathbb{R}^d) \cap D(A)$ a martingale.

2° We claim that $C_c^2(\mathbb{R}^d) \subset D(A)$. Indeed, let $f \in C_c^2(\mathbb{R}^d)$ with $\operatorname{supp} f \subset B_r(0)$ for some $r > 0$ and pick a sequence $f_n \in C_c^\infty(B_{2r}(0))$ such that $\lim_{n\to\infty} \|f - f_n\|_{(2)} = 0$. Using (11.7) we get

\[
\sup_{|x|\leq 3r} |A f_n(x) - A f_m(x)| \leq c_{3r}\,\|f_n - f_m\|_{(2)} \xrightarrow[m,n\to\infty]{} 0.
\]

Since $\operatorname{supp} f_n \subset B_{2r}(0)$ and $f_n \to f$ uniformly, there is some $u \in C_c^\infty(B_{3r}(0))$ with $|f_n(x)| \leq u(x)$. Therefore, we get for $|x| > 2r$

\[
\begin{aligned}
|A f_n(x) - A f_m(x)| &\leq \int_{y\neq 0} |f_n(x+y) - f_m(x+y)|\,\nu(x,dy) \\
&\leq 2\int_{y\neq 0} u(x+y)\,\nu(x,dy) = 2Au(x) \xrightarrow[|x|\to\infty]{} 0
\end{aligned}
\]

uniformly for all $m,n$. This shows that $(A f_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $C_\infty(\mathbb{R}^d)$. By the closedness of $(A,D(A))$ we gather that $f \in D(A)$ and $A f = \lim_{n\to\infty} A f_n$.

3° Fix $r > 1$, pick $\chi_r \in C_c^\infty(\mathbb{R}^d)$ such that $\mathbf{1}_{B_{3r}(0)} \leq \chi_r \leq 1$, and set

σ = σr := inf{t > 0 : |Xt − X0| > r} ∧ inf{t > 0 : |∆Xt | > r}.

Since $(X_t)_{t\geq 0}$ has càdlàg paths and infinite life-time, $\sigma_r$ is a family of stopping times with $\sigma_r \uparrow \infty$.

For any $f \in C^2(\mathbb{R}^d) \cap C_b(\mathbb{R}^d)$ and $x \in \mathbb{R}^d$ we set $f_r := \chi_r f$, $f^x := f(\cdot - x)$, $f_r^x := f_r(\cdot - x)$, and consider

\[
\begin{aligned}
M_{t\wedge\sigma}^{[f^x]} &= f^x(X_{t\wedge\sigma}) - \int_{[0,t\wedge\sigma)} l(X_s)\cdot\nabla f^x(X_s)\,ds - \frac{1}{2}\int_{[0,t\wedge\sigma)} \nabla\cdot Q(X_s)\nabla f^x(X_s)\,ds \\
&\quad - \int_{[0,t\wedge\sigma)}\int_{y\neq 0} \Big(f^x(X_s+y) - f^x(X_s) - \nabla f^x(X_s)\cdot y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(X_s,dy)\,ds \\
&= f_r^x(X_{t\wedge\sigma}) - \int_{[0,t\wedge\sigma)} l(X_s)\cdot\nabla f_r^x(X_s)\,ds - \frac{1}{2}\int_{[0,t\wedge\sigma)} \nabla\cdot Q(X_s)\nabla f_r^x(X_s)\,ds \\
&\quad - \int_{[0,t\wedge\sigma)}\int_{0<|y|<r} \Big(f_r^x(X_s+y) - f_r^x(X_s) - \nabla f_r^x(X_s)\cdot y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(X_s,dy)\,ds.
\end{aligned}
\]

Since $f_r^x \in C_c^2(\mathbb{R}^d) \subset D(A)$, we see that $M_t^{[f]}$ is a local martingale (with reducing sequence $\sigma_r$, $r > 0$), and by a general result of Jacod & Shiryaev [27, Theorem II.2.42], it follows that $X$ is a semimartingale with the characteristics mentioned in the theorem.

² A sufficient condition is, for example, that $X$ has infinite life-time and $x \mapsto q(x,\xi)$ is continuous (either for all $\xi$ or just for $\xi = 0$), cf. Theorem 11.13 and Lemma 11.10.

We close this chapter with the discussion of a very important example: Lévy driven stochastic differential equations. From now on we assume that

Φ : Rd → Rd×n is a matrix-valued, measurable function,

$L = (L_t)_{t\geq 0}$ is an $n$-dimensional Lévy process with exponent $\psi(\xi)$,

and consider the following Itô stochastic differential equation (SDE)

\[
dX_t = \Phi(X_{t-})\,dL_t, \qquad X_0 = x. \tag{12.7}
\]

If $\Phi$ is globally Lipschitz continuous, then the SDE (12.7) has a unique solution which is a strong Markov process, see Protter [43, Chapter V, Theorem 32]³. If we write $X_t^x$ for the solution of (12.7) with initial condition $X_0 = x = X_0^x$, then the flow $x \mapsto X_t^x$ is continuous [43, Chapter V, Theorem 38].

If we use $L_t = (t,W_t,J_t)^\top$ as driving Lévy process where $W$ is a Brownian motion and $J$ is a pure-jump Lévy process (we assume⁴ that $W \perp\!\!\!\perp J$), and if $\Phi$ is a block-matrix, then we see that (12.7) covers also SDEs of the form

\[
dX_t = f(X_{t-})\,dt + F(X_{t-})\,dW_t + G(X_{t-})\,dJ_t.
\]

Lemma 12.10. Let $\Phi$ be bounded and Lipschitz, $X$ the unique solution of the SDE (12.7), and denote by $A$ the generator of $X$. Then $C_c^\infty(\mathbb{R}^d) \subset D(A)$.

³ Protter requires that $L$ has independent coordinate processes, but this is not needed in his proof. For existence and uniqueness the local Lipschitz and linear growth conditions are enough; the strong Lipschitz condition is used for the Markov nature of the solution.

⁴ This assumption can be relaxed under suitable assumptions on the (joint) filtration, see for example Ikeda & Watanabe [22, Theorem II.6.3].

Proof. Because of Theorem 5.12 (this theorem does not only hold for Feller semigroups, but for any strongly continuous contraction semigroup satisfying the positive maximum principle), it suffices to show

\[
\lim_{t\to 0} \frac{1}{t}\Big(\mathbb{E} f(X_t^x) - f(x)\Big) = g(x) \quad\text{and}\quad g \in C_\infty(\mathbb{R}^d)
\]

for any $f \in C_c^\infty(\mathbb{R}^d)$. For this we use, as in the proof of the following theorem, Itô's formula to get

\[
\mathbb{E} f(X_t^x) - f(x) = \mathbb{E}\int_0^t A f(X_s^x)\,ds,
\]

and a calculation similar to the one in the proof of the next theorem. A complete proof can be found in Schilling & Schnurr [57, Theorem 3.5].

Theorem 12.11. Let $(L_t)_{t\geq 0}$ be an $n$-dimensional Lévy process with exponent $\psi$ and assume that $\Phi$ is Lipschitz continuous. Then the unique Markov solution $X$ of the SDE (12.7) admits a generalized symbol in the sense that

\[
\lim_{t\to 0} \frac{\mathbb{E}^x e^{i(X_{t\wedge\tau_r}-x)\cdot\xi} - 1}{t} = -\psi(\Phi^\top(x)\xi), \qquad r > 0.
\]

If $\Phi \in C_b^1(\mathbb{R}^d)$, then $X$ is Feller, $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and $q(x,\xi) = \psi(\Phi^\top(x)\xi)$ is the symbol of the generator.

Theorem 12.11 indicates that the formulae (12.4) and (12.6) may be used to associate symbols not only with Feller processes but with more general Markov semimartingales. This has been investigated by Schnurr [58] and [59] who shows that the class of Itô processes with jumps is essentially the largest class that can be described by symbols; see also [9, Chapters 2.4–5]. This opens up the way to analyze rather general semimartingales using the symbol. Let us also point out that the boundedness of Φ is only needed to ensure that X is a Feller process.
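To see the formula $q(x,\xi) = \psi(\Phi^\top(x)\xi)$ at work, here is a minimal numerical sketch for $d = 1$, $n = 2$ and the driving process $L_t = (t,W_t)^\top$, so that (12.7) reads $dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t$. The coefficients `b` and `sigma` and all function names are hypothetical choices made only for illustration; the sketch merely checks that composing $\psi$ with $\Phi^\top(x)\xi$ agrees with the triplet form $-il\cdot\Phi^\top(x)\xi + \tfrac12\,\xi\cdot\Phi(x)Q\Phi^\top(x)\xi$ for $l = (1,0)$ and $Q = \operatorname{diag}(0,1)$.

```python
import math

def psi(eta1: float, eta2: float) -> complex:
    """Exponent of L_t = (t, W_t): E exp(i eta.L_t) = exp(-t * psi(eta))."""
    return complex(0.5 * eta2 ** 2, -eta1)

def b(x: float) -> float:
    """Hypothetical bounded, smooth drift coefficient."""
    return math.sin(x)

def sigma(x: float) -> float:
    """Hypothetical bounded, smooth diffusion coefficient."""
    return 2.0 + math.cos(x)

def symbol(x: float, xi: float) -> complex:
    """q(x, xi) = psi(Phi(x)^T xi) with the 1x2 matrix Phi(x) = (b(x), sigma(x))."""
    return psi(b(x) * xi, sigma(x) * xi)

def symbol_from_triplet(x: float, xi: float) -> complex:
    """Same symbol via -i l.Phi^T xi + (1/2) xi.Phi Q Phi^T xi, l = (1, 0), Q = diag(0, 1)."""
    return complex(0.5 * sigma(x) ** 2 * xi ** 2, -b(x) * xi)
```

In this special case the symbol is that of a diffusion with drift $b(x)$ and diffusion coefficient $\sigma(x)^2$, and $q(x,0) = 0$ holds as required in Theorem 12.9.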

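Proof. Let $\tau = \tau_r$ be the first exit time for the process $X$ from the ball $B_r(x)$ centered at $x = X_0$. We use the Lévy–Itô decomposition (9.8) of $L$. From Itô's formula for jump processes (see, e.g. Protter [43, Chapter II, Theorem 33]) we get

\[
\begin{aligned}
\frac{1}{t}\Big(\mathbb{E}^x e_\xi(X_{t\wedge\tau} - x) - 1\Big)
&= \frac{1}{t}\,\mathbb{E}^x\bigg[\int_0^{t\wedge\tau} i e_\xi(X_{s-} - x)\,\xi\cdot dX_s - \frac{1}{2}\int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot d[X,X]_s^c\,\xi \\
&\qquad\qquad + e_{-\xi}(x)\sum_{s\leq\tau\wedge t}\Big(e_\xi(X_s) - e_\xi(X_{s-}) - i e_\xi(X_{s-})\,\xi\cdot\Delta X_s\Big)\bigg] \\
&=: I_1 + I_2 + I_3.
\end{aligned}
\]

We consider the three terms separately.

\[
\begin{aligned}
I_1 &= \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau} i e_\xi(X_{s-} - x)\,\xi\cdot dX_s \\
&= \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau} i e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})\,dL_s \\
&= \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau} i e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})\,d_s\bigg(ls + \int_{|y|\geq 1} y\,N_s(dy)\bigg) \\
&=: I_{11} + I_{12}
\end{aligned}
\]

where we use that the diffusion part and the compensated small jumps of a Lévy process are a martingale, cf. Theorem 9.12. Further,

\[
\begin{aligned}
I_3 + I_{12}
&= \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau}\!\!\int_{y\neq 0} e_\xi(X_{s-} - x)\Big(e_\xi(\Phi(X_{s-})y) - 1 - i\xi\cdot\Phi(X_{s-})y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,d_sN_s(dy) \\
&= \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau}\!\!\int_{y\neq 0} e_\xi(X_{s-} - x)\Big(e_\xi(\Phi(X_{s-})y) - 1 - i\xi\cdot\Phi(X_{s-})y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(dy)\,ds \\
&\xrightarrow[t\to 0]{} \int_{y\neq 0}\Big(e^{i\xi\cdot\Phi(x)y} - 1 - i\xi\cdot\Phi(x)y\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(dy).
\end{aligned}
\]

Here we use that $\nu(dy)\,ds$ is the compensator of $d_sN_s(dy)$, see Lemma 9.4. This requires that the integrand is $\nu$-integrable, but this is ensured by the local boundedness of $\Phi(\cdot)$ and the fact that $\int_{y\neq 0}\min\{|y|^2,1\}\,\nu(dy) < \infty$. Moreover,

\[
I_{11} = \frac{1}{t}\,\mathbb{E}^x\int_0^{t\wedge\tau} i e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})\,l\,ds \xrightarrow[t\to 0]{} i\xi\cdot\Phi(x)\,l,
\]

and, finally, we observe that

\[
\begin{aligned}
[X,X]^c &= \Big[\int\Phi(X_{s-})\,dL_s,\ \int\Phi(X_{s-})\,dL_s\Big]^c \\
&= \int\Phi(X_{s-})\,d[L,L]_s^c\,\Phi^\top(X_{s-}) \\
&= \int\Phi(X_{s-})\,Q\,\Phi^\top(X_{s-})\,ds \in \mathbb{R}^{d\times d}
\end{aligned}
\]

which gives

\[
\begin{aligned}
I_2 &= -\frac{1}{2t}\,\mathbb{E}^x\int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot d[X,X]_s^c\,\xi \\
&= -\frac{1}{2t}\,\mathbb{E}^x\int_0^{t\wedge\tau} e_\xi(X_{s-} - x)\,\xi\cdot\Phi(X_{s-})Q\Phi^\top(X_{s-})\xi\,ds \\
&\xrightarrow[t\to 0]{} -\frac{1}{2}\,\xi\cdot\Phi(x)Q\Phi^\top(x)\xi.
\end{aligned}
\]

This proves

\[
\begin{aligned}
q(x,\xi) &= -il\cdot\Phi^\top(x)\xi + \frac{1}{2}\,\xi\cdot\Phi(x)Q\Phi^\top(x)\xi + \int_{y\neq 0}\Big(1 - e^{iy\cdot\Phi^\top(x)\xi} + iy\cdot\Phi^\top(x)\xi\,\mathbf{1}_{(0,1)}(|y|)\Big)\,\nu(dy) \\
&= \psi(\Phi^\top(x)\xi).
\end{aligned}
\]

For the second part of the theorem we use Lemma 12.10 to see that $C_c^\infty(\mathbb{R}^d) \subset D(A)$. The continuity of the flow $x \mapsto X_t^x$ (Protter [43, Chapter V, Theorem 38]), where $X^x$ is the unique solution of the SDE with initial condition $X_0^x = x$, ensures that the semigroup $P_t f(x) := \mathbb{E}^x f(X_t) = \mathbb{E} f(X_t^x)$ maps $f \in C_\infty(\mathbb{R}^d)$ to $C_b(\mathbb{R}^d)$. In order to see that $P_t f \in C_\infty(\mathbb{R}^d)$ we need that $\lim_{|x|\to\infty}|X_t^x| = \infty$ a.s. This requires some longer calculations, see e.g. Schnurr [58, Theorem 2.49] or Kunita [34, Proof of Theorem 3.5, p. 353, line 13 from below] (for Brownian motion this argument is fully worked out in Schilling & Partzsch [56, Corollary 19.31]).

13. Dénouement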

It is well known that the characteristic exponent ψ(ξ) of a Lévy process L = (Lt )t>0 can be used to describe many probabilistic properties of the process. The key is the formula

\[
\mathbb{E}^x e^{i\xi\cdot(L_t - x)} = \mathbb{E} e^{i\xi\cdot L_t} = e^{-t\psi(\xi)} \tag{13.1}
\]

which gives access to the Fourier transform of the transition function $\mathbb{P}^x(L_t \in dy) = \mathbb{P}(L_t + x \in dy)$.
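As a small numerical illustration of how (13.1) determines the exponent, the following sketch computes the difference quotient $(\mathbb{E} e^{i\xi L_t} - 1)/t$ from (13.1) and compares it with $-\psi(\xi)$ as $t \to 0$. The Poisson exponent $\psi(\xi) = \lambda(1 - e^{i\xi})$, the value of `xi` and the function names are assumptions chosen for the example; no stopping time is involved here because the process is Lévy.

```python
import cmath

def psi_poisson(xi: float, rate: float = 1.0) -> complex:
    """Characteristic exponent of a Poisson process with the given jump rate."""
    return rate * (1 - cmath.exp(1j * xi))

def difference_quotient(xi: float, t: float) -> complex:
    """(E exp(i xi L_t) - 1) / t, with the expectation taken from (13.1)."""
    return (cmath.exp(-t * psi_poisson(xi)) - 1) / t

xi = 2.0
# Distance between the difference quotient and -psi(xi) for shrinking t.
errors = [abs(difference_quotient(xi, t) + psi_poisson(xi)) for t in (1e-1, 1e-2, 1e-3)]
```

The entries of `errors` shrink roughly linearly in $t$, as one expects from the expansion $e^{-t\psi} = 1 - t\psi + O(t^2)$.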

Although it is not any longer true that the symbol q(x,ξ) of a Feller process X = (Xt )t>0 is the characteristic exponent, we may interpret formulae like (12.4)

\[
-q(x,\xi) = \lim_{r\to 0} \frac{\mathbb{E}^x e^{i(X_{\tau_r} - x)\cdot\xi} - 1}{\mathbb{E}^x \tau_r}
\]

as infinitesimal versions of the relation (13.1). What is more, both $\psi(\xi)$ and $q(x,\xi)$ are the Fourier symbols of the generators of the processes. We have already used these facts to discuss the conservativeness of Feller processes (Theorem 11.13) and the semimartingale decomposition of Feller processes (Theorem 12.9). It is indeed possible to investigate further path properties of a Feller process using its symbol $q(x,\xi)$. Below we will, mostly without proof, give some examples which are taken from Böttcher, Schilling & Wang [9]. Let us point out the two guiding principles.

I  For sample path properties, the symbol $q(x,\xi)$ of a Feller process assumes the same role as the characteristic exponent $\psi(\xi)$ of a Lévy process.

II  A Feller process is 'locally Lévy', i.e. for short-time path properties the Feller process, started at $x_0$, behaves like the Lévy process $(L_t + x_0)_{t\geq 0}$ with characteristic exponent $\psi(\xi) := q(x_0,\xi)$.

The latter property is the reason why such Feller processes are often called Lévy-type processes. The model case is the stable-like process whose symbol is given by $q(x,\xi) = |\xi|^{\alpha(x)}$ where $\alpha : \mathbb{R}^d \to (0,2)$ is sufficiently smooth¹. This process behaves locally, and for short times $t \ll 1$, like an $\alpha(x)$-stable process, only that the index now depends on the starting point $X_0 = x$.

The key to many path properties are the following maximal estimates which were first proved in [53]. The present proof is taken from [9]; the observation that we may use a random time $\tau$ instead of a fixed time $t$ is due to F. Kühn.

¹ In dimension $d = 1$ Lipschitz or even Dini continuity is enough (see Bass [4]), in higher dimensions we need something like $C^{5d+3}$-smoothness, cf. Hoh [21]. Meanwhile, Kühn [33] established the existence for $d \geq 1$ with $\alpha(x)$ satisfying any Hölder condition.


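Theorem 13.1. Let $(X_t)_{t\geq 0}$ be a Feller process with generator $A$, $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$. If $\tau$ is an integrable stopping time, then

\[
\mathbb{P}^x\Big(\sup_{s\leq\tau}|X_s - x| > r\Big) \leq c\,\mathbb{E}^x\tau \sup_{|y-x|\leq r}\ \sup_{|\xi|\leq r^{-1}} |q(y,\xi)|. \tag{13.2}
\]

Proof. Denote by $\sigma_r = \sigma_r^x$ the first exit time from the closed ball $\overline{B}_r(x)$. Clearly,

\[
\{\sigma_r^x < \tau\} \subset \Big\{\sup_{s\leq\tau}|X_s - x| > r\Big\} \subset \{\sigma_r^x \leq \tau\}.
\]

Pick $u \in C_c^\infty(\mathbb{R}^d)$, $0 \leq u \leq 1$, $u(0) = 1$, $\operatorname{supp} u \subset B_1(0)$, and set

\[
u_r^x(y) := u\Big(\frac{y-x}{r}\Big).
\]

In particular, $u_r^x|_{B_r(x)^c} = 0$. Hence,

\[
\mathbf{1}_{\{\sigma_r^x \leq \tau\}} \leq 1 - u_r^x(X_{\tau\wedge\sigma_r^x}).
\]

Now we can use (5.5) or (12.3) to get

\[
\begin{aligned}
\mathbb{P}^x(\sigma_r^x \leq \tau)
&\leq \mathbb{E}^x\big[1 - u_r^x(X_{\tau\wedge\sigma_r^x})\big] \\
&= \mathbb{E}^x\int_{[0,\tau\wedge\sigma_r^x)} q(X_s,D)u_r^x(X_s)\,ds \\
&= \mathbb{E}^x\int_{[0,\tau\wedge\sigma_r^x)}\int \mathbf{1}_{B_r(x)}(X_s)\,e_\xi(X_s)\,q(X_s,\xi)\,\widehat{u_r^x}(\xi)\,d\xi\,ds \\
&\leq \mathbb{E}^x\int_{[0,\tau\wedge\sigma_r^x)}\int \sup_{|y-x|\leq r}|q(y,\xi)|\,|\widehat{u_r^x}(\xi)|\,d\xi\,ds \\
&= \mathbb{E}^x[\tau\wedge\sigma_r^x]\int \sup_{|y-x|\leq r}|q(y,\xi)|\,|\widehat{u_r^x}(\xi)|\,d\xi \\
&\leq c\,\mathbb{E}^x\tau \sup_{|y-x|\leq r}\ \sup_{|\xi|\leq r^{-1}}|q(y,\xi)| \int (1+|\xi|^2)\,|\widehat{u}(\xi)|\,d\xi,
\end{aligned}
\]

where the last inequality uses (11.9) and Lemma 11.8.

There is also a probability estimate for $\sup_{s\leq t}|X_s - x| \leq r$, but this requires some sector condition for the symbol, that is an estimate of the form

\[
|\operatorname{Im} q(x,\xi)| \leq \kappa \operatorname{Re} q(x,\xi), \qquad x,\xi \in \mathbb{R}^d. \tag{13.3}
\]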

One consequence of (13.3) is that the drift (which is contained in the imaginary part of the symbol) is not dominant. This is a familiar assumption in the study of path properties of Lévy processes, see e.g. Blumenthal & Getoor [8]; a typical example where this condition is violated are (Lévy) symbols of the form $\psi(\xi) = i\xi + |\xi|^{1/2}$. For a Lévy process the sector condition on $\psi$ coincides with the sector condition for the generator and the associated non-symmetric Dirichlet form, see Jacob [26, Volume 1, 4.7.32–33].
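The failure of the sector condition for the symbol $\psi(\xi) = i\xi + |\xi|^{1/2}$ can be made concrete with a few lines of code. The sketch below (function names are illustrative assumptions) evaluates the ratio $|\operatorname{Im}\psi(\xi)| / \operatorname{Re}\psi(\xi) = |\xi|^{1/2}$ along a growing sequence of frequencies; since the ratio is unbounded, no finite constant $\kappa$ can satisfy (13.3).

```python
def psi(xi: float) -> complex:
    """Levy symbol psi(xi) = i*xi + |xi|^(1/2) in dimension one."""
    return complex(abs(xi) ** 0.5, xi)

def sector_ratio(xi: float) -> float:
    """|Im psi(xi)| / Re psi(xi); bounded in xi if and only if (13.3) holds."""
    z = psi(xi)
    return abs(z.imag) / z.real

# The ratio equals |xi|^(1/2) and therefore grows without bound.
ratios = [sector_ratio(10.0 ** k) for k in range(1, 7)]
```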

With some more effort (see [9, Theorem 5.5]), we may replace the sector condition by imposing conditions on the expression

\[
\sup_{|y-x|\leq r} \frac{\operatorname{Re} q(x,\xi)}{|\xi|\,|\operatorname{Im} q(y,\xi)|} \quad\text{as } r \to \infty.
\]

Theorem 13.2 (see [9, pp. 117–119]). Let $(X_t)_{t\geq 0}$ be a Feller process with infinitesimal generator $A$, $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$ satisfying the sector condition (13.3). Then

\[
\mathbb{P}^x\Big(\sup_{s\leq t}|X_s - x| < r\Big) \leq \frac{c_{\kappa,d}}{t\,\sup_{|\xi|\leq r^{-1}}\inf_{|y-x|\leq 2r}|q(y,\xi)|}. \tag{13.4}
\]

The maximal estimates (13.2) and (13.4) are quite useful tools. With them we can estimate the mean exit time from balls ($X$ and $q(x,\xi)$ are as in Theorem 13.2):

\[
\frac{c}{\sup_{|\xi|\leq 1/r}\inf_{|y-x|\leq r}|q(y,\xi)|} \leq \mathbb{E}^x\sigma_r^x \leq \frac{c_\kappa}{\sup_{|\xi|\leq k^*/r}\inf_{|y-x|\leq r}|q(y,\xi)|}
\]

for all $x \in \mathbb{R}^d$ and $r > 0$ and with $k^* = \arccos\sqrt{2/3} \approx 0.615$.

Recently, Kühn [32] studied the existence of and estimates for generalized moments; a typical result is contained in the following theorem.

Theorem 13.3. Let $X = (X_t)_{t\geq 0}$ be a Feller process with infinitesimal generator $(A,D(A))$ and $C_c^\infty(\mathbb{R}^d) \subset D(A)$. Assume that the symbol $q(x,\xi)$, given by (11.4), satisfies $q(x,0) = 0$ and has $(l(x),Q(x),\nu(x,dy))$ as $x$-dependent Lévy triplet. If $f : \mathbb{R}^d \to [0,\infty)$ is (comparable to) a twice continuously differentiable submultiplicative function such that

\[
\sup_{x\in K}\int f(y)\,\nu(x,dy) < \infty \quad\text{for a compact set } K \subset \mathbb{R}^d,
\]

then the generalized moment $\sup_{x\in K}\sup_{s\leq t}\mathbb{E}^x f(X_{s\wedge\tau_K} - x) < \infty$ exists and

\[
\mathbb{E}^x f(X_{t\wedge\tau_K}) \leq c\,f(x)\,e^{C(M_1+M_2)t};
\]

here $\tau_K = \inf\{t > 0 : X_t \notin K\}$ is the first exit time from $K$, $C = C_K$ is some absolute constant and

\[
M_1 = \sup_{x\in K}\bigg(|l(x)| + |Q(x)| + \int_{y\neq 0}(|y|^2\wedge 1)\,\nu(x,dy)\bigg), \qquad M_2 = \sup_{x\in K}\int_{|y|\geq 1} f(y)\,\nu(x,dy).
\]

If $X$ has bounded coefficients (Lemma 11.8), then $K = \mathbb{R}^d$ is admissible.

There are also counterparts for the Blumenthal–Getoor and Pruitt indices. Below we give two representatives, for a full discussion we refer to [9].

Definition 13.4. Let $q(x,\xi)$ be the symbol of a Feller process. Then

\[
\beta_\infty^x := \inf\bigg\{\lambda > 0 : \lim_{|\xi|\to\infty} \frac{\sup_{|\eta|\leq|\xi|}\sup_{|y-x|\leq 1/|\xi|}|q(y,\eta)|}{|\xi|^\lambda} = 0\bigg\}, \tag{13.5}
\]

\[
\delta_\infty^x := \sup\bigg\{\lambda > 0 : \lim_{|\xi|\to\infty} \frac{\inf_{|\eta|\leq|\xi|}\inf_{|y-x|\leq 1/|\xi|}|q(y,\eta)|}{|\xi|^\lambda} = \infty\bigg\}. \tag{13.6}
\]

By definition, $0 \leq \delta_\infty^x \leq \beta_\infty^x \leq 2$. For example, if $q(x,\xi) = |\xi|^{\alpha(x)}$ with a smooth exponent function $\alpha(x)$, then $\beta_\infty^x = \delta_\infty^x = \alpha(x)$; in general, however, we cannot expect that the two indices coincide. As for Lévy processes, these indices can be used to describe the path behaviour.
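The claim $\beta_\infty^x = \alpha(x)$ for the stable-like symbol can be explored numerically. The following sketch works in dimension $d = 1$ with a hypothetical smooth exponent `alpha`, approximates the double supremum in (13.5) on a grid, and reads off the growth exponent $\log(\sup)/\log|\xi|$ for a large frequency $R = |\xi|$. The grid size and all names are assumptions of the example, and the computation is only a plausibility check, not a proof.

```python
import math

def alpha(y: float) -> float:
    """Hypothetical smooth exponent function with values in [0.5, 1.5]."""
    return 1.0 + 0.5 * math.sin(y)

def sup_symbol(x: float, R: float, grid: int = 201) -> float:
    """Grid approximation of sup over |eta| <= R, |y - x| <= 1/R of |eta|^alpha(y).

    For R >= 1 the supremum in eta is attained at |eta| = R.
    """
    ys = [x - 1.0 / R + 2.0 * k / (R * (grid - 1)) for k in range(grid)]
    return max(R ** alpha(y) for y in ys)

def growth_exponent(x: float, R: float) -> float:
    """log(sup) / log(R); for large R this approximates the index in (13.5)."""
    return math.log(sup_symbol(x, R)) / math.log(R)
```

For instance, `growth_exponent(0.7, 1e6)` is very close to `alpha(0.7)`, because the window $|y - x| \leq 1/|\xi|$ shrinks to the point $x$ as $|\xi| \to \infty$.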

Theorem 13.5. Let $(X_t)_{t\geq 0}$ be a $d$-dimensional Feller process with the generator $A$ such that $C_c^\infty(\mathbb{R}^d) \subset D(A)$ and symbol $q(x,\xi)$. For every bounded analytic set $E \subset [0,\infty)$, the Hausdorff dimension

\[
\dim\{X_t : t \in E\} \leq \min\Big\{d,\ \sup_{x\in\mathbb{R}^d}\beta_\infty^x \dim E\Big\}. \tag{13.7}
\]

A proof can be found in [9, Theorem 5.15]. It is instructive to observe that we have to take the supremum w.r.t. the space variable $x$, as we do not know how the process $X$ moves while we observe it during $t \in E$. This shows that we can only expect to get 'exact' results if $t \to 0$. Here is such an example.

Theorem 13.6. Let $(X_t)_{t\geq 0}$ be a $d$-dimensional Feller process with symbol $q(x,\xi)$ satisfying the sector condition. Then, $\mathbb{P}^x$-a.s.

\[
\lim_{t\to 0} \frac{\sup_{0\leq s\leq t}|X_s - x|}{t^{1/\lambda}} = 0 \qquad \forall \lambda > \beta_\infty^x, \tag{13.8}
\]

\[
\lim_{t\to 0} \frac{\sup_{0\leq s\leq t}|X_s - x|}{t^{1/\lambda}} = \infty \qquad \forall \lambda < \delta_\infty^x. \tag{13.9}
\]

As one would expect, these results are proved using the maximal estimates (13.2) and (13.4) in conjunction with the Borel–Cantelli lemma, see [9, Theorem 5.16].

If we are interested in the long-term behaviour, one could introduce indices 'at zero', where we replace in Definition 13.4 the limit $|\xi| \to \infty$ by $|\xi| \to 0$, but we will always have to pay the price that we lose the influence of the starting point $X_0 = x$, i.e. we will have to take the supremum or infimum for all $x \in \mathbb{R}^d$.

With the machinery we have developed here, one can also study further path properties, such as invariant measures, transience and recurrence etc. For this we refer to the monograph [9] as well as recent developments by Behme & Schnurr [7] and Sandrić [49, 50].

A. Some classical results

In this appendix we collect some classical results from (or needed for) probability theory which are not always contained in a standard course.

The Cauchy–Abel functional equation

Below we reproduce the standard proof for continuous functions which, however, works also for right-continuous (or monotone) functions.

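Theorem A.1. Let $\phi : [0,\infty) \to \mathbb{C}$ be a right-continuous function satisfying the functional equation $\phi(s+t) = \phi(s)\phi(t)$. Then $\phi(t) = \phi(1)^t$.

Proof. Assume that $\phi(a) = 0$ for some $a \geq 0$. Then we find for all $t \geq 0$

\[
\phi(a+t) = \phi(a)\phi(t) = 0 \implies \phi|_{[a,\infty)} \equiv 0.
\]

To the left of $a$ we find for all $n \in \mathbb{N}$

\[
0 = \phi(a) = \Big[\phi\Big(\frac{a}{n}\Big)\Big]^n \implies \phi\Big(\frac{a}{n}\Big) = 0.
\]

Since $\phi$ is right-continuous, we infer that $\phi|_{[0,\infty)} \equiv 0$, and $\phi(t) = \phi(1)^t$ holds.

Now assume that $\phi(1) \neq 0$. Setting $f(t) := \phi(t)\phi(1)^{-t}$ we get

\[
f(s+t) = \phi(s+t)\phi(1)^{-(s+t)} = \phi(s)\phi(1)^{-s}\,\phi(t)\phi(1)^{-t} = f(s)f(t)
\]

as well as $f(1) = 1$. Applying the functional equation $k$ times we conclude that

\[
f\Big(\frac{k}{n}\Big) = \Big[f\Big(\frac{1}{n}\Big)\Big]^k \quad\text{for all } k,n \in \mathbb{N}.
\]

The same calculation done backwards yields

\[
f\Big(\frac{k}{n}\Big) = \Big[f\Big(\frac{1}{n}\Big)\Big]^k = \Big[f\Big(\frac{1}{n}\Big)^n\Big]^{k/n} = \big[f(1)\big]^{k/n} = 1.
\]

Hence, $f|_{\mathbb{Q}^+} \equiv 1$. Since $\phi$, hence $f$, is right-continuous, we see that $f \equiv 1$ or, equivalently, $\phi(t) = [\phi(1)]^t$ for all $t > 0$.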


Characteristic functions and moments

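Theorem A.2 (Even moments and characteristic functions). Let $Y = (Y^{(1)},\ldots,Y^{(d)})$ be a random variable in $\mathbb{R}^d$ and let $\chi(\xi) = \mathbb{E}e^{i\xi\cdot Y}$ be its characteristic function. Then $\mathbb{E}(|Y|^2)$ exists if, and only if, the second derivatives $\frac{\partial^2}{\partial\xi_k^2}\chi(0)$, $k = 1,\ldots,d$, exist and are finite. In this case all mixed second derivatives exist and

\[
\mathbb{E}Y^{(k)} = \frac{1}{i}\,\frac{\partial\chi(0)}{\partial\xi_k} \quad\text{and}\quad \mathbb{E}\big(Y^{(k)}Y^{(l)}\big) = -\frac{\partial^2\chi(0)}{\partial\xi_k\,\partial\xi_l}. \tag{A.1}
\]

Proof. In order to keep the notation simple, we consider only $d = 1$. If $\mathbb{E}(Y^2) < \infty$, then the formulae (A.1) are routine applications of the differentiation lemma for parameter-dependent integrals, see e.g. [54, Theorem 11.5] or [55, Satz 12.2]. Moreover, $\chi$ is twice continuously differentiable.

Let us prove that $\mathbb{E}(Y^2) \leq -\chi''(0)$. An application of l'Hospital's rule gives

\[
\begin{aligned}
\chi''(0) &= \lim_{h\to 0}\frac{1}{2}\bigg(\frac{\chi'(2h) - \chi'(0)}{2h} + \frac{\chi'(0) - \chi'(-2h)}{2h}\bigg) \\
&= \lim_{h\to 0}\frac{\chi'(2h) - \chi'(-2h)}{4h} \\
&= \lim_{h\to 0}\frac{\chi(2h) - 2\chi(0) + \chi(-2h)}{4h^2} \\
&= \lim_{h\to 0}\mathbb{E}\bigg[\Big(\frac{e^{ihY} - e^{-ihY}}{2h}\Big)^2\bigg] \\
&= -\lim_{h\to 0}\mathbb{E}\bigg[\Big(\frac{\sin hY}{h}\Big)^2\bigg].
\end{aligned}
\]

From Fatou's lemma we get

\[
\chi''(0) \leq -\mathbb{E}\bigg[\lim_{h\to 0}\Big(\frac{\sin hY}{h}\Big)^2\bigg] = -\mathbb{E}\big[Y^2\big].
\]

In the multivariate case observe that $\mathbb{E}|Y^{(k)}Y^{(l)}| \leq \mathbb{E}\big[(Y^{(k)})^2\big] + \mathbb{E}\big[(Y^{(l)})^2\big]$.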
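The symmetric second difference quotient used in the proof is easy to try out numerically. The sketch below takes a centred normal random variable with standard deviation `sigma` (an assumption made for the example, as are the function names), whose characteristic function is $\chi(\xi) = e^{-\sigma^2\xi^2/2}$, and evaluates $(\chi(2h) - 2\chi(0) + \chi(-2h))/(4h^2)$ for shrinking $h$; the values should approach $\chi''(0) = -\mathbb{E}(Y^2) = -\sigma^2$.

```python
import math

def chi(xi: float, sigma: float = 1.5) -> float:
    """Characteristic function of a centred normal law with standard deviation sigma."""
    return math.exp(-0.5 * (sigma * xi) ** 2)

def second_difference(h: float) -> float:
    """Symmetric second difference quotient from the proof of Theorem A.2."""
    return (chi(2 * h) - 2 * chi(0.0) + chi(-2 * h)) / (4 * h * h)

# Approximations of chi''(0) = -sigma^2 = -2.25 for decreasing step sizes.
approximations = [second_difference(h) for h in (1e-1, 1e-2, 1e-3)]
```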

Vague and weak convergence of measures

A sequence of locally finite¹ Borel measures $(\mu_n)_{n\in\mathbb{N}}$ on $\mathbb{R}^d$ converges vaguely to a locally finite measure $\mu$ if

\[
\lim_{n\to\infty}\int\phi\,d\mu_n = \int\phi\,d\mu \quad\text{for all } \phi \in C_c(\mathbb{R}^d). \tag{A.2}
\]

Since the compactly supported continuous functions $C_c(\mathbb{R}^d)$ are dense in the space of continuous functions vanishing at infinity $C_\infty(\mathbb{R}^d) = \{\phi \in C(\mathbb{R}^d) : \lim_{|x|\to\infty}\phi(x) = 0\}$, we can replace in (A.2) the set $C_c(\mathbb{R}^d)$ with $C_\infty(\mathbb{R}^d)$. The following theorem guarantees that a family of Borel measures is sequentially relatively compact² for the vague convergence.

¹ I.e. every compact set $K$ has finite measure.

² Note that compactness and sequential compactness need not coincide!

Theorem A.3. Let $(\mu_t)_{t\geq 0}$ be a family of measures on $\mathbb{R}^d$ which is uniformly bounded, in the sense that $\sup_{t\geq 0}\mu_t(\mathbb{R}^d) < \infty$. Then every sequence $(\mu_{t_n})_{n\in\mathbb{N}}$ has a vaguely convergent subsequence.

If we test in (A.2) against all bounded continuous functions $\phi \in C_b(\mathbb{R}^d)$, we get weak convergence of the sequence $\mu_n \to \mu$. One has

Theorem A.4. A sequence of measures $(\mu_n)_{n\in\mathbb{N}}$ converges weakly to $\mu$ if, and only if, $\mu_n$ converges vaguely to $\mu$ and $\lim_{n\to\infty}\mu_n(\mathbb{R}^d) = \mu(\mathbb{R}^d)$ (preservation of mass). In particular, weak and vague convergence coincide for sequences of probability measures.

Proofs and a full discussion of vague and weak convergence can be found in Malliavin [40, Chapter III.6] or Schilling [55, Chapter 25].

For any finite measure $\mu$ on $\mathbb{R}^d$ we denote by $\check\mu(\xi) := \int_{\mathbb{R}^d} e^{i\xi\cdot y}\,\mu(dy)$ its characteristic function.

Theorem A.5 (Lévy's continuity theorem). Let $(\mu_n)_{n\in\mathbb{N}}$ be a sequence of finite measures on $\mathbb{R}^d$. If $\mu_n \to \mu$ weakly, then the characteristic functions $\check\mu_n(\xi)$ converge locally uniformly to $\check\mu(\xi)$.

Conversely, if the limit $\lim_{n\to\infty}\check\mu_n(\xi) = \chi(\xi)$ exists for all $\xi \in \mathbb{R}^d$ and defines a function $\chi$ which is continuous at $\xi = 0$, then there exists a finite measure $\mu$ such that $\check\mu(\xi) = \chi(\xi)$ and $\mu_n \to \mu$ weakly.

A proof in one dimension is contained in the monograph by Breiman [10], $d$-dimensional versions can be found in Bauer [5, Chapter 23] and Malliavin [40, Chapter IV.4].

Convergence in distribution

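By $\xrightarrow{d}$ we denote convergence in distribution.

Theorem A.6 (Convergence of types). Let $(Y_n)_{n\in\mathbb{N}}$, $Y$ and $Y'$ be random variables and suppose that there are constants $a_n > 0$, $c_n \in \mathbb{R}$ such that

\[
Y_n \xrightarrow[n\to\infty]{d} Y \quad\text{and}\quad a_nY_n + c_n \xrightarrow[n\to\infty]{d} Y'.
\]

If $Y$ and $Y'$ are non-degenerate, then the limits $a = \lim_{n\to\infty} a_n$ and $c = \lim_{n\to\infty} c_n$ exist and $Y' \sim aY + c$.

Proof. Write $\chi_Z(\xi) := \mathbb{E}e^{i\xi\cdot Z}$ for the characteristic function of the random variable $Z$.

1° By Lévy's continuity theorem (Theorem A.5) convergence in distribution ensures that

\[
\chi_{a_nY_n+c_n}(\xi) = e^{ic_n\cdot\xi}\chi_{Y_n}(a_n\xi) \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_{Y'}(\xi) \quad\text{and}\quad \chi_{Y_n}(\xi) \xrightarrow[n\to\infty]{\text{locally unif.}} \chi_Y(\xi).
\]

Take some subsequence $(a_{n(k)})_{k\in\mathbb{N}} \subset (a_n)_{n\in\mathbb{N}}$ such that $\lim_{k\to\infty} a_{n(k)} = a \in [0,\infty]$.

2° Claim: $a > 0$. Assume, on the contrary, that $a = 0$. Then

\[
|\chi_{a_{n(k)}Y_{n(k)}+c_{n(k)}}(\xi)| = |\chi_{Y_{n(k)}}(a_{n(k)}\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(0)| = 1.
\]

Thus, $|\chi_{Y'}| \equiv 1$ which means that $Y'$ would be degenerate, contradicting our assumption.

3° Claim: $a < \infty$. If $a = \infty$, we use $Y_n = (a_n)^{-1}\big((a_nY_n + c_n) - c_n\big)$ and the argument from step 2° to reach the contradiction

\[
(a_{n(k)})^{-1} \xrightarrow[k\to\infty]{} a^{-1} > 0 \iff a < \infty.
\]

4° Claim: There exists a unique $a \in [0,\infty)$ such that $\lim_{n\to\infty} a_n = a$. Assume that there were two different subsequential limits $a_{n(k)} \to a$, $a_{m(k)} \to a'$ and $a \neq a'$. Then

\[
\begin{aligned}
|\chi_{a_{n(k)}Y_{n(k)}+c_{n(k)}}(\xi)| &= |\chi_{a_{n(k)}Y_{n(k)}}(\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a\xi)|, \\
|\chi_{a_{m(k)}Y_{m(k)}+c_{m(k)}}(\xi)| &= |\chi_{a_{m(k)}Y_{m(k)}}(\xi)| \xrightarrow[k\to\infty]{} |\chi_Y(a'\xi)|.
\end{aligned}
\]

On the other hand, $|\chi_Y(a\xi)| = |\chi_Y(a'\xi)| = |\chi_{Y'}(\xi)|$. If $a' < a$, we get by iteration

\[
|\chi_Y(\xi)| = \Big|\chi_Y\Big(\frac{a'}{a}\,\xi\Big)\Big| = \cdots = \Big|\chi_Y\Big(\Big(\frac{a'}{a}\Big)^N\xi\Big)\Big| \xrightarrow[N\to\infty]{} |\chi_Y(0)| = 1.
\]

Thus, $|\chi_Y| \equiv 1$ and $Y$ is a.s. constant. Since $a, a'$ can be interchanged, we conclude that $a = a'$.

5° We have

\[
e^{ic_n\cdot\xi} = \frac{\chi_{a_nY_n+c_n}(\xi)}{\chi_{a_nY_n}(\xi)} = \frac{\chi_{a_nY_n+c_n}(\xi)}{\chi_{Y_n}(a_n\xi)} \xrightarrow[n\to\infty]{} \frac{\chi_{Y'}(\xi)}{\chi_Y(a\xi)}.
\]

Since $\chi_Y$ is continuous and $\chi_Y(0) = 1$, the limit $\lim_{n\to\infty} e^{ic_n\cdot\xi}$ exists for all $|\xi| \leq \delta$ and some small $\delta$. For $\xi = t\xi_0$ with $|\xi_0| = 1$, we get

\[
0 < \bigg|\int_0^\delta \frac{\chi_{Y'}(t\xi_0)}{\chi_Y(ta\xi_0)}\,dt\bigg| = \lim_{n\to\infty}\bigg|\int_0^\delta e^{itc_n\cdot\xi_0}\,dt\bigg| = \lim_{n\to\infty}\bigg|\frac{e^{i\delta c_n\cdot\xi_0} - 1}{ic_n\cdot\xi_0}\bigg| \leq \liminf_{n\to\infty}\frac{2}{|c_n\cdot\xi_0|},
\]

and so $\limsup_{n\to\infty}|c_n| < \infty$; if there were two limit points $c \neq c'$, then $e^{ic\cdot\xi} = e^{ic'\cdot\xi}$ for all $|\xi| \leq \delta$. This gives $c = c'$, and we finally see that $c_n \to c$, $e^{ic_n\cdot\xi} \to e^{ic\cdot\xi}$, as well as

\[
\chi_{Y'}(\xi) = e^{ic\cdot\xi}\chi_Y(a\xi).
\]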

A random variable is called symmetric if Y ∼ −Y.

Theorem A.7 (Symmetrization inequality). Let $Y_1,\ldots,Y_n$ be independent symmetric random variables. Then the partial sum $S_n = Y_1 + \cdots + Y_n$ is again symmetric and

\[
\mathbb{P}(|Y_1 + \cdots + Y_n| > u) \geq \frac{1}{2}\,\mathbb{P}\Big(\max_{1\leq k\leq n}|Y_k| > u\Big). \tag{A.3}
\]

If the $Y_k$ are iid with $Y_1 \sim \mu$, then

\[
\mathbb{P}(|Y_1 + \cdots + Y_n| > u) \geq \frac{1}{2}\Big(1 - e^{-n\mathbb{P}(|Y_1|>u)}\Big). \tag{A.4}
\]

Proof. By independence, $S_n = Y_1 + \cdots + Y_n \sim -Y_1 - \cdots - Y_n = -S_n$.

Let $\tau = \min\{1 \leq k \leq n : |Y_k| = \max_{1\leq l\leq n}|Y_l|\}$ and set $Y_{n,\tau} = S_n - Y_\tau$. Then the four (counting all possible $\pm$ combinations) random variables $(\pm Y_\tau, \pm Y_{n,\tau})$ have the same law. Moreover,

\[
\mathbb{P}(Y_\tau > u) \leq \mathbb{P}(Y_\tau > u,\ Y_{n,\tau} \geq 0) + \mathbb{P}(Y_\tau > u,\ Y_{n,\tau} \leq 0) = 2\,\mathbb{P}(Y_\tau > u,\ Y_{n,\tau} \geq 0),
\]

and so

\[
\mathbb{P}(S_n > u) = \mathbb{P}(Y_\tau + Y_{n,\tau} > u) \geq \mathbb{P}(Y_\tau > u,\ Y_{n,\tau} \geq 0) \geq \frac{1}{2}\,\mathbb{P}(Y_\tau > u).
\]

By symmetry, this implies (A.3). In order to see (A.4), we use that the $Y_k$ are iid, hence

\[
\mathbb{P}\Big(\max_{1\leq k\leq n}|Y_k| \leq u\Big) = \mathbb{P}(|Y_1| \leq u)^n \leq e^{-n\mathbb{P}(|Y_1|>u)},
\]

along with the elementary inequality $1 - p \leq e^{-p}$ for $0 \leq p \leq 1$. This proves (A.4).
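Both inequalities can be checked exactly for a small discrete example. The sketch below assumes iid symmetric variables taking the values $-1, 0, 1$ with probabilities $p/2$, $1-p$, $p/2$ (a hypothetical choice, as are the function and variable names) and computes all three quantities in (A.3) and (A.4) by enumerating the $3^n$ outcomes.

```python
import math
from itertools import product

def symmetrization_check(n: int, p: float, u: float):
    """Return (P(|S_n| > u), (1/2) P(max |Y_k| > u), (1/2)(1 - exp(-n P(|Y_1| > u))))."""
    values = (-1, 0, 1)
    probs = (p / 2, 1 - p, p / 2)
    prob_sum = 0.0   # P(|Y_1 + ... + Y_n| > u)
    prob_max = 0.0   # P(max_k |Y_k| > u)
    for outcome in product(range(3), repeat=n):
        weight = math.prod(probs[j] for j in outcome)
        ys = [values[j] for j in outcome]
        if abs(sum(ys)) > u:
            prob_sum += weight
        if max(abs(y) for y in ys) > u:
            prob_max += weight
    prob_single = sum(q for v, q in zip(values, probs) if abs(v) > u)
    return prob_sum, 0.5 * prob_max, 0.5 * (1 - math.exp(-n * prob_single))
```

Calling `symmetrization_check(3, 0.4, 0.5)` returns three numbers in decreasing order, in line with (A.3) and (A.4); for Rademacher variables ($p = 1$) and $n = 2$ the first inequality holds with equality.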

The predictable σ-algebra

Let $(\Omega,\mathscr{A},\mathbb{P})$ be a probability space and $(\mathscr{F}_t)_{t\geq 0}$ some filtration. A stochastic process $(X_t)_{t\geq 0}$ is called adapted, if for every $t \geq 0$ the random variable $X_t$ is $\mathscr{F}_t$ measurable.

Definition A.8. The predictable σ-algebra $\mathscr{P}$ is the smallest σ-algebra on $\Omega \times (0,\infty)$ such that all left-continuous adapted stochastic processes $(\omega,t) \mapsto X_t(\omega)$ are measurable. A $\mathscr{P}$ measurable process $X$ is called a predictable process.

For a stopping time $\tau$ we denote by

\[
\rrbracket 0,\tau\rrbracket := \{(\omega,t) : 0 < t \leq \tau(\omega)\} \quad\text{and}\quad \rrbracket\tau,\infty\llbracket := \{(\omega,t) : t > \tau(\omega)\} \tag{A.5}
\]

the left-open stochastic intervals. The following characterization of the predictable σ-algebra is essentially from Jacod & Shiryaev [27, Theorem I.2.2].

Theorem A.9. The predictable σ-algebra $\mathscr{P}$ is generated by any one of the following families of random sets

a) $\rrbracket 0,\tau\rrbracket$ where $\tau$ is any bounded stopping time;

b) $F_s \times (s,t]$ where $F_s \in \mathscr{F}_s$ and $0 \leq s < t$.

Proof. We write $\mathscr{P}_a$ and $\mathscr{P}_b$ for the σ-algebras generated by the families listed in a) and b), respectively.

1° Pick $0 \leq s < t$, $F = F_s \in \mathscr{F}_s$ and let $n > t$. Observe that $s_F := s\mathbf{1}_F + n\mathbf{1}_{F^c}$ is a bounded stopping time³ and $F \times (s,t] = \rrbracket s_F,t_F\rrbracket$. Therefore, $\rrbracket s_F,t_F\rrbracket = \rrbracket 0,t_F\rrbracket \setminus \rrbracket 0,s_F\rrbracket \in \mathscr{P}_a$, and we conclude that $\mathscr{P}_b \subset \mathscr{P}_a$.

³ Indeed, $\{s_F \leq t\} = \{s \leq t\} \cap F = \begin{cases}\emptyset, & s > t\\ F, & s \leq t\end{cases} \in \mathscr{F}_t$ for all $t \geq 0$.

2° Let $\tau$ be a bounded stopping time. Since $t \mapsto \mathbf{1}_{\rrbracket 0,\tau\rrbracket}(\omega,t)$ is adapted and left-continuous, we have $\mathscr{P}_a \subset \mathscr{P}$.

3° Let $X$ be an adapted and left-continuous process and define for every $n \in \mathbb{N}$

\[
X_t^n := \sum_{k=0}^\infty X_{k2^{-n}}\,\mathbf{1}_{\rrbracket k2^{-n},(k+1)2^{-n}\rrbracket}(t).
\]

Obviously, $X^n = (X_t^n)_{t\geq 0}$ is $\mathscr{P}_b$ measurable; because of the left-continuity of $t \mapsto X_t$, the limit $\lim_{n\to\infty} X_t^n = X_t$ exists, and we conclude that $X$ is $\mathscr{P}_b$ measurable; consequently, $\mathscr{P} \subset \mathscr{P}_b$.

The structure of translation invariant operators

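Let $\vartheta_x f(y) := f(y+x)$ be the translation operator and $\widetilde f(x) := f(-x)$. A linear operator $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$ is called translation invariant if

\[
\vartheta_x(Lf) = L(\vartheta_x f). \tag{A.6}
\]

A distribution $\lambda$ is an element of the topological dual $(C_c^\infty(\mathbb{R}^d))'$, i.e. a continuous linear functional $\lambda : C_c^\infty(\mathbb{R}^d) \to \mathbb{R}$. The convolution of a distribution with a function $f \in C_c^\infty(\mathbb{R}^d)$ is defined as

\[
f * \lambda(x) := \lambda(\vartheta_{-x}\widetilde f), \qquad \lambda \in (C_c^\infty(\mathbb{R}^d))',\ f \in C_c^\infty(\mathbb{R}^d),\ x \in \mathbb{R}^d.
\]

If $\lambda = \mu$ is a measure, this formula generalizes the 'usual' convolution

\[
f * \mu(x) = \int f(x-y)\,\mu(dy) = \int \widetilde f(y-x)\,\mu(dy) = \mu(\vartheta_{-x}\widetilde f).
\]

Theorem A.10. If $\lambda \in (C_c^\infty(\mathbb{R}^d))'$ is a distribution, then $Lf(x) := f * \lambda(x)$ defines a translation invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$.

Conversely, every translation-invariant continuous linear map $L : C_c^\infty(\mathbb{R}^d) \to C(\mathbb{R}^d)$ is of the form $Lf = f * \lambda$ for some unique distribution $\lambda \in (C_c^\infty(\mathbb{R}^d))'$.

Proof. Let $Lf = f * \lambda$. From the very definition of the convolution we get

\[
\begin{aligned}
(\vartheta_{-x}f) * \lambda &= \lambda\big(\vartheta_{-x}\widetilde{\vartheta_{-x}f}\big) = \lambda\big(\vartheta_{-x}[f(x-\cdot)]\big) \\
&= \vartheta_{-x}\lambda\big(\vartheta_{-x}[f(-\cdot)]\big) \\
&= \vartheta_{-x}(f * \lambda).
\end{aligned}
\]

For any sequence $x_n \to x$ in $\mathbb{R}^d$ and $f \in C_c^\infty(\mathbb{R}^d)$ we know that $\vartheta_{-x_n}\widetilde f \to \vartheta_{-x}\widetilde f$ in $C_c^\infty(\mathbb{R}^d)$. Since $\lambda$ is a continuous linear functional on $C_c^\infty(\mathbb{R}^d)$, we conclude that

\[
\lim_{n\to\infty} Lf(x_n) = \lim_{n\to\infty}\lambda(\vartheta_{-x_n}\widetilde f) = \lambda(\vartheta_{-x}\widetilde f) = Lf(x)
\]

which shows that $Lf \in C(\mathbb{R}^d)$. (With a similar argument based on difference quotients we could even show that $Lf \in C^\infty(\mathbb{R}^d)$.)

In order to prove the continuity of $L$, it is enough to show that $L : C_c^\infty(K) \to C(\mathbb{R}^d)$ is continuous for every compact set $K \subset \mathbb{R}^d$ (this is because of the definition of the topology in $C_c^\infty(\mathbb{R}^d)$). We will use the closed graph theorem: Assume that $f_n \to f$ in $C_c^\infty(\mathbb{R}^d)$ and $f_n * \lambda \to g$ in $C(\mathbb{R}^d)$, then we have to show that $g = f * \lambda$.

For every $x \in \mathbb{R}^d$ we have $\vartheta_{-x}\widetilde f_n \to \vartheta_{-x}\widetilde f$ in $C_c^\infty(\mathbb{R}^d)$, and so

\[
g(x) = \lim_{n\to\infty}(f_n * \lambda)(x) = \lim_{n\to\infty}\lambda(\vartheta_{-x}\widetilde f_n) = \lambda(\vartheta_{-x}\widetilde f) = (f * \lambda)(x).
\]

Assume now that $L$ is translation invariant and continuous. Define $\lambda(f) := (L\widetilde f)(0)$. Since $L$ is linear and continuous, and $f \mapsto \widetilde f$ and the evaluation at $x = 0$ are continuous operations, $\lambda$ is a continuous linear map on $C_c^\infty(\mathbb{R}^d)$. Because of the translation invariance of $L$ we get

\[
(Lf)(x) = (\vartheta_x Lf)(0) = L(\vartheta_x f)(0) = \lambda\big(\widetilde{\vartheta_x f}\big) = \lambda(\vartheta_{-x}\widetilde f) = (f * \lambda)(x).
\]

If $\mu$ is a further distribution with $Lf(0) = f * \mu(0)$, we see

\[
(\mu - \lambda)(\widetilde f) = f * (\mu - \lambda)(0) = f * \mu(0) - f * \lambda(0) = 0 \quad\text{for all } f \in C_c^\infty(\mathbb{R}^d)
\]

which proves $\mu = \lambda$.

Bibliography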

[1] Aczél, J.: Lectures on Functional Equations and Their Applications. Academic Press, New York (NY) 1966.

[2] Applebaum, D.: Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge 2009 (2nd ed).

[3] Barndorff-Nielsen, O.E. et al.: Lévy Processes. Theory and Applications. Birkhäuser, Boston (MA) 2001.

[4] Bass, R.: Uniqueness in law for pure-jump Markov processes. Probab. Theor. Relat. Fields 79 (1988) 271–287.

[5] Bauer, H.: Probability Theory. De Gruyter, Berlin 1996.

[6] Bawly, G.M.: Über einige Verallgemeinerungen der Grenzwertsätze der Wahrscheinlichkeitsrechnung. Mat. Sbornik (Réc. Math. Moscou, N.S.) 1 (1936) 917–930.

[7] Behme, A., Schnurr, A.: A criterion for invariant measures of Itô processes based on the symbol. Bernoulli 21 (2015) 1697–1718.

[8] Blumenthal, R.M., Getoor, R.K.: Sample functions of stochastic processes with stationary independent increments. J. Math. Mech. 10 (1961) 493–516.

[9] Böttcher, B., Schilling, R.L., Wang, J.: Lévy-type Processes: Construction, Approximation and Sample Path Properties. Lecture Notes in Mathematics 2099 (Lévy Matters III), Springer, Cham 2014.

[10] Breiman, L.: Probability. Addison–Wesley, Reading (MA) 1968 (several reprints by SIAM, Philadelphia).

[11] Bretagnolle, J.L.: Processus à accroissements independants. In: Bretagnolle, J.L. et al.: École d’Été de Probabilités: Processus Stochastiques. Springer, Lecture Notes in Mathematics 307, Berlin 1973, pp. 1–26.

[12] Çinlar, E.: Introduction to Stochastic Processes. Prentice–Hall, Englewood Cliffs (NJ) 1975 (reprinted by Dover, Mineola).


[13] Courrège, P.: Générateur infinitésimal d’un semi-groupe de convolution sur Rn, et formule de Lévy–Khintchine. Bull. Sci. Math. 88 (1964) 3–30.

[14] Courrège, P.: Sur la forme intégro différentielle des opérateurs de $C_k^\infty$ dans $C$ satisfaisant au principe du maximum. In: Séminaire Brelot–Choquet–Deny. Théorie du potentiel 10 (1965/66) exposé 2, pp. 1–38.

[15] de Finetti, B.: Sulle funzioni ad incremento aleatorio. Rend. Accad. Lincei Ser. VI 10 (1929) 163–168.

[16] Dieudonné, J.: Foundations of Modern Analysis. Academic Press, New York (NY) 1969.

[17] Ethier, S.N., Kurtz, T.G.: Markov Processes. Characterization and Convergence. John Wiley & Sons, New York (NY) 1986.

[18] Gikhman, I.I., Skorokhod, A.V.: Introduction to the Theory of Random Processes. W.B. Saunders, Philadelphia (PA) 1969 (reprinted by Dover, Mineola 1996).

[19] Grimvall, A.: A theorem on convergence to a Lévy process. Math. Scand. 30 (1972) 339–349.

[20] Herz, C.S.: Théorie élémentaire des distributions de Beurling. Publ. Math. Orsay no. 5, 2ème année 1962/63, Paris 1964.

[21] Hoh, W.: Pseudo differential operators with negative definite symbols of variable order. Rev. Mat. Iberoamericana 16 (2000) 219–241.

[22] Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. North-Holland Publishing Co./Kodansha, Amsterdam and Tokyo 1989 (2nd ed).

[23] Itô, K.: On stochastic processes. I. (Infinitely divisible laws of probability). Japanese J. Math. XVIII (1942) 261–302.

[24] Itô, K.: Lectures on Stochastic Processes. Tata Institute of Fundamental Research, Bombay 1961 (reprinted by Springer, Berlin 1984). http://www.math.tifr.res.in/~publ/ln/tifr24.pdf

[25] Itô, K.: Semigroups in probability theory. In: Functional Analysis and Related Topics, Proceedings in Memory of K. Yosida (Kyoto 1991). Springer, Lecture Notes in Mathematics 1540, Berlin 1993, 69–83.

[26] Jacob, N.: Pseudo Differential Operators and Markov Processes (3 volumes). Imperial College Press, London 2001–2005.

[27] Jacod, J., Shiryaev, A.N.: Limit Theorems for Stochastic Processes. Springer, Berlin 1987 (2nd ed Springer, Berlin 2003).

[28] Khintchine, A.Ya.: A new derivation of a formula by P. Lévy. Bull. Moscow State Univ. 1 (1937) 1–5 (Russian; an English translation is contained in Appendix 2.1, pp. 44–49 of [45]).

[29] Khintchine, A.Ya.: Zur Theorie der unbeschränkt teilbaren Verteilungsgesetze. Mat. Sbornik (Réc. Math. Moscou, N.S.) 2 (1937) 79–119 (German; an English translation is contained in Appendix 3.3, pp. 79–125 of [45]).

[30] Knopova, V., Schilling, R.L.: A note on the existence of transition probability densities for Lévy processes. Forum Math. 25 (2013) 125–149.

[31] Kolmogorov, A.N.: Sulla forma generale di un processo stocastico omogeneo (Un problema di Bruno de Finetti). Rend. Accad. Lincei Ser. VI 15 (1932) 805–808 and 866–869 (an English translation is contained in: Shiryayev, A.N. (ed.): Selected Works of A.N. Kolmogorov. Kluwer, Dordrecht 1992, vol. 2, pp. 121–127).

[32] Kühn, F.: Existence and estimates of moments for Lévy-type processes. Stoch. Proc. Appl. (in press). DOI: 10.1016/j.spa.2016.07.008

[33] Kühn, F.: Probability and Heat Kernel Estimates for Lévy(-Type) Processes. PhD Thesis, Technische Universität Dresden 2016.

[34] Kunita, H.: Stochastic differential equations based on Lévy processes and stochastic flows of diffeomorphisms. In: Rao, M.M. (ed.): Real and Stochastic Analysis. Birkhäuser, Boston (MA) 2004, pp. 305–373.

[35] Kunita, H., Watanabe, S.: On square integrable martingales. Nagoya Math. J. 30 (1967) 209–245.

[36] Kyprianou, A.: Introductory Lectures on Fluctuations and Lévy Processes with Applications. Springer, Berlin 2006.

[37] Lévy, P.: Sur les intégrales dont les éléments sont des variables aléatoires indépendantes. Ann. Sc. Norm. Sup. Pisa 3 (1934) 337–366.

[38] Lévy, P.: Théorie de l’addition des variables aléatoires. Gauthier–Villars, Paris 1937.

[39] Mainardi, F., Rogosin, S.V.: The origin of infinitely divisible distributions: from de Finetti’s problem to Lévy–Khintchine formula. Math. Methods Economics Finance 1 (2006) 37–55.

[40] Malliavin, P.: Integration and Probability. Springer, New York (NY) 1995.

[41] Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York (NY) 1983.

[42] Prokhorov, Yu.V.: Convergence of random processes and limit theorems in probability theory. Theor. Probab. Appl. 1 (1956) 157–214.

[43] Protter, P.E.: Stochastic Integration and Differential Equations. Springer, Berlin 2004 (2nd ed).

[44] Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Springer, Berlin 2005 (3rd printing of the 3rd ed).

[45] Rogosin, S.V., Mainardi, F.: The Legacy of A.Ya. Khintchine’s Work in Probability Theory. Cambridge Scientific Publishers, Cambridge 2010.

[46] Rosiński, J.: On series representations of infinitely divisible random vectors. Ann. Probab. 18 (1990) 405–430.

[47] Rosiński, J.: Series representations of Lévy processes from the perspective of point processes. In: Barndorff-Nielsen et al. [3] (2001) 401–415.

[48] Samorodnitsky, G., Taqqu, M.S.: Stable Non-Gaussian Random Processes. Chapman & Hall, New York (NY) 1994.

[49] Sandrić, N.: On recurrence and transience of two-dimensional Lévy and Lévy-type processes. Stoch. Proc. Appl. 126 (2016) 414–438.

[50] Sandrić, N.: Long-time behavior for a class of Feller processes. Trans. Am. Math. Soc. 368 (2016) 1871–1910.

[51] Sato, K.: Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, Cambridge 1999 (2nd ed 2013).

[52] Schilling, R.L.: Conservativeness and extensions of Feller semigroups. Positivity 2 (1998) 239–256.

[53] Schilling, R.L.: Growth and Hölder conditions for the sample paths of a Feller process. Probab. Theor. Relat. Fields 112 (1998) 565–611.

[54] Schilling, R.L.: Measures, Integrals and Martingales. Cambridge University Press, Cambridge 2011 (3rd printing).

[55] Schilling, R.L.: Maß und Integral. De Gruyter, Berlin 2015.

[56] Schilling, R.L., Partzsch, L.: Brownian Motion. An Introduction to Stochastic Processes. De Gruyter, Berlin 2014 (2nd ed).

[57] Schilling, R.L., Schnurr, A.: The symbol associated with the solution of a stochastic differential equation. El. J. Probab. 15 (2010) 1369–1393.

[58] Schnurr, A.: The Symbol of a Markov Semimartingale. PhD Thesis, Technische Universität Dresden 2009. Shaker-Verlag, Aachen 2009.

[59] Schnurr, A.: On the semimartingale nature of Feller processes with killing. Stoch. Proc. Appl. 122 (2012) 2758–2780.

[60] Stroock, D.W., Varadhan, S.R.S.: Multidimensional Stochastic Processes. Springer, Berlin 1997 (2nd corrected printing).

[61] von Waldenfels, W.: Eine Klasse stationärer Markowprozesse. Kernforschungsanlage Jülich, Institut für Plasmaphysik, Jülich 1961.

[62] von Waldenfels, W.: Fast positive Operatoren. Z. Wahrscheinlichkeitstheorie verw. Geb. 4 (1965) 159–174.