Malliavin

by David Nualart

1 1 One-dimensional Gaussian analysis

Consider the probability space (R, B(R), γ), where

• γ = N(0, 1) is the standard Gaussian probability on R with density

1 −x2/2 p(x) = √ e , x ∈ R. 2π

• The probability of any interval [a, b] is given by

b 1 Z 2 γ([a, b]) = √ e−x /2dx. 2π a

1 We are going to introduce two basic differential operators. For any f ∈ C (R) we define:

• Derivative operator: Df(x) = f 0(x).

• Divergence operator: δf(x) = xf(x) − f 0(x).

k m m Denote by Cp (R ) the space of functions f : R → R, which are k times continuously differentiable, and such that for some N ≥ 1, |f (k)(x)| ≤ C(1 + |x|N ).

Lemma 1.1. The operators D and δ are adjoint with respect to the measure γ. That means, 1 for any f, g ∈ Cp (R), we have

hDf, giL2(R,γ) = hf, δgiL2(R,γ) .

Proof. Integrating by parts and using p0(x) = −xp(x) we get Z Z f 0(x)g(x)p(x)dx = − f(x)(g(x)p(x))0dx R R Z Z = − f(x)g0(x)p(x)dx + f(x)g(x)xp(x)dx R R Z = f(x)δg(x)p(x)dx. R

2 Lemma 1.2 (Heisenberg’s commutation relation). Let f ∈ C (R). Then

(Dδ − δD)f = f

Proof. We can write

Dδf(x) = D(xf(x) − f 0(x)) = f(x) + xf 0(x) − f 00(x) and, on the other hand, δDf(x) = δf 0(x) = xf 0(x) − f 00(x). This completes the proof.

2 n More generally, if f ∈ C (R) for n ≥ 2, we have

(Dδn − δnD)f = nδn−1f

Proof. Using induction on n, we can write

Dδnf = Dδ(δn−1f) = δD(δn−1f) + δn−1f = δ δn−1Df + (n − 1)δn−2f + δn−1f = δnDf + nδn−1f.

Next we will introduce the Hermite polynomials. Define H0(x) = 1, and for n ≥ 1 put n Hn(x) = δ 1. In particular, for n = 1, 2, 3, we have

H1(x) = δ1 = x 2 H2(x) = δx = x − 1 2 3 H3(x) = δ(x − 1) = x − 3x.

We have the following formula for the derivatives of the Hermite polynomials:

0 Hn = nHn−1

In fact, 0 n n n−1 Hn = Dδ 1 = δ D1 + nδ 1 = nHn−1.

Proposition 1.1. The sequence of normalized Hermite polynomials { √1 H , n ≥ 0} form a n! n 2 complete orthonormal system of functions in the Hilbert space L (R, γ). Proof. For n, m ≥ 0, we can write ( Z n! if n = m Hn(x)Hm(x)p(x)dx = R 0 if n 6= m

Indeed, using the properties of Hermite polynomials, we obtain Z Z m Hn(x)Hm(x)p(x)dx = Hn(x)δ 1(x)p(x)dx R R Z 0 m−1 = Hn(x)δ 1(x)p(x)dx R Z = n Hn−1(x)Hm−1(x)p(x)dx. R 2 To show completeness, it suffices to prove that if f ∈ L (R, γ) is orthogonal to all Hermite polynomials, then f = 0. Because the leading coefficient of Hn(x) is 1, we have that f is n orthogonal to all monomials x . As a consequence, for all t ∈ R, ∞ Z X (it)n Z f(x)eitxp(x)dx = f(x)xnp(x)dx = 0. n! R n=0 R

3 We can commute the and the because

∞ X Z |tx|n Z |f(x)|p(x)dx = e|tx||f(x)|p(x)dx n! n=0 R R 1 Z Z  2 ≤ f 2(x)p(x)dx e2|tx|p(x)dx < ∞. R R Therefore, the Fourier transform of fp is zero, so fp = 0, which implies f = 0. This completes the proof.

For each a ∈ R, we have the following series expansion, which will play an important role.

∞ n 2 X a ax− a H (x) = e 2 . (1) n! n n=0

n n n Proof of (1): In fact, taking into account that Hn = δ 1 and that δ is the adjoint of D , we obtain ∞ ax X 1 a· e = he ,H i 2 H (x) n! n L (R,γ) n n=0 ∞ X 1 a· n = he , δ 1i 2 H (x) n! L (R,γ) n n=0 ∞ X 1 n a· = hD (e ), 1i 2 H (x) n! L (R,γ) n n=0 ∞ n X a a· = he , 1i 2 H (x). n! L (R,γ) n n=0 Finally, 1 Z x2 a2 a· ax− 2 2 he , 1iL2(R,γ) = √ e dx = e . 2π R and (1) holds true. Let us now define the Ornstein Uhlenbeck operator, which is a second order differential 2 operator. For f ∈ C (R) we set

Lf(x) = −xf 0(x) + f 00(x).

This operator has the following properties. 1. Lf = −δDf. Proof: δDf(x) = δf 0(x) = xf 0(x) − f 00(x).

2. LHn = −nHn, that is, Hn is an eigenvector of L with eigenvalue −n. Proof: 0 LHn = −δDHn = −δHn = −nδHn−1 = −nHn.

4 The operator L is the infinitesimal generator of the Ornstein-Uhlenbeck semigroup. Con- 2 −nt sider the semigroup of operators {Pt, t ≥ 0} on L (R, γ), defined by PtHn = e Hn, that is, ∞ X 1 −nt P f = hf, H i 2 e H . t n! n L (R,γ) n n=0

dPt Then, L is the generator of Pt, that is, dt = LPt. 2 Proposition 1.2 (Mehler’s formula). For any function f ∈ L (R, γ), we have the following formula for the Ornstein-Uhlenbeck semigroup: Z −t p −2t −t p −2t Ptf(x) = f(e x + 1 − e y)p(y)dy = E[f(e x + 1 − e Y )], R where Y is a N(0, 1) random variable. √ R −t −2t Proof. Set Ptf(x) = f(e x + 1 − e y)p(y)dy. e R 2 (i) We will first show that Pt and Pet are contraction operators on L (R, γ). Indeed,

∞ X 1 kP fk2 = hf, H i2 e−2nt ≤ kfk2 , t L2(R,γ) n! n L2(R,γ) L2(R,γ) n=0 and

Z Z p 2 kP fk2 = f(e−tx + 1 − e−2ty)p(y)dy p(x)dx et L2(R,γ) R R Z p ≤ f 2(e−tx + 1 − e−2ty)p(y)p(x)dydx R2 p = E[f 2(e−tX + 1 − e−2tY )] = kfk2 , L2(R,γ) where X and Y are independent N(0, 1)-random variables.

ax 2 (ii) The functions {e , a ∈ R} form a total system in L (R, γ). So, it suffices to show that a· a· Pte = Pete for each a ∈ R. We have

√ 2 a· h axe−t+aY 1−e−2t i axe−t a (1−e−2t) (Pete )(x) = E e = e e 2

2 2 ∞ n −nt a axe−t− 1 a2e−2t a X a e = e 2 e 2 = e 2 H (x) n! n n=0 ∞ ! 2 n 2  2  a X a a a·− a = e 2 P H (x) = e 2 P e 2 (x) t n! n t n=0 a· = (Pte )(x).

This completes the proof of the proposition.

5 The Ornstein-Uhlenbeck semigroup has the following propreties:

1. kPtfkLp(R,γ) ≤ kfkLp(R,γ) for any p ≥ 2. Proof: Using Mehler’s formula and H¨older’sinequality, we can write Z Z p p −t p −2t kPtfk p = f(e x + 1 − e y)p(y)dy p(x)dx L (R,γ) R R Z p ≤ |f(e−tx + 1 − e−2ty)|pp(y)p(x)dydx R2 p = E[|f(e−tX + 1 − e−2tY )|p] = kfkp . Lp(R,γ) R 2. P0f = f and P∞f = limt→∞ Ptf = f(y)p(y)dy. R −t 3. DPtf = e PtDf.

4. f ≥ 0 implies Ptf ≥ 0. 2 5. For any f ∈ L (R, γ) we have Z Z ∞ f(x) − fdγ = − LPtf(x)dt. R 0 Proof: ∞ Z X 1 f(x) − fdγ = hf, H i 2 H (x) n! n L (R,γ) n R n=1 ∞ Z ∞  X 1 −nt = hf, H i 2 ne H (x)dt n! n L (R,γ) n n=1 0 ∞ ! Z ∞ X 1 = hf, H i 2 (−LP H )(x) dt n! n L (R,γ) t n 0 n=1 Z ∞ = − LPtf(x)dt. 0 1 Proposition 1.3 (First Poincar´einequality). For any f ∈ Cp (R),

Var(f) ≤ kf 0k2 L2(R,γ) Proof. Set f¯ = R fdγ. We can write R Z Var(f) = f(x)(f(x) − f¯)p(x)dx R Z ∞ Z = − f(x)LPtf(x)p(x)dxdt 0 R Z ∞ Z = f(x)δDPtf(x)p(x)dxdt 0 R Z ∞ Z −t 0 0 = e f (x)Ptf (x)p(x)dxdt 0 R Z ∞ −t 0 0 ≤ e kf kL2(R,γ)kPtf kL2(R,γ)dt 0 ≤ kf 0k2 . L2(R,γ)

6 This result as the following interpretation. If f 0 is small, f is concentrated around its mean value f¯ = R f(x)p(x)dx because R Z Var(f) = (f(x) − f¯)2p(x)dx. R The result can be extended to the Sobolev space

1,2 0 2 D = {f : f, f ∈ L (R, γ)} defined as the completion of C1( ) by the norm kfk2 = kfk2 + kf 0k2 . p R 1,2 L2(R,γ) L2(R,γ)

1.1 Finite-dimensional case We consider now the finite-dimensional case. That is, the probability space (Ω, F,P ) is n n n such that Ω = R , F = B(R ) is the Borel σ-field of R , and P is the standard Gaussian probability with density p(x) = (2π)−n/2e−|x|2/2. In this framework we consider, as before, two differential operators. The first is the derivative operator, which is simply the gradient of n a differentiable function F : R → R:  ∂F ∂F  ∇F = ,..., . ∂x1 ∂xn The second differential operator is the divergence operator and is defined on differentiable n n vector-valued functions u: R → R as follows: n   X ∂ui δ(u) = u x − = hu, xi − div u. i i ∂x i=1 i It turns out that δ is the adjoint of the derivative operator with respect to the Gaussian measure ¶. This is the content of the next proposition. Proposition 1.4. The operator δ is the adjoint of ∇; that is,

E(hu, ∇F i) = E(F δ(u))

n n n if F : R → R and u: R → R are continuously differentiable functions which, together with their partial derivatives, have at most polynomial growth.

Proof. Integrating by parts, and using ∂p/∂xi = −xip, we obtain

n Z X Z ∂F h∇F, uipdx = uipdx n n ∂x R i=1 R i n  Z Z  X ∂ui = − F pdx + F uixipdx n ∂x n i=1 R i R Z = F δ(u)pdx. Rn This completes the proof.

7 2 Malliavin calculus on the Wiener space

2.1 Brownian motion and Wiener space Brownian motion was named after the botanist Robert Brown, who observed in a microscope the complex and erratic motion of grains of pollen suspended in water. Brownian motion was then rigorously defined and studied by Norbert Wiener; this is why it is also called the . The mathematical definition of Brownian motion is the following.

Definition 2.1. A real-valued B = (Bt)t≥0 defined on a probability space (Ω, F,P ) is called a Brownian motion if it satisfies the following conditions:

1. Almost surely B0 = 0.

2. For all 0 ≤ t1 < ··· < tn the increments Btn − Btn−1 ,...,Bt2 − Bt1 are independent random variables.

3. If 0 ≤ s < t, the increment Bt − Bs is a Gaussian random variable with mean zero and variance t − s.

4. With probability one, the map t → Bt is continuous. Properties (i), (ii), and (iii) are equivalent to saying that B is a with mean zero and covariance function

Γ(s, t) = min(s, t). (2)

The existence of Brownian motion can be proved in the following way: The function Γ(s, t) = min(s, t) is symmetric and nonnegative definite because it can be written as Z ∞ min(s, t) = 1[0,s](r)1[0,t](r)dr. 0

Then, for any integer n ≥ 1 and real numbers a1, . . . , an, n n X X Z ∞ aiaj min(ti, tj) = aiaj 1[0,ti](r)1[0,tj ](r)dr i,j=1 i,j=1 0 n 2 Z ∞X  = ai1[0,ti](r) dr ≥ 0. 0 i=1 Therefore, by Kolmogorov’s extension theorem, there exists a Gaussian process with mean zero and covariance function min(s, t). Moreover, for any s ≤ t, the increment Bt − Bs has the normal distribution N(0, t − s). This implies that for any natural number k we have   (2k)! (B − B )2k = (t − s)k. E t s 2kk! Therefore, by Kolmogorov’s continuity theorem, there exists a version of B with H¨older- continuous trajectories of order γ for any γ < (k − 1)/(2k) on any interval [0,T ]. This implies that the paths of this version of the process B are γ-H¨oldercontinuous on [0,T ] for any γ < 1/2 and T > 0. Brownian motion can be defined in the canonical probability space (Ω, F,P ) known as the Wiener space, where

8 • Ω is the space of continuous functions ω : R+ → R vanishing at the origin. •F is the Borel σ-field B(Ω) for the topology corresponding to uniform convergence on compact sets. One can easily show that F coincides with the σ-field generated by the collection of cylinder sets

C = {ω ∈ Ω: ω(t1) ∈ A1, . . . , ω(tk) ∈ Ak} , (3)

for any integer k ≥ 1, Borel sets A1,...,Ak in R, and 0 ≤ t1 < ··· < tk. • P is the Wiener measure. That is, P is defined on a cylinder set of the form (3) by Z

P (C) = pt1 (x1)pt2−t1 (x2 − x1) ··· ptk−tk−1 (xk − xk−1) dx1 ··· dxk, (4) A1×···×Ak

−1/2 −x2/(2t) where pt(x) denotes the Gaussian density pt(x) = (2πt) e , x ∈ R, t > 0. The mapping P defined by (4) on cylinder sets can be uniquely extended to a probability measure on F. This fact can be proved as a consequence of the existence of Brownian motion on R+. Finally, the canonical stochastic process defined as Bt(ω) = ω(t), ω ∈ Ω, t ≥ 0, is a Brownian motion.

2.2 Wiener integral We next define the integral of square integrable functions with respect to Brownian motion, known as the Wiener integral. We consider the set E0 of step functions

n−1 X ϕt = aj1(tj ,tj+1](t), t ≥ 0, (5) j=0 where n ≥ 1 is an integer, a0, . . . , an−1 ∈ R, and 0 = t0 < ··· < tn. The Wiener integral of a step function ϕ ∈ E0 of the form (5) is defined by

n−1 Z ∞ X ϕtdBt = aj(Btj+1 − Btj ). 0 j=0

R ∞ 2 2 The mapping ϕ → 0 ϕtdBt from E0 ⊂ L (R+) to L (Ω) is linear and isometric:

Z ∞ 2 n−1 Z ∞ X 2 2 2 ϕ dB = a (t − t ) = ϕ dt = kϕk 2 . E t t j j+1 j t L (R+) 0 j=0 0

2 The space E0 is a dense subspace of L (R+). Therefore, the mapping Z ∞ ϕ → ϕtdBt 0 2 2 can be extended to a linear isometry between L (R+) and the Gaussian subspace of L (Ω) R ∞ spanned by the Brownian motion. The random variable 0 ϕtdBt is called the Wiener integral 2 of ϕ ∈ L (R+) and is denoted by B(ϕ). Observe that it is a Gaussian random variable with 2 mean zero and variance kϕk 2 . L (R+)

9 2.3 Malliavin derivative

Let B = (Bt)t≥0 be a Brownian motion on a probability space (Ω, F,P ) such that F is the 2 σ-field generated by B. Set H = L (R+), and for any h ∈ H, consider the Wiener integral Z ∞ B(h) = h(t)dBt. 0 The Hilbert space H plays a basic role in the definition of the derivative operator. In fact, the derivative of a random variable F :Ω → R takes values in H, and (DtF )t≥0 is a stochastic process in L2(Ω; H). We start by defining the derivative in a dense subset of L2(Ω). More precisely, consider the set S of smooth and cylindrical random variables of the form

F = f(B(h1),...,B(hn)), (6)

∞ n where f ∈ Cp (R ) and hi ∈ H. Definition 2.2. If F ∈ S is a smooth and cylindrical random variable of the form (6), the derivative DF is the H-valued random variable defined by

n X ∂f D F = (B(h ),...,B(h ))h (t). t ∂x 1 n i i=1 i

For instance, D(B(h)) = h and D(Bt1 ) = 1[0,t1], for any t1 ≥ 0. The derivative operator can be interpreted as a . Consider the 1 R t Cameron-Martin space H ⊂ Ω, which is is the set of functions of the form ψ(t) = 0 h(s)ds, where h ∈ H. Then, For ant h ∈ H, hDF, hiH is the derivative of F in the direction of R · 0 h(s)ds: Z T d  Z ·  hDF, hiH = htDtF dt = F ω +  hsds |=0. 0 d 0

For example, if F = Bt1 , then

 Z ·  Z t1 F ω +  hsds = ω(t1) +  hsds, 0 0

R t1 so, hDF, hiH = 0 hsds, and DtF = 1[0,t1](t). The operator D defines a linear and unbounded operator from S ⊂ L2(Ω) into L2(Ω; H). Let us now introduce the divergence operator. Denote by SH the class of smooth and cylin- drical stochastic processes u = (ut)t≥0 of the form

n X ut = Fjhj(t), (7) j=1

where Fj ∈ S and hj ∈ H. Definition 2.3. We define the divergence of an element u of the form (7) as the random variable given by n n X X δ(u) = FjB(hj) − hDFj, hjiH . j=1 j=1

10 In particular, for any h ∈ H we have δ(h) = B(h). As in the finite-dimensional case, the divergence is the adjoint of the derivative operator, as is shown in the next proposition.

Proposition 2.1. Let F ∈ S and u ∈ SH . Then

E(F δ(u)) = E(hDF, uiH ).

Proof. We can assume that F = f(B(h1) ...,B(hn)) and n X u = gj(B(h1) ...,B(hn))hj, j=1

where h1, . . . , hn are orthonormal elements in H. In this case, the duality relationship reduces to the finite-dimensional case proved in Proposition 1.4.

We will make use of the notation DhF = hDF, hiH for any h ∈ H and F ∈ S. The following proposition states the basic properties of the derivative and divergence operators on smooth and cylindrical random variables.

Proposition 2.2. Suppose that u, v ∈ SH , F ∈ S, and h ∈ H. Then, if (ei)i≥1 is a complete orthonormal system in H, we have ∞  X  E(δ(u)δ(v)) = E(hu, viH ) + E Dei hu, ejiH Dej hv, eiiH , (8) i,j=1

Dh(δ(u)) = δ(Dhu) + hh, uiH , (9)

δ(F u) = F δ(u) − hDF, uiH . (10) Property (8) can also be written as  Z ∞   Z ∞ Z ∞  E(δ(u)δ(v)) = E utvtdt + E DsutDtvsdsdt . 0 0 0 Pn Proof of Proposition 2.2. We first show property (9). Consider u = j=1 Fjhj, where Fj ∈ S and hj ∈ H for j = 1, . . . , n. Then, using Dh(B(hj)) = hh, hjiH , we obtain n n  X X  Dh(δ(u)) = Dh FjB(hj) − hDFj, hjiH j=1 j=1 n n X X = Fjhh, hjiH + (DhFjB(hj) − hDh(DFj), hjiH ) j=1 j=1

= hu, hiH + δ(Dhu). To show property (8), using the duality formula (Proposition 2.1) and property (9), we get

E(δ(u)δ(v)) = E(hv, D(δ(u))iH ) ∞  X  = E hv, eiiH Dei (δ(u)) i=1 ∞  X   = E hv, eiiH hu, eiiH + δ(Dei u) i=1 ∞  X  = E(hu, viH ) + E Dei hu, ejiH Dej hv, eiiH . i,j=1

11 Finally, to prove property (10) we choose a smooth random variable G ∈ S and write, using the duality relationship (Proposition 2.1),

E(δ(F u)G) = E(hDG, F uiH ) = E(hu, D(FG) − GDF iH )

= E((δ(u)F − hu, DF iH )G), which implies the result because S is dense in L2(Ω).

2.4 Sobolev spaces The next proposition will play a basic role in extending the derivative to suitable Sobolev spaces of random variables. Proposition 2.3. The operator D is closable from Lp(Ω) to Lp(Ω; H) for any p ≥ 1.

Proof. Assume that the sequence FN ∈ S satisfies

Lp(Ω) Lp(Ω;H) FN −→ 0 and DFN −→ η, PN as N → ∞. Then η = 0. Indeed, for any u = j=1 Gjhj ∈ SH such that GjB(hj) and DGj are bounded, by the duality formula (Proposition 2.1), we obtain

E(hη, uiH ) = lim E(hDFN , uiH ) N→∞ = lim E(FN δ(u)) = 0. N→∞ p This implies that η = 0, since the set of u ∈ SH with the above properties is dense in L (Ω; H) for all p ≥ 1.

We consider the closed extension of the derivative, which we also denote by D. The domain of this operator is defined by the following Sobolev spaces. For any p ≥ 1, we denote 1,p by D the closure of S with respect to the seminorm   Z ∞ p/21/p p 2 kF k1,p = E(|F | ) + E (DtF ) dt . 0 1,p In particular, F belongs to D if and only if there exists a sequence Fn ∈ S such that Lp(Ω) Lp(Ω;H) Fn −→ F and DFn −→ DF,

1,2 as n → ∞. For p = 2, the space D is a Hilbert space with scalar product  Z ∞  hF,Gi1,2 = E(FG) + E DtFDtGdt . 0 1,p In the same way we can introduce spaces D (H) by taking the closure of SH . The corre- sponding seminorm is denoted by k · k1,p,H . The Malliavin derivative satisfies the following chain rule.

0 Proposition 2.4. Let ϕ: R → R be a continuous differentiable function such that |ϕ (x)| ≤ α 1,p 1,q C(1 + |x| ) for some α ≥ 0. Let F ∈ D for some p ≥ α + 1. Then, ϕ(F ) belongs to D , where q = p/(α + 1), and D(ϕ(F )) = ϕ0(F )DF.

12 Proof. Notice that |ϕ(x)| ≤ C0(1 + |x|α+1), for some constant C0, which implies that ϕ(F ) ∈ Lq(Ω) and, by H¨older’sinequality, ϕ0(F )DF ∈ Lq(Ω; H). Then, to show the proposition it suffices to approximate F by smooth and cylindrical random variables, and ϕ by ϕ∗αn, where αn is an approximation of the identity. We next define the domain of the divergence operator. We identify the Hilbert space 2 2 L (Ω; H) with L (Ω × R+). Definition 2.4. The domain of the divergence operator Dom δ in L2(Ω) is the set of processes 2 2 u ∈ L (Ω × R+) such that there exists δ(u) ∈ L (Ω) satisfying the duality relationship

E(hDF, uiH ) = E(δ(u)F ),

1,2 for any F ∈ D .

Observe that δ is a linear operator such that E(δ(u)) = 0. Moreover, δ is closed; that is, if the sequence un ∈ SH satisfies

L2(Ω;H) L2(Ω) un −→ u and δ(un) −→ G,

as n → ∞, then u belongs to Dom δ and δ(u) = G. Proposition 2.2 can be extended to random variables in suitable Sobolev spaces. Property 1,2 1,2 (8) holds for u, v ∈ D (H) ⊂ Dom δ and, in this case, for any u ∈ D (H) we can write  Z ∞   Z ∞ Z ∞  2 2 2 2 E(δ(u) ) ≤ E (ut) dt + E (Dsut) dsdt = kuk1,2,H . 0 0 0

1,2 1,2 Property (9) holds if u ∈ D (H) and Dhu ∈ Dom δ. Finally, property (10) holds if F ∈ D , F u ∈ L2(Ω; H), u ∈ Dom δ, and the right-hand side is square integrable. We can also introduce iterated derivatives and the corresponding Sobolev spaces. The kth derivative DkF of a random variable F ∈ S is the k-parameter process obtained by iteration:

n X ∂kf Dk F = (B(h ),...,B(h ))h (t ) ··· h (t ). t1,...tk 1 n i1 1 ik k ∂xi1 ··· ∂xik i1,...,ik=1

For any p ≥ 1, the operator Dk is closable from Lp(Ω) into Lp(Ω; H⊗k), and we denote by k,p D the closure of S with respect to the seminorm   k Z p/21/p p X j 2 kF kk,p = E(|F | ) + E (Dt ,...,t F ) dt1 ··· dtj . j 1 j j=1 R+

k,∞ k,p ∞,2 k,2 ∞ k,∞ For any k ≥ 1, we set D := ∩p≥2D , D := ∩k≥1D , and D := ∩k≥1D . Similarly, k,p we can introduce the spaces D (H).

2.5 The divergence as a tochastic integral

The Malliavin derivative is a local operator in the following sense. Let [a, b] ⊂ R+ be fixed. We denote by F[a,b] the σ-field generated by the random variables {Bs − Ba, s ∈ [a, b]}.

13 1,2 2 Lemma 2.5. Let F be a random variable in D ∩L (Ω, F[a,b],P ). Then DtF = 0 for almost all (t, ω) ∈ [a, b]c × Ω.

2 Proof. If F belongs to S ∩L (Ω, F[a,b],P ) then this property is clear. The general case follows by approximation.

The following result says that the divergence operator is an extension of Itˆo’sintegral. For any t ≥ 0 we denote by Ft the σ-algebra generated by the null sets and the random variables Bs, s ∈ [0, t].

2 Theorem 2.6. Any process u in L (Ω × R+) which is adapted (for each t ≥ 0, ut is Ft- measurable) belongs to Dom δ and δ(u) coincides with Itˆo’sstochastic integral Z ∞ δ(u) = utdBt. 0 Proof. Consider a simple process u of the form

n−1 X ut = φj1(tj ,tj+1](t), j=0

where 0 ≤ t0 < t1 < ··· < tn and the random variables φj ∈ S are Ftj -measurable. Then δ(u) coincides with the Itˆointegral of u because, by (10),

n−1 n−1 n−1 X X Z tj+1 X δ(u) = φj(Btj+1 − Btj ) − Dtφjdt = φj(Btj+1 − Btj ), j=0 j=0 tj j=0

taking into account that Dtφj = 0 if t > tj by Lemma 2.5. Then the result follows by approx- 2 2 imating any process in L (P) by simple processes, and approximating any φj ∈ L (Ω, Ftj ,P )

by Ftj -measurable smooth and cylindrical random variables. If u is not adapted, δ(u) coincides with an anticipating stochastic integral introduced by Skorohod. Using techniques of Malliavin calculus, Nualart and Pardoux developed a for the Skorohod integral. If u and v are adapted then, for s < t, Dtvs = 0 and, for s > t, Dsut = 0. As a consequence, property (8) leads to the isometry property of Itˆo’sintegral for adapted processes 1,2 u, v ∈ D (H):  Z ∞  E(δ(u)δ(v)) = E utvtdt . 0 1,2 If u is an adapted process in D (H) then, from property (9), we obtain  Z ∞  Z ∞ Dt usdBs = ut + DtusdBs, (11) 0 t because Dtus = 0 if t > s.

14 2.6 Isonormal Gaussian processes So far, we have developed the Malliavin calculus with respect to Brownian motion. In this R ∞ case, the Wiener integral B(h) = 0 h(t)dBt gives rise to a centered Gaussian family indexed 2 by the Hilbert space H = L (R+). More generally, consider a separable Hilbert space H with scalar product h·, ·iH . An isonormal Gaussian process is a centered Gaussian family H1 = {W (h), h ∈ H} satisfying

E(W (h)W (g)) = hh, giH ,

2 for any h, g ∈ H. Observe that H1 is a Gaussian subspace of L (Ω). The Malliavin calculus can be developed in the framework of an isonormal Gaussian 2 process, and all the notions and properties that do not depend on the fact that H = L (R+) can be extended to this more general context.

3 Multiple stochastic . Wiener chaos

In this section we present the Wiener chaos expansion, which provides an orthogonal de- composition of random variables in L2(Ω) in terms of multiple stochastic integrals. We then compute the derivative and the divergence operators on the Wiener chaos expansion.

3.1 Multiple Stochastic Integrals

Recall that B = (Bt)t≥0 is a Brownian motion defined on a probability space (Ω, F,P ) such 2 n that F is generated by B. Let Ls(R+) be the space of symmetric square integrable functions n n f : R+ → R. If f : R+ → R, we define its symmetrization by 1 X f˜(t , . . . , t ) = f(t , . . . , t ), 1 n n! σ(1) σ(n) σ where the sum runs over all permutations σ of {1, 2, . . . , n}. Observe that

kf˜k 2 n ≤ kfk 2 n . L (R+) L (R+)

2 n Definition 3.1. The multiple stochastic integral of f ∈ Ls(R+) is defined as the iterated stochastic integral

Z ∞ Z tn Z t2

In(f) = n! ··· f(t1, . . . , tn)dBt1 ··· dBtn . 0 0 0 2 Note that if f ∈ L (R+), I1(f) = B(f) is the Wiener integral of f. 2 n If f ∈ L (R+) is not necessarily symmetric, we define

In(f) = In(f˜).

Using the properties of Itˆo’sstochastic integral, one can easily check the following isometry 2 n 2 m property: for all n, m ≥ 1, f ∈ L (R+), and g ∈ L (R+ ), ( 0 if n 6= m, E(In(f)Im(g)) = (12) n!hf,˜ g˜i 2 n if n = m. L (R+)

15 2 n Next, we want to compute the product of two multiple integrals. Let f ∈ Ls(R+) and 2 m g ∈ Ls(R+ ). For any r = 0, . . . , n ∧ m, we define the contraction of f and g of order r to be 2 n+m−2r the element of L (R+ ) defined by

(f ⊗r g)(t1, . . . , tn−r, s1, . . . , sm−r) Z = f(t1, . . . , tn−r, x1, . . . , xr)g(s1, . . . , sm−r, x1, . . . , xr)dx1 ··· dxr. r R+

We denote by f ⊗˜ r g the symmetrization of f ⊗r g. Then, the product of two multiple stochastic integrals satisfies the following formula: n∧m X nm I (f)I (g) = r! I (f ⊗ g). (13) n m r r n+m−2r r r=0 The next result gives the relation between multiple stochastic integrals and Hermite poly- nomials. 2 Proposition 3.1. For any g ∈ L (R+), we have   ⊗n n B(g) I (g ) = kgk 2 H , n L ( +) n R kgk 2 L (R+) ⊗n where g (t1, . . . , tn) = g(t1) ··· g(tn).

Proof. We can assume that kgk 2 = 1. We proceed by induction over n. The case n = 1 L (R+) is immediate. We then assume that the result holds for 1, . . . , n. Using the product rule (13), the induction hypothesis, and the recursive relation for the Hermite polynomials, we get ⊗(n+1) ⊗n ⊗(n−1) In+1(g ) = In(g )I1(g) − nIn−1(g )

= Hn(B(g))B(g) − nHn−1(B(g))

= Hn+1(B(g)), which concludes the proof.

The next result is the Wiener chaos expansion. Theorem 3.2. Every F ∈ L2(Ω) can be uniquely expanded into a sum of multiple stochastic integrals as follows: ∞ X F = E(F ) + In(fn), n=1 2 n where fn ∈ Ls(R+). 2 For any n ≥ 1, we denote by Hn the closed subspace of L (Ω) formed by all multiple stochastic integrals of order n. For n = 0, H0 is the space of constants. Observe that H1 2 coincides with the isonormal Gaussian process {B(f), f ∈ L (R+)}. Then Theorem 3.2 can be reformulated by saying that we have the orthogonal decomposition 2 ∞ L (Ω) = ⊕n=0Hn. Proof of Theorem 3.2. It suffices to show that if a random variable G ∈ L2(Ω) is orthogonal ∞ to ⊕n=0Hn then G = 0. This assumption implies that G is orthogonal to all random variables k 2 of the form B(g) , where g ∈ L (R+), k ≥ 0. This in turn implies that G is orthogonal to all the exponentials exp(B(h)), which form a total set in L2(Ω). So G = 0.

16 3.2 Derivative operator on the Wiener chaos Let us compute the derivative of a multiple stochastic integral. 2 n 1,2 Proposition 3.2. Let f ∈ Ls(R+). Then In(f) ∈ D and

DtIn(f) = nIn−1(f(·, t)).

⊗n Proof. Assume that f = g , with kgk 2 = 1. Then, using Proposition 3.1 and the L (R+) properties of Hermite polynomials, we have

0 DtIn(f) = Dt(Hn(B(g))) = Hn(B(g))Dt(B(g)) = nHn−1(B(g))g(t) ⊗(n−1) = ng(t)In−1(g ) = nIn−1(f(·, t)). The general case follows using linear combinations and a density argument. This finishes the proof.

Moreover, applying (12), we have  Z  Z 2 2 2 E (DtIn(f)) dt = n E(In−1(f(·, t)) )dt R+ R+ Z 2 2 = n (n − 1)! kf(·, t)k 2 n−1 dt L (R+ ) R+ 2 = nn!kfk 2 n L (R+) 2 = nE(In(f) ). (14) As a consequence of Proposition 3.2 and (14), we deduce the following result. 2 P∞ Proposition 3.3. Let F ∈ L (Ω) with Wiener chaos expansion F = n=0 In(fn). Then 1,2 F ∈ D if and only if ∞ 2 X 2 (kDF k ) = nn!kfnk 2 n < ∞, E H L (R+) n=1 and in this case ∞ X DtF = nIn−1(fn(·, t)). n=1 k,2 Similarly, if k ≥ 2, one can show that F ∈ D if and only if ∞ X k 2 n n!kfnk 2 n < ∞, L (R+) n=1 and in this case ∞ X Dk F = n(n − 1) ··· (n − k + 1)I (f (· , t , . . . , t )), t1,...,tk n−k n 1 k n=k

2 k ∞,2 where the series converges in L (Ω × R+). As a consequence, if F ∈ D then the following formula, due to Stroock, allows us to compute explicitly the kernels in the Wiener chaos expansion of F : 1 f = (DnF ). (15) n n!E

17 3 Example 3.3. Consider F = B1 . Then 3 2 f1(t1) = E(Dt1 B1 ) = 3E(B1 )1[0,1](t1) = 31[0,1](t1), f (t , t ) = 1 (D2 B3) = 3 (B )1 (t ∨ t ) = 0, 2 1 2 2 E t1,t2 1 E 1 [0,1] 1 2 f (t , t , t ) = 1 (D3 B3) = 1 (t ∨ t ∨ t ), 3 1 2 3 6 E t1,t2,t3 1 [0,1] 1 2 3 and we obtain the Wiener chaos expansion

Z 1 Z t1 Z t2 3 B1 = 3B1 + 6 dBt1 dBt2 dBt3 . 0 0 0 1,2 Proposition 3.3 implies the following characterization of the space D . Proposition 3.4. Let F ∈ L2(Ω). Assume that there exists an element u ∈ L2(Ω; H) such that, for all G ∈ S and h ∈ H, the following duality formula holds:

E(hu, hiH G) = E(F δ(Gh)). (16) 1,2 Then F ∈ D and DF = u. P∞ 2 n Proof. Let F = n=0 In(fn), where fn ∈ Ls(R+). By the duality formula (Proposition 2.1) and Proposition 3.2, we obtain ∞ ∞ X X E(F δ(Gh)) = E(In(fn)δ(Gh)) = E(hD(In(fn)), hiH G) n=0 n=0 ∞ X = E(hnIn−1(fn(·, t)), hiH G). n=1 Then, by (16), we get ∞ X E(hnIn−1(fn(·, t)), hiH G) = E(hu, hiH G), n=1 P∞ 2 which implies that the series n=1 nIn−1(fn(·, t)) converges in L (Ω; H) and its sum coincides with u. Proposition 3.3 allows us to conclude the proof.

1,2 Corollary 3.4. Let (Fn)n≥1 be a sequence of random variables in D that converges to F in L2(Ω) and is such that 2 sup E(kDFnkH ) < ∞. n 1,2 Then F belongs to D and the sequence of derivatives (DFn)n≥1 converges to DF in the weak topology of L2(Ω; H).

Proof. The assumptions imply that there exists a subsequence (Fn(k))k≥1 such that the se- 2 quence of derivatives (DFn(k))k≥1 converges in the weak topology of L (Ω; H) to some element α ∈ L2(Ω; H). By Proposition 3.4, it suffices to show that, for all G ∈ S and h ∈ H,

E(hα, hiH G) = E(F δ(Gh)). (17) By the duality formula (Proposition 2.1), we have

E(hDFn(k), hiH G) = E(Fn(k)δ(Gh)). Then, taking the limit as k tends to infinity, we obtain (17), which concludes the proof.

18 The next proposition shows that the indicator function of a set A ∈ F such that 0 < 1,2 P (A) < 1 does not belong to D . Proposition 3.5. Let A ∈ F and suppose that the indicator function of A belongs to the 1,2 space D . Then, P (A) is zero or one. Proof. Consider a continuously differentiable function ϕ with compact support, such that ϕ(x) = x2 for each x ∈ [0, 1]. Then, by Proposition 2.4, we can write

2 D1A = D[(1A) ] = D[ϕ(1A)] = 21AD1A.

Therefore D1A = 0 and, from Proposition 3.3, we deduce that 1A = P (A), which completes the proof.

3.3 Divergence on the Wiener chaos We now compute the divergence operator on the Wiener chaos expansion. A square integrable 2 stochastic process u = (ut)t≥0 ∈ L (Ω × R+) has an orthogonal expansion of the form

∞ X ut = In(fn(·, t)), n=0

2 n+1 where f0(t) = E(ut) and, for each n ≥ 1, fn ∈ L (R+ ) is a symmetric function in the first n variables.

Proposition 3.6. The process u belongs to the domain of δ if and only if the series

∞ X δ(u) = In+1(f˜n) (18) n=0

converges in L2(Ω).

Proof. Suppose that G = In(g) is a multiple stochastic integral of order n ≥ 1, where g is symmetric. Then Z  E(hu, DGiH ) = E In−1(fn−1(·, t))nIn−1(g(·, t)) dt R+ Z = n(n − 1)! hfn−1(·, t), g(·, t)i 2 n−1 dt L (R+ ) R+

= n!hf , gi 2 n = n!hf˜ , gi 2 n n−1 L (R+) n−1 L (R+) ˜ ˜ = E(In(fn−1)In(g)) = E(In(fn−1)G).

If u ∈ Dom δ, we deduce that ˜ E(δ(u)G) = E(In(fn−1)G)

for every G ∈ Hn. This implies that In(f˜n−1) coincides with the projection of δ(u) on the nth Wiener chaos. Consequently, the series in (18) converges in L2(Ω) and its sum is equal to δ(u). The converse can be proved by similar arguments.

19 4 Ornstein-Uhlenbeck semigroup. Meyer inequalities

In this section we describe the main properties of the Ornstein–Uhlenbeck semigroup and its generator. We then give the relationship between the Malliavin derivative, the divergence operator, and the Ornstein–Uhlenbeck semigroup generator.

4.1 Mehler’s formula

Let B = (Bt)t≥0 be a Brownian motion on a probability space (Ω, F,P ) such that F is generated by B. Let F be a random variable in L2(Ω) with the Wiener chaos decomposition P∞ 2 n F = n=0 In(fn), fn ∈ Ls(R+).

Definition 4.1. The Ornstein–Uhlenbeck semigroup is the one-parameter semigroup (Tt)t≥0 of operators on L2(Ω) defined by

∞ X −nt Tt(F ) = e In(fn). n=0 An alternative and useful expression for the Ornstein–Uhlenbeck semigroup is Mehler’s formula:

0 0 Proposition 4.1. Let B = (Bt)t≥0 be an independent copy of B. Then, for any t ≥ 0 and F ∈ L2(Ω), we have 0 −t p −2t 0 Tt(F ) = E (F (e B + 1 − e B )), (19) 0 0 where E denotes the mathematical expectation with respect to B .

Proof. Both Tt in Definition 4.1 and the right-hand side of (19) give rise to linear contraction operators on Lp(Ω), for all p ≥ 1. For the first operator, this is clear. For the second, using Jensen’s inequality it follows that, for any p ≥ 1,

p 0 −t p −2t 0 p E(|Tt(F )| ) = E(|E (F (e B + 1 − e B ))| ) 0 −t p 0 p p ≤ E(E (|F (e B + 1 − e−2tB )| )) = E(|F | ). Thus, it suffices to show (19) for random variables of the form 1 2 R F = exp λB(h) − λ , where B(h) = htdBt, h ∈ H, is an element of norm one, and 2 R+ λ ∈ R. We have, using formula (1),    0 −t p −2t 0 1 2 E exp e λB(h) + 1 − e λB (h) − 2 λ ∞   X λn = exp e−tλB(h) − 1 e−2tλ2 = e−nt H (B(h)) = T F, 2 n! n t n=0 because ∞ X λn F = H (B(h)) n! n n=0 ⊗n and Hn(B(h)) = In(h ) (see Proposition 3.1). This completes the proof.

20 Mehler’s formula implies that the operator Tt is nonnegative. Moreover, Tt is symmetric, that is, ∞ X −nt E(GTt(F )) = E(FTt(G)) = e E(In(fn)In(gn)), n=0 P∞ P∞ where F = n=0 In(fn) and G = n=0 In(gn). The Ornstein–Uhlenbeck semigroup has the following hypercontractivity property Theorem 4.2. Let F ∈ Lp(Ω), p > 1, and q(t) = e2t(p − 1) + 1 > p, t > 0. Then

kTtF kq(t) ≤ kF kp. As a consequence of the hypercontractivity property, for any 1 < p < q < ∞ the norms 2t k · kp and k · kq are equivalent on any Wiener chaos Hn. In fact, putting q = e (p − 1) + 1 > p with t > 0, we obtain, for every F ∈ Hn, −nt e kF kq = kTtF kq ≤ kF kp, which implies that q − 1n/2 kF k ≤ kF k . (20) q p − 1 p Moreover, for any n ≥ 1 and 1 < p < ∞, the orthogonal projection onto the nth Wiener p chaos Jn is bounded in L (Ω), and ( (p − 1)n/2kF k if p > 2, kJ F k ≤ p (21) n p −n/2 (p − 1) kF kp if p < 2. In fact, suppose first that p > 2 and let t > 0 be such that p − 1 = e2t. Using the hypercon- tractivity property with exponents p and 2, we obtain nt nt nt nt kJnF kp = e kTtJnF kp ≤ e kJnF k2 ≤ e kF k2 ≤ e kF kp. If p < 2, we have nt kJnF kp = sup E((JnF )G) ≤ kF kp sup kJnGkq ≤ e kF kp, kGkq≤1 kGkq≤1 where q is the conjugate of p, and q − 1 = e2t. As an application we can establish the following lemma. Lemma 4.3. Fix an integer k ≥ 1 and a real number p > 1. Then, there exists a constant k,2 cp,k such that, for any random variable F ∈ D , k kE(D F )kH⊗k ≤ cp,kkF kp. P∞ k Proof. Suppose that F = n=0 In(fn). Then, by Stroock’s formula (15), E(D F ) = k!fk. Therefore, √ k kE(D F )kH⊗k = k!kfkkH⊗k = k!kJkF k2. From (20) we obtain −k/2 kJkF k2 ≤ (p − 1) ∧ 1 kJkF kp. Finally, applying (21) we get sign(p−2)k/2 kJkF kp ≤ (p − 1) kF kp, which concludes the proof.

21 The next result can be regarded as a regularizing property of the Ornstein–Uhlenbeck semigroup.

Proposition 4.2. Let F ∈ Lp(Ω) for some p > 1. Then, for any t > 0, we have that 1,p TtF ∈ D and there exists a constant cp such that

−1/2 kDTtF kLp(Ω;H) ≤ cpt kF kp. (22)

Proof. Consider a sequence of smooth and cylindrical random variables Fn ∈ S which con- p p 1,p verges to F in L (Ω). We know that TtFn converges to TtF in L (Ω). We have TtFn ∈ D , and using Mehler’s formula (19), we can write

 0 −t p −2t 0  D(Tt(Fn − Fm)) = D E ((Fn − Fm)(e B + 1 − e )B ) −t e 0  0 −t p −2t 0  = √ E D ((Fn − Fm)(e B + 1 − e )B )) . 1 − e−2t Then, Lemma 4.3 implies that

e−t kD(Tt(Fn − Fm))kLp(Ω;H) ≤ cp,1 √ kFn − Fmkp. 1 − e−2t

p 1,p Hence, DTtFn is a Cauchy sequence in L (Ω; H). Therefore, TtF ∈ D and DTtF is the p limit in L (Ω; H) of DTtFn. The estimate (22) follows by the same arguments. With the above ingredients, we can show an extension of Corollary 3.4 to any p > 1.

1,p p Proposition 4.3. Let Fn ∈ D be a sequence of random variables converging to F in L (Ω) for some p > 1. Suppose that sup kFnk1,p < ∞. n 1,p Then F ∈ D .

Proof. The assumptions imply that there exists a subsequence (Fn(k))k≥1 such that the se- q quence of derivatives (DFn(k))k≥1 converges in the weak topology of L (Ω; H) to some element q α ∈ L (Ω; H), where 1/p + 1/q = 1. By Proposition 4.2, for any t > 0, we have that TtF 1,p p q belongs to D and DTtFn(k) converges to DTtF in L (Ω; H). Then, for any β ∈ L (Ω; H), we can write

−t E(hDTtF, βiH ) = lim E(hDTtFn(k), βiH ) = lim e E(hTtDFn(k), βiH ) k→∞ k→∞ −t −t = lim e E(hDFn(k),TtβiH ) = e E(hα, TtβiH ) k→∞ −t = E(he Ttα, βiH ).

−t p Therefore, DTtF = e Ttα. This implies that DTtF converges to α as t ↓ 0 in L (Ω; H). 1.p Using that D is a closed operator, we conclude that F ∈ D and DF = α.

22 4.2 Generator of the Ornstein-Uhlenbeck semigroup The generator of the Ornstein–Uhlenbeck semigroup in L2(Ω) is the operator given by T F − F LF = lim t , t↓0 t and the domain of L is the set of random variables F ∈ L2(Ω) for which the above limit exists 2 P∞ 2 n in L (Ω). It is easy to show that a random variable F = n=0 In(fn), fn ∈ Ls(R+), belongs to the domain of L if and only if ∞ X 2 2 n kIn(fn)k2 < ∞; n=1 P∞ 2,2 and, in this case, LF = n=1 −nIn(fn). Thus, Dom L coincides with the space D . We also define the operator L−1, which is the pseudo-inverse of L, as follows. For every F ∈ L2(Ω), set ∞ X 1 LF = − I (f ). n n n n=1 −1 2,2 −1 Note that L is an operator with values in D and that LL F = F − E(F ), for any F ∈ L2(Ω), so L−1 acts as the inverse of L for centered random variables. The next proposition explains the relationship between the operators D, δ, and L. 2 1,2 Proposition 4.4. Let F ∈ L (Ω). Then, F ∈ Dom L if and only if F ∈ D and DF ∈ Dom δ and, in this case, we have δDF = −LF. P∞ 1,2 Proof. Let F = n=0 In(fn). Suppose first that F ∈ D and DF ∈ Dom δ. Then, for any random variable G = Im(gm), we have, using the duality relationship (Proposition 2.1),

(GδDF ) = (hDG, DF i ) = mm!hg , f i 2 m = (GmI (f )). E E H m m L (R+ ) E m m

Therefore, the projection of δDF onto the mth Wiener chaos is equal to mIm(fm). This P∞ 2 implies that the series n=1 nIn(fn) converges in L (Ω) and its sum is δDF . Therefore, F ∈ Dom L and LF = −δDF . 1,2 Conversely, suppose that F ∈ Dom L. Clearly, F ∈ D . Then, for any random variable 1,2 P∞ G ∈ D with Wiener chaos expansion G = n=0 In(gn), we have ∞ X (hDG, DF i ) = nn!hg , f i 2 n = − (GLF ). E H n n L (R+) E n=1 As a consequence, DF belongs to the domain of δ and δDF = −LF .

The operator L behaves as a second-order differential operator on smooth random vari- ables. Proposition 4.5. Suppose that F = (F 1,...,F m) is a random vector whose components 2,4 2 m belong to D . Let ϕ be a function in C (R ) with bounded first and second partial derivatives. Then, ϕ(F ) ∈ Dom L and m m X ∂2ϕ X ∂ϕ L(ϕ(F )) = (F )hDF i,DF ji + (F )LF i. ∂x ∂x H ∂x i,j=1 i j i=1 i

23 1,2 Proof. By the chain rule (see Proposition 2.4), ϕ(F ) belongs to D and

m X ∂ϕ D(ϕ(F )) = (F )DF i. ∂x i=1 i Moreover, by Proposition 4.4, ϕ(F ) belongs to Dom L and L(ϕ(F )) = −δ(D(ϕ(F ))). Using the factorization property of the divergence operator yields the result.

n In the finite-dimensional case (Ω = R equipped with the standard Gaussian law), L = n ∆−x·∇ coincides with the generator of the Ornstein–Uhlenbeck process (Xt)t≥0 in R , which is the solution to the stochastic differential equation √ dXt = 2dBt − Xtdt, where (Bt)t≥0 is an n-dimensional Brownian motion.

4.3 Meyer’s inequality The next theorem provides an estimate for the Lp(Ω)-norm of the divergence operator for p any p > 1. It was proved by Pisier, using the boundedness in L (R) of the Riesz transform. 1,p Theorem 4.4. For any p > 1, there exists a constant cp > 0 such that for any u ∈ D (H),   p p p E(|δ(u)| ) ≤ cp E(kDuk 2 2 ) + kE(u)kH . (23) L (R+)

1,p As a consequence of Theorem 4.4, the divergence operator is continuous from D (H) to Lp(Ω), and so we have Meyer’s inequality:

p  p p  p E(|δ(u)| ) ≤ cp E(kDuk 2 2 ) + E(kukH ) = cpkuk1,p,H . (24) L (R+) This result can be extended as follows.

k,p Theorem 4.5. For any p > 1, k ≥ 1, and u ∈ D (H),   k p p p kδ(u)kk−1,p ≤ ck,p E(kD uk 2 k+1 ) + E(kukH ) = ck,pkukk,p,H . L (R+ )

k,p k−1,p This implies that the operator δ is continuous from D (H) into D (H).

5 Stochastic integral representations. Clark-Ocone formula

This section deals with the following problem. Given a random variable F in L2(Ω), with E(F ) = 0, find a stochastic process u in Dom δ such that F = δ(u). We present two different answers to this question, both integral representations. The first is the Clark–Ocone formula, in which u is required to be adapted. Therefore, the process u is unique and its expression involves a conditional expectation of the Malliavin derivative of F . The second uses the inverse of the Ornstein–Uhlenbeck generator. We then present some applications of these integral representations.

24 5.1 Clark-Ocone formula

Let B = (Bt)t≥0 be a Brownian motion on a probability space (Ω, F,P ) such that F is generated by B, equipped with its Brownian filtration (Ft)t≥0. The next result expresses the integrand of the integral representation theorem of a square integrable random variable in terms of the conditional expectation of its Malliavin derivative.

1,2 2 Theorem 5.1 (Clark–Ocone formula). Let F ∈ D ∩ L (Ω, FT ,P ). Then F admits the following representation: Z T F = E(F ) + E(DtF |Ft)dBt. 0 Proof. By the Itˆointegral representation theorem, there exists a unique process adated process 2 2 u ∈ L (Ω × [0,T ]) such that F ∈ L (Ω, FT ,P ) admits the stochastic integral representation

Z T F = E(F ) + utdBt. 0

It suffices to show that ut = E(DtF |Ft) for almost all (t, ω) ∈ [0,T ] × Ω. Consider a process 2 v ∈ LT (P). On the one hand, the isometry property yields Z T E(δ(v)F ) = E(vsus)ds. 0 On the other hand, by the duality relationship (Proposition 2.1), and taking into account that v is progressively measurable,

 Z T  Z T E(δ(v)F ) = E vtDtF dt = E(vsE(DtF |Ft))dt. 0 0

Therefore, ut = E(DtF |Ft) for almost all (t, ω) ∈ [0,T ] × Ω, which concludes the proof. Consider the following simple examples of the application of this formula.

3 2 Example 5.2. Suppose that F = Bt . Then DsF = 3Bt 1[0,t](s) and

2 2 E(DsF |Fs) = 3E((Bt − Bs + Bs) |Fs) = 3(t − s + Bs ).

Therefore Z t 3 2 Bt = 3 (t − s + Bs )dBs. (25) 0 This formula should be compared with Itˆo’sformula,

Z t Z t 3 2 Bt = 3 Bs dBs + 3 Bsds. (26) 0 0 Notice that equation (25) contains only a stochastic integral but it is not a martingale, be- cause the integrand depends on t, whereas (26) contains two terms and one is a martingale. Moreover, the integrand in (25) is unique.

25 x Example 5.3. Consider the Brownian motion (Lt )t≥0,x∈R. For any ε > 0, we set

−1/2 −x2/(2ε) pε(x) = (2πε) e .

We have that, as  → 0, Z t 2 L (Ω) x Fε = pε(Bs − x)ds −→ Lt . (27) 0 Applying the derivative operator yields

Z t Z t 0 0 DrF = p(Bs − x)DrBsds = p(Bs − x)ds. 0 r Thus Z t 0 E(DrF|Fr) = E(p(Bs − Br + Br − x)|Fr)ds r Z t 0 = p+s−r(Br − x)ds. r As a consequence, taking the limit as  → 0, we obtain the following integral representation of the Brownian local time:

Z t x x Lt = E(Lt ) + ϕ(t − r, Br − x)dBr, 0 where Z r 0 ϕ(r, y) = ps(y)ds. 0

5.2 Second integral representation Recall that L is the generator of the Ornstein–Uhlenbeck semigroup.

1,2 Proposition 5.1. Let F be in D with E(F ) = 0. Then the process

u = −DL−1F belongs to Dom δ and satisfies F = δ(u). Moreover u ∈ L2(Ω; H) is unique among all square integrable processes with a chaos expansion

∞ X ut = Iq(fq(t)) q=0 such that fq(t, t1, . . . , tq) is symmetric in all q + 1 variables t, t1, . . . , tq. Proof. By Proposition 4.4, F = LL−1F = −δ(DL−1F ). Clearly, the process u = −DL−1F has a Wiener chaos expansion with functions symmetric 2 in all their variables. To show uniqueness, let v ∈ L (Ω; H) with a chaos expansion vt = P∞ q=0 Iq(gq(t)), such that the function gq(t, t1, . . . , tq) is symmetric in all q + 1 variables

26 1,2 t, t1, . . . , tq and such that δ(v) = F . Then, there exists a random variable G ∈ D such that DG = v. Indeed, it suffices to take

∞ X 1 G = I (g ). q + 1 q+1 q q=0

We claim that G = −L−1F . This follows from LG = −δDG = −δ(v) = −F . The proof is now complete.

It is important to notice that, unlike the Clark–Ocone formula, which requires that the underlying process is a Brownian motion, the representation provided in Proposition 5.1 holds in the context of a general Gaussian isonormal process.

6 Existence and regularity of densities. Density formulas

In this Section we apply Malliavin calculus to derive explicit formulas for the densities of random variables on Wiener space and to establish criteria for their regularity.

6.1 Analysis of densities in the one-dimensional case

We recall that B = (Bt)t≥0 is a Brownian motion on a probability space (Ω, F,P ) such that F is generated by B. The topological support of the law of a random variable F is defined as the set of points x ∈ R such that P (|x − F | < ) > 0 for all  > 0. 1,2 Our first result says that if a random variable F belongs to the Sobolev space D then the topological support of the law of F is a closed interval.

1,2 Proposition 6.1. Let F ∈ D . Then, the topological support of the law of F is a closed interval.

Proof. Clearly the topological support of the law of F is a closed set. Then, it suffices to show that it is connected. We show this by contradiction. If the topological support of the law of F is not connected, there exists a point a ∈ R and  > 0 such that P (a −  < F < a + ) = 0, P (F ≥ a + ) < 1, and P (F ≤ a − ) < 1. Let ϕ : R → R be an infinitely differentiable function such that ϕ(x) = 0 if x ≤ a −  and ϕ(x) = 1 if x ≥ a + . By Proposition 2.4, 1,2 ϕ(F ) ∈ D but, almost surely, ϕ(F ) = 1{F ≥a+}. Therefore, by Proposition 3.5, we must have P (F ≥ a + ) = 0 or P (F ≥ a + ) = 1, which leads to a contradiction.

1,2 If a random variable F belongs to D , and its derivative is not degenerate, then F has a density.

1,2 Proposition 6.2. Let F be a random variable in the space D such that kDF kH > 0 almost surely. Then, the law of F is absolutely continuous with respect to the Lebesgue measure on R. Proof. Replacing F by arctan F , we may assume that F takes values in (−1, 1). It suffices to R 1 show that, for any measurable function g :(−1, 1) → [0, 1] such that −1 g(y)dy = 0, we have E(g(F )) = 0. We can find a sequence of continuous functions gn :(−1, 1) → [0, 1] such that,

27 as n tends to infinity, gn(y) converges to g(y) for almost all y with respect to the measure −1 P ◦ F + `, where ` denotes the Lebesgue measure on R. Set Z x ψn(x) = gn(y)dy. −∞

2 Then, ψn(F ) converges to 0 almost surely and in L (Ω) because gn converges almost every- R 1 where to g, with respect to the Lebesgue measure, and −1 g(y)dy = 0. Furthermore, by the 1,2 chain rule (Proposition 2.4), ψn(F ) ∈ D and

D(ψn(F )) = gn(F )DF,

which converges almost surely and in L2(Ω) to g(F )DF . Because D is closed, we conclude that g(F )DF = 0. Our hypothesis kDF kH > 0 implies that g(F ) = 0 almost surely, and this finishes the proof.

The following result is an expression for the density of a random variable in the Sobolev 1,2 space D , assuming that kDF kH > 0 a.s. 1,2 Proposition 6.3. Let F be a random variable in the space D such that kDF kH > 0 a.s. 2 2 Suppose that DF/kDF kH belongs to the domain of the operator δ in L (Ω). Then the law of F has a continuous and bounded density, given by

  DF  p(x) = E 1{F >x}δ 2 . (28) kDF kH

∞ R y Proof. Let ψ be a nonnegative function in C0 (R), and set ϕ(y) = −∞ ψ(z)dz. 1,2 Then, by the chain rule (Proposition 2.4), ϕ(F ) belongs to D and we can write

2 hD(ϕ(F )),DF iH = ψ(F )kDF kH .

Using the duality formula (Proposition 2.1), we obtain

 DF     DF  E(ψ(F )) = E D(ϕ(F )) , 2 = E ϕ(F ) δ 2 . (29) kDF kH H kDF kH

By an approximation argument, equation (29) holds for ψ(y) = 1[a,b](y), where a < b. As a consequence, we can apply Fubini’s theorem to get

 Z F   DF  P (a ≤ F ≤ b) = E ψ(x)dx δ 2 −∞ kDF kH Z b   DF  = E 1{F >x}δ 2 dx, a kDF kH which implies the desired result.

1,p 2 Remark 6.1. Equation (28) still holds under the hypotheses F ∈ D and DF/kDF kH ∈ 1,p0 0 2,α D (H) for some p, p > 1. Sufficient conditions for these hypotheses are F ∈ D and −2β E(kDF k ) < ∞ with 1/α + 1/β < 1.

28 Example 6.2. Let F = B(h). Then DF = h and   DF −2 δ 2 = B(h)khkH . kDF kH As a consequence, formula (28) yields

−2  p(x) = khkH E 1{F >x}F , 2 which is true because p(x) is the density of the distribution N(0, khkH ). Applying equation (28) we can derive density estimates. Notice first that (28) holds if 1{F >x} is replaced by 1{F |x|)) δ 2 , (30) kDF kH p for all x ∈ R. Applying (30) and Meyer’s inequality (24), we can deduce the following result. Proposition 6.4. Let q, α, β be three positive real numbers such that 1/q + 1/α + 1/β = 1. 2,α −2β Let F be a random variable in the space D , such that E(kDF kH ) < ∞. Then, the density p(x) of F can be estimated as follows:

1/q p(x) ≤ cq,α,β (P (|F | > |x|))   −1 2 −2 × E(kDF kH ) + D F α 2 2 kDF kH . (31) L (Ω;L (R+)) β

6.2 Existence and smoothness of densities for random vectors 1 m i 1,2 Let F = (F ,...,F ) be such that F ∈ D for i = 1, . . . , m. We define the Malliavin matrix of F as the random symmetric nonnegative definite matrix

i j γF = (hDF ,DF iH )1≤i,j≤m. (32) 2 In the one-dimensional case, γF = kDF kH . The following theorem is a multidimensional version of Proposition 6.2.

Theorem 6.3. If det γF > 0 a.s. then the law of F is absolutely continuous with respect to m the Lebesgue measure on R . This theorem was proved by Bouleau and Hirsch using the co-area formula and techniques of geometric measure theory, and we omit the proof. As a consequence, the measure (det γF × P ) ◦ F −1 is always absolutely continuous; that is,

P (F ∈ B, det γF > 0) = 0,

m for any Borel set B ∈ B(R ) of zero Lebesgue measure. 1 m i 1,2 Definition 6.4. We say that a random vector F = (F ,...,F ) is nondegenerate if F ∈ D for i = 1, . . . , m and −p E((det γF ) ) < ∞, for all p ≥ 2.

29 k Set ∂i = ∂/∂xi and, for any multi-index α ∈ {1, . . . , m} , k ≥ 1, we denote by ∂α the k ∂ /(∂xα1 ··· ∂xαk ). ij 1,∞ Lemma 6.5. Let γ be an m × m random matrix such that γ ∈ D for all i, j and −p −1ij 1,∞ E (| det γ| ) < ∞ for all p ≥ 2. Then, γ belongs to D for all i, j, and

m ij X ik `j D γ−1 = − γ−1 γ−1 Dγk`. (33) k,`=1

Proof. It can be proved that P (det γ > 0) is zero or one. So, we can assume that det γ > 0 −1 −1 almost surely. For any  > 0, we define γ = (det γ + ) A(γ), where A(γ) is the adjoint −1 1,∞ p matrix of γ. Then, the entries of γ belong to D and converge in L (Ω), for all p ≥ 2, to −1 −1 those of γ as  tends to zero. Moreover, the entries of γ satisfy

−1 ij sup k(γ ) k1,p < ∞, ∈(0,1]

−1 1,p for all p ≥ 2. Therefore, by Proposition 4.3 the entries of γ belong to D for any p ≥ 2. −1 Finally, from the expression γ γ = (det γ/(det γ + ))Im, where Im denotes the identity matrix of order m, we deduce (33) on applying the derivative operator and letting  tend to zero.

The following result can be regarded as an integration-by-parts formula and plays a fun- damental role in the proof of the regularity of densities.

Proposition 6.5. Let F = (F 1,...,F m) be a nondegenerate random vector. Fix k ≥ 1 and i k+1,∞ ∞ ∞ m suppose that F ∈ D for i = 1, . . . , m. Let G ∈ D and let ϕ ∈ Cp (R ). Then, for any k ∞ multi-index α ∈ {1, . . . , m} , there exists an element Hα(F,G) ∈ D such that

E(∂αϕ(F )G) = E(ϕ(F )Hα(F,G)), (34)

where the elements Hα(F,G) are recursively given by

m X  −1ij j H(i)(F,G) = δ G γF DF j=1

and, for α = (α1, . . . , αk), k ≥ 2, we set

Hα(F,G) = Hαk (F,H(α1,...,αk−1)(F,G)).

Proof. By the chain rule (Proposition 2.4), we have

m m j X i j X ij hD(ϕ(F )),DF iH = ∂iϕ(F )hDF ,DF iH = ∂iϕ(F )γF i=1 i=1 and, consequently, m X j −1 ji ∂iϕ(F ) = hD(ϕ(F )),DF iH (γF ) . j=1

30 Taking expectations and using the duality relationship (Proposition 2.1) yields

E(∂iϕ(F )G) = E(ϕ(F )H(i)(F,G)),

Pm  −1ij j where H(i) = j=1 δ G γF DF . Notice that Meyer’s inequality (Theorem 4.4) and p Lemma 6.5 imply that H(i) belongs to L (Ω) for any p ≥ 2. We finish the proof with a recurrence argument.

One can show that, for any p > 1, there exist constants β, γ > 1 and integers n, m such that −1 m n kHα(F,G)kp ≤ cp,q det γF β kDF kk,γ kGkk,q . (35) The proof of this inequality is based on Meyer’s and H¨older’sinequalities. The following result is a multidimensional version of the density formula (28).

Proposition 6.6. Let F = (F 1,...,F m) be a nondegenerate random vector such that F i ∈ m+1,∞ D for i = 1, . . . , m. Then F has a continuous and bounded density given by

p(x) = E(1{F >x}Hα(F, 1)), (36) where α = (1, 2, . . . , m).

Proof. Recall that, for α = (1, 2, . . . , m)

Hα(F, 1) m X −1 1j1 j1 −1 2j2 j2 −1 mjm jm   = δ (γF ) DF δ (γF ) DF ··· δ (γF ) DF ··· . j1,...,jm=1

∞ m Then, equality (34) applied to the multi-index α = (1, 2, . . . , m) yields, for any ϕ ∈ Cp (R ),

E(∂αϕ(F )) = E(ϕ(F )Hα(F, 1)). Notice that Z F 1 Z F m ϕ(F ) = ··· ∂αϕ(x)dx. −∞ −∞ Hence, by Fubini’s theorem we can write Z E(∂αϕ(F )) = ∂αϕ(x)E(1{F >x}Hα(F, 1))dx. (37) Rm ∞ m ∞ m Given any function ψ ∈ C0 (R ), we can take ϕ ∈ Cp (R ) such that ψ = ∂αϕ, and (37) yields Z E(ψ(F )) = ψ(x)E(1{F >x}Hα(F, 1))dx, Rm which implies the result.

The following theorem is the basic criterion for the smoothness of densities.

1 m i ∞ Theorem 6.6. Let F = (F ,...,F ) be a nondegenerate random vector such that F ∈ D for all i = 1, . . . , m. Then the law of F possesses an infinitely differentiable density.

31 ∞ m Proof. For any multi-index β and any ϕ ∈ Cp (R ), we have, taking α = (1, 2, . . . , m),

E (∂β∂αϕ(F )) = E (ϕ(F )Hβ(F,Hα(F, 1)))) Z  = ∂αϕ(x) E 1{F >x}Hβ(F,Hα(F, 1)) dx. Rm ∞ m Hence, for any ξ ∈ C0 (R ), Z Z  ∂βξ(x)p(x)dx = ξ(x) E 1{F >x}Hβ(F,Hα(F, 1)) dx. Rm Rm Therefore, p(x) is infinitely differentiable and, for any multi-index β, we have

|β|  ∂βp(x) = (−1) E 1{F >x}Hβ(F, (Hα(F, 1)) .

This completes the proof.

6.3 Density formula using the Riesz transform In this section we present a method for obtaining a density formula using the Riesz transform, following the methodology introduced by Malliavin and extensively studied by Bally and Caremellino. In contrast with (36), here we only need two derivatives, instead of m + 1. m Let Qm be the fundamental solution to the Laplace equation ∆Qm = δ0 on R , m ≥ 2. That is, 1 Q (x) = a−1 ln ,Q (x) = a−1|x|2−m, m > 2, 2 2 |x| m m m where am is the area of the unit sphere in R . We know that, for any 1 ≤ i ≤ m, x ∂ Q (x) = −c i , (38) i m m |x|m

1 m where cm = 2(m − 2)/am if m > 2 and c2 = 2/a2. Notice that any function ϕ in C0 (R ) can be written as m X Z ϕ(x) = ∇ϕ ∗ ∇Qm(x) = ∂ϕ(x − y)∂iQm(y)dy. (39) m i=1 R Indeed, ∇ϕ ∗ ∇Qm(x) = ϕ ∗ ∆Qm(x) = ϕ(x). Theorem 6.7. Let F be an m-dimensional nondegenerate random vector whose components 2,∞ are in D . Then, the law of F admits a continuous and bounded density p given by m X p(x) = E(∂iQm(F − x)H(i)(F, 1)), i=1 where m X −1 ij j H(i)(F, 1) = δ (γF ) DF . j=1

32 1 m Proof. Let ϕ ∈ C0 (R ). Applying (39), we can write

m X  Z  E(ϕ(F )) = E ∂iQm(y)(∂iϕ(F − y))dy . m i=1 R

Assume that the support of ϕ is included in the ball BR(0) for some R > 1. Then, using (38) we obtain  Z   Z  E |∂iQm(y)∂iϕ(F − y)| dy ≤ k∂iϕk∞E |∂iQm(y)|dy Rm {y:|F |−R≤|y|≤|F |+R}  Z |F |+R  r m−1 ≤ cmVol(B1(0))k∂iϕk∞E m r dr |F |−R r = 2cmVol(B1(0))k∂iϕk∞RE(|F |) < ∞. As a consequence, Fubini’s theorem and (34) yield

m X Z E(ϕ(F )) = ∂iQm(y)E(∂iϕ(F − y))dy m i=1 R m X Z = ∂iQm(y)E(ϕ(F − y)H(i)(F, 1))dy m i=1 R m X Z = ϕ(y)E(∂iQm(F − y)H(i)(F, 1))dy. m i=1 R This completes the proof.

The approach based on the Riesz transform can also be used to obtain the following uniform estimate for densities, due to Stroock.

Lemma 6.8. Under the assumptions of Theorem 6.7, for any p > m there exists a constant c depending only on m and p such that  m kpk∞ ≤ c max kH(i)(F, 1)kp . 1≤i≤m

Proof. From m X p(x) = E(∂iQm(F − x)H(i)(F, 1)), i=1 applying H¨older’sinequality with 1/p + 1/q = 1 and the estimate (see (38))

1−m |∂iQm(F − x)| ≤ cm|F − x| yields 1/q   (1−m)q p(x) ≤ mcmA E |F − x| , (40) where A = max1≤i≤m kH(i)(F, 1)kp.

33 Suppose first that p is bounded and let M = supx∈R p(x). We can write, for any  > 0, Z (1−m)q (1−m)q (1−m)q E(|F − x| ) ≤  + |z − x| p(x)dx |z−x|≤ (1−m)q (p−m)/(p−1) ≤  + Cm,p M. (41)

Therefore, substituting (41) into (40), we get

 1−m 1/q (p−m)/p 1/q M ≤ Amcm  + Cm,p M .

1−1/m Now we minimize with respect to  and obtain M ≤ ACm,pM , for some constant Cm,p, m m which implies that M ≤ Cm,pA . If p is not bounded, we apply the procedure to p ∗ ψδ, where ψδ is an approximation of the identity, and let δ tend to zero at the end.

7 Malliavin Differentiability of Diffusion Processes. Proof of H¨ormander’stheorem

1 d Suppose that B = (Bt)t≥0, with Bt = (Bt ,...,Bt ), is a d-dimensional Brownian motion. Consider the m-dimensional stochastic differential equation

d X j dXt = σj(Xt)dBt + b(Xt)dt, (42) j=1

m m m with initial condition X0 = x0 ∈ R , where the coefficients σj, b: R → R , 1 ≤ j ≤ d are measurable functions. By definition, a solution to equation (42) is an adapted process X = (Xt)t≥0 such that, for any T > 0 and p ≥ 2,   p E sup |Xt| < ∞ t∈[0,T ] and X satisfies the integral equation

d Z t Z t X j Xt = x0 + σj(Xs)dBs + b(Xs)ds. (43) j=1 0 0

The following result is well known.

m m Theorem 7.1. Suppose that the coefficients σj, b: R → R , 1 ≤ j ≤ d, satisfy the Lipschitz m condition: for all x, y ∈ R ,

max (|σj(x) − σj(y)|, |b(x) − b(y)|) ≤ K|x − y|. (44) j

Then there exists a unique solution X to Equation (43).

When the coefficients in equation (42) are continuously differentiable, the components of the solution are differentiable in the Malliavin calculus sense.

34 1 m m Proposition 7.1. Suppose that the coefficients σj, b are in C (R ; R ) and have bounded i 1,∞ partial derivatives. Then, for all t ≥ 0 and i = 1, . . . , m, Xt ∈ D , and for r ≤ t and j = 1, . . . , d,

$$ D_r^j X_t = \sigma_j(X_r) + \sum_{k=1}^m \sum_{\ell=1}^d \int_r^t \partial_k \sigma_\ell(X_s)\, D_r^j X_s^k\, dB_s^\ell + \sum_{k=1}^m \int_r^t \partial_k b(X_s)\, D_r^j X_s^k\, ds. \qquad (45) $$

Proof. To simplify, we assume that $b = 0$. Consider the Picard approximations given by $X_t^{(0)} = x_0$ and
$$ X_t^{(n+1)} = x_0 + \sum_{j=1}^d \int_0^t \sigma_j(X_s^{(n)})\, dB_s^j $$
for $n \ge 0$. We will prove the following claim by induction on $n$:

Claim: $X_t^{(n),i} \in \mathbb{D}^{1,\infty}$ for all $i = 1, \dots, m$ and $t \ge 0$. Moreover, for all $p > 1$ and $t \ge 0$,
$$ \psi_n(t) := \sup_{0\le r\le t} E\Big( \sup_{s\in[r,t]} |D_r X_s^{(n)}|^p \Big) < \infty \qquad (46) $$
and, for all $T > 0$ and $t \in [0,T]$,
$$ \psi_{n+1}(t) \le c_1 + c_2 \int_0^t \psi_n(s)\, ds, \qquad (47) $$
for some constants $c_1, c_2$ depending on $T$. Clearly, the claim holds for $n = 0$. Suppose that it is true for $n$. Applying property (11) of the divergence operator and the chain rule (Proposition 2.4), for any $r \le t$, $i = 1, \dots, m$, and $\ell = 1, \dots, d$, we get

$$ D_r^\ell X_t^{(n+1),i} = D_r^\ell \Big( \sum_{j=1}^d \int_0^t \sigma_j^i(X_s^{(n)})\, dB_s^j \Big) = \sum_{j=1}^d \Big( \delta_{\ell,j}\, \sigma_\ell^i(X_r^{(n)}) + \int_r^t D_r^\ell \big( \sigma_j^i(X_s^{(n)}) \big)\, dB_s^j \Big) = \sum_{j=1}^d \Big( \delta_{\ell,j}\, \sigma_\ell^i(X_r^{(n)}) + \sum_{k=1}^m \int_r^t \partial_k \sigma_j^i(X_s^{(n)})\, D_r^\ell X_s^{(n),k}\, dB_s^j \Big). $$

From these equalities and condition (46) we see that $X_t^{(n+1),i} \in \mathbb{D}^{1,\infty}$, and using the Burkholder–Davis–Gundy and Hölder inequalities we obtain
$$ E\Big( \sup_{r\le s\le t} |D_r X_s^{(n+1)}|^p \Big) \le c_p \Big( \gamma_p + T^{(p-1)/2} K^p \int_r^t E\big( |D_r X_s^{(n)}|^p \big)\, ds \Big), \qquad (48) $$
where
$$ \gamma_p = \sup_{n,j} E\Big( \sup_{0\le t\le T} |\sigma_j(X_t^{(n)})|^p \Big) < \infty. $$

So (46) and (47) hold for $n+1$ and the claim is proved.

We know that
$$ E\Big( \sup_{s\le T} |X_s^{(n)} - X_s|^p \Big) \longrightarrow 0 $$
as $n$ tends to infinity. By Gronwall's lemma applied to (47), we deduce that the derivatives of the sequence $X_t^{(n),i}$ are bounded in $L^p(\Omega; H)$ uniformly in $n$ for all $p \ge 2$. This implies that the random variables $X_t^i$ belong to $\mathbb{D}^{1,\infty}$. Finally, applying the operator $D$ to equation (43), we deduce the linear stochastic differential equation (45) for the derivative of $X_t^i$. This completes the proof of the proposition.

Example 7.2. Consider the diffusion process in R

$$ dX_t = \sigma(X_t)\, dB_t + b(X_t)\, dt, \qquad X_0 = x_0, $$

where $\sigma$ and $b$ are globally Lipschitz functions in $C^1(\mathbb{R})$. Then, for all $t > 0$, $X_t$ belongs to $\mathbb{D}^{1,\infty}$ and the Malliavin derivative $(D_r X_t)_{r\le t}$ satisfies the following linear equation:
$$ D_r X_t = \sigma(X_r) + \int_r^t \sigma'(X_s)\, D_r X_s\, dB_s + \int_r^t b'(X_s)\, D_r X_s\, ds. $$
Therefore, by Itô's formula,

$$ D_r X_t = \sigma(X_r) \exp\Big( \int_r^t \sigma'(X_s)\, dB_s + \int_r^t \big( b'(X_s) - \tfrac{1}{2} (\sigma')^2(X_s) \big)\, ds \Big). $$
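As a quick numerical illustration of this closed-form expression (an added sketch; the coefficients, step size and seed are arbitrary choices, not from the text), one can integrate the equations for $X$ and for $D_r X_t$ by the Euler scheme along one Brownian path and compare with the exponential formula:

```python
# Sketch: Euler scheme for X and for the linear SDE of D_r X_t, compared with
# the closed-form expression sigma(X_r) * exp(...); illustrative coefficients.
import numpy as np

rng = np.random.default_rng(0)
sigma  = lambda x: 1.0 + 0.3 * np.sin(x)
dsigma = lambda x: 0.3 * np.cos(x)
b  = lambda x: -0.5 * x
db = lambda x: -0.5

T, n, r_idx = 1.0, 2000, 500            # we compute D_r X_T with r = r_idx * dt
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)
X = np.zeros(n + 1)
Z = 0.0                                 # Euler approximation of D_r X_t, t >= r
log_int = 0.0                           # accumulates the exponent of the closed form
for k in range(n):
    if k == r_idx:
        Z = sigma(X[k])                 # initial condition D_r X_r = sigma(X_r)
    if k >= r_idx:
        Z += dsigma(X[k]) * Z * dB[k] + db(X[k]) * Z * dt
        log_int += dsigma(X[k]) * dB[k] + (db(X[k]) - 0.5 * dsigma(X[k])**2) * dt
    X[k + 1] = X[k] + sigma(X[k]) * dB[k] + b(X[k]) * dt

closed_form = sigma(X[r_idx]) * np.exp(log_int)
print(Z, closed_form)                   # the two values agree up to discretization error
```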

Consider the $m \times m$ matrix-valued process defined by
$$ Y_t = I_m + \sum_{\ell=1}^d \int_0^t \partial\sigma_\ell(X_s)\, Y_s\, dB_s^\ell + \int_0^t \partial b(X_s)\, Y_s\, ds, $$

where $I_m$ denotes the identity matrix of order $m$ and $\partial\sigma_\ell$ denotes the $m \times m$ Jacobian matrix of the function $\sigma_\ell$; that is, $(\partial\sigma_\ell)_j^i = \partial_j \sigma_\ell^i$. In the same way, $\partial b$ denotes the $m \times m$ Jacobian matrix of $b$. If the coefficients of equation (43) are of class $C^{1+\alpha}$, $\alpha > 0$, then there is a version of the solution $X_t(x_0)$ to this equation that is continuously differentiable in $x_0$, and for which $Y_t$ is the Jacobian matrix $\partial X_t / \partial x_0$:

$$ Y_t = \frac{\partial X_t}{\partial x_0}. $$

Proposition 7.2. For any t ∈ [0,T ] the matrix Yt is invertible. Its inverse Zt satisfies

$$ Z_t = I_m - \sum_{\ell=1}^d \int_0^t Z_s\, \partial\sigma_\ell(X_s)\, dB_s^\ell - \int_0^t Z_s \Big( \partial b(X_s) - \sum_{\ell=1}^d \partial\sigma_\ell(X_s)\, \partial\sigma_\ell(X_s) \Big)\, ds. $$

Proof. By means of Itô's formula, one can check that $Z_t Y_t = Y_t Z_t = I_m$, which implies that $Z_t = Y_t^{-1}$. In fact,
$$ Z_t Y_t = I_m + \sum_{\ell=1}^d \int_0^t Z_s\, \partial\sigma_\ell(X_s)\, Y_s\, dB_s^\ell + \int_0^t Z_s\, \partial b(X_s)\, Y_s\, ds - \sum_{\ell=1}^d \int_0^t Z_s\, \partial\sigma_\ell(X_s)\, Y_s\, dB_s^\ell - \int_0^t Z_s \Big( \partial b(X_s) - \sum_{\ell=1}^d \partial\sigma_\ell(X_s)\, \partial\sigma_\ell(X_s) \Big) Y_s\, ds - \int_0^t Z_s \Big( \sum_{\ell=1}^d \partial\sigma_\ell(X_s)\, \partial\sigma_\ell(X_s) \Big) Y_s\, ds = I_m. $$

Similarly, we can show that YtZt = Im.

Lemma 7.3. The $m \times d$ matrix $(D_r X_t)_j^i = D_r^j X_t^i$ can be expressed as
$$ D_r X_t = Y_t Y_r^{-1} \sigma(X_r), \qquad (49) $$
where $\sigma$ denotes the $m \times d$ matrix with columns $\sigma_1, \dots, \sigma_d$.

Proof. It suffices to check that the process $\Phi_{t,r} := Y_t Y_r^{-1} \sigma(X_r)$, $t \ge r$, satisfies
$$ \Phi_{t,r} = \sigma(X_r) + \sum_{\ell=1}^d \int_r^t \partial\sigma_\ell(X_s)\, \Phi_{s,r}\, dB_s^\ell + \int_r^t \partial b(X_s)\, \Phi_{s,r}\, ds. $$
In fact,
$$ \sigma(X_r) + \sum_{\ell=1}^d \int_r^t \partial\sigma_\ell(X_s)\big( Y_s Y_r^{-1} \sigma(X_r) \big)\, dB_s^\ell + \int_r^t \partial b(X_s)\big( Y_s Y_r^{-1} \sigma(X_r) \big)\, ds = \sigma(X_r) + (Y_t - Y_r) Y_r^{-1} \sigma(X_r) = Y_t Y_r^{-1} \sigma(X_r). $$
This completes the proof.

Consider the Malliavin matrix of $X_t$, denoted by $\gamma_{X_t} := Q_t$ and given by
$$ Q_t^{i,j} = \sum_{\ell=1}^d \int_0^t D_s^\ell X_t^i\, D_s^\ell X_t^j\, ds. $$
That is, $Q_t = \int_0^t (D_s X_t)(D_s X_t)^T\, ds$. Equation (49) leads to
$$ Q_t = Y_t C_t Y_t^T, \qquad (50) $$
where
$$ C_t = \int_0^t Y_s^{-1}\, \sigma\sigma^T(X_s)\, (Y_s^{-1})^T\, ds. $$

Taking into account that Yt is invertible, the nondegeneracy of the matrix Qt will depend only on the nondegeneracy of the matrix Ct, which is called the reduced Malliavin matrix.

7.1 Absolute continuity under ellipticity conditions

Consider the stopping time defined by

$$ S = \inf\{ t > 0 : \det \sigma\sigma^T(X_t) \ne 0 \}. $$

Theorem 7.4. Let $(X_t)_{t\ge 0}$ be a diffusion process with $C^{1+\alpha}$ and Lipschitz coefficients. Then, for any $t > 0$, the law of $X_t$ conditioned by $\{t > S\}$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^m$.

Proof. It suffices to show that $\det C_t > 0$ a.s. on the set $\{S < t\}$. Suppose that $t > S$. For any $u \in \mathbb{R}^m$ with $|u| = 1$ we can write
$$ u^T C_t u = \int_0^t u^T Y_s^{-1}\, \sigma\sigma^T(X_s)\, (Y_s^{-1})^T u\, ds \ge \int_0^t \inf_{|v|=1} \big( v^T \sigma\sigma^T(X_s) v \big)\, |(Y_s^{-1})^T u|^2\, ds. $$

Notice that $\inf_{|v|=1} \big( v^T \sigma\sigma^T(X_s) v \big)$ is the smallest eigenvalue of $\sigma\sigma^T(X_s)$, which is strictly positive in an open interval contained in $[0,t]$ by the definition of the stopping time $S$ and because $t > S$. Furthermore, $|(Y_s^{-1})^T u| \ge |u|\, |Y_s|^{-1}$. Therefore we obtain

$$ u^T C_t u \ge k\, |u|^2 $$
for some positive random variable $k > 0$, which implies that the matrix $C_t$ is invertible. This completes the proof.

Example 7.5. Assume that $\sigma(x_0) \ne 0$ in Example 7.2. Then, for any $t > 0$, the law of $X_t$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$.

7.2 Regularity of the density under Hörmander's conditions

We need the following regularity result, whose proof is similar to that of Proposition 7.1 and is thus omitted.

Proposition 7.3. Suppose that the coefficients $\sigma_j$, $1 \le j \le d$, and $b$ of equation (42) are infinitely differentiable with bounded derivatives of all orders. Then, for all $t \ge 0$ and $i = 1, \dots, m$, $X_t^i$ belongs to $\mathbb{D}^\infty$.

Consider the following vector fields on $\mathbb{R}^m$:
$$ \sigma_j = \sum_{i=1}^m \sigma_j^i(x) \frac{\partial}{\partial x_i}, \quad j = 1, \dots, d, \qquad b = \sum_{i=1}^m b^i(x) \frac{\partial}{\partial x_i}. $$

The Lie bracket between the vector fields σj and σk is defined by

$$ [\sigma_j, \sigma_k] = \sigma_j \sigma_k - \sigma_k \sigma_j = \sigma_j^\nabla \sigma_k - \sigma_k^\nabla \sigma_j, $$

where
$$ \sigma_j^\nabla \sigma_k = \sum_{i,\ell=1}^m \sigma_j^\ell\, \partial_\ell \sigma_k^i\, \frac{\partial}{\partial x_i}. $$
Set
$$ \sigma_0 = b - \frac{1}{2} \sum_{\ell=1}^d \sigma_\ell^\nabla \sigma_\ell. $$

The vector field $\sigma_0$ appears when we write the stochastic differential equation (43) in terms of the Stratonovich integral (see Section 2.7) instead of Itô's integral:

$$ X_t = x_0 + \sum_{j=1}^d \int_0^t \sigma_j(X_s) \circ dB_s^j + \int_0^t \sigma_0(X_s)\, ds. $$

Let us introduce the nondegeneracy condition required for the smoothness of the density.

(HC) Hörmander's condition: The vector space spanned by the vector fields

$$ \sigma_1, \dots, \sigma_d,\quad [\sigma_i, \sigma_j],\ 0 \le i \le d,\ 1 \le j \le d,\quad [\sigma_i, [\sigma_j, \sigma_k]],\ 0 \le i, j, k \le d,\ \dots $$

at the point $x_0$ is $\mathbb{R}^m$.

For instance, if $m = d = 1$, $\sigma_1(x) = a(x)$ and $\sigma_0(x) = a_0(x)$, then Hörmander's condition means that $a(x_0) \ne 0$ or $a^{(n)}(x_0)\, a_0(x_0) \ne 0$ for some $n \ge 1$.

Theorem 7.6. Assume that Hörmander's condition holds. Then, for any $t > 0$, the random vector $X_t$ has an infinitely differentiable density.

This result can be considered as a probabilistic version of Hörmander's theorem on the hypoellipticity of second-order differential operators. In fact, the density $p_t$ of $X_t$ satisfies the Fokker–Planck equation
$$ \Big( -\frac{\partial}{\partial t} + L^* \Big) p_t = 0, $$
where
$$ L = \frac{1}{2} \sum_{i,j=1}^m (\sigma\sigma^T)^{ij} \frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^m b^i \frac{\partial}{\partial x_i}. $$
Then $p_t \in C^\infty(\mathbb{R}^m)$ means that $\partial/\partial t - L^*$ is hypoelliptic (Hörmander's theorem). For the proof of Theorem 7.6 we need several technical lemmas.
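Before turning to the lemmas, we note that condition (HC) is easy to check symbolically in concrete examples. The following sketch (an added illustration; the Kolmogorov-type system $dX_t^1 = dB_t$, $dX_t^2 = X_t^1\, dt$ and the sympy helper are choices made here, not part of the text) exhibits a degenerate diffusion for which the first bracket already restores full rank:

```python
# Symbolic check of Hormander's condition (HC) for dX^1 = dB_t, dX^2 = X^1 dt
# (m = 2, d = 1), with sigma1 = (1, 0) and Stratonovich drift sigma0 = (0, x1).
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
X = sp.Matrix([x1, x2])

def bracket(V, W):
    # Lie bracket [V, W] = (JW) V - (JV) W for column vector fields
    return W.jacobian(X) * V - V.jacobian(X) * W

sigma1 = sp.Matrix([1, 0])            # diffusion vector field
sigma0 = sp.Matrix([0, x1])           # Stratonovich drift vector field

fields = [sigma1, bracket(sigma0, sigma1)]          # sigma1 and [sigma0, sigma1]
span = sp.Matrix.hstack(*[V.subs({x1: 0, x2: 0}) for V in fields])
print(span.rank())   # 2: (HC) holds at x0 = (0, 0), although sigma1 alone is degenerate
```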

Lemma 7.7. Let $C$ be an $m \times m$ symmetric nonnegative definite random matrix. Assume that the entries $C^{ij}$ have moments of all orders and that for any $p \ge 2$ there exists $\epsilon_0(p)$ such that, for all $\epsilon \le \epsilon_0(p)$,
$$ \sup_{|v|=1} P\big( v^T C v \le \epsilon \big) \le \epsilon^p. $$
Then $E\big( (\det C)^{-p} \big) < \infty$ for all $p \ge 2$.

Proof. Let $\lambda = \inf_{|v|=1} v^T C v$ be the smallest eigenvalue of $C$. We know that $\lambda^m \le \det C$. Thus, it suffices to show that $E(\lambda^{-p}) < \infty$ for all $p \ge 2$. Set $|C| = \big( \sum_{i,j=1}^m (C^{ij})^2 \big)^{1/2}$. Fix

$\epsilon > 0$, and let $v_1, \dots, v_N$ be a finite set of unit vectors such that the balls centered at these points with radius $\epsilon^2/2$ cover the unit sphere $S^{m-1}$. Then we have
$$ P(\lambda < \epsilon) = P\Big( \inf_{|v|=1} v^T C v < \epsilon \Big) \le P\Big( \inf_{|v|=1} v^T C v < \epsilon,\ |C| \le \frac{1}{\epsilon} \Big) + P\Big( |C| > \frac{1}{\epsilon} \Big). \qquad (51) $$

Assume that $|C| \le 1/\epsilon$ and $v_k^T C v_k \ge 2\epsilon$ for all $k = 1, \dots, N$. For any unit vector $v$ there exists a $v_k$ such that $|v - v_k| \le \epsilon^2/2$, and we can deduce the following inequalities:

$$ v^T C v \ge v_k^T C v_k - |v^T C v - v_k^T C v_k| \ge 2\epsilon - \big( |v^T C v - v^T C v_k| + |v^T C v_k - v_k^T C v_k| \big) \ge 2\epsilon - 2 |C|\, |v - v_k| \ge \epsilon. $$

As a consequence, (51) implies that

$$ P(\lambda < \epsilon) \le P\Big( \bigcup_{k=1}^N \{ v_k^T C v_k < 2\epsilon \} \Big) + P\Big( |C| > \frac{1}{\epsilon} \Big) \le N (2\epsilon)^{p+2m} + \epsilon^p\, E(|C|^p) $$

if $\epsilon \le \frac{1}{2}\epsilon_0(p+2m)$. The number $N$ depends on $\epsilon$ but is bounded by a constant times $\epsilon^{-2m}$. Therefore, we obtain $P(\lambda < \epsilon) \le C\epsilon^p$ for all $\epsilon \le \epsilon_1(p)$ and all $p \ge 2$. This implies that $\lambda^{-1}$ has moments of all orders, which completes the proof of the lemma.

Lemma 7.8. Let $(Z_t)_{t\ge 0}$ be a real-valued, adapted, continuous process such that $Z_0 = z_0 \ne 0$. Suppose that there exist $\alpha > 0$ and $t_0 > 0$ such that, for all $p \ge 1$ and $t \in [0, t_0]$,
$$ E\Big( \sup_{0\le s\le t} |Z_s - z_0|^p \Big) \le C_p\, t^{p\alpha}. $$

Then, for all p ≥ 1 and t ≥ 0,

$$ E\Big( \Big( \int_0^t |Z_s|\, ds \Big)^{-p} \Big) < \infty. $$

Proof. We can assume that $t \in [0, t_0]$. For any $0 < \epsilon < t|z_0|/2$, we have

$$ P\Big( \int_0^t |Z_s|\, ds < \epsilon \Big) \le P\Big( \int_0^{2\epsilon/|z_0|} |Z_s|\, ds < \epsilon \Big) \le P\Big( \sup_{0\le s\le 2\epsilon/|z_0|} |Z_s - z_0| > \frac{|z_0|}{2} \Big) \le \Big( \frac{2}{|z_0|} \Big)^p C_p \Big( \frac{2\epsilon}{|z_0|} \Big)^{p\alpha}, $$
which implies the desired result.

The next lemma was proved by Norris, following the ideas of Stroock, and is the basic ingredient in the proof of Theorem 7.6.

Lemma 7.9 (Norris's lemma). Consider a continuous semimartingale of the form

$$ Y_t = y + \int_0^t a_s\, ds + \sum_{i=1}^d \int_0^t u_s^i\, dB_s^i, $$
where
$$ a_t = \alpha + \int_0^t \beta_s\, ds + \sum_{i=1}^d \int_0^t \gamma_s^i\, dB_s^i $$
and $c = E\big( \sup_{0\le t\le T} (|\beta_t| + |\gamma_t| + |a_t| + |u_t|)^p \big) < \infty$ for some $p \ge 2$. Fix $q > 8$. Then, for all $r < (q-8)/27$ there exists an $\epsilon_0$ such that, for all $\epsilon \le \epsilon_0$, we have

$$ P\Big( \int_0^T Y_t^2\, dt < \epsilon^q,\ \int_0^T (|a_t|^2 + |u_t|^2)\, dt \ge \epsilon \Big) \le c_1\, \epsilon^{rp}. $$

Proof of Theorem 7.6. The proof will be carried out in several steps.

Step 1. We need to show that, for all $t > 0$ and all $p \ge 2$, $E\big( (\det Q_t)^{-p} \big) < \infty$, where $Q_t$ is the Malliavin matrix of $X_t$. Taking into account that

$$ E\big( |\det Y_t^{-1}|^p + |\det Y_t|^p \big) < \infty, $$

it suffices to show that $E\big( (\det C_t)^{-p} \big) < \infty$ for all $p \ge 2$.

Step 2. Fix $t > 0$. Using Lemma 7.7, the problem reduces to showing that, for all $p \ge 2$, we have
$$ \sup_{|v|=1} P\big( v^T C_t v \le \epsilon \big) \le \epsilon^p $$
for any $\epsilon \le \epsilon_0(p)$, where the quadratic form associated with the matrix $C_t$ is given by

$$ v^T C_t v = \sum_{j=1}^d \int_0^t \langle v, Y_s^{-1} \sigma_j(X_s) \rangle^2\, ds. \qquad (52) $$

Step 3. Fix a smooth vector field $V$ on $\mathbb{R}^m$ and use Itô's formula to compute the differential of $Y_t^{-1} V(X_t)$:

$$ d\big( Y_t^{-1} V(X_t) \big) = \sum_{k=1}^d Y_t^{-1} [\sigma_k, V](X_t)\, dB_t^k + Y_t^{-1} \Big( [\sigma_0, V] + \frac{1}{2} \sum_{k=1}^d [\sigma_k, [\sigma_k, V]] \Big)(X_t)\, dt. \qquad (53) $$

Step 4. We introduce the following sets of vector fields:

$$ \Sigma_0 = \{ \sigma_1, \dots, \sigma_d \}, $$

$$ \Sigma_n = \{ [\sigma_k, V],\ k = 0, \dots, d,\ V \in \Sigma_{n-1} \} \quad \text{if } n \ge 1, \qquad \Sigma = \bigcup_{n=0}^\infty \Sigma_n, $$

and

$$ \Sigma_0' = \Sigma_0, \qquad \Sigma_n' = \Big\{ [\sigma_k, V],\ k = 1, \dots, d,\ V \in \Sigma_{n-1}';\ \ [\sigma_0, V] + \frac{1}{2} \sum_{j=1}^d [\sigma_j, [\sigma_j, V]],\ V \in \Sigma_{n-1}' \Big\} \quad \text{if } n \ge 1, \qquad \Sigma' = \bigcup_{n=0}^\infty \Sigma_n'. $$

We denote by $\Sigma_n(x)$ (resp. $\Sigma_n'(x)$) the subset of $\mathbb{R}^m$ obtained by freezing the variable $x$ in the vector fields of $\Sigma_n$ (resp. $\Sigma_n'$). Clearly, the vector spaces spanned by $\Sigma(x_0)$ or by $\Sigma'(x_0)$ coincide and, under Hörmander's condition, this vector space is $\mathbb{R}^m$. Therefore, there exists an integer $j_0 \ge 0$ such that the linear span of the set of vector fields $\bigcup_{j=0}^{j_0} \Sigma_j'(x)$ at the point $x_0$ has dimension $m$. As a consequence, there exist constants $R > 0$ and $c > 0$ such that

$$ \sum_{j=0}^{j_0} \sum_{V \in \Sigma_j'} \langle v, V(y) \rangle^2 \ge c \qquad (54) $$
for all $v$ and $y$ with $|v| = 1$ and $|y - x_0| < R$.

Step 5. For any $j = 0, 1, \dots, j_0$ we put $m(j) = 2^{-4j}$ and define the set
$$ E_j = \Big\{ \sum_{V \in \Sigma_j'} \int_0^t \langle v, Y_s^{-1} V(X_s) \rangle^2\, ds \le \epsilon^{m(j)} \Big\}. $$

Notice that $\{ v^T C_t v \le \epsilon \} = E_0$ because $m(0) = 1$. Consider the decomposition

$$ E_0 \subset (E_0 \cap E_1^c) \cup (E_1 \cap E_2^c) \cup \dots \cup (E_{j_0-1} \cap E_{j_0}^c) \cup F, $$
where $F = E_0 \cap E_1 \cap \dots \cap E_{j_0}$. Then, for any unit vector $v$, we have

$$ P(v^T C_t v \le \epsilon) = P(E_0) \le P(F) + \sum_{j=0}^{j_0-1} P(E_j \cap E_{j+1}^c). $$
We will now estimate each term in this sum.

Step 6. Let us first estimate $P(F)$. By the definition of $F$ we obtain

$$ P(F) \le P\Big( \sum_{j=0}^{j_0} \sum_{V \in \Sigma_j'} \int_0^t \langle v, Y_s^{-1} V(X_s) \rangle^2\, ds \le (j_0+1)\, \epsilon^{m(j_0)} \Big). $$

Then, taking into account (54), we can apply Lemma 7.8 to the process

$$ Z_s = \inf_{|v|=1} \sum_{j=0}^{j_0} \sum_{V \in \Sigma_j'} \langle v, Y_s^{-1} V(X_s) \rangle^2, $$

and we obtain
$$ E\Big( \Big( \inf_{|v|=1} \sum_{j=0}^{j_0} \sum_{V \in \Sigma_j'} \int_0^t \langle v, Y_s^{-1} V(X_s) \rangle^2\, ds \Big)^{-p} \Big) < \infty. $$

Therefore, for any $p \ge 1$, there exists $\epsilon_0$ such that

$$ P(F) \le \epsilon^p $$

for any $\epsilon < \epsilon_0$.

Step 7. For any $j = 0, \dots, j_0$, the probability of the event $E_j \cap E_{j+1}^c$ is bounded by the sum over $V \in \Sigma_j'$ of the probability that the two following events happen:

$$ \int_0^t \langle v, Y_s^{-1} V(X_s) \rangle^2\, ds \le \epsilon^{m(j)} $$
and

$$ \sum_{k=1}^d \int_0^t \langle v, Y_s^{-1} [\sigma_k, V](X_s) \rangle^2\, ds + \int_0^t \Big\langle v, Y_s^{-1} \Big( [\sigma_0, V] + \frac{1}{2} \sum_{j=1}^d [\sigma_j, [\sigma_j, V]] \Big)(X_s) \Big\rangle^2\, ds > \frac{\epsilon^{m(j+1)}}{n(j)}, $$

where $n(j)$ denotes the cardinality of the set $\Sigma_j'$.

Consider the continuous semimartingale $\big( \langle v, Y_s^{-1} V(X_s) \rangle \big)_{s\ge 0}$. From (53) we see that the quadratic variation of this semimartingale is equal to

$$ \sum_{k=1}^d \int_0^s \langle v, Y_r^{-1} [\sigma_k, V](X_r) \rangle^2\, dr, $$
and the bounded variation component is

$$ \int_0^s \Big\langle v, Y_r^{-1} \Big( [\sigma_0, V] + \frac{1}{2} \sum_{j=1}^d [\sigma_j, [\sigma_j, V]] \Big)(X_r) \Big\rangle\, dr. $$

Taking into account that $8 m(j+1) < m(j)$, from Norris's lemma (Lemma 7.9) applied to the semimartingale $Y_s = \langle v, Y_s^{-1} V(X_s) \rangle$, we get that, for any $p \ge 1$, there exists an $\epsilon_0 > 0$ such that
$$ P(E_j \cap E_{j+1}^c) \le \epsilon^p $$
for all $\epsilon \le \epsilon_0$. The proof of the theorem is now complete.

8 Stein’s method for normal approximation

The following lemma is a characterization of the standard normal distribution on the real line.

Lemma 8.1 (Stein's lemma). A random variable $X$ such that $E(|X|) < \infty$ has the standard normal distribution $N(0,1)$ if and only if, for any function $f \in C_b^1(\mathbb{R})$, we have
$$ E\big( f'(X) - f(X) X \big) = 0. \qquad (55) $$

Proof. Suppose first that $X$ has the standard normal distribution $N(0,1)$. Then equality (55) follows by integrating by parts and using that the density $p(x) = (1/\sqrt{2\pi}) \exp(-x^2/2)$ satisfies the differential equation $p'(x) = -x p(x)$.

Conversely, let $\varphi(\lambda) = E(e^{i\lambda X})$, $\lambda \in \mathbb{R}$, be the characteristic function of $X$. Because $X$ is integrable, we know that $\varphi$ is differentiable and $\varphi'(\lambda) = i E(X e^{i\lambda X})$. By our assumption, this is equal to $-\lambda \varphi(\lambda)$. Therefore, $\varphi(\lambda) = \exp(-\lambda^2/2)$, which concludes the proof.

If the expectation $E(f'(X) - f(X)X)$ is small for functions $f$ in some large set, we might conclude that the distribution of $X$ is close to the normal distribution. This is the main idea of Stein's method for normal approximation, and the goal is to quantify this assertion in a proper way. To do this, consider a random variable $X$ with the $N(0,1)$ distribution and fix a measurable function $h : \mathbb{R} \to \mathbb{R}$ such that $E(|h(X)|) < \infty$. Stein's equation associated with $h$ is the linear differential equation

$$ f_h'(x) - x f_h(x) = h(x) - E(h(X)), \quad x \in \mathbb{R}. \qquad (56) $$

Definition 8.2. A solution to equation (56) is an absolutely continuous function $f_h$ such that there exists a version of the derivative $f_h'$ satisfying (56) for every $x \in \mathbb{R}$.

The next result provides the existence of a unique solution to Stein's equation.

Proposition 8.1. The function
$$ f_h(x) = e^{x^2/2} \int_{-\infty}^x \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy \qquad (57) $$
is the unique solution of Stein's equation (56) satisfying

$$ \lim_{x\to\pm\infty} e^{-x^2/2} f_h(x) = 0. \qquad (58) $$
Proof. Equation (56) can be written as

$$ e^{x^2/2} \frac{d}{dx} \Big( e^{-x^2/2} f_h(x) \Big) = h(x) - E(h(X)). $$
This implies that any solution to equation (56) is of the form
$$ f_h(x) = c\, e^{x^2/2} + e^{x^2/2} \int_{-\infty}^x \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy, $$

for some $c \in \mathbb{R}$. Taking into account that
$$ \lim_{x\to\pm\infty} \int_{-\infty}^x \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy = 0, $$
the asymptotic condition (58) is satisfied if and only if $c = 0$.

Notice that, since $\int_{\mathbb{R}} \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy = 0$, we have
$$ \int_{-\infty}^x \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy = - \int_x^\infty \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy. \qquad (59) $$

Proposition 8.2. Let h: R → [0, 1] be a measurable function. Then the solution to Stein’s equation fh given by (57) satisfies

$$ \|f_h\|_\infty \le \sqrt{\frac{\pi}{2}} \qquad \text{and} \qquad \|f_h'\|_\infty \le 2. \qquad (60) $$

Proof. Taking into account that $|h(x) - E(h(X))| \le 1$, where $X$ has law $N(0,1)$, we obtain
$$ |f_h(x)| \le e^{x^2/2} \int_{|x|}^\infty e^{-y^2/2}\, dy \le \sqrt{\frac{\pi}{2}}, $$

because the function $x \mapsto e^{x^2/2} \int_{|x|}^\infty e^{-y^2/2}\, dy$ attains its maximum at $x = 0$. To prove the second estimate, observe that, in view of (59), we can write
$$ f_h'(x) = h(x) - E(h(X)) + x\, e^{x^2/2} \int_{-\infty}^x \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy = h(x) - E(h(X)) - x\, e^{x^2/2} \int_x^\infty \big( h(y) - E(h(X)) \big) e^{-y^2/2}\, dy, $$

for every $x \in \mathbb{R}$. Therefore
$$ |f_h'(x)| \le 1 + |x|\, e^{x^2/2} \int_{|x|}^\infty e^{-y^2/2}\, dy \le 2. $$

This completes the proof.
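For $h = \mathbf{1}_{(-\infty,z]}$ the integral in (57) can be evaluated explicitly in terms of the standard normal distribution function $\Phi$, giving $f_h(x) = \sqrt{2\pi}\, e^{x^2/2}\, \Phi(x \wedge z)\big( 1 - \Phi(x \vee z) \big)$. The following sketch (an added numerical check; $z = 0$ and the grid are illustrative choices) verifies the bounds (60):

```python
# Numerical check of the bounds (60) for the Stein solution with h = 1_{(-inf, z]}.
import numpy as np
from scipy.stats import norm

z = 0.0                                    # illustrative choice of indicator level
x = np.linspace(-8, 8, 8001)
fh = (np.sqrt(2 * np.pi) * np.exp(x**2 / 2)
      * norm.cdf(np.minimum(x, z)) * (1 - norm.cdf(np.maximum(x, z))))
fhp = np.gradient(fh, x)                   # numerical derivative of f_h

print(np.abs(fh).max(), np.sqrt(np.pi / 2))   # ~0.627 <= 1.253
print(np.abs(fhp).max())                      # <= 2 (the jump of h at z is smoothed by the grid)
```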

8.1 Total variation and convergence in law

Let $F_n$ be a sequence of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$.

Definition 8.3. We say that $F_n \xrightarrow{\mathcal{L}} F$ if $E[g(F_n)] \to E[g(F)]$ for any $g : \mathbb{R} \to \mathbb{R}$ continuous and bounded.

We know that $F_n \xrightarrow{\mathcal{L}} F$ if and only if $P(F_n \le z) \to P(F \le z)$ for any point $z \in \mathbb{R}$ of continuity of the distribution function of $F$.

The total variation distance between two probabilities $\nu_1$ and $\nu_2$ on $\mathbb{R}$ is defined as

$$ d_{TV}(\nu_1, \nu_2) = \sup_{B \in \mathcal{B}(\mathbb{R})} |\nu_1(B) - \nu_2(B)|. $$

Then, $d_{TV}(P \circ F_n^{-1}, P \circ F^{-1}) \to 0$ is strictly stronger than the convergence in law $F_n \xrightarrow{\mathcal{L}} F$. Using Stein's method, we can prove the following result.

Proposition 8.3. Let $\nu$ be a probability on $\mathbb{R}$. Then,
$$ d_{TV}(\nu, \gamma) \le \sup_{\phi \in \mathcal{F}_{TV}} \Big| \int_{\mathbb{R}} \big[ \phi'(x) - x\phi(x) \big]\, \nu(dx) \Big|, $$
where
$$ \mathcal{F}_{TV} = \Big\{ \phi \in C^1(\mathbb{R}) : \|\phi\|_\infty \le \sqrt{\frac{\pi}{2}},\ \|\phi'\|_\infty \le 2 \Big\}. $$

Proof. Let $h : \mathbb{R} \to [0,1]$ be a continuous function and let $\phi_h$ be the solution to Stein's equation associated with $h$, that is,

$$ h(x) - E[h(Z)] = \phi_h'(x) - x\, \phi_h(x). $$

Integrating with respect to $\nu$ yields
$$ \Big| \int_{\mathbb{R}} h\, d\nu - \int_{\mathbb{R}} h\, d\gamma \Big| = \Big| \int_{\mathbb{R}} \big[ \phi_h'(x) - x\phi_h(x) \big]\, \nu(dx) \Big| \le \sup_{\phi \in \mathcal{F}_{TV}} \Big| \int_{\mathbb{R}} \big[ \phi'(x) - x\phi(x) \big]\, \nu(dx) \Big|. $$

This inequality holds for any measurable $h : \mathbb{R} \to [0,1]$, because we can approximate $h$ by continuous functions almost everywhere with respect to the measure $\nu + \gamma$. Taking $h = \mathbf{1}_B$, we obtain the result.

9 Central limit theorems and Malliavin calculus

Let $(B_t)_{t\in[0,T]}$ be a Brownian motion defined on the Wiener space $(\Omega, \mathcal{F}, P)$. The following result connects Stein's method with Malliavin calculus.

Theorem 9.1 (Nourdin–Peccati). Suppose that $F \in \mathbb{D}^{1,2}$ satisfies $F = \delta(u)$, where $u$ belongs to the domain in $L^2$ of the divergence operator $\delta$. Then,

$$ d_{TV}(P_F, \gamma) \le 2\, E\big[ |1 - \langle DF, u \rangle_H| \big]. $$

Proof. It follows from

$$ E[F \phi(F)] = E[\delta(u)\, \phi(F)] = E\big[ \langle u, D(\phi(F)) \rangle_H \big] = E\big[ \phi'(F)\, \langle u, DF \rangle_H \big]. $$

Therefore,

$$ \big| E[\phi'(F)] - E[F \phi(F)] \big| = \big| E\big[ \phi'(F)\, (1 - \langle DF, u \rangle_H) \big] \big| $$

$$ \le 2\, E\big[ |1 - \langle DF, u \rangle_H| \big] $$
for any $\phi \in \mathcal{F}_{TV}$.

Suppose that $F = \int_0^T u_s\, dB_s$, where $u$ is an adapted measurable process in $\mathbb{D}^{1,2}(H)$. Then,
$$ D_t F = u_t + \int_t^T D_t u_s\, dB_s, $$

and
$$ \langle u, DF \rangle_H = \|u\|_H^2 + \int_0^T \Big( \int_t^T D_t u_s\, dB_s \Big) u_t\, dt. $$
As a consequence,

$$ d_{TV}(P_F, \gamma) \le 2\, E\big[ |1 - \|u\|_H^2| \big] + 2\, E\Big[ \Big| \int_0^T \Big( \int_t^T D_t u_s\, dB_s \Big) u_t\, dt \Big| \Big] \le 2\, E\big[ |1 - \|u\|_H^2| \big] + 2 \Big( \int_0^T E\Big( \Big( \int_0^s u_t D_t u_s\, dt \Big)^2 \Big)\, ds \Big)^{1/2}. $$

Proposition 9.1. A sequence $F_n = \int_0^T u_s^{(n)}\, dB_s$, where $u^{(n)}$ is progressively measurable and $u^{(n)} \in \mathbb{D}^{1,2}(H)$, converges in total variation to the law $N(0,1)$ if:

(i) $\|u^{(n)}\|_H^2 \to 1$ in $L^1(\Omega)$, and

(ii) $\int_0^T E\big( \big( \int_0^s u_t^{(n)} D_t u_s^{(n)}\, dt \big)^2 \big)\, ds \to 0$.

Example 1. The previous proposition can be applied to the following example:
$$ u_t^{(n)} = \sqrt{2n}\, t^n \exp\big( B_t(1-t) \big)\, \mathbf{1}_{[0,1]}(t). $$
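The following simulation sketch (added here; the expression for $u^{(n)}$ is as reconstructed above, and the mesh and sample sizes are illustrative) approximates $F_n$ by a forward Riemann sum and compares its empirical law with $N(0,1)$:

```python
# Sketch: simulate F_n = int_0^1 u_t^{(n)} dB_t by a forward Euler sum and
# compare the empirical distribution with N(0,1); parameters are illustrative.
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(2)
n, m, N = 50, 1000, 4000                   # chaos parameter, time steps, samples
t = np.linspace(0.0, 1.0, m + 1)[:-1]      # left endpoints of the partition
dt = 1.0 / m

dB = rng.normal(0.0, np.sqrt(dt), size=(N, m))
B = np.cumsum(dB, axis=1) - dB             # B_{t_k} at left endpoints (B_0 = 0)
u = np.sqrt(2 * n) * t**n * np.exp(B * (1 - t))
F = (u * dB).sum(axis=1)                   # forward (Ito) Riemann sums

print(F.mean(), F.var())                   # approximately 0 and 1
print(kstest(F, 'norm'))                   # p-value typically not small for large n
```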

For a general centered $F \in \mathbb{D}^{1,2}$, in Theorem 9.1 we can take $u = -DL^{-1}F$, because

$$ F = L L^{-1} F = -\delta D L^{-1} F, $$

and we obtain
$$ d_{TV}(P_F, \gamma) \le 2\, E\big[ |1 - \langle DF, -DL^{-1}F \rangle_H| \big]. $$

If $E[F^2] = \sigma^2 > 0$ and we take $\gamma_\sigma = N(0, \sigma^2)$, we can derive the following inequality:
$$ d_{TV}(F, \gamma_\sigma) \le \frac{2}{\sigma^2}\, E\big[ |\sigma^2 - \langle DF, u \rangle_H| \big]. $$
Proof.

$$ d_{TV}(F, \gamma_\sigma) = \sup_{B \in \mathcal{B}(\mathbb{R})} |P(F \in B) - \gamma_\sigma(B)| = \sup_{B \in \mathcal{B}(\mathbb{R})} |P(\sigma^{-1} F \in \sigma^{-1} B) - \gamma(\sigma^{-1} B)| = \sup_{B \in \mathcal{B}(\mathbb{R})} |P(\sigma^{-1} F \in B) - \gamma(B)| \le \frac{2}{\sigma^2}\, E\big[ |\sigma^2 - \langle DF, u \rangle_H| \big]. $$

9.1 Normal approximation on a fixed Wiener chaos

Recall that for any $F \in \mathbb{D}^{1,2}$ such that $E[F] = 0$,
$$ d_{TV}(P_F, \gamma_\sigma) \le \frac{2}{\sigma^2}\, E\big[ |\sigma^2 - \langle DF, -DL^{-1}F \rangle_H| \big]. $$

Proposition 9.2. Suppose $F \in \mathcal{H}_q$ for some $q \ge 2$ and $E(F^2) = \sigma^2$. Then,

$$ d_{TV}(P_F, \gamma_\sigma) \le \frac{2}{q\sigma^2} \sqrt{ \operatorname{Var}\big( \|DF\|_H^2 \big) }. $$

Proof. Using $L^{-1}F = -\frac{1}{q} F$ and $E[\|DF\|_H^2] = q\sigma^2$, we obtain
$$ E\big[ |\sigma^2 - \langle DF, -DL^{-1}F \rangle_H| \big] = E\Big[ \Big| \sigma^2 - \frac{1}{q} \|DF\|_H^2 \Big| \Big] \le \frac{1}{q} \sqrt{ \operatorname{Var}\big( \|DF\|_H^2 \big) }. $$

Proposition 9.3. Suppose that $F = I_q(f) \in \mathcal{H}_q$, $q \ge 2$. Then,
$$ \operatorname{Var}\big( \|DF\|_H^2 \big) \le \frac{(q-1)q}{3} \big( E(F^4) - 3\sigma^4 \big) \le (q-1)\, \operatorname{Var}\big( \|DF\|_H^2 \big). $$
Proof. This proposition is a consequence of the following two formulas:
$$ \operatorname{Var}\big( \|DF\|_H^2 \big) = \sum_{r=1}^{q-1} r^2 (r!)^2 \binom{q}{r}^4 (2q-2r)!\, \| f \widetilde{\otimes}_r f \|_{H^{\otimes(2q-2r)}}^2. \qquad (61) $$

Proof of (61): We have $D_t F = q\, I_{q-1}(f(\cdot, t))$, and using the product formula for multiple stochastic integrals we obtain

$$ \|DF\|_H^2 = q^2 \int_0^T I_{q-1}(f(\cdot,t))^2\, dt = q^2 \sum_{r=0}^{q-1} r! \binom{q-1}{r}^2 I_{2q-2r-2}(f \widetilde{\otimes}_{r+1} f) = q^2 \sum_{r=1}^{q} (r-1)! \binom{q-1}{r-1}^2 I_{2q-2r}(f \widetilde{\otimes}_r f) = q\, q!\, \|f\|_{H^{\otimes q}}^2 + q^2 \sum_{r=1}^{q-1} (r-1)! \binom{q-1}{r-1}^2 I_{2q-2r}(f \widetilde{\otimes}_r f). \qquad (62) $$
Then (61) follows from the isometry property of multiple integrals. The second formula is the following one:

$$ E[F^4] - 3\sigma^4 = \frac{3}{q} \sum_{r=1}^{q-1} r (r!)^2 \binom{q}{r}^4 (2q-2r)!\, \| f \widetilde{\otimes}_r f \|_{H^{\otimes(2q-2r)}}^2. \qquad (63) $$

Proof of (63): Using that $-L^{-1}F = \frac{1}{q} F$ and $L = -\delta D$, we can write
$$ E[F^4] = E[F^3 \times F] = E\big[ (-\delta D L^{-1} F)\, F^3 \big] = E\big[ \langle -DL^{-1}F, D(F^3) \rangle_H \big] = \frac{1}{q}\, E\big[ \langle DF, D(F^3) \rangle_H \big] = \frac{3}{q}\, E\big[ F^2 \|DF\|_H^2 \big]. \qquad (64) $$
By the product formula of multiple integrals,

$$ F^2 = I_q(f)^2 = q!\, \|f\|_{H^{\otimes q}}^2 + \sum_{r=0}^{q-1} r! \binom{q}{r}^2 I_{2q-2r}(f \widetilde{\otimes}_r f). \qquad (65) $$
Then (63) follows from (64), (65), (62) and the isometry property of multiple integrals.

9.2 Fourth Moment theorem

Stein's method combined with Malliavin calculus leads to a simple proof of the Fourth Moment theorem:

Theorem 9.2. Fix $q \ge 2$. Let $F_n = I_q(f_n) \in \mathcal{H}_q$, $n \ge 1$, be such that
$$ \lim_{n\to\infty} E(F_n^2) = \sigma^2. $$
The following conditions are equivalent:

(i) $F_n \xrightarrow{\mathcal{L}} N(0, \sigma^2)$, as $n \to \infty$.

(ii) $E(F_n^4) \to 3\sigma^4$, as $n \to \infty$.

(iii) $\|DF_n\|_H^2 \to q\sigma^2$ in $L^2(\Omega)$, as $n \to \infty$.

(iv) For all $1 \le r \le q-1$, $f_n \otimes_r f_n \to 0$, as $n \to \infty$.

This theorem constitutes a drastic simplification of the method of moments.

Proof. First notice that (i) implies (ii) because, for any $p > 2$, the hypercontractivity property of the Ornstein–Uhlenbeck semigroup implies
$$ \sup_n \|F_n\|_p \le (p-1)^{q/2} \sup_n \|F_n\|_2 < \infty. $$
The equivalence of (ii) and (iii) follows from the previous proposition, and these conditions imply (i), with convergence in total variation. The fact that (iv) implies (ii) and (iii) is a consequence of $\|f_n \widetilde{\otimes}_r f_n\| \le \|f_n \otimes_r f_n\|$. Let us show that (ii) implies (iv). From (65) we get
$$ E[F_n^4] = \sum_{r=0}^{q} (r!)^2 \binom{q}{r}^4 (2q-2r)!\, \|f_n \widetilde{\otimes}_r f_n\|_{H^{\otimes(2q-2r)}}^2 = (2q)!\, \|f_n \widetilde{\otimes} f_n\|_{H^{\otimes 2q}}^2 + \sum_{r=1}^{q-1} (r!)^2 \binom{q}{r}^4 (2q-2r)!\, \|f_n \widetilde{\otimes}_r f_n\|_{H^{\otimes(2q-2r)}}^2 + (q!)^2 \|f_n\|_{H^{\otimes q}}^4. $$
Then, we use the fact that $(2q)!\, \|f_n \widetilde{\otimes} f_n\|_{H^{\otimes 2q}}^2$ equals $2(q!)^2 \|f_n\|_H^4$ plus a linear combination of the terms $\|f_n \otimes_r f_n\|_{H^{\otimes(2q-2r)}}^2$, with $1 \le r \le q-1$, to conclude that

$$ \|f_n \otimes_r f_n\|_{H^{\otimes(2q-2r)}} \to 0, \quad 1 \le r \le q-1. $$
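A quick numerical illustration of the theorem (an added sketch, not part of the text): the variables $F_n = (2n)^{-1/2} \sum_{k=1}^n (\xi_k^2 - 1)$, with $\xi_k$ i.i.d. $N(0,1)$, lie in the second chaos, satisfy $E[F_n^2] = 1$ and, by a direct cumulant computation, $E[F_n^4] = 3 + 12/n$, so condition (ii) holds as $n \to \infty$:

```python
# Sketch: empirical fourth moments of a second-chaos sequence; E[F_n^4] = 3 + 12/n.
import numpy as np

rng = np.random.default_rng(3)
N = 50_000
for n in (1, 5, 50, 200):
    xi = rng.normal(size=(N, n))
    F = (xi**2 - 1).sum(axis=1) / np.sqrt(2 * n)   # element of H_2 with E[F^2] = 1
    print(n, (F**4).mean())                        # approaches 3 = E[Z^4], Z ~ N(0,1)
```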

9.3 Multivariate Gaussian approximation

The next result is a multivariate extension of the Fourth Moment theorem.

Theorem 9.3 (Peccati–Tudor '05). Let $d \ge 2$ and $1 \le q_1 < \dots < q_d$. Consider random vectors
$$ F_n = (F_n^1, \dots, F_n^d) = \big( I_{q_1}(f_n^1), \dots, I_{q_d}(f_n^d) \big), $$

where $f_n^i \in L_s^2([0,T]^{q_i})$. Suppose that, for any $1 \le i \le d$,

$$ \lim_{n\to\infty} E[(F_n^i)^2] = \sigma_i^2. $$
Then, the following two conditions are equivalent:

(i) $F_n \xrightarrow{\mathcal{L}} N_d(0, \Sigma)$, where $\Sigma$ is a diagonal matrix such that $\Sigma_{ii} = \sigma_i^2$.

(ii) For every $i = 1, \dots, d$, $F_n^i \xrightarrow{\mathcal{L}} N(0, \sigma_i^2)$.

Note that the convergence of the marginal distributions implies the joint convergence to a random vector with independent components.

9.4 Chaotic Central Limit Theorem

Theorem 9.4. Let $F_n = \sum_{q=1}^\infty I_q(f_{q,n})$, $n \ge 1$. Suppose that:

(i) For all $q \ge 1$, $q!\, \|f_{q,n}\|^2 \to \sigma_q^2$ as $n \to \infty$.

(ii) For all $q \ge 2$ and $1 \le r \le q-1$, $f_{q,n} \otimes_r f_{q,n} \to 0$ as $n \to \infty$.

(iii) $q!\, \|f_{q,n}\|^2 \le \delta_q$, where $\sum_q \delta_q < \infty$.

Then, as $n$ tends to infinity,

$$ F_n \xrightarrow{\mathcal{L}} N(0, \sigma^2), \qquad \text{where } \sigma^2 = \sum_{q=1}^\infty \sigma_q^2. $$

Assuming (i), condition (ii) is equivalent to (ii)': $\lim_{n\to\infty} E(I_q(f_{q,n})^4) = 3\sigma_q^4$, $q \ge 2$. The theorem implies the convergence in law of the whole sequence $(I_q(f_{q,n}),\ q \ge 1)$ to an infinite-dimensional Gaussian vector with independent components.

9.5 Breuer–Major theorem

A function $f \in L^2(\mathbb{R}, \gamma)$ has Hermite rank $d \ge 1$ if

$$ f(x) = \sum_{q=d}^\infty a_q H_q(x), \qquad a_d \ne 0. $$

For example, $f(x) = |x|^p - \int_{\mathbb{R}} |x|^p\, d\gamma(x)$, with $p > 0$ not an even integer, has Hermite rank 2 (the function is even, so the coefficient of $H_1$ vanishes, while the coefficient of $H_2$ does not). Let $X = \{X_k, k \in \mathbb{Z}\}$ be a centered stationary Gaussian sequence with unit variance. Set $\rho(v) = E[X_0 X_v]$ for $v \in \mathbb{Z}$.

Theorem 9.5 (Breuer–Major '83). Let $f \in L^2(\mathbb{R}, \gamma)$ with Hermite rank $d \ge 1$ and assume $\sum_{v\in\mathbb{Z}} |\rho(v)|^d < \infty$. Then,
$$ V_n := \frac{1}{\sqrt{n}} \sum_{k=1}^n f(X_k) \xrightarrow{\mathcal{L}} N(0, \sigma^2) $$
as $n \to \infty$, where $\sigma^2 = \sum_{q=d}^\infty q!\, a_q^2 \sum_{v\in\mathbb{Z}} \rho(v)^q$.

Proof. From the chaotic Central Limit Theorem, it suffices to consider the case $f = a_q H_q$, $q \ge d$. There exists a sequence $\{e_k, k \ge 1\}$ in $H = L^2([0,T])$ such that

$$ \langle e_k, e_j \rangle_H = \rho(k-j). $$

The sequence $\{B(e_k)\}$ has the same law as $\{X_k\}$, and we may replace $V_n$ by
$$ G_n = \frac{a_q}{\sqrt{n}} \sum_{k=1}^n H_q(B(e_k)) = I_q(f_{q,n}), $$
where $f_{q,n} = \frac{a_q}{\sqrt{n}} \sum_{k=1}^n e_k^{\otimes q}$. We can write
$$ q!\, \|f_{q,n}\|_{H^{\otimes q}}^2 = \frac{q!\, a_q^2}{n} \sum_{k,j=1}^n \rho(k-j)^q = q!\, a_q^2 \sum_{v\in\mathbb{Z}} \rho(v)^q \Big( 1 - \frac{|v|}{n} \Big) \mathbf{1}_{\{|v| < n\}}, $$

so that
$$ E[G_n^2] = q!\, \|f_{q,n}\|_{H^{\otimes q}}^2 \longrightarrow q!\, a_q^2 \sum_{v\in\mathbb{Z}} \rho(v)^q = \sigma^2. $$
Applying the Fourth Moment Theorem, it suffices to show that, for $r = 1, \dots, q-1$,

$$ f_{q,n} \otimes_r f_{q,n} = \frac{a_q^2}{n} \sum_{k,j=1}^n \rho(k-j)^r\, e_k^{\otimes(q-r)} \otimes e_j^{\otimes(q-r)} \longrightarrow 0. $$
We have
$$ \| f_{q,n} \otimes_r f_{q,n} \|_{H^{\otimes(2q-2r)}}^2 = \frac{a_q^4}{n^2} \sum_{i,j,k,\ell=1}^n \rho(k-j)^r \rho(i-\ell)^r \rho(k-i)^{q-r} \rho(j-\ell)^{q-r}. $$

Using $|\rho(k-j)^r \rho(k-i)^{q-r}| \le |\rho(k-j)|^q + |\rho(k-i)|^q$, we obtain
$$ \| f_{q,n} \otimes_r f_{q,n} \|_{H^{\otimes(2q-2r)}}^2 \le 2 a_q^4 \Big( \sum_{k\in\mathbb{Z}} |\rho(k)|^q \Big) \Big( n^{-1+\frac{r}{q}} \sum_{|i|\le n} |\rho(i)|^r \Big) \Big( n^{-1+\frac{q-r}{q}} \sum_{|j|\le n} |\rho(j)|^{q-r} \Big). $$
Then, it suffices to show that, for $r = 1, \dots, q-1$,

$$ n^{-1+\frac{r}{q}} \sum_{|i|\le n} |\rho(i)|^r \longrightarrow 0. $$

This follows from Hölder's inequality. Indeed, for a fixed $\delta \in (0,1)$, we have the estimates

$$ n^{-1+\frac{r}{q}} \sum_{|i|\le [n\delta]} |\rho(i)|^r \le n^{-1+\frac{r}{q}} (2[n\delta]+1)^{1-\frac{r}{q}} \Big( \sum_{i\in\mathbb{Z}} |\rho(i)|^q \Big)^{\frac{r}{q}} \le c\, \delta^{1-\frac{r}{q}}, $$

and r   q −1+ r X r X q n q |ρ(i)| ≤  |ρ(i)|  . [nδ]<|i|≤n [nδ]<|i|≤n The first term converges to zero as δ tends to zero and the second one converges to zero for fixed δ as n → ∞.

9.6 Fractional Brownian motion

The fractional Brownian motion (fBm) $B^H = (B_t^H)_{t\ge 0}$ is a zero-mean Gaussian process with covariance
$$ E(B_s^H B_t^H) = R_H(s,t) = \frac{1}{2} \big( s^{2H} + t^{2H} - |t-s|^{2H} \big). $$
$H \in (0,1)$ is called the Hurst parameter. The covariance formula implies $E(B_t^H - B_s^H)^2 = |t-s|^{2H}$. As a consequence, for any $\gamma < H$, with probability one, the trajectories $t \mapsto B_t^H(\omega)$ are Hölder continuous of order $\gamma$:

$$ |B_t^H(\omega) - B_s^H(\omega)| \le G_{\gamma,T}(\omega)\, |t-s|^\gamma, \quad s, t \in [0,T]. $$

For $H = \frac{1}{2}$, $B^{1/2}$ is a Brownian motion.

Properties of the fractional Brownian motion:

1) The fractional Brownian motion has the following self-similarity property. For all $a > 0$, the processes $\{ a^{-H} B_{at}^H,\ t \ge 0 \}$ and $\{ B_t^H,\ t \ge 0 \}$ have the same probability distribution (they are fractional Brownian motions with Hurst parameter $H$).

2) Unlike Brownian motion, the fractional Brownian motion has correlated increments. More precisely, for $H \ne \frac{1}{2}$ we can write

$$ \rho(n) = E\big( B_1^H (B_{n+1}^H - B_n^H) \big) = \frac{1}{2} \big( (n+1)^{2H} + (n-1)^{2H} - 2 n^{2H} \big) \sim H(2H-1)\, n^{2H-2}, \quad \text{as } n \to \infty. $$

If $H > \frac{1}{2}$, then $\rho(n) > 0$ and $\sum_n \rho(n) = \infty$ (long memory). If $H < \frac{1}{2}$, then $\rho(n) < 0$ (intermittency) and $\sum_n |\rho(n)| < \infty$.

3) The fractional Brownian motion has finite $\frac{1}{H}$-variation: Fix $T > 0$. Set $t_i = \frac{iT}{n}$ for $1 \le i \le n$ and define $\Delta B_{t_i}^H = B_{t_i}^H - B_{t_{i-1}}^H$. Then, as $n \to \infty$,

$$ \sum_{i=1}^n |\Delta B_{t_i}^H|^{1/H} \xrightarrow{L^2(\Omega),\ \text{a.s.}} c_H\, T, $$

where $c_H = E\big[ |B_1^H|^{1/H} \big]$.

Proof. By self-similarity, $\sum_{i=1}^n |\Delta B_{t_i}^H|^{1/H}$ has the same law as

$$ \frac{T}{n} \sum_{i=1}^n |B_i^H - B_{i-1}^H|^{1/H}. $$

The sequence $\{ B_i^H - B_{i-1}^H,\ i \ge 1 \}$ is stationary and ergodic. Therefore, the Ergodic Theorem implies the desired convergence.

9.6.1 Fractional noise

Let $X_k = B_k^H - B_{k-1}^H$. The sequence $\{X_k, k \ge 1\}$ is Gaussian, stationary and centered, with covariance
$$ \rho(n) = \frac{1}{2} \big( |n+1|^{2H} + |n-1|^{2H} - 2|n|^{2H} \big). $$
We have $\rho(n) \sim H(2H-1)\, n^{2H-2}$ as $n \to \infty$. Then, for any integer $q \ge 2$ such that $H < 1 - \frac{1}{2q}$, we have
$$ \sum_{v\in\mathbb{Z}} |\rho(v)|^q < \infty $$
and the Breuer–Major theorem implies

$$ \frac{1}{\sqrt{n}} \sum_{k=1}^n H_q(B_k^H - B_{k-1}^H) \xrightarrow{\mathcal{L}} N(0, \sigma_{H,q}^2), $$
where $\sigma_{H,q}^2 = q! \sum_{v\in\mathbb{Z}} \rho(v)^q$.

9.6.2 CLT for the q-variation of the fBm

For a real $q \ge 1$, set $c_q = E[|Z|^q]$, where $Z \sim N(0,1)$. The Breuer–Major theorem leads to the following convergence:

Theorem 9.6. Suppose $H < \frac{1}{2}$ and $q$ is not an even integer. As $n \to \infty$ we have

$$ \frac{1}{\sqrt{n}} \sum_{k=1}^n \Big[ n^{qH} \big| B_{k/n}^H - B_{(k-1)/n}^H \big|^q - c_q \Big] \xrightarrow{\mathcal{L}} N(0, \tilde{\sigma}_{H,q}^2). $$

Proof. Use that $|x|^q - c_q$ has Hermite rank 2, as in the example above.

9.6.3 Rate of convergence for the quadratic variation

Define, for $n \ge 1$,
$$ S_n = \sum_{k=1}^n (\Delta_{k,n} B^H)^2, $$
where $\Delta_{k,n} B^H = B_{k/n}^H - B_{(k-1)/n}^H$. Then,

$$ n^{2H-1} S_n \xrightarrow{\text{a.s.}} 1 $$

as $n$ tends to infinity. In fact, by the self-similarity property, $n^{2H-1} S_n$ has the same law as $\frac{1}{n} \sum_{k=1}^n (B_k^H - B_{k-1}^H)^2$, and the result follows from the Ergodic Theorem. To study the asymptotic normality, consider

$$ F_n = \frac{1}{\sigma_n} \sum_{k=1}^n \big( n^{2H} (\Delta_{k,n} B^H)^2 - 1 \big) \stackrel{\mathcal{L}}{=} \frac{1}{\sigma_n} \sum_{k=1}^n \big( (B_k^H - B_{k-1}^H)^2 - 1 \big), $$

where $\sigma_n$ is such that $E[F_n^2] = 1$.

Theorem 9.7. Assume $H < \frac{3}{4}$. Then $\lim_{n\to\infty} \frac{\sigma_n^2}{n} = 2 \sum_{r\in\mathbb{Z}} \rho^2(r)$ and
$$ d_{TV}(P_{F_n}, \gamma) \le c_H \times \begin{cases} n^{-1/2} & \text{if } H \in (0, \frac{5}{8}) \\ n^{-1/2} (\log n)^{3/2} & \text{if } H = \frac{5}{8} \\ n^{4H-3} & \text{if } H \in (\frac{5}{8}, \frac{3}{4}). \end{cases} $$
As a consequence,
$$ \sqrt{n}\, \big( n^{2H-1} S_n - 1 \big) \xrightarrow{\mathcal{L}} N\Big( 0,\ 2 \sum_{r\in\mathbb{Z}} \rho^2(r) \Big). $$

The estimator of $H$ given by $\widehat{H}_n = \frac{1}{2} - \frac{\log S_n}{2 \log n}$ satisfies $\widehat{H}_n \xrightarrow{\text{a.s.}} H$ and
$$ \sqrt{n} \log n\, \big( \widehat{H}_n - H \big) \xrightarrow{\mathcal{L}} N\Big( 0,\ \frac{1}{2} \sum_{r\in\mathbb{Z}} \rho^2(r) \Big). $$
Proof. There exists a sequence $\{e_k, k \ge 1\}$ in $H = L^2([0,T])$ such that

$$ \langle e_k, e_j \rangle_H = \rho(k-j). $$

The sequence $\{B(e_k)\}$ has the same law as $\{ B_k^H - B_{k-1}^H \}$, and we may replace $F_n$ by

$$ G_n = \frac{1}{\sigma_n} \sum_{k=1}^n \big( B(e_k)^2 - 1 \big) = I_2(f_n), $$

where $f_n = \frac{1}{\sigma_n} \sum_{k=1}^n e_k \otimes e_k$. By the isometry property of multiple integrals,
$$ 1 = E[G_n^2] = 2 \|f_n\|_{L^2([0,T]^2)}^2 = \frac{2}{\sigma_n^2} \sum_{k,j=1}^n \rho^2(k-j) = \frac{2n}{\sigma_n^2} \sum_{|r|<n} \Big( 1 - \frac{|r|}{n} \Big) \rho^2(r). $$

Since $\sum_r \rho^2(r) < \infty$, because $H < \frac{3}{4}$, we deduce that $\lim_{n\to\infty} \frac{\sigma_n^2}{n} = 2 \sum_{r\in\mathbb{Z}} \rho^2(r)$. We can write $D_r[I_2(f_n)] = 2 I_1(f_n(\cdot, r))$ and

$$ \| D[I_2(f_n)] \|_H^2 = 4 \big( I_2(f_n \otimes_1 f_n) + \|f_n\|_H^2 \big) = 4 I_2(f_n \otimes_1 f_n) + 2. $$

Therefore,
$$ \operatorname{Var}\big( \| D[I_2(f_n)] \|_H^2 \big) = 16\, E\big( I_2(f_n \otimes_1 f_n)^2 \big) = 32\, \|f_n \otimes_1 f_n\|_{L^2([0,T]^2)}^2 = \frac{32}{\sigma_n^4} \sum_{k,j,i,\ell=1}^n \rho(k-j)\rho(i-\ell)\rho(k-i)\rho(j-\ell) \le \frac{32}{\sigma_n^4} \sum_{i,\ell=1}^n \big( (\rho_n * \rho_n)(i-\ell) \big)^2 \le \frac{32\, n}{\sigma_n^4} \sum_{k\in\mathbb{Z}} (\rho_n * \rho_n)(k)^2 = \frac{32\, n}{\sigma_n^4} \|\rho_n * \rho_n\|_{\ell^2(\mathbb{Z})}^2, $$

where $\rho_n(k) = |\rho(k)|\, \mathbf{1}_{\{|k| \le n-1\}}$. Applying Young's inequality yields

$$ \|\rho_n * \rho_n\|_{\ell^2(\mathbb{Z})}^2 \le \|\rho_n\|_{\ell^{4/3}(\mathbb{Z})}^4, $$
so that
$$ \operatorname{Var}\big( \| D[I_2(f_n)] \|_H^2 \big) \le \frac{32\, n}{\sigma_n^4} \Big( \sum_{|k|<n} |\rho(k)|^{4/3} \Big)^3. $$

Remark: Nourdin–Peccati '13 proved the following optimal version of the fourth moment theorem (assuming $E[F_n^2] = 1$):
$$ c\, M(F_n) \le d_{TV}(F_n, Z) \le C\, M(F_n), $$
where $M(F_n) = \max\big( |E[F_n^3]|,\ E[F_n^4] - 3 \big)$. As a consequence, the sequence
$$ F_n = \frac{1}{\sigma_n} \sum_{k=1}^n \big( (B_k^H - B_{k-1}^H)^2 - 1 \big) $$
satisfies
$$ d_{TV}(P_{F_n}, \gamma) \approx \begin{cases} n^{-1/2} & \text{if } H \in (0, \frac{2}{3}) \\ n^{-1/2} (\log n)^2 & \text{if } H = \frac{2}{3} \\ n^{6H-9/2} & \text{if } H \in (\frac{2}{3}, \frac{3}{4}), \end{cases} $$
where $\approx$ means that we have upper and lower bounds with some constants $c_{H,1}$ and $c_{H,2}$.
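The estimator $\widehat{H}_n$ introduced after Theorem 9.7 is easy to test by simulation. The following sketch (an added illustration; fractional Gaussian noise is generated by a Cholesky factorization of its covariance, which is adequate for moderate $n$, and $H$, $n$ are arbitrary choices) recovers $H$ from a single path:

```python
# Sketch: estimate H from the quadratic variation of a simulated fBm path.
import numpy as np

rng = np.random.default_rng(5)
H, n = 0.3, 1000
k = np.arange(n)
rho = 0.5 * (np.abs(k + 1)**(2*H) + np.abs(k - 1)**(2*H) - 2 * np.abs(k)**(2*H))
C = rho[np.abs(k[:, None] - k[None, :])]        # covariance of unit-step fGn
L = np.linalg.cholesky(C)

incr = (L @ rng.normal(size=n)) * n**(-H)       # increments on mesh 1/n (self-similarity)
S = (incr**2).sum()
H_hat = 0.5 - np.log(S) / (2 * np.log(n))
print(H_hat)                                    # close to H = 0.3
```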

10 Applications of the Malliavin calculus in finance

In this section we present some applications of Malliavin calculus to mathematical finance. First we discuss a probabilistic method for the numerical computation of price sensitivities (Greeks) based on the integration by parts formula. Then we discuss the use of the Clark–Ocone formula to find hedging portfolios in the Black–Scholes model.

10.1 Black–Scholes model

Consider a market consisting of one stock (risky asset) and one bond (riskless asset). We assume that the price process $(S_t)_{t\ge 0}$ follows a Black–Scholes model with constant coefficients $\sigma > 0$ and $\mu$; that is,
$$ S_t = S_0 \exp\Big( \Big( \mu - \frac{\sigma^2}{2} \Big) t + \sigma B_t \Big), \qquad (66) $$
where $B = (B_t)_{t\in[0,T]}$ is a Brownian motion defined on a complete probability space $(\Omega, \mathcal{F}, P)$. We will denote by $(\mathcal{F}_t)_{t\in[0,T]}$ the filtration generated by the Brownian motion and completed by the $P$-null sets. By Itô's formula we obtain that $S_t$ satisfies the linear stochastic differential equation
$$ dS_t = \mu S_t\, dt + \sigma S_t\, dB_t. \qquad (67) $$
The coefficient $\mu$ is the mean return rate and $\sigma$ is the volatility. The price of the bond at time $t$ is $e^{rt}$, where $r$ is the interest rate.

Consider an investor who starts with some initial endowment $x \ge 0$ and invests in the assets described above. Let $\alpha_t$ be the number of non-risky assets and $\beta_t$ the number of stocks owned by the investor at time $t$. The couple $\phi_t = (\alpha_t, \beta_t)$, $t \in [0,T]$, is called a portfolio, and we assume that $\alpha_t$ and $\beta_t$ are measurable and adapted processes such that
$$ \int_0^T \beta_t^2\, dt < \infty, \qquad \int_0^T |\alpha_t|\, dt < \infty $$

almost surely. Then the value of the portfolio at time $t$ is $V_t(\phi) = \alpha_t e^{rt} + \beta_t S_t$. We say that the portfolio $\phi$ is self-financing if

$$ V_t(\phi) = x + r \int_0^t \alpha_s e^{rs}\, ds + \int_0^t \beta_s\, dS_s. $$
From now on we will consider only self-financing portfolios. It is easy to check that the discounted value of a self-financing portfolio $\widetilde{V}_t(\phi) = e^{-rt} V_t(\phi)$ satisfies
$$ \widetilde{V}_t(\phi) = x + \int_0^t \beta_s\, d\widetilde{S}_s, $$

where $\widetilde{S}_t = e^{-rt} S_t$. Notice that

$$ d\widetilde{S}_t = (\mu - r)\, \widetilde{S}_t\, dt + \sigma \widetilde{S}_t\, dB_t. $$

Set $\theta = \frac{\mu - r}{\sigma}$. Consider the martingale measure defined on $\mathcal{F}_T$ by
$$ \frac{dQ}{dP} = \exp\Big( -\theta B_T - \frac{\theta^2}{2} T \Big). $$

Under $Q$, the process $W_t = B_t + \theta t$ is a Brownian motion and the discounted price process $\widetilde{S}_t = e^{-rt} S_t$ is a martingale because

$$ d\widetilde{S}_t = \sigma \widetilde{S}_t\, dW_t. $$

Suppose that $F \ge 0$ is an $\mathcal{F}_T$-measurable random variable such that $E_Q(F^2) < \infty$. The random variable $F$ represents the payoff of some derivative. We say that $F$ can be replicated if there exists a self-financing portfolio $\phi$ such that $V_T(\phi) = F$. The Itô integral representation theorem implies that any derivative is replicable, and this means that the Black–Scholes market is complete. Indeed, it suffices to write

$$ e^{-rT} F = E_Q(e^{-rT} F) + \int_0^T u_s\, dW_s, $$
and take the self-financing portfolio $\phi_t = (\alpha_t, \beta_t)$, where

$$ \beta_t = \frac{u_t}{\sigma \widetilde{S}_t}. \qquad (68) $$
The price of a derivative with payoff $F$ at time $t \le T$ is given by the value at time $t$ of a self-financing portfolio which replicates $F$. Then,

$$ V_t(\phi) = e^{-r(T-t)}\, E_Q(F \mid \mathcal{F}_t). \qquad (69) $$

10.2 Computation of Greeks

In this section we present a general integration by parts formula and apply it to the computation of Greeks in the Black–Scholes model. Let $W = \{W(h), h \in H\}$ denote an isonormal Gaussian process associated with the Hilbert space $H$. We assume that $W$ is defined on a complete probability space $(\Omega, \mathcal{F}, P)$ and that $\mathcal{F}$ is generated by $W$.

Proposition 10.1. Let $F$, $G$ be two random variables such that $F \in \mathbb{D}^{1,2}$. Consider an $H$-valued random variable $u$ such that $D_u F = \langle DF, u \rangle_H \ne 0$ a.s. and $G u (D_u F)^{-1} \in \operatorname{Dom}\delta$. Then, for any continuously differentiable function $f$ with bounded derivative, we have

$$ E(f'(F)\, G) = E(f(F)\, H(F,G)), \qquad (70) $$

where $H(F,G) = \delta\big( G u (D_u F)^{-1} \big)$.

Proof: By the chain rule we have

$$ D_u(f(F)) = f'(F)\, D_u F. $$

Hence, by the duality relationship we get

$$ E(f'(F)\, G) = E\big( D_u(f(F))\, (D_u F)^{-1} G \big) = E\big( \langle D(f(F)),\ u (D_u F)^{-1} G \rangle_H \big) = E\big( f(F)\, \delta( G u (D_u F)^{-1} ) \big). $$

This completes the proof.

Remark 10.1. If the law of $F$ is absolutely continuous, we can assume that the function $f$ is Lipschitz.

Remark 10.2. Suppose that $u$ is deterministic. Then, for $G u (D_u F)^{-1} \in \operatorname{Dom}\delta$ it suffices that $G (D_u F)^{-1} \in \mathbb{D}^{1,2}$. Sufficient conditions for this are the following: $G \in \mathbb{D}^{1,4}$, $F \in \mathbb{D}^{2,2}$, $E(G^6) < \infty$, $E((D_u F)^{-12}) < \infty$, and $E(\|D D_u F\|_H^6) < \infty$.

Remark 10.3. Suppose we take $u = DF$. In this case
$$ H(F,G) = \delta\Big( \frac{G\, DF}{\|DF\|_H^2} \Big), $$
and equation (70) yields
$$ E(f'(F)\, G) = E\Big( f(F)\, \delta\Big( \frac{G\, DF}{\|DF\|_H^2} \Big) \Big). \qquad (71) $$

10.2.1 Computation of Greeks for European options

A Greek is a derivative of a financial quantity, usually an option price, with respect to one of the parameters of the model. This derivative is useful to measure the stability of this quantity under variations of the parameter. Consider an option with payoff $F \ge 0$ such that $E_Q(F^2) < \infty$. From (69), its price at time $t = 0$ is given by

$$ V_0 = E_Q(e^{-rT} F). $$

The most important Greek is the Delta, denoted by $\Delta$, which by definition is the derivative of $V_0$ with respect to the initial price of the stock $S_0$. Suppose that the payoff $F$ depends only on the price of the stock at the maturity time $T$; that is, $F = \Phi(S_T)$. We call these derivatives European options. Notice that $\frac{\partial S_T}{\partial S_0} = \frac{S_T}{S_0}$. As a consequence, if $\Phi$ is a Lipschitz function, we can write

$$ \Delta = \frac{\partial V_0}{\partial S_0} = E_Q\Big( e^{-rT}\, \Phi'(S_T)\, \frac{\partial S_T}{\partial S_0} \Big) = \frac{e^{-rT}}{S_0}\, E_Q(\Phi'(S_T)\, S_T). $$

Now we will apply Proposition 10.1 with $u = 1$, $F = S_T$ and $G = S_T$. We have

$$ D_u S_T = \int_0^T D_t S_T\, dt = \sigma T S_T. \qquad (72) $$

Hence, all the conditions appearing in Remark 10.2 above are satisfied in this case, and we have
$$ \delta\big( S_T (D_u S_T)^{-1} \big) = \delta\Big( \frac{1}{\sigma T} \Big) = \frac{W_T}{\sigma T}. $$
As a consequence,

$$ \Delta = \frac{e^{-rT}}{S_0 \sigma T}\, E_Q(\Phi(S_T)\, W_T). \qquad (73) $$

The Gamma, denoted by $\Gamma$, is the second derivative of the option price with respect to $S_0$. As before we obtain

$$ \Gamma = \frac{\partial^2 V_0}{\partial S_0^2} = E_Q\Big( e^{-rT}\, \Phi''(S_T) \Big( \frac{\partial S_T}{\partial S_0} \Big)^2 \Big) = \frac{e^{-rT}}{S_0^2}\, E_Q(\Phi''(S_T)\, S_T^2). $$

Suppose now that $\Phi'$ is Lipschitz. We first apply Proposition 10.1 with $G = S_T^2$, $F = S_T$ and $u = 1$. From (72) we have

$$ \delta\big( S_T^2 (D_u S_T)^{-1} \big) = \delta\Big( \frac{S_T}{\sigma T} \Big) = S_T \Big( \frac{W_T}{\sigma T} - 1 \Big), $$
and, as a consequence,
$$ E_Q(\Phi''(S_T)\, S_T^2) = E_Q\Big( \Phi'(S_T)\, S_T \Big( \frac{W_T}{\sigma T} - 1 \Big) \Big). $$
Finally, applying again Proposition 10.1 with $G = S_T \big( \frac{W_T}{\sigma T} - 1 \big)$, $F = S_T$ and $u = 1$ yields

$$ \delta\Big( S_T \Big( \frac{W_T}{\sigma T} - 1 \Big) \Big( \int_0^T D_t S_T\, dt \Big)^{-1} \Big) = \delta\Big( \frac{W_T}{\sigma^2 T^2} - \frac{1}{\sigma T} \Big) = \frac{W_T^2}{\sigma^2 T^2} - \frac{1}{\sigma^2 T} - \frac{W_T}{\sigma T} $$
and
$$ E_Q\Big( \Phi'(S_T)\, S_T \Big( \frac{W_T}{\sigma T} - 1 \Big) \Big) = E_Q\Big( \Phi(S_T) \Big( \frac{W_T^2}{\sigma^2 T^2} - \frac{1}{\sigma^2 T} - \frac{W_T}{\sigma T} \Big) \Big). $$
Therefore, we obtain

$$ \Gamma = \frac{e^{-rT}}{S_0^2}\, E_Q\Big( \Phi(S_T) \Big( \frac{W_T^2}{\sigma^2 T^2} - \frac{1}{\sigma^2 T} - \frac{W_T}{\sigma T} \Big) \Big). \qquad (74) $$
The derivative with respect to the volatility is called Vega, and denoted by $\vartheta$:
$$ \vartheta = \frac{\partial V_0}{\partial \sigma} = E_Q\Big( e^{-rT}\, \Phi'(S_T)\, \frac{\partial S_T}{\partial \sigma} \Big) = e^{-rT}\, E_Q\big( \Phi'(S_T)\, S_T (W_T - \sigma T) \big). $$

Applying Proposition 10.1 with $G = S_T (W_T - \sigma T)$, $F = S_T$ and $u = 1$ yields

$$ \delta\Big( S_T (W_T - \sigma T) \Big( \int_0^T D_t S_T\, dt \Big)^{-1} \Big) = \delta\Big( \frac{W_T}{\sigma T} - 1 \Big) = \frac{W_T^2}{\sigma T} - \frac{1}{\sigma} - W_T. $$
As a consequence,
$$ \vartheta = e^{-rT}\, E_Q\Big( \Phi(S_T) \Big( \frac{W_T^2}{\sigma T} - \frac{1}{\sigma} - W_T \Big) \Big). \qquad (75) $$
By means of an approximation procedure, these formulas still hold even when the function $\Phi$ and its derivative are not Lipschitz. We just need $\Phi$ to be piecewise continuous with jump

discontinuities and with linear growth. In particular, we can apply these formulas to the case of a European call option ($\Phi(x) = (x-K)^+$), a European put option ($\Phi(x) = (K-x)^+$), or a digital option ($\Phi(x) = \mathbf{1}_{\{x > K\}}$). For example, using formulas (73), (74), and (75) we can compute the values of $\Delta$, $\Gamma$ and $\vartheta$ for a European call option with exercise price $K$ and compare the results with those obtained using the explicit expression for the price

$$ V_0 = S_0 N(d_+) - K e^{-rT} N(d_-), $$
where $N$ is the distribution function of the law $N(0,1)$ and

$$ d_\pm = \frac{\log \frac{S_0}{K} + \big( r \pm \frac{\sigma^2}{2} \big) T}{\sigma \sqrt{T}}. $$
We can compute the values of the previous derivatives by a Monte Carlo numerical procedure.
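For example, the following Monte Carlo sketch (an added illustration with arbitrary parameter values) computes $\Delta$ for a European call using the Malliavin weight in (73) and compares it with the closed-form Black–Scholes value $N(d_+)$:

```python
# Sketch: Delta of a European call via the Malliavin weight (73) vs. N(d_+).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)
S0, K, r, sig, T, N = 100.0, 100.0, 0.05, 0.2, 1.0, 10**6   # illustrative values

W_T = rng.normal(0.0, np.sqrt(T), N)                   # Brownian motion under Q
S_T = S0 * np.exp((r - 0.5 * sig**2) * T + sig * W_T)
payoff = np.maximum(S_T - K, 0.0)

delta_mc = np.exp(-r * T) / (S0 * sig * T) * (payoff * W_T).mean()   # formula (73)

d_plus = (np.log(S0 / K) + (r + 0.5 * sig**2) * T) / (sig * np.sqrt(T))
print(delta_mc, norm.cdf(d_plus))                      # the two values agree
```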

10.2.2 Computation of Greeks for exotic options

Consider options whose payoff is a function of the average of the stock price $\frac{1}{T} \int_0^T S_t\, dt$; that is,
$$ F = \Phi\Big( \frac{1}{T} \int_0^T S_t\, dt \Big). $$
For instance, an Asian call option with exercise price $K$ is a derivative of this type, where $F = \big( \frac{1}{T} \int_0^T S_t\, dt - K \big)^+$. In this case there is no closed formula for the density of the random variable $\frac{1}{T} \int_0^T S_t\, dt$. From (69), the price of this option at time $t = 0$ is given by
$$ V_0 = e^{-rT}\, E_Q\Big( \Phi\Big( \frac{1}{T} \int_0^T S_t\, dt \Big) \Big). $$

Let us compute the Delta for this type of options. Set $\overline{S}_T = \frac{1}{T} \int_0^T S_t\, dt$. We have
$$ \Delta = \frac{\partial V_0}{\partial S_0} = E_Q\Big( e^{-rT}\, \Phi'(\overline{S}_T)\, \frac{\partial \overline{S}_T}{\partial S_0} \Big) = \frac{e^{-rT}}{S_0}\, E_Q(\Phi'(\overline{S}_T)\, \overline{S}_T). $$

We are going to apply Proposition 10.1 with $G = \overline{S}_T$, $F = \overline{S}_T$ and $u_t = S_t$. Let us compute
$$ D_t F = \frac{1}{T} \int_0^T D_t S_r\, dr = \frac{\sigma}{T} \int_t^T S_r\, dr, $$
and
$$ \delta\left( \frac{G\, S_\cdot}{\int_0^T S_t D_t F\, dt} \right) = \delta\left( \frac{2 S_\cdot}{\sigma \int_0^T S_t\, dt} \right) = \frac{2 \int_0^T S_t\, dW_t}{\sigma \int_0^T S_t\, dt} + \frac{2 \int_0^T S_t \big( \int_t^T \sigma S_r\, dr \big)\, dt}{\sigma \big( \int_0^T S_t\, dt \big)^2} = \frac{2 \int_0^T S_t\, dW_t}{\sigma \int_0^T S_t\, dt} + 1. $$

Notice that
$$ \int_0^T S_t\, dW_t = \frac{1}{\sigma} \Big( S_T - S_0 - r \int_0^T S_t\, dt \Big). $$
Thus,
$$ \delta\left( \frac{G\, S_\cdot}{\int_0^T S_t D_t F\, dt} \right) = \frac{2 (S_T - S_0)}{\sigma^2 \int_0^T S_t\, dt} + 1 - \frac{2r}{\sigma^2} = \frac{2}{\sigma^2} \left( \frac{S_T - S_0}{\int_0^T S_t\, dt} - m \right), $$
where $m = r - \frac{\sigma^2}{2}$. Finally, we obtain the following expression for the Delta:

$$ \Delta = \frac{2 e^{-rT}}{S_0 \sigma^2}\, E_Q\left[ \Phi(\overline{S}_T) \left( \frac{S_T - S_0}{T\, \overline{S}_T} - m \right) \right]. $$
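The following sketch (an added illustration; the time grid, parameter values, and the finite-difference benchmark are choices made here) implements this formula for an Asian call and checks it against a finite-difference estimate computed with common random numbers:

```python
# Sketch: Delta of an Asian call via the Malliavin weight vs. finite differences.
import numpy as np

rng = np.random.default_rng(7)
S0, K, r, sig, T, steps, N = 100.0, 100.0, 0.05, 0.2, 1.0, 200, 50_000
dt = T / steps
m_coef = r - sig**2 / 2                      # the constant m from the text

dW = rng.normal(0.0, np.sqrt(dt), size=(N, steps))
S = S0 * np.exp(m_coef * dt * np.arange(1, steps + 1) + sig * np.cumsum(dW, axis=1))
Sbar = S.mean(axis=1)                        # Riemann approximation of (1/T) int_0^T S_t dt
payoff = np.maximum(Sbar - K, 0.0)           # Asian call payoff

weight = (S[:, -1] - S0) / (T * Sbar) - m_coef
delta_mw = 2 * np.exp(-r * T) / (S0 * sig**2) * (payoff * weight).mean()

# finite-difference check with common random numbers (S_t is linear in S_0)
h = 0.01
fd = (np.maximum((1 + h) * Sbar - K, 0) - np.maximum((1 - h) * Sbar - K, 0)).mean()
print(delta_mw, np.exp(-r * T) * fd / (2 * h * S0))    # the two estimates agree
```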

10.3 Application of the Clark–Ocone formula in hedging

In this section we discuss the application of the Clark–Ocone formula to find explicit formulas for a replicating portfolio in the Black–Scholes model. Suppose that $F \in \mathbb{D}^{1,2}$. Then, applying the Clark–Ocone formula, from (68) we obtain

$$ \beta_t = \frac{e^{-r(T-t)}}{\sigma S_t}\, E_Q\big( D_t F \mid \mathcal{F}_t \big). $$

Consider the particular case of a European option with payoff $F = \Phi(S_T)$. Then

$$ \beta_t = \frac{e^{-r(T-t)}}{\sigma S_t}\, E_Q\big( \Phi'(S_T)\, \sigma S_T \mid \mathcal{F}_t \big) = e^{-r(T-t)}\, E_Q\Big( \Phi'\Big( \frac{S_T}{S_t}\, S_t \Big)\, \frac{S_T}{S_t} \,\Big|\, \mathcal{F}_t \Big) = e^{-r(T-t)}\, E_Q\big( \Phi'(x\, S_{T-t})\, S_{T-t} \big)\big|_{x = S_t}. $$

In this way we recover the fact that $\beta_t$ coincides with $\frac{\partial F}{\partial x}(t, S_t)$, where $F(t,x)$ is the price function. Consider now an option whose payoff is a function of the average of the stock price, $\overline{S}_T = \frac{1}{T} \int_0^T S_t\, dt$; that is, $F = \Phi(\overline{S}_T)$. In this case we obtain

$$ \beta_t = \frac{e^{-r(T-t)}}{S_t}\, E_Q\Big( \Phi'(\overline{S}_T)\, \frac{1}{T} \int_t^T S_r\, dr \,\Big|\, \mathcal{F}_t \Big). $$
We can write
$$ \overline{S}_T = \frac{t}{T}\, \overline{S}_t + \frac{1}{T} \int_t^T S_r\, dr, $$
where $\overline{S}_t = \frac{1}{t} \int_0^t S_r\, dr$. As a consequence we obtain

$$ \beta_t = \frac{e^{-r(T-t)}}{S_t}\, E_Q\left[ \Phi'\Big( \frac{t x}{T} + \frac{y (T-t)}{T}\, \overline{S}_{T-t} \Big)\, \frac{y (T-t)}{T}\, \overline{S}_{T-t} \right]_{x = \overline{S}_t,\ y = S_t}. $$
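As a numerical check of the European case above (an added sketch with illustrative parameters), for a call option $\Phi'(y) = \mathbf{1}_{\{y > K\}}$, and the hedging ratio reduces to the Black–Scholes Delta $N(d_+(t, S_t))$:

```python
# Sketch: beta_t = e^{-r(T-t)} E_Q[Phi'(x S_{T-t}) S_{T-t}] at x = S_t for a call,
# compared with N(d_+); S_t, K and the other parameters are illustrative.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
St, K, r, sig, T, t, N = 95.0, 100.0, 0.05, 0.2, 1.0, 0.4, 10**6

tau = T - t
S_tau = np.exp((r - 0.5 * sig**2) * tau + sig * rng.normal(0, np.sqrt(tau), N))  # started at 1
beta = np.exp(-r * tau) * ((St * S_tau > K) * S_tau).mean()      # Phi'(y) = 1_{y > K}

d_plus = (np.log(St / K) + (r + 0.5 * sig**2) * tau) / (sig * np.sqrt(tau))
print(beta, norm.cdf(d_plus))                                    # the two values agree
```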
