Lectures on Gaussian approximations with Malliavin Ivan Nourdin

To cite this version:

Ivan Nourdin. Lectures on Gaussian approximations with Malliavin calculus. Donati-Martin C. Séminaire de probabilités XLV, 2078, Springer, pp.3-89, 2013, Lecture notes in mathematics, 978-3- 319-00320-7. ￿10.1007/978-3-319-00321-4_1￿. ￿hal-00680319v3￿

HAL Id: hal-00680319 https://hal.archives-ouvertes.fr/hal-00680319v3 Submitted on 28 Jun 2012

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Lectures on Gaussian approximations with Malliavin calculus

Ivan Nourdin Université de Lorraine, Institut de Mathématiques Élie Cartan B.P. 70239, 54506 Vandoeuvre-lès-Nancy Cedex, France [email protected]

June 28, 2012

Overview. In a seminal paper of 2005, Nualart and Peccati [40] discovered a surprising central limit theorem (called the “Fourth Moment Theorem” in the sequel) for sequences of multiple stochastic of a fixed order: in this context, convergence in distribution to the standard normal law is equivalent to convergence of just the fourth moment. Shortly afterwards, Peccati and Tudor [46] gave a multidimensional version of this characterization. Since the publication of these two beautiful papers, many improvements and developments on this theme have been considered. Among them is the work by Nualart and Ortiz-Latorre [39], giving a new proof only based on Malliavin calculus and the use of integration by parts on Wiener space. A second step is my joint paper [27] (written in collaboration with Peccati) in which, by bringing together Stein’s method with Malliavin calculus, we have been able (among other things) to associate quantitative bounds to the Fourth Moment Theorem. It turns out that Stein’s method and Malliavin calculus fit together admirably well. Their interaction has led to some remarkable new results involving central and non-central limit theorems for functionals of infinite-dimensional Gaussian fields. The current survey aims to introduce the main features of this recent theory. It originates from a of lectures I delivered∗ at the Collège de France between January and March 2012, within the framework of the annual prize of the Fondation des Sciences Mathématiques de Paris. It may be seen as a teaser for the book [32], in which the interested reader will find much more than in this short survey.

Acknowledgments. It is a pleasure to thank the Fondation des Sciences Mathématiques de Paris for its generous support during the academic year 2011-12 and for giving me the opportunity to speak about my recent research in the prestigious Collège de France. I am grateful to all the participants of these lectures for their assiduity. Also, I would like to warmly thank two anonymous referees for their very careful reading and for their valuable suggestions and remarks. Finally, my last thank goes to Giovanni Peccati, not only for accepting to give a lecture (resulting to the material developed in Section 10) but also (and especially!) for all the nice theorems we recently discovered together. I do hope it will continue this way as long as possible!

∗You may watch the videos of the lectures at http://www.sciencesmaths-paris.fr/index.php?page=175.

1 Contents

1 Breuer-Major Theorem 2

2 Universality of Wiener chaos 8

3 Stein’s method 14

4 Malliavin calculus in a nutshell 19

5 Stein meets Malliavin 28

6 The smart path method 37

7 Cumulants on the Wiener space 41

8 A new density formula 45

9 Exact rates of convergence 50

10 An extension to the Poisson space (following the invited talk by Giovanni Peccati) 54

11 Fourth Moment Theorem and free probability 63

1 Breuer-Major Theorem

The aim of this first section is to illustrate, through a guiding example, the power of the approach we will develop in this survey.

Let X > be a centered stationary Gaussian family. In this context, stationary just means { k}k 1 that there exists ρ : Z R such that E[X X ] = ρ(k l), k, l > 1. Assume further that ρ(0) = 1, → k l − that is, each X is (0, 1) distributed. k N Let ϕ : R R be a measurable function satisfying →

2 1 2 x2/2 E[ϕ (X1)] = ϕ (x)e− dx < . (1.1) √2π R ∞ Z Let H0,H1,... denote the sequence of Hermite polynomials. The first few Hermite polynomials are H = 1, H = X, H = X2 1 and H = X3 3X. More generally, the qth Hermite polynomial H 0 1 2 − 3 − q is defined through the relation XHq = Hq+1 + qHq 1. It is a well-known fact that, when it verifies 2 −x2/2 (1.1), the function ϕ may be expanded in L (R, e− dx) (in a unique way) in terms of Hermite polynomials as follows:

∞ ϕ(x) = aqHq(x). (1.2) Xq=0 Let d > 0 be the first integer q > 0 such that a = 0 in (1.2). It is called the Hermite rank of ϕ; q 6 it will play a key role in our study. Also, let us mention the following crucial property of Hermite

2 polynomials with respect to Gaussian elements. For any integer p, q > 0 and any jointly Gaussian random variables U, V (0, 1), we have ∼ N 0 if p = q E[H (U)H (V )] = 6 (1.3) p q q!E[UV ]q if p = q. 

In particular (choosing p = 0) we have that E[Hq(X1)] = 0 for all q > 1, meaning that a0 = E[ϕ(X1)] in (1.2). Also, combining the decomposition (1.2) with (1.3), it is straightforward to check that

2 ∞ 2 E[ϕ (X1)] = q!aq. (1.4) Xq=0 We are now in position to state the celebrated Breuer-Major theorem.

Theorem 1.1 (Breuer, Major, 1983; see [7]) Let Xk k>1 and ϕ : R R be as above. Assume d{ } → further that a0 = E[ϕ(X1)] = 0 and that k Z ρ(k) < , where ρ is the covariance function of ∈ | | ∞ X > and d is the Hermite rank of ϕ (observe that d > 1). Then, as n , { k}k 1 P → ∞ n 1 V = ϕ(X ) law (0, σ2), (1.5) n √n k → N Xk=1 with σ2 given by

∞ σ2 = q!a2 ρ(k)q [0, ). (1.6) q ∈ ∞ q=d k Z X X∈ (The fact that σ2 [0, ) is part of the conclusion.) ∈ ∞ The proof of Theorem 1.1 is far from being obvious. The original proof consisted to show that all the moments of V converge to those of the Gaussian law (0, σ2). As anyone might guess, n N this required a high ability and a lot of combinatorics. In the proof we will offer, the complexity 2 4 is the same as checking that the variance and the fourth moment of Vn converges to σ and 3σ respectively, which is a drastic simplification with respect to the original proof. Before doing so, let us make some other comments. Remark 1.2 1. First, it is worthwhile noticing that Theorem 1.1 (strictly) contains the classical central limit theorem (CLT), which is not an evident claim at first glance. Indeed, let Yk k>1 2 { } be a sequence of i.i.d. centered random variables with common variance σ > 0, and let FY 1 denote the common cumulative distribution function. Consider the pseudo-inverse FY− of FY , defined as

1 F − (u) = inf y R : u 6 F (y) , u (0, 1). Y { ∈ Y } ∈ 1 law When U [0,1] is uniformly distributed, it is well-known that FY− (U) = Y1. Observe ∼ U 2 also that 1 X1 e t /2dt is distributed. By combining these two facts, we get that √2π − [0,1] −∞ U law ϕ(X1) = Y1 withR x 1 1 t2/2 ϕ(x) = F − e− dt , x R. Y √2π ∈  Z−∞  3 Assume now that ρ(0) = 1 and ρ(k) = 0 for k = 0, that is, assume that the sequence X > 6 { k}k 1 is composed of i.i.d. (0, 1) random variables. Theorem 1.1 yields that N n n 1 law 1 law ∞ Y = ϕ(X ) 0, q!a2 , √n k √n k → N  q Xk=1 Xk=1 Xq=d   2 2 2 thereby concluding the proof of the CLT since σ = E[ϕ (X1)] = q∞=d q!aq, see (1.4). Of course, such a proof of the CLT is like to crack a walnut with a sledgehammer. This approach P has nevertheless its merits: it shows that the independence assumption in the CLT is not crucial to allow a Gaussian limit. Indeed, this is rather the summability of a series which is responsible of this fact, see also the second point of this remark. > D 1 2. Assume that d 2 and that ρ(k) k − as k for some D (0, d ). In this case, it dD/2 1 n ∼ | | | | → ∞ ∈ may be shown that n − k=1 ϕ(Xk) converges in law to a non-Gaussian (non degenerated) d random variable. This shows in particular that, in the case where Z ρ(k) = , we can P k | | ∞ get a non-Gaussian limit. In other words, the summability assumption∈ in Theorem 1.1 is, P roughly speaking, equivalent (when d > 2) to the asymptotic normality.

n [nt] 3. There exists a functional version of Theorem 1.1, in which the sum k=1 is replaced by k=1 for t > 0. It is actually not that much harder to prove and, unsurprisingly, the limiting process P P is then the standard Brownian motion multiplied by σ.

Let us now prove Theorem 1.1. We first compute the limiting variance, which will justify the formula (1.6) we claim for σ2. Thanks to (1.2) and (1.3), we can write

2 n n 1 ∞ 1 ∞ E[V 2] = E a H (X ) = a a E[H (X )H (X )] n n  q q k   n p q p k q l Xq=d Xk=1 p,qX=d k,lX=1  n   1 ∞ 2 q ∞ 2 q r 1 = q!aq ρ(k l) = q!aq ρ(r) 1 | | r d and r Z are fixed, we have that ∈ 2 q r 1 2 q q!aqρ(r) 1 | | r

(i) when ϕ = Hq has the form of a Hermite polynomial (for some q > 1); (ii) when ϕ = P R[X] is a real polynomial; ∈

4 2 (iii) in the general case when ϕ L2(R, e x /2dx). ∈ − We first show that (ii) implies (iii). That is, let us assume that Theorem 1.1 is shown for 2 polynomial functions ϕ, and let us show that it holds true for any function ϕ L2(R, e x /2dx). ∈ − We proceed by approximation. Let N > 1 be a (large) integer (to be chosen later) and write

N n n 1 1 ∞ V = a H (X ) + a H (X ) =: V + R . n √n q q k √n q q k n,N n,N Xq=d Xk=1 q=XN+1 Xk=1 Similar computations as above lead to

2 ∞ 2 d sup E[Rn,N ] 6 q!aq ρ(r) 0 as N . (1.7) n>1 × | | → → ∞ q=N+1 r Z X X∈ 2 2 (Recall from (1.4) that E[ϕ (X )] = ∞ q!a < .) On the other hand, using (ii) we have that, 1 q=d q ∞ for fixed N and as n , → ∞ P N law V 0, q!a2 ρ(k)q . (1.8) n,N → N  q  q=d k Z X X∈   It is then a routine exercise (details are left to the reader) to deduce from (1.7)-(1.8) that V = V + R law (0, σ2) as n , that is, that (iii) holds true. n n,N n,N → N → ∞ Next, let us prove (i), that is, (1.5) when ϕ = Hq is the qth Hermite polynomial. We actually need to work with a specific realization of the sequence X > . The space { k}k 1 L2(Ω) := span X ,X ,... H { 1 2 } being a real separable Hilbert space, it is isometrically isomorphic to either RN (with N > 1) or L2(R ). Let us assume that L2(R ), the case where RN being easier to handle. Let + H ≃ + H ≃ Φ: L2(R ) be an isometry. Set e = Φ(X ) for each k > 1. We have H → + k k ρ(k l) = E[X X ] = ∞ e (x)e (x)dx, k, l > 1 (1.9) − k l k l Z0 If B = (Bt)t>0 denotes a standard Brownian motion, we deduce that

law ∞ X > = e (t)dB , { k}k 1 k t Z0 k>1 these two families being indeed centered, Gaussian and having the same covariance structure (by construction of the ek’s). On the other hand, it is a well-known result of stochastic analysis (which follows from an induction argument through the Itô formula) that, for any function e L2(R ) ∈ + such that e 2 R = 1, we have k kL ( +) t1 tq−1 ∞ ∞ Hq e(t)dBt = q! dBt1 e(t1) dBt2 e(t2) ... dBtq e(tq). (1.10) Z0  Z0 Z0 Z0 (For instance, by Itô’s formula we can write

2 t1 ∞ ∞ ∞ 2 e(t)dBt = 2 dBt1 e(t1) dBt2 e(t2) + e(t) dt Z0  Z0 Z0 Z0 t1 ∞ = 2 dBt1 e(t1) dBt2 e(t2) + 1, Z0 Z0 5 which is nothing but (1.10) for q = 2, since H = X2 1.) At this stage, let us adopt the two 2 − following notational conventions: (a) If ϕ (resp. ψ) is a function of r (resp. s) arguments, then the tensor product ϕ ψ is the ⊗ function of r + s arguments given by ϕ ψ(x1, . . . , xr+s) = ϕ(x1, . . . , xr)ψ(xr+1, . . . , xr+s). ⊗ q Also, if q > 1 is an integer and e is a function, the tensor product function e⊗ is the function e ... e where e appears q times. ⊗ ⊗ (b) If f L2(Rq ) is symmetric (meaning that f(x , . . . , x ) = f(x , . . . , x ) for all ∈ + 1 q σ(1) σ(q) permutation σ S and almost all x , . . . , x R ) then ∈ q 1 q ∈ + t1 tq−1 B ∞ Iq (f) = f(t1, . . . , tq)dBt1 . . . dBtq := q! dBt1 dBt2 ... dBtq f(t1, . . . , tq). Rq 0 0 0 Z + Z Z Z With these new notations at hand, observe that we can rephrase (1.10) in a simple way as

∞ B q Hq e(t)dBt = Iq (e⊗ ). (1.11) Z0  It is now time to introduce a very powerful tool, the so-called Fourth Moment Theorem of Nualart and Peccati. This wonderful result lies at the heart of the approach we shall develop in these lecture notes. We will prove it in Section 5.

Theorem 1.3 (Nualart, Peccati, 2005; see [40]) Fix an integer q > 2, and let fn n>1 be a 2 Rq B 2 2 { } 2 sequence of symmetric functions of L ( +). Assume that E[Iq (fn) ] = q! fn 2 Rq σ as k kL ( +) → n for some σ > 0. Then, the following three assertions are equivalent as n : → ∞ → ∞ (1) IB(f ) law (0, σ2); q n → N law (2) E[IB (f )4] 3σ4; q n → 2 R2q 2r (3) fn r fn L2(R2q−2r ) 0 for each r = 1, . . . , q 1, where fn r fn is the function of L ( + − ) k ⊗ k + → − ⊗ defined by

fn r fn(x1, . . . , x2q 2r) ⊗ − = fn(x1, . . . , xq r, y1, . . . , yr)fn(xq r+1, . . . , x2q 2r, y1, . . . , yr)dy1 . . . dyr. Rr − − − Z +

Remark 1.4 In other words, Theorem 1.3 states that the convergence in law of a normalized sequence of multiple Wiener-Itô integrals IB(f ) towards the Gaussian law (0, σ2) is equivalent q n N to convergence of just the fourth moment to 3σ4. This surprising result has been the starting point of a new line of research, and has quickly led to several applications, extensions and improvements. One of these improvements is the following quantitative bound associated to Theorem 1.3 that we shall prove in Section 5 by combining Stein’s method with the Malliavin calculus. Theorem 1.5 (Nourdin, Peccati, 2009; see [27]) If q > 2 is an integer and f is a symmetric 2 Rq B 2 2 element of L ( +) satisfying E[Iq (f) ] = q! f L2(Rq ) = 1, then k k +

B 1 x2/2 q 1 B 4 sup P [Iq (f) A] e− dx 6 2 − E[Iq (f) ] 3 . R ∈ − √2π 3q − A ( ) ZA r ⊂B q

6 Let us go back to the proof of (i), that is, to the proof of (1.5) for ϕ = Hq. Recall that the sequence ek has be chosen for (1.9) to hold. Using (1.10) (see also (1.11)), we can write B { } Vn = Iq (fn), with n 1 q f = e⊗ . n √n k Xk=1 We already showed that E[V 2] σ2 as n . So, according to Theorem 1.3, to get (i) it remains n → → ∞ to check that fn r fn L2(R2q−2r ) 0 for any r = 1, . . . , q 1. We have k ⊗ k + → − n n 1 q q 1 r q r q r fn r fn = e⊗ r e⊗ = ek, el 2 R e⊗ − e⊗ − ⊗ n k ⊗ l n h iL ( +) k ⊗ l k,lX=1 k,lX=1 n 1 r q r q r = ρ(k l) e⊗ − e⊗ − , n − k ⊗ l k,lX=1 implying in turn 2 fn r fn 2 R2q−2r k ⊗ kL ( + ) n 1 r r q r q r q r q r = ρ(i j) ρ(k l) ei⊗ − ej⊗ − , ek⊗ − el⊗ − L2(R2q−2r ) n2 − − h ⊗ ⊗ i + i,j,k,lX=1 n 1 r r q r q r = ρ(i j) ρ(k l) ρ(i k) − ρ(j l) − . n2 − − − − i,j,k,lX=1 Observe that ρ(k l) r ρ(i k) q r 6 ρ(k l) q + ρ(i k) q. This, together with other obvious | − | | − | − | − | | − | manipulations, leads to the bound 2 2 6 q r q r fn r fn 2 R2q−2r ρ(k) ρ(i) ρ(j) − k ⊗ kL ( + ) n | | | | | | k Z i

Thus, to get that fn r fn L2(R2q−2r ) 0 for any r = 1, . . . , q 1, it suffices to show that k ⊗ k + → − q−r r s (r) := n− q ρ(i) 0 for any r = 1, . . . , q 1. n | | → − i

r/q q−r q−r q 1 r/q s (δ, r) 6 n− r ρ(i) (1 + 2[nδ]) q 6 cst δ − , 1,n  | |  × i <[nδ] | |X   7 as well as

r/q r/q q−r q−r q q s (δ, r) 6 n− r ρ(i) (2n) q 6 cst ρ(i) . 2,n  | |  ×  | |  [nδ]6 i [nδ] X| | | |X     Since 1 r/q > 0, it is a routine exercise (details are left to the reader) to deduce that s (r) 0 − n → as n . Since this is true for any r = 1, . . . , q 1, this concludes the proof of (i). → ∞ − It remains to show (ii), that is, convergence in law (1.5) whenever ϕ is a real polynomial. We shall use the multivariate counterpart of Theorem 1.3, which was obtained shortly afterwards by Peccati and Tudor. Since only a weak version (where all the involved multiple Wiener-Itô integrals have different orders) is needed here, we state the result of Peccati and Tudor only in this situation. We refer to Section 6 for a more general version and its proof.

Theorem 1.6 (Peccati, Tudor, 2005; see [46]) Consider l integers q1, . . . , ql > 1, with l > 2. i Assume that all the qi’s are pairwise different. For each i = 1, . . . , l, let fn n>1 be a sequence of 2 qi B i 2 i 2 { 2} R q symmetric functions of L ( +) satisfying E[Iqi (fn) ] = qi! fn 2 R i σi as n for some k kL ( + ) → → ∞ σ > 0. Then, the following two assertions are equivalent as n : i → ∞ (1) IB(f i ) law (0, σ2) for all i = 1, . . . , l; qi n → N i (2) IB (f 1),...,IB(f l ) law 0, diag(σ2, . . . , σ2) . q1 n ql n → N 1 l In other words, Theorem 1.6 proves the surprising fact that, for such a sequence of vectors of multiple Wiener-Itô integrals, componentwise convergence to Gaussian always implies joint convergence. We shall combine Theorem 1.6 with (i) to prove (ii). Let ϕ have the form of a real polynomial. In N particular, it admits a decomposition of the type ϕ = q=d aqHq for some finite integer N > d. Together with (i), Theorem 1.6 yields that P n n 1 1 H (X ),..., H (X ) law 0, diag(σ2, . . . , σ2 ) , √n d k √n N k → N d N k=1 k=1 ! X X  2 q where σq = q! k Z ρ(k) , q = d, . . . , N. We deduce that ∈

PN n N 1 V = a H (X ) law 0, a2q! ρ(k)q , n √n q q k → N  q  q=d k=1 q=d k Z X X X X∈   which is the desired conclusion in (ii) and conclude the proof of Theorem 1.1.

To go further. In [33], one associates quantitative bounds to Theorem 1.1 by using a similar approach.

2 Universality of Wiener chaos

Before developing the material which will be necessary for the proof of the Fourth Moment Theorem 1.3 (as well as other related results), to motivate the reader let us study yet another consequence of this beautiful result.

8 For any sequence X1,X2,... of i.i.d. random variables with mean 0 and variance 1, the central limit theorem asserts that V = (X + ... + X )/√n law (0, 1) as n . It is a particular n 1 n → N → ∞ instance of what is commonly referred to as a ‘universality phenomenon’ in probability. Indeed, we observe that the limit of the sequence Vn does not rely on the specific law of the Xi’s, but only of the fact that its first two moments are 0 and 1 respectively. Another example that exhibits a universality phenomenon is given by Wigner’s theorem in the random matrix theory. More precisely, let X > and X /√2 > be two independent families { ij}j>i 1 { ii }i 1 composed of i.i.d. random variables with mean 0, variance 1, and all the moments. Set Xji = Xij Xij and consider the n n random matrix M = ( ) 6 6 . The matrix M being symmetric, × n √n 1 i,j n n its eigenvalues λ1,n, . . . , λn,n (possibly repeated with multiplicity) belong to R. Wigner’s theorem then asserts that the spectral measure of Mn, that is, the random probability measure defined as 1 n 1 √ 21 n k=1 δλk,n , converges almost surely to the semicircular law 2π 4 x [ 2,2](x)dx, whatever the − − exact distribution of the entries of M are. P n In this section, our aim is to prove yet another universality phenomenon, which is in the spirit of the two afore-mentioned results. To do so, we need to introduce the following two blocks of basic ingredients:

(i) Three sequences X = (X1,X2,...), G = (G1,G2,...) and E = (ε1, ε2,...) of i.i.d. random variables, all with mean 0, variance 1 and finite fourth moment. We are more specific with G and E, by assuming further that G (0, 1) and P (ε = 1) = P (ε = 1) = 1/2. (As we 1 ∼ N 1 1 − will see, E will actually play no role in the statement of Theorem 2.1; we will however use it to build a interesting counterexample, see Remark 2.2(1).)

(ii) A fixed integer d > 1 as well as a sequence g : 1, . . . , n d R, n > 1 of real functions, each n { } → gn satisfying in addition that, for all i1, . . . , id = 1, . . . , n, (a) g (i , . . . , i ) = g (i , . . . , i ) for all permutation σ S ; n 1 d n σ(1) σ(d) ∈ d (b) g (i , . . . , i ) = 0 whenever i = i for some k = l; n 1 d k l 6 (c) d! n g (i , . . . , i )2 = 1. i1,...,id=1 n 1 d

(Of course,P conditions (a) and (b) are becoming immaterial when d = 1.) If x = (x1, x2,...) is a given real sequence, we also set

n x Qd(gn, ) = gn(i1, . . . , id)xi1 . . . xid . i1,...,iXd=1

Using (b) and (c), it is straightforward to check that, for any n > 1, we have E[Qd(gn, X)] = 0 2 and E[Qd(gn, X) ] = 1. We are now in position to state our new universality phenomenon.

Theorem 2.1 (Nourdin, Peccati, Reinert, 2010; see [34]) Assume that d > 2. Then, as n , the following two assertions are equivalent: → ∞ (α) Q (g , G) law (0, 1); d n → N law (β) Q (g , X) (0, 1) for any sequence X as given in (i). d n → N Before proving Theorem 2.1, let us address some comments.

9 Remark 2.2 1. In reality, the universality phenomenon in Theorem 2.1 is a bit more subtle than in the CLT or in Wigner’s theorem. To illustrate what we have in mind, let us consider an explicit situation (in the case d = 2). Let g : 1, . . . , n 2 R be the function given by n { } → 1 1 gn(i, j) = i=1,j>2 or j=1,i>2 . 2√n 1 { } − It is easy to check that gn satisfies the three assumptions (a)-(b)-(c) and also that n 1 Q2(gn, x) = x1 xk. × √n 1 − Xk=2 The classical CLT then implies that Q (g , G) law G G and Q (g , E) law ε G . Moreover, it 2 n → 1 2 2 n → 1 2 is a classical and easy exercise to check that ε1G2 is (0, 1) distributed. Thus, what we just law N showed is that, although Q (g , E) (0, 1) as n , the assertion (β) in Theorem 2.1 2 n → N → ∞ fails when choosing X = G (indeed, the product of two independent (0, 1) random variables N is not gaussian). This means that, in Theorem 2.1, we cannot replace the sequence G in (α) by another sequence (at least, not by E !). 2. Theorem 2.1 is completely false when d = 1. For an explicit counterexample, consider for 1 x instance gn(i) = i=1 , i = 1, . . . , n. We then have Q1(gn, ) = x1. Consequently, the assertion (α) is trivially{ } verified (it is even an equality in law!) but the assertion (β) is never true unless X (0, 1). 1 ∼ N Proof of Theorem 2.1. Of course, only the implication (α) (β) must be shown. Let us divide its → proof into three steps. 1 2 Rd Step 1. Set ei = [i 1,i], i > 1, and let fn L ( +) be the symmetric function defined as − ∈ n f = g (i , . . . , i )e ... e . n n 1 d i1 ⊗ ⊗ id i1,...,iXd=1 B By the very definition of Id (fn), we have

n t1 td−1 B ∞ Id (fn) = d! gn(i1, . . . , id) dBt1 ei1 (t1) dBt2 ei2 (t2) ... dBtd eid (td). 0 0 0 i1,...,iXd=1 Z Z Z Observe that

t1 t ∞ d−1 dBt1 ei1 (t1) dBt2 ei2 (t2) ... dBtd eid (td) Z0 Z0 Z0 is not almost surely zero (if and) only if id 6 id 1 6 ... 6 i1. By combining this fact with assumption (b), we deduce that −

t1 td−1 B ∞ Id (fn) = d! gn(i1, . . . , id) dBt1 ei1 (t1) dBt2 ei2 (t2) ... dBtd eid (td) 6 6 0 0 0 1 id<...

= d! gn(i1, . . . , id)(Bi1 Bi1 1) ... (Bi Bi 1) − d d− 6 6 − − 1 id<...

n n fn d 1 fn = gn(i, k2, . . . , kd)gn(j, k2, . . . , kd) ei ej, ⊗ −   ⊗ i,jX=1 k2,...,kXd=1   so that 2 n n 2 fn d 1 fn L2(R2 ) = gn(i, k2, . . . , kd)gn(j, k2, . . . , kd) k ⊗ − k +   i,jX=1 k2,...,kXd=1 2 n  n  > g (i, k , . . . , k )2 (by summing only over i = j)  n 2 d  Xi=1 k2,...,kXd=1 2  n  2 2 > max gn(i, k2, . . . , kd) = τn, (2.12) 16i6n   k2,...,kXd=1   where n 2 τn := max gn(i, k2, . . . , kd) . (2.13) 16i6n k2,...,kXd=1 G law B Now, assume that (α) holds. By Theorem 1.3 and because Qd(gn, ) = Id (fn), we have in particular that fn d 1 fn L2(R2 ) 0 as n . Using the inequality (2.12), we deduce that k ⊗ − k + → → ∞ τ 0 as n . n → → ∞ Step 2. We claim that the following result (whose proof is given in Step 3) allows to conclude the proof of (α) (β). → Theorem 2.3 (Mossel, O’Donnel, Oleszkiewicz, 2010; see [20]) Let X and G be given as in d (i) and let gn : 1, . . . , n R be a function satisfying the three conditions (a)-(b)-(c). Set 4{ } → R R γ = max 3,E[X1 ] > 1 and let τn be the quantity given by (2.13). Then, for all function ϕ : {3 } → of class with ϕ′′′ < , we have C k k∞ ∞ γ 3 (d 1) 3/2 E[ϕ(Qd(gn, X))] E[ϕ(Qd(gn, G))] 6 (3 + 2γ) 2 − d √d! ϕ′′′ √τn. − 3 k k∞

Indeed, assume that (α) holds. By Step 1, we have that τ 0 as n . Next, Theorem 2.3 n → → ∞ together with (α), lead to (β) and therefore conclude the proof of Theorem 2.1.

Step 3: Proof of Theorem 2.3. During the proof, we will need the following auxiliary lemma, which is of independent interest.

Lemma 2.4 (Hypercontractivity) Let n > d > 1, and consider a multilinear polynomial P R[x , . . . , x ] of degree d, that is, P is of the form ∈ 1 n

P (x1, . . . , xn) = aS xi. S 1,...,n i S ⊂{SX=d } Y∈ | |

11 Let X be as in (i). Then,

4 4 2d 2 2 E P (X1,...,Xn) 6 3 + 2E[X1 ] E P (X1,...,Xn) . (2.14)

Proof. The proof follows ideas from [20] is by induction on n. The case n = 1 is trivial. Indeed, in this case we have d = 1 so that P (x1) = ax1; the conclusion therefore asserts that (recall that 2 4 2 2 E[X1 ] = 1, implying in turn that E[X1 ] > E[X1 ] = 1)

4 4 4 4 2 a E[X1 ] 6 a 3 + 2E[X1 ] , which is evident. Assume now that n > 2. We can write

P (x1, . . . , xn) = R(x1, . . . , xn 1) + xnS(x1, . . . , xn 1), − − where R,S R[x1, . . . , xn 1] are multilinear polynomials of n 1 variables. Observe that R ∈ − − has degree d, while S has degree d 1. Now write P = P (X1,...,Xn), R = R(X1,...,Xn 1), S −4 R S − = S(X1,...,Xn 1) and α = E[X1 ]. Clearly, and are independent of Xn. We have, using − 2 E[Xn] = 0 and E[Xn] = 1: 2 2 2 2 E[P ] = E[(R + SXn) ] = E[R ] + E[S ] P4 R S 4 R4 R2S2 3 RS3 4 S4 E[ ] = E[( + Xn) ] = E[ ] + 6E[ ] + 4E[Xn]E[ ] + E[Xn]E[ ]. Observe that E[R2S2] 6 E[R4] E[S4] and

3 p p1 3 3 RS3 4 R4 4 S4 4 R4 S4 S4 E[Xn]E[ ] 6 α E[ ] E[ ] 6 α E[ ] E[ ] + αE[ ],  1 3  p p 3 where the last inequality used both x 4 y 4 6 √xy + y (by considering x < y and x > y) and α 4 6 α 4 2 2 (because α > E[Xn] > E[Xn] = 1). Hence E[P4] 6 E[R4] + 2(3 + 2α) E[R4] E[S4] + 5αE[S4] R4 R4 S4 2 S4 6 E[ ] + 2(3 + 2α)pE[ ]pE[ ] + (3 + 2α) E[ ] 2 = E[R4] + (3 + 2pα) E[Sp4] .   p 4 p d 2 4 d 1 2 By induction, we have E[R ] 6 (3 + 2α) E[R ] and E[S ] 6 (3 + 2α) − E[S ]. Therefore

2 E[P4] 6 (3 + 2α)p2d E[R2] + E[S2] = (3 + 2α)2pdE[P2]2, and the proof of the lemma is concluded. We are now in position to prove Theorem 2.3. Following [20], we use the Lindeberg replacement trick. Without loss of generality, we assume that X and G are stochastically independent. For (i) i = 0, . . . , n, let W = (G1,...,Gi,Xi+1,...,Xn). Fix a particular i = 1, . . . , n and write

U = g (i , . . . , i )W (i) ...W (i), i n 1 d i1 id 16i1,...,id6n i1=i,...,iX =i 6 d6 [ V = g (i , . . . , i )W (i) ... W (i) ...W (i) i n 1 d i1 i id 16i1,...,id6n jX: i =i ∃ j n = d g (i, i , . . . , i )W (i) ...W (i), n 2 d i2 id i2,...,iXd=1 12 [(i) where Wi means that this particular term is dropped (observe that this notation bears no ambiguity: indeed, since gn vanishes on diagonals, each string i1, . . . , id contributing to the definition of Vi contains the symbol i exactly once). For each i, note that Ui and Vi are independent of the variables Xi and Gi, and that

(i 1) (i) Qd(gn, W − ) = Ui + XiVi and Qd(gn, W ) = Ui + GiVi.

By Taylor’s theorem, using the independence of Xi from Ui and Vi, we have

1 2 2 E ϕ(U + X V ) E ϕ(U ) E ϕ′(U )V E[X ] E ϕ′′(U )V E[X ] i i i − i − i i i − 2 i i i

1  3  3      6 ϕ′′′ E[ Xi ]E[ Vi ]. 6k k∞ | | | | Similarly,

1 2 2 E ϕ(U + G V ) E ϕ(U ) E ϕ′(U )V E[G ] E ϕ′′(U )V E[G ] i i i − i − i i i − 2 i i i

1  3  3      6 ϕ′′′ E[ Gi ]E[ Vi ]. 6k k∞ | | | | Due to the matching moments up to second order on one hand, and using that E[ X 3] 6 γ and | i| E[ G 3] 6 γ on the other hand, we obtain that | i| (i 1) (i) E ϕ(Q (g , W − )) E ϕ(Q (g , W )) = E ϕ(U + G V ) E ϕ(U + X V ) d n − d n i i i − i i i γ 3 6 ϕ′′′ E[ Vi ].        3 k k∞ | | By Lemma 2.4, we have

3 4 3 3 (d 1) 2 3 E[ V ] 6 E[V ] 4 6 (3 + 2γ) 2 − E[V ] 2 . | i| i i

Using the independence between X and G, the properties of gn (which is symmetric and vanishes 2 2 on diagonals) as well as E[Xi] = E[Gi] = 0 and E[Xi ] = E[Gi ] = 1, we get

n 3/2 2 3/2 2 E[Vi ] = dd! gn(i, i2, . . . , id)  i2,...,iXd=1  n n 3/2 2 2 6 (dd!) max gn(j, j2, . . . , jd) gn(i, i2, . . . , id) , v16j6n × u j2,...,jd=1 i2,...,id=1 u X X t implying in turn that

n n n 2 3/2 6 3/2 2 2 E[Vi ] (dd!) max gn(j, j2, . . . , jdk ) gn(i1, i2, . . . , id) , v16j6n × i=1 u j2,...,jd=1 i1,...,id=1 X u X X 3/2 t = d √d! √τn.

13 By collecting the previous bounds, we get

E[ϕ(Qd(gn, X))] E[ϕ(Qd(gn, G))] | n − | (i 1) (i) 6 E ϕ(Q (g , W − )) E ϕ(Q (g , W )) d n − d n i=1 X  n   n γ 3 γ 3 (d 1) 2 3 6 ϕ′′′ E[ Vi ] 6 (3 + 2γ) 2 − ϕ′′′ E[Vi ] 2 3 k k∞ | | 3 k k∞ Xi=1 Xi=1 γ 3 (d 1) 3/2 6 (3 + 2γ) 2 − d √d! ϕ′′′ √τn. 3 k k∞

As a final remark, let us observe that Theorem 2.3 contains the CLT as a special case. Indeed, fix d = 1 and let g : 1, . . . , n R be the function given by g (i) = 1 . We then have τ = 1/n. n { } → n √n n It is moreover clear that Q (g , G) (0, 1). Then, for any function ϕ : R R of class 3 with 1 n ∼ N → C ϕ′′′ < and any sequence X as in (i), Theorem 2.3 implies that k k∞ ∞ X1 + ... + Xn 1 y2/2 4 E ϕ ϕ(y)e− dy 6 max E[X1 ]/3, 1 ϕ′′′ , √n − √2π R { }k k∞    Z from which it is straightforward to deduce the CLT.

To go further. In [34], Theorem 2.1 is extended to the case where the target law is the centered Gamma law. In [48], there is a version of Theorem 2.1 in which the sequence G is replaced by P, a sequence of i.i.d. Poisson random variables. Finally, let us mention that both Theorems 2.1 and 2.3 have been extended to the free probability framework (see Section 11) in the reference [13].

3 Stein’s method

In this section, we shall introduce some basic features of the so-called Stein method, which is the first step toward the proof of the Fourth Moment Theorem 1.3. Actually, we will not need the full force of this method, only a basic estimate. 2 A random variable X is (0, 1) distributed if and only if E[eitX ] = e t /2 for all t R. This N − ∈ simple fact leads to the idea that a random variable X has a law which is close to (0, 1) if and 2 N only if E[eitX ] is approximately e t /2 for all t R. This last claim is nothing but the usual criterion − ∈ for the convergence in law through the use of characteristic functions. Stein’s seminal idea is somehow similar. He noticed in [52] that X is (0, 1) distributed if and N only if E[f (X) Xf(X)] = 0 for all function f belonging to a sufficiently rich class of functions (for ′ − instance, the functions which are 1 and whose derivative grows at most polynomially). He then C wondered whether a suitable quantitative version of this identity may have fruitful consequences. This is actually the case and, even for specialists (at least for me!), the reason why it works so well remains a bit mysterious. Surprisingly, the simple following statement (due to Stein [52]) happens to contain all the elements of Stein’s method that are needed for our discussion. (For more details or extensions of the method, one can consult the recent books [9, 32] and the references therein.) Lemma 3.1 (Stein, 1972; see [52]) Let N (0, 1) be a standard Gaussian random variable. ∼ N Let h : R [0, 1] be any . Define f : R R by → x → x2 a2 f(x) = e 2 h(a) E[h(N)] e− 2 da (3.15) − Z−∞ x2 ∞  a2 = e 2 h(a) E[h(N)] e− 2 da. (3.16) − − Zx  14 Then f is of class 1, and satisfies f(x) 6 π/2, f (x) 6 2 and C | | | ′ | f ′(x) = xf(x) + h(x) E[h(N)] p (3.17) − for all x R. ∈ Proof: The equality between (3.15) and (3.16) comes from + 1 ∞ a2 0 = E h(N) E[h(N)] = h(a) E[h(N)] e− 2 da. − √2π − Z   −∞  Using (3.16) we have, for x > 0: + x2 ∞ a2 xf(x) = xe 2 h(a) E[h(N)] e− 2 da − Zx 2 + 2 2 + 2 x ∞ a x ∞ a 6 xe 2 e− 2 da 6 e 2 ae− 2 da = 1. Zx Zx Using (3.15) we have, for x 6 0: x x2 a2 xf(x) = xe 2 h(a) E[h(N)] e− 2 da − Z−∞ 2 + 2 2  + 2 x ∞ a x ∞ a 6 x e 2 e− 2 da 6 e 2 ae− 2 da = 1. | | x x Z| | Z| | The identity (3.17) is readily checked. We deduce, in particular, that

f ′(x) 6 xf(x) + h(x) E[h(N)] 6 2 | | | | | − | for all x R. On the other hand, by (3.15)-(3.16), we have, for every x R, ∈ ∈ x x2/2 y2/2 ∞ y2/2 x2/2 ∞ y2/2 π f(x) 6 e min e− dy, e− dy = e e− dy 6 , | | x x 2 Z−∞ Z  Z| | r where the last inequality is obtained by observing that the function s : R+ R given by x2/2 y2/2 → s(x) = e x∞e− dy attains its maximum at x = 0 (indeed, we have

R x2/2 ∞ y2/2 x2/2 ∞ y2/2 s′(x) = xe e− dy 1 6 e ye− dy 1 = 0 − − Zx Zx so that s is decreasing on R+) and that s(0) = π/2. The proof of the lemma is complete. p To illustrate how Stein’s method is a powerful approach, we shall use it to prove the celebrated Berry-Esseen theorem. (Our proof is based on an idea introduced by Ho and Chen in [16], see also Bolthausen [5].)

Theorem 3.2 (Berry, Esseen, 1956; see [15]) Let X = (X1,X2,...) be a sequence of i.i.d. random variables with E[X ] = 0, E[X2] = 1 and E[ X 3] < , and define 1 1 | 1| ∞ n 1 V = X , n > 1, n √n k Xk=1 to be the associated sequence of normalized partial sums. Then, for any n > 1, one has x 3 1 u2/2 33 E[ X1 ] sup P (Vn 6 x) e− du 6 | | . (3.18) x R − √2π √n ∈ Z −∞

15 Remark 3.3 One may actually show that (3.18) holds with the constant 0.4784 instead of 33. This has been proved by Korolev and Shevtsova [18] in 2010. (They do not use Stein’s method.) On the other hand, according to Esseen [15] himself, it is impossible to expect a universal constant smaller than 0.4097.

Proof of (3.18). For each n > 2, let Cn > 0 be the best possible constant satisfying, for all i.i.d. random variables X ,...,X with E[ X 3] < , E[X2] = 1 and E[X ] = 0, that 1 n | 1| ∞ 1 1 x 3 1 u2/2 Cn E[ X1 ] sup P (Vn 6 x) e− du 6 | | . (3.19) x R − √2π √n ∈ Z −∞ 2 As a first (rough) estimation, we first observe that, since X1 is centered with E[X1 ] = 1, one has 3 2 3 E[ X ] > E[X ] 2 = 1, so that C 6 √n. This is of course not enough to conclude, since we need | 1| 1 n to show that Cn 6 33. For any x R and ε > 0, introduce the function ∈ 1 if u 6 x ε − h (u) = linear if x ε < u < x + ε . x,ε  −  0 if u > x + ε It is immediately checked that, for all n > 2, ε > 0 and x R, we have ∈

E[hx ε,ε(Vn)] 6 P (Vn 6 x) 6 E[hx+ε,ε(Vn)]. − Moreover, for N (0, 1), ε > 0 and x R, we have, using that the density of N is bounded by ∼ N ∈ 1 , √2π 4ε E[hx+ε,ε(N)] 6 E[hx ε,ε(N)] 6 P (N 6 x) − √2π − 4ε 6 E[hx+ε,ε(N)] 6 E[hx ε,ε(N)] + . − √2π Therefore, for all n > 2 and ε > 0, we have

x 1 u2/2 4ε sup P (Vn 6 x) e− du 6 sup E[hx,ε(Vn)] E[hx,ε(N)] + . x R − √2π x R − √2π ∈ Z ∈ −∞

Assume for the time being that, for all ε > 0, 3 3 2 6 E[ X1 ] 3 Cn 1 E[ X1 ] sup E[hx,ε(Vn)] E[hx,ε(N)] 6 | | + − | | . (3.20) x R | − | √n ε n ∈ We deduce that, for all ε > 0,

x 3 3 2 1 u2/2 6 E[ X1 ] 3 Cn 1 E[ X1 ] 4ε sup P (Vn 6 x) e− du 6 | | + − | | + . x R − √2π √n ε n √2π ∈ Z −∞

By choosing ε = Cn−1 E[ X 3], we get that n | 1| q x 3 1 u2/2 E[ X1 ] 4 sup P (Vn 6 x) e− du 6 | | 6 + 3 + Cn 1 , x R − √2π √n √2π − ∈ Z     −∞ p

16 4 so that Cn 6 6 + 3 + √Cn 1. It follows by induction that Cn 6 33 (recall that Cn 6 √n so √2π − that C2 6 33 in particular),  which is the desired conclusion. We shall now use Stein’s Lemma 3.1 to prove that (3.20) holds. Fix x R and ε > 0, and let ∈ f denote the Stein solution associated with h = hx,ε, that is, f satisfies (3.15). Observe that h is 1 π continuous, and therefore f is . Recall from Lemma 3.1 that f 6 and f ′ 6 2. Set C k k∞ 2 k k∞ also f(x) = xf(x), x R. We then have ∈ p π ef(x) f(y) = f(x)(x y) + (f(x) f(y))y 6 + 2 y x y . (3.21) − − − 2 | | | − | r 

On the e other hand,e set X V i = V i , i = 1, . . . , n. n n − √n

i Observe that Vn and Xi are independent by construction. One can thus write

E[h(Vn)] E[h(N)] = E[f ′(Vn) Vnf(Vn)] n − − 1 Xi = E f ′(V ) f(V ) n n − n √n i=1   Xn 1 i Xi i i = E f ′(V ) f(V ) f(V ) because E[f(V )X ] = E[f(V )]E[X ] = 0 n n − n − n √n n i n i Xi=1   n  2 1 i Xi Xi = E f ′(V ) f ′ V + θ with θ U independent of X ,...,X . n n − n √n n ∼ [0,1] 1 n Xi=1     We have f (x) = f(x) + h(x) E[h(N)], so that ′ − n E[h(V )] Ee[h(N)] = a (f) b (f) + a (h) b (h) , (3.22) n − i − i i − i i=1 X  e e where 1 X 1 a (g) = E[g(V ) g(V i)] and b (g) = E g V i + θ i g(V i) X2 . i n − n n i n √n − n i n      i (Here again, we have used that Vn and Xi are independent.) Hence, to prove that (3.20) holds true, we must bound four terms. 2 1 i i 2 1 1st term. One has, using (3.21) as well as E[ X ] 6 E[X ] 2 = 1 and E[ V ] 6 E[(V ) ] 2 6 1, | 1| 1 | n| n 1 π π 1 a (f) 6 E[ X ] + 2E[ X ]E[ V i ] 6 + 2 . i n√n | 1| 2 | 1| | n| 2 n√n  r  r 

e 1 2nd term. Similarly and because E[θ] = 2 , one has 1 π 1 π E[ X 3] b (f) 6 E[θ]E[ X 3] + 2E[θ]E[ X 3]E[ V i ] 6 + 1 | 1| . i n√n | 1| 2 | 1| | n| 2 2 n√n  r   r 

3rd term .e By definition of h, we have

1 v u 1 h(v) h(u) = (v u) h′(u + s(v u))ds = − E [x ε,x+ε](u + θ(v u)) , − − 0 − − 2ε − − Z h i b 17 with θ U independent of θ and X ,...,X , so that ∼ [0,1] 1 n b 1 1 i Xi ai(h) 6 E Xi [x ε,x+ε] Vn + θ 2ε n√n | | − √n   

1 y b i y = E Xi P x ε 6 Vn 6 x + ε 2ε n√n | | − √n − − √n b "   y=θXi #

1 y i y 6 sup P x ε 6 Vn 6 x + ε . 2ε n√n y R − √n − − √n ∈   We are thus left to bound P (a 6 V i 6 b) for all a, b R with a 6 b. For that, set n ∈ V i = 1 X , so that V i = 1 1 V i. We then have, using in particular (3.19) (with n √n 1 j=i j n n n − 6 − n 1 instead of n) and the fact thatq the standard Gaussian density is bounded by 1 , P √ e − e 2π

i a i b P (a 6 Vn 6 b) = P 6 Vn 6  1 1 1 1  − n − n q e q  a b = P 6 N 6  1 1 1 1  − n − n q q  a i b a b +P 6 Vn 6 P 6 N 6  1 1 1 1  −  1 1 1 1  − n − n − n − n  e 3    b qa 2 Cn 1 Eq[ X1 ] q q 6 − + − | | . √2π 1 1 √n 1 − n − q We deduce that

3 1 Cn 1 E[ X1 ] ai(h) 6 + − | | . √2πn√n 1 n√n√n 1 ε − − 4th term . Similarly, we have

1 3 1 i Xi bi(h) = E Xi θ [x ε,x+ε] Vn + θ θ 2n√nε − √n    3 E[ X1 ] y bi y 6 | | sup P x ε 6 Vn 6 x + ε 4n√nε y R − √n − − √n ∈   3 3 2 E[ X1 ] Cn 1 E[ X1 ] 6 | | + − | | . 2√2πn√n 1 2n√n√n 1 ε − − Plugging these four estimates into (3.22) and by using the fact that n > 2 (and therefore n 1 > n ) − 2 and E[ X 3] > 1, we deduce the desired conclusion. | 1| To go further. Stein’s method has developed considerably since its first appearance in 1972. A comprehensive and very nice reference to go further is the book [9] by Chen, Goldstein and Shao, in which several applications of Stein’s method are carefully developed.

18 4 Malliavin calculus in a nutshell

The second ingredient for the proof of the Fourth Moment Theorem 1.3 is the Malliavin calculus (the first one being Stein’s method, as developed in the previous section). So, let us introduce the reader to the basic operators of Malliavin calculus. For the sake of simplicity and to avoid technicalities that would be useless in this survey, we will only consider the case where the underlying (fixed once for all throughout the sequel) is a classical Brownian motion B = (Bt)t>0 defined on some probability space (Ω, ,P ); we further assume that the σ-field is generated by B. F F For a detailed exposition of Malliavin calculus (in a more general context) and for missing proofs, we refer the reader to the textbooks [32, 38].

Dimension one. In this first section, we would like to introduce the basic operators of Malliavin calculus in the simplest situation (where only one Gaussian random variable is involved). While easy, it is a sufficiently rich context to encapsulate all the essence of this theory. We first need to recall some useful properties of Hermite polynomials.

Proposition 4.1 The family (Hq)q N R[X] of Hermite polynomials has the following properties. ∈ ⊂ N (a) Hq′ = qHq 1 and Hq+1 = XHq qHq 1 for all q . − − − ∈ 1 2 1 x2/2 (b) The family Hq is an orthonormal basis of L (R, e− dx). √q! q N √2π   ∈ (c) Let (U, V ) be a Gaussian vector with U, V (0, 1). Then, for all k, l N, ∼ N ∈ q!E[UV ]q if p = q E[H (U)H (V )] = p q 0 otherwise.  Proof. This is well-known. For a proof, see, e.g., [32, Proposition 1.4.2].

2 Let ϕ : R R be an element of L2(R, 1 e x /2dx). Proposition 4.1(b) implies that ϕ may be → √2π − expanded (in a unique way) in terms of Hermite polynomials as follows:

∞ ϕ = aqHq. (4.23) Xq=0 When ϕ is such that qq!a2 < , let us define q ∞ P ∞ Dϕ = qaqHq 1. (4.24) − q=0 X

Since the Hermite polynomials satisfy Hq′ = qHq 1 (Proposition 4.1(a)), observe that −

Dϕ = ϕ′

(in the sense of distributions). Let us now define the Ornstein-Uhlenbeck semigroup (Pt)t>0 by

∞ qt Ptϕ = e− aqHq. (4.25) q=0 X

19 Plainly, P0 = Id, PtPs = Pt+s (s, t > 0) and

t DPt = e− PtD. (4.26)

Since (Pt)t>0 is a semigroup, it admits a generator L defined as d L = P . dt|t=0 t Of course, for any t > 0 one has that

d Pt+h Pt Ph Id Ph Id d Pt = lim − = lim Pt − = Pt lim − = Pt Ph = PtL, dt h 0 h h 0 h h 0 h dh → → → h=0

d and, similarly, dt Pt = LPt. Moreover, going back to the definition of (P t)t>0, it is clear that the 2 domain of L is the set of functions ϕ L2(R, 1 e x /2dx) such that q2q!a2 < and that, in ∈ √2π − q ∞ this case, P ∞ Lϕ = qa H . − q q q=0 X We have the following integration by parts formula, whose proof is straightforward (start with the case ϕ = Hp and ψ = Hq, and then use bilinearity and approximation to conclude in the general case) and left to the reader. Proposition 4.2 Let ϕ be in the domain of L and ψ be in the domain of D. Then

2 2 e x /2 e x /2 Lϕ(x)ψ(x) − dx = Dϕ(x)Dψ(x) − dx. (4.27) R √2π − R √2π Z Z

We shall now extend all the previous operators in a situation where, instead of dealing with a random variable of the form F = ϕ(N) (that involves only one Gaussian random variable N), we deal more generally with a random variable F that is measurable with respect to the Brownian motion (Bt)t>0.

Wiener integral. For any adapted† and square integrable u = (ut)t>0, let us denote by 0∞ utdBt its Itô integral. Recall from any standard textbook of stochastic analysis that the Itô integral is a linear functional that takes its values on L2(Ω) and has the following basic R features, coming mainly from the independence property of the increments of B:

∞ E usdBs = 0 (4.28) Z0  E ∞ u dB ∞ v dB = E ∞ u v ds . (4.29) s s × s s s s Z0 Z0  Z0  In the particular case where u = f L2(R ) is deterministic, we say that ∞ f(s)dB is the Wiener ∈ + 0 s integral of f; it is then easy to show that R ∞ f(s)dB 0, ∞ f 2(s)ds . (4.30) s ∼ N Z0  Z0  †Any adapted process u that is either càdlàg or càglàd admits a progressively measurable version. We will always assume that we are dealing with it.

20 Multiple Wiener-Itô integrals and Wiener chaoses. Let f L2(Rq ). Let us see how one ∈ + could give a ‘natural’ meaning to the q-fold multiple integral

B Iq (f) = f(s1, . . . , sq)dBs1 . . . dBsq . Rq Z + To achieve this goal, we shall use an iterated Itô integral; the following heuristic ‘calculations’ are thus natural within this framework:

f(s1, . . . , sq)dBs1 . . . dBsq Rq Z + 1 = f(s1, . . . , sq) s >...>s dBs1 . . . dBsq Rq { σ(1) σ(q)} σ Sq + X∈ Z s s ∞ σ(1) σ(q−1) = dBsσ(1) dBsσ(2) ... dBsσ(q) f(s1, . . . , sq) 0 0 0 σ Sq Z Z Z X∈ t1 tq−1 ∞ = dBt1 dBt2 ... dBtq f(tσ−1(1), . . . , tσ−1(q)) 0 0 0 σ Sq X∈ Z Z Z t1 tq−1 ∞ = dBt1 dBt2 ... dBtq f(tσ(1), . . . , tσ(q)). (4.31) 0 0 0 σ Sq Z Z Z X∈ B Now, we can use (4.31) as a natural candidate for being Iq (f). Definition 4.3 Let q > 1 be an integer. 1. When f L2(Rq ), we set ∈ + t1 tq−1 B ∞ Iq (f) = dBt1 dBt2 ... dBtq f(tσ(1), . . . , tσ(q)). (4.32) 0 0 0 σ Sq Z Z Z X∈ B The random variable Iq (f) is called the qth multiple Wiener-Itô integral of f. 2. The set B of random variables of the form IB(f), f L2(Rq ), is called the qth Wiener chaos Hq q ∈ + of B. We also use the convention B = R. H0 The following properties are readily checked.

2 Rq Proposition 4.4 Let q > 1 be an integer and let f L ( +). ∈ q 1. If f is symmetric (meaning that f(t , . . . , t ) = f(t , . . . , t ) for any t R and any 1 q σ(1) σ(q) ∈ + permutation σ S ), then ∈ q t1 tq−1 B ∞ Iq (f) = q! dBt1 dBt2 ... dBtq f(t1, . . . , tq). (4.33) Z0 Z0 Z0 2. We have

B B Iq (f) = Iq (f), (4.34) where f stands fore the symmetrization of f given by 1 fe(t , . . . , t ) = f(t , . . . , t ). (4.35) 1 q q! σ(1) σ(q) σ Sq X∈ e 21 3. For any p, q > 1, f L2(Rp ) and g L2(Rq ), ∈ + ∈ + B E[Iq (f)] = 0 (4.36) B B E[Ip (f)Iq (g)] = p! f, g L2(Rp ) if p = q (4.37) h i + B B E[Ip (f)Iq (g)] = 0 if p = q. (4.38) e e 6

The space L2(Ω) can be decomposed into the infinite orthogonal sum of the spaces B. (It is a Hq statement which is analogous to the content of Proposition 4.1(b), and it is precisely here that we need to assume that the σ-field is generated by B.) It follows that any square-integrable random F variable F L2(Ω) admits the following chaotic expansion: ∈

∞ B F = E[F ] + Iq (fq), (4.39) q=1 X where the functions f L2(Rq ) are symmetric and uniquely determined by F . In practice and q ∈ + when F is ‘smooth’ enough, one may rely on Stroock’s formula (see [53] or [38, Exercise 1.2.6]) to compute the functions fq explicitely.

The following result contains a very useful property of multiple Wiener-Itô integrals. It is in the same spirit as Lemma 2.4.

Theorem 4.5 (Nelson, 1973; see [21]) Let f L2(Rq ) with q > 1. Then, for all r > 2, ∈ + B r q r/2 r 6 q E Iq (f) [(r 1) q!] f L2(R ) < . (4.40) | | − k k + ∞   Proof. See, e.g., [32, Corollary 2.8.14]. (The proof uses the hypercontractivity property of (Pt)t>0 defined as (4.48).)

Multiple Wiener-Itô integrals are linear by construction. Let us see how they behave with respect to multiplication. To this aim, we need to introduce the concept of contractions.

2 Rp 2 Rq Definition 4.6 When r 1, . . . , p q , f L ( +) and g L ( +), we write f r g to indicate ∈ { ∧ } ∈ ∈ 2 Rp+q 2r ⊗ the rth contraction of f and g, defined as being the element of L ( + − ) given by

(f r g)(t1, . . . , tp+q 2r) (4.41) ⊗ − = f(t1, . . . , tp r, x1, . . . , xr)g(tp r+1, . . . , tp+q 2r, x1, . . . , xr)dx1 . . . dxr. Rr − − − Z + By convention, we set f g = f g as being the tensor product of f and g, that is, ⊗0 ⊗ (f g)(t , . . . , t ) = f(t , . . . , t )g(t , . . . , t ). ⊗0 1 p+q 1 p p+1 p+q Observe that

6 p q f r g L2(Rp+q−2r) f L2(R ) g L2(R ), r = 0, . . . , p q (4.42) k ⊗ k + k k + k k + ∧ by Cauchy-Schwarz, and that f p g = f, g L2(Rp ) when p = q. The next result is the fundamental ⊗ h i + product formula between two multiple Wiener-Itô integrals.

22 Theorem 4.7 Let p, q > 1 and let f L2(Rp ) and g L2(Rq ) be two symmetric functions. Then ∈ + ∈ + p q B B ∧ p q B Ip (f)Iq (g) = r! Ip+q 2r(f rg), (4.43) r r − ⊗ r=0 X    where f g stands for the contraction (4.41).e ⊗r Proof. Theorem 4.7 can be established by at least two routes, namely by induction (see, e.g., [38, page 12]) or by using the concept of diagonal measure in the context of the Engel-Rota-Wallstrom theory (see [45]). Let us proceed to a heuristic proof following this latter strategy. Going back to B B the very definition of Ip (f), we see that the diagonals are avoided. That is, Ip (f) can be seen as

B 1 Ip (f) = f(s1, . . . , sp) si=sj , i=j dBs1 . . . dBsp Rp { 6 6 } Z + B The same holds for Iq (g). Then we have (just as through Fubini)

B B 1 1 Ip (f)Iq (g) = f(s1, . . . , sp) si=sj , i=j g(t1, . . . , tq) ti=tj , i=j dBs1 . . . dBsp dBt1 . . . dBtq . Rp+q { 6 6 } { 6 6 } Z + While there is no diagonals in the first and second blocks, there are all possible mixed diagonals in the joint writing. Hence we need to take into account all these diagonals (whence the combinatorial coefficients in the statement, which count all possible diagonal sets of size r) and then integrate out 2 (using the rule (dBt) = dt). We thus obtain

p q ∧ B B p q Ip (f)Iq (g) = r! (f r g)(x1, . . . , xp+q 2r)dBx1 . . . dBxp+q−2r r r p+q−2r ⊗ − r=0 R+ X    Z which is exactly the claim (4.43).

Malliavin derivatives. We shall extend the operator D introduced in (4.24). Let F L2(Ω) ∈ and consider its chaotic expansion (4.39). Definition 4.8 1. When m > 1 is an integer, we say that F belongs to the Sobolev-Watanabe space Dm,2 if

∞ m 2 q q! fq L2(Rq ) < . (4.44) k k + ∞ Xq=1

2. When (4.44) holds with m = 1, the Malliavin derivative DF = (DtF )t>0 of F is the element of L2(Ω R ) given by × +

∞ B DtF = qIq 1 (fq( , t)) . (4.45) − · q=1 X 3. More generally, when (4.44) holds with an m bigger than or equal to 2 we define the mth Malliavin m 2 m derivative D F = (D F ) > of F as the element of L (Ω R ) given by t1,...,tm t1,...,tm 0 × +

∞ B Dt1,...,tm F = q(q 1) ... (q m + 1)Iq m (fq( , t1, . . . , tm)) . (4.46) − − − · q=m X

23 The power 2 in the notation Dm,2 is because it is related to the space L2(Ω). (There exists a space Dm,p related to Lp(Ω) but we will not use it in this survey.) On the other hand, it is clear by construction that D is a linear operator. Also, using (4.37)-(4.38) it is easy to compute the L2-norm of DF in terms of the kernels fq appearing in the chaotic expansion (4.39) of F : Proposition 4.9 Let F D1,2. We have ∈

2 ∞ 2 E DF L2(R ) = qq! fq 2 Rq . k k + k kL ( +) q=1 h i X Proof. By (4.45), we can write

2 2 ∞ B E DF 2 = E qI (f ( , t)) dt L (R+) q 1 q k k R+  − ·   h i Z Xq=1    ∞ B B = pq E Ip 1 (fp( , t)) Iq 1 (fq( , t)) dt. R − · − · p,q=1 Z + X   Using (4.38), we deduce that

2 ∞ 2 B 2 E DF L2(R ) = q E Iq 1 (fq( , t)) dt. k k + R − · q=1 + h i X Z h i Finally, using (4.37), we get that

2 ∞ 2 2 ∞ 2 E DF L2(R ) = q (q 1)! fq( , t) 2 Rq−1 dt = qq! fq 2 Rq . + L ( + ) L ( +) k k − R+ k · k k k h i Xq=1 Z Xq=1

Let H be the qth Hermite polynomial (for some q > 1) and let e L2(R ) have norm 1. Recall q ∈ + (1.10) and Proposition 4.1(a). We deduce that, for any t > 0,

∞ B q B q 1 Dt Hq e(s)dWs = Dt(Iq (e⊗ )) = qIq 1(e⊗ − )e(t) −  Z0  ∞ ∞ ∞ = qHq 1 e(s)dBs e(t) = Hq′ e(s)dBs Dt e(s)dBs . − Z0  Z0  Z0  More generally, the Malliavin derivative D verifies the chain rule:

Theorem 4.10 Let ϕ : R R be both of class 1 and Lipschitz, and let F D1,2. Then, → C ∈ ϕ(F ) D1,2 and ∈

Dtϕ(F ) = ϕ′(F )DtF, t > 0. (4.47)

Proof. See, e.g., [38, Proposition 1.2.3].

Ornstein-Uhlenbeck semigroup. We now introduce the extension of (4.25) in our infinite- dimensional setting.

24 Definition 4.11 The Ornstein-Uhlenbeck semigroup is the family of linear operators (Pt)t>0 defined on L2(Ω) by

∞ qt B PtF = e− Iq (fq), (4.48) q=0 X where the symmetric kernels fq are given by (4.39).

A crucial property of (Pt)t>0 is the Mehler formula, that gives an alternative and often useful representation formula for Pt. To be able to state it, we need to introduce a further notation. Let (B,B′) be a two-dimensional Brownian motion defined on the product probability space (Ω, , P) = (Ω Ω , F F ,P P ). Let F L2(Ω). Since F is measurable with respect to F × ′ ⊗ ′ × ′ ∈ the Brownian motion B, we can write F = ΨF (B) with ΨF a measurable mapping determined P B 1 a.s.. As a consequence, for any t > 0 the random variable Ψ (e tB + √1 e 2tB ) is ◦ − F − − − ′ well-defined P P a.s. (note indeed that e tB + √1 e 2tB is again a Brownian motion for any × ′ − − − ′ t > 0). We then have the following formula.

Theorem 4.12 (Mehler’s formula) For every F = F (B) L2(Ω) and every t > 0, we have ∈ t 2t P (F ) = E′ Ψ (e− B + 1 e B′) , (4.49) t F − −  p  where E′ denotes the expectation with respect to P ′.

Proof. By using standard arguments, one may show that the linear span of random variables F having the form F = exp ∞ h(s)dB with h L2(R ) is dense in L2(Ω). Therefore, it suffices 0 s ∈ + to consider the case where F has this particular form. On the other hand, we have the following R identity, see, e.g., [32, Proposition 1.4.2(vi)]: for all c, x R, ∈ q cx c2/2 ∞ c e − = H (x), q! q q=0 X h(s) with H the qth Hermite polynomial. By setting c = h 2 R = h and x = ∞ dB , we q k kL ( +) k k 0 h s deduce that k k R q ∞ 1 h 2 ∞ h ∞ h(s) exp h(s)dB = e 2 k k k k H dB , s q! q h s 0 q=0 0 Z X Z k k  implying in turn, using (1.10), that

∞ 1 h 2 ∞ 1 B q exp h(s)dB = e 2 k k I h⊗ . (4.50) s q! q Z0 q=0 X 

Thus, for F = exp 0∞ h(s)dBs,

R qt 1 h 2 ∞ e− B q P F = e 2 k k I h⊗ . t q! q q=0 X 

25 On the other hand,

t 2t ∞ t 2t E′ Ψ (e− B + 1 e B′) = E′ exp h(s)(e− dB + 1 e dB′ ) F − − s − − s  Z0   p  2t p t ∞ 1 e− 2 = exp e− h(s)dB exp − h s 2 k k  Z0    2t −2t qt 1 e− 2 e h 2 ∞ e− B q = exp − h e 2 k k I h⊗ by (4.50) 2 k k q! q   q=0 X  = PtF. The desired conclusion follows.

Generator of the Ornstein-Uhlenbeck semigroup. Recall the definition (4.44) of the Sobolev-Watanabe spaces Dm,2, m > 1, and that the symmetric kernels f L2(Rq ) are uniquely q ∈ + defined through (4.39).

Definition 4.13 1. The generator of the Ornstein-Uhlenbeck semigroup is the linear operator L defined on D2,2 by

∞ LF = qIB(f ). − q q Xq=0 1 2 2. The pseudo-inverse of L is the linear operator L− defined on L (Ω) by

1 ∞ 1 B L− F = I (f ). − q q q Xq=1 It is obvious that, for any F L2(Ω), we have that L 1F D2,2 and ∈ − ∈ 1 LL− F = F E[F ]. (4.51) − 1 Our terminology for L− is explained by the identity (4.51). Another crucial property of L is contained in the following result, which is the exact generalization of Proposition 4.2.

Proposition 4.14 Let F D2,2 and G D1,2. Then ∈ ∈

E[LF G] = E[ DF,DG 2 R ]. (4.52) × − h iL ( +) B B Proof. By bilinearity and approximation, it is enough to show (4.52) for F = Ip (f) and G = Iq (g) with p, q > 1 and f L2(Rp ), g L2(Rq ) symmetric. When p = q, we have ∈ + ∈ + 6 E[LF G] = pE[IB(f)IB(g)] = 0 × − p q and

∞ B B 2 E[ DF,DG L (R+)] = pq E[Ip 1(f( , t))Iq 1(g( , t))]dt = 0 h i − · − · Z0 by (4.38), so the desired conclusion holds true in this case. When p = q, we have

B B E[LF G] = pE[Ip (f)Ip (g)] = pp! f, g L2(Rp ) × − − h i +

26 and

2 ∞ B B 2 E[ DF,DG L (R+)] = p E[Ip 1(f( , t))Ip 1(g( , t))]dt h i − · − · Z0 ∞ 2 p = p (p 1)! f( , t), g( , t) L2 (Rp−1)dt = pp! f, g L2(R ) − h · · i + h i + Z0 by (4.37), so the desired conclusion holds true also in this case.

We are now in position to state and prove an integration by parts formula which will play a crucial role in the sequel. Theorem 4.15 Let ϕ : R R be both of class 1 and Lipschitz, and let F D1,2 and G L2(Ω). → C ∈ ∈ Then

1 Cov G, ϕ(F ) = E ϕ′(F ) DF, DL− G 2 R . (4.53) h − iL ( +) Proof. Using the assumptions  made on F and ϕ, we can write: 1 Cov G, ϕ(F ) = E L(L− G) ϕ(F ) (by (4.51)) × 1 = E Dϕ(F ), DL− G 2 R (by (4.52))   h −  iL ( +) 1 = E ϕ′(F ) Dϕ(F ), DL− G 2 R (by (4.47)),  h − iL ( +) which is the announced formula. 

Theorem 4.15 admits a useful extension to indicator functions. Before stating and proving it, we recall the following classical result from measure theory. Proposition 4.16 Let C be a Borel set in R, assume that C [ A, A] for some A > 0, and let ⊂ − µ be a finite measure on [ A, A]. Then, there exists a sequence (h ) of continuous functions with − n support included in [ A, A] and such that hn(x) [0, 1] and 1C (x) = limn hn(x) µ-a.e. − ∈ →∞ Proof. This is an immediate corollary of Lusin’s theorem, see e.g. [50, page 56].

Corollary 4.17 Let C be a Borel set in R, assume that C [ A, A] for some A > 0, and let ⊂ − F D1,2 be such that E[F ] = 0. Then ∈ F 1 E F 1 (x)dx = E 1 (F ) DF, DL− F 2 R . C C h − iL ( +)  Z  −∞   Proof. Let λ denote the Lebesgue measure and let PF denote the law of F . By Proposition 4.16 with µ = (λ + PF ) [ A,A] (that is, µ is the restriction of λ + PF to [ A, A]), there is a sequence | − − (h ) of continuous functions with support included in [ A, A] and such that h (x) [0, 1] and n − n ∈ 1C (x) = limn hn(x) µ-a.e. In particular, 1C (x) = limn hn(x) λ-a.e. and PF -a.e. By Theorem →∞ →∞ 4.15, we have moreover that

F 1 E F h (x)dx = E h (F ) DF, DL− F 2 R . n n h − iL ( +)  Z  −∞   The dominated convergence applies and yields the desired conclusion.

As a corollary of both Theorem 4.15 and Corollary 4.17, we shall prove that the law of any multiple Wiener-Itô integral is always absolutely continuous with respect to the Lebesgue measure except, of course, when its kernel is identically zero.

27 Corollary 4.18 (Shigekawa; see [51]) Let q > 1 be an integer and let f be a non zero element of 2 Rq B L ( +). Then the law of F = Iq (f) is absolutely continuous with respect to the Lebesgue measure. Proof. Without loss of generality, we further assume that f is symmetric. The proof is by induction B 2 on q. When q = 1, the desired property is readily checked because I (f) (0, f 2 R ), see 1 ∼ N k kL ( +) (4.30). Now, let q > 2 and assume that the statement of Corollary 4.18 holds true for q 1, that is, B − 2 Rq 1 assume that the law of Iq 1(g) is absolutely continuous for any symmetric element g of L ( +− ) − q 2 R q such that g L2(Rq−1) > 0. Let f be a symmetric element of L ( +) with f L2(R ) > 0. Let k k + k k + 2 R ∞ q−1 h L ( ) be such that 0 f( , s)h(s)ds L2(R ) = 0. (Such an h necessarily exists because, ∈ · + 6 otherwise, we would have that f( , s) = 0 for almost all s > 0 which, by symmetry, would imply R · that f 0; this would be in contradiction with our assumption.) Using the induction assumption, ≡ we have that the law of

∞ B ∞ 2 DF, h L (R+) = DsF h(s)ds = qIq 1 f( , s)h(s)ds h i − · Z0 Z0  is absolutely continuous with respect to the Lebesgue measure. In particular,

P ( DF, h 2 R = 0) = 0, h iL ( +) implying in turn, because DF 2 R = 0 DF, h 2 R = 0 , that {k kL ( +) } ⊂ {h iL ( +) } P ( DF 2 R > 0) = 1. (4.54) k kL ( +) Now, let C be a Borel set in R. Using Corollary 4.17, we can write, for every n > 1,

1 2 1 1 1 2 E C [ n,n](F ) DF L2(R ) = E C [ n,n](F ) DF, DL− F L (R+) ∩ − q k k + ∩ − h − i    F  1 = E F C [ n,n](y)dy . ∩ −  Z−∞  Assume that the Lebesgue measure of C is zero. The previous equality implies that

1 1 2 E C [ n,n](F ) DF L2(R ) = 0, n > 1. ∩ − q k k +   But (4.54) holds as well, so P (F C [ n, n]) = 0 for all n > 1. By monotone convergence, we ∈ ∩ − actually get P (F C) = 0. This shows that the law of F is absolutely continuous with respect to ∈ the Lebesgue measure. The proof of Corollary 4.18 is concluded.

To go further. In the literature, the most quoted reference on Malliavin calculus is the excellent book [38] by Nualart. It contains many applications of this theory (such as the study of the smoothness of probability laws or the anticipating ) and constitutes, as such, an unavoidable reference to go further.

5 Stein meets Malliavin

We are now in a position to prove the Fourth Moment Theorem 1.3. As we will see, to do so we will combine the results of Section 3 (Stein’s method) with those of Section 4 (Malliavin calculus), thus explaining the title of the current section! It is a different strategy with respect to the original proof, which is based on the use of the Dambis-Dubins-Schwarz theorem. We start by introducing the distance we shall use to measure the closeness of the laws of random variables.

28 Definition 5.1 The total variation distance between the laws of two real-valued random variables Y and Z is defined by

dTV (Y,Z) = sup P (Y C) P (Z C) , (5.55) C B(R) ∈ − ∈ ∈ where B(R) stands for the set of Borel sets in R.

When C B(R), we have that P (Y C [ n, n]) P (Y C) and P (Z C [ n, n]) ∈ ∈ ∩ − → ∈ ∈ ∩ − → P (Z C) as n by the monotone convergence theorem. So, without loss we may restrict the ∈ → ∞ supremum in (5.55) to be taken over bounded Borel sets, that is,

dTV (Y,Z) = sup P (Y C) P (Z C) . (5.56) C B(R) ∈ − ∈ C bounded∈

We are now ready to derive a bound for the Gaussian approximation of any centered element F belonging to D1,2.

Theorem 5.2 (Nourdin, Peccati, 2009; see [27]) Consider F D1,2 with E[F ] = 0. Then, ∈ with N (0, 1), ∼ N 1 d (F,N) 6 2 E 1 DF, DL− F 2 R . (5.57) TV − h − iL ( +) Proof. Let C be a bounded Borel set in R. Let A > 0 be such that C [ A, A]. Let λ denote the ⊂ − Lebesgue measure and let PF denote the law of F . By Proposition 4.16 with µ = (λ + PF ) [ A ,A] | − − (the restriction of λ + P to [ A, A]), there is a sequence (h ) of continuous functions such F − n that hn(x) [0, 1] and 1C (x) = limn hn(x) µ-a.e. By the dominated convergence theorem, ∈ →∞ E[h (F )] P (F C) and E[h (N)] P (N C) as n . On the other hand, using Lemma n → ∈ n → ∈ → ∞ 3.1 (and denoting by fn the function associated with hn) as well as (4.53) we can write, for each n,

E[hn(F )] E[hn(N)] = E[fn′ (F )] E[F fn(F )] − − 1 = E[f ′ (F )(1 DF, DL− F 2 R ] n − h − iL ( +) 1 6 2 2 E 1 DF, DL− F L (R+) . | − h − i | Letting n goes to infinity yields  

1 P (F C) P (N C) 6 2 E 1 DF, DL− F 2 R , ∈ − ∈ | − h − iL ( +)|   which, together with (5.56), implies the desired conclusion. Wiener chaos and the Fourth Moment Theorem. In this section, we apply Theorem 5.2 to a chaotic random variable F , that is, to a random variable having the specific form of a multiple Wiener-Itô integral. We begin with a technical lemma which, among other, shows that the fourth moment of F is necessarily greater than 3E[F 2]2. We recall from Definition 4.6 the meaning of f f. ⊗r 2 Rq B Lemma 5.3 Let q > 1 be an integer and consider a symmetric function f L ( +). Set F = Iq (f) e 2 2 2 ∈ and σ = E[F ] = q! f L2(Rq ). The following two identities hold: k k + 2 q 1 4 − 2 2 1 2 r 2 q 2 E σ DF 2 = r! (2q 2r)! f f 2q−2r L (R+) 2 r L2(R ) " − q k k # q r − k ⊗ k +   Xr=1   e (5.58)

29 and

q 1 4 − 4 4 3 2 q 2 E[F ] 3σ = rr! (2q 2r)! f rf 2 R2q−2r (5.59) − q r − k ⊗ kL ( + ) Xr=1   q 1 2 − e 2 q 2 2q 2r 2 = q! f r f 2 R2q−2r + − f rf 2 R2q−2r . r k ⊗ kL ( + ) q r k ⊗ kL ( + ) r=1 X     −   e (5.60) In particular,

2 2 1 2 q 1 4 4 E σ DF 2 R 6 − E[F ] 3σ . (5.61) − q k kL ( +) 3q − "  #  Proof. We follow [28] for (5.58)-(5.59) and [40] for (5.60). For any t > 0, we have DtF = B qIq 1 f( , t) so that, using (4.43), − ·

1 2 ∞ B 2 DF L2(R ) = q Iq 1 f( , t) dt q k k + − · Z0 q 1 2  − ∞ q 1 B = q r! − I2q 2 2r f( , t) rf( , t) dt r − − · ⊗ · Z0 r=0   X  q 1 2 − e ∞ q 1 B = q r! − I2q 2 2r f( , t) r f( , t) dt r − − · ⊗ · Z0 r=0   X  q 1 2 − q 1 B ∞ = q r! − I2q 2 2r f( , t) r f( , t)dt r − − · ⊗ · r=0 0 X   Z  q 1 2 − q 1 B = q r! − I2q 2 2r(f r+1 f) r − − ⊗ r=0 X   q 2 q 1 B = q (r 1)! − I2q 2r(f r f) − r 1 − ⊗ r=1 X  −  q 1 2 − 2 q 1 B = q! f L2(Rq ) + q (r 1)! − I2q 2r(f r f). (5.62) k k + − r 1 − ⊗ r=1 X  −  2 2 2 Since E[F ] = q! f 2 Rq = σ , the identity (5.58) follows now from (5.62) and the orthogonality k kL ( +) properties of multiple Wiener-Itô integrals. Recall the hypercontractivity property (4.40) of multiple Wiener-Itô integrals, and observe the relations L 1F = 1 F and D(F 3) = 3F 2DF . By combining − − q formula (4.53) with an approximation argument (the derivative of ϕ(x) = x3 being not bounded), we infer that

4 3 3 2 2 E[F ] = E F F = E F DF 2 R . (5.63) × q k kL ( +)     Moreover, the multiplication formula (4.43) yields

q 2 2 B 2 q B F = Iq (f) = s! I2q 2s(f sf). (5.64) s − ⊗ Xs=0   e 30 By combining this last identity with (5.62) and (5.63), we obtain (5.59) and finally (5.61). It remains to prove (5.60). Let σ be a permutation of 1,..., 2q (this fact is written in symbols as σ S ). { } ∈ 2q If r 0, . . . , q denotes the cardinality of σ(1), . . . , σ(q) 1, . . . , q then it is readily checked ∈ { } { } ∩ { } that r is also the cardinality of σ(q + 1), . . . , σ(2q) q + 1,..., 2q and that { } ∩ { }

f(t1, . . . , tq)f(tσ(1), . . . , tσ(q))f(tq+1, . . . , t2q) R2q Z + f(t , . . . , t )dt . . . dt × σ(q+1) σ(2q) 1 2q 2 = (f r f)(x1, . . . , x2q 2r) dx1 . . . dx2q 2r R2q−2r ⊗ − − Z + 2 = f r f 2 R2q−2r . (5.65) k ⊗ kL ( + )

Moreover, for any fixed r 0, . . . , q , there are q 2(q!)2 permutations σ S such that ∈ { } r ∈ 2q # σ(1), . . . , σ(q) 1, . . . , q = r. (Indeed, such a permutation is completely determined by { } ∩ { }  the choice of: (a) r distinct elements y , . . . , y of 1, . . . , q ; (b) q r distinct elements y , . . . , y 1 r { } − r+1 q of q + 1,..., 2q ; (c) a bijection between 1, . . . , q and y , . . . , y ; (d) a bijection between { } { } { 1 q} q + 1,..., 2q and 1,..., 2q y , . . . , y .) Now, observe that the symmetrization of f f is { } { }\{ 1 q} ⊗ given by 1 f f(t , . . . , t ) = f(t , . . . , t )f(t , . . . , t ). ⊗ 1 2q (2q)! σ(1) σ(q) σ(q+1) σ(2q) σ S2q X∈ e Therefore,

2 f f 2 R2q k ⊗ kL ( + ) 1 = e f(tσ(1), . . . , tσ(q))f(tσ(q+1), . . . , tσ(2q)) 2 2q (2q)! ′ R σ,σ S2q Z + X∈ f(t ′ , . . . , t ′ )f(t ′ , . . . , t ′ )dt . . . dt × σ (1) σ (q) σ (q+1) σ (2q) 1 2q 1 = f(t1, . . . , tq)f(tq+1, . . . , t2q) (2q)! R2q σ S2q Z + X∈ f(t , . . . , t )f(t , . . . , t )dt . . . dt × σ(1) σ(q) σ(q+1) σ(2q) 1 2q q 1 = f(t1, . . . , tq)f(tq+1, . . . , t2q) (2q)! R2q r=0 σ S2q Z + X σ(1),...,σ(qX∈) 1,...,q =r { }∩{ } f(t , . . . , t )f(t , . . . , t )dt . . . dt . × σ(1) σ(q) σ(q+1) σ(2q) 1 2q Using (5.65), we deduce that

q 1 2 − 2 2 4 2 q 2 (2q)! f f 2 R2q = 2(q!) f L2(Rq ) + (q!) f r f 2 R2q−2r . (5.66) k ⊗ kL ( + ) k k + r k ⊗ kL ( + ) r=1 X   e Using the orthogonality and isometry properties of multiple Wiener-Itô integrals, the identity (5.64)

31 yields q 4 4 2 q 2 E[F ] = (r!) (2q 2r)! f rf 2 R2q−2r r − k ⊗ kL ( + ) r=0   X 2 2 4 q = (2q)! f f 2 R2q + (q!) ef L2(R ) k ⊗ kL ( + ) k k + q 1 4 − e 2 q 2 + (r!) (2q 2r)! f rf 2 R2q−2r . r − k ⊗ kL ( + ) r=1 X   e 2 4 2 2 4 By inserting (5.66) in the previous identity (and because (q!) f L2(Rq ) = E[F ] = σ ), we get k k + (5.60). As a consequence of Lemma 5.3, we deduce the following bound on the total variation distance for the Gaussian approximation of a normalized multiple Wiener-Itô integral. This is nothing but Theorem 1.5 but we restate it for convenience. Theorem 5.4 (Nourdin, Peccati, 2009; see [27]) Let q > 1 be an integer and consider a symmetric function f L2(Rq ). Set F = IB(f), assume that E[F 2] = 1, and let N (0, 1). ∈ + q ∼ N Then q 1 d (F,N) 6 2 − E[F 4] 3 . (5.67) TV 3q − r

1 1 1 1 2 Proof. Since L− F = F , we have DF, DL− F L2(R ) = DF 2 R . So, we only need to − q h − i + q k kL ( +) apply Theorem 5.2 and then formula (5.61) to conclude. The estimate (5.67) allows to deduce an easy proof of the following characterization of CLTs on Wiener chaos. (This is the Fourth Moment Theorem 1.3 of Nualart and Peccati!). We note that our proof differs from the original one, which is based on the use of the Dambis-Dubins-Schwarz theorem.

Corollary 5.5 (Nualart, Peccati, 2005; see [40]) Let q > 1 be an integer and consider a sequence (f ) of symmetric functions of L2(Rq ). Set F = IB(f ) and assume that E[F 2] σ2 > 0 n + n q n n → as n . Then, as n , the following three assertions are equivalent: → ∞ → ∞ Law (i) F N (0, σ2); n → ∼ N (ii) E[F 4] E[N 4] = 3σ4; n →

(iii) fn rfn L2(R2q−2r ) 0 for all r = 1, . . . , q 1. k ⊗ k + → −

(iv) fn e r fn L2(R2q−2r ) 0 for all r = 1, . . . , q 1. k ⊗ k + → − 2 2 Proof. Without loss of generality, we may and do assume that σ = 1 and E[Fn ] = 1 for all n. The implication (ii) (i) is a direct application of Theorem 5.4. The implication (i) (ii) → → comes from the Continuous Mapping Theorem together with an approximation argument (observe that sup E[F 4] < by the hypercontractivity relation (4.40)). The equivalence between n>1 n ∞ (ii) and (iii) is an immediate consequence of (5.59). The implication (iv) (iii) is obvious (as → f f 6 f f ) whereas the implication (ii) (iv) follows from (5.60). k n⊗r nk k n ⊗r nk → Quadratice variation of the fractional Brownian motion. In this section, we aim to illustrate Theorem 5.2 in a concrete situation. More precisely, we shall use Theorem 5.2 in order to

32 derive an explicit bound for the second-order approximation of the quadratic variation of a fractional Brownian motion on [0, 1]. H H Let B = (B ) > be a fractional Brownian motion with Hurst index H (0, 1). This means t t 0 ∈ that BH is a centered Gaussian process with covariance function given by 1 E[BH BH] = t2H + s2H t s 2H , s, t > 0. t s 2 − | − |  It is easily checked that BH is selfsimilar of index H and has stationary increments. Fractional Brownian motion has been successfully used in order to model a variety of natural phenomena coming from different fields, including hydrology, biology, medicine, economics or traffic networks. A natural question is thus the identification of the Hurst parameter from real data. To do so, it is popular and classical to use the quadratic variation (on, say, [0, 1]), which is observable and given by

n 1 − S = (BH BH )2, n > 1. n (k+1)/n − k/n Xk=0 One may prove (see, e.g., [25, (2.12)]) that

2H 1 proba n − S 1 as n . (5.68) n → → ∞

We deduce that the estimator Hn, defined as 1 log S H = n , b n 2 − 2 log n

b proba satisfies H 1 as n . To study the asymptotic normality, consider n → → ∞ 2H n 1 n 1 b n − H H 2 2H (law) 1 − H H 2 Fn = (B(k+1)/n Bk/n) n− = (Bk+1 Bk ) 1 , σn − − σn − − k=0 k=0 X   X   2 where σn > 0 is so that E[Fn ] = 1. We then have the following result. 2 2 Theorem 5.6 Let N (0, 1) and assume that H 6 3/4. Then, limn σn/n = 2 r Z ρ (r) if ∼ N →∞ H (0, 3 ), with ρ : Z R given by ∈ ∈ 4 → P 1 ρ(r) = r + 1 2H + r 1 2H 2 r 2H , (5.69) 2 | | | − | − | | 2 9 3  and limn σn/(n log n) = if H = . Moreover, there exists a constant cH > 0 (depending only →∞ 16 4 on H) such that, for every n > 1,

1 if H (0, 5 ) √n ∈ 8   (log n)3/2 5  if H = 8  √n dTV (Fn,N) 6 cH  . (5.70) ×   n4H 3 if H ( 5 , 3 ) − ∈ 8 4   1 if H = 3  log n 4    33 As an immediate consequence of Theorem 5.6, provided H < 3/4 we obtain that

2H 1 law 2 √n n − S 1 0, 2 ρ (r) as n , (5.71) n − → N → ∞ r Z  X∈  implying in turn 1 √n log n H H law 0, ρ2(r) as n . (5.72) n − → N 2 → ∞ r Z  X∈  b Indeed, we can write x u dv log x = x 1 du for all x > 0, − − v2 Z1 Z1 so that (by considering x > 1 and 0 < x < 1)

(x 1)2 1 log x + 1 x 6 − 1 + for all x > 0. − 2 x2  

As a result,

√n 2H 1 √n 2H 1 √n log n H H = log(n − S ) = (n − S 1) + R n − − 2 n − 2 n − n  with b

2H 1 2 √n(n − Sn 1) 1 R 6 − 1 + . | n| 4√n (n2H 1S )2   − n  proba Using (5.68) and (5.71), it is clear that R 0 as n and then that (5.72) holds true. n → → ∞ Now we have motivated it, let us go back to the proof of Theorem 5.6. To perform our calculations, we will mainly follow ideas taken from [3]. We first need the following ancillary result. Z 1 Lemma 5.7 1. For any r , let ρ(r) be defined by (5.69). If H = 2 , one has ρ(r) 2H 2 ∈ 1 > 6 ∼ H(2H 1) r − as r . If H = 2 and r 1, one has ρ(r) = 0. Consequently, −2 | | | | → ∞ | | r Z ρ (r) < if and only if H < 3/4. ∈ ∞ n 1 α nα+1 2. PFor all α > 1, we have − r as n . − r=1 ∼ α+1 → ∞ Proof. 1. The sequence ρ is symmetric,P that is, one has ρ(n) = ρ( n). When r , − → ∞ 2H 2 2H 2 ρ(r) = H(2H 1)r − + o(r − ). − 2 Using the usual criterion for convergence of Riemann sums, we deduce that Z ρ (r) < if and r ∞ only if 4H 4 < 1 if and only if H < 3 . ∈ − − 4 P 2. For α > 1, we have: − n 1 r α 1 1 xαdx = as n . n n → α + 1 → ∞ r=1 0 X   Z α+1 We deduce that n rα n as n . r=1 ∼ α+1 → ∞ We are now inP position to prove Theorem 5.6.

34 Proof of Theorem 5.6. Without loss of generality, we will rather use the second expression of Fn:

n 1 1 − H H 2 Fn = (Bk+1 Bk ) 1 . σn − − k=0 X   H 2 Consider the linear span of (Bk )k N, that is, is the closed linear subspace of L (Ω) generated H H ∈ H by (Bk )k N. It is a real separable Hilbert space and, consequently, there exists an isometry Φ: L∈2(R ). For any k N, set e = Φ(BH BH); we then have, for all k, l N, H → + ∈ k k+1 − k ∈

∞ e (s)e (s)ds = E[(BH BH )(BH BH)] = ρ(k l) (5.73) k l k+1 − k l+1 − l − Z0 with ρ given by (5.69). Therefore,

BH BH : k N law= ∞ e (s)dB : k N = IB(e ): k N , { k+1 − k ∈ } k s ∈ 1 k ∈ Z0   where B is a Brownian motion and IB( ), p > 1, stands for the pth multiple Wiener-Itô integral p · associated to B. As a consequence we can, without loss of generality, replace Fn by

n 1 1 − B 2 Fn = I1 (ek) 1 . σn − k=0 X h  i Now, using the multiplication formula (4.43), we deduce that

n 1 B 1 − Fn = I2 (fn), with fn = ek ek. σn ⊗ Xk=0 By using the same arguments as in the proof of Theorem 1.1, we obtain the exact value of σn:

n 1 − σ2 = 2 ρ2(k l) = 2 (n r )ρ2(r). n − − | | k,l=0 r

2 σn 2 r 1 = 2 ρ (r) 1 | | r

35 and 9 9n r ρ2(r) 1 | | ∼ 64 ∼ 32 r

On the other hand, recall that the convolution of two sequences u(n) n Z and v(n) n Z is the { } ∈ { } ∈ sequence u v defined as (u v)(j) = n Z u(n)v(j n), and observe that (u v)(l i) = ∗ ∗ ∈ − Z ∗ − k Z u(k l)v(k i) whenever u(n) = u( n) and v(n) = v( n) for all n . Set ∈ − − P− − ∈ P 1 Z ρn(k) = ρ(k) k 6n 1 , k , n > 1. | | {| | − } ∈ We then have (using (5.58) for the first equality, and noticing that f f = f f ), n ⊗1 n n⊗1 n 2 1 B 2 e E 1 D[I (fn)] 2 R − 2k 2 kL ( +) "  # n 1 2 8 − = 8 fn 1 fn L2(R2 ) = 4 ρ(k l)ρ(i j)ρ(k i)ρ(l j) k ⊗ k + σn − − − − i,j,k,lX=0 n 1 8 − 6 ρ (k l)ρ (i j)ρ (k i)ρ (l j) σ4 n − n − n − n − n i,l=0 j,k Z X X∈ n 1 8 − 2 8n 2 8n 2 = (ρ ρ )(l i) 6 (ρ ρ )(k) = ρ ρ 2 Z . σ4 n ∗ n − σ4 n ∗ n σ4 k n ∗ nkℓ ( ) n i,l=0 n k Z n X X∈ > 1 1 1 Recall Young’s inequality: if s, p, q 1 are such that p + q = 1 + s , then

u v s Z 6 u p Z v q Z . (5.76) k ∗ kℓ ( ) k kℓ ( )k kℓ ( ) 4 2 6 4 Let us apply (5.76) with u = v = ρn, s = 2 and p = 3 . We get ρn ρn ℓ2(Z) ρn 4 , so that k ∗ k k kℓ 3 (Z)

3 2 1 B 2 8n 4 E 1 D[I (fn)] 2 R 6 ρ(k) 3 . (5.77) − 2k 2 kL ( +) σ4  | |  "  # n k

36 3 Assume first that H < 4 and recall (5.74). This, together with (5.77) and (5.78), imply that

2 1 B 2 1 B 2 E 1 D[I2 (fn)] 2 R 6 E 1 D[I (fn)] 2 R − 2k kL ( +) v − 2k 2 kL ( +)   u "  # u t 1 if H (0, 5 ) √n ∈ 8  (log n)3/2 5 6 cH  if H = . ×  √n 8  n4H 3 if H ( 5 , 3 )  − 8 4  ∈  3 Therefore, the desired conclusion holds for H (0, 4 ) by applying Theorem 5.2. Assume now that 3 ∈ H = 4 and recall (5.75). This, together with (5.77) and (5.78), imply that

2 1 B 2 1 B 2 E 1 D[I2 (fn)] 2 R 6 E 1 D[I (fn)] 2 R = O(1/ log n), − 2k kL ( +) v − 2k 2 kL ( +)   u "  # u t 3 and leads to the desired conclusion for H = 4 as well. To go further. In [27], one may find a version of Theorem 5.2 where N is replaced by a centered Gamma law (see also [26]). In [1], one associate to Corollary 5.5 an almost sure central limit theorem. In [6], the case where H is bigger than 3/4 in Theorem 5.6 is analyzed.

6 The smart path method

The aim of this section is to prove Theorem 1.6 (that is, the multidimensional counterpart of the Fourth Moment Theorem), and even a more general version of it. Following the approach developed in the previous section for the one-dimensional case, a possible way for achieving this goal would have consisted in extending Stein’s method to the multivariate setting, so to combine them with the tools of Malliavin calculus. This is indeed the approach developed in [35] and it works well. In this survey, we will actually proceed differently (we follow [28]), by using the so-called ‘smart path method’ (which is a popular method in spin glasses theory, see, e.g., Talagrand [54]). Let us first illustrate this approach in dimension one. Let F D1,2 with E[F ] = 0, let 2 ∈ N (0, 1) and let h : R R be a function satisfying ϕ′′ < . Imagine we want ∼ N → C k k∞ ∞ to estimate E[h(F )] E[h(N)]. Without loss of generality, we may assume that N and F are − stochastically independent. We further have: 1 d E[h(F )] E[h(N)] = E[h(√tF + √1 tN)]dt − dt − Z0 1 1 1 = E[h′(√tF + √1 tN)F ] E[h′(√tF + √1 tN)N] dt. 2√t − − 2√1 t − Z0  −  For any x R and t [0, 1], Theorem 4.15 implies that ∈ ∈ 1 E[h′(√tF + √1 tx)F ] = √t E[h′′(√tF + √1 tx) DF, DL− F 2 R ], − − h − iL ( +) whereas a classical integration by parts yields

E[h′(√tx + √1 tN)N] = √1 t E[h′′(√tx + √1 tN)]. − − −

37 We deduce, since N and F are independent, that

1 1 1 E[h(F )] E[h(N)] = E[h′′(√tx + √1 tN)( DF, DL− F 2 R 1)]dt, (6.79) − 2 − h − iL ( +) − Z0 implying in turn

1 1 E[h(F )] E[h(N)] 6 h′′ E 1 DF, DL− F L2(R ) , (6.80) − 2k k∞ − h − i +   compare with (5.57). It happens that this approach extends easily to the multivariate setting. To see why, we will adopt the following short-hand notation: for every h : Rd R of class 2, we set → C ∂2h h′′ = max sup (x) . k k∞ i,j=1,...,d Rd ∂x ∂x x i j ∈

Theorem 6.1 below is a first step towards Theorem 1.6, and is nothing but the multivariate counterpart of (6.79)-(6.80).

Theorem 6.1 Fix d > 2 and let F = (F ,...,F ) be such that F D1,2 with E[F ] = 0 for any i. 1 d i ∈ i Let C d(R) be a symmetric and positive matrix, and let N be a centered Gaussian vector with ∈ M d 2 covariance C. Then, for any h : R R belonging to and such that h′′ < , we have → C k k∞ ∞ d 1 1 E[h(F )] E[h(N)] 6 h′′ E C(i, j) DFj, DL− Fi L2(R ) . (6.81) − 2k k∞ − h − i + i,j=1 X  

Proof. Without loss of generality, we assume that N is independent of the underlying Brownian motion B. Let h be as in the statement of the theorem. For any t [0, 1], set Ψ(t) = ∈ E h √1 tF + √tN , so that −   1 E[h(N)] E[h(F )] = Ψ(1) Ψ(0) = Ψ′(t)dt. − − Z0 We easily see that Ψ is differentiable on (0, 1) with

d ∂h 1 1 Ψ′(t) = E √1 tF + √tN Ni Fi . ∂xi − 2√t − 2√1 t i=1    X  − By integrating by parts, we can write

∂h ∂h E √1 tF + √tN Ni = E E √1 tx + √tN Ni ∂xi − ( ∂xi − x=F )    | d   ∂2h = √t C(i, j) E E √1 tx + √tN ∂x ∂x − j=1 (  i j  x=F ) X  | d ∂2h = √t C(i, j) E √1 tF + √tN . ∂x ∂x − j=1  i j  X 

38 By using Theorem 4.15 in order to perform the integration by parts, we can also write ∂h ∂h E √1 tF + √tN F = E E √1 tF + √tx F ∂x − i ∂x − i  i  (  i  x=N )   | d 2 ∂ h 1 = √1 t E E √1 tF + √tx DF , DL− F 2 R − ∂x ∂x − h j − iiL ( +) j=1 (  i j  x=N ) X  | d 2 ∂ h 1 = √1 t E √1 tF + √tN DF , DL− F 2 R . − ∂x ∂x − h j − iiL ( +) j=1  i j  X  Hence d 2 1 ∂ h 1 Ψ′(t) = E √1 tF + √tN C(i, j) DF , DL− F 2 R , 2 ∂x ∂x − − h j − jiL ( +) i,j=1  i j  X   and the desired conclusion follows.

We are now in position to prove Theorem 1.3 (using a different approach compared to the original proof; here, we rather follow [39]). We will actually even show the following more general version.

Theorem 6.2 (Peccati, Tudor, 2005; see [46]) Let d > 2 and qd, . . . , q1 > 1 be some fixed integers. Consider vectors B B > Fn = (F1,n,...,Fd,n) = (Iq1 (f1,n),...,Iqd (fd,n)), n 1, with f L2(Rqi ) symmetric. Let C (R) be a symmetric and positive matrix, and let N be i,n ∈ + ∈ Md a centered Gaussian vector with covariance C. Assume that

lim E[Fi,nFj,n] = C(i, j), 1 6 i, j 6 d. (6.82) n →∞ Then, as n , the following two conditions are equivalent: → ∞ (a) Fn converges in law to N; (b) for every 1 6 i 6 d, F converges in law to (0,C(i, i)). i,n N Proof. By symmetry, we assume without loss of generality that q1 6 ... 6 qd. The implication (a) (b) being trivial, we only concentrate on (b) (a). So, assume (b) and let us show that (a) ⇒ ⇒ holds true. Thanks to (6.81), we are left to show that, for each i, j = 1, . . . , d,

2 1 1 L (Ω) 2 2 DFj,n, DL− Fi,n L (R+) = DFj,n,DFi,n L (R+) C(i, j) as n . (6.83) h − i qi h i → → ∞ Observe first that, using the product formula (4.43),

1 ∞ B B 2 DFj,n,DFi,n L (R+) = qj Iq 1(fi,n( , t))Iq 1(fj,n( , t))dt q h i i− · j − · i Z0 q q 1 i∧ j − qi 1 qj 1 B ∞ = qj r! − − Iqi+qj 2 2r fi,n( , t) r fj,n( , t)dt r r − − 0 · ⊗ · Xr=0    Z  q q 1 i∧ j − qi 1 qj 1 B = qj r! − − Iq +q 2 2r (fi,n r+1 fj,n) r r i j − − ⊗ r=0    q Xq i∧ j qi 1 qj 1 B = qj (r 1)! − − Iq +q 2r(fi,n r fj,n). (6.84) − r 1 r 1 i j − ⊗ Xr=1  −  −  39 Now, let us consider all the possible cases for qi and qj with j > i.

First case: q = q = 1. We have DF ,DF 2 R = f , f 2 R = E[F F ]. But it i j h j,n i,niL ( +) h i,n j,niL ( +) i,n j,n is our assumption that E[F F ] C(i, j) so (6.83) holds true in this case. i,n j,n → > 2 2 Second case: qi = 1 and qj 2. We have DFj,n,DFi,n L (R+) = fi,n,DFj,n L (R+) = B h i h i Iq 1(fi,n 1 fj,n). We deduce that j − ⊗ 2 2 2 E[ DF ,DF 2 R ] = (q 1)! f f q −1 6 (q 1)! f f q −1 j,n i,n L ( +) j i,n 1 j,n 2 R j j i,n 1 j,n 2 R j h i − k ⊗ kL ( + ) − k ⊗ kL ( + )

= (qj 1)! fi,n fi,n, fj,n qj 1 fj,n L2(R2 ) − h ⊗e ⊗ − i + 2 6 2 (qj 1)! fi,n 2 R fj,n qj 1 fj,n L2(R ) − k kL ( +)k ⊗ − k + 2 = (qj 1)!E[Fi,n] fj,n qj 1 fj,n L2(R2 ). − k ⊗ − k + At this stage, observe the following two facts. First, because q = q , we have C(i, j) = 0 i 6 j necessarily. Second, since E[F 2 ] C(j, j) and F Law (0,C(j, j)), we have by Theorem 5.5 j,n → j,n → N that fj,n qj 1 fj,n L2(R2 ) 0. Hence, (6.83) holds true in this case as well. k ⊗ − k + → Third case: qi = qj > 2. By (6.84), we can write

qi 1 2 − 1 qi 1 B 2 DFj,n,DFi,n L (R+) = E[Fi,nFj,n] + qi (r 1)! − I2q 2r(fi,n r fj,n). q h i − r 1 i− ⊗ i r=1 X  −  We deduce that 1 2 E DF ,DF 2 R C(i, j) q h j,n i,niL ( +) − " i  # qi 1 4 − 2 2 2 qi 1 2 = E[Fi,nFj,n] C(i, j) + qi (r 1)! − (2qi 2r)! fi,n rfj,n 2 R2qi−2r . − − r 1 − k ⊗ kL ( + ) r=1  −   X The first term of the right-hand side tends to zero by assumption. For thee second term, we can write, whenever r 1, . . . , q 1 , ∈ { i − } 2 6 2 fi,n rfj,n 2 R2qi−2r fi,n r fj,n 2 R2qi−2r k ⊗ kL ( + ) k ⊗ kL ( + )

= fi,n qi r fi,n, fj,n qi r fj,n L2(R2r) e h ⊗ − ⊗ − i + 6 fi,n qi r fi,n L2(R2r) fj,n qi r fj,n L2(R2r ). k ⊗ − k + k ⊗ − k + Law Law Since Fi,n (0,C(i, i)) and Fj,n (0,C(j, j)), by Theorem 5.5 we have that fi,n q r → N → N k ⊗ i− fi,n L2(R2r ) fj,n qi r fj,n L2(R2r) 0, thereby showing that (6.83) holds true in our third case. k + k ⊗ − k + → Fourth case: qj > qi > 2. By (6.84), we have

qi 1 qi 1 qj 1 B 2 DFj,n,DFi,n L (R+) = qj (r 1)! − − Iq +q 2r(fi,n r fj,n). q h i − r 1 r 1 i j − ⊗ i r=1 X  −  −  We deduce that

1 2 E DFj,n,DFi,n 2 R q h iL ( +)  i  qi 2 2 2 2 qi 1 qj 1 2 = q (r 1)! − − (q + q 2r)! f f q +q −2r . j i j i,n r j,n 2 R i j − r 1 r 1 − k ⊗ kL ( + ) Xr=1  −   −  e 40 For any r 1, . . . , q , we have ∈ { i} 2 2 f f q +q −2r 6 f f q +q −2r i,n r j,n 2 R i j i,n r j,n 2 R i j k ⊗ kL ( + ) k ⊗ kL ( + )

= fi,n qi r fi,n, fj,n qj r fj,n L2(R2r) e h ⊗ − ⊗ − i + 6 fi,n qi r fi,n L2(R2r) fj,n qj r fj,n L2(R2r ) k ⊗ − k + k ⊗ − k + 2 6 q fi,n L2(R i ) fj,n qj r fj,n L2(R2r ) k k + k ⊗ − k + Law Since F (0,C(j, j)) and q r 1, . . . , q 1 , by Theorem 5.5 we have that j,n → N j − ∈ { j − } fj,n qj r fj,n L2(R2r) 0. We deduce that (6.83) holds true in our fourth case. k ⊗ − k + → Summarizing, we have that (6.83) is true for any i and j, and the proof of the theorem is done.

When the integers qd, . . . , q1 are pairwise disjoint in Theorem 6.2, notice that (6.82) is automatically verified with C(i, j) = 0 for all i = j, see indeed (4.38). As such, we recover the 6 version of Theorem 6.2 (that is, Theorem 1.6) which was stated and used in Lecture 1 to prove Breuer-Major theorem.

To go further. In [35], Stein’s method is combined with Malliavin calculus in a multivariate setting to provide bounds for the Wasserstein distance between the laws of N (0,C) and ∼ Nd F = (F ,...,F ) where each F D1,2 verifies E[F ] = 0. Compare with Theorem 6.1. 1 d i ∈ i 7 Cumulants on the Wiener space

In this section, following [29] our aim is to analyze the cumulants of a given element F of D1,2 and to show how the formula we shall obtain allows us to give yet another proof of the Fourth Moment Theorem 1.3. Let F be a random variable with, say, all the moments (to simplify the exposition). Let φF denote its characteristic function, that is, φ (t) = E[eitF ], t R. Then, it is well-known that we F ∈ may recover the moments of F from φF through the identity

dj E[F j] = ( i)j φ (t). − dtj |t=0 F

The cumulants of F , denoted by κ (F ) > , are defined in a similar way, just by replacing φ by { j }j 1 F log φF in the previous expression:

dj κ (F ) = ( i)j log φ (t). j − dtj |t=0 F The first few cumulants are

κ1(F ) = E[F ], κ (F ) = E[F 2] E[F ]2 = Var(F ), 2 − κ (F ) = E[F 3] 3E[F 2]E[F ] + 2E[F ]3. 3 − It is immediate that

j κj(F + G) = κj(F ) + κj(G) and κj(λF ) = λ κj(F ) (7.85)

41 for all j > 1, when λ R and F and G are independent random variables (with all the moments). ∈ Also, it is easy to express moments in terms of cumulants and vice-versa. Finally, let us observe that the cumulants of F (0, σ2) are all zero, except for the second one which is σ2. This fact, ∼ N together with the two properties (7.85), gives a quick proof of the classical CLT and illustrates that cumulants are often relevant when wanting to decide whether a given random variable is approximately normally distributed. The following simple lemma is a useful link between moments and cumulants. Lemma 7.1 Let F be a random variable (in a given probability space (Ω, ,P )) having all the F moments. Then, for all m N, ∈ m m+1 m m s E[F ] = κ (F )E[F − ]. s s+1 Xs=0   Proof. We can write dm+1 dm d E[F m+1] = ( i)m+1 φ (t) = ( i)m+1 φ (t) log φ (t) − dtm+1 |t=0 F − dtm |t=0 F dt F m   m ds+1 dm s = ( i)m+1 log φ (t) − φ (t) by Leibniz rule − s dts+1 |t=0 F dtm s |t=0 F s=0      −  m X m m s = κ (F )E[F − ]. s s+1 s=0 X  

From now on, we will deal with a random variable F with all moments that is further measurable with respect to the Brownian motion (Bt)t>0. We let the notation of Section 4 prevail and we consider the chaotic expansion (4.39) of F . We further assume (only to avoid technical issues) that D Dm,2 > m p F belongs to ∞, meaning that F for all m 1 and that E[ D F L2(Rm)] < for all ∈ k k + ∞ m > 1 and all p > 2. This assumption allows us to introduce recursively the following (well-defined) sequence of random variables related to F . Namely, set Γ0(F ) = F and

1 Γ (F ) = DF, DL− Γ (F ) 2 R . j+1 h − j iL ( +) The following result contains a neat expression of the cumulants of F in terms of the family Γs(F ) s N. { } ∈ Theorem 7.2 (Nourdin, Peccati, 2010; see [29]) Let F D . Then, for any s N, ∈ ∞ ∈

κs+1(F ) = s!E[Γs(F )].

Proof. The proof is by induction. It consists in computing κs+1(F ) using the induction hypothesis, together with Lemma 7.1 and (4.53). First, the result holds true for s = 0, as it only says that κ1(F ) = E[Γ0(F )] = E[F ]. Assume now that m > 1 is given and that κs+1(F ) = s!E[Γs(F )] for all s 6 m 1. We can then write − m 1 m+1 − m m s κ (F ) = E[F ] κ (F )E[F − ] by Lemma 7.1 m+1 − s s+1 s=0 X   m 1 m+1 − m m s = E[F ] s! E[Γ (F )]E[F − ] by the induction hypothesis. − s s Xs=0   42 On the other hand, by applying (4.53) repeatedly, we get

m+1 m m m m 1 E[F ] = E[F ]E[Γ0(F )] + Cov(F , Γ0(F )) = E[F ]E[Γ0(F )] + mE[F − Γ1(F )] m m 1 m 1 = E[F ]E[Γ0(F )] + mE[F − ]E[Γ1(F )] + mCov(F − , Γ1(F )) m m 1 m 2 = E[F ]E[Γ (F )] + mE[F − ]E[Γ (F )] + m(m 1)E[F − Γ (F )] 0 1 − 2 = ... m m m s = s! E[F − ]E[Γ (F )]. s s s=0 X   Thus

m 1 m+1 − m m s κ (F ) = E[F ] s! E[Γ (F )]E[F − ] = m!E[Γ (F )], m+1 − s s m s=0 X   and the desired conclusion follows.

Let us now focus on the computation of cumulants associated to random variables having the form of a multiple Wiener-Itô integral. The following statement provides a compact representation for the cumulants of such random variables.

B 2 Rq Theorem 7.3 Let q > 2 and assume that F = Iq (f), where f L ( +). We have κ1(F ) = 0, 2 > ∈ κ2(F ) = q! f L2(Rq ) and, for every s 3, k k +

κs(F ) = q!(s 1)! cq(r1, . . . , rs 2) (...((f r1 f) r2 f) ... rs−3 f) rs−2 f, f L2(Rq ), (7.86) − − ⊗ ⊗ ⊗ ⊗ + X where the sum runs over all collections of integerse r1,e . . . , rs 2 esuch that:e −

(i) 1 6 r1, . . .P , rs 2 6 q; − (s 2)q (ii) r1 + ... + rs 2 = − ; − 2 3q (s 2)q (iii) r1 < q, r1 + r2 < , ..., r1 + ... + rs 3 < − ; 2 − 2

(iv) r2 6 2q 2r1, ..., rs 2 6 (s 2)q 2r1 ... 2rs 3; − − − − − − − and where the combinatorial constants cq(r1, . . . , rs 2) are recursively defined by the relations − q 1 2 c (r) = q(r 1)! − , q − r 1  −  and, for a > 2,

aq 2r1 ... 2ra 1 1 q 1 cq(r1, . . . , ra) = q(ra 1)! − − − − − − cq(r1, . . . , ra 1). − r 1 r 1 −  a −  a − 

Remark 7.4 1. If sq is odd, then κs(F ) = 0, see indeed condition (ii). This fact is easy to see s (law) in any case: use that κs( F ) = ( 1) κs(F ) and observe that, when q is odd, then F = F (law) − − − (since B = B). −

43 B 2 R2 2. If q = 2 and F = I2 (f) with f L ( +), then the only possible integers r1, . . . , rs 2 ∈ − verifying (i) (iv) in the previous statement are r1 = ... = rs 2 = 1. On the other hand, we − − immediately compute that c2(1) = 2, c2(1, 1) = 4, c2(1, 1, 1) = 8, and so on. Therefore,

B s 1 κs(I2 (f)) = 2 − (s 1)! (...(f 1 f) . . . f) 1 f, f 2 R2 , (7.87) − ⊗ ⊗ L ( +)

and we recover the classical expression of the cumulants of a double integral.

3. If q > 2 and F = IB(f), f L2(Rq ), then (7.86) for s = 4 reads q ∈ +

q 1 − B κ4(Iq (f)) = 6q! cq(r, q r) (f rf) q rf, f L2(Rq ) − ⊗ ⊗ − + r=1 X q 1 4 e e 3 − 2 q = rr! (2q 2r)! (f rf) q r f, f L2(Rq ) q r − ⊗ ⊗ − + r=1   X q 1 4 − e 3 2 q = rr! (2q 2r)! f rf, f r f 2 R2q−2r q r − ⊗ ⊗ L ( + ) r=1   X q 1 4 − e 3 2 q 2 = rr! (2q 2r)! f rf 2 R2q−2r , (7.88) q r − k ⊗ kL ( + ) r=1 X   e and we recover the expression for κ4(F ) given in (5.59) by a different route.

Proof of Theorem 7.3. Let us first show the following formula: for any s > 2, we claim that

q [(s 1)q 2r1 ... 2rs−2] q − − − − ∧ 1 1 Γs 1(F ) = ... cq(r1, . . . , rs 1) r1

1 Γs(F ) = DF, DL− Γs 1F L2(R ) h − − i + q [(s 1)q 2r1 ... 2rs−2] q − − − − ∧ 1 1 = ... qcq(r1, . . . , rs 1) r 1 is thus finished. Now, let us take the expectation on both sides of (7.89). We get

κs(F ) = (s 1)!E[Γs 1(F )] − − q [(s 1)q 2r1 ... 2rs−2] q − − − − ∧ = (s 1)! ... c (r , . . . , r )1 ... 1 (s−1)q q 1 s 1 r1 q + rs 1, so that rs 1 > q. Therefore, − − − − − − − q [(s 2)q 2r1 ... 2rs−3] q − − − − ∧ κ (F ) = (s 1)! ... c (r , . . . , r , q)1 ... 1 (s−2)q s q 1 s 2 r1 2 B 2 Rq is fixed and if (Fn)n>1 is a sequence of the form Fn = Iq (fn) with fn L ( +) such that 2 2 > 4 ∈ E[Fn ] = q! fn L2(Rq ) = 1 for all n 1 and E[Fn ] 3 as n , then Fn (0, 1) in law k k + → → ∞ → N as n . → ∞ To this end, observe that κ1(Fn) = 0 and κ2(Fn) = 1. To estimate κs(Fn), s > 3, we consider the expression (7.86). Let r1, . . . , rs 2 be some integers such that (i)–(iv) in Theorem 7.3 are satisfied. Using Cauchy-Schwarz and then− successively

6 6 p q g rh L2(Rp+q−2r ) g r h L2(Rp+q−2r) g L2(R ) h L2(R ) k ⊗ k + k ⊗ k + k k + k k + whenever g L2(Rp ), h L2(Rq ) and r = 1, . . . , p q, we get that e ∈ + ∈ + ∧ q (...(fn r1 fn) r2 fn) . . . fn) rs−2 fn, fn L2(R ) h ⊗ ⊗ ⊗ i + 6 q q (...(fn r1 fn) r 2 fn) . . . fn) rs−2 fn L2(R ) fn L2(R ) e e e k ⊗ ⊗ ⊗ k + k k + 6 s 2 fn r1 fn L2(R2q−2r1 ) fn L−2(Rq ) k ⊗ e k e+ k k e + s 1 2 = (q!) − fn r1 fn 2 R2q−2r1 . (7.90) e k ⊗ kL ( + ) 4 Since E[Fn ] 3 = κ4(Fn) 0, we deduce frome (7.88) that fn rfn L2(R2q−2r ) 0 for all − → k ⊗ k + → r = 1, . . . , q 1. Consequently, by combining (7.86) with (7.90), we get that κ (F ) 0 as − s n → n for all s > 3, implying in turn that F (0, 1) in law. e → ∞ n → N To go further. The multivariate version of Theorem 7.2 may be found in [23].

8 A new density formula

1 In this section, following [37] we shall explain how the quantity DF, DL F 2 R is related to h − − iL ( +) the density of F D1,2 (provided it exists). More specifically, when F D1,2 is such that E[F ] = 0, ∈ ∈ let us introduce the function g : R R, defined by means of the following identity: F → 1 g (F ) = E[ DF, DL− F 2 R F ]. (8.91) F h − iL ( +)|

45 A key property of the random variable gF (F ) is as follows. Proposition 8.1 If F D1,2 satisfies E[F ] = 0, then P (g (F ) > 0) = 1. ∈ F R x 1 > Proof. Let C be a Borel set of and set φn(x) = 0 C [ n,n](t)dt, n 1 (with the usual convention ∩ − x = 0 for x < 0). Since φ is increasing and vanishing at zero, we have xφ (x) > 0 for all 0 − x n R n x R. In particular, R ∈ R F F 1 1 0 6 E[F φn(F )] = E F C [ n,n](t)dt = E F C [ n,n](t)dt . 0 ∩ − ∩ −  Z   Z−∞  1 Therefore, we deduce from Corollary 4.17 that E gF (F ) C [ n,n](F ) > 0. By dominated convergence, this yields E [g (F )1 (F )] > 0, implying in turn that∩ − P (g (F ) > 0) = 1. F C  F 

The following theorem gives a new density formula for F in terms of the function gF . We will then study some of its consequences.

Theorem 8.2 (Nourdin, Viens, 2009; see [37]) Let F D1,2 with E[F ] = 0. Then, the ∈ law of F admits a density with respect to Lebesgue measure (say, ρ : R R) if and only if → P (gF (F ) > 0) = 1. In this case, the support of ρ, denoted by supp ρ, is a closed interval of R containing zero and we have, for (almost) all x supp ρ: ∈ E[ F ] x y dy ρ(x) = | | exp . (8.92) 2g (x) − g (y) F  Z0 F 

Proof. Assume that P (gF (F ) > 0) = 1 and let C be a Borel set. Let n > 1. Corollary 4.17 yields

F 1 1 E F C [ n,n](t)dt = E C [ n,n](F )gF (F ) . (8.93) ∩ − ∩ −  Z  −∞   F 1 Suppose that the Lebesgue measure of C is zero. Then C [ n,n](t)dt = 0, so that 1 −∞ ∩ − E C [ n,n](F )gF (F ) = 0 by (8.93). But, since P (gF (F ) > 0) = 1, we get that P (F ∩ − R ∈ C [ n, n]) = 0 and, by letting n , that P (F C) = 0. Therefore, the Radon-Nikodym ∩ −  → ∞ ∈ criterion is verified, hence implying that the law of F has a density. Conversely, assume that the law of F has a density, say ρ. Let φ : R R be a continuous → function with compact support, and let Φ denote any antiderivative of φ. Note that Φ is necessarily bounded. We can write:

E φ(F )gF (F ) = E Φ(F )F by (4.53) ∞ yρ(y)dy   = Φ(x) x ρ(x)dx = φ(x) ∞ yρ(y)dy dx = E φ(F ) F . R ( ) R ρ(F ) Z ∗ Z Zx   R  Equation ( ) was obtained by integrating by parts, after observing that ∗ ∞ yρ(y)dy 0 as x → | | → ∞ Zx (for x + , this is because F L1(Ω); for x , this is because F has mean zero). Therefore, → ∞ ∈ → −∞ we have shown that, P -a.s.,

F∞ yρ(y)dy gF (F ) = . (8.94) R ρ(F )

46 1 (Notice that P (ρ(F ) > 0) = R ρ(x)>0 ρ(x)dx = R ρ(x)dx = 1, so that identity (8.94) always makes sense.) Since F D1,2, one{ has (see,} e.g., [38, Proposition 2.1.7]) that supp ρ = [α, β] with ∈ R R 6 α < β 6 + . Since F has zero mean, note that α < 0 and β > 0 necessarily. For every −∞ ∞ x (α, β), define ∈

ϕ (x) = ∞ yρ (y) dy. (8.95) Zx The function ϕ is differentiable almost everywhere on (α, β), and its derivative is xρ (x). In − particular, since ϕ(α) = ϕ(β) = 0 and ϕ is strictly increasing before 0 and strictly decreasing afterwards, we have ϕ(x) > 0 for all x (α, β). Hence, (8.94) implies that P (g (F ) > 0) = 1. ∈ F Finally, let us prove (8.92). Let ϕ still be defined by (8.95). On the one hand, we have ϕ (x) = xρ(x) for almost all x supp ρ. On the other hand, by (8.94), we have, for almost ′ − ∈ all x supp ρ, ∈

ϕ(x) = ρ(x)gF (x). (8.96)

By putting these two facts together, we get the following ordinary differential equation satisfied by ϕ:

ϕ (x) x ′ = for almost all x supp ρ. ϕ(x) −gF (x) ∈

Integrating this relation over the interval [0, x] yields

x y dy log ϕ(x) = log ϕ(0) . − g (y) Z0 F

Taking the exponential and using 0 = E(F ) = E(F+) E(F ) so that E F = E(F+) + E(F ) = − − | | − 2E(F+) = 2ϕ(0), we get

1 x y dy ϕ(x) = E[ F ] exp . 2 | | − g (y)  Z0 F  Finally, the desired conclusion comes from (8.96).

As a consequence of Theorem 8.2, we have the following statement, yielding sufficient conditions in order for the law of F to have a support equal to the real line.

Corollary 8.3 Let F D1,2 with E[F ] = 0. Assume that there exists σ > 0 such that ∈ min 2 gF (F ) > σmin,P -a.s. (8.97)

Then the law of F , which has a density ρ by Theorem 8.2, has R for support and (8.92) holds almost everywhere in R.

Proof. It is an immediate consequence of Theorem 8.2, except for the fact that supp ρ = R. For the moment, we just know that supp ρ = [α, β] with 6 α < 0 < β 6 + . Identity (8.94) yields −∞ ∞

∞ yρ (y) dy > σ2 ρ (x) for almost all x (α, β). (8.98) min ∈ Zx

47 Let ϕ be defined by (8.95), and recall that ϕ(x) > 0 for all x (α, β). When multiplied by ′ ∈ ϕ (x) > x x [0, β), the inequality (8.98) gives ϕ(x) σ2 . Integrating this relation over the interval [0, x] ∈ − min > x2 1 yields log ϕ (x) log ϕ (0) 2 σ2 , i.e., since ϕ(0) = 2 E F , − − min | |

x2 ∞ 1 2 σ2 ϕ (x) = yρ (y) dy > E F e− min . (8.99) 2 | | Zx ′ ϕ (x) 6 x Similarly, when multiplied by x (α, 0], inequality (8.98) gives ϕ(x) σ2 . Integrating this ∈ − min 6 x2 relation over the interval [x, 0] yields log ϕ (0) log ϕ (x) 2 σ2 , i.e. (8.99) still holds for x (α, 0]. − min ∈ Now, let us prove that β = + . If this were not the case, by definition, we would have ϕ (β) = 0; ∞ on the other hand, by letting x tend to β in the above inequality, because ϕ is continuous, we would β2 1 2σ2 have ϕ (β) > E F e− min > 0, which contradicts β < + . The proof of α = is similar. In 2 | | ∞ −∞ conclusion, we have shown that supp ρ = R.

Using Corollary 8.3, we deduce a neat criterion for normality.

Corollary 8.4 Let F D1,2 with E[F ] = 0 and assume that F is not identically zero. Then F is ∈ Gaussian if and only if Var(gF (F )) = 0.

Proof : By (4.53) (choose ϕ(x) = x, G = F and recall that E[F ] = 0), we have

1 2 E[ DF, DL− F H] = E[F ] = VarF. (8.100) h − i 2 Therefore, the condition Var(gF (F )) = 0 is equivalent to P (gF (F ) = VarF ) = 1. Let F (0, σ ) 2 ∼ N with σ > 0. Using (8.94), we immediately check that gF (F ) = σ , P -a.s. Conversely, if 2 gF (F ) = σ > 0 P -a.s., then Corollary 8.3 implies that the law of F has a density ρ, given by x2 E F 2 ρ(x) = | |e− 2 σ2 for almost all x R, from which we immediately deduce that F (0, σ ). 2σ2 ∈ ∼ N Observe that if F (0, σ2) with σ > 0, then E F = 2/π σ, so that the formula (8.92) for ∼ N | | ρ agrees, of course, with the usual one in this case. p As a ‘concrete’ application of (8.92), let us consider the following situation. Let K : [0, 1]2 R → be a square-integrable kernel such that K(t, s) = 0 for s > t, and consider the centered Gaussian process X = (Xt)t [0,1] defined as ∈ 1 t X = K(t, s)dB = K(t, s)dB , t [0, 1]. (8.101) t s s ∈ Z0 Z0 Fractional Brownian motion is an instance of such a process, see, e.g., [25, Section 2.3]. Consider the maximum

Z = sup Xt. (8.102) t [0,1] ∈ Assume further that the kernel K satisfies

1 c, α > 0, s, t [0, 1]2, s = t, 0 < (K(t, u) K(s, u))2du 6 c t s α. (8.103) ∃ ∀ ∈ 6 − | − | Z0 48 This latter assumption ensures (see, e.g., [11]) that: (i) Z D1,2; (ii) the law of Z has a density ∈ with respect to Lebesgue measure; (iii) there exists a (a.s.) unique random point τ [0, 1] where 1 ∈ the supremum is attained, that is, such that Z = Xτ = 0 K(τ, s)dBs; and (iv) DtZ = K(τ, t), t [0, 1]. We claim the following formula. ∈ R Proposition 8.5 Let Z be given by (8.102), X be defined as (8.101) and K L2([0, 1]2) be ∈ satisfying (8.103). Then, the law of Z has a density ρ whose support is R+, given by

E Z E[Z] x (y E[Z])dy ρ(x) = | − |exp − , x > 0. 2h (x) − h (y) Z ZE[Z] Z ! Here,

∞ u h (x) = e− E [R(τ , τ ) Z = x] du, Z 0 u | Z0 where R(s, t) = E[X X ], s, t [0, 1], and τ is the (almost surely) unique random point where s t ∈ u 1 (u) u 2u Xt = K(t, s)(e− dBs + 1 e− dBs′ ) 0 − Z p attains its maximum on [0, 1], with (B,B′) a two-dimensional Brownian motion defined on the product probability space (Ω, , P) = (Ω Ω , F F ,P P ). F × ′ ⊗ ′ × ′ 1 B B Proof. Set F = Z E[Z]. We have D L F = ∞ I (f ( , t)) and D F = ∞ qI (f ( , t)). − − t − q=1 q 1 q · t q=1 q 1 q · Thus − − P P ∞ u ∞ B ∞ u (q 1)u ∞ B e− Pu(DtF )du = Iq 1(fq( , t)) e− qe− − du = Iq 1(fq( , t)). − · − · 0 q=1 0 q=1 Z X Z X Consequently,

1 ∞ u D L− F = e− P (D F )du, t [0, 1]. − t u t ∈ Z0 1 By Mehler’s formula (4.49), and since DF = DZ = K(τ, ) with τ = argmaxt [0,1] 0 K(t, s)dBs, · ∈ we deduce that R 1 ∞ u D L− F = e− E′[K(τ , t)]du, − t u Z0 implying in turn 1 1 ∞ u gF (F ) = E[ DF, DL− F L2([0,1]) F ] = dt du e− K(τ0, t)E[E′[K(τu, t) F ]] h − i | 0 0 | 1 Z Z ∞ u ∞ u = e− E E′ K(τ , t)K(τ , t)dt F du = e− E E′ [R(τ , τ ) F ] du 0 u | 0 u | Z0  Z0  Z0 ∞ u   = e− E [R(τ , τ ) F ] du. 0 u | Z0 The desired conclusion follows now from Theorem 8.2 and the fact that F = Z E[Z]. − To go further. The reference [37] contains concentration inequalities for centered random variables F D1,2 satisfying g (F ) 6 αF + β. The paper [41] shows how Theorem 8.2 can lead to ∈ F optimal Gaussian density estimates for a class of stochastic equations with additive noise.

49 9 Exact rates of convergence

1,2 In this section, we follow [30]. Let F > be a sequence of random variables in D such that { n}n 1 E[F ] = 0, Var(F ) = 1 and F law N (0, 1) as n . Our aim is to develop tools for n n n → ∼ N → ∞ computing the exact asymptotic expression of the (suitably normalized) sequence

P (F 6 x) P (N 6 x), n > 1, n − when x R is fixed. This will complement the content of Theorem 5.2. ∈ A technical computation. For every fixed x, we denote by f : R R the function x → u u2/2 1 a2/2 fx(u) = e ( ,x](a) Φ(x) e− da −∞ − Z−∞ 2 Φ(u)(1 Φ(x)) if u 6 x = √2πeu /2 − , (9.104) × Φ(x)(1 Φ(u)) if u > x  − 2 where Φ(x) = 1 x e a /2da. We have the following result. √2π − −∞ Proposition 9.1 LetR N (0, 1). We have, for every x R, ∼ N ∈ x2/2 1 2 e− E[f ′ (N)N] = (x 1) . (9.105) x 3 − √2π Proof. Integrating by parts (the bracket term is easily shown to vanish), we first obtain that

+ u2/2 + u2/2 ∞ e− ∞ 2 e− E[f ′ (N)N] = f ′ (u)u du = fx(u)(u 1) du x x √2π − √2π Z Z −∞ + u −∞ 1 ∞ 2 1 a2/2 = (u 1) ( ,x](a) Φ(x) e− da du. √2π − −∞ − Z Z  −∞ −∞   Integrating by parts once again, this time using the relation u2 1 = 1 (u3 3u) , we deduce that − 3 − ′ + u ∞ 2 1 a2/2 (u 1) ( ,x](a) Φ(x) e− da du − −∞ − Z Z  −∞ + −∞   1 ∞ 3 1 u2/2 = (u 3u) ( ,x](u) Φ(x) e− du −3 − −∞ − Z −∞x  + 1 3 u2/2 ∞ 3 u2/2 = (u 3u)e− du Φ(x) (u 3u)e− du −3 − − − Z−∞ Z−∞  1 2 x2/2 2 u2/2 3 u2/2 = (x 1)e− , since [(u 1)e ] = (u 3u)e . 3 − − − ′ − − −

A general result. Assume that F > is a sequence of (sufficiently regular) centered random { n}n 1 variables with unitary variance such that the sequence

1 2 ϕ(n) := E[(1 DF , DL F 2 R ) ], n > 1, (9.106) − h n − − niL ( +) q converges to zero as n . According to Theorem 5.2 one has that, for any x R and as n , → ∞ ∈ → ∞ P (F 6 x) P (N 6 x) 6 d (F ,N) 6 2ϕ(n) 0, (9.107) n − TV n →

50 where N (0, 1). The forthcoming result provides a useful criterion in order to compute an exact ∼ N asymptotic expression (as n ) for the quantity → ∞ P (F 6 x) P (N 6 x) n − , n > 1. ϕ(n)

Theorem 9.2 (Nourdin, Peccati, 2010; see [30]) Let Fn n>1 be a sequence of random 1,2 { } variables belonging to D , and such that E[Fn] = 0, Var[Fn] = 1. Suppose moreover that the following three conditions hold: (i) we have 0 < ϕ(n) < for every n and ϕ(n) 0 as n ; ∞ → → ∞ (ii) the law of Fn has a density with respect to Lebesgue measure for every n;

−1 DFn, DL Fn 2 R 1 (iii) as n , the two-dimensional vector F , h − iL ( +)− converges in distribution → ∞ n ϕ(n)   2 2 to a centered two-dimensional Gaussian vector (N1,N2), such that E[N1 ] = E[N2 ] = 1 and E[N1N2] = ρ. Then, as n , one has for every x R, → ∞ ∈ 2 P (F 6 x) P (N 6 x) ρ e x /2 n − (1 x2) − . (9.108) ϕ(n) → 3 − √2π Proof. For any integer n and any 1-function f with a bounded derivative, we know by Theorem C 4.15 that 1 E[F f(F )] = E[f ′(F ) DF , DL− F 2 R ]. n n n h n − niL ( +) Fix x R and observe that the function f defined by (9.104) is not 1 due to the singularity in x. ∈ x C However, by using a regularization argument given assumption (ii), one can show that the identity 1 E[F f (F )] = E[f ′ (F ) DF , DL− F 2 R ] n x n x n h n − niL ( +) is true for any n. Therefore, since P (F 6 x) P (N 6 x) = E[f (F )] E[F f (F )], we get n − x′ n − n x n 1 2 P (Fn 6 x) P (N 6 x) 1 DFn, DL− Fn L (R+) − = E fx′ (Fn) − h − i . ϕ(n) " × ϕ(n) # 1 Reasoning as in Lemma 3.1, one may show that fx is Lipschitz with constant 2. Since ϕ(n)− (1 1 − DF , DL F 2 R ) has variance 1 by definition of ϕ(n), we deduce that the sequence h n − − niL ( +) 1 2 1 DFn, DL− Fn L (R+) f ′ (F ) − h − i , n > 1, x n × ϕ(n) is uniformly integrable. Definition (9.104) shows that u f (u) is continuous at every u = x. This → x′ 6 yields that, as n and due to assumption (iii), → ∞ 1 2 1 DFn, DL− Fn L (R+) E fx′ (Fn) − h − i E[fx′ (N1)N2] = ρ E[fx′ (N1)N1]. " × ϕ(n) # → − − Consequently, relation (9.108) now follows from formula (9.105).

The double integrals case and a concrete application. When applying Theorem 9.2 in concrete situations, the main issue is often to check that condition (ii) therein holds true. In the particular case of sequences belonging to the second Wiener chaos, we can go further in the analysis, leading to the following result.

51 Proposition 9.3 Let N (0, 1) and let F = IB(f ) be such that f L2(R2 ) is symmetric ∼ N n 2 n n ∈ + for all n > 1. Write κp(Fn), p > 1, to indicate the sequence of the cumulants of Fn. Assume that κ (F ) = E[F 2] = 1 for all n > 1 and that κ (F ) = E[F 4] 3 0 as n . If we have in 2 n n 4 n n − → → ∞ addition that

κ3(Fn) κ8(Fn) α and 2 0, (9.109) κ4(Fn) → κ4(Fn) → then, forp all x R,  ∈ 2 P (Fn 6 x) P (N 6 x) α 2 x − 1 x e− 2 as n . (9.110) κ4(Fn) → 6√2π − → ∞  Remark 9.4pDue to (9.109), we see that (9.110) is equivalent to

2 P (Fn 6 x) P (N 6 x) 1 2 x − 1 x e− 2 as n . κ3(Fn) → 6√2π − → ∞  3 Since each Fn is centered, one also has that κ3(Fn) = E[Fn ]. Proof. We shall apply Theorem 9.2. Thanks to (5.60), we get that

4 κ4(Fn) E[Fn ] 3 2 = − = 8 fn 1 fn L2(R2 ). 6 6 k ⊗ k + By combining this identity with (5.58) (here, it is worth observing that f f is symmetric, so n ⊗1 n that the symmetrization f f is immaterial), we see that the quantity ϕ(n) appearing in (9.106) n⊗1 n is given by κ4(Fn)/6. In particular, condition (i) in Theorem 9.2 is met (here, let us stress that one may show that κ (F )e> 0 for all n by means of (5.60)). On the other hand, since F is p 4 n n a non-zero double integral, its law has a density with respect to Lebesgue measure, according to Theorem 4.18. This means that condition (ii) in Theorem 9.2 is also in order. Hence, it remains to check condition (iii). Assume that (9.109) holds. Using (7.87) in the cases p = 3 and p = 8, we deduce that

κ (F ) 8 fn, fn 1 fn L2(R2 ) 3 n = h ⊗ i + κ4(Fn) √6 ϕ(n) and p 2 17920 (fn 1 fn) 1 (fn 1 fn) L2(R2 ) κ8(Fn) k ⊗ ⊗ ⊗ k + 2 = 4 . (κ4(Fn)) ϕ(n) On the other hand, set

1 2 2 DFn L2(R ) 1 Y = k k + − . n ϕ(n)

1 2 B By (5.62), we have DYn 2 R 1 = 2 I (fn 1 fn). Therefore, by (5.58), we get that 2 k kL ( +) − 2 ⊗ 2 1 2 128 E DYn 2 R 1 = (fn 1 fn) 1 (fn 1 fn) 2 R2 2k kL ( +) − ϕ(n)4 k ⊗ ⊗ ⊗ kL ( +) "  # κ8(Fn) = 2 0 as n . 140 (κ4(Fn)) → → ∞

52 Hence, by Theorem 5.5, we deduce that Y Law (0, 1). We also have n → N

4 √6 κ3(Fn) α√6 E[YnFn] = fn 1 fn, fn 2 R2 = =: ρ as n . L ( +) ϕ(n)h ⊗ i 2 κ4(Fn) → 2 → ∞

Therefore, to conclude that condition (iii) in Theoremp 9.2 holds true, it suffices to apply Theorem 6.2.

To give a concrete application of Proposition 9.3, let us go back to the quadratic variation of H H fractional Brownian motion. Let B = (Bt )t>0 be a fractional Brownian motion with Hurst index H (0, 1 ) and let ∈ 2 n 1 1 − H H 2 Fn = (Bk+1 Bk ) 1 , σn − − k=0 X   2 2 2 where σn > 0 is so that E[Fn ] = 1. Recall from Theorem 5.6 that limn σn/n = 2 r Z ρ (r) < →∞ , with ρ : Z R given by (5.69); moreover, there exists a constant c > 0 (depending∈ only on ∞ → + H P H) such that, with N (0, 1), ∼ N c d (F ,N) 6 H , n > 1. (9.111) TV n √n

The next two results aim to show that one can associate a lower bound to (9.111). We start by the following auxiliary result.

Proposition 9.5 Fix an integer s > 2, let Fn be as above and let ρ be given by (5.69). Recall that H < 1 , so that ρ ℓ1(Z). Then, the sth cumulant of F behaves asymptotically as 2 ∈ n (s 1) 1 s/2 s/2 1 ρ∗ − , ρ ℓ2(Z) κ (F ) n − 2 − (s 1)! h i as n . (9.112) s n ∼ − ρ s → ∞ k kℓ2(Z)

law B 1 n 1 2 Proof. As in the proof of Theorem 5.6, we have that F = I (f ) with f = − e⊗ . Now, n 2 n n σn k=0 k let us proceed with the proof. It is divided into several steps. P B First step. Using the formula (7.87) giving the cumulants of Fn = I2 (fn) as well as the very definition of the contraction , we immediately check that ⊗1 s 1 n 1 2 − (s 1)! − κs(Fn) = s− ρ(ks ks 1) . . . ρ(k2 k1)ρ(k1 ks). σn − − − − k1,...,kXs=0 Second step. Since H < 1 , we have that ρ ℓ1(Z). Therefore, by applying Young inequality 2 ∈ repeatedly, we have

(s 1) (s 2) s 1 ρ ∗ − ∞ Z 6 ρ 1 Z ρ ∗ − ∞ Z 6 ... 6 ρ − < . k | | kℓ ( ) k kℓ ( )k | | kℓ ( ) k kℓ1(Z) ∞ (s 1) s In particular, we have that ρ , ρ 2 Z 6 ρ < . h| |∗ − | |iℓ ( ) k kℓ1(Z) ∞ Third step. Thanks to the result shown in the previous step, observe first that

(s 1) ρ(k2)ρ(k2 k3)ρ(k3 k4) . . . ρ(ks 1 ks)ρ(ks) = ρ ∗ − , ρ ℓ2(Z) < . | − − − − | h| | | |i ∞ k2,...,ks Z X ∈ 53 Hence, one can apply dominated convergence to get, as n , that → ∞ σs n κ (F ) 2s 1(s 1)! n s n − − n 1 n 1 k1 1 − − − = ρ(k2)ρ(k2 k3)ρ(k3 k4) . . . ρ(ks 1 ks)ρ(ks) n − − − − k1=0 k2,...,ks= k1 X X − = ρ(k2)ρ(k2 k3)ρ(k3 k4) . . . ρ(ks 1 ks)ρ(ks) − − − − k2,...,ks Z X ∈ max k2, . . . , ks min k2, . . . , ks 1 1 1 { } 0 { } k2

Since σ √2n ρ 2 Z as n , the desired conclusion follows. n ∼ k kℓ ( ) → ∞ Corollary 9.6 Let F be as above (with H < 1 ), let N (0, 1), and let ρ be given by (5.69). n 2 ∼ N Then, for all x R, we have ∈

2 2 ρ∗ , ρ ℓ2(Z) 2 x 6 6 h i 2 √n P (Fn x) P (N x) 2 (1 x ) e− as n . − → 3 ρ ℓ2(Z) − → ∞  k k In particular, we deduce that there exists dH > 0 such that d H 6 P (F 6 0) P (N 6 0) 6 d (F ,N), n > 1. (9.114) √n n − TV n

Proof. The desired conclusion follows immediately by combining Propositions 9.3 and 9.5.

By paying closer attention to the used estimates, one may actually show that (9.114) holds true 5 1 for any H < 8 (not only H < 2 ). See [32, Theorem 9.5.1] for the details.

To go further. The paper [30] contains several other examples of application of Theorem 9.2 and Proposition 9.3. In the reference [4], one shows that the deterministic sequence

max E[F 3],E[F 4] 3 , n > 1, { n n − } completely characterizes the rate of convergence (with respect to smooth distances) in CLTs involving chaotic random variables.

10 An extension to the Poisson space (following the invited talk by Giovanni Peccati)

1,2 Let B = (B ) > be a Brownian motion, let F be any centered element of D and let N (0, 1). t t 0 ∼ N We know from Theorem 5.2 that

1 d (F,N) 6 2 E[ 1 DF, DL− F 2 R ]. (10.115) TV | − h − iL ( +)|

54 The aim of this section, which follows [43, 44], is to explain how to deduce inequalities of the type (10.115), when F is a regular functional of a Poisson measure η and when the target law N is either Gaussian or Poisson. We first need to introduce the basic concepts in this framework.

Poisson measure. In what follows, we shall use the symbol P o(λ) to indicate the Poisson k distribution of parameter λ > 0 (that is, P o(λ) if and only if P ( = k) = e λ λ for Pλ ∼ Pλ − k! all k N), with the convention that P o(0) = δ (Dirac mass at 0). Set A = Rd with d > 1, let ∈ 0 A be the Borel σ-field on A, and let µ be a positive, σ-finite and atomless measure over (A, ). We A set = B : µ(B) < . Aµ { ∈ A ∞}

Definition 10.1 A Poisson measure η with control µ is an object of the form η(B) B µ with the { } ∈A following features: (1) for all B , we have η(B) P o(µ(B)). ∈ Aµ ∼ (2) for all B,C with B C = , the random variables η(B) and η(C) are independent. ∈ Aµ ∩ 6 ∅ Also, we note η(B) = η(B) µ(B). − Remark 10.2 1. As a simple example, note that for d = 1 and µ = λ Leb (with ‘Leb’ the b × Lebesgue measure) the process η([0, t]) > is nothing but a Poisson process with intensity { }t 0 λ.

2. Let µ be a σ-finite atomless measure over (A, ), and observe that this implies that there A exists a sequence of disjoint sets A : j > 1 such that A = A. For every j = 1, 2, ... { j } ⊂ Aµ ∪j j belonging to the set J0 of those indices such that µ(Aj) > 0 consider the following objects: (j) (j) X = Xk : k > 1 is a sequence of i.i.d. random variables with values in Aj and with { }µ| common distribution Aj ; P is a Poisson random variable with parameter µ(A ). Assume µ(Aj ) j j moreover that : (i) X(j) is independent of X(k) for every k = j, (ii) P is independent of 6 j P for every k = j, and (iii) the classes X(j) and P are independent. Then, it is a k 6 { } { j } straightforward computation to verify that the random point measure

Pj

η( ) = δ (j) ( ), · Xk · j J0 k=1 X∈ X 0 where δx indicates the Dirac mass at x and k=1 = 0 by convention, is a a Poisson random measure with control µ. See e.g. [49, Section 1.7]. P

Multiple integrals and chaotic expansion. As a preliminary remark, let us observe that E[η(B)] = 0 and E[η(B)2] = µ(B) for all B . For any q > 1, set L2(µq) = L2(Aq, q, µq). We ∈ Aµ A want to appropriately define b b ηb Iq (f) = f(x1, . . . , xq)η(dx1) ... η(dxq) q ZA when f L2(µq). To reach ourb goal, web proceed in a classical way. We first consider the subset ∈ (µq) of simple functions, which is defined as E (µq) = span 1 ... 1 , with B ,...,B such that B B = for all i = j . E B1 ⊗ ⊗ Bq 1 q ∈ Aµ i ∩ j ∅ 6  55 When f = 1 ... 1 with B ,...,B such that B B = for all i = j, we naturally B1 ⊗ ⊗ Bq 1 q ∈ Aµ i ∩ j ∅ 6 set

ηb Iq (f) := η(B1) ... η(Bq) = f(x1, . . . , xq)η(dx1) ... η(dxq). q ZA (For such a simpleb functionb f, note that the right-handb side inb the previous formula makes perfectly ηb sense by considering η as a signed measure.) We can extend by linearity the definition of Iq (f) to any f (µq). It is then a simple computation to check that ∈ E ηb ηb b E[I (f)I (g)] = p!δ f, g 2 p p q p,q h iL (µ ) for all f (µp) and g (µeq), with f (resp. g) the symmetrization of f (resp. g) and δ the ∈ E ∈ E e p,q Kronecker symbol. Since (µq) is dense in L2(µq) (it is precisely here that the fact that µ has E ηb 2 q no atom is crucial!), we can define Iq (fe) by isometrye to any f L (µ ). Relevant properties of ηb ηb ηb ηb ∈ ηb Iq (f) include E[Iq (f)] = 0, Iq (f) = Iq (f) and (importantly!) the fact that Iq (f) is a true multiple integral when f (µq). ∈ E e ηb Definition 10.3 Fix q > 1. The set of random variables of the form Iq (f) is called the qth Poisson- Wiener chaos.

In this framework, we have an analogue of the chaotic decomposition (4.39) – see e.g. [45, Corollary 10.0.5] for a proof.

Theorem 10.4 For all F L2(σ η ) (that is, for all random variable F which is square integrable ∈ { } and measurable with respect to η), we have

∞ ηb F = E[F ] + Iq (fq), (10.116) Xq=1 q 2 q where the kernels fq are (µ -a.e.) symmetric elements of L (µ ) and are uniquely determined by F .

Multiplication formula and contractions. When f (µp) and g (µq) are symmetric, we ∈ E ∈ E define, for all r = 0, . . . , p q and l = 0, . . . , r: ∧ l f ⋆r g(x1, . . . , xp+q r l) − −

= f(y1, . . . , yl, x1, . . . , xr l, xr l+1, . . . , xp l)g(y1, . . . , yl, x1, . . . , xr l, xp l+1, . . . , xp+q r l) l − − − − − − − ZA µ(dy ) . . . µ(dy ). × 1 l We then have the following product formula, compare with (4.43).

Theorem 10.5 (Product formula) Let $p,q\geq 1$ and let $f\in\mathcal E(\mu^p)$ and $g\in\mathcal E(\mu^q)$ be symmetric. Then

\[
I_p^{\hat\eta}(f)\,I_q^{\hat\eta}(g)=\sum_{r=0}^{p\wedge q}r!\binom{p}{r}\binom{q}{r}\sum_{l=0}^{r}\binom{r}{l}\,I_{p+q-r-l}^{\hat\eta}\big(\widetilde{f\star_r^l g}\big).
\]
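For $p=q=1$, Theorem 10.5 reads $I_1^{\hat\eta}(f)I_1^{\hat\eta}(g)=I_2^{\hat\eta}(\widetilde{f\otimes g})+I_1^{\hat\eta}(fg)+\langle f,g\rangle_{L^2(\mu)}$. Taking $f=g=\mathbf 1_B$ and using the fact (standard, and taken for granted here) that $I_2^{\hat\eta}(\mathbf 1_B^{\otimes 2})$ coincides with the second Charlier polynomial $\eta(B)(\eta(B)-1)-2\mu(B)\hat\eta(B)-\mu(B)^2$, the formula becomes an exact pathwise identity, which the following sketch (Python; the value $\mu(B)=\lambda$ is arbitrary) verifies over Poisson samples.

\begin{verbatim}
import numpy as np

lam = 2.5                                            # lam = mu(B), arbitrary
n = np.random.default_rng(2).poisson(lam, 100_000)   # samples of eta(B)

eta_hat = n - lam                                    # I_1(1_B)
I2 = n * (n - 1) - 2 * lam * eta_hat - lam**2        # I_2(1_B (x) 1_B), 2nd Charlier poly

lhs = eta_hat**2                                     # I_1(1_B)^2
rhs = I2 + eta_hat + lam                             # product formula with p = q = 1
print("max |lhs - rhs| =", np.abs(lhs - rhs).max())  # 0 up to float roundoff
\end{verbatim}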

Proof. Recall that, when dealing with functions in $\mathcal E(\mu^p)$, $I_p^{\hat\eta}(f)$ is a true multiple integral (by seeing $\hat\eta$ as a signed measure). We deduce

\[
I_p^{\hat\eta}(f)\,I_q^{\hat\eta}(g)=\int_{A^{p+q}}f(x_1,\ldots,x_p)g(y_1,\ldots,y_q)\,\hat\eta(dx_1)\ldots\hat\eta(dx_p)\hat\eta(dy_1)\ldots\hat\eta(dy_q).
\]
By definition of $f$ (the same applies for $g$), we have that $f(x_1,\ldots,x_p)=0$ when $x_i=x_j$ for some $i\neq j$. Consider $r=0,\ldots,p\wedge q$, as well as pairwise disjoint indices $i_1,\ldots,i_r\in\{1,\ldots,p\}$ and pairwise disjoint indices $j_1,\ldots,j_r\in\{1,\ldots,q\}$. Set $\{k_1,\ldots,k_{p-r}\}=\{1,\ldots,p\}\setminus\{i_1,\ldots,i_r\}$ and $\{l_1,\ldots,l_{q-r}\}=\{1,\ldots,q\}\setminus\{j_1,\ldots,j_r\}$. We have, since $\mu$ is atomless and using $\hat\eta(dx)=\eta(dx)-\mu(dx)$,
\begin{align*}
&\int_{A^{p+q}}f(x_1,\ldots,x_p)g(y_1,\ldots,y_q)\mathbf 1_{\{x_{i_1}=y_{j_1},\ldots,x_{i_r}=y_{j_r}\}}\,\hat\eta(dx_1)\ldots\hat\eta(dx_p)\hat\eta(dy_1)\ldots\hat\eta(dy_q)\\
&=\int_{A^{p+q-2r}}f(x_{k_1},\ldots,x_{k_{p-r}},x_{i_1},\ldots,x_{i_r})g(y_{l_1},\ldots,y_{l_{q-r}},x_{i_1},\ldots,x_{i_r})\\
&\qquad\times\hat\eta(dx_{k_1})\ldots\hat\eta(dx_{k_{p-r}})\hat\eta(dy_{l_1})\ldots\hat\eta(dy_{l_{q-r}})\eta(dx_{i_1})\ldots\eta(dx_{i_r})\\
&=\int_{A^{p+q-2r}}f(x_1,\ldots,x_{p-r},a_1,\ldots,a_r)g(y_1,\ldots,y_{q-r},a_1,\ldots,a_r)\\
&\qquad\times\hat\eta(dx_1)\ldots\hat\eta(dx_{p-r})\hat\eta(dy_1)\ldots\hat\eta(dy_{q-r})\eta(da_1)\ldots\eta(da_r).
\end{align*}
By decomposing over the hyperdiagonals $\{x_i=y_j\}$, we deduce that
\begin{align*}
I_p^{\hat\eta}(f)\,I_q^{\hat\eta}(g)&=\sum_{r=0}^{p\wedge q}r!\binom{p}{r}\binom{q}{r}\int_{A^{p+q-2r}}f(x_1,\ldots,x_{p-r},a_1,\ldots,a_r)g(y_1,\ldots,y_{q-r},a_1,\ldots,a_r)\\
&\qquad\times\hat\eta(dx_1)\ldots\hat\eta(dx_{p-r})\hat\eta(dy_1)\ldots\hat\eta(dy_{q-r})\eta(da_1)\ldots\eta(da_r),
\end{align*}
and we get the desired conclusion by using the relationship
\[
\eta(da_1)\ldots\eta(da_r)=\big(\hat\eta(da_1)+\mu(da_1)\big)\ldots\big(\hat\eta(da_r)+\mu(da_r)\big).
\]

Malliavin operators. Each time we deal with a random element $F$ of $L^2(\sigma\{\eta\})$, in what follows we always consider its chaotic expansion (10.116).

Definition 10.6 1. Set $\mathrm{Dom}\,D=\{F\in L^2(\sigma\{\eta\}):\sum_q q\,q!\,\|f_q\|^2_{L^2(\mu^q)}<\infty\}$. If $F\in\mathrm{Dom}\,D$, we set
\[
D_tF=\sum_{q=1}^{\infty}qI_{q-1}^{\hat\eta}(f_q(\cdot,t)),\quad t\in A.
\]
The operator $D$ is called the Malliavin derivative.
2. Set $\mathrm{Dom}\,L=\{F\in L^2(\sigma\{\eta\}):\sum_q q^2\,q!\,\|f_q\|^2_{L^2(\mu^q)}<\infty\}$. If $F\in\mathrm{Dom}\,L$, we set

\[
LF=-\sum_{q=1}^{\infty}qI_q^{\hat\eta}(f_q).
\]
The operator $L$ is called the generator of the Ornstein-Uhlenbeck semigroup.
3. If $F\in L^2(\sigma\{\eta\})$, we set
\[
L^{-1}F=-\sum_{q=1}^{\infty}\frac{1}{q}I_q^{\hat\eta}(f_q).
\]
The operator $L^{-1}$ is called the pseudo-inverse of $L$.

It is readily checked that $LL^{-1}F=F-E[F]$ for $F\in L^2(\sigma\{\eta\})$. Moreover, proceeding mutatis mutandis as in the proof of Theorem 4.15, we get the following result.

Proposition 10.7 Let $F\in L^2(\sigma\{\eta\})$ and let $G\in\mathrm{Dom}\,D$. Then
\[
\mathrm{Cov}(F,G)=E\big[\langle DG,-DL^{-1}F\rangle_{L^2(\mu)}\big]. \tag{10.117}
\]
The operator $D$ does not satisfy the chain rule. Instead, it admits an ‘add-one cost’ representation which plays an identical role.

Theorem 10.8 (Nualart, Vives, 1990; see [42]) Let $F\in\mathrm{Dom}\,D$. Since $F$ is measurable with respect to $\eta$, we can view it as $F=F(\eta)$ with a slight abuse of notation. Then

\[
D_tF=F(\eta+\delta_t)-F(\eta),\quad t\in A, \tag{10.118}
\]
where $\delta_t$ stands for the Dirac mass at $t$.

Proof. By linearity and approximation, it suffices to prove the claim for $F=I_q^{\hat\eta}(f)$, with $q\geq 1$ and $f\in\mathcal E(\mu^q)$ symmetric. In this case, we have

\[
F(\eta+\delta_t)=\int_{A^q}f(x_1,\ldots,x_q)\big(\hat\eta(dx_1)+\delta_t(dx_1)\big)\ldots\big(\hat\eta(dx_q)+\delta_t(dx_q)\big).
\]
Let us expand the integrator. Each member of the expansion containing strictly more than one Dirac mass gives a contribution equal to zero, since $f$ vanishes on diagonals. We therefore deduce that

\begin{align*}
F(\eta+\delta_t)&=F(\eta)+\sum_{l=1}^{q}\int_{A^{q-1}}f(x_1,\ldots,x_{l-1},t,x_{l+1},\ldots,x_q)\,\hat\eta(dx_1)\ldots\hat\eta(dx_{l-1})\hat\eta(dx_{l+1})\ldots\hat\eta(dx_q)\\
&=F(\eta)+qI_{q-1}^{\hat\eta}(f(t,\cdot))\quad\text{by symmetry of }f\\
&=F(\eta)+D_tF.
\end{align*}

As an immediate corollary of the previous theorem, we get the formula

\[
D_t(F^2)=(F+D_tF)^2-F^2=2FD_tF+(D_tF)^2,\quad t\in A,
\]
which shows how far $D$ is from satisfying the chain rule (4.47).
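The add-one cost representation (10.118) and the formula above are easy to test on a computer, since a Poisson measure can be represented by its atoms and $\eta\mapsto\eta+\delta_t$ amounts to appending the point $t$. A minimal sketch (Python; the choice $F=\eta(B)$ and the uniform setting are illustrative assumptions):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, B = 2.0, (0.3, 0.8)

def F(atoms):
    """F(eta) = eta(B), viewed as a function of the configuration of atoms."""
    return int(np.sum((atoms >= B[0]) & (atoms < B[1])))

for _ in range(5):
    atoms = rng.uniform(0, 1, rng.poisson(lam))  # atoms of eta (control lam*Leb)
    t = rng.uniform(0, 1)
    plus_t = np.append(atoms, t)                 # configuration of eta + delta_t
    DtF = F(plus_t) - F(atoms)                   # add-one cost (10.118); equals 1_B(t)
    lhs = F(plus_t)**2 - F(atoms)**2             # D_t(F^2)
    rhs = 2 * F(atoms) * DtF + DtF**2
    print(lhs == rhs, DtF == int(B[0] <= t < B[1]))
\end{verbatim}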

Gaussian approximation. In our framework, the appropriate distance turns out to be the following one.

Definition 10.9 The Wasserstein distance between the laws of two real-valued random variables $Y$ and $Z$ is defined by

\[
d_W(Y,Z)=\sup_{h\in\mathrm{Lip}(1)}\big|E[h(Y)]-E[h(Z)]\big|, \tag{10.119}
\]
where $\mathrm{Lip}(1)$ stands for the set of Lipschitz functions $h:\mathbb R\to\mathbb R$ with constant 1.

Since we are here dealing with Lipschitz functions $h$, we need a suitable version of Stein's lemma; compare with Lemma 3.1.

Lemma 10.10 (Stein, 1972; see [52]) Suppose $h:\mathbb R\to\mathbb R$ is Lipschitz with constant 1. Let $N\sim\mathcal N(0,1)$. Then, there exists a solution to the equation
\[
f'(x)-xf(x)=h(x)-E[h(N)],\quad x\in\mathbb R,
\]
that satisfies $\|f'\|_\infty\leq\sqrt{2/\pi}$ and $\|f''\|_\infty\leq 2$.

Proof. Let us recall that, according to Rademacher's theorem, a function which is Lipschitz continuous on $\mathbb R$ is almost everywhere differentiable. Let $f:\mathbb R\to\mathbb R$ be the (well-defined!) function given by
\[
f(x)=-\int_0^{\infty}\frac{e^{-t}}{\sqrt{1-e^{-2t}}}\,E\big[h(e^{-t}x+\sqrt{1-e^{-2t}}\,N)N\big]\,dt. \tag{10.120}
\]
By dominated convergence we have that $f\in C^1$ with
\[
f'(x)=-\int_0^{\infty}\frac{e^{-2t}}{\sqrt{1-e^{-2t}}}\,E\big[h'(e^{-t}x+\sqrt{1-e^{-2t}}\,N)N\big]\,dt.
\]
We deduce, for any $x\in\mathbb R$,
\[
|f'(x)|\leq\int_0^{\infty}\frac{e^{-2t}}{\sqrt{1-e^{-2t}}}\,dt\;E|N|=\sqrt{\frac{2}{\pi}}. \tag{10.121}
\]
Now, let $F:\mathbb R\to\mathbb R$ be the function given by
\[
F(x)=\int_0^{\infty}E\big[h(N)-h(e^{-t}x+\sqrt{1-e^{-2t}}\,N)\big]\,dt,\quad x\in\mathbb R.
\]
Observe that $F$ is well-defined, since $h(N)-h(e^{-t}x+\sqrt{1-e^{-2t}}\,N)$ is integrable due to
\[
\big|h(N)-h(e^{-t}x+\sqrt{1-e^{-2t}}\,N)\big|\leq e^{-t}|x|+\big(1-\sqrt{1-e^{-2t}}\big)|N|\leq e^{-t}|x|+e^{-2t}|N|,
\]
where the last inequality follows from $1-\sqrt{1-u}=u/(\sqrt{1-u}+1)\leq u$ if $u\in[0,1]$. By dominated convergence, we immediately see that $F$ is differentiable with

\[
F'(x)=-\int_0^{\infty}e^{-t}E\big[h'(e^{-t}x+\sqrt{1-e^{-2t}}\,N)\big]\,dt.
\]
By integrating by parts, we see that $F'(x)=f(x)$. Moreover, by using the notation introduced in Section 4, we can write

\begin{align*}
f'(x)-xf(x)&=F''(x)-xF'(x)=LF(x),\quad\text{by decomposing in Hermite polynomials, since }LH_q=-qH_q=H_q''-XH_q'\\
&=-\int_0^{\infty}LP_th(x)\,dt,\quad\text{since }F(x)=\int_0^{\infty}\big(E[h(N)]-P_th(x)\big)\,dt\\
&=-\int_0^{\infty}\frac{d}{dt}P_th(x)\,dt\\
&=P_0h(x)-P_{\infty}h(x)=h(x)-E[h(N)].
\end{align*}
This proves the claim for $\|f'\|_\infty$. The claim for $\|f''\|_\infty$ is a bit more difficult to achieve; we refer to Stein [52, pp. 25-28] to keep the length of this survey within bounds.
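The representation (10.120) also lends itself to a direct numerical check. In the sketch below (Python), I use the substitution $e^{-t}=\cos\theta$, under which (10.120) becomes $f(x)=-\int_0^{\pi/2}E[h(x\cos\theta+N\sin\theta)N]\,d\theta$ (a reformulation of mine, removing the singularity at $t=0$); the test function $h(x)=|x|$ and the quadrature sizes are arbitrary choices. The two printed columns should nearly agree, confirming $f'(x)-xf(x)=h(x)-E[h(N)]$.

\begin{verbatim}
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

nodes, w = hermegauss(80)
w = w / np.sqrt(2 * np.pi)            # sum(w * g(nodes)) ~ E[g(N)], N ~ N(0,1)

h = np.abs                            # Lipschitz test function, constant 1
Eh = np.sum(w * h(nodes))

def f(x, m=2000):
    th = (np.arange(m) + 0.5) * (np.pi / 2) / m   # midpoint rule on (0, pi/2)
    inner = np.array([np.sum(w * h(x * np.cos(t) + nodes * np.sin(t)) * nodes)
                      for t in th])
    return -inner.mean() * (np.pi / 2)

for x in (-1.0, 0.3, 2.0):
    eps = 1e-4
    fp = (f(x + eps) - f(x - eps)) / (2 * eps)    # numerical f'(x)
    print(f"x={x:+.1f}: f'-xf = {fp - x*f(x):+.5f}   h - E[h(N)] = {h(x) - Eh:+.5f}")
\end{verbatim}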

We can now derive a bound for the Gaussian approximation of any centered element $F$ belonging to $\mathrm{Dom}\,D$; compare with (10.115).

Theorem 10.11 (Peccati, Solé, Taqqu, Utzet, 2010; see [44]) Consider $F\in\mathrm{Dom}\,D$ with $E[F]=0$. Then, with $N\sim\mathcal N(0,1)$,

\[
d_W(F,N)\leq\sqrt{\frac{2}{\pi}}\,E\big|1-\langle DF,-DL^{-1}F\rangle_{L^2(\mu)}\big|+\int_A E\big[(D_tF)^2\,|D_tL^{-1}F|\big]\,\mu(dt).
\]
Proof. Let $h\in\mathrm{Lip}(1)$ and let $f$ be the function of Lemma 10.10. Using (10.118) and a Taylor formula, we can write

\[
D_tf(F)=f(F+D_tF)-f(F)=f'(F)D_tF+R(t),
\]
with $|R(t)|\leq\frac12\|f''\|_\infty(D_tF)^2\leq(D_tF)^2$. We deduce, using (10.117) as well,
\begin{align*}
E[h(F)]-E[h(N)]&=E[f'(F)]-E[Ff(F)]=E[f'(F)]-E\big[\langle Df(F),-DL^{-1}F\rangle_{L^2(\mu)}\big]\\
&=E\big[f'(F)\big(1-\langle DF,-DL^{-1}F\rangle_{L^2(\mu)}\big)\big]+E\int_A\big(D_tL^{-1}F\big)R(t)\,\mu(dt).
\end{align*}
Consequently, since $\|f'\|_\infty\leq\sqrt{2/\pi}$,
\begin{align*}
d_W(F,N)=\sup_{h\in\mathrm{Lip}(1)}\big|E[h(F)]-E[h(N)]\big|\leq\sqrt{\frac{2}{\pi}}\,E\big|1-\langle DF,-DL^{-1}F\rangle_{L^2(\mu)}\big|+\int_A E\big[(D_tF)^2\,|D_tL^{-1}F|\big]\,\mu(dt).
\end{align*}
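On the simplest example $F=\hat\eta(B)/\sqrt{\lambda}$ with $\lambda=\mu(B)$, we have $D_tF=-D_tL^{-1}F=\lambda^{-1/2}\mathbf 1_B(t)$, so the first term of the bound vanishes and the second equals $\lambda^{-1/2}$: Theorem 10.11 recovers the classical rate for the normal approximation of a standardized Poisson random variable. A quick Monte Carlo sketch (Python; using SciPy's empirical 1-Wasserstein distance between samples as a stand-in for $d_W$ is an approximation of mine):

\begin{verbatim}
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(4)
normal = rng.standard_normal(200_000)

for lam in (1.0, 10.0, 100.0):
    F = (rng.poisson(lam, 200_000) - lam) / np.sqrt(lam)  # standardized Poisson
    print(f"lambda={lam:6.1f}: empirical d_W ~ {wasserstein_distance(F, normal):.4f},"
          f" bound 1/sqrt(lambda) = {lam**-0.5:.4f}")
\end{verbatim}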

Poisson approximation. To conclude this section, we will prove a very interesting result, which may be seen as a Poisson counterpart of Theorem 10.11.

Theorem 10.12 (Peccati, 2012; see [43]) Let $F\in\mathrm{Dom}\,D$ with $E[F]=\lambda>0$ and $F$ taking its values in $\mathbb N$. Let $P_\lambda\sim\mathrm{Po}(\lambda)$. Then
\[
\sup_{C\subset\mathbb N}\big|P(F\in C)-P(P_\lambda\in C)\big|\leq\frac{1-e^{-\lambda}}{\lambda}\,E\big|\lambda-\langle DF,-DL^{-1}F\rangle_{L^2(\mu)}\big|+\frac{1-e^{-\lambda}}{\lambda^2}\int_A E\big|D_tF(D_tF-1)D_tL^{-1}F\big|\,\mu(dt). \tag{10.122}
\]
Just as a mere illustration, consider the case where $F=\eta(B)=\mu(B)+I_1^{\hat\eta}(\mathbf 1_B)$ with $B\in\mathcal A_\mu$ and $\lambda=\mu(B)$. We then have $DF=-DL^{-1}F=\mathbf 1_B$, so that $\langle DF,-DL^{-1}F\rangle_{L^2(\mu)}=\int_B d\mu=\mu(B)=\lambda$ and $D_tF(D_tF-1)=0$ a.e. The right-hand side of (10.122) is therefore zero, as was expected since $F\sim\mathrm{Po}(\lambda)$.

During the proof of Theorem 10.12, we shall use an analogue of Lemma 3.1 in the Poisson context, which reads as follows.

Lemma 10.13 (Chen, 1975; see [8]) Let $C\subset\mathbb N$, let $\lambda>0$ and let $P_\lambda\sim\mathrm{Po}(\lambda)$. The equation with unknown $f:\mathbb N\to\mathbb R$,
\[
\lambda f(k+1)-kf(k)=\mathbf 1_C(k)-P(P_\lambda\in C),\quad k\in\mathbb N, \tag{10.123}
\]
admits a unique solution such that $f(0)=0$, denoted by $f_C$. Moreover, by setting $\Delta f(k)=f(k+1)-f(k)$, we have $\|\Delta f_C\|_\infty\leq\frac{1-e^{-\lambda}}{\lambda}$ and $\|\Delta^2f_C\|_\infty\leq\frac{2}{\lambda}\|\Delta f_C\|_\infty$.

Proof. We only provide a proof for the bound on $\Delta f_C$; the estimate on $\Delta^2f_C$ is proved e.g. by Daly in [10]. Multiplying both sides of (10.123) by $\lambda^k/k!$ and summing up yields that, for every $k\geq 1$,

\begin{align}
f_C(k)&=\frac{(k-1)!}{\lambda^k}\sum_{r=0}^{k-1}\frac{\lambda^r}{r!}\big[\mathbf 1_C(r)-P(P_\lambda\in C)\big] \tag{10.124}\\
&=\sum_{j\in C}f_{\{j\}}(k) \tag{10.125}\\
&=-f_{C^c}(k) \tag{10.126}\\
&=-\frac{(k-1)!}{\lambda^k}\sum_{r=k}^{\infty}\frac{\lambda^r}{r!}\big[\mathbf 1_C(r)-P(P_\lambda\in C)\big], \tag{10.127}
\end{align}
where $C^c$ denotes the complement of $C$ in $\mathbb N$. (Identity (10.125) comes from the additivity property of $C\mapsto f_C$, identity (10.126) is because $f_{\mathbb N}\equiv 0$, and identity (10.127) is due to
\[
\sum_{r=0}^{\infty}\frac{\lambda^r}{r!}\big[\mathbf 1_C(r)-P(P_\lambda\in C)\big]=e^{\lambda}\,E\big[\mathbf 1_C(P_\lambda)-E[\mathbf 1_C(P_\lambda)]\big]=0.\big)
\]
Since $f_C(k)-f_C(k+1)=f_{C^c}(k+1)-f_{C^c}(k)$ (due to (10.126)), it is sufficient to prove that, for every $k\geq 1$ and every $C\subset\mathbb N$, $f_C(k+1)-f_C(k)\leq(1-e^{-\lambda})/\lambda$. One has the following fact: for every $j\geq 1$ the mapping $k\mapsto f_{\{j\}}(k)$ is negative and decreasing for $k=1,\ldots,j$ and positive and decreasing for $k\geq j+1$. Indeed, we use (10.124) to deduce that, if $1\leq k\leq j$,

\[
f_{\{j\}}(k)=-e^{-\lambda}\,\frac{\lambda^{j}}{j!}\sum_{r=1}^{k}\lambda^{-r}\,\frac{(k-1)!}{(k-r)!}\quad\text{(which is negative and decreasing in $k$),}
\]
whereas (10.127) implies that, if $k\geq j+1$,

\[
f_{\{j\}}(k)=e^{-\lambda}\,\frac{\lambda^{j}}{j!}\sum_{r=0}^{\infty}\lambda^{r}\,\frac{(k-1)!}{(k+r)!}\quad\text{(which is positive and decreasing in $k$).}
\]

Using (10.125), one therefore infers that $f_C(k+1)-f_C(k)\leq f_{\{k\}}(k+1)-f_{\{k\}}(k)$ for every $k\geq 0$. Since

\[
f_{\{k\}}(k+1)-f_{\{k\}}(k)=e^{-\lambda}\left[\sum_{r=0}^{k-1}\frac{\lambda^r}{r!\,k}+\sum_{r=k+1}^{\infty}\frac{\lambda^{r-1}}{r!}\right]=\frac{e^{-\lambda}}{\lambda}\left[\sum_{r=1}^{k}\frac{r}{k}\times\frac{\lambda^r}{r!}+\sum_{r=k+1}^{\infty}\frac{\lambda^r}{r!}\right]\leq\frac{1-e^{-\lambda}}{\lambda},
\]
the proof is concluded.
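Lemma 10.13 can be probed numerically. The tail representation (10.127) provides a stable way to evaluate $f_C$ (the forward recursion read off from (10.123) amplifies rounding errors when $k\gg\lambda$), after which both the equation (10.123) and the bound on $\Delta f_C$ can be checked directly. A sketch (Python; the value of $\lambda$, the sets $C$ and the truncation level are arbitrary choices):

\begin{verbatim}
import numpy as np
from scipy.stats import poisson

lam = 3.0

def f_C(k, C, pC, tail=200):
    """f_C(k) via (10.127): -((k-1)!/lam^k) sum_{r>=k} (lam^r/r!)(1_C(r) - pC)."""
    if k == 0:
        return 0.0
    s, t = 0.0, 1.0 / k          # t = (k-1)! lam^(r-k) / r!, starting at r = k
    for r in range(k, k + tail):
        s += t * ((r in C) - pC)
        t *= lam / (r + 1)
    return -s

for C in ({0}, {1, 4, 5}, set(range(0, 60, 2))):
    pC = sum(poisson.pmf(j, lam) for j in C)
    f = np.array([f_C(k, C, pC) for k in range(61)])
    k = np.arange(60)
    ind = np.array([kk in C for kk in k], dtype=float)
    res = np.abs(lam * f[k + 1] - k * f[k] - (ind - pC)).max()   # residual of (10.123)
    print(f"residual {res:.1e}; max|Df_C| = {np.abs(np.diff(f)).max():.4f}"
          f" <= {(1 - np.exp(-lam)) / lam:.4f}")
\end{verbatim}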

We are now in a position to prove Theorem 10.12.

Proof of Theorem 10.12. The main ingredient is the following simple inequality, which is a kind of Taylor formula: for all $k,a\in\mathbb N$,
\[
\big|f(k)-f(a)-\Delta f(a)(k-a)\big|\leq\frac12\|\Delta^2f\|_\infty\,\big|(k-a)(k-a-1)\big|. \tag{10.128}
\]

Assume for the time being that (10.128) holds true, and fix $C\subset\mathbb N$. We have, using Lemma 10.13 and then (10.117),

\begin{align*}
P(F\in C)-P(P_\lambda\in C)&=E[\lambda f_C(F+1)]-E[Ff_C(F)]\\
&=\lambda E[\Delta f_C(F)]-E[(F-\lambda)f_C(F)]\\
&=\lambda E[\Delta f_C(F)]-E\big[\langle Df_C(F),-DL^{-1}F\rangle_{L^2(\mu)}\big].
\end{align*}

Now, combining (10.118) with (10.128), we can write

\[
D_tf_C(F)=\Delta f_C(F)\,D_tF+S(t),
\]

with $|S(t)|\leq\frac12\|\Delta^2f_C\|_\infty\,|D_tF(D_tF-1)|\leq\frac{1-e^{-\lambda}}{\lambda^2}|D_tF(D_tF-1)|$; see indeed Lemma 10.13 for the last inequality. Putting all these bounds together and since $\|\Delta f_C\|_\infty\leq\frac{1-e^{-\lambda}}{\lambda}$ by Lemma 10.13, we get the desired conclusion. So, to conclude the proof, it remains to show that (10.128) holds true. Let us first assume that $k\geq a+2$. We then have

\begin{align*}
f(k)&=f(a)+\sum_{j=a}^{k-1}\Delta f(j)=f(a)+\Delta f(a)(k-a)+\sum_{j=a}^{k-1}\big(\Delta f(j)-\Delta f(a)\big)\\
&=f(a)+\Delta f(a)(k-a)+\sum_{j=a}^{k-1}\sum_{l=a}^{j-1}\Delta^2f(l)=f(a)+\Delta f(a)(k-a)+\sum_{l=a}^{k-2}\Delta^2f(l)(k-l-1),
\end{align*}
so that
\[
\big|f(k)-f(a)-\Delta f(a)(k-a)\big|\leq\|\Delta^2f\|_\infty\sum_{l=a}^{k-2}(k-l-1)=\frac12\|\Delta^2f\|_\infty(k-a)(k-a-1),
\]
that is, (10.128) holds true in this case. When $k=a$ or $k=a+1$, (10.128) is obviously true. Finally, consider the case $k\leq a-1$. We have
\begin{align*}
f(k)&=f(a)-\sum_{j=k}^{a-1}\Delta f(j)=f(a)+\Delta f(a)(k-a)+\sum_{j=k}^{a-1}\big(\Delta f(a)-\Delta f(j)\big)\\
&=f(a)+\Delta f(a)(k-a)+\sum_{j=k}^{a-1}\sum_{l=j}^{a-1}\Delta^2f(l)=f(a)+\Delta f(a)(k-a)+\sum_{l=k}^{a-1}\Delta^2f(l)(l-k+1),
\end{align*}
so that
\[
\big|f(k)-f(a)-\Delta f(a)(k-a)\big|\leq\|\Delta^2f\|_\infty\sum_{l=k}^{a-1}(l-k+1)=\frac12\|\Delta^2f\|_\infty(a-k)(a-k+1),
\]
that is, (10.128) holds true in this case as well. The proof of Theorem 10.12 is complete.

To go further. A multivariate extension of Theorem 10.11 can be found in [47]. The reference [19] contains several explicit applications of the tools developed in this section.

11 Fourth Moment Theorem and free probability

To conclude this survey, we shall explain how the Fourth Moment Theorem 1.3 extends to the theory of free probability, which provides a convenient framework for investigating limits of random matrices. We start with a short introduction to free probability. We refer to [22] for a systematic presentation and to [2] for specific results on Wigner multiple integrals.

Free tracial probability space. A non-commutative probability space is a von Neumann algebra $\mathscr A$ (that is, an algebra of operators on a complex separable Hilbert space, closed under adjoint and convergence in the weak operator topology) equipped with a trace $\varphi$, that is, a unital linear functional (meaning that it preserves the identity) which is weakly continuous, positive (meaning $\varphi(X)\geq 0$ whenever $X$ is a non-negative element of $\mathscr A$, i.e. whenever $X=YY^*$ for some $Y\in\mathscr A$), faithful (meaning that if $\varphi(YY^*)=0$ then $Y=0$), and tracial (meaning that $\varphi(XY)=\varphi(YX)$ for all $X,Y\in\mathscr A$, even though in general $XY\neq YX$).

Random variables. In a non-commutative probability space, we refer to the self-adjoint elements of the algebra as random variables. Any random variable $X$ has a law: this is the unique probability measure $\mu$ on $\mathbb R$ with the same moments as $X$; in other words, $\mu$ is such that

\[
\int_{\mathbb R}x^k\,d\mu(x)=\varphi(X^k),\quad k\geq 1. \tag{11.129}
\]
(The existence and uniqueness of $\mu$ follow from the positivity of $\varphi$; see [22, Proposition 3.13].)

Convergence in law. We say that a sequence $(X_{1,n},\ldots,X_{k,n})$, $n\geq 1$, of random vectors converges in law to a random vector $(X_{1,\infty},\ldots,X_{k,\infty})$, and we write
\[
(X_{1,n},\ldots,X_{k,n})\overset{\mathrm{law}}{\longrightarrow}(X_{1,\infty},\ldots,X_{k,\infty}),
\]
to indicate the convergence in the sense of (joint) moments, that is,

\[
\lim_{n\to\infty}\varphi\big(Q(X_{1,n},\ldots,X_{k,n})\big)=\varphi\big(Q(X_{1,\infty},\ldots,X_{k,\infty})\big), \tag{11.130}
\]
for any polynomial $Q$ in $k$ non-commuting variables.

We say that a sequence $(F_n)$ of non-commutative stochastic processes (that is, each $F_n$ is a one-parameter family of self-adjoint operators $F_n(t)$ in $(\mathscr A,\varphi)$) converges in the sense of finite-dimensional distributions to a non-commutative stochastic process $F_\infty$, and we write
\[
F_n\overset{\mathrm{f.d.d.}}{\longrightarrow}F_\infty,
\]
to indicate that, for any $k\geq 1$ and any $t_1,\ldots,t_k\geq 0$,

\[
(F_n(t_1),\ldots,F_n(t_k))\overset{\mathrm{law}}{\longrightarrow}(F_\infty(t_1),\ldots,F_\infty(t_k)).
\]

Free independence. In the free probability setting, the notion of independence (introduced by Voiculescu in [55]) goes as follows. Let $\mathscr A_1,\ldots,\mathscr A_p$ be unital subalgebras of $\mathscr A$. Let $X_1,\ldots,X_m$ be elements chosen from among the $\mathscr A_i$'s such that, for $1\leq j<m$, two consecutive elements $X_j$ and $X_{j+1}$ do not come from the same $\mathscr A_i$, and such that $\varphi(X_j)=0$ for each $j$. The subalgebras $\mathscr A_1,\ldots,\mathscr A_p$ are said to be free or freely independent if, in this circumstance,

\[
\varphi(X_1X_2\cdots X_m)=0. \tag{11.131}
\]

Random variables are called freely independent if the unital algebras they generate are freely independent. Freeness is in general much more complicated than classical independence. For example, if $X,Y$ are free and $m,n\geq 1$, then by (11.131),

\[
\varphi\big((X^m-\varphi(X^m)\mathbf 1)(Y^n-\varphi(Y^n)\mathbf 1)\big)=0.
\]
By expanding (and using the linearity of $\varphi$), we get

\[
\varphi(X^mY^n)=\varphi(X^m)\varphi(Y^n), \tag{11.132}
\]
which is what we would expect under classical independence. But, by setting $X_1=X_3=X-\varphi(X)\mathbf 1$ and $X_2=X_4=Y-\varphi(Y)\mathbf 1$ in (11.131), we also have
\[
\varphi\big((X-\varphi(X)\mathbf 1)(Y-\varphi(Y)\mathbf 1)(X-\varphi(X)\mathbf 1)(Y-\varphi(Y)\mathbf 1)\big)=0.
\]
By expanding, using (11.132) and the tracial property of $\varphi$ (for instance $\varphi(XYX)=\varphi(X^2Y)$), we get

\[
\varphi(XYXY)=\varphi(Y)^2\varphi(X^2)+\varphi(X)^2\varphi(Y^2)-\varphi(X)^2\varphi(Y)^2,
\]
which is different from $\varphi(X^2)\varphi(Y^2)$, the value one would have obtained if $X$ and $Y$ were classical independent random variables. Nevertheless, if $X,Y$ are freely independent, then their joint moments are determined by the moments of $X$ and $Y$ separately, exactly as in the classical case.
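Free independence can be observed numerically through random matrices: by Voiculescu's asymptotic freeness theorem (see [56]), independent large Wigner matrices are asymptotically free, so for shifted matrices the normalized trace of $XYXY$ should approach the value given by the formula above. A sketch (Python; the matrix size, the GOE-type model and the shifts $a,b$ are arbitrary choices of mine):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(5)
N, a, b = 1000, 0.7, -0.4

def wigner(n):
    """GOE-type matrix whose spectrum is ~ S(0,1) for large n."""
    G = rng.standard_normal((n, n)) / np.sqrt(n)
    return (G + G.T) / np.sqrt(2)

X = wigner(N) + a * np.eye(N)      # phi(X) ~ a,  phi(X^2) ~ 1 + a^2
Y = wigner(N) + b * np.eye(N)      # independent, hence asymptotically free of X

phi = lambda M: np.trace(M) / N    # normalized trace, the large-N analogue of phi
lhs = phi(X @ Y @ X @ Y)
rhs = b**2 * (1 + a**2) + a**2 * (1 + b**2) - a**2 * b**2
print(lhs, "vs", rhs)              # close for large N
\end{verbatim}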

Semicircular distribution. The semicircular distribution $\mathcal S(m,\sigma^2)$ with mean $m\in\mathbb R$ and variance $\sigma^2>0$ is the probability distribution
\[
\mathcal S(m,\sigma^2)(dx)=\frac{1}{2\pi\sigma^2}\sqrt{4\sigma^2-(x-m)^2}\;\mathbf 1_{\{|x-m|\leq 2\sigma\}}\,dx. \tag{11.133}
\]
If $m=0$, this distribution is symmetric around $0$, and therefore its odd moments are all $0$. A simple calculation shows that the even centered moments are given by (scaled) Catalan numbers: for non-negative integers $k$,

\[
\int_{m-2\sigma}^{m+2\sigma}(x-m)^{2k}\,\mathcal S(m,\sigma^2)(dx)=C_k\sigma^{2k},
\]
where $C_k=\frac{1}{k+1}\binom{2k}{k}$ is the $k$th Catalan number (see, e.g., [22, Lecture 2]). In particular, the variance is $\sigma^2$ while the centered fourth moment is $2\sigma^4$. The semicircular distribution plays here the role of the Gaussian distribution. It has the following similar properties:

1. If $S\sim\mathcal S(m,\sigma^2)$ and $a,b\in\mathbb R$, then $aS+b\sim\mathcal S(am+b,a^2\sigma^2)$.
2. If $S_1\sim\mathcal S(m_1,\sigma_1^2)$ and $S_2\sim\mathcal S(m_2,\sigma_2^2)$ are freely independent, then $S_1+S_2\sim\mathcal S(m_1+m_2,\sigma_1^2+\sigma_2^2)$.
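Both the semicircular density and its Catalan moments are visible in simulations: by Wigner's theorem (see [57]), the eigenvalue distribution of a large Wigner matrix approaches $\mathcal S(0,1)$. A sketch (Python; the matrix size is an arbitrary choice):

\begin{verbatim}
import numpy as np
from math import comb

rng = np.random.default_rng(6)
N = 2000
G = rng.standard_normal((N, N)) / np.sqrt(N)
eig = np.linalg.eigvalsh((G + G.T) / np.sqrt(2))   # spectrum ~ S(0,1) for large N

for k in range(1, 5):
    C_k = comb(2 * k, k) // (k + 1)                # Catalan number
    print(f"2k={2*k}: empirical {np.mean(eig**(2*k)):.3f}  vs  C_{k} = {C_k}")
\end{verbatim}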

Free Brownian Motion. A free Brownian motion $S=\{S(t)\}_{t\geq 0}$ is a non-commutative stochastic process with the following defining characteristics:

(1) S(0) = 0.

(2) For $t_2>t_1\geq 0$, the law of $S(t_2)-S(t_1)$ is the semicircular distribution of mean $0$ and variance $t_2-t_1$.

(3) For all $n$ and $t_n>\cdots>t_2>t_1>0$, the increments $S(t_1)$, $S(t_2)-S(t_1)$, \ldots, $S(t_n)-S(t_{n-1})$ are freely independent.

We may think of free Brownian motion as ‘infinite-dimensional matrix-valued Brownian motion’.

Wigner integral. Let $S=\{S(t)\}_{t\geq 0}$ be a free Brownian motion. Let us quickly sketch out the construction of the Wigner integral of $f$ with respect to $S$. For an indicator function $f=\mathbf 1_{[u,v]}$, the Wigner integral of $f$ is defined by

\[
\int_0^{\infty}\mathbf 1_{[u,v]}(x)\,dS(x)=S(v)-S(u).
\]
We then extend this definition by linearity to simple functions of the form $f=\sum_{i=1}^{k}\alpha_i\mathbf 1_{[u_i,v_i]}$, where the $[u_i,v_i]$ are disjoint intervals of $\mathbb R_+$. Simple computations show that
\begin{align}
&\varphi\left(\int_0^{\infty}f(x)\,dS(x)\right)=0 \tag{11.134}\\
&\varphi\left(\int_0^{\infty}f(x)\,dS(x)\times\int_0^{\infty}g(x)\,dS(x)\right)=\langle f,g\rangle_{L^2(\mathbb R_+)}. \tag{11.135}
\end{align}
By isometry, the definition of $\int_0^{\infty}f(x)\,dS(x)$ is extended to all $f\in L^2(\mathbb R_+)$, and (11.134)-(11.135) continue to hold in this more general setting.

Multiple Wigner integral. Let $S=\{S(t)\}_{t\geq 0}$ be a free Brownian motion, and let $q\geq 1$ be an integer. When $f\in L^2(\mathbb R_+^q)$ is real-valued, we write $f^*$ to indicate the function of $L^2(\mathbb R_+^q)$ given by $f^*(t_1,\ldots,t_q)=f(t_q,\ldots,t_1)$. Following [2], let us quickly sketch out the construction of the multiple Wigner integral of $f$ with respect to $S$. Let $D^q\subset\mathbb R_+^q$ be the collection of all diagonals, i.e.
\[
D^q=\{(t_1,\ldots,t_q)\in\mathbb R_+^q:\ t_i=t_j\ \text{for some}\ i\neq j\}. \tag{11.136}
\]
For an indicator function $f=\mathbf 1_A$, where $A\subset\mathbb R_+^q$ has the form $A=[u_1,v_1]\times\ldots\times[u_q,v_q]$ with $A\cap D^q=\emptyset$, the $q$th multiple Wigner integral of $f$ is defined by
\[
I_q^S(f)=(S(v_1)-S(u_1))\ldots(S(v_q)-S(u_q)).
\]
We then extend this definition by linearity to simple functions of the form $f=\sum_{i=1}^{k}\alpha_i\mathbf 1_{A_i}$, where the $A_i=[u_1^i,v_1^i]\times\ldots\times[u_q^i,v_q^i]$ are disjoint $q$-dimensional rectangles as above which do not meet the diagonals. Simple computations show that

\begin{align}
&\varphi(I_q^S(f))=0 \tag{11.137}\\
&\varphi(I_q^S(f)I_q^S(g))=\langle f,g^*\rangle_{L^2(\mathbb R_+^q)}. \tag{11.138}
\end{align}
By isometry, the definition of $I_q^S(f)$ is extended to all $f\in L^2(\mathbb R_+^q)$, and (11.137)-(11.138) continue to hold in this more general setting. If one wants $I_q^S(f)$ to be a random variable, it is necessary for $f$ to be mirror symmetric, that is, $f=f^*$ (see [17]). Observe that $I_1^S(f)=\int_0^{\infty}f(x)\,dS(x)$ when $q=1$. We have moreover
\[
\varphi(I_p^S(f)I_q^S(g))=0\quad\text{when }p\neq q,\ f\in L^2(\mathbb R_+^p)\ \text{and}\ g\in L^2(\mathbb R_+^q). \tag{11.139}
\]

When $r\in\{1,\ldots,p\wedge q\}$, $f\in L^2(\mathbb R_+^p)$ and $g\in L^2(\mathbb R_+^q)$, let us write $f\overset{r}{\frown}g$ to indicate the $r$th contraction of $f$ and $g$, defined as being the element of $L^2(\mathbb R_+^{p+q-2r})$ given by
\begin{align}
&f\overset{r}{\frown}g(t_1,\ldots,t_{p+q-2r}) \tag{11.140}\\
&\quad=\int_{\mathbb R_+^r}f(t_1,\ldots,t_{p-r},x_1,\ldots,x_r)g(x_r,\ldots,x_1,t_{p-r+1},\ldots,t_{p+q-2r})\,dx_1\ldots dx_r.\notag
\end{align}
By convention, set $f\overset{0}{\frown}g=f\otimes g$, the tensor product of $f$ and $g$. Since $f$ and $g$ are not necessarily symmetric functions, the position of the identified variables $x_1,\ldots,x_r$ in (11.140) is important, in contrast to what happens in classical probability. Observe moreover that

\[
\big\|f\overset{r}{\frown}g\big\|_{L^2(\mathbb R_+^{p+q-2r})}\leq\|f\|_{L^2(\mathbb R_+^p)}\|g\|_{L^2(\mathbb R_+^q)} \tag{11.141}
\]
by Cauchy-Schwarz, and also that $f\overset{q}{\frown}g=\langle f,g^*\rangle_{L^2(\mathbb R_+^q)}$ when $p=q$.

We have the following product formula (see [2, Proposition 5.3.3]), valid for any $f\in L^2(\mathbb R_+^p)$ and $g\in L^2(\mathbb R_+^q)$:
\[
I_p^S(f)I_q^S(g)=\sum_{r=0}^{p\wedge q}I_{p+q-2r}^S\big(f\overset{r}{\frown}g\big). \tag{11.142}
\]
We deduce (by a straightforward induction) that, for any $e\in L^2(\mathbb R_+)$ with $\|e\|_{L^2(\mathbb R_+)}=1$ and any $q\geq 1$,

\[
U_q\left(\int_0^{\infty}e(x)\,dS(x)\right)=I_q^S(e^{\otimes q}), \tag{11.143}
\]
where $U_0=1$, $U_1=X$, $U_2=X^2-1$, $U_3=X^3-2X$, \ldots, is the sequence of Tchebycheff polynomials of the second kind (determined by the recursion $XU_k=U_{k+1}+U_{k-1}$), $\int_0^{\infty}e(x)\,dS(x)$ is understood as a Wigner integral, and $e^{\otimes q}$ is the $q$th tensor product of $e$. This is the exact analogue of (1.10) in our context.
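The orthogonality structure behind (11.143) can be checked directly: since $\varphi\big(I_p^S(e^{\otimes p})I_q^S(e^{\otimes q})\big)=\delta_{p,q}$ by (11.138)-(11.139) when $\|e\|_{L^2(\mathbb R_+)}=1$, the polynomials $U_k$ must be orthonormal with respect to $\mathcal S(0,1)$. A quadrature sketch (Python; the grid resolution is an arbitrary choice) builds the $U_k$ from the recursion and computes their Gram matrix under the semicircle density:

\begin{verbatim}
import numpy as np

x = np.linspace(-2, 2, 200_001)
dx = x[1] - x[0]
dens = np.sqrt(np.clip(4 - x**2, 0, None)) / (2 * np.pi)   # S(0,1), cf. (11.133)

U = [np.ones_like(x), x.copy()]          # U_0 = 1, U_1 = X
for k in range(1, 5):
    U.append(x * U[k] - U[k - 1])        # X U_k = U_{k+1} + U_{k-1}

gram = np.array([[np.sum(U[i] * U[j] * dens) * dx for j in range(6)]
                 for i in range(6)])
print(np.round(gram, 3))                 # ~ identity: U_k orthonormal for S(0,1)
\end{verbatim}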

We are now in a position to offer a free analogue of the Fourth Moment Theorem 5.5, which reads as follows.

Theorem 11.1 (Kemp, Nourdin, Peccati, Speicher, 2011; see [17]) Fix an integer $q\geq 2$ and let $\{S_t\}_{t\geq 0}$ be a free Brownian motion. Whenever $f\in L^2(\mathbb R_+^q)$, set $I_q^S(f)$ to denote the $q$th multiple Wigner integral of $f$ with respect to $S$. Let $\{F_n\}_{n\geq 1}$ be a sequence of Wigner multiple integrals of the form

\[
F_n=I_q^S(f_n),
\]
where each $f_n\in L^2(\mathbb R_+^q)$ is mirror-symmetric, that is, is such that $f_n=f_n^*$. Suppose moreover that $\varphi(F_n^2)\to 1$ as $n\to\infty$. Then, as $n\to\infty$, the following two assertions are equivalent:

(i) $F_n\overset{\mathrm{law}}{\longrightarrow}S_1\sim\mathcal S(0,1)$;

(ii) $\varphi(F_n^4)\to 2=\varphi(S_1^4)$.

Proof (following [24]). Without loss of generality and for the sake of simplicity, we suppose that $\varphi(F_n^2)=1$ for all $n$ (instead of $\varphi(F_n^2)\to 1$ as $n\to\infty$). The proof of the implication (i) $\Rightarrow$ (ii) being trivial by the very definition of convergence in law in a free tracial probability space, we only concentrate on the proof of (ii) $\Rightarrow$ (i).

Fix $k\geq 3$. By iterating the product formula (11.142), we can write

\[
F_n^k=I_q^S(f_n)^k=\sum_{(r_1,\ldots,r_{k-1})\in A_{k,q}}I^S_{kq-2r_1-\ldots-2r_{k-1}}\Big(\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n\Big),
\]
where

\[
A_{k,q}=\Big\{(r_1,\ldots,r_{k-1})\in\{0,1,\ldots,q\}^{k-1}:\ r_2\leq 2q-2r_1,\ r_3\leq 3q-2r_1-2r_2,\ \ldots,\ r_{k-1}\leq(k-1)q-2r_1-\ldots-2r_{k-2}\Big\}.
\]
By taking the $\varphi$-trace in the previous expression and taking into account that (11.137) holds, we deduce that

\[
\varphi(F_n^k)=\varphi\big(I_q^S(f_n)^k\big)=\sum_{(r_1,\ldots,r_{k-1})\in B_{k,q}}\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n, \tag{11.144}
\]
with

\[
B_{k,q}=\big\{(r_1,\ldots,r_{k-1})\in A_{k,q}:\ 2r_1+\ldots+2r_{k-1}=kq\big\}.
\]
(Note that, for $(r_1,\ldots,r_{k-1})\in B_{k,q}$, all variables are contracted out, so each summand in (11.144) is a real number.) Let us decompose $B_{k,q}$ into $C_{k,q}\cup E_{k,q}$, with $C_{k,q}=B_{k,q}\cap\{0,q\}^{k-1}$ and $E_{k,q}=B_{k,q}\setminus C_{k,q}$. We then have

\begin{align*}
\varphi(F_n^k)&=\sum_{(r_1,\ldots,r_{k-1})\in C_{k,q}}\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n\\
&\quad+\sum_{(r_1,\ldots,r_{k-1})\in E_{k,q}}\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n.
\end{align*}

Using the two relationships $f_n\overset{0}{\frown}f_n=f_n\otimes f_n$ and
\[
f_n\overset{q}{\frown}f_n=\int_{\mathbb R_+^q}f_n(t_1,\ldots,t_q)f_n(t_q,\ldots,t_1)\,dt_1\ldots dt_q=\|f_n\|^2_{L^2(\mathbb R_+^q)}=1,
\]

it is evident that $\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n=1$ for all $(r_1,\ldots,r_{k-1})\in C_{k,q}$. We deduce that

\[
\varphi(F_n^k)=\#C_{k,q}+\sum_{(r_1,\ldots,r_{k-1})\in E_{k,q}}\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n.
\]
On the other hand, by applying (11.144) with $q=1$, we get that

\begin{align*}
\varphi(S_1^k)=\varphi\big(I_1^S(\mathbf 1_{[0,1]})^k\big)&=\sum_{(r_1,\ldots,r_{k-1})\in B_{k,1}}\big(\ldots\big((\mathbf 1_{[0,1]}\overset{r_1}{\frown}\mathbf 1_{[0,1]})\overset{r_2}{\frown}\mathbf 1_{[0,1]}\big)\ldots\big)\overset{r_{k-1}}{\frown}\mathbf 1_{[0,1]}\\
&=\sum_{(r_1,\ldots,r_{k-1})\in B_{k,1}}1=\#B_{k,1}.
\end{align*}
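The combinatorial identity $\varphi(S_1^k)=\#B_{k,1}$ is easy to confirm by brute-force enumeration: $\#B_{k,1}$ vanishes for odd $k$ and equals the Catalan number $C_{k/2}$ for even $k$, matching the moments of $\mathcal S(0,1)$ computed earlier. A sketch (Python):

\begin{verbatim}
from itertools import product
from math import comb

def count_B(k, q=1):
    """#B_{k,q}: tuples in A_{k,q} with 2(r_1 + ... + r_{k-1}) = kq."""
    count = 0
    for r in product(range(q + 1), repeat=k - 1):
        deg, ok = q, True                 # degree of the iterated contraction
        for rj in r:
            if rj > min(deg, q):          # constraint defining A_{k,q}
                ok = False
                break
            deg += q - 2 * rj
        count += ok and deg == 0          # all variables contracted out
    return count

for k in range(2, 11):
    catalan = comb(k, k // 2) // (k // 2 + 1) if k % 2 == 0 else 0
    print(k, count_B(k), catalan)         # #B_{k,1} = phi(S_1^k)
\end{verbatim}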

But it is clear that $C_{k,q}$ is in bijection with $B_{k,1}$ (by dividing all the $r_i$'s in $C_{k,q}$ by $q$). Consequently,

\[
\varphi(F_n^k)=\varphi(S_1^k)+\sum_{(r_1,\ldots,r_{k-1})\in E_{k,q}}\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n. \tag{11.145}
\]
Now, assume that $\varphi(F_n^4)\to\varphi(S_1^4)=2$, and let us show that $\varphi(F_n^k)\to\varphi(S_1^k)$ for all $k\geq 3$. Using that $f_n=f_n^*$, observe that

\begin{align*}
f_n\overset{r}{\frown}f_n(t_1,\ldots,t_{2q-2r})&=\int_{\mathbb R_+^r}f_n(t_1,\ldots,t_{q-r},s_1,\ldots,s_r)f_n(s_r,\ldots,s_1,t_{q-r+1},\ldots,t_{2q-2r})\,ds_1\ldots ds_r\\
&=\int_{\mathbb R_+^r}f_n(s_r,\ldots,s_1,t_{q-r},\ldots,t_1)f_n(t_{2q-2r},\ldots,t_{q-r+1},s_1,\ldots,s_r)\,ds_1\ldots ds_r\\
&=f_n\overset{r}{\frown}f_n(t_{2q-2r},\ldots,t_1)=\big(f_n\overset{r}{\frown}f_n\big)^*(t_1,\ldots,t_{2q-2r}),
\end{align*}
that is, $f_n\overset{r}{\frown}f_n=\big(f_n\overset{r}{\frown}f_n\big)^*$. On the other hand, the product formula (11.142) leads to $F_n^2=\sum_{r=0}^{q}I_{2q-2r}^S\big(f_n\overset{r}{\frown}f_n\big)$. Since two multiple integrals of different orders are orthogonal (see (11.139)), we deduce that
\begin{align}
\varphi(F_n^4)&=\|f_n\otimes f_n\|^2_{L^2(\mathbb R_+^{2q})}+\|f_n\|^4_{L^2(\mathbb R_+^q)}+\sum_{r=1}^{q-1}\big\langle f_n\overset{r}{\frown}f_n,\big(f_n\overset{r}{\frown}f_n\big)^*\big\rangle_{L^2(\mathbb R_+^{2q-2r})}\notag\\
&=2\|f_n\|^4_{L^2(\mathbb R_+^q)}+\sum_{r=1}^{q-1}\big\|f_n\overset{r}{\frown}f_n\big\|^2_{L^2(\mathbb R_+^{2q-2r})}=2+\sum_{r=1}^{q-1}\big\|f_n\overset{r}{\frown}f_n\big\|^2_{L^2(\mathbb R_+^{2q-2r})}. \tag{11.146}
\end{align}
Using that $\varphi(F_n^4)\to 2$, we deduce that
\[
\big\|f_n\overset{r}{\frown}f_n\big\|^2_{L^2(\mathbb R_+^{2q-2r})}\to 0\quad\text{for all }r=1,\ldots,q-1. \tag{11.147}
\]

Fix $(r_1,\ldots,r_{k-1})\in E_{k,q}$ and let $j\in\{1,\ldots,k-1\}$ be the smallest integer such that $r_j\in\{1,\ldots,q-1\}$. Then:
\begin{align*}
&\Big|\big(\ldots\big((f_n\overset{r_1}{\frown}f_n)\overset{r_2}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n\Big|\\
&\quad=\Big|\big(\ldots\big((f_n\otimes\ldots\otimes f_n)\overset{r_j}{\frown}f_n\big)\overset{r_{j+1}}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n\Big|\quad\text{(since $r_1,\ldots,r_{j-1}\in\{0,q\}$ and $f_n\overset{q}{\frown}f_n=1$)}\\
&\quad=\Big|\big(\ldots\big(\big(f_n\otimes\ldots\otimes f_n\otimes(f_n\overset{r_j}{\frown}f_n)\big)\overset{r_{j+1}}{\frown}f_n\big)\ldots\big)\overset{r_{k-1}}{\frown}f_n\Big|\\
&\quad\leq\big\|f_n\otimes\ldots\otimes f_n\otimes(f_n\overset{r_j}{\frown}f_n)\big\|\,\|f_n\|^{k-j-1}\quad\text{(by Cauchy-Schwarz and (11.141))}\\
&\quad=\big\|f_n\overset{r_j}{\frown}f_n\big\|\quad\text{(since $\|f_n\|=1$)}\\
&\quad\to 0\ \text{as }n\to\infty,\ \text{by (11.147)}.
\end{align*}
Therefore, we deduce from (11.145) that $\varphi(F_n^k)\to\varphi(S_1^k)$ for every $k\geq 3$, which is the desired conclusion and completes the proof of the theorem.

During the proof of Theorem 11.1, we actually showed (see indeed (11.146)) that the two assertions (i)-(ii) are both equivalent to a third one, namely:

(iii) $\big\|f_n\overset{r}{\frown}f_n\big\|^2_{L^2(\mathbb R_+^{2q-2r})}\to 0$ for all $r=1,\ldots,q-1$.

Combining (iii) with Corollary 5.5, we immediately deduce an interesting transfer principle for translating results between the classical and free chaoses.

Corollary 11.2 Fix an integer $q\geq 2$, let $\{B_t\}_{t\geq 0}$ be a standard Brownian motion and let $\{S_t\}_{t\geq 0}$ be a free Brownian motion. Whenever $f\in L^2(\mathbb R_+^q)$, we write $I_q^B(f)$ (resp. $I_q^S(f)$) to indicate the $q$th multiple Wiener integral of $f$ with respect to $B$ (resp. $S$). Let $\{f_n\}_{n\geq 1}\subset L^2(\mathbb R_+^q)$ be a sequence of symmetric functions and let $\sigma>0$ be a finite constant. Then, as $n\to\infty$, the following two assertions hold true.

(i) $E[I_q^B(f_n)^2]\to q!\,\sigma^2$ if and only if $\varphi(I_q^S(f_n)^2)\to\sigma^2$.

(ii) If the asymptotic relations in (i) are verified, then $I_q^B(f_n)\overset{\mathrm{law}}{\longrightarrow}\mathcal N(0,q!\,\sigma^2)$ if and only if $I_q^S(f_n)\overset{\mathrm{law}}{\longrightarrow}\mathcal S(0,\sigma^2)$.

To go further. A multivariate version of Theorem 11.1 (free counterpart of Theorem 6.2) can be found in [36]. In [31] (resp. [14]), one exhibits a version of Theorem 11.1 in which the semicircular law in the limit is replaced by the free Poisson law (resp. the so-called tetilla law). An extension of Theorem 11.1 in the context of the q-Brownian motion (which is an interpolation between the standard Brownian motion corresponding to q = 1 and the free Brownian motion corresponding to q = 0) is given in [12].

References

[1] B. Bercu, I. Nourdin and M.S. Taqqu (2010). Almost sure central limit theorems on the Wiener space. Stoch. Proc. Appl. 120, no. 9, 1607-1628.

[2] P. Biane and R. Speicher (1998). Stochastic analysis with respect to free Brownian motion and analysis on Wigner space. Probab. Theory Rel. Fields 112, 373-409.

[3] H. Biermé, A. Bonami and J. Léon (2011). Central Limit Theorems and Quadratic Variations in terms of Spectral Density. Electron. J. Probab. 16, 362-395.

[4] H. Biermé, A. Bonami, I. Nourdin and G. Peccati (2011). Optimal Berry-Esseen rates on the Wiener space: the barrier of third and fourth cumulants. Preprint.

[5] E. Bolthausen (1984). An estimate of the remainder in a combinatorial central limit theorem. Z. Wahrscheinlichkeitstheorie verw. Gebiete 66, 379-386.

[6] J.-C. Breton and I. Nourdin (2008). Error bounds on the non-normal approximation of Hermite power variations of fractional Brownian motion. Electron. Comm. Probab. 13, 482-493 (electronic).

[7] P. Breuer and P. Major (1983). Central limit theorems for non-linear functionals of Gaussian fields. J. Mult. Anal. 13, 425-441.

[8] L.H.Y. Chen (1975). Poisson approximation for dependent trials. Ann. Probab. 3, no. 3, 534-545.

[9] L.H.Y. Chen, L. Goldstein and Q.-M. Shao (2010). Normal Approximation by Stein’s Method. Probability and Its Applications, Springer.

[10] F. Daly (2008). Upper bounds for Stein-type operators. Electronic Journal of Probability 13(20), 566-587 (Electronic).

[11] L. Decreusefond and D. Nualart (2008). Hitting times for Gaussian processes. Ann. Probab. 36(1), 319-330.

[12] A. Deya, S. Noreddine and I. Nourdin (2012). Fourth Moment Theorem and q-Brownian Chaos. Comm. Math. Phys., to appear.

[13] A. Deya and I. Nourdin (2011). Invariance principles for homogeneous sums of free random variables. Preprint.

[14] A. Deya and I. Nourdin (2012). Convergence of Wigner integrals to the tetilla law. ALEA 9, 101-127.

[15] C.G. Esseen (1956). A moment inequality with an application to the central limit theorem. Skand. Aktuarietidskr. 39, 160-170.

[16] S.-T. Ho and L.H.Y. Chen (1978). An Lp bound for the remainder in a combinatorial Central Limit Theorem. Ann. Probab. 6(2), 231-249.

[17] T. Kemp, I. Nourdin, G. Peccati and R. Speicher (2011). Wigner chaos and the fourth moment. Ann. Probab., in press.

[18] V. Yu. Korolev and I.G. Shevtsova (2010). On the upper bound for the absolute constant in the Berry-Esseen inequality. Theory of Probability and its Applications 54, no. 4, 638-658.

[19] R. Lachièze-Rey and G. Peccati (2011). Fine Gaussian fluctuations on the Poisson space, I: contractions, cumulants and random geometric graphs. Preprint.

[20] E. Mossel, R. O’Donnell and K. Oleszkiewicz (2010). Noise stability of functions with low influences: invariance and optimality. Ann. Math. 171, no. 1, 295-341.

[21] E. Nelson (1973). The free Markoff field. J. Funct. Analysis, 12, 211-227.

[22] A. Nica and R. Speicher (2006). Lectures on the Combinatorics of Free Probability. Cambridge University Press.

[23] S. Noreddine and I. Nourdin (2011). On the Gaussian approximation of vector-valued multiple integrals. J. Multiv. Anal. 102, no. 6, 1008-1017.

[24] I. Nourdin (2011). Yet another proof of the Nualart-Peccati criterion. Electron. Comm. Probab. 16, 467-481.

[25] I. Nourdin (2012). Selected aspects of fractional Brownian motion. Springer Verlag.

[26] I. Nourdin and G. Peccati (2007). Non-central convergence of multiple integrals. Ann. Probab., 37(4), 1412-1426.

[27] I. Nourdin and G. Peccati (2009). Stein’s method on Wiener chaos. Probab. Theory Rel. Fields 145(1), 75-118.

[28] I. Nourdin and G. Peccati (2010). Stein’s method meets Malliavin calculus: a short survey with new estimates. In the volume: Recent Advances in Stochastic Dynamics and Stochastic Analysis, World Scientific.

[29] I. Nourdin and G. Peccati (2010). Cumulants on the Wiener space. J. Funct. Anal. 258, 3775- 3791.

[30] I. Nourdin and G. Peccati (2010). Stein’s method and exact Berry-Esseen asymptotics for functionals of Gaussian fields. Ann. Probab. 37(6), 2231-2261.

[31] I. Nourdin and G. Peccati (2011). Poisson approximations on the free Wigner chaos. Preprint.

[32] I. Nourdin and G. Peccati (2012). Normal Approximations Using Malliavin Calculus: from Stein’s Method to Universality. Cambridge Tracts in Mathematics. Cambridge University Press.

[33] I. Nourdin, G. Peccati and M. Podolskij (2011). Quantitative Breuer-Major Theorems. Stoch. Proc. App. 121, no. 4, 793-812.

[34] I. Nourdin, G. Peccati and G. Reinert (2010). Invariance principles for homogeneous sums: universality of Gaussian Wiener chaos. Ann. Probab. 38(5), 1947-1985.

[35] I. Nourdin, G. Peccati and A. Réveillac (2010). Multivariate normal approximation using Stein’s method and Malliavin calculus. Ann. Inst. H. Poincaré (B) Probab. Statist. 46(1), 45-58.

[36] I. Nourdin, G. Peccati and R. Speicher (2011). Multidimensional semicircular limits on the free Wigner chaos. To appear in Ascona 2011 Proceedings.

[37] I. Nourdin and F.G. Viens (2009). Density estimates and concentration inequalities with Malliavin calculus. Electron. J. Probab. 14, 2287-2309 (electronic).

[38] D. Nualart (2006). The Malliavin calculus and related topics. Springer-Verlag, Berlin, second edition.

[39] D. Nualart and S. Ortiz-Latorre (2008). Central limit theorems for multiple stochastic integrals and Malliavin calculus. Stoch. Proc. Appl. 118 (4), 614-628.

[40] D. Nualart and G. Peccati (2005). Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab. 33 (1), 177-193.

[41] D. Nualart and L. Quer-Sardanyons (2011). Optimal Gaussian density estimates for a class of stochastic equations with additive noise. Infinite Dimensional Analysis, Quantum Probability and Related Topics 14, no.1, 25-34.

[42] D. Nualart and J. Vives (1990). Anticipative calculus for the Poisson process based on the Fock space. Sem. de Proba. XXIV, LNM 1426, pp. 154-165. Springer-Verlag.

[43] G. Peccati (2012). The Chen-Stein method for Poisson functionals. Preprint.

[44] G. Peccati, J.-L. Solé, M.S. Taqqu and F. Utzet (2010). Stein’s method and normal approximation of Poisson functionals. Ann. Probab. 38(2), 443-478.

[45] G. Peccati and M.S. Taqqu (2010). Wiener chaos: moments, cumulants and diagrams. Springer- Verlag.

[46] G. Peccati and C.A. Tudor (2005). Gaussian limits for vector-valued multiple stochastic integrals. Séminaire de Probabilités XXXVIII, LNM 1857. Springer-Verlag, pp. 247-262.

[47] G. Peccati and C. Zheng (2010). Multidimensional Gaussian fluctuations on the Poisson space. Electron. J. Probab. 15, paper 48, 1487-1527 (electronic).

[48] G. Peccati and C. Zheng (2011). Universal Gaussian fluctuations on the discrete Poisson chaos. Preprint.

[49] M. Penrose (2003). Random Geometric Graphs. Oxford University Press.

[50] W. Rudin (1987). Real and Complex Analysis. Third edition. McGraw-Hill Book Co., New York.

[51] I. Shigekawa (1980). Derivatives of Wiener functionals and absolute continuity of induced measures. J. Math. Kyoto Univ. 20(2), 263-289.

[52] Ch. Stein (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II, 583-602. Univ. California Press, Berkeley, Calif.

[53] D. W. Stroock (1987). Homogeneous chaos revisited. In: Séminaire de Probabilités XXI, Lecture Notes in Math. 1247, 1-8.

[54] M. Talagrand (2003). Spin Glasses, a Challenge for Mathematicians. Springer Verlag.

[55] D.V. Voiculescu (1985). Symmetries of some reduced free product C∗-algebras. Operator algebras and their connection with topology and ergodic theory, Springer Lecture Notes in Mathematics 1132, 556–588.

[56] D.V. Voiculescu (1991). Limit laws for random matrices and free product. Invent. Math. 104, 201-220.

[57] E.P. Wigner (1958). On the distribution of the roots of certain symmetric matrices. Ann. of Math. (2) 67, 325-327.
