<<

arXiv:1211.1837v1 [math.PR] 8 Nov 2012 uevle rcse,FymnKcsmgop,McKean–Vl semigroups, Feynman–Kac processes, valued sure e falpoaiiymaue vrteset the over measures all of set 82C22. 1017–1052 3, No. DOI: 21, Probability Vol. Applied 2011, of Annals The besae qipdwt some with equipped spaces able y( by

olcino rnfrain Φ transformations of collection qaino h olwn form: following the of equation (1.1) c aiainadtpgahcdetail. typographic 1017–1052 and 3, the pagination by No. published 21, Vol. article 2011, original the Mathematical of of reprint Institute electronic an is This nttt fMteaia Statistics Mathematical of Institute e od n phrases. and words Key 1.1. Introduction. 1. M 00sbetclassifications. subject 2000 AMS 2010. April Received 10.1214/10-AAP716 η n ) NI odaxSdOet n NI Bordeaux-Sud-Ouest INRIA and Bordeaux-Sud-Ouest, INRIA OCNRTO NQAIISFRMA FOR INEQUALITIES CONCENTRATION enfil atcemodels. particle field Mean n ≥ cecsadi oeua chemistry. eng molecular stochastic in in and arising sciences flows distribution and Feynman–Kac gases of of models collision-type models. McKean of models, diffusion class this for results interactin Benn new to very and sequences yielding Bernstein random systems, independent Hoeffding, for classical equalities the to generalize inequaliti concentration respect here The va derived. with also are properties, parameter, limiting concentration the uniform of cesses, properties Und stability interest. independent additional of be may conditionall which sequences, of random arrays wi triangular analysis for perturbation analysis stochastic valuedcentration original measure an nonlinear generatio combine discrete We of of interpretations class particle general field a of properties tration 0 eilsrt hs eut ntecneto McKean–Vlasov of context the in results these illustrate We euneo rbblt esrson measures probability of sequence a hsatcei ocre ihteflcutosadteconce the and fluctuations the with concerned is article This yPer e oa n maulRio Emmanuel and Moral Del Pierre By ocnrto nqaiis enfil atcemdl,me models, particle field mean inequalities, Concentration n nvri´ eVersailles Universit´e de and 2011 , ATCEMODELS PARTICLE η hsrpitdffr rmteoiia in original the from differs reprint This . n n rmr 01,6K5 eodr 09,60F10, 60F99, secondary 60K35; 60E15, Primary +1 +1 σ in Φ = : fils( -fields P e ( Let h naso ple Probability Applied of Annals The ( 1 E n +1 n ) ( E E P → η n n n E ) ) ) n n . n ≥ ≥ ( with , E 0 0 n elet we and , n easqec fmeasur- of sequence a be +1 E n svmodels. asov ), aifiganonlinear a satisfying n independent y n ≥ spresented es n mean and n ≥ processes. .W osdra consider We 0. particle g udpro- lued hacon- a th ,adw denote we and 0, faclass a of time the ineering rsome er t in- ett P -type ( E n- n ethe be ) , a- 2 P. DEL MORAL AND E. RIO

These discrete time versions of conservative and nonlinear integro-differential type equations in distribution spaces arise in a variety of scientific disciplines including in physics, , information theory and sciences. To motivate the article, before describing their mean field particle inter- pretations, we illustrate these rather abstract evolution models working out explicitly some of these equations in a series of concrete examples. The first one is related to nonlinear filtering problems arising in . Suppose we are given a pair signal-observation (Xn,Yn)n 0 on ≥ some product space (Rd1 Rd2 ), with some initial distribution and Markov transition of the following× form: P((X ,Y ) d(x,y)) = η (dx)g (x,y)λ (dy), 0 0 ∈ 0 0 0 P((X ,Y ) d(x,y) (X ,Y )) = M (X , dx)g (x,y)λ (dy). n+1 n+1 ∈ | n n n+1 n n+1 n+1 In the above display, λn stands for some reference probability measures d2 on R , gn is a sequence of positive functions, Mn+1 are Markov transitions d1 from R into itself and finally η0 stands for some initial probability mea- sure on Rd1 . For a given sequence of observations Y = y delivered by some sensor, the filtering problem consists of computing sequentially the flow of conditional distributions defined by η = Law(X Y = y ,...,Y = y ) n n| 0 0 n n and b η = Law(X Y = y ,...,Y = y ). n+1 n+1| 0 0 n n These distributions satisfy a nonlinear evolution equation of the form (1.1) with the transformations

Φn+1(ηn)(dx′)= ηn(dx)Mn+1(x,dx′) and (1.2) Z b Gn(x) ηn(dx)= ηn(dx) ηn(dx′)Gn(x′) for some collection ofb likelihoodR functions Gn = gn( ,yn). Replacing these · d1 functions by some ]0, 1]-valued potential function Gn on R , we obtain the conditional distributions of a particle absorption model Xn′ with free evolution transitions Mn and killing rate (1 Gn). More precisely, if T stands for the killing time of the process, we have− that

(1.3) η = Law(X′ T >n) and η = Law(X′ T >n). n n| n+1 n+1| These nonabsorption conditional distributions arise in the analysis of con- finement processes,b as well as in with the numerical solving of Schr¨odinger ground state energies (see, e.g., [11, 13, 28]). Another important class of measures arising in particle physics and stochas- tic optimization problems is the class of Boltzmann distributions, also known CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 3 as the , defined by

1 βnV (x) (1.4) ηn(dx)= e− λ(dx), Zn with some reference λ, some inverse temperature pa- rameter and some nonnegative potential energy function V on some state space E. The normalizing constant Zn is sometimes called “partition func- tion” or the “free energy.” In sequential Monte Carlo methodology, as well as in operation research literature, these multiplicative formulae are often used to compute rare events , as well as the cardinality or the volume of some complex state spaces. Further details on these stochastic techniques can be found in the series of articles [5, 29, 30]. To fix the ideas we can consider the uniform measure on some finite set E. Surprisingly, this flow of measures also satisfies the above nonlinear evolution equation, as soon as ηn is an invariant measure of Mn, for each n 1, and the potential functions G are chosen of the following form: G = exp≥ ((β β )V ). The partition n n n+1 − n functions can also be computed in terms of the flow of measures (ηp)0 p

Zn = ηp(dx)Gp(x), 0 p

(1.5) Φn+1(ηn)= ηnKn+1,ηn for some collection of Markov kernels Kn+1,µ indexed by the time parameter n 0, and the set of measures µ on the space E . We already mention that ≥ n the choice of the Markov transitions Kn,η is not unique. Several examples are presented in Section 2 in the context of Feynman–Kac semigroups or McKean–Vlasov diffusion-type models. To fix the ideas, we can choose el- ementary transitions Kn,η(x,dx′)=Φn(η)(dx′) that do not depend on the state variable x. 4 P. DEL MORAL AND E. RIO

In the literature on mean field particle models, the transitions Kn,η are called a choice of McKean transitions. These models provide a natural inter- pretation of the flow of measures ηn as the laws of the time inhomogeneous Markov chain Xn with elementary transitions P (Xn dx Xn 1)= Kn,ηn−1 (Xn 1, dx) with ηn 1 = Law(Xn 1) ∈ | − − − − and starting with some initial with distribution η0 = Law(X0). The Markov chain Xn can be thought of as a perfect algorithm. For a thorough description of these discrete generation and nonlinear McKean- type models, we refer the reader to [9]. In the further development of the article, we always assume that the mappings i N i (xn)1 i N En K N (xn, An+1) n+1,1/N Pj=1 δ j ≤ ≤ ∈ 7→ xn are N -measurable, for any n 0, N 1, and 1 i N, and any mea- En⊗ ≥ ≥ ≤ ≤ surable subset An+1 En+1. In this situation, the mean field particle inter- ⊂ N pretation of this nonlinear measure valued model is an En -valued Markov (N) (N,i) chain ξn = (ξn )1 i N , with elementary transitions defined as ≤ ≤ N (N) (N) (N,i) i P(ξ dx )= K N (ξ , dx ) n+1 ∈ |Fn n+1,ηn n i=1 (1.6) Y N N 1 with ηn := δ (N,j) . N ξn Xj=1 N In the above displayed formula, n stands for the σ-field generated by (N) F 1 N the random sequence (ξp )0 p n, and dx = dx dx stands for an ≤ ≤ 1 ×···×N N infinitesimal neighborhood of a point x = (x ,...,x ) En . The initial sys- (N) ∈ tem ξ0 consists of N independent and identically distributed random vari- ables with common law η0. As usual, to simplify the presentation, when there is no possible confusion we suppress the parameter N, so that we write ξn i (N) (N,i) and ξn instead of ξn and ξn . The state components of this Markov chain are called particles or sometimes walkers in physics to distinguish the stochastic sampling model with the physical particle in molecular models. N The rationale behind this is that ηn+1 is the associated i with N independent variables with distributions K N (ξ , dx), so as soon n+1,ηn n N N as ηn is a good approximation of ηn then, in view of (1.6), ηn+1 should be a good approximation of ηn+1. Roughly speaking, this induction argument N shows that ηn tends to ηn, as the population size N tends to infinity. These stochastic particle algorithms can be thought of in various ways: from the physical view point, they can be seen as microscopic particle inter- pretations of physical nonlinear measure valued equations. From the pure CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 5 mathematical point of view, they can also be interpreted as natural stochas- tic linearizations of nonlinear evolution semigroups. From the probabilistic point of view, they can be interpreted as interacting recycling acceptance- techniques. In this case, they can be seen as a sequential and interacting technique. For instance, in the context of the nonlinear filtering equation (1.2), the mean field particle model associated with the flow of optimal one-step pre- dictors ηn, with the McKean transitions Kn,η(x,dx′)=Φn(η)(dx′), is the (Rd1 )N -valued Markov chain defined by sampling N conditionally indepen- i dent random variables ξn+1 = (ξn+1)1 i N , with common distribution given by ≤ ≤

N N i 1 Gn(ξn) i (1.7) Φn+1 δ j (dx)= Mn+1(ξn, dx). N ξn N j ! j=1 Gn(ξn) Xj=1 Xi=1 By construction, the resulting particle modelP is a simple genetic-type stochas- tic algorithm: the mutation and the selection transitions are dictated by the prediction and the updating transitions defined in (1.2). During the selec- tion transition, one updates the positions of the particles in accordance with the fitness likelihood functions Gn. This mechanism is called the selection- updating transition as the more likely particles with high Gn-potential value are selected for reproduction. In other words, this transition allows particles to give birth to some particles at the expense of light particles which die. The second mechanism is called the mutation-prediction transition since at this step each particle evolves randomly according to the transition ker- nels Mn. Another important feature of genetic-type particle models is that their ancestral or their complete genealogical tree structure can be used to approximate the smoothing problem, including the computation of the dis- tribution of the signal trajectories given the observations. Further details on this subject can be found in [9, 13]. The same genetic-type particle algorithm applies for the particle absorp- tion model (1.3) and the Boltzmann–Gibbs model (1.4), by replacing, re- spectively, the likelihood functions by the nonabsorption rates Gn and the fitness functions Gn = exp ((βn+1 βn)V ). In the reverse angle, the occupation− measures of a given genetic-type particle mean field model converge, as the size of the population tends to infinity, to the solution of an evolution equation of the form (1.1), with the one-step transformations (1.2). These limiting models are often called the infinite population models. For a recent treatment on these genetic models, we refer the reader to [16]. The origins of genetic-type particle methods can be traced back in physics and molecular chemistry in the 1950s with the pioneering works of Harris and Kahn [17] and Rosenbluth and Rosenbluth [27]. During the last two 6 P. DEL MORAL AND E. RIO decades, the mean field particle interpretations of these discrete generation measure valued equations were increasingly identified as a powerful stochas- tic simulation algorithm with emerging subjects in physics, biology and en- gineering sciences. They have led to spectacular results in signal processing processing with the corresponding particle filter technology, in stochastic engineering with interacting-type Metropolis and Gibbs sampler methods, as well as in quantum chemistry with quantum and diffusion Monte Carlo algorithms leading to precise estimates of the top eigenvalues and the ground states of Schroedinger operators. For a thorough discussion on these appli- cation areas, we refer the reader to [9, 10, 15], and the references therein. To motivate the article, we illustrate the fluctuation and the concentration results presented in this work with three additional illustrative examples, including Feynman–Kac models, McKean–Vlasov diffusion-type models, as well as interacting jump type McKean model of gases. We end this Introduction with some more or less traditional notation used in the present article. We denote, respectively, by (E), 0(E) and (E), the set of all finite signed measures on some measurableM spaMce (E, ), theB convex subset of measures with null mass and the Banach space ofE all bounded and measurable functions f equipped with the uniform norm f . k k We also denote by Osc1(E), the convex set of -measurable functions f with oscillations osc(f) 1. We let µ(f)= µ(dxE )f(x), be the Lebesgue integral of a function f ≤ (E), with respect to a measure µ (E). We recall that a bounded integral∈B operator M fromR a measurable∈ space M (E, ) into an auxiliary measurable space (F, ) is an operator f M(f) fromE F 7→ (F ) into (E) such that the functions M(f)(x) := F M(x,dy)f(y) are B-measurableB and bounded, for any f (F ). A Markov kernel is a pos- itiveE and bounded integral operator M∈Bwith M(1) =R 1. Given a pair of bounded integral operators (M1, M2), we let (M1M2) the composition opera- tor defined by (M1M2)(f)= M1(M2(f)). For time homogenous state spaces, m m 1 m 1 we denote by M = M − M = MM − the mth composition of a given bounded integral operator M, with m 1. A bounded integral operator M from a measurable space (E, ) into an≥ auxiliary measurable space (F, ) also generates a dual operatorEµ µM from (E) into (F ) definedF by (µM)(f) := µ(M(f)). We also used7→ the notationM M K([f K(f)]2)(x) := K([f K(f)(x)]2)(x) − − for some bounded integral operator K and some bounded function f. When the bounded integral operator M has a constant mass, that is, when M(1)(x)= M(1)(y) for any (x,y) E2, the operator µ µM maps ∈ 7→ 0(E) into 0(F ). In this situation, we let β(M) be the Dobrushin coef- ficientM of a boundedM integral operator M defined by the formula β(M) := sup osc(M(f)); f Osc (F ) . { ∈ 1 } CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 7

1.2. Description of the main results. The mathematical and numerical analysis of the mean field particle models (1.6) is one of the most attrac- tive research areas in pure and applied probability, as well as in advanced stochastic engineering and computational physics. The fluctuation analysis of these discrete generation particle models around their limiting distributions is often restricted to Feynman–Kac-type models (see, e.g., [7, 9, 12, 14] and references therein) or specific continuous time mean field models including McKean–Vlasov diffusions and Boltzmann-type collision models of gases [24, 31]. In the present article, we design an original and natural stochastic per- turbation analysis that applies to a rather large class of models satisfying a rather weak first-order regularity property. We combine an original stochas- tic perturbation analysis with a concentration analysis for triangular arrays of conditionally independent random sequences, which may be of indepen- dent interest. Under some additional stability properties of the limiting mea- sure valued processes, uniform concentration properties with respect to the time parameter are also derived. The concentration inequalities presented here generalize the classical Hoeffding, Bernstein and Bennett inequalities for independent random sequences to interacting particle systems, yielding very new results for this class of models. To describe with some precision this first main result we observe that the local sampling errors associated with the corresponding mean field particle N model are expressed in terms of the centered random fields Wn , given by the following stochastic perturbation formulae:

N N 1 N (1.8) ηn = ηn 1Kn,ηN + Wn . − n−1 √N To analyze the propagation properties of these local sampling errors, up to a second-order remainder measure, we further assume that the one-step mappings Φn governing equation (1.1) have a first-order decomposition (1.9) Φ (η) Φ (µ) (η µ)D Φ n − n ≃ − µ n with a first-order integral operator DµΦn from (En) into (En 1), s.t. B B − DµΦn(1) = 0. The precise definition of the first-order regularity property (1.9) is provided in Definition 3.1. Our first main result is a functional for the random fields (1.10) V N := √N[ηN η ]. n n − n This fluctuation theorem takes, basically, the following form.

Theorem 1.1. N The sequence (Wn )n 0 converges in law, as N tends to infinity, to the • ≥ sequence of n independent, Gaussian and centered random fields (Wn)n 0 ≥ 8 P. DEL MORAL AND E. RIO

with a function given for any f, g (E ), and any n 1, by ∈B n ≥ E (1.11) (Wn(f)Wn(g)) = ηn 1Kn,ηn−1 ([f Kn,ηn−1 (f)][g Kn,ηn−1 (g)]) − − − and, for n = 0, by E(W (f)W (g)) = η [(f η (f))(g η (g))]. 0 0 0 − 0 − 0 N For any fixed time horizon n 0, the sequence of random fields Vn con- • verges in law, as the number of≥ particles N tends to infinity, to a Gaussian and centered random fields n V = W . n pDp,n p=0 X In the above display, p,n stands for the semigroup associated with the operator = D ΦD. Dn ηn−1 n A complete detailed proof of the functional central limit theorem stated above is provided in Section 5, dedicated to a stochastic perturbation anal- ysis of mean field particle models. We let Φp,n =Φp+1,n Φp+1, 0 p n, be the semigroup associated with the measure valued equation◦ defined≤ ≤ in (1.1). For p = n, we use the conven- tion Φn,n = Id, the identity operator. By construction, we have Φ (η) Φ (µ) (η µ)D Φ and D Φ = . p,n − p,n ≃ − µ p,n ηp p,n Dp,n N The fluctuation theorem stated above shows that the fluctuations of ηn around the limiting measure ηn is precisely dictated by first-order differential- type operators Dηp Φp,n of the semigroup Φp,n around the flow of measures η , with p n. Furthermore, for any f Osc (E ), one observes that p ≤ n ∈ 1 n n E 2 E 2 (Vn(fn) )= ((Wp[Dηp Φp,n(fn)]) ) p=0 (1.12) X n σ2β(DΦ )2 := σ2 ≤ p p,n n Xp=0 with the uniform local parameters 2 2 σn := sup sup µ(Kn,µ[fn Kn,µ(fn)] ) fn Osc1(En) µ (En−1) | − | ∈ ∈P and

β(DΦp,n):= sup β(DηΦp,n). η (Ep) ∈P The second part of this article is concerned with the concentration prop- erties of mean field particle models. These results quantify exponentially small probabilities of deviations events between the occupation measures N ηn and their limiting values. The exponential deviation events discussed in CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 9 this article are described in terms of the parameters n σ2 β2 := β(DΦ )2 and b⋆ := sup β(DΦ ). n ≤ n p,n n p,n p=0 0 p n X ≤ ≤ Besides the fact that the nonasymptotic analysis of weakly dependent vari- ables is rather well developed, the concentration properties of discrete gen- eration and interacting particle systems often resume to asymptotic large deviation results, or to nonasymptotic rough exponential estimates (see, e.g., [9] and references therein). Our main result on this subject is an orig- inal concentration theorem that includes Hoeffding, Bennett and Bernstein exponential inequalities for mean field particle models. This result takes, basically, the following form.

Theorem 1.2. For any N 1, n 0, fn Osc1(En), and any x 0 the probability of each of the following≥ pair≥ of events∈ is greater than 1 e≥ x − − N rn 1 2 ⋆ 1 x V (f ) (1 + ε− (x)) + √Nσ b ε− n n ≤ √ 0 n n 1 Nσ2 N  n  and N rn 1 Vn (fn) (1+ ε− (x)) + √2xβn. ≤ √N 0 In the above display, rn stands for some parameter whose values only depend on the amplitude of the second-order terms in the development (1.9), and the pair of functions (ε0, ε1) are defined by (1.13) ε (λ)= 1 (λ log (1 + λ)), ε (λ)=(1+ λ) log (1 + λ) λ. 0 2 − 1 − Under additional stability properties of the semigroup associated with the ⋆ limiting model (1.5), the parameters (σn, βn, bn, rn) are uniformly bounded w.r.t. the time parameter.

A complete detailed proof of the functional concentration inequalities stated in Theorem 1.2 is provided in Section 5.3. Some of the consequences of the concentration inequalities stated above are provided in Section 4. To give a flavor of these results, using a Bernstein-type concentration inequality we will check that 2 P N rn λ (1.14) lim suplog Vn (fn) + λ ⋆ 2 . N ≥ √N ≤−2(bnσn) →∞   The detailed proof of this asymptotic estimate is provided on page 16. This observation shows that this concentration inequality is “almost” asymptot- ically sharp, with a variance-type term whose values are pretty close to the exact limiting presented in (1.12). A more precise asymptotic estimate would require a refined moderate deviation analysis. We hope to discuss these properties in a forthcoming study. 10 P. DEL MORAL AND E. RIO

The outline of the rest of the article is as follows. To motivate the present article, we have collected in Section 2 three different classes of abstract mean field particle models that can be studied using the fluctuation and the concentration analysis developed in this article. In Section 3, we discuss the main regularity properties used in our anal- ysis. In Section 4, we illustrate the impact of Theorem 1.2 with some more Bennett and Hoeffding-type concentration properties, as well as Bernstein- type concentration inequalities and uniform exponential deviation properties w.r.t. the time parameter. Section 5 is mainly concerned with the detailed proofs of the theorems stated above. We combine a natural stochastic per- turbation analysis with nonlinear semigroup techniques that allow us to describe both the fluctuations and the concentration of the mean field mea- sures in terms of the local error random field models introduced in (1.8). The functional central limit theorem is proved in Section 5.1. In Appendix A.6, we provide a preliminary convex analysis including estimates of inverses of Legendre–Fenchel transformations of classical convex functions needed in this article. In Section 5.2, we prove a technical concentration lemma for triangular arrays of conditionally independent random variables. In Sec- tion 5.3, we apply this lemma to prove concentration inequalities for mean field models.

2. Some illustrative examples.

2.1. Feynman–Kac models. As mentioned in the Introduction, the first prototype model we have in mind is a class of Feynman–Kac distribution flow equation arising in a variety of application areas including stochastic engineering, physics, biology and Bayesian statistics. These models are de- fined in terms of a series of bounded and positive integral operators Qn from En 1 into En with the following dynamical equation: − (2.1) fn (En) ηn(fn)= ηn 1(Qn(fn))/ηn 1(Qn(1)) ∀ ∈B − − with a given initial distribution η0 (E0). To avoid unnecessary technical discussions we simplify the analysis∈ and P we assume that

n 0 0 < inf Gn(x) sup Gn(x) < with Gn(x) := Qn+1(1)(x). ∀ ≥ x En ≤ x En ∞ ∈ ∈ Rewritten in a slightly different way, we have

ηn =Φn(ηn 1):=Ψn 1(ηn 1)Mn with Mn(fn)= Qn(fn)/Qn(1) − − − and the Boltzmann–Gibbs transformation Ψn from (En) into itself given by P f (E )Ψ (η )(f )= η (G f )/η (G ). ∀ n ∈B n n n n n n n n n Using the ratio formulation (2.1) of the semigroup, we will check in Ap- pendix A.3 that the first-order decomposition (1.9) is met with the first- CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 11 order operator defined by 1 DµΦn(f) := Qn(f Φn(µ)(f)). µQn(1) − The nonlinear filtering model (1.2), the particle absorption model (1.3) and the Boltzmann–Gibbs distribution flow (1.4) can be abstracted in this framework by setting

Qn+1(x,dx′)= Gn(x)Mn+1(x,dx′). We leave the reader to check that this flow of measures satisfy the recursive equation (1.1) for any choice of Markov transitions given below: (2.2) K (x,dy)= ε G (x)M (x,dy) + (1 ε G (x))Φ (η )(dy). n+1,ηn n n n − n n n+1 n In the above displayed formula εn stands for some [0, 1]-valued parameters that may depend on the current measure ηn and such that εnGn 1. In this situation, the mean field N-particle model associated withk the collectionk≤ of Markov transitions (2.2) is a combination of simple selection/mutation i genetic transition ξn ξn = (ξn)1 i N ξn+1. During the selection stage, ≤ ≤ i i i with probability εnGn(ξn), we set ξn = ξn; otherwise, the particle jumps to a b b N new location, randomly drawn from the discrete distributionΨn(ηn ). During i i the mutation stage, each of the selectedb particles ξn ξn+1 evolves according to the transition Mn+1. If we set εn = 0, the above particle model reduces to the simple genetic-type model discussed in (1.7b ) in the Introduction. 2.2. Gaussian mean field models. The concentration analysis presented in this article is not restricted to Feynman–Kac-type models. It also ap- plies to McKean-type models associated with a collection of multivariate d Gaussian-type Markov transitions on En = R , defined by 1 Kn,η(x,dy)= d (2π) det(Qn) (2.3) p 1 1 exp (y d (x, η))′Q− (y d (x, η)) dy, × −2 − n n − n   with a nonsingular, positive and semi-definite covariance matrix Qn and d d some sufficiently regular drift mapping dn : (x, η) R (R ) d(x, η) Rd. In Appendix A.5, for d = 1 we will check that∈ any× linear P drift7→ function∈ dn of the form dn(x, η)= an(x)+ η(bn)cn(x), with some measurable (and nonnecessarily bounded) function an, and some pair of functions bn and cn (R), the first-order decomposition (1.9) is met with the first-order operator∈B defined by D Φ (f)(x) := [K (f)(x) Φ (µ)(f)] µ n n,µ − n + b (x) µ(dy)c (y)K (y, dz)f(z)(z d (y,µ)). n n n,µ − n Z 12 P. DEL MORAL AND E. RIO

In this context, the N-mean field particle model is given by the following recursion: i i N i 1 i N ξn = dn(ξn 1, ηn 1)+ Wn, ∀ ≤ ≤ − − i where (Wn)i 0 is a collection of independent and identically distributed ≥ d-valued Gaussian random variables with covariance matrix Qn. The con- nection between these discrete generation models and the more traditional continuous time McKean–Vlasov diffusion models is as follows. Consider the partial differential equation d d 1 2 i ∂ µ = ∂ i j (µ ) ∂ i (b ( , η )µ ), t t 2 x ,x t − x t · t t i,jX=1 Xi=1 d where µt is a probability measure on R , and bt some drift term associated with some kernels bt′ and given by

bt(x,µ)= bt′ (x,x′)µ(dx′). Z Under appropriate regularity conditions, one can show that ηt is the marginal distribution at time t of the law of the solution of the nonlinear stochastic differential equation

(2.4) dXt = bt(Xt,µt) dt + dBt, where Bt is a d-dimensional Brownian motion. These models have been intro- duced in the late-1960s by McKean [23]. The convergence of the mean field particle model associated with the diffusion (2.4) has been deeply studied in the mid-1990s by Bossy and Talay [3, 4], M´el´eard [24] and Sznitman [32]. We also refer the reader to the more recent treatments on McKean–Vlasov diffusion models by Bolley, Guillin and Malrieu [1] and Bolley, Guillin and Villani [2]. Besides the fact that these continuous time probabilistic mod- els are directly connected to a rather large class of physical equations, to get some computationally feasible solution, some kind of time discretiza- tion scheme is needed. Mimicking traditional time discretization techniques of deterministic dynamical systems, several natural strategies can be used. For instance, we can use a Euler-type discretization of the diffusion given by (2.4) as follows: X∆ X∆ = b (X∆ ,µ )∆ + (B B ) tn − tn−1 tn−1 tn−1 tn−1 tn − tn−1 on the time mesh (tn)n 0, with (tn tn 1) = ∆, with some initial ran- ≥ − − ∆ dom variable with distribution µ0 = Law(X0 ). In this situation, the el- ∆ ementary transitions of the approximated random states Xtn are of the form (2.3), with the identity covariance matrices Qn = Id, and the drift functions dn(x, η)= x+btn−1 (x, η). The refined convergence analysis of these discrete time approximation models for more general models, including gran- ular media equations is developed Malrieu and Talay [21, 22]. CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 13

We mention that the semigroup derivation approach for functional fluc- tuation theorems and concentration inequalities developed in this article do not apply directly to any nonlinear diffusion equations with general interac- tion kernels ct. The semigroup derivation technique requires one to control recursively time the integrability properties of the semigroup associated with the first and second-order derivative terms. A rather crude sufficient con- dition is to assume that the drift terms dn is of the form discussed in the discrete time model.

2.3. A McKean model of gases. We end this section with a mean field particle model arising in fluid mechanics. We consider a measurable state space (Sn, n) with a countably generated σ-field and an ( n n)-measurable mapping aS be a from (S E ) into R such that ν (Sds)⊗Ea (s,x)=1, for n n × n + n n any x En, and some bounded positive measure νn (Sn). To illustrate ∈ R∈ M this model, we can take a partition of the state En = s Sn As associated ∈ with a countable set Sn equipped with the counting measure νn(s)=1 and S set an(s,x) = 1As (x). We let Kn+1,η be the McKean transition defined by

(2.5) Kn+1,η(x,dy)= νn(ds)η(du)an(s, u)Mn+1((s,x), dy). Z In the above displayed formula, Mn stands for some Markov transition from (Sn En) into En+1. The discrete time version of McKean’s two-velocities model× for Maxwellian gases corresponds to the time homogenous model on En = Sn = 1, +1 associated with the counting measure νn and the pair of parameters{− }

an(s,x) = 1s(x) and Mn+1((s,x), dy)= δsx(dy). In this situation, the measure valued equation (1.1) takes the following quadratic form: η (+1) = η (+1)2 + (1 η (+1))2. n+1 n − n We leave the reader to write out the mean field particle interpretation of this model. For more details on this model, we refer to [31]. In Appendix A.4, we will check that the first-order decomposition (1.9) is met with the first-order operator defined by D Φ (f)(x) = [K (f)(x) Φ (µ)(f)] µ n+1 n+1,µ − n+1 + ν (ds)[a(s,x) µ(a(s, ))]µ(M (f)(s, )). n − · n+1 · Z 3. Some weak regularity properties. To describe precisely the concen- tration inequalities developed in the article, we need to introduce a first round of notation. 14 P. DEL MORAL AND E. RIO

Definition 3.1. We let Υ(E, F ) be the set of mappings Φ : µ (E) Φ(µ) (F ) satisfying the first-order decomposition ∈ P 7→ ∈ P (3.1) Φ(µ) Φ(η) = (µ η)D Φ+ Φ(µ, η). − − η R In the above displayed formula, the first-order operators ( ηΦ)η (E) is some collection of bounded integral operators from E into F Dsuch that∈P

η (E), x E (DηΦ)(1)(x)=0 and (3.2) ∀ ∈ P ∀ ∈ β( Φ):= sup β(DηΦ) < . D η (E) ∞ ∈P Φ The collection of second-order remainder signed measures ( (µ, η))(µ,η) (E2 ) on F are such that R ∈P Φ 2 Φ (3.3) (µ, η)(f) (µ η)⊗ (g) R (f, dg), |R |≤ | − | η Z Φ 2 for some collection of integral operators Rη from (F ) into the set Osc1(E) such that B sup osc(g ) osc(g )RΦ(f, d(g g )) osc(f)δ(RΦ) 1 2 η 1 ⊗ 2 ≤ η (E) Z (3.4) ∈P with δ(RΦ) < . ∞ This rather weak first-order regularity property is satisfied for a large class of one-step transformations Φn associated with a nonlinear measure valued process (1.1). For instance, in Section 5.3 we shall prove that the Feynman– Kac transformations Φn introduced in (2.1) belong to the set Υ(En 1, En). The latter is also met for the Gaussian transitions introduced in (2.3)− and for the McKean-type model of gases (2.5) presented in Section 2.3. The proof of this assertion is rather technical and it is postponed in Appendix A.4. We assume that the one-step mappings

Φn : µ (En 1) Φn(µ) := µKn,µ (En) ∈ P − −→ ∈ P governing equation (1.1) are chosen so that Φn Υ(En 1, En), for any n 1. The main advantage of the regularity condition∈ comes− from the fact that≥ Φ Υ(E , E ) with the first-order decomposition-type formula p,n ∈ p n Φ (η) Φ (µ) = [η µ]D Φ + Φp,n (η,µ), p,n − p,n − µ p,n R for some collection of bounded integral operators DµΦp,n from Ep into En and some second-order remainder signed measures Φp,n (η,µ). For further R use, we let rn be the second-order stochastic perturbation term related to the quadratic remainder measures RΦp,n and defined by n Φp,n rn := δ(R ). p=0 X CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 15

4. Some exponential concentration inequalities. Let us examine some more or less direct consequences of the concentration inequalities stated in Theorem 1.2. When the Markov kernels Kn,µ = Kn do not depend on the measure µ, the N-particle model reduces to a collection of independent copies of the Markov chain with elementary transitions Pn = Kn. In this special case, the second-order parameters vanish (i.e., rn = 0), while the first-order expansion parameters (σn, βn) are related to the properties of the semigroup of the underlying Markov chain; that is, we have that n n 2 2 2 2 2 σn = σpβ(Pp,n) βn = β(Pp,n) with Pp,n = Kp+1,...,Kn 1Kn, ≤ − p=0 p=0 X X with the Dobrushin ergodic coefficient β(Pp,n) associated with Pp,n. When n the chain is asymptotically stable in the sense that supn 0 p=0 β(Pp,n) < , the first-order expansion parameters given above are unifo≥ rmly bounded with∞ respect to the time parameter. P In more general situations, the analysis of these parameters depends on the model at hand. For instance, for time homogeneous Feynman–Kac mod- els [i.e., En = E and (Gn, Mn) = (G, M)] these parameters can be related to the mixing properties of the Markov chain associated with the transitions M. To be more precise, let us suppose that the following condition is met:

(M)m m 1, εm > 0 s.t. (4.1) ∃ ≥ ∃ (x,y) E2 M m(x, ) ε M m(y, ). ∀ ∈ · ≥ m · It is well known that the mixing-type condition (M)m is satisfied for any aperiodic and irreducible Markov chains on finite spaces, as well as for bi- Laplace exponential transitions associated with a bounded drift function and for Gaussian transitions with a mean drift function that is constant outside some compact domain. To go one step further, we introduce the following quantities:

(4.2) δm := sup (G(xp)/G(yp)). 0 p

As we mentioned above, in the special case where the Markov kernels N Kn,µ = Kn do not depend on the measure µ, the random measures ηn co- incide with the occupation measure associated with N independent and identically distributed random variables with common law ηn. In this situ- ation, the pair of events described in Theorem 1.2 resumes to the following Bennett and Hoeffding-type concentration events, respectively, given by

N 2 ⋆ 1 x N 2x [η η ](f ) σ b ε− and [η η ](f ) β . n − n n ≤ n n 1 Nσ2 n − n n ≤ N n  n  r The first inequality can be described more explicitly using the analytic es- timates

1 √2x + (4x/3) log(1 + (x/3) + √2x) ε− (x) − (x/3) + √2x. 1 ≤ log(1 + (x/3) + √2x) ≤ In the context of Feynman–Kac models, the second-order terms can be estimated more explicitly using the upper bounds

1 log(1 + 2x + 2√x) 2√x ε− (x) 2x + log(1 + 2x + 2√x)+ − 2x + 2√x. 0 ≤ 2x + 2√x ≤ A detailed proof of the upper bounds given above is detailed in Appendix A.6, dedicated to the convex analysis of the Legendre–Fenchel transformations used in this article. The second rough estimate in the r.h.s. of the above displayed formulae leads to Bernstein-type concentration inequalities.

Corollary 4.1. For any N 1 and any n 0, we have the following Bernstein-type concentration inequalities:≥ ≥ 2 2 ⋆ 1 1 r λ √2r b − log P [ηN η ](f ) n + λ b⋆ σ + n + λ 2r + n N n − n n ≥ N ≥ 2 n n √ n 3    N    and 2 2 1 1 r λ √2r − log P [ηN η ](f ) n + λ β + n + 2r λ . −N n − n n ≥ N ≥ 2 n √ n    N   N In terms of the random fields Vn , the first concentration inequality stated in Corollary 4.1 takes the following form: r log P V N (f ) n + λ − n n ≥ √  N  2 2 ⋆ 1 λ √2r λ b − b⋆ σ + n + 2r + n ≥ 2 n n √ √ n 3  N  N   λ2 . N−→ 2(b⋆ σ )2 →∞ n n This proves the asymptotic estimate presented in (1.14). CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 17

Last, but not least, without further work, Theorem 1.2 leads to uni- form concentration inequalities for mean field particle interpretations of Feynman–Kac semigroups.

Corollary 4.2. In the context of Feynman–Kac models, under the mixing type condition (M)m introduced in (4.1), for any N 1, any n 0 and any x 0 the probability of each of the following pair of≥ events: ≥ ≥ N 4 1 [η η ](f ) ̟ (m)(1 + ε− (x)) n − n n ≤ N 3,1 0

8δm 2 1 x + ̟ (m)σ ε− ε 2,2 1 4σ2̟ (m)N m  2,2  and

N 4 1 2̟2,2(m)x [ηn ηn](fn) ̟3,1(m)(1 + ε0− (x)) + 2 − ≤ N r N is greater than 1 e x. − − 5. A stochastic perturbation analysis.

5.1. Proof of the functional central limit theorem.

Definition 5.1. We say that a collection of Markov transitions Kη from a measurable space (E, ) into another (F, ) satisfies condition (K) as soon as the following Lipschitz-typeE inequality isF met for every f Osc (F ): ∈ 1 (5.1) (K) [K K ](f) (µ η)(h) T K (f, dh). k µ − η k≤ | − | η Z K In the above display, Tη stands for some collection of bounded integral operators from (F ) into (E) such that B B K K (5.2) sup osc(h)Tη (f, dh) osc(f)δ(T ), η (E) ≤ ∈P Z Φ for some finite constant δ(T ) < . In the special case where Kη(x,dy)= Φ(η)(dy), for some mapping Φ: η ∞ (E) Φ(η) (F ), condition (5.1) is a simple Lipschitz-type condition∈ on P the7→ mapping∈ PΦ. In this situation, we denote by (Φ) the corresponding condition; and whenever it is met, we says that the mapping Φ satisfy condition (Φ).

We further assume that we are given a collection of McKean transitions Kn,η satisfying the weak Lipschitz-type condition stated in (5.1). In this situation, we already mention that the corresponding one-step mappings Φn(η)= ηKn,η, and the corresponding semigroup Φp,n satisfies condition Φp,n (Φp,n) for some collection of bounded integral operators Tη . 18 P. DEL MORAL AND E. RIO

In the context of Feynman–Kac-type mdels, it is not difficult to check that condition (Φn) is equivalent to the fact that the McKean transitions Kn,η given in (2.2) satisfy the Lipschitz condition (5.1). The latter is also met for the Gaussian transitions introduced in (2.3) as soon as the drift function d(x, η) is sufficiently regular. As before, this condition is met for the Gaussian transitions introduced in (2.3) and for the McKean-type model of gases (2.5) presented in Section 2.3. For a more detailed discussion on these stability properties, we refer the reader to the Appendix, on page 23. N Notice that the centered random fields Wn introduced in (1.8) have con- ditional variance functions given by E N 2 N N 2 (5.3) (Wn (fn) n 1)= ηn 1[Kn,ηN ((fn Kn,ηN (fn)) )]. |F − − n−1 − n−1 Using Kintchine’s inequality, for every f Osc1(En), N 1 and any n 0 and m 1 we have the L almost sure estimates∈ ≥ ≥ ≥ 2m E N 2m (N) 1/(2m) 2m m (5.4) ( Wn (fn) n 1) b(2m) with b(2m) := 2− (2m)!/m!. | | |F − ≤ We can also prove the following theorem.

N Theorem 5.2. The sequence (Wn )n 0 converges in law, as N tends to infinity, to the sequence of n independent,≥ Gaussian and centered random fields (Wn)n 0 described in Theorem 1.1. ≥ The proof of this theorem follows the same line of arguments as those we used in [9] in the context of Feynman–Kac models. For completeness, and for the convenience of the reader, the complete proof of this result is housed in Appendix A.2. Let us examine some direct consequences of this result. Combining the Lipschitz property (Φp,n) of the semigroup Φp,n with the decomposition n N N N [ηn ηn]= [Φp,n(ηp ) Φp,n(Φp(ηp 1))], − − − Xp=0 we find that n √ N N Φp,n N [ηn ηn](fn) = Wp (h) T N (f, dh). | − | | | Φp(ηp−1) p=0 X Z N In the above displayed formulae, we have used the convention Φ0(η 1)= η0, − for p = 0. From the previous L2m almost sure estimates, we readily conclude that n sup √NE( [ηN η ](f ) 2m)1/(2m) b(2m) δ(T Φp,n ). | n − n n | ≤ N 1 p=0 ≥ X CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 19

We are now in position to prove the fluctuation Theorem 1.1. Using the decomposition N N N √ Φn N Vn = Wn + Vn 1 n + NR (ηn 1, ηn 1), − D − − we readily prove that n 1 (5.5) V N = W N + N , n p Dp,n √ Rn p=0 N X with the remainder second-order measure n 1 − N := N RΦp+1 (ηN , η )D . Rn p+1 p p p+1,n p=0 X In the above display, p,n = p+1,..., n 1 n stands for the semigroup D D D − D associated with the integral operators n := Dηn−1 Φn, with the usual con- vention = Id, for p = n. Using a first-orderD derivation formula for the Dn,n semigroup Φp,n (cf., e.g., Lemma A.1 on page 24), it is readily checked that D Φ = (D Φ )(D Φ )= (D Φ )= . ηp p,n ηp p+1 ηp+1 p+1,n Dp+1 ηp p,n Dp,n Using the fact that n 1 − N N 2 Φp+1 (f ) (V )⊗ (g) Rη (f, dg), |Rn n |≤ | p | p p=0 X Z we conclude that, for any m 1, we have ≥ n 1 p 2 − E( N (f ) m)1/m b(2m)2 β( ) δ(T Φq,p ) δ(RΦp+1 ). |Rn n | ≤ Dp+1,n p=0 q=0 ! X X This clearly implies that 1 N converge in law to the null measure, in the √N Rn sense that 1 N (f ) converge in law to zero, for any bounded test function √N Rn n N fn on En. Using the fact that Wn converges in law to the sequence of n independent, random fields Wn, the proposition is now a direct consequence of the decomposition formula (5.5). This ends the proof of Theorem 1.1.

5.2. A concentration lemma for triangular arrays. For every n 0 and (N) (N,i) ≥ N 1, we let Xn := (Xn )1 i N be a triangular array of random vari- ≥ ≤ ≤ N ables defined on some filtered (Ω, n ) associated with N F (N,i) a collection of increasing σ-fields ( n )n 0. We assume that (Xn )1 i N N F ≥ ≤ ≤ are n 1-conditionally independent and centered random variables. Suppose furthermoreF − that (N,i) E (N,i) 2 N 2 n 0 an Xn bn and ((Xn ) n 1) cn ∀ ≥ ≤ ≤ |F − ≤ 20 P. DEL MORAL AND E. RIO

N for some collection of finite constants (an, bn, cn), with the convention 1 = ∅, Ω for n = 0. For any n 0, let F− { } ≥ N N N N N N N (N,i) Tn := Sn + Rn where ∆Sn := Sn Sn 1 = Xn − − Xi=1 N and Rn is a random perturbation term such that m 1 E( RN m)1/m b(2m)2d ∀ ≥ | n | ≤ n N for some finite constant dn. We use the convention S 1 =0, for n = 0. We set − n n 2 ⋆ 2 2 2 2 cn := (bn)− cp and δn := δp Xp=0 Xp=0 with the middle point b a δ := n − n . n 2 Lemma 5.3. For any N 1 and any n 0, the probability of each of the following pair of events ≥ ≥

N 1 2 ⋆ 1 x (5.6) T d (1+ ε− (x)) + Nc b ε− n ≤ n 0 n n 1 Nc2  n  and N 1 (5.7) T d (1+ ε− (x)) + δ √2xN n ≤ n 0 n is greater than 1 e x, for any x 0. − − ≥ Remark 5.4. Notice that (5.7) gives always a better concentration in- equality when n c2 n δ2. In the opposite situation, if n c2 < p=0 p ≥ p=0 p p=0 p n δ2, inequality (5.6) gives better concentration estimates for sufficiently p=0 p P P P small values of the precision parameter x. P Before getting into the details of the proof of the above lemma, we examine some direct consequences of these inequalities based on Legendre–Fenchel transforms estimates developed in Appendix A.6. First, combining (A.16) with (A.15) we observe that, with probability greater than 1 e x, − − x T N d (1+2√x + θ (x)) + b⋆ c √N√2x + Nc2 θ n ≤ n 0 n n n 1 Nc2   n  with the pair of functions log(1 + 2√x + 2x)) 2√x θ (x) := 2x + log(1 + 2√x + 2x) 2√x + − 2x 0 − 2x + 2√x ≤ CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 21 and √2x + (4x/3) x θ1(x) := 1 √2x . log(1 + (x/3) + √2x) − − ≤ 3 The upper bounds given above together with (A.8) imply that, with proba- bility greater than 1 e x, − − T N d + A x + 2xBN , n ≤ n n n where q b⋆ A := 2d + n and BN := (√2d + b⋆ c √N)2. n n 3 n n n n   Using these successive upper bounds, we arrive at the following Bernstein- type inequality: 1 T N d log P n n + λ −N N ≥ N (5.8)   2 2 ⋆ 1 λ √2d b − b⋆ c + n + λ 2d + n . ≥ 2 n n √ n 3  N    In much the same way, starting from (5.6), we have, with probability greater than 1 e x, − − (5.9) T N d (1 + 2(x + √x)) + δ √2xN = d + A x + 2xBN , n ≤ n n n n n with the pair of constants q N √ 2 An := 2dn and Bn := (√2dn + δn N) . Using these successive upper bounds, we arrive at the following Bernstein- type inequality: N 2 2 1 1 T d λ √2d − (5.10) log P n n + λ δ + n + 2d λ . − N N ≥ N ≥ 2 n √ n    N   Proof of Lemma 5.3. First, we observe that m N (tdn) t [0, 1/(2d )[ E(etRn ) b(2m)2m. ∀ ∈ n ≤ m! m 0 X≥ To obtain a more explicit form of the r.h.s. term, we recall that b(2m)2m = E(X2m) with a Gaussian centered random variable with E(X2)=1 and sm 1 d [0, 1/2[ E(exp dX2 )= b(2m)2m = . ∀ ∈ { } m! √ m 0 1 2d X≥ − From this observation, we readily find that N N t(R dn) t [0, 1/(2d )[ L (t) := log E(e n − ) α (t) := α (td ). ∀ ∈ n 0,n ≤ 0,n 0 n 22 P. DEL MORAL AND E. RIO

Using (A.7), we obtain the following almost sure inequality: 2 N cn E t∆Sn N log (e n 1) N α1(bnt). |F − ≤ b  n  It implies that n 2 N cp N E tSn N t 0 L1,n(t) := log (e ) N α1(bpt) α1,n(t), ∀ ≥ ≤ bp ≤ Xp=0  N 2 ⋆ with the increasing and convex function α1,n(t)= Ncnα1(bnt). Using (A.8), we now obtain the following Cram´er–Chernoff estimate: N N N⋆ 1 N⋆ 1 x (5.11) x 0 P(S + R r + (L )− (x) + (L )− (x)) e− . ∀ ≥ n n ≥ n 0,n 1,n ≤ In other words, the probability that N N N⋆ 1 N⋆ 1 S + R r + (L )− (x) + (L )− (x) n n ≤ n 0,n 1,n x is greater than 1 e− , which, together with the homogeneity properties of the inverses of− Legendre–Fenchel transforms recalled in Appendix A.6, gives (5.6). The proof of (5.7) is based on Hoeffding’s inequality,

(N,i) E tXn N 2 2 8 log (e n 1) t (bn an) . |F − ≤ − N N 2 2 From these estimates, we readily find that L1,n(t) α2,n(t) := Nδnt /2. Ar- guing as before, we find that ≤

N⋆ 1 N⋆ 1 2 (L )− (x) (α )− (x)= 2xNδ . 1,n ≤ 2,n n We end the proof of the second assertion usingq (5.11). This ends the proof of the lemma. 

5.3. Concentration properties of mean field models. This section is con- cerned with the proof of Theorem 1.2. To simplify the presentation, we set

(N) Φp,n p,n := Φ (ηN )Φp,n and p,n = . D D p p−1 R R Under our assumptions, we have the almost sure estimates (N) sup β( p,n ) β( Φp,n):= sup β( ηΦp,n). N 1 D ≤ D η (Ep) D ≥ ∈P In this notation, one important consequence of the above lemma is the fol- lowing decomposition: V N := √N[ηN η ] n n − n n √ N N N N = N [Φp,n(ηp ) Φp,n(Φp(ηp 1))] = In + Jn − − p=0 X CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 23

N N with the pair of random measures (In , Jn ) given by n n N N (N) N √ N N In := Wp p,n and Jn := N p,n(ηp , Φp(ηp 1)). D R − p=0 p=0 X X In what follows fn stands for some test function fn Osc1(En). Combin- ing (5.4) with the generalized Minkowski integral inequality∈ we find that

E N N m (N) 1/m 2 Φp,n N ( p,n(ηp , Φp(ηp 1))(fn) p 1) b(2m) δ(R ), |R − | |F − ≤ from which we readily conclude that n m 1/m E N m 1/m E N N ( √NJn (fn) ) = N p,n(ηp , Φp(ηp 1))(fn) | | R − ! p=0 X n b(2m) 2 δ(RΦp,n ). ≤ p=0 X Notice that n N √NIN = (N,i)(f ) where (N,i)(f )= U (N,i)( (N)(f )), n Xp,n n Xp,n n p Dp,n n p=0 X Xi=1 (N,i) and the random measures Up are given, for any g Osc (E ), by p ∈ 1 p (N,i) (N,i) (N,i) Up (gp) := gp(ξp ) Kp,ηN (gp)(ξp 1 ). − p−1 − In the further development of this section, we fix the final time horizon n and the the function f Osc (E ). To clarify the presentation, we omit the n ∈ 1 n final time index and the test function fn, and we set, for any p in [0,n], p N X(N,i) = (N,i)(f ), SN = X(N,i) p Xp,n n p q q=0 X Xi=1 and p N N N Rp := N q,n(ηq , Φq(ηq 1)). R − Xk=0 At the final time horizon, we have p = n = SN = √NIN and RN = √NJ N . ⇒ n n n n N By construction, these variables form a triangular array of p 1-condition- ally independent random variables and F − E (N,i) 2 N ((Xp ) p 1) = 0. |F − In addition, we readily check the following almost sure estimates: (N,i) E (N,i) 2 N 1/2 Xp β( Φp,n) and ((Xp ) p 1) σpβ( Φp,n) | |≤ D |F − ≤ D 24 P. DEL MORAL AND E. RIO for any 0 p n. The proof of the theorem is now a direct consequence of Lemma 5.3≤. ≤

APPENDIX A.1. A first-order composition lemma.

Lemma A.1. For any pair of mappings Φ Υ(E , E ) and Φ Υ(E , E ) 1 ∈ 0 1 2 ∈ 1 2 the composition mapping (Φ2 Φ1) Υ(E0, E2) and we have the first-order derivation-type formula ◦ ∈ (A.1) (Φ Φ )= Φ Φ . Dη 2 ◦ 1 Dη 1DΦ1(η) 2 Proof. To check this property, we first observe that under this condi- tion, we clearly have the Lipschitz property,

(Φ) [Φ(µ) Φ(η)](f) (µ η)(h) T Φ(f, dh), | − |≤ | − | η Z Φ for some collection of integral operators Tη from (F ) into the set Osc1(E) such that B

Φ Φ (A.2) sup osc(h)Tη (f, dh) osc(f)δ(T ) η (E) ≤ ∈P Z for some finite constant δ(T Φ) < . Using this property, we easily check that (A.1) is met with ∞ β( (Φ Φ )) β( Φ )β( Φ ) D 2 ◦ 1 ≤ D 2 D 1 and

Φ2 Φ1 Φ1 Φ1 2 Φ2 δ(R ◦ ) δ(T )+ δ(T ) δ(R ). ≤ This ends the proof of the lemma. 

We also mention that for any pair of mappings Φ1 : η (E0) Φ1 (E ) and Φ : η (E ) Φ (E ), the composition mapping∈ P Φ7→ = Φ ∈ P 1 2 ∈ P 1 7→ 1 ∈ P 2 2 ◦ Φ1 satisfies condition (Φ) as soon as this condition is met for each mapping. In this case, we also notice that

Φ2 Φ1 Φ2 Φ1 δ(T ◦ ) δ(T ) δ(T ). ≤ × Suppose we are given a mapping Φ defined in terms of a nonlinear transport formula

Φ(η)= ηKη, with a collection of Markov transitions Kη from a measurable space (E, ) into another (F, ) satisfying condition (K). Using the decomposition E F Φ(µ) Φ(η) = [η µ]K + µ[K K ], − − η µ − η CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 25 we readily check that (K) = (Φ) with T Φ(f, dh)= δ (dh)+ T K(f, dh). ⇒ η Kη(f) η N N A.2. Proof of Theorem 5.2. Let = n ; n 0 be the natural filtra- F {F (N) ≥ } tion associated with the N-particle system ξn . The first class of martin- gales that arises naturally in our context is the Rd-valued and N -martingale N F Mn (f) defined by n N N N (A.3) Mn (f)= [ηp (fp) Φp(ηp 1)(fp)], − − p=0 X u Rd where fp : xp Ep fp(xp) = (fp (xp))u=1,...,d is a d-dimensional and bounded measurable∈ 7→ function. By direct inspection,∈ we see that the vth com- N N u ponent of the martingale Mn (f) = (Mn (f ))u=1,...,d is the d-dimensional and F N -martingale defined for any u = 1,...,d by the formula n N u N u N u Mn (f )= [ηp (fp ) Φp(ηp 1)(fp )] − − Xp=0 n N u N u = [ηp (fp ) ηp 1Kp,ηN (fp )], − − p−1 p=0 X N with the usual convention K0,ηN = η0 =Φ0(η 1) for p = 0. The idea of the −1 − proof consists of using the CLT for triangular arrays of Rd-valued random variables ([18], Theorem 3.33, page 437). We first rewrite the martingale N √NMn (f) in the following form: N n √ N 1 (N,i) (N,i) NMn (f)= (fp(ξp ) Kp,ηN (fp)(ξp 1 )). √ − p−1 − p=0 N Xi=1 X √ N (n+1)N N This readily yields NMn (f)= k=1 Uk (f) where for any 1 k (n + 1)N with k = pN + i for some i = 1,...,N and p = 0,...,n ≤ ≤ P N 1 (N,i) (N,i) Uk (f)= (fp(ξp ) Kp,ηN (fp)(ξp 1 )). √N − p−1 − N We further denote by k the σ-algebra generated by the random variables j G ξp for any pair index (j, p) such that pN + j k. It can be checked that, for any 1 u < v d and for any 1 k (n +≤ 1)N with k = pN + i for some ≤ ≤ ≤ ≤E N u N i = 1,...,N and p = 0,...,n, we have (Uk (f ) k 1)=0 and |G − E N u N v N (Uk (f )Uk (f ) k 1) |G − 1 u u v v (N,i) = Kp,ηN [(fp Kp,ηN fp )(fp Kp,ηN fp )](Xp 1 ). N p−1 − p−1 − p−1 − 26 P. DEL MORAL AND E. RIO

This also yields that pN+N E N u N v N (Uk (f )Uk (f ) k 1) |F − k=XpN+1 N u u v v = ηp 1[Kp,ηN [(fp Kp,ηN fp )(fp Kp,ηN fp )]]. − p−1 − p−1 − p−1 N Our aim is now to describe the limiting behavior of the martingale √NMn (f) N def. [Nt]+N N in terms of the process Xt (f) = k=1 Uk (f). By the definition of the particle model associated with a given mapping Φn, and using the fact that [Nt] P [ ] = [t], one gets that for any 1 u, v d N ≤ ≤ [Nt]+N N u N v N E(Uk (f )Uk (f ) k 1) |F − Xk=1 [Nt] N[t] = CN (f u,f v)+ − (CN (f u,f v) CN (f u,f v)), [t] N [t]+1 − [t] where, for any n 0 and 1 u, v d, ≥ ≤ ≤ n N u v N u u v v Cn (f ,f )= ηp 1[Kp,ηN ((fp Kp,ηN fp )(fp Kp,ηN fp ))]. − p−1 − p−1 − p−1 p=0 X Under our regularity conditions on the McKean transitions, this implies that for any 1 i, j d, ≤ ≤ [Nt]+N N u N v N P u v E(Uk (f )Uk (f ) k 1) Ct(f ,f ), |F − −−−→N →∞ Xk=1 with n u v u u v v Cn(f ,f )= ηp 1[Kp,ηp−1 ((fp Kp,ηp−1 fp )(fp Kp,ηp−1 fp ))] − − − p=0 X and, for any t R , ∈ + C (f u,f v)= C (f u,f v)+ t (C (f u,f v) C (f u,f v)). t [t] { } [t]+1 − [t] Since U N (f) 2 ( f ), for any 1 k [Nt]+ N, the conditional k k k≤ √N p n k pk ≤ ≤ Lindeberg condition is clearly≤ satisfied, and therefore one concludes that W the Rd-valued martingale XN (f); t R converges in law to a continuous { t ∈ +} Gaussian martingale Xt(f); t R+ such that, for any 1 u, v d and t R , X(f u), X(f v){ = C (f∈u,f v).} Recalling that XN (f)=≤ √NM≤ N (f), ∈ + h it t [t] [t] Rd N N we conclude that the -valued and -martingale √NMn (f) converges d F u in law to an R -valued and Gaussian martingale Mn(f) = (Mn(f ))u=1,...,d CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 27 such that for any n 0 and 1 u, v d ≥ n ≤ ≤ u v u u v v M(f ), M(f ) n = ηp 1[Kp,ηp−1 ((fp Kp,ηp−1 fp )(fp Kp,ηp−1 fp ))], h i − − − p=0 X with the convention K0,η−1 = η0 for p = 0. To take the final step, we let (ϕn)n 0 be a sequence of bounded mea- surable functions, respectively, in (E≥)dn . We associate with ϕ = (ϕ ) B n n n the sequence of functions f = (fp)0 p n defined for any 0 p n by the following formula: ≤ ≤ ≤ ≤

u d0+ +dp+ +dn f = (f ) = (0,..., 0, ϕ , 0,..., 0) (E ) ··· ··· . p p u=0,...,n p ∈B p dq In the above display, 0 stands for the null function in (Ep) (for q = p). u B 6 By construction, we have, fu = ϕu and for any 0 u n, we have that u u ≤ ≤ f = (fp )0 p n ≤ ≤ = (0,..., 0, ϕu, 0,..., 0) (E )d0 (E )du (E )dn ∈B 0 ×···×B u ×···×B n so that √ N u √ N N N NMn (f )= N[ηu (ϕu) ηu 1Ku,ηN (ϕu)] = Vu (ϕu) − − u−1 and therefore √ N √ N u N N NMn (f) := ( NMn (f ))0 u n = (Vu (ϕu))0 u n := n (ϕ). ≤ ≤ ≤ ≤ V We conclude that N (ϕ) converges in law to an (n + 1)-dimensional and Vn centered Gaussian random field n(ϕ) = (Vu(ϕu))0 u n with, for any 0 u, v n, V ≤ ≤ ≤ ≤ E 1 2 (Vu(ϕu)Vv(ϕv)) 1 1 2 2 = 1u(v)ηu 1[Ku,ηu−1 (ϕu Ku,ηu−1 ϕu)Ku,ηu−1 (ϕu Ku,ηu−1 ϕu)]. − − − This ends the proof of the theorem.

A.3. Feynman–Kac semigroups. In the context of Feynman–Kac flows (2.1) discussed in the Introduction, the semigroup Φp,n is given by the fol- lowing formula:

ηp(Qp,n(f)) ηn(f)= with Qp,n = Qp+1,...,Qn 1Qn. ηp(Qp,n(1)) −

For p = n, we use the convention Qn,n = Id, the identity operator. Also observe that 1 [Φp,n(µ) Φp,n(η)](f)= (µ η)DηΦp,n(f), − µ(Gp,n,η) − with the first-order operator D Φ (f) := G P (f Φ (η)(f)). η p,n p,n,η p,n − p,n 28 P. DEL MORAL AND E. RIO

In the above display Gp,n,η and Pp,n stand for the potential function and the Markov operator given by

Gp,n,η := Qp,n(1)/η(Qp,n(1)) and Pp,n(f)= Qp,n(f)/Qp,n(1). It is now easy to check that

Φp,n 1 2 (µ, η)(f) := [µ η]⊗ (Gp,n,η Dp,n,η(f)). R −µ(Gp,n,η) − ⊗ Using the fact that

D Φ (f)(x)= G (x) [P (f)(x) P (f)(y)]G (y)η(dy), η p,n p,n,η p,n − p,n p,n,η Z we find that f Osc (E ) D Φ (f) q β(P ) ∀ ∈ 1 n k η p,n k≤ p,n p,n with Qp,n(1)(x) qp,n = sup . x,y Qp,n(1)(y) This implies that β(DΦ ) 2q β(P ). p,n ≤ p,n p,n Finally, we observe that

Φp,n 2 2 Gp,n,η Dp,n,η(f) (µ, η)(f) (2q β(D )) [µ η]⊗ |R |≤ p,n p,n − 2q ⊗ β(D )  p,n p,n  from which one concludes that

δ(RΦp,n ) 2q2 β(DΦ ) 4q3 β(P ). ≤ p,n p,n ≤ p,n p,n We end this section with the analysis of these quantities for the time homogeneous models discussed in (4.1) and (4.2). Under the condition (M)m we have for any n m 1 and p 1, ≥ ≥ ≥ 2 n/m (A.4) qp,p+n δm/εm and β(Pp,p+n) (1 εm/δm 1)⌊ ⌋. ≤ ≤ − − The proof of these estimates relies on semigroup techniques (see [9], Chap- ter 4, for details). Several contraction inequalities can be deduced from these results, given below. For any k 0 and for l = 1, 2, ≥ n m(δ /ε )k (A.5) qk β(P )l ̟ (m) := m m . p,n p,n ≤ k,l 1 ((1 ε2 /δ ))l p=0 m m 1 X − − − Notice that k k+2 δm/εm k k+2 ̟k,l(m) mδm 1 2 l 1 mδm 1δm/εm , ≤ − (2 (εm/δm 1)) − ≤ − − − CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 29 and that r 4̟ (m) and b⋆ 2δ /ε n ≤ 3,1 n ≤ m m as well as 2 2 2 2 2 σn 4̟2,2(m)σ and βn 4̟2,2(m) with σ := sup σn( 1). ≤ ≤ n 1 ≤ ≥ A.4. McKean mean field model of gases. We consider McKean-type mod- els of gases (2.5) presented in Section 2.3. To simplify the presentation, we consider time homogeneous models, and we supress the time index. In this notation, we find that

[K K ](f)(x)= ν(ds)[η µ](a(s, ))M(f)(s,x). η − µ − · Z Observe that [η µ](K K )(f)(x)= ν(ds)[η µ](a(s, ))[η µ](M(f)(s, )). − η − µ − · − · Z Using the decomposition (A.6) Φ(η) Φ(µ) = (η µ)K + µ(K K ) + [η µ](K K ) − − µ η − µ − η − µ we readily check that Φ Υ(E, E) with the first-order operator ∈ D Φ(f)(x) = [K (f)(x) Φ(µ)(f)] µ µ − + ν(ds)[a(s,x) µ(a(s, ))]µ(M(f)(s, )) − · · Z and the second-order remainder measure Φ 2 (µ, η)(f)= [η µ]⊗ (g )ν(ds) with g = a(s, ) M(f)(s, ). R − s s · ⊗ · Z In this situation, we notice that

β( Φ) β(M) 1+ ν(ds) osc(a(s, )) D ≤ ·  Z  and δ(RΦ) β(M) ν(ds) osc(a(s, )). ≤ · Z A.5. Gaussian semigroups. To simplify the presentation, we only dis- cuss time homogenous and one-dimensional models. We consider the one- dimensional gaussian transitions on E = R defined below: 1 1 K (x,dy)= exp (y d(x, η))2 dy η √ −2 − 2π   with some linear drift function dn of the form d(x, η)= a(x)+ η(b)c(x), with some measurable (and nonnecessarily bounded) function a, and some pair 30 P. DEL MORAL AND E. RIO of functions b and c (R). We use the decomposition ∈B [K K ](f)(x)= K (x,dy)Θ(∆ (x,y))f(y) η − µ µ µ,η Z

+ Kµ(x,dy)∆µ,η(x,y)f(y) Z with Θ(u)= eu 1 u and the function ∆ (x,y) defined by − − µ,η dK (x, ) ∆ (x,y) = log η · (y)=∆(1) (x,y)+∆(2) (x,y) µ,η dK (x, ) µ,η µ,η µ · with ∆(1) (x,y) := [d(x, η) d(x,µ)][y d(x,µ)] = c(x)(η µ)(b)[y d(x,µ)], µ,η − − − − ∆(2) (x,y) := 1 [d(x, η) d(x,µ)]2 = 1 c(x)2[(η µ)(b)]2. µ,η − 2 − − 2 − Under our assumptions on the drift function d, we have ∆(1) (x,y) c osc(b) y d(x,µ) and ∆(2) (x,y) c 2 osc(b)2/2. | µ,η | ≤ k k | − | | µ,η | ≤ k k u 2 Using the fact that Θ(u) e| |u /2, after some elementary manipulations we prove that | |≤

(1) 2 sup [Kη Kµ](f)(x) Kµ(x,dy)∆µ,η(x,y)f(y) C[(η µ)(b)] f x R − − ≤ − k k ∈ Z with some finite constant C < whose values only depend on c and osc(b). On the other hand, we have∞ k k

(1) 2 (η µ)(dx) K (x,dy)∆ (x,y)f(y) = (η µ)⊗ (b (K′ (f))) − µ µ,η − ⊗ µ Z Z and

(1) µ(dx) K (x,dy)∆ (x,y)f(y) = (η µ)(b)µ(K′ (f)) µ µ,η − µ Z Z with the bounded integral operator Kµ′ defined by

K′ (f)(x)= c(x) K (x,dy)[y d(x,µ)]f(y). µ µ − Z Using the decomposition (A.6) we prove that Φ(η)(f Φ(µ)(f)) = (η µ)D Φ(f)+ Φ(η,µ)(f), − − µ R with the first-order operator

D Φ(f)= K (f Φ(µ)(f)) + bµ(K′ (f)) µ µ − µ and a second-order remainder term such that Φ 2 2 (η,µ)(f) C′[ (η µ)⊗ (b (K′ (f))) + [(η µ)(b)] osc(f)] |R |≤ | − ⊗ µ | − CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 31 with some finite constant C′ < whose values only depend on c and osc(b). Using the fact that ∞ k k

K′ (1) = 0 and K′ (f) = K′ (f Φ(µ)(f)) c osc(f), µ k µ k k µ − k ≤ k k Φ we conclude that (3.3) and (3.4) are met with δ(R ) C′ osc(b)(2 c + osc(b)), and condition (3.2) is satisfied with β( Φ) 1+≤ c osc(b). k k D ≤ k k A.6. Legendre transform and convex analysis. We associate with any increasing and convex function L : t Dom(L) L(t) R+ defined in some ∈ 7→ ∈ ⋆ domain Dom(L) R+, with L(0) = 0, the Legendre–Fenchel transform L defined by the variational⊂ formula λ 0 L⋆(λ):= sup (λt L(t)). ∀ ≥ t Dom(L) − ∈ Note that L⋆ is a convex increasing function with L⋆(0) = 0 and its inverse ⋆ 1 ⋆ 1 (L )− is a concave increasing function [with (L )− (0) = 0]. ⋆ ⋆ For instance, the Legendre–Fenchel transforms (α0, α1) of the pair of con- vex nonnegative functions (α0, α1) given below: t [0, 1/2[ α (t) := t 1 log (1 2t) ∀ ∈ 0 − − 2 − and t 0 α (t) := et 1 t ∀ ≥ 1 − − are simply given by α⋆(λ)= 1 (λ log (1 + λ)) and α⋆(λ)=(1+ λ) log (1 + λ) λ. 0 2 − 1 − Recall that, for any centered random variable Y with values in ] , 1] such that E(Y 2) v, we have −∞ ≤ vet + e vt (A.7) E(etY ) − 1+ vα (t) exp(vα (t)). ≤ 1+ v ≤ 1 ≤ 1 We refer to [6] for a proof of (A.7) and for more precise results. For any pair of such functions (L1,L2), it is readily checked that t Dom(L ) L (t) L (t) and Dom(L ) Dom(L ) ∀ ∈ 2 1 ≤ 2 2 ⊂ 1 ⇓ ⋆ ⋆ ⋆ 1 ⋆ 1 L L and (L )− (L )− . 2 ≤ 1 1 ≤ 2 For any pair of positive numbers (u, v), We also have that 1 t v− Dom(L ) L (t)= uL (vt) ∀ ∈ 2 1 2 ⇓ ⋆ ⋆ λ ⋆ 1 ⋆ 1 x λ 0 L (λ)= uL and x 0 (L )− (x)= uv(L )− . ∀ ≥ 1 2 uv ∀ ≥ 1 2 u     32 P. DEL MORAL AND E. RIO

As a simple consequence of the latter results, let us quote the following property that will be used later in the further development of Section 5.3:

⋆ 1 x ⋆ 1 x u u and v v = uv(L )− u v(L )− . ≤ ≤ ⇒ 2 u ≤ 2 u     Here we want to give upper bounds on the inverse functions of the Legen- dre transforms. Our motivation is due to the following result, which avoids the loss of a factor 2 when adding exponential inequalities. Let A and B be centered random variables with finite log-Laplace transform, which we denote by αA and αB, in a neighborhood of 0. Then, denoting by αA+B the log-Laplace transform of A + B, ⋆ 1 ⋆ 1 ⋆ 1 (A.8) (α )− (t) (α )− (t) + (α )− (t) A+B ≤ A B for any positive t (see [25], Lemma 2.1). In order to obtain analytic approximations of these inverse functions, one can use the Newton algorithm: let x α (z) F (z)= z + − ∗ , (α∗)′(z) and define the sequence (zn) by zn = F (zn 1). From the properties of the Legendre–Fenchel transform, we also have that− α((α ) 1(z)) + x (A.9) F (z)= ′ − . (α ) 1(z)  ′ −  Now recall the variational formulation of the inverse of the Legendre–Fenchel transform, ⋆ 1 1 (A.10) (α )− (x)= inf t− (α(t)+ x), t>0 valid for any x 0 (see [26], page 159 for a proof of this formula). From this ≥ formula, assuming that α′′(0) > 0 and setting z = α′(t), we get that ⋆ 1 (A.11) (α )− (x)= inf F (z). z α′(Dom(α)) ∈ 1 1 Let then f(z)= α((α′)− (z)) + x and g(z) = (α′)− (z). From the strict con- vexity of α, the function t t 1((α(t)+ x) a unique minimum t and is → − x decreasing with negative derivative for ttx. It follows that f/g has a unique critical point z(x), which is the unique global strict minimum of F and the unique fixed point of F . ⋆ 1 Furthermore z(x) = (α )− (x). Let z0 > 0 be in the interior of the image by α′ of the domain of α. If z0 > z(x), then (zn) is a decreasing sequence of numbers bounded from below by z(x). Hence (z ) decreases to z(x) as n tends to . If z < z(x) n ∞ 0 and F (z0) belongs to the interior of α′(Dom(α)), then z1 > z(x) and (zn)n>0 is decreasing to z(x). CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 33

We now recall the convergence properties of the Newton algorithm. As- sume that z > z(x) and let A be a positive real such that F (z) 2A for 0 ′′ ≤ any z in [z(x), z0]. Then, by the Taylor formula at order 2, 2n 1 (2n) (A.12) 0 z z(x) A − (z z(x)) , ≤ n − ≤ 0 − which provides a supergeometric rate of convergence if A(z0 z(x)) < 1. Since F depends on x, A is a function of x. In order to get− estimates of the rate of convergence of zn to z(x) for small values of x, we now assume that α′ is convex. We will prove that ⋆ 1 1 (α )− (x) (A.13) A := sup F ′′(z) . 2 z z(x) ≤ 2xα′′(0) ≥ To prove (A.13), we start by computing F ′′ = (f/g)′′. Since f ′ = zg′, 2 (f/g)′ = g′(zg f)g− . − Now (zg f) = g + (zg f )= g. It follows that − ′ ′ − ′ 1 2 2 3 (f/g)′′ = g′g− + (zg f)(g′′g− 2g′ g− ). − − Next, for z z(x), zg(z) f(z) 0, so that ≥ − ≥ 1 2 (f/g)′′(z) g′g− + (zg f)g′′g− . ≤ − Under the additional assumption that α′ is convex, the inverse function (α ) 1 = g is concave, so that g 0. In that case, for z z(x), ′ − ′′ ≤ ≥ (f/g)′′(z) g′(z)/g(z) = (log g)′(z). ≤ t Now log g is the inverse function of ψ(t)= α′(e ). From the properties of α′, the function ψ is convex, so that log g is concave. Hence (log g)′ is nonin- creasing, which implies that

F ′′(z) g′(z(x))/g(z(x)) = z(x)g′(z(x))/f(z(x)) for any z z(x). ≤ ≥ Since f(z) x and g′(z(x)) g′(0) = 1/α′′(0), we get (A.13), noticing that ≥ ⋆ 1 ≤ z(x)= F (z(x)) = (α )− (x). We now apply these results to the functions α0 and α1. Using the fact that t2 t2 α (t) := et 1 t α (t) := 2 ≤ 1 − − ≤ 1 2(1 t/3) − for every t [0, 3[, and applying (B.5), page 153 in [26], we get that ∈ ⋆ 1 ⋆ 1 √2x (α )− (x) (α )− (x)= √2x + (x/3). ≤ 1 ≤ 1 ⋆ Also, by the second part of Theorem B.2 in [26], the function α1, which is the inverse function of the above function, satisfies t2 (A.14) α⋆(t) , 1 ≥ 2(1 + (t/3)) 34 P. DEL MORAL AND E. RIO which is the usual bound in the Bernstein inequality. Now z = et 1, and consequently t = log(1 + z) and − x + z log(1 + z) F (z)= − . log(1 + z)

Set z0 = √2x + (x/3). Then z0 > z(x). Hence z(x) < z1 < z0 (here z1 = F (z0)). So

⋆ 1 √2x + (4x/3) log(1 + (x/3) + √2x) (α )− (x) z1 := − 1 ≤ log(1 + (x/3) + √2x) (A.15) (x/3) + √2x. ≤ Furthermore, from (A.12) and (A.13) and the fact that z z(x) x/3, 0 − ≤ ⋆ 1 x ⋆ 1 0 z (α )− (x) (α )− (x), ≤ 1 − 1 ≤ 18 1 which ensures that ⋆ 1 18z /(18 + x) (α )− (x) z . 1 ≤ 1 ≤ 1 In the same way, noticing that t2/(1 4t/3) α (t) t2/(1 2t) for any t [0, 1/2[, − ≤ 0 ≤ − ∈ we get ⋆ 1 2√x + (4x/3) (α )− (x) 2√x + 2x := z . ≤ 0 ≤ 0 By definition of α0, we have α0′ (t) = 2t/(1 2t). Let z = 2t/(1 2t). Then t = z/(2 + 2z), so that − − 1 x + α0((α0′ )− (t)) 2x + log(1 + z) z F (z)= 1 = 2x + log(1 + z)+ − . (α0′ )− (t) z

Computing z1 = F (z0), we get ⋆ 1 (α )− (x) z := 2x + log(1 + 2x + 2√x) 0 ≤ 1 log(1 + 2x + 2√x) 2√x (A.16) + − 2x + 2√x 2x + 2√x, ≤ which improves on the previous upper bound. Furthermore, from (A.12) and (A.13)

⋆ 1 x ⋆ 1 0 z (α )− (x) (α )− (x), ≤ 1 − 0 ≤ 9 0 which ensures that ⋆ 1 9z /(9 + x) (α )− (x) z . 1 ≤ 0 ≤ 1 CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS 35

REFERENCES [1] Bolley, F., Guillin, A. and Malrieu, F. (2010). Trend to equilibrium and par- ticle approximation for a weakly selfconsistent Vlasov–Fokker–Planck equation. M2AN 44 867–884. [2] Bolley, F., Guillin, A. and Villani, C. (2007). Quantitative concentration in- equalities for empirical measures on non-compact spaces. Probab. Theory Related Fields 137 541–593. MR2280433 [3] Bossy, M. and Talay, D. (1997). A stochastic particle method for the McKean– Vlasov and the Burgers equation. Math. Comp. 66 157–192. MR1370849 [4] Bossy, M. and Talay, D. (1996). Convergence rate for the approximation of the limit law of weakly interacting particles: Application to the Burgers equation. Ann. Appl. Probab. 6 818–861. MR1410117 [5] Cerou, F., Del Moral, P. and Guyader, A. (2010). A non asymptotic vari- ance theorem for unnormalized Feynman-Kac particle models. Ann. Inst. H. Poincar´e. To appear. [6] Bentkus, V. (2004). On Hoeffding’s inequalities. Ann. Probab. 32 1650–1673. MR2060313 [7] Chopin, N. (2004). Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist. 32 2385–2411. MR2153989 [8] Crooks, G. E. (1998). Nonequilibrium measurements of free energy differences for microscopically reversible Markovian systems. J. Stat. Phys. 90 1481–1487. MR1628273 [9] Del Moral, P. (2004). Feynman–Kac Formulae: Genealogical and Interacting Par- ticle Systems with Applications. Springer, New York. MR2044973 [10] Del Moral, P., Doucet, A. and Jasra, A. (2006). Sequential Monte Carlo sam- plers. J. R. Stat. Soc. Ser. B Stat. Methodol. 68 411–436. MR2278333 [11] Del Moral, P. and Miclo, L. (2003). Particle approximations of Lyapunov ex- ponents connected to Schr¨odinger operators and Feynman–Kac semigroups. ESAIM Probab. Stat. 7 171–208. MR1956078 [12] Del Moral, P. and Miclo, L. (2000). Branching and interacting particle systems approximations of Feynman–Kac formulae with applications to non-linear filter- ing. In S´eminaire de Probabilit´es, XXXIV. Lecture Notes in Math. 1729 1–145. Springer, Berlin. MR1768060 [13] Del Moral, P. and Doucet, A. (2004). Particle motions in absorbing medium with hard and soft obstacles. Stoch. Anal. Appl. 22 1175–1207. MR2089064 [14] Del Moral, P. and Guionnet, A. (1999). Central limit theorem for nonlinear filter- ing and interacting particle systems. Ann. Appl. Probab. 9 275–297. MR1687359 [15] Doucet, A., de Freitas, N. and Gordon, N. (2001). Sequential Monte Carlo Methods in Practice. Springer, New York. MR1847783 [16] Steinsaltz, D., Evans, S. N. and Wachter, K. W. (2005). A generalized model of mutation-selection balance with applications to aging. Adv. in Appl. Math. 35 16–33. MR2141503 [17] Harris, T. E. and Kahn, H. (1951). Estimation of particle transmission by random sampling. Natl. Bur. Stand. Appl. Math. Ser. 12 27–30. [18] Jacod, J. and Shiryaev, A. N. (1987). Limit Theorems for Stochastic Processes. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 288. Springer, Berlin. MR0959133 [19] Jarzynski, C. (1997). Nonequilibrium equality for free energy differences. Phys. Rev. Lett. 78 2690–2693. 36 P. DEL MORAL AND E. RIO

[20] Jarzynski, C. (1997). Equilibrium free-energy differences from nonequilibrium mea- surements: A master-equation approach. Phys. Rev. E 56 5018. [21] Malrieu, F. (2003). Convergence to equilibrium for granular media equations and their Euler schemes. Ann. Appl. Probab. 13 540–560. MR1970276 [22] Malrieu, F. and Talay, D. (2005). Concentration inequalities for Euler schemes. In Monte Carlo and Quasi-Monte Carlo Methods 2004 355–371. Springer, Berlin. MR2208718 [23] McKean, H. P. Jr. (1967). Propagation of chaos for a class of non-linear parabolic equations. In Stochastic Differential Equations (Lecture Series in Differential Equations, Session 7, Catholic Univ., 1967) 41–57. Air Force Office Sci. Res., Arlington, VA. MR0233437 [24] Mel´ eard,´ S. (1996). Asymptotic behaviour of some interacting particle systems; McKean–Vlasov and Boltzmann models. In Probabilistic Models for Nonlin- ear Partial Differential Equations (Montecatini Terme, 1995). Lecture Notes in Math. 1627 42–95. Springer, Berlin. MR1431299 [25] Rio, E. (1994). Local invariance principles and their application to density estima- tion. Probab. Theory Related Fields 98 21–45. MR1254823 [26] Rio, E. (2000). Th´eorie Asymptotique des Processus Al´eatoires Faiblement D´ependants. Math´ematiques and Applications (Berlin) [Mathematics and Ap- plications] 31. Springer, Berlin. MR2117923 [27] Rosenbluth, M. N. and Rosenbluth, A. W. (1955). Monte-Carlo calculations of the average extension of macromolecular chains. J. Chem. Phys. 23 356–359. [28] Rousset, M. (2006). On the control of an interacting particle estimation of Schr¨odinger ground states. SIAM J. Math. Anal. 38 824–844 (electronic). MR2262944 [29] Rubinstein, R. (2009). The Gibbs cloner for combinatorial optimization, counting and sampling. Methodol. Comput. Appl. Probab. 11 491–549. MR2551567 [30] Rubinstein, R. Y. (2010). Randomized algorithms with splitting: Why the classic randomized algorithms do not work and how to make them work. Methodol. Comput. Appl. Probab. 12 1–50. [31] Shiga, T. and Tanaka, H. (1985). Central limit theorem for a system of Markovian particles with mean field interactions. Z. Wahrsch. Verw. Gebiete 69 439–459. MR0787607 [32] Sznitman, A.-S. (1991). Topics in propagation of chaos. In Ecole´ D’Et´ede´ Prob- abilit´es de Saint–Flour XIX—1989. Lecture Notes in Math. 1464 165–251. Springer, Berlin. MR1108185

Centre INRIA Bordeaux-Sud-Ouest Centre INRIA Bordeaux-Sud-Ouest Institut de Mathematiques´ de Bordeaux and Universite´ Bordeaux 1 Laboratoire de Mathematiques´ de Versailles 351 Cours de la Liberation´ Universite´ de Versailles, Batimentˆ Fermat 33405 Talence Cedex 45 Av. des Etats-Unis France 78035 Versailles Cedex E-mail: [email protected] France E-mail: [email protected]