The Annals of Applied Probability 2011, Vol. 21, No. 3, 1017–1052
DOI: 10.1214/10-AAP716
© Institute of Mathematical Statistics, 2011
arXiv:1211.1837v1 [math.PR] 8 Nov 2012

CONCENTRATION INEQUALITIES FOR MEAN FIELD PARTICLE MODELS

By Pierre Del Moral and Emmanuel Rio

INRIA Bordeaux-Sud-Ouest, and INRIA Bordeaux-Sud-Ouest and Université de Versailles

This article is concerned with the fluctuations and the concentration properties of a general class of discrete generation and mean field particle interpretations of nonlinear measure valued processes. We combine an original stochastic perturbation analysis with a concentration analysis for triangular arrays of conditionally independent random sequences, which may be of independent interest. Under some additional stability properties of the limiting measure valued processes, uniform concentration properties, with respect to the time parameter, are also derived. The concentration inequalities presented here generalize the classical Hoeffding, Bernstein and Bennett inequalities for independent random sequences to interacting particle systems, yielding very new results for this class of models. We illustrate these results in the context of McKean–Vlasov-type diffusion models, McKean collision-type models of gases and of a class of Feynman–Kac distribution flows arising in stochastic engineering sciences and in molecular chemistry.

Received April 2010.
AMS 2000 subject classifications. Primary 60E15, 60K35; secondary 60F99, 60F10, 82C22.
Key words and phrases. Concentration inequalities, mean field particle models, measure valued processes, Feynman–Kac semigroups, McKean–Vlasov models.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Probability, 2011, Vol. 21, No. 3, 1017–1052. This reprint differs from the original in pagination and typographic detail.

1. Introduction.

1.1. Mean field particle models. Let $(E_n)_{n\ge 0}$ be a sequence of measurable spaces equipped with some σ-fields $(\mathcal{E}_n)_{n\ge 0}$, and let $\mathcal{P}(E_n)$ be the set of all probability measures over the set $E_n$, with $n\ge 0$. We consider a collection of transformations $\Phi_{n+1}\colon \mathcal{P}(E_n)\to \mathcal{P}(E_{n+1})$, $n\ge 0$, and we denote by $(\eta_n)_{n\ge 0}$ a sequence of probability measures on $E_n$ satisfying a nonlinear equation of the following form:

(1.1)  $\eta_{n+1} = \Phi_{n+1}(\eta_n)$.

These discrete time versions of conservative and nonlinear integro-differential type equations in distribution spaces arise in a variety of scientific disciplines, including physics, biology, information theory and engineering sciences. To motivate the article, before describing their mean field particle interpretations, we illustrate these rather abstract evolution models by working out explicitly some of these equations in a series of concrete examples.

The first one is related to nonlinear filtering problems arising in signal processing. Suppose we are given a pair signal-observation Markov chain $(X_n, Y_n)_{n\ge 0}$ on some product space $\mathbb{R}^{d_1}\times \mathbb{R}^{d_2}$, with some initial distribution and Markov transition of the following form:

$\mathbb{P}((X_0, Y_0)\in d(x,y)) = \eta_0(dx)\, g_0(x,y)\, \lambda_0(dy),$
$\mathbb{P}((X_{n+1}, Y_{n+1})\in d(x,y) \mid (X_n, Y_n)) = M_{n+1}(X_n, dx)\, g_{n+1}(x,y)\, \lambda_{n+1}(dy).$

In the above display, the $\lambda_n$ stand for some reference probability measures on $\mathbb{R}^{d_2}$, $g_n$ is a sequence of positive functions, $M_{n+1}$ are Markov transitions from $\mathbb{R}^{d_1}$ into itself and, finally, $\eta_0$ stands for some initial probability measure on $\mathbb{R}^{d_1}$.
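To make this signal-observation model concrete, the following Python sketch simulates such a chain in a scalar linear-Gaussian setting; the specific transition kernel, likelihood and constants below are illustrative assumptions and are not taken from the article.

import numpy as np

# A minimal sketch of a signal-observation chain (X_n, Y_n), assuming a
# scalar linear-Gaussian instance: the mutation kernel M_{n+1}(x, .) is
# N(a x, 1) and the observation Y_n given X_n = x is N(x, sigma^2), i.e.,
# a Gaussian likelihood g_n(x, .).  The constants a, sigma are placeholders.
rng = np.random.default_rng(1)
a, sigma = 0.9, 0.5

def simulate(n_steps):
    """Simulate (X_0, Y_0), ..., (X_{n_steps}, Y_{n_steps})."""
    x = rng.normal()                      # X_0 drawn from eta_0 = N(0, 1)
    xs, ys = [], []
    for _ in range(n_steps + 1):
        y = x + sigma * rng.normal()      # Y_n | X_n = x  ~  N(x, sigma^2)
        xs.append(x)
        ys.append(y)
        x = a * x + rng.normal()          # X_{n+1} | X_n = x drawn from M_{n+1}(x, .)
    return np.array(xs), np.array(ys)

xs, ys = simulate(20)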
For a given sequence of observations $Y_n = y_n$ delivered by some sensor, the filtering problem consists of computing sequentially the flow of conditional distributions defined by

$\hat{\eta}_n = \mathrm{Law}(X_n \mid Y_0 = y_0, \ldots, Y_n = y_n)$

and

$\eta_{n+1} = \mathrm{Law}(X_{n+1} \mid Y_0 = y_0, \ldots, Y_n = y_n).$

These distributions satisfy a nonlinear evolution equation of the form (1.1) with the transformations

$\Phi_{n+1}(\eta_n)(dx') = \int \hat{\eta}_n(dx)\, M_{n+1}(x, dx')$

and

(1.2)  $\hat{\eta}_n(dx) = \frac{G_n(x)}{\int \eta_n(dx')\, G_n(x')}\, \eta_n(dx)$

for some collection of likelihood functions $G_n = g_n(\cdot, y_n)$. Replacing these functions by some $]0,1]$-valued potential functions $G_n$ on $\mathbb{R}^{d_1}$, we obtain the conditional distributions of a particle absorption model $X_n'$ with free evolution transitions $M_n$ and killing rate $(1 - G_n)$. More precisely, if $T$ stands for the killing time of the process, we have that

(1.3)  $\hat{\eta}_n = \mathrm{Law}(X_n' \mid T > n)$ and $\eta_{n+1} = \mathrm{Law}(X_{n+1}' \mid T > n)$.

These nonabsorption conditional distributions arise in the analysis of confinement processes, as well as in computational physics with the numerical solving of Schrödinger ground state energies (see, e.g., [11, 13, 28]).

Another important class of measures arising in particle physics and stochastic optimization problems is the class of Boltzmann distributions, also known as Gibbs measures, defined by

(1.4)  $\eta_n(dx) = \frac{1}{Z_n}\, e^{-\beta_n V(x)}\, \lambda(dx),$

with some reference probability measure $\lambda$, some inverse temperature parameter $\beta_n$ and some nonnegative potential energy function $V$ on some state space $E$. The normalizing constant $Z_n$ is sometimes called the "partition function" or the "free energy." In the sequential Monte Carlo methodology, as well as in the operations research literature, these multiplicative formulae are often used to compute rare event probabilities, as well as the cardinality or the volume of some complex state spaces. Further details on these stochastic techniques can be found in the series of articles [5, 29, 30]. To fix the ideas, we can consider the uniform measure on some finite set $E$. Surprisingly, this flow of measures also satisfies the above nonlinear evolution equation, as soon as $\eta_n$ is an invariant measure of $M_n$, for each $n\ge 1$, and the potential functions $G_n$ are chosen of the following form: $G_n = \exp(-(\beta_{n+1} - \beta_n)V)$. The partition functions can also be computed in terms of the flow of measures $(\eta_p)_{0\le p<n}$ using the easy-to-check multiplicative formula

$Z_n = \prod_{0\le p<n} \int \eta_p(dx)\, G_p(x),$

as soon as $\beta_0 = 0$. In the statistical mechanics literature, the above formula is sometimes called the Jarzynski or the Crooks equality [8, 19, 20]. Notice that the stochastic models discussed above remain valid if we replace $e^{-\beta_{n+1} V}$ by any collection of functions $g_{n+1}$ such that $g_{n+1} = g_n \times G_n = \prod_{0\le p\le n} G_p$, for some potential functions $G_n$, with $n\ge 0$. Further details on this model, with several worked-out applications to concrete hidden Markov chain problems and Bayesian inference, can be found in the article [10] dedicated to sequential Monte Carlo technology.

All of the models discussed above can be abstracted into a single probabilistic model. The latter is often called a Feynman–Kac model, and it will be presented in some detail in Section 2.1.

The mean field-type interacting particle system associated with equation (1.1) relies on the fact that the one-step mappings can be rewritten in the following form:

(1.5)  $\Phi_{n+1}(\eta_n) = \eta_n K_{n+1,\eta_n}$

for some collection of Markov kernels $K_{n+1,\mu}$ indexed by the time parameter $n\ge 0$ and the set of measures $\mu$ on the space $E_n$.
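Before turning to the choice of the kernels $K_{n+1,\eta}$, let us record, as a side remark, why the multiplicative formula for the partition functions $Z_n$ displayed above is indeed easy to check; the following one-step computation is a verification sketch, using only (1.4), the choice $G_n = \exp(-(\beta_{n+1}-\beta_n)V)$ and the convention $\beta_0 = 0$ (so that $Z_0 = 1$):

\begin{align*}
Z_{n+1}
  &= \int e^{-\beta_{n+1} V(x)}\,\lambda(dx)
   = \int e^{-(\beta_{n+1} - \beta_n) V(x)}\, e^{-\beta_n V(x)}\,\lambda(dx) \\
  &= Z_n \int G_n(x)\,\eta_n(dx) = Z_n\,\eta_n(G_n),
\end{align*}

so that an induction over $n$ gives $Z_n = Z_0 \prod_{0\le p<n} \eta_p(G_p) = \prod_{0\le p<n} \int \eta_p(dx)\, G_p(x)$.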
We already mention that the choice of the Markov transitions $K_{n,\eta}$ is not unique. Several examples are presented in Section 2 in the context of Feynman–Kac semigroups or McKean–Vlasov diffusion-type models. To fix the ideas, we can choose elementary transitions $K_{n,\eta}(x, dx') = \Phi_n(\eta)(dx')$ that do not depend on the state variable $x$.

In the literature on mean field particle models, the transitions $K_{n,\eta}$ are called a choice of McKean transitions. These models provide a natural interpretation of the flow of measures $\eta_n$ as the laws of the time inhomogeneous Markov chain $X_n$ with elementary transitions

$\mathbb{P}(X_n \in dx \mid X_{n-1}) = K_{n,\eta_{n-1}}(X_{n-1}, dx)$ with $\eta_{n-1} = \mathrm{Law}(X_{n-1})$,

and starting with some initial random variable with distribution $\eta_0 = \mathrm{Law}(X_0)$. The Markov chain $X_n$ can be thought of as a perfect sampling algorithm. For a thorough description of these discrete generation and nonlinear McKean-type models, we refer the reader to [9].

In the further development of the article, we always assume that the mappings

$(x_n^i)_{1\le i\le N} \in E_n^N \mapsto K_{n+1,\, \frac{1}{N}\sum_{j=1}^N \delta_{x_n^j}}(x_n^i, A_{n+1})$

are $\mathcal{E}_n^{\otimes N}$-measurable, for any $n\ge 0$, $N\ge 1$, $1\le i\le N$, and any measurable subset $A_{n+1}\subset E_{n+1}$. In this situation, the mean field particle interpretation of this nonlinear measure valued model is an $E_n^N$-valued Markov chain $\xi_n^{(N)} = (\xi_n^{(N,i)})_{1\le i\le N}$, with elementary transitions defined as

(1.6)  $\mathbb{P}(\xi_{n+1}^{(N)} \in dx \mid \mathcal{F}_n^N) = \prod_{i=1}^N K_{n+1,\eta_n^N}(\xi_n^{(N,i)}, dx^i)$  with  $\eta_n^N := \frac{1}{N}\sum_{j=1}^N \delta_{\xi_n^{(N,j)}}$.

In the above displayed formula, $\mathcal{F}_n^N$ stands for the σ-field generated by the random sequence $(\xi_p^{(N)})_{0\le p\le n}$, and $dx = dx^1\times\cdots\times dx^N$ stands for an infinitesimal neighborhood of a point $x = (x^1, \ldots, x^N)\in E_{n+1}^N$. The initial system $\xi_0^{(N)}$ consists of $N$ independent and identically distributed random variables with common law $\eta_0$. As usual, to simplify the presentation, when there is no possible confusion we suppress the parameter $N$, so that we write $\xi_n$ and $\xi_n^i$ instead of $\xi_n^{(N)}$ and $\xi_n^{(N,i)}$.
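As an illustration of the transition (1.6), the following Python sketch implements the mean field particle model for a Feynman–Kac/filtering-type mapping built from (1.2), with the simplest McKean choice $K_{n+1,\eta}(x, dx') = \Phi_{n+1}(\eta)(dx')$; the particular potential $G_n$ and mutation kernel $M_{n+1}$ used below are illustrative placeholders, not taken from the article.

import numpy as np

# A minimal sketch of the N-particle chain (1.6), assuming:
#   - the Feynman--Kac/filtering-type mapping Phi_{n+1} built from (1.2),
#   - the simplest McKean choice K_{n+1,eta}(x, dx') = Phi_{n+1}(eta)(dx'),
#   - placeholder scalar ]0,1]-valued potentials G_n and Gaussian
#     mutation kernels M_{n+1} (illustrative assumptions only).
rng = np.random.default_rng(0)

def G(n, x):
    # Hypothetical potential G_n, taking values in ]0, 1].
    return np.exp(-0.5 * x**2)

def M_sample(n, x):
    # Sample from a hypothetical mutation kernel M_{n+1}(x, .).
    return 0.9 * x + rng.normal(size=x.shape)

def mean_field_step(n, xi):
    # One transition of (1.6): since K_{n+1,eta} = Phi_{n+1}(eta) does not
    # depend on the current state, the N new particles are, conditionally
    # on the past, i.i.d. samples from Phi_{n+1}(eta_n^N): each one selects
    # an ancestor with probability proportional to G_n and then mutates
    # through M_{n+1}.
    N = xi.shape[0]
    w = G(n, xi)
    w = w / w.sum()                          # selection weights under eta_n^N
    ancestors = rng.choice(N, size=N, p=w)   # i.i.d. ancestor indices
    return M_sample(n, xi[ancestors])        # mutation step

N = 1000
xi = rng.normal(size=N)        # xi_0: N i.i.d. samples with common law eta_0
for n in range(10):
    xi = mean_field_step(n, xi)

print("empirical mean of eta_10^N:", xi.mean())

Any integral $\eta_n(f)$ is then approximated by the empirical average $\eta_n^N(f) = \frac{1}{N}\sum_{i=1}^N f(\xi_n^i)$, and it is the concentration of such empirical quantities around their limiting values that is quantified in the remainder of the article.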