Lecture 2: Moments, Cumulants, and Scaling


Scribe: Ernst A. van Nierop (and Martin Z. Bazant)

February 4, 2005

Handouts:

• Historical excerpt from Hughes, Chapter 2: Random Walks and Random Flights.
• Problem Set #1

1 The Position of a Random Walk

1.1 General Formulation

Starting at the origin $\vec{X}_0 = 0$, if one takes $N$ steps of size $\Delta\vec{x}_n$, then one ends up at the new position $\vec{X}_N$. This new position is simply the sum of the independent random variable steps ($\Delta\vec{x}_n$),

$$\vec{X}_N = \sum_{n=1}^{N} \Delta\vec{x}_n.$$

The set $\{\vec{X}_n\}$ contains all the positions the random walker passes during its walk. If the elements of the set $\{\Delta\vec{x}_n\}$ are indeed independent random variables, then $\{\vec{X}_n\}$ is referred to as a "Markov chain". In a Markov chain, $\vec{X}_{N+1}$ is independent of $\vec{X}_n$ for $n < N$, but does depend on the previous position $\vec{X}_N$.

Let $P_N(\vec{R})$ denote the probability density function (PDF) for the values $\vec{R}$ of the random variable $\vec{X}_N$. Note that this PDF can be calculated for both continuous and discrete systems, where the discrete system is treated as a continuous system with Dirac delta functions at the allowed (discrete) values. Let $p_n(\vec{r}\,|\,\vec{R})$ denote the PDF for $\Delta\vec{x}_n$. Note that in this more general notation we have included the possibility that $\vec{r}$ (which is the value of $\Delta\vec{x}_n$) depends on $\vec{R}$. In Markov-chain jargon, $p_n(\vec{r}\,|\,\vec{R})$ is referred to as the "transition probability" from state $\vec{X}_N$ to state $\vec{X}_{N+1}$. Using the PDFs we just defined, let's return to Bachelier's equation from the first lecture.

1.2 Bachelier's Equation

From the Markov assumption, the probability of finding a random walker at $\vec{R}$ after $N$ steps depends on the probability of getting to position $\vec{R} - \vec{r}$ after $N - 1$ steps, and the probability of stepping $\vec{r}$, so that

$$P_N(\vec{R}) = \int p_N(\vec{r}\,|\,\vec{R} - \vec{r})\, P_{N-1}(\vec{R} - \vec{r})\, d^d\vec{r}.$$

Now, if we assume that the transition probability is translationally invariant, then $p_n(\vec{r}\,|\,\vec{R}) = p_n(\vec{r})$ (i.e. the step does not depend on the current position $\vec{R}$). This reduces the integral to a convolution and gives us Bachelier's equation

$$P_N(\vec{R}) = \int p_N(\vec{r})\, P_{N-1}(\vec{R} - \vec{r})\, d^d\vec{r},$$

or $P_N = p_N * P_{N-1}$, where '$*$' denotes the convolution. Note that $P_{N-1}$ is itself the result of a convolution, and we can write $P_N$ in terms of the PDFs of the steps taken ($1 \cdots N$) and the PDF of the initial position, $P_0$, so that

$$P_N = p_1 * p_2 * p_3 * \cdots * p_N * P_0.$$

In what follows, we assume that $P_0 = \delta(\vec{x})$, so the random walk always starts at the origin.

1.3 The Convolution Theorem

Convolutions are best treated by Fourier transform, which turns them into simple products. For two functions $f(\vec{x})$ and $g(\vec{x})$, the Convolution Theorem states that

$$\widehat{f * g}(\vec{k}) = \hat{f}(\vec{k})\, \hat{g}(\vec{k}).$$

To derive this result, we make use of the definition of the Fourier transform,

$$\widehat{f * g}(\vec{k}) = \int_{-\infty}^{\infty} e^{-i\vec{k}\cdot\vec{x}} \left[ \int_{-\infty}^{\infty} f(\vec{x}\,')\, g(\vec{x} - \vec{x}\,')\, d^d\vec{x}\,' \right] d^d\vec{x} = \int_{-\infty}^{\infty} f(\vec{x}\,') \left[ \int_{-\infty}^{\infty} e^{-i\vec{k}\cdot\vec{x}}\, g(\vec{x} - \vec{x}\,')\, d^d\vec{x} \right] d^d\vec{x}\,',$$

where we have inverted the order of integration by assuming both integrals are absolutely convergent (which follows from the common assumption $f, g \in L^2$). If we now let $\vec{y} = \vec{x} - \vec{x}\,'$ we obtain

$$\widehat{f * g}(\vec{k}) = \int_{-\infty}^{\infty} f(\vec{x}\,')\, e^{-i\vec{k}\cdot\vec{x}\,'} \left[ \int_{-\infty}^{\infty} e^{-i\vec{k}\cdot\vec{y}}\, g(\vec{y})\, d^d\vec{y} \right] d^d\vec{x}\,' = \hat{f}(\vec{k})\, \hat{g}(\vec{k}).$$

As described in the 2003 notes, the same method can be used to demonstrate the convolution theorem for discrete functions,

$$\hat{f}(k)\, \hat{g}(k) = \left( \sum_y e^{-iky} f(y) \right) \left( \sum_z e^{-ikz} g(z) \right) = \sum_x e^{-ikx} \sum_y g(y)\, f(x - y) = \widehat{f * g}(k),$$

where all the terms with an exponent equal to $x = y + z$ have been collected on the right-hand side of the second line. The key property used in both derivations is that $e^{-iky} e^{-ikz} = e^{-ik(y+z)}$.
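To make the convolution structure concrete, here is a minimal numerical sketch (an addition, not part of the original notes). It assumes a hypothetical lattice step distribution with $\Pr(\Delta x = \pm 1) = 0.4$ and $\Pr(\Delta x = 0) = 0.2$, iterates Bachelier's equation as a discrete convolution, checks the result against the FFT form of the convolution theorem, and compares with a Monte Carlo histogram of simulated walks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lattice step distribution: Pr(dx = -1, 0, +1) = 0.4, 0.2, 0.4
support = np.array([-1, 0, 1])
p = np.array([0.4, 0.2, 0.4])
N = 6

# Bachelier's equation, iterated: P_N = p * P_{N-1}, starting from P_0 = delta at the origin.
P = np.array([1.0])
for _ in range(N):
    P = np.convolve(P, p)                 # after N steps, P lives on positions -N, ..., +N

# Convolution theorem: the transform of the N-fold convolution is the N-th power of p-hat.
# Zero-padding p to length 2N+1 makes the circular (FFT) convolution agree with the linear one.
P_fft = np.fft.ifft(np.fft.fft(p, 2 * N + 1) ** N).real
print(np.allclose(P, P_fft))              # True

# Sanity check against simulated walks X_N = sum of N IID steps.
X = rng.choice(support, size=(100_000, N), p=p).sum(axis=1)
mc = np.array([(X == x).mean() for x in np.arange(-N, N + 1)])
print(np.abs(P - mc).max())               # small, and shrinks as the sample grows
```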
1.4 Exact Solution for the PDF of the Position

Using the convolution theorem, we now write

$$\hat{P}_N(\vec{k}) = \hat{p}_1(\vec{k})\, \hat{p}_2(\vec{k}) \cdots \hat{p}_N(\vec{k}) = \prod_{n=1}^{N} \hat{p}_n(\vec{k}).$$

In the case of identical steps, this simplifies to $\hat{P}_N(\vec{k}) = (\hat{p}_1(\vec{k}))^N$. Using the inverse transform, we find that the PDF of the position of the walker is given by

$$P_N(\vec{x}) = \int_{-\infty}^{\infty} e^{i\vec{k}\cdot\vec{x}} \prod_{n=1}^{N} \hat{p}_n(\vec{k})\, \frac{d^d\vec{k}}{(2\pi)^d}.$$

This is an exact solution, although the integral cannot be evaluated analytically in terms of elementary functions, except in a few special cases. (See Hughes, Ch. 2.) However, we are generally interested in the "long-time limit" of many steps, $N \to \infty$, where the integral can be approximated by asymptotic analysis. This will be pursued in the next lecture. Here, we focus on scaling properties of the distribution with the number of steps.

2 Moments and Cumulants

2.1 Characteristic Functions

The Fourier transform of a PDF, such as $\hat{P}_N(\vec{k})$ for $\vec{X}_N$, is generally called a "characteristic function" in the probability literature. For random walks, especially on lattices, the characteristic function for the individual steps, $\hat{p}(\vec{k})$, is often referred to as the "structure function". Note that

$$\hat{P}(\vec{0}) = \int_{-\infty}^{\infty} P(\vec{x})\, d^d\vec{x} = 1$$

due to the normalization of $P(\vec{x})$, and that this is a maximum, since

$$|\hat{P}(\vec{k} \neq \vec{0})| \leq \int |e^{-i\vec{k}\cdot\vec{x}} P(\vec{x})|\, d^d\vec{x} = \int P(\vec{x})\, d^d\vec{x} = 1.$$

If $P(\vec{x})$ is nonzero on a set of nonzero measure, then the complex exponential factor causes cancellations in the integral, resulting in $|\hat{P}(\vec{k})| < 1$ for $\vec{k} \neq \vec{0}$.

2.2 Moments

In the one-dimensional case ($d = 1$), the moments of a random variable, $X$, are defined as

$$m_1 = \langle x \rangle, \quad m_2 = \langle x^2 \rangle, \quad \ldots, \quad m_n = \langle x^n \rangle, \quad \ldots$$

To get a feeling for the significance of moments, note that $m_1$ is just the mean or expected value of the random variable. When $m_1 = 0$, the second moment, $m_2$, allows us to define the "root-mean-square width" of the distribution, $\sqrt{m_2}$. In a multi-dimensional setting ($d > 1$), the moments become tensors,

$$m_n^{(j_1 j_2 \ldots j_n)} = \langle x_{j_1} x_{j_2} \cdots x_{j_n} \rangle,$$

so the $n$th moment is a tensor of rank $n$ with $d^n$ components. The off-diagonal components measure anisotropy, and generally the moment tensors describe the "shape" of the distribution.

In probability, a characteristic function $\hat{P}(\vec{k})$ is also often referred to as a "moment-generating function", because it conveniently encodes the moments in its Taylor expansion around the origin. For example, for $d = 1$, we have

$$\hat{P}(k) = \sum_{n=0}^{\infty} \frac{(-i)^n k^n m_n}{n!} = 1 - i m_1 k - \frac{1}{2} m_2 k^2 + \cdots,$$

since

$$m_n = i^n \frac{\partial^n \hat{P}}{\partial k^n}(0).$$

For $d > 1$, the moment tensors are generated by

$$m_n^{(j_1 j_2 \ldots j_n)} = i^n \frac{\partial^n \hat{P}}{\partial k_{j_1} \partial k_{j_2} \cdots \partial k_{j_n}}(0),$$

and from the Taylor expansion,

$$\hat{P}(\vec{k}) = 1 - i\, \vec{m}_1 \cdot \vec{k} - \frac{1}{2}\, \vec{k} \cdot m_2 \cdot \vec{k} + \cdots.$$

Note that $\hat{P}(\vec{k})$ is analytic around the origin if and only if all moments exist and are finite ($m_n < \infty$ for all $n$). This condition breaks down in cases of distributions with "fat tails", to be discussed in subsequent lectures.
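As a small worked illustration (an addition, not from the notes), the following SymPy sketch assumes a unit exponential step density $P(x) = e^{-x}$ on $x \geq 0$, whose characteristic function is $\hat{P}(k) = 1/(1 + ik)$, and reads the moments off the derivatives at the origin, $m_n = i^n \partial_k^n \hat{P}(0)$.

```python
import sympy as sp

k = sp.symbols('k', real=True)

# Characteristic function of the assumed step density P(x) = e^{-x}, x >= 0:
#   P_hat(k) = integral_0^oo e^{-ikx} e^{-x} dx = 1 / (1 + i k)
P_hat = 1 / (1 + sp.I * k)

# Normalization: P_hat(0) = 1, the maximum of |P_hat|.
print(P_hat.subs(k, 0))                                         # 1

# Moments from derivatives at the origin: m_n = i^n d^n P_hat / dk^n at k = 0
moments = [sp.simplify(sp.I**n * sp.diff(P_hat, k, n).subs(k, 0)) for n in range(1, 5)]
print(moments)                                                  # [1, 2, 6, 24], i.e. m_n = n!

# Consistency: the series 1 - i m_1 k - m_2 k^2/2 + ... reproduces P_hat near k = 0.
taylor = sum((-sp.I * k)**n * m / sp.factorial(n) for n, m in enumerate([1] + moments))
print(sp.expand(sp.series(P_hat, k, 0, 5).removeO() - taylor))  # 0
```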
2.3 Cumulants

Certain nonlinear combinations of moments, called "cumulants", arise naturally when analyzing sums of independent random variables. Consider

$$P(\vec{x}) = \int e^{i\vec{k}\cdot\vec{x}}\, \hat{P}(\vec{k})\, \frac{d^d\vec{k}}{(2\pi)^d} = \int e^{i\vec{k}\cdot\vec{x} + \psi(\vec{k})}\, \frac{d^d\vec{k}}{(2\pi)^d},$$

where

$$\psi(\vec{k}) = \log \hat{P}(\vec{k})$$

is called the "cumulant generating function". Its Taylor coefficients at the origin (analogous to the moments above) are the "cumulants", which are scalars for $d = 1$ and, more generally, tensors for $d > 1$:

$$c_n^{(j_1, j_2, \ldots, j_n)} = i^n \frac{\partial^n \psi}{\partial k_{j_1} \partial k_{j_2} \cdots \partial k_{j_n}}(0), \qquad \psi(\vec{k}) = -i\, \vec{c}_1 \cdot \vec{k} - \frac{1}{2}\, \vec{k} \cdot c_2 \cdot \vec{k} + \cdots.$$

The cumulants can be expressed in terms of the moments by equating Taylor expansion coefficients in the two expansions above. For $d = 1$, we find

$$c_1 = m_1 \quad \text{or `mean'},$$
$$c_2 = m_2 - m_1^2 = \sigma^2 \quad \text{or `variance'} \quad (\sigma = \text{standard deviation}),$$
$$c_3 = m_3 - 3 m_1 m_2 + 2 m_1^3 \quad \text{or `skewness'},$$
$$c_4 = m_4 - 3 m_2^2 - 4 m_1 m_3 + 12 m_1^2 m_2 - 6 m_1^4 \quad \text{or `kurtosis'}.$$

(Note that "skewness" and "kurtosis" are also sometimes used for the normalized cumulants, defined in the next lecture.) Although these expressions increase in complexity rather quickly, the $l$th cumulant can be expressed in closed form in terms of the first $l$ moments via the determinant of an 'almost' lower triangular matrix,

$$c_l = (-1)^{l+1} \begin{vmatrix}
m_1 & 1 & 0 & 0 & 0 & \cdots & 0 \\
m_2 & m_1 & 1 & 0 & 0 & \cdots & 0 \\
m_3 & m_2 & \binom{2}{1} m_1 & 1 & 0 & \cdots & 0 \\
m_4 & m_3 & \binom{3}{1} m_2 & \binom{3}{2} m_1 & 1 & \cdots & 0 \\
m_5 & m_4 & \binom{4}{1} m_3 & \binom{4}{2} m_2 & \binom{4}{3} m_1 & \cdots & 0 \\
\vdots & \vdots & & & & \ddots & \vdots \\
m_{l-1} & m_{l-2} & \cdots & & & & 1 \\
m_l & m_{l-1} & \cdots & & & & \binom{l-1}{l-2} m_1
\end{vmatrix}.$$

For $d > 1$, the $n$th cumulant is a tensor of rank $n$ with $d^n$ components, related to the moment tensors $m_l$ for $1 \leq l \leq n$. For example, the second cumulant matrix is given by

$$c_2^{(ij)} = m_2^{(ij)} - m_1^{(i)} m_1^{(j)}.$$

3 Additivity of Cumulants

A crucial feature of random walks with independent, identically distributed (IID) steps is that cumulants are additive. If we define $\psi(\vec{k})$ and $\psi_N(\vec{k})$ to be the cumulant generating functions of the steps and of the final position, respectively, then

$$P_N(\vec{x}) = \int e^{i\vec{k}\cdot\vec{x} + N\psi(\vec{k})}\, \frac{d^d\vec{k}}{(2\pi)^d} = \int e^{i\vec{k}\cdot\vec{x} + \psi_N(\vec{k})}\, \frac{d^d\vec{k}}{(2\pi)^d},$$

which implies

$$\psi_N(\vec{k}) = N \psi(\vec{k}).$$

Therefore, the $n$th cumulant tensor $c_{n,N}^{(j_1, j_2, \ldots, j_n)}$ for the position after $N$ steps is related to the $n$th cumulant tensor for each step by

$$c_{n,N} = N c_n.$$
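The additivity of cumulants is easy to check numerically. The sketch below (an added example, not from the notes) assumes IID Exp(1) steps, estimates the step cumulants from the moment formulas above, and verifies that the sample cumulants of $\vec{X}_N$ are roughly $N$ times those of a single step; SciPy's k-statistics are unbiased estimators of the cumulants for $n = 1, \ldots, 4$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Assumed example: IID Exp(1) steps, whose exact cumulants are c_n = (n-1)! = 1, 1, 2, 6.
N = 10
steps = rng.exponential(1.0, size=(400_000, N))
x, X_N = steps[:, 0], steps.sum(axis=1)           # one step, and the position after N steps

# Cumulants of a single step from the moment formulas above (d = 1).
m = [np.mean(x**n) for n in range(1, 5)]          # m_1 ... m_4
c = [m[0],
     m[1] - m[0]**2,
     m[2] - 3*m[0]*m[1] + 2*m[0]**3,
     m[3] - 3*m[1]**2 - 4*m[0]*m[2] + 12*m[0]**2*m[1] - 6*m[0]**4]
print(np.round(c, 2))                             # close to [1, 1, 2, 6]

# Additivity: the n-th cumulant of X_N is N times the n-th cumulant of a single step.
# scipy.stats.kstat(data, n) is an unbiased estimator of the n-th cumulant (n = 1..4).
for n in range(1, 5):
    ratio = stats.kstat(X_N, n) / stats.kstat(x, n)
    print(n, round(ratio, 2))                     # each ratio is approximately N = 10, up to sampling noise
```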