1. Monte Carlo integration
The most common use for Monte Carlo methods is the evaluation of integrals. This is also the basis of Monte Carlo simulations (which are actually integrations). The basic principles hold true in both cases.

1.1. Basics

• The basic idea becomes clear from the "dartboard method" of integrating the area of an irregular domain: choose points randomly within a rectangular box enclosing the domain. Then
$$ \frac{A}{A_{\rm box}} = p(\mbox{hit inside area}) \approx \frac{\#\ \mbox{hits inside area}}{\#\ \mbox{hits inside box}}, $$
which becomes exact as the number of hits $\to \infty$.

• Buffon's (1707–1788) needle experiment to determine $\pi$:
– Throw needles of length $l$ on a grid of parallel lines a distance $d$ apart.
– The probability that a needle falls on a line is (for $l \le d$)
$$ P = \frac{2l}{\pi d}. $$
– Aside: Lazzarini (1901) performed Buffon's experiment with 3408 throws and obtained
$$ \pi \approx \frac{355}{113} = 3.14159292, $$
a way too good result! (See "π through the ages", http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Pi_through_the_ages.html)

1.2. Standard Monte Carlo integration

• Let us consider an integral in $d$ dimensions:
$$ I = \int_V d^d x\, f(x). $$
Let $V$ be a $d$-dimensional hypercube, with $0 \le x_\mu \le 1$ (for simplicity).

• Monte Carlo integration:
– Generate $N$ random vectors $x_i$ from the flat distribution ($0 \le (x_i)_\mu \le 1$).
– As $N \to \infty$,
$$ \frac{V}{N} \sum_{i=1}^{N} f(x_i) \to I. $$
– Error: $\propto 1/\sqrt{N}$, independent of $d$! (Central limit theorem, see Sec. 1.3.)

• "Normal" numerical integration methods (see Numerical Recipes):
– Divide each axis into $n$ evenly spaced intervals, so that the total number of points is $N \equiv n^d$.
– Error: $\propto 1/n$ (rectangle rule), $\propto 1/n^2$ (trapezoidal rule), $\propto 1/n^4$ (Simpson).

• If $d$ is small, Monte Carlo integration has much larger errors than the standard methods.

• When is MC as good as Simpson? Let us equate the errors:
$$ 1/n^4 = 1/N^{4/d} = 1/\sqrt{N} \quad\Rightarrow\quad d = 8. $$
In practice, MC integration becomes better when $d \gtrsim 6$–$8$.

• In lattice simulations $d = 10^6$–$10^8$!

• In practice, $N$ simply becomes too large very quickly for the standard methods: already at $d = 10$, if $n = 10$ (pretty small), $N = 10^{10}$. Too much!
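Both ideas above are easy to try out numerically. The following is a minimal Python/numpy sketch (the function name mc_integrate, the test integrand, and the sample sizes are illustrative choices, not from the notes): a dartboard estimate of $\pi$, and a flat-distribution MC estimate of an 8-dimensional integral.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# --- "Dartboard" method: estimate pi from the unit quarter-circle ---
n = 10**6
pts = rng.random((n, 2))                    # random points in the unit square
inside = np.sum(pts**2, axis=1) <= 1.0      # hits inside the quarter-circle
print("pi estimate:", 4.0 * inside.mean())  # area ratio = pi/4

# --- Standard MC integration over the unit hypercube ---
def mc_integrate(f, d, n):
    """Estimate I = int_V d^dx f(x) with n flat-distribution samples (V = 1)."""
    x = rng.random((n, d))                  # n random vectors x_i in [0,1]^d
    return f(x).mean()

f = lambda x: np.prod(x, axis=1)            # f(x) = x_1 x_2 ... x_d, exact I = 2^-d
for n in (10**2, 10**4, 10**6):
    print(f"N = {n:>7}:  I_N = {mc_integrate(f, 8, n):.6f}  (exact {2.0**-8:.6f})")
```

Consistently with the $1/\sqrt{N}$ scaling, each factor-of-100 increase in $N$ should shrink the error of $I_N$ by roughly a factor of 10.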
1.3. Why is the error $\propto 1/\sqrt{N}$?

• This is due to the central limit theorem (see, e.g., L.E. Reichl, Statistical Physics).

• Let $y_i = V f(x_i)$ (absorbing the volume into $f$) be the value of the integral using one random coordinate (vector) $x_i$. The result of the integral, after $N$ samples, is
$$ z_N = \frac{y_1 + y_2 + \dots + y_N}{N}. $$

• Let $p(y_i)$ be the probability distribution of $y_i$, and $P_N(z_N)$ the distribution of $z_N$. Then
$$ P_N(z_N) = \int dy_1 \dots dy_N\, p(y_1) \dots p(y_N)\, \delta\Big(z_N - \frac{1}{N}\sum_i y_i\Big). $$

• Now
$$ 1 = \int dy\, p(y), \qquad \langle y \rangle = \int dy\, y\, p(y), \qquad \langle y^2 \rangle = \int dy\, y^2\, p(y), $$
$$ \sigma^2 = \langle y^2 \rangle - \langle y \rangle^2 = \langle (y - \langle y \rangle)^2 \rangle. $$
Here, and in the following, $\langle \alpha \rangle$ means the expectation value of $\alpha$ (the mean of the distribution $p_\alpha(\alpha)$).

• The error of the Monte Carlo integration (of length $N$) is defined as the width of the distribution $P_N(z_N)$, or more precisely,
$$ \sigma_N^2 = \langle z_N^2 \rangle - \langle z_N \rangle^2. $$

• Naturally $\langle z_N \rangle = \langle y \rangle$.

• Let us calculate $\sigma_N$:
– Define the Fourier transform of $p(y)$:
$$ \phi(k) = \int dy\, e^{ik(y - \langle y \rangle)}\, p(y). $$
Similarly, the Fourier transform of $P_N(z_N)$ is
$$ \Phi_N(k) = \int dz_N\, e^{ik(z_N - \langle z_N \rangle)}\, P_N(z_N)
= \int dy_1 \dots dy_N\, e^{i(k/N)(y_1 - \langle y \rangle + \dots + y_N - \langle y \rangle)}\, p(y_1) \dots p(y_N)
= [\phi(k/N)]^N. $$
– Expanding $\phi(k/N)$ in powers of $k/N$ ($N$ large), we get
$$ \phi(k/N) = \int dy\, e^{i(k/N)(y - \langle y \rangle)}\, p(y) = 1 - \frac{1}{2}\frac{k^2 \sigma^2}{N^2} + \dots $$
Because of the oscillating exponent, $|\phi(k/N)|$ decreases as $|k|$ increases, and $\phi(k/N)^N$ decreases even more quickly. Thus
$$ \Phi_N(k) = \Big(1 - \frac{1}{2}\frac{k^2\sigma^2}{N^2} + O\big(k^3/N^3\big)\Big)^N \longrightarrow e^{-k^2\sigma^2/2N}. $$
– Taking the inverse Fourier transform we obtain
$$ P_N(z_N) = \frac{1}{2\pi}\int dk\, e^{-ik(z_N - \langle y \rangle)}\, \Phi_N(k)
= \frac{1}{2\pi}\int dk\, e^{-ik(z_N - \langle y \rangle)}\, e^{-k^2\sigma^2/2N}
= \sqrt{\frac{N}{2\pi\sigma^2}}\, \exp\Big[-\frac{N(z_N - \langle y \rangle)^2}{2\sigma^2}\Big]. $$

• Thus, the distribution $P_N(z_N)$ approaches a Gaussian as $N \to \infty$, with width
$$ \sigma_N = \sigma/\sqrt{N}. $$

1.4. In summary: error in MC integration

If we integrate
$$ I = \int_V dx\, f(x) $$
with Monte Carlo integration using $N$ samples, the error is
$$ \sigma_N = V \sqrt{\frac{\langle f^2 \rangle - \langle f \rangle^2}{N}}. $$
In principle, the expectation values here are exact properties of the distribution of $f$, i.e.
$$ \langle f \rangle = \frac{1}{V}\int dx\, f(x) = \int dy\, y\, p(y), \qquad
\langle f^2 \rangle = \frac{1}{V}\int dx\, f^2(x) = \int dy\, y^2\, p(y). $$
Here $p(y)$ is the distribution of the values $y = f(x)$ when the $x$-values are sampled with the flat distribution. $p(y)$ can be obtained as follows: the distribution of the $x$-coordinate is flat, $p_x(x) = C = $ const., and thus
$$ p_x(x)\, dx = C\, \frac{dx}{dy}\, dy \equiv p(y)\, dy
\quad\Rightarrow\quad p(y) = \frac{C}{dy/dx} = \frac{C}{f'(x(y))}. $$

Of course, we do not know the expectation values beforehand! In practice, they are substituted by the corresponding Monte Carlo estimates
$$ \langle f \rangle \approx \frac{1}{N}\sum_i f(x_i), \qquad \langle f^2 \rangle \approx \frac{1}{N}\sum_i f^2(x_i). \qquad (1) $$
Actually, in this case we must also use
$$ \sigma_N = V \sqrt{\frac{\langle f^2 \rangle - \langle f \rangle^2}{N - 1}}, $$
because using $N$ in the denominator together with the estimates (1) would underestimate $\sigma_N$.

Often in the literature the true expectation values are denoted by $\langle\langle x \rangle\rangle$, as opposed to the Monte Carlo estimates $\langle x \rangle$.

The error obtained in this way is called the 1-σ error; i.e. the error is the width of the Gaussian distribution of
$$ f_N = \frac{1}{N}\sum_i f(x_i). $$
It means that the true answer is within $V\langle f \rangle \pm \sigma_N$ with ~68% probability. This is the most common error value cited in the literature.

This assumes that the distribution of $f_N$ really is Gaussian. The central limit theorem guarantees that this is so, provided that $N$ is large enough (except in some pathological cases). However, in practice $N$ is often not large enough for the Gaussianity to hold! The distribution of $f_N$ can then be seriously non-Gaussian, usually skewed. More refined methods can be used in these cases (for example, modified jackknife or bootstrap error estimation). However, often this is ignored, because there simply are not enough samples to deduce the true distribution (or people are lazy).

Example: convergence in MC integration

• Let us study the following integral:
$$ I = \int_0^1 dx\, x^2 = 1/3. $$

• Denote $y = f(x) = x^2$ and $x = f^{-1}(y) = \sqrt{y}$.

• If we sample $x$ with the flat distribution $p_x(x) = 1$, $0 < x \le 1$, the probability distribution $p_f(y)$ of $y = f(x)$ can be obtained from
$$ p_x(x)\, dx = p_x(x) \Big|\frac{dx}{dy}\Big|\, dy \equiv p_f(y)\, dy
\quad\Rightarrow\quad
p_f(y) = \Big|\frac{dx}{dy}\Big| = 1/|f'(x)| = \frac{1}{2}\, x^{-1} = \frac{1}{2}\, y^{-1/2}, $$
where $0 < y \le 1$.

• In more than one dimension: $p_f(\vec y) = |\partial x_i / \partial y_j|$, the Jacobian determinant.

• The result of a Monte Carlo integration using $N$ samples $x_i$, $y_i = f(x_i)$, is $I_N = (y_1 + y_2 + \dots + y_N)/N$.

• What is the probability distribution $P_N$ of $I_N$? This tells us how badly the results of the (fixed length) MC integration scatter.
$$ P_1(z) = p_f(z) = \frac{1}{2\sqrt{z}} $$
$$ P_2(z) = \int_0^1 dy_1 dy_2\, p_f(y_1) p_f(y_2)\, \delta\Big(z - \frac{y_1 + y_2}{2}\Big)
= \begin{cases} \pi/2, & 0 < z \le 1/2 \\ \arcsin\dfrac{1-z}{z}, & 1/2 < z \le 1 \end{cases} $$
$$ P_3(z) = \int_0^1 dy_1 dy_2 dy_3\, p_f(y_1) p_f(y_2) p_f(y_3)\, \delta\Big(z - \frac{y_1 + y_2 + y_3}{3}\Big) = \dots $$

[Figure: $P_N$, measured from the results of $10^6$ independent MC integrations for each $N$. Left panel: $P_1$, $P_2$, $P_3$, $P_{10}$ on $0 \le z \le 1$; right panel: $P_{10}$, $P_{100}$, $P_{1000}$ on $0.3 \le z \le 0.4$.]

The expectation value $= 1/3$ for all $P_N$. The width of the Gaussian is
$$ \sigma_N = \sqrt{\frac{\langle\langle f^2 \rangle\rangle - \langle\langle f \rangle\rangle^2}{N}} = \sqrt{\frac{4}{45N}}, $$
where
$$ \langle\langle f^2 \rangle\rangle = \int_0^1 dx\, f^2(x) = \int_0^1 dy\, y^2\, p_f(y) = 1/5 $$
and $\langle\langle f \rangle\rangle = 1/3$, so that $\langle\langle f^2 \rangle\rangle - \langle\langle f \rangle\rangle^2 = 4/45$.
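This width can be checked directly. The sketch below (illustrative numpy code; the seed and the choice of $10^4$ rather than $10^6$ repetitions are mine) repeats the $N$-sample integration many times and compares the scatter of the results $I_N$ with the analytic $\sqrt{4/(45N)}$:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

N    = 1000    # samples per MC integration
runs = 10**4   # independent MC integrations (the figure uses 10^6)

# Each row is one MC integration of I = int_0^1 dx x^2 = 1/3
y = rng.random((runs, N))**2          # y_i = f(x_i) = x_i^2, x_i flat in (0,1)
I_N = y.mean(axis=1)

print("mean of I_N      :", I_N.mean())             # -> ~1/3
print("measured width   :", I_N.std())              # scatter of the runs
print("analytic sigma_N :", np.sqrt(4.0 / (45*N)))  # ~0.0094 for N = 1000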
For example, using $N = 1000$, $\sigma_{1000} \approx 0.0094$. Plotting
$$ C \exp\Big[-\frac{(x - 1/3)^2}{2\sigma_{1000}^2}\Big] $$
in the figure above, we obtain a curve which is almost indistinguishable from the measured $P_{1000}$ (blue dashed) curve.

In practice, the expectation value and the error are of course measured from a single Monte Carlo integration. For example, using again $N = 1000$ and the Monte Carlo estimates
$$ \langle f \rangle = \frac{1}{N}\sum_i f_i, \qquad \langle f^2 \rangle = \frac{1}{N}\sum_i f_i^2, \qquad
\sigma_N = \sqrt{\frac{\langle f^2 \rangle - \langle f \rangle^2}{N - 1}}, $$
we obtain the following results (here 4 separate MC integrations):

1: 0.34300 ± 0.00950
2: 0.33308 ± 0.00927
3: 0.33355 ± 0.00931
4: 0.34085 ± 0.00933

Thus, both the average and the error estimate can be reliably calculated in a single MC integration.
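A minimal sketch of such single-run measurements (again illustrative numpy; the printed numbers will differ from the table above, since the random samples differ):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

N = 1000
for run in range(1, 5):            # four separate MC integrations
    fi = rng.random(N)**2          # f(x_i) = x_i^2 with flat x_i
    f1 = fi.mean()                 # MC estimate of <f>
    f2 = (fi**2).mean()            # MC estimate of <f^2>
    sigma_N = np.sqrt((f2 - f1**2) / (N - 1))
    print(f"{run}: {f1:.5f} +- {sigma_N:.5f}")
```

Note the $N - 1$ in the denominator, as discussed in Sec. 1.4: with the Monte Carlo estimates of the expectation values, dividing by $N$ would underestimate the error.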