
Lecture 20: Randomness and Monte Carlo

J. Chaudhry

Department of Mathematics and Statistics, University of New Mexico

What we'll do:

- Random number generators
- Monte-Carlo integration
- Monte-Carlo simulation

Our testing and our methods often rely on "randomness" and the impact of finite precision:
- Random input
- Random perturbations
- Random sequences

Randomness

- Randomness ≈ unpredictability
- One view: a sequence is random if it has no shorter description
- Physical processes, such as flipping a coin or tossing dice, are in principle deterministic, given enough information about the governing equations and initial conditions
- But even for deterministic systems, sensitivity to the initial conditions can render the behavior practically unpredictable
- So we need random simulation methods

Quasi-Random Sequences

- For some applications, reasonably uniform coverage of the sample space is more important than "randomness"
- True random samples often exhibit clumping
- Perfectly uniform samples use a uniform grid, but this does not scale well to high dimensions
- Quasi-random sequences attempt randomness while maintaining coverage

Quasi-Random Sequences

- Quasi-random sequences are not random, but give a random appearance
- By design, the points avoid each other, resulting in no clumping

http://dilbert.com/strips/comic/2001-10-25/ http://www.xkcd.com/221/

Randomness is easy, right?

In May 2008, Debian announced a vulnerability in OpenSSL: the OpenSSL pseudo-random number generator
- the seeding process had been compromised (for 2 years)
- only 32,767 possible keys
- seeding was based on the process ID (this is not entropy!)
- all SSL and SSH keys from 9/2006 - 5/2008 had to be regenerated
- all certificates had to be recertified

Cryptographically secure pseudorandom number generators (CSPRNGs) are necessary for some applications; other applications rely less on true randomness.

Repeatability

- With unpredictability, true randomness is not repeatable
- ...but lack of repeatability makes testing/debugging difficult
- So we want repeatability, but also independence of the trials

>> rand('seed',1234)
>> rand(10,1)
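Note: recent versions of Matlab recommend seeding through the rng function rather than the legacy rand('seed',...) syntax; a minimal equivalent of the snippet above:

    rng(1234)      % seed the default generator for repeatability
    rand(10,1)     % the same 10 values are produced after every reseed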

Pseudorandom Numbers

Computer algorithms for random number generation are deterministic

- ...but they may have long periods (a long time before an apparent pattern emerges)
- Such sequences are called pseudorandom
- Pseudorandom sequences are predictable and reproducible (this is mostly good)

Random Number Generators

Properties of a good random number generator:
- Random pattern: passes statistical tests of randomness
- Long period: goes a long time before repeating
- Efficiency: executes rapidly and with low storage
- Repeatability: the same sequence is generated when using the same initial state
- Portability: the same sequence is generated on different architectures

Random Number Generators

- Early attempts relied on complexity to ensure randomness
- "midsquare" method: square each member of a sequence and take the middle portion of the result as the next member of the sequence
- ...simple methods with a statistical basis are preferable

Linear Congruential Generators (LCGs)

Congruential random number generators are of the form:

x_k = (a x_{k-1} + c) \bmod m

where a and c are integers given as input.

- x_0 is called the seed
- m is typically chosen so that m − 1 is the largest representable integer
- Quality depends on a and c
- The period will be at most m

Example

Let a = 13, c = 0, m = 31, and x_0 = 1.

1, 13, 14, 27, 10, 6,...

This is a permutation of integers from 1,..., 30, so the period is m − 1.
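A minimal Matlab sketch of this LCG, using the parameters from the example (the script is ours, not a library routine):

    % LCG: x_k = (a*x_{k-1} + c) mod m
    a = 13; c = 0; m = 31;
    x = 1;                      % seed x_0
    seq = zeros(1, 10);
    for k = 1:10
        x = mod(a*x + c, m);    % next member of the sequence
        seq(k) = x;
    end
    disp(seq)                   % 13 14 27 10 6 16 22 7 29 5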

History (from C. Moler, NCM)

- IBM's Scientific Subroutine Package (SSP), used on mainframes in the 1960s, included the random generator RANDU with a = 65539, c = 0, and m = 2^31
- arithmetic mod 2^31 is done quickly with 32-bit words
- multiplication by a = 2^16 + 3 can be done quickly with a shift and a short add
- but notice that (mod m): x_{k+2} = 6 x_{k+1} − 9 x_k
- ...a strong correlation among three successive integers

History (from C. Moler, NCM)

- Matlab used a = 7^5, c = 0, and m = 2^31 − 1 for a while
- the period is m − 1
- this is no longer sufficient

Linear Congruential Generators (LCGs) – Summary

- sensitive to the choice of a and c
- be careful with the supplied random functions on your system
- the period is at most m
- division by m is needed to generate floating-point numbers in [0, 1)

What's used?

Two popular methods:

1. rand() in Unix uses an LCG with a = 1103515245, c = 12345, m = 2^31.

2. Method of Marsaglia (period ≈ 2^1430):

   Initialize x_0, ..., x_3 and c to random values given a seed
   s = 2111111111 x_{n-4} + 1492 x_{n-3} + 1776 x_{n-2} + 5115 x_{n-1} + c
   x_n = s mod 2^32
   c = floor(s / 2^32)

In general, the digits in random numbers are not themselves random... some patterns recur much more often.
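A Matlab sketch of Marsaglia's recurrence (our own illustration, not library code). It uses uint64 so the intermediate sum s, at most about 9.1e18, stays below the uint64 limit and is computed exactly; a real implementation would seed x and c properly:

    % Marsaglia's generator: s = 2111111111*x(n-4) + 1492*x(n-3)
    %                            + 1776*x(n-2) + 5115*x(n-1) + c
    coef = uint64([2111111111, 1492, 1776, 5115]);
    x = uint64([1; 2; 3; 4]);   % x_{n-4} ... x_{n-1}: toy seed values
    c = uint64(0);              % the carry
    M = uint64(2^32);
    out = zeros(1, 8, 'uint64');
    for n = 1:8
        s  = coef(1)*x(1) + coef(2)*x(2) + coef(3)*x(3) + coef(4)*x(4) + c;
        xn = mod(s, M);         % x_n = s mod 2^32
        c  = (s - xn) / M;      % c = floor(s / 2^32); this division is exact
        x  = [x(2:4); xn];      % slide the four-term lag window
        out(n) = xn;
    end
    disp(out)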

Fibonacci

Fibonacci generators produce floating-point random numbers directly, using differences, sums, or products. A typical subtractive generator:

x_k = x_{k-17} - x_{k-5}

- with "lags" of 17 and 5
- lags must be chosen very carefully
- negative results need fixing (e.g., by adding 1; see the sketch below)
- more storage is needed than for congruential generators
- no division needed
- very, very good statistical properties
- long periods, since repetition of a single value does not imply a period
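A Matlab sketch of this subtractive generator on floating-point values in (0, 1), with the standard fix-up of adding 1 when a difference goes negative (initializing the 17-value buffer from rand here is just for illustration; a real generator seeds it differently):

    % Lagged Fibonacci generator: x_k = x_{k-17} - x_{k-5}
    buf = rand(17, 1);             % the 17 most recent values
    vals = zeros(100, 1);
    for k = 1:100
        v = buf(1) - buf(13);      % x_{k-17} - x_{k-5}
        if v < 0
            v = v + 1;             % keep the result in (0, 1)
        end
        vals(k) = v;
        buf = [buf(2:17); v];      % slide the window
    end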

Discrete random variables

random variable: x
- values: x_1, ..., x_n (the set of values is called the sample space)
- probabilities: p_1, ..., p_n with \sum_{i=1}^{n} p_i = 1

Example: throwing a die (1-based index)
- values: x_1 = 1, x_2 = 2, ..., x_6 = 6
- probabilities: p_i = 1/6

Expected value and variance

expected value: average value of the variable

E[x] = \sum_{j=1}^{n} x_j p_j

variance: variation from the average

\sigma^2[x] = E[(x - E[x])^2] = E[x^2] - (E[x])^2

Standard deviation: the square root of the variance, denoted by σ[x]

Example: throwing a die
- expected value: E[x] = (1 + 2 + ··· + 6)/6 = 3.5
- variance: σ^2[x] = E[x^2] − (E[x])^2 = 91/6 − 3.5^2 ≈ 2.9167

Estimated E[x]

to estimate the expected value, choose a set of random values based on the probability and average the results

E[x] \approx \frac{1}{N} \sum_{i=1}^{N} x_i

Bigger N gives better estimates.

Example: throwing a die
- 3 rolls: 3, 1, 6 → E[x] ≈ (3 + 1 + 6)/3 ≈ 3.33
- 9 rolls: 3, 1, 6, 2, 5, 3, 4, 6, 2 → E[x] ≈ (3 + 1 + 6 + 2 + 5 + 3 + 4 + 6 + 2)/9 ≈ 3.56
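This is easy to check in Matlab with randi:

    % Estimate E[x] for a fair die by averaging N rolls
    N = 100000;
    rolls = randi(6, N, 1);    % N uniform integers from 1 to 6
    mean(rolls)                % approaches 3.5 as N grows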

Law of large numbers

By taking N to ∞, the error between the estimate and the expected value is statistically zero. That is, the estimate will converge to the correct value:

P\left( E[x] = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} x_i \right) = 1

This is the basic idea behind Monte Carlo estimation.

Continuous random variable and density function

random variable: x
- values: x ∈ [a, b] (the set of values is called the sample space, which in this case is [a, b])
- probability: density function ρ(x) > 0 with \int_a^b \rho(x) \, dx = 1
- probability that the variable has a value between (c, d) ⊂ [a, b]: \int_c^d \rho(x) \, dx

Uniformly distributed random variable

- ρ(x) is constant
- \int_a^b \rho(x) \, dx = 1 means ρ(x) = 1/(b − a)

Continuous extensions to expectation and variance

Expected value

E[x] = \int_a^b x \rho(x) \, dx

E[g(x)] = \int_a^b g(x) \rho(x) \, dx

Variance of a random variable and of a function of a random variable:

\sigma^2[x] = \int_a^b (x - E[x])^2 \rho(x) \, dx = E[x^2] - (E[x])^2

\sigma^2[g(x)] = \int_a^b (g(x) - E[g(x)])^2 \rho(x) \, dx = E[g(x)^2] - (E[g(x)])^2

Estimating the expected value of some g(x)

E[g(x)] \approx \frac{1}{N} \sum_{i=1}^{N} g(x_i)

The law of large numbers is valid for the continuous case too.
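A small Matlab illustration with g(x) = x^2 and x uniform on (0, 1), where the exact value is \int_0^1 x^2 \, dx = 1/3:

    % Monte Carlo estimate of E[g(x)] with g(x) = x^2, x ~ U(0, 1)
    N = 100000;
    x = rand(N, 1);
    mean(x.^2)     % should be close to 1/3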

Over intervals

- Almost all systems provide a uniform random number generator on (0, 1)
- Matlab: rand

If we need a uniform distribution over (a, b), then we modify x_k drawn from (0, 1) by

(b − a) x_k + a

Matlab: generate 100 uniformly distributed values in (a, b):

r = a + (b-a).*rand(100,1)

- Uniformly distributed integers: randi
- Random permutations: randperm

Non-uniform distributions (Not on Exam)

- Sampling non-uniform distributions is much more difficult
- If the cumulative distribution function is invertible, we can generate the non-uniform sample from a uniform one

Consider the probability density function ρ(t) = λe^{−λt}, t > 0, giving the cumulative distribution function

F(x) = \int_0^x \lambda e^{-\lambda t} \, dt = \left[ -e^{-\lambda t} \right]_0^x = 1 - e^{-\lambda x}

Thus, inverting the CDF:

x_k = -\log(1 - y_k)/\lambda

where y_k is uniform in (0, 1). ...this is not so easy in general.
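A Matlab sketch of this inverse-CDF sampling (the exact mean of the exponential distribution is 1/λ, a useful sanity check):

    % Sample the exponential distribution via the inverted CDF
    lambda = 2;
    y = rand(100000, 1);          % uniform samples in (0, 1)
    x = -log(1 - y) / lambda;     % exponential samples with rate lambda
    mean(x)                       % should be close to 1/lambda = 0.5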

Multidimensional extensions

- difficult domains (complex geometries)

Expected value:

E[g(x)] = \int_{x \in \Omega} g(x) \rho(x) \, dx

with Ω ⊂ R^d. Density function for the uniform distribution:

\int_\Omega \rho \, dx = 1 \implies \rho = \frac{1}{V}

where V = \int_\Omega 1 \, dx is the volume of Ω.

Monte Carlo

- Central to PageRank (and many, many other applications in finance, science, informatics, etc.) is that we randomly process something
- What we want to know is what is likely to happen "on average"
- What would happen if we had an infinite number of samples?
- Let's take a look at integration (a discrete limit, in a sense)

(Deterministic) numerical integration

- split the domain into a set of fixed segments
- sum the function values weighted by the segment sizes (a Riemann sum!)

Monte Carlo Integration

For a random sequence x_1, ..., x_n uniformly distributed in (0, 1):

\int_0^1 f(x) \, dx \approx \frac{1}{n} \sum_{i=1}^{n} f(x_i)

f = @(x) exp(x);   % example integrand; f was left unspecified on the slide
n = 100;
x = rand(n,1);
a = f(x);
s = sum(a)/n;

Monte Carlo over random domains

For a uniform random sequence x_1, ..., x_n in Ω:

I = \int_\Omega f(x) \, dx \approx Q_n = V \frac{1}{n} \sum_{i=1}^{n} f(x_i)

where V is the "volume" given by

V = \int_\Omega dx

The law of large numbers guarantees that Q_n → I as n → ∞. Why? Recall that the density for the uniform distribution is ρ = 1/V. By the law of large numbers,

\frac{1}{n} \sum_{i=1}^{n} f(x_i) \approx E[f(x)] = \int_\Omega \rho f(x) \, dx = \frac{1}{V} \int_\Omega f(x) \, dx

Monte Carlo over random domains

Works in any dimension! In one dimension:

\int_a^b f(x) \, dx \approx (b - a) \frac{1}{n} \sum_{i=1}^{n} f(x_i)

where x_1, ..., x_n are uniformly distributed in (a, b).
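A Matlab sketch over an interval (a, b), checked against \int_0^\pi \sin x \, dx = 2:

    % Monte Carlo integration of sin(x) over (0, pi)
    a = 0; b = pi; n = 100000;
    x = a + (b - a)*rand(n, 1);    % uniform samples in (a, b)
    (b - a) * mean(sin(x))         % should be close to 2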

2d example: computing π

Use the unit square [0, 1]^2 with a quarter-circle:

f(x, y) = \begin{cases} 1 & (x, y) \in \text{circle} \\ 0 & \text{else} \end{cases}

A_{\text{quarter circle}} = \int_0^1 \int_0^1 f(x, y) \, dx \, dy = \frac{\pi}{4}

2d example: computing π

Estimate the area of the quarter circle by randomly evaluating f(x, y):

A_{\text{quarter circle}} \approx \frac{1}{N} \sum_{i=1}^{N} f(x_i, y_i)

2d example: computing π

By definition

A_{\text{quarter circle}} = \pi/4

so

\pi \approx \frac{4}{N} \sum_{i=1}^{N} f(x_i, y_i)

2d example: computing π algorithm

input N
call rand in 2d
for i = 1:N
    sum = sum + f(x_i, y_i)
end
sum = 4 * sum / N
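A vectorized Matlab version of the same algorithm:

    % Estimate pi by sampling the unit square
    N = 100000;
    x = rand(N, 1); y = rand(N, 1);
    inside = (x.^2 + y.^2 <= 1);    % 1 when (x, y) is in the quarter circle
    4 * sum(inside) / N             % should be close to pi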

2d example: computing π algorithm

- The expected value of the error is O(\sigma / \sqrt{N})
- convergence does not depend on the dimension
- deterministic integration is very hard in higher dimensions
- deterministic integration is very hard for complicated domains
- Trapezoid rule in dimension d: O(1 / N^{2/d})

Error in Monte Carlo Estimation

The distribution of an average is close to being normal, even when the distribution from which the average is computed is not normal.

What?

Let x_1, ..., x_n be independent random variables drawn from any PDF.

- Each x_i has expected value μ and standard deviation σ
- Consider the average A_n = (x_1 + ··· + x_n)/n
- The expected value of A_n is μ and its standard deviation is σ/\sqrt{n}
- This follows from the Central Limit Theorem

What? The sample mean has an error of σ/\sqrt{n}.
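A quick Matlab experiment showing the σ/\sqrt{n} scaling for uniform (0, 1) samples, where σ = 1/\sqrt{12} ≈ 0.2887:

    % The spread of sample means shrinks like sigma/sqrt(n)
    n = 10000; trials = 500;
    A = mean(rand(n, trials));    % each column mean is one average A_n
    std(A)                        % observed standard deviation of the means
    1/sqrt(12)/sqrt(n)            % predicted sigma/sqrt(n), about 0.0029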

Simulation

- Stochastic simulation mimics physical behavior through random sampling
- ...also known as Monte Carlo simulation (there is no single MC method!)
- Usefulness:
  - nondeterministic processes
  - deterministic models that are too complicated to model directly
  - deterministic models with high dimensionality

Stochastic Simulation

Two requirements for MC:
- knowing which probability distributions are needed
- generating a sufficient quantity of random numbers

The probability distribution depends on the problem (theoretical or empirical evidence). The probability distribution can be approximated well by simulating a large number of trials.

Stochastic Simulation: The Monty Hall Problem

Question: You are on a game show. There are three doors that you have the option of opening. There is a prize behind one of the doors, and there are goats behind the other two. You select one of the doors. One of the remaining two doors is opened to reveal a goat. You are then given the option of opening your door or the other remaining unopened door. Should you switch? Determine the probability that you will win the prize if you switch.

Answer: Yes, always switch. The probability of success is 2/3, which can be shown using conditional probability.

Stochastic Simulation: The Monty Hall Problem

Solution via random simulation:

input N
wins = 0
for i = 1:N
    generate a random permutation of goats and prize
    you randomly choose a door
    if your door has a goat, the host chooses the other door with a goat
    else if your door has the prize, the host randomly chooses one of the doors with a goat
    end
    you switch your choice
    if the switched door has the prize
        wins = wins + 1
    end
end
probability of success = wins/N
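A runnable Matlab version of this simulation (a sketch; the door-numbering scheme is ours):

    % Monty Hall: estimate the probability of winning when you switch
    N = 100000;
    wins = 0;
    for i = 1:N
        prize  = randi(3);                    % door hiding the prize
        choice = randi(3);                    % contestant's initial pick
        opts = setdiff(1:3, [choice prize]);  % goat doors the host may open
        host = opts(randi(numel(opts)));      % host opens one of them
        choice = setdiff(1:3, [choice host]); % switch to the remaining door
        wins = wins + (choice == prize);
    end
    wins / N                                  % should be close to 2/3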
