Introduction to Simulation: 1. Ideas and Examples

Pierre L'Ecuyer, Université de Montréal, Canada

Operations Research Seminars, Zinal, June 2016

We focus on general ideas and methods that apply widely, not on specific applications.

2 Systems, models, and simulation

Model: simplified representation of a system. Modeling: building a model, based on data and other information. Simulation program: computerized (runnable) representation of the model. Some environments do not distinguish modeling and programming (no explicit programming). Simulation: running the simulation program and observing the results. Much more convenient and powerful than experimenting with the real system.

Deterministic vs stochastic model. Static vs dynamic model.


3 Analytic models vs simulation models

Analytic model: provides a simple formula. Usually hard to derive and requires unrealistic simplifying assumptions, but easy to apply and insightful. Examples: the Erlang formulas in queueing theory, the Black-Scholes formula for option pricing in finance, etc.

Example: M/M/s queue, arrival rate λ, service rate µ, s > λ/µ servers, steady-state. Let W be the waiting time of a random customer. There are exact expressions for E[W], P[W = 0], P[W > x], etc. Conditionally on W > 0, W is exponential with mean 1/(sµ − λ).

Simulation model: can be much more detailed and realistic. Requires reliable data and information.

4

[Diagram: real-life system → (modeling, data) → conceptual model → (programming) → computerized model, with verification and validation at each step, supporting decision making.]

5 Examples: A stochastic activity network

Gives precedence relations between activities. Activity k has random duration Yk (also the length of arc k) with known cumulative distribution function (cdf) Fk(y) := P[Yk ≤ y]. Project duration T = (random) length of the longest path from source to sink.

[Network diagram: nodes 0 (source) to 8 (sink); arc k has random length Yk, for k = 0, ..., 12.]

6 Simulation

Repeat n times: generate Yk with cdf Fk for each k, then compute T . From those n realizations of T , we can estimate E[T ], P[T > x] for some x, and compute confidence intervals on those values. Better: construct a histogram to estimate the distribution of T .
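As a sketch, the procedure above takes only a few lines of code. The arc structure of the slides' 9-node network lives in the figure, so the path list and mean durations below are purely illustrative assumptions (here all arcs are exponential), not the slides' data:

```python
import random

# Hypothetical example network: three source-to-sink paths over five arcs.
PATHS = [[0, 1], [0, 2, 3], [4, 3]]    # arc indices along each path
MEANS = [13.0, 5.5, 7.0, 5.2, 16.5]    # mean duration of each arc

def simulate_T(rng):
    """One realization of the project duration T = length of the longest path."""
    Y = [rng.expovariate(1.0 / mu) for mu in MEANS]   # Y_k ~ Expon with mean mu_k
    return max(sum(Y[k] for k in path) for path in PATHS)

def estimate(n, x, seed=12345):
    """Estimate E[T] and P[T > x] from n independent runs."""
    rng = random.Random(seed)
    ts = [simulate_T(rng) for _ in range(n)]
    return sum(ts) / n, sum(t > x for t in ts) / n
```

Replacing the path list and distributions with those of a real project gives the histogram-based analysis described above.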

Numerical illustration: Yk ∼ N(µk, σk²) for k = 1, 2, 4, 11, 12, and Yk ∼ Expon(1/µk) otherwise.

µ0, ..., µ12: 13.0, 5.5, 7.0, 5.2, 16.5, 14.7, 10.3, 6.0, 4.0, 20.0, 3.2, 3.2, 16.5. We may pay a penalty if T > 90, for example.


7 Naive idea: replace each Yk by its expectation. This gives T = 48.2.

Results of an experiment with n = 100 000. A histogram of the values of T gives much more information than a confidence interval on E[T] or P[T > x] or ξ̂0.99. Values range from 14.4 to 268.6; 11.57% exceed x = 90.

[Histogram of the n values of T, frequency ×10³ over 0 to 200: mean = 64.2; deterministic value T = 48.2; x = 90; ξ̂0.99 = 131.8.]

8 Collisions in a hashing system

In applied probability and statistics, Monte Carlo simulation is often used to estimate a distribution that is too hard to compute exactly.

Example: We throw M points randomly and independently into k squares. C = number of collisions (when a point falls in square already occupied). What is the distribution of C?

Example: k = 100, M = 10, C = 3.

[Figure: 10 points thrown into a 10 × 10 grid of squares; 3 collisions.]

9

One application: hashing. Large set of identifiers Φ, e.g., 64-bit numbers, of which only a few are used. Hashing function h : Φ → {1, ..., k} (the set of physical addresses), with k ≪ |Φ|.

Another application: testing random number generators.

Simplified analytic model: We consider two cases: M = m (constant) and M ∼ Poisson(m). Theorem: If k → ∞ and m²/(2k) → λ (a constant), then C ⇒ Poisson(λ). That is, in the limit, P[C = x] = e^(−λ) λ^x / x! for x = 0, 1, 2, ....

But how good is this approximation for finite k and m?

10

Simulation: Generate M indep. integers uniformly over {1,..., k} and compute C. Repeat n times independently and compute the empirical distribution of the n realizations of C. Can also estimate E[C], P[C > x], etc.
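This simulation is easy to code directly; a minimal sketch using only the standard library (fixed M = m only, since Python's `random` module has no Poisson generator):

```python
import random

def collisions(m, k, rng):
    """Throw m points uniformly into k squares; return the number of collisions C."""
    occupied = set()
    c = 0
    for _ in range(m):
        square = rng.randrange(k)
        if square in occupied:
            c += 1          # point falls in an already occupied square
        else:
            occupied.add(square)
    return c

def empirical_mean(n, m, k, seed=12345):
    """Average of n independent realizations of C (fixed M = m)."""
    rng = random.Random(seed)
    return sum(collisions(m, k, rng) for _ in range(n)) / n
```

With k = 10 000 and m = 400, the empirical mean should come out near the 7.87 reported on the next slide (slightly below the Poisson approximation λ = 8).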


Illustration: k = 10 000 and m = 400. Analytic approximation: C ∼ Poisson with mean λ = m²/(2k) = 8. We simulated n = 10⁷ replications (or runs). Empirical mean and variance: 7.87 and 7.47 for M = m = 400 (fixed); 7.89 and 8.10 for M ∼ Poisson(400).

11

c      M fixed    M ∼ Poisson   Poisson distrib.
0         3181         4168        3354.6
1        25637        32257       26837.0
2       105622       122674      107348.0
3       288155       316532      286261.4
4       587346       614404      572522.8
5       948381       957951      916036.6
6      1269427      1247447     1221382.1
7      1445871      1397980     1395865.3
8      1434562      1377268     1395865.3
9      1251462      1207289     1240769.1
10      978074       958058      992615.3
11      688806       692416      721902.0
12      442950       459198      481268.0
13      260128       282562      296164.9
14      141467       162531      169237.1
15       71443        86823       90259.7
16       33224        43602       45129.9
17       14739        20827       21237.6
18        5931         9412        9438.9
19        2378         3985        3974.3
20         791         1629        1589.7
21         310          592         605.6
22          79          264         220.2
23          26           83          76.6
24           5           33          25.5
≥25          5           15          11.7

12

What if we take smaller values? k = 100 and m = 40. We still have λ = m²/(2k) = 8. We simulated n = 10⁷ replications (or runs). Empirical mean, 95% confidence interval, variance: 6.90, (6.896, 6.899), 4.10, for M = m = 40 (fixed); 7.03, (7.032, 7.035), 8.48, for M ∼ Poisson(40).

13

c      M fixed    M ∼ Poisson   Poisson distrib.
0         1148        17631        3354.6
1        14355       100100       26837.0
2        84046       294210      107348.0
3       302620       600700      286261.4
4       744407       951184      572522.8
5      1340719      1238485      916036.6
6      1828860      1379400     1221382.1
7      1941207      1353831     1395865.3
8      1634465      1194915     1395865.3
9      1103416       956651     1240769.1
10      602186       705478      992615.3
11      267542       485353      721902.0
12       97047       309633      481268.0
13       29161       187849      296164.9
14        7195       107189      169237.1
15        1363        58727       90259.8
16         224        30450       45129.9
17          35        15115       21237.6
18           4         7271        9438.9
19           0         3311        3974.3
20           0         1468        1589.7
21           0          607         605.6
22           0          258         220.2
23           0          110          76.6
24           0           44          25.5
≥25          0           30          11.7

14

k = 25 and m = 20. Again, λ = m²/(2k) = 8. We make n = 10⁷ runs. Empirical mean, variance: 6.05, 2.16 for M = m = 20 (fixed); 6.23, 8.21 for M ∼ Poisson(20).

15

c      M fixed    M ∼ Poisson   Poisson distrib.
0          135        49506        3354.6
1         4565       220731       26837.0
2        52910       528292      107348.0
3       314067       905283      286261.4
4      1050787      1224064      572522.9
5      2130621      1400387      916036.6
6      2692369      1392858     1221382.1
7      2170849      1237244     1395865.3
8      1124606       997974     1395865.3
9       371366       743393     1240769.2
10       77190       513360      992615.3
11        9768       332413      721902.0
12         738       203692      481268.0
13          29       118015      296164.9
14           0        65363      169237.1
15           0        34266       90259.8
16           0        17384       45129.9
17           0         8600       21237.6
18           0         3984        9438.9
19           0         1861        3974.2
20           0          768        1589.7
21           0          345         605.6
22           0          128         220.2
23           0           56          76.6
24           0           15          25.5
≥25          0           18          11.7


16 A tandem queue.

[Diagram: m single-server FIFO queues in series: queue 1 → server 1 → queue 2 → server 2 → ··· → queue m → server m.]

Station j: single-server FIFO queue with total capacity cj (c1 = ∞). A customer cannot leave server j when queue j + 1 is full; it is blocked.

Ti = arrival time of customer i at the first queue, for i ≥ 1, and T0 = 0;
Ai = Ti − Ti−1 = time between arrivals i − 1 and i;
Wj,i = waiting time in queue j for customer i;
Sj,i = service time at station j for customer i;
Bj,i = blocked time at station j for customer i;
Dj,i = departure time from station j for customer i.


The first customer arrives at time T1 = A1, leaves station 1 at D1,1 = T1 + S1,1 and starts its service at station 2, leaves station 2 at time D2,1 = D1,1 + S2,1 to join queue 3, etc. Customer 2 arrives at time T2 = T1 + A2, starts service at queue 1 at max(T2, D1,1), and leaves station 1 at time max(T2, D1,1) + S1,2 if space is available at station 2, etc.


17 Recurrence equations. Suppose we can generate the successive interarrival times Ai and the service times Sj,i. Then we can compute the Ti's easily and the departure times Dj,i via:

Dj,i = max[Dj−1,i + Sj,i , Dj,i−1 + Sj,i , Dj+1,i−cj+1 ] for 1 ≤ j ≤ m and i ≥ 1, where D0,i = Ti , Dj,i = 0 for i ≤ 0, and Dm+1,i = 0 for all i. Then:

Wj,i = max[0, Dj,i−1 − Dj−1,i ],

Bj,i = Dj,i − Dj−1,i − Wj,i − Sj,i .

For infinite buffer sizes (no blocking), we always have Bj,i = 0 and

Dj,i = max[Dj−1,i , Dj,i−1] + Sj,i .

A single queue with no blocking: the GI/GI/1 model obeys the Lindley recurrence:

W1,i = max[0, D1,i−1 − Ti] = max[0, W1,i−1 + S1,i−1 − Ai].
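The Lindley recurrence translates directly into code; a minimal sketch, indexing customers from 0 so that A[i] is the interarrival time between customers i−1 and i (A[0] is unused, and the first customer waits 0):

```python
def lindley_waits(A, S):
    """Waiting times for a single FIFO queue with no blocking, via
    W[i] = max(0, W[i-1] + S[i-1] - A[i])."""
    W = [0.0] * len(S)
    for i in range(1, len(S)):
        W[i] = max(0.0, W[i - 1] + S[i - 1] - A[i])
    return W
```

For example, with interarrival times of 1 and service times of 3, each customer waits 2 time units longer than the previous one.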


18 Algorithm: Simulating Nc customers with production blocking.

Let T0 = 0 and D0,0 = D1,0 = 0;
For j = 2, ..., m, for i = −cj + 1, ..., 0, let Dj,i = 0;
For i = 1, ..., Nc:
    Generate Ai from its distribution and let D0,i = Ti = Ti−1 + Ai;
    Let Wi = Bi = 0;
    For j = 1, ..., m:
        Generate Sj,i from its distribution;
        Let Dj,i = max[Dj−1,i + Sj,i, Dj,i−1 + Sj,i, Dj+1,i−cj+1];
        Let Wj,i = max[0, Dj,i−1 − Dj−1,i] and Wi = Wi + Wj,i;
        Let Bj,i = Dj,i − Dj−1,i − Wj,i − Sj,i and Bi = Bi + Bj,i;
Compute and return the averages W̄Nc = (W1 + ··· + WNc)/Nc and B̄Nc = (B1 + ··· + BNc)/Nc.

What if we want to simulate all customers who arrive before time T? Replace "For i = 1, ..., Nc" by "For (i = 1; Ti < T; i++)".


What we just saw is production blocking. In communication blocking, service at station j starts only when queue j + 1 is not full. Then:

Dj,i = max[Dj−1,i, Dj,i−1, Dj+1,i−cj+1] + Sj,i.

When things get more complicated: discrete-event simulation.


19 Example: Pricing a financial derivative. Consider a financial contract based on a single asset (e.g., one share of a stock), whose market price at time t is S(t). We assume {S(t), t ≥ 0} is a stochastic process with known probability law (in practice, estimated from data).

Suppose the contract owner receives a net payoff g(S(t1), ..., S(td)) at time T, for some function g : R^d → R, where t1, ..., td are fixed observation times, with 0 ≤ t1 < ··· < td = T. Suppose also that money left in the bank account (or borrowed, when negative) yields interest at compounded rate r (the short rate). This means that one dollar placed in the account at time 0 is worth e^(rt) dollars at time t. Equivalently, an amount Y to be received in t units of time is worth e^(−rt) Y today. That is, it must be multiplied by the discount factor e^(−rt).


Our aim is to estimate the fair (present) value of the financial contract. To do this, it is common to assume a no-arbitrage market, which means that one cannot make more money than with the interest rate r without taking risks. That is, if there exists an investment strategy with which one dollar at time 0 can be worth more than e^(rt) dollars at time t with positive probability, then there must be a positive probability that it is worth less than e^(rt).


20 Under this no-arbitrage assumption, it can be proved (this is beyond the scope of this course) that the present value (or fair price) of the financial contract (at time 0), when S(0) = s0, can be written as

v(s0, T) = E∗[e^(−rT) g(S(t1), ..., S(td))],

where E∗ is the mathematical expectation under a certain probability measure P∗ called the risk-neutral measure. Changing the probability measure amounts to a change of variable to account for our utility function.

Under this measure P∗, the process {e^(−rt) S(t), t ≥ 0} is a martingale, which means that for any t ≥ 0 and 0 < δ ≤ T − t, we have

E∗[e^(−rδ) S(t + δ)] = S(t).

The measure P∗ generally differs from the true measure under which the process {S(t), t ≥ 0} evolves in real life. Note that a risk-neutral measure does not always exist, and is not always unique. Here, we assume that it does exist and is known (or has been estimated).


21

Except for very simple models, we do not know how to compute v(s0, T) exactly. But we can estimate it by Monte Carlo if we know how to generate a path S(t1), ..., S(td) under P∗ and how to compute g. We simulate n independent copies of X = e^(−rT) g(S(t1), ..., S(td)), say X1, ..., Xn, and our estimator is the average X̄n = (X1 + ··· + Xn)/n.

What are the appropriate models for S? In the popular model of Black and Scholes (1973), S(t) is assumed to evolve as a geometric Brownian motion (GBM), which means that its logarithm is a Brownian motion. This implies that under P∗, one has

S(t) = S(0) e^((r − σ²/2)t + σB(t)),

where r is the short rate, σ is a constant called the volatility, and B(·) is a standard Brownian motion: for any t2 > t1 ≥ 0, B(t2) − B(t1) is a normal random variable with mean 0 and variance t2 − t1, independent of the behavior of B(·) outside the interval [t1, t2].

22

A European call option gives the right to buy one unit of the asset at price K (the strike price) at time T (the expiration date). The net payoff at time T is

g(S(T )) = max[0, S(T ) − K].

In this simple case, under the GBM model, the Black-Scholes formula gives:

v(s0, T) = s0 Φ(−z0 + σ√T) − K e^(−rT) Φ(−z0), where

z0 = [ln(K/s0) − (r − σ²/2)T] / (σ√T)

and Φ is the standard normal cdf.

23 When g or S is more complicated, there is usually no analytic formula for v(s0, T). For a discretely-monitored Asian call option, for example, we have

 d  1 X g(S(t ),..., S(t )) = max 0, S(t ) − K , 1 d  d j  j=1 and no closed-form formula is available for v(s0, T ). We can estimate it by simulation, as follows (for a general g): Algorithm: Option pricing under a GBM model by Monte Carlo For i = 1,..., n, Let t0 = 0 and B(t0) = 0; For j = 1,..., d, Generate Z ∼ N(0, 1); j √ Let B(tj ) = B(tj−1) + tj − tj−1Zj ;  2  Let S(tj ) = s0 exp (r − σ /2)tj + σB(tj ) ; −rT Compute Xi = e g(S(t1),..., S(td )); Compute and return the average X¯n as an estimator of v(s0, T ), with a c.i.; Optional: Plot a histogram of X1,...,DraftXn. 24

Numerical illustration: discretely-monitored Asian option with d = 12, T = 1 (one year), tj = j/12 for j = 0, ..., 12, K = 100, s0 = 100, r = 0.05, σ = 0.5. We perform n = 10⁶ independent simulation runs. Mean: 13.1. Max = 390.8. In 53.47% of cases, the payoff is 0. Histogram of the 46.53% positive values:

[Histogram of the positive payoffs, frequency ×10³, over the range 0 to 150.]

25 Other models

Better models for S: variance-gamma, Heston, etc.

Several types of payoff functions g.

26 Improving the accuracy when estimating the mean

If we replace the arithmetic average by a geometric average, we obtain

 d  −rT Y 1/d C = e max 0, (S(tj )) − K , j=1

whose expectation ν = E[C] has a closed-form formula. One can then use C as a control variate to estimate E[X]: replace the estimator X by the "corrected" version X − β(C − ν), for some well-chosen β. This can provide a huge variance reduction, by factors of over a million in some examples.

27 American options

Here the option owner can decide when to exercise the option. We now have a dynamic stochastic (stopping-time) optimization problem. Solution methods involve variants of approximate dynamic programming combined with simulation. Interesting and challenging. In some cases (e.g., bonds with options), both the issuer and the owner can decide dynamically when to exercise the option.

28 Discrete choice with multinomial mixed logit probability, maximum likelihood estimation

We want to estimate a model of choices between alternatives, e.g., when users buy a product, choose a route or a transportation mode, choose a restaurant, etc.

Utility of alternative j for individual q is

Uq,j = Σ_{ℓ=1}^s βq,ℓ xq,j,ℓ + εq,j = βq^t xq,j + εq,j, where

βq = (βq,1, ..., βq,s)^t gives the tastes of individual q,
xq,j = (xq,j,1, ..., xq,j,s)^t gives the attributes of alternative j for individual q,
εq,j is noise: Gumbel with mean 0 and scale parameter λ = 1.

Individual q selects the alternative yq with the largest utility Uq,j. We can observe the xq,j and the choices yq, but not the rest.


29

Logit model: for βq fixed, j is chosen with probability

Lq(j | βq) = exp[βq^t xq,j] / Σ_{a∈A(q)} exp[βq^t xq,a],

where A(q) is the set of alternatives available to q.
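For a fixed βq this probability is straightforward to evaluate; a small sketch, in which x maps each available alternative in A(q) to its attribute vector (the max-shift is a standard guard against overflow in the exponentials):

```python
import math

def logit_prob(beta, x, j):
    """L_q(j | beta): softmax of the linear utilities beta^t x_a over a in A(q)."""
    utils = {a: sum(b * v for b, v in zip(beta, xa)) for a, xa in x.items()}
    mx = max(utils.values())                     # stabilize the exponentials
    ex = {a: math.exp(u - mx) for a, u in utils.items()}
    return ex[j] / sum(ex.values())
```

The probabilities sum to 1 over A(q), and alternatives with higher utility get higher probability.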


Mixed logit. For a random individual, suppose βq is random with density fθ, which depends on an (unknown) parameter vector θ. We want to estimate θ from the data (the xq,j and yq). Given θ, the unconditional probability of choosing j is

pq(j, θ) = ∫ Lq(j | β) fθ(β) dβ.

It depends on A(q), j, and θ.


30 Maximum likelihood: Maximize the log of the joint probability of the sample, w.r.t. θ:

ln L(θ) = ln ∏_{q=1}^m pq(yq, θ) = Σ_{q=1}^m ln pq(yq, θ).

There is no formula for pq(j, θ), but we can estimate it by simulation: generate n realizations of β from fθ, say βq^(1)(θ), ..., βq^(n)(θ), and estimate pq(yq, θ) by

p̂q(yq, θ) = (1/n) Σ_{i=1}^n Lq(yq | βq^(i)(θ)).

Then we can find the maximizer θ̂ of Σ_{q=1}^m ln p̂q(yq, θ) w.r.t. θ. The latter is a deterministic nonlinear nonconvex (difficult) optimization problem. It is also biased! Simulation is usually combined with bias reduction and variance reduction methods to improve the accuracy.

31 Monte Carlo to estimate an expectation

Suppose we want to estimate the mathematical expectation (the mean) of some random variable X: µ = E[X]. Monte Carlo estimator:

X̄n = (X1 + ··· + Xn)/n,

where X1, ..., Xn are independent replicates of X. We have E[X̄n] = µ and Var[X̄n] = σ²/n = Var[X]/n.


32 Convergence Theorem. Suppose σ² < ∞. When n → ∞:

(i) Strong law of large numbers: lim_{n→∞} µ̂n = µ with probability 1.

(ii) Central limit theorem (CLT): √n (µ̂n − µ)/σ ⇒ N(0, 1), i.e.,

lim_{n→∞} P[√n (µ̂n − µ)/σ ≤ x] = Φ(x) = P[Z ≤ x]

for all x ∈ R, where Z ∼ N(0, 1) and Φ(·) is its cdf. Property (ii) still holds if we replace σ² by its unbiased estimator

Sn² = (1/(n − 1)) Σ_{i=1}^n (Xi − X̄n)² = (1/(n − 1)) [Σ_{i=1}^n Xi² − n X̄n²],

where Xi = f(Ui). We have √n (µ̂n − µ)/Sn ⇒ N(0, 1).

33

For large n and confidence level 1 − α, we have

P[µ̂n − µ ≤ x Sn/√n] = P[√n (µ̂n − µ)/Sn ≤ x] ≈ Φ(x).

Confidence interval at level 1 − α (we want Φ(x) = 1 − α/2):

(µ̂n ± zα/2 Sn/√n), where zα/2 = Φ⁻¹(1 − α/2).

Example: zα/2 ≈ 1.96 for α = 0.05.

[Figure: standard normal density; central area 1 − α between −zα/2 and zα/2, area α/2 in each tail.]

The width of the confidence interval is asymptotically proportional to σ/√n, so it converges as O(n^{−1/2}). Relative error: σ/(µ√n). We want a small σ. To add one more decimal digit of accuracy, we must multiply n by 100.

Warning: If the Xi have an asymmetric law, these confidence intervals can have very bad coverage (convergence can be very slow).
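As an illustration (my own sketch, not from the slides), the estimator X̄n, the variance estimator Sn², and the CLT-based interval take a few lines of Python; the integrand with E[U²] = 1/3 and the sample size are arbitrary choices:

```python
import math
import random

def mc_confidence_interval(sample_x, n, z=1.96):
    """Crude Monte Carlo: sample mean, unbiased variance S_n^2, CLT-based CI."""
    xs = [sample_x() for _ in range(n)]
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)   # S_n^2
    half = z * math.sqrt(s2 / n)                      # z_{alpha/2} * S_n / sqrt(n)
    return mean, (mean - half, mean + half)

# Estimate mu = E[U^2] = 1/3 for U ~ U(0, 1).
random.seed(42)
mean, ci = mc_confidence_interval(lambda: random.random() ** 2, n=100_000)
```

The interval width shrinks as O(n^{−1/2}), exactly as stated above.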

34

Example: Stochastic activity network. For µ = P[T > x], we have

Xi = I[Ti > x],
µ̂ = X̄n = (1/n) Σ_{i=1}^n Xi = Y/n,
Sn² = (1/(n−1)) Σ_{i=1}^n (Xi − Y/n)² = Y(1 − Y/n)/(n − 1).

Confidence interval at 95%: (Y/n ± 1.96 Sn/√n). Reasonable if µ is not too close to 0 or 1, i.e., if E[Y] = nµ is far from both 0 and n. In fact, Y ∼ Binomial(n, µ).

35

Suppose n = 1000 and we observe Y = 882. Then:
X̄n = 882/1000 = 0.882;
Sn² = X̄n(1 − X̄n) n/(n − 1) ≈ 0.1042.

This gives the 95% confidence interval:

(X̄n ± 1.96 Sn/√n) ≈ (0.882 ± 0.020) = (0.862, 0.902).

Thus, our estimator of µ has two significant digits: µ ≈ 0.88. The “2” in 0.882 is not significant.

36 Simulation efficiency

We define the efficiency of the estimator by

Eff(X) = 1/(c(X) · Var[X]),

where c(X) is the (expected) computing cost of X. This measure does not depend on the computing budget:

Eff(X̄n) = 1/(n c(X) · Var[X]/n) = Eff(X).

With a 10-fold efficiency improvement, we need a computing budget 10 times smaller for the same accuracy.

In the presence of bias β = E[X] − µ, we define

Eff(X) = 1/(c(X) · MSE(X)) = 1/(c(X) · (Var[X] + β²)).

In this case, Eff(X̄n) depends on n.

37 Other sources of error

The confidence intervals based on the previous CLTs only take the simulation error (or noise) into account. They ignore the approximations made in building the model and the errors made in estimating the model parameters. Ideally, confidence intervals should take all sources of error into account.

38 CI for a function of several means

Want a CI on g(µ) where µ = (µ1, ..., µd) and g is smooth but generally nonlinear. We estimate µj by Yjn.

If Yn = (Y1n, ..., Ydn) → µ = (µ1, ..., µd) and g : R^d → R is continuous, then g(Yn) → g(µ).

Delta theorem. Suppose g : R^d → R is continuously differentiable in a neighborhood of µ, with gradient ∇g. If r(n)(Yn − µ) ⇒ Y when n → ∞, then

r(n)(g(Yn) − g(µ)) ⇒ (∇g(µ))ᵗ Y when n → ∞.

Idea of the proof: Taylor series approximation.

39

Corollary. If √n (Yn − µ) ⇒ Y ∼ N(0, Σy) when n → ∞ (e.g., if Yn is an average and obeys the CLT), then we have the CLT:

√n (g(Yn) − g(µ))/σg ⇒ N(0, 1) when n → ∞,

where

σg² = Var[(∇g(µ))ᵗ Y] = E[((∇g(µ))ᵗ Y)²] = (∇g(µ))ᵗ E[Y Yᵗ] ∇g(µ) = (∇g(µ))ᵗ Σy ∇g(µ).

Examples: (a) g(µ) = ln µ where µ = µ1 > 0; (b) g(µ1, µ2) = µ2/µ1; (c) Var[X] = g(µ1, µ2) = µ2 − µ1² where µk = E[X^k]; (d) Cov[X, Y] = g(µ1, µ2, µ3) where µ1 = E[X], µ2 = E[Y], µ3 = E[XY].

40

Example. (Ratio of expectations.) Let (X1, Y1), ..., (Xn, Yn) be i.i.d. replicates of (X, Y), and suppose we want to estimate ν = E[X]/E[Y] by

ν̂n = X̄n/Ȳn = Σ_{i=1}^n Xi / Σ_{i=1}^n Yi.

This is a biased but strongly consistent estimator of ν.

Let µ1 = E[X], µ2 = E[Y], g(µ1, µ2) = µ1/µ2, σ1² = Var[X], σ2² = Var[Y], and σ12 = Cov[X, Y]. The gradient of g is ∇g(µ1, µ2) = (1/µ2, −µ1/µ2²)ᵗ.

41

By applying the Delta theorem, we obtain

√n (ν̂n − ν) ⇒ (W1, W2) · ∇g(µ1, µ2) = W1/µ2 − W2 µ1/µ2² ∼ N(0, σg²),

where

σg² = (∇g(µ))ᵗ Σy ∇g(µ) = (σ1² + σ2² ν² − 2σ12 ν)/µ2²

can be replaced by its estimator

σ̂g,n² = (σ̂1² + σ̂2² ν̂n² − 2σ̂12 ν̂n)/(Ȳn)²,

in which

σ̂1² = (1/(n−1)) Σ_{j=1}^n (Xj − X̄n)²,
σ̂2² = (1/(n−1)) Σ_{j=1}^n (Yj − Ȳn)²,
σ̂12 = (1/(n−1)) Σ_{j=1}^n (Xj − X̄n)(Yj − Ȳn).

42 Higher order

To reduce the bias, one can also add another term of the Taylor series, if g is twice continuously differentiable:

g(Yn) − g(µ) = (∇g(µ))ᵗ(Yn − µ) + (Yn − µ)ᵗ H(µ)(Yn − µ)/2 + o(‖Yn − µ‖²),

where H is the Hessian matrix of g at µ. But this can increase the variance of the CI.

43 Quantile estimation

If X has cdf F, the q-quantile of F is

ξq = F⁻¹(q) = inf{x : F(x) ≥ q}.

[Figure: cdf x ↦ P[X ≤ x]; the quantile ξq is the point where the cdf reaches level q.]

44

Let X(1), ..., X(n) be an i.i.d. sample of X, sorted, and F̂n the empirical cdf. A simple estimator of ξq is the empirical quantile

ξ̂q,n = F̂n⁻¹(q) = inf{x : F̂n(x) ≥ q} = X(⌈nq⌉).

Theorem.
(i) For each q, ξ̂q,n → ξq w.p.1 when n → ∞ (strongly consistent).
(ii) If X has a density f > 0 continuous in a neighborhood of ξq, then (CLT):

√n (ξ̂q,n − ξq) f(ξq)/√(q(1 − q)) ⇒ N(0, 1) when n → ∞.

This implies Var[ξ̂q,n] ≈ q(1 − q)/(n f(ξq)²).

This CLT shows that this estimator has large variance if f(ξq) is small. To use this CI, we need an estimate of f(ξq): difficult.
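A minimal sketch of the empirical quantile (assuming nothing beyond the definition above; the sample is an arbitrary toy example):

```python
import math

def empirical_quantile(xs, q):
    """Empirical q-quantile: the ceil(n*q)-th order statistic X_(ceil(nq))."""
    xs = sorted(xs)
    n = len(xs)
    return xs[math.ceil(n * q) - 1]   # 1-based order statistic -> 0-based index

# Toy sample 1, 2, ..., 10: the 0.5-quantile is X_(5) = 5.
sample = list(range(1, 11))
median = empirical_quantile(sample, 0.5)
```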

45 Non-asymptotic exact method for a CI on ξq:

Suppose F is continuous at ξq. Let B be the number of observations X(i) smaller than ξq. Since P[X < ξq] = q, B is Binomial(n, q). If 1 ≤ j < k ≤ n, we have

P[X(j) < ξq ≤ X(k)] = P[j ≤ B < k] = Σ_{i=j}^{k−1} (n choose i) q^i (1 − q)^{n−i}.

We choose j and k for this sum to be ≥ 1 − α. (Can be a one-sided or two-sided interval.)

If n is large and q not too close to 0 or 1, we can approximate the binomial by a normal: (B − nq)/√(nq(1 − q)) ≈ N(0, 1). This gives j = ⌊nq + 1 − δ⌋ and k = ⌊nq + 1 + δ⌋, where δ = √(nq(1 − q)) Φ⁻¹(1 − α/2).

To improve the estimator, one can replace F̂n by a smoother function (e.g., an interpolation, a kernel density estimator, etc.). In all cases, the estimator is biased, but this can reduce the bias.

46 Example: Value at Risk (VaR).

Let L be the net loss of an investment portfolio over a given time period. The VaR is the value xq such that P[L > xq] = q. It is the (1 − q)-quantile of L.

Common values: q = 0.01, period = 2 weeks (banks), or months or years (insurance, pension plans).

Use of the VaR can be criticized, since it provides limited information. For example, if x0.01 = 10⁷ dollars, what do we know? A complementary measure is the conditional VaR (CVaR):

E[L | L > xq].

Better: find and display the distribution of L conditional on L > xq.

Models for VaR estimation: often thousands of dependent assets. Asset prices (or their logs) can be defined as linear combinations of factors, whose changes are modeled via a copula (e.g., Student copula) and normal or lognormal marginals. Except for very simple cases, the VaR cannot be computed exactly. It can be estimated by simulation. For very small q: rare-event simulation.

47 Estimating the cdf by the empirical cdf

With simulation, we can estimate not only E[X], but the entire distribution of X. It is characterized by its cdf, defined by F(x) = P[X ≤ x] for all x ∈ R. The empirical cdf

F̂n(x) = (1/n) Σ_{i=1}^n I[xi ≤ x]

jumps by 1/n at each observation xi. Easy to compute and plot after sorting the observations x1, ..., xn into x(1), ..., x(n).

[Figure: step function F̂n rising from 0 to 1, jumping by 1/n at each sorted observation.]

48

If x1, ..., xn are i.i.d. random variables with cdf F, then

Dn = sup_{−∞ < x < ∞} |F̂n(x) − F(x)| → 0 w.p.1 when n → ∞,

and Dn = O_P(n^{−1/2}).
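A minimal sketch of the empirical cdf (the sample values are arbitrary):

```python
import bisect

def empirical_cdf(xs):
    """Return a function x -> F̂_n(x) = (1/n) * #{i : x_i <= x}."""
    sorted_xs = sorted(xs)
    n = len(sorted_xs)
    return lambda x: bisect.bisect_right(sorted_xs, x) / n

Fhat = empirical_cdf([3.0, 1.0, 2.0, 4.0])
```

Sorting once makes each evaluation a binary search, matching the remark that F̂n is easy to compute after sorting.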

49 Estimating the density of X

To estimate the density f of X, a first idea could be to take the derivative (the slope) of a piecewise-linear interpolation F̃n of F̂n.

[Figure: piecewise-linear interpolation F̃n of the empirical cdf over the sorted observations.]

This gives the piecewise-constant empirical density:

f̃n(x) = 1/((n − 1)(x(i+1) − x(i))) for x(i) ≤ x ≤ x(i+1).

Between two observations very close to each other, this density is very large! It is irregular and does not converge to f when n → ∞.

50

Example. Let f be the U(0, 1) density and let

dn = min_{1 ≤ i < n} (x(i+1) − x(i)).

One can show that n(n − 1) dn ⇒ Expon(1) when n → ∞. Then

P[max_{0 ≤ x ≤ 1} f̃n(x) > y] = P[1/((n − 1) dn) > y] = P[n(n − 1) dn < n/y] ≈ 1 − e^{−n/y}.

For fixed y, this converges to 1 exponentially fast when n → ∞. Thus, the density f̃n will have very large peaks when n is large!
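A small numeric illustration of these peaks (my own sketch; sample sizes and seed are arbitrary):

```python
import random

# The maximal height of the piecewise-constant "empirical density" is
# 1/((n-1) d_n), which blows up as n grows, as stated above.
def max_empirical_density(n, seed=7):
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    d_n = min(b - a for a, b in zip(xs, xs[1:]))   # smallest gap between sorted points
    return 1.0 / ((n - 1) * d_n)

peak_small = max_empirical_density(100)
peak_large = max_empirical_density(10_000)
```

Since n(n − 1) dn is roughly Expon(1), the peak height grows roughly linearly in n.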

Sometimes plotting F may be sufficient, but often a good density estimator gives better visual information. How can we estimate the density?

51 Histograms

We have n observations over [a, b]. We partition the interval into m pieces of length h = (b − a)/m. The histogram density (height) f_{h,n} is constant over each interval and proportional to the number of observations in the interval.

To minimize the mean integrated square error (MISE), E[∫_a^b (f_{h,n}(x) − f(x))² dx], we must choose h such that h³ n ∫_a^b (f′(x))² dx ≈ 6. We then have m = (b − a)/h = O(n^{1/3}) and MISE = O(n^{−2/3}). We should double m when n is multiplied by 8.

Better: polygonal interpolation of the histogram. Gives MISE = O(n^{−4/5}) with m = O(n^{1/5}). In this case, we double m when n is multiplied by 32.
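A minimal histogram-density sketch (my own illustration; the data and bin count are arbitrary):

```python
def histogram_density(xs, a, b, m):
    """Histogram density estimator over [a, b] with m equal bins:
    height = (bin count) / (n * h), so the estimate integrates to 1."""
    n = len(xs)
    h = (b - a) / m
    counts = [0] * m
    for x in xs:
        if a <= x <= b:
            i = min(int((x - a) / h), m - 1)   # clamp x == b into the last bin
            counts[i] += 1
    return [c / (n * h) for c in counts]

# 4 observations, 2 bins on [0, 1]: heights integrate to 1.
heights = histogram_density([0.1, 0.2, 0.3, 0.9], 0.0, 1.0, 2)
```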

52 Kernel density estimation

fn(x) = (1/(nh)) Σ_{i=1}^n k((x − xi)/h),

where k is a base density named the kernel and h > 0 is a constant named the smoothing factor. The change of variable y = (x − xi)/h gives

(1/h) ∫_{−∞}^{∞} k((x − xi)/h) dx = ∫_{−∞}^{∞} k(y) dy = 1.

This generalizes to multivariate distributions (d dimensions). Main difficulty: choices of k and (mostly) h.

Simpler and much more common: just plot a histogram.
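A minimal kernel density estimator with a Gaussian kernel (my own sketch; the kernel choice, data, and bandwidth h are arbitrary):

```python
import math

def kde(xs, h):
    """Kernel density estimate f_n(x) = (1/(n h)) * sum_i k((x - x_i)/h),
    with a standard normal kernel k."""
    n = len(xs)
    def k(y):
        return math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)
    return lambda x: sum(k((x - xi) / h) for xi in xs) / (n * h)

f = kde([0.0, 1.0, 2.0], h=0.5)
```

By the change-of-variable argument above, the estimate integrates to 1 for any kernel that is itself a density.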

53 Discrete-event Simulation

Events e0, e1, e2, ... occur at times 0 = t0 ≤ t1 ≤ t2 ≤ ···. Let Si be the state of the system just after event ei. Simulation time: current value of ti.

[Figure: piecewise-constant state trajectory as a function of time, changing at the event times t1, ..., t6.]

(ti, Si) must contain enough information so we can continue running the simulation with it and also the random variables generated afterward, at events ej for j > i.

54

A discrete-event simulation program requires:

▶ A simulation clock.

▶ A list of the planned future events, in chronological order (different possible implementations).

▶ A “procedure” or “method” to execute for each type of event.
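These ingredients can be sketched with a priority queue as the event list (my own illustration of the idea; the M/M/1 logic and the rates are arbitrary choices, not from the slides):

```python
import heapq
import random

def simulate_mm1(lam, mu, horizon, seed=0):
    """Minimal discrete-event skeleton: clock, future-event list, and one
    handler per event type, illustrated on an M/M/1 queue."""
    rng = random.Random(seed)
    events = []                                  # future event list: (time, kind)
    heapq.heappush(events, (rng.expovariate(lam), "arrival"))
    t, queue, served = 0.0, 0, 0
    while events:
        t, kind = heapq.heappop(events)          # advance the simulation clock
        if t > horizon:
            break
        if kind == "arrival":
            queue += 1
            heapq.heappush(events, (t + rng.expovariate(lam), "arrival"))
            if queue == 1:                       # server was idle: start service
                heapq.heappush(events, (t + rng.expovariate(mu), "departure"))
        else:                                    # departure
            queue -= 1
            served += 1
            if queue > 0:
                heapq.heappush(events, (t + rng.expovariate(mu), "departure"))
    return served

served = simulate_mm1(lam=1.0, mu=2.0, horizon=1000.0)
```

The heap keeps the planned events in chronological order, and popping the earliest one plays the role of advancing the simulation clock.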