IMA BOOTCAMP: STOCHASTIC MODELING

JASMINE FOO AND KEVIN LEDER

Note: much of the material in these bootcamp lecture notes is adapted from the book "Introduction to Probability Models" by Sheldon Ross and lecture notes on the web by Janko Gravner for MAT135B.

1. Exponential random variables and their properties

We begin with a review of exponential random variables, which appear frequently in mathematical models of real-world phenomena because of their convenient mathematical properties.

Definition 1. A continuous random variable $X$ is said to have an exponential distribution with parameter $\lambda > 0$ if its probability density function is given by
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0. \end{cases}$$

1.1. Moments. The mean of the exponential distribution, $E[X]$, is given by
$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^\infty x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}.$$
The moment generating function $\phi(t)$ of the exponential distribution is
$$\phi(t) = E[e^{tX}] = \int_0^\infty e^{tx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t}, \qquad t < \lambda.$$
Using this we can derive the variance:
$$\mathrm{Var}(X) = E[X^2] - (E[X])^2 = \frac{d^2}{dt^2}\phi(t)\Big|_{t=0} - \frac{1}{\lambda^2} = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.$$

1.2. Memoryless property. Exponential distributions have the memoryless property. Specifically, $X$ satisfies
$$P(X > s + t \mid X > t) = P(X > s) \quad \text{for all } s, t \ge 0.$$
To see this, note that the property is equivalent to
$$\frac{P(X > s + t,\; X > t)}{P(X > t)} = P(X > s)$$
or
$$P(X > s + t) = P(X > s)\,P(X > t).$$
For $X$ exponentially distributed, we get $e^{-\lambda(s+t)} = e^{-\lambda s} e^{-\lambda t}$, so the memoryless property is satisfied. It can be shown that the exponential distribution is the only continuous distribution that satisfies this property.

Example 1. A store must decide how much product to order to meet next month's demand. The demand is modeled as an exponential random variable with rate $\lambda$. The product costs $c$ dollars per unit, and can be sold at a price of $s > c$ dollars per unit. How much should be ordered to maximize the store's expected profit? Assume that any inventory left over at the end of the month is worthless.

Solution. Let $X$ be the demand.
If the store orders $t$ units of the product, the profit is
$$P = s \min(X, t) - ct.$$
In other words,
$$P = \begin{cases} sX - ct, & X < t \\ (s - c)t, & X \ge t. \end{cases}$$
So
$$E[P] = \big(s\,E[X \mid X < t] - ct\big)P(X < t) + (s - c)t\,P(X \ge t).$$
To calculate this conditional expected value, we note that
$$\frac{1}{\lambda} = E[X] = E[X \mid X < t]\,P(X < t) + E[X \mid X \ge t]\,P(X \ge t) = E[X \mid X < t]\,(1 - e^{-\lambda t}) + (t + E[X])\,e^{-\lambda t},$$
where in the last equality we have used the memoryless property: conditioned on having exceeded $t$, the amount by which $X$ exceeds $t$ is distributed as an exponential random variable with parameter $\lambda$. Thus we have
$$E[X \mid X < t] = \frac{\tfrac{1}{\lambda} - \left(t + \tfrac{1}{\lambda}\right)e^{-\lambda t}}{1 - e^{-\lambda t}}.$$
Plugging this into the expression for $E[P]$ we have
$$E[P] = \left(s\,\frac{\tfrac{1}{\lambda} - \left(t + \tfrac{1}{\lambda}\right)e^{-\lambda t}}{1 - e^{-\lambda t}} - ct\right)(1 - e^{-\lambda t}) + (s - c)t\,e^{-\lambda t} = \frac{s}{\lambda} - \frac{s}{\lambda}e^{-\lambda t} - ct.$$
To maximize the expected profit, we take the derivative with respect to $t$ and set it to zero, $s e^{-\lambda t} - c = 0$, obtaining that the maximal expected profit is attained at
$$t = \frac{1}{\lambda}\log(s/c).$$

Definition 2. The hazard rate (or failure rate) of a random variable with cumulative distribution function $F$ and density $f$ is
$$h(t) = \frac{f(t)}{1 - F(t)}.$$
Consider the following related quantity: conditioned on $X > t$, what is the probability that $X \in (t, t + dt)$?
$$P(X \in (t, t+dt) \mid X > t) = \frac{P(X \in (t, t+dt),\; X > t)}{P(X > t)} = \frac{P(X \in (t, t+dt))}{P(X > t)} \approx \frac{f(t)\,dt}{1 - F(t)}.$$
So, if we think of $X$ as the time to failure of some object, the hazard rate can be interpreted as the conditional probability density of failure at time $t$, given that the object has not failed up to time $t$. The failure rate of an exponentially distributed random variable is constant:
$$h(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda.$$

1.3. Sums and minimums of exponential random variables. Suppose $X_i$, $i = 1, \ldots, n$, are independent identically distributed exponential random variables with parameter $\lambda$. It can be shown (by induction, for example) that the sum $X_1 + X_2 + \cdots + X_n$ has a Gamma distribution with parameters $n$ and $\lambda$ (shape $n$, rate $\lambda$), with pdf
$$f(t) = \lambda e^{-\lambda t}\,\frac{(\lambda t)^{n-1}}{(n-1)!}.$$
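As a quick sanity check (not part of the original notes), the Gamma claim can be verified by simulation. A Gamma$(n, \lambda)$ variable has mean $n/\lambda$ and variance $n/\lambda^2$, so with the arbitrary choices $n = 5$ and $\lambda = 2$ the sample mean and variance of simulated sums should land near $2.5$ and $1.25$:

```python
import random
import statistics

def gamma_sum_stats(n=5, lam=2.0, trials=200_000, seed=0):
    """Simulate sums of n iid exponential(lam) variables and return the
    sample mean and sample variance (should be near n/lam and n/lam**2)."""
    rng = random.Random(seed)
    sums = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(trials)]
    return statistics.mean(sums), statistics.variance(sums)

mean, var = gamma_sum_stats()
print(mean, var)  # should be close to n/lam = 2.5 and n/lam**2 = 1.25
```

This only checks the first two moments, of course; a histogram of `sums` against the Gamma pdf above would give a fuller picture.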
Next, let us define $Y$ to be the minimum of $n$ independent exponential random variables $X_1, \ldots, X_n$ with parameters $\lambda_1, \ldots, \lambda_n$. Then
$$P(Y > x) = \prod_{i=1}^{n} P(X_i > x) = \prod_{i=1}^{n} e^{-\lambda_i x} = e^{-\left(\sum_{i=1}^{n} \lambda_i\right) x}.$$
Thus we see that the minimum, $Y$, is exponentially distributed with parameter equal to the sum of the parameters, $\lambda_1 + \lambda_2 + \cdots + \lambda_n$.

2. Poisson Processes

Definition 3. A counting process is a random process $N(t)$, $t \ge 0$ (e.g. the number of events that have occurred by time $t$), such that
(1) for each $t$, $N(t)$ is a nonnegative integer;
(2) $N(t)$ is nondecreasing in $t$;
(3) $N(t)$ is right continuous.
Thus $N(t) - N(s)$ represents the number of events in $(s, t]$.

Definition 4. A Poisson process with intensity (or rate) $\lambda > 0$ is a counting process $N(t)$ such that
(1) $N(0) = 0$;
(2) it has independent increments: if $(s_1, t_1] \cap (s_2, t_2] = \emptyset$, then $N(t_1) - N(s_1)$ and $N(t_2) - N(s_2)$ are independent;
(3) the number of events in any interval of length $t$ is Poisson$(\lambda t)$.

Recall that a Poisson random variable $Y$ with parameter $\lambda$ takes values $k = 0, 1, 2, \ldots$ with probability
$$P(Y = k) = \frac{\lambda^k e^{-\lambda}}{k!}.$$
In particular, we have the probability mass function for the number of events in an interval of length $h$:
$$P(N(t + h) - N(t) = k) = e^{-\lambda h}\,\frac{(\lambda h)^k}{k!}, \qquad k = 0, 1, 2, \ldots$$
Now, as $h \to 0$ the number of arrivals in a small interval of length $h$ satisfies
$$P(N(h) = 1) = \lambda h + o(h) \quad \text{and} \quad P(N(h) \ge 2) = o(h).$$
Also, from the properties of a Poisson random variable we have $E[N(t)] = \lambda t$.

2.1. Motivation for the Poisson process. Suppose we have a store where the customers arrive at a rate of $\lambda = 4.5$ customers per hour on average, and the store owner wants to find the distribution of the number $X$ of customers who arrive during a particular time period of length $t$ hours. The arrival rate can be recast as 4.5 customers per 3600 seconds, i.e. 0.00125 customers per second.
We can interpret this probabilistically by saying that during each second either 0 or 1 customers arrive, and the probability of an arrival during any single second is 0.00125. Under this setup the Binomial$(3600t,\, 0.00125)$ distribution describes $X$, the number of customers who arrive during a time period of length $t$ hours. Using the Poisson approximation to the Binomial, $X$ is approximately a Poisson random variable with mean $3600t \cdot 0.00125 = 4.5t = \lambda t$, which is consistent with our definition of a Poisson process. Recall that property (3) of a Poisson process is that the number of arrivals in any given time period of length $t$ is Poisson$(\lambda t)$. In addition, by construction the numbers of arrivals in non-overlapping time intervals are independent. This example motivates the definition of a Poisson process, which is often used to model arrival processes of events with a constant average arrival rate.

2.2. Interarrival and waiting time distributions. Consider a Poisson process with rate $\lambda$, and let $T_1$ denote the time of the first event; for $n > 1$ let $T_n$ denote the elapsed time between the $(n-1)$st and $n$th events. The sequence $\{T_n,\ n = 1, 2, \ldots\}$ is called the sequence of interarrival times.

Proposition 1. $T_n$, $n = 1, 2, \ldots$, are independent identically distributed exponential random variables with parameter $\lambda$.

To see this, first consider $T_1$:
$$P(T_1 > t) = P(N(t) = 0) = e^{-\lambda t}.$$
Thus $T_1$ is exponentially distributed with parameter $\lambda$. Then
$$P(T_2 > t) = \int_0^\infty P(T_2 > t \mid T_1 \in (s, s + ds))\,P(T_1 \in (s, s + ds)) = \int_0^\infty P(\text{no events in } (s, s+t])\,\lambda e^{-\lambda s}\,ds = \int_0^\infty e^{-\lambda t}\,\lambda e^{-\lambda s}\,ds = e^{-\lambda t}.$$
So $T_2$ is also exponentially distributed with parameter $\lambda$. From this argument we can also see that $T_2$ is independent of $T_1$, since conditioning on the value of $T_1$ has no impact on the distribution of $T_2$. Repeating the argument for $n \ge 2$ gives the result.
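The per-second discretization from the store example also illustrates Proposition 1 numerically. With arrival probability $p = 0.00125$ in each second, the waiting time to the first arrival is geometric in seconds, which is approximately exponential with mean $1/\lambda = 1/4.5$ hours. A small simulation sketch (the seed and sample size below are arbitrary choices, not from the notes):

```python
import random

def first_arrival_hours(rng, p=0.00125):
    """Flip a p-coin once per second until the first arrival (a geometric
    waiting time); return the waiting time converted to hours."""
    seconds = 1
    while rng.random() >= p:
        seconds += 1
    return seconds / 3600.0

rng = random.Random(1)
waits = [first_arrival_hours(rng) for _ in range(10_000)]
print(sum(waits) / len(waits))  # should be near 1/4.5 ~ 0.222 hours
```

The geometric waiting time has mean $1/p = 800$ seconds $\approx 0.222$ hours, matching the exponential interarrival mean $1/\lambda$.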
Intuitively, remember that the Poisson process has independent increments (so from any point onward it is independent of what happened in the past) and stationary increments (any increment of the same length has the same distribution), so the process has no memory. Hence exponential waiting times until the next arrival are expected.

Next we consider the waiting time until the $n$th event, $S_n$, and its distribution. Note that
$$S_n = \sum_{i=1}^{n} T_i,$$
and since the $T_i$ are iid exponential$(\lambda)$ variables, the variable $S_n$ is Gamma$(n, \lambda)$. In other words, $S_n$ has pdf
$$f_{S_n}(t) = \lambda e^{-\lambda t}\,\frac{(\lambda t)^{n-1}}{(n-1)!}, \qquad t \ge 0.$$
Note that we could also equivalently define a Poisson process by starting with iid exponential interarrival times with parameter $\lambda$, and then defining a counting process by saying that the $n$th event occurs at time
$$S_n = T_1 + T_2 + \cdots + T_n.$$
The resulting counting process is a Poisson process with rate $\lambda$.
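This equivalent construction is easy to simulate. As a sketch (the rate $\lambda = 3$ and horizon $t = 2$ below are arbitrary choices), generate exponential interarrival times, count how many events land in $[0, t]$, and check that the counts have mean and variance close to $\lambda t$, as a Poisson$(\lambda t)$ variable should:

```python
import random

def count_events(rng, lam=3.0, t=2.0):
    """Count events in [0, t] when the nth event occurs at
    S_n = T_1 + ... + T_n with iid exponential(lam) interarrivals T_i."""
    n, s = 0, rng.expovariate(lam)
    while s <= t:
        n += 1
        s += rng.expovariate(lam)
    return n

rng = random.Random(42)
counts = [count_events(rng) for _ in range(100_000)]
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
print(mean, var)  # a Poisson(6) count has mean = variance = 6
```

Matching mean and variance is the signature of a Poisson count; a full check would compare the empirical distribution of `counts` to the Poisson$(6)$ pmf.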