Elements of Probability and Queueing Theory
APPENDIX A
Elements of Probability and Queueing Theory

The primary purpose of this appendix is to summarize, for the reader's convenience, those parts of elementary probability theory, Markov processes, and queueing theory that are most relevant to the subject matter of this book, in notation consistent with it. Mathematical terminology is used only to introduce the language to the intended reader. No pretense at rigor is attempted. Except for some relatively new material and interpretations, the reader can find a more thorough introduction to this material in [Kleinrock 1975, Chapters 2-4] and [Cinlar 1975].

A.1 Elements of Probability Theory

Let (Ω, F, P) be a probability space, where F is a σ-field in Ω and P is a probability measure defined on F. A random variable X is a real-valued (measurable) function defined on the probability space: for any ω ∈ Ω, X(ω) ∈ R is a real number. The distribution function of X is defined as

F(x) = P(ω: X(ω) ≤ x).

A function f(x) is called the probability density of the random variable X if for any (measurable) subset A ⊆ R we have

P(X ∈ A) = ∫_A f(x) dx.

If F(x) is differentiable at x, then f(x) = dF(x)/dx. If X takes values only in a countable set of real numbers, X and F(x) are called discrete. A familiar discrete distribution is the binomial distribution:

P(X = r) = C(n, r) p^r (1 − p)^(n−r),   r = 0, 1, 2, ..., n,

where C(n, r) = n!/(r!(n − r)!). If n = 1, the distribution is called a Bernoulli distribution. For engineering purposes, we can think of F(x) or f(x) as a function defining or characterizing the random variable X. To simplify the description, we often use gross characterizations, such as the mean and variance, which give respectively the center of gravity and the spread (along the x axis) of the shape of f(x).
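As a small numerical illustration of the binomial distribution and its gross characterizations (this sketch is ours, not part of the original text; the function name `binomial_pmf` and the values of n and p are invented for the example):

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) p^r (1 - p)^(n - r) for a binomial random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance computed directly from the pmf.
mean = sum(r * q for r, q in enumerate(pmf))
var = sum(r**2 * q for r, q in enumerate(pmf)) - mean**2

# Sanity checks: the pmf sums to one, the mean is n*p,
# and the variance is n*p*(1 - p), as the closed forms predict.
assert abs(sum(pmf) - 1.0) < 1e-12
assert abs(mean - n * p) < 1e-12
assert abs(var - n * p * (1 - p)) < 1e-9
```

Setting n = 1 in the same function gives the Bernoulli special case mentioned above.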
The mean (or expected) value of a random variable X on Ω is the integral of X with respect to the distribution function F(x):

E[X] = ∫ x dF(x) = ∫ x f(x) dx.

The kth moment of X is

E[X^k] = ∫ x^k dF(x) = ∫ x^k f(x) dx.

In particular, the variance of X is the second central moment:

Var[X] = E[(X − E[X])^2].

A random (or stochastic) sequence is simply a sequence of random variables indexed by an independent variable, usually the time. A random sequence is defined by specifying the joint distribution function of all the random variables X1, X2, X3, ..., Xn, ... involved. If

P(Xn+1 ≤ x | X1, X2, ..., Xn) = P(Xn+1 ≤ x | Xn),

then the random sequence is said to possess the Markov property, which colloquially can be described as "knowledge of the present separates the past and the future." Other modifying adjectives, such as Gaussian, stationary (wide sense and strict sense), purely random (the so-called "white" noise), discrete state, semi-Markov, renewal, etc., are additional specifications and/or restrictions placed on the joint density function.

Exercise: Make the above defining adjectives conceptually precise by spelling out these specifications on the joint density functions.

Ignoring mathematical intricacies, we can think of a random or stochastic process as the limit of a random sequence when the indexing variable becomes continuous. For a sequence of random variables, we are often interested in whether or not it converges, and if so, whether it converges to a given random variable or a deterministic constant. Let Xn be a sequence of random variables with distribution functions Fn(x), n = 1, 2, .... This sequence converges to a random variable X if |Xn − X| becomes small in some sense as n → ∞. The definitions of the different modes of convergence are as follows.

(i) Convergence in probability: If for any ε > 0, lim_{n→∞} P(|Xn − X| ≥ ε) = 0, then Xn is said to converge to X in probability as n → ∞.

(ii) Convergence with probability one (almost sure (a.s.)
convergence): If for any ε > 0, lim_{n→∞} P[max_{k≥n} |Xk − X| ≥ ε] = 0, or equivalently if P[ω: lim_{n→∞} Xn(ω) = X(ω)] = 1, then Xn is said to converge to X with probability one (w.p.1), or almost surely, as n → ∞.

(iii) Convergence in mean or in mean square: If lim_{n→∞} E[|Xn − X|] = 0 or lim_{n→∞} E[|Xn − X|^2] = 0, then Xn is said to converge to X in mean or in mean square, respectively, as n → ∞. Convergence in mean square implies convergence in mean.

(iv) Convergence in distribution and weak convergence: If lim_{n→∞} Fn(x) = F(x) for every continuity point x of F, then Xn is said to converge to X in distribution as n → ∞, and the distribution function Fn(x) is said to converge weakly to F(x).

Some examples will help to gain an intuitive understanding of these definitions.

Example 1. Let the sequence of Xi's be independently distributed according to

P(Xn = 1) = 1/n  and  P(Xn = 0) = 1 − 1/n.

(i) This sequence clearly converges in probability to the degenerate random variable zero, since

lim_{n→∞} P(|Xn| ≥ ε) = lim_{n→∞} 1/n = 0 for ε ≤ 1,  and  lim_{n→∞} P(|Xn| ≥ ε) = lim_{n→∞} 0 = 0 for ε > 1.

(ii) However, this sequence does not converge to zero with probability one, since

P{max_{k≥n} |Xk| ≥ ε} = 1 − ∏_{k=n}^{∞} (1 − 1/k) = 1 − 0 = 1 for any n and ε < 1,

because the infinite product diverges to zero.

(iii) Finally, this sequence does converge in mean and in mean square, because

lim_{n→∞} E[Xn] = lim_{n→∞} {1 × P(Xn = 1) + 0 × P(Xn = 0)} = lim_{n→∞} 1/n = 0

and

lim_{n→∞} E[Xn^2] = lim_{n→∞} {1^2 × P(Xn = 1) + 0^2 × P(Xn = 0)} = lim_{n→∞} 1/n = 0.

Example 2. Consider the same setup as in Example 1 except

P(Xn = n) = 1/n^2  and  P(Xn = 0) = 1 − 1/n^2.

(i) It is clear that convergence in probability still holds.

(ii) But now the sequence also converges with probability one, since

lim_{n→∞} P{max_{k≥n} |Xk| ≥ ε} = lim_{n→∞} {1 − ∏_{k=n}^{∞} (1 − 1/k^2)} = 1 − 1 = 0 for ε < 1,

because here the infinite product tends to one as n → ∞.

(iii) However, convergence in mean still holds, while convergence in mean square no longer does. This is because

lim_{n→∞} E[Xn] = lim_{n→∞} {n × P(Xn = n) + 0 × P(Xn = 0)} = lim_{n→∞} n/n^2 = 0

and

lim_{n→∞} E[Xn^2] = lim_{n→∞} {n^2 × P(Xn = n) + 0^2 × P(Xn = 0)} = lim_{n→∞} n^2/n^2 = 1.

Very roughly speaking, convergence with probability one is concerned with how fast deviations from the limit disappear as n → ∞, while mean square convergence is concerned with how fast the amplitude of the deviations decreases as n → ∞. Finally, convergence in distribution, or weak convergence, is concerned with whether or not the distribution of a random variable becomes statistically indistinguishable from that of another; in weak convergence, we do not make sample-path-wise comparisons. Both convergence with probability one and convergence in mean (mean square) imply convergence in probability, which, in turn, implies convergence in distribution [Billingsley 1979]. Convergence with probability one and convergence in mean do not imply each other. However, if Xn is dominated by another random variable Y which has a finite mean, then convergence of Xn to X with probability one (w.p.1) implies convergence of Xn to X in mean. This is easily proved using the Lebesgue dominated convergence theorem stated below.

Lebesgue dominated convergence theorem: Let Xn and Y be random variables on Ω. If Y has a finite mean, |Xn| ≤ Y w.p.1, and lim_{n→∞} Xn(ω) = X(ω) w.p.1, then

lim_{n→∞} E[Xn] = E[lim_{n→∞} Xn] = E[X].

Pictorially, the relationships among the various modes of convergence are as follows:

convergence w.p.1  ⇒  convergence in probability  ⇒  convergence in distribution
convergence in mean square  ⇒  convergence in mean  ⇒  convergence in probability

Another important theorem is:

The Law of Large Numbers: Suppose that {Xn} is an independent sequence and E[Xn] = 0. If Σ_n Var[Xn]/n^2 < ∞, then

lim_{n→∞} (1/n) Σ_{k=1}^{n} Xk = 0, w.p.1.

A direct corollary of this result is: Suppose that X1, X2, ... are independent and identically distributed (i.i.d.) with E[Xn] = E[X]; then

lim_{n→∞} (1/n) Σ_{k=1}^{n} Xk = E[X], w.p.1.

Finally, another concept used quite often in this book is the consistency of estimates. Let θ be a quantity to be estimated and θn, n = 1, 2, ..., be estimates of θ. The estimate θn is said to be consistent for θ if θn converges to θ in probability or with probability one.
The former may be called weak consistency and the latter strong consistency. For example, if X1, X2, ... are i.i.d., then (X1 + ... + Xn)/n is a strongly consistent estimate of E[X].

A.2 Markov Processes

In this section we review the basic concepts and terminology of Markov processes, basically following [Cinlar 1975].

A.2.1 Markov Chains

Let Ω be a sample probability space and P a probability measure on it. Consider a stochastic sequence X = {Xn, n ∈ {0, 1, ...}} with a countable state space Ψ, indexed by time; that is, for each n ∈ {0, 1, ...} and ω ∈ Ω, Xn(ω) is an element of Ψ. The elements of Ψ are customarily denoted as integers; thus Xn = j means the process is in state j at time n. A stochastic process X = {Xn, n = 0, 1, ...} is called a Markov chain provided that

P(Xn+1 = j | X0, X1, ..., Xn) = P(Xn+1 = j | Xn)

for all j ∈ Ψ and n ∈ {0, 1, ...}. Let P(i, j) ≡ P(Xn+1 = j | Xn = i) be the transition probability of the Markov chain and P = [P(i, j)], i, j ∈ Ψ, be the transition matrix. Let π(i), i ∈ Ψ, be the steady-state probability of state i.
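The transition matrix and steady-state probabilities can be sketched numerically. The two-state chain below is invented for illustration (it is not from the text); for it, the steady-state probabilities π, which satisfy π = πP, can be approximated by repeatedly applying the transition matrix to an arbitrary initial distribution:

```python
# A hypothetical two-state Markov chain on state space {0, 1}.
# P[i][j] = P(X_{n+1} = j | X_n = i); each row sums to one.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def step(pi, P):
    """One application of the transition matrix: pi' = pi P."""
    return [sum(pi[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P))]

# Iterate pi_{n+1} = pi_n P from an arbitrary initial distribution;
# for this chain the iterates converge to the steady-state probabilities.
pi = [1.0, 0.0]
for _ in range(200):
    pi = step(pi, P)

# Solving pi = pi P with pi(0) + pi(1) = 1 by hand gives pi = (0.8, 0.2),
# which the iteration reproduces.
assert abs(pi[0] - 0.8) < 1e-6 and abs(pi[1] - 0.2) < 1e-6
print(pi)
```

The iteration converges here because the second eigenvalue of P has modulus 0.5 < 1; chains with periodic or reducible structure need more care.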