
A Probability Distributions

We recall here the density and the first two moments of most of the distributions used in this book. An exhaustive review of probability distributions is provided by Johnson and Kotz (1972), or the more recent Johnson and Hoeting (2003), Johnson et al. (1994, 1995). The densities are given with respect to Lebesgue or counting measure depending on the context.

A.1. Normal Distribution, $\mathcal{N}_p(\theta, \Sigma)$
($\theta \in \mathbb{R}^p$ and $\Sigma$ is a $(p \times p)$ symmetric positive definite matrix.)

$$f(x \mid \theta, \Sigma) = (\det \Sigma)^{-1/2} (2\pi)^{-p/2} e^{-(x-\theta)^t \Sigma^{-1} (x-\theta)/2}.$$

$\mathbb{E}_{\theta,\Sigma}[X] = \theta$ and $\mathbb{E}_{\theta,\Sigma}[(X-\theta)(X-\theta)^t] = \Sigma$.
When $\Sigma$ is not positive definite, the $\mathcal{N}_p(\theta, \Sigma)$ distribution has no density with respect to Lebesgue measure on $\mathbb{R}^p$. For $p = 1$, the log-normal distribution is defined as the distribution of $e^X$ when $X \sim \mathcal{N}(\theta, \sigma^2)$.

A.2. Gamma Distribution, $\mathcal{G}a(\alpha, \beta)$
($\alpha, \beta > 0$.)

$$f(x \mid \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}\, \mathbb{I}_{[0,+\infty)}(x).$$

$\mathbb{E}_{\alpha,\beta}[X] = \alpha/\beta$ and $\mathrm{var}_{\alpha,\beta}(X) = \alpha/\beta^2$.
Particular cases of the Gamma distribution are the Erlang distribution, $\mathcal{G}a(\alpha, 1)$, the exponential distribution $\mathcal{G}a(1, \beta)$ (denoted by $\mathcal{E}xp(\beta)$), and the chi squared distribution, $\mathcal{G}a(\nu/2, 1/2)$ (denoted by $\chi^2_\nu$). (Note also that the opposite convention is sometimes adopted for the parameter, namely that $\mathcal{G}a(\alpha, \beta)$ may also be noted as $\mathcal{G}a(\alpha, 1/\beta)$. See, e.g., Berger 1985.)

A.3. Beta Distribution, $\mathcal{B}e(\alpha, \beta)$
($\alpha, \beta > 0$.)

$$f(x \mid \alpha, \beta) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)}\, \mathbb{I}_{[0,1]}(x),$$

where

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$

$\mathbb{E}_{\alpha,\beta}[X] = \alpha/(\alpha+\beta)$ and $\mathrm{var}_{\alpha,\beta}(X) = \alpha\beta/[(\alpha+\beta)^2(\alpha+\beta+1)]$.
The beta distribution can be obtained as the distribution of $Y_1/(Y_1+Y_2)$ when $Y_1 \sim \mathcal{G}a(\alpha, 1)$ and $Y_2 \sim \mathcal{G}a(\beta, 1)$.

A.4. Student's t Distribution, $\mathcal{T}_p(\nu, \theta, \Sigma)$
($\nu > 0$, $\theta \in \mathbb{R}^p$, and $\Sigma$ is a $(p \times p)$ symmetric positive-definite matrix.)

$$f(x \mid \nu, \theta, \Sigma) = \frac{\Gamma((\nu+p)/2)/\Gamma(\nu/2)}{(\det \Sigma)^{1/2} (\nu\pi)^{p/2}} \left[1 + \frac{(x-\theta)^t \Sigma^{-1} (x-\theta)}{\nu}\right]^{-(\nu+p)/2}.$$

$\mathbb{E}_{\nu,\theta,\Sigma}[X] = \theta$ ($\nu > 1$) and $\mathbb{E}_{\nu,\theta,\Sigma}[(X-\theta)(X-\theta)^t] = \nu\Sigma/(\nu-2)$ ($\nu > 2$).
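As a numerical aside on A.2 and A.3, the following Python sketch checks by simulation the moments of the gamma distribution in the rate convention and the construction of a beta variable as a ratio of gamma variables. It is only an illustration under stated assumptions: it relies on the standard library generator `random.gammavariate`, whose second argument is a scale rather than a rate (the "opposite convention" mentioned in A.2), and the helper names `gamma_rate`, `beta_via_gammas`, and `mean_and_variance`, the seed, and the parameter values are hypothetical choices made for this example.

```python
import random


def gamma_rate(alpha, beta, rng):
    """Draw from Ga(alpha, beta) with beta a rate, as in A.2.

    random.gammavariate expects a scale parameter, so the rate is inverted.
    """
    return rng.gammavariate(alpha, 1.0 / beta)


def beta_via_gammas(a, b, rng):
    """Draw from Be(a, b) as Y1 / (Y1 + Y2), Y1 ~ Ga(a, 1), Y2 ~ Ga(b, 1)."""
    y1 = gamma_rate(a, 1.0, rng)
    y2 = gamma_rate(b, 1.0, rng)
    return y1 / (y1 + y2)


def mean_and_variance(sample):
    """Empirical mean and (population) variance of a list of floats."""
    m = sum(sample) / len(sample)
    v = sum((s - m) ** 2 for s in sample) / len(sample)
    return m, v


rng = random.Random(0)  # fixed seed, an arbitrary illustrative choice
n = 50_000              # number of simulated values, also arbitrary

# Gamma check: theory gives alpha / beta and alpha / beta**2.
alpha, beta = 3.0, 2.0
gamma_mean, gamma_var = mean_and_variance(
    [gamma_rate(alpha, beta, rng) for _ in range(n)]
)
print("gamma mean:", gamma_mean, "theory:", alpha / beta)
print("gamma var: ", gamma_var, "theory:", alpha / beta**2)

# Beta check: theory gives a / (a + b) and a b / [(a + b)^2 (a + b + 1)].
a, b = 2.0, 5.0
beta_mean, beta_var = mean_and_variance(
    [beta_via_gammas(a, b, rng) for _ in range(n)]
)
print("beta mean: ", beta_mean, "theory:", a / (a + b))
print("beta var:  ", beta_var, "theory:", a * b / ((a + b) ** 2 * (a + b + 1)))
```

The empirical values should agree with the theoretical ones up to Monte Carlo error. Passing `beta` directly to `random.gammavariate` instead of `1.0 / beta` would simulate from the other parametrization and give a mean of $\alpha\beta$ rather than $\alpha/\beta$.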
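When $p = 1$, a particular case of Student's t distribution is the Cauchy distribution, $\mathcal{C}(\theta, \sigma^2)$, which corresponds to $\nu = 1$. Student's t distribution $\mathcal{T}_p(\nu, 0, I)$ can be derived as the distribution of $X/Z$ when $X \sim \mathcal{N}_p(0, I)$ and $\nu Z^2 \sim \chi^2_\nu$.

A.5. Fisher's F Distribution, $\mathcal{F}(\nu, \rho)$
($\nu, \rho > 0$.)

$$f(x \mid \nu, \rho) = \frac{\Gamma((\nu+\rho)/2)\, \nu^{\nu/2} \rho^{\rho/2}}{\Gamma(\nu/2)\Gamma(\rho/2)}\, \frac{x^{(\nu-2)/2}}{(\rho + \nu x)^{(\nu+\rho)/2}}\, \mathbb{I}_{[0,+\infty)}(x).$$

$\mathbb{E}_{\nu,\rho}[X] = \rho/(\rho-2)$ ($\rho > 2$) and $\mathrm{var}_{\nu,\rho}(X) = 2\rho^2(\nu+\rho-2)/[\nu(\rho-4)(\rho-2)^2]$ ($\rho > 4$).
The distribution $\mathcal{F}(p, q)$ is also the distribution of $(X-\theta)^t \Sigma^{-1} (X-\theta)/p$ when $X \sim \mathcal{T}_p(q, \theta, \Sigma)$. Moreover, if $X \sim \mathcal{F}(\nu, \rho)$, then $\nu X/(\rho + \nu X) \sim \mathcal{B}e(\nu/2, \rho/2)$.

A.6. Inverse Gamma Distribution, $\mathcal{IG}(\alpha, \beta)$
($\alpha, \beta > 0$.)

$$f(x \mid \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, \frac{e^{-\beta/x}}{x^{\alpha+1}}\, \mathbb{I}_{[0,+\infty)}(x).$$

$\mathbb{E}_{\alpha,\beta}[X] = \beta/(\alpha-1)$ ($\alpha > 1$) and $\mathrm{var}_{\alpha,\beta}(X) = \beta^2/((\alpha-1)^2(\alpha-2))$ ($\alpha > 2$).
This distribution is the distribution of $X^{-1}$ when $X \sim \mathcal{G}a(\alpha, \beta)$.

A.7. Noncentral Chi Squared Distribution, $\chi^2_p(\lambda)$
($\lambda \ge 0$.)

$$f(x \mid \lambda) = \frac{1}{2} (x/\lambda)^{(p-2)/4}\, I_{(p-2)/2}(\sqrt{\lambda x})\, e^{-(\lambda+x)/2}.$$

$\mathbb{E}_\lambda[X] = p + \lambda$ and $\mathrm{var}_\lambda(X) = 2p + 4\lambda$.
This distribution can be derived as the distribution of $X_1^2 + \cdots + X_p^2$ when $X_i \sim \mathcal{N}(\theta_i, 1)$ and $\theta_1^2 + \cdots + \theta_p^2 = \lambda$.

A.8. Dirichlet Distribution, $\mathcal{D}_k(\alpha_1, \ldots, \alpha_k)$
($\alpha_1, \ldots, \alpha_k > 0$ and $\alpha_0 = \alpha_1 + \cdots + \alpha_k$.)

$$f(x \mid \alpha_1, \ldots, \alpha_k) = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_k)}\, x_1^{\alpha_1-1} \cdots x_k^{\alpha_k-1}\, \mathbb{I}_{\{\sum_i x_i = 1\}}.$$

$\mathbb{E}_\alpha[X_i] = \alpha_i/\alpha_0$, $\mathrm{var}(X_i) = (\alpha_0-\alpha_i)\alpha_i/[\alpha_0^2(\alpha_0+1)]$ and $\mathrm{cov}(X_i, X_j) = -\alpha_i\alpha_j/[\alpha_0^2(\alpha_0+1)]$ ($i \ne j$).
As a particular case, note that $(X, 1-X) \sim \mathcal{D}_2(\alpha_1, \alpha_2)$ is equivalent to $X \sim \mathcal{B}e(\alpha_1, \alpha_2)$.

A.9. Pareto Distribution, $\mathcal{P}a(\alpha, x_0)$
($\alpha > 0$ and $x_0 > 0$.)

$$f(x \mid \alpha, x_0) = \alpha\, \frac{x_0^\alpha}{x^{\alpha+1}}\, \mathbb{I}_{[x_0,+\infty)}(x).$$

$\mathbb{E}_{\alpha,x_0}[X] = \alpha x_0/(\alpha-1)$ ($\alpha > 1$) and $\mathrm{var}_{\alpha,x_0}(X) = \alpha x_0^2/[(\alpha-1)^2(\alpha-2)]$ ($\alpha > 2$).

A.10. Binomial Distribution, $\mathcal{B}(n, p)$
($0 \le p \le 1$.)

$$f(x \mid p) = \binom{n}{x} p^x (1-p)^{n-x}\, \mathbb{I}_{\{0,\ldots,n\}}(x).$$

$\mathbb{E}_p(X) = np$ and $\mathrm{var}(X) = np(1-p)$.

A.11. Multinomial Distribution, $\mathcal{M}_k(n; p_1, \ldots, p_k)$
($p_i \ge 0$ ($1 \le i \le k$) and $\sum_i p_i = 1$.)

$$f(x_1, \ldots, x_k \mid p_1, \ldots, p_k) = \binom{n}{x_1 \ldots x_k} \prod_{i=1}^k p_i^{x_i}\, \mathbb{I}_{\{\sum_i x_i = n\}}.$$

$\mathbb{E}_p(X_i) = np_i$, $\mathrm{var}(X_i) = np_i(1-p_i)$, and $\mathrm{cov}(X_i, X_j) = -np_ip_j$ ($i \ne j$).
Note that, if $X \sim \mathcal{M}_k(n; p_1, \ldots, p_k)$, then $X_i \sim \mathcal{B}(n, p_i)$, and that the binomial distribution $X \sim \mathcal{B}(n, p)$ corresponds to $(X, n-X) \sim \mathcal{M}_2(n; p, 1-p)$.

A.12. Poisson Distribution, $\mathcal{P}(\lambda)$
($\lambda > 0$.)

$$f(x \mid \lambda) = e^{-\lambda}\, \frac{\lambda^x}{x!}\, \mathbb{I}_{\mathbb{N}}(x).$$

$\mathbb{E}_\lambda[X] = \lambda$ and $\mathrm{var}_\lambda(X) = \lambda$.

A.13. Negative Binomial Distribution, $\mathcal{N}eg(n, p)$
($0 \le p \le 1$.)

$$f(x \mid p) = \binom{n+x-1}{x} p^n (1-p)^x\, \mathbb{I}_{\mathbb{N}}(x).$$

$\mathbb{E}_p[X] = n(1-p)/p$ and $\mathrm{var}_p(X) = n(1-p)/p^2$.

A.14. Hypergeometric Distribution, $\mathcal{H}yp(N; n; p)$
($0 \le p \le 1$, $n < N$ and $pN \in \mathbb{N}$.)

$$f(x \mid p) = \frac{\binom{pN}{x} \binom{(1-p)N}{n-x}}{\binom{N}{n}}\, \mathbb{I}_{\{n-(1-p)N,\ldots,pN\}}(x)\, \mathbb{I}_{\{0,1,\ldots,n\}}(x).$$

$\mathbb{E}_{N,n,p}[X] = np$ and $\mathrm{var}_{N,n,p}(X) = (N-n)np(1-p)/(N-1)$.

B Notation

B.1 Mathematical

$\mathbf{h} = (h_1, \ldots, h_n) = \{h_i\}$    boldface signifies a vector
$H = \{h_{ij}\} = \|h_{ij}\|$    uppercase signifies a matrix
$I$, $\mathbf{1}$, $J = \mathbf{1}\mathbf{1}'$    identity matrix, vector of ones, a matrix of ones
$A \prec B$    $(B - A)$ is a positive definite matrix
$|A|$    determinant of the matrix $A$
$\mathrm{tr}(A)$    trace of the matrix $A$
$a^+$    $\max(a, 0)$
$C_n^p$, $\binom{n}{p}$    binomial coefficient
$D_\alpha$    logistic function
${}_1F_1(a; b; z)$    confluent hypergeometric function
$F^-$    generalized inverse of $F$
$\Gamma(x)$    gamma function ($x > 0$)
$\Psi(x)$    digamma function, $(d/dx) \log \Gamma(x)$ ($x > 0$)
$\mathbb{I}_A(t)$    indicator function (1 if $t \in A$, 0 otherwise)
$I_\nu(z)$    modified Bessel function ($z > 0$)
$\binom{n}{p_1 \ldots p_n}$    multinomial coefficient
$\nabla f(z)$    gradient of $f(z)$, the vector with coefficients $(\partial/\partial z_i) f(z)$ ($f(z) \in \mathbb{R}$ and $z \in \mathbb{R}^p$)
$\nabla^t f(z)$    divergence of $f(z)$, $\sum_i (\partial/\partial z_i) f_i(z)$ ($f(z) \in \mathbb{R}^p$ and $z \in \mathbb{R}^p$)
$\Delta f(z)$    Laplacian of $f(z)$, $\sum_i (\partial^2/\partial z_i^2) f(z)$
$\|\cdot\|_{TV}$    total variation norm
$|x| = (\sum_i x_i^2)^{1/2}$    Euclidean norm
$[x]$ or $\lfloor x \rfloor$    greatest integer less than $x$
$\lceil x \rceil$    smallest integer larger than $x$
$f(t) \propto g(t)$    the functions $f$ and $g$ are proportional
$\mathrm{supp}(f)$    support of $f$
$\langle x, y \rangle$    scalar product of $x$ and $y$ in $\mathbb{R}^p$
$x \vee y$    maximum of $x$ and $y$
$x \wedge y$    minimum of $x$ and $y$

B.2 Probability

$X, Y$    random variable (uppercase)
$(\mathcal{X}, P, \mathcal{B})$    probability triple: sample space, probability distribution, and $\sigma$-algebra of sets
$\beta_n$    $\beta$-mixing coefficient
$\delta_{\theta_0}(\theta)$    Dirac mass at $\theta_0$
$E(\theta)$    energy function of a Gibbs distribution
$\mathcal{E}(\pi)$    entropy of the distribution $\pi$
$F(x \mid \theta)$    cumulative distribution function of $X$, conditional on the parameter $\theta$
$f(x \mid \theta)$    density of $X$, conditional on the parameter $\theta$, with respect to Lebesgue or counting measure
$X \sim f(x \mid \theta)$    $X$ is distributed with density $f(x \mid \theta)$
$\mathbb{E}_\theta[g(X)]$    expectation of $g(x)$ under the distribution $X \sim f(x \mid \theta)$
$\mathbb{E}^V[h(V)]$    expectation of $h(v)$ under the distribution of $V$
$\mathbb{E}^\pi[h(\theta) \mid x]$    expectation of $h(\theta)$ under the distribution of $\theta$, conditional on $x$, $\pi(\theta \mid x)$
iid    independent and identically distributed
$\lambda(dx)$    Lebesgue measure, also denoted by $d\lambda(x)$
$P_\theta$    probability distribution, indexed by the parameter $\theta$
$p \star q$    convolution product of the distributions $p$ and $q$, that is, distribution of the sum of $X \sim p$ and $Y \sim q$
$p^{\star n}$    convolution $n$th power, that is, distribution of the sum of $n$ iid rv's distributed from $p$
$\varphi(t)$    density of the Normal distribution $\mathcal{N}(0, 1)$
$\Phi(t)$    cumulative distribution function of the Normal distribution $\mathcal{N}(0, 1)$
$O(n)$, $o(n)$ or $O_p(n)$, $o_p(n)$    big "Oh", little "oh." As $n \to \infty$, $O(n)/n \to$ constant, $o(n)/n \to 0$, and the subscript $p$ denotes in probability

B.3 Distributions

$\mathcal{B}(n, p)$    binomial distribution
$\mathcal{B}e(\alpha, \beta)$    beta distribution
$\mathcal{C}(\theta, \sigma^2)$    Cauchy distribution
$\mathcal{D}_k(\alpha_1, \ldots, \alpha_k)$    Dirichlet distribution
$\mathcal{E}xp(\lambda)$    exponential distribution
$\mathcal{F}(p, q)$    Fisher's F distribution
$\mathcal{G}a(\alpha, \beta)$    gamma distribution
$\mathcal{IG}(\alpha, \beta)$    inverse gamma distribution
$\chi^2_p$, $\chi^2_p(\lambda)$    chi squared distribution, noncentral chi squared distribution with noncentrality parameter $\lambda$
$\mathcal{M}_k(n; p_1, \ldots, p_k)$    multinomial distribution
$\mathcal{N}(\theta, \sigma^2)$    univariate normal distribution
$\mathcal{N}_p(\theta, \Sigma)$    multivariate normal distribution
$\mathcal{N}eg(n, p)$    negative binomial distribution
$\mathcal{P}(\lambda)$    Poisson distribution
$\mathcal{P}a(x_0, \alpha)$    Pareto distribution
$\mathcal{T}_p(\nu, \theta, \Sigma)$    multivariate Student's t distribution
$\mathcal{U}_{[a,b]}$    continuous uniform distribution
$\mathcal{W}e(\alpha, c)$    Weibull distribution
$\mathcal{W}_k(p, \Sigma)$    Wishart distribution

B.4 Markov Chains

$\alpha$    atom
AR($p$)    autoregressive process of order $p$
ARMA($p, q$)    autoregressive moving average process of order $(p, q)$
$C$    small set
$d(\alpha)$    period of the state or atom $\alpha$
$\dagger$    "dagger," absorbing state
$\Delta V(x)$    drift of $V$
$\mathbb{E}_\mu[h(X_n)]$    expectation associated with $P_\mu$
$\mathbb{E}_{x_0}[h(X_n)]$    expectation associated with $P_{x_0}$
$\eta_A$    total number of passages in $A$
$Q(x, A)$    probability that $\eta_A$ is infinite, starting from $x$
$\gamma_g^2$    variance of $S_N(g)$ for the Central Limit Theorem
$K_\epsilon$    kernel of the resolvant
MA($q$)    moving average process of order $q$
$L(x, A)$    probability of return to $A$ starting from $x$
$\nu$, $\nu_m$    minorizing measure for an atom or small set
$P(x, A)$    transition kernel
$P^m(x, A)$    transition kernel of the chain $(X_{mn})_n$
$P_\mu(\cdot)$    probability distribution of the chain $(X_n)$ with initial state $X_0 \sim \mu$
$P_{x_0}(\cdot)$    probability distribution of the chain $(X_n)$ with initial state $X_0 = x_0$
$\pi$    invariant measure
$S_N(g)$    empirical average of $g(x_i)$ for $1 \le i \le N$
$T_{p,q}$    coupling time for the initial distributions $p$ and $q$
$\tau_A$    return time to $A$
$\tau_A(k)$    $k$th return time to $A$
$U(x, A)$    average number of passages in $A$, starting from $x$
generic element of a Markov chain
augmented or split chain

B.5 Statistics

$x, y$    realized values (lowercase) of the random variables $X$ and $Y$ (uppercase)
$\mathcal{X}, \mathcal{Y}$    sample space (uppercase script Roman letters)
$\theta, \lambda$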