A Probability Distributions

We recall here the density and the first two moments of most of the distributions used in this book. An exhaustive review of probability distributions is provided by Johnson and Kotz (1972), or the more recent Johnson and Hoeting (2003) and Johnson et al. (1994, 1995). The densities are given with respect to Lebesgue or counting measure, depending on the context.

A.1 Normal Distribution, $\mathcal{N}_p(\theta, \Sigma)$

($\theta \in \mathbb{R}^p$ and $\Sigma$ is a $(p \times p)$ symmetric positive-definite matrix.)

$$f(x|\theta,\Sigma) = (\det \Sigma)^{-1/2}\,(2\pi)^{-p/2}\, e^{-(x-\theta)^t \Sigma^{-1}(x-\theta)/2}.$$

$\mathbb{E}_{\theta,\Sigma}[X] = \theta$ and $\mathbb{E}_{\theta,\Sigma}[(X-\theta)(X-\theta)^t] = \Sigma$. When $\Sigma$ is not positive definite, the $\mathcal{N}_p(\theta,\Sigma)$ distribution has no density with respect to Lebesgue measure on $\mathbb{R}^p$. For $p = 1$, the log-normal distribution is defined as the distribution of $e^X$ when $X \sim \mathcal{N}(\theta, \sigma^2)$.

A.2 Gamma Distribution, $\mathcal{G}a(\alpha, \beta)$

($\alpha, \beta > 0$.)

$$f(x|\alpha,\beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}\, \mathbb{I}_{[0,+\infty)}(x).$$
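As a quick numerical sanity check of this Gamma density (an illustrative sketch, not part of the original text; Python standard library only), note that `random.gammavariate` takes a *scale* second argument, so sampling $\mathcal{G}a(\alpha,\beta)$ in the rate convention used here requires passing $1/\beta$:

```python
import random
import statistics

random.seed(1)
alpha, beta = 3.0, 2.0  # shape and RATE, as in the Ga(alpha, beta) convention here
# Python's random.gammavariate(alpha, theta) uses a SCALE theta = 1/beta,
# i.e. the opposite convention also mentioned in this section.
draws = [random.gammavariate(alpha, 1.0 / beta) for _ in range(100_000)]
assert abs(statistics.fmean(draws) - alpha / beta) < 0.03       # mean alpha/beta = 1.5
assert abs(statistics.variance(draws) - alpha / beta**2) < 0.05  # var alpha/beta^2 = 0.75
```

The two assertions check the moments $\alpha/\beta$ and $\alpha/\beta^2$ stated below by Monte Carlo, within a generous tolerance.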

$\mathbb{E}_{\alpha,\beta}[X] = \alpha/\beta$ and $\text{var}_{\alpha,\beta}(X) = \alpha/\beta^2$. Particular cases of the Gamma distribution are the Erlang distribution, $\mathcal{G}a(a, 1)$, the exponential distribution $\mathcal{G}a(1, \beta)$ (denoted by $\mathcal{E}xp(\beta)$), and the chi squared distribution, $\mathcal{G}a(\nu/2, 1/2)$ (denoted by $\chi^2_\nu$). (Note also that the opposite convention is sometimes adopted for the second parameter, namely that $\mathcal{G}a(\alpha,\beta)$ may also be noted as $\mathcal{G}a(\alpha, 1/\beta)$. See, e.g., Berger 1985.)

A.3 Beta Distribution, $\mathcal{B}e(\alpha, \beta)$

($\alpha, \beta > 0$.)

$$f(x|\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}\, \mathbb{I}_{[0,1]}(x),$$

where

$$B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}.$$
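A quick simulation check of this density (an illustrative sketch, not from the original text; Python standard library only) uses the classical Gamma representation of the Beta distribution, $Y_1/(Y_1+Y_2)$ with independent $\mathcal{G}a(\alpha,1)$ and $\mathcal{G}a(\beta,1)$ draws:

```python
import random
import statistics

random.seed(7)
a, b = 2.0, 5.0

def beta_from_gammas():
    # Y1/(Y1+Y2) with Y1 ~ Ga(a, 1) and Y2 ~ Ga(b, 1) is Be(a, b)
    y1 = random.gammavariate(a, 1.0)
    y2 = random.gammavariate(b, 1.0)
    return y1 / (y1 + y2)

draws = [beta_from_gammas() for _ in range(100_000)]
mean = a / (a + b)                            # 2/7
var = a * b / ((a + b) ** 2 * (a + b + 1))    # 10/(49*8)
assert abs(statistics.fmean(draws) - mean) < 0.005
assert abs(statistics.variance(draws) - var) < 0.002
```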

$\mathbb{E}_{\alpha,\beta}[X] = \alpha/(\alpha+\beta)$ and $\text{var}_{\alpha,\beta}(X) = \alpha\beta/[(\alpha+\beta)^2(\alpha+\beta+1)]$. The beta distribution can be obtained as the distribution of $Y_1/(Y_1+Y_2)$ when $Y_1 \sim \mathcal{G}a(\alpha, 1)$ and $Y_2 \sim \mathcal{G}a(\beta, 1)$.

A.4 Student's $t$ Distribution, $\mathcal{T}_p(\nu, \theta, \Sigma)$

($\nu > 0$, $\theta \in \mathbb{R}^p$, and $\Sigma$ is a $(p \times p)$ symmetric positive-definite matrix.)

$$f(x|\nu,\theta,\Sigma) = \frac{\Gamma((\nu+p)/2)/\Gamma(\nu/2)}{(\det \Sigma)^{1/2}(\nu\pi)^{p/2}} \left[1 + \frac{(x-\theta)^t \Sigma^{-1}(x-\theta)}{\nu}\right]^{-(\nu+p)/2}.$$
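A simulation sketch of this density for $p = 1$ (illustrative only, not part of the original text; Python standard library), using the classical representation of a $t$ variate as a standard normal divided by $\sqrt{\chi^2_\nu/\nu}$:

```python
import math
import random
import statistics

random.seed(3)
nu = 5.0

def t_draw():
    # X / Z with X ~ N(0, 1) and nu * Z^2 ~ chi^2_nu;
    # chi^2_nu is Ga(nu/2, 1/2), i.e. gammavariate(nu/2, 2.0) in scale form
    x = random.gauss(0.0, 1.0)
    chi2 = random.gammavariate(nu / 2, 2.0)
    return x / math.sqrt(chi2 / nu)

draws = [t_draw() for _ in range(200_000)]
assert abs(statistics.fmean(draws)) < 0.02                     # mean 0 for nu > 1
assert abs(statistics.variance(draws) - nu / (nu - 2)) < 0.1   # nu/(nu-2) = 5/3
```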

$\mathbb{E}_{\nu,\theta,\Sigma}[X] = \theta$ ($\nu > 1$) and $\mathbb{E}_{\theta,\Sigma}[(X-\theta)(X-\theta)^t] = \nu\Sigma/(\nu-2)$ ($\nu > 2$). When $p = 1$, a particular case of Student's $t$ distribution is the Cauchy distribution, $\mathcal{C}(\theta, \sigma^2)$, which corresponds to $\nu = 1$. Student's $t$ distribution $\mathcal{T}_p(\nu, 0, I)$ can be derived as the distribution of $X/Z$ when $X \sim \mathcal{N}_p(0, I)$ and $\nu Z^2 \sim \chi^2_\nu$.

A.5 Fisher's $F$ Distribution, $\mathcal{F}(\nu, \rho)$

($\nu, \rho > 0$.)

$$f(x|\nu,\rho) = \frac{\Gamma((\nu+\rho)/2)\,\nu^{\nu/2}\rho^{\rho/2}}{\Gamma(\nu/2)\Gamma(\rho/2)}\, \frac{x^{(\nu-2)/2}}{(\rho+\nu x)^{(\nu+\rho)/2}}\, \mathbb{I}_{[0,+\infty)}(x).$$
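An $F$ variate can also be generated as a ratio of scaled chi squared variables. The following Python sketch (illustrative only, standard library; not part of the original text) checks the mean and the Beta transformation quoted in this section:

```python
import random
import statistics

random.seed(13)
nu, rho = 6.0, 8.0

def f_draw():
    # (chi2_nu / nu) / (chi2_rho / rho) is F(nu, rho);
    # chi2_k is Ga(k/2, 1/2), i.e. gammavariate(k/2, 2.0) in scale form
    num = random.gammavariate(nu / 2, 2.0) / nu
    den = random.gammavariate(rho / 2, 2.0) / rho
    return num / den

draws = [f_draw() for _ in range(100_000)]
assert abs(statistics.fmean(draws) - rho / (rho - 2)) < 0.05   # mean rho/(rho-2) = 4/3
# nu*X/(rho + nu*X) ~ Be(nu/2, rho/2), whose mean is nu/(nu+rho)
trans = [nu * x / (rho + nu * x) for x in draws]
assert abs(statistics.fmean(trans) - nu / (nu + rho)) < 0.01
```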

$\mathbb{E}_{\nu,\rho}[X] = \rho/(\rho-2)$ ($\rho > 2$) and $\text{var}_{\nu,\rho}(X) = 2\rho^2(\nu+\rho-2)/[\nu(\rho-4)(\rho-2)^2]$ ($\rho > 4$). The distribution $\mathcal{F}(p, q)$ is also the distribution of $(X-\theta)^t \Sigma^{-1}(X-\theta)/p$ when $X \sim \mathcal{T}_p(q, \theta, \Sigma)$. Moreover, if $X \sim \mathcal{F}(\nu, \rho)$, then $\nu X/(\rho + \nu X) \sim \mathcal{B}e(\nu/2, \rho/2)$.

A.6 Inverse Gamma Distribution, $\mathcal{IG}(\alpha, \beta)$

($\alpha, \beta > 0$.)

$$f(x|\alpha,\beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, \frac{e^{-\beta/x}}{x^{\alpha+1}}\, \mathbb{I}_{[0,+\infty)}(x).$$

$\mathbb{E}_{\alpha,\beta}[X] = \beta/(\alpha-1)$ ($\alpha > 1$) and $\text{var}_{\alpha,\beta}(X) = \beta^2/[(\alpha-1)^2(\alpha-2)]$ ($\alpha > 2$). This distribution is the distribution of $X^{-1}$ when $X \sim \mathcal{G}a(\alpha, \beta)$.

A.7 Noncentral Chi Squared Distribution, $\chi^2_p(\lambda)$

($\lambda \ge 0$.)

$$f(x|\lambda) = \frac{1}{2}\,(x/\lambda)^{(p-2)/4}\, I_{(p-2)/2}(\sqrt{\lambda x})\, e^{-(\lambda+x)/2}.$$

$\mathbb{E}_\lambda[X] = p + \lambda$ and $\text{var}_\lambda(X) = 2p + 4\lambda$. This distribution can be derived as the distribution of $X_1^2 + \cdots + X_p^2$ when $X_i \sim \mathcal{N}(\theta_i, 1)$ and $\theta_1^2 + \cdots + \theta_p^2 = \lambda$.
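The sum-of-shifted-normals representation of the noncentral chi squared distribution is easy to check by simulation (an illustrative sketch, not from the original text; Python standard library only):

```python
import math
import random
import statistics

random.seed(11)
p, lam = 4, 2.5
# any theta_i with sum of squares equal to lambda works; take them all equal
thetas = [math.sqrt(lam / p)] * p

def ncx2_draw():
    # X_1^2 + ... + X_p^2 with X_i ~ N(theta_i, 1) is chi^2_p(lambda)
    return sum(random.gauss(t, 1.0) ** 2 for t in thetas)

draws = [ncx2_draw() for _ in range(200_000)]
assert abs(statistics.fmean(draws) - (p + lam)) < 0.1             # mean p + lambda = 6.5
assert abs(statistics.variance(draws) - (2 * p + 4 * lam)) < 0.5  # var 2p + 4*lambda = 18
```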

A.8 Dirichlet Distribution, $\mathcal{D}_k(\alpha_1, \ldots, \alpha_k)$

($\alpha_1, \ldots, \alpha_k > 0$ and $\alpha_0 = \alpha_1 + \cdots + \alpha_k$.)

$$f(x_1,\ldots,x_k|\alpha_1,\ldots,\alpha_k) = \frac{\Gamma(\alpha_0)}{\Gamma(\alpha_1)\cdots\Gamma(\alpha_k)}\, x_1^{\alpha_1-1}\cdots x_k^{\alpha_k-1}\, \mathbb{I}_{\{x_i \ge 0,\ \sum_i x_i = 1\}}(x_1,\ldots,x_k).$$

$\mathbb{E}_\alpha[X_i] = \alpha_i/\alpha_0$, $\text{var}(X_i) = (\alpha_0-\alpha_i)\alpha_i/[\alpha_0^2(\alpha_0+1)]$ and $\text{cov}(X_i, X_j) = -\alpha_i\alpha_j/[\alpha_0^2(\alpha_0+1)]$ ($i \ne j$). As a particular case, note that $(X, 1-X) \sim \mathcal{D}_2(\alpha_1, \alpha_2)$ is equivalent to $X \sim \mathcal{B}e(\alpha_1, \alpha_2)$.

A.9 Pareto Distribution, $\mathcal{P}a(\alpha, x_0)$

($\alpha > 0$ and $x_0 > 0$.)

$$f(x|\alpha,x_0) = \alpha\, \frac{x_0^\alpha}{x^{\alpha+1}}\, \mathbb{I}_{[x_0,+\infty)}(x).$$
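Pareto variates are conveniently produced by inverting the cumulative distribution function. A minimal sketch (illustrative only, not from the original text; Python standard library):

```python
import random
import statistics

random.seed(17)
alpha, x0 = 5.0, 2.0
# Inverse-cdf sampling: if U ~ U(0, 1], then x0 * U**(-1/alpha) ~ Pa(alpha, x0)
draws = [x0 * (1.0 - random.random()) ** (-1.0 / alpha) for _ in range(200_000)]
assert min(draws) >= x0  # support is [x0, +infinity)
assert abs(statistics.fmean(draws) - alpha * x0 / (alpha - 1)) < 0.02  # mean = 2.5
```

Only the mean is asserted here: for small $\alpha$ the Pareto tails are heavy enough that higher-moment Monte Carlo estimates converge slowly.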

$\mathbb{E}_{\alpha,x_0}[X] = \alpha x_0/(\alpha-1)$ ($\alpha > 1$) and $\text{var}_{\alpha,x_0}(X) = \alpha x_0^2/[(\alpha-1)^2(\alpha-2)]$ ($\alpha > 2$).

A.10 Binomial Distribution, $\mathcal{B}(n, p)$

($0 \le p \le 1$.)

$$f(x|p) = \binom{n}{x} p^x (1-p)^{n-x}\, \mathbb{I}_{\{0,\ldots,n\}}(x).$$

$\mathbb{E}_p(X) = np$ and $\text{var}(X) = np(1-p)$.

A.11 Multinomial Distribution, $\mathcal{M}_k(n; p_1, \ldots, p_k)$

($p_i \ge 0$ ($1 \le i \le k$) and $\sum_i p_i = 1$.)

$$f(x_1,\ldots,x_k|p_1,\ldots,p_k) = \binom{n}{x_1 \cdots x_k} \prod_{i=1}^k p_i^{x_i}\, \mathbb{I}_{\{\sum_i x_i = n\}}(x_1,\ldots,x_k).$$

$\mathbb{E}_p(X_i) = np_i$, $\text{var}(X_i) = np_i(1-p_i)$, and $\text{cov}(X_i, X_j) = -np_ip_j$ ($i \ne j$). Note that, if $X \sim \mathcal{M}_k(n; p_1, \ldots, p_k)$, then $X_i \sim \mathcal{B}(n, p_i)$, and that the binomial distribution $X \sim \mathcal{B}(n, p)$ corresponds to $(X, n-X) \sim \mathcal{M}_2(n; p, 1-p)$.

A.12 Poisson Distribution, $\mathcal{P}(\lambda)$

($\lambda > 0$.)

$$f(x|\lambda) = e^{-\lambda}\, \frac{\lambda^x}{x!}\, \mathbb{I}_{\mathbb{N}}(x).$$

$\mathbb{E}_\lambda[X] = \lambda$ and $\text{var}_\lambda(X) = \lambda$.

A.13 Negative Binomial Distribution, $\mathcal{N}eg(n, p)$

($0 \le p \le 1$.)

$$f(x|p) = \binom{n+x-1}{x}\, p^n (1-p)^x\, \mathbb{I}_{\mathbb{N}}(x).$$

$\mathbb{E}_p[X] = n(1-p)/p$ and $\text{var}_p(X) = n(1-p)/p^2$.
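With this parameterization, a $\mathcal{N}eg(n, p)$ variate counts the failures before the $n$th success in independent Bernoulli($p$) trials, which gives a direct (if naive) simulation check (an illustrative sketch, not from the original text; Python standard library only):

```python
import random
import statistics

random.seed(5)
n, p = 4, 0.3

def negbin_draw():
    # number of failures observed before the n-th success
    failures = successes = 0
    while successes < n:
        if random.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

draws = [negbin_draw() for _ in range(50_000)]
assert abs(statistics.fmean(draws) - n * (1 - p) / p) < 0.2        # mean = 28/3
assert abs(statistics.variance(draws) - n * (1 - p) / p**2) < 2.0  # var ~ 31.1
```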

A.14 Hypergeometric Distribution, $\mathcal{H}yp(N; n; p)$

($0 \le p \le 1$, $n \le N$, and $pN \in \mathbb{N}$.)

$$f(x|p) = \frac{\binom{pN}{x}\binom{(1-p)N}{n-x}}{\binom{N}{n}}\, \mathbb{I}_{\{n-(1-p)N,\ldots,pN\}}(x)\, \mathbb{I}_{\{0,1,\ldots,n\}}(x).$$
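This density describes the number of "successes" in $n$ draws without replacement from a population of size $N$ containing $pN$ successes, which can be checked numerically (an illustrative sketch, not from the original text; Python standard library only):

```python
import random
import statistics

random.seed(9)
N, n, p = 50, 10, 0.4   # population size, number of draws, success proportion
population = [1] * int(p * N) + [0] * (N - int(p * N))

def hyp_draw():
    # successes among n draws without replacement
    return sum(random.sample(population, n))

draws = [hyp_draw() for _ in range(50_000)]
mean = n * p                                # 4.0
var = (N - n) * n * p * (1 - p) / (N - 1)   # ~1.96
assert abs(statistics.fmean(draws) - mean) < 0.05
assert abs(statistics.variance(draws) - var) < 0.1
```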

$\mathbb{E}_{N,n,p}[X] = np$ and $\text{var}_{N,n,p}(X) = (N-n)np(1-p)/(N-1)$.

B Notation

B.1 Mathematical

$\mathbf{h} = (h_1, \ldots, h_n) = \{h_i\}$: boldface signifies a vector
$H = \{h_{ij}\} = \|h_{ij}\|$: uppercase signifies a matrix
$I$, $\mathbf{1}$, $J = \mathbf{1}\mathbf{1}^t$: identity matrix, vector of ones, matrix of ones
$\Gamma(x)$: gamma function ($x > 0$)
$\Psi(x)$: digamma function, $(d/dx)\Gamma(x)$ ($x > 0$)
$\mathbb{I}_A(t)$: indicator function ($1$ if $t \in A$, $0$ otherwise)
$I_\nu(z)$: modified Bessel function ($z > 0$)
$\binom{n}{p_1 \cdots p_k}$: multinomial coefficient
$\nabla f(z)$: gradient of $f(z)$, the vector with coefficients $(\partial/\partial z_i) f(z)$ ($f(z) \in \mathbb{R}$ and $z \in \mathbb{R}^p$)
$\nabla^t f(z)$: divergence of $f(z)$, $\sum_i (\partial/\partial z_i) f_i(z)$ ($f(z) \in \mathbb{R}^p$ and $z \in \mathbb{R}^p$)
$\Delta f(z)$: Laplacian of $f(z)$, $\sum_i (\partial^2/\partial z_i^2) f(z)$
$\|\cdot\|_{TV}$: total variation norm
$|x| = (\sum_i x_i^2)^{1/2}$: Euclidean norm
$[x]$ or $\lfloor x \rfloor$: greatest integer less than $x$
$\lceil x \rceil$: smallest integer larger than $x$
$f(t) \propto g(t)$: the functions $f$ and $g$ are proportional
$\text{supp}(f)$: support of $f$
$(x, y)$: scalar product of $x$ and $y$ in $\mathbb{R}^p$
$x \vee y$: maximum of $x$ and $y$
$x \wedge y$: minimum of $x$ and $y$

B.2 Probability

$X, Y$: random variables (uppercase)
$(\mathcal{X}, P, \mathcal{B})$: probability triple: sample space, probability, and $\sigma$-algebra of sets
$\beta_n$: $\beta$-mixing coefficient
$\delta_{\theta_0}(\theta)$: Dirac mass at $\theta_0$
$E(\theta)$: energy function of a Gibbs distribution
$\mathcal{E}(\pi)$: entropy of the distribution $\pi$
$F(x|\theta)$: cumulative distribution function of $X$, conditional on the parameter $\theta$
$f(x|\theta)$: density of $X$, conditional on the parameter $\theta$, with respect to Lebesgue or counting measure
$X \sim f(x|\theta)$: $X$ is distributed with density $f(x|\theta)$
$\mathbb{E}_\theta[g(X)]$: expectation of $g(X)$ under the distribution $X \sim f(x|\theta)$
$\mathbb{E}^V[h(V)]$: expectation of $h(V)$ under the distribution of $V$
$\mathbb{E}^\pi[h(\theta)|x]$: expectation of $h(\theta)$ under the distribution of $\theta$, conditional on $x$, $\pi(\theta|x)$
iid: independent and identically distributed
$\lambda(dx)$: Lebesgue measure, also denoted by $d\lambda(x)$
$P_\theta$: probability distribution, indexed by the parameter $\theta$
$p * q$: convolution product of the distributions $p$ and $q$, that is, distribution of the sum of $X \sim p$ and $Y \sim q$
$p^{*n}$: convolution $n$th power, that is, distribution of the sum of $n$ iid rv's distributed from $p$
$\varphi(t)$: density of the normal distribution $\mathcal{N}(0, 1)$
$\Phi(t)$: cumulative distribution function of the normal distribution $\mathcal{N}(0, 1)$
$O(n)$, $o(n)$, $O_p(n)$, $o_p(n)$: big "oh," little "oh": as $n \to \infty$, $O(n)/n$ converges to a constant and $o(n)/n \to 0$; the subscript $p$ denotes in probability

B.3 Distributions

$\mathcal{B}(n, p)$: binomial distribution
$\mathcal{B}e(\alpha, \beta)$: beta distribution
$\mathcal{C}(\theta, \sigma^2)$: Cauchy distribution

$\mathcal{D}_k(\alpha_1, \ldots, \alpha_k)$: Dirichlet distribution
$\mathcal{E}xp(\lambda)$: exponential distribution
$\mathcal{F}(p, q)$: Fisher's $F$ distribution
$\mathcal{G}a(\alpha, \beta)$: gamma distribution
$\mathcal{IG}(\alpha, \beta)$: inverse gamma distribution

$\chi^2_p$: chi squared distribution
$\chi^2_p(\lambda)$: noncentral chi squared distribution with noncentrality parameter $\lambda$
$\mathcal{M}_k(n; p_1, \ldots, p_k)$: multinomial distribution
$\mathcal{N}(\theta, \sigma^2)$: univariate normal distribution
$\mathcal{N}_p(\theta, \Sigma)$: multivariate normal distribution
$\mathcal{N}eg(n, p)$: negative binomial distribution
$\mathcal{P}(\lambda)$: Poisson distribution
$\mathcal{P}a(x_0, \alpha)$: Pareto distribution
$\mathcal{T}_p(\nu, \theta, \Sigma)$: multivariate Student's $t$ distribution
$\mathcal{U}_{[a,b]}$: continuous uniform distribution
$\mathcal{W}e(\alpha, c)$: Weibull distribution
$\mathcal{W}_k(p, \Sigma)$: Wishart distribution

B.4 Markov Chains

$\alpha$: atom
AR($p$): autoregressive process of order $p$
ARMA($p, q$): autoregressive moving average process of order $(p, q)$
$C$: small set
$d(\alpha)$: period of the state or atom $\alpha$
$\dagger$: "dagger," absorbing state
$\Delta V(x)$: drift of $V$
$\mathbb{E}_\mu[h(X_n)]$: expectation associated with $P_\mu$

$\mathbb{E}_{x_0}[h(X_n)]$: expectation associated with $P_{x_0}$
$\eta_A$: total number of passages in $A$
$Q(x, A)$: probability that $\eta_A$ is infinite, starting from $x$
$\gamma_g^2$: asymptotic variance of $S_N(g)$ for the Central Limit Theorem
$K_\varepsilon$: kernel of the resolvent
MA($q$): moving average process of order $q$
$L(x, A)$: probability of return to $A$ starting from $x$

$\nu$, $\nu_m$: minorizing measure for an atom or small set
$P(x, A)$: transition kernel
$P^m(x, A)$: transition kernel of the chain $(X_{mn})_n$
$P_\mu(\cdot)$: probability distribution of the chain $(X_n)$ with initial state $X_0 \sim \mu$
$P_{x_0}(\cdot)$: probability distribution of the chain $(X_n)$ with initial state $X_0 = x_0$
$\pi$: invariant measure
$S_N(g)$: empirical average of $g(x_i)$ for $1 \le i \le N$
$\tau_{p,q}$: coupling time for the initial distributions $p$ and $q$
$\tau_A$: return time to $A$
$\tau_A(k)$: $k$th return time to $A$
$U(x, A)$: average number of passages in $A$, starting from $x$

$(X_n)$: generic element of a Markov chain
$(\check{X}_n)$: augmented or split chain

B.5 Statistics

$x, y$: realized values (lowercase) of the random variables $X$ and $Y$ (uppercase)
$\mathcal{X}, \mathcal{Y}$: sample spaces (uppercase script Roman letters)
$\theta, \lambda$: parameters (lowercase Greek letters)
$\Theta, \Lambda$: parameter spaces (uppercase script Greek letters)
$B^\pi(x)$: Bayes factor
$\delta^{JS}(x)$: James-Stein estimator
$\delta^\pi(x)$: Bayes estimator
$\delta^+(x)$: positive-part James-Stein estimator
$H_0$: null hypothesis
$I(\theta)$: Fisher information
$L(\theta, \delta)$: loss function, loss of estimating $\theta$ with $\delta$
$L(\theta|x)$: likelihood function, a function of $\theta$ for fixed $x$, mathematically identical to $f(x|\theta)$
$\ell(\theta|x)$: logarithm of the likelihood function
$L^P(\theta|x)$, $\ell^P(\theta|x)$: profile likelihood
$m(x)$: marginal density
$\pi(\theta)$: generic prior density for $\theta$
$\pi_J(\theta)$: Jeffreys prior density for $\theta$
$\pi(\theta|x)$: generic posterior density of $\theta$
$\bar{x}$: sample mean
$s^2$: sample variance
$X^*, Y^*, x^*, y^*$: latent or missing variables (data)

B.6 Algorithms

$[A_n]$: symbol of the $n$th algorithm
$B$: backward operator
$B_T$: inter-chain variance after $T$ iterations
$W_T$: intra-chain variance after $T$ iterations
$D_i^T$: cumulative sum to $i$, for $T$ iterations
$\delta^{rb}$, $\delta^{RB}$: Rao-Blackwellized estimators
$F$: forward operator
$g_i(x_i|x_j, j \ne i)$: conditional density for Gibbs sampling
$r(\theta, \xi)$: regeneration probability
$K(x, y)$: transition kernel
$\tilde{K}$: transition kernel for a mixture algorithm
$K^*$: transition kernel for a cycle of algorithms

$\lambda_k$: duration of excursion
$q(x|y)$: transition kernel, typically used for an instrumental variable
$\rho(x, y)$: acceptance probability for a Metropolis-Hastings algorithm
empirical average
conditional version of the empirical average
recycled version of the empirical average
importance sampling version of the empirical average
Riemann version of the empirical average
spectral density of the function $h$

References

Aarts, E. and Korst, J. (1989). Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimisation and Neural Computing. John Wiley, New York.
Abramowitz, M. and Stegun, I. (1964). Handbook of Mathematical Functions. Dover, New York.
Ackley, D., Hinton, G., and Sejnowski, T. (1985). A learning algorithm for Boltzmann machines. Cognitive Science, 9:147-169.
Adams, M. (1987). William Ockham. University of Notre Dame Press, Notre Dame, Indiana.
Agresti, A. (1992). A survey of exact inference for contingency tables (with discussion). Statist. Science, 7:131-177.
Agresti, A. (1996). An Introduction to Categorical Data Analysis. John Wiley, New York.
Ahrens, J. and Dieter, U. (1974). Computer methods for sampling from gamma, beta, Poisson and binomial distributions. Computing, 12:223-246.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, AC-19:716-723.
Albert, J. and Chib, S. (1993). Bayes inference via Gibbs sampling of autoregressive time series subject to Markov mean and variance shifts. J. Business Economic Statistics, 11:1-15.
Aldous, D. (1987). On the Markov chain simulation method for uniform combinatorial distributions and simulated annealing. Pr. Eng. Inform. Sciences, 1:33-46.
Andrieu, C. and Robert, C. (2001). Controlled Markov chain Monte Carlo methods for optimal sampling. Technical Report 0125, Univ. Paris Dauphine.
Arnold, B. and Press, S. (1989). Compatible conditional distributions. J. American Statist. Assoc., 84:152-156.
Asmussen, S. (1979). Applied Probability and Queues. John Wiley, New York.

Asmussen, S., Glynn, P., and Thorisson, H. (1992). Stationarity detection in the initial transient problem. ACM Trans. Modelling and Computer Simulations, 2:130-157.
Athreya, K., Doss, H., and Sethuraman, J. (1996). On the convergence of the Markov chain simulation method. Ann. Statist., 24:69-100.
Athreya, K. and Ney, P. (1978). A new approach to the limit theory of recurrent Markov chains. Trans. Amer. Math. Soc., 245:493-501.
Atkinson, A. (1979). The computer generation of Poisson random variables. Appl. Statist., 28:29-35.
Ball, F., Cai, Y., Kadane, J., and O'Hagan, A. (1999). Bayesian inference for ion channel gating mechanisms directly from single channel recordings, using Markov chain Monte Carlo. Proc. Royal Society London A, 455:2879-2932.
Barbe, P. and Bertail, P. (1995). The Weighted Bootstrap, volume 98 of Lecture Notes in Statistics. Springer-Verlag, New York.
Barndorff-Nielsen, O. (1991). Modified signed log-likelihood ratio. Biometrika, 78:557-563.
Barndorff-Nielsen, O. and Cox, D. R. (1994). Inference and Asymptotics. Chapman and Hall, New York.
Barnett, G., Kohn, R., and Sheather, S. (1996). Bayesian estimation of an autoregressive model using Markov chain Monte Carlo. J. Econometrics, 74:237-254.
Barone, P. and Frigessi, A. (1989). Improving stochastic relaxation for Gaussian random fields. Biometrics, 47:1473-1487.
Barry, D. and Hartigan, J. (1992). Product partition models for change point problems. Ann. Statist., 20:260-279.
Barry, D. and Hartigan, J. (1993). A Bayesian analysis of change point problems. J. American Statist. Assoc., 88:309-319.
Baum, L. and Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. Ann. Mathemat. Statist., 37:1554-1563.
Baum, L., Petrie, T., Soules, G., and Weiss, N. (1970). A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. Ann. Mathemat. Statist., 41:164-171.
Bauwens, L. (1984). Bayesian Full Information Analysis of Simultaneous Equations Models Using Integration by Monte Carlo, volume 232 of Lecture Notes in Economics and Mathematical Systems. Springer-Verlag, New York.
Bauwens, L. (1991). The "pathology" of the natural conjugate prior density in the regression model. Ann. Econom. Statist., 23:49-64.
Bauwens, L. and Richard, J. (1985). A 1-1 poly-t random variable generator with application to Monte Carlo integration. J. Econometrics, 29:19-46.
Bennett, J., Racine-Poon, A., and Wakefield, J. (1996). MCMC for nonlinear hierarchical models. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 339-358. Chapman and Hall, New York.

Benveniste, A., Metivier, M., and Priouret, P. (1990). Adaptive Algorithms and Stochastic Approximations. Springer-Verlag, New York, Berlin-Heidelberg.
Berge, P., Pommeau, Y., and Vidal, C. (1984). Order Within Chaos. John Wiley, New York.
Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, New York, second edition.
Berger, J. (1990). Robust Bayesian analysis: sensitivity to the prior. J. Statist. Plann. Inference, 25:303-328.
Berger, J. (1994). An overview of robust Bayesian analysis (with discussion). TEST, 3:5-124.
Berger, J. and Bernardo, J. (1992). On the development of the reference prior method. In Berger, J., Bernardo, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 35-49. Oxford University Press, London.
Berger, J. and Pericchi, L. (1998). Accurate and stable Bayesian model selection: the intrinsic Bayes factor. Sankhya B, 60:1-18.
Berger, J., Philippe, A., and Robert, C. (1998). Estimation of quadratic functions: reference priors for non-centrality parameters. Statistica Sinica, 8(2):359-375.
Berger, J. and Wolpert, R. (1988). The Likelihood Principle, volume 9 of IMS Lecture Notes - Monograph Series. IMS, Hayward, second edition.
Bernardo, J. (1979). Reference posterior distributions for Bayesian inference (with discussion). J. Royal Statist. Soc. Series B, 41:113-147.
Bernardo, J. and Giron, F. (1986). A Bayesian approach to cluster analysis. In Second Catalan International Symposium on Statistics, Barcelona, Spain.
Bernardo, J. and Giron, F. (1988). A Bayesian analysis of simple mixture problems. In Bernardo, J., DeGroot, M., Lindley, D., and Smith, A., editors, Bayesian Statistics 3, pages 67-78. Oxford University Press, Oxford.
Bernardo, J. and Smith, A. (1994). Bayesian Theory. John Wiley, New York.
Berthelsen, K. and Moller, J. (2003). Likelihood and non-parametric Bayesian MCMC inference for spatial point processes based on perfect simulation and path sampling. Scandinavian J. Statist., 30:549-564.
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Royal Statist. Soc. Series B, 36:192-236.
Besag, J. (1986). On the statistical analysis of dirty pictures. J. Royal Statist. Soc. Series B, 48:259-279.
Besag, J. (1989). Towards Bayesian image analysis. J. Applied Statistics, 16:395-407.
Besag, J. (1994). Discussion of "Markov chains for exploring posterior distributions". Ann. Statist., 22:1734-1741.
Besag, J. (2000). Markov chain Monte Carlo for statistical inference. Technical Report 9, U. Washington, Center for Statistics and the Social Sciences.
Besag, J. and Clifford, P. (1989). Generalized Monte Carlo significance tests. Biometrika, 76:633-642.

Besag, J., Green, E., Higdon, D., and Mengersen, K. (1995). Bayesian computation and stochastic systems (with discussion). Statistical Science, 10:3-66.
Best, N., Cowles, M., and Vines, K. (1995). CODA: Convergence diagnosis and output analysis software for Gibbs sampling output, version 0.30. Technical report, MRC Biostatistics Unit, Univ. of Cambridge.
Billingsley, P. (1968). Convergence of Probability Measures. John Wiley, New York.
Billingsley, P. (1995). Probability and Measure. John Wiley, New York, third edition.
Billio, M. and Monfort, A. (1998). Switching state space models. J. Statist. Plann. Inference, 68(1):65-103.
Billio, M., Monfort, A., and Robert, C. (1998). The simulated likelihood ratio method. Technical Report 9821, CREST, INSEE, Paris.
Billio, M., Monfort, A., and Robert, C. (1999). Bayesian estimation of switching ARMA models. J. Econometrics, 93:229-255.
Boch, R. and Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46:443-459.
Bollerslev, T., Chou, R., and Kroner, K. (1992). ARCH modeling in finance: a review of the theory and empirical evidence. J. Econometrics, 52:5-59.
Booth, J., Hobert, J., and Ohman, P. (1999). On the probable error of the ratio of two gamma means. Biometrika, 86:439-452.
Borchers, D., Zucchini, W., and Buckland, S. (2002). Estimating Animal Abundance: Closed Populations. Springer-Verlag, New York.
Borovkov, A. and Foss, S. (1992). Stochastically recursive sequences and their generalizations. Siberian Adv. Math., 2(1):16-81.
Bouleau, N. and Lepingle, D. (1994). Numerical Methods for Stochastic Processes. John Wiley, New York.
Box, G. and Muller, M. (1958). A note on the generation of random normal variates. Ann. Mathemat. Statist., 29:610-611.
Boyles, R. (1983). On the convergence of the EM algorithm. J. Royal Statist. Soc. Series B, 45:47-50.
Bradley, R. (1986). Basic properties of strong mixing conditions. In Eberlein, E. and Taqqu, M., editors, Dependence in Probability and Statistics, pages 165-192. Birkhäuser, Boston.
Breslow, N. and Clayton, D. (1993). Approximate inference in generalized linear mixed models. J. American Statist. Assoc., 88:9-25.
Breyer, L. and Roberts, G. (2000a). Catalytic perfect simulation. Technical report, Department of Statistics, Univ. of Lancaster.
Breyer, L. and Roberts, G. (2000b). Some multistep coupling constructions for Markov chains. Technical report, Department of Statistics, Univ. of Lancaster.
Brockwell, P. and Davis, P. (1996). Introduction to Time Series and Forecasting. Springer Texts in Statistics. Springer-Verlag, New York.

Broniatowski, M., Celeux, G., and Diebolt, J. (1984). Reconnaissance de melanges de densites par un algorithme d'apprentissage probabiliste. In Diday, E., editor, Data Analysis and Informatics, volume 3, pages 359-373. North-Holland, Amsterdam.
Brooks, S. (1998a). Markov chain Monte Carlo and its application. The Statistician, 47:69-100.
Brooks, S. (1998b). MCMC convergence diagnosis via multivariate bounds on log-concave densities. Ann. Statist., 26:398-433.
Brooks, S. (1998c). Quantitative convergence diagnosis for MCMC via cusums. Statistics and Computing, 8(3):267-274.
Brooks, S., Dellaportas, P., and Roberts, G. (1997). A total variation method for diagnosing convergence of MCMC algorithms. J. Comput. Graph. Statist., 6:251-265.
Brooks, S., Fan, Y., and Rosenthal, J. (2002). Perfect forward simulation via simulation tempering. Technical report, Department of Statistics, Univ. of Cambridge.
Brooks, S. and Gelman, A. (1998a). General methods for monitoring convergence of iterative simulations. J. Comput. Graph. Statist., 7(4):434-455.
Brooks, S. and Gelman, A. (1998b). Some issues in monitoring convergence of iterative simulations. Technical report, Univ. of Bristol.
Brooks, S. and Giudici, P. (2000). MCMC convergence assessment via two-way ANOVA. J. Comput. Graph. Statist., 9:266-285.
Brooks, S., Giudici, P., and Philippe, A. (2003a). Nonparametric convergence assessment for MCMC model selection. J. Comput. Graph. Statist., 12(1):1-22.
Brooks, S., Giudici, P., and Roberts, G. (2003b). Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions (with discussion). J. Royal Statist. Soc. Series B, 65(1):3-55.
Brooks, S. and Roberts, G. (1998). Assessing convergence of Markov chain Monte Carlo algorithms. Statistics and Computing, 8:319-335.
Brooks, S. and Roberts, G. (1999). On quantile estimation and MCMC convergence. Biometrika, 86:710-717.
Brown, L. (1971). Admissible estimators, recurrent diffusions, and insoluble boundary-value problems. Ann. Mathemat. Statist., 42:855-903.
Brown, L. (1986). Foundations of Exponential Families, volume 9 of IMS Lecture Notes - Monograph Series. IMS, Hayward.
Brownie, C., Hines, J., Nichols, J., Pollock, K., and Hestbeck, J. (1993). Capture-recapture studies for multiple strata including non-Markovian transition probabilities. Biometrics, 49:1173-1187.
Bucklew, J. (1990). Large Deviation Techniques in Decision, Simulation and Estimation. John Wiley, New York.
Cappe, O. (2001). Recursive computation of smoothed functionals of hidden Markovian processes using a particle approximation. Monte Carlo Methods and Applications, 7(1-2):81-92.

Cappe, O., Guillin, A., Marin, J., and Robert, C. (2004). Population Monte Carlo. J. Comput. Graph. Statist. (to appear).
Cappe, O., Moulines, E., and Ryden, T. (2004). Hidden Markov Models. Springer-Verlag, New York.
Cappe, O., Robert, C., and Ryden, T. (2003). Reversible jump, birth-and-death, and more general continuous time MCMC samplers. J. Royal Statist. Soc. Series B, 65(3):679-700.
Carlin, B. and Chib, S. (1995). Bayesian model choice through Markov chain Monte Carlo. J. Roy. Statist. Soc. (Ser. B), 57(3):473-484.
Carlin, B., Gelfand, A., and Smith, A. (1992). Hierarchical Bayesian analysis of change point problems. Applied Statistics (Ser. C), 41:389-405.
Carlin, B. and Louis, T. (1996). Bayes and Empirical Bayes Methods for Data Analysis. Chapman and Hall, New York.
Carlin, B. and Louis, T. (2001). Bayes and Empirical Bayes Methods for Data Analysis. Chapman and Hall, New York, second edition.
Carpenter, J., Clifford, P., and Fernhead, P. (1999). Improved particle filter for nonlinear problems. IEE Proc.-Radar, Sonar Navig., 146:2-4.
Carroll, R. and Lombard, F. (1985). A note on n estimators for the binomial distribution. J. American Statist. Assoc., 80:423-426.
Casella, G. (1986). Stabilizing binomial n estimators. J. American Statist. Assoc., 81:171-175.
Casella, G. (1996). Statistical theory and Monte Carlo algorithms (with discussion). TEST, 5:249-344.
Casella, G. and Berger, R. (1994). Estimation with selected binomial information or do you really believe that Dave Winfield is batting .471? J. American Statist. Assoc., 89:1080-1090.
Casella, G. and Berger, R. (2001). Statistical Inference. Wadsworth, Belmont, CA.
Casella, G. and George, E. (1992). An introduction to Gibbs sampling. The American Statistician, 46:167-174.
Casella, G., Lavine, M., and Robert, C. (2001). Explaining the perfect sampler. The American Statistician, 55(4):299-305.
Casella, G., Mengersen, K., Robert, C., and Titterington, D. (2002). Perfect slice samplers for mixtures of distributions. J. Royal Statist. Soc. Series B, 64(4):777-790.
Casella, G. and Robert, C. (1996). Rao-Blackwellisation of sampling schemes. Biometrika, 83(1):81-94.
Casella, G. and Robert, C. (1998). Post-processing accept-reject samples: recycling and rescaling. J. Comput. Graph. Statist., 7(2):139-157.
Casella, G., Robert, C., and Wells, M. (1999). Mixture models, latent variables and partitioned importance sampling. Technical report, CREST.
Castledine, B. (1981). A Bayesian analysis of multiple-recapture sampling for a closed population. Biometrika, 67:197-210.

Celeux, G., Chauveau, D., and Diebolt, J. (1996). Stochastic versions of the EM algorithm: An experimental study in the mixture case. J. Statist. Comput. Simul., 55(4):287-314.
Celeux, G. and Clairambault, J. (1992). Estimation de chaines de Markov cachees: methodes et problemes. In Approches Markoviennes en Signal et Images, pages 5-19, CNRS, Paris. GDR CNRS Traitement du Signal et Images.
Celeux, G. and Diebolt, J. (1985). The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Comput. Statist. Quart., 2:73-82.
Celeux, G. and Diebolt, J. (1990). Une version de type recuit simule de l'algorithme EM. Comptes Rendus Acad. Sciences Paris, 310:119-124.
Celeux, G. and Diebolt, J. (1992). A classification type EM algorithm for the mixture problem. Stochastics and Stochastics Reports, 41:119-134.
Celeux, G., Hurn, M., and Robert, C. (2000). Computational and inferential difficulties with mixtures posterior distribution. J. American Statist. Assoc., 95(3):957-979.
Celeux, G., Marin, J., and Robert, C. (2003). Iterated importance sampling in missing data problems. Technical Report 0326, Univ. Paris Dauphine.
Chaitin, G. J. (1982). Algorithmic information theory. Encyclopedia of Statistical Science, 1:38-41.
Chaitin, G. J. (1988). Randomness in arithmetic. Scient. Amer., 259:80-85.
Chan, K. and Geyer, C. (1994). Discussion of "Markov chains for exploring posterior distribution". Ann. Statist., 22:1747-1758.
Chauveau, D., Diebolt, J., and Robert, C. (1998). Control by the Central Limit theorem. In Robert, C., editor, Discretization and MCMC Convergence Assessment, volume 135 of Lecture Notes in Statistics, chapter 5, pages 99-126. Springer-Verlag, New York.
Chen, M. and Schmeiser, B. (1993). Performances of the Gibbs, hit-and-run, and Metropolis samplers. J. Comput. Graph. Statist., 2:251-272.
Chen, M. and Schmeiser, B. (1998). Towards black-box sampling. J. Comput. Graph. Statist., 7:1-22.
Chen, M. and Shao, Q. (1997). On Monte Carlo methods for estimating ratios of normalizing constants. Ann. Statist., 25:1563-1594.
Cheng, B. and Titterington, D. (1994). Neural networks: A review from a statistical point of view (with discussion). Stat. Science, 9:2-54.
Cheng, R. (1977). The generation of gamma variables with non-integral shape parameter. Applied Statistics (Ser. C), 26:71-75.
Cheng, R. and Feast, G. (1979). Some simple gamma variate generators. Appl. Statist., 28:290-295.
Chib, S. (1995). Marginal likelihood from the Gibbs output. J. American Statist. Assoc., 90:1313-1321.
Chib, S. (1996). Calculating posterior distributions and modal estimates in Markov mixture models. J. Econometrics, 75:79-97.

Chib, S. and Jeliazkov, I. (2001). Marginal likelihood from the Metropolis-Hastings output. J. American Statist. Assoc., 96:270-281.
Chib, S., Nadari, F., and Shephard, N. (2002). Markov chain Monte Carlo methods for stochastic volatility models. J. Econometrics, 108:281-316.
Chopin, N. (2002). A sequential particle filter method for static models. Biometrika, 89:539-552.
Chopin, N. (2004). Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Ann. Statist. (to appear).
Chung, K. (1960). Markov Processes with Stationary Transition Probabilities. Springer-Verlag, Heidelberg.
Churchill, G. (1989). Stochastic models for heterogeneous DNA sequences. Bull. Math. Biol., 51:79-94.
Churchill, G. (1995). Accurate restoration of DNA sequences (with discussion). In Gatsonis, C., Hodges, J., Kass, R., and Singpurwalla, N., editors, Case Studies in Bayesian Statistics, volume 2, pages 90-148. Springer-Verlag, New York.
Cipra, B. (1987). An introduction to the Ising model. American Mathematical Monthly, 94:937-959.
Clapp, T. and Godsill, S. (2001). Improvement strategies for Monte Carlo particle filters. In Doucet, A., de Freitas, N., and Gordon, N., editors, Sequential Monte Carlo Methods in Practice. Springer-Verlag, New York.
Clarke, B. and Barron, A. (1990). Information-theoretic asymptotics of Bayes methods. IEEE Trans. Information Theory, 36:453-471.
Clarke, B. and Wasserman, L. (1993). Noninformative priors and nuisance parameters. J. American Statist. Assoc., 88:1427-1432.
Cocozza-Thivent, C. and Guedon, Y. (1990). Explicit state occupancy modeling by hidden semi-Markov models: application of Derin's scheme. Comput. Speech Lang., 4:167-192.
Copas, J. B. (1975). On the unimodality of the likelihood for the Cauchy distribution. Biometrika, 62:701-704.
Corcoran, J. and Tweedie, R. (2002). Perfect sampling from independent Metropolis-Hastings chains. J. Statist. Plann. Inference, 104(2):297-314.
Cowles, M. and Carlin, B. (1996). Markov chain Monte Carlo convergence diagnostics: a comparative study. J. American Statist. Assoc., 91:883-904.
Cowles, M. and Rosenthal, J. (1998). A simulation approach to convergence rates for Markov chain Monte Carlo. Statistics and Computing, 8:115-124.
Cox, D. R. and Snell, E. J. (1981). Applied Statistics. Chapman and Hall, New York.
Crawford, S., DeGroot, M., Kadane, J., and Small, M. (1992). Modelling lake-chemistry distributions: Approximate Bayesian methods for estimating a finite-mixture model. Technometrics, 34:441-453.
Cressie, N. (1993). Spatial Statistics. John Wiley, New York.
Crisan, D., Del Moral, P., and Lyons, T. (1999). Discrete filtering using branching and interacting particle systems. Markov Processes and Related Fields, 5(3):293-318.

Dalal, S., Fowlkes, E., and Hoadley, B. (1989). Risk analysis of the space shuttle: pre-Challenger prediction of failure. J. American Statist. Assoc., 84:945-957. Damien, P., Wakefield, J., and Walker, S. (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. Royal Statist. Soc. Series B, 61(2):331-344. Daniels, H. (1954). Saddlepoint approximations in statistics. Ann. Mathemat. Statist., 25:631-650. Daniels, H. (1983). Saddlepoint approximations for estimating equations. Biometrika, 70:89-96. Daniels, H. (1987). Tail probability approximations. International Statistical Review, 55:37-48. Davies, R. (1977). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika, 64:247-254. Dawid, A. and Lauritzen, S. (1993). Hyper Markov laws in the statistical analysis of decomposable graphical models. Ann. Statist., 21:1272-1317. Decoux, C. (1997). Estimation de modeles de semi-chaines de Markov cachees par echantillonnage de Gibbs. Revue de Statistique Appliquee, 45:71-88. Del Moral, P. (2001). On a class of discrete generation interacting particle systems. Electronic J. Probab., 6:1-26. Del Moral, P. and Doucet, A. (2003). Sequential Monte Carlo samplers. Tech• nical report, Dept. of Engineering, Cambridge Univ. Del Moral, P. and Miclo, L. (2000). Branching and interacting particle sys• tems approximations of Feynman-Kac formulae with applications to non• linear filtering. In Azema, J., Emery, M., Ledoux, M., and Yor, M., editors, Seminaire de Probabilites XXXIV, volume 1729 of Lecture Notes in Math• ematics, pages 1-145. Springer-Verlag, Berlin. Dellaportas, P. (1995). Random variate transformations in the Gibbs sampler: Issues of efficiency and convergence. Statis. Comput., 5:133-140. Dellaportas, P. and Forster, J. (1996). Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Technical report, Univ. of Southampton. Dellaportas, P. 
and Smith, A. (1993). Bayesian inference for generalised linear and proportional hazards models via Gibbs sampling. Appl. Statist., 42:443-459. Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. Royal Statist. Soc. Series B, 39:1-38. Denison, D., Mallick, B., and Smith, A. (1998). Automatic Bayesian curve fitting. J. Royal Statist. Soc. Series B, 60:333-350. D'Epifanio, G. (1989). Un approccio all'inferenza basato sulla ricerca di un punto fisso. Metron, 47:1-4. D'Epifanio, G. (1996). Notes on a recursive procedure for point estimate. TEST, 5:203-225.

Dette, H. and Studden, W. (1997). The Theory of Canonical Moments with Applications in Statistics, Probability and Analysis. John Wiley, New York. Devroye, L. (1981). The computer generation of Poisson random variables. Computing, 26:197-207. Devroye, L. (1985). Non-Uniform Random Variate Generation. Springer-Verlag, New York. Diaconis, P. and Hanlon, P. (1992). Eigen analysis for some examples of the Metropolis algorithm. Contemporary Mathematics, 138:99-117. Diaconis, P. and Holmes, S. (1994). Gray codes for randomization procedures. Statistics and Computing, 4:287-302. Diaconis, P. and Sturmfels, B. (1998). Algebraic algorithms for sampling from conditional distributions. Ann. Statist., 26:363-397. Diaconis, P. and Ylvisaker, D. (1979). Conjugate priors for exponential families. Ann. Statist., 7:269-281. DiCiccio, T. J. and Martin, M. A. (1993). Simple modifications for signed roots of likelihood ratio statistics. J. Royal Statist. Soc. Series B, 55:305-316. Diebold, D. and Nerlove, M. (1989). The dynamics of exchange rate volatility: a multivariate latent factor ARCH model. J. Appl. Econom., 4:1-22. Diebolt, J. and Ip, E. (1996). Stochastic EM: method and application. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 259-274. Chapman and Hall, New York. Diebolt, J. and Robert, C. (1990a). Bayesian estimation of finite mixture distributions, part I: Theoretical aspects. Technical Report 110, LSTA, Univ. Paris VI, Paris. Diebolt, J. and Robert, C. (1990b). Bayesian estimation of finite mixture distributions, part II: Sampling implementation. Technical Report 111, LSTA, Univ. Paris VI, Paris. Diebolt, J. and Robert, C. (1990c). Estimation des parametres d'un melange par echantillonnage bayesien. Notes aux Comptes-Rendus de l'Academie des Sciences I, 311:653-658. Diebolt, J. and Robert, C. (1993). Discussion of "Bayesian computations via the Gibbs sampler" by A.F.M. Smith and G.O. Roberts. J.
Royal Statist. Soc. Series B, 55:71-72. Diebolt, J. and Robert, C. (1994). Estimation of finite mixture distributions by Bayesian sampling. J. Royal Statist. Soc. Series B, 56:363-375. Dimakos, X. K. (2001). A guide to exact simulation. International Statistical Review, 69(1):27-48. Dobson, A. (1983). An Introduction to Statistical Modelling. Chapman and Hall, New York. Doob, J. (1953). Stochastic Processes. John Wiley, New York. Douc, R. (2004). Performances asymptotiques de certains schemas d'echantillonnage. Ecole Polytechnique, Palaiseau, France. Doucet, A., de Freitas, N., and Gordon, N. (2001). Sequential Monte Carlo Methods in Practice. Springer-Verlag, New York.

Doucet, A., Godsill, S., and Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197-208. Doucet, A., Godsill, S., and Robert, C. (2002). Marginal maximum a posteriori estimation using Markov chain Monte Carlo. Statistics and Computing, 12:77-84. Doukhan, P., Massart, P., and Rio, E. (1994). The functional central limit theorem for strongly mixing processes. Annales de l'I.H.P., 30:63-82. Draper, D. (1998). Bayesian Hierarchical Modeling. In preparation. Duflo, M. (1996). Random iterative models. In Karatzas, I. and Yor, M., editors, Applications of Mathematics, volume 34. Springer-Verlag, Berlin. Dupuis, J. (1995). Bayesian estimation of movement probabilities in open populations using hidden Markov chains. Biometrika, 82(4):761-772. Durrett, R. (1991). Probability: Theory and Examples. Wadsworth and Brooks Cole, Pacific Grove. Dykstra, R. and Robertson, T. (1982). An algorithm for isotonic regression for two or more independent variables. Ann. Statist., 10:708-716. Eaton, M. (1992). A statistical diptych: Admissible inferences - recurrence of symmetric Markov chains. Ann. Statist., 20:1147-1179. Eberly, L. E. and Casella, G. (2003). Estimating Bayesian credible intervals. J. Statist. Plann. Inference, 112:115-132. Efron, B. (1979). Bootstrap methods: another look at the jackknife. Ann. Statist., 7:1-26. Efron, B. (1981). Nonparametric standard errors and confidence intervals (with discussion). Canadian J. Statist., 9:139-172. Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans, volume 38. SIAM, Philadelphia. Efron, B. and Tibshirani, R. (1994). An Introduction to the Bootstrap. Chapman and Hall, New York. Enders, W. (1994). Applied Econometric Time Series. John Wiley, New York. Escobar, M. and West, M. (1995). Bayesian prediction and density estimation. J. American Statist. Assoc., 90:577-588. Ethier, S. and Kurtz, T. (1986).
Markov Processes: Characterization and Convergence. John Wiley, New York. Evans, G. (1993). Practical Numerical Integration. John Wiley, New York. Evans, M. and Swartz, T. (1995). Methods for approximating integrals in Statistics with special emphasis on Bayesian integration problems. Statist. Science, 10:254-272. Everitt, B. (1984). An Introduction to Latent Variable Models. Chapman and Hall, New York. Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and its Applications. Chapman and Hall, New York. Fang, K. and Wang, Y. (1994). Number-Theoretic Methods in Statistics. Chapman and Hall, New York.

Feller, W. (1970). An Introduction to Probability Theory and its Applications, volume 1. John Wiley, New York. Feller, W. (1971). An Introduction to Probability Theory and its Applications, volume 2. John Wiley, New York. Ferguson, T. (1978). Maximum likelihood estimates of the parameters of the Cauchy distribution for samples of size 3 and 4. J. American Statist. Assoc., 73:211-213. Fernandez, R., Ferrari, P., and Garcia, N. L. (1999). Perfect simulation for interacting point processes, loss networks and Ising models. Technical report, Laboratoire Raphael Salem, Univ. de Rouen. Ferrenberg, A., Landau, D., and Wong, Y. (1992). Monte Carlo simulations: hidden errors from "good" random number generators. Physical Review Letters, 69(23):3382-3385. Field, C. and Ronchetti, E. (1990). Small Sample Asymptotics. IMS Lecture Notes - Monograph Series, Hayward, CA. Fieller, E. (1954). Some problems in interval estimation. J. Royal Statist. Soc. Series B, 16:175-185. Fill, J. (1998a). An interruptible algorithm for exact sampling via Markov chains. Ann. Applied Prob., 8:131-162. Fill, J. (1998b). The move-to-front rule: A case study for two perfect sampling algorithms. Prob. Eng. Info. Sci., 8:131-162. Finch, S., Mendell, N., and Thode, H. (1989). Probabilistic measures of adequacy of a numerical search for a global maximum. J. American Statist. Assoc., 84:1020-1023. Fishman, G. (1996). Monte Carlo. Springer-Verlag, New York. Fismen, M. (1998). Exact simulation using Markov chains. Technical Report 6/98, Institutt for Matematiske Fag, Oslo. Diploma-thesis. Fletcher, R. (1980). Practical Methods of Optimization, volume 1. John Wiley, New York. Foss, S. and Tweedie, R. (1998). Perfect simulation and backward coupling. Stochastic Models, 14:187-203. Gamboa, F. and Gassiat, E. (1997). Blind deconvolution of discrete linear systems. Ann. Statist., 24:1964-1981. Gasemyr, J. (2002).
Markov chain Monte Carlo algorithms with independent proposal distribution and their relation to importance sampling and rejection sampling. Technical Report 2, Department of Statistics, Univ. of Oslo. Gassiat, E. (1995). Deconvolution aveugle d'un systeme non-causal. Matapli, 45:45-54. Gauss, C. (1810). Methode des Moindres Carres. Memoire sur la Combinaison des Observations. Mallet-Bachelier, Paris. Transl. J. Bertrand. Gaver, D. and O'Muircheartaigh, I. (1987). Robust empirical Bayes analysis of event rates. Technometrics, 29:1-15. Gelfand, A. and Dey, D. (1994). Bayesian model choice: asymptotics and exact calculations. J. Roy. Statist. Soc. (Ser. B), 56:501-514.

Gelfand, A. and Sahu, S. (1994). On Markov chain Monte Carlo acceleration. J. Comput. Graph. Statist., 3(3):261-276. Gelfand, A., Sahu, S., and Carlin, B. (1995). Efficient parametrization for normal linear mixed models. Biometrika, 82:479-488. Gelfand, A., Sahu, S., and Carlin, B. (1996). Efficient parametrizations for generalised linear mixed models (with discussion). In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 5, pages 165-180. Oxford University Press, Oxford. Gelfand, A. and Smith, A. (1990). Sampling-based approaches to calculating marginal densities. J. American Statist. Assoc., 85:398-409. Gelman, A. (1996). Inference and monitoring convergence. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 131-143. Chapman and Hall, New York. Gelman, A., Gilks, W., and Roberts, G. (1996). Efficient Metropolis jumping rules. In Berger, J., Bernardo, J., Dawid, A., Lindley, D., and Smith, A., editors, Bayesian Statistics 5, pages 599-608. Oxford University Press, Oxford. Gelman, A. and Meng, X. (1998). Simulating normalizing constants: From importance sampling to bridge sampling to path sampling. Statist. Science, 13:163-185. Gelman, A. and Rubin, D. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statist. Science, 7:457-511. Gelman, A. and Speed, T. (1993). Characterizing a joint probability distribution by conditionals. J. Royal Statist. Soc. Series B, 55:185-188. Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6:721-741. Gentle, J. E. (2002). Elements of Computational Statistics. Springer-Verlag, New York. George, E., Makov, U., and Smith, A. (1993). Conjugate likelihood distributions. Scand. J. Statist., 20:147-156. George, E. and Robert, C. (1992). Calculating Bayes estimates for capture-recapture models.
Biometrika, 79(4):677-683. Geweke, J. (1989). Bayesian inference in econometric models using Monte Carlo integration. Econometrica, 57:1317-1340. Geweke, J. (1991). Efficient simulation from the multivariate normal and Student-t distributions subject to linear constraints. Computer Sciences and Statistics: Proc. 23rd Symp. Interface. Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments (with discussion). In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 169-193. Oxford University Press, Oxford. Geyer, C. (1992). Practical Markov chain Monte Carlo (with discussion). Statist. Science, 7:473-511.

Geyer, C. (1993). Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo. Technical Report 568, School of Statistics, Univ. of Minnesota. Geyer, C. (1994). On the convergence of Monte Carlo maximum likelihood calculations. J. R. Statist. Soc. B, 56:261-274. Geyer, C. (1995). Conditioning in Markov chain Monte Carlo. J. Comput. Graph. Statist., 4:148-154. Geyer, C. (1996). Estimation and optimization of functions. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 241-258. Chapman and Hall, New York. Geyer, C. and Møller, J. (1994). Simulation procedures and likelihood inference for spatial point processes. Scand. J. Statist., 21:359-373. Geyer, C. and Thompson, E. (1992). Constrained Monte Carlo maximum likelihood for dependent data (with discussion). J. Royal Statist. Soc. Series B, 54:657-699. Geyer, C. and Thompson, E. (1995). Annealing Markov chain Monte Carlo with applications to ancestral inference. J. American Statist. Assoc., 90:909-920. Ghosh, M., Carlin, B. P., and Srivastava, M. S. (1995). Probability matching priors for linear calibration. TEST, 4:333-357. Gilks, W. (1992). Derivative-free adaptive rejection sampling for Gibbs sampling. In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 641-649. Oxford University Press, Oxford. Gilks, W. and Berzuini, C. (2001). Following a moving target - Monte Carlo inference for dynamic Bayesian models. J. Royal Statist. Soc. Series B, 63(1):127-146. Gilks, W., Best, N., and Tan, K. (1995). Adaptive rejection Metropolis sampling within Gibbs sampling. Applied Statist. (Ser. C), 44:455-472. Gilks, W., Clayton, D., Spiegelhalter, D., Best, N., McNeil, A., Sharples, L., and Kirby, A. (1993). Modelling complexity: applications of Gibbs sampling in medicine. J. Royal Statist. Soc. Series B, 55:39-52. Gilks, W. and Roberts, G. (1996). Strategies for improving MCMC.
In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 89-114. Chapman and Hall, New York. Gilks, W., Roberts, G., and Sahu, S. (1998). Adaptive Markov chain Monte Carlo. J. American Statist. Assoc., 93:1045-1054. Gilks, W. and Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Appl. Statist., 41:337-348. Giudici, P. and Green, P. (1999). Decomposable graphical Gaussian model determination. Biometrika, 86(4):785-801. Gleick, J. (1987). Chaos. Penguin, New York. Gleser, L. (1989). The gamma distribution as a mixture of exponential distributions. Am. Statist., 43:115-117.

Gleser, L. and Hwang, J. (1987). The non-existence of 100(1-α)% confidence sets of finite expected diameters in errors-in-variables and related models. Ann. Statist., 15:1351-1362. Godsill, S. J. (2003). Discussion of "Trans-dimensional Markov chain Monte Carlo" by Peter J. Green. In Green, P., Hjort, N., and Richardson, S., editors, Highly Structured Stochastic Systems, chapter 6A. Oxford University Press, Oxford. Goldfeld, S. and Quandt, R. (1973). A Markov model for switching regressions. J. Econometrics, 1:3-16. Gordon, N., Salmond, D., and Smith, A. (1993). A novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEEE Proceedings on Radar and Signal Processing, 140:107-113. Gourieroux, C. (1996). ARCH Models. Springer-Verlag, New York. Gourieroux, C. and Monfort, A. (1996). Statistics and Econometric Models. Cambridge University Press, Cambridge. Gourieroux, C., Monfort, A., and Renault, E. (1993). Indirect inference. J. Applied Econom., 8:85-118. Goutis, C. and Casella, G. (1999). Explaining the saddlepoint approximation. The American Statistician, 53. Goutis, C. and Robert, C. (1998). Model choice in generalized linear models: a Bayesian approach via Kullback-Leibler projections. Biometrika, 85:29-37. Green, P. (1995). Reversible jump MCMC computation and Bayesian model determination. Biometrika, 82(4):711-732. Green, P. and Han, X. (1992). Metropolis methods, Gaussian proposals, and antithetic variables. In Barone, P., Frigessi, A., and Piccioni, M., editors, Stochastic Methods and Algorithms in Image Analysis, volume 74, pages 142-164. Springer-Verlag, New York, Berlin. Lecture Notes in Statistics. Green, P. and Murdoch, D. (1999). Exact sampling for Bayesian inference: towards general purpose algorithms. In Berger, J., Bernardo, J., Dawid, A., Lindley, D., and Smith, A., editors, Bayesian Statistics 6, pages 302-321. Oxford University Press, Oxford. Grenander, U. and Miller, M. (1994).
Representations of knowledge in complex systems (with discussion). J. Royal Statist. Soc. Series B, 56:549-603. Gruet, M., Philippe, A., and Robert, C. (1999). MCMC control spreadsheets for exponential mixture estimation. J. Comput. Graph. Statist., 8:298-317. Guihenneuc-Jouyaux, C. and Richardson, S. (1996). Modele de Markov avec erreurs de mesure : approche bayesienne et application au suivi longitudinal des lymphocytes T4. In Biometrie et Applications Bayesiennes. Economica, Paris. Guihenneuc-Jouyaux, C., Richardson, S., and Lasserre, V. (1998). Convergence assessment in latent variable models: application to longitudinal modelling of a marker of HIV progression. In Robert, C., editor, Discretization and MCMC Convergence Assessment, volume 135, chapter 7, pages 147-160. Springer-Verlag, New York. Lecture Notes in Statistics.

Guihenneuc-Jouyaux, C. and Robert, C. (1998). Finite Markov chain convergence results and MCMC convergence assessments. J. American Statist. Assoc., 93:1055-1067. Guillin, A., Marin, J., and Robert, C. (2004). Estimation bayesienne approximative par echantillonnage preferentiel. Revue Statist. Appliquee. (to appear). Gumbel, E. (1940). La direction d'une repartition. Ann. Univ. Lyon 3A, 2:39-51. Gumbel, E. (1958). Statistics of Extremes. Columbia University Press, New York. Haario, H. and Saksman, E. (1991). Simulated annealing in general state space. Adv. Appl. Probab., 23:866-893. Haario, H., Saksman, E., and Tamminen, J. (1999). Adaptive proposal distribution for random walk Metropolis algorithm. Computational Statistics, 14(3):375-395. Haario, H., Saksman, E., and Tamminen, J. (2001). An adaptive Metropolis algorithm. Bernoulli, 7(2):223-242. Hajek, B. (1988). Cooling schedules for optimal annealing. Math. Operations Research, 13:311-329. Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer-Verlag, New York. Hall, P. (1994). On the erratic behavior of estimators of n in the binomial n, p distribution. J. American Statist. Assoc., 89:344-352. Hammersley, J. (1974). Discussion of Besag's paper. J. Royal Statist. Soc. Series B, 36:230-231. Hammersley, J. and Clifford, P. (1970). Markov fields on finite graphs and lattices. Unpublished manuscript. Hammersley, J. and Morton, K. (1954). Poor man's Monte Carlo. J. Royal Statist. Soc. Series B, 16:23-38. Handschin, J. and Mayne, D. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage nonlinear filtering. International Journal of Control, 9:547-559. Harris, T. (1956). The existence of stationary measures for certain Markov processes. In Proc. 3rd Berkeley Symp. Math. Statist. Prob., volume 2, pages 113-124. University of California Press, Berkeley. Hastings, W. (1970). Monte Carlo sampling methods using Markov chains and their application. Biometrika, 57:97-109.
Heidelberger, P. and Welch, P. (1983). A spectral method for confidence interval generation and run length control in simulations. Comm. Assoc. Comput. Machinery, 24:233-245. Hendry, D. and Richard, J. (1991). Likelihood evaluation for dynamic latent variable models. In Amman, H., Belsley, D., and Pau, L., editors, Computational Economics and Econometrics, Dordrecht. Kluwer. Hesterberg, T. (1998). Weighted average importance sampling and defensive mixture distributions. Technometrics, 37:185-194.

Hills, S. and Smith, A. (1992). Parametrization issues in Bayesian inference. In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 641-649. Oxford University Press, Oxford. Hills, S. and Smith, A. (1993). Diagnostic plots for improved parameterization in Bayesian inference. Biometrika, 80:61-74. Hitchcock, D. H. (2003). A history of the Metropolis-Hastings algorithm. The American Statistician, 57. Hjorth, J. (1994). Computer Intensive Statistical Methods. Chapman and Hall, New York. Hobert, J. and Casella, G. (1996). The effect of improper priors on Gibbs sampling in hierarchical linear models. J. American Statist. Assoc., 91:1461-1473. Hobert, J. and Casella, G. (1998). Functional compatibility, Markov chains, and Gibbs sampling with improper posteriors. J. Comput. Graph. Statist., 7:42-60. Hobert, J., Jones, G., Presnell, B., and Rosenthal, J. (2002). On the applicability of regenerative simulation in Markov chain Monte Carlo. Biometrika, 89(4):731-743. Hobert, J. and Robert, C. (1999). Eaton's Markov chain, its conjugate partner and p-admissibility. Ann. Statist., 27:361-373. Hobert, J. and Robert, C. (2004). Moralizing perfect sampling. Ann. Applied Prob. (to appear). Hobert, J., Robert, C., and Goutis, C. (1997). Connectedness conditions for the convergence of the Gibbs sampler. Statist. Prob. Letters, 33:235-240. Hobert, J., Robert, C., and Titterington, D. (1999). On perfect simulation for some mixtures of distributions. Statistics and Computing, 9(4):287-298. Hodgson, M. (1999). A Bayesian restoration of an ion channel signal. J. Royal Statist. Soc. Series B, 61(1):95-114. Hodgson, M. and Green, P. (1999). Bayesian choice among Markov models of ion channels using Markov chain Monte Carlo. Proc. Royal Society London A, 455:3425-3448. Holden, L. (1998). Adaptive chains. Technical report, Norwegian Computing Center. Holmes, C., Denison, D., Mallick, B., and Smith, A. (2002).
Bayesian Methods for Nonlinear Classification and Regression. John Wiley, New York. Hougaard, P. (1988). Discussion of the paper by Reid. Statist. Science, 3:230-231. Huber, P. (1972). Robust statistics: A review. Ann. Mathemat. Statist., 47:1041-1067. Huerta, G. and West, M. (1999). Priors and component structures in autoregressive time series models. J. Royal Statist. Soc. Series B, 61(4):881-899. Hürzeler, M. and Künsch, H. (1998). Monte Carlo approximations for general state space models. J. Comput. Graph. Statist., 7:175-193. Hwang, C. (1980). Laplace's method revisited: Weak convergence of probability measures. Ann. Probab., 8:1177-1182.

Iba, Y. (2000). Population-based Monte Carlo algorithms. Trans. Japanese Soc. Artificial Intell., 16(2):279-286. Jacquier, E., Polson, N., and Rossi, P. (1994). Bayesian analysis of stochastic volatility models (with discussion). J. Business Economic Stat., 12:371-417. Jarrett, R. (1979). A note on the intervals between coal-mining disasters. Biometrika, 66:191-193. Jeffreys, H. (1961). Theory of Probability (3rd edition). Oxford University Press, Oxford. Jensen, J. (1995). Saddlepoint Approximations. Clarendon Press, Oxford. Johnk, M. (1964). Erzeugung von betaverteilten und gammaverteilten Zufallszahlen. Metrika, 8:5-15. Johnson, D. and Hoeting, J. (2003). Autoregressive models for capture-recapture data: A Bayesian approach. Biometrics, 59(2):341-350. Johnson, N. and Kotz, S. (1969-1972). Distributions in Statistics (4 vols.). John Wiley, New York. Johnson, N., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions (2nd edition), volume 1. John Wiley, New York. Johnson, N., Kotz, S., and Balakrishnan, N. (1995). Continuous Univariate Distributions (2nd edition), volume 2. John Wiley, New York. Jones, G. and Hobert, J. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statist. Science, 16(4):312-334. Juang, B. (1984). On the hidden Markov model and dynamic time warping for speech recognition - a unified view. BellTech, 63:1213-1243. Juang, B. and Rabiner, L. (1991). Hidden Markov models for speech recognition. Technometrics, 33:251-272. Kass, R. and Raftery, A. (1995). Bayes factors. J. American Statist. Assoc., 90:773-795. Kemeny, J. and Snell, J. (1960). Finite Markov Chains. Van Nostrand, Princeton. Kendall, W. (1998). Perfect simulation for the area-interaction point process. In Heyde, C. and Accardi, L., editors, Probability Towards 2000, pages 218-234. Springer-Verlag, New York. Kendall, W. and Møller, J. (1999).
Perfect Metropolis-Hastings simulation of locally stable point processes. Technical Report R-99-2001, Department of Mathematical Sciences, Aalborg Univ. Kendall, W. and Møller, J. (2000). Perfect simulation using dominating processes on ordered spaces, with application to locally stable point processes. Advances in Applied Probability, 32:844-865. Kennedy, W. and Gentle, J. (1980). Statistical Computing. Marcel Dekker, New York. Kersting, G. (1987). Some results on the asymptotic behavior of the Robbins-Monro process. Bull. Int. Statist. Inst., 47:327-335. Kiefer, J. and Wolfowitz, J. (1952). Stochastic estimation of the maximum of a regression function. Ann. Mathemat. Statist., 23:462-466.

Kiiveri, H. and Speed, T. (1982). Structural analysis of multivariate data: A review. In Leinhardt, S., editor, Sociological Methodology, pages 209-289. Jossey Bass, San Francisco. Kim, S., Shephard, N., and Chib, S. (1998). Stochastic volatility: Likelihood inference and comparison with ARCH models. Rev. Econom. Stud., 65:361-393. Kinderman, A., Monahan, J., and Ramage, J. (1977). Computer methods for sampling from Student's t-distribution. Math. Comput., 31:1009-1018. Kipnis, C. and Varadhan, S. (1986). Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Comm. Math. Phys., 104:1-19. Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by simulated annealing. Science, 220:671-680. Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. J. Comput. Graph. Statist., 5:1-25. Knuth, D. (1981). The Art of Computer Programming. Volume 2: Seminumerical Algorithms (2nd edition). Addison-Wesley, Reading, MA. Kolassa, J. (1994). Series Approximation Methods in Statistics. Springer-Verlag, New York. Kong, A., Liu, J., and Wong, W. (1994). Sequential imputations and Bayesian missing data problems. J. American Statist. Assoc., 89:278-288. Kong, A., McCullagh, P., Meng, X.-L., Nicolae, D., and Tan, Z. (2003). A theory of statistical models for Monte Carlo integration. J. Royal Statist. Soc. Series B, 65(3):585-618. (With discussion.). Kubokawa, T. and Robert, C. (1994). New perspectives on linear calibration. J. Mult. Anal., 51:178-200. Kuehl, R. (1994). Statistical Principles of Research Design and Analysis. Duxbury, Belmont. Kullback, S. (1968). Information Theory and Statistics. John Wiley, New York. Künsch, H. (2004). Recursive Monte Carlo filters: algorithms and theoretical analysis. Ann. Statist. (to appear). Lahiri, P., editor (2001). Model Selection. Institute of Mathematical Statistics, Lecture Notes-Monograph Series, Hayward, CA.
Laird, N., Lange, N., and Stram, D. (1987). Maximum likelihood computations with repeated measures: Application of the EM algorithm. J. American Statist. Assoc., 82:97-105. Lange, K. (1999). Numerical Analysis for Statisticians. Springer-Verlag, New York. Lauritzen, S. (1996). Graphical Models. Oxford University Press, Oxford. Lavielle, M. and Moulines, E. (1997). On a stochastic approximation version of the EM algorithm. Statist. Comput., 7:229-236. Lavine, M. and West, M. (1992). A Bayesian method for classification and discrimination. Canad. J. Statist., 20:451-461.

Le Cun, Y., Boser, D., Denker, J., Henderson, D., Howard, R., Hubbard, W., and Jackel, L. (1989). Handwritten digit recognition with a backpropagation network. In Touretzky, D., editor, Advances in Neural Information Processing Systems II, pages 396-404. Morgan Kaufmann, San Mateo, CA. Lee, P. (2004). Bayesian Statistics: an Introduction. Oxford University Press, Oxford, third edition. Legendre, A. (1805). Nouvelles Methodes pour la Determination des Orbites des Cometes. Courcier, Paris. Lehmann, E. (1975). Nonparametrics: Statistical Methods based on Ranks. McGraw-Hill, New York. Lehmann, E. (1986). Testing Statistical Hypotheses. John Wiley, New York. Lehmann, E. (1998). Introduction to Large-Sample Theory. Springer-Verlag, New York. Lehmann, E. and Casella, G. (1998). Theory of Point Estimation (revised edition). Springer-Verlag, New York. Levine, R. (1996). Post-processing random variables. Technical report, Biometrics Unit, Cornell Univ. Ph.D. Thesis. Liang, F. (2002). Dynamically weighted importance sampling in Monte Carlo computation. J. American Statist. Assoc., 97:807-821. Lindvall, T. (1992). Lectures on the Coupling Method. John Wiley, New York. Little, R. and Rubin, D. (1983). On jointly estimating parameters and missing data by maximizing the complete-data likelihood. The American Statistician, 37(3):218-220. Little, R. and Rubin, D. (1987). Statistical Analysis with Missing Data. John Wiley, New York. Liu, C., Liu, J., and Rubin, D. (1992). A variational control variable for assessing the convergence of the Gibbs sampler. In Proc. Amer. Statist. Assoc., pages 74-78. Statistical Computing Section. Liu, C. and Rubin, D. (1994). The ECME algorithm: a simple extension of EM and ECM with faster monotonous convergence. Biometrika, 81:633-648. Liu, J. (1995). Metropolized Gibbs sampler: An improvement. Technical report, Dept. of Statistics, Stanford Univ., CA. Liu, J. (1996a).
Metropolized independent sampling with comparisons to rejection sampling and importance sampling. Statistics and Computing, 6:113-119. Liu, J. (1996b). Peskun's theorem and a modified discrete-state Gibbs sampler. Biometrika, 83:681-682. Liu, J. (2001). Monte Carlo Strategies in Scientific Computing. Springer-Verlag, New York. Liu, J. and Chen, R. (1995). Blind deconvolution via sequential imputations. J. American Statist. Assoc., 90:567-576. Liu, J., Liang, F., and Wong, W. (2001). A theory of dynamic weighting in Monte Carlo computation. J. American Statist. Assoc., 96:561-573. Liu, J. and Sabatti, C. (1999). Simulated sintering: MCMC with spaces of varying dimension. In Berger, J., Bernardo, J., Dawid, A., Lindley, D., and

Smith, A., editors, Bayesian Statistics 6, pages 389-413. Oxford University Press, Oxford. Liu, J., Wong, W., and Kong, A. (1994). Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and sampling schemes. Biometrika, 81:27-40. Liu, J., Wong, W., and Kong, A. (1995). Correlation structure and convergence rate of the Gibbs sampler with various scans. J. Royal Statist. Soc. Series B, 57:157-169. Loh, W. (1996). On Latin hypercube sampling. Ann. Statist., 24:2058-2080. Louis, T. (1982). Finding the observed information matrix when using the EM algorithm. J. Royal Statist. Soc. Series B, 44:226-233. Lugannani, R. and Rice, S. (1980). Saddlepoint approximation for the distribution of the sum of independent random variables. Adv. Appl. Probab., 12:475-490. MacEachern, S. and Berliner, L. (1994). Subsampling the Gibbs sampler. The American Statistician, 48:188-190. MacEachern, S., Clyde, M., and Liu, J. (1999). Sequential importance sampling for non-parametric Bayes models: the next generation. Technical report, Department of Statistics, Ohio State Univ. MacLachlan, G. and Basford, K. (1988). Mixture Models: Inference and Applications to Clustering. Marcel Dekker, New York. MacLachlan, G. and Krishnan, T. (1997). The EM Algorithm and Extensions. John Wiley, New York. Madigan, D. and York, J. (1995). Bayesian graphical models for discrete data. International Statistical Review, 63:215-232. Madras, N. and Slade, G. (1993). The Self-Avoiding Random Walk. Probability and its Applications. Birkhauser, Boston. Maguire, B., Pearson, E., and Wynn, A. (1952). The time interval between industrial accidents. Biometrika, 38:168-180. Marin, J., Mengersen, K., and Robert, C. (2004). Bayesian modelling and inference on mixtures of distributions. In Rao, C. and Dey, D., editors, Handbook of Statistics, volume 25 (to appear). Springer-Verlag, New York. Marinari, E. and Parisi, G. (1992). Simulated tempering: A new Monte Carlo scheme.
Europhys. Lett., 19:451. Marsaglia, G. (1964). Generating a variable from the tail of a normal distribution. Technometrics, 6:101-102. Marsaglia, G. (1977). The squeeze method for generating gamma variables. Computers and Mathematics with Applications, 3:321-325. Marsaglia, G. and Zaman, A. (1993). The KISS generator. Technical report, Dept. of Statistics, Univ. of Florida. Marshall, A. (1965). The use of multi-stage sampling schemes in Monte Carlo computations. In Symposium on Monte Carlo Methods. John Wiley, New York. Matthews, P. (1993). A slowly mixing Markov chain with implications for Gibbs sampling. Statist. Prob. Letters, 17:231-236.

McCulloch, C. (1997). Maximum likelihood algorithms for generalized linear mixed models. J. American Statist. Assoc., 92:162-170.
McCulloch, R. and Tsay, R. (1994). Statistical analysis of macroeconomic time series via Markov switching models. J. Time Series Analysis, 15:523-539.
McKay, M., Beckman, R., and Conover, W. (1979). A comparison of three methods for selecting values of output variables in the analysis of output from a computer code. Technometrics, 21:239-245.
McKeague, I. and Wefelmeyer, W. (2000). Markov chain Monte Carlo and Rao-Blackwellisation. J. Statist. Plann. Inference, 85:171-182.
Mead, R. (1988). The Design of Experiments. Cambridge University Press, Cambridge.
Mehta, C. R., Patel, N. R., and Senchaudhuri, P. (2000). Efficient Monte Carlo methods for conditional logistic regression. J. American Statist. Assoc., 95:99-108.
Meng, X. and Rubin, D. (1991). Using EM to obtain asymptotic variance-covariance matrices. J. American Statist. Assoc., 86:899-909.
Meng, X. and Rubin, D. (1992). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika, 80:267-278.
Meng, X. and van Dyk, D. (1997). The EM algorithm - an old folk-song sung to a new tune (with discussion). J. Royal Statist. Soc. Series B, 59:511-568.
Meng, X. and Wong, W. (1996). Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statist. Sinica, 6:831-860.
Mengersen, K. and Robert, C. (1996). Testing for mixtures: A Bayesian entropic approach (with discussion). In Berger, J., Bernardo, J., Dawid, A., Lindley, D., and Smith, A., editors, Bayesian Statistics 5, pages 255-276. Oxford University Press, Oxford.
Mengersen, K. and Robert, C. (2003). IID sampling with self-avoiding particle filters: the pinball sampler. In Bernardo, J., Bayarri, M., Berger, J., Dawid, A., Heckerman, D., Smith, A., and West, M., editors, Bayesian Statistics 7. Oxford University Press, Oxford.
Mengersen, K., Robert, C., and Guihenneuc-Jouyaux, C.
(1999). MCMC convergence diagnostics: a "reviewww". In Berger, J., Bernardo, J., Dawid, A., Lindley, D., and Smith, A., editors, Bayesian Statistics 6, pages 415-440. Oxford University Press, Oxford.
Mengersen, K. and Tweedie, R. (1996). Rates of convergence of the Hastings and Metropolis algorithms. Ann. Statist., 24:101-121.
Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller, E. (1953). Equation of state calculations by fast computing machines. J. Chem. Phys., 21:1087-1092.
Metropolis, N. and Ulam, S. (1949). The Monte Carlo method. J. American Statist. Assoc., 44:335-341.
Meyn, S. and Tweedie, R. (1993). Markov Chains and Stochastic Stability. Springer-Verlag, New York.
Mira, A. and Geyer, C. (1998). Ordering Monte Carlo Markov chains. Technical report, Univ. of Minnesota.

Mira, A., Møller, J., and Roberts, G. (2001). Perfect slice samplers. J. Royal Statist. Soc. Series B, 63:583-606.
Mitra, D., Romeo, F., and Sangiovanni-Vincentelli, A. (1986). Convergence and finite-time behaviour of simulated annealing. Advances in Applied Probability, 18:747-771.
Møller, J. (2001). A review of perfect simulation in stochastic geometry. In Basawa, I., Heyde, C., and Taylor, R., editors, Selected Proceedings of the Symposium on Inference for Stochastic Processes, volume 37 of IMS Lecture Notes & Monographs Series, pages 333-355. IMS.
Møller, J. (2003). Spatial Statistics and Computational Methods, volume 173 of Lecture Notes in Statistics. Springer-Verlag, New York.
Møller, J. and Nicholls, G. (2004). Perfect simulation for sample-based inference. Statistics and Computing.
Møller, J. and Waagepetersen, R. (2003). Statistical Inference and Simulation for Spatial Point Processes. Chapman and Hall/CRC, Boca Raton.
Moussouris, J. (1974). Gibbs and Markov random systems with constraints. J. Statist. Phys., 10:11-33.
Müller, P. (1991). A generic approach to posterior integration and Gibbs sampling. Technical report, Purdue Univ., West Lafayette, Indiana.
Müller, P. (1993). Alternatives to the Gibbs sampling scheme. Technical report, Institute of Statistics and Decision Sciences, Duke Univ.
Müller, P. (1999). Simulation based optimal design. In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 6, pages 459-474. Springer-Verlag, New York.
Murdoch, D. and Green, P. (1998). Exact sampling from a continuous state space. Scandinavian J. Statist., 25(3):483-502.
Murray, G. (1977). Comments on "Maximum likelihood from incomplete data via the EM algorithm". J. Royal Statist. Soc. Series B, 39:27-28.
Mykland, P., Tierney, L., and Yu, B. (1995). Regeneration in Markov chain samplers. J. American Statist. Assoc., 90:233-241.
Natarajan, R. and McCulloch, C. (1998).
Gibbs sampling with diffuse proper priors: A valid approach to data-driven inference? J. Comput. Graph. Statist., 7:267-277.
Naylor, J. and Smith, A. (1982). Application of a method for the efficient computation of posterior distributions. Applied Statistics, 31:214-225.
Neal, R. (1993). Probabilistic inference using Markov chain Monte Carlo methods. Technical report, Dept. of Computer Science, Univ. of Toronto. Ph.D. thesis.
Neal, R. (1995). Suppressing random walks in MCMC using ordered overrelaxation. Technical report, Dept. of Statistics, Univ. of Toronto.
Neal, R. (1997). Markov chain Monte Carlo methods based on "slicing" the density function. Technical report, Univ. of Toronto.
Neal, R. (1999). Bayesian Learning for Neural Networks, volume 118. Springer-Verlag, New York. Lecture Notes.
Neal, R. (2003). Slice sampling (with discussion). Ann. Statist., 31:705-767.

Newton, M. and Raftery, A. (1994). Approximate Bayesian inference by the weighted likelihood bootstrap (with discussion). J. Royal Statist. Soc. Series B, 56:1-48.
Niederreiter, H. (1992). Random Number Generation and Quasi-Monte Carlo Methods. SIAM, Philadelphia.
Nobile, A. (1998). A hybrid Markov chain for the Bayesian analysis of the multinomial probit model. Statistics and Computing, 8:229-242.
Norris, J. (1997). Markov Chains. Cambridge University Press, Cambridge.
Nummelin, E. (1978). A splitting technique for Harris recurrent chains. Zeit. Wahrsch. Verw. Gebiete, 43:309-318.
Nummelin, E. (1984). General Irreducible Markov Chains and Non-Negative Operators. Cambridge University Press, Cambridge.
Ó Ruanaidh, J. and Fitzgerald, W. (1996). Numerical Bayesian Methods Applied to Signal Processing. Springer-Verlag, New York.
Oakes, D. (1999). Direct calculation of the information matrix via the EM algorithm. J. Royal Statist. Soc. Series B, 61:479-482.
Oh, M. (1989). Integration of multimodal functions by Monte Carlo importance sampling, using a mixture as an importance function. Technical report, Dept. of Statistics, Univ. of California, Berkeley.
Oh, M. and Berger, J. (1993). Integration of multimodal functions by Monte Carlo importance sampling. J. American Statist. Assoc., 88:450-456.
Olkin, I., Petkau, A., and Zidek, J. (1981). A comparison of n estimators for the binomial distribution. J. American Statist. Assoc., 76:637-642.
Olver, F. (1974). Asymptotics and Special Functions. Academic Press, New York.
Orey, S. (1971). Limit Theorems for Markov Chain Transition Probabilities. Van Nostrand, London.
Osborne, C. (1991). Statistical calibration: a review. International Statistical Review, 59:309-336.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA.
Pearson, K. (1915).
On certain types of compound frequency distributions in which the individual components can be individually described by binomial series. Biometrika, 11:139-144.
Perron, F. (1999). Beyond accept-reject sampling. Biometrika, 86(4):803-813.
Peskun, P. (1973). Optimum Monte Carlo sampling using Markov chains. Biometrika, 60:607-612.
Peskun, P. (1981). Guidelines for choosing the transition matrix in Monte Carlo methods using Markov chains. Journal of Computational Physics, 40:327-344.
Pflug, G. (1994). Distributed stochastic approximation: Does cooperation pay? Technical Report TR-94-07, Institute of Statistics, Univ. of Wien.
Philippe, A. (1997a). Importance sampling and Riemann sums. Technical Report VI, Univ. de Lille. Prepub. IRMA.

Philippe, A. (1997b). Simulation of right and left truncated Gamma distributions by mixtures. Stat. Comput., 7:173-181.
Philippe, A. (1997c). Simulation output by Riemann sums. J. Statist. Comput. Simul., 59(4):295-314.
Philippe, A. and Robert, C. (1998a). Linking discrete and continuous chains. In Robert, C., editor, Discretization and MCMC Convergence Assessment, chapter 3, pages 47-66. Springer-Verlag, New York. Lecture Notes in Statistics.
Philippe, A. and Robert, C. (1998b). A note on the confidence properties of the reference prior for the calibration model. TEST, 7(1):147-160.
Philippe, A. and Robert, C. (2001). Riemann sums for MCMC estimation and convergence monitoring. Statistics and Computing, 11(1):103-115.
Philippe, A. and Robert, C. (2003). Perfect simulation of positive Gaussian distributions. Statistics and Computing, 13(2):179-186.
Phillips, D. and Smith, A. (1996). Bayesian model comparison via jump diffusions. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 215-240. Chapman and Hall, New York.
Pike, M. (1966). A method of analysis of a certain class of experiments in carcinogenesis. Biometrics, 22:142-161.
Pincus, M. (1968). A closed form solution of certain programming problems. Oper. Research, 18:1225-1228.
Pitt, M. and Shephard, N. (1999). Filtering via simulation: Auxiliary particle filters. J. American Statist. Assoc., 94(446):590-599.
Preston, C. (1976). Spatial birth-and-death processes. Bull. Inst. Internat. Statist., 46:371-391.
Propp, J. and Wilson, D. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms, 9:223-252.
Qian, W. and Titterington, D. (1990). Parameter estimation for hidden Gibbs chains. Statist. Prob. Letters, 10:49-58.
Qian, W. and Titterington, D. (1992). Stochastic relaxations and EM algorithms for Markov random fields. J. Statist. Comput. Simulation, 40:55-69.
Raftery, A. (1996).
Hypothesis testing and model selection. In Gilks, W., Spiegelhalter, D., and Richardson, S., editors, Markov Chain Monte Carlo in Practice, pages 163-188. Chapman and Hall, New York, London.
Raftery, A. and Akman, V. (1986). Bayesian analysis of a Poisson process with a change point. Biometrika, 73:85-89.
Raftery, A. and Banfield, J. (1991). Stopping the Gibbs sampler, the use of morphology, and other issues in spatial statistics. Ann. Inst. Statist. Math., 43:32-43.
Raftery, A. and Lewis, S. (1992). How many iterations in the Gibbs sampler? In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 763-773. Oxford University Press, Oxford.

Raftery, A. and Lewis, S. (1996). Implementing MCMC. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov chain Monte Carlo in Practice, pages 115-130. Chapman and Hall, New York.
Randles, R. and Wolfe, D. (1979). Introduction to the Theory of Nonparametric Statistics. John Wiley, New York.
Rao, C. (1973). Linear Statistical Inference and its Applications. John Wiley, New York.
Redner, R. and Walker, H. (1984). Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev., 26:195-239.
Reid, N. (1988). Saddlepoint methods and statistical inference (with discussion). Statist. Science, 3:213-238.
Reid, N. (1991). Approximations and asymptotics, pages 287-334. Chapman and Hall, New York.
Resnick, S. (1994). Adventures in Stochastic Processes. Birkhäuser, Basel.
Revuz, D. (1984). Markov Chains (2nd edition). North-Holland, Amsterdam.
Richardson, S. and Green, P. (1997). On Bayesian analysis of mixtures with an unknown number of components (with discussion). J. Royal Statist. Soc. Series B, 59:731-792.
Ripley, B. (1977). Modelling spatial patterns (with discussion). J. Royal Statist. Soc. Series B, 39:172-212.
Ripley, B. (1987). Stochastic Simulation. John Wiley, New York.
Ripley, B. (1994). Neural networks and related methods for classification (with discussion). J. Royal Statist. Soc. Series B, 56:409-456.
Ripley, B. (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge.
Ritter, C. and Tanner, M. (1992). Facilitating the Gibbs sampler: The Gibbs stopper and the Griddy-Gibbs sampler. J. American Statist. Assoc., 87:861-868.
Robbins, H. and Monro, S. (1951). A stochastic approximation method. Ann. Mathemat. Statist., 22:400-407.
Robert, C. (1991). Generalized inverse normal distributions. Statist. Prob. Lett., 11:37-41.
Robert, C. (1993). Prior feedback: A Bayesian approach to maximum likelihood estimation. Comput. Statist., 8:279-294.
Robert, C. (1994). Discussion of "Markov chains for exploring posterior distribution".
Ann. Statist., 22:1742-1747.
Robert, C. (1995a). Convergence control techniques for MCMC algorithms. Statist. Science, 10(3):231-253.
Robert, C. (1995b). Simulation of truncated Normal variables. Statistics and Computing, 5:121-125.
Robert, C. (1996a). Inference in mixture models. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov Chain Monte Carlo in Practice, pages 441-464. Chapman and Hall, New York.
Robert, C. (1996b). Intrinsic loss functions. Theory and Decision, 40(2):191-214.

Robert, C. (1997). Discussion of Richardson and Green's paper. J. Royal Statist. Soc. Series B, 59:758-764.
Robert, C. (1998). Discretization and MCMC Convergence Assessment, volume 135. Springer-Verlag, New York. Lecture Notes in Statistics.
Robert, C. (2001). The Bayesian Choice. Springer-Verlag, New York, second edition.
Robert, C. (2004). Bayesian computational methods. In Gentle, J., Härdle, W., and Mori, Y., editors, Handbook of Computational Statistics. Springer-Verlag, New York.
Robert, C. and Hwang, J. (1996). Maximum likelihood estimation under order constraints. J. American Statist. Assoc., 91:167-173.
Robert, C. and Mengersen, K. (1999). Reparametrization issues in mixture estimation and their bearings on the Gibbs sampler. Comput. Statist. Data Anal., 29:325-343.
Robert, C., Ryden, T., and Titterington, D. (1999). Convergence controls for MCMC algorithms, with applications to hidden Markov chains. J. Statist. Computat. Simulat., 64:327-355.
Robert, C. and Soubiran, C. (1993). Estimation of a mixture model through Bayesian sampling and prior feedback. TEST, 2:125-146.
Robert, C. and Titterington, M. (1998). Reparameterisation strategies for hidden Markov models and Bayesian approaches to maximum likelihood estimation. Statistics and Computing, 8(2):145-158.
Roberts, G. (1992). Convergence diagnostics of the Gibbs sampler. In Bernardo, J., Berger, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 775-782. Oxford University Press, Oxford.
Roberts, G. (1994). Methods for estimating L2 convergence of Markov chain Monte Carlo. In Berry, D., Chaloner, K., and Geweke, J., editors, Bayesian Statistics and Econometrics: Essays in Honor of Arnold Zellner. John Wiley, New York.
Roberts, G. (1998). Introduction to MCMC. Talk at the TMR Spatial Statistics Network Workshop.
Roberts, G., Gelman, A., and Gilks, W. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Applied Prob., 7:110-120.
Roberts, G.
and Polson, N. (1994). A note on the geometric convergence of the Gibbs sampler. J. Royal Statist. Soc. Series B, 56:377-384.
Roberts, G. and Rosenthal, J. (1998). Markov chain Monte Carlo: Some practical implications of theoretical results (with discussion). Canadian J. Statist., 26:5-32.
Roberts, G. and Rosenthal, J. (2003). The polar slice sampler. Stochastic Models, 18:257-280.
Roberts, G., Rosenthal, J., and Schwartz, P. (1995). Convergence properties of perturbed Markov chains. Technical report, Statistics Laboratory, Univ. of Cambridge.

Roberts, G. and Sahu, S. (1997). Updating schemes, covariance structure, blocking and parametrisation for the Gibbs sampler. J. Royal Statist. Soc. Series B, 59:291-318.
Roberts, G. and Tweedie, R. (1995). Exponential convergence for Langevin diffusions and their discrete approximations. Technical report, Statistics Laboratory, Univ. of Cambridge.
Roberts, G. and Tweedie, R. (1996). Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms. Biometrika, 83:95-110.
Roberts, G. and Tweedie, R. (2004). Understanding MCMC. Springer-Verlag, New York.
Robertson, T., Wright, F., and Dykstra, R. (1988). Order Restricted Statistical Inference. John Wiley, New York.
Roeder, K. (1992). Density estimation with confidence sets exemplified by superclusters and voids in galaxies. J. American Statist. Assoc., 85:617-624.
Roeder, K. and Wasserman, L. (1997). Practical Bayesian density estimation using mixtures of normals. J. American Statist. Assoc., 92:894-902.
Rosenblatt, M. (1971). Markov Processes: Structure and Asymptotic Behavior. Springer-Verlag, New York.
Rosenbluth, M. and Rosenbluth, A. (1955). Monte Carlo calculation of the average extension of molecular chains. J. Chemical Physics, 23:356-359.
Rosenthal, J. (1995). Minorization conditions and convergence rates for Markov chain Monte Carlo. J. American Statist. Assoc., 90:558-566.
Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. John Wiley, New York.
Rubinstein, R. (1981). Simulation and the Monte Carlo Method. John Wiley, New York.
Ruelle, D. (1987). Chaotic Evolution and Strange Attractors. Cambridge University Press, Cambridge.
Sahu, S. and Zhigljavsky, A. (1998). Adaptation for selfregenerative MCMC. Technical report, Univ. of Wales, Cardiff.
Sahu, S. and Zhigljavsky, A. (2003). Self regenerative Markov chain Monte Carlo with adaptation. Bernoulli, 9:395-422.
Saxena, K. and Alam, K. (1982).
Estimation of the non-centrality parameter of a chi-squared distribution. Ann. Statist., 10:1012-1016.
Scherrer, J. (1997). Monte Carlo estimation of transition probabilities in capture-recapture data. Technical report, Biometrics Unit, Cornell Univ., Ithaca, New York. Masters Thesis.
Schervish, M. (1995). Theory of Statistics. Springer-Verlag, New York.
Schervish, M. and Carlin, B. (1992). On the convergence of successive substitution sampling. J. Comput. Graphical Statist., 1:111-127.
Schmeiser, B. (1989). Simulation experiments. Technical report, School of Industrial Engineering, Purdue Univ.

Schmeiser, B. and Shalaby, M. (1980). Acceptance/rejection methods for beta variate generation. J. American Statist. Assoc., 75:673-678.
Schruben, L., Singh, H., and Tierney, L. (1983). Optimal tests for initialization bias in simulation output. Operations Research, 31:1176-1178.
Schukken, Y., Casella, G., and van den Broek, J. (1991). Overdispersion in clinical mastitis data from dairy herds: a negative binomial approach. Preventive Veterinary Medicine, 10:239-245.
Searle, S., Casella, G., and McCulloch, C. (1992). Variance Components. John Wiley, New York.
Seber, G. (1983). Capture-recapture methods. In Kotz, S. and Johnson, N., editors, Encyclopedia of Statistical Science. John Wiley, New York.
Seber, G. (1992). A review of estimating animal abundance (ii). Workshop on Design of Longitudinal Studies and Analysis of Repeated Measure Data, 60:129-166.
Seidenfeld, T. and Wasserman, L. (1993). Dilation for sets of probabilities. Ann. Statist., 21:1139-1154.
Serfling, R. (1980). Approximation Theorems of Mathematical Statistics. John Wiley, New York.
Shephard, N. (1994). Partial non-Gaussian state space. Biometrika, 81:115-131.
Shephard, N. and Pitt, M. (1997). Likelihood analysis of non-Gaussian measurement time series. Biometrika, 84:653-668.
Silverman, B. (1986). Density Estimation for Statistics and Data Analysis. Chapman and Hall, New York.
Smith, A. and Gelfand, A. (1992). Bayesian statistics without tears: A sampling-resampling perspective. The American Statistician, 46:84-88.
Smith, A. and Makov, U. (1978). A quasi-Bayes sequential procedure for mixtures. J. Royal Statist. Soc. Series B, 40:106-112.
Smith, A. and Roberts, G. (1993). Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods (with discussion). J. Royal Statist. Soc. Series B, 55:3-24.
Snedecor, G. W. and Cochran, W. G. (1971). Statistical Methods. Iowa State University Press.
Spiegelhalter, D., Best, N., Gilks, W., and Inskip, H. (1996).
Hepatitis B: a case study in MCMC methods. In Gilks, W., Richardson, S., and Spiegelhalter, D., editors, Markov Chain Monte Carlo in Practice, pages 21-44. Chapman and Hall, New York.
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian analysis in expert systems (with discussion). Statist. Science, 8:219-283.
Spiegelhalter, D. and Lauritzen, S. (1990). Sequential updating of conditional probabilities on directed graphical structures. Networks, 20:579-605.
Spiegelhalter, D., Thomas, A., Best, N., and Gilks, W. (1995a). BUGS: Bayesian inference using Gibbs sampling. Technical report, Medical Research Council Biostatistics Unit, Institute of Public Health, Cambridge Univ.

Spiegelhalter, D., Thomas, A., Best, N., and Gilks, W. (1995b). BUGS examples. Technical report, MRC Biostatistics Unit, Cambridge Univ.
Spiegelhalter, D., Thomas, A., Best, N., and Gilks, W. (1995c). BUGS examples. Technical report, MRC Biostatistics Unit, Cambridge Univ.
Stavropoulos, P. and Titterington, D. (2001). Improved particle filters and smoothing. In Doucet, A., de Freitas, N., and Gordon, N., editors, Sequential MCMC in Practice. Springer-Verlag, New York.
Stein, M. (1987). Large sample properties of simulations using latin hypercube sampling. Technometrics, 29:143-151.
Stephens, M. (2000). Bayesian analysis of mixture models with an unknown number of components - an alternative to reversible jump methods. Ann. Statist., 28:40-74.
Stigler, S. (1986). The History of Statistics. Belknap, Cambridge.
Stramer, O. and Tweedie, R. (1999a). Langevin-type models I: diffusions with given stationary distributions, and their discretizations. Methodology and Computing in Applied Probability, 1:283-306.
Stramer, O. and Tweedie, R. (1999b). Langevin-type models II: Self-targeting candidates for Hastings-Metropolis algorithms. Methodology and Computing in Applied Probability, 1:307-328.
Strawderman, R. (1996). Discussion of Casella's article. TEST, 5:325-329.
Stuart, A. (1962). Gamma-distributed products of independent random variables. Biometrika, 49:564-565.
Stuart, A. and Ord, J. (1987). Kendall's Advanced Theory of Statistics (5th edition), volume 1. Oxford University Press, New York.
Swendsen, R. and Wang, J. (1987). Nonuniversal critical dynamics in Monte Carlo simulations. Physical Review Letters, 58:86-88.
Swendsen, R., Wang, J., and Ferrenberg, A. (1992). New Monte Carlo methods for improved efficiency of computer simulations in statistical mechanics. In Binder, K., editor, The Monte Carlo Method in Condensed Matter Physics. Springer-Verlag, Berlin.
Tanner, M. (1996). Tools for Statistical Inference: Observed Data and Data Augmentation Methods.
Springer-Verlag, New York. (3rd edition).
Tanner, M. and Wong, W. (1987). The calculation of posterior distributions by data augmentation. J. American Statist. Assoc., 82:528-550.
Thisted, R. (1988). Elements of Statistical Computing: Numerical Computation. Chapman and Hall, New York.
Tierney, L. (1994). Markov chains for exploring posterior distributions (with discussion). Ann. Statist., 22:1701-1786.
Tierney, L. (1998). A note on Metropolis-Hastings kernels for general state spaces. Ann. Applied Prob., 8(1):1-9.
Tierney, L. and Kadane, J. (1986). Accurate approximations for posterior moments and marginal densities. J. American Statist. Assoc., 81:82-86.
Tierney, L., Kass, R., and Kadane, J. (1989). Fully exponential Laplace approximations to expectations and variances of non-positive functions. J. American Statist. Assoc., 84:710-716.

Tierney, L. and Mira, A. (1998). Some adaptive Monte Carlo methods for Bayesian inference. Statistics in Medicine, 18:2507-2515.
Titterington, D. (1996). Mixture distributions. In Kotz, S., Read, C., and Banks, D., editors, Encyclopedia of Statistical Sciences Update, Volume 1, pages 399-407. John Wiley, New York.
Titterington, D., Smith, A., and Makov, U. (1985). Statistical Analysis of Finite Mixture Distributions. John Wiley, New York.
Tobin, J. (1958). Estimation of relationships for limited dependent variables. Econometrica, 26:24-36.
Tukey, J. (1977). Exploratory Data Analysis. Addison-Wesley, New York.
Van Dijk, H. and Kloeck, T. (1984). Experiments with some alternatives for simple importance sampling in Monte Carlo integration. In Bernardo, J., DeGroot, M., Lindley, D., and Smith, A., editors, Bayesian Statistics 4. North-Holland, Amsterdam.
Van Laarhoven, P. and Aarts, E. (1987). Simulated Annealing: Theory and Applications, volume CWI Tract 51. Reidel, Amsterdam.
Verdinelli, I. and Wasserman, L. (1992). Bayesian analysis of outliers problems using the Gibbs sampler. Statist. Comput., 1:105-117.
Vines, S. and Gilks, W. (1994). Reparameterising random interactions for Gibbs sampling. Technical report, MRC Biostatistics Unit, Cambridge Univ.
Vines, S., Gilks, W., and Wild, P. (1995). Fitting multiple random effect models. Technical report, MRC Biostatistics Unit, Cambridge Univ.
von Bortkiewicz, L. (1898). Das Gesetz der kleinen Zahlen. Teubner, Leipzig.
Von Neumann, J. (1951). Various techniques used in connection with random digits. J. Resources of the National Bureau of Standards, Applied Mathematics Series, 12:36-38.
Wahba, G. (1981). Spline interpolation and smoothing on the sphere. SIAM J. Sci. Statist. Comput., 2:5-16.
Wakefield, J., Smith, A., Racine-Poon, A., and Gelfand, A. (1994). Bayesian analysis of linear and non-linear population models using the Gibbs sampler. Applied Statistics (Ser. C), 43:201-222.
Walker, S. (1997).
Robust Bayesian analysis via scale mixtures of beta distributions. Technical report, Dept. of Mathematics, Imperial College, London.
Wand, M. and Jones, M. (1995). Kernel Smoothing. Chapman and Hall, New York.
Warnes, G. (2001). The Normal kernel coupler: An adaptive Markov chain Monte Carlo method for efficiently sampling from multi-modal distributions. Technical Report 395, Univ. of Washington.
Wasan, M. (1969). Stochastic Approximation. Cambridge University Press, Cambridge.
Wei, G. and Tanner, M. (1990a). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithm. J. American Statist. Assoc., 85:699-704.

Wei, G. and Tanner, M. (1990b). Posterior computations for censored regression data. J. American Statist. Assoc., 85:829-839.
West, M. (1992). Modelling with mixtures. In Berger, J., Bernardo, J., Dawid, A., and Smith, A., editors, Bayesian Statistics 4, pages 503-525. Oxford University Press, Oxford.
Whitley, D. (1994). A genetic algorithm tutorial. Statistics and Computing, 4:65-85.
Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. John Wiley, Chichester.
Wilson, D. (2000). How to couple from the past using a read-once source of randomness. Random Structures and Algorithms, 16(1):85-113.
Winkler, G. (1995). Image Analysis, Random Fields and Dynamic Monte Carlo Methods. Springer-Verlag, New York.
Wolff, U. (1989). Collective Monte Carlo updating for spin systems. Physical Review Letters, 62:361.
Wolpert, D. (1992). A rigorous investigation of evidence and Occam factors in Bayesian reasoning. Technical report, Santa Fe Institute.
Wolpert, D. (1993). On the Bayesian "Occam factors" argument for Occam's razor. Technical report, Santa Fe Institute.
Wolter, W. (1986). Some coverage error models for census data. J. American Statist. Assoc., 81:338-346.
Wong, W. and Liang, F. (1997). Dynamic weighting in Monte Carlo and optimization. Proc. National Academy of Sciences, 94:14220-14224.
Wood, A., Booth, J., and Butler, R. (1993). Saddlepoint approximations to the CDF of some statistics with non-normal limit distributions. J. American Statist. Assoc., 88:680-686.
Wu, C. (1983). On the convergence properties of the EM algorithm. Ann. Statist., 11:95-103.
Yakowitz, S., Krimmel, J., and Szidarovszky, F. (1978). Weighted Monte Carlo integration. SIAM J. Numer. Anal., 15(6):1289-1300.
Yu, B. (1994). Monitoring the convergence of Markov samplers based on estimated L1 error. Technical Report 9409, Department of Statistics, Univ. of California, Berkeley.
Yu, B. and Mykland, P. (1998).
Looking at Markov samplers through cusum path plots: A simple diagnostic idea. Statistics and Computing, 8(3):275-286.

Index of Names

Aarts, E., 167
Abramowitz, M., 9, 29, 40, 117, 154
Ackley, D.H., 163
Adams, M., 458
Ahrens, J., 66
Aitkin, M., 415
Akman, V.E., 455
Alam, K., 82
Albert, J.H., 579
Aldous, D., 163, 413
Andrieu, C., 168, 441, 557, 570
Arnold, B.C., 412
Asmussen, S., 462, 490
Athreya, K.B., 216, 219, 263, 490
Atkinson, A., 55
Balakrishnan, N., 581
Banfield, J.D., 500
Barbe, P., 32
Barndorff-Nielsen, O.E., 9, 80, 122
Barnett, G., 440
Barron, A., 32
Barry, D., 455
Basford, K., 367
Baum, L.E., 360, 362
Bauwens, L., 26, 80
Beckman, R.J., 156
Bennett, J.E., 383
Benveniste, A., 202
Berge, P., 37
Berger, J.O., 6, 12, 14, 22, 27, 28, 31, 90, 115, 124, 192, 239, 404, 481, 581
Berger, R., 6, 11, 12, 18, 28, 198, 360, 362
Berliner, L.M., 463
Bernardo, J.M., 12, 14, 31, 32, 192, 367
Bertail, P., 32
Berthelsen, K.K., 539
Berzuini, C., 548
Besag, J., 295, 314, 318, 326, 344, 372, 377, 407, 412, 419
Best, N.G., 57, 285, 383, 384, 414-416, 420, 465, 509
Billingsley, P., 32, 39, 77, 205, 235, 240, 252, 263, 264, 353
Billio, M., 365, 369, 370, 579
Boch, R.D., 415
Bollerslev, T., 369
Booth, J.G., 122, 283
Borchers, D.L., 58
Borovkov, A.A., 513, 536
von Bortkiewicz, L., 60
Boser, D., 201
Bouleau, N., 202
Box, G.E.P., 43
Boyles, R.A., 178
Bradley, R.C., 264
Breslow, N.E., 415
Brockwell, P.J., 508
Broniatowski, M., 200, 419
Brooks, S.P., 461, 465, 479, 482, 483, 499, 509, 543
Brown, L.D., 5, 28, 31, 71, 146, 192, 262
Brownie, C., 185
Buckland, S.T., 58
Bucklew, J.A., 37, 119, 120
Butler, R.W., 122
Cappé, O., 437, 439, 448, 550, 551, 562, 566, 569, 573, 579, 580
Carlin, B.P., 18, 82, 396, 397, 411, 451, 455, 461, 465, 509
Carpenter, J., 555
Carroll, R.J., 25
Casella, G., 5, 6, 10-12, 18, 25, 28, 31, 68, 81-83, 86, 95, 104, 105, 120, 130, 131, 133, 134, 151, 198, 296-299, 314, 354, 360, 362, 368, 383, 384, 404, 406, 407, 409, 412, 529, 537, 538
Castledine, B., 348
Celeux, G., 200, 419, 541, 579
Cellier, D., 579
Chaitin, G., 36
Chan, K.S., 223, 257
Chauveau, D., 200, 496, 579
Chen, M.H., 147, 191, 291, 391, 393, 508
Chen, R., 458, 549
Cheng, B., 66, 201
Cheng, R.C.H., 66
Chib, S., 314, 365, 368, 369, 412, 426, 451, 550, 579
Chopin, N., 555, 556, 559
Chou, R.Y., 369
Chung, K.L., 205
Churchill, G.A., 168
Cipra, B.A., 168
Clairambault, J., 579
Clapp, T., 558
Clarke, B., 32
Clayton, D.G., 384, 396, 415
Clifford, M.S., 419
Clifford, P., 344, 377, 555
Cochran, W.G., 414
Cocozza-Thivent, C., 579
Conover, W.J., 156
Copas, J.B., 24
Corcoran, J., 525
Cowell, R.G., 2, 422
Cowles, M.K., 461, 465, 504, 509
Cox, D.R., 9, 80, 193
Crisan, D., 555
D'Epifanio, G., 171
Damien, P., 326
Daniels, H.E., 120, 122
Davies, R.B., 88
Davis, P.A., 508
Dawid, A.P., 2, 422, 423
Decoux, C., 579
DeGroot, M.H., 400
Del Moral, P., 555, 559, 569, 570
Dellaportas, P., 415, 424, 479, 506
Dempster, A.P., 2, 90, 176, 177, 183, 340
Denison, D.G.T., 437
Denker, J.S., 201
Dette, H., 418, 421
Devroye, L., 41, 43, 44, 56, 66, 77
Dey, D., 115
Diaconis, P., 31, 33, 311, 413
DiCiccio, T.J., 122
Dickey, J.M., 131
Diebold, D., 365
Diebolt, J., 200, 341, 351, 366, 367, 403, 419, 496
Dieter, U., 66
Dimakos, X., 539
Dobson, A.J., 416
Doob, J., 205
Douc, R., 574
Doucet, A., 168, 200, 441, 546, 551, 557, 569, 570
Doukhan, P., 51, 303, 480
Draper, D., 383
Duflo, M., 162, 168, 169, 202, 203
Dupuis, J.A., 184, 363
Durrett, R., 226
Dykstra, R.L., 8, 25, 173
Eaton, M.L., 262, 339
Eberly, L., 384
Eco, U., 458
Efron, B., 32, 33, 121
Enders, W., 369
Escobar, M.D., 341, 367
Ethier, S.N., 318
Evans, G., 19
Evans, M., 114
Everitt, B.S., 367
Fan, J., 403
Fan, Y., 543
Fang, K.T., 77
Feast, G., 66
Feller, W., 77, 205, 208, 216, 219, 224, 226, 241, 250
Ferguson, T.S., 24
Fernandes, R., 539
Fernhead, P., 555
Ferrari, P.A., 539
Ferrenberg, A.M., 37, 168, 373
Field, C., 120
Fieller, E.C., 18
Fill, J.A., 512, 532
Finch, S.J., 178
Fishman, G.S., 72, 413
Fitzgerald, W.J., 112, 158, 188, 194, 311
Fletcher, R., 19
Forster, J.J., 424
Foss, S.G., 513, 525, 536
de Freitas, N., 546, 551, 555
Frigessi, A., 382
Gamboa, F., 458
Garcia, N.L., 539
Gassiat, E., 458
Gauss, C.F., 7
Gaver, D.P., 385, 386, 470, 473
Gelatt, C.D., 163
Gelfand, A.E., 68, 113, 115, 130, 296, 314, 326, 354, 355, 363, 385, 396, 397, 406, 411, 414, 416, 419, 455, 466
Gelman, A., 148, 316, 318, 319, 377, 382, 412, 464, 480, 483, 497, 499, 509
Geman, D., 167, 168, 314, 419
Geman, S., 167, 168, 314, 419
Gentle, J.E., 19, 158
George, E.I., 58, 71, 314, 348, 404
Geweke, J., 53, 94, 100, 101, 143, 508, 509, 527
Geyer, C.J., 158, 199, 203, 204, 223, 244, 257, 312, 396, 403, 464, 540, 541
Ghosh, M., 18
Gijbels, I., 403
Gilks, W.R., 56-58, 285, 316, 318, 319, 382-384, 396, 397, 414-416, 420, 471, 492, 548
Giron, F.J., 367
Giudici, P., 424, 499
Gleick, J., 37
Gleser, L.J., 18, 46
Glynn, P.W., 462
Godsill, S.J., 200, 441, 557, 558, 570
Goldfeld, S.M., 579
Gordon, N., 546, 551, 552, 555
Gourieroux, C., 7, 80, 86, 90, 365, 369
Goutis, C., 15, 120, 377, 396
Green, P.J., 244, 295, 296, 326, 407, 424, 426, 429, 431, 432, 437, 512, 524, 537
Greenberg, E., 314
Grenander, U., 318
Gruet, M.A., 418
Guedon, Y., 579
Guihenneuc-Jouyaux, C., 385, 461, 503, 579
Guillin, A., 562, 566, 569, 580
Gumbel, E.J., 23, 71
Haario, H., 168, 302
Hajek, B., 166
Hall, P., 25, 32, 33, 121
Hammersley, J.M., 155, 344, 377, 577
Han, X.L., 296
Handschin, J., 577
Handscomb, D.C., 155
Hanlon, P., 311
Harris, T., 221
Hartigan, J., 455
Hastings, W.K., 206, 284, 291, 305, 314, 419, 465, 508
Heidelberger, P., 509
Henderson, D., 201
Hendry, D.F., 549
Hestbeck, J.B., 185
Hesterberg, T., 103
Higdon, D.M., 326, 407
Hills, S.E., 388, 396, 397, 406, 414
Hines, J.E., 185
Hinton, G.E., 163
Hjorth, J.S.U., 32

Hobert, J.P., 263, 283, 377, 396, 406, Kong, A., 130, 349, 355, 361, 376, 402, 407, 412, 466, 492, 537 458 Holmes, C.C., 437 Kors, T.J., 167 Holmes, S., 33 Kotz, S., 581 Hougaard, P., 122 Krimmel, J.E., 77, 135, 486 Howard, R.E., 201 Krishnan, T., 182 Hubbard, W., 201 Kroner, K.F., 369 Huber, P.J., 81 Kubokawa, T., 18 Huerta, G., 440 Kuehl, R.O., 156 Hiirzeler, M., 556 Kiinsch, H., 555, 556, 558 Hurn, M.A., 541 Kullback, S., 192 Hwang, C.R., 169, 203 Kurtz, T.G., 318 Hwang, J.T., 18, 173 Laird, N.M., 2, 90, 176, 177, 182, 183, Inskip, H., 383 340 lp, E.H.S., 200 Landau, D.P., 37 Lange, K., 19 J6hnk, M.D., 44 Lange, N., 182 Jackel, L.D., 201 Lauritzen, S.L., 2, 422, 423 Jacquier, E., 550 Lavielle, M., 200 Jarrett, R.G., 455 Lavine, M., 367, 538 Jeffreys, H., 14, 31, 404 Le Cun, Y., 201 Jeliazkov, 1., 451 Lee, T.M., 12, 416 Jennison, C., 396 Legendre, A., 7 Jensen, J.L., 120 Lehmann, E.L., 5, 6, 10, 12, 18, 28, 31, Johnson, N. L., 581 33, 37, 81, 83, 86, 90, 296, Johnson, V.E., 464 354, 368, 404, 466 Jones, G.L., 466, 492 Lepingle, 202 Jones, M.C., 403 Levine, R., 356 Juang, B.H., 579 Lewis, S., 403, 464, 500-503, 507, 509 Liang, F., 547, 577 Kadane, J.B., 108, 115, 171, 188, 400 Lindvall, T., 232 Kass, R.E., 15, 108, 115, 171 Little, R.J.A., 182 Kemeny, J.G., 248, 254-257, 311, 500 Liu, C., 182, 200, 340, 464, 479, 505 Kemp, A. W., 581 Liu, J.S., 126, 130, 279, 311, 349, 355, Kendall, W.S., 512, 520, 523, 539 361, 376, 382, 394, 395, 402, Kennedy, W.J., 158 411, 458, 464, 479, 505, 549, Kersting, G., 202 577 Kiefer, J., 202 Loh, W.L., 156 Kim, S., 365, 369, 550 Lombard, F., 25 Kinderman, A., 65 Louis, T.A., 82, 182 Kipnis, C., 244 Lugannani, R., 122 Kirby, A.J., 384 Lwin, T., 82 Kirkpatrick, S., 163 Lyons, T., 555, 559 Kitagawa, G., 555, 577 Kloeck, T., 95 MacDonald, I. L., 579 Knuth, D., 65, 72 MacEachern, S.N., 463 Kohn, R., 440 MacLachlan, G., 182, 367 Kolassa, J.E., 120 Madigan, D., 422, 424 Index of Names 627

Madras, N., 577 Neal, R.M., 163, 168, 201, 32fi, 336, 458, Maguire, B.A., 455 540 Makov, U.E., 23, 30, 71, 367 Nerlove, M., 365 Mallick, B., 437 Newton, M.A., 148 Marin, J.M., 367, 562, 566, 569, 580 Ney, P., 216, 219, 490 Marinari, E., 540 Neyman, J., 12 Maritz, J. S., 82 Nicholls, G.K., 540, 541 Marsaglia, G., 37, 38, 53, 54, 72, 75 Nichols, J.D., 185 Marshall, A., 577 Niederreiter, H., 76, 77 Martin, M.A., 122 Nobile, A., 391 Massart, P., 51, 303, 480 Norris, J.R., 205 Matthews, P., 504 Norris, M., 258, 318 Mayne, D., 577 Nummelin, E., 205, 214, 216 McCulloch, C.E., 82, 313, 407 McCulloch, R., 579 d'Occam, W., 458 McKay, M.D., 156 Oh, M.S., 22 McKeague, LW., 296 Ohman, P.A., 283 McNeil, A.J., 384 Olkin, 1., 25 Mead, R., 156 O'Muircheartaigh, LG., 385, 386, 470, Mendell, N.R., 178 473 Meng, X.L., 148, 182, 200, 340 Ord, J.K., 121 Mengersen, K.L., 258, 276, 288-290, Orey, 8., 253 295, 305, 306, 367, 398, 407, 6 Ruanaidh, J.J.K., 112, 158, 188, 194, 426, 461, 529, 537 311 Metivier, M., 202 Osborne, C., 18 Metropolis, N., 163, 168, 313, 419 Parisi, G., 540 Meyn, S.P., 205, 216, 220, 223, 231, 235, Pearl, J., 2 238, 241, 248, 259-261 Pearson, E.S., 12, 455 Miclo, L., 559 Pearson, K., 23 Miller, M., 318 Perron, F., 134 Mira, A., 312, 326, 329, 526, 537 Peskun, P.H., 311, 314, 394, 419, 578 Mitra, D., 167 Petkau, A.J., 25 M!Zlller, J., 520, 526, 536, 537, 539-541 Petrie, T., 360, 362 Monahan, J., 65 Pflug, G.C., 202 Monfort, A., 7, 80, 86, 90, 365, 369, 370, Philippe, A., 14, 18, 69, 115, 135, 136, 579 410, 418, 486, 527, 528 Monro, 8., 201 Phillips, D.B., 206, 318, 426 Morton, K., 577 Pike, M.C., 24 Moulines, E., 200, 366 Pincus, M., 169 Moussouris, J., 344 Pitt, M.K., 554, 575 Muller, B., 43 Pollock, K.H., 185 Miiller, P., 316, 393, 541 Polson, N.G., 315, 550 Murdoch, D.J., 512, 524, 537 Pommeau, Y., 37 Muri, F., 579 Presnel, B., 492 Murray, G.D., 194 Press, S.J., 412 Mykland, P., 471, 474, 481, 492, 509 Priouret, P., 202 Propp, J.G., 512, 513, 515, 518 Natarajan, 
R., 407 Naylor, J.C., 21 Qian, W., 182, 419 628 Index of Names

Quant, R.E., 579 Rosenbluth, M.N., 163, 168, 313, 419, 577 Rabiner, C.R., 579 Rosenthal, J.S., 206, 284, 318, 319, 329, Racine-Poon, A., 383, 406, 414 492, 504, 543 Rossi, P.E., 550 Raftery, A.E., 15, 148, 403, 426, 455, Rubin, D.B., 2, 90, 113, 176, 177, 182, 464, 500~503, 507, 509 183, 340, 464, 479, Ramage, J., 65 480, 483, 497, 499, 505 Randles, R., 37 Rubinstein, R.Y., 72, 95, 155, 162 Rao, C.R., 183 Rudin, W., 135 Raspe, R.E., 32 Ruelle, D., 37 Redner, R., 11 Ryden, T., 437, 439, 448, 463, 465, 550, Reid, N., 120, 121 551, 579 Renault, E., 365 Resnick, S.l., 318 Sabbati, C., 382 Revuz, D., 205 Sacksman, E., 168, 302 Rice, S., 122 Sahu, S.K., 296, 396, 397, 407, 411, 471, Richard, J.F., 80, 549 492 Richardson, S., 426, 437, 579 Salmon, D., 552 Rio, E., 51, 303, 480 Sangiovanni-Vincentelli, A., 167 Ripley, B.D., 72~74, 90, 201 Saxena, M.J., 82 Ritter, C., 479 Scherrer, J.A., 185, 197, 363 Robbins, H., 82, 201 Schervish, M.J., 108, 171, 188 Robert, C., 2 Schmeiser, B.W., 67, 291, 391, 393, 462 Robert, C.P., 5, 6, 10, 12, 14, 15, 17, Schruben, L., 509 18, 28, 29, 31, 32, 53, 58, 68, Schukken, Y.H., 383, 409 80, 82, 95, 101, 104, 105, 115, Schwartz, P.C., 206, 284 130, 131, 133, 134, 142, 14~ Searle, S.R., 82 151, 169, 172, 173, 192, 200, Seber, G.A.F., 58 239, 263, 264, 296~299, 341, Seidenfeld, T., 252 348, 351, 354, 365, 367, 369, Sejnowski, T.J., 163 370, 377, 383, 385, 396, 398, Serfling, R.J., 83 399, 403, 404, 410, 418, 422, Sethuraman, J., 263 426, 437, 439, 443, 448, 461, Shalaby, M., 67 463, 470, 484, 486, 503, 506, Shao, Q.M., 147, 191, 508 527, 529, 537, 538, 541, 562, Sharples, L.D., 384 566, 569, 576, 580 Sheather, S., 440 Roberts, G.O., 206, 274, 276, 284, 311, Shephard, N., 365, 369, 550, 554, 575, 315, 316, 318, 319, 329, 360, 579 362, 382, 396, 407, 461, 465, Silverman, B.W., 508 471, 478, 479, 492, 505, 509, Singh, H., 509 526, 537 Slade, G., 577 Robertson, T., 8, 25, 173 Small, M.J., 400 Roeder, K., 366, 426 Smith, A.F.M., 12, 14, 21, 23, 
30~32, Romeo, F., 167 68, 71, 113, 130, 192, 206, Ronchetti, E., 120 314, 318, 326, 354, 355, 360, Rosenblatt, M., 264 362, 363, 367, 385, 388, 397, Rosenbluth, A.W., 163, 168, 313, 419, 406, 414~416, 419, 426, 437, 577 455, 466, 552 Index of Names 629

Snedecor, G.W., 414 305, 306, 313, 315, 318-320, Snell, E.J., 193 513, 525, 536 Snell, J.L., 248, 254-257, 500 Soubiran, C., 169, 365 van den Broek, J., 383, 409 Soules, G., 360, 362 Van Dijk, H.K., 95 Speed, T.P., 377, 412 van Dyk, D., 182, 200 Spiegelhalter, D.J., 2, 383, 384, 414-416, Van Laarhoven, P.J., 167 420, 422 Varadhan, S.R., 244 Srivastava, M.S., 18 Vecchi, M.P., 163 Stavropoulos, M., 556 Verdinelli, 1., 341, 367 Stegun, 1., 9, 29, 40, 117, 154 Vermaak, J., 441 Stein, M.L., 156 Vidal, C., 37 Stigler, S., 7 Vines, K., 396, 397, 465, 509 Stram, D., 182 Von Neumann, J., 36 Stramer, 0., 276, 313, 320 Strawderman, R.L., 152 Waagepetersen, R.P., 526 Stuart, A., 45, 121 Wahba, G., 21 Studden, W.J., 418, 421 Wakefield, J.C., 326, 383 Sturmfels, B., 413 Walker, H., 11 Swartz, T., 114 Walker, S., 67, 326 Swendson, R.H., 168, 312, 326, 373 Wand, M.P., 403 Szidarovszky, F., 77, 135, 486 Wang, J.S., 37, 168, 312, 326, 373 Wang, Y., 77 Tamminen, J., 302 Wasan, M.T., 202 Tan, K.K.C., 57, 285 Wasserman, L., 32, 252, 341, :366, 367 Tanner, M.A., 182, 183, 295, 314, 340, Wefelmeyer, W., 296 347, 364, 385, 394, 419, 479, Wei, G.C.G., 18:3, 486 486, 507 Weiss, N., 360, 362 Teller, A.H., 163, 168, 313, 419 Welch, P.D., 509 Teller, E., 163, 168, 313, 419 West, M., 341, 367, 440 Thisted, R.A., 21, 158, 162 Whitley, D., 555 Thode, H.C., 178 Whittaker, J., 422 Thomas, A., 414-416, 420 Wild, P., 56, 58, 397 Thompson, E.A., 191, 204, 540, 541 Wilson, D.B., 512, 513, 515, 518, 539 Thorisson, H., 462 Winfield, D., 198 Tibshirani, R.J., 32, 33 Winkler, G., 162, 165, 166, 202, 312, Tierney, L., 108, 115, 171, 188, 223, 244, 314 264, 286, 291, 304, 312, 326, Wolfe, D., 37 329, 389, 390, 464, 471, 474, Wolff, U., 37 492, 509 Wolfowitz, J., 202 Titterington, D.M., 23, 172, 182, 201, Wolpert, D.H., 458 367, 399, 419, 437, 463, 465, Wolpert, R.L., 6, 12, 239 529, 537, 556, 576 Wolter, W., 348 Tobin, J., 410 Wong, W.H., 130, 148, 295, 314, 340, Tsay, R., 579 :347, 349, 355, 361, 376, 394, 
Tukey, J.W., 449 402, 41~ 458, 547, 577 Tweedie, R.L., 205, 206, 216, 220, Wood, A.T.A., 122 223, 231, 235, 238, 241, 248, Wright, F.T., 8, 25, 173 258-261, 274, 276, 288-290, Wu, C.F.J., 178, 194 630 Index of Names

Wynn, A.H.A., 455 Zaman, A., 38, 72, 75 Zhigljavsky, A.A., 471 Yakowitz, 8., 77, 135, 486 Zidek, J.V., 25 Ylvisaker, D., 31 Zucchini, W., 58, 579 York, J., 422, 424 Yu, B., 471, 474, 481, 492, 509 Index of Subjects

a-mixing, 257, 263, 264 acceleration of, 41, 131, 140, 292, absorbing component, 367 295, 390 Accept-Reject, 94, 133, 134, 142, 143, adaptive, 284, 30Q-302, M5, 559, 271, 276, 378, 392 563, 569 and importance sampling, 103 ARMS, 209, 285-287, 321 and MCMC algorithms, 269 ARS, 56-58, 60, 70, 285 and Metropolis-Hastings algo- Baum-Welch, 571 rithm, 278, 280 Boltzman, 314, 578 and perfect sampling, 511, 515, Box-Muller, 43, 63 530, 532 CFTP, see coupling from the past and recycling, 53, 104, 105, 132, comparison, 92, 140, 279 154 convergence of, 137, 147, 153, 162 and wrong bounds, 304 domination, 107, 154 approximative, 286 ECM, 200, 340 bound, 154 EM, 2, 90, 176, 313, 340, :341 corrected, 304 generic, 271 Gibbs, see Gibbs samplin€; criticism, 53 greedy, 20, 124, 181 optimization, 52 Green's, 429 sample distribution, 115 hit-and-run, 291, 391, 393 stopping rule, 104 hybrid, 380, 389, 391, 392, 400, 474 tight bound, 51 initial conditions, 200 with envelope, 53 Kiss, 38, 73, 75, 76 acceptance probability, 48 C program, 75 acceptance rate, 292 Levenberg-Marquardt, 21 and simulated annealing, 169 MCEM, see MCEM expected, 278 MCMC, see MCMC optimal, 316, 319 Metropolis-Hastings, see acidity level, 400 Metropolis-Hastings algo• actualization, 571 rithm adaptive rejection sampling, 56 multiscale, 564 adjacency, 439 Newton-Raphson, 19 algorithm Levenberg-Marquardt, 21 632 Index of Subjects

steepest descent, 20 auxiliary optimization, 51, 140 distribution, 444 performances, 136 variable, 47, 326, 444 pool-adjacent-violators, 8, 25 auxiliary particle filter, see particle read-once, 539 filter Robbins-Monro, 201, 202, 268 SAEM, 200 f)-mixing, 264 SAME, 200 bandwidth selection, 563 SEM, 200 baseball, 197 simulated annealing, 163, 396 batch sampling, 403 SIR, 113, 552 Baum-Welch formulas, 551, 571 stability, 137, 138, 140 Bayes estimator, 13 stochastic, 158 admissibility of, 262 Swendson-Wang, 312 linear, 31 Viterbi, 572 Bayes factor, 15 algorithmic complexity, 41 computation, 115, 125, 147 allocation Bayesian hybrid, 400 calculation, 50 indicator, 189, 190, 367 Bayesian inference, 4, 12-18 probability, 368 and Gibbs sampling, 371, 419 stable, 401 Decision Theory, 83 analysis empirical, 82 conditional, 239 estimation, 12-18 EM, 180 integration within, 79, 85 numerical, 1, 19 noninformative, 368, 403, 576 spectral, 509 pluggin, 82 anti-monotonicity, 523 robust, 90 aperiodicity, 217, 218, 250, 272 using Gibbs sampling (BUGS), 420 approximation, 284 bijection, see one-to-one first-order, 109 binomial parameter, 25 Normal, 499 bits, 74 for the binomial distribution, black box, 104 101 Boltzman algorithm, 314, 578 Riemann, 134, 486 Bonferroni's inequality, 517 saddlepoint, 120 bootstrap, 32, 79 second-order, 109 smooth, 556 third-order, 109 bridge estimator, 148 asymptotic Brownian normality, 86 bridge, 509 relative efficiency, 357 motion, 318, 482 atom, 214, 216, 217, 219, 224, 231, 236 BUGS, 22, 414, 420, 509 accessible, 214 ergodic, 231, 234 c, 75, 420 geometrically ergodic, 237 calibration, 287, 291, 320, 504 Kendall, 237 algorithmic, 316 attrition, 552 linear, 18 augmentation technique, 490, 493 step, 317 autoexponential model, 372 cancer treatment success, 86 uniform ergodicity, 387 canonical moments, 417, 418, 421 Index of Subjects 633 capture-recapture, 58, 184, 196, 348, components, 381 362 sites, 312 carcinogen, 24 space, 161, 335 censoring model, 2, 
23, 174 support, 271, 275, 299, :n7, 379 type of, 2 constrained parameter, 8 Cesaro average, 252 contingency table, 86, 413 chains continuous-time process, 318, 446, 449 interleaved, 349, 371 control parallel, 407 binary, 502, 503 versus single, 269 by renewal, 4 70 parallel versus single, 464 of Markov chain Monte-Carlo sub, 387, 390 algorithms, 474 change-point, 445, 454, 455 of Metropolis-Hastings aJgorithms, chaos, 37 292 Chapman-Kolmogorov equations, 210, of simulation methods, 153 213, 219, 220 control variate, 145, 147, 484 character recognition, 201 integration, 145 clique, 423---424 optimal constant, 145 perfect, 423 convergence, 162 CODA, 420, 509 acceleration, 160 combinatoric explosion, 342, 367 and duality principle, 351 comparison of models, 15 and irreducibility, 276, 288 completion, 326, 431 assessment, 107, 140, 211., 232, natural, 374 261, 269, 373, 491 computational difficulties, 83 multivariate, 127 computer discretization, 284 of variability, 124 computing time, 30, 101 conservative criterion for, 153 comparison, 41 control, 486 excessive, 38 of the empirical mean, 292, 462 expensive, 52 ergodic, 315 condition geometric, 236, 276, 315, .367, 385, Doeblin's, 238, 305, 315, 329, 524, 422 525, 537 uniform, 315 drift, 221, 243, 259--261, 290, 330, Gibbs sampler, see Gibbs sampling 465 graphical evaluation of, 466, 484 geometric, 260 indicator, 495, 502 Liapounov's, 245 monitoring of, 84, 123, 28il minorizing, 207, 215, 216, 220, 246 of Markov chain Monte Carlo positivity, 377 algorithms, 268, see MCMC transition kernel regularity, 407 slow, 96, 292, 481 conditional distributions, 343 speed, 137, 138, 172, 236 conditioning, 130, 131, 133, 354 test, 84 confidence interval, 33 to a maximum, 167 conjugate prior, 30, 341, 367 to the stationary distribution, 462, and BUGS, 420 464, 479 for mixtures, 506 of the variance criterion, 498 connected convolution, 457 arcwise, 409 Corona Borealis Region, 426, 450 balls, 381 correlation, 140, 142 634 Index of 
Subjects

between components, 502 logistic, 287 negative, 143 Markovian, 239 coupling, 232, 304 detailed balance condition, 230, 336, automatic, 530 349, 431, 446, 533 forward, 513 detailed balance property, 272 from the past, 513-532 deviance, 416 horizontal, 539 diagnostic fundamental theorem, 515 autocorrelation, 509 monotone, 536 distance, 4 79 multigamma, 525 multiple estimates, 489 time, 233 parallel, 500 covering, 220 Die Hard, 37 criterion diffusion, 206, 318 conservative, 495 dilation, 252 convergence, 491 dimension efficiency, 316 fixed, 439 unidimensional, 499 jump, 430 cumulant generating function, 8 matching, 431 cumulative sums, 481 saturation, 432, 445 curse of dimensionality, 21, 136, 156, variable, 425, 426 486 dimensionality of a problem, 136 cycle, 218 discrete random variable, 44 of kernels, 389 discretization, 206, 504 of a diffusion, 318 DAG, 423 and Laplace approximation, 319 data distance augmentation, 176, 183, 199, 357, entropy, 191 524 Kullback-Leibler, 31, 191 censored, 2, 178 to the stationary distribution, 479 complete, 175 unbiased estimator, 478, 505 longitudinal, 454 distribution missing, 162, 175 arcsine, 37, 62 Data Augmentation, see two-stage Beta, 8, 13, 44, 66 Gibbs sampling beta-binomial, 149, 514 deconvolution, 457 binomial, 25 degeneracy, 552, 556-558, 564 Burr's, 66 delta method, 126 Cauchy, 10, 90 demarginalization, 174, 176, 326, 373, conditional, 403 408 conjugate, 146 for recursion formulae, 421 Dirichlet, 151 density discrete, 44 instrumental, 4 7 double exponential, 52 log-concave, 56-58, 71 exponential, 99 multimodal, 295 Gamma, 52, 106 prior, 12 generalized inverse Gaussian, 71 spectral, 508 generalized inverse normal, 29 target, 47 geometric, 211, 219, 290 unbounded,49 Gumbel, 71 dependence heavy tail, 94 Index of Subjects 635

improper prior, 404 eigenvalues, 279, 507 infinitely divisible, 77 of a covariance matrix, 396 instrumental, 47, 52, 92, 97, 101, EM algorithm, 176-182 135, 172, 271, 276, 393 and simulated annealing, 178 choice of, 299, 387 identity, 176 invariant, 206 introduction, 176 inverse Gaussian, 293 for mixtures, 181 Laplace, 120 monotonicity, 177 limiting, 272 Monte Carlo, 183 log-concave, 305 relationship to Gibbs san1pling, log-normal, 92 357 logarithmic series, 62 standard error, 186 logistic, 55, 58 steps, 177, 200 marginal, 14 empirical multinomial, 86, 496 cdf, 80, 89, 90 negative binomial, 44 mean, 83, 90, 131, 134, 137 noncentral chi-squared, 9, 25, 46, percentile, 124 82 empirical Bayes principles, 82 nonuniform, 40 energy function, 163, 168, 203 Normal, 40, 43, 96, 142, 288 entropy distance, see distance truncated, 52, 101, 323 envelope, 54, 56, 520 Pareto, 10 method, 54 Poisson, 43, 44, 55 pseudo, 285 posterior, 12 equation predictive, 427 backward, 572 prior, 12 forward, 573 proposal, 545 forward-backward, 551, 57'1 stationary, 223, 224, 272 ergodicity, 231, 235, 238 (distance to}, 478, 479 geometric, 236, 257, 276, 289, 353 Student, 10 in MCMC algorithms, 268 target, 270 of Metropolis-Hastings algorithms, uniform, 35, 40, 98, 143 272, 27 4, 292 Weibull, 3, 23, 24, 67, 415 sufficient conditions, 230 Wishart, 28 uniform, 237, 264, 276 witch's hat, 504, 505 of Metropolis-Hastings algo• divergence, 76, 254, 406 rithms, 289 Kullback-Leibler, see information sufficient conditions, 390 DNA sequence, 168, 426, 579 errors in variables, 11 Doeblin's condition, see condition estimation domination of a Markov chain, 394, 578 Bayesian, see Bayesian drift condition, 258, 291 decisional, 86 geometric, 260 nonparametric, 509 dynamic parallel, 124, 153 programming, 572 estimator weighting, 547 Bayes minimax, 115 bridge, 148 effective sample size, 499 empirical Bayes, 81, 82 efficiency, 298 James-Stein, 81 loss of, 463, 479, 500, 509 truncated, 142 636 Index of Subjects

maximum likelihood, 86, 200, 365 congruential, 73, 75 lack of, 368 pseudo-random, 22, 35, 39, 284, parallel, 140 51.5 Rao--Blackwellized, 151, 346 tests for, 36 expert systems, 2, 422 RANDU, 73 exponential shift register, 73 family, 5, 8, 10, 13, 28, 118, 146, genetic linkage, 183 169, 293, 341, 367 geometric tilting, 121 ergodicity, 366 variate generation, 39, 42 rate of convergence, 291, 297 extra Monte Carlo variation, 554, 555 waiting time, 51 Gibbs failure probability, 16 measure, 169, 203 filter random field, 419 bootstrap, 552 Gibbs sampler, see Gibbs sampling particle, 547, 552, 575, 577 Gibbs sampling, 267, 269, 272, 314, 321, filtering, 5 71 371, 371, 383, 403 fixed point, 177 and ECM, 200 forward-backward formulas, 551 completion, 374 function convergence assessment, 496 analytical properties of, 158 Harris recurrence of, 380 confluent hypergeometric, 29 irreducibility of, 379 digamma, 7, 9 random, 402 dilog, 152, 154 scan, 376 empirical cumulative distribution, sweep, 376 32, 124 relationship to evaluation, 53 EM algorithm, 357 logistic, 37 slice sampling, 343 loss, 80, 81, 141 relationship to slice sampling, 337 modified Bessel, 9, 46 reversible, 375 one-to-one, 431, 445 slow convergence of, 399 potential, 221, 258, 259, 261 symmetric regularity of, 158 scan, 375 risk, 81 sweep, 375 sigmoid, 201 two-stage, 262, 337-360, 374, 378, functional compatibility, 411 422 fundamental matrix, 254 and interleaving, 349 fundamental theorem of simulation, 47, reversible, 350 321, 328, 526 uniform ergodicity of, 385 funneling property, 537 validity of, 388 versus Metropolis-Hastings algo• Gamma variate generator, 45, 106 rithms, 270, 360, 373, 378, Gauss, 22, 41 381, 387 generalized inverse, 39 gradient generalized linear model, 287 methods, 19 completion, 313 stochastic, 162 generator graph, 2, 422-424 Central Limit Theorem, 64 directed, 422 chaotic, 37 directed acyclic, 423, 549 Index of Subjects 637

undirected, 422 principle, 90 Gray code, 33 sequential, 549 greediness, 20 variance, 126 grouped data, 346, 347 IMSL, 22 independence, 462, 463 harmonic inference function, 240, 241, 256, 273 asymptotic, 83 bounded, 241 Bayesian, 5, 12-14, 269 mean, 96 noninformative, 368 Harris positivity, 224, 231, 260 difficulties of, 5 Harris recurrence, 207, 222, 233, empirical Bayes, 82 240-242, 259, 379 generalized Bayes, 404 importance of, 222, 223 nonparametric, 5 Hessian matrix, 319 statistical, 1, 7, 79, 80 Hewitt-Savage 0-1 Law, 240 information hidden Fisher, 31, 404 Markov model, 352, 571, 575, 576, Kullback-Leibler, 31, 191, 552, 560 579-580 prior, 14, 383, 406 Poisson, 576 Shannon, 32 semi-Markov chains, 579 informative censoring, 23 How long is long enough?, 512 initial condition, 293, 380 hybrid strategy, 378 dependence on, 292, 379, 462 initial distribution, 211, 274 identifiability, 89, 181, 398-400 and parallel chains, 464 constraint, 398-400 influence of, 379, 464, 499 image processing, 168, 419 initial state, 208 implicit equations, 79 integration, 6, 19, 79, 85, 140 importance, 90 approximative, 83, 107, 1:15 importance function, 97 by Riemann sums, 22, 13·4, 475 importance sampling, 91, 92, 203, 268 Monte Carlo, 83, 84 accuracy, 94 numerical, 80, 135 and Accept-Reject, 93-96, 102, bounds, 22, 83 103 problems, 12 and infinite variance, 103, 488 recursive, 172, 365 and MCMC, 271, 545-547, 557 weighted Monte Carlo, 135 and particle filters, 547 interleaving, 349, 462 and population Monte Carlo, 562 and reversibility, 349 bias, 95, 133 property, 349, 355, 356 by defensive mixtures, 103, 562, intrinsic losses, 80 563 inversion difficulties, 102 of Gibbs steps, 478 dynamic, 577 of the cdf, 39, 40 efficiency, 96 ion channel, 579 for convergence assessment, 484 irreducibility, 206, 213, 213, 215, 218, identity, 92, 546, 577 274, 284 implementation, 102 Ising model, 37, 168, 188, 326, 373, 408 improvement, 134 monitoring, 153 Jacobian, 431, 432 optimum, 95 jump 638 
Index of Subjects

, 295 loss reversible, 430 intrinsic, 80 stochastic gradient, 163 posterior, 13 quadratic, 13, 81 Kac's representation, see mixture low-discrepancy sequence, 76 representation lozenge, 522 K.,-chain, 211, 213, 220 kernel Manhattan Project, 314 estimation, 403 MANIAC, 314 residual, 229, 493, 524 Maple, 22 reverse, 533 marginalization, 103, 551 transition, 206, 208,-209 and Rao-Blackwellization, 354 Kolmogorov-Smirnov test, see test Chib's, 451 Kuiper test, see test Monte Carlo, 114 Kullback-Leibler information, see Markov chain, 99, 101, 102, 205, 208, information 212, 378 augmented, 217 L'Hospital's rule, 330 average behavior, 238 label switching, 540 divergent, 406 Lagrangian, 19 domination, 394, 578 Langevin algorithm, 319 ergodic, 268 and geometric ergodicity, 319 essentials, 206 extensions of, 320 Harris positive, see Harris Langevin diffusion, 316, 318 positivity Laplace approximation, 107, 115, 188 Harris recurrent, see Harris large deviations, 37, 90, 119 recurrence Latin hypercubes, 156 hidden, 579 Law of Large Numbers, 83, 125, 239, homogeneous, 162, 209, 286, 501 268, 494, 551, 558 instrumental, 349 strong, 83, 242 interleaved, 349 lemma irreducible, 213 Pitman-Koopman, 10, 30 limit theorem for, 238, 263 Levenberg-Marquardt algorithm, see lumpable, 256 algorithm m-skeleton, 207 Liapounov condition, 245 nonhomogeneous, 162, 286, 300, likelihood, 12 316, 317 function, 6, 7 observed, 239 individual, 326 positive, 224 integral, 12 random mapping representation, maximum, 82, 83, 172 513 multimodal, 11 recurrent, 220, 259 profile, 9, 18, 80 reversible, 244, 261 pseudo-, 7 semi-, 579 unbounded, 11 for simulation, 268 likelihood ratio approximation, 369 split, 217 linear model, 27 stability, 219, 223 local maxima, 160, 388 strongly aperiodic, 218, 220 log-concavity, 72, 289 strongly irreducible, 213, 345 longitudinal data, 3, 454 transient, 258 Index of Subjeets 639

two state, 500 Markov chain Monte Carlo, see rp-irreducible, 213 MCMC algorithm weakly mixing, 482 of moments, 8 Markov chain Monte Carlo method, see monitoring, 406 MCMC algorithm Monte Carlo, see Monte Carlo Markov property nonparametric, 508, 563 strong, 212, 247 numerical, 1, 19, 22, 80, 85, 135, weak, 211 157, 158, 497 mastitis, 383 quasi-Monte Carlo, 21, 75 Mathematica, 22, 4~ simulated annealing, 163, 167, 200, Matlab, 22, 41 203 matrix simulation, 19 regular, 254 Metropolis-Hastings algorithm, 165, transition, 208, 211, 217, 224 167, 267, 269, 270, 270, 378, tridiagonal, 129 403, 481 maxima and Accept-Reject, 279 global, 167, 169 and importance sampling, 271 local, 162, 163, 202 autoregressive, 291 maximization, 6 classification of, 276 maximum likelihood, 5, 7, 10, 83 drawbacks of, 388 constrained, 8, 173 efficiency of, 321 difficulties of, 10, 12 ergodicity of, 289 estimation, 6, 8 independent, 276, 279, 296, 391, existence of, 11 472 justification, 6 irreducibility of, 273 MCEM algorithm, 183, 185, 340 random walk, 284, 287, 288 standard error, 186 symmetric, 473 MCMC algorithm, 213, 236, 244, 268, validity of, 270 268 Metropolization, 394 and Monte Carlo methods, 269 mining accidents, 455 birth-and-death, 446 missing data, 2, 5, 174, 340, 341), 374 calibration, 317 simulation of, 200 convergence of, 459 missing mass, 4 75 heterogeneous, 300, 390 mixing, 489 history, 163, 314 condition, 263 implementation, 269 speed, 462 monitoring of, 459, 474, 491 mixture, 4, 295, 366, 368 motivation, 268 computational difficulties with, 4, measure 11 counting, 226 continuous, 45, 420 invariant, 223, 225 defensive, 103, 562 Lebesgue, 214, 225 dual structure, 351 maximal irreducibility, 213 exponential, 23, 190, 419 method geometric, 420 Accept-Reject, 47, 51-53 identifiability, 89 delta, 126 indicator, 341 gradient, 162 of kernels, 389 kernel, 508 negative weight, 493 least squares, 7, 173, 201 nonparametric, 420 640 Index of Subjects

Normal, 11, 88, 365, 388, 398, 399, logistic, 145 506 logit, 15, 168 EM algorithm, 181 MA, 4 number of components mixed-effects, 384 test on, 88 multilayer, 201 Poisson, 23 multinomial, 347, 394 reparameterization of, 397 Normal, 404 representation, 45-46, 77, 229, 375, hierarchical, 27 471, 524 logistic, 58 residual decomposition, 252 overparameterized, 295 for simulation, 45, 67, 77, 103, 131, Potts, 312 373 probit, 391 stabilization by, 103, 113 EM algorithm for, 192 Student's t, 497 random effect, 414 two-stage Gibbs for, 340 random effects, 397, 406 unbounded likelihood, 11 Rasch, 415 with unknown number of saturation, 432, 444, 450, 451 components, 426, 437 stochastic volatility, 549 mode, 160, 482, 499 switching AR, 453 global, 172 tobit, 410 local, 10, 22, 172, 201 variable dimension, 425-427 attraction towards, 388 modeling, 1, 367 trapping effect of, 390 and reduction, 5 model moment generating function, 8 ANOVA, 416 monitoring of Markov chain Monte AR, 194, 210, 227, 244, 260, 265, Carlo algorithms, see MCMC 428 monotone likelihood ratio (MLR), 72 ARCH, 308, 365, 368 monotonicity ARMA, 508 of covariance, 354, 361, 463 augmented, 183 for perfect sampling, 518 autoexponential, 372, 382 Monte Carlo, 85, 90, 92, 96, 141, 153 autoregressive, 28 approximation, 203 averaging, 427, 433, 440 EM, 183, 184 Bernoulli-Laplace, 208, 227 maximization, 204 capture-recapture, see capture- optimization, 158 recapture population, see population censored, 327 move change-point, 454 birth, 437 choice, 125 death, 437 completed, 341 deterministic, 432 decomposable, 423 merge, 433, 438 embedded, 433 Q,R,M, 578 generalized linear, 287 split, 433, 438 graphical, 422 multimodality, 22 hidden Markov, see hidden multiple chains, 464 Markov, 549 hierarchical, 383 The Name of the Rose, 458 Ising, see Ising neural network, 201, 579 linear calibration, 375 neuron, 579 Index of Subjects 641

Newton-Raphson algorithm, see Paharmacokinetics, 384 algorithm paradox of trapping states, 368 nodes, 422 parallel chains, 462, 479 non-stationarity, 466 parameter Normal of interest, 9, 18, 25, 31 approximation, 85, 283 location-scale, 398 variate generation, 40, 43, 69 natural, 8 normality noncentrality, 121 asymptotic, 86 nuisance, 9, 80, 89 violation, 130 parameterization, 377, 388, 399, 408 normalization, 485 cascade, 398, 419 normalizing constant, 14, 16, 17, 50, particle, 545, 547, 552 137, 147, 191, 271, 367, 405, particle filter, see filter 431, 471, 479 auxiliary, 554 and Rao-Blackwellization, 309 partition product, 456 approximation, 95 partitioning, 155 evaluation, 479 patch clamp recordings, 579 Northern Pintail ducks, 59 path, 461 nuclear pump failures, 386, 4 70, 4 7 4 properties, 4 79 null recurrence and stability, 242 simulation, 554 Nummelin's splitting, 216, 225 single, 269 perceptron, 458 perfect simulation, 239, 471, 511, 512 Oakes' identity, 186 and renewal, 474 Occam's Razor, 458 first occurrence, 512 on-line monitoring, 482 performances optimality, 316 of congruential generators, 73 of an algorithm, 41, 58, 91, 95, of estimators, 81 137, 374 of the Gibbs sampler, 367, 388, 396 of an estimator, 14, 82 of importance sampling, 915 of the Metropolis-Hastings of integer generators, 72 algorithm, 272 of the Langevin diffusion, 320 optimization, 19, 52, 79, 293, 294, 313, of the Metropolis-Hastings 315 algorithm, 292, 295, ~:17 exploratory methods, 162 of Monte Carlo estimates, 96 Monte Carlo, 160 of renewal control, 496 Newton-Raphson algorithm, 20 of simulated annealing, 169 simulated annealing, 166 period, 72, 73, 218 stochastic, 268 Peskun's ordering, 394, 578 order statistics, 44, 65 Physics, 37 ordering pine tree, 452 natural, 526 Plumlitas non est ponenda sine stochastic, 518 neccesitate, 458 Orey's inequality, 253, 304 Poisson variate generation, 55 0-ring, 15 polar coordinates, 111 outlier, 384 population Monte Carlo, 560 overdispersion, 
383 positivity, 225, 273, 344, 377 overfitting, 428 condition, 345 overparameterization, 398 potential function, 258, 291 642 Index of Subjects power of a test, 12 properties principle mixing, 512 duality, 351, 354, 367, 495, 512, sample path, 239 524 pseudo-likelihood, 7 likelihood, 239 pseudo-prior, 452 MCMC, 267 parsimony, 458 quality control, 3 squeeze, 53, 54 quantile, 500 prior conjugate, 13, 30, 82 R, 22, 420, 509 Dirichlet, 424 random effects, 406 feedback, 169 random mapping, 513 hierarchical, 383 random walk, 206, 225, 226, 250, 257, improper, 399, 403, 405 284, 287, 290, 292, 295, 315, influence of, 171 317 information, 30 with drift, 319 Jeffreys, 28 multiplicative, 249 noninformative, 31, 383 on a subgraph, 322 pseudo, 452 randomness, 36 reference, 14, 31 range of proposal, 295 for the calibration model, 18 Rao-Blackwellization, 130, 133, 296, robust, 124 402-403 probability and importance, 133 density function (pdf) and monotonicity, 354, 361 monotone, 72 argument, 351 unimodal, 72 for continuous time jump processes, integral transform, 39 447 Metropolis-Hastings acceptance, convergence assessment, 484, 486, 271 496 of acceptance, see acceptance of densities, 403 probability for population Monte Carlo, 564 of regeneration, 4 73 implementation, 131 problem improvement, 297, 356, 357 Behrens-Fisher, 12 for mixtures, 367 Fieller's, 18 nonparametric, 295 procedure parametric, 490 doubling, 336 weight, 298, 310 stepping-out, 336 rate process acceptance, 286, 292, 293, 295, birth-and-death, 446 316, 31~ 373, 382, 401 branching, 219 mixing, 464 forward recurrence, 253 regeneration, 4 71 jump, 446 renewal, 494 Langevin diffusion, see Langevin rats carcinoma, 24 Poisson, 43, 66, 386, 454 raw sequence plot, 483 renewal, 232 recurrence, 207, 219, 222, 225, 259 programming, 41 and admissibility, 262 parallel, 153 and positivity, 223 propagation, 571 Harris, see Harris recurrence Index of Subjects 643

null, 223 independent, 140 positive, 223, 406 preliminary, 502 recursion equation, 421 test, 501 recycling, 354, 485, 499 uniform, 141 reference prior, see prior sampling reflecting boundary, 291 batch, 356 regeneration, 124, 246, 302, 471, 473 comb, 574 regime, 549 Gibbs, see Gibbs sampling region importance, 299, 497 confidence, 16 parallel, 546 highest posterior density, 16, 18 residual, 229, 574 regression stratified, 155 isotonic, 25, 172 sandwiching argument, 520 linear, 7, 27, 28, 498 saturation, see model saturation logistic, 15, 146 scale, 163 Poisson, 59 scaling, 443 qualitative, 15 seeds, 75, 515 relativity of convergence assessments, semi-Markov chain, 576 465 sensitivity, 90, 92 reliability, 23 separators, 423 of an estimation, 96 set renewal, 261, 470 Kendall, 260 and regeneration, 471 recurrent, 220 control, 495 small, 215, 218, 221, 235, 259, 274, probabilities, 473, 493 470, 471, 490, 495 rate, 494 transient, 249 theory, 215, 261, 470 uniformly transient, 220, 249 time, 217, 229, 491 shift Rényi representation, 65 left, 74 reparameterization, 159, 367, 399 right, 74 resampling, 32 shuttle Challenger, 15, 281 systematic, 555 signal processing, 158 unbiased, 114 Simpson's rule, 21 resolvant, 211 simulated annealing, 23, 209, 287, 315 chain, 251 acceleration, 160 reversibility, 229, 244, 429 and EM algorithm, 178 reversible jump, 540 and prior feedback, 171 Riemann integration, 21 and tempering, 540 Riemann sum principle, 163 and Rao-Blackwellization, 410 simulation, 22, 80, 83, 268, 269, 403 control variate, 474, 483 in parallel, 124, 464 Robbins-Monro, 201-203 motivation for, 1, 4, 5, 80 conditions, 202 numerical, 462 robustness, 10, 90 parallel, 464 running mean plot, 127, 128 path, 556 philosophical paradoxes of, 36, 37 saddlepoint, 120, 162 recycling of, 295 approximation, 174, 282 Riemann sum, 135-139 sample twisted, 119

univariate, 372 stationarity, 239, 483 versus numerical methods, 19, 21, stationarization, 461 22, 40, 85, 157 stationary distribution, 206, 223, 512 waste, 269, 295 and diffusions, 206 size as limiting distribution, 207 convergence, 501 statistic warm-up, 501 order, 135, 151 slice, 335 sufficient, 9, 10, 100, 130 slice sampler, 176, 321, 322, 326 Statistical Science, 33 convergence of, 329 Statlib, 502 drift condition, 330 stochastic genesis, 335 approximation, 174, 201 geometric ergodicity of, 332 differential equation, 318 polar, 332, 365 exploration, 159 poor performance of, 332 gradient, 287 relationship to Gibbs sampling, equation, 318 337, 343 monotonicity, 518 uniform, 364 optimization, 268, 271 uniform ergodicity of, 329 recursive sequence, 514, 536 univariate, 526 restoration, 340 small dimension problems, 22 volatility, 549 smoothing stopping rule, 212, 281, 465, 491, 497, backward, 572 499, 502 spatial Statistics, 168 stopping time, 516 spectral analysis, 508 Student's t speed approximation, 283 convergence, 295, 388, 393, 402, posterior, 300 489 variate generation, 46, 65 for empirical means, 479 subgraph, 321 mixing, 462, 482 subsampling, 462, 500 splitting, 219 and convergence assessment, 463 S-Plus, 41, 509 and independence, 468 squared error loss, 108 support stability, 398, 488 connected, 275 of a path, 96 restricted, 97, 101 stabilizing, 495 unbounded, 49 start-up table, 73 sweeping, 397 state, 549 switching models, 579 absorbing, 368 initial, 209 tail area, 122, 282 period, 217 approximation, 91, 112, 114, 122 recurrent, 219 tail event, 240 transient, 219 tail probability estimation, 93 trapping, 368, 399 temperature, 163, 167 state-space, 549 decrease rate of, 167 continuous, 210 tempering, 540-543, 558 discrete, 224, 313, 394 power, 540 finite, 168, 206, 227, 284, 352, 495, termwise Rao-Blackwellized estimator, 512, 579 151 test Metropolis-Hastings, 270 χ², 508 pseudo-reversible, 472 Die Hard, 37 transition kernel, 
206, 208, 210, 490 Kolmogorov-Smirnov, 37, 76, 466, atom, 214 509 choice of, 292 Kuiper, 466 symmetric, 207 likelihood ratio, 86 traveling salesman problem, 308 for Markov chains, 500 tree swallows, 196 nonparametric, 466 turnip greens, 414 power of, 86, 90 randomness, 38 unbiased estimator stationarity, 462 density, 486 uniform, 36 L2 distance, 505 testing, 86 uniform ergodicity, 229, 261, 525 theorem uniform random variable, 39, 47, Bayes, 12, 14, 50, 572 generation, 35, 36, 39 Central Limit, 43, 83, 123, 235-237, unit roots, 453 239, 242-244, 246, 261, 264, universality, 58, 275, 295 376, 481, 491, 492, 559 of Metropolis-Hastings algorithms, Cramér, 119 272 Cramér-Wold's, 126 ergodic, 239-241, 269, 302, 462, variable 481 antithetic, 140, 143 Fubini, 13 auxiliary, 3, 340, 374 fundamental, simulation, 47 latent, 2, 327, 341, 391, 550 Glivenko-Cantelli, 32 variance Hammersley-Clifford, 343, 344, asymptotic, 491, 496 408, 485 between- and within-chains, 497 Kac's, 224, 229, 471 finite, 94, 95, 97 Kendall, 236 infinite, 102 Rao-Blackwell, 130, 296 of a ratio, 125 theory reduction, 90, 91, 96, 130-132, 141 Decision, 14, 81, 83 and Accept-Reject, 143 Neyman-Pearson, 12 and antithetic variables, 151 renewal, 470 and control variates, 145, 147 There ain't no free lunch, 558 optimal, 155 time variate (control), 145 computation, 486 velocities of galaxies, 426, 439, 450 interjump, 317 vertices, 422 renewal, 216, 491 virtual observation, 171 stopping, 211, 212, 216, 459 volatility, 550 time series, 4, 508 waiting time, 51 for convergence assessment, 508 Wealth is a mixed blessing, 443, 446, total variation norm, 207, 231, 236, 253 570 training sample, 201 trajectory (stability of), 218 transience of a discretized diffusion, 318 You've only seen where you've been, transition, 273 464, 470
