
A Probability Review

EE 527, Detection and Estimation Theory, # 0b

Outline:
• A probability review.

Shorthand notation: RV stands for random variable.

Reading:
• Go over handouts 2–5 in EE 420x notes.

Basic probability rules:

(1) $\Pr\{\Omega\} = 1$, $\Pr\{\emptyset\} = 0$, $0 \le \Pr\{A\} \le 1$;
$$\Pr\Big\{\bigcup_{i=1}^{\infty} A_i\Big\} = \sum_{i=1}^{\infty} \Pr\{A_i\} \quad \text{if } \underbrace{A_i \cap A_j = \emptyset \text{ for all } i \ne j}_{A_i \text{ and } A_j \text{ disjoint}};$$

(2) $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \cap B\}$, $\Pr\{A^c\} = 1 - \Pr\{A\}$;

(3) if $A \perp\!\!\!\perp B$, then $\Pr\{A \cap B\} = \Pr\{A\} \cdot \Pr\{B\}$;

(4) $\Pr\{A \,|\, B\} = \dfrac{\Pr\{A \cap B\}}{\Pr\{B\}}$ (conditional probability), or $\Pr\{A \cap B\} = \Pr\{A \,|\, B\} \cdot \Pr\{B\}$ (chain rule);

(5) $\Pr\{A\} = \Pr\{A \,|\, B_1\} \Pr\{B_1\} + \cdots + \Pr\{A \,|\, B_n\} \Pr\{B_n\}$ if $B_1, B_2, \ldots, B_n$ form a partition of the full space $\Omega$;

(6) Bayes' rule:
$$\Pr\{A \,|\, B\} = \frac{\Pr\{B \,|\, A\}\, \Pr\{A\}}{\Pr\{B\}}.$$

Reminder: Independence, Correlation and Covariance

For simplicity, we state all the definitions for pdfs; the corresponding definitions for pmfs are analogous.

Two random variables X and Y are independent if
$$f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y).$$

Correlation between real-valued random variables X and Y:
$$E_{X,Y}\{XY\} = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x\, y\, f_{X,Y}(x, y)\, dx\, dy.$$

Covariance between real-valued random variables X and Y:
$$\mathrm{cov}_{X,Y}(X, Y) = E_{X,Y}\{(X - \mu_X)(Y - \mu_Y)\} = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x - \mu_X)(y - \mu_Y)\, f_{X,Y}(x, y)\, dx\, dy$$
where
$$\mu_X = E_X(X) = \int_{-\infty}^{+\infty} x\, f_X(x)\, dx, \qquad \mu_Y = E_Y(Y) = \int_{-\infty}^{+\infty} y\, f_Y(y)\, dy.$$

Uncorrelated random variables: Random variables X and Y are uncorrelated if
$$c_{X,Y} = \mathrm{cov}_{X,Y}(X, Y) = 0. \qquad (1)$$
If X and Y are real-valued RVs, then (1) can be written as $E_{X,Y}\{XY\} = E_X\{X\}\, E_Y\{Y\}$.

Mean Vector and Covariance Matrix

Consider a random vector
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}.$$
The mean of this random vector is defined as
$$\mu_X = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{bmatrix} = E_X\{X\} = \begin{bmatrix} E_{X_1}[X_1] \\ E_{X_2}[X_2] \\ \vdots \\ E_{X_N}[X_N] \end{bmatrix}.$$
Denote the covariance between $X_i$ and $X_k$, $\mathrm{cov}_{X_i,X_k}(X_i, X_k)$, by $c_{i,k}$; hence, the variance of $X_i$ is $c_{i,i} = \mathrm{cov}_{X_i,X_i}(X_i, X_i) = \underbrace{\mathrm{var}_{X_i}(X_i) = \sigma_{X_i}^2}_{\text{more notation}}$. The covariance matrix of X is defined as
$$C_X = \begin{bmatrix} c_{1,1} & c_{1,2} & \cdots & c_{1,N} \\ c_{2,1} & c_{2,2} & \cdots & c_{2,N} \\ \vdots & \vdots & & \vdots \\ c_{N,1} & c_{N,2} & \cdots & c_{N,N} \end{bmatrix}.$$
The above definitions apply to both real and complex vectors X.

Covariance matrix of a real-valued random vector X:
$$C_X = E_X\{(X - E_X[X])\,(X - E_X[X])^T\} = E_X[X X^T] - E_X[X]\,(E_X[X])^T.$$
For real-valued X, $c_{i,k} = c_{k,i}$ and, therefore, $C_X$ is a symmetric matrix.

Linear Transform of Random Vectors

Linear transform: for real-valued Y, X, A,
$$Y = g(X) = A\,X.$$
Mean vector:
$$\mu_Y = E_X\{A\,X\} = A\,\mu_X. \qquad (2)$$
Covariance matrix:
$$C_Y = E_Y\{Y Y^T\} - \mu_Y \mu_Y^T = E_X\{A X X^T A^T\} - A \mu_X \mu_X^T A^T = A\,\underbrace{\big(E_X\{X X^T\} - \mu_X \mu_X^T\big)}_{C_X}\,A^T = A\,C_X\,A^T. \qquad (3)$$

Reminder: Iterated Expectations

In general, we can find $E_{X,Y}[g(X, Y)]$ using iterated expectations:
$$E_{X,Y}[g(X, Y)] = E_Y\{E_{X|Y}[g(X, Y) \,|\, Y]\} \qquad (4)$$
where $E_{X|Y}$ denotes the expectation with respect to $f_{X|Y}(x \,|\, y)$ and $E_Y$ denotes the expectation with respect to $f_Y(y)$.

Proof.
$$\begin{aligned}
E_Y\{E_{X|Y}[g(X, Y) \,|\, Y]\} &= \int_{-\infty}^{+\infty} E_{X|Y}[g(X, Y) \,|\, y]\, f_Y(y)\, dy \\
&= \int_{-\infty}^{+\infty} \Big[\int_{-\infty}^{+\infty} g(x, y)\, f_{X|Y}(x \,|\, y)\, dx\Big]\, f_Y(y)\, dy \\
&= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f_{X|Y}(x \,|\, y)\, f_Y(y)\, dx\, dy \\
&= \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} g(x, y)\, f_{X,Y}(x, y)\, dx\, dy \\
&= E_{X,Y}[g(X, Y)]. \qquad \Box
\end{aligned}$$
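To make (4) concrete, here is a small Monte Carlo check (not part of the original notes; the hierarchical model, the choice $g(x, y) = xy$, and all constants are arbitrary choices for illustration, and numpy is assumed to be available):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hierarchical model (an arbitrary choice for this illustration):
# Y ~ N(0, 1), X | Y = y ~ N(y, 1), and g(x, y) = x * y.
y = rng.standard_normal(n)
x = y + rng.standard_normal(n)

# Left side of (4): direct Monte Carlo estimate of E_{X,Y}[g(X, Y)].
lhs = np.mean(x * y)

# Right side of (4): E_{X|Y}[XY | Y = y] = y * E_{X|Y}[X | y] = y^2,
# so E_Y{E_{X|Y}[g(X, Y) | Y]} is estimated by the average of y^2.
rhs = np.mean(y**2)

print(lhs, rhs)  # both are close to E[Y^2] = 1
```

Drawing Y first and then X given Y mirrors the factorization $f_{X,Y}(x, y) = f_{X|Y}(x \,|\, y)\, f_Y(y)$ used in the proof above.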
Reminder: Law of Conditional Variances

Define the conditional variance of X given Y = y to be the variance of X with respect to $f_{X|Y}(x \,|\, y)$, i.e.
$$\mathrm{var}_{X|Y}(X \,|\, Y = y) = E_{X|Y}\big[(X - E_{X|Y}[X \,|\, y])^2 \,\big|\, y\big] = E_{X|Y}[X^2 \,|\, y] - (E_{X|Y}[X \,|\, y])^2.$$
The random variable $\mathrm{var}_{X|Y}(X \,|\, Y)$ is a function of Y only, taking values $\mathrm{var}_{X|Y}(X \,|\, Y = y)$. Its expected value with respect to Y is
$$\begin{aligned}
E_Y\{\mathrm{var}_{X|Y}(X \,|\, Y)\} &= E_Y\big\{E_{X|Y}[X^2 \,|\, Y] - (E_{X|Y}[X \,|\, Y])^2\big\} \\
&\overset{\text{iterated exp.}}{=} E_{X,Y}[X^2] - E_Y\{(E_{X|Y}[X \,|\, Y])^2\} \\
&= E_X[X^2] - E_Y\{(E_{X|Y}[X \,|\, Y])^2\}.
\end{aligned}$$
Since $E_{X|Y}[X \,|\, Y]$ is a random variable (and a function of Y only), it has variance
$$\begin{aligned}
\mathrm{var}_Y\{E_{X|Y}[X \,|\, Y]\} &= E_Y\{(E_{X|Y}[X \,|\, Y])^2\} - (E_Y\{E_{X|Y}[X \,|\, Y]\})^2 \\
&\overset{\text{iterated exp.}}{=} E_Y\{(E_{X|Y}[X \,|\, Y])^2\} - (E_{X,Y}[X])^2 \\
&= E_Y\{(E_{X|Y}[X \,|\, Y])^2\} - (E_X[X])^2.
\end{aligned}$$
Adding the above two expressions yields the law of conditional variances:
$$E_Y\{\mathrm{var}_{X|Y}(X \,|\, Y)\} + \mathrm{var}_Y\{E_{X|Y}[X \,|\, Y]\} = \mathrm{var}_X(X). \qquad (5)$$

Note: (4) and (5) hold for both real- and complex-valued random variables.

Useful Expectation and Covariance Identities for Real-valued Random Variables and Vectors

$$E_{X,Y}[a X + b Y + c] = a\, E_X[X] + b\, E_Y[Y] + c$$
$$\mathrm{var}_{X,Y}(a X + b Y + c) = a^2\, \mathrm{var}_X(X) + b^2\, \mathrm{var}_Y(Y) + 2 a b\, \mathrm{cov}_{X,Y}(X, Y)$$
where a, b, and c are constants and X and Y are random variables.

A vector/matrix version of the above identities:
$$E_{X,Y}[A X + B Y + c] = A\, E_X[X] + B\, E_Y[Y] + c$$
$$\mathrm{cov}_{X,Y}(A X + B Y + c) = A\, \mathrm{cov}_X(X)\, A^T + B\, \mathrm{cov}_Y(Y)\, B^T + A\, \mathrm{cov}_{X,Y}(X, Y)\, B^T + B\, \mathrm{cov}_{X,Y}(Y, X)\, A^T$$
where $^T$ denotes a transpose and
$$\mathrm{cov}_{X,Y}(X, Y) = E_{X,Y}\{(X - E_X[X])\,(Y - E_Y[Y])^T\}.$$

Useful properties of cross-covariance matrices:
• $\mathrm{cov}_{X,Y,Z}(X, Y + Z) = \mathrm{cov}_{X,Y}(X, Y) + \mathrm{cov}_{X,Z}(X, Z)$;
• $\mathrm{cov}_{Y,X}(Y, X) = [\mathrm{cov}_{X,Y}(X, Y)]^T$;
• $\mathrm{cov}_X(X) = \mathrm{cov}_X(X, X)$ and $\mathrm{var}_X(X) = \mathrm{cov}_X(X, X)$;
• $\mathrm{cov}_{X,Y}(A X + b, P Y + q) = A\, \mathrm{cov}_{X,Y}(X, Y)\, P^T$.

(To refresh memory about covariance and its properties, see p. 12 of handout 5 in EE 420x notes. For random vectors, see handout 7 in EE 420x notes, particularly pp. 1–15.)

Useful theorems:

(1) (handout 5 in EE 420x notes)
$$E_X(X) = E_Y[E_{X|Y}(X \,|\, Y)] \qquad \text{(shown in the iterated-expectations proof above)}$$
$$E_{X|Y}[g(X) \cdot h(Y) \,|\, y] = h(y) \cdot E_{X|Y}[g(X) \,|\, y]$$
$$E_{X,Y}[g(X) \cdot h(Y)] = E_Y\{h(Y) \cdot E_{X|Y}[g(X) \,|\, Y]\}.$$
The vector version of (1) is the same; just make the letters bold.

(2) $\mathrm{var}_X(X) = E_Y[\mathrm{var}_{X|Y}(X \,|\, Y)] + \mathrm{var}_Y(E_{X|Y}[X \,|\, Y])$, shown above; the vector/matrix version is
$$\underbrace{\mathrm{cov}_X(X)}_{\text{variance/covariance matrix of } X} = E_Y[\mathrm{cov}_{X|Y}(X \,|\, Y)] + \mathrm{cov}_Y(E_{X|Y}[X \,|\, Y]).$$

(3) Generalized law of conditional variances:
$$\mathrm{cov}_{X,Y}(X, Y) = E_Z[\mathrm{cov}_{X,Y|Z}(X, Y \,|\, Z)] + \mathrm{cov}_Z\big(E_{X|Z}[X \,|\, Z],\, E_{Y|Z}[Y \,|\, Z]\big).$$

(4) Transformation: if $g(\cdot)$ is one-to-one,
$$Y = g(X) \iff \begin{cases} Y_1 = g_1(X_1, \ldots, X_n) \\ \quad \vdots \\ Y_n = g_n(X_1, \ldots, X_n) \end{cases}$$
then
$$f_Y(y) = f_X\big(h_1(y), \ldots, h_n(y)\big) \cdot |J|$$
where $h(\cdot)$ is the unique inverse of $g(\cdot)$, i.e. $x = h(y)$, and $|J|$ is the absolute value of the determinant of the Jacobian
$$J = \frac{\partial x}{\partial y^T} = \begin{bmatrix} \frac{\partial x_1}{\partial y_1} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \vdots & & \vdots \\ \frac{\partial x_n}{\partial y_1} & \cdots & \frac{\partial x_n}{\partial y_n} \end{bmatrix}.$$
(A numerical sanity check of this formula appears below.)

Print and read the handout Probability distributions from the Course readings section on WebCT. Bring it with you to the midterm exams.
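Here is the promised minimal numerical sketch of the transformation theorem (4) (not part of the original notes). It assumes an invertible linear map $y = g(x) = A x$, so that $h(y) = A^{-1} y$ and $|J| = |\det(A^{-1})|$, and a Gaussian input, purely so the answer can be cross-checked in closed form; the matrix A and the test point are arbitrary, and numpy/scipy are assumed available:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Invertible linear map y = g(x) = A x, so x = h(y) = A^{-1} y and the
# Jacobian J = dx/dy^T = A^{-1}; hence f_Y(y) = f_X(A^{-1} y) |det(A^{-1})|.
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])
A_inv = np.linalg.inv(A)

# Take X ~ N(0, I) so that f_Y is also available in closed form.
f_X = multivariate_normal(mean=np.zeros(2), cov=np.eye(2)).pdf

y = np.array([1.0, -2.0])  # an arbitrary test point
f_Y_formula = f_X(A_inv @ y) * abs(np.linalg.det(A_inv))

# Independent check: Y = A X is N(0, A A^T) when X ~ N(0, I).
f_Y_direct = multivariate_normal(mean=np.zeros(2), cov=A @ A.T).pdf(y)

print(f_Y_formula, f_Y_direct)  # the two values agree
```

For a linear map the Jacobian is constant; for a general one-to-one $g(\cdot)$, J must be evaluated at $x = h(y)$.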
Jointly Gaussian Real-valued RVs

Scalar Gaussian random variables:
$$f_X(x) = \frac{1}{\sqrt{2 \pi \sigma_X^2}}\, \exp\Big[-\frac{(x - \mu_X)^2}{2 \sigma_X^2}\Big].$$

Definition. Two real-valued RVs X and Y are jointly Gaussian if their joint pdf is of the form
$$f_{X,Y}(x, y) = \frac{1}{2 \pi \sigma_X \sigma_Y \sqrt{1 - \rho_{X,Y}^2}}\, \exp\Big\{-\frac{1}{2 (1 - \rho_{X,Y}^2)} \Big[\frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} - 2 \rho_{X,Y}\, \frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y}\Big]\Big\}. \qquad (6)$$
This pdf is parameterized by $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_Y^2$, and $\rho_{X,Y}$. Here, $\sigma_X = \sqrt{\sigma_X^2}$ and $\sigma_Y = \sqrt{\sigma_Y^2}$.

Note: We will soon define a more general multivariate Gaussian pdf.

If X and Y are jointly Gaussian, contours of equal joint pdf are ellipses defined by the quadratic equation
$$\frac{(x - \mu_X)^2}{\sigma_X^2} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} - 2 \rho_{X,Y}\, \frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} = \text{const} \ge 0.$$
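As a sanity check of (6) (not part of the original notes), one can evaluate the formula directly and compare it with the standard matrix form of the bivariate normal density, whose covariance matrix has diagonal entries $\sigma_X^2$, $\sigma_Y^2$ and off-diagonal entries $\rho_{X,Y}\, \sigma_X \sigma_Y$; the parameter values below are arbitrary, and numpy/scipy are assumed available:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y = 1.0, -1.0             # arbitrary parameter choices
sig_x, sig_y, rho = 2.0, 0.5, 0.7

def joint_gaussian_pdf(x, y):
    """Evaluate the jointly Gaussian pdf (6) directly."""
    zx = (x - mu_x) / sig_x
    zy = (y - mu_y) / sig_y
    q = (zx**2 + zy**2 - 2.0 * rho * zx * zy) / (1.0 - rho**2)
    return np.exp(-0.5 * q) / (2.0 * np.pi * sig_x * sig_y * np.sqrt(1.0 - rho**2))

# Matrix form of the same density: mean [mu_x, mu_y] and covariance
# matrix with off-diagonal entries rho * sig_x * sig_y.
C = np.array([[sig_x**2,            rho * sig_x * sig_y],
              [rho * sig_x * sig_y, sig_y**2           ]])
mvn = multivariate_normal(mean=[mu_x, mu_y], cov=C)

x0, y0 = 0.3, 0.4  # an arbitrary evaluation point
print(joint_gaussian_pdf(x0, y0), mvn.pdf([x0, y0]))  # the two values agree
```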