A Probability Review

Outline:

• A probability review.

Shorthand notation: RV stands for random variable.


Reading:

• Go over handouts 2–5 in EE 420x notes.

Basic probability rules:

(1) Pr{Ω} = 1, Pr{∅} = 0, 0 ≤ Pr{A} ≤ 1;

    Pr{∪_{i=1}^{∞} A_i} = ∑_{i=1}^{∞} Pr{A_i}   if A_i ∩ A_j = ∅ (i.e. A_i and A_j are disjoint) for all i ≠ j;

(2) Pr{A ∪ B} = Pr{A} + Pr{B} − Pr{A ∩ B}, Pr{A^c} = 1 − Pr{A};

(3) If A ⊥⊥ B, then Pr{A ∩ B} = Pr{A}· Pr{B};

(4)

Pr{A | B} = Pr{A ∩ B} / Pr{B}   (conditional probability)

or

Pr{A ∩ B} = Pr{A | B}· Pr{B} (chain rule);

(5)

Pr{A} = Pr{A | B1} Pr{B1} + ··· + Pr{A | Bn} Pr{Bn}

if B1,B2,...,Bn form a partition of the full space Ω;

(6) Bayes’ rule:

Pr{A | B} = Pr{B | A} Pr{A} / Pr{B}.
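As a quick numerical illustration of rules (5) and (6), here is a minimal sketch (the partition {B1, B2} and the conditional probabilities below are made-up values) that checks the law of total probability and Bayes' rule both by direct computation and by Monte Carlo:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical partition {B1, B2} of the sample space and a conditional model for the event A
p_B = np.array([0.3, 0.7])            # Pr{B1}, Pr{B2}
p_A_given_B = np.array([0.9, 0.2])    # Pr{A | B1}, Pr{A | B2}

# Rule (5): law of total probability
p_A = np.sum(p_A_given_B * p_B)

# Rule (6): Bayes' rule, Pr{B1 | A} = Pr{A | B1} Pr{B1} / Pr{A}
p_B1_given_A = p_A_given_B[0] * p_B[0] / p_A

# Monte Carlo check
n = 200_000
b = rng.choice([0, 1], size=n, p=p_B)       # which cell of the partition occurred
a = rng.random(n) < p_A_given_B[b]          # did A occur?
print(p_A, a.mean())                        # the two should be close
print(p_B1_given_A, np.mean(b[a] == 0))     # empirical Pr{B1 | A}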

Reminder: Independence, Correlation and Covariance

For simplicity, we state all the definitions for pdfs; the corresponding definitions for pmfs are analogous.

Two random variables X and Y are independent if

fX,Y (x, y) = fX(x) · fY (y).

Correlation between real-valued random variables X and Y :

E_{X,Y}{XY} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} x y f_{X,Y}(x, y) dx dy.

Covariance between real-valued random variables X and Y :

cov_{X,Y}(X, Y) = E_{X,Y}{(X − µ_X)(Y − µ_Y)} = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} (x − µ_X)(y − µ_Y) f_{X,Y}(x, y) dx dy

where

µ_X = E_X(X) = ∫_{−∞}^{+∞} x f_X(x) dx
µ_Y = E_Y(Y) = ∫_{−∞}^{+∞} y f_Y(y) dy.

Uncorrelated random variables: Random variables X and Y are uncorrelated if

cX,Y = covX,Y (X,Y ) = 0. (1)

If X and Y are real-valued RVs, then (1) can be written as

E X,Y {XY } = E X{X} E Y {Y }.
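A small numerical sketch of these definitions (the linear model Y = 2X + noise below is an arbitrary choice used only to produce correlated samples): it estimates E_{X,Y}{XY} and cov_{X,Y}(X, Y) from samples, checks the relation E{XY} = cov(X, Y) + E{X} E{Y}, and shows that independently drawn samples are nearly uncorrelated, as in (1).

import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Correlated pair: Y depends linearly on X (illustrative model only)
x = rng.normal(1.0, 2.0, n)
y = 2.0 * x + rng.normal(0.0, 1.0, n)

corr_xy = np.mean(x * y)                             # estimate of E{XY}
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))    # estimate of cov(X, Y)
print(corr_xy, cov_xy + x.mean() * y.mean())         # E{XY} = cov(X, Y) + E{X} E{Y}

# Independent draws: the sample covariance should be near zero
u, v = rng.normal(size=n), rng.normal(size=n)
print(np.mean((u - u.mean()) * (v - v.mean())))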

Mean Vector and Covariance Matrix:

Consider a random vector

    X = [X_1, X_2, …, X_N]^T.

The mean vector of this random vector is defined as

    µ_X = [µ_1, µ_2, …, µ_N]^T = E_X{X} = [E_{X_1}[X_1], E_{X_2}[X_2], …, E_{X_N}[X_N]]^T.

Denote the covariance between X_i and X_k, cov_{X_i,X_k}(X_i, X_k), by c_{i,k}; hence, the variance of X_i is c_{i,i} = cov_{X_i,X_i}(X_i, X_i) = var_{X_i}(X_i) = σ_{X_i}² (just more notation). The covariance matrix of X is defined as

          [ c_{1,1}  c_{1,2}  ···  c_{1,N} ]
    C_X = [ c_{2,1}  c_{2,2}  ···  c_{2,N} ]
          [    ⋮        ⋮              ⋮   ]
          [ c_{N,1}  c_{N,2}  ···  c_{N,N} ].

The above definitions apply to both real and complex vectors X. Covariance matrix of a real-valued random vector X:

C_X = E_X{(X − E_X[X]) (X − E_X[X])^T} = E_X[X X^T] − E_X[X] (E_X[X])^T.

For real-valued X, c_{i,k} = c_{k,i} and, therefore, C_X is a symmetric matrix.
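A minimal numpy sketch (the mixing matrix and mean below are arbitrary illustrative choices) that estimates the mean vector and covariance matrix of a 3-dimensional random vector from samples and checks the symmetry c_{i,k} = c_{k,i}:

import numpy as np

rng = np.random.default_rng(2)
n, N = 50_000, 3

# Hypothetical random vector X = A Z + m with i.i.d. standard normal Z (illustration only)
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
m = np.array([1.0, -2.0, 0.5])
X = rng.standard_normal((n, N)) @ A.T + m

mu_hat = X.mean(axis=0)              # sample mean vector
C_hat = np.cov(X, rowvar=False)      # sample covariance matrix (N x N)
print(mu_hat)
print(C_hat)
print(np.allclose(C_hat, C_hat.T))   # symmetry: c_{i,k} = c_{k,i}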

Linear Transform of Random Vectors

Linear Transform. For real-valued Y, X, and A,

Y = g(X) = A X.

Mean Vector:

µY = E X{A X} = A µX. (2)

Covariance Matrix:

C_Y = E_Y{Y Y^T} − µ_Y µ_Y^T
    = E_X{A X X^T A^T} − A µ_X µ_X^T A^T
    = A (E_X{X X^T} − µ_X µ_X^T) A^T      (the term in parentheses is C_X)
    = A C_X A^T.   (3)
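Identities (2) and (3) are easy to verify numerically. A sketch (the matrix A and the distribution of X below are arbitrary choices for illustration):

import numpy as np

rng = np.random.default_rng(3)
n = 200_000

mu_X = np.array([1.0, -1.0, 2.0])
C_X = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])

X = rng.multivariate_normal(mu_X, C_X, size=n)   # rows are samples of X
Y = X @ A.T                                      # Y = A X, applied sample by sample

print(Y.mean(axis=0), A @ mu_X)                  # (2): mu_Y = A mu_X
print(np.cov(Y, rowvar=False))                   # sample C_Y
print(A @ C_X @ A.T)                             # (3): C_Y = A C_X A^T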

Reminder: Iterated Expectations

In general, we can find E X,Y [g(X,Y )] using iterated expectations:

E X,Y [g(X,Y )] = E Y {E X | Y [g(X,Y ) | Y ]} (4)

where E X | Y denotes the expectation with respect to fX|Y (x | y) and E Y denotes the expectation with respect to fY (y). Proof.

E_Y{E_{X|Y}[g(X,Y) | Y]} = ∫_{−∞}^{+∞} E_{X|Y}[g(X,Y) | y] f_Y(y) dy
    = ∫_{−∞}^{+∞} ( ∫_{−∞}^{+∞} g(x, y) f_{X|Y}(x | y) dx ) f_Y(y) dy
    = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) f_{X|Y}(x | y) f_Y(y) dx dy
    = ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} g(x, y) f_{X,Y}(x, y) dx dy

= E X,Y [g(X,Y )].

□

Reminder: Law of Conditional Variances

Define the conditional variance of X given Y = y to be the variance of X with respect to f_{X|Y}(x | y), i.e.

var_{X|Y}(X | Y = y) = E_{X|Y}[(X − E_{X|Y}[X | y])² | y]
                     = E_{X|Y}[X² | y] − (E_{X|Y}[X | y])².

The random variable var_{X|Y}(X | Y) is a function of Y only, taking values var(X | Y = y). Its expectation with respect to Y is

E_Y{var_{X|Y}(X | Y)} = E_Y{E_{X|Y}[X² | Y] − (E_{X|Y}[X | Y])²}
     (iterated exp.) = E_{X,Y}[X²] − E_Y{(E_{X|Y}[X | Y])²}
                     = E_X[X²] − E_Y{(E_{X|Y}[X | Y])²}.

Since E X|Y [X | Y ] is a random variable (and a function of Y only), it has variance:

var_Y{E_{X|Y}[X | Y]} = E_Y{(E_{X|Y}[X | Y])²} − (E_Y{E_{X|Y}[X | Y]})²
     (iterated exp.) = E_Y{(E_{X|Y}[X | Y])²} − (E_{X,Y}[X])²
                     = E_Y{(E_{X|Y}[X | Y])²} − (E_X[X])².

Adding the above two expressions yields the law of conditional variances:

E_Y{var_{X|Y}(X | Y)} + var_Y{E_{X|Y}[X | Y]} = var_X(X).   (5)

Note: (4) and (5) hold for both real- and complex-valued random variables.
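A Monte Carlo sketch of (5) under one hypothetical hierarchical model (Y ∼ N(0, 1) and X | Y ∼ N(Y, 1), chosen only because both sides are easy to evaluate): here var_{X|Y}(X | Y) = 1 and E_{X|Y}[X | Y] = Y, so both sides of (5) should be close to 2.

import numpy as np

rng = np.random.default_rng(4)
n = 500_000

# Hypothetical hierarchical model: Y ~ N(0, 1), then X | Y ~ N(Y, 1)
y = rng.standard_normal(n)
x = y + rng.standard_normal(n)

# E{var(X | Y)} + var{E[X | Y]} = 1 + var(Y) should equal var(X)
lhs = 1.0 + np.var(y)
rhs = np.var(x)
print(lhs, rhs)    # both approximately 2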

Useful Expectation and Covariance Identities for Real-valued Random Variables and Vectors

E_{X,Y}[a X + b Y + c] = a · E_X[X] + b · E_Y[Y] + c
var_{X,Y}(a X + b Y + c) = a² var_X(X) + b² var_Y(Y) + 2 a b · cov_{X,Y}(X, Y)

where a, b, and c are constants and X and Y are random variables. A vector/matrix version of the above identities:

E_{X,Y}[A X + B Y + c] = A E_X[X] + B E_Y[Y] + c
cov_{X,Y}(A X + B Y + c) = A cov_X(X) A^T + B cov_Y(Y) B^T + A cov_{X,Y}(X, Y) B^T + B cov_{X,Y}(Y, X) A^T

where "^T" denotes a transpose and

cov_{X,Y}(X, Y) = E_{X,Y}{(X − E_X[X]) (Y − E_Y[Y])^T}.

Useful properties of crosscovariance matrices:

• cov_{X,Y,Z}(X, Y + Z) = cov_{X,Y}(X, Y) + cov_{X,Z}(X, Z).

• cov_{Y,X}(Y, X) = [cov_{X,Y}(X, Y)]^T.

• cov_X(X) = cov_X(X, X).

• var_X(X) = cov_X(X, X).

• cov_{X,Y}(A X + b, P Y + q) = A cov_{X,Y}(X, Y) P^T.

(To refresh memory about covariance and its properties, see p. 12 of handout 5 in EE 420x notes. For random vectors, see handout 7 in EE 420x notes, particularly pp. 1–15.)
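These identities can also be checked by simulation. A sketch for the last one, cov(A X + b, P Y + q) = A cov(X, Y) P^T (the joint distribution, matrices, and offsets below are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(5)
n = 300_000

def xcov(U, V):
    # Sample cross-covariance matrix between the rows of U and the rows of V
    Uc, Vc = U - U.mean(axis=0), V - V.mean(axis=0)
    return Uc.T @ Vc / (len(U) - 1)

# Draw a 4-dimensional Gaussian vector and split it into X (first 2) and Y (last 2)
C = np.array([[2.0, 0.3, 0.5, 0.1],
              [0.3, 1.0, 0.2, 0.4],
              [0.5, 0.2, 1.5, 0.0],
              [0.1, 0.4, 0.0, 1.0]])
Z = rng.multivariate_normal(np.zeros(4), C, size=n)
X, Y = Z[:, :2], Z[:, 2:]

A = np.array([[1.0, -1.0], [0.0, 2.0]])
b = np.array([3.0, -2.0])
P = np.array([[0.5, 1.0], [2.0, 0.0]])
q = np.array([1.0, 1.0])

print(xcov(X @ A.T + b, Y @ P.T + q))   # sample cov(A X + b, P Y + q)
print(A @ xcov(X, Y) @ P.T)             # A cov(X, Y) P^T; should agree up to Monte Carlo error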

Useful theorems:

(1) (handout 5 in EE 420x notes)

E_X(X) = E_Y[E_{X|Y}(X | Y)]   (shown above, under Iterated Expectations)

E X | Y [g(X) · h(Y ) | y] = h(y) · E X | Y [g(X) | y]

E X,Y [g(X) · h(Y )] = E Y {h(Y ) · E X | Y [g(X) | Y ]}.

The vector version of (1) is the same — just put bold letters.

(2)

varX(X) = E Y [varX | Y (X | Y )] + varY (E X | Y [X|Y ]);

and the vector/matrix version is

cov_X(X) = E_Y[cov_{X|Y}(X | Y)] + cov_Y(E_{X|Y}[X | Y])

where cov_X(X) is the variance/covariance matrix of X (shown on p. …).

(3) Generalized law of conditional variances:

covX,Y (X,Y ) = E Z[covX,Y | Z(X,Y | Z)]

+covZ(E X | Z[X | Z], E Y | Z[Y | Z]).

(4) Transformation:

Y = g(X)   (one-to-one)   ⟺   Y_1 = g_1(X_1, …, X_n), …, Y_n = g_n(X_1, …, X_n);

then

f_Y(y) = f_X(h_1(y), …, h_n(y)) · |J|

where h(·) is the unique inverse of g(·) and

    J = ∂x/∂y^T = [ ∂x_1/∂y_1  ···  ∂x_1/∂y_n ]
                  [     ⋮                ⋮    ]
                  [ ∂x_n/∂y_1  ···  ∂x_n/∂y_n ]

(in the formula above, |J| denotes the absolute value of the determinant of J).
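A sketch of the transformation formula in the scalar case (an arbitrary illustrative choice: Y = g(X) = e^X with X ∼ N(0, 1), so h(y) = ln y and |J| = 1/y), comparing a probability computed from samples of Y with the same probability computed from f_X(h(y)) · |J|:

import numpy as np
from scipy import integrate, stats

rng = np.random.default_rng(6)

# One-to-one transform Y = exp(X), X ~ N(0, 1); inverse h(y) = log(y), |J| = 1/y
x = rng.standard_normal(500_000)
y = np.exp(x)

# Pr{1 < Y <= 3} from the samples
p_empirical = np.mean((y > 1.0) & (y <= 3.0))

# Pr{1 < Y <= 3} by integrating f_Y(y) = f_X(log y) * (1 / y)
p_formula, _ = integrate.quad(lambda t: stats.norm.pdf(np.log(t)) / t, 1.0, 3.0)

print(p_empirical, p_formula)   # both approximately Phi(log 3) - Phi(0)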

Print and read the handout Probability distributions from the Course readings section on WebCT. Bring it with you to the midterm exams.

Jointly Gaussian Real-valued RVs

Scalar Gaussian random variables:

f_X(x) = (1 / √(2 π σ_X²)) exp[ −(x − µ_X)² / (2 σ_X²) ].

Definition. Two real-valued RVs X and Y are jointly Gaussian if their joint pdf is of the form

f_{X,Y}(x, y) = 1 / (2 π σ_X σ_Y √(1 − ρ_{X,Y}²))
    · exp{ −[ (x − µ_X)²/σ_X² + (y − µ_Y)²/σ_Y² − 2 ρ_{X,Y} (x − µ_X)(y − µ_Y)/(σ_X σ_Y) ] / [2 (1 − ρ_{X,Y}²)] }.   (6)

This pdf is parameterized by µ_X, µ_Y, σ_X², σ_Y², and ρ_{X,Y}. Here, σ_X = √(σ_X²) and σ_Y = √(σ_Y²).

Note: We will soon define a more general multivariate Gaussian pdf.

If X and Y are jointly Gaussian, contours of equal joint pdf are ellipses defined by the quadratic equation

(x − µ_X)²/σ_X² + (y − µ_Y)²/σ_Y² − 2 ρ_{X,Y} (x − µ_X)(y − µ_Y)/(σ_X σ_Y) = const ≥ 0.

Examples: In the following examples, we plot contours of the joint pdf fX,Y (x, y) for zero-mean jointly Gaussian RVs for various values of σX, σY , and ρX,Y .
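The original contour plots are not reproduced here, but they are easy to regenerate. A matplotlib sketch along those lines (σ_X, σ_Y, and ρ_{X,Y} below are example values; vary them to see how the ellipses change orientation and eccentricity):

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

sigma_x, sigma_y, rho = 1.0, 2.0, 0.7   # example parameter values
cov = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                [rho * sigma_x * sigma_y, sigma_y**2]])

xg, yg = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
pdf = stats.multivariate_normal(mean=[0.0, 0.0], cov=cov).pdf(np.dstack((xg, yg)))

plt.contour(xg, yg, pdf)                # elliptical contours of f_{X,Y}(x, y)
plt.gca().set_aspect("equal")
plt.xlabel("x")
plt.ylabel("y")
plt.show()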

If X and Y are jointly Gaussian, the conditional pdfs are Gaussian, e.g.

X | {Y = y} ∼ N( ρ_{X,Y} · σ_X · (y − E_Y[Y])/σ_Y + E_X[X],  (1 − ρ_{X,Y}²) · σ_X² ).   (7)

If X and Y are jointly Gaussian and uncorrelated, i.e. ρX,Y = 0, they are also independent.

Gaussian Random Vectors

Real-valued Gaussian random vectors:

f_X(x) = (1 / ((2 π)^{N/2} |C_X|^{1/2})) exp[ −(1/2) (x − µ_X)^T C_X⁻¹ (x − µ_X) ].

Complex-valued Gaussian random vectors:

f_Z(z) = (1 / (π^N |C_Z|)) exp[ −(z − µ_Z)^H C_Z⁻¹ (z − µ_Z) ].

Notation for real- and complex-valued Gaussian random vectors:

X ∼ N_r(µ_X, C_X)  [or simply N(µ_X, C_X)]   real

X ∼ N_c(µ_X, C_X)   complex.

An affine transform of a Gaussian vector is also a Gaussian random vector, i.e. if

Y = A X + b

then

Y ∼ N_r(A µ_X + b, A C_X A^T)   real
Y ∼ N_c(A µ_X + b, A C_X A^H)   complex.
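A common use of this property is generating a correlated Gaussian vector from a white one: if W ∼ N_r(0, I) and C_X = L L^T is a Cholesky factorization, then X = µ_X + L W ∼ N_r(µ_X, C_X). A minimal sketch (µ_X and C_X below are arbitrary example values):

import numpy as np

rng = np.random.default_rng(7)

mu_X = np.array([1.0, 0.0, -2.0])
C_X = np.array([[4.0, 1.0, 0.5],
                [1.0, 3.0, 0.2],
                [0.5, 0.2, 2.0]])

L = np.linalg.cholesky(C_X)              # C_X = L L^T
W = rng.standard_normal((3, 100_000))    # white Gaussian: W ~ N(0, I)
X = mu_X[:, None] + L @ W                # affine transform of a Gaussian random vector

print(X.mean(axis=1))                    # approximately mu_X
print(np.cov(X))                         # approximately C_X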

The Gaussian random vector W ∼ N_r(0, σ² I_n) (where I_n denotes the identity matrix of size n) is called white; pdf contours of a white Gaussian random vector are spheres centered at the origin. Suppose that W[n], n = 0, 1, …, N − 1 are independent, identically distributed (i.i.d.) zero-mean univariate Gaussian N(0, σ²). Then, for W = [W[0], W[1], …, W[N − 1]]^T,

f_W(w) = N(w | 0, σ² I).

Suppose now that, for these W [n],

Y[n] = θ + W[n], where θ is a constant. What is the joint pdf of Y[0], Y[1], …, and Y[N − 1]? This pdf is the pdf of the vector Y = [Y[0], Y[1], …, Y[N − 1]]^T:

Y = 1 θ + W where 1 is an N × 1 vector of ones. Now,

f_Y(y) = N(y | 1 θ, σ² I).

Since θ is a constant,

fY (y) = fY | θ(y | θ).

Gaussian Random Vectors

A real-valued random vector X = [X_1, X_2, …, X_n]^T with

• mean µ and
• covariance matrix Σ with |Σ| > 0 (i.e. Σ is positive definite)

is a Gaussian random vector (or X_1, X_2, …, X_n are jointly Gaussian RVs) if and only if its joint pdf is

f_X(x) = (1 / |2 π Σ|^{1/2}) exp[ −(1/2) (x − µ)^T Σ⁻¹ (x − µ) ].   (8)

Verify that, for n = 2, this joint pdf reduces to the two-dimensional pdf in (6). Notation: We use X ∼ N(µ, Σ) to denote a Gaussian random vector. Since Σ is positive definite, Σ⁻¹ is also positive definite and, for x ≠ µ,

(x − µ)^T Σ⁻¹ (x − µ) > 0

which implies that the contours of the multivariate Gaussian pdf in (8) are ellipsoids.

The Gaussian random vector X ∼ N(0, σ² I_n) (where I_n denotes the identity matrix of size n) is called white — contours of the pdf of a white Gaussian random vector are spheres centered at the origin.

Properties of Real-valued Gaussian Random Vectors

Property 1: For a Gaussian random vector, “uncorrelation” implies independence.

This is easy to verify by setting Σ_{i,j} = 0 for all i ≠ j in the joint pdf; then Σ becomes diagonal and so does Σ⁻¹, and the joint pdf reduces to the product of marginal pdfs f_{X_i}(x_i) = N(µ_i, Σ_{i,i}) = N(µ_i, σ_{X_i}²). Clearly, this property holds for blocks of RVs (subvectors) as well.

Property 2: A linear transform of a Gaussian random vector

X ∼ N (µX, ΣX) yields a Gaussian random vector:

Y = A X ∼ N(A µ_X, A Σ_X A^T).

It is easy to show that E_Y[Y] = A µ_X and cov_Y(Y) = Σ_Y = A Σ_X A^T. So

E Y [Y ] = E X[A X] = A E X[X] = A µX and

Σ_Y = E_Y[(Y − E_Y[Y]) (Y − E_Y[Y])^T]
    = E_X[(A X − A µ_X)(A X − A µ_X)^T]
    = A E_X[(X − µ_X)(X − µ_X)^T] A^T = A Σ_X A^T.

Of course, if we use the definition of a Gaussian random vector in (8), we cannot yet claim that Y is a Gaussian random vector. (For a different definition of a Gaussian random vector, we would be done right here.)

Proof. We need to verify that the joint pdf of Y indeed has the right form. Here, we decide to take the equivalent (easier) task and verify that the characteristic function of Y has the right form.

Definition. Suppose X ∼ f_X(x). Then the characteristic function of X is given by

Φ_X(ω) = E_X[exp(j ω^T X)]

where ω is an n-dimensional real-valued vector and j = √(−1).

Thus

Φ_X(ω) = ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} f_X(x) exp(j ω^T x) dx,

proportional to the inverse multi-dimensional Fourier transform of f_X(x); therefore, we can find f_X(x) by taking the Fourier transform (with the appropriate proportionality factor):

f_X(x) = (1 / (2π)^n) ∫_{−∞}^{+∞} ··· ∫_{−∞}^{+∞} Φ_X(ω) exp(−j ω^T x) dω.

Example: The characteristic function for X ∼ N(µ, σ²) is given by

Φ_X(ω) = exp(−(1/2) ω² σ² + j µ ω)   (9)

and for a Gaussian random vector Z ∼ N(µ, Σ),

Φ_Z(ω) = exp(−(1/2) ω^T Σ ω + j ω^T µ).   (10)

Now, go back to our proof: the characteristic function of Y = A X is

Φ_Y(ω) = E_Y[exp(j ω^T Y)]
        = E_X[exp(j ω^T A X)]
        = exp(−(1/2) ω^T A Σ_X A^T ω + j ω^T A µ_X)      [apply (10) with A^T ω in place of ω].

Thus

Y = A X ∼ N(A µ_X, A Σ_X A^T).

Property 3: Marginals of a Gaussian random vector are Gaussian, i.e. if X is a Gaussian random vector, then, for any {i_1, i_2, …, i_k} ⊂ {1, 2, …, n},

    Y = [X_{i_1}, X_{i_2}, …, X_{i_k}]^T

is a Gaussian random vector. To show this, we use Property 2. Here is an example with n = 3 and Y = [X_1, X_3]^T. We set

    Y = [ 1  0  0 ] [ X_1 ]
        [ 0  0  1 ] [ X_2 ]
                    [ X_3 ]

thus

    Y ∼ N( [ µ_1 ],  [ Σ_{1,1}  Σ_{1,3} ] ).
           [ µ_3 ]   [ Σ_{3,1}  Σ_{3,3} ]

Here

    E_X{ [X_1, X_2, X_3]^T } = [µ_1, µ_2, µ_3]^T

and

    cov_X( [X_1, X_2, X_3]^T ) = [ Σ_{1,1}  Σ_{1,2}  Σ_{1,3} ]
                                 [ Σ_{2,1}  Σ_{2,2}  Σ_{2,3} ]
                                 [ Σ_{3,1}  Σ_{3,2}  Σ_{3,3} ]

and note that

    [ µ_1 ]   [ 1  0  0 ] [ µ_1 ]
    [ µ_3 ] = [ 0  0  1 ] [ µ_2 ]
                          [ µ_3 ]

and

    [ Σ_{1,1}  Σ_{1,3} ]   [ 1  0  0 ] [ Σ_{1,1}  Σ_{1,2}  Σ_{1,3} ] [ 1  0 ]
    [ Σ_{3,1}  Σ_{3,3} ] = [ 0  0  1 ] [ Σ_{2,1}  Σ_{2,2}  Σ_{2,3} ] [ 0  0 ]
                                       [ Σ_{3,1}  Σ_{3,2}  Σ_{3,3} ] [ 0  1 ].

The converse of Property 3 does not hold in general; here is a counterexample:

Example: Suppose X_1 ∼ N(0, 1) and

    X_2 = { +1,  w.p. 1/2
          { −1,  w.p. 1/2

are independent RVs and consider X3 = X1 X2. Observe that

• X3 ∼ N (0, 1) and

• fX1,X3(x1, x3) is not a jointly Gaussian pdf.

□

Property 4: Conditionals of Gaussian random vectors are Gaussian, i.e. if

    X = [ X_1 ] ∼ N( [ µ_1 ],  [ Σ_{1,1}  Σ_{1,2} ] )
        [ X_2 ]      [ µ_2 ]   [ Σ_{2,1}  Σ_{2,2} ]

then

{X_2 | X_1 = x_1} ∼ N( Σ_{2,1} Σ_{1,1}⁻¹ (x_1 − µ_1) + µ_2,  Σ_{2,2} − Σ_{2,1} Σ_{1,1}⁻¹ Σ_{1,2} )

and

{X_1 | X_2 = x_2} ∼ N( Σ_{1,2} Σ_{2,2}⁻¹ (x_2 − µ_2) + µ_1,  Σ_{1,1} − Σ_{1,2} Σ_{2,2}⁻¹ Σ_{2,1} ).

Example: Compare this result to the case of n = 2 in (7):

{X_2 | X_1 = x_1} ∼ N( (Σ_{2,1}/Σ_{1,1}) (x_1 − µ_1) + µ_2,  Σ_{2,2} − Σ_{2,1} Σ_{1,2}/Σ_{1,1} ).

In particular, having X = X2 and Y = X1, y = x1, this result becomes:

{X | Y = y} ∼ N( (σ_{X,Y}/σ_Y²) (y − µ_Y) + µ_X,  σ_X² − σ_{X,Y}²/σ_Y² )

where σ_{X,Y} = cov_{X,Y}(X, Y), σ_X² = cov_{X,X}(X, X) = var_X(X), and σ_Y² = cov_{Y,Y}(Y, Y) = var_Y(Y). Now, it is clear that

ρ_{X,Y} = σ_{X,Y} / (σ_X σ_Y)

where σ_X = √(σ_X²) > 0 and σ_Y = √(σ_Y²) > 0.
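A numerical sketch of Property 4 (the 3-dimensional distribution and the conditioning value below are arbitrary illustrative choices): it evaluates the conditional mean and covariance from the formula and compares them with empirical moments of X_2 taken over samples whose X_1 lies in a thin window around x_1.

import numpy as np

rng = np.random.default_rng(8)

mu = np.array([0.0, 1.0, -1.0])
Sig = np.array([[2.0, 0.8, 0.3],
                [0.8, 1.5, 0.5],
                [0.3, 0.5, 1.0]])

# Partition: X1 = first component (scalar), X2 = last two components
S11, S12 = Sig[:1, :1], Sig[:1, 1:]
S21, S22 = Sig[1:, :1], Sig[1:, 1:]

x1 = np.array([1.2])                                           # conditioning value
cond_mean = S21 @ np.linalg.solve(S11, x1 - mu[:1]) + mu[1:]
cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)

# Empirical check: keep samples whose X1 is near x1 and look at X2
X = rng.multivariate_normal(mu, Sig, size=2_000_000)
keep = np.abs(X[:, 0] - x1[0]) < 0.02
print(cond_mean, X[keep, 1:].mean(axis=0))
print(cond_cov)
print(np.cov(X[keep, 1:], rowvar=False))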


Property 5: If X ∼ N(µ, Σ) then

(X − µ)^T Σ⁻¹ (X − µ) ∼ χ_n²   (chi-square with n degrees of freedom; see your distr. table).
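A quick simulation check of Property 5 (arbitrary example values of µ and Σ with n = 3, so the quadratic form should follow a chi-square distribution with 3 degrees of freedom):

import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

mu = np.array([1.0, -1.0, 0.5])
Sig = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])

X = rng.multivariate_normal(mu, Sig, size=200_000)
d = X - mu
q = np.einsum("ij,ij->i", d @ np.linalg.inv(Sig), d)   # (x - mu)^T Sig^{-1} (x - mu)

print(np.mean(q))                                      # mean of chi-square(3) is 3
print(stats.kstest(q, "chi2", args=(3,)).statistic)    # should be small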

Additive Gaussian Noise Channel

Consider a communication channel with input

X ∼ N(µ_X, τ_X²) and noise W ∼ N(0, σ²), where X and W are independent and the measurement Y is

Y = X + W.

Since X and W are independent, we have

f_{X,W}(x, w) = f_X(x) f_W(w) and

    [ X ] ∼ N( [ µ_X ],  [ τ_X²  0  ] ).
    [ W ]      [  0  ]   [  0    σ² ]

What is fY | X(y | x)? Since

{Y | X = x} = x + W ∼ N(x, σ²), we have

f_{Y|X}(y | x) = N(y | x, σ²).

How about f_Y(y)? Construct the joint pdf f_{X,Y}(x, y) of X and Y: since

    [ X ]   [ 1  0 ] [ X ]
    [ Y ] = [ 1  1 ] [ W ]

then

    [ X ] ∼ N( [ 1  0 ] [ µ_X ],  [ 1  0 ] [ τ_X²  0  ] [ 1  1 ] )
    [ Y ]      [ 1  1 ] [  0  ]   [ 1  1 ] [  0    σ² ] [ 0  1 ]

yielding

    [ X ] ∼ N( [ µ_X ],  [ τ_X²  τ_X²      ] ).
    [ Y ]      [ µ_X ]   [ τ_X²  τ_X² + σ² ]

Therefore,

f_Y(y) = N(y | µ_X, τ_X² + σ²).
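A simulation sketch of this channel (example values µ_X = 1, τ_X² = 4, σ² = 0.25) that confirms the joint covariance of (X, Y) and the Gaussian marginal of Y:

import numpy as np
from scipy import stats

rng = np.random.default_rng(10)

mu_x, tau2, sigma2 = 1.0, 4.0, 0.25       # example channel parameters
n = 500_000

x = rng.normal(mu_x, np.sqrt(tau2), n)    # channel input X
w = rng.normal(0.0, np.sqrt(sigma2), n)   # independent noise W
y = x + w                                 # measurement Y = X + W

print(np.cov(x, y))                       # approx [[tau2, tau2], [tau2, tau2 + sigma2]]
# Marginal of Y should be N(mu_x, tau2 + sigma2)
print(stats.kstest(y, "norm", args=(mu_x, np.sqrt(tau2 + sigma2))).statistic)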

Complex Gaussian Distribution

Consider the joint pdf of the real and imaginary parts of an n × 1 complex vector Z = U + j V. Assume

    X = [ U ].
        [ V ]

The 2n-variate Gaussian pdf of the (real!) vector X is

f_X(x) = (1 / √((2 π)^{2n} |Σ_X|)) exp[ −(1/2) (x − µ_X)^T Σ_X⁻¹ (x − µ_X) ]

where

    µ_X = [ µ_U ],   Σ_X = [ Σ_UU  Σ_UV ].
          [ µ_V ]          [ Σ_VU  Σ_VV ]

Therefore,

Pr{X ∈ A} = ∫_{x∈A} f_X(x) dx.

Complex Gaussian Distribution (cont.)

Assume that ΣX has a special structure:

ΣUU = ΣVV and ΣUV = −ΣVU .

[Note that Σ_UV = Σ_VU^T has to hold as well.] Then, we can define a complex Gaussian pdf of Z as

f_Z(z) = (1 / (π^n |Σ_Z|)) exp[ −(z − µ_Z)^H Σ_Z⁻¹ (z − µ_Z) ]

where

µ_Z = µ_U + j µ_V
Σ_Z = E_Z{(Z − µ_Z)(Z − µ_Z)^H} = 2 (Σ_UU + j Σ_VU)
0 = E_Z{(Z − µ_Z)(Z − µ_Z)^T}.

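A small numerical sketch of this construction (n = 2; Σ_UU and the skew-symmetric Σ_VU below are arbitrary values satisfying the special structure above): it draws Z = U + jV and checks that E{(Z − µ_Z)(Z − µ_Z)^H} ≈ 2(Σ_UU + j Σ_VU) while the pseudo-covariance E{(Z − µ_Z)(Z − µ_Z)^T} ≈ 0.

import numpy as np

rng = np.random.default_rng(11)
n = 500_000

Suu = np.array([[2.0, 0.5], [0.5, 1.5]])     # = Sigma_UU = Sigma_VV
Svu = np.array([[0.0, 0.3], [-0.3, 0.0]])    # Sigma_VU, skew-symmetric; Sigma_UV = -Sigma_VU
Sx = np.block([[Suu, -Svu], [Svu, Suu]])     # covariance of the stacked real vector [U; V]

mu_u, mu_v = np.array([1.0, 0.0]), np.array([0.0, -1.0])
X = rng.multivariate_normal(np.concatenate([mu_u, mu_v]), Sx, size=n)
U, V = X[:, :2], X[:, 2:]

Z = U + 1j * V
D = Z - (mu_u + 1j * mu_v)

print(D.T @ D.conj() / n)          # approx Sigma_Z
print(2 * (Suu + 1j * Svu))        # 2 (Sigma_UU + j Sigma_VU)
print(np.round(D.T @ D / n, 3))    # approx zero matrix (pseudo-covariance)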