Joint Probability Distributions, Correlations

What we learned so far…
• Events:
– Working with events as sets: union, intersection, etc.
• Some events are simple: Heads vs. Tails, Cancer vs. Healthy
• Some are more complex: 10–20,000 random variables Xi, one for each of a genome's genes
• A joint probability distribution describes the behavior of several random variables
• We will start with just two random variables X and Y and generalize when necessary
Chapter 5 Introduction

Joint Probability Mass Function Defined
The joint probability mass function of the discrete random variables X and Y, denoted f_XY(x, y), satisfies:

(1) f_XY(x, y) ≥ 0 (all probabilities are non-negative)
(2) Σ_x Σ_y f_XY(x, y) = 1 (the sum of all probabilities is 1)
(3) f_XY(x, y) = P(X = x, Y = y)   (5-1)
Sec 5‐1.1 Joint Probability Distributions

Example 5‐1: # Repeats vs. Signal Bars
You use your cell phone to check your airline reservation. It asks you to speak the name of your departure city to the voice recognition system.
• Let Y denote the number of times you have to state your departure city.
• Let X denote the number of bars of signal strength on your cell phone.
y = number of times the city name is stated; x = number of bars of signal strength.

y \ x    1      2      3
1        0.01   0.02   0.25
2        0.02   0.03   0.20
3        0.02   0.10   0.05
4        0.15   0.10   0.05

Figure 5‐1 Joint probability distribution of X and Y, shown as a bar chart ("Number of Repeats vs. Cell Phone Bars"). The table cells are the probabilities. Observe that more bars relate to less repeating.
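As a sanity check, the table above can be encoded and verified against properties (1)–(3) of a joint PMF. A minimal Python sketch (the dictionary layout is just one convenient encoding, not part of the example):

```python
# Joint PMF of Example 5-1, keyed by (x, y):
# x = number of bars of signal strength, y = number of times the city name is stated.
f = {
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
}

# Property (1): all probabilities are non-negative.
assert all(p >= 0 for p in f.values())

# Property (2): the probabilities sum to 1 (up to floating-point rounding).
assert abs(sum(f.values()) - 1.0) < 1e-9

# Property (3): f(x, y) = P(X = x, Y = y), e.g. P(X = 3, Y = 1) = 0.25.
assert f[(3, 1)] == 0.25
```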
Marginal Probability Distributions (discrete)

For a discrete joint PMF, there are marginal distributions for each random variable, formed by summing the joint PMF over the other variable.
f_X(x) = Σ_y f_XY(x, y)
f_Y(y) = Σ_x f_XY(x, y)

y = number of times the city name is stated; x = number of bars of signal strength.

y \ x      1      2      3      f_Y(y)
1          0.01   0.02   0.25   0.28
2          0.02   0.03   0.20   0.25
3          0.02   0.10   0.05   0.17
4          0.15   0.10   0.05   0.30
f_X(x)     0.20   0.25   0.55   1.00

Figure 5‐6 From the prior example, the joint PMF is shown in green while the two marginal PMFs are shown in purple. They are called marginal because they are written in the margins.
Sec 5‐1.2 Marginal Probability Distributions

Mean & Variance of X and Y are calculated using marginal distributions
y = number of times the city name is stated; x = number of bars of signal strength.

y \ x        1      2      3      f_Y(y)   y·f_Y(y)   y²·f_Y(y)
1            0.01   0.02   0.25   0.28     0.28       0.28
2            0.02   0.03   0.20   0.25     0.50       1.00
3            0.02   0.10   0.05   0.17     0.51       1.53
4            0.15   0.10   0.05   0.30     1.20       4.80
f_X(x)       0.20   0.25   0.55   1.00     2.49       7.61
x·f_X(x)     0.20   0.50   1.65   2.35
x²·f_X(x)    0.20   1.00   4.95   6.15
μ_X = E(X) = 2.35;  σ_X² = V(X) = 6.15 − 2.35² = 6.15 − 5.5225 = 0.6275
μ_Y = E(Y) = 2.49;  σ_Y² = V(Y) = 7.61 − 2.49² = 7.61 − 6.2001 = 1.4099
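The tabulated moments can be reproduced from the marginals. A Python sketch (the joint PMF dictionary simply restates the Example 5‐1 table):

```python
# Joint PMF of Example 5-1, keyed by (x, y).
f = {
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
}

# Marginals: sum the joint PMF over the other variable.
fX = {x: sum(p for (xi, _), p in f.items() if xi == x) for x in (1, 2, 3)}
fY = {y: sum(p for (_, yi), p in f.items() if yi == y) for y in (1, 2, 3, 4)}

# Mean and variance computed from the marginal distributions.
muX = sum(x * p for x, p in fX.items())                  # E(X) = 2.35
varX = sum(x * x * p for x, p in fX.items()) - muX ** 2  # V(X) = 0.6275
muY = sum(y * p for y, p in fY.items())                  # E(Y) = 2.49
varY = sum(y * y * p for y, p in fY.items()) - muY ** 2  # V(Y) = 1.4099
```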
Conditional Probability Distributions

Recall that P(B|A) = P(A ∩ B)/P(A). Likewise,
P(Y=y|X=x) = P(X=x, Y=y)/P(X=x) = f_XY(x, y)/f_X(x)
From Example 5‐1 (y = number of times the city name is stated; x = number of bars of signal strength):

y \ x      1      2      3      f_Y(y)
1          0.01   0.02   0.25   0.28
2          0.02   0.03   0.20   0.25
3          0.02   0.10   0.05   0.17
4          0.15   0.10   0.05   0.30
f_X(x)     0.20   0.25   0.55   1.00

P(Y=1|X=3) = 0.25/0.55 = 0.455
P(Y=2|X=3) = 0.20/0.55 = 0.364
P(Y=3|X=3) = 0.05/0.55 = 0.091
P(Y=4|X=3) = 0.05/0.55 = 0.091
Sum = 1.00
Note that there are 12 probabilities conditional on X, and 12 more probabilities conditional upon Y.
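The conditional distribution of Y given X = 3 can be checked the same way. A Python sketch (column values taken from the Example 5‐1 table):

```python
# Column X = 3 of the Example 5-1 joint PMF: f_XY(3, y) for y = 1..4.
col = {1: 0.25, 2: 0.20, 3: 0.05, 4: 0.05}
fX3 = sum(col.values())  # marginal f_X(3) = 0.55

# Conditional PMF of Y given X = 3: divide the column by the marginal.
cond = {y: p / fX3 for y, p in col.items()}

# A conditional PMF is itself a PMF: it sums to 1.
assert abs(sum(cond.values()) - 1.0) < 1e-9

# It matches the slide's rounded values, e.g. P(Y=1|X=3) = 0.455.
assert round(cond[1], 3) == 0.455
assert round(cond[2], 3) == 0.364
```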
Sec 5‐1.3 Conditional Probability Distributions

Joint Random Variable Independence
• Random variable independence means that knowledge of the values of X does not change any of the probabilities associated with the values of Y.
• The opposite, dependence, implies that knowledge of the values of Y changes the probabilities associated with the values of X
Sec 5‐1.4 Independence
Independence for Discrete Random Variables • Remember independence of events (slide 21 lecture 3) : P(A|B)=P(A ∩ B)/P(B)=P(A) or P(B|A)= P(A ∩ B)/P(A)=P(B) or P(A ∩ B)=P(A)∙ P(B)
• Random variables X and Y are independent if, for every x and y, the events A = {Y = y} and B = {X = x} are independent:
P(Y=y|X=x) = P(Y=y) for any x, or
P(X=x|Y=y) = P(X=x) for any y, or
P(X=x, Y=y) = P(X=x)·P(Y=y) for any x and y

X and Y are Bernoulli variables
        Y=0   Y=1
X=0     2/6   1/6
X=1     2/6   1/6
What is the marginal PY(Y=0)? A. 1/6 B. 2/6 C. 3/6 D. 4/6 E. I don’t know
Get your i‐clickers
X and Y are Bernoulli variables
        Y=0   Y=1
X=0     2/6   1/6
X=1     2/6   1/6

What is the conditional P(X=0|Y=0)? A. 2/6 B. 1/2 C. 1/6 D. 4/6 E. I don’t know
X and Y are Bernoulli variables
        Y=0   Y=1
X=0     2/6   1/6
X=1     2/6   1/6

Are they independent? A. yes B. no C. I don’t know
X and Y are Bernoulli variables
        Y=0   Y=1
X=0     1/2   0
X=1     0     1/2

Are they independent? A. yes B. no C. I don’t know
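The two clicker tables can be tested mechanically: X and Y are independent exactly when every joint probability equals the product of its marginals. A Python sketch (`is_independent` is a helper name invented here):

```python
def is_independent(joint):
    """joint[(x, y)] -> P(X=x, Y=y).
    True iff P(X=x, Y=y) = P(X=x) * P(Y=y) for every cell."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    px = {x: sum(joint[(x, y)] for y in ys) for x in xs}  # marginal of X
    py = {y: sum(joint[(x, y)] for x in xs) for y in ys}  # marginal of Y
    return all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-9
               for x in xs for y in ys)

# First clicker table: every cell factorizes, so X and Y ARE independent.
table1 = {(0, 0): 2/6, (0, 1): 1/6, (1, 0): 2/6, (1, 1): 1/6}
# Second table: all mass on the diagonal, so X and Y are NOT independent.
table2 = {(0, 0): 1/2, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1/2}

assert is_independent(table1)
assert not is_independent(table2)
```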
Joint Probability Density Function Defined

The joint probability density function for the continuous random variables X and Y, denoted as f_XY(x, y), satisfies the following properties:
(1) f_XY(x, y) ≥ 0 for all x, y
(2) ∫∫ f_XY(x, y) dx dy = 1
(3) P((X, Y) ∈ R) = ∫∫_R f_XY(x, y) dx dy   (5-2)
Figure 5‐2 Joint probability density function for the random variables X and Y. Probability that (X, Y) is in the region R is determined by the volume of
fXY(x,y) over the region R.
Joint Probability Density Function
Figure 5‐3 Joint probability density function for the continuous random variables X and Y of expression levels of two different genes. Note the asymmetric, narrow ridge shape of the PDF – indicating that small values in the X dimension are more likely to occur when small values in the Y dimension occur.
Marginal Probability Distributions (continuous)
• Rather than summing a discrete joint PMF, we integrate a continuous joint PDF.
• The marginal PDFs are used to make probability statements about one variable.
• If the joint probability density function of random variables X and Y is f_XY(x, y), the marginal probability density functions of X and Y are:
f_X(x) = ∫_y f_XY(x, y) dy
f_Y(y) = ∫_x f_XY(x, y) dx   (5-3)
Conditional Probability Density Function Defined

Given continuous random variables X and Y with joint probability density function f_XY(x, y), the conditional probability density function of Y given X = x is

f_{Y|x}(y) = f_XY(x, y)/f_X(x) = f_XY(x, y) / ∫_y f_XY(x, y) dy,  if f_X(x) > 0   (5-4)
Compare to the discrete case: P(Y=y|X=x) = f_XY(x, y)/f_X(x). The conditional PDF satisfies the following properties:
(1) f_{Y|x}(y) ≥ 0
(2) ∫ f_{Y|x}(y) dy = 1
(3) P(Y ∈ B | X = x) = ∫_B f_{Y|x}(y) dy for any set B in the range of Y

Conditional Probability Distributions
• Conditional probability distributions can be developed for multiple random variables by extension of the ideas used for two random variables. • Suppose p = 5 and we wish to find the distribution of
X1, X2 and X3 conditional on X4 = x4 and X5 = x5:

f_{X1 X2 X3 | x4 x5}(x1, x2, x3) = f_{X1 X2 X3 X4 X5}(x1, x2, x3, x4, x5) / f_{X4 X5}(x4, x5)

for f_{X4 X5}(x4, x5) > 0.
Sec 5‐1.5 More Than Two Random Variables

Independence for Continuous Random Variables
For random variables X and Y, if any one of the following properties is true, the others are also true, and X and Y are independent. (Recall the discrete case: P(Y=y|X=x)=P(Y=y) for any x, or P(X=x|Y=y)=P(X=x) for any y, or P(X=x, Y=y)=P(X=x)·P(Y=y) for any x and y.)
(1) f_XY(x, y) = f_X(x) f_Y(y) for all x and y
(2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
(3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
(4) P(X ∈ A, Y ∈ B) = P(X ∈ A)·P(Y ∈ B) for any sets A and B in the range of X and Y, respectively.   (5-7)
Covariation, Correlations
Covariance Defined
Covariance is a number quantifying the average dependence between two random variables.
The covariance between the random variables X and Y, denoted cov(X, Y) or σ_XY, is

σ_XY = E[(X − μ_X)(Y − μ_Y)] = E(XY) − μ_X μ_Y   (5-14)

The units of σ_XY are units of X times units of Y.
Unlike variance, covariance can be negative: −∞ < σ_XY < ∞.
Sec 5‐2 Covariance & Correlation

Covariance and PMF tables
y = number of times the city name is stated; x = number of bars of signal strength.

y \ x    1      2      3
1        0.01   0.02   0.25
2        0.02   0.03   0.20
3        0.02   0.10   0.05
4        0.15   0.10   0.05

The probability distribution of Example 5‐1 is shown.
By inspection, note that the larger probabilities occur as X and Y move in opposite directions. This indicates a negative covariance.
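That negative covariance can be computed directly from the table via σ_XY = E(XY) − μ_X μ_Y. A Python sketch (the numeric value is implied by the table, not stated on the slide):

```python
# Joint PMF of Example 5-1, keyed by (x, y).
f = {
    (1, 1): 0.01, (2, 1): 0.02, (3, 1): 0.25,
    (1, 2): 0.02, (2, 2): 0.03, (3, 2): 0.20,
    (1, 3): 0.02, (2, 3): 0.10, (3, 3): 0.05,
    (1, 4): 0.15, (2, 4): 0.10, (3, 4): 0.05,
}

muX = sum(x * p for (x, _), p in f.items())      # E(X) = 2.35
muY = sum(y * p for (_, y), p in f.items())      # E(Y) = 2.49
exy = sum(x * y * p for (x, y), p in f.items())  # E(XY)
cov = exy - muX * muY                            # sigma_XY = E(XY) - muX*muY

# Larger probabilities occur as X and Y move in opposite directions,
# so the covariance comes out negative.
assert cov < 0
```

Evaluating by hand: E(XY) = 5.27 and μ_X μ_Y = 5.8515, so σ_XY = −0.5815.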
Covariance and Scatter Patterns
Figure 5‐13 Joint probability distributions and the sign of cov(X, Y). Note that covariance is a measure of linear relationship. Variables with non‐zero covariance are correlated.
Independence Implies σ = ρ = 0, but not vice versa
• If X and Y are independent random variables,
σXY = ρXY = 0 (5‐17)
• ρXY = 0 is a necessary, but not a sufficient, condition for independence.

[Figure: scatter examples. Independent variables have covariance = 0, but variables that are NOT independent can also have covariance = 0.]

Correlation is “normalized covariance”
• Also called: Pearson correlation coefficient
ρXY = σXY/(σX σY) is the covariance normalized so that −1 ≤ ρXY ≤ 1.

Karl Pearson (1852–1936), English mathematician and biostatistician
Spearman rank correlation
• Pearson correlation tests for a linear relationship between X and Y
• It is unreliable for variables with broad distributions, where non‐linear effects dominate
• Spearman correlation tests for any monotonic relationship between X and Y
• Calculate the ranks (1 to n), rX(i) and rY(i), of the variables in both samples, then calculate the Pearson correlation between the ranks: Spearman(X,Y) = Pearson(rX, rY)
• Ties: convert to fractional ranks, e.g. a tie for 6th and 7th place: both get rank 6.5. This can lead to artefacts.
• If there are many ties: use the Kendall rank correlation (Kendall tau)

Matlab exercise: Correlation/Covariation
• Generate a sample with Stats=100,000 of two Gaussian random variables r1 and r2 which have mean 0 and standard deviation 2 and are:
– Uncorrelated
– Correlated with correlation coefficient 0.9
– Correlated with correlation coefficient -0.5
– Trick: first make uncorrelated r1 and r2. Then change r1 to: r1m=mix.*r2+(1-mix.^2)^0.5.*r1; where mix = corr. coeff.
• In each case calculate the covariance and the correlation coefficient
• In each case make a scatter plot: plot(r1,r2,'k.');

Matlab exercise: Correlation/Covariation
1. Stats=100000;
2. r1=2.*randn(Stats,1);
3. r2=2.*randn(Stats,1);
4. disp('Covariance matrix='); disp(cov(r1,r2));
5. disp('Correlation='); disp(corr(r1,r2));
6. figure; plot(r1,r2,'ko');
7. mix=0.9; %Mixes r2 into r1 but keeps the same variance
8. r1m=mix.*r2+sqrt(1-mix.^2).*r1;
9. disp('Covariance matrix='); disp(cov(r1m,r2));
10. disp('Correlation='); disp(corr(r1m,r2));
11. figure; plot(r1m,r2,'ko');
12. mix=-0.5; %REDO LINES 8-11
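For readers without Matlab, the same mixing exercise can be sketched in plain Python. Here `pearson` and `mix_to` are helper names invented for this sketch, and the scatter plots are omitted:

```python
import math
import random

random.seed(0)
n = 100_000  # "Stats" in the Matlab exercise

# Two uncorrelated Gaussian samples with mean 0 and standard deviation 2.
r1 = [random.gauss(0, 2) for _ in range(n)]
r2 = [random.gauss(0, 2) for _ in range(n)]

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length samples."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def mix_to(r1, r2, mix):
    """The slide's trick: mix*r2 + sqrt(1 - mix^2)*r1 keeps the variance of
    r1 while making the correlation with r2 approximately equal to mix."""
    s = math.sqrt(1 - mix ** 2)
    return [mix * y + s * x for x, y in zip(r1, r2)]

print(pearson(r1, r2))                    # close to 0
print(pearson(mix_to(r1, r2, 0.9), r2))   # close to 0.9
print(pearson(mix_to(r1, r2, -0.5), r2))  # close to -0.5
```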
Linear Functions of Random Variables
• A function of random variables is itself a random variable. • A function of random variables can be formed by either linear or nonlinear relationships. We start with linear functions.
• Given random variables X1, X2, …, Xp and constants c1, c2, …, cp,

Y = c1X1 + c2X2 + … + cpXp   (5‐24)

is a linear combination of X1, X2, …, Xp.
Sec 5‐4 Linear Functions of Random Variables
Mean & Variance of a Linear Function
Y= c1X1 + c2X2 + … + cpXp
E(Y) = c1E(X1) + c2E(X2) + … + cpE(Xp)   (5-25)

V(Y) = c1²V(X1) + c2²V(X2) + … + cp²V(Xp) + 2 Σ_{i<j} ci cj cov(Xi, Xj)   (5-26)

If X1, X2, …, Xp are independent, then cov(Xi, Xj) = 0 and

V(Y) = c1²V(X1) + c2²V(X2) + … + cp²V(Xp)   (5-27)
Example 5‐31: Error Propagation
A semiconductor product consists of three layers. The variances of the thicknesses of the layers are 25, 40 and 30 nm². What is the variance of the thickness of the finished product? Answer:
X = X1 + X2 + X3

V(X) = Σ_{i=1}^{3} V(Xi) = 25 + 40 + 30 = 95 nm²

SD(X) = √95 = 9.7 nm
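Because the layers are independent, their variances add, and a quick Monte Carlo run confirms the arithmetic. A Python sketch (modeling the layer thickness deviations as Gaussians is an illustrative assumption, not part of the example):

```python
import math
import random

random.seed(1)
layer_vars = [25, 40, 30]                 # layer thickness variances, nm^2
sds = [math.sqrt(v) for v in layer_vars]  # layer standard deviations, nm

# Analytic answer: variances of independent layers add.
v_total = sum(layer_vars)        # 95 nm^2
sd_total = math.sqrt(v_total)    # ~9.7 nm

# Monte Carlo check: simulate the total deviation from nominal thickness.
n = 200_000
totals = [sum(random.gauss(0, s) for s in sds) for _ in range(n)]
mean = sum(totals) / n
v_mc = sum((t - mean) ** 2 for t in totals) / (n - 1)  # close to 95
```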
Mean & Variance of an Average
If X̄ = (X1 + X2 + … + Xp)/p and E(Xi) = μ for each i, then

E(X̄) = μ   (5-28a)

If the Xi are independent with V(Xi) = σ², then

V(X̄) = pσ²/p² = σ²/p   (5-28b)
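Equation (5‐28b) can be verified by simulation. A Python sketch (the choices of p, σ, and the replicate count are arbitrary for illustration):

```python
import random

random.seed(2)
p = 25          # sample size being averaged
sigma = 3.0     # common standard deviation, so sigma^2 = 9
reps = 100_000  # Monte Carlo replicates

# Empirical distribution of the sample mean of p independent draws.
means = [sum(random.gauss(0, sigma) for _ in range(p)) / p
         for _ in range(reps)]

m = sum(means) / reps
v = sum((x - m) ** 2 for x in means) / (reps - 1)
# Expect E(Xbar) = 0 and V(Xbar) = sigma^2 / p = 9/25 = 0.36
```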
Credit: XKCD comics