Lecture 17: Bivariate Normal
Sta230 / Mth230
Colin Rundel
April 2, 2014

Section 5.3: Bivariate Unit Normal

Bivariate Unit Normal

If X and Y have independent unit normal distributions, then their joint density f(x, y) is given by

f(x, y) = \phi(x)\,\phi(y) = c^2 e^{-\frac{1}{2}(x^2 + y^2)}

Bivariate Unit Normal, cont.

We can rewrite the joint distribution in terms of the distance r from the origin:

r = \sqrt{x^2 + y^2}

f(x, y) = c^2 e^{-\frac{1}{2}(x^2 + y^2)} = c^2 e^{-\frac{1}{2} r^2}

This tells us something useful about this special case of the bivariate normal distribution: it is rotationally symmetric about the origin.

[Figure: contours of f(x, y) in the (X, Y) plane, with a thin annulus between radii r and r + eps]

This particular fact is incredibly powerful and helps us solve a variety of problems.

Bivariate Unit Normal - Normalizing Constant

First, rewriting the joint distribution in this way also makes it possible to evaluate the constant of integration, c.

From our previous discussions of the normal distribution we know that c = \frac{1}{\sqrt{2\pi}}, but we never saw how to derive it explicitly.
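A sketch of the rotational-symmetry argument suggested by the figure: the probability that (X, Y) lands in the thin annulus between r and r + \epsilon is approximately the (constant) density on that circle times the annulus area,

P(R \in (r, r + \epsilon)) \approx c^2 e^{-\frac{1}{2} r^2} \cdot 2\pi r \epsilon \quad\Rightarrow\quad f_R(r) = 2\pi c^2\, r\, e^{-\frac{1}{2} r^2}

Integrating over all r and setting the total probability to 1:

1 = \int_0^\infty 2\pi c^2\, r\, e^{-\frac{1}{2} r^2}\, dr = 2\pi c^2 \left[-e^{-\frac{1}{2} r^2}\right]_0^\infty = 2\pi c^2 \quad\Rightarrow\quad c = \frac{1}{\sqrt{2\pi}}

With c fixed, f_R(r) = r e^{-\frac{1}{2} r^2}, which is exactly the density on the next slide.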

Rayleigh Distribution

The distribution of the distance from the origin of a point (x, y), where X, Y ∼ N(0, 1), is called the Rayleigh distribution.

f_R(r) = r\, e^{-\frac{1}{2} r^2}

F_R(r) = \int_0^r s\, e^{-\frac{1}{2} s^2}\, ds = \left[-e^{-\frac{1}{2} s^2}\right]_0^r = 1 - e^{-\frac{1}{2} r^2}

We will see the expected value of R in a little bit.


Rayleigh Distribution - Example

If the position of a target shooter's shots is described by normal distributions in the x and y directions that are centered on the bull's eye, with a standard deviation of 1 inch:

- What is the probability a shot is within 1 inch of the bull's eye?
- What is the probability a shot is more than 2 inches away from the bull's eye?
- What is the probability a shot is between 1 and 2 inches from the bull's eye?

Bivariate Unit Normal - Variance of the Normal

Assume we did not know that Var(X) = σ² = 1; what would we need to evaluate to find the variance of X ∼ N(0, 1)?

Var(X) = E(X^2) - E(X)^2 = E(X^2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-\frac{1}{2} x^2}\, dx

We can use the Rayleigh distribution to make our life easier:

E(R^2) = E(X^2 + Y^2) = E(X^2) + E(Y^2) = 2 E(X^2)

E(X^2) = E(R^2)/2
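Returning to the shooter example: all three probabilities follow directly from the Rayleigh CDF derived above. A minimal sketch in plain Python (rayleigh_cdf is just an illustrative name, not a library function):

```python
import math

# Rayleigh CDF derived above: F_R(r) = 1 - exp(-r^2 / 2).
def rayleigh_cdf(r):
    return 1 - math.exp(-r**2 / 2)

print(rayleigh_cdf(1))                    # P(R <= 1)     ~= 0.393
print(1 - rayleigh_cdf(2))                # P(R > 2)      ~= 0.135
print(rayleigh_cdf(2) - rayleigh_cdf(1))  # P(1 < R <= 2) ~= 0.471
```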

Bivariate Unit Normal - Variance of the Normal, cont.

E(R²) is still not very easy to evaluate, but we can consider a change of variables where we let S = R².

Rayleigh Distribution - Expectation

We can also use the preceding result to help find the expected value of R.
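A sketch of both computations, first E(R²) via the change of variables S = R², then E(R). With s = r², dr/ds = 1/(2\sqrt{s}):

f_S(s) = f_R(\sqrt{s}) \cdot \frac{1}{2\sqrt{s}} = \sqrt{s}\, e^{-s/2} \cdot \frac{1}{2\sqrt{s}} = \frac{1}{2} e^{-s/2}

so S is Exponential with rate 1/2, giving E(R²) = E(S) = 2 and therefore Var(X) = E(X²) = E(R²)/2 = 1. For the expectation, using the result above in the form \int_{-\infty}^{\infty} x^2 e^{-\frac{1}{2} x^2}\, dx = \sqrt{2\pi}:

E(R) = \int_0^\infty r \cdot r\, e^{-\frac{1}{2} r^2}\, dr = \frac{1}{2} \int_{-\infty}^{\infty} r^2 e^{-\frac{1}{2} r^2}\, dr = \frac{\sqrt{2\pi}}{2} = \sqrt{\frac{\pi}{2}} \approx 1.25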


Section 5.3: Sums of Independent Normal Variables

Sums of Independent Normal Variables

Previously we showed that the sum (or average) of iid random variables converges to the normal distribution as the number of random variables goes to infinity, thanks to the central limit theorem. Intuitively it makes sense, then, that the sum of a small number of normals should also be normal, but we have not explicitly shown this to be true. Also, what about the case where the normals are not identically distributed?

Based on arguments using projections and the bivariate unit normal's rotational symmetry, the book derives the following general result (and special cases):

i) If X ∼ N(µ, σ²) and Y ∼ N(λ, τ²), then X + Y ∼ N(µ + λ, σ² + τ²).
ii) If X, Y ∼ N(0, 1), then αX + βY ∼ N(0, 1) for all α and β with α² + β² = 1.
iii) If X, Y ∼ N(0, 1), then X + Y ∼ N(0, 2).

Alternative Approach

It is also possible to prove these results using moment generating functions. Recall:

- Moment generating functions uniquely identify distributions.
- If X and Y are independent with mgfs M_X(t) and M_Y(t), then M_{X+Y}(t) = M_X(t) M_Y(t).
- If Z ∼ N(µ, σ²), then

M_Z(t) = \exp\left(\mu t + \frac{1}{2} \sigma^2 t^2\right)

Proof of iii)

Let X, Y ∼ N(0, 1) and consider X + Y.
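A sketch of the argument, using the mgf facts above:

M_{X+Y}(t) = M_X(t)\, M_Y(t) = e^{t^2/2} \cdot e^{t^2/2} = \exp\left(\frac{2 t^2}{2}\right)

which is exactly the mgf of N(0, 2); since mgfs uniquely identify distributions, X + Y ∼ N(0, 2).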

Proof of ii)
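A sketch along the same lines, using the additional standard fact that M_{aX}(t) = M_X(at):

M_{\alpha X + \beta Y}(t) = M_X(\alpha t)\, M_Y(\beta t) = e^{\alpha^2 t^2 / 2} \cdot e^{\beta^2 t^2 / 2} = \exp\left(\frac{(\alpha^2 + \beta^2)\, t^2}{2}\right)

which, when α² + β² = 1, reduces to e^{t²/2}, the mgf of N(0, 1).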

Proof of i)

Let X ∼ N(µ, σ²) and Y ∼ N(λ, τ²), and consider X + Y.
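The same mgf argument gives:

M_{X+Y}(t) = M_X(t)\, M_Y(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right) \exp\left(\lambda t + \frac{\tau^2 t^2}{2}\right) = \exp\left((\mu + \lambda)\, t + \frac{(\sigma^2 + \tau^2)\, t^2}{2}\right)

which is the mgf of N(µ + λ, σ² + τ²), as claimed.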

Further Generalization and Example

If X, Y, Z are all independent normal random variables, then we have shown that X′ = X + Y must also be normal, and it follows that Z′ = X′ + Z must be normal as well. Since Z′ = X + Y + Z, the sum of the three is therefore a normal random variable, and the same argument extends to any finite sum of independent normals.

Example - If X, Y, Z ∼ N(0, 1), then what is P(X + Y < Z + 2)?
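A sketch of the solution (assuming scipy is available): since −Z ∼ N(0, 1), result i) applied twice gives X + Y − Z ∼ N(0, 3).

```python
from scipy.stats import norm

# X + Y - Z is a sum of three independent normals: N(0, 1 + 1 + 1) = N(0, 3),
# so P(X + Y < Z + 2) = P(X + Y - Z < 2) = Phi(2 / sqrt(3)).
print(norm.cdf(2 / 3**0.5))  # ~= 0.876
```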

Section 5.3: χ² Distribution

Multivariate Unit Normal

Let Z_1, ..., Z_n ∼ N(0, 1); then the joint distribution of (z_1, ..., z_n) is

f(z_1, \ldots, z_n) = f(z_1) \times \cdots \times f(z_n) = \left(\frac{1}{\sqrt{2\pi}}\right)^n \exp\left[-\frac{1}{2}(z_1^2 + \cdots + z_n^2)\right]

Once again we can model the distance of a point from the origin in this n-dimensional space, using

R_n = \sqrt{Z_1^2 + \cdots + Z_n^2}

P(R_n \in (r, r + \epsilon)) \approx c_n r^{n-1} \left(\frac{1}{\sqrt{2\pi}}\right)^n e^{-\frac{1}{2} r^2}\, \epsilon

where c_n is the (n − 1)-dimensional volume of the "surface" of a sphere of radius 1 in n dimensions, e.g. c_2 = 2π and c_3 = 4π.

Distribution of R_n

f_{R_n}(r) = c_n r^{n-1} \left(\frac{1}{\sqrt{2\pi}}\right)^n e^{-\frac{1}{2} r^2} = c'_n\, r^{n-1} e^{-\frac{1}{2} r^2}, \qquad c'_n = c_n (2\pi)^{-n/2}

If we then perform a change of variables such that S_n = R_n²,

f_{S_n}(s) = f_{R_n}(r)\, \frac{dR_n}{dS_n} = \frac{c'_n\, r^{n-1} e^{-\frac{1}{2} r^2}}{2r} = \frac{c'_n}{2}\, r^{n-2} e^{-\frac{1}{2} r^2} = \frac{c'_n}{2}\, s^{n/2 - 1} e^{-s/2}

which is the Gamma distribution with k = n/2 and θ = 2, i.e. S_n ∼ Gamma(n/2, 2).

This special case of the Gamma distribution is known as the χ² distribution with n degrees of freedom.
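As a sanity check, a simulation sketch (assuming numpy and scipy are available) comparing draws of S_n against the claimed Gamma(n/2, θ = 2) distribution:

```python
import numpy as np
from scipy import stats

# Simulate S_n = R_n^2 = Z_1^2 + ... + Z_n^2 for n = 5.
n, reps = 5, 100_000
rng = np.random.default_rng(0)
s = (rng.standard_normal((reps, n)) ** 2).sum(axis=1)

# Kolmogorov-Smirnov test: a large p-value means the simulated squared
# distances are consistent with Gamma(n/2, scale = 2).
print(stats.kstest(s, stats.gamma(a=n / 2, scale=2).cdf))
```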

Pearson's χ² test

Why do we care about the χ² distribution? Imagine we want to decide if a six-sided die is fair: we roll the die 100 times and total up the results.

Roll:    1   2   3   4   5   6
Counts: 18  13  17  13  16  23

In 1900 Carl Pearson showed that

\sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} = \sum_{i=1}^{k} \frac{(N_i - n p_i)^2}{n p_i} \sim \chi^2(df = k - 1)

If you've taken an introductory statistics course, you've seen this in the form of a hypothesis test where you calculate a p-value (the probability of getting the observed value, or one more extreme, given that H₀ is true).

\text{p-value} = 1 - F_{\chi^2_{k-1}}\left(\sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}\right)
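Applied to the die data above, a sketch assuming numpy and scipy are available (under H₀ the expected count is 100/6 per face):

```python
import numpy as np
from scipy import stats

observed = np.array([18, 13, 17, 13, 16, 23])
expected = np.full(6, observed.sum() / 6)  # fair die: 100/6 per face

# Pearson's statistic and its chi-square p-value with k - 1 = 5 df.
x2 = ((observed - expected) ** 2 / expected).sum()
p = stats.chi2.sf(x2, df=len(observed) - 1)
print(x2, p)  # X^2 ~= 4.16, p ~= 0.53: no evidence the die is unfair

# Equivalent built-in test:
print(stats.chisquare(observed))
```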

Finding c_n

If X ∼ Gamma(k, θ), then its density is given by

f_X(x \mid k, \theta) = \frac{1}{\theta^k\, \Gamma(k)}\, x^{k-1} e^{-x/\theta}

From this we can find c_n.
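The calculation is not shown on the slide; one way to finish it is to match the normalizing constant of the Gamma(n/2, θ = 2) density against the constant c'_n/2 obtained for f_{S_n}:

\frac{c'_n}{2} = \frac{1}{2^{n/2}\, \Gamma(n/2)} \quad\Rightarrow\quad c_n = (2\pi)^{n/2}\, c'_n = \frac{2\, \pi^{n/2}}{\Gamma(n/2)}

which recovers c_2 = 2π and c_3 = 4π.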

Implications of the R_n Distribution

We have shown that the density of R_n is given by

f_{R_n}(r) = c'_n\, r^{n-1} e^{-\frac{1}{2} r^2}

What does this tell us about the distribution of probability mass for a multivariate normal distribution?

An interesting question: where are we most likely to see an observation from an n-dimensional multivariate normal distribution?
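A sketch of the answer: maximize f_{R_n} over r.

\frac{d}{dr}\left[c'_n\, r^{n-1} e^{-\frac{1}{2} r^2}\right] = c'_n\, r^{n-2} e^{-\frac{1}{2} r^2}\left[(n - 1) - r^2\right] = 0 \quad\Rightarrow\quad r^* = \sqrt{n - 1}

So although the joint density is largest at the origin, an observation is most likely to be found at distance about √(n − 1) from it; in high dimensions nearly all of the probability mass lies in a thin shell of radius roughly √n.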
