Joint Distributions Math 217 Probability and Statistics a Discrete Example
Total Page:16
File Type:pdf, Size:1020Kb
F ⊆ Ω2, then the distribution on Ω = Ω1 × Ω2 is the product of the distributions on Ω1 and Ω2. But that doesn't always happen, and that's what we're interested in. Joint distributions Math 217 Probability and Statistics A discrete example. Deal a standard deck of Prof. D. Joyce, Fall 2014 52 cards to four players so that each player gets 13 cards at random. Let X be the number of Today we'll look at joint random variables and joint spades that the first player gets and Y be the num- distributions in detail. ber of spades that the second player gets. Let's compute the probability mass function f(x; y) = Product distributions. If Ω1 and Ω2 are sam- P (X=x and Y =y), that probability that the first ple spaces, then their distributions P :Ω1 ! R player gets x spades and the second player gets y and P :Ω2 ! R determine a product distribu- spades. 5239 tion on P :Ω1 × Ω2 ! R as follows. First, if There are hands that can be dealt to E1 and E2 are events in Ω1 and Ω2, then define 13 13 P (E × E ) to be P (E )P (E ). That defines P on 13 39 1 2 1 2 these two players. There are hands \rectangles" in Ω1×Ω2. Next, extend that to count- x 13 − x able disjoint unions of rectangles. There's a bit of for the first player with exactly x spades, and with 13 − x26 + x theory to show that what you get is well-defined. the remaining deck there are Extend to complements of those unions. Again, y 13 − y more theory. Continue by closing under countable hands for the second player with exactly y spades. unions and complements. A lot more theory. But Thus, there are it works. That theory forms part of the the topic 13 39 13 − x26 + x called measure theory which is included in courses on real analysis. x 13 − x y 13 − y When you have the product distribution on Ω1 × ways of dealing hands to those two players with x Ω2, the random variables X and Y are independent, and y spades. Thus, but there are applications where we have a distri- bution on Ω1 × Ω2 that differs from the product 13 39 13 − x26 + x distribution, and for those, X and Y won't be in- x 13 − x y 13 − y f(x; y) = : dependent. That's the situation we'll look at now. 5239 13 13 Definition. A joint random variable (X; Y ) is a random variable on any sample space Ω which is You can write that expression in terms of trino- the product of two sets Ω1 × Ω2. mial coefficients, or in terms of factorials, but we'll Joint random variables do induce probability dis- leave it as it is. If you were a professional Bridge tributions on Ω1 and on Ω2. If E ⊆ Ω1, de- player, you might want to see a table of the values fine P (E) to be the probability in Ω of the set of f(x; y). E × Ω2. That defines P :Ω1 ! R which satisfies It's intuitive that X and Y are not indepen- the axioms for a probability distributions. Simi- dent joint random variables. The more spades the larly, you can define P :Ω2 ! R by declaring for first player has, the fewer the second will proba- F ⊆ Ω2 that P (F ) = P (Ω1 × F ). If it happens bly have. In order to show they're not indepen- that P (E × F ) = P (E)P (F ) for all E ⊆ Ω1 and dent, it's enough to find particular values of x 1 and y so that f(x; y) = P (X=x and Y =y) does We can summarize the cumulative distribution not equal the product of fX (x) = P (X=x) and function as f (y) = P (Y =y). For example, we know they can't Y 8 0 if x < 0 or y < 0 both get 7 spades since there are only 13 spades in > > x2 if 0 ≤ x ≤ 1 and x ≤ y all, so f(7; 7) = 0. But since the first player can get <> F (x; y) = 2xy − y2 if 0 ≤ x ≤ 1 and x > y 7 spades, f (7) > 0, also the second player can, so X > 2y − y2 if x > 1 and 0 ≤ y ≤ 1 f (y) > 0. Therefore f(7; 7) 6= f (7)f (7). We've > Y Y Y :> 1 if x > 1 and y > 1 proven that X and Y are not independent. Generally speaking, joint cumulative distribution A continuous example. Let's choose a point functions aren't used as much as joint density func- from inside a triangle ∆ uniformly at random. Let's tions. Typically, joint c.d.f.'s are much more com- take ∆ to be half the unit square, namely the half plicated to describe, just as in this example. with vertices at (0; 0), (1; 0), and (1; 1). Then Joint distributions and density functions. ∆ = f(x; y) j 0 ≤ y ≤ x ≤ 1g: Density functions are the usual way to describe joint continuous real-valued random variables. 1 The area of this triangle is 2 , so we can find the Let X and Y be two continuous real-valued ran- probability of any event E, a subset of ∆, simply dom variables. Individually, they have their own by doubling its area. The joint probability density cumulative distribution functions 1 function is constantly 2 inside ∆ and 0 outside. FX (x) = P (X ≤ x) FY (y) = P (Y ≤ y); 2 if 0 ≤ y ≤ x ≤ 1 f(x; y) = 0 otherwise whose derivatives, as we know, are the probability density functions A continuous joint random variable (X; Y ) is de- d d termined by its cumulative distribution function f (x) = F (x) f (y) = F (y): X dx X Y dy Y F (x; y) = P (X ≤ x and Y ≤ y): Furthermore, the cumulative distribution functions can be found by integrating the density functions We'll figure out it's value for this example. Z x Z y First, if either x or y is negative, then the event FX (x) = fX (t) dt FY (y) = fY (t) dt: (X ≤ x and Y ≤ y) completely misses the triangle, −∞ −∞ so it's actually empty, so its probability is 0. There is also a joint cumulative distribution func- Second, if x is between 0 and 1 and x ≤ y, then tion for (X; Y ) defined by the event (X ≤ x and Y ≤ y) is a triangle of width 1 2 and height x, so its area is 2 x . Doubling the area F (x; y) = P (X ≤ x and Y ≤ y): gives a probability of x2. Third, if x is between 0 and 1 and x > y, then the The joint probability density function f(x; y) is point (x; y) lies inside the triangle, and the event found by taking the derivative of F twice, once with (X ≤ x and Y ≤ y) is a trapezoid. Its bottom respect to each variable, so that length is x, top length is x − y, and height is y, so @ @ its area is xy − y2=2. Therefore, the probability of f(x; y) = F (x; y): @x @y this event is 2xy − y2. There remain two more cases to be considered, (The notation @ is substituted for d to indicate that but their analyses are omitted. there are other variables in the expression that are 2 held constant while the derivative is taken with respect to the given variable.) The joint cumula- tive distribution function can be recovered from the joint density function by integrating twice Z x Z y F (x; y) = f(s; t) dt ds: −∞ −∞ (When more that one integration is specified, the R y inner integral, namely −∞ f(s; t) dt, is written without parentheses around it, even though the parentheses would help clarify the expression.) Furthermore, the individual cumulative distribu- tion functions are determined by the joint distribu- tion function. FX (x) = P (X ≤ x and Y ≤ 1) = lim F (x; y) = F (x; 1) y!1 FY (y) = P (X ≤ 1 and Y ≤ y) = lim F (x; y) = F (1; y) x!1 The volume under that surface is 1, as it has to be for f to be a probability density function. To find Likewise, the individual density functions can be the probability that (X; Y ) lies in an event E, a found by integrating joint density function. subset of the unit square, just find the volume of Z 1 Z 1 the solid above E and below the surface z = f(x; y). fX (x) = f(x; y) dy; fY (x) = f(x; y) dx −∞ −∞ That's a double integral These individual density functions fX and fy ZZ are often called marginal density functions to dis- P ((X; Y ) 2 E) = f(x; y) dx dy: E tinguish them from the joint density function For example, suppose we want to find the proba- f(X;Y ). Likewise the corresponding individual cu- bility P (0 ≤ X ≤ 3 and 1 ≤ Y ≤ 1). The event mulative distribution functions FX and FY are 4 3 3 1 called marginal cumulative distribution functions to E is (0 ≤ X ≤ 4 and 3 ≤ Y ≤ 1), and that de- scribes a rectangle. The double integral giving the distinguish them form the joint c.d.f F(X;Y ). probability is Another continuous example. The last exam- Z 3=4Z 1 2 ple was a uniform distribution on a triangle.