Random vectors 1/2
Ignacio Cascos, 2018
Outline

4.1 Joint, marginal, and conditional distributions
4.2 Independence
4.3 Transformations of random vectors
4.4 Sums of independent random variables (convolutions)
4.5 Mean vector and covariance matrix
4.6 Multivariate Normal and Multinomial distributions
4.7 Mixtures
4.8 General concept of a random variable
4.9 Random sample
4.10 Order statistics

Introduction

library(datasets)
attach(cars)
plot(speed, dist)

[Figure: scatterplot of dist against speed for the cars dataset]

4.1 Joint, marginal, and conditional distributions

In many situations we are interested in more than one feature (variable) associated with the same random experiment.

A random vector is a measurable mapping from a sample space S into $\mathbb{R}^d$. A bivariate random vector maps S into $\mathbb{R}^2$,
$$(X, Y)\colon S \longrightarrow \mathbb{R}^2.$$
The joint distribution of a random vector describes the simultaneous behavior of all the variables that make up the random vector.

Discrete random vectors

Given two discrete random variables X and Y (on the same probability space), we define

• joint probability mass function: $p_{X,Y}(x, y) = P(X = x, Y = y)$, satisfying
  – $p_{X,Y}(x, y) \geq 0$;
  – $\sum_x \sum_y p_{X,Y}(x, y) = 1$.
• joint cumulative distribution function: $F_{X,Y}(x_0, y_0) = P(X \leq x_0, Y \leq y_0) = \sum_{x \leq x_0} \sum_{y \leq y_0} p_{X,Y}(x, y)$.

For any Borel set $A \subset \mathbb{R}^2$, we use the joint probability mass function to compute the probability that $(X, Y)$ lies in $A$,
$$P((X, Y) \in A) = \sum_{(x_i, y_j) \in A} p_{X,Y}(x_i, y_j).$$

Continuous random vectors

Given two continuous random variables X and Y (on the same probability space), we define

• joint density function: $f_{X,Y}(x, y)$, satisfying
  – $f_{X,Y}(x, y) \geq 0$;
  – $\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dx\,dy = 1$.

We can use it to compute probabilities,
$$P(a \leq X \leq b,\; c \leq Y \leq d) = \int_a^b \int_c^d f_{X,Y}(x, y)\,dy\,dx.$$

• joint cumulative distribution function:
$$F_{X,Y}(x_0, y_0) = P(X \leq x_0, Y \leq y_0) = \int_{-\infty}^{x_0} \int_{-\infty}^{y_0} f_{X,Y}(x, y)\,dy\,dx.$$

We have further
$$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\, \partial y}.$$

Example (uniform continuous random vector on a diamond)

$$f_{X,Y}(x, y) = \begin{cases} 1/2 & \text{if } -1 \leq x + y \leq 1,\ -1 \leq x - y \leq 1 \\ 0 & \text{otherwise} \end{cases}$$

[Figure: the support is the diamond with vertices (±1, 0) and (0, ±1)]

Marginal distributions (discrete)

The distribution of each component of a random vector on its own is referred to as a marginal distribution.

Given two discrete random variables X and Y with joint probability mass function $p_{X,Y}(x, y)$,

• (marginal) probability mass function of X: $p_X(x) = P(X = x) = \sum_y P(X = x, Y = y) = \sum_y p_{X,Y}(x, y)$.
• (marginal) probability mass function of Y: $p_Y(y) = P(Y = y) = \sum_x P(X = x, Y = y) = \sum_x p_{X,Y}(x, y)$.

Marginal distributions (continuous)

Given two continuous random variables X and Y with joint density function $f_{X,Y}(x, y)$,

• (marginal) density function of X: $f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy$.
• (marginal) density function of Y: $f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dx$.

Example (marginals of the uniform random vector on the diamond)

• Given $-1 < x < 0$: $f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy = \int_{-x-1}^{x+1} \frac{1}{2}\,dy = x + 1$.
• Given $0 < x < 1$: $f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy = \int_{x-1}^{-x+1} \frac{1}{2}\,dy = 1 - x$.

$$f_X(x) = \begin{cases} x + 1 & \text{if } -1 < x \leq 0 \\ 1 - x & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

[Figure: the triangular marginal density of X on (−1, 1)]

Conditional distributions (discrete)

The conditional distribution is the distribution of one component given a condition on the other one.

Given two discrete random variables X and Y with joint probability mass function $p_{X,Y}(x, y)$,

• (conditional) probability mass function of Y given $X = x_0$ (where $p_X(x_0) > 0$):
$$p_{Y|X}(y|x_0) = P(Y = y \mid X = x_0) = \frac{P(X = x_0, Y = y)}{P(X = x_0)} = \frac{p_{X,Y}(x_0, y)}{p_X(x_0)}.$$
• (conditional) probability mass function of X given $Y = y_0$ (where $p_Y(y_0) > 0$):
$$p_{X|Y}(x|y_0) = P(X = x \mid Y = y_0) = \frac{P(X = x, Y = y_0)}{P(Y = y_0)} = \frac{p_{X,Y}(x, y_0)}{p_Y(y_0)}.$$
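The discrete definitions above map directly onto R. The sketch below uses a small hypothetical joint pmf (the probabilities are invented, chosen only so that they sum to 1): marginals are row and column sums of the joint table, and conditioning divides a row by the corresponding marginal.

# Hypothetical joint pmf of (X, Y), with X in {0, 1} and Y in {0, 1, 2}
p <- matrix(c(0.10, 0.20, 0.10,
              0.15, 0.25, 0.20),
            nrow = 2, byrow = TRUE,
            dimnames = list(x = c("0", "1"), y = c("0", "1", "2")))
sum(p)               # a valid joint pmf: the entries sum to 1
pX <- rowSums(p)     # marginal pmf of X
pY <- colSums(p)     # marginal pmf of Y
p["1", ] / pX["1"]   # conditional pmf of Y given X = 1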
Conditional distributions (continuous)

Given two continuous random variables X and Y with joint density function $f(x, y)$,

• (conditional) density function of Y given $X = x_0$ (where $f_X(x_0) > 0$):
$$f_{Y|X}(y|x_0) = \frac{f(x_0, y)}{f_X(x_0)}.$$
• (conditional) density function of X given $Y = y_0$ (where $f_Y(y_0) > 0$):
$$f_{X|Y}(x|y_0) = \frac{f(x, y_0)}{f_Y(y_0)}.$$

Example (conditional distributions of the uniform random vector on the diamond)

• Given $-1 < x_0 < 0$:
$$f_{Y|X}(y|x_0) = \frac{f_{X,Y}(x_0, y)}{f_X(x_0)} = \frac{1}{2(x_0 + 1)}, \quad -1 - x_0 < y < 1 + x_0,$$
that is, $Y \mid X = x_0 \sim U(-1 - x_0,\, 1 + x_0)$.
• Given $0 < x_0 < 1$:
$$f_{Y|X}(y|x_0) = \frac{f_{X,Y}(x_0, y)}{f_X(x_0)} = \frac{1}{2(1 - x_0)}, \quad x_0 - 1 < y < 1 - x_0,$$
that is, $Y \mid X = x_0 \sim U(x_0 - 1,\, 1 - x_0)$.

4.2 Independence

Two random variables are independent if the value that one of them assumes does not provide any information about the value that the other one might assume. More precisely, two random variables X and Y defined on the same probability space are independent if for all Borel sets of real numbers $B_1, B_2 \subset \mathbb{R}$,
$$P((X \in B_1) \cap (Y \in B_2)) = P(X \in B_1)\, P(Y \in B_2).$$
Equivalently, X and Y are independent if their joint cdf equals the product of the marginal cdfs, that is, $F_{X,Y}(x, y) = F_X(x) F_Y(y)$ for all $x, y \in \mathbb{R}$.

Independence

• Discrete variables: X and Y are independent if for all x, y any of the following equivalent conditions is fulfilled:
$$p_{Y|X}(y|x) = p_Y(y), \qquad p_{X|Y}(x|y) = p_X(x), \qquad p_{X,Y}(x, y) = p_X(x)\, p_Y(y).$$
• Continuous variables: X and Y are independent if for all x, y any of the following equivalent conditions is fulfilled:
$$f_{Y|X}(y|x) = f_Y(y), \qquad f_{X|Y}(x|y) = f_X(x), \qquad f_{X,Y}(x, y) = f_X(x)\, f_Y(y).$$

Example (uniform continuous random vector on the diamond)

The components are NOT independent: if $-1 < x_0 < 0$, then $Y \mid X = x_0 \sim U(-1 - x_0,\, 1 + x_0)$, which clearly depends on $x_0$.

4.3 Transformations of random vectors

Consider a d-variate random vector $X = (X_1, \ldots, X_d)^t$ and a function $g\colon \mathbb{R}^d \to \mathbb{R}^k$; then $Y = g(X)$ is a k-variate random vector. If $k = 1$, then $Y = g(X)$ is a random variable.

Mean of a univariate transformation of a random vector

• X discrete: $E[Y] = E[g(X)] = \sum_x g(x)\, p_X(x)$.
• X continuous: $E[Y] = E[g(X)] = \int_{\mathbb{R}^d} g(x)\, f_X(x)\,dx$.

Transformations of random vectors

A random vector $X = (X_1, \ldots, X_d)^t$ in $\mathbb{R}^d$ with joint density function $f_X(x)$ is transformed into $Y = (Y_1, \ldots, Y_d)^t = g(X)$, also in $\mathbb{R}^d$, as $Y_1 = g_1(X_1, \ldots, X_d), \ldots, Y_d = g_d(X_1, \ldots, X_d)$, in such a way that the inverse transformation exists. The joint density function of Y is
$$f_Y(y_1, \ldots, y_d) = f_X\big(g^{-1}(y_1, \ldots, y_d)\big) \left| \det \begin{pmatrix} \frac{\partial x_1}{\partial y_1} & \cdots & \frac{\partial x_1}{\partial y_d} \\ \vdots & \ddots & \vdots \\ \frac{\partial x_d}{\partial y_1} & \cdots & \frac{\partial x_d}{\partial y_d} \end{pmatrix} \right|.$$

Example (uniform continuous random vector on the diamond)

$$f_X(x) = \begin{cases} 1/2 & \text{if } -1 \leq x_1 + x_2 \leq 1,\ -1 \leq x_1 - x_2 \leq 1 \\ 0 & \text{otherwise} \end{cases}$$

$$Y = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} X = AX$$

The inverse transformation is
$$X = A^{-1} Y = \begin{pmatrix} 1/2 & -1/2 \\ 1/2 & 1/2 \end{pmatrix} Y,$$
so
$$f_Y(y) = f_X(A^{-1} y)\, |\det(A^{-1})| = \begin{cases} 1/4 & \text{if } -1 < y_1, y_2 < 1 \\ 0 & \text{otherwise.} \end{cases}$$
In other words, $Y_1$ and $Y_2$ are independent $U(-1, 1)$ random variables.
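As an empirical check of this result (a simulation sketch, not part of the derivation above): drawing $(Y_1, Y_2)$ independently from $U(-1, 1)$ and applying $X = A^{-1}Y$ should produce points that fill the diamond uniformly.

set.seed(1)
n <- 5000
y1 <- runif(n, min = -1)   # Y1 ~ U(-1, 1)
y2 <- runif(n, min = -1)   # Y2 ~ U(-1, 1)
x1 <- (y1 - y2) / 2        # first row of A^{-1} applied to (y1, y2)
x2 <- (y1 + y2) / 2        # second row of A^{-1}
plot(x1, x2, pch = ".", asp = 1)   # points fill the diamond |x1| + |x2| <= 1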
4.4 Sums of independent random variables (convolutions)

If $X_1$ and $X_2$ are two continuous and independent random variables with density functions $f_{X_1}(x_1)$ and $f_{X_2}(x_2)$, the density function of $Y = X_1 + X_2$ is
$$f_Y(y) = \int_{-\infty}^{+\infty} f_{X_1}(y - x)\, f_{X_2}(x)\,dx.$$
It corresponds to the marginal distribution of the first component of the transformation
$$Y = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} X.$$
Just observe that
$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.$$

Sum of two independent U(−1, 1) random variables

$$f_{X_1}(x) = f_{X_2}(x) = \begin{cases} 1/2 & \text{if } -1 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

Let $Y = X_1 + X_2$,

• if $-2 < y < 0$, then
$$f_Y(y) = \int_{-\infty}^{+\infty} f_{X_1}(y - x)\, f_{X_2}(x)\,dx = \int_{-1}^{1} \frac{1}{2} f_{X_1}(y - x)\,dx = \int_{-1}^{y+1} \frac{1}{4}\,dx = \frac{y + 2}{4}\,;$$
• if $0 < y < 2$, then
$$f_Y(y) = \int_{-\infty}^{+\infty} f_{X_1}(y - x)\, f_{X_2}(x)\,dx = \int_{-1}^{1} \frac{1}{2} f_{X_1}(y - x)\,dx = \int_{y-1}^{1} \frac{1}{4}\,dx = \frac{2 - y}{4}\,.$$

A simulation confirms this triangular shape:

set.seed(1)
hist(runif(10000, min = -1) + runif(10000, min = -1), probability = TRUE)

[Figure: histogram of the 10000 simulated sums; the density is triangular on (−2, 2) with peak 1/2 at 0]

4.5 Mean vector and covariance matrix

Mean vector

The mean vector of a random vector X is a (column) vector, each of whose components is the mean of the corresponding component of X:
$$\mu = E[X] = \begin{pmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_d] \end{pmatrix}.$$

Covariance and correlation

The covariance is a measure of the linear dependency between two variables,
$$\mathrm{Cov}[X, Y] = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y],$$
and the correlation is its dimensionless version,
$$\rho_{X,Y} = \frac{\mathrm{Cov}[X, Y]}{\sqrt{\mathrm{Var}[X]\,\mathrm{Var}[Y]}}.$$
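In R, the sample analogues of these quantities can be computed directly; a short sketch using the cars data from the introduction:

colMeans(cars)   # sample mean vector of (speed, dist)
cov(cars)        # sample covariance matrix
cor(cars)        # sample correlation matrix; cor(speed, dist) is about 0.81,
                 # consistent with the increasing trend in the introductory scatterplot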