Statistics for Data Science

MSc Data Science, WiSe 2019/20
Prof. Dr. Dirk Ostwald

(2) Random variables

Random variables
• Definition and notation
• Cumulative distribution functions
• Probability mass and density functions

Definition and notation

Random variables and distributions
• Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space and let $X : \Omega \to \mathcal{X}$ be a function.
• Let $\mathcal{S}$ be a $\sigma$-algebra on $\mathcal{X}$.
• For every $S \in \mathcal{S}$ let the preimage of $S$ be
  $$X^{-1}(S) := \{\omega \in \Omega \mid X(\omega) \in S\}. \quad (1)$$
• If $X^{-1}(S) \in \mathcal{A}$ for all $S \in \mathcal{S}$, then $X$ is called measurable.
• Let $X : \Omega \to \mathcal{X}$ be measurable. Every $S \in \mathcal{S}$ is allocated the probability
  $$\mathbb{P}_X : \mathcal{S} \to [0,1], \; S \mapsto \mathbb{P}_X(S) := \mathbb{P}\left(X^{-1}(S)\right) = \mathbb{P}\left(\{\omega \in \Omega \mid X(\omega) \in S\}\right). \quad (2)$$
• $X$ is called a random variable and $\mathbb{P}_X$ is called the distribution of $X$.
• $(\mathcal{X}, \mathcal{S}, \mathbb{P}_X)$ is a probability space.
• With $\mathcal{X} = \mathbb{R}$ and $\mathcal{S} = \mathcal{B}$, the probability space $(\mathbb{R}, \mathcal{B}, \mathbb{P}_X)$ takes center stage.

Definition (Random variable)
Let $(\Omega, \mathcal{A}, \mathbb{P})$ denote a probability space. A (real-valued) random variable is a mapping
$$X : \Omega \to \mathbb{R}, \; \omega \mapsto X(\omega), \quad (3)$$
with the measurability property
$$\{\omega \in \Omega \mid X(\omega) \in S\} \in \mathcal{A} \text{ for all } S \in \mathcal{S}. \quad (4)$$

Remarks
• Random variables are neither "random" nor "variables".
• Intuitively, $\omega \in \Omega$ gets randomly selected according to $\mathbb{P}$ and $X(\omega)$ is realized.
• The distributions (probability measures) of random variables are central.

Random variables and distributions
• Let $(\Omega, \mathcal{A}, \mathbb{P})$ and $(\mathcal{X}, \mathcal{S}, \mathbb{P}_X)$ denote probability spaces for $X : \Omega \to \mathcal{X}$.
• The following notations for events $A \in \mathcal{A}$ with respect to $X$ are conventional:
  $\{X \in S\} := \{\omega \in \Omega \mid X(\omega) \in S\}$, $S \subset \mathcal{X}$
  $\{X = x\} := \{\omega \in \Omega \mid X(\omega) = x\}$, $x \in \mathcal{X}$
  $\{X \le x\} := \{\omega \in \Omega \mid X(\omega) \le x\}$, $x \in \mathcal{X}$
  $\{X < x\} := \{\omega \in \Omega \mid X(\omega) < x\}$, $x \in \mathcal{X}$
• These conventions entail the following conventions for distributions:
  $\mathbb{P}_X(X \in S) = \mathbb{P}(\{X \in S\}) = \mathbb{P}(\{\omega \in \Omega \mid X(\omega) \in S\})$, $S \subset \mathcal{X}$
  $\mathbb{P}_X(X \le x) = \mathbb{P}(\{X \le x\}) = \mathbb{P}(\{\omega \in \Omega \mid X(\omega) \le x\})$, $x \in \mathcal{X}$
• Often, the random variable subscript in distribution symbols is omitted:
  $\mathbb{P}(X \in S) = \mathbb{P}_X(X \in S)$, $S \subset \mathcal{X}$
  $\mathbb{P}(X \le x) = \mathbb{P}_X(X \le x)$, $x \in \mathcal{X}$
• Distributions can be defined using cumulative distribution functions, probability mass functions, and probability density functions.

Cumulative distribution functions

Definition (Cumulative distribution function)
The cumulative distribution function (CDF) of a random variable $X$ is defined as
$$P : \mathbb{R} \to [0,1], \; x \mapsto P(x) := \mathbb{P}(X \le x). \quad (5)$$

Remarks
• CDFs can be used to define distributions.
• CDFs exist for both discrete and continuous random variables.

Example (Cumulative distribution function)
Consider a random variable with outcome space $\mathcal{X} = \{0, 1, 2\}$ and distribution defined by
$$\mathbb{P}(X = 0) = \frac{1}{4}, \quad \mathbb{P}(X = 1) = \frac{1}{2}, \quad \mathbb{P}(X = 2) = \frac{1}{4}. \quad (6)$$
Then its distribution function is given by
$$P : \mathbb{R} \to [0,1], \; x \mapsto P(x) := \begin{cases} 0 & x < 0, \\ \frac{1}{4} & 0 \le x < 1, \\ \frac{3}{4} & 1 \le x < 2, \\ 1 & x \ge 2. \end{cases} \quad (7)$$

Remarks
• $P$ is right-continuous.
• $P$ is defined for all $x \in \mathbb{R}$, while $X \in \{0, 1, 2\}$.
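As a brief illustration (not part of the original slides), the following Python sketch evaluates the piecewise CDF of this three-point example by summing the probabilities $\mathbb{P}(X = x_i)$ over all $x_i \le x$, and numerically checks the values in (7), right-continuity at a jump point, and normalization. Variable and function names are chosen here for illustration only.

```python
import numpy as np

# PMF of the example: P(X=0) = 1/4, P(X=1) = 1/2, P(X=2) = 1/4
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def cdf(x):
    """CDF P(x) = P(X <= x), obtained by summing the PMF over all x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# P is defined for all real x, while X only takes values in {0, 1, 2}
for x in [-1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0]:
    print(f"P({x}) = {cdf(x)}")

# Right-continuity at the jump point x = 1: P(1 + eps) -> P(1) as eps -> 0+
eps = 1e-12
assert np.isclose(cdf(1.0 + eps), cdf(1.0))

# Normalization: P(x) = 1 for x beyond the largest outcome
assert cdf(10.0) == 1.0
```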
Identity of CDFs
Let $X$ have CDF $P$ and let $Y$ have CDF $Q$. If $P(x) = Q(x)$ for all $x$, then $\mathbb{P}(X \in S) = \mathbb{P}(Y \in S)$ for all events $S \in \mathcal{S}$.

Properties of CDFs
A function $P : \mathbb{R} \to [0,1]$ is a CDF for some probability $\mathbb{P}$ if and only if $P$ satisfies the following conditions:
(1) $P$ is non-decreasing: $x_1 < x_2$ implies that $P(x_1) \le P(x_2)$.
(2) $P$ is normalized: $\lim_{x \to -\infty} P(x) = 0$ and $\lim_{x \to \infty} P(x) = 1$.
(3) $P$ is right-continuous: $P(x) = P(x^+)$ for all $x$, where $P(x^+) := \lim_{y \to x, \, y > x} P(y)$.

Probability mass and density functions

Definition (Probability mass functions, discrete random variables)
A random variable $X$ is discrete if it takes on countably many values in $\mathcal{X} := \{x_1, x_2, \ldots\}$. The probability mass function (PMF) of $X$ is defined as
$$p : \mathcal{X} \to [0,1], \; x \mapsto p(x) := \mathbb{P}(X = x). \quad (8)$$

Remarks
• A set is countable if it is finite or bijectively related to $\mathbb{N}$.
• A PMF is non-negative: $p(x) \ge 0$ for all $x \in \mathcal{X}$.
• A PMF is normalized: $\sum_i p(x_i) = 1$.
• The CDF of a PMF is $P(x) = \mathbb{P}(X \le x) = \sum_{x_i \le x} p(x_i)$.
• The CDF of a PMF is also referred to as a cumulative mass function (CMF).

Example (Bernoulli random variable)
Let $X$ be a random variable with outcome set $\mathcal{X} = \{0, 1\}$ and probability mass function
$$p : \mathcal{X} \to [0,1], \; x \mapsto p(x) := \mu^x (1 - \mu)^{1 - x} \text{ for } \mu \in [0,1]. \quad (9)$$
Then $X$ is said to be distributed according to a Bernoulli distribution with parameter $\mu \in [0,1]$, for which we write $X \sim \mathrm{Bern}(\mu)$. We denote the probability mass function of a Bernoulli random variable by
$$\mathrm{Bern}(x; \mu) := \mu^x (1 - \mu)^{1 - x}. \quad (10)$$

Remarks
• A Bernoulli random variable can be used to model a single biased coin flip with outcomes "failure" (0) and "success" (1).
• $\mu$ is the probability for $X$ to take the value 1,
  $$\mathbb{P}(X = 1) = \mu^1 (1 - \mu)^{1 - 1} = \mu. \quad (11)$$

Definition (Probability density functions, continuous random variables)
A random variable $X$ is continuous if there exists a function
$$p : \mathbb{R} \to \mathbb{R}_{\ge 0}, \; x \mapsto p(x) \quad (12)$$
such that
• $p(x) \ge 0$ for all $x \in \mathbb{R}$,
• $\int_{-\infty}^{\infty} p(x)\,dx = 1$,
• $\mathbb{P}(a \le X \le b) = \int_a^b p(x)\,dx$ for all $a, b \in \mathbb{R}$, $a \le b$.

Remarks
• PDFs can take on values larger than 1, and $\mathbb{P}(X = a) = \int_a^a p(x)\,dx = 0$.
• Probabilities are obtained from PDFs by integration:
  (probability) mass = (probability) density × (set) volume.
• The CDF of a PDF is $P(x) = \int_{-\infty}^x p(\xi)\,d\xi$, thus $p(x) = \frac{d}{dx} P(x)$.
• The CDF of a PDF is also referred to as a cumulative density function.

Example (Gaussian random variable, standard normal variable)
Let $X$ be a random variable with outcome set $\mathbb{R}$ and probability density function
$$p : \mathbb{R} \to \mathbb{R}_{> 0}, \; x \mapsto p(x) := \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right). \quad (13)$$
Then $X$ is said to be distributed according to a Gaussian distribution with parameters $\mu \in \mathbb{R}$ and $\sigma^2 > 0$, for which we write $X \sim N(\mu, \sigma^2)$. We abbreviate the PDF of a Gaussian random variable by
$$N(x; \mu, \sigma^2) := \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right). \quad (14)$$
A Gaussian random variable with $\mu = 0$ and $\sigma^2 = 1$ is said to be distributed according to a standard normal distribution and is often referred to as a $Z$ variable.

Remarks
• The parameter $\mu$ specifies the location of highest probability density.
• The parameter $\sigma^2$ specifies the width of the distribution.
• The term $\frac{1}{\sqrt{2\pi\sigma^2}}$ is the normalization constant for $\exp\left(-\frac{1}{2\sigma^2}(x - \mu)^2\right)$.
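To make the Bernoulli and Gaussian examples concrete, here is a minimal Python sketch (not part of the slides; it assumes NumPy and SciPy are available) that evaluates $\mathrm{Bern}(x; \mu)$ and $N(x; \mu, \sigma^2)$ directly from the formulas above and compares them against the SciPy reference implementations. The parameter values are chosen for illustration only.

```python
import numpy as np
from scipy import stats

def bern_pmf(x, mu):
    """Bern(x; mu) = mu^x * (1 - mu)^(1 - x) for x in {0, 1}."""
    return mu**x * (1.0 - mu)**(1 - x)

def gauss_pdf(x, mu, sigma2):
    """N(x; mu, sigma^2) = (2*pi*sigma^2)^(-1/2) * exp(-(x - mu)^2 / (2*sigma^2))."""
    return np.exp(-(x - mu)**2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

# Bernoulli: P(X = 1) = mu, and the PMF sums to 1 over {0, 1}
mu_bern = 0.7                                   # illustrative value, not from the slides
print(bern_pmf(1, mu_bern))                     # 0.7
print(np.isclose(bern_pmf(0, mu_bern) + bern_pmf(1, mu_bern), 1.0))

# Gaussian: agreement with scipy.stats.norm (scale = standard deviation)
mu, sigma2 = 1.0, 0.25                          # illustrative values
x = np.linspace(-2.0, 4.0, 7)
print(np.allclose(gauss_pdf(x, mu, sigma2),
                  stats.norm.pdf(x, loc=mu, scale=np.sqrt(sigma2))))
print(np.allclose(bern_pmf(np.array([0, 1]), mu_bern),
                  stats.bernoulli.pmf([0, 1], mu_bern)))

# Density values can exceed 1 for small sigma^2 (here ~3.99 at x = mu)
print(gauss_pdf(mu, mu, 0.01))
```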
Example (Uniform random variables)
Let $X$ be a discrete random variable with a finite outcome set $\mathcal{X}$ and probability mass function
$$p : \mathcal{X} \to \mathbb{R}_{\ge 0}, \; x \mapsto p(x) := \frac{1}{|\mathcal{X}|}. \quad (15)$$
Then $X$ is said to be distributed according to a discrete uniform distribution, for which we write $X \sim U(|\mathcal{X}|)$. We abbreviate the PMF of a discrete uniform random variable by
$$U(x; |\mathcal{X}|) := \frac{1}{|\mathcal{X}|}. \quad (16)$$
Similarly, let $X$ be a continuous random variable with probability density function
$$p : \mathbb{R} \to \mathbb{R}_{\ge 0}, \; x \mapsto p(x) := \begin{cases} \frac{1}{b - a} & x \in [a, b] \\ 0 & x \notin [a, b] \end{cases} \quad (17)$$
Then $X$ is said to be distributed according to a continuous uniform distribution with parameters $a$ and $b$, for which we write $X \sim U(a, b)$. We abbreviate the PDF of a continuous uniform random variable by
$$U(x; a, b) := \frac{1}{b - a}. \quad (18)$$

Properties of cumulative density functions
• $\mathbb{P}(X > x) = 1 - P(x)$ (exceedance distribution function)
• $\mathbb{P}(x < X \le y) = P(y) - P(x)$ (interval probability)
• With the properties of the Riemann integral, we have
  $$P(y) - P(x) = \mathbb{P}(x < X < y) = \mathbb{P}(x \le X < y) = \mathbb{P}(x < X \le y) = \mathbb{P}(x \le X \le y). \quad (19)$$

Definition (Inverse cumulative distribution function)
Let $X$ be a random variable with CDF $P$. Then the inverse cumulative distribution function or quantile function of $X$ is defined as
$$P^{-1} : [0,1] \to \mathbb{R}, \; q \mapsto P^{-1}(q) := \inf\{x \mid P(x) > q\}. \quad (20)$$
If $P$ is invertible, i.e., strictly increasing and continuous, then $P^{-1}(q)$ is the unique real number $x$ such that $P(x) = q$.

Remarks
• $P^{-1}(0.25)$ is called the first quartile.
• $P^{-1}(0.50)$ is called the median or second quartile.
• $P^{-1}(q)$ is also referred to as the $q$th percentile.

Example (CDF and inverse CDF for Gaussian random variables)
Let $X$ be a univariate Gaussian random variable with expectation parameter $\mu$ and variance parameter $\sigma^2$.
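The slides do not show code, but as a hedged illustration of the Gaussian CDF and inverse CDF, the following Python sketch (assuming SciPy) evaluates $P$ and $P^{-1}$ for $X \sim N(\mu, \sigma^2)$, checks that the median $P^{-1}(0.5)$ equals $\mu$, that $P^{-1}(P(x)) = x$ for this strictly increasing, continuous CDF, and that the interval probability $P(y) - P(x)$ agrees with integrating the PDF. The parameter values are chosen here for illustration only.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

mu, sigma = 2.0, 1.5                 # expectation and standard deviation (sigma^2 = 2.25)
X = stats.norm(loc=mu, scale=sigma)

# CDF P(x) = P(X <= x) and quantile function P^{-1}(q)
print(X.cdf(mu))                     # 0.5: half of the probability mass lies below mu
print(X.ppf(0.5))                    # 2.0: the median equals mu for a Gaussian
print(X.ppf([0.25, 0.5, 0.75]))      # first quartile, median, third quartile

# For a strictly increasing, continuous CDF, P^{-1} is the ordinary inverse of P
x = np.linspace(-2.0, 6.0, 9)
print(np.allclose(X.ppf(X.cdf(x)), x))

# Interval probability: P(x < X <= y) = P(y) - P(x) = integral of the PDF over (x, y]
integral, _ = quad(X.pdf, 1.0, 3.0)
print(np.isclose(X.cdf(3.0) - X.cdf(1.0), integral))
```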
