RS - Chapter 3 - Moments
Chapter 3 Moments of a Distribution
Expectation
We develop the expectation operator in terms of the Lebesgue integral.
• Recall that the Lebesgue measure λ(A) for a set A gives the length/area/volume of the set A. If A = (3, 7), then λ(A) = |3 − 7| = 4.
• The Lebesgue integral of f on [a,b] is defined in terms of Σi yi λ(Ai),
where 0 = y1 ≤ y2 ≤ ... ≤ yn, Ai = {x : yi ≤ f (x) < yi+1}, and λ(Ai) is the Lebesgue measure of the set Ai.
• The value of the Lebesgue integral is the limit as the yi's are pushed closer together. That is, we break the y-axis into a grid using {yn} and break the x-axis into the corresponding grid {An} where
Ai = {x : f(x) ∈ [yi, yi+1)}.
Taking expectations: Riemann vs Lebesgue
• Riemann's approach: Partition the base (the x-axis). Measure the height of the function at the center of each interval. Calculate the area of each interval. Add all intervals.
• Lebesgue's approach: Divide the range of the function (the y-axis). Measure the length of each horizontal interval. Calculate the area of each interval. Add all intervals.
Taking expectations: Riemann vs Lebesgue • A Borel function (RV) f is integrable if and only if |f| is integrable.
• For convenience, we define the integral of a measurable function f from (Ω, Σ, μ) to (R̄, B̄), where R̄ = R ∪ {−∞, ∞} and B̄ = σ(B ∪ {{∞}, {−∞}}).
Example: If Ω = R and μ is the Lebesgue measure, then the
Lebesgue integral of f over an interval [a, b] is written as ∫_[a,b] f(x) dx = ∫_a^b f(x) dx, which agrees with the Riemann integral when the latter is well defined. However, there are functions for which the Lebesgue integral is defined but the Riemann integral is not.
• If μ = P, a probability measure, then in statistics ∫ X dP = EX = E[X] is called the expectation or expected value of X.
Expected Value
Consider our probability space (Ω, Σ, P). Take an event (a set A of ω ∈ Ω) and X, a RV, that assigns real numbers to each ω ∈ A.
• If we take an observation from A without knowing which ω ∈ A will be drawn, we may want to know what value of X(ω) we should expect to see.
• Each ω ∈ A has been assigned a probability measure P[ω], which induces P[x]. Then, we use this to weight the values X(ω).
• P is a probability measure: the weights sum to 1. The weighted sum provides us with a weighted average of X(ω). If P gives the "correct" likelihood of ω being chosen, the weighted average of X(ω) – that is, E[X] – tells us what values of X(ω) are expected.
Expected Value
• Now, with the concept of the Lebesgue integral, we take the possible values {xi} and construct a grid on the y-axis, which gives a corresponding grid on the x-axis in A, where
Ai = {ω ∈ A : X(ω) ∈ [xi, xi+1)}.
• Let the elements of the x-axis grid be Ai. The weighted average is
Σ_{i=1}^n xi P[Ai] = Σ_{i=1}^n xi PX[X = xi] = Σ_{i=1}^n xi fX(xi)
• As we shrink the grid towards 0, each Ai becomes infinitesimal. Let dω be the infinitesimal set. The Lebesgue integral becomes:
lim_{n→∞} Σ_{i=1}^n xi P[Ai] = ∫ X(ω) P[dω] = ∫ x fX(x) dx
The Expectation of X: E(X)
The expectation operator defines the mean (or population average) of a random variable or expression.
Definition: Let X denote a discrete RV with probability function p(x) (probability density function f(x) if X is continuous). Then the expected value of X, E(X), is defined to be:
E(X) = Σ_{xi} xi p(xi)
and, if X is continuous with probability density function f(x),
E(X) = ∫_{−∞}^{∞} x f(x) dx
Sometimes we write E[.] as EX[.] to indicate that the expectation is taken over fX(x).
Interpretation of E(X)
1. The expected value of X, E(X), is the center of gravity of the probability distribution of X.
2. The expected value of X, E(X), is the long-run average value of X. (To be discussed later: Law of Large Numbers.)
(Figure: a discrete probability distribution on 0, 1, ..., 7 with E(X) marked.)
Example: The Binomial distribution. Let X be a discrete random variable having the Binomial distribution, i.e., X = the number of successes in n independent repetitions of a Bernoulli trial. Find the expected value of X, E(X).
p(x) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, 3, ..., n

E(X) = Σ_x x p(x) = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^(n−x)
= Σ_{x=1}^{n} x [n!/(x!(n − x)!)] p^x (1 − p)^(n−x)
= Σ_{x=1}^{n} [n!/((x − 1)!(n − x)!)] p^x (1 − p)^(n−x)
Example: Solution
= np Σ_{x=1}^{n} [(n − 1)!/((x − 1)!(n − x)!)] p^(x−1) (1 − p)^(n−x)
= np [p + (1 − p)]^(n−1)   (by the binomial theorem)
= np
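A quick numerical sketch of this result: summing x·p(x) directly reproduces np. The values n = 10, p = 0.3 are arbitrary illustrative choices, not from the text.

```python
from math import comb

# Direct check of E(X) = n*p for the Binomial distribution:
# sum x * p(x) over all outcomes x = 0, ..., n.
n, p = 10, 0.3  # illustrative parameters
ex = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
```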
Example: Exponential Distribution
Let X have an exponential distribution with parameter λ. The probability density function of X is:
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_0^∞ x λ e^(−λx) dx
We will determine ∫ x λ e^(−λx) dx using integration by parts: ∫ u dv = uv − ∫ v du.
Example: Exponential Distribution. We will determine ∫ x λ e^(−λx) dx using integration by parts. In this case u = x and dv = λ e^(−λx) dx.
Hence du = dx and v = −e^(−λx). Thus:
∫ x λ e^(−λx) dx = −x e^(−λx) + ∫ e^(−λx) dx = −x e^(−λx) − (1/λ) e^(−λx)
E(X) = ∫_0^∞ x λ e^(−λx) dx = [−x e^(−λx) − (1/λ) e^(−λx)]_0^∞ = (0 − 0) − (0 − 1/λ) = 1/λ
Summary: If X has an exponential distribution with parameter λ, then E(X) = 1/λ.
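The integral above can be approximated numerically as a sanity check. A minimal sketch with a midpoint Riemann sum; λ = 2 and the cutoff at x = 20 are arbitrary illustrative choices (the tail beyond 20 is negligible for this λ).

```python
from math import exp

lam = 2.0   # illustrative rate parameter
dx = 1e-4
# Midpoint Riemann sum of x * lam * e^(-lam*x) over [0, 20]
ex = 0.0
for i in range(int(20 / dx)):
    x = (i + 0.5) * dx
    ex += x * lam * exp(-lam * x) * dx
```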
Example: The Uniform distribution
Suppose X has a uniform distribution from a to b. Then:
f(x) = 1/(b − a) for a ≤ x ≤ b; f(x) = 0 for x < a or x > b
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_a^b x [1/(b − a)] dx = [x²/(2(b − a))]_a^b = (b² − a²)/(2(b − a)) = (a + b)/2
Example: The Normal distribution. Suppose X has a Normal distribution with parameters μ and σ. Then:
f(x) = [1/(√(2π) σ)] e^(−(x − μ)²/(2σ²))
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫ x [1/(√(2π) σ)] e^(−(x − μ)²/(2σ²)) dx
Make the substitution z = (x − μ)/σ, so dz = (1/σ) dx and x = μ + σz. Hence:
E(X) = ∫ (μ + σz) [1/√(2π)] e^(−z²/2) dz = μ ∫ [1/√(2π)] e^(−z²/2) dz + σ ∫ z [1/√(2π)] e^(−z²/2) dz
Now ∫ [1/√(2π)] e^(−z²/2) dz = 1 and ∫ z [1/√(2π)] e^(−z²/2) dz = 0 (odd integrand).
Thus E(X) = μ.
Example: The Gamma distribution
Suppose X has a Gamma distribution with parameters α and λ. Then:
f(x) = [λ^α/Γ(α)] x^(α−1) e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
Note: ∫_0^∞ [λ^α/Γ(α)] x^(α−1) e^(−λx) dx = 1 if α > 0, λ > 0.
This is a very useful formula when working with the Gamma distribution.
Example: The Gamma distribution. The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_0^∞ x [λ^α/Γ(α)] x^(α−1) e^(−λx) dx = [λ^α/Γ(α)] ∫_0^∞ x^α e^(−λx) dx
= [λ^α/Γ(α)] [Γ(α + 1)/λ^(α+1)] ∫_0^∞ [λ^(α+1)/Γ(α + 1)] x^α e^(−λx) dx   (this last integral equals 1)
= Γ(α + 1)/(λ Γ(α)) = α/λ
Example: The Gamma distribution
Thus, if X has a Gamma(α, λ) distribution, the expected value of X is E(X) = α/λ.
Special Cases:
1. Exponential(λ) distribution: α = 1, λ arbitrary → E(X) = 1/λ
2. Chi-square(ν) distribution: α = ν/2, λ = ½ → E(X) = (ν/2)/(1/2) = ν
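E(X) = α/λ can also be illustrated by simulation. A minimal sketch using the standard library's `random.gammavariate`, which is parameterized by (shape, scale) with scale = 1/λ; the values α = 3, λ = 2 and the sample size are arbitrary illustrative choices.

```python
import random
from statistics import fmean

random.seed(42)
alpha, lam = 3.0, 2.0   # illustrative shape and rate
# random.gammavariate takes (shape, scale); the scale is 1/lambda
samples = [random.gammavariate(alpha, 1 / lam) for _ in range(200_000)]
m = fmean(samples)      # should be close to alpha/lam = 1.5
```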
Example: The Gamma distribution
(Figure: Gamma pdf on 0–10 with E(X) = α/λ marked.)
The Exponential distribution
(Figure: exponential pdf on 0–25 with E(X) = 1/λ marked.)
The Chi-square (χ²) distribution
(Figure: chi-square pdf on 0–25 with E(X) = ν marked.)
Expectation of a function of a RV
• Let X denote a discrete RV with probability function p(x) (or pdf f(x) if X is continuous). Then the expected value of g(X), E[g(X)], is defined to be:
E[g(X)] = Σ_{xi} g(xi) p(xi)
and, if X is continuous with probability density function f(x),
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
• Examples:
g(x) = (x − μ)²  →  E[g(X)] = E[(X − μ)²]
g(x) = (x − μ)^k  →  E[g(X)] = E[(X − μ)^k]
Expectation of a function of a RV
Example: Suppose X has a uniform distribution from 0 to b. Then:
f(x) = 1/b for 0 ≤ x ≤ b; f(x) = 0 otherwise
Find the expected value of A = X². If X is the length of a side of a square (chosen at random from 0 to b), then A is the area of the square.
E(X²) = ∫ x² f(x) dx = ∫_0^b x² (1/b) dx = [x³/(3b)]_0^b = b²/3
= 1/3 of the maximum area of the square (b²)
Median: An alternative central measure • A median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution, from the lower half.
Definition: Median The median of a random variable X is the unique number m that satisfies the following inequalities: P(X ≤ m) ≥ ½ and P(X≥ m) ≥ ½.
For a continuous distribution, we have that m solves:
∫_{−∞}^{m} fX(x) dx = ∫_{m}^{∞} fX(x) dx = 1/2
Median: An alternative central measure
• Calculating the median is a popular technique in summary statistics since it is simple to understand and easy to compute, while also giving a measure that is more robust to outlier values than the mean.
An optimality property A median is also a central point which minimizes the average of the absolute deviations. That is, a value of c that minimizes E(|X – c|) is the median of the probability distribution of the random variable X.
Example I: Median of the Exponential Distribution
Let X have an exponential distribution with parameter . The probability density function of X is:
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The median m solves the following integral:
∫_m^∞ fX(x) dx = 1/2
∫_m^∞ λ e^(−λx) dx = [−e^(−λx)]_m^∞ = e^(−λm) = 1/2
That is, m = ln(2)/λ.
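As a quick numerical cross-check, solving FX(m) = 1/2 by bisection reproduces m = ln(2)/λ; the value λ = 2 is an arbitrary illustrative choice.

```python
from math import exp, log

lam = 2.0   # illustrative rate parameter

def cdf(x):
    # CDF of the Exponential(lam) distribution
    return 1 - exp(-lam * x)

# Bisection: find m with cdf(m) = 1/2
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
m = (lo + hi) / 2
```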
Example II: Median of the Pareto Distribution
Let X follow a Pareto distribution with shape parameter α and scale parameter xs (usually denoted xm). The pdf of X is:
f(x) = α xs^α / x^(α+1) for x ≥ xs; f(x) = 0 for x < xs
The median m solves the following integral:
∫_m^∞ fX(x) dx = 1/2
∫_m^∞ α xs^α x^(−(α+1)) dx = [−xs^α x^(−α)]_m^∞ = (xs/m)^α = 1/2
That is, m = xs 2^(1/α).
Note: The Pareto distribution is used to describe the distribution of wealth.
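The formula m = xs·2^(1/α) can be verified directly against the Pareto survival function P(X > x) = (xs/x)^α; the parameter values below are arbitrary illustrative choices.

```python
xs_, alpha = 1.5, 2.0          # illustrative scale and shape
m = xs_ * 2 ** (1 / alpha)     # candidate median
# Survival function of the Pareto: P(X > x) = (xs/x)^alpha for x >= xs
tail = (xs_ / m) ** alpha      # should equal 1/2 at the median
```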
Moments of a Random Variable
The moments of a random variable X are used to describe the behavior of the RV (discrete or continuous).
Definition: kth Moment. Let X be a RV (discrete or continuous). Then the kth moment of X is:
μk = E(X^k) = Σ_x x^k p(x) if X is discrete
μk = E(X^k) = ∫_{−∞}^{∞} x^k f(x) dx if X is continuous
• The first moment of X, μ = μ1 = E(X), is the center of gravity of the distribution of X.
• The higher moments give different information regarding the shape of the distribution of X.
Moments of a Random Variable
Definition: Central Moments. Let X be a RV (discrete or continuous). Then the kth central moment of X is defined to be:
μ0k = E[(X − μ)^k] = Σ_x (x − μ)^k p(x) if X is discrete
μ0k = E[(X − μ)^k] = ∫_{−∞}^{∞} (x − μ)^k f(x) dx if X is continuous
where μ = μ1 = E(X) = the first moment of X.
• The central moments describe how the probability distribution is distributed about the center of gravity, μ.
Moments of a Random Variable – 1st and 2nd
The first central moment is:
μ01 = E[(X − μ)] = 0
The second central moment, μ02 = E[(X − μ)²], depends on the spread of the probability distribution of X about μ. It is called the variance of X and is denoted by the symbol var(X).
σ = sqrt(μ02) = sqrt(E[(X − μ)²]) is called the standard deviation of X.
var(X) = μ02 = E[(X − μ)²] = E(X²) − μ² = μ2 − μ1²
Moments of a Random Variable – Skewness
• The third central moment
μ03 = E[(X − μ)³]
contains information about the skewness of a distribution.
• A popular measure of skewness:
γ1 = μ03/σ³ = μ03/(μ02)^(3/2)
• Distributions according to skewness:
1) Symmetric distribution: μ03 = 0, γ1 = 0
(Figure: a symmetric pdf.)
Moments of a Random Variable – Skewness
2) Positively skewed distribution: μ03 > 0, γ1 > 0
(Figure: a right-skewed pdf.)
3) Negatively skewed distribution: μ03 < 0, γ1 < 0
(Figure: a left-skewed pdf.)
Moments of a Random Variable – Skewness
• Skewness and Economics:
- Zero skew means symmetrical gains and losses.
- Positive skew suggests many small losses and a few rich returns.
- Negative skew indicates lots of minor wins offset by rare major losses.
• In financial markets, stock returns at the firm level show positive skewness, but stock returns at the aggregate (index) level show negative skewness.
• Evidence from horse race betting and from U.S. state lotteries supports the contention that gamblers are not necessarily risk-lovers but skewness-lovers: long shots are overbet (positive skewness is loved!).
Moments of a Random Variable – Kurtosis
• The fourth central moment
μ04 = E[(X − μ)⁴]
contains information about the shape of a distribution. The property of shape measured by this moment is called kurtosis.
• The measure of (excess) kurtosis:
γ2 = μ04/σ⁴ − 3 = μ04/(μ02)² − 3
• Distributions:
1) Mesokurtic distribution: γ2 ≈ 0, μ04 moderate in size
(Figure: a normal-shaped pdf.)
2) Platykurtic distribution: γ2 < 0, μ04 small in size
(Figure: a flat-topped pdf.)
3) Leptokurtic distribution: γ2 > 0, μ04 large in size
(Figure: a sharply peaked, heavy-tailed pdf.)
Moments of a Random Variable
Example: The uniform distribution from 0 to 1
f(x) = 1 for 0 ≤ x ≤ 1; f(x) = 0 otherwise
Finding the moments:
μk = ∫ x^k f(x) dx = ∫_0^1 x^k dx = [x^(k+1)/(k + 1)]_0^1 = 1/(k + 1)
Finding the central moments (μ = μ1 = 1/2):
μ0k = ∫ (x − μ)^k f(x) dx = ∫_0^1 (x − 1/2)^k dx
Moments of a Random Variable
Finding the central moments (continuation):
μ0k = ∫_0^1 (x − 1/2)^k dx
Making the substitution w = x − 1/2:
μ0k = ∫_{−1/2}^{1/2} w^k dw = [w^(k+1)/(k + 1)]_{−1/2}^{1/2} = [(1/2)^(k+1) − (−1/2)^(k+1)]/(k + 1)
= 1/(2^k (k + 1)) if k is even; 0 if k is odd
Moments of a Random Variable
Hence μ02 = 1/(2²·3) = 1/12, μ03 = 0, μ04 = 1/(2⁴·5) = 1/80.
Thus var(X) = μ02 = 1/12.
The standard deviation σ = sqrt(var(X)) = sqrt(1/12).
The measure of skewness: γ1 = μ03/σ³ = 0.
The measure of kurtosis: γ2 = μ04/σ⁴ − 3 = (1/80)/(1/12)² − 3 = 1.8 − 3 = −1.2.
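These uniform(0,1) values (variance 1/12, excess kurtosis −1.2) can be reproduced numerically with midpoint Riemann sums, a minimal sketch:

```python
dx = 1e-4
n = int(1 / dx)
xs = [(i + 0.5) * dx for i in range(n)]
mu = sum(x * dx for x in xs)                  # first moment, ~1/2
var = sum((x - mu) ** 2 * dx for x in xs)     # second central moment, ~1/12
mu4 = sum((x - mu) ** 4 * dx for x in xs)     # fourth central moment, ~1/80
g2 = mu4 / var ** 2 - 3                       # excess kurtosis, ~-1.2
```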
Alternative measures of dispersion
When the median is used as a central measure for a distribution, there are several choices for a measure of variability:
- The range: the length of the smallest interval containing the data.
- The interquartile range: the difference between the 3rd and 1st quartiles.
- The mean absolute deviation: (1/n) Σi |xi − central measure(X)|
- The median absolute deviation (MAD): MAD = mediani(|xi − median(X)|)
These measures are more robust (to outliers) estimators of scale than the sample variance or standard deviation.
They especially behave better with distributions without a mean or variance, such as the Cauchy distribution.
Rules for Expectations
E[g(X)] = Σ_x g(x) p(x) if X is discrete; E[g(X)] = ∫ g(x) f(x) dx if X is continuous
• Rules:
1. E(c) = c, where c is a constant.
Proof: if g(X) = c, then
E[g(X)] = E(c) = ∫ c f(x) dx = c ∫ f(x) dx = c
The proof for discrete random variables is similar.
Rules for Expectations
2. E(aX + b) = a E(X) + b, where a, b are constants.
Proof: if g(X) = aX + b, then
E(aX + b) = ∫ (ax + b) f(x) dx = a ∫ x f(x) dx + b ∫ f(x) dx = a E(X) + b
The proof for discrete random variables is similar.
Rules for Expectations
3. var(X) = μ02 = E[(X − μ)²] = E(X²) − [E(X)]² = μ2 − μ1²
Proof:
var(X) = E[(X − μ)²] = ∫ (x − μ)² f(x) dx = ∫ (x² − 2μx + μ²) f(x) dx
= ∫ x² f(x) dx − 2μ ∫ x f(x) dx + μ² ∫ f(x) dx
= E(X²) − 2μ·μ + μ² = E(X²) − μ² = μ2 − μ1²
The proof for discrete random variables is similar.
Rules for Expectations
4. var(aX + b) = a² var(X)
Proof:
μ_(aX+b) = E(aX + b) = a E(X) + b = aμ + b
var(aX + b) = E[(aX + b − μ_(aX+b))²] = E[(aX + b − aμ − b)²]
= E[a²(X − μ)²] = a² E[(X − μ)²] = a² var(X)
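Rules 2 and 4 can be illustrated by simulation, since the sample mean and sample variance obey the same identities. A short sketch; the constants a, b, the Exponential(1) draws, and the sample size are arbitrary choices, not from the text.

```python
import random
from statistics import fmean, pvariance

random.seed(0)
a, b = 3.0, -2.0                                     # illustrative constants
x = [random.expovariate(1.0) for _ in range(100_000)]
y = [a * xi + b for xi in x]                         # y = aX + b

# Sample analogues of E(aX+b) = aE(X)+b and var(aX+b) = a^2 var(X)
mean_gap = abs(fmean(y) - (a * fmean(x) + b))
var_gap = abs(pvariance(y) - a * a * pvariance(x))
```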
Moment generating functions
The expectation of a function g(X) is given by:
E[g(X)] = Σ_x g(x) p(x) if X is discrete; E[g(X)] = ∫ g(x) f(x) dx if X is continuous
Definition: Moment Generating Function (MGF). Let X denote a random variable. Then the moment generating function of X, mX(t), is defined by:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) if X is discrete
mX(t) = E(e^(tX)) = ∫_{−∞}^{∞} e^(tx) f(x) dx if X is continuous
MGF: Examples
1. The Binomial distribution (parameters p, n)
p(x) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) = Σ_{x=0}^{n} e^(tx) C(n, x) p^x (1 − p)^(n−x)
= Σ_{x=0}^{n} C(n, x) (p e^t)^x (1 − p)^(n−x)
= (p e^t + 1 − p)^n,
using the binomial theorem Σ_{x=0}^{n} C(n, x) a^x b^(n−x) = (a + b)^n with a = p e^t and b = 1 − p.
MGF: Examples
2. The Poisson distribution (parameter λ)
p(x) = (λ^x/x!) e^(−λ),  x = 0, 1, 2, ...
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) = Σ_{x=0}^{∞} e^(tx) (λ^x/x!) e^(−λ)
= e^(−λ) Σ_{x=0}^{∞} (λ e^t)^x/x! = e^(−λ) e^(λ e^t)   (using Σ_{x=0}^{∞} u^x/x! = e^u)
= e^(λ(e^t − 1))
MGF: Examples
3. The Exponential distribution (parameter λ)
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx = ∫_0^∞ e^(tx) λ e^(−λx) dx = λ ∫_0^∞ e^(−(λ−t)x) dx
= λ [−e^(−(λ−t)x)/(λ − t)]_0^∞ = λ/(λ − t) for t < λ; undefined for t ≥ λ
MGF: Examples
4. The Standard Normal distribution (μ = 0, σ = 1)
f(x) = (1/√(2π)) e^(−x²/2)
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx
= ∫ e^(tx) (1/√(2π)) e^(−x²/2) dx
= (1/√(2π)) ∫ e^(−(x² − 2tx)/2) dx
MGF: Examples
We will now use the fact that
∫ [1/(√(2π) a)] e^(−(x − b)²/(2a²)) dx = 1 for all a > 0 and any b.
Completing the square:
mX(t) = (1/√(2π)) ∫ e^(−(x² − 2tx + t²)/2) e^(t²/2) dx = e^(t²/2) ∫ (1/√(2π)) e^(−(x − t)²/2) dx
= e^(t²/2)   (the last integral equals 1)
MGF: Examples
5. The Gamma distribution (parameters α, λ)
f(x) = [λ^α/Γ(α)] x^(α−1) e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx = ∫_0^∞ e^(tx) [λ^α/Γ(α)] x^(α−1) e^(−λx) dx
= [λ^α/Γ(α)] ∫_0^∞ x^(α−1) e^(−(λ−t)x) dx
MGF: Examples
We use the fact that
∫_0^∞ [b^a/Γ(a)] x^(a−1) e^(−bx) dx = 1 for all a > 0, b > 0.
Hence, for t < λ:
mX(t) = [λ^α/Γ(α)] [Γ(α)/(λ − t)^α] ∫_0^∞ [(λ − t)^α/Γ(α)] x^(α−1) e^(−(λ−t)x) dx   (the last integral equals 1)
= [λ/(λ − t)]^α
6. The Chi-square distribution with ν degrees of freedom (α = ν/2, λ = ½):
mX(t) = (1 − 2t)^(−ν/2)
MGF: Properties
1. mX(0) = 1
mX(t) = E(e^(tX)), hence mX(0) = E(e^(0·X)) = E(1) = 1.
Note: The MGFs of the following distributions satisfy the property mX(0) = 1:
i) Binomial: mX(t) = (p e^t + 1 − p)^n
ii) Poisson: mX(t) = e^(λ(e^t − 1))
iii) Exponential: mX(t) = λ/(λ − t)
iv) Std Normal: mX(t) = e^(t²/2)
v) Gamma: mX(t) = [λ/(λ − t)]^α
MGF: Properties
2. mX(t) = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + ...
We use the expansion of the exponential function:
e^(tX) = 1 + tX + (tX)²/2! + (tX)³/3! + ...
mX(t) = E(e^(tX)) = 1 + E(X) t + E(X²) t²/2! + E(X³) t³/3! + ... = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + ...
3. m(k)X(0) = [d^k mX(t)/dt^k] at t = 0 = μk
Now, differentiating the series term by term:
m′X(t) = μ1 + μ2 t + μ3 t²/2! + ...  and m′X(0) = μ1
m″X(t) = μ2 + μ3 t + ...  and m″X(0) = μ2
Continuing, we find m(k)X(0) = μk.
MGF: Applying Property 3 – Binomial
Property 3 is very useful in determining the moments of a RV X.
Examples:
i) Binomial: mX(t) = (p e^t + 1 − p)^n
m′X(t) = n (p e^t + 1 − p)^(n−1) p e^t
m′X(0) = n (p + 1 − p)^(n−1) p = np = μ1
m″X(t) = n(n − 1)(p e^t + 1 − p)^(n−2) (p e^t)² + n (p e^t + 1 − p)^(n−1) p e^t
m″X(0) = n(n − 1)p² + np = np[(n − 1)p + 1] = np[np + 1 − p] = np[np + q] = n²p² + npq = μ2
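Property 3 can be sanity-checked numerically: finite-difference derivatives of the Binomial MGF at t = 0 should reproduce μ1 = np and μ2 = n²p² + npq. A small sketch; n = 10, p = 0.3 are arbitrary illustrative values.

```python
from math import exp

n, p = 10, 0.3   # illustrative parameters
q = 1 - p

def m(t):
    # MGF of the Binomial(n, p) distribution
    return (p * exp(t) + q) ** n

h = 1e-4
m1 = (m(h) - m(-h)) / (2 * h)              # ~ m'(0)  = np
m2 = (m(h) - 2 * m(0) + m(-h)) / h ** 2    # ~ m''(0) = n^2 p^2 + n p q
```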
MGF: Applying Property 3 – Poisson
ii) Poisson: mX(t) = e^(λ(e^t − 1))
m′X(t) = λ e^t e^(λ(e^t − 1))
m″X(t) = (λ e^t)² e^(λ(e^t − 1)) + λ e^t e^(λ(e^t − 1)) = (λ² e^(2t) + λ e^t) e^(λ(e^t − 1))
m‴X(t) = (λ³ e^(3t) + 3λ² e^(2t) + λ e^t) e^(λ(e^t − 1))
MGF: Applying Property 3 – Poisson
To find the moments we set t = 0.
μ1 = m′X(0) = λ
μ2 = m″X(0) = λ² + λ
μ3 = m‴X(0) = λ³ + 3λ² + λ
MGF: Applying Property 3 – Exponential
iii) Exponential: mX(t) = λ/(λ − t) = λ (λ − t)^(−1)
m′X(t) = λ (λ − t)^(−2)
m″X(t) = 2 λ (λ − t)^(−3)
m‴X(t) = (2)(3) λ (λ − t)^(−4) = 3! λ (λ − t)^(−4)
m⁗X(t) = (2)(3)(4) λ (λ − t)^(−5) = 4! λ (λ − t)^(−5)
In general: m(k)X(t) = k! λ (λ − t)^(−(k+1))
MGF: Applying Property 3 – Exponential
Thus,
μ1 = m′X(0) = λ λ^(−2) = 1/λ
μ2 = m″X(0) = 2λ λ^(−3) = 2/λ²
and, in general, μk = m(k)X(0) = k! λ λ^(−(k+1)) = k!/λ^k.
We can calculate the following popular descriptive statistics:
- σ² = μ02 = μ2 − μ1² = (2/λ²) − (1/λ)² = (1/λ)²
- γ1 = μ03/σ³ = (2/λ³)/[(1/λ)³] = 2
- γ2 = μ04/σ⁴ − 3 = (9/λ⁴)/[(1/λ)⁴] − 3 = 6
MGF: Applying Property 3 – Exponential
Note: the moments for the exponential distribution can be calculated in an alternative way. This is done by expanding mX(t) in powers of t and equating the coefficients of t^k to the coefficients in mX(t) = Σk μk t^k/k!:
mX(t) = λ/(λ − t) = 1/(1 − t/λ) = 1 + u + u² + u³ + ...  with u = t/λ
= 1 + t/λ + t²/λ² + t³/λ³ + ...
Equating the coefficients of t^k, we get μk/k! = 1/λ^k, or μk = k!/λ^k.
MGF: Applying Property 3 – Normal
iv) Standard normal distribution: mX(t) = exp(t²/2)
We use the expansion of e^u with u = t²/2:
mX(t) = 1 + (t²/2) + (t²/2)²/2! + (t²/2)³/3! + ... = Σk t^(2k)/(2^k k!)
We now equate the coefficients of t^k in this series with those of mX(t) = Σk μk t^k/k!:
MGF: Applying Property 3 – Normal
If k is odd: μk = 0.
For even powers 2k: μ_(2k)/(2k)! = 1/(2^k k!), or μ_(2k) = (2k)!/(2^k k!).
Thus μ1 = 0, μ2 = 1, μ3 = 0, μ4 = 4!/(2²·2!) = 3.
The log of Moment Generating Functions
Let lX(t) = ln mX(t) = the log of the MGF. Then:
lX(0) = ln mX(0) = ln 1 = 0
l′X(t) = m′X(t)/mX(t), so l′X(0) = m′X(0)/mX(0) = μ1 = μ
l″X(t) = [m″X(t) mX(t) − (m′X(t))²]/mX(t)², so
l″X(0) = [m″X(0) mX(0) − (m′X(0))²]/mX(0)² = μ2 − μ1² = σ²
The Log of Moment Generating Functions
Thus lX(t) = ln mX(t) is very useful for calculating the mean and variance of a random variable:
1. l′X(0) = μ
2. l″X(0) = σ²
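These two facts can be checked numerically for a concrete case. A small sketch using the Exponential(λ) log-MGF lX(t) = ln λ − ln(λ − t), with an arbitrary illustrative λ = 2: finite differences at t = 0 should reproduce μ = 1/λ and σ² = 1/λ².

```python
from math import log

lam = 2.0   # illustrative rate for the Exponential distribution

def l(t):
    # log-MGF of Exponential(lam), valid for t < lam
    return log(lam / (lam - t))

h = 1e-4
l1 = (l(h) - l(-h)) / (2 * h)              # ~ l'(0)  = 1/lam   (mean)
l2 = (l(h) - 2 * l(0) + l(-h)) / h ** 2    # ~ l''(0) = 1/lam^2 (variance)
```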
Log of MGF: Examples – Binomial
1. The Binomial distribution (parameters p, n), with q = 1 − p:
mX(t) = (p e^t + 1 − p)^n = (p e^t + q)^n
lX(t) = ln mX(t) = n ln(p e^t + q)
l′X(t) = n p e^t/(p e^t + q), so μ = l′X(0) = np/(p + q) = np
l″X(t) = n [p e^t (p e^t + q) − (p e^t)²]/(p e^t + q)² = n p q e^t/(p e^t + q)²
σ² = l″X(0) = npq/(p + q)² = npq
Log of MGF: Examples – Poisson
2. The Poisson distribution (parameter λ):
mX(t) = e^(λ(e^t − 1))
lX(t) = ln mX(t) = λ(e^t − 1)
l′X(t) = λ e^t, so μ = l′X(0) = λ
l″X(t) = λ e^t, so σ² = l″X(0) = λ
Log of MGF: Examples – Exponential
3. The Exponential distribution (parameter λ):
mX(t) = λ/(λ − t) for t < λ; undefined for t ≥ λ
lX(t) = ln mX(t) = ln λ − ln(λ − t) if t < λ
l′X(t) = 1/(λ − t)
l″X(t) = 1/(λ − t)²
Thus μ = l′X(0) = 1/λ and σ² = l″X(0) = 1/λ².
Log of MGF: Examples – Normal
4. The Standard Normal distribution (μ = 0, σ = 1):
mX(t) = e^(t²/2)
lX(t) = ln mX(t) = t²/2
l′X(t) = t, l″X(t) = 1
Thus μ = l′X(0) = 0 and σ² = l″X(0) = 1.
Log of MGF: Examples – Gamma
5. The Gamma distribution (parameters α, λ):
mX(t) = [λ/(λ − t)]^α
lX(t) = ln mX(t) = α[ln λ − ln(λ − t)]
l′X(t) = α/(λ − t)
l″X(t) = α/(λ − t)²
Hence μ = l′X(0) = α/λ and σ² = l″X(0) = α/λ².
Log of MGF: Examples – Chi-squared
6. The Chi-square distribution (degrees of freedom ν):
mX(t) = (1 − 2t)^(−ν/2)
lX(t) = ln mX(t) = −(ν/2) ln(1 − 2t)
l′X(t) = −(ν/2)(−2)/(1 − 2t) = ν/(1 − 2t)
l″X(t) = 2ν/(1 − 2t)²
Hence μ = l′X(0) = ν and σ² = l″X(0) = 2ν.
Characteristic functions
Definition: Characteristic Function. Let X denote a random variable. Then the characteristic function of X, φX(t), is defined by:
φX(t) = E(e^(itX))
Since e^(itx) = cos(tx) + i sin(tx) and |e^(itx)| ≤ 1, φX(t) is defined for all t. Thus, the characteristic function always exists, but the MGF need not exist.
Relation to the MGF: φX(t) = miX(t) = mX(it)
Calculation of moments: [d^k φX(t)/dt^k] at t = 0 = i^k μk
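As a numerical sketch (an illustration, not from the text): for the Exponential(λ) distribution the characteristic function is φX(t) = λ/(λ − it), its modulus never exceeds 1, and a finite-difference derivative at t = 0, divided by i, recovers E(X) = 1/λ. The value λ = 2 is arbitrary.

```python
lam = 2.0   # illustrative rate for the Exponential distribution

def phi(t):
    # Characteristic function of Exponential(lam): lam / (lam - i t)
    return lam / (lam - 1j * t)

# phi is defined (and bounded by 1 in modulus) for every real t
mag = abs(phi(5.0))

# First moment: phi'(0)/i = E(X) = 1/lam, via a central finite difference
h = 1e-5
d1 = (phi(h) - phi(-h)) / (2 * h)
mean = (d1 / 1j).real
```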