RS - Chapter 3 - Moments
Chapter 3 Moments of a Distribution
Expectation
We develop the expectation operator in terms of the Lebesgue integral.
• Recall that the Lebesgue measure λ(A) for a set A gives the length/area/volume of the set A. If A = (3, 7), then λ(A) = |3 − 7| = 4.
• The Lebesgue integral of f on [a,b] is defined in terms of Σi yi λ(Ai),
where 0 = y1 ≤ y2 ≤ ... ≤ yn, Ai = {x : yi ≤ f (x) < yi+1}, and λ(Ai) is the Lebesgue measure of the set Ai.
• The value of the Lebesgue integral is the limit as the yi's are pushed closer together. That is, we break the y-axis into a grid using {yn} and break the x-axis into the corresponding grid {An} where
Ai = {x : f(x) ∈ [yi, yi+1)}.
Taking expectations: Riemann vs Lebesgue
• Riemann's approach: Partition the base (the x-axis). Measure the height of the function at the center of each interval. Calculate the area of each interval. Add all intervals.
• Lebesgue's approach: Divide the range of the function (the y-axis). Measure the length of each horizontal interval. Calculate the area of each interval. Add all intervals.
Taking expectations: Riemann vs Lebesgue • A Borel function (RV) f is integrable if and only if |f| is integrable.
• For convenience, we define the integral of a measurable function f from (Ω, Σ, μ) to (R̄, B̄), where R̄ = R ∪ {−∞, ∞} and B̄ = σ(B ∪ {{∞}, {−∞}}).
Example: If Ω = R and μ is the Lebesgue measure, then the
Lebesgue integral of f over an interval [a, b] is written as ∫_[a,b] f(x) dx = ∫_a^b f(x) dx, which agrees with the Riemann integral when the latter is well defined. However, there are functions for which the Lebesgue integral is defined but the Riemann integral is not.
• If μ = P, a probability measure, then in statistics ∫ X dP = EX = E[X] is called the expectation or expected value of X.
Expected Value
Consider our probability space (Ω, Σ, P). Take an event (a set A of ω ∈ Ω) and X, a RV, that assigns real numbers to each ω ∈ A.
• If we take an observation from A without knowing which ω ∈ A will be drawn, we may want to know what value of X(ω) we should expect to see.
• Each ω ∈ A has been assigned a probability measure P[ω], which induces P[x]. Then, we use this to weight the values X(ω).
• P is a probability measure: the weights sum to 1. The weighted sum provides us with a weighted average of X(ω). If P gives the "correct" likelihood of ω being chosen, the weighted average of X(ω) – that is, E[X] – tells us what values of X(ω) are expected.
Expected Value
• Now, with the concept of the Lebesgue integral, we take the possible values {xi} and construct a grid on the y-axis, which gives a corresponding grid on the x-axis in A, where
Ai = {ω ∈ A : X(ω) ∈ [xi, xi+1)}.
• Let the elements of the x-axis grid be Ai. The weighted average is
Σ_{i=1}^n xi P[Ai] = Σ_{i=1}^n xi PX[X = xi] = Σ_{i=1}^n xi fX(xi)
• As we shrink the grid towards 0, each Ai becomes infinitesimal. Let dω be the infinitesimal set. The Lebesgue integral becomes:
lim_{n→∞} Σ_{i=1}^n xi P[Ai] = ∫ X(ω) P[dω] = ∫ x fX(x) dx
The Expectation of X: E(X)
The expectation operator defines the mean (or population average) of a random variable or expression.
Definition: Let X denote a discrete RV with probability function p(x) (probability density function f(x) if X is continuous). Then the expected value of X, E(X), is defined to be:
E(X) = Σ_{xi} xi p(xi)
and, if X is continuous with probability density function f(x),
E(X) = ∫_{−∞}^{∞} x f(x) dx
Sometimes we write E[.] as EX[.] to indicate that the expectation is taken over fX(x).
Interpretation of E(X)
1. The expected value of X, E(X), is the center of gravity of the probability distribution of X.
2. The expected value of X, E(X), is the long-run average value of X. (To be discussed later: Law of Large Numbers.)
(Figure: a discrete probability distribution on 0, 1, ..., 7 with E(X) marked.)
Example: The Binomial distribution. Let X be a discrete random variable having the Binomial distribution, i.e., X = the number of successes in n independent repetitions of a Bernoulli trial. Find the expected value of X, E(X).
p(x) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, 3, ..., n

E(X) = Σ_x x p(x) = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^(n−x)
= Σ_{x=1}^{n} x [n!/(x!(n − x)!)] p^x (1 − p)^(n−x)
= Σ_{x=1}^{n} [n!/((x − 1)!(n − x)!)] p^x (1 − p)^(n−x)
Example: Solution
= np Σ_{x=1}^{n} [(n − 1)!/((x − 1)!(n − x)!)] p^(x−1) (1 − p)^(n−x)
= np [p + (1 − p)]^(n−1)   (by the binomial theorem)
= np
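A quick numerical sketch of this result: summing x·p(x) directly reproduces np. The values n = 10, p = 0.3 are arbitrary illustrative choices, not from the text.

```python
from math import comb

# Direct check of E(X) = n*p for the Binomial distribution:
# sum x * p(x) over all outcomes x = 0, ..., n.
n, p = 10, 0.3  # illustrative parameters
ex = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
```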
Example: Exponential Distribution
Let X have an exponential distribution with parameter λ. The probability density function of X is:
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_0^∞ x λ e^(−λx) dx
We will determine ∫ x λ e^(−λx) dx using integration by parts: ∫ u dv = uv − ∫ v du.
Example: Exponential Distribution. We will determine ∫ x λ e^(−λx) dx using integration by parts. In this case u = x and dv = λ e^(−λx) dx.
Hence du = dx and v = −e^(−λx). Thus:
∫ x λ e^(−λx) dx = −x e^(−λx) + ∫ e^(−λx) dx = −x e^(−λx) − (1/λ) e^(−λx)
E(X) = ∫_0^∞ x λ e^(−λx) dx = [−x e^(−λx) − (1/λ) e^(−λx)]_0^∞ = (0 − 0) − (0 − 1/λ) = 1/λ
Summary: If X has an exponential distribution with parameter λ, then E(X) = 1/λ.
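The integral above can be approximated numerically as a sanity check. A minimal sketch with a midpoint Riemann sum; λ = 2 and the cutoff at x = 20 are arbitrary illustrative choices (the tail beyond 20 is negligible for this λ).

```python
from math import exp

lam = 2.0   # illustrative rate parameter
dx = 1e-4
# Midpoint Riemann sum of x * lam * e^(-lam*x) over [0, 20]
ex = 0.0
for i in range(int(20 / dx)):
    x = (i + 0.5) * dx
    ex += x * lam * exp(-lam * x) * dx
```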
Example: The Uniform distribution
Suppose X has a uniform distribution from a to b. Then:
f(x) = 1/(b − a) for a ≤ x ≤ b; f(x) = 0 for x < a or x > b
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_a^b x [1/(b − a)] dx = [x²/(2(b − a))]_a^b = (b² − a²)/(2(b − a)) = (a + b)/2
Example: The Normal distribution. Suppose X has a Normal distribution with parameters μ and σ. Then:
f(x) = [1/(√(2π) σ)] e^(−(x − μ)²/(2σ²))
The expected value of X is:
E(X) = ∫ x f(x) dx = ∫ x [1/(√(2π) σ)] e^(−(x − μ)²/(2σ²)) dx
Make the substitution z = (x − μ)/σ, so dz = (1/σ) dx and x = μ + σz. Hence:
E(X) = ∫ (μ + σz) [1/√(2π)] e^(−z²/2) dz = μ ∫ [1/√(2π)] e^(−z²/2) dz + σ ∫ z [1/√(2π)] e^(−z²/2) dz
Now ∫ [1/√(2π)] e^(−z²/2) dz = 1 and ∫ z [1/√(2π)] e^(−z²/2) dz = 0 (odd integrand).
Thus E(X) = μ.
Example: The Gamma distribution
Suppose X has a Gamma distribution with parameters α and λ. Then:
f(x) = [λ^α/Γ(α)] x^(α−1) e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
Note: ∫_0^∞ [λ^α/Γ(α)] x^(α−1) e^(−λx) dx = 1 if α > 0, λ > 0.
This is a very useful formula when working with the Gamma distribution.
Example: The Gamma distribution. The expected value of X is:
E(X) = ∫ x f(x) dx = ∫_0^∞ x [λ^α/Γ(α)] x^(α−1) e^(−λx) dx = [λ^α/Γ(α)] ∫_0^∞ x^α e^(−λx) dx
= [λ^α/Γ(α)] [Γ(α + 1)/λ^(α+1)] ∫_0^∞ [λ^(α+1)/Γ(α + 1)] x^α e^(−λx) dx   (this last integral equals 1)
= Γ(α + 1)/(λ Γ(α)) = α/λ
Example: The Gamma distribution
Thus, if X has a Gamma(α, λ) distribution, the expected value of X is E(X) = α/λ.
Special Cases:
1. Exponential(λ) distribution: α = 1, λ arbitrary → E(X) = 1/λ
2. Chi-square(ν) distribution: α = ν/2, λ = ½ → E(X) = (ν/2)/(1/2) = ν
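E(X) = α/λ can also be illustrated by simulation. A minimal sketch using the standard library's `random.gammavariate`, which is parameterized by (shape, scale) with scale = 1/λ; the values α = 3, λ = 2 and the sample size are arbitrary illustrative choices.

```python
import random
from statistics import fmean

random.seed(42)
alpha, lam = 3.0, 2.0   # illustrative shape and rate
# random.gammavariate takes (shape, scale); the scale is 1/lambda
samples = [random.gammavariate(alpha, 1 / lam) for _ in range(200_000)]
m = fmean(samples)      # should be close to alpha/lam = 1.5
```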
Example: The Gamma distribution
(Figure: Gamma pdf on 0–10 with E(X) = α/λ marked.)
The Exponential distribution
(Figure: exponential pdf on 0–25 with E(X) = 1/λ marked.)
The Chi-square (χ²) distribution
(Figure: chi-square pdf on 0–25 with E(X) = ν marked.)
Expectation of a function of a RV
• Let X denote a discrete RV with probability function p(x) (or pdf f(x) if X is continuous). Then the expected value of g(X), E[g(X)], is defined to be:
E[g(X)] = Σ_{xi} g(xi) p(xi)
and, if X is continuous with probability density function f(x),
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
• Examples:
g(x) = (x − μ)²  →  E[g(X)] = E[(X − μ)²]
g(x) = (x − μ)^k  →  E[g(X)] = E[(X − μ)^k]
Expectation of a function of a RV
Example: Suppose X has a uniform distribution from 0 to b. Then:
f(x) = 1/b for 0 ≤ x ≤ b; f(x) = 0 otherwise
Find the expected value of A = X². If X is the length of a side of a square (chosen at random from 0 to b), then A is the area of the square.
E(X²) = ∫ x² f(x) dx = ∫_0^b x² (1/b) dx = [x³/(3b)]_0^b = b²/3
= 1/3 of the maximum area of the square (b²)
Median: An alternative central measure • A median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution, from the lower half.
Definition: Median The median of a random variable X is the unique number m that satisfies the following inequalities: P(X ≤ m) ≥ ½ and P(X≥ m) ≥ ½.
For a continuous distribution, we have that m solves:
∫_{−∞}^{m} fX(x) dx = ∫_{m}^{∞} fX(x) dx = 1/2
Median: An alternative central measure
• Calculating the median is a popular technique in summary statistics since it is simple to understand and easy to compute, while also giving a measure that is more robust to outlier values than the mean.
An optimality property A median is also a central point which minimizes the average of the absolute deviations. That is, a value of c that minimizes E(|X – c|) is the median of the probability distribution of the random variable X.
Example I: Median of the Exponential Distribution
Let X have an exponential distribution with parameter . The probability density function of X is:
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The median m solves the following integral:
∫_m^∞ fX(x) dx = 1/2
∫_m^∞ λ e^(−λx) dx = [−e^(−λx)]_m^∞ = e^(−λm) = 1/2
That is, m = ln(2)/λ.
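As a quick numerical cross-check, solving FX(m) = 1/2 by bisection reproduces m = ln(2)/λ; the value λ = 2 is an arbitrary illustrative choice.

```python
from math import exp, log

lam = 2.0   # illustrative rate parameter

def cdf(x):
    # CDF of the Exponential(lam) distribution
    return 1 - exp(-lam * x)

# Bisection: find m with cdf(m) = 1/2
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
m = (lo + hi) / 2
```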
Example II: Median of the Pareto Distribution
Let X follow a Pareto distribution with shape parameter α and scale parameter xs (usually denoted xm). The pdf of X is:
f(x) = α xs^α / x^(α+1) for x ≥ xs; f(x) = 0 for x < xs
The median m solves the following integral:
∫_m^∞ fX(x) dx = 1/2
∫_m^∞ α xs^α x^(−(α+1)) dx = [−xs^α x^(−α)]_m^∞ = (xs/m)^α = 1/2
That is, m = xs 2^(1/α).
Note: The Pareto distribution is used to describe the distribution of wealth.
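The formula m = xs·2^(1/α) can be verified directly against the Pareto survival function P(X > x) = (xs/x)^α; the parameter values below are arbitrary illustrative choices.

```python
xs_, alpha = 1.5, 2.0          # illustrative scale and shape
m = xs_ * 2 ** (1 / alpha)     # candidate median
# Survival function of the Pareto: P(X > x) = (xs/x)^alpha for x >= xs
tail = (xs_ / m) ** alpha      # should equal 1/2 at the median
```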
Moments of a Random Variable
The moments of a random variable X are used to describe the behavior of the RV (discrete or continuous).
Definition: kth Moment. Let X be a RV (discrete or continuous). Then the kth moment of X is:
μk = E(X^k) = Σ_x x^k p(x) if X is discrete
μk = E(X^k) = ∫_{−∞}^{∞} x^k f(x) dx if X is continuous
• The first moment of X, μ = μ1 = E(X), is the center of gravity of the distribution of X.
• The higher moments give different information regarding the shape of the distribution of X.
Moments of a Random Variable
Definition: Central Moments. Let X be a RV (discrete or continuous). Then the kth central moment of X is defined to be:
μ0k = E[(X − μ)^k] = Σ_x (x − μ)^k p(x) if X is discrete
μ0k = E[(X − μ)^k] = ∫_{−∞}^{∞} (x − μ)^k f(x) dx if X is continuous
where μ = μ1 = E(X) = the first moment of X.
• The central moments describe how the probability distribution is distributed about the center of gravity, μ.
Moments of a Random Variable – 1st and 2nd
The first central moment is:
μ01 = E[(X − μ)] = 0
The second central moment, μ02 = E[(X − μ)²], depends on the spread of the probability distribution of X about μ. It is called the variance of X and is denoted by the symbol var(X).
σ = sqrt(μ02) = sqrt(E[(X − μ)²]) is called the standard deviation of X.
var(X) = μ02 = E[(X − μ)²] = E(X²) − μ² = μ2 − μ1²
Moments of a Random Variable – Skewness
• The third central moment
μ03 = E[(X − μ)³]
contains information about the skewness of a distribution.
• A popular measure of skewness:
γ1 = μ03/σ³ = μ03/(μ02)^(3/2)
• Distributions according to skewness:
1) Symmetric distribution: μ03 = 0, γ1 = 0
(Figure: a symmetric pdf.)
Moments of a Random Variable – Skewness
2) Positively skewed distribution: μ03 > 0, γ1 > 0
(Figure: a right-skewed pdf.)
3) Negatively skewed distribution: μ03 < 0, γ1 < 0
(Figure: a left-skewed pdf.)
Moments of a Random Variable – Skewness
• Skewness and Economics:
- Zero skew means symmetrical gains and losses.
- Positive skew suggests many small losses and a few rich returns.
- Negative skew indicates lots of minor wins offset by rare major losses.
• In financial markets, stock returns at the firm level show positive skewness, but stock returns at the aggregate (index) level show negative skewness.
• Evidence from horse race betting and from U.S. state lotteries supports the contention that gamblers are not necessarily risk-lovers but skewness-lovers: long shots are overbet (positive skewness is loved!).
Moments of a Random Variable – Kurtosis
• The fourth central moment
μ04 = E[(X − μ)⁴]
contains information about the shape of a distribution. The property of shape measured by this moment is called kurtosis.
• The measure of (excess) kurtosis:
γ2 = μ04/σ⁴ − 3 = μ04/(μ02)² − 3
• Distributions:
1) Mesokurtic distribution: γ2 ≈ 0, μ04 moderate in size
(Figure: a normal-shaped pdf.)
2) Platykurtic distribution: γ2 < 0, μ04 small in size
(Figure: a flat-topped pdf.)
3) Leptokurtic distribution: γ2 > 0, μ04 large in size
(Figure: a sharply peaked, heavy-tailed pdf.)
Moments of a Random Variable
Example: The uniform distribution from 0 to 1
f(x) = 1 for 0 ≤ x ≤ 1; f(x) = 0 otherwise
Finding the moments:
μk = ∫ x^k f(x) dx = ∫_0^1 x^k dx = [x^(k+1)/(k + 1)]_0^1 = 1/(k + 1)
Finding the central moments (μ = μ1 = 1/2):
μ0k = ∫ (x − μ)^k f(x) dx = ∫_0^1 (x − 1/2)^k dx
Moments of a Random Variable
Finding the central moments (continuation):
μ0k = ∫_0^1 (x − 1/2)^k dx
Making the substitution w = x − 1/2:
μ0k = ∫_{−1/2}^{1/2} w^k dw = [w^(k+1)/(k + 1)]_{−1/2}^{1/2} = [(1/2)^(k+1) − (−1/2)^(k+1)]/(k + 1)
= 1/(2^k (k + 1)) if k is even; 0 if k is odd
Moments of a Random Variable
Hence μ02 = 1/(2²·3) = 1/12, μ03 = 0, μ04 = 1/(2⁴·5) = 1/80.
Thus var(X) = μ02 = 1/12.
The standard deviation σ = sqrt(var(X)) = sqrt(1/12).
The measure of skewness: γ1 = μ03/σ³ = 0.
The measure of kurtosis: γ2 = μ04/σ⁴ − 3 = (1/80)/(1/12)² − 3 = 1.8 − 3 = −1.2.
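These uniform(0,1) values (variance 1/12, excess kurtosis −1.2) can be reproduced numerically with midpoint Riemann sums, a minimal sketch:

```python
dx = 1e-4
n = int(1 / dx)
xs = [(i + 0.5) * dx for i in range(n)]
mu = sum(x * dx for x in xs)                  # first moment, ~1/2
var = sum((x - mu) ** 2 * dx for x in xs)     # second central moment, ~1/12
mu4 = sum((x - mu) ** 4 * dx for x in xs)     # fourth central moment, ~1/80
g2 = mu4 / var ** 2 - 3                       # excess kurtosis, ~-1.2
```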
Alternative measures of dispersion
When the median is used as a central measure for a distribution, there are several choices for a measure of variability:
- The range: the length of the smallest interval containing the data.
- The interquartile range: the difference between the 3rd and 1st quartiles.
- The mean absolute deviation: (1/n) Σi |xi − central measure(X)|
- The median absolute deviation (MAD): MAD = mediani(|xi − median(X)|)
These measures are more robust (to outliers) estimators of scale than the sample variance or standard deviation.
They especially behave better with distributions without a mean or variance, such as the Cauchy distribution.
Rules for Expectations
E[g(X)] = Σ_x g(x) p(x) if X is discrete; E[g(X)] = ∫ g(x) f(x) dx if X is continuous
• Rules:
1. E(c) = c, where c is a constant.
Proof: if g(X) = c, then
E[g(X)] = E(c) = ∫ c f(x) dx = c ∫ f(x) dx = c
The proof for discrete random variables is similar.
Rules for Expectations
2. E(aX + b) = a E(X) + b, where a, b are constants.
Proof: if g(X) = aX + b, then
E(aX + b) = ∫ (ax + b) f(x) dx = a ∫ x f(x) dx + b ∫ f(x) dx = a E(X) + b
The proof for discrete random variables is similar.
Rules for Expectations
3. var(X) = μ02 = E[(X − μ)²] = E(X²) − [E(X)]² = μ2 − μ1²
Proof:
var(X) = E[(X − μ)²] = ∫ (x − μ)² f(x) dx = ∫ (x² − 2μx + μ²) f(x) dx
= ∫ x² f(x) dx − 2μ ∫ x f(x) dx + μ² ∫ f(x) dx
= E(X²) − 2μ·μ + μ² = E(X²) − μ² = μ2 − μ1²
The proof for discrete random variables is similar.
Rules for Expectations
4. var(aX + b) = a² var(X)
Proof:
μ_(aX+b) = E(aX + b) = a E(X) + b = aμ + b
var(aX + b) = E[(aX + b − μ_(aX+b))²] = E[(aX + b − aμ − b)²]
= E[a²(X − μ)²] = a² E[(X − μ)²] = a² var(X)
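Rules 2 and 4 can be illustrated by simulation, since the sample mean and sample variance obey the same identities. A short sketch; the constants a, b, the Exponential(1) draws, and the sample size are arbitrary choices, not from the text.

```python
import random
from statistics import fmean, pvariance

random.seed(0)
a, b = 3.0, -2.0                                     # illustrative constants
x = [random.expovariate(1.0) for _ in range(100_000)]
y = [a * xi + b for xi in x]                         # y = aX + b

# Sample analogues of E(aX+b) = aE(X)+b and var(aX+b) = a^2 var(X)
mean_gap = abs(fmean(y) - (a * fmean(x) + b))
var_gap = abs(pvariance(y) - a * a * pvariance(x))
```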
Moment generating functions
The expectation of a function g(X) is given by:
E[g(X)] = Σ_x g(x) p(x) if X is discrete; E[g(X)] = ∫ g(x) f(x) dx if X is continuous
Definition: Moment Generating Function (MGF). Let X denote a random variable. Then the moment generating function of X, mX(t), is defined by:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) if X is discrete
mX(t) = E(e^(tX)) = ∫_{−∞}^{∞} e^(tx) f(x) dx if X is continuous
MGF: Examples
1. The Binomial distribution (parameters p, n)
p(x) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) = Σ_{x=0}^{n} e^(tx) C(n, x) p^x (1 − p)^(n−x)
= Σ_{x=0}^{n} C(n, x) (p e^t)^x (1 − p)^(n−x)
= (p e^t + 1 − p)^n,
using the binomial theorem Σ_{x=0}^{n} C(n, x) a^x b^(n−x) = (a + b)^n with a = p e^t and b = 1 − p.
MGF: Examples
2. The Poisson distribution (parameter λ)
p(x) = (λ^x/x!) e^(−λ),  x = 0, 1, 2, ...
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = Σ_x e^(tx) p(x) = Σ_{x=0}^{∞} e^(tx) (λ^x/x!) e^(−λ)
= e^(−λ) Σ_{x=0}^{∞} (λ e^t)^x/x! = e^(−λ) e^(λ e^t)   (using Σ_{x=0}^{∞} u^x/x! = e^u)
= e^(λ(e^t − 1))
MGF: Examples
3. The Exponential distribution (parameter λ)
f(x) = λ e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx = ∫_0^∞ e^(tx) λ e^(−λx) dx = λ ∫_0^∞ e^(−(λ−t)x) dx
= λ [−e^(−(λ−t)x)/(λ − t)]_0^∞ = λ/(λ − t) for t < λ; undefined for t ≥ λ
MGF: Examples
4. The Standard Normal distribution (μ = 0, σ = 1)
f(x) = (1/√(2π)) e^(−x²/2)
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx
= ∫ e^(tx) (1/√(2π)) e^(−x²/2) dx
= (1/√(2π)) ∫ e^(−(x² − 2tx)/2) dx
MGF: Examples
We will now use the fact that
∫ [1/(√(2π) a)] e^(−(x − b)²/(2a²)) dx = 1 for all a > 0 and any b.
Completing the square:
mX(t) = (1/√(2π)) ∫ e^(−(x² − 2tx + t²)/2) e^(t²/2) dx = e^(t²/2) ∫ (1/√(2π)) e^(−(x − t)²/2) dx
= e^(t²/2)   (the last integral equals 1)
MGF: Examples
5. The Gamma distribution (parameters α, λ)
f(x) = [λ^α/Γ(α)] x^(α−1) e^(−λx) for x ≥ 0; f(x) = 0 for x < 0
The MGF of X, mX(t), is:
mX(t) = E(e^(tX)) = ∫ e^(tx) f(x) dx = ∫_0^∞ e^(tx) [λ^α/Γ(α)] x^(α−1) e^(−λx) dx
= [λ^α/Γ(α)] ∫_0^∞ x^(α−1) e^(−(λ−t)x) dx
MGF: Examples
We use the fact that
∫_0^∞ [b^a/Γ(a)] x^(a−1) e^(−bx) dx = 1 for all a > 0, b > 0.
Hence, for t < λ:
mX(t) = [λ^α/Γ(α)] [Γ(α)/(λ − t)^α] ∫_0^∞ [(λ − t)^α/Γ(α)] x^(α−1) e^(−(λ−t)x) dx   (the last integral equals 1)
= [λ/(λ − t)]^α
6. The Chi-square distribution with ν degrees of freedom (α = ν/2, λ = ½):
mX(t) = (1 − 2t)^(−ν/2)
MGF: Properties
1. mX(0) = 1
mX(t) = E(e^(tX)), hence mX(0) = E(e^(0·X)) = E(1) = 1.
Note: The MGFs of the following distributions satisfy the property mX(0) = 1:
i) Binomial: mX(t) = (p e^t + 1 − p)^n
ii) Poisson: mX(t) = e^(λ(e^t − 1))
iii) Exponential: mX(t) = λ/(λ − t)
iv) Std Normal: mX(t) = e^(t²/2)
v) Gamma: mX(t) = [λ/(λ − t)]^α
MGF: Properties
2. mX(t) = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + ...
We use the expansion of the exponential function:
e^(tX) = 1 + tX + (tX)²/2! + (tX)³/3! + ...
mX(t) = E(e^(tX)) = 1 + E(X) t + E(X²) t²/2! + E(X³) t³/3! + ... = 1 + μ1 t + μ2 t²/2! + μ3 t³/3! + ...
3. m(k)X(0) = [d^k mX(t)/dt^k] at t = 0 = μk
Now, differentiating the series term by term:
m′X(t) = μ1 + μ2 t + μ3 t²/2! + ...  and m′X(0) = μ1
m″X(t) = μ2 + μ3 t + ...  and m″X(0) = μ2
Continuing, we find m(k)X(0) = μk.
MGF: Applying Property 3 – Binomial
Property 3 is very useful in determining the moments of a RV X.
Examples:
i) Binomial: mX(t) = (p e^t + 1 − p)^n
m′X(t) = n (p e^t + 1 − p)^(n−1) p e^t
m′X(0) = n (p + 1 − p)^(n−1) p = np = μ1
m″X(t) = n(n − 1)(p e^t + 1 − p)^(n−2) (p e^t)² + n (p e^t + 1 − p)^(n−1) p e^t
m″X(0) = n(n − 1)p² + np = np[(n − 1)p + 1] = np[np + 1 − p] = np[np + q] = n²p² + npq = μ2
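Property 3 can be sanity-checked numerically: finite-difference derivatives of the Binomial MGF at t = 0 should reproduce μ1 = np and μ2 = n²p² + npq. A small sketch; n = 10, p = 0.3 are arbitrary illustrative values.

```python
from math import exp

n, p = 10, 0.3   # illustrative parameters
q = 1 - p

def m(t):
    # MGF of the Binomial(n, p) distribution
    return (p * exp(t) + q) ** n

h = 1e-4
m1 = (m(h) - m(-h)) / (2 * h)              # ~ m'(0)  = np
m2 = (m(h) - 2 * m(0) + m(-h)) / h ** 2    # ~ m''(0) = n^2 p^2 + n p q
```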
MGF: Applying Property 3 – Poisson
ii) Poisson: mX(t) = e^(λ(e^t − 1))
m′X(t) = λ e^t e^(λ(e^t − 1))
m″X(t) = (λ e^t)² e^(λ(e^t − 1)) + λ e^t e^(λ(e^t − 1)) = (λ² e^(2t) + λ e^t) e^(λ(e^t − 1))
m‴X(t) = (λ³ e^(3t) + 3λ² e^(2t) + λ e^t) e^(λ(e^t − 1))
MGF: Applying Property 3 – Poisson
To find the moments we set t = 0.
μ1 = m′X(0) = λ
μ2 = m″X(0) = λ² + λ
μ3 = m‴X(0) = λ³ + 3λ² + λ
MGF: Applying Property 3 – Exponential
iii) Exponential: mX(t) = λ/(λ − t) = λ (λ − t)^(−1)
m′X(t) = λ (λ − t)^(−2)
m″X(t) = 2 λ (λ − t)^(−3)
m‴X(t) = (2)(3) λ (λ − t)^(−4) = 3! λ (λ − t)^(−4)
m⁗X(t) = (2)(3)(4) λ (λ − t)^(−5) = 4! λ (λ − t)^(−5)
In general: m(k)X(t) = k! λ (λ − t)^(−(k+1))
MGF: Applying Property 3 – Exponential
Thus,
μ1 = m′X(0) = λ λ^(−2) = 1/λ
μ2 = m″X(0) = 2λ λ^(−3) = 2/λ²
and, in general, μk = m(k)X(0) = k! λ λ^(−(k+1)) = k!/λ^k.
We can calculate the following popular descriptive statistics:
- σ² = μ02 = μ2 − μ1² = (2/λ²) − (1/λ)² = (1/λ)²
- γ1 = μ03/σ³ = (2/λ³)/[(1/λ)³] = 2
- γ2 = μ04/σ⁴ − 3 = (9/λ⁴)/[(1/λ)⁴] − 3 = 6
MGF: Applying Property 3 – Exponential
Note: the moments for the exponential distribution can be calculated in an alternative way. This is done by expanding mX(t) in powers of t and equating the coefficients of t^k to the coefficients in mX(t) = Σk μk t^k/k!:
mX(t) = λ/(λ − t) = 1/(1 − t/λ) = 1 + u + u² + u³ + ...  with u = t/λ
= 1 + t/λ + t²/λ² + t³/λ³ + ...
Equating the coefficients of t^k, we get μk/k! = 1/λ^k, or μk = k!/λ^k.
MGF: Applying Property 3 – Normal
iv) Standard normal distribution: mX(t) = exp(t²/2)
We use the expansion of e^u with u = t²/2:
mX(t) = 1 + (t²/2) + (t²/2)²/2! + (t²/2)³/3! + ... = Σk t^(2k)/(2^k k!)
We now equate the coefficients of t^k in this series with those of mX(t) = Σk μk t^k/k!:
MGF: Applying Property 3 – Normal
If k is odd: μk = 0.
For even powers 2k: μ_(2k)/(2k)! = 1/(2^k k!), or μ_(2k) = (2k)!/(2^k k!).
Thus μ1 = 0, μ2 = 1, μ3 = 0, μ4 = 4!/(2²·2!) = 3.
The log of Moment Generating Functions
Let lX(t) = ln mX(t) = the log of the MGF. Then:
lX(0) = ln mX(0) = ln 1 = 0
l′X(t) = m′X(t)/mX(t), so l′X(0) = m′X(0)/mX(0) = μ1 = μ
l″X(t) = [m″X(t) mX(t) − (m′X(t))²]/mX(t)², so
l″X(0) = [m″X(0) mX(0) − (m′X(0))²]/mX(0)² = μ2 − μ1² = σ²
The Log of Moment Generating Functions
Thus lX(t) = ln mX(t) is very useful for calculating the mean and variance of a random variable:
1. l′X(0) = μ
2. l″X(0) = σ²
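These two facts can be checked numerically for a concrete case. A small sketch using the Exponential(λ) log-MGF lX(t) = ln λ − ln(λ − t), with an arbitrary illustrative λ = 2: finite differences at t = 0 should reproduce μ = 1/λ and σ² = 1/λ².

```python
from math import log

lam = 2.0   # illustrative rate for the Exponential distribution

def l(t):
    # log-MGF of Exponential(lam), valid for t < lam
    return log(lam / (lam - t))

h = 1e-4
l1 = (l(h) - l(-h)) / (2 * h)              # ~ l'(0)  = 1/lam   (mean)
l2 = (l(h) - 2 * l(0) + l(-h)) / h ** 2    # ~ l''(0) = 1/lam^2 (variance)
```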
Log of MGF: Examples – Binomial
1. The Binomial distribution (parameters p, n), with q = 1 − p:
mX(t) = (p e^t + 1 − p)^n = (p e^t + q)^n
lX(t) = ln mX(t) = n ln(p e^t + q)
l′X(t) = n p e^t/(p e^t + q), so μ = l′X(0) = np/(p + q) = np
l″X(t) = n [p e^t (p e^t + q) − (p e^t)²]/(p e^t + q)² = n p q e^t/(p e^t + q)²
σ² = l″X(0) = npq/(p + q)² = npq
Log of MGF: Examples – Poisson
2. The Poisson distribution (parameter λ):
mX(t) = e^(λ(e^t − 1))
lX(t) = ln mX(t) = λ(e^t − 1)
l′X(t) = λ e^t, so μ = l′X(0) = λ
l″X(t) = λ e^t, so σ² = l″X(0) = λ
Log of MGF: Examples – Exponential
3. The Exponential distribution (parameter λ):
mX(t) = λ/(λ − t) for t < λ; undefined for t ≥ λ
lX(t) = ln mX(t) = ln λ − ln(λ − t) if t < λ
l′X(t) = 1/(λ − t)
l″X(t) = 1/(λ − t)²
Thus μ = l′X(0) = 1/λ and σ² = l″X(0) = 1/λ².
Log of MGF: Examples – Normal
4. The Standard Normal distribution (μ = 0, σ = 1):
mX(t) = e^(t²/2)
lX(t) = ln mX(t) = t²/2
l′X(t) = t, l″X(t) = 1
Thus μ = l′X(0) = 0 and σ² = l″X(0) = 1.
Log of MGF: Examples – Gamma
5. The Gamma distribution (parameters α, λ):
mX(t) = [λ/(λ − t)]^α
lX(t) = ln mX(t) = α[ln λ − ln(λ − t)]
l′X(t) = α/(λ − t)
l″X(t) = α/(λ − t)²
Hence μ = l′X(0) = α/λ and σ² = l″X(0) = α/λ².
Log of MGF: Examples – Chi-squared
6. The Chi-square distribution (degrees of freedom ν):
mX(t) = (1 − 2t)^(−ν/2)
lX(t) = ln mX(t) = −(ν/2) ln(1 − 2t)
l′X(t) = −(ν/2)(−2)/(1 − 2t) = ν/(1 − 2t)
l″X(t) = 2ν/(1 − 2t)²
Hence μ = l′X(0) = ν and σ² = l″X(0) = 2ν.
Characteristic functions
Definition: Characteristic Function. Let X denote a random variable. Then the characteristic function of X, φX(t), is defined by:
φX(t) = E(e^(itX))
Since e^(itx) = cos(tx) + i sin(tx) and |e^(itx)| ≤ 1, φX(t) is defined for all t. Thus, the characteristic function always exists, but the MGF need not exist.
Relation to the MGF: φX(t) = miX(t) = mX(it)
Calculation of moments: [d^k φX(t)/dt^k] at t = 0 = i^k μk
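As a numerical sketch (an illustration, not from the text): for the Exponential(λ) distribution the characteristic function is φX(t) = λ/(λ − it), its modulus never exceeds 1, and a finite-difference derivative at t = 0, divided by i, recovers E(X) = 1/λ. The value λ = 2 is arbitrary.

```python
lam = 2.0   # illustrative rate for the Exponential distribution

def phi(t):
    # Characteristic function of Exponential(lam): lam / (lam - i t)
    return lam / (lam - 1j * t)

# phi is defined (and bounded by 1 in modulus) for every real t
mag = abs(phi(5.0))

# First moment: phi'(0)/i = E(X) = 1/lam, via a central finite difference
h = 1e-5
d1 = (phi(h) - phi(-h)) / (2 * h)
mean = (d1 / 1j).real
```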