

Chapter 3 Moments of a Distribution

Expectation

We develop the expectation operator in terms of the Lebesgue integral.

• Recall that the Lebesgue measure λ(A) for some set A gives the length/area/volume of the set A. If A = (3, 7), then λ(A) = |3 − 7| = 4.

• The Lebesgue integral of f on [a, b] is defined in terms of Σ_i y_i λ(A_i),

where 0 = y_1 ≤ y_2 ≤ ... ≤ y_n, A_i = {x : y_i ≤ f(x) < y_{i+1}}, and λ(A_i) is the Lebesgue measure of the set A_i.

• The value of the Lebesgue integral is the limit as the y_i's are pushed closer together. That is, we break the y-axis into a grid using {y_n} and break the x-axis into the corresponding grid {A_n}, where

A_i = {x : f(x) ∈ [y_i, y_{i+1})}.


Taking expectations: Riemann vs Lebesgue

• Riemann’s approach: Partition the base (the x-axis). Measure the height of the function at the center of each interval. Calculate the area of each piece. Add all pieces.

• Lebesgue approach: Divide the range of the function (the y-axis). Measure the length of each horizontal interval. Calculate the area of each piece. Add all pieces.

• A Borel function (RV) f is integrable if and only if |f| is integrable.

• For convenience, we define the integral of a measurable function f from (Ω, Σ, μ) to (R̄, B̄), where R̄ = R ∪ {−∞, ∞} and B̄ = σ(B ∪ {{∞}, {−∞}}).

Example: If Ω = R and μ is the Lebesgue measure, then the Lebesgue integral of f over an interval [a, b] is written as ∫_{[a,b]} f(x) dx = ∫_a^b f(x) dx, which agrees with the Riemann integral when the latter is well defined. However, there are functions for which the Lebesgue integral is defined but not the Riemann integral.

• If μ = P, a probability measure, then ∫ X dP = EX = E[X] is called the expectation or mean of X.


Expected Value

Consider our space (Ω, Σ, P). Take an event (a set A of ω ∈ Ω) and X, a RV, that assigns real numbers to each ω ∈ A.

• If we take an observation from A without knowing which ω ∈ A will be drawn, we may want to know what value of X(ω) we should expect to see.

• Each ω ∈ A has been assigned a P[ω], which induces P[x]. Then, we use this to weight the values X(ω).

• P is a probability measure: the weights sum to 1. The weighted sum provides us with a weighted average of X(ω). If P gives the "correct" likelihood of ω being chosen, the weighted average of X(ω), E[X], tells us what values of X(ω) are expected.

Expected Value

• Now, with the concept of the Lebesgue integral, we take the possible values {x_i} and construct a grid on the y-axis, which gives a corresponding grid on the x-axis in A, where

A_i = {ω ∈ A : X(ω) ∈ [x_i, x_{i+1})}.

Let the elements in the x-axis grid be A_i. The weighted average is

Σ_{i=1}^n x_i P[A_i] = Σ_{i=1}^n x_i P_X[X = x_i] = Σ_{i=1}^n x_i f_X(x_i)

• As we shrink the grid, the sets A_i become infinitesimal. Let dω be the infinitesimal set A. The Lebesgue integral becomes:

lim_{n→∞} Σ_{i=1}^n x_i P[A_i] = ∫ x P[dω] = ∫ x P_X[X ∈ dx] = ∫ x f_X(x) dx


The Expectation of X: E(X)

The expectation operator defines the mean (or population average) of a random variable or expression.

Definition: Let X denote a discrete RV with probability function p(x) (probability density function f(x) if X is continuous). Then the expected value of X, E(X), is defined to be:

E[X] = Σ_x x p(x) = Σ_i x_i p(x_i)

and if X is continuous with probability density function f(x):

E[X] = ∫_{−∞}^{∞} x f(x) dx

Sometimes we use E[.] as E_X[.] to indicate that the expectation is being taken over f_X(x) dx.
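A minimal numerical sketch of the two forms of the definition (Python; the discrete pmf values and the exponential pdf below are illustrative choices, not part of the notes):

# Numerical illustration of E(X) for a discrete and a continuous RV.
import numpy as np
from scipy.integrate import quad

# Discrete case: E[X] = sum_i x_i p(x_i)   (illustrative pmf; weights sum to 1)
x_vals = np.array([0, 1, 2, 3])
p_vals = np.array([0.1, 0.2, 0.3, 0.4])
e_discrete = np.sum(x_vals * p_vals)                        # 2.0

# Continuous case: E[X] = integral of x f(x) dx, here f(x) = lam*exp(-lam*x)
lam = 2.0
e_continuous, _ = quad(lambda x: x * lam * np.exp(-lam * x), 0, np.inf)
print(e_discrete, e_continuous)                             # 2.0 and about 0.5 = 1/lam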

Interpretation of E(X)

1. The expected value of X, E(X), is the center of gravity of the probability distribution of X.
2. The expected value of X, E(X), is the long-run average value of X. (To be discussed later: the Law of Large Numbers.)

[Figure: probability histogram of X with E(X) marked as the center of gravity.]


Example: The Binomial distribution

Let X be a discrete random variable having the Binomial distribution, i.e., X = the number of successes in n independent repetitions of a Bernoulli (success/failure) experiment. Find the expected value of X, E(X).

n x nx px p 1 p  x 0,1,2,3,K , n x nn n x nx EX xpx  x p 1  p  xx00x

n n x nx  xp 1 p x1 x n n! nx  xppx 1 x1 xn!! x

Example: Solution nn n x nx EX xpx  x p 1  p  xx00x n n x nx  xp 1 p x1 x n n! nx  xppx 1 x1 xn!! x n n! nx  ppx 1 x1 xnx1! ! nn!!nn12 pp1211 pp 0!nn 1 ! 1! 2 ! nn!! K ppnn1 1 p nn2!1! 1!0!


Example: Solution

E[X] = np [ (n−1)!/(0!(n−1)!) p⁰ (1−p)^{n−1} + (n−1)!/(1!(n−2)!) p (1−p)^{n−2} + ...
         + (n−1)!/((n−2)!1!) p^{n−2} (1−p) + (n−1)!/((n−1)!0!) p^{n−1} ]

     = np [ C(n−1,0) p⁰ (1−p)^{n−1} + C(n−1,1) p (1−p)^{n−2} + ... + C(n−1,n−1) p^{n−1} ]

     = np [p + (1 − p)]^{n−1} = np · 1 = np

Example: The Exponential Distribution

Let X have an exponential distribution with parameter λ. The probability density function of X is:

f(x) = λ e^{−λx} for x ≥ 0;   f(x) = 0 for x < 0

The expected value of X is:

E[X] = ∫ x f(x) dx = ∫_0^∞ x λ e^{−λx} dx

We will determine ∫ x λ e^{−λx} dx using integration by parts: ∫ u dv = uv − ∫ v du.


Example: Exponential Distribution

We will determine ∫ x λ e^{−λx} dx using integration by parts. In this case u = x and dv = λ e^{−λx} dx.

Hence du = dx and v = −e^{−λx}. Thus

∫ x λ e^{−λx} dx = −x e^{−λx} + ∫ e^{−λx} dx = −x e^{−λx} − (1/λ) e^{−λx}

E[X] = ∫_0^∞ x λ e^{−λx} dx = [−x e^{−λx} − (1/λ) e^{−λx}]_0^∞ = (0 − 0) − (0 − 1/λ) = 1/λ

Summary: If X has an exponential distribution with parameter λ, then E[X] = 1/λ.

Example: The Uniform distribution

Suppose X has a uniform distribution from a to b. Then:

f(x) = 1/(b − a) for a ≤ x ≤ b;   f(x) = 0 for x < a or x > b

The expected value of X is:

E[X] = ∫ x f(x) dx = ∫_a^b x/(b − a) dx = (1/(b − a)) [x²/2]_a^b = (b² − a²)/(2(b − a)) = (a + b)/2


Example: The Normal Distribution

Suppose X has a Normal distribution with parameters μ and σ. Then:

f(x) = (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)}

The expected value of X is:

E[X] = ∫ x f(x) dx = ∫ x (1/(σ√(2π))) e^{−(x−μ)²/(2σ²)} dx

Make the substitution: z = (x − μ)/σ, so dz = (1/σ) dx and x = μ + σz.

Hence E[X] = ∫ (μ + σz) (1/√(2π)) e^{−z²/2} dz
           = μ (1/√(2π)) ∫ e^{−z²/2} dz + σ (1/√(2π)) ∫ z e^{−z²/2} dz

Now (1/√(2π)) ∫ e^{−z²/2} dz = 1 and ∫ z e^{−z²/2} dz = 0.

Thus E[X] = μ.


Example: The Gamma Distribution

Suppose X has a Gamma distribution with parameters α and λ. Then:

f(x) = (λ^α/Γ(α)) x^{α−1} e^{−λx} for x ≥ 0;   f(x) = 0 for x < 0

Note: ∫_0^∞ (λ^α/Γ(α)) x^{α−1} e^{−λx} dx = 1 if α > 0, λ > 0.

This is a very useful formula when working with the Gamma distribution.

Example: The Gamma distribution

The expected value of X is:

E[X] = ∫ x f(x) dx = ∫_0^∞ x (λ^α/Γ(α)) x^{α−1} e^{−λx} dx = (λ^α/Γ(α)) ∫_0^∞ x^α e^{−λx} dx

     = (λ^α/Γ(α)) · (Γ(α+1)/λ^{α+1}) ∫_0^∞ (λ^{α+1}/Γ(α+1)) x^{(α+1)−1} e^{−λx} dx   [this last integral is equal to 1]

     = (λ^α/Γ(α)) · Γ(α+1)/λ^{α+1} = α Γ(α)/(λ Γ(α)) = α/λ


Example: The Gamma distribution

Thus, if X has a Gamma(α, λ) distribution, the expected value of X is: E(X) = α/λ

Special Cases:
1. Exponential(λ) distribution: α = 1, λ arbitrary → E[X] = 1/λ
2. Chi-square(ν) distribution: α = ν/2, λ = ½ → E[X] = (ν/2)/(½) = ν
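A numerical check of the Gamma mean and the two special cases (Python sketch; scipy parameterizes these distributions with scale = 1/λ, and the parameter values are illustrative):

# E[X] = alpha/lambda for Gamma(alpha, lambda); alpha = 1 gives the exponential,
# alpha = nu/2, lambda = 1/2 gives the chi-square.
from scipy.stats import gamma, expon, chi2

alpha, lam = 3.0, 2.0
print(gamma(a=alpha, scale=1/lam).mean(), alpha / lam)   # 1.5, 1.5
print(expon(scale=1/lam).mean(), 1 / lam)                # 0.5, 0.5
nu = 4
print(chi2(nu).mean(), nu)                               # 4.0, 4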

Example: The Gamma distribution

[Figure: Gamma density with E(X) = α/λ marked.]


The Exponential distribution

[Figure: Exponential density with E(X) = 1/λ marked.]

The Chi-square (2) distribution

0.14

0.12

0.1

0.08

0.06 0.04

0.02

0 0 5 10 15 20 25 EX  


Expectation of a function of a RV

• Let X denote a discrete RV with probability function p(x) (or pdf f(x) if X is continuous) then the expected value of g(X), E[g(X)], is defined to be:

E gX   gxpx     gx ii px  xi

and if X is continuous with probability density function f(x)

 E gX   gxfxdx   

• Examples:
g(x) = (x − μ)² → E[g(X)] = E[(X − μ)²]
g(x) = (x − μ)^k → E[g(X)] = E[(X − μ)^k]

Expectation of a function of a RV

Example: Suppose X has a uniform distribution from 0 to b. Then:

f(x) = 1/b for 0 ≤ x ≤ b;   f(x) = 0 for x < 0 or x > b

Find the expected value of A = X². If X is the length of a side of a square (chosen at random from 0 to b), then A is the area of the square.

E[X²] = ∫ x² f(x) dx = ∫_0^b x² (1/b) dx = (1/b) [x³/3]_0^b = b³/(3b) = b²/3

= 1/3 the maximum area of the square
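A Monte Carlo check of E(A) = b²/3 (Python sketch; b = 2 is an arbitrary illustrative value):

# Simulate random side lengths X ~ Uniform(0, b) and average the areas X^2.
import numpy as np

rng = np.random.default_rng(0)
b = 2.0
x = rng.uniform(0, b, size=1_000_000)
print(np.mean(x**2), b**2 / 3)    # both about 1.333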


Median: An alternative central measure

• A median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution from the lower half.

Definition: Median The median of a random variable X is the unique number m that satisfies the following inequalities: P(X ≤ m) ≥ ½ and P(X≥ m) ≥ ½.

For a continuous distribution, we have that m solves:

∫_{−∞}^m f_X(x) dx = ∫_m^∞ f_X(x) dx = 1/2

Median: An alternative central measure

• Calculation of medians is a popular technique in summary statistics and in summarizing statistical data, since it is simple to understand and easy to calculate, while also giving a measure that is more robust in the presence of outlier values than the mean.

An optimality property: A median is also a central point that minimizes the average of the absolute deviations. That is, a value of c that minimizes E(|X − c|) is the median of the probability distribution of the random variable X.


Example I: Median of the Exponential Distribution

Let X have an exponential distribution with parameter λ. The probability density function of X is:

f(x) = λ e^{−λx} for x ≥ 0;   f(x) = 0 for x < 0

The median m solves:

∫_m^∞ f_X(x) dx = 1/2

∫_m^∞ λ e^{−λx} dx = [−e^{−λx}]_m^∞ = e^{−λm} = 1/2

That is, m = ln(2)/λ.
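A quick check of m = ln(2)/λ against a library median (Python sketch; λ = 0.5 is illustrative, and scipy parameterizes the exponential by scale = 1/λ):

# Median of the exponential: ln(2)/lambda.
import numpy as np
from scipy.stats import expon

lam = 0.5
print(np.log(2) / lam, expon(scale=1/lam).median())    # both about 1.386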

Example II: Median of the Pareto Distribution

Let X follow a Pareto distribution with shape parameter α and scale parameter x_s (usually denoted x_m). The pdf of X is:

f(x) = α x_s^α / x^{α+1} for x ≥ x_s;   f(x) = 0 for x < x_s

The median m solves ∫_m^∞ f_X(x) dx = 1/2:

∫_m^∞ α x_s^α x^{−(α+1)} dx = α x_s^α [x^{−α}/(−α)]_m^∞ = x_s^α m^{−α} = (x_s/m)^α = 1/2

⟹ m = x_s · 2^{1/α}

Note: The Pareto distribution is used to describe the distribution of wealth.
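A check of m = x_s·2^{1/α} for the Pareto (Python sketch; scipy calls the shape parameter b and the scale is passed as scale; the numbers are illustrative):

# Median of the Pareto: x_s * 2**(1/alpha).
from scipy.stats import pareto

alpha, x_s = 3.0, 1.5
print(x_s * 2**(1/alpha), pareto(b=alpha, scale=x_s).median())    # both about 1.890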


Moments of a Random Variable

The moments of a random variable X are used to describe the behavior of the RV (discrete or continuous).

Definition: kth Moment

Let X be a RV (discrete or continuous). Then the kth moment of X is:

μ_k = E[X^k] = Σ_x x^k p(x)   if X is discrete
μ_k = E[X^k] = ∫_{−∞}^{∞} x^k f(x) dx   if X is continuous

• The first moment of X, μ = μ₁ = E(X), is the center of gravity of the distribution of X.
• The higher moments give different information regarding the shape of the distribution of X.

Moments of a Random Variable

Definition: Central Moments

Let X be a RV (discrete or continuous). Then, the kth central moment of X is defined to be:

μ⁰_k = E[(X − μ)^k] = Σ_x (x − μ)^k p(x)   if X is discrete
μ⁰_k = E[(X − μ)^k] = ∫_{−∞}^{∞} (x − μ)^k f(x) dx   if X is continuous

where μ = μ₁ = E(X) = the first moment of X.

• The central moments describe how the probability distribution is distributed about the center of gravity, μ.


Moments of a Random Variable – 1st and 2nd

The first central moment is given by: μ⁰₁ = E[X − μ] = E(X) − μ = 0

The second central moment depends on the spread of the probability distribution of X about μ. It is called the variance of X and is denoted by the symbol var(X):

μ⁰₂ = E[(X − μ)²] = 2nd central moment

μ⁰₂ = E[(X − μ)²] is called the variance of X and is denoted by the symbol σ².

var(X) = E[(X − μ)²] = μ⁰₂ = σ²

Moments of a Random Variable – Skewness

• The third central moment μ⁰₃ = E[(X − μ)³] contains information about the skewness of a distribution.

• A popular measure of skewness: γ₁ = μ⁰₃/σ³ = μ⁰₃/(μ⁰₂)^{3/2}

• Distributions according to skewness:

1) Symmetric distribution: μ⁰₃ = 0, γ₁ = 0

[Figure: a symmetric density.]


Moments of a Random Variable – Skewness

2) Positively skewed distribution: μ⁰₃ > 0, γ₁ > 0

[Figure: a right-skewed density.]

3) Negatively skewed distribution: μ⁰₃ < 0, γ₁ < 0

[Figure: a left-skewed density.]

Moments of a Random Variable – Skewness

• Skewness and the pattern of gains and losses:
- Zero skew: symmetrical gains and losses.
- Positive skew suggests many small losses and a few rich returns.
- Negative skew indicates lots of minor wins offset by rare major losses.

• In financial markets, stock returns at the firm level show positive skewness, but stock returns at the aggregate (index) level show negative skewness.

• Evidence from horse race betting and from U.S. state lotteries supports the contention that gamblers are not necessarily risk-lovers but skewness-lovers: long shots are overbet (positive skewness is loved!).


Moments of a Random Variable – Kurtosis

0 EX 4  • The fourth central moment 4 

It contains information about the shape of a distribution. The property of shape that is measured by this moment is called kurtosis.

00 44 • The measure of (excess) kurtosis:  2  33   4 0 2  2 • Distributions: 1) Mesokurtic distribution 0 0.09  0, moderate in size 0.08  4 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0 5 10 15 20 25 30 35

Moments of a Random Variable – Kurtosis

2) Platykurtic distribution: γ₂ < 0, μ⁰₄ small in size

[Figure: a platykurtic density.]

3) Leptokurtic distribution: γ₂ > 0, μ⁰₄ large in size

[Figure: a leptokurtic density.]


Moments of a Random Variable

Example: The uniform distribution from 0 to 1

f(x) = 1 for 0 ≤ x ≤ 1;   f(x) = 0 for x < 0 or x > 1

Finding the moments:

μ_k = ∫ x^k f(x) dx = ∫_0^1 x^k dx = [x^{k+1}/(k+1)]_0^1 = 1/(k+1)

Finding the central moments:

μ⁰_k = ∫ (x − μ)^k f(x) dx = ∫_0^1 (x − ½)^k dx

Moments of a Random Variable

Finding the central moments (continuation):

μ⁰_k = ∫_0^1 (x − ½)^k dx;   making the substitution w = x − ½:

μ⁰_k = ∫_{−½}^{½} w^k dw = [w^{k+1}/(k+1)]_{−½}^{½} = [(½)^{k+1} − (−½)^{k+1}]/(k+1)

μ⁰_k = 1/((k+1) 2^k)   if k is even
μ⁰_k = 0   if k is odd


Moments of a Random Variable

Hence μ⁰₂ = 1/(3·2²) = 1/12,   μ⁰₃ = 0,   μ⁰₄ = 1/(5·2⁴) = 1/80

Thus, var(X) = μ⁰₂ = 1/12

The standard deviation σ = √var(X) = √(μ⁰₂) = 1/√12

The measure of skewness γ₁ = μ⁰₃/σ³ = 0

The measure of kurtosis γ₂ = μ⁰₄/σ⁴ − 3 = (1/80)/(1/12)² − 3 = 1.8 − 3 = −1.2
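These values can be confirmed against a library that reports mean, variance, skewness and excess kurtosis directly (Python sketch):

# Uniform(0,1): var = 1/12, skewness = 0, excess kurtosis = -1.2.
from scipy.stats import uniform

mean, var, skew, kurt = uniform(loc=0, scale=1).stats(moments='mvsk')
print(mean, var, skew, kurt)    # 0.5, 0.0833..., 0.0, -1.2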

Alternative measures of dispersion

When the median is used as a central measure for a distribution, there are several choices for a measure of variability:

- The range – the length of the smallest interval containing the data.
- The interquartile range – the difference between the 3rd and 1st quartiles.
- The mean absolute deviation – (1/n) Σ_i |x_i – central measure(X)|.
- The median absolute deviation (MAD) – MAD = median_i(|x_i − m(X)|).

These measures are more robust (to outliers) measures of scale than the sample variance or standard deviation. They behave especially well with distributions without a mean or variance, such as the Cauchy distribution.


Rules for Expectations

E[g(X)] = Σ_x g(x) p(x)   if X is discrete
E[g(X)] = ∫ g(x) f(x) dx   if X is continuous

• Rules:

1. E[c] = c, where c is a constant

Proof: if g(X) ≡ c, then E[g(X)] = E[c] = ∫ c f(x) dx = c ∫ f(x) dx = c.
The proof for discrete random variables is similar.

Rules for Expectations

2. E[aX + b] = a E[X] + b, where a, b are constants

Proof: if g(X) = aX + b, then

E[aX + b] = ∫ (ax + b) f(x) dx = a ∫ x f(x) dx + b ∫ f(x) dx = a E[X] + b

The proof for discrete random variables is similar.


Rules for Expectations

3. var(X) = μ⁰₂ = E[(X − μ)²] = E[X²] − μ² = μ₂ − μ₁²

Proof:

var(X) = E[(X − μ)²] = ∫ (x − μ)² f(x) dx
       = ∫ (x² − 2μx + μ²) f(x) dx
       = ∫ x² f(x) dx − 2μ ∫ x f(x) dx + μ² ∫ f(x) dx
       = E[X²] − 2μ·μ + μ² = E[X²] − μ² = μ₂ − μ₁²

The proof for discrete random variables is similar.

Rules for Expectations

4. var(aX + b) = a² var(X)

Proof:

aX b EaX b aE X b a b

var aX b E aX  b  2   aX b 

2 EaXb a b   Ea 2 X  2    aE22 X 2 avar X  
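An empirical illustration of rules 2 and 4 with simulated draws (Python sketch; the exponential draws and the constants a, b are arbitrary choices):

# Check E[aX + b] = aE[X] + b and var(aX + b) = a^2 var(X) by simulation.
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1_000_000)
a, b = 3.0, 5.0
print(np.mean(a * x + b), a * np.mean(x) + b)      # approximately equal
print(np.var(a * x + b), a**2 * np.var(x))         # approximately equal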


Moment generating functions

The expectation of a function g(X) is given by:

E[g(X)] = Σ_x g(x) p(x)   if X is discrete
E[g(X)] = ∫ g(x) f(x) dx   if X is continuous

Definition: Moment Generating Function (MGF)

Let X denote a random variable. Then, the moment generating function of X, m_X(t), is defined by:

m_X(t) = E[e^{tX}] = Σ_x e^{tx} p(x)   if X is discrete
m_X(t) = E[e^{tX}] = ∫ e^{tx} f(x) dx   if X is continuous


MGF: Examples

1. The Binomial distribution (parameters p, n)

n x nx px p 1 p  x  0,1,2,K , n x

The MGF of X , mX(t) is: tX tx mtX  Ee epx  x n txn x nx epp 1 x0 x nnx nntxnxnx   ep1 p  ab xx00xx  n abn  ept 1 p

MGF: Examples

2. The Poisson distribution (parameter λ)

p(x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, ...

The MGF of X , mX(t) is:

m_X(t) = E[e^{tX}] = Σ_x e^{tx} p(x) = Σ_{x=0}^∞ e^{tx} e^{−λ} λ^x / x!
       = e^{−λ} Σ_{x=0}^∞ (λ e^t)^x / x!
       = e^{−λ} e^{λ e^t} = e^{λ(e^t − 1)}     [using Σ_{x=0}^∞ u^x/x! = e^u]


MGF: Examples

3. The Exponential distribution (parameter λ)

f(x) = λ e^{−λx} for x ≥ 0;   f(x) = 0 for x < 0

The MGF of X, m_X(t), is:

m_X(t) = E[e^{tX}] = ∫ e^{tx} f(x) dx = ∫_0^∞ e^{tx} λ e^{−λx} dx = λ ∫_0^∞ e^{−(λ−t)x} dx
       = λ [−e^{−(λ−t)x}/(λ − t)]_0^∞ = λ/(λ − t)   for t < λ   (undefined for t ≥ λ)

MGF: Examples

4. The Standard Normal distribution (μ = 0, σ = 1)

f(x) = (1/√(2π)) e^{−x²/2}

The MGF of X , mX(t) is:

 mt EetX efxdx tx X    

 2 1  x   eedxtx 2  2

 2 1  xtx2   edx2  2


MGF: Examples

We will now use the fact that

(1/√(2πa)) ∫ e^{−(x−b)²/(2a)} dx = 1   for all a > 0 and any b.

We complete the square:

m_X(t) = (1/√(2π)) ∫ e^{−(x² − 2tx)/2} dx = e^{t²/2} (1/√(2π)) ∫ e^{−(x−t)²/2} dx

The last integral equals 1, so m_X(t) = e^{t²/2}.

MGF: Examples

5. The Gamma distribution (parameters α, λ)

f(x) = (λ^α/Γ(α)) x^{α−1} e^{−λx} for x ≥ 0;   f(x) = 0 for x < 0

The MGF of X, m_X(t), is:

m_X(t) = E[e^{tX}] = ∫ e^{tx} f(x) dx = ∫_0^∞ e^{tx} (λ^α/Γ(α)) x^{α−1} e^{−λx} dx
       = (λ^α/Γ(α)) ∫_0^∞ x^{α−1} e^{−(λ−t)x} dx


MGF: Examples

We use the fact that ∫_0^∞ (b^a/Γ(a)) x^{a−1} e^{−bx} dx = 1 for all a > 0, b > 0.

m_X(t) = (λ^α/Γ(α)) ∫_0^∞ x^{α−1} e^{−(λ−t)x} dx
       = (λ^α/Γ(α)) · (Γ(α)/(λ − t)^α) ∫_0^∞ ((λ − t)^α/Γ(α)) x^{α−1} e^{−(λ−t)x} dx     [the last integral equals 1]
       = (λ/(λ − t))^α   for t < λ

• The Chi-square distribution with ν degrees of freedom (α = ν/2, λ = ½):

m_X(t) = (1 − 2t)^{−ν/2}

MGF: Properties

1. mX(0) = 1

tX0 X mtXX Ee, hence m 0 Ee  E 1 1

Note: The MGFs of the following distributions satisfy the property

mX(0) = 1

i) Binomial Dist'n:  m_X(t) = (p e^t + 1 − p)^n
ii) Poisson Dist'n:  m_X(t) = e^{λ(e^t − 1)}
iii) Exponential Dist'n:  m_X(t) = λ/(λ − t)
iv) Std Normal Dist'n:  m_X(t) = e^{t²/2}
v) Gamma Dist'n:  m_X(t) = (λ/(λ − t))^α


MGF: Properties

2. m_X(t) = 1 + μ₁ t + μ₂ t²/2! + μ₃ t³/3! + ...

We use the expansion of the exponential function: e^u = 1 + u + u²/2! + u³/3! + ...

m_X(t) = E[e^{tX}] = E[1 + tX + (tX)²/2! + (tX)³/3! + ...]
       = 1 + t E[X] + (t²/2!) E[X²] + (t³/3!) E[X³] + ...
       = 1 + μ₁ t + μ₂ t²/2! + μ₃ t³/3! + ...

MGF: Properties

3. m_X^{(k)}(0) = μ_k,  where  m_X^{(k)}(0) = (d^k/dt^k) m_X(t) |_{t=0}

Now, differentiating the expansion in Property 2 term by term:

m′_X(t) = μ₁ + μ₂ t + μ₃ t²/2! + ...   and   m′_X(0) = μ₁

m″_X(t) = μ₂ + μ₃ t + μ₄ t²/2! + ...   and   m″_X(0) = μ₂

Continuing, we find m_X^{(k)}(0) = μ_k.


MGF: Applying Property 3 – Binomial

Property 3 is very useful in determining the moments of a RV X.

Examples:

i) Binomial Dist'n:  m_X(t) = (p e^t + 1 − p)^n

m′_X(t) = n (p e^t + 1 − p)^{n−1} p e^t

m′_X(0) = n (p + 1 − p)^{n−1} p = np

nn21 mtnpneppepeeppe  11  ttttt  1 X 

n  2 tt t t  npeeppnepepp11   1

ttn  2 t npeeppnepp11

'' 2 2 m X (0)  np[np  1  p]  np[np  q]  n p  npq   2
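Property 3 can also be checked symbolically by differentiating the binomial MGF (Python/sympy sketch; this merely repeats the derivation above by computer algebra):

# mu_1 = m'(0) and mu_2 = m''(0) for the binomial MGF (p e^t + 1 - p)^n.
import sympy as sp

t, p, n = sp.symbols('t p n', positive=True)
m = (p*sp.exp(t) + 1 - p)**n

mu1 = sp.simplify(sp.diff(m, t).subs(t, 0))
mu2 = sp.simplify(sp.diff(m, t, 2).subs(t, 0))
print(mu1)               # n*p
print(sp.expand(mu2))    # n**2*p**2 - n*p**2 + n*p  =  n^2 p^2 + npq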

MGF: Applying Property 3 – Poisson

 et 1 ii) Poisson Dist'n mtX  e

tt eet11 t   mtX  e e e

ttt et1121 t  2  e t et mtX  e e1 e  e

tt  2 et12  tt et  1 mtX  e e21 e e

tt 2 et12t  et 1 eee3

ttt 32et13 et 12  et 1 eee3 


MGF: Applying Property 3 – Poisson

To find the moments we set t = 0.

e0 10 1 meX 0  

00 22ee10   10 2 meX 0   e 

30 20t 0 3 2 3 meeeX 03 3

MGF: Applying Property 3 – Exponential

iii) Exponential Dist'n:  m_X(t) = λ/(λ − t)

m′_X(t) = d/dt [λ(λ − t)^{−1}] = λ(λ − t)^{−2}

m″_X(t) = 2 λ(λ − t)^{−3}

m‴_X(t) = (2)(3) λ(λ − t)^{−4} = 3! λ(λ − t)^{−4}

m⁗_X(t) = (2)(3)(4) λ(λ − t)^{−5} = 4! λ(λ − t)^{−5}

m_X^{(k)}(t) = k! λ(λ − t)^{−(k+1)}


MGF: Applying Property 3 – Exponential

Thus, 2 1 m 0      1 X  3 2 m 02   2 X  2

k 1 k! mkk 0!     kX  k

We can calculate the following popular descriptive statistics: 2 0 2 2 2 2 - σ = μ 2 = μ2 – μ = (2/λ ) – (1/λ) = (1/λ) 0 3 3 2 3/2 - γ1= μ 3 /σ = (2/λ ) / [(1/λ) ] = 2 0 4 4 4 - γ2= μ 4/σ – 3 = (9/λ ) / [(1/λ) ] – 3 = 6
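These descriptive statistics can be confirmed against a library implementation (Python sketch; scipy again uses scale = 1/λ, and the skewness and excess kurtosis do not depend on λ):

# Exponential: var = 1/lambda^2, skewness = 2, excess kurtosis = 6.
from scipy.stats import expon

lam = 0.5
mean, var, skew, kurt = expon(scale=1/lam).stats(moments='mvsk')
print(mean, var, skew, kurt)    # 2.0, 4.0, 2.0, 6.0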

MGF: Applying Property 3 – Exponential

Note: the moments for the exponential distribution can be calculated

in an alternative way. This is done by expanding m_X(t) in powers of t and equating the coefficients of t^k to the coefficients in m_X(t) = 1 + μ₁ t + μ₂ t²/2! + μ₃ t³/3! + ... :

m_X(t) = λ/(λ − t) = 1/(1 − t/λ) = 1 + u + u² + u³ + ...   (with u = t/λ)
       = 1 + (1/λ) t + (1/λ²) t² + (1/λ³) t³ + ...

Equating the coefficients of t^k we get:

μ_k/k! = 1/λ^k,   or   μ_k = k!/λ^k


MGF: Applying Property 3 – Normal

2 iv) Standard normal distribution mX(t)=exp(t /2)

We use the expansion of e^u: e^u = 1 + u + u²/2! + u³/3! + ..., so

m_X(t) = e^{t²/2} = 1 + (t²/2) + (t²/2)²/2! + (t²/2)³/3! + ... = Σ_k t^{2k}/(2^k k!)

We now equate the coefficients of t^k in this series and in m_X(t) = 1 + μ₁ t + μ₂ t²/2! + μ₃ t³/3! + ...

MGF: Applying Property 3 – Normal

If k is odd: μ_k = 0.

For even 2k: μ_{2k}/(2k)! = 1/(2^k k!),   or   μ_{2k} = (2k)!/(2^k k!)

Thus μ₁ = 0,  μ₂ = 2!/(2·1!) = 1,  μ₃ = 0,  μ₄ = 4!/(2²·2!) = 3


The log of Moment Generating Functions

Let lX (t) = ln mX(t) = the log of the MGF.

Then l_X(0) = ln m_X(0) = ln 1 = 0

l′_X(t) = m′_X(t)/m_X(t),   so   l′_X(0) = m′_X(0)/m_X(0) = μ₁/1 = μ

l″_X(t) = [m″_X(t) m_X(t) − (m′_X(t))²] / m_X(t)²,   so

l″_X(0) = [m″_X(0) m_X(0) − (m′_X(0))²] / m_X(0)² = μ₂ − μ₁² = σ²

The Log of Moment Generating Functions

Thus, l_X(t) = ln m_X(t) is very useful for calculating the mean and variance of a random variable:

1. l′_X(0) = μ
2. l″_X(0) = σ²
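A symbolic check of these two results, using the Gamma MGF as a convenient test case (Python/sympy sketch):

# l'(0) = alpha/lambda (the mean) and l''(0) = alpha/lambda^2 (the variance).
import sympy as sp

t, alpha, lam = sp.symbols('t alpha lam', positive=True)
m = (lam / (lam - t))**alpha          # Gamma MGF, valid for t < lambda
l = sp.log(m)

print(sp.simplify(sp.diff(l, t).subs(t, 0)))       # alpha/lam
print(sp.simplify(sp.diff(l, t, 2).subs(t, 0)))    # alpha/lam**2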


Log of MGF: Examples – Binomial

1. The Binomial distribution (parameters p, n)

m_X(t) = (p e^t + 1 − p)^n = (p e^t + q)^n,   where q = 1 − p

l_X(t) = ln m_X(t) = n ln(p e^t + q)

l′_X(t) = n p e^t/(p e^t + q),   so   l′_X(0) = n p/(p + q) = np = μ

l″_X(t) = n [p e^t (p e^t + q) − (p e^t)²] / (p e^t + q)²

l″_X(0) = n [p(p + q) − p²] / (p + q)² = npq = σ²

Log of MGF: Examples – Poisson

2. The Poisson distribution (parameter λ)

m_X(t) = e^{λ(e^t − 1)}

l_X(t) = ln m_X(t) = λ(e^t − 1)

l′_X(t) = λ e^t,   so   l′_X(0) = λ = μ

l″_X(t) = λ e^t,   so   l″_X(0) = λ = σ²


Log of MGF: Examples – Exponential

3. The Exponential distribution (parameter λ)

m_X(t) = λ/(λ − t) for t < λ   (undefined for t ≥ λ)

l_X(t) = ln m_X(t) = ln λ − ln(λ − t),   if t < λ

l′_X(t) = 1/(λ − t)

l″_X(t) = 1/(λ − t)²

Thus μ = l′_X(0) = 1/λ   and   σ² = l″_X(0) = 1/λ²

Log of MGF: Examples – Normal

4. The Standard Normal distribution (μ = 0, σ = 1)

m_X(t) = e^{t²/2}

l_X(t) = ln m_X(t) = t²/2

l′_X(t) = t,   l″_X(t) = 1

Thus μ = l′_X(0) = 0   and   σ² = l″_X(0) = 1


Log of MGF: Examples – Gamma

5. The Gamma distribution (parameters α, λ)

m_X(t) = (λ/(λ − t))^α

l_X(t) = ln m_X(t) = α[ln λ − ln(λ − t)]

l′_X(t) = α/(λ − t)

l″_X(t) = α/(λ − t)²

Hence μ = l′_X(0) = α/λ   and   σ² = l″_X(0) = α/λ²

Log of MGF: Examples – Chi-squared

6. The Chi-square distribution (ν degrees of freedom)

m_X(t) = (1 − 2t)^{−ν/2}

l_X(t) = ln m_X(t) = −(ν/2) ln(1 − 2t)

l′_X(t) = (ν/2) · 2/(1 − 2t) = ν/(1 − 2t)

l″_X(t) = 2ν/(1 − 2t)²

Hence μ = l′_X(0) = ν   and   σ² = l″_X(0) = 2ν


Characteristic functions

Definition: Let X denote a random variable. Then, the characteristic function of X,

φ_X(t), is defined by: φ_X(t) = E(e^{itX})

Since e^{itx} = cos(tx) + i sin(tx) and |e^{itx}| ≤ 1, φ_X(t) is defined for all t. Thus, the characteristic function always exists, but the MGF need not exist.

Relation to the MGF: φ_X(t) = m_{iX}(t) = m_X(it)

Calculation of moments: ∂^k φ_X(t)/∂t^k |_{t=0} = i^k μ_k
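A symbolic check of this moment formula, using the exponential distribution's characteristic function φ_X(t) = m_X(it) = λ/(λ − it) (Python/sympy sketch):

# d^k phi/dt^k at t = 0 equals i^k mu_k; here mu_1 = 1/lambda and mu_2 = 2/lambda^2.
import sympy as sp

t = sp.symbols('t', real=True)
lam = sp.symbols('lam', positive=True)
phi = lam / (lam - sp.I * t)

mu1 = sp.simplify(sp.diff(phi, t).subs(t, 0) / sp.I)        # 1/lam
mu2 = sp.simplify(sp.diff(phi, t, 2).subs(t, 0) / sp.I**2)  # 2/lam**2
print(mu1, mu2)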
