
NPTEL - Probability and Distributions

MODULE 3: FUNCTION OF A RANDOM VARIABLE AND ITS DISTRIBUTION

LECTURE 16

Topics:
3.5 PROBABILITY INEQUALITIES
    3.5.3 Jensen's Inequality
    3.5.4 AM-GM-HM Inequality
3.6 DESCRIPTIVE MEASURES OF PROBABILITY DISTRIBUTIONS
    3.6.1 Measures of Central Tendency: 3.6.1.1 Mean; 3.6.1.2 Median; 3.6.1.3 Mode
    3.6.2 Measures of Dispersion: 3.6.2.1 Standard Deviation; 3.6.2.2 Mean Deviation; 3.6.2.3 Quartile Deviation; 3.6.2.4 Coefficient of Variation
    3.6.3 Measures of Skewness
    3.6.4 Measures of Kurtosis

3.5.3 Jensen's Inequality

Theorem 5.2

Let $I \subseteq \mathbb{R}$ be an interval and let $\phi: I \to \mathbb{R}$ be a twice differentiable function such that its second order derivative $\phi''(\cdot)$ is continuous on $I$ and $\phi''(x) \geq 0,\ \forall x \in I$. Let $X$ be a random variable with $S_X \subseteq I$ and finite expectation. Then

$$E(\phi(X)) \geq \phi(E(X)).$$

If $\phi''(x) > 0,\ \forall x \in I$, then the inequality above is strict unless $X$ is a degenerate random variable.

Dept. of Mathematics and Statistics Indian Institute of Technology, Kanpur 1


Proof. Let $\mu = E(X)$. Expanding $\phi(x)$ into a Taylor series about $\mu$ we get

$$\phi(x) = \phi(\mu) + (x-\mu)\phi'(\mu) + \frac{(x-\mu)^2}{2!}\phi''(\xi),\ \forall x \in I,$$

for some $\xi$ between $\mu$ and $x$. Since $\phi''(x) \geq 0,\ \forall x \in I$, it follows that

$$\phi(x) \geq \phi(\mu) + (x-\mu)\phi'(\mu),\ \forall x \in I \qquad (5.2)$$

$$\Rightarrow \phi(X) \geq \phi(\mu) + (X-\mu)\phi'(\mu)$$

$$\Rightarrow E(\phi(X)) \geq \phi(\mu) + \phi'(\mu)\,E(X-\mu) = \phi(\mu) = \phi(E(X)).$$

Clearly the inequality in (5.2) is strict unless $E[(X-\mu)^2] = 0$, i.e., $P(X = \mu) = 1$ (using Corollary 3.1 (ii)). ▄

Example 5.3

Let $X$ be a random variable with support $S_X$, an interval in $\mathbb{R}$. Then

(i) $E(X^2) \geq (E(X))^2$ (taking $\phi(x) = x^2$, $x \in \mathbb{R}$, in Theorem 5.2);
(ii) $E(\ln X) \leq \ln(E(X))$, provided $S_X \subseteq (0, \infty)$ (taking $\phi(x) = -\ln x$, $x \in (0,\infty)$, in Theorem 5.2);
(iii) $E(e^{-X}) \geq e^{-E(X)}$ (taking $\phi(x) = e^{-x}$, $x \in \mathbb{R}$, in Theorem 5.2);
(iv) $E\left(\frac{1}{X}\right) \geq \frac{1}{E(X)}$, if $S_X \subseteq (0, \infty)$ (taking $\phi(x) = \frac{1}{x}$, $x \in (0,\infty)$, in Theorem 5.2);

provided the involved expectations exist. ▄
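The four consequences above can be checked numerically. The sketch below uses a small discrete random variable whose values and probabilities are hypothetical choices, not from the text:

```python
import math

# Hypothetical discrete random variable: values and probabilities are illustrative.
xs = [1.0, 2.0, 4.0]
ps = [0.2, 0.5, 0.3]

def expect(f):
    """E[f(X)] for the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(xs, ps))

EX = expect(lambda x: x)

# (i)  E[X^2] >= (E[X])^2          (phi(x) = x^2 is convex)
assert expect(lambda x: x * x) >= EX ** 2
# (ii) E[ln X] <= ln E[X]          (phi(x) = -ln x is convex on (0, inf))
assert expect(math.log) <= math.log(EX)
# (iii) E[e^{-X}] >= e^{-E[X]}     (phi(x) = e^{-x} is convex)
assert expect(lambda x: math.exp(-x)) >= math.exp(-EX)
# (iv) E[1/X] >= 1/E[X]            (phi(x) = 1/x is convex on (0, inf))
assert expect(lambda x: 1.0 / x) >= 1.0 / EX
print("all four Jensen consequences hold")
```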

Definition 5.2

Let $X$ be a random variable with support $S_X \subseteq (0, \infty)$. Then, provided they are finite, $E(X)$ is called the arithmetic mean ($AM$) of $X$, $e^{E(\ln X)}$ is called the geometric mean ($GM$) of $X$, and $\frac{1}{E(1/X)}$ is called the harmonic mean ($HM$) of $X$. ▄

3.5.4 AM-GM-HM Inequality

Example 5.4

(i) Let $X$ be a random variable with support $S_X \subseteq (0, \infty)$. Then

$$E(X) \geq e^{E(\ln X)} \geq \frac{1}{E\left(\frac{1}{X}\right)},$$

provided the expectations are finite.

(ii) Let $a_1, \ldots, a_n$ be positive real constants and let $p_1, \ldots, p_n$ be another set of positive real constants such that $\sum_{i=1}^{n} p_i = 1$. Then

$$\sum_{i=1}^{n} a_i p_i \geq \prod_{i=1}^{n} a_i^{p_i} \geq \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}.$$

Solution.

(i) From Example 5.3 (ii) we have

$$\ln(E(X)) \geq E(\ln X)$$

$$\Rightarrow E(X) \geq e^{E(\ln X)}. \qquad (5.3)$$

Using (5.3) on $\frac{1}{X}$, we get

$$E\left(\frac{1}{X}\right) \geq e^{E\left(\ln \frac{1}{X}\right)} = e^{-E(\ln X)}$$

$$\Rightarrow e^{E(\ln X)} \geq \frac{1}{E\left(\frac{1}{X}\right)}. \qquad (5.4)$$

The assertion now follows on combining (5.3) and (5.4).

(ii) Let $X$ be a discrete type random variable with support $S_X = \{a_1, \ldots, a_n\}$ and $P(X = a_i) = p_i$, $i = 1, \ldots, n$. Clearly, $P(X = x) > 0,\ \forall x \in S_X$, and $\sum_{x \in S_X} P(X = x) = \sum_{i=1}^{n} p_i = 1$. On using (i), we get

$$E(X) \geq e^{E(\ln X)} \geq \frac{1}{E\left(\frac{1}{X}\right)}$$

$$\Rightarrow \sum_{i=1}^{n} a_i p_i \geq e^{\sum_{i=1}^{n} (\ln a_i) p_i} \geq \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}$$

$$\Rightarrow \sum_{i=1}^{n} a_i p_i \geq e^{\ln \prod_{i=1}^{n} a_i^{p_i}} \geq \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}$$

$$\Rightarrow \sum_{i=1}^{n} a_i p_i \geq \prod_{i=1}^{n} a_i^{p_i} \geq \frac{1}{\sum_{i=1}^{n} \frac{p_i}{a_i}}.$$
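Part (ii) is easy to illustrate numerically. In the sketch below the constants $a_i$ and weights $p_i$ are arbitrary illustrative choices:

```python
import math

# Positive constants a_i with weights p_i summing to 1 (illustrative values).
a = [2.0, 3.0, 8.0]
p = [0.5, 0.3, 0.2]

am = sum(pi * ai for ai, pi in zip(a, p))         # weighted arithmetic mean
gm = math.prod(ai ** pi for ai, pi in zip(a, p))  # weighted geometric mean
hm = 1.0 / sum(pi / ai for ai, pi in zip(a, p))   # weighted harmonic mean

# The AM-GM-HM chain of Example 5.4 (ii)
assert am >= gm >= hm
print(am, gm, hm)
```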

3.6 DESCRIPTIVE MEASURES OF PROBABILITY DISTRIBUTIONS

Let $X$ be a random variable defined on a probability space $(\Omega, \mathcal{F}, P)$, associated with a random experiment $\mathcal{E}$. Let $F_X$ and $f_X$ denote, respectively, the distribution function and the p.d.f./p.m.f. of $X$. The probability distribution (i.e., the distribution function/p.d.f./p.m.f.) of $X$ describes the manner in which the random variable $X$ takes values in different Borel sets. It may be desirable to have a set of numerical measures that provide a summary of the prominent features of the probability distribution of $X$. We call these measures descriptive measures. Four prominently used descriptive measures are: (i) measures of central tendency (or location), also referred to as averages; (ii) measures of dispersion; (iii) measures of skewness; and (iv) measures of kurtosis.

3.6.1 Measures of Central Tendency

A measure of central tendency or location (also called an average) gives us an idea of the central value of the probability distribution, around which the values of the random variable are clustered. Three commonly used measures of central tendency are the mean, the median and the mode.

3.6.1.1 Mean.

Recall (Definition 3.2 (i)) that the mean (of the probability distribution) of a random variable $X$ is given by $\mu_1' = E(X)$. We have seen that the mean of a probability distribution gives us an idea of the average observed value of $X$ in the long run (i.e., the average of the observed values of $X$ when the random experiment is repeated a large number of times). The mean seems to be the best suited average if the distribution is symmetric about a point $\mu$ (i.e., $X - \mu \stackrel{d}{=} \mu - X$, in which case $\mu = E(X)$, provided it is finite), values in the neighborhood of $\mu$ occur with high probability, and $f_X(\cdot)$ decreases as we move away from $\mu$ in either direction. Because of its simplicity the mean is the most commonly used average (especially for symmetric or nearly symmetric distributions). Some of the demerits of this measure are that in some situations it may not be defined (Examples 3.2 and 3.4) and that it is very sensitive to the presence of a few extreme values of $X$ which are different from the other values of $X$ (even though they may occur with small positive probabilities). So this measure should be used with caution if the probability distribution assigns positive probabilities to a few Borel sets containing some extreme values.

3.6.1.2 Median.

A real number $m$ satisfying

$$F_X(m-) \leq \frac{1}{2} \leq F_X(m), \quad \text{i.e.,} \quad P(X < m) \leq \frac{1}{2} \leq P(X \leq m),$$

is called the median (of the probability distribution) of $X$. Clearly if $m$ is the median of a probability distribution then, in the long run (i.e., when the random experiment $\mathcal{E}$ is repeated a large number of times), the values of $X$ on either side of $m$ in $S_X$ are observed with the same frequency. Thus the median of a probability distribution, in some sense, divides $S_X$ into two parts each having the same probability of occurrence. It is evident that if $X$ is of continuous type then the median $m$ is given by $F_X(m) = 1/2$. For some distributions (especially for distributions of discrete type random variables) it may happen that $F_X(a-) < \frac{1}{2}$ and $\{x \in \mathbb{R}: F_X(x) = 1/2\} = [a, b)$, for some $-\infty < a < b < \infty$, so that the median is not unique. In that case $P(X = x) = 0,\ \forall x \in (a, b)$, and thus we take the median to be $m = a = \inf\{x \in \mathbb{R}: F_X(x) \geq 1/2\}$. For random variables having a symmetric probability distribution it is easy to verify that the mean and the median coincide (see Problem 33). Unlike the mean, the median of a probability distribution is always defined. Moreover the median is not affected by a few extreme values, as it takes into account only the probabilities with which different values occur and not their numerical values. As a measure of central tendency the median is preferred over the mean if the distribution is asymmetric and a few extreme observations are assigned positive probabilities. However the fact that the median does not at all take into account the numerical values of $X$ is one of its demerits. Another disadvantage of the median is that for many probability distributions it is not easy to evaluate (especially for distributions whose distribution functions $F_X(\cdot)$ do not have a closed form).
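The convention $m = \inf\{x: F_X(x) \geq 1/2\}$ can be illustrated with a small discrete distribution (hypothetical values) for which $F_X$ equals $1/2$ on a whole interval:

```python
from fractions import Fraction

# Hypothetical discrete p.m.f. for which F_X(x) = 1/2 on the interval [1, 3),
# so the convention m = inf{x : F_X(x) >= 1/2} is needed to pin down the median.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 3: Fraction(1, 2)}

def cdf(x):
    """F_X(x) = P(X <= x)."""
    return sum(p for v, p in pmf.items() if v <= x)

# m = inf{x : F_X(x) >= 1/2}; for a discrete distribution the infimum is
# attained at a support point, so it suffices to scan the support.
m = min(v for v in pmf if cdf(v) >= Fraction(1, 2))

assert m == 1                      # F_X(1) = 1/2 exactly
assert cdf(2) == Fraction(1, 2)    # F_X stays at 1/2 on [1, 3)
print("median =", m)
```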

3.6.1.3 Mode.

Roughly speaking, the mode $m_0$ of a probability distribution is the value that occurs with the highest probability; it is defined by $f_X(m_0) = \sup\{f_X(x): x \in S_X\}$. Clearly if $m_0$ is the mode of the probability distribution of $X$ then, in the long run, either $m_0$ or a value in the neighborhood of $m_0$ is observed with maximum frequency. The mode is easy to understand and easy to calculate; normally it can be found by mere inspection. Note that a probability distribution may have more than one mode, and the modes may be far apart. Moreover the mode takes into account neither the numerical values of $X$ nor the probabilities associated with all the values of $X$. These are crucial deficiencies which make the mode less preferable than the mean and the median. A probability distribution with one (two/three) mode(s) is called a unimodal (bimodal/trimodal) distribution. A distribution with multiple modes is called a multimodal distribution.


3.6.2 Measures of Dispersion

Measures of central tendency give us an idea about the location of only the central part of the distribution; other measures are often needed to describe a probability distribution. The values assumed by a random variable $X$ usually differ from each other. The usefulness of the mean or the median as an average depends very much on the variability (or dispersion) of the values of $X$ around the mean or median. A probability distribution (or the corresponding random variable $X$) is said to have high dispersion if its support contains many values that are significantly higher or lower than the mean or median value. Some of the commonly used measures of dispersion are the standard deviation, the mean deviation, the quartile deviation (or semi-inter-quartile range) and the coefficient of variation.

3.6.2.1 Standard Deviation.

Recall (Definition 3.2 (v)) that the variance (of the probability distribution) of a random variable $X$ is defined by $\sigma^2 = \mu_2 = E[(X-\mu)^2]$, where $\mu = E(X)$ is the mean (of the probability distribution) of $X$. The standard deviation (of the probability distribution) of $X$ is defined by $\sigma = \sqrt{\mu_2} = \sqrt{E[(X-\mu)^2]}$. Clearly the variance and the standard deviation give us an idea of the average spread of the values of $X$ around the mean $\mu$. However, unlike the variance, the unit of measurement of the standard deviation is the same as that of $X$. Because of its simplicity and intuitive appeal, the standard deviation is the most widely used measure of dispersion. Some of the demerits of the standard deviation are that in many situations it may not be defined (distributions for which the second moment is not finite) and that it is sensitive to the presence of a few extreme values of $X$ which are different from the other values. A justification for having the mean $\mu$, in place of the median or any other average, in the definition $\sigma = \sqrt{E[(X-\mu)^2]}$ is that $E[(X-\mu)^2] \leq E[(X-c)^2],\ \forall c \in \mathbb{R}$ (Problem 32).
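The minimization property cited above (Problem 32) is easy to check numerically; the discrete distribution below is an illustrative choice:

```python
# The mean minimizes the mean squared deviation: E(X - mu)^2 <= E(X - c)^2 for
# every c (Problem 32). Quick check on a hypothetical discrete distribution.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 5: 0.2}
mu = sum(p * v for v, p in pmf.items())

def msd(c):
    """E[(X - c)^2] for the distribution above."""
    return sum(p * (v - c) ** 2 for v, p in pmf.items())

for k in range(-50, 51):            # scan c over [-5, 5] in steps of 0.1
    assert msd(mu) <= msd(k / 10) + 1e-12
print("minimum mean squared deviation attained at mu =", mu)
```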

3.6.2.2 Mean Deviation.

Let $A$ be a suitable average. The mean deviation (of the probability distribution) of $X$ about the average $A$ is defined by $\mathrm{MD}(A) = E|X - A|$. Among the various mean deviations, the mean deviation about the median $m$ is preferred over the others. A reason for this preference is the fact that, for any random variable $X$, $\mathrm{MD}(m) = E|X - m| \leq E|X - c| = \mathrm{MD}(c),\ \forall c \in \mathbb{R}$ (Problem 24). Since a natural distance between $X$ and $m$ is $|X - m|$, as a measure of dispersion the mean deviation about the median seems more appealing than the standard deviation. Although the mean deviation about the median (or the mean) has more intuitive appeal than the standard deviation, in most situations it is not easy to evaluate. Some of the other demerits of mean deviations are that in many situations they may not be defined and that they are sensitive to the presence of a few extreme values of $X$ which are different from the other values.
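Similarly, the fact that the median minimizes the mean absolute deviation (Problem 24) can be checked on a small example; the p.m.f. below is hypothetical:

```python
# The median minimizes the mean absolute deviation: E|X - m| <= E|X - c| for
# every c (Problem 24). Illustrative discrete distribution with F_X(1) = 1/2,
# so m = 1 under the convention m = inf{x : F_X(x) >= 1/2}.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 5: 0.2}
m = 1

def mad(c):
    """E|X - c| for the distribution above."""
    return sum(p * abs(v - c) for v, p in pmf.items())

for k in range(-50, 51):            # scan c over [-5, 5] in steps of 0.1
    assert mad(m) <= mad(k / 10) + 1e-12
print("MD(m) =", mad(m))
```

(For this distribution every point of $[1, 2]$ is a median, and $E|X - c|$ is constant on that interval, which is why a small tolerance appears in the comparison.)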


3.6.2.3 Quartile Deviation.

A common drawback of the standard deviation and the mean deviations, as measures of dispersion, is that they are sensitive to the presence of a few extreme values of $X$. The quartile deviation measures the spread in the middle half of the distribution and is therefore not influenced by extreme values. Let $q_1$ and $q_3$ be real numbers such that

$$F_X(q_1-) \leq \frac{1}{4} \leq F_X(q_1) \quad \text{and} \quad F_X(q_3-) \leq \frac{3}{4} \leq F_X(q_3),$$

i.e.,

$$P(X < q_1) \leq \frac{1}{4} \leq P(X \leq q_1) \quad \text{and} \quad P(X < q_3) \leq \frac{3}{4} \leq P(X \leq q_3).$$

The quantities $q_1$ and $q_3$ are called, respectively, the lower and upper quartiles of the probability distribution of the random variable $X$. Clearly if $q_1$, $m$ and $q_3$ are respectively the lower quartile, the median and the upper quartile of a probability distribution then they divide the probability distribution into four parts so that, in the long run (i.e., when the random experiment $\mathcal{E}$ is repeated a large number of times), twenty five percent of the observed values of $X$ are expected to be less than $q_1$, fifty percent of the observed values of $X$ are expected to be less than $m$, and seventy five percent of the observed values of $X$ are expected to be less than $q_3$. The quantity $\mathrm{IQR} = q_3 - q_1$ is called the inter-quartile range of the probability distribution of $X$, and the quantity $\mathrm{QD} = \frac{q_3 - q_1}{2}$ is called the quartile deviation or the semi-inter-quartile range of the probability distribution of $X$. It can be seen that if $X$ is of absolutely continuous type then $q_1$ and $q_3$ are given by $F_X(q_1) = \frac{1}{4}$ and $F_X(q_3) = \frac{3}{4}$. For some distributions (especially for distributions of discrete type random variables) it may happen that $F_X(a-) < \frac{1}{4}$ and/or $\{x \in \mathbb{R}: F_X(x) = 1/4\} = [a, b)$ ($F_X(c-) < \frac{3}{4}$ and/or $\{x \in \mathbb{R}: F_X(x) = 3/4\} = [c, d)$) for some $-\infty < a < b < \infty$ ($-\infty < c < d < \infty$), so that $q_1$ ($q_3$) is not uniquely defined. In that case $P(X = x) = 0,\ \forall x \in (a, b)$ ($x \in (c, d)$), and thus we take $q_1 = a = \inf\{x \in \mathbb{R}: F_X(x) \geq 1/4\}$ ($q_3 = c = \inf\{x \in \mathbb{R}: F_X(x) \geq 3/4\}$). For random variables having a symmetric probability distribution it is easy to verify that $m = (q_1 + q_3)/2$ (Problem 33). Although, unlike the standard deviation and the mean deviation, the quartile deviation is not sensitive to the presence of some extreme values of $X$, a major drawback of the quartile deviation is that it ignores the tails of the probability distribution (which constitute 50% of the probability distribution). Note that the quartile deviation depends on the units of measurement of the random variable $X$ and thus may not be an appropriate measure for comparing the dispersions of two different probability distributions. For comparing the dispersions of two different probability distributions a normalized measure such as


$$\mathrm{CQD} = \frac{\frac{q_3 - q_1}{2}}{\frac{q_3 + q_1}{2}} = \frac{q_3 - q_1}{q_3 + q_1}$$

seems to be more appropriate. The quantity CQD is called the coefficient of quartile deviation of the probability distribution of $X$. Clearly the coefficient of quartile deviation is independent of units and thus can be used to compare the dispersions of two different probability distributions.
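The quantities $q_1$, $q_3$, IQR, QD and CQD can be computed for a discrete distribution using the inf-convention for quartiles; the p.m.f. below is an illustrative example:

```python
# Quartiles via q = inf{x : F_X(x) >= p}, illustrated on a hypothetical p.m.f.
pmf = {1: 0.2, 2: 0.3, 4: 0.3, 7: 0.2}

def quantile(p):
    """inf{x in S_X : F_X(x) >= p} for the discrete distribution above."""
    total = 0.0
    for v in sorted(pmf):
        total += pmf[v]
        if total >= p:
            return v

q1, q3 = quantile(0.25), quantile(0.75)
iqr = q3 - q1                   # inter-quartile range
qd = iqr / 2                    # quartile deviation (semi-inter-quartile range)
cqd = (q3 - q1) / (q3 + q1)     # coefficient of quartile deviation (unit-free)

assert (q1, q3) == (2, 4)
print("IQR =", iqr, "QD =", qd, "CQD =", cqd)
```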

3.6.2.4 Coefficient of Variation.

Like the quartile deviation, the standard deviation $\sigma$ also depends on the units of measurement of the random variable $X$ and thus is not an appropriate measure for comparing the dispersions of two different probability distributions. Let $\mu$ and $\sigma$, respectively, be the mean and the standard deviation of the distribution of $X$, and suppose that $\mu \neq 0$. The coefficient of variation of the probability distribution of $X$ is defined by

$$\mathrm{CV} = \frac{\sigma}{\mu}.$$

Clearly the coefficient of variation measures the variation per unit of mean and is independent of units. Therefore it seems to be an appropriate measure for comparing the dispersions of two different probability distributions. A disadvantage of the coefficient of variation is that, when the mean $\mu$ is close to zero, it is very sensitive to small changes in the mean.

3.6.3 Measures of Skewness

Skewness of a probability distribution is a measure of its asymmetry (or lack of symmetry). Recall that the probability distribution of a random variable $X$ is said to be symmetric about a point $\mu$ if $X - \mu \stackrel{d}{=} \mu - X$. In that case $\mu = E(X)$ (provided it exists) and $f_X(\mu + x) = f_X(\mu - x),\ \forall x \in \mathbb{R}$. Evidently, for symmetric distributions, the shape of the p.d.f./p.m.f. on the left of $\mu$ is the mirror image of that on the right side of $\mu$. It can be shown that, for a symmetric distribution, the mean and the median coincide (Problem 33). We say that a probability distribution is positively skewed if the tail on the right side of the p.d.f./p.m.f. is longer than that on the left side and the bulk of the values lie on the left side of the mean. Clearly a positively skewed distribution indicates the presence of a few high values of $X$ which pull up the value of the mean, resulting in a mean larger than the median and the mode. For a unimodal positively skewed distribution we normally have Mode < Median < Mean. Similarly we say that a probability distribution is negatively skewed if the tail on the left side of the p.d.f./p.m.f. is longer than that on the right side and the bulk of the values lie on the right side of the mean. Clearly a negatively skewed distribution indicates the presence of a few low values of $X$ which pull down the value of the mean, resulting in a mean smaller than the median and the mode. For a unimodal negatively skewed distribution we normally have Mean < Median < Mode. Let $\mu$ and $\sigma$, respectively, be the mean and the standard deviation of the distribution of $X$ and let $Z = (X - \mu)/\sigma$ be the standardized random variable. A measure of skewness of the probability distribution of $X$ is defined by

$$\beta_1 = E(Z^3) = \frac{E[(X-\mu)^3]}{\sigma^3} = \frac{\mu_3}{\mu_2^{3/2}}.$$

The quantity $\beta_1$ is simply called the coefficient of skewness. Clearly for symmetric distributions $\beta_1 = 0$ (Theorem 4.3 (vi)). However the converse may not be true, i.e., there are examples of skewed probability distributions for which $\beta_1 = 0$. A large positive value of $\beta_1$ indicates that the distribution is positively skewed, and a large negative value of $\beta_1$ indicates that the distribution is negatively skewed. A measure of skewness can also be based on quartiles. Let $q_1$, $m$, $q_3$ and $\mu$ denote respectively the lower quartile, the median, the upper quartile and the mean of the probability distribution of $X$. We know that for random variables having a symmetric probability distribution $\mu = m = \frac{q_1 + q_3}{2}$, i.e., $q_3 - m = m - q_1$.

For a positively (negatively) skewed distribution we will have $q_3 - m > (<)\ m - q_1$. Thus one may also define a measure of skewness based on $(q_3 - m) - (m - q_1) = q_3 - 2m + q_1$. To make this quantity independent of units one may consider

$$\beta_2 = \frac{q_3 - 2m + q_1}{q_3 - q_1}$$

as a measure of skewness. The quantity $\beta_2$ is called the Yule coefficient of skewness.
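Both skewness measures can be computed for a discrete distribution with a long right tail; the p.m.f. below is a hypothetical example, and both coefficients should come out positive:

```python
# Moment-based and quartile-based skewness for a hypothetical positively
# skewed p.m.f. (a few large values of X pull the mean up).
pmf = {0: 0.25, 1: 0.3, 3: 0.25, 9: 0.2}

def expect(f):
    return sum(p * f(v) for v, p in pmf.items())

mu = expect(lambda x: x)
sigma = expect(lambda x: (x - mu) ** 2) ** 0.5
beta1 = expect(lambda x: (x - mu) ** 3) / sigma ** 3   # coefficient of skewness

def quantile(p):
    """inf{x in S_X : F_X(x) >= p}."""
    total = 0.0
    for v in sorted(pmf):
        total += pmf[v]
        if total >= p:
            return v

q1, m, q3 = quantile(0.25), quantile(0.5), quantile(0.75)
beta2 = (q3 - 2 * m + q1) / (q3 - q1)                  # Yule coefficient

assert beta1 > 0 and beta2 > 0     # long right tail -> positive skew
print("beta1 =", beta1, "beta2 =", beta2)
```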

3.6.4 Measures of Kurtosis

For real constants $\mu \in \mathbb{R}$ and $\sigma > 0$, let $Y_{\mu,\sigma}$ be a random variable having the p.d.f.

$$f_{Y_{\mu,\sigma}}(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty,$$

i.e., $Y_{\mu,\sigma} \sim N(\mu, \sigma^2)$ (see Example 4.2 and the discussion following it). We have seen (Example 4.2 (iii)) that $\mu$ and $\sigma^2$ are respectively the mean and the variance of the distribution of $Y_{\mu,\sigma}$. We call the probability distribution corresponding to the p.d.f. $f_{Y_{\mu,\sigma}}$ the normal distribution with mean $\mu$ and variance $\sigma^2$ (denoted by $N(\mu, \sigma^2)$). We know that the $N(\mu, \sigma^2)$ distribution is symmetric about $\mu$ (cf. Example 4.2 (ii)). Also it is easy to verify that the $N(\mu, \sigma^2)$ distribution is unimodal with $\mu$ as the common value of the mean, median and mode. Kurtosis of the probability distribution of $X$ is a measure of the peakedness of the p.d.f. of $X$ and the thickness of its tails, relative to the peakedness and tail thickness of the p.d.f. of a normal distribution. A distribution is said to have higher (lower) kurtosis than the normal distribution if its p.d.f., in comparison with the p.d.f. of a normal distribution, has a sharper (more rounded) peak and longer, fatter (shorter, thinner) tails. Let $\mu$ and $\sigma$, respectively, be the mean and the standard deviation of the distribution of $X$ and let $Z = (X - \mu)/\sigma$ be the standardized random variable. A measure of kurtosis of the probability distribution of $X$ is defined by

$$\gamma_1 = E(Z^4) = \frac{E[(X-\mu)^4]}{\sigma^4} = \frac{\mu_4}{\mu_2^2}.$$

The quantity $\gamma_1$ is simply called the kurtosis of the probability distribution of $X$. It is easy to show that, for any values of $\mu \in \mathbb{R}$ and $\sigma > 0$, the kurtosis of the $N(\mu, \sigma^2)$ distribution is $\gamma_1 = 3$ (use Example 4.2 (iii)). The quantity

$$\gamma_2 = \gamma_1 - 3$$

is called the excess kurtosis of the distribution of $X$. It is clear that for normal distributions the excess kurtosis is zero. Distributions with zero excess kurtosis are called mesokurtic. A distribution with positive (negative) excess kurtosis is called leptokurtic (platykurtic). Clearly a leptokurtic (platykurtic) distribution has a sharper (more rounded) peak and longer, fatter (shorter, thinner) tails.
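A short sketch (with two toy distributions, not from the text) showing how $\gamma_2 = \mu_4/\mu_2^2 - 3$ separates platykurtic from leptokurtic behavior:

```python
# Excess kurtosis gamma2 = mu4 / mu2^2 - 3 for two simple hypothetical p.m.f.s.
def gamma2(pmf):
    mu = sum(p * v for v, p in pmf.items())
    mu2 = sum(p * (v - mu) ** 2 for v, p in pmf.items())
    mu4 = sum(p * (v - mu) ** 4 for v, p in pmf.items())
    return mu4 / mu2 ** 2 - 3.0

# Symmetric two-point distribution: Z^4 = 1 always, so gamma2 = -2 (platykurtic).
assert abs(gamma2({-1: 0.5, 1: 0.5}) - (-2.0)) < 1e-12
# Heavy mass at 0 with rare extreme values: fat tails, gamma2 > 0 (leptokurtic).
assert gamma2({-10: 0.01, 0: 0.98, 10: 0.01}) > 0
print("platykurtic and leptokurtic examples verified")
```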


Figure 6.1. Distribution symmetric about the mean $\mu$

Figure 6.2 (a). Unimodal distribution

Figure 6.2 (b). Bimodal distribution


Figure 6.2 (c). Trimodal distribution

Figure 6.3. Lower quartile, median and upper quartile


Figure 6.4 (a). Negatively skewed distribution

Figure 6.4 (b). Normal (no skew) distribution

Figure 6.4 (c). Positively skewed distribution


Figure 6.5. Kurtosis (Mesokurtic, Leptokurtic and Platykurtic distributions)

Example 6.1

For $\alpha \in [0, 1]$, let $X_\alpha$ be a random variable having the p.d.f.

$$f_\alpha(x) = \begin{cases} \alpha e^{x}, & \text{if } x < 0 \\ (1-\alpha) e^{-x}, & \text{if } x \geq 0 \end{cases}.$$

(i) Show that, for a positive integer $r$,

$$\int_0^\infty e^{-x} x^{r-1}\,dx = (r-1)!.$$

Hence find $\mu_r'(\alpha) = E(X_\alpha^r)$, $r \in \{1, 2, \ldots\}$;

(ii) For $p \in (0, 1)$, find $\xi_p \equiv \xi_p(\alpha)$ such that $F_\alpha(\xi_p) = p$, where $F_\alpha$ is the distribution function of $X_\alpha$. The quantity $\xi_p$ is called the $p$-th quantile of the distribution of $X_\alpha$;

(iii) Find the lower quartile $q_1(\alpha)$, the median $m(\alpha)$ and the upper quartile $q_3(\alpha)$ of the distribution of $X_\alpha$;

(iv) Find the mode $m_0(\alpha)$ of the distribution of $X_\alpha$;

(v) Find the standard deviation $\sigma(\alpha)$, the mean deviation about the median $\mathrm{MD}(m(\alpha))$, the inter-quartile range $\mathrm{IQR}(\alpha)$, the quartile deviation (or semi-inter-quartile range) $\mathrm{QD}(\alpha)$, the coefficient of quartile deviation $\mathrm{CQD}(\alpha)$ and the coefficient of variation $\mathrm{CV}(\alpha)$ of the distribution of $X_\alpha$;

(vi) Find the coefficient of skewness $\beta_1(\alpha)$ and the Yule coefficient of skewness $\beta_2(\alpha)$ of the distribution of $X_\alpha$. According to the values of $\alpha$, classify the distribution of $X_\alpha$ as symmetric, positively skewed or negatively skewed;


(vii) Find the excess kurtosis $\gamma_2(\alpha)$ of the distribution of $X_\alpha$ and hence comment on the kurtosis of the distribution of $X_\alpha$.

Solution.

(i) For $r \in \{1, 2, \ldots\}$, let

$$I_r = \int_0^\infty e^{-x} x^{r-1}\,dx,$$

so that $I_1 = 1$. Performing integration by parts it is straightforward to see that $I_r = (r-1) I_{r-1}$, $r \in \{2, 3, \ldots\}$. On successively using this relationship we get

$$I_r = \int_0^\infty e^{-x} x^{r-1}\,dx = (r-1)!, \quad r \in \{1, 2, \ldots\}.$$

Therefore, for a positive integer $r$,

$$\mu_r'(\alpha) = E(X_\alpha^r) = \int_{-\infty}^{0} \alpha x^r e^{x}\,dx + \int_0^\infty (1-\alpha) x^r e^{-x}\,dx$$

$$= \left[(-1)^r \alpha + (1-\alpha)\right] \int_0^\infty x^r e^{-x}\,dx$$

$$= \left[(-1)^r \alpha + (1-\alpha)\right] r!$$

$$= \begin{cases} (1-2\alpha)\, r!, & \text{if } r \in \{1, 3, 5, \ldots\} \\ r!, & \text{if } r \in \{2, 4, 6, \ldots\} \end{cases}.$$
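The closed form for $\mu_r'(\alpha)$ can be sanity-checked by numerical integration; the sketch below uses a composite trapezoid rule whose interval, grid size and tolerance are arbitrary choices:

```python
import math

# Numerical check (not part of the original text) of
# mu'_r(alpha) = [(-1)^r alpha + (1 - alpha)] r!  for the density
# f_alpha(x) = alpha e^x (x < 0), (1 - alpha) e^{-x} (x >= 0).
def density(alpha, x):
    return alpha * math.exp(x) if x < 0 else (1 - alpha) * math.exp(-x)

def raw_moment(alpha, r, a=-30.0, b=30.0, n=200_000):
    """Composite trapezoid rule for E[X^r]; tails beyond [a, b] are negligible."""
    h = (b - a) / n
    total = 0.0
    for i in range(n + 1):
        x = a + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * (x ** r) * density(alpha, x)
    return total * h

alpha = 0.3
for r in (1, 2, 3, 4):
    exact = ((-1) ** r * alpha + (1 - alpha)) * math.factorial(r)
    assert abs(raw_moment(alpha, r) - exact) < 1e-5
print("raw moments match the closed form for alpha =", alpha)
```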

(ii) Let $p \in (0, 1)$ and let $\xi_p$ be such that $F_\alpha(\xi_p) = p$. Note that

$$F_\alpha(0) = \int_{-\infty}^{0} \alpha e^{x}\,dx = \alpha.$$

Thus, for the evaluation of $\xi_p$, the following two cases arise.

Case I. $0 \leq \alpha < p$

We have $p = F_\alpha(\xi_p)$, i.e.,

$$p = \int_{-\infty}^{0} \alpha e^{x}\,dx + \int_0^{\xi_p} (1-\alpha) e^{-x}\,dx = 1 - (1-\alpha) e^{-\xi_p},$$

i.e., $\xi_p = \ln\left(\frac{1-\alpha}{1-p}\right)$.

Case II. $\alpha \geq p$

In this case we have

$$p = \int_{-\infty}^{\xi_p} \alpha e^{x}\,dx = \alpha e^{\xi_p},$$

i.e., $\xi_p = -\ln\left(\frac{\alpha}{p}\right)$. Combining the two cases we get

$$\xi_p = \begin{cases} \ln\left(\frac{1-\alpha}{1-p}\right), & \text{if } 0 \leq \alpha < p \\[1ex] -\ln\left(\frac{\alpha}{p}\right), & \text{if } p \leq \alpha \leq 1 \end{cases}.$$
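The two-case formula for $\xi_p$ can be verified directly against the distribution function $F_\alpha$; the $\alpha$ and $p$ values below are arbitrary choices:

```python
import math

# Check the closed form for the p-th quantile of X_alpha against the c.d.f.
# F_alpha(x) = alpha e^x (x < 0), 1 - (1 - alpha) e^{-x} (x >= 0).
def cdf(alpha, x):
    return alpha * math.exp(x) if x < 0 else 1 - (1 - alpha) * math.exp(-x)

def quantile(alpha, p):
    """The two-case formula derived above."""
    if alpha < p:
        return math.log((1 - alpha) / (1 - p))
    return -math.log(alpha / p)

for alpha in (0.2, 0.5, 0.8):
    for p in (0.25, 0.5, 0.75):
        assert abs(cdf(alpha, quantile(alpha, p)) - p) < 1e-12
print("F_alpha(xi_p) = p verified on a grid")
```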

(iii) We have

$$q_1(\alpha) = \xi_{1/4} = \begin{cases} \ln\left(\frac{4(1-\alpha)}{3}\right), & \text{if } 0 \leq \alpha < \frac{1}{4} \\[1ex] -\ln(4\alpha), & \text{if } \frac{1}{4} \leq \alpha \leq 1 \end{cases},$$

$$m(\alpha) = \xi_{1/2} = \begin{cases} \ln(2(1-\alpha)), & \text{if } 0 \leq \alpha < \frac{1}{2} \\[1ex] -\ln(2\alpha), & \text{if } \frac{1}{2} \leq \alpha \leq 1 \end{cases},$$

and

$$q_3(\alpha) = \xi_{3/4} = \begin{cases} \ln(4(1-\alpha)), & \text{if } 0 \leq \alpha < \frac{3}{4} \\[1ex] -\ln\left(\frac{4\alpha}{3}\right), & \text{if } \frac{3}{4} \leq \alpha \leq 1 \end{cases}.$$

(iv) The mode $m_0 \equiv m_0(\alpha)$ of the distribution of $X_\alpha$ is such that $f_\alpha(m_0) = \sup\{f_\alpha(x): -\infty < x < \infty\} = \max(\alpha, 1-\alpha)$. For $\alpha \leq 1/2$ this supremum is attained at $m_0 = 0$, where $f_\alpha(0) = 1-\alpha$; for $\alpha > 1/2$ the supremum $\alpha$ is approached as $x \uparrow 0$.

(v) Using (i) we have $\mu_1'(\alpha) = E(X_\alpha) = 1 - 2\alpha$ and $\mu_2'(\alpha) = E(X_\alpha^2) = 2$. It follows that the standard deviation of the distribution of $X_\alpha$ is

$$\sigma(\alpha) = \sqrt{\mathrm{Var}(X_\alpha)} = \sqrt{\mu_2'(\alpha) - (\mu_1'(\alpha))^2} = \sqrt{1 + 4\alpha - 4\alpha^2}.$$

Note that, for $0 \leq \alpha < \frac{1}{2}$, $m(\alpha) = \ln(2(1-\alpha)) \geq 0$ and, for $\alpha \geq \frac{1}{2}$, $m(\alpha) = -\ln(2\alpha) \leq 0$. Thus, for the evaluation of the mean deviation about the median, the following cases arise:

Case I. $0 \leq \alpha < \frac{1}{2}$ (so that $m(\alpha) \geq 0$)


$$\mathrm{MD}(m(\alpha)) = E|X_\alpha - m(\alpha)|$$

$$= \alpha \int_{-\infty}^{0} (m(\alpha) - x) e^{x}\,dx + (1-\alpha) \int_{0}^{m(\alpha)} (m(\alpha) - x) e^{-x}\,dx + (1-\alpha) \int_{m(\alpha)}^{\infty} (x - m(\alpha)) e^{-x}\,dx$$

$$= m(\alpha) + 2\alpha = \ln(2(1-\alpha)) + 2\alpha.$$

Case II. $\frac{1}{2} \leq \alpha \leq 1$ (so that $m(\alpha) \leq 0$)

$$\mathrm{MD}(m(\alpha)) = E|X_\alpha - m(\alpha)|$$

$$= \alpha \int_{-\infty}^{m(\alpha)} (m(\alpha) - x) e^{x}\,dx + \alpha \int_{m(\alpha)}^{0} (x - m(\alpha)) e^{x}\,dx + (1-\alpha) \int_0^{\infty} (x - m(\alpha)) e^{-x}\,dx$$

$$= 2(1-\alpha) - m(\alpha) = \ln(2\alpha) + 2(1-\alpha).$$

Combining the two cases we get

$$\mathrm{MD}(m(\alpha)) = \begin{cases} \ln(2(1-\alpha)) + 2\alpha, & \text{if } 0 \leq \alpha < \frac{1}{2} \\[1ex] \ln(2\alpha) + 2(1-\alpha), & \text{if } \frac{1}{2} \leq \alpha \leq 1 \end{cases}.$$

Using (iii), the inter-quartile range of the distribution of $X_\alpha$ is

$$\mathrm{IQR}(\alpha) = q_3(\alpha) - q_1(\alpha) = \begin{cases} \ln 3, & \text{if } 0 \leq \alpha < \frac{1}{4} \\[1ex] \ln(16\alpha(1-\alpha)), & \text{if } \frac{1}{4} \leq \alpha < \frac{3}{4} \\[1ex] \ln 3, & \text{if } \frac{3}{4} \leq \alpha \leq 1 \end{cases}.$$

The quartile deviation of the distribution of $X_\alpha$ is

$$\mathrm{QD}(\alpha) = \frac{q_3(\alpha) - q_1(\alpha)}{2} = \begin{cases} \frac{\ln 3}{2}, & \text{if } 0 \leq \alpha < \frac{1}{4} \\[1ex] \ln\left(4\sqrt{\alpha(1-\alpha)}\right), & \text{if } \frac{1}{4} \leq \alpha < \frac{3}{4} \\[1ex] \frac{\ln 3}{2}, & \text{if } \frac{3}{4} \leq \alpha \leq 1 \end{cases}.$$

The coefficient of quartile deviation of the distribution of $X_\alpha$ is

$$\mathrm{CQD}(\alpha) = \frac{q_3(\alpha) - q_1(\alpha)}{q_3(\alpha) + q_1(\alpha)} = \begin{cases} \dfrac{\ln 3}{\ln\left(\frac{16(1-\alpha)^2}{3}\right)}, & \text{if } 0 \leq \alpha < \frac{1}{4} \\[2ex] \dfrac{\ln(16\alpha(1-\alpha))}{\ln\left(\frac{1-\alpha}{\alpha}\right)}, & \text{if } \frac{1}{4} \leq \alpha < \frac{3}{4} \\[2ex] -\dfrac{\ln 3}{\ln\left(\frac{16\alpha^2}{3}\right)}, & \text{if } \frac{3}{4} \leq \alpha \leq 1 \end{cases}.$$

For $\alpha \neq 1/2$, the coefficient of variation of the distribution of $X_\alpha$ is

$$\mathrm{CV}(\alpha) = \frac{\sigma(\alpha)}{\mu_1'(\alpha)} = \frac{\sqrt{1 + 4\alpha - 4\alpha^2}}{1 - 2\alpha}.$$

(vi) We have

$$\mu_3(\alpha) = E[(X_\alpha - \mu_1')^3] = \mu_3'(\alpha) - 3\mu_1'(\alpha)\mu_2'(\alpha) + 2(\mu_1'(\alpha))^3 = 2(1-2\alpha)^3.$$

Therefore,

$$\beta_1(\alpha) = \frac{\mu_3(\alpha)}{(\sigma(\alpha))^3} = \frac{2(1-2\alpha)^3}{(1 + 4\alpha - 4\alpha^2)^{3/2}}.$$

Using (iii), the Yule coefficient of skewness is

$$\beta_2(\alpha) = \frac{q_3(\alpha) - 2m(\alpha) + q_1(\alpha)}{q_3(\alpha) - q_1(\alpha)} = \begin{cases} \dfrac{\ln\frac{4}{3}}{\ln 3}, & \text{if } 0 \leq \alpha < \frac{1}{4} \\[2ex] -\dfrac{\ln(4\alpha(1-\alpha))}{\ln(16\alpha(1-\alpha))}, & \text{if } \frac{1}{4} \leq \alpha < \frac{1}{2} \\[2ex] \dfrac{\ln(4\alpha(1-\alpha))}{\ln(16\alpha(1-\alpha))}, & \text{if } \frac{1}{2} \leq \alpha < \frac{3}{4} \\[2ex] \dfrac{\ln\frac{3}{4}}{\ln 3}, & \text{if } \frac{3}{4} \leq \alpha \leq 1 \end{cases}.$$


Clearly, for $0 \leq \alpha < 1/2$, $\beta_i(\alpha) > 0$, $i = 1, 2$, and, for $\frac{1}{2} < \alpha \leq 1$, $\beta_i(\alpha) < 0$, $i = 1, 2$. It follows that the probability distribution of $X_\alpha$ is positively skewed if $0 \leq \alpha < 1/2$ and negatively skewed if $\frac{1}{2} < \alpha \leq 1$. For $\alpha = \frac{1}{2}$, $f_\alpha(x) = f_\alpha(-x),\ \forall x \in \mathbb{R}$. Thus, for $\alpha = \frac{1}{2}$, the probability distribution of $X_\alpha$ is symmetric about zero.

(vii) We have

$$\mu_4(\alpha) = E[(X_\alpha - \mu_1')^4] = \mu_4'(\alpha) - 4\mu_1'(\alpha)\mu_3'(\alpha) + 6(\mu_1'(\alpha))^2 \mu_2'(\alpha) - 3(\mu_1'(\alpha))^4$$

$$= 24 - 12(1-2\alpha)^2 - 3(1-2\alpha)^4.$$

Therefore,

$$\gamma_1(\alpha) = \frac{\mu_4(\alpha)}{(\mu_2(\alpha))^2} = \frac{24 - 12(1-2\alpha)^2 - 3(1-2\alpha)^4}{\left(2 - (1-2\alpha)^2\right)^2}$$

and

$$\gamma_2(\alpha) = \gamma_1(\alpha) - 3 = \frac{12 - 6(1-2\alpha)^4}{\left(2 - (1-2\alpha)^2\right)^2}.$$

Clearly, for any $\alpha \in [0, 1]$, $\gamma_2(\alpha) > 0$. It follows that, for any value of $\alpha \in [0, 1]$, the distribution of $X_\alpha$ is leptokurtic.
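As a final check, $\gamma_2(\alpha)$ from the displayed formula agrees with $\mu_4/\mu_2^2 - 3$ built from the raw moments of part (i), and is positive on a grid of $\alpha$ values (the grid is an arbitrary choice):

```python
# gamma2(alpha) = (12 - 6(1-2a)^4) / (2 - (1-2a)^2)^2, checked against
# mu4/mu2^2 - 3 assembled from the raw moments mu'_r of part (i).
def gamma2(alpha):
    t = 1 - 2 * alpha
    return (12 - 6 * t ** 4) / (2 - t ** 2) ** 2

def gamma2_from_moments(alpha):
    t = 1 - 2 * alpha
    m1, m2, m3, m4 = t, 2.0, 6 * t, 24.0   # mu'_r = (1-2a) r! (r odd), r! (r even)
    mu2 = m2 - m1 ** 2
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu4 / mu2 ** 2 - 3

for k in range(1, 20):
    a = k / 20
    assert abs(gamma2(a) - gamma2_from_moments(a)) < 1e-12
    assert gamma2(a) > 0                   # leptokurtic for every alpha
print("gamma2 > 0 confirmed on the grid")
```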
