Measures of Dispersion
Total Page:16
File Type:pdf, Size:1020Kb
CHAPTER Measures of Dispersion Studying this chapter should Three friends, Ram, Rahim and enable you to: Maria are chatting over a cup of tea. • know the limitations of averages; • appreciate the need for measures During the course of their conversation, of dispersion; they start talking about their family • enumerate various measures of incomes. Ram tells them that there are dispersion; four members in his family and the • calculate the measures and average income per member is Rs compare them; • distinguish between absolute 15,000. Rahim says that the average and relative measures. income is the same in his family, though the number of members is six. Maria 1. INTRODUCTION says that there are five members in her family, out of which one is not working. In the previous chapter, you have She calculates that the average income studied how to sum up the data into a in her family too, is Rs 15,000. They single representative value. However, are a little surprised since they know that value does not reveal the variability that Maria’s father is earning a huge present in the data. In this chapter you will study those measures, which seek salary. They go into details and gather to quantify variability of the data. the following data: 2021-22 MEASURES OF DISPERSION 75 Family Incomes in values, your understanding of a Sl. No. Ram Rahim Maria distribution improves considerably. 1. 12,000 7,000 0 For example, per capita income gives 2. 14,000 10,000 7,000 only the average income. A measure of 3. 16,000 14,000 8,000 dispersion can tell you about income 4. 18,000 17,000 10,000 inequalities, thereby improving the 5. ----- 20,000 50,000 6. ----- 22,000 ------ understanding of the relative standards of living enjoyed by different strata of Total income 60,000 90,000 75,000 society. Average income 15,000 15,000 15,000 Dispersion is the extent to which Do you notice that although the values in a distribution differ from the average is the same, there are average of the distribution. considerable differences in individual To quantify the extent of the incomes? variation, there are certain measures It is quite obvious that averages try namely: to tell only one aspect of a distribution (i) Range i.e. a representative size of the values. (ii) Quartile Deviation To understand it better, you need to (iii) Mean Deviation know the spread of values also. (iv) Standard Deviation You can see that in Ram’s family, Apart from these measures which differences in incomes are give a numerical value, there is a comparatively lower. In Rahim’s family, graphic method for estimating differences are higher and in Maria’s dispersion. family, the differences are the highest. Range and quartile deviation Knowledge of only average is measure the dispersion by calculating insufficient. If you have another value the spread within which the values lie. which reflects the quantum of variation Mean deviation and standard deviation calculate the extent to which the values differ from the average. 2. MEASURES BASED UPON SPREAD OF VALUES Range Range (R) is the difference between the largest (L) and the smallest value (S) in a distribution. Thus, R = L – S Higher value of range implies higher dispersion and vice-versa. 2021-22 76 STATISTICS FOR ECONOMICS Activities Quartile Deviation Look at the following values: The presence of even one extremely 20, 30, 40, 50, 200 high or low value in a distribution can • Calculate the Range. reduce the utility of range as a measure • What is the Range if the value of dispersion. Thus, you may need a 200 is not present in the data measure which is not unduly affected set? • If 50 is replaced by 150, what by the outliers. will be the Range? In such a situation, if the entire data is divided into four equal parts, each containing 25% of the values, we get Range: Comments Range is unduly affected by extreme the values of quartiles and median. values. It is not based on all the (You have already read about these in values. As long as the minimum Chapter 5). and maximum values remain The upper and lower quartiles (Q3 unaltered, any change in other and Q1, respectively) are used to values does not affect range. It calculate inter-quartile range which is cannot be calculated for open- Q – Q . ended frequency distribution. 3 1 Interquartile range is based upon Notwithstanding some limitations, middle 50% of the values in a range is understood and used distribution and is, therefore, not frequently because of its simplicity. For affected by extreme values. Half of the example, we see the maximum and inter-quartile range is called quartile minimum temperatures of different deviation (Q.D.). Thus: cities almost daily on our TV screens and form judgments about the temperature variations in them. Q.D. is therefore also called Semi- Inter Quartile Range. Open-ended distributions are those in which either the lower limit of Calculation of Range and Q.D. for the lowest class or the upper limit ungrouped data of the highest class or both are not specified. Example 1 Calculate range and Q.D. of the Activity following observations: • Collect data about 52-week high/ 20, 25, 29, 30, 35, 39, 41, low of shares of 10 companies 48, 51, 60 and 70 from a newspaper. Calculate the Range is clearly 70 – 20 = 50 range of share prices. Which For Q.D., we need to calculate company’s share is most volatile values of Q and Q . and which is the most stable? 3 1 2021-22 MEASURES OF DISPERSION 77 Range is just the difference between n +1 th the upper limit of the highest class and Q1 is the size of value. 4 the lower limit of the lowest class. So n being 11, Q1 is the size of 3rd value. range is 90 – 0 = 90. For Q.D., first As the values are already arranged calculate cumulative frequencies as in ascending order, it can be seen that follows: Q , the 3rd value is 29. [What will you 1 Class- Frequencies Cumulative do if these values are not in an order?] Intervals Frequencies 3 (n +1) CI f c. f. th Similarly, Q3 is size of 0–10 5 05 4 10–20 8 13 value; i.e. 9th value which is 51. Hence 20–40 16 29 Q = 51 40–60 7 36 3 60–90 4 40 51− 29 n = 40 = = 11 2 n Q is the size of th value in a Do you notice that Q.D. is the 1 4 average difference of the Quartiles from continuous series. Thus, it is the size the median. of the 10th value. The class containing th Activity the 10 value is 10–20. Hence, Q1 lies in class 10–20. Now, to calculate the • Calculate the median and exact value of Q , the following formula check whether the above 1 statement is correct. is used: Calculation of Range and Q.D. for a n cf frequency distribution. Q =L +4 × i Example 2 1 f For the following distribution of marks Where L = 10 (lower limit of the scored by a class of 40 students, relevant Quartile class) calculate the Range and Q.D. c.f. = 5 (Value of c.f. for the class preceding the quartile class) TABLE 6.1 i = 10 (interval of the quartile class), Class intervals No. of students and C I (f) f = 8 (frequency of the quartile class) 0–10 5 Thus, 10–20 8 20–40 16 10− 5 Q= 10 + × 10 =16.25 40–60 7 8 60–90 4 40 3n Similarly, Q is the size of th 3 4 2021-22 78 STATISTICS FOR ECONOMICS value; i.e., 30th value, which lies in Quartile deviation can generally be class 40–60. Now using the formula calculated for open-ended for Q3, its value can be calculated as distributions and is not unduly affected follows: by extreme values. 3. MEASURES OF DISPERSION FROM AVERAGE Recall that dispersion was defined as the extent to which values differ from their average. Range and quartile deviation are not useful in measuring, how far the values are, from their In individual and discrete series, average. Yet, by calculating the spread of values, they do give a good idea n+ 1 th about the dispersion. Two measures Q1 is the size of value, but 4 which are based upon deviation of the in a continuous distribution, it is values from their average are Mean n Deviation and Standard Deviation. the size of th value. Similarly, 4 Since the average is a central value, some deviations are positive and some for Q3 and median also, n is used in place of n+1. are negative. If these are added as they are, the sum will not reveal anything. If the entire group is divided into In fact, the sum of deviations from two equal halves and the median Arithmetic Mean is always zero. Look calculated for each half, you will have at the following two sets of values. the median of better students and the Set A : 5, 9, 16 median of weak students. These Set B : 1, 9, 20 medians differ from the median of the entire group by 13.31 on an average. You can see that values in Set B are Similarly, suppose you have data farther from the average and hence about incomes of people of a town. more dispersed than values in Set A. Median income of all people can be Calculate the deviations from calculated.