Measures of Skewness


UNIT 16: MEASURES OF SKEWNESS

Structure

16.0 Objectives
16.1 Introduction
16.2 Meaning of Skewness
16.3 Positive and Negative Skewness
16.4 Difference between Dispersion and Skewness
16.5 Tests of Skewness
16.6 Measures of Skewness
16.7 Some Illustrations
16.8 Properties of Normal Curve
16.9 Let Us Sum Up
16.10 Key Words and Symbols
16.11 Answers to Check Your Progress
16.12 Terminal Questions/Exercises

16.0 OBJECTIVES

After studying this unit, you should be able to:

- distinguish between skewness and dispersion
- differentiate between symmetrical, positively skewed and negatively skewed data
- calculate skewness by different methods
- decide which of the methods of computing skewness is suitable in a given situation
- appreciate the role of the normal curve in the analysis of data and discuss its properties.

16.1 INTRODUCTION

As you know, to analyse any numerical data there are three main characteristics: 1) central tendency, i.e., a value around which many other items of the data congregate; 2) dispersion, i.e., how much the items deviate from the central tendency; and 3) skewness, i.e., how the items are distributed about the central tendency. In this unit, you will learn about the third characteristic, i.e., skewness.

In Units 10 to 13 you studied the measures of central tendency, viz., arithmetic mean, median, mode, geometric mean, harmonic mean and moving average. In Units 14 and 15 you studied the measures of dispersion, viz., range, quartile deviation, mean deviation, standard deviation and Lorenz curve. In this unit you will study the meaning, purpose and methods of computing skewness. You will also study the role and properties of the normal curve in the analysis of data. In fact, there is one more characteristic, called kurtosis, i.e., the concentration of frequencies in the central part of the data, which is not within the scope of this course.

16.2 MEANING OF SKEWNESS

A frequency distribution is said to be 'symmetrical' if the frequencies are symmetrically distributed about the central value, i.e., when values of the variable which are at an equal distance from the middle have equal frequencies. Study the following two distributions.

A)  X : 10   15   20   25   30
    f :  5    8   26    8    5

Here X = 20 is the group of middle items.

B)  X : 5-9  9-13  13-17  17-21  21-24
    f :  7    18     25     18     7

Here the middle group is 13-17.

You can easily see that these are symmetrical distributions. You should also note, and can verify by calculation (a short computational check is given below), that for each set the values of the mean, median and mode are the same. In fact, for any symmetrical distribution in which the frequencies steadily rise and then steadily fall (i.e., a bell-shaped distribution), mean, median and mode are equal. Study Figure 16.1 for the shape of such data on graph paper.

[Figure 16.1: Symmetrical Distribution]

If the graph of perfectly symmetrical data is folded at the line passing through the mean, one side of the curve coincides perfectly with the other side. You can say that one side is the mirror image of the other side.

In general, however, frequency distributions are not perfectly symmetrical; some may be slightly asymmetrical and some others may be highly asymmetrical. Consider the following two asymmetrical (or skewed) distributions:

B)  X : 5-9  9-13  13-17  17-21  21-24
    f :  7    28     15     10     2

Here the frequencies are not symmetrically distributed about the middle. In distribution A the extent of asymmetry is small, while in distribution B it is comparatively larger.
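The equality of mean, median and mode for the symmetrical distribution A above can be checked with a few lines of code. The sketch below is a minimal illustration using Python's standard statistics module; expanding the frequency table into raw observations is simply a convenient device for the check, not a method from the unit itself.

```python
# Quick check for distribution A (X: 10 15 20 25 30; f: 5 8 26 8 5):
# for a symmetrical, bell-shaped set the mean, median and mode coincide.
from statistics import mean, median, mode

x = [10, 15, 20, 25, 30]
f = [5, 8, 26, 8, 5]

# Expand the frequency table into individual observations.
data = [xi for xi, fi in zip(x, f) for _ in range(fi)]

print(mean(data), median(data), mode(data))  # 20 20 20
```

For a skewed set the three values separate, and the size and direction of that separation is exactly what the measures discussed in this unit quantify.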
The word 'skewness' is used to denote the extent of asymmetry in the data. When a frequency distribution is not symmetrical, it is said to be 'skewed'. The word 'skewness' literally denotes 'asymmetry' or 'lack of symmetry', and the word 'skewed' denotes 'asymmetrical'. A symmetrical distribution therefore has zero skewness.

A distribution can be symmetrical even if the frequencies first steadily fall and then steadily rise. Consider the following distribution:

Size of Items : 10-20  20-30  30-40  40-50  50-60  60-70  70-80
Frequency     :   40     27     15     10     15     27     40

This is also a case of a symmetrical distribution. But in this case there will be two values of the mode, and both of them will be different from the arithmetic mean and the median, which will lie in the middle group. You may notice that in such symmetrical distributions, which are called bimodal or U-shaped, only the mean and the median are equal. Look at Figure 16.2 and study the shape of such data on graph paper.

[Figure 16.2: Bimodal or U-Shaped Distribution (Mode 1, Mean = Median, Mode 2)]

A bimodal distribution can also be a skewed distribution, as in the following example:

Size of Items : 10-15  15-20  20-25  25-30  30-35  35-40  40-45
Frequency     :   27     18     10      5     17     17     30

Here also the distribution of items around the middle group or central value is not the same on both sides. Thus, we can also say that the study of skewness is the study of the distribution of items around the central tendency.

Analysing the skewness of data serves the following main purposes:

1) It helps in finding out the nature and the degree of concentration, i.e., whether it is in the higher or the lower values.
2) The empirical relationship between mean, median and mode, i.e., Mode = 3 Median - 2 Mean, is based on a moderately skewed distribution. The measure of skewness will reveal to what extent such an empirical relationship holds good.
3) It helps in knowing whether the distribution is normal or not. You will learn about the normal distribution later in this unit.

16.3 POSITIVE AND NEGATIVE SKEWNESS

Wherever data is skewed there are two possibilities: 1) the skewness may be positive, or 2) it may be negative. In bell-shaped or unimodal data, which is also the most common in nature, it is quite easy to understand the concept of positive and negative skewness, i.e., the direction of the skewness. The mode plays an important role in this connection: the spread of the data on either side of the mode helps in deciding the direction of skewness. Consider the two sets of data given below:

A)  Size of Items : 2-4  4-6  6-8  8-10  10-12  12-14  14-16
    Frequency     :  5   12   27    10      8      3      1

B)  Size of Items : 2-4  4-6  6-8  8-10  10-12  12-14  14-16
    Frequency     :  2    5   12    18     30     21      6

1) As in Set B, if there is a longer tail towards the lower values or the left-hand side, i.e., a larger spread on the lower side of the mode, the skewness is negative or left-handed. In such a case Mean < Median < Mode. Look at Figure 16.3 to understand the data on graph paper.

2) As in Set A, if there is a longer tail of the distribution towards the higher values or the right-hand side, i.e., a larger spread on the higher side of the mode, the skewness is positive or right-handed. In this case Mean > Median > Mode. The shape of such data on graph paper would be as shown in Figure 16.4. Such data is termed 'elongated bell-shaped data'.
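These orderings can be verified numerically for Sets A and B. The sketch below is illustrative only: the helper grouped_stats is a name chosen here, and it relies on the standard textbook interpolation formulas for the grouped median and mode rather than on a worked method taken from this unit.

```python
# Illustrative check of the direction of skewness for Sets A and B, using
# the usual grouped-data formulas:
#   mean   = sum(midpoint * f) / N
#   median = L + (N/2 - cf) / f_median * h            (median class)
#   mode   = L + (f1 - f0) / (2*f1 - f0 - f2) * h     (modal class)

def grouped_stats(bounds, freq):
    """bounds: class limits [2, 4, ..., 16]; freq: one frequency per class."""
    n = sum(freq)
    mids = [(bounds[i] + bounds[i + 1]) / 2 for i in range(len(freq))]
    mean = sum(m * fi for m, fi in zip(mids, freq)) / n

    # Median class: first class whose cumulative frequency reaches N/2.
    cum, i = 0, 0
    while cum + freq[i] < n / 2:
        cum += freq[i]
        i += 1
    L, h = bounds[i], bounds[i + 1] - bounds[i]
    median = L + (n / 2 - cum) / freq[i] * h

    # Modal class: class with the largest frequency.
    m = freq.index(max(freq))
    f1 = freq[m]
    f0 = freq[m - 1] if m > 0 else 0
    f2 = freq[m + 1] if m + 1 < len(freq) else 0
    L, h = bounds[m], bounds[m + 1] - bounds[m]
    mode = L + (f1 - f0) / (2 * f1 - f0 - f2) * h
    return mean, median, mode

bounds = [2, 4, 6, 8, 10, 12, 14, 16]
print(grouped_stats(bounds, [5, 12, 27, 10, 8, 3, 1]))   # Set A: ~7.52 > ~7.19 > ~6.94
print(grouped_stats(bounds, [2, 5, 12, 18, 30, 21, 6]))  # Set B: ~10.32 < ~10.67 < ~11.14
```

The results agree with the rule just stated: in the positively skewed Set A the mean is pulled towards the long right tail, while in the negatively skewed Set B it is pulled towards the long left tail.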
The case of extreme positive skewness arises when the frequencies are highest at the lowest values and then steadily fall as the values increase. Similarly, extreme negative skewness arises when the frequencies are lowest at the lower values and steadily increase as the values increase, the highest frequency corresponding to the highest values. Such data is called 'J-shaped' data. Consider the following two sets of data:

A)  Size of Items : 10-12  12-14  14-16  16-18  18-20
    Frequency     :   27     20     12      6      3

B)  Size of Items : 10-12  12-14  14-16  16-18  18-20
    Frequency     :    3      6     12     20     27

Set A shows very high positive skewness and Set B shows high negative skewness. Their shapes on graph paper will be as shown in Figures 16.5A and 16.5B. (A small computational check of the direction of skewness for these two sets is sketched after the questions below.)

[Figure 16.5A: J-Shaped Positively Skewed Distribution]
[Figure 16.5B: J-Shaped Negatively Skewed Distribution]

Note: For the data to be skewed or symmetrical, deviations from the central tendency must exist in the data.

Check Your Progress A

1) Distinguish between symmetrical data and skewed data.
2) Differentiate between positive skewness and negative skewness.
3) Distinguish between high skewness and moderate skewness.
4) Differentiate between bell-shaped and U-shaped data.
5) State whether the following statements are True or False:
   i) All distributions can be classified as negatively or positively skewed.
   ii) The two halves of a symmetrical distribution are mirror images of each other.
   iii) The sum of positive and negative deviations from the median is always equal to zero in a symmetrical distribution.
   iv) A J-shaped distribution indicates moderate skewness.
   v) It is possible that for some data Arithmetic Mean = Median = Mode, and still the data is not perfectly symmetrical.
   vi) Positive skewness implies that the mean value is less than the mode.
   vii) The median can never be equal to the mean in a skewed distribution.
   viii) The greater the difference between mean and mode, the more skewed is the distribution.
   ix) U-shaped data has two modes.
   x) A longer tail to the right means the data is negatively skewed.
   xi) U-shaped distributions are always symmetrical.
   xii) Highly skewed data is always positively skewed.
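As a rough numerical companion to the J-shaped example above, the sketch below applies one commonly used indicator of direction, Karl Pearson's second coefficient of skewness, 3(Mean - Median)/Standard Deviation, to Sets A and B. The unit's own measures of skewness are taken up in Section 16.6; here only the sign of the coefficient is of interest, and the function pearson_skew is a name chosen for this sketch.

```python
# Direction of skewness for the two J-shaped sets, using Karl Pearson's
# second coefficient: Sk = 3 * (mean - median) / standard deviation.
# Class mid-points stand in for the grouped values.
from statistics import mean, median, pstdev

def pearson_skew(mids, freqs):
    data = [m for m, f in zip(mids, freqs) for _ in range(f)]
    return 3 * (mean(data) - median(data)) / pstdev(data)

mids = [11, 13, 15, 17, 19]                      # mid-points of 10-12 ... 18-20
print(pearson_skew(mids, [27, 20, 12, 6, 3]))    # Set A: positive sign
print(pearson_skew(mids, [3, 6, 12, 20, 27]))    # Set B: negative sign
```

The positive sign for Set A and the negative sign for Set B match the direction of the tails described above.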