Commonly used skewness coefficients for sampled data

Axel Drefahl

Last updated: November 12, 2016

Abstract

The skewness of a set of data points can be quantified by several methods. Assuming that the data points represent sample values, it is often of interest to know whether their distribution is symmetric or asymmetric (skewed) with respect to their sample mean. Several skewness coefficients have been defined for this purpose and have been selectively integrated into statistical software packages and data-processing applications. Here, a brief overview of commonly implemented skewness coefficients is given. This document is available at: www.axeleratio.com/statistics/skew/skewness formulae.pdf

1 Introduction

Skewness coefficients measure the lack of symmetry, that is, the degree of asymmetry relative to the normal distribution, of the values observed for a variable. Various skewness coefficients are in use and have been discussed in the literature [1–8] and on web sites [9–12].

The central tendency of a set of values is measured by the mean, median and mode. For a normal distribution, the mean, median and mode are equal. If the value distribution tails off to the left, it is said to be left-skewed and the relation mean < median < mode typically holds. For a right-skewed distribution, the typical relation is mode < median < mean [2]. The mean and mode have been employed in deriving skewness coefficients.
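The ordering of the three measures for a right-skewed sample can be illustrated with a minimal sketch. The data values below are hypothetical (chosen only for illustration, not taken from the test data of this document), and the variable names are assumptions that mirror the symbols used in Section 2.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
sample = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]

x_bar = mean(sample)      # arithmetic mean
x_tilde = median(sample)  # median (average of the two middle values here)
x_hat = mode(sample)      # mode (most frequent value)

# For this sample the mode is 2, the median is 3 and the mean is 4,
# so the ordering mode < median < mean is observed.
print(x_hat, x_tilde, x_bar)
```

The ordering is a rule of thumb rather than a guarantee, so checking it on a given sample does not replace the calculation of a skewness coefficient.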

Current definitions of skewness functions are based on the second and third moments around the mean. A self-consistent overview of skewness formulae is desirable for both judging their computational (dis)similarity and comparing results obtained with different calculators and applications. The commonly applied skewness coefficients are presented herein, without making any attempt to introduce the theory behind statistical skewness. References to supplementing test data and example calculations are provided.

2 Skewness Coefficients

We consider a set of data points $\{x_1, \ldots, x_n\}$ that represent values observed for the single measured variable $x$. The symbols $\bar{x}$, $\tilde{x}$ and $\hat{x}$ denote the arithmetic mean, median and mode, respectively. The standard deviation (corrected sample standard deviation) is:

$$ s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}. \tag{1} $$

Karl Pearson established coefficients based on these statistical functions. Pearson’s first skewness coefficient is also known as mode skewness:

$$ S_{P1} = \frac{\bar{x} - \hat{x}}{s}. \tag{2} $$

Pearson’s second skewness coefficient is also known as median skewness:

$$ S_{P2} = 3\,\frac{\bar{x} - \tilde{x}}{s}. \tag{3} $$

The mode also occurs in other skewness measurements [1]. Instead of the mode, the second and third moments around the mean are now typically employed in developing skewness measures that are available with software packages. The second moment around the mean is

$$ m_2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2; \tag{4} $$

and the third moment is

$$ m_3 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3. \tag{5} $$

The traditional Fisher-Pearson coefficient of skewness is (for example, see page 6 in [2]):

$$ g_1 = \frac{m_3}{m_2^{3/2}} = \frac{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^3}{\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]^{3/2}}. \tag{6} $$

The adjusted Fisher-Pearson standardized moment coefficient is (eq. 1c in [2]):

$$ G_1 = g_1 \frac{\sqrt{n(n-1)}}{n-2}. \tag{7} $$

This skewness coefficient can also be written in the form

$$ G_1 = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3, \tag{8} $$

which “brings back” the standard deviation into the formula. An often used variation is the following skewness coefficient:

$$ G_1^{\star} = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3. \tag{9} $$

With increasing $n$, $G_1$ and $G_1^{\star}$ become more alike, since their ratio $n/(n-2)$ approaches unity.
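The coefficients of eqs. (2), (3), (6), (7) and (9) can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names and the sample values are hypothetical, statistics.stdev is used because it computes the corrected sample standard deviation of eq. (1), and statistics.mode returns the first mode encountered if the sample has more than one (Python 3.8 or later).

```python
from math import sqrt
from statistics import mean, median, mode, stdev


def central_moment(values, k):
    """k-th moment around the mean, as in eqs. (4) and (5)."""
    x_bar = mean(values)
    return sum((x - x_bar) ** k for x in values) / len(values)


def pearson_mode_skewness(values):
    """Eq. (2): (mean - mode) / s."""
    return (mean(values) - mode(values)) / stdev(values)


def pearson_median_skewness(values):
    """Eq. (3): 3 * (mean - median) / s."""
    return 3 * (mean(values) - median(values)) / stdev(values)


def skew_g1(values):
    """Eq. (6): m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def skew_G1(values):
    """Eq. (7): g1 * sqrt(n (n - 1)) / (n - 2); requires n >= 3."""
    n = len(values)
    return skew_g1(values) * sqrt(n * (n - 1)) / (n - 2)


def skew_G1_star(values):
    """Eq. (9): (1 / (n - 1)) * sum(((x_i - mean) / s)^3)."""
    n = len(values)
    x_bar, s = mean(values), stdev(values)
    return sum(((x - x_bar) / s) ** 3 for x in values) / (n - 1)


# Hypothetical right-skewed sample, for illustration only.
sample = [1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 7.0, 11.0]

print("S_P1 =", pearson_mode_skewness(sample))
print("S_P2 =", pearson_median_skewness(sample))
print("g1   =", skew_g1(sample))
print("G1   =", skew_G1(sample))
print("G1*  =", skew_G1_star(sample))
```

All five values are positive for this sample, in line with its right-skewed shape, and the ratio of the last two printed values equals n/(n − 2) up to rounding.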

OpenOffice Calc and Microsoft Excel include a SKEW function that implements $G_1$. R, a programming language and software environment for statistical computing, employs $g_1$ in the skewness function available with its moments package. Various online calculators produce a skewness value based on $G_1^{\star}$ [13].
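Because these tools report different coefficients for the same data, a value obtained with one of them has to be converted before it can be compared with a value from another. The following minimal sketch applies eq. (7) and the ratio n/(n − 2) between eqs. (8) and (9). The function names and the input value are hypothetical, and the sample size n must be known.

```python
from math import sqrt


def g1_to_G1(g1, n):
    """Eq. (7): convert g1 to G1 for n data points (requires n >= 3)."""
    return g1 * sqrt(n * (n - 1)) / (n - 2)


def G1_to_G1_star(G1, n):
    """G1* = G1 * (n - 2) / n, from comparing eqs. (8) and (9)."""
    return G1 * (n - 2) / n


# Hypothetical input: a g1 value as a g1-based tool might report it
# for a sample of n = 20 data points.
g1_value, n = 0.50, 20

G1_value = g1_to_G1(g1_value, n)
G1_star_value = G1_to_G1_star(G1_value, n)

print("g1  =", g1_value)
print("G1  =", G1_value)
print("G1* =", G1_star_value)
```

For a positive g1, the converted values satisfy G1* < g1 < G1, and the three values approach each other as n grows.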

Test data and associated calculations are available in: www.axeleratio.com/statistics/skew/skewness test data.pdf.

Those calculations are backed up in an OpenOffice Calc spreadsheet, which can be downloaded as an entry tool for further skewness calculations: www.axeleratio.com/statistics/skew/skewness test calc.ods.

References

[1] Arnold, B. C. and Groeneveld, R. A. (1995), “Measuring Skewness with Respect to the Mode,” The American Statistician, 49 (1), 34-38. Available: http://www.jstor.org/stable/2684808

[2] Doane, D. P. and Seward, L. E. (2011), “Measuring Skewness: A Forgotten Statistic?”, Journal of Statistics Education, 19 (2), 1-18. Available: https://ww2.amstat.org/publications/jse/v19n2/doane.pdf [Accessed Nov. 12, 2016].

[3] Pearson, K. (1895), “Contributions to the Mathematical Theory of Evolution, II: Skew Variation in Homogeneous Material,” Philosophical Transactions of the Royal Society of London, Series A, 186, 343-414. http://bayes.wustl.edu/Manual/Pearson_1895.pdf

[4] Popescu, A. (2014), “Research on the normal distribution of variables as a condition for the correct application of the statistical models for processing data,” Scientific Papers Series Management, Economic Engineering in Agriculture and Rural Development, 14 (3), 277-282.

[5] Tabor, J. (2010), “Investigating the Investigative Task: Testing for Skewness,” Journal of Statistics Education, 18 (2), 1-12. Available: https://ww2.amstat.org/publications/jse/v18n2/tabor.pdf [Accessed Nov. 12, 2016].

[6] Wright, D. B. and Herrington, J. A. (2011), “Problematic standard errors and confidence intervals for skewness and kurtosis,” Behavior Research Methods, 43, 8-17. DOI: 10.3758/s13428-010-0044-x.

[7] “Using Excel with the Normal Distribution.” Chapter 7 in Carlberg, C. (2011), “Statistical Analysis: Microsoft Excel 2010,” Que Publishing (Second Printing: April 2012).

[8] “Measure of skewness.” Pages 48 and 49 in Davies, O. L. and Goldsmith, P. L. (Fourth Edition, Reprinted in 1984), “Statistical Methods in Research & Production,” Longman Inc., New York.

[9] “Measures of Central Tendency,” Lærd statistics. Available: https://statistics.laerd.com/statistical-guides/measures-central-tendency-mean-mode-median.php [Accessed Nov. 12, 2016].

[10] “Skewness Formula,” Macroption. Available: http://www.macroption.com/skewness-formula/ [Accessed Nov. 12, 2016].

[11] “Skewness,” Wikipedia. Available: https://en.wikipedia.org/wiki/Skewness [Accessed Nov. 12, 2016].

[12] “Skewness,” Wolfram MathWorld. Available: http://mathworld.wolfram.com/Skewness.html [Accessed Nov. 12, 2016].

[13] Selected Skewness Calculators: http://www.endmemo.com/statistics/skewness.php, http://ncalculators.com/statistics/skewness-calculator.htm, https://www.easycalculation.com/statistics/skewness.php.
