Descriptive statistics (distribution)…

Variability

Standard error (deviation) of the mean
  Definition: Indicates how close the sample mean is to the 'true' population mean. It increases as the variation increases and decreases as the sample size goes up. It provides a measure of uncertainty.
  Formula: $SE_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
  In Excel: =(STDEV(range of cells))/(SQRT(COUNT(same range of cells)))
  In Stata: tabstat var1, s(semean)
  In R: sem=sd(x)/sqrt(length(x)); sem

Confidence intervals for the mean
  Definition: The range where the 'true' value of the mean is likely to fall most of the time.
  Formula: $CI_{\bar{X}} = \bar{X} \pm SE_{\bar{X}} \cdot Z$
  In Excel: Use "Descriptive Statistics" in the "Data Analysis" tab (1)
  In Stata: ci var1
  In R: Use package "pastecs"

Distribution

Skewness
  Definition: Measures the symmetry of the distribution (whether the mean is at the center of the distribution). The skewness value of a normal distribution is 0. A negative value indicates a skew to the left (the left tail is longer than the right tail) and a positive value indicates a skew to the right (the right tail is longer than the left one).
  Formula: $Sk = \dfrac{\sum (X_i - \bar{X})^3}{(n-1)\,s^3}$
  In Excel: =SKEW(range of cells)
  In Stata: tabstat var1, s(skew) or sum var1, detail
  In R: Custom estimation

Kurtosis
  Definition: Measures the peakedness (or flatness) of a distribution. A normal distribution has a value of 3. A kurtosis > 3 indicates a sharp peak, with values closer to the mean, and heavy tails (leptokurtic). A kurtosis < 3 indicates the opposite, a flat top (platykurtic).
  Formula: $K = \dfrac{\sum (X_i - \bar{X})^4}{(n-1)\,s^4}$
  In Excel: =KURT(range of cells) (KURT reports excess kurtosis, so a normal distribution gives about 0 rather than 3)
  In Stata: tabstat var1, s(k) or sum var1, detail
  In R: Custom estimation; kurtosis(x)

Notation:
  $X_i$ = individual value of X
  $\bar{X}$ = mean of X
  n = sample size
  $s^2$ = variance
  s = standard deviation
  $\sigma$ = population standard deviation (estimated by s in the software commands above)
  $SE_{\bar{X}}$ = standard error of the mean
  Z = critical value (Z = 1.96 gives 95% confidence)

For more info check the module "Descriptive Statistics with Excel/Stata" in http://dss.princeton.edu/training/

(1) For Excel 2007: http://office.microsoft.com/en-us/excel/HP100215691033.aspx
    For Excel 2003: http://office.microsoft.com/en-us/excel/HP011277241033.aspx

PU/DSS/OTR
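
The four formulas in the table above can also be checked by hand. The following is a minimal Python sketch, not part of the original material: the function name `describe` and the sample values are hypothetical, and the sample standard deviation s stands in for σ, as the Excel and R commands do. Built-in spreadsheet or package functions may apply different small-sample corrections, so their output need not match these values exactly.

```python
import math
from statistics import mean, stdev


def describe(values, z=1.96):
    """Apply the table's formulas: SE, CI, skewness and kurtosis.

    `z` is the critical value; 1.96 corresponds to 95% confidence.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")

    x_bar = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("all values are identical; skewness and kurtosis are undefined")

    # Standard error of the mean: s / sqrt(n)
    se = s / math.sqrt(n)

    # Confidence interval for the mean: x_bar +/- SE * Z
    ci = (x_bar - z * se, x_bar + z * se)

    # Skewness: sum of cubed deviations over (n - 1) * s^3
    skewness = sum((x - x_bar) ** 3 for x in values) / ((n - 1) * s ** 3)

    # Kurtosis: sum of fourth-power deviations over (n - 1) * s^4
    # (not excess kurtosis; a normal distribution gives roughly 3)
    kurtosis = sum((x - x_bar) ** 4 for x in values) / ((n - 1) * s ** 4)

    return {"mean": x_bar, "se": se, "ci": ci,
            "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical sample, used only to illustrate the call.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = describe(sample)
for name, value in result.items():
    print(name, value)
```

For this illustrative sample the skewness is positive, which matches the table's reading of a longer right tail, and the kurtosis is below 3, the flatter (platykurtic) case.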