Descriptive Statistics: Numerical Measures

Numerical Data Properties

Central Tendency    Variation              Shape
Mean                Range                  Skew
Median              Interquartile Range    Kurtosis
Mode                Variance
Midrange            Standard Deviation
Midhinge            Coeff. of Variation

Measures of Location

• Mean
• Median
• Mode
• Percentile
• Quartiles

–If the measures are computed for data from a sample, they are called sample statistics.
–If the measures are computed for data from a population, they are called population parameters.
–A sample statistic is referred to as the point estimator of the corresponding population parameter. For example, the sample mean is a point estimator of the population mean.

Mean

• The mean of a data set is the average of all the data values.
• As noted above, the sample mean x̄ is the point estimator of the population mean μ.

Example: Apartment Rents

Seventy efficiency apartments were randomly sampled in a small college town. The monthly rent prices for these apartments are listed in ascending order below.

Sample Mean Example Continued

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

The 70 rents sum to 34,356, so the sample mean is x̄ = Σxi / n = 34,356 / 70 = 490.80.

Properties of the Arithmetic Mean

1- Every set of interval-level and ratio-level data has a mean.
2- All the values are included in computing the mean.
3- A set of data has a unique mean.
4- The mean is affected by unusually large or small data values.
5- The arithmetic mean is the only measure of central tendency where the sum of the deviations of each value from the mean is zero.

Median

The median of a data set is the value in the middle when the data items are arranged in ascending order. Whenever a data set has extreme values, the median is the preferred measure of central location.

The median is the measure of location most often reported for annual income and property value data. A few extremely large incomes or property values can inflate the mean.

Mode

The mode of a data set is the value that occurs with greatest frequency. If the data have exactly two modes, the data are bimodal. If the data have more than two modes, the data are multimodal.

Quartiles, Deciles and Percentiles

• The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.

Measures of Variability (Dispersion)

It is often desirable to consider measures of variability (dispersion), as well as measures of location. For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.

• Range
• Interquartile range
• Variance
• Standard deviation
• Coefficient of variation

Coefficient of Variation

• Measure of relative dispersion
• Always expressed as a percentage
• CV is the standard deviation expressed as a percent of the mean: CV = (standard deviation / mean) × 100%
• Used to compare the variability of two or more groups
• Weakness: CV is undefined if the mean is zero and is not meaningful if the data include negative values. Thus, CV is used only for variables whose values are X >= 0 and whose mean is nonzero.
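The measures of location and variability described in this section can be sketched for the apartment rent sample with Python's standard `statistics` module. This is a minimal illustration, not the slides' own method. The helper names `percentile` and `coefficient_of_variation` are hypothetical. The percentile rule used here is one common textbook convention and is an assumption, since the slides give only the definition: compute i = (p/100)n, round up if i is not an integer, otherwise average the values in positions i and i + 1. Because the rents are a sample, the variance and standard deviation use the n - 1 denominator.

```python
import statistics

# Monthly rents for the 70 sampled apartments, copied from the example.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile(values, p):
    """Return the pth percentile (integer p, 0 < p < 100).

    Assumed convention (not stated in the slides): i = (p/100) * n;
    if i is not an integer, take the value in the next position up;
    if i is an integer, average the values in positions i and i + 1.
    """
    if not 0 < p < 100:
        raise ValueError("p must be an integer strictly between 0 and 100")
    data = sorted(values)
    # Integer arithmetic avoids floating-point error in (p/100) * n.
    i, remainder = divmod(p * len(data), 100)
    if remainder:
        return data[i]  # 1-based position i + 1, i.e. i rounded up
    return (data[i - 1] + data[i]) / 2  # 1-based positions i and i + 1


def coefficient_of_variation(values):
    """Sample standard deviation as a percent of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    if min(values) < 0:
        raise ValueError("CV is not meaningful for data with negative values")
    return statistics.stdev(values) / mean * 100


# Measures of location
mean = statistics.mean(rents)
median = statistics.median(rents)
modes = statistics.multimode(rents)  # a list, so bimodal data is visible
q1, q3 = percentile(rents, 25), percentile(rents, 75)

# Measures of variability (sample versions, n - 1 denominator)
data_range = max(rents) - min(rents)
iqr = q3 - q1
variance = statistics.variance(rents)
std_dev = statistics.stdev(rents)
cv = coefficient_of_variation(rents)

print(f"mean={mean:.2f} median={median} modes={modes}")
print(f"Q1={q1} Q3={q3} range={data_range} IQR={iqr}")
print(f"variance={variance:.2f} std dev={std_dev:.2f} CV={cv:.2f}%")
```

For this sample the sketch reports a mean of 490.80, a median of 475 and a single mode of 450. The mean sits above the median because the few high rents pull it upward, which is the effect the Median slide describes. Other percentile conventions, such as those in `statistics.quantiles`, may return slightly different quartiles for the same data.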