ENM 317 Engineering Chapter 2

Ilgın ACAR, Fall 15

Sample Data

Measures of Central Tendency

Arithmetic mean (or simply the mean) of a list of numbers is the sum of all the members of the list divided by the number of items in the list.
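As a minimal sketch of this definition in Python: the function name `arithmetic_mean` and the five values are hypothetical, chosen only for illustration, and are not the sample data from the course.

```python
def arithmetic_mean(values):
    """Sum of all members of the list divided by the number of items."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


# Hypothetical observations (not the course sample)
sample = [2.0, 3.5, 4.0, 4.5, 6.0]
print(arithmetic_mean(sample))  # 20.0 / 5 = 4.0
```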

Ex.

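Median is the middle value in an ordered sequence of data. The positioning-point formula (n+1)/2 gives the place in the ordered array that corresponds to the median value. If the positioning point is not a whole number, the median is the average of the two neighboring ordered values.

Ex: With n = 20 observations the positioning point is (20 + 1) / 2 = 10.5. Therefore the median is the average of the 10th and 11th ordered values: (4.2 + 4.3) / 2 = 4.25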
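The positioning-point rule can be sketched in Python as follows. The function name `median_by_position` and the input values are hypothetical. The code assumes that a fractional positioning point means averaging the two neighboring ordered values, as in the example above.

```python
import math


def median_by_position(values):
    """Median via the positioning-point formula (n + 1) / 2."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    position = (n + 1) / 2  # 1-based place in the ordered array
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    # For odd n, lower and upper are the same middle value
    return (lower + upper) / 2


# Hypothetical data: positioning point is 2.5, so average the 2nd and 3rd values
print(median_by_position([7, 1, 5, 3]))  # (3 + 5) / 2 = 4.0
```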

Measures of Central Tendency

Mode is the most frequently occurring value in the data set. Mode is equal to 4.6 in our example.
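A small sketch of finding the mode by counting occurrences. The helper name `modes` and the data are hypothetical. The function returns a list because a data set may have more than one most frequent value, a case the definition above does not address.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    if not values:
        raise ValueError("mode of an empty list is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical observations (not the course sample)
print(modes([4.6, 3.1, 4.6, 2.0, 4.6, 3.1]))  # 4.6 occurs three times
```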

Midrange is the average of the smallest and the largest observations in a set of data.

Ex: Midrange = (0.9 + 5.0) / 2 = 2.95
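A one-function sketch of the midrange. The name `midrange` is hypothetical. The values 0.9 and 5.0 are the smallest and largest observations used in the range example of this chapter, and the middle value is made up.

```python
def midrange(values):
    """Average of the smallest and the largest observations."""
    if not values:
        raise ValueError("midrange of an empty list is undefined")
    return (min(values) + max(values)) / 2


# Only the extremes matter; 3.0 is a made-up middle observation
print(midrange([0.9, 3.0, 5.0]))  # approximately 2.95
```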

Midhinge is the mean of the first and the third quartiles in a set of data.


first quartile (designated Q1) = lower quartile = cuts off lowest 25% of data = 25th percentile

second quartile (designated Q2) = median = cuts data set in half = 50th percentile

third quartile (designated Q3) = upper quartile = cuts off highest 25% of data, or lowest 75% = 75th percentile

Ex: Q1 = 2.6 and Q3 = 4.7, so Midhinge = (2.6 + 4.7) / 2 = 3.65
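The midhinge can be sketched in Python as below. The function name `midhinge` is hypothetical. Quartile values depend on the interpolation convention. The call to `statistics.quantiles` with `method="inclusive"` is one common choice, and it is an assumption here that may not match the convention used in the course.

```python
import statistics


def midhinge(q1, q3):
    """Mean of the first and the third quartiles."""
    return (q1 + q3) / 2


# Quartiles taken from the example in the text
print(midhinge(2.6, 4.7))  # approximately 3.65

# Hypothetical data, with quartiles under the "inclusive" convention (an assumption)
data = [1, 2, 3, 4, 5]
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
print(q1, q2, q3, midhinge(q1, q3))
```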

Measures of Variation

Range is the difference between the largest and the smallest observations in a set of data.

Range = X largest – X smallest

Ex: Range = 5.0 – 0.9 = 4.1

Interquartile Range (Midspread) is the difference between the third quartile and the first quartile.

Interquartile Range = Q3 - Q1

Ex: Interquartile Range = 4.7 – 2.6 = 2.1
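Both measures of variation above reduce to a single subtraction. A minimal sketch, with hypothetical function names `data_range` and `interquartile_range`, reusing the numbers from the two examples (the middle observation 3.0 is made up):

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    if not values:
        raise ValueError("range of an empty list is undefined")
    return max(values) - min(values)


def interquartile_range(q1, q3):
    """Third quartile minus first quartile (the midspread)."""
    return q3 - q1


print(data_range([0.9, 3.0, 5.0]))    # approximately 4.1
print(interquartile_range(2.6, 4.7))  # approximately 2.1
```

The range uses only the two extreme observations, so a single unusual value changes it. The interquartile range ignores the lowest and highest 25% of the data.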

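Variance is a measure of the spread of the observations around the mean. For sample data it is commonly computed as the sum of squared deviations from the mean divided by n - 1.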

Ex.

Standard deviation is the square root of the variance, expressed in the same units as the data.
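A minimal sketch of both quantities, assuming the sample (n - 1) form of the variance, which the slides do not spell out. The function names and the data are hypothetical. The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 convention and are printed for comparison.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


# Hypothetical data: mean is 5, squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data), statistics.variance(data))  # both about 32 / 7
print(sample_std(data), statistics.stdev(data))
```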

Ex.

Shapes of distributions

Skewness is a measure of distribution asymmetry, or the tendency of one tail to be heavier than the other.

Symmetric vs. Skewed: A symmetric distribution can be defined as one in which the upper half is a mirror image of the lower half of the distribution.

Positive (Right) skewed: If the mean is greater than the median, the data may be described as positive or right skewed.

Negative (Left) skewed: If the mean is less than the median, the data may be described as negative or left skewed.
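The mean-versus-median rule above can be sketched as a simple comparison. The function name `skew_direction`, the `tolerance` parameter, and the data are hypothetical. This is only the rule of thumb from the text, not a computed skewness coefficient, and equal mean and median do not by themselves prove symmetry.

```python
import statistics


def skew_direction(values, tolerance=1e-9):
    """Classify skew by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean - median > tolerance:
        return "positive (right) skewed"
    if median - mean > tolerance:
        return "negative (left) skewed"
    return "roughly symmetric"


# Hypothetical data sets
print(skew_direction([1, 2, 3, 4, 10]))  # mean 4 > median 3
print(skew_direction([1, 7, 8, 9, 10]))  # mean 7 < median 8
print(skew_direction([1, 2, 3, 4, 5]))   # mean 3 = median 3
```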


Kurtosis is a measure of how the peakedness and tail weight of a distribution differ from those of the normal distribution.

Leptokurtic (prominent peak & fat tails)

Mesokurtic (data centered & few obs. in tails)

Platykurtic (flat with few obs. in tails)
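The slides give no formula for kurtosis. One common choice, assumed here, is the moment-based excess kurtosis m4 / m2^2 - 3, which is 0 for the normal distribution. The function names, the `tolerance` cut-off, and the data below are hypothetical and serve only to illustrate the three labels.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (population form)."""
    n = len(values)
    if n == 0:
        raise ValueError("kurtosis of an empty list is undefined")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


def kurtosis_label(values, tolerance=0.1):
    """Label the shape; the tolerance band around 0 is an arbitrary choice."""
    k = excess_kurtosis(values)
    if k > tolerance:
        return "leptokurtic"
    if k < -tolerance:
        return "platykurtic"
    return "mesokurtic"


# Hypothetical data sets
print(kurtosis_label([1, 2, 3, 4, 5]))                    # evenly spread, flat
print(kurtosis_label([0, 0, 0, 0, 0, 0, 0, 0, -10, 10]))  # peak with heavy tails
```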

Five Number Summary

Boxplot (“Box and Whiskers”): a schematic presentation of the sample median, the upper and lower sample quartiles, and the largest and smallest data observations within 1.5 IQR of the upper and lower quartiles.

- provides a representation of the shape of a data set.

* if the data set is symmetric, then the two whiskers are roughly equal and the median is roughly in the center of the box.

* if the data is skewed, then the whiskers are not of equal length and the median is not in the center of the box.

- outliers are often assumed to be any observation which is farther than 1.5 times the IQR from either end of the box. That is, it falls outside of the range from (QL - 1.5*IQR) to (QU + 1.5*IQR), where QL and QU are the lower and upper quartiles.

- extreme outlier = an observation more than 3*IQR from either end of the box.

- an outlier is not included as part of the whisker.
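The outlier rules above can be sketched as follows. The function name `classify_boxplot_points`, the dictionary keys, and the observations are hypothetical. The quartiles 2.6 and 4.7 are the ones used earlier in the chapter. Treating points between 1.5*IQR and 3*IQR from the box as ordinary ("mild") outliers is an assumption consistent with the two thresholds given.

```python
def classify_boxplot_points(values, q_lower, q_upper):
    """Split observations into whisker range, mild outliers, and extreme outliers."""
    iqr = q_upper - q_lower
    inner_low, inner_high = q_lower - 1.5 * iqr, q_upper + 1.5 * iqr
    outer_low, outer_high = q_lower - 3.0 * iqr, q_upper + 3.0 * iqr

    inside = [v for v in values if inner_low <= v <= inner_high]
    extreme = sorted(v for v in values if v < outer_low or v > outer_high)
    mild = sorted(
        v for v in values
        if outer_low <= v < inner_low or inner_high < v <= outer_high
    )
    # Whiskers end at the most extreme observations that are not outliers
    whiskers = (min(inside), max(inside)) if inside else None
    return {"whiskers": whiskers, "mild": mild, "extreme": extreme}


# Hypothetical observations; fences are about -0.55 and 7.85 (inner), -3.7 and 11.0 (outer)
result = classify_boxplot_points([0.9, 4.2, 5.0, 8.0, 12.0], 2.6, 4.7)
print(result)
```

With these inputs, 8.0 lies beyond the inner fence only, 12.0 lies beyond the outer fence, and the whiskers stop at 0.9 and 5.0.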

Box and Whiskers

Descriptive Statistics: MINITAB and EXCEL

Stem and Leaf and Box-Whisker Plot