Statistics - Measures of Central Tendency and Dispersion Class 2 Session 2
Total Page:16
File Type:pdf, Size:1020Kb
Central Tendency Variability Statistics - Measures of central tendency and dispersion Class 2 Session 2 Oscar BARRERA [email protected] January 31, 2018 Oscar BARRERA Statistics Intermediate Before starting Some important tips of notation Central Tendency Variability Measures of Central Tendency So you collected data, created a frequency distribution, made a graph...now what? I It would be nice to have a single value to represent a distribution a "descriptive statistic" Descriptive statistic I Is a value that describes or represents a set of data. I But what value from a distribution should be used as the representative? I Why is any one value in the data set better than another value? There are at least three characteristics you look for in a descriptive statistic to represent a set of data. Oscar BARRERA Statistics Intermediate Central Tendency Variability Measures of Central Tendency There are at least three characteristics you look for in a descriptive statistic to represent a set of data. 1. Represented: A good descriptive statistic should be similar to many scores in a distribution. (High frequency) 2. Well balanced: neither greater-than or less-than scores are overrepresented 3. Inclusive: Should take individual values from the distribution into account so no value is left out Oscar BARRERA Statistics Intermediate Central Tendency Variability Measures of Central Tendency In a normal distribution the most frequent scores cluster near the center and less frequent scores fall into the tails. Central tendency means most scores(68%) in a normally distributed set of data tend to cluster in the central tendency area. [Come back to characteristics!] Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode The mode (Mo) is the most frequently occurring score in a distribution. Example with a set of quiz scores (X): 10 10 10 9 9 9 9 8 8 8 8 8 7 7 7 7 7 7 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5 54444444433333221 Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode The mode (Mo) is the most frequently occurring score in a distribution. Example with a set of quiz scores (X): 10 10 10 9 9 9 9 8 8 8 8 8 7 7 7 7 7 7 6 6 6 6 6 6 6 5 5 5 5 5 5 5 5 54444444433333221 What is the Mo? I X = 5 has the greatest frequency (9); hence, the mode is 5. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode Limitations I Multi-modal: There can be more than one mode I Lack of Representativeness :It may not be a good representative of all values Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode Limitations I Multi-modal: There can be more than one mode Examples 10 10 9 9 9 9 9 9 8 8 7 7 6 6 5 5 4 4 4 4 4 4 3 3 2 2 1 1 Two most frequent scores, X = 9 and X = 4, I Or in case of a rectangular distribution 10 9 8 7 6 5 4 3 2 1 0 Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode Limitations I Lack of Representativeness :It may not be a good representative of all values Examples 10 10 10 10 10 10 10 9 8 7 6 5 4 3 2 1 0 The mode might be at one end of a distribution, not at the center Oscar BARRERA Statistics Intermediate Central Tendency Variability The Mode Limitations It is possible that the most frequent score has a frequency that is only one or two counts greater than the second most frequent score. In the first example above the mode was X = 5. However, X = 4 had a frequency of eight, which is only one less than the frequency of nine for X= 5. Why is 5 a better mode than 4? There is no answer. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median The median (Md) is the middle score of a distribution. I Half on the left half on the right (the 50th percentile.) I Better measure of central tendency than the mode since it balances perfectly distribution. How to find it? Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? Two simple steps 1. determine the median’s location 2. find the value at that location. I It differs whether you have an even or an odd number of scores. With ODD number of observations. For example: n=11 10 10 9 7 7 6 5 4 3 2 2 Use the following equations N + 1 Md = 2 Solving with N=11? Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? With ODD number of observations. For example: n=11 10 10 9 7 7 6 5 4 3 2 2 Use the following equations N + 1 Md = 2 Solving with N=11? 11 + 1 Md = = 6th 2 This indicates the median is located at the sixth position in the distribution. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? 11 + 1 Md = = 6th 2 This indicates the median is located at the sixth position in the distribution. I Order the data: rank-ordered from smallest to largest. I Count off six positions starting with the smallest value. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? Two simple steps 1. determine the median’s location 2. find the value at that location. I It differs whether you have an even or an odd number of scores. With EVEN number of observations. For example: n=12 20 19 18 16 15 14 12 11 11 11 10 9 You have to determine the two positions around the median N N + 2 Md = ; 2 2 Solving with N=11? Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? With EVEN number of observations. For example: n=12 20 19 18 16 15 14 12 11 11 11 10 9 Use the You have to determine the two positions around the median N N + 2 Md = ; 2 2 Solving with N=12? 12 12 + 2 Md = ; = Md = 6th; 7th 2 2 Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? With EVEN number of observations. For example: n=12 20 19 18 16 15 14 12 11 11 11 10 9 Use the You have to determine the two positions around the median 12 12 + 2 Md = ; = Md = 6th; 7th 2 2 Thus, the median is between sixth and seventh positions. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median How to find it? With EVEN number of observations. For example: n=12 You have to determine the two positions around the median 12 12 + 2 Md = ; = Md = 6th; 7th 2 2 Thus, the median is between sixth and seventh positions. The average of 12 and 14 is (14 + 12)= = 13, which is the median value. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median Limitations Some limitations of the Median: I The median does not take into account the actual values of the scores in a set of data. Example:take the following two sets of scores, each with n = 5 scores: Median is equal to 5: which is unaffected by the values greater and less than its value. Oscar BARRERA Statistics Intermediate Central Tendency Variability The Median Limitations Some limitations of the Median: I The median does not take into account the actual values of the scores in a set of data. Median is unaffected by the values greater and less than its value. Oscar BARRERA Statistics Intermediate Central Tendency Variability However: it is frequently reported as a measure of central tendency when data sets are incomplete or data are severely skewed. Example? Oscar BARRERA Statistics Intermediate Central Tendency Variability Quartiles Quartiles split a distribution into fourths or quarters of the distribution. I There are actually three quartile points in a distribution: I The 1st quartile (Q1) separates the lower 25% of the scores from the upper 75% of the scores; I the 2nd quartile (Q2) is the median and separates the lower 50% of the scores from the upper 50%; I and the 3rd quartile (Q3) separates the upper 25% of the scores from the lower 75%. Steps: Determining the quartile value is like determining the median. Oscar BARRERA Statistics Intermediate Central Tendency Variability Quartiles How to find it When determining the median, there are different procedures for determining the quartiles based on your having an even versus an odd number of scores. N + 1 I For the 1st quartile ODD Q = 1 4 N + 2 I For the 1st quartile EVEN Q = 1 4 3N + 3 I For the 3st quartile ODD Q = 3 4 3N + 2 I For the 3st quartile EVEN Q = 3 4 Oscar BARRERA Statistics Intermediate Central Tendency Variability Quartiles How to find it Example: ODD number. n=9 7 7 8 9 10 11 11 14 15 Then The locations of the first and third quartiles are: 9 + 1 Q = = 2; 5thpos 1 4 3 ∗ 9 + 3 Q = = 7:5thpos 3 4 Oscar BARRERA Statistics Intermediate Central Tendency Variability Quartiles How to find it Then The locations of the first and third quartiles are: 9 + 1 Q = = 2; 5thpos 1 4 3 ∗ 9 + 3 Q = = 7:5thpos 3 4 How to estimate Q1 and Q3 ? Oscar BARRERA Statistics Intermediate Central Tendency Variability Quartiles How to find it Then The locations of the first quartiles is: 9 + 1 Q = = 2; 5thpos 1 4 Q1 = 2:5 means the first quartile is the average of the values at the second and third positions.