Descriptive Statistics Frequency Distributions and Their Graphs
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 2 Descriptive Statistics § 2.1 Frequency Distributions and Their Graphs Frequency Distributions A frequency distribution is a table that shows classes or intervals of data with a count of the number in each class. The frequency f of a class is the number of data points in the class. Class Frequency, f 1 – 4 4 LowerUpper Class 5 – 8 5 Limits 9 – 12 3 Frequencies 13 – 16 4 17 – 20 2 Larson & Farber, Elementary Statistics: Picturing the World , 3e 3 1 Frequency Distributions The class width is the distance between lower (or upper) limits of consecutive classes. Class Frequency, f 1 – 4 4 5 – 1 = 4 5 – 8 5 9 – 5 = 4 9 – 12 3 13 – 9 = 4 13 – 16 4 17 – 13 = 4 17 – 20 2 The class width is 4. The range is the difference between the maximum and minimum data entries. Larson & Farber, Elementary Statistics: Picturing the World , 3e 4 Constructing a Frequency Distribution Guidelines 1. Decide on the number of classes to include. The number of classes should be between 5 and 20; otherwise, it may be difficult to detect any patterns. 2. Find the class width as follows. Determine the range of the data, divide the range by the number of classes, and round up to the next convenient number. 3. Find the class limits. You can use the minimum entry as the lower limit of the first class. To find the remaining lower limits, add the class width to the lower limit of the preceding class. Then find the upper class limits. 4. Make a tally mark for each data entry in the row of the appropriate class. 5. Count the tally marks to find the total frequency f for each class. Larson & Farber, Elementary Statistics: Picturing the World , 3e 5 Constructing a Frequency Distribution Example: The following data represents the ages of 30 students in a statistics class. Construct a frequency distribution that has five classes. Ages of Students 18 20 21 27 29 20 19 30 32 19 34 19 24 29 18 37 38 22 30 39 32 44 33 46 54 49 18 51 21 21 Continued. Larson & Farber, Elementary Statistics: Picturing the World , 3e 6 2 Constructing a Frequency Distribution Example continued: 1. The number of classes (5) is stated in the problem. 2. The minimum data entry is 18 and maximum entry is 54, so the range is 36. Divide the range by the number of classes to find the class width. Class width = 36 = 7.2 Round up to 8. 5 Continued. Larson & Farber, Elementary Statistics: Picturing the World , 3e 7 Constructing a Frequency Distribution Example continued: 3. The minimum data entry of 18 may be used for the lower limit of the first class. To find the lower class limits of the remaining classes, add the width (8) to each lower limit. The lower class limits are 18, 26, 34, 42, and 50. The upper class limits are 25, 33, 41, 49, and 57. 4. Make a tally mark for each data entry in the appropriate class. 5. The number of tally marks for a class is the frequency for that class. Continued. Larson & Farber, Elementary Statistics: Picturing the World , 3e 8 Constructing a Frequency Distribution Example continued: Number of Ages students Ages of Students Class Tally Frequency, f 18 – 25 13 26 – 33 8 34 – 41 4 42 – 49 3 Check that the 50 – 57 2 sum equals the number in the ∑f = 30 sample. Larson & Farber, Elementary Statistics: Picturing the World , 3e 9 3 Midpoint The midpoint of a class is the sum of the lower and upper limits of the class divided by two. The midpoint is sometimes called the class mark . (Lower class limit) + (Upper class limit) Midpoint = 2 Class Frequency, f Midpoint 1 – 4 4 2.5 1 + 4 5 Midpoint = = = 2.5 2 2 Larson & Farber, Elementary Statistics: Picturing the World , 3e 10 Midpoint Example : Find the midpoints for the “Ages of Students” frequency distribution. Ages of Students Class Frequency, f Midpoint 18 + 25 = 43 18 – 25 13 21.5 43 ÷ 2 = 21.5 26 – 33 8 29.5 34 – 41 4 37.5 42 – 49 3 45.5 50 – 57 2 53.5 ∑f = 30 Larson & Farber, Elementary Statistics: Picturing the World , 3e 11 Relative Frequency The relative frequency of a class is the portion or percentage of the data that falls in that class. To find the relative frequency of a class, divide the frequency f by the sample size n. Class frequency f Relative frequency = = Sample size n Relative Class Frequency, f Frequency 1 – 4 4 0.222 ∑f = 18 Relative frequency = f = 4 ≈ 0.222 n 18 Larson & Farber, Elementary Statistics: Picturing the World , 3e 12 4 Relative Frequency Example : Find the relative frequencies for the “Ages of Students” frequency distribution. Relative Portion of Class Frequency, f Frequency students 18 – 25 13 0.433 f 13 = 26 – 33 8 0.267 n 30 34 – 41 4 0.133 ≈ 0.433 42 – 49 3 0.1 50 – 57 2 0.067 f ∑f = 30 ∑ = 1 n Larson & Farber, Elementary Statistics: Picturing the World , 3e 13 Cumulative Frequency The cumulative frequency of a class is the sum of the frequency for that class and all the previous classes. Ages of Students Cumulative Class Frequency, f Frequency 18 – 25 13 13 26 – 33 + 8 21 34 – 41 + 4 25 42 – 49 + 3 28 Total number of 50 – 57+ 2 30 students ∑f = 30 Larson & Farber, Elementary Statistics: Picturing the World , 3e 14 Frequency Histogram A frequency histogram is a bar graph that represents the frequency distribution of a data set. 1. The horizontal scale is quantitative and measures the data values. 2. The vertical scale measures the frequencies of the classes. 3. Consecutive bars must touch. Class boundaries are the numbers that separate the classes without forming gaps between them. The horizontal scale of a histogram can be marked with either the class boundaries or the midpoints. Larson & Farber, Elementary Statistics: Picturing the World , 3e 15 5 Class Boundaries Example : Find the class boundaries for the “Ages of Students” frequency distribution. Ages of Students Class Class Frequency, f Boundaries The distance from the 18 – 25 13 17.5 − 25.5 upper limit of the first class to the lower limit 26 – 33 8 25.5 − 33.5 of the second class is 1. 34 – 41 4 33.5 − 41.5 42 – 49 3 41.5 − 49.5 Half this distance 50 – 57 2 49.5 − 57.5 is 0.5. ∑f = 30 Larson & Farber, Elementary Statistics: Picturing the World , 3e 16 Frequency Histogram Example : Draw a frequency histogram for the “Ages of Students” frequency distribution. Use the class boundaries. 14 13 Ages of Students 12 10 8 8 f 6 4 4 3 2 2 0 17.5 25.5 33.5 41.5 49.5 57.5 Broken axis Age (in years) Larson & Farber, Elementary Statistics: Picturing the World , 3e 17 Frequency Polygon A frequency polygon is a line graph that emphasizes the continuous change in frequencies. 14 Ages of Students 12 10 8 Line is extended to the x-axis. f 6 4 2 0 13.5 21.5 29.5 37.5 45.5 53.5 61.5 Broken axis Age (in years) Midpoints Larson & Farber, Elementary Statistics: Picturing the World , 3e 18 6 Relative Frequency Histogram A relative frequency histogram has the same shape and the same horizontal scale as the corresponding frequency histogram. 0.5 0.433 0.4 Ages of Students 0.3 0.267 0.2 0.133 0.1 Relative frequency 0.1 0.067 (portion students) of 0 17.5 25.5 33.5 41.5 49.5 57.5 Age (in years) Larson & Farber, Elementary Statistics: Picturing the World , 3e 19 Cumulative Frequency Graph A cumulative frequency graph or ogive , is a line graph that displays the cumulative frequency of each class at its upper class boundary. 30 Ages of Students 24 The graph ends at 18 the upper 12 boundary of the last class. 6 (portion students) of Cumulative frequency 0 17.5 25.5 33.5 41.5 49.5 57.5 Age (in years) Larson & Farber, Elementary Statistics: Picturing the World , 3e 20 § 2.2 More Graphs and Displays 7 Stem-and-Leaf Plot In a stem-and-leaf plot , each number is separated into a stem (usually the entry’s leftmost digits) and a leaf (usually the rightmost digit). This is an example of exploratory data analysis . Example : The following data represents the ages of 30 students in a statistics class. Display the data in a stem-and-leaf plot. Ages of Students 18 20 21 27 29 20 19 30 32 19 34 19 24 29 18 37 38 22 30 39 32 44 33 46 54 49 18 51 21 21 Continued. Larson & Farber, Elementary Statistics: Picturing the World , 3e 22 Stem-and-Leaf Plot Ages of Students Key: 1|8 = 18 1 8 8 8 9 9 9 2 0 0 1 1 1 2 4 7 9 9 Most of the values lie between 20 3 0 0 2 2 3 4 7 8 9 and 39. 4 4 6 9 5 1 4 This graph allows us to see the shape of the data as well as the actual values. Larson & Farber, Elementary Statistics: Picturing the World , 3e 23 Stem-and-Leaf Plot Example : Construct a stem-and-leaf plot that has two lines for each stem. Ages of Students 1 Key: 1|8 = 18 1 8 8 8 9 9 9 2 0 0 1 1 1 2 4 2 7 9 9 3 0 0 2 2 3 4 3 7 8 9 From this graph, we can conclude 4 4 that more than 50% of the data lie 4 6 9 between 20 and 34.