Boxplot Is a Visual Representation of the Data Using Min, Q1, Median, Q3, Max and Outliers

Ch2.3 Boxplot

A boxplot is a visual representation of the data using the minimum, Q1, median, Q3, maximum, and outliers.

Ex. (Boxplot without outliers): In the rainfall example, the 5-number summary is (Min, Q1, Med, Q3, Max) = (1, 3.45, 4.6, 5.4, 8.2).

[Figure: boxplot of the rainfall example]

Ex. (Boxplot with outliers): Consider a different dataset whose 5-number summary is (Min, Q1, Med, Q3, Max) = (-3, 3.4, 4.6, 5.4, 9.0).
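As a rough check of why this summary implies outliers, the sketch below applies the 1.5 IQR rule (the rule used in the construction steps of this section) to the five numbers alone. The variable names are illustrative and not from the notes. Because the interior data points are not given, only the minimum and maximum can be tested.

```python
# 5-number summary from the example: (Min, Q1, Med, Q3, Max)
minimum, q1, med, q3, maximum = (-3, 3.4, 4.6, 5.4, 9.0)

# 1.5 IQR rule: points beyond the fences are flagged as outliers
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Only the extremes are known, so only they can be checked here
has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print("IQR:", round(iqr, 2))
print("Fences:", round(lower_fence, 2), round(upper_fence, 2))
print("Min is an outlier:", has_low_outlier)
print("Max is an outlier:", has_high_outlier)
```

Under this rule the fences come out at roughly 0.4 and 8.4, so both the minimum (-3) and the maximum (9.0) would be plotted as individual points rather than as whisker ends.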

Construct a Boxplot:

Ex. Age of the “best actor” Oscar winner (1991-2000): 55, 42, 37, 38, 31, 45, 60, 46, 40, 36

Steps:

(1) Sort the data: 31 36 37 38 40 42 45 46 55 60

(2) Find the 5-number summary: min = 31, Q1 = 37, median = 41, Q3 = 46, max = 60

(3) Identify outliers using the 1.5 IQR rule:
IQR = 46 - 37 = 9
1.5*IQR = 13.5
Q1 - 1.5*IQR = 37 - 13.5 = 23.5 (no data points are below 23.5)
Q3 + 1.5*IQR = 46 + 13.5 = 59.5 (one data point, 60, is above 59.5, so 60 is an outlier)

(4) Draw a central box spanning the quartiles (from Q1 to Q3)

(5) Draw a line in the box marking the median

(6) Outliers are plotted individually

(7) Draw lines (whiskers) from the box out to the smallest and the largest observations that are not outliers (here 31 and 55)

Note: The boxplot is the left one on the next page.
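The numerical part of the steps above can be sketched in a few lines of Python. This is a minimal illustration, not a standard routine: the helper names `five_number_summary` and `iqr_fences` are made up here, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, an assumption that reproduces Q1 = 37 and Q3 = 46 for this example. Library functions may use other quartile conventions and give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)                      # step (1): sort the data
    n = len(data)
    half = n // 2
    lower = data[:half]                        # lower half of the data
    # For odd n, the middle value is left out of both halves
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return data[0], median(lower), median(data), median(upper), data[-1]

def iqr_fences(q1, q3, k=1.5):
    """Return the lower and upper cutoffs of the k*IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

ages = [55, 42, 37, 38, 31, 45, 60, 46, 40, 36]

summary = five_number_summary(ages)            # step (2)
low, high = iqr_fences(summary[1], summary[3]) # step (3)
outliers = [x for x in ages if x < low or x > high]

# Step (7): whiskers end at the extreme observations that are not outliers
inside = [x for x in ages if low <= x <= high]
whiskers = (min(inside), max(inside))

print("5-number summary:", summary)
print("Fences:", low, high)
print("Outliers:", outliers)
print("Whisker ends:", whiskers)
```

For the actor ages this gives fences of 23.5 and 59.5, the single outlier 60, and whiskers ending at 31 and 55, matching the hand calculation.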

Comment: Boxplots are very useful when presented side by side for comparison.

Ex. Side-by-side boxplots of the ages of the best actor (1) and the best actress (2) Oscar winners in the past 10 years

Shape: In both boxplots, Q1 is a little closer to the median (center) than Q3 is, and the minimum is closer to the center than the maximum, so both distributions are slightly skewed to the right (positively skewed).

Center: The center (median) for actors is about 7 years higher than that for actresses.

Spread: Both distributions have similar spread.

Outliers: There is one outlier in the actor data and none in the actress data.

Overall: The shapes of both distributions look similar. We can roughly view the actor distribution as the actress distribution shifted upward by about 7 years.
