AP Statistics Unit 1 Review Exploratory Data Analysis
What You Should Have Learned
Please review pages 104 & 161. They provide a review list of the most important skills you should have acquired from your study of chapters 1 & 2.
A. Chapter 1 – A, B, D, E, F, G
B. Chapter 2 – A, B, C, D
Practice Problems
1. In 1798 the English scientist Henry Cavendish measured the density of the earth by careful work with a torsion balance. The variable recorded was the density of the earth as a multiple of the density of water. Here are the 29 measurements:
5.50 5.61 4.88 5.07 5.26 5.55 5.36 5.29 5.58 5.65
5.57 5.53 5.62 5.29 5.44 5.34 5.79 5.10 5.27 5.39
5.42 5.47 5.63 5.34 5.46 5.30 5.75 5.68 5.85
Present these measurements graphically as a stemplot. Discuss the features of the distribution. Are there any outliers? What is your estimate of the density of the earth based on these measurements?
Be sure to state the context of the distribution FIRST!
Shape ~ Roughly Symmetrical
Outliers ~ None
Center ~ mean 5.448 (median 5.46)
Spread ~ s = 0.221 (values range from 4.88 to 5.85)
Density of the earth (Estimate) ~ approximately 5.45 times the density of water
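The summary above can be checked with a short script. This is a minimal sketch using only the Python standard library; the helper names `stemplot` and `tukey_quartiles` are made up for illustration, and the quartiles are taken as medians of the lower and upper halves of the sorted data (one common textbook convention, which other software may not follow exactly).

```python
import statistics
from collections import defaultdict

# The 29 Cavendish measurements from Problem 1.
density = [
    5.50, 5.61, 4.88, 5.07, 5.26, 5.55, 5.36, 5.29, 5.58, 5.65,
    5.57, 5.53, 5.62, 5.29, 5.44, 5.34, 5.79, 5.10, 5.27, 5.39,
    5.42, 5.47, 5.63, 5.34, 5.46, 5.30, 5.75, 5.68, 5.85,
]

def stemplot(values):
    """Return stemplot lines: stem = ones and tenths digits, leaf = hundredths digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        hundredths = round(v * 100)          # 5.26 -> 526
        leaves[hundredths // 10].append(hundredths % 10)
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {row}")
    return lines

def tukey_quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (median excluded if n is odd)."""
    s = sorted(values)
    half = len(s) // 2
    return statistics.median(s[:half]), statistics.median(s[-half:])

mean = statistics.mean(density)
median = statistics.median(density)
s = statistics.stdev(density)                # sample standard deviation
q1, q3 = tukey_quartiles(density)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in density if x < low_fence or x > high_fence]

print("\n".join(stemplot(density)))
print(f"mean={mean:.3f} median={median:.2f} s={s:.3f} outliers={outliers}")
```

In the printed stemplot a stem of 48 with leaf 8 stands for 4.88. The lowest value sits just inside the lower fence, which is why no outliers are flagged.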
2. Different varieties of the tropical flower Heliconia are fertilized by different species of hummingbirds. Over time, the lengths of the flowers and the forms of the hummingbirds’ beaks have evolved to match each other. Here are the data on the lengths in millimeters of three varieties of these flowers on the island of Dominica.
H. bihai 47.12 46.75 46.80 47.12 46.67 47.43 46.44 46.64 48.07 48.34 48.15 50.26 50.12 46.34 46.94 48.36
H. caribaea red 41.90 42.01 41.93 43.09 41.47 41.69 39.78 40.57 39.63 42.18 40.66 37.87 39.16 37.40 38.20 38.07 38.10 37.97 38.79 38.23 38.87 37.78 38.01
H. caribaea yellow 36.78 37.02 36.52 36.11 36.03 35.45 38.13 37.10 35.17 36.82 36.66 35.68 36.03 34.57 34.63
a. Make boxplots to compare the three distributions. Report the five number summaries along with the graphs. What are the most important differences among the three varieties of flowers?
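The most important difference among the three distributions is center: H. bihai flowers are the longest, H. caribaea red are intermediate, and H. caribaea yellow are the shortest, with little overlap among the three boxplots.
b. Find x-bar and s for each variety.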
Variety              mean      s
H. bihai             47.597    1.213
H. caribaea red      39.711    1.799
H. caribaea yellow   36.18     0.975
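The means and standard deviations for part b can be reproduced as follows. This is a small sketch with the standard library only; the dictionary name `flowers` is just an illustrative choice, and `statistics.stdev` computes the sample standard deviation (dividing by n - 1), which is what s denotes here.

```python
import statistics

# Flower lengths in millimeters, copied from Problem 2.
flowers = {
    "H. bihai": [
        47.12, 46.75, 46.80, 47.12, 46.67, 47.43, 46.44, 46.64,
        48.07, 48.34, 48.15, 50.26, 50.12, 46.34, 46.94, 48.36,
    ],
    "H. caribaea red": [
        41.90, 42.01, 41.93, 43.09, 41.47, 41.69, 39.78, 40.57,
        39.63, 42.18, 40.66, 37.87, 39.16, 37.40, 38.20, 38.07,
        38.10, 37.97, 38.79, 38.23, 38.87, 37.78, 38.01,
    ],
    "H. caribaea yellow": [
        36.78, 37.02, 36.52, 36.11, 36.03, 35.45, 38.13, 37.10,
        35.17, 36.82, 36.66, 35.68, 36.03, 34.57, 34.63,
    ],
}

# Map each variety to (mean, sample standard deviation).
summary = {
    name: (statistics.mean(lengths), statistics.stdev(lengths))
    for name, lengths in flowers.items()
}

for name, (xbar, s) in summary.items():
    print(f"{name:20s} mean = {xbar:.3f}  s = {s:.3f}")
```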
c. Make a stemplot of each set of flower lengths. Do the distributions appear suitable for use of x-bar and s as summaries?
x-bar and s are not suitable for the first two distributions because their stemplots show skewness. The third distribution is roughly symmetric, so it can be summarized using x-bar and s.
d. Starting from the x-bar and s values in millimeters, find the means and standard deviations in inches. (A millimeter is 1/1000 of a meter. A meter is 39.37 inches.)
Multiply each x-bar and each s by 39.37/1000 = 0.03937.
Variety              mean      s
H. bihai             1.874     0.0478
H. caribaea red      1.563     0.0708
H. caribaea yellow   1.424     0.0384
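Changing units is a linear transformation, so both the mean and the standard deviation are simply multiplied by the conversion factor. The sketch below assumes the standard conversion of 39.37 inches per meter; the names `summaries_mm` and `summaries_in` are illustrative, and the millimeter values are the rounded results from part b.

```python
MM_PER_METER = 1000
INCHES_PER_METER = 39.37                       # standard conversion factor (assumed here)
INCHES_PER_MM = INCHES_PER_METER / MM_PER_METER

# (mean, s) in millimeters, taken from part b.
summaries_mm = {
    "H. bihai": (47.597, 1.213),
    "H. caribaea red": (39.711, 1.799),
    "H. caribaea yellow": (36.18, 0.975),
}

# Multiplying every observation by a constant multiplies the mean and s by that constant.
summaries_in = {
    name: (xbar * INCHES_PER_MM, s * INCHES_PER_MM)
    for name, (xbar, s) in summaries_mm.items()
}

for name, (xbar, s) in summaries_in.items():
    print(f"{name:20s} mean = {xbar:.3f} in  s = {s:.4f} in")
```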
3. Mrs. Causey asked her students how much time they had spent using a computer during the previous week.
a. Construct a relative frequency table based on the ogive. Then make a histogram.
b. Estimate the median, Q1, and Q3 from the ogive. Then make a boxplot. Are there any outliers?
c. At what percentile does a student who used her computer for 10 hours last week fall?
b. median ~ 5; Q1 ~ 2.5; Q3 ~ 11
Yes, there are outliers. By the 1.5 x IQR criterion, any value above Q3 + 1.5(Q3 - Q1) = 11 + 1.5(11 - 2.5) = 23.75 hours is an outlier.
4. A study of the size of jury awards in civil cases (such as injury, product liability, and medical malpractice) in Chicago showed that the median award was about $8000. But the mean award was about $69,000. Explain how a difference this big between the two measures of center can occur.
For the mean to be so much larger than the median, a few juries must have awarded very large settlements (high outliers), making the distribution strongly skewed right. The mean is not resistant, so those few large awards pull it far above the median, which is barely affected by them.
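The effect is easy to see with a small made-up data set. The awards below are hypothetical numbers chosen only for illustration (they are not the Chicago data): most are modest, and one is very large.

```python
import statistics

# HYPOTHETICAL awards in dollars, invented for illustration only.
awards = [2_000, 5_000, 8_000, 8_000, 10_000, 12_000, 1_500_000]

median_award = statistics.median(awards)   # middle value, resistant to the extreme award
mean_award = statistics.mean(awards)       # pulled upward by the single huge award

print(f"median = {median_award:,.0f}  mean = {mean_award:,.0f}")
```

Dropping the one extreme award would leave the median nearly unchanged while the mean would fall sharply, which is the sense in which the median is resistant and the mean is not.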
5. The scores of a reference population on the Wechsler Intelligence Scale for Children (WISC) are normally distributed with µ = 100 and σ = 15. A school district classified children as “gifted” if their WISC score exceeded 135. There are 1300 sixth-graders in the school district. About how many of them are gifted? Show all of your work.
I. Sketch the graph.
II. z = (135 – 100)/15 = 2.33
III. Probability = 1 – 0.9901 = 0.0099
IV. 1300 x 0.0099 = 12.87, so approximately 13 students are gifted.
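The same calculation can be done without a table. This sketch uses `statistics.NormalDist` from the Python standard library; because it does not round z to 2.33 first, the tail probability comes out slightly different from the table value, but the final count is the same.

```python
from statistics import NormalDist

MU, SIGMA = 100, 15          # WISC reference population
CUTOFF = 135                 # "gifted" threshold
N_STUDENTS = 1300

z = (CUTOFF - MU) / SIGMA                             # standardize the cutoff
p_gifted = 1 - NormalDist(MU, SIGMA).cdf(CUTOFF)      # area to the right of 135
expected_gifted = N_STUDENTS * p_gifted

print(f"z = {z:.2f}, P(score > 135) = {p_gifted:.4f}, "
      f"expected gifted = {expected_gifted:.1f}")
```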
6. Scores on the ACT test for the 2004 high school graduating class had mean 20.9 and standard deviation 4.8. In all, 1,171,460 students in this class took the test, and 1,052,490 of them had scores of 27 or lower. If the distribution of scores were Normal, what percent of scores would be 27 or lower? What percent of the actual scores were 27 or lower? Does the Normal distribution describe the actual data well?
I. Sketch the graph.
II. z = (27 – 20.9)/4.8 = 1.27
III. Probability = 0.8980, so about 89.80% of scores would be 27 or lower under a Normal model.
Percent of actual observations at or below 27 = 1,052,490/1,171,460 = 0.8984 = 89.84%
Yes, the Normal distribution describes the actual data well.
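A quick way to compare the Normal model with the actual counts is sketched below, again with `statistics.NormalDist`. The variable names are illustrative, and the software value differs from the table value only in the fourth decimal place because z is not rounded.

```python
from statistics import NormalDist

MEAN, SD = 20.9, 4.8
TOTAL_TAKERS = 1_171_460
AT_OR_BELOW_27 = 1_052_490

normal_prop = NormalDist(MEAN, SD).cdf(27)        # proportion predicted by the Normal model
actual_prop = AT_OR_BELOW_27 / TOTAL_TAKERS       # proportion observed in the data

print(f"Normal model: {normal_prop:.4f}  actual: {actual_prop:.4f}  "
      f"difference: {abs(normal_prop - actual_prop):.4f}")
```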
7. Output from Minitab software describing the distribution of monthly returns from Wal-Mart stock.
a. Give the five number summary for the monthly returns on Wal-Mart stock.
Five-number Summary: -34.04255, -2.950258, 3.4691, 8.4511, 58.67769
b. Describe in words the main features of the distribution.
Shape ~ Roughly Symmetrical
Center ~ 3.064
Spread ~ -34.04 to 58.68, s = 11.49
c. Find the interquartile range (IQR) for the Wal-Mart data. Are there any outliers according to the 1.5 x IQR criterion? Does it appear to you that the software uses this criterion in choosing which observations to report separately as outliers?
IQR = 8.4511 - (-2.950258) = 11.401
Yes, there are outliers: the minimum and the maximum both fall outside the 1.5 x IQR fences. Yes, the software uses the criterion for identifying outliers.
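The 1.5 x IQR check can be carried out directly from the five-number summary in part a. This is a minimal sketch; with only the summary available it can test the minimum and maximum but cannot list every individual outlier.

```python
# Five-number summary of the monthly returns, from part a.
minimum, q1, median, q3, maximum = (-34.04255, -2.950258, 3.4691, 8.4511, 58.67769)

iqr = q3 - q1
low_fence = q1 - 1.5 * iqr       # values below this are flagged as outliers
high_fence = q3 + 1.5 * iqr      # values above this are flagged as outliers

min_is_outlier = minimum < low_fence
max_is_outlier = maximum > high_fence

print(f"IQR = {iqr:.3f}, fences = ({low_fence:.3f}, {high_fence:.3f})")
print(f"minimum flagged: {min_is_outlier}, maximum flagged: {max_is_outlier}")
```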
8. Joey received a report that he scored in the 97th percentile on a national standardized reading test but in the 72nd percentile on the math portion. a. Explain to Joey’s grandmother, who knows no statistics, what these numbers mean.
97% of the students who took the standardized reading test scored at or below Joey’s score. 72% of the students who took the standardized math test scored at or below Joey’s score.
b. Can we determine Joey’s z-scores for his reading and math performance? Why or why not?
No, we cannot. Converting a percentile to a z-score requires knowing the shape of the distribution, and no information is given regarding the Normality of the test scores.
9. A certain density curve consists of a straight-line segment that begins at the origin, (0,0) and has slope 1.
a. Sketch the density curve. What are the coordinates of the right endpoint of the segment?
Sketch the curve. The right endpoint is (√2, √2).
b. Determine the median, Q1, and Q3.
Median ~ 1; Q1 ~ √2/2 = 0.707; Q3 ~√6/2 = 1.225
c. Relative to the median, where would you expect the mean of this distribution to be located?
The mean will be slightly below the median, because the distribution is skewed left.
d. What percent of observations lie below 0.5? Above 1.5?
12.5%; 0%
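The answers to Problem 9 all follow from the area under the line y = x, which is x^2/2 for x between 0 and the right endpoint. The sketch below encodes that area as a cumulative function and inverts it to get quartiles; the function names `cdf` and `quantile` are illustrative.

```python
import math

RIGHT_END = math.sqrt(2)     # total area RIGHT_END**2 / 2 must equal 1

def cdf(x):
    """Area under the density y = x to the left of x (a triangle of area x^2 / 2)."""
    x = min(max(x, 0.0), RIGHT_END)      # no area outside [0, sqrt(2)]
    return x * x / 2

def quantile(p):
    """Value with area p to its left: solve x^2 / 2 = p."""
    return math.sqrt(2 * p)

q1, median, q3 = quantile(0.25), quantile(0.50), quantile(0.75)
mean = RIGHT_END ** 3 / 3                # integral of x * x from 0 to sqrt(2)
below_half = cdf(0.5)                    # proportion of observations below 0.5
above_1_5 = 1 - cdf(1.5)                 # 1.5 lies beyond the right endpoint

print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}, mean = {mean:.3f}")
print(f"below 0.5: {below_half:.1%}, above 1.5: {above_1_5:.1%}")
```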
10. The following figure displays Normal probability plots for four different sets of data. Describe what each plot tells you about the Normality of the given data set.
a. Approximately Normal Distribution
b. Normal Distribution
c. Not Normal ~ Skewed Right Distribution
d. Not Normal ~ Bimodal Distribution