AP Statistics Semester I Final Exam Review

1. Jerry and George are playing golf. Their scores on the first 9 holes are shown in the table below. For each player, find the mean, median, and mode score. Use your results to explain which measure of central tendency gives the best comparison of the abilities of the two players.

Player/Hole   1  2  3  4  5  6  7  8   9  Total  Mean  Median  Mode
Jerry         4  7  5  2  4  7  3  6   7     45
George        3  5  4  2  3  7  3  5  16     48

2. The histograms below represent the distributions of the batting averages of the top National League and American League batters for the years shown. Examine the histograms and write a one-paragraph observation about the data. Be sure to discuss measures such as center and spread. Also, compare the three distributions, noting any apparent trends.

3. Below are the winning men’s long jump distances for the first 22 Olympic Games.

Year  Winner              Meters    Year  Winner           Meters
1896  Ellery Clark        6.35      1952  Jerome Biffle    7.57
1900  Alvin Kraenzlin     7.18      1956  Gregory Bell     7.83
1904  Myer Prinstein      7.34      1960  Ralph Boston     8.12
1908  Francis Irons       7.48      1964  Lynn Davies      8.07
1912  Albert Gutterson    7.60      1968  Robert Beamon    8.90
1920  William Pettersson  7.15      1972  Randy Williams   8.24
1924  De Hart Hubbard     7.44      1976  Arnie Robinson   8.35
1928  Edward Hamm         7.73      1980  Lutz Dombrowski  8.54
1932  Edward Gordon       7.63      1984  Carl Lewis       8.54
1936  Jesse Owens         8.06      1988  Carl Lewis       8.72
1948  William Steele      7.82      1992  Carl Lewis       8.67

a) Create a stem-and-leaf plot of the data. Be sure to add a legend or key.
b) Now plot the same data with a time plot from 1896 to 1992. Label appropriately.
c) Write a one-paragraph conclusion about these plots. Compare and contrast the two plots, noting any advantages one plot has over the other for analyzing the data.

4. The scores of the winning teams in the Rose Bowl games played from 1931 to 1980 are arranged in ascending order in the list below.

 7  7  7  7  9 10 10 13 13 14
14 14 14 14 14 17 17 17 17 17
17 18 20 20 20 21 21 21 21 23
24 25 27 27 27 28 29 29 34 34
35 35 38 40 42 42 42 44 45 49

a) Create a histogram of the data, with bars of width 5 (1-5, 6-10, 11-15, 16-20, etc.).
b) Find the 5-number summary of the data and use it to construct a box plot.

MIN    Q1    MEDIAN    Q3    MAX

c) Describe the shape of the data from the histogram. Explain how it is usually possible to know the shape of the histogram simply by looking at the box plot.
d) Is the highest value in this data an outlier? Use a formula to justify your response.

5. Read the problem description below left:
a) Which curve do you think shows the weights of newly minted quarters, which curve the coins after five years, and which curve the coins after ten years?
b) What happens to the average weight of the coins as time passes?
c) What happens to the standard deviation of the weight of the coins as time passes?

6. Lengths of pregnancy of women having children are normally distributed, with a mean of 266 days and a standard deviation of 16 days.
a) Sketch a normal curve from this information, labeling the mean and 3 standard deviations in both directions. Use your sketch to answer questions b-d.
b) What percentage of pregnancies last between 250 and 282 days? __________
c) What percentage of pregnancies last between 250 and 298 days? __________
d) What percentage of pregnancies last between 234 and 266 days? __________
e) A letter once appeared in Dear Abby’s newspaper column from a woman who said she had been pregnant for 310 days before giving birth to her baby. This is considerably longer than the typical pregnancy. What percentage of pregnancies last 310 or more days? Show a sketch and all necessary work.

7. Consider the following data, which give the weight (in thousands of pounds) x and gasoline mileage (miles per gallon) y for ten different automobiles.

x  2.5  3.0  4.0  3.5  2.7  4.5  3.8  2.9  5.0  2.2
y   40   43   30   35   42   19   32   39   15   14

a) Use your calculator to find the least-squares regression equation, r, and r² for this data. Also use your calculator to make a scatter plot of the data.

Reg. Eq.: ___________________________________ r: _____________

b) Using your results from part (a), describe the strength of the linear relationship between these two variables.
c) Use the results from part (a) to complete the table below (round to the nearest tenth).

Weight (x)   Actual MPG (y)   Predicted MPG (ŷ)   Residual
3.0          43
4.0          30
5.0          15

d) Use your calculator to create a residual plot.
e) What does this residual plot tell us? Explain in a few sentences.

8. Answer TRUE if the statement is always true. If the statement is not always true, replace the word(s) in bold with words that make the statement always true.
a) Correlation analysis is a method of obtaining the equation that represents the relationship between two variables.
b) The linear correlation coefficient is used to determine the equation that represents the relationship between two variables.
c) A correlation coefficient of zero means that the two variables are perfectly correlated.
d) Whenever the slope of the regression line is zero, the correlation coefficient will also be zero.
e) When r is positive, the slope will always be negative.
f) The slope of the regression line represents the amount of change expected to take place in y when x increases by one unit.
g) The calculated value of r² represents the fraction of variation in the y variable which can be explained by the linear model.
h) Correlation coefficients range between 0 and 1.
i) The y variable is called the explanatory variable.
j) The line of best fit is used to predict the average value of y that can be expected to occur for a given value of x.
k) If a point has a residual of zero, then it lies on the line of best fit.
l) If a point has a negative residual, then it lies above the line of best fit.

9.
An issue of the school newspaper at a university reported the results of a survey conducted by the office of student development on the percent of students who use their student government representative to convey their feelings on university issues.
a) What is the population of interest in this survey?
b) In addition to recording whether the respondent uses the representative, what other variables do you think should be measured?
c) Explain in some detail how you would take a sample for this survey.

Identify the sampling method used in each of the following:
d) Personal interviews of students leaving the cafeteria after lunch.
e) Sending a survey to the first name on each page of the student directory.
f) Asking students to respond to a survey in an issue of the school newspaper.

10. A clothing manufacturer would like to compare the durability of a newly designed line of children’s clothes with that of its existing line of clothing. To do so, the company will conduct an experiment using sets of identical twins. One of each set of twins will wear the old clothing and the other one will wear the new clothing. The children will then be allowed to play for a period of time, after which the clothing will be evaluated for durability.
a) Why is this an experiment?
b) What type of experiment is this?
c) Describe in detail how you would utilize the 3 Principles of Experimental Design to ensure that the results of this experiment are trustworthy.
• CONTROL
• RANDOMIZATION
• REPLICATION

11. You roll two 6-sided dice. The first die has two 1’s, two 2’s, and two 3’s on it. The other has three 2’s and three 3’s. You will roll these dice and find the sum.
a) Show all possible outcomes for these two dice when rolled together.
b) Make a probability distribution table for the sum of these two dice.
c) What is the probability that the sum is odd?
d) What is the probability that the sum is a multiple of 3?
e) What is the mean value of this discrete random variable?

12.
A bag contains 5 marbles (3 are red and 2 are blue). You draw a marble out of the bag at random and replace it. Let X = the number of marbles drawn until you get a red marble.
a) Explain why X has a Geometric Distribution.

Find the probability that...
b) It takes 4 draws to get the first red marble.
c) It takes 4 or fewer draws to get the first red marble.
d) It takes more than 3 draws to get the first red marble.
e) On average, how many draws would we expect to make to get the first red?

13. A new television show has a 20% chance of being successful. Assume that NBC will introduce eight new shows this spring. Let X = the number of those shows that will succeed.