1. What Type of Data Do We Have?

MS & IS 200.21 Final Exam

A casino manager is interested in determining if a roulette wheel is balanced. For those unfamiliar with roulette, there are 38 spaces (18 red spaces, 18 black spaces, and two green spaces) on the wheel, and players have the opportunity to bet on a number of outcomes, one being the individual number. A truly balanced wheel should produce outcomes that resemble theoretically equal frequencies (i.e., each number should have a frequency of 1/38, which is about 2.6%; the questions below use 2.5%). We are interested in the number of times the number 15 appears (vs. "non-15" occurrences) (questions 1-4).

1. What type of data do we have? A. Numerical B. Categorical C. Continuous D. Statistical

2. After 100 spins, the number 15 appeared 5 times. Is there a cause for alarm? A. Yes, the wheel is unbalanced B. No, there is the possibility of sampling error. C. Yes, 15 always happens at most 3 times per 100 D. No, theoretical probability does not apply to gaming. We should use empirical probabilities.

3. What would be the appropriate null hypothesis to determine if the wheel is balanced? A. H0: X̄1 = 2.5 B. H0: Π = 15 C. H0: μ = 15 D. H0: Π = 2.5

4. What would be the appropriate test statistic to determine if the wheel is balanced? A. One sample t test of the mean B. One sample t test for proportion C. One sample z test of the mean D. One sample z test for proportion
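As a rough check on the reasoning in questions 2-4, the following Python sketch runs a one-sample z test for a proportion on the observed counts (5 appearances in 100 spins). It assumes the null value of 2.5% used in the answer choices; the function name is illustrative, and with an expected count of only 2.5 the normal approximation is crude.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_for_proportion(successes, n, p0):
    """Return (z, two-sided p-value) for H0: population proportion = p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumed inputs: 5 hits on the number 15 in 100 spins, null proportion 0.025
z, p = one_sample_z_for_proportion(successes=5, n=100, p0=0.025)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the statistic stays below the usual two-tailed cutoff of 1.96, which is consistent with attributing the result to sampling error.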

5. Compared to a t-distribution, a Z-distribution has: A. A larger mean, depending on the df of the t-distribution B. An equal mean, regardless of the df for the t-distribution C. A larger standard deviation D. The same standard deviation

6. Parameter is to statistic as: A. Sample is to Population B. Observed is to Unobserved C. Population is to Sample D. Estimate is to Actual

7. If the variance of a distribution is 0, the range is 0 A. True B. False

8. A distribution with mode < median < mean is negatively skewed? A. True B. False

Last year's final produced a distribution with the following: mean of 80, σ² of 25 (assumed to be normal) (questions 9-14).

9. What would be your standard score if you received a 75? A. -2 B. -1 C. 0 D. 1

10. What is the approximate proportion of people who scored over 90? A. 40% B. 2% C. 98% D. 10%

11. What is the probability that we can select someone who scored between 75 and 85? A. 100% B. 50% C. 68% D. 25%

12. What is the probability that we would select someone below 75 AND above 85? A. 31.8% B. 0% C. 2.5% D. 68.13%

13. If we took a sample of 100, what is the probability that we would observe a mean less than 79.3? A. 8% B. 18% C. 92% D. 98%

14. Suppose we sampled 25 students from this year. The mean was measured at 77. What is a 90% confidence interval for the true population mean? A. 75.3, 78.7 B. 68.5, 85.5 C. 76.6, 77.3 D. 75.6, 78.3
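The normal-curve arithmetic behind questions 9-13 can be sketched with Python's standard library. This is only an illustration under the stated setup (mean 80, variance 25, normal scores); the variable names are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

mean, variance = 80, 25
sd = sqrt(variance)  # standard deviation is 5, not 25
scores = NormalDist(mu=mean, sigma=sd)

# Standard score for a raw score of 75
z_75 = (75 - mean) / sd

# Proportion scoring above 90
p_over_90 = 1 - scores.cdf(90)

# Probability of a score between 75 and 85 (within one standard deviation)
p_75_to_85 = scores.cdf(85) - scores.cdf(75)

# Sampling distribution of the mean for n = 100: standard error = sd / sqrt(n)
sample_means = NormalDist(mu=mean, sigma=sd / sqrt(100))
p_mean_below_79_3 = sample_means.cdf(79.3)

print(z_75, round(p_over_90, 3), round(p_75_to_85, 3), round(p_mean_below_79_3, 3))
```

Note that the sample-mean calculation uses the standard error (0.5), not the population standard deviation (5).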

15. The correlation between "Amount spent on shoes within the last year" and "Annual salary" is the multiplicative inverse of the correlation between "Annual salary" and "Amount spent on shoes within the last year." A. True B. False

16. If the range of a distribution is 0, then the standard deviation of the same distribution is also 0. A. True B. False

17. A sampling distribution created with "n = 10" has a larger variance than a sampling distribution created with the same original variable with "n = 30." A. True B. False

18. If Θ is equal to a value less than X̄, then Σ (Xi - Θ) < Σ (Xi - X̄) A. True B. False

19. If we multiply the variance of a variable (call it "Y") by one less than the number in the sample, we would have: A. SST B. SSR C. SSE D. SSY

20. A sampling distribution created with "n = 10" has a mean smaller than a sampling distribution created from the same original variable with "n = 30." A. False B. True

A researcher read the literature pertaining to the number of movies watched in a movie theatre during a certain year in college. Through her investigation, she believed that neither upperclassmen nor underclassmen watched more movies. She decided to test this belief by sampling 23 upperclassmen and 19 underclassmen. The statistics appear below (use for questions 21-27):

Upperclassmen: mean of 4.3 movies (stdev of 1.1)
Underclassmen: mean of 6.6 movies (stdev of 1.7)

21. The researcher's belief before she collected data constitutes: A. Research hypothesis B. Null hypothesis C. Alternative hypothesis D. Conclusion

22. These are the variables and classifications for this study. A. Class standing (numerical); number of movies (categorical) B. Upperclassmen (categorical); underclassmen (categorical) C. Upperclassmen (categorical); underclassmen (categorical); movies (numerical) D. Class standing (categorical); number of movies (numerical)

23. If the researcher wanted to use hypothesis testing, what would be the appropriate null hypothesis?

A. H0: μ1 = μ2 B. H0: Π1 = Π2 C. H0: X̄1 = X̄2 D. H0: Π = 6.6

24. Which is the appropriate test statistic? A. Paired sample Z test B. Regression analysis C. Independent samples t test for mean difference D. Independent samples Z test for mean differences

25. If α = .05, what would be the appropriate critical value (two tail alternative)? A. ±2.021 B. ±1.68 C. ±1.96 D. ±2.009

26. Instead of a test statistic, the researcher wanted to use a confidence interval (and be as "precise" as she was when she used a test statistic). What is the level of confidence? A. 5% B. 95% C. 98% D. 99%

27. What is the value of the point estimate? A. -2.3 B. 0 C. 4.3 D. 6.6
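For the movie study, one way to see how the point estimate, degrees of freedom, and test statistic fit together is the Python sketch below. It assumes an independent-samples t test with pooled variance (an assumption, since the exam does not state the formula); `pooled_t` is a hypothetical helper, and the critical value itself would still come from a t table.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an independent-samples t test with pooled variance."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Group 1 = upperclassmen, group 2 = underclassmen (ordering is an assumption)
point_estimate = 4.3 - 6.6
t_stat, df = pooled_t(4.3, 1.1, 23, 6.6, 1.7, 19)
print(f"difference = {point_estimate:.1f}, t = {t_stat:.2f}, df = {df}")
```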

28. In every hypothesis testing situation, to find a critical value we need to know all of the following except: A. alpha B. sample size C. alternative hypothesis D. test statistic

29. A Z-distribution is a probability distribution created under the assumption of the: A. Null hypothesis B. Alternative hypothesis C. Research hypothesis D. Conclusion

30. If 76% of cows produce high protein milk, what is the standard deviation of this population? A. .18 B. .24 C. .45 D. .76

The marketing department at Pepsi is very interested in knowing if there is a difference between males and females who prefer the soft drink. The following is a contingency table created after a survey was undertaken (Questions 31-38).

                      Males   Females
Prefer Pepsi            147       186
Do not prefer Pepsi     134       156

31. How many categorical variables are in this study? A. 1 B. 2 C. 3 D. 4

32. In creating a test statistic, the denominator includes a term called "pie-hat." Why is "pie-hat" necessary? A. There is only one population, hence only one proportion B. The assumption of the null hypothesis states the proportions are equal, and hence have the same variance C. The assumption of the alternative states that the proportions are the same; hence this is the hypothesized proportion. D. Pie-hat is the variance of the hypothesized difference of the two proportions.

33. What is the value of this "pie-hat"? A. .523 B. .544 C. .533 D. .535

34. What is the appropriate null hypothesis?

A. H0: μ1 = μ2 B. H0: Π1 = Π2 C. H0: X̄1 = X̄2 D. H0: p1 = p2

35. If α = .02, what is the value of the critical value (two tail test)? A. ±1.64 B. ±1.96 C. ±2.326 D. ±2.575

36. What is the appropriate conclusion based on the data provided (and α = .02)? A. Men like Pepsi more B. Women like Pepsi more C. Both like Pepsi equally as well D. More information is needed to answer this question.

37. What is the value of the point estimate for a 98% confidence interval? A. 0 B. .021 C. .21 D. .544

38. What is the proper confidence interval for the Pepsi example? A. -.043 , .043 B. 0 , .021 C. -.057 , .099 D. .021 , .099
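The pooled proportion and test statistic for the Pepsi table can be sketched in Python as follows. This assumes a two-sample z test for proportions with males as group 1 (the ordering is an assumption) and uses only the four counts from the contingency table.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the contingency table
prefer = {"males": 147, "females": 186}
do_not = {"males": 134, "females": 156}

n_m = prefer["males"] + do_not["males"]      # 281 males surveyed
n_f = prefer["females"] + do_not["females"]  # 342 females surveyed
p_m = prefer["males"] / n_m
p_f = prefer["females"] / n_f

# Pooled proportion: all who prefer Pepsi over all respondents
pooled = (prefer["males"] + prefer["females"]) / (n_m + n_f)

# Standard error under H0 (equal proportions), then the z statistic
se = sqrt(pooled * (1 - pooled) * (1 / n_m + 1 / n_f))
z = (p_m - p_f) / se

# Two-tailed critical value for alpha = .02
critical = NormalDist().inv_cdf(1 - 0.02 / 2)

print(f"pooled = {pooled:.3f}, z = {z:.2f}, critical = ±{critical:.3f}")
```

Because the statistic falls well inside the critical values, the sketch does not reject the hypothesis of equal proportions at α = .02.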

39. The variance of a distribution can never be smaller than the standard deviation. A. True B. False

40. A type II error is when we decide the null hypothesis is ____ when it is actually _____. A. True; True B. False; True C. False; False D. True; False

A pizza maker believed that he invented a way of making better tasting pizza. He made 40 pizzas the original way and 40 pizzas the new way. Respondents were to rate the pizza on a continuous scale from 1 to 50 (Questions 41-42).

41. What is the null hypothesis for this study?

A. H0: μ1 > μ2 B. H0: μ1 = μ2 C. H0: X̄1 = X̄2 D. H0: X̄1 > X̄2

42. What is the most appropriate alternative hypothesis for this study (meaning, the alternative hypothesis that conforms to the research hypothesis)?

A. HA: μ1 > μ2 B. HA: μ1 ≠ μ2 C. HA: X̄1 ≠ X̄2 D. HA: X̄1 > X̄2

The power company will estimate the electricity bill for certain months based on the size of the home. The correlation between monthly use (as indicated on the bill) and size of the home is .68 (Questions 43-47).

43. What type of relationship exists between usage and size of the home? A. Positive B. Negative C. Inverse D. Nonexistent

44. What is the response variable for this problem? A. Electricity Bill B. House Size C. Houses D. Usage per square foot

A sample of 51 houses was used for computing the correlation. Here are the other statistics from the survey:

Mean electricity bill = $67.34, variance of $100
Mean house size = 2300 square feet, variance of 900 square feet

45. What is the regression coefficient (slope) to predict the electricity bill? A. .227 B. .075 C. 6.12 D. 2.04

46. If ei is the difference between the actual bill (yi) and the predicted bill (ŷi), what is the sum of all 51 ei's? A. 0 B. 100 C. 67.34 D. 2300

47. In question 46, the predicted bill was found using the regression slope (b1) in question 45 and the subsequent intercept (b0). Say we pick two other values to represent b1 and b0 to create a "new" regression equation. The two values that we picked are smaller (closer to zero) than the originals. How does the "new" ei's compare to the original ei's? A. New are smaller B. New are larger C. They are both the same D. Need more information to answer this question.
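The slope and intercept referred to in questions 45-47 can be recovered from the summary statistics alone. The Python sketch below assumes simple least-squares regression of the bill on house size, where the slope equals the correlation times the ratio of standard deviations; the variable names are illustrative.

```python
from math import sqrt

r = 0.68                                # correlation between bill and house size
mean_bill, var_bill = 67.34, 100        # response variable (y)
mean_size, var_size = 2300, 900         # predictor variable (x)

# Least-squares slope: b1 = r * (s_y / s_x), using standard deviations
b1 = r * sqrt(var_bill) / sqrt(var_size)

# The fitted line passes through the point of means: b0 = y_bar - b1 * x_bar
b0 = mean_bill - b1 * mean_size

print(f"b1 = {b1:.3f}, b0 = {b0:.2f}")
```

The slope uses standard deviations (10 and 30), so plugging in the variances directly would give a different, incorrect value.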

48. If SST is 1548 and SSE is 483, what is R²? A. 31.2% B. 68.8% C. 14.5% D. 45.3%
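A short Python sketch of the relationship between the sums of squares and the coefficient of determination, assuming the usual decomposition SST = SSR + SSE:

```python
sst = 1548  # total sum of squares
sse = 483   # error (residual) sum of squares

ssr = sst - sse              # sum of squares explained by the regression
r_squared = 1 - sse / sst    # equivalently ssr / sst

print(f"SSR = {ssr}, R^2 = {r_squared:.1%}")
```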

49. Covariance, correlation and the regression coefficient are measures of association between two variables. What two attributes do these measures have in common? A. Effect size and strength of relationship B. Strength and direction C. Effect size and direction D. Strength of relationship and prediction

50. For process data, ____ is the key statistic for determining "randomness" and ___ is the key statistic for determining if the process is "in control." A. Mean; mean B. Standard deviation; mean C. Standard deviation; Standard Deviation D. Mean; Standard deviation