DPL = Average Daily Patient Load, MXE = Monthly X-Ray Exposures
Question 1
A multiple regression model was used to predict monthly labor hours (MLH) in US Navy hospitals based on the following set of regressors: DPL = average daily patient load, MXE = monthly X-ray exposures, MOB = monthly occupied bed-days, EP = eligible population in the area/1000, ALP = average length of patient’s stay, in days.
The regression output from Minitab is given below:
The regression equation is MLH = 1711 - 9.6 DPL + 0.0563 MXE + 1.38 MOB - 3.99 EP - 358 ALP
Predictor      Coef  SE Coef      T      P    VIF
Constant       1711     1058   1.62  0.134
DPL           -9.62    96.21  -0.10  0.922  2.671
MXE         0.05628  0.02096   2.68  0.021  7.940
MOB           1.377    3.047   0.45  0.660  9.225
EP           -3.988    7.061  -0.56  0.584  9.195
ALP          -358.0    207.1  -1.73  0.112  4.265
S = 633.130 R-Sq = 99.1% R-Sq(adj) = ??%
Analysis of Variance
Source          DF    SS   MS       F      P
Regression       5  4946  989  246.82  0.000
Residual Error  11    44    4
Total           16  4991
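As a rough illustration of how the blank R-Sq(adj) entry and the two screening criteria could be worked out, the sketch below applies the usual adjusted R-squared formula to the ANOVA table above. It assumes n = 17 observations (Total DF + 1) and p = 5 predictors; the function and variable names are hypothetical and exist only for this example.

```python
# Illustrative sketch only: names are hypothetical; the numbers are copied
# from the Minitab coefficient and ANOVA tables above.

def adjusted_r2(ss_error, ss_total, n, p):
    """Adjusted R^2 = 1 - [SSE / (n - p - 1)] / [SST / (n - 1)]."""
    return 1 - (ss_error / (n - p - 1)) / (ss_total / (n - 1))

n, p = 17, 5  # assumed: n = Total DF + 1, p = number of predictors
r2_adj = adjusted_r2(ss_error=44, ss_total=4991, n=n, p=p)
print(f"R-Sq(adj) = {100 * r2_adj:.1f}%")

# Screening candidates for removal: largest VIF and largest coefficient p-value.
vif = {"DPL": 2.671, "MXE": 7.940, "MOB": 9.225, "EP": 9.195, "ALP": 4.265}
p_value = {"DPL": 0.922, "MXE": 0.021, "MOB": 0.660, "EP": 0.584, "ALP": 0.112}
highest_vif = max(vif, key=vif.get)
highest_p = max(p_value, key=p_value.get)
print("Largest VIF:", highest_vif, "| largest p-value:", highest_p)
```

The same adjusted value can also be obtained from the reported R-Sq as 1 - (1 - R²)(n - 1)/(n - p - 1); the two routes may differ slightly because the printed sums of squares are rounded.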
i) To improve this model, which variable would you remove first,
a. using the VIF criterion
b. using the p-value for coefficients
ii) Compute the adjusted R2 and interpret it.
iii) Interpret the coefficient of MOB.
For the same data, stepwise regression was performed to obtain the best model. The stepwise regression output is given below:
Stepwise Regression: MLH versus DPL, MXE, MOB, EP, ALP
Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15
Response is MLH on 5 predictors, with N = 17
Step                1        2        3
Constant       -90.75  -130.00  1298.72

MOB             1.123    0.836    0.975
T-Value         23.79    10.52     9.47
P-Value         0.000    0.000    0.000

MXE                      0.073    0.054
T-Value                   3.99     2.73
P-Value                  0.001    0.017

ALP                                -288
T-Value                           -1.91
P-Value                           0.078

S                 927      656      602
R-Sq            97.42    98.79    99.06
R-Sq(adj)       97.25    98.62    98.84
Mallows Cp       19.2      4.0      2.7
iv) Write regression equation of the best model.
v) Which variable(s) of the final model are significant at the 5% level?
Question 2
Different autoregressive models were fitted to the time series data representing the number of stores opened for Bed & Bath from 1993 to 2010. The regression outputs from Minitab are provided below.
Model 1
The regression equation is Stores Opened = 12.9 + 1.95 Yt-1 - 0.182 Yt-2 - 0.999 Yt-3
Predictor     Coef  SE Coef      T      P
Constant     12.91    11.70   1.10  0.302
Yt-1        1.9466   0.4377   4.45  0.002
Yt-2       -0.1825   0.9526  -0.19  0.853
Yt-3       -0.9986   0.6109  -1.63  0.141
S = 18.9551 R-Sq = 99.7% R-Sq(adj) = 99.5%
Model 2
The regression equation is Stores Opened = 10.8 + 2.50 Yt-1 - 1.65 Yt-2
Predictor     Coef  SE Coef      T      P
Constant     10.76    10.45   1.03  0.328
Yt-1        2.4994   0.2706   9.24  0.000
Yt-2       -1.6507   0.3088  -5.35  0.000
S = 19.5823 R-Sq = 99.6% R-Sq(adj) = 99.5%
Model 3
The regression equation is Stores Opened = 34.9 + 1.07 Yt-1
Predictor      Coef  SE Coef      T      P
Constant      34.86    15.15   2.30  0.040
Yt-1        1.06743  0.03795  28.13  0.000
S = 36.3824 R-Sq = 98.5% R-Sq(adj) = 98.4%
i) Choose the best model (also provide reasoning).
ii) The numbers of stores opened in 2006 and 2007 were 809 and 815, respectively. What are the forecasts for 2008 and 2009 under each of the three models above?
Forecast for the No. of Stores Opened
Year Model 1 Model 2 Model 3
2008
2009
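A minimal sketch of how such forecasts could be produced by iterating a fitted autoregressive equation forward, feeding each forecast back in as a lagged value. The helper `ar_forecast` is a hypothetical name introduced for illustration. Note that Model 1 needs three past values (2005, 2006 and 2007), and the 2005 figure is not given here, so only Models 2 and 3 are evaluated.

```python
# Illustrative sketch only: `ar_forecast` is a hypothetical helper, and the
# coefficients are copied from the Minitab outputs for Models 2 and 3.

def ar_forecast(intercept, lag_coefs, history, steps):
    """Iterate Y_t = intercept + sum_k lag_coefs[k] * Y_(t-1-k) forward.

    `history` lists the known values oldest first and must contain at
    least as many values as the model has lags.
    """
    values = list(history)
    if len(values) < len(lag_coefs):
        raise ValueError("not enough past values for this model order")
    forecasts = []
    for _ in range(steps):
        y_hat = intercept + sum(c * values[-1 - k] for k, c in enumerate(lag_coefs))
        forecasts.append(y_hat)
        values.append(y_hat)  # the forecast becomes a lag for the next step
    return forecasts

known = [809, 815]  # stores opened in 2006 and 2007

model2 = ar_forecast(10.76, [2.4994, -1.6507], known, steps=2)
model3 = ar_forecast(34.86, [1.06743], known, steps=2)

for year, f2, f3 in zip((2008, 2009), model2, model3):
    print(f"{year}: Model 2 = {f2:.2f}, Model 3 = {f3:.2f}")
```

The 2009 forecast in each model uses the 2008 forecast as its most recent lag, since no actual 2008 value is assumed to be available.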
Question 3
Given below are the prices of a basket of four food items from 1996 to 2000.
Year  Wheat ($/bushel)  Corn ($/bushel)  Soybeans ($/bushel)  Milk ($/hundredweight)
1996  4.25              3.71             7.41                 15.03
1997  3.43              2.7              7.55                 13.63
1998  2.63              2.3              6.05                 15.18
1999  2.11              1.97             4.68                 14.72
2000  2.16              1.9              4.81                 12.32
i) What are the simple price indexes for wheat and soybeans, respectively, in 1998 using 1996 as the base year?
ii) What is the unweighted aggregate price index for the basket of four food items in 1998 using 1996 as the base year? Also interpret your result.
iii) What is the Paasche price index for the basket of four food items in 2000 that consisted of 40 bushels of wheat, 50 bushels of corn, 35 bushels of soybeans and 60 hundredweight of milk in 2000 using 1996 as the base year?
iv) What is the Laspeyres price index for the basket of four food items in 1999 that consisted of 50 bushels of wheat, 30 bushels of corn, 40 bushels of soybeans and 80 hundredweight of milk in 1996 using 1996 as the base year? Also, interpret the result.
Question 4
An exponential trend model was fitted to the monthly time series data representing the average traffic on Google recorded at the beginning of each month from Jan 2004 to Aug 2010. The regression equation is given below:
Log(Ŷ) = 0.0801 - 0.0038 X - 0.1067 Jan + 0.0794 Feb + 0.1008 Mar + 0.0761 Apr + 0.0795 May + 0.1084 Jun - 0.1101 Jul + 0.0874 Aug + 0.0204 Sep + 0.0106 Oct + 0.0051 Nov
where X represents the coded observations (0 - 79) and Jan, Feb, ..., Nov represent the monthly dummies.
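The fitted equation above can be evaluated with a few lines of code. The sketch below is hedged on one important assumption: that "Log" means the base-10 logarithm, so fitted values and multipliers are recovered as powers of 10 (with natural logs, `math.exp` would be used instead). December is taken as the baseline month with a dummy coefficient of zero, and X is coded so that Jan 2004 = 0 and Aug 2010 = 79. All names are hypothetical.

```python
# Illustrative sketch only. Assumes Log is base 10 and December is the
# baseline month (no dummy); coefficients are copied from the equation above.

INTERCEPT = 0.0801
SLOPE = -0.0038
MONTH = {
    "Jan": -0.1067, "Feb": 0.0794, "Mar": 0.1008, "Apr": 0.0761,
    "May": 0.0795, "Jun": 0.1084, "Jul": -0.1101, "Aug": 0.0874,
    "Sep": 0.0204, "Oct": 0.0106, "Nov": 0.0051, "Dec": 0.0,
}

def fitted_value(x, month):
    """Back-transform the fitted log value for coded period x and a month name."""
    return 10 ** (INTERCEPT + SLOPE * x + MONTH[month])

growth_rate = 10 ** SLOPE - 1          # monthly compound growth rate
june_multiplier = 10 ** MONTH["Jun"]   # June multiplier relative to December
aug_2010 = fitted_value(79, "Aug")     # last observed period
sep_2010 = fitted_value(80, "Sep")     # one step beyond the sample

print(f"Monthly compound growth rate: {100 * growth_rate:.2f}%")
print(f"June multiplier: {june_multiplier:.4f}")
print(f"Fitted Aug 2010: {aug_2010:.4f}, forecast Sep 2010: {sep_2010:.4f}")
```

A negative slope yields a negative growth rate, that is, a compound monthly decline in the fitted trend.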
i) Compute the monthly compound growth rate.
ii) Interpret the multiplier for June.
iii) Compute the fitted value for August 2010
iv) Forecast the value for Sept 2010.
Question 5
i) The residuals represent
a. the difference between the actual Y values and the mean of Y.
b. the difference between the actual Y values and the predicted Y values.
c. the square root of the slope.
d. the predicted value of Y for the average X value.
ii) A recent study of 15 shoppers showed that the correlation between the time spent in the store and the dollars spent was 0.235. Using a significance level equal to 0.05, the critical value for the test to determine whether the true population correlation coefficient is zero is:
a. t = 2.1604 b. z = 1.96 c. t = 2.458 d. none of the above
The director of the MBA program of a state university wanted to know if a one-week orientation would change the proportion among potential incoming students who would perceive the program as being good. Given below are the results from 215 students’ views of the program before and after the orientation.
                        After the Orientation
Before the Orientation  Good  Not Good  Total
Good                      93        37    130
Not Good                  71        14     85
Total                    164        51    215
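For paired before/after responses like those in the table above, a McNemar-type Z statistic is one common choice; it uses only the two discordant cells. The sketch below assumes the rows are the "before" responses and the columns the "after" responses, and computes a two-sided p-value from the standard normal distribution. Variable names are hypothetical.

```python
import math

# Illustrative sketch only: assumes rows = before, columns = after.
b = 37  # Good before, Not Good after
c = 71  # Not Good before, Good after

# McNemar Z statistic based on the discordant pairs only.
z = (b - c) / math.sqrt(b + c)

# Two-sided p-value: 2 * P(Z > |z|) for a standard normal Z.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"Z = {z:.4f}, two-sided p-value = {p_two_sided:.4f}")
```

The sign of Z depends only on which discordant cell is subtracted from which, so the statistic may be reported as either negative or positive.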
iii) Which test should she use?
a. χ²-test for difference in proportions
b. Z-test for difference in proportions
c. McNemar test for difference in proportions
d. Wilcoxon rank sum test
iv) What is the value of the test statistic using a 1% level of significance?
ANSWER: -3.2717 or 3.2717
v) What is the p-value of the test statistic using a 5% level of significance?
ANSWER: 0.0011
Answer (vi) and (vii) based on the following information.
The following table contains the number of complaints received in a departmental store for the first 6 years.
Year        2009  2010  2011  2012  2013  2014
Complaints    36    45    81    90   108   144
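A minimal sketch of the two smoothing methods applied to the complaints series above. It assumes the common textbook conventions: the three-term moving average is centered, so it has no value for the first and last years, and exponential smoothing starts from the first observation. Function names are hypothetical.

```python
# Illustrative sketch only: function names are hypothetical.
complaints = [36, 45, 81, 90, 108, 144]  # 2009 to 2014

def moving_average(series, window=3):
    """Averages of each run of `window` consecutive values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def exponential_smoothing(series, w):
    """E_1 = Y_1, then E_t = w * Y_t + (1 - w) * E_(t-1)."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed

ma3 = moving_average(complaints)
es = exponential_smoothing(complaints, w=1 / 3)

print("3-term moving averages:", ma3)
print("Number of exponentially smoothed terms:", len(es))
```

The moving average loses one term at each end of the series, whereas the exponentially smoothed series keeps one value per observation.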
vi) If a three-term moving average is used to smooth this series, what would be the second calculated term?
a. 36
b. 40.5
c. 54
d. 72
vii) Referring to the table above, if this series is smoothed using exponential smoothing with a smoothing constant of 1/3, how many terms would it have?
a. 3
b. 4
c. 5
d. 6
viii) We have created a 95% confidence interval for μ with the result (10, 15). What decision will we make if we test at α = 0.05?
a. Reject H0 in favor of H1.
b. Accept H0 in favor of H1.
c. Fail to reject H0 in favor of H1.
d. We cannot tell what our decision will be from the information given.
ix) A major videocassette rental chain is considering opening a new store in an area that currently does not have any such stores. The chain will open if there is evidence that more than 5,000 of the 20,000 households in the area are equipped with videocassette recorders (VCRs). It conducts a telephone poll of 300 randomly selected households in the area and finds that 96 have VCRs. The value of the test statistic in this problem is approximately equal to:
a. 2.80
b. 2.60
c. 1.94
d. 1.30
x) The Durbin-Watson statistic is used to test
a. Normality
b. Independence
c. Linearity
d. Equality of Variance
xi) Suppose we are interested in testing the null hypothesis H0: vs H1: . A sample of size 20 is obtained, and the sample mean and sample standard deviation are 8 and 3.5, respectively. What will be the value of the test statistic?
a. 2.55
b. -1.96
c. 1.96
d. -2.55
xii) A study is conducted to test whether there is a significant difference in the scores of male and female students for a STATS course. The following information is obtained:
Male Female
n = 25 n = 15
Based on this information, what is the absolute value of the test statistic, assuming unequal variances?
a. 2.459
b. 1.57
c. 4.23
d. None of the above
xiii) Based on the information given in part (xii), what is the p-value for testing
H0: vs H1 :
a. 0.019 b. 0.319 c. 0.063 d. 0.121
Answer parts (xiv) and (xv) based on the following information
Where people turn for news is different for various age groups. A study indicated where different age groups primarily get their news:
                        Age group
Media      Under 36     36-50         50+           Total
TV          73 (92.11)  102 (109.02)  127 (100.87)    302
Radio       75 (75.03)   97 (????)     74 (82.16)     246
Newspaper   52 (65.88)   79 (77.98)    85 (????)      216
Internet   105 (71.98)   83 (85.20)    48 (78.82)     236
Total      305          361           334            1000
xiv) Find the missing expected counts.
xv) What is the value of the test statistic?
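As a hedged sketch of how the expected counts and the test statistic for the media-by-age table could be computed, the code below assumes a chi-square test of independence, where each expected count is (row total × column total) / grand total. The dictionary layout and variable names are hypothetical.

```python
# Illustrative sketch only: assumes a chi-square test of independence.
# Observed counts are copied from the table; columns are Under 36, 36-50, 50+.
observed = {
    "TV":        [73, 102, 127],
    "Radio":     [75, 97, 74],
    "Newspaper": [52, 79, 85],
    "Internet":  [105, 83, 48],
}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected count = row total * column total / grand total.
expected = {
    media: [sum(row) * ct / grand_total for ct in col_totals]
    for media, row in observed.items()
}

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for media in observed
    for o, e in zip(observed[media], expected[media])
)
dof = (len(observed) - 1) * (len(col_totals) - 1)

print(f"Expected Radio, 36-50: {expected['Radio'][1]:.2f}")
print(f"Expected Newspaper, 50+: {expected['Newspaper'][2]:.2f}")
print(f"Chi-square = {chi_square:.2f} with {dof} degrees of freedom")
```

The computed expected counts for the filled-in cells can be compared with the values printed in parentheses in the table as a quick consistency check.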