Answers
S b e the mean return for a random sample of 1. Let S b e the S&P return for a randomly chosen dayand
20 days.
(a) E ( S )=E (S )=:00032
p
p
(b) SD(S )=SD(S )= n = :00859= 20 = :00192
(c)
:005 :00032
S>:005) = P Z> = P (Z>2:438) = :00074 P (
:00192
Without doing any calculations we know that it is more likely that an individual day's return will
b e greater than .005 than that the mean return will b e, since the standard deviation of S is smaller
than that of S .
2.(a) The graphical output for descriptive statistics gives all of the information we need:
Descriptive Statistics
Variable: Freshman
Anderson-Darling Normality Test A-Squared: 0.794 P-Value: 0.038
Mean 82.3778 StDev 9.3983 Variance 88.3276 Skewness -3.2E-01 Kurtosis -7.8E-01 62.5 70.0 77.5 85.0 92.5 N 90 Minimum 59.0000 1st Quartile 75.0000 Median 83.0000 3rd Quartile 90.0000 95% Confidence Interval for Mu Maximum 98.0000 95% Confidence Interval for Mu 80.4093 84.3462 80 81 82 83 84 85 95% Confidence Interval for Sigma 8.1973 11.0148 95% Confidence Interval for Median 95% Confidence Interval for Median
81.2373 85.0000
So, the 95% con dence interval for the true average freshman retention rate is (80.409, 84.346).
The interval requires that the variable b e reasonably normally distributed, and that the sample b e
a random sample from a larger p opulation. The latter requirement is met, and the former is not
to o bad.
(b) The prediction interval is then
r
p
1
89
X t 1+ =82:378 (1:987)(9:398) 1:0111 = 82:378 18:777 = (63:601; 101:155): s
:025
90
Note that the upp er limit of the interval go es past the highest p ossible value of 100, which re ects
the slight left tail in the distribution. The actual value for NYU, bytheway,was 86.
c
1999, Je rey S. Simono
(c) Here is a graphical representation of the exp enditure variable:
Descriptive Statistics
Variable: Expend.
Anderson-Darling Normality Test A-Squared: 6.531 P-Value: 0.000
Mean 14676.7 StDev 9801.9 Variance 96077185 Skewness 3.09663 Kurtosis 13.6810 10000 25000 40000 55000 70000 N 91 Minimum 6093.0 1st Quartile 8742.0 Median 11999.0 3rd Quartile 16923.0 95% Confidence Interval for Mu Maximum 73967.0 95% Confidence Interval for Mu 12635.4 16718.1 10000 11000 12000 13000 14000 15000 16000 17000 95% Confidence Interval for Sigma 8555.4 11476.9 95% Confidence Interval for Median 95% Confidence Interval for Median
10486.7 13416.6
The con dence interval for average (over all researchuniversities) mean (over all students within
the university) exp enditure per student is thus (12635, 16718). This interval is not very useful,
however, since the exp enditure variable is very long right{tailed.
(d) Given the long right tail in the distribution, a prediction interval must b e constructed in the correct
scale. The waytodoitistologthevariable, construct the prediction interval, and then antilog.
Here is a normal plot for the logged exp enditure variable, which lo oks more Gaussian (it's still not
great, b eing right{tailed, but there's no stronger transformation than the log to try):
c
1999, Je rey S. Simono Normal Probability Plot for Logged e
99 Mean: 4.10586 StDev: 0.214029
95
90
80
70 60 50 40 Percent 30
20
10
5
1
3.5 3.7 3.9 4.1 4.3 4.5 4.7 4.9
Data
The prediction interval in the logged scale has the form
r
p
1
90
=4:106 (1:987)(:214) X t 1+ 1:011 = 4:106 :4276 = (3:678; 4:534): s
:025
91
3:678 4:534
Antilogging this interval gives (10 ; 10 ), or (4764; 34198). The actual value for NYU was
24135. Note that this interval is much more reasonable than the one done in the original scale,
whichis
14677 (1:987)(9802)(1:0055) = 14677 19584 = ( 4907; 34261):
(e) In the sample 30 of the 90 institutions have freshman retention rate less than 80%, so a con dence
interval for the probability that a researchuniversity has retention rate less than 80% is
s
r