BE540 Introductory Biostatistics Take Home Examination III Units 6 and 7 – Estimation and Hypothesis Testing SOLUTIONS

BE540 Introductory Biostatistics 2005 Take Home Examination III

1. (10 points total) Laboratory tests of bacterial counts are often used for declaring a water source “polluted”. Suppose that the distribution of bacterial counts in a sample taken from Lake Quinsigamond is normally distributed with a variance of 9,000,000.

a. (5 points) Suppose 25 samples were taken over the course of July 2004 and yielded a mean count of 11,500. Construct an 80% confidence interval estimate of the unknown mean bacterial count in this lake at this time.

Answer: (10,730.8, 12,269.2)

Solution:
Estimate = $\bar{x} = 11{,}500$
SE(Estimate) = $\sqrt{\sigma^2 / n} = \sqrt{9{,}000{,}000 / 25} = 600$
Multiplier = $z_{.90} = 1.282$
CI = $\bar{x} \pm (z_{.90})\, SE(\bar{x}) = 11{,}500 \pm (1.282)(600) = (10{,}730.8,\ 12{,}269.2)$

b. (5 points) One year later, in July 2005, the Massachusetts Department of Environmental Quality Engineering (DEQE) took another sample and noted a bacterial count of 15,000. In your opinion, is this evidence of a pollution effect?

Answer: No

Solution: No, because the associated significance level is 0.12.
Let $X$ be the July 2005 bacterial count.
$H_0: \mu = 11{,}500$
$H_A: \mu > 11{,}500$ (one sided)
p-value = $\text{Prob}[\, X \ge 15{,}000 \mid H_0 \text{ is true} \,]$
= $\text{Prob}\left[\, \text{Normal}(0,1) \ge \dfrac{15{,}000 - 11{,}500}{\sqrt{9{,}000{,}000}} \,\right]$
= $\text{Prob}[\, \text{Normal}(0,1) \ge 1.167 \,]$
= 0.1216
⇒ A single reading of 15,000 has a sufficiently high likelihood of occurring if the true mean level is 11,500. There is not enough evidence to suspect a pollution effect. A sample size greater than 1 is needed to investigate this question.

2. (10 points total)

a. (2 points) True or False. Consider the construction of a 95% confidence interval. Suppose one repeats the sampling process indefinitely.
Suppose further that, for each sample drawn, a new 95% confidence interval calculation is performed. If, for each sample, the investigator claims that the actual parameter value is contained in the interval, about 95% of his or her statements will be correct.

Answer: TRUE
Explanation: By definition, a 95% confidence interval is generated by a process that is correct 95% of the time.

b. (2 points) True or False. A hypothesis test for which the type I error occurs with probability α has probability of type II error equal to (1 - α).

Answer: FALSE
Explanation: Type I error = Pr[reject $H_0$ when $H_0$ is true], whereas Type II error = Pr[accept $H_0$ when $H_A$ is true]. The two are probabilities under different hypotheses, so they need not sum to 1.

c. (2 points) True or False. If a one sided test indicates that the null hypothesis can be rejected at the 5% level, then a two sided test performed on the same set of data is necessarily significant at the 5% level.

Answer: FALSE
Explanation: If a one-sided test has an achieved significance level of .05, then the same test, carried out as two-sided, has an achieved level of significance of (2)(.05) = .10.

d. (2 points) True or False. For a given sample variance $s^2$ and sample mean $\bar{X}$, a 90% confidence interval for an unknown mean µ is narrower than a 99% confidence interval.

Answer: TRUE
Explanation: This is true because the magnitude of the SE multiplier in a 90% confidence interval is smaller than the magnitude of the SE multiplier in a 99% confidence interval. Hint: Recall that the multiplier in a 100% confidence interval would have to be infinity!

e. (2 points) True or False. An investigator performing a t-test for which the assumptions are satisfied could, in the absence of Student's t-distribution tables, use a Normal(0,1) probability table, provided the degrees of freedom is sufficiently large.
Answer: TRUE
Explanation: As the degrees of freedom increase (roughly beyond 30), the t-distribution becomes more and more similar to the Normal(0,1).

3. (10 points total) In the planning of a large scale study, pilot data, comprising a sample of size 24, are drawn from a normal distribution with known variance $\sigma^2 = 6.4512$ and unknown mean µ. The observed mean was $\bar{X}_{n=24} = 2.66$. In mounting the large scale study, how large a sample size would have to be drawn to estimate the unknown mean to within plus or minus .14 with 90% confidence? Note: “Plus or minus .14” is interpreted to mean that the total width of the confidence interval is 2 x .14.

Answer: 891

Solution: The width of a 90% CI, because $z_{.95} = 1.645$, is given by
(upper end value) - (lower end value)
= $[\, \bar{x} + (1.645)\,\sigma/\sqrt{n} \,] - [\, \bar{x} - (1.645)\,\sigma/\sqrt{n} \,]$
= $(2)(1.645)\,\sigma/\sqrt{n}$
Since the total width = (2)(0.14), we have
$(2)(0.14) = (2)(1.645)\,\sigma/\sqrt{n}$
⇒ $n = (1.645)^2 \sigma^2 / (0.14)^2$
⇒ $n = (138.0625)\,\sigma^2$
Since $\sigma^2 = 6.4512$,
⇒ $n = (138.0625)(6.4512) = 890.67$, which rounds up to 891.

4. (10 points total) According to the theory of plate tectonics, the California coast (including Los Angeles) is carried on one plate and the continental United States (including Sacramento) is carried on another plate. These two cities are moving relative to one another in such a way that Los Angeles gets about an inch closer to Sacramento every year. A hypothetical experiment is underway to check this theory, by measuring the distance between Los Angeles and Sacramento once a year for 50 years. The first 25 measurements have an average of 31,996,832 inches with a standard deviation of 40 inches. The next 25 measurements average out to be 31,996,806 inches with a standard deviation of 40 inches.
Carry out the appropriate hypothesis test to check the proposed plate tectonics theory. You may assume normality.

Answer: A one sided two sample t-test (t = 2.30, df = 48) provides statistically significant evidence that Los Angeles is getting closer to Sacramento over time.
Achieved level of significance (p-value) = 0.013
95% CI of 25 year change in distance (inches) = (3.26, 48.74)

Solution:
Let $\bar{x}$ = average of the 1st 25 measurements and $\bar{y}$ = average of the 2nd 25 measurements.

Assumptions: Since the observed SDs are equal, we can assume equality of variance. We will also assume normality, so that $\bar{x} \sim \text{Normal}(\mu_x, \sigma^2/25)$ and $\bar{y} \sim \text{Normal}(\mu_y, \sigma^2/25)$.

$H_0$ and $H_A$: $H_0: \mu_x = \mu_y$ versus $H_A: \mu_x > \mu_y$ (one sided)

Preliminary: $S^2_{pool} = \dfrac{(24)S_1^2 + (24)S_2^2}{48} = (40)^2$

Test statistic: $t = \dfrac{(\bar{x} - \bar{y}) - (0)}{SE(\bar{x} - \bar{y})}$ with df = 48

$= \dfrac{\bar{x} - \bar{y}}{\sqrt{\dfrac{S_p^2}{25} + \dfrac{S_p^2}{25}}} = \dfrac{31{,}996{,}832 - 31{,}996{,}806}{\sqrt{40^2 \left( \dfrac{1}{25} + \dfrac{1}{25} \right)}} = 2.30$

p-value = $\text{Prob}[\, t_{df=48} \ge 2.30 \,] = 0.0129$

Statistical decision: A p-value of 0.013 is not consistent with $H_0$. Reject $H_0$.

95% CI for $\mu_x - \mu_y$ = $(\bar{x} - \bar{y}) \pm (t_{.975;\,df=48})\, SE(\bar{x} - \bar{y})$
= (26) ± (2.01)(11.313709)
= (3.259446, 48.740554)

5. (10 points total) An investigator performed 40 tests on data from nine patients who received treatment A and eight patients who received treatment B. Each of the 40 tests was a Student's t-test and was independent of the other tests. Three of the 40 t-statistic values were greater than $t_{0.025,\,df=15} = 2.13$. How many significant results would you expect to see if the type I error for each was 0.05 and if, in truth, there are no differences between the two treatments?

Answer: 2

Solution: If the 40 tests are mutually independent, all with α = 0.05, and if $H_0$ is true in every instance, then the expected number of significant results = (40)(0.05) = 2.

6.
(10 points total) The objective of an experiment by Buckner et al was to study the effect of pancuronium-induced muscle relaxation on circulating plasma volume. Subjects were newborn infants weighing more than 1700 grams who required respiratory assistance within 24 hours of birth and met other criteria. Five infants paralyzed with pancuronium and seven nonparalyzed infants yielded the following statistics on the second of three measurements of plasma volume (ml) made during mechanical ventilation.

Group            n    Sample Mean    Sample Standard Deviation
Paralyzed        5    48.0           8.1
Non-paralyzed    7    56.7           8.1

Compute a 95% confidence interval for $\mu_1 - \mu_2$. State all necessary assumptions and interpret your results.

Answer: (-19.27, +1.87)
Assumptions: (1) normality and (2) common variance
Interpretation: The confidence interval includes zero. With 95% confidence, the data are consistent with the inference of no difference in mean plasma volume at 12-24 hours between the two groups.

Solution:
Point estimate of $\mu_1 - \mu_2$: (48.0 - 56.7) = -8.7

Point estimate of $\sigma^2$: $\hat{\sigma}^2 = \dfrac{(5-1)(8.1)^2 + (7-1)(8.1)^2}{(5-1) + (7-1)} = (8.1)^2$

Estimated SE of the point estimate of $\mu_1 - \mu_2$: $\sqrt{\dfrac{(8.1)^2}{5} + \dfrac{(8.1)^2}{7}} = 4.7429$

Confidence coefficient = $t_{.975;\,df=10} = 2.2281$

95% confidence interval for $\mu_1 - \mu_2$:
$(\bar{X}_1 - \bar{X}_2) \pm (t_{.975;\,df=10})\, se(\bar{X}_1 - \bar{X}_2)$
= -8.7 ± (2.2281)(4.7429)
= (-19.27, 1.87)
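The arithmetic in Problems 1, 3, 4, 5 and 6 can be re-checked with a short script. The following is a minimal sketch in Python using only the standard library; it is not part of the original solutions. The helper names (`z_interval`, `sample_size`, `pooled_se`, `t_interval`) are hypothetical, and because the standard library has no t-distribution quantile function, the t multipliers 2.01 and 2.2281 are simply copied from the solutions rather than computed.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal(0,1)


def z_interval(mean, variance, n, level):
    """Two-sided confidence interval for a mean when the variance is known."""
    se = sqrt(variance / n)
    z = Z.inv_cdf(0.5 + level / 2)  # e.g. level=0.80 gives the 90th percentile
    return mean - z * se, mean + z * se


def sample_size(variance, half_width, level):
    """Smallest n for which the known-variance CI half-width is <= half_width."""
    z = Z.inv_cdf(0.5 + level / 2)
    return ceil(z ** 2 * variance / half_width ** 2)


def pooled_se(s1, n1, s2, n2):
    """SE of a difference in two means, assuming a common variance."""
    s2_pool = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return sqrt(s2_pool * (1 / n1 + 1 / n2))


def t_interval(diff, se, t_mult):
    """CI for a difference in means; t_mult is supplied from a t table."""
    return diff - t_mult * se, diff + t_mult * se


# Problem 1a: 80% CI, known variance 9,000,000, n = 25
ci_1a = z_interval(11_500, 9_000_000, 25, 0.80)

# Problem 1b: one-sided p-value for a single reading of 15,000
p_1b = 1 - Z.cdf((15_000 - 11_500) / sqrt(9_000_000))

# Problem 3: sample size for a half-width of 0.14 at 90% confidence
n_3 = sample_size(6.4512, 0.14, 0.90)

# Problem 4: pooled two-sample t statistic and 95% CI (t multiplier from the text)
se_4 = pooled_se(40, 25, 40, 25)
t_4 = (31_996_832 - 31_996_806) / se_4
ci_4 = t_interval(26, se_4, 2.01)

# Problem 5: expected number of false positives among 40 independent tests
expected_5 = 40 * 0.05

# Problem 6: 95% CI for the difference in means (t multiplier from the text)
se_6 = pooled_se(8.1, 5, 8.1, 7)
ci_6 = t_interval(48.0 - 56.7, se_6, 2.2281)

print("1a:", [round(v, 1) for v in ci_1a])
print("1b:", round(p_1b, 4))
print("3: ", n_3)
print("4: ", round(t_4, 2), [round(v, 2) for v in ci_4])
print("5: ", expected_5)
print("6: ", [round(v, 2) for v in ci_6])
```

Small differences from the hand calculations are expected, since the script uses unrounded Normal quantiles while the solutions use table values such as 1.282 and 1.645.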

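The repeated-sampling interpretation in Problem 2(a) can also be illustrated by simulation. The sketch below is an illustration only, not part of the original solutions: it assumes a Normal population whose mean and standard deviation are borrowed from Problem 1 purely as convenient values, draws many samples of size 25, builds a known-variance 95% interval from each, and reports the fraction of intervals that contain the true mean. The function name `coverage` and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, level, reps, seed=0):
    """Fraction of known-variance CIs, over repeated samples, that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # Count the interval as a "correct statement" if it covers mu.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


rate = coverage(mu=11_500, sigma=3_000, n=25, level=0.95, reps=5_000)
print(f"fraction of intervals containing the true mean: {rate:.3f}")
```

The reported fraction should land close to 0.95, and it varies slightly with the seed and the number of repetitions.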