4 Solutions to Exercises

4.1 About these solutions

The solutions that follow were prepared by Darryl K. Nester. I occasionally pillaged or plagiarized solutions from the second edition (prepared by George McCabe), but I take full responsibility for any errors that may remain. Should you discover any errors or have any comments about these solutions (or the odd answers in the back of the text), please report them to me:

Darryl Nester
Bluffton College
Bluffton, Ohio 45817
email: [email protected]
WWW: http://www.bluffton.edu/~nesterd

4.2 Using the table of random digits

Grading SRSs chosen from the table of random digits is complicated by the fact that students can find some creative ways to (mis)use the table. Some approaches are not mistakes, but may lead to different students having different "right" answers. Correct answers will vary based on:

  • The line in the table on which they begin (you may want to specify one if the text does not).
  • Whether they start with, e.g., 00 or 01.
  • Whether or not they assign multiple labels to each unit.
  • Whether they assign labels across the rows or down the columns (nearly all lists in the text are alphabetized down the columns).

Some approaches can potentially lead to wrong answers. Mistakes to watch out for include:

  • They may forget that all labels must be the same length, e.g., assigning labels like 0, 1, 2, ..., 9, 10, ... rather than 00, 01, 02, ....
  • In assigning multiple labels, they may not give the same number of labels to all units. E.g., if there are 30 units, they may try to use up all the two-digit numbers, thus assigning 4 labels to the first ten units and only 3 to the remaining twenty.

4.3 Using statistical software

The use of computer software or a calculator is a must for all but the most cursory treatment of the material in this text.
Be aware of the following considerations:

Standard deviations: Students may easily get confused by software which gives both the so-called "sample standard deviation" (the one used in the text) and the "population standard deviation" (dividing by $n$ rather than $n - 1$). Symbolically, the former is usually given as "$s$" and the latter as "$\sigma$" (sigma), but the distinction is not always clear. For example, many computer spreadsheets have a command such as "STDEV(...)" to compute a standard deviation, but you may need to check the manual to find out which kind it is. As a quick check: for the numbers 1, 2, 3, $s = 1$ while $\sigma = 0.8165$. In general, if two values are given, the larger one is $s$ and the smaller is $\sigma$. If only one value is given, and it is the "wrong" one, use the relationship $s = \sigma \sqrt{n/(n-1)}$.

Quartiles and five-number summaries: Methods of computing quartiles vary between different packages. Some use the approach given in the text (that is, Q1 is the median of all the numbers below the location of the overall median, etc.), while others use a more complicated approach. For the numbers 1, 2, 3, 4, for example, we would have Q1 = 1.5 and Q3 = 3.5, but Minitab reports these as 1.25 and 3.75, respectively. Since I used Minitab for most of the analysis in these solutions, this was sometimes a problem. However, I remedied the situation by writing a Minitab macro to compute quartiles the IPS way. (In effect, I was "dumbing down" Minitab, since its method is more sophisticated.) This and other macros are available at my website.

Boxplots: Some programs which draw boxplots use the convention that the "whiskers" extend to the lower and upper deciles (the 10th and 90th percentiles) rather than to the minimum and maximum. (DeltaGraph, which I used for most of the graphs in these solutions, is one such program. It took some trickery on my part to convince it to make them as I wanted them.)
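The standard-deviation and quartile checks in this section can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the function names are made up for this example, and Python's "exclusive" quantile method stands in for an interpolating rule in general; it is not a claim about how any particular package computes quartiles internally.

```python
import math
import statistics


def sample_sd(xs):
    """Standard deviation s, dividing by n - 1 (the one used in the text)."""
    return statistics.stdev(xs)


def population_sd(xs):
    """Standard deviation sigma, dividing by n."""
    return statistics.pstdev(xs)


def sigma_to_s(sigma, n):
    """Convert sigma to s with s = sigma * sqrt(n / (n - 1))."""
    return sigma * math.sqrt(n / (n - 1))


def text_quartiles(xs):
    """Q1 and Q3 as medians of the values below and above the location
    of the overall median (the rule described in the text)."""
    s = sorted(xs)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # For odd n the middle value is the median itself and is left out.
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return statistics.median(lower), statistics.median(upper)


def interpolated_quartiles(xs):
    """Q1 and Q3 from an interpolating rule (assumption: Python's
    'exclusive' method, chosen only as one example of such a rule)."""
    q1, _median, q3 = statistics.quantiles(xs, n=4, method="exclusive")
    return q1, q3


if __name__ == "__main__":
    data = [1, 2, 3]
    print(sample_sd(data))                           # 1.0
    print(round(population_sd(data), 4))             # 0.8165
    print(round(sigma_to_s(population_sd(data), len(data)), 4))  # 1.0

    print(text_quartiles([1, 2, 3, 4]))              # (1.5, 3.5)
    print(interpolated_quartiles([1, 2, 3, 4]))      # (1.25, 3.75)
```

Running the same small data sets through any unfamiliar package is a quick way to see which convention it follows before grading against the text's answers.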
While the decile method is merely different from that given in the text, some methods are (in my opinion) just plain wrong. Some graphing calculators from Sharp draw "box charts," which have a center line at the mean (not the median), and a box extending from $\bar{x} - \sigma$ to $\bar{x} + \sigma$! I know of no statistics text that uses that method.

4.4 Acknowledgments

I should mention the software I used in putting these solutions together:

  • For typesetting: TeX, specifically Textures, from Blue Sky Software.
  • For the graphs: DeltaGraph (SPSS), Adobe Illustrator, and PSMathGraphs II (MaryAnn Software).
  • For statistical analysis: Minitab, GPower, JMP IN, and GLMStat, the latter two mostly for Chapters 14 and 15. George McCabe supplied output from SAS for Chapter 15. GPower is available as freeware on the Internet, while GLMStat is shareware.

Additionally, I used the TI-82, TI-85, TI-86, and TI-92 calculators from Texas Instruments.

Chapter 1 Looking at Data - Distributions

Chapter 1 Solutions

Section 1: Displaying Distributions with Graphs

1.1 (a) Categorical. (b) Quantitative. (c) Categorical. (d) Categorical. (e) Quantitative. (f) Quantitative.

1.2 Gender: categorical. Age: quantitative. Household income: quantitative. Voting Democratic/Republican: categorical.

1.3 The individuals are vehicles (or "cars"). Variables: vehicle type (categorical), where made (categorical), city MPG (quantitative), and highway MPG (quantitative).
1.4 Possible answers (unit; instrument):

  • number of pages (pages; eyes)
  • number of chapters (chapters; eyes)
  • number of words (words; eyes [likely bloodshot after all that counting])
  • weight or mass (pounds/ounces or kilograms; scale or balance)
  • height and/or width and/or thickness (inches or centimeters; ruler or measuring tape)
  • volume (cubic inches or cubic centimeters; ruler or measuring tape [and a calculator])

Any one of the first three could be used to estimate the time required to read the book; the last two would help determine how well the book would fit into a book bag.

1.5 A tape measure (the measuring instrument) can be used to measure (in units of inches or centimeters) various lengths such as the longest single hair, length of hair on sides or back or front. Details on how to measure should be given. The case of a bald (or balding) person would make an interesting class discussion.

1.6 Possible answers (reasons should be given): unemployment rate, average (mean or median) income, quality/availability of public transportation, number of entertainment and cultural events, housing costs, crime statistics, population, population density, number of automobiles, various measures of air quality, commuting times (or other measures of traffic), parking availability, taxes, quality of schools.

1.7 For (a), the number of deaths would tend to rise with the increasing population, even if cancer treatments become more effective over time: Since there are more people, there are more potential cases of cancer. Even if treatment is more effective, the increasing cure rate may not be sufficient to overcome the rising number of cases. For (b), if treatments for other diseases are also improving, people who might have died from other causes would instead live long enough to succumb to cancer.

Even if treatments were becoming less effective, many forms of cancer are detected earlier as better tests are developed.
In measuring five-year survival rates for (c), if we can detect cancer (say) one year earlier than was previously possible, then effectively, each patient lives one year longer after the cancer is detected, thus raising the five-year survival rate.

1.8 (a) 1988: $949 / 24{,}800{,}000 \approx 0.00003827 = 38.27$ deaths per million riders. 1992: $903 / 54{,}632{,}000 \approx 0.00001653 = 16.53$ deaths per million riders. Death rates are less than half what they were; bicycle riding is safer. (b) It seems unlikely that the number of riders more than doubled in a six-year period.

1.9 Using the proportion or percentage of repairs, Brand A is more reliable: $2942 / 13{,}376 \approx 0.22 = 22\%$ for Brand A, and $192 / 480 = 0.4 = 40\%$ for Brand B.

1.10 (a) Student preferences may vary; be sure they give a reason. Method 1 is faster, but less accurate: it will only give values that are multiples of 10. (b) In either method 1 or 2, fractions of a beat will be lost; for example, we cannot observe 7.3 beats in 6 seconds, only 7. The formula $60 \times 50 / t$, where $t$ is the time needed for 50 beats, would give a more accurate rate since the inaccuracy is limited to the error in measuring $t$ (which can be measured to the nearest second, or perhaps even more accurately).

1.11 Possible answers are total profits, number of employees, total value of stock, and total assets.

1.12 (a) Yes: The sum of the ethnic group counts is 12,261,000. (b) A bar graph or pie chart (not recommended) may be used. In order to see the contrast of the heights of the bars, the chart needs to be fairly tall.

[Figure: bar graph of number of students (thousands), vertical axis 0 to 9000, for American Indian, Asian, non-Hispanic Black, Hispanic, non-Hispanic white, and Foreign.]

1.13 (a) Shown at right. The bars are given in the same order as the data in the table (the most obvious way), but that is not necessary (since the variable is nominal, not ordinal).
(b) A pie chart would not be appropriate, since the different entries in the table do not represent parts of a single whole.

[Figure: bar graph of percent of female doctorates by field, vertical axis 0 to 60; first field label "Comp."]
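The rates in Solutions 1.8 and 1.9 and the pulse formula in 1.10 are simple ratios, so they are easy to verify. The following Python sketch does that; the function and parameter names are illustrative only, and the 40-second timing used for the pulse example is a made-up value, not data from the exercise.

```python
def rate_per_million(events, population):
    """Events per million members of the population."""
    return events / population * 1_000_000


def proportion(part, whole):
    """Fraction of the whole represented by part."""
    return part / whole


def pulse_rate(t_seconds, beats=50):
    """Beats per minute from the time needed to count `beats` beats,
    using the formula 60 * 50 / t from Solution 1.10(b)."""
    return 60 * beats / t_seconds


if __name__ == "__main__":
    # Solution 1.8(a): bicycle deaths per million riders.
    print(round(rate_per_million(949, 24_800_000), 2))   # 38.27
    print(round(rate_per_million(903, 54_632_000), 2))   # 16.53

    # Solution 1.9: proportion of repairs for each brand.
    print(round(proportion(2942, 13_376), 2))            # 0.22
    print(proportion(192, 480))                          # 0.4

    # Solution 1.10(b): hypothetical timing of 40 seconds for 50 beats.
    print(pulse_rate(40))                                # 75.0
```

Working with rates rather than raw counts is what makes the 1988 and 1992 figures, and the two brands, comparable despite their very different denominators.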
