Chapter 04 Random Variables and Probability Distributions

Solutions to odd-numbered homework problems (text: McClave, Benson, and Sincich, 10th edition)

[4.1] In general, counting things leads to Discrete measurements, while measuring things against some instrument leads to Continuous measurements.

(a) “Number of newspapers sold per month” involves counting, so this is a Discrete random variable.

(b) “Amount of ink used” involves measuring (e.g., volume), so this is a Continuous random variable.

(c) “Number of ounces in a bottle” involves measuring (e.g., a weight or volume measurement), so this is a Continuous random variable.

(d) “Number of defective parts” involves counting, so this is a Discrete random variable.

(e) “Number of people” involves counting, so this is a Discrete random variable.

[4.5] Technically speaking, x is measured by counting (numbers of dollars and numbers of cents), so this variable is Discrete.

However, in many applications it is common to treat discrete variables that have a huge number of possible values as being “approximately” continuous. For example, people might say (as an approximation) that “salaries follow a normal distribution” or that the “number of loaves of bread sold per week” follows a normal distribution. What they mean in both cases is that “salaries” and “numbers of loaves sold” *are* indeed discrete, but we can often model them as being *approximately* normal (i.e., as continuous variables).

[4.29] (a)
x:     $0     $300,000
p(x):  .70    .30

(b) μ = $0(.70) + $300,000(.30) = $90,000

[4.35] Here C(n, k) = n!/(k!(n - k)!) denotes the binomial coefficient.

(a) C(6, 2) = 6!/(2!(6 - 2)!) = (6·5·4·3·2·1)/((2·1)(4·3·2·1)) = (6·5)/(2·1) = 15

(b) C(5, 2) = 5!/(2!3!) = 10

(c) C(7, 0) = 7!/(0!7!) = 1

(d) C(6, 6) = 6!/(6!0!) = 1

(e) C(4, 3) = 4!/(3!1!) = 4

[4.47] (a) x = “# of bridges with a rating of 4 or below” is binomial (n = 10, p = .09).

P(x ≥ 3) = 1 - P(x ≤ 2) = 1 - [P(x = 0) + P(x = 1) + P(x = 2)]
= 1 - [C(10,0)(.09)^0(.91)^10 + C(10,1)(.09)^1(.91)^9 + C(10,2)(.09)^2(.91)^8]
= 1 - [.389416 + .385137 + .171407] = .054 (or 5.4%),
where C(n, k) = n!/(k!(n - k)!) is the binomial coefficient.

(b) There is a fairly low chance (5.4%) of finding 3 or more bridges with ratings of 4 or below. We can conclude either that a fairly rare event has indeed happened or that the 9% forecast is a little too low. The smaller the probability of such a situation, the more likely we are to go with the second conclusion (that the 9% figure is too low).
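The arithmetic in problems 4.35 and 4.47 can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `binom_pmf` is introduced here for illustration and does not come from the textbook.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p). Hypothetical helper for this sketch."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Problem 4.35: the five binomial coefficients, parts (a) through (e)
coefficients = [comb(6, 2), comb(5, 2), comb(7, 0), comb(6, 6), comb(4, 3)]

# Problem 4.47(a): P(X >= 3) = 1 - [P(0) + P(1) + P(2)] with n = 10, p = .09
terms = [binom_pmf(k, 10, 0.09) for k in range(3)]
p_at_least_3 = 1 - sum(terms)

print(coefficients)
print([round(t, 6) for t in terms], round(p_at_least_3, 3))
```

Computing the complement of the three smallest terms mirrors the hand calculation, so each printed term can be compared against the corresponding term in the solution.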

[4.49] (a) P(taxpayer with income less than $100,000 is audited) = 15/1000 = .015
P(taxpayer with income exceeding $100,000 is audited) = 30/1000 = .03

(b) Let X = “# people with incomes less than $100,000 who are audited”. Then, X ~ binomial (n = 5, p = .015).

P(X = 1) = C(5,1)(.015)^1(1 - .015)^4 = .070600241

P(X > 1) = 1 - P(X ≤ 1) = 1 - [P(X = 0) + P(X = 1)] = 1 - [C(5,0)(.015)^0(1 - .015)^5 + C(5,1)(.015)^1(1 - .015)^4] = .0021833.

(c) Let Y = “# people with incomes exceeding $100,000 who are audited”. Then, Y ~ binomial (n = 5, p = .03).

P(Y = 1) = C(5,1)(.03)^1(1 - .03)^4 = .132794

P(Y > 1) = 1 - P(Y ≤ 1) = 1 - [C(5,0)(.03)^0(1 - .03)^5 + C(5,1)(.03)^1(1 - .03)^4] = .008472

(d) Here, X ~ binomial (n = 2, p = .015) and Y ~ binomial (n = 2, p = .03), so
P(X = 0 ∩ Y = 0) = P(X = 0)P(Y = 0) = (1 - .015)^2 (1 - .03)^2 = .9128847.

(e) To use the binomial, you must be able to assume that the people are selected independently of one another (random sampling guarantees that) and that the probability of being audited does not change from person to person.

[4.75] (a) No bolt-on trace elements used: P(280 < x < 284) = (284 - 280)/(290 - 260) = .1333.
Bolt-on trace elements are attached: P(280 < x < 284) = (284 - 280)/(285 - 278) = .57143.

(b) No bolt-on trace elements used: P(x < 268) = (268 - 260)/(290 - 260) = .2667.
Bolt-on trace elements are attached: P(x < 268) = 0.

[4.79] (a) x = “amount dispensed” is continuous.

(b) The graph of this uniform distribution is:

[Figure: uniform density of x = amount dispensed, flat between 6.5 and 7.5]

(c) μ = (6.5 + 7.5)/2 = 7 and σ = (7.5 - 6.5)/√12 = .28868, so μ ± 2σ = 7 ± 2(.28868) = 7 ± .5774. Notice that these two points (6.4226 and 7.5774) lie outside the endpoints 6.5 and 7.5 in the graph above, which means that 100% of the probability lies within μ ± 2σ.

(d) P(x ≥ 7) = .50, because the uniform distribution is symmetric around μ = 7.

(e) P(x ≤ 6) = 0, because the density function is 0 for all x ≤ 6.

(f) P(6.5 ≤ x ≤ 7.25) = (7.25 - 6.5)/(7.5 - 6.5) = .75 (so P(x > 7.25) = .25; see part (g)).

(g) Let “xi > 7.25” denote the event “bottle number i contains more than 7.25 ounces”. Then for all 6 bottles to exceed 7.25 ounces, we have

P(x1 > 7.25 ∩ x2 > 7.25 ∩ … ∩ x6 > 7.25) = P(x1 > 7.25)P(x2 > 7.25) … P(x6 > 7.25) = (.25)^6 = .000244, assuming the bottles are filled independently.
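For a uniform distribution every probability is a ratio of interval lengths, which makes problem 4.79 easy to verify. The sketch below assumes the endpoints 6.5 and 7.5 from the problem; the function name `uniform_prob` is hypothetical and chosen only for this illustration.

```python
from math import sqrt

def uniform_prob(a, b, c, d):
    """P(a < x < b) for x uniform on [c, d]; the interval is clipped to [c, d]."""
    lo, hi = max(a, c), min(b, d)
    return max(hi - lo, 0.0) / (d - c)

c, d = 6.5, 7.5                  # endpoints of the uniform distribution
mu = (c + d) / 2                 # mean, part (c)
sigma = (d - c) / sqrt(12)       # standard deviation, part (c)

p_e = uniform_prob(float("-inf"), 6, c, d)   # part (e): no density below 6.5
p_f = uniform_prob(6.5, 7.25, c, d)          # part (f)
p_above = uniform_prob(7.25, d, c, d)        # P(x > 7.25)
p_g = p_above ** 6                           # part (g): six independent bottles

print(mu, round(sigma, 5), p_e, p_f, p_above, round(p_g, 6))
```

Clipping the requested interval to [c, d] is a design choice that lets the same helper handle cases such as part (e), where the interval lies entirely outside the support.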

[4.85] Remember, the trick with using the z table is to draw (or envision) the area under the z curve that lies above the interval (e.g., the interval z > 1.46) and then think about how you would chop up that area into pieces of the form 0 < z < z0 (i.e., the type of areas that are given by the z table). No formulas are needed – just reason geometrically about these areas and the answer will fall out. Think of the statement “P(z > 1.46)” as being “the area under the z curve over the interval z > 1.46”.

(a) P(z > 1.46) = (area to the right of 0) – (area between 0 and 1.46) = .5000 - .4279 = .0721

(b) P(z < -1.56) = (area to the left of 0) – (area between -1.56 and 0) = .5000 - .4406 = .0594

(c) P(.67 < z < 2.41) = (area between 0 and 2.41) – (area between 0 and .67) = .4920 - .2486 = .2434

(d) P(-1.96 < z < -.33) = (area between -1.96 and 0) – (area between -.33 and 0) = .4750 - .1293 = .3457

(e) P(z ≥ 0) = (area to the right of 0) = (half the total area) = .5

(f) P(-2.33 < z < 1.50) = (area between -2.33 and 0) + (area between 0 and 1.50) = .4901 + .4332 = .9233
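The table lookups in problem 4.85 can be cross-checked with the standard normal cumulative distribution function. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper `area_between` is a name made up for this example. Because the z table is rounded to four decimals, agreement is expected only to about that precision.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_between(a, b):
    """Area under the z curve over the interval a < z < b (hypothetical helper)."""
    return Z.cdf(b) - Z.cdf(a)

answers = {
    "a": 1 - Z.cdf(1.46),             # P(z > 1.46)
    "b": Z.cdf(-1.56),                # P(z < -1.56)
    "c": area_between(0.67, 2.41),
    "d": area_between(-1.96, -0.33),  # uses -1.96, as in the area calculation
    "e": 1 - Z.cdf(0),                # P(z >= 0)
    "f": area_between(-2.33, 1.50),
}

for part, value in answers.items():
    print(part, round(value, 4))
```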

[4.89] As in problem 4.85 (or in any problem involving z), draw (or at least envision) the area under the z curve and then break it up into z-table-type areas (those between 0 and z0). Remember: when you are given the areas under the z curve (as in this problem), you must use the z table “backwards” by looking up that area in the interior of the table and then reading the z0 value from the left and top margins of the table.

(a) P(z ≤ z0) = .2090 is less than .5000, so z0 must be negative. Then .2090 = (area left of z0) = (area left of 0) – (area between z0 and 0), so (area between z0 and 0) = .5000 - .2090 = .2910. From the z table, an area of .2910 corresponds to .81, so z0 = -.81.

(b) P(z ≤ z0) = .7090 = (area left of 0) + (area between 0 and z0), so (area between 0 and z0) = .7090 - .5000 = .2090. The closest z0 is z0 = .55, which corresponds to an area of .2088, so use z0 ≈ .55 as a good/close approximation to the correct value of z0.

(c) P(-z0 ≤ z < z0) = .8472 = (area between –z0 and 0) + (area between 0 and z0). Because the z curve is symmetric, both these areas are the same, so, .8472 = 2(area between 0 and z0) or, .8472/2 = .4236 = (area between 0 and z0). From z table, z0 = 1.43

(d) P(-z0 ≤ z < z0) = .1664 = (area between –z0 and 0) + (area between 0 and z0). As in part (c), both these areas are the same, so .1664 = 2(area between 0 and z0) or .1664/2 = .0832 = (area between 0 and z0). From z table, z0 = .21.

(e) P(z0 ≤ z ≤ 0) = .4798 = (area between z0 and 0) which, by symmetry of z, equals (area between 0 and -z0). The z table gives -z0 = 2.05 (so z0 = -2.05).

(f) P(-1 ≤ z < z0) = .5328 = (area between –1 and 0) + (area between 0 and z0), so (area between 0 and z0) = .5328 - .3413 = .1915 and z0 = .50

[4.109] x = “amount of dye discharged” is normal with mean μ and standard deviation σ = .4. We want P(unacceptable) = P(x > 6) = .01. Standardizing, this becomes P(z > (6 - μ)/σ) = .01, so from the z table, (6 - μ)/.4 ≈ 2.33, which gives μ ≈ 6 - .4(2.33) = 5.068.
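Reading the z table “backwards” corresponds to evaluating the inverse of the standard normal cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library to illustrate this for a few parts of 4.89 and for 4.109; the variable names are chosen here for illustration. The results differ slightly from table-based answers because the table rounds z to two decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# 4.89(a): P(z <= z0) = .2090, a left-tail area below .5, so z0 is negative
z0_a = Z.inv_cdf(0.2090)

# 4.89(b): P(z <= z0) = .7090
z0_b = Z.inv_cdf(0.7090)

# 4.89(c): P(-z0 <= z <= z0) = .8472, so P(z <= z0) = .5 + .8472/2
z0_c = Z.inv_cdf(0.5 + 0.8472 / 2)

# 4.109: P(x > 6) = .01 with sigma = .4, so (6 - mu)/.4 is the 99th percentile of z
z_99 = Z.inv_cdf(0.99)
mu = 6 - 0.4 * z_99

print(round(z0_a, 2), round(z0_b, 2), round(z0_c, 2), round(mu, 3))
```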

[4.125] A binomial random variable can be approximated by areas under a normal curve when n is large. Just use the binomial’s mean np and standard deviation √(npq) as the mean and standard deviation of the normal curve. In this problem μ = np = (100)(.40) = 40 and σ = √(npq) = √(100(.40)(1 - .40)) = √24 = 4.89898.

(a) P(x ≤ 35) ≈ P(x_normal ≤ 35.5) = P(z ≤ (35.5 - 40)/4.89898) = P(z ≤ -.92) = .5000 - .3212 = .1788.

(b) P(40 ≤ x ≤ 50) ≈ P(39.5 ≤ x_normal ≤ 50.5) = P((39.5 - 40)/4.89898 ≤ z ≤ (50.5 - 40)/4.89898) = P(-.10 ≤ z ≤ 2.14) = .0398 + .4838 = .5236

(c) P(x ≥ 38) ≈ P(x_normal ≥ 37.5) = P(z ≥ (37.5 - 40)/4.89898) = P(z ≥ -.51) = .1950 + .5000 = .6950.
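To see how well the normal approximation with the half-unit continuity correction works in problem 4.125, the approximate areas can be compared with exact binomial sums. This is a rough sketch using only the standard library; `exact_cdf` is a helper name invented for this example, and the approximate values differ a little from the table-based answers because the table rounds z to two decimals.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.40
mu = n * p                        # binomial mean np
sigma = sqrt(n * p * (1 - p))     # binomial standard deviation sqrt(npq)
approx = NormalDist(mu, sigma)    # approximating normal curve

def exact_cdf(k):
    """Exact P(X <= k) for X ~ binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# (a) P(x <= 35): correct the upper endpoint to 35.5
approx_a, exact_a = approx.cdf(35.5), exact_cdf(35)

# (b) P(40 <= x <= 50): widen the interval to 39.5 .. 50.5
approx_b = approx.cdf(50.5) - approx.cdf(39.5)
exact_b = exact_cdf(50) - exact_cdf(39)

# (c) P(x >= 38): correct the lower endpoint to 37.5
approx_c, exact_c = 1 - approx.cdf(37.5), 1 - exact_cdf(37)

for label, a, e in [("a", approx_a, exact_a), ("b", approx_b, exact_b), ("c", approx_c, exact_c)]:
    print(label, round(a, 4), round(e, 4))
```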

[4.129] x = “# having post-laser vision problems” is binomial with n = 100,000 and p = .01.

P(x < 950) = P(x ≤ 949) ≈ P(x_normal ≤ 949.5) = P(z ≤ (949.5 - (100,000)(.01))/√((100,000)(.01)(1 - .01))) = P(z ≤ -1.61) = .5000 - .4463 = .0537.

[4.133] x = “# defective CDs” is binomial with n = 1600 and p = .006.

P(x ≥ 12) ≈ P(x_normal ≥ 11.5) = P(z ≥ (11.5 - (1600)(.006))/√((1600)(.006)(1 - .006))) = P(z ≥ .62) = .5000 - .2324 = .2676. The probability of the event “x ≥ 12” is fairly large (i.e., the event is fairly likely to occur), so finding 12 or more defective CDs would not be unreasonable if the defect rate is really .006 (i.e., the defect-free rate is .994).

[4.139] There are 5 choices for which number could be selected first and 5 choices for which one could be selected second, so there are 5 × 5 = 25 possible samples of size 2 (as listed in the problem). Since the two items in any sample are selected independently (i.e., at random), we can simply multiply the two probabilities to get the probability of both items. For example, P(1,2) = P(x1 = 1 ∩ x2 = 2) = P(x1 = 1)P(x2 = 2) = (.2)(.3) = .06.

(a) The means of the samples are 1, 1.5, 2, 2.5, …, 4.5, 5. Adding the probabilities of the pairs that are associated with each mean, we get the sampling distribution:

sample mean   probability
1             .04    ← (1,1)
1.5           .12
2             .17
2.5           .20
3             .20    ← (1,5) (2,4) (3,3) (4,2) (5,1)
3.5           .14
4             .08
4.5           .04
5             .01

(b) The histogram of the distribution in part (a) looks approximately like this:

[Figure: histogram of the sampling distribution of x̄; horizontal axis: sample mean, from 1 to 5 in steps of .5]

(c) P(x̄ ≥ 4.5) = P(x̄ = 4.5) + P(x̄ = 5) = .04 + .01 = .05

(d) P(x̄ ≥ 4.5) is a fairly rare event (it happens only 5% of the time), so it would be unlikely to find values of x̄ at or above 4.5.
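The sampling distribution in problem 4.139 can be rebuilt by enumerating all 25 ordered samples and adding up the probabilities for each sample mean. The sketch below is an illustration under a stated assumption: only P(x = 1) = .2 and P(x = 2) = .3 appear in the solution, so the remaining population probabilities (.2, .2, .1 for x = 3, 4, 5) are inferred here as the values that reproduce the tabulated results, not quoted from the problem.

```python
from collections import defaultdict
from itertools import product

# Assumed population distribution: values for x = 3, 4, 5 are inferred, not quoted
p = {1: 0.2, 2: 0.3, 3: 0.2, 4: 0.2, 5: 0.1}

# Enumerate all 25 ordered samples (x1, x2); independence lets us multiply
sampling_dist = defaultdict(float)
for x1, x2 in product(p, repeat=2):
    sampling_dist[(x1 + x2) / 2] += p[x1] * p[x2]

for mean in sorted(sampling_dist):
    print(mean, round(sampling_dist[mean], 2))

# Part (c): probability that the sample mean is at least 4.5
p_tail = sum(prob for mean, prob in sampling_dist.items() if mean >= 4.5)
print(round(p_tail, 2))
```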

 16 [4.149] n = 64 =2, so

(a) P(x̄ < 16) ≈ P(z < (16-20)/2) = P(z < -2.0) = .5000 - .4772 = .0228

(b) P(x̄ > 23) ≈ P(z > (23-20)/2) = P(z > 1.5) = .5000 - .4332 = .0668

(c) P(x̄ > 25) ≈ P(z > (25-20)/2) = P(z > 2.5) = .5000 - .4938 = .0062

(d) P(16 < x̄ < 22) ≈ P((16-20)/2 < z < (22-20)/2) = P(-2.0 < z < 1) = .4772 + .3413 = .8185

(e) P(x̄ < 14) ≈ P(z < (14-20)/2) = P(z < -3) = .5000 - .4987 = .0013

[4.153] (a) μ_x̄ = μ = 141

  18 (b) x = n = 100 = 1.8

(c) Because n = 100 exceeds 30, the Central Limit Theorem says that the sampling distribution of x̄ should be approximately normal.

(d) z = (x̄ - μ_x̄)/σ_x̄ = (142-141)/1.8 = .5555 ≈ .56

(e) P(x̄ > 142) = P(z > .56) = .5000 - .2123 = .2877
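The calculations in problems 4.149 and 4.153 all follow one pattern: treat x̄ as approximately normal with mean μ and standard deviation σ/√n, then find an area. A minimal sketch of that pattern follows; `sample_mean_dist` is a hypothetical helper name, and the inputs (μ = 20, σ = 16, n = 64 and μ = 141, σ = 18, n = 100) are read off the z-score computations in the solutions. Small differences from the table-based answers come from rounding z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_dist(mu, sigma, n):
    """Approximate sampling distribution of the sample mean under the CLT."""
    return NormalDist(mu, sigma / sqrt(n))

# Problem 4.149: mu = 20, sigma = 16, n = 64
xbar_149 = sample_mean_dist(20, 16, 64)
p_149a = xbar_149.cdf(16)                      # P(xbar < 16)
p_149b = 1 - xbar_149.cdf(23)                  # P(xbar > 23)
p_149d = xbar_149.cdf(22) - xbar_149.cdf(16)   # P(16 < xbar < 22)

# Problem 4.153: mu = 141, sigma = 18, n = 100
xbar_153 = sample_mean_dist(141, 18, 100)
z_153d = (142 - xbar_153.mean) / xbar_153.stdev  # z score for xbar = 142
p_153e = 1 - xbar_153.cdf(142)                   # P(xbar > 142)

print(round(p_149a, 4), round(p_149b, 4), round(p_149d, 4))
print(round(z_153d, 2), round(p_153e, 4))
```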

[4.161] (a) Because n = 50 exceeds 30, the Central Limit Theorem says that the sampling distribution of x̄ should be approximately normal.

(b) P(x̄ > 44) = P(z > (44-40)/(12/√50)) = P(z > 2.36) = .5000 - .4909 = .0091

(c) P(μ - 2σ/√n < x̄ < μ + 2σ/√n) = P(-2 < z < 2) = .4772 + .4772 = .9544

[4.167] You can simply use the n=20, p = .7 binomial table for this problem:

(a) P(X = 14) = P(X ≤ 14) - P(X ≤ 13) = .584 - .392 = .192

(b) P(X ≤ 12) = .228

(c) P(X > 12) = 1 - P(X ≤ 12) = 1 - .228 = .772

(d) P(9 ≤ X ≤ 18) = P(X ≤ 18) - P(X ≤ 8) = .992 - .005 = .987

(e) P(8 < X < 18) = P(X ≤ 17) - P(X ≤ 8) = .965 - .005 = .960

(f) μ = np = 20(.7) = 14, σ = √(npq) = √(20(.7)(.3)) = 2.049, σ² = npq = 4.2

(g) P(μ - 2σ < x < μ + 2σ) = P(14 - 2(2.049) < x < 14 + 2(2.049)) = P(9.902 < x < 18.098) = P(10 ≤ x ≤ 18) = P(x ≤ 18) - P(x ≤ 9) = .992 - .017 = .975

[4.183] (a) For a uniform distribution, μ = (c + d)/2 = (10,000 + 15,000)/2 = 12,500.

(b) P(x > 12,000) = (15000 – 12,000)/(15000 – 10,000) = 3000 / 5000 = .60.

(c) We want .20 = P(x > x0) = (15,000 – x0)/(15,000 – 10,000), so 15,000 – x0 = 1,000 and x0 = 14,000.

[4.187] If μ = 18.2 and σ = 10.64, then the lowest possible time, x = 0 years, would have a z score of z = (0 - 18.2)/10.64 = -1.71. In other words, no values of x farther than 1.71 standard deviations below the mean would be possible, which isn’t the case for a normal distribution (about 4% of a normal distribution’s values lie more than 1.71 standard deviations below the mean). So, this data is unlikely to have come from a normal distribution.

[4.197] X = “number of questionnaires returned” ~ binomial (n, p = .4), so we want the probability P(X > 100) to be large. Using the normal approximation to the binomial, P(X > 100) ≈ P(z > (100 - np)/√(npq)) should be “large”, which would require that (100 - np)/√(npq) be fairly large and negative. Since we know that a lot of the area under a z curve lies to the right of -2, let’s choose -2 as a “large negative” z value and find the n that makes P(z > (100 - np)/√(npq)) = P(z > -2) = .9772 (a “large” probability):

(100 - np)/√(npq) = (100 - .4n)/√(.24n) = -2, which has the solution n = 291.85, or about n = 292.

Of course, if you want an even larger probability of getting 100 responses back, then you would find the z value corresponding to that probability and re-solve the equation above.
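The equation for n in problem 4.197 becomes a quadratic in s = √n, since 100 - np = z√(pq)·√n rearranges to p·s² + z√(pq)·s - 100 = 0. The sketch below solves that quadratic for the z value of -2 chosen in the solution; the variable names are illustrative only, and changing `z` shows how the required n grows for a larger target probability.

```python
from math import ceil, sqrt

p, target, z = 0.4, 100, -2.0   # return rate, responses wanted, chosen z value
q = 1 - p

# (target - n*p)/sqrt(n*p*q) = z  becomes  p*s**2 + z*sqrt(p*q)*s - target = 0, s = sqrt(n)
a, b, c = p, z * sqrt(p * q), -target
s = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)   # positive root of the quadratic
n_exact = s ** 2
n_needed = ceil(n_exact)                       # round up to a whole questionnaire

# Sanity check: plugging n_exact back in should reproduce the chosen z value
z_check = (target - n_exact * p) / sqrt(n_exact * p * q)

print(round(n_exact, 2), n_needed, round(z_check, 6))
```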