Section 1: Highest RBI


[Figure: Histogram of RBI; x-axis: RBI (35 to 115), y-axis: Frequency (0 to 8)]

Since the mean depends on the actual values, this solution is unique. The team that produces the highest mean RBI is:

5 Jermaine Dye
7 Aubrey Huff
11 James Loney
13 Jason Giambi
14 Pat Burrell
16 David Murphy
17 Prince Fielder
18 Matt Kemp
22 Garrett Atkins
24 Carlos Delgado

This team has a mean RBI of 94.2.

Section 2: Smallest Mean number of strikeouts

[Figure: Stem-and-leaf display of strikeouts]

This solution is unique. With a mean of 55.6, the team with the smallest mean number of strikeouts is:

1 Carl Crawford
4 Brian Giles
8 Ivan Rodriguez
9 Pedro Feliz
12 Ryan Theriot
15 Edgar Renteria
19 Emil Brown
21 Jason Kendall
23 Ian Kinsler
25 Jeff Keppinger

Section 3: Highest Median number of homeruns

Since the median only considers the position of the observations in terms of rank, this solution is not unique. With a median of 32, a team with the highest median is:

2 Cody Ross
5 Jermaine Dye
7 Aubrey Huff
13 Jason Giambi
14 Pat Burrell
17 Prince Fielder
18 Matt Kemp
22 Garrett Atkins
23 Ian Kinsler
24 Carlos Delgado

Section 4: Highest Median batting Average

We calculate the batting average as the number of hits divided by the number of at bats. These values are:

PLAYER                At Bats   Hits   Batting Average
23 Ian Kinsler            516    127   0.319
12 Ryan Theriot           415    114   0.307
4 Brian Giles             598    182   0.306
7 Aubrey Huff             606    176   0.304
6 Shane Victorino         590    172   0.293
5 Jermaine Dye            570    167   0.292
18 Matt Kemp              461    120   0.290
11 James Loney            588    162   0.289
22 Garrett Atkins         458    113   0.286
8 Ivan Rodriguez          595    172   0.276
17 Prince Fielder         459    122   0.276
16 David Murphy           503    136   0.275
1 Carl Crawford           518    165   0.273
10 Aaron Rowand           398    110   0.271
24 Carlos Delgado         402     98   0.271
15 Edgar Renteria         598    162   0.270
25 Jeff Keppinger         599    143   0.266
2 Cody Ross               580    178   0.260
20 Kosuke Fukudome        536    134   0.257
14 Pat Burrell            549    149   0.250
9 Pedro Feliz             611    175   0.249
13 Jason Giambi           443    121   0.247
21 Jason Kendall          425    106   0.246
19 Emil Brown             501    129   0.244
3 Jeff Francoeur          559    171   0.239

Again, since the median considers only rank, the solution is not unique. One possible team with a high median is:

3 Jeff Francoeur
4 Brian Giles
5 Jermaine Dye
6 Shane Victorino
7 Aubrey Huff
11 James Loney
12 Ryan Theriot
18 Matt Kemp
22 Garrett Atkins
23 Ian Kinsler

Section 5: Least variable number of bases stolen

These data are skewed right, with high outliers (four in particular).

We start with the low values for numbers of bases stolen because those values are closer together. The minimum range is 3. The minimum IQR is 1. The minimum standard deviation is 1.08. Oddly, the same team of individuals is the least variable using all three of these measures:

4 Brian Giles
5 Jermaine Dye
7 Aubrey Huff
10 Aaron Rowand
13 Jason Giambi
17 Prince Fielder
19 Emil Brown
22 Garrett Atkins
24 Carlos Delgado
25 Jeff Keppinger

Section 6: Smallest probability that a randomly selected player has stolen fewer than 10 bases

We sort the data by number of bases stolen and then select the 10 players who have stolen the most bases. Seven of the 10 players on this team have stolen at least 10 bases, so the probability that a randomly selected player has stolen fewer than 10 bases is 3/10 = 0.3.

This is the team:

PLAYER
1 Carl Crawford
6 Shane Victorino
8 Ivan Rodriguez
11 James Loney
12 Ryan Theriot
16 David Murphy
18 Matt Kemp
20 Kosuke Fukudome
21 Jason Kendall
23 Ian Kinsler

Part 7:

We choose numbers from 1-25 out of a hat (without replacement, obviously) in order to select our team. The following team results:

PLAYER                Batting Average
1 Carl Crawford       0.273
4 Brian Giles         0.306
6 Shane Victorino     0.293
8 Ivan Rodriguez      0.276
13 Jason Giambi       0.247
14 Pat Burrell        0.250
18 Matt Kemp          0.290
19 Emil Brown         0.244
22 Garrett Atkins     0.286
24 Carlos Delgado     0.271

The sample mean batting average is 0.274 and the sample standard deviation is 0.021. The critical value for a 95% confidence interval with 9 degrees of freedom is t = 2.262. Hence:

0.274 +/- 2.262*0.021/sqrt(10) = (0.258, 0.289)
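The interval above can be recomputed from the ten listed batting averages. The following is a minimal sketch, not the original author's worksheet: the function name `t_interval` is hypothetical, and the critical value 2.262 is taken from the text rather than looked up. Because the listed averages are already rounded to three decimals, the endpoints may differ from the quoted interval in the third decimal place.

```python
import math
import statistics

# Batting averages of the randomly drawn team, as listed above.
averages = [0.273, 0.306, 0.293, 0.276, 0.247,
            0.250, 0.290, 0.244, 0.286, 0.271]

# Critical value quoted in the text (t distribution, 9 degrees of freedom).
T_CRIT = 2.262


def t_interval(values, t_crit):
    """Return (lower, upper) for mean +/- t_crit * s / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


lower, upper = t_interval(averages, T_CRIT)
print(f"mean = {statistics.mean(averages):.3f}, s = {statistics.stdev(averages):.3f}")
print(f"interval = ({lower:.3f}, {upper:.3f})")
```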

In order to find the team with the narrowest CI for mean batting average, we sort the players in order from highest to lowest batting average and then select the group of consecutive players with the lowest range (hence they are closest together).

1 Carl Crawford
2 Cody Ross
8 Ivan Rodriguez
10 Aaron Rowand
15 Edgar Renteria
16 David Murphy
17 Prince Fielder
20 Kosuke Fukudome
24 Carlos Delgado
25 Jeff Keppinger
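The selection rule just described (sort by batting average, then take the run of ten consecutive players with the smallest range) can be sketched as a sliding window over the Section 4 batting averages. The function name `tightest_window` is hypothetical. Note that the smallest range is a heuristic for a small standard deviation, not a guarantee of the smallest one.

```python
# (player, batting average) pairs from the Section 4 table.
players = [
    ("23 Ian Kinsler", 0.319), ("12 Ryan Theriot", 0.307),
    ("4 Brian Giles", 0.306), ("7 Aubrey Huff", 0.304),
    ("6 Shane Victorino", 0.293), ("5 Jermaine Dye", 0.292),
    ("18 Matt Kemp", 0.290), ("11 James Loney", 0.289),
    ("22 Garrett Atkins", 0.286), ("8 Ivan Rodriguez", 0.276),
    ("17 Prince Fielder", 0.276), ("16 David Murphy", 0.275),
    ("1 Carl Crawford", 0.273), ("10 Aaron Rowand", 0.271),
    ("24 Carlos Delgado", 0.271), ("15 Edgar Renteria", 0.270),
    ("25 Jeff Keppinger", 0.266), ("2 Cody Ross", 0.260),
    ("20 Kosuke Fukudome", 0.257), ("14 Pat Burrell", 0.250),
    ("9 Pedro Feliz", 0.249), ("13 Jason Giambi", 0.247),
    ("21 Jason Kendall", 0.246), ("19 Emil Brown", 0.244),
    ("3 Jeff Francoeur", 0.239),
]


def tightest_window(pairs, size=10):
    """Return the `size` consecutive players (by rank) with the smallest range."""
    ranked = sorted(pairs, key=lambda p: p[1], reverse=True)
    best_start = min(
        range(len(ranked) - size + 1),
        key=lambda i: ranked[i][1] - ranked[i + size - 1][1],
    )
    return ranked[best_start:best_start + size]


team = tightest_window(players)
spread = team[0][1] - team[-1][1]  # highest minus lowest average in the window
for name, avg in team:
    print(f"{name:<20} {avg:.3f}")
print(f"range = {spread:.3f}")
```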

The formula for the confidence interval here is the same as above and results in the interval (0.265, 0.274). This interval is considerably narrower than the previous one. Since we didn’t choose this team at random, though, we can’t use this interval to make generalizations about the larger population.

Section 8: Smallest p-value:

Using the same random team as above:

PLAYER                Batting Average
1 Carl Crawford       0.273
4 Brian Giles         0.306
6 Shane Victorino     0.293
8 Ivan Rodriguez      0.276
13 Jason Giambi       0.247
14 Pat Burrell        0.250
18 Matt Kemp          0.290
19 Emil Brown         0.244
22 Garrett Atkins     0.286
24 Carlos Delgado     0.271

For this team, the sample proportion of players with a batting average greater than 0.3 is 0.1 (only Brian Giles, at 0.306). It doesn’t even make sense to run the hypothesis test to see if the true proportion is greater than 0.5. Not only is there no evidence for that claim, but there is fairly substantial evidence that the true proportion is actually less than (not greater than) 0.5.
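To put a number on this, here is a minimal sketch that assumes the test is a one-sided, one-proportion z-test of H0: p = 0.5 against Ha: p > 0.5, where p is the proportion of players with a batting average greater than 0.3. That choice of test is an assumption, the normal approximation is rough at n = 10, and the helper names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail_p_value(successes, n, p0=0.5):
    """z statistic and p-value for Ha: p > p0 (normal approximation)."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)  # computed under H0
    z = (p_hat - p0) / standard_error
    return z, 1.0 - normal_cdf(z)


# Random team: 1 of 10 players has a batting average above 0.3.
z, p_value = upper_tail_p_value(1, 10)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Under these assumptions the p-value for the random team is very close to 1, which is why the test offers no support for the claim. Calling `upper_tail_p_value(4, 10)` gives roughly 0.736, in line with the p-value listed for Section 8 in the Part 10 summary.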

We could build a team with a higher proportion of players batting over 0.300, but even then the highest proportion we could get is 0.4, because only 4 of the 25 players have a batting average greater than 0.3. That would still give no evidence that the proportion of players with a batting average greater than 0.3 is greater than 0.5. Moreover, we would no longer have a random sample, so we could not generalize from our results to the larger population of MLB players. In addition, the sample size needs to be larger for generalizability.

Part 9

[Figure: Scatterplot "Runs v. Homeruns"; x-axis: Runs (38 to 98), y-axis: Homeruns (0 to 40)]

Green circles denote the team with the highest correlation.

PLAYER                Runs   Homeruns
8 Ivan Rodriguez        44      7
25 Jeff Keppinger       45      3
21 Jason Kendall        46      2
8 Ivan Rodriguez        48     13
19 Emil Brown           57     13
16 David Murphy         64     15
17 Prince Fielder       86     34
7 Aubrey Huff           96     32
5 Jermaine Dye          96     34
24 Carlos Delgado       96     38
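The correlation for these ten (Runs, Homeruns) pairs can be checked directly. This is a minimal sketch of the Pearson correlation coefficient written out by hand; the function name `pearson_r` is hypothetical, and the pairs are copied in the order listed above.

```python
import math

# (Runs, Homeruns) for the ten listed players, in the order given above.
runs = [44, 45, 46, 48, 57, 64, 86, 96, 96, 96]
homeruns = [7, 3, 2, 13, 13, 15, 34, 32, 34, 38]


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(runs, homeruns)
print(f"r = {r:.3f}")
```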

The correlation coefficient is 0.970.

Part 10

Keeping in mind that we must sacrifice some things in order to get other things, the best team I can determine is:

PLAYER                 AB     H     R    SB    SO    HR   RBI   Batting Average
4 Brian Giles         559   171    81     2    52    12    63   0.306
5 Jermaine Dye        590   172    96     3   104    34    96   0.292
7 Aubrey Huff         598   182    96     4    89    32   108   0.304
11 James Loney        595   172    66     7    85    13    90   0.289
12 Ryan Theriot       580   178    85    22    58     1    38   0.307
13 Jason Giambi       458   113    68     2   111    32    96   0.247
17 Prince Fielder     588   162    86     3   134    34   102   0.276
22 Garrett Atkins     611   175    86     1   100    21    99   0.286
23 Ian Kinsler        518   165   102    26    67    18    71   0.319
24 Carlos Delgado     598   162    96     1   124    38   115   0.271

Their team stats are:

Section 1: Highest Mean RBI                                                87.800
Section 2: Lowest Mean Strikeouts                                          92.400
Section 3: Highest Median Homeruns                                         26.500
Section 4: Highest median batting average                                  0.290
Section 5: Least Variation in stolen bases per player                      9.122
Section 6: Smallest Probability that a random player had fewer than 10 SBs 0.800
Section 7: Narrowest confidence interval for Batting Average               0.030
Section 8: Smallest p-value for hypothesis test                            0.7365
Section 9: Highest correlation between Runs and homeruns                   0.288
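Several of these team statistics can be recomputed from the Part 10 table with Python's standard `statistics` module. This is a minimal sketch under one assumption: the "variation" for Section 5 is taken to be the sample standard deviation of stolen bases. The variable names are illustrative only.

```python
import statistics

# Columns copied from the Part 10 table, one entry per player in table order.
stolen_bases = [2, 3, 4, 7, 22, 2, 3, 1, 26, 1]
strikeouts = [52, 104, 89, 85, 58, 111, 134, 100, 67, 124]
homeruns = [12, 34, 32, 13, 1, 32, 34, 21, 18, 38]
rbi = [63, 96, 108, 90, 38, 96, 102, 99, 71, 115]
batting_avg = [0.306, 0.292, 0.304, 0.289, 0.307,
               0.247, 0.276, 0.286, 0.319, 0.271]

mean_rbi = statistics.mean(rbi)                # Section 1
mean_strikeouts = statistics.mean(strikeouts)  # Section 2
median_homeruns = statistics.median(homeruns)  # Section 3
median_avg = statistics.median(batting_avg)    # Section 4
sd_stolen = statistics.stdev(stolen_bases)     # Section 5 (assumed: sample SD)
# Section 6: share of players with fewer than 10 stolen bases.
prob_under_10 = sum(sb < 10 for sb in stolen_bases) / len(stolen_bases)

print(f"mean RBI          = {mean_rbi:.1f}")
print(f"mean strikeouts   = {mean_strikeouts:.1f}")
print(f"median homeruns   = {median_homeruns:.1f}")
print(f"median average    = {median_avg:.4f}")
print(f"SD stolen bases   = {sd_stolen:.3f}")
print(f"P(SB < 10)        = {prob_under_10:.1f}")
```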
