What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum

Tim Hesterberg
Google
[email protected]

November 20, 2014

arXiv:1411.5279v1 [stat.OT] 19 Nov 2014

Abstract

I have three goals in this article: (1) To show the enormous potential of bootstrapping and permutation tests to help students understand statistical concepts including sampling distributions, standard errors, bias, confidence intervals, null distributions, and P-values. (2) To dig deeper, understand why these methods work and when they don't, things to watch out for, and how to deal with these issues when teaching. (3) To change statistical practice: by comparing these methods to common t tests and intervals, we see how inaccurate the latter are; we confirm this with asymptotics. n ≥ 30 isn't enough; think n ≥ 5000. Resampling provides diagnostics, and more accurate alternatives. Sadly, the common bootstrap percentile interval badly under-covers in small samples; there are better alternatives. The tone is informal, with a few stories and jokes.

Keywords: Teaching, bootstrap, permutation test, randomization test

Contents

1 Overview
1.1 Notation
2 Introduction to the Bootstrap and Permutation Tests
2.1 Permutation Test
2.2 Pedagogical Value
2.3 One-Sample Bootstrap
2.4 Two-Sample Bootstrap
2.5 Pedagogical Value
2.6 Teaching Tips
2.7 Practical Value
2.8 Idea behind Bootstrapping
3 Variation in Bootstrap Distributions
3.1 Sample Mean: Large Sample Size
3.2 Sample Mean: Small Sample Size
3.3 Sample Median
3.4 Mean-Variance Relationship