Bootstrap (Part 3)

Christof Seiler

Stanford University, Spring 2016, Stats 205

Overview

- So far we have used three different bootstraps:

- Nonparametric bootstrap on the rows (e.g. regression, PCA with random rows and columns)
- Nonparametric bootstrap on the residuals (e.g. regression)
- Parametric bootstrap (e.g. PCA with fixed rows and columns)

- Today, we will look at some tricks to improve the bootstrap for confidence intervals:

- Studentized bootstrap

Introduction

- A statistic is (asymptotically) pivotal if its limiting distribution does not depend on unknown quantities
- For example, with observations X1, ..., Xn from a normal distribution with unknown mean and variance, a pivotal quantity is

  T(X1, ..., Xn) = √n (θ̂ − θ) / σ̂

  with unbiased estimates for sample mean and variance

θ̂ = (1/n) ∑_{i=1}^n X_i    σ̂² = (1/(n−1)) ∑_{i=1}^n (X_i − θ̂)²

- Then T(X1, ..., Xn) is a pivot following the Student's t-distribution with ν = n − 1 degrees of freedom
- This is because the distribution of T(X1, ..., Xn) depends on neither µ nor σ²
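As a small numerical illustration (not from the slides), one draw of the pivot T can be computed in Python; the sample size and true mean below are arbitrary choices:

```python
import math
import random
import statistics

# Hypothetical example: one draw of the pivot T = sqrt(n) * (theta_hat - theta) / sigma_hat
# for a normal sample with known true mean theta = 5 (all constants are our own choices).
random.seed(1)
n, theta = 30, 5.0
x = [random.gauss(theta, 2.0) for _ in range(n)]

theta_hat = statistics.fmean(x)   # sample mean, the unbiased estimate of theta
sigma_hat = statistics.stdev(x)   # square root of the unbiased variance estimate
T = math.sqrt(n) * (theta_hat - theta) / sigma_hat
# T is one draw from Student's t-distribution with n - 1 = 29 degrees of freedom
print(T)
```

Repeating this many times and histogramming T would trace out the t-distribution, regardless of the unknown µ and σ².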

- The bootstrap is better at estimating the distribution of a pivotal statistic than that of a nonpivotal statistic
- We will see an asymptotic argument using Edgeworth expansions
- But first, let us look at an example

Motivation

- Take n = 20 random exponential variables with mean 3

n = 20
x = rexp(n, rate = 1/3)

- Generate B = 1000 bootstrap samples of x, and calculate the mean for each bootstrap sample

B = 1000
s = numeric(B)
for (j in 1:B) {
  boot = sample(n, replace = TRUE)
  s[j] = mean(x[boot])
}

- Form a confidence interval from the bootstrap samples using quantiles (α = .025)

simple.ci = quantile(s,c(.025,.975))

- Repeat this process 100 times
- Check how often the intervals actually contain the true mean
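A self-contained Python sketch of this coverage experiment (standard library only; `random.expovariate(1/3)` plays the role of `rexp(n, rate = 1/3)`, and the function name `percentile_ci` is our own):

```python
import random
import statistics

# Sketch of the coverage experiment: percentile bootstrap intervals for the
# mean of n = 20 exponential variables with true mean 3, repeated 100 times.
random.seed(0)
n, B, true_mean = 20, 1000, 3.0

def percentile_ci(x, B, alpha=0.025):
    # Percentile interval: resample with replacement, take empirical quantiles.
    means = []
    for _ in range(B):
        boot = [random.choice(x) for _ in range(len(x))]
        means.append(statistics.fmean(boot))
    means.sort()
    return means[int(alpha * B)], means[int((1 - alpha) * B) - 1]

covered = 0
for _ in range(100):
    x = [random.expovariate(1 / true_mean) for _ in range(n)]
    lo, hi = percentile_ci(x, B)
    covered += lo <= true_mean <= hi
print(covered)  # count of intervals (out of 100) containing the true mean
```

For skewed data at n = 20, the count tends to fall short of the nominal 95, which is what the figure below illustrates.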

[Figure: the 100 simple bootstrap confidence intervals]

- Another way is to bootstrap a pivotal quantity instead of the raw statistic
- Calculate the sample mean and standard deviation

x = rexp(n, rate = 1/3)
mean.x = mean(x)
sd.x = sd(x)

- For each bootstrap sample, calculate

z = numeric(B)
for (j in 1:B) {
  boot = sample(n, replace = TRUE)
  z[j] = (mean.x - mean(x[boot])) / sd(x[boot])
}

- Form a confidence interval like this

pivot.ci = mean.x + sd.x * quantile(z, c(.025, .975))
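The same studentized pivot can be sketched in Python (standard library only; variable names are our own, and the √n factors cancel between the pivot and the interval, as in the R code):

```python
import random
import statistics

# Sketch of the studentized pivot z*b = (mean(x) - mean(x*b)) / sd(x*b),
# mirroring the R snippet: pivot.ci = mean.x + sd.x * quantile(z, c(.025, .975))
random.seed(3)
n, B = 20, 1000
x = [random.expovariate(1 / 3) for _ in range(n)]
mean_x = statistics.fmean(x)
sd_x = statistics.stdev(x)

z = []
for _ in range(B):
    boot = [random.choice(x) for _ in range(n)]
    z.append((mean_x - statistics.fmean(boot)) / statistics.stdev(boot))
z.sort()

lo = mean_x + sd_x * z[int(0.025 * B)]       # lower 2.5% quantile of the pivot
hi = mean_x + sd_x * z[int(0.975 * B) - 1]   # upper 97.5% quantile of the pivot
print(lo, hi)  # pivot-based 95% interval
```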

[Figure: the 100 pivot-based bootstrap confidence intervals]

Studentized Bootstrap

- Consider X1, ..., Xn from F
- Let θ̂ be an estimate of some parameter θ
- Let σ̂ be a standard error for θ̂ estimated using the bootstrap
- Most of the time, as n grows,

(θ̂ − θ) / σ̂ ≈ N(0, 1)

- Let z^(α) be the 100·α-th percentile of N(0, 1)
- Then a standard confidence interval with coverage probability 1 − 2α is

  θ̂ ± z^(1−α) · σ̂
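For example, with hypothetical values θ̂ = 3.1 and σ̂ = 0.4 (numbers of our own, not from the slides), Python's `statistics.NormalDist` gives the standard interval:

```python
from statistics import NormalDist

# Hypothetical numbers: theta_hat = 3.1, bootstrap standard error 0.4, alpha = 0.025
theta_hat, se, alpha = 3.1, 0.4, 0.025
z = NormalDist().inv_cdf(1 - alpha)  # z^(1-alpha), about 1.96 for alpha = 0.025
lo, hi = theta_hat - z * se, theta_hat + z * se
print(round(lo, 3), round(hi, 3))  # -> 2.316 3.884
```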

- As n → ∞, the bootstrap and standard intervals converge

- How can we improve the standard confidence interval?
- These intervals are valid under the assumption that

Z = (θ̂ − θ) / σ̂ ≈ N(0, 1)

- But this holds only as n → ∞
- and is only approximate for finite n
- When θ̂ is the sample mean, a better approximation is

Z = (θ̂ − θ) / σ̂ ≈ t_{n−1}

where t_{n−1} is the Student's t distribution with n − 1 degrees of freedom

- With this new approximation, we have

θ̂ ± t_{n−1}^(1−α) · σ̂

- As n grows, the t distribution converges to the normal distribution
- Intuitively, it widens the interval to account for the unknown standard error
- But, for instance, it does not account for skewness in the underlying population
- This can happen when θ̂ is not the sample mean
- The studentized bootstrap can adjust for such errors

- We estimate the distribution of

Z = (θ̂ − θ) / σ̂ ≈ ?

- by generating B bootstrap samples X*1, X*2, ..., X*B
- and computing

  Z*b = (θ̂*b − θ̂) / σ̂*b

- Then the α-th percentile of Z*b is estimated by the value t̂^(α) such that

  #{Z*b ≤ t̂^(α)} / B = α

- This yields the studentized bootstrap interval

(θ̂ − t̂^(1−α) · σ̂, θ̂ − t̂^(α) · σ̂)

Asymptotic Argument in Favor of Pivoting

- Consider θ estimated by θ̂ with variance σ²/n
- Take the pivotal statistic

  S = √n (θ̂ − θ) / σ̂

with estimate θ̂ and asymptotic variance estimate σ̂²
- Then, we can use the Edgeworth expansion

  P(S ≤ x) = Φ(x) + n^(−1/2) q(x) φ(x) + O(n^(−1))

where Φ is the standard normal distribution function, φ the standard normal density, and q an even polynomial of degree 2

- The bootstrap version is

  S* = √n (θ̂* − θ̂) / σ̂*

- Then, we can use the Edgeworth expansion

P(S* ≤ x | X1, ..., Xn) = Φ(x) + n^(−1/2) q̂(x) φ(x) + O(n^(−1))

- q̂ is obtained by replacing unknowns in q with bootstrap estimates
- Asymptotically, we further have

  q̂ − q = O(n^(−1/2))

- Then, the error of the bootstrap approximation to the distribution of S is

P(S ≤ x) − P(S* ≤ x | X1, ..., Xn)
  = (Φ(x) + n^(−1/2) q(x) φ(x) + O(n^(−1))) − (Φ(x) + n^(−1/2) q̂(x) φ(x) + O(n^(−1)))

  = n^(−1/2) (q(x) − q̂(x)) φ(x) + O(n^(−1)) = O(n^(−1))

- Compare this to the normal approximation, whose error is of order n^(−1/2)
- This is the same as the error of the standard (nonpivotal) bootstrap (can be shown with the same argument)

- These pivotal intervals are more accurate in large samples than standard intervals and t intervals
- Accuracy comes at the cost of generality

- standard normal tables apply to all samples and all sample sizes
- t tables apply to all samples of fixed n
- studentized bootstrap tables apply only to the given sample

- The studentized bootstrap interval can be asymmetric
- It can be used for simple statistics, like the mean, median, trimmed mean, and sample percentiles
- But for more general statistics, like the correlation coefficient, there are some problems:

- The interval can fall outside of the allowable range
- Computational issues arise if both the parameter and its standard error have to be bootstrapped

- The studentized bootstrap works better for variance-stabilized statistics
- Consider a statistic X with mean θ and standard deviation s(θ) that varies as a function of θ
- Using the delta method and solving an ordinary differential equation, we can show that

  g(x) = ∫^x 1/s(u) du

will make the variance of g(X) approximately constant
- Usually s(u) is unknown
- So we need to estimate s(u) = se(θ̂ | θ = u) using the bootstrap
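A classical worked example (our addition, not from the slides): for a Poisson-type statistic with s(θ) = √θ, the integral gives the familiar square-root transform:

```latex
g(x) = \int^{x} \frac{du}{s(u)} = \int^{x} \frac{du}{\sqrt{u}} = 2\sqrt{x},
\qquad
\operatorname{Var}\, g(X) \approx g'(\theta)^2 \, s(\theta)^2
  = \frac{1}{\theta} \cdot \theta = 1,
```

so g(X) = 2√X has approximately constant variance by the delta method.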

1. First bootstrap θ̂, second bootstrap ŝe(θ̂) from θ̂*
2. Fit a curve through the points (θ̂*1, ŝe(θ̂*1)), ..., (θ̂*B, ŝe(θ̂*B))
3. Obtain the variance stabilization g(θ̂) by numerical integration
4. Studentized bootstrap using g(θ̂*) − g(θ̂) (no denominator, since the variance is now approximately one)
5. Map back through the inverse transformation g^(−1)
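Steps 1-3 can be sketched with a double bootstrap in Python (standard library only; the crude step-function "curve fit" and the trapezoidal integration below are simplifications of our own):

```python
import random
import statistics

# Hypothetical sketch of steps 1-3: tabulate a variance-stabilizing transform g
# for the sample mean of exponential data via a double bootstrap.
random.seed(2)
x = [random.expovariate(1 / 3) for _ in range(20)]
n, B1, B2 = len(x), 200, 50

# Step 1: bootstrap theta-hat; for each replicate, bootstrap its standard error
pairs = []
for _ in range(B1):
    xb = [random.choice(x) for _ in range(n)]
    tb = statistics.fmean(xb)
    inner = [statistics.fmean([random.choice(xb) for _ in range(n)])
             for _ in range(B2)]
    pairs.append((tb, statistics.stdev(inner)))
pairs.sort()

# Step 2: "fit a curve" -- here just the sorted (theta*, se*) pairs themselves
# Step 3: integrate 1/s(u) by the trapezoidal rule to tabulate g
thetas = [p[0] for p in pairs]
ses = [p[1] for p in pairs]
g = [0.0]
for i in range(1, B1):
    du = thetas[i] - thetas[i - 1]
    g.append(g[-1] + du * 0.5 * (1 / ses[i] + 1 / ses[i - 1]))
# g is now a monotone tabulation of the transform; steps 4-5 (studentized
# bootstrap on the g scale, then mapping back through g^(-1)) are omitted
```

In practice one would fit a smooth curve in step 2 (Efron and Tibshirani use a smoother) rather than the raw pairs used here.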

Source: Efron and Tibshirani (1994)

Studentized Bootstrap in R

library(boot)
mean.fun = function(d, i) {
  m = mean(d$hours[i])
  n = length(i)
  v = (n - 1) * var(d$hours[i]) / n^2
  c(m, v)
}
air.boot <- boot(aircondit, mean.fun, R = 999)
results = boot.ci(air.boot, type = c("basic", "stud"))

results

## BOOTSTRAP CALCULATIONS
## Based on 999 bootstrap replicates
##
## CALL :
## boot.ci(boot.out = air.boot, type = c("basic", "stud"))
##
## Intervals :
## Level      Basic             Studentized
## 95%   ( 22.2, 171.2 )   (  49.0, 303.0 )
## Calculations and Intervals on Original Scale

References

- Efron (1987). Better Bootstrap Confidence Intervals
- Hall (1992). The Bootstrap and Edgeworth Expansion
- Efron and Tibshirani (1994). An Introduction to the Bootstrap
- Love (2010). Bootstrap-t Confidence Intervals (blog post)