R in Action, Second Edition.pdf

CHAPTER 12 Resampling statistics and bootstrapping

12.6.2 Bootstrapping several statistics

In the previous example, bootstrapping was used to estimate the confidence interval for a single statistic (R-squared). Continuing the example, let’s obtain the 95% confidence intervals for a vector of statistics. Specifically, let’s get confidence intervals for the three model regression coefficients (intercept, car weight, and engine displacement).

First, create a function that returns the vector of regression coefficients:

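bs <- function(formula, data, indices) {
  d <- data[indices,]            # rows selected for this bootstrap sample
  fit <- lm(formula, data=d)     # refit the model on the resampled rows
  return(coef(fit))              # vector of regression coefficients
}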
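Before bootstrapping, it can help to check the statistic function by hand. The following is a minimal sketch, not part of the original example: it repeats bs() so that it runs on its own, and the object names full, direct, and idx are only illustrative. When the indices are simply 1 through n, the "resample" is the original data, so bs() should reproduce the ordinary lm() coefficients.

```r
# Statistic function from the text, repeated so this block is self-contained
bs <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  return(coef(fit))
}

# Indices 1..n select every row once, so this is the original fit
full   <- bs(mpg ~ wt + disp, data = mtcars, indices = seq_len(nrow(mtcars)))
direct <- coef(lm(mpg ~ wt + disp, data = mtcars))
all.equal(full, direct)

# One bootstrap resample by hand: draw row numbers with replacement
set.seed(1234)
idx <- sample(nrow(mtcars), replace = TRUE)
bs(mpg ~ wt + disp, data = mtcars, indices = idx)
```

The boot() function does the second step repeatedly, supplying a fresh set of indices on each replication.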

Then use this function to bootstrap 1,000 replications:

library(boot)
set.seed(1234)
results <- boot(data=mtcars, statistic=bs, R=1000, formula=mpg~wt+disp)

> print(results)

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = mtcars, statistic = bs, R = 1000, formula = mpg ~
    wt + disp)

Bootstrap Statistics :
    original       bias    std. error
t1*  34.9606   0.137873      2.48576
t2*  -3.3508  -0.053904      1.17043
t3*  -0.0177  -0.000121      0.00879
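The three columns of this table can be recomputed from the returned object, which makes it clearer what they mean. This is a sketch under the assumption of an ordinary (unweighted) bootstrap like the one here; boot_bias and boot_se are illustrative names, and the code repeats the earlier setup so it runs on its own.

```r
library(boot)

bs <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  return(coef(fit))
}

set.seed(1234)
results <- boot(data = mtcars, statistic = bs, R = 1000,
                formula = mpg ~ wt + disp)

# results$t0: coefficients from the original data (the "original" column)
# results$t : 1000 x 3 matrix, one row per replicate, one column per coefficient
boot_bias <- colMeans(results$t) - results$t0   # mean of replicates minus original
boot_se   <- apply(results$t, 2, sd)            # standard deviation of replicates

data.frame(original = results$t0, bias = boot_bias, std.error = boot_se)
```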

When bootstrapping multiple statistics, add an index parameter to the plot() and boot.ci() functions to indicate which column of bootobject$t to analyze. In this example, index 1 refers to the intercept, index 2 is car weight, and index 3 is the engine displacement. To plot the results for car weight, use

plot(results, index=2)

The graph is given in figure 12.3.

To get the 95% confidence intervals for car weight and engine displacement, use

> boot.ci(results, type="bca", index=2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = results, type = "bca", index = 2)

Intervals :
Level       BCa
95%   (-5.66, -1.19 )
Calculations and Intervals on Original Scale

Bootstrapping with the boot package

[Figure: two panels. "Histogram of t", with t* on the x-axis and Density on the y-axis; and a normal Q-Q plot, with Quantiles of Standard Normal on the x-axis and t* on the y-axis.]

Figure 12.3 Distribution of bootstrapping regression coefficients for car weight

> boot.ci(results, type="bca", index=3)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 1000 bootstrap replicates

CALL :
boot.ci(boot.out = results, type = "bca", index = 3)

Intervals :
Level       BCa
95%   (-0.0331,  0.0010 )
Calculations and Intervals on Original Scale
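Rather than calling boot.ci() once per coefficient, you can loop over the index values and collect the limits in a single table. The following is only a sketch: cis is an illustrative name, the setup is repeated so the block runs on its own, and it relies on the bca component of the object returned by boot.ci() holding the lower and upper limits in its fourth and fifth columns.

```r
library(boot)

bs <- function(formula, data, indices) {
  d <- data[indices, ]
  fit <- lm(formula, data = d)
  return(coef(fit))
}

set.seed(1234)
results <- boot(data = mtcars, statistic = bs, R = 1000,
                formula = mpg ~ wt + disp)

# One BCa interval per coefficient; index picks the column of results$t
cis <- t(sapply(1:3, function(i) {
  ci <- boot.ci(results, type = "bca", index = i)
  ci$bca[4:5]   # columns 4 and 5: lower and upper limits
}))
dimnames(cis) <- list(names(results$t0), c("lower", "upper"))
cis
```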

NOTE The previous example resamples the entire sample of data each time. If you can assume that the predictor variables have fixed levels (typical in planned experiments), you’d do better to only resample residual terms. See Mooney and Duval (1993, pp. 16–17) for a simple explanation and algorithm.
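To make the idea of resampling residuals concrete, here is a minimal sketch. It isn't necessarily the algorithm that Mooney and Duval describe: it uses raw (unadjusted) residuals for simplicity, and the names fit0, resid_data, bs_resid, and results_resid are illustrative. The predictors stay fixed; each replication builds a new response from the fitted values plus residuals drawn with replacement, and then refits the model.

```r
library(boot)

# Fit the model once on the original data
fit0 <- lm(mpg ~ wt + disp, data = mtcars)

# Fixed predictors plus the fitted values and residuals from that fit
resid_data <- data.frame(mtcars[, c("wt", "disp")],
                         fitted = fitted(fit0),
                         res    = residuals(fit0))

bs_resid <- function(data, indices) {
  # Only the residuals are resampled; wt and disp keep their original rows
  data$y_star <- data$fitted + data$res[indices]
  coef(lm(y_star ~ wt + disp, data = data))
}

set.seed(1234)
results_resid <- boot(data = resid_data, statistic = bs_resid, R = 1000)

# Percentile interval for the car weight coefficient
boot.ci(results_resid, type = "perc", index = 2)
```

With indices 1 through n, the rebuilt response equals the original mpg values, so results_resid$t0 matches the coefficients of the original fit.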

Before we leave bootstrapping, it’s worth addressing two questions that come up often:

How large does the original sample need to be?

How many replications are needed?
