Bootstrap Regression With R: Histogram of KPL
[Figure: Histogram of kpl. Horizontal axis: kpl (4 to 18); vertical axis: Frequency (0 to 35).]
> # Now bootstrap the mean of kpl
> n = length(kpl); B = 1000
> mstar = NULL # mstar will contain bootstrap mean values
>
> for(draw in 1:B) mstar = c(mstar,mean(kpl[sample(1:n,size=n,replace=TRUE)]))
> hist(mstar)
[Figure: Histogram of mstar. Horizontal axis: mstar; vertical axis: Frequency (0 to 250).]
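The loop above grows mstar one element at a time. A minimal equivalent sketch using replicate() is given here. The kpl values are not included in this transcript, so the vector below is simulated stand-in data (an assumption), and set.seed() is added only so the run can be repeated.

```r
# Stand-in data: the real kpl values are not shown in the transcript (assumption).
set.seed(1)
kpl <- rnorm(100, mean = 8.8, sd = 2.7)

n <- length(kpl)
B <- 1000

# Each replicate draws n observations with replacement and records their mean.
mstar <- replicate(B, mean(sample(kpl, size = n, replace = TRUE)))

length(mstar)   # number of bootstrap means (B)
sd(mstar)       # bootstrap estimate of the standard error of the sample mean
```

The standard deviation of mstar should be close to sd(kpl)/sqrt(n), which is a quick sanity check on the resampling.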
> # Look at a normal qq plot. That's a plot of the order statistics against
> # the corresponding quantiles of the (standard) normal. Should be roughly linear
> # if the data are from a normal distribution.
[Figure: Normal Q-Q plot. Horizontal axis: Theoretical Quantiles (-3 to 3); vertical axis: Sample Quantiles (8.0 to 9.5).]
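The plotting command itself does not appear in the transcript. A normal Q-Q plot of this kind is commonly drawn with qqnorm(), with qqline() adding a reference line; the sketch below assumes that approach and uses simulated stand-in values in place of the real bootstrap means.

```r
# Stand-in bootstrap means (assumption); the real mstar comes from the loop above.
set.seed(2)
mstar <- rnorm(1000, mean = 8.8, sd = 0.27)

# qqnorm() draws the plot and invisibly returns the plotted coordinates:
# x = theoretical normal quantiles, y = ordered sample values.
qq <- qqnorm(mstar)
qqline(mstar)          # reference line through the first and third quartiles

# A correlation near 1 between the two sets of quantiles indicates a
# roughly linear plot.
cor(qq$x, qq$y)
```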
> # Quantile bootstrap CI for mu. Use ONLY if the bootstrap distribution is symmetric.
> sort(mstar)[25]; sort(mstar)[975]
[1] 8.3034
[1] 9.3492
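The interval above takes the 25th and 975th of the 1000 sorted bootstrap means. A short sketch comparing that with quantile(), whose default method interpolates between order statistics and so can give slightly different endpoints. The mstar values here are simulated stand-ins (an assumption).

```r
# Stand-in bootstrap means (assumption).
set.seed(3)
mstar <- rnorm(1000, mean = 8.8, sd = 0.27)

# Order-statistic version, as in the transcript.
ci_sorted <- sort(mstar)[c(25, 975)]

# Interpolating version using R's default quantile type.
ci_quantile <- quantile(mstar, probs = c(0.025, 0.975))

rbind(ci_sorted, ci_quantile)
```

With B = 1000 the two versions differ only by a fraction of the gap between neighbouring order statistics.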
> # Compare the usual CI
> t.test(kpl)
data: kpl
t = 32.9363, df = 99, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
8.264966 9.324634
sample estimates:
mean of x
8.7948
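The t interval can be checked by hand from the numbers printed above. This sketch uses only the reported mean, t statistic and degrees of freedom; the variable names are illustrative.

```r
xbar  <- 8.7948    # sample mean from the output
tstat <- 32.9363   # t statistic for H0: mu = 0
df    <- 99        # n - 1, with n = 100

# Because t = xbar / se under H0: mu = 0, the standard error can be recovered.
se <- xbar / tstat

# 95% interval: xbar plus or minus the t critical value times se.
margin <- qt(0.975, df) * se
ci <- c(lower = xbar - margin, upper = xbar + margin)
ci
```

Up to rounding of the printed inputs, this reproduces the interval reported by t.test(), which is close to the quantile bootstrap interval.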
> # Now regression
> # Compute some polynomial terms
> wsq = weight^2; lsq = length^2; wl = weight*length
> # Bind it into a nice data frame
> datta = cbind(kpl,weight,length,wsq,lsq,wl)
> datta = as.data.frame(datta)
>
> model1 = lm(kpl ~ weight + length + wsq + lsq + wl, data=datta)
> summary(model1)
Call:
lm(formula = kpl ~ weight + length + wsq + lsq + wl, data = datta)

Residuals:
    Min      1Q  Median      3Q     Max
-4.0861 -0.8702  0.0490  0.6898  4.4006

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   79.124     29.121   2.717  0.00784 **
weight        24.336     26.570   0.916  0.36204
length       -33.764     19.350  -1.745  0.08427 .
wsq           11.377      8.531   1.334  0.18556
lsq            5.140      3.410   1.507  0.13508
wl           -12.442     10.174  -1.223  0.22442
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
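The commands that create betahat and bstar are not shown in this transcript. One plausible construction, consistent with the final comment that the residuals were not what was resampled, is to resample whole rows of the data frame and refit the model each time. The sketch below assumes that approach and uses simulated stand-in data, since the real kpl, weight and length values are not available here.

```r
# Hypothetical stand-in data (assumption); ranges and coefficients are arbitrary.
set.seed(4)
n <- 100
datta <- data.frame(weight = runif(n, 0.8, 2.2), length = runif(n, 3.5, 5.5))
datta$kpl <- with(datta, 20 - 4 * weight - length + rnorm(n))
datta <- transform(datta, wsq = weight^2, lsq = length^2, wl = weight * length)

model1  <- lm(kpl ~ weight + length + wsq + lsq + wl, data = datta)
betahat <- coef(model1)

B <- 1000
# One row per bootstrap sample, one column per regression coefficient.
bstar <- matrix(NA_real_, nrow = B, ncol = length(betahat),
                dimnames = list(NULL, names(betahat)))

for (draw in 1:B) {
  rows <- sample(n, size = n, replace = TRUE)   # resample whole cases
  bstar[draw, ] <- coef(lm(kpl ~ weight + length + wsq + lsq + wl,
                           data = datta[rows, ]))
}

Vb <- var(bstar)   # sample covariance matrix of the bootstrap coefficients
```

Preallocating bstar as a B by 6 matrix avoids growing an object inside the loop and keeps the coefficient names as column names for var().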
> Vb = var(bstar) # Approximate asymptotic covariance matrix of betahat
> Vb
            (Intercept)     weight     length        wsq        lsq         wl
(Intercept)   4009.5755  2805.7433 -2432.9272  403.57428  359.95762 -795.78318
weight        2805.7433  2337.0043 -1816.6783  434.68245  288.94772 -724.67663
length       -2432.9272 -1816.6783  1511.4557 -292.33275 -229.91599  534.81663
wsq            403.5743   434.6825  -292.3327  117.92217   52.99175 -158.04873
lsq            359.9576   288.9477  -229.9160   52.99175   36.19566  -89.15841
wl            -795.7832  -724.6766   534.8166 -158.04873  -89.15841  239.35661
>
> # Test individual coefficients. H0: betaj=0
> se = sqrt(diag(Vb)); Z = betahat/se
> rbind(betahat,se,Z)
        (Intercept)     weight      length       wsq       lsq          wl
betahat   79.124214 24.3361095 -33.7637822 11.376646 5.1396488 -12.4424491
se        63.321209 48.3425725  38.8774443 10.859198 6.0162826  15.4711541
Z          1.249569  0.5034095  -0.8684671  1.047651 0.8542898  -0.8042354
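The Z statistics can be turned into two-sided p-values using the standard normal distribution. The sketch below re-enters betahat and se from the output above; pval is an illustrative name not used in the original session.

```r
# Values copied from the rbind(betahat, se, Z) output.
betahat <- c("(Intercept)" = 79.124214, weight = 24.3361095,
             length = -33.7637822, wsq = 11.376646,
             lsq = 5.1396488, wl = -12.4424491)
se <- c(63.321209, 48.3425725, 38.8774443, 10.859198, 6.0162826, 15.4711541)

Z <- betahat / se

# Two-sided p-value for H0: betaj = 0 under the normal approximation.
pval <- 2 * pnorm(-abs(Z))

round(rbind(Z, pval), 4)
```

Every |Z| here is below 1.96, so none of the coefficients is individually significant at the 0.05 level under this approximation.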
Final comment: This is not a typical bootstrap regression. It’s more common to bootstrap the residuals. But that applies to a conditional model in which the values of the explanatory variables are fixed constants.
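For comparison, a minimal sketch of the residual bootstrap mentioned in the comment: the explanatory variable is held fixed, and new responses are built by adding resampled residuals to the fitted values. The data and the simple one-predictor model are hypothetical stand-ins chosen only to keep the example short.

```r
# Hypothetical data (assumption); x is treated as a fixed constant throughout.
set.seed(5)
n <- 100
x <- runif(n, 0.8, 2.2)
y <- 15 - 4 * x + rnorm(n)

fit  <- lm(y ~ x)
yhat <- fitted(fit)
ehat <- resid(fit)

B <- 1000
# Each replicate resamples residuals, rebuilds the response, and refits.
bstar_resid <- t(replicate(B, {
  ystar <- yhat + sample(ehat, size = n, replace = TRUE)
  coef(lm(ystar ~ x))
}))

# Bootstrap standard errors of the intercept and slope.
se_boot <- sqrt(diag(var(bstar_resid)))
se_boot
```

Because x never changes across replicates, the resulting standard errors are conditional on the observed explanatory values, which is the distinction the comment draws.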