Bootstrapping Techniques in Statistical Analysis and Approaches in R
MATH 289
Ning Zhao
University of California San Diego, Department of Mathematics
Abstract
The true probability distribution of a test statistic is rarely known. Generally, its asymptotic
law is used as an approximation of the true law. If the sample size is not large enough, the
asymptotic behavior of the statistic can lead to a poor approximation of the true one. Using
bootstrap methods, under some regularity conditions, it is possible to obtain a more accurate
approximation of the distribution of the test statistic. The bootstrap is a method to derive
properties (standard errors, confidence intervals, and critical values) of the sampling distribution
of estimators. It takes the sample (the values of the independent and dependent random variables)
as the population and the estimates of the sample as the true values. Instead of drawing from a
specified distribution with a random number generator, the bootstrap draws with replacement from
the sample. This article discusses several bootstrap methods and presents simulations for some of them.
1 Definition
1.1 General Illustration of the Bootstrap World
Consider a sample of n = 1, ..., N independent observations of a dependent variable y and
M + 1 explanatory variables x. A paired bootstrap sample is obtained by independently drawing N pairs
(x_i, y_i) from the observed sample with replacement. The bootstrap sample has the same number of
observations as the original, but some observations appear several times while others never appear. The
bootstrap involves drawing a large number B of such samples. A single bootstrap sample is
denoted (x*_b, y*_b), where x*_b is an N × (M + 1) matrix and y*_b is an N-dimensional column vector of the
data in the b-th bootstrap sample.
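As a minimal sketch, assuming the observations sit in a data frame dat whose first column is the response y and whose remaining columns are the regressors (all names here are illustrative), the B paired bootstrap samples can be drawn as follows:

set.seed(1)
B <- 999                               # number of bootstrap samples
N <- nrow(dat)                         # number of observations
boot.samples <- vector("list", B)
for (b in 1:B) {
  idx <- sample(N, N, replace = TRUE)  # draw N row indices with replacement
  boot.samples[[b]] <- dat[idx, ]      # the b-th paired bootstrap sample (x_b*, y_b*)
}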
The bootstrap standard error of a scalar estimator θ̂ is

ŝe_B(θ̂) = [ (1/(B−1)) ∑_{b=1}^{B} (θ̂*_b − θ̂*)² ]^{1/2},  where θ̂* = (1/B) ∑_{b=1}^{B} θ̂*_b.
The whole covariance matrix V(θ̂) of a vector θ̂ is estimated analogously. In this algorithm
the estimator θ̂ is consistent and asymptotically normally distributed, so the bootstrap standard error
can be used to construct approximate confidence intervals and to perform asymptotic tests based on the normal distribution.
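Continuing the sketch above (dat, boot.samples, and the focus on the second coefficient are illustrative assumptions), the bootstrap standard error of a regression coefficient can be computed from the replicated estimates:

## Refit the model on each bootstrap sample and collect the coefficient of interest
theta.star <- sapply(boot.samples, function(d) coef(lm(y ~ ., data = d))[2])
theta.bar  <- mean(theta.star)                                # bootstrap mean, theta-star-bar
se.boot    <- sqrt(sum((theta.star - theta.bar)^2) / (B - 1)) # bootstrap standard error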
1.4 t-Bootstrap
Assume that we have consistent estimates θ̂ and ŝe(θ̂), and that the asymptotic distribution of
the t-statistic is the standard normal:

t = (θ̂ − θ₀) / ŝe(θ̂) → N(0, 1).
Then we can calculate approximate critical values from the percentiles of the empirical distribution of the
t-statistic over a series of bootstrap replications, yielding the interval

[θ̂ + t_{α/2} · ŝe(θ̂), θ̂ + t_{1−α/2} · ŝe(θ̂)],

where t_{α/2} and t_{1−α/2} are the bootstrap percentiles of the t-statistic.
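As a hedged sketch of this interval, assume theta.hat and se.hat are the original estimate and its standard error and t.star is a vector of bootstrap t-statistics (all names illustrative; Section 2.1 shows how such a vector can be generated):

alpha  <- 0.05
t.crit <- quantile(t.star, probs = c(alpha / 2, 1 - alpha / 2))  # bootstrap percentiles
CI <- c(theta.hat + t.crit[1] * se.hat,   # lower bound
        theta.hat + t.crit[2] * se.hat)   # upper bound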
Therefore the confidence interval from the bootstrap-t is not necessarily better than the one from the percentile
method; it is, however, consistent with bootstrap-t hypothesis testing. The bootstrap is typically
used for consistent yet biased estimators. In many cases we know the asymptotic properties of
these estimators, so we could use asymptotic theory to derive an approximate sampling
distribution. The bootstrap is an alternative way to produce approximations of the true
sampling properties, and sometimes the asymptotic sampling distribution is not simple to
derive; its derivation can be too time consuming and error prone. Another
consideration is that the bootstrap produces better approximations for some properties:
it can be shown that bootstrap approximations converge faster for some statistics than
approximations based on asymptotic theory. Such bootstrap approximations are called
asymptotic refinements.
1.5 Smooth Bootstrap

The smooth bootstrap replaces the empirical distribution function by its kernel-smoothed version

F̂(y) = (1/n) ∑_{i=1}^{n} Φ((y − Y*_i) / h),

where Φ and φ denote the distribution function and density of the standard normal distribution.
This is a continuous distribution with density

f̂(y) = (1/(nh)) ∑_{i=1}^{n} φ((y − Y*_i) / h).
From the theory of kernel density estimation it is known that asymptotically (n → ∞, h → 0,
nh → ∞) f̂ converges to the true density f of the underlying distribution. The smooth bootstrap
is thus consistent.
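Because the kernel is Gaussian, a draw from f̂ is simply an ordinary resample plus N(0, h²) noise. A minimal sketch, assuming the observed data are in a vector Y and using a rule-of-thumb bandwidth (the bandwidth choice is an assumption, not prescribed above):

n <- length(Y)
h <- bw.nrd0(Y)                           # rule-of-thumb bandwidth
Y.star   <- sample(Y, n, replace = TRUE)  # ordinary nonparametric resample
Y.smooth <- Y.star + h * rnorm(n)         # add N(0, h^2) noise: a draw from f-hat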
2 Simulation
2.1 A Regression Model
First, we compare bootstrap tests to asymptotic tests, in order to show the performance of
bootstrap tests relative to standard asymptotic tests. To this end, consider the linear
regression model

y = β₁ + β₂ x₂ + β₃ x₃ + u,  u ∼ N(0, σ² I).
In this model, we test the null hypothesis H₀: β₃ = 0 and compute a classical
t-statistic for this null hypothesis. On the basis of this test statistic, we can perform an exact
test. We select several values of N and perform parametric bootstrap tests in different
simulations, and we examine how two types of bootstrap P-values correspond with the exact
P-value and how this correspondence changes as N increases.
## `data` is assumed to be read in earlier, with nObs <- nrow(data)
Y   <- data$V1
X.1 <- rep(1, nObs)   # intercept regressor
X.2 <- data$V2
X.3 <- data$V3
## OLS estimation of unrestricted model and classical t-test for H_0: beta_3 = 0
OLS.res <- lm(Y ~ X.2 + X.3)                               # fit reconstructed from context
t.obs   <- coef(OLS.res)[3] / sqrt(diag(vcov(OLS.res))[3]) # observed t-statistic
NB.1 <- 99
NB.2 <- 999
NB.3 <- 9999
## Parametric bootstrap under the null (beta_3 = 0)
beta.1 <- coef(OLS.res)[1]
beta.2 <- coef(OLS.res)[2]
sigma  <- summary(OLS.res)$sigma
U.matrix <- matrix(0, nObs, NB.3)
Y.matrix <- matrix(0, nObs, NB.3)
T.vector <- rep(0, NB.3)
for (i in 1:NB.3) {
  U.matrix[, i] <- rnorm(nObs, 0, sigma)                        # draw normal errors
  Y.matrix[, i] <- beta.1 * X.1 + beta.2 * X.2 + U.matrix[, i]  # simulate y under H_0
  y   <- Y.matrix[, i]
  OLS <- lm(y ~ X.2 + X.3)
  T.vector[i] <- coef(OLS)[3] / sqrt(diag(vcov(OLS))[3])        # bootstrap t-statistic
}
T.vector.1 <- T.vector[1:NB.1]
T.vector.2 <- T.vector[1:NB.2]
T.vector.3 <- T.vector[1:NB.3]
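The bootstrap P-values mentioned above can then be computed as the proportion of bootstrap t-statistics at least as extreme as the observed one. A sketch (the equal-tail form and the +1 finite-sample correction are conventional choices, not taken from the text above):

p.boot <- function(t.star, t.obs) {
  (1 + sum(abs(t.star) >= abs(t.obs))) / (length(t.star) + 1)
}
p.boot(T.vector.1, t.obs)   # bootstrap P-value with B = 99
p.boot(T.vector.2, t.obs)   # B = 999
p.boot(T.vector.3, t.obs)   # B = 9999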
[Figure 1: residuals versus fitted values for the fitted regression model]
These residuals appear to exhibit normality, homogeneity, and independence fairly clearly,
although there may be a problem with heterogeneity. Most textbooks show a few examples
like this and then residuals with clear patterning, most often increasing residual values with
increasing fitted values; this figure, however, suggests that the model has decreasing residual values
with increasing fitted values. (Note that a log transformation could be tried to remove this problem
from the model.)
[Figure 2: normal Q-Q plot of the residuals]
The Q-Q plot above compares the sample quantiles of the residuals on the vertical axis to the
quantiles of a standard normal population on the horizontal axis. The approximate linearity
of the points suggests that the data are normally distributed. However, there is some
noise beyond the second theoretical quantile, which deserves attention.
[Figure 3: standardized residuals versus fitted values]
After standardization, the errors show essentially the same pattern as in the first figure.
[Figure 4: residuals versus leverage]
The leverage plot highlights cases that we may want to investigate as possibly having
undue influence on the regression relationship.
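For reference, the four diagnostic panels shown above are the standard output of R's plot method for fitted linear models; assuming OLS.res is the fit from the code above, they can be reproduced with:

par(mfrow = c(2, 2))   # arrange the four diagnostic panels in a 2-by-2 grid
plot(OLS.res)          # residuals vs fitted, Q-Q, scale-location, residuals vs leverage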
2.2 Bootstrap CI
We mentioned this concern in the first part: the bootstrap distribution and the sample may
disagree systematically, in which case bias may occur, and bias in the bootstrap distribution will
lead to bias in the confidence interval. If the bootstrap distribution of an estimator is symmetric,
percentile confidence intervals are often used; such intervals are especially appropriate for
median-unbiased estimators of minimum risk (with respect to an absolute loss function), and
they are an appropriate way to control and check the stability of the results. If, on the other hand,
the bootstrap distribution is non-symmetric, percentile confidence intervals are often inappropriate.
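A minimal sketch of a percentile interval, assuming theta.star is a vector of bootstrap replicates of the estimator (an illustrative name):

alpha <- 0.05
quantile(theta.star, probs = c(alpha / 2, 1 - alpha / 2))  # 95% percentile CI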
The Poisson distribution arises often in real-world statistics. For instance, it has been used by
engineers as a model for counting problems, based on the rationale that the event rate is
approximately constant. In this model we need to estimate the parameter λ of the Poisson distribution:

P(X = x) = λ^x e^{−λ} / x!
n <- length(y)
## Negative mean log-likelihood for the Poisson model
NegLogLike <- function(p) {
  -(mean(y * log(p)) - p - mean(log(factorial(y))))
}
out <- nlm(NegLogLike, mean(y), hessian = TRUE)  # MLE of lambda; call reconstructed, produces the `out` printed below
NB <- 3000                    # we will draw 3000 bootstrap samples and estimate 3000 times
lambdahat_MLEB <- rep(0, NB)  # bootstrap results, initialized as a zero vector
for (i in 1:NB) {             # for loop: repeat the experiment NB times
  y.star <- sample(y, n, replace = TRUE)  # nonparametric resample (loop body reconstructed)
  lambdahat_MLEB[i] <- mean(y.star)       # the Poisson MLE of lambda is the sample mean
}
[Figure 5: bootstrap distribution of the resampled estimates of λ]
As we can see, the figure for the resampled estimates looks quite good. As above, we can
minimize the negative log-likelihood function in order to obtain the Fisher information for the statistic:

g = −(1/n) ∑_{i=1}^{n} log f(x_i | λ) = −(1/n) ∑_{i=1}^{n} l(x_i | λ),

and then the Fisher information follows from

g'' = −(1/n) ∑_{i=1}^{n} l''(x_i | λ) → −E(l''(X | λ)) = I(λ).
> mean(lambdahat_MLEB)
[1] 3.891379
> out$minimum
[1] 2.205347
> out$estimate
[1] 4.013331
> out$gradient
[1] 1.770456e-09
> out$hessian
          [,1]
[1,] 0.2491199
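Since NegLogLike is the mean negative log-likelihood, out$hessian estimates the per-observation Fisher information I(λ̂), so the asymptotic standard error of λ̂ is 1/√(n·I(λ̂)). As a sketch, this can be compared with the bootstrap standard error:

sd(lambdahat_MLEB)           # bootstrap standard error of lambda-hat
1 / sqrt(n * out$hessian)    # asymptotic standard error from the Fisher information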
3 Discussion
An advantage of the bootstrap is its simplicity. It is a straightforward way to derive estimates of
standard errors and confidence intervals for complex estimators of complex parameters of a
distribution, such as percentile points, proportions, odds ratios, and correlation coefficients. Moreover,
it is an appropriate way to control and check the stability of the results. The bootstrap is also
a very broad technique that can be applied in different fields; for instance, the block bootstrap is
widely used as a concrete statistical tool in signal processing. However, although bootstrapping is
(under some conditions) asymptotically consistent, it does not provide general finite-sample
guarantees, and it tends to be overly optimistic. Its apparent simplicity may conceal the
fact that important assumptions are being made when undertaking the bootstrap analysis (e.g.,
independence of samples) which would be stated more formally in other approaches.
The discussion of the bootstrap is not yet finished but ongoing. A complete understanding of
how the proposed resampling schemes compare is still missing, and theoretical
research continues. Applications to time series analysis will also require new approaches, and
examples increasingly interact with other fields of the academic world.