Bootstrapping moving average models

1992, Journal of the Italian Statistical Society

J. Ital. Statist. Soc. (1992) 2, pp. 227-234

Marcella Corduas*
Centro di Specializzazione e Ricerche, Portici (NA)
Università di Napoli Federico II

* Address for correspondence: Marcella Corduas, Centro di Specializzazione e Ricerche, Via Università 96, 80055 Portici (NA), Italy. Research partially supported by CNR and MURST.

Summary

In recent years, the bootstrap method has been extended to time series analysis, where the observations are serially correlated. Contributions have focused on the autoregressive model, producing alternative resampling procedures. In contrast, apart from some empirical applications, very little attention has been paid to the possibility of extending the use of the bootstrap method to pure moving average (MA) or mixed ARMA models. In this paper, we present a new bootstrap procedure which can be applied to assess the distributional properties of the moving average parameter estimates obtained by a least squares approach. We discuss the methodology and the limits of its usage. Finally, the performance of the bootstrap approach is compared with that of the competing alternative given by Monte Carlo simulation.

Keywords: bootstrap, time series, Moving Average models.

1. Introduction

Since the early contributions by Efron (1979, 1982), the bootstrap method has been widely investigated as a non-parametric procedure for estimating the standard error of a statistic defined on a sequence of i.i.d. random variables. Recently, several authors have focussed their attention on the possibility of extending the methodology to time series analysis by considering the case of stationary autoregressive (AR) processes. In that context, two resampling schemes have been proposed in order to take into account the dependence structure of the observations.

The former attempts to reduce the problem to i.i.d. values by resampling the estimated residuals from the fitted model (see Freedman, 1984; Efron & Tibshirani, 1986). In particular, at each iteration a bootstrap replicate of the time series is derived by filtering the bootstrapped residuals through the estimated AR filter. Therefore, the experiment provides the standard error estimate of the statistic of interest given the initial model estimation.

The latter is a model-free approach and constructs bootstrap replicates by selecting, with replacement, blocks of sequential observations (Liu & Singh, 1988; Künsch, 1989; Corduas, 1990). It yields a consistent bootstrap procedure for a parameter of the m-dimensional joint distribution of the observations, and it works for arbitrary stationary processes with short-range dependence.

However, in many time series problems the process generating the observations follows pure moving average (MA) or mixed ARMA structures. Apart from some empirical studies, which merely re-apply the Efron-Tibshirani technique (see, for instance, Chatterjee, 1986), very little attention has been paid to the use of the bootstrap technique in relation to those models.

In this paper we propose a bootstrap procedure which can be used to assess the distributional properties of the MA parameter estimates obtained by a least squares approach. The article is organized as follows. In Section 2 we describe the estimation methodology, its implications and the limits of its usage. In Section 3 we present the resampling procedure and compare its performance with that of the commonly applied competing alternative, Monte Carlo simulation.

2. The estimation of MA models

Consider a mean-adjusted time series $\{Z_t\}$ which follows a moving-average model:

$Z_t = \theta(B) a_t = a_t - \theta_1 a_{t-1} - \dots - \theta_q a_{t-q}$,

where $a_t \sim WN(0, \sigma^2)$ is a Gaussian White Noise process with zero mean and variance $\sigma^2$, $B$ denotes the backshift operator and $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is such that the equation $\theta(B) = 0$ has roots outside the unit circle.
The last condition (invertibility) ensures that $Z_t$ has a pure autoregressive representation:

$\pi(B) Z_t = a_t$,

where $\pi(B) = 1 - \sum_{j=1}^{\infty} \pi_j B^j = \theta(B)^{-1}$ and $\sum_{j=1}^{\infty} |\pi_j| < \infty$.

Hannan & Rissanen (1982) suggested a general algorithm for estimating the parameters of ARMA models and for identifying the degrees p and q of the related lag operators. The basic procedure, which relies on the earlier method developed by Durbin (1960), consists of three stages in which ARMA models of increasing orders are estimated iteratively by fitting a suitable regression model to the estimated innovations. The "optimal" orders p and q are selected by minimizing an appropriate criterion. Here, we shall limit ourselves to illustrating the first two steps of that procedure with reference to an MA(q) model.

The MA representation (and, generally speaking, the ARMA formulation) can be seen as a regression model where the lagged variables $a_{t-j}$, $j = 1, \dots, q$, are unobservable. Durbin (1960) proposed to estimate the innovation $a_t$ by fitting a long autoregression to the data. The convergence of the sequence of $\pi$-weights justifies such an approach. In fact, the information which is lost by the finite autoregression may be negligible if a sufficient number of observations is available and an appropriate order is chosen for the autoregression.

Thus, in the first step of the algorithm the following model is fitted to the time series:

$Z_t = \pi_1 Z_{t-1} + \dots + \pi_L Z_{t-L} + a_t$;

in the second step, the past values of the residuals $\hat{a}_t$ are used in place of the unobservable lagged variables $a_{t-j}$ in the regression model:

$Z_t = -\theta_1 \hat{a}_{t-1} - \dots - \theta_q \hat{a}_{t-q} + a_t$,

which yields the desired parameter estimates.
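The two steps just described can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the original GAUSS routine: the function names `lagged` and `two_step_ma` are hypothetical, both steps are fitted by ordinary least squares, and the default long-AR order $L = \sqrt{n}$ is an assumption of the sketch.

```python
import numpy as np

def lagged(x, lags):
    """Matrix with columns x_{t-1}, ..., x_{t-lags}, for t = lags, ..., len(x)-1."""
    n = len(x)
    return np.column_stack([x[lags - j : n - j] for j in range(1, lags + 1)])

def two_step_ma(z, q, L=None):
    """Two-step least squares estimate of an MA(q) model (illustrative sketch)."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()                       # mean-adjust the series
    n = len(z)
    if L is None:
        L = int(round(np.sqrt(n)))         # assumed default order of the long AR

    # Step 1: fit Z_t = pi_1 Z_{t-1} + ... + pi_L Z_{t-L} + a_t by OLS.
    X = lagged(z, L)
    y = z[L:]
    pi_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    a_hat = y - X @ pi_hat                 # estimated innovations, t = L+1, ..., n

    # Step 2: regress Z_t on -a_hat_{t-1}, ..., -a_hat_{t-q}; the minus sign
    # makes the OLS coefficients equal to theta_1, ..., theta_q directly.
    A = -lagged(a_hat, q)
    Z = y[q:]                              # Z_t for t = L+q+1, ..., n
    theta_hat, *_ = np.linalg.lstsq(A, Z, rcond=None)
    return theta_hat, Z, A

# Usage on a simulated MA(1) series with theta = 0.5 (seed chosen arbitrarily).
rng = np.random.default_rng(0)
a = rng.standard_normal(2001)
z = a[1:] - 0.5 * a[:-1]
theta_hat, Z, A = two_step_ma(z, q=1)
print(theta_hat)
```

The function returns the response vector and the regressor matrix along with the estimate because the second-step regression is an ordinary linear model, so both can be reused without refitting the long autoregression.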
The estimation of the finite AR, which is performed in step 1, can be carried out by means of alternative methods such as:
i) Yule-Walker equations; in particular, the Durbin-Levinson algorithm can be used, as suggested by Hannan & Rissanen (1982) in their first contribution;
ii) Burg's algorithm (1975), which has several practical advantages. In fact, it always produces stationary AR estimates, which are less biased than those obtained with alternative methods (Tjostheim & Paulsen, 1983);
iii) the ordinary least squares method (OLS), which is the simplest option to apply. This is the method that we have chosen in our study.

The estimators $\{\hat\theta_1, \dots, \hat\theta_q\}$ derived by the two-step procedure described above are consistent but not fully efficient. Some extensions which attempt to improve the described algorithm have been proposed in the literature (see, for instance, Hannan & Kavalieris, 1984; Koreisha & Pukkila, 1990). However, the two-step solution is still preferable because of its simple implementation.

3. The bootstrap procedure

Our resampling approach extends to MA models a method which has been successfully applied to AR models (Corduas, 1990). In particular, as we have discussed in the previous section, the estimation of the MA parameters can be reduced to that of the coefficients of a linear regression model:

$Z_t = -\theta_1 \hat{a}_{t-1} - \dots - \theta_q \hat{a}_{t-q} + a_t$, $t = L+q+1, \dots, n$,

where n is the length of the available time series. In the remainder of this section, it will be convenient to denote the vector of the observed values $\{Z_t\}_{t \ge L+q+1}$ by $Z$ and the regressor matrix by $A$. Thus, an estimate of the standard deviations associated with the OLS estimates $\hat\theta = \{\hat\theta_1, \dots, \hat\theta_q\}$ can be obtained by the following resampling scheme. Given the matrix $[Z|A]$, derived in the first step of the estimation algorithm, we generate a bootstrap replicate of that matrix by resampling its rows with replacement.
After centering each column on its mean, we obtain the bootstrap matrix $[Z^*|A^*]$. Note that the resampled rows still satisfy the relation:

$Z^* = A^* \theta + a^*$.

Then, the least squares estimator $\hat\theta^* = (A^{*\prime} A^*)^{-1} A^{*\prime} Z^*$ represents the bootstrap estimator of $\theta$.

The validity of the procedure relies on the assumption that, given the data, the distribution of $\sqrt{n}(\hat\theta^* - \hat\theta)$ converges to that of $\sqrt{n}(\hat\theta - \theta)$. For $Z_t \sim MA(1)$, that consideration can be justified by the following argument. The empirical distribution of the centered residuals $(\hat{a}_t - \bar{\hat{a}})$ converges weakly to the distribution of $a_t$. Thus we can consider the $\hat{a}_{t-1}$ as if they were a realization from the process $a_{t-1}$. Moreover, the rows of the matrix $[Z|A]$ can be seen as realizations of the process $(Z_t, a_{t-1})$. The latter is a stationary, ergodic sequence and, under the assumption of Gaussianity, is weakly dependent with a joint distribution $Q$ completely characterized by $\{\theta, \sigma^2\}$. The empirical distribution $Q_n$, which puts equal mass on the pairs $(z_t, \hat{a}_{t-1})$, converges to $Q$ a.s. by the ergodic theorem for a stationary bivariate process (White, 1984, p. 42).

On the other hand, the bootstrap acts by sampling with replacement from the pairs $(z_t, \hat{a}_{t-1})$, so that the $(Z_t^*, \hat{a}_{t-1}^*)$ are i.i.d. random variables with common distribution $Q_n$. The bootstrap distribution $Q^*$ converges to $Q_n$ a.s. by the usual argument (Glivenko-Cantelli theorem). From those considerations, we can deduce that the distribution of $\hat\theta^*$, being a continuous functional of $Q^*$, converges to that of $\hat\theta$.

In order to investigate the finite sample properties of the proposed bootstrap procedure, we present the results of an empirical experiment which refers to MA(1) models. In particular, the bootstrap simulation consisted of the following steps:
i) a value $\theta \in [-1, 1]$ was chosen and a realization $z_t$ of the corresponding process was generated.
For each replication the first 60 observations were discarded to avoid initialization effects;
ii) the parameter $\theta$ was estimated using the two-step estimation algorithm illustrated in the previous section;
iii) given the matrix $[Z|A]$, 200 bootstrap samples were generated and the empirical distribution of $\hat\theta^*$ was estimated.

The last point represents a peculiarity of our bootstrap procedure. In fact, the results are conditional on the first estimation step, in which the long AR model is fitted to the data, whereas the commonly applied simulation approach (see, for instance, Koreisha & Pukkila, 1990) iterates the generation of realizations from the model and proceeds for each of them through the two-step estimation algorithm.

Moreover, since the results from a bootstrap experiment are data dependent, the whole experiment was repeated for a hundred time series and for different series lengths (n = 60, 100, 200).

In the first step, the order of the finite AR was set as a function of the number of observations ($L = \sqrt{n}$). Alternative order identification criteria (BIC, CAT) were also tried, but the results were not as satisfactory as in the first case because the estimated innovations were often serially correlated. This fact confirms the evidence discussed by Koreisha & Pukkila (1990).

Some samples (either in the simulation or in each bootstrap experiment) were rejected since they yielded non-invertible parameter estimates, that is, estimates of $\theta$ which were outside the [-1, 1] interval. With respect to the bootstrap experiment, for most of the parameter values fewer than 5% of the samples were discarded on average; for $\theta = 0.9$ that percentage increased to 15%.

In Table 1 we present the final results of our study for different values of $\theta$. We expect them to be symmetrical about the zero axis since the asymptotic standard deviation of $\hat\theta$ is $n^{-1/2}$. Therefore, we report only the results for positive values of $\theta$.
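Steps i) to iii) of the bootstrap experiment can be sketched for a single MA(1) series as follows. This is a hedged illustration in Python with NumPy rather than the original implementation: the helper names (`simulate_ma1`, `build_ZA`, `ols`, `bootstrap_theta`), the unit innovation variance and the random seeds are assumptions of the sketch, and the strict check $|\hat\theta^*| < 1$ stands in for the rejection of non-invertible estimates.

```python
import numpy as np

def simulate_ma1(theta, n, burn=60, rng=None):
    """Z_t = a_t - theta * a_{t-1} with Gaussian a_t; the first `burn` values are dropped."""
    rng = np.random.default_rng(rng)
    a = rng.standard_normal(n + burn + 1)
    z = a[1:] - theta * a[:-1]
    return z[burn:]

def build_ZA(z, L=None):
    """Fit the long AR(L) by OLS and return Z and the regressor A = -a_hat_{t-1}."""
    z = np.asarray(z, dtype=float)
    z = z - z.mean()
    n = len(z)
    L = int(round(np.sqrt(n))) if L is None else L
    X = np.column_stack([z[L - j : n - j] for j in range(1, L + 1)])
    y = z[L:]
    pi_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    a_hat = y - X @ pi_hat                  # estimated innovations
    return y[1:], -a_hat[:-1, None]         # Z_t paired with -a_hat_{t-1}

def ols(Z, A):
    """Least squares coefficients of Z on A."""
    return np.linalg.lstsq(A, Z, rcond=None)[0]

def bootstrap_theta(Z, A, n_boot=200, rng=None):
    """Resample the rows of [Z|A] with replacement and re-estimate theta each time."""
    rng = np.random.default_rng(rng)
    M = np.column_stack([Z, A])
    m = M.shape[0]
    draws = []
    for _ in range(n_boot):
        rows = M[rng.integers(0, m, size=m)]    # rows drawn with replacement
        rows = rows - rows.mean(axis=0)         # centre each column on its mean
        theta_star = ols(rows[:, 0], rows[:, 1:])[0]
        if abs(theta_star) < 1:                 # keep invertible estimates only
            draws.append(theta_star)
    return np.array(draws)

z = simulate_ma1(0.5, n=200, rng=0)               # step i)
Z, A = build_ZA(z)
theta_hat = ols(Z, A)[0]                          # step ii)
draws = bootstrap_theta(Z, A, n_boot=200, rng=1)  # step iii)
print(theta_hat, draws.std(ddof=1))               # estimate and its bootstrap standard deviation
```

Note that the long autoregression is fitted once, in `build_ZA`, and every bootstrap replicate reuses the resulting matrix; a Monte Carlo study would instead call `simulate_ma1` and `build_ZA` afresh for each replicate.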
The first two columns refer to the Monte Carlo experiment. In total, given a value of $\theta$, 1000 realizations were generated and for each of them the MA(1) model was estimated by the two-step algorithm. The empirical distribution of $\hat\theta$ derived from the simulation was then used as a benchmark for the performance of the bootstrap procedure. In this respect, the former is considered as the "true distribution" of $\hat\theta$ since no further information is available on the distribution of the estimator of interest.

Tab. 1 - Final results.

                    simulation (a)              bootstrap (b)
theta    n      av(theta^)  sigma(theta^)   av(sigma^*)  std(sigma^*)
0.3      60     .3127       .1510           .1501        .0216
0.3      100    .3115       .1143           .1100        .0116
0.3      200    .3956       .0753           .0756        .0069
0.5      60     .5072       .1502           .1496        .0215
0.5      100    .5088       .1141           .1108        .0118
0.5      200    .5044       .0752           .0761        .0070
0.7      60     .6898       .1416           .1417        .0212
0.7      100    .7035       .1127           .1071        .0143
0.7      200    .7029       .0751           .0762        .0071
0.9      60     .8094       .1199           .1259        .0241
0.9      100    .8457       .0905           .0949        .0168
0.9      200    .8807       .0645           .0671        .0090

(a) determined from 1000 trials
(b) determined from 200 bootstraps on 100 realizations
theta^ denotes $\hat\theta$ and sigma^* denotes $\hat\sigma^*$.
Asymptotic standard deviation: $1/\sqrt{60} = 0.1291$; $1/\sqrt{100} = 0.1$; $1/\sqrt{200} = 0.0707$

In particular, we report the mean value of the estimated coefficient, av($\hat\theta$), which measures the performance of the experiment in terms of approximation, and the standard deviation $\sigma(\hat\theta)$. In addition, the table shows the results from the simulation of the bootstrap experiment. Specifically, av($\hat\sigma^*$) and std($\hat\sigma^*$) are the average and the standard deviation of the bootstrap estimates of the standard deviation of $\hat\theta$ (denoted by $\hat\sigma^*$).

First we see that, as expected, on average the bootstrap estimates of the standard deviation of $\hat\theta$ approach the asymptotic values as the number of observations n increases.
Further, the bootstrap produces results which are consistent on average with those obtained from the Monte Carlo experiment.

It is important to remark that the resampling procedure requires fewer operations than the Monte Carlo simulation. Indeed, the bootstrap does not need to repeat the estimation of the long AR model at each step. Finally, the computing time required to estimate $\hat\sigma^*$ for a particular data set (this implies the generation of 200 bootstrap replications) is very limited: 9.6 sec. on an IBM PS/2 80 for n = 200, using a routine implemented in the GAUSS programming environment. Thus, this characteristic and the desirable statistical properties of the bootstrap (generally speaking, the method is not restricted by a particular distributional assumption) encourage its usage for estimating the standard deviation of the parameter estimates of MA models.

4. Final remarks

In this paper we have proposed a new procedure for bootstrapping MA models which is strictly related to the innovation estimation algorithm and exploits the formulation of the regression models involved. The validity of the proposed procedure has been confirmed by simulation. We found that the bootstrap produces results which are comparable with those obtained by a Monte Carlo study.

REFERENCES

BURG J. (1975), Maximal entropy spectral analysis, Ph.D. dissert., Stanford University, Dept. of Geophysics.
CHATTERJEE S. (1986), Bootstrapping ARMA models: some simulations, IEEE Transactions on Systems, Man & Cybernetics, 16, 294-299.
CORDUAS M. (1990), Approcci alternativi per il ricampionamento nei modelli Autoregressivi, Atti della XXXV Riunione Scientifica SIS, Padova, 2, 61-68.
DURBIN J. (1960), The fitting of time series models, Rev. Int. Stat. Inst., 28, 233-244.
EFRON B. (1979), Bootstrap methods: another look at the jackknife, Annals of Statistics, 7, 1-26.
EFRON B. (1982), The jackknife, the bootstrap and other resampling plans, SIAM CBMS Monograph 38, Philadelphia.
EFRON B., TIBSHIRANI R. (1986), Bootstrap methods for standard errors, confidence intervals and other measures of statistical accuracy, Statistical Science, 1, 54-77.
FREEDMAN D. (1984), On bootstrapping two-stage least squares estimates in stationary linear models, Annals of Statistics, 12, 827-842.
HANNAN E. J., RISSANEN J. (1982), Recursive estimation of mixed autoregressive moving average order, Biometrika, 69, 81-94.
HANNAN E. J., KAVALIERIS L. (1984), A method for autoregressive moving-average estimation, Biometrika, 72, 273-280.
KOREISHA S., PUKKILA T. (1990), A generalized least-squares approach for estimation of autoregressive moving-average models, Journal of Time Series Analysis, 2, 139-151.
KÜNSCH H. R. (1989), The jackknife and the bootstrap for general stationary observations, Annals of Statistics, 17, 1217-1241.
LIU R. Y., SINGH K. (1988), Moving blocks jackknife and bootstrap capture weak dependence, Technical Report, Dept. of Statistics, Rutgers University.
TJOSTHEIM D., PAULSEN J. (1983), Bias of some commonly used time series estimates, Biometrika, 48, 197-199.
WHITE H. (1984), Asymptotic theory for econometricians, Academic Press, Orlando (FL).

Received: July 1991. Revised: February 1992.