Testing for Breaks in Coefficients and Error Variance: Simulations and Applications∗

Jing Zhou† (BlackRock, Inc.) and Pierre Perron‡ (Boston University)

July 28, 2008

Abstract

In a companion paper, Perron and Zhou (2008) provided a comprehensive treatment of the problem of testing jointly for structural change in both the regression coefficients and the variance of the errors in a single equation regression model involving stationary regressors, allowing the break dates for the two components to be different or overlap. The aim of this paper is twofold. First, we present detailed simulation analyses to document various issues related to their procedures: a) the inadequacy of the two step procedures that are commonly applied; b) which particular version of the necessary correction factor exhibits better finite sample properties; c) whether applying a correction that is valid under more general conditions than necessary is detrimental to the size and power of the tests; d) the finite sample size and power of the various tests proposed; e) the performance of the sequential method in determining the number and types of breaks present. Second, we apply their testing procedures to various macroeconomic time series studied by Stock and Watson (2002). Our results reinforce the prevalence of change in mean, persistence and variance of the shocks to these series, and the fact that for most of them an important reduction in variance occurred during the 1980s. In many cases, however, the so-called “great moderation” should instead be viewed as a “great reversion”.

JEL Classification: C22

Keywords: Change-point; Variance shift; Conditional heteroskedasticity; Likelihood ratio tests; the “Great moderation.”

∗ Perron acknowledges financial support from the National Science Foundation under Grant SES-0649350. We are grateful to Adam McCloskey for detailed comments on a previous draft.
† BlackRock Inc., 40 E 52nd Street, New York, NY 10022 (Jing.Zhou@blackrock.com).
‡ Department of Economics, Boston University, 270 Bay State Rd., Boston, MA, 02215 (perron@bu.edu).

1 Introduction

In a companion paper, Perron and Zhou (2008) provided a comprehensive treatment of the problem of testing jointly for structural changes in both the regression coefficients and the variance of the errors in a single equation regression model involving stationary regressors, allowing the break dates for the two components to be different or overlap. Their framework is quite general in that it allows for general mixing-type regressors and the assumptions imposed on the errors are quite mild. The errors’ distribution can be non-normal and conditional heteroskedasticity is permissible. Extensions to the case with serially correlated errors were also treated.

Perron and Zhou (2008) provided the required tools for addressing the following testing problems, among others: a) testing for given numbers of changes in regression coefficients and variance of the errors; b) testing for some unknown number of changes less than some pre-specified maximum; c) testing for changes in variance (regression coefficients), allowing for a given number of changes in regression coefficients (variance); d) a sequential procedure for estimating the number of changes present.

These testing problems are important for practical applications, as witnessed by recent interest in macroeconomics and finance, where documenting structural changes in the variability of shocks to simple autoregressions or Vector Autoregressive Models has been a concern; see, among others, Blanchard and Simon (2001), Herrera and Pesavento (2005), Kim and Nelson (1999), McConnell and Perez-Quiros (2000), Sensier and van Dijk (2004) and Stock and Watson (2002).
Given the lack of proper testing procedures, a commonly used approach is to apply standard sup-Wald type tests (e.g., see Andrews, 1993 and Bai and Perron, 1998) for changes in the mean of the absolute value of the estimated residuals; see e.g., Herrera and Pesavento (2005) and Stock and Watson (2002). This is a rather ad hoc procedure. For the problem of testing for a change in variance only (imposing no change in the regression coefficients), Deng and Perron (2008) have extended the CUSUM of squares test of Brown, Durbin and Evans (1975), allowing very general conditions on the regressors and the errors. This test was suggested by Inclán and Tiao (1994) for detecting change in variance in a normally distributed time series. This test is, however, adequate only if no changes in coefficients are present. As documented by, e.g., Stock and Watson (2002), it is often the case that changes in both coefficients and variance occur, often at different dates. A commonly applied method is to first test for changes in the regression coefficients and, conditioning on the break dates found for them, then test for changes in the variance. This is clearly inappropriate as in the first step the test suffers from severe size distortions (see Section 2). Also, neglecting changes in coefficients while testing for changes in variance induces both size distortions and a loss of power. Hence, a joint approach is needed. To that effect, Perron and Zhou (2008) suggested procedures based on quasi likelihood ratio tests constructed using a likelihood function appropriate for identically and independently distributed normal errors. Corrections to these likelihood ratio tests were then applied so that their limit distributions are free of nuisance parameters in the presence of non-normal distributions and/or conditional heteroskedasticity. Extensions that allow for serial correlation were considered as well. The aim of this paper is twofold.
First, we present detailed simulation analyses to document various issues related to Perron and Zhou’s (2008) procedures: a) the inadequacy of the two step procedures that are commonly applied; b) which particular version of the necessary correction factor leads to tests with better finite sample properties; c) whether applying a correction that is valid under more general conditions than necessary is detrimental to the size and power of the tests; d) the finite sample size and power of the various tests proposed; e) the performance of their sequential method in determining the number and types of breaks present. Second, we apply their testing procedures to various macroeconomic time series studied by Stock and Watson (2002). On one hand, our results reinforce the prevalence of change in mean, persistence and variance of the shocks in simple autoregressions. Most series exhibit an important reduction in variance that occurred during the 1980s. For many series, however, our evidence indicates two breaks in the variance of the shocks, with the feature that variance increases at the first and decreases at the second. Hence, the so-called “great moderation” may be qualified as a phenomenon for which the high variance level of the 1970s to early 1980s ended and variance reverted (roughly) back to its pre-1970s level; in some cases this reversion was exact (e.g., with inflation), incomplete (e.g., with interest rates) or magnified (e.g., with real variables). Hence, for many series, the so-called “great moderation” should rather be characterized as a “great reversion”. We also present a number of interesting results pertaining to change in the level and persistence of the series. This paper is structured as follows. Section 2 provides some motivations which show that commonly used procedures that do not treat the problem of testing for changes in coefficients and variance jointly suffer from important size distortions and power losses. 
Section 3 presents the class of models to be considered, the testing problems to be addressed and the corresponding test statistics with limit distributions free of nuisance parameters. Section 4 presents simulation results. Section 5 presents empirical applications related to various macroeconomic time series. Section 6 provides brief concluding remarks.

2 Motivation

To motivate the importance of considering the problem of testing for changes in the regression coefficients and the variance of the errors jointly, we start with some simple simulation experiments. The data generating process (DGP) is a simple sequence of i.i.d. normal random variables with mean and variance that can each change at a single date. To analyze the effect of ignoring a variance break when testing for a change in the regression coefficients on the size of a test, the DGP is specified by

$$y_t = \mu + e_t, \tag{1}$$

where $e_t \sim \text{i.i.d. } N(0, 1 + \delta_1 I(t > T^v))$ with $I(\cdot)$ being the indicator function. We consider 3 break dates, $T^v \in \{[0.25T], [0.5T], [0.75T]\}$, with $[x]$ being the greatest integer less than or equal to $x$, and variance changes $\delta_1$ varying between 0 and 10 in steps of 0.05. The sample size is set to $T = 100$ and 5000 replications are used. The test we consider is the standard Sup-LR test (see Andrews, 1993) for a one-time change in $\mu$ occurring at some unknown date. The exact size of the test is presented in Figure 1 for a nominal size of 5%. The results show important size distortions unless the break occurs early, at $T^v = [0.25T]$; these distortions increase with $\delta_1$. To assess the effect on power, the DGP is

$$y_t = \mu + \delta_2 I(t > T^c) + e_t \tag{2}$$

with $e_t$ as specified above. In this case, we consider $T^v \in \{[0.5T], [0.75T]\}$, $T^c = [0.3T]$, $T = 100$, $\delta_1 \in \{0, 0.5, 1, 1.5, 2, 2.5, 3\}$ and $\delta_2$ varying between 0 and 2. The results are presented in Figure 2, which shows that power decreases as the magnitude of the ignored break in variance increases.
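As an illustration of the experiment just described, the following is a minimal sketch (not the authors' code) that draws data from DGP (1) or (2) and computes a Sup-LR statistic for a one-time change in mean. The function names, the default break fractions and the trimming value `eps = 0.15` are assumptions made for the example; critical values are not included and would have to be taken from Andrews (1993) or simulated.

```python
import math
import random


def simulate_dgp(T, mu=0.0, delta1=0.0, delta2=0.0,
                 frac_v=0.5, frac_c=0.3, rng=None):
    """Draw y_t = mu + delta2*I(t > Tc) + e_t with
    e_t ~ N(0, 1 + delta1*I(t > Tv)), Tv = [frac_v*T], Tc = [frac_c*T]."""
    rng = rng or random.Random()
    Tv, Tc = math.floor(frac_v * T), math.floor(frac_c * T)
    y = []
    for t in range(1, T + 1):
        var_t = 1.0 + (delta1 if t > Tv else 0.0)
        mean_t = mu + (delta2 if t > Tc else 0.0)
        y.append(mean_t + rng.gauss(0.0, math.sqrt(var_t)))
    return y


def sup_lr_mean(y, eps=0.15):
    """Sup-LR statistic for one change in mean at an unknown date.

    Returns (statistic, estimated break date). The trimming eps = 0.15
    is an assumed default, not a value taken from the text."""
    T = len(y)
    total = sum(y)
    sum_sq = sum(v * v for v in y)
    ssr_null = sum_sq - total ** 2 / T          # SSR with a constant mean
    lo, hi = math.floor(eps * T), T - math.floor(eps * T)
    best_stat, best_k, prefix = 0.0, None, 0.0
    for k in range(1, T):
        prefix += y[k - 1]                      # sum of the first k observations
        if k < lo or k > hi:
            continue
        # SSR when the mean is allowed to differ before and after date k
        ssr_alt = (sum_sq - prefix ** 2 / k
                   - (total - prefix) ** 2 / (T - k))
        stat = T * math.log(ssr_null / ssr_alt)
        if best_k is None or stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k
```

Repeating `sup_lr_mean(simulate_dgp(100, delta1=d))` over many replications and comparing the statistic with a 5% critical value would give rejection frequencies of the kind plotted in Figure 1.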
We now consider the effect of a change in mean on the size and power of tests for a change in variance that do not take the former change into account. We consider two testing procedures. The first is based on the CUSUM of squares test, as originally proposed by Brown, Durbin and Evans (1975) and advocated as a test for change in variance by Inclán and Tiao (1994), who showed that it is related to the likelihood ratio test for a change in variance of a sequence of i.i.d. normal random variables (though the equivalence is not exact in finite samples). With $k$ the number of regressors, it is defined by

$$\text{CUSQ} = \max_{k+1 \le r \le T} \sqrt{T}\, \left| S_T^{(r)} - \frac{r - k}{T - k} \right|,$$

where $S_T^{(r)} = (\sum_{t=k+1}^{r} \tilde{v}_t^2)/(\sum_{t=k+1}^{T} \tilde{v}_t^2)$ with $\tilde{v}_t$ being the recursive residuals from the relevant regression imposing no change. Its limit distribution under the null hypothesis is the supremum (over $[0, 1]$) of a Brownian bridge process, for the DGP considered here. To analyze the size of the test when ignoring the change in coefficients, DGP (2) is used with $\delta_1 = 0$ and we set $T^c \in \{[.25T], [.5T], [.75T]\}$, letting $\delta_2$ vary between 0 and 10. The results are presented in Figure 3 (for a nominal size of 5%), which shows that in all cases the exact size of the test increases to one rapidly as the magnitude of change in mean increases. This is not surprising since the CUSQ test also has power against change in the regression coefficients, as originally argued by Brown, Durbin and Evans (1975). For power analysis, the DGP used is again (2) with $\delta_1$ varying between 0 and 15 and $\delta_2 \in \{0, 1, 1.5, 2, 2.5, 3, 3.5\}$. The results are presented in Figure 4, which shows that an ignored change in mean can increase the power of the CUSQ test. This result is, however, of little help given the large size distortions.
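The CUSQ statistic above can be sketched as follows for the special case used in this section, a regression on a constant only ($k = 1$). This is an illustrative sketch under that assumption; the function names are hypothetical and the recursive residuals are computed with the standard one-step-ahead formula for a sample mean.

```python
import math
import random


def recursive_residuals_mean(y):
    """Recursive residuals for a regression on a constant only (k = 1).

    For t = 2, ..., T the one-step-ahead forecast error y_t - mean(y_1..y_{t-1})
    is scaled by sqrt(1 + 1/(t-1)) so that it has constant variance under the null."""
    res = []
    running_sum = y[0]
    for t in range(2, len(y) + 1):
        mean_prev = running_sum / (t - 1)
        res.append((y[t - 1] - mean_prev) / math.sqrt(1.0 + 1.0 / (t - 1)))
        running_sum += y[t - 1]
    return res


def cusq(y, k=1):
    """CUSQ = max_r sqrt(T) |S_T^(r) - (r - k)/(T - k)| for r = k+1, ..., T."""
    T = len(y)
    v2 = [r * r for r in recursive_residuals_mean(y)]
    total = sum(v2)
    cum, stat = 0.0, 0.0
    for idx, val in enumerate(v2):
        r = k + 1 + idx                 # time index of this recursive residual
        cum += val
        stat = max(stat, abs(cum / total - (r - k) / (T - k)))
    return math.sqrt(T) * stat


if __name__ == "__main__":
    rng = random.Random(0)
    # Variance break at mid-sample, no change in mean (hypothetical values).
    y = [rng.gauss(0.0, 1.0 if t < 50 else 4.0) for t in range(100)]
    print(cusq(y))
```

Feeding this function data from DGP (2) with $\delta_1 = 0$ and a large $\delta_2$ illustrates the point made in the text: the statistic becomes large even though the variance is constant.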
The second procedure we consider is the two step method used by Herrera and Pesavento (2005) and Stock and Watson (2002), among others, which applies a test for a change in the mean of the absolute value of the estimated residuals. Again, DGP (2) is used to assess the size ($\delta_1 = 0$) and power properties of this test. For size, $\delta_2$ varies between 0 and 10 and we set $T^c \in \{[.25T], [.5T], [.75T]\}$; while for power, $\delta_2$ varies between 0 and 3.5 and we consider two sets of break dates, namely, $\{T^c = [.5T], T^v = [.3T]\}$ and $\{T^c = [.75T], T^v = [.3T]\}$. The results are presented in Figures 5 and 6 for a 5% nominal size. They show that unless the ignored change in mean is at the middle of the sample, the test suffers from serious size distortions which increase as the magnitude of change in mean increases. For the case of a break in mean at mid-sample, which suffers from no size distortions, Figure 6 shows that power decreases as the magnitude of change in mean increases. While the setup considered above is quite simple, it shows how inference can be misleading when changes in the coefficients of the conditional mean and changes in the variance of the errors are not analyzed jointly.

3 Model and testing problems

The main framework we consider is the following multiple linear regression with $m$ breaks (or $m + 1$ regimes) in the conditional mean equation:

$$y_t = x_t'\beta + z_t'\delta_j + u_t, \qquad t = T_{j-1}^c + 1, \ldots, T_j^c, \tag{3}$$

for $j = 1, \ldots, m + 1$. In this model, $y_t$ is the observed dependent variable at time $t$; both $x_t$ ($p \times 1$) and $z_t$ ($q \times 1$) are vectors of covariates and $\beta$ and $\delta_j$ ($j = 1, \ldots, m + 1$) are the corresponding vectors of coefficients; $u_t$ is the disturbance at time $t$. The indices $(T_1^c, \ldots, T_m^c)$, or the break points, are explicitly treated as unknown (the convention that $T_0^c = 0$ and $T_{m+1}^c = T$ is used). This is a partial structural change model since the parameter vector $\beta$ is not subject to shifts and is estimated using the entire sample.
When $p = 0$, we obtain a pure structural change model where all coefficients are subject to change. We also allow for $n$ breaks (or $n + 1$ regimes) in the variance of the errors, occurring at unknown dates $(T_1^v, \ldots, T_n^v)$. Accordingly, the error term $u_t$ has zero mean and variance $\sigma_i^2$ for $T_{i-1}^v + 1 \le t \le T_i^v$ ($i = 1, \ldots, n + 1$), where again we use the convention that $T_0^v = 0$ and $T_{n+1}^v = T$. The breaks in variance and coefficients are allowed to occur at different times. Hence, the $m$-vector $(T_1^c, \ldots, T_m^c)$ and the $n$-vector $(T_1^v, \ldots, T_n^v)$ can have distinct elements or can overlap partly or completely. $K$ denotes the total number of break dates and $\max[m, n] \le K \le m + n$. When the breaks completely coincide, $m = n = K$. This model is a special case of the class of models considered by Qu and Perron (2007). The method of estimation considered is quasi-maximum likelihood (QML) assuming serially uncorrelated Gaussian errors. Qu and Perron (2007) proved consistency of the estimates of the break fractions under general conditions on the regressors and the errors. Substantial heterogeneity in the distributions of the regressors is allowed across regimes, though unit root processes are not permitted. The series $z_t u_t$ and $u_t$ are assumed to be short memory processes with bounded fourth moments. Otherwise, the conditions imposed are mild in the sense that they allow for substantial conditional heteroskedasticity and autocorrelation. The testing problems considered include the following: TP-1) $H_0: \{m = n = 0\}$ versus $H_1: \{m = 0, n = n_a\}$; TP-2) $H_0: \{m = m_a, n = 0\}$ versus $H_1: \{m = m_a, n = n_a\}$; TP-3) $H_0: \{m = 0, n = n_a\}$ versus $H_1: \{m = m_a, n = n_a\}$; TP-4) $H_0: \{m = n = 0\}$ versus $H_1: \{m = m_a, n = n_a\}$, where $m_a$ and $n_a$ are some positive integers selected a priori.
We also consider testing problems for which the alternative hypotheses specify some unknown numbers of breaks, up to some maximum: TP-5) $H_0: \{m = n = 0\}$ versus $H_1: \{m = 0, 1 \le n \le N\}$; TP-6) $H_0: \{m = m_a, n = 0\}$ versus $H_1: \{m = m_a, 1 \le n \le N\}$; TP-7) $H_0: \{m = 0, n = n_a\}$ versus $H_1: \{1 \le m \le M, n = n_a\}$; TP-8) $H_0: \{m = n = 0\}$ versus $H_1: \{1 \le m \le M, 1 \le n \le N\}$. Finally, we also address: TP-9) $H_0: \{m = m_a, n = n_a\}$ versus $H_1: \{m = m_a + 1, n = n_a\}$ and TP-10) $H_0: \{m = m_a, n = n_a\}$ versus $H_1: \{m = m_a, n = n_a + 1\}$. These last two are useful for assessing the adequacy of a model with a particular number of breaks by looking at whether including one more break is warranted. Perron and Zhou (2008) also considered a sequential procedure that allows for estimating the number of breaks in both coefficients and variance of the errors.

4 The quasi-likelihood ratio tests

Consider TP-1, in which one specifies no change in the regression coefficients ($m = q = 0$) but tests for a given number $n_a$ of changes in the variance of the errors. Under the null hypothesis, the log-likelihood function is given by

$$\log \tilde{L}_T = -(T/2)(\log 2\pi + 1) - (T/2)\log \tilde{\sigma}^2, \tag{4}$$

where $\tilde{\sigma}^2 = T^{-1}\sum_{t=1}^{T}(y_t - x_t'\tilde{\beta})^2$ and $\tilde{\beta} = (\sum_{t=1}^{T} x_t x_t')^{-1}(\sum_{t=1}^{T} x_t y_t)$. Under the alternative hypothesis, we estimate the model using the quasi-maximum likelihood estimation method. For a given partition $\{T_1^v, \ldots, T_{n_a}^v\}$, the log-likelihood value is given by

$$\log \hat{L}_T(T_1^v, \ldots, T_{n_a}^v) = -(T/2)(\log 2\pi + 1) - (1/2)\sum_{i=1}^{n_a+1}(T_i^v - T_{i-1}^v)\log \hat{\sigma}_i^2, \tag{5}$$

where the quasi-maximum likelihood estimates (QMLEs) of $\beta$ and $\sigma_i^2$ ($i = 1, \ldots, n_a + 1$) jointly solve the system

$$\hat{\sigma}_i^2 = (T_i^v - T_{i-1}^v)^{-1}\sum_{t=T_{i-1}^v+1}^{T_i^v}(y_t - x_t'\hat{\beta})^2$$

and

$$\hat{\beta} = \Big(\sum_{i=1}^{n_a+1}\sum_{t=T_{i-1}^v+1}^{T_i^v} x_t x_t'/\hat{\sigma}_i^2\Big)^{-1}\Big(\sum_{i=1}^{n_a+1}\sum_{t=T_{i-1}^v+1}^{T_i^v} x_t y_t/\hat{\sigma}_i^2\Big)$$

for $i = 1, \ldots, n_a + 1$.
Hence, the Sup-Likelihood ratio test is

$$\sup LR_{1,T}(n_a, \varepsilon | m = n = 0) = \sup_{(\lambda_1^v, \ldots, \lambda_{n_a}^v) \in \Lambda_{v,\varepsilon}} 2[\log \hat{L}_T(T_1^v, \ldots, T_{n_a}^v) - \log \tilde{L}_T] = 2[\log \hat{L}_T(\hat{T}_1^v, \ldots, \hat{T}_{n_a}^v) - \log \tilde{L}_T],$$

where the estimates $(\hat{T}_1^v, \ldots, \hat{T}_{n_a}^v)$ are the QMLEs of the break dates obtained by imposing the restriction that there is no structural change in the coefficients and the break fractions are in the set

$$\Lambda_{v,\varepsilon} = \{(\lambda_1^v, \ldots, \lambda_{n_a}^v) : |\lambda_{i+1}^v - \lambda_i^v| \ge \varepsilon \ (i = 1, \ldots, n_a - 1),\ \lambda_1^v \ge \varepsilon,\ \lambda_{n_a}^v \le 1 - \varepsilon\}.$$

The parameter $\varepsilon$ acts as a truncation which imposes a minimal length for each segment and affects the limiting distribution of the test. The limit distribution of the $\sup LR_{1,T}$ tests (and others discussed below) depends on the nuisance parameter

$$\psi = \lim_{T \to \infty} \mathrm{var}\Big(T^{-1/2}\sum_{t=1}^{T}\big((u_t^2/\sigma^2) - 1\big)\Big).$$

To consistently estimate this quantity, the following class of estimates is considered:

$$\hat{\psi} = \frac{1}{T}\sum_{j=-(T-1)}^{T-1}\omega(j, m)\sum_{t=|j|+1}^{T}\hat{\eta}_t\hat{\eta}_{t-j}, \tag{6}$$

where $\hat{\eta}_t = (\hat{u}_t^2/\hat{\sigma}^2) - 1$ and $\hat{\sigma}^2 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^2$ with $\hat{u}_t$ being the residuals under the null hypothesis. Here $\omega(j, m)$ is a weight function and $m$ is some bandwidth parameter which can be selected using one of the many alternative methods that have been proposed; see, e.g., Andrews (1991). Following Kejriwal and Perron (2006), the residuals under the null hypothesis are used to construct $\hat{\psi}$ but the residuals under the alternative hypothesis are used to select the bandwidth parameter $m$ (see also Kejriwal, 2007). Simulations will reveal that using this hybrid method permits one to control the exact size of the corresponding tests in small samples without significant loss of power. In our simulations and empirical applications, we use the Quadratic Spectral kernel for $\omega(j, m)$ and we adopt the method suggested by Andrews (1991) with an AR(1) approximation to select $m$.
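To make the construction of $\hat{\psi}$ in (6) concrete, here is a minimal sketch of the hybrid estimate, assuming the Quadratic Spectral kernel $\omega(j, m) = k_{QS}(j/m)$ and the AR(1) plug-in bandwidth as we understand it from Andrews (1991), namely $m = 1.3221(\hat{\alpha}(2)T)^{1/5}$ with $\hat{\alpha}(2) = 4\hat{\rho}^2/(1 - \hat{\rho})^4$. The function names, the truncation of $\hat{\rho}$ at $\pm 0.97$, the use of $\hat{\eta}_t\hat{\eta}_{t-|j|}$ for negative lags, and the standardization of the alternative residuals by a single overall variance are assumptions of this sketch, not details taken from the paper.

```python
import math
import random


def qs_kernel(x):
    """Quadratic Spectral kernel weight; equals 1 at x = 0."""
    if abs(x) < 1e-8:
        return 1.0
    z = 6.0 * math.pi * x / 5.0
    return 25.0 / (12.0 * math.pi ** 2 * x ** 2) * (math.sin(z) / z - math.cos(z))


def eta(resid):
    """eta_t = u_t^2 / sigma^2 - 1, with sigma^2 the mean squared residual."""
    s2 = sum(u * u for u in resid) / len(resid)
    return [u * u / s2 - 1.0 for u in resid]


def ar1_bandwidth(series):
    """AR(1) plug-in bandwidth for the QS kernel (assumed Andrews-type rule)."""
    T = len(series)
    num = sum(series[t] * series[t - 1] for t in range(1, T))
    den = sum(series[t - 1] ** 2 for t in range(1, T))
    rho = max(-0.97, min(0.97, num / den))    # assumed safeguard, not from the text
    alpha2 = 4.0 * rho ** 2 / (1.0 - rho) ** 4
    return 1.3221 * (alpha2 * T) ** 0.2


def psi_hat(resid_null, resid_alt):
    """Hybrid estimate: autocovariances from null residuals,
    bandwidth selected from alternative residuals."""
    e = eta(resid_null)
    m = ar1_bandwidth(eta(resid_alt))
    T = len(e)
    total = sum(v * v for v in e)             # lag-0 term, weight 1
    for j in range(1, T):
        w = qs_kernel(j / m) if m > 0.0 else 0.0
        # lags j and -j contribute equally, hence the factor 2
        total += 2.0 * w * sum(e[t] * e[t - j] for t in range(j, T))
    return total / T


if __name__ == "__main__":
    rng = random.Random(0)
    u = [rng.gauss(0.0, 1.0) for _ in range(400)]
    print(psi_hat(u, u))   # for i.i.d. normal errors, psi = 2 in population
```

Passing the same residual vector twice reproduces the “null” or “alternative” variants; passing null residuals first and alternative residuals second gives the hybrid version.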
Remark 1 If the errors are i.i.d., $\psi = \mu_4/\sigma^4 - 1$, which can be consistently estimated using $\hat{\psi} = \hat{\mu}_4/\hat{\sigma}^4 - 1$, where $\hat{\sigma}^2 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^2$ and $\hat{\mu}_4 = T^{-1}\sum_{t=1}^{T}\hat{u}_t^4$ with $\hat{u}_t$ being the residuals under the null or alternative hypothesis. Also, if the errors are normally distributed, $\psi = 2$ so that no adjustment is necessary, a case that was covered by Qu and Perron (2007).

The corrected test statistic for the testing problem TP-1 is then $\sup LR_{1,T}^* = (2/\hat{\psi})\sup LR_{1,T}$. The limit distribution of this test is the same as that in Bai and Perron (1998), applied to $n_a$ changes. Hence, the critical values provided by Bai and Perron (1998, 2003) can be used.

For the testing problem TP-2, there are $m_a$ breaks in coefficients under both the null and alternative hypotheses so that the test pertains to assessing whether there are 0 or $n_a$ breaks in the variance. For a given partition $\{T_1^c, \ldots, T_{m_a}^c\}$, the likelihood function under the null hypothesis is

$$\log \tilde{L}_T(T_1^c, \ldots, T_{m_a}^c) = -(T/2)(\log 2\pi + 1) - (T/2)\log \tilde{\sigma}^2,$$

where $\tilde{\sigma}^2 = T^{-1}\sum_{j=1}^{m_a+1}\sum_{t=T_{j-1}^c+1}^{T_j^c}(y_t - x_t'\tilde{\beta} - z_t'\tilde{\delta}_j)^2$, $\tilde{\beta} = (X'M_{\bar{Z}}X)^{-1}X'M_{\bar{Z}}Y$ and $\tilde{\delta}_j = (Z_j'Z_j)^{-1}Z_j'(Y_j - X_j\tilde{\beta})$, with $M_{\bar{Z}} = I - \bar{Z}(\bar{Z}'\bar{Z})^{-1}\bar{Z}'$, $\bar{Z} = \mathrm{diag}(Z_1, \ldots, Z_{m_a+1})$, $Z_j = (z_{T_{j-1}^c+1}, \ldots, z_{T_j^c})'$, $Y_j = (y_{T_{j-1}^c+1}, \ldots, y_{T_j^c})'$ and $X_j = (x_{T_{j-1}^c+1}, \ldots, x_{T_j^c})'$. For given partitions $\{T_1^c, \ldots, T_{m_a}^c\}$ and $\{T_1^v, \ldots, T_{n_a}^v\}$, the log-likelihood under the alternative is

$$\log \hat{L}_T(T_1^c, \ldots, T_{m_a}^c; T_1^v, \ldots, T_{n_a}^v) = -(T/2)(\log 2\pi + 1) - (1/2)\sum_{i=1}^{n_a+1}(T_i^v - T_{i-1}^v)\log \hat{\sigma}_i^2, \tag{7}$$

where the QMLEs solve the following equations for $i = 1, \ldots, n_a + 1$,

$$\hat{\sigma}_i^2 = (T_i^v - T_{i-1}^v)^{-1}\sum_{t=T_{i-1}^v+1}^{T_i^v}(y_t - x_t'\hat{\beta} - z_t'\hat{\delta}_{t,j})^2,$$

and $\hat{\beta} = (X'M_{\bar{Z}_\sigma}X)^{-1}X'M_{\bar{Z}_\sigma}Y$, where $M_{\bar{Z}_\sigma} = I_T - \bar{Z}_\sigma(\bar{Z}_\sigma'\bar{Z}_\sigma)^{-1}\bar{Z}_\sigma'$ with $\bar{Z}_\sigma = \mathrm{diag}(Z_1^\sigma, \ldots, Z_{m_a+1}^\sigma)$, $Z_j^\sigma = (z_{T_{j-1}^c+1}^\sigma, \ldots, z_{T_j^c}^\sigma)'$ and $z_t^\sigma = (z_t/\hat{\sigma}_i)$ for $T_{i-1}^v < t \le T_i^v$ ($i = 1, \ldots, n_a + 1$). Using the same notation, $\hat{\delta}_{t,j} = (Z_j^{\sigma\prime}Z_j^\sigma)^{-1}Z_j^{\sigma\prime}(Y_j^\sigma - X_j^\sigma\hat{\beta})$, where $Y_j^\sigma = (y_{T_{j-1}^c+1}^\sigma, \ldots, y_{T_j^c}^\sigma)'$ and $X_j^\sigma = (x_{T_{j-1}^c+1}^\sigma, \ldots, x_{T_j^c}^\sigma)'$ with $x_t^\sigma = (x_t/\hat{\sigma}_i)$ and $y_t^\sigma = (y_t/\hat{\sigma}_i)$ for $T_{i-1}^v < t \le T_i^v$. Hence, the quasi Sup-likelihood ratio test is

$$\sup LR_{2,T}(m_a, n_a, \varepsilon | n = 0, m_a) = 2\Big[\sup_{(\lambda_1^c, \ldots, \lambda_{m_a}^c; \lambda_1^v, \ldots, \lambda_{n_a}^v) \in \Lambda_\varepsilon}\log \hat{L}_T(T_1^c, \ldots, T_{m_a}^c; T_1^v, \ldots, T_{n_a}^v) - \sup_{(\lambda_1^c, \ldots, \lambda_{m_a}^c) \in \Lambda_{c,\varepsilon}}\log \tilde{L}_T(T_1^c, \ldots, T_{m_a}^c)\Big]$$

$$= 2[\log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_{m_a}^c; \tilde{T}_1^v, \ldots, \tilde{T}_{n_a}^v) - \log \tilde{L}_T(\hat{T}_1^c, \ldots, \hat{T}_{m_a}^c)],$$

where

$$\Lambda_{c,\varepsilon} = \{(\lambda_1^c, \ldots, \lambda_{m_a}^c) : |\lambda_{j+1}^c - \lambda_j^c| \ge \varepsilon \ (j = 1, \ldots, m_a - 1),\ \lambda_1^c \ge \varepsilon,\ \lambda_{m_a}^c \le 1 - \varepsilon\}$$

and

$$\Lambda_\varepsilon = \{(\lambda_1^c, \ldots, \lambda_{m_a}^c, \lambda_1^v, \ldots, \lambda_{n_a}^v) : \text{for } (\lambda_1, \ldots, \lambda_K) = (\lambda_1^c, \ldots, \lambda_{m_a}^c) \cup (\lambda_1^v, \ldots, \lambda_{n_a}^v),\ |\lambda_{j+1} - \lambda_j| \ge \varepsilon \ (j = 1, \ldots, K - 1),\ \lambda_1 \ge \varepsilon,\ \lambda_K \le 1 - \varepsilon\}. \tag{8}$$

Note that the estimates of the break dates in coefficients and variance are denoted by a “∼” when they are obtained jointly and by a “ˆ” when obtained separately. The corrected version of the $\sup LR_{2,T}$ test is given by $\sup LR_{2,T}^* = (2/\hat{\psi})\sup LR_{2,T}$.

For the testing problem TP-3, the null hypothesis specifies $n_a$ breaks in variance and zero break in coefficients so that, for a given partition $\{T_1^v, \ldots, T_{n_a}^v\}$, the log-likelihood is

$$\log \tilde{L}_T(T_1^v, \ldots, T_{n_a}^v) = -(T/2)(\log 2\pi + 1) - (1/2)\sum_{i=1}^{n_a+1}(T_i^v - T_{i-1}^v)\log \tilde{\sigma}_i^2,$$

where $\tilde{\sigma}_i^2 = (T_i^v - T_{i-1}^v)^{-1}\sum_{t=T_{i-1}^v+1}^{T_i^v}(y_t - x_t'\tilde{\beta} - z_t'\tilde{\delta})^2$ for $i = 1, \ldots, n_a + 1$, $(\tilde{\beta}', \tilde{\delta}')' = (W^{\sigma\prime}W^\sigma)^{-1}W^{\sigma\prime}Y^\sigma$ and $W^\sigma = (w_1^\sigma, \ldots, w_T^\sigma)'$ with $w_t^\sigma = (x_t^{\sigma\prime}, z_t^{\sigma\prime})'$. Under the alternative hypothesis, there are $m_a$ breaks in the regression coefficients and $n_a$ breaks in variance so the likelihood is given by (7).
Hence, the Sup-Likelihood ratio test is

$$\sup LR_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a) = 2\Big[\sup_{(\lambda_1^c, \ldots, \lambda_{m_a}^c; \lambda_1^v, \ldots, \lambda_{n_a}^v) \in \Lambda_\varepsilon}\log \hat{L}_T(T_1^c, \ldots, T_{m_a}^c; T_1^v, \ldots, T_{n_a}^v) - \sup_{(\lambda_1^v, \ldots, \lambda_{n_a}^v) \in \Lambda_{v,\varepsilon}}\log \tilde{L}_T(T_1^v, \ldots, T_{n_a}^v)\Big]$$

$$= 2[\log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_{m_a}^c; \tilde{T}_1^v, \ldots, \tilde{T}_{n_a}^v) - \log \tilde{L}_T(\hat{T}_1^v, \ldots, \hat{T}_{n_a}^v)].$$

The limit distributions of the tests $\sup LR_{2,T}^*$ and $\sup LR_{3,T}$ depend on the true unknown value of the relevant break fractions corresponding to the break dates allowed under both the null and alternative hypotheses. These distributions can be bounded by limit random variables which do not depend on such unknown values and are the same as in Bai and Perron (1998). Hence, a conservative testing procedure is possible. In this paper, we shall assess the extent to which the tests are conservative.

For the testing problem TP-4, the null hypothesis specifies no breaks and the corresponding log-likelihood is given by (4). The alternative hypothesis specifies $m_a$ breaks in coefficients and $n_a$ breaks in the variance of the errors and its corresponding log likelihood is given by (7). Hence, the Sup-Likelihood ratio test is

$$\sup LR_{4,T}(m_a, n_a, \varepsilon | n = m = 0) = 2\Big[\sup_{(\lambda_1^c, \ldots, \lambda_{m_a}^c; \lambda_1^v, \ldots, \lambda_{n_a}^v) \in \Lambda_\varepsilon}\log \hat{L}_T(T_1^c, \ldots, T_{m_a}^c; T_1^v, \ldots, T_{n_a}^v) - \log \tilde{L}_T\Big] = 2[\log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_{m_a}^c; \tilde{T}_1^v, \ldots, \tilde{T}_{n_a}^v) - \log \tilde{L}_T]. \tag{9}$$

The transformed statistic, with a limit distribution free of nuisance parameters, is given by

$$\sup LR_{4,T}^* = \sup LR_{4,T} - ((\hat{\psi} - 2)/\hat{\psi})LR_v, \tag{10}$$

where $LR_v$ is the likelihood ratio test for no break in variance versus $n_a$ breaks evaluated at the estimates $\{\tilde{T}_1^v, \ldots, \tilde{T}_{n_a}^v\}$ obtained from maximizing the log-likelihood function (7) that jointly allows for $m_a$ breaks in coefficients. That is, $LR_v = 2[\log \hat{L}_T(\tilde{T}_1^v, \ldots, \tilde{T}_{n_a}^v) - \log \tilde{L}_T]$, where $\log \hat{L}_T(\cdot)$ and $\log \tilde{L}_T$ are defined by (5) and (4), respectively. The critical values of the limit distribution of $\sup LR_{4,T}^*$ are provided by Perron and Zhou (2008).

4.1 Extensions to serially correlated errors

We now consider the case where the errors $u_t$ are allowed to be serially correlated. For testing problems TP-1 and TP-2, $\sup LR_{1,T}^*$ and $\sup LR_{2,T}^*$ are asymptotically invariant to non-normal errors, serial correlation and conditional heteroskedasticity. For the testing problem TP-3, Perron and Zhou (2008) suggested the use of the following Wald type statistic: $\sup_{(\lambda_1^c, \ldots, \lambda_{m_a}^c) \in \Lambda_\varepsilon} F_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a)$, where

$$F_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a) = \frac{T - (m_a + 1)q - p}{m_a q}\,\hat{\delta}'R'(R\hat{V}(\hat{\delta})R')^{-1}R\hat{\delta}, \tag{11}$$

$\hat{\delta} = (\hat{\delta}_1', \ldots, \hat{\delta}_{m_a+1}')'$ is the QMLE of the coefficients that are subject to change under a given partition of the sample, $R$ is the conventional matrix such that $(R\delta)' = (\delta_1' - \delta_2', \ldots, \delta_{m_a}' - \delta_{m_a+1}')$ and $\hat{V}(\hat{\delta})$ is an estimate of the variance covariance matrix of $\hat{\delta}$ that is robust to serial correlation and heteroskedasticity, i.e., a consistent estimate of

$$V(\hat{\delta}) = \operatorname{plim}_{T \to \infty} T(\bar{Z}_\sigma^{*\prime}\bar{Z}_\sigma^*)^{-1}\Omega_{\bar{Z}_\sigma^*}(\bar{Z}_\sigma^{*\prime}\bar{Z}_\sigma^*)^{-1},$$

where $\bar{Z}_\sigma^* = M_{X_\sigma}\bar{Z}_\sigma$, $\Omega_{\bar{Z}_\sigma^*} = E(\bar{Z}_\sigma^{*\prime}U_b^*U_b^{*\prime}\bar{Z}_\sigma^*)$, $U_b^* = M_{X_\sigma}U_\sigma$, $M_{X_\sigma} = I_T - X_\sigma(X_\sigma'X_\sigma)^{-1}X_\sigma'$, $X_\sigma = (x_1^\sigma, \ldots, x_T^\sigma)'$, $\bar{Z}_\sigma = \mathrm{diag}(Z_1^\sigma, \ldots, Z_{m_a+1}^\sigma)$, $Z_j^\sigma = (z_{T_{j-1}^c+1}^\sigma, \ldots, z_{T_j^c}^\sigma)'$, $U_\sigma = (u_1^\sigma, \ldots, u_T^\sigma)'$, $x_t^\sigma = (x_t/\sigma_i)$, $z_t^\sigma = (z_t/\sigma_i)$ and $u_t^\sigma = (u_t/\sigma_i)$ for $T_{i-1}^{v0} < t \le T_i^{v0}$ ($i = 1, \ldots, n_a + 1$). A consistent estimate of $V(\hat{\delta})$ can be obtained using kernel based methods, as in Andrews (1991), and the limiting distribution of $\sup F_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a)$ is the same as that of $\sup LR_{3,T}$ with martingale difference errors. In practice, the computation of this test statistic can be very involved, especially if a data dependent method is used to construct the robust asymptotic covariance matrix $\hat{V}(\hat{\delta})$. Following Bai and Perron (1998), we suggest first obtaining the break points that correspond to the global maximization of the log-likelihood function (7), then plugging these estimates into (11) to construct the test.
This will not affect the consistency of the test since the break fractions are consistently estimated.

For the testing problem TP-4, things are more complex and Perron and Zhou (2008) proposed a quasi-Wald testing procedure. Note first that the information matrix is block diagonal with respect to $\delta$ and $\sigma^2$, hence the test will involve one component for changes in $\delta$ and one component for changes in $\sigma^2$. The first component is the same as discussed above, namely, $\sup F_{3,T}$ as defined by (11) except with $z_t$ replacing $z_t^\sigma$ since the null hypothesis specifies no break in variance. For the second component, Perron and Zhou (2008) suggested summing the individual Wald tests for each successive pairs of regimes, leading to the component

$$\sup F_T^\sigma = \hat{\psi}\sum_{i=1}^{n_a-1}(\hat{\sigma}_{i+1}^2 - \hat{\sigma}_i^2)^2\big(\hat{\sigma}_{i+1}^4/(\tilde{\lambda}_{i+1}^v - \tilde{\lambda}_i^v) - \hat{\sigma}_i^4/(\tilde{\lambda}_i^v - \tilde{\lambda}_{i-1}^v)\big)^{-1},$$

where $\hat{\sigma}_i^2 = (\tilde{T}_i^v - \tilde{T}_{i-1}^v)^{-1}\sum_{t=\tilde{T}_{i-1}^v+1}^{\tilde{T}_i^v}\hat{u}_t^2$ and the estimates are constructed by maximizing the log-likelihood function (7) subject to the restrictions imposed by the set $\Lambda_\varepsilon$. The resulting suggested test statistic is then $\sup F_{4,T}(m_a, n_a, \varepsilon | n = m = 0) = \sup F_{3,T} + \sup F_T^\sigma$, whose limit distribution is identical to that of the $\sup LR_{4,T}^*$ in the case of martingale difference errors.

4.2 Double maximum tests

Following Bai and Perron (1998), Perron and Zhou (2008) also proposed the so-called double maximum tests which are arguably the most useful tests to apply when trying to determine if structural changes are present. Only the $UD\max$ version is here considered as it is simpler to construct than the $WD\max$, both having similar properties. They are given by the following:

For TP-5: $UD\max LR_{1,T}^* = \max_{1 \le n_a \le N}\sup LR_{1,T}^*(n_a, \varepsilon | m = n = 0)$,

For TP-6: $UD\max LR_{2,T}^* = \max_{1 \le n_a \le N}\sup LR_{2,T}^*(m_a, n_a, \varepsilon | n = 0, m_a)$,

For TP-7: $UD\max LR_{3,T} = \max_{1 \le m_a \le M}\sup LR_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a)$,

For TP-8: $UD\max LR_{4,T}^* = \max_{1 \le n_a \le N}\max_{1 \le m_a \le M}\sup LR_{4,T}^*(m_a, n_a, \varepsilon | n = m = 0)$.

For TP-5 through TP-7, the critical values of the limit distributions are available from Bai and Perron (1998, 2003) for $N$ or $M$ equal to 5 (acting as a bound for TP-6 and TP-7) and the limit distribution for the $UD\max LR_{4,T}^*$ test is given in Perron and Zhou (2008). Note that for the testing problems TP-5 and TP-6, the results are valid whether the errors are martingale differences or serially correlated. This is not the case for TP-7 and TP-8 for which Perron and Zhou (2008) proposed sup-Wald type tests given by

For TP-7: $UD\max F_{3,T} = \max_{1 \le m_a \le M}\sup F_{3,T}(m_a, n_a, \varepsilon | m = 0, n_a)$,

For TP-8: $UD\max F_{4,T} = \max_{1 \le n_a \le N}\max_{1 \le m_a \le M}\sup F_{4,T}(m_a, n_a, \varepsilon | n = m = 0)$.

4.3 Testing for an additional break

We now consider the testing problems TP-9 and TP-10 which look at whether including an additional break is warranted. Let $(\tilde{T}_1^c, \ldots, \tilde{T}_m^c; \tilde{T}_1^v, \ldots, \tilde{T}_n^v)$ denote the estimates of the break dates in the regression coefficients and the variance of the errors obtained jointly by maximizing the quasi-likelihood function assuming $m$ breaks in the coefficients and $n$ breaks in the variance. For the testing problem TP-9, the issue is whether an additional break in the regression coefficients is present. The test is

$$\sup Seq_T(m + 1, n | m, n) = 2\Big[\max_{1 \le j \le m+1}\sup_{\tau \in \Lambda_{j,\varepsilon}^c}\log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_{j-1}^c, \tau, \tilde{T}_j^c, \ldots, \tilde{T}_m^c; \tilde{T}_1^v, \ldots, \tilde{T}_n^v) - \log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_m^c; \tilde{T}_1^v, \ldots, \tilde{T}_n^v)\Big],$$

where $\Lambda_{j,\varepsilon}^c = \{\tau : \tilde{T}_{j-1}^c + (\tilde{T}_j^c - \tilde{T}_{j-1}^c)\varepsilon \le \tau \le \tilde{T}_j^c - (\tilde{T}_j^c - \tilde{T}_{j-1}^c)\varepsilon\}$. This amounts to performing $m + 1$ tests for a single break in the regression coefficients within each of the $m + 1$ regimes, defined by the partition $\{\tilde{T}_1^c, \ldots, \tilde{T}_m^c\}$.
Note that the different scenarios that arise when allowing breaks in coefficients and in the variance to occur at different dates imply two possible cases since $(\tilde{T}_1^c, \ldots, \tilde{T}_m^c)$ and $(\tilde{T}_1^v, \ldots, \tilde{T}_n^v)$ can partly or completely overlap or be altogether different: 1) if the $n$ break dates in variance are a subset of the $m$ break dates in coefficients, there is no variance break between $\tilde{T}_{j-1}^c$ and $\tilde{T}_j^c$; 2) otherwise, there is one or more variance breaks between $\tilde{T}_{j-1}^c$ and $\tilde{T}_j^c$. When serial correlation is present in the errors, the principle is the same except that the statistic is based on the robust Wald test statistic $\sup F_{3,T}$, as defined by (11), applied to a one break test for each segment.

For the testing problem TP-10, similar considerations apply. Here the issue is whether an additional break in the variance is present. The test statistic is

$$\sup Seq_T(m, n + 1 | m, n) = (2/\hat{\psi})\max_{1 \le j \le n+1}\sup_{\tau \in \Lambda_{j,\varepsilon}^v} 2\big[\log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_m^c; \tilde{T}_1^v, \ldots, \tilde{T}_{j-1}^v, \tau, \tilde{T}_j^v, \ldots, \tilde{T}_n^v) - \log \hat{L}_T(\tilde{T}_1^c, \ldots, \tilde{T}_m^c; \tilde{T}_1^v, \ldots, \tilde{T}_n^v)\big],$$

where $\Lambda_{j,\varepsilon}^v = \{\tau : \tilde{T}_{j-1}^v + (\tilde{T}_j^v - \tilde{T}_{j-1}^v)\varepsilon \le \tau \le \tilde{T}_j^v - (\tilde{T}_j^v - \tilde{T}_{j-1}^v)\varepsilon\}$. The correction factor $(2/\hat{\psi})$ is needed to ensure that the limit distribution of the test is free of nuisance parameters when the errors are allowed to be non-normal, serially correlated and/or conditionally heteroskedastic.

4.4 Sequential procedure

Perron and Zhou (2008) discussed a specific-to-general sequential procedure to estimate the number and types of breaks in coefficients and variance. The starting point is the modification of the $\sup LR_{4,T}^*$ statistic so that it can be applied in a sequential manner to address the testing problem $H_0: \{m = \ell, n = \ell\}$ versus $H_1: \{m = \ell + 1, n = \ell + 1\}$.
The procedure is to test the null hypothesis of $\ell$ breaks versus the alternative hypothesis of $\ell + 1$ breaks by performing a one break test for each of the $\ell + 1$ segments induced by the partition $\{\hat{T}_1, \ldots, \hat{T}_\ell\}$, the estimates of the break dates obtained by maximizing the Gaussian log-likelihood function (7) with $T_j^c = T_j^v = T_j$ for $j = 1, \ldots, \ell$. The test statistic is then the maximal value of the tests over all $\ell + 1$ segments, denoted $\sup Seq_T(\ell + 1 | \ell)$. Upon rejection, a model with $\ell + 1$ breaks is considered with the additional break inserted within the segment associated with the maximal value of the tests at the position that maximizes the log-likelihood function. This procedure is iterated until a non-rejection occurs. Let the number of breaks thus selected be denoted by $\bar{K}$.

The next step is to decide whether a break in coefficients, variance or both has occurred at each of the selected break dates. Standard hypothesis testing for the equality of the parameters across adjacent segments is applied. Consider first the case of testing whether the coefficients are equal across the two adjacent segments $(\hat{T}_{k-1} + 1, \hat{T}_k)$, corresponding to regime $k$, and $(\hat{T}_k + 1, \hat{T}_{k+1})$, corresponding to regime $k + 1$ ($k = 1, \ldots, \bar{K}$). Denote the true value of the coefficients in regimes $k$ and $k + 1$ by $\beta_k$ and $\beta_{k+1}$, respectively. The null hypothesis is then $H_0: \beta_k = \beta_{k+1}$ and the alternative hypothesis is $H_1: \beta_k \ne \beta_{k+1}$. Note that since there is a break in the regression coefficients and/or the variance of the errors, under the null hypothesis there must be a change in the variance of the errors. Hence, the test to be applied is a standard Chow-type test allowing for a change in variance across regimes (see Goldfeld and Quandt, 1978). Consider now the testing problem $H_0: \sigma_k^2 = \sigma_{k+1}^2$ versus $H_1: \sigma_k^2 \ne \sigma_{k+1}^2$, where $\sigma_k^2$ and $\sigma_{k+1}^2$ are the true variances of the errors in regimes $k$ and $k + 1$, respectively.
The Wald test, corrected for potential non-normality and conditional heteroskedasticity, is given by

$$W_k = \frac{(\hat{T}_k - \hat{T}_{k-1})(\hat{T}_{k+1} - \hat{T}_k)}{(\hat{T}_{k+1} - \hat{T}_{k-1})(\hat{\mu}_4 - \hat{\sigma}^4)} \left(\hat{\sigma}_{k+1}^2 - \hat{\sigma}_k^2\right)^2,$$

where $\hat{\sigma}_k^2$ and $\hat{\sigma}_{k+1}^2$ are the maximum likelihood estimates of $\sigma_k^2$ and $\sigma_{k+1}^2$ (constructed allowing the regression coefficients to be different in regimes k and k + 1) and $\hat{\mu}_4$ is a consistent estimate of $E(u_t^4)$, e.g., $\hat{\mu}_4 = (\hat{T}_{k+1} - \hat{T}_{k-1})^{-1} \sum_{t=\hat{T}_{k-1}+1}^{\hat{T}_{k+1}} \hat{u}_t^4$, constructed under the alternative hypothesis to maximize power.

5 Monte Carlo experiments

This section presents the results of simulation experiments to address the following issues: 1) which particular version of the correction factor $\hat{\psi}$ leads to tests with better finite sample properties; 2) whether applying a correction that is valid under more general conditions than necessary is detrimental to the size and power of the test; 3) the finite sample size and power of the various tests proposed; 4) the performance of the sequential method in determining the number and types of breaks present. Throughout, we use 1,000 replications.

5.1 The choice of ψ̂

To address what specific version of the correction factor is best to use, we consider the size and power of the $\sup LR^*_{4,T}$ test under the following simple DGP with ARCH(1) errors:

$$y_t = \mu_1 + \mu_2 \mathbf{1}(t > [.25T]) + e_t,$$

where $e_t = u_t \sqrt{h_t}$, $u_t \sim$ i.i.d. $N(0, 1)$ and $h_t = \delta_1 + \delta_2 \mathbf{1}(t > [.75T]) + \gamma e_{t-1}^2$ with $h_0 = \delta_1 / (1 - \gamma)$ and $\delta_1 = 1$. The sample size we use is T = 100 and we set the truncation to ε = 0.20. Under the null hypothesis of no change $\mu_2 = \delta_2 = 0$, while under the alternative hypothesis one break in mean and one break in variance are allowed.
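As an illustration only (this is not the code used to produce the reported tables), one draw from the DGP just described could be generated along the following lines. The function name `simulate_arch_dgp`, its argument names and the choice to start the recursion from $e_0 = u_0 \sqrt{h_0}$ are assumptions made for this sketch; the text only specifies $h_0$.

```python
import numpy as np

def simulate_arch_dgp(T=100, mu1=0.0, mu2=0.0, delta1=1.0, delta2=0.0,
                      gamma=0.3, seed=None):
    """One sample from y_t = mu1 + mu2*1(t > [.25T]) + e_t with ARCH(1) errors.

    e_t = u_t * sqrt(h_t), h_t = delta1 + delta2*1(t > [.75T]) + gamma*e_{t-1}^2.
    Assumption of this sketch: e_0 = u_0 * sqrt(h_0) with h_0 = delta1/(1-gamma).
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T + 1)            # u_0, u_1, ..., u_T
    t = np.arange(1, T + 1)                   # time index 1, ..., T
    mean_shift = mu2 * (t > T // 4)           # break in mean at [.25T]
    var_shift = delta2 * (t > (3 * T) // 4)   # break in variance at [.75T]

    h0 = delta1 / (1.0 - gamma)
    e_prev = u[0] * np.sqrt(h0)
    e = np.empty(T)
    for i in range(T):
        h_t = delta1 + var_shift[i] + gamma * e_prev ** 2
        e[i] = u[i + 1] * np.sqrt(h_t)
        e_prev = e[i]
    return mu1 + mean_shift + e

# Null hypothesis: mu2 = delta2 = 0; alternative: one break in mean and variance.
y_null = simulate_arch_dgp(T=100, gamma=0.3, seed=0)
y_alt = simulate_arch_dgp(T=100, mu2=1.0, delta2=1.0, gamma=0.3, seed=0)
```

Repeating such draws 1,000 times and recording how often a given test rejects would give size (under `y_null`-type designs) and power (under `y_alt`-type designs) estimates of the kind reported in the tables.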
We set $\mu_1 = 0$ under both the null and alternative hypotheses and consider three versions for the estimate $\hat{\psi}$ as defined by (6): 1) using the residuals under the alternative hypothesis to construct the bandwidth m and to estimate the relevant autocovariances of $\eta_t$ (labeled “alternative”), 2) using the residuals under the null hypothesis instead (labeled “null”) and, as suggested by Kejriwal and Perron (2006), 3) using a hybrid method that constructs the bandwidth m using the residuals under the alternative hypothesis but uses the residuals under the null hypothesis to estimate the autocovariances of $\eta_t$ (labeled “hybrid”).¹

The results for the exact size of the test (using a 5% nominal size test) are presented in Table 1. They show the method “alternative” to induce substantial size distortions that increase in γ, i.e., they increase in the autocorrelation of the squared errors. The method “null”, on the other hand, induces conservative size distortions. Finally, the hybrid method enables an exact size close to the nominal level for all cases considered. The results for power are presented in Table 2 for a 5% nominal size. We only consider the methods “null” and “hybrid” given the high size distortions of the method “alternative”. They show that substantial power gains can be achieved by using the “hybrid” method as opposed to the “null” method, especially if the ARCH effect is pronounced. Hence, we recommend using the “hybrid” method and all results given below will be based on its use.

5.2 Should we always correct?

We now address the issue of whether it is costly in terms of power to use a correction that is valid under more general conditions than necessary. To this effect, we first consider the power of the $\sup LR^*_{4,T}$ test under the following DGP with normal errors:

$$y_t = \mu_1 + \mu_2 \mathbf{1}(t > T_1^c) + e_t, \qquad e_t \sim \text{i.i.d. } N(0, 1 + \delta \mathbf{1}(t > T_1^v)),$$

where we set $\mu_1 = 0$ and $\mu_2 = \delta$.
We consider three scenarios for the timing of the breaks: a common break in mean and variance at $T_1^c = T_1^v = [.5T]$ and disjoint breaks at $\{T_1^c = [.3T], T_1^v = [.6T]\}$ and $\{T_1^c = [.6T], T_1^v = [.3T]\}$. We use two sample sizes, T = 100, 200, and the power, for 5% nominal size tests, is evaluated at values of δ ranging from 0.25 to 1.5. Three versions of the $\sup LR^*_{4,T}$ tests are evaluated: 1) with a full correction based on $\hat{\psi}$ as defined by (6) using the hybrid method, which is valid for errors that can be conditionally heteroskedastic and serially correlated (labeled “full”); 2) a correction that is valid for i.i.d. though not necessarily normal errors given by $\hat{\psi} = \hat{\mu}_4 / \hat{\sigma}^4 - 1$, where $\hat{\sigma}^2 = T^{-1} \sum_{t=1}^{T} \hat{u}_t^2$ and $\hat{\mu}_4 = T^{-1} \sum_{t=1}^{T} \hat{u}_t^4$ with $\hat{u}_t$ being the residuals under the null hypothesis (labeled “i.i.d.”); 3) no correction, i.e., using $\hat{\psi} = 2$, which is the appropriate value with normal errors (labeled “NC”). The results are presented in Table 3. They show that the power and exact size are basically the same using any of the methods. Hence, there is no cost in using a full correction.

¹ Using a hybrid method in which m is selected using the residuals under the null hypothesis leads to tests with very low power that can decrease as the magnitude of change increases.

Table 4 presents related results for the statistic $\sup LR^*_{1,T}$ when testing for a single break in variance assuming no break in regression coefficients. The DGP is $y_t = e_t$ with $e_t \sim$ i.i.d. $N(0, 1 + \delta \mathbf{1}(t > [.5T]))$ and δ varies between 0 and 1.5. The full correction yields power and exact size similar to a correction that correctly assumes i.i.d. errors, though here imposing normality can lead to tests with somewhat higher power. Overall, using the full correction entails little power loss or size distortion and, hence, we shall continue to use it in all results below.
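To make the “i.i.d.” correction concrete, the following minimal sketch computes $\hat{\psi} = \hat{\mu}_4 / \hat{\sigma}^4 - 1$ from a vector of residuals. The function name `psi_iid` is hypothetical and the residuals are simply taken as given; the “full” correction based on (6) is not reproduced here.

```python
import numpy as np

def psi_iid(residuals):
    """Correction factor valid for i.i.d. (not necessarily normal) errors.

    Computes mu4_hat / sigma_hat^4 - 1 with sigma_hat^2 = mean(u^2)
    and mu4_hat = mean(u^4), using the residuals passed in.
    """
    u = np.asarray(residuals, dtype=float)
    sigma2 = np.mean(u ** 2)
    mu4 = np.mean(u ** 4)
    return mu4 / sigma2 ** 2 - 1.0

# With normal errors E(u^4) = 3*sigma^4, so psi is close to 2 and the
# factor 2/psi is close to 1, i.e. the "NC" case of no correction.
rng = np.random.default_rng(0)
print(psi_iid(rng.standard_normal(200_000)))
```

With fat-tailed errors the sample fourth moment is larger, $\hat{\psi}$ exceeds 2, and the factor $2/\hat{\psi}$ scales the statistic down accordingly.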
There may be cases where correctly imposing normality can lead to tests with slightly higher power but this can be achieved only if one has the correct prior knowledge of the true distribution of the errors, a case that is unlikely to occur in practice.

5.3 Testing for variance breaks only

We now consider the case of testing only for variance breaks, assuming no change in regression coefficients. To this effect, we shall investigate the properties of the following tests: the $\sup LR^*_{1,T}(n_a, \varepsilon \mid m = n = 0)$, abbreviated $\sup LR^*_{1,T}(n_a, \varepsilon)$,² the $UD\max LR^*_{1,T}$ for testing against an unknown number of breaks up to N = 5, and a corrected version of the CUSQ test given by

$$CUSQ^* = \frac{\sup_{\lambda \in [0,1]} \left| T^{-1/2} \left[ \sum_{t=1}^{[T\lambda]} \tilde{v}_t^2 - \frac{[T\lambda]}{T} \sum_{t=1}^{T} \tilde{v}_t^2 \right] \right|}{\hat{\varphi}_a^{1/2}},$$

where

$$\hat{\varphi}_a = T^{-1} \sum_{j=-(T-1)}^{(T-1)} w(j, m) \sum_{t=|j|+1}^{T} \hat{\eta}_t \hat{\eta}_{t-j},$$

$\hat{\eta}_t = \tilde{v}_t^2 - \hat{\sigma}^2$ and $\hat{\sigma}^2 = T^{-1} \sum_{t=1}^{T} \tilde{v}_t^2$, with $\tilde{v}_t$ denoting the recursive residuals from the regression that imposes the null hypothesis of no change. Here also, $w(j, m)$ is the Quadratic Spectral kernel and the bandwidth parameter m is selected using Andrews’ (1991) method with an AR(1) approximation.

² We use similar abbreviations throughout.

The aim of the following design is to address the following issues: a) the size of the $\sup LR^*_{1,T}(n_a, \varepsilon)$ and $UD\max LR^*_{1,T}$ tests, b) the relative power of the $\sup LR^*_{1,T}(n_a, \varepsilon)$, $UD\max LR^*_{1,T}$ and $CUSQ^*$ tests, c) the power losses incurred from underspecifying the number of breaks in the $\sup LR^*_{1,T}(n_a, \varepsilon)$ test, d) the relative power of the $UD\max LR^*_{1,T}$ compared to the $\sup LR^*_{1,T}(n_a, \varepsilon)$ test with $n_a$ specified as the true number of breaks.

We start with a simple design with normal errors and the DGP is $y_t = e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta \mathbf{1}(t > [.5T]))$. We use T = 100, 200 and for the $\sup LR^*_{1,T}(1, \varepsilon)$ and $UD\max LR^*_{1,T}$ tests, the trimming parameter is set to ε = 0.25. The coefficient δ varies between 0 (size) and 1.5. The results are presented in Table 5.
They show the exact sizes of the $\sup LR^*_{1,T}(1, \varepsilon)$ and $UD\max LR^*_{1,T}$ tests to be close to their nominal 5% sizes. The $CUSQ^*$ test is slightly undersized. The highest power is achieved using the $\sup LR^*_{1,T}(1, \varepsilon)$ test. Interestingly, the $UD\max LR^*_{1,T}$ test has power very close to that of the $\sup LR^*_{1,T}(1, \varepsilon)$ test especially for the larger sample size, even though the range of alternatives considered is broader.

We next consider a dynamic model with ARCH(1) errors for which the DGP is given by

$$y_t = c + \alpha y_{t-1} + e_t, \qquad e_t = u_t \sqrt{h_t}, \qquad u_t \sim \text{i.i.d. } N(0, 1), \qquad h_t = \delta_1 + \delta_2 \mathbf{1}(t > [.5T]) + \gamma e_{t-1}^2,$$

where we set $h_0 = \delta_1 / (1 - \gamma)$, c = 0.5, $\delta_1 = 0.1$ and the trimming parameter is again ε = 0.25. We consider two values of the autoregressive parameter α = 0.2 and 0.7, the ARCH coefficient is set to γ = 0.1, 0.3 and 0.5 and again the size and power of 5% nominal size tests are evaluated at T = 100, 200. The magnitude of the change $\delta_2$ varies between 0 (size) and 0.30. The results are presented in Table 6. They again show the exact sizes of the $\sup LR^*_{1,T}(1, \varepsilon)$ and $UD\max LR^*_{1,T}$ tests to be close to their nominal 5% sizes. The $CUSQ^*$ test is again slightly undersized and in some cases, more so as γ increases. The $UD\max LR^*_{1,T}$ test still has power very close to that of the $\sup LR^*_{1,T}(1, \varepsilon)$ test even though the range of alternatives considered under it is broader. The power of the latter two tests dominates that of the $CUSQ^*$, especially when T = 100, in which case the discrepancies can be substantial.

We now turn to a case with two breaks in variance. The DGP is $y_t = e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta \mathbf{1}([.3T] < t \le [.6T]))$. This specifies a model in which the variance increases at [0.3T] and returns to its original level at [0.6T]. The sample size is set to T = 200. We consider four values of the trimming parameter ε = 0.10, 0.15, 0.20 and 0.25. The magnitude of the break in variance varies between δ = 0 (size) and δ = 3.
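For a design of this type, the corrected CUSQ statistic defined earlier in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the input `v` stands in for the recursive residuals (for the DGP $y_t = e_t$ the series itself is passed), the long-run variance is averaged over T, the AR(1) plug-in bandwidth uses the Quadratic Spectral constant 1.3221 from Andrews (1991), and the clipping of the AR(1) coefficient is a numerical safeguard added here. The names `qs_kernel` and `cusq_star` are hypothetical.

```python
import numpy as np

def qs_kernel(x):
    """Quadratic Spectral kernel, with w(0) = 1."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.ones_like(x)
    nz = x != 0
    z = 6.0 * np.pi * x[nz] / 5.0
    w[nz] = 3.0 / z ** 2 * (np.sin(z) / z - np.cos(z))
    return w

def cusq_star(v):
    """CUSUM of squares statistic scaled by a QS-kernel long-run variance."""
    v = np.asarray(v, dtype=float)
    T = v.size
    v2 = v ** 2
    eta = v2 - v2.mean()                       # eta_t = v_t^2 - sigma_hat^2

    # Andrews (1991) plug-in bandwidth based on an AR(1) fit to eta_t.
    rho = np.dot(eta[1:], eta[:-1]) / np.dot(eta[:-1], eta[:-1])
    rho = float(np.clip(rho, -0.97, 0.97))     # safeguard (assumption of this sketch)
    alpha2 = 4.0 * rho ** 2 / (1.0 - rho) ** 4
    m = 1.3221 * (alpha2 * T) ** 0.2

    # Kernel-weighted sum of sample autocovariances of eta_t (divided by T).
    lags = np.arange(T)
    gamma = np.array([np.dot(eta[j:], eta[:T - j]) for j in lags]) / T
    weights = qs_kernel(lags / m) if m > 0 else (lags == 0).astype(float)
    phi = gamma[0] + 2.0 * np.sum(weights[1:] * gamma[1:])

    # sup over lambda of |T^{-1/2} (partial sum - ([T*lambda]/T) * full sum)|.
    partial = np.cumsum(v2)
    frac = np.arange(1, T + 1) / T
    numerator = np.max(np.abs(partial - frac * partial[-1])) / np.sqrt(T)
    return numerator / np.sqrt(phi)

# Two-break variance design: variance 1 + delta for [.3T] < t <= [.6T].
rng = np.random.default_rng(0)
T, delta = 200, 3.0
t = np.arange(1, T + 1)
high = (t > (3 * T) // 10) & (t <= (6 * T) // 10)
y = rng.standard_normal(T) * np.sqrt(1.0 + delta * high)
print(cusq_star(y))
```

Because the variance returns to its initial level, the partial sums of squares drift away from and then back toward the straight line, which gives some intuition for why a CUSQ-type statistic may have difficulty with this alternative.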
We again consider the $UD\max LR^*_{1,T}$ test with N = 5 but include both the $\sup LR^*_{1,T}(1, \varepsilon)$ test for a single break and the $\sup LR^*_{1,T}(2, \varepsilon)$ test for two breaks to assess the extent of power gains when specifying the correct number of breaks. The results are presented in Table 7. Consider first the size of the tests. The $\sup LR^*_{1,T}(1, \varepsilon)$, $\sup LR^*_{1,T}(2, \varepsilon)$ and $UD\max LR^*_{1,T}$ are slightly conservative when ε is small but less so as ε increases. The $CUSQ^*$ is more conservative with an exact size of 0.026. As expected, power increases as ε increases since the class of alternatives shrinks with ε. When comparing the $\sup LR^*_{1,T}(1, \varepsilon)$ and $\sup LR^*_{1,T}(2, \varepsilon)$ tests, the latter is substantially more powerful, indicating that allowing for the correct number of breaks is important for power considerations. Here, the $UD\max LR^*_{1,T}$ is not as powerful as the $\sup LR^*_{1,T}(2, \varepsilon)$ but more powerful than the $\sup LR^*_{1,T}(1, \varepsilon)$. All versions of these tests are considerably more powerful than the $CUSQ^*$ test, which shows little power.

5.4 Conditional tests

We now consider the properties of the tests that condition on breaks in coefficients (resp., variance) when testing for changes in variance (resp., coefficients). Consider first the size and power of $\sup LR^*_{2,T}(m_a, n_a, \varepsilon \mid n = 0, m_a)$, which tests for $n_a$ changes in variance conditional on $m_a$ changes in regression coefficients. We set $m_a = n_a = 1$ and the DGP is a simple mean shift model with a change in mean of magnitude $\mu_2$ at mid-sample and i.i.d. normal errors with a change in variance of magnitude δ (under the alternative hypothesis) that occurs at [0.25T]. The results for size are presented in Table 8. When there is no change in mean ($\mu_2 = 0$), the test exhibits liberal size distortions, as expected since the limit distribution is inappropriate. Initially, the exact size approaches the 5% nominal size rather quickly as $\mu_2$ increases, and more so the larger the trimming parameter ε and/or the sample size T.
But when the change in mean is very large, the test is conservative and more so as the trimming parameter grows. This conservativeness is due to the fact that the limit distribution used is actually an upper bound. The results for power are presented in Table 9. Given the fact that the test becomes conservative for large values of $\mu_2$, the power of the test accordingly decreases with growth in $\mu_2$, though not to a large extent. It increases rapidly with the sample size and marginally with the value of the trimming parameter ε.

5.5 Size and power of the sup LR∗4,T and UD max LR∗4,T tests

We now present results pertaining to the properties of the $\sup LR^*_{4,T}$ and $UD\max LR^*_{4,T}$ tests. We first consider the size of the tests with normal i.i.d. errors and the DGP set to $y_t = e_t$, $e_t \sim$ i.i.d. $N(0, 1)$. We analyze three values of the trimming parameter ε = 0.1, 0.15 and 0.2. For the $UD\max LR^*_{4,T}$ test, M = N = 2, and for the $\sup LR^*_{4,T}$ test, we consider the following combinations: a) $m_a = n_a = 1$, b) $m_a = 1$, $n_a = 2$ and c) $m_a = 2$, $n_a = 1$. Two sample sizes are used: T = 100 and 200. The results are presented in Table 10 and they show the exact sizes of the tests to be close to their nominal 5% levels. Table 11 presents the results of a similar experiment but with ARCH(1) errors so that the DGP is $y_t = e_t$, $e_t = u_t \sqrt{h_t}$, where $u_t \sim$ i.i.d. $N(0, 1)$, $h_t = \delta_1 + \gamma e_{t-1}^2$, $h_0 = \delta_1 / (1 - \gamma)$, $\delta_1 = 1$ and γ takes values 0.1, 0.3 and 0.5. There are some cases of slight size distortions when T = 100 but these markedly decrease when T = 200.

We now consider the power of the tests. Since some partial results for the one break case are available in Tables 2 and 3 for the $\sup LR^*_{4,T}$ test, we concentrate here on the case with a different number of breaks in coefficients and variance. We also only consider normal errors though the hybrid-type correction is still applied.
Table 12 presents the results for the case with one break in coefficients and two breaks in variance, for which the DGP is given by

$$y_t = \mu_1 + \mu_2 \mathbf{1}(t > T^c) + e_t, \qquad e_t \sim \text{i.i.d. } N(0, 1 + \delta \mathbf{1}(T_1^v < t \le T_2^v))$$

with $\mu_1 = 0$ and $\mu_2 = \delta$. For these tests the trimming parameter used is ε = 0.1. Five different configurations of break dates are considered. We analyze two forms of the $\sup LR^*_{4,T}$ test: a) one that tests for a single break in both mean and variance and b) one that correctly tests for two changes in variance and one change in mean. We do this to investigate the extent of the power differences between underspecification and correct specification for the number of breaks. As expected, the power increases rapidly with δ and T. Under this DGP, the power is similar whether accounting for one or (correctly) two breaks in variance. The power of the $UD\max LR^*_{4,T}$ test is somewhat below the power of both versions of the $\sup LR^*_{4,T}$ test. This may, however, be an artifact of the specific DGP considered.

Table 13 presents the results for the case with two breaks in coefficients and one break in variance, for which the DGP is given by

$$y_t = \mu_1 + \mu_2 \mathbf{1}(T_1^c < t \le T_2^c) + e_t, \qquad e_t \sim \text{i.i.d. } N(0, 1 + \delta \mathbf{1}(t > T^v))$$

with $\mu_1 = 0$ and $\mu_2 = \delta$. Again, we consider two forms of the $\sup LR^*_{4,T}$ test: a) one that tests for a single break in both mean and variance and b) one that correctly tests for two changes in mean and one change in variance. Comparing these results with those in Table 12, the first notable difference is the fact that for given values of δ and T, the power is lower than in the case of one break in coefficient and two breaks in variance, indicating that changes in variance are easier to detect than changes in mean. The second difference is that the $UD\max LR^*_{4,T}$ test now has power in between that of the test that correctly specifies the type and number of breaks and the one that underspecifies the number of changes in mean.
This difference can indeed be substantial and, in line with the results of Bai and Perron (2006), the power of the UD max test is close to the power attained when correctly specifying the type and number of breaks.

5.6 Estimating the numbers of breaks in coefficients and in variance

In this section, we assess the performance of the sequential procedure discussed in Section 4.4 via a simple simulation experiment. The basic DGP is

$$y_t = \mu_1 + \mu_2 \mathbf{1}(t > T^c) + e_t, \qquad e_t \sim \text{i.i.d. } N(0, 1 + \delta \mathbf{1}(t > T^v)),$$

so that a maximum of one break in either mean and/or variance is allowed. The sample size is T = 200 and the tests are constructed with trimming parameter ε = 0.15. We consider the following scenarios: a) no change in either mean or variance; b) a change in mean only, occurring at mid-sample; c) a change in variance only, also occurring at mid-sample; d) a change in both mean and variance, occurring at a common date (mid-sample); e) a change in both mean and variance, occurring at different but close dates ($T^c = [0.5T]$, $T^v = [0.7T]$); f) a change in both mean and variance, occurring at different distant dates ($T^c = [0.25T]$, $T^v = [0.75T]$). Whenever breaks occur, we consider a range of magnitudes.

The results are presented in Table 14. In general, the procedure works quite well in selecting the correct number and types of breaks. There are cases, however, in which the probability of making the correct selection is quite low. This occurs when both changes in mean and variance are not large and occur at different dates, especially when the respective break dates are far apart. The reason for this is that the $\sup Seq_T(\ell + 1 \mid \ell)$ statistic jointly tests for a break in both regression coefficients and variance. Hence, if only one type of break occurs, the power of the test can be quite low unless the magnitudes of the breaks are large.
Unfortunately, situations for which the procedure performs poorly are expected to be quite common in practice, as we shall see in the empirical applications reported in the next section. Hence, though this specific to general procedure is valid in large samples, it should not be applied mechanically. Care must be exercised to assess whether one is facing a situation for which its finite sample properties are rather poor. An alternative approach is to use a general to specific procedure to determine the appropriate number and types of breaks. This involves using the battery of tests presented in this paper in a judicious way. The procedure cannot be mechanized but is likely to deliver better results. We shall illustrate how to apply it in the context of actual applications to be discussed in the next section.

6 Applications

The set of testing procedures discussed above provides useful tools for the joint detection of structural changes in the unconditional variance of the errors and the parameters of the conditional mean in a linear regression model. To our knowledge, no such tests have been previously available under the level of generality that we consider here. Our novel contributions are important for practical applications as witnessed by recent interest in macroeconomics and finance, where documenting structural change in the variability of shocks to simple autoregressions or vector autoregressive models has been a concern; see, among others, Blanchard and Simon (2001), Herrera and Pesavento (2005), Kim and Nelson (1999), McConnell and Perez-Quiros (2000), Sensier and van Dijk (2004), Summers (2005) and Stock and Watson (2002). Stock and Watson (2002) present an exhaustive analysis documenting facts about potential changes in the volatility of macroeconomic time series using the two step approach mentioned in Section 2. Of interest here is the fact that many such series seem to have experienced a decline in volatility in the mid-1980s.
We reconsider the analysis presented in Table 2 of Stock and Watson (2002), pertaining to 22 common macroeconomic variables. These are quarterly series covering the period 1960-2001 whose list is contained in Table 15. The series have been transformed to eliminate trends and/or unit roots.³ Graphs of the series are presented in Figures 7 and 8. With the variables transformed to achieve stationarity, the basic regression is a simple AR(4) with a fitted intercept.

³ The data source is the DRI-McGraw Hill Basic Economics database. The series were kindly posted by Mark Watson on his web page.

We first discuss how we used our testing procedures to select the number of breaks in coefficients (intercept and autoregressive parameters) and variance. With the types of breaks in the series analyzed, the sequential procedure did not perform well. This is due to the fact that in most cases, changes in both the coefficients and the variance occurred at different times, a case for which the specific to general procedure performs poorly. Hence, we used a procedure more akin to a general to specific one. To start, we set an upper bound of two breaks for each of the coefficients and variance, implying a maximum of four breaks overall. This upper bound should be large enough for the types of series analyzed. In any event, we subsequently present evidence that two breaks are sufficient.

The first statistic used is the $UD\max LR^*_{4,T}$ test with M = N = 2. In Table 15, we report the outcome of this test with trimming parameters ε = 0.15 and ε = 0.20. It shows significant evidence for at least one break for 14 of the 22 series (at the 10% significance level). For the eight series for which this test statistic is not significant, Stock and Watson (2002) reported evidence of breaks in five of them: consumption, change in inventory investment, total production of goods, nondurable goods and non-agricultural employment.
For those series for which the $UD\max LR^*_{4,T}$ test shows a rejection, we computed a wide range of tests to decide which model to select. To illustrate, consider the case of the GDP series for which we selected m = 1 and n = 2. The $\sup LR^*_{4,T}(1, 2)$ is indeed significant at the 1% level. We then consider the sequential tests $\sup Seq_T(m, n + 1 \mid m, n)$ and $\sup Seq_T(m + 1, n \mid m, n)$ to see if too many or too few breaks are included. The test statistic $\sup Seq_T(2, 2 \mid 1, 2)$ is insignificant, indicating that an additional break in coefficients is unwarranted. The $\sup Seq_T(1, 2 \mid 1, 1)$ test statistic is significant, indicating that a second break in variance is warranted. The $\sup LR_{3,T}(1, 2 \mid 0, 2)$ test statistic is insignificant at the 10% level but marginally so. Given the low power of this test induced by the fact that 5 parameters are allowed to change, we decided to keep one break in the coefficients (the parameter estimates will show an important change). Finally, the $\sup Seq_T(1, 3 \mid 1, 2)$ statistic is insignificant so that a third break in variance is not needed. This is the basic procedure that we repeated for all series.

The models selected are presented in the fourth column of Table 16. In all cases, at least one change in both coefficients and variance occurs and often two of each do. Table 16 presents the key parameter estimates: a) the break dates in coefficients ($T_1^c$ and $T_2^c$) and variance ($T_1^v$ and $T_2^v$); b) the value of the intercept in each regime ($\alpha_i$, i = 1, 2, 3), used to assess whether there are important level shifts; c) the sum of the autoregressive coefficients in each regime ($\beta_i$, i = 1, 2, 3), used to assess whether there are changes in persistence induced by different propagation mechanisms; d) the standard deviation of the errors in each regime ($\sigma_i$, i = 1, 2, 3), used to quantify the magnitude of changes in variance (we also present various ratios to help gauge relative magnitudes across regimes in the last three columns).
The first thing to note is that the ratio of the standard deviations of the errors for the last regime relative to the preceding one provides strong evidence of a change in variance, typically being a substantial decrease (for GDP, consumption of durables and non-durables, total fixed investment, residential investment, production of durable goods and structures, inflation and the T-bill and T-bond rates). There are, however, several cases for which the evidence points to a substantial increase in the variance of the errors: the consumption and production of services, exports and imports. Hence, the so-called “great moderation” did not occur across all sectors. The last break date is typically estimated to be in the mid-1980s for most series with some exceptions, for which it occurred in the early 1990s. The results show many additional interesting features. Consider first the cases for which two breaks in the variance of the shocks occurred. In these cases, it appears that there is a tendency for variance to revert back to the level of the first regime. For example, for GDP, the ratio σ 3 /σ 2 is 0.35 while σ 3 /σ 1 is 0.67. For inflation this feature is especially pronounced: the variance after 1986:1 reverts back exactly to its level prior to 1971:3. For the interest rate series, there is actually an increase in variance after the mid-1980s in comparison to before 1967:4 for the T-bill rate and before 1979:3 for the T-bond rate. So this so-called “great moderation” may need to be qualified as a phenomenon in which the high variance level of the 1970s to early 1980s ended and variance returned back to the level of (roughly) the pre-1970s; sometimes this reversion is exact (e.g., inflation), incomplete (e.g., interest rates) or magnified (e.g., real variables). Hence, the so-called “great-moderation” may better be termed as the “great-reversion”. 
With respect to the intercept of the regression (the long term level), there is not much evidence of significant change, with the following exceptions. For exports, there is a decrease in 1972:4 and an increase in 1992:1. For imports, we have the mirror image with an increase in 1967:1 and a decrease in 1990:4. The other series with important level shifts in mean are the interest rate series: for the 90-day T-bill rate, an increase occurs in 1967:4 and a decrease in 1983:4; the pattern is similar for the 10-year T-bond rate but the increase occurs in 1979:3 and the decrease in 1986:4.

With respect to the sum of the autoregressive coefficients, which can be thought of as the persistence of the series, there are important changes. For GDP, consumption and investment, the results point to a one-time increase, though the dates for each differ. For the production of services, there are two increases, in 1968:3 and 1982:4. For other series the pattern is more complex and interesting. For inflation, there is a substantial increase in 1971:3 and a large decrease in 1986:1 (the pattern is similar for the production of structures, though the dates and relative magnitudes differ). For the 90-day T-bill rate, there is a decrease in 1967:4 and an increase in 1983:4. Interestingly, for the 10-year T-bond rate the pattern is reversed, with an increase in 1979:3 and a decrease in 1986:4. The most peculiar results are, however, for imports and exports. For both series, there are two changes in variance and coefficients. In each regime, the variance of the shocks is the same for the two series. However, the pattern for the measure of persistence is completely different. For exports, there is a very large increase in 1972:4 followed by a large decrease in 1992:1, while for imports there is a large decrease in 1967:1 and a very large increase in 1990:4.
Since the statistic $UD\max LR^*_{4,T}$ tests jointly for the presence of changes in the regression coefficients and the variance of the errors, it may be that it lacks power if only changes in variance occur (especially if the number of regression coefficients allowed to change is large, e.g., 5 in the applications here). In such a case, an alternative strategy is possible. It involves using the $UD\max LR^*_{1,T}$ and $\sup LR^*_{1,T}$ tests to assess the number of changes in variance, assuming no change in coefficients, and then using the statistic $\sup LR_{3,T}(m_a, n_a, \varepsilon \mid m = 0, n_a)$, where $n_a$ is the number of changes in variance found in the first step, as well as the statistic $\sup Seq_T(m, n + 1 \mid m, n)$. Non-rejections with these tests should then be viewed as confirmatory evidence that the results based on the $UD\max LR^*_{1,T}$ and $\sup LR^*_{1,T}$ tests were adequate.

We illustrate this approach using series for which we could not obtain a rejection with the $UD\max LR^*_{4,T}$ test but for which Stock and Watson (2002) claimed evidence in favor of a single change in variance. These series are consumption, change in inventory investment, total production of goods and production of nondurables. The results are presented in Table 17. The $UD\max LR^*_{1,T}$ test with N = 2 is significant at the 10% level or better, except for the total production of goods series for which there is no evidence of change in the variance of the errors. The use of the $\sup LR^*_{1,T}$ test leads us to conclude that there is one change in the consumption and production of nondurables series and two changes in the change in inventory investment series. None of the $\sup LR_{3,T}(m_a, n_a, \varepsilon \mid m = 0, n_a)$ or $\sup Seq_T(m, n + 1 \mid m, n)$ tests, conditional on the number of breaks in variance found, are significant. Hence, we are confident that the results based on the $UD\max LR^*_{1,T}$ and $\sup LR^*_{1,T}$ tests are not spurious. The parameter estimates yield the following features.
For the consumption series, there is a substantial decrease in variance in 1983:2, a date which is quite different from the date 1992:1 found by Stock and Watson (2002). For the production of nondurables series, there is also a large decrease in variance in 1984:3. For the change in inventory investment series, there is a large increase in variance in 1973:3 followed by a reversal to the roughly pre-1973 level in 1987:2. Hence, we again have that for series with two changes, the evidence indicates that the decrease in the 1980s is indeed a reversal to a previous level.

There is undoubtedly a wealth of interesting features in these results that call for explanations. These are obviously outside the scope of this paper but can hopefully spur the interest of macroeconomists.

7 Conclusion

Perron and Zhou (2008) provided tools for testing for multiple structural breaks in the error variance of a linear regression model in the presence or absence of breaks in the regression coefficients. A novel feature of their testing procedures is that no restriction on the break dates is imposed, i.e., the breaks in the regression coefficients and the variance are allowed to occur at the same time or at different times. The proposed statistics have asymptotic distributions invariant to nuisance parameters and are valid in the presence of non-normal errors and conditional heteroskedasticity, as well as serial correlation. Extensive simulations of the finite sample properties show that the tests perform well in terms of size and power. However, the specific to general procedure put forth by Perron and Zhou (2008) for estimating the number and types of breaks has some shortcomings when the breaks in coefficients and variance of the errors are small and occur at different dates. In this paper, we applied the testing procedures to various macroeconomic time series studied by Stock and Watson (2002).
Our results reinforce the evidence for prevalent changes in mean, persistence and variance of the shocks of simple autoregressions. Most series exhibit an important reduction in variance that occurred in the 1980s. For many series, however, the evidence points to two breaks in the variance of the shocks, with the variance increasing at the first break and decreasing at the second. Hence, the so-called “great moderation” may be qualified as a phenomenon in which the high variance level of the 1970s to early 1980s ended and the variance returned to its (roughly) pre-1970s level. Accordingly, the so-called “great moderation” may better be termed a “great reversion”.

References

Andrews, D.W.K. (1991): “Heteroskedasticity and autocorrelation consistent covariance matrix estimation,” Econometrica, 59, 817-858.
Andrews, D.W.K. (1993): “Tests for parameter instability and structural change with unknown change point,” Econometrica, 61, 821-856.
Bai, J., and Perron, P. (1998): “Estimating and testing linear models with multiple structural changes,” Econometrica, 66, 47-78.
Bai, J., and Perron, P. (2003): “Critical values for multiple structural change tests,” Econometrics Journal, 6, 72-78.
Bai, J., and Perron, P. (2006): “Multiple structural change models: a simulation analysis,” in Econometric Theory and Practice: Frontiers of Analysis and Applied Research, D. Corbae, S. Durlauf and B.E. Hansen (eds.), Cambridge University Press, 212-237.
Blanchard, O., and Simon, J. (2001): “The long and large decline in U.S. output volatility,” Brookings Papers on Economic Activity, 1, 135-173.
Brown, R.L., Durbin, J., and Evans, J.M. (1975): “Techniques for testing the constancy of regression relationships over time,” Journal of the Royal Statistical Society B, 37, 149-163.
Deng, A., and Perron, P. (2008): “The limit distribution of the CUSUM of squares test under general mixing conditions,” Econometric Theory, 24, 809-822.
Goldfeld, S.M., and Quandt, R.E. (1978): “Asymptotic tests for the constancy of regressions in the heteroskedastic case,” Econometric Research Program Research Memorandum No. 229, Princeton University.
Herrera, A.M., and Pesavento, E. (2005): “The decline in U.S. output volatility: structural changes and inventory investment,” Journal of Business & Economic Statistics, 23, 462-472.
Inclán, C., and Tiao, G.C. (1994): “Use of cumulative sums of squares for retrospective detection of changes of variance,” Journal of the American Statistical Association, 89, 913-923.
Kejriwal, M. (2007): “Tests for a mean shift with good size and monotonic power,” Unpublished Manuscript, Krannert School of Management, Purdue University.
Kejriwal, M., and Perron, P. (2006): “Testing for multiple structural changes in cointegrated regression models,” Unpublished Manuscript, Department of Economics, Boston University.
Kim, C.-J., and Nelson, C.R. (1999): “Has the U.S. economy become more stable? A Bayesian approach based on a Markov-switching model of the business cycle,” Review of Economics and Statistics, 81, 608-616.
McConnell, M.M., and Perez-Quiros, G. (2000): “Output fluctuations in the United States: What has changed since the early 1980’s?” American Economic Review, 90, 1464-1476.
Perron, P., and Zhou, J. (2008): “Testing jointly for structural changes in the error variance and coefficients of a linear regression model,” Unpublished Manuscript, Department of Economics, Boston University.
Qu, Z., and Perron, P. (2007): “Estimating and testing multiple structural changes in multivariate regressions,” Econometrica, 75, 459-502.
Sensier, M., and van Dijk, D. (2004): “Testing for volatility changes in U.S. macroeconomic time series,” Review of Economics and Statistics, 86, 833-839.
Stock, J.H., and Watson, M.W. (2002): “Has the business cycle changed and why?” in NBER Macroeconomics Annual 17, M. Gertler and K. Rogoff (eds.), MIT Press, 159-218.
Summers, P.M. (2005): “What caused the great moderation? Some cross-country evidence,” Economic Review QII, Federal Reserve Bank of Kansas City, 5-32.

Table 1: Size of the $\sup LR^*_{4,T}$ test using different estimates of $\psi$ in the case of ARCH(1) errors (DGP: $y_t = e_t$, $e_t = u_t \sqrt{h_t}$ with $u_t \sim$ i.i.d. $N(0,1)$, $h_t = \delta_1 + \gamma e_{t-1}^2$, $h_0 = \delta_1/(1-\gamma)$, $\delta_1 = 1$ and $T = 100$; $\varepsilon = 0.20$. Alternative hypothesis: $m = n = 1$).

γ      alternative   null    hybrid
0.1    0.083         0.040   0.042
0.2    0.102         0.042   0.049
0.3    0.116         0.038   0.048
0.4    0.139         0.031   0.040
0.5    0.161         0.027   0.042

Note: "alternative" specifies that the unrestricted residuals are used to construct $\hat{\psi}$ and $m$, "null" specifies that the residuals imposing the null hypothesis are used to construct $\hat{\psi}$ and $m$, and "hybrid" specifies that the residuals under the alternative hypothesis are used to construct $m$ and the residuals under the null hypothesis are used to construct $\hat{\psi}$.

Table 2: Power of the $\sup LR^*_{4,T}$ test using different estimates of $\psi$ in the case of ARCH(1) errors (DGP: $y_t = \mu_2 1(t > [.25T]) + e_t$, $e_t = u_t \sqrt{h_t}$ with $u_t \sim$ i.i.d. $N(0,1)$, $h_t = \delta_1 + \delta_2 1(t > [.75T]) + \gamma e_{t-1}^2$, $h_0 = \delta_1/(1-\gamma)$, $\delta_1 = 1$ and $T = 100$; $\varepsilon = 0.20$).
a) small change in variance, large change in mean

           γ = 0.1                       γ = 0.3                       γ = 0.5
           null          hybrid          null          hybrid          null          hybrid
μ2 \ δ2    0.25   0.5    0.25   0.5      0.25   0.5    0.25   0.5      0.25   0.5    0.25   0.5
0.5        0.267  0.299  0.281  0.318    0.222  0.231  0.230  0.250    0.161  0.169  0.169  0.181
1          0.859  0.859  0.863  0.862    0.758  0.752  0.762  0.760    0.612  0.616  0.619  0.631
1.5        0.999  0.998  0.999  0.998    0.986  0.986  0.987  0.986    0.930  0.929  0.932  0.932
2          1.000  1.000  1.000  1.000    1.000  1.000  1.000  1.000    0.993  0.992  0.993  0.992

b) small change in mean, large change in variance

           γ = 0.1                       γ = 0.3                       γ = 0.5
           null          hybrid          null          hybrid          null          hybrid
δ2 \ μ2    0.25   0.5    0.25   0.5      0.25   0.5    0.25   0.5      0.25   0.5    0.25   0.5
1          0.202  0.394  0.246  0.429    0.142  0.293  0.173  0.324    0.094  0.216  0.121  0.231
3          0.512  0.682  0.655  0.771    0.332  0.483  0.438  0.569    0.210  0.346  0.277  0.398
5          0.652  0.805  0.822  0.903    0.464  0.592  0.600  0.715    0.299  0.422  0.406  0.495
7          0.731  0.853  0.887  0.945    0.532  0.671  0.693  0.791    0.360  0.477  0.493  0.574

Note: "null" specifies that the residuals imposing the null hypothesis are used to construct $\hat{\psi}$ and $m$, and "hybrid" specifies that the residuals under the alternative are used to construct $m$ and the residuals under the null hypothesis are used to construct $\hat{\psi}$.

Table 3: Size and power of the $\sup LR^*_{4,T}$ test using different corrections in the case of normal errors (DGP: $y_t = \mu_1 + \mu_2 1(t > T_1^c) + e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta 1(t > T_1^v))$ with $\mu_1 = 0$ and $\mu_2 = \delta$; $\varepsilon = 0.15$)

T = 100
        T1c = T1v = [.5T]        T1c = [.3T], T1v = [.6T]   T1c = [.6T], T1v = [.3T]
δ       full    i.i.d.  NC       full    i.i.d.  NC         full    i.i.d.  NC
0       0.046   0.045   0.053    0.046   0.045   0.053      0.046   0.045   0.053
0.25    0.125   0.126   0.125    0.112   0.115   0.123      0.120   0.112   0.121
0.5     0.425   0.439   0.455    0.406   0.401   0.414      0.382   0.377   0.396
0.75    0.780   0.779   0.783    0.750   0.752   0.753      0.685   0.686   0.703
1       0.946   0.947   0.953    0.949   0.948   0.952      0.889   0.890   0.898
1.25    0.992   0.992   0.993    0.991   0.992   0.991      0.978   0.978   0.982
1.5     0.998   0.998   0.999    0.999   0.999   0.999      0.995   0.995   0.995

T = 200
        T1c = T1v = [.5T]        T1c = [.3T], T1v = [.6T]   T1c = [.6T], T1v = [.3T]
δ       full    i.i.d.  NC       full    i.i.d.  NC         full    i.i.d.  NC
0       0.054   0.053   0.056    0.054   0.053   0.056      0.054   0.053   0.056
0.25    0.228   0.224   0.239    0.213   0.210   0.211      0.206   0.207   0.210
0.5     0.783   0.788   0.779    0.745   0.748   0.732      0.709   0.711   0.719
0.75    0.981   0.982   0.993    0.982   0.982   0.985      0.955   0.953   0.960
1       1.000   1.000   1.000    1.000   1.000   1.000      0.997   0.997   0.998
1.25    1.000   1.000   1.000    1.000   1.000   1.000      1.000   1.000   1.000
1.5     1.000   1.000   1.000    1.000   1.000   1.000      1.000   1.000   1.000

Note: The nominal size is 5% and 1,000 replications are used. The column "full" refers to the tests using the correction $\hat{\psi}$ which allows for non-normal, conditionally heteroskedastic and serially correlated errors, as defined by (16); the column "i.i.d." refers to a correction that only allows for i.i.d. errors, i.e., $\hat{\psi} = \hat{\mu}_4/\hat{\sigma}^4 - 1$, where $\hat{\sigma}^2 = T^{-1}\sum_{t=1}^{T} \hat{u}_t^2$ and $\hat{\mu}_4 = T^{-1}\sum_{t=1}^{T} \hat{u}_t^4$ with $\hat{u}_t$ being the residuals under the null hypothesis; the column "NC" applies no correction and sets $\hat{\psi} = 2$, which is valid with normal errors.

Table 4: Size and power of the $\sup LR^*_{1,T}$ test using different corrections in the case of normal errors (DGP: $y_t = e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta 1(t > [.5T]))$; $\varepsilon = 0.25$)

        T = 100                  T = 200
δ       full    i.i.d.  NC       full    i.i.d.  NC
0       0.046   0.039   0.038    0.041   0.044   0.049
0.25    0.053   0.073   0.065    0.129   0.112   0.137
0.5     0.159   0.159   0.190    0.363   0.348   0.383
0.75    0.308   0.297   0.365    0.618   0.598   0.609
1       0.462   0.453   0.533    0.803   0.806   0.848
1.25    0.573   0.603   0.668    0.932   0.908   0.944
1.5     0.761   0.690   0.795    0.969   0.967   0.983

Note: see notes to Table 3.

Table 5: Size and power of the $\sup LR^*_{1,T}$ ($n_a = 1$), $UD\max LR^*_{1,T}$ and $CUSQ^*$ tests in the case of normal errors (DGP: $y_t = e_t$, $e_t \sim$ i.i.d.
$N(0, 1 + \delta 1(t > [.5T]))$; $\varepsilon = 0.25$)

        T = 100                      T = 200
δ       sup LR*  UDmax   CUSQ*       sup LR*  UDmax   CUSQ*
0       0.046    0.041   0.030       0.041    0.041   0.031
0.25    0.053    0.050   0.060       0.129    0.126   0.098
0.5     0.159    0.159   0.134       0.363    0.350   0.345
0.75    0.308    0.300   0.291       0.618    0.607   0.595
1       0.462    0.448   0.427       0.803    0.796   0.805
1.25    0.573    0.558   0.560       0.932    0.928   0.905
1.5     0.761    0.614   0.645       0.969    0.965   0.964

Table 6: Size and power of the $\sup LR^*_{1,T}(1, \varepsilon)$, $UD\max LR^*_{1,T}$ and $CUSQ^*$ tests in a dynamic model with ARCH(1) errors (DGP: $y_t = c + \alpha y_{t-1} + e_t$, $e_t = u_t \sqrt{h_t}$ with $u_t \sim$ i.i.d. $N(0,1)$, $h_t = \delta_1 + \delta_2 1(t > [.5T]) + \gamma e_{t-1}^2$, $h_0 = \delta_1/(1-\gamma)$, $c = 0.5$ and $\delta_1 = 0.1$; $\varepsilon = 0.25$).

T = 100, α = 0.2
        γ = 0.1                  γ = 0.3                  γ = 0.5
δ2      LR      UDmax   CUSQ     LR      UDmax   CUSQ     LR      UDmax   CUSQ
0       0.056   0.052   0.028    0.054   0.052   0.031    0.050   0.050   0.031
0.05    0.165   0.156   0.134    0.138   0.136   0.083    0.125   0.126   0.054
0.10    0.434   0.417   0.268    0.302   0.293   0.151    0.209   0.201   0.149
0.15    0.620   0.608   0.528    0.452   0.440   0.324    0.318   0.309   0.196
0.20    0.811   0.807   0.649    0.617   0.602   0.413    0.434   0.411   0.315
0.30    0.916   0.911   0.828    0.784   0.775   0.570    0.562   0.552   0.407

T = 100, α = 0.7
        γ = 0.1                  γ = 0.3                  γ = 0.5
δ2      LR      UDmax   CUSQ     LR      UDmax   CUSQ     LR      UDmax   CUSQ
0       0.052   0.050   0.027    0.050   0.055   0.028    0.052   0.049   0.019
0.05    0.170   0.160   0.139    0.140   0.147   0.082    0.116   0.096   0.059
0.10    0.429   0.415   0.297    0.303   0.283   0.149    0.209   0.209   0.147
0.15    0.623   0.608   0.555    0.452   0.453   0.317    0.306   0.282   0.202
0.20    0.809   0.801   0.678    0.611   0.594   0.399    0.415   0.430   0.319
0.30    0.916   0.909   0.854    0.775   0.727   0.558    0.544   0.542   0.423

T = 200, α = 0.2
        γ = 0.1                  γ = 0.3                  γ = 0.5
δ2      LR      UDmax   CUSQ     LR      UDmax   CUSQ     LR      UDmax   CUSQ
0       0.043   0.043   0.034    0.052   0.050   0.029    0.038   0.035   0.023
0.05    0.351   0.341   0.313    0.214   0.209   0.190    0.139   0.133   0.114
0.10    0.753   0.749   0.726    0.497   0.485   0.501    0.313   0.300   0.299
0.15    0.927   0.923   0.930    0.740   0.727   0.723    0.512   0.496   0.442
0.20    0.981   0.981   0.984    0.885   0.876   0.837    0.621   0.614   0.616
0.30    0.999   0.999   0.998    0.954   0.950   0.931    0.780   0.773   0.719

T = 200, α = 0.7
        γ = 0.1                  γ = 0.3                  γ = 0.5
δ2      LR      UDmax   CUSQ     LR      UDmax   CUSQ     LR      UDmax   CUSQ
0       0.044   0.042   0.032    0.043   0.044   0.030    0.035   0.028   0.034
0.05    0.350   0.340   0.309    0.209   0.203   0.192    0.123   0.122   0.120
0.10    0.758   0.749   0.728    0.522   0.506   0.507    0.277   0.312   0.266
0.15    0.929   0.921   0.933    0.729   0.745   0.709    0.448   0.467   0.404
0.20    0.980   0.979   0.982    0.878   0.839   0.825    0.627   0.631   0.576
0.30    0.999   0.998   0.997    0.949   0.940   0.924    0.759   0.749   0.698

Table 7: Size and power of the $\sup LR^*_{1,T}(n_a, \varepsilon)$, $UD\max LR^*_{1,T}$ and $CUSQ^*$ tests with normal errors and two variance breaks (DGP: $y_t = e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta 1([.3T] < t \le [.6T]))$ with $T = 200$)

        ε = 0.10                 ε = 0.15                 ε = 0.20                 ε = 0.25
δ       na = 1  na = 2  UDmax    na = 1  na = 2  UDmax    na = 1  na = 2  UDmax    na = 1  na = 2  UDmax    CUSQ*
0       0.033   0.033   0.032    0.036   0.033   0.033    0.040   0.035   0.039    0.041   0.035   0.041    0.026
0.25    0.071   0.057   0.069    0.080   0.062   0.071    0.080   0.076   0.081    0.087   0.091   0.081    0.039
0.5     0.124   0.133   0.117    0.137   0.167   0.142    0.141   0.197   0.154    0.138   0.234   0.150    0.069
0.75    0.174   0.252   0.193    0.182   0.321   0.218    0.184   0.369   0.219    0.198   0.439   0.242    0.089
1       0.202   0.394   0.271    0.241   0.484   0.325    0.266   0.570   0.364    0.287   0.621   0.374    0.118
1.25    0.280   0.546   0.403    0.328   0.631   0.466    0.367   0.704   0.507    0.387   0.774   0.532    0.154
1.5     0.372   0.673   0.521    0.418   0.760   0.586    0.454   0.825   0.629    0.477   0.868   0.660    0.186
2       0.502   0.866   0.721    0.535   0.915   0.783    0.572   0.950   0.819    0.624   0.965   0.837    0.300
2.5     0.592   0.934   0.827    0.675   0.968   0.878    0.714   0.982   0.909    0.750   0.990   0.922    0.348
3       0.681   0.977   0.909    0.749   0.986   0.938    0.780   0.992   0.960    0.823   0.998   0.971    0.397

Note: The columns $n_a = 1$ and $n_a = 2$ correspond to the $\sup LR^*_{1,T}(1, \varepsilon)$ and $\sup LR^*_{1,T}(2, \varepsilon)$ tests, respectively.

Table 8: Size of the $\sup LR^*_{2,T}(1, 1, \varepsilon \mid n = 0, m = 1)$ test with different trimming parameters $\varepsilon$ in the case of normal errors (DGP: $y_t = \mu_1 + \mu_2 1(t > [0.5T]) + e_t$, $e_t \sim$ i.i.d. $N(0,1)$, $\mu_1 = 0$).
          T = 100                          T = 200
μ2 \ ε    0.1     0.15    0.2     0.25     0.1     0.15    0.2     0.25
0         0.095   0.081   0.073   0.066    0.064   0.070   0.064   0.059
0.1       0.097   0.082   0.075   0.066    0.069   0.065   0.061   0.059
0.25      0.088   0.082   0.069   0.062    0.067   0.072   0.060   0.052
0.5       0.078   0.069   0.056   0.048    0.047   0.047   0.039   0.035
0.75      0.056   0.052   0.045   0.033    0.031   0.029   0.026   0.022
1         0.051   0.046   0.038   0.029    0.030   0.030   0.029   0.019
2         0.052   0.041   0.035   0.025    0.031   0.030   0.026   0.017
4         0.049   0.039   0.033   0.021    0.030   0.031   0.027   0.016
10        0.049   0.038   0.032   0.021    0.030   0.030   0.027   0.017
20        0.049   0.038   0.032   0.021    0.030   0.030   0.027   0.017

Table 9: Power of the $\sup LR^*_{2,T}(1, 1, \varepsilon \mid n = 0, m = 1)$ test with different trimming parameters $\varepsilon$ in the case of normal errors (DGP: $y_t = \mu_1 + \mu_2 1(t > [0.5T]) + e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta 1(t > [.25T]))$, $\mu_1 = 0$).

T = 100
          ε = 0.1                                                  ε = 0.2
δ \ μ2    0       0.1     0.5     2       4       10      20       0       0.1     0.5     2       4       10      20
0.25      0.126   0.125   0.095   0.064   0.060   0.059   0.059    0.108   0.100   0.094   0.052   0.048   0.062   0.040
0.5       0.174   0.178   0.173   0.097   0.102   0.102   0.102    0.148   0.196   0.165   0.102   0.085   0.093   0.113
0.75      0.252   0.249   0.241   0.159   0.158   0.159   0.159    0.264   0.270   0.290   0.167   0.157   0.138   0.139
1         0.348   0.339   0.347   0.224   0.225   0.227   0.227    0.359   0.375   0.371   0.261   0.261   0.266   0.254
1.25      0.437   0.449   0.429   0.318   0.331   0.332   0.332    0.492   0.452   0.442   0.362   0.351   0.368   0.358
1.5       0.528   0.524   0.504   0.414   0.419   0.427   0.427    0.567   0.576   0.568   0.465   0.443   0.459   0.462
2         0.687   0.683   0.665   0.570   0.564   0.578   0.578    0.724   0.738   0.704   0.618   0.636   0.626   0.621
3         0.853   0.849   0.859   0.798   0.790   0.803   0.803    0.905   0.898   0.892   0.849   0.823   0.865   0.847
4         0.938   0.938   0.930   0.908   0.902   0.910   0.910    0.960   0.963   0.958   0.946   0.937   0.933   0.951

T = 200
          ε = 0.1                                                  ε = 0.2
δ \ μ2    0       0.1     0.5     2       4       10      20       0       0.1     0.5     2       4       10      20
0.25      0.128   0.131   0.108   0.067   0.068   0.070   0.070    0.129   0.125   0.095   0.055   0.058   0.056   0.056
0.5       0.279   0.279   0.242   0.177   0.184   0.184   0.184    0.277   0.289   0.248   0.178   0.182   0.185   0.185
0.75      0.476   0.460   0.422   0.333   0.343   0.349   0.349    0.493   0.484   0.443   0.352   0.357   0.359   0.359
1         0.644   0.651   0.615   0.516   0.528   0.534   0.534    0.687   0.692   0.641   0.560   0.569   0.577   0.577
1.25      0.804   0.794   0.771   0.699   0.707   0.714   0.714    0.828   0.821   0.792   0.727   0.737   0.745   0.745
1.5       0.888   0.887   0.878   0.832   0.842   0.843   0.844    0.914   0.910   0.896   0.850   0.859   0.864   0.864
2         0.970   0.973   0.968   0.941   0.948   0.951   0.951    0.978   0.977   0.975   0.957   0.963   0.965   0.965
3         0.998   0.998   0.998   0.997   0.996   0.996   0.996    0.999   0.999   0.998   0.996   0.997   0.997   0.997
4         1.000   1.000   1.000   1.000   1.000   1.000   1.000    1.000   1.000   1.000   1.000   1.000   1.000   1.000

Table 10: Size of the $\sup LR^*_{4,T}(m_a, n_a)$ and $UD\max LR^*_{4,T}$ tests in the case of normal errors (DGP: $y_t = e_t$, $e_t \sim$ i.i.d. $N(0,1)$)

T = 100
ε       ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax
0.2     0.041         0.050            0.045            0.052
0.15    0.046         0.053            0.043            0.046
0.1     0.057         0.058            0.052            0.054

T = 200
ε       ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax
0.2     0.050         0.044            0.052            0.046
0.15    0.054         0.050            0.047            0.047
0.1     0.048         0.040            0.046            0.045

Table 11: Size of the $\sup LR^*_{4,T}(m_a, n_a)$ and $UD\max LR^*_{4,T}$ tests in the case of ARCH(1) errors (DGP: $y_t = e_t$, $e_t = u_t \sqrt{h_t}$ with $u_t \sim$ i.i.d. $N(0,1)$, $h_t = \delta_1 + \gamma e_{t-1}^2$, $h_0 = \delta_1/(1-\gamma)$ and $\delta_1 = 1$)

T = 100
        ε = 0.1                                                   ε = 0.2
γ       ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax     ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax
0.1     0.057         0.058            0.058            0.062     0.042         0.057            0.046            0.056
0.3     0.057         0.067            0.059            0.072     0.048         0.068            0.048            0.070
0.5     0.058         0.063            0.057            0.076     0.042         0.064            0.043            0.056

T = 200
        ε = 0.1                                                   ε = 0.2
γ       ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax     ma = na = 1   ma = 1, na = 2   ma = 2, na = 1   UDmax
0.1     0.058         0.039            0.049            0.053     0.051         0.049            0.051            0.056
0.3     0.054         0.031            0.043            0.050     0.042         0.039            0.040            0.038
0.5     0.044         0.019            0.038            0.044     0.044         0.030            0.037            0.031

Table 12: Power of the $\sup LR^*_{4,T}(m_a, n_a)$ and $UD\max LR^*_{4,T}$ tests for DGPs with one break in coefficients and two breaks in variance (DGP: $y_t = \mu_1 + \mu_2 1(t > T^c) + e_t$, $e_t \sim$ i.i.d.
$N(0, 1 + \delta 1(T_1^v < t \le T_2^v))$ with $\mu_1 = 0$ and $\mu_2 = \delta$; $\varepsilon = 0.1$)

Columns (1,1) and (1,2) denote $(m_a, n_a)$.

        Tc = T1v = [.3T], T2v = [.6T]   Tc = T2v = [.6T], T1v = [.3T]   Tc = [.3T], T1v = [.5T], T2v = [.6T]
δ       (1,1)   (1,2)   UDmax           (1,1)   (1,2)   UDmax           (1,1)   (1,2)   UDmax
T = 100
0.25    0.119   0.087   0.083           0.125   0.099   0.094           0.117   0.093   0.086
0.5     0.328   0.283   0.263           0.367   0.317   0.273           0.331   0.268   0.239
0.75    0.667   0.604   0.570           0.715   0.672   0.588           0.610   0.591   0.555
1       0.906   0.891   0.847           0.933   0.917   0.891           0.924   0.883   0.837
1.25    0.984   0.982   0.970           0.994   0.989   0.977           0.985   0.974   0.972
1.5     1.000   0.999   0.998           1.000   1.000   1.000           0.999   0.999   0.998
T = 200
0.25    0.162   0.131   0.123           0.192   0.164   0.142           0.158   0.128   0.109
0.5     0.610   0.583   0.518           0.686   0.662   0.597           0.598   0.510   0.468
0.75    0.958   0.948   0.921           0.970   0.964   0.946           0.958   0.924   0.889
1       1.000   0.998   0.996           1.000   0.998   0.995           1.000   0.995   0.992
1.25    1.000   1.000   1.000           1.000   1.000   1.000           1.000   1.000   1.000
1.5     1.000   1.000   1.000           1.000   1.000   1.000           1.000   1.000   1.000

        Tc = [.5T], T1v = [.3T], T2v = [.6T]   Tc = [.6T], T1v = [.3T], T2v = [.5T]
δ       (1,1)   (1,2)   UDmax                  (1,1)   (1,2)   UDmax
T = 100
0.25    0.131   0.104   0.093                  0.126   0.102   0.096
0.5     0.391   0.316   0.288                  0.368   0.306   0.277
0.75    0.734   0.683   0.631                  0.726   0.649   0.601
1       0.943   0.927   0.906                  0.929   0.916   0.894
1.25    0.994   0.995   0.985                  0.995   0.988   0.983
1.5     1.000   1.000   0.999                  1.000   1.000   0.999
T = 200
0.25    0.191   0.166   0.146                  0.189   0.161   0.138
0.5     0.698   0.667   0.605                  0.666   0.631   0.572
0.75    0.964   0.966   0.944                  0.969   0.962   0.947
1       1.000   0.998   0.998                  0.999   0.999   0.997
1.25    1.000   1.000   1.000                  1.000   1.000   1.000
1.5     1.000   1.000   1.000                  1.000   1.000   1.000

Table 13: Power of the $\sup LR^*_{4,T}(m_a, n_a)$ and $UD\max LR^*_{4,T}$ tests for DGPs with two breaks in coefficients and one break in variance (DGP: $y_t = \mu_1 + \mu_2 1(T_1^c < t \le T_2^c) + e_t$, $e_t \sim$ i.i.d. $N(0, 1 + \delta 1(t > T^v))$ with $\mu_1 = 0$ and $\mu_2 = \delta$; $\varepsilon = 0.1$).

Columns (1,1) and (2,1) denote $(m_a, n_a)$.

        T1c = Tv = [.3T], T2c = [.6T]   T1c = [.3T], T2c = Tv = [.6T]   T1c = [.5T], T2c = [.6T], Tv = [.3T]
δ       (1,1)   (2,1)   UDmax           (1,1)   (2,1)   UDmax           (1,1)   (2,1)   UDmax
T = 100
0.25    0.086   0.097   0.084           0.087   0.089   0.090           0.073   0.074   0.071
0.5     0.160   0.240   0.201           0.194   0.270   0.248           0.110   0.121   0.098
0.75    0.300   0.480   0.408           0.380   0.569   0.525           0.183   0.225   0.167
1       0.453   0.725   0.660           0.602   0.850   0.827           0.273   0.350   0.272
1.25    0.660   0.877   0.836           0.791   0.973   0.962           0.377   0.502   0.424
1.5     0.796   0.965   0.936           0.919   0.999   0.995           0.482   0.624   0.545
T = 200
0.25    0.122   0.169   0.131           0.135   0.172   0.147           0.091   0.089   0.080
0.5     0.326   0.512   0.433           0.399   0.574   0.520           0.192   0.234   0.182
0.75    0.636   0.834   0.798           0.745   0.939   0.909           0.392   0.477   0.417
1       0.871   0.978   0.959           0.948   1.000   0.993           0.600   0.724   0.667
1.25    0.967   0.999   0.996           0.992   1.000   1.000           0.775   0.884   0.851
1.5     0.995   1.000   1.000           1.000   1.000   1.000           0.881   0.949   0.950

        T1c = [.3T], T2c = [.6T], Tv = [.5T]   T1c = [.3T], T2c = [.5T], Tv = [.6T]
δ       (1,1)   (2,1)   UDmax                  (1,1)   (2,1)   UDmax
T = 100
0.25    0.089   0.094   0.086                  0.081   0.078   0.083
0.5     0.181   0.264   0.235                  0.144   0.197   0.194
0.75    0.370   0.554   0.500                  0.248   0.453   0.405
1       0.588   0.825   0.797                  0.382   0.733   0.694
1.25    0.771   0.962   0.943                  0.513   0.912   0.888
1.5     0.880   0.991   0.989                  0.623   0.981   0.971
T = 200
0.25    0.123   0.175   0.144                  0.101   0.146   0.125
0.5     0.397   0.560   0.505                  0.297   0.460   0.409
0.75    0.733   0.913   0.890                  0.540   0.848   0.813
1       0.930   0.998   0.994                  0.775   0.990   0.985
1.25    0.986   1.000   1.000                  0.911   1.000   1.000
1.5     0.997   1.000   1.000                  0.970   1.000   1.000

Table 14: Finite sample performance of the specific to general sequential procedure to select the number of breaks in coefficients and variance (DGP: $y_t = \mu_1 + \mu_2 1(t > T^c) + e_t$, $e_t \sim$ i.i.d.
N (0, 1 + δ1(t > T v )) with T = 200; ε = 0.15) m=n=0 μ2 = δ = 1 prob(m = 0, n = 0) prob(m = 0, n = 1) prob(m = 0, n = 2) prob(m = 1, n = 0) prob(m = 1, n = 1) prob(m = 1, n = 2) prob(m = 2, n = 0) prob(m = 2, n = 1) prob(m = 2, n = 2) prob(K̄ = 0) prob(K̄ = 1) prob(K̄ = 2) μ2 = δ = 2 m=n=1 T c = [0.25T ], T v = [0.75T ] μ2 = δ = 1 μ2 = δ = 2 μ2 = 1, δ = 3 0.966 0.010 0.001 0.005 0.000 0.019 0.000 0.002 0.028 0.018 0.167 0.206 0.000 0.021 0.000 0.055 0.001 0.003 0.007 0.010 0.000 0.003 0.000 0.005 0.005 0.419 0.010 0.004 0.079 0.612 0.218 0.044 0.000 0.512 0.778 0.734 0.883 0.329 0.757 0.868 0.000 0.031 0.035 0.040 0.032 0.013 0.022 0.025 0.000 0.004 0.000 0.000 0.001 0.003 0.001 0.000 0.000 0.003 0.002 0.001 0.004 0.000 0.002 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.000 0.000 0.944 0.000 0.000 0.000 0.000 0.004 0.000 0.002 0.053 0.805 0.299 0.109 0.487 0.681 0.238 0.065 0.003 0.195 0.701 0.891 0.513 0.315 0.762 0.933 m=n=1 T c = T v = [0.5T ] μ2 = δ = 1 μ2 = 1, δ = 3 prob(m = 0, n = 0) prob(m = 0, n = 1) prob(m = 0, n = 2) prob(m = 1, n = 0) prob(m = 1, n = 1) prob(m = 1, n = 2) prob(m = 2, n = 0) prob(m = 2, n = 1) prob(m = 2, n = 2) prob(K̄ = 0) prob(K̄ = 1) prob(K̄ = 2) m=n=1 T c = [0.5T ], T v = [0.7T ] μ2 = 1, δ = 3 μ2 = 1, δ = 5 μ2 = 1 m = 1, n = 0 T c = [0.5T ] μ2 = 2 m = 0, n = 1 T v = [0.5T ] δ=2 μ2 = 3 δ=1 0.005 0.000 0.000 0.001 0.000 0.354 0.032 δ=3 0.002 0.045 0.003 0.000 0.000 0.000 0.628 0.940 0.971 0.002 0.000 0.001 0.000 0.000 0.009 0.022 0.021 0.126 0.002 0.934 0.933 0.928 0.002 0.000 0.000 0.801 0.966 0.044 0.047 0.051 0.005 0.006 0.005 0.015 0.024 0.011 0.010 0.009 0.001 0.000 0.001 0.001 0.000 0.007 0.008 0.011 0.001 0.000 0.000 0.004 0.005 0.002 0.001 0.001 0.000 0.000 0.000 0.001 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.342 0.030 0.002 0.946 0.948 0.956 0.946 0.943 0.632 0.919 0.945 0.054 0.052 0.044 0.054 0.057 0.026 0.051 0.053 Note: prob(m = j, n = i) represents the probability of choosing j breaks in 
mean and i breaks in variance and prob(K̄ = j) denotes the probability of selecting j total breaks in either mean or variance. The upper bound for the total number of breaks is set to 2. Table 15: Empirical results for US macroeconomic series: outcome of the tests U DmaxLR∗4,T (2, 2) ε = 0.15 ε = 0.20 GDP Consumption durables nondurables services Investment (total) fixed investment-total nonresidential residential ∆inventory-inv/GDP Exports Imports Government spending Production goods (total) nondurable durable services structures Employment Price inflation 90-day T-bill rate 39.03c 30.75 49.65a 41.63b 45.85a 25.37 35.77c 27.72 44.22b 31.60 60.22a 56.64a 31.89 33.91c 29.16 46.18a 41.63a 45.79a 24.27 29.10 28.09 31.82c 24.92 61.96a 57.79a 28.66 25.94 31.57 34.68 40.55b 35.05 31.90 47.82a 43.82b 26.04 30.66 34.99b 36.23b 33.44c 26.96 47.82a 38.07b 10-year T-bond rate 42.84b 44.17a series SupLR ∗ 2,T Model sup (1,2|1,1) (2,1|1,1) (1,3|1,2) (2,2|1,2) (2,2|2,1) (1,1|1,0) (1,2|1,0) (2,1|2,0) (1,1|0,1) (2,1|0,1) (1,2|0,2) (1,2) LR ∗ 4,T a Sup-SEQ SupLR 3,T 39.03 12.02b 6.36 3.66 9.10 9.43c 15.68a 15.65a 11.74b 7.84 8.14 14.19 (1,1) (1,1) (1,1) 34.51a 30.17a 34.31a 3.97 2.29 1.95 8.42 11.62 14.83 2.62 1.05 0.99 11.54 11.62 15.92 1.30 2.58 1.47 29.23a 7.44c 5.47 16.27a 3.75 3.10 38.67a 5.57 6.43 8.98 25.55a 31.88a 12.31 18.55a 21.70a 10.02 14.03 30.62a (1,2) 26.39c 8.55c 9.52 4.05 10.78 8.95c 18.06a 11.80a 17.88a 3.02 4.95 3.71 (1,1) 28.19a 8.04 16.48 2.45 15.88 6.94 18.56a 11.80a 20.43a 11.97 13.19 11.54 (2,2) (2,2) 65.77a 51.39a 6.97 2.70 8.47 14.32 4.43 3.09 9.91 13.71 9.82c 1.65 18.13a 16.04a 14.84a 12.58a 21.96a 12.47a 30.71a 28.96a 19.11a 16.81b 13.14 30.17a (1,2) (2,1) (2,2) 26.22c 41.36a 41.87b 10.43b 1.79 8.05 8.63 8.63 7.59 4.06 2.24 2.79 11.14 10.69 9.52 6.18 0.20 7.89 10.82b 2.12 20.35a 9.77a 1.48 12.54a 11.25b 3.24 20.14a 4.35 26.64a 12.95 12.09 19.71a 9.36 8.70 27.71a 10.82 (2,2) (2,2) 47.82a 43.82b 11.97b 8.22 8.47 13.75 4.11 8.00 8.47 16.89 9.53c 8.14 22.68a 
9.81b 15.42a 7.24b 18.06a 7.41 6.73 16.41c 8.35 14.19 12.65 12.95 (2,2) 43.39b 19.26a 8.02 9.98c 8.02 18.18a 17.32a 13.87a 15.11a 7.52 5.24 12.04 ∗ Note: The test results are based on an AR(4) model. The sup LR4,T tests reported in the fifth column pertains to the number of breaks of the selected model as indicated in the fourth column. In all cases, the Sup-SEQ(3,2|2,2) and Sup-SEQ(2,3|2,2) tests are insignificant indicating that a third break in either the coefficients or the variance is not present. The superscripts a,b, and c indicate a statistic significant at the 1%, 5% and 10% level, respectively. For the SupLR ∗ tests, the trimming parameter used is ε = 0.15. The first 19 series are annual growth rates (i.e, 100ln (xt /xt−4 )) except for the change in inventory investment, which is the annual £ ¡ ¢ ¡ ¢¤ difference of the quarterly change in inventories as a fraction of GDP. Inflation is the four-quarter change in the annual inflation rate (i.e., 100 ln Pt /P t−1 − ln Pt−4 /P t−5 ) with Pt denoting the GDP deflator, and the two interest rates series are in four-quarter changes (i.e, xt − xt−4 ). Table 16: Empirical results for US macroeconomic series: parameter estimates SW (2002) series GDP Consumption durables nondurables services Investment (total) fixed investment-total nonresidential residential ∆inventory-inv/GDP Exports Imports Government spending Production goods (total) nondurable durable services structures Employment Price inflation 90-day T-bill rate 10-year T-bond rate Tc Tv T1c . . 1987:3 1991:4 1969:4 . . . . . . 1972:4 . 1983:2 1992:1 1987:3 . . . 1983:3 . 1983:2 1988:1 1975:4 1986:2 . . . . 1968:3 1991:3 1981:2 1973:2 1981:1 1981:1 1983:4 1983:4 1985:2 . 1984:2 1983:2 . 1984:4 1979:3 . 
σ1 σ2 σ3 σ2 σ1 σ3 σ2 σ3 σ1 0.722 0.009 0.017 0.006 1.89 0.35 0.67 0.647 0.692 0.585 0.70 0.826 0.758 0.046 0.010 0.004 0.018 0.006 0.006 0.008 0.726 0.821 0.025 0.039 0.48 0.73 -0.002 0.019 0.642 0.812 0.075 0.028 1992:1 1995:3 0.071 0.037 0.021 0.097 0.041 0.016 -0.169 0.470 0.730 -0.247 0.342 0.747 0.047 0.048 0.020 0.020 0.028 0.027 0.43 0.42 1.4 1.35 0.6 0.56 0.027 0.006 -0.001 0.556 0.334 0.539 -0.079 0.789 0.846 0.031 0.007 0.044 0.71 0.018 1.82 1.75 1.69 0.39 0.839 0.467 0.017 0.004 0.026 0.012 0.005 0.018 0.41 0.69 -0.000 0.167 0.116 0.000 -0.039 -0.111 -0.394 0.663 0.423 0.561 0.566 0.646 0.123 0.766 0.421 0.002 0.223 0.303 0.004 1.317 1.072 0.002 0.593 0.487 2 5.91 3.54 0.5 0.45 0.45 1 2.66 1.61 T1v T2v α1 α2 1968:2 1975:4 1983:1 0.019 1991:1 1992:1 1970:1 1991:1 1982:1 1977:4 1983:3 1974:3 1991:1 1983:4 1972:4 1967:1 . T2c 1992:1 1990:4 1983:2 1985:2 . 1983:3 α3 β1 β2 0.013 0.604 0.017 0.008 0.02 0.02 0.005 0.008 0.013 β3 0.39 0.6 1.5 0.018 1.53 0.37 . .1993:4 1968:3 1973:3 1973:3 1995:3 1973:3 1982:3 1982:4 1991:3 1983:3 0.01 0.032 0.016 1971:3 1967:4 1979:3 1986:1 1983:4 1986:4 1971:3 1967:4 1979:3 1986:1 1983:4 1986:4 0.001 0.089 0.045

Note: The estimation results are based on an AR(4) model with trimming parameter set as $\varepsilon = 0.15$. The first 19 series are annual growth rates (i.e., $100 \ln(x_t/x_{t-4})$) except for the change in inventory investment, which is the annual difference of the quarterly change in inventories as a fraction of GDP. Inflation is the four-quarter change in the annual inflation rate (i.e., $100[\ln(P_t/P_{t-1}) - \ln(P_{t-4}/P_{t-5})]$) with $P_t$ denoting the GDP deflator, and the two interest rate series are in four-quarter changes (i.e., $x_t - x_{t-4}$).

Table 17: Empirical results for series with variance breaks only: tests and parameter estimates

                          Consumption   ∆inventory-inv/GDP   Production goods (total)   Production nondurable
Tests
UDmax LR*1,T              8.75c         12.30b               7.90                       16.02a
Model Selected            (0,1)         (0,2)                .                          (0,1)
SupLR*1,T                 8.75c         9.97a                .                          16.02a
SupLR3,T (1,1|0,1)        11.27         7.32                 .                          10.68
SupLR3,T (2,1|0,1)        11.17         4.86                 .                          7.80
SupLR3,T (1,2|0,2)        12.11         13.14                .                          7.79
Sup-SEQ (0,2|0,1)         2.60          7.73                 .                          0.67
Sup-SEQ (0,3|0,2)         1.77          0.32                 .                          0.65
Estimates
T1v                       1983:2        1973:3               .                          1984:3
T2v                       .             1987:2               .                          .
α                         0.008         0.000                .                          0.020
β                         0.763         0.009                .                          0.638
σ1                        0.010         0.005                .                          0.050
σ2                        0.006         0.008                .                          0.023
σ3                        .             0.004                .                          .
(σ2/σ1)                   0.6           1.6                  .                          0.46
(σ3/σ2)                   .             0.5                  .                          .
(σ3/σ1)                   .             0.8                  .                          .
Stock-Watson (2002) Tv    1992:1        1988:1               1983:4                     1983:4

Note: The superscripts a, b, and c indicate a statistic significant at the 1%, 5% and 10% level, respectively.

Figure 1: Size of the Sup-LR test for a coefficient change, ignoring the variance change. (Curves: $T^v$ = [.25T], [.50T], [.75T]; horizontal axis: $\delta_1$.)

Figure 2: Power of the Sup-LR test for a coefficient change, ignoring the variance change. (Two panels: one with break dates [.3T] and [.5T], one with $T^c$ = [.3T] and $T^v$ = [.75T]; curves: $\delta_1$ = 0, 0.5, 1, 1.5, 2, 2.5, 3; horizontal axis: $\delta_2$.)

Figure 3: Size of the CUSQ test for a variance change, ignoring the coefficient change. (Curves: $T^c$ = [.25T], [.50T], [.75T]; horizontal axis: $\delta_2$.)

Figure 4: Power of the CUSQ test for a variance change, ignoring the coefficient change. (Two panels, with break dates [.5T] and [.3T], and [.75T] and [.3T]; curves: $\delta_2$ = 0, 1, 1.5, 2, 2.5, 3, 3.5; horizontal axis: $\delta_1$.)

Figure 5: Size of the two-step test for a variance change, ignoring the coefficient change. (Curves: $T^c$ = [.25T], [.50T], [.75T]; horizontal axis: $\delta_2$.)

Figure 6: Power of the two-step test for a variance change, ignoring the coefficient change. (Two panels: one with break dates [.5T] and [.3T], one with $T^c$ = [.75T] and $T^v$ = [.3T]; curves: $\delta_2$ = 0, 1, 1.5, 2, 2.5, 3, 3.5; horizontal axis: $\delta_1$.)

Figure 7: 22 Macro Time Series 1. (Panels: GDP; Consumption; Consumption-Durables; Consumption-Nondurable; Consumption-Services; Investment; Fixed Investment-Total; Fixed Investment-Nonresidential; Fixed Investment-Residential; Change in inventory investment/GDP; Exports; Imports. Horizontal axis: 1960 to 1996.)

Figure 8: 22 Macro Time Series 2. (Panels: Government Spending; Goods Production-Total; Goods Production-Durables; Goods Production-Nondurable; Production-Services; Production-Construction; NonAgricultural Employment; GDP Deflator; 90-Day T-bill Rate; 10-Year T-bond Rate. Horizontal axis: 1960 to 1996.)
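As a numerical illustration of two ingredients used throughout the simulation tables, the following minimal Python sketch draws errors from the ARCH(1) design stated in the captions of Tables 1, 2, 6 and 11 and computes the "i.i.d." correction $\hat{\psi} = \hat{\mu}_4/\hat{\sigma}^4 - 1$ defined in the note to Table 3. It is only a sketch under stated assumptions, not the code used for the reported results: the function names `simulate_arch1` and `psi_iid` are hypothetical, the first observation is assumed to use $h_0$ as its conditional variance, and the residuals passed to `psi_iid` are assumed to have been obtained under the null hypothesis beforehand.

```python
import numpy as np


def simulate_arch1(T, gamma, delta1=1.0, rng=None):
    """Draw e_t = u_t * sqrt(h_t) with h_t = delta1 + gamma * e_{t-1}^2.

    Assumption: the first observation uses h_0 = delta1 / (1 - gamma),
    the unconditional variance, as its conditional variance.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(T)           # u_t ~ i.i.d. N(0, 1)
    e = np.empty(T)
    h = delta1 / (1.0 - gamma)           # h_0
    for t in range(T):
        e[t] = u[t] * np.sqrt(h)
        h = delta1 + gamma * e[t] ** 2   # conditional variance for t + 1
    return e


def psi_iid(resid):
    """i.i.d. correction: mu4_hat / sigma_hat^4 - 1 (note to Table 3)."""
    resid = np.asarray(resid, dtype=float)
    sigma2 = np.mean(resid ** 2)         # sigma_hat^2 = T^{-1} sum u_t^2
    mu4 = np.mean(resid ** 4)            # mu4_hat = T^{-1} sum u_t^4
    return mu4 / sigma2 ** 2 - 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(12345)
    for gamma in (0.0, 0.3, 0.5):
        e = simulate_arch1(100_000, gamma, rng=rng)
        print(f"gamma = {gamma:.1f}: psi_hat = {psi_iid(e):.3f}")
```

With $\gamma = 0$ the errors are i.i.d. normal and the estimate should be close to 2, the value imposed in the "NC" columns of Tables 3 and 4; with $\gamma > 0$ the unconditional kurtosis of the errors is larger, so the estimate would typically exceed 2.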