Forecasting Financial Time Series: Normal GARCH with Outliers or Heavy Tailed Distribution Assumptions?
Christoph HARTZ
University of Munich
Marc S. PAOLELLA
Established at the initiative of the Swiss Bankers' Association, the Swiss Finance Institute is a private foundation funded by the Swiss banks and Swiss Stock Exchange. It merges 3 existing foundations: the International Center FAME, the Swiss Banking School and the Stiftung "Banking and Finance" in Zurich. With its university partners, the Swiss Finance Institute pursues the objective of forming a competence center in banking and finance commensurate to the importance of the Swiss financial center. It will be active in research, doctoral training and executive education while also proposing activities fostering interactions between academia and the industry. The Swiss Finance Institute supports and promotes promising research projects in selected subject areas. It develops its activity in complete symbiosis with the NCCR FinRisk.
The National Centre of Competence in Research Financial Valuation and Risk Management (FinRisk) was launched in 2001 by the Swiss National Science Foundation (SNSF). FinRisk constitutes an academic forum that fosters cutting-edge finance research, education of highly qualified finance specialists at the doctoral level and knowledge transfer between finance academics and practitioners. It is managed from the University of Zurich and includes various academic institutions from Geneva, Lausanne, Lugano, St.Gallen and Zurich. For more information see www.nccr-finrisk.ch .
This paper can be downloaded without charge from the Swiss Finance Institute Research Paper Series hosted on the Social Science Research Network electronic library at:
http://ssrn.com/abstract=1658362
Christoph Hartz^a
Marc S. Paolella^{b,c,*}

^a Department of Statistics, University of Munich, Munich, Germany
^b Department of Banking and Finance, University of Zurich, Switzerland
^c Swiss Finance Institute

September 2011
Abstract
GARCH models are widely used as an effective method for capturing the volatility clustering inherent in financial returns series. The residuals from such models are, however, often non-Gaussian, and two methods suggest themselves for dealing with this: outlier removal, or use of non-Gaussian innovation distributions. While there are benefits to both, we show that the latter method is better if interest centers on volatility and value-at-risk prediction. New volatility measures based on OHLC (open-high-low-close) data are derived and used. Use of OHLC measures is shown to be superior to use of the naive estimator used in other GARCH outlier studies.
Corresponding author. E-mail address: marc.paolella@bf.uzh.ch. Part of this research has been carried out within the National Centre of Competence in Research Financial Valuation and Risk Management (NCCR FINRISK), which is a research program supported by the Swiss National Science Foundation.
1 Introduction
The growth of interest in modeling and forecasting financial time series continues unabated, both in academic and financial institutions, with a steady proliferation of models being proposed and tested for use in the prediction of volatility and Value-at-Risk, hereafter VaR. It is virtually undisputed that returns of financial time series sampled at weekly or higher frequencies deviate markedly from being independently and identically distributed (iid), most notably in the form of (i) blatant volatility clustering, which is also reflected by high autocorrelation of absolute and squared returns; (ii) substantial kurtosis, i.e., the density of the unconditional return distribution is more peaked around the center and possesses much fatter tails than the normal density; and (iii) mild skewness. With respect to the first of these, use of (generalized) autoregressive conditional heteroscedasticity (GARCH) models and their numerous extensions constitutes by far the most popular modeling method. While they are indeed quite capable of accounting for the volatility clustering and, thus, a significant amount of the excess kurtosis (compared to the normal distribution) and, for asymmetric GARCH generalizations, some of the skewness, the residuals of fitted GARCH models are most often still leptokurtic and skewed. Because of this, it is commonplace to allow the innovations process in the GARCH model to be an iid sequence of random variables from a distributional class which still contains normality as a special case, but is otherwise fat-tailed and asymmetric.

A different way which is, in some respects, more in line with the traditional statistical approach to handling aberrant observations not in accord with the normal distribution, is to classify them as outliers and remove them from the analysis. Several authors have contributed to this idea. Franses and Ghijsels (1999) proposed an Additive Outlier GARCH model, in short AO-GARCH, driven by normal innovations.
Their method is based on that of Chen and Liu (1993) for detecting different types of outliers in time series and is described in more detail in Section 2 below. Their model was extended to support innovative outliers, also based on the Chen and Liu (1993) approach, by Charles and Darné (2005a); see also Charles and Darné (2005b) for further analysis of this model. Alternative approaches, involving outlier removal either before or after applying a GARCH filter, are discussed in Carnero et al. (2001). We concentrate herein on the method in Franses and Ghijsels (1999). They fit their model to the returns on four weekly European stock indices and demonstrated that volatility forecasts are more accurate with the AO-GARCH model than when using the normal-GARCH or, more importantly, a GARCH model driven by Student's t innovations with (jointly) estimated degrees of freedom parameter, denoted t-GARCH. This finding would have important consequences for risk management, given the predominance of t-GARCH and other variants for deriving VaR predictions.
If the true data generating process (DGP) is (or, realistically, is closer to) a normal-GARCH process with occasional additive outliers, then it is certainly reasonable to expect that an AO-GARCH model with estimated parameters will result in better out-of-sample forecasts than alternative models such as t-GARCH. However, the validity of concluding the converse, i.e., that improved forecasting ability of model A over model B implies that model A is a better description of the DGP than model B, strongly depends on the method of forecast measurement. In the case of Franses and Ghijsels (1999), they compare the predicted volatility of the competing models with the absolute return (demeaned by a long-term average) as a proxy for the realized volatility. This type of realized volatility measure has, unfortunately, been shown to be highly inaccurate by Andersen and Bollerslev (1998). This simplistic, faulty measure was also used by Park (2002) to demonstrate superiority of his outlier-robust GARCH model for exchange rate returns. Ideally, intra-day transactions would be used to construct volatility measures, because they have been shown to be an excellent method of eliciting the true, unobservable, daily volatility (Andersen et al., 2003). For the data used in our study below, intra-day prices are not available, so we instead follow Parkinson (1980) and Wiggins (1991) and make use of the OHLC prices (open-high-low-close) to obtain better predictions of the true volatility.

Another point of contention with Franses and Ghijsels (1999) is that their simulation studies consist only of estimating and detecting outliers in the GARCH(1,1) model from simulated, outlier-infected GARCH data and subsequently verifying that the procedure is indeed able to successfully label and remove most of the actual generated outliers.
While this is important and useful, they do not study the arguably more important issue of the behavior of their model and detection procedure, and the quality of the ensuing forecasts, assuming a data generating process which closely mimics the stylized facts of financial time series. In particular, one could consider data generated from a GARCH-type model with fat-tailed innovations such as from the generalized asymmetric Student's t, or GAt (Mittnik and Paolella, 2000), the asymmetric stable Paretian (see Doganoglu et al., 2007; Broda et al., 2011; and the references therein), or the generalized hyperbolic (see Broda and Paolella, 2009; Yang, 2011; Paolella and Polak, 2011; and the references therein). In such a case, one would expect the outlier-based method to perform poorly, because it will ultimately remove the extreme data points, which are precisely those which become relevant for understanding and modeling the tail behavior of the distribution and, consequently, for prediction of risk.

The aim of this paper is to show, via the use of more appropriate measures of volatility and the use of several real data sets, that the AO-GARCH method is inferior to the use of fat-tailed
GARCH models with respect to volatility forecasting and VaR prediction. An added plus to this conclusion is that the AO-GARCH model is, relative to other GARCH models, far more time-intensive to estimate, and also involves the use of a user-decided tuning parameter for the threshold of outlier determination which is decisive for its performance.

The remainder of this paper is as follows. In Section 2, the models under comparison are briefly introduced, while Section 3 presents the time series used for comparison and discusses the in-sample results for both models. In Section 4, the out-of-sample performance of both models with respect to volatility forecasting and VaR prediction is discussed in detail. Section 5 concludes.
Let the observed time series be denoted by r_t, t = 1, ..., T, where r_t = 100(ln P_t − ln P_{t−1}) is the usual definition of the one-period return of the asset with price P_t at time t. The normal-GARCH(p, q) model assumes

    r_t = μ + ε_t,   ε_t = z_t σ_t,                                        (1)
    σ_t² = c_0 + Σ_{i=1}^{p} c_i ε²_{t−i} + Σ_{j=1}^{q} d_j σ²_{t−j},      (2)

where z_t ~ iid N(0, 1). For the vast majority of applications, p = q = 1 is deemed adequate. For the data sets considered in Section 3, this choice was also found to be appropriate.

As discussed in the Introduction, a popular method of augmenting the normal-GARCH model to accommodate the typically observed non-normality of the residuals is simply to replace the normal distribution with one which can exhibit asymmetry and tails fatter than those of the normal. We use the GAt distribution, which generalizes Student's t distribution by adding an additional shape and an asymmetry parameter. In particular, its density is given by

    f_GAt(z; d, ν, θ) = K × [1 + (−zθ)^d / ν]^{−(ν + 1/d)},   if z < 0,
                        K × [1 + (z/θ)^d / ν]^{−(ν + 1/d)},   if z ≥ 0,    (3)

with K^{−1} = (θ + θ^{−1}) d^{−1} ν^{1/d} B(d^{−1}, ν). The density is asymmetric for θ ≠ 1 and coincides with the usual Student's t density with m degrees of freedom when ν = m/2, d = 2 and θ = 1. The raw integer moments are given by

    E[Z^r] = [(−1)^r θ^{−(r+1)} + θ^{r+1}] / (θ + θ^{−1}) · ν^{r/d} B((r+1)/d, ν − r/d) / B(1/d, ν),    (4)

for r < νd; see Paolella (2007) for derivation and further details of the GAt.
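To make the GAt specification concrete, the following sketch (our own illustration, not code from the paper; the parameter values are arbitrary) evaluates the density in (3) and checks it numerically against the moment formula in (4):

```python
import math

def beta_fn(a, b):
    # complete beta function B(a, b)
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def gat_pdf(z, d, nu, theta):
    """GAt density, eq. (3); theta = 1 gives the symmetric case."""
    K = d / ((theta + 1.0 / theta) * nu ** (1.0 / d) * beta_fn(1.0 / d, nu))
    if z < 0:
        return K * (1.0 + (-z * theta) ** d / nu) ** (-(nu + 1.0 / d))
    return K * (1.0 + (z / theta) ** d / nu) ** (-(nu + 1.0 / d))

def gat_moment(r, d, nu, theta):
    """Raw integer moment E[Z^r], eq. (4); requires r < nu*d."""
    sign_part = ((-1) ** r * theta ** (-(r + 1)) + theta ** (r + 1)) / (theta + 1.0 / theta)
    return sign_part * nu ** (r / d) * beta_fn((r + 1) / d, nu - r / d) / beta_fn(1.0 / d, nu)

# crude numerical sanity checks on a wide grid
d, nu, theta = 2.0, 2.0, 0.9
h = 0.01
zs = [-40.0 + i * h for i in range(8001)]
total = sum(h * gat_pdf(z, d, nu, theta) for z in zs)   # should be ~1
mean = sum(h * z * gat_pdf(z, d, nu, theta) for z in zs)  # should match eq. (4) with r=1
print(round(total, 3), round(mean, 3), round(gat_moment(1, d, nu, theta), 3))
```

With θ < 1, the density is left-skewed, so the first moment comes out negative, consistent with the negative skewness of the fitted residuals reported below.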
We now briefly describe the AO-GARCH model proposed by Franses and Ghijsels (1999), which is based on the method of Chen and Liu (1993) for detecting additive outliers in ARMA time series. Let time series y_t be given by the ARMA(r, s) model φ_r(L) y_t = θ_s(L) ε_t, where ε_t is a white noise process and the roots of φ_r(L) and θ_s(L) lie outside the unit circle. This series is not observed, but rather y*_t, where

    y*_t = y_t + ω I_t(τ),    (5)

with I_t(τ) = 1 for t = τ and zero otherwise. That is, the observed series is an ARMA process plus an additive outlier. By modeling the observed series, y*_t, with an ARMA(r, s) model, the estimated residuals ê_t = π(L) y*_t = ε_t + ω π(L) I_t(τ) are obtained, with π(L) = φ(L)/θ(L), from which the estimated impact, ω̂(τ), can be elicited from the regression ê_t = ω x_t + ε_t, with x_t = 0 for t < τ, x_t = 1 for t = τ, and x_{τ+k} = −π_k for k = 1, 2, .... Outlier detection involves testing the significance of the ω̂(τ). Chen and Liu (1993) propose three ways of estimating the residual standard deviation, σ̂_a, of which Franses and Ghijsels (1999) use the omit-one method, i.e., use the sample standard deviation, omitting the observation at t = τ. The test statistic they obtain is given by

    τ̂(τ) = [ω̂(τ)/σ̂_a] [Σ_{t=τ}^{T} x_t²]^{1/2},    (6)

where the impact of an AO is deemed significant if |τ̂(τ)| exceeds the critical value C, which is set equal to 4 by Franses and Ghijsels (1999). If the impact is significant, then the observation y*_t can be adjusted to obtain the AO-corrected y_t = y*_t − ω̂ I_t(τ).
Franses and Ghijsels (1999) apply the aforementioned AO correction for ARMA models to the normal-GARCH(1,1) model by defining ν_t = ε_t² − σ_t² and rewriting the GARCH equation as

    ε_t² = c_0 + (c_1 + d_1) ε²_{t−1} + ν_t − d_1 ν_{t−1}.    (7)

This corresponds to an ARMA(1,1) model for ε_t², to which the AO correction is applied. After estimating a GARCH(1,1) model for the original series ε_t, they calculate ν̂_t = ε_t² − σ̂_t² and π(L) = (1 − (c_1 + d_1)L)/(1 − d_1 L) and perform the regression on the estimated residuals ν̂_t for every t = τ to obtain ω̂(τ) and the test statistic τ̂(τ). Then, ε_t is replaced by ε̃_t for that t = τ with the largest value of τ̂(τ) > C = 4. Next, the series ε̃²_t = ν̃_t + σ̂²_t, with ν̃_t the AO-corrected residual series, is constructed to calculate the AO-corrected series, where

    ε̃_t = ε_t,                    for t ≠ τ,
    ε̃_t = sign(ε_t) (ε̃²_t)^{1/2},  for t = τ.    (8)

For the corrected series, ε̃_t, the procedure of estimating the normal GARCH(1,1) model and then correcting for AO is repeated until no test statistic exceeds the critical value C = 4. The estimated GARCH(1,1) model for the final AO-corrected series ε̃_t is then used for forecasting purposes. In the subsequent empirical analysis, we follow Franses and Ghijsels (1999) and use C = 4. It is, however, important to realize that estimation and forecasting results for the AO-GARCH model are highly dependent on the choice of C, and no objective criterion for choosing a correct value of C has been proposed.
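A single pass of this detection loop can be illustrated as follows. This is a simplified sketch of our own (the GARCH parameters are treated as known rather than re-estimated, and only one detection pass is shown): simulate a normal-GARCH(1,1) series, contaminate one squared innovation with an additive outlier of size ω, and locate it via the regression statistic in (6), using the fact that a unit AO at time τ shifts the filtered residual ν_t by 1 at τ and by −c_1 d_1^{k−1} at τ + k:

```python
import math, random

random.seed(1)
T, c0, c1, d1 = 1000, 0.02, 0.08, 0.90
omega_true, tau_true = 100.0, 600   # hypothetical AO in the squared innovations

# simulate a normal-GARCH(1,1) path: sigma2_{t+1} = c0 + c1*eps2_t + d1*sigma2_t
sigma2 = c0 / (1.0 - c1 - d1)       # start at the unconditional variance
eps2 = []
for _ in range(T):
    e2 = random.gauss(0.0, 1.0) ** 2 * sigma2
    eps2.append(e2)
    sigma2 = c0 + c1 * e2 + d1 * sigma2
eps2[tau_true] += omega_true        # inject the additive outlier

# filter the contaminated series: residuals nu_t = eps2_t - sigma2_t
s2 = c0 / (1.0 - c1 - d1)
nu = []
for e2 in eps2:
    nu.append(e2 - s2)
    s2 = c0 + c1 * e2 + d1 * s2

# impact pattern of a unit AO at tau on subsequent residuals
w = [1.0] + [-c1 * d1 ** (k - 1) for k in range(1, 60)]
sw2 = sum(x * x for x in w)
sd = math.sqrt(sum(v * v for v in nu) / len(nu))  # crude residual scale

best_tau, best_stat = -1, 0.0
for tau in range(T - len(w)):
    omega_hat = sum(w[k] * nu[tau + k] for k in range(len(w))) / sw2
    stat = omega_hat / sd * math.sqrt(sw2)        # test statistic, as in eq. (6)
    if abs(stat) > abs(best_stat):
        best_tau, best_stat = tau, stat
print(best_tau, round(best_stat, 1))
```

With a large injected ω, the statistic peaks sharply at the contaminated index and far exceeds the critical value C = 4.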
3 Data Analysis and In-Sample Results

The data used in our study are based on the daily prices of five major European stock indices: the German DAX, the Dutch AEX, the French CAC40, the Italian MIBTEL and the British FTSE100. While the ending periods of each coincide (last trading day in 2001), the starting dates differ because of data availability and are, respectively, Jan. 1992, Jan. 1993, Jan. 1991, Jan. 1994 and Jan. 1992. Figure 1 plots the returns for each of the five series, from which the volatility clustering is quite blatant. The usual measures of sample skewness and kurtosis were also computed; each series exhibits negative skewness, i.e., has a longer left tail, and a sample kurtosis which is considerably larger than the appropriate value of three for the normal distribution.¹

With respect to computing time, estimation of the GAt-GARCH model takes about five times longer than normal-GARCH. Time for estimation of the AO-GARCH model with normal innovations depends mainly on the number of outliers detected, as each detected outlier invokes a normal-GARCH estimation and T regressions for detecting subsequent outliers. This is, in turn, dictated by the choice of tuning parameter C. Roughly speaking, with C = 4, we found about 5% of the returns under investigation to be outliers, so that the AO-GARCH model takes more than 0.05T times longer to estimate than the normal-GARCH model. (All programs were written in Matlab and compiled to increase execution speed.)

Because of the nature of the AO-GARCH method of fitting a model to the data, it does not lend itself to likelihood-based comparisons. In particular, after the outliers are chosen and removed, the resulting fit is, essentially by construction, excellent. As such, for in-sample comparisons, we will study the choice of outliers removed by the AO-GARCH method, and argue
¹ Standard errors and classical inference methods to assess the significance of the deviation of these statistics from their null values corresponding to normality are not reliable in this context because the data are clearly not iid, and the existence of low moments is not at all certain. As such, the statistics are not presented.
Figure 1: Return series of the five indices (DAX, AEX, CAC40, MIBTEL, FTSE100).
that it is inappropriate for financial return data. Table 1 presents the estimated parameters and several test statistics for the AO-GARCH model applied to each of the five time series under study. For every index there are two columns, labeled 0 and AO, with 0 referring to no outlier detection and use of just the usual normal-GARCH model, and AO referring to the final iteration after full AO detection and correction. For all five series, the mean of the standardized residuals as well as the mean parameter is not
Table 1: Estimation results for the normal-GARCH and the AO-GARCH models^a

          DAX (0/AO)        AEX (0/AO)        CAC40 (0/AO)      MIBTEL (0/AO)     FTSE100 (0/AO)
mu        0.067 / 0.064     0.070 / 0.078     0.041 / 0.059     0.053 / 0.039     0.043 / 0.043
          (0.020) (0.019)   (0.019) (0.018)   (0.022) (0.019)   (0.027) (0.025)   (0.016) (0.015)
c0        0.023 / 0.009     0.016 / 0.010     0.054 / 0.008     0.092 / 0.048     0.010 / 0.004
          (0.006) (0.004)   (0.005) (0.004)   (0.015) (0.005)   (0.023) (0.017)   (0.004) (0.002)
c1        0.091 / 0.059     0.101 / 0.065     0.066 / 0.028     0.145 / 0.076     0.065 / 0.039
          (0.012) (0.009)   (0.013) (0.013)   (0.012) (0.007)   (0.020) (0.015)   (0.010) (0.009)
d1        0.897 / 0.934     0.891 / 0.924     0.900 / 0.965     0.814 / 0.890     0.925 / 0.955
          (0.012) (0.011)   (0.013) (0.016)   (0.019) (0.009)   (0.024) (0.024)   (0.012) (0.011)
QLB10     6.25 / 8.53       15.12 / 8.76      5.66 / 10.14      6.65 / 16.94      6.41 / 9.90
          (0.79) (0.58)     (0.13) (0.56)     (0.84) (0.43)     (0.76) (0.08)     (0.78) (0.45)
# AO      99 (3.9%)         118 (5.2%)        128 (4.6%)        74 (3.7%)         152 (6.0%)
# AO-     51 (2.0%)         62 (2.7%)         71 (2.6%)         35 (1.7%)         77 (3.1%)
# AO+     48 (1.9%)         56 (2.5%)         57 (2.1%)         39 (1.9%)         75 (3.0%)
R(AO)     0.06              0.00              0.00              0.00              0.00
R(+/-)    0.92              0.00              0.00              0.00              0.00

^a For every index there are two columns, labeled 0 for the normal-GARCH model without any outlier detection and AO for the final iteration after full outlier detection and correction. Standard errors of the parameter estimates are given in parentheses; QLB10 is the Ljung-Box test statistic of order 10 for the squared standardized residuals (p-values in parentheses). # AO, # AO-, and # AO+ give the number of outliers, negative outliers, and positive outliers detected and corrected by the AO-GARCH model; R(AO) and R(+/-) give the p-values for the runs-test statistics for the sequence of outliers and the sequence of positive and negative outliers, respectively.
significantly affected by the outlier correction. This, and the number of outliers of either sign (# AO+ and # AO-), indicates that the method is equally likely to detect a negative or a positive outlier. For the fitted AO-GARCH model, the constant term of the GARCH equation, c_0, and the ARCH term, c_1, are smaller for all five indices in comparison to the normal-GARCH model, whereas the GARCH parameter d_1 is greater after the outlier correction. This finding is in line with Figure 2, where the corrected return series are plotted. It is obvious that the AO correction removes all of the peaks of the original series. By cutting these peaks, any information about the DGP contained in these extreme returns is lost. If these observations are indeed genuine outliers resulting from, say, incorrect data entry, or (far more likely) correspond to information arrivals via political or economic events which are truly exceptional and will not have any relevance for future modeling, then the procedure apparently succeeds admirably in their removal.
Figure 2: AO-corrected return series for the five indices (DAX, AEX, CAC40, MIBTEL, FTSE100).
The standardized residuals of the normal-GARCH model still exhibit some of the properties of the return series themselves. For all five indices, they have negative skewness and higher kurtosis than implied by the normal distribution. After applying the AO correction, the negative skewness is reduced and the kurtosis drops below the anticipated value of three for the normal distribution, which leads to the conjecture that the critical value C = 4 may be too small, so that too many returns are corrected by the procedure. Hence, the AO correction works as expected: all observations not in line with the assumed model and error distribution are corrected.

As indicated by the Ljung-Box test statistic of order 10 for the squared standardized residuals (QLB10) and the corresponding p-values, there is no heteroscedasticity left in the standardized residuals of the normal-GARCH model for any of the five time series. In comparison to the normal-GARCH model, the QLB10 test statistic for the AO-GARCH model rises for all indices except the AEX. For the MIBTEL index, QLB10 = 16.94, with p-value 0.08, indicating that there is still some heteroscedasticity remaining in the standardized AO-GARCH residuals.

The occurrence of the detected and removed outliers from the AO-GARCH model is illustrated in Figure 3. From the plot, it is strikingly evident that their positions are not randomly distributed throughout the sample, but rather occur in clusters. This undermines the presumption that outliers are sprinkled throughout the data in a random, unpatterned fashion, and suggests instead that the AO-GARCH model is removing observations occurring in groups of returns with larger than average heteroscedasticity.
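For reference, the LB10/QLB10 statistics used throughout are ordinary Ljung-Box statistics. A minimal sketch (our own, on synthetic data rather than the paper's series):

```python
import random

def ljung_box(x, m=10):
    """Ljung-Box Q statistic of order m; approximately chi-square(m)
    under the null of no serial correlation."""
    T = len(x)
    mu = sum(x) / T
    denom = sum((v - mu) ** 2 for v in x)
    q = 0.0
    for k in range(1, m + 1):
        rk = sum((x[t] - mu) * (x[t - k] - mu) for t in range(k, T)) / denom
        q += rk * rk / (T - k)
    return T * (T + 2) * q

random.seed(2)
iid = [random.gauss(0.0, 1.0) for _ in range(2000)]   # no serial correlation
ar = [0.0]
for _ in range(2000):
    ar.append(0.9 * ar[-1] + random.gauss(0.0, 1.0))  # strongly autocorrelated
print(round(ljung_box(iid), 2), round(ljung_box(ar[1:]), 2))
```

Applied to squared standardized residuals, as here, the statistic tests for remaining ARCH effects.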
It is hardly tenable that these observations are outliers in the sense that their consideration would only corrupt the underlying signal in the data; rather, they are an intimate part of the time-varying volatility phenomenon, containing information both about the GARCH parameters and, separate from any particular parametric model, about the actual probability of extreme events. To statistically test whether the observations deemed to be outliers occur at random or are clustered, define a new time series {B_t} with Boolean elements, depending on whether an outlier is detected or not, i.e., B_t = 1 if an outlier is detected, and zero otherwise. The runs test statistic

    R(AO) = (s − 2T p_0 p_1) / (2 √T p_0 p_1)  ~asy  N(0, 1)    (9)
can then be applied, where p_0 and p_1 are the proportions of zeros and ones, respectively, and s is the number of runs.² We also test for the random occurrence of positive and negative outliers. Let {S_t} be the time series such that S_t = sign(r_t) B_t. With p_{−1} and p_1 the proportions of negative and positive
² See Paolella (2006, Sec. 6.3) for the exact distribution of this and related runs statistics, original references, further details and applications.
Figure 3: Outlier occurrence: a cross (x) marks the occurrence of an outlier of either sign, whereas a plus (+) marks the same outliers, but differentiated by sign. So, the cross series sums the plus series over signs.
outliers, respectively, and p_0 and s defined as before, the test statistic is given by

    R(±) = [s − T(1 − Σ_{i=−1}^{1} p_i²)] / {T [Σ_{i=−1}^{1} p_i² − 2 Σ_{i=−1}^{1} p_i³ + (Σ_{i=−1}^{1} p_i²)²]}^{1/2}  ~asy  N(0, 1).    (10)
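Both runs statistics are easy to compute directly from the outlier indicator series; the following sketch (our own notation) implements the asymptotic forms in (9) and (10):

```python
import math

def runs_stat_binary(b):
    """Asymptotic runs test, eq. (9): b is a 0/1 sequence; under randomness
    the statistic is approximately standard normal."""
    T = len(b)
    p1 = sum(b) / T
    p0 = 1.0 - p1
    s = 1 + sum(1 for t in range(1, T) if b[t] != b[t - 1])  # number of runs
    return (s - 2.0 * T * p0 * p1) / (2.0 * math.sqrt(T) * p0 * p1)

def runs_stat_signed(seq):
    """Three-category runs test, eq. (10): seq has values in {-1, 0, 1}."""
    T = len(seq)
    p = [seq.count(v) / T for v in (-1, 0, 1)]
    s = 1 + sum(1 for t in range(1, T) if seq[t] != seq[t - 1])
    l2 = sum(x * x for x in p)
    l3 = sum(x ** 3 for x in p)
    var = T * (l2 - 2.0 * l3 + l2 * l2)
    return (s - T * (1.0 - l2)) / math.sqrt(var)

# perfectly alternating outliers: far more runs than expected under randomness
print(round(runs_stat_binary([0, 1] * 50), 2))
# heavily clustered outliers: far fewer runs than expected
print(round(runs_stat_binary([0] * 80 + [1] * 20), 2))
```

A large negative statistic, as in the clustered example, is exactly the pattern found for the detected AOs in Figure 3.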
With the single exception of the DAX index, the statistical significance of the resulting test statistics provides evidence that the occurrence of the outliers is non-random and that useful information about the process is lost by removing the outliers.

Table 2: Estimation results for the GAt-GARCH model^a

            DAX               AEX               CAC40             MIBTEL            FTSE100
mu          0.178 (0.035)     0.203 (0.032)     0.179 (0.045)     0.097 (0.049)     0.133 (0.030)
c0          0.022 (0.009)     0.022 (0.008)     0.055 (0.020)     0.133 (0.039)     0.014 (0.006)
c1          0.133 (0.021)     0.154 (0.024)     0.101 (0.019)     0.202 (0.035)     0.109 (0.025)
d1          0.913 (0.013)     0.901 (0.015)     0.919 (0.016)     0.840 (0.026)     0.930 (0.011)
d           5.879 (0.000)     5.991 (0.187)     5.968 (0.299)     5.979 (0.338)     5.981 (3.068)
nu          2.019 (0.061)     2.060 (0.090)     2.029 (0.084)     2.016 (0.094)     2.100 (0.455)
theta       0.922 (0.020)     0.903 (0.021)     0.924 (0.023)     0.966 (0.025)     0.926 (0.022)
mean       -0.097 (-0.098)   -0.117 (-0.122)   -0.098 (-0.094)   -0.042 (-0.041)   -0.094 (-0.091)
var         0.610 (0.601)     0.591 (0.587)     0.607 (0.595)     0.606 (0.595)     0.574 (0.570)
skewness   -0.235 (-0.180)   -0.398 (-0.217)   -0.387 (-0.172)   -0.249 (-0.077)   -0.074 (-0.159)
kurtosis    4.350 (3.762)     4.133 (3.671)     5.499 (3.720)     5.355 (3.725)     4.086 (3.576)
LB10       14.96 (0.13)      13.21 (0.21)      19.21 (0.04)      26.46 (0.00)      25.61 (0.00)
QLB10       4.53 (0.92)      13.47 (0.20)       5.16 (0.88)       6.96 (0.73)       6.76 (0.75)

^a Rows 1 to 7 contain the parameter estimates (standard errors in parentheses); the remaining rows give statistics for the standardized residuals, with the theoretical values implied by the fitted GAt in parentheses for the mean, variance, skewness and kurtosis, and p-values in parentheses for LB10 (QLB10), the Ljung-Box test statistic of order 10 for the (squared) standardized residuals.
We now turn to the analysis of the fitted GAt-GARCH models. The estimation results are given in Table 2. Denote by z_t the residuals of the estimated model. If the GARCH model is appropriate, then the z_t should be approximately iid with zero location, unit scale, and density given by the GAt density with the estimated values of the shape and skewness parameters (these having been jointly estimated with the GARCH parameters). As for the normal-GARCH and AO-GARCH models, the values for the mean and the variance are close to the theoretical values computed from (4) with the estimated parameter values. But, in addition, the GAt distribution is able to capture parts of the skewness and the excess kurtosis. Figure 4 shows the kernel and fitted probability density functions of the z_t. It is visually clear that the model fits the data quite well.
Figure 4: Fitted GAt density (solid) vs. empirical kernel density (dashed) for the standardized residuals.
4 Out-of-Sample Forecasting Results

4.1 Volatility Forecasting

Judging the quality of a forecast of the volatility obviously necessitates a benchmark value against which it can be compared. As discussed in detail by Andersen and Bollerslev (1998), the commonly used proxy for the true unobservable daily volatility,

    v_t^s = |r_t − r̄|,   r̄ = (1/T) Σ_{t=1}^{T} r_t,    (11)
is a poor measure of the true daily volatility owing to its enormous variance (although it is unbiased). This is, unfortunately, the measure employed by Franses and Ghijsels (1999). Andersen and Bollerslev investigate and recommend the use of intra-day returns for measuring daily volatility. However, because of the limited availability of such data, we propose four different volatility measures, each based on the widely available OHLC (open-high-low-close) data for stock prices, to approximate the realized volatility.

Figure 5: Diagrammatic occurrence of OHLC prices: p_t^o (p_t^h, p_t^l and p_t^c) denotes the logarithmic open (high, low and close) price at time t.
The first approach is to use the four available prices per day to construct six possible intra-day returns, i.e., the returns that may occur by intra-day trading using the limited information set of the OHLC prices. These artificial returns are then used in conjunction with the usual method for calculating a sample standard deviation, i.e.,

    v_t^a = [(1/6) Σ_{(x,y)∈A} (r_t^{(x,y)} − E[r_t])²]^{1/2},   A = {(o,c), (o,h), (o,l), (h,l), (h,c), (l,c)},    (12)

where r_t^{(x,y)} = 100 |p_t^y − p_t^x|, for (x,y) ∈ A, are the absolute values of the six artificial returns, and E[r_t] denotes their sample mean.
The second approach also uses the four available prices of one day, as well as the closing price of the previous day, to derive four artificial returns in terms of inter-day trading with limited information, i.e., the returns that may occur when trading takes place between the closing price of the day before and one of the four possible prices today. As before, the standard deviation of these returns is used as a proxy for today's realized volatility:

    v_t^e = [(1/4) Σ_{(x,y)∈E} (r_t^{(x,y)} − E[r_t])²]^{1/2},   E = {(c,o), (c,h), (c,l), (c,c)},    (13)

where r_t^{(x,y)} = 100 |p_t^y − p_{t−1}^x|, with (x,y) ∈ E, are the absolute values of the four possible inter-day returns.

A third way to calculate the daily volatility is a mixture of the two previously mentioned approaches, which we refer to as the mixed volatility measure:

    v_t^m = [(1/10) Σ_{(x,y)∈M} (r_t^{(x,y)} − E[r_t])²]^{1/2},   M = A ∪ E,    (14)

where the r_t^{(x,y)} are the ten possible intra- and inter-day returns as defined before.
The three aforementioned approaches are straightforward and obviously simplistic in construction. The fourth approach we consider was proposed in Garman and Klass (1980) for estimating the volatility constant of a diffusion process for security prices by the use of OHLC prices. They derive six different estimators for the volatility, one of which we use, given by

    v_t^x = [a ((p_t^o − p_{t−1}^c) · 100)² / f + (1 − a) ((p_t^h − p_t^l) · 100)² / ((1 − f) · 4 ln 2)]^{1/2},    (15)

where f is the fraction of the day during which no trading takes place, that is, the time between the close of day t − 1 and the opening of day t. For the empirical applications below, f = 0.6 is used, which turned out to be a good approximation for the different values of f across the different indices. Finally, a = 0.17 is a constant which minimizes the variance of the estimator, independent of f (this is the σ̂₃² estimator of Garman and Klass, 1980).

The five volatility approximations described above are used to judge the forecasting accuracy of the two competing GARCH models. For both models, 1000 one-step-ahead volatility forecasts are constructed for all five indices, with an estimation period of 250 days (about one year of trading data). To make the comparison more realistic, the model parameters were updated for each t = T − 1000, ..., T − 1. Note that such an exercise is extremely time-consuming when using the AO-correction method.
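The OHLC measures in (12)-(15) are straightforward to compute; the following sketch (our own, with made-up prices) collects all four for a single day:

```python
import math

def ohlc_vols(o, h, l, c, c_prev, f=0.6, a=0.17):
    """Volatility proxies (12)-(15) from one day's log prices.
    o, h, l, c: today's log open/high/low/close; c_prev: yesterday's log close."""
    p = {"o": o, "h": h, "l": l, "c": c}
    A = [("o", "c"), ("o", "h"), ("o", "l"), ("h", "l"), ("h", "c"), ("l", "c")]
    intra = [100.0 * abs(p[y] - p[x]) for x, y in A]        # six intra-day returns
    inter = [100.0 * abs(p[y] - c_prev) for y in ("o", "h", "l", "c")]  # four inter-day

    def sd(r):  # standard deviation around the mean of the artificial returns
        m = sum(r) / len(r)
        return math.sqrt(sum((x - m) ** 2 for x in r) / len(r))

    v_a = sd(intra)           # eq. (12)
    v_e = sd(inter)           # eq. (13)
    v_m = sd(intra + inter)   # eq. (14)
    # eq. (15): Garman-Klass style estimator with overnight fraction f
    v_x = math.sqrt(a * (100.0 * (o - c_prev)) ** 2 / f
                    + (1 - a) * (100.0 * (h - l)) ** 2 / ((1 - f) * 4 * math.log(2)))
    return v_a, v_e, v_m, v_x

# hypothetical day: open 100, high 102, low 99, close 101, previous close 100
vols = ohlc_vols(*[math.log(x) for x in (100, 102, 99, 101)], math.log(100))
print([round(v, 3) for v in vols])
```

Each proxy is on the scale of a daily percentage return, so all four can be compared directly against one-step-ahead GARCH volatility forecasts.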
The three popular error criteria, mean squared error (MSE), median squared error (MedSE) and mean absolute percentage error (MAPE), were used to judge the performance. In addition, we ran a regression analogous to that of Andersen and Bollerslev (1998) of the form

    v_f = β₀ + β₁ v̂_f + u_f,   f = 1, ..., 1000,    (16)

with v_f being the realized volatility measured by one of the volatility proxies, and v̂_f the one-step-ahead prediction of volatility from one of the models. Note that v̂_f equals σ̂_f for the AO-GARCH model (with normal innovations) and v̂_f = σ̂_f (E[(Z − E[Z])²])^{1/2} for the GAt distribution. For a model which provides correct one-step volatility predictions, the parameters β₀ and β₁ are expected to be 0 and 1, respectively.

The results for the standard volatility proxy (11), as well as for the mixed and the extreme value approximations, are given in Tables 3 to 5. (The results for the intra- and inter-day approximations are similar to those obtained with the mixed volatility approximation, so we do not present them here, to save space.) Beginning with the standard volatility proxy v_t^s in (11), we see that, comparatively speaking, it is by far the worst: the low R² value is in agreement with the results reported in Andersen and Bollerslev (1998) and is the lowest R² of all competing proxies across both models and all indices. Similarly, v_t^s exhibits the largest error criteria values; in the case of the MAPE, it is on average 10 times higher than for v_t^m in (14) and about 15 times higher than the MAPE for v_t^x in (15).
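The evaluation regression (16) together with the error criteria can be sketched as follows (our own illustration on synthetic forecasts; with correct forecasts, β₀ ≈ 0 and β₁ ≈ 1):

```python
import random, statistics

def mz_regression(realized, forecast):
    """OLS fit of realized = b0 + b1 * forecast, as in eq. (16)."""
    n = len(realized)
    mf = sum(forecast) / n
    mr = sum(realized) / n
    sxx = sum((f - mf) ** 2 for f in forecast)
    sxy = sum((f - mf) * (r - mr) for f, r in zip(forecast, realized))
    b1 = sxy / sxx
    b0 = mr - b1 * mf
    return b0, b1

def error_criteria(realized, forecast):
    """MSE, MedSE and MAPE of the volatility forecasts."""
    se = [(r - f) ** 2 for r, f in zip(realized, forecast)]
    mse = sum(se) / len(se)
    medse = statistics.median(se)
    mape = sum(abs(r - f) / r for r, f in zip(realized, forecast)) / len(se)
    return mse, medse, mape

# synthetic example: realized volatility equals the forecast plus small noise
random.seed(0)
forecast = [1.0 + random.random() for _ in range(1000)]
realized = [f + random.gauss(0.0, 0.05) for f in forecast]
b0, b1 = mz_regression(realized, forecast)
mse, medse, mape = error_criteria(realized, forecast)
print(round(b0, 2), round(b1, 2))
```

Systematic over- or under-prediction shows up as β₁ deviating from one, which is exactly the pattern distinguishing the two models in Tables 3 to 5.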
Table 3: Forecasting results for the standard volatility approximation (11)

        DAX              AEX              CAC40            MIBTEL           FTSE100
        (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)
β0      0.341 / 0.083    0.210 / 0.015    0.283 / 0.113    0.331 / 0.074    0.073 / 0.005
        (0.145) (0.104)  (0.121) (0.081)  (0.234) (0.130)  (0.141) (0.082)  (0.142) (0.107)
β1      1.144 / 0.745    1.119 / 0.776    1.147 / 0.719    1.196 / 0.739    0.834 / 0.790
        (0.102) (0.063)  (0.099) (0.053)  (0.185) (0.087)  (0.112) (0.053)  (0.131) (0.085)

For every index there are two columns, labeled AO for the AO-GARCH model and GAt for the GAt-GARCH model. Values β0 and β1 are the estimated regression coefficients (standard errors in parentheses).
The most accurate proxy for realized volatility seems to be the extreme value estimator from Garman and Klass (1980), v_t^x, in (15). It results in the highest R², the lowest MAPE and, with two exceptions, the lowest MedSE of all proxies. For the MSE, it takes second place, with a slight preference being given to the volatility approximations in equations (12) to (14). Thus, the
Table 4: Forecasting results for the mixed volatility approximation (14)

        DAX              AEX              CAC40            MIBTEL           FTSE100
        (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)
β0      0.363 / 0.090    0.260 / 0.008    0.610 / 0.123    0.341 / 0.135    0.024 / 0.045
        (0.095) (0.067)  (0.076) (0.048)  (0.146) (0.077)  (0.089) (0.050)  (0.094) (0.069)
β1      1.249 / 0.820    1.209 / 0.820    1.494 / 0.955    1.241 / 0.727    0.985 / 0.840
        (0.067) (0.041)  (0.063) (0.032)  (0.116) (0.051)  (0.071) (0.032)  (0.087) (0.055)

For every index there are two columns, labeled AO for the AO-GARCH model and GAt for the GAt-GARCH model. Values β0 and β1 are the estimated regression coefficients (standard errors in parentheses).
Table 5: Forecasting results for the extreme value volatility approximation (15)

        DAX              AEX              CAC40            MIBTEL           FTSE100
        (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)         (AO/GAt)
β0      0.450 / 0.150    0.336 / 0.013    0.854 / 0.177    0.450 / 0.193    0.006 / 0.101
        (0.116) (0.082)  (0.093) (0.058)  (0.181) (0.094)  (0.104) (0.058)  (0.112) (0.082)
β1      1.650 / 1.081    1.587 / 1.078    2.026 / 1.284    1.622 / 0.936    1.355 / 1.102
        (0.081) (0.050)  (0.077) (0.038)  (0.143) (0.063)  (0.083) (0.037)  (0.103) (0.065)

For every index there are two columns, labeled AO for the AO-GARCH model and GAt for the GAt-GARCH model. Values β0 and β1 are the estimated regression coefficients (standard errors in parentheses).
simple estimators v_t^a, v_t^e and v_t^m proposed herein are seen to be competitive with v_t^x from Garman and Klass (1980). (We elected not to report the results based on the most efficient extreme value estimator proposed by Garman and Klass (1980) because the one used here has a straightforward interpretation, is related to the volatility estimator of Parkinson (1980), and leads to qualitatively similar results.)
While interesting in themselves, the different volatility approximations are not our main goal, which is rather to compare the volatility forecasts of the competing GARCH models. The volatility forecasts of both models, along with the realized volatility measured by the extreme value estimator v_t^x, are plotted in Figure 6. We see that the AO-GARCH model is not able to predict the higher fluctuations of the realized volatility: its volatility forecasts are very smooth and slowly varying. While this might be appealing from an applied point of view, in the sense
[Figure 6: One-step-ahead volatility forecasts for the indices (panels include CAC40 and MIBTEL; horizontal axis: forecast days 100 to 1000). Underlying grey line: realized volatility, measured by the extreme value approximation v^x. Dashed (black) line: volatility forecast of the AO-GARCH model. Solid (blue) line: volatility forecast of the GAt-GARCH model.]
that the frequency of portfolio weight adjustment (and the associated costs) is lessened, the predictions are simply too poor and would not pass the tests required by the Basle Accord. The behavior of the forecasts is what one would expect based on the results of the runs tests in Section 3. In particular, as the extreme values are not randomly dispersed throughout the data, they are informative for risk prediction, but the method removes all such extraordinary peaks of the original series and labels them as outliers. In contrast, the volatility forecasts of the GAt-GARCH model are able to capture the higher fluctuations of the volatility process and are, thus, considerably more accurate.
With regard to the error criteria, the conclusion as to which model is best at predicting future volatility depends on the chosen volatility approximation. For the standard approximation in (11), as well as for the intra-day measure in (12), the inter-day measure in (13) and the mixed volatility measure in (14), the AO-GARCH model yields smaller error criteria values for all indices (with two exceptions for the intra-day and one exception for the mixed approximation). For the extreme value approximation, v_t^x in (15), we draw the opposite conclusion: the GAt-GARCH model is preferred for all indices by the MSE, for 4 out of 5 indices by the MedSE, and for 2 out of 5 indices by the MAPE.
The conclusion is, however, unequivocal when based on the regression results: irrespective of the chosen volatility approximation, the R² value is highest for the GAt-GARCH model for all indices. Furthermore, the GAt-GARCH model not only leads to higher R² values, but the number of regression coefficients which are insignificantly different from their expected values is also higher for all the volatility measures, with the exception of the standard approximation, v_t^s. Based on the most accurate volatility approximation, v_t^x, 4 (3) of the β0 (β1) estimates are insignificantly different from their expected values for the GAt-GARCH model, whereas for the AO-GARCH model, all parameter estimates are found to be significantly in violation of the null hypothesis.
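For reference, the classic Garman and Klass (1980) extreme value estimator can be computed from open-high-low-close data as in the sketch below. Note that the estimator v_t^x in (15) is a simplified variant related to Parkinson (1980), so the textbook formula shown here is only an assumed illustration of the general approach, not the paper's exact proxy:

```python
import numpy as np

def garman_klass_vol(open_, high, low, close):
    """Classic Garman-Klass (1980) daily variance estimate from OHLC prices:
    0.5*ln(H/L)^2 - (2 ln 2 - 1)*ln(C/O)^2; returns its square root (volatility)."""
    hl = np.log(high / low)
    co = np.log(close / open_)
    var = 0.5 * hl ** 2 - (2.0 * np.log(2.0) - 1.0) * co ** 2
    return np.sqrt(np.maximum(var, 0.0))  # guard against tiny negative values

# toy day: open 100, high 102, low 99, close 101
gk = garman_klass_vol(np.array([100.0]), np.array([102.0]),
                      np.array([99.0]), np.array([101.0]))
print(gk)
```

Because it exploits the intra-day range rather than only close-to-close returns, such an estimator is far less noisy than the squared-return proxy, which is what drives the sharper regression results above.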
4.2 VaR-Forecasts
Besides predicting future volatility, GARCH models can be, and are, used to measure the downside risk inherent in a given financial position, i.e., the Value-at-Risk (VaR) of a financial position. The VaR is defined to be the maximum loss of a financial position that may occur within a given number of days for a given probability λ. For our purposes, the VaR is the cut-off point below which the next day's return will not fall with probability 1 − λ. It is obtained by inverting the equation

    Pr(r_{t+1} ≤ VaR_{t+1}(λ)) = λ.   (17)
When using a GARCH model for predicting the VaR, the calculation is done using the conditional density f_{t+1|t}(r_{t+1} | θ̂; r_t, r_{t−1}, . . . ), where θ̂ refers to the vector of estimated model parameters and f(·) is given by the assumed error distribution. As for the volatility predictions above, 1000 one-step-ahead forecasts of the conditional return densities for the five indices are evaluated for both models under investigation. Again, an
estimation period of 250 days is used for both models, and the model parameters are updated for every forecast. To compare the accuracy of the VaR predictions, the empirical tail probabilities

    λ̂ = (1/1000) Σ_{t=T−1000}^{T−1} I(r_{t+1} ≤ VaR_{t+1}(λ))   (18)

are calculated for three different probability levels, λ = 0.01, λ = 0.025 and λ = 0.05, where I is the indicator function. When based on the true distribution at time t + 1, the empirical tail probability will coincide, in terms of the long-run average, with the given probability level. If the observed probability is higher (lower) than λ, then the model tends to underestimate (overestimate) the risk of the index returns, i.e., the implied absolute VaR-values tend to be too small (large).
The results for the different indices and probabilities are given in Table 6. As expected from the results of forecasting volatility, the AO-GARCH model is not able to measure the VaR correctly. By detecting and correcting the time series for anomalous observations and, thus, removing the information contained in them, the AO-GARCH model suffers when attempting to predict the tails of the distribution. For all three probability levels, the AO-GARCH model underestimates the VaR of all time series, by a factor of 1.7 for the 5% level and a factor of 3.3 for the 1% level. On the other hand, the GAt-GARCH model works quite well, in agreement with the findings in Mittnik and Paolella (2000). For all probability levels, the empirical tail probabilities are close to the given probability levels, i.e., the conditional modeling of volatility, in conjunction with a more flexible return distribution, enhances the ability to draw conclusions about the future tail probabilities of returns.

Table 6: Empirical tail probabilities

                 DAX         AEX         CAC40       MIBTEL      FTSE100
5%   (AO/GAt)    9.0 / 5.4   8.4 / 4.5   8.3 / 5.2   9.4 / 6.4   8.8 / 5.2
2.5% (AO/GAt)    5.3 / 2.5   4.7 / 2.3   5.5 / 2.7   5.7 / 3.3   6.1 / 2.4
1%   (AO/GAt)    2.5 / 0.7   3.0 / 1.0   3.2 / 0.7   2.7 / 1.2   3.3 / 1.0

Observed tail probabilities λ̂ from (18), multiplied by 100. For a correctly specified model, we expect λ̂ ≈ λ.
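The backtest in (18) amounts to counting VaR violations over the forecast period. A minimal sketch (the arrays `returns` and `var_forecasts` are hypothetical; `var_forecasts` holds the predicted λ-quantiles of each next day's return):

```python
import numpy as np

def empirical_tail_prob(returns, var_forecasts):
    """Fraction of days on which the realized return falls at or below the
    predicted VaR quantile -- the empirical counterpart of lambda in (18)."""
    return np.mean(returns <= var_forecasts)

# sanity check: for i.i.d. N(0,1) returns and the true 5% quantile,
# violations should occur on roughly 5% of days
rng = np.random.default_rng(1)
returns = rng.standard_normal(100_000)
var_forecasts = np.full_like(returns, -1.6448536269514722)  # 5% quantile of N(0,1)
print(empirical_tail_prob(returns, var_forecasts))
```

A model that systematically understates risk would produce a violation rate well above λ, which is exactly the pattern the AO-GARCH column of Table 6 exhibits.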
5 Conclusions
We compare two possible ways to accommodate the well-documented stylized fact of non-normal, heteroskedastic financial returns for the purpose of risk prediction. One way involves systematically removing aberrant values; the other is to use distributions which allow for explicit modeling of the observed features of the data. The differences between the two ways can be summarized as follows.
The AO-GARCH model is successful in identifying and correcting the original time series for additive outliers. For all the time series investigated in this paper, however, the occurrence of the outliers reveals strong systematic patterns, against the assumption that outliers occur at random. Instead of being labeled as outliers, such aberrant values appear to be chronic in financial return series and form an important part of the analysis and prediction of risk.
The volatility forecasts obtained from the two competing models are very different, although it is hard to judge which model leads to better forecasts when judging is based on the standard volatility approximation and standard error criteria. The results are far clearer if more efficient volatility approximations are used, as proposed herein.
References

Andersen, T. G. and Bollerslev, T. (1998). Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts. International Economic Review, 39(4), 885–905.

Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2003). Modeling and Forecasting Realized Volatility. Econometrica, 71(2), 579–625.

Broda, S. A., Haas, M., Krause, J., Paolella, M. S., and Steude, S.-C. (2011). Stable Mixture GARCH Models. Swiss Finance Institute Occasional Paper Series No. 11-39.

Broda, S. A. and Paolella, M. S. (2009). CHICAGO: A Fast and Accurate Method for Portfolio Risk Calculation. Journal of Financial Econometrics, 1, 1–25.

Carnero, M. A., Peña, D., and Ruiz, E. (2001). Outliers and Conditional Autoregressive Heteroscedasticity in Time Series. Revista Estadística, 53, 143–213.

Charles, A. and Darné, O. (2005a). Outliers and GARCH Models in Daily Financial Data. Economics Letters, 86, 347–352.

Charles, A. and Darné, O. (2005b). Relevance of Detecting Outliers in GARCH Models for Modelling and Forecasting Financial Data. Finance: Revue de l'Association Française de Finance, 26(1), 33–71.

Chen, C. and Liu, L.-M. (1993). Joint Estimation of Model Parameters and Outlier Effects in Time Series. Journal of the American Statistical Association, 88(421), 284–297.

Doganoglu, T., Hartz, C., and Mittnik, S. (2007). Portfolio Optimization when Risk Factors are Conditionally Varying and Heavy Tailed. Computational Economics, 29(3-4), 333–354.

Franses, P. H. and Ghijsels, H. (1999). Additive Outliers, GARCH and Forecasting Volatility. International Journal of Forecasting, 15, 1–9.

Garman, M. B. and Klass, M. J. (1980). On the Estimation of Security Price Volatilities from Historical Data. Journal of Business, 53(1), 67–78.

Mittnik, S. and Paolella, M. S. (2000). Conditional Density and Value-at-Risk Prediction of Asian Currency Exchange Rates. Journal of Forecasting, 19, 313–333.

Paolella, M. S. (2006). Fundamental Probability: A Computational Approach. Wiley, Chichester.

Paolella, M. S. (2007). Intermediate Probability: A Computational Approach. Wiley, Chichester.

Paolella, M. S. and Polak, P. (2011). MARC-MARS: Modeling Asset Returns via Conditional Multivariate Asymmetric Regime-Switching. Submitted.

Park, B.-J. (2002). An Outlier Robust GARCH Model and Forecasting Volatility of Exchange Rate Returns. Journal of Forecasting, 21, 381–393.

Parkinson, M. (1980). The Extreme Value Method for Estimating the Variance of the Rate of Return. Journal of Business, 53(1), 61–65.

Wiggins, J. B. (1991). Empirical Tests of the Bias and Efficiency of the Extreme-Value Variance Estimator for Common Stocks. Journal of Business, 64(3), 417–432.

Yang, M. (2011). Volatility Feedback and Risk Premium in GARCH Models with Generalized Hyperbolic Distributions. Studies in Nonlinear Dynamics & Econometrics, 15(3).
c/o University of Geneva 40 bd du Pont d'Arve 1211 Geneva 4 Switzerland T +41 22 379 84 71 F +41 22 379 82 77 RPS@sfi.ch www.SwissFinanceInstitute.ch