
The Risk Premia Embedded in Index Options

Torben G. Andersen, Nicola Fusari and Viktor Todorov

CREATES Research Paper 2014-56

Department of Economics and Business
Aarhus University
Fuglesangs Allé 4
DK-8210 Aarhus V
Denmark
Email: oekonomi@au.dk
Tel: +45 8716 5515
The Risk Premia Embedded in Index Options∗
Torben G. Andersen† Nicola Fusari‡ Viktor Todorov§

December 2014

Abstract
We study the dynamic relation between market risks and risk premia using time series of index
option surfaces. We find that priced left tail risk cannot be spanned by market volatility (and
its components) and introduce a new tail factor. This tail factor has no incremental predictive
power for future volatility and jump risks, beyond current and past volatility, but is critical in
predicting future market equity and variance risk premia. Our findings suggest a wide wedge
between the dynamics of market risks and their compensation, with the latter typically displaying
a far more persistent reaction following market crises.

Keywords: Option Pricing, Risk Premia, Jumps, Stochastic Volatility, Return Predictability,
Risk Aversion, Extreme Events.
JEL classification: C51, C52, G12.


∗ Andersen gratefully acknowledges support from CREATES, Center for Research in Econometric Analysis of Time
Series (DNRF78), funded by the Danish National Research Foundation. Todorov’s work was partially supported by
NSF Grant SES-0957330. We are grateful to Snehal Banerjee, Geert Bekaert, Peter Carr, Anna Cieslak, Bjorn
Eraker, Kay Giesecke, Ravi Jagannathan, Bryan Kelly (our discussant at the NBER Meeting), Robert Merton,
Toby Moskowitz, Lasse Pedersen, Sergio Rebelo, Myron Scholes, Ivan Shaliastovich (our discussant in Montreal),
Allan Timmermann, Jonathan Wright, Liuren Wu as well as seminar participants at Kellogg School of Management,
Northwestern University, Duke University, the “Montreal 2013 Econometrics Conference: Time Series and Financial
Econometrics,” the 40th Annual Meeting of the Danish Econometric Society, Sandbjerg, Denmark, the European
University Institute, Florence, 2013 Workshop on “Measuring and Modeling Financial Risk and High Frequency
Data”, the 2013 “International Conference in Financial Econometrics” at Shandong University, Jinan, China, the “40
Years after the Black-Scholes-Merton Model” conference at the Stern School of Business, October 2013, the NBER
Asset Pricing Meeting at Stanford University, November 2013, Georgia Tech, Boston University, Stanford University,
the AQR Insight Award, Greenwich, CT, April 2014, the 2014 Annual SoFiE Meeting, Toronto, Canada, the 2014
Australasian Econometric Society Meetings, Tasmania, Australia, the 2014 NBER Summer Institute Forecasting and
Empirical Methods Meeting, and the 2014 EFA meeting, Lugano, Switzerland, for helpful comments.

† Department of Finance, Kellogg School of Management, Northwestern University, Evanston, IL 60208; e-mail:
t-andersen@northwestern.edu.
‡ The Johns Hopkins University Carey Business School, Baltimore, MD 21202; e-mail: nicola.fusari@jhu.edu.
§ Department of Finance, Kellogg School of Management, Northwestern University, Evanston, IL 60208; e-mail:
v-todorov@northwestern.edu.

1 Introduction

Equity markets are subject to pronounced time-variation in volatility as well as abrupt shifts,
or jumps. Moreover, these risk features are related in intricate ways, inducing complex equity
return dynamics. Hence, the markets are incomplete and derivative securities, written on the equity
index, are non-redundant assets. This partially rationalizes the rapid expansion in the trading of
contracts offering distinct exposures to volatility and jump risks. From an economic perspective,
it suggests that derivatives data contain important information regarding the risk and risk pricing
of the underlying asset. Indeed, recent evidence, exploiting parametric models, e.g., Christoffersen
et al. (2012) and Santa-Clara and Yan (2010), or nonparametric techniques, e.g., Bollerslev and
Todorov (2011), finds the pricing of jump risk, implied by option data, to account for a significant
fraction of the equity risk premium.
Standard no-arbitrage and equilibrium-based asset pricing models imply a tight relationship
between the dynamics of the options and the underlying asset. This arises from the assumptions
concerning the pricing of risk in the no-arbitrage setting and the endogenous pricing kernels implied
by the equilibrium models. A prominent example is the illustrative double-jump model of Duffie
et al. (2000) in which the return volatility itself follows an affine jump diffusion. In this context,
the entire option surface is governed by the evolution of market volatility, i.e., the dynamics of all
options is driven by a single latent Markov (volatility) process.
Recent empirical evidence reveals, however, that the dynamics of the option surface is far more
complex. For example, the term structure of the volatility index, VIX, shifts over time in a manner
that is incompatible with the surface being driven by a single factor, see, e.g., Johnson (2012).
Likewise, Bates (2000) documents that a two-factor stochastic volatility model for the risk-neutral
market dynamics provides a significant improvement over a one-factor version. Moreover, Bollerslev
and Todorov (2011) find that even the short-term option dynamics cannot be captured adequately
by a single factor as the risk-neutral tails display independent variation relative to market volatility,
thus driving a wedge between the dynamics of the option surface and the underlying asset prices.
The objective of the current paper is to characterize the risk premia, implied by the large
panel of S&P 500 index options, and their relation to the aggregate market risks in the economy.
As discussed in Andersen et al. (2013), the option panel contains rich information both for the
evolution of volatility and jump risks and their pricing. Consequently, we let the option data speak
for themselves in determining the risk premium dynamics and discriminating among alternative
hypotheses regarding the source of variation in risk as well as risk pricing.
The standard no-arbitrage approach starts by estimating a parametric model for the evolution of
the underlying asset price. Risk premia are then introduced through a pricing kernel which implies
that risk compensation is obtained through parameter shifts. This ensures, conveniently, that
the risk-neutral dynamics remains within the same parametric class entertained for the statistical
measure. However, this approach tends to tie the equity market and option surface dynamics closely
together. In particular, the equity risk premia are typically linear in volatility. In contrast, we find
the options to display risk price variation that is largely unrelated to, and effectively unidentifiable
from, the underlying asset prices alone.
This motivates our “reverse” approach of directly estimating a parametric model for the
risk-neutral dynamics exclusively from option data along with no-arbitrage restrictions based on
nonparametric model-free volatility measures constructed from high-frequency data on the underlying
asset. In this manner, we avoid letting a (possibly misspecified) parametric structure for the
P-dynamics impact the identification of option risk premia. Our goal is to synthesize the option
surface dynamics in a low-dimensional state vector without imposing ad hoc restrictions based on
the actual return dynamics, and then proceed to explore the risk premia dynamics by combining
the extracted state vector with high- and low-frequency data on the equity index.
Following Andersen et al. (2013), we specify a general parametric model for the risk-neutral
return dynamics that allows for a separate left tail jump factor to impact the volatility surface.
Simultaneously, we include two distinct volatility factors and accommodate co-jumps between
returns and volatility as well as return asymmetries induced by (negative) correlation between both
diffusive and jump innovations. Moreover, we explore both Gaussian and double-exponential
specifications for the jump distributions. As such, we incorporate all major features stressed in prior
empirical option pricing studies and allow for various novel features. In particular, we model the
tail factor as purely jump driven, with one component jointly governed by the volatility jumps
while another is independent of spot volatility. This feature allows the jump intensity to escalate
– through so-called cross-excitation of the jumps – in periods of crises when price and volatility
jumps are prevalent, thus amplifying the response of the jump intensity to major (negative)
market shocks. The extended model remains within the popular class of affine jump-diffusion models
of Duffie et al. (2000) and exemplifies the flexibility of such models for generating intricate, yet
analytically tractable, dynamic interactions between volatility and jump risks.
Of course, any tractable and parsimonious parametric model is bound to suffer from some
degree of misspecification. What is crucial for our analysis, however, is to avoid systematic biases
in representing the information embedded in the option panel. We do this by allowing for a flexible
state vector driving different components of the conditional risk-neutral return distribution. Most
importantly, by introducing the left tail factor, we capture systematic variation in the corresponding
part of the option surface which is missed by traditional model specifications. Thus, one can view
the time series realizations of our novel tail factor as a succinct quantification of dynamic features
not accommodated by existing parametric asset pricing models.
Relative to Andersen et al. (2013), the system is generalized to allow the left tail factor to enter
directly into the spot volatility process. In addition, all three state variables may impact the jump
intensities. Consequently, we can explicitly test for the presence of the tail factor in volatility,
and we can gauge the significance of the different state variables in driving separately the positive
and negative jump intensities. Inference for the general model is feasible through the approach
developed in Andersen et al. (2013). However, we modify the criterion function by including a term
minimizing, not the squared, but the relative squared option pricing error across the sample. This
reduces the weight assigned to turbulent periods where the bid-ask spreads increase sharply.
The estimation of the general system establishes that the left tail factor is an insignificant
contributor to spot volatility, even if it is correlated with the level of volatility. Moreover, it has
no presence in the right jump tail, while the left jump intensity is exclusively governed by the
tail factor and the more volatile and less persistent of the volatility components. Overall, the tail
factor improves the characterization of the option surface dynamics very significantly, both in- and
out-of-sample. In particular, the new model no longer systematically undervalues out-of-the-money
put options following crises. Given the much improved fit to the option surface, our extended model
provides a more suitable basis for studying the dynamics of the market risk premia.
Methodologically, the presence of the separately evolving tail factor implies that part of the risk
premium dynamics cannot be captured by the volatility state variables driving the underlying asset
price dynamics. This suggests that the tail factor may have predictive power for risk premia over
and above spot volatility. This is, indeed, what we find. Our tail factor is important for forecasting
the variance risk premium in conjunction with one of the volatility components. In addition, the
tail factor is significant in predicting excess market returns for horizons up to one year, while the
volatility factors are insignificant. These findings rationalize why the variance risk premium is a
superior return predictor relative to volatility itself, as documented in Bollerslev et al. (2009). The
key is the presence of a separate factor driving the left jump tail of the risk-neutral distribution.
At the same time, the tail factor provides substantially better return forecasts than the variance
risk premium, or any other standard return predictor, over our sample.
Importantly, while the new tail factor has predictive power for risk premia, it contains no
incremental information regarding the future evolution of volatility and jump risks for the underlying
asset relative to the traditional volatility factors. Hence, our findings indicate that option markets
embody critical information about the market risk premia and their dynamics which is essentially
unidentifiable from stock market data alone. Moreover, the option surface dynamics contains
information that can improve the modeling and forecasting of future volatility and jump risks, but
such applications necessitate an initial untangling of the components in the risk premia that are not
part of the volatility process. Overall, our empirical results suggest that there is a wedge between
the stochastic evolution of risks in the economy and their pricing, with the latter typically having
a far more persistent response to (negative) tail events than the former.
Our finding of a substantial wedge between the dynamics of the option and stock markets
presents a challenge for structural asset pricing models. Specifically, the standard
exponentially-affine equilibrium models with a representative agent equipped with Epstein-Zin preferences imply
that the ratio of the risk-neutral and statistical jump tails is constant. In contrast, the new
factor, extracted from the option data, drives the risk-neutral left jump tail but has no discernible
impact on the statistical jump tail. We conjecture that this wider gap between fundamentals
and asset prices may be accounted for through an extension of the preferences via some form of
time-varying risk aversion and/or ambiguity aversion towards extreme downside risk.
The rest of the paper is organized as follows. Section 2 describes the data. Section 3 presents
our three-factor model which, apart from the modeling of the jump distributions, encompasses most
existing models in the empirical literature. Section 4 introduces the estimation methodology and
discusses the parameter estimates. Section 5 explores model fit and presents different robustness
checks. This analysis brings out some of the mechanisms behind the improved fit of our model.
Section 6 is dedicated to an out-of-sample analysis. In Section 7 we exploit the estimation results
to study the risk premium dynamics and its implication for return and variance predictability.
Our findings are contrasted to corresponding predictability results implied by popular structural
equilibrium models. Section 8 concludes. Details on various aspects of the analysis are collected in
an Appendix. A number of additional robustness checks and further details on the estimation are
provided in a Supplementary Appendix.

2 Data and Preliminary Analysis

We use European style S&P 500 equity-index (SPX) options traded at the CBOE. We exploit the
closing bid and ask prices reported by OptionMetrics, applying standard filters and discarding all
in-the-money options, options with time-to-maturity of less than 7 days, as well as options with zero
bid prices. For all remaining options, we compute the mid bid-ask Black-Scholes implied volatility
(IV). The data spans January 1, 1996–April 23, 2013. It is further divided into an in-sample
period covering January 1, 1996–July 21, 2010, and an out-of-sample period consisting of July 22,
2010–April 23, 2013. Following earlier empirical work, e.g., Bates (2000) and Christoffersen et al.
(2009), we sample every Wednesday.1 The in-sample period includes 760 trading days, and the
estimation is based on an average of 234 bid-ask quotes per day. The out-of-sample period contains
142 trading days and features a much higher number of active quotes at any point in time, so we
exploit an average of 708 option contracts daily from this sample. The nonparametric estimate of
volatility used for penalizing the objective function below is constructed from one-minute data on
the S&P 500 futures covering the time span of the options. The same data are used to construct
measures of volatility and jump risks for the predictive regressions in Section 7. Finally, we employ
the returns on the SPY ETF traded on the NYSE, which tracks the S&P 500 index portfolio, and
the 3-month T-bill rate to proxy for the risk-free rate, when implementing these regressions.
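
The sample selection described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used for the paper: the `Quote` record and its field names are hypothetical, the forward price is taken as given, and the unspecified "standard filters" are not reproduced.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Quote:
    # Hypothetical record layout; field names are assumptions for illustration.
    trade_date: date
    is_call: bool
    strike: float
    forward: float        # futures price for the option's expiry date
    days_to_expiry: int
    bid: float
    ask: float


def is_otm(q: Quote) -> bool:
    """OTM convention of the text: calls for K > F, puts for K <= F."""
    return q.strike > q.forward if q.is_call else q.strike <= q.forward


def keep(q: Quote) -> bool:
    """OTM only, at least 7 days to maturity, nonzero bid, Wednesday quotes."""
    return (
        is_otm(q)
        and q.days_to_expiry >= 7
        and q.bid > 0.0
        and q.trade_date.weekday() == 2  # Monday is 0, so Wednesday is 2
    )


# Made-up quotes, one per filter; July 21, 2010 is a Wednesday.
quotes = [
    Quote(date(2010, 7, 21), False, 1000.0, 1070.0, 30, 5.0, 5.4),   # OTM put, kept
    Quote(date(2010, 7, 21), True, 1000.0, 1070.0, 30, 80.0, 81.0),  # ITM call
    Quote(date(2010, 7, 21), False, 1000.0, 1070.0, 3, 0.5, 0.7),    # under 7 days
    Quote(date(2010, 7, 21), True, 1300.0, 1070.0, 30, 0.0, 0.1),    # zero bid
    Quote(date(2010, 7, 22), False, 1000.0, 1070.0, 29, 5.0, 5.4),   # Thursday
]
kept = [q for q in quotes if keep(q)]
```

Only the first quote survives all four filters in this toy example.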

2.1 The Option Panel

We denote European-style out-of-the-money (OTM) option prices for the asset $X$ at time $t$ by
$O_{t,k,\tau}$. Assuming frictionless trading in the options market and denoting with $Q$ the risk-neutral
measure, the option prices at time $t$ are given as,
\[
O_{t,k,\tau} =
\begin{cases}
E_t^{Q}\!\left[ e^{-\int_t^{t+\tau} r_s\,ds}\,(X_{t+\tau} - K)^{+} \right], & \text{if } K > F_{t,t+\tau}, \\
E_t^{Q}\!\left[ e^{-\int_t^{t+\tau} r_s\,ds}\,(K - X_{t+\tau})^{+} \right], & \text{if } K \le F_{t,t+\tau},
\end{cases}
\tag{2.1}
\]
where $\tau$ is the tenor of the option, $K$ is the strike price, $F_{t,t+\tau}$ is the futures price for the underlying
asset at time $t$ referring to date $t + \tau$, for $\tau > 0$, $k = \ln(K/F_{t,t+\tau})$ is the log-moneyness, and $r_t$ is
the instantaneous risk-free interest rate. Finally, we denote the annualized Black-Scholes implied
volatility corresponding to the option price $O_{t,k,\tau}$ by $\kappa_{t,k,\tau}$. This merely represents an alternative
notational convention, as the Black-Scholes implied volatility is a strictly monotone transformation
of the ratio $e^{r_{t,t+\tau}} O_{t,k,\tau} / F_{t,t+\tau}$, where $r_{t,t+\tau}$ denotes the risk-free rate over the period $[t, t + \tau]$.
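
To illustrate the monotone mapping between the normalized option price and the implied volatility, the following sketch prices an OTM option on the forward with the Black (1976) form of the Black-Scholes formula and inverts it by bisection. The function names and the bisection bounds are our own illustrative choices; the text does not specify how the inversion is carried out.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_otm_price(k: float, tau: float, sigma: float) -> float:
    """Forward-normalized OTM option price, i.e. the ratio e^{r} O / F.

    k = ln(K/F) is the log-moneyness and tau is the tenor in years.
    A call is priced for k > 0 and a put for k <= 0.
    """
    s = sigma * math.sqrt(tau)
    d1 = (-k + 0.5 * s * s) / s
    d2 = d1 - s
    if k > 0:
        return norm_cdf(d1) - math.exp(k) * norm_cdf(d2)
    return math.exp(k) * norm_cdf(-d2) - norm_cdf(-d1)


def implied_vol(price: float, k: float, tau: float,
                lo: float = 1e-6, hi: float = 5.0, tol: float = 1e-10) -> float:
    """Invert the pricing formula by bisection, using monotonicity in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_otm_price(k, tau, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip for a 10% OTM put with three months to maturity.
p = black_otm_price(-0.10, 0.25, 0.20)
iv = implied_vol(p, -0.10, 0.25)
```

Bisection is chosen here only because it needs nothing beyond monotonicity; a Newton step on vega would converge faster.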
The empirical work explicitly accounts for measurement error in the option prices. We denote
the average of the bid and ask quotes (expressed in Black-Scholes implied volatility units) by $\hat{\kappa}_{t,k,\tau}$,
and view this as a noisy observation of the underlying value. To the extent the measurement errors
are not strongly correlated across a large fraction of the surface, we improve the efficiency of the
inference by incorporating the full option cross-section in our estimation and testing procedures,
effectively averaging out idiosyncratic observation errors. The size of the spread varies over time
and is positively correlated with the volatility level. In addition, there are systematic differences
in the relative spread across moneyness. For example, the spread is about 8% of the mid-quote for
deep OTM puts, on average, implying that a typical implied volatility reading of 40% is associated
with bid and ask quotes of 38.4% and 41.6%. Similarly, for an IV of 18% for at-the-money (ATM)
options, the quotes are generally around 17.6%-18.4%, while a typical set of quotes for far OTM
calls are 18.8%-21.2% for a mid-point value of 20%.

1 Due to extreme violations of no-arbitrage conditions, we replaced October 8, 2008, with October 6, 2008.

The options underlying the IV surface are highly heterogeneous in terms of moneyness and
tenor across time. To facilitate comparison, we create a uniform set of regions based on the option
characteristics. Specifically, we define the volatility-adjusted moneyness, $m$, at time $t$ for tenor $\tau$,
by standardizing the log-moneyness with the ATM IV,
\[
m = \frac{\ln(K / F_{t,t+\tau})}{\kappa_{t,0,\tau} \cdot \sqrt{\tau}} .
\]
Table 1 shows how the observations in our sample are distributed across the option surface. The
four regions of moneyness represent deep OTM put options, OTM put options, ATM options and
OTM call options, while the two categories for time-to-maturity provide a rough split into short
versus long dated options. Not surprisingly, there is particularly good coverage for ATM options,
which represent over 44% of the in-sample observations. The quotes for the OTM call options are
somewhat limited and amount to almost 16% of the total options quotes, roughly matching the
proportion of deep OTM put options.
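
The volatility-adjusted moneyness and the four moneyness regions of Table 1 can be computed as in the sketch below. The function names are hypothetical, and the tenor is assumed to be expressed in years so that it is consistent with the annualized ATM IV.

```python
import math


def vol_adjusted_moneyness(strike: float, forward: float,
                           atm_iv: float, tau_years: float) -> float:
    """m = ln(K/F) / (ATM IV * sqrt(tau)), with tau in years (assumption)."""
    return math.log(strike / forward) / (atm_iv * math.sqrt(tau_years))


def moneyness_region(m: float) -> str:
    """Map m to the four moneyness bins used in Table 1."""
    if m <= -3.0:
        return "deep OTM put"
    if m <= -1.0:
        return "OTM put"
    if m <= 1.0:
        return "ATM"
    return "OTM call"


# Illustrative numbers only: strike 900, forward 1000, ATM IV 20%, 3 months.
m = vol_adjusted_moneyness(900.0, 1000.0, 0.20, 0.25)
region = moneyness_region(m)
```

With these inputs, a strike 10% below the forward sits just inside the OTM put region, since ln(0.9) is about -0.105 and one standard deviation of the return over the tenor is 0.10.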

                    1996:1-2010:7          2010:7-2013:4
                 τ <= 60    τ > 60      τ <= 60    τ > 60
m ≤ −3            10.36      5.23        18.46      8.62
−3 < m ≤ −1       12.34     11.89        12.31     11.03
−1 < m ≤ 1        19.58     24.77        14.58     17.90
m > 1              8.71      7.13        10.34      6.76

All               50.98     49.02        55.69     44.31

Table 1: Relative Number of Contracts. We report the percentage of option contracts that,
on average, fall within the different combinations of moneyness and tenor for the indicated sample.

In the out-of-sample period, the daily number of active quotes is much higher, especially for
deep OTM options. Since we would like to compare model performance across the two samples, we
have truncated the set of OTM options included in the recent period to lie within the boundaries
of -7.1 for the puts and 2.4 for the calls. These cut-offs correspond to the average minimum and
maximum moneyness of the option quotes for our in-sample period. This standardization limits the
heterogeneity across the samples. The relative proportion of OTM calls and puts is stable, but we
observe a non-trivial shift from ATM to deep OTM put options, with the former now representing
about 32.5% and the latter 27% of the overall observations. To the extent variation in the pricing
of deep OTM put option prices is harder to accommodate than for ATM prices, this composition
effect will – all else equal – imply a worse out-of-sample fit than would otherwise be observed.

2.2 The Option Surface Characteristics

The option IV surface displays highly persistent and nonlinear dynamics that are difficult to
convey effectively through a few summary measures. We provide a couple of alternative depictions
that highlight different aspects of the dynamics. The first approach emphasizes the evolution
of separate option surface characteristics, as defined below. The second approach consists of a
standard principal components analysis of the IV surface.
The option characteristics are plotted in Figure 1. The level captures the average IV for ATM
short-dated options, the term structure reflects the difference between the IV of long and short
maturity ATM options, the skew measures the IV gap between short-dated OTM put and OTM
call options, and, finally, the skew term structure is the difference between the skew computed
from long- and short-dated options. The exact definitions are provided in the caption to Figure 1.
Note, in particular, that we define the short- and long-term skew symmetrically with respect to the
degree of volatility-adjusted moneyness for the left and right tail, as advocated by Bates (1991).
The IV level displays occasional erratic spikes to the upside, but also displays strong persistence,
as expected for a series reflecting the general level of volatility. Inspecting the remaining three
panels, we notice a considerable degree of commonality. Every major spike in the IV level is visible
in the other characteristics.
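
The four characteristics can be computed from a standardized IV grid as in the following sketch. It uses the grid points stated in the caption to Figure 1 (tenors of 0.1 and 0.8 years, and volatility-adjusted moneyness of 0, −2 and 2). The dictionary layout and the function name are hypothetical, and the input values are made up for illustration rather than taken from the data.

```python
def surface_characteristics(iv, short=0.1, long=0.8, m_put=-2.0, m_call=2.0):
    """Level, term structure, skew and skew term structure of an IV grid.

    iv maps (moneyness m, tenor in years) to a Black-Scholes implied volatility.
    """
    level = iv[(0.0, short)]
    term_structure = iv[(0.0, long)] - level
    short_skew = iv[(m_put, short)] - iv[(m_call, short)]
    long_skew = iv[(m_put, long)] - iv[(m_call, long)]
    return {
        "level": level,
        "term_structure": term_structure,
        "skew": short_skew,
        "skew_term_structure": long_skew - short_skew,
    }


# Made-up IV readings on the six grid points that the definitions require.
grid = {
    (0.0, 0.1): 0.19, (0.0, 0.8): 0.20,
    (-2.0, 0.1): 0.27, (2.0, 0.1): 0.17,
    (-2.0, 0.8): 0.30, (2.0, 0.8): 0.16,
}
chars = surface_characteristics(grid)
```

For this grid the level is 0.19, the term structure 0.01, the skew 0.10, and the skew term structure 0.04.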

Table 2 supplements Figure 1 with summary statistics for the IV surface characteristics. The
correlation matrix confirms the strong covariation between the IV level and the remaining features.
Moving to the other characteristics, we see that the IV term structure is moderately positive, apart
from episodic large negative outliers. The skew is consistently positive, exceeding 5% over the vast
majority of the sample and averaging 10%. The skew also displays pronounced persistence and
is particularly elevated when markets are turbulent. Finally, the skew term structure is mostly
positive, but spikes downward during steep market declines, indicating a dramatic flattening of the
“smirk” at the onset of market crises. This feature accounts for a great deal of variation in the
surface, as the skew is negatively correlated with the skew term structure. Specifically, when the
skew steepens drastically, the effect is much less pronounced at longer maturities. Hence, this type
of excitement of the short left part of the IV surface must be associated with shocks to volatility
and jump intensities of moderate persistence.

[Figure 1 plot not reproduced: four time-series panels titled Implied Volatility Level, Implied Volatility Term Structure, Implied Volatility Skew, and Implied Volatility Skew Term Structure.]

Figure 1: Implied Volatility (IV) Surface Characteristics. Top left Panel: The IV Level is
the average IV for short ATM options. Top Right Panel: The IV Term Structure is the difference in
IV between long- and short-dated ATM options. Bottom left Panel: The IV Skew is the difference
between the IV of short-dated OTM put and OTM call options. Bottom right Panel: The Skew
Term Structure is the difference between the long- and short-dated skew, with the long skew defined
analogously to the short skew. Short-dated options are those with 0.1 years to maturity and
long-dated options have 0.8 years to maturity. ATM options have volatility-adjusted moneyness m equal
to zero; OTM put options have m = −2; OTM call options have m = 2.

While there is a clear dependence between the option characteristics, both Figure 1 and Table
2 suggest that the dynamics of the IV level cannot account for the overall surface dynamics. For
example, the 1997 and 1998 crises have a much more pronounced and persistent effect on the
skew than the IV level. Likewise, the aftermath of the 1998 Russian crisis is associated with a
historically high upward-sloping term structure, while the IV level is close to its historical mean.
In contrast, other periods with similar IV levels feature much flatter IV term structures. Hence,
we need a multi-factor model to capture the dynamic dependencies between the option surface
characteristics, as also concluded in Christoffersen et al. (2009).

               Summary statistics            Correlation Matrix
             Mean   Median   Std       Level     TS     Skew   Skew TS
Level        0.19    0.18    0.08       1.00   -0.69    0.89    -0.24
TS           0.01    0.02    0.03         -     1.00   -0.57     0.69
Skew         0.10    0.09    0.05         -       -     1.00    -0.31
Skew TS      0.04    0.04    0.03         -       -       -      1.00

Table 2: Summary Statistics and Correlation Matrix for Option Characteristics. The
statistics for the implied volatility Level, Term Structure (TS), Skew, and Skew Term Structure
are computed over the in-sample period, January 1996–July 2010.

Turning to the principal component (PC) analysis, Figure 2 depicts the in-sample realizations
of the first four PCs of the IV surface. It is evident that the first PC is closely related to the IV
level, while the second PC displays commonality with the IV term structure. However, the last
two PCs appear largely unrelated to the characteristics depicted in Figure 1. The first PC captures
about 96.4% of the total variation, while the following PCs account for, respectively, 2.1%, 0.7%
and 0.3%. Clearly, there is a dominant level type effect, but this “factor” also accounts for a great
deal of variation in the skew, term structure and skew term structure, leaving, relatively speaking,
only minor residual variation to explain for the remaining PCs.
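
A dominant first component of this kind is easy to reproduce on synthetic data. The sketch below computes the variance share of the first PC by power iteration on the sample covariance matrix of a 21-column panel. It is an illustration only: the text does not state whether the PCs are extracted from the covariance or the correlation matrix, the simulated one-factor panel is made up, and in practice a library eigendecomposition would replace the hand-written iteration.

```python
import random


def first_pc_share(rows, iters=200):
    """Share of total variance explained by the first principal component.

    rows is a list of observations (lists of equal length). The leading
    eigenvalue of the sample covariance matrix is found by power iteration.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_var = sum(cov[j][j] for j in range(p))
    v = [1.0 / p ** 0.5] * p  # starting vector with unit length
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the top eigenvalue.
    top_eig = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return top_eig / total_var


# Synthetic panel: 21 standardized options driven by one level factor plus noise.
random.seed(0)
loads = [1.0 + 0.05 * (j % 3) for j in range(21)]
panel = []
for _ in range(200):
    f = random.gauss(0.0, 0.08)
    panel.append([0.2 + b * f + random.gauss(0.0, 0.005) for b in loads])
share = first_pc_share(panel)
```

Because a single factor drives all 21 columns in this simulation, the first PC absorbs nearly all of the variation, which mirrors the level effect described above without saying anything about the remaining components.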

To further explore how the PCs interact with the IV surface characteristics, Table 3 reports
on the in-sample regression of characteristics on PCs. The table confirms the strong association

9
PC1: 96.4% PC2: 2.1%
6 4

2
4
0
2
-2
0
-4

-2 -6
96 98 00 03 05 08 10 96 98 00 03 05 08 10

PC3: 0.7% PC4: 0.3%


6 6

4 4

2
2
0
0
-2
-2 -4

-4 -6
96 98 00 03 05 08 10 96 98 00 03 05 08 10

Figure 2: Principal Components of the Implied Volatility Surface. The panels depict the
first four principal components extracted from the S&P 500 IV surface from January 1996 to July
2010. On each day we interpolate the IV surface to generate standardized options with the same
volatility-adjusted moneyness, m, and tenor across the sample. Specifically, we obtain option prices
for m in the set {−4, −3, −2, −1, −0, 1, 2} and tenor equal to three values, {0.1, 0.3, 0.8} (in years).
This produces a total of 21 synthesized option contracts per day. For each panel, we also report
the percentage of the overall variation explained by the given principal component.

10
of the first PC with the IV level, featuring a t-statistic beyond 300. For this PC, we also obtain
the negative association with the IV and skew term structures and positive relation with the skew,
matching the correlation patterns for the IV level in Table 2. In addition, we find the second PC to
be associated with the IV term structure, corroborating the visual impression from Figure 2. The
third PC is associated with the skew as well as the skew term structure and level. However, the
skew is much more strongly associated with the first PC so, effectively, the third PC captures only
residual skew variation that is largely orthogonal to the level. Finally, the fourth PC has no clear
association with any of the characteristics. In particular, the variation in the skew term structure
is effectively accounted for through the first two or three PCs.

                 Level               TS                 Skew               Skew TS
PC1          0.19 (343.79)      -0.04 (-60.84)      0.11 (133.05)      -0.01 (-6.42)
PC2         -0.24 (-63.65)       0.40 (87.14)      -0.05 (-9.19)        0.41 (41.04)
PC3         -0.14 (-21.49)      -0.08 (-10.48)      0.45 (43.81)       -0.21 (-11.82)
PC4         -0.05 (-5.07)       -0.08 (-6.88)      -0.36 (-23.36)       0.03 (1.06)
R2           0.99                0.94                0.96                0.71
AC(1)        0.66                0.60                0.32                0.44
AC(2:10)     0.39                0.26                0.21                0.26
AC(11:20)    0.27                0.16                0.12                0.16

Table 3: Relating Option Characteristics and Principal Components. We report the
coefficients with t-statistics (in parentheses) and the R2 from the linear regression of different
option characteristics (level, term structure, skew, and skew term structure) on the first four
principal components extracted from the S&P 500 implied volatility surface from January 1996
to July 2010. We also report the first sample autocorrelation coefficient and the average sample
autocorrelation coefficient over two-to-ten and eleven-to-twenty lags of the regression residuals.

These observations have a number of implications. First, there is no simple mapping between
PCs and characteristics. The latter are intrinsically interconnected and covary strongly, so factors
associated with the PCs will generally exert a significant joint impact on multiple characteristics.
Second, the difficulty of separating the forces driving the individual characteristics is, of course,
a general feature of systems with pronounced nonlinearities. We see further evidence of potential
nonlinearity in the strong serial correlation in the residuals of the regressions in Table 3. The
first-order autocorrelation coefficients are large across all the characteristics, and the higher-order
autocorrelations die out extremely
slowly, suggesting highly persistent deviations from the linear approximation associated with the
regression analysis. Third, the persistent residuals may be due to missing factors and, indeed, we
find standard tests for the number of factors, see, e.g., Bai and Ng (2002), to indicate 7 to 8 factors.
However, given the nonlinear association between option IV and (latent) factors, this likely reflects
the failure of the linear approximation rather than the true number of underlying (linear) factors.2
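
The residual diagnostics reported in the last three rows of Table 3 are sample autocorrelations, either at lag one or averaged over a range of lags. A minimal sketch is given below. The function names are our own, the estimator is the standard one that normalizes by the full-sample sum of squares (the text does not state which variant is used), and the persistent AR(1) series merely stands in for actual regression residuals.

```python
import random


def autocorr(x, lag):
    """Sample autocorrelation at the given lag, normalized by the full-sample variance."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def average_autocorr(x, first_lag, last_lag):
    """Average of the sample autocorrelations over lags first_lag..last_lag."""
    lags = range(first_lag, last_lag + 1)
    return sum(autocorr(x, k) for k in lags) / len(lags)


# Stand-in residual series: a simulated AR(1) with 760 weekly observations.
random.seed(0)
resid = [0.0]
for _ in range(759):
    resid.append(0.9 * resid[-1] + random.gauss(0.0, 1.0))

summary = {
    "AC(1)": autocorr(resid, 1),
    "AC(2:10)": average_autocorr(resid, 2, 10),
    "AC(11:20)": average_autocorr(resid, 11, 20),
}
```

Residuals from a well-specified linear regression would produce values near zero in all three entries, which is why the slowly decaying pattern in Table 3 points to a failure of the linear approximation.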
In summary, accounting for a given model’s ability to track the IV surface dynamics provides
an intuitive way to highlight the implications of model misspecification. Indeed, we later illustrate
the quality of fit by comparing the model-implied and observed IV surface characteristics. On the
other hand, there is no direct association between the inability to fit the dynamics of a specific set
of characteristics and the lack of a given factor, because the factors’ impact on the option IVs is
highly nonlinear and creates correlated dynamic interactions across the characteristics.

3 Parametric Modeling of the Option Panel

We turn next to parametric modeling of the options with the goal of succinctly capturing the
IV surface dynamics via a low-dimensional state vector. We adopt a general-to-specific approach.
The proposed model subsequently serves as the basis for our analysis of the equity and variance
risk premia. The initial decision concerns the number of latent state variables to include. The
vast majority of empirical option pricing studies employs a single stochastic volatility factor, but
the literature on estimating the return dynamics under the physical measure as well as a few
option pricing studies, e.g., Bates (2000), Christoffersen et al. (2009), Christoffersen et al. (2012),
and Andersen et al. (2013), point to a minimum of two factors. This is also consistent with
our descriptive analysis of the IV surface in Section 2. Hence, we follow Andersen et al. (2013) in
proposing a general three-factor model which, apart from the specification of the jump distribution,
embeds most existing continuous-time models in the literature as special cases. We document below
that exponentially distributed price jumps provide a superior fit relative to the more commonly
adopted Gaussian specification, justifying the alternative representation in our benchmark model.
Our three-factor model for the risk-neutral equity index dynamics is given by the following
extension of the representation in Andersen et al. (2013),
2
For a simple illustration, we calibrated a one-factor Heston model and derived the sensitivity of the IV surface to
the volatility factor. For a given location in the surface, i.e., a given mildly OTM option, the sensitivity (derivative
of the option IV with respect to volatility) will typically vary dramatically with the level of volatility. Thus, strongly
correlated approximation errors will arise from linearizing the dependence of the surface characteristics to factors, as
the true sensitivities fluctuate strongly over time. This illustration is provided in our Supplementary Appendix.

\[
\begin{aligned}
\frac{dX_t}{X_{t-}} &= (r_t - \delta_t)\,dt + \sqrt{V_{1,t}}\,dW^{Q}_{1,t} + \sqrt{V_{2,t}}\,dW^{Q}_{2,t} + \eta\sqrt{U_t}\,dW^{Q}_{3,t} + \int_{\mathbb{R}^2} (e^{x}-1)\,\tilde{\mu}^{Q}(dt,dx,dy), \\
dV_{1,t} &= \kappa_1(\bar{v}_1 - V_{1,t})\,dt + \sigma_1\sqrt{V_{1,t}}\,dB^{Q}_{1,t} + \mu_1\int_{\mathbb{R}^2} x^2\,1_{\{x<0\}}\,\mu(dt,dx,dy), \\
dV_{2,t} &= \kappa_2(\bar{v}_2 - V_{2,t})\,dt + \sigma_2\sqrt{V_{2,t}}\,dB^{Q}_{2,t}, \\
dU_t &= -\kappa_u U_t\,dt + \mu_u\int_{\mathbb{R}^2}\left[(1-\rho_u)\,x^2\,1_{\{x<0\}} + \rho_u\,y^2\right]\mu(dt,dx,dy),
\end{aligned}
\tag{3.1}
\]
where $(W^{Q}_{1,t}, W^{Q}_{2,t}, W^{Q}_{3,t}, B^{Q}_{1,t}, B^{Q}_{2,t})$ is a five-dimensional Brownian motion with
$\mathrm{corr}(W^{Q}_{1,t}, B^{Q}_{1,t}) = \rho_1$ and $\mathrm{corr}(W^{Q}_{2,t}, B^{Q}_{2,t}) = \rho_2$, while the remaining Brownian motions are mutually independent.
In addition, µ is an integer-valued measure counting the jumps in the price, X, as well as the
state vector, (V1 , V2 , U ). The corresponding (instantaneous) jump intensity, under the risk-neutral
measure, is $dt \otimes \nu^{Q}_t(dx,dy)$. The difference $\tilde{\mu}^{Q}(dt,dx,dy) = \mu(dt,dx,dy) - dt\,\nu^{Q}_t(dx,dy)$ constitutes
the associated martingale jump measure.
The jump specification involves two separate components, x and y. The former captures co-
jumps that occur simultaneously in the price, the first volatility factor, V1 , and, potentially, in
the U factor (if ρu < 1), while the y jumps represent independent shocks to the U factor.3 The
compensator characterizes the conditional jump distribution and is given by,

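\[
\frac{\nu^{Q}_t(dx,dy)}{dx\,dy} =
\begin{cases}
c^-(t)\,1_{\{x<0\}}\,\lambda_-\,e^{-\lambda_-|x|} + c^+(t)\,1_{\{x>0\}}\,\lambda_+\,e^{-\lambda_+ x}, & \text{if } y = 0, \\
c^-(t)\,\lambda_-\,e^{-\lambda_-|y|}, & \text{if } x = 0 \text{ and } y < 0.
\end{cases}
\]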

The first term on the right hand side, referring to the x ≠ 0, y = 0 case, reflects co-jumps
in price and volatility, while the second term, x = 0, y < 0, captures independent shocks to the
U factor. Hence, the individual (strictly positive) jumps in U are either independent from V1 or
proportional to the (simultaneous) jump in V1 . Following Kou (2002), we model the price jumps
as exponentially distributed, with separate tail decay parameters, λ− and λ+ , for negative and
positive jumps. Moreover, for parsimony, the independent shocks to the U factor are distributed
identically to the negative price jumps. Finally, the time-varying jump intensities are governed by
the c− (t) and c+ (t) coefficients. These coefficients evolve as affine functions of the state vector,

\[
c^-(t) = c_0^- + c_1^- V_{1,t-} + c_2^- V_{2,t-} + U_{t-}, \qquad c^+(t) = c_0^+ + c_1^+ V_{1,t-} + c_2^+ V_{2,t-} + c_u^+ U_{t-}.
\]
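As a rough illustration of the jump specification, the following Python sketch evaluates the two affine intensities and draws price jump sizes from the double-exponential law. The function names, the tuple layout of the loadings and all numerical inputs are hypothetical choices made for the example; only the functional forms are taken from the text, and the independent y-shocks to U are left out for brevity.

```python
import numpy as np

def jump_intensities(v1, v2, u, c_minus, c_plus):
    """Affine jump intensities c^-(t) and c^+(t).

    c_minus = (c0, c1, c2): loadings of the negative intensity; the loading
    on U is fixed at one for identification, as in the text.
    c_plus = (c0, c1, c2, cu): loadings of the positive intensity.
    """
    neg = c_minus[0] + c_minus[1] * v1 + c_minus[2] * v2 + u
    pos = c_plus[0] + c_plus[1] * v1 + c_plus[2] * v2 + c_plus[3] * u
    return neg, pos

def sample_price_jumps(rng, n, lam_minus, lam_plus, p_negative):
    """Draw n log-price jump sizes from a double-exponential law.

    Negative sizes have density lam_minus * exp(-lam_minus * |x|),
    positive sizes have density lam_plus * exp(-lam_plus * x).
    """
    is_neg = rng.random(n) < p_negative
    return np.where(is_neg,
                    -rng.exponential(1.0 / lam_minus, n),
                    rng.exponential(1.0 / lam_plus, n))

# Hypothetical state and parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
neg, pos = jump_intensities(0.02, 0.01, 3.0,
                            c_minus=(0.0, 100.0, 0.0),
                            c_plus=(0.4, 25.0, 80.0, 0.0))
x = sample_price_jumps(rng, 200_000, lam_minus=25.0, lam_plus=35.0,
                       p_negative=neg / (neg + pos))

# For an exponential law with rate lam, E[x^2] = 2 / lam^2. This is the
# average squared negative jump that feeds into V1 and U.
mean_sq_neg = np.mean(x[x < 0] ** 2)
```

With these inputs the sample average of the squared negative jumps should be close to 2/25² = 0.0032, which gives a feel for how the tail decay parameter scales the volatility response to a price jump.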

This representation involves a large set of parameters that can be hard to identify separately. At the
estimation stage, we eliminate those that are insignificant and have no discernible impact on model
fit. However, generality along this dimension is important, as the jump specification turns out to
be critical for a suitable representation of the IV surface dynamics. We note that for identification
3
The latter can also generate a jump in return volatility if η > 0.

purposes, we set the coefficient in front of U in c− (t) to unity, and hence the value of U directly
signifies its contribution to the negative jump intensity.4
Our three-factor model possesses a number of distinctive features. The factors (V1 , V2 , U ) drive
both the diffusive volatility and the jump intensities. V1 and V2 are always present in the diffusive
volatility, as in traditional multi-factor volatility models, while U contributes to diffusive volatility
only if η > 0. In fact, the constrained model arising for η = 0 is of separate interest. It implies that
the factor U affects only the jump intensities, with no impact on diffusive volatility. Furthermore,
in an extension to existing option pricing models, we allow positive and negative jump intensities
to have different loadings on the latent factors. In particular, some factors affect only positive or
only negative jump intensities. Such flexibility in modeling the jump intensities is important given
the nonparametric evidence in Bollerslev and Todorov (2011).
The jump modeling also involves several novel features. First, the price jumps are exponentially
distributed. This is unlike most earlier studies which rely on Gaussian price jumps, following
Merton (1976).5 Next, the jumps in the factor V1 are linked deterministically to the price jumps,
with squared price jumps impacting the volatility dynamics in a manner reminiscent of discrete
GARCH models.6 For parsimony and ease of identification, we allow only the negative price jumps
to impact the volatility dynamics. Finally, U is driven in part by the squared negative price jumps
and in part by independent jumps, with the parameter ρu controlling the contribution of each
component in the dynamics of U . As such, the model accommodates both perfect dependence
(ρu = 0) and full independence (ρu = 1) between the jump risks of V1 and U . Moreover, these
state variables, governing critical features of the option surface dynamics, are related through the
time-variation in the jump intensity. Our specification allows for “cross-excitation” in which jumps
4
The formal inference in Andersen et al. (2013) provides strong evidence for the presence of a separately evolving
left jump tail factor like U in the risk-neutral return dynamics, while it is unclear if U also directly impacts spot
volatility. As such, we let U enter the left jump intensity with a unitary coefficient. This resolves identification issues
that arise regarding the scale of U versus the loading coefficient for U in the jump intensity. It is analogous to the
imposition of a unitary coefficient for the volatility components in the return dynamics in equation (3.1).
5
Bates (2012) studies the option pricing implications of models with exponential and Gaussian jumps. He finds
them to be broadly similar for short-maturity options that are not deep OTM.
6
Our modeling of the volatility jumps, and their dependence with the price jumps, deviates from earlier empirical
option pricing studies, e.g., Duffie et al. (2000), who model volatility jumps as exponentially distributed. In our case,
they are the squares of exponentially distributed random variables, and hence much fatter tailed. This enhances the
reaction of volatility to large price shocks. Finally, our modeling implies a nonlinear deterministic link between price
and volatility jumps, unlike Duffie et al. (2000). Of course, there is overwhelming empirical support for GARCH like
dynamics in volatility, similar to the continuous time specification adopted here.

in V1 enhance the probability of future jumps in U , and vice versa.7
Differences in the jump distributions aside, our model nests most existing models. For the
one-factor setting, with V2 and U absent, we recover the double-jump volatility model of Duffie
et al. (2000), estimated using options data by Broadie et al. (2007). In the two-factor setting, with
U absent and excluding volatility jumps, we obtain the Bates (2000) jump-diffusion, and further
ruling out price jumps leads to the two-factor diffusive model of Christoffersen et al. (2009). The
key difference in our model from existing work is the separation of the left jump intensity from the
volatility (and its components) and the right jump intensity (through the U factor).
We know of only two prior instances where this type of separation is contemplated. Santa-
Clara and Yan (2010) model jump intensity by a separate diffusive factor that can be correlated
with volatility, which itself follows a diffusion. Hence, the model does not include volatility jumps,
critical for fitting the option surface, as noted by Broadie et al. (2007) and further confirmed here.
Christoffersen et al. (2012) consider a discrete-time GARCH model with conditionally Gaussian and
non-Gaussian innovations (“jumps” of Merton type) where conditional volatility and jump intensity
are driven by past squared innovations. This model does not allow for closed-form solutions for
the option prices, complicating inference from the option surface. Our model differs from these
two papers along several dimensions crucial for the analysis of risk premia. First, we only model
the risk-neutral distribution and do not assume any structure under P. Thus, we do not import
information from the physical return dynamics (other than no-arbitrage restrictions). This is
critical for our subsequent findings regarding the drivers of risks and risk premia in Section 7.
Second, we separate the dynamics of the left and right jump intensities which is important for the
pricing of short maturity OTM puts and calls. Third, we adopt the double-exponential return jump
distribution, providing a superior fit to the short maturity OTM options. Fourth, our specification
involves cross-excitation in the jump and volatility dynamics, leading to improved characterization
of longer maturity options, particularly in the aftermath of volatile episodes in the sample.
In summary, the main departure from prior work stems from the inclusion of the new U factor.
Given the rather unconventional representation, we briefly discuss how this factor enhances the
features of the risk-neutral dynamics. It is best seen by focusing on a restricted model where we
eliminate U from the diffusive volatility, i.e., set η = 0, and further let ρu = 0, so that there is
7
We stress that the model (3.1) still belongs to the affine family covered by Duffie et al. (2003) and, as shown in
Andersen et al. (2013), the following parameter constraints ensure covariance stationarity of the latent factors,

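\[
\kappa_1 > \frac{2c_1^-\mu_1}{\lambda_-^2}, \quad \kappa_u > \frac{2\kappa_1\mu_u}{\kappa_1\lambda_-^2 - 2c_1^-\mu_1}, \quad \kappa_2 > 0, \quad \text{and} \quad \sigma_i^2 \le 2\kappa_i\bar{v}_i, \; i = 1, 2.
\]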

no independent jump component driving U . In this scenario, there is no distinct source of risk
impacting U – it is driven solely by the squared negative price jumps. Nonetheless, U affects the
jump intensities, and hence the option surface, separately from V1 . That is, to convey the current
state of the system, U must be included among the components of the state vector.8 In other
words, even if the source of risk in U is spanned by the jumps in X and V1 , U is still necessary
for characterizing the conditional risk-neutral distribution of future log returns or predicting the
future evolution of the factors, even after controlling for the current values of V1 and V2 . Of course,
the role of U only expands, if it is subject to independent shocks as well. In the empirical section
below, we detail how the U factor impacts the IV surface characteristics over time.
The model in equation (3.1) pertains to the risk-neutral dynamics of X. However, due to the
equivalence of Q and P, the assumed dynamics have implications for the dynamics of X under P as
well. In general, these implications are limited to those features of model (3.1) which hold almost
surely. They consist of the following. First, the spot diffusive variance is invariant to the change
of measure. In the model, this is given by

\[
V_t = V_{1,t} + V_{2,t} + \eta^2 U_t .
\]

Second, regarding the jumps, the only property that applies almost surely is the identity of the
realized jumps in the returns and state variables. In particular,

\[
\Delta V_{1,t} = \mu_1\,(\Delta\log(X))^2\,1_{\{\Delta\log(X)<0\}},
\]

and, if ρu = 0, we also have,

\[
\Delta U_t = \mu_u\,(\Delta\log(X))^2\,1_{\{\Delta\log(X)<0\}},
\]

under both measures.
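The two pathwise implications can be made concrete with a small Python sketch. It assumes ρu = 0 and reads the jump in U off equation (3.1), i.e., as µu times the squared negative log-price jump. The function names and the numerical inputs are hypothetical.

```python
def spot_diffusive_variance(v1, v2, u, eta):
    """Spot diffusive variance V_t = V1 + V2 + eta^2 * U, identical under P and Q."""
    return v1 + v2 + eta ** 2 * u

def implied_state_jumps(dlog_x, mu1, mu_u):
    """Jumps in V1 and U tied to a realized log-price jump when rho_u = 0.

    Only negative price jumps move the state; positive jumps leave it unchanged.
    """
    squared_neg = dlog_x ** 2 if dlog_x < 0 else 0.0
    return mu1 * squared_neg, mu_u * squared_neg

# Illustrative inputs only: a 5% jump with hypothetical loadings.
dv1, du = implied_state_jumps(-0.05, mu1=12.0, mu_u=7.0)
no_move = implied_state_jumps(0.05, mu1=12.0, mu_u=7.0)
v_spot = spot_diffusive_variance(0.02, 0.01, 3.0, eta=0.0)
```

In this example a 5% downward jump raises V1 by about 0.03, while an upward jump of the same size leaves both state variables unchanged.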


In most empirical option pricing applications, additional assumptions are invoked when changing
measure from P to Q. For example, it is commonly assumed that the model class is identical, and
affine, under both measures. This is convenient as affine models offer a great deal of tractability.
However, this approach severely restricts the dynamics of the risk premiums. In addition, such
“structure preserving transformations” (SPTs) impose auxiliary restrictions, extending to the model
parameters. In particular, in our affine setting, the SPT assumption implies that σ1 , σ2 , η, ρ1 , ρ2 ,
ρu , µ1 and κ3 are identical under P and Q, while the remaining parameters may differ.
8
This is a reflection of the fact that there is no direct association between the dimensionality of the sources of
risk and the dimension of the state vector required to characterize the conditional dynamics of the system. This
phenomenon arises naturally in continuous-time ARMA models, see, e.g., Brockwell (2001). It is also a well-known
feature of the so-called quadratic term structure models, see, e.g. Ahn et al. (2002) and Leippold and Wu (2002).

Given the above discussion, it is clear that data on the underlying equity-index values can
be helpful in the estimation of the risk-neutral model. Below, we impose the pathwise (almost
sure) restriction regarding the spot variance in estimation. Given the difficulty in recovering spot
volatility jumps from high-frequency data – due both to estimation uncertainty and lack of overnight
observations – we do not impose restrictions regarding the pathwise (realized) price and volatility
jump relationship implied by our risk-neutral model. Finally, if one constrains the pricing of risk
through an SPT from P to Q, additional information from the P dynamics can be “imported” from
the underlying return data during estimation of the risk-neutral dynamics. This comes, however,
with the risk of severe model misspecification. As we document below, the option panel is very
informative about the risk-neutral dynamics, so we avoid auxiliary restrictions that may induce
model misspecification with unpredictable impact on our inference for the risk premia.

4 Estimation
4.1 Estimation Approach

The development of formal tools for parametric inference in the context of an option panel is
challenging. There are pronounced time series dependencies in the latent volatility components
and, potentially, the jump intensities. At the same time, sizeable bid-ask spreads influence the
observed prices and quotes for the options. These measurement errors are strongly heterogeneous
and correlated with the overall return variation. Finally, there are no-arbitrage constraints that,
one, at any point in time link the individual option prices across strikes, and, two, equate the spot
diffusive volatility for the underlying asset (imperfectly observable from high-frequency returns)
with the volatility implied by the contemporaneous state vector (also extracted with a degree of
statistical error) for every option cross-section. Consequently, we adapt the parametric estimation
and inference approach put forth in Andersen et al. (2013) that is designed to deal with this type
of environment. It exploits in-fill asymptotics in the option cross-section, i.e., it operates under the
assumption that, for a small set of maturities which may vary from day to day, option prices are
observed across a broad range of strikes with only small gaps between the exercise prices.
The approach provides consistent period-by-period estimates for the state vector along with
valid asymptotic inference for the state vector and model parameters as well as the fit to specific
regions of the IV surface on any given day. This is possible due only to the adoption of the in-fill
asymptotic scheme for the option cross-section.9 In practice, this reflects the actual structure of
9
Alternative inference techniques can potentially generate unbiased estimates for the state vector realizations, de-
termining the values of the volatility components and jump intensities, but cannot achieve consistency, thus rendering
formal inference regarding the state vector and IV surface fit on a period-by-period basis infeasible.

our panel. The option quotes are clustered closely in the strike range, while we use only weekly
observations of the option cross-section. Thus, the well-recognized advantages of inference via
“high-frequency” data apply naturally to the cross-section, not the time series, in our setting.
The wealth of information embedded in the option cross-section is, of course, well-recognized.
It is known to facilitate nonparametric extraction of the conditional risk-neutral density for the
underlying asset returns across the maturities available on a given day. Through the imposition
of a general parametric structure and inference procedure, we let the dynamic evolution of the
conditional densities speak to the number of factors as well as the intertemporal variation in
volatility and jump intensities. Given the identified factors and model parameters, the individual
cross-section identifies the current state vector. In fact, theoretically, the entire system can be
identified and estimated consistently from a single cross-section. However, the identification from a
single option surface is weak. The use of a large number of surfaces, i.e., an option panel, allows the
variation in the state vector over time to assist in identifying the underlying structure governing
the option prices and also helps diversify the idiosyncratic measurement errors.10
We denote the parameter vector of model (3.1) by θ and the state vector at time t by $Z_t = (V_{1,t}, V_{2,t}, U_t)$.
Further, the model-implied Black-Scholes IV is given by $\kappa(k, \tau, Z_t, \theta)$ and the model-implied
diffusive spot variance by $V(Z_t, \theta) = V_{1,t} + V_{2,t} + \eta^2 U_t$. Letting n denote the (equidistant)
frequency by which we sample the high-frequency asset returns, our estimator takes the form,

\[
\left(\{\hat{V}^{n}_{1,t}, \hat{V}^{n}_{2,t}, \hat{U}^{n}_{t}\}_{t=1,\dots,T},\; \hat{\theta}^{n}\right) = \operatorname*{argmin}_{\{Z_t\}_{t=1,\dots,T},\; \theta\in\Theta} \; \sum_{t=1}^{T} \frac{\text{Option Fit}_t + \lambda \times \text{Vol Fit}_t}{V_t^{ATM}}, \tag{4.1}
\]
\[
\text{Option Fit}_t = \frac{1}{N_t}\sum_{j=1}^{N_t}\left(\kappa_{t,k_j,\tau_j} - \kappa(k_j, \tau_j, Z_t, \theta)\right)^2, \qquad \text{Vol Fit}_t = \left(\sqrt{\hat{V}_t^{(n,m_n)}} - \sqrt{V(Z_t, \theta)}\right)^2,
\]

where $\hat{V}_t^{(n,m_n)}$ is a nonparametric estimator of the diffusive spot variance constructed from the
underlying intraday asset prices, as detailed in the Appendix, and $V_t^{ATM}$ is the squared short term
ATM Black-Scholes IV.11 Finally, λ is a tuning parameter which we set to 0.2.12
The estimator (4.1) minimizes the weighted mean squared error in fitting the panel of observed
option IVs, with a penalization term that reflects the deviation of the model-implied spot variance
10
On the other hand, one year of option data – along with the high-frequency returns used to enforce the (statistical)
equality of spot volatility across the P to Q measures – usually suffices for reasonable identification.
11
This measure is obtained for the shortest available maturity on day t and for the option closest to at-the-money,
with the associated forward rate deduced from put-call parity.
12
In general, given the noise in the high-frequency spot volatility estimate, we should pick a relatively low value
for λ. In the Supplementary Appendix, we report further evidence corresponding to different values of λ.

from a model-free spot variance estimate. The presence of the volatility fit in the objective function
not only facilitates the incorporation (statistically) of the no-arbitrage constraint, but also serves as
a regularization device for the estimation by penalizing parameter values that imply “unreasonable”
volatility levels. Moreover, the standardization by VtAT M implies we weigh observations on high
and low volatility days differently. This improves efficiency as the measurement errors in the option
prices and the estimation errors for the state vector generally rise with market volatility.13
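To make the structure of the objective concrete, the Python sketch below computes the contribution of a single day to (4.1). It assumes that the volatility penalty is the squared difference between the square roots of the nonparametric and model-implied spot variances. The function names and the numerical inputs are hypothetical.

```python
import numpy as np

def option_fit(iv_obs, iv_model):
    """Mean squared IV error across the N_t options observed on one day."""
    iv_obs = np.asarray(iv_obs, dtype=float)
    iv_model = np.asarray(iv_model, dtype=float)
    return np.mean((iv_obs - iv_model) ** 2)

def vol_fit(v_hat, v_model):
    """Penalty for the gap between nonparametric and model-implied spot volatility."""
    return (np.sqrt(v_hat) - np.sqrt(v_model)) ** 2

def daily_loss(iv_obs, iv_model, v_hat, v_model, v_atm, lam=0.2):
    """Day-t term of the objective, standardized by the squared ATM IV."""
    return (option_fit(iv_obs, iv_model) + lam * vol_fit(v_hat, v_model)) / v_atm

# Illustrative numbers: two options, a spot volatility gap of one percentage point.
loss = daily_loss(iv_obs=[0.20, 0.25], iv_model=[0.21, 0.23],
                  v_hat=0.04, v_model=0.0441, v_atm=0.04)
```

Dividing by the squared ATM IV means that the same absolute IV error counts for less on a high-volatility day, which is the weighting described above.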
The estimator (4.1) is derived via joint optimization over parameters and state vector real-
izations. This high-dimensional problem is tractable due to the particular form of our objective
function: the terms corresponding to a given observation time depend on the state vector only
through the option prices and nonparametric spot volatility estimate for that particular day. Thus,
given a candidate θ vector, we trivially obtain the corresponding state vector realization. Hence,
we concentrate, or profile, the state vector and optimize over the model parameters via MCMC
based estimation, using a chain of length 10,000, following Chernozhukov and Hong (2004).
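The concentration step can be sketched as follows. For a candidate θ, each day's state vector solves a separate three-dimensional least-squares problem, and the sum of the minimized daily terms is the concentrated objective in θ. Everything model-specific here is a stand-in: `toy_iv` is a hypothetical pricing function, not the model-implied IV of (3.1), η is set to zero, the data are simulated, and the outer MCMC search over θ is not reproduced.

```python
import numpy as np
from scipy.optimize import least_squares

def toy_iv(k, tau, z, theta):
    """Hypothetical stand-in for the model-implied IV kappa(k, tau, Z, theta)."""
    v1, v2, u = z
    var = v1 * np.exp(-theta[0] * tau) + v2 + theta[1] * u * np.maximum(-k, 0.0)
    return np.sqrt(var)

def residuals(z, theta, k, tau, iv_obs, v_hat, v_atm, lam=0.2):
    """Stacked residuals whose sum of squares equals the day-t term of the objective."""
    opt = (iv_obs - toy_iv(k, tau, z, theta)) / np.sqrt(len(iv_obs) * v_atm)
    vol = np.sqrt(lam / v_atm) * (np.sqrt(v_hat) - np.sqrt(z[0] + z[1]))  # eta = 0
    return np.append(opt, vol)

def profile_state(theta, k, tau, iv_obs, v_hat, v_atm):
    """Given theta, solve for the state vector Z_t on a single day."""
    fit = least_squares(residuals, x0=np.array([0.01, 0.01, 1.0]),
                        bounds=(1e-10, np.inf), x_scale="jac",
                        args=(theta, k, tau, iv_obs, v_hat, v_atm))
    return fit.x, 2.0 * fit.cost  # fit.cost is half the sum of squared residuals

def concentrated_objective(theta, days):
    """Objective as a function of theta only, with the states profiled out."""
    return sum(profile_state(theta, *day)[1] for day in days)

# Simulated one-day panel: four strikes (log-moneyness) and three maturities.
theta_true = (5.0, 0.01)
z_true = np.array([0.02, 0.01, 3.0])
kk, tt = np.meshgrid([-0.2, -0.1, 0.0, 0.1], [0.1, 0.5, 1.0])
k, tau = kk.ravel(), tt.ravel()
iv_obs = toy_iv(k, tau, z_true, theta_true)
v_atm = iv_obs[2] ** 2                      # shortest maturity, k = 0
days = [(k, tau, iv_obs, z_true[0] + z_true[1], v_atm)]

loss_true = concentrated_objective(theta_true, days)
loss_wrong = concentrated_objective((1.0, 0.01), days)
```

Because the daily problems are independent once θ is fixed, they can be solved one at a time (or in parallel), which is what keeps the joint optimization over parameters and states tractable.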

4.2 Estimation Results

As noted in Section 3, we follow a general-to-specific approach to estimation. Consequently, model
(3.1) is very richly parameterized, particularly regarding the specification of the jump intensities.
The idea is to allow for full generality and then eliminate the insignificant features in the model.
For brevity, we defer the estimation results for the general case to the Supplementary Appendix.
It turns out that the jump loading parameters, $c_0^-$, $c_2^-$, $c_u^+$, are statistically insignificant, and the
same applies for η. Hence, our more parsimonious model constrains these parameters to zero. For
the negative jump intensity coefficients, this is readily interpreted. It implies that the frequency of
negative jumps is governed exclusively by the level of the two factors, V1 and U . The η = 0 and $c_u^+ = 0$
restrictions are more fundamental. First, η = 0 implies that U is a pure jump intensity factor
which does not contribute directly to the diffusive volatility. This represents a major departure
from existing continuous-time models, which are built from stochastic volatility factors.14 Second,
the elimination of $c_u^+$ means that U is constrained to only affect the negative jumps. Again,
this is an important departure from existing asset pricing models. These features are precluded
in the typical approach to empirical option pricing, because the latter is built around multi-factor
volatility models with a single (Gaussian) price jump component. These models rule out pure jump
dynamic factors and do not allow for factors to impact the negative and positive jump intensities
13
Our specific weighting scheme stems from an analysis of the option pricing errors, showing that their volatility
is roughly linear in the level of market volatility.
14
Instead, our specification shares critical features with the discrete-time GARCH type model of Christoffersen
et al. (2012).

differentially. In our case, by letting the option panel determine the fit, we find the risk-neutral
dynamics to involve a non-volatility factor that operates only on the negative jump intensity.
The estimation results for the general model (3.1) further reveal that the parameters controlling
the risk-neutral law of U are not estimated very precisely. This result is largely due to the fact
that the persistence of U is rather extreme with a half-life of approximately 8 years. With such
a low degree of mean reversion, the parameters κu , µu and ρu cannot be recovered precisely as
they play an insignificant role in determining the values of options with maturities of up to one
year. A precise estimation of κu , µu and ρu requires far longer-dated options than available in
our sample. This issue also arises in the analogous case of recovering the risk-neutral volatility
parameters from options data if the volatility is very persistent, see, e.g., Broadie et al. (2007)
and the references therein. Nonetheless, we have compelling evidence for the presence of U in the
risk-neutral dynamics. First, note that the hypothesis that U is absent is a joint hypothesis that
µu = 0 as well as that the vector {Ut } is not present, and such a hypothesis is strongly rejected by
the data.15 Second, a constrained model in which U is absent performs significantly worse, both
in- and out-of-sample, as documented in the Supplementary Appendix. Finally, even though the
individual parameters κu , µu and ρu are poorly identified, the realizations of the tail factor U are
well identified. As discussed previously, the latter is crucial for our analysis of the risk-premium
dynamics, as U represents the component of the left tail dynamics that is not tied to volatility.
The case of ρu is particularly interesting as this parameter is not endowed with a natural
boundary condition that eliminates an associated feature from the model. Instead, it indicates the
relative jump size in U stemming from (squared) co-price jumps versus a separate source of jumps.
As such, the parameter is restricted to the [0, 1] interval, with the boundary values indicating
either that U is not subject to separate jump shocks (ρu = 0) or that U does not co-jump with the
price process (ρu = 1). Our estimation results suggest that ρu is sufficiently poorly identified that,
at standard significance levels, we cannot rule out any value in [0, 1]. However, as we document
below, ρu turns out to be largely irrelevant for the extracted state vector realization throughout
the sample.16 Thus, we only constrain ρu to fall within the unit interval.
We now turn to the results for the restricted model obtained after zeroing out $c_0^-$, $c_2^-$, $c_u^+$ and
η.17 Table 4 reports the point estimates and associated standard errors. As expected, the volatility
factors differ significantly in their degree of mean reversion, with V2 having a half life of about 5
15
More formally, under the null that µu = 0, the factor realization {Ut } as well as the parameters κu and ρu are not
identified, and hence testing such a hypothesis using a standard t-test (or Wald test) is invalid, see Andrews (2001).
16
Specifically, we verify that imposing ρu = 0 or ρu = 1 has no material impact on any of our qualitative conclusions.
17
A standard Wald test for the four parameters jointly taking the value of zero generates a p-value of 27.25%,
confirming that we cannot reject this restriction at any standard level of significance.

months versus 3 weeks for V1 . V2 is also generally larger and has a lower volatility-of-volatility
coefficient than V1 , implying we have a smaller and rapidly moving volatility factor and a larger,
less volatile, but more persistent second volatility factor. Both display a strong negative association
with the return innovations, generating an overall correlation between return and spot volatility
innovations of around -0.95.18
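The overall correlation can be approximated with a short calculation. The Python sketch below considers only the diffusive innovations with η = 0 and evaluates the correlation at the sample means of V1 and V2 from Table 5, used here as a proxy for the unconditional means; the function name is hypothetical and the inputs are the rounded point estimates from Table 4.

```python
import math

def return_vol_correlation(v1, v2, rho1, rho2, sigma1, sigma2):
    """Instantaneous correlation between diffusive return and spot variance innovations.

    Return innovation:   sqrt(V1) dW1 + sqrt(V2) dW2
    Variance innovation: sigma1 sqrt(V1) dB1 + sigma2 sqrt(V2) dB2
    """
    cov = rho1 * sigma1 * v1 + rho2 * sigma2 * v2
    var_ret = v1 + v2
    var_vol = sigma1 ** 2 * v1 + sigma2 ** 2 * v2
    return cov / math.sqrt(var_ret * var_vol)

# Table 5 sample means for (V1, V2); Table 4 estimates for the rest.
corr = return_vol_correlation(v1=0.017, v2=0.012,
                              rho1=-0.959, rho2=-0.979,
                              sigma1=0.249, sigma2=0.170)
```

Under these assumptions the expression evaluates to roughly −0.95, in line with the value quoted in the text. The result shifts with the relative size of V1 and V2, since the two factors carry different volatility-of-volatility coefficients.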

Table 4: Estimation Results

Parameter     Estimate   Std.       Parameter     Estimate   Std.       Parameter     Estimate   Std.

$\rho_1$      −0.959     0.093      $\bar{v}_2$   0.010      0.000      $c_0^+$       0.372      0.029
$\bar{v}_1$   0.003      0.000      $\kappa_2$    1.864      0.103      $c_1^-$       111.061    4.446
$\kappa_1$    10.989     0.193      $\sigma_2$    0.170      0.006      $c_1^+$       25.855     4.971
$\sigma_1$    0.249      0.028      $\mu_u$       7.124      24.096     $c_2^+$       81.719     6.947
$\mu_1$       12.158     0.247      $\kappa_u$    0.0877     0.123      $\lambda_-$   25.944     0.196
$\rho_2$      −0.979     0.033      $\rho_u$      0.513      4.404      $\lambda_+$   36.620     0.857

Note: Parameter Estimates of Model (3.1). The model is estimated using S&P 500 equity-index
option data sampled every Wednesday over the period January 1996-July 2010 and the parameters η, $c_u^+$,
$c_0^-$ and $c_2^-$ are all set to zero. Parameters are reported in annualized return units.
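As a sanity check, the Table 4 point estimates can be run through the covariance stationarity constraints on κ1, κu and σi listed in footnote 7. The Python sketch below does this with the rounded values from the table, so margins that are tight (such as the one for σ1) should be read with that rounding in mind. The dictionary keys and the function name are hypothetical.

```python
def stationarity_checks(p):
    """Evaluate the kappa_1, kappa_u and sigma_i constraints for a parameter dict."""
    lam2 = p["lam_minus"] ** 2
    # Lower bound on kappa_1 from the feedback of negative jumps into V1.
    bound_k1 = 2.0 * p["c1_minus"] * p["mu1"] / lam2
    # Lower bound on kappa_u from the feedback between V1 and U.
    bound_ku = (2.0 * p["kappa1"] * p["mu_u"]
                / (p["kappa1"] * lam2 - 2.0 * p["c1_minus"] * p["mu1"]))
    return {
        "kappa1": p["kappa1"] > bound_k1,
        "kappa_u": p["kappa_u"] > bound_ku,
        "sigma1": p["sigma1"] ** 2 <= 2.0 * p["kappa1"] * p["vbar1"],
        "sigma2": p["sigma2"] ** 2 <= 2.0 * p["kappa2"] * p["vbar2"],
        "bound_k1": bound_k1,
        "bound_ku": bound_ku,
    }

# Rounded point estimates from Table 4.
table4 = {
    "kappa1": 10.989, "vbar1": 0.003, "sigma1": 0.249, "mu1": 12.158,
    "kappa2": 1.864, "vbar2": 0.010, "sigma2": 0.170,
    "kappa_u": 0.0877, "mu_u": 7.124,
    "c1_minus": 111.061, "lam_minus": 25.944,
}
checks = stationarity_checks(table4)
```

With these inputs the lower bounds come out at about 4.01 for κ1 and 0.033 for κu, both below the corresponding estimates, and the two σ constraints hold as well.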

Turning to the left tail factor, we find U to have an estimated half-life of 8 years under the
risk-neutral measure, thus far exceeding those of the volatility factors. Hence, this pure jump factor
has the potential to impact the option prices substantially across both short and long maturities.
Comparing our results with option-based estimation of multi-factor volatility models, reported in
Bates (2000) and Christoffersen et al. (2009), we find that the mean-reversion of the volatility in
our model is far stronger than reported in the above-mentioned papers. This is “compensated” by
the presence of the very persistent U factor. We recall that in the traditional volatility models,
the left and right jump intensities are proportional to the volatility factors. In our case, through
the introduction of U , we drive a wedge between volatility and jump intensity and the option data
identifies the latter as the more persistent one. Intuitively, since U controls the left jump tails, this
result stems from the relative expensiveness of the OTM long maturity puts across our sample, i.e.,
the comparatively slow flattening out of the IV skew at long maturities.
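The half-lives quoted for the three factors appear to be consistent with the simple formula ln 2/κ applied to the mean-reversion estimates in Table 4. The Python sketch below performs this conversion. It deliberately ignores the feedback from jumps into the drift of V1 and U, which is an assumption of the example rather than a statement about how the reported figures were obtained.

```python
import math

def half_life_years(kappa):
    """Half-life of a shock under linear mean reversion at annualized speed kappa."""
    return math.log(2.0) / kappa

# Mean-reversion estimates from Table 4.
kappa = {"V1": 10.989, "V2": 1.864, "U": 0.0877}

half_lives = {
    "V1_weeks": 52.0 * half_life_years(kappa["V1"]),
    "V2_months": 12.0 * half_life_years(kappa["V2"]),
    "U_years": half_life_years(kappa["U"]),
}
```

The resulting values are roughly 3.3 weeks for V1, 4.5 months for V2 and 7.9 years for U, which highlights how much more slowly the left tail factor decays than either volatility component.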
18
The value of this correlation coefficient varies with the relative size of the two factors. The reported value is
obtained for the spot variances equaling their unconditional time series means. In light of the reported standard
errors, this strongly negative coefficient is consistent with recent evidence exploiting an entirely different approach
based on joint high-frequency data for volatility indices and underlying asset returns, see, e.g., Andersen et al. (2014).

5 Diagnostic of the Option Panel Fit and Model Performance

In this section, we explore whether model (3.1) provides a satisfactory characterization of the option
surface dynamics. For this purpose, we first consider a sequence of more standard specifications
as competitor models. They range from an extended version of the one-factor double-jump model
of Duffie et al. (2000) to a model including three stochastic volatility factors. The one-factor
model (1FGSJ) features square-root volatility, correlation between Gaussian return and exponen-
tial volatility co-jumps, and has a time-varying jump intensity given as an affine function of the
volatility state. To allow additional flexibility, we also consider the two (stochastic volatility) factor
version of the double-jump model, and we explore both a Gaussian (2FGSJ model) and a double-
exponential (2FESJ model) jump distribution for the return jumps. Finally, we expand the most
successful in-sample two-factor model into a three-factor return-volatility co-jump model (3FESJ-
V) by introducing a third square-root volatility factor, thus expanding the number of factors in
the traditional manner. The exact specifications for each of the alternative models and the cor-
responding in-sample parameter estimates and RMSE fit to the option panel are provided in the
Supplementary Appendix. Generally speaking, they represent more elaborate specifications than
estimated in the literature, and thus afford an improved in-sample fit to the option surface relative
to many of the models considered in prior studies.
At this stage, we simply note that model (3.1) has a considerably lower overall RMSE to the
option panel of 1.71% compared to RMSE values ranging from 3.14% for the one-factor Gaussian
double-jump model to 1.86% for the three-factor return-volatility co-jump model.19 Thus, model
(3.1) provides a comparatively good fit to the surface. However, it is less evident whether this
translates into an improved characterization of the key dynamic factors determining the evolution
of, and associated risks related to, the IV surface as well as the corresponding risk premiums.
To this end, in this section, we provide additional details concerning the model fit to the option
characteristics as well as the fit to the nonparametric volatility estimates from high-frequency data.
We conclude the section with various robustness checks.

5.1 State Vector Dynamics, Volatility Surface Dynamics, and Jump Intensities

We first analyze the state vector in our model. Table 5 reports summary statistics for the extracted
time series of the three factors. We find that the jump volatility factor, V1 , is the dominant
component of the diffusive volatility under the physical probability measure, contributing roughly
19
Using the asymptotic theory developed in Andersen et al. (2013), these differences in RMSE are highly statistically
significant.

2/3 to its total mean. The two volatility factors also exhibit some negative correlation (of order
−0.3), while V1 is only slightly less persistent than V2 under the statistical measure. Next, the
mean of U implies it contributes an average of 3.7 negative jumps per year to the negative jump
intensity. Thus, given the average negative jump arrival of 5.6, U is the dominant driver of the
negative jump tail. Moreover, the sample standard deviation of U indicates a substantial degree of
variation over time, inducing significant fluctuations in the negative jump tail and the associated
(high) price for market downside protection. In addition, we find that, also under P, U is, by far,
the most persistent of the three state variables. Finally, U exhibits nontrivial positive correlation
with V1 and slight negative correlation with V2 . This is consistent with our risk-neutral model (3.1)
in which V1 and U are connected through the jumps and the feedback effect induced by the jump
intensities of the two state variables.20

Summary statistics
      Mean    Std     Skew    Kurt     AC(1)   AC(15)   AC(30)   Q(0.25)   Q(0.50)   Q(0.75)
V1    0.017   0.035   5.037   36.687   0.884   0.250    -0.006   0.0001    0.0053    0.0162
V2    0.012   0.009   0.737   3.363    0.840   0.369    0.102    0.0052    0.0103    0.0180
U     3.710   3.347   1.861   7.372    0.973   0.591    0.360    1.2538    2.8982    4.9409

Table 5: Summary Statistics for the Recovered State Vector. AC denotes autocorrelation
and Q signifies quantile. Sample averages are for the period January 1996 - July 2010.
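
The entries in Table 5 are standard sample moments, autocorrelations, and quantiles. As a minimal sketch of how such a table could be computed for a daily factor series, the Python function below uses NumPy only. The function names, the biased autocorrelation estimator, and the moment conventions (non-excess kurtosis, skewness from population moments, standard deviation with one degree of freedom removed) are assumptions made here for illustration; the text does not state which conventions underlie the reported numbers.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation at `lag` (biased estimator; an assumed convention)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[lag:] * d[:-lag]) / np.sum(d * d))

def summary_stats(x, lags=(1, 15, 30), quantiles=(0.25, 0.50, 0.75)):
    """Return a dict with the column layout of Table 5 for one series."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    out = {
        "Mean": float(x.mean()),
        "Std": float(x.std(ddof=1)),
        "Skew": float(np.mean(d ** 3) / m2 ** 1.5),
        "Kurt": float(np.mean(d ** 4) / m2 ** 2),  # non-excess kurtosis
    }
    for k in lags:
        out[f"AC({k})"] = autocorr(x, k)
    for q in quantiles:
        out[f"Q({q:.2f})"] = float(np.quantile(x, q))
    return out
```

Applied to each of the extracted series V1, V2 and U in turn, the function returns one row of the table. Every requested lag must be shorter than the series.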

To gain further insight into the dynamics implied by the model estimates, in Figure 3, we plot
the model-implied spot diffusive volatility, given by V1 + V2 , and the corresponding high-frequency
nonparametric estimate in the upper panel, while the model-implied estimate of U is depicted
in the lower panel. The plots are intriguing along several dimensions. First, the option-implied
estimate of the diffusive volatility has approximately the same sample mean as the nonparametric
high-frequency estimate, but it is far less noisy, illustrating the potential gains from incorporating
option information into volatility inference. Second, we notice spikes in market volatility and –
even more strikingly – in U around well-known crises. Third, the negative jump intensity factor U
portrays a very different picture than the diffusive volatility. For example, the peaks of U in the
1997 and 1998 crises as well as the European sovereign debt crisis match or exceed those observed
from 2000 through 2003, yet the model-implied volatility is substantially higher in the latter than
20
However, Table 5 concerns the statistical measure while model (3.1) characterizes the risk-neutral measure. In
general, no-arbitrage does not restrict the relation between the moments under the two measures.

the former episodes. That is, the alternating shapes of the option surface signal that these events
represent very different types of exposure to volatility and negative jump risks. Fourth, as expected,
the financial crisis stands out from the rest. Moreover, the jump intensity factor U mean-reverts
much more slowly than volatility following this crisis. The same pattern is observed across all the
episodes identified above in which the jump intensity rises strongly relative to volatility. This
implies that U contributes (relatively) more to the expensiveness of OTM puts in the aftermath
of the Asian, Russian, and European crises than market volatility. The opposite is true for the
turbulent episodes surrounding 9-11 and the Second Gulf War in 2002, where market volatility
clearly is the dominant force. Likewise, the initial spike in the jump intensity during the financial
crisis was not caused solely by the U factor. Instead, U is accountable for the subsequent slow
mean reversion. Fifth, we notice that, even during the quiet period 2004-2006, the negative jump
intensity factor U remains at a nontrivial level, implying approximately 1 jump per year.

[Figure 3 plot omitted. Top panel: "Spot volatility"; bottom panel: "Ut"; horizontal axis: years 1997-2013.]

Figure 3: Spot Volatility and the U Factor. The top panel displays the spot diffusive volatility
estimated from the high-frequency data (the dark line) and from the option panel (the light-colored line).
The bottom panel displays the estimate of √U.

5.2 Fitting the Option Characteristics

Next, we investigate the model’s success in capturing the dynamics of the option panel. Recall
from Section 2 that the first principal component of the option surface explains more than 96% of
the total variation, but the PCs fail to succinctly capture the dynamic dependencies of the option
characteristics. The first principal component provides a fit (in RMSE) to the set of standardized
options used in the PC analysis of 3.38%. We can compare this RMSE to the ones obtained by
our parametric no-arbitrage models, reported in Table 6 below. Even the simplest one-factor
1FGSJ model provides a significantly improved fit of 2.67%, and the RMSE drops further to 1.40% for
our preferred three-factor model. This is due to the nonlinear factor structure for the option panel
implied by the parametric models (recall footnote 2). For the remainder of this section, we analyze
the success of the parametric models in tracking the dynamics of the IV surface by focusing on
their ability to fit the option characteristics.
In Figure 4, we plot the characteristics along with the model-implied fit. Perhaps not surprisingly,
the model provides a near-perfect fit to the IV level, both for periods of turmoil and relative
tranquility. The fit to the IV skew is also quite satisfactory, although the model slightly overestimates
the skew during 1999 and tends to underestimate it at the peaks of financial crises. Nonetheless,
there is no evidence of major systematic biases, and the relative errors are small except for the Fall
of 2008.21 Turning to the IV term structure, we observe, from the top right panel of Figure 4, that
our model provides an almost perfect fit to this characteristic. Finally, the most challenging feature
of the surface is the IV skew term structure. Nonetheless, the model fits this feature well except for
two noticeable periods, namely the aftermath of the 1998 Russian crisis and the recent financial
crisis. During these episodes, our model predicts a somewhat steeper IV skew term structure than
actually observed. Intuitively, this indicates that the negative jump tails need to be even more
persistent, during these two periods, than the model implies in order to deliver the slower thinning of the
risk-neutral tails along the maturity dimension of the conditional return distribution.

To benchmark the success of our model (3.1) in capturing the option surface characteristics,
we now compare it to the fit provided by the alternative stochastic volatility models discussed at
the beginning of this section. The results are presented in Table 6. We stress that our estimation
21
Our finding of underestimation of the skew during highly turbulent periods is consistent with recent nonparametric
evidence in Bollerslev and Todorov (2013) of time-variation in the shape of the jump tails. Accounting for such a feature
of the data will alleviate the slight mispricing during such periods, observed on Figure 1, but it will take us outside
the tractable generalized affine framework. Note that some of the time variation in the shape of jump tails can be
partially offset by very high values of the U factor, as the latter is not constrained to be part of volatility. Furthermore,
the measurement errors in the option prices during crisis periods are exceptionally large, so gauging model quality
by the size of the pricing errors during these periods is potentially misleading.

[Figure 4 plot omitted. Panels: "Implied Volatility Level" (top left), "Implied Volatility Term Structure" (top right), "Implied Volatility Skew" (bottom left), "Implied Volatility Skew Term Structure" (bottom right); horizontal axis: years 1998-2010.]

Figure 4: The Model-Implied Fit to the Option Characteristics. The dark line corresponds
to the data and the light-colored line refers to the fit by model (3.1).

does not minimize the distance between the observed and model-implied option characteristics.
Nevertheless, our model provides a superior fit to each of the four option characteristics. The
improvement is quite significant compared to one- and two-factor models with Gaussian price
jumps. Moreover, the improved fit offered by our model is not solely attributable to the addition
of an extra factor. Indeed, the alternative three-factor model performs significantly worse at fitting
the term structure of the IV level and skew. Furthermore, removing U (leading to the model
2FESJ) produces a significant deterioration in the fit to the IV skew. Finally, our model improves
substantially on the 1FGSJ and 2FGSJ models in fitting the spot volatility and is essentially on par
with the rest of the alternative specifications in this regard.

Model Fit
                             1FGSJ   2FGSJ   2FESJ   3FESJ-V   3F
IV Level                     1.73    1.01    0.88    0.68      0.64
IV Term Structure            2.72    1.42    1.06    1.30      0.86
IV Skew                      3.81    2.88    2.54    2.13      1.96
IV Skew Term Structure       2.79    3.70    2.01    2.57      2.40
Spot Volatility              5.199   4.130   3.944   3.945     3.977
Options with Fixed m, τ      2.67    2.10    1.89    1.58      1.40
All Options                  3.14    2.56    2.07    1.86      1.71

Table 6: Fit to Options, Characteristics and Spot Volatility. The numbers in the table
are the RMSEs (in percent) from fitting the IV surface characteristics (first four rows), the spot
volatility (fifth row), the implied volatilities for fixed moneyness m ∈ {−4, −3, −2, −1, 0, 1, 2}
and tenor τ ∈ {0.1, 0.3, 0.8} (sixth row), and all options used in the estimation (seventh row).
The models in the comparison are defined as follows. 1FGSJ (2FGSJ) refers to the One (Two)
Factor Gaussian Jump model with time-varying jump intensity; 2FESJ refers to the Two-factor
Exponential jump model with time-varying jump intensity; 3FESJ-V refers to the Three-Factor
Exponential jump model with time-varying jump intensity; 3F refers to the Three-factor
Exponential jump model with the separate jump factor, U in equation (3.1). All models are explicitly
defined in the Supplementary Appendix.

The distinguishing feature of model (3.1) is the presence of U which resides only in the negative
risk-neutral jump intensity. We now explore the specific role of U in driving the surface dynamics.
To this end, in Figure 5, we plot the fitted IV surface characteristics as well as their sensitivity with
respect to U at the current values of the state vector for each day in the sample. These sensitivities
are measured via the change of the characteristics stemming from increases and decreases in U by

50% from its estimated value. For the IV level, U has a limited impact except for crisis periods
when volatility is very high. This is to be expected as short-term ATM IV is primarily determined
by the level of the diffusive volatility (recall, U is absent from the latter). On the other hand, U has
a very pronounced and more uniform impact on the IV skew. That is, the significance of U for the
skew is ubiquitous and remains nontrivial, even during the tranquil period of 2004-2006. Turning
to the IV term structure in the top right panel of Figure 5, we observe strong time-variation in
the impact of U , with the effect being most substantial following the Asian, Russian and recent
financial crises. In contrast, U has a very limited effect on the IV term structure during 2004-
2006. This is not surprising. The intensity of the volatility jumps in model (3.1) depends on U .
Hence, a higher value of U today triggers higher expected future diffusive volatility, and given the
persistence of U under Q, this effect is long-lasting and tends to elevate the IV term structure
following the crises of 1997, 1998 and 2008. Finally, U has a relatively small impact on the IV skew
term structure, except when U is very high. Overall, Figure 5 documents that U has a significant
impact on both the term structure and the skew. The impact on the skew is largely immune to the
state of the system, while the impact on the term structure is concentrated in extended periods
following some of the crises in the sample. This effect of U on the surface helps account for the
dynamic interdependencies among the characteristics documented in Section 2.
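
The sensitivity bands described above follow a bump-and-revalue logic: U is scaled down and up by 50% while the other states stay fixed, and the characteristic is recomputed each time. A minimal Python sketch of that logic is given below. The helper `bump_band` and the map `toy_skew` are hypothetical names introduced here; in particular, `toy_skew` is an arbitrary linear stand-in and not the pricing map of model (3.1).

```python
def bump_band(characteristic, state, name="U", rel_bump=0.5):
    """Recompute a characteristic after scaling state[name] by (1 - rel_bump) and (1 + rel_bump).

    Returns (low, base, high), where low/high bound the two bumped values.
    """
    down = dict(state, **{name: state[name] * (1.0 - rel_bump)})
    up = dict(state, **{name: state[name] * (1.0 + rel_bump)})
    bumped = (characteristic(down), characteristic(up))
    return min(bumped), characteristic(state), max(bumped)

def toy_skew(state):
    """Hypothetical stand-in for a model-implied skew measure (illustration only)."""
    return 0.05 + 0.5 * (state["V1"] + state["V2"]) + 0.02 * state["U"]

# State set to the sample means reported in Table 5.
state = {"V1": 0.017, "V2": 0.012, "U": 3.71}
low, base, high = bump_band(toy_skew, state)
```

Repeating the call for every day in the sample, with that day's extracted state, would trace out a shaded band of the kind shown in Figure 5.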

5.3 Robustness of Estimation Results and Inference

We next explore the robustness of our empirical findings. First, we check the sensitivity with
respect to the penalization term for the volatility fit in the objective function. We vary λ around
the original choice of 0.2 by 50% in either direction. It is useful when interpreting the results below
to keep in mind that a misspecified model will face tension in simultaneously fitting the option
surface and matching the high-frequency based estimate of spot volatility. The results are reported
in the Supplementary Appendix. We note the remarkable stability of the parameters with the
exception of σ2 and c_0^+, for which the changes are somewhat large compared with the precision in
their recovery. Hence, there is some evidence for misspecification, but the critical features of the
system remain virtually unchanged. Most significantly, the extracted tail factor U is unaffected
which is crucial for our analysis of the risk premium dynamics later on.
Second, the Supplementary Appendix reports results for subsample estimation covering 1996-
2006 and 2007-2010. The fit is excellent for the longer 1996-2006 subsample, with an overall RMSE
of 0.99%. While this sample excludes the financial crisis, and thus poses less of a challenge to the
option pricing model, the sample still covers some dramatic episodes around 1997, 1998 and the

[Figure 5 plot omitted. Panels: "Implied Volatility Level" (top left), "Implied Volatility Term Structure" (top right), "Implied Volatility Skew" (bottom left), "Implied Volatility Skew Term Structure" (bottom right); horizontal axis: years 1998-2010.]

Figure 5: The effect of U on the Option Characteristics. The line corresponds to the fitted
characteristics and the shaded area indicates the change in the characteristics stemming from a
decrease versus an increase in U by 50% relative to the current estimated value. To better capture
the effect of U on the option characteristics, we use only the days in the sample for which the
shortest maturity of the options is less than 10 business days in the computation of the shaded
area.

internet bubble, along with some extremely low volatility levels during 2004-2006. The shorter and
extremely turbulent 2007-2010 period is much harder to accommodate, and the RMSE is now 2.16%.
We do observe some degree of parameter instability, mainly impacting the risk-neutral means of
the two volatility factors, v̄1 and v̄2, which tend to move in opposite directions, as well as the
loadings on the volatility factors in the jump intensities. This is indicative of some misspecification
in the modeling of diffusive volatility. Nonetheless, the parameters λ− and λ+ , governing the jump
tail decays, are remarkably stable across subsamples. Likewise, the estimates of µ1 , controlling the
volatility jumps, are stable. Most importantly for our analysis, the extracted realization of U,
across the entire sample, remains nearly invariant regardless of whether the full-sample or subsample
parameter estimates are employed.

6 Out-of-Sample Performance

Our model (3.1) is quite richly parameterized, containing three state variables along with the 18
separate parameters reported in Table 4. This may raise concerns regarding potential in-sample
overfitting. We address this issue by exploring the out-of-sample performance vis-a-vis the more
parsimoniously parameterized alternatives introduced previously in Section 5. If the dynamic fea-
tures extracted from the option surface through model (3.1) are genuine and stable, the model
should continue to provide a superior fit also for the options observed beyond the in-sample period.
To assess the robustness of model (3.1) relative to the alternative specifications, we use the
parameters for each model estimated over the period January 1st 1996 - July 21st 2010 to price the
options week-by-week over the subsequent period from July 22nd 2010 to April 23rd 2013, using
the criterion function from equation (4.1), but optimizing only over the state vector. This out-of-
sample period contains quotes for over 100,000 separate options, representing more than 56% of
the number of in-sample observations.22 The average IV (across all options) in the out-of-sample
period equals 24.17%, while the average ATM IV is 18.40%.
From Figure 3, we note that spot volatility filtered from model (3.1) in the out-of-sample period
continues to provide an excellent fit to the spot volatility estimated from high-frequency returns.
Thus, effectively, we decompose the factors governing the option surface into current spot volatility
and a separate left jump tail factor, U . The extracted volatility states provide an excellent basis for
22
The sharp increase to over 700 quotes per day for the out-of-sample analysis is due to two factors. One, there are
more options quoted at a given tenor - thus filling gaps in the moneyness dimension. Two, the introduction of weekly
options increases the number of tenors available on each day - thus enriching the data in the maturity dimension.
Weekly options were introduced in 2005, but only by 2011 did the associated volume become a significant fraction
(almost 20%) of the total trading volume for SPX options. As a reference, for January 1996 – December 1998, we
have 158 contracts per day, while we have 490 contracts per day between January 2008 and July 2010.

forecasting future realized return volatility and jumps, thus freeing U to accommodate the portion
of the risk premium dynamics that is not tightly related to the volatility states.
The out-of-sample fit to the option surface for the various models is summarized in Table 7.
Panel A provides the overall root-mean-squared error (RMSE) and Panel B indicates the performance
for specific regions of the surface. On the right in Panel B, the RMSE for model (3.1) is
given, while the remaining entries provide the percentage excess RMSE of the alternative specifica-
tions relative to model (3.1). Thus, positive entries signify the degree to which the models perform
worse than model (3.1) for that part of the option surface during the out-of-sample period.
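
The relative measure used in Panel B of Table 7 is simple arithmetic: the RMSE of an alternative model divided by that of model (3.1), minus one. The short Python sketch below applies this definition to the Panel A figures of Table 7 and shows how a Panel B entry can be converted back into an RMSE level. The function names `rrmse` and `implied_rmse` are introduced here for illustration only.

```python
def rrmse(rmse_model, rmse_benchmark):
    """Percentage excess RMSE of a model relative to the benchmark (as a fraction)."""
    return rmse_model / rmse_benchmark - 1.0

def implied_rmse(rrmse_value, rmse_benchmark):
    """Invert the definition: recover a model's RMSE from its RRMSE."""
    return (1.0 + rrmse_value) * rmse_benchmark

# Overall out-of-sample RMSEs (in percent) from Panel A of Table 7.
panel_a = {"1FGSJ": 3.28, "2FGSJ": 2.66, "2FESJ": 2.25, "3FESJ-V": 2.26, "3F": 1.77}
excess = {m: rrmse(v, panel_a["3F"]) for m, v in panel_a.items() if m != "3F"}

# Panel B, deep OTM puts (m <= -3) at short maturity: 1FGSJ has RRMSE 0.68 and 3F has RMSE 2.28.
rmse_1fgsj_deep_otm = implied_rmse(0.68, 2.28)
```

Under this definition, an entry of 0.68 means that the alternative model's RMSE in that region is 68% larger than that of model (3.1).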
The results are striking. The superiority of model (3.1) is more pronounced out-of-sample
than in-sample. Moreover, we see that the exponential specification for return jumps performs
significantly better than the Gaussian, while two-factor models offer nontrivial improvements over
one-factor representations. However, none of these more standard specifications come close to
matching the performance of model (3.1). Furthermore, we observe that the traditional three-
factor stochastic volatility model performs comparatively poorly out-of-sample, with the two-factor
model doing equally well or slightly better, reflecting some degree of in-sample overfit for the former
model. Thus, in fact, there is a danger of over-parameterizing the in-sample specification, but this
does not manifest itself for model (3.1). Moving to the individual regions of the surface, Panel B
documents that model (3.1) outperforms every other model for each single region of the surface, so
the superior out-of-sample fit is uniform. Finally, in Panel C we observe that we fit, out-of-sample,
the spot volatility far better than any of the alternative models. We conclude that the risk-neutral
dynamics obtained from the in-sample period remain stable and continue to capture the salient
features of the option surface dynamics beyond the estimation period.
To illustrate the type of scenarios for which the improvement of model (3.1) relative to the more
standard one-factor Gaussian jump model, 1FGSJ, is particularly significant, Figure 6 depicts the
fit of the two models to the IV skew and term structure on two separate trading days. December
19, 2012, represents a fairly typical day with an about average skew and term structure. For this
day, the fit of either model is quite satisfactory, and there is no major discrepancy between the two.
In contrast, for September 8, 2010, the skew is elevated although the level of volatility is moderate.
It represents a quite common occurrence in the aftermath of turbulent market conditions – in
this case associated with the initial European sovereign debt crisis. On this date, the one-factor
model misses the ATM IV level quite badly, and it provides a very poor fit to the term structure.
The problem is that the steep skew can only be accommodated through a high jump intensity
which, in the standard model, is only feasible if volatility is high. Hence, the fit to the ATM

Panel A: Overall Option RMSE
                 1FGSJ   2FGSJ   2FESJ   3FESJ-V   3F
RMSE             3.28    2.66    2.25    2.26      1.77

Panel B: Sorting by Moneyness and Maturity
                 1FGSJ (RRMSE)   2FGSJ (RRMSE)   2FESJ (RRMSE)   3FESJ-V (RRMSE)   3F (RMSE)
τ                ≤ 60    > 60    ≤ 60    > 60    ≤ 60    > 60    ≤ 60    > 60      ≤ 60    > 60
m ≤ −3           0.68    1.16    0.60    0.48    0.09    0.33    0.14    0.30      2.28    2.76
−3 < m ≤ −1      0.49    0.71    0.32    0.34    0.14    0.33    0.03    0.47      1.53    1.95
−1 < m ≤ 1       1.43    0.39    0.99    0.30    0.74    0.08    0.67    0.16      0.91    1.36
m > 1            1.77    0.70    0.89    0.38    0.98    0.15    0.82    0.07      1.24    1.86

Panel C: Spot Volatility RMSE
                 1FGSJ   2FGSJ   2FESJ   3FESJ-V   3F
RMSE             6.23    5.01    4.05    4.27      3.82

Table 7: Out-of-sample Option Pricing Performance and Volatility Fit. We report RMSEs
in implied volatility (in percent). The abbreviations for the different models are as in Table 6.
RRMSE is the ratio of the RMSE of a given model over that of the 3F model minus one.

volatility is sacrificed in order to fit the OTM put options. Moreover, at an elevated volatility
level, the mean-reversion in volatility prevents the term structure from being sufficiently steep.23
In contrast, with the flexibility afforded by U , model (3.1) can readily accommodate a persistent
state with an elevated intensity for the left jump tail. Furthermore, the cross-excitation between
jumps and volatility is sufficient to generate an increase in the expected return variation over time,
thus adapting to the relatively steep slope of the term structure.
The key to the success of our model is the severance of the linkage between jump intensity and
volatility which is, a priori, imposed in most prior work, as discussed in Section 3. In using standard
models for the analysis of risk premia, one will inevitably treat the systematic mispricing of the IV
skew and term structure, documented in the bottom panels of Figure 6, as persistent observation
23
We have confirmed that identical problems plague, e.g., the 2FESJ model in this type of scenario.

error. This problem is recurring. Throughout the in-sample period, we observe similar qualitative
developments following the Asian and Russian crises, and the 2008-2009 financial crisis, while for
the out-of-sample period, the second round of the European debt crisis in late 2011 also generates
this type of persistent dynamic in the surface. Hence, using the standard modeling framework will
induce systematic biases in the inference for risk premia.

7 Risk Premia Dynamics and Predictability

This section relates our findings based on the option panel to the underlying return data. In
particular, we study the links between the state vector, or risk factors, extracted from the option
panel and the volatility and jump risks inferred from the underlying stock returns. This sets the
stage for direct exploration of the equity and variance risk premiums and their association with
the option-implied factors. Given our analysis, we rely on model (3.1) for tracking the IV surface
dynamics. We continue to avoid making strong assumptions regarding the evolution of the actual
market risks. Consequently, this part of our analysis is fully nonparametric, and we invoke only
minimal stationarity conditions regarding the P-law.

7.1 Connecting the Information in the Option Panel and the Underlying Asset

To define risk premia, we must develop consistent notation concerning the pricing of each source
of risk in model (3.1). We define,
$$
\begin{aligned}
W^{P}_{1,t} &= W^{Q}_{1,t} - \int_0^t \lambda^{W_1}_s \, ds, &\qquad
W^{P}_{2,t} &= W^{Q}_{2,t} - \int_0^t \lambda^{W_2}_s \, ds, \\
B^{P}_{1,t} &= B^{Q}_{1,t} - \int_0^t \lambda^{B_1}_s \, ds, &\qquad
B^{P}_{2,t} &= B^{Q}_{2,t} - \int_0^t \lambda^{B_2}_s \, ds,
\end{aligned}
\tag{7.1}
$$

where $W^{P}_{1,t}$, $W^{P}_{2,t}$, $B^{P}_{1,t}$ and $B^{P}_{2,t}$ are P Brownian motions and $\lambda^{W_1}_t$, $\lambda^{W_2}_t$, $\lambda^{B_1}_t$ and $\lambda^{B_2}_t$ denote the
associated prices of risk. The compensator of the jump measure, µ, under the P measure is given
by $dt \otimes \nu^{P}_t(dx, dy)$, and the mapping $\nu^{P}_t(dx, dy) \to \nu^{Q}_t(dx, dy)$, defined for every jump size x and
y and every point in time t, reflects the compensation for jump risk.
The dynamics of the stock price process under the physical probability measure P is then,
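$$
\frac{dX_t}{X_{t-}} = \alpha_t \, dt + \sqrt{V_{1,t}} \, dW^{P}_{1,t} + \sqrt{V_{2,t}} \, dW^{P}_{2,t}
+ \int_{\mathbb{R}^2} (e^x - 1) \, \tilde{\mu}^{P}(dt, dx, dy),
\tag{7.2}
$$

where

$$
\alpha_t - (r_t - \delta_t) = \lambda^{W_1}_t \sqrt{V_{1,t}} + \lambda^{W_2}_t \sqrt{V_{2,t}}
+ \int_{\mathbb{R}^2} (e^x - 1) \, \nu^{P}_t(dx, dy) - \int_{\mathbb{R}^2} (e^x - 1) \, \nu^{Q}_t(dx, dy)
\tag{7.3}
$$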
[Figure 6 plot omitted. Each row has two panels: "Skew" (implied volatility against moneyness) and "Term Structure" (implied volatility against maturity).]

Figure 6: Out-of-sample Fit to IV Skew and Term Structure. The figure plots the fit to the
IV skew and term structure for the trading days December 19, 2012 (upper row) and September
8, 2010 (bottom row). The circles represent the observed Black-Scholes implied volatilities; the
continuous line is the fit from model 3F; the dashed line is the fit from model 1FGSJ.

is the spot equity risk premium, reflecting compensation for diffusive and (price) jump risks.
The conditional cum-dividend equity risk premium over the horizon τ is thus given by,24
$$
ERP^{\tau}_t \equiv \frac{1}{\tau} \, E^{P}_t \left[ \int_t^{t+\tau} \big( \alpha_s - (r_s - \delta_s) \big) \, ds \right].
\tag{7.4}
$$

We next define the quadratic variation over [t, t + τ ] which we denote QVt,t+τ . It captures the
return variation over the given horizon and is given by,
$$
\begin{aligned}
QV_{t,t+\tau} &= QV^{c}_{t,t+\tau} + QV^{j}_{t,t+\tau}, \\
QV^{c}_{t,t+\tau} &= \int_t^{t+\tau} (V_{1,s} + V_{2,s}) \, ds, \qquad
QV^{j}_{t,t+\tau} = \int_t^{t+\tau} \int_{\mathbb{R}^2} x^2 \, \mu(ds, dx, dy),
\end{aligned}
\tag{7.5}
$$

where we decompose the return variation into terms generated by the continuous and jump
component of X, $QV^{c}_{t,t+\tau}$ and $QV^{j}_{t,t+\tau}$. Further, note that the quadratic variation is independent of
the probability measure. The variance risk premium is defined as,
$$
VRP^{\tau}_t \equiv \frac{1}{\tau} \left[ E^{P}_t (QV_{t,t+\tau}) - E^{Q}_t (QV_{t,t+\tau}) \right],
\tag{7.6}
$$
and is compensation for the variance risk in X.
We are also interested in assessing directly the risks and risk premiums associated with jumps.
In particular, we want to gauge the compensation for large price jumps and to allow for a separate
risk premium for the negative versus positive jumps. We obtain direct measures of the jump risks
by simply counting the number of “big” jumps over the relevant horizon,
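$$
LT^{K}_{t,t+\tau} \equiv \int_t^{t+\tau} \int_{\mathbb{R}^2} 1_{\{x \le -K\}} \, \mu(ds, dx, dy), \qquad
RT^{K}_{t,t+\tau} \equiv \int_t^{t+\tau} \int_{\mathbb{R}^2} 1_{\{x \ge K\}} \, \mu(ds, dx, dy),
\tag{7.7}
$$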

where K is a prespecified threshold. We set K = 0.5% in the subsequent analysis.25


The risk measures we have introduced, including $QV^{c}_{t,t+\tau}$, $QV^{j}_{t,t+\tau}$, $LT^{K}_{t,t+\tau}$ and $RT^{K}_{t,t+\tau}$, are not
directly observable, but we can estimate them using high-frequency and overnight futures returns.
Naturally, since intraday data are available only during active trading, our high-frequency measures
pertain exclusively to the trading intervals within [t, t + τ ]. We denote the latter with superscript i
(for intraday based measures), e.g., $QV^{i}_{t,t+\tau} = \sum_{s=t}^{t-1+\tau} QV_{s+\pi,s+1}$, where π denotes the fraction of
the day corresponding to the overnight period, and 1 − π indicates the length of the trading day.
$QV^{c,i}_{t,t+\tau}$, $QV^{j,i}_{t,t+\tau}$, $LT^{K,i}_{t,t+\tau}$ and $RT^{K,i}_{t,t+\tau}$ are defined analogously. By now, there are standard methods
for constructing estimates for these risk measures as well as the corresponding equity and variance
risk premia. We provide the details regarding our empirical implementation in the Appendix.
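
To make the counting measures in (7.7) concrete, the Python sketch below classifies high-frequency log returns by the fixed threshold K = 0.5% and splits the realized variation accordingly. It is a deliberately crude illustration and not the estimator described in the Appendix: treating every return beyond the threshold as a jump, and every other return as diffusive, is an assumption made here, as are the function name and the output keys.

```python
import numpy as np

def tail_counts_and_variation(returns, K=0.005):
    """Count 'big' negative/positive moves and split the sum of squared returns.

    returns: iterable of high-frequency log returns over the trading intervals.
    K: fixed jump threshold (0.5%, as in the text).
    """
    r = np.asarray(returns, dtype=float)
    is_left = r <= -K          # indicator 1{x <= -K}
    is_right = r >= K          # indicator 1{x >= K}
    is_jump = is_left | is_right
    return {
        "LT": int(is_left.sum()),                  # count of large negative moves
        "RT": int(is_right.sum()),                 # count of large positive moves
        "QV_c": float(np.sum(r[~is_jump] ** 2)),   # crude proxy for continuous variation
        "QV_j": float(np.sum(r[is_jump] ** 2)),    # crude proxy for jump variation
    }
```

Summing the daily outputs over the days in [t, t + τ] would give horizon-level analogues of the intraday measures. Note that this proxy attributes all sub-threshold jumps to the continuous part, whereas the jump variation in (7.5) includes jumps of every size.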
24
Notice that since a long position in the market index involves a commitment of capital, a part of the wedge in the
P and Q expectations of the cum-dividend equity returns reflects compensation for the time-variation in the risk-free
rate. This term is, of course, absent if we instead define the equity risk premium using futures on the market index.
25
This (fixed) threshold K is large enough that we can separate returns exceeding this (absolute) level from diffusive
volatility using 1-minute observations. Experiments with alternative cutoffs produced similar results.

7.2 The Predictability of Equity and Variance Risk and Risk Premia

We now explore the relationship between the option-implied factors, V1 , V2 and U , driving the option
surface dynamics, and the various risk measures and risk premia associated with the underlying
asset. We rely on alternative versions of the following predictive regression,

$$
y_t = \alpha_0 + \alpha_1 V_{1,t} + \alpha_2 V_{2,t} + \alpha_3 U_t + \epsilon_t,
\tag{7.8}
$$

where the left hand side represents, in turn, the empirical jump and diffusive variance risk measures
and the risk premia, i.e., $y_t = \widehat{LT}^{K,i}_{t,t+\tau}$, $\widehat{RT}^{K,i}_{t,t+\tau}$, $\widehat{QV}^{c,i}_{t,t+\tau}$, $\widehat{QV}_{t,t+\tau}$,
$\log(X_{t+\tau}/X_t) - \frac{1}{\tau} \int_t^{t+\tau} (r_s - \delta_s) \, ds$ and $\widehat{VRP}^{\tau}_t$. Given the relationships explicated in Section 9.2 of the Appendix, it is
evident that the regressions based on the alternative yt variables, asymptotically in sample size,
yield estimates identical to those based on the corresponding infeasible measures of interest, i.e.,
$\sum_{s=t}^{t-1+\tau} \int_{s+\pi}^{s+1} \int_{\mathbb{R}^2} 1_{\{x \le -K\}} \, \nu^{P}_s(dx, dy)$,
$\sum_{s=t}^{t-1+\tau} \int_{s+\pi}^{s+1} \int_{\mathbb{R}^2} 1_{\{x \ge K\}} \, \nu^{P}_s(dx, dy)$, $QV^{c,i}_{t,t+\tau}$, $QV_{t,t+\tau}$, $ERP^{\tau}_t$ and
$VRP^{\tau}_t$. Thus, the predictive regression in equation (7.8) speaks directly to the linkages between
the option surface dynamics and the risks and risk premia associated with the equity index.
In general, if the premia for the diffusive and jump risks are spanned by the factors V1 , V2 and U ,
then the expectation of yt conditional on time t information will be functionally related to V1,t , V2,t
and Ut . Moreover, in the standard case, almost universally adopted in option pricing applications,
the measure change preserves the affine structure, so the conditional mean of yt is linear in V1,t , V2,t
and Ut . Hence, the regression in equation (7.8) produces optimal (mean-square error) predictors
for the volatility and jump risks at time t. Furthermore, conceptually, our extraction of V1,t , V2,t
and Ut provides a richer information set for forecasting the volatility and jump realizations than
the history of underlying asset returns. The latter, at best, generates estimates of the path for
{V1,s + V2,s }s≤t as well as associated jump variation measures.
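
A predictive regression of the form (7.8) can be sketched in a few lines of Python. The code below runs ordinary least squares with NumPy, first replacing U by its residual from a linear projection on V1 and V2 (the orthogonalization applied to U in the reported regressions), and also returns the R2 of the restricted regression on V1 and V2 alone. The function names are introduced here for illustration, and the sketch omits standard errors entirely: with overlapping horizons, inference would typically require autocorrelation-robust standard errors, and the reported ones also account for the estimation error in the projection.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept; returns (coefficients, residuals, R-squared)."""
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, resid, float(r2)

def predictive_regression(y, V1, V2, U):
    """Regress y on V1, V2 and the part of U orthogonal to (1, V1, V2)."""
    vols = np.column_stack([V1, V2])
    _, U_tilde, _ = ols(U, vols)                       # residual of U on V1, V2
    beta, _, r2 = ols(y, np.column_stack([vols, U_tilde]))
    _, _, r2_restricted = ols(y, vols)                 # regression without U
    return beta, r2, r2_restricted
```

Because the residualized factor spans the same space as U once V1 and V2 are included, its coefficient coincides with the coefficient on U in the unrestricted regression (the Frisch-Waugh-Lovell result), and the gap between the two R2 values measures the incremental explanatory power of U.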
We summarize the results from the predictive regressions in Figures 7 and 8. Since we are
particularly interested in the incremental role of the novel factor U , we initially project U linearly
onto the two volatility factors and denote the residual by Ũ. Consequently, Ũ reflects the features of
the system not associated with the traditional volatility factors. Given our relatively short sample,
we compute the predictive regressions for horizons up to one year only. Figure 7 shows that the
state variables, extracted from the option panel, have significant explanatory power for the future
evolution of risks. In particular, the plot pertaining to the count of positive and negative jumps
reveals that the jump intensities display highly predictable time-variation under the statistical
measure, P, i.e., νtP (dx, dy) is truly a function of t. Moreover, the state variables differ greatly in
their ability to forecast the future volatility and jump intensity. Specifically, once we control for

[Figure 7 plot omitted. Panels: Future negative jumps; Future positive jumps; Future continuous variation; Future Realized Volatility and Squared Overnight Return. Each panel shows t-statistics (top) and R2 (bottom) against the horizon in months.]

Figure 7: Predictive Regressions for Volatility and Jump Risks. The volatility and jump risk
measures are defined in (9.11)-(9.12). For each regression, the top panels depict the t-statistics for the
individual parameter estimates while the bottom panels indicate the regression R2 . The predictive variables
are V1 (dashed-dotted line), V2 (dashed line) and Ũ (solid line), where Ũ is the residual from the linear
projection of U on V1 and V2 . The regression standard errors are constructed to also account for the
estimation error in the projection generating Ũ. The dashed lines in the R2 plots correspond to constrained
regressions, including only V1 and V2 .

the volatility factors, Ũ provides no incremental explanatory power. This is evident both from the
insignificant t-statistics corresponding to Ũ and the trivial drop in R2 when we exclude Ũ from
the regressions. Hence, Figure 7 is consistent with a model for which the jump intensity, under P,
depends only on the volatility factors V1 and V2 , but not Ũ.26 Finally, since the empirical detection
26
In general, since V1 contains jumps of time-varying intensity that load on all the factors, V1 , V2 and U , the
conditional forecast of future volatility V1,t + V2,t still depends (critically) on the current value of the third factor U .

of jumps inevitably is subject to some degree of measurement error and literally is infeasible during
the overnight period, we also display the predictability regarding the combined quadratic return
variation. Since the jump and overnight return variation is less predictable, the overall explanatory
power drops slightly relative to the results for the continuous variation, but the qualitative pattern is
identical. In short, the distinct return variation components are highly predictable, yet the forecast
power of Ũ is insignificant across all alternative constellations. Of course, the lack of statistical
significance does not imply that U has no impact on the P dynamics of volatility and jump risks,
but rather that the effect is marginal compared to that of the volatility factors.
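The two-step logic described above (project U on the volatility factors, then compare predictive regressions with and without the residual) can be sketched as follows. This is an illustration on simulated data, not the estimation code behind Figures 7 and 8: the helper `ols`, the simulated series and all parameter values are hypothetical, and the t-statistics are classical OLS ones that ignore both overlapping horizons and the estimation error in the first-step projection.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept; returns coefficients, classical t-stats, R^2, residuals."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    tstat = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, tstat, r2, resid

# Simulated stand-ins for the state variables (hypothetical data).
rng = np.random.default_rng(0)
T = 500
V1 = rng.gamma(2.0, 0.01, T)
V2 = rng.gamma(2.0, 0.01, T)
U = 0.5 * V1 + 0.3 * V2 + rng.gamma(2.0, 0.01, T)   # partly spanned by V1, V2
y = 1.0 * V1 + 0.5 * V2 + 0.01 * rng.standard_normal(T)  # target driven by V1, V2 only

# Step 1: linear projection of U on (V1, V2); the residual plays the role of U-tilde.
_, _, _, U_tilde = ols(U, np.column_stack([V1, V2]))

# Step 2: unconstrained regression (V1, V2, U-tilde) versus constrained one (V1, V2).
beta_u, t_u, r2_u, _ = ols(y, np.column_stack([V1, V2, U_tilde]))
beta_c, t_c, r2_c, _ = ols(y, np.column_stack([V1, V2]))

print("t-stat on U-tilde:", t_u[3])
print("R2 unconstrained vs constrained:", r2_u, r2_c)
```

Because the residual is orthogonal to the constant and the volatility factors by construction, adding it leaves the coefficients on V1 and V2 unchanged, so the gap between the two R2 values isolates its incremental contribution.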

[Figure 8 plot omitted. Panels: Future excess returns; Future realized VRP. Each panel shows t-statistics (top) and R2 (bottom) against the horizon in months.]

Figure 8: Predictive Regressions for Equity and Variance Risk Premia. For each regression, the
top panels depict the t-statistics for the individual parameter estimates while the bottom panels indicate
the regression R2 . The predictive variables are V1 (dashed-dotted line), V2 (dashed line) and Ũ (solid line),
where Ũ is the residual from the linear projection of U on V1 and V2 . The regression standard errors are
constructed to also account for the estimation error in the projection generating Ũ. The dashed lines in the
R2 plots correspond to constrained regressions, including only V1 and V2 .

We now turn to the predictability of the equity and variance risk premia. Figure 8 indicates
that our three factors, extracted from the option panel, now take on very different roles. In
particular, a significant part of the predictability of both the equity and variance risk premia is
due to the U factor. For the variance risk premia, V1 also contributes strongly at shorter horizons,
while, importantly, for the equity risk premium Ũ is the single dominant explanatory factor for
all horizons. The importance of U for predicting both the equity and variance risk premia is
consistent with Bollerslev and Todorov (2011) who find the equity and variance risk premia to
embed a common component stemming from compensation of left jump tail risk. The fact that
both the equity and variance risk premia depend on U , coupled with the significant persistence of

the latter, rationalizes the predictive power of the variance risk premium for future excess returns,
documented in Bollerslev et al. (2009) and Drechsler and Yaron (2011). The limited role for the two
volatility factors mirrors the conclusions of many prior studies on the risk-return tradeoff, going
back to, e.g., French et al. (1987) and Glosten et al. (1993). Finally, we note that our results are
not driven by the events surrounding the financial crisis. Qualitatively identical results apply for
subsamples that end prior to 2007.27
Another way to gauge the importance of U , and proper model specification in general, for pre-
dicting the equity and variance risk premia is to contrast our evidence above to findings generated
by standard jump-diffusive models. Since the results are entirely consistent with our prior conclu-
sions, we briefly summarize the results and refer to the Supplementary Appendix for details. If we
omit U from the model, we are still left with an elaborate model which, besides the two volatility
state variables driving the dynamics of the option surface, incorporates both diffusive and jump
leverage effects, co-jumps in returns and volatility, and separate decay rates for the left and right
(exponential) jump tails.28 We find that the forecast performance regarding the future evolution of
volatility and jump risks is similar to the corresponding results in Figure 7. Given that Ũ plays only
a minimal role in forecasting these quantities within model (3.1), this is quite intuitive. Moreover,
as before, when considering the equity risk premium, the volatility factors are largely insignificant
(and as likely to be negative as positive), and the R2 of the predictive regression for the future
excess returns is dramatically reduced. In short, the evidence for predictability of the equity risk
premium vanishes when the option surface dynamics is modeled in the common jump-diffusive
framework, and driven exclusively by volatility factors. Hence, the inclusion of the U factor, allow-
ing for the left risk-neutral tail to have a separate source of variation, is pivotal for capturing the
predictability in the equity risk premium. Finally, for the variance risk premium, both volatility
factors are significant in the two-factor model, but the R2 of the variance risk premium regression
is notably lower than for our three-factor model (3.1).
To summarize, consistent with prior findings, we document a substantial time variation in
the pricing of market risks. However, in a key departure from existing work, we provide strong
evidence that the factors driving risks and risk premia differ in a systematic way. This is ruled out,
a priori, through the structure preserving measure transformations adopted in most prior option
pricing studies. Thus, our results point towards the importance of allowing for nonlinearities in the
pricing kernel and have implications for the ability of structural economic models to rationalize the
predictability of the equity and variance risk premia. We discuss this next.
27
The documentation of these results is available upon request.
28
The corresponding two-factor model with Gaussian jumps performs less well along all dimensions.

7.3 Structural Implications of the Predictive Power of the Option Surface

Figures 7 and 8 demonstrate that the factor U , driving a substantial part of the OTM short maturity
put option dynamics, has no impact on the actual volatility and jump dynamics of the underlying
asset. In contrast, the factor has a critical effect on the pricing of volatility and jump risk. In other
words, it resembles a risk premium and not a risk factor. Can we rationalize this finding from an
economic perspective? To guide intuition, we compare the findings in Figures 7 and 8 with those
from recent structural models that link option prices to fundamental macroeconomic risks, such as
aggregate consumption and dividends. This emerging literature has made important progress in
tackling the challenging, yet critical, task of jointly explaining the equity return and risk premium
dynamics in a coherent general equilibrium setting. For concreteness, we initially explore a couple
of specific models featuring a representative agent with Epstein-Zin preferences exposed to risks in
real consumption growth, namely Wachter (2013) and Drechsler and Yaron (2011).
In the one-factor model of Wachter (2013), the consumption growth is subject to infrequent, but
large, negative jumps (rare disasters) with a time-varying arrival rate, resembling the mechanism in
Gabaix (2012).29 This type of equilibrium model can account for many critical empirical features
such as the correlation between volatility and jump risks, the time-varying jump arrival as well as
the ability of the market variance risk premium to predict future equity excess returns.
The model of Drechsler and Yaron (2011) specifies consumption growth as conditionally Gaus-
sian with a time-varying conditional mean (long-run risk) and conditional volatility.30 The system
has three factors, with one driving the conditional mean and two governing the conditional volatility
of consumption growth.31 Drechsler and Yaron (2011) show that the time-varying jump intensity
– in turn governed by the volatility state – explains a significant part of the predictive power of
the variance risk premium for future excess returns.
Figures 9 and 10 summarize the findings from predictive regressions for future volatility and
jump risks as well as equity and variance risk premia in the models of Wachter (2013) and Drech-
sler and Yaron (2011), using the concurrent level of the relevant state variables in each model as
predictors. Not surprisingly, given the one-factor structure, the Wachter (2013) model faces some
challenges in accommodating the evidence laid out in Figures 7 and 8. Importantly for our anal-
ysis, the predictability of future excess returns and variance risk premia is linked closely to the
predictability of the future return variation, and the underlying pattern of significance is identical
29
These papers generalize work by Barro (2006) and Barro and Ursua (2008) in which rare disasters are i.i.d.
30
This model builds on Eraker and Shaliastovich (2008) and generalizes many prior models in which consumption
growth contains a small predictable persistent component, including the original work of Bansal and Yaron (2004).
31
One of the state variables that we label volatility factors directly controls the conditional variance of consumption
growth, while the other captures the variation in the long run variance of consumption growth.

in all cases (flat line), and the degree of explanatory power rises roughly linearly with maturity.
Compared to Figure 7, the predictability in the Wachter (2013) model is inverted, as the return
variation is forecast with relatively higher precision over long rather than short horizons. Further-
more, the explanatory power is uniformly too low. Likewise, referencing Figure 8, the model fails to
capture the degree and pattern of predictability in the excess returns and variance risk premium.32
Turning to Figure 10, it is evident that the more flexible volatility structure of Drechsler and
Yaron (2011) is useful in accommodating some of the stylized features of the data. Nonetheless,
it is equally clear that the long-run risk factor helps predict neither the future volatility and jump
risks nor the equity and variance risk premia. In this structural setting, essentially all predictability
stems from the two volatility factors. They provide the channel through which past variance risk
premia generate predictable movements in the equity risk premium. Thus, relative to our empirical
finding, captured by Figures 7 and 8, this structural model also ties the predictability of future
volatility and jump risks too closely to the predictability of the equity and variance risk premia.
Equivalently, the structural model implies a tight relationship between the dynamics of the option
panel and the return dynamics of the underlying equity market. In contrast, our empirical results
based on model (3.1) document a partial, and critical, decoupling between the factors driving the
equity return dynamics and those governing the pricing of risk, and thus the equity and variance
risk premia.
There is a fundamental reason for the discrepancy between our empirical findings and the
implications of the structural models of Wachter (2013), and its extension in Seo and Wachter
(2013), as well as Drechsler and Yaron (2011). Although the models generate risk premia through
different channels – the presence of rare disasters and uncertainty about their arrival (Wachter
(2013) and Seo and Wachter (2013)) versus long-run risk and stochastic volatility in consumption
growth (Drechsler and Yaron (2011)) – they share a critical feature in the pricing of jump tail
risk. They both imply that the ratio $\nu_t^{\mathbb{Q}}(dx,dy)/\nu_t^{\mathbb{P}}(dx,dy)$
is time-invariant. That is, the risk-neutral jump
intensity is proportional to that under the actual probability measure, so the two jump intensities
are equivalent in terms of their time variation. Therefore, the jump risk premia are generated
by changing the distribution of the jump size only. This implies that the variation in the jump
intensity is “inherited” under the equivalent change of measure. Consequently, these equilibrium
based pricing kernels cannot “generate” new state variables in addition to those that drive the
fundamental risks in the economy. In turn, this necessarily generates the tight link between the
32
A two-factor extension, like in Seo and Wachter (2013), in which rare disaster probability is driven by two
factors, can potentially generate dynamic patterns more consistent with the data. However, as we discuss below in
more general terms, such an extension will still produce a close link between the predictability of future risks and
risk premia unlike what we find in the data.

[Figure 9 plot omitted. Panels: Future variation; Future excess returns; Variance risk premium. Each panel shows t-statistics (top) and R2 (bottom) against the horizon in months.]

Figure 9: Predictive Regressions Implied by the Wachter (2013) Structural Model. The predictive
variable is the time-varying intensity of a rare disaster in consumption growth.

dynamics of the underlying asset and the option surface within these models.
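To see why a time-invariant ratio of jump compensators forces the two intensities to move in lockstep, consider the following stylized calculation. It rests on an assumption made purely for illustration: the statistical compensator factors as $\nu_t^{\mathbb{P}}(dx) = \lambda_t F(dx)$, with $\lambda_t$ a time-varying intensity and $F$ a fixed jump-size distribution, and the measure change $Y$ depends on the jump size $x$ only. The symbols $\lambda_t$, $F$ and $c_A$ are introduced here and do not appear in the models discussed above.

```latex
% Assumption for illustration: \nu_t^{P}(dx) = \lambda_t F(dx), and Y depends on x only.
% A is any set of jump sizes with F(A) > 0.
\begin{align*}
\nu_t^{\mathbb{Q}}(dx) &= Y(x)\,\nu_t^{\mathbb{P}}(dx) = \lambda_t\,Y(x)\,F(dx), \\
\nu_t^{\mathbb{Q}}(A) &= \lambda_t \int_A Y(x)\,F(dx) = c_A\,\nu_t^{\mathbb{P}}(A),
\qquad c_A \equiv \frac{\int_A Y(x)\,F(dx)}{F(A)} .
\end{align*}
```

The constant $c_A$ does not depend on $t$, so the risk-neutral intensity of jumps in $A$ is a fixed multiple of the statistical one. All time variation in $\nu_t^{\mathbb{Q}}(A)$ is inherited from $\lambda_t$, and no additional state variable can appear under $\mathbb{Q}$.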
In fact, this tight linkage of the physical and risk-neutral jump intensities is operative for a wide
class of popular structural models with representative agents having Epstein-Zin preferences. The
part of the density $d\mathbb{Q}/d\mathbb{P}$ due to the change of the jump measure is characterized by,

$$\mathcal{E}\left(\int_0^t\int_{\mathbb{R}^n} \big(Y(\omega,s,x)-1\big)\,\tilde{\mu}^{\mathbb{P}}(ds,dx)\right), \qquad (7.9)$$

where the jump component of the state vector in the economy under the P probability measure is
given as a (multivariate) integral of the form $\int_0^t\int_{\mathbb{R}^n} x\,\tilde{\mu}^{\mathbb{P}}(ds,dx)$; $Y(\omega,t,x) = \nu_t^{\mathbb{Q}}(dx)/\nu_t^{\mathbb{P}}(dx)$ is the measure

[Figure 10 plot omitted. Panels: Future jumps; Future continuous variation; Future excess returns; Variance risk premium. Each panel shows t-statistics (top) and R2 (bottom) against the horizon in months.]

Figure 10: Predictive Regressions Implied by the Drechsler and Yaron (2011) Structural Model.
The predictive variables are the conditional mean of consumption growth (solid line), stochastic volatility
of consumption growth (dotted line) and central tendency of stochastic volatility (dashed line). The dashed
lines in the R2 plots correspond to constrained regressions including only the volatility state variables.

change for the jump intensity, and $\mathcal{E}$ is the Doléans-Dade exponential.33 In general, Y (ω, t, x)
will be stochastic. However, within an equilibrium setting, stipulating affine dynamics for
the fundamentals along with a representative agent with Epstein-Zin preferences generates the
restriction that the expression in equation (7.9) is exponentially affine in the state vector, see, e.g.,
equation (2.22) of Eraker and Shaliastovich (2008). In turn, this implies that Y (ω, s, x) must be
non-random and time-invariant, i.e., depend only on x.34 Hence, the equilibrium implied statistical
33
For the expression in (7.9) and the definition of the Doleans-Dade exponential, see Jacod and Shiryaev (2003),
Corollary III.5.22 and I.4.59, respectively.
34
The restriction on Y (ω, s, x) is actually stronger. In the equilibrium model, Y (ω, s, x) is an exponential function

and risk-neutral intensities of the price and volatility jumps (which are mixtures of the jumps in
the state variables driving the fundamentals in the equilibrium model) will be affine functions of
the same state vector.35 In contrast, for our extended three-factor model, in which νtP (·) loads, at
most, very marginally on Ut− , there is a natural wedge between the time variation in statistical
and risk-neutral jump intensities.
There are several ways in which the link between the asset and option price dynamics may be
relaxed within a representative agent equilibrium setting to potentially account for our empirical
evidence. They all involve generalizing the preferences in some form. One approach is to allow
the representative agent’s coefficient of risk aversion to vary over time. Du (2010) proposes a
generalization of a habit formation model in which consumption growth is i.i.d. with rare jumps,
building on the equilibrium models of Campbell and Cochrane (1999) and Barro (2006). In this
model, and consistent with our findings, there is a wedge between the time-variation of the P and
Q jump intensities of the market return distribution. The nonlinearity of the pricing kernel will
amplify the effect of time-varying risk-aversion on the risk-neutral jump intensity relative to the
statistical one. Note that the risk-neutral jump intensity is nonlinearly related with volatility in
this setting, and our U factor will naturally serve as a proxy for such nonlinearities. Nevertheless, it
remains an open question whether a model with external habit formation can decouple the option
and asset price dynamics in a manner reminiscent of our empirical findings. It is also unclear
whether the frequency and intensity of stock market jumps can be mapped into corresponding
jumps in the consumption growth rate as implied by this model of habit formation.
A second approach to relax the link between the option and asset price dynamics is to recognize
that agents do not directly observe the state vector and therefore need to filter the states from
observables. The absence of perfect information plus aversion to ambiguity (particularly about ex-
treme negative risks), like in Hansen and Sargent (2008), or lack of confidence in the estimate of the
state vector, like in Bansal and Shaliastovich (2010), can make investors appear more risk averse
than they would be in a perfect information setting. The key question is whether the ambiguity
aversion or confidence risk variation can generate the required gap between the dynamics of the
statistical and risk-neutral jump intensities.36 For example, the excessively tight link between the
jump intensity under P and Q remains within the generalization of the Drechsler and Yaron (2011)
of jump size, see Theorem 1 of Eraker and Shaliastovich (2008). Hence, the jump measure change is implemented by
exponential tilting (in Laplace transform space), with the degree of tilting determined by the preference parameters
of the representative agent.
35
In the models of Wachter (2013), Seo and Wachter (2013) and Drechsler and Yaron (2011), the jump intensities
are even more tightly connected, as they are directly proportional.
36
Recall that, according to the estimates for (3.1), U does not covary perfectly with stochastic volatility.

model, developed by Drechsler (2013), in which agents are ambiguous about part of the dynamics,
including the jumps in the conditional mean and variance of consumption growth. The representa-
tive agent’s ambiguity drives an additional wedge between fundamental risks and asset prices and
helps explain why variance risk premia have superior predictive power, relative to volatility itself,
for future returns. Nonetheless, the linearity of the pricing kernel with respect to the state vector
implies that no new state variable is “generated” going from P to Q, thus ultimately rendering the
model predictions incompatible with our empirical findings.
Moving beyond the strict representative agent framework, parts of the literature have also considered
the potential impact of the intermediary sector on the pricing of derivative securities,
particularly following crises; see, e.g., Bates (2003), Hu et al. (2013) and Chen et al. (2014). The
main intuition is that major financial shocks may impose losses on market makers and reduce their
effective risk bearing capacity. Can we explain the wedge between the factors driving the risks and
the risk premia with the health of the financial sector? To informally explore this question, we now
relate our U factor to standard measures viewed as proxies for stress in the financial sector. In
particular, we incorporate the noise (liquidity) measure of Hu et al. (2013), the 3-month LIBOR
minus Treasury (TED) spread, and the default spread defined as the difference between Moody's
BAA and AAA bond yield indices (DFSPRD), while also controlling explicitly for volatility. In
Table 8, we report the related regression results.
Interestingly, we find the noise factor of Hu et al. (2013) to be quite strongly correlated with
our U factor, even after controlling for volatility. While the default spread also has relatively good
explanatory power, it is highly correlated with the noise factor, and the individual t-statistics drop
substantially when all the variables are included in the (multivariate) regression reported in the
last column. Finally, compared to the other factors, the TED spread plays a minimal role. Overall,
we conclude that the critical component of U , not correlated with volatility, arguably is associated
with proxies for stress in the financial sector. Hence, our procedure, extracting risk premium
information directly from the option panel, generates evidence that qualitatively conforms with
prior observations in the literature. We leave further explorations along these lines to future work.

8 Conclusion

We document that the standard exponentially-affine jump-diffusive specifications used in the empir-
ical option pricing literature are incapable of fitting critical features of the option surface dynamics
for the S&P 500 index, especially in scenarios involving significant shifts in the volatility smirk.
We extend the risk-neutral volatility model to include a separate state variable which is crucial

                   (1)       (2)       (3)       (4)       (5)       (6)
Constant         2.233     0.941     2.632    -0.854     0.976     0.296
               (9.279)   (5.599)  (10.071)  (-0.854)   (5.222)   (1.016)
V               60.834                                  20.255    19.057
               (6.204)                                 (2.192)   (2.480)
Noise Fct                  0.907                         0.727     0.618
                        (20.679)                      (10.466)   (6.721)
TED                                  2.279                        -1.043
                                   (4.357)                      (-2.960)
DFSPRD                                         4.616               1.593
                                            (17.407)             (4.117)
R2 (%)           0.349     0.487     0.094     0.437     0.506     0.549

Table 8: U factor and the health of the financial sector. Univariate and multivariate regressions
of the U factor on volatility and variables that proxy for the health of the financial sector
(t-statistics in parentheses). V is the variance extracted from the time series of option panels;
Noise Fct refers to the liquidity (noise) factor proposed by Hu et al. (2013); TED is the spread between
the 3-month LIBOR and the Treasury yield; DFSPRD is the difference between the BAA and AAA
Moody's bond yield indices.

in capturing the time variation of priced downside tail risk. This new factor has no incremental
explanatory power, beyond the traditional volatility factors, for the future evolution of volatility
and jump risks. On the other hand, relative to the volatility components, the new factor provides
critical, and superior, information for the time variation in the equity and variance risk premia.
Our findings demonstrate that the pricing in the option market is closely integrated with the
underlying asset market. Moreover, the option panel embodies critical information regarding equity
risk pricing that cannot be extracted directly from the underlying asset price dynamics. The wedge
between the two probability measures arises primarily from the varying degree of compensation
for downward tail jump risk. Our results suggest that time-varying risk aversion and/or ambiguity
aversion, driven in part by the presence of large shocks, must be incorporated into structural asset
pricing models if they are to explain the joint dynamics of the equity and option markets.

References
Ahn, D., R. Dittmar, and A. Gallant (2002). Quadratic Term Structure Models: Theory and Evidence.
Review of Financial Studies 15, 243–288.

Andersen, T. G., O. Bondarenko, and M. T. Gonzalez-Perez (2014). Exploring the Equity-Index Return
Dynamics via Corridor Implied Volatility. Working paper, Northwestern University.

Andersen, T. G., N. Fusari, and V. Todorov (2013). Parametric Inference and Dynamic State Recovery from
Option Panels. Working paper, Northwestern University.

Andrews, D. (2001). Testing When a Parameter is on the Boundary of the Maintained Hypothesis. Econo-
metrica 69, 683–734.

Bai, J. and S. Ng (2002). Determining the Number of Factors in Approximate Factor Models. Economet-
rica 70, 191–221.

Bansal, R. and I. Shaliastovich (2010). Confidence Risk and Asset Prices. American Economic Review 100,
537–541.

Bansal, R. and A. Yaron (2004). Risks for the Long Run: A Potential Resolution of Asset Pricing Puzzles.
Journal of Finance 59, 1481–1509.

Barndorff-Nielsen, O. E. and N. Shephard (2004). Power and Bipower Variation with Stochastic Volatility
and Jumps. Journal of Financial Econometrics 2, 1–37.

Barro, R. and J. F. Ursua (2008). Macroeconomic Crises since 1870. Brookings Papers on Economic Activity,
255–335.

Barro, R. J. (2006). Rare Disasters and Asset Markets in the Twentieth Century. Quarterly Journal of
Economics 121, 823–866.

Bates, D. S. (1991). The Crash of ’87 – Was It Expected? The Evidence from Options Markets. Journal of
Finance 46, 1009–1044.

Bates, D. S. (2000). Post-’87 Crash Fears in S&P 500 Future Options. Journal of Econometrics 94, 181–238.

Bates, D. S. (2003). Empirical Option Pricing: A Retrospection. Journal of Econometrics 116, 387–404.

Bates, D. S. (2012). U.S. Stock Market Crash Risk, 1926 - 2010. Journal of Financial Economics 105,
229–259.

Bollerslev, T., G. Tauchen, and H. Zhou (2009). Expected Stock Returns and Variance Risk Premia. Review
of Financial Studies 22, 4463–4492.

Bollerslev, T. and V. Todorov (2011). Tails, Fears and Risk Premia. Journal of Finance 66, 2165–2211.

Bollerslev, T. and V. Todorov (2013). Time Varying Jump Tails. Journal of Econometrics, forthcoming.

Broadie, M., M. Chernov, and M. Johannes (2007). Specification and Risk Premiums: The Information in
S&P 500 Futures Options. Journal of Finance 62, 1453–1490.

Brockwell, P. (2001). Continuous-Time ARMA Processes. In D. Shanbhag and C. Rao (Eds.), Handbook of
Statistics, Volume 19. North-Holland.

Campbell, J. and J. Cochrane (1999). By Force of Habit: A Consumption Based Explanation of Aggregate
Stock Market Behavior. Journal of Political Economy 107, 205–251.

Chen, H., S. Joslin, and S. Ni (2014). Demand for Crash Insurance, Intermediary Constraints, and Stock
Return Predictability. Working paper.

Chernozhukov, V. and H. Hong (2004). Likelihood Estimation and Inference in a Class of Non-regular
Econometric Models. Econometrica 72, 1445–1480.

Christoffersen, P., S. Heston, and K. Jacobs (2009). The Shape and Term Structure of the Index Option
Smirk: Why Multifactor Stochastic Volatility Models Work so Well? Management Science 55, 1914–1932.

Christoffersen, P., K. Jacobs, and C. Ornthanalai (2012). Dynamic Jump Intensities and Risk Premiums:
Evidence from S&P 500 Returns and Options. Journal of Financial Economics 106, 447–472.

Drechsler, I. (2013). Uncertainty, Time-Varying Fear, and Asset Prices. Journal of Finance, forthcoming.

Drechsler, I. and A. Yaron (2011). What’s Vol Got to Do with It? Review of Financial Studies 24, 1–45.

Du, D. (2010). General Equilibrium Pricing of Options with Habit Formation and Event Risks. Journal of
Financial Economics 99, 400–426.

Duffie, D., D. Filipović, and W. Schachermayer (2003). Affine Processes and Applications in Finance. Annals
of Applied Probability 13(3), 984–1053.

Duffie, D., J. Pan, and K. Singleton (2000). Transform Analysis and Asset Pricing for Affine Jump-Diffusions.
Econometrica 68, 1343–1376.

Eraker, B. and I. Shaliastovich (2008). An Equilibrium Guide to Designing Affine Pricing Models. Mathe-
matical Finance 18, 519–543.

French, K., W. Schwert, and R. Stambaugh (1987). Expected Stock Returns and Volatility. Journal of
Financial Economics 19, 3–29.

Gabaix, X. (2012). Variable Rare Disasters: An Exactly Solved Framework for Ten Puzzles in Macro-Finance.
Quarterly Journal of Economics 127, 645–700.

Glosten, L., R. Jagannathan, and D. Runkle (1993). On the Relation between the Expected Value and the
Volatility of the Nominal Excess Return on Stocks. Journal of Finance 48, 1779–1801.

Hansen, L. and T. Sargent (2008). Robustness. Princeton University Press.

Hu, G., J. Pan, and J. Wang (2013). Noise as Information for Illiquidity. Journal of Finance 68, 2223–2772.

Jacod, J. (2008). Asymptotic Properties of Power Variations and Associated Functionals of Semimartingales.
Stochastic Processes and their Applications 118, 517–559.

Jacod, J. and A. N. Shiryaev (2003). Limit Theorems For Stochastic Processes (2nd ed.). Berlin: Springer-
Verlag.

Johnson, T. (2012). Equity Risk Premia and the VIX Term Structure. Working paper, Stanford University.

Kou, S. (2002). A Jump Diffusion Model for Option Pricing. Management Science 48, 1086–1101.

Leippold, M. and L. Wu (2002). Asset Pricing under the Quadratic Class. Journal of Financial and
Quantitative Analysis 37, 271–295.

Mancini, C. (2009). Non-parametric Threshold Estimation for Models with Stochastic Diffusion Coefficient
and Jumps. Scandinavian Journal of Statistics 36, 270–296.

Merton, R. (1976). Option Pricing when Underlying Asset Returns are Discontinuous. Journal of Financial
Economics 3, 125–144.

Santa-Clara, P. and S. Yan (2010). Crashes, Volatility and the Equity Premium: Lessons from S&P 500
Options. Review of Economics and Statistics 92, 435–451.


Seo, S. and J. A. Wachter (2013). Option Prices in a Model with Stochastic Disaster Risk. Working paper,
University of Pennsylvania.

Wachter, J. A. (2013). Can Time-Varying Risk of Rare Disasters Explain Aggregate Stock Market Volatility?
Journal of Finance 68, 987–1035.

9 Appendix
9.1 Nonparametric High Frequency Measures

For ease of notation we normalize the time unit to be a day. Each day is divided into a trading and a non-trading part and, by convention, a day starts with the close of trading on the previous day and ends at the close of the following trading period. The resulting daily interval $[t-1, t]$ is divided into the overnight period $[t-1, t-1+\pi]$ and the active part of the trading day $[t-1+\pi, t]$. Over the trading part, we observe the futures price at $n+1$ equidistant times, resulting in $n$ intraday increments, each over a time interval of length $\Delta_n \equiv \frac{1-\pi}{n}$. The intraday increments are given by $\Delta_i^{n,t} f = f_{t-1+\pi+i\Delta_n} - f_{t-1+\pi+(i-1)\Delta_n}$ for $i = 1, \ldots, n$ and $t = 1, \ldots, T$.

$\widehat{V}_t^{(n,m_n)}$ is a nonparametric estimator of the diffusive return variation constructed from the intraday record of the log-futures price of the underlying asset, $f$, as follows,

$$\widehat{V}_t^{(n,m_n)} = \frac{n}{m_n} \sum_{i=n-m_n+1}^{n} \left(\Delta_i^{n,t} f\right)^2 \, 1_{\{|\Delta_i^{n,t} f| \le \upsilon \Delta_n^{\varpi}\}}, \tag{9.10}$$

where $\upsilon > 0$, $\varpi \in (0, 1/2)$, and $m_n$ denotes some deterministic sequence. For $m_n/n \to 0$, $\widehat{V}_t^{(n,m_n)}$ is a consistent estimator of the spot variance at $t$ and corresponds to the truncated variation (Mancini (2009)) computed over an asymptotically shrinking fraction of the day just prior to the option quote. In our implementation, we sample every minute over a 6.75-hour trading day, excluding the initial five minutes, resulting in $n = 400$. We employ $m_n = 300$ for $\widehat{V}_t^{(n,m_n)}$ in (4.1).
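As an illustration, the estimator in (9.10) can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: the function name, the argument names and the default `trading_fraction` (standing in for $1-\pi$) are assumptions made for the example, and the truncation level `upsilon` is passed in as a fixed number rather than computed from the data.

```python
import numpy as np

def truncated_spot_variance(increments, m_n, upsilon, varpi=0.49, trading_fraction=1.0):
    """Truncated variation over the last m_n intraday increments, scaled by n / m_n.

    increments       : the n intraday log-price increments of one trading day
    m_n              : number of increments (counted back from the end of the day) to use
    upsilon, varpi   : truncation parameters; the cutoff is upsilon * Delta_n ** varpi
    trading_fraction : hypothetical stand-in for 1 - pi, so Delta_n = trading_fraction / n
    """
    r = np.asarray(increments, dtype=float)
    n = r.size
    if not 0 < m_n <= n:
        raise ValueError("m_n must satisfy 0 < m_n <= n")
    delta_n = trading_fraction / n
    threshold = upsilon * delta_n ** varpi
    window = r[n - m_n:]                    # increments i = n - m_n + 1, ..., n
    kept = np.abs(window) <= threshold      # drop increments flagged as jumps
    return (n / m_n) * float(np.sum(window[kept] ** 2))

# Toy day: 400 increments of size 0.001 with one large move at the very end.
increments = np.full(400, 0.001)
increments[-1] = 0.05
v_hat = truncated_spot_variance(increments, m_n=300, upsilon=0.1)
```

In the toy example the final increment exceeds the cutoff and is excluded, so only the 299 small increments in the window contribute to `v_hat`.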
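Similarly, we introduce the following measures of jump and variance risks for our analysis in Section 7,

$$
\begin{aligned}
\widehat{LT}^{K,i}_{t,t+\tau} &= \sum_{s=t+1}^{t+\tau} \sum_{i=1}^{n} 1_{\{\Delta_i^{n,s} f < -\left(K \vee \upsilon \Delta_n^{\varpi}\right)\}}, \\
\widehat{RT}^{K,i}_{t,t+\tau} &= \sum_{s=t+1}^{t+\tau} \sum_{i=1}^{n} 1_{\{\Delta_i^{n,s} f > K \vee \upsilon \Delta_n^{\varpi}\}},
\end{aligned} \tag{9.11}
$$

$$
\widehat{QV}^{i}_{t,t+\tau} = \sum_{s=t+1}^{t+\tau} \sum_{i=1}^{n} |\Delta_i^{n,s} f|^2, \qquad
\widehat{QV}^{c,i}_{t,t+\tau} = \sum_{s=t+1}^{t+\tau} \sum_{i=1}^{n} |\Delta_i^{n,s} f|^2 \, 1_{\{|\Delta_i^{n,s} f| \le \upsilon \Delta_n^{\varpi}\}}, \tag{9.12}
$$

where, recall, $K$ is the constant jump threshold given in (7.7).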
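The counting and variation measures in (9.11) and (9.12) can be sketched as follows. This is only an illustrative sketch: the function and key names are hypothetical, the increments are assumed to be stacked in an array with one row per day in $(t, t+\tau]$, and the argument `threshold` plays the role of $\upsilon \Delta_n^{\varpi}$ (a scalar, or an array that broadcasts against the increments).

```python
import numpy as np

def tail_and_variation_measures(increments, K, threshold):
    """Jump counts and realized variation over a block of days.

    increments : array of shape (tau, n), one row of intraday increments per day
    K          : constant jump-size threshold for "large" jumps
    threshold  : truncation level, scalar or broadcastable to increments.shape
    """
    r = np.asarray(increments, dtype=float)
    thr = np.broadcast_to(np.asarray(threshold, dtype=float), r.shape)
    cutoff = np.maximum(K, thr)                      # K v (upsilon * Delta_n ** varpi)
    return {
        "LT": int(np.sum(r < -cutoff)),              # large negative jumps
        "RT": int(np.sum(r > cutoff)),               # large positive jumps
        "QV_i": float(np.sum(r ** 2)),               # total intraday variation
        "QV_ci": float(np.sum(r[np.abs(r) <= thr] ** 2)),  # truncated (continuous) part
    }

# Two toy days with three increments each.
toy = np.array([[0.001, -0.02, 0.001],
                [0.03, 0.001, -0.004]])
measures = tail_and_variation_measures(toy, K=0.01, threshold=0.005)
```

Taking the maximum of `K` and the truncation level means an increment is counted as a large jump only if it clears both the fixed size threshold and the volatility-based cutoff.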


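Under general conditions, see, e.g., Jacod (2008),

$$
\widehat{LT}^{K,i}_{t,t+\tau} \xrightarrow{\;P\;} LT^{K,i}_{t,t+\tau}, \quad
\widehat{RT}^{K,i}_{t,t+\tau} \xrightarrow{\;P\;} RT^{K,i}_{t,t+\tau}, \quad
\widehat{QV}^{i}_{t,t+\tau} \xrightarrow{\;P\;} QV^{i}_{t,t+\tau}, \quad
\widehat{QV}^{c,i}_{t,t+\tau} \xrightarrow{\;P\;} QV^{c,i}_{t,t+\tau}.
$$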

For the overnight periods within $[t, t+\tau]$, we cannot separate the diffusive volatility from jumps, and we simply estimate the total (realized) overnight variance via,

$$\widehat{QV}^{o}_{t,t+\tau} = \sum_{s=t+1}^{t+\tau} \left(f_{s-1+\pi} - f_{s-1}\right)^2. \tag{9.13}$$

Our estimate for the total variation, $QV_{t,t+\tau}$, is then given by $\widehat{QV}_{t,t+\tau} = \widehat{QV}^{i}_{t,t+\tau} + \widehat{QV}^{o}_{t,t+\tau}$.
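A short sketch of the overnight piece in (9.13) and of the total variation estimate may help fix the timing convention. The function names and the layout of the inputs (the log-price at each previous close paired with the log-price at the following open) are assumptions made for this example.

```python
import numpy as np

def overnight_variance(log_close_prev, log_open):
    """Sum of squared close-to-open log-price changes over the days in (t, t + tau]."""
    c = np.asarray(log_close_prev, dtype=float)   # f_{s-1}: close of the previous day
    o = np.asarray(log_open, dtype=float)         # f_{s-1+pi}: open of day s
    if c.shape != o.shape:
        raise ValueError("need one open for every previous close")
    return float(np.sum((o - c) ** 2))

def total_variation(qv_intraday, qv_overnight):
    """Total variation estimate: intraday realized variation plus overnight variance."""
    return qv_intraday + qv_overnight

# Two toy overnight moves of +0.002 and -0.003 in log-price.
qv_o = overnight_variance([7.000, 7.010], [7.002, 7.007])
qv_total = total_variation(1.319e-3, qv_o)
```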

Theoretically, for all of the above estimators, any $\upsilon > 0$ and $\varpi \in (0, 1/2)$ will work. We fix $\varpi = 0.49$. The computation of $\upsilon$ is more involved and takes into account the fact that volatility varies over time and displays a strong diurnal pattern over the trading day. To account for the latter, we estimate, nonparametrically, a time-of-day factor $TOD_i$, $i = 1, \ldots, n$,

$$
TOD_i = NOI_i \, \frac{\sum_{t=1}^{T} \left(\Delta_i^{n,t} f\right)^2 1_{\{|\Delta_i^{n,t} f| \le \bar{\upsilon} \Delta_n^{0.49}\}}}{\sum_{t=1}^{T} \sum_{j=1}^{n} \left(\Delta_j^{n,t} f\right)^2 1_{\{|\Delta_j^{n,t} f| \le \bar{\upsilon} \Delta_n^{0.49}\}}}, \tag{9.14}
$$

where $\bar{\upsilon}$ is

$$
\bar{\upsilon} = 3 \sqrt{\frac{\pi}{2}} \sqrt{\frac{1}{T} \sum_{t=1}^{T} \sum_{i=2}^{n} |\Delta_{i-1}^{n,t} f| \, |\Delta_i^{n,t} f|},
$$

and the number of increments factor $NOI_i$ is defined as

$$
NOI_i = \frac{\sum_{t=1}^{T} \sum_{j=1}^{n} 1_{\{|\Delta_j^{n,t} f| \le \bar{\upsilon} \Delta_n^{0.49}\}}}{\sum_{t=1}^{T} 1_{\{|\Delta_i^{n,t} f| \le \bar{\upsilon} \Delta_n^{0.49}\}}}.
$$

The latter ensures that the numerator and denominator in (9.14) are given in identical units. The truncation level $\bar{\upsilon}$ is based on the average in-sample volatility obtained from the bipower variation measure of Barndorff-Nielsen and Shephard (2004). The $TOD$ factor is depicted in Figure 11.

Figure 11: Time-of-day factor. The plot shows $TOD_i$ (vertical axis, ranging from about 0.4 to 2.4) against the time of day in CST (horizontal axis, 8:35 to 15:15).
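The time-of-day computation in (9.14), together with the bipower-based truncation level $\bar{\upsilon}$ and the factor $NOI_i$, can be sketched as below. This is a hedged illustration rather than the authors' implementation: the function name and the `trading_fraction` argument (a stand-in for $1-\pi$) are hypothetical, and the $\pi$ under the square root in $\bar{\upsilon}$ is read as the mathematical constant from the bipower scaling, not the overnight fraction.

```python
import numpy as np

def time_of_day_factor(increments, varpi=0.49, trading_fraction=1.0):
    """Return (tod, upsilon_bar) from a (T, n) array of intraday increments.

    tod[i] compares the truncated variance in intraday slot i with the average
    truncated variance across all slots, so it averages to one when no
    increments are truncated.
    """
    r = np.asarray(increments, dtype=float)
    T, n = r.shape
    delta_n = trading_fraction / n
    # Truncation level from the average daily bipower variation.
    bipower_sum = np.sum(np.abs(r[:, 1:]) * np.abs(r[:, :-1]))
    upsilon_bar = 3.0 * np.sqrt(np.pi / 2.0) * np.sqrt(bipower_sum / T)
    keep = np.abs(r) <= upsilon_bar * delta_n ** varpi
    squared = np.where(keep, r ** 2, 0.0)
    numerator = squared.sum(axis=0)          # per slot i, summed over days
    denominator = squared.sum()              # over all slots and days
    # NOI_i: retained increments overall relative to those retained in slot i.
    # A slot with no retained increments would give a division by zero here.
    noi = keep.sum() / keep.sum(axis=0)
    return noi * numerator / denominator, upsilon_bar

# Toy data: 4 identical days with a U-shaped intraday scale and no jumps.
scale = np.array([2.0, 1.0, 1.0, 1.0, 2.0])
toy_days = 0.001 * np.tile(scale, (4, 1))
tod, upsilon_bar = time_of_day_factor(toy_days)
```

In the toy example no increment is truncated, so `tod` is simply the squared intraday scale divided by its average across slots.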

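To account for time-varying volatility across days, we use the estimated continuous component of volatility for the previous day (for the first day we use $\bar{\upsilon}$). Finally, our time-varying threshold is

$$
\upsilon_{t,i} = 3 \sqrt{\frac{\widehat{V}^{n,n}_{t-1}}{1 - \widehat{\pi}}} \times TOD_i \times \Delta_n^{0.49},
$$

where $\widehat{\pi}$ is given by

$$
\widehat{\pi} = \frac{\sum_{t=1}^{T} \left(f_{t+\pi} - f_t\right)^2}{\sum_{t=1}^{T} \left(f_{t+1} - f_t\right)^2}.
$$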
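The threshold and the overnight fraction can be sketched as follows. Several choices here are assumptions for the sake of the example: the function names, the input layout (daily closing log-prices plus the opening log-price that follows each close), the use of $\widehat{\pi}$ when forming $\Delta_n$, and the reading of the threshold formula with $TOD_i$ outside the square root.

```python
import numpy as np

def overnight_fraction(log_closes, log_opens):
    """Ratio of summed squared overnight moves to summed squared close-to-close moves.

    log_closes : T + 1 closing log-prices f_0, ..., f_T
    log_opens  : T opening log-prices, log_opens[t] following log_closes[t]
    """
    c = np.asarray(log_closes, dtype=float)
    o = np.asarray(log_opens, dtype=float)
    if c.size != o.size + 1:
        raise ValueError("need exactly one more close than opens")
    overnight = o - c[:-1]
    close_to_close = np.diff(c)
    return float(np.sum(overnight ** 2) / np.sum(close_to_close ** 2))

def time_varying_threshold(v_prev, tod, pi_hat, n, varpi=0.49):
    """Threshold for each intraday slot, given the previous day's continuous variance."""
    delta_n = (1.0 - pi_hat) / n          # assumes pi_hat is used in place of pi
    return 3.0 * np.sqrt(v_prev / (1.0 - pi_hat)) * np.asarray(tod, dtype=float) * delta_n ** varpi

pi_hat = overnight_fraction([0.0, 0.01, 0.03], [0.002, 0.013])
thresholds = time_varying_threshold(v_prev=1e-4, tod=[1.0, 2.0], pi_hat=0.2, n=400)
```

The resulting array has one threshold per intraday slot, so slots with a higher time-of-day factor tolerate larger increments before they are flagged as jumps.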

9.2 Linking Risks and Risk Premia with their Feasible Counterparts

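From the properties of the compensator for a jump measure, we have,

$$
\begin{aligned}
LT^{K}_{t,t+\tau} &= \int_t^{t+\tau} \int_{\mathbb{R}^2} 1_{\{x \le -K\}} \, \nu_s^{P}(dx, dy) \, ds + \epsilon^{L}_{t,t+\tau}, \qquad E_t^{P}\left(\epsilon^{L}_{t,t+\tau}\right) = 0, \\
RT^{K}_{t,t+\tau} &= \int_t^{t+\tau} \int_{\mathbb{R}^2} 1_{\{x \ge K\}} \, \nu_s^{P}(dx, dy) \, ds + \epsilon^{R}_{t,t+\tau}, \qquad E_t^{P}\left(\epsilon^{R}_{t,t+\tau}\right) = 0.
\end{aligned} \tag{9.15}
$$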

Hence, up to martingale difference sequences, $LT^{K}_{t,t+\tau}$ and $RT^{K}_{t,t+\tau}$ measure the $P$ jump intensity of “large” jumps.$^{37}$

$^{37}$ Empirical estimates for the number of “large” negative and positive jumps across our sample are provided in Section C of the Supplementary Appendix.

Turning towards the equity and variance risk premia, we first note that, from an application of Itô's formula, $\log(X_t)$ has the following representation under $P$,

$$
d\log(X_t) = \left[\alpha_t - q_t^{P}\right] dt + \sqrt{V_{1,t}} \, dW_{1,t}^{P} + \sqrt{V_{2,t}} \, dW_{2,t}^{P} + \int_{\mathbb{R}^2} x \, \widetilde{\mu}^{P}(dt, dx, dy), \tag{9.16}
$$

where $q_t^{P} = \frac{1}{2} V_{1,t} + \frac{1}{2} V_{2,t} + \int_{\mathbb{R}^2} (e^{x} - 1 - x) \, \nu_t^{P}(dx, dy)$, and similarly under $Q$ with $\alpha_t$ replaced by $r_t - \delta_t$ and all superscripts $P$ replaced with $Q$ in the expression above.

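Hence, we have the following relations for the feasible measures of the equity and variance risk premia,

$$
\begin{aligned}
\frac{1}{\tau} \left[ \log\left(\frac{X_{t+\tau}}{X_t}\right) - \int_t^{t+\tau} (r_s - \delta_s) \, ds \right] &= ERP_t^{\tau} + \frac{1}{\tau} E_t^{P}\left( \int_t^{t+\tau} q_s^{P} \, ds \right) + \epsilon^{E}_{t,t+\tau}, \qquad E_t^{P}\left(\epsilon^{E}_{t,t+\tau}\right) = 0, \\
\widehat{VRP}_t^{\tau} = \frac{1}{\tau} \left[ \widehat{QV}_{t,t+\tau} - E_t^{Q}\left(QV_{t,t+\tau}\right) \right] &= VRP_t^{\tau} + \epsilon^{V}_{t,t+\tau}, \qquad E_t^{P}\left(\epsilon^{V}_{t,t+\tau}\right) = 0,
\end{aligned} \tag{9.17}
$$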

where $E_t^{Q}(QV_{t,t+\tau})$ can be measured in model-free fashion via the VIX index. Equation (9.17) shows that a martingale difference sequence separates $\widehat{VRP}_t^{\tau}$ from $VRP_t^{\tau}$, and, likewise, a martingale difference sequence separates the log excess cum-dividend returns on the underlying asset from the unobservable $ERP_t^{\tau} + \frac{1}{\tau} E_t^{P}\left(\int_t^{t+\tau} q_s^{P} \, ds\right)$. In principle, we can remove the term stemming from the convexity adjustment, i.e., $\frac{1}{\tau} E_t^{P}\left(\int_t^{t+\tau} q_s^{P} \, ds\right)$, via a consistent estimator for $\int_t^{t+\tau} q_s^{P} \, ds$ (again up to a martingale difference term) obtained from high-frequency data, e.g.,$^{38}$

$$
\int_t^{t+\tau} \widehat{q}_s^{P} \, ds = \sum_{i:\, \frac{i}{n} \in (t, t+\tau]} \left[ \frac{1}{2} |\Delta_i^{n} f|^2 \, 1_{\{|\Delta_i^{n} f| \le \upsilon_{t,i}\}} + \left( e^{\Delta_i^{n} f} - 1 - \Delta_i^{n} f \right) 1_{\{|\Delta_i^{n} f| > \upsilon_{t,i}\}} \right].
$$

In practice, this adjustment is minute and the results are virtually unchanged if we implement it.$^{39}$

$^{38}$ For the overnight periods we just take one half of the squared return.
$^{39}$ We do not report results adjusting for this term to conserve space. They are available upon request.
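To make the feasible quantities in this subsection concrete, the sketch below computes the high-frequency convexity estimator and the feasible variance risk premium in (9.17). It is illustrative only. All function names are hypothetical, and the helper `vix_to_expected_qv` rests on the assumption that the VIX is quoted in annualized percentage volatility units, so that its square scaled by the horizon approximates $E_t^{Q}(QV_{t,t+\tau})$. The toy example measures time in years for convenience, whereas the text normalizes the time unit to a day.

```python
import numpy as np

def convexity_adjustment(increments, thresholds, overnight_returns=()):
    """High-frequency estimate of the integrated convexity term over (t, t + tau].

    Increments at or below the threshold contribute half their square;
    increments above it contribute exp(x) - 1 - x; overnight returns
    contribute half their square (as in footnote 38).
    """
    r = np.asarray(increments, dtype=float)
    thr = np.broadcast_to(np.asarray(thresholds, dtype=float), r.shape)
    small = np.abs(r) <= thr
    diffusive = 0.5 * np.sum(r[small] ** 2)
    big = r[~small]
    jumps = np.sum(np.expm1(big) - big)
    overnight = 0.5 * np.sum(np.asarray(overnight_returns, dtype=float) ** 2)
    return float(diffusive + jumps + overnight)

def vix_to_expected_qv(vix_percent, horizon_years):
    """Assumed conversion: annualized percent VIX to expected variation over the horizon."""
    return (vix_percent / 100.0) ** 2 * horizon_years

def feasible_vrp(qv_hat, expected_qv_q, tau):
    """Feasible variance risk premium: (realized variation - Q-expectation) / tau."""
    return (qv_hat - expected_qv_q) / tau

q_hat = convexity_adjustment([0.001, -0.001, 0.05], 0.005, overnight_returns=[0.002])
tau = 30 / 365
vrp_hat = feasible_vrp(0.002, vix_to_expected_qv(20.0, tau), tau)
```

With these toy inputs the realized variation falls short of the VIX-implied expectation, so `vrp_hat` is negative under the sign convention of (9.17).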
