LONG-RUN COVARIABILITY
Ulrich K. Müller
Mark W. Watson
The views expressed herein are those of the authors and do not necessarily reflect the views of the
National Bureau of Economic Research. Support was provided by the National Science
Foundation through grant SES-1627660.
NBER working papers are circulated for discussion and comment purposes. They have not been
peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies
official NBER publications.
© 2017 by Ulrich K. Müller and Mark W. Watson. All rights reserved. Short sections of text, not
to exceed two paragraphs, may be quoted without explicit permission provided that full credit,
including © notice, is given to the source.
Long-Run Covariability
Ulrich K. Müller and Mark W. Watson
NBER Working Paper No. 23186
February 2017
JEL No. C22,C53,E17
ABSTRACT
We develop inference methods about long-run comovement of two time series. The parameters of
interest are defined in terms of population second moments of low-frequency trends computed
from the data. These trends are similar to low-pass filtered data and are designed to extract
variability corresponding to periods longer than the span of the sample divided by q/2, where q is
a small number, such as 12. We numerically determine confidence sets that control coverage over
a wide range of potential bivariate persistence patterns, which include arbitrary linear
combinations of I(0), I(1), near unit roots and fractionally integrated processes. In an application
to U.S. economic data, we quantify the long-run covariability of a variety of series, such as those
giving rise to the “great ratios”, nominal exchange rates and relative nominal prices,
unemployment rate and inflation, money growth and inflation, earnings and stock prices, etc.
Ulrich K. Müller
Department of Economics
Princeton University
Princeton, NJ 08544-1013
umueller@princeton.edu
Mark W. Watson
Department of Economics
Princeton University
Princeton, NJ 08544-1013
and NBER
mwatson@princeton.edu
1 Introduction
Economic theories often have stark predictions about the covariability of variables over long
horizons: consumption and income move proportionally (permanent income/life cycle model
of consumption) as do nominal exchange rates and relative nominal prices (long-run PPP),
the unemployment rate is unaffected by the rate of price inflation (vertical long-run Phillips
curve), and so forth. But there is a limited set of statistical tools to investigate the validity
of these long-run propositions. This paper expands this set of tools.
Two fundamental problems plague statistical inference about long-run phenomena. First,
inference critically depends on the data’s long-run persistence. Random walks yield statistics
with different probability distributions than i.i.d. data, for example, and observations from
persistent autoregressions or fractionally integrated processes yield statistics with their own
unique probability distributions. The second problem is that there are few “long-run” obser-
vations in the samples typically used in empirical analyses of long-run relations, so sample
information is limited. Taken together these two problems conspire to make long-run infer-
ence particularly difficult: proper inference depends critically on the exact form of long-run
persistence, but there is limited sample information available to empirically determine this
form.
The most well-known example of faulty inference due to a mistaken assumption about
persistence is Granger and Newbold’s (1974) “spurious regression”, where standard OLS inference
leads to grossly misleading conclusions when applied to independent I(1) variables.
The last 40 years have seen important progress in developing inference for specific classes of stochastic
processes (most notably for I(0) and integrated/cointegrated processes), but several
aspects of the resulting inference remain fragile. For example, while HAC standard errors
lead to reliable inference in I(0) settings with limited serial correlation, the resulting hypothesis
tests exhibit substantial size distortions for stationary series with high serial correlation
(e.g., den Haan and Levin (1997), Kiefer, Vogelsang, and Bunzel (2000), and Müller (2014)).
Inference in cointegrated models is well-developed (e.g., Engle and Granger (1987), Johansen
(1988), Phillips (1991), Stock and Watson (1993)), but these models have knife-edge impli-
cations about long-run covariability (cointegrated variables have unit long-run correlations)
and efficient inference methods are not robust to small departures from the model’s assumed
exact unit autoregressive roots (Elliott (1998)). Variables that are highly but not perfectly
correlated in the long-run, or are highly persistent, but perhaps without exact unit roots,
fall outside the standard cointegration framework.
This paper develops methods designed to provide reliable inference about long-run covariability
for a wide range of persistence patterns (encompassing I(0), I(1), and many other
forms of long-run persistence) and that are applicable regardless of the degree of long-run
correlation. The methods rely on low-frequency averages of the data to measure the data’s
long-run variability and covariability. These long-run data summaries have proven useful for
constructing long-run covariance matrices and associated test statistics in I(0) settings (e.g.,
Müller (2004, 2007), Phillips (2005), Sun (2013), and Lazarus, Lewis, and Stock (2016)),
but also for conducting inference about more general patterns of long-run persistence and
measuring uncertainty about long-run predictions (Müller and Watson (2008, 2016)). A key
simplification offered by these averages is that they are normally distributed in large samples
even though the stochastic process generating the data may exhibit substantial persistence
(Müller and Watson (forthcoming)). This allows large-sample inference about covariability
parameters to be transformed into a finite-sample problem involving a handful of normal
random variables and, while the inference problem is “non-standard,” it can be solved using
previously developed statistical methods paired with modern computing power.
The paper’s goal is to provide empirical researchers with an easy-to-use method for
constructing confidence intervals for long-run correlation coefficients, linear regression coeffi-
cients, and standard deviations of regression errors. These confidence intervals are both valid
over a wide range of persistence patterns and nearly optimal in the sense of having close to
shortest expected length (see Section 4 for details). As discussed in Section 3, the procedures
allow for I(0), I(1), near unit roots, fractionally integrated models, and linear combinations
of variables with these forms of persistence. Using a set of pre-computed “approximate least
favorable distributions”, the confidence intervals readily follow from the formulae discussed
in Section 4.¹
¹ The replication files contain a MATLAB function for computing these confidence intervals, available at
www.princeton.edu/~mwatson. The function uses the approximate least favorable distributions discussed in
Section 4 and the appendix, which are also available in the replication files.
The outline of the paper is as follows. The next section defines the notion of long-run
variability and covariability used throughout the paper. These are defined in terms of
population second moments of long-run projections, where these projections are similar to
low-pass filtered versions of the data (e.g., Baxter and King (1999), Hodrick and Prescott
(1997)). The discussion is carried out in the context of two empirical examples, the long-run
relationship between consumption and GDP and between short- and long-term nominal
interest rates. In the long-run projections we employ, long-run variability and covariability
is equivalently captured by the covariability of a small number of trigonometrically weighted
averages of the data. Section 3 derives the large-sample normality of these averages and
introduces a flexible parameterization of the joint long-run persistence properties of the
underlying stochastic process. The large-sample framework developed in Section 3 reduces
the problem of inference about long-run covariability parameters into the problem of inference
about the covariance matrix of a low dimensional multivariate normal random vector. Section
4 reviews relevant methods for solving this finite sample problem. Section 5 uses the resulting
inference methods to empirically study several familiar long-run relations involving balanced
growth (GDP, consumption, investment, labor income, and productivity), the term structure
of interest rates, the Fisher correlation (inflation and interest rates), the Phillips correlation
(inflation and unemployment), PPP (exchange rates and price ratios), money growth and
inflation, consumption growth and real returns, and the long-run relationship between stock
prices, dividends and earnings. Section 6 examines the robustness of Section 5’s empirical
conclusions to changes in the periodicities defining the “long-run”, and to alternative choices
for the information set used for inference.
to the opposite conclusion (see Lettau and Ludvigson (2013) for discussion and references).
Whether or not consumption and income are cointegrated (i.e., have an exact unit autore-
gressive root and exact unit long-run correlation), even a casual glance at the data suggests
the two variables move together closely in the long run.
Consider, for example, the evolution of U.S. real per-capita GDP and consumption over
the post-WWII period. In the 17 years from 1948 through 1964, GDP increased by 62%
and consumption increased by 52%. Over the next 17 years (1965-1981) both GDP and
consumption grew more slowly, by only 30%. Growth rebounded during 1982 to 1998, when
GDP grew by 43% and consumption increased 55%, but slowed again over 1999-2015 when
GDP grew by only 17% and consumption increased by only 23%. Over these 17-year periods,
there was substantial variability in the average annual rate of growth of GDP (2.9%, 1.4%,
2.1%, and 0.9% per year, respectively over the sub-samples), and these changes were roughly
matched by consumption (annual average growth rates of 2.5%, 1.5%, 2.6%, and 1.2%). In
this sense, GDP and consumption exhibited substantial long-run variability and covariability
over the post-WWII period.²
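As a quick arithmetic check, the cumulative 17-year growth figures quoted above can be converted into average annual rates. The sketch below assumes compound annual growth; the text does not state the exact convention used, and the helper name `annualized_rate` is ours, so small rounding differences from the reported rates are to be expected.

```python
def annualized_rate(cumulative_growth, years):
    """Average annual growth rate implied by cumulative growth over `years`.

    Assumes compounding: (1 + g_annual) ** years == 1 + cumulative_growth.
    """
    return (1.0 + cumulative_growth) ** (1.0 / years) - 1.0


# Cumulative growth over the four 17-year sub-samples, as quoted in the text.
gdp = [0.62, 0.30, 0.43, 0.17]
consumption = [0.52, 0.30, 0.55, 0.23]

# Convert to percent per year.
gdp_rates = [100 * annualized_rate(g, 17) for g in gdp]
cons_rates = [100 * annualized_rate(g, 17) for g in consumption]

for label, rates in (("GDP", gdp_rates), ("Consumption", cons_rates)):
    print(label, [round(r, 1) for r in rates])
```

Most of the implied rates line up with those reported in the text; the second sub-sample comes out slightly higher, presumably because the quoted 30% is itself a rounded figure.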
There are two distinct notions of “long-run” implicit in this calculation. The most obvious
is that each period makes up 17 years, approximately twice the length of the typical business
cycle. But another is that each period encompasses a large fraction (1/4) of the full 1948-
2015 sample period. Our statistical framework defines long-run in this latter way: long-run
statistical analysis involves inference about characteristics of stochastic processes that govern
the evolution of averages of the data over periods that are large relative to the available
sample.
With this in mind, the first two panels of Figure 1 plot the average growth rates of GDP
and consumption over six non-overlapping sub-samples in 1948-2015. Figure 1.a plots the
average growth rates against time, and Figure 1.b is a scatterplot of the six average growth
rates for consumption against corresponding values for GDP. Each of the six sub-samples
contains 11.25 years (45 quarters), a span of history longer than the typical business cycle,
and arguably captures “long-run” variability in GDP and consumption. And each represents
a substantial fraction (1/6) of the sample and is a long-run observation in the statistical sense
discussed in the preceding paragraph. Average GDP and consumption growth over these
sub-samples exhibited substantial variability and (from the scatterplot) roughly one-for-one
covariability.
² Consumption is personal consumption expenditures (including durables) from the NIPA; Section 5 shows
results for non-durables, services, and durables separately. Both GDP and consumption are deflated by the
PCE deflator, so that output is measured in terms of consumption goods, and expressed in per-capita terms
using the civilian non-institutionalized population over the age of 16. The supplemental appendix contains
data sources and descriptions for all data used in this paper.
Figure 1.c sharpens the analysis by plotting “low-pass” moving averages of the series
designed to isolate variation in the series with periods longer than 11 years.³ Sample variation
in these moving averages is much like the variation in the subsample averages of Figure 1.a,
but Figure 1.c captures the smooth transition of the series from high-growth to low-growth
periods. The scatterplot of these moving averages is plotted in Figure 1.d. Like Figure 1.b,
it shows the close relationship between long-run movements in consumption and GDP, but
it also shows the high degree of serial correlation in the moving averages.
³ These were computed using an ideal low-pass filter for periods longer than $T/6$ truncated after $T/2$
terms. The series were padded with pre- and post-sample backcasts and forecasts constructed from an
AR(4) model.
A convenient device for handling this serial correlation is to use projections on low-frequency
periodic functions in place of the low-pass moving averages. To be specific, let $z_t$,
$t = 1, \ldots, T$, denote a time series (e.g., growth rates of GDP or consumption). We use cosine
functions for the periodic functions; let $\Psi_j(s) = \sqrt{2}\cos(j s \pi)$ denote the function with period
$2/j$ (where the factor $\sqrt{2}$ simplifies a calculation below), $\Psi(s) = [\Psi_1(s), \Psi_2(s), \ldots, \Psi_q(s)]'$
denote a vector of these functions with periods 2 through $2/q$, and $\Psi_T$ denote the $T \times q$
matrix with $t$’th row given by $\Psi((t - 1/2)/T)'$, so the $j$’th column of $\Psi_T$ has period $2T/j$.
Most of our empirical analysis uses $q = 12$, which captures periodicities longer than $T/6$; this
defines the long-run variation in the data the analysis is designed to capture. The projection
of $z_t$ onto $\Psi((t - 1/2)/T)$ for $t = 1, \ldots, T$ yields the fitted values
$$\hat{z}_t = Z_T' \Psi((t - 1/2)/T) \qquad (1)$$
where $Z_T$ are the projection (linear regression) coefficients, $Z_T = (\Psi_T' \Psi_T)^{-1} \Psi_T' z_{1:T}$, where
$z_{1:T}$ is the $T \times 1$ vector with $t$’th element given by $z_t$. The matrix $\Psi_T$ has two properties
that simplify calculations and interpretation. First, $\Psi_T' l_T = 0$ where $l_T$ is a vector of ones,
so that $\hat{z}_t$ also corresponds to the projection of $z_t - \bar{z}_{1:T}$ onto $\Psi((t - 1/2)/T)$, where $\bar{z}_{1:T}$
is the sample mean. Second, $T^{-1} \Psi_T' \Psi_T = I_q$, so $Z_T$ corresponds to simple cosine-weighted
averages of the data (i.e., are the “cosine transforms” of $\{z_t\}$)
$$Z_T = T^{-1} \sum_{t=1}^{T} \Psi((t - 1/2)/T)\, z_t. \qquad (2)$$
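The construction above can be sketched in a few lines. The code below is a minimal illustration rather than the authors' replication code: it assumes the cosine weights are $\sqrt{2}\cos(j\pi(t - 1/2)/T)$ for $j = 1, \ldots, q$, and the function names and the toy series are hypothetical.

```python
import math


def cosine_weights(T, q):
    """T x q matrix whose (t, j) entry is sqrt(2) * cos(j * pi * (t - 1/2) / T)."""
    return [[math.sqrt(2.0) * math.cos(j * math.pi * (t - 0.5) / T)
             for j in range(1, q + 1)]
            for t in range(1, T + 1)]


def cosine_transforms(z, q):
    """Cosine-weighted averages: T^{-1} * sum_t Psi((t - 1/2)/T) * z_t."""
    T = len(z)
    psi = cosine_weights(T, q)
    return [sum(psi[t][j] * z[t] for t in range(T)) / T for j in range(q)]


def long_run_projection(z, q):
    """Fitted values: the cosine transforms weighted by the cosine functions."""
    T = len(z)
    psi = cosine_weights(T, q)
    Z = cosine_transforms(z, q)
    return [sum(Z[j] * psi[t][j] for j in range(q)) for t in range(T)]


# Hypothetical toy series: a constant, one slow cosine swing, and a fast wiggle.
T, q = 272, 12
z = [3.0 + math.cos(math.pi * (t - 0.5) / T) + 0.5 * (-1) ** t
     for t in range(1, T + 1)]
Z = cosine_transforms(z, q)
z_hat = long_run_projection(z, q)
```

Because the weights sum to zero over $t$, adding a constant to the series leaves the transforms unchanged, and because the columns are orthonormal after dividing by $T$, no matrix inversion is needed to obtain the projection coefficients.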
Letting $(x_t, y_t)$ denote the growth rates of GDP and consumption, the long-run projections
$(\hat{x}_t, \hat{y}_t)$ are plotted in Figure 2.a. Except for minor differences near the endpoints,
these long-run projections essentially coincide with the low-pass moving averages plotted in
Figure 1.c, so both capture the same long-run sample variability in the data. An advantage
of the long-run projections is that they are fully summarized by the projection coefficients
$(X_T, Y_T)$, a relatively small number of cosine-weighted averages of the sample data. Figure
2.b plots the projection coefficients, $(X_{jT}, Y_{jT})$, against the period of the corresponding cosine
term, $2T/j$. Evidently, there is substantial variation and covariation in the projection
coefficients. Indeed, the scatterplot of $(X_{jT}, Y_{jT})$ shown in Figure 2.c suggests a roughly
one-to-one relationship between the cosine transforms.
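The orthogonality of the cosine regressors $\Psi_T$ leads to a tight connection between the
variability and covariability in the long-run projections $(\hat{x}_t, \hat{y}_t)$ plotted in Figure 2.a and the
cosine transforms $(X_T, Y_T)$ plotted in Figures 2.b and 2.c:
$$T^{-1} \sum_{t=1}^{T} \begin{pmatrix} \hat{x}_t \\ \hat{y}_t \end{pmatrix} \begin{pmatrix} \hat{x}_t & \hat{y}_t \end{pmatrix} = T^{-1} \begin{pmatrix} X_T' \\ Y_T' \end{pmatrix} \Psi_T' \Psi_T \begin{pmatrix} X_T & Y_T \end{pmatrix} = \begin{pmatrix} X_T' X_T & X_T' Y_T \\ Y_T' X_T & Y_T' Y_T \end{pmatrix}. \qquad (3)$$
Thus, sample covariability in the time series projections coincides with sample covariability
in the cosine transforms.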
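The equality between second moments of the projections and cross products of the cosine transforms can be checked numerically. The following is a small sketch on simulated data, assuming cosine weights $\sqrt{2}\cos(j\pi(t - 1/2)/T)$; the helper names and the simulated series are hypothetical and serve only to illustrate the algebra.

```python
import math
import random


def psi_row(t, T, q):
    """Cosine weights for observation t (1-based)."""
    return [math.sqrt(2.0) * math.cos(j * math.pi * (t - 0.5) / T)
            for j in range(1, q + 1)]


def transforms(series, q):
    """Return (cosine transforms, long-run projection) of a series."""
    T = len(series)
    rows = [psi_row(t, T, q) for t in range(1, T + 1)]
    coef = [sum(rows[t][j] * series[t] for t in range(T)) / T for j in range(q)]
    fitted = [sum(c * r for c, r in zip(coef, rows[t])) for t in range(T)]
    return coef, fitted


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


random.seed(0)
T, q = 200, 12
x = [random.gauss(0.0, 1.0) for _ in range(T)]
y = [0.8 * xi + random.gauss(0.0, 0.5) for xi in x]

X, x_hat = transforms(x, q)
Y, y_hat = transforms(y, q)

# Left side: average cross products of the projections over time.
lhs = [[dot(x_hat, x_hat) / T, dot(x_hat, y_hat) / T],
       [dot(y_hat, x_hat) / T, dot(y_hat, y_hat) / T]]
# Right side: cross products of the q cosine transforms.
rhs = [[dot(X, X), dot(X, Y)],
       [dot(Y, X), dot(Y, Y)]]
max_gap = max(abs(lhs[i][k] - rhs[i][k]) for i in range(2) for k in range(2))
```

The two matrices agree up to floating-point error, which is why the analysis can work with the $q$ cosine transforms instead of the full-length projections.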
Short-term and Long-term interest rates. The second empirical example involves short-
and long-term nominal interest rates, as measured by the rate on 3-month U.S. Treasury
bills, $x_t$, and the rate on 10-year U.S. Treasury bonds, $y_t$, from 1953 through 2015. The levels
of these interest rates are highly serially correlated, but the term spread, $y_t - x_t$, far less so.
Early cointegration work (e.g., Campbell and Shiller (1987)) modeled the level of interest
rates as I(1), and short and long rates as cointegrated. Later empirical analyses of the term
structure (e.g., Dai and Singleton (2000), Diebold and Li (2006)) model the levels of interest
rates as a function of a small number of dynamic common factors that lead to common, but
less than unit-root, long-run persistence.
Figure 3 plots the levels of short- and long-term interest rates, $(x_t, y_t)$, along with their
long-run projections, $(\hat{x}_t, \hat{y}_t)$, and cosine transforms, $(X_{jT}, Y_{jT})$. The long-run projections
capture the rise in interest rates from the beginning of the sample through the early 1980s
and then their subsequent decline (Figure 3.a). These long swings in the level of interest
rates lead to relatively larger values in the long-period cosine transforms (Figure 3.b). The
projections for long-term interest rates closely track the projections for short-term rates and,
given the connection between the projections and cosine transforms, $X_T$ and $Y_T$
are highly correlated (Figure 3.c).
These two datasets differ markedly in their persistence: GDP and consumption growth
rates are often modeled as low-order MA models, while nominal interest rates are highly
serially correlated. Yet the variables in both datasets exhibit substantial long-run variation
and covariation, which is readily evident in the long-run projections $(\hat{x}_t, \hat{y}_t)$ or equivalently
(from (3)) the projection coefficients $(X_T, Y_T)$. This suggests that the covariance/variance
properties of $(X_T, Y_T)$ are a useful starting point for defining the long-run covariability
properties of stochastic processes exhibiting a wide range of persistence patterns.
where $(\Omega_{xx,T}, \Omega_{xy,T}, \Omega_{yy,T})$ are the elements of $\Omega_T$. The linear regression coefficient $\beta_T$ solves
the population least-squares problem
$$\beta_T = \arg\min_b E\left[ T^{-1} \sum_{t=1}^{T} (\hat{y}_t - b\,\hat{x}_t)^2 \right],$$
so that $\beta_T$ is the coefficient in the population best linear prediction of the long-run projection
$\hat{y}_t$ by the long-run projection $\hat{x}_t$,⁴ $\sigma^2_{y|x,T}$ is the average variance of the prediction error, and
$\rho^2_T$ is the corresponding population $R^2$. These parameters thus measure the population
comovement of the long-run variation of $(x_t, y_t)$. Equivalently, by the second equality in (4),
$\beta_T$ also solves
$$\beta_T = \arg\min_b E\left[ \sum_{j=1}^{q} (Y_{jT} - b\,X_{jT})^2 \right]$$
with a corresponding interpretation for $\sigma^2_{y|x,T}$ and $\rho^2_T$. Thus, these parameters equivalently
measure the (population) linear dependence in the scatter plots in Figures 2.c and 3.c.
⁴ The parameter $\beta_T$ is closely related to a linear band-spectrum regression coefficient (Engle (1974)),
corresponding to periods longer than $2T/q$.
The objective of the remaining analysis is to develop inference about the parameters
$(\rho_T, \beta_T, \sigma^2_{y|x,T})$.
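To make the three parameters concrete, the sketch below computes sample analogues from a set of cosine transforms. It assumes the standard least-squares identities for a regression without intercept (slope equals the cross moment over the regressor's second moment, and so on); the defining equation (5) is not reproduced in this excerpt, so these formulas, the function names, and the numbers are illustrative assumptions rather than the authors' estimator.

```python
import math


def covariability_parameters(omega_xx, omega_xy, omega_yy):
    """Correlation, slope, and prediction-error variance from 2x2 second moments."""
    rho = omega_xy / math.sqrt(omega_xx * omega_yy)
    beta = omega_xy / omega_xx
    sigma2_y_given_x = omega_yy - omega_xy ** 2 / omega_xx
    return rho, beta, sigma2_y_given_x


def sum_sq_error(b, X, Y):
    """Sum over j of (Y_j - b * X_j)^2, the least-squares objective."""
    return sum((yj - b * xj) ** 2 for xj, yj in zip(X, Y))


# Hypothetical cosine transforms (q = 6), used only to illustrate the algebra.
X = [1.2, -0.7, 0.4, 0.9, -0.3, 0.1]
Y = [1.0, -0.5, 0.5, 0.6, -0.4, 0.2]

oxx = sum(v * v for v in X)
oxy = sum(a * b for a, b in zip(X, Y))
oyy = sum(v * v for v in Y)
rho, beta, s2 = covariability_parameters(oxx, oxy, oyy)
```

In this sample version, `beta` minimizes `sum_sq_error`, `s2` equals the minimized objective, and `rho ** 2` is the share of the second moment of `Y` accounted for by `X`, mirroring the population interpretation in the text.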
1).⁵ If the spectral density converges for all frequencies close to zero
and the finite-sample second moment matrix correspondingly converges to its large-sample
counterpart (Müller and Watson (forthcoming, Lemma 2))
$$T^{2-2\kappa}\, \mathrm{Var}\begin{pmatrix} X_T \\ Y_T \end{pmatrix} = T^{2-2\kappa}\, \Sigma_T \to \Sigma. \qquad (7)$$
The limiting covariance matrix $\Sigma$ in (6) and (7) is a function of the “local-to-zero”
spectrum $S_\Delta$ and the cosine weights $\Psi(s)$ that determine $(X_T, Y_T)$; see Müller and Watson
(forthcoming) for additional details and an explicit formula. We make three comments about
these large-sample results. First, they hold when the first-difference of $z_t$ has a spectral
density (and therefore has limited persistence); the level of $z_t$ is more persistent than its
first difference and may have a (pseudo-) spectrum that diverges at frequency zero. In this
case $\Sigma$ remains finite because the cosine averages sum to zero ($\Psi_T' l_T = 0$), so they do not
extract zero-frequency variation in the data. If the level of $z_t$ has a spectral density then this
restriction on the weights is not required and, for example, the sample mean of $z_t$ also has a
large-sample normal limit. Second, in common parameterizations of persistence models, the
scale factor $T^{-\kappa}$ depends on the form of persistence; for example, the factor is $T^{-1/2}$ for I(0)
persistence and $T^{-3/2}$ for I(1) persistence. However, we focus on inference procedures that
do not depend on the scale of $z_t$ (due to invariance or equivariance), so $T^{-\kappa}$ does not need
to be known. Third, because $T^{2-2\kappa} \Sigma_T \to \Sigma$, then $T^{2-2\kappa} \Omega_T \to \Omega$ where $\Omega$ is defined as in
the last expression of (4) with $\Sigma$ in place of $\Sigma_T$. Correspondingly, $(\rho_T, \beta_T, T^{2-2\kappa} \sigma^2_{y|x,T}) \to
(\rho, \beta, \sigma^2_{y|x})$ with the limits defined by (5) with $\Omega$ in place of $\Omega_T$. Thus, a solution to
the small-sample problem of inference about $(\rho, \beta, \sigma^2_{y|x})$ from observing $(X, Y)$ readily
translates into a large-sample solution to inference about $(\rho_T, \beta_T, \sigma^2_{y|x,T})$.
5
The dependence of and ∆ on the sample size accommodates many forms of persistence that
require double arrays as data generating process, such as autoregressive roots of the order 1 − , for fixed
We omit the corresponding dependence of = ( ) on to ease notation.
3.2 Parameterizing long-run persistence and covariability
The limiting covariance matrix of the long-run projections, $\Omega$, is a function of the covariance
matrix of the cosine projections, $\Sigma$, which in turn is a function of the local-to-zero spectrum
for the first-difference of $z_t$, $S_\Delta$. The corresponding local-to-zero (pseudo-) spectrum for the
level of $z_t$ is $S_z(\omega) = \omega^{-2} S_\Delta(\omega)$. In this section we discuss parameterizations of $S_z$, $\Sigma$, and
$\Omega$.
It is instructive to consider two leading examples. In the first, $z_t$ is I(0) with long-run
covariance matrix $\Lambda$. In this case $S_z(\omega) \propto \Lambda$, and straightforward calculations show that
$\Sigma = \Lambda \otimes I_q$ and $\Omega \propto \Lambda$, so the covariance matrix associated with the long-run projections
corresponds to the usual long-run I(0) covariance matrix. In this model, the cosine transforms
$(X_{jT}, Y_{jT})$ plotted in Figures 2 and 3 are, in large samples and up to a deterministic
scale, i.i.d. draws from a $N(0, \Lambda)$ distribution. Inference about $\Omega = \Lambda$ and $(\rho, \beta, \sigma^2_{y|x})$ thus
follows from well-known small sample inference procedures for Gaussian data (see Müller
and Watson (forthcoming)). In the second example, $z_t$ is I(1) with $\Lambda$ the long-run covariance
matrix for $\Delta z_t$. In this case $S_z(\omega) \propto \omega^{-2} \Lambda$, and a calculation shows that $\Sigma = \Lambda \otimes D$,
where $D$ is a $q \times q$ diagonal matrix with $j$’th diagonal element $D_{jj} = (j\pi)^{-2}$. In this model,
the cosine transforms $(X_{jT}, Y_{jT})$ plotted in Figures 2 and 3 are, in large samples and up to
a deterministic scale, independent but heteroskedastic draws from $N(0, (j\pi)^{-2} \Lambda)$ distributions.
Thus $\Omega \propto \Lambda$, so the covariance matrix for long-run projections for $z_t$ corresponds to
the long-run covariance matrix for its first differences, $\Delta z_t$. By weighted least squares logic,
inference for I(1) processes follows after reweighting the elements of $(X_T, Y_T)$ by the square
roots of the inverse of the diagonal elements of $D$ and then using the same methods as in
the I(0) model.
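The weighted least squares logic for the I(1) case can be sketched as follows. The code assumes the $j$'th cosine transform has variance proportional to $(j\pi)^{-2}$ under the I(1) model, so multiplying it by $j\pi$ restores equal variances; the function names and the toy numbers are hypothetical.

```python
import math


def reweight_for_i1(transforms):
    """Multiply the j'th cosine transform by j * pi (the inverse standard deviation, up to scale)."""
    return [j * math.pi * v for j, v in enumerate(transforms, start=1)]


def slope_through_origin(X, Y):
    """Least-squares slope of Y on X without an intercept."""
    return sum(a * b for a, b in zip(X, Y)) / sum(a * a for a in X)


# Hypothetical cosine transforms with large low-frequency (small j) entries.
X = [2.0, -0.6, 0.3, 0.2]
Y = [1.8, -0.4, 0.35, 0.1]

beta_i0 = slope_through_origin(X, Y)  # treats all transforms as equally variable
beta_i1 = slope_through_origin(reweight_for_i1(X), reweight_for_i1(Y))  # I(1) weighting
```

The I(0) slope is dominated by the first, largest transforms, whereas the reweighted slope gives the higher-frequency transforms comparable influence; the two estimates differ whenever the relationship is not identical across frequencies.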
GDP, consumption, short-, and long-term interest rates: Table 1 presents estimates and
confidence sets for $(\rho, \beta, \sigma^2_{y|x})$ using $(X_T, Y_T)$ with $q = 12$ for GDP and consumption
(panel a) and short- and long-term interest rates (panel b). Results are presented for I(0)
and I(1) models, and for a more general model of persistence introduced below. For now,
focus on the I(0) and I(1) results. The point estimates shown in the table are MLEs.
Confidence intervals for $(\beta, \sigma^2_{y|x})$ are computed using standard finite-sample normal linear
regression formulae (after appropriate weighting in the I(1) model), and confidence sets for $\rho$
are constructed as in Anderson (1984, section 4.2.2).
For GDP and consumption, there are only minor differences between the I(0) and I(1)
estimates and confidence sets. The estimated long-run correlation is greater than 0.9, and the
lower bound of the 90% confidence interval exceeds 0.8 in both the I(0) and I(1) models.
Thus, despite the limited long-run information in the sample (captured here by the 12 observations
making up $(X_T, Y_T)$), the evidence points to a large long-run correlation between
GDP and consumption. The long-run regression of consumption onto GDP yields a regression
coefficient that is estimated to be 0.76 in the I(0) model and 0.84 in the I(1) model. This
estimate is sufficiently accurate that $\beta = 1$ is not included in the 90% I(0) confidence set.
The results for long-term and short-term nominal interest rates are similarly informative:
for example, there is strong evidence that the series are highly correlated over the long run,
although the I(0) and I(1) results differ more sharply than for GDP and consumption.
To take just one example, the 90% confidence interval for $\rho$ ranges from 0.82 to 0.97 in the
I(1) model but is narrower (0.93 to 0.99) in the I(0) model.
As we show in Table 2 below, the I(0) assumption yields confidence intervals with coverage
probability far below the nominal level when in fact the data were generated by the I(1)
model, and vice versa. This raises the question of how to obtain valid inference in both
models and, more generally, under a wider range of forms of persistence.
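A small Monte Carlo experiment illustrates how a misspecified persistence model distorts coverage. This is a simplified sketch, not a reproduction of Table 2: it checks whether a homoskedastic (I(0)-style) 90% interval for the regression slope covers the true value of zero, first when the 12 cosine transforms are i.i.d. and then when their standard deviations decline like $1/(j\pi)$ as under I(1) persistence. The function names, the seed, and the number of replications are our choices, and the critical value is hard-coded for $q = 12$.

```python
import math
import random

# Two-sided 90% critical value of Student's t with 11 degrees of freedom (q = 12).
T_CRIT_90_DF11 = 1.796


def i0_interval_covers_zero(X, Y, crit=T_CRIT_90_DF11):
    """True if the homoskedastic interval for the no-intercept slope contains 0."""
    q = len(X)
    sxx = sum(a * a for a in X)
    beta_hat = sum(a * b for a, b in zip(X, Y)) / sxx
    resid_var = sum((b - beta_hat * a) ** 2 for a, b in zip(X, Y)) / (q - 1)
    se = math.sqrt(resid_var / sxx)
    return abs(beta_hat) <= crit * se


def coverage(std_devs, reps, rng):
    """Fraction of replications in which the interval covers the true slope of 0."""
    hits = 0
    for _ in range(reps):
        X = [rng.gauss(0.0, s) for s in std_devs]
        Y = [rng.gauss(0.0, s) for s in std_devs]  # independent of X, so slope is 0
        hits += i0_interval_covers_zero(X, Y)
    return hits / reps


q = 12
rng = random.Random(12345)
cov_i0 = coverage([1.0] * q, 5000, rng)
cov_i1 = coverage([1.0 / (j * math.pi) for j in range(1, q + 1)], 5000, rng)
print("I(0) design:", cov_i0, " I(1) design:", cov_i1)
```

Under the first design the interval is exact, so the estimated coverage should be close to 0.90; under the second it falls well short, because a few low-frequency transforms dominate the regression while the standard error is computed as if all 12 were equally informative.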
The shape of the local-to-zero spectrum determines the long-run persistence properties of
the data, and misspecification of this persistence leads to faulty inference about long-run
covariability. Thus, parameterizing $S_z$ is a crucial issue for inference about long-run covariability.
Addressing this issue faces a familiar trade-off: the parameterization needs to
be sufficiently flexible to yield reliable inference about long-run covariability for a wide
range of economically relevant stochastic processes and yet be sufficiently constrained to be
tractable. I(0) persistence generates a flat local-to-zero spectrum, and I(1) persistence generates
a local-to-zero spectrum proportional to $\omega^{-2}$. Both of these models are tractable, but
tightly constrain the spectrum. This limits their usefulness as general models for conducting
inference about long-run covariability.
With this trade-off in mind, we use a parameterization that nests and generalizes a range
of models previously used to model persistence in economic time series. The parameterization
is a bivariate extension of the univariate $(b, c, d)$ model used in Müller and Watson (2016)
and yields a local-to-zero spectrum of the form
$$S_z(\omega) \propto A \begin{pmatrix} (\omega^2 + c_1^2)^{-d_1} & 0 \\ 0 & (\omega^2 + c_2^2)^{-d_2} \end{pmatrix} A' + BB'. \qquad (8)$$
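To see how one parameterization can nest the simpler persistence models, the sketch below evaluates a bivariate local-to-zero spectrum of the form $A\,\mathrm{diag}((\omega^2 + c_i^2)^{-d_i})\,A' + BB'$ at a given frequency. This functional form is our reading of (8), and the function name and parameter values are hypothetical.

```python
def local_to_zero_spectrum(omega, A, B, c, d):
    """Evaluate A diag((omega^2 + c_i^2)^(-d_i)) A' + B B' at omega > 0 (2x2 case)."""
    g = [(omega ** 2 + c[i] ** 2) ** (-d[i]) for i in range(2)]
    S = [[0.0, 0.0], [0.0, 0.0]]
    for r in range(2):
        for s in range(2):
            persistent_part = sum(A[r][i] * g[i] * A[s][i] for i in range(2))
            flat_part = sum(B[r][k] * B[s][k] for k in range(2))
            S[r][s] = persistent_part + flat_part
    return S


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# I(1)-type case: c = 0, d = 1, B = 0 gives a spectrum proportional to omega^(-2).
s_i1 = local_to_zero_spectrum(0.5, identity, zeros, c=(0.0, 0.0), d=(1.0, 1.0))

# I(0)-type case: A = 0 leaves only the flat component B B'.
B = [[1.0, 0.0], [0.5, 2.0]]
s_i0_low = local_to_zero_spectrum(0.1, zeros, B, c=(0.0, 0.0), d=(1.0, 1.0))
s_i0_high = local_to_zero_spectrum(3.0, zeros, B, c=(0.0, 0.0), d=(1.0, 1.0))
```

Setting both components to nonzero values gives a local-level type spectrum, and fractional values of `d` with `c = 0` give fractionally integrated behavior, which is the sense in which the general form covers the special cases discussed in the text.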
on $\theta$, so will the expected length of $H(X, Y)$ and the coverage probability. By definition, the
coverage constraint must be satisfied for all values of $\theta \in \Theta$, but one has freedom in choosing
the value of $\theta$ over which expected length is to be minimized. As a general matter, let $W$
denote a distribution that puts weight on different values of $\theta$, so the problem becomes
$$\min_{H} \int E_\theta[\mathrm{vol}(H(X, Y))]\, dW(\theta) \qquad (9)$$
subject to
$$\inf_{\theta \in \Theta} P_\theta(\gamma \in H(X, Y)) \ge 1 - \alpha \qquad (10)$$
where the objective function (9) emphasizes that the expected volume depends on the value
of $\theta$, with different values of $\theta$ weighted by $W$, and the coverage constraint (10) emphasizes
that the constraint must hold for all values of $\theta$ in the parameter space $\Theta$.
As noted by Pratt (1961), the expected length of a confidence set for $\gamma$ can be expressed
in terms of the power of hypothesis tests of $H_0: \gamma = \gamma_0$. The solution to (9)-(10) thus
amounts to the determination of a family of most powerful hypothesis tests, indexed by $\gamma_0$.
Elliott, Müller, and Watson (2015) suggest a numerical approach to compute corresponding
approximate “least favorable distributions” for $\theta$. We implement a version of those methods
here; details are provided in the supplementary appendix. A key feature of the solution
is that, conditional on the weighting function $W$ and the least favorable distribution, the
confidence sets have the familiar Neyman-Pearson form with a version of the likelihood ratio
determining the values of $\gamma$ included in the confidence interval.
While the resulting confidence intervals have (close to) smallest weighted expected length,
they can have unreasonable properties for particular realizations of $(X, Y)$. Indeed, for some
values of $(X, Y)$, the confidence intervals might be empty, with the uncomfortable implication
that, conditional on observing these values of $(X, Y)$, one is certain that the confidence
interval excludes the true value. To avoid this, we follow Müller and Norets (2016) and
restrict the confidence sets to be supersets of $1 - \alpha$ Bayes credible sets.⁷
⁷ Numerical calculations show that the Müller and Norets (2016) adjustment has a small (3%-8%) effect
on expected length of 95%, 90%, and 67% confidence intervals for all three parameters of interest.
4.2 Some specifics
4.2.1 Invariance and equivariance
Correlations are invariant to the scale of the data. The linear regression of $y_t$ onto $x_t$ is
the same as the regression of $y_t + b x_t$ onto $x_t$ after subtracting $b$ from the latter’s regression
coefficient. It is sensible to impose the same invariance/equivariance on the confidence
intervals. Thus, letting $H_\rho$, $H_\beta$, and $H_\sigma$ denote confidence sets for $\rho$, $\beta$, and $\sigma_{y|x}$, we restrict
these sets as follows:
$$\rho \in H_\rho(X, Y) \Leftrightarrow \rho \in H_\rho(b_x X, b_y Y) \text{ for } b_x b_y > 0 \qquad (11)$$
$$\beta \in H_\beta(X, Y) \Leftrightarrow \frac{b_y \beta + b_{yx}}{b_x} \in H_\beta(b_x X, b_y Y + b_{yx} X) \text{ for } b_x, b_y \neq 0 \text{ and all values of } b_{yx} \qquad (12)$$
$$\sigma_{y|x} \in H_\sigma(X, Y) \Leftrightarrow |b_y|\, \sigma_{y|x} \in H_\sigma(b_x X, b_y Y + b_{yx} X) \text{ for } b_x, b_y \neq 0 \text{ and all values of } b_{yx}. \qquad (13)$$
These invariance/equivariance restrictions lead to two modifications to the solution to
(9)-(10). First, they require the use of maximal invariants in place of the original $(X, Y)$.
The density of the maximal invariants for each of these transformations is derived in the
supplementary appendix. Second, because the objective function (9) is stated in terms of
$(X, Y)$, minimizing expected length by inverting tests based on the maximal invariant leads
to a slightly different form of optimal test statistic. Müller and Norets (2016) develop these
modifications in a general setting, and the supplementary appendix derives the resulting
form of confidence sets for our problem.
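The equivariance of the regression slope and the residual scale can be verified directly on sample quantities. The sketch below rescales the regressor by a factor `b_x` and replaces the dependent variable with `b_y * Y + b_yx * X`; these symbol names, the helper functions, and the toy data are our own illustrative choices.

```python
import math


def slope_and_resid_sd(X, Y):
    """No-intercept least-squares slope and root mean squared residual."""
    sxx = sum(a * a for a in X)
    beta = sum(a * b for a, b in zip(X, Y)) / sxx
    rss = sum((b - beta * a) ** 2 for a, b in zip(X, Y))
    return beta, math.sqrt(rss / len(X))


def transform(X, Y, b_x, b_y, b_yx):
    """Map (X, Y) to (b_x * X, b_y * Y + b_yx * X)."""
    return [b_x * a for a in X], [b_y * b + b_yx * a for a, b in zip(X, Y)]


# Hypothetical cosine transforms.
X = [1.2, -0.7, 0.4, 0.9, -0.3]
Y = [1.0, -0.5, 0.5, 0.6, -0.4]
b_x, b_y, b_yx = 2.0, -3.0, 0.5

beta, sd = slope_and_resid_sd(X, Y)
X2, Y2 = transform(X, Y, b_x, b_y, b_yx)
beta2, sd2 = slope_and_resid_sd(X2, Y2)

# The slope moves to (b_y * beta + b_yx) / b_x and the residual scale to |b_y| * sd.
slope_gap = abs(beta2 - (b_y * beta + b_yx) / b_x)
scale_gap = abs(sd2 - abs(b_y) * sd)
```

Both gaps are zero up to rounding, so a confidence set that respects these transformations reports the same substantive conclusion regardless of the units in which the two series are measured.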
The parameter space for $\theta = (A, B, c, d)$ is as follows: $A$ and $B$ are real, with $B$ lower-triangular
and $(A, B)$ chosen so that $\Omega$ is non-singular, $c_i \ge 0$, and $-0.4 \le d_i \le 1$, for
$i = 1, 2$.⁸ Thus, the confidence intervals control coverage over a wide range of persistence
patterns including processes less persistent than I(0), as persistent as I(1), local-to-unity
autoregressions, and where different linear combinations of $x_t$ and $y_t$ may have markedly
different persistence (as, for example, in a cointegrated model).
⁸ See Appendix 3.2 for details.
The confidence sets we construct require three distributions over $\theta$: the weighting function
$W$ for computing the average length in the objective (9), the Bayes prior associated
with the Bayes credible sets that serve as subsets for the confidence sets (Müller and Norets
(2016)), and the least favorable distribution for $\theta$ that enforces the coverage constraint. The
latter is endogenous to the program (9)-(10) and is approximated using numerical methods
similar to those discussed in Elliott, Müller, and Watson (2015), with details provided in the
supplementary appendix. We use the same distribution for $W$ and the Bayes prior. Specifically,
the distribution is based on the bivariate I($d$) model (so that $c_1 = c_2 = 0$, $B = 0$) with
$d_1$ and $d_2$ independently distributed $U(-0.4, 1.0)$. Because of the invariance/equivariance
restrictions, the scale of the matrix $A$ is irrelevant and we set $A = R(\lambda_1) G(s) R(\lambda_2)$, where
$R(\lambda)$ is a rotation matrix indexed by the angle $\lambda$, with $\lambda_1$ and $\lambda_2$ independently distributed
$U[0, \pi]$. The relative eigenvalues of $A$ are determined by the diagonal matrix $G(s)$, with
$G_{11}/G_{22} = 15^s$ with $s$ distributed $U[0, 1]$.
Table 1 in Section 3.2 above shows estimates for $(\rho, \beta, \sigma_{y|x})$ and confidence sets using the
$(A, B, c, d)$ model. The estimated value of $(\rho, \beta, \sigma_{y|x})$ is the median of the posterior using
the I($d$)-model prior, and the table also shows Bayes credible sets for this prior for comparison
with the frequentist confidence intervals. For GDP and consumption, the $(A, B, c, d)$ results
look much like the results obtained for the I(0) model. For most entries, the Bayes credible
sets are slightly larger than the I(0) sets, presumably reflecting the possibility of persistence
greater than I(0), as was evident in Figure 4. The frequentist confidence intervals often
coincide with Bayes intervals, but occasionally are somewhat wider. The results indicate
that GDP and consumption are highly correlated in the long run (the 90% confidence set is
$0.71 \le \rho \le 0.97$) and the long-run regression coefficient of consumption onto GDP is large,
but less than unity (the 90% confidence set is $0.48 \le \beta \le 0.95$). The results for interest
rates are somewhat different. The confidence intervals (and Bayes credible sets) are roughly
in between the I(0) and I(1) intervals. Substantively, the results indicate that long-run
movements in short and long rates are highly correlated, and that a unit long-run response
of long rates to short rates is consistent with these data.
4.3 Coverage properties of restricted versions of the (A, B, c, d) model
In this subsection we investigate the coverage distortions for confidence intervals constructed
using misspecified models of persistence. Specifically, we consider five models of persistence,
and for each model we both generate data and construct confidence intervals for ρ. The data
are generated using ρ = 0, and Table 2 shows the fraction of the confidence sets that include
the true value ρ = 0 (see footnote 9). The models considered are the I(0) model (S(ω) ∝ AA′), the I(1)
model (S(ω) = ω^{-2} AA′), a bivariate “local-level” model that includes I(0) and I(1) components
(S(ω) ∝ ω^{-2} AA′ + BB′), the fractional I(d) model (S(ω) ∝ A D(ω) A′, with D(ω) diagonal and
D_jj(ω) = ω^{-2 d_j}), and the general (A, B, c, d) model with S(ω) given by (8). Because data
were generated by, and confidence intervals constructed for, each of these five models, the
table contains 25 entries. The columns indicate the model used to generate the data, the
rows show the model used to construct the confidence set, and the entries are the fraction
of confidence sets that contain the true value of ρ, minimized over the other parameters
used to generate the data. The diagonal entries of the table are 0.90, indicating that each
method has coverage 90% under its assumed data generating process. The off-diagonal entries differ
from 0.90 and show the coverage distortions. For example, 90% I(0) confidence sets have
coverage of just 1% when the data are generated by the other four models. I(1) confidence
sets have similarly bad coverage when the data are not generated by the I(1) model. The
I(0) + I(1) model encompasses both the I(0) and I(1) models, so the associated confidence
intervals have good coverage for these models, but have coverage of only 68% in the I(d) and
(A, B, c, d) models. The I(d) model encompasses the I(0) and I(1) models, and so has good
coverage for these models. It does not encompass the I(0) + I(1) or (A, B, c, d) models,
but exhibits only a small coverage distortion in these cases. Finally, the general (A, B, c, d)
model encompasses all of the other models, and so controls coverage uniformly across these
models.
9 Results are shown for confidence sets that do not incorporate the Müller-Norets Bayes superset adjustment. Including this adjustment yields similar results.
Table 2 highlights the large coverage distortions associated with confidence intervals based
on I(0), I(1), or I(0) + I(1) models. These results echo results in the earlier literature on
the fragility of I(0) and I(1) inference (e.g., den Haan and Levin (1997) for HAC inference
in I(0) models and Elliott (1998) for inference in cointegrated models). Table 2 suggests
that inference based on the I(d) model is much less fragile; indeed it offers near nominal
coverage in Table 2. However, the I(d) model does not fare as well in other contexts; for
example, Müller and Watson (2016) show that the I(d) model yields long-run prediction sets with
significant undercoverage when data are generated by a univariate analogue of the (A, B, c, d)
model.
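The flavor of these coverage distortions can be reproduced in a small Monte Carlo sketch. The code below is not the procedure behind Table 2: it uses a textbook t-based 90% set for a zero correlation, which is exact when the q = 12 cosine transforms are i.i.d. normal (the I(0) case), and applies it to two illustrative data generating processes chosen here as assumptions, namely independent white noise and independent random walks. All function names are hypothetical.

```python
import math
import random
from itertools import accumulate

T_CRIT_11DF = 1.796  # 95th percentile of Student-t with 11 df (q - 1 for q = 12)


def cosine_weights(T, q):
    """Rows j = 1..q of sqrt(2) * cos(j*pi*(t - 1/2)/T), t = 1..T."""
    return [[math.sqrt(2.0) * math.cos(j * math.pi * (t - 0.5) / T)
             for t in range(1, T + 1)]
            for j in range(1, q + 1)]


def cosine_transforms(x, weights):
    """Cosine-weighted averages X_j = T^{-1} sum_t weight_j(t) * x_t."""
    return [sum(w * v for w, v in zip(row, x)) / len(x) for row in weights]


def i0_set_covers_zero(X, Y):
    """True if the 90% I(0)-based set for the correlation contains zero."""
    q = len(X)
    sxy = sum(a * b for a, b in zip(X, Y))
    sxx = sum(a * a for a in X)
    syy = sum(b * b for b in Y)
    r = sxy / math.sqrt(sxx * syy)
    # |t| <= t_crit with t = r*sqrt(q-1)/sqrt(1-r^2) is equivalent to |r| <= r_crit.
    r_crit = T_CRIT_11DF / math.sqrt(T_CRIT_11DF ** 2 + q - 1)
    return abs(r) <= r_crit


def simulate_coverage(integrated, T=100, reps=1000, seed=0):
    """Fraction of replications in which the I(0)-based set covers rho = 0.

    integrated=False: two independent white-noise series (I(0)).
    integrated=True:  two independent random walks (I(1)).
    """
    q = 12  # fixed, because the hard-coded critical value assumes q - 1 = 11 df
    rng = random.Random(seed)
    weights = cosine_weights(T, q)
    hits = 0
    for _ in range(reps):
        pair = []
        for _ in range(2):
            shocks = [rng.gauss(0.0, 1.0) for _ in range(T)]
            series = list(accumulate(shocks)) if integrated else shocks
            pair.append(cosine_transforms(series, weights))
        hits += i0_set_covers_zero(pair[0], pair[1])
    return hits / reps


if __name__ == "__main__":
    print("I(0) data:", simulate_coverage(integrated=False))
    print("I(1) data:", simulate_coverage(integrated=True))
```

Under white noise the set should cover ρ = 0 in roughly 90% of the replications, while under random walks coverage should fall well below the nominal level, because the cosine transforms then have very unequal variances across frequencies. The design is deliberately simple and does not minimize coverage over nuisance parameters as Table 2 does.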
5 Empirical Analysis
The last section showed results for the long-run covariation between GDP and consumption
and between short- and long-term nominal interest rates. In this section we use the same
methods to investigate other important long-run correlations. We focus on two questions:
first, how much information does the sample contain about the long-run covariability, and
second, what are the values of the long-run covariability parameters. A knee-jerk reaction to
investigating long-run propositions in economics using, say, 68-year spans of data is that little
can be learned, particularly so using analysis that is robust to a wide range of persistence
patterns. In this case, even efficient methods for extracting relevant information from the
data will yield confidence intervals that are so wide that they rule out few plausible parameter
values. We find this to be true for some of the long-run relationships investigated below. But,
as we have seen from the consumption-income and interest rate data, confidence intervals
about long-run parameters can be narrow and informative, and this is true for several of the
relationships that we now investigate.
5.1 Balanced growth correlations
In the standard one-sector growth model, variations in per-capita GDP, consumption, invest-
ment, and in real wages arise from variations in total factor productivity (TFP). Balanced
growth means that the consumption-to-income ratio, the investment-to-income ratio, and
labor’s share of total income are constant over the long run. This implies perfect pairwise
long-run correlations between the logarithms of income, consumption, investment, labor
compensation, and TFP. In this model, the long-run regression of the logarithm of consump-
tion onto the logarithm of income has a unit coefficient, as do the same regressions with
consumption replaced by investment or labor income. A long-run one-percentage point increase
in TFP leads to a long-run increase of 1/(1 − α) percentage points in the other variables,
where (1 − α) is labor’s share of income. Of course, these implications involve the evolution
of the variables over the untestable infinite long-run. That said, empirical analysis can
determine how well these implications stand up as approximations to below business cycle
frequency variation in data spanning the post-WWII period. We use data for the U.S. and
the methods discussed above to investigate these long-run balanced growth propositions. The
supplemental appendix contains a description of the data that are used.
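As a reminder of where the multiplier on TFP comes from, the following is a minimal sketch under an assumed Cobb-Douglas technology with capital share α. The functional form and the per-capita notation (y = Y/L, k = K/L) are introduced here for illustration and are not taken from the text above.

```latex
\begin{align*}
Y_t = A_t K_t^{\alpha} L_t^{1-\alpha}
  \quad &\Longrightarrow \quad
  \Delta \ln y_t = \Delta \ln A_t + \alpha \, \Delta \ln k_t , \\
\text{constant capital-output ratio}
  \quad &\Longrightarrow \quad
  \Delta \ln k_t = \Delta \ln y_t , \\
\text{hence} \qquad
  \Delta \ln y_t &= \frac{1}{1-\alpha} \, \Delta \ln A_t .
\end{align*}
```

For example, with a labor share of 1 − α = 2/3, a long-run one percentage point increase in TFP growth corresponds to a 1.5 percentage point increase in the long-run growth of per-capita output.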
Figure 4 plots the long-run projections of the growth rates of GDP, consumption, invest-
ment, labor income and TFP. (The long-run projections for consumption and GDP were
shown previously in Figure 2.a.) The figure indicates substantial long-run covariability over
the post-WWII period, but less so for investment than the other variables. Table 3 summarizes
the results on the long-run correlations. The values above the main diagonal show
point estimates constructed as the posterior median using the I(d)-model with the prior
discussed above, together with 67% confidence intervals using the general (A, B, c, d) model
(shown in parentheses). The values below the main diagonal are the corresponding 90%
confidence intervals using the (A, B, c, d) model. Table 4 reports results from selected long-run
regressions.
As reported in the previous section, the long-run correlation between GDP and consump-
tion is large. Labor income and GDP are highly correlated with a tightly concentrated 90%
confidence interval of 0.94 to 0.99. The estimated long-run correlation of TFP and GDP
is also high, although the correlation of TFP and the other variables appears to be some-
what lower. Investment and GDP are less highly correlated; the upper bound of the 90%
confidence interval is only 0.8 and the lower bound is close to zero.
Table 4 shows results from long-run regressions of the growth rates of consumption, in-
vestment, and labor income onto the growth rate of GDP, and the corresponding regression of
GDP onto TFP. Labor compensation appears to vary more than one-for-one with GDP and
(as reported above) consumption less than one-for-one. The long-run investment-GDP re-
gression coefficient is imprecisely estimated. Disaggregating consumption into nondurables,
durables, and services suggests that durable consumption responds more to long-run
variations in GDP than do services and nondurables. These long-run regression results are
reminiscent of results using business cycle covariability, and in Section 6 we investigate their
robustness to the periodicities incorporated in the long-run analysis.
In summary, what has the 68-year post-WWII sample been able to say about the
balanced-growth implications of the simple growth model? First, that several of the variables
are highly correlated over the long run (labor income and GDP, consumption and GDP),
and second, that the long-run regression coefficient on GDP is different from unity for some
variables (consumption and labor income). There is less information about the long-run
covariability of investment with the other variables, although even here there are things to
learn, such as that the long-run correlation of investment and GDP is unlikely to be much larger
than 0.8.
in PCE and CPI inflation track each other closely and the 90% confidence interval shown
in Table 5 suggests that the long-run correlation is greater than 0.95. The long-run regression
of CPI inflation on PCE inflation yields an estimated slope coefficient of 1.13 (90%
confidence interval: 0.98 ≤ β ≤ 1.24), suggesting a larger bias in the CPI during periods of
high trend inflation.
Long-run Fisher correlation and the real term structure: The next two entries in the
figure and table show the long-run covariation of inflation and short- and long-term nominal
interest rates. The well-known Fisher relation (Fisher (1930)) decomposes nominal rates into
an inflation and real interest rate component making it interesting to gauge how much of the
long-run variation in nominal rates can be explained by long-run variation in inflation. The
long-run correlation of nominal interest rates and inflation is estimated to be approximately
0.5, although the confidence intervals indicate substantial uncertainty. A unit long-run
regression coefficient of nominal rates onto inflation is consistent with the data, but the confidence
intervals are wide (see footnote 10). The next entry in the figure and table shows the long-run covariation
in short- and long-term real interest rates (constructed as nominal rates minus the PCE
inflation rate). Like their nominal counterparts, short- and long-term real rates are highly
correlated over the long run (90% confidence interval: 0.80 ≤ ρ ≤ 0.98) with a near unit
regression coefficient of long rates onto short rates.
Money growth and inflation: An important implication of the quantity theory of money
is the close relationship between money growth and price inflation over the long run. Lucas
(1980) investigated this implication using time series data on money (M1) growth and (CPI)
inflation for the U.S. over 1953-1977. After using an exponential smoothing filter to isolate
long-run variation in the series, he found a nearly one-for-one relationship between money
growth and inflation. The next entry in the figure and table examines this long-run relation
using the same M1 and CPI data used by Lucas, but over the longer sample period, 1947-2015.
Figure 5 shows the close long-run relationship between money growth and inflation
from the mid-1950s through the late 1970s documented by Lucas, but shows a much weaker (or
non-existent) relationship in the post-1980 sample period, and over the entire sample period
the estimated long-run correlation is only 0.12 with a 67% confidence interval that ranges
from -0.17 to 0.54.
10 These estimates measure the long-run Fisher “correlation,” not the long-run Fisher “effect”. The long-run Fisher correlation considers variation from all sources, while the Fisher effect instead considers variation associated with exogenous long-run nominal shocks (e.g., Fisher and Seater (1993), King and Watson (1997)). A similar distinction holds for the Phillips correlation and the Phillips curve (King and Watson (1994)).
Long-run Phillips correlation: The next entry summarizes the long-run correlation between
unemployment and inflation. The estimated long-run Phillips correlation and
slope coefficient are positive, but ρ = β = 0 is contained in the 67% confidence interval.
That said, the confidence intervals are wide so that, like the Fisher correlation, the data are
not very informative about the long-run Phillips correlation.
Unemployment and productivity: Panel (g) of the figure investigates the long-run co-
variation of the unemployment rate and productivity growth. The large negative in-sample
long-run correlation evident in the figure has been noted previously (e.g., Staiger, Stock,
and Watson (2001)); the confidence intervals reported in Table 5 show that the correlation
is unlikely to be spurious. There is a statistically significant negative long-run relationship
between the variables. A long-run one percentage point increase in the rate of growth of
productivity is associated with an estimated one percentage point decline in the long-run
unemployment rate. We are unaware of an economically compelling theoretical explanation
for the large negative correlation.
Real returns and consumption growth: Consumption-based asset pricing models (e.g.,
Lucas (1978)) draw a connection between consumption growth (as an indicator of the
intertemporal marginal rate of substitution) and asset returns. A large literature has followed
Hansen and Singleton (1982, 1983) investigating this relationship, with varying degrees of
success. Rose (1988) discusses the puzzling long-run implications of the model when
consumption growth follows an I(0) process and real returns are I(1) (also see Neely and
Rapach (2008)), but moving beyond the I(0) and I(1) models, it is clear from the empirical
results reported above that both consumption growth and real interest rates exhibit substantial
long-run variability. The next two entries in the figure and table investigate the long-run
covariability between consumption growth and real returns, first using real returns on
short-term treasury bills and then using real returns on stocks. Both suggest a moderate
positive long-run correlation between real returns and consumption growth rates, although
the confidence interval is wide (90% confidence range from just below zero to 0.80).
Stock Prices, Dividends, and Earnings: Present value models of stock prices imply a close
relationship between long-run values of prices, dividends, and earnings (e.g., Campbell and
Shiller (1987)). An implication of this long-run relation in a cointegration framework is that
dividends, earnings, and stock prices share a common I(1) trend, so that their growth rates
are perfectly correlated in the long run and the dividend-price or price-earnings ratio is useful
for predicting future stock returns. This latter implication has been widely investigated
(see Campbell and Yogo (2006) for analysis and references). The next two entries show
the long-run correlation of stock prices with dividends and with earnings (footnote 11). While there
is considerable uncertainty about the value of the long-run correlation between prices and
dividends or earnings, the data suggest that the correlation is not strong. For example,
values above ρ = 0.43 are ruled out by the 67% confidence set and values above 0.72 are
ruled out by the 90% sets.
Long-run PPP: The final entry shows results on the long-run correlation between nominal
exchange rates (here the U.S. dollar/British pound exchange rate from 1971-2015) and the
ratio of nominal prices (here the ratio of CPI indices for the two countries). Long-run PPP
implies that the nominal exchange rate should move proportionally with the price ratio over
long time spans, so the long-run growth rates of the nominal exchange rate and price ratios
should be perfectly correlated. A large literature has tested this proposition in a unit-root
and cointegration framework and obtained mixed conclusions. (See Rogoff (1996) and Taylor
and Taylor (2004) for discussion and references.) From the final row of Table 5, the growth
rates of nominal exchange rates and relative nominal prices are positively correlated over
the long run, statistically significantly so at the 33% significance level, but the correlation
is far from perfect (ρ < 0.72 based on the 90% confidence set). We highlight two caveats.
First, we use the post-Bretton Woods sample period, so the sample includes only 45 years,
and using q = 12 cosine terms the long-run projections capture variability with periods of
(approximately) 7 years or higher. This 7-year period may be sufficiently short that long-run
adjustments have not occurred, something we investigate in the next section. Second, the
price ratio uses relative CPIs, a large component of which includes non-traded goods which
may be less tightly linked to exchange rates than prices of traded goods.
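The mapping from the sample span and the number of cosine terms to the periods captured by the long-run projections is simple arithmetic: the j-th cosine regressor completes one cycle every 2T/j time units. A minimal sketch follows, with hypothetical function names chosen here for illustration.

```python
def cosine_period(span_years, j):
    """Period, in years, of the j-th cosine term on a sample spanning span_years."""
    return 2.0 * span_years / j


def period_band(span_years, j_lo, j_hi):
    """(shortest, longest) period, in years, covered by cosine terms j_lo..j_hi."""
    return cosine_period(span_years, j_hi), cosine_period(span_years, j_lo)


if __name__ == "__main__":
    # Post-Bretton Woods sample (45 years) with the first 12 cosine terms.
    print(period_band(45, 1, 12))
    # Full post-WWII sample (68 years) with the first 12 cosine terms.
    print(period_band(68, 1, 12))
```

For the 45-year exchange rate sample and 12 cosine terms the shortest period is 2 × 45 / 12 = 7.5 years, in line with the approximate 7-year figure quoted above.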
years is longer than typical business cycles, it does incorporate periods corresponding to
what some researchers refer to as the “medium run” (Blanchard (1997), Comin and Gertler
(2006)). In this section we consider measures of long-run covariability that focus on a subset
of the periods. This allows a comparison of, say, results from periods corresponding to the
“medium-long run” and to those from the “longer-long run.”
To motivate the new measures, look again at Figure 2.a, which plots the projections of
GDP and consumption growth rates onto q = 12 cosine regressors with periods that range
from T/6 (≈ 11 years) to 2T (136 years). Figures 6.a and 6.b show the corresponding
projections onto the first six of these cosine terms (with periods from T/3 ≈ 23 years to
2T = 136 years) and the last six cosine terms (with periods 2T/12 ≈ 11 years to 2T/7 ≈ 19
years). The first of these captures the longer-long-run variation in the data, and the second
captures the medium-long-run variability. Each can be studied separately. To differentiate
these periodicities, we replace equation (4) with

$$
\Omega_{j:k} = T^{-1} \sum_{t=1}^{T} E\left[ \begin{pmatrix} \hat{x}_{j:k,t} \\ \hat{y}_{j:k,t} \end{pmatrix} \begin{pmatrix} \hat{x}_{j:k,t} & \hat{y}_{j:k,t} \end{pmatrix} \right]
= E \begin{bmatrix} X_{j:k}' X_{j:k} & X_{j:k}' Y_{j:k} \\ Y_{j:k}' X_{j:k} & Y_{j:k}' Y_{j:k} \end{bmatrix}
\tag{14}
$$

where the subscript “j:k” notes that the projection is computed using the jth through kth
cosine terms (i.e., the jth through kth columns of Ψ) corresponding to periods 2T/j through
2T/k. Thus the longer-long-run periodicities shown in Figure 6.a correspond to the covariance
matrix Ω_{1:6} (the first 6 cosine terms) and the medium-long-run periodicities in Figure 6.b
correspond to Ω_{7:12} (the 7th through 12th cosine terms).
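A sample analogue of this band-specific second-moment matrix can be sketched as follows. The code is illustrative only: equation (14) is a population expectation, whereas the functions below compute the corresponding sample quantities from one realization, and the function names, along with the use of the usual correlation and regression-slope formulas, are assumptions made here.

```python
import math


def cosine_basis(T, q):
    """T x q matrix (list of rows) with entries sqrt(2) * cos(j*pi*(t - 1/2)/T)."""
    return [[math.sqrt(2.0) * math.cos(j * math.pi * (t - 0.5) / T)
             for j in range(1, q + 1)]
            for t in range(1, T + 1)]


def cosine_transform(x, basis):
    """Cosine-weighted averages X_j = T^{-1} sum_t basis[t][j] * x[t]."""
    T, q = len(basis), len(basis[0])
    return [sum(basis[t][j] * x[t] for t in range(T)) / T for j in range(q)]


def band_projection(X, basis, j, k):
    """Fitted values from projecting onto cosine terms j..k (1-based, inclusive)."""
    return [sum(row[m] * X[m] for m in range(j - 1, k)) for row in basis]


def band_second_moments(X, Y, j, k):
    """Sample analogue of [[X'X, X'Y], [Y'X, Y'Y]] using cosine terms j..k only."""
    xs, ys = X[j - 1:k], Y[j - 1:k]
    sxx = sum(a * a for a in xs)
    sxy = sum(a * b for a, b in zip(xs, ys))
    syy = sum(b * b for b in ys)
    return [[sxx, sxy], [sxy, syy]]


def correlation_and_slope(omega):
    """Correlation and slope of y on x implied by a 2x2 second-moment matrix."""
    sxx, sxy, syy = omega[0][0], omega[0][1], omega[1][1]
    return sxy / math.sqrt(sxx * syy), sxy / sxx
```

Because the cosine regressors are orthogonal with T^{-1} Ψ′Ψ equal to the identity, the average squared band projection equals the sum of squared cosine transforms in that band, which is why the second-moment matrix can be computed directly from the transforms. Calling `band_second_moments(X, Y, 1, 6)` and `band_second_moments(X, Y, 7, 12)` then gives the longer-long-run and medium-long-run sample analogues, respectively.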
Throughout the paper we have used q to denote the number of low-frequency cosine
terms that define the long-run periods of interest (perhaps divided further into longer-long
and medium-long). But q plays another important role in the analysis. The value of Ω (or
now Ω_{j:k}) ultimately depends on the variability and persistence in the stochastic process as
exhibited in the local-to-zero (pseudo-) spectrum S. This spectrum is parameterized by
(A, B, c, d); see equation (8). We learn about the value of these parameters (and therefore
the value of Ω) using the data (X_{1:q}, Y_{1:q}). Thus, q also denotes the sample variability in
the data that is used to infer the value of the long-run covariance matrix Ω. So, while our
interest might lie in the longer-long-run covariability captured in Ω_{1:6}, the sample variability
in (X_{1:12}, Y_{1:12}) might be used to learn about Ω_{1:6}. While it is arguably most natural to
match the variability in the data used for inference to the variability of interest, for example
using (X_{1:q}, Y_{1:q}) to learn about Ω_{1:q}, if the (A, B, c, d) model accurately characterizes the
spectrum over a wider frequency band, then variability over this wider band can improve
inference. But of course using a wider frequency band runs the risk of misspecification if
the (A, B, c, d) model is a poor characterization of the spectrum over this wider range of
frequencies. This is the standard trade-off of robustness and efficiency.
With these ideas in mind, Table 6 shows results for long-run correlation and regression
parameters from Ω_{1:12}, Ω_{1:6}, and Ω_{7:12}, corresponding to periods T/6 and higher, T/3 and
higher, and T/6 through 2T/7. Results are shown using inference based on the same q = 12
cosine transforms used in the sections above, but also using q = 6, so only lower frequency
variability in the data is used to learn about (A, B, c, d), and with q = 18, so higher frequency
variability is also used. Table 6.a shows results for long-run covariability of GDP, consumption,
investment, labor compensation, and TFP. Table 6.b shows results for selected long-run
relationships involving the other variables. (Results for all the pairs of variables shown in
Table 5 are available in the supplementary appendix.)
The first block of results in Table 6.a is for consumption and GDP. The first row repeats
earlier results using the q = 12 cosine terms to learn about Ω_{j:k} with j = 1 and k = 12. The
other rows are for other values of j, k, and q. The results suggest remarkable stability across
the different values of j, k, and q. Figures 6.c and 6.d provide hints at this stability. They
show the scatter plots of (X_{1:6}, Y_{1:6}) and (X_{7:12}, Y_{7:12}) corresponding to the projections
plotted in panels 6.a and 6.b. The scatter plots corresponding to the different periodicities
are quite similar, and this is reflected in the stability of the results shown in Table 6. This
same stability across j, k, and q is evident for the other pairs of variables in Table 6.a. Looking
closely at Table 6.a, there are subtle differences in the rows. For example, the confidence
intervals for the parameters from Ω_{1:12} tend to be somewhat narrower using q = 18 than
using q = 12, consistent with a modest amount of additional information using a larger value
of q. The same result holds for results for Ω_{1:6} computed using q = 6 and q = 12.
The results summarized in Table 6.b show much of the same stability as Table 6.a, but
there are some notable differences. For example, the point estimates suggest a somewhat
larger Fisher correlation over longer periods (greater than 23 years) than over shorter periods
(11 to 19 years), and the same holds for stock prices and dividends. In both cases however,
the confidence intervals remain wide. And, the puzzling negative correlation between the
unemployment rate and TFP appears to be stronger over the longer-long run than over the
medium-long run.
7 Concluding remarks
This paper has focused on inference about long-run covariability of two time series. Just as
with previous frameworks, such as cointegration analysis, it is natural to consider a gener-
alization to a higher dimensional setting. For example, this would allow one to determine
whether the significant long-run correlation between the unemployment rate and productiv-
ity is robust to including a control for, say, some measure of human capital accumulation.
Many elements of our analysis generalize to n time series in a straightforward manner:
The analogous definition of Ω is equally natural as a second-moment summary of the
covariability of n series, and gives rise to corresponding regression parameters, such as coefficients
from an n − 1 dimensional multiple regression, corresponding residual standard deviations
and population R²s (see footnote 12). Multivariate versions of Ω can also be used for long-run
instrumental variable regressions. As shown in Müller and Watson (forthcoming), the Central Limit
Theorem that reduces the inference question to one about the covariance matrix of a
multivariate normal holds for arbitrary fixed n. The (A, B, c, d) model of persistence naturally
generalizes to an n dimensional system. And, confidence sets for multiple regression
parameters satisfy natural invariance and equivariance constraints, which reduces the number of
effective parameters.
Having said that, our numerical approach for constructing (approximate) minimal-length
confidence sets faces daunting computational challenges in a higher order system: The
quadratic forms that determine the likelihood require O(n²q²) floating point operations.
Worse still, even for n as small as n = 3, the number of parameters in the (A, B, c, d) model
is equal to 21. So even after imposing invariance or equivariance, ensuring coverage requires
an exhaustive search over a high dimensional nuisance parameter space.
At the same time, it would seem to be relatively straightforward to determine Bayes
credible sets also for larger values of n: Under our asymptotic approximation, the (A, B, c, d)
parameters enter the likelihood through the covariance matrix of an nq × 1 multivariate
normal, so with some care, modern posterior samplers should be able to reliably determine
the posterior for any function of interest. Of course, such an approach does not guarantee
frequentist coverage, and the empirical results will depend on the choice of prior in a
non-trivial way. In this regard, our empirical results in the bivariate system show an interesting
pattern: Especially at a lower nominal coverage level, for many realizations, there is no
need to augment the Bayes credible set computed from the bivariate fractional model. This
suggests that the frequentist coverage of the unaltered Bayes intervals is not too far below
the nominal level, so these Bayes sets wouldn’t be too misleading even from a frequentist
perspective (see footnote 13). While this will be difficult to exhaustively check, this pattern might well
generalize also to larger values of n.
12 Müller and Watson (forthcoming) provide the details of inference in the I(0) model.
13 In fact, a calculation analogous to those in Table 2 shows that the 67% Bayes set contains the true value of ρ = 0 at least 64% of the time in the bivariate (A, B, c, d) model, and the 95% Bayes set has coverage of 83%.
References
Baxter, M., and R. G. King (1999): “Measuring Business Cycles: Approximate Band-Pass
Filters for Economic Time Series,” Review of Economics and Statistics, 81(4), 575-593.
Blanchard, O. J. (1997): “The Medium Run,” Brookings Papers on Economic Activity, pp.
89-158.
Campbell, J. Y., and R. J. Shiller (1987): “Cointegration and Tests of Present Value
Models,” Journal of Political Economy, 95, 1062-1088.
Campbell, J. Y., and M. Yogo (2006): “Efficient Tests of Stock Return Predictability,”
Journal of Financial Economics, 81, 27-60.
Cochrane, J. (1994): “Permanent and Transitory Components of GNP and Stock Prices,”
Quarterly Journal of Economics, CIX, 241-266.
Comin, D., and M. Gertler (2006): “Medium Term Business Cycles,” American Economic
Review, 96, 523-551.
Dai, Q., and K. Singleton (2000): “Specification Analysis of Affine Term Structure Models,”
Journal of Finance, 55, 1943-1978.
den Haan, W. J., and A. T. Levin (1997): “A Practitioner’s Guide to Robust Covariance
Matrix Estimation,” in Handbook of Statistics 15, ed. by G. S. Maddala, and C. R. Rao,
pp. 299-342. Elsevier, Amsterdam.
Diebold, F. X., and C. Li (2006): “Forecasting the Term Structure of Government Bond
Yields,” Journal of Econometrics, 130, 337-364.
Elliott, G., U. K. Müller, and M. W. Watson (2015): “Nearly Optimal Tests When
a Nuisance Parameter is Present Under the Null Hypothesis,” Econometrica, 83, 771-811.
Granger, C. W. J. (1981): “Some Properties of Time Series Data and their Use in Econometric
Model Specification,” Journal of Econometrics, 16, 121-130.
Johansen, S. (1988): “Statistical Analysis of Cointegration Vectors,” Journal of Economic
Dynamics and Control, 12, 231-254.
King, R. G., and M. W. Watson (1994): “The Post-War U.S. Phillips Curve: A Revisionist
Econometric History,” Carnegie-Rochester Conference on Public Policy, 41, 157-219.
Klein, L. R., and R. F. Kosobud (1961): “Some Econometrics of Growth: Great Ratios
of Economics,” Quarterly Journal of Economics, LXXV, 173-198.
Lazarus, E., D. J. Lewis, and J. Stock (2016): “HAR Inference: Kernel Choice, Size
Distortions, and Power Loss,” Working Paper, Harvard University.
Lettau, M., and S. Ludvigson (2013): Shocks and Crashes. MIT Press, Cambridge MA.
(2014): “HAC Corrections for Strongly Autocorrelated Time Series,” Journal of
Business and Economic Statistics, 32, 311-322.
Neely, C. J., and D. E. Rapach (2008): “Real Interest Rate Persistence: Evidence and
Implications,” Federal Reserve Bank of St. Louis Review, 90(6), 609-641.
Rogoff, K. (1996): “The Purchasing Power Parity Puzzle,” Journal of Economic Literature,
34, 647-668.
Rose, A. K. (1988): “Is the Real Interest Rate Stable?,” Journal of Finance, 43(5), 1095-1112.
Sargan, J. D. (1964): “Wages and Prices in the UK: A Study in Econometric Methodology,”
in Econometric Analysis for National Planning, ed. by P. E. Hart, G. Mills, and
J. K. Whitaker, pp. 25-26, London. Butterworths.
Staiger, D., J. H. Stock, and M. W. Watson (2001): “Prices, Wages, and the U.S.
NAIRU in the 1990s,” in The Roaring Nineties, ed. by A. B. Krueger, and R. Solow, pp.
3-60, New York. The Russell Sage Foundation.
Taylor, A. M., and M. P. Taylor (2004): “The Purchasing Power Parity Debate,”
Journal of Economic Perspectives, 18, 135-158.
Supplementary Appendix to
Long-Run Covariability
by Ulrich K. Müller and Mark W. Watson
This appendix provides supplemental material. Section 1 discusses the form of the con-
fidence sets; section 2 derives the necessary densities; section 3 discusses the numerically
determined approximate least favorable distributions; the data are described in section 4,
and section 5 includes an expanded version of the paper’s Table 6.
where is the weighting function over which expected length is minimized and the family
of positive measures Λ on Θ, indexed by ∈ (−1 1), are such that Λ ({ : () 6= or
( () ∈ ( )) 1 − }) = 0 and ( () ∈ ( )) ≥ 1 − for all ∈ Θ
14
Here and in the following, we distinguish between random variables and generic real numbers by the
usual upper case / lower case convention. We also implicitly assume the same functional relationship between
the random variables and their corresponding real variables, if appropriate. For example, ( ) on the
right hand side of (A.1) is implicitly thought of as a function of ( ).
: Let the − 2 vectors ∗ and ∗ , and 0∗ , 11 12 22 ∈ R be such that
⎛⎛ ⎞ ⎛ ⎞⎞
1 1 Ã !
⎜⎜ ⎟ ⎜ ⎟⎟ 11 12
( ) = ⎝⎝ 0∗ ⎠ ⎝ 0 ⎠⎠
∗ ∗
0 22
that is, perform the LDU decomposition of the upper 2 × 2 block of the × 2 matrix ( ).
Let ∗ = (0∗ ∗0 ∗0 )0 . Then
( R R 11 −12 )
1[ 22 ≤ ]0 ( ∗ |) ()
0 ( ) = : 2 ≤ R ∗ ≤ 1 − 2
1 ( ∗ |) ()
½ Z Z ¾
∗ 11 − 12
( ) = ∗ ∗ ∗
: ( |)1 ( |) () ≥ 0 ( |)Λ () ∪ 0 ( )
22
where 1∗ ( ∗ |) is the density of ∗ under , ( ∗ |) = [|22 11 || ∗ = ∗ ], 0 ( ∗ |)
is the density of the 2 − 2 vector ( ∗0 (11 () − 12 )22 )0 under , and Λ is a positive
measure on Θ such that Λ ({ : ( () ∈ ( )) 1 − }) = 0 and ( () ∈
( )) ≥ 1 − for all ∈ Θ
:
( RR
)
∗
1[
≤ ] 0 ( |) ()
0 ( ) = : 2 ≤ 22
R ∗ ≤ 1 − 2
1 ( ∗ |) ()
½ Z Z ¾
∗ ∗ ∗ ∗
( ) = : ( |)1 ( |) () ≥ 0 ( |)Λ () ∪ 0 ( )
22
where ( ∗ |) = [|22 || ∗ = ∗ ] under , 0 ( ∗ |) is the density of the 2 − 2 vector
( ∗0 ()|22 |)0 under , and Λ is a positive measure on Θ such that Λ ({ : ( () ∈
( )) 1 − }) = 0 and ( () ∈ ( )) ≥ 1 − for all ∈ Θ
It remains to derive , 1∗ 1∗ , 1∗ , 0 and 0 , and to determine Λ , Λ and Λ .
Z ∞Z ∞
1
1 −1 2 −1
= 1 2 exp[− 12 (2 − 2 + 2 )]
2 0 0
Z ∞ √ Z ∞
1 −1 2 −1 2
= ( 1 − )2 1 ( − )
0 0
for nonnegative integers 1 and 2 , positive reals , and −1 1, with the p.d.f. of
a standard normal distribution. Note that
Z ∞Z ∞ Ã !0 Ã !Ã !
2
1
||1 ||2 exp[− 12 2
]
2 −∞ −∞
= 2() + 2(−)
We initially discuss how to obtain closed-form expressions for (). The resulting explicit
formulae for densities, even after simplification with a computer algebra system, are long
and uninformative, and they are relegated to the replication files.
1
R
(b) With () = Φ() −∞
() ,
Z µ ¶
0 X
( + ) = Φ() (−)− ()
−∞ =0
and 0 () = 1, 1 () = −()Φ() and () = −−1 ()Φ() + ( − 1)−2 ();
√ R∞ √
(c) 2 0 ( 1 + 2 )+1 = 2 2 Γ(1 + 2)(1 + 2 )−2−1 ;
R∞
p
(d) With () = 2 0 ()Φ( √1− 2 ) , 0 () = −arccos(), 1 () = 2(1+),
and
+2 () = ( + 1) () + Γ(1 + 2)22 (1 − 2 )(1+)2
(c) Immediate after substituting 2 → from the definition of the Gamma function.
R∞ √
(d) Define ̃ () = 2 0 ()Φ() , so that () = ̃ ( 1 − 2 ) Note
R∞ R∞
that ̃ (0) = 0 () , and ̃0 () = ̃ () = 2 0 ()()+1 =
√ R∞ √ R
2 0 ( 1 + 2 )+1 . Now ̃ () = ̃ (0) + 0 ̃0 (). The results for 0 () and
1 () now follow by applying (c) and a direct calculation. For the iterative expression, by
integration by parts and 0 () = −()
∙ ¸∞ Z ∞ +1
+1
̃ () = 2()Φ() − 2 (()() − ()Φ())
+1 0 0 +1
µ ¶
1 √ Z ∞ √ +1
= ̃+2 () − 2 2
( 1 + ) ,
+1 0
for some polynomial 0 whose coefficients may be determined explicitly by the formula in
Lemma B.1 (a). Furthermore,
Z 0
( − )1 = ()1 () + Φ()2 ()
−∞
for some polynomials 1 and 2 that may be determined explicitly by the formula in Lemma
B.1 (b). The remaining integral over is of the form
Z ∞ √
2 ( 1 − 2 )[0 () − ()1 () − Φ()2 ()]
0
Z ∞ Z ∞
2 2 2−1 2 1
= (1 − ) () 0 ( √ ) − √ ()2 1 ()
1− 2 2 0
0
Z ∞
−(1 − 2 )2 2−1 ()Φ( √ ) 2
2( √ )
0 1 − 2 1 − 2
which can be determined explicitly by applying Lemma B.1 (c)-(d).
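Parts (c) and (d) of Lemma B.1 can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it writes `g_k(rho)` for the function 2π ∫_0^∞ x^k φ(x) Φ(ρx/√(1−ρ²)) dx of part (d) (the name `g` is a labeling assumption), evaluates it through the recursion, and compares the result with a simple midpoint rule.

```python
import math


def phi(x):
    """Standard normal p.d.f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gamma_integral(k, a):
    """Closed form of sqrt(2 pi) * int_0^inf phi(sqrt(1 + a^2) x) x^(k+1) dx, as in (c)."""
    return (2.0 ** (k / 2.0) * math.gamma(1.0 + k / 2.0)
            * (1.0 + a * a) ** (-k / 2.0 - 1.0))


def g_closed(k, rho):
    """g_k(rho) from the starting values and two-step recursion in (d)."""
    g = [math.pi - math.acos(rho), math.sqrt(math.pi / 2.0) * (1.0 + rho)]
    for j in range(k - 1):  # builds g[j + 2] from g[j]
        g.append((j + 1) * g[j]
                 + math.gamma(1.0 + j / 2.0) * 2.0 ** (j / 2.0)
                 * rho * (1.0 - rho * rho) ** ((1 + j) / 2.0))
    return g[k]


def midpoint(f, lo, hi, n=50_000):
    """Plain midpoint rule on [lo, hi] (illustration only)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def g_numeric(k, rho):
    """Direct quadrature of the integral defining g_k(rho), truncated at x = 12."""
    a = rho / math.sqrt(1.0 - rho * rho)
    return midpoint(lambda x: 2.0 * math.pi * x ** k * phi(x) * Phi(a * x), 0.0, 12.0)


def gamma_integral_numeric(k, a):
    """Direct quadrature of the left-hand side of (c), truncated at x = 12."""
    s = math.sqrt(1.0 + a * a)
    return midpoint(lambda x: math.sqrt(2.0 * math.pi) * phi(s * x) * x ** (k + 1), 0.0, 12.0)
```

The recursion only needs `math.gamma` and `math.acos`, which is why closed forms of this kind are cheap to evaluate inside an outer numerical search.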
In the following, we simply write Σ for the covariance matrix of vec( ), keeping the
dependence on implicit. If not specified otherwise, all integrals are over the entire real line.
Also, denote the four × blocks of the inverse of Σ as
à !
− −
Σ Σ
Σ−1 =
Σ− −
Σ
2.2 Derivation of
√ √
Let = 0 and = 0 . Write for Lebesgue measure on R , and for the
surface measure of a dimensional unit sphere. For ∈ R , let = , where is a point
on the surface of a dimensional unit sphere, and ∈ R+ . By Theorem 2.1.13 of Muirhead
(1982), () = −1
( )1 ( ). We thus can write the joint density of ( )
with respect to × × 1 × 1 as
!0 Ã Ã !
(2)− | det Σ|−12 exp[− 12 Σ−1 ]−1
−1
à !0 à !à !
0 − 0 −
Σ Σ
= (2)− | det Σ|−12 exp[− 12 0 − 0 −
]−1
−1
Σ Σ
( ) = ( † † )
⎛ ⎞
11 12
⎜ ⎟
= ⎝ 11 1∗ 12 1∗ + 22 ⎠
11 ∗ 12 ∗ + 22 ∗
This equation, viewed as a R2 → R2 function of ∗ = ( ∗ ∗ 0∗ 11 12 22 ) has
−1 −2
Jacobian determinant 11 22 , so that the density of ∗ is
∗ (∗ ) = (2)− (det Σ)−12 |11 |−1 |22 |−2 exp[− 12 (vec † )0 Σ−1 (vec † )] (A.2)
with † = († † ), and we are left to integrate out 11 , 12 and 22 . Using vec( † ) =
(2 ⊗ † ) vec(), we have
⎛ ⎞0 ⎞ ⎛
11 11
⎜ ⎟Ã † !0 Ã !Ã !⎜ ⎟
⎜ ⎟0 0 Σ −
Σ−
†
0 ⎜ 0 ⎟
= ⎜ ⎜
⎟
⎟ †
−
− †
⎜
⎜ ⎟
⎟
⎝ 12
⎠ 0 Σ Σ 0 ⎝ 12 ⎠
22 22
⎛ ⎞0 ⎛ †0 − † ⎞⎛ ⎞
11 Σ · · 11
⎜ ⎟ ⎜ †0 − † †0 − † ⎟⎜ ⎟
= ⎝ 22 ⎠ ⎝ Σ Σ · ⎠ ⎝ 22 ⎠
12 †0 Σ−
†
†0 Σ−
†
†0 Σ−
†
12
à !0 à !à !
̂ ̂
=
12 0 02 12
à !
†0 − † †0 − †
Σ Σ
where ̂ = (11 22 )0 , = , 0 = (†0 Σ− † †0 − † 2
Σ ) and 0 =
†0 Σ−
†
†0 − †
Σ
†0 − †
Σ . Furthermore, by “completing the square”,
Z Ã !0 Ã !Ã !
̂ ̂ √
exp[− 12 ] 12 = 20−1 exp[− 12 ̂0 ( − 0 0 )̂]
12 0 02 12
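The "completing the square" step integrates one coordinate out of a Gaussian-type kernel. As a minimal numerical sketch of the generic identity, assume a partitioned matrix with blocks written here as `A` (2 × 2), `b` (2 × 1) and `c**2` (scalar); these names are hypothetical and need not match the notation above. Integrating exp(−q/2) over the scalar coordinate r, with q = v'Av + 2r b'v + c²r², gives √(2π) c⁻¹ exp[−½ v'(A − bb'/c²)v].

```python
import math


def _vAv(v, A):
    """Quadratic form v'Av for a 2-vector v and a 2 x 2 matrix A."""
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))


def quad_form(v, r, A, b, c):
    """Full quadratic form in (v, r) for the matrix [[A, b], [b', c^2]]."""
    bv = b[0] * v[0] + b[1] * v[1]
    return _vAv(v, A) + 2.0 * r * bv + c * c * r * r


def integral_numeric(v, A, b, c, n=20_000):
    """Midpoint-rule integral of exp(-q/2) over r, centred at the mode in r."""
    centre = -(b[0] * v[0] + b[1] * v[1]) / (c * c)
    lo, hi = centre - 12.0 / c, centre + 12.0 / c
    h = (hi - lo) / n
    return h * sum(
        math.exp(-0.5 * quad_form(v, lo + (i + 0.5) * h, A, b, c))
        for i in range(n)
    )


def integral_closed(v, A, b, c):
    """sqrt(2 pi)/c * exp(-v'(A - b b'/c^2) v / 2), from completing the square in r."""
    bv = b[0] * v[0] + b[1] * v[1]
    return (math.sqrt(2.0 * math.pi) / c
            * math.exp(-0.5 * (_vAv(v, A) - bv * bv / (c * c))))
```

The closed form holds for any c > 0, since the integral over r is a one-dimensional Gaussian integral once the square is completed.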
( ∗ |)1∗ ( ∗ |)
Z Z Z
22
= | | ∗ (∗ )12 11 22
11
Z Z Ã !0 Ã !
11 11
= (2)−+12 (det Σ)−12 0−1 |11 |−2 |22 |−1 exp[− 12 ̃ ]11 22
22 22
2.5 Derivation of 1∗
Proceeding analogously to the derivation of 1∗ , we obtain
( ∗ |)1∗ ( ∗ |)
Z Z Z
= |22 | ∗ (∗ )12 11 22
Z Z Ã !0 Ã !
11 11
= (2)−+12 (det Σ)−12 0−1 |11 |−1 |22 |−1 exp[− 12 ̃ ]11 22
22 22
2.7 Derivation of 0
Let ̃ = ()22 , so that = |̃ |. Let ˜0 ( ∗ ̃ |) be the joint density of ( ∗ ̃ ).
Then 0 ( ∗ |) = ˜0 ( ∗ |) + ˜0 ( ∗ − |), so it suffices to derive an expression for
˜0 .
We have ⎛ ⎞ ⎛ ⎞
11 11
⎜ ⎟ ⎜ ⎟
⎝ 22 ⎠ = ⎝ ()̃ ⎠
12 12
This equation, viewed as R3 7→ R3 function of (
Ã11 12 ̃ ), has!Jacobian determinant
11 12
equal to − ()̃ 2 . From (A.2), with = , the joint density of
0 ()̃
( ∗ ̃ ) can thus be written as
Z Z
− −12 − −1
(2) (det Σ) |̃ | | ()| |11 |−1 exp[− 12 (vec † )0 Σ−1 (vec † )]12 11 .
and
⎛ ⎞0 ⎛ ⎞
Z 11 Ã ! 11
⎜ ⎟ ⎜ ⎟
exp[− 12 ⎝ ()̃ ⎠ 0 2
⎝ ()̃ ⎠]12
0
12 12
à !0 à !
√ 11 11
= 20−1 exp[− 12 ( − 0 0 ) ]
()̃ ()̃
so that
˜0 ( ∗ ̃ |) = (2)−+12 (det Σ)−12 | ()|−1 |̃ |− 0−1
Z Ã !0 Ã !
11 11
× |11 |−1 exp[− 12 ̃ ]11
()̃ ()̃
Furthermore, with ̃2 the th element of ̃ , = ̃12 ()(̃ ̃11 ) and ̃ =
̃22 ( ()̃ )2
Z Ã !0 Ã !
−1 11 11
|11 | exp[− 12 ̃ ]11
()̃ ()̃
Z
= |11 |−1 exp[− 12 (̃11
2 2
11 + 2 ̃11 11 + 2 )]11
Z
−
= ̃11 ||−1 exp[− 12 ( 2 + 2 + 2 )]
Z
−
= ̃11 exp[− 2 ( − )] ||−1 exp[− 12 ( + )2 ]
1 2 2
For − 1 even, a closed-form expression for the integral follows from Lemma B.1 (a). For
− 1 odd, note that
Z ∞ Z ∞ Z 0
−1 2 −1 2
1
|| exp[− 2 (+ ) ] = 1
exp[− 2 (+ ) ]−2 −1 exp[− 12 (+ )2 ]
−∞ −∞ −∞
so that a closed-form expression can be deduced from Lemma B.1 (a) and (b).
it is possible that Λ∗ has coverage uniformly on Θ. This is checked by a numerical search
for the maximum of the Θ 7→ [0 1] non-coverage function () = (() ∈ Λ∗ ( )).
To this end, it is particularly convenient to employ an importance sampling approximation
to (), which generates a continuously differentiable approximation, so that standard
gradient search algorithms can be employed. If these searches (using random starting points)
do not yield a maximum above α, a nearly (up to the parameter ε > 0) optimal least favorable
measure Λ∗ has been determined. If the searches yield a θ0 for which the non-coverage probability exceeds α, then this
θ0 is added to the candidate set, and the algorithm iterates.
For the confidence set , we seek a family of measures Λ that, for each ∈ (−1 1),
have support on the subspace of Θ = { : () = }. We discretize this problem into a
finite number of values of . For each given , we apply the above algorithm, except that
the non-coverage function () now only needs to be searched over Θ .
We discuss details in the following subsections.
3.2 Parameterization
Since the algorithm involves optimization over Θ (or Θ ), it is convenient to introduce a
reparameterization so that this search can be conducted in a unit hypercube. The ( )
model is described by 11 parameters. The restriction to invariant sets reduces the number of
effective parameters to 11 − 3 = 8 for and , and the combination of the bivariate scale
invariance and the restriction Θ = { : () = } also makes Θ effectively 8 dimensional.
The effective parameter space can hence be covered by a $[0,1]^8 \mapsto \Theta$ function. In particular,
given a point in $[0,1]^8$, we set
where chol(·) is the Choleski decomposition of a matrix, ( ) is the 2 × 2 rotation matrix
p p
for the angle , and = diag( tr Σ (1 1 ) tr Σ (2 2 )), with Σ (0 0 ) the
× covariance matrix of in the ( ) model when = 2 , = 0, 1 = 0 and
1 = 0 (so Σ (0 0 ) is the covariance matrix in the scalar model employed in Müller
and Watson (2016) without additional white noise). For and , we set = 2 . For
à !
1
, we enforce ∈ Θ by setting = chol . The lower and upper bounds for 1
1
and 2 of 001 and 400 are such that the distribution of ( ) from the resulting Σ ( 0 )s
is nearly indistinguishable from the distribution under the limits → 0 and → ∞
The rationale of this parameterization is that under the equivariance governing and
, it is without loss of generality to consider the case where Ω() = 2 . Now both
= ( ) and = 0, as well as = 0 and = 2 , induce Ω() = 2 with (2)−1
as the factor of proportionality for the local-to-zero spectrum () given in the text. The
parameterization of 0 in terms of ( 5 6 7 ) exhaustively describes all decompositions of
2 = 0 + (2 − 0 ) into two positive semidefinite matrices 0 and (2 − 0 ). Under
the bivariate scale Ã
invariance governing
! , it is without loss of generality to consider the
1 ()
case where Ω() =
, and on Θ , () = .
() 1
X
(()
() |)
Λ ∗ ( )) ≈ −1
( () ∈
Λ ∗ (()
1[ () ∈
() )] (A.3)
=1
(() () )
so that the indicator function does not vary with . In fact, for given Λ ∗ , it suffices to
compute the sum over those where ∈ Λ ∗ (()
() ), no matter the value of ∈ Θ .
For , note that by equivariance, the event () ∈ Λ∗ ( ) is equivalent to =
(11 () − 12 )22 ∈ Λ∗ ( † † ), where † = (1 0∗ ∗0 )0 and † = (1 0 ∗0 )0 . Thus,
given that ( † † ) are functions of ∗ , we have
X
0 (()
∗
() |)
( () ∈
Λ∗ ( )) ≈ −1
1[() Λ∗ (()
∈ † †
() )] (A.4)
=1 (()
∗
() )
for some proposal density , where (()
∗
() ) are i.i.d. draws from . Analogously, for
,
X
0 (()
∗
() |) † †
( () ∈
Λ∗ ( )) ≈ −1
∗
1[() Λ∗ (()
∈ () )] (A.5)
=1
(() () )
1. Select 100 initial points based on a low-discrepancy sequence for ∈ [0 1]8 of
∗
subsection 3.2. For each , draw 200 i.i.d. draws (() () ) from 0 ( ∗ | )
∗
draws (() () ). We use the larger of 2 BFGS searches starting from random
starting values.
∗
(c) Draw 200 i.i.d. draws (() () ) from 0 ( ∗ |0 ), and add 0 to the set of s.
To ensure unbiased importance sampling estimates, we discard the 400 × 200 draws of
∗
(() that were generated in this determination of , and redraw = 250 000 i.i.d.
() )
draws from for the actual computation of non-coverage probabilities via (A.3), (A.4) and
(A.5).
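The importance sampling sums reuse one fixed set of draws from the proposal for every parameter value, so the estimate varies smoothly with the parameter and the indicator needs to be evaluated only once per draw. A toy sketch of that idea follows, under loudly stated assumptions: a scalar N(θ, 1) model, a N(0, 2²) proposal and a fixed event {x > 1.5}. None of these choices come from the paper, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def draw_proposal(n, rng, sd=2.0):
    """i.i.d. draws from the toy proposal N(0, sd^2)."""
    return [rng.gauss(0.0, sd) for _ in range(n)]


def select_event_draws(draws, in_event):
    """Keep only the draws inside the event; the indicator does not depend on theta."""
    return [x for x in draws if in_event(x)]


def is_estimate(theta, event_draws, n_total, proposal_sd=2.0):
    """N^{-1} * sum of f(x_i | theta) / g(x_i) over the draws that lie in the event."""
    total = 0.0
    for x in event_draws:
        total += normal_pdf(x, theta, 1.0) / normal_pdf(x, 0.0, proposal_sd)
    return total / n_total


def exact_probability(theta, cutoff=1.5):
    """P(X > cutoff) for X ~ N(theta, 1), used only to check the toy estimate."""
    return 1.0 - 0.5 * (1.0 + math.erf((cutoff - theta) / math.sqrt(2.0)))
```

Usage: draw once with `draws = draw_proposal(40_000, random.Random(0))`, filter once with `select_event_draws(draws, lambda x: x > 1.5)`, and then call `is_estimate(theta, ...)` for as many values of `theta` as the search requires. Because the draws are held fixed, the estimate is a smooth function of `theta`, which is what makes gradient-based searches practical.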
take on values in the interval [2 1 − 2], respectively. We compute the integral in (A.7)
by numerical quadrature.
Since all three types of confidence sets always contain 0 , the realizations of (() () )
∗
and (() () ) for which (A.6) and (A.7) take on values between [2 1 − 2] never enter
the sums (A.3), (A.4) and (A.5) that approximate the non-rejection probabilities. The effective
number of terms in the sums is thus greatly reduced, which correspondingly facilitates
computations. With this in mind, we modify the determination of the importance sampling
proposal by maximizing the (empirical analogue of the) variance of the importance sampling
weights conditional on the event () ∈ 0 .
3.6 Approximate Least Favorable Distributions and Size Control
The initial candidate set Θ consists of 10 randomly selected points in Θ (or in Θ in the
case of ). For given Θ , Λ∗ is computed by the algorithm described in Elliott, Müller, and
Watson (2015), using a target coverage level of 1 − α + ε. We set ε to 0.3%, 0.6% and 1.0%
for α = 5%, 10% and 33%, respectively. We search for coverage violating points by BFGS
maximizations over the importance sampling approximation to the non-coverage probability
function (), using numerical derivatives and random starting values. We collect up to 10
coverage violating points in up to 100 BFGS searches before augmenting Θ and recomputing
Λ∗ , which is fairly time consuming, especially if Θ consists of many points. Once 100 BFGS
searches with up to 30 iterations each did not yield a violating point, we switch to BFGS
searches with up to 100 iterations. The algorithm stops once 500 consecutive BFGS searches
with up to 100 iterations each do not yield a coverage violating point.
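The alternation between recomputing the measure and searching for coverage violations can be sketched as a simple loop. The code below is a toy illustration only. It replaces BFGS with a crude random-perturbation hill climb, uses a made-up non-coverage surface that rises with distance from the nearest candidate point, and collapses the two-stage stopping rule into a single cap on rounds. All names and numerical settings other than "up to 10 violating points in up to 100 searches" are assumptions.

```python
import random


def local_maximize(f, start, rng, n_iter=30, step=0.1):
    """Crude random-perturbation hill climb on [0, 1]^d (a stand-in for BFGS)."""
    best, best_val = list(start), f(start)
    for _ in range(n_iter):
        cand = [min(1.0, max(0.0, b + rng.uniform(-step, step))) for b in best]
        val = f(cand)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val


def find_violations(noncoverage, alpha, dim, rng, n_searches=100, max_points=10):
    """Collect up to max_points points whose non-coverage exceeds alpha."""
    found = []
    for _ in range(n_searches):
        start = [rng.random() for _ in range(dim)]
        point, val = local_maximize(noncoverage, start, rng)
        if val > alpha:
            found.append(point)
            if len(found) >= max_points:
                break
    return found


def augment_candidates(make_noncoverage, candidates, alpha, dim, rng, max_rounds=50):
    """Alternate between refitting on the candidate set and searching for violations."""
    candidates = [list(c) for c in candidates]
    for _ in range(max_rounds):
        # Stands in for recomputing the measure on the current candidate set.
        noncoverage = make_noncoverage(candidates)
        violations = find_violations(noncoverage, alpha, dim, rng)
        if not violations:
            break  # no violating point found in this round of searches
        candidates.extend(violations)
    return candidates


def toy_noncoverage(candidates):
    """Toy surface: non-coverage grows with squared distance to the nearest candidate."""
    def f(theta):
        d2 = min(sum((t - c) ** 2 for t, c in zip(theta, cand)) for cand in candidates)
        return 0.04 + 0.5 * d2
    return f
```

For instance, `augment_candidates(toy_noncoverage, [[0.5]], 0.05, 1, random.Random(1))` starts from a single candidate and keeps adding points until a full round of searches finds no violation. A search of this kind can only fail to find violations; it does not prove that none exist, which is why the number of consecutive unsuccessful searches is set high.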
4 Data Used
The data and sources are listed in Table A.1.
Extended Table 6: Long-run covariation measures for selected variables with Ωi:j
Inference based on q cosine transforms.
Inv. GDP 12 1:12 0.53 0.29, 0.72 0.02, 0.81 1.24 0.64, 1.79 0.21, 2.21 2.18
12 1:6 0.54 0.27, 0.76 -0.02, 0.91 1.18 0.58, 1.79 -0.46, 2.27 1.54
12 7:12 0.55 0.25, 0.73 0.04, 0.82 1.36 0.70, 2.03 -0.15, 2.88 1.36
18 1:12 0.63 0.42, 0.75 0.29, 0.81 1.60 1.05, 2.16 0.61, 2.59 2.39
6 1:6 0.56 0.21, 0.79 -0.04, 0.91 1.07 0.52, 1.67 -0.14, 2.28 1.13
w×n GDP 12 1:12 0.98 0.96, 0.99 0.94, 0.99 1.28 1.20, 1.36 1.14, 1.42 0.31
12 1:6 0.98 0.96, 0.99 0.93, 0.99 1.28 1.19, 1.36 1.12, 1.42 0.21
12 7:12 0.97 0.95, 0.98 0.89, 0.99 1.29 1.20, 1.39 1.08, 1.54 0.22
18 1:12 0.95 0.91, 0.97 0.87, 0.98 1.21 1.12, 1.31 1.04, 1.38 0.46
6 1:6 0.97 0.92, 0.98 0.83, 0.99 1.26 1.14, 1.38 1.04, 1.48 0.23
GDP TFP 12 1:12 0.78 0.64, 0.89 0.45, 0.95 1.22 0.92, 1.48 0.72, 1.72 0.74
12 1:6 0.79 0.63, 0.94 0.44, 0.97 1.22 0.92, 1.54 0.69, 1.78 0.54
12 7:12 0.75 0.58, 0.86 0.38, 0.91 1.16 0.84, 1.48 0.60, 1.72 0.44
18 1:12 0.76 0.67, 0.87 0.50, 0.94 1.21 0.97, 1.48 0.76, 1.69 0.75
6 1:6 0.71 0.41, 0.94 0.08, 0.97 1.14 0.71, 1.53 0.32, 2.00 0.57
Cons. (Serv.) GDP 12 1:12 0.84 0.71, 0.90 0.57, 0.96 0.83 0.66, 0.99 0.54, 1.24 0.60
12 1:6 0.87 0.74, 0.95 0.58, 0.98 0.88 0.70, 1.18 0.54, 1.45 0.47
12 7:12 0.82 0.67, 0.88 0.45, 0.94 0.70 0.51, 0.87 0.35, 0.99 0.32
18 1:12 0.84 0.72, 0.92 0.63, 0.96 0.81 0.66, 0.97 0.54, 1.25 0.62
6 1:6 0.83 0.60, 0.95 0.34, 0.98 0.93 0.66, 1.22 0.42, 1.51 0.47
Cons. (Dur.) GDP 12 1:12 0.78 0.68, 0.90 0.53, 0.92 1.86 1.46, 2.25 1.15, 2.56 1.52
12 1:6 0.82 0.67, 0.93 0.48, 0.96 1.72 1.33, 2.12 0.93, 2.47 1.05
12 7:12 0.77 0.67, 0.89 0.56, 0.94 2.12 1.63, 2.78 1.28, 3.36 1.02
18 1:12 0.75 0.63, 0.83 0.52, 0.91 1.80 1.41, 2.20 1.11, 2.50 1.82
6 1:6 0.87 0.70, 0.94 0.47, 0.98 1.45 1.13, 1.84 0.83, 2.23 0.67
Inv. (Nonres.) GDP 12 1:12 0.42 0.13, 0.65 -0.05, 0.75 0.97 0.41, 1.46 -0.09, 1.90 2.18
12 1:6 0.47 0.15, 0.72 -0.09, 0.89 0.97 0.41, 1.59 -0.09, 2.02 1.51
12 7:12 0.38 0.07, 0.62 -0.20, 0.73 0.90 0.22, 1.52 -0.59, 2.14 1.47
18 1:12 0.55 0.38, 0.73 0.14, 0.78 1.32 0.81, 1.78 0.36, 2.18 2.23
6 1:6 0.53 0.20, 0.79 -0.05, 0.94 0.91 0.43, 1.40 -0.16, 1.94 1.00
Inv. (Equip.) GDP 12 1:12 0.33 0.00, 0.54 -0.20, 0.70 0.81 0.12, 1.57 -0.41, 2.11 2.75
12 1:6 0.36 -0.00, 0.58 -0.38, 0.83 0.89 0.05, 1.65 -0.64, 2.18 1.99
12 7:12 0.29 -0.00, 0.56 -0.20, 0.70 0.89 0.05, 1.73 -0.87, 2.56 1.76
18 1:12 0.50 0.29, 0.69 0.07, 0.75 1.33 0.61, 2.04 -0.10, 2.53 2.88
6 1:6 0.32 -0.03, 0.64 -0.46, 0.77 0.89 -0.07, 1.94 -1.21, 2.99 1.95
10Y nom. rates 3M nom.rates 12 1:12 0.96 0.92, 0.98 0.89, 0.98 0.92 0.83, 1.06 0.75, 1.15 0.70
12 1:6 0.96 0.91, 0.98 0.88, 0.98 0.93 0.83, 1.09 0.75, 1.18 0.61
12 7:12 0.94 0.88, 0.97 0.76, 0.98 0.87 0.67, 0.97 0.60, 1.05 0.28
18 1:12 0.95 0.91, 0.98 0.83, 0.98 0.87 0.75, 1.02 0.67, 1.12 0.80
6 1:6 0.96 0.92, 0.98 0.83, 0.99 0.97 0.86, 1.09 0.75, 1.20 0.61
10Y real rates 3M real rates 12 1:12 0.95 0.89, 0.96 0.80, 0.98 0.98 0.86, 1.09 0.71, 1.26 0.68
12 1:6 0.95 0.87, 0.97 0.72, 0.98 0.98 0.85, 1.17 0.65, 1.34 0.57
12 7:12 0.95 0.88, 0.97 0.83, 0.98 0.97 0.85, 1.08 0.76, 1.17 0.31
18 1:12 0.89 0.79, 0.94 0.71, 0.98 0.91 0.77, 1.06 0.66, 1.24 0.83
6 1:6 0.94 0.86, 0.97 0.70, 0.98 1.02 0.87, 1.17 0.64, 1.42 0.61
3M rates PCE Infl. 12 1:12 0.47 0.21, 0.83 -0.00, 0.91 0.73 0.34, 1.49 -0.09, 1.91 2.20
12 1:6 0.52 0.24, 0.89 -0.02, 0.95 0.84 0.38, 1.59 0.05, 2.02 1.92
12 7:12 0.23 -0.05, 0.56 -0.38, 0.70 0.36 -0.14, 0.72 -0.64, 1.07 0.88
18 1:12 0.54 0.28, 0.83 0.00, 0.91 0.82 0.47, 1.47 0.03, 1.91 2.20
6 1:6 0.65 0.30, 0.89 0.00, 0.96 1.02 0.55, 1.59 0.13, 2.20 1.98
10Y rates PCE Infl. 12 1:12 0.47 0.23, 0.83 -0.00, 0.91 0.66 0.30, 1.39 -0.06, 1.73 2.12
12 1:6 0.52 0.24, 0.89 -0.02, 0.94 0.73 0.33, 1.43 0.03, 1.86 1.89
12 7:12 0.23 0.03, 0.58 -0.29, 0.71 0.33 0.00, 0.70 -0.36, 0.96 0.79
18 1:12 0.49 0.22, 0.83 0.03, 0.91 0.64 0.29, 1.28 0.08, 1.63 2.03
6 1:6 0.57 0.23, 0.89 -0.03, 0.95 0.88 0.47, 1.43 0.01, 2.02 2.02
CPI Infl. Mon. Supply 12 1:12 0.12 -0.17, 0.54 -0.60, 0.76 0.11 -0.17, 0.52 -0.58, 0.90 2.45
12 1:6 0.16 -0.15, 0.54 -0.60, 0.83 0.13 -0.17, 0.52 -0.63, 1.04 2.15
12 7:12 0.01 -0.35, 0.31 -0.66, 0.47 -0.00 -0.33, 0.30 -0.77, 0.54 0.96
18 1:12 0.13 -0.13, 0.42 -0.36, 0.76 0.11 -0.10, 0.56 -0.28, 0.91 2.24
6 1:6 0.27 -0.08, 0.59 -0.37, 0.86 0.27 -0.08, 0.66 -0.39, 1.09 2.34
Un. Rate TFP 12 1:12 -0.65 -0.75,-0.34 -0.91,-0.13 -1.00 -1.39,-0.62 -1.65,-0.27 1.06
12 1:6 -0.78 -0.92,-0.47 -0.97,-0.22 -1.22 -1.61,-0.79 -1.96,-0.44 0.75
12 7:12 -0.31 -0.56, 0.10 -0.69, 0.38 -0.53 -1.09, 0.21 -1.39, 0.64 0.66
18 1:12 -0.42 -0.76,-0.26 -0.91,-0.08 -0.92 -1.31,-0.49 -1.70,-0.15 1.17
6 1:6 0.27 -0.08, 0.57 -0.38, 0.86 0.23 -0.07, 0.49 -0.36, 0.85 1.34
3M real rates Consumption 12 1:12 0.42 0.08, 0.60 -0.06, 0.80 0.88 0.32, 1.45 -0.11, 2.57 1.84
12 1:6 0.42 0.06, 0.64 -0.38, 0.80 0.88 0.26, 1.57 -1.11, 3.26 1.48
12 7:12 0.43 0.14, 0.68 -0.13, 0.76 0.82 0.32, 1.38 -0.11, 1.76 0.88
18 1:12 0.35 0.03, 0.54 -0.13, 0.80 0.63 0.11, 1.16 -0.28, 2.41 1.76
6 1:6 0.30 -0.05, 0.59 -0.46, 0.86 0.97 -0.19, 2.03 -1.48, 3.43 1.71
Stock returns Consumption 12 1:12 0.40 0.07, 0.60 -0.08, 0.80 2.85 0.98, 4.94 -0.48, 8.48 6.15
12 1:6 0.43 0.08, 0.65 -0.16, 0.86 3.27 0.98, 5.57 -0.90,10.78 4.86
12 7:12 0.35 0.04, 0.62 -0.16, 0.73 2.44 0.56, 4.31 -0.90, 5.77 3.17
18 1:12 0.42 0.16, 0.60 -0.02, 0.76 3.05 1.47, 4.80 0.24, 7.43 6.12
6 1:6 0.41 0.00, 0.70 -0.23, 0.89 4.23 0.62, 7.84 -2.63,11.46 5.65
Stock prices Earnings 12 1:12 0.21 -0.04, 0.42 -0.27, 0.57 0.38 -0.15, 0.92 -0.53, 1.35 7.23
12 1:6 0.29 -0.06, 0.52 -0.29, 0.76 0.49 -0.10, 1.19 -0.64, 2.16 5.23
12 7:12 0.11 -0.15, 0.35 -0.46, 0.50 0.17 -0.26, 0.65 -0.64, 1.14 4.40
18 1:12 0.38 0.09, 0.44 -0.09, 0.56 0.24 0.05, 0.44 -0.10, 0.60 6.62
6 1:6 0.23 -0.08, 0.57 -0.37, 0.73 0.64 -0.22, 1.50 -0.99, 2.18 5.92
Exchange rates Rel. price ind. 12 1:12 0.42 0.13, 0.57 -0.06, 0.72 1.19 0.51, 1.95 0.00, 2.54 6.10
12 1:6 0.42 0.11, 0.56 -0.16, 0.80 0.93 0.26, 1.61 -0.33, 2.29 4.10
12 7:12 0.51 0.14, 0.67 0.03, 0.80 1.87 0.85, 3.43 0.17, 4.70 4.10
18 1:12 0.38 0.16, 0.54 -0.06, 0.63 1.01 0.43, 1.68 -0.02, 2.19 5.57
6 1:6 0.41 0.00, 0.70 -0.27, 0.91 0.54 0.04, 0.95 -0.38, 1.29 2.08
Notes: Results are shown for a subset of the variables listed in Tables 3-5. Results are based on Ωi:j (col. 4 lists i and j) and on the sample information in
q cosine transforms (col. 3 shows q).
Additional References
Dhrymes, P. J. (2005): “Moments of Truncated (Normal) Distributions,” Working
Paper, Columbia University.
Muirhead, R. J. (1982): Aspects of Multivariate Statistical Theory. Wiley.
Fernald, J. (2014): “A Quarterly, Utilization-Adjusted Series on Total Factor Productivity,”
FRBSF Working Paper 2012-9.
Friedman, M., and A. J. Schwartz (1963): A Monetary History of the United States,
1867-1960. Princeton University Press, Princeton NJ.