Sources of Autocorrelation

12.1 THE NATURE OF THE PROBLEM


The term autocorrelation may be defined as "correlation between members of a
series of observations ordered in time [as in time series data] or space [as in
cross-sectional data]." In the regression context, the classical linear regression
model assumes that such autocorrelation does not exist in the disturbances 𝑢𝑖 .
Symbolically,

𝐸(𝑢𝑖 𝑢𝑗 ) = 0,  𝑖 ≠ 𝑗 (3.2.5)

Sources of Autocorrelation:

The natural question is: Why does serial correlation occur? There are several
reasons, some of which are as follows:
Inertia. A salient feature of most economic time series is inertia, or sluggishness.
As is well known, time series such as GNP, price indexes, production, employment,
and unemployment exhibit (business) cycles.
Starting at the bottom of the recession, when economic recovery starts, most of these
series start moving upward. In this upswing, the value of a series at one point in time
is greater than its previous value. Thus there is a "momentum" built into them, and
it continues until something happens (e.g., increase in interest rate or taxes or both)
to slow them down. Therefore, in regressions involving time series data, successive
observations are likely to be interdependent.
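
To make the idea concrete, the following minimal sketch (not from the text; the persistence parameter 0.9 and the sample size are arbitrary assumptions) simulates an inertial series as a first-order autoregressive process and measures how strongly each value is tied to the previous one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
series = np.zeros(n)
for t in range(1, n):
    # today's value carries over 90% of yesterday's value plus a fresh shock
    series[t] = 0.9 * series[t - 1] + rng.normal()

# lag-1 sample autocorrelation: close to 0.9, far from zero
r1 = np.corrcoef(series[:-1], series[1:])[0, 1]
print(f"lag-1 autocorrelation: {r1:.2f}")
```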

Specification Bias: Excluded Variables Case. In empirical analysis the researcher
often starts with a plausible regression model that may not be the most "perfect" one.
After the regression analysis, the researcher does the postmortem to find out whether
the results accord with a priori expectations. If not, surgery is begun. For example,
the researcher may plot the residuals 𝑢̂𝑖 , obtained from the fitted regression and may
observe patterns such as those shown in Figure 12.1a to d. These residuals (which
are proxies for 𝑢𝑖 ) may suggest that some variables that were originally candidates
but were not included in the model for a variety of reasons should be included. This
is the case of excluded variable specification bias. Often the inclusion of such
variables removes the correlation pattern observed among the residuals. For
example, suppose we have the following demand model:

𝑌𝑡 = 𝛽1 + 𝛽2 𝑋2𝑡 + 𝛽3 𝑋3𝑡 + 𝛽4 𝑋4𝑡 + 𝑢𝑡 (12.1.2)

where 𝑌 = quantity of beef demanded, 𝑋2 = price of beef, 𝑋3 = consumer income, 𝑋4 =
price of pork, and t = time. However, for some reason we run the following
regression:

𝑌𝑡 = 𝛽1 + 𝛽2 𝑋2𝑡 + 𝛽3 𝑋3𝑡 + 𝑣𝑡 (12.1.3)

Now if (12.1.2) is the "correct" model or the "truth" or true relation, running (12.1.3)
is tantamount to letting 𝑣𝑡 = 𝛽4 𝑋4𝑡 + 𝑢𝑡 . And to the extent the price of pork affects
the consumption of beef, the error or disturbance term v will reflect a systematic
pattern, thus creating (false) autocorrelation. A simple test of this would be to run
both (12.1.2) and (12.1.3) and see whether autocorrelation, if any, observed in model
(12.1.3) disappears when (12.1.2) is run. The actual mechanics of detecting
autocorrelation will be discussed in Section 12.6 where we will show that a plot of
the residuals from regressions (12.1.2) and (12.1.3) will often shed considerable light
on serial correlation.
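
The "simple test" just described can be sketched in a few lines. The data below are simulated and all coefficient values are assumptions chosen for illustration, not estimates from the beef-demand example; the point is only the mechanics of running both regressions and comparing a measure of residual autocorrelation (here the Durbin-Watson statistic, which is near 2 when there is no first-order autocorrelation):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)                 # stand-in for price of beef
x3 = rng.normal(size=n)                 # stand-in for consumer income
x4 = np.zeros(n)                        # stand-in for price of pork, made inertial
for t in range(1, n):
    x4[t] = 0.95 * x4[t - 1] + rng.normal()
u = rng.normal(scale=0.5, size=n)       # well-behaved disturbance
y = 1.0 + 0.8 * x2 + 0.5 * x3 + 1.2 * x4 + u   # generated from the "true" model (12.1.2)

full = sm.OLS(y, sm.add_constant(np.column_stack([x2, x3, x4]))).fit()
short = sm.OLS(y, sm.add_constant(np.column_stack([x2, x3]))).fit()   # omits x4, as in (12.1.3)

print("Durbin-Watson, full model :", round(durbin_watson(full.resid), 2))   # near 2
print("Durbin-Watson, short model:", round(durbin_watson(short.resid), 2))  # well below 2
```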

Specification Bias: Incorrect Functional Form. Suppose the "true" or correct
model in a cost-output study is as follows:

Marginal cost𝑖 = 𝛽1 + 𝛽2 output𝑖 + 𝛽3 output𝑖² + 𝑢𝑖 (12.1.4)

but we fit the following model:

Marginal cost𝑖 = 𝛼1 + 𝛼2 output𝑖 + 𝑣𝑖 (12.1.5)

The marginal cost curve corresponding to the "true" model is shown in Figure 12.2
along with the "incorrect" linear cost curve.

As Figure 12.2 shows, between points A and B the linear marginal cost curve
will consistently overestimate the true marginal cost, whereas beyond these points it
will consistently underestimate the true marginal cost. This result is to be expected,
because the disturbance term 𝑣𝑖 is, in fact, equal to 𝛽3 output𝑖² + 𝑢𝑖 , and hence will
catch the systematic effect of the output² term on marginal cost. In this case 𝑣𝑖 will
reflect autocorrelation because of the use of an incorrect functional form. In Chapter
13 we will consider several methods of detecting specification bias.
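
A brief sketch of this case (coefficients invented for illustration) shows the telltale pattern: when a straight line is fitted to data generated from a quadratic cost function, the residuals do not scatter randomly but come in long runs of one sign, then the other:

```python
import numpy as np

rng = np.random.default_rng(2)
output = np.linspace(1, 10, 60)
true_mc = 20 - 4 * output + 0.5 * output**2        # assumed "true" quadratic marginal cost
mc = true_mc + rng.normal(scale=0.5, size=output.size)

# fit the misspecified linear model (12.1.5) by least squares
slope, intercept = np.polyfit(output, mc, deg=1)
resid = mc - (intercept + slope * output)

# long runs of + and - along the output axis reveal the wrong functional form
print("".join("+" if r > 0 else "-" for r in resid))
```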

Cobweb Phenomenon. The supply of many agricultural commodities reflects the
so-called cobweb phenomenon, where supply reacts to price with a lag of one time
period because supply decisions take time to implement (the gestation period). Thus,
at the beginning of this year's planting of crops, farmers are influenced by the price
prevailing last year, so that their supply function is

Supply𝑡 = 𝛽1 + 𝛽2 𝑃𝑡−1 + 𝑢𝑡 (12.1.6)

Suppose at the end of period t, price 𝑃𝑡 turns out to be lower than 𝑃𝑡−1 . Therefore, in
period t + 1 farmers may very well decide to produce less than they did in period t.
Obviously, in this situation the disturbances 𝑢𝑡 are not expected to be random,
because if the farmers overproduce in year t, they are likely to reduce their
production in t + 1, and so on, leading to a cobweb pattern.
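
A toy cobweb simulation (all parameter values are assumptions for illustration) makes the alternating pattern visible: when farmers plant on the basis of last year's price and the market then clears, output overshoots and undershoots its equilibrium level in alternate years, so the deviations that end up in 𝑢𝑡 are anything but random:

```python
import numpy as np

a, b = 100.0, 1.0          # assumed demand curve:  P_t = a - b * Q_t
c, d = 10.0, 0.8           # assumed supply curve:  Q_t = c + d * P_{t-1}
periods = 12
price = np.zeros(periods)
qty = np.zeros(periods)
price[0] = 60.0            # arbitrary starting price

for t in range(1, periods):
    qty[t] = c + d * price[t - 1]     # farmers decide output using last year's price
    price[t] = a - b * qty[t]         # this year's price clears the market

q_star = (c + a * d) / (1 + b * d)    # equilibrium quantity
print(np.round(qty[1:] - q_star, 1))  # deviations alternate in sign: +, -, +, - ...
```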

Lags. In a time series regression of consumption expenditure on income, it is not
uncommon to find that the consumption expenditure in the current period depends,
among other things, on the consumption expenditure of the previous period. That is,

Consumption𝑡 = 𝛽1 + 𝛽2 income𝑡 + 𝛽3 consumption𝑡−1 + 𝑢𝑡 (12.1.7)

A regression such as (12.1.7) is known as autoregression because one of the
explanatory variables is the lagged value of the dependent variable. (We shall study
such models in Chapter 17.) The rationale for a model such as (12.1.7) is simple.
Consumers do not change their consumption habits readily for psychological,
technological, or institutional reasons. Now if we neglect the lagged term in (12.1.7),
the resulting error term will reflect a systematic pattern due to the influence of lagged
consumption on current consumption.
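
The following hedged sketch (all numbers are invented) generates data from an autoregression like (12.1.7) and then fits the static regression that neglects the lagged term; the Durbin-Watson statistic of the misspecified regression falls clearly below 2, signalling the positive autocorrelation that the omitted lag has pushed into the error:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n = 120
income = 100 + np.cumsum(rng.normal(1.0, 2.0, size=n))   # slowly trending income
cons = np.zeros(n)
cons[0] = 80.0
for t in range(1, n):
    # consumption depends on current income and on last period's consumption
    cons[t] = 10 + 0.3 * income[t] + 0.7 * cons[t - 1] + rng.normal()

# misspecified static regression: consumption on income only, lag omitted
static = sm.OLS(cons[1:], sm.add_constant(income[1:])).fit()
print("Durbin-Watson with the lag omitted:", round(durbin_watson(static.resid), 2))
```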

"Manipulation" of Data. In empirical analysis, the raw data are often


"manipulated." For example, in time series regressions involving quarterly data,
such data are usually derived from the monthly data by simply adding three monthly
observations and dividing the sum by 3. This averaging introduces smoothness into
the data by dampening the fluctuations in the monthly data. Therefore, the graph
plotting the quarterly data looks much smoother than the monthly data, and this
smoothness may itself lend to a systematic pattern in the disturbances, thereby
introducing autocorrelation. Another source of manipulation is interpolation or
extrapolation of data. For example, the Census of Population is conducted every 10
years in this country, the last being in 2000 and the one before that in 1990. Now if
there is a need to obtain data for some year within the intercensus period 1990-2000,
the common practice is to interpolate on the basis of some ad hoc assumptions. All
such data "massaging" techniques might impose upon the data a systematic pattern
that might not exist in the original data.
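
The effect of interpolation is easy to demonstrate. In the sketch below (purely illustrative, with invented numbers), values observed only every ten years are linearly interpolated to an annual series; the original decennial observations are unrelated to one another, yet the year-to-year changes in the interpolated series are strongly correlated with their own past, i.e., a systematic pattern has been imposed on the data:

```python
import numpy as np

rng = np.random.default_rng(4)
census_years = np.arange(1950, 2001, 10)              # observed only at census dates
census_values = rng.normal(size=census_years.size)    # unrelated decennial observations

annual_years = np.arange(1950, 2001)
annual_values = np.interp(annual_years, census_years, census_values)  # linear interpolation

changes = np.diff(annual_values)                       # year-to-year changes
r1 = np.corrcoef(changes[:-1], changes[1:])[0, 1]
print(f"lag-1 autocorrelation of annual changes: {r1:.2f}")   # strongly positive
```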

Data Transformation. As an example of this, consider the following model:

𝑌𝑡 = 𝛽1 + 𝛽2 𝑋𝑡 + 𝑢𝑡 (12.1.8)

where, say, Y = consumption expenditure and X = income. Since (12.1.8) holds true
at every time period, it holds true also in the previous time period, (t − 1). So, we
can write (12.1.8) as

𝑌𝑡−1 = 𝛽1 + 𝛽2 𝑋𝑡−1 + 𝑢𝑡−1 (12.1.9)

𝑌𝑡−1 , 𝑋𝑡−1 and 𝑢𝑡−1 are known as the lagged values of Y, X, and 𝑢𝑡 respectively,
here lagged by one period. We will see the importance of the lagged values later in
the chapter as well as in several places in the text.
Now if we subtract (12.1.9) from (12.1.8), we obtain

∆𝑌𝑡 = 𝛽2 ∆𝑋𝑡 + ∆𝑢𝑡 (12.1.10)

where ∆, known as the first difference operator, tells us to take successive
differences of the variables in question. Thus, ∆𝑌𝑡 = (𝑌𝑡 − 𝑌𝑡−1 ), ∆𝑋𝑡 = (𝑋𝑡 − 𝑋𝑡−1 ),
and ∆𝑢𝑡 = (𝑢𝑡 − 𝑢𝑡−1 ). For empirical purposes, we write (12.1.10) as

∆𝑌𝑡 = 𝛽2 ∆𝑋𝑡 + 𝑣𝑡 (12.1.11)

where 𝑣𝑡 = ∆𝑢𝑡 = (𝑢𝑡 − 𝑢𝑡−1 ).


Equation (12.1.9) is known as the level form and Eq. (12.1.10) is known as
the (first) difference form. Both forms are often used in empirical analysis. For
example, if in (12.1.9) Y and X represent the logarithms of consumption expenditure
and income, then in (12.1.10) ∆Y and ∆X will represent changes in the logs of
consumption expenditure and income. But as we know, a change in the log of a
variable is a relative change, or a percentage change, if the former is multiplied by
100. So, instead of studying relationships between variables in the level form, we
may be interested in their relationships in the growth form.
Now if the error term in (12.1.8) satisfies the standard OLS assumptions,
particularly the assumption of no autocorrelation, it can be shown that the error term
𝑣𝑡 , in (12.1.11) is autocorrelated. (The proof is given in Appendix 12A, Section
12A.1.) It may be noted here that models like (12.1.11) are known as dynamic
regression models, that is, models involving lagged regressands. We will study such
models in depth in Chapter 17.

The point of the preceding example is that sometimes autocorrelation may be
induced as a result of transforming the original model.
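
A quick numerical check (a minimal sketch, not the formal proof in Appendix 12A) of the claim that 𝑣𝑡 in (12.1.11) is autocorrelated even when 𝑢𝑡 is well behaved: with pure white-noise disturbances, the differenced error 𝑣𝑡 = 𝑢𝑡 − 𝑢𝑡−1 has a lag-1 correlation of about −0.5.

```python
import numpy as np

rng = np.random.default_rng(5)
u = rng.normal(size=10_000)     # classical, serially independent disturbances
v = np.diff(u)                  # v_t = u_t - u_{t-1}, as in (12.1.11)

r1 = np.corrcoef(v[:-1], v[1:])[0, 1]
print(f"lag-1 autocorrelation of v_t: {r1:.2f}")   # about -0.5
```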

Nonstationarity. We mentioned in Chapter 1 that, while dealing with time series
data, we may have to find out if a given time series is stationary. Although we will
discuss the topic of nonstationary time series more thoroughly in the chapters on
time series econometrics in Part V of the text, loosely speaking, a time series is
stationary if its characteristics (e.g., mean, variance, and covariance) are time
invariant; that is, they do not change over time. If that is not the case, we have a
nonstationary time series.
As we will discuss in Part V, in a regression model such as (12.1.8), it is quite
possible that both Y and X are nonstationary and therefore the error u is also
nonstationary. In that case, the error term will exhibit autocorrelation.
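
As a hedged illustration of this case (an assumed setup, not an example from the text), regressing one independent random walk on another produces residuals whose Durbin-Watson statistic is far below 2, the classic symptom of strongly autocorrelated, nonstationary errors:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(6)
n = 500
x = np.cumsum(rng.normal(size=n))   # a random walk (nonstationary)
y = np.cumsum(rng.normal(size=n))   # an independent random walk

fit = sm.OLS(y, sm.add_constant(x)).fit()
print("Durbin-Watson:", round(durbin_watson(fit.resid), 2))   # far below 2
```
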
In summary, then, there are a variety of reasons why the error term in a regression
model may be autocorrelated. In the rest of the chapter we investigate in some detail
the problems posed by autocorrelation and what can be done about it.

12.4 CONSEQUENCES OF USING OLS IN THE PRESENCE OF AUTOCORRELATION

As in the case of heteroscedasticity, in the presence of autocorrelation the OLS
estimators are still linear unbiased as well as consistent and asymptotically
normally distributed, but they are no longer efficient (i.e., minimum variance).
What then happens to our usual hypothesis testing procedures if we continue
to use the OLS estimators? Again, as in the case of heteroscedasticity, we
distinguish two cases. For pedagogical purposes we still continue to work with
the two-variable model, although the following discussion can be extended to
multiple regressions without much trouble.

OLS Estimation Allowing for Autocorrelation

As noted, 𝛽̂2 is not BLUE, and even if we use var(𝛽̂2 )AR1, the confidence
intervals derived from there are likely to be wider than those based on the GLS
procedure. As Kmenta shows, this result is likely to be the case even if the
sample size increases indefinitely. That is, 𝛽̂2 is not asymptotically efficient.
The implication of this finding for hypothesis testing is clear. We are likely to
declare a coefficient statistically insignificant (i.e., not different from zero)
even though in fact (i.e., based on the correct GLS procedure) it may be. This
difference can be seen clearly from Figure 12.4. In this figure we show the
95% OLS [AR(1)] and GLS confidence intervals assuming that true 𝛽2 = 0.
Consider a particular estimate of 𝛽2 , say, 𝑏2 . Since 𝑏2 lies in the OLS
confidence interval, we could accept the hypothesis that true 𝛽2 is zero with
95% confidence. But if we were to use the (correct) GLS confidence interval,
we could reject the null hypothesis that true 𝛽2 is zero, for 𝑏2 lies in the region
of rejection.
The message is: To establish confidence intervals and to test hypotheses,
one should use GLS and not OLS even though the estimators derived
from the latter are unbiased and consistent. (However, see Section 12.11
later.)
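
As a sketch of what "using GLS" can look like in practice, one can compare OLS with feasible GLS on the same data. The data below are simulated and the AR(1) coefficient 0.8 is an assumption; GLSAR is statsmodels' feasible GLS estimator for AR(p) errors and is used here for illustration rather than as the text's own procedure. The GLS output is the one whose standard errors and t tests are appropriate when the disturbances follow AR(1):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 200
x = np.zeros(n)
u = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()   # autocorrelated regressor
    u[t] = 0.8 * u[t - 1] + rng.normal()   # AR(1) disturbances
y = 1.0 + 0.5 * x + u

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                                  # ignores the AR(1) structure
gls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)     # feasible GLS for AR(1) errors

print("OLS slope and s.e.:", round(ols.params[1], 3), round(ols.bse[1], 3))
print("GLS slope and s.e.:", round(gls.params[1], 3), round(gls.bse[1], 3))
```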

OLS Estimation Disregarding Autocorrelation

The situation is potentially very serious if we not only use 𝛽̂2 but also continue to
use var(𝛽̂2 ) = 𝜎²⁄∑ 𝑥𝑡² , which completely disregards the problem of
autocorrelation; that is, we mistakenly believe that the usual assumptions of the
classical model hold true. Errors will arise for the following reasons:

1. The residual variance 𝜎̂² = ∑ 𝑢̂𝑡²⁄(𝑛 − 2) is likely to underestimate the true 𝜎².
2. As a result, we are likely to overestimate 𝑅².
3. Even if 𝜎² is not underestimated, var(𝛽̂2 ) may underestimate var(𝛽̂2 )AR1 [Eq.
(12.2.8)], its variance under (first-order) autocorrelation, even though the latter is
inefficient compared to var(𝛽̂2 )GLS.

4. Therefore, the usual t and F tests of significance are no longer valid, and if applied,
are likely to give seriously misleading conclusions about the statistical significance
of the estimated regression coefficients.
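
The following assumed simulation (not from the text) illustrates points 1 to 4 numerically: with positively autocorrelated disturbances and regressors, the classical standard-error formula understates the sampling uncertainty of the slope, so its t statistic is inflated. An autocorrelation-robust (Newey-West/HAC) standard error, one widely used correction, comes out noticeably larger on the same data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 300
x = np.zeros(n)
u = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 1.0 + 0.5 * x + u

X = sm.add_constant(x)
classical = sm.OLS(y, X).fit()                                        # usual OLS formulas
robust = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 8})    # Newey-West

print("slope s.e., classical  :", round(classical.bse[1], 3))
print("slope s.e., Newey-West :", round(robust.bse[1], 3))            # typically larger
```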

12.6 DETECTING AUTOCORRELATION

I. Graphical Method
Recall that the assumption of nonautocorrelation of the classical model relates
to the population disturbances 𝑢𝑡 , which are not directly observable. What we
have instead are their proxies, the residuals 𝑢̂𝑡 , which can be obtained by the
usual OLS procedure. Although the 𝑢̂𝑡 are not the same thing as 𝑢𝑡 , very
often a visual examination of the 𝑢̂′𝑠 gives us some clues about the likely
presence of autocorrelation in the u's. Actually, a visual examination of 𝑢̂𝑡 or
𝑢̂𝑡2 can provide useful information not only about autocorrelation but also
about heteroscedasticity (as we saw in the preceding chapter), model
inadequacy, or specification bias, as we shall see in the next chapter. As one
author notes:

The importance of producing and analyzing plots of [residuals] as a standard
part of statistical analysis cannot be overemphasized. Besides occasionally
providing an easy to understand summary of a complex problem, they allow
the simultaneous examination of the data as an aggregate while clearly
displaying the behavior of individual cases.

There are various ways of examining the residuals. We can simply plot them
against time, the time sequence plot, as we have done in Figure 12.8, which
shows the residuals obtained from the wages-productivity regression (12.5.1).
The values of these residuals are given in Table 12.5 along with some other
data.
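
A minimal, self-contained sketch of such a time sequence plot is given below. The data are simulated with an AR(1) disturbance purely for illustration; in practice the residuals would come from the actual wages-productivity regression:

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(9)
n = 40
x = np.linspace(1, 10, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal(scale=0.5)   # positively autocorrelated errors
y = 2.0 + 0.8 * x + u

resid = sm.OLS(y, sm.add_constant(x)).fit().resid   # proxies for the unobservable u_t

plt.plot(range(1, n + 1), resid, marker="o")
plt.axhline(0, linewidth=0.8)
plt.xlabel("time")
plt.ylabel("OLS residual")
plt.title("Time sequence plot of residuals")
plt.show()
```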

II. The Runs Test
If we carefully examine Figure 12.8, we notice a peculiar feature: Initially, we
have several residuals that are negative, then there is a series of positive
residuals, and then there are several residuals that are negative. If these
residuals were purely random, could we observe such a pattern? Intuitively, it
seems unlikely. This intuition can be checked by the so-called runs test,
sometimes also known as the Geary test, a nonparametric test.
To explain the runs test, let us simply note down the signs (+ or -) of the
residuals obtained from the wages-productivity regression, which are given in
the first column of Table 12.5.

(---------)(+++++++++++++++++++++)(----------) (12.6.1)

Thus there are 9 negative residuals, followed by 21 positive residuals, followed by
10 negative residuals, for a total of 40 observations.

We now define a run as an uninterrupted sequence of one symbol or attribute,
such as + or −. We further define the length of a run as the number of elements
in it. In the sequence shown in (12.6.1), there are 3 runs: a run of 9 minuses (i.e., of
length 9), a run of 21 pluses (i.e., of length 21), and a run of 10 minuses (i.e., of
length 10). For a better visual effect, we have presented the various runs in
parentheses.
By examining how runs behave in a strictly random sequence of observations,
one can derive a test of randomness of runs. We ask this question: Are the 3 runs
observed in our illustrative example consisting of 40 observations too many or too
few compared with the number of runs expected in a strictly random sequence of 40
observations? If there are too many runs, it would mean that in our example the
residuals change sign frequently, thus indicating negative serial correlation (cf.
Figure 12.3b). Similarly, if there are too few runs, they may suggest positive
autocorrelation, as in Figure 12.3a. A priori, then, Figure 12.8 would indicate
positive correlation in the residuals.

Now let

N = total number of observations = N₁ + N₂
N₁ = number of + symbols (i.e., + residuals)
N₂ = number of − symbols (i.e., − residuals)
R = number of runs

Then under the null hypothesis that the successive outcomes (here, residuals) are
independent, and assuming that N₁ > 10 and N₂ > 10, the number of runs is
(asymptotically) normally distributed with

Mean: E(R) = 2N₁N₂/N + 1
                                                                  (12.6.2)
Variance: σ_R² = 2N₁N₂(2N₁N₂ − N) / [N²(N − 1)]

Note: N = N₁ + N₂.
If the null hypothesis of randomness is sustainable, following the properties of the
normal distribution, we should expect that

Prob[E(R) − 1.96σ_R ≤ R ≤ E(R) + 1.96σ_R] = 0.95 (12.6.3)

That is, the probability is 95 percent that the preceding interval will include R.
Therefore we have this rule: do not reject the null hypothesis that the residuals are
random with 95% confidence if R, the number of runs, lies within the preceding
confidence interval; reject it if R lies outside these limits.
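
The rule can be applied mechanically. The sketch below (an illustrative implementation, not code from the text) reproduces the computation for the sign pattern in (12.6.1), with N₁ = 21, N₂ = 19, N = 40, and R = 3:

```python
import math

signs = "-" * 9 + "+" * 21 + "-" * 10       # residual signs from (12.6.1)

n1 = signs.count("+")                       # N1 = 21
n2 = signs.count("-")                       # N2 = 19
n = n1 + n2                                 # N  = 40
runs = 1 + sum(signs[i] != signs[i - 1] for i in range(1, len(signs)))   # R = 3

mean_r = 2 * n1 * n2 / n + 1                                    # E(R), Eq. (12.6.2)
var_r = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))      # sigma_R squared
se_r = math.sqrt(var_r)

lower, upper = mean_r - 1.96 * se_r, mean_r + 1.96 * se_r       # interval of Eq. (12.6.3)
print(f"R = {runs}, E(R) = {mean_r:.2f}, 95% interval = [{lower:.2f}, {upper:.2f}]")
print("reject randomness" if runs < lower or runs > upper else "do not reject randomness")
```

Here R = 3 falls far below the expected number of runs of about 21, so the hypothesis of randomness is rejected, consistent with the positive serial correlation suggested by Figure 12.8.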

