12 Autocorrelation
Definition:
The classical linear regression model assumes that the disturbance term relating to any one observation is not influenced by the disturbance term relating to any other observation, that is, $E(\mu_i \mu_j) = 0$ for $i \neq j$. Autocorrelation (or serial correlation) is the violation of this assumption: $E(\mu_i \mu_j) \neq 0$ for $i \neq j$.
Reasons for autocorrelation:
Inertia:
A salient feature of most economic time series is inertia, or sluggishness. As is well known, time series such as GNP, price indexes, production, employment, and unemployment exhibit cycles. Starting at the bottom of a recession, when economic recovery begins, most of these series start moving upward. In this upswing, the value of a series at one point in time is greater than its previous value. Thus there is a momentum built into them, and it continues until something happens to slow it down. Therefore, in regressions involving time series data, successive observations are likely to be interdependent.
Specification bias (excluded variables case):
Suppose the true demand-for-beef model is
$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \beta_4 X_{4t} + \mu_t \quad \ldots (1)$$
but we omit the price-of-pork variable $X_{4t}$ and instead run
$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \vartheta_t \quad \ldots (2)$$
Now if eq. (1) is the correct model, or true relation, running (2) is tantamount to letting $\vartheta_t = \beta_4 X_{4t} + \mu_t$. And to the extent that the price of pork affects the consumption of beef, the error or disturbance term $\vartheta_t$ will reflect a systematic pattern, thus creating autocorrelation.
Specification bias (incorrect functional form):
[Figure: Marginal cost of production plotted against output, showing the true (nonlinear) marginal cost model and the incorrect linear model, which intersect at points A and B.]
The marginal cost curve corresponding to the true model is shown in the figure above along with the incorrect linear cost curve. The figure shows that between points A and B the linear marginal cost curve will consistently overestimate the true marginal cost, whereas beyond these points it will consistently underestimate it. This result is to be expected, because the disturbance term $\vartheta_i$ is, in fact, equal to $\beta_2\,\text{output}_i^2 + \mu_i$ and hence will catch the systematic effect of the squared output term on marginal cost. In this case, $\vartheta_i$ will reflect autocorrelation because of the use of an incorrect functional form.
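To see this mechanically, here is a minimal simulation sketch (Python with NumPy; all numbers are made up for illustration): we generate marginal cost data from a quadratic relation, fit the incorrect straight line by OLS, and print the signs of the residuals, which switch in long systematic runs rather than randomly.

```python
import numpy as np

rng = np.random.default_rng(0)

# True (quadratic) marginal cost: MC = 2 + 0.05*output^2 + noise
output = np.linspace(1, 20, 40)
mc = 2 + 0.05 * output**2 + rng.normal(0, 0.5, output.size)

# Fit the incorrect linear model MC = a1 + a2*output by OLS
X = np.column_stack([np.ones_like(output), output])
a, *_ = np.linalg.lstsq(X, mc, rcond=None)
resid = mc - X @ a

# Signs of successive residuals: long runs of + and - reveal
# the systematic pattern left behind by the omitted squared term
print("".join("+" if e > 0 else "-" for e in resid))
```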
Cobweb Phenomenon:
The supply of many agricultural commodities reflects the so-called cobweb phenomenon,
where supply reacts to the price with a lag of one time period because supply decisions take
time to implement. Thus, at the beginning of this year’s planting of crops, farmers are
influenced by the price prevailing last year, so that their supply function is
$$\text{Supply}_t = \beta_1 + \beta_2 P_{t-1} + \mu_t$$
Suppose at the end of period t, price $P_t$ turns out to be lower than $P_{t-1}$. Therefore, in period t+1 farmers may very well decide to produce less than they did in period t. Obviously, in this situation the disturbances $\mu_t$ are not expected to be random, because if farmers overproduce in year t, they are likely to reduce their production in t+1, and so on, leading to a cobweb pattern.
Lags:
In a time series regression of consumption expenditure on income, it is not uncommon to find that consumption expenditure in the current period depends, among other things, on the consumption expenditure of the previous period. That is,
$$\text{Consumption}_t = \beta_1 + \beta_2\,\text{Income}_t + \beta_3\,\text{Consumption}_{t-1} + \mu_t$$
Consumers do not change their consumption habits readily for psychological, technological, or institutional reasons. Now if we neglect the lagged term, the resulting error term will reflect a systematic pattern due to the influence of lagged consumption on current consumption.
Nonstationarity:
A time series is stationary if its characteristics (e.g., mean, variance, covariance) are time invariant; that is, they do not change over time. If that is not the case, we have a nonstationary time series. In a two-variable regression model, it is quite possible that both Y and X are nonstationary and therefore the error term is also nonstationary. In that case, the error term will exhibit autocorrelation.
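A quick way to see this is to regress one independent random walk on another. The sketch below (Python/NumPy, purely illustrative) shows that the residuals from such a regression are themselves strongly autocorrelated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two independent random walks (nonstationary series)
y = np.cumsum(rng.normal(size=n))
x = np.cumsum(rng.normal(size=n))

# OLS of y on x
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
u = y - X @ b

# Lag-1 autocorrelation of the residuals: typically close to 1
r1 = np.corrcoef(u[1:], u[:-1])[0, 1]
print(f"lag-1 residual autocorrelation: {r1:.3f}")
```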
Consider the two-variable regression model
$$Y_t = \beta_1 + \beta_2 X_t + \mu_t \quad \ldots (1)$$
To make any headway, we must assume the mechanism that generates $\mu_t$, for $E(\mu_t \mu_{t+s}) \neq 0$ ($s \neq 0$) is too general an assumption to be of any practical use. As a starting point, or first approximation, one can assume that the disturbance, or error, terms are generated by the following mechanism:
$$\mu_t = \rho\,\mu_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1 \quad \ldots (2)$$
where the $\varepsilon_t$ satisfy
$$E(\varepsilon_t) = 0$$
$$\operatorname{var}(\varepsilon_t) = \sigma_\varepsilon^2$$
$$\operatorname{cov}(\varepsilon_t, \varepsilon_{t+s}) = 0, \qquad s \neq 0$$
In the engineering literature, an error term with the preceding properties is often called a white noise error term. Equation (2) postulates that the value of the disturbance term in period t is equal to $\rho$ times its value in the previous period plus a purely random error term. The scheme (2) is known as a Markov first-order autoregressive scheme, or simply a first-order autoregressive scheme, usually denoted AR(1). Note that $\rho$, the coefficient of autocovariance in (2), can also be interpreted as the first-order coefficient of autocorrelation, or more accurately, the coefficient of autocorrelation at lag 1. Given the AR(1) scheme, it can be shown that
$$\operatorname{var}(\mu_t) = E(\mu_t^2) = \frac{\sigma_\varepsilon^2}{1 - \rho^2}$$
$$\operatorname{cov}(\mu_t, \mu_{t+s}) = E(\mu_t\,\mu_{t+s}) = \rho^s\,\frac{\sigma_\varepsilon^2}{1 - \rho^2}$$
$$\operatorname{cor}(\mu_t, \mu_{t+s}) = \rho^s$$
We use the AR(1) process not only because of its simplicity compared with higher-order AR schemes, but also because in many applications it has proved to be quite useful.
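As a sanity check on these formulas, the following sketch (Python/NumPy; $\rho$ and $\sigma_\varepsilon$ are arbitrary choices) simulates a long AR(1) disturbance series and compares the sample variance and lag-1 correlation with the theoretical values $\sigma_\varepsilon^2/(1-\rho^2)$ and $\rho$:

```python
import numpy as np

rng = np.random.default_rng(42)
rho, sigma_eps, n = 0.7, 1.0, 200_000

# Generate mu_t = rho * mu_{t-1} + eps_t  (white-noise eps_t)
eps = rng.normal(0, sigma_eps, n)
mu = np.empty(n)
mu[0] = eps[0]
for t in range(1, n):
    mu[t] = rho * mu[t - 1] + eps[t]

print("sample var :", mu.var())                      # ~ sigma^2/(1-rho^2) = 1.96
print("theory var :", sigma_eps**2 / (1 - rho**2))
print("lag-1 corr :", np.corrcoef(mu[1:], mu[:-1])[0, 1])  # ~ rho = 0.7
```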
Now for the two variable regression model we know that
$$\hat\beta_2 = \frac{\sum x_t y_t}{\sum x_t^2}$$
and its variance is given by
$$\operatorname{var}(\hat\beta_2) = \frac{\sigma^2}{\sum x_t^2}$$
where the small letters as usual denote deviation from the mean values.
Now under the AR (1) scheme, it can be shown that the variance of this estimator is
$$\operatorname{var}(\hat\beta_2)_{AR(1)} = \frac{\sigma^2}{\sum x_t^2}\left[1 + 2\rho\,\frac{\sum x_t x_{t-1}}{\sum x_t^2} + 2\rho^2\,\frac{\sum x_t x_{t-2}}{\sum x_t^2} + \cdots + 2\rho^{n-1}\,\frac{x_1 x_n}{\sum x_t^2}\right]$$
where $\operatorname{var}(\hat\beta_2)_{AR(1)}$ denotes the variance of $\hat\beta_2$ under the first-order autoregressive scheme.
For the regression model $Y_t = \beta_1 + \beta_2 X_t + \mu_t$, and assuming the AR(1) process, we can show that the BLUE estimator of $\beta_2$ is given by the following expression:
$$\hat\beta_2^{GLS} = \frac{\sum_{t=2}^{n} (x_t - \rho x_{t-1})(y_t - \rho y_{t-1})}{\sum_{t=2}^{n} (x_t - \rho x_{t-1})^2} + C$$
where C is a correction factor that may be disregarded in practice. And its variance is given
by
$$\operatorname{var}(\hat\beta_2^{GLS}) = \frac{\sigma^2}{\sum_{t=2}^{n} (x_t - \rho x_{t-1})^2} + D$$
where D, too, is a correction factor that may be disregarded in practice.
In the presence of autocorrelation, the OLS estimators are still linear, unbiased, consistent, and asymptotically normally distributed, but they are no longer efficient. Therefore, the usual t and F tests of significance are no longer valid and, if applied, are likely to give seriously misleading conclusions about the statistical significance of the estimated regression coefficients.
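A small Monte Carlo sketch (Python/NumPy; the sample size and parameters are arbitrary) illustrates the danger: with positively autocorrelated errors, the conventional OLS standard error of $\hat\beta_2$ understates its true sampling variability, so t tests reject far too often.

```python
import numpy as np

rng = np.random.default_rng(7)
n, rho, reps = 50, 0.8, 2000
x = np.linspace(0, 1, n)
X = np.column_stack([np.ones(n), x])

betas, naive_se = [], []
for _ in range(reps):
    # AR(1) errors
    eps = rng.normal(size=n)
    u = np.empty(n)
    u[0] = eps[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    y = 1.0 + 2.0 * x + u
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)
    xd = x - x.mean()
    betas.append(b[1])
    naive_se.append(np.sqrt(s2 / (xd @ xd)))

print("true sd of beta2-hat :", np.std(betas))
print("avg naive OLS s.e.   :", np.mean(naive_se))  # much smaller
```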
RUN TEST:
Under the null hypothesis that successive outcomes (here, the signs of the residuals) are independent, and assuming $N_1 > 10$ and $N_2 > 10$, the number of runs R is asymptotically normally distributed with
Mean: $E(R) = \dfrac{2 N_1 N_2}{N} + 1$
Variance: $\sigma_R^2 = \dfrac{2 N_1 N_2 (2 N_1 N_2 - N)}{N^2 (N - 1)}$
where $N = N_1 + N_2$ is the total number of observations, $N_1$ the number of + symbols (positive residuals), and $N_2$ the number of − symbols (negative residuals).
If the null hypothesis of randomness is sustainable, then, following the properties of the normal distribution, we should expect that
$$\Pr\big[E(R) - 1.96\,\sigma_R \le R \le E(R) + 1.96\,\sigma_R\big] = 0.95$$
That is, the probability is 95% that the preceding interval will include R. Hence the rule: do not reject the null hypothesis of randomness with 95% confidence if R, the number of runs, lies in the preceding confidence interval; reject the null hypothesis if the estimated R lies outside these limits.
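A minimal implementation sketch of the runs test (Python/NumPy; the residual series below is artificial):

```python
import numpy as np

def runs_test(resid):
    """Asymptotic runs test on the signs of a residual series."""
    signs = resid > 0
    n1 = int(signs.sum())        # number of + residuals
    n2 = len(signs) - n1         # number of - residuals
    n = n1 + n2
    r = 1 + int((signs[1:] != signs[:-1]).sum())  # number of runs
    mean_r = 2 * n1 * n2 / n + 1
    var_r = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    lo, hi = mean_r - 1.96 * var_r**0.5, mean_r + 1.96 * var_r**0.5
    return r, (lo, hi)

# Illustrative use on artificial, positively autocorrelated residuals
rng = np.random.default_rng(3)
u = np.empty(100)
u[0] = rng.normal()
for t in range(1, 100):
    u[t] = 0.8 * u[t - 1] + rng.normal()
r, (lo, hi) = runs_test(u)
print(f"R = {r}, 95% interval = ({lo:.1f}, {hi:.1f})")  # too few runs -> reject
```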
DURBIN-WATSON d TEST:
The Durbin-Watson statistic is defined as
$$d = \frac{\sum_{t=2}^{n} (\hat\mu_t - \hat\mu_{t-1})^2}{\sum_{t=1}^{n} \hat\mu_t^2} \quad \ldots (1)$$
which is simply the ratio of the sum of squared differences in successive residuals to the residual sum of squares (RSS).
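In code the statistic is one line; a sketch in Python/NumPy follows (statsmodels ships an equivalent durbin_watson helper):

```python
import numpy as np

def durbin_watson(resid):
    """d = sum of squared successive differences / residual sum of squares."""
    resid = np.asarray(resid)
    return np.sum(np.diff(resid)**2) / np.sum(resid**2)

# d is roughly 2(1 - rho_hat): near 2 means no first-order autocorrelation,
# near 0 strong positive, near 4 strong negative autocorrelation.
print(durbin_watson([0.5, -0.3, 0.4, -0.6, 0.2]))
```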
Assumptions:
The d test assumes, among other things, that the regression model includes an intercept term, that the explanatory variables are nonstochastic (fixed in repeated sampling), that the disturbances follow the AR(1) scheme, and that the model does not include lagged values of the dependent variable.
Durbin and Watson were successful in deriving a lower bound $d_L$ and an upper bound $d_U$ such that if the computed d lies outside these critical values, a decision can be made regarding the presence of autocorrelation.
Decision rule:
The d statistic always lies between 0 and 4, and the bounds $d_L$ and $d_U$ divide this range into five zones:
- $0 < d < d_L$: reject $H_0$ (evidence of positive autocorrelation)
- $d_L \le d \le d_U$: zone of indecision
- $d_U < d < 4 - d_U$: do not reject $H_0$ or $H_0^*$
- $4 - d_U \le d \le 4 - d_L$: zone of indecision
- $4 - d_L < d < 4$: reject $H_0^*$ (evidence of negative autocorrelation)
where
H0: No positive autocorrelation
H0*: No negative autocorrelation
Correcting for autocorrelation (the method of generalized difference):
Consider again the two-variable model
$$Y_t = \beta_1 + \beta_2 X_t + \mu_t \quad \ldots (1)$$
and assume that the error term follows the AR(1) scheme, namely,
$$\mu_t = \rho\,\mu_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1$$
When ρ is known:
If the coefficient of first-order autocorrelation $\rho$ is known, the problem of autocorrelation can be easily solved. If equation (1) holds at time t, it also holds at time t−1. Hence
$$Y_{t-1} = \beta_1 + \beta_2 X_{t-1} + \mu_{t-1} \quad \ldots (2)$$
Multiplying equation (2) by ρ on both sides, we obtain
$$\rho Y_{t-1} = \rho\beta_1 + \rho\beta_2 X_{t-1} + \rho\mu_{t-1} \quad \ldots (3)$$
Subtracting (3) from (1) we get
$$(Y_t - \rho Y_{t-1}) = \beta_1 (1 - \rho) + \beta_2 (X_t - \rho X_{t-1}) + (\mu_t - \rho\mu_{t-1}) \quad \ldots (4)$$
$$(Y_t - \rho Y_{t-1}) = \beta_1 (1 - \rho) + \beta_2 (X_t - \rho X_{t-1}) + \varepsilon_t, \quad \text{where } \varepsilon_t = \mu_t - \rho\mu_{t-1}$$
We can express (4) as
$$Y_t^{*} = \beta_1^{*} + \beta_2^{*} X_t^{*} + \varepsilon_t$$
where
$$\beta_1^{*} = \beta_1 (1 - \rho), \quad Y_t^{*} = Y_t - \rho Y_{t-1}, \quad X_t^{*} = X_t - \rho X_{t-1}, \quad \text{and} \quad \beta_2^{*} = \beta_2$$
Since the error term in equation (4) satisfies the usual OLS assumptions, we can apply OLS to the transformed variables $Y^{*}$ and $X^{*}$ and obtain estimators with all the optimum properties, namely BLUE. Running OLS on these quasi-differenced variables is the method of generalized difference.
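A sketch of this generalized difference procedure when ρ is known (Python/NumPy; the data and ρ are illustrative):

```python
import numpy as np

def generalized_difference_ols(y, x, rho):
    """OLS on quasi-differenced data: y* = y_t - rho*y_{t-1}, likewise x*."""
    y_star = y[1:] - rho * y[:-1]
    x_star = x[1:] - rho * x[:-1]
    X = np.column_stack([np.ones(len(x_star)), x_star])
    b, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    b1_star, b2 = b
    return b1_star / (1 - rho), b2   # recover beta1 from beta1* = beta1*(1-rho)

# Illustrative data with AR(1) errors, rho assumed known
rng = np.random.default_rng(11)
n, rho = 100, 0.6
x = rng.uniform(0, 10, n)
u = np.empty(n)
u[0] = rng.normal()
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.normal()
y = 3.0 + 1.5 * x + u
print(generalized_difference_ols(y, x, rho))  # approx (3.0, 1.5)
```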
When ρ is not known:
The method of generalized difference is difficult to implement because ρ is rarely known in practice. Therefore, we need to find ways of estimating ρ. We have several possibilities.
The first difference method:
Since ρ lies between −1 and +1, one could start from two extreme positions. At one extreme, one could assume that ρ = 0, that is, no (first-order) serial correlation; at the other extreme we could let ρ = ±1, that is, perfect positive or negative correlation. As a matter of fact, when a regression is run, one generally assumes that there is no autocorrelation and then lets the Durbin-Watson or another test show whether this assumption is justified. If, however, ρ = +1, the generalized difference equation (4) reduces to the first difference equation:
$$(Y_t - Y_{t-1}) = \beta_2 (X_t - X_{t-1}) + (\mu_t - \mu_{t-1})$$
$$\Delta Y_t = \beta_2\,\Delta X_t + \varepsilon_t \quad \ldots (5)$$
Since the error term in (5) is free from autocorrelation, to run the regression (5) all one has to do is form the first differences of both the regressand and regressor(s) and run the regression on these first differences.
This transformation may be appropriate if the coefficient of autocorrelation is very high, say in excess of 0.8, or the Durbin-Watson d is quite low. Maddala has proposed the rough rule of thumb of using the first difference form whenever d < R². An interesting feature of (5) is that there is no intercept in it. Hence, to estimate (5), one has to use the regression-through-the-origin routine. If, however, you forget to drop the intercept term and instead estimate the model that includes it,
$$\Delta Y_t = \beta_1 + \beta_2\,\Delta X_t + \varepsilon_t$$
then the original model must have had a trend in it, and $\beta_1$ represents the coefficient of the trend variable.
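A minimal sketch of the first difference method (Python/NumPy; regression through the origin on the differenced data):

```python
import numpy as np

def first_difference_ols(y, x):
    """Regression through the origin on first differences: dY = b2*dX + e."""
    dy, dx = np.diff(y), np.diff(x)
    b2 = (dx @ dy) / (dx @ dx)   # no-intercept OLS slope
    return b2

# Illustrative data whose errors are a random walk (rho = 1 case)
rng = np.random.default_rng(5)
n = 200
x = np.linspace(0, 20, n)
u = np.cumsum(rng.normal(0, 0.3, n))   # rho = 1 errors
y = 4.0 + 2.0 * x + u
print(first_difference_ols(y, x))      # approx 2.0
```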