
Autocorrelation:

(What happens if the error terms are correlated?)

The three types of data available for empirical analysis include: (1) Cross section, (2) Time series,
and (3) Combination of cross section and time series, also known as pooled data.
The assumption of homoscedasticity (equal error variance) may not always be tenable in
cross-sectional data, as such data are often plagued by the problem of heteroscedasticity.
In cross-section studies, data are collected using a random sample of cross-sectional units (households
in a consumption function analysis, or firms in an investment study), so there is no prior reason to believe
that the error term pertaining to one household (firm) is correlated with the error term of another
household (firm). If by chance such a correlation is observed in cross-sectional units, it is called spatial
autocorrelation, that is, correlation in space rather than over time (temporal correlation).
However, it is important to remember that, in cross-sectional analysis, the ordering of the data
must have some logic, or economic interest, to make sense of any determination of whether (spatial)
autocorrelation is present or not.
The situation is different when dealing with time series data, for the observations in such data
follow a natural ordering over time so that successive observations are likely to exhibit inter-correlations,
especially if the time interval between successive observations is short, such as a day, a week, or a month
rather than a year. If one observes stock price indices of a company over successive days, they move up
or down for several days in succession. Obviously, in such situations, the assumption of no auto, or
serial, correlation in the error terms that underlies the CLRM will be violated.
The Nature of the Problem
The autocorrelation may be defined as “correlation between members of series of observations
ordered in time [as in time series data] or space [as in cross-sectional data]”. In the regression context, the
classical linear regression model assumes that such autocorrelation does not exist in the disturbances ui.
Symbolically,
cov(ui, uj) = E(ui uj) = 0,   i ≠ j
The classical model assumes that the disturbance term relating to any observation is not
influenced by the disturbance term relating to any other observation.
Ex: i) If we are dealing with quarterly time series data involving the regression of output on labor and
capital inputs and if, say, there is a labor strike affecting output in one quarter, there is no reason to
believe that this disruption will be carried over to the next quarter. That is, if output is lower this
quarter, there is no reason to expect it to be lower next quarter.
ii) While dealing with cross-sectional data involving the regression of family consumption
expenditure on family income, the effect of an increase of one family’s income on its consumption
expenditure is not expected to affect the consumption expenditure of another family.
However, if there is such a dependence, we have autocorrelation.
Symbolically,
cov(ui, uj) = E(ui uj) ≠ 0,   i ≠ j
In this situation, the disruption caused by a strike this quarter may very well affect output next
quarter, or the increases in the consumption expenditure of one family may very well prompt another
family to increase its consumption expenditure if it wants to keep up with the Joneses.
The terms autocorrelation and serial correlation are often used synonymously, but some authors,
such as Tintner, differentiate them. He defines autocorrelation as “lag correlation of a given series with itself,
lagged by a number of time units,” whereas he uses serial correlation to mean “lag correlation between two
different series.” Thus, correlation between two time series such as u1, u2, . . . , u10 and u2, u3, . . . , u11,
where the former is the latter series lagged by one time period, is autocorrelation, whereas correlation
between time series such as u1, u2, . . . , u10 and v2, v3, . . . , v11, where u and v are two different time
series, is called serial correlation.
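To make Tintner's distinction concrete, the following small Python sketch (added here for illustration; the series u and v are simulated, not taken from these notes) computes the lag correlation of a series with itself and the lag correlation between two different series:

```python
import numpy as np

rng = np.random.default_rng(42)
u = np.cumsum(rng.normal(size=11))   # one series, u1 ... u11
v = np.cumsum(rng.normal(size=11))   # a different series, v1 ... v11

# Autocorrelation in Tintner's sense: u1..u10 against u2..u11 (the same series lagged one period)
auto_corr = np.corrcoef(u[:-1], u[1:])[0, 1]

# Serial correlation in Tintner's sense: u1..u10 against v2..v11 (two different series)
serial_corr = np.corrcoef(u[:-1], v[1:])[0, 1]

print(round(auto_corr, 3), round(serial_corr, 3))
```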

Figures 1a to 1d show that there is a distinct pattern among the u’s. Fig.1a shows
a cyclical pattern; Figs.1b and 1c suggest an upward or downward linear trend in the disturbances; whereas
Fig.1d indicates that both linear and quadratic trend terms are present in the disturbances. Only Fig.1e
indicates no systematic pattern, supporting the non-autocorrelation assumption of the classical linear
regression model.

Fig.1: Patterns of autocorrelation and non-autocorrelation.

Causes for autocorrelation:
1. Inertia: A salient feature of most economic time series is inertia (inactivity), or sluggishness. As is
well known, time series such as GNP, price indices, production, employment, and unemployment
exhibit (business) cycles.
Starting at the bottom of the recession, when economic recovery starts, most of these series start
moving upward. In this upswing, the value of a series at one point in time is greater than its previous
value. This “momentum’’ built into them, continues until something happens (e.g., increase in interest
rate or taxes or both) to slow them down. Therefore, in regressions involving time series data, successive
observations are likely to be interdependent.
2. Specification Bias- Excluded Variables Case:
The researcher often starts with a plausible regression model that may not be the most “perfect”
one. After the regression analysis, the researcher examines whether the results accord with a priori
expectations. If the plotted residuals ûi obtained from the fitted regression follow patterns such as those
shown in Fig.1a to d, these residuals (which are proxies for ui) may suggest that some variables, originally
candidates but left out of the model for a variety of reasons, should be included. This is the case of
excluded variable specification bias.
Often the inclusion of such variables removes the correlation pattern observed among the
residuals. Ex: For the following demand model:

Yt = β1 + β2X2t + β3X3t + β4X4t + ut ………. (1)
Where,
Y = Quantity of beef demanded,
X2 = Price of beef,
X3 = Consumer income,
X4 = Price of pork, and
t = time.
However, for some reason, if we run the following regression:
Yt = β1 + β2X2t + β3X3t + vt ………. (2)
If (1) is the “correct’’ model or true relation, running (2) is equivalent to letting vt = β4X4t + ut.
And to the extent the price of pork affects the consumption of beef, the error term v will reflect a
systematic pattern, thus creating (false) autocorrelation. A simple test of this would be to run both (1) and
(2) and see whether autocorrelation, if any, observed in model (2) disappears when (1) is run.
3. Specification bias - Incorrect functional form
Suppose the “true’’ or correct model in a cost-output study is as follows:
MCi = β1 + β2 outputi + β3 output²i + ui ……….(3)
but we fit the following model:
MCi = α1 + α2 outputi + vi ……….(4)

Fig.2: Specification bias: incorrect functional form
The marginal cost curve corresponding to the “true’’ model is shown in Fig. 2 along with the
“incorrect’’ linear cost curve. As Fig.2 shows, between points A and B the linear marginal cost curve will
consistently overestimate the true marginal cost, whereas beyond these points it will consistently
underestimate the true marginal cost. This result is to be expected, because the disturbance term vi is, in
fact, equal to β3 output²i + ui, and hence will catch the systematic effect of the squared output term on
marginal cost. In this case, vi will reflect autocorrelation because of the use of an incorrect functional form.
4. Cobweb Phenomenon:
The supply of many agricultural commodities reflects the cobweb phenomenon, where supply
reacts to price with a lag of one time period because supply decisions take time to implement (the
gestation period). Thus, in planting crops this year, farmers are influenced by the price prevailing last
year, so that their supply function is
Supplyt = β1 + β2Pt−1 + ut ……….(5)
Suppose at the end of period t, price Pt turns out to be lower than Pt−1.
Therefore, in period t+1 farmers may very well decide to produce less than they did in period t.
Obviously, in this situation the disturbances ut are not expected to be random because if the farmers
overproduce in year t, they are likely to reduce their production in t + 1, and so on, leading to a Cobweb
pattern.
5. Lags
In a time series regression of consumption expenditure on income, it is not uncommon to find that
the consumption expenditure in the current period depends, among other things, on the consumption
expenditure of the previous period. That is,
Consumptiont = β1 + β2Incomet + β3Consumptiont−1 + ut ……….(6)
Above regression is known as auto-regression because one of the explanatory variables is the
lagged value of the dependent variable.
Consumers do not change their consumption habits readily for psychological, technological, or
institutional reasons. Now if we neglect the lagged term in above equation, the resulting error term will
reflect a systematic pattern due to the influence of lagged consumption on current consumption.

6. Manipulation of Data:
In empirical analysis, the raw data are often “manipulated.’’
Ex: In time series regressions involving quarterly data, the quarterly figures are often obtained by adding
the three monthly values and dividing by 3. This averaging
introduces smoothness into the data by dampening the fluctuations in the monthly data. Therefore, the
graph plotting the quarterly data looks much smoother than the monthly data, and this smoothness may
itself lend to a systematic pattern in the disturbances, thereby introducing autocorrelation.
Another source of manipulation is interpolation or extrapolation of data.
Ex: The Population Census is conducted every 10 years, e.g., in 1991 and 2001. If data are needed for
some year within the inter-census period 1991–2001, the common practice is to
interpolate on the basis of some adhoc assumptions. All such data “massaging’’ techniques might impose
upon the data a systematic pattern that might not exist in the original data.
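As a small illustration of how interpolation imposes a pattern (a sketch with hypothetical census figures, not data from these notes), intermediate years filled in linearly lie exactly on a straight line:

```python
import numpy as np

census_years = np.array([1991, 2001])
census_pop = np.array([846.4, 1028.7])        # hypothetical census population figures (millions)

years = np.arange(1991, 2002)
interpolated = np.interp(years, census_years, census_pop)  # linear interpolation between censuses

# Every interpolated value lies exactly on a straight line, so any regression using these
# figures inherits a smooth, systematic pattern that was imposed on, not observed in, the data.
print(dict(zip(years.tolist(), interpolated.round(1).tolist())))
```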
7. Data Transformation:
Consider the following model:
Yt = β1 + β2Xt + ut ……….(7)
where,
Y = Consumption expenditure and X = Income.
Since it holds true at every time period, it holds true also in the previous time period, (t − 1). So,
we can write (7) as
Yt−1 = β1 + β2Xt−1 + ut−1 ……….(8) [Known as level form regression]
Yt−1, Xt−1, and ut−1 are known as the lagged values of Y, X, and u, respectively, by one period.
subtracting (8) from (7), we obtain
∆Yt = β2∆Xt +∆ut……….(9) [Known as difference form regression]
∆ is the first difference operator, tells us to take successive differences of the variables.
For empirical purposes, we write this equation as
∆Yt = β2∆Xt + vt ……….(10) [where vt = ∆ut = ut − ut−1]
If in (8) Y and X represent the logarithms of consumption expenditure and income, then in (9) ∆Y and
∆X will represent changes in the logs of consumption expenditure and income. A change in the log of a
variable is a relative or a percentage change, if the former is multiplied by 100. So, instead of studying
relationships between variables in the level form, we may be interested in their relationships in the growth
form.
Now if the error term in (7) satisfies the standard OLS assumptions, particularly the assumption
of no autocorrelation, it can be shown that the error term vt in (10) is autocorrelated.
Since, vt = ut − ut−1,
E(vt) = E(ut − ut−1) = E(ut)−E(ut−1) = 0, since E(u) = 0, for each t.
Now, var (vt) = var (ut − ut−1) = var (ut) + var (ut−1) = 2σ2,
Since, the variance of each ut is σ2 and the u’s are independently distributed. Hence, vt is homoscedastic.
But cov (vt , vt−1) = E(vtvt−1) = E[(ut − ut−1)(ut−1 − ut−2)] = −σ2 which is obviously nonzero. Therefore,
although the u’s are not autocorrelated, the v’s are.
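This induced autocorrelation is easy to verify by simulation. The following sketch (illustrative only, not part of the original notes) draws independent errors ut, forms vt = ut − ut−1 as in (10), and checks that the lag-1 correlation of v is close to −σ²/2σ² = −0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
T, sigma = 100_000, 1.0

u = rng.normal(0.0, sigma, size=T)   # u_t: independent, non-autocorrelated errors as in (7)
v = u[1:] - u[:-1]                   # v_t = u_t - u_{t-1}, the error of the difference form (10)

# Theory: cov(v_t, v_{t-1}) = -sigma^2 and var(v_t) = 2 sigma^2, so the lag-1 correlation is -0.5
lag1_corr = np.corrcoef(v[1:], v[:-1])[0, 1]
print(round(lag1_corr, 3))           # close to -0.5
```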
8. Nonstationarity:
A time series is stationary if its characteristics (e.g., mean, variance, and covariance) are time
invariant; that is, they do not change over time. If that is not the case, we have a nonstationary time
series. In a regression model (7), it is quite possible that both Y and X are nonstationary and therefore the
error u is also nonstationary. In that case, the error term will exhibit autocorrelation.
Autocorrelation can be positive (Fig.3a) as well as negative, although most economic time series
generally exhibit positive autocorrelation because most of them either move upward or downward over
extended time periods and do not exhibit a constant up-and-down movement such as that shown in Fig.3b.

Fig.3: (a) Positive and (b) negative autocorrelation

OLS Estimation in the Presence of Autocorrelation


What happens to the OLS estimators and their variances if we introduce autocorrelation in
the disturbances by assuming that E(utut+s) ≠ 0 (s ≠ 0) but retain all the other assumptions of the classical
model? Note again the subscript t on the disturbances, to emphasize that we are dealing with time series data.
Two-variable regression model
Yt = β1 + β2Xt + ut …………….(11)
We must assume the mechanism that generates ut, for E(utut+s) ≠ 0 (s ≠ 0) is too general an assumption to be
of any practical use. As a starting point, assume that the disturbance, or error, terms are generated by the
following mechanism:
ut = ρut−1 + εt,   −1 < ρ < 1 …………….(12)
where ρ ( = rho) is known as the coefficient of auto-covariance and where εt is the stochastic disturbance
term such that it satisfies the standard OLS assumptions, namely,
E(εt) = 0,   var(εt) = σ²ε,   cov(εt, εt+s) = 0 for s ≠ 0
An error term with these properties is called a white noise error term. Equation (12) postulates
that the value of the disturbance term in period t is equal to rho times its value in the previous period plus
a purely random error term. The scheme (12) is known as Markov first-order autoregressive scheme,
or simply a first-order autoregressive scheme, usually denoted as AR(1). The name autoregressive is
appropriate because (12) can be interpreted as the regression of ut on itself lagged one period. It is first
order because ut and its immediate past value are involved; that is, the maximum lag is 1.
If the model is ut = ρ1ut−1 + ρ2ut−2 + εt, it would be an AR(2), or second-order, autoregressive
scheme, and so on.
Rho, the coefficient of autocovariance in (12), can also be interpreted as the first-order coefficient
of autocorrelation or, more accurately, the coefficient of autocorrelation at lag 1.
Given the AR(1) scheme, it can be shown that
var(ut) = E(u²t) = σ²ε / (1 − ρ²) ……………….. (13)

cov(ut, ut+s) = ρ^s [σ²ε / (1 − ρ²)] …………………(14)

cor(ut, ut+s) = ρ^s …………………(15)
Where, cov (ut, ut+s) means covariance between error terms s periods apart and where cor (ut, ut+s) means
correlation between error terms s periods apart. Because of the symmetry property of covariances and
correlations, cov (ut , ut+s) = cov (ut, ut−s) and cor(ut, ut+s) = cor(ut, ut−s) .
Since ρ is a constant between −1 and +1, (13) shows that under the AR(1) scheme, the variance of
ut is still homoscedastic, but ut is correlated not only with its immediate past value but its values several
periods in the past. It is critical to note that |ρ| < 1, that is, the absolute value of rho is less than one.
 If ρ =1, the variances and covariances listed above are not defined.
 If |ρ| < 1, we say that the AR(1) process given in (12), i.e., ut = ρut−1 + εt, is stationary; that is, the
mean, variance, and covariance of ut do not change over time.
 If |ρ| is less than one, then it is clear from (14) that the value of the covariance will decline as we
go into the distant past.
One reason we use the AR(1) process is not only because of its simplicity compared to higher-order
AR schemes, but also because in many applications it has proved to be quite useful. Additionally, a
considerable amount of theoretical and empirical work has been done on the AR(1) scheme.
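A short simulation (illustrative, not part of the original notes) can be used to check the stationary variance in (13) and the lag correlations in (15) for an AR(1) error process:

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho, sigma_eps = 100_000, 0.8, 1.0

# AR(1) / Markov first-order autoregressive scheme: u_t = rho * u_{t-1} + eps_t
eps = rng.normal(0.0, sigma_eps, size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho * u[t - 1] + eps[t]
u = u[1000:]                         # drop a burn-in so the process is effectively stationary

print("var(u):", round(u.var(), 3), " theory:", round(sigma_eps**2 / (1 - rho**2), 3))      # (13)
for s in (1, 2, 3):
    corr = np.corrcoef(u[s:], u[:-s])[0, 1]
    print(f"cor(u_t, u_t+{s}):", round(corr, 3), " theory (rho**s):", round(rho**s, 3))     # (15)
```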

Ex: Two-variable regression model: Yt = β1 + β2Xt + ut.


We know that the OLS estimator of the slope coefficient and its variance are given by
ˆβ2 = Σxtyt / Σx²t ……………….. (16)

var (ˆβ2) = σ² / Σx²t …………………(17)

where xt = Xt − X̄ and yt = Yt − Ȳ denote deviations from the sample means.

Now under the AR(1) scheme, it can be shown that the variance of this estimator is:

var (ˆβ2)AR1 = [σ² / Σx²t] [1 + 2ρ Σxtxt+1/Σx²t + 2ρ² Σxtxt+2/Σx²t + ··· + 2ρ^(n−1) x1xn/Σx²t] ……(18)
Where, var (ˆβ2)AR1 means the variance of ˆβ2 under first-order autoregressive scheme.

A comparison of (18) with (17) shows the former is equal to the latter times a term that depends
on ρ as well as the sample autocorrelations between the values taken by the regressor X at various lags.

In general we cannot predict whether var (ˆβ2) is less than or greater than var (ˆβ2)AR1. Of course,
if rho is zero, the two formulas will coincide. Also, if the correlations among the successive values of the
regressor are very small, the usual OLS variance of the slope estimator will not be seriously biased. But,
as a general principle, the two variances will not be the same.
To give some idea about the difference between the variances given in (17) and (18), assume that
the regressor X also follows the first-order autoregressive scheme with a coefficient of autocorrelation of
r. Then it can be shown that (18) reduces to:

var (ˆβ2)AR1 = [σ² / Σx²t] (1 + ρr) / (1 − ρr) ……(19)
If, for example, r = 0.6 and ρ = 0.8, using (19)
var (ˆβ2)AR1 = 2.8461 var (ˆβ2)OLS.
To put it another way,
var (ˆβ2)OLS = 1/2.8461var (ˆβ2)AR1 = 0.3513 var (ˆβ2)AR1 .
That is, the usual OLS formula [i.e.,(17)] will underestimate the variance of (ˆβ2)AR1 by about 65 percent.
[This answer is specific for the given values of r and ρ].
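The ratio in (19) is simple to compute directly; the sketch below (a hypothetical helper function, not from the notes) reproduces the 2.8461 figure for r = 0.6 and ρ = 0.8:

```python
# Variance inflation implied by (19) when both u_t and X_t follow AR(1) schemes
def ar1_variance_ratio(rho: float, r: float) -> float:
    """var(beta2_hat)_AR1 / var(beta2_hat)_OLS = (1 + rho*r) / (1 - rho*r)."""
    return (1 + rho * r) / (1 - rho * r)

ratio = ar1_variance_ratio(rho=0.8, r=0.6)
print(round(ratio, 4))       # 2.8462: the usual OLS formula understates the true variance...
print(round(1 / ratio, 4))   # 0.3513: ...i.e., it captures only about 35% of it in this case
```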
Warning:
A blind application of the usual OLS formulas to compute the variances and standard errors of the
OLS estimators could give seriously misleading results.
Suppose we continue to use the OLS estimator ˆβ2 and adjust the usual variance formula by taking into
account the AR(1) scheme. That is, we use ˆβ2 given by (16) but use the variance formula given by (18).
What now are the properties of ˆβ2? It can be proved that ˆ β2 is still linear and unbiased.
Detecting Autocorrelation
Ex: Relationship between wages and productivity in the business sector of the US, 1959–1998
Table-1: Indices of Real Compensation and Productivity, United States, 1959–1998

Above Table 1 contains the data on indices of real compensation per hour (Y) and output per hour
(X) in the business sector of the U.S. economy for the period 1959–1998 [base: 1992 = 100].

Fig.4: Index of real compensation against index of productivity, United States, 1959–1998
Plotting these data on Y and X, we obtain Fig.4. Since the relationship between the two variables is
expected to be positive, it is not surprising that they are positively related; what is surprising is that the
relationship is almost linear, although there is some hint that at higher values of productivity the
relationship may be slightly nonlinear. Therefore, linear as well as log–linear models were estimated, with
the following results,

where d is the Durbin–Watson statistic

Qualitatively, both the models give similar results. In both cases the estimated coefficients are
“highly” significant, as indicated by the high t values.
In the linear model, if the index of productivity goes up by a unit, on average, the index of
compensation goes up by about 0.71 units.
In the log–linear model, if the index of productivity goes up by 1 percent, on average, the index
of real compensation goes up by about 0.67 percent.
How reliable are the results if there is autocorrelation?
Tests / Methods to detect Autocorrelation
If there is autocorrelation, the estimated standard errors are biased, as a result of which the
estimated t ratios are unreliable. Hence there is a need to find out whether the data suffer from autocorrelation,
using the different methods detailed below with the LINEAR MODEL.
1. Graphical Method
The assumption of non-autocorrelation of the classical model relates to the population
disturbances ut, which are not directly observable. What we have instead are their proxies, the residuals
ˆut, which can be obtained by the usual OLS procedure. Although the ˆut are not the same thing as ut, very
often a visual examination of the ˆu’s gives us some clues about the likely presence of autocorrelation in
the u’s. Actually, a visual examination of ˆut or (ˆu2t) can provide useful information not only about
autocorrelation but also about heteroscedasticity, model inadequacy, or specification bias.
There are various ways of examining the residuals. One is to plot them against time; the time
sequence plot in Fig.5 shows the residuals obtained from the wages–productivity regression (1). The
values of these residuals are given in Table 2 along with some other data.

Fig.5: Residuals and standardized residuals from the wages–productivity regression (1)

Table-2: Residuals: Actual, Standardized, And Lagged

Alternatively, we can plot the standardized residuals against time, which are also shown in
Fig.5 and Table 2.
The standardized residuals are the residuals (ˆut) divided by the standard error of the regression
(√ˆσ2), that is, they are (ˆut/ˆσ ). [Note that ˆut and ˆσ are measured in the units in which the regressand Y
is measured]. The values of the standardized residuals will therefore be pure numbers (devoid of units of
measurement) and can be compared with the standardized residuals of other regressions. Moreover, the
standardized residuals, like ˆut, have zero mean and approximately unit variance. In large samples (ˆut/ˆσ)
is approximately normally distributed with zero mean and unit variance. For our example, ˆσ = 2.6755.
Examining the time sequence plot given in Fig.5, one can observe that both ût and the standardized ût
exhibit a pattern similar to that observed in Fig.1d, suggesting that perhaps the ut are not random.
To see this differently, we can plot ˆut against ˆut−1, that is, plot the residuals at time t against their
value at time (t-1), a kind of empirical test of the AR(1) scheme. If the residuals are nonrandom, we
should obtain pictures similar to those shown in Fig.3. This plot for wages–productivity regression is
shown in Fig.6 for the data in Table 2. As this figure reveals, most of the residuals are bunched in the
first (northeast) and the third (southwest) quadrants, suggesting a strong positive correlation in the
residuals.

Fig.6: Current residuals versus lagged residuals
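A possible way to produce such plots in practice is sketched below in Python using statsmodels; since the Table 1 values are not reproduced here, the sketch uses simulated stand-in data with AR(1) errors so that the residual plots show a comparable pattern:

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Simulated stand-in for the Table 1 data: a roughly linear wages-productivity relation
# with strongly persistent (AR(1)) errors, so the plots display a visible pattern.
rng = np.random.default_rng(3)
n = 40
x = np.linspace(50, 110, n)                      # productivity index (hypothetical values)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.9 * u[t - 1] + rng.normal()
y = 30 + 0.7 * x + u                             # compensation index (hypothetical values)

fit = sm.OLS(y, sm.add_constant(x)).fit()
resid = fit.resid
std_resid = resid / np.sqrt(fit.mse_resid)       # residuals divided by the regression standard error

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(resid, label="residuals")
ax1.plot(std_resid, label="standardized residuals")
ax1.set_xlabel("time"); ax1.legend()

ax2.scatter(resid[:-1], resid[1:])               # current residual against its lagged value
ax2.axhline(0, lw=0.5); ax2.axvline(0, lw=0.5)
ax2.set_xlabel("residual, t-1"); ax2.set_ylabel("residual, t")
plt.tight_layout(); plt.show()
```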
This graphical method, although powerful and suggestive, is subjective or qualitative in nature.
However, there are several quantitative tests that one can use to supplement it, as follows.
2. The Runs Test
Fig.5 reveals that initially, several residuals are negative, then there is a series of positive
residuals, and then there are several residuals that are negative. If these residuals were purely random,
could we observe such a pattern? Intuitively, it seems unlikely. This intuition can be checked by the so-
called runs test, sometimes also known as the Geary test, a nonparametric test.
Explanation
First note down the signs (+ or −) of the residuals obtained from the wages–productivity regression,
which are given in the first column of Table 2.
(−−−−−−−−−)(+++++++++++++++++++++)(−−−−−−−−−−)
There are 9 negative residuals, 21 positive residuals, followed by 10 negative residuals, for a total of 40
observations.
We now define a run as an uninterrupted sequence of one symbol or attribute, such as + or −. We
further define the length of a run as the number of elements in it. In the above sequence, there are 3 runs:
 A run of 9 minuses (i.e., of length 9),
 a run of 21 pluses (i.e., of length 21) and
 a run of 10 minuses (i.e., of length 10).
By examining how runs behave in a strictly random sequence of observations, one can derive a
test of randomness of runs.
If there are too many runs, it would mean that in our example the residuals change sign
frequently, thus indicating negative serial correlation (Fig.3b). Similarly, if there are too few runs, they
may suggest positive autocorrelation, as in Fig.3a. A priori, then, Fig.5 would indicate positive
correlation in the residuals.

Now let
N = Total number of observations = N1 + N2
N1 = Number of + symbols (i.e., + residuals)
N2 = Number of − symbols (i.e., − residuals)
R = Number of runs
Then under the null hypothesis that the successive outcomes (here, residuals) are independent,
and assuming that N1 > 10 and N2 > 10, the number of runs is (asymptotically) normally distributed with
E(R) = 2N1N2/N + 1   and   σ²R = 2N1N2(2N1N2 − N) / [N²(N − 1)]
Note: N = N1 + N2.
If the null hypothesis of randomness is sustainable, following the properties of the normal
distribution, we should expect that
Prob [E(R) − 1.96σR ≤ R ≤ E(R) + 1.96σR] = 0.95
That is, the probability is 95 percent that the preceding interval will include R.
Decision Rule:
Do not reject the null hypothesis of randomness with 95% confidence if R, the number of runs,
lies in the preceding confidence interval; reject the null hypothesis if the estimated R lies outside these
limits. (can choose any level of confidence)
In this example,
N1 - the number of pluses = 21
N2 - the number of minuses = 19
R = 3.
Then using the above formulas,
E(R) = 20.95
σ²R = 9.6936
σR = 3.1134
The 95% confidence interval for R is thus:
[20.95 ± 1.96(3.1134)] = (14.8477, 27.0523)
Obviously, this interval does not include 3. Hence, we can reject the hypothesis that the residuals
in our wages–productivity regression are random with 95% confidence. In other words, the residuals
exhibit autocorrelation.
As a general rule, if there is
Positive autocorrelation - the number of runs will be few, whereas if there is
Negative autocorrelation - the number of runs will be many.
Swed and Eisenhart have developed special tables that give critical values of the runs expected in a
random sequence of N observations if N1 or N2 is smaller than 20.
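The runs test is straightforward to implement. The sketch below (an illustrative implementation, not code from the notes) applies the large-sample formulas above to the sign pattern of the example:

```python
import numpy as np
from scipy.stats import norm

def runs_test(residuals, alpha=0.05):
    """Geary runs test on residual signs, using the large-sample normal approximation."""
    signs = np.sign(residuals)
    signs = signs[signs != 0]                        # ignore exact zeros, if any
    n1 = int((signs > 0).sum())                      # number of + residuals
    n2 = int((signs < 0).sum())                      # number of - residuals
    n = n1 + n2
    runs = 1 + int((signs[1:] != signs[:-1]).sum())  # one run per sign change, plus the first run

    mean_r = 2.0 * n1 * n2 / n + 1.0
    var_r = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1.0))
    z = norm.ppf(1.0 - alpha / 2.0)
    lower, upper = mean_r - z * np.sqrt(var_r), mean_r + z * np.sqrt(var_r)
    reject = not (lower <= runs <= upper)
    return runs, mean_r, np.sqrt(var_r), (lower, upper), reject

# Sign pattern from the example: 9 minuses, 21 pluses, 10 minuses
demo = np.array([-1.0] * 9 + [1.0] * 21 + [-1.0] * 10)
print(runs_test(demo))   # R = 3 falls outside the interval, so randomness is rejected
```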

3. Durbin–Watson d Test
The most celebrated test for detecting serial correlation is that developed by statisticians Durbin
and Watson. It is popularly known as the Durbin–Watson d statistic, which is defined as
d = [Σt=2..n (ût − ût−1)²] / [Σt=1..n û²t]
This is simply the ratio of the sum of squared differences in successive residuals to the RSS. Note
that in the numerator of the d statistic the number of observations is n− 1 (starting from 2) because one
observation is lost in taking successive differences.
A great advantage of the d statistic is that it is based on the estimated residuals, which are
routinely computed in regression analysis. Because of this advantage, it is now a common practice to
report the Durbin–Watson d along with summary measures, such as R2, adjusted R2, t, and F.
Assumptions underlying the d statistic:
1. The regression model includes the intercept term. If it is not present, as in the case of the regression
through the origin, it is essential to rerun the regression including the intercept term to obtain the
RSS.
2. The explanatory variables, the X’s, are nonstochastic, or fixed in repeated sampling.
3. The disturbances ut are generated by the first-order autoregressive scheme: ut = ρut−1 + εt. Therefore,
it cannot be used to detect higher-order autoregressive schemes.
4. The error term ut is assumed to be normally distributed.
5. The regression model does not include the lagged value(s) of the dependent variable as one of the
explanatory variables. Thus, the test is inapplicable in models of the following type:
a. Yt = β1 + β2X2t + β3X3t + ··· + βkXkt + γYt−1 + ut
where Yt−1 is the one period lagged value of Y. Such models are known as autoregressive models.
6. There are no missing observations in the data; if there are, the d statistic makes no allowance for such
missing observations.
The exact sampling or probability distribution of the d statistic is difficult to derive.
Unlike the t, F, or χ2 tests, there is no unique critical value that will lead to the rejection or the
acceptance of the null hypothesis that there is no first-order serial correlation in the disturbances ui.
However, Durbin and Watson were successful in deriving a lower bound dL and an upper bound dU such
that if the computed d lies outside these critical values, a decision can be made regarding the presence of
positive or negative serial correlation.
Durbin and Watson developed tables of d values for n ranging from 6 to 200 and for up to 20 explanatory
variables. Using these tables, the lower and upper limits dL and dU can be obtained, and from them the
presence or absence of autocorrelation can be assessed.
Test Procedure
It can be explained with the aid of Fig.7, which shows that the limits of d are 0 and 4. These can
be established as follows
The expanded form of the above equation for d is
d = [Σû²t + Σû²t−1 − 2Σût ût−1] / Σû²t
Since Σû²t and Σû²t−1 differ by only one observation, they are approximately equal. Therefore, setting
Σû²t−1 ≈ Σû²t, the above equation may be written as
d ≈ 2[1 − (Σût ût−1 / Σû²t)]
where ≈ means approximately.

Fig. 7: Durbin–Watson d statistic

Now define
ρ̂ = Σût ût−1 / Σû²t
as the sample first-order coefficient of autocorrelation, an estimator of ρ. Substituting ρ̂ into the above
equation, we get
d ≈ 2(1 − ρ̂)
But since −1 ≤ ρ̂ ≤ 1, this implies that
0 ≤ d ≤ 4
These are the bounds of d; any estimated d value must lie within these limits.
 If ρˆ = 0, d = 2; - no first-order autocorrelation, either positive or negative.
 If ρˆ = +1, d ≈ 0 -indicating perfect positive correlation in the residuals.
 If ρˆ = -1, d ≈ 4 - indicating perfect negative correlation in the residuals.
Therefore,
The closer d is to 0, the greater the evidence of positive serial correlation. If there is positive
autocorrelation, the ˆut ’s will be bunched together and their differences will therefore tend to be small.
As a result, the numerator sum of squares will be smaller in comparison with the denominator sum of
squares, which remains a unique value for any given regression.
If ρˆ = −1, that is, there is perfect negative correlation among successive residuals, d ≈ 4. Hence,
the closer d is to 4, the greater the evidence of negative serial correlation. For if there is negative
autocorrelation, a positive ˆut will tend to be followed by a negative ˆut and vice versa so that | ˆut −
ˆut−1| will usually be greater than | ˆut |. Therefore, the numerator of d will be comparatively larger than
the denominator.
Mechanics of the Durbin–Watson test
Assuming that the assumptions underlying the test are fulfilled:
1. Run the OLS regression and obtain the residuals.
2. Compute d. (Most computer programs now do this routinely.)
3. For the given sample size and given number of explanatory variables find out the critical dL and dU
values.

4. Now follow the decision rules given in Table-3 below. For ease of reference, these decision rules
are also depicted in Fig.7 above.
Table-3: Durbin–Watson d Test: Decision Rules

Illustration: For the wages–productivity regression with the data given in Table 2, the estimated d value can be
shown to be 0.1229, suggesting positive serial correlation in the residuals. From the Durbin–
Watson tables, we find that for 40 observations and one explanatory variable, dL = 1.44 and dU = 1.54 at
the 5 percent level. Since the computed d of 0.1229 lies below dL, we reject the null hypothesis of no serial
correlation and conclude that there is significant positive serial correlation in the residuals.
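Computing d from the residuals is a one-line calculation; the sketch below (illustrative, with simulated residuals) also checks the approximation d ≈ 2(1 − ρ̂) derived above. The statsmodels function durbin_watson computes the same statistic.

```python
import numpy as np

def durbin_watson_d(residuals):
    """Durbin-Watson d: sum of squared successive differences over the residual sum of squares."""
    residuals = np.asarray(residuals)
    return np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

# Check d ~ 2(1 - rho_hat) on strongly positively autocorrelated (simulated) residuals
rng = np.random.default_rng(4)
res = np.zeros(40)
for t in range(1, 40):
    res[t] = 0.95 * res[t - 1] + rng.normal()

rho_hat = np.sum(res[1:] * res[:-1]) / np.sum(res ** 2)   # sample first-order autocorrelation
print(round(durbin_watson_d(res), 3), round(2 * (1 - rho_hat), 3))
# statsmodels.stats.stattools.durbin_watson(res) gives the same d statistic.
```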
However, the d test has a great drawback: if the computed d falls in the indecisive zone, one cannot conclude
whether (first-order) autocorrelation does or does not exist. In that case the following modified d test can be used.
Given the level of significance α,
1. H0: ρ = 0 versus H1: ρ > 0. Reject H0 at the α level if d < dU. That is, there is statistically significant
positive autocorrelation.
2. H0: ρ = 0 versus H1: ρ < 0. Reject H0 at the α level if the estimated (4 − d) < dU. That is, there is
statistically significant evidence of negative autocorrelation.
3. H0: ρ = 0 versus H1: ρ ≠ 0. Reject H0 at the 2α level if d < dU or (4 − d) < dU. That is, there is
statistically significant evidence of autocorrelation, positive or negative.
It may be pointed out that the indecisive zone narrows as the sample size increases, which can be
seen clearly from the Durbin–Watson tables. For example, with 4 regressors and 20 observations, the 5
percent lower and upper d values are 0.894 and 1.828, respectively, but these values are 1.515 and 1.739
if the sample size is 75.
The computer program Shazam performs an exact d test, that is, it gives the p value, the exact
probability of the computed d value. With modern computing facilities, it is no longer difficult to find the
p value of the computed d statistic. Using SHAZAM (version 9) for the wages–productivity regression, the p
value of the computed d of 0.1229 is practically zero, thereby reconfirming our earlier conclusion based
on the Durbin–Watson tables.
The Durbin–Watson d test has become so venerable that practitioners often forget the
assumptions underlying the test. In particular, the assumptions that (1) the explanatory variables, or
regressors, are nonstochastic; (2) the error term follows the normal distribution; and (3) that the regression
models do not include the lagged value(s) of the regressand are very important for the application of the d
test.
If a regression model contains lagged value(s) of the regressand, the d value in such cases is often
around 2, which would suggest that there is no (first-order) autocorrelation in such models. Thus, there is
a built-in bias against discovering (first-order) autocorrelation in such models.
4. Breusch–Godfrey (BG) Test
This test allows for
(1) regressors that include the lagged values of the regressand;
(2) higher-order autoregressive schemes, such as AR(1), AR(2), etc.; and
(3) simple or higher-order moving averages of white noise error terms, such as εt in Equation (12).
The BG test is also known as the LM (Lagrange multiplier) test.

Ex: Consider the two-variable regression model, although many regressors, including lagged values of the
regressand, could be added to the model. Let
Yt = β1 + β2Xt + ut ………….(1)
Assume that the error term ut follows the pth-order autoregressive, AR(p), scheme as follows
ut = ρ1ut−1 + ρ2ut−2 + ··· + ρput−p + εt ………(2)
where εt is a white noise error term as discussed previously. This is simply the extension of the AR(1)
scheme. The null hypothesis H0 to be tested is that
H0: ρ1 = ρ2 = ··· = ρp = 0 ………(3)
That is, there is no serial correlation of any order. The BG test involves the following steps:
1. Estimate the above regression (1) by OLS and obtain the residuals, ût.
2. Regress ût on the original Xt (if there is more than one X variable in the original model, include them
all) and ût−1, ût−2, . . . , ût−p, where the latter are the lagged values of the estimated residuals from
step 1. Thus, if p = 4, we will introduce four lagged values of the residuals as additional regressors in
the model. Note that to run this regression we will have only (n − p) observations. In short, run the
following regression,
ût = α1 + α2Xt + ρ̂1ût−1 + ρ̂2ût−2 + ··· + ρ̂pût−p + εt ……….(4)
and obtain R2 from this (auxiliary) regression.
3. If the sample size is large (technically, infinite), Breusch and Godfrey have shown that
(n − p)R² ∼ χ²p …………..(5)
That is, asymptotically, n − p times the R2 value obtained from the eqn (4) regression follows the chi-
square distribution with p df. If in an application, (n − p)R2 exceeds the critical chi-square value at the
chosen level of significance, we reject the null hypothesis, in which case at least one rho in eqn (2) is
statistically significantly different from zero.
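The BG steps can be carried out either by hand with an auxiliary regression or with the acorr_breusch_godfrey function in statsmodels. The sketch below (with simulated data carrying AR(1) errors, so the null should be rejected) shows both; the two statistics may differ slightly because of how pre-sample lagged residuals are handled:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Simulated data with AR(1) errors, so the BG test should reject the no-autocorrelation null
rng = np.random.default_rng(5)
n, p = 200, 4
x = np.linspace(0.0, 10.0, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

fit = sm.OLS(y, sm.add_constant(x)).fit()
res = fit.resid

# Manual BG test: regress u_hat_t on X_t and p lagged residuals, then (n - p) * R^2 ~ chi-square(p)
lagged = np.column_stack([np.roll(res, k) for k in range(1, p + 1)])[p:]
aux = sm.OLS(res[p:], sm.add_constant(np.column_stack([x[p:], lagged]))).fit()
lm_manual = (n - p) * aux.rsquared
print("manual LM statistic:", round(lm_manual, 2))

# Built-in version; returns the LM statistic, its p value, and an F-test variant with its p value
lm, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(fit, nlags=p)
print("statsmodels LM statistic:", round(lm, 2), " p value:", round(lm_pval, 4))
```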
