5 Multicollinearity
Dimitrios Asteriou
and
Stephen G Hall
Applied Econometrics 3rd edition
MULTICOLLINEARITY
1. Perfect Multicollinearity
2. Consequences of Perfect Multicollinearity
3. Imperfect Multicollinearity
4. Consequences of Imperfect Multicollinearity
5. Detecting Multicollinearity
6. Resolving Multicollinearity
Learning Objectives
1. Recognize the problem of multicollinearity in the CLRM.
2. Distinguish between perfect and imperfect multicollinearity.
3. Understand and appreciate the consequences of perfect
and imperfect multicollinearity on OLS estimates.
4. Detect problematic multicollinearity using econometric
software.
5. Find ways of resolving problematic multicollinearity.
Multicollinearity
Assumption number 8 of the CLRM requires that
there are no exact linear relationships among
the sample values of the explanatory variables
(the Xs).
When the explanatory variables are very highly
correlated with each other (correlation
coefficients very close to 1 or to -1), the
problem of multicollinearity occurs.
Perfect Multicollinearity
• Perfect multicollinearity occurs when there is an exact linear
relationship among the explanatory variables.
• Assume we have the following model:
Y=β1+β2X2+ β3X3+e
where the sample values for X2 and X3 are:
X2 1 2 3 4 5 6
X3 2 4 6 8 10 12
Perfect Multicollinearity
• We observe that X3=2X2
• Therefore, although there appear to be two
explanatory variables, in fact there is only one.
• This is because X2 is an exact linear function
of X3, that is, X2 and X3 are perfectly
collinear.
Perfect Multicollinearity
When this occurs, the equation:
δ1X2+δ2X3=0
can be satisfied for non-zero values of both δ1
and δ2.
In our case we have that
(-2)X2+(1)X3=0
So δ1=-2 and δ2=1.
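The sample values above can be checked numerically. The following is a minimal sketch using NumPy (the choice of library and the variable names are assumptions, not part of the original slides). It shows that the design matrix with an intercept, X2 and X3 has only two linearly independent columns, so X'X cannot be inverted.

```python
import numpy as np

# Sample values from the slide: X3 is exactly 2 * X2.
x2 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x3 = np.array([2, 4, 6, 8, 10, 12], dtype=float)

# Design matrix with an intercept column, X2 and X3.
X = np.column_stack([np.ones_like(x2), x2, x3])

# Three columns, but only two of them are linearly independent.
rank = np.linalg.matrix_rank(X)

# The non-trivial combination (-2)*X2 + (1)*X3 is zero for every observation.
combo = -2.0 * x2 + 1.0 * x3

# X'X is singular (determinant zero up to rounding), so OLS has no unique solution.
det_xtx = np.linalg.det(X.T @ X)

print("rank of X:", rank)
print("(-2)X2 + (1)X3:", combo)
print("det(X'X):", det_xtx)
```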
Perfect Multicollinearity
Obviously if the only solution is
δ1=δ2=0
(usually called the trivial solution) then
the two variables are linearly independent
and there is no problematic
multicollinearity.
Perfect Multicollinearity
With more than two explanatory variables,
perfect multicollinearity means that one variable can be expressed as
an exact linear function of one or more, or even
all, of the other variables.
So, if we have 5 explanatory variables, the equation
δ1X1+δ2X2 +δ3X3+δ4X4 +δ5X5=0
can be satisfied with δs that are not all zero.
An application that helps to illustrate this situation
is the dummy variable trap (explain on board).
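The slide leaves the dummy variable trap to the board, so the following is only an illustrative sketch under the usual textbook definition: including an intercept together with a dummy for every category. The data and names (`category`, `X_trap`, `X_ok`) are hypothetical.

```python
import numpy as np

# Hypothetical categorical variable with three categories (illustrative data).
category = np.array([0, 1, 2, 0, 1, 2, 0, 1])

# One dummy column per category: each row contains exactly one 1.
dummies = np.eye(3)[category]
intercept = np.ones((len(category), 1))

# Trap: intercept plus all three dummies. The dummies sum to the intercept
# column, so 1*D1 + 1*D2 + 1*D3 - 1*intercept = 0 for every observation.
X_trap = np.hstack([intercept, dummies])

# Usual way out: drop one dummy, which becomes the reference category.
X_ok = np.hstack([intercept, dummies[:, 1:]])

print("trap: rank", np.linalg.matrix_rank(X_trap), "with", X_trap.shape[1], "columns")
print("ok:   rank", np.linalg.matrix_rank(X_ok), "with", X_ok.shape[1], "columns")
```

In the first design the rank is smaller than the number of columns, which is exactly the perfect multicollinearity case described above.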
Imperfect Multicollinearity
• Imperfect multicollinearity (or near
multicollinearity) exists when the explanatory
variables in an equation are correlated, but this
correlation is less than perfect.
• This can be expressed as:
X3=X2+v
where v is a random variable that can be viewed
as the ‘error’ in the exact linear relationship.
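A small simulation can make the relation X3 = X2 + v concrete. This is a sketch with assumed settings (seed, sample size and the standard deviation of v are arbitrary choices, not taken from the slides). The correlation is very high but below 1, and the design matrix keeps full column rank, so OLS can still be computed.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n = 200                         # assumed sample size

x2 = rng.normal(size=n)
v = 0.1 * rng.normal(size=n)    # small 'error' in the linear relationship
x3 = x2 + v

# Correlation between X2 and X3: close to 1, but not exactly 1.
corr = np.corrcoef(x2, x3)[0, 1]

# Unlike the perfect case, the design matrix has full column rank.
X = np.column_stack([np.ones(n), x2, x3])
rank = np.linalg.matrix_rank(X)

print("corr(X2, X3):", round(corr, 4))
print("rank of X:", rank, "of", X.shape[1])
```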
$$\operatorname{var}(\hat{\beta}_3) = \frac{\sigma^2}{\sum (X_3 - \bar{X}_3)^2} \cdot \frac{1}{(1 - R_j^2)}$$
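The variance expression for the coefficient on X3 (σ² over the sum of squared deviations of X3, multiplied by 1/(1 − R²), where R² comes from regressing X3 on the other explanatory variables) can be compared with the matrix form σ²(X'X)⁻¹. The sketch below assumes simulated data and a known σ² = 1; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 100
x2 = rng.normal(size=n)
x3 = x2 + 0.5 * rng.normal(size=n)
sigma2 = 1.0                    # assumed known error variance

X = np.column_stack([np.ones(n), x2, x3])

# Matrix form: var(beta_hat) = sigma^2 (X'X)^(-1); entry [2, 2] belongs to beta_3.
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[2, 2]

# Auxiliary regression of X3 on the intercept and X2 gives the R^2 in the formula.
Z = X[:, :2]
coef, *_ = np.linalg.lstsq(Z, x3, rcond=None)
resid = x3 - Z @ coef
tss = np.sum((x3 - x3.mean()) ** 2)
r2_aux = 1.0 - resid @ resid / tss

# Formula from the slide: sigma^2 / sum((X3 - mean)^2) * 1 / (1 - R^2).
var_formula = sigma2 / tss * 1.0 / (1.0 - r2_aux)

print("matrix form :", var_matrix)
print("slide formula:", var_formula)
```

The two numbers agree, and the second factor shows how the variance grows without bound as the auxiliary R² approaches 1.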
Consequences of
Imperfect Multicollinearity (Again)
In conclusion, when imperfect multicollinearity is present we have:
(a) OLS estimates may be imprecise because of large
standard errors.
(b) Affected coefficients may fail to attain statistical significance
due to low t-stats.
(c) Sign reversal might occur.
(d) Addition or deletion of a few observations may result in
substantial changes in the estimated coefficients.
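Points (a) and (b) can be illustrated with a simulation. This is a sketch under assumed settings: the true coefficients (1, 2, 3), the seed and the two noise scales are arbitrary, and `ols_se` is a hypothetical helper written for this example. The same draws are used in both designs, and only the size of v in X3 = X2 + v changes.

```python
import numpy as np

def ols_se(X, y):
    """Return OLS coefficients and their conventional standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)          # estimated error variance
    se = np.sqrt(s2 * np.diag(xtx_inv))
    return beta, se

rng = np.random.default_rng(2)  # arbitrary seed
n = 100
x2 = rng.normal(size=n)
v = rng.normal(size=n)
e = rng.normal(size=n)

results = {}
for scale in (1.0, 0.05):       # weak vs strong collinearity between X2 and X3
    x3 = x2 + scale * v
    X = np.column_stack([np.ones(n), x2, x3])
    y = 1.0 + 2.0 * x2 + 3.0 * x3 + e
    beta, se = ols_se(X, y)
    results[scale] = (beta, se)
    print(f"scale={scale}: b3={beta[2]:.3f}, se={se[2]:.3f}, t={beta[2] / se[2]:.2f}")
```

Both designs span the same column space, so the residuals are identical and only the spread of X3 around X2 differs. The standard error of the X3 coefficient is therefore 1/0.05 = 20 times larger in the collinear design, and the t-statistic falls accordingly.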
Detecting Multicollinearity
• The easiest way to measure the extent of
multicollinearity is simply to look at the matrix of
correlations between the individual variables.
• With more than two explanatory variables, we run
auxiliary regressions of each explanatory variable on the
others. If a near-linear dependency exists, the auxiliary
regression will display a small equation standard error,
a large R² and a statistically significant F-value.
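Both detection steps can be sketched in a few lines. The example below uses simulated regressors (an assumption made for illustration; it is not the IMP/GDP/CPI/PPI data set) and a hypothetical helper `auxiliary_r2` that regresses one explanatory variable on an intercept and the remaining ones.

```python
import numpy as np

def auxiliary_r2(X, j):
    """R^2 from regressing column j of X on an intercept and the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(3)  # arbitrary seed
n = 200
x2 = rng.normal(size=n)
x3 = x2 + 0.1 * rng.normal(size=n)   # nearly collinear with x2
x4 = rng.normal(size=n)              # unrelated to the other two
X = np.column_stack([x2, x3, x4])

# Step 1: matrix of pairwise correlations (columns are variables).
corr_matrix = np.corrcoef(X, rowvar=False)

# Step 2: one auxiliary regression per explanatory variable.
aux_r2 = [auxiliary_r2(X, j) for j in range(X.shape[1])]

print(np.round(corr_matrix, 3))
print("auxiliary R^2:", np.round(aux_r2, 3))
```

A large auxiliary R² for x2 and x3 and a small one for x4 would flag the first two as the collinear pair.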
Resolving Multicollinearity
• Other approaches, such as ridge regression or
the method of principal components, are available, but these
usually bring more problems than they solve.
Resolving Multicollinearity
• The easiest ways to “cure” the problems are
(a) drop one of the collinear variables
(b) transform the highly correlated variables into
a ratio
(c) go out and collect more data, e.g. a longer run of data
or a switch to a higher frequency
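Option (a), dropping one of the collinear variables, can be sketched as follows. The data are simulated under assumed true coefficients (1, 2, 3), and `ols_se` is a hypothetical helper, so this is an illustration rather than a recommended procedure.

```python
import numpy as np

def ols_se(X, y):
    """Return OLS coefficients and their conventional standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)
    return beta, np.sqrt(s2 * np.diag(xtx_inv))

rng = np.random.default_rng(4)  # arbitrary seed
n = 100
x2 = rng.normal(size=n)
x3 = x2 + 0.05 * rng.normal(size=n)   # highly collinear with x2
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x2, x3])   # both collinear variables
X_drop = np.column_stack([np.ones(n), x2])       # X3 dropped

beta_full, se_full = ols_se(X_full, y)
beta_drop, se_drop = ols_se(X_drop, y)

print("full model : b2 =", round(beta_full[1], 3), "se =", round(se_full[1], 3))
print("X3 dropped : b2 =", round(beta_drop[1], 3), "se =", round(se_drop[1], 3))
```

The standard error on X2 shrinks once X3 is dropped, but the X2 coefficient now also picks up the effect of X3 (here roughly 2 + 3), so its interpretation changes.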
Examples
We have quarterly data for
Imports (IMP)
Gross Domestic Product (GDP)
Consumer Price Index (CPI) and
Producer Price Index (PPI)
Examples
Correlation Matrix