5 Multicollinearity

Applied Econometrics 3rd edition

Dimitrios Asteriou
Stephen G Hall

1. Perfect Multicollinearity
2. Consequences of Perfect Multicollinearity
3. Imperfect Multicollinearity
4. Consequences of Imperfect Multicollinearity
5. Detecting Multicollinearity
6. Resolving Multicollinearity

Learning Objectives
1. Recognize the problem of multicollinearity in the CLRM.
2. Distinguish between perfect and imperfect multicollinearity.
3. Understand and appreciate the consequences of perfect
and imperfect multicollinearity on OLS estimates.
4. Detect problematic multicollinearity using econometric software.
5. Find ways of resolving problematic multicollinearity.

Assumption number 8 of the CLRM requires that
there are no exact linear relationships among
the sample values of the explanatory variables
(the Xs).
When the explanatory variables are very highly
correlated with each other (correlation
coefficients very close to 1 or to -1), the
problem of multicollinearity occurs.

Perfect Multicollinearity
• Occurs when there is a perfect linear relationship
among the explanatory variables.
• Assume we have the following model:
Y=β1+β2X2+β3X3+e
where the sample values for X2 and X3 are:

X2 1 2 3 4 5 6

X3 2 4 6 8 10 12

Perfect Multicollinearity
• We observe that X3=2X2
• Therefore, although it seems that there are two
explanatory variables, in fact there is only one.
• This is because X2 is an exact linear function
of X3 or, equivalently, because X2 and X3 are
perfectly correlated.

Perfect Multicollinearity
When this occurs, the equation:
δ1X2+δ2X3=0
can be satisfied for non-zero values of both δ1
and δ2.
In our case we have that
(-2)X2+(1)X3=0
So δ1=-2 and δ2=1.
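
As a quick check, the short Python sketch below plugs the sample values from the slides into δ1X2+δ2X3 with δ1=-2 and δ2=1. The variable names are illustrative choices, not part of the original slides.

```python
# Sample values for X2 and X3 taken from the slides.
X2 = [1, 2, 3, 4, 5, 6]
X3 = [2, 4, 6, 8, 10, 12]

# The non-trivial solution discussed in the text.
delta1, delta2 = -2, 1

# delta1*X2 + delta2*X3, evaluated observation by observation.
combination = [delta1 * a + delta2 * b for a, b in zip(X2, X3)]

print(combination)  # [0, 0, 0, 0, 0, 0]
```

Every element is zero, so the linear combination vanishes for the whole sample even though neither coefficient is zero.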

Perfect Multicollinearity
Obviously, if the only solution is
δ1=δ2=0
(usually called the trivial solution), then
the two variables are linearly independent
and there is no problematic multicollinearity.

Perfect Multicollinearity
With more than two explanatory variables,
perfect multicollinearity means that one variable
can be expressed as an exact linear function of
one or more, or even all, of the other variables.
So, if we have 5 explanatory variables we have:
δ1X1+δ2X2+δ3X3+δ4X4+δ5X5=0
with at least one δ different from zero.
An application that helps to understand this
situation is the dummy variable trap (explain on board).
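
The slides leave the dummy variable trap for the board. One common way to illustrate it, sketched here with made-up data, is a regression that includes an intercept together with a dummy for every category. The `male` and `female` columns below are hypothetical and only serve the illustration.

```python
# Hypothetical sample of six people (illustrative data, not from the slides).
male = [1, 0, 1, 1, 0, 0]
female = [0, 1, 0, 0, 1, 1]
const = [1] * 6  # the intercept column of the regression

# The dummies sum to the intercept column in every row, so
# 1*const - 1*male - 1*female = 0 is a non-trivial exact linear relationship.
combo = [c - m - f for c, m, f in zip(const, male, female)]

print(combo)  # [0, 0, 0, 0, 0, 0]
```

Dropping either the intercept or one of the dummies removes the exact linear relationship.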

Consequences of Perfect Multicollinearity

• Under perfect multicollinearity, the OLS
estimators simply do not exist (prove on board).
• If you try to estimate an equation in EViews
and your equation specification suffers from
perfect multicollinearity, EViews will not give
you results but will give you an error message
mentioning multicollinearity.
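
The proof is left for the board, but the numerical side can be sketched in a few lines. OLS needs the inverse of X'X, and the Python below, using the X2 and X3 sample values from the slides plus a constant, shows that the determinant of X'X is exactly zero. The helper names `cross_product` and `det3` are illustrative.

```python
# Sample values from the slides, with a column of ones for the intercept.
X2 = [1, 2, 3, 4, 5, 6]
X3 = [2, 4, 6, 8, 10, 12]
rows = [[1, a, b] for a, b in zip(X2, X3)]


def cross_product(rows):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


XtX = cross_product(rows)
determinant = det3(XtX)

print(XtX)          # [[6, 21, 42], [21, 91, 182], [42, 182, 364]]
print(determinant)  # 0
```

A zero determinant means X'X has no inverse, so the normal equations do not have a unique solution.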

Imperfect Multicollinearity
• Imperfect multicollinearity (or near
multicollinearity) exists when the explanatory
variables in an equation are correlated, but this
correlation is less than perfect.
• This can be expressed as:
X3=X2+v
where v is a random variable that can be viewed
as the ‘error’ in the exact linear relationship.
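
A small simulation can make this concrete. The sketch below assumes the relationship takes the form X3 = X2 + v with v drawn from a normal distribution; the sample size, seed and standard deviation of v are arbitrary choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # arbitrary seed, only for reproducibility
X2 = [float(i) for i in range(1, 21)]
v = [random.gauss(0, 0.5) for _ in X2]  # the 'error' in the linear relationship
X3 = [a + e for a, e in zip(X2, v)]

r = pearson(X2, X3)
print(round(r, 4))  # high, but strictly below 1
```

Because v is not identically zero, the correlation is less than perfect and OLS can still be computed.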

Consequences of Imperfect Multicollinearity

• In cases of imperfect multicollinearity the OLS
estimators can be obtained and they are also
BLUE (best linear unbiased estimators).
• However, although they remain linear unbiased
estimators with the minimum variance property,
the OLS variances are often larger than those
obtained in the absence of multicollinearity.

Consequences of Imperfect Multicollinearity

To explain this, consider the expressions that give the
variances of the partial slope estimators for X2 and X3:

$$\operatorname{var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum (X_2 - \bar{X}_2)^2 \,(1 - r^2)}$$

$$\operatorname{var}(\hat{\beta}_3) = \frac{\sigma^2}{\sum (X_3 - \bar{X}_3)^2 \,(1 - r^2)}$$

where r² is the square of the sample correlation
coefficient between X2 and X3.

Consequences of Imperfect Multicollinearity

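Extending this to more than two explanatory variables,
we have:

$$\operatorname{var}(\hat{\beta}_j) = \frac{\sigma^2}{\sum (X_j - \bar{X}_j)^2 \,(1 - R_j^2)} = \frac{\sigma^2}{\sum (X_j - \bar{X}_j)^2} \cdot \frac{1}{1 - R_j^2}$$

where R²j is the R² from the auxiliary regression of Xj
on the other explanatory variables. The second factor
is therefore what we call the Variance Inflation
Factor (VIF):

$$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}$$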

The Variance Inflation Factor

R²j      VIFj
0        1
0.5      2
0.8      5
0.9      10
0.95     20
0.975    40
0.99     100
0.995    200
0.999    1000
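
The table values follow from VIFj = 1/(1 − R²j). A minimal Python sketch that regenerates them is given below; the function name `vif` is an illustrative choice.

```python
def vif(r2_j):
    """Variance inflation factor for a given auxiliary-regression R-squared."""
    return 1.0 / (1.0 - r2_j)


# R-squared values for which the VIF is tabulated.
r2_values = [0, 0.5, 0.8, 0.9, 0.95, 0.975, 0.99, 0.995, 0.999]

for r2 in r2_values:
    print(f"{r2:<6} {vif(r2):8.0f}")
```

The VIF grows slowly at first and then very quickly as R²j approaches 1.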

The Variance Inflation Factor

• VIF values that exceed 10 are generally viewed
as evidence of the existence of problematic
multicollinearity.
• This happens for R²j > 0.9 (explain auxiliary reg).
• So large standard errors will lead to wide
confidence intervals.
• Also, we might have t-stats that are totally

Consequences of
Imperfect Multicollinearity (Again)
In conclusion, when imperfect multicollinearity is present:
(a) OLS estimates may be imprecise because of large
standard errors.
(b) Affected coefficients may fail to attain statistical significance
due to low t-stats.
(c) Sign reversal might occur.
(d) Adding or deleting a few observations may result in
substantial changes in the estimated coefficients.

Detecting Multicollinearity
• The easiest way to measure the extent of
multicollinearity is simply to look at the matrix of
correlations between the individual variables.
• With more than two explanatory variables we run
auxiliary regressions: each explanatory variable is
regressed on the others. If a near-linear
dependency exists, the auxiliary regression will
display a small equation standard error, a large R²
and a statistically significant F-value.
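
The auxiliary-regression idea can be sketched in plain Python as follows. Each variable is regressed on a constant and the remaining variables, and the resulting R² is turned into a VIF. The data and the helper names (`solve`, `r_squared`, `vifs`) are hypothetical; in practice an econometrics package would do this step.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(y, regressors):
    """R-squared from an OLS regression of y on a constant and the regressors."""
    n = len(y)
    X = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xij for bj, xij in zip(beta, row)) for row in X]
    ybar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vifs(columns):
    """VIF of each column, from its auxiliary regression on the other columns."""
    result = []
    for j, xj in enumerate(columns):
        others = [c for i, c in enumerate(columns) if i != j]
        result.append(1.0 / (1.0 - r_squared(xj, others)))
    return result


# Hypothetical data: x2 is roughly 2*x1, x3 is unrelated to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
x3 = [5.0, 3.0, 8.0, 1.0, 9.0, 2.0, 7.0, 4.0]

vif_values = vifs([x1, x2, x3])
print([round(v, 2) for v in vif_values])
```

In this made-up sample the first two variables show very large VIFs while the third stays close to 1, which is the pattern the auxiliary regressions are meant to reveal.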

Resolving Multicollinearity
• Approaches such as ridge regression or the
method of principal components are available, but
these usually bring more problems than they solve.

• Some econometricians argue that if the model
is otherwise OK, just ignore it. Note that you
will always have some degree of
multicollinearity, especially in time series data.

Resolving Multicollinearity
• The easiest ways to “cure” the problems are:
(a) drop one of the collinear variables
(b) transform the highly correlated variables into
a ratio
(c) go out and collect more data, e.g. a longer run
of data or a switch to a higher frequency

We have quarterly data for
Imports (IMP)
Gross Domestic Product (GDP)
Consumer Price Index (CPI) and
Producer Price Index (PPI)

Correlation Matrix


       IMP     GDP     CPI     PPI
IMP    1       0.979   0.916   0.883
GDP    0.979   1       0.910   0.899
CPI    0.916   0.910   1       0.981
PPI    0.883   0.899   0.981   1
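
A simple way to read this matrix is to scan the pairs of explanatory variables for correlations close to 1. The Python sketch below does that with a cut-off of 0.9, which is an illustrative assumption rather than a rule from the slides, and also evaluates the factor 1/(1 − r²) from the two-variable variance formula for the CPI and PPI pair.

```python
# Correlations between the explanatory variables, copied from the matrix.
# The GDP-PPI entry uses the value printed in the GDP row.
corr = {
    ("GDP", "CPI"): 0.910,
    ("GDP", "PPI"): 0.899,
    ("CPI", "PPI"): 0.981,
}

THRESHOLD = 0.9  # illustrative cut-off (an assumption, not from the slides)

flagged = [pair for pair, r in corr.items() if abs(r) > THRESHOLD]
print(flagged)  # [('GDP', 'CPI'), ('CPI', 'PPI')]

# Variance factor 1 / (1 - r^2) implied by the CPI-PPI correlation alone.
r_cpi_ppi = corr[("CPI", "PPI")]
factor = 1.0 / (1.0 - r_cpi_ppi ** 2)
print(round(factor, 2))
```

The CPI and PPI pair stands out, which is why the examples that follow compare regressions with one or both price indices.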

Examples – only CPI

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.631870     0.344368      1.834867   0.0761
LOG(GDP)        1.926936     0.168856      11.41172   0.0000
LOG(CPI)        0.274276     0.137400      1.996179   0.0548

R-squared             0.966057    Mean dependent var       10.81363
Adjusted R-squared    0.963867    S.D. dependent var       0.138427
S.E. of regression    0.026313    Akaike info criterion   -4.353390
Sum squared resid     0.021464    Schwarz criterion       -4.218711
Log likelihood        77.00763    F-statistic              441.1430
Durbin-Watson stat    0.475694    Prob(F-statistic)        0.000000

Examples – CPI with PPI

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.213906     0.358425      0.596795   0.5551
LOG(GDP)        1.969713     0.156800      12.56198   0.0000
LOG(CPI)        1.025473     0.323427      3.170645   0.0035
LOG(PPI)       -0.770644     0.305218     -2.524894   0.0171

R-squared             0.972006    Mean dependent var       10.81363
Adjusted R-squared    0.969206    S.D. dependent var       0.138427
S.E. of regression    0.024291    Akaike info criterion   -4.487253
Sum squared resid     0.017702    Schwarz criterion       -4.307682
Log likelihood        80.28331    F-statistic              347.2135
Durbin-Watson stat    0.608648    Prob(F-statistic)        0.000000


Examples – only PPI

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C               0.685704     0.370644      1.850031   0.0739
LOG(GDP)        2.093849     0.172585      12.13228   0.0000
LOG(PPI)        0.119566     0.136062      0.878764   0.3863

R-squared             0.962625    Mean dependent var       10.81363
Adjusted R-squared    0.960213    S.D. dependent var       0.138427
S.E. of regression    0.027612    Akaike info criterion   -4.257071
Sum squared resid     0.023634    Schwarz criterion       -4.122392
Log likelihood        75.37021    F-statistic              399.2113
Durbin-Watson stat    0.448237    Prob(F-statistic)        0.000000
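
To see what changes when both price indices enter together, the Python sketch below compares the coefficients and standard errors reported in the three outputs (only CPI, CPI with PPI, only PPI). All figures are copied from those outputs; the variable names are illustrative.

```python
# Coefficient and standard error of each price index when it enters alone.
cpi_only = {"coef": 0.274276, "se": 0.137400}
ppi_only = {"coef": 0.119566, "se": 0.136062}

# The same quantities when LOG(CPI) and LOG(PPI) enter together.
joint = {
    "LOG(CPI)": {"coef": 1.025473, "se": 0.323427},
    "LOG(PPI)": {"coef": -0.770644, "se": 0.305218},
}

# How much larger the standard errors become in the joint regression.
se_ratio_cpi = joint["LOG(CPI)"]["se"] / cpi_only["se"]
se_ratio_ppi = joint["LOG(PPI)"]["se"] / ppi_only["se"]

# A sign reversal shows up as a negative product of the two coefficients.
sign_flip_ppi = ppi_only["coef"] * joint["LOG(PPI)"]["coef"] < 0

print(round(se_ratio_cpi, 2), round(se_ratio_ppi, 2))  # 2.35 2.24
print(sign_flip_ppi)  # True
```

Both standard errors more than double, and the LOG(PPI) coefficient changes sign from positive to negative once LOG(CPI) is included.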

Examples – the auxiliary regression

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C              -0.542357     0.187073     -2.899177   0.0068
LOG(CPI)        0.974766     0.074641      13.05946   0.0000
LOG(GDP)        0.055509     0.091728      0.605140   0.5495

R-squared             0.967843    Mean dependent var       4.552744
Adjusted R-squared    0.965768    S.D. dependent var       0.077259
S.E. of regression    0.014294    Akaike info criterion   -5.573818
Sum squared resid     0.006334    Schwarz criterion       -5.439139
Log likelihood        97.75490    F-statistic              466.5105
Durbin-Watson stat    0.332711    Prob(F-statistic)        0.000000
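
The R-squared of this auxiliary regression can be converted into a VIF with VIF = 1/(1 − R²) and compared with the usual threshold of 10. A minimal Python sketch, using the R-squared reported in the output above:

```python
# R-squared reported for the auxiliary regression.
r2_aux = 0.967843

# Variance inflation factor implied by that R-squared.
vif_aux = 1.0 / (1.0 - r2_aux)

# Rule of thumb from the slides: VIF above 10 signals problematic multicollinearity.
problematic = vif_aux > 10

print(round(vif_aux, 1))  # 31.1
print(problematic)        # True
```

The implied VIF is roughly three times the threshold, in line with the small standard error of regression and the large F-statistic in the same output.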
