
Addis Ababa University, School of Commerce
Department of Economics

Introduction to Econometrics
December 8, 2023

Chapter 3
Multiple Linear Regression



Introduction

▶ Simple linear regression (SLR) is often unrealistic, since relationships between economic variables typically involve more than one explanatory variable.
▶ Thus, to be more realistic, we include additional explanatory variables in the regression model ⇒ the subject of this chapter.
▶ Multiple linear regression (MLR) is specified as follows:
    Yi = β0 + β1X1i + β2X2i + ... + βKXKi + ui
  where β0 is the intercept and β1, ..., βK are the slope parameters.
▶ The sample counterpart is expressed as:
    Yi = β̂0 + β̂1X1i + β̂2X2i + ... + β̂KXKi + ei
  where β̂0 is the intercept estimator and β̂1, ..., β̂K are the slope estimators.


▶ What changes as we move from simple to multiple regression?
  1. Potentially more explanatory power, since more variables are included;
  2. The ability to control for other variables (and to confront the interaction of explanatory variables: correlations and multicollinearity);
  3. Visualization becomes harder: instead of a line in two dimensions, we fit a (hyper)plane in three or more dimensions;
  4. The R² is no longer simply the square of the correlation coefficient between Y and X.


Interpreting slope coefficients:

▶ Consider the following multiple linear regression given by the population regression equation (PRE), with k = 3:
    Yi = β0 + β1X1i + β2X2i + β3X3i + ui
▶ The population regression function (PRF) corresponding to this equation is:
    E(Yi|X1i, X2i, X3i) = β0 + β1X1i + β2X2i + β3X3i
▶ The slope coefficient βj is the marginal effect of the corresponding explanatory variable Xj on the conditional mean of Y.
▶ Formally, the slope coefficients βj, j = 1, 2, 3, are the partial derivatives of the PRF with respect to the explanatory variables Xj:
    ∂E(Yi|Xi)/∂Xji = ∂E(Yi|X1i, X2i, X3i)/∂Xji = βj,  j = 1, 2, 3


▶ For example, for j = 1:
    ∂E(Yi|X1i, X2i, X3i)/∂X1i = ∂(β0 + β1X1i + β2X2i + β3X3i)/∂X1i = β1
  Interpretation: A partial derivative isolates the marginal effect on the conditional mean of Y of small variations in one of the explanatory variables, while holding constant the values of the other explanatory variables in the PRF.
  ✓ β1 = (ΔE(Y|X1, X2, X3)/ΔX1)|ΔX2=0, ΔX3=0 = ∂E(Y|X1, X2, X3)/∂X1
▶ Thus, β1 is the partial marginal effect of X1 on the conditional mean of Y, holding constant the values of the other regressors X2 and X3.
▶ Including X2 and X3 in the regression function allows us to estimate the partial marginal effect of X1 on E(Y|X1, X2, X3) while
  1. holding constant the values of X2 and X3;
  2. controlling for the effects on Y of X2 and X3;
  3. conditioning on X2 and X3.
  A numerical illustration is sketched below.
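To make the partial-effect interpretation concrete, here is a minimal simulation in Python (all data and parameter values are synthetic, invented for illustration): with E(u|X) = 0, the OLS coefficient on X1 in the multiple regression recovers β1, the effect of X1 holding X2 and X3 fixed, even though X1 and X2 are correlated.

```python
import numpy as np

# Synthetic data: Y depends on X1, X2, X3; X2 is correlated with X1.
rng = np.random.default_rng(0)
n = 5_000
X1 = rng.normal(size=n)
X2 = 0.5 * X1 + rng.normal(size=n)        # correlated with X1
X3 = rng.normal(size=n)
u = rng.normal(size=n)                    # E(u | X) = 0 by construction
beta = np.array([2.0, 1.5, -0.8, 0.3])    # beta0..beta3 (invented values)
Y = beta[0] + beta[1] * X1 + beta[2] * X2 + beta[3] * X3 + u

# OLS via the normal equations: (X'X) beta_hat = X'Y
X = np.column_stack([np.ones(n), X1, X2, X3])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)                           # close to [2.0, 1.5, -0.8, 0.3]
```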


Assumptions of the Classical Linear Regression Model (CLRM)

▶ We divide the CLRM assumptions into three groups:
  1. Assumptions respecting the formulation of the PRF or SRF;
  2. Assumptions respecting the statistical properties of the random error term and the dependent variable;
  3. Assumptions respecting the properties of the sample data.

Assumption 1 (A1): The population regression equation (PRE) takes the form
    Yi = β0 + β1X1i + β2X2i + ... + βkXki + ui = β0 + Σj βjXji + ui   (sum over j = 1, ..., k)

A1 incorporates three distinct assumptions:
  1. Additive random error term, ui ⇒ ∂Yi/∂ui = 1 ∀i
  2. Linearity in parameters (linearity in coefficients)
  3. Parameter (coefficient) constancy ⇒ βji = βj ∀i


Assumption 2 (A2): Zero Conditional Mean Error

    E(ui|X1i, X2i, ..., Xki) = 0

▶ Implications of A2:
  1. E(ui|X1i, X2i, ..., Xki) = 0 ⇒ E(ui) = 0 ∀i. This follows from the law of iterated expectations, which says that E[E(ui|Xs)] = E(ui). Since E(ui|Xs) = 0, A2 implies E(ui) = E[E(ui|Xs)] = E[0] = 0.
     If the conditional mean of u for each and every population value of the Xs equals zero, then the mean of these zero conditional means must also be zero.
  2. Orthogonality condition:
     E(ui|X1i, X2i, ..., Xki) = 0 ⇒ cov(Xji, ui) = E(Xjiui) = 0 ∀i, j = 1, ..., k


The equality cov(Xji, ui) = E(Xjiui) = 0 can be shown as follows:

    cov(Xji, ui) = E[(Xji − E(Xji))(ui − E(ui))]   by definition
                 = E[(Xji − E(Xji))ui]             since E(ui) = 0
                 = E(Xjiui) − E(Xji)E(ui)          since E(Xji) is a constant
                 = E(Xjiui) = 0                    since E(ui) = 0, and by A2

  3. ui is mean-independent of every regressor, ruling out both linear and (in the conditional mean) nonlinear association between ui and any of the k regressors Xj (j = 1, ..., k); in particular:
     ⇒ ρ(Xji, ui) = cov(Xji, ui)/√(var(Xji)var(ui)) = cov(Xji, ui)/(std(Xji)std(ui)) = 0
  4. The conditional mean of the population Yi values corresponding to given values Xji of the regressors Xj (j = 1, ..., k) equals the population regression function (PRF):
     E(ui|Xji) = 0 ⇒ E(Yi|Xji) = β0 + β1X1i + β2X2i + ... + βkXki = β0 + Σj βjXji


Violation of zero covariance

▶ The random error term u represents all the unobservable, unmeasured and unknown variables, other than the regressors Xj (j = 1, ..., k), that determine the population values of the dependent variable Y.
▶ Anything that causes the random error u to be correlated with one or more of the regressors Xj (j = 1, ..., k) will violate assumption A2.
▶ If Xj and u are correlated, then E(ui|Xs) must depend on Xj and so cannot be zero.
▶ Common causes of correlation or dependence between the Xj and ui (a small simulation of cause 2 is sketched below):
  1. Incorrect specification of the functional form of the relationship between Y and the Xj, j = 1, ..., k.
  2. Omission of relevant variables that are correlated with one or more of the included regressors Xj, j = 1, ..., k.
  3. Measurement errors in the regressors Xj, j = 1, ..., k.
  4. Joint determination of one or more Xj and Y.
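Cause 2 (omitted variables) is easy to see in a simulation. A minimal Python sketch, with invented synthetic data: omitting X2, which is correlated with X1, pushes X2's effect into the error term, and the short-regression slope on X1 is biased away from its true value of 2.0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
X1 = rng.normal(size=n)
X2 = 0.8 * X1 + rng.normal(size=n)      # relevant regressor, correlated with X1
Y = 1.0 + 2.0 * X1 + 1.5 * X2 + rng.normal(size=n)

# Correct model: include both regressors.
X_full = np.column_stack([np.ones(n), X1, X2])
print(np.linalg.lstsq(X_full, Y, rcond=None)[0])   # ~ [1.0, 2.0, 1.5]

# Misspecified model: omit X2, so its effect loads onto X1.
X_short = np.column_stack([np.ones(n), X1])
print(np.linalg.lstsq(X_short, Y, rcond=None)[0])  # slope ~ 2.0 + 1.5*0.8 = 3.2
```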


Assumption 3 (A3): Constant Error Variances / Homoskedastic Errors

▶ The conditional variances of the random error terms ui are identical for all observations, i.e., for all sets of regressor values Xji, j = 1, ..., k:
    var(ui|Xji) = E(ui²|Xji) = σ² > 0 ∀i
  Note: var(ui|Xji) = E([ui − E(ui|Xji)]²|Xji) = E(ui²|Xji) = σ² > 0, using E(ui|Xji) = 0.

Implications of A3:
  1. The unconditional variance of the random error u is also equal to σ²:
     var(ui) = E(ui − E(ui))² = E(ui²) = σ². This follows from the law of iterated expectations.
  2. The conditional variance of the regressand Yi corresponding to a given set of regressor values Xji, j = 1, ..., k, equals the conditional error variance σ².


▶ This can be shown as follows:

    var(Yi|Xji) = E([Yi − E(Yi|Xji)]²|Xji)        by definition
                = E([Yi − β0 − Σj βjXji]²|Xji)    since E(Yi|Xji) = β0 + Σj βjXji
                = E(ui²|Xji)                      since ui = Yi − β0 − Σj βjXji
                = σ²                              since E(ui²|Xji) = σ² by A3

▶ Thus, var(ui|Xji) = var(Yi|Xji) = σ².
▶ A3 says that the variance of the random errors for any particular set of regressor values Xji is the same as the variance of the random errors for any other set of regressor values Xjs, for all Xjs ≠ Xji:
    var(ui|Xji) = var(us|Xjs) = σ² > 0 for all Xji ≠ Xjs
▶ The conditional distributions of the population Y values around the PRF have the same constant variance σ² for all sets of regressor values:
    var(Yi|Xji) = var(Ys|Xjs) = σ² > 0 for all Xji ≠ Xjs


Assumption 4 (A4): Zero Error Covariances

▶ Two distinct random error terms ui and us (i ≠ s), corresponding to two different sets of regressor values Xi ≠ Xs, are not correlated:
    cov(ui, us|Xi, Xs) = E([(ui − E(ui|Xi))(us − E(us|Xs))]|Xi, Xs) = E(uius|Xi, Xs) = 0
  ✓ All pairs of error terms corresponding to different sets of regressor values have zero covariance.

Implications of A4:
▶ The conditional covariance of any two distinct values of the regressand, say Yi and Ys where i ≠ s, is equal to zero:
    cov(ui, us|Xi, Xs) = 0 ∀i ≠ s ⇒ cov(Yi, Ys|Xi, Xs) = E(uius|Xi, Xs) = 0 ∀i ≠ s


▶ This can easily be shown as follows:

    cov(Yi, Ys|Xi, Xs) = E([(Yi − E(Yi|Xi))(Ys − E(Ys|Xs))]|Xi, Xs)
                       = E(uius|Xi, Xs) = 0

  since Yi − E(Yi|Xi) = Yi − β0 − Σj βjXji = ui by assumption A1, and
  Ys − E(Ys|Xs) = Ys − β0 − Σj βjXjs = us by assumption A1.
▶ Therefore, cov(Yi, Ys|Xi, Xs) = E(uius|Xi, Xs) = 0 by assumption A4.


Assumption 5 (A5): Random Sampling (Independent Random Sampling)

▶ The sample data consist of N randomly selected observations on the regressand Y and the regressors Xj (j = 1, ..., k), the observable variables.

Implications of A5:
  1. The error terms ui and us are statistically independent, and hence have zero covariance, for any two observations i and s:
     Random sampling ⇒ cov(ui, us|Xi, Xs) = cov(ui, us) = 0 ∀i ≠ s
  2. The dependent variable values Yi and Ys are statistically independent, and hence have zero covariance, for any two observations i and s:
     Random sampling ⇒ cov(Yi, Ys|Xi, Xs) = cov(Yi, Ys) = 0 ∀i ≠ s
▶ A5 is often appropriate for cross-sectional regression models, but is hardly ever appropriate for time-series regression models.


Assumption 6 (A6): The number of sample observations N is greater than the number of unknown parameters K.

▶ Unless this assumption is satisfied, it is not possible to compute, from a given sample of N observations, estimates of all the unknown parameters in the model.


Assumption 7 (A7): Non-constant Regressors

▶ The sample values Xji of each regressor Xj (j = 1, ..., k) in a given sample (and hence in the population) are not all equal to a constant:
    Xji ≠ cj ∀i = 1, ..., N, where the cj are constants (j = 1, ..., k)
▶ This implies that the sample variances of all k non-constant regressors Xj (j = 1, ..., k) must be finite positive numbers for any sample size N; i.e.,
    sample variance of Xj: var(Xji) = Σi(Xji − X̄j)²/(N − 1) > 0
▶ Each non-constant regressor Xj (j = 1, ..., k) takes at least two different values in any given sample.
▶ To calculate the effect of changes in Xj on Y, the sample values Xji of the regressor Xj must vary across observations in any given sample.


Assumption 8 (A8): No Perfect Multicollinearity

▶ The sample values of the regressors Xj (j = 1, ..., k) in a multiple regression model do not exhibit perfect or exact multicollinearity.
▶ The absence of perfect multicollinearity means that there exists no exact linear relationship among the sample values of the non-constant regressors Xj (j = 1, ..., k).
▶ An exact linear relationship exists among the sample values of the non-constant regressors if they satisfy a linear relationship of the form
    λ0 + λ1X1i + λ2X2i + ... + λkXki = 0 ∀i = 1, 2, ..., N
  for some constants λj that are not all zero.
▶ Each non-constant regressor Xj (j = 1, ..., k) must exhibit some independent linear variation in the sample data.
▶ Otherwise, it is not possible to estimate the separate linear effect of each and every non-constant regressor on the regressand Y.


▶ Suppose that Yi = β0 + β1X1i + β2X2i + ui.
▶ Suppose also that X1i = 3X2i. Then:
    Yi = β0 + β1(3X2i) + β2X2i + ui
    Yi = β0 + 3β1X2i + β2X2i + ui
    Yi = β0 + (3β1 + β2)X2i + ui
    Yi = β0 + α2X2i + ui,  where α2 = 3β1 + β2
▶ It is possible to estimate from the sample data the regression coefficients β0 and α2.
▶ But from the estimate of α2 it is not possible to recover estimates of the coefficients β1 and β2.
  ✓ Reason: the equation α2 = 3β1 + β2 is one equation in two unknowns, β1 and β2.
▶ Result: it is not possible to compute from the sample data estimates of both β1 and β2, the separate linear effects of X1i and X2i on the regressand Yi. (A numerical illustration follows.)
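The same point can be seen numerically. In this minimal Python sketch (synthetic data), imposing X1 = 3X2 makes the cross-product matrix X′X rank-deficient, so the OLS normal equations have no unique solution:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X2 = rng.normal(size=n)
X1 = 3 * X2                                # exact linear dependence
X = np.column_stack([np.ones(n), X1, X2])

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))          # 2, not 3: X'X is rank-deficient
print(np.linalg.cond(XtX))                 # enormous condition number
```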
The Problem of Estimation

3.2 Estimation: The Method of OLS

▶ We specify the k-explanatory-variable SRF as follows:
    Yi = β̂0 + β̂1X1i + β̂2X2i + ... + β̂kXki + ei
  or
    Yi = Ŷi + ei,  where Ŷi = β̂0 + β̂1X1i + β̂2X2i + ... + β̂kXki
▶ The goal is to find parameter estimates by minimizing the sum of squared errors, as was done with the simple regression model:
    minimize Σei² = Σ(Yi − Ŷi)² = Σ(Yi − β̂0 − β̂1X1i − β̂2X2i − ... − β̂kXki)²
  with respect to β̂0 and β̂j, j = 1, ..., k (sums over i = 1, ..., n).


▶ Consider the case with only two explanatory variables, X1 and X2:
    minimize Σei² = Σ(Yi − β̂0 − β̂1X1i − β̂2X2i)²
▶ We do this by taking the partial derivatives with respect to the three unknown parameters β̂0, β̂1 and β̂2, equating each to zero, and solving.
▶ The normal equations then become:
    nβ̂0 + β̂1ΣX1i + β̂2ΣX2i = ΣYi
    β̂0ΣX1i + β̂1ΣX1i² + β̂2ΣX1iX2i = ΣX1iYi
    β̂0ΣX2i + β̂1ΣX1iX2i + β̂2ΣX2i² = ΣX2iYi
▶ This system can easily be solved using Cramer's rule or matrix algebra to find the formulas for the parameter estimates.


▶ An alternative approach is to begin by expressing all the data as deviations from the sample means.
▶ Our three-variable regression model is:
    Yi = β̂0 + β̂1X1i + β̂2X2i + ei
▶ Averaging over the sample observations gives:
    Ȳ = β̂0 + β̂1X̄1 + β̂2X̄2   (since ē = 0)
▶ Subtracting the second equation from the first gives the deviation form:
    yi = β̂1x1i + β̂2x2i + ei
  where yi = Yi − Ȳ, x1i = X1i − X̄1 and x2i = X2i − X̄2.
▶ Note that the intercept β̂0 disappears from the deviation form of the equation, but it may be recovered from:
    β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2


▶ For β̂1 and β̂2, we minimize the sum of squared residuals of the deviation-form equation:
    Σei² = Σ(yi − ŷi)² = Σ(yi − β̂1x1i − β̂2x2i)²
▶ We need to solve:
    ∂Σei²/∂β̂1 = ∂Σ(yi − β̂1x1i − β̂2x2i)²/∂β̂1 = 0
  and
    ∂Σei²/∂β̂2 = ∂Σ(yi − β̂1x1i − β̂2x2i)²/∂β̂2 = 0
▶ which give, respectively, the following normal equations:
    Σx1iyi = β̂1Σx1i² + β̂2Σx1ix2i
    Σx2iyi = β̂1Σx1ix2i + β̂2Σx2i²


▶ We can reorganize these two normal equations in matrix form as follows:

    [ Σx1i²     Σx1ix2i ] [ β̂1 ]   [ Σx1iyi ]
    [ Σx1ix2i   Σx2i²   ] [ β̂2 ] = [ Σx2iyi ]

▶ Using Cramer's rule, we have the following results (a numerical check is sketched below):

    β̂1 = [(Σx1iyi)(Σx2i²) − (Σx2iyi)(Σx1ix2i)] / [(Σx1i²)(Σx2i²) − (Σx1ix2i)²]

  and

    β̂2 = [(Σx1i²)(Σx2iyi) − (Σx1ix2i)(Σx1iyi)] / [(Σx1i²)(Σx2i²) − (Σx1ix2i)²]
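A minimal Python check of these formulas, on invented synthetic data: the deviation-form Cramer's-rule expressions reproduce the estimates from a direct least-squares solve.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 4.0 + 1.2 * X1 - 0.7 * X2 + rng.normal(size=n)

# Deviation form: x1, x2, y are deviations from the sample means.
x1, x2, y = X1 - X1.mean(), X2 - X2.mean(), Y - Y.mean()

den = (x1 @ x1) * (x2 @ x2) - (x1 @ x2) ** 2
b1 = ((x1 @ y) * (x2 @ x2) - (x2 @ y) * (x1 @ x2)) / den
b2 = ((x1 @ x1) * (x2 @ y) - (x1 @ x2) * (x1 @ y)) / den
b0 = Y.mean() - b1 * X1.mean() - b2 * X2.mean()    # recover the intercept

X = np.column_stack([np.ones(n), X1, X2])
print(b0, b1, b2)
print(np.linalg.lstsq(X, Y, rcond=None)[0])        # same three values
```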


The Variances of OLS Estimators

    var(β̂0) = [1/n + (X̄1²Σx2i² + X̄2²Σx1i² − 2X̄1X̄2Σx1ix2i) / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

    var(β̂1) = [Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

    var(β̂2) = [Σx1i² / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

We estimate σ² with σ̂², where
    σ̂² = Σei² / (n − k)
and k is the number of estimated parameters (here k = 3: the intercept and two slopes).
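A short Python sketch of these variance formulas (again on invented synthetic data, with k = 3 estimated parameters, so df = n − 3):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 4.0 + 1.2 * X1 - 0.7 * X2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b                                   # residuals
sigma2_hat = (e @ e) / (n - 3)                  # sigma-hat^2 = RSS / (n - k)

x1, x2 = X1 - X1.mean(), X2 - X2.mean()
den = (x1 @ x1) * (x2 @ x2) - (x1 @ x2) ** 2
print(np.sqrt((x2 @ x2) / den * sigma2_hat))    # se(beta1_hat)
print(np.sqrt((x1 @ x1) / den * sigma2_hat))    # se(beta2_hat)
```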


Goodness of fit in MLR:

▶ From the SRE with k = 2 regressors, we have:
    Yi = β̂0 + β̂1X1i + β̂2X2i + ei
▶ Its sample mean can be written as:
    Ȳ = β̂0 + β̂1X̄1 + β̂2X̄2
▶ Subtracting this from the SRE gives us the deviation form of the SRF:
    yi = β̂1x1i + β̂2x2i + ei ⇒ yi = ŷi + ei
▶ Solving for the residual, we have:
    ei = yi − β̂1x1i − β̂2x2i
▶ Multiplying through by ei:
    ei² = (yi − β̂1x1i − β̂2x2i)ei = yiei − β̂1x1iei − β̂2x2iei
▶ Summing over all observations, we get:

    Σei² = Σyiei − β̂1Σx1iei − β̂2Σx2iei
    Σei² = Σyiei   since Σx1iei = Σx2iei = 0 (from the OLS normal equations)
▶ Again, substituting ei = yi − β̂1x1i − β̂2x2i into this last equation, we have:
    Σei² = Σyi(yi − β̂1x1i − β̂2x2i)
    Σei² = Σyi² − β̂1Σx1iyi − β̂2Σx2iyi
    ⇒ Σyi² = β̂1Σx1iyi + β̂2Σx2iyi + Σei²
    ⇒ TSS = ESS + RSS
  where Σyi² = TSS; β̂1Σx1iyi + β̂2Σx2iyi = Σŷi² = ESS; Σei² = RSS.
▶ Then,
    R² = ESS/TSS = Σŷi²/Σyi² = (β̂1Σx1iyi + β̂2Σx2iyi)/Σyi²
  or
    R² = 1 − RSS/TSS = 1 − Σei²/Σyi²
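A minimal Python check (synthetic data) that the two expressions for R² above agree:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 4.0 + 1.2 * X1 - 0.7 * X2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
e = Y - X @ b                       # residuals
y = Y - Y.mean()                    # deviations: TSS = y'y
tss, rss = y @ y, e @ e

print(1 - rss / tss)                # R^2 = 1 - RSS/TSS
yhat = X @ b - Y.mean()             # fitted values in deviation form
print((yhat @ yhat) / tss)          # R^2 = ESS/TSS, the same number
```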

▶ The coefficient of multiple determination (R²) measures the proportion of the variation in the dependent variable explained by (the set of all the regressors in) the model.
▶ R² can be used to compare the goodness-of-fit of alternative regression equations, but only if the regression models satisfy two conditions:
  1. The models must have the same dependent variable.
     Reason: TSS, ESS and RSS depend on the units in which the regressand (Y) is measured. For instance, the TSS for Y is not the same as the TSS for ln(Y).
  2. The models must have the same number of regressors and parameters (the same value of K + 1).
     Reason: adding a variable to a model never raises the RSS (equivalently, never lowers ESS or R²), even if the new variable is not very relevant.


Adjusted R²

▶ The adjusted R-squared, R̄², attaches a penalty to adding more variables.
▶ It is modified to account for changes/differences in degrees of freedom (df), due to differences in the number of regressors and/or the sample size (n).
▶ If adding a variable raises R̄² for a regression, this is a better indication that it has improved the model than if the addition merely raises R².


    R² = Σŷi²/Σyi² = 1 − Σei²/Σyi²

▶ Dividing RSS and TSS by their respective degrees of freedom (n − k and n − 1, where k is the number of estimated parameters, including the intercept) gives the adjusted R²:

    R̄² = 1 − [Σei²/(n − k)] / [Σyi²/(n − 1)] = 1 − (Σei²/Σyi²) · (n − 1)/(n − k)

    ⇒ R̄² = 1 − (1 − R²)(n − 1)/(n − k)
    ⇒ 1 − R̄² = (1 − R²)(n − 1)/(n − k)

▶ As long as k > 1, (n − 1)/(n − k) > 1, so 1 − R̄² > 1 − R² ⇒ R̄² < R².
▶ In general, R̄² ≤ R², and as n grows large relative to k, R̄² → R².
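As a quick worked check, using the figures from the example later in this chapter (n = 10, TSS = 3450, RSS = 364.22, k = 3 estimated parameters):

    R² = 1 − 364.22/3450 ≈ 0.8944
    R̄² = 1 − (1 − 0.8944) × (10 − 1)/(10 − 3) = 1 − 0.1056 × (9/7) ≈ 0.8643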


Relationship between R̄² and R²

  1. While R² is always non-negative, R̄² can be positive or negative.
  2. R̄² can be used to compare the goodness-of-fit of two or more regression models only if the models have the same regressand.
  3. Including more regressors reduces both RSS and the degrees of freedom; R̄² rises only if the first effect dominates.
  4. Neither R̄² nor R² should be the sole criterion for choosing between/among models. In addition to R̄², one should also:
     - consider the expected signs and values of the coefficients, and
     - look for consistency with economic theory or reasoning (possible explanations).


Coefficient of Partial Determination (r²)

▶ When more than one regressor is included in the regression equation, we may be interested in how much of the variation in the regressand a given regressor explains after controlling for the others.
▶ In the case of k = 2, we compute the coefficients of partial determination as follows (a small sketch follows below):
    r²y2.1 = (R²y.12 − R²y.1) / (1 − R²y.1)
  and
    r²y1.2 = (R²y.12 − R²y.2) / (1 − R²y.2)
▶ R²y.1 and R²y.2 are the coefficients of determination from the SLRs of Y on X1 and of Y on X2, respectively; R²y.12 is the multiple coefficient of determination.
▶ The inclusion of X2 increases the explanatory power of the model by (R²y.12 − R²y.1)Σyi².
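A minimal Python sketch (synthetic data, with a hypothetical helper function r_squared) of the r²y2.1 computation from the three R² values defined above:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + 1.0 * X2 + rng.normal(size=n)

def r_squared(y, *regressors):
    """R^2 from an OLS regression of y on an intercept and the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    yd = y - y.mean()
    return 1 - (e @ e) / (yd @ yd)

r2_y1 = r_squared(Y, X1)                  # R^2 from the SLR of Y on X1
r2_y12 = r_squared(Y, X1, X2)             # R^2 from the MLR of Y on X1 and X2
print((r2_y12 - r2_y1) / (1 - r2_y1))     # r^2_{y2.1}
```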


Example:
▶ From the earlier example we have the following information:
    n = 10; TSS = Σyi² = 3450; ESS = Σŷi² = 3085.78; RSS = Σei² = 364.22;
    β̂0 = 111.692; β̂1 = −7.19; β̂2 = 0.0143

  1. Find the multiple coefficient of determination (R²).
  2. How much of the variation in Y is explained by X1 alone?
  3. What is the contribution of the inclusion of X2 to the explanatory power of the model?
  4. Even after X2 is included, how much of the variation in Y is left unexplained?
  5. Find R̄².
  6. Find r²y2.1.



Inferences

Statistical Inference in Multiple Linear Regression

The normality assumption:
▶ The error term ui, and hence the estimators β̂0 and β̂j, are assumed to be normally distributed:

    ui ∼ N(0, σ²) ⇒ Yi ∼ N(β0 + β1X1i + β2X2i, σ²)

    β̂0 ∼ N(β0, var(β̂0));
        var(β̂0) = [1/n + (X̄1²Σx2i² + X̄2²Σx1i² − 2X̄1X̄2Σx1ix2i) / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

    β̂1 ∼ N(β1, var(β̂1));  var(β̂1) = [Σx2i² / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

    β̂2 ∼ N(β2, var(β̂2));  var(β̂2) = [Σx1i² / (Σx1i²Σx2i² − (Σx1ix2i)²)] σ²

We estimate σ² with σ̂², where σ̂² = Σei² / (n − k).

Confidence Intervals
▶ Given the significance level (the type I error probability) α, the confidence intervals for β0 and βj are given as follows:
    100(1 − α)% two-sided CI for β0: β̂0 ± t(α/2, n−k−1) · se(β̂0)
    100(1 − α)% two-sided CI for βj: β̂j ± t(α/2, n−k−1) · se(β̂j),  j = 1, 2, ..., k
  where n − k − 1 is the degrees of freedom (k slope coefficients plus the intercept).
▶ The conventional significance level is 5%. Thus, the 95% CIs for β0 and βj are given as follows:
    95% two-sided CI for β0: β̂0 ± t(0.025, n−k−1) · se(β̂0)
    95% two-sided CI for βj: β̂j ± t(0.025, n−k−1) · se(β̂j),  j = 1, 2, ..., k


Example: based on the earlier figures
▶ Given that β̂0 = 111.692; β̂1 = −7.19; β̂2 = 0.0143; n = 10;
    var(β̂0) = 484.487; var(β̂1) = 5.71; var(β̂2) = 0.00011
    se(β̂0) = 22.011; se(β̂1) = 2.39; se(β̂2) = 0.0104
▶ Construct 95% CIs for β0, β1 and β2. (A worked sketch follows.)
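A worked sketch of this exercise in Python, assuming df = n − k − 1 = 10 − 2 − 1 = 7:

```python
from scipy import stats

beta_hat = {"b0": 111.692, "b1": -7.19, "b2": 0.0143}
se = {"b0": 22.011, "b1": 2.39, "b2": 0.0104}

t_crit = stats.t.ppf(0.975, df=7)          # two-sided 95% critical value, ~2.365
for name in beta_hat:
    lo = beta_hat[name] - t_crit * se[name]
    hi = beta_hat[name] + t_crit * se[name]
    print(f"{name}: ({lo:.3f}, {hi:.3f})")
# b1: roughly (-12.84, -1.54); this interval excludes zero
```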


Hypothesis Testing
▶ To test the statistical relationship between economic variables in MLR, we use two types of tests:
  1. the t-test: to test individual coefficients;
  2. the F-test: to test more than one coefficient at a time (joint tests).
▶ The t-test is conducted using
    t = (β̂j − βj) / se(β̂j)
  where β̂j is the estimated value and βj is the hypothesized value.
▶ In our earlier example, test the claim that β1 and β2 are zero:
    H0: β1 = 0 vs HA: β1 ≠ 0, and H0: β2 = 0 vs HA: β2 ≠ 0
  (A worked check follows.)
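A worked check, using the earlier figures and the two-sided 5% critical value t(0.025, 7) ≈ 2.365:

    t(β̂1) = (−7.19 − 0)/2.39 ≈ −3.01; since |−3.01| > 2.365, reject H0: β1 = 0
    t(β̂2) = (0.0143 − 0)/0.0104 ≈ 1.38; since 1.38 < 2.365, do not reject H0: β2 = 0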


The F-test
▶ The F-statistic is used to test the joint significance of all the slope coefficients in a multiple linear regression model.
▶ If the unrestricted PRE is given as:
    Yi = β0 + β1X1i + β2X2i + ... + βkXki + ui
▶ The null and alternative hypotheses are:
    H0: βj = 0 for all j = 1, ..., k
    HA: βj ≠ 0 for at least one j = 1, ..., k
▶ The null hypothesis H0 says that all slope coefficients are jointly equal to zero.
▶ The alternative hypothesis HA says that some or all of the slope coefficients are not equal to zero.


▶ The F-statistic for the joint significance test is computed as the ratio of the MSS (mean sum of squares) for the sample regression function to the MSS for the residuals:
    F0 = [ESS/(K − 1)] / [RSS/(N − K)] = [Σŷi²/(K − 1)] / [Σûi²/(N − K)]
  where K is the total number of estimated parameters (including the intercept).
▶ Under the null hypothesis H0: βj = 0 for all j = 1, ..., k, the computed statistic F0 follows the F[K − 1, N − K] distribution:
    F0 = [ESS/(K − 1)] / [RSS/(N − K)] ∼ F[K − 1, N − K]
▶ Alternatively, F0 can be computed by dividing the numerator and denominator of F0 by TSS:
    F0 = [(ESS/TSS)/(K − 1)] / [(RSS/TSS)/(N − K)] = [R²/(K − 1)] / [(1 − R²)/(N − K)]
  since R² = ESS/TSS and 1 − R² = RSS/TSS.
  Thus, F0 = [ESS/(K − 1)] / [RSS/(N − K)] = [R²/(K − 1)] / [(1 − R²)/(N − K)]


Decision Rule:
▶ Retain H0 at significance level α if F0 ≤ Fα[K − 1, N − K].
▶ Reject H0 at significance level α if F0 > Fα[K − 1, N − K].
  or
✓ Retain H0 at significance level α if the p-value of F0 ≥ α.
✓ Reject H0 at significance level α if the p-value of F0 < α.
Example: test the joint significance of X1 and X2 in the earlier example:
    H0: β1 = β2 = 0 vs HA: at least one of β1, β2 ≠ 0
  (A worked sketch follows.)
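A worked sketch of this joint test in Python, using the earlier figures (ESS = 3085.78, RSS = 364.22, N = 10, K = 3 parameters):

```python
from scipy import stats

ESS, RSS, N, K = 3085.78, 364.22, 10, 3

F0 = (ESS / (K - 1)) / (RSS / (N - K))      # ~29.65
F_crit = stats.f.ppf(0.95, K - 1, N - K)    # F_0.05[2, 7] ~ 4.74
p_value = stats.f.sf(F0, K - 1, N - K)

print(F0, F_crit, p_value)
# F0 > F_crit (p-value << 0.05): reject H0; X1 and X2 are jointly significant.
```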


************* End of Chapter Three *************
