As of Sep 16, 2020: Seppo Pynnönen, Econometrics I
Part II
1 The Simple Regression Model
   Definition
   Estimation of the model, OLS
   OLS Statistics
   Algebraic properties

Definition
y = β0 + β1 x + u.  (1)

The slope β1 gives the change in y for a unit change in x when the error term is held fixed:

∆y = β1 ∆x, if ∆u = 0.

Assumption 5: cov[ui, xi] = 0.
Remark 1
Assumptions 1 and 2 are related to the distributional properties of the error term. Assumption 1 can always be fulfilled when the intercept term β0 is included in the regression (1). We discuss Assumption 2 later in more detail.

Assumption 3 is mostly relevant for time series or ”clustered” data.

Assumption 5 is the key assumption about how x and u are related. It is sometimes expressed in terms of the stronger assumption of mean independence, E[ui | xi] = E[ui], which implies cov[ui, xi] = 0 when E[ui] = 0.
The first order conditions (foc) for the minimum are found by
setting the partial derivatives equal to zero. Denote by β̂0 and β̂1
the values satisfying the foc.
∂f(β̂0, β̂1)/∂β1 = −2 ∑_{i=1}^{n} xi (yi − β̂0 − β̂1 xi) = 0  (5)
The explicit solutions for β̂0 and β̂1 (the OLS estimators of β0 and β1) are

β̂1 = ∑_{i=1}^{n}(xi − x̄)(yi − ȳ) / ∑_{i=1}^{n}(xi − x̄)²  (7)

and

β̂0 = ȳ − β̂1 x̄.  (8)
Thus the residual component ûi consists of the pure error term ui and the sampling errors due to the estimation of the parameters β0 and β1.
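To make (7) and (8) concrete, here is a minimal SAS/IML sketch that computes the OLS estimates directly from the formulas; the data set name a and the variable names x and y are illustrative assumptions.

/* OLS estimates via (7) and (8); data set and variable names assumed */
proc iml;
  use a; read all var {x y}; close a;   * read x and y into vectors;
  xc = x - mean(x);                     * deviations from the mean;
  yc = y - mean(y);
  b1 = sum(xc # yc) / sum(xc ## 2);     * slope, equation (7);
  b0 = mean(y) - b1 * mean(x);          * intercept, equation (8);
  print b1 b0;
quit;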
Example 1
Suppose the population regression is y = β0 + β1 x + u with u ∼ N(0, σ²), in which β0 = 5, β1 = 1.2, and σ² = 4.

For a sample of n = 50 observations (x generated from the normal distribution N(5, 3)), we have the sample regression line shown in the top right-hand plot. Repeating the sampling 5 times, the respective SRFs are shown in the bottom left figure along with the PRF.
Data (first five of the n = 50 observations):

Obs     x       y
 1    5.03   10.24
 2    4.68    9.95
 3    2.62   10.89
 4    3.96   14.03
 5    5.51   12.62
 .      .       .

[Figure: top right panel, the PRF and the SRF from one sample of n = 50 observations; bottom left panel, the PRF together with the SRFs from 5 repeated samples. Horizontal axis x, vertical axis y.]
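The sampling experiment of Example 1 can be reproduced along the following lines (a sketch: the seed is arbitrary, and N(5, 3) is taken here to mean mean 5 and variance 3, so the standard deviation passed to rand is sqrt(3)).

data sim;
  call streaminit(2020);                 * arbitrary seed;
  do i = 1 to 50;
    x = rand('normal', 5, sqrt(3));      * x ~ N(5, 3);
    u = rand('normal', 0, 2);            * u ~ N(0, 4), sigma = 2;
    y = 5 + 1.2 * x + u;                 * PRF with beta0 = 5, beta1 = 1.2;
    output;
  end;
run;

proc reg data = sim;                     * SRF for the simulated sample;
  model y = x;
run;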
Remark 2
The slope coefficient β̂1 can be expressed in terms of the sample covariance of x and y and the sample variance of x.

Sample covariance:

sxy = (1/(n−1)) ∑_{i=1}^{n}(xi − x̄)(yi − ȳ)  (13)

Sample variance:

sx² = (1/(n−1)) ∑_{i=1}^{n}(xi − x̄)².  (14)

Thus

β̂1 = sxy / sx².  (15)
Remark 3
The slope coefficient β̂1 can be expressed in terms of the sample correlation and the standard deviations of x and y.

Sample correlation:

rxy = ∑_{i=1}^{n}(xi − x̄)(yi − ȳ) / √(∑_{i=1}^{n}(xi − x̄)² ∑_{i=1}^{n}(yi − ȳ)²) = sxy / (sx sy),  (16)

where sx = √sx² and sy = √sy² are the sample standard deviations of x and y, respectively.

Thus we can also write the slope coefficient in terms of sample standard deviations and correlation as

β̂1 = rxy (sy / sx).  (17)
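The equivalence of (7), (15), and (17) is easy to check numerically; a SAS/IML sketch, again with illustrative data set and variable names:

proc iml;
  use a; read all var {x y}; close a;
  C = cov(x || y);                       * 2 x 2 sample covariance matrix;
  b1_cov = C[1, 2] / C[1, 1];            * slope via (15): sxy / sx^2;
  R = corr(x || y);                      * 2 x 2 sample correlation matrix;
  b1_cor = R[1, 2] * std(y) / std(x);    * slope via (17);
  print b1_cov b1_cor;                   * the two coincide;
quit;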
Example 2
Relationship between wage and education:

wage = average hourly earnings
educ = years of education

The data were collected in 1976; n = 526.
SAS commands for reading data, printing a few lines, sample statistics,
generating a scatter plot, and estimating the regression.
SAS Studio excerpt:
Sample statistics:
/* Some sample statistics */
proc means data = a min max mean std skew kurt maxdec = 2; * stats with output rounded to two decimals;
var wage educ;
run;
Using (17), you can verify that the OLS estimate of β1 can be computed from the correlation (rwage,educ = 0.406) and the standard deviations in the sample statistics table above. Applying (8) then gives the OLS estimate of the intercept. Thus, in all, the estimates can be derived from the basic sample statistics.
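To reproduce the calculation in SAS, proc corr gives the correlation and the standard deviations needed for (17), and proc reg confirms the resulting OLS estimates (the data set name a follows the excerpt above):

/* Correlation and standard deviations for (17), then the direct OLS fit */
proc corr data = a;
  var wage educ;
run;

proc reg data = a;
  model wage = educ;          * OLS estimates of beta0 and beta1;
run;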
OLS Statistics
The OLS residuals satisfy

∑_{i=1}^{n} ûi = 0  (18)

and

∑_{i=1}^{n} xi ûi = 0.  (19)

The total, explained, and residual sums of squares are defined as

SST = ∑_{i=1}^{n} (yi − ȳ)²,  (21)

SSE = ∑_{i=1}^{n} (ŷi − ȳ)²,  (22)

SSR = ∑_{i=1}^{n} (yi − ŷi)².  (23)
These sums of squares satisfy the decomposition

SST = SSE + SSR.  (25)

Prove this!
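As a hint for the exercise: the decomposition follows in a few lines from the orthogonality properties (18) and (19). A sketch in LaTeX notation:

\begin{align*}
\mathrm{SST} &= \sum_{i=1}^{n} (y_i - \bar{y})^2
              = \sum_{i=1}^{n} \bigl( (\hat{y}_i - \bar{y}) + \hat{u}_i \bigr)^2 \\
             &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
              + 2 \sum_{i=1}^{n} (\hat{y}_i - \bar{y}) \hat{u}_i
              + \sum_{i=1}^{n} \hat{u}_i^2
              = \mathrm{SSE} + \mathrm{SSR},
\end{align*}

because the cross term vanishes: \(\sum (\hat{y}_i - \bar{y}) \hat{u}_i = \hat{\beta}_1 \sum x_i \hat{u}_i + (\hat{\beta}_0 - \bar{y}) \sum \hat{u}_i = 0\) by (19) and (18).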
Remark 4
It is unfortunate that different books and different statistical packages use different definitions, particularly for SSR and SSE. In many, the former means the Regression sum of squares and the latter the Error sum of squares, i.e., just the opposite of what we have here!

The goodness of fit is measured by the R-square:

R² = SSE / SST = 1 − SSR / SST.  (26)
Remark 7
It is obvious that 0 ≤ R² ≤ 1, with R² = 0 representing no linear relation between x and y and R² = 1 representing a perfect fit.
Adjusted R-square:

R̄² = 1 − su² / sy²,  (27)

where

su² = (1/(n−2)) ∑_{i=1}^{n} (yi − ŷi)²  (28)

is an estimate of the residual variance σu² = var[u]. We find easily that

R̄² = 1 − ((n−1)/(n−2)) (1 − R²).  (29)
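To see where (29) comes from, note from (28) that su² = SSR/(n−2) and from (14) (applied to y) that sy² = SST/(n−1), so that

\[
\bar{R}^2 = 1 - \frac{s_u^2}{s_y^2}
          = 1 - \frac{\mathrm{SSR}/(n-2)}{\mathrm{SST}/(n-1)}
          = 1 - \frac{n-1}{n-2} \cdot \frac{\mathrm{SSR}}{\mathrm{SST}}
          = 1 - \frac{n-1}{n-2} (1 - R^2).
\]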
Example 3
In the previous example R² = 0.1648 and the adjusted R-square R̄² = 0.1632. The R² tells us that about 16.5 percent of the variation in hourly earnings can be explained by education; the remaining 83.5 percent is not accounted for by the model.
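A quick numeric check of (29) against these figures (a one-off SAS data step):

data _null_;
  n = 526; r2 = 0.1648;
  adj = 1 - (n - 1) / (n - 2) * (1 - r2);   * equation (29);
  put adj = 8.4;                            * prints adj=0.1632;
run;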
yi = β0 + β1 xi + ui (30)
Remark 8
Coefficients a1 and b1 scale the measurements, and a0 and b0 shift the origin of the measurements.
Example 4
Suppose an econometrician has estimated the following wage equation
Example 5
Let the estimated model be
”Demeaned” observations:
So ŷ∗ = β̂1 x∗.
(Note that β̂1 remains unchanged.)

As an exercise, show that in this case β̂1∗ = rxy, the correlation coefficient of x and y.
∆x and ∆y are changes in x and y, and %∆x = 100·∆x/x and %∆y = 100·∆y/y are the corresponding percentage changes.
Example 6
Consider again the wage example, now with the log wage as the dependent variable:

log(y) = β0 + β1 x + u.

The estimated equation is

\widehat{log(y)} = 0.584 + 0.083 x,  (35)

with n = 526 and R² = 0.186. Because the dependent variable is in logs, the slope implies that one additional year of education is associated with approximately 100·0.083 = 8.3 percent higher wage. Note that the R-squares of this model and the level-level model are not comparable.
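In the SAS session of Example 2, the log-level model could be estimated along these lines (lwage is an assumed name for the new variable):

data a2;                       * add the log wage to the data;
  set a;
  lwage = log(wage);
run;

proc reg data = a2;
  model lwage = educ;          * log-level model as in (35);
run;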
Remark 10
Typically, all models where the transformations on y and x are functions of these variables alone can be cast into the form of a linear model. That is, if we have generally

g(y) = β0 + β1 h(x) + u,  (36)

where g and h are functions, then defining y∗ = g(y) and x∗ = h(x) we have the linear model

y∗ = β0 + β1 x∗ + u.

Note, however, that not all models can be cast into a linear form. An example is

cons = 1 / (β0 + β1 income) + u.
We say generally that an estimator θ̂ of a parameter θ is unbiased if E[θ̂] = θ.

Theorem 1

Under the classical assumptions 1–5,

E[β̂0] = β0 and E[β̂1] = β1.  (37)
Substituting yi = β0 + β1 xi + ui into (7) and simplifying gives

β̂1 = β1 + (1 / ∑(xi − x̄)²) ∑(xi − x̄) ui.  (39)
Because the xi are fixed, we get

E[β̂1] = β1 + (1 / ∑(xi − x̄)²) ∑(xi − x̄) E[ui] = β1.  (40)
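Unbiasedness can also be illustrated by simulation, reusing the setup of Example 1; a sketch (the seed and the number of replications are arbitrary):

data mc;                                     * 1000 samples of n = 50;
  call streaminit(2020);
  do rep = 1 to 1000;
    do i = 1 to 50;
      x = rand('normal', 5, sqrt(3));
      y = 5 + 1.2 * x + rand('normal', 0, 2);
      output;
    end;
  end;
run;

proc reg data = mc outest = est noprint;     * OLS fit for each sample;
  by rep;
  model y = x;
run;

proc means data = est mean std maxdec = 3;   * means should be close to 5 and 1.2;
  var intercept x;
run;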
Theorem 2
Under the classical assumptions 1 through 5 and given x1, . . . , xn,

var[β̂1] = σu² / ∑_{i=1}^{n}(xi − x̄)²,  (41)

var[β̂0] = (1/n + x̄² / ∑(xi − x̄)²) σu²,  (42)

and for ŷ = β̂0 + β̂1 x with given x,

var[ŷ] = (1/n + (x − x̄)² / ∑(xi − x̄)²) σu².  (43)
Proof: Again we prove as an example only (41). Using (39) and the properties of variance with x1, . . . , xn given,

var[β̂1] = var[β1 + (1 / ∑(xi − x̄)²) ∑(xi − x̄) ui]
         = (1 / ∑(xi − x̄)²)² ∑(xi − x̄)² var[ui]
         = (1 / ∑(xi − x̄)²)² ∑(xi − x̄)² σu²  (44)
         = σu² / ∑(xi − x̄)²,

where the second equality uses the fact that the ui are uncorrelated.
Remark 11
(42) can be written equivalently as

var[β̂0] = σu² ∑ xi² / (n ∑(xi − x̄)²).  (45)
Recalling from equation (12) that the residual ûi = yi − ŷi consists of the error term and the sampling errors of the parameter estimates, this reminds us about the difference between the error term ui and the residual term ûi.
An unbiased estimator of the error variance σu² = var[ui] is

σ̂u² = (1/(n−2)) ∑_{i=1}^{n} ûi².  (47)
Theorem 3

Under the classical assumptions 1–5,

E[σ̂u²] = σu².

Proof: Omitted.
Replacing σu² in (41) and (42) by σ̂u² and taking square roots gives the standard errors of β̂1 and β̂0:

se(β̂1) = σ̂u / √(∑(xi − x̄)²)  (49)

and

se(β̂0) = σ̂u √(1/n + x̄² / ∑(xi − x̄)²).  (50)
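Putting (47), (49), and (50) together, the standard errors can be computed by hand in SAS/IML and compared with the proc reg output (data set and variable names are again illustrative):

proc iml;
  use a; read all var {x y}; close a;
  n = nrow(x);
  xc = x - mean(x);
  b1 = sum(xc # (y - mean(y))) / sum(xc ## 2);         * slope (7);
  b0 = mean(y) - b1 * mean(x);                         * intercept (8);
  uhat = y - b0 - b1 * x;                              * residuals;
  s2 = sum(uhat ## 2) / (n - 2);                       * error variance estimate (47);
  se1 = sqrt(s2 / sum(xc ## 2));                       * se of slope, equation (49);
  se0 = sqrt(s2 * (1/n + mean(x)##2 / sum(xc ## 2)));  * se of intercept, equation (50);
  print b1 se1 b0 se0;
quit;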