Lecture 8
In the last lecture, we derived the sample estimates β̂1 and β̂0 as:

\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}    (1)

\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}    (2)
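As a concrete illustration of how eqs. (1) and (2) could be computed, here is a minimal Python sketch using NumPy; the sample values are made up for illustration and are not from the lecture.

```python
# Minimal sketch of eqs. (1) and (2): OLS slope and intercept by hand.
import numpy as np

# Hypothetical sample (illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

x_bar, y_bar = x.mean(), y.mean()

# Eq. (1): sum of cross-deviations over sum of squared deviations in x
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)

# Eq. (2): the fitted line passes through the point of means (x_bar, y_bar)
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)
```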
To justify the name ordinary least squares, for any β̂1 and β̂0 define a fitted value for y when x = xi as

\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i

This is the value we predict for y when x = xi for the given intercept and slope. There is a fitted value for each observation in the sample. (Note: we use the term fitted value here, not predicted value.) The residual for observation i is the difference between the actual yi and its fitted value:

\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i

There are n such residuals. (Note: these are not the same as the errors in eq. (1).)
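As a sketch of these definitions, the snippet below computes the fitted values, residuals, and the sum of squared residuals for the same hypothetical sample used above (all data and names are assumptions for illustration).

```python
# Fitted values and residuals for a hypothetical sample.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form OLS estimates (eqs. (1) and (2))
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x        # fitted value for each observation
u_hat = y - y_hat          # residual: actual y_i minus fitted value
ssr = np.sum(u_hat ** 2)   # sum of squared residuals

print(y_hat, u_hat, ssr)
```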
Now, suppose we choose β̂1 and β̂0 to make the sum of squared residuals Σ ûi² as small as possible. Mathematically,

\min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2    (3)

Differentiating with respect to β0 and β1 and setting each derivative to zero gives the first-order conditions

-2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0    (4)

-2 \sum_{i=1}^{n} x_i (y_i - \beta_0 - \beta_1 x_i) = 0    (5)
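As a quick numerical check (a sketch on the same hypothetical sample as above), the closed-form estimates do satisfy conditions (4) and (5) up to floating-point error, which is what makes them the minimizers of (3).

```python
# Verify the first-order conditions (4) and (5) at the OLS estimates.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

print(np.sum(u_hat))       # condition (4): residuals sum to ~0
print(np.sum(x * u_hat))   # condition (5): residuals uncorrelated with x
```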
Equations (4) and (5) are essentially the same as eqs. (11) and (12) from last class, and they can be solved to give us the ordinary least squares estimates.
They are called "ordinary least squares" since they minimize the sum of squared residuals.
Once we have determined the OLS intercept and slope estimates, we form the OLS regression line:

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x    (6)
Note: Equation (6) above is the sample regression function (SRF). This is an estimated version of the population regression function E(y|x) = β0 + β1 x we stated earlier. Remember that the population regression line is fixed, but the SRF is obtained from a given sample of data; a new sample will generate a different slope and intercept in eq. (6).
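A small simulation sketch of this point (the population values below are made up): the population regression function stays fixed, but each new sample yields a slightly different estimated intercept and slope.

```python
# Draw several samples from a fixed population model and re-estimate the SRF.
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 1.0, 2.0                 # fixed (hypothetical) population parameters

for _ in range(3):
    x = rng.uniform(0, 10, size=50)     # new sample of x
    u = rng.normal(0, 1, size=50)       # unobserved errors
    y = beta0 + beta1 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    print(b0, b1)                       # different estimates in each sample
```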
\widehat{wage} = 90 + 154 \cdot education
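As a purely hypothetical reading of this SRF: at education = 12 the fitted value would be 90 + 154(12) = 1938, and the slope means each additional year of education is associated with a 154-unit higher fitted wage (the units depend on how wage and education are measured in the sample).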
Note: In our study so far the slope parameter simply measures the association between y and x. It says nothing about causality. Even though we often casually use causal language to talk about β1 (e.g., does a decrease in price increase sales?), it is important to understand that causality can only be established after imposing other conditions.
y = β0 + β1 x + u

The key here is that this model is linear in the parameters β0 and β1.
Then,

%∆wage ≈ (100 · β1) ∆education
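A quick numerical check of this approximation (assuming a log(wage) specification and a hypothetical β1 = 0.08): the exact percentage change is 100·(exp(β1 ∆education) − 1), which is close to 100·β1·∆education when β1 ∆education is small.

```python
# Compare the exact and approximate percentage change in wage.
import numpy as np

beta1 = 0.08                  # hypothetical coefficient on education
delta_educ = 1                # one more year of education

exact_pct = 100 * (np.exp(beta1 * delta_educ) - 1)   # ~8.33%
approx_pct = 100 * beta1 * delta_educ                # 8.00%
print(exact_pct, approx_pct)
```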
y = β0 + β1 x1 + β2 x2 + ... + βk xk + u    (7)
Similar to the simple linear regression case, the (k + 1) OLS estimates are chosen to minimize the sum of squared residuals:

\sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_k x_{ik})^2    (9)
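One common way to obtain these (k + 1) estimates in practice is the matrix form of the least-squares solution, β̂ = (X′X)⁻¹X′y, where X includes a column of ones for the intercept. The sketch below uses made-up data and is only meant to illustrate the minimization in (9), not necessarily the derivation the course will follow.

```python
# Multiple-regression OLS via the normal equations, on made-up data.
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept + k regressors
beta_true = np.array([1.0, 0.5, -2.0])                       # hypothetical parameters
y = X @ beta_true + rng.normal(size=n)

# Solve X'X beta = X'y for the (k + 1) OLS estimates
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```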