CH 11 Slides
Kuan Xu
University of Toronto
kuan.xu@utoronto.ca
April 3, 2024
1 Introduction
Figure: Graph of Y = β0 + β1 x
Introduction (5)
The natural and social sciences may posit many different theories. For
example, in economics, we postulate as a theory that the wage level is a
function of the level of educational attainment; that is,
wage = β0 + β1 edu + ε,
where β1 > 0. Using the model, we can estimate β1 and test
H0 : β1 = β10 against Ha : β1 < β10 .
If the model is not refuted by the data, we may use it for
decision making and prediction.
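A minimal R sketch of this testing idea, using simulated (hypothetical) data and an assumed hypothesized value β10 = 2; the data and parameter values are illustrative, not from the slides:

set.seed(227)
edu   <- runif(100, 8, 20)                  # simulated years of educational attainment
wage  <- 2 + 1.5 * edu + rnorm(100, sd = 3) # assumed true beta0 = 2, beta1 = 1.5
fit   <- lm(wage ~ edu)                     # least-squares fit of wage on edu
b10   <- 2                                  # hypothesized value beta10 (assumed)
tstat <- (coef(fit)["edu"] - b10) / sqrt(vcov(fit)["edu", "edu"])
pt(tstat, df = df.residual(fit))            # lower-tail p-value for Ha: beta1 < beta10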
Other model specifications include:
E(W) = β0 + β1 ln(x), where W = ln(Y).
E(Y) = β0 + β1 x + β2 x².
E(Y) = β0 + β1 x1 + β2 x2.
E(Y) = β0 + β1 x1 + β2 x2 + β3 x1 x2 + β4 x1² + β5 x2².
Figure: E(Y) = β0 + β1 x1 + β2 x2
Y = β0 + β1 x1 + β2 x2 + · · · + βk xk + ε,
E (Y ) = β0 + β1 x1 + β2 x2 + · · · + βk xk .
For given data yi , x1i , x2i , . . . , xki , where i = 1, 2, . . . , n, we can use the
method of least squares to derive estimates for β0 , β1 , . . . , βk .
Note that, for given values of xj (j = 1, 2, . . . , k), V(Y) = V(ε) = σ².
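In R, such a model is estimated in one call; a minimal sketch, assuming a hypothetical data frame dat with columns y, x1, x2, x3:

fit <- lm(y ~ x1 + x2 + x3, data = dat)  # least-squares estimates of beta0, ..., beta3
coef(fit)                                # the fitted beta-hats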
We can use the following figure to show the idea of the method of least
squares.
∂(Σi ε̃i²)/∂β̃1 = −2 Σi [yi − (β̂0 + β̂1 xi)] xi = 0 ⇒ Σi ε̂i xi = 0.
These two f.o.c. are also called the least-squares equations or normal
equations.
1. Note that β̃j becomes β̂j for j = 0, 1, and β̂0 and β̂1 are the optimal and unique
estimators.
The Method of Least Squares (4)
Solve the two f.o.c. for β̂0 and β̂1 .
β̂0 n + β̂1 Σi xi = Σi yi
β̂0 Σi xi + β̂1 Σi xi² = Σi xi yi
We use Cramer's rule—for Aβ = q, the solution is βi = det(Ai)/det(A), where Ai is A
with its i-th column replaced by q—to find β̂1 .
β̂1 = [n Σi xi yi − Σi xi Σi yi] / [n Σi xi² − (Σi xi)²]
   = [Σi xi yi − (1/n) Σi xi Σi yi] / [Σi xi² − (1/n)(Σi xi)²]
   = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
Solving for β̂0 is straightforward; just write the sample regression model at the
sample means, ȳ = β̂0 + β̂1 x̄, and solve to get β̂0 = ȳ − β̂1 x̄.
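The same solution can be computed numerically via Cramer's rule; a small R sketch (the function name cramer_beta is hypothetical):

cramer_beta <- function(x, y) {
  A <- matrix(c(length(x), sum(x), sum(x), sum(x^2)), nrow = 2)  # normal-equation matrix
  q <- c(sum(y), sum(x * y))                                     # right-hand side
  c(beta0 = det(cbind(q, A[, 2])) / det(A),   # column 1 of A replaced by q
    beta1 = det(cbind(A[, 1], q)) / det(A))   # column 2 of A replaced by q
}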
Let Sxy = Σi (xi − x̄)(yi − ȳ) and Sxx = Σi (xi − x̄)². We can summarize
our derivations of the least-squares estimators as follows.
Least-Squares Estimators for the Simple Linear Regression Model
(a) β̂1 = Sxy / Sxx .
(b) β̂0 = ȳ − β̂1 x̄.
Example:
Find β̂0 and β̂1 and the fitted regression line ŷi given data with n = 5,
Σi xi = 0, Σi yi = 5, Σi xi² = 10, and Σi xi yi = 7:
Solution:
β̂1 = [Σi xi yi − (1/n) Σi xi Σi yi] / [Σi xi² − (1/n)(Σi xi)²]
   = [7 − (1/5)(0)(5)] / [10 − (1/5)(0)²] = .7.
β̂0 = ȳ − β̂1 x̄ = 5/5 − (.7)(0) = 1.
For i = 1, 2, . . . , n,
ŷi = β̂0 + β̂1 xi = 1 + .7xi .
Solution (continued): We can draw ŷi through the data as shown below.
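A minimal R sketch that reproduces these estimates, assuming the five data points (x, y) = (−2, 0), (−1, 0), (0, 1), (1, 1), (2, 3), which are consistent with the sums used above:

x <- c(-2, -1, 0, 1, 2)
y <- c(0, 0, 1, 1, 3)
my.model <- lm(y ~ x)     # least-squares fit
coef(my.model)            # (Intercept) = 1.0, x = 0.7
plot(x, y)                # scatter plot of the data
abline(my.model)          # draws yhat = 1 + .7x through the points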
V(β̂1) = (1/Sxx)² Σi (xi − x̄)² V(Yi) = σ²/Sxx . (∵ V(Yi) = σ².)
It will take more steps to derive V (β̂0 ). We start from β̂0 = Ȳ − β̂1 x̄.
(A): V(Ȳ) = V(ε̄) = (1/n²) Σi V(εi) = (1/n) V(εi) = σ²/n.
(B): Cov(Ȳ, β̂1) = 0, so V(β̂0) = V(Ȳ − β̂1 x̄) = V(Ȳ) + x̄² V(β̂1).
Using the results in (A) and (B), we get
V(β̂0) = σ²/n + x̄² σ²/Sxx = σ² [1/n + x̄²/Sxx].
Note that we have so far derived V(β̂0) = σ² Σi xi² / (n Sxx), V(β̂1) = σ²/Sxx, and
Cov(β̂0 , β̂1) = −x̄σ²/Sxx ,
where σ² is unknown and must be estimated. For
this simple regression, the unbiased estimator for σ² is
S² = (1/(n − 2)) Σi (Yi − Ŷi)² = (1/(n − 2)) Σi [Yi − (β̂0 + β̂1 xi)]².
Here, SSE = Σi (Yi − Ŷi)² is used to denote the sum of squared errors.
It can be shown that
(n − 2)S²/σ² ∼ χ²(n − 2).
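A hedged simulation sketch of this distributional result, reusing x from the example and assuming σ = 1 for illustration:

set.seed(1)
stat <- replicate(5000, {
  ysim <- 1 + 0.7 * x + rnorm(length(x))  # model with sigma^2 = 1
  sum(residuals(lm(ysim ~ x))^2)          # SSE = (n - 2)S^2; dividing by sigma^2 = 1
})
mean(stat)                                # close to n - 2 = 3, the chi-square(3) mean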
Note that the data in the previous example are used to get the values of S², S,
c00, c11, c01, V̂(β̂0), V̂(β̂1), and Cov̂(β̂0 , β̂1).
Figure: Scatter plot of the data with the fitted line ŷi = 1 + .7xi (x from −2 to 2, y from 0 to 3)
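The console transcript below uses s2, c00, c11, and c01 without showing their definitions; a minimal sketch of how they could be computed (names assumed to match the transcript), using x, y, and my.model from the example:

n   <- length(x)
s2  <- sum(residuals(my.model)^2) / (n - 2)  # unbiased estimate of sigma^2
Sxx <- sum((x - mean(x))^2)
c00 <- sum(x^2) / (n * Sxx)                  # so that Vhat(beta0-hat) = s2 * c00
c11 <- 1 / Sxx                               # so that Vhat(beta1-hat) = s2 * c11
c01 <- -mean(x) / Sxx                        # so that Covhat(beta0-hat, beta1-hat) = s2 * c01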
> # Vhat.beta0
> Vhat.beta0 = s2*c00
> Vhat.beta0
[1] 0.073333
> Std.err.beta0 = sqrt(Vhat.beta0)
> Std.err.beta0
[1] 0.2708
> # Vhat.beta1
> Vhat.beta1 = s2*c11
> Vhat.beta1
[1] 0.036667
> Std.err.beta1 = sqrt(Vhat.beta1)
> Std.err.beta1
[1] 0.19149
> # Cov.hat.beta0.beta1
> Cov.hat.beta0.beta1 = s2*c01
> Cov.hat.beta0.beta1
[1] 0
H0 : βi = βi0 against
Ha : βi > βi0 , βi < βi0 , or βi ̸= βi0 .
Test statistic: T = (β̂i − βi0) / (S √cii).
Rejection region: t > tα , t < −tα , or |t| > tα/2 , respectively.
Note that tα is the critical value of t such that P(t > tα ) = α based on n − 2 df.
Inferences Concerning the Parameters βi (2)
> summary(my.model)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.000 0.271 3.69 0.034 *
x 0.700 0.191 3.66 0.035 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
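A hedged check of the reported t values and p-values, using s2, c00, and c11 from the sketch above:

t0 <- coef(my.model)[1] / sqrt(s2 * c00)  # = 1.000/0.2708  ≈ 3.69
t1 <- coef(my.model)[2] / sqrt(s2 * c11)  # = 0.700/0.19149 ≈ 3.66
2 * pt(-abs(t1), df = 3)                  # two-sided p ≈ 0.035
qt(0.975, df = 3)                         # critical value t_{alpha/2} for alpha = .05, n - 2 = 3 df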
Goodness of Fit
For the simple linear regression model Y = β0 + β1 x + ε, we can estimate
the model using data {yi , xi }, i = 1, 2, . . . , n. We list three ways of measuring
goodness of fit.
1. R squared: R² = Σi (ŷi − ȳ)² / Σi (yi − ȳ)².
2. Multiple R squared: Multiple R² = V̂(ŷ)/V̂(y) = Sŷ²/Sy².
3. Adjusted R squared: Adjusted R² = 1 − [Σi (yi − ŷi)²/(n − k)] / [Σi (yi − ȳ)²/(n − 1)],
where k = 2 for the simple linear regression model and n = # of observations.
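A minimal sketch computing the three measures for the example data (my.model, x, y as above; here k = 2 and n = 5):

yhat <- fitted(my.model); ybar <- mean(y)
n <- length(y); k <- 2
R2  <- sum((yhat - ybar)^2) / sum((y - ybar)^2)  # R squared, ≈ .817 here
mR2 <- var(yhat) / var(y)                        # multiple R squared, same value here
aR2 <- 1 - (sum((y - yhat)^2) / (n - k)) /
           (sum((y - ybar)^2) / (n - 1))         # adjusted R squared, ≈ .756 here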
To assess the model as a whole, we test
H0 : β0 = β1 = 0
against
Ha : at least one of the βi 's is not zero.
Test statistic: F = [Σi (ŷi − ȳ)²/(k − 1)] / [Σi (yi − ŷi)²/(n − k)] ∼ F(k − 1, n − k)
Rejection region: RR = {F > Fα },
where P(F > Fα ) = α, k = 2 for the simple linear regression model and
n = # of observations.
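A hedged sketch of this F test for the example data, reusing my.model and y from above:

n <- 5; k <- 2
SSR <- sum((fitted(my.model) - mean(y))^2)  # explained sum of squares
SSE <- sum(residuals(my.model)^2)           # sum of squared errors
Fst <- (SSR / (k - 1)) / (SSE / (n - k))    # ≈ 13.4 here
pf(Fst, k - 1, n - k, lower.tail = FALSE)   # p ≈ .035, so reject H0 at alpha = .05
qf(0.95, k - 1, n - k)                      # critical value F_alpha for alpha = .05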
Inferences Concerning the Parameters βi (8)