Econometric Theory: Module - II
MODULE – II
Lecture - 5
Simple Linear Regression Analysis
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
Now we use the method of maximum likelihood to estimate the parameters of the linear regression model, assuming that the observations yi (i = 1, 2, ..., n) are independently distributed as N(β0 + β1 xi, σ²) for all i = 1, 2, ..., n. The likelihood function is
L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \left( \frac{1}{2\pi\sigma^2} \right)^{1/2} \exp\left[ -\frac{1}{2\sigma^2} (y_i - \beta_0 - \beta_1 x_i)^2 \right].
The log-likelihood function is

\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = -\frac{n}{2} \ln 2\pi - \frac{n}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2.
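As a quick numerical illustration of this log-likelihood, the Python sketch below evaluates ln L at trial parameter values; the data and the parameter values are hypothetical, and NumPy is assumed to be available.

import numpy as np

def log_likelihood(beta0, beta1, sigma2, x, y):
    # ln L = -(n/2) ln(2*pi) - (n/2) ln(sigma^2) - sum((y - beta0 - beta1*x)^2) / (2*sigma^2)
    n = len(y)
    resid = y - beta0 - beta1 * x
    return (-0.5 * n * np.log(2.0 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - np.sum(resid ** 2) / (2.0 * sigma2))

# Hypothetical data and trial parameter values, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
print(log_likelihood(0.0, 2.0, 1.0, x, y))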
The normal equations are obtained by partially differentiating the log-likelihood with respect to β0, β1 and σ² and equating the derivatives to zero:
\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0

\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) x_i = 0

and

\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = 0.
The solution of these normal equations gives the maximum likelihood estimates of β0, β1 and σ² as

\tilde{b}_0 = \bar{y} - \tilde{b}_1 \bar{x},

\tilde{b}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{s_{xy}}{s_{xx}}

and

\tilde{s}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{b}_0 - \tilde{b}_1 x_i)^2,

respectively.
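A minimal Python sketch of these closed-form estimates, again with hypothetical data, might look as follows; it also computes the least squares based divisor-(n − 2) estimate of σ² that is compared with s̃² below.

import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)               # s_xx
sxy = np.sum((x - x.mean()) * (y - y.mean()))   # s_xy

b1 = sxy / sxx                  # ML (= least squares) estimate of beta_1
b0 = y.mean() - b1 * x.mean()   # ML (= least squares) estimate of beta_0
resid = y - b0 - b1 * x
s2_ml = np.sum(resid ** 2) / n         # ML estimate of sigma^2 (divisor n)
s2_ls = np.sum(resid ** 2) / (n - 2)   # least squares based estimate (divisor n - 2)

print(b0, b1, s2_ml, s2_ls)
# Note: s2_ml equals (n - 2) / n * s2_ls, the relation discussed below.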
It can be verified that the Hessian matrix of second-order partial derivatives of ln L with respect to β0, β1 and σ² is negative definite at β0 = b̃0, β1 = b̃1 and σ² = s̃², which ensures that the likelihood function is maximized at these values.
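One informal way to check this claim numerically is to evaluate the analytic second derivatives of ln L at the estimates and confirm that all eigenvalues of the Hessian are negative; the sketch below does this for the same hypothetical data.

import numpy as np

# Same hypothetical data as in the earlier sketches.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
s2 = np.sum((y - b0 - b1 * x) ** 2) / n   # ML estimate of sigma^2

# Hessian of ln L evaluated at (b0, b1, s2).  The cross-derivatives with respect
# to sigma^2 vanish here because the residuals, and their products with x_i,
# sum to zero at the maximum likelihood solution.
H = np.array([
    [-n / s2,         -np.sum(x) / s2,       0.0],
    [-np.sum(x) / s2, -np.sum(x ** 2) / s2,  0.0],
    [0.0,              0.0,                 -n / (2.0 * s2 ** 2)],
])
print(np.linalg.eigvalsh(H))   # all eigenvalues negative => negative definite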
Note that the least squares and maximum likelihood estimates of β0 and β1 are identical when the disturbances are normally distributed. The least squares and maximum likelihood estimates of σ², however, are different. In fact, the least squares estimate of σ² is

s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2,

so it is related to the maximum likelihood estimate by

\tilde{s}^2 = \frac{n-2}{n} s^2.
Thus b̃0 and b̃1 are unbiased estimators of β0 and β1, whereas s̃² is a biased estimate of σ²; since E(s̃²) = ((n − 2)/n) σ², the bias vanishes as n grows, so s̃² is asymptotically unbiased. The variances of b̃0 and b̃1 are the same as those of b0 and b1, respectively, but the mean squared error of s̃² is smaller than that of s².
Now consider testing the null hypothesis

H_0: \beta_1 = \beta_{10},

where β10 is some given constant, when σ² is known. Since b1 is a linear combination of normally distributed random variables,

b_1 \sim N\left( \beta_1, \frac{\sigma^2}{s_{xx}} \right),

so the statistic

Z_1 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}}

has a N(0, 1) distribution when H0 is true.
The 100(1 − α)% confidence interval for β1 can be obtained using the Z1 statistic as follows:
P\left( -z_{\alpha/2} \le Z_1 \le z_{\alpha/2} \right) = 1 - \alpha

P\left( -z_{\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}} \le z_{\alpha/2} \right) = 1 - \alpha

P\left( b_1 - z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \le \beta_1 \le b_1 + z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \right) = 1 - \alpha.

So the 100(1 − α)% confidence interval for β1 is

\left( b_1 - z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}},\; b_1 + z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \right).
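A sketch of this test and interval in Python, assuming SciPy is available; the data, the known value of σ² and the hypothesised slope β10 are illustrative.

import numpy as np
from scipy import stats

# Hypothetical data; sigma^2 is treated as known and its value is illustrative.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
sigma2 = 0.04
alpha = 0.05
beta10 = 2.0    # hypothesised slope under H0, illustrative

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx

z = stats.norm.ppf(1 - alpha / 2)          # z_{alpha/2}
se_b1 = np.sqrt(sigma2 / sxx)

Z1 = (b1 - beta10) / se_b1                 # test statistic for H0: beta_1 = beta_10
print("reject H0" if abs(Z1) > z else "do not reject H0")
print("CI for beta_1:", (b1 - z * se_b1, b1 + z * se_b1))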
Note that SS_res = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 (the residual sum of squares) satisfies SSres/σ² ~ χ² with (n − 2) degrees of freedom and

E\left( \frac{SS_{res}}{n-2} \right) = \sigma^2.
Further, SSres/σ² and b1 are independently distributed. This result will be proved formally later in the module on multiple linear regression. It also follows from the result that, under the normal distribution, the maximum likelihood estimates of the population mean and population variance, viz. the sample mean and the sample variance, are independently distributed, so b1 and s² are also independently distributed.
Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for β1 when σ² is unknown can be obtained using the t0 statistic as follows:
P\left( -t_{n-2,\alpha/2} \le t_0 \le t_{n-2,\alpha/2} \right) = 1 - \alpha

P\left( -t_{n-2,\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\hat{\sigma}^2}{s_{xx}}}} \le t_{n-2,\alpha/2} \right) = 1 - \alpha

P\left( b_1 - t_{n-2,\alpha/2} \sqrt{\frac{\hat{\sigma}^2}{s_{xx}}} \le \beta_1 \le b_1 + t_{n-2,\alpha/2} \sqrt{\frac{\hat{\sigma}^2}{s_{xx}}} \right) = 1 - \alpha,

where \hat{\sigma}^2 = SS_{res}/(n-2). So the 100(1 − α)% confidence interval for β1 is

\left( b_1 - t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{(n-2) s_{xx}}},\; b_1 + t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{(n-2) s_{xx}}} \right).
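The t-based interval can be computed in the same way, replacing σ² by σ̂² = SSres/(n − 2) and the normal critical value by the t critical value; a sketch with illustrative data:

import numpy as np
from scipy import stats

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)
alpha = 0.05

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)
sigma2_hat = ss_res / (n - 2)              # estimate of sigma^2 when it is unknown

t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)   # t_{n-2, alpha/2}
se_b1 = np.sqrt(sigma2_hat / sxx)
print("CI for beta_1:", (b1 - t_crit * se_b1, b1 + t_crit * se_b1))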
Now we consider the tests of hypothesis and confidence interval estimation for the intercept term under two cases, viz., when σ² is known and when σ² is unknown.
First consider the case when σ² is known, and suppose the null hypothesis is H0: β0 = β00, where β00 is some given constant. Using the results

E(b_0) = \beta_0, \qquad Var(b_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right),

and the fact that b0 is a linear combination of normally distributed random variables, the statistic

Z_0 = \frac{b_0 - \beta_{00}}{\sqrt{\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}}

has a N(0, 1) distribution when H0 is true.
Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for β0 when σ² is known can be derived using the Z0 statistic as follows:
P\left( -z_{\alpha/2} \le Z_0 \le z_{\alpha/2} \right) = 1 - \alpha

P\left( -z_{\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}} \le z_{\alpha/2} \right) = 1 - \alpha

P\left( b_0 - z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \le \beta_0 \le b_0 + z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right) = 1 - \alpha.

So the 100(1 − α)% confidence interval for β0 is

\left( b_0 - z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)},\; b_0 + z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right).
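A sketch of the corresponding test and confidence interval for β0 when σ² is known, again with illustrative data and an illustrative hypothesised value β00:

import numpy as np
from scipy import stats

# Hypothetical data; the known sigma^2 and the hypothesised beta_00 are illustrative.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)
sigma2 = 0.04
alpha = 0.05
beta00 = 0.0

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

se_b0 = np.sqrt(sigma2 * (1.0 / n + x.mean() ** 2 / sxx))
z = stats.norm.ppf(1 - alpha / 2)

Z0 = (b0 - beta00) / se_b0                 # test statistic for H0: beta_0 = beta_00
print("reject H0" if abs(Z0) > z else "do not reject H0")
print("CI for beta_0:", (b0 - z * se_b0, b0 + z * se_b0))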
When σ² is unknown, the test statistic for H0: β0 = β00 is

t_0 = \frac{b_0 - \beta_{00}}{\sqrt{\dfrac{SS_{res}}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}},

which follows a t-distribution with (n − 2) degrees of freedom when H0 is true.
Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for β0 when σ² is unknown can be obtained as follows:
Consider
P\left( -t_{n-2,\alpha/2} \le t_0 \le t_{n-2,\alpha/2} \right) = 1 - \alpha

P\left( -t_{n-2,\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\dfrac{SS_{res}}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}} \le t_{n-2,\alpha/2} \right) = 1 - \alpha

P\left( b_0 - t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \le \beta_0 \le b_0 + t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right) = 1 - \alpha.

The 100(1 − α)% confidence interval for β0 is therefore

\left( b_0 - t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)},\; b_0 + t_{n-2,\alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right).
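When σ² is unknown, the same computation uses SSres/(n − 2) in place of σ² and a t critical value with (n − 2) degrees of freedom; a sketch:

import numpy as np
from scipy import stats

# Hypothetical data; the hypothesised beta_00 is illustrative.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)
alpha = 0.05
beta00 = 0.0

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt(ss_res / (n - 2) * (1.0 / n + x.mean() ** 2 / sxx))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)

t0 = (b0 - beta00) / se_b0                 # test statistic for H0: beta_0 = beta_00
print("reject H0" if abs(t0) > t_crit else "do not reject H0")
print("CI for beta_0:", (b0 - t_crit * se_b0, b0 + t_crit * se_b0))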
Finally, since SSres/σ² follows a χ² distribution with (n − 2) degrees of freedom, a confidence interval for σ² can be obtained from

P\left( \chi^2_{n-2,\alpha/2} \le \frac{SS_{res}}{\sigma^2} \le \chi^2_{n-2,1-\alpha/2} \right) = 1 - \alpha

P\left( \frac{SS_{res}}{\chi^2_{n-2,1-\alpha/2}} \le \sigma^2 \le \frac{SS_{res}}{\chi^2_{n-2,\alpha/2}} \right) = 1 - \alpha.

The corresponding 100(1 − α)% confidence interval for σ² is

\left( \frac{SS_{res}}{\chi^2_{n-2,1-\alpha/2}},\; \frac{SS_{res}}{\chi^2_{n-2,\alpha/2}} \right),

where χ²_{n−2,α/2} and χ²_{n−2,1−α/2} denote the lower and upper α/2 points of the χ² distribution with (n − 2) degrees of freedom.
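A sketch of this chi-square based interval in Python; note that χ²_{n−2,α/2} in the convention above is the lower α/2 quantile, which is what scipy.stats.chi2.ppf(α/2, n − 2) returns.

import numpy as np
from scipy import stats

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)
alpha = 0.05

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

# chi2.ppf(alpha/2, n-2) is the lower alpha/2 quantile (chi^2_{n-2, alpha/2} above),
# chi2.ppf(1 - alpha/2, n-2) the upper one.
lower = ss_res / stats.chi2.ppf(1 - alpha / 2, df=n - 2)
upper = ss_res / stats.chi2.ppf(alpha / 2, df=n - 2)
print("CI for sigma^2:", (lower, upper))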