Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
LINEAR REGRESSION ANALYSIS
MODULE II
Lecture - 4
Simple Linear Regression Analysis
Maximum likelihood estimation
We assume that the $\varepsilon_i$'s $(i = 1, 2, \ldots, n)$ are independent and identically distributed, following a normal distribution $N(0, \sigma^2)$. Now we use the method of maximum likelihood to estimate the parameters of the linear regression model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n),$$
so the observations $y_i$ $(i = 1, 2, \ldots, n)$ are independently distributed with $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$ for all $i = 1, 2, \ldots, n$. The likelihood function of the given observations $(x_i, y_i)$ and unknown parameters $\beta_0, \beta_1$ and $\sigma^2$ is
$$L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \left(\frac{1}{2\pi\sigma^2}\right)^{1/2} \exp\left[-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2\right].$$
The maximum likelihood estimates of $\beta_0, \beta_1$ and $\sigma^2$ can be obtained by maximizing $L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$, or equivalently $\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$, where
$$\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = -\left(\frac{n}{2}\right)\ln 2\pi - \left(\frac{n}{2}\right)\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2.$$
The normal equations are obtained by partial differentiation of the log-likelihood with respect to $\beta_0, \beta_1$ and $\sigma^2$ and equating them to zero:
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0,$$
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)\,x_i = 0$$
and
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2 = 0.$$
The solution of these normal equations gives the maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ as
$$\tilde{b}_1 = \frac{s_{xy}}{s_{xx}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2},$$
$$\tilde{b}_0 = \bar{y} - \tilde{b}_1 \bar{x}$$
and
$$\tilde{s}^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \tilde{b}_0 - \tilde{b}_1 x_i)^2,$$
respectively.
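As a quick numerical illustration, these closed-form estimates can be computed directly. Below is a minimal sketch in Python (NumPy assumed); the data arrays `x` and `y` are made-up values, not from the lecture:

```python
import numpy as np

# Made-up illustrative data (not from the lecture).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)               # s_xx
sxy = np.sum((x - x.mean()) * (y - y.mean()))   # s_xy

b1_tilde = sxy / sxx                            # MLE of beta_1
b0_tilde = y.mean() - b1_tilde * x.mean()       # MLE of beta_0
resid = y - b0_tilde - b1_tilde * x
s2_tilde = np.sum(resid ** 2) / n               # MLE of sigma^2 (divisor n, not n-2)

print(b0_tilde, b1_tilde, s2_tilde)
```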
It can be verified that the Hessian matrix of second order partial derivatives of $\ln L$ with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ is negative definite at $\beta_0 = \tilde{b}_0$, $\beta_1 = \tilde{b}_1$ and $\sigma^2 = \tilde{s}^2$, which ensures that the likelihood function is maximized at these values.

Note that the least squares and maximum likelihood estimates of $\beta_0$ and $\beta_1$ are identical when the disturbances are normally distributed, i.e., $\tilde{b}_0 = b_0$ and $\tilde{b}_1 = b_1$. The least squares and maximum likelihood estimates of $\sigma^2$ are different. In fact, the least squares estimate of $\sigma^2$ is
$$s^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2,$$
so that it is related to the maximum likelihood estimate as
$$\tilde{s}^2 = \frac{n-2}{n}\,s^2.$$
Thus $\tilde{b}_0$ and $\tilde{b}_1$ are unbiased estimators of $\beta_0$ and $\beta_1$, whereas $\tilde{s}^2$ is a biased estimate of $\sigma^2$, but it is asymptotically unbiased. The variances of $\tilde{b}_0$ and $\tilde{b}_1$ are the same as those of $b_0$ and $b_1$ respectively, but the mean squared errors satisfy $MSE(\tilde{s}^2) < Var(s^2)$.
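The bias statement and the relation $\tilde{s}^2 = \frac{n-2}{n}s^2$ can be checked by simulation. Here is a minimal Monte Carlo sketch in Python; the true values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma^2 = 4$ are arbitrary illustrative assumptions, not lecture values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0, beta1, sigma2 = 20, 1.0, 2.0, 4.0     # assumed true values (illustrative)
x = np.linspace(0.0, 10.0, n)
sxx = np.sum((x - x.mean()) ** 2)

s2_tilde_draws, s2_draws = [], []
for _ in range(10_000):
    y = beta0 + beta1 * x + rng.normal(0.0, np.sqrt(sigma2), n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    ss_res = np.sum((y - b0 - b1 * x) ** 2)
    s2_tilde_draws.append(ss_res / n)           # MLE: biased for sigma^2
    s2_draws.append(ss_res / (n - 2))           # least squares: unbiased

# Means should be close to (n-2)/n * sigma2 = 3.6 and sigma2 = 4.0, respectively.
print(np.mean(s2_tilde_draws), np.mean(s2_draws))
```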
Testing of hypotheses and confidence interval estimation for slope parameter
Now we consider the tests of hypothesis and confidence interval estimation for the slope parameter $\beta_1$ of the model under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.
Case 1: When $\sigma^2$ is known
Consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ $(i = 1, 2, \ldots, n)$. It is assumed that the $\varepsilon_i$'s are independent and identically distributed and follow $N(0, \sigma^2)$.

First we develop a test for the null hypothesis related to the slope parameter
$$H_0 : \beta_1 = \beta_{10},$$
where $\beta_{10}$ is some given constant.

Assuming $\sigma^2$ to be known, we know that $E(b_1) = \beta_1$, $Var(b_1) = \dfrac{\sigma^2}{s_{xx}}$, and $b_1$ is a linear combination of the normally distributed $y_i$'s, so
$$b_1 \sim N\left(\beta_1, \frac{\sigma^2}{s_{xx}}\right),$$
and so the following statistic can be constructed:
$$Z_1 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}},$$
which is distributed as $N(0, 1)$ when $H_0$ is true.
The confidence interval for $\beta_1$ can be obtained using the $Z_1$ statistic as follows:
$$P\left[-z_{\alpha/2} \le Z_1 \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[-z_{\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}} \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}} \le \beta_1 \le b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval for $\beta_1$ is
$$\left(b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}},\; b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right),$$
where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the $N(0,1)$ distribution.
A decision rule to test $H_1 : \beta_1 \ne \beta_{10}$ can be framed as follows:

Reject $H_0$ if $|Z_1| > z_{\alpha/2}$,

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the normal distribution. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
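A minimal sketch of this test and interval in Python (NumPy and SciPy assumed); the data, the known variance $\sigma^2 = 0.04$ and the hypothesized value $\beta_{10} = 2$ are made-up illustrations:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
sigma2 = 0.04                                    # sigma^2 assumed known here
beta10, alpha = 2.0, 0.05                        # hypothesized slope, level

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx

z1 = (b1 - beta10) / np.sqrt(sigma2 / sxx)       # Z_1 statistic
z_crit = stats.norm.ppf(1 - alpha / 2)           # z_{alpha/2}, upper alpha/2 point
reject = abs(z1) > z_crit                        # two-sided decision rule

half = z_crit * np.sqrt(sigma2 / sxx)
print(reject, (b1 - half, b1 + half))            # decision and 95% CI for beta_1
```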
Case 2: When $\sigma^2$ is unknown
When $\sigma^2$ is unknown, we proceed as follows. We know that
$$E\left(\frac{SS_{res}}{n-2}\right) = \sigma^2$$
and
$$\frac{SS_{res}}{\sigma^2} \sim \chi^2_{n-2}.$$
Further, $SS_{res}/\sigma^2$ and $b_1$ are independently distributed. This result will be proved formally later in the module on multiple linear regression. It also follows from the result that under the normal distribution, the maximum likelihood estimates, viz., the sample mean (estimator of the population mean) and the sample variance (estimator of the population variance), are independently distributed, so $b_1$ and $s^2$ are also independently distributed.
Thus the following statistic can be constructed:
$$t_0 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{s^2}{s_{xx}}}} = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{SS_{res}}{(n-2)\,s_{xx}}}},$$
which follows a $t$-distribution with $(n - 2)$ degrees of freedom, denoted as $t_{n-2}$, when $H_0$ is true.
A decision rule to test $H_1 : \beta_1 \ne \beta_{10}$ is to

reject $H_0$ if $|t_0| > t_{n-2,\,\alpha/2}$,

where $t_{n-2,\,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n - 2)$ degrees of freedom. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
The confidence interval of $\beta_1$ can be obtained using the $t_0$ statistic as follows:
$$P\left[-t_{n-2,\,\alpha/2} \le t_0 \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[-t_{n-2,\,\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{SS_{res}}{(n-2)\,s_{xx}}}} \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_1 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}} \le \beta_1 \le b_1 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval of $\beta_1$ is
$$\left(b_1 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}},\; b_1 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}}\right).$$
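A minimal Python sketch of the unknown-$\sigma^2$ case (NumPy and SciPy assumed; the data and $\beta_{10} = 2$ are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
beta10, alpha = 2.0, 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)          # SS_res

se_b1 = np.sqrt(ss_res / ((n - 2) * sxx))        # estimated standard error of b_1
t0 = (b1 - beta10) / se_b1
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)    # t_{n-2, alpha/2}

print(abs(t0) > t_crit,                          # reject H_0?
      (b1 - t_crit * se_b1, b1 + t_crit * se_b1))  # 95% CI for beta_1
```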
Testing of hypotheses and confidence interval estimation for intercept term
Now, we consider the tests of hypothesis and confidence interval estimation for the intercept term $\beta_0$ under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.
Case 1: When $\sigma^2$ is known
Suppose the null hypothesis under consideration is
$$H_0 : \beta_0 = \beta_{00},$$
where $\sigma^2$ is known. Then using the results that $E(b_0) = \beta_0$, $Var(b_0) = \sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)$ and $b_0$ is a linear combination of normally distributed random variables, the following statistic
$$Z_0 = \frac{b_0 - \beta_{00}}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}}$$
has a $N(0, 1)$ distribution when $H_0$ is true.

A decision rule to test $H_1 : \beta_0 \ne \beta_{00}$ can be framed as follows:

Reject $H_0$ if $|Z_0| > z_{\alpha/2}$,

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the normal distribution. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
The confidence interval for $\beta_0$ when $\sigma^2$ is known can be derived using the $Z_0$ statistic as follows:
$$P\left[-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[-z_{\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval of $\beta_0$ is
$$\left(b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
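A minimal Python sketch of this case (NumPy and SciPy assumed; the data, the known $\sigma^2 = 0.04$ and $\beta_{00} = 0$ are illustrative assumptions):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
sigma2 = 0.04                                    # sigma^2 assumed known here
beta00, alpha = 0.0, 0.05                        # hypothesized intercept, level
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

se_b0 = np.sqrt(sigma2 * (1 / n + x.mean() ** 2 / sxx))  # sd of b_0
z0 = (b0 - beta00) / se_b0
z_crit = stats.norm.ppf(1 - alpha / 2)

print(abs(z0) > z_crit,                          # reject H_0?
      (b0 - z_crit * se_b0, b0 + z_crit * se_b0))  # 95% CI for beta_0
```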
Case 2: When $\sigma^2$ is unknown
When $\sigma^2$ is unknown, the following statistic is constructed:
$$t_0 = \frac{b_0 - \beta_{00}}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}},$$
which follows a $t$-distribution with $(n - 2)$ degrees of freedom, i.e., $t_{n-2}$, when $H_0$ is true.

A decision rule to test $H_1 : \beta_0 \ne \beta_{00}$ is as follows:

Reject $H_0$ whenever $|t_0| > t_{n-2,\,\alpha/2}$,

where $t_{n-2,\,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n - 2)$ degrees of freedom. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
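A minimal Python sketch of this statistic and decision rule (NumPy and SciPy assumed; the data and $\beta_{00} = 0$ are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
beta00, alpha = 0.0, 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt((ss_res / (n - 2)) * (1 / n + x.mean() ** 2 / sxx))
t0 = (b0 - beta00) / se_b0
print(abs(t0) > stats.t.ppf(1 - alpha / 2, df=n - 2))  # reject H_0?
```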
Confidence interval for $\sigma^2$

A confidence interval for $\sigma^2$ can also be derived as follows. Since $SS_{res}/\sigma^2 \sim \chi^2_{n-2}$, consider
$$P\left[\chi^2_{n-2,\,1-\alpha/2} \le \frac{SS_{res}}{\sigma^2} \le \chi^2_{n-2,\,\alpha/2}\right] = 1 - \alpha,$$
or equivalently
$$P\left[\frac{SS_{res}}{\chi^2_{n-2,\,\alpha/2}} \le \sigma^2 \le \frac{SS_{res}}{\chi^2_{n-2,\,1-\alpha/2}}\right] = 1 - \alpha.$$
The corresponding $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is
$$\left(\frac{SS_{res}}{\chi^2_{n-2,\,\alpha/2}},\; \frac{SS_{res}}{\chi^2_{n-2,\,1-\alpha/2}}\right).$$
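A minimal Python sketch of this interval (NumPy and SciPy assumed; data illustrative). Note that $\chi^2_{n-2,\,\alpha/2}$ here denotes the upper $\alpha/2$ point, which corresponds to `chi2.ppf(1 - alpha/2, df=n - 2)` in SciPy:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
alpha = 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

lower = ss_res / stats.chi2.ppf(1 - alpha / 2, df=n - 2)  # SS_res / chi2_{n-2, alpha/2}
upper = ss_res / stats.chi2.ppf(alpha / 2, df=n - 2)      # SS_res / chi2_{n-2, 1-alpha/2}
print((lower, upper))                             # 95% CI for sigma^2
```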
The $100(1-\alpha)\%$ confidence interval of $\beta_0$ can be obtained as follows. Consider
$$P\left[-t_{n-2,\,\alpha/2} \le t_0 \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[-t_{n-2,\,\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_0 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$
The $100(1-\alpha)\%$ confidence interval for $\beta_0$ is
$$\left(b_0 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
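A minimal Python sketch for this interval (NumPy and SciPy assumed; data illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
alpha = 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt((ss_res / (n - 2)) * (1 / n + x.mean() ** 2 / sxx))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print((b0 - t_crit * se_b0, b0 + t_crit * se_b0))  # 95% CI for beta_0
```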