Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
LINEAR REGRESSION ANALYSIS
MODULE II
Lecture - 4
Simple Linear Regression Analysis
Maximum likelihood estimation
We assume that the $\varepsilon_i$'s $(i = 1, 2, \ldots, n)$ are independent and identically distributed, following a normal distribution $N(0, \sigma^2)$. Now we use the method of maximum likelihood to estimate the parameters of the linear regression model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n),$$
so the observations $y_i$ $(i = 1, 2, \ldots, n)$ are independently distributed with $y_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$ for all $i = 1, 2, \ldots, n$. The likelihood function of the given observations $(x_i, y_i)$ and unknown parameters $\beta_0, \beta_1$ and $\sigma^2$ is
$$L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \left(\frac{1}{2\pi\sigma^2}\right)^{1/2} \exp\left[-\frac{1}{2\sigma^2}(y_i - \beta_0 - \beta_1 x_i)^2\right].$$
The maximum likelihood estimates of $\beta_0, \beta_1$ and $\sigma^2$ can be obtained by maximizing $L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$, or equivalently $\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$, where
$$\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = -\left(\frac{n}{2}\right)\ln 2\pi - \left(\frac{n}{2}\right)\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2.$$
The normal equations are obtained by partial differentiation of the log-likelihood with respect to $\beta_0, \beta_1$ and $\sigma^2$ and equating them to zero:
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0,$$
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)\,x_i = 0$$
and
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i)^2 = 0.$$
The solution of these normal equations gives the maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ as
$$\tilde{b}_1 = \frac{s_{xy}}{s_{xx}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2},$$
$$\tilde{b}_0 = \bar{y} - \tilde{b}_1 \bar{x}$$
and
$$\tilde{s}^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - \tilde{b}_0 - \tilde{b}_1 x_i)^2,$$
respectively.
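As a quick numerical illustration, these closed-form estimates can be computed directly. Below is a minimal sketch in Python (NumPy assumed); the data arrays `x` and `y` are made-up values, not from the lecture:

```python
import numpy as np

# Made-up illustrative data (not from the lecture).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)               # s_xx
sxy = np.sum((x - x.mean()) * (y - y.mean()))   # s_xy

b1_tilde = sxy / sxx                            # MLE of beta_1
b0_tilde = y.mean() - b1_tilde * x.mean()       # MLE of beta_0
resid = y - b0_tilde - b1_tilde * x
s2_tilde = np.sum(resid ** 2) / n               # MLE of sigma^2 (divisor n, not n-2)

print(b0_tilde, b1_tilde, s2_tilde)
```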
It can be verified that the Hessian matrix of second order partial derivatives of $\ln L$ with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ is negative definite at $\beta_0 = \tilde{b}_0$, $\beta_1 = \tilde{b}_1$ and $\sigma^2 = \tilde{s}^2$, which ensures that the likelihood function is maximized at these values.

Note that the least squares and maximum likelihood estimates of $\beta_0$ and $\beta_1$ are identical when the disturbances are normally distributed, i.e., $\tilde{b}_0 = b_0$ and $\tilde{b}_1 = b_1$. The least squares and maximum likelihood estimates of $\sigma^2$ are different. In fact, the least squares estimate of $\sigma^2$ is
$$s^2 = \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2,$$
so that it is related to the maximum likelihood estimate as
$$\tilde{s}^2 = \frac{n-2}{n}\,s^2.$$
Thus $\tilde{b}_0$ and $\tilde{b}_1$ are unbiased estimators of $\beta_0$ and $\beta_1$, whereas $\tilde{s}^2$ is a biased estimate of $\sigma^2$, but it is asymptotically unbiased. The variances of $\tilde{b}_0$ and $\tilde{b}_1$ are the same as those of $b_0$ and $b_1$ respectively, but the mean squared errors satisfy $MSE(\tilde{s}^2) < Var(s^2)$.
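The bias statement and the relation $\tilde{s}^2 = \frac{n-2}{n}s^2$ can be checked by simulation. Here is a minimal Monte Carlo sketch in Python; the true values $\beta_0 = 1$, $\beta_1 = 2$, $\sigma^2 = 4$ are arbitrary illustrative assumptions, not lecture values:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0, beta1, sigma2 = 20, 1.0, 2.0, 4.0     # assumed true values (illustrative)
x = np.linspace(0.0, 10.0, n)
sxx = np.sum((x - x.mean()) ** 2)

s2_tilde_draws, s2_draws = [], []
for _ in range(10_000):
    y = beta0 + beta1 * x + rng.normal(0.0, np.sqrt(sigma2), n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * x.mean()
    ss_res = np.sum((y - b0 - b1 * x) ** 2)
    s2_tilde_draws.append(ss_res / n)           # MLE: biased for sigma^2
    s2_draws.append(ss_res / (n - 2))           # least squares: unbiased

# Means should be close to (n-2)/n * sigma2 = 3.6 and sigma2 = 4.0, respectively.
print(np.mean(s2_tilde_draws), np.mean(s2_draws))
```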
Testing of hypotheses and confidence interval estimation for slope parameter
Now we consider the tests of hypothesis and confidence interval estimation for the slope parameter $\beta_1$ of the model under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.
Case 1: When $\sigma^2$ is known
Consider the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ $(i = 1, 2, \ldots, n)$. It is assumed that the $\varepsilon_i$'s are independent and identically distributed and follow $N(0, \sigma^2)$.

First we develop a test for the null hypothesis related to the slope parameter
$$H_0 : \beta_1 = \beta_{10},$$
where $\beta_{10}$ is some given constant.

Assuming $\sigma^2$ to be known, we know that $E(b_1) = \beta_1$, $Var(b_1) = \dfrac{\sigma^2}{s_{xx}}$, and $b_1$ is a linear combination of the normally distributed $y_i$'s, so
$$b_1 \sim N\left(\beta_1, \frac{\sigma^2}{s_{xx}}\right),$$
and so the following statistic can be constructed:
$$Z_1 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}},$$
which is distributed as $N(0, 1)$ when $H_0$ is true.
The confidence interval for $\beta_1$ can be obtained using the $Z_1$ statistic as follows:
$$P\left[-z_{\alpha/2} \le Z_1 \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[-z_{\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{\sigma^2}{s_{xx}}}} \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}} \le \beta_1 \le b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval for $\beta_1$ is
$$\left(b_1 - z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}},\; b_1 + z_{\alpha/2}\sqrt{\frac{\sigma^2}{s_{xx}}}\right),$$
where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the $N(0,1)$ distribution.
A decision rule to test $H_1 : \beta_1 \ne \beta_{10}$ can be framed as follows:

Reject $H_0$ if $|Z_1| > z_{\alpha/2}$,

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the normal distribution. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
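A minimal sketch of this test and interval in Python (NumPy and SciPy assumed); the data, the known variance $\sigma^2 = 0.04$ and the hypothesized value $\beta_{10} = 2$ are made-up illustrations:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
sigma2 = 0.04                                    # sigma^2 assumed known here
beta10, alpha = 2.0, 0.05                        # hypothesized slope, level

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx

z1 = (b1 - beta10) / np.sqrt(sigma2 / sxx)       # Z_1 statistic
z_crit = stats.norm.ppf(1 - alpha / 2)           # z_{alpha/2}, upper alpha/2 point
reject = abs(z1) > z_crit                        # two-sided decision rule

half = z_crit * np.sqrt(sigma2 / sxx)
print(reject, (b1 - half, b1 + half))            # decision and 95% CI for beta_1
```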
Case 2: When $\sigma^2$ is unknown
When $\sigma^2$ is unknown, we proceed as follows. We know that
$$E\left(\frac{SS_{res}}{n-2}\right) = \sigma^2$$
and
$$\frac{SS_{res}}{\sigma^2} \sim \chi^2_{n-2}.$$
Further, $SS_{res}/\sigma^2$ and $b_1$ are independently distributed. This result will be proved formally later in the module on multiple linear regression. It also follows from the result that under the normal distribution, the maximum likelihood estimates, viz., the sample mean (estimator of the population mean) and the sample variance (estimator of the population variance), are independently distributed, so $b_1$ and $s^2$ are also independently distributed.
Thus the following statistic can be constructed:
$$t_0 = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{s^2}{s_{xx}}}} = \frac{b_1 - \beta_{10}}{\sqrt{\dfrac{SS_{res}}{(n-2)\,s_{xx}}}},$$
which follows a $t$-distribution with $(n - 2)$ degrees of freedom, denoted as $t_{n-2}$, when $H_0$ is true.
A decision rule to test $H_1 : \beta_1 \ne \beta_{10}$ is to

reject $H_0$ if $|t_0| > t_{n-2,\,\alpha/2}$,

where $t_{n-2,\,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n - 2)$ degrees of freedom. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
The confidence interval of $\beta_1$ can be obtained using the $t_0$ statistic as follows:
$$P\left[-t_{n-2,\,\alpha/2} \le t_0 \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[-t_{n-2,\,\alpha/2} \le \frac{b_1 - \beta_1}{\sqrt{\dfrac{SS_{res}}{(n-2)\,s_{xx}}}} \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_1 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}} \le \beta_1 \le b_1 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval of $\beta_1$ is
$$\left(b_1 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}},\; b_1 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{(n-2)\,s_{xx}}}\right).$$
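A minimal Python sketch of the unknown-$\sigma^2$ case (NumPy and SciPy assumed; the data and $\beta_{10} = 2$ are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
beta10, alpha = 2.0, 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)          # SS_res

se_b1 = np.sqrt(ss_res / ((n - 2) * sxx))        # estimated standard error of b_1
t0 = (b1 - beta10) / se_b1
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)    # t_{n-2, alpha/2}

print(abs(t0) > t_crit,                          # reject H_0?
      (b1 - t_crit * se_b1, b1 + t_crit * se_b1))  # 95% CI for beta_1
```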
Testing of hypotheses and confidence interval estimation for intercept term
Now, we consider the tests of hypothesis and confidence interval estimation for the intercept term $\beta_0$ under two cases, viz., when $\sigma^2$ is known and when $\sigma^2$ is unknown.
Case 1: When $\sigma^2$ is known
Suppose the null hypothesis under consideration is
$$H_0 : \beta_0 = \beta_{00},$$
where $\sigma^2$ is known. Then using the results that $E(b_0) = \beta_0$, $Var(b_0) = \sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)$ and $b_0$ is a linear combination of normally distributed random variables, the following statistic
$$Z_0 = \frac{b_0 - \beta_{00}}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}}$$
has a $N(0, 1)$ distribution when $H_0$ is true.

A decision rule to test $H_1 : \beta_0 \ne \beta_{00}$ can be framed as follows:

Reject $H_0$ if $|Z_0| > z_{\alpha/2}$,

where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the normal distribution. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
The confidence interval for $\beta_0$ when $\sigma^2$ is known can be derived using the $Z_0$ statistic as follows:
$$P\left[-z_{\alpha/2} \le Z_0 \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[-z_{\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\sigma^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le z_{\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$
So the $100(1-\alpha)\%$ confidence interval of $\beta_0$ is
$$\left(b_0 - z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + z_{\alpha/2}\sqrt{\sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
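A minimal Python sketch of this case (NumPy and SciPy assumed; the data, the known $\sigma^2 = 0.04$ and $\beta_{00} = 0$ are illustrative assumptions):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
sigma2 = 0.04                                    # sigma^2 assumed known here
beta00, alpha = 0.0, 0.05                        # hypothesized intercept, level
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

se_b0 = np.sqrt(sigma2 * (1 / n + x.mean() ** 2 / sxx))  # sd of b_0
z0 = (b0 - beta00) / se_b0
z_crit = stats.norm.ppf(1 - alpha / 2)

print(abs(z0) > z_crit,                          # reject H_0?
      (b0 - z_crit * se_b0, b0 + z_crit * se_b0))  # 95% CI for beta_0
```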
Case 2: When $\sigma^2$ is unknown
When $\sigma^2$ is unknown, the following statistic is constructed:
$$t_0 = \frac{b_0 - \beta_{00}}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}},$$
which follows a $t$-distribution with $(n - 2)$ degrees of freedom, i.e., $t_{n-2}$, when $H_0$ is true.

A decision rule to test $H_1 : \beta_0 \ne \beta_{00}$ is as follows:

Reject $H_0$ whenever $|t_0| > t_{n-2,\,\alpha/2}$,

where $t_{n-2,\,\alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n - 2)$ degrees of freedom. Similarly, the decision rule for a one sided alternative hypothesis can also be framed.
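A minimal Python sketch of this statistic and decision rule (NumPy and SciPy assumed; the data and $\beta_{00} = 0$ are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
beta00, alpha = 0.0, 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt((ss_res / (n - 2)) * (1 / n + x.mean() ** 2 / sxx))
t0 = (b0 - beta00) / se_b0
print(abs(t0) > stats.t.ppf(1 - alpha / 2, df=n - 2))  # reject H_0?
```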
Confidence interval for $\sigma^2$

A confidence interval for $\sigma^2$ can also be derived as follows. Since $SS_{res}/\sigma^2 \sim \chi^2_{n-2}$, consider
$$P\left[\chi^2_{n-2,\,1-\alpha/2} \le \frac{SS_{res}}{\sigma^2} \le \chi^2_{n-2,\,\alpha/2}\right] = 1 - \alpha,$$
or equivalently
$$P\left[\frac{SS_{res}}{\chi^2_{n-2,\,\alpha/2}} \le \sigma^2 \le \frac{SS_{res}}{\chi^2_{n-2,\,1-\alpha/2}}\right] = 1 - \alpha.$$
The corresponding $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is
$$\left(\frac{SS_{res}}{\chi^2_{n-2,\,\alpha/2}},\; \frac{SS_{res}}{\chi^2_{n-2,\,1-\alpha/2}}\right).$$
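A minimal Python sketch of this interval (NumPy and SciPy assumed; data illustrative). Note that $\chi^2_{n-2,\,\alpha/2}$ here denotes the upper $\alpha/2$ point, which corresponds to `chi2.ppf(1 - alpha/2, df=n - 2)` in SciPy:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
alpha = 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

lower = ss_res / stats.chi2.ppf(1 - alpha / 2, df=n - 2)  # SS_res / chi2_{n-2, alpha/2}
upper = ss_res / stats.chi2.ppf(alpha / 2, df=n - 2)      # SS_res / chi2_{n-2, 1-alpha/2}
print((lower, upper))                             # 95% CI for sigma^2
```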
The $100(1-\alpha)\%$ confidence interval of $\beta_0$ can be obtained as follows. Consider
$$P\left[-t_{n-2,\,\alpha/2} \le t_0 \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[-t_{n-2,\,\alpha/2} \le \frac{b_0 - \beta_0}{\sqrt{\dfrac{SS_{res}}{n-2}\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}}\right)}} \le t_{n-2,\,\alpha/2}\right] = 1 - \alpha$$
$$P\left[b_0 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)} \le \beta_0 \le b_0 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right] = 1 - \alpha.$$
The $100(1-\alpha)\%$ confidence interval for $\beta_0$ is
$$\left(b_0 - t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)},\; b_0 + t_{n-2,\,\alpha/2}\sqrt{\frac{SS_{res}}{n-2}\left(\frac{1}{n} + \frac{\bar{x}^2}{s_{xx}}\right)}\right).$$
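A minimal Python sketch for this interval (NumPy and SciPy assumed; data illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # illustrative data
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
alpha = 0.05
n = len(y)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
ss_res = np.sum((y - b0 - b1 * x) ** 2)

se_b0 = np.sqrt((ss_res / (n - 2)) * (1 / n + x.mean() ** 2 / sxx))
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
print((b0 - t_crit * se_b0, b0 + t_crit * se_b0))  # 95% CI for beta_0
```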