Chapter 2
Simple Linear Regression Analysis
where $y$ is termed as the dependent or study variable and $X$ is termed as the independent or explanatory variable. The terms $\beta_0$ and $\beta_1$ are the parameters of the model. The parameter $\beta_0$ is termed as the intercept term and the parameter $\beta_1$ is termed as the slope parameter. These parameters are usually called regression coefficients. The unobservable error component $\varepsilon$ accounts for the failure of the data to lie on the straight line and represents the difference between the true and observed realizations of $y$. There can be several reasons for such a difference, e.g., the effect of all deleted variables in the model, variables may be qualitative, inherent randomness in the observations, etc. We assume that $\varepsilon$ is observed as an independent and identically distributed random variable with mean zero and constant variance $\sigma^2$. Later, we will additionally assume that $\varepsilon$ is normally distributed.
$$\text{Var}(y \mid x) = \sigma^2.$$
When the values of $\beta_0$, $\beta_1$ and $\sigma^2$ are known, the model is completely described. The parameters $\beta_0$, $\beta_1$ and $\sigma^2$ are generally unknown in practice and $\varepsilon$ is unobserved. The determination of the statistical model $y = \beta_0 + \beta_1 X + \varepsilon$ depends on the determination (i.e., estimation) of $\beta_0$, $\beta_1$ and $\sigma^2$. In order to know the values of these parameters, $n$ pairs of observations $(x_i, y_i)\ (i = 1, \ldots, n)$ on $(X, y)$ are observed/collected and are used to determine these unknown parameters.
Various methods of estimation can be used to determine the estimates of the parameters. Among them, the
methods of least squares and maximum likelihood are the popular methods of estimation.
The principle of least squares estimates the parameters $\beta_0$ and $\beta_1$ by minimizing the sum of squares of the differences between the observations and the line in the scatter diagram. Such an idea can be viewed from different perspectives. When the vertical differences between the observations and the line in the scatter diagram are considered and their sum of squares is minimized to obtain the estimates of $\beta_0$ and $\beta_1$, the method is known as direct regression.
[Figure: Direct regression — vertical deviations of the observations $(x_i, y_i)$ from the line $Y = \beta_0 + \beta_1 X$, with corresponding points $(X_i, Y_i)$ on the line.]
Alternatively, the sum of squares of the differences between the observations and the line in the horizontal direction in the scatter diagram can be minimized to obtain the estimates of $\beta_0$ and $\beta_1$. This is known as reverse (or inverse) regression.

[Figure: Reverse regression — horizontal deviations of the observations $(x_i, y_i)$ from the line $Y = \beta_0 + \beta_1 X$, with corresponding points $(X_i, Y_i)$ on the line.]
Instead of horizontal or vertical errors, if the sum of squares of the perpendicular distances between the observations and the line in the scatter diagram is minimized to obtain the estimates of $\beta_0$ and $\beta_1$, the method is known as major axis regression (or orthogonal regression).

[Figure: Major axis regression method — perpendicular distances of the observations $(x_i, y_i)$ from the line $Y = \beta_0 + \beta_1 X$.]
Instead of minimizing the distance, the area can also be minimized. The reduced major axis regression
method minimizes the sum of the areas of rectangles defined between the observed data points and the
nearest point on the line in the scatter diagram to obtain the estimates of regression coefficients. This is
shown in the following figure:
[Figure: Reduced major axis regression — rectangles formed between the observations $(x_i, y_i)$ and the corresponding points $(X_i, Y_i)$ on the line $Y = \beta_0 + \beta_1 X$.]
The method of least absolute deviation regression considers the sum of the absolute deviations of the observations from the line in the vertical direction in the scatter diagram, as in the case of direct regression, to obtain the estimates of $\beta_0$ and $\beta_1$.

No assumption is required about the form of the probability distribution of $\varepsilon_i$ in deriving the least squares estimates. For the purpose of deriving statistical inferences only, we assume that the $\varepsilon_i$'s are random variables with $E(\varepsilon_i) = 0$, $\text{Var}(\varepsilon_i) = \sigma^2$ and $\text{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j$ $(i, j = 1, 2, \ldots, n)$. This assumption is needed to find the mean, variance and other properties of the least squares estimates. The assumption that the $\varepsilon_i$'s are normally distributed is utilized while constructing the tests of hypotheses and confidence intervals for the parameters.
Based on these approaches, different estimates of $\beta_0$ and $\beta_1$ are obtained which have different statistical properties. Among them, the direct regression approach is the most popular. Generally, the direct regression estimates are referred to as the least squares estimates or ordinary least squares estimates.
The direct regression approach minimizes the sum of squares
$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2$$
with respect to $\beta_0$ and $\beta_1$. The partial derivatives are
$$\frac{\partial S(\beta_0, \beta_1)}{\partial \beta_0} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i),$$
$$\frac{\partial S(\beta_0, \beta_1)}{\partial \beta_1} = -2 \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) x_i.$$
The normal equations are obtained by setting
$$\frac{\partial S(\beta_0, \beta_1)}{\partial \beta_0} = 0, \qquad \frac{\partial S(\beta_0, \beta_1)}{\partial \beta_1} = 0.$$
The solutions of these two equations are called the direct regression estimators, or usually the ordinary least squares (OLS) estimators, of $\beta_0$ and $\beta_1$:
$$b_0 = \bar{y} - b_1 \bar{x}, \qquad b_1 = \frac{s_{xy}}{s_{xx}},$$
where
$$s_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \quad s_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i.$$
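As an illustration (not part of the original notes), a minimal Python sketch of these closed-form OLS estimators, assuming the data are available as NumPy arrays x and y:

```python
import numpy as np

def ols_simple(x, y):
    """Direct (ordinary) least squares estimates for y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)          # s_xx
    sxy = np.sum((x - xbar) * (y - ybar))  # s_xy
    b1 = sxy / sxx                         # slope estimate
    b0 = ybar - b1 * xbar                  # intercept estimate
    return b0, b1

# Example usage with made-up data:
# b0, b1 = ols_simple([1, 2, 3, 4], [2.1, 3.9, 6.2, 8.1])
```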
Further, we have
$$\frac{\partial^2 S(\beta_0, \beta_1)}{\partial \beta_0^2} = 2 \sum_{i=1}^{n} 1 = 2n,$$
$$\frac{\partial^2 S(\beta_0, \beta_1)}{\partial \beta_1^2} = 2 \sum_{i=1}^{n} x_i^2,$$
$$\frac{\partial^2 S(\beta_0, \beta_1)}{\partial \beta_0 \partial \beta_1} = 2 \sum_{i=1}^{n} x_i = 2 n \bar{x}.$$
The Hessian matrix, which is the matrix of second order partial derivatives, is in this case given as
$$H^* = \begin{pmatrix} \dfrac{\partial^2 S(\beta_0,\beta_1)}{\partial \beta_0^2} & \dfrac{\partial^2 S(\beta_0,\beta_1)}{\partial \beta_0 \partial \beta_1} \\ \dfrac{\partial^2 S(\beta_0,\beta_1)}{\partial \beta_0 \partial \beta_1} & \dfrac{\partial^2 S(\beta_0,\beta_1)}{\partial \beta_1^2} \end{pmatrix} = 2 \begin{pmatrix} n & n\bar{x} \\ n\bar{x} & \sum_{i=1}^{n} x_i^2 \end{pmatrix} = 2 \begin{pmatrix} \ell' \\ x' \end{pmatrix} (\ell \;\; x),$$
where $\ell = (1, 1, \ldots, 1)'$ is an $n$-vector of elements unity and $x = (x_1, \ldots, x_n)'$ is an $n$-vector of observations on $X$.
The matrix $H^*$ is positive definite if its determinant and the element in the first row and column of $H^*$ are positive. The determinant of $H^*$ is given by
$$|H^*| = 4\left( n \sum_{i=1}^{n} x_i^2 - n^2 \bar{x}^2 \right) = 4 n \sum_{i=1}^{n} (x_i - \bar{x})^2 \geq 0.$$
The case when $\sum_{i=1}^{n}(x_i - \bar{x})^2 = 0$ is not interesting because then all the observations are identical, i.e., $x_i = c$ (some constant). In such a case there is no relationship between $x$ and $y$ in the context of regression analysis. Since $\sum_{i=1}^{n}(x_i - \bar{x})^2 > 0$, therefore $|H^*| > 0$. So $H^*$ is positive definite for any $(\beta_0, \beta_1)$; therefore $S(\beta_0, \beta_1)$ attains its minimum at $(b_0, b_1)$.
The difference between the observed value $y_i$ and the fitted (or predicted) value $\hat{y}_i$ is called a residual, $e_i = y_i - \hat{y}_i$.
Unbiased property:
Note that $b_1 = \dfrac{s_{xy}}{s_{xx}}$ and $b_0 = \bar{y} - b_1 \bar{x}$ are linear combinations of $y_i$ $(i = 1, \ldots, n)$.
Therefore
$$b_1 = \sum_{i=1}^{n} k_i y_i$$
where $k_i = (x_i - \bar{x})/s_{xx}$. Note that $\sum_{i=1}^{n} k_i = 0$ and $\sum_{i=1}^{n} k_i x_i = 1$, so
$$E(b_1) = \sum_{i=1}^{n} k_i E(y_i) = \sum_{i=1}^{n} k_i (\beta_0 + \beta_1 x_i) = \beta_1.$$
Thus $b_1$ is an unbiased estimator of $\beta_1$. Similarly, using $E(\bar{y}) = \beta_0 + \beta_1 \bar{x}$ and $E(b_1) = \beta_1$,
$$E(b_0) = E(\bar{y} - b_1 \bar{x}) = \beta_0 + \beta_1 \bar{x} - \beta_1 \bar{x} = \beta_0,$$
so $b_0$ is an unbiased estimator of $\beta_0$.
Variances:
Using the assumption that the $y_i$'s are independently distributed, the variance of $b_1$ is
$$\text{Var}(b_1) = \sum_{i=1}^{n} k_i^2 \text{Var}(y_i) + \sum_{i \neq j} k_i k_j \text{Cov}(y_i, y_j) = \sigma^2 \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{s_{xx}^2} \qquad (\text{Cov}(y_i, y_j) = 0 \text{ as } y_1, \ldots, y_n \text{ are independent})$$
$$= \frac{\sigma^2 s_{xx}}{s_{xx}^2} = \frac{\sigma^2}{s_{xx}}.$$
The variance of $b_0$ is
$$\text{Var}(b_0) = \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right).$$
Covariance:
The covariance between $b_0$ and $b_1$ is
$$\text{Cov}(b_0, b_1) = -\frac{\bar{x}}{s_{xx}} \sigma^2.$$
It can further be shown that the ordinary least squares estimators $b_0$ and $b_1$ possess the minimum variance in the class of linear and unbiased estimators, so they are termed the Best Linear Unbiased Estimators (BLUE). This property is known as the Gauss-Markov theorem, which is discussed later for the multiple linear regression model.
The residual sum of squares is
$$SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - b_1 (x_i - \bar{x}) \right]^2$$
$$= \sum_{i=1}^{n} (y_i - \bar{y})^2 + b_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 - 2 b_1 \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
$$= s_{yy} + b_1^2 s_{xx} - 2 b_1 s_{xy} = s_{yy} - b_1^2 s_{xx} \qquad (\text{using } s_{xy} = b_1 s_{xx})$$
$$= s_{yy} - \frac{s_{xy}^2}{s_{xx}} = s_{yy} - b_1 s_{xy},$$
where $s_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ and $\bar{y} = \dfrac{1}{n} \sum_{i=1}^{n} y_i$.
Estimation of $\sigma^2$:
The estimator of $\sigma^2$ is obtained from the residual sum of squares as follows. Assuming that $y_i$ is normally distributed, it follows that
$$\frac{SS_{res}}{\sigma^2} \sim \chi^2(n-2).$$
Thus, using the result about the expectation of a chi-square random variable, we have
$$E(SS_{res}) = (n-2)\sigma^2.$$
Hence an unbiased estimator of $\sigma^2$ is
$$s^2 = \frac{SS_{res}}{n-2},$$
and the estimated variances of $b_0$ and $b_1$ are
$$\widehat{\text{Var}}(b_0) = s^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right) \quad \text{and} \quad \widehat{\text{Var}}(b_1) = \frac{s^2}{s_{xx}}.$$
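A small Python continuation (illustrative only, building on the hypothetical ols_simple sketch above) that computes the residual sum of squares, the unbiased estimate $s^2$ of $\sigma^2$, and the resulting standard errors of $b_0$ and $b_1$:

```python
import numpy as np

def ols_inference(x, y):
    """Residuals, s^2 = SSres/(n-2), and standard errors of b0, b1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    sxy = np.sum((x - xbar) * (y - ybar))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    resid = y - (b0 + b1 * x)          # e_i = y_i - yhat_i
    ss_res = np.sum(resid ** 2)        # SS_res
    s2 = ss_res / (n - 2)              # unbiased estimator of sigma^2
    se_b1 = np.sqrt(s2 / sxx)
    se_b0 = np.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))
    return b0, b1, s2, se_b0, se_b1
```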
It is observed that $\sum_{i=1}^{n} (y_i - \hat{y}_i) = 0$, so $\sum_{i=1}^{n} e_i = 0$. In the light of this property, $e_i$ can be regarded as an estimate of the unknown $\varepsilon_i$ $(i = 1, \ldots, n)$. This helps in verifying the different model assumptions on the basis of the observed residuals. Further,
(i) $\sum_{i=1}^{n} x_i e_i = 0$,
(ii) $\sum_{i=1}^{n} \hat{y}_i e_i = 0$,
(iii) $\sum_{i=1}^{n} y_i = \sum_{i=1}^{n} \hat{y}_i$, and
(iv) the fitted line always passes through the point $(\bar{x}, \bar{y})$.
Centered Model:
Sometimes it is useful to measure the independent variable around its mean. In such a case, the model $y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ has a centered version as follows:
$$y_i = \beta_0 + \beta_1 (x_i - \bar{x}) + \beta_1 \bar{x} + \varepsilon_i \quad (i = 1, 2, \ldots, n)$$
$$= \beta_0^* + \beta_1 (x_i - \bar{x}) + \varepsilon_i,$$
where $\beta_0^* = \beta_0 + \beta_1 \bar{x}$. The sum of squares to be minimized is now
$$S(\beta_0^*, \beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left[ y_i - \beta_0^* - \beta_1 (x_i - \bar{x}) \right]^2.$$
Now solving
$$\frac{\partial S(\beta_0^*, \beta_1)}{\partial \beta_0^*} = 0, \qquad \frac{\partial S(\beta_0^*, \beta_1)}{\partial \beta_1} = 0,$$
we get the direct regression least squares estimates of $\beta_0^*$ and $\beta_1$ as
$$b_0^* = \bar{y} \quad \text{and} \quad b_1 = \frac{s_{xy}}{s_{xx}},$$
respectively.
Thus the form of the estimate of the slope parameter $\beta_1$ remains the same in the usual and centered models, whereas the form of the estimate of the intercept term changes between the two models.
Further, the Hessian matrix of the second order partial derivatives of $S(\beta_0^*, \beta_1)$ with respect to $\beta_0^*$ and $\beta_1$ is positive definite at $\beta_0^* = b_0^*$ and $\beta_1 = b_1$, which ensures that $S(\beta_0^*, \beta_1)$ is minimized at $\beta_0^* = b_0^*$ and $\beta_1 = b_1$.
Under the assumption that $E(\varepsilon_i) = 0$, $\text{Var}(\varepsilon_i) = \sigma^2$ and $\text{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j = 1, 2, \ldots, n$, it follows that
$$E(b_0^*) = \beta_0^*, \quad E(b_1) = \beta_1, \quad \text{Var}(b_0^*) = \frac{\sigma^2}{n}, \quad \text{Var}(b_1) = \frac{\sigma^2}{s_{xx}}.$$
In the centered form, the fitted model is
$$\hat{y} = \bar{y} + b_1 (x - \bar{x}).$$
No intercept term model:
Sometimes it is known from the subject matter that the regression line should pass through the origin, i.e., the model contains no intercept term and is of the form $y_i = \beta_1 x_i + \varepsilon_i$. For example, in analyzing the relationship between the velocity $(y)$ of a car and its acceleration $(X)$, the velocity is zero when the acceleration is zero.
Using the data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, the direct regression least squares estimate of $\beta_1$ is obtained by minimizing
$$S(\beta_1) = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2,$$
and solving
$$\frac{\partial S(\beta_1)}{\partial \beta_1} = 0$$
gives the estimator of $\beta_1$ as
$$b_1^* = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}.$$
The second order partial derivative of $S(\beta_1)$ with respect to $\beta_1$ at $\beta_1 = b_1^*$ is positive, which ensures that $b_1^*$ minimizes $S(\beta_1)$.
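A brief illustrative sketch (not from the notes) of the through-origin estimator, assuming NumPy arrays x and y:

```python
import numpy as np

def ols_through_origin(x, y):
    """Least squares slope for the no-intercept model y = b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1_star = np.sum(y * x) / np.sum(x ** 2)   # b1* = sum(y_i x_i) / sum(x_i^2)
    return b1_star
```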
Using the assumption that $E(\varepsilon_i) = 0$, $\text{Var}(\varepsilon_i) = \sigma^2$ and $\text{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i \neq j = 1, 2, \ldots, n$, the properties of $b_1^*$ can be derived as follows:
$$E(b_1^*) = \frac{\sum_{i=1}^{n} x_i E(y_i)}{\sum_{i=1}^{n} x_i^2} = \frac{\beta_1 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} x_i^2} = \beta_1,$$
so $b_1^*$ is an unbiased estimator of $\beta_1$. Its variance is
$$\text{Var}(b_1^*) = \frac{\sum_{i=1}^{n} x_i^2 \text{Var}(y_i)}{\left( \sum_{i=1}^{n} x_i^2 \right)^2} = \frac{\sigma^2 \sum_{i=1}^{n} x_i^2}{\left( \sum_{i=1}^{n} x_i^2 \right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}.$$
In this model, the estimator of $\sigma^2$ based on the residual sum of squares is
$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} y_i^2 - b_1^* \sum_{i=1}^{n} y_i x_i}{n-1}.$$
Maximum likelihood estimation:
Assume that the errors $\varepsilon_i$ $(i = 1, 2, \ldots, n)$ are independent and follow the normal distribution $N(0, \sigma^2)$. Now we use the method of maximum likelihood to estimate the parameters of the linear regression model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad (i = 1, 2, \ldots, n).$$
Under this assumption, the observations $y_i$ $(i = 1, 2, \ldots, n)$ are independently distributed as $N(\beta_0 + \beta_1 x_i, \sigma^2)$ for all $i = 1, 2, \ldots, n$.
The likelihood function of the given observations $(x_i, y_i)$ and unknown parameters $\beta_0$, $\beta_1$ and $\sigma^2$ is
$$L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \left( \frac{1}{2\pi\sigma^2} \right)^{1/2} \exp\left[ -\frac{1}{2\sigma^2} (y_i - \beta_0 - \beta_1 x_i)^2 \right].$$
The maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ can be obtained by maximizing $L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$ or, equivalently, $\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)$, where
$$\ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2) = -\frac{n}{2} \ln (2\pi) - \frac{n}{2} \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2.$$
The normal equations are obtained by partially differentiating the log-likelihood with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ and equating the derivatives to zero as follows:
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0,$$
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) x_i = 0$$
and
$$\frac{\partial \ln L(x_i, y_i; \beta_0, \beta_1, \sigma^2)}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = 0.$$
The solution of these normal equations gives the maximum likelihood estimates of $\beta_0$, $\beta_1$ and $\sigma^2$ as
$$\tilde{b}_0 = \bar{y} - \tilde{b}_1 \bar{x},$$
$$\tilde{b}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{s_{xy}}{s_{xx}}$$
and
$$\tilde{s}^2 = \frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{b}_0 - \tilde{b}_1 x_i)^2,$$
respectively.
It can be verified that the Hessian matrix of second order partial derivatives of $\ln L$ with respect to $\beta_0$, $\beta_1$ and $\sigma^2$ is negative definite at $\beta_0 = \tilde{b}_0$, $\beta_1 = \tilde{b}_1$ and $\sigma^2 = \tilde{s}^2$, which ensures that the likelihood function is maximized at these values.
Note that the least squares and maximum likelihood estimates of $\beta_0$ and $\beta_1$ are identical. The least squares and maximum likelihood estimates of $\sigma^2$ are different. In fact, the least squares estimate of $\sigma^2$ is
$$s^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
so that it is related to the maximum likelihood estimate as
$$\tilde{s}^2 = \frac{n-2}{n} s^2.$$
Thus $\tilde{b}_0$ and $\tilde{b}_1$ are unbiased estimators of $\beta_0$ and $\beta_1$, whereas $\tilde{s}^2$ is a biased estimate of $\sigma^2$, but it is asymptotically unbiased. The variances of $\tilde{b}_0$ and $\tilde{b}_1$ are the same as those of $b_0$ and $b_1$ respectively, but $\text{Var}(\tilde{s}^2) \neq \text{Var}(s^2)$.
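As a quick illustrative check (not part of the notes), the following Python snippet contrasts the two variance estimates from a set of fitted residuals; it simply encodes $s^2 = SS_{res}/(n-2)$ and $\tilde{s}^2 = SS_{res}/n$ and their relation $\tilde{s}^2 = \frac{n-2}{n} s^2$:

```python
import numpy as np

def variance_estimates(residuals):
    """Least squares (unbiased) and maximum likelihood estimates of sigma^2."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    ss_res = np.sum(e ** 2)
    s2_ls = ss_res / (n - 2)       # least squares / unbiased estimate
    s2_ml = ss_res / n             # maximum likelihood estimate (biased)
    # relation: s2_ml == ((n - 2) / n) * s2_ls
    return s2_ls, s2_ml
```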
First we develop a test for the null hypothesis related to the slope parameter,
$$H_0: \beta_1 = \beta_{10}.$$
Assuming $\sigma^2$ to be known, we use the results that $E(b_1) = \beta_1$, $\text{Var}(b_1) = \dfrac{\sigma^2}{s_{xx}}$ and that $b_1$ is a linear combination of normally distributed random variables, so
$$b_1 \sim N\left( \beta_1, \frac{\sigma^2}{s_{xx}} \right).$$
Thus the test statistic under $H_0$ is
$$Z_1 = \frac{b_1 - \beta_{10}}{\sqrt{\sigma^2 / s_{xx}}} \sim N(0, 1).$$
Reject $H_0$ if $|Z_1| > z_{\alpha/2}$. Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for $\beta_1$ can be obtained using the $Z_1$ statistic as follows:
$$P\left[ -z_{\alpha/2} \leq Z_1 \leq z_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ -z_{\alpha/2} \leq \frac{b_1 - \beta_1}{\sqrt{\sigma^2 / s_{xx}}} \leq z_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ b_1 - z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \leq \beta_1 \leq b_1 + z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \right] = 1 - \alpha.$$
So the 100(1 − α)% confidence interval for $\beta_1$ is
$$\left[ b_1 - z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}},\; b_1 + z_{\alpha/2} \sqrt{\frac{\sigma^2}{s_{xx}}} \right].$$
When $\sigma^2$ is unknown, it is replaced by its estimate $\hat{\sigma}^2 = SS_{res}/(n-2)$ and the test statistic becomes
$$t_0 = \frac{b_1 - \beta_{10}}{\sqrt{\hat{\sigma}^2 / s_{xx}}},$$
which follows a $t$-distribution with $(n-2)$ degrees of freedom when $H_0$ is true. The decision rule is to reject $H_0$ if $|t_0| > t_{n-2, \alpha/2}$, where $t_{n-2, \alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n-2)$ degrees of freedom. Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for $\beta_1$ can be obtained using the $t_0$ statistic as follows.
Consider
$$P\left[ -t_{\alpha/2} \leq t_0 \leq t_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ -t_{\alpha/2} \leq \frac{b_1 - \beta_1}{\sqrt{\hat{\sigma}^2 / s_{xx}}} \leq t_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ b_1 - t_{\alpha/2} \sqrt{\frac{\hat{\sigma}^2}{s_{xx}}} \leq \beta_1 \leq b_1 + t_{\alpha/2} \sqrt{\frac{\hat{\sigma}^2}{s_{xx}}} \right] = 1 - \alpha.$$
So the 100(1 − α)% confidence interval for $\beta_1$ is
$$\left[ b_1 - t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{(n-2) s_{xx}}},\; b_1 + t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{(n-2) s_{xx}}} \right].$$
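A hypothetical Python helper (not from the notes) that computes this t-based confidence interval for $\beta_1$, relying on scipy.stats.t for the percentage point:

```python
import numpy as np
from scipy import stats

def ci_slope(x, y, alpha=0.05):
    """100(1 - alpha)% t-based confidence interval for the slope beta_1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - ybar)) / sxx
    b0 = ybar - b1 * xbar
    ss_res = np.sum((y - b0 - b1 * x) ** 2)
    se_b1 = np.sqrt(ss_res / ((n - 2) * sxx))
    tcrit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return b1 - tcrit * se_b1, b1 + tcrit * se_b1
```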
Now consider the test of $H_0: \beta_0 = \beta_{00}$ when $\sigma^2$ is known. Using the results that $E(b_0) = \beta_0$, $\text{Var}(b_0) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)$ and that $b_0$ is a linear combination of normally distributed random variables, the statistic
$$Z_0 = \frac{b_0 - \beta_{00}}{\sqrt{\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}}$$
follows the $N(0, 1)$ distribution when $H_0$ is true. Reject $H_0$ if $|Z_0| > z_{\alpha/2}$, where $z_{\alpha/2}$ is the $\alpha/2$ percentage point of the standard normal distribution. Similarly, the decision rule for a one-sided alternative hypothesis can also be framed.
The 100(1 − α)% confidence interval for $\beta_0$ when $\sigma^2$ is known can be derived using the $Z_0$ statistic as follows:
$$P\left[ -z_{\alpha/2} \leq Z_0 \leq z_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ -z_{\alpha/2} \leq \frac{b_0 - \beta_0}{\sqrt{\sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}} \leq z_{\alpha/2} \right] = 1 - \alpha$$
$$P\left[ b_0 - z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \leq \beta_0 \leq b_0 + z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right] = 1 - \alpha.$$
So the 100(1 − α)% confidence interval for $\beta_0$ is
$$\left[ b_0 - z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)},\; b_0 + z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right].$$
When $\sigma^2$ is unknown, the test statistic becomes
$$t_0 = \frac{b_0 - \beta_{00}}{\sqrt{\dfrac{SS_{res}}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}},$$
which follows a $t$-distribution with $(n-2)$ degrees of freedom when $H_0$ is true. Reject $H_0$ if $|t_0| > t_{n-2, \alpha/2}$, where $t_{n-2, \alpha/2}$ is the $\alpha/2$ percentage point of the $t$-distribution with $(n-2)$ degrees of freedom. Similarly, the decision rule for a one-sided alternative hypothesis can also be framed. The 100(1 − α)% confidence interval for $\beta_0$ in this case is obtained as follows.
Consider
$$P\left[ -t_{n-2, \alpha/2} \leq t_0 \leq t_{n-2, \alpha/2} \right] = 1 - \alpha$$
$$P\left[ -t_{n-2, \alpha/2} \leq \frac{b_0 - \beta_0}{\sqrt{\dfrac{SS_{res}}{n-2} \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{s_{xx}} \right)}} \leq t_{n-2, \alpha/2} \right] = 1 - \alpha$$
$$P\left[ b_0 - t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \leq \beta_0 \leq b_0 + t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right] = 1 - \alpha.$$
So the 100(1 − α)% confidence interval for $\beta_0$ is
$$\left[ b_0 - t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)},\; b_0 + t_{n-2, \alpha/2} \sqrt{\frac{SS_{res}}{n-2} \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right)} \right].$$
A confidence interval for $\sigma^2$ can also be derived using the fact that $SS_{res}/\sigma^2 \sim \chi^2_{n-2}$. Thus
$$P\left[ \chi^2_{n-2, \alpha/2} \leq \frac{SS_{res}}{\sigma^2} \leq \chi^2_{n-2, 1-\alpha/2} \right] = 1 - \alpha$$
$$P\left[ \frac{SS_{res}}{\chi^2_{n-2, 1-\alpha/2}} \leq \sigma^2 \leq \frac{SS_{res}}{\chi^2_{n-2, \alpha/2}} \right] = 1 - \alpha.$$
The corresponding 100(1 − α)% confidence interval for $\sigma^2$ is
$$\left[ \frac{SS_{res}}{\chi^2_{n-2, 1-\alpha/2}},\; \frac{SS_{res}}{\chi^2_{n-2, \alpha/2}} \right].$$
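A short illustrative Python function (assumed names, not from the notes) for this chi-square based interval, using scipy.stats.chi2.ppf:

```python
import numpy as np
from scipy import stats

def ci_sigma2(residuals, alpha=0.05):
    """100(1 - alpha)% confidence interval for sigma^2 from the residuals."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    ss_res = np.sum(e ** 2)
    lower = ss_res / stats.chi2.ppf(1 - alpha / 2, df=n - 2)
    upper = ss_res / stats.chi2.ppf(alpha / 2, df=n - 2)
    return lower, upper
```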
Joint confidence region for $\beta_0$ and $\beta_1$:
A joint confidence region for $\beta_0$ and $\beta_1$ can also be constructed so that we have 100(1 − α)% confidence that both the estimates of $\beta_0$ and $\beta_1$ are simultaneously correct. Consider the centered version of the linear regression model,
$$y_i = \beta_0^* + \beta_1 (x_i - \bar{x}) + \varepsilon_i,$$
for which the ordinary least squares estimators of $\beta_0^*$ and $\beta_1$ are
$$b_0^* = \bar{y} \quad \text{and} \quad b_1 = \frac{s_{xy}}{s_{xx}},$$
respectively. The quantities
$$\frac{n (b_0^* - \beta_0^*)^2}{\sigma^2} \quad \text{and} \quad \frac{s_{xx} (b_1 - \beta_1)^2}{\sigma^2}$$
each follow a $\chi^2_1$ distribution and are also independently distributed because $b_0^*$ and $b_1$ are independently distributed. Consequently, the sum of these two,
$$\frac{n (b_0^* - \beta_0^*)^2}{\sigma^2} + \frac{s_{xx} (b_1 - \beta_1)^2}{\sigma^2} \sim \chi^2_2.$$
Since
$$\frac{SS_{res}}{\sigma^2} \sim \chi^2_{n-2}$$
and $SS_{res}$ is distributed independently of $b_0^*$ and $b_1$, the ratio
$$\frac{n-2}{2} \cdot \frac{Q_f}{SS_{res}} \sim F_{2, n-2},$$
where
$$Q_f = n (b_0 - \beta_0)^2 + 2 \sum_{i=1}^{n} x_i (b_0 - \beta_0)(b_1 - \beta_1) + \sum_{i=1}^{n} x_i^2 (b_1 - \beta_1)^2.$$
Since
$$P\left[ \frac{n-2}{2} \cdot \frac{Q_f}{SS_{res}} \leq F_{2, n-2; 1-\alpha} \right] = 1 - \alpha$$
holds true for all values of $\beta_0$ and $\beta_1$, the 100(1 − α)% confidence region for $\beta_0$ and $\beta_1$ is
$$\frac{n-2}{2} \cdot \frac{Q_f}{SS_{res}} \leq F_{2, n-2; 1-\alpha}.$$
This confidence region is an ellipse which gives the 100(1 − α)% probability that $\beta_0$ and $\beta_1$ are contained within it.
Analysis of variance:
The technique of analysis of variance is usually used for testing hypotheses related to the equality of more than one parameter, such as population means or slope parameters. It is more meaningful in the case of the multiple regression model, where there is more than one slope parameter. The technique is discussed and illustrated here to convey the related basic concepts and fundamentals, which will be used in developing the analysis of variance in the next module on the multiple linear regression model, where there is more than one explanatory variable.
A test statistic for testing $H_0: \beta_1 = 0$ can also be formulated using the analysis of variance technique as follows.
Further consider
$$\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{y}) = \sum_{i=1}^{n} (y_i - \bar{y}) b_1 (x_i - \bar{x}) = b_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
Thus we have the decomposition
$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
The term $\sum_{i=1}^{n} (y_i - \bar{y})^2$ is called the sum of squares about the mean, or the corrected sum of squares of $y$ (i.e., $SS_{corrected}$).
The term $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ describes the deviation of the observations from the predicted values, viz., the residual sum of squares
$$SS_{res} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,$$
whereas the term $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ describes the proportion of variability explained by the regression,
$$SS_{reg} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2.$$
If all observations $y_i$ are located on a straight line, then $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 0$ and thus $SS_{corrected} = SS_{reg}$.
Note that $SS_{reg}$ is completely determined by $b_1$ and so has only one degree of freedom. The total sum of squares $s_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ has $(n-1)$ degrees of freedom due to the constraint $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$, and $SS_{res}$ has $(n-2)$ degrees of freedom.
All sums of squares are mutually independent and distributed as $\chi^2_{df}$ with $df$ degrees of freedom if the errors are normally distributed. The mean square due to regression is $MS_{reg} = SS_{reg}/1$ and the mean squared error is $MSE = SS_{res}/(n-2)$. The test statistic is
$$F_0 = \frac{MS_{reg}}{MSE}.$$
If $H_0: \beta_1 = 0$ is true, then $MS_{reg}$ and $MSE$ are independently distributed and thus
$$F_0 \sim F_{1, n-2}.$$
The decision rule is to reject $H_0$ if
$$F_0 > F_{1, n-2; 1-\alpha}$$
at the $\alpha$ level of significance. The test procedure can be described in the following analysis of variance table:

Source of variation     Sum of squares     Degrees of freedom     Mean square
Regression              $SS_{reg}$         $1$                    $MS_{reg}$
Residual                $SS_{res}$         $n-2$                  $MSE$
Total                   $s_{yy}$           $n-1$

with $F_0 = MS_{reg}/MSE$.
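A hypothetical Python sketch (names assumed, not from the notes) that assembles this ANOVA decomposition and the F statistic for $H_0: \beta_1 = 0$, with the p-value taken from scipy.stats.f:

```python
import numpy as np
from scipy import stats

def anova_simple_regression(x, y):
    """ANOVA F test of H0: beta_1 = 0 in simple linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - ybar)) / sxx
    b0 = ybar - b1 * xbar
    yhat = b0 + b1 * x
    ss_reg = np.sum((yhat - ybar) ** 2)      # SS_reg, 1 df
    ss_res = np.sum((y - yhat) ** 2)         # SS_res, n-2 df
    ms_reg = ss_reg / 1.0
    mse = ss_res / (n - 2)
    f0 = ms_reg / mse
    p_value = stats.f.sf(f0, 1, n - 2)       # P(F_{1, n-2} > F0)
    return f0, p_value
```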
Moreover, we have
$$b_1 = \frac{s_{xy}}{s_{xx}} = r_{xy} \sqrt{\frac{s_{yy}}{s_{xx}}}$$
and
$$SS_{reg} = s_{yy} - SS_{res} = \frac{s_{xy}^2}{s_{xx}} = b_1^2 s_{xx} = b_1 s_{xy}.$$
Goodness of fit:
Since the residuals measure the departure of the observations from the fitted line, a measure of the quality of the fitted model can be based on $SS_{res}$. When an intercept term is present in the model, a measure of goodness of fit of the model is given by
$$R^2 = 1 - \frac{SS_{res}}{s_{yy}} = \frac{SS_{reg}}{s_{yy}}.$$
This is known as the coefficient of determination. This measure is based on the idea that the total variation in the $y$'s, stated by $s_{yy}$, splits into a part explained by $SS_{reg}$ and an unexplained part contained in $SS_{res}$. The ratio $SS_{reg}/s_{yy}$ describes the proportion of variability that is explained by the regression in relation to the total variability of $y$. The ratio $SS_{res}/s_{yy}$ describes the proportion of variability that is not covered by the regression.
It can also be seen that $R^2 = r_{xy}^2$, where $r_{xy}$ is the simple correlation coefficient between $x$ and $y$. Clearly $0 \leq R^2 \leq 1$, so a value of $R^2$ closer to one indicates a better fit and a value of $R^2$ closer to zero indicates a poor fit.
Prediction of the average value of $y$:
Suppose we want to predict the value of $E(y)$ for a given value of $x = x_0$. Then the predictor is given by
$$\hat{E}(y \mid x_0) = \hat{\mu}_{y|x_0} = b_0 + b_1 x_0.$$
Predictive bias:
The prediction error is given by
$$\hat{\mu}_{y|x_0} - E(y \mid x_0) = b_0 + b_1 x_0 - E(\beta_0 + \beta_1 x_0) = b_0 + b_1 x_0 - (\beta_0 + \beta_1 x_0) = (b_0 - \beta_0) + (b_1 - \beta_1) x_0.$$
Then
$$E\left[ \hat{\mu}_{y|x_0} - E(y \mid x_0) \right] = E(b_0 - \beta_0) + E(b_1 - \beta_1) x_0 = 0 + 0 = 0.$$
Thus the predictor $\hat{\mu}_{y|x_0}$ is an unbiased predictor of $E(y \mid x_0)$.
Predictive variance:
The predictive variance of $\hat{\mu}_{y|x_0}$ is
$$PV(\hat{\mu}_{y|x_0}) = \text{Var}(b_0 + b_1 x_0) = \text{Var}\left[ \bar{y} + b_1 (x_0 - \bar{x}) \right]$$
$$= \text{Var}(\bar{y}) + (x_0 - \bar{x})^2 \text{Var}(b_1) + 2 (x_0 - \bar{x}) \text{Cov}(\bar{y}, b_1)$$
$$= \frac{\sigma^2}{n} + \frac{\sigma^2 (x_0 - \bar{x})^2}{s_{xx}} + 0$$
$$= \sigma^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right].$$
The estimate of the predictive variance is obtained by replacing $\sigma^2$ by $\hat{\sigma}^2 = MSE$:
$$\widehat{PV}(\hat{\mu}_{y|x_0}) = \hat{\sigma}^2 \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right] = MSE \left[ \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right].$$
The predictor $\hat{\mu}_{y|x_0}$ is a linear combination of normally distributed random variables, so it is also normally distributed as
$$\hat{\mu}_{y|x_0} \sim N\left( \beta_0 + \beta_1 x_0,\; PV(\hat{\mu}_{y|x_0}) \right).$$
So if $\sigma^2$ is known, then the distribution of
$$\frac{\hat{\mu}_{y|x_0} - E(y \mid x_0)}{\sqrt{PV(\hat{\mu}_{y|x_0})}}$$
is $N(0, 1)$. The 100(1 − α)% confidence interval for $E(y \mid x_0)$ is then obtained from
$$P\left[ -z_{\alpha/2} \leq \frac{\hat{\mu}_{y|x_0} - E(y \mid x_0)}{\sqrt{PV(\hat{\mu}_{y|x_0})}} \leq z_{\alpha/2} \right] = 1 - \alpha$$
as
$$\left[ \hat{\mu}_{y|x_0} - z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)},\; \hat{\mu}_{y|x_0} + z_{\alpha/2} \sqrt{\sigma^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)} \right].$$
When $\sigma^2$ is unknown, it is replaced by $\hat{\sigma}^2 = MSE$, and in this case the sampling distribution of
$$\frac{\hat{\mu}_{y|x_0} - E(y \mid x_0)}{\sqrt{MSE \left( \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{s_{xx}} \right)}}$$
is the $t$-distribution with $(n-2)$ degrees of freedom, and the corresponding 100(1 − α)% confidence interval for $E(y \mid x_0)$ is obtained by replacing $z_{\alpha/2}$ with $t_{n-2, \alpha/2}$ and $\sigma^2$ with $MSE$ in the interval above.
Note that the width of the confidence interval for $E(y \mid x_0)$ is a function of $x_0$. The interval width is minimum for $x_0 = \bar{x}$ and widens as $|x_0 - \bar{x}|$ increases. This is expected, because the best estimates of $y$ are made at $x$-values lying near the center of the data, and the precision of estimation deteriorates as we move towards the boundary of the $x$-space.
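A hedged Python sketch (assumed helper, not from the notes) of the t-based interval for the mean response $E(y \mid x_0)$:

```python
import numpy as np
from scipy import stats

def ci_mean_response(x, y, x0, alpha=0.05):
    """100(1 - alpha)% confidence interval for E(y | x0)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - ybar)) / sxx
    b0 = ybar - b1 * xbar
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    mu_hat = b0 + b1 * x0
    half_width = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        mse * (1.0 / n + (x0 - xbar) ** 2 / sxx))
    return mu_hat - half_width, mu_hat + half_width
```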
Prediction of an actual value of $y$:
When the objective is to predict an actual value of the study variable at $x = x_0$ (rather than its mean), the predictor is
$$\hat{y}_0 = b_0 + b_1 x_0.$$
The true value of $y$ in the prediction period is given by $y_0 = \beta_0 + \beta_1 x_0 + \varepsilon_0$, where $\varepsilon_0$ indicates the value that would be drawn from the distribution of the random error in the prediction period. Note that the form of the predictor is the same as that of the average value predictor, but its prediction error and other properties are different. This is the dual nature of the predictor.
Predictive bias:
The prediction error of $\hat{y}_0$ is given by
$$\hat{y}_0 - y_0 = b_0 + b_1 x_0 - (\beta_0 + \beta_1 x_0 + \varepsilon_0) = (b_0 - \beta_0) + (b_1 - \beta_1) x_0 - \varepsilon_0.$$
Thus, we find that
$$E(\hat{y}_0 - y_0) = E(b_0 - \beta_0) + E(b_1 - \beta_1) x_0 - E(\varepsilon_0) = 0 + 0 - 0 = 0,$$
so $\hat{y}_0$ is an unbiased predictor of $y_0$.
Predictive variance:
Because the future observation $y_0$ is independent of $\hat{y}_0$, the predictive variance of $\hat{y}_0$ is
$$PV(\hat{y}_0) = E(\hat{y}_0 - y_0)^2 = E\left[ (b_0 - \beta_0) + (x_0 - \bar{x})(b_1 - \beta_1) + \bar{x}(b_1 - \beta_1) - \varepsilon_0 \right]^2$$
$$= \text{Var}(b_0) + (x_0 - \bar{x})^2 \text{Var}(b_1) + \bar{x}^2 \text{Var}(b_1) + \text{Var}(\varepsilon_0) + 2 (x_0 - \bar{x}) \text{Cov}(b_0, b_1) + 2 \bar{x} \text{Cov}(b_0, b_1) + 2 (x_0 - \bar{x}) \bar{x} \text{Var}(b_1)$$
[the rest of the terms are 0, assuming the independence of $\varepsilon_0$ with $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$]
$$= \text{Var}(b_0) + \left[ (x_0 - \bar{x})^2 + \bar{x}^2 + 2 (x_0 - \bar{x}) \bar{x} \right] \text{Var}(b_1) + \text{Var}(\varepsilon_0) + 2 \left[ (x_0 - \bar{x}) + \bar{x} \right] \text{Cov}(b_0, b_1)$$
$$= \text{Var}(b_0) + x_0^2 \text{Var}(b_1) + \text{Var}(\varepsilon_0) + 2 x_0 \text{Cov}(b_0, b_1)$$
$$= \sigma^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{s_{xx}} \right) + \frac{\sigma^2 x_0^2}{s_{xx}} + \sigma^2 - \frac{2 x_0 \bar{x} \sigma^2}{s_{xx}}$$
$$= \sigma^2 \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right].$$
Prediction interval:
If $\sigma^2$ is known, then the distribution of
$$\frac{\hat{y}_0 - y_0}{\sqrt{PV(\hat{y}_0)}}$$
is $N(0, 1)$. Therefore
$$P\left[ -z_{\alpha/2} \leq \frac{\hat{y}_0 - y_0}{\sqrt{PV(\hat{y}_0)}} \leq z_{\alpha/2} \right] = 1 - \alpha,$$
which gives the prediction interval for $y_0$ as
$$\left[ \hat{y}_0 - z_{\alpha/2} \sqrt{\sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)},\; \hat{y}_0 + z_{\alpha/2} \sqrt{\sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)} \right].$$
When $\sigma^2$ is unknown, it is replaced by $\hat{\sigma}^2 = MSE$, and then
$$\frac{\hat{y}_0 - y_0}{\sqrt{\widehat{PV}(\hat{y}_0)}}$$
follows a $t$-distribution with $(n-2)$ degrees of freedom. The 100(1 − α)% prediction interval for $y_0$ in this case is obtained from
$$P\left[ -t_{\alpha/2, n-2} \leq \frac{\hat{y}_0 - y_0}{\sqrt{\widehat{PV}(\hat{y}_0)}} \leq t_{\alpha/2, n-2} \right] = 1 - \alpha,$$
which gives the prediction interval
$$\left[ \hat{y}_0 - t_{\alpha/2, n-2} \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)},\; \hat{y}_0 + t_{\alpha/2, n-2} \sqrt{MSE \left( 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{s_{xx}} \right)} \right].$$
The prediction interval for $y_0$ is wider than the confidence interval for $E(y \mid x_0)$ because it depends on both the error from the fitted model as well as the error associated with the future observation.
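And an analogous illustrative sketch (assumed helper, not from the notes) for the prediction interval of a future observation $y_0$ at $x_0$; the only change from the mean-response interval is the extra "1 +" inside the variance term:

```python
import numpy as np
from scipy import stats

def prediction_interval(x, y, x0, alpha=0.05):
    """100(1 - alpha)% prediction interval for a new observation y0 at x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - ybar)) / sxx
    b0 = ybar - b1 * xbar
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    y0_hat = b0 + b1 * x0
    half_width = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        mse * (1.0 + 1.0 / n + (x0 - xbar) ** 2 / sxx))
    return y0_hat - half_width, y0_hat + half_width
```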
Reverse regression method:
[Figure: Reverse regression — horizontal deviations of the observations $(x_i, y_i)$ from the line $Y = \beta_0 + \beta_1 X$.]
The reverse regression has been advocated in the analysis of gender (or race) discrimination in salaries. For example, if $y$ denotes salary and $x$ denotes qualifications, and we are interested in determining whether there is gender discrimination in salaries, we can ask:
"Whether men and women with the same qualifications (value of $x$) are getting the same salaries (value of $y$)?" This question is answered by the direct regression.
In reverse regression, the model is expressed as
$$x_i = \beta_0^* + \beta_1^* y_i + \varepsilon_i \quad (i = 1, 2, \ldots, n),$$
where the $\varepsilon_i$'s are the associated random error components and satisfy the assumptions as in the case of the usual simple linear regression model. The reverse regression estimates $\hat{\beta}_{0R}$ of $\beta_0^*$ and $\hat{\beta}_{1R}$ of $\beta_1^*$ for this model are obtained by interchanging the $x$ and $y$ in the direct regression estimators of $\beta_0$ and $\beta_1$. The estimates are obtained as
$$\hat{\beta}_{0R} = \bar{x} - \hat{\beta}_{1R} \bar{y}$$
and
$$\hat{\beta}_{1R} = \frac{s_{xy}}{s_{yy}}$$
for $\beta_0^*$ and $\beta_1^*$ respectively. The residual sum of squares in this case is
$$SS_{res}^* = s_{xx} - \frac{s_{xy}^2}{s_{yy}}.$$
Note that
$$\hat{\beta}_{1R} \, b_1 = \frac{s_{xy}^2}{s_{xx} s_{yy}} = r_{xy}^2,$$
where $b_1$ is the direct regression estimator of the slope parameter and $r_{xy}$ is the correlation coefficient between $x$ and $y$. Hence if $r_{xy}^2$ is close to one, the two regression lines will be close to each other.
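An illustrative Python sketch (not from the notes) of the reverse regression fit obtained by interchanging x and y, together with the product relation between the two slope estimates:

```python
import numpy as np

def reverse_regression(x, y):
    """Reverse regression: fit x = b0R + b1R * y by least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sxy = np.sum((x - x.mean()) * (y - y.mean()))
    syy = np.sum((y - y.mean()) ** 2)
    b1R = sxy / syy                     # slope of x on y
    b0R = x.mean() - b1R * y.mean()
    return b0R, b1R

# The product of the direct slope (s_xy/s_xx) and b1R equals r_xy**2.
```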
Orthogonal regression (major axis regression) method:
[Figure: Major axis regression — perpendicular distances of the observations $(x_i, y_i)$ from the line $Y = \beta_0 + \beta_1 X$, with corresponding points $(X_i, Y_i)$ on the line.]
If we assume that the regression line to be fitted is $Y_i = \beta_0 + \beta_1 X_i$, then it is expected that all the observations $(x_i, y_i)$, $i = 1, 2, \ldots, n$, lie on this line. But these points deviate from the line, and in such a case the squared perpendicular distance of the observed data point $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$ from the line is given by
$$d_i^2 = (X_i - x_i)^2 + (Y_i - y_i)^2,$$
where $(X_i, Y_i)$ denotes the $i$th pair of observations without any error which lie on the line.
The objective is to minimize the sum of squared perpendicular distances $\sum_{i=1}^{n} d_i^2$ to obtain the estimates of $\beta_0$ and $\beta_1$. The observations $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$ are expected to lie on the line
$$Y_i = \beta_0 + \beta_1 X_i,$$
so let
$$E_i = Y_i - \beta_0 - \beta_1 X_i = 0.$$
The regression coefficients are obtained by minimizing $\sum_{i=1}^{n} d_i^2$ under the constraints $E_i$'s using the Lagrangian multiplier method. The Lagrangian function is
$$L_0 = \sum_{i=1}^{n} d_i^2 - 2 \sum_{i=1}^{n} \lambda_i E_i,$$
where $\lambda_1, \ldots, \lambda_n$ are the Lagrangian multipliers. The set of equations is obtained by setting
$$\frac{\partial L_0}{\partial X_i} = 0, \quad \frac{\partial L_0}{\partial Y_i} = 0, \quad \frac{\partial L_0}{\partial \beta_0} = 0 \quad \text{and} \quad \frac{\partial L_0}{\partial \beta_1} = 0 \quad (i = 1, 2, \ldots, n).$$
Thus we find
$$\frac{\partial L_0}{\partial X_i} = (X_i - x_i) + \lambda_i \beta_1 = 0,$$
$$\frac{\partial L_0}{\partial Y_i} = (Y_i - y_i) - \lambda_i = 0,$$
$$\frac{\partial L_0}{\partial \beta_0} = \sum_{i=1}^{n} \lambda_i = 0,$$
$$\frac{\partial L_0}{\partial \beta_1} = \sum_{i=1}^{n} \lambda_i X_i = 0.$$
Since
$$X_i = x_i - \lambda_i \beta_1, \qquad Y_i = y_i + \lambda_i,$$
substituting these in $E_i = 0$ gives
$$E_i = (y_i + \lambda_i) - \beta_0 - \beta_1 (x_i - \lambda_i \beta_1) = 0$$
$$\Rightarrow \lambda_i = \frac{\beta_0 + \beta_1 x_i - y_i}{1 + \beta_1^2}.$$
Also, using this $\lambda_i$ in the equation $\sum_{i=1}^{n} \lambda_i = 0$, we get
$$\sum_{i=1}^{n} \frac{\beta_0 + \beta_1 x_i - y_i}{1 + \beta_1^2} = 0,$$
and using $(X_i - x_i) + \lambda_i \beta_1 = 0$ and $\sum_{i=1}^{n} \lambda_i X_i = 0$, we get
$$\sum_{i=1}^{n} \lambda_i (x_i - \lambda_i \beta_1) = 0,$$
i.e.,
$$\sum_{i=1}^{n} \left[ \frac{(\beta_0 + \beta_1 x_i - y_i) x_i}{1 + \beta_1^2} - \frac{\beta_1 (\beta_0 + \beta_1 x_i - y_i)^2}{(1 + \beta_1^2)^2} \right] = 0. \qquad (1)$$
Solving the first of these equations, $\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)/(1 + \beta_1^2) = 0$, gives the orthogonal regression estimate of $\beta_0$ as
$$\hat{\beta}_{0OR} = \bar{y} - \hat{\beta}_{1OR} \bar{x},$$
where $\hat{\beta}_{1OR}$ is the orthogonal regression estimate of $\beta_1$.
Now, substituting $\hat{\beta}_{0OR} = \bar{y} - \hat{\beta}_{1OR} \bar{x}$ in equation (1) and simplifying, we get
$$(1 + \beta_1^2) \sum_{i=1}^{n} \left[ y_i - \bar{y} - \beta_1 (x_i - \bar{x}) \right] x_i + \beta_1 \sum_{i=1}^{n} \left[ (y_i - \bar{y}) - \beta_1 (x_i - \bar{x}) \right]^2 = 0$$
or
$$(1 + \beta_1^2) \sum_{i=1}^{n} (u_i + \bar{x})(v_i - \beta_1 u_i) + \beta_1 \sum_{i=1}^{n} (v_i - \beta_1 u_i)^2 = 0,$$
where $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$.
Since $\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = 0$, this reduces to
$$\beta_1^2 \sum_{i=1}^{n} u_i v_i + \beta_1 \sum_{i=1}^{n} (u_i^2 - v_i^2) - \sum_{i=1}^{n} u_i v_i = 0$$
or
$$\beta_1^2 s_{xy} + \beta_1 (s_{xx} - s_{yy}) - s_{xy} = 0.$$
Solving this quadratic equation gives
$$\hat{\beta}_{1OR} = \frac{(s_{yy} - s_{xx}) \pm \sqrt{(s_{yy} - s_{xx})^2 + 4 s_{xy}^2}}{2 s_{xy}}.$$
Notice that this gives two solutions for $\hat{\beta}_{1OR}$. We choose the solution which minimizes $\sum_{i=1}^{n} d_i^2$. The other solution maximizes $\sum_{i=1}^{n} d_i^2$ and is in the direction perpendicular to the optimal solution. The optimal solution is the one having the same sign as $s_{xy}$ (the root with the positive square root), where $\text{sign}(s_{xy})$ denotes the sign of $s_{xy}$, which can be positive or negative:
$$\text{sign}(s_{xy}) = \begin{cases} 1 & \text{if } s_{xy} > 0 \\ -1 & \text{if } s_{xy} < 0. \end{cases}$$
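For illustration (not part of the notes), a small Python function that solves this quadratic and picks the root with the same sign as s_xy, which is the one minimizing the perpendicular sum of squares:

```python
import numpy as np

def orthogonal_regression_slope(x, y):
    """Major axis (orthogonal) regression slope from the quadratic
    b^2*sxy + b*(sxx - syy) - sxy = 0, choosing the minimizing root.
    Assumes sxy != 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u, v = x - x.mean(), y - y.mean()
    sxx, syy, sxy = np.sum(u * u), np.sum(v * v), np.sum(u * v)
    # The '+' root always has the same sign as sxy and minimizes sum(d_i^2).
    b1_or = ((syy - sxx) + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    b0_or = y.mean() - b1_or * x.mean()
    return b0_or, b1_or
```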
Reduced major axis regression method:
[Figure: Reduced major axis regression — rectangles formed between the observations $(x_i, y_i)$ and the corresponding points $(X_i, Y_i)$ on the line $Y = \beta_0 + \beta_1 X$.]
Suppose the regression line is $Y_i = \beta_0 + \beta_1 X_i$, on which all the observed points are expected to lie. Suppose the points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are observed and lie away from the line. The area of the rectangle extended between the $i$th observed data point and the line is
$$A_i = (X_i - x_i)(Y_i - y_i) \quad (i = 1, 2, \ldots, n),$$
where $(X_i, Y_i)$ denotes the $i$th pair of observations without any error which lie on the line. The total area extended by the $n$ data points is
$$\sum_{i=1}^{n} A_i = \sum_{i=1}^{n} (X_i - x_i)(Y_i - y_i).$$
All observed data points $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$ are expected to lie on the line
$$Y_i = \beta_0 + \beta_1 X_i$$
and let
$$E_i^* = Y_i - \beta_0 - \beta_1 X_i = 0.$$
So now the objective is to minimize the sum of areas under the constraints $E_i^*$ to obtain the reduced major axis estimates of the regression coefficients. Using the Lagrangian multiplier method, the Lagrangian function is
$$L_R = \sum_{i=1}^{n} A_i - \sum_{i=1}^{n} \lambda_i E_i^* = \sum_{i=1}^{n} (X_i - x_i)(Y_i - y_i) - \sum_{i=1}^{n} \lambda_i E_i^*,$$
where $\lambda_1, \ldots, \lambda_n$ are the Lagrangian multipliers. The set of equations is obtained by setting
$$\frac{\partial L_R}{\partial X_i} = 0, \quad \frac{\partial L_R}{\partial Y_i} = 0, \quad \frac{\partial L_R}{\partial \beta_0} = 0, \quad \frac{\partial L_R}{\partial \beta_1} = 0 \quad (i = 1, 2, \ldots, n).$$
Thus
$$\frac{\partial L_R}{\partial X_i} = (Y_i - y_i) + \lambda_i \beta_1 = 0,$$
$$\frac{\partial L_R}{\partial Y_i} = (X_i - x_i) - \lambda_i = 0,$$
$$\frac{\partial L_R}{\partial \beta_0} = \sum_{i=1}^{n} \lambda_i = 0,$$
$$\frac{\partial L_R}{\partial \beta_1} = \sum_{i=1}^{n} \lambda_i X_i = 0.$$
Now, using $X_i = x_i + \lambda_i$ and $Y_i = y_i - \beta_1 \lambda_i$ in $E_i^* = 0$, we have
$$\beta_0 + \beta_1 X_i = y_i - \beta_1 \lambda_i$$
$$\beta_0 + \beta_1 (x_i + \lambda_i) = y_i - \beta_1 \lambda_i$$
$$\Rightarrow \lambda_i = \frac{y_i - \beta_0 - \beta_1 x_i}{2 \beta_1}.$$
Substituting this $\lambda_i$ in $\sum_{i=1}^{n} \lambda_i = 0$, the reduced major axis regression estimate of $\beta_0$ is obtained as
$$\hat{\beta}_{0RM} = \bar{y} - \hat{\beta}_{1RM} \bar{x},$$
where $\hat{\beta}_{1RM}$ is the reduced major axis regression estimate of $\beta_1$. Using $X_i = x_i + \lambda_i$, $\lambda_i$ and $\hat{\beta}_{0RM}$ in $\sum_{i=1}^{n} \lambda_i X_i = 0$, we get
$$\sum_{i=1}^{n} \frac{y_i - \bar{y} + \beta_1 \bar{x} - \beta_1 x_i}{2 \beta_1} \left( x_i + \frac{y_i - \bar{y} + \beta_1 \bar{x} - \beta_1 x_i}{2 \beta_1} \right) = 0.$$
Let $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$; then this equation can be re-expressed as
$$\sum_{i=1}^{n} (v_i - \beta_1 u_i)(v_i - \beta_1 u_i + 2 \beta_1 x_i) = 0.$$
Using $\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = 0$, we get
$$\sum_{i=1}^{n} v_i^2 - \beta_1^2 \sum_{i=1}^{n} u_i^2 = 0.$$
Solving this equation, the reduced major axis regression estimate of $\beta_1$ is obtained as
$$\hat{\beta}_{1RM} = \text{sign}(s_{xy}) \sqrt{\frac{s_{yy}}{s_{xx}}},$$
where
$$\text{sign}(s_{xy}) = \begin{cases} 1 & \text{if } s_{xy} > 0 \\ -1 & \text{if } s_{xy} < 0, \end{cases}$$
i.e., we choose the estimate which has the same sign as $s_{xy}$.
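A compact illustrative Python version (not from the notes) of the reduced major axis estimates:

```python
import numpy as np

def reduced_major_axis(x, y):
    """Reduced major axis estimates: b1 = sign(sxy)*sqrt(syy/sxx), b0 = ybar - b1*xbar."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u, v = x - x.mean(), y - y.mean()
    sxx, syy, sxy = np.sum(u * u), np.sum(v * v), np.sum(u * v)
    b1_rm = np.sign(sxy) * np.sqrt(syy / sxx)
    b0_rm = y.mean() - b1_rm * x.mean()
    return b0_rm, b1_rm
```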
Least absolute deviation regression method:
In the method of least squares, the estimates of the parameters $\beta_0$ and $\beta_1$ in the model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ $(i = 1, 2, \ldots, n)$ are chosen such that the sum of squared deviations $\sum_{i=1}^{n} \varepsilon_i^2$ is minimum. In the method of least absolute deviation (LAD) regression, the parameters $\beta_0$ and $\beta_1$ are estimated such that the sum of absolute deviations $\sum_{i=1}^{n} |\varepsilon_i|$ is minimum. It minimizes the sum of the absolute vertical errors between the observations and the line, as in the case of direct regression.
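As a rough illustration (not from the notes), the LAD criterion can be minimized numerically, for example with scipy.optimize.minimize; the OLS fit is used only as a starting value:

```python
import numpy as np
from scipy.optimize import minimize

def lad_regression(x, y):
    """Least absolute deviation fit: minimize sum |y_i - b0 - b1*x_i|."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def sad(params):
        b0, b1 = params
        return np.sum(np.abs(y - b0 - b1 * x))

    # Start from the ordinary least squares solution.
    sxx = np.sum((x - x.mean()) ** 2)
    b1_ols = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    b0_ols = y.mean() - b1_ols * x.mean()
    res = minimize(sad, x0=[b0_ols, b1_ols], method="Nelder-Mead")
    return res.x  # array [b0_lad, b1_lad]
```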