Econometric Theory
MODULE – II
Lecture - 7
Simple Linear Regression Analysis
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
The reverse (or inverse) regression approach minimizes the sum of squares of the horizontal distances between the observed data points and the line to obtain the estimates of the regression parameters.
The reverse regression has been advocated in the analysis of sex (or race) discrimination in salaries. For example, suppose $y$ denotes salary and $x$ denotes qualifications, and we are interested in determining whether there is sex discrimination in salaries. We can ask:

“Do men and women with the same qualifications (value of $x$) get the same salaries (value of $y$)?” This question is answered by the direct regression.

Alternatively, we can ask:

“Do men and women with the same salaries (value of $y$) have the same qualifications (value of $x$)?” This question is answered by the reverse regression, i.e., the regression of $x$ on $y$.
The regression equation in the case of reverse regression can be written as

$$x_i = \beta_0^* + \beta_1^* y_i + \delta_i, \qquad i = 1, 2, \ldots, n,$$

where the $\delta_i$'s are the associated random error components and satisfy the assumptions as in the case of the usual simple linear regression model.
The reverse regression estimates $\hat{\beta}_{0R}$ of $\beta_0^*$ and $\hat{\beta}_{1R}$ of $\beta_1^*$ for this model are obtained by interchanging $x$ and $y$ in the direct regression estimators, giving

$$\hat{\beta}_{0R} = \bar{x} - \hat{\beta}_{1R}\,\bar{y}$$

and

$$\hat{\beta}_{1R} = \frac{s_{xy}}{s_{yy}}$$

for $\beta_0^*$ and $\beta_1^*$, respectively. The residual sum of squares in this case is

$$SS^*_{res} = s_{xx} - \frac{s_{xy}^2}{s_{yy}}.$$
Note that

$$\hat{\beta}_{1R}\, b_1 = \frac{s_{xy}^2}{s_{xx}\, s_{yy}} = r_{xy}^2,$$

where $b_1$ is the direct regression estimator of the slope parameter and $r_{xy}$ is the correlation coefficient between $x$ and $y$. Hence if $r_{xy}^2$ is close to 1, the two regression lines will be close to each other.
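As a quick numerical illustration, the following minimal Python sketch (made-up data and hypothetical variable names, not from the notes; numpy only) computes the direct and reverse regression estimates and checks the identity $\hat{\beta}_{1R}\, b_1 = r_{xy}^2$:

```python
import numpy as np

# Hypothetical illustrative data: x = qualification score, y = salary
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=50)
y = 3.0 + 2.0 * x + rng.normal(0.0, 1.5, size=50)

# Corrected sums of squares and cross-products
sxx = np.sum((x - x.mean()) ** 2)
syy = np.sum((y - y.mean()) ** 2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))

b1 = sxy / sxx                            # direct regression slope (y on x)
beta1_R = sxy / syy                       # reverse regression slope (x on y)
beta0_R = x.mean() - beta1_R * y.mean()   # reverse regression intercept
ss_res_star = sxx - sxy ** 2 / syy        # residual sum of squares SS*_res

r2 = sxy ** 2 / (sxx * syy)               # squared correlation coefficient
print(beta1_R * b1, r2)                   # the two values coincide
```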
The direct and reverse regression methods of estimation assume that the errors in the observations are either in the $x$-direction or in the $y$-direction. In other words, the errors can be either in the dependent variable or in the independent variable. There can be situations when uncertainties are involved in both the dependent and independent variables. In such situations, orthogonal regression is more appropriate. In order to take care of errors in both directions, the least squares principle in orthogonal regression minimizes the squared perpendicular distance between the observed data points and the line to obtain the estimates of the regression coefficients. This is also known as the major axis regression method, and the estimates obtained are called orthogonal regression estimates or major axis regression estimates of the regression coefficients.

If the line is $Y = \beta_0 + \beta_1 X$, it is expected that all the observations $(x_i, y_i),\ i = 1, 2, \ldots, n$, lie on this line. But these points deviate from the line, and in such a case the squared perpendicular distance of the observed data point $(x_i, y_i)$ from the line is given by

$$d_i^2 = (X_i - x_i)^2 + (Y_i - y_i)^2,$$

where $(X_i, Y_i)$ denotes the $i$th pair of observations without any error, which lies on the line.
The estimates of $\beta_0$ and $\beta_1$ are obtained by minimizing $\sum_{i=1}^{n} d_i^2$ subject to the restriction that each $(X_i, Y_i)$ lies on the line, i.e., $E_i \equiv Y_i - \beta_0 - \beta_1 X_i = 0$ for all $i$. Introducing Lagrange multipliers $\lambda_i$ (with Lagrangian $L_0$), the first-order conditions include

$$(X_i - x_i) + \lambda_i \beta_1 = 0, \qquad (Y_i - y_i) - \lambda_i = 0, \qquad \sum_{i=1}^{n} \lambda_i = 0$$

and

$$\frac{\partial L_0}{\partial \beta_1} = \sum_{i=1}^{n} \lambda_i X_i = 0.$$
Since

$$X_i = x_i - \lambda_i \beta_1, \qquad Y_i = y_i + \lambda_i,$$

substituting these values in $E_i$, we obtain

$$E_i = (y_i + \lambda_i) - \beta_0 - \beta_1 (x_i - \lambda_i \beta_1) = 0$$

$$\Rightarrow \lambda_i = \frac{\beta_0 + \beta_1 x_i - y_i}{1 + \beta_1^2}.$$
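Geometrically, $(X_i, Y_i)$ is the foot of the perpendicular from $(x_i, y_i)$ onto the line. A minimal Python sketch (hypothetical helper name and assumed example values, not from the notes) confirming that the expressions above reproduce this foot and the squared distance $d_i^2$:

```python
def foot_of_perpendicular(x_i, y_i, beta0, beta1):
    """Foot (X_i, Y_i) of the perpendicular from (x_i, y_i)
    onto the line Y = beta0 + beta1 * X, via lambda_i above."""
    lam = (beta0 + beta1 * x_i - y_i) / (1.0 + beta1 ** 2)
    X_i = x_i - lam * beta1     # X_i = x_i - lambda_i * beta_1
    Y_i = y_i + lam             # Y_i = y_i + lambda_i
    d2 = (X_i - x_i) ** 2 + (Y_i - y_i) ** 2
    return X_i, Y_i, d2

# Example: point (1, 3) and the line Y = X (beta0 = 0, beta1 = 1)
X, Y, d2 = foot_of_perpendicular(1.0, 3.0, 0.0, 1.0)
print(X, Y, d2)   # 2.0 2.0 2.0 -> foot (2, 2) lies on Y = X, d^2 = 2
```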
Also, using this $\lambda_i$ in the equation $\sum_{i=1}^{n} \lambda_i = 0$, we get

$$\frac{\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)}{1 + \beta_1^2} = 0,$$
and using $(X_i - x_i) + \lambda_i \beta_1 = 0$ and $\sum_{i=1}^{n} \lambda_i X_i = 0$, we get

$$\sum_{i=1}^{n} \lambda_i (x_i - \lambda_i \beta_1) = 0.$$

Substituting $\lambda_i$ in this equation gives

$$\frac{\sum_{i=1}^{n} (\beta_0 x_i + \beta_1 x_i^2 - x_i y_i)}{1 + \beta_1^2} - \frac{\beta_1 \sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i)^2}{(1 + \beta_1^2)^2} = 0. \qquad (1)$$
Solving the equation obtained from $\sum_{i=1}^{n} \lambda_i = 0$, i.e., $\sum_{i=1}^{n} (\beta_0 + \beta_1 x_i - y_i) = 0$, gives $n(\beta_0 + \beta_1 \bar{x} - \bar{y}) = 0$, so the orthogonal regression estimate of the intercept term is

$$\hat{\beta}_{0OR} = \bar{y} - \hat{\beta}_{1OR}\, \bar{x}.$$
Substituting $\beta_0 = \bar{y} - \beta_1 \bar{x}$ in equation (1), we get

$$(1 + \beta_1^2) \sum_{i=1}^{n} x_i \left[ y_i - \bar{y} - \beta_1 (x_i - \bar{x}) \right] + \beta_1 \sum_{i=1}^{n} \left[ -(y_i - \bar{y}) + \beta_1 (x_i - \bar{x}) \right]^2 = 0$$

or

$$(1 + \beta_1^2) \sum_{i=1}^{n} (u_i + \bar{x})(v_i - \beta_1 u_i) + \beta_1 \sum_{i=1}^{n} (-v_i + \beta_1 u_i)^2 = 0,$$

where $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$.
Since $\sum_{i=1}^{n} u_i = \sum_{i=1}^{n} v_i = 0$, this reduces to

$$\sum_{i=1}^{n} \left[ \beta_1^2 u_i v_i + \beta_1 (u_i^2 - v_i^2) - u_i v_i \right] = 0$$

or

$$\beta_1^2 s_{xy} + \beta_1 (s_{xx} - s_{yy}) - s_{xy} = 0.$$
Solving this quadratic equation and choosing the root that minimizes the sum of squared perpendicular distances, the orthogonal regression estimate of the slope parameter is obtained as

$$\hat{\beta}_{1OR} = \frac{(s_{yy} - s_{xx}) + \sqrt{(s_{yy} - s_{xx})^2 + 4 s_{xy}^2}}{2 s_{xy}}.$$

Since the numerator is nonnegative, the sign of $\hat{\beta}_{1OR}$ is $\operatorname{sign}(s_{xy})$, the sign of $s_{xy}$, which can be positive or negative.
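As a numerical sanity check, here is a minimal Python sketch (made-up illustrative data and variable names, not from the notes; numpy only) that computes $\hat{\beta}_{1OR}$ from this formula and compares it with the slope of the major axis, i.e., the largest-eigenvalue eigenvector of the matrix of corrected sums of squares, with which major axis regression coincides:

```python
import numpy as np

# Hypothetical illustrative data with errors in both variables
rng = np.random.default_rng(1)
t = rng.normal(0.0, 2.0, size=200)                   # latent values on the line
x = t + rng.normal(0.0, 0.5, size=200)               # x observed with error
y = 1.0 + 0.7 * t + rng.normal(0.0, 0.5, size=200)   # y observed with error

u, v = x - x.mean(), y - y.mean()
sxx, syy, sxy = np.sum(u * u), np.sum(v * v), np.sum(u * v)

# Minimizing root of beta1^2 * sxy + beta1 * (sxx - syy) - sxy = 0
beta1_OR = ((syy - sxx) + np.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
beta0_OR = y.mean() - beta1_OR * x.mean()

# Cross-check: slope of the major axis (largest-eigenvalue eigenvector)
eigvals, eigvecs = np.linalg.eigh(np.array([[sxx, sxy], [sxy, syy]]))
major = eigvecs[:, np.argmax(eigvals)]
print(beta1_OR, major[1] / major[0])                 # the two slopes agree
```

Note that `beta1_OR` automatically carries the sign of $s_{xy}$, in line with the remark above.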