Unit-4 Correlation and Regression
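SMTA1402 - Probability and Statistics

With $f(x,y) = \dfrac{6-x-y}{8}$, $0 \le x \le 2$, $2 \le y \le 4$:
$$f_X(x) = \int_2^4 \frac{6-x-y}{8}\,dy = \frac{1}{8}\left[6y - xy - \frac{y^2}{2}\right]_2^4 = \frac{1}{8}\left[(16 - 4x) - (10 - 2x)\right] = \frac{6-2x}{8}$$
$$f(y \mid x) = \frac{f(x,y)}{f_X(x)} = \frac{(6-x-y)/8}{(6-2x)/8} = \frac{6-x-y}{6-2x}$$
$$P(1 < Y < 3 \mid X = 2) = \int_2^3 f(y \mid x = 2)\,dy = \int_2^3 \frac{4-y}{2}\,dy = \frac{1}{2}\left[4y - \frac{y^2}{2}\right]_2^3 = \frac{3}{4}$$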
21).a). Two random variables $X$ and $Y$ have the following joint probability density function:
$$f(x,y) = \begin{cases} 2 - x - y, & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Find the marginal probability density functions of $X$ and $Y$. Also find the covariance between $X$ and $Y$.
b). If $f(x,y) = \dfrac{6-x-y}{8}$, $0 \le x \le 2$, $2 \le y \le 4$ for a bivariate $(X, Y)$, find the correlation coefficient.
Solution:
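a) Given the joint probability density function
$$f(x,y) = \begin{cases} 2 - x - y, & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Marginal density function of $X$ is
$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_0^1 (2 - x - y)\,dy = \left[2y - xy - \frac{y^2}{2}\right]_0^1 = 2 - x - \frac{1}{2} = \frac{3}{2} - x$$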
Sathyabama Institute of Science and Technology
Unit.2. Two Dimensional Random Variables
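$$f_X(x) = \begin{cases} \dfrac{3}{2} - x, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
Marginal density function of $Y$ is
$$f_Y(y) = \int_0^1 (2 - x - y)\,dx = \left[2x - \frac{x^2}{2} - xy\right]_0^1 = \frac{3}{2} - y$$
$$f_Y(y) = \begin{cases} \dfrac{3}{2} - y, & 0 \le y \le 1 \\ 0, & \text{otherwise} \end{cases}$$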
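Covariance of $(X, Y)$: $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y)$
$$E(X) = \int_0^1 x f_X(x)\,dx = \int_0^1 x\left(\frac{3}{2} - x\right)dx = \left[\frac{3}{2}\cdot\frac{x^2}{2} - \frac{x^3}{3}\right]_0^1 = \frac{5}{12}$$
$$E(Y) = \int_0^1 y f_Y(y)\,dy = \int_0^1 y\left(\frac{3}{2} - y\right)dy = \frac{5}{12}$$
$$E(XY) = \int_0^1\!\!\int_0^1 xy\,f(x,y)\,dx\,dy = \int_0^1\!\!\int_0^1 xy(2 - x - y)\,dx\,dy = \int_0^1\!\!\int_0^1 \left(2xy - x^2y - xy^2\right)dx\,dy$$
$$= \int_0^1 \left[\frac{2x^2}{2}y - \frac{x^3}{3}y - \frac{x^2}{2}y^2\right]_0^1 dy = \int_0^1 \left(y - \frac{y}{3} - \frac{y^2}{2}\right)dy = \left[\frac{y^2}{2} - \frac{y^2}{6} - \frac{y^3}{6}\right]_0^1 = \frac{1}{6}$$
$$\mathrm{Cov}(X,Y) = \frac{1}{6} - \frac{5}{12}\cdot\frac{5}{12} = \frac{1}{6} - \frac{25}{144} = -\frac{1}{144}.$$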
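The moments in part a) can be cross-checked exactly, since the density is a polynomial on a rectangle. The sketch below is a minimal illustration: the helper names `power_integral` and `expectation` and the dictionary encoding of the density are assumptions made here, not part of the notes.

```python
from fractions import Fraction

def power_integral(n, lo, hi):
    """Exact integral of t**n over [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return (hi ** (n + 1) - lo ** (n + 1)) / (n + 1)

def expectation(density, a, b, x_range=(0, 1), y_range=(0, 1)):
    """E[X**a * Y**b] for a polynomial density stored as {(i, j): coeff},
    meaning the sum of coeff * x**i * y**j over the given rectangle."""
    return sum(
        Fraction(c) * power_integral(i + a, *x_range) * power_integral(j + b, *y_range)
        for (i, j), c in density.items()
    )

# f(x, y) = 2 - x - y on the unit square (hypothetical encoding of the density)
density = {(0, 0): 2, (1, 0): -1, (0, 1): -1}

total = expectation(density, 0, 0)   # total probability, should be 1
e_x = expectation(density, 1, 0)
e_y = expectation(density, 0, 1)
e_xy = expectation(density, 1, 1)
cov = e_xy - e_x * e_y
print(total, e_x, e_y, e_xy, cov)
```

Working in `Fraction` keeps the sign and the denominator of the covariance visible, which is easy to lose in hand arithmetic.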
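b). Correlation coefficient $\rho_{XY} = \dfrac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y}$
Marginal density function of $X$ is
$$f_X(x) = \int_2^4 f(x,y)\,dy = \int_2^4 \frac{6-x-y}{8}\,dy = \frac{6-2x}{8}, \quad 0 \le x \le 2$$
Marginal density function of $Y$ is
$$f_Y(y) = \int_0^2 f(x,y)\,dx = \int_0^2 \frac{6-x-y}{8}\,dx = \frac{10-2y}{8}, \quad 2 \le y \le 4$$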
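Then
$$E(X) = \int_0^2 x f_X(x)\,dx = \int_0^2 x\,\frac{6-2x}{8}\,dx = \frac{1}{8}\left[\frac{6x^2}{2} - \frac{2x^3}{3}\right]_0^2 = \frac{1}{8}\left[12 - \frac{16}{3}\right] = \frac{1}{8}\cdot\frac{20}{3} = \frac{5}{6}$$
$$E(Y) = \int_2^4 y\,\frac{10-2y}{8}\,dy = \frac{1}{8}\left[\frac{10y^2}{2} - \frac{2y^3}{3}\right]_2^4 = \frac{17}{6}$$
$$E(X^2) = \int_0^2 x^2 f_X(x)\,dx = \int_0^2 x^2\,\frac{6-2x}{8}\,dx = \frac{1}{8}\left[\frac{6x^3}{3} - \frac{2x^4}{4}\right]_0^2 = 1$$
$$E(Y^2) = \int_2^4 y^2\,\frac{10-2y}{8}\,dy = \frac{1}{8}\left[\frac{10y^3}{3} - \frac{2y^4}{4}\right]_2^4 = \frac{25}{3}$$
$$\mathrm{Var}(X) = \sigma_X^2 = E(X^2) - [E(X)]^2 = 1 - \left(\frac{5}{6}\right)^2 = \frac{11}{36}$$
$$\mathrm{Var}(Y) = \sigma_Y^2 = E(Y^2) - [E(Y)]^2 = \frac{25}{3} - \left(\frac{17}{6}\right)^2 = \frac{11}{36}$$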
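$$E(XY) = \int_2^4\!\!\int_0^2 xy\,\frac{6-x-y}{8}\,dx\,dy = \frac{1}{8}\int_2^4 \left[\frac{6x^2y}{2} - \frac{x^3y}{3} - \frac{x^2y^2}{2}\right]_0^2 dy = \frac{1}{8}\int_2^4 \left(12y - \frac{8}{3}y - 2y^2\right)dy$$
$$= \frac{1}{8}\left[\frac{12y^2}{2} - \frac{8}{3}\cdot\frac{y^2}{2} - \frac{2y^3}{3}\right]_2^4 = \frac{1}{8}\left[96 - \frac{64}{3} - \frac{128}{3} - 24 + \frac{16}{3} + \frac{16}{3}\right] = \frac{1}{8}\cdot\frac{56}{3}$$
$$E(XY) = \frac{7}{3}$$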
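$$\rho_{XY} = \frac{E(XY) - E(X)E(Y)}{\sigma_X \sigma_Y} = \frac{\dfrac{7}{3} - \dfrac{5}{6}\cdot\dfrac{17}{6}}{\dfrac{\sqrt{11}}{6}\cdot\dfrac{\sqrt{11}}{6}} = \frac{-1/36}{11/36}$$
$$\rho_{XY} = -\frac{1}{11}.$$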
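As a rough numerical cross-check of part b), the moments can be approximated with a midpoint rule over the rectangle $0 \le x \le 2$, $2 \le y \le 4$. This is only a sketch: the helper names `midpoint_2d` and `moment` and the grid size are choices made here for illustration.

```python
import math

def midpoint_2d(g, x_range, y_range, n=200):
    """Approximate the double integral of g over a rectangle by the midpoint rule."""
    (x0, x1), (y0, y1) = x_range, y_range
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy

def f(x, y):
    """Joint density (6 - x - y) / 8 on 0 <= x <= 2, 2 <= y <= 4."""
    return (6 - x - y) / 8

def moment(a, b):
    """Approximate E[X**a * Y**b]."""
    return midpoint_2d(lambda x, y: x**a * y**b * f(x, y), (0, 2), (2, 4))

e_x, e_y, e_xy = moment(1, 0), moment(0, 1), moment(1, 1)
var_x = moment(2, 0) - e_x**2
var_y = moment(0, 2) - e_y**2
rho = (e_xy - e_x * e_y) / math.sqrt(var_x * var_y)
print(round(e_x, 4), round(e_y, 4), round(rho, 4))
```

The approximation should land close to $E(X) = 5/6$, $E(Y) = 17/6$ and a small negative correlation near $-1/11$.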
22.a). Let the random variables $X$ and $Y$ have the joint probability function $f(x,y) = \dfrac{1}{3}$, $(x,y) = (0,0), (1,1), (2,0)$. Compute the correlation coefficient.
b) Let $X_1$ and $X_2$ be two independent random variables with means 5 and 10 and standard deviations 2 and 3 respectively. Obtain the correlation coefficient of $U$ and $V$, where $U = 3X_1 + 4X_2$ and $V = 3X_1 - X_2$.
Solution:
a). The probability distribution is

Y \ X     0      1      2      P(Y)
0         1/3    0      1/3    2/3
1         0      1/3    0      1/3
P(X)      1/3    1/3    1/3    1
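$$E(X) = \sum_i x_i\,p(x_i) = 0\cdot\frac{1}{3} + 1\cdot\frac{1}{3} + 2\cdot\frac{1}{3} = 1$$
$$E(Y) = \sum_j y_j\,p(y_j) = 0\cdot\frac{2}{3} + 1\cdot\frac{1}{3} = \frac{1}{3}$$
$$E(X^2) = \sum_i x_i^2\,p(x_i) = 0\cdot\frac{1}{3} + 1\cdot\frac{1}{3} + 4\cdot\frac{1}{3} = \frac{5}{3}$$
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{5}{3} - 1 = \frac{2}{3}$$
$$E(Y^2) = \sum_j y_j^2\,p(y_j) = 0\cdot\frac{2}{3} + 1\cdot\frac{1}{3} = \frac{1}{3}$$
$$V(Y) = E(Y^2) - [E(Y)]^2 = \frac{1}{3} - \frac{1}{9} = \frac{2}{9}$$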
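Correlation coefficient $\rho_{XY} = \dfrac{E(XY) - E(X)E(Y)}{\sqrt{V(X)\,V(Y)}}$
$$E(XY) = \sum_i \sum_j x_i y_j\,p(x_i, y_j) = (0)(0)\frac{1}{3} + (1)(1)\frac{1}{3} + (2)(0)\frac{1}{3} = \frac{1}{3}$$
(all other cells of the table have probability 0)
$$\rho_{XY} = \frac{\dfrac{1}{3} - (1)\dfrac{1}{3}}{\sqrt{\dfrac{2}{3}\cdot\dfrac{2}{9}}} = 0$$
Correlation coefficient $= 0$.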
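The discrete case in part a) is small enough to tabulate directly. The following sketch stores the three support points in a dictionary (an encoding chosen here for illustration) and also shows that zero correlation does not make $X$ and $Y$ independent for this distribution.

```python
from fractions import Fraction
from math import sqrt

third = Fraction(1, 3)
pmf = {(0, 0): third, (1, 1): third, (2, 0): third}   # support points of (X, Y)

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

e_x, e_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - e_x**2
var_y = expect(lambda x, y: y * y) - e_y**2
cov = expect(lambda x, y: x * y) - e_x * e_y
rho = float(cov) / sqrt(var_x * var_y)

# Independence would need P(X=0, Y=0) == P(X=0) * P(Y=0); here it fails.
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)
independent_at_origin = pmf[(0, 0)] == p_x0 * p_y0
print(e_x, e_y, var_x, var_y, cov, rho, independent_at_origin)
```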
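b). Given $E(X_1) = 5$, $E(X_2) = 10$, $V(X_1) = 4$, $V(X_2) = 9$.
Since $X_1$ and $X_2$ are independent, $E(X_1 X_2) = E(X_1)E(X_2)$.
Correlation coefficient $\rho_{UV} = \dfrac{E(UV) - E(U)E(V)}{\sqrt{\mathrm{Var}(U)\,\mathrm{Var}(V)}}$
$$E(U) = E(3X_1 + 4X_2) = 3E(X_1) + 4E(X_2) = 3(5) + 4(10) = 15 + 40 = 55$$
$$E(V) = E(3X_1 - X_2) = 3E(X_1) - E(X_2) = 3(5) - 10 = 15 - 10 = 5$$
$$E(UV) = E\left[(3X_1 + 4X_2)(3X_1 - X_2)\right] = E\left(9X_1^2 - 3X_1X_2 + 12X_1X_2 - 4X_2^2\right)$$
$$= 9E(X_1^2) - 3E(X_1X_2) + 12E(X_1X_2) - 4E(X_2^2) = 9E(X_1^2) + 9E(X_1)E(X_2) - 4E(X_2^2) = 9E(X_1^2) + 450 - 4E(X_2^2)$$
From $V(X_1) = E(X_1^2) - [E(X_1)]^2$,
$$E(X_1^2) = V(X_1) + [E(X_1)]^2 = 4 + 25 = 29, \qquad E(X_2^2) = V(X_2) + [E(X_2)]^2 = 9 + 100 = 109$$
$$E(UV) = 9(29) + 450 - 4(109) = 261 + 450 - 436 = 275$$
$$\mathrm{Cov}(U,V) = E(UV) - E(U)E(V) = 275 - 55 \times 5 = 0$$
Since $\mathrm{Cov}(U,V) = 0$, the correlation coefficient is $0$.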
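For independent variables the same covariance follows from bilinearity alone: $\mathrm{Cov}\left(\sum a_i X_i, \sum b_i X_i\right) = \sum a_i b_i \mathrm{Var}(X_i)$. A minimal sketch of that shortcut, with `cov_linear` as a hypothetical helper name:

```python
def cov_linear(a, b, variances):
    """Cov(sum a_i X_i, sum b_i X_i) for mutually independent X_i.

    Cross terms vanish under independence, so only a_i * b_i * Var(X_i) remain.
    """
    return sum(ai * bi * vi for ai, bi, vi in zip(a, b, variances))

variances = [2**2, 3**2]              # Var(X1), Var(X2) from the standard deviations 2 and 3
u_coeffs, v_coeffs = [3, 4], [3, -1]  # U = 3*X1 + 4*X2, V = 3*X1 - X2

cov_uv = cov_linear(u_coeffs, v_coeffs, variances)
var_u = cov_linear(u_coeffs, u_coeffs, variances)
var_v = cov_linear(v_coeffs, v_coeffs, variances)
print(cov_uv, var_u, var_v)
```

Here $9 \cdot 4 - 4 \cdot 9 = 0$, so the means of $X_1$ and $X_2$ play no role in the result.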
23.a). Let the random variable $X$ have the marginal density function $f(x) = 1$, $-\dfrac{1}{2} < x < \dfrac{1}{2}$, and let the conditional density of $Y$ be
$$f(y \mid x) = \begin{cases} 1, & x < y < x + 1,\ -\dfrac{1}{2} < x < 0 \\ 1, & -x < y < 1 - x,\ 0 < x < \dfrac{1}{2} \end{cases}$$
Prove that the variables $X$ and $Y$ are uncorrelated.
b). Given $f(x,y) = x e^{-x(y+1)}$, $x \ge 0$, $y \ge 0$. Find the regression curve of $Y$ on $X$.
Solution:
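a). We have
$$E(X) = \int_{-1/2}^{1/2} x f(x)\,dx = \int_{-1/2}^{1/2} x\,dx = \left[\frac{x^2}{2}\right]_{-1/2}^{1/2} = 0$$
The joint density is $f(x,y) = f(x)\,f(y \mid x) = 1$ on the region described by the conditional density, so
$$E(XY) = \int_{-1/2}^{0}\int_{x}^{x+1} xy\,dy\,dx + \int_{0}^{1/2}\int_{-x}^{1-x} xy\,dy\,dx$$
$$= \int_{-1/2}^{0} x\left[\int_{x}^{x+1} y\,dy\right]dx + \int_{0}^{1/2} x\left[\int_{-x}^{1-x} y\,dy\right]dx$$
$$= \frac{1}{2}\int_{-1/2}^{0} x(2x + 1)\,dx + \frac{1}{2}\int_{0}^{1/2} x(1 - 2x)\,dx$$
$$= \frac{1}{2}\left[\frac{2x^3}{3} + \frac{x^2}{2}\right]_{-1/2}^{0} + \frac{1}{2}\left[\frac{x^2}{2} - \frac{2x^3}{3}\right]_{0}^{1/2} = -\frac{1}{48} + \frac{1}{48} = 0$$
Hence $\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = 0$, so $X$ and $Y$ are uncorrelated.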
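b). The conditional density of $Y$ given $X = x$ is $f(y \mid x) = \dfrac{f(x,y)}{f_X(x)}$.
Marginal density function
$$f_X(x) = \int_0^\infty f(x,y)\,dy = \int_0^\infty x e^{-x(y+1)}\,dy = x\left[\frac{e^{-x(y+1)}}{-x}\right]_0^\infty = e^{-x}, \quad x \ge 0$$
Conditional pdf of $Y$ on $X$ is
$$f(y \mid x) = \frac{f(x,y)}{f_X(x)} = \frac{x e^{-xy - x}}{e^{-x}} = x e^{-xy}$$
$$E(Y \mid X = x) = \int_0^\infty y\,x e^{-xy}\,dy = x\left[y\,\frac{e^{-xy}}{-x} - \frac{e^{-xy}}{x^2}\right]_0^\infty = \frac{1}{x}$$
The regression curve of $Y$ on $X$ is $y = \dfrac{1}{x}$, and hence $xy = 1$.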
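The regression curve $y = 1/x$ can be sanity-checked by integrating $y \cdot x e^{-xy}$ numerically for a few values of $x$. The sketch below uses a plain midpoint rule; the function name `conditional_mean` and the truncation point of the integral are assumptions made for this illustration.

```python
import math

def conditional_mean(x, n=20000):
    """Midpoint-rule estimate of E[Y | X = x] = integral of y * x * exp(-x*y) over y >= 0.

    The upper limit 40 / x is an arbitrary cut-off: beyond it the integrand
    is of order exp(-40) and is ignored.
    """
    upper = 40.0 / x
    h = upper / n
    total = 0.0
    for j in range(n):
        y = (j + 0.5) * h
        total += y * x * math.exp(-x * y)
    return total * h

for x in (0.5, 2.0, 4.0):
    print(x, round(conditional_mean(x), 5), 1 / x)
```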
24.a). Given
$$f(x,y) = \begin{cases} \dfrac{x + y}{3}, & 0 \le x \le 1,\ 0 \le y \le 2 \\ 0, & \text{otherwise} \end{cases}$$
obtain the regression of $Y$ on $X$ and of $X$ on $Y$.
b). Distinguish between correlation and regression analysis.
Solution:
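a). Regression of $Y$ on $X$ is $E(Y \mid X = x) = \displaystyle\int_{-\infty}^{\infty} y\,f(y \mid x)\,dy$, where $f(y \mid x) = \dfrac{f(x,y)}{f_X(x)}$.
$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy = \int_0^2 \frac{x + y}{3}\,dy = \frac{1}{3}\left[xy + \frac{y^2}{2}\right]_0^2 = \frac{2(x + 1)}{3}$$
$$f(y \mid x) = \frac{f(x,y)}{f_X(x)} = \frac{x + y}{2(x + 1)}$$
Regression of $Y$ on $X$:
$$E(Y \mid X = x) = \int_0^2 \frac{y(x + y)}{2(x + 1)}\,dy = \frac{1}{2(x + 1)}\left[\frac{xy^2}{2} + \frac{y^3}{3}\right]_0^2 = \frac{1}{2(x + 1)}\left[2x + \frac{8}{3}\right] = \frac{3x + 4}{3(x + 1)}$$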
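Regression of $X$ on $Y$ is $E(X \mid Y = y) = \displaystyle\int_{-\infty}^{\infty} x\,f(x \mid y)\,dx$, where $f(x \mid y) = \dfrac{f(x,y)}{f_Y(y)}$.
$$f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx = \int_0^1 \frac{x + y}{3}\,dx = \frac{1}{3}\left[\frac{x^2}{2} + xy\right]_0^1 = \frac{1}{3}\left(\frac{1}{2} + y\right) = \frac{2y + 1}{6}$$
$$f(x \mid y) = \frac{2(x + y)}{2y + 1}$$
Regression of $X$ on $Y$:
$$E(X \mid Y = y) = \int_0^1 \frac{2x(x + y)}{2y + 1}\,dx = \frac{2}{2y + 1}\left[\frac{x^3}{3} + \frac{x^2 y}{2}\right]_0^1 = \frac{2}{2y + 1}\left(\frac{1}{3} + \frac{y}{2}\right) = \frac{3y + 2}{3(2y + 1)}.$$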
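Both regression curves for $f(x,y) = (x+y)/3$ can be checked numerically as ratios of one-dimensional integrals, $E(Y \mid X = x) = \int y f\,dy \big/ \int f\,dy$ and likewise for $E(X \mid Y = y)$. Direct integration gives $\frac{3x+4}{3(x+1)}$ and $\frac{3y+2}{3(2y+1)}$; the sketch below compares against those closed forms. Helper names are illustrative only.

```python
def integrate(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

def f(x, y):
    """Joint density (x + y) / 3 on 0 <= x <= 1, 0 <= y <= 2."""
    return (x + y) / 3

def mean_y_given_x(x):
    return integrate(lambda y: y * f(x, y), 0, 2) / integrate(lambda y: f(x, y), 0, 2)

def mean_x_given_y(y):
    return integrate(lambda x: x * f(x, y), 0, 1) / integrate(lambda x: f(x, y), 0, 1)

for x in (0.0, 0.5, 1.0):
    print("x =", x, mean_y_given_x(x), (3 * x + 4) / (3 * (x + 1)))
for y in (0.0, 1.0, 2.0):
    print("y =", y, mean_x_given_y(y), (3 * y + 2) / (3 * (2 * y + 1)))
```

Note that $E(X \mid Y = y)$ varies with $y$ (from $2/3$ at $y = 0$ to $8/15$ at $y = 2$), so the regression of $X$ on $Y$ is not a constant.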
b).
1. Correlation measures the relationship between two variables, whereas regression is a mathematical measure expressing the average relationship between the two variables.
2. Correlation need not imply a cause-and-effect relationship between the variables. Regression analysis treats one variable as dependent (the effect) and the other as independent (the cause).
3. The correlation coefficient is symmetric, i.e. $r_{xy} = r_{yx}$, whereas the regression coefficients are not symmetric: in general $b_{xy} \ne b_{yx}$.
4. The correlation coefficient measures the direction and degree of the linear relationship between two variables. In regression, the relationship between the two variables is used to predict the value of the dependent variable for any given value of the independent variable.
25.a). $X$ and $Y$ are two random variables with variances $\sigma_x^2$ and $\sigma_y^2$ respectively, and $r$ is the coefficient of correlation between them. If $U = X + kY$ and $V = X + \dfrac{\sigma_x}{\sigma_y} Y$, find the value of $k$ so that $U$ and $V$ are uncorrelated.
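Solution:
a). $U$ and $V$ are uncorrelated when $\mathrm{Cov}(U,V) = 0$:
$$\mathrm{Cov}\left(X + kY,\ X + \frac{\sigma_x}{\sigma_y}Y\right) = V(X) + \frac{\sigma_x}{\sigma_y}\mathrm{Cov}(X,Y) + k\,\mathrm{Cov}(X,Y) + k\,\frac{\sigma_x}{\sigma_y}V(Y) = 0$$
$$k\left[\mathrm{Cov}(X,Y) + \frac{\sigma_x}{\sigma_y}V(Y)\right] = -V(X) - \frac{\sigma_x}{\sigma_y}\mathrm{Cov}(X,Y)$$
Using $\mathrm{Cov}(X,Y) = r\sigma_x\sigma_y$,
$$k = -\frac{\sigma_x^2 + \dfrac{\sigma_x}{\sigma_y}\,r\sigma_x\sigma_y}{r\sigma_x\sigma_y + \dfrac{\sigma_x}{\sigma_y}\,\sigma_y^2} = -\frac{\sigma_x^2 + r\sigma_x^2}{r\sigma_x\sigma_y + \sigma_x\sigma_y} = -\frac{\sigma_x^2(1 + r)}{\sigma_x\sigma_y(1 + r)} = -\frac{\sigma_x}{\sigma_y}.$$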
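The result $k = -\sigma_x/\sigma_y$ can be checked by expanding $\mathrm{Cov}(U,V)$ by bilinearity for concrete numbers. The values of $\sigma_x$, $\sigma_y$ and $r$ below are illustrative only and do not come from the problem.

```python
def cov_uv(k, sigma_x, sigma_y, r):
    """Cov(X + k*Y, X + (sigma_x/sigma_y)*Y), expanded by bilinearity."""
    c = sigma_x / sigma_y
    cov_xy = r * sigma_x * sigma_y
    return sigma_x**2 + (c + k) * cov_xy + k * c * sigma_y**2

sigma_x, sigma_y, r = 2.0, 5.0, 0.3   # illustrative values, not from the text
k = -sigma_x / sigma_y

print(cov_uv(k, sigma_x, sigma_y, r))      # essentially zero
print(cov_uv(0.0, sigma_x, sigma_y, r))    # non-zero for another choice of k
```

The cancellation holds for any $r \ne -1$, which is why $r$ drops out of the final answer.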
b).
X      Y      X^2     Y^2     XY
6      40     36      1600    240
8      36     64      1296    288
10     20     100     400     200
18     14     324     196     252
20     10     400     100     200
23     2      529     4       46
$\sum X = 85$, $\sum Y = 122$, $\sum X^2 = 1453$, $\sum Y^2 = 3596$, $\sum XY = 1226$
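$$\bar{X} = \frac{\sum x}{n} = \frac{85}{6} = 14.17, \qquad \bar{Y} = \frac{\sum y}{n} = \frac{122}{6} = 20.33$$
$$\sigma_x = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{1453}{6} - \left(\frac{85}{6}\right)^2} = 6.44$$
$$\sigma_y = \sqrt{\frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2} = \sqrt{\frac{3596}{6} - \left(\frac{122}{6}\right)^2} = 13.63$$
$$r = \frac{\dfrac{\sum xy}{n} - \bar{x}\,\bar{y}}{\sigma_x \sigma_y} = \frac{\dfrac{1226}{6} - 14.17 \times 20.33}{6.44 \times 13.63} = -0.95$$
$$b_{xy} = r\,\frac{\sigma_x}{\sigma_y} = -0.95 \times \frac{6.44}{13.63} = -0.45$$
$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = -0.95 \times \frac{13.63}{6.44} = -2.01$$
The regression line of $X$ on $Y$ is
$$x - \bar{x} = b_{xy}(y - \bar{y}) \Rightarrow x - 14.17 = -0.45(y - 20.33)$$
$$x = -0.45y + 23.32$$
The regression line of $Y$ on $X$ is
$$y - \bar{y} = b_{yx}(x - \bar{x}) \Rightarrow y - 20.33 = -2.01(x - 14.17)$$
$$y = -2.01x + 48.81$$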
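The tabulated sums can be reproduced in a few lines. The sketch below recomputes the means, standard deviations, $r$ and both regression slopes without intermediate rounding, so the last digits can differ slightly from the hand computation (for instance, the unrounded slope $b_{yx}$ comes out near $-2.02$ rather than $-2.01$). Variable names are chosen here for illustration.

```python
from math import sqrt

xs = [6, 8, 10, 18, 20, 23]
ys = [40, 36, 20, 14, 10, 2]
n = len(xs)

mean_x, mean_y = sum(xs) / n, sum(ys) / n
sd_x = sqrt(sum(x * x for x in xs) / n - mean_x**2)
sd_y = sqrt(sum(y * y for y in ys) / n - mean_y**2)
cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y

r = cov / (sd_x * sd_y)
b_xy = cov / sd_y**2      # slope of the regression line of X on Y
b_yx = cov / sd_x**2      # slope of the regression line of Y on X

# Intercepts of x = b_xy * y + a_x and y = b_yx * x + a_y
a_x = mean_x - b_xy * mean_y
a_y = mean_y - b_yx * mean_x
print(round(r, 4), round(b_xy, 4), round(b_yx, 4), round(a_x, 2), round(a_y, 2))
```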
26. a) Using the information given below, compute $\bar{x}$, $\bar{y}$ and $r$. Also compute $\sigma_y$ when $\sigma_x = 2$. The regression equations are $2x + 3y = 8$ and $4x + y = 10$.
b) The joint probability distribution of $X$ and $Y$ is

X \ Y     -1      1
0         1/8     3/8
1         2/8     2/8

Find the correlation coefficient of $X$ and $Y$.
Solution:
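a). When the regression equations are known, the arithmetic means are computed by solving the equations simultaneously.
$$2x + 3y = 8 \quad (1)$$
$$4x + y = 10 \quad (2)$$
$$(1) \times 2: \quad 4x + 6y = 16 \quad (3)$$
$$(3) - (2): \quad 5y = 6 \Rightarrow y = \frac{6}{5}$$
Substituting in equation (1): $2x + 3\left(\dfrac{6}{5}\right) = 8 \Rightarrow 2x = 8 - \dfrac{18}{5} \Rightarrow x = \dfrac{11}{5}$
i.e. $\bar{x} = \dfrac{11}{5}$ and $\bar{y} = \dfrac{6}{5}$.
To find $r$, let $2x + 3y = 8$ be the regression equation of $X$ on $Y$:
$$2x = 8 - 3y \Rightarrow x = 4 - \frac{3}{2}y$$
$b_{xy}$ = coefficient of $Y$ in the equation of $X$ on $Y$ $= -\dfrac{3}{2}$.
Let $4x + y = 10$ be the regression equation of $Y$ on $X$:
$$y = 10 - 4x$$
$b_{yx}$ = coefficient of $X$ in the equation of $Y$ on $X$ $= -4$.
$$r = \pm\sqrt{b_{xy}\,b_{yx}} = -\sqrt{\left(-\frac{3}{2}\right)(-4)} = -2.45 \quad (\text{negative, since } b_{xy} \text{ and } b_{yx} \text{ are negative})$$
Since $r$ is not in the range $-1 \le r \le 1$, the assumption is wrong.
Now let equation (1) be the equation of $Y$ on $X$:
$$y = \frac{8}{3} - \frac{2x}{3}$$
$b_{yx}$ = coefficient of $X$ in the equation of $Y$ on $X$ $= -\dfrac{2}{3}$.
Let equation (2) be the equation of $X$ on $Y$: $x = \dfrac{10}{4} - \dfrac{y}{4}$, so $b_{xy} = -\dfrac{1}{4}$.
$$r = -\sqrt{b_{xy}\,b_{yx}} = -\sqrt{\frac{2}{3}\cdot\frac{1}{4}} = -0.4082$$
To compute $\sigma_y$, use $b_{yx} = -\dfrac{2}{3}$ together with $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$:
$$-\frac{2}{3} = -0.4082 \times \frac{\sigma_y}{2}$$
$$\sigma_y = 3.27$$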
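The trial-and-error step (deciding which equation is the regression of $Y$ on $X$) can be written as a small loop that keeps the assignment for which $b_{xy}\,b_{yx} \le 1$. This is a sketch under the assumption that each line is stored as coefficients $(a, b, c)$ of $ax + by = c$; the helper names are hypothetical.

```python
from math import sqrt

# Lines written as a*x + b*y = c
line1 = (2.0, 3.0, 8.0)
line2 = (4.0, 1.0, 10.0)

def intersection(l1, l2):
    """Solve the two linear equations by Cramer's rule; the solution is (mean_x, mean_y)."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def slopes(y_on_x, x_on_y):
    """Return (b_yx, b_xy) when the first line is Y on X and the second is X on Y."""
    b_yx = -y_on_x[0] / y_on_x[1]
    b_xy = -x_on_y[1] / x_on_y[0]
    return b_yx, b_xy

mean_x, mean_y = intersection(line1, line2)

# Try both assignments; a valid one must give b_yx * b_xy = r**2 <= 1.
for y_on_x, x_on_y in ((line2, line1), (line1, line2)):
    b_yx, b_xy = slopes(y_on_x, x_on_y)
    if b_yx * b_xy <= 1:
        r = (1 if b_yx > 0 else -1) * sqrt(b_yx * b_xy)
        break

sigma_x = 2.0
sigma_y = b_yx * sigma_x / r   # from b_yx = r * sigma_y / sigma_x
print(mean_x, mean_y, round(r, 4), round(sigma_y, 2))
```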
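b). Marginal probability mass function of $X$ is
$$P(X = 0) = \frac{1}{8} + \frac{3}{8} = \frac{4}{8}, \qquad P(X = 1) = \frac{2}{8} + \frac{2}{8} = \frac{4}{8}$$
Marginal probability mass function of $Y$ is
$$P(Y = -1) = \frac{1}{8} + \frac{2}{8} = \frac{3}{8}, \qquad P(Y = 1) = \frac{3}{8} + \frac{2}{8} = \frac{5}{8}$$
$$E(X) = \sum_x x\,p(x) = 0\cdot\frac{4}{8} + 1\cdot\frac{4}{8} = \frac{4}{8}$$
$$E(Y) = \sum_y y\,p(y) = (-1)\frac{3}{8} + 1\cdot\frac{5}{8} = \frac{2}{8}$$
$$E(X^2) = \sum_x x^2\,p(x) = 0^2\cdot\frac{4}{8} + 1^2\cdot\frac{4}{8} = \frac{4}{8}$$
$$E(Y^2) = \sum_y y^2\,p(y) = (-1)^2\frac{3}{8} + 1^2\cdot\frac{5}{8} = 1$$
$$V(X) = E(X^2) - [E(X)]^2 = \frac{4}{8} - \left(\frac{4}{8}\right)^2 = \frac{1}{4}$$
$$V(Y) = E(Y^2) - [E(Y)]^2 = 1 - \left(\frac{1}{4}\right)^2 = \frac{15}{16}$$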
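$$E(XY) = \sum_x \sum_y xy\,p(x,y) = (0)(-1)\frac{1}{8} + (0)(1)\frac{3}{8} + (1)(-1)\frac{2}{8} + (1)(1)\frac{2}{8} = 0$$
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)E(Y) = 0 - \frac{1}{2}\cdot\frac{1}{4} = -\frac{1}{8}$$
$$r = \frac{\mathrm{Cov}(X,Y)}{\sqrt{V(X)\,V(Y)}} = \frac{-\dfrac{1}{8}}{\sqrt{\dfrac{1}{4}\cdot\dfrac{15}{16}}} = -0.26.$$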
27. a) Calculate the correlation coefficient for the following heights (in inches) of fathers X and
their sons Y .
X 65 66 67 67 68 69 70 72
Y 67 68 65 68 72 72 69 71
b) If $X$ and $Y$ are independent exponential variates with parameter 1, find the pdf of $U = X - Y$.
Solution:
X      Y      XY      X^2     Y^2
65     67     4355    4225    4489
66     68     4488    4356    4624
67     65     4355    4489    4225
67     68     4556    4489    4624
68     72     4896    4624    5184
69     72     4968    4761    5184
70     69     4830    4900    4761
72     71     5112    5184    5041
$\sum X = 544$, $\sum Y = 552$, $\sum XY = 37560$, $\sum X^2 = 37028$, $\sum Y^2 = 38132$
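$$\bar{X} = \frac{\sum x}{n} = \frac{544}{8} = 68, \qquad \bar{Y} = \frac{\sum y}{n} = \frac{552}{8} = 69$$
$$\bar{X}\,\bar{Y} = 68 \times 69 = 4692$$
$$\sigma_X = \sqrt{\frac{1}{n}\sum x^2 - \bar{X}^2} = \sqrt{\frac{1}{8}(37028) - 68^2} = \sqrt{4628.5 - 4624} = 2.121$$
$$\sigma_Y = \sqrt{\frac{1}{n}\sum y^2 - \bar{Y}^2} = \sqrt{\frac{1}{8}(38132) - 69^2} = \sqrt{4766.5 - 4761} = 2.345$$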
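$$\mathrm{Cov}(X,Y) = \frac{1}{n}\sum XY - \bar{X}\,\bar{Y} = \frac{1}{8}(37560) - 68 \times 69 = 4695 - 4692 = 3$$
The correlation coefficient of $X$ and $Y$ is given by
$$r(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{3}{2.121 \times 2.345} = \frac{3}{4.974} \approx 0.603.$$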
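The same coefficient can be obtained from deviations about the means using the standard library's `statistics` module (`fmean` and the population standard deviation `pstdev`). A short sketch, with variable names chosen here:

```python
from statistics import fmean, pstdev

fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]

mean_x, mean_y = fmean(fathers), fmean(sons)

# Population covariance: average product of deviations from the means.
cov = fmean([(x - mean_x) * (y - mean_y) for x, y in zip(fathers, sons)])
r = cov / (pstdev(fathers) * pstdev(sons))
print(mean_x, mean_y, cov, round(r, 4))
```

Dividing by $n$ in both the covariance and the standard deviations matches the formulas used in the solution; using $n - 1$ throughout would give the same $r$.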
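b). Given that $X$ and $Y$ are exponential variates with parameter 1,
$$f_X(x) = e^{-x},\ x \ge 0, \qquad f_Y(y) = e^{-y},\ y \ge 0$$
Also, since $X$ and $Y$ are independent,
$$f_{XY}(x,y) = f_X(x)\,f_Y(y) = e^{-x}e^{-y} = e^{-(x+y)}, \quad x \ge 0,\ y \ge 0$$
Consider the transformation $u = x - y$ and $v = y$, so that $x = u + v$, $y = v$.
$$J = \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} = 1$$
$$f_{UV}(u,v) = f_{XY}(x,y)\,|J| = e^{-x}e^{-y} = e^{-(u+v)}e^{-v} = e^{-(u+2v)}, \quad u + v \ge 0,\ v \ge 0$$
In Region I, when $u < 0$, $v$ ranges over $v \ge -u$:
$$f(u) = \int_{-u}^{\infty} f(u,v)\,dv = \int_{-u}^{\infty} e^{-u}e^{-2v}\,dv = e^{-u}\left[\frac{e^{-2v}}{-2}\right]_{-u}^{\infty} = e^{-u}\left[0 + \frac{e^{2u}}{2}\right] = \frac{e^{u}}{2}$$
In Region II, when $u > 0$:
$$f(u) = \int_0^{\infty} f(u,v)\,dv = \int_0^{\infty} e^{-(u+2v)}\,dv = \frac{e^{-u}}{2}$$
$$f(u) = \begin{cases} \dfrac{e^{u}}{2}, & u < 0 \\ \dfrac{e^{-u}}{2}, & u > 0 \end{cases}$$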
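The two-region result is the density $\frac{1}{2}e^{-|u|}$. A quick numerical check integrates the joint density $e^{-(u+2v)}$ over the admissible range $v \ge \max(0, -u)$ for fixed $u$; the function name `density_u`, the cut-off `span` and the grid size are assumptions made for this sketch.

```python
import math

def density_u(u, n=20000, span=40.0):
    """Midpoint-rule estimate of f_U(u): integral of exp(-(u + 2v)) over v >= max(0, -u).

    The integral is truncated after a length `span`, where the integrand is negligible.
    """
    lower = max(0.0, -u)
    h = span / n
    total = 0.0
    for k in range(n):
        v = lower + (k + 0.5) * h
        total += math.exp(-(u + 2 * v))
    return total * h

for u in (-1.5, 0.5, 2.0):
    print(u, round(density_u(u), 6), round(0.5 * math.exp(-abs(u)), 6))
```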
$U = \dfrac{X + Y}{2}$.
b) If $X$ and $Y$ are independent random variables each following $N(0, 2)$, find the pdf of $Z = 2X + 3Y$. If $X$ and $Y$ are independent rectangular variates on $(0,1)$, find the distribution of $\dfrac{X}{Y}$.
Solution:
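a). Consider the transformation $u = \dfrac{x + y}{2}$ and $v = y$, so that $x = 2u - v$ and $y = v$.
$$J = \frac{\partial(x,y)}{\partial(u,v)} = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} 2 & -1 \\ 0 & 1 \end{vmatrix} = 2$$
$$f_{UV}(u,v) = f_{XY}(x,y)\,|J| = e^{-(x+y)}\cdot 2 = 2e^{-(x+y)} = 2e^{-(2u - v + v)} = 2e^{-2u}, \quad 2u - v \ge 0,\ v \ge 0$$
$$f_{UV}(u,v) = 2e^{-2u}, \quad u \ge 0,\ 0 \le v \le 2u$$
$$f(u) = \int_0^{2u} f_{UV}(u,v)\,dv = \int_0^{2u} 2e^{-2u}\,dv = 2e^{-2u}\left[v\right]_0^{2u} = 4u\,e^{-2u}$$
$$f(u) = \begin{cases} 4u\,e^{-2u}, & u \ge 0 \\ 0, & \text{otherwise} \end{cases}$$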
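A density obtained by transformation should integrate to 1, which gives a cheap check on the limits of $v$. For $U = (X+Y)/2$ with $X$, $Y$ independent exponential variates of parameter 1 (the setting this solution appears to assume), the region $x = 2u - v \ge 0$, $v \ge 0$ gives $0 \le v \le 2u$ and hence $f(u) = 4u\,e^{-2u}$. The sketch below checks its total mass and mean numerically; helper names and the truncation at $u = 30$ are illustrative choices.

```python
import math

def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * h) for k in range(n)) * h

def f_u(u):
    """Candidate density of U = (X + Y) / 2 for independent Exp(1) variates."""
    return 4 * u * math.exp(-2 * u)

total = integrate(f_u, 0.0, 30.0)                    # should be close to 1
mean = integrate(lambda u: u * f_u(u), 0.0, 30.0)    # should be close to E(X + Y) / 2 = 1
print(round(total, 6), round(mean, 6))
```

By contrast, $u\,e^{-2u}$ integrates to $1/4$ over $u \ge 0$, so it cannot be a density.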
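b).(i) Since $X$ and $Y$ are independent $N(0, 2)$ variates,
$$f_{XY}(x,y) = \frac{1}{8\pi}\,e^{-\frac{x^2 + y^2}{8}}, \quad -\infty < x, y < \infty$$
Consider the transformation $z = 2x + 3y$, $w = y$, so that $x = \dfrac{z - 3w}{2}$, $y = w$ and $|J| = \dfrac{1}{2}$.
The joint pdf of $(Z, W)$ is given by
$$f_{ZW}(z,w) = |J|\,f_{XY}(x,y) = \frac{1}{2}\cdot\frac{1}{8\pi}\,e^{-\frac{1}{8}\left[\frac{(z - 3w)^2}{4} + w^2\right]} = \frac{1}{16\pi}\,e^{-\frac{1}{32}\left[(z - 3w)^2 + 4w^2\right]}, \quad -\infty < z, w < \infty$$
The pdf of $Z$ is the marginal pdf obtained by integrating $f_{ZW}(z,w)$ with respect to $w$ over the range of $w$:
$$f_Z(z) = \frac{1}{16\pi}\int_{-\infty}^{\infty} e^{-\frac{1}{32}\left(z^2 - 6wz + 13w^2\right)}\,dw$$
Completing the square, $13w^2 - 6wz + z^2 = 13\left(w - \dfrac{3z}{13}\right)^2 + z^2 - \dfrac{9z^2}{13}$, so
$$f_Z(z) = \frac{1}{16\pi}\,e^{-\frac{z^2}{32}\left(1 - \frac{9}{13}\right)}\int_{-\infty}^{\infty} e^{-\frac{13}{32}\left(w - \frac{3z}{13}\right)^2}dw = \frac{1}{16\pi}\,e^{-\frac{z^2}{8 \times 13}}\int_{-\infty}^{\infty} e^{-\frac{13}{32}t^2}\,dt, \quad t = w - \frac{3z}{13}$$
Put $r = \dfrac{13}{32}t^2$, so that $dr = \dfrac{13}{16}t\,dt$ and
$$dt = \frac{16}{13t}\,dr = \frac{16}{13}\sqrt{\frac{13}{32r}}\,dr = \frac{4}{\sqrt{13}\sqrt{2}}\,r^{-\frac{1}{2}}\,dr$$
$$f_Z(z) = \frac{1}{16\pi}\,e^{-\frac{z^2}{8 \times 13}}\cdot 2\int_0^{\infty} e^{-r}\,\frac{4}{\sqrt{13}\sqrt{2}}\,r^{-\frac{1}{2}}\,dr = \frac{1}{2\pi\sqrt{13}\sqrt{2}}\,e^{-\frac{z^2}{8 \times 13}}\int_0^{\infty} e^{-r}\,r^{\frac{1}{2} - 1}\,dr$$
$$= \frac{1}{2\pi\sqrt{13}\sqrt{2}}\,e^{-\frac{z^2}{8 \times 13}}\,\Gamma\!\left(\tfrac{1}{2}\right) = \frac{1}{2\sqrt{13}\sqrt{2\pi}}\,e^{-\frac{z^2}{2\left(2\sqrt{13}\right)^2}}$$
i.e. $Z \sim N\left(0, 2\sqrt{13}\right)$.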
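The marginalisation can be checked numerically: integrating $f_{ZW}(z,w)$ over $w$ for a fixed $z$ should reproduce the normal density with mean 0 and standard deviation $2\sqrt{13}$ (variance $4 \cdot 4 + 9 \cdot 4 = 52$). The sketch assumes, as the joint density above does, that $N(0, 2)$ denotes standard deviation 2; function names and grid settings are illustrative.

```python
import math

def f_zw(z, w):
    """Joint density of (Z, W) = (2X + 3Y, Y) for independent X, Y with mean 0 and sd 2."""
    return math.exp(-((z - 3 * w) ** 2 + 4 * w ** 2) / 32) / (16 * math.pi)

def f_z(z, n=4000, span=40.0):
    """Midpoint-rule integral of f_zw(z, w) over w in [-span, span]."""
    h = 2 * span / n
    return sum(f_zw(z, -span + (k + 0.5) * h) for k in range(n)) * h

def normal_pdf(z, sigma):
    """Density of a normal variate with mean 0 and standard deviation sigma."""
    return math.exp(-z * z / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

sigma_z = 2 * math.sqrt(13)
for z in (0.0, 3.0, -7.5):
    print(z, f_z(z), normal_pdf(z, sigma_z))
```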
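b).(ii) Given that $X$ and $Y$ are uniform variates over $(0,1)$,
$$f_X(x) = \begin{cases} 1, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases} \quad \text{and} \quad f_Y(y) = \begin{cases} 1, & 0 < y < 1 \\ 0, & \text{otherwise} \end{cases}$$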
29. a) If $X_1, X_2, \ldots, X_n$ are Poisson variates with parameter $\lambda = 2$, use the central limit theorem to estimate $P(120 \le S_n \le 160)$, where $S_n = X_1 + X_2 + \cdots + X_n$ and $n = 75$.