LINEAR REGRESSION MODELS W4315

HOMEWORK 5 ANSWERS
March 9, 2010

Due: 03/04/10
Instructor: Frank Wood
1. (20 points) In order to get a maximum likelihood estimate of the parameters of a
Box-Cox transformed simple linear regression model ($Y_i^{\lambda} = \beta_0 + \beta_1 X_i + \epsilon_i$), we need to find
the gradient of the likelihood with respect to its parameters (the gradient consists of the
partial derivatives of the likelihood function w.r.t. all of the parameters). Derive the partial
derivatives of the likelihood w.r.t. all parameters, assuming that $\epsilon_i \sim N(0, \sigma^2)$. (N.B. the
parameters here are $\lambda$, $\beta_0$, $\beta_1$, $\sigma^2$.)
(Extra Credit: Given this collection of partial derivatives (the gradient), how would you then
proceed to arrive at final estimates of all the parameters? Hint: consider how to increase
the likelihood function by making small changes in the parameter settings.)
Answer:
The gradient of a multivariate function is defined to be the vector consisting of the partial
derivatives w.r.t. every single variable. So we need to write down the full likelihood first:

$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}\right)$$

Then the log-likelihood function is:

$$l = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^2}$$

Taking derivatives w.r.t. all four parameters, we have the following:

$$\frac{\partial l}{\partial \lambda} = -\frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)\, y_i^{\lambda} \ln y_i \quad (1)$$

$$\frac{\partial l}{\partial \beta_0} = \frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i) \quad (2)$$

$$\frac{\partial l}{\partial \beta_1} = \frac{1}{\sigma^2}\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)\, x_i \quad (3)$$

$$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_i (y_i^{\lambda} - \beta_0 - \beta_1 x_i)^2}{2\sigma^4} \quad (4)$$

From the above array of equations we obtain the gradient.
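(Extra credit) One standard way to use this gradient, consistent with the hint, is gradient
ascent: start from an initial guess and repeatedly take a small step in the direction of the
gradient until the log-likelihood stops increasing. A minimal Matlab sketch of this idea
(an addition, not part of the original solution; it assumes data vectors x and y of length n
with all yi > 0, and the starting values and step size are illustrative):

% Gradient ascent on the Box-Cox log-likelihood.
% theta = [lambda; beta0; beta1; sigma2]; all names are illustrative.
theta = [1; 0; 0; 1];
eta = 1e-4;                                  % step size
for iter = 1:10000
    lam = theta(1); b0 = theta(2); b1 = theta(3); s2 = theta(4);
    r = y.^lam - b0 - b1*x;                  % residuals of the transformed model
    g = [-sum(r .* y.^lam .* log(y))/s2;     % dl/dlambda, equation (1)
          sum(r)/s2;                         % dl/dbeta0,  equation (2)
          sum(r .* x)/s2;                    % dl/dbeta1,  equation (3)
         -n/(2*s2) + sum(r.^2)/(2*s2^2)];    % dl/dsigma2, equation (4)
    theta = theta + eta*g;                   % small step uphill
end

In practice one would also monitor the log-likelihood for convergence and keep sigma2
positive (e.g., by shrinking the step whenever a proposed update would make it negative).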

2. (15 points)

Derive an extension of the Bonferroni inequality (4.2a), which is given as

$$P(\bar{A}_1 \cap \bar{A}_2) \geq 1 - \alpha - \alpha = 1 - 2\alpha$$

for the case of three statements, each with statement confidence coefficient $1 - \alpha$.
(This is problem 4.22 in Applied Linear Regression Models (4th edition) by Kutner et al.)
Answer:

Following the thread on page 155 in the textbook, let $A_i$ denote the event that the $i$-th
statement does not hold, and suppose $P(A_1) = P(A_2) = P(A_3) = \alpha$. Then

$$P(\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3) = P(\overline{A_1 \cup A_2 \cup A_3}) = 1 - P(A_1 \cup A_2 \cup A_3)$$
$$= 1 - [P(A_1) + P(A_2) + P(A_3)] + P(A_1 \cap A_2) + P(A_1 \cap A_3) + P(A_2 \cap A_3) - P(A_1 \cap A_2 \cap A_3)$$

Each pairwise intersection contains the triple intersection, so the last four terms sum to a
nonnegative quantity. So we have

$$P(\bar{A}_1 \cap \bar{A}_2 \cap \bar{A}_3) \geq 1 - P(A_1) - P(A_2) - P(A_3) = 1 - 3\alpha$$
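The bound is easy to sanity-check numerically. A minimal Matlab sketch (an addition, not
part of the original solution; independent failure events are used purely as an illustrative
case, and all names are arbitrary):

% Monte Carlo check of P(all three statements hold) >= 1 - 3*alpha
alpha = 0.05;
N = 100000;
fails = rand(N,3) < alpha;         % fails(i,j)=1 if statement j fails, P = alpha
allhold = all(~fails, 2);          % all three statements hold simultaneously
[mean(allhold), 1 - 3*alpha]       % empirical coverage vs. Bonferroni lower bound

With independent events the empirical joint coverage is about (1 - 0.05)^3 = 0.857, which
indeed exceeds the Bonferroni lower bound of 0.85.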
3. (25 points) Refer to the Consumer Finance Problems 5.5 and 5.13. (This is problem
5.24 in Applied Linear Regression Models (4th edition) by Kutner et al.)
a. Using matrix methods, obtain the following: (1) vector of estimated regression coefficients,
(2) vector of residuals, (3) SSR, (4) SSE, (5) estimated variance-covariance matrix of b, (6)
point estimate of E{Yh} when Xh = 4, (7) s²{pred} when Xh = 4
b. From your estimated variance-covariance matrix in part (a5), obtain the following: (1)
s{b0, b1}; (2) s²{b0}; (3) s{b1}
c. Find the hat matrix H
d. Find s²{e}
Answer:


(a) $X = \begin{bmatrix} 1 & 4 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}$, $Y = \begin{bmatrix} 16 \\ 5 \\ 10 \\ 15 \\ 13 \\ 22 \end{bmatrix}$; $X'X = \begin{bmatrix} 6 & 17 \\ 17 & 55 \end{bmatrix}$; $(X'X)^{-1} = \frac{1}{41}\begin{bmatrix} 55 & -17 \\ -17 & 6 \end{bmatrix}$

$$(X'X)^{-1}X' = \frac{1}{41}\begin{bmatrix} 55 & -17 \\ -17 & 6 \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 4 & 1 & 2 & 3 & 3 & 4 \end{bmatrix} = \frac{1}{41}\begin{bmatrix} -13 & 38 & 21 & 4 & 4 & -13 \\ 7 & -11 & -5 & 1 & 1 & 7 \end{bmatrix}$$

$$H = X(X'X)^{-1}X' = \frac{1}{41}\begin{bmatrix} 15 & -6 & 1 & 8 & 8 & 15 \\ -6 & 27 & 16 & 5 & 5 & -6 \\ 1 & 16 & 11 & 6 & 6 & 1 \\ 8 & 5 & 6 & 7 & 7 & 8 \\ 8 & 5 & 6 & 7 & 7 & 8 \\ 15 & -6 & 1 & 8 & 8 & 15 \end{bmatrix}$$

(1): $b = (X'X)^{-1}X'Y = \frac{1}{41}\begin{bmatrix} 18 \\ 189 \end{bmatrix} = \begin{bmatrix} 0.4390 \\ 4.6098 \end{bmatrix}$

(2): Residual $e = Y - Xb = \begin{bmatrix} -2.8780 \\ -0.0488 \\ 0.3415 \\ 0.7317 \\ -1.2683 \\ 3.1220 \end{bmatrix}$

(3): $SSR = Y'[H - \frac{1}{n}J]Y = 145.2073$

(4): $SSE = Y'(I - H)Y = 20.2927$, so $MSE = SSE/(n-2) = 5.0732$

(5): The estimated variance-covariance matrix of b is $s^2\{b\} = MSE\,(X'X)^{-1} = \begin{bmatrix} 6.8055 & -2.1035 \\ -2.1035 & 0.7424 \end{bmatrix}$

(6): The point estimate of $E\{Y_h\}$ is $X_h'b = \begin{bmatrix} 1 & 4 \end{bmatrix}\begin{bmatrix} 0.4390 \\ 4.6098 \end{bmatrix} = 18.8780$

(7): At $X_h = 4$, $s^2\{pred\} = MSE\,(1 + X_h'(X'X)^{-1}X_h) = 6.9292$

(b) $s\{b_0, b_1\} = -2.1035$; $s^2\{b_0\} = 6.8055$; $s\{b_1\} = \sqrt{0.7424} = 0.8616$

(c) As calculated in part (a), the hat matrix is

$$H = X(X'X)^{-1}X' = \begin{bmatrix} 0.3659 & -0.1463 & 0.0244 & 0.1951 & 0.1951 & 0.3659 \\ -0.1463 & 0.6585 & 0.3902 & 0.1220 & 0.1220 & -0.1463 \\ 0.0244 & 0.3902 & 0.2683 & 0.1463 & 0.1463 & 0.0244 \\ 0.1951 & 0.1220 & 0.1463 & 0.1707 & 0.1707 & 0.1951 \\ 0.1951 & 0.1220 & 0.1463 & 0.1707 & 0.1707 & 0.1951 \\ 0.3659 & -0.1463 & 0.0244 & 0.1951 & 0.1951 & 0.3659 \end{bmatrix}$$

(d) $s^2\{e\} = MSE\,(I - H) = \begin{bmatrix} 3.2171 & 0.7424 & -0.1237 & -0.9899 & -0.9899 & -1.8560 \\ 0.7424 & 1.7323 & -1.9798 & -0.6187 & -0.6187 & 0.7424 \\ -0.1237 & -1.9798 & 3.7121 & -0.7424 & -0.7424 & -0.1237 \\ -0.9899 & -0.6187 & -0.7424 & 4.2070 & -0.8662 & -0.9899 \\ -0.9899 & -0.6187 & -0.7424 & -0.8662 & 4.2070 & -0.9899 \\ -1.8560 & 0.7424 & -0.1237 & -0.9899 & -0.9899 & 3.2171 \end{bmatrix}$$

Matlab Code:
X=[1 4;1 1;1 2;1 3;1 3;1 4]
Y=[16;5;10;15;13;22]
J=ones(6,6)
I=eye(6,6)
[n,m]=size(Y)
Z=inv(X'*X)                  % (X'X)^(-1)
H=X*Z*X'                     % hat matrix
beta=Z*X'*Y                  % estimated regression coefficients
residual=Y-H*Y               % residual vector e
SSR=Y'*(H-(1/n)*J)*Y
SSE=Y'*(I-H)*Y
MSE=SSE/(n-2)
cov=MSE*Z                    % estimated variance-covariance matrix of b
s2e=MSE*(I-H)
Xh=[1;4]
Yhhat=Xh'*beta               % point estimate of E{Yh}
s2pred=MSE*(1+Xh'*Z*Xh)
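As a side note (an addition, not part of the original solution): forming inv(X'*X)
explicitly is fine for a small example like this, but Matlab's backslash operator solves
the same least-squares problem more stably via a QR factorization:

beta=X\Y                     % same estimates as inv(X'*X)*X'*Y, without forming the inverse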

4. (25 points) In a small-scale regression study, the following data were obtained. (This is
problem 6.27 in Applied Linear Regression Models (4th edition) by Kutner et al.)

i:     1    2    3    4    5    6
Xi1:   7    4   16    3   21    8
Xi2:  33   41    7   49    5   31
Yi:   42   33   75   28   91   55

Assume that regression model (1), which is

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i \quad (5)$$

with independent normal error terms, is appropriate. Using matrix methods, obtain (a) b;
(b) e; (c) H; (d) SSR; (e) s²{b}; (f) Yh when Xh1 = 10, Xh2 = 30; (g) s²{Yh} when Xh1 = 10,
Xh2 = 30
Answer:

(a) $b = (X'X)^{-1}X'Y = \begin{bmatrix} 33.9321 \\ 2.7848 \\ -0.2644 \end{bmatrix}$

(b) $e = Y - Xb = \begin{bmatrix} -2.6996 \\ -1.2300 \\ -1.6374 \\ -1.3299 \\ -0.0900 \\ 6.9868 \end{bmatrix}$

(c) $H = X(X'X)^{-1}X' = \begin{bmatrix} 0.2314 & 0.2517 & 0.2118 & 0.1489 & -0.0548 & 0.2110 \\ 0.2517 & 0.3124 & 0.0944 & 0.2663 & -0.1479 & 0.2231 \\ 0.2118 & 0.0944 & 0.7044 & -0.3192 & 0.1045 & 0.2041 \\ 0.1489 & 0.2663 & -0.3192 & 0.6143 & 0.1414 & 0.1483 \\ -0.0548 & -0.1479 & 0.1045 & 0.1414 & 0.9404 & 0.0163 \\ 0.2110 & 0.2231 & 0.2041 & 0.1483 & 0.0163 & 0.1971 \end{bmatrix}$

(d) $SSR = Y'[H - \frac{1}{n}J]Y = 3009.926$

(e) $s^2\{b\} = MSE\,(X'X)^{-1} = \begin{bmatrix} 715.4711 & -34.1589 & -13.5949 \\ -34.1589 & 1.6617 & 0.6441 \\ -13.5949 & 0.6441 & 0.2625 \end{bmatrix}$

(f) $\hat{Y}_h = X_h'b = \begin{bmatrix} 1 & 10 & 30 \end{bmatrix}\begin{bmatrix} 33.9321 \\ 2.7848 \\ -0.2644 \end{bmatrix} = 53.8471$

(g) At $X_{h1} = 10$ and $X_{h2} = 30$, $s^2\{\hat{Y}_h\} = X_h'\,s^2\{b\}\,X_h = 5.4246$
Matlab Code:
X=[1 7 33;1 4 41;1 16 7;1 3 49;1 21 5;1 8 31]
Y=[42;33;75;28;91;55]
J=ones(6,6)
I=eye(6,6)
[n,m]=size(Y)
Z=inv(X'*X)
H=X*Z*X'
beta=Z*X'*Y
residual=Y-H*Y
SSR=Y'*(H-(1/n)*J)*Y
SSE=Y'*(I-H)*Y
MSE=SSE/(n-3)                % three parameters, so n-3 degrees of freedom
cov=MSE*Z
s2e=MSE*(I-H)
Xh=[1;10;30]
Yhhat=Xh'*beta
s2yhat=Xh'*cov*Xh
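Two quick consistency checks on the script above (an addition, not part of the original
solution): the trace of the hat matrix equals the number of estimated parameters, and the
residuals are orthogonal to the columns of X:

trace(H)                     % should equal 3, the number of columns of X
X'*residual                  % should be a (numerically) zero vector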

5. (15 points) Consider the classic regression model in matrix form, i.e.

$$Y = X\beta + \epsilon$$

where X is an $n \times p$ design matrix whose first column is an all-ones vector, $\epsilon \sim N(0, \sigma^2 I)$, and I is
an identity matrix. Prove the following:
a. The residual sum of squares $RSS = e'e$ can be written in a matrix form:

$$RSS = y'(I - X(X'X)^{-1}X')y \quad (6)$$

b. We call the RHS of (6) a sandwich. Prove that the matrix in the middle layer of the sandwich,
$N = I - X(X'X)^{-1}X'$, is an idempotent matrix.

c. Prove that the rank of N defined in part (b) is $n - p$.

N.B. p columns in the design matrix means there are $p - 1$ predictors plus 1 intercept term.
Before handling the problem, make clear the dimensions of all the matrices here.
Answer:
(a) $SSE = e'e = (y - Xb)'(y - Xb) = (y' - b'X')(y - Xb) = y'y - 2b'X'y + b'X'Xb$.
Substituting $b = (X'X)^{-1}X'y$ into the last term gives $b'X'Xb = b'X'X(X'X)^{-1}X'y = b'X'y$,
so $SSE = y'y - 2b'X'y + b'X'y = y'y - b'X'y$. Since $b'X' = ((X'X)^{-1}X'y)'X' = y'X(X'X)^{-1}X'$
(because $(X'X)^{-1}$ is symmetric), we obtain $SSE = y'(I - X(X'X)^{-1}X')y$.

(b) $N^2 = NN = (I - X(X'X)^{-1}X')(I - X(X'X)^{-1}X') = I - 2X(X'X)^{-1}X' + X(X'X)^{-1}(X'X)(X'X)^{-1}X' = I - 2X(X'X)^{-1}X' + X(X'X)^{-1}X' = I - X(X'X)^{-1}X' = N$
Therefore N is an idempotent matrix.

(c) Since N is a symmetric and idempotent matrix, rank(N) = trace(N).

Let $H = X(X'X)^{-1}X'$. Then, using the cyclic property of the trace,
$trace(N) = trace(I_{n\times n} - H_{n\times n}) = trace(I) - trace(H) = n - trace(X(X'X)^{-1}X') = n - trace((X'X)^{-1}_{p\times p} X'_{p\times n} X_{n\times p}) = n - trace(I_{p\times p}) = n - p$
So rank(N) = n − p.
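These algebraic facts are easy to confirm numerically. A minimal Matlab sketch (an
addition, not part of the original solution), reusing the n = 6, p = 3 design matrix from
problem 4:

X = [1 7 33;1 4 41;1 16 7;1 3 49;1 21 5;1 8 31];
[n, p] = size(X);
N = eye(n) - X*inv(X'*X)*X';
norm(N*N - N)                % approximately 0: N is idempotent
rank(N)                      % equals n - p = 3
trace(N)                     % also n - p, since rank = trace for symmetric idempotent N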
