
The Matrix Approach to the Classical Linear Regression Model (CLRM)
(Gujarati, Appendix C)

We will begin with the k-variable model. The Population Regression Function (PRF) is as follows:

$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i; \quad i = 1, 2, 3, \dots, n$   (1)

where,
$\beta_1$: intercept
$\beta_2, \dots, \beta_k$: partial slope coefficients
$u$: stochastic disturbance term
$n$: population size

Remember that the PRF is a conditional expectation $E(Y \mid X_{2i}, X_{3i}, \dots, X_{ki})$.

Expand (1):

$Y_1 = \beta_1 + \beta_2 X_{21} + \beta_3 X_{31} + \dots + \beta_k X_{k1} + u_1$
$Y_2 = \beta_1 + \beta_2 X_{22} + \beta_3 X_{32} + \dots + \beta_k X_{k2} + u_2$
$Y_3 = \beta_1 + \beta_2 X_{23} + \beta_3 X_{33} + \dots + \beta_k X_{k3} + u_3$   (2)
$\vdots$
$Y_n = \beta_1 + \beta_2 X_{2n} + \beta_3 X_{3n} + \dots + \beta_k X_{kn} + u_n$

Write down (2) in matrix notation:


$\begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{21} & X_{31} & \cdots & X_{k1} \\ 1 & X_{22} & X_{32} & \cdots & X_{k2} \\ 1 & X_{23} & X_{33} & \cdots & X_{k3} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{2n} & X_{3n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_n \end{bmatrix}$

$\underset{(n \times 1)}{y} = \underset{(n \times k)}{X} \, \underset{(k \times 1)}{\beta} + \underset{(n \times 1)}{u}$

where:
$y$ is an $(n \times 1)$ column vector of observations on the dependent variable $Y$;
$X$ is an $(n \times k)$ data matrix of the $(k - 1)$ variables $X_2$ to $X_k$, where the column of 1's represents the intercept term;
$\beta$ is a $(k \times 1)$ column vector of the unknown parameters $\beta_1, \dots, \beta_k$;
$u$ is an $(n \times 1)$ column vector of the $n$ disturbances $u_i$.

So, in vector notation, the PRF is written as $y = X\beta + u$   (3)
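
To make the notation concrete, here is a minimal numpy sketch that builds the data matrix $X$ with a leading column of 1's and generates $y$ from equation (3). All numbers are made-up assumptions for illustration only, not values from the text.

```python
import numpy as np

# A made-up simulation of the PRF y = X*beta + u: n = 50 observations,
# k = 3 parameters (intercept plus two regressors).
rng = np.random.default_rng(42)
n = 50

# Data matrix X: a column of 1's for the intercept, then X_2 and X_3.
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])

beta = np.array([2.0, 0.5, -1.0])    # (k x 1) parameter vector
sigma = 1.5                          # std. deviation of the disturbances
u = rng.normal(0, sigma, n)          # (n x 1) disturbance vector

y = X @ beta + u                     # equation (3): y = X*beta + u
print(X.shape, beta.shape, y.shape)  # (50, 3) (3,) (50,)
```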

Assumptions of the CLRM in Matrix Notation

First, recall the assumptions in scalar notation:

1. E ui   0, for each i




0 for i  j (no auto/serial correlation)
 
2. E ui u j   2
 for i  j (homoscedasticity)



3. X 2, X 3,..., Xk are non-stochastic or fixed.
4. No exact linear relationship among the X varia variables
bles (no multicollinearity)

5. ui  N 0,  2 
Now we will look at those exact same assumptions, but expressed in vector-matrix notation.

1. $E(u) = E\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} E(u_1) \\ E(u_2) \\ \vdots \\ E(u_n) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0}$

2. $E(uu') = E\left( \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \right) = E\begin{bmatrix} u_1^2 & u_1 u_2 & \cdots & u_1 u_n \\ u_2 u_1 & u_2^2 & \cdots & u_2 u_n \\ \vdots & \vdots & & \vdots \\ u_n u_1 & u_n u_2 & \cdots & u_n^2 \end{bmatrix}$

$= \begin{bmatrix} E(u_1^2) & E(u_1 u_2) & \cdots & E(u_1 u_n) \\ E(u_2 u_1) & E(u_2^2) & \cdots & E(u_2 u_n) \\ \vdots & \vdots & & \vdots \\ E(u_n u_1) & E(u_n u_2) & \cdots & E(u_n^2) \end{bmatrix} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \sigma^2 I$

This is called the variance-covariance matrix of the disturbances $u_i$. The diagonal elements are the (constant) variances, while the off-diagonal elements are the (zero) covariances. The matrix is symmetric.
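
A quick way to see this structure is to simulate it. The following sketch (illustrative values only) averages $uu'$ over many independent draws of the disturbance vector; the result is approximately $\sigma^2 I$.

```python
import numpy as np

# Monte Carlo check of E(uu') = sigma^2 * I for i.i.d. disturbances:
# average the outer product u u' over many draws (made-up values).
rng = np.random.default_rng(0)
n, sigma, reps = 4, 1.5, 200_000

U = rng.normal(0, sigma, size=(reps, n))  # each row is one draw of u
approx = U.T @ U / reps                   # Monte Carlo estimate of E(uu')
print(np.round(approx, 2))                # close to sigma^2 * I = 2.25 * I
```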

3. The $(n \times k)$ data matrix $X$ is non-stochastic (fixed).

4. The data matrix $X$ has full column rank. This means that no column of $X$ is linearly dependent on the others, i.e. there is no exact linear relationship among the $X$ variables. This implies no multicollinearity. (Recall that the rank of a matrix is the maximum number of linearly independent rows or columns.)

5. $u \sim N(\mathbf{0}, \sigma^2 I)$; essentially no change here, though note the zero vector $\mathbf{0}$ in place of the scalar zero.

OLS Estimation in Matrix Notation

First, write down the k-variable Sample Regression Function (SRF):

$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \dots + \hat{\beta}_k X_{ki} + \hat{u}_i$

In matrix notation:

$y = X\hat{\beta} + \hat{u}$

Y  1 X X 31 . . Xk 1   ˆ  uˆ 
 1   21  1   1 
Y  1 X X 32 . . Xk 2  ˆ2  uˆ2 
 2   22    
Y  1 X X 33 . . Xk 3  ˆ3  uˆ3 
 3    23
    
.  . . . . . . .  . 
      
.  . . . . . . .  . 
      
Yn  1 X 2n X 3n . . Xkn ˆn  uˆn 

Recall that OLS estimators are found by minimising the Residual Sum of Squares

$\min \sum_i \hat{u}_i^2 = \sum_i \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_{2i} - \dots - \hat{\beta}_k X_{ki} \right)^2$   (4)

Expression (4) in matrix notation can be written as $\min \hat{u}'\hat{u}$, since

$\hat{u}'\hat{u} = \begin{bmatrix} \hat{u}_1 & \hat{u}_2 & \cdots & \hat{u}_n \end{bmatrix} \begin{bmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_n \end{bmatrix} = \hat{u}_1^2 + \hat{u}_2^2 + \dots + \hat{u}_n^2 = \sum_i \hat{u}_i^2$

Now, $\hat{u} = y - X\hat{\beta}$, so

$\hat{u}'\hat{u} = (y - X\hat{\beta})'(y - X\hat{\beta}) = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}$

remembering that $(AB)' = B'A'$.

So, we are required to minimise $\hat{u}'\hat{u} = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}$ with respect to $\hat{\beta}$. To do so, we have to set the partial derivative of $\hat{u}'\hat{u}$ with respect to $\hat{\beta}$ equal to zero. Doing so, we get the following result:

ˆ u
u ˆ
 2X y  2XXˆ  0
ˆ
 XXˆ  X y
1 1
 X  X  XX ˆ  XX X y
1
  ˆ  X X X  y
   
k 1 kk    n1
kn

This result is the matrix counterpart of the scalar OLS estimator of $\hat{\beta}_2$. You will recall that for the two-variable sample regression function $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$, the OLS estimator is $\hat{\beta}_2 = \dfrac{\sum_i x_i y_i}{\sum_i x_i^2}$ (where the lower-case letters denote deviations from sample means).
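
As a sketch of the computation, the estimator can be obtained in numpy by solving the normal equations $X'X\hat{\beta} = X'y$ directly (simulated data; all values are made-up assumptions):

```python
import numpy as np

# OLS in matrix form, beta_hat = (X'X)^(-1) X'y, on simulated data.
rng = np.random.default_rng(42)
n = 50
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

# Solve the normal equations X'X beta_hat = X'y directly;
# np.linalg.solve is numerically safer than forming (X'X)^(-1).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # close to the "true" values (2.0, 0.5, -1.0)
```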

We will now express, in vector-matrix notation, some further standard results in econometrics.

Variance-covariance matrix of $\hat{\beta}$:

Recall $\hat{\beta} = (X'X)^{-1}X'y$ and $y = X\beta + u$; substitute and obtain

$\hat{\beta} = (X'X)^{-1}X'(X\beta + u)$
$\quad = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u$
$\quad = \beta + (X'X)^{-1}X'u$

which means

$\hat{\beta} - \beta = (X'X)^{-1}X'u$

Now, by definition,

$\text{Var-Cov}(\hat{\beta}) = E\left[ (\hat{\beta} - \beta)(\hat{\beta} - \beta)' \right]$
$\quad = E\left[ (X'X)^{-1}X'u \left( (X'X)^{-1}X'u \right)' \right]$
$\quad = E\left[ (X'X)^{-1}X'uu'X(X'X)^{-1} \right]$

Since the X's are non-stochastic (i.e. constant), $E(X) = X$, so we have

$\text{Var-Cov}(\hat{\beta}) = (X'X)^{-1}X'E(uu')X(X'X)^{-1}$
$\quad = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1}$
$\quad = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}$
$\quad = \sigma^2 (X'X)^{-1}$

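This result can be checked by simulation. The sketch below (illustrative values) holds $X$ fixed, as assumption 3 requires, redraws $u$ many times, and compares the empirical covariance of the resulting OLS estimates with $\sigma^2 (X'X)^{-1}$.

```python
import numpy as np

# Monte Carlo check of Var-Cov(beta_hat) = sigma^2 (X'X)^(-1).
rng = np.random.default_rng(1)
n, sigma, reps = 50, 1.5, 20_000
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta = np.array([2.0, 0.5])

XtX_inv = np.linalg.inv(X.T @ X)
estimates = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + rng.normal(0, sigma, n)   # redraw the disturbances
    estimates[r] = XtX_inv @ (X.T @ y)       # OLS estimate for this draw

print(np.round(np.cov(estimates, rowvar=False), 4))  # empirical covariance
print(np.round(sigma**2 * XtX_inv, 4))               # sigma^2 (X'X)^(-1)
```
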
R-squared and population variance:

Recall, in the scalar case, $\hat{\sigma}^2 = \dfrac{\sum_i \hat{u}_i^2}{n - k}$; in the k-variable case, in vector notation, we write $\hat{\sigma}^2 = \dfrac{\hat{u}'\hat{u}}{n - k}$.

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\hat{\beta}'X'y - n\bar{Y}^2}{y'y - n\bar{Y}^2}$
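
In numpy, these quantities follow directly from the residual vector; a minimal sketch on simulated (made-up) data:

```python
import numpy as np

# sigma_hat^2 = u_hat'u_hat / (n - k), standard errors from the diagonal
# of sigma_hat^2 (X'X)^(-1), and R^2 from the formula above.
rng = np.random.default_rng(42)
n, k = 50, 3
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat                      # residual vector

sigma2_hat = (u_hat @ u_hat) / (n - k)        # estimate of sigma^2
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

ybar = y.mean()
R2 = (beta_hat @ X.T @ y - n * ybar**2) / (y @ y - n * ybar**2)
print(sigma2_hat, se, R2)
```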

Hypothesis Testing:

We have the following distributions, $u \sim N(\mathbf{0}, \sigma^2 I)$ and $\hat{\beta} \sim N\left( \beta, \sigma^2 (X'X)^{-1} \right)$, which are the starting points for hypothesis testing. Since, in practice, $\sigma^2$ is unknown, it has to be replaced by its sample estimate $\hat{\sigma}^2$.

Recall the t statistic $t = \dfrac{\hat{\beta}_i - \beta_i}{se(\hat{\beta}_i)}$ with $(n - k)$ degrees of freedom, and that F tests are carried out by comparing restricted and unrestricted residual sums of squares; for example, the Chow test for structural stability uses the statistic $F = \dfrac{(RSS_R - RSS_{UR})/k}{RSS_{UR}/(n_1 + n_2 - 2k)}$.

In the k-variable case, for the null hypothesis $H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$, we can express the F statistic for overall significance in vector notation as

$F = \dfrac{(\hat{\beta}'X'y - n\bar{Y}^2)/(k - 1)}{(y'y - \hat{\beta}'X'y)/(n - k)}$

Notice that this bears a close resemblance to the scalar result $F = \dfrac{R^2/(k - 1)}{(1 - R^2)/(n - k)}$ which, as you will recall, is another way of calculating the F statistic using R-squared.
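
The following sketch (simulated, made-up data) computes the overall-significance F statistic both ways and confirms that the two formulas give the same number:

```python
import numpy as np

# Overall-significance F statistic, from the matrix formula and from R^2.
rng = np.random.default_rng(42)
n, k = 50, 3
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
ybar = y.mean()
ess = beta_hat @ X.T @ y - n * ybar**2   # explained sum of squares
rss = y @ y - beta_hat @ X.T @ y         # residual sum of squares
tss = y @ y - n * ybar**2                # total sum of squares

F_matrix = (ess / (k - 1)) / (rss / (n - k))
R2 = ess / tss
F_r2 = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(F_matrix, F_r2)                    # identical up to rounding
```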

When we have linear restrictions, the general procedure for hypothesis testing is as follows:

If $\hat{u}_R$ are the residuals from the restricted least-squares regression, then we can calculate $RSS_R = \sum \hat{u}_R^2 = \hat{u}_R'\hat{u}_R$, and in a similar fashion, if $\hat{u}_{UR}$ are the residuals from the unrestricted least-squares regression, we have $RSS_{UR} = \sum \hat{u}_{UR}^2 = \hat{u}_{UR}'\hat{u}_{UR}$. Then

$F = \dfrac{(\hat{u}_R'\hat{u}_R - \hat{u}_{UR}'\hat{u}_{UR})/m}{\hat{u}_{UR}'\hat{u}_{UR}/(n - k)}$

where m is the number of linear restrictions, k is the number of parameters (including the intercept) in the unrestricted regression, and n is the number of observations.
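
As a sketch of the procedure, the following (made-up) example imposes the single restriction $\beta_3 = 0$, so $m = 1$:

```python
import numpy as np

# Restricted vs. unrestricted F test: fit both regressions, form RSS_R
# and RSS_UR, and apply F = ((RSS_R - RSS_UR)/m) / (RSS_UR/(n - k)).
rng = np.random.default_rng(42)
n, k, m = 50, 3, 1
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

def rss(X_, y_):
    """Residual sum of squares from an OLS fit of y_ on X_."""
    b = np.linalg.solve(X_.T @ X_, X_.T @ y_)
    e = y_ - X_ @ b
    return e @ e

rss_ur = rss(X, y)          # unrestricted: all k regressors
rss_r = rss(X[:, :2], y)    # restricted: drop X_3, imposing beta_3 = 0

F = ((rss_r - rss_ur) / m) / (rss_ur / (n - k))
print(F)
```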

Generalised Least Squares (GLS):

In order to simplify matters we will work with a 3 x 3 matrix.

In OLS we assumed $E(uu') = \sigma^2 I$, where I is just the identity matrix. In the 3 x 3 case, if we expand $E(uu')$, we obtain

$E(uu') = \begin{bmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{bmatrix}$

which represents the OLS homoscedasticity and no-serial-correlation assumption.

Let us depart from the OLS assumption and now assume that $E(uu') = \sigma^2 V$, where V is a known (n x n) variance-covariance matrix. The elements on the main diagonal of V are the variances (possibly not all the same), and the off-diagonal elements are the covariances (autocorrelations) of the error terms (possibly not all equal to zero). V could be of three types.

1 0 0
 
1. 0 1 0 which is the same as I and would yield the OLS assumption. In fact
 
0 0 1
OLS is just a special case of GLS.

2. $\begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \end{bmatrix}$; now we have heteroscedasticity but no serial correlation.

3. $\begin{bmatrix} \sigma_1^2 & \text{cov}(u_i, u_j) & \text{cov}(u_i, u_j) \\ \text{cov}(u_i, u_j) & \sigma_2^2 & \text{cov}(u_i, u_j) \\ \text{cov}(u_i, u_j) & \text{cov}(u_i, u_j) & \sigma_3^2 \end{bmatrix}$; here we have both heteroscedasticity and serial correlation.

Now, if $y = X\beta + u$ with $E(u) = \mathbf{0}$ and $\text{var-cov}(u) = \sigma^2 V$, and if $\sigma^2$ is unknown, V represents the assumed underlying structure of the variances and covariances among the random errors $u_i$. Then

$\hat{\beta}_{GLS} = (X'V^{-1}X)^{-1} X'V^{-1} y$ and $\text{var-cov}(\hat{\beta}_{GLS}) = \sigma^2 (X'V^{-1}X)^{-1}$.

In practice, we may not know either $\sigma^2$ or, indeed, the structure of V; then we have to estimate both. Estimated GLS is known as EGLS or Feasible GLS (FGLS):

$\hat{\beta}_{EGLS} = (X'\hat{V}^{-1}X)^{-1} X'\hat{V}^{-1} y$ and $\text{var-cov}(\hat{\beta}_{EGLS}) = \hat{\sigma}^2 (X'\hat{V}^{-1}X)^{-1}$, where $\hat{V}$ is an estimator of V.
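
A minimal sketch of GLS with a known, diagonal (heteroscedastic) V of type 2 above, assuming $\sigma^2 = 1$ so that $\text{var-cov}(u) = V$; all numbers are illustrative:

```python
import numpy as np

# GLS sketch: beta_GLS = (X'V^(-1)X)^(-1) X'V^(-1) y with known V.
rng = np.random.default_rng(7)
n = 50
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta = np.array([2.0, 0.5])

v = np.linspace(1.0, 9.0, n)    # diagonal of V: unequal variances
u = rng.normal(0, np.sqrt(v))   # var(u_i) = v_i, no serial correlation
y = X @ beta + u

V_inv = np.diag(1.0 / v)        # V is diagonal, so V^(-1) is too
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
varcov_gls = np.linalg.inv(X.T @ V_inv @ X)   # sigma^2 = 1 here
print(beta_gls)
print(varcov_gls)
```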

BLUE Properties of OLS Estimators in Matrix Notation

When is an estimator (like $\hat{\beta}$) BLUE (Best Linear Unbiased Estimator)?

1. When it is a linear function of a random variable, like Y.
2. When it is unbiased, i.e. $E(\hat{\beta}) = \beta$.
3. When it is "best" in the sense that it has minimum variance in the class of all linear unbiased estimators.

Let us see these properties expressed in matrix notation.

1 1
1. We know that ̂  XX X y , XX X is just a matrix of fixed numbers, so
̂ is a linear function of y.. It is, therefore, a linear estimator.

1
2. The PRF is y = X  u , substitute for y in ̂  X X X y and get
1
̂  X X X  X  u 
1
   X X  X u
1
Taking expectations we obtain, E (ˆ)  E ( )  XX XE (u)  E (ˆ)   , which
means ̂ is an unbiased estimator of  .

3. Let $\hat{\beta}^*$ be any other linear estimator of $\beta$, which we can write as

$\hat{\beta}^* = \left[ (X'X)^{-1}X' + C \right] y$, where C is a matrix of constants

$\Rightarrow \hat{\beta}^* = \left[ (X'X)^{-1}X' + C \right](X\beta + u)$
$\quad = \beta + CX\beta + (X'X)^{-1}X'u + Cu$

If $\hat{\beta}^*$ is to be an unbiased estimator of $\beta$, we must have $CX = 0$; if that is the case, then we have $\hat{\beta}^* - \beta = (X'X)^{-1}X'u + Cu$.

Now, $\text{Var-Cov}(\hat{\beta}^*) = E\left[ (\hat{\beta}^* - \beta)(\hat{\beta}^* - \beta)' \right] = E\left\{ \left[ (X'X)^{-1}X'u + Cu \right] \left[ (X'X)^{-1}X'u + Cu \right]' \right\}$

Simplify, using $E(uu') = \sigma^2 I$ and noting that the cross terms vanish because $CX = 0$ (and hence $X'C' = 0$), and obtain:

$\text{Var-Cov}(\hat{\beta}^*) = \sigma^2 (X'X)^{-1} + \sigma^2 CC'$

But $\sigma^2 (X'X)^{-1}$ is $\text{Var-Cov}(\hat{\beta})$ and $CC'$ is a positive semi-definite matrix, so

$\text{Var-Cov}(\hat{\beta}^*) = \text{Var-Cov}(\hat{\beta}) + \sigma^2 CC'$, implying $\text{Var-Cov}(\hat{\beta}^*) \geq \text{Var-Cov}(\hat{\beta})$.

Hence, $\hat{\beta}$ has the smallest variance, making it BLUE.
