
The Matrix Approach to the Classical Linear Regression Model (CLRM)
(Gujarati, Appendix C)

We will begin with the k-variable model. The Population Regression Function (PRF) is as follows:

$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i; \quad i = 1, 2, 3, \dots, n$   (1)

where,
$\beta_1$: intercept
$\beta_2, \dots, \beta_k$: partial slope coefficients
$u$: stochastic disturbance term
$n$: population size

Remember that the PRF is a conditional expectation $E(Y \mid X_{2i}, X_{3i}, \dots, X_{ki})$.

Expand (1):

$Y_1 = \beta_1 + \beta_2 X_{21} + \beta_3 X_{31} + \dots + \beta_k X_{k1} + u_1$
$Y_2 = \beta_1 + \beta_2 X_{22} + \beta_3 X_{32} + \dots + \beta_k X_{k2} + u_2$
$Y_3 = \beta_1 + \beta_2 X_{23} + \beta_3 X_{33} + \dots + \beta_k X_{k3} + u_3$   (2)
$\vdots$
$Y_n = \beta_1 + \beta_2 X_{2n} + \beta_3 X_{3n} + \dots + \beta_k X_{kn} + u_n$

Write down (2) in matrix notation:


$\begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{21} & X_{31} & \cdots & X_{k1} \\ 1 & X_{22} & X_{32} & \cdots & X_{k2} \\ 1 & X_{23} & X_{33} & \cdots & X_{k3} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{2n} & X_{3n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ \vdots \\ u_n \end{bmatrix}$

$\underset{(n \times 1)}{y} = \underset{(n \times k)}{X} \, \underset{(k \times 1)}{\beta} + \underset{(n \times 1)}{u}$

where:
$y$ is an $(n \times 1)$ column vector of observations on the dependent variable $Y$;
$X$ is an $(n \times k)$ data matrix of the $(k - 1)$ variables $X_2$ to $X_k$, where the column of 1's represents the intercept term;
$\beta$ is a $(k \times 1)$ column vector of the unknown parameters $\beta_1, \dots, \beta_k$;
$u$ is an $(n \times 1)$ column vector of the $n$ disturbances $u_i$.

So, in vector notation, the PRF is written as $y = X\beta + u$   (3)
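
To make the notation concrete, here is a minimal numpy sketch that builds the data matrix $X$ with a leading column of 1's and generates $y$ from equation (3). All numbers are made-up assumptions for illustration only, not values from the text.

```python
import numpy as np

# A made-up simulation of the PRF y = X*beta + u: n = 50 observations,
# k = 3 parameters (intercept plus two regressors).
rng = np.random.default_rng(42)
n = 50

# Data matrix X: a column of 1's for the intercept, then X_2 and X_3.
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])

beta = np.array([2.0, 0.5, -1.0])    # (k x 1) parameter vector
sigma = 1.5                          # std. deviation of the disturbances
u = rng.normal(0, sigma, n)          # (n x 1) disturbance vector

y = X @ beta + u                     # equation (3): y = X*beta + u
print(X.shape, beta.shape, y.shape)  # (50, 3) (3,) (50,)
```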

Assumptions of the CLRM in Matrix Notation

First, recall the assumptions in scalar notation:

1. E ui   0, for each i




0 for i  j (no auto/serial correlation)
 
2. E ui u j   2
 for i  j (homoscedasticity)



3. X 2, X 3,..., Xk are non-stochastic or fixed.
4. No exact linear relationship among the X varia variables
bles (no multicollinearity)

5. ui  N 0,  2 
Now we will look at those exact same assumptions, but expressed in vector-matrix notation.

1. $E(u) = E\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} E(u_1) \\ E(u_2) \\ \vdots \\ E(u_n) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0}$

2. $E(uu') = E\left( \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix} \right) = E\begin{bmatrix} u_1^2 & u_1 u_2 & \cdots & u_1 u_n \\ u_2 u_1 & u_2^2 & \cdots & u_2 u_n \\ \vdots & \vdots & & \vdots \\ u_n u_1 & u_n u_2 & \cdots & u_n^2 \end{bmatrix}$

$= \begin{bmatrix} E(u_1^2) & E(u_1 u_2) & \cdots & E(u_1 u_n) \\ E(u_2 u_1) & E(u_2^2) & \cdots & E(u_2 u_n) \\ \vdots & \vdots & & \vdots \\ E(u_n u_1) & E(u_n u_2) & \cdots & E(u_n^2) \end{bmatrix} = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \sigma^2 I$

This is called the variance-covariance matrix of the disturbances $u_i$. The diagonal elements are the (constant) variances, while the off-diagonal elements are the (zero) covariances. The matrix is symmetric.
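
A quick way to see this structure is to simulate it. The following sketch (illustrative values only) averages $uu'$ over many independent draws of the disturbance vector; the result is approximately $\sigma^2 I$.

```python
import numpy as np

# Monte Carlo check of E(uu') = sigma^2 * I for i.i.d. disturbances:
# average the outer product u u' over many draws (made-up values).
rng = np.random.default_rng(0)
n, sigma, reps = 4, 1.5, 200_000

U = rng.normal(0, sigma, size=(reps, n))  # each row is one draw of u
approx = U.T @ U / reps                   # Monte Carlo estimate of E(uu')
print(np.round(approx, 2))                # close to sigma^2 * I = 2.25 * I
```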

3. The $(n \times k)$ data matrix $X$ is non-stochastic (fixed).

4. The data matrix $X$ has full column rank. This means that no column of $X$ is linearly dependent on the others, i.e. there is no exact linear relationship among the $X$ variables. This implies no multicollinearity. (Recall that the rank of a matrix is the maximum number of linearly independent rows or columns.)

5. $u \sim N(\mathbf{0}, \sigma^2 I)$; essentially no change here, though note the zero vector $\mathbf{0}$ in place of the scalar zero.

OLS Estimation in Matrix Notation

First, write down the k-variable Sample Regression Function (SRF):

$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \dots + \hat{\beta}_k X_{ki} + \hat{u}_i$

In matrix notation:

$y = X\hat{\beta} + \hat{u}$

Y  1 X X 31 . . Xk 1   ˆ  uˆ 
 1   21  1   1 
Y  1 X X 32 . . Xk 2  ˆ2  uˆ2 
 2   22    
Y  1 X X 33 . . Xk 3  ˆ3  uˆ3 
 3    23
    
.  . . . . . . .  . 
      
.  . . . . . . .  . 
      
Yn  1 X 2n X 3n . . Xkn ˆn  uˆn 

Recall that OLS estimators are found by minimising the Residual Sum of Squares

$\min \sum_i \hat{u}_i^2 = \sum_i \left( Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_{2i} - \dots - \hat{\beta}_k X_{ki} \right)^2$   (4)

Expression (4) in matrix notation can be written as $\min \hat{u}'\hat{u}$, since

$\hat{u}'\hat{u} = \begin{bmatrix} \hat{u}_1 & \hat{u}_2 & \cdots & \hat{u}_n \end{bmatrix} \begin{bmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_n \end{bmatrix} = \hat{u}_1^2 + \hat{u}_2^2 + \dots + \hat{u}_n^2 = \sum_i \hat{u}_i^2$

Now, $\hat{u} = y - X\hat{\beta}$, so

$\hat{u}'\hat{u} = (y - X\hat{\beta})'(y - X\hat{\beta}) = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}$

remembering that $(AB)' = B'A'$.

So, we are required to minimise $\hat{u}'\hat{u} = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}$ with respect to $\hat{\beta}$. To do so, we have to set the partial derivative of $\hat{u}'\hat{u}$ with respect to $\hat{\beta}$ equal to zero. Doing so, we get the following result:

ˆ u
u ˆ
 2X y  2XXˆ  0
ˆ
 XXˆ  X y
1 1
 X  X  XX ˆ  XX X y
1
  ˆ  X X X  y
   
k 1 kk    n1
kn

This result is the matrix counterpart of the scalar OLS estimator of $\hat{\beta}_2$. You will recall that for the two-variable sample regression function $Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + \hat{u}_i$, the OLS estimator is $\hat{\beta}_2 = \dfrac{\sum_i x_i y_i}{\sum_i x_i^2}$ (where the lower-case letters denote deviations from sample means).
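
As a sketch of the computation, the estimator can be obtained in numpy by solving the normal equations $X'X\hat{\beta} = X'y$ directly (simulated data; all values are made-up assumptions):

```python
import numpy as np

# OLS in matrix form, beta_hat = (X'X)^(-1) X'y, on simulated data.
rng = np.random.default_rng(42)
n = 50
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

# Solve the normal equations X'X beta_hat = X'y directly;
# np.linalg.solve is numerically safer than forming (X'X)^(-1).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)   # close to the "true" values (2.0, 0.5, -1.0)
```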

We will now express, in vector-matrix notation, some further standard results in econometrics.

Variance-covariance matrix of $\hat{\beta}$:

Recall $\hat{\beta} = (X'X)^{-1}X'y$ and $y = X\beta + u$; substitute and obtain

$\hat{\beta} = (X'X)^{-1}X'(X\beta + u)$
$\quad = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u$
$\quad = \beta + (X'X)^{-1}X'u$

which means

$\hat{\beta} - \beta = (X'X)^{-1}X'u$

Now, by definition,

$\text{Var-Cov}(\hat{\beta}) = E\left[ (\hat{\beta} - \beta)(\hat{\beta} - \beta)' \right]$
$\quad = E\left[ (X'X)^{-1}X'u \left( (X'X)^{-1}X'u \right)' \right]$
$\quad = E\left[ (X'X)^{-1}X'uu'X(X'X)^{-1} \right]$

Since the X's are non-stochastic (i.e. constant), $E(X) = X$, so we have

$\text{Var-Cov}(\hat{\beta}) = (X'X)^{-1}X'E(uu')X(X'X)^{-1}$
$\quad = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1}$
$\quad = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1}$
$\quad = \sigma^2 (X'X)^{-1}$

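This result can be checked by simulation. The sketch below (illustrative values) holds $X$ fixed, as assumption 3 requires, redraws $u$ many times, and compares the empirical covariance of the resulting OLS estimates with $\sigma^2 (X'X)^{-1}$.

```python
import numpy as np

# Monte Carlo check of Var-Cov(beta_hat) = sigma^2 (X'X)^(-1).
rng = np.random.default_rng(1)
n, sigma, reps = 50, 1.5, 20_000
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta = np.array([2.0, 0.5])

XtX_inv = np.linalg.inv(X.T @ X)
estimates = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + rng.normal(0, sigma, n)   # redraw the disturbances
    estimates[r] = XtX_inv @ (X.T @ y)       # OLS estimate for this draw

print(np.round(np.cov(estimates, rowvar=False), 4))  # empirical covariance
print(np.round(sigma**2 * XtX_inv, 4))               # sigma^2 (X'X)^(-1)
```
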
R-squared and population variance:

Recall, in the scalar case, $\hat{\sigma}^2 = \dfrac{\sum_i \hat{u}_i^2}{n - k}$; in the k-variable case, in vector notation, we write $\hat{\sigma}^2 = \dfrac{\hat{u}'\hat{u}}{n - k}$.

$R^2 = \dfrac{ESS}{TSS} = \dfrac{\hat{\beta}'X'y - n\bar{Y}^2}{y'y - n\bar{Y}^2}$
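
In numpy, these quantities follow directly from the residual vector; a minimal sketch on simulated (made-up) data:

```python
import numpy as np

# sigma_hat^2 = u_hat'u_hat / (n - k), standard errors from the diagonal
# of sigma_hat^2 (X'X)^(-1), and R^2 from the formula above.
rng = np.random.default_rng(42)
n, k = 50, 3
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ beta_hat                      # residual vector

sigma2_hat = (u_hat @ u_hat) / (n - k)        # estimate of sigma^2
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

ybar = y.mean()
R2 = (beta_hat @ X.T @ y - n * ybar**2) / (y @ y - n * ybar**2)
print(sigma2_hat, se, R2)
```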

Hypothesis Testing:

We have the following distributions, $u \sim N(\mathbf{0}, \sigma^2 I)$ and $\hat{\beta} \sim N\left( \beta, \sigma^2 (X'X)^{-1} \right)$, which are the starting points for hypothesis testing. Since, in practice, $\sigma^2$ is unknown, it has to be replaced by its sample estimate $\hat{\sigma}^2$.

Recall the t statistic $t = \dfrac{\hat{\beta}_i - \beta_i}{se(\hat{\beta}_i)}$ with $(n - k)$ degrees of freedom, and that F tests are carried out by comparing restricted and unrestricted residual sums of squares; for example, the Chow test for structural stability uses the statistic $F = \dfrac{(RSS_R - RSS_{UR})/k}{RSS_{UR}/(n_1 + n_2 - 2k)}$.

In the k-variable case, for the null hypothesis $H_0: \beta_2 = \beta_3 = \dots = \beta_k = 0$, we can express the F statistic for overall significance in vector notation as

$F = \dfrac{(\hat{\beta}'X'y - n\bar{Y}^2)/(k - 1)}{(y'y - \hat{\beta}'X'y)/(n - k)}$

Notice that this bears a close resemblance to the scalar result $F = \dfrac{R^2/(k - 1)}{(1 - R^2)/(n - k)}$ which, as you will recall, is another way of calculating the F statistic using R-squared.
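
The following sketch (simulated, made-up data) computes the overall-significance F statistic both ways and confirms that the two formulas give the same number:

```python
import numpy as np

# Overall-significance F statistic, from the matrix formula and from R^2.
rng = np.random.default_rng(42)
n, k = 50, 3
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
ybar = y.mean()
ess = beta_hat @ X.T @ y - n * ybar**2   # explained sum of squares
rss = y @ y - beta_hat @ X.T @ y         # residual sum of squares
tss = y @ y - n * ybar**2                # total sum of squares

F_matrix = (ess / (k - 1)) / (rss / (n - k))
R2 = ess / tss
F_r2 = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(F_matrix, F_r2)                    # identical up to rounding
```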

When we have linear restrictions, the general procedure for hypothesis testing is as follows:

If $\hat{u}_R$ are the residuals from the restricted least-squares regression, then we can calculate $RSS_R = \sum \hat{u}_R^2 = \hat{u}_R'\hat{u}_R$, and in a similar fashion, if $\hat{u}_{UR}$ are the residuals from the unrestricted least-squares regression, we have $RSS_{UR} = \sum \hat{u}_{UR}^2 = \hat{u}_{UR}'\hat{u}_{UR}$. Then

$F = \dfrac{(\hat{u}_R'\hat{u}_R - \hat{u}_{UR}'\hat{u}_{UR})/m}{\hat{u}_{UR}'\hat{u}_{UR}/(n - k)}$

where m is the number of linear restrictions, k is the number of parameters (including the intercept) in the unrestricted regression, and n is the number of observations.
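
As a sketch of the procedure, the following (made-up) example imposes the single restriction $\beta_3 = 0$, so $m = 1$:

```python
import numpy as np

# Restricted vs. unrestricted F test: fit both regressions, form RSS_R
# and RSS_UR, and apply F = ((RSS_R - RSS_UR)/m) / (RSS_UR/(n - k)).
rng = np.random.default_rng(42)
n, k, m = 50, 3, 1
X = np.column_stack([np.ones(n),
                     rng.uniform(0, 10, n),
                     rng.uniform(0, 5, n)])
y = X @ np.array([2.0, 0.5, -1.0]) + rng.normal(0, 1.5, n)

def rss(X_, y_):
    """Residual sum of squares from an OLS fit of y_ on X_."""
    b = np.linalg.solve(X_.T @ X_, X_.T @ y_)
    e = y_ - X_ @ b
    return e @ e

rss_ur = rss(X, y)          # unrestricted: all k regressors
rss_r = rss(X[:, :2], y)    # restricted: drop X_3, imposing beta_3 = 0

F = ((rss_r - rss_ur) / m) / (rss_ur / (n - k))
print(F)
```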

Generalised Least Squares (GLS):

In order to simplify matters we will work with a 3 x 3 matrix.

In OLS we assumed $E(uu') = \sigma^2 I$, where I is just the identity matrix. In the 3 x 3 case, if we expand $E(uu')$, we obtain

$E(uu') = \begin{bmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{bmatrix}$

which represents the OLS homoscedasticity and no-serial-correlation assumption.

Let us depart from the OLS assumption and now assume that $E(uu') = \sigma^2 V$, where V is a known (n x n) variance-covariance matrix. The elements on the main diagonal of V are the variances (possibly not all the same), and the off-diagonal elements are the covariances (autocorrelations) of the error terms (possibly not all equal to zero). V could be of three types.

1 0 0
 
1. 0 1 0 which is the same as I and would yield the OLS assumption. In fact
 
0 0 1
OLS is just a special case of GLS.

2. $\begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \end{bmatrix}$; now we have heteroscedasticity but no serial correlation.

3. $\begin{bmatrix} \sigma_1^2 & \text{cov}(u_i, u_j) & \text{cov}(u_i, u_j) \\ \text{cov}(u_i, u_j) & \sigma_2^2 & \text{cov}(u_i, u_j) \\ \text{cov}(u_i, u_j) & \text{cov}(u_i, u_j) & \sigma_3^2 \end{bmatrix}$; here we have both heteroscedasticity and serial correlation.

Now, if $y = X\beta + u$ with $E(u) = \mathbf{0}$ and $\text{var-cov}(u) = \sigma^2 V$, and if $\sigma^2$ is unknown, V represents the assumed underlying structure of the variances and covariances among the random errors $u_i$. Then

$\hat{\beta}_{GLS} = (X'V^{-1}X)^{-1} X'V^{-1} y$ and $\text{var-cov}(\hat{\beta}_{GLS}) = \sigma^2 (X'V^{-1}X)^{-1}$.

In practice, we may not know either $\sigma^2$ or, indeed, the structure of V; then we have to estimate both. Estimated GLS is known as EGLS or Feasible GLS (FGLS):

$\hat{\beta}_{EGLS} = (X'\hat{V}^{-1}X)^{-1} X'\hat{V}^{-1} y$ and $\text{var-cov}(\hat{\beta}_{EGLS}) = \hat{\sigma}^2 (X'\hat{V}^{-1}X)^{-1}$, where $\hat{V}$ is an estimator of V.
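
A minimal sketch of GLS with a known, diagonal (heteroscedastic) V of type 2 above, assuming $\sigma^2 = 1$ so that $\text{var-cov}(u) = V$; all numbers are illustrative:

```python
import numpy as np

# GLS sketch: beta_GLS = (X'V^(-1)X)^(-1) X'V^(-1) y with known V.
rng = np.random.default_rng(7)
n = 50
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta = np.array([2.0, 0.5])

v = np.linspace(1.0, 9.0, n)    # diagonal of V: unequal variances
u = rng.normal(0, np.sqrt(v))   # var(u_i) = v_i, no serial correlation
y = X @ beta + u

V_inv = np.diag(1.0 / v)        # V is diagonal, so V^(-1) is too
beta_gls = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
varcov_gls = np.linalg.inv(X.T @ V_inv @ X)   # sigma^2 = 1 here
print(beta_gls)
print(varcov_gls)
```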

BLUE Properties of OLS Estimators in Matrix Notation

When is an estimator (like $\hat{\beta}$) BLUE (Best Linear Unbiased Estimator)?

1. When it is a linear function of a random variable, like Y.
2. When it is unbiased, i.e. $E(\hat{\beta}) = \beta$.
3. When it is "best" in the sense that it has minimum variance in the class of all linear unbiased estimators.

Let us see these properties expressed in matrix notation.

1 1
1. We know that ̂  XX X y , XX X is just a matrix of fixed numbers, so
̂ is a linear function of y.. It is, therefore, a linear estimator.

1
2. The PRF is y = X  u , substitute for y in ̂  X X X y and get
1
̂  X X X  X  u 
1
   X X  X u
1
Taking expectations we obtain, E (ˆ)  E ( )  XX XE (u)  E (ˆ)   , which
means ̂ is an unbiased estimator of  .

3. Let $\hat{\beta}^*$ be any other linear estimator of $\beta$, which we can write as

$\hat{\beta}^* = \left[ (X'X)^{-1}X' + C \right] y$, where C is a matrix of constants

$\Rightarrow \hat{\beta}^* = \left[ (X'X)^{-1}X' + C \right](X\beta + u)$
$\quad = \beta + CX\beta + (X'X)^{-1}X'u + Cu$

If $\hat{\beta}^*$ is to be an unbiased estimator of $\beta$, we must have $CX = 0$; if that is the case, then we have $\hat{\beta}^* - \beta = (X'X)^{-1}X'u + Cu$.

Now, $\text{Var-Cov}(\hat{\beta}^*) = E\left[ (\hat{\beta}^* - \beta)(\hat{\beta}^* - \beta)' \right] = E\left\{ \left[ (X'X)^{-1}X'u + Cu \right] \left[ (X'X)^{-1}X'u + Cu \right]' \right\}$

Simplify, using $E(uu') = \sigma^2 I$ and noting that the cross terms vanish because $CX = 0$ (and hence $X'C' = 0$), and obtain:

$\text{Var-Cov}(\hat{\beta}^*) = \sigma^2 (X'X)^{-1} + \sigma^2 CC'$

But $\sigma^2 (X'X)^{-1}$ is $\text{Var-Cov}(\hat{\beta})$ and $CC'$ is a positive semi-definite matrix, so

$\text{Var-Cov}(\hat{\beta}^*) = \text{Var-Cov}(\hat{\beta}) + \sigma^2 CC'$, implying $\text{Var-Cov}(\hat{\beta}^*) \geq \text{Var-Cov}(\hat{\beta})$.

Hence, $\hat{\beta}$ has the smallest variance, making it BLUE.
