
MH4500 TIME SERIES ANALYSIS

Chapter 4 (Part 1)
Estimation, diagnostic checking for nonseasonal Box-Jenkins models

In the last chapter, we briefly described ways of choosing an appropriate
model. However, "model identification" consists merely of selecting the form of
the model, but not the numerical values of its parameters. Suppose, for example,
we have decided to fit an AR(1) model Xt = aXt−1 + Zt . Since the value of the
parameter a is not known, it must somehow be estimated from the data. This
chapter describes methods of estimating the parameters of ARMA models.

1 AR(p), MA(q) and ARMA(p,q) models with nonzero mean
Suppose that {Zt } ∼ W N(0, σ 2 ). Models with zero-mean are
(i) AR(p): Xt = ϕ1 Xt−1 + · · · + ϕp Xt−p + Zt .
(ii) MA(q): Xt = Zt + θ1 Zt−1 + · · · + θq Zt−q .
(iii) ARMA(p,q): Xt − ϕ1 Xt−1 − · · · − ϕp Xt−p = Zt + θ1 Zt−1 + · · · + θq Zt−q .
It is sometimes appropriate to include a constant term δ in a Box-Jenkins model.

DEFINITION 1 : Nonzero mean AR(p): Xt = δ + ϕ1 Xt−1 + · · · + ϕp Xt−p + Zt .

Taking expectations on both sides (and assuming stationarity) yields

EXt = δ/(1 − ϕ1 − · · · − ϕp ).

If we define xt = Xt − EXt , then

xt = ϕ1 xt−1 + · · · + ϕp xt−p + Zt .

Moreover, we have
Cov(xt , xs ) = Cov(Xt , Xs )
Therefore, all the properties of the nonzero-mean AR(p) are the same as those
of zero-mean AR(p) in terms of ACF.

DEFINITION 2 : Nonzero mean MA(q): Xt = δ + Zt + θ1 Zt−1 + · · · + θq Zt−q .

Then EXt = δ. Again, if we define xt = Xt − EXt , then

xt = Zt + θ1 Zt−1 + · · · + θq Zt−q .

We also have Cov(xt , xs ) = Cov(Xt , Xs ).

DEFINITION 3 : Nonzero mean ARMA(p,q):

Xt − ϕ1 Xt−1 − · · · − ϕp Xt−p = δ + Zt + θ1 Zt−1 + · · · + θq Zt−q .

Then
EXt = δ/(1 − ϕ1 − · · · − ϕp )
If we define xt = Xt − EXt , then

xt − ϕ1 xt−1 − · · · − ϕp xt−p = Zt + θ1 Zt−1 + · · · + θq Zt−q .

It follows that
Cov(xt , xs ) = Cov(Xt , Xs ).
Therefore, all the properties of the nonzero-mean ARMA(p,q) are the same as
those of the zero-mean ARMA(p,q).
Denote the sample mean by X̄ = (1/n) Σ_{j=1}^{n} Xj , which is one possible point estimate
of the population mean µ = EXt . If X̄ is statistically (significantly) different from
zero, it is reasonable to assume that µ does not equal zero and, therefore, to
assume that δ does not equal zero. Let s stand for the sample standard deviation, with
s² = Σ_{i=1}^{n} (Xi − X̄)² /(n − 1). One rough rule of thumb is to decide that X̄ is statistically
different from zero if the absolute value of

X̄ / (s/√n)

is greater than 1.96.
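As a small illustration, this rule of thumb can be computed directly in R (a sketch; the vector x below is only a placeholder for the observed series):

x <- rnorm(50)                       # placeholder data; replace with the observed series
n <- length(x)
xbar <- mean(x)
s <- sd(x)                           # sample standard deviation with divisor n - 1
abs(xbar / (s / sqrt(n))) > 1.96     # TRUE suggests mu, and hence delta, is nonzero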

2 Invertibility
2.1 Motivation
Why do we need the concept of invertibility ? Consider the following example.

EXAMPLE 1 Evaluate ACF of the following two MA(1) models.

1 . Xt = Zt + 1/3Zt−1 .

2 . Xt = Zt + 3Zt−1 . 

SOLUTION : Consider the first example. Since EXt = 0 (and taking σ² = 1 for simplicity),

γk = Cov(Xt , Xt+k ) = E[Xt Xt+k ] = E[(Zt + (1/3)Zt−1 )(Zt+k + (1/3)Zt+k−1 )]
   = E(Zt Zt+k ) + (1/3)E(Zt Zt+k−1 ) + (1/3)E(Zt−1 Zt+k ) + (1/9)E(Zt−1 Zt+k−1 ).

This gives γ0 = 1 + 1/9, γ1 = 1/3 and γk = 0 for k > 1, which further implies
ρ1 = γ1 /γ0 = (1/3)/(1 + 1/9) = 3/10. Similarly, for the second example, Xt = Zt + 3Zt−1 ,
we obtain γ0 = 1 + 9, γ1 = 3 and γk = 0 for k > 1, so ρ1 = γ1 /γ0 = 3/10. So two different models have the same
ACF. 

We cannot distinguish between these two models by looking at the sample
ACFs. Hence we will have to choose only one of them. We now look further at
the difference between them.

EXAMPLE 2 Consider Xt = Zt + θZt−1 . It can be written as (1 + θB)Zt = Xt , so that

Zt = Σ_{j=0}^{∞} (−θ)^j B^j Xt = Σ_{j=0}^{∞} (−θ)^j Xt−j .

In other words, we have

Xt = Zt + θXt−1 − θ²Xt−2 + θ³Xt−3 − · · · − (−θ)^j Xt−j − · · · .

Intuitively speaking, the most recent observations should carry more weight than
observations from the more distant past. When |θ| < 1, |θ|^j becomes smaller as j
gets larger, so the weights decay in this way. Hence we should choose model 1 in Example 1,
with θ = 1/3.

DEFINITION 4 A time series {Xt } is invertible if it can be expressed as an infinite
series of past X-observations, i.e.

Xt = Zt + ψ1 Xt−1 + ψ2 Xt−2 + · · · ,

such that Σ_{j=1}^{∞} |ψj | < ∞.

(a) AR(p) model
Xt = ϕ1 Xt−1 + · · · + ϕp Xt−p + Zt
where {Zt } ∼ W N(0, σ 2 ), is always invertible.

(b) MA(q) model


Xt = Zt + θ1 Zt−1 + · · · + θq Zt−q .

A sufficient condition for Xt to be invertible is that

θ(B) = 1 + θ1 B + · · · + θq B^q = 0

has all its roots outside the unit circle, i.e. all the roots have modulus (complex
norm) greater than 1.
For example, if q = 1, the root of 1 + θ1 B = 0 is

B = −1/θ1 .

The condition is then


|θ1 | < 1

If q = 2, the condition is

−θ1 − θ2 < 1, −θ2 + θ1 < 1, |θ2 | < 1.

For q ≥ 3, there is no simple closed-form expression for the conditions.

(c) ARMA(p,q):

Xt − ϕ1 Xt−1 − · · · − ϕp Xt−p = Zt + θ1 Zt−1 + · · · + θq Zt−q .

A sufficient condition for Xt to be invertible is that

θ(B) = 1 + θ1 B + · · · + θq B^q = 0

has all its roots outside the unit circle, i.e. all the roots have modulus (complex
norm) greater than 1.

EXAMPLE 3 Is the model

Xt = Zt + 2Zt−1 + Zt−2

invertible ? 
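A quick numerical check is to inspect the roots of θ(B) = 1 + 2B + B²; a short R sketch (polyroot takes the coefficients in increasing order of power):

roots <- polyroot(c(1, 2, 1))   # roots of 1 + 2B + B^2
Mod(roots)                      # both moduli equal 1

Since the repeated root B = −1 has modulus 1, it does not lie outside the unit circle, so the invertibility condition fails.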

3 Estimation of an AR(p) model
Suppose that Xt , t = 1, 2, · · · , n, is a time series, and we want to fit the
following regression model

Xt = δ + ϕ1 Xt−1 + · · · + ϕp Xt−p + Zt .

time    Xt       intercept   Xt−1     Xt−2     · · ·   Xt−p

1       X1       1           –        –        · · ·   –
2       X2       1           X1       –        · · ·   –
3       X3       1           X2       X1       · · ·   –
...     ...      1           ...      ...      · · ·   ...
p+1     Xp+1     1           Xp       Xp−1     · · ·   X1
p+2     Xp+2     1           Xp+1     Xp       · · ·   X2
...     ...      1           ...      ...      · · ·   ...
n       Xn       1           Xn−1     Xn−2     · · ·   Xn−p

Here Y = (Xp+1 , · · · , Xn )′ is the response vector and X is the design matrix whose rows,
for t = p + 1, · · · , n, contain the intercept 1 and the lagged values Xt−1 , · · · , Xt−p .
Using LSE (least-squares estimation) we have

(δ̂, ϕ̂1 , · · · , ϕ̂p )′ = (X′ X)−1 X′ Y .

Similarly, we can calculate


a) fitted value (for t ≤ n) and prediction (for t > n)

X̂t = δ̂ + ϕ̂1 Xt−1 + · · · + ϕ̂p Xt−p

b) Prediction errors

Ẑt = Xt − (δ̂ + ϕ̂1 Xt−1 + · · · + ϕ̂p Xt−p )

c) The estimator of σ² = Var(Zt ) is

σ̂² = Σ_{t=p+1}^{n} Ẑt² /(n − p − 1).

d) The standard errors of δ̂ and ϕ̂k are √(σ̂² ck,k ), where ck,k is the (k, k) element
of (X′ X)−1 (with the coefficients ordered as δ, ϕ1 , · · · , ϕp ).

EXAMPLE 4 Example (continued) Xt : 1.0445, -0.1338, 0.6706, 0.3755, -0.5110, -


0.2352, 0.1595, 1.6258, -1.6739, 2.4478, -3.1019, 2.6860, -0.9905, 1.2113, -0.0929, 0.9905,
0.5213, -0.1139, -0.4062, 0.5438

time    Xt        intercept   Xt−1
1       1.0445    1           –
2       -0.1338   1           1.0445
3       0.6706    1           -0.1338
...     ...       1           ...
20      0.5438    1           -0.4062
we have

(X′ X)−1 = [  0.054303878   −0.007102646
             −0.007102646    0.030166596 ]

and

X′ Y = [   3.97280
         −26.35337 ]

so that

δ̂ = 0.4029171,   ϕ̂1 = −0.8232089.
The fitted values are (time: prediction) 2: -0.4569, 3: 0.5130, 4: -0.1491, 5:
0.0938, 6: 0.8235, 7: 0.5965, 8: 0.2716, 9: -0.9354, 10: 1.7809, 11: -1.6121, 12: 2.9564,
13: -1.8082, 14: 1.2183, 15: -0.5942, 16: 0.47939, 17: -0.4124, 18: -0.0262, 19: 0.4966,
20: 0.73730
The estimate of σ² = Var(Zt ) is

Σ_{t=2}^{20} (Xt − X̂t )² /(19 − 2) = 0.5948365.

The standard error of δ̂ is

√(0.5948365 × 0.054303878) = 0.1797274

and that of ϕ̂1 is

√(0.5948365 × 0.030166596) = 0.1339559.

Therefore our model is estimated as

Xt = 0.4029171 − 0.8232089Xt−1 .
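The same least-squares calculation can be reproduced in R as a rough cross-check of the numbers above (a sketch; it follows the matrix formula of Section 3 rather than SAS's estimation method):

x <- c(1.0445, -0.1338, 0.6706, 0.3755, -0.5110, -0.2352, 0.1595, 1.6258,
       -1.6739, 2.4478, -3.1019, 2.6860, -0.9905, 1.2113, -0.0929, 0.9905,
       0.5213, -0.1139, -0.4062, 0.5438)
n <- length(x)
Y <- x[2:n]                                # responses X2, ..., Xn
X <- cbind(1, x[1:(n - 1)])                # intercept and lagged values X1, ..., X(n-1)
beta <- solve(t(X) %*% X, t(X) %*% Y)      # (delta_hat, phi1_hat); about (0.4029, -0.8232)
res <- Y - X %*% beta
sigma2 <- sum(res^2) / (length(Y) - 2)     # residual sum of squares over (19 - 2)
sqrt(sigma2 * diag(solve(t(X) %*% X)))     # standard errors of delta_hat and phi1_hat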

The SAS output is given in the Appendix.



SAS output(II)

Model for variable TS

Estimated Mean 0.23383

Autoregressive Factors

Factor 1: 1 + 0.82323 B**(1)

In view of the SAS output, the estimated model is

(Xt − 0.23383) + 0.82323(Xt−1 − 0.23383) = Zt

or
Xt = 0.4263259 − 0.82323Xt−1 + Zt .

MAS451/MTH451/MH4500 TIME SERIES
ANALYSIS
Chapter 4 (Part 2)
Estimation, diagnostic checking for nonseasonal Box-Jenkins models

Basic Questions:
(a) How to forecast ?

(b) How to fit and predict using AR model ?

(c) How to use MA model ?

1 Minimum Mean Square Error Prediction


Suppose Y is a random variable with mean µ = EY and variance σ 2 . If our object
is to predict Y using only a constant c, what is the best choice for c? Clearly,
we must first define best. A common (and convenient) criterion is to choose c to
minimize the mean square error of prediction, that is, to minimize

g(c) = E(Y − c)2 .

THEOREM 1 The minimum of E(Y − c)2 is obtained when c = EY .

PROOF If we expand g(c), we have

g(c) = E(Y²) − 2cEY + c² .

Since g(c) is quadratic in c and opens upward, solving g ′ (c) = 0 will produce the
required minimum. Note that

g ′ (c) = −2EY + 2c = 0

so that the optimal c is


c = EY .
And
min_c g(c) = E(Y − EY )² .

Now consider the situation where a second random variable X is available and
we wish to use the observed value of X to help predict Y. Again, our criterion
will be to minimize the mean square error of prediction. We need to choose the
function h(X), say, that minimizes

E(Y − h(X))2 .

Rewrite

E(Y − h(X))² = E[ E( (Y − h(X))² | X ) ].

The inner conditional expectation can be written as

E[ (Y − h(X))² | X = x ] = E[ (Y − h(x))² | X = x ].

For each value of x, h(x) is a constant and hence we can apply Theorem 1 to the
conditional distribution of Y given X = x. Thus the best choice of h(x) is

h(x) = E(Y |X = x).

It follows that h(X) = E(Y |X) is the best predictor of Y of all functions of X.

THEOREM 2 The minimum of E(Y − g(X))2 is obtained when g(X) = E(Y |X).

Based on the available history of the series up to time t, namely Y1 , Y2 , · · · , Yt−1 , Yt ,


we would like to forecast the value of Yt+l that will occur l time units into the
future. In view of Theorem 2 the minimum mean square error forecast is given
by
E(Yt+l |Y1 , Y2 , · · · , Yt ). (1.1)

2 Prediction using AR model


Suppose we fit an AR(p) model to data X1 , X2 , · · · , Xn and the estimate of the
model is
X̂t = δ̂ + φ̂1 Xt−1 + · · · + φ̂p Xt−p
where δ̂ = (1 − φ̂1 − · · · − φ̂p )µ̂.
Point prediction. To predict Xn+1 , use (1.1) to obtain

E(Xn+1 |X1 , · · · , Xn ) = E(δ + φ1 Xn + · · · + φp Xn+1−p + Zn+1 | X1 , · · · , Xn )
                         = δ + φ1 Xn + · · · + φp Xn+1−p + E(Zn+1 | X1 , · · · , Xn )
                         = δ + φ1 Xn + · · · + φp Xn+1−p .

It follows that
X̂n+1 = δ̂ + φ̂1 Xn + · · · + φ̂p Xn−p+1 .
Likewise, to predict Xn+2 : use

X̂n+2 = δ̂ + φ̂1 X̂n+1 + φ̂2 Xn + · · · + φ̂p Xn−p+2 .

Continue the procedure to make any τ-step-ahead prediction X̂n+τ .


Prediction intervals with 95% confidence

[ŷn+τ − 1.96sn+τ (n), ŷn+τ + 1.96sn+τ (n)]

Here sn+τ (n) is the standard error of the forecast error. The calculation of
sn+τ (n) is beyond the scope of this module; however, SAS (or R) can compute it for us.
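The recursion for the point forecasts is easy to write out directly; here is a minimal R sketch for an AR(2) (the coefficients and the last observations are taken from the example below, so treat the snippet purely as an illustration of the recursion):

delta <- 0.3144; phi1 <- 1.40646; phi2 <- -0.50907    # AR(2) estimates
path <- c(3.26, 3.20, 1.43, 1.68, 4.17, 4.75)          # last few observed values
n <- length(path)
h <- 5                                                 # forecast horizon
for (tau in 1:h) {
  # substitute earlier forecasts for unobserved future values
  path <- c(path, delta + phi1 * path[n + tau - 1] + phi2 * path[n + tau - 2])
}
path[(n + 1):(n + h)]                                  # one- to five-step-ahead forecasts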

EXAMPLE 1 yt : 3.91, 3.86, 3.81, 3.02, 2.62, 1.89, -1.13, -3.82, -5.08, -4.42, -1.99, 0.70,
1.86, 2.98, 1.78, 3.01, 2.13, 3.23, 3.17, 4.64, 5.20, 6.76, 5.79, 5.08, 1.88, -0.72, -2.00, -3.03,
-2.35, -3.34, -3.21, -3.57, -4.28, -3.54, -3.16, -1.41, 0.48, 1.61, 2.42, 2.11, 2.45, 1.39, 2.04,
1.71, 3.26, 3.20, 1.43, 1.68, 4.17, 4.75
Using SAS, ACF and PACF graphs can be obtained by the following codes:

First step: read data into sas.


data mydata;
input id ts;
datalines;
1 3.91
2 3.86
3 3.81
4 3.02
.. ..
..
47 1.43
48 1.68
49 4.17
50 4.75
;
Second step: print out the data set and plot the time series.
proc print;
run;
proc gplot data=mydata;
symbol i=spline c=red;
plot ts*id;
run;
Third step: use identify statement in arima procedure to difference the
data and compute autocorrelations, inverse autocorrelations, partial autocor-
relations and cross correlations.

proc arima data=mydata;
identify var=ts nlag=20 outcov=exp2;
run;
proc gplot data=exp2;
symbol i=needle width=6;
plot corr*lag;
run;
proc gplot data=exp2; symbol i=needle width=6;
plot partcorr*lag/vref=-0.2771859 0.2771859 lvref=2;
run;

Because the SPACF has a clear cut-off after 2, we can choose AR(2) model
for yt .
The estimates are µ̂ = 3.06451, φ̂1 = 1.40646, φ̂2 = −0.50907. Thus

δ̂ = 3.06451 ∗ (1 − 1.40646 − (−0.50907)) = 0.3144

The estimated model is

ŷt = 0.3144494 + 1.40646 yt−1 − 0.50907 yt−2
                 (0.12651)      (0.12621)

with the standard errors given in parentheses.

Fitted values

ŷ3 = 0.3144494 + 1.40646y2 − 0.50907y1 = 3.7529


ŷ4 = 0.3144494 + 1.40646y3 − 0.50907y2 = 3.708052
...
ŷ50 = 0.3144494 + 1.40646y49 − 0.50907y48 = 5.32415.

Residuals

e3 = y3 − ŷ3 = 0.0571
e4 = y4 − ŷ4 = −0.688
...
e50 = y50 − ŷ50 = −0.57

Prediction

ŷ51 = 0.3144494 + 1.40646y50 − 0.5233y49 = 4.8129


ŷ52 = 0.3144494 + 1.40646ŷ51 − 0.50907y50 = 4.665621
ŷ53 = 0.3144494 + 1.40646ŷ52 − 0.50907ŷ51 = 4.426319
ŷ54 = 0.3144494 + 1.40646ŷ53 − 0.50907ŷ52 = 4.164762
ŷ55 = 0.3144494 + 1.40646ŷ54 − 0.50907ŷ53 = 3.918714

SAS codes:
Fourth step: estimation and prediction.
proc arima data=mydata;
identify var=ts nlag=20;
estimate p=2 plot;
run;
forecast lead=5;
run;
quit;
The output results of the four steps, including identify, estimate and forecast
statements, are listed in Appendix A.
Further topics: (diagnostic) checking of the model (we will discuss this later).
Check whether there is autocorrelation in the residuals.
Since the autocorrelations of the residuals have been provided in the output
(Appendix A), we can plot the ACF using the following codes:
data mydata;
input id acf;
datalines;
1 -0.11259
2 0.28399
3 -0.17813
4 0.02414
5 -0.05594
6 -0.02071
7 0.00486
8 0.10026
9 -0.01289

10 0.02440
11 -0.04040
12 -0.08307
13 -0.08352
14 -0.12912
15 -0.00174
;
proc gplot data=mydata;
symbol i=needle width=6;
title ’acf of residuals’;
axis2 order=(-1 to 1 by 0.2);
plot acf*id/vaxis=axis2;
run;
quit;

3 LSE of MA model
Suppose that X1 , X2 , · · · , Xn is a sample. To estimate MA(1):

Xt = Zt − θ1 Zt−1

we first write it as (if it is invertible, i.e. |θ1 | < 1)

Xt = Zt − θ1 Xt−1 − θ12 Xt−2 − θ13 Xt−3 − · · ·

Then θ1 can be estimated by minimizing

Σ_{t=1}^{n} (Xt + θ1 Xt−1 + θ1²Xt−2 + θ1³Xt−3 + · · · )² .

For example: Time series Xt : -0.3771, -0.7009, -0.6063, -2.1099, -2.0939, -0.6972,
-0.8131, -0.4401, 0.0068, 0.0498, 0.5552, -0.3333, -2.3981, -1.9854, -1.3579, 0.6725,
2.0068, 2.4630, 1.8879, -0.2878, -1.2468, 0.5062, 1.5715, 2.7983, 0.8574, 0.1626, 0.6390,
0.2969, 0.3278, 0.6458

Based on the ACF, we choose an MA(1) model

Xt = Zt + θ1 Zt−1 ,

or equivalently

Xt = Zt + θ1 Xt−1 − θ1²Xt−2 + · · · .
To find θ1 , we will minimize

MSE = (1/n) Σ_{t=1}^{n} (Xt − X̂t )² .

For the example, the minimum point for θ1 is 0.7423. Our estimated model is
then

X̂t = Zt + 0.7423Zt−1 .
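One simple way to carry out this minimization numerically is a one-dimensional search over θ1, building the innovations recursively (conditional on Z0 = 0). A sketch in R using the data of this example; it should land near the 0.7423 quoted above, though the exact optimum depends on how the infinite expansion is truncated:

x <- c(-0.3771, -0.7009, -0.6063, -2.1099, -2.0939, -0.6972, -0.8131, -0.4401,
       0.0068, 0.0498, 0.5552, -0.3333, -2.3981, -1.9854, -1.3579, 0.6725,
       2.0068, 2.4630, 1.8879, -0.2878, -1.2468, 0.5062, 1.5715, 2.7983,
       0.8574, 0.1626, 0.6390, 0.2969, 0.3278, 0.6458)
css <- function(theta) {                     # conditional sum of squares for Xt = Zt + theta*Z(t-1)
  z <- numeric(length(x))
  z[1] <- x[1]                               # conditional on Z0 = 0
  for (t in 2:length(x)) z[t] <- x[t] - theta * z[t - 1]
  sum(z^2)
}
optimize(css, interval = c(-0.99, 0.99))$minimum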

SAS codes:
Read data into sas:
data example1;
input id ts;
datalines;
1 -0.3771
2 -0.7009
3 -0.6063
.. ..
..
28 0.2969
29 0.3278
30 0.6458
;
proc print;
run;
proc gplot data=example1;
symbol i=spline c=red v=star;
plot ts*id;
run;
ARIMA procedure:
proc arima data=example1;
identify var=ts nlag=14 outcov=exp3;
run;
GOPTIONS RESET=ALL; proc gplot data=exp3; symbol i=needle width=6;
plot corr*lag/VAXIS=(-0.5 to 1.0 by 0.1) vref=-0.3578454 0.3578454 lvref=2; run;
GOPTIONS RESET=ALL; proc gplot data=exp3; symbol i=needle width=6;
plot partcorr*lag/vref=-0.3578454 0.3578454 lvref=2; run; quit;
proc arima data=example1;
identify var=ts nlag=14 outcov=exp3;
estimate q=1 plot;
run;
The output results are listed in Appendix B.
———
Then the estimated model is

(Xt − 0.02271) = Zt + 0.75661Zt−1

or
Xt = 0.02271 + Zt + 0.75661Zt−1

Plot the acf of the residuals by the following codes: (autocorrelations of the
residuals are listed in the output results in Appendix B)
data mydata;
input id acf;
datalines;
1 0.28638
2 0.19386
3 -0.19558
4 0.209419
5 0.01885
6 0.27097
7 0.15829
8 0.285044
9 -0.06605
10 -0.01413
11 -0.11075
12 -0.05437
13 -0.07005
14 -0.15193
;
proc gplot data=mydata;
symbol i=needle width=6;
title ’acf of residuals’;
axis2 order=(-1 to 1 by 0.2);
plot acf*id/vaxis=axis2;
run;
quit;

MAS451/MTH451/MH4500 TIME SERIES
ANALYSIS
Chapter 4 (Part 3)
Estimation, diagnostic checking for nonseasonal Box-Jenkins models

Basic Questions:
(a) Use SAS to implement the estimation.
(b) Estimate ARMA models based on the properties.

1 Using SAS to estimate an ARMA model (continued)

We use SAS to estimate the ARMA(p,q) model

Xt = δ + φ1 Xt−1 + · · · + φp Xt−p + Zt + θ1 Zt−1 + · · · + θq Zt−q .

The formula for the l-step ahead forecast is

Xn (l) = E(Xn+l |Xn , Xn−1 , ...).

EXAMPLE 1 yt , t = 1, · · · , 50 are observed as: -1.30, -0.18, 0.94, -0.26, -1.05, -0.78,
-0.82, 0.43, 0.57, 1.41, -1.47, 0.49, 0.00, -0.15, -0.64, 0.24, -0.79, 0.82, -0.20, -0.80, -0.22,
0.88, -0.75, 0.55, 0.73, -0.82, 0.70, -1.54, 0.04, -0.70, -0.58, -1.38, -1.28, 0.49, -0.76, 1.08,
0.16, 1.11, -0.06, 0.88, 0.89, 0.31, 0.03,-1.19, -0.38, 0.49, 1.02, -0.98, 0.50, -0.57

The output results are listed in Appendix and we below list part of them.

Conditional Least Squares Estimation

Standard Approx
Parameter Estimate Error t Value Pr > |t| Lag

MU -0.09482 0.10559 -0.90 0.3737 0


MA1,1 -0.58425 0.58246 -1.00 0.3210 1
AR1,1 -0.70942 0.51177 -1.39 0.1722 1

Constant Estimate -0.16209


Variance Estimate 0.64781
Std Error Estimate 0.804866
AIC 123.0922
SBC 128.8282
Number of Residuals 50
* AIC and SBC do not include log determinant.

Model for variable ts

Estimated Mean -0.09482

Autoregressive Factors

Factor 1: 1 + 0.70942 B**(1)

Moving Average Factors

Factor 1: 1 + 0.58425 B**(1)

The estimated model is

X̂t = −0.1621 − 0.70942Xt−1 + Zt + 0.58425Zt−1




where −0.1621 = −0.09482 ∗ (1 + 0.70942).

2 Estimation of ARMA model based on the ACF and
PACF: Yule–Walker estimation method
Consider an AR(p) model of the form,

Xt − φ1 Xt−1 − φ2 Xt−2 − · · · − φp Xt−p = Zt . (2.1)

Our aim is to find estimators of the coefficient vector φ = (φ1 , . . . , φp ) and the
white noise variance σ 2 based on the observations X1 , . . . , XN .

Recall the Yule–Walker equations,

γ(0) − φ1 γ(1) − · · · − φp γ(p) = σ²,
γ(1) − φ1 γ(0) − · · · − φp γ(p − 1) = 0,
· · ·
γ(p) − φ1 γ(p − 1) − · · · − φp γ(0) = 0;                    (2.2)

or, in terms of the ACF,

ρ(1) = φ1 + φ2 ρ(1) + · · · + φp ρ(p − 1)
ρ(2) = φ1 ρ(1) + φ2 + · · · + φp ρ(p − 2)
· · ·
ρ(p) = φ1 ρ(p − 1) + φ2 ρ(p − 2) + · · · + φp

where ρ(k) is the ACF of the time series. In practice we replace ρ(k) by the sample ACF and
solve the above equations to estimate φ1 , · · · , φp and σ².
Suppose that X1 , X2 , · · · , Xn are observations.
AR(1) model with mean 0: Xt = φ1 Xt−1 + Zt
Recall that we have

γ(1) = φ1 γ(0)

i.e.

φ1 = ρ(1)

We can use sample ACF r1 to estimate ρ(1) thus φ1 : φ̂1 = r1 .

EXAMPLE 2 Fit an AR(1): Xt = φ1 Xt−1 + Zt to data -0.06, -0.18, 0.06, 0.15, 0.13, -0.02,
0.19, -0.13, -0.26, -0.29, -0.17, -0.10, 0.10, 0.17, 0.04, 0.00, 0.15, 0.11, 0.01, 0.19
Because r1 = 0.4755, the estimated model is

X̂t = 0.4755Xt−1 .

AR(1) model with nonzero mean: Xt = δ + φ1 Xt−1 + Zt . Let xt = Xt − µ,
where µ = EXt . Then xt satisfies xt = φ1 xt−1 + Zt .
Thus, we need to estimate µ first as
µ̂ = (1/n) Σ_{t=1}^{n} Xt .

Recall that we have

γx (1) = φ1 γx (0)

i.e.

φ1 = ρx (1).

We can use sample ACF r1 to estimate ρ(1) thus φ1 : φ̂1 = r1 ; and δ̂ = (1 − φ̂1 )µ̂.

EXAMPLE 3 Fit an AR(1) with nonzero mean, Xt = δ + φ1 Xt−1 + Zt , to data: 5.05, 5.02, 4.78, 4.73, 4.86, 4.81,
4.86, 4.74, 4.89, 5.03, 5.13, 5.16, 5.19, 5.13, 5.16, 5.10, 5.04, 5.07, 4.95, 4.91
We have

µ̂ = X̄ = (1/20) Σ_{t=1}^{20} Xt = 4.98

and

r1 = Σ_{t=1}^{19} (Xt − X̄)(Xt+1 − X̄) / Σ_{t=1}^{20} (Xt − X̄)² = 0.7747.

Thus φ̂1 = r1 = 0.7747 and
δ̂ = (1 − φ̂1 )µ̂ = 1.1220.
Finally the estimated model

X̂t = 1.1220 + 0.7747Xt−1 .

AR(2) model with mean 0: Xt = φ1 Xt−1 + φ2 Xt−2 + Zt
Recall that we have

ρ(1) = φ1 + φ2 ρ(1)
ρ(2) = φ1 ρ(1) + φ2 .

We can then estimate φ1 and φ2 by solving

r1 = φ1 + φ2 r1
r2 = φ1 r1 + φ2

(where r1 , r2 are sample ACF).

EXAMPLE 4 Fit an AR(2): Xt = φ1 Xt−1 + φ2 Xt−2 + Zt to data 0.15, -0.06, -0.39, -0.56,
-0.52, -0.26, -0.11, 0.32, 0.31, 0.01, 0.00, 0.17, 0.52, 0.32, -0.08, -0.30, -0.16, 0.32, 0.29,
0.07
Because r1 = 0.64, r2 = 0.04, by

0.64 = φ1 + 0.64φ2
0.04 = 0.64φ1 + φ2 

we have
φ1 = 1.04, φ2 = −0.62.
The estimated model is

X̂t = 1.04Xt−1 − 0.62Xt−2 .

AR(2) model with nonzero mean: Xt = δ + φ1 Xt−1 + φ2 Xt−2 + Zt . Let
xt = Xt − µ, where µ = EXt . Then xt satisfies xt = φ1 xt−1 + φ2 xt−2 + Zt .
Thus, we need to estimate µ first as

µ̂ = (1/n) Σ_{t=1}^{n} Xt .

We estimate model xt = φ1 xt−1 + φ2 xt−2 + Zt first, say

x̂t = φ̂1 xt−1 + φ̂2 xt−2 .

Then the model for Xt is

X̂t = µ̂(1 − φ̂1 − φ̂2 ) + φ̂1 Xt−1 + φ̂2 Xt−2 .

EXAMPLE 5 Example: Xt : 1.25, 1.64, 1.78, 1.33, 1.21, 1.04, 1.04, 1.55, 1.31, 0.89, 0.78,
1.28, 1.79, 2.42, 2.09, 1.57, 1.05, 0.97, 1.26, 1.70
Here X̄ = 1.40. Let xt = Xt − 1.40. We have, for xt ,

r1 = 0.57, r2 = −0.11

By solving

0.57 = φ1 + 0.57φ2
−0.11 = 0.57φ1 + φ2 

we have φ̂1 = 0.94, φ̂2 = −0.65. The estimated model for Xt is

X̂t = 0.994 + 0.94Xt−1 − 0.65Xt−2

where 0.994 = 1.40 ∗ (1 − 0.94 + 0.65).
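The same Yule–Walker computation can be sketched in R using the data of this example (acf() returns the sample ACF with the mean removed, so r1 and r2 can be read off directly; the results should be close to the values above up to rounding):

x <- c(1.25, 1.64, 1.78, 1.33, 1.21, 1.04, 1.04, 1.55, 1.31, 0.89,
       0.78, 1.28, 1.79, 2.42, 2.09, 1.57, 1.05, 0.97, 1.26, 1.70)
r <- acf(x, lag.max = 2, plot = FALSE)$acf[2:3]   # sample ACF r1, r2
A <- matrix(c(1, r[1], r[1], 1), nrow = 2)        # Yule-Walker system in terms of the ACF
phi <- solve(A, r)                                # (phi1_hat, phi2_hat)
delta <- mean(x) * (1 - sum(phi))                 # delta_hat = mu_hat * (1 - phi1 - phi2)
c(phi, delta)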

Attention: In SAS, the Yule–Walker estimation method can be implemented by


using the following codes
estimate p=2 method=uls;
where uls denotes unconditional least square method, which is the same as
the Yule–Walker estimation method.

EXAMPLE 6 Generate an AR(2) model of the form Xt = 0.7Xt−1 − 0.5Xt−2 + Zt
and then estimate (φ1 , φ2 ) = (0.7, −0.5) by (φ̃1 , φ̃2 ) using the following codes:
Generate an AR(2) model:
data mydata;
x1=0.5;
x2=0.5;
do i=-20 to 300;
z=rannor(10000);
x=0.7*x1-0.5*x2+z;
x2=x1;
x1=x;
if i>0 then output;
end;
run;
proc print data=mydata;
run;
Use arima procedure to estimate parameters:
proc arima data=mydata;
identify var=x nlag=20;
run;
estimate p=2 method=uls plot;
run;
quit;

————————————–
MA(1) model with mean 0: Xt = Zt + θ1 Zt−1
We have

ρ1 = θ1 /(1 + θ1²).

Thus
ρ1 θ1² − θ1 + ρ1 = 0,
i.e.
r1 θ̂1² − θ̂1 + r1 = 0.
We can estimate θ1 by solving this quadratic equation (we discard the root with
absolute value greater than 1).
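A short R sketch of this root selection (r1 here is a hypothetical sample value, not taken from any data set in these notes):

r1 <- 0.3                                 # hypothetical sample ACF at lag 1
roots <- polyroot(c(r1, -1, r1))          # roots of r1*theta^2 - theta + r1 = 0
theta1 <- Re(roots[Mod(roots) <= 1])      # keep the root with modulus at most 1
theta1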
MA(1) model with nonzero mean: Xt = δ + Zt + θ1 Zt−1 . Because EXt = δ,
define xt = Xt − δ. We can estimate an MA(1) model for xt and then recover the model for Xt .

EXAMPLE 7 Xt : -0.89 -0.53 0.54 -0.26 -1.34 -1.97 -0.35 0.46 -0.08 -1.13 0.04 1.64 1.95
0.94 -0.11 0.18 0.72 0.91 -1.09 0.12 1.29 0.79 1.67 -0.60 -1.72 -0.76 -2.60 -1.71 -0.39
-1.18
Fit an MA(1) model. We have

X̄ = (1/30) Σ_{t=1}^{30} Xt = −0.182,    r1 = 0.5.

We have

θ̂1 = (1 − √(1 − 4r1²)) / (2r1 ) = 1.00.
Thus the estimated model is

X̂t = Zt + 1.00Zt−1 .

MA(q) model with nonzero mean: in general there is no simple analytic solution. For some
special cases, we still have solutions; for example,

Xt = δ + Zt + θp Zt−p .

——————-
There are no simple methods of this kind for estimating a general ARMA(p,q) model
(the details are beyond the scope of the module).

MAS451/MTH451/MH4500 TIME SERIES
ANALYSIS
Chapter 4 (Part 4)
Estimation, diagnostic checking for nonseasonal Box-Jenkins models

Basic Questions:
(a) How to select ARMA(p,q) model by AIC ?
(b) “Is the fitted model OK?” What does OK mean?
(c) How to use Ljung-Box statistics?
(d) How to improve a fitted model ?

1 Model selection using Information criteria


Evaluation of the graphs of the sample ACF and PACF allows us to make a preliminary choice
of the orders p and q of the ARMA model to be fitted. A final decision is made using the AIC criterion,
which allows us to compare the fit of different models.

DEFINITION 1 Assume that a statistical model of M parameters is fitted to data.


The Akaike’s Information Criterion (AIC) statistic is defined as

AIC = −2ln[maximum likelihood of data] + 2M.

Suppose that the white noise Zt is Gaussian with variance σ². Let
σ̂² = (1/n) Σ_{j=1}^{n} (Xj − X̂j )² / r_{j−1} , with the r_{j−1} being constants independent of σ². It
turns out that the log likelihood function of ARMA(p,q) models is

ln L = −(n/2) ln σ̂² + Const.

DEFINITION 2 The Akaike’s Information Criterion (AIC) statistic for ARMA(p,q)


models is defined as
AIC = n ln σ̂² + 2(p + q),
where σ̂ 2 stands for the estimated error variance and n is the number of ob-
servations.

How does it work? Choose the model (i.e. the values of p and q) with
minimum AIC.
Intuitively, one can think of 2(p + q) as a penalty term to discourage
over-parameterization.
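In R the comparison can be automated; a minimal sketch (the simulated series y and the candidate orders here are only for illustration):

set.seed(1)
y <- arima.sim(list(ar = 0.6), n = 200)                 # placeholder series
cands <- list(c(1, 0, 0), c(2, 0, 0), c(1, 0, 1), c(0, 0, 2))
aics <- sapply(cands, function(ord) AIC(arima(y, order = ord)))
names(aics) <- sapply(cands, paste, collapse = ",")
aics                                                    # choose the order with the smallest AIC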

2 Model diagnostic checking


OK means that the fitted model can describe the dependence structure of a time
series adequately.
If ARMA model
Xt = δ + ϕ1 Xt−1 + · · · + ϕp Xt−p + Zt + θ1 Zt−1 + · · · + θq Zt−q
is adequate, then {Zt } should be white noise (WN).
If the ARMA model is adequate, the residuals should be WN (approximately
at least).
Recall what white noise is: if {Zt } is WN, then ρk (or ρ(k)) is zero for all k ≥ 1. In practice
we may use its sample ACF. For example, if an AR(1) model is considered, the
residuals are
Ẑt = Xt − δ̂ − ϕ̂1 Xt−1 .
To check the dependence structure, we calculate

rk = Σ_{t=1}^{n−k} (Ẑt − ā)(Ẑt+k − ā) / Σ_{t=1}^{n} (Ẑt − ā)² ,    k ≥ 1,

where ā = Σ_t Ẑt /n. Here rk is called the residual autocorrelation at lag k. Thus,
if the model is adequate we expect rk ≈ 0.

THEOREM 1 If H0 : ρ(k) = 0 is true, then approximately

ρ̂(k) = rk ∼ N(0, 1/n).

We can use the above as a rough guide on whether each ρ(k) is zero.

DEFINITION 3 For an overall test, define

Q(m) = n(n + 2) Σ_{k=1}^{m} rk² /(n − k),

where 0 << m << n (usually, m ≈ n/5). Q(m) is called the Ljung-Box statistic
(or Portmanteau statistic).

If the fitted model is OK (adequate), then

Q(m) ∼ χ²(m − np ),

where np is the number of parameters (exclusive of δ) in the ARMA model. For
example, if the model is Xt = ϕ1 Xt−1 + Zt , then np = 1;
if the model is Xt = ϕ1 Xt−1 + ϕ3 Xt−3 + Zt , then np = 2;
if the model is Xt = δ + ϕ1 Xt−1 + ϕ2 Xt−3 + θ1 Zt−1 + Zt , then np = 3 (δ is
not counted).

EXAMPLE 1 The data Xt , t = 1, · · · , 20 are observed as: 0.50, -0.41, 0.37, -0.61, 0.23,
-0.13, 0.06, -0.11, 0.18, -0.14, 0.20, 0.09, -0.03, -0.02, -0.14, -0.07, 0.09, 0.09, -0.01, -0.10

[Sample ACF and PACF plots of the data]

We may try model Xt = δ + ϕ1 Xt−1 + Zt by looking at SPACF and SACF.


fit = arima(y, order = c(1,0,0))
The fitted model

X̂t = 0.0075 − 0.832Xt−1

The residuals et , t = 2, 3, . . . , 20, are: 0.00, 0.02, -0.31, -0.29, 0.05, -0.06, -0.07, 0.08,
0.00, 0.08, 0.25, 0.04, -0.05, -0.16, -0.19, 0.02, 0.16, 0.06, -0.12
The SACF values for et are

r1 = 0.34, r2 = −0.21, r3 = −0.12, r4 = −0.22, r5 = −0.09, r6 = 0.09,
r7 = −0.18, r8 = −0.24, r9 = 0.02, r10 = 0.10.

(each H0 : ρe (k) = 0, k = 1, ..., 10 can be accepted separately, why?)
Consider the Ljung-Box test. If we let m = 5, then

Q(5) = n(n + 2)(r1²/(n − 1) + r2²/(n − 2) + r3²/(n − 3) + r4²/(n − 4) + r5²/(n − 5)) = 5.6964.
Since χ²0.05 (5 − 1) = 9.49 and Q(5) < χ²0.05 (5 − 1), we cannot reject the adequacy
of the model at α = 0.05.
From the residual table we can also see that the p-value is 0.3819 > 0.05 = α
when taking m = 6. Again we cannot reject the adequacy of the model at α = 0.05.
This is consistent with the conclusion obtained by comparing the observed statistic
with the critical value.
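In R the same test is available as Box.test; a sketch using the residuals listed above (fitdf = 1 removes one degree of freedom for the single AR parameter, so the reference distribution is chi-square with m − 1 degrees of freedom; small differences from the hand calculation can arise from the value of n used):

e <- c(0.00, 0.02, -0.31, -0.29, 0.05, -0.06, -0.07, 0.08, 0.00, 0.08,
       0.25, 0.04, -0.05, -0.16, -0.19, 0.02, 0.16, 0.06, -0.12)
Box.test(e, lag = 5, type = "Ljung-Box", fitdf = 1)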
Using tsdiag(fit) yields the following plot.

[tsdiag output: standardized residuals, ACF of residuals, and p-values for the Ljung-Box statistic]

3 Using the ACF and PACF of residuals to improve the model
EXAMPLE 2 Suppose that we fit an AR(1) model

Xt = ϕ1 Xt−1 + Zt .

If the SACF of Ẑt has a cut-off after lag 1, then it suggests

Ẑt ∼ MA(1)

i.e.
Ẑt = et + θet−1
Thus
Xt ∼ ARMA(1, 1).
Hopefully, êt is now closer to white noise.
If the SPACF of Ẑt has a cut-off after lag 1, it suggests

Ẑt ∼ AR(1)

i.e.
Ẑt = ψ1 Ẑt−1 + et .
Thus
(Xt − ϕ1 Xt−1 ) = ψ1 (Xt−1 − ϕ1 Xt−2 ) + et ,
i.e.
Xt ∼ AR(2).
Hopefully, êt is now closer to white noise.

MAS451/MTH451/MH4500 TIME SERIES
ANALYSIS
Chapter 4 (Part 5)
Estimation, diagnostic checking for nonseasonal Box-Jenkins models

Basic Questions:
(a) Change a non-stationary time series into stationary one.
(b) Fit ARIMA models.

1 How to change a non-stationary time series into a stationary one
According to the definition of stationarity there are 3 types of non-stationarity:
⊲ Non-stationarity in mean: EXt depends on t;
⊲ Non-stationarity in variance: var(Xt ) depends on t;
⊲ Non-stationarity in covariance: cov(Xt , Xt+k ) depends on t for some k.
EXAMPLE 1 Suppose that {Yt } is a stationary time series. Let Xt = a + bt + Yt .
Apparently, Xt is not a stationary process. However applying the first difference
operator ▽ to {Xt } yields a stationary process. Therefore applying the difference
operator is one way of obtaining a stationary process. 
EXAMPLE 2 Suppose {Yt } is a stationary sequence with γ(k) = Cov(Yt , Yt+k ) independent
of time t. Let Xt = e^(t+Yt) . Can we find a suitable d such that {(1 − B)^d Xt } is
stationary? (No.)
Let Ut = log(Xt ). What happens to {Ut }? 
Not all non-stationary series can be transformed to stationary ones by differencing.
Many time series are stationary in the mean but are not stationary in the
variance, as in Example 2. To overcome this problem, we need to stabilize the
variance of the time series by using a pre-differencing transformation.
We first consider the power transformation to remove some possible
non-stationarity in variance:

T(Xt ) = (Xt^λ − 1)/λ .

When and how to perform the power transformation?

Values of lambda    Transformation
-1.0                1/Xt
-0.5                1/√Xt
 0.0                ln(Xt )
 0.5                √Xt
 1.0                Xt (no transformation)

(a) If the variability of a time series increases as time advances, this implies
that the time series is non-stationary with respect to its variance; see Figure
1 below.
[Figure 1: number of passengers against time (months); the variability increases with time]

(b) Among the candidate transformations, choose the one whose transformed series has the minimum sample variance.

[Figure 2: log transformation of the data in Figure 1]

After transformation, we then consider possible differencing to make the
time series stationary.
Criterion: the ACF of a non-stationary time series converges to zero slowly,
whereas that of a stationary time series converges to zero quickly.
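A small R sketch of this workflow (log transform to stabilize the variance, then difference, then check how fast the sample ACF dies out); the series here is simulated purely for illustration:

set.seed(1)
y <- exp(cumsum(rnorm(200, mean = 0.02, sd = 0.05)))  # toy series with trend and growing variability
u <- log(y)                                           # variance-stabilizing transformation (lambda = 0)
z <- diff(u)                                          # first difference (1 - B)u_t
acf(z, lag.max = 20)                                  # should decay quickly if z is close to stationary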

2 ARIMA model
EXAMPLE 3 Figure 3 shows US Dow Jones Industrial Average Market Index {Yt }
from 17-Jul-02 to 20-Mar-03.
{Yt } is not stationary. See figures 3 and 4.

[Figure 3: zt = yt − yt−1 ]

[Figure 4: sample ACF and PACF of series Y]

Moreover, by figure 5 we can fit the following models to the data


xt = Zt + θ19 Zt−19

[Figure 5: sample ACF and PACF of series z (also denoted x)]

or
xt − φ1 xt−1 − · · · − φ19 xt−19 = Zt .
Generally, we can fit the difference xt = Xt − Xt−1 of a time series by an
ARMA(p,q) model,

φp (B)(Xt − Xt−1 ) = θq (B)Zt

or

φp (B)(1 − B)Xt = θq (B)Zt .

This is an ARIMA(p, 1, q) model.


We can also consider higher order difference,

wt = xt − xt−1 = (Xt − Xt−1 ) − (Xt−1 − Xt−2 )


= Xt − 2Xt−1 + Xt−2 = (1 − 2B + B2 )Xt
= (1 − B)2 Xt .

If we fit wt by

φp (B)wt = θq (B)Zt

or

φp (B)(1 − B)2 Xt = θq (B)Zt .

This is an ARIMA(p,2,q) model. More generally, we define ARIMA(p,d,q) as

φp (B)(1 − B)d Xt = θq (B)Zt .

For the example, we can fit an ARIMA(0,1,19) model to yt :

fitma = arima(y, order = c(0, 1, 19))


Call: arima(x = y, order = c(0, 1, 19))
Coefficients:
ma1 ma2 ma3 ma4 ma5 ma6 ma7 ma8
-0.2219 0.0249 -0.1093 -0.0702 0.0666 0.0518 0.0191 0.1042
s.e. 0.0950 0.0832 0.0863 0.0898 0.0908 0.0867 0.0899 0.0948
ma9 ma10 ma11 ma12 ma13 ma14 ma15 ma16
-0.0537 -0.0714 -0.0685 0.1185 0.0134 0.0531 -0.0942 0.0521
s.e. 0.1016 0.1087 0.0896 0.0893 0.0987 0.1004 0.1145 0.1075
ma17 ma18 ma19
-0.1438 -0.1234 -0.4681
s.e. 0.0972 0.1059 0.0971
sigma^2 estimated as 665.5: log likelihood = -799.39, aic =
1638.78

tsdiag(fitma)
[tsdiag output: standardized residuals, ACF of residuals, and p-values for the Ljung-Box statistic]

predict(fitma, n.ahead= 20)

[Figure 6: the black dots are the observations; the red dots are the predictions]

Alternatively, an AR model can be fitted to the differenced series:

Call: arima(x = y, order = c(19, 1, 0))
Coefficients:

ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8


-0.0665 0.1378 -0.0443 -0.1369 -0.0163 0.0462 -0.0396 0.0475
s.e. 0.0749 0.0757 0.0764 0.0770 0.0796 0.0808 0.0808 0.0818
ar9 ar10 ar11 ar12 ar13 ar14 ar15 ar16
-0.0416 -0.1021 -0.0085 0.038 0.0039 -0.0407 -0.0506 0.0904
s.e. 0.0823 0.0817 0.0821 0.082 0.0826 0.0836 0.0837 0.0836
ar17 ar18 ar19
0.1094 -0.0049 -0.2208
s.e. 0.0840 0.0837 0.0832
sigma^2 estimated as 726.2: log likelihood = -801.95,
aic = 1643.9

tsdiag(fitma)
predict(fitma, n.ahead= 20)
The fitted model is

Moving Average Factors


Factor 1: 1 + 0.05076 B**(1) - 0.03281 B**(2) + 0.0297 B**(3) + 0.00626 B**(4)
- 0.01162 B**(5) - 0.01904 B**(6) - 0.06261 B**(7) - 0.09867 B**(8) - 0.14025 B**(9) -
0.11528 B**(10) - 0.02889 B**(11) - 0.00768 B**(12) - 0.2065 B**(13) - 0.18588 B**(14)
+ 0.03721 B**(15) + 0.00595 B**(16) - 0.22064 B**(17)-0.1234B**(18) -0.4681B**(19)

[tsdiag output for the AR fit: standardized residuals, ACF of residuals, and p-values for the Ljung-Box statistic]

[Figure 7: the black dots are the observations; the red dots are the predictions]

EXAMPLE 4 Weekly sales of Super Tech Videocassette Tape [the data can be found
at the website]. 

> plot(1:161, y, xlim = c(0, 200), ylim=c(20, 100) )


> lines(1:161, y, type="l" ) # l for L

[Figure 8: series y]

> acf(y, lag.max=30)


> pacf(y, lag.max=30)

[Figure 9: sample ACF and PACF of series y]

The raw data is not stationary. We take the difference
zt = yt − yt−1 = (1 − B)yt .

> acf(z, lag.max=30)


> pacf(z, lag.max=30)

[Figure 10: sample ACF and PACF of the differenced series z]

We can use ARIMA(0, 0, 6) for zt , or ARIMA(0,1,6) for yt .

> fit = arima(y, order = c(0,1,6))


> tsdiag(fit)

[Figure 11: tsdiag output (standardized residuals, ACF of residuals, and p-values for the Ljung-Box statistic)]

Thus, the model is adequate (OK), i.e. there is no significant autocorrelation in the
residuals.
1. Write down the estimated model.
> fit
Call: arima(x = y, order = c(0, 1, 6))
Coefficients:

ma1 ma2 ma3 ma4 ma5 ma6
0.6331 -0.0160 0.0361 -0.0264 -0.1490 -0.4374
s.e. 0.0771 0.0892 0.0917 0.0879 0.1055 0.0783
sigma^2 estimated as 4.896: log likelihood = -356.33, aic = 726.65
The fitted model is

(1 − B)yt = Zt + 0.6331Zt−1 − 0.0160Zt−2 + 0.0361Zt−3


−0.0264Zt−4 − 0.1490Zt−5 − 0.4374Zt−6 .

2. Fitted values are as follows.

> plot(1:161, y, xlim = c(0, 200), ylim=c(20, 100) )


> lines(1:161, y, type="l" )
> lines(1:161, y-fit$residuals, type="l", col="red")

3. Forecast for 6 steps ahead

> forecast = predict(fit, n.ahead=6)


> lines(162:167, forecast$pred, type="o", col="red")
> lines(162:167, forecast$pred-1.96*forecast$se, col="blue")
> lines(162:167, forecast$pred+1.96*forecast$se, col="blue")
[Figure 12: the black dots are the observations and the red dots are the predictions]
