
Introduction to Time Series Analysis. Lecture 4.

Peter Bartlett

Last lecture:
1. Sample autocorrelation function

2. ACF and prediction


3. Properties of the ACF

Introduction to Time Series Analysis. Lecture 4.
Peter Bartlett

1. Review: ACF, sample ACF.


2. Properties of estimates of µ and ρ.

3. Convergence in mean square.

Mean, Autocovariance, Stationarity

A time series {Xt } has mean function µt = E[Xt ]


and autocovariance function

γX (t + h, t) = Cov(Xt+h , Xt )
= E[(Xt+h − µt+h )(Xt − µt )].

It is stationary if both are independent of t.


Then we write γX (h) = γX (h, 0).
The autocorrelation function (ACF) is

ρ_X(h) = γ_X(h) / γ_X(0) = Corr(X_{t+h}, X_t).

Estimating the ACF: Sample ACF

For observations x_1, . . . , x_n of a time series,

the sample mean is  x̄ = (1/n) Σ_{t=1}^{n} x_t.

The sample autocovariance function is

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),   for −n < h < n.

The sample autocorrelation function is

ρ̂(h) = γ̂(h) / γ̂(0).
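A minimal numerical sketch of these estimators in Python (the function and variable names are illustrative, not from the lecture):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample mean, sample autocovariance and sample ACF as defined above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()                                   # sample mean
    gamma_hat = np.empty(max_lag + 1)
    for h in range(max_lag + 1):
        # note the 1/n normalization (not 1/(n - h))
        gamma_hat[h] = np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n
    rho_hat = gamma_hat / gamma_hat[0]                # sample ACF
    return xbar, gamma_hat, rho_hat

# For white noise, rho_hat(h) should be near 0 for every h != 0.
x = np.random.default_rng(0).standard_normal(500)
xbar, gamma_hat, rho_hat = sample_acf(x, 10)
```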

Properties of the autocovariance function

For the autocovariance function γ of a stationary time series {Xt },


1. γ(0) ≥ 0,

2. |γ(h)| ≤ γ(0),

3. γ(h) = γ(−h),
4. γ is positive semidefinite.

Furthermore, any function γ : Z → R that satisfies (3) and (4) is the autocovariance of some stationary (Gaussian) time series.

Introduction to Time Series Analysis. Lecture 4.
1. Review: ACF, sample ACF.
2. Properties of estimates of µ and ρ.

3. Convergence in mean square.

Properties of the sample autocovariance function

The sample autocovariance function:

γ̂(h) = (1/n) Σ_{t=1}^{n−|h|} (x_{t+|h|} − x̄)(x_t − x̄),   for −n < h < n.

For any sequence x1 , . . . , xn , the sample autocovariance function γ̂ satisfies

1. γ̂(h) = γ̂(−h),
2. γ̂ is positive semidefinite, and hence

3. γ̂(0) ≥ 0 and |γ̂(h)| ≤ γ̂(0).

Properties of the sample autocovariance function: psd
 
Γ̂_n = [ γ̂(0)      γ̂(1)      ···   γ̂(n−1)
        γ̂(1)      γ̂(0)      ···   γ̂(n−2)
          ⋮          ⋮         ⋱       ⋮
        γ̂(n−1)    γ̂(n−2)    ···   γ̂(0)   ]

     = (1/n) M M′    (see next slide),

so   a′ Γ̂_n a = (1/n)(a′M)(M′a) = (1/n)‖M′a‖² ≥ 0,

i.e., Γ̂_n is a covariance matrix. It is also important for forecasting.

Properties of the sample autocovariance function: psd

 
M = [ 0     ···  0     0     X̃_1   X̃_2   ···   X̃_n
      0     ···  0     X̃_1   X̃_2   ···   X̃_n   0
      0     ···  X̃_1   X̃_2   ···   X̃_n   0     0
       ⋮                                         ⋮
      X̃_1   X̃_2  ···   X̃_n   0     ···         0  ],
and X̃t = Xt − µ.
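A quick numerical check of positive semidefiniteness, building Γ̂_n directly from the sample autocovariances (a sketch; the simulated data and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(200)
n = len(x)
xbar = x.mean()

# sample autocovariances gamma_hat(0), ..., gamma_hat(n-1)
gamma_hat = np.array([np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n
                      for h in range(n)])

# Toeplitz matrix Gamma_hat_n with (i, j) entry gamma_hat(|i - j|)
Gamma_n = np.array([[gamma_hat[abs(i - j)] for j in range(n)] for i in range(n)])

eigvals = np.linalg.eigvalsh(Gamma_n)
print(eigvals.min() >= -1e-10)   # psd: all eigenvalues nonnegative (up to rounding)
```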

Estimating µ

How good is X̄n as an estimate of µ?

For a stationary process {X_t}, the sample average

X̄_n = (1/n)(X_1 + · · · + X_n)

satisfies

E(X̄_n) = µ   (unbiased),

var(X̄_n) = (1/n) Σ_{h=−n}^{n} (1 − |h|/n) γ(h).
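A simulation sketch comparing this variance formula with a Monte Carlo estimate, using an MA(1) process as an illustrative example (the parameter values and names are not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta, reps = 100, 0.7, 20000

# autocovariance of the MA(1) X_t = W_t + theta*W_{t-1} with sigma_w^2 = 1
gamma = np.zeros(n + 1)
gamma[0] = 1 + theta**2
gamma[1] = theta

# var(Xbar_n) = (1/n) sum_{h=-n}^{n} (1 - |h|/n) gamma(h)
h = np.arange(-n, n + 1)
var_formula = np.sum((1 - np.abs(h) / n) * gamma[np.abs(h)]) / n

# Monte Carlo estimate of var(Xbar_n)
w = rng.standard_normal((reps, n + 1))
x = w[:, 1:] + theta * w[:, :-1]
var_mc = x.mean(axis=1).var()

print(var_formula, var_mc)   # the two values should be close
```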

Estimating µ

To see why:

var(X̄_n) = E[ ((1/n) Σ_{i=1}^{n} X_i − µ) ((1/n) Σ_{j=1}^{n} X_j − µ) ]
         = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} E[(X_i − µ)(X_j − µ)]
         = (1/n²) Σ_{i,j} γ(i − j)
         = (1/n) Σ_{h=−(n−1)}^{n−1} (1 − |h|/n) γ(h),

since each lag h = i − j occurs n − |h| times in the double sum.

Estimating µ

Since

var(X̄_n) = (1/n) Σ_{h=−n}^{n} (1 − |h|/n) γ(h),

if lim_{h→∞} γ(h) = 0, then var(X̄_n) → 0.

Estimating µ

Also, since

var(X̄_n) = (1/n) Σ_{h=−n}^{n} (1 − |h|/n) γ(h),

if Σ_h |γ(h)| < ∞, then

n var(X̄_n) → Σ_{h=−∞}^{∞} γ(h) = σ² Σ_{h=−∞}^{∞} ρ(h),

where σ² = γ(0).

Compare this to the uncorrelated case....

Estimating µ


n var(X̄_n) → σ² Σ_{h=−∞}^{∞} ρ(h),

i.e., instead of var(X̄_n) ≈ σ²/n, we have var(X̄_n) ≈ σ²/(n/τ),

with τ = Σ_h ρ(h). The effect of the correlation is a reduction of the sample
size from n to n/τ. (c.f. mixing time.)
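For example, for an AR(1) process with ρ(h) = φ^{|h|}, τ = Σ_h φ^{|h|} = (1 + φ)/(1 − φ). A simulation sketch of the resulting effective sample size (the AR(1) example, parameter values, and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
phi, n, reps = 0.8, 200, 20000

tau = (1 + phi) / (1 - phi)          # tau = sum_h rho(h) for AR(1)
sigma2 = 1.0 / (1 - phi**2)          # var(X_t) when sigma_w^2 = 1

# simulate AR(1) paths started in stationarity
x = np.empty((reps, n))
x[:, 0] = rng.standard_normal(reps) * np.sqrt(sigma2)
w = rng.standard_normal((reps, n))
for t in range(1, n):
    x[:, t] = phi * x[:, t - 1] + w[:, t]

print(x.mean(axis=1).var())          # Monte Carlo var(Xbar_n)
print(sigma2 / (n / tau))            # approximately sigma^2 / (n / tau)
```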

Estimating µ: Asymptotic distribution

Why are we interested in asymptotic distributions?

• If we know the asymptotic distribution of X̄n , we can use it to


construct hypothesis tests,
e.g., is µ = 0?

• Similarly for the asymptotic distribution of ρ̂(h),


e.g., is ρ(1) = 0?
Notation: X_n ∼ AN(µ_n, σ_n²) means 'asymptotically normal':

(X_n − µ_n)/σ_n → Z in distribution,   where Z ∼ N(0, 1).

Estimating µ for a linear process: Asymptotically normal

Theorem (A.5) For a linear process X_t = µ + Σ_j ψ_j W_{t−j},
if Σ_j ψ_j ≠ 0, then

X̄_n ∼ AN(µ_X, V/n),

where

V = Σ_{h=−∞}^{∞} γ(h) = σ_w² (Σ_{j=−∞}^{∞} ψ_j)².

(X_n ∼ AN(µ_n, σ_n) means σ_n^{−1}(X_n − µ_n) → Z in distribution.)
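A simulation sketch of this central limit theorem for an MA(1), where ψ_0 = 1 and ψ_1 = θ, so Σ_j ψ_j = 1 + θ and V = σ_w²(1 + θ)² (the parameter values and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 0.5, 500, 20000

V = (1 + theta) ** 2                 # sigma_w^2 * (sum_j psi_j)^2 with sigma_w^2 = 1

w = rng.standard_normal((reps, n + 1))
x = w[:, 1:] + theta * w[:, :-1]     # MA(1): X_t = W_t + theta*W_{t-1}, mu = 0
z = np.sqrt(n) * x.mean(axis=1)      # sqrt(n) * (Xbar_n - mu)

print(z.var(), V)                    # the sample variance of z should approach V
```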

Estimating µ for a linear process
Recall: for a linear process X_t = µ + Σ_j ψ_j W_{t−j},

γ_X(h) = σ_w² Σ_{j=−∞}^{∞} ψ_j ψ_{h+j},

so  lim_{n→∞} n var(X̄_n) = lim_{n→∞} Σ_{h=−(n−1)}^{n−1} (1 − |h|/n) γ(h)

                          = lim_{n→∞} σ_w² Σ_{j=−∞}^{∞} Σ_{h=−(n−1)}^{n−1} (1 − |h|/n) ψ_j ψ_{j+h}

                          = σ_w² (Σ_{j=−∞}^{∞} ψ_j)².
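A quick numerical check of the autocovariance formula γ_X(h) = σ_w² Σ_j ψ_j ψ_{h+j}, using an MA(2) with illustrative weights (none of these values come from the lecture):

```python
import numpy as np

rng = np.random.default_rng(5)
psi = np.array([1.0, 0.4, -0.3])     # psi_0, psi_1, psi_2; sigma_w^2 = 1
n = 200_000

w = rng.standard_normal(n + 2)
x = psi[0] * w[2:] + psi[1] * w[1:-1] + psi[2] * w[:-2]   # X_t = sum_j psi_j W_{t-j}

for h in range(3):
    gamma_theory = np.sum(psi[: len(psi) - h] * psi[h:])   # sum_j psi_j psi_{j+h}
    gamma_sample = np.mean((x[h:] - x.mean()) * (x[: len(x) - h] - x.mean()))
    print(h, gamma_theory, gamma_sample)                    # should agree closely
```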

Estimating the ACF: Sample ACF for White Noise

Theorem For a white noise process Wt ,


if E(W_t⁴) < ∞, then

(ρ̂(1), . . . , ρ̂(K))′ ∼ AN(0, (1/n) I).

Sample ACF and testing for white noise

If {X_t} is white noise, we expect no more than ≈ 5% of the peaks of the
sample ACF to satisfy

|ρ̂(h)| > 1.96/√n.
This is useful because we often want to introduce transformations that
reduce a time series to white noise.
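A sketch of this informal check, counting how often the sample ACF of simulated white noise leaves the ±1.96/√n band (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.standard_normal(1000)        # white noise
n, max_lag = len(x), 40
xbar = x.mean()

gamma0 = np.sum((x - xbar) ** 2) / n
rho_hat = np.array([np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n / gamma0
                    for h in range(1, max_lag + 1)])

band = 1.96 / np.sqrt(n)
print(np.mean(np.abs(rho_hat) > band))   # roughly 0.05 for white noise
```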

Sample ACF for white Gaussian (hence i.i.d.) noise

[Figure: sample ACF of a white Gaussian noise realization, plotted for lags −20 to 20.]

Estimating the ACF: Sample ACF

Theorem (A.7) For a linear process X_t = µ + Σ_j ψ_j W_{t−j},
if E(W_t⁴) < ∞, then

(ρ̂(1), . . . , ρ̂(K))′ ∼ AN( (ρ(1), . . . , ρ(K))′, (1/n) V ),

where

V_{i,j} = Σ_{h=1}^{∞} (ρ(h+i) + ρ(h−i) − 2ρ(i)ρ(h)) × (ρ(h+j) + ρ(h−j) − 2ρ(j)ρ(h)).

Notice: If ρ(i) = 0 for all i ≠ 0, then V = I.
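A sketch that evaluates this covariance matrix numerically by truncating the sum over h (the truncation limit and names are illustrative); for white noise it should return the identity, matching the remark above:

```python
import numpy as np

def bartlett_V(rho, K, h_max=200):
    """V[i-1, j-1] = sum_{h=1}^{h_max} (rho(h+i) + rho(h-i) - 2*rho(i)*rho(h))
                                     * (rho(h+j) + rho(h-j) - 2*rho(j)*rho(h))."""
    V = np.zeros((K, K))
    for i in range(1, K + 1):
        for j in range(1, K + 1):
            for h in range(1, h_max + 1):
                a = rho(h + i) + rho(h - i) - 2 * rho(i) * rho(h)
                b = rho(h + j) + rho(h - j) - 2 * rho(j) * rho(h)
                V[i - 1, j - 1] += a * b
    return V

# White noise: rho(0) = 1 and rho(h) = 0 otherwise, so V is the identity.
rho_wn = lambda h: 1.0 if h == 0 else 0.0
print(bartlett_V(rho_wn, K=3))
```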

Sample ACF for MA(1)

Recall: ρ(0) = 1, ρ(±1) = θ/(1 + θ²), and ρ(h) = 0 for |h| > 1. Thus,

V_{1,1} = Σ_{h=1}^{∞} (ρ(h+1) + ρ(h−1) − 2ρ(1)ρ(h))² = (ρ(0) − 2ρ(1)²)² + ρ(1)²,

V_{2,2} = Σ_{h=1}^{∞} (ρ(h+2) + ρ(h−2) − 2ρ(2)ρ(h))² = Σ_{h=−1}^{1} ρ(h)².

And if ρ̂ is the sample ACF from a realization of this MA(1) process, then
with probability 0.95,

|ρ̂(h) − ρ(h)| ≤ 1.96 √(V_{hh}/n).
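A small sketch evaluating these 95% bands for a particular MA(1) (the values θ = 0.5 and n = 200 are illustrative):

```python
import numpy as np

theta, n = 0.5, 200
rho1 = theta / (1 + theta**2)

V11 = (1 - 2 * rho1**2) ** 2 + rho1**2   # (rho(0) - 2*rho(1)^2)^2 + rho(1)^2
V22 = 1 + 2 * rho1**2                    # rho(-1)^2 + rho(0)^2 + rho(1)^2

print(1.96 * np.sqrt(V11 / n))           # band for rho_hat(1) around rho(1)
print(1.96 * np.sqrt(V22 / n))           # band for rho_hat(2) around rho(2) = 0
```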

Sample ACF for MA(1)

[Figure: sample ACF for a realization of the MA(1) process, lags −10 to 10, overlaid with the true ACF and the 95% confidence interval.]

Introduction to Time Series Analysis. Lecture 4.
1. Review: ACF, sample ACF.
2. Properties of estimates of µ and ρ.

3. Convergence in mean square.

Convergence in Mean Square

• Recall the definition of a linear process:



X_t = Σ_{j=−∞}^{∞} ψ_j W_{t−j}

• What do we mean by these infinite sums of random variables?


i.e., what is the ‘limit’ of a sequence of random variables?
• Many types of convergence:
1. Convergence in distribution.
2. Convergence in probability.
3. Convergence in mean square.

Convergence in Mean Square

Definition: A sequence of random variables S1 , S2 , . . .


converges in mean square if there is a random variable Y
for which
lim_{n→∞} E(S_n − Y)² = 0.

Example: Linear Processes


X_t = Σ_{j=−∞}^{∞} ψ_j W_{t−j}

Then if Σ_{j=−∞}^{∞} |ψ_j| < ∞,

(1) |X_t| < ∞ a.s.

(2) Σ_{j=−∞}^{∞} ψ_j W_{t−j} converges in mean square.

Example: Linear Processes (Details)

(1) P(|X_t| ≥ α) ≤ (1/α) E|X_t|                           (Markov's inequality)

                 ≤ (1/α) Σ_{j=−∞}^{∞} |ψ_j| E|W_{t−j}|

                 ≤ (σ/α) Σ_{j=−∞}^{∞} |ψ_j|               (Jensen's inequality)

                 → 0   as α → ∞.

Example: Linear Processes (Details)

For (2):
The Riesz-Fisher Theorem (Cauchy criterion):
Sn converges in mean square iff

lim_{m,n→∞} E(S_m − S_n)² = 0.

Example: Linear Processes (Details)

(2) S_n = Σ_{j=−n}^{n} ψ_j W_{t−j} converges in mean square, since

E(S_m − S_n)² = E( Σ_{m≤|j|≤n} ψ_j W_{t−j} )²

              = Σ_{m≤|j|≤n} ψ_j² σ²

              ≤ σ² ( Σ_{m≤|j|≤n} |ψ_j| )²

              → 0.
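A numerical sketch of this Cauchy criterion for the absolutely summable choice ψ_j = φ^{|j|} (the weights, σ², and names are illustrative): E(S_m − S_n)² = σ² Σ_{m≤|j|≤n} ψ_j² shrinks as m and n grow.

```python
import numpy as np

phi, sigma2 = 0.8, 1.0                    # illustrative: psi_j = phi**abs(j)

def tail_mse(m, n):
    """sigma^2 * sum over m <= |j| <= n of psi_j^2 (both signs of j)."""
    j = np.arange(m, n + 1)
    return sigma2 * 2 * np.sum(phi ** (2 * j))

for m, n in [(5, 10), (20, 40), (50, 100)]:
    print(m, n, tail_mse(m, n))           # decreases toward 0 as m, n grow
```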

Example: AR(1)

Let X_t be the stationary solution to X_t − φX_{t−1} = W_t, where
W_t ∼ WN(0, σ²).

If |φ| < 1,

X_t = Σ_{j=0}^{∞} φ^j W_{t−j}

is a solution. The same argument as before shows that this infinite sum
converges in mean square, since |φ| < 1 implies Σ_{j≥0} |φ^j| < ∞.
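A small sketch checking numerically that a truncation of this series behaves like a solution of the AR(1) recursion, i.e. X_t − φX_{t−1} ≈ W_t (the truncation level and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
phi, n, J = 0.6, 1000, 60                 # J: truncation of the infinite sum

w = rng.standard_normal(n + J)
# X_t = sum_{j=0}^{J-1} phi^j W_{t-j}, built for t = J, ..., n+J-1
x = np.array([np.sum(phi ** np.arange(J) * w[t - np.arange(J)])
              for t in range(J, n + J)])

resid = x[1:] - phi * x[:-1] - w[J + 1:]  # equals -phi^J * W_{t-J}, i.e. negligible
print(np.max(np.abs(resid)))
```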

Example: AR(1)

Furthermore, Xt is the unique stationary solution: we can check that any


other stationary solution Yt is the mean square limit:
lim_{n→∞} E( Y_t − Σ_{i=0}^{n−1} φ^i W_{t−i} )² = lim_{n→∞} E(φ^n Y_{t−n})² = 0.

Example: AR(1)

Let Xt be the stationary solution to

Xt − φXt−1 = Wt ,

where Wt ∼ W N (0, σ 2 ).
If |φ| < 1,

X_t = Σ_{j=0}^{∞} φ^j W_{t−j}.

φ = 1?
φ = −1?
|φ| > 1?

Introduction to Time Series Analysis. Lecture 4.
1. Properties of estimates of µ and ρ.

2. Convergence in mean square.
