
Method of Moments Estimation Maximum Likelihood Estimation Comparison of MOM and MLE

Unit 13: ARMA Estimation

Taylor Brown

Department of Statistics, University of Virginia

Fall 2020


Readings for Unit 13

Textbook chapter 3.5 (pages 115 to 121).


Last Unit

1 ARMA forecasting.
2 Prediction error.
3 Prediction interval.


This Unit

1 Method of Moments Estimation.


2 Maximum Likelihood Estimation.


Motivation

In this unit, we explore two ways to estimate the parameters of ARMA models: Method of Moments (MOM) estimation and Maximum Likelihood (ML) estimation.


1 Method of Moments Estimation

2 Maximum Likelihood Estimation

3 Comparison of MOM and MLE


Method of Moments

Let’s start with method of moments (MOM) estimation. The idea behind this is to equate population moments to sample moments and then solve for the parameters in terms of the sample moments.

We re-use a lot of the same equations from the previous section!


AR Estimation

Let’s first assume that we have a causal AR(p) model
$$\phi(B)(x_t - \mu) = w_t,$$
where the white noise $w_t$ has variance $\sigma_w^2$ and
$$\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p.$$

Given $n$ observations $x_1, x_2, \ldots, x_n$, we are interested in estimating the parameters $\phi_1, \ldots, \phi_p$ and $\sigma_w^2$. Initially we assume that the order $p$ is known.


AR Estimation

We’ll assume again without loss of generality (WLOG) that $\mu = 0$. Why?

$E[x_t] = \mu$ can always be estimated with the first sample moment $\bar{x}$.

If $\mu \neq 0$, then transform your data before estimating as follows:
$$\tilde{x}_t = x_t - \bar{x}.$$


Yule-Walker Estimation for AR(p)

The method of moments works well when estimating causal AR(p) models. We consider the causal AR(p) model
$$x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + w_t. \tag{1}$$

For $h = 1, \ldots, p$, multiply both sides of (1) by $x_{t-h}$ and take expectations:
$$\gamma(h) = \phi_1 \gamma(h - 1) + \cdots + \phi_p \gamma(h - p). \tag{2}$$

When $h = 0$, we do the same thing and get
$$\sigma_w^2 = \gamma(0) - \phi_1 \gamma(1) - \cdots - \phi_p \gamma(p). \tag{3}$$


Yule-Walker Estimation for AR(p)

We call these the Yule-Walker equations:
$$\gamma(h) = \phi_1 \gamma(h - 1) + \phi_2 \gamma(h - 2) + \cdots + \phi_p \gamma(h - p), \qquad h = 1, \ldots, p,$$
$$\sigma_w^2 = \gamma(0) - \phi_1 \gamma(1) - \phi_2 \gamma(2) - \cdots - \phi_p \gamma(p).$$

We can also write them in matrix notation that should look familiar:
$$\Gamma_p \phi = \gamma_p, \qquad \sigma_w^2 = \gamma(0) - \phi' \gamma_p,$$
where $\Gamma_p = \{\gamma(k - j)\}_{j,k=1}^p$ is a $p \times p$ matrix, $\phi = (\phi_1, \ldots, \phi_p)'$ is a $p \times 1$ vector, and $\gamma_p = (\gamma(1), \ldots, \gamma(p))'$ is a $p \times 1$ vector.


Yule-Walker Estimation for AR(p)

Using the method of moments, put hats on everything and then solve for the desired parameters:
$$\hat{\Gamma}_p \hat{\phi} = \hat{\gamma}_p, \qquad \hat{\sigma}_w^2 = \hat{\gamma}(0) - \hat{\phi}' \hat{\gamma}_p$$
yields
$$\hat{\phi} = \hat{\Gamma}_p^{-1} \hat{\gamma}_p$$
and
$$\hat{\sigma}_w^2 = \hat{\gamma}(0) - \hat{\gamma}_p' \hat{\Gamma}_p^{-1} \hat{\gamma}_p.$$


Yule-Walker Estimation for AR(p)


One more small move: divide through by $\hat{\gamma}(0)$ before solving, so that we have formulas in terms of ACFs. Scaling the Yule-Walker equations,
$$\frac{1}{\gamma(0)} \Gamma_p \phi = \frac{1}{\gamma(0)} \gamma_p, \qquad \sigma_w^2 = \gamma(0) - \phi' \gamma_p,$$
gives us
$$\hat{\phi} = \hat{R}_p^{-1} \hat{\rho}_p \tag{4}$$
and
$$\hat{\sigma}_w^2 = \hat{\gamma}(0) \left[ 1 - \hat{\rho}_p' \hat{R}_p^{-1} \hat{\rho}_p \right],$$
where $\hat{R}_p = \{\hat{\rho}(k - j)\}_{j,k=1}^p$ is a $p \times p$ matrix and $\hat{\rho}_p = (\hat{\rho}(1), \ldots, \hat{\rho}(p))'$ is a $p \times 1$ vector.
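Equation (4) translates directly into code. Below is a minimal Python sketch (the helper name `yule_walker` is ours, not from the textbook; NumPy assumed) that demeans the data, computes the sample autocovariances, builds $\hat{R}_p$ and $\hat{\rho}_p$, and solves for $\hat{\phi}$ and $\hat{\sigma}_w^2$:

```python
import numpy as np

def yule_walker(x, p):
    """Yule-Walker estimates (phi_hat, sigma2_hat) for a causal AR(p)."""
    x = np.asarray(x, dtype=float) - np.mean(x)   # demean: x_tilde = x - xbar
    n = len(x)
    # Biased sample autocovariances gamma_hat(0), ..., gamma_hat(p).
    gamma = np.array([x[: n - h] @ x[h:] / n for h in range(p + 1)])
    rho = gamma[1:] / gamma[0]                    # rho_hat(1), ..., rho_hat(p)
    acf = np.concatenate(([1.0], rho))
    # R_hat_p is the Toeplitz matrix of sample autocorrelations.
    R = np.array([[acf[abs(k - j)] for k in range(p)] for j in range(p)])
    phi = np.linalg.solve(R, rho)                 # phi_hat = R_p^{-1} rho_p
    sigma2 = gamma[0] * (1.0 - rho @ phi)         # gamma_hat(0)[1 - rho' R^{-1} rho]
    return phi, sigma2
```

On a long simulated AR(2) path the estimates land close to the true coefficients, consistent with the asymptotics on the next slides.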

Yule-Walker Estimation for AR(p)

The asymptotic behavior of the Yule-Walker estimators for causal AR(p) processes is
$$\sqrt{n}\left(\hat{\phi} - \phi\right) \xrightarrow{d} N\left(0,\, \sigma_w^2 \Gamma_p^{-1}\right) \tag{5}$$
and
$$\hat{\sigma}_w^2 \xrightarrow{p} \sigma_w^2.$$


Yule-Walker Estimation for AR(p)

The asymptotic variance-covariance matrix for $\hat{\phi}$ is
$$\operatorname{Var}(\hat{\phi}) = \frac{\sigma_w^2}{n} \Gamma_p^{-1} = \frac{\sigma_w^2}{n\, \gamma(0)} R_p^{-1}. \tag{6}$$
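To see Eq. (6) in action, here is a sketch (NumPy assumed) that plugs in the sample quantities from the AR(2) simulation example later in this unit ($\hat{\rho}(1) = .849$, $\hat{\gamma}(0) = 8.903$, $\hat{\sigma}_w^2 = 1.187$, $n = 1000$) to get approximate standard errors for $\hat{\phi}$:

```python
import numpy as np

# Plug-in version of Eq. (6) with hats on everything, using the sample
# quantities from the AR(2) simulation example in this unit (n = 1000).
n, gamma0, sigma2, rho1 = 1000, 8.903, 1.187, 0.849
R = np.array([[1.0, rho1],
              [rho1, 1.0]])
var_phi = sigma2 / (n * gamma0) * np.linalg.inv(R)
se = np.sqrt(np.diag(var_phi))   # approximate standard errors of phi_hat
```

Both standard errors come out around 0.022; by the symmetry of $R_p^{-1}$, the two diagonal entries are equal, which is why fitted AR(2) output often reports identical standard errors for both coefficients.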


Simulation Example

Using the arima.sim() function in R, I simulated n = 1000 observations from the AR(2) process
$$x_t = 1.5 x_{t-1} - 0.75 x_{t-2} + w_t,$$
where $\sigma_w^2 = 1$. For this sample, $\hat{\gamma}(0) = 8.903$, $\hat{\rho}(1) = 0.849$, and $\hat{\rho}(2) = 0.519$.


Simulation Example
The data had $\hat{\rho}(1) = .849$, $\hat{\rho}(2) = .519$, and $\hat{\gamma}(0) = 8.903$, so
$$\hat{\phi} = \hat{R}_p^{-1} \hat{\rho}_p = \begin{pmatrix} 1 & .849 \\ .849 & 1 \end{pmatrix}^{-1} \begin{pmatrix} .849 \\ .519 \end{pmatrix} = \begin{pmatrix} 1.463 \\ -.723 \end{pmatrix}.$$
Also,
$$\hat{\sigma}_w^2 = \hat{\gamma}(0)\left[1 - \hat{\rho}_p' \hat{R}_p^{-1} \hat{\rho}_p\right] = \hat{\gamma}(0)\left[1 - \hat{\rho}_p' \hat{\phi}\right] = 8.903 \times \left[1 - \begin{pmatrix} .849 & .519 \end{pmatrix} \begin{pmatrix} 1.463 \\ -.723 \end{pmatrix}\right] = 1.187.$$
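As a sanity check, the arithmetic above can be reproduced in a few lines (a sketch; NumPy assumed):

```python
import numpy as np

# Sample quantities from the simulation example.
rho1, rho2, gamma0 = 0.849, 0.519, 8.903
R = np.array([[1.0, rho1],
              [rho1, 1.0]])
rho_p = np.array([rho1, rho2])

phi_hat = np.linalg.solve(R, rho_p)           # phi_hat = R_p^{-1} rho_p
sigma2_hat = gamma0 * (1.0 - rho_p @ phi_hat)

print(np.round(phi_hat, 3))    # -> [ 1.463 -0.723]
print(round(sigma2_hat, 3))    # -> 1.187
```

Both estimates are close to the true values (1.5, −0.75, and 1), as the asymptotics in (5) lead us to expect.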

Fish Population Example

In Unit 11, we looked at the ACF and PACF of the time series
from “recruit.dat”, which contains data on fish population in the
central Pacific Ocean. The numbers represent the number of new
fish for each month in the years 1950-1987.


Fish Population Example


[Figure: sample ACF (top) and PACF (bottom) of the Recruit Data, plotted against LAG up to 1.5; both panels range from −0.5 to 1.0.]



Fish Population Example

Let’s check the results of fitting an AR(2) model using Yule-Walker estimation in R.

> rec.yw <- ar.yw(rec, order=2)
> rec.yw$x.mean
[1] 62.26278
> rec.yw$ar
[1] 1.3315874 -0.4445447
> sqrt(diag(rec.yw$asy.var.coef))
[1] 0.04222637 0.04222637


Fish Population Example


[Figure: “Recruit Data with 24 Month Predictions” — the monthly series (Time axis 1950–1990, vertical axis 0 to 100) with 24 months of forecasts and prediction bands appended.]

Fish Population Example

rec.pred <- predict(rec.yw, n.ahead=24)
ts.plot(rec, rec.pred$pred, col=1:2)
lines(rec.pred$pred - rec.pred$se, col=4, lty=2)
lines(rec.pred$pred + rec.pred$se, col=4, lty=2)


Method of Moments Estimation for MA(q)

Consider an invertible MA(1) process $x_t = w_t + \theta w_{t-1}$ with $|\theta| < 1$. We know that
$$\rho(1) = \frac{\theta}{1 + \theta^2}.$$
Using the method of moments, we equate $\hat{\rho}(1)$ to $\rho(1)$ and solve a quadratic equation in $\theta$.


Method of Moments Estimation for MA(q)

The candidate solutions are
$$\hat{\theta} = \frac{1 \pm \sqrt{1 - 4\hat{\rho}(1)^2}}{2\hat{\rho}(1)}.$$

• If $|\hat{\rho}(1)| < 0.5$, two real solutions exist, so we pick the invertible one.
• If $\hat{\rho}(1) = \pm 0.5$, then $\hat{\theta} = \pm 1$: there is no invertible solution.
• If $|\hat{\rho}(1)| > 0.5$, no real solutions exist: the method of moments fails to yield an estimator of $\theta$.

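The three cases above can be wrapped in a small helper (a sketch; the function name `ma1_mom` is ours, not from the textbook):

```python
import math

def ma1_mom(rho1):
    """MOM estimate of theta in an MA(1), following the three cases above.

    Returns None when |rho1| > 0.5 (no real solution exists)."""
    if abs(rho1) > 0.5:
        return None                        # method of moments fails
    if rho1 == 0.0:
        return 0.0                         # rho(1) = 0 corresponds to theta = 0
    if abs(rho1) == 0.5:
        return 1.0 if rho1 > 0 else -1.0   # boundary case: theta = +-1, not invertible
    # Two real roots of rho1 * theta^2 - theta + rho1 = 0; they are reciprocals
    # of each other, so exactly one satisfies |theta| < 1.
    disc = math.sqrt(1.0 - 4.0 * rho1 ** 2)
    roots = ((1.0 + disc) / (2.0 * rho1), (1.0 - disc) / (2.0 * rho1))
    return next(t for t in roots if abs(t) < 1.0)
```

For example, $\theta = 0.5$ implies $\rho(1) = 0.5/1.25 = 0.4$, and `ma1_mom(0.4)` recovers $0.5$ rather than the non-invertible root $2$.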

Method of Moments Estimation for MA(q)

For higher-order MA(q) models, the method of moments quickly gets complicated. The moment equations are non-linear in $\theta_1, \ldots, \theta_q$, so numerical methods must be used.

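For instance, an MA(2) has $\rho(1) = (\theta_1 + \theta_1\theta_2)/(1 + \theta_1^2 + \theta_2^2)$ and $\rho(2) = \theta_2/(1 + \theta_1^2 + \theta_2^2)$, and a root-finder can solve the two moment equations numerically. A sketch using SciPy's `fsolve` (the target values below are illustrative, chosen to match $\theta = (0.5, 0.3)$; convergence to an invertible root is not guaranteed in general):

```python
import numpy as np
from scipy.optimize import fsolve

def ma2_rho(theta):
    """Theoretical (rho(1), rho(2)) of an MA(2) with coefficients theta = (t1, t2)."""
    t1, t2 = theta
    d = 1.0 + t1 ** 2 + t2 ** 2
    return np.array([(t1 + t1 * t2) / d, t2 / d])

# Pretend these are the sample autocorrelations (here generated from
# theta = (0.5, 0.3) so we know a solution exists).
target = ma2_rho((0.5, 0.3))

# Solve the non-linear moment equations rho(theta) = rho_hat numerically.
theta_hat = fsolve(lambda th: ma2_rho(th) - target, x0=[0.1, 0.1])
```

With real data, `target` would be replaced by $(\hat{\rho}(1), \hat{\rho}(2))$; the returned root should be checked for invertibility, since the moment equations have multiple solutions.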

1 Method of Moments Estimation

2 Maximum Likelihood Estimation

3 Comparison of MOM and MLE


Maximum Likelihood Estimation

To illustrate the main concept of maximum likelihood estimation, we consider the AR(1) model with nonzero mean
$$x_t = \mu + \phi(x_{t-1} - \mu) + w_t, \tag{7}$$
where $|\phi| < 1$ and $w_t \overset{iid}{\sim} N(0, \sigma_w^2)$.


Maximum Likelihood Estimation

We seek the likelihood
$$L(\mu, \phi, \sigma_w^2) = f_{\mu, \phi, \sigma_w^2}(x_1, x_2, \ldots, x_n). \tag{8}$$

The likelihood function (8) is functionally equivalent to the joint probability distribution of the observed data $x_1, x_2, \ldots, x_n$.


Maximum Likelihood Estimation

For a given data set, think of the likelihood as a function of the parameters (not the data). Since we’ve already observed the data $x_1, x_2, \ldots, x_n$, we can find the parameters $(\mu, \phi, \sigma_w^2)$ that maximize the likelihood $L(\mu, \phi, \sigma_w^2)$. This is the basic idea behind maximum likelihood estimation.


Likelihood Function

We will use the following factorization:
$$
\begin{aligned}
L(\mu, \phi, \sigma_w^2) &= f(x_1, \ldots, x_n) \\
&= f(x_1) f(x_2 \mid x_1) f(x_3 \mid x_2, x_1) \cdots f(x_n \mid x_{n-1}, x_{n-2}, \ldots, x_1) \\
&= f(x_1) f(x_2 \mid x_1) f(x_3 \mid x_2) \cdots f(x_n \mid x_{n-1}).
\end{aligned}
$$
The last equality holds because the AR(1) model is Markov: given $x_{t-1}$, earlier values carry no additional information about $x_t$.


Likelihood Function

These are all equivalent statements of the model:
$$x_t = \mu + \phi(x_{t-1} - \mu) + w_t, \qquad w_t \sim \text{Normal}(0, \sigma_w^2),$$
$$x_t \mid x_{t-1} \sim \text{Normal}\left(\mu + \phi(x_{t-1} - \mu),\, \sigma_w^2\right),$$
and
$$f_{x_t \mid x_{t-1}}(x_t \mid x_{t-1}) = \frac{1}{\sqrt{2\pi\sigma_w^2}} \exp\left\{-\frac{[x_t - \mu - \phi(x_{t-1} - \mu)]^2}{2\sigma_w^2}\right\}.$$


Likelihood Function

We have
$$
\begin{aligned}
L(\mu, \phi, \sigma_w^2) &= f_{x_1}(x_1) \times f_{x_2 \mid x_1}(x_2 \mid x_1) \times \cdots \times f_{x_n \mid x_{n-1}}(x_n \mid x_{n-1}) \\
&= f_{x_1}(x_1)\, (2\pi\sigma_w^2)^{-(n-1)/2} \exp\left\{-\frac{\sum_{t=2}^n [x_t - \mu - \phi(x_{t-1} - \mu)]^2}{2\sigma_w^2}\right\}.
\end{aligned}
$$


Likelihood Function: what is $f_{x_1}(x_1)$?

In midterm 1, we assumed
$$x_1 \sim \text{Normal}\left(\mu, \frac{\sigma_w^2}{1 - \phi^2}\right)$$
because that would allow all other time points to have the same marginal distribution.

Here’s another rationalization: assume the model is causal ($|\phi| < 1$), and pretend you have an infinite history of data (impossible in practice). The causal representation is
$$x_1 = \mu + \sum_{j=0}^{\infty} \phi^j w_{1-j}.$$
Take the expectation and the variance of both sides. Since the $w_t$ are iid normal, $x_1$ is normal with mean $\mu$ and variance $\sigma_w^2 / (1 - \phi^2)$.

Likelihood Function

The likelihood function is
$$
\begin{aligned}
L(\mu, \phi, \sigma_w^2) &= f_{x_1}(x_1)\, (2\pi\sigma_w^2)^{-(n-1)/2} \exp\left\{-\frac{\sum_{t=2}^n [x_t - \mu - \phi(x_{t-1} - \mu)]^2}{2\sigma_w^2}\right\} \\
&= (2\pi\sigma_w^2)^{-n/2} (1 - \phi^2)^{1/2} \exp\left\{-\frac{S(\mu, \phi)}{2\sigma_w^2}\right\},
\end{aligned}
$$
where
$$S(\mu, \phi) = (1 - \phi^2)(x_1 - \mu)^2 + \sum_{t=2}^n [x_t - \mu - \phi(x_{t-1} - \mu)]^2.$$


The Log-likelihood Function

It is worth pointing out that it is more common to work with the log-likelihood
$$
\begin{aligned}
\ell(\mu, \phi, \sigma_w^2) &= \log L(\mu, \phi, \sigma_w^2) \\
&= \log\left[(2\pi\sigma_w^2)^{-n/2} (1 - \phi^2)^{1/2} \exp\left\{-\frac{S(\mu, \phi)}{2\sigma_w^2}\right\}\right] \\
&= -\frac{n}{2} \log(2\pi\sigma_w^2) + \frac{1}{2} \log(1 - \phi^2) - \frac{S(\mu, \phi)}{2\sigma_w^2}.
\end{aligned}
$$
It is numerically more stable, and its derivatives are easier to calculate.

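The final expression above translates directly into code. A minimal sketch (the function name `ar1_loglik` is ours; NumPy assumed):

```python
import numpy as np

def ar1_loglik(x, mu, phi, sigma2):
    """Exact (unconditional) AR(1) log-likelihood, assuming |phi| < 1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # One-step-ahead residuals x_t - mu - phi*(x_{t-1} - mu), t = 2, ..., n.
    resid = x[1:] - mu - phi * (x[:-1] - mu)
    # S(mu, phi) = (1 - phi^2)(x_1 - mu)^2 + sum of squared residuals.
    S = (1.0 - phi ** 2) * (x[0] - mu) ** 2 + np.sum(resid ** 2)
    return (-0.5 * n * np.log(2.0 * np.pi * sigma2)
            + 0.5 * np.log(1.0 - phi ** 2)
            - S / (2.0 * sigma2))
```

A useful check: this must equal $\log f_{x_1}(x_1)$ plus the sum of the conditional normal log-densities from the factorization two slides back, and it does.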

The variance estimator

The variance estimator can be obtained after you get the other estimators. Differentiating the log-likelihood with respect to $\sigma_w^2$,
$$
\frac{d}{d\sigma_w^2}\, \ell(\mu, \phi, \sigma_w^2) = \frac{d}{d\sigma_w^2}\left[-\frac{n}{2} \log(2\pi\sigma_w^2) + \frac{1}{2} \log(1 - \phi^2) - \frac{S(\mu, \phi)}{2\sigma_w^2}\right] = -\frac{n}{2\sigma_w^2} + \frac{S(\mu, \phi)}{2(\sigma_w^2)^2}.
$$
Setting this equal to 0, replacing $\mu$ and $\phi$ with their estimators, and solving for $\hat{\sigma}_w^2$ gives us
$$\hat{\sigma}_w^2 = \frac{S(\hat{\mu}, \hat{\phi})}{n}.$$

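A quick numerical check of this closed form, using illustrative values for $n$, $S$, and $\phi$ (not taken from any data set): with $\mu$ and $\phi$ held fixed, the log-likelihood as a function of $\sigma_w^2$ should peak at $S/n$.

```python
import numpy as np

# With mu and phi fixed, l(sigma2) = -n/2*log(2*pi*sigma2)
# + 1/2*log(1 - phi^2) - S/(2*sigma2) should be maximized at sigma2 = S/n.
n, S, phi = 50, 60.0, 0.5

def ell(sigma2):
    return (-0.5 * n * np.log(2.0 * np.pi * sigma2)
            + 0.5 * np.log(1.0 - phi ** 2)
            - S / (2.0 * sigma2))

grid = np.linspace(0.5, 3.0, 2501)   # candidate sigma2 values, step 0.001
best = grid[np.argmax(ell(grid))]    # grid maximizer; should sit at S/n = 1.2
```

The grid search lands on $S/n = 60/50 = 1.2$, matching the calculus above.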

Estimating $\mu$ and $\phi$

The estimators for $\mu$ and $\phi$ are more complicated: taking derivatives with respect to these parameters and setting the equations equal to 0 yields a system that cannot be solved analytically. Estimation is usually accomplished with a numerical procedure (e.g., Newton-Raphson or Fisher scoring).


1 Method of Moments Estimation

2 Maximum Likelihood Estimation

3 Comparison of MOM and MLE


Properties of ML Estimators

