Basic Models

System models and prediction errors

Disturbances
A complete characterization of disturbance v at time instance t + k, k ≥ 1 would be given by the joint PDF
for v(t + k), k ≥ 1 given v(s), s ≤ t. This turns out to be quite laborious and one uses a simpler approach
[1]. Let
v(t) = ∑_{k=0}^{∞} h(k) e(t − k)    (1)

where {e(t)} is white noise with zero mean and variance λ. We assume that h is stable and invertible and
the inverse is also stable. We assume without loss of generality that h(0) = 1. From (1),

E[v(t)] = ∑_{k=0}^{∞} h(k) E[e(t − k)] = 0

and
R_vv(τ) = E[v(t)v(t − τ)] = ∑_{k=0}^{∞} ∑_{s=0}^{∞} h(k)h(s) E[e(t − k)e(t − τ − s)]
        = ∑_{k=0}^{∞} ∑_{s=0}^{∞} h(k)h(s) λ δ(k − τ − s)
        = λ ∑_{k=0}^{∞} h(k)h(k − τ).    (2)

It follows that {v(t)} is wide-sense stationary (WSS).
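
As a quick numerical illustration, the following sketch (an assumption-laden example, not part of the notes) simulates (1) for a short impulse response with h(0) = 1 and compares the sample autocovariance with the expression in (2).

```python
# Sketch: Monte Carlo check of the autocovariance formula (2)
# for an illustrative finite impulse response h with h(0) = 1.
import numpy as np

rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, 0.2])            # h(0), h(1), h(2); h(k) = 0 for k > 2
lam = 1.5                                # variance of the white noise e(t)
N = 200_000

e = rng.normal(0.0, np.sqrt(lam), N)
v = np.convolve(e, h)[:N]                # v(t) = sum_k h(k) e(t - k)

for tau in range(4):
    # theory: lambda * sum_k h(k) h(k - tau), with h(m) = 0 for m < 0
    R_theory = lam * sum(h[k] * h[k - tau] for k in range(tau, len(h)))
    R_sample = np.mean(v[tau:] * v[:N - tau])
    print(f"tau={tau}: sample {R_sample:.3f}  vs  theory {R_theory:.3f}")
```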


Let q^{-1} be the backward shift operator, i.e., q^{-1}x(t) = x(t − 1), and q the forward shift operator, i.e., qx(t) = x(t + 1). Define the following linear shift operators

G(q) := ∑_{k=1}^{∞} g(k) q^{-k},    H(q) := ∑_{k=0}^{∞} h(k) q^{-k}.    (3)

Then, v(t) = H(q)e(t). LTI systems with additive disturbance are represented as

y(t) = G(q)u(t) + H(q)e(t). (4)

Suppose u is deterministic and e is stochastic. Then,

E[y(t)] = G(q)u(t) (5)

which in general depends on t, so {y(t)} is not stationary.

Linear models
Consider linear models of the following form

M(θ) : y(t) = G(q; θ)u(t) + H(q; θ)e(t),    E[e(t)e^T(s)] = Λ(θ)δ_{t,s}    (6)

where θ is the parametrization variable, y, e ∈ R^{n_y}, u ∈ R^{n_u}, and G and H are of appropriate dimensions.
These are also called predictor models, denoted as

M(θ) : {(G(q; θ), H(q; θ)) | θ ∈ Θ ⊂ R^d}.    (7)

ARMAX model: Consider a scalar model of the form

A(q)y(t) = B(q)u(t) + C(q)e(t) (8)

where

A(q) = 1 + a_1 q^{-1} + · · · + a_{n_a} q^{-n_a},  B(q) = b_1 q^{-1} + · · · + b_{n_b} q^{-n_b},  C(q) = 1 + c_1 q^{-1} + · · · + c_{n_c} q^{-n_c}.    (9)

The parameter vector can be taken as

θ := [a_1 · · · a_{n_a}  b_1 · · · b_{n_b}  c_1 · · · c_{n_c}]^T.    (10)

For MIMO systems, we have polynomial matrices instead of scalars. Notice that

y(t) = (B(q)/A(q)) u(t) + (C(q)/A(q)) e(t)

with G(q) = B(q)/A(q) and H(q) = C(q)/A(q). Both G(q) and H(q) are required to be stable. Some special cases of the
ARMAX model are as follows.

• AR model: If n_b = n_c = 0, then we get A(q)y(t) = e(t), which is called the AR (auto regressive)
model. Notice that in this model, G = 0, H = 1/A.

• MA model: If n_a = n_b = 0, then we get y(t) = C(q)e(t), which is called the MA (moving average)
model. Notice that in this model, G = 0, H = C.

• ARMA model: If n_b = 0, then we get A(q)y(t) = C(q)e(t), which is called the ARMA (auto
regressive moving average) model. Notice that in this model, G = 0, H = C/A.

• ARX model: If n_c = 0, then we get A(q)y(t) = B(q)u(t) + e(t), which is called the ARX (auto
regressive with exogenous input) model. The impulse response can be of finite or infinite length in this case,
depending on the form of A(q). Notice that in this model, G = B/A, H = 1/A.

• FIR model: If n_a = n_c = 0, then we get y(t) = B(q)u(t) + e(t), which is called the FIR (finite
impulse response) model. Notice that in this model, G = B, H = I.
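
The sketch below simulates a scalar ARMAX model of the form (8) with illustrative coefficients (the particular values are assumptions for the example, not taken from the notes). It uses `scipy.signal.lfilter`, which applies a rational transfer function b(q)/a(q) to a signal, so the filtering form y = (B/A)u + (C/A)e carries over directly.

```python
# Sketch: simulating A(q) y = B(q) u + C(q) e, i.e. y = (B/A) u + (C/A) e,
# with illustrative first-order polynomials.
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(1)
A = [1.0, -0.8]                    # A(q) = 1 - 0.8 q^{-1}
B = [0.0, 0.5]                     # B(q) = 0.5 q^{-1}  (at least one input delay)
C = [1.0, 0.3]                     # C(q) = 1 + 0.3 q^{-1}

N = 500
u = rng.standard_normal(N)         # deterministic-like input realization
e = 0.1 * rng.standard_normal(N)   # white noise

# lfilter(b, a, x) applies the transfer function b(q)/a(q) to x
y = lfilter(B, A, u) + lfilter(C, A, e)
```

Setting C = [1.0] gives the ARX case, A = [1.0] the FIR case, and so on for the other special cases listed above.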

The ARX model can be written as

y(t) = φ(t)^T θ + e(t)    (11)

where
φ(t) = [−y(t − 1) · · · −y(t − n_a)  u(t − 1) · · · u(t − n_b)]^T    (12)

is called the regressor vector, which is not deterministic since it depends on past outputs.
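
Because (11) is linear in θ, the parameters can be estimated by ordinary least squares once the regressors (12) are stacked into a matrix. The sketch below uses a hypothetical second-order ARX system (the coefficients are assumptions chosen for illustration) to show the construction.

```python
# Sketch: build the regressor vectors (12) from ARX data and estimate
# theta in (11) by least squares (illustrative true system).
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)
na, nb, N = 2, 2, 2000
A = [1.0, -0.7, 0.1]               # A(q) = 1 - 0.7 q^{-1} + 0.1 q^{-2}
B = [0.0, 1.0, 0.5]                # B(q) = q^{-1} + 0.5 q^{-2}
u = rng.standard_normal(N)
e = 0.05 * rng.standard_normal(N)
y = lfilter(B, A, u) + lfilter([1.0], A, e)     # ARX: A y = B u + e

# One row of Phi per time instant t >= max(na, nb): phi(t)^T
n0 = max(na, nb)
Phi = np.column_stack(
    [-y[n0 - k: N - k] for k in range(1, na + 1)]
    + [u[n0 - k: N - k] for k in range(1, nb + 1)]
)
theta_hat, *_ = np.linalg.lstsq(Phi, y[n0:], rcond=None)
print(theta_hat)                   # close to [-0.7, 0.1, 1.0, 0.5]
```
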
State space models: A linear stochastic state space model is of the form

x(t + 1) = A(θ)x(t) + B(θ)u(t) + v(t) (13)


y(t) = C(θ)x(t) + n(t) (14)

where v and n are multivariate white noise sequences with zero mean and the following covariances

E[v(t)v^T(s)] = R_1(θ)δ_{t,s},   E[n(t)n^T(s)] = R_2(θ)δ_{t,s},   E[v(t)n^T(s)] = R_{12}(θ)δ_{t,s}.    (15)

The transfer function is given by

G(q; θ) = C(θ)(qI − A(θ))^{-1}B(θ).    (16)
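
Formula (16) can be evaluated directly at points on the unit circle q = e^{jw} to obtain the frequency response. The sketch below does this for a hypothetical two-state example (the matrices are assumptions, not from the notes).

```python
# Sketch: evaluating G(q) = C (qI - A)^{-1} B from (16) on the unit circle.
import numpy as np

A = np.array([[0.9, 0.1],
              [0.0, 0.7]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])

def G(q):
    """Evaluate C (qI - A)^{-1} B at a complex point q."""
    return (C @ np.linalg.solve(q * np.eye(2) - A, B)).item()

for w in (0.0, 0.5, np.pi):
    print(f"w={w:.2f}: |G(e^jw)| = {abs(G(np.exp(1j * w))):.3f}")
```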

There are other models such as the output error model, the Box-Jenkins model, and so on. The output error model
is described as

y(t) = (B(q)/F(q)) u(t) + e(t)    (17)

where F(q) = 1 + f_1 q^{-1} + · · · + f_{n_f} q^{-n_f}. The Box-Jenkins model is given by

y(t) = (B(q)/F(q)) u(t) + (C(q)/D(q)) e(t).    (18)

Notice that in this model, G = B/F, H = C/D.
One step prediction error: Consider model (4).
Observations: Given y(s) for s ≤ t − 1 and u(s) for s ≤ t (i.e., y^{t−1} and u^t are known).
Question: How to predict y(t) from these observations/measurements?
Since v(s) = H(q)e(s),
v(s) = y(s) − G(q)u(s) (19)
which means that v(s) is known for s ≤ t − 1. Based on this information, we want to predict the value

y(t) = G(q)u(t) + v(t).

We assume that H(q) is invertible and both H(q) and H(q)−1 are stable and H(q) is monic. Notice that

v(t) = ∑_{k=0}^{∞} h(k)e(t − k) = e(t) + ∑_{k=1}^{∞} h(k)e(t − k).    (20)

Let

m(t − 1) := ∑_{k=1}^{∞} h(k)e(t − k).    (21)

We assume that e(t) are identically distributed. The MSE estimator, denoted by v̂(t|t − 1), which minimizes
the mean squared error min_{v̂(t)} E[(v(t) − v̂(t))^2], is

v̂(t|t − 1) = m(t − 1) = ∑_{k=1}^{∞} h(k)e(t − k).    (22)
Notice that v(t) − v̂(t|t − 1) = e(t) ⊥ v̂(t|t − 1) = ∑_{k=1}^{∞} h(k)e(t − k), satisfying the orthogonality property
of the MSE estimator. Therefore, using v̂(t|t − 1), one obtains the MSE estimate ŷ(t|t − 1) of y(t) as

ŷ(t|t − 1) = G(q)u(t) + v̂(t|t − 1). (23)

Notice that y(t) − ŷ(t|t − 1) = e(t). Again, by a similar argument, e(t) ⊥ ŷ(t|t − 1) since

E[e(t)ŷ(t|t − 1)] = E[G(q)u(t)e(t)] + E[e(t)v̂(t|t − 1)] = 0

where we have used E[e(t)] = 0 and E[e(t)v̂(t|t − 1)] = 0.


By rearranging terms, we obtain another expression for ŷ(t|t − 1) below. Notice that

v̂(t|t − 1) = (H(q) − 1)e(t) = (H(q) − 1)H^{-1}(q)v(t) = (1 − H^{-1}(q))v(t).    (24)

This can also be expressed as


H(q)v̂(t|t − 1) = (H(q) − 1)v(t). (25)

Therefore, the prediction ŷ(t|t − 1) is

ŷ(t|t − 1) = G(q)u(t) + v̂(t|t − 1)
           = G(q)u(t) + (1 − H^{-1}(q))v(t)
           = G(q)u(t) + (1 − H^{-1}(q))(y(t) − G(q)u(t))
           = H^{-1}(q)G(q)u(t) + (1 − H^{-1}(q))y(t).    (26)

This can also be expressed as

H(q)ŷ(t|t − 1) = G(q)u(t) + (H(q) − 1)y(t). (27)

From (4) and (26), the prediction error is given by

y(t) − ŷ(t|t − 1) = −H^{-1}(q)G(q)u(t) + H^{-1}(q)y(t) = e(t).    (28)

The error e(t) represents the part of the output which cannot be predicted from the past data. Therefore, it
is referred to as the innovation at time t. (Block diagrams will be drawn in the class.)
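
The sketch below checks (26) numerically for a first-order ARMAX system (the polynomials are illustrative assumptions). Since H^{-1}G = B/C and 1 − H^{-1} = (C − A)/C here, the predictor is two filtering operations, and the residual y − ŷ should reproduce the innovation e up to round-off.

```python
# Sketch: one-step predictor (26) for an illustrative ARMAX system,
# checking that the prediction error equals the innovation e(t).
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(3)
A, B, C = [1.0, -0.8], [0.0, 0.5], [1.0, 0.3]    # G = B/A, H = C/A
N = 1000
u = rng.standard_normal(N)
e = 0.1 * rng.standard_normal(N)
y = lfilter(B, A, u) + lfilter(C, A, e)

# (26): H^{-1} G = B/C and 1 - H^{-1} = (C - A)/C
y_hat = lfilter(B, C, u) + lfilter(np.subtract(C, A), C, y)
print(np.max(np.abs(y - y_hat - e)))             # ~0 (round-off level)
```
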
For models parametrized by θ, we can define one step prediction as

ŷ(t|θ) = H^{-1}(q; θ)G(q; θ)u(t) + (1 − H^{-1}(q; θ))y(t).    (29)

Example 0.1. Consider the ARX model below

y(t) + a_1 y(t − 1) + · · · + a_{n_a} y(t − n_a) = b_1 u(t − 1) + · · · + b_{n_b} u(t − n_b) + e(t)

where e(t) is the white noise term. Let θ = [a_1 a_2 · · · a_{n_a}  b_1 · · · b_{n_b}]^T, A(q) = 1 + a_1 q^{-1} +
· · · + a_{n_a} q^{-n_a}, and B(q) = b_1 q^{-1} + · · · + b_{n_b} q^{-n_b}. Therefore, G(q; θ) = B(q)/A(q) and H(q; θ) = 1/A(q). Note that from
(29),

ŷ(t|θ) = B(q)u(t) + (1 − A(q))y(t).
Let φ(t) = [−y(t − 1) · · · −y(t − n_a)  u(t − 1) · · · u(t − n_b)]^T. Then,

ŷ(t|θ) = φ(t)^T θ.

This can also be expressed in linear regression form as

y(t) = ŷ(t|θ) + e(t) = φ^T(t)θ + e(t)    (30)

where e(t) is unknown.


For other models too, one can write ŷ(t|θ) = φT (t)θ using appropriate regressors φ(t).

In the case of online estimation, one updates the parameter vector θ to reduce the prediction error
‖y(t) − ŷ(t|θ)‖. One can similarly define the k-step prediction of y [1]. The (optimal) predictor model for
state estimation of LTI state space models is given by the Kalman filter equations.
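
For models in the linear-regression form (30), a common online scheme is recursive least squares (RLS); the update below is the standard textbook form, shown here as a sketch rather than something derived in these notes.

```python
# Sketch: one recursive least-squares (RLS) update for y(t) = phi(t)^T theta + e(t).
import numpy as np

def rls_step(theta, P, phi, y_t, lam=1.0):
    """One RLS update with forgetting factor lam (lam = 1: no forgetting)."""
    eps = y_t - phi @ theta                      # prediction error y(t) - phi^T theta
    K = P @ phi / (lam + phi @ P @ phi)          # gain vector
    theta = theta + K * eps                      # parameter update
    P = (P - np.outer(K, phi @ P)) / lam         # "covariance" update
    return theta, P
```

With lam = 1 this recursion minimizes the sum of squared prediction errors over all past data; choosing lam < 1 discounts old data, which is useful for slowly time-varying parameters.
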
There are matrix versions of these linear models for MIMO systems where G(q) and H(q) are matrix
transfer functions [1].
Typical model sets are M∗ = {all linear models}. A model structure is a differentiable mapping
M : Ω → M∗, where Ω ⊆ R^n. For θ ∈ Ω, M(θ) ∈ M∗ denotes a parametrized model structure.

Definition 0.2. A model structure is globally identifiable at θ∗ if

M(θ) = M(θ∗ ) ⇒ θ = θ∗ . (31)

Nonlinear models: One can have nonlinear models as well, e.g., Wiener and Hammerstein models [1].
Hammerstein models are used when there is a static nonlinearity at the input followed by a linear model, i.e.,
a cascade of a static nonlinearity with a linear model. Wiener models are used when the static nonlinearity is at
the output side. A combination of the two forms the Wiener-Hammerstein model. (Block diagrams for all these
models will be drawn in the class.)
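
The sketch below illustrates the two cascade structures with an assumed cubic nonlinearity and an assumed first-order linear block (both are illustrative choices, not from the notes): the Hammerstein model applies the nonlinearity before the linear dynamics, the Wiener model after.

```python
# Sketch: Hammerstein (static nonlinearity -> linear block) and
# Wiener (linear block -> static nonlinearity) structures.
import numpy as np
from scipy.signal import lfilter

B, A = [0.0, 0.5], [1.0, -0.8]          # linear block B(q)/A(q)
f = lambda x: x + 0.2 * x**3            # assumed static nonlinearity

u = np.random.default_rng(4).standard_normal(200)

y_hammerstein = lfilter(B, A, f(u))     # nonlinearity at the input
y_wiener = f(lfilter(B, A, u))          # nonlinearity at the output
```
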
Nonlinear state space models are given by

x(t + 1) = f (t, x(t), u(t), w(t); θ) (32)


y(t) = h(t, x(t), u(t), v(t); θ) (33)

where w(t) and v(t) are independent sequences of random variables and θ is the parameter vector. Predictors
for nonlinear state space models can be constructed using past input-output data

ŷ(t|θ) = g(t, Z t−1 ; θ) (34)

where Z^{t−1} = (y(1), u(1), . . . , y(t − 1), u(t − 1)). The prediction error is

ε(t; θ) = y(t) − ŷ(t|θ).    (35)

References
[1] L. Ljung, System Identification: Theory for the User, 2nd Edition, PHI, 1999.

[2] T. Söderström, P. Stoica, System Identification, Prentice Hall, 1989.

[3] T. Katayama, Subspace Methods for System Identification, Springer, 2005.
