Report System Identification and Modelling
Universiteit
Leuven
I declare that I am the original and sole author of this submitted paper, except for feedback and support by the
educational team responsible for the guidance, and except for authorised collaboration agreed upon with the same
educational team. With regard to this paper, I also declare that:
This report is based on the questions and answers from the exercise sessions of the course System Identification
and Modelling. For clarity, I did not place the figures directly in the sections on the sessions; they can all be
found in the appendix of this report and are referenced in my answers.
I found rank r = 82 to be a sufficient approximation of the original image, since from the 82nd eigenvalue of
matrix A onward the absolute value drops below 1. My image is shown in figure 3 of the appendix.
• The relative amount of data you need to store for the rank r approximation versus storing the original full
matrix A (as a percentage)
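The storage comparison asked for above follows from counting entries: a rank r approximation of an m × n matrix keeps r columns of U, r singular values and r columns of V, i.e. r(m + n + 1) numbers instead of mn. A minimal MATLAB sketch of this count is given here. The matrix is a random stand-in and the sizes `m`, `n`, `r` are hypothetical, since the image from the exercise is not reproduced in this report.

```matlab
% Rank-r approximation via the SVD and the relative storage it needs.
% The matrix below is a random stand-in, NOT the image from the exercise.
rng(0);
m = 120; n = 100; r = 10;
A = randn(m, r) * randn(r, n) + 1e-3 * randn(m, n);  % nearly rank r

[U, S, V] = svd(A);
Ar = U(:, 1:r) * S(1:r, 1:r) * V(:, 1:r)';            % rank-r approximation

% Eckart-Young: the 2-norm error equals the first discarded singular value
err2 = norm(A - Ar, 2);
s = diag(S);

% Storage: r columns of U, r singular values, r columns of V
storage_pct = 100 * r * (m + n + 1) / (m * n);
fprintf('error = %.3e, sigma_{r+1} = %.3e, storage = %.1f %%\n', ...
        err2, s(r+1), storage_pct);
```

For these hypothetical sizes the rank 10 approximation needs 10 · 221 = 2210 numbers instead of 12000.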
The dimension of the right null space of A20 can be deduced from the fact that its main diagonal contains
only non-zero elements: the dimension is equal to zero, i.e. Null(A^T) = {0}.
• Find an orthonormal basis for the right null space of A20 . Do not give the numerical result but explain how
you found the basis
From the singular value decomposition (SVD) of A, an orthogonal basis for the right null space of A20 is given
by the vectors {v_i}, i = 1, ..., 20, i.e. the first 20 vectors (columns) of V. It is also an orthonormal basis,
since each vector v_i is normalized in the matrix V of the SVD.
• Check that it is indeed an orthonormal basis for the right null space of A20 . Explain how you checked it.
Orthogonality can be checked by verifying that the product of the transpose of the basis matrix with the basis
matrix itself gives a diagonal matrix. In addition, the norm of each vector in the basis must be verified to be 1,
so that for an orthonormal basis this product equals the identity matrix.
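The check described above can be sketched in MATLAB as follows. This is a generic illustration on a small rank-deficient matrix; the matrix `A` and the tolerance `tol` are assumptions and not the A20 of the exercise. The sketch relies on `svd` returning the singular values in decreasing order, so it takes the columns of V that belong to the numerically zero singular values, which may differ from the indexing used in the answer above.

```matlab
% Orthonormal basis of the right null space from the SVD, plus the checks.
% A is a hypothetical 4x6 matrix of rank 3, not the A20 from the exercise.
rng(1);
A = randn(4, 3) * randn(3, 6);

[~, S, V] = svd(A);
s   = diag(S);
tol = max(size(A)) * eps(max(s));     % same kind of threshold as rank()
r   = sum(s > tol);                   % numerical rank
N   = V(:, r+1:end);                  % columns for the zero singular values

% Check 1: N'*N is the identity (orthogonal columns with unit norm)
orth_err = norm(N' * N - eye(size(N, 2)));
% Check 2: every basis vector is mapped to zero by A
null_err = norm(A * N);
fprintf('dim = %d, orth_err = %.1e, null_err = %.1e\n', ...
        size(N, 2), orth_err, null_err);
```

The second check is worth adding to the first: orthonormality alone does not show that the vectors lie in the null space.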
For the system to be consistent, the augmented matrix [A1 b1] must have the same rank as A1.
In this case the ranks are not the same, which means that the system of linear equations has no exact solution
and should be solved using least squares.
• Give the vectors c1 and c2
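\[
c_1 = \begin{bmatrix} -2.2120 \\ 1.4511 \\ 0.5287 \\ 0.8228 \\ 1.4589 \end{bmatrix}, \qquad
c_2 = \begin{bmatrix} 4.3579 \\ 7.1163 \\ -2.9156 \\ -2.4169 \\ 11.5871 \end{bmatrix} \tag{1}
\]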
• Give the residuals r1 and r2 . Explain how you checked if they are perpendicular to the column space of A1
and A2 respectively. Give numerical values to corroborate your method.
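\[
r_1 = \begin{bmatrix} -1.3614 \\ -1.5662 \\ 1.5551 \\ -0.3902 \\ -0.8499 \end{bmatrix}, \qquad
r_2 = \begin{bmatrix} 0.1549 \\ 0.7436 \\ 1.6420 \\ -0.3481 \\ -0.1744 \end{bmatrix} \tag{2}
\]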
It can be verified that r1 and r2 are perpendicular to the column spaces of A1 and A2, respectively, by checking
that the dot products r1 · v1 and r2 · v2 are equal (or almost equal) to zero, where v1 and v2 run over a set of
vectors spanning the column space, which can be taken from the U matrix in the SVD of A1 and A2. Since the
columns of A1 and A2 also span the column space, it suffices to check that |a1^T r1| and |a2^T r2| are smaller than a
chosen tolerance for every column a1 of A1 and a2 of A2. All tests passed for tol = 10^-14, which confirms the
orthogonality of the residuals.
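The rank test, the least squares solution and the orthogonality test described above can be sketched in a few lines of MATLAB. The matrices here are random stand-ins with hypothetical sizes, not the A1 and b1 of the exercise, and the backslash operator is assumed as the least squares solver.

```matlab
% Consistency test, least-squares solution and residual orthogonality.
% A and b are random stand-ins for A1 and b1 (hypothetical sizes).
rng(2);
A = randn(8, 5);
b = randn(8, 1);

consistent = (rank([A b]) == rank(A));   % false for a generic b

c = A \ b;            % least-squares solution for a full-column-rank A
r = b - A * c;        % residual

% r is orthogonal to the column space iff A'*r = 0 (normal equations)
ortho = max(abs(A' * r));
fprintf('consistent = %d, max |a_i''*r| = %.2e\n', consistent, ortho);
```

Testing `A' * r` against a tolerance checks all columns at once and is equivalent to the column-by-column test.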
• Show your two plots of the 500 solutions with the uncertainty ellipses and their axes, calculated as described
above. Discuss the similarity and/or difference of the two plots.
My plots are shown in figures 6 and 7 of the appendix. The plots show that the ellipses defined
around the true solution are determined by the singular values of A1 and A2. We see that
σ1 in figure 7 indeed has a much higher value than in figure 6. As a result, fewer of the solutions perturbed by
noise are contained in the ellipse.
The least squares problem was formulated as Ax ≈ b, based on the distances from each grid position (i, j) to
the 6 lamps, which defines the matrix A of size 625 × 6. The vector b, an all-ones vector of length 625,
contains the desired illumination at every grid position. Note that the system is not consistent:
the augmented matrix [A, b] has a different rank than A, which calls for the use of least squares.
• Solve the least squares problem. Give your solution.
\[
\text{power (values)} = \begin{bmatrix} 36.2400 \\ 15.7464 \\ 30.7006 \\ 40.3131 \\ 23.0801 \\ 26.3267 \end{bmatrix} \tag{3}
\]
• Show the obtained optimal illumination in a picture, together with the horizontal position of the lamps, as in
Figure 2.
We see that the plot in figure 8 gives results very similar to the example shown in the session
questions and represents a very intuitive illumination of the room. The highest illumination intensity
is concentrated at the lamp locations and decays according to an inverse square law towards the blue areas.
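The construction of the illumination problem can be sketched as follows. This is an illustration under assumptions: the lamp coordinates and heights in `lamps` are made up, the grid is taken as 25 × 25 unit cells (which gives the 625 rows mentioned above), and each entry of A is the inverse of the squared distance between a grid point and a lamp.

```matlab
% Least-squares lamp powers for a uniform target illumination.
% Lamp positions and heights are made up for illustration only.
[gx, gy] = meshgrid(linspace(0.5, 24.5, 25));     % 25x25 grid -> 625 points
lamps = [ 4  5 4.0
         20  4 3.5
         12 12 5.0
          5 20 4.0
         19 19 3.5
         12 22 4.0];                              % [x y height] per lamp

A = zeros(numel(gx), size(lamps, 1));             % 625 x 6
for j = 1:size(lamps, 1)
    d2 = (gx(:) - lamps(j,1)).^2 + (gy(:) - lamps(j,2)).^2 + lamps(j,3)^2;
    A(:, j) = 1 ./ d2;                            % inverse-square law
end

b = ones(numel(gx), 1);                           % desired illumination
p = A \ b;                                        % lamp powers (LS solution)
illum = reshape(A * p, size(gx));                 % resulting illumination map
rms_err = sqrt(mean((A * p - b).^2));
```

The map `illum` can then be shown with `imagesc(illum)`; with other lamp positions the powers will of course differ from the values in Equation (3).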
2 EXERCISE SESSION 2: LEAST SQUARES IDENTIFICATION
AND MODEL ORDER SELECTION OF DYNAMICAL SYSTEMS
2.1 Identification of mass-damper-spring system
2.1.1 Description of the system
2.1.2 The linear model
2.1.3 Generating data
2.1.4 Identification of a linear model
• Are you able to retrieve the model parameters exactly? Why is it (not) possible?
Yes. In this exercise I used random samples with a given number N and variance σ^2, which allowed me to
find the coefficients exactly when using enough samples. This is possible because the coefficients are calculated
with least squares in my implemented function.
Yes, it is possible to retrieve the parameters. However, this requires a lower variance σe^2 on the latent
input and more samples N to obtain a good approximation.
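The least squares identification discussed above can be sketched as follows. The sign convention of the ARX model, the coefficient values and the variable names are assumptions made for illustration; they are not the values of the exercise. With `sigma_e = 0` the parameters are recovered up to rounding, and `sigma_e` can be raised to see the estimates scatter.

```matlab
% Least-squares identification of a second-order ARX model.
% Convention assumed here:
%   y[k] + a1*y[k-1] + a2*y[k-2] = b1*u[k-1] + b2*u[k-2] + e[k]
% The coefficient values are illustrative, not those of the exercise.
rng(3);
a = [1 -1.5 0.7];  b = [0 1.0 0.5];       % true polynomials A(q), B(q)
N = 500;  sigma_e = 0;                     % sigma_e = 0: no latent input
u = randn(N, 1);
e = sigma_e * randn(N, 1);
y = filter(b, a, u) + filter(1, a, e);     % simulate the ARX system

k   = (3:N)';                              % rows with a complete history
Phi = [-y(k-1), -y(k-2), u(k-1), u(k-2)];  % regressor matrix
theta = Phi \ y(k);                        % estimates of [a1 a2 b1 b2]'

disp([theta, [a(2:3) b(2:3)]']);           % estimate next to true value
```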
• Show the poles and zeros of all 2000 identified models in a plot with the unit circle and the poles and zeros
of the data-generating ARX system. An example is shown in the left part of Figure 3. Use roots to find the
roots of a polynomial.
The poles and zeros of the ARX system are shown on figure 19.
• Show a box plot of the identified polynomial coefficients. matlab command: boxplot. Do not forget to indicate
the true values. You can see an example in Figure 3 on the right hand side.
The box plots of the identified polynomial coefficients are shown in figures 20 and 21.
• Discuss the influence of N, the number of data points, on the variance of the estimator, and illustrate with
a figure.
We see in figure 23 that the variance of the identified coefficients b1 and b2 shrinks when more data points
are used.
• Discuss the influence of σe , the standard deviation of the latent input, on the variance of the estimator, and
illustrate with a figure.
We see in figure 22 that a larger σe causes the identified coefficients b1 and b2 to have a higher variance.
This is because the latent input e[k] now causes more noise on the output signal y[k].
• How did you choose the initial values of your recursive algorithm? Give the values and discuss how the
results change when you take different initial values.
The initial coefficient vector is set to zero, and the initial inverse of the autocorrelation matrix is set to the
identity matrix divided by the forgetting factor.
• What other parameters did you have to choose? What is their influence?
The forgetting factor λ, which controls the influence of past observations on the current parameter
estimates.
• Show how the identified polynomial coefficients ai evolve in time.
The plot from my Matlab files shows that the coefficients ai follow the true coefficient values until they reach
a point where significant changes occur, which can be considered the time instance at which the ARX system changes.
• At what time instance does the ARX system change? How did you find out?
In my plot, the values for the coefficients started to explode from sample 700.
• Include your matlab code for the RLS algorithm in your report.
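% Initial values as described above (lambda and na are set beforehand)
theta_hat = zeros(4, 1);
P = eye(4) / lambda;
for t = 3:length(u)
    % Construct input-output vector phi(t)
    phi_t = [y(t-1); y(t-2); u(t-1); u(t-2)]; % Adjusted for the ARX model
    % Prediction
    y_hat = phi_t' * theta_hat;
    % Prediction Error
    prediction_error = y(t) - y_hat;
    % Gain Calculation
    K = P * phi_t / (lambda + phi_t' * P * phi_t);
    % Parameter Update
    theta_hat = theta_hat + K * prediction_error;
    % Update of the inverse autocorrelation matrix (standard RLS step)
    P = (P - K * phi_t' * P) / lambda;
    % Store the first na coefficients for the plot over time
    coef_a(:,t-2) = theta_hat(1:na);
end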
• Did the properties of the spring or of the damper change? Explain how you found out. Give an estimate of
the new spring constant or new damping factor.
Based on the changes in the identified parameters, it can be determined whether the properties of the
spring or of the damper changed, and a new spring constant or damping factor can be estimated from the updated model.
• Does LS provide us with an unbiased estimator? Discuss.
Yes, the estimate is unbiased: when a large number of experiments is repeated with the same exact data b and A
but with different noise realizations d, the average of the estimates equals the exact solution x.
• Total Least Squares (TLS) identification : Create box plots, showing the identified polynomial coefficients a1,
a2, b1, b2. Indicate the values of the true system.
From the plots, it can be deduced that TLS provides more accurate results, since there is less variance
between the poles in the unit circle. However, the computational cost of TLS is higher than that of LS, since
the SVD must be calculated.
• Increase the variance of the misfit and describe its influence on the TLS estimate.
When changing σe^2 from 0.1 to 1, the calculated poles of the identified model (with output misfit) had
significantly higher values.
• Increasing the model order results in a monotonic decreasing cost function Vest . Explain this.
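As the model order increases, the complexity of the model increases, which allows it to fit the training
data more accurately. A model of a given order is a special case of a model of the next order (with the extra
coefficients set to zero), so the least squares cost on the estimation data cannot increase with the model order.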
• We do not observe the same monotonic decrease in the cost function Vval . Why?
When the model order increases, the complexity of the model also increases. The absence of a monotonic
decrease in Vval is due to overfitting: as the order increases, the model achieves good results on its
training set, but this does not necessarily result in better generalization to new, unseen data.
• Does the Akaike cost function VAIC provide an alternative for using a validation data set?
Yes, the Akaike Information Criterion (AIC) can be considered a good alternative to using a validation
dataset for model selection. The AIC is a model selection criterion that balances the goodness of fit of the
model to the data against the complexity of the model, penalizing more complex models.
• Under what conditions would you prefer to use the Akaike criterion to select a model order and when would
you use validation data?
A combination of both is often used. It is very common to start with AIC for a quick assessment and
then complement it with validation data to gain more confidence in the selected model order.
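The behaviour of the three cost functions can be reproduced with a small experiment. The sketch below is an illustration under assumptions: the data-generating system, the noise level and the AIC-type cost `V*(1 + 2*d/M)` (with d the number of parameters and M the number of equations) are chosen for the example, and the exact definition of VAIC used in the course may differ.

```matlab
% Model-order selection with estimation cost, validation cost and an AIC-type
% cost. System, noise level and the AIC form V*(1 + 2*d/M) are assumptions.
rng(4);
a = [1 -1.5 0.7];  b = [0 1.0 0.5];  N = 400;  nmax = 8;
ue = randn(N,1);  ye = filter(b, a, ue) + 0.5*randn(N,1);   % estimation set
uv = randn(N,1);  yv = filter(b, a, uv) + 0.5*randn(N,1);   % validation set

k = (nmax+1:N)';                 % same rows for every order (nested models)
M = numel(k);
Vest = zeros(nmax,1);  Vval = zeros(nmax,1);  Vaic = zeros(nmax,1);
for n = 1:nmax
    Pe = zeros(M, 2*n);  Pv = zeros(M, 2*n);
    for i = 1:n                  % columns: -y[k-i] and u[k-i]
        Pe(:, i) = -ye(k-i);   Pe(:, n+i) = ue(k-i);
        Pv(:, i) = -yv(k-i);   Pv(:, n+i) = uv(k-i);
    end
    theta   = Pe \ ye(k);                        % fit on estimation data
    Vest(n) = mean((ye(k) - Pe*theta).^2);
    Vval(n) = mean((yv(k) - Pv*theta).^2);       % same theta, new data
    Vaic(n) = Vest(n) * (1 + 2*(2*n)/M);         % penalize 2n parameters
end
[~, n_val] = min(Vval);  [~, n_aic] = min(Vaic);
```

Because every order uses the same rows, the regressors are nested and `Vest` cannot increase with n, whereas `Vval` and `Vaic` are free to rise again.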
• What happens to the optimal model order if the variance of the disturbing noise increases? Where do you see
this in your figures? Can you give an intuitive explanation for this effect? (Try to reason by considering the
‘optimal’ model order for an undisturbed data set: What is the optimal model order if the disturbing noise is
absent?)
When the variance of the disturbing noise increases, it typically leads to an increase in the optimal model
order. This effect can be observed in figures ..., specifically in the behavior of the cost functions (e.g., Vest,
Vval, or VAIC) as a function of the model order.
• What happens to the optimal model order when the original system has poles closer to the unit circle?
When the original system has poles closer to the unit circle, it often leads to a decrease in the optimal
model order. This is because lower order models are more likely to capture the faster dynamics and rapid
changes in the system’s behavior.
3 EXERCISE SESSION 3: BODE DIAGRAMS
3.1 Estimating the transfer function from a Bode plot
• Determine the sampling time Ts of your discrete-time model. Give the numerical value (in seconds) and
explain how you found it.
From the Nyquist frequency given in the exercise, the sampling time follows from the Nyquist theorem:
\[
T_{\text{sampling}} = \frac{1}{f_{\text{sampling}}}, \qquad \omega_{\text{nyquist}} = \pi f_{\text{sampling}}
\]
From these equations, the sampling time is T_sampling = π / ω_nyquist = 50.0 s.
• Find the order of the model and the location of the poles by trial-and-error. Briefly discuss your method.
In my experiments, I tried several orders for the all-poles model and compared the results with the
discretized Bode plot. It was very useful to first observe the peaks in the discretized Bode plot, which indicate
pole pairs near the unit circle. The order that gave a good approximation was 9. I also compared this to a model
with zeros and poles, which did not lead to better results and thus confirmed the choice of an all-poles model.
• Give the values of the poles and show them in a plot with the unit circle.
\[
\text{poles} = \begin{bmatrix}
0.0002 + 0.0562i \\ 0.0002 - 0.0562i \\ -0.0006 + 0.0561i \\ -0.0006 - 0.0561i \\
-0.0027 + 0.0272i \\ -0.0027 - 0.0272i \\ -0.0013 + 0.0200i \\ -0.0013 - 0.0200i
\end{bmatrix} \tag{4}
\]
The results for this exercise are shown in figure 10. The all-poles model reproduces with good accuracy
the evolution of the magnitude and phase of the discretized Bode plot up to the Nyquist frequency.
Unfortunately, I was not able to obtain the same range for the magnitude, which I had to tune by adapting
the gain factor K. Another drawback I experienced when implementing the all-poles model was the presence
of a zero in the magnitude, |H(0)| = 0, which I was not able to reproduce with the model.
As indicated in the session, we can avoid aliasing by ensuring that the frequency of the input signal is
lower than the Nyquist frequency (half the sampling frequency, so 50 Hz). To avoid leakage, I chose the
signal length such that it contains an integer number of periods (100, 200, 300, etc.). Finally, to
eliminate the influence of the system transient, I discarded the first measured periods of the output signal
and used only the tail of the signal, where the transient has died out. For stable systems the transient decays
exponentially. Estimating the transfer function at many frequencies also had an important impact, as it
leads to better estimates.
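The three precautions above (input below the Nyquist frequency, an integer number of periods, discarding the transient) can be illustrated for a single frequency. This is a sketch under assumptions: the test system `(b, a)`, the sampling frequency `fs` and the period counts are invented for the example, and the estimate is taken as the ratio of the DFTs of output and input at the excited bin.

```matlab
% Estimate |H| at one frequency from a sine experiment.
% The system (b, a), fs and the period counts are illustrative assumptions.
b = [0 0.2 0.1];  a = [1 -1.2 0.5];       % stable discrete-time test system
fs = 100;                                  % sampling frequency [Hz]
P  = 20;                                   % samples per period -> f0 = 5 Hz
f0 = fs / P;                               % below Nyquist (fs/2): no aliasing
nTransient = 30;  nKeep = 50;              % periods discarded / analysed

n = (0:(nTransient + nKeep)*P - 1)';
u = sin(2*pi*f0/fs * n);
y = filter(b, a, u);

idx = nTransient*P + 1 : numel(n);         % tail: integer number of periods
U = fft(u(idx));  Y = fft(y(idx));
bin  = nKeep + 1;                          % DFT bin that holds f0 (no leakage)
Hest = Y(bin) / U(bin);

z = exp(1j*2*pi*f0/fs);                    % true response for comparison
Htrue = polyval(b, z) / polyval(a, z);
fprintf('|H| estimated %.6f, true %.6f\n', abs(Hest), abs(Htrue));
```

Repeating this for a grid of frequencies gives the sine-based magnitude estimate point by point.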
• Show the magnitude plot of the estimated transfer function and a small discussion on how you obtained the
results. You do not need to estimate the phase of the transfer function.
My plot is shown in figure 11 of the appendix. Note that it was necessary to eliminate the
influence of the system transient by looking at the growing terms in the tails.
The result is quite similar to the figure provided during the sessions, with the exception that I
was not able to make the estimated transfer functions start at the same values. When comparing the
estimates in the plot, spa is smoother and more continuous but is not able to follow the sine-based estimate
of the transfer function at the discontinuities. On the other hand, etfe is noisier in
the same frequency interval but matches the sine-based estimate better at the peaks. The variance of the
white noise input did not have a significant impact on the estimates. Increasing the
number of samples, however, reduced the previously mentioned drawbacks of spa and etfe.
4 EXERCISE SESSION 4: STATE SPACE MODELS
4.1 Realization theory
4.1.1 Realization theory for a single-output autonomous model
• Identify an autonomous state space model as in Equation (1) that generates your sequence. Do this using
realization theory. Give your identified matrices A and C and the initial state x0 . Make sure your resulting
model is minimal.
The system matrices A ∈ R^{30×30}, C ∈ R^{1×30}, and x[0] ∈ R^{30×1} can be found in my Matlab code for
session 4. Furthermore, one can verify that the realization is minimal by checking that the rank of the Hankel
matrix is equal to the order of the system. In this case, rank(H) = 30 corresponds to the number of states
in the system, which confirms a minimal realization.
Yes, the poles of the model correspond to the eigenvalues of the system matrix A.
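The realization steps (Hankel matrix, SVD, shift invariance) can be sketched as follows. This is an illustration and not the code of session 4: the periodic test sequence, the Hankel dimensions and the square-root balancing of the factors are assumptions, so the resulting order depends on this example data.

```matlab
% Realization of an autonomous model x[k+1] = A x[k], y[k] = C x[k]
% from an output sequence. The period-7 sequence is illustrative data.
y = repmat([0 8 0 8 7 5 9], 1, 5);        % 35 samples, first 30 are used
y = y(1:30)';

H = hankel(y(1:15), y(15:30));            % 15 x 16, H(i,j) = y(i+j-1)
n = rank(H);                              % model order = rank of H

[U, S, V] = svd(H);
Ob = U(:, 1:n) * sqrt(S(1:n, 1:n));       % extended observability matrix
X  = sqrt(S(1:n, 1:n)) * V(:, 1:n)';      % state sequence x[0], x[1], ...

C  = Ob(1, :);                            % first row of the observability matrix
A  = pinv(Ob(1:end-1, :)) * Ob(2:end, :); % shift invariance
x0 = X(:, 1);

ysim = zeros(30, 1);  x = x0;             % re-simulate the sequence
for k = 1:30
    ysim(k) = C * x;
    x = A * x;
end
fprintf('order = %d, max error = %.2e\n', n, max(abs(ysim - y)));
```

Truncating the SVD to the numerical rank of H is what makes the realization minimal; `eig(A)` then gives the poles of the model.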
• Derive in an analytical way, i.e. by reasoning, a state space model that produces your y sequence and whose
model matrices A and C and initial state X [0] have integer entries only.
– Give the matrices A and C and vector X [0] and explain how you determined them.
– Are the two state space models [A, C, x[0]] and [A, C, X [0]] similar, i.e., related by a similarity transforma-
tion? If they are, give the similarity transformation matrix and explain how you obtained it.
One can notice that the system matrices [A, C, X [0]] can be obtained by a cyclic shift of the state vec-
tor in each iteration. This gives the following matrices:
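\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad
x[0] = \begin{bmatrix} 0 \\ 8 \\ 0 \\ 8 \\ 7 \\ 5 \\ 9 \end{bmatrix} \tag{5}
\]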
The 30 output samples were obtained using the relation y[k] = C A^k x[0]. The two models are not related by a
similarity transformation. This can be seen from the fact that the transformation matrix T that maps x → T x is
singular. Also, the output generated by the previous model is not exact and contains small errors,
which are for example due to the pseudo-inverse A^+ used in the calculation of the system matrix A. This may explain
why no similarity transformation between the models was found.
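The integer model can be checked with a short simulation. The matrices below follow Equation (5); the output relation y[k] = C A^k x[0] of an autonomous model and the use of `circshift` to build the shift matrix are assumptions of this sketch.

```matlab
% Integer-valued model: a cyclic shift of the state reproduces the sequence.
% Matrices follow Equation (5); the output equation y[k] = C*A^k*x0 is assumed.
A  = circshift(eye(7), 1, 2);       % ones on the superdiagonal and at (7,1)
C  = [1 0 0 0 0 0 0];
x0 = [0 8 0 8 7 5 9]';

y = zeros(30, 1);  x = x0;
for k = 1:30
    y(k) = C * x;                   % sample y[k-1] in zero-based indexing
    x = A * x;                      % shift the state by one position
end
disp(y(1:14)');                     % two periods of the sequence
```

Since A is a permutation matrix, A^7 is the identity and the output repeats with period 7.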
• Check if the first 30 output samples of your two models are equal to the sequence you started with. Show this
in a figure.
4.1.2 Realization of an input-output model
1. Give your model matrices A, B, C, D
The system matrices A ∈ R^{6×6}, B ∈ R^{6×2}, C ∈ R^{3×6} and D ∈ R^{3×2} can also be found in my
Matlab code for session 4.
2. Give the eigenvalues of the matrix A.
\[
\begin{bmatrix}
-0.2537 + 0.6621i \\ -0.2537 - 0.6621i \\ -0.4578 + 0.0000i \\
0.2567 + 0.4057i \\ 0.2567 - 0.4057i \\ 0.6863 + 0.0000i
\end{bmatrix} \tag{6}
\]
3. Plot the first 40 samples of the impulse response of your model together with the impulse response of the
black box system, i.e., the signals y1 and y2.
From the following equations, it is possible to calculate the Frobenius norm of the difference between the Markov
parameters of an LTI system:
\[
H_0 = D, \qquad H_k = C A^{k-1} B
\]
for k = 1, 2, 3, 4. I noticed during the iterations that the Frobenius norm was high at first and converged
almost to zero for the higher terms (k > 0). I assume that the identified model had trouble reproducing
the transients observed in the impulse response of the black box system in
figure 15, which can explain the higher errors in the first terms.
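The Markov parameters and the Frobenius norm of a mismatch can be computed as in the sketch below. The state space model here is a random stand-in with the same dimensions as the identified model (6 states, 2 inputs, 3 outputs), and the "measured" impulse response is simply a direct simulation of that same model, so the norms come out at rounding level; with black box data they would not.

```matlab
% Markov parameters H0 = D, Hk = C*A^(k-1)*B and the Frobenius norm of the
% mismatch with a simulated impulse response. The system is a random stand-in.
rng(5);
n = 6;  m = 2;  p = 3;                         % states, inputs, outputs
A = randn(n);  A = 0.8 * A / max(abs(eig(A))); % scale to spectral radius 0.8
B = randn(n, m);  C = randn(p, n);  D = randn(p, m);

K = 40;
Hmarkov = zeros(p, m, K+1);
Hmarkov(:, :, 1) = D;
Ak = eye(n);                                   % holds A^(k-1)
for k = 1:K
    Hmarkov(:, :, k+1) = C * Ak * B;
    Ak = Ak * A;
end

% Impulse response by direct simulation, one input channel at a time
Hsim = zeros(p, m, K+1);
for j = 1:m
    x = zeros(n, 1);
    for k = 0:K
        uk = zeros(m, 1);  uk(j) = (k == 0);   % unit impulse on input j
        Hsim(:, j, k+1) = C * x + D * uk;
        x = A * x + B * uk;
    end
end
err = zeros(K+1, 1);
for k = 0:K
    err(k+1) = norm(Hmarkov(:, :, k+1) - Hsim(:, :, k+1), 'fro');
end
```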
The Bode plots of the original model and the reduced model of order 2 are shown in figure 16 of the
appendix. Comparing for example the magnitude plots for input 2 and output 2, we can see
that the first and the second identified poles were eliminated, leading to a constant Bode plot in the reduced
model.
• Give the frequency of the resonance peak that is modeled by the model of order 2.
The resonance peak frequency of the second-order model can be found at 22.5705 rad/s.
• Find now a reduced order model that performs better than the model of order 2. It should capture the dy-
namics of the high order model well. Use balanced model reduction as reduction technique. Give your chosen
model order and explain why you have chosen that order.
Following the steps of a balanced model reduction, it can first be observed that the system is marginally
stable, since some poles lie on the unit circle. Note that the poles must be inside the unit circle for a stable
system. Secondly, if σr > σr+1 for reduction order r, then the reduced state space model is controllable and
observable, which is satisfied when r ≥ 20. The third condition compares the absolute error between the original
and the reduced models to the distinct Hankel singular values λi which are not included in the reduced
model.
I applied the conditions for a balanced model reduction and visualized them with the model reduction tools of
the Matlab System Identification Toolbox. My conclusion was that from order 15 onward the last condition was no
longer satisfied, which is visualized in figure 17. I could also observe a significant decrease of the absolute
error, by a factor of 10, between λ7 and λ8. Therefore, I chose a model reduction of order 8.
• Compare the original high-order model sys and your reduced order model in a Bode plot.
The Bode plots of the original model and the reduced model of order 8 are shown in figure 18 of the
appendix.
• Why do we call this method for model order reduction ‘balanced’ ?
A state space model [A, B, C, D] is balanced if the controllability and observability Gramians P∞ and
Q∞ are equal and diagonal, where the Gramians are obtained as the solutions of the Lyapunov
equations.
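For reference, the Lyapunov equations and the balancing condition can be written out as follows. The discrete-time form is assumed here, because the stability discussion of this exercise refers to the unit circle; the symbols Σ and σ_i for the Hankel singular values are notation introduced for this block.

```latex
\begin{aligned}
A P_\infty A^{T} - P_\infty + B B^{T} &= 0, \\
A^{T} Q_\infty A - Q_\infty + C^{T} C &= 0, \\
P_\infty = Q_\infty = \Sigma &= \operatorname{diag}(\sigma_1, \dots, \sigma_n),
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0 .
\end{aligned}
```

In such a realization every state is as controllable as it is observable, which is why states with small σ_i can be truncated.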
5 EXERCISE SESSION 5: A PRACTICAL SYSTEM IDENTIFICATION PROBLEM
5.1 Input design
To test several types of designs, I tried all suggested inputs (step, white noise, colored noise, random binary, impulse,
sine waves) and also simulated the output of the black box model with a zero input to measure the disturbances.
The input and output signals can be seen in figures 28 and 29.
5.2 Preprocessing
During the preprocessing, I first used interpolation to replace the missing samples (zeros) in the signals, which I then
detrended. The most effective preprocessing step was a low-pass filter, which eliminated the
high frequencies in the signal and smoothed it. Finding a good passband frequency took a lot
of trial and error. I also rescaled the signals before filtering, which gave them a zero mean. Lastly, I decided to
take the second block of 100 samples of the signals, which showed a lot of patterns, and to analyze this subset further.
The reason I took the second 100 samples is that the transient effects were quite high in the first time steps.
The results of the preprocessing can be found in figure 30.
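The preprocessing chain can be sketched as follows. Everything specific in this block is an assumption made for illustration: the synthetic signal, the convention that missing samples are stored as zeros, linear interpolation, and a moving-average filter of length `L` as a simple low-pass filter. The filter actually used in the toolbox is not reproduced here.

```matlab
% Preprocessing sketch: fill missing samples, detrend, low-pass filter.
% The signal, the zero-as-missing convention and the filter length are assumed.
rng(6);
N = 300;  t = (1:N)';
y = 2 + 0.01*t + sin(2*pi*t/50) + 0.2*randn(N, 1);   % synthetic measurement
y([40 41 120 200]) = 0;                               % samples lost as zeros

miss = (y == 0);
y(miss) = interp1(t(~miss), y(~miss), t(miss), 'linear');  % interpolate gaps

y = detrend(y);                       % remove mean and linear trend

L  = 5;                               % moving-average low-pass filter
yf = filter(ones(L, 1)/L, 1, y);

seg = yf(101:200);                    % second block of 100 samples
```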
5.3 Identification
5.4 Validation
The identification and validation were done in the System Identification Toolbox, in which I was able to test
the different input signals that I had already preprocessed. During the training of the model, I tested each model on the
validation data of the respective input signal. The model that repeatedly had decent results on the validation
data was the ARX model with na = 30 and nb = 30, which showed a fit of 96% with random binary inputs, 98%
with the colored noise input, 99% with noise, 82% with the step response, etc. An example is shown in figure 31
for white noise as validation data. I have also provided the pole-zero plot in figure 32 to check the stability of
the system. At this point, I had determined my model structure and order but still had to estimate
the delay nk, which was easy to do with cra from the demo for one input and output. The Matlab toolbox
offers a much easier way to plot the different impulse responses, shown in figure 33, which allowed me to estimate an
average delay of 17 lags (nk = 17).
5.5 Conclusion
Lastly, I checked my final model on a sine input, which confirms a 99.38% fit, as can be seen in figure 34.
During the training I used only 100 samples, first because the model would otherwise be too
complex to analyze, and secondly because many of the output sequences in figure 30 are periodic,
which makes my transfer function easy to shift in time (or G(z) = G(z − T) for periodicity T).
Appendix: List of Figures
Figure 2: Image reconstruction with different ranks
Figure 4: stored data vs full matrix with rank r approximation
Figure 6: uncertainty plot A1 x ≈ b1
Figure 8: optimal illumination
Figure 10: discretized and all-poles model bode plot
Figure 12: comparison sine etfe and spa
Figure 14: impulse response
Figure 16: bode plots order 2 reduction
Figure 18: bode plots order 8 reduction
Figure 20: identified coefficients a1 and a2
Figure 22: influence σ on identified coefficients b1 and b2
Figure 24: boxplot output misfit LS
Figure 26: poles output misfit LS
Figure 28: input signals to the black box system
Figure 29: output signals from the black box system
Figure 31: fit ARX-model on the validation set for white noise input
Figure 33: step, white, colored and random binary impulse responses