Order Reduction For Large Scale Finite Element Models: A Systems Perspective


William Gressick, John T. Wen, Jacob Fish
ABSTRACT
Large scale finite element models are routinely used in design and optimization for complex engineering systems. However, the high model order prevents efficient exploration
of the design space. Many model reduction methods have been proposed in the literature
on approximating the high dimensional model with a lower order model. These methods
typically replace a fine scale model with a coarser scale model in schemes such as coarse
graining, macro-modeling, domain decomposition and homogenization. This paper takes a
systems perspective by stating the model reduction objective in terms of the approximation
of the mapping between specified input and output variables. Methods from linear systems theory, including balanced truncation and optimal Hankel norm approximation, are
reviewed and compared with the standard modal truncation. For high order systems, computational load, numerical stability, and memory storage become key considerations. We
discuss several computationally more attractive iterative schemes that generate the approximate gramian matrices needed in the model reduction procedure. A numerical example is
also included to illustrate the model reduction algorithms discussed in the paper. We envision these systems-oriented model reduction methods complementing the existing
methods to produce low order models suitable for design, optimization, and control.
KEY WORDS
Model Reduction, Large Scale Systems, Balanced Truncation, Finite Element Methods,
Approximate Gramian
1. INTRODUCTION
For complex engineering systems such as large mechanical structures, fluid dynamic systems, integrated circuits, and advanced materials, the underlying dynamical models are typically obtained from the finite element method or discretization of partial differential equations. To obtain good approximations of the underlying physical processes, these models are necessarily of very high order. To use these models effectively in design optimization and iteration, the high order systems need to be reduced in size while still retaining relevant characteristics. Many model reduction/simplification schemes have been proposed in the past, such as Guyan and the related improved reduced system (IRS) methods [1, 2], hierarchical modeling [3, 4], macro-modeling [5, 6], domain decomposition [7], and others. This paper approaches model reduction from a systems perspective. In contrast to other model reduction techniques for finite element models, the systems approach seeks to retain only the dominant dynamics that are strongly coupled to the specified input and output. This is similar to the goal-oriented adaptive mesh generation method, where the mesh geometry (and hence the approximate model) is governed by its influence on the properties of interest [8]. There has been a recent surge of interest in model reduction for large scale systems from the systems community [9-13]. Well conditioned numerical algorithms have also been developed and become available [14]. The goal of this paper is to present a tutorial of this class of approaches and the underlying algorithms.

The basic problem is as follows: Given an $n$th order linear time invariant (LTI) system with state space parameters $(A, B, C, D)$, find an $r$th order reduced order model $(A_r, B_r, C_r, D_r)$, with $r \ll n$. The goal of model reduction is to make the difference between the full order model and the reduced order model small under some appropriate norm. For LTI systems, model reduction methods can be broadly classified as singular value decomposition (SVD) based approaches and the classical moment matching method [9]. The SVD based methods can be further separated into model based and data driven. The model based methods assume the availability of a high order model. Data driven methods produce a reduced order model based on the input/output data. This is also known as the model identification problem [15, 16]. In this paper, we will focus on model based SVD methods, since a high order model (e.g., obtained from the finite element method) is assumed available. We will consider four different norms for comparison: the $H_\infty$ norm, which is the worst case input/output $L_2$ gain, the $H_2$ norm, which is the worst case gain from the peak input spectral density to output power, and the time domain $L_\infty$ (largest amplitude) and $L_2$ (energy) norms under a specific input of interest. Our discussion will focus on the balanced truncation method, which has an a priori $H_\infty$ error bound and is stability preserving. However, the method in its original form has computational complexity $O(n^3)$ and faces numerical difficulties for stiff high order systems. We then present a number of iterative methods that produce approximate balanced truncated models. These methods possess better computational and numerical properties, especially when the system matrix is sparse. A numerical example involving a piezo-composite beam is included to illustrate the various methods discussed in the paper.

This paper is organized as follows. Section 2 reviews the basic description of linear systems. Section 3 presents various model reduction methods, including the commonly used modal truncation, balanced truncation, and optimal Hankel norm reduction. Section 4 discusses various approximation techniques for the controllability and observability gramians needed in the balanced truncation. The balanced truncation type of model reduction using the approximate gramians is shown in Section 5. A piezo-composite beam example is included in Section 6 to illustrate the performance of the methods presented.
2. PRELIMINARIES
2.1. System Description
The finite element method typically generates a high order continuous-time LTI system in the state space form:
$$\dot{x}(t) = A x(t) + B u(t), \tag{2.1}$$
where $x(t) \in \mathbb{R}^n$ is the state vector and $u(t) \in \mathbb{R}^{n_i}$ is the input vector. For mechanical structures, the model is usually expressed in the generalized second order form
$$M \ddot{q}(t) + F \dot{q}(t) + K q(t) = H u(t), \tag{2.2}$$
where $q$, $\dot{q}$, $\ddot{q}$ are the generalized coordinate, generalized velocity, and generalized acceleration, respectively, $M$, $F$, $K$ are the mass, damping, and stiffness matrices, respectively, and $H$ is the influence matrix corresponding to the input $u(t)$. Note that $F$ is difficult to obtain accurately, and is frequently just set
to zero. The second order model can be transformed to the state space form by, for example, defining the state vector as
$$x(t) = \begin{bmatrix} q(t) \\ \dot{q}(t) \end{bmatrix}. \tag{2.3}$$
Then
$$\dot{x}(t) = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}F \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ M^{-1}H \end{bmatrix} u(t). \tag{2.4}$$
The state definition is not unique; for example, the Legendre transformation is also a popular choice:
$$x(t) = \begin{bmatrix} q(t) \\ M\dot{q}(t) \end{bmatrix}. \tag{2.5}$$
In this case,
$$\dot{x}(t) = \begin{bmatrix} 0 & M^{-1} \\ -K & -FM^{-1} \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ H \end{bmatrix} u(t). \tag{2.6}$$
In the state space model, we will also consider an output of interest
$$y(t) = C x(t) + D u(t), \tag{2.7}$$
where $y(t) \in \mathbb{R}^{n_o}$. The input/output pair $(u, y)$ may correspond to a particular property of interest, or to physical actuators and sensors.

Denote the system with input $u$ and output $y$ by $G$ (Fig. 1). For a given choice of the state, the quadruplet $(A, B, C, D)$ is called the state space representation for $G$. A state space model can be transformed to other state coordinates through a coordinate transformation:
$$z = T^{-1} x \tag{2.8}$$
where $T$ is any invertible matrix. The resulting state space representation is $(T^{-1}AT, T^{-1}B, CT, D)$.

Figure 1: Input/output system under consideration

An LTI system may also be characterized by its impulse response, $g(t)$, which in terms of state space parameters is given by $Ce^{At}B\,1(t) + D\,\delta(t)$, where $1(t)$ is the unit step function and $\delta(t)$ the unit impulse. In this case, the output is related to the input through a convolution:
$$y(t) = \int_0^t g(t-\tau)\,u(\tau)\,d\tau + y_{zi}(t) \tag{2.9}$$
where $y_{zi}$ is the zero input (unforced) response due to the initial state.

Another characterization of an LTI system is to transform (2.9) to the Laplace domain:
$$Y(s) = G(s)U(s) + Y_{zi}(s), \tag{2.10}$$
where
$$G(s) = C(sI - A)^{-1}B + D. \tag{2.11}$$
Generalized second order systems (2.2) can also be represented as
$$\begin{bmatrix} I & 0 \\ 0 & M \end{bmatrix} \dot{x} = \begin{bmatrix} 0 & I \\ -K & -F \end{bmatrix} x + \begin{bmatrix} 0 \\ H \end{bmatrix} u. \tag{2.12}$$
A system in this form is referred to as a descriptor system, where $\dot{x}$ is multiplied by a matrix other than the identity matrix. The key attractions of the descriptor form are the avoidance of mass matrix inversion and the preservation of sparsity. In the usual state space forms, (2.4) and (2.6), the mass matrix inversion can destroy sparsity if the mass matrix is not diagonal. For this reason, model reduction of descriptor systems in their native form is an active area of research. The subject and its associated numerical methods are presented in [17, 18] and will not be pursued here.

An LTI system, $G$, with input $u(t) \in \mathbb{R}^{n_i}$ and output $y(t) \in \mathbb{R}^{n_o}$ may be regarded as a linear operator mapping $L_p^{n_i}[t_1, t_2]$ to $L_p^{n_o}[t_1, t_2]$, where $1 \le p \le \infty$, and $(t_1, t_2)$ is the time range of interest. The worst case input/output $L_p$ gain is a norm of $G$ (induced by the $L_p$-norms):
$$\|G\|_{i,p} = \sup_{u \in L_p^{n_i},\, u \neq 0} \frac{\|y\|_{L_p^{n_o}}}{\|u\|_{L_p^{n_i}}}. \tag{2.13}$$
Conceptually, $G$ can be thought of as mapping a unit $L_p^{n_i}$-ball to an ellipsoid in $L_p^{n_o}$ as shown in Fig. 2. Then $\|G\|_{i,p}$ is the length of the major axis of the ellipsoid. The most common induced norms are the $L_2$ and $L_\infty$ norms. In the case that $(t_1, t_2) = (0, \infty)$, the $L_2$ induced norm is the same as the norm of the Hardy space $H_\infty$ and can be calculated by using the transfer function of $G$ (we will soon encounter this again from the frequency domain perspective). The $L_\infty$ induced norm can be shown to be the $L_1$ norm of the impulse response of $G$:
$$\|G\|_{i,\infty} = \int_0^\infty \|g(t)\|\,dt = \|g\|_{L_1}. \tag{2.14}$$
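The state space construction (2.4) and the transfer function (2.11) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names and the single degree-of-freedom example values are assumptions made here, and dense matrices are used, so the sparsity concerns raised for the descriptor form are ignored.

```python
import numpy as np

def second_order_to_state_space(M, F, K, H):
    """Build (A, B) of (2.4) from M q'' + F q' + K q = H u, with x = [q; q']."""
    n = M.shape[0]
    # Solve with M rather than forming M^{-1} explicitly.
    Minv_K = np.linalg.solve(M, K)
    Minv_F = np.linalg.solve(M, F)
    Minv_H = np.linalg.solve(M, H)
    A = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-Minv_K, -Minv_F]])
    B = np.vstack([np.zeros((n, H.shape[1])), Minv_H])
    return A, B

def transfer_function(A, B, C, D, s):
    """Evaluate G(s) = C (sI - A)^{-1} B + D of (2.11) at a single point s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

# Hypothetical single degree-of-freedom example: m = 1, f = 0.2, k = 4.
M = np.array([[1.0]])
F = np.array([[0.2]])
K = np.array([[4.0]])
H = np.array([[1.0]])
A, B = second_order_to_state_space(M, F, K, H)
C = np.array([[1.0, 0.0]])   # output of interest: the displacement q
D = np.zeros((1, 1))

# Static (s = 0) gain; for this example it equals 1/k.
dc_gain = transfer_function(A, B, C, D, 0.0).real
```

For a large finite element model one would keep $M$, $F$, $K$ sparse and avoid forming $A$ densely, which is the motivation for the descriptor form (2.12).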
Figure 2: Input/Output Ellipsoid

An operator norm may also be defined for $G$ directly without referring to the input/output norms. A commonly used norm is
$$\|G\|_{H_2} = \left( \int_{t_1}^{t_2} \mathrm{tr}\left[ g(t)\,g^T(t) \right] dt \right)^{1/2} \tag{2.15}$$
where tr denotes the trace of a matrix, and $g(t)$ is the impulse response of $G$. This norm can be thought of as the generalization of the matrix Frobenius norm to LTI systems. In the case that $(t_1, t_2) = (0, \infty)$, the norm (2.15) is the same as the norm of the Hardy space $H_2$ [19].

If the time range is set to $[0, \infty)$, then an LTI system $G : L_p^{n_i}[0, \infty) \to L_p^{n_o}[0, \infty)$ is called an $L_p$-stable system (these notions may be extended to nonlinear dynamical systems, see [20]). The $L_p$ stability is equivalent to the stability of the state space system, called internal stability (i.e., all eigenvalues of $A$ in the open left half plane), under the stabilizability and detectability conditions (basically ensuring that there are no hidden internal dynamics) [20]. For generalized second order systems, stability is ensured when the damping and stiffness matrices are positive definite (with a positive definite mass matrix). The induced $L_p$-norm is a natural performance metric, for example, the $L_p$-norm of $G - \hat{G}$, where $\hat{G}$ is the reduced order model.

A stable LTI system may also be regarded as a linear operator under the Fourier transform, where the input is $\hat{u}(j\omega) \in L_p^{n_i}(-j\infty, j\infty)$, the Fourier transform of $u(t)$, and the output is $\hat{y}(j\omega) \in L_p^{n_o}(-j\infty, j\infty)$, the Fourier transform of $y(t)$ (assuming that the Fourier transforms of the input and output signals exist). In this case, the system is represented by the Fourier transform of the impulse response, $G(j\omega)$, or the transfer function evaluated along the imaginary axis. If we regard $G(j\omega)$ as an $L_p$-mapping, we can again use the induced $L_p$-norm as the performance metric. The most common choice is the induced $L_2$-norm, which is also called the $H_\infty$-norm (the norm corresponding to the Hardy space $H_\infty$). The $H_\infty$ norm is a direct generalization of the matrix norm induced by the Euclidean norm, which is the maximum singular value of the matrix. The $H_\infty$ norm is related to the transfer function through
$$\|G\|_{H_\infty} = \sup_{\omega} \|G(j\omega)\| \tag{2.16}$$
where $\|G(j\omega)\|$ denotes the maximum singular value of $G(j\omega)$.

We can also regard $G(j\omega)$ directly as an element of $L_2^{n_o \times n_i}(-j\infty, j\infty)$. The corresponding norm (not an $L_p$ induced norm) is called the $H_2$-norm (the norm for the Hardy space $H_2$), which may be considered as a generalization of the Frobenius norm. The $H_2$ norm is related to the transfer function as
$$\|G\|_{H_2} = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} \mathrm{tr}\left[ G(j\omega)\,G^T(-j\omega) \right] d\omega \right)^{1/2}. \tag{2.17}$$
By using the Parseval Theorem, the frequency domain expression for the $H_2$ norm can be shown to be the same as the time domain expression in (2.15).

The $H_2$ norm may be considered as an induced norm for power signals [19]. Let $\mathcal{P}$ be the space of finite power signals with the power of a signal $u$ defined as
$$\|u\|_{\mathcal{P}} = \left( \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \|u(t)\|^2\,dt \right)^{1/2}. \tag{2.18}$$
Define the autocorrelation matrix of $u$ as
$$R_{uu}(\tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} u(t+\tau)\,u^T(t)\,dt. \tag{2.19}$$
The spectral density $S_{uu}(j\omega)$ of $u$ is the Fourier transform of $R_{uu}$. Signals with bounded spectral density are denoted by
$$\mathcal{S} = \left\{ u(t) \in \mathbb{R}^{n_i} : \|u\|_{\mathcal{S}} := \|S_{uu}(j\omega)\|_{L_\infty} < \infty \right\}. \tag{2.20}$$
Then the $H_2$ norm is the induced norm from $\mathcal{S}$ to $\mathcal{P}$.

We will use the above norms to evaluate our reduced-order models, by applying them to the difference between the full order LTI model and the reduced order model, $G - \hat{G}$. However, a small $\|G - \hat{G}\|$ has different meanings depending on the norm used. For performance comparison, we will use four different metrics, summarized in Table 1: the $H_\infty$ norm, the $H_2$ norm, the output $L_\infty$ norm, and the output $L_2$ norm (the latter two for a specific input function).
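As a rough illustration of (2.16) and (2.17), both system norms can be estimated by sampling the frequency response on a grid. The sketch below is an assumption-laden approximation rather than a production algorithm: the grid is finite, so the $H_\infty$ value is only a lower estimate of the supremum and the $H_2$ integral is truncated; the function names and the first order test system are hypothetical.

```python
import numpy as np

def freq_response(A, B, C, D, w):
    """G(jw) = C (jw I - A)^{-1} B + D for one real frequency w."""
    n = A.shape[0]
    return C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D

def hinf_norm_grid(A, B, C, D, freqs):
    """Grid estimate of (2.16): largest singular value over the sampled frequencies."""
    return max(np.linalg.svd(freq_response(A, B, C, D, w), compute_uv=False)[0]
               for w in freqs)

def h2_norm_grid(A, B, C, D, freqs):
    """Trapezoidal approximation of (2.17) on a finite, increasing frequency grid.

    For a real system tr[G(jw) G^T(-jw)] equals the squared Frobenius norm of G(jw).
    """
    vals = np.array([np.sum(np.abs(freq_response(A, B, C, D, w)) ** 2)
                     for w in freqs])
    integral = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(freqs))
    return np.sqrt(integral / (2.0 * np.pi))

# Hypothetical test system G(s) = 1 / (s + 1); D must be zero for a finite H2 norm.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.array([[0.0]])
freqs = np.linspace(-200.0, 200.0, 4001)

hinf = hinf_norm_grid(A, B, C, D, freqs)   # exact value is 1 (attained at w = 0)
h2 = h2_norm_grid(A, B, C, D, freqs)       # exact value is sqrt(1/2)
```

The same two functions can be applied to an error system $G - \hat{G}$ by differencing the two frequency responses before taking norms.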
Small $\|G - \hat{G}\|_{H_\infty}$ may be interpreted in a worst-case sense: the $L_2$ norm of the output error is small for all inputs with unit $L_2$ norm. However, there could be large amplitude errors for short durations. Also, for a given $u$, this norm could be very conservative, meaning that a lower order $\hat{G}$ may be obtained to achieve the same output error norm. Small $\|G - \hat{G}\|_{H_2}$ means that the $L_2$ norm of the difference between the impulse responses (or, equivalently, between the transfer functions) is small. This does not directly translate to a time domain error bound, however. In terms of the interpretation as the induced norm from $\mathcal{S}$ to $\mathcal{P}$, a small $H_2$ norm means that under unit Gaussian white noise input (unit spectral density), the power of the output is small. Small output $L_2$ and $L_\infty$ norms mean that the time domain output response will be small in the $L_2$ or $L_\infty$ sense for the given input.

3. MODEL REDUCTION METHODS

3.1. Modal Truncation
Modal truncation is perhaps one of the simplest and most well-known model reduction methods. The basic idea is simple: Decompose the transfer function into a sum of modes, which are transfer functions with a single real pole or a pair of complex poles. A reduced order model is obtained by retaining only the dominant modes (those contributing the most to the transfer function). In many cases, it is the high frequency modes that are discarded, due to damping and bandwidth limitation of actuators and sensors. In terms of the state space representation, modal truncation first transforms the system into the modal form where the system matrix is block diagonal with $A_1$ containing the dominant modes:
$$T^{-1}AT = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix}, \quad T^{-1}B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad CT = \begin{bmatrix} C_1 & C_2 \end{bmatrix}. \tag{3.1}$$
The state space representation of the modal truncated model is then $(A_1, B_1, C_1, D)$. The usual approach is to represent $A$ in the Jordan form, and then retain the low frequency eigenvalues only.

The modal truncation method is simple in principle, but is limited in practice by the difficulty of assessing the modal dominance of a system. In other words, it is not always clear which modes should be retained, especially in systems which have closely-spaced eigenvalues, lightly damped high frequency modes, or wide band input excitations. The method also lacks an a priori error bound. In terms of implementation, an eigen-decomposition of the full system is required, which can be computationally expensive and numerically ill-conditioned for large scale systems.

3.2. Balanced Realization
Modal truncation is driven by the eigen-structure of $A$ and does not explicitly take the system's input/output properties into account. Another approach to model reduction is to retain only the state dynamics that are strongly coupled to the input/output of the system. To assess how strong this coupling is, we apply the concepts of controllability and observability. We begin by defining controllability. Consider all $u \in L_2^{n_i}[0, T]$ with $\|u\|_{L_2} = 1$ applied to the system initially at rest (zero state). The corresponding states at $T$, $x(T) \in \mathbb{R}^n$, indicate the strength of coupling between the input and state spaces. We consider $x(T)$ strongly coupled to the input $u(t)$ if $\|x(T)\|$ is large and vice versa. If $\|x(T)\| = 0$, then those states are decoupled from the input. This may be visualized as a mapping of a unit ball in $L_2^{n_i}[0, T]$ to an ellipsoid in $\mathbb{R}^n$, called the controllability ellipsoid (see Fig. 3). The principal axes of the ellipsoid indicate the degree of coupling between the state in that direction and the input signal. Denote the mapping of the $L_2$-input to the final state, $x(T)$, as $L_T : L_2^{n_i}[0, T] \to \mathbb{R}^n$:
$$L_T u := \int_0^T e^{A(T-\tau)} B u(\tau)\,d\tau. \tag{3.2}$$
The lengths of the principal axes of the controllability ellipsoid are the singular values of $L_T$, or, equivalently, the square roots of the eigenvalues of
$$P(T) = L_T L_T^* \tag{3.3}$$
where $L_T^*$ is the operator adjoint of $L_T$ and $P(T)$ is an $n \times n$ positive semi-definite matrix, called the controllability gramian (at time $T$). The controllability gramian may be calculated from a linear matrix differential equation
$$\dot{P}(t) = AP(t) + P(t)A^T + BB^T, \quad P(0) = 0_{n \times n}. \tag{3.4}$$
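The matrix differential equation (3.4) can be integrated directly for small systems. The sketch below uses a fixed-step classical Runge-Kutta scheme, which is a choice made here for illustration (the paper does not prescribe an integrator); the function name, step count, and scalar test system are assumptions.

```python
import numpy as np

def gramian_ode_rk4(A, B, T, steps=2000):
    """Integrate Pdot = A P + P A^T + B B^T, P(0) = 0, up to time T (fixed-step RK4)."""
    n = A.shape[0]
    BBt = B @ B.T

    def rhs(P):
        return A @ P + P @ A.T + BBt

    P = np.zeros((n, n))
    h = T / steps
    for _ in range(steps):
        k1 = rhs(P)
        k2 = rhs(P + 0.5 * h * k1)
        k3 = rhs(P + 0.5 * h * k2)
        k4 = rhs(P + h * k3)
        P = P + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return P

# Hypothetical scalar system xdot = -x + u, for which P(t) = (1 - exp(-2 t)) / 2.
A = np.array([[-1.0]])
B = np.array([[1.0]])
P_T = gramian_ode_rk4(A, B, T=2.0)
```

For stiff, high order systems an explicit fixed-step integrator like this one would be a poor choice; it is shown only to make the meaning of (3.4) concrete.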
Table 1: Error norms considered in this paper and their interpretations

Error Norm                          Interpretation
$\|G - \hat{G}\|_{H_\infty}$        Worst case $L_2$ gain
$\|G - \hat{G}\|_{H_2}$             Worst case spectral density to power gain
$\|Gu - \hat{G}u\|_{L_\infty}$      Maximum output amplitude with input $u$
$\|Gu - \hat{G}u\|_{L_2}$           Maximum output $L_2$ norm with input $u$

The solution can also be written as a matrix integral:
$$P(T) = \int_0^T e^{At} BB^T e^{A^T t}\,dt. \tag{3.5}$$
For stable systems (all eigenvalues of $A$ are in the strict left half plane), $P(t)$ converges to a steady state matrix, $P$, as $t \to \infty$. In this case, $P$ solves the following linear matrix equation called the Lyapunov equation:
$$AP + PA^T + BB^T = 0. \tag{3.6}$$
The solution can also be written as an integral:
$$P = \int_0^\infty e^{At} BB^T e^{A^T t}\,dt. \tag{3.7}$$

Figure 3: Controllability Ellipsoid

A dual approach considers the state-to-output coupling by using the concept of observability. Since only state and output are considered, let the input $u = 0$. Denote the mapping of the initial state $x_0 \in \mathbb{R}^n$ to the output trajectory $y \in L_2^{n_o}[0, T]$ by $\ell_T$, then
$$y(t) = (\ell_T x_0)(t) = Ce^{At}x_0. \tag{3.8}$$
We can visualize $\ell_T$ as a mapping of the unit ball in $\mathbb{R}^n$ to an ellipsoid (at most $n$-dimensional) in $L_2^{n_o}[0, T]$, called the observability ellipsoid (see Fig. 4). The principal axes of the ellipsoid indicate the degree of coupling between the state and the output signal. The lengths of the principal axes of the observability ellipsoid are the singular values of $\ell_T$, or, equivalently, the square roots of the eigenvalues of
$$Q(T) = \ell_T^* \ell_T \tag{3.9}$$
where $\ell_T^*$ is the operator adjoint of $\ell_T$ and $Q(T)$ is an $n \times n$ positive semi-definite matrix, called the observability gramian (at time $T$). The observability gramian may be calculated from a linear matrix differential equation
$$\dot{Q}(t) = A^T Q(t) + Q(t)A + C^T C, \quad Q(0) = 0_{n \times n}. \tag{3.10}$$
The solution can also be written as a matrix integral:
$$Q(T) = \int_0^T e^{A^T t} C^T C e^{At}\,dt. \tag{3.11}$$
For stable systems, $Q(t)$ converges to a steady state matrix, $Q$, as $t \to \infty$, which solves the following Lyapunov equation (dual to (3.6)):
$$A^T Q + QA + C^T C = 0. \tag{3.12}$$
The solution can also be written as an integral:
$$Q = \int_0^\infty e^{A^T t} C^T C e^{At}\,dt. \tag{3.13}$$

Figure 4: Observability Ellipsoid
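For small systems, the Lyapunov equations (3.6) and (3.12) can be solved by rewriting them as one linear system through Kronecker products. The sketch below is only meant to make the equations concrete: it is not the Schur-based approach used in practice, it scales very poorly with $n$, and the function name and the lightly damped oscillator example are assumptions.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by vectorisation (dense; practical only for small n)."""
    n = A.shape[0]
    I = np.eye(n)
    # With row-major vec: vec(A X) = (A kron I) vec(X), vec(X A^T) = (I kron A) vec(X).
    K = np.kron(A, I) + np.kron(I, A)
    X = np.linalg.solve(K, -W.reshape(-1)).reshape(n, n)
    return 0.5 * (X + X.T)   # symmetrise to remove round-off asymmetry

# Hypothetical example: oscillator q'' + 0.2 q' + 4 q = u with displacement output.
A = np.array([[0.0, 1.0],
              [-4.0, -0.2]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

P = lyap(A, B @ B.T)       # controllability gramian, equation (3.6)
Q = lyap(A.T, C.T @ C)     # observability gramian, equation (3.12)

residual_P = A @ P + P @ A.T + B @ B.T
residual_Q = A.T @ Q + Q @ A + C.T @ C
```

The Kronecker system has $n^2$ unknowns, so this route is even more expensive than the $O(n^3)$ methods and serves purely as a reference solution for checking other solvers.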
The solution of the Lyapunov equations has been well studied in the literature. Two of the most popular methods are the Bartels and Stewart algorithm [21] and the Hammarling algorithm [22]. These algorithms involve the reduction of the system matrix $A$ to triangular form via a Schur decomposition, which requires $O(n^3)$ operations even when $A$ is sparse. This computation cost is acceptable for small to medium scale problems ($n$ on the order of 400 or less), but is prohibitive for large systems. We will discuss reducing this cost through the use of approximate iterative methods in Section 4.

Once the gramians are found, we are able to construct a reduced order model by retaining only the states that are strongly coupled to the input or output. For controllability, let the eigenvalue decomposition of $P$ be given by (note that $P$ is symmetric positive semidefinite):
$$P = T_c^T \Lambda_c T_c \tag{3.14}$$
where $\Lambda_c$ is diagonal and contains the eigenvalues of $P$ sorted in descending order, and the columns of $T_c^T$ are the eigenvectors. The transformed system $(T_c A T_c^T, T_c B, C T_c^T, D)$ has the controllability gramian $\Lambda_c$. Now partition $\Lambda_c$ as
$$\Lambda_c = \begin{bmatrix} \Lambda_{c1} & 0 \\ 0 & \Lambda_{c2} \end{bmatrix}$$
with $\Lambda_{c1}$ containing the dominant eigenvalues and $\Lambda_{c2}$ the remainder. Partitioning $A$, $B$, $C$ accordingly, the state equation in the transformed coordinates becomes
$$\begin{bmatrix} \dot{z}_1 \\ \dot{z}_2 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u, \qquad y = \begin{bmatrix} C_1 & C_2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + Du.$$
A reduced order model may be obtained by direct truncation, i.e., by assuming $z_2$ is small. The corresponding state space representation is $(A_{11}, B_1, C_1, D)$. Another way to obtain a reduced order model is through singular perturbation [23], by assuming that $z_2$ converges to a steady state much faster than $z_1$. In this case, set $\dot{z}_2 = 0$ to obtain
$$z_2 = -A_{22}^{-1}(A_{21} z_1 + B_2 u).$$
Substitute back into the $z_1$ equation to obtain the reduced order system:
$$\begin{aligned} \dot{z}_1 &= (A_{11} - A_{12}A_{22}^{-1}A_{21})\,z_1 + (B_1 - A_{12}A_{22}^{-1}B_2)\,u \\ y &= (C_1 - C_2A_{22}^{-1}A_{21})\,z_1 + (D - C_2A_{22}^{-1}B_2)\,u. \end{aligned} \tag{3.15}$$
Note that direct truncation matches the high frequency gains of the full order and reduced order models (i.e., $D$), and the singular perturbation model matches the DC gains.

Similarly, we can perform an eigen-decomposition of $Q$ as
$$Q = T_o^T \Lambda_o T_o. \tag{3.16}$$
The eigenvalues $\Lambda_o$ can be partitioned into the dominant and discardable portions. The corresponding partition of $(A, B, C, D)$ can then be used to generate a reduced order system by using either truncation or singular perturbation.

The reduced order model using the input-to-state or state-to-output coupling (through controllability or observability, respectively) is intuitively appealing but unfortunately is coordinate dependent. Let $(A, B, C, D)$ and $(T^{-1}AT, T^{-1}B, CT, D)$ be two state space realizations of the same input/output system, and $(P, Q)$ and $(\tilde{P}, \tilde{Q})$ the corresponding controllability and observability gramians. Then
$$\tilde{P} = T^{-1} P T^{-T}, \quad \tilde{Q} = T^T Q T. \tag{3.17}$$
In general, $P$ and $\tilde{P}$ (resp. $Q$ and $\tilde{Q}$) have different eigenvalues (unless $T$ is orthogonal, i.e., $T^{-1} = T^T$). Therefore, model reduction based only on the controllability gramian (resp. observability gramian) information would yield different reduced order models for the same system by just changing the state representation. In particular, controllability may be increased in certain state directions (e.g., through scaling) at the sacrifice of observability, and vice versa.

A solution of the coordinate dependence problem is to consider both controllability and observability at the same time. Such a transformation may be found through the eigen-decomposition of $PQ$:
$$\Sigma^2 = T^{-1} P Q T \tag{3.18}$$
where the diagonal matrix $\Sigma^2$ contains the eigenvalues of $PQ$ sorted in descending order and $T$ is the corresponding eigenvector matrix, suitably scaled. In this coordinate system, both controllability and observability gramians are equal to $\Sigma$:
$$\tilde{P} = T^{-1} P T^{-T} = \tilde{Q} = T^T Q T = \Sigma, \tag{3.19}$$
and the system is said to be balanced (between controllability and observability) [24]. Balancing is usually performed for stable systems using the solutions of the corresponding Lyapunov equations (3.6) and (3.12). In this case, the diagonal entries of $\Sigma$ are called the Hankel singular values of the system and they are invariant with respect to the coordinate transformation. Hankel singular values describe the degree to which a given state contributes to the input-output energy flow of the system. States with small Hankel singular values are both weakly controllable and weakly observable, and can be removed from the system (through truncation, called balanced truncation, or singular perturbation). Therefore, systems that show a rapid decline in Hankel singular values are easily approximated by a reduced order system. A sharp decrease in the Hankel singular values can indicate a good point at which to truncate the model [26]. Balanced truncation is the main model reduction tool examined in the paper, and we will later examine approximate, but computationally attractive, solutions for the system gramians and their use in balancing.
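The two reduction steps used throughout this section, direct truncation and singular perturbation (3.15), act on a state space model that has already been transformed and partitioned. A minimal sketch of both follows; the function names and the three-state example (with one fast, weakly coupled state) are hypothetical, and the system is assumed to be already ordered so that the first $r$ states are the ones to keep.

```python
import numpy as np

def direct_truncation(A, B, C, D, r):
    """Keep the leading r states: (A11, B1, C1, D)."""
    return A[:r, :r], B[:r, :], C[:, :r], D

def singular_perturbation(A, B, C, D, r):
    """Reduced model (3.15), obtained by setting the derivative of z2 to zero."""
    A11, A12 = A[:r, :r], A[:r, r:]
    A21, A22 = A[r:, :r], A[r:, r:]
    B1, B2 = B[:r, :], B[r:, :]
    C1, C2 = C[:, :r], C[:, r:]
    # One solve gives both A22^{-1} A21 and A22^{-1} B2.
    X = np.linalg.solve(A22, np.hstack([A21, B2]))
    Xa, Xb = X[:, :r], X[:, r:]
    return A11 - A12 @ Xa, B1 - A12 @ Xb, C1 - C2 @ Xa, D - C2 @ Xb

def dc_gain(A, B, C, D):
    """G(0) = D - C A^{-1} B."""
    return D - C @ np.linalg.solve(A, B)

# Hypothetical stable three-state system; the third state is fast.
A = np.array([[-1.0, 0.5, 0.1],
              [0.2, -2.0, 0.3],
              [0.1, 0.4, -50.0]])
B = np.array([[1.0], [0.5], [2.0]])
C = np.array([[1.0, 1.0, 1.0]])
D = np.array([[0.1]])

At, Bt, Ct, Dt = direct_truncation(A, B, C, D, r=2)
As, Bs, Cs, Ds = singular_perturbation(A, B, C, D, r=2)
```

In this example `Dt` equals `D`, so the high frequency gain is preserved by truncation, while `dc_gain(As, Bs, Cs, Ds)` equals `dc_gain(A, B, C, D)`, in line with the remark following (3.15).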
The most common method of finding the balanced coordinate is the square root method first proposed in [25]. First find the square roots of $P$ and $Q$:
$$P = Z_P Z_P^T, \quad Q = Z_Q Z_Q^T. \tag{3.20}$$
The $n \times n$ square root matrices are known as the Cholesky factors of the gramians. They are upper triangular, and always exist since the gramians are positive semi-definite. Next perform a singular value decomposition:
$$Z_P^T Z_Q = U \Sigma V^T \tag{3.21}$$
where $U$, $V$ are orthogonal and $\Sigma$ is diagonal. The next step is to form the coordinate transformation matrix, which requires the system to be controllable and observable, i.e., $P$ and $Q$ are positive definite. If this is not the case (which is frequently so for large scale systems due to numerical accuracy), a model reduction may first need to be performed to remove the nearly uncontrollable and nearly unobservable subsystems (by using truncation or singular perturbation using controllability or observability alone). If $P$ and $Q$ are both positive definite, then $Z_P$ and $Z_Q$ are both invertible. Define
$$T_1 = Z_P U \Sigma^{-1/2}, \quad T_2 = Z_Q V \Sigma^{-1/2}. \tag{3.22}$$
It follows that $T_1^{-1} = T_2^T$. Note, however, that neither $T_1$ nor $T_2$ requires explicit matrix inversion. Using $T_1$ as the transformation matrix (and using $T_2^T$ instead of $T_1^{-1}$), the new state space representation is
$$(A_b, B_b, C_b, D_b) = (T_2^T A T_1, T_2^T B, C T_1, D). \tag{3.23}$$
It can be verified that the controllability and observability gramians of this system are both equal to $\Sigma$.

The Hankel singular values also define an error bound for balanced truncation. For a system $G$ with Hankel singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_n$, the approximation error for a balanced truncation reduced order model of order $k$, $\hat{G}_k(s)$, satisfies the inequality
$$2(\sigma_{k+1} + \ldots + \sigma_n) \ge \|G - \hat{G}_k\|_{H_\infty} \ge \sigma_{k+1}. \tag{3.24}$$
Balanced truncation tends to match gain well but sometimes matches phase poorly. A related approach called balanced stochastic truncation has been proposed for square systems [27, 28], which balances the spectral factor of $G(s)G^T(-s)$ instead of $G$ itself. This approach tends to approximate the phase better and has a guaranteed relative error bound, i.e., a bound on $\|(G - \hat{G})G^{-1}\|$. We did not compare this approach in our numerical study in this paper.

3.3. Optimal Hankel Approximation
Balanced realization chooses a state coordinate based on its contribution to the input/output energy flow. The logical next step is to consider the input/output map without explicitly considering the internal state of the system. We first define a linear operator, called the Hankel operator, that maps the past input, $L_2^{n_i}(-\infty, 0]$, to the future output, $L_2^{n_o}[0, \infty)$:
$$\Gamma = \Psi_o \Psi_c \tag{3.25}$$
where $\Psi_c : L_2^{n_i}(-\infty, 0] \to \mathbb{R}^n$ maps the past input to the initial state, and $\Psi_o : \mathbb{R}^n \to L_2^{n_o}[0, \infty)$ maps the initial state to the future output. It can be shown that the induced norm, $\|\Gamma\|$, is the largest Hankel singular value of the system. We can now pose the model reduction as a minimization problem in terms of the Hankel norm of the approximation error (this is called the optimal Hankel norm approximation problem):
$$\min_{\hat{G} \text{ of order } k} \|G - \hat{G}\|_H. \tag{3.26}$$
The solution of this problem is given in [29], and can be readily implemented. We also consider this method in the numerical study in the next section. The optimal Hankel norm approximation method gives a tighter guaranteed error bound in terms of the $H_\infty$ norm:
$$\|G - \hat{G}\|_{H_\infty} \le (\sigma_{k+1} + \ldots + \sigma_n). \tag{3.27}$$
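The square root balancing steps (3.20) to (3.23), followed by truncation and a check of the upper bound in (3.24), can be sketched as follows. Several choices here are assumptions made for illustration: the gramians come from a dense Kronecker-product Lyapunov solve that only suits small $n$, NumPy's lower triangular Cholesky factor is used (any factor with $P = Z_P Z_P^T$ works), the error is sampled on a finite frequency grid, and the four-state test system is hypothetical.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 by vectorisation (small dense systems only)."""
    n = A.shape[0]
    I = np.eye(n)
    X = np.linalg.solve(np.kron(A, I) + np.kron(I, A), -W.reshape(-1))
    X = X.reshape(n, n)
    return 0.5 * (X + X.T)

def balance(A, B, C):
    """Square root method: returns the balanced (Ab, Bb, Cb) and Hankel singular values."""
    P = lyap(A, B @ B.T)                 # (3.6)
    Q = lyap(A.T, C.T @ C)               # (3.12)
    Zp = np.linalg.cholesky(P)           # P = Zp Zp^T, requires P positive definite
    Zq = np.linalg.cholesky(Q)           # Q = Zq Zq^T, requires Q positive definite
    U, sigma, Vt = np.linalg.svd(Zp.T @ Zq)      # (3.21)
    S = np.diag(sigma ** -0.5)
    T1 = Zp @ U @ S                      # (3.22)
    T2 = Zq @ Vt.T @ S
    return T2.T @ A @ T1, T2.T @ B, C @ T1, sigma    # (3.23)

def freq_response(A, B, C, D, w):
    n = A.shape[0]
    return C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D

# Hypothetical controllable and observable four-state system.
A = np.diag([-1.0, -2.0, -5.0, -10.0])
B = np.ones((4, 1))
C = np.array([[1.0, -0.5, 2.0, 1.0]])
D = np.zeros((1, 1))

Ab, Bb, Cb, sigma = balance(A, B, C)
k = 2
Ak, Bk, Ck = Ab[:k, :k], Bb[:k, :], Cb[:, :k]    # balanced truncation to order k

# Sampled H-infinity error versus the a priori bound 2 (sigma_{k+1} + ... + sigma_n).
freqs = np.logspace(-2, 3, 400)
err = max(abs((freq_response(A, B, C, D, w)
               - freq_response(Ak, Bk, Ck, D, w))[0, 0]) for w in freqs)
bound = 2.0 * sigma[k:].sum()
```

Because the grid only samples the error, `err` is a lower estimate of the true $H_\infty$ error, so `err <= bound` is a necessary consistency check rather than a proof of the bound.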
3.4. Discrete Time Systems
The model reduction approach described above can also be applied to discrete time LTI systems. A discrete time system description arises for computational reasons (as an approximation to continuous time systems, see Section 4) or due to sampled data implementation (zero-order-hold digital-to-analog input and sampled analog-to-digital output).

Consider a discrete time system in a state space representation:
$$\begin{aligned} x(k+1) &= A_d x(k) + B_d u(k) \\ y(k) &= C_d x(k) + D_d u(k), \end{aligned} \tag{3.28}$$
where $k$ is a non-negative integer denoting the time index.

As in the continuous time case, controllability can be defined through the mapping, $L_N$, from an input sequence $u_N = \{u(k) : 0 \le k \le N-1\}$ to a terminal state $x(N)$ (starting from the origin):
$$L_N u_N = \sum_{i=0}^{N-1} A_d^{N-i-1} B_d u(i) = \begin{bmatrix} A_d^{N-1}B_d, & \ldots, & B_d \end{bmatrix} \begin{bmatrix} u(0) \\ \vdots \\ u(N-1) \end{bmatrix}. \tag{3.29}$$
The controllability gramian can also be similarly defined as
$$P_N = L_N L_N^T. \tag{3.30}$$
One can visualize $L_N$ as the mapping of a unit $\ell_2^{n_i}[0, N-1]$ ball to an $\mathbb{R}^n$ ellipsoid, the controllability ellipsoid, with the principal axes given by the eigenvectors of $P_N$ and their lengths given by the square roots of the eigenvalues of $P_N$. The controllability ellipsoid captures the degree of coupling between the input and the state. In the case that the controllability ellipsoid is degenerate (zero length) in certain state directions, those states cannot be affected by the input (i.e., they are uncontrollable) and can be removed from the system description.

The gramian, $P_N$, also satisfies the discrete time Lyapunov equation:
$$A_d P_N A_d^T - P_{N+1} + B_d B_d^T = 0 \tag{3.31}$$
and can also be solved explicitly through a finite sum:
$$P_N = \sum_{i=0}^{N-1} A_d^i B_d B_d^T (A_d^i)^T. \tag{3.32}$$
If $A_d$ is stable (i.e., all eigenvalues strictly inside the unit circle), then $P_N$ converges to a steady state solution, $P$, as $N \to \infty$. In this case, $P$ satisfies the steady state Lyapunov equation
$$A_d P A_d^T - P + B_d B_d^T = 0 \tag{3.33}$$
and can be evaluated through an infinite sum:
$$P = \sum_{i=0}^{\infty} A_d^i B_d B_d^T (A_d^i)^T. \tag{3.34}$$
As a dual concept, consider an unforced system (i.e., $u \equiv 0$) with the initial condition $x(0)$ generating an output sequence $y_N = \{y(k) : k = 0, \ldots, N-1\}$. The mapping from $\mathbb{R}^n$ to $\ell_2^{n_o}[0, N-1]$ is then
$$\ell_N x(0) = \begin{bmatrix} C_d \\ C_d A_d \\ \vdots \\ C_d A_d^{N-1} \end{bmatrix} x(0). \tag{3.35}$$
Define the observability gramian as
$$Q_N = \ell_N^T \ell_N. \tag{3.36}$$
We can now visualize $\ell_N$ as the mapping of a unit $\mathbb{R}^n$ ball to an (at most) $n$-dimensional $\ell_2^{n_o}[0, N-1]$ ellipsoid, the observability ellipsoid, with the principal axes given by the eigenvectors of $Q_N$ and their lengths given by the square roots of the eigenvalues of $Q_N$. The observability ellipsoid captures the degree of coupling between the state and the output. In the case that the observability ellipsoid is degenerate in certain state directions, those states do not generate any output and can be removed from the system description.

The gramian, $Q_N$, also satisfies the discrete time Lyapunov equation:
$$A_d^T Q_N A_d - Q_{N+1} + C_d^T C_d = 0 \tag{3.37}$$
and can also be solved explicitly through a finite sum:
$$Q_N = \sum_{i=0}^{N-1} (A_d^i)^T C_d^T C_d A_d^i. \tag{3.38}$$
If $A_d$ is stable, then $Q_N$ converges to a steady state solution, $Q$, as $N \to \infty$. In this case, $Q$ satisfies the steady state Lyapunov equation
$$A_d^T Q A_d - Q + C_d^T C_d = 0 \tag{3.39}$$
and can be evaluated through an infinite sum:
$$Q = \sum_{i=0}^{\infty} (A_d^i)^T C_d^T C_d A_d^i. \tag{3.40}$$
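The finite sums (3.32) and (3.38) translate directly into a short loop, and for a stable $A_d$ a long enough sum approximates the steady state gramians of (3.34) and (3.40). The sketch below assumes a small dense system; the function name, the two-state example, and the horizon lengths are illustrative choices, not values from the paper.

```python
import numpy as np

def finite_gramians(Ad, Bd, Cd, N):
    """Finite-horizon gramians P_N of (3.32) and Q_N of (3.38)."""
    n = Ad.shape[0]
    P = np.zeros((n, n))
    Q = np.zeros((n, n))
    Ai = np.eye(n)                    # holds Ad^i, starting with i = 0
    for _ in range(N):
        P += Ai @ Bd @ Bd.T @ Ai.T
        Q += Ai.T @ Cd.T @ Cd @ Ai
        Ai = Ad @ Ai
    return P, Q

# Hypothetical stable discrete time system (eigenvalues 0.5 and 0.8).
Ad = np.array([[0.5, 0.1],
               [0.0, 0.8]])
Bd = np.array([[1.0],
               [1.0]])
Cd = np.array([[1.0, 0.0]])

P5, Q5 = finite_gramians(Ad, Bd, Cd, 5)
P6, Q6 = finite_gramians(Ad, Bd, Cd, 6)
P_inf, Q_inf = finite_gramians(Ad, Bd, Cd, 200)   # close to the steady state

# Residuals of the recursions (3.31), (3.37) and the steady state equation (3.33).
res_recursion_P = Ad @ P5 @ Ad.T - P6 + Bd @ Bd.T
res_recursion_Q = Ad.T @ Q5 @ Ad - Q6 + Cd.T @ Cd
res_steady_P = Ad @ P_inf @ Ad.T - P_inf + Bd @ Bd.T
```

The number of terms needed for the truncated sum to approach the steady state depends on how close the eigenvalues of $A_d$ are to the unit circle.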
The gramians, $P$ and $Q$, may be used in exactly the same way as in the continuous time systems to reduce the system order. For balanced truncation, we first perform an eigen-decomposition of $PQ$ (in any coordinate):
$$\Sigma^2 = T^{-1} P Q T, \tag{3.41}$$
where the diagonal matrix $\Sigma^2$ contains the eigenvalues of $PQ$ (the squares of the Hankel singular values for the discrete time system) sorted in descending order and $T$ is the corresponding eigenvector matrix, suitably scaled. Then by using $T$ as the coordinate transformation, the transformed system, $(T^{-1}A_dT, T^{-1}B_d, C_dT, D_d)$, has identical controllability and observability gramians, which are both $\Sigma$. The states corresponding to small diagonal entries of $\Sigma$ may be truncated as in Section 3.2 to obtain the reduced order model.

3.5. Example
In this section, we examine the performance of the model reduction techniques discussed above applied to a problem of moderate dimension (100 modes or $n = 200$). The purpose is to demonstrate the effectiveness of model reduction for generalized second-order models, and to determine the most effective type of model reduction, which will then be applied to the large-scale problem.

The system under consideration here has been randomly generated, and has a non-monotonically decaying frequency response. We apply three model reduction methods that have been discussed: modal truncation, balanced truncation, and optimal Hankel norm approximation. The performance of these methods is compared by examining the frequency response plots of the original system and the approximated ones.

3.5.1. Comparison of Model Reduction Methods

Figure 5: Graphical progression of modal truncation (magnitude in dB versus frequency; full model, order 10, order 20)

Figure 6: Graphical progression of balanced truncation (magnitude in dB versus frequency; full model, order 10, order 20)

Figure 7: Graphical progression of optimal Hankel-norm approximation (magnitude in dB versus frequency; full model, order 10, order 20)

Fig. 5-7 show the frequency response comparison between the full-order model, the modal truncation models, the reduced order balanced truncation models, and the optimal Hankel norm reduction models. Balanced truncation retains the most dominant system resonant modes in the frequency response, while modal truncation retains the lowest frequency modes irrespective of their contributions. Hankel norm reduction matches the dominant modes reasonably well but performs poorly at high frequency since the contribution to the error norm is small. Some of the undesirable features of the modal and Hankel norm reduction methods may be corrected by selecting modes based on their peak magnitudes or through frequency dependent weighting, but balanced truncation is the overall algorithm of choice since it captures the dominant input/output frequency response, provides an error bound, and preserves system stability.
10
3.5.2. Maximum Output Prediction with Reduced Order Models
In this section, we present an application of model reduction to the determination of the value and location of the maximum strain in a structure. This information can be used to ensure that a critical yield
stress is not exceeded or to optimize the placement
of sensors to collect strain data for control purposes.
We will show that a reduced order model can be
used to predict the value and location of the maximum strain while reducing computational expense.
Our motivating example is a piezoelectric composite beam, with a force input applied at a randomly selected location along its length. The model has
been discretized to 400 nodes, giving the full-order
model 800 states. Therefore, there are 399 possible
locations for the maximum strain.
To apply the model reduction methods presented
earlier, we can consider 399 systems, each with a distinct single output corresponding to the strain at a
node. Alternatively, we can consider a single system with 399 outputs, and reduce the model only
once, obtaining a model with 399 outputs. The first
approach has the greatest potential for model reduction, since less information is required to describe the input-output relationship, but the second
is much more efficient computationally since it requires only one reduction. The time required to perform 399 individual reductions is much more than
the time required to simply compute the full-order
model, making the first approach impractical. We
thus proceed with the multiple-output approach.
Thirty simulations were performed, in which the input was a bounded sinusoidal force function with
random frequency content and amplitude. The force
was applied at randomly selected points on the
beam, and the full-order model was simulated to
find the value and temporal and spatial locations of
the maximum strain. Reduced order models with
4 to 200 states were then generated, and the predicted value and location of the maximum strain
were found. The results of these predictions were
compared with the actual value and location of maximum strain, and the quality of the approximation
was then assessed. We define a successful prediction as one predicting the strain value within 5
percent, the location within 2 nodes, and the time
within 10 time steps. Our simulation was run for
1000 time steps. The results are presented in graphical form in figure 8, which shows the percentage of
tests giving successful approximations for a given
reduced-order model size.
Figure 8: Results for 30 tests of maximum strain prediction [plot: % of models successful versus reduced model order; curves: value, time, location]
It is evident that a reasonable prediction can be obtained using reduced order models of order 200 or less in most cases.
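The success criterion defined above (value within 5 percent, location within 2 nodes, time within 10 time steps) can be sketched as follows. The function names and the small strain tables are hypothetical, and taking the maximum over the absolute value of the strain is an assumption of this sketch rather than something stated in the text.

```python
def locate_maximum(strain):
    """Return (value, node, step) of the largest |strain| in a steps-by-nodes table."""
    best = (0.0, 0, 0)
    for step, row in enumerate(strain):
        for node, s in enumerate(row):
            if abs(s) > best[0]:
                best = (abs(s), node, step)
    return best

def prediction_successful(full, reduced, value_tol=0.05, node_tol=2, step_tol=10):
    """Compare (value, node, step) triples; returns (value_ok, location_ok, time_ok)."""
    v_f, n_f, t_f = full
    v_r, n_r, t_r = reduced
    return (abs(v_r - v_f) <= value_tol * abs(v_f),
            abs(n_r - n_f) <= node_tol,
            abs(t_r - t_f) <= step_tol)

# Hypothetical strain histories (rows: time steps, columns: nodes).
full = locate_maximum([[0.1, 0.4, 0.2], [0.3, -1.0, 0.5]])
reduced = locate_maximum([[0.1, 0.4, 0.2], [0.3, -0.97, 0.5]])
result = prediction_successful(full, reduced)
```

Returning the three checks separately mirrors the way value, time, and location are reported as separate curves in Figure 8.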
In almost all cases, the time and location were successfully predicted before the strain value itself. It
is not known a priori when the reduced order model
will successfully predict value, time, and location,
but it was observed that once an accurate prediction had been reached, it remained accurate when
we further increased the model order. Convergence
of the maximum strain value was asymptotic, but
the behavior of the convergence of location and time
was not well-defined, often oscillating between several distinct values before converging.
Although we cannot calculate an analytical error
bound for a problem of this type, we can define an
a posteriori error bound as follows. The induced $L_\infty$ norm in the time domain is the $L_1$ norm of the impulse response [19]:
$$\|g - \hat{g}\|_{L_1} = \sup_{u \neq 0} \frac{\|y - \hat{y}\|_{L_\infty}}{\|u\|_{L_\infty}} \tag{3.42}$$
where $g$ and $\hat{g}$ are the impulse responses of the full-order and reduced order systems, $y$ and $\hat{y}$ are the respective outputs, and $u$ is the input. If the desired maximum output error bound is $\epsilon$, then a sufficient condition is
$$\|g - \hat{g}\|_{L_1} \le \frac{\epsilon}{\|u\|_{L_\infty}}. \tag{3.43}$$
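A minimal numerical sketch of the condition (3.43) is given below. It uses sampled impulse responses and the sum of absolute samples as a discrete-time stand-in for the $L_1$ norm, which is an assumption of this sketch; the two impulse responses and the input are hypothetical.

```python
import numpy as np

def l1_norm(h):
    """Discrete-time analogue of the L1 norm: sum of absolute impulse response samples."""
    return float(np.sum(np.abs(h)))

def satisfies_bound(g, g_hat, u_peak, eps):
    """Sufficient condition (3.43): ||g - g_hat||_1 <= eps / ||u||_inf."""
    return l1_norm(np.asarray(g) - np.asarray(g_hat)) <= eps / u_peak

k = np.arange(200)
g = 0.9 ** k                               # hypothetical full-order impulse response
g_hat = 0.9 ** k - 0.01 * 0.5 ** k         # hypothetical reduced-order impulse response
u = 2.0 * np.sin(0.3 * k)                  # bounded sinusoidal input, peak at most 2

err = np.convolve(g - g_hat, u)[:len(k)]   # y - y_hat over the simulated horizon
bound = l1_norm(g - g_hat) * np.max(np.abs(u))
```

Here the peak of `err` cannot exceed `bound`, which is the inequality behind (3.42); `satisfies_bound` then only needs the input peak, not the input itself.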
For a given input $u$, we can choose the order of the reduced model sufficiently high (and $\|g - \hat{g}\|_{L_1}$ sufficiently small), so that (3.43) is satisfied. The convergence of $\|g - \hat{g}\|_{L_1}$ for several output locations is shown in Fig. 9.
Figure 9: L1 norm of impulse response error for selected output locations [plot: error versus reduced model order; curves: output 1, output 2, output 3, upper bound]
The benefit of model reduction can be seen in the computational time savings shown in Table 2. We incur a one-time cost of reduction (in terms of gramian computation and SVD needed in balancing), and after this we enjoy a significant savings for each iteration in terms of the solution of the time response. The cost to simulate a 200th order model is about 1/16 of that of an 800th order model. So after 10 design iterations, the total computation time using the reduced order model is already less than that of the full order model.

Model     Order  Reduction time (s)  Time, 1 iteration (s)  Time, 10 iterations (s)  Time, 100 iterations (s)
Original  800    0                   24                     240                      2400
Reduced   200    125                 125+1.5                125+15                   125+150

Table 2: Cost savings in maximum strain problem design cycles

4. APPROXIMATE SOLUTIONS OF GRAMIANS
We have shown that balanced truncation is an effective model reduction technique in the previous section. However, its application to large systems is limited by the computational load (of order $n^3$ for the gramian and SVD calculations needed in balanced transformation) and storage requirements. Additionally, the numerical implementation can become ill-conditioned for stiff systems (widely separated eigenvalues in $A$). In this section we present several methods that are more numerically efficient to approximately compute the system gramians.
For large systems with many state variables and relatively few inputs and outputs, typical of models arising from the finite element method, the Hankel singular values decay rapidly. This implies that the input-output energy coupling is dominated by just a few states. As a result, the gramians have low numerical ranks. We can exploit this fact by computing just the dominant portion of the gramians which can then be used to calculate a reduced order model. If we can efficiently compute an approximate gramian that has eigenvectors that point roughly in the same state directions as the dominant eigenvectors of the actual gramian, then the approximate gramian will perform nearly as well as the actual one in the model reduction process. This low rank gramian approximation then takes the place of the solution of the full order Lyapunov equations, which is computationally prohibitive for the large-scale problem.
4.1. Discrete-Time Gramian Formulation
Instead of solving for the gramians in continuous time, we will consider the solution of a discrete-time system that has the same gramians. This process allows us to calculate the gramians using an infinite series instead of the integrals in (3.7) and (3.13).
Consider the following bilinear transformation that maps the imaginary axis (in the $s$ domain) to the unit circle (in the $z$ domain)
$$s = p\,\frac{1 - z}{1 + z} \tag{4.1}$$
where $p < 0$ is a shift parameter to be chosen. If the discrete time system is obtained through uniform time domain sampling of the continuous time system, then $p = -2 f_s$ with $f_s$ the sampling frequency. Substituting (4.1) into the continuous time transfer function (2.11), we obtain a discrete time transfer function
$$H_p(z) = C_p (zI - A_p)^{-1} B_p + D_p, \tag{4.2}$$
where
$$\begin{aligned} A_p &= (pI + A)^{-1}(pI - A) \\ B_p &= \sqrt{-2p}\,(pI + A)^{-1} B \\ C_p &= \sqrt{-2p}\, C (pI + A)^{-1} \\ D_p &= D - C (pI + A)^{-1} B. \end{aligned} \tag{4.3}$$
The corresponding Lyapunov equations for the discrete time controllability and observability gramians are
$$\begin{aligned} A_p P A_p^T - P + B_p B_p^T &= 0 \\ A_p^T Q A_p - Q + C_p^T C_p &= 0. \end{aligned} \tag{4.4}$$
Note that for any $p < 0$, these equations are equivalent to the continuous time Lyapunov equations (3.6) and (3.12), i.e., they have the same solutions $P$ and $Q$. The solutions may be expressed as infinite sums instead of the integrals in (3.7) and (3.13):
$$P = \sum_{j=0}^{\infty} A_p^j B_p B_p^T (A_p^T)^j, \qquad Q = \sum_{j=0}^{\infty} (A_p^T)^j C_p^T C_p A_p^j. \tag{4.5}$$
Since $A_p$ is stable (i.e., all eigenvalues within the unit circle), these series converge. A natural approximation of $P$ and $Q$ may then be found by truncating these series. We now examine several variations for solving for the approximate gramians efficiently.
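The transformation (4.3) and the truncation of the series (4.5) can be checked on a small example. The sketch below is only illustrative: the 3-state system, the shift $p = -3$, and the function names are hypothetical, and the reference gramian is obtained by a dense Kronecker-product solve that is feasible only for small $n$.

```python
import numpy as np

def bilinear_transform(A, B, C, D, p):
    """Discrete time realization (4.3) for a real shift p < 0."""
    n = A.shape[0]
    M = np.linalg.inv(p * np.eye(n) + A)       # (pI + A)^{-1}
    scale = np.sqrt(-2.0 * p)
    return M @ (p * np.eye(n) - A), scale * M @ B, scale * C @ M, D - C @ M @ B

def continuous_gramian(A, B):
    """Reference solution of A P + P A^T + B B^T = 0 by vectorization (small n only)."""
    n = A.shape[0]
    K = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(K, -(B @ B.T).ravel()).reshape(n, n)

def truncated_series(Ap, Bp, k):
    """First k terms of the controllability series in (4.5)."""
    P = np.zeros_like(Ap)
    term = Bp                                   # Ap**j @ Bp, starting at j = 0
    for _ in range(k):
        P += term @ term.T
        term = Ap @ term
    return P

# Hypothetical stable example with one lightly damped complex pair.
A = np.array([[-0.5, 3.0, 0.0],
              [-3.0, -0.5, 0.0],
              [0.0, 0.0, -2.0]])
B = np.array([[1.0], [0.0], [1.0]])
C = np.array([[1.0, 1.0, 0.0]])
D = np.array([[0.0]])
p = -3.0

Ap, Bp, Cp, Dp = bilinear_transform(A, B, C, D, p)
P_exact = continuous_gramian(A, B)
P_200 = truncated_series(Ap, Bp, 200)
rho = max(abs(np.linalg.eigvals(Ap)))           # spectral radius of Ap
```

The number of terms needed is governed by `rho`: the closer the spectral radius of $A_p$ is to one, the more terms the truncated series requires, which is what motivates the choice of shift parameters discussed later.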
4.2. Smith Method
The infinite series (4.5) may be truncated to generate the following kth order approximate gramians:
$$P_k = \sum_{j=0}^{k-1} A_p^j B_p B_p^T (A_p^T)^j, \qquad Q_k = \sum_{j=0}^{k-1} (A_p^T)^j C_p^T C_p A_p^j. \tag{4.6}$$
These sums may be iteratively computed:
$$\begin{aligned} P_j &= A_p P_{j-1} A_p^T + B_p B_p^T, & P_0 &= 0 \\ Q_j &= A_p^T Q_{j-1} A_p + C_p^T C_p, & Q_0 &= 0, \end{aligned} \tag{4.7}$$
where $j = 1, \ldots, k$. This iterative solution for the approximate gramians is known as the Smith method. The computational cost is $O(n^3)$ for fully populated $A$, and $O(n^2)$ for tridiagonal $A$.
4.3. ADI Iteration
The Alternating Direction Implicit (ADI) algorithm [30, 31] is a generalization of the Smith method by using distinct shift parameters $p_1, p_2, \ldots$:
$$\begin{aligned} P_j &= A_{p_j} P_{j-1} A_{p_j}^T + B_{p_j} B_{p_j}^T, & P_0 &= 0 \\ Q_j &= A_{p_j}^T Q_{j-1} A_{p_j} + C_{p_j}^T C_{p_j}, & Q_0 &= 0, \end{aligned} \tag{4.8}$$
where $A_{p_j}$, $B_{p_j}$, $C_{p_j}$ are the transformed state space matrices as defined in (4.3) with $p$ replaced by the jth shift parameter $p_j$. When a fixed number of shift parameters are used, they are recycled in the iterations. Referencing (4.7), the ADI iteration simplifies to the Smith method when only one shift parameter is used. However, when multiple shifts are used, the convergence rate is typically faster than the Smith method.
The iterations in (4.8) may be split into two steps to gain efficiency:
$$\begin{aligned} (A + p_j I) P_{temp} &= -B B^T - P_{j-1}(A^T - p_j I) \\ (A + p_j I) P_j &= -B B^T - P_{temp}^T (A^T - p_j I) \\ (A^T + p_j I) Q_{temp} &= -C^T C - Q_{j-1}(A - p_j I) \\ (A^T + p_j I) Q_j &= -C^T C - Q_{temp}^T (A - p_j I) \\ P_0 = Q_0 &= 0 \end{aligned} \tag{4.9}$$
where $P_{temp}$ and $Q_{temp}$ are intermediate matrices.
Computationally, each ADI iteration in (4.9) involves two matrix-matrix products, and two matrix-matrix solves. For a full matrix $A$, a matrix-matrix solve has computational cost $O(n^3)$, which is impractical for large $n$. To reduce the computational cost to $O(n^2)$, a general matrix $A$ must be made tridiagonal.
4.4. Cyclic Smith Method
The cyclic Smith method combines the ADI and Smith methods by first applying the ADI method for $J$ steps (using all the shift parameters) and then using the Smith method to generate the approximate gramians. To show the algorithm, we consider the controllability case only. First write (4.4) with $p = p_J$ (which is equivalent to (3.6)):
$$P = A_{p_J} P A_{p_J}^T + B_{p_J} B_{p_J}^T.$$
Then substitute for $P$ in the right hand side by using
$$P = A_{p_{J-1}} P A_{p_{J-1}}^T + B_{p_{J-1}} B_{p_{J-1}}^T$$
to obtain
$$P = A_{p_J} \left( A_{p_{J-1}} P A_{p_{J-1}}^T + B_{p_{J-1}} B_{p_{J-1}}^T \right) A_{p_J}^T + B_{p_J} B_{p_J}^T.$$
Repeat this process to obtain
$$P = \Phi_{0,J} P \Phi_{0,J}^T + P_J \tag{4.10}$$
where
$$\Phi_{k,\ell} := \prod_{i=k+1}^{\ell} A_{p_i}$$
and
$$P_J = \sum_{j=1}^{J} \Phi_{j,J} B_{p_j} B_{p_j}^T \Phi_{j,J}^T$$
is just the Jth iterate of the ADI iteration (4.8).
Eq. (4.10) is of the same form as (4.4); therefore, we may apply the Smith method to find an approximate solution:
$$P_j^{(CS)} = \Phi_{0,J} P_{j-1}^{(CS)} \Phi_{0,J}^T + P_J, \qquad P_0^{(CS)} = 0, \tag{4.11}$$
where $j = 1, \ldots, k$, for the k-term approximation of the infinite series expansion. For the observability gramian, a similar propagation may be used
$$Q_j^{(CS)} = \Phi_{0,J}^T Q_{j-1}^{(CS)} \Phi_{0,J} + Q_J, \qquad Q_0^{(CS)} = 0, \tag{4.12}$$
where $Q_J$ is the Jth ADI iterate. The computational cost is again $O(n^3)$ for fully populated $A$, and $O(n^2)$ for tridiagonal $A$.
The advantage of the cyclic Smith method lies in faster convergence than the Smith method (due to the multiple shifts) while avoiding using a large number of shift parameters.
4.5. Shift Parameter Selection
The convergence of the Smith, cyclic Smith, and ADI algorithms depends on the selection of the shift parameters, $p_j$ (user-selected real numbers or complex conjugate pairs with negative real parts). To increase the speed of convergence of these algorithms, $p_j$ should be chosen so that the eigenvalues of $A_{p_j}$ have small magnitudes. The eigenvalues of $A_{p_j}$ are related to the eigenvalues of $A$ by
$$\lambda_i(A_{p_j}) = \frac{p_j - \lambda_i(A)}{p_j + \lambda_i(A)}. \tag{4.13}$$
The selection of the shift parameters (for the ADI and cyclic Smith cases) may then be posed as an optimization problem of choosing $p_1, p_2, \ldots, p_J$ to minimize the largest eigenvalue magnitude of $\Phi_{0,J}$:
$$\min_{p_1, \ldots, p_J} \; \max_{\lambda_i(A)} \; \left| \prod_{j=1}^{J} \frac{p_j - \lambda_i(A)}{p_j + \lambda_i(A)} \right|. \tag{4.14}$$
For the Smith method, only one parameter needs to be chosen. If the eigenvalues of $A$ are known, and if the $p_j$'s are chosen to be the eigenvalues of $A$, then the ADI algorithm will produce the exact solution of the gramians in $n$ steps. Of course, the goal is to obtain an approximate solution in a much smaller number of iterations, so the number of shift parameters is in general much smaller, i.e., $J \ll n$.
If the eigenvalues of $A$ are all real, the solution to (4.14) is known, and the optimal parameters may be readily generated. However, second-order FEM models typically have numerous complex eigenvalues, with small real parts. For this case, the problem has no known closed-form solution. Various suboptimal solutions have been proposed [32-34].
4.6. Low Rank Algorithms
The iterative methods presented so far avoid the costly solution of Lyapunov equations in the computation of gramians. However, their application to large scale systems is inherently limited due to the computation and storage requirements in propagating the full $n \times n$ system gramians at each iteration. This section discusses the so-called low rank methods which propagate the Cholesky factor of the gramian instead of the full gramian [35].
4.6.1. Low-Rank ADI
Low-rank ADI (LR-ADI), proposed in [36], propagates the Cholesky factor of the gramians in ADI instead of the full gramian matrix. As a result, it has reduced computational and storage requirements. Let $P_j$ be the jth ADI iterate. Since $P_j \ge 0$, it can be factored as
$$P_j = Z_{P_j} Z_{P_j}^T.$$
The ADI iteration (4.8) may be written as
$$Z_{P_j} Z_{P_j}^T = A_{p_j} Z_{P_{j-1}} Z_{P_{j-1}}^T A_{p_j}^T + B_{p_j} B_{p_j}^T = \left[ A_{p_j} Z_{P_{j-1}} \;\; B_{p_j} \right] \left[ A_{p_j} Z_{P_{j-1}} \;\; B_{p_j} \right]^T.$$
We can now update $Z_{P_j}$ instead of $P_j$:
$$Z_{P_j} = \left[ A_{p_j} Z_{P_{j-1}} \;\; B_{p_j} \right], \qquad Z_{P_0} = 0. \tag{4.15}$$
With a little algebra, it can be shown that
$$Z_{P_J} = \left[ B_{p_J} \;\; S_{J-1} B_{p_J} \;\; \ldots \;\; \left( \prod_{j=1}^{J-1} S_j \right) B_{p_J} \right] \tag{4.16}$$
where
$$S_i = \sqrt{\frac{p_i}{p_{i+1}}} \left( I - (p_{i+1} + p_i)(A + p_i I)^{-1} \right).$$
The following iterative update may then be used to generate $Z_{P_J}$ (the indexing has been reversed for convenience):
$$\begin{aligned} z_{j+1} &= S_j z_j, \qquad z_1 = B_{p_1}, \\ S_j &= \sqrt{\frac{p_{j+1}}{p_j}} \left( I - (p_{j+1} + p_j)(A + p_{j+1} I)^{-1} \right), \\ Z_{P_j}^{(ADI)} &= \left[ Z_{P_{j-1}}^{(ADI)} \;\; z_j \right], \qquad Z_{P_1}^{(ADI)} = z_1. \end{aligned} \tag{4.17}$$
Similarly, for the observability gramian, we have
$$\begin{aligned} z_{j+1} &= S_j^T z_j, \qquad z_1 = C_{p_1}^T, \\ Z_{Q_j}^{(ADI)} &= \left[ Z_{Q_{j-1}}^{(ADI)} \;\; z_j \right], \qquad Z_{Q_1}^{(ADI)} = z_1. \end{aligned} \tag{4.18}$$
The iteration terminates when $\|z_j\|$ becomes sufficiently small.
Each iteration of LR-ADI requires only matrix-vector solves, instead of the matrix-matrix products used in the normal ADI method. A matrix-vector solve has cost $O(n^2)$ if $A$ is full, and $O(n)$ if $A$ is sparse. Therefore, the LR-ADI algorithm has cost $O(n)$ if $A$ is tridiagonal or sparse, and $O(n^2)$ if $A$ is full. Each LR-ADI iteration adds a number of columns to $Z_{P_j}^{(ADI)}$ and $Z_{Q_j}^{(ADI)}$ corresponding to the number of inputs (respectively outputs). LR-ADI becomes most advantageous when the iterations terminate with a small number of columns, therefore saving both storage and computation. There could be further savings if only an orthonormal basis is saved in each iteration.
4.6.2. Low-Rank Cyclic Smith
In a manner similar to the LR-ADI algorithm, we can formulate a low-rank cyclic Smith (LR-CS) method to reduce computation and storage requirements [38]. Consider the cyclic Smith iteration (4.11). Substituting the Cholesky factorizations for $P_J$, the Jth ADI iterate, and $P_j^{(CS)}$, we get
$$Z_{P_j}^{(CS)} Z_{P_j}^{(CS)T} = \Phi_{0,J} Z_{P_{j-1}}^{(CS)} Z_{P_{j-1}}^{(CS)T} \Phi_{0,J}^T + Z_{P_J}^{(ADI)} Z_{P_J}^{(ADI)T}. \tag{4.19}$$
We can now just update the Cholesky factor instead of the full gramian
$$Z_{P_j}^{(CS)} = \left[ \Phi_{0,J} Z_{P_{j-1}}^{(CS)} \;\; Z_{P_J}^{(ADI)} \right]. \tag{4.20}$$
This may be written in an alternate and more efficient update:
$$\begin{aligned} z_{j+1} &= \Phi_{0,J} z_j, \qquad z_0 = Z_{P_J}^{(ADI)}, \\ Z_{P_j}^{(CS)} &= \left[ Z_{P_{j-1}}^{(CS)} \;\; z_j \right], \qquad Z_{P_0}^{(CS)} = z_0. \end{aligned} \tag{4.21}$$
Similarly, for the observability gramian, we have the following iteration:
$$\begin{aligned} z_{j+1} &= \Phi_{0,J}^T z_j, \qquad z_0 = Z_{Q_J}^{(ADI)}, \\ Z_{Q_j}^{(CS)} &= \left[ Z_{Q_{j-1}}^{(CS)} \;\; z_j \right], \qquad Z_{Q_0}^{(CS)} = z_0. \end{aligned} \tag{4.22}$$
The iterations terminate when $\|z_j\|$ is sufficiently small. Similar to the LR-ADI case, since $Z_{P_j}^{(CS)}$ and $Z_{Q_j}^{(CS)}$ in general have a column rank lower than the dimension of the system, the LR-CS method has lower computational requirements (order $O(n^2)$ for fully populated $A$ and $O(n)$ for tridiagonal $A$) and lower memory storage requirements.
5. APPROXIMATE BALANCE TRANSFORMATION
Once the approximate gramians are found, we can use them to generate a reduced order model. The square root method presented in Section 3.2 can be directly extended to use the approximate Cholesky factors obtained by using the LR-ADI or LR-CS methods [11, 36]. Let the approximate Cholesky factors of the controllability and observability gramians be $Z_P \in \mathbb{R}^{n \times k}$ and $Z_Q \in \mathbb{R}^{n \times k}$, which are full column rank matrices (note that there are possibly many more rows than columns and, for simplicity, we assume that the matrices have the same number of columns). Next perform an SVD on the following $k \times k$ matrix:
$$Z_P^T Z_Q = U \Sigma V^T. \tag{5.1}$$
Now define the transformation matrices
$$T_1 = Z_P U \Sigma^{-1/2}, \qquad T_2 = Z_Q V \Sigma^{-1/2}. \tag{5.2}$$
The reduced order system may then be readily obtained:
$$(\hat{A}, \hat{B}, \hat{C}, \hat{D}) = (T_2^T A T_1, \; T_2^T B, \; C T_1, \; D). \tag{5.3}$$
The low-rank methods presented earlier can produce $Z_P$ and $Z_Q$ directly at reduced computation and storage needs as compared to the solution of the full order gramians. If $k \ll n$, the SVD, an $O(n^3)$ operation for the full order gramians, is performed only on a $k \times k$ matrix, which also provides considerable savings.
The quality of the reduced order model depends on how well the low rank Cholesky factors approximate the factors of the actual gramians. However, the analytic error bound (3.24) no longer holds. Some error bounds have recently been developed [39] to account for the approximation error as well. Furthermore, the full order balanced truncation preserves the
stability of the full order system. With the approximate gramians, this is no longer true and the unstable modes will need to be removed. Other model reduction methods using the approximate Cholesky factors have also been proposed [11].
6. NUMERICAL EXPERIMENT
In this section, we use a fixed-free composite piezoelectric beam presented in [40] to illustrate the model reduction methods discussed in this paper. The beam has been spatially discretized into 450 nodes, resulting in a full-order model with 900 states. The input is chosen as a force applied to node 250, and the output is the strain measured at node 2 (the beam root). The frequency response is shown in Fig. 10. A complete derivation of the model appears in [40] and will not be repeated here. All simulations are performed using MATLAB 6.5, running on a Pentium 4 2.0 GHz PC, with 512 MB of RAM. The approximate gramian computation and balancing routines are drawn from the Lyapack [14] software library.
Figure 10: Frequency response of full-order model [plot "Frequency Response": abs(Y/U) versus frequency (rad/s)]
Extensive numerical comparison between all the algorithms for gramian approximation has been conducted in [37]. The best performance is obtained by using the LR-ADI algorithm for gramian approximation together with the low-rank square root method for balancing. We will only present the results related to the LR-ADI method here.
6.1. Performance of LR-ADI and LR-Square-Root Methods
The parameters that need to be selected in the model reduction are the number of iterations and the order of the reduced system. In the case of exact balanced truncation, we know the $H_\infty$ error bound a priori and can determine the required model order to achieve the required error tolerance from (3.24). In the approximated-gramian case, we do not have a guaranteed error bound. Therefore, we supply a requested model order, and the low-rank square root balancing algorithm produces a reduced order model of size no greater than the requested order, and possibly less. A less-than-requested model order can occur when the approximated gramian is of insufficient rank. Furthermore, since stability is not preserved under approximate balancing, there may be unstable eigenvalues in the resulting model which will need to be removed, resulting in a lower order model. The presence of unstable states was also noted by other researchers in [11, 38]. Fig. 11 shows the resulting model order for three values of requested model order, as a function of the LR-ADI iterations. Note that the 40th order model attains its full size after about 200 iterations of LR-ADI, meaning that after 200 iterations the gramians have numerical rank sufficient to generate a 40th order model. The 80th order model has a similar behavior, reaching its full size after about 500 iterations. For the 120th order model, the gramian has insufficient rank even after 600 iterations.
Figure 11: Resulting model order for given requested orders [plot "Resulting Model Order": model order versus LR-ADI iterations; curves: 40, 80, and 120 states requested]
We next assess the model reduction error as a function of the iteration in LR-ADI. The requested order is chosen to be 120. The $H_2$ and $H_\infty$ norms and time-domain $L_2$ and $L_\infty$ norms of the impulse response of the error system are shown in Fig. 12. These represent the best results obtained after testing several different shift parameter sequences. Note that for the most part, the error norms decay monotonically in iterations.
Figure 12: Model reduction error metrics - Beam Model [panels "Time-domain error norms" (L2 norm, Linf norm) and "Frequency-domain error norms" (H2 norm, Hinf norm), versus LR-ADI iterations]
Fig. 13 shows the error norms as a function of the model order for both the exact and approximate balanced truncation methods. We have observed that occasionally the approximate balancing method produces better approximations to the original system than the exact method. This is due to the numerical inaccuracies present in the solution of the Lyapunov equations, especially for high order stiff systems.
Figure 13: Comparison of approximate balancing to exact balancing [panels: L2, Linf, H2, and Hinf error norm versus model order; curves: Exact, Approximate]
Fig. 14 shows the execution time of LR-ADI and the LR-square root algorithms. We have used the fully populated $A$ in all the computations. Therefore, the computation is linear in $k$ and quadratic in $n$. If tridiagonalization is first performed, then the computation load grows linearly in $n$. The computation load comparison is summarized in Table 3.

Structure      Exact    ADI      LR-ADI
Full A         O(n^3)   O(n^3)   O(n^2)
Tridiagonal A  O(n^3)   O(n^2)   O(n)

Table 3: Computational requirements for generating reduced-order models
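To make the methods compared in Table 3 concrete, the following sketch implements the full-matrix ADI iteration (4.8), the factor update (4.15), and the low-rank square root reduction (5.1)-(5.3) with dense NumPy operations. It is a small-scale illustration under stated assumptions, not the Lyapack implementation used for the results: the 4-state system, the two real shifts, and all function names are hypothetical, and the factor update multiplies every stored column at each step, so it does not realize the cost savings of the more efficient update (4.17).

```python
import numpy as np

def shifted(A, B, p):
    """A_p and B_p from (4.3) for a real shift p < 0."""
    n = A.shape[0]
    M = np.linalg.inv(p * np.eye(n) + A)
    return M @ (p * np.eye(n) - A), np.sqrt(-2.0 * p) * M @ B

def adi(A, B, shifts, iters):
    """Full-matrix ADI iteration (4.8); the shifts are recycled."""
    P = np.zeros_like(A)
    for j in range(iters):
        Ap, Bp = shifted(A, B, shifts[j % len(shifts)])
        P = Ap @ P @ Ap.T + Bp @ Bp.T
    return P

def lr_adi(A, B, shifts, iters):
    """Factor update (4.15): Z_j = [A_pj Z_{j-1}, B_pj], so that P_j = Z_j Z_j^T."""
    Z = np.zeros((A.shape[0], 0))
    for j in range(iters):
        Ap, Bp = shifted(A, B, shifts[j % len(shifts)])
        Z = np.hstack([Ap @ Z, Bp])
    return Z

def lr_square_root(A, B, C, D, ZP, ZQ, r):
    """Low-rank square root reduction (5.1)-(5.3), keeping r states."""
    U, sig, Vt = np.linalg.svd(ZP.T @ ZQ)
    T1 = ZP @ U[:, :r] / np.sqrt(sig[:r])
    T2 = ZQ @ Vt[:r, :].T / np.sqrt(sig[:r])
    return T2.T @ A @ T1, T2.T @ B, C @ T1, D

# Hypothetical stable 4-state example with two recycled real shifts.
A = np.array([[-1.0, 1.0, 0.0, 0.0],
              [0.0, -2.0, 1.0, 0.0],
              [0.0, 0.0, -5.0, 1.0],
              [0.0, 0.0, 0.0, -10.0]])
B = np.ones((4, 1))
C = np.array([[1.0, 0.0, 0.0, 1.0]])
D = np.array([[0.0]])
shifts = [-1.5, -6.0]

P = adi(A, B, shifts, 40)
ZP = lr_adi(A, B, shifts, 40)
ZQ = lr_adi(A.T, C.T, shifts, 40)      # observability factor via the dual system
Ar, Br, Cr, Dr = lr_square_root(A, B, C, D, ZP, ZQ, 2)
```

The observability factor is obtained by applying the same routine to the dual pair `(A.T, C.T)`. The requested order `r` must not exceed the numerical rank of `ZP.T @ ZQ`, which corresponds to the situation described above where the resulting model order can be smaller than the requested one.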
Figure 14: Model reduction times [panels "LR-ADI Time" and "LR-SQRT Balance Time": time (sec) versus LR-ADI iterations]
In general, we observe satisfactory performance of the LR-ADI algorithm with low-rank square root balancing in terms of the modeling error. The LR-ADI algorithm quickly converges to the dominant eigenvectors of the gramians in the numerical example, resulting in reduced order systems whose characteristics closely match those of the original system.
6.1.1. Choice of Shift Parameters
If we know the eigenvalues of the system, we could choose the shift parameters to be the same as the eigenvalues, but the LR-ADI method will take many steps to converge. Since we only want to iterate a small number of steps, the number of shift parameters is limited, and they should approximate the spectrum of the system. We have implemented the shift parameter selection methods by Wachspress [41] and Penzl [11], but the best results were obtained by a heuristic selection procedure that chooses parameters that cover the range of the system eigenvalues, shown in Fig. 15.
If a large number of shift parameters are used, it takes many iterations to cycle through the parameters. If the shift parameters are few and far apart (to cover the spectrum of $A$), the convergence will also be slow since the spectral radius of $A_{p_j}$ cannot be made small. In general, the number of shift parameters and their locations have to be carefully tuned to obtain the best convergence for a given problem. We have chosen 10 purely real shift parameters, which we found to be a good trade-off between convergence accuracy and convergence rate. We have also noted that the use of complex conjugate parameters (nearer to the system eigenvalues) does not appear to be advantageous. Using 10 complex conjugate pairs with real parts equal to those used in our work results in a slower convergence than the purely real parameters.
Figure 15: Eigenvalues and shift parameters for beam model [plot "Eigenvalues and Shift Parameters": imaginary part versus real part; markers: Eigenvalue, Shift Param]
7. CONCLUSION
Large scale dynamical systems arising from the finite element method can often be reduced significantly, since their input-output behavior is dominated by only a small number of internal states. We presented in this paper an overview of the theory and application of model reduction methods based on the input-state and state-output coupling. Among the methods reviewed, balanced truncation is the most attractive as it has a guaranteed $H_\infty$ error bound and produces a reduced-order model that captures well the dominant input-output behavior in both time and frequency domains.
A key step in balanced truncation is the solution of two Lyapunov equations, which is computationally intensive and plagued by numerical difficulties for large systems. We presented several iterative methods to generate approximately-balanced reduced order models for large systems. Among them, the best choice is the low-rank ADI method which balances the computational load and memory storage requirement. However, its effective use requires the selection of a set of shift parameters. The result from the low-rank ADI method can be directly used in the low-rank square root method to generate a low order approximate model. A 900-state numerical example is included to show the effectiveness of these methods.
The model reduction algorithms presented here are for LTI systems only, but models of physical systems are invariably nonlinear. However, the gramian concept central to the balanced truncation method may be generalized to nonlinear systems and used for their order reduction [9, 42]. Many physical models also lack damping, such as in molecular dynamics. In this case, finite time model reduction using the same balanced truncation idea could be applied [43].
References
[1] R.J. Guyan. Reduction of mass and stiffness matrices. American Institute of Aeronautics and Astronautics Journal, 3(2):380, 1965.
[2] J.C. O'Callahan. A procedure for an improved reduced system (IRS) model. In Proceedings of the 6th International Modal Analysis Conference, pages 17-21, Las Vegas, NV, 1989.
[3] T.I. Zohdi, J.T. Oden, and G.J. Rodin. Hierarchical modeling of heterogeneous bodies. Comp. Meth. Applied Mech Engineering, 138:273-298, 1996.
[4] K.J. Bathe, N.S. Lee, and M.L. Bucalem. On the use of hierarchical models in engineering analysis. Comp. Meth. Appl. Mech. Eng., 82:5-26, 1990.
[5] A.C. Cangellaris and L. Zhao. Model order reduction techniques for electromagnetic macro-modelling based on finite methods. Int. Journal of Numerical Model, 13:181-197, 2000.
[6] A.C. Cangellaris, M. Celik, S. Pasha, and L. Zhao. Electromagnetic model order reduction for system-level modeling. IEEE Trans. Microwave Theory and Techniques, 47:840-850, 1999.
[7] J. Rubio, J. Arroyo, and J. Zapata. SFELP: An efficient methodology for microwave circuit analysis. IEEE Journal on Microwave Theory and Techniques, 49(3):509-516, March 2001.
[8] J. Tinsley Oden and K. Vemaganti. Estimation of local modeling error and goal-oriented adaptive modeling of heterogeneous materials; part I: Error estimates and adaptive algorithms. J. Comp. Physics, 164:22-47, 2000.
[9] A.C. Antoulas, D.C. Sorensen, and S. Gugercin. A survey of model reduction methods for large-scale systems. In Structured Matrices in Operator Theory, Numerical Analysis, Control, Signal and Image Processing. AMS, 2001.
[10] P. Benner. Solving large-scale control problems. IEEE Control Systems Magazine, 24(1):44-59, February 2004.
[11] T. Penzl, Algorithms for model reduction of large dynamical systems, T.U. Chemnitz, Germany, Technical Report, 1999.
[12] V. Balakrishnan, Q. Su, and C-K. Koh, Efficient balance-and-truncate model reduction for large scale systems, Proceedings of the American Control Conference, Arlington, VA, 2001.
[13] J-R. Li and J. White, Efficient model reduction of interconnects via approximate system gramians, IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, 1999.
[14] T. Penzl, Lyapack - A MATLAB Toolbox for large Lyapunov and Riccati equations, model reduction problems, and linear-quadratic optimal control problems. Available from http://www.tu-chemnitz.de/sfb393/lyapack/
[15] L. Ljung. System Identification: Theory for the User. Prentice-Hall, 1987.
[16] B. De Moor, P. Van Overschee, and W. Favoreel. Numerical algorithms for subspace state space system identification - an overview. In Biswa Datta, editor, Birkhauser Book Series on Applied and Computational Control, Signals and Circuits, pages 247-311. Birkhauser, 1999.
[17] T. Stykel, Model reduction of descriptor systems, Institut für Mathematik, Technische Universität Berlin, Berlin, Germany, Technical Report 720-01, Dec. 2001.
[18] T. Penzl, Numerical solution of generalized Lyapunov equations, Advances in Computational Mechanics, vol. 8, pp. 33-48, 1998.
[19] K. Zhou, J.C. Doyle, and K. Glover. Robust and Optimal Control. Prentice-Hall, 1996.
[20] C.A. Desoer and M. Vidyasagar. Feedback Systems: Input-Output Properties. Academic Press, New York, 1975.
[21] R. Bartels and G. Stewart, Algorithm 432, solution of the matrix equation AX + XB = C, Comm. As. Computer Machinery, vol. 15, pp. 820-826, 1972.
[22] S. Hammarling, Numerical solution of the stable, non-negative definite Lyapunov equation, IMA Journal of Numerical Analysis, vol. 2, pp. 303-323, 1982.
[23] P. V. Kokotovic, R. E. O'Malley, P. Sannuti, Singular Perturbations and Order Reduction in Control Theory - an Overview, Automatica, vol. 12, pp. 123-132, 1976.
[24] B.C. Moore, Principal component analysis in linear systems: controllability, observability, and model reduction, IEEE Transactions on Automatic Control, vol. AC-26, pp. 17-32, 1981.
[25] M. Tombs and I. Postlethwaite, Truncated balanced realization of stable, non-minimal state-space systems, International Journal of Control, vol. 46, pp. 1319-1330, 1987.
[26] A.C. Antoulas, D.C. Sorensen, and Y. Zhou, On the decay rate of Hankel singular values and related issues. Rice University, Houston, Texas, Technical Report, 2002.
[27] U.B. Desai and D. Pal. A transformation approach to stochastic model reduction. IEEE Transactions on Automatic Control, 29(12):1097-1100, December 1984.
[32] B. Le Bailly and J.P. Thiran, Optimal rational functions for the generalized Zolotarev problem in the complex plane, SIAM Journal of Numerical Analysis, vol. 38, no. 5, pp. 1409-1424, 2000.
[33] G. Starke, Fejer-Walsh points for rational functions and their use in the ADI iterative method, Journal of Computational and Applied Mathematics, vol. 46, pp. 129-141, 1993.
[34] G. Starke, Optimal alternating direction implicit parameters for nonsymmetric systems of linear equations, SIAM Journal of Numerical Analysis, vol. 28, no. 5, pp. 1431-1445, 1991.
[35] J-R. Li and J. White, Low rank solutions of Lyapunov equations, SIAM Journal Matrix Anal. Appl., vol. 24, no. 1, pp. 260-280, 2002.
[36] J-R Li, Model reduction of large linear systems via low rank system gramians. Ph.D. dissertation, Massachusetts Institute of Technology, 2000.
[37] W. Gressick. A comparative study of order reduction methods for finite element models. Master's thesis, Rensselaer Polytechnic Institute, Troy, NY, December 2003.
[38] T. Penzl, A cyclic low-rank Smith method for large sparse Lyapunov equations, SIAM Journal of scientific computation, vol. 21, pp. 139-144, 2000.
[39] S. Gugercin, D.C. Sorensen, and A.C. Antoulas, A modified low-rank Smith method for large-scale Lyapunov equations, Numerical Algo-
rithms, 32(1), pp.27-55, Jan., 2003.
[28] M. Green. A relative-error bound for balanced
stochastic truncation. IEEE Trans. Automat. Con- [40] J. Fish and W. Chen, Modeling and Simulation
of Piezocomposites, Comp. Meth. Appl. Mech.
trol, 33(10):961965, 1988.
Engng., Vol. 192, pp. 3211-3232, 2003.
[29] K. Glover, All optimal Hankel norm approximations of linear multivariable systems and [41] E. Wachspress. The ADI Minimax Problem for
Complex Spectra. In Iterative Methods for Large
their L bound, Int. Journal of Control, vol. 39,
Linear Systems, D. Kincaid and L. Hayes, Ed.
pp. 1115-1193, 1984.
New York: Academic Press, pp. 251-271, 1990.
[30] G. Birkhoff, R. Varga, and D. Young. Alternating direction implicit methods, in Advances in [42] S. Lall, J. E. Marsden, and S. Glavaski. Empirical model reduction of controlled nonlinear
Computers, Vol. 3, New York: Academic Press,
systems. In In Proceedings of the IFAC World
pp.189-273, 1962.
Congress, Volume F, pages 473478, 1999.
[31] E. Wachspress, Iterative solution of the Lyapunov matrix equation, Applied Mathematics [43] M. Barahona, A.C. Doherty, M. Sznaier,
H. Mabuchi, and J.C. Doyle. Finite horizon
Letters, vol. 1, pp.87-90, 1988.
20
model reduction and the appearance of dissipation in Hamiltonian systems. In IEEE Conference on Decision and Control, pages 45634568,
December 2002.
21

