Rémy Boyer, Roland Badeau, Gérard Favier: Fast Orthogonal Decomposition of Volterra Cubic Kernels Using Oblique Unfolding

FAST ORTHOGONAL DECOMPOSITION OF VOLTERRA CUBIC KERNELS USING
OBLIQUE UNFOLDING
Rémy Boyer
Université Paris XI (UPS)
LSS-Supélec, CNRS
remy.boyer@lss.supelec.fr
Roland Badeau
Télécom ParisTech
CNRS LTCI
rbadeau@enst.fr
Gérard Favier
Université Nice Sophia Antipolis
I3S, CNRS
favier@i3s.unice.fr
ABSTRACT
Discrete-time Volterra modeling is a central topic in many application areas, and a large class of nonlinear systems can be modeled using high-order Volterra series. The problem with Volterra series is that the number of parameters grows very rapidly with the order of the nonlinearity and the memory of the system. To implement this model efficiently, kernel eigen-decomposition can be used in the context of a Parallel-Cascade realization of a Volterra system, which makes the multilinear SVD (HOSVD) a natural tool for decomposing high-order Volterra kernels. In this paper, we propose to drastically reduce the computational cost of the HOSVD by (1) considering the symmetrized Volterra kernel and (2) exploiting the column-redundancy of the associated mode by using an oblique unfolding of the Volterra kernel. Whereas the full HOSVD of a cubic ($I \times I \times I$) unstructured Volterra kernel needs $12I^4$ flops, our solution reduces the complexity to $2I^4$ flops, which leads to a gain equal to six for a sufficiently large size $I$.
Keywords: Volterra kernel, fast HOSVD, oblique unfolding
1. INTRODUCTION
Specific applications that need nonlinear structures are encountered in many different areas, in particular in mobile communications, image processing, and geophysical and biomedical signal processing. Volterra series [1] can be used to represent a broad class of nonlinearities. There exists a plethora of identification techniques adapted to the Volterra model (see [2–5]). The main drawback is the huge number of parameters needed to characterize the Volterra kernel. There are different ways to reduce the parametric complexity of Volterra models, and a major challenge is to decompose the Volterra kernel efficiently. A first approach is to expand this kernel using orthonormal basis functions such as Laguerre functions. Another approach consists in representing the Volterra model in a parallel-cascade form resulting from the singular value decomposition of an unfolded matrix representation of the kernel. In this class of methods, we can find two families: the matrix-based approach [6] and the tensor-based approaches [7, 8]. In the second family, tensor-based decompositions are used, for instance the PARAFAC-Volterra [8] and the multilinear SVD (HOSVD) [7, 9]. More precisely, we extend the work initiated in references [7, 8] by exploiting the column-redundancies existing in the mode of symmetric or symmetrized Volterra kernels [10]. This strongly reduces the computational cost of the parallel-cascade realization (PCR) of the Volterra model.
This project is funded by both the Région Île-de-France and the Digiteo Research Park.
2. VOLTERRA MODELS AND EIGEN DYNAMIC MODES
(EDM)
2.1. Definition of the model
Volterra series constitute a model for systems which yield generalized Taylor series expansions. The input/output relationship for a discrete-time time-invariant nonlinear causal system can be expressed as

$$y(n) = \sum_{m=1}^{M} y_m(n) = \sum_{m=1}^{M} \langle \mathcal{H}_m^{*}, \mathcal{X}(n) \rangle \qquad (1)$$

where $^{*}$ denotes complex conjugation, $y(n)$ is the system's output for the $n$-th discrete observation, $M$ is the order of the Volterra model and the nonlinearity degree, and $\langle \cdot, \cdot \rangle$ stands for the inner product. The data tensor for the $n$-th observation is given by $\mathcal{X}(n) = \mathbf{x}(n) \circ \cdots \circ \mathbf{x}(n)$, where $\circ$ stands for the outer product, and $\mathbf{x}(n) = [x(n)\ x(n-1)\ \ldots\ x(n-I+1)]^T$, in which $^T$ stands for matrix transposition. In model (1), the ($I \times \cdots \times I$) tensor $\mathcal{H}_m$ is called the $m$-dimensional Volterra kernel and describes the dynamics of the system. $I$ is the memory length of the $m$-th order homogeneous term $y_m(n)$. Expanding the inner product, we obtain

$$\langle \mathcal{H}_m^{*}, \mathcal{X}(n) \rangle = \sum_{k_1,\ldots,k_m=0}^{I-1} [\mathcal{H}_m]_{k_1 \ldots k_m} [\mathcal{X}(n)]_{k_1 \ldots k_m} = \sum_{k_1,\ldots,k_m=0}^{I-1} [\mathcal{H}_m]_{k_1 \ldots k_m}\, x(n-k_1) \cdots x(n-k_m)$$

which represents a multidimensional convolution of the input signal $x(n)$ with the $m$-th order Volterra kernel. When $M = 3$, the above model is called a cubic Volterra model. In the sequel, we often consider this case, but the presented results can be straightforwardly generalized to $M > 3$.
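
As an illustration of the multidimensional convolution above, the following minimal NumPy sketch evaluates the cubic term $y_3(n)$ directly from a kernel array. The function name `cubic_volterra_output` and the zero initial conditions ($x(n) = 0$ for $n < 0$) are assumptions made for this sketch; they are not taken from the paper.

```python
import numpy as np

def cubic_volterra_output(H3, x):
    """Cubic homogeneous term y_3(n) for n = 0, ..., len(x) - 1.

    H3 : (I, I, I) array, the cubic Volterra kernel.
    x  : 1-D input signal, assumed to be zero before n = 0.
    """
    x = np.asarray(x)
    I = H3.shape[0]
    # Zero-pad the past so that x(n - k) is defined for n < k.
    xp = np.concatenate([np.zeros(I - 1, dtype=x.dtype), x])
    y = np.empty(len(x), dtype=np.result_type(H3, x))
    for n in range(len(x)):
        # Regressor x(n) = [x(n), x(n-1), ..., x(n-I+1)]^T
        xn = xp[n:n + I][::-1]
        # sum over k1, k2, k3 of [H3]_{k1 k2 k3} x(n-k1) x(n-k2) x(n-k3)
        y[n] = np.einsum("abc,a,b,c->", H3, xn, xn, xn)
    return y
```

The direct evaluation costs on the order of $I^3$ operations per output sample, which is the parametric burden that the decompositions discussed in the sequel aim to reduce.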
2.2. Multilinear SVD (HOSVD) for cubic Volterra models
2.2.1. Definition of the HOSVD
Every complex ($I \times I \times I$)-tensor $\mathcal{H}_3$ can be written as the product [9]

$$\mathcal{H}_3 = \mathcal{Q} \times_1 \mathbf{U}_1 \times_2 \mathbf{U}_2 \times_3 \mathbf{U}_3, \qquad (2)$$

in which $\times_m$ denotes the $m$-mode product [9], $\mathbf{U}_1$, $\mathbf{U}_2$ and $\mathbf{U}_3$ are unitary matrices and $\mathcal{Q}$ is an all-orthogonal and ordered complex tensor. This decomposition is a generalization of the matrix SVD because the diagonality of the matrix containing the singular values, in the matrix case, is a special case of all-orthogonality. Also, the HOSVD of a second-order tensor (matrix) yields the matrix SVD, up to trivial indeterminacies. For the third-order tensor $\mathcal{H}_3$, the $I \times I^2$ unfolded matrix representations can be obtained as

$$[\mathbf{H}_1]_{k_1,\, k_3 I + k_2} = [\mathcal{H}_3]_{k_1 k_2 k_3},$$
$$[\mathbf{H}_2]_{k_2,\, k_3 I + k_1} = [\mathcal{H}_3]_{k_1 k_2 k_3},$$
$$[\mathbf{H}_3]_{k_3,\, k_1 I + k_2} = [\mathcal{H}_3]_{k_1 k_2 k_3}.$$
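An alternative (but equivalent) definition with respect to the $k_3$-th frontal slice, denoted by $\mathbf{H}_{k_3} = [\mathcal{H}_3]_{:,:,k_3}$, is

$$\mathbf{H}_1 = \mathrm{unfold}_1\{\mathcal{H}_3\} = \begin{bmatrix} \mathbf{H}_0 & \ldots & \mathbf{H}_{I-1} \end{bmatrix},$$
$$\mathbf{H}_2 = \mathrm{unfold}_2\{\mathcal{H}_3\} = \begin{bmatrix} \mathbf{H}_0^T & \ldots & \mathbf{H}_{I-1}^T \end{bmatrix},$$
$$\mathbf{H}_3 = \mathrm{unfold}_3\{\mathcal{H}_3\} = \begin{bmatrix} \mathrm{vec}(\mathbf{H}_0^T) & \ldots & \mathrm{vec}(\mathbf{H}_{I-1}^T) \end{bmatrix}^T$$

where, inside the brackets, the subscripts $0, \ldots, I-1$ index the frontal slices, and $\mathrm{vec}(\cdot)$ creates an $I^2 \times 1$ vector from an $I \times I$ matrix by stacking the column vectors of this matrix below one another. For a general definition of unfolded matrices in higher dimensions, the interested reader can see [9], for instance. The matrix of $m$-mode singular vectors, $\mathbf{U}_m$, can be found as the matrix of left singular vectors of the unfolded matrix representation $\mathbf{H}_m$.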
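
The three index maps above can be sketched with array transpositions followed by a row-major reshape. The helper name `unfold` and the use of 0-based array indices are choices made for this sketch only.

```python
import numpy as np

def unfold(H, mode):
    """Unfolding of an (I, I, I) array H[k1, k2, k3] along mode 1, 2 or 3.

    mode 1: entry [k1, k3*I + k2]
    mode 2: entry [k2, k3*I + k1]
    mode 3: entry [k3, k1*I + k2]
    """
    I = H.shape[0]
    if mode == 1:
        # rows k1; columns ordered with k3 slow and k2 fast
        return H.transpose(0, 2, 1).reshape(I, I * I)
    if mode == 2:
        # rows k2; columns ordered with k3 slow and k1 fast
        return H.transpose(1, 2, 0).reshape(I, I * I)
    if mode == 3:
        # rows k3; columns ordered with k1 slow and k2 fast
        return H.transpose(2, 0, 1).reshape(I, I * I)
    raise ValueError("mode must be 1, 2 or 3")
```

With this convention, `unfold(H, 1)` is the horizontal concatenation of the frontal slices `H[:, :, k]`, in agreement with the slice-based definition.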
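Based on the unitary matrices $\mathbf{U}_1$, $\mathbf{U}_2$ and $\mathbf{U}_3$, the core tensor is given by

$$\mathcal{Q} = \mathcal{H}_3 \times_1 \mathbf{U}_1^H \times_2 \mathbf{U}_2^H \times_3 \mathbf{U}_3^H \qquad (3)$$

where $^H$ denotes the conjugate transpose of a matrix.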
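
A minimal sketch of (2) and (3) for a third-order array is given below. It relies on `numpy.linalg.svd` rather than on the GR-SVD routine considered in this paper, and the helper names `mode_product` and `hosvd` are hypothetical. The column ordering of each unfolding is irrelevant here, since it does not change the left singular vectors.

```python
import numpy as np

def mode_product(T, A, mode):
    """m-mode product T x_m A, with a 0-based mode index."""
    # Contract the columns of A with axis `mode` of T,
    # then move the resulting axis back to position `mode`.
    return np.moveaxis(np.tensordot(A, T, axes=(1, mode)), 0, mode)

def hosvd(H):
    """Core tensor Q and factor matrices [U1, U2, U3] of a 3-way array."""
    Us = []
    for m in range(3):
        # Unfolding whose rows are indexed by k_m
        Hm = np.moveaxis(H, m, 0).reshape(H.shape[m], -1)
        U, _, _ = np.linalg.svd(Hm, full_matrices=False)
        Us.append(U)
    Q = H
    for m, U in enumerate(Us):
        Q = mode_product(Q, U.conj().T, m)   # eq. (3)
    return Q, Us
```

Applying `mode_product` with $\mathbf{U}_1$, $\mathbf{U}_2$, $\mathbf{U}_3$ to the returned core tensor rebuilds the original array, as stated by (2).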
2.2.2. Complexity in flops
The computational costs presented in this paper are related to the flop (floating point operation) count. For example, a dot product of $I$-dimensional vectors involves approximately $2I$ flops ($I$ multiplications plus $I-1$ additions). Using the GR-SVD method [11], the complexity of the full HOSVD of an ($I \times I \times I$)-tensor is evaluated to $12I^4$ flops [10].
2.3. Eigen dynamic modes (EDM) and Parallel-Cascade realization
Plugging the HOSVD given in (2) into the model (1), we obtain

$$y_3(n) = \sum_{k_1,k_2,k_3=0}^{I-1} \underbrace{q_{k_1 k_2 k_3}\, w_{k_1}(n)\, w_{k_2}(n)\, w_{k_3}(n)}_{y_{k_1 k_2 k_3}(n)}$$

in which $q_{k_1 k_2 k_3} = [\mathcal{Q}]_{k_1 k_2 k_3}$ and $w_{k_m}(n) = \langle \mathbf{u}_{k_m}^{*}, \mathbf{x}(n) \rangle = \mathbf{u}_{k_m}^T \mathbf{x}(n)$, where $\mathbf{u}_{k_m} = [\mathbf{U}_m]_{k_m}$ is the $k_m$-th column of matrix $\mathbf{U}_m$. According to this expression, a PCR of the Volterra model is obtained as the parallelization and sum of the following block diagram:
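[Block diagram: the input $\mathbf{x}(n)$ feeds three parallel linear filters $\mathbf{u}_{k_1}^T \mathbf{x}(n)$, $\mathbf{u}_{k_2}^T \mathbf{x}(n)$ and $\mathbf{u}_{k_3}^T \mathbf{x}(n)$; their outputs $w_{k_1}(n)$, $w_{k_2}(n)$ and $w_{k_3}(n)$ are multiplied together with the gain $q_{k_1 k_2 k_3}$ to produce $y_{k_1 k_2 k_3}(n)$.]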
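
The equivalence between the parallel-cascade form and the direct kernel evaluation can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names `hosvd3`, `pcr_output` and `direct_output` are hypothetical, and `numpy.linalg.svd` stands in for the GR-SVD method.

```python
import numpy as np

def hosvd3(H):
    """HOSVD of a 3-way array: core tensor Q and factors [U1, U2, U3]."""
    # U_m: left singular vectors of an unfolding whose rows are indexed by k_m
    Us = [np.linalg.svd(np.moveaxis(H, m, 0).reshape(H.shape[m], -1))[0]
          for m in range(3)]
    # Core tensor, eq. (3): Q = H x1 U1^H x2 U2^H x3 U3^H
    Q = np.einsum("abc,ai,bj,ck->ijk", H,
                  Us[0].conj(), Us[1].conj(), Us[2].conj())
    return Q, Us

def pcr_output(Q, Us, xn):
    """Sum of all parallel branches y_{k1 k2 k3}(n) for one regressor x(n)."""
    # Branch filters: w_{k_m}(n) = u_{k_m}^T x(n), one bank per mode
    w1, w2, w3 = (U.T @ xn for U in Us)
    # Each branch multiplies three filter outputs and applies q_{k1 k2 k3}
    return np.einsum("ijk,i,j,k->", Q, w1, w2, w3)

def direct_output(H, xn):
    """Direct evaluation of the cubic term from the kernel entries."""
    return np.einsum("abc,a,b,c->", H, xn, xn, xn)
```

In practice, the interest of the PCR lies in discarding the branches whose gain $q_{k_1 k_2 k_3}$ is negligible; the sketch keeps all of them so that both outputs agree up to rounding errors.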
3. FAST HOSVD FOR THE CUBIC VOLTERRA KERNEL
Our method is based on the exploitation of the symmetry of the Volterra kernel. A kernel is said to be symmetric if its indices can be interchanged without affecting its value. More precisely, an $m$-th order tensor $\mathcal{S}$ which is unchanged by any permutation $\pi$ of its indices is called a symmetric tensor: $\forall k_1, \ldots, k_m \in \{0, \ldots, I-1\}$, $[\mathcal{S}]_{k_{\pi(1)} \ldots k_{\pi(m)}} = [\mathcal{S}]_{k_1 \ldots k_m}$. In practice, two situations arise; they are explained in the next section.
3.1. On the symmetry for the Volterra kernel
3.1.1. Example of a symmetric kernel
The kernel of the Volterra model associated with the Wiener-Hammerstein model is already symmetric. Specifically, this model is formed of a polynomial defined by the coefficients $\{c_1, \ldots, c_m\}$, enclosed between two linear filters of impulse responses $r(\cdot)$ and $g(\cdot)$, with memories $M_r$ and $M_g$ respectively. The associated Volterra kernel has been derived in [12]:

$$[\mathcal{H}_m]_{k_1, \ldots, k_m} = c_m \sum_{i=0}^{M_g - 1} g(i) \prod_{u=1}^{m} r(k_u - i),$$

where $k_1, \ldots, k_m \in \{0, \ldots, M_v\}$ and $M_v = M_r + M_g - 1$ is the memory of the nonlinear plant, which is assumed to be known. If $M_r = 1$, a Hammerstein model is selected, whereas $M_g = 1$ corresponds to a Wiener model. It is clear that $\mathcal{H}_m$ is a symmetric tensor since the product $r(k_1 - i)\, r(k_2 - i) \cdots r(k_m - i)$ is invariant under permutation of the indices.
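
The kernel formula above can be sketched as a sum of symmetric rank-one terms, one per tap of $g(\cdot)$. The function name `wiener_hammerstein_kernel` is hypothetical, $r(j)$ is assumed to vanish outside $0 \le j < M_r$, and the sketch stores only the $M_r + M_g - 1$ lags over which the kernel can be nonzero.

```python
import numpy as np

def wiener_hammerstein_kernel(c_m, g, r, m=3):
    """m-th order Volterra kernel of a Wiener-Hammerstein model.

    c_m : polynomial coefficient of degree m
    g   : impulse response of length Mg
    r   : impulse response of length Mr (taken as zero outside 0..Mr-1)
    """
    g = np.asarray(g, dtype=float)
    r = np.asarray(r, dtype=float)
    Mg, Mr = len(g), len(r)
    Mv = Mr + Mg - 1
    H = np.zeros((Mv,) * m)
    for i in range(Mg):
        # Shifted response k -> r(k - i)
        ri = np.zeros(Mv)
        ri[i:i + Mr] = r
        # Symmetric rank-one term ri o ri o ... o ri (m factors)
        term = ri
        for _ in range(m - 1):
            term = np.multiply.outer(term, ri)
        H += g[i] * term
    return c_m * H
```

Each term is an outer product of a vector with itself, so the symmetry of the resulting array is apparent from the construction. With `r = [1.0]` (Hammerstein case), only the diagonal entries `H[k, k, k]` are nonzero.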
3.1.2. Existence of a symmetrized kernel
If the kernel is general, in the sense that there are no symmetry relations between its entries, there always exists an associated symmetrized kernel computed according to

$$[\mathcal{S}_m]_{k_1 \ldots k_m} = \frac{\gamma}{m!} \sum_{\pi \in P} [\mathcal{H}_m]_{k_{\pi(1)} \ldots k_{\pi(m)}} \qquad (4)$$

where $!$ stands for the factorial, i.e., $m! = m(m-1) \cdots 2 \cdot 1$, and $P$ is the permutation set of cardinal $m!/\gamma$ with $\gamma = n_1! \cdots n_r!$, where $r$ is the number of distinct values in the set $\{k_1, \ldots, k_m\}$ and $n_1, \ldots, n_r$ are the numbers of occurrences of each index value. It follows that, for any data tensor $\mathcal{X}(n)$ of the form given in Section 2.1, $\langle \mathcal{H}_m^{*}, \mathcal{X}(n) \rangle = \langle \mathcal{S}_m^{*}, \mathcal{X}(n) \rangle$. The complexity of this operation is $(1 + m!/\gamma) I^m$ flops.
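
A compact way to sketch the symmetrization is to average the kernel over all $m!$ permutations of its axes. This is an assumption-level simplification of (4): averaging over all permutations gives the same result as averaging over the distinct ones, because every distinct index arrangement is then counted the same number of times. The function name `symmetrize` is hypothetical.

```python
import numpy as np
from itertools import permutations

def symmetrize(H):
    """Symmetrized version of an m-way array with equal dimensions."""
    perms = list(permutations(range(H.ndim)))
    # Average of H over every permutation of its axes
    return sum(H.transpose(p) for p in perms) / len(perms)
```

This sketch performs $m!$ additions per entry regardless of repeated indices, so it does not reproduce the flop count quoted above; it is only meant to make the definition concrete.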
3.2. Orthogonal tensor decomposition for a symmetric tensor
In the sequel, we consider the third-order Volterra model (M = 3)
but the proposed method can be easily extended to higher orders.
In the following developments, we no longer mention index m in
order to simplify the notations.
3.2.1. Multilinear SVD (HOSVD) for symmetric tensors
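As a special case of the general definition of the HOSVD given in Section 2.2.1, every complex symmetric ($I \times I \times I$)-tensor $\mathcal{S}$ can be written as the product

$$\mathcal{S} = \mathcal{Q} \times_1 \mathbf{U} \times_2 \mathbf{U} \times_3 \mathbf{U}, \qquad (5)$$

in which $\mathbf{U}$ is a unitary matrix and $\mathcal{Q}$ is an all-orthogonal and ordered complex tensor.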
3.2.2. Unfolding of a third-order symmetrized/symmetric tensor
The three modes of a third-order symmetric tensor are all equal and are given by

$$\mathbf{S} = \mathrm{unfold}_1\{\mathcal{S}\} = \mathrm{unfold}_2\{\mathcal{S}\} = \mathrm{unfold}_3\{\mathcal{S}\}. \qquad (6)$$

Thus, $\mathbf{S}$ is a fat $I \times I^2$ matrix (more columns than rows) and, as we show in the next section, it is column redundant [10].
3.3. Column-redundancy of the mode for a symmetrized or
symmetric tensor
3.3.1. Definition of the compressed mode
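Let us begin with an example. Let $\mathcal{H}$ be a $2 \times 2 \times 2$ tensor defined by its frontal slices

$$[\mathcal{H}]_{:,:,0} = \begin{bmatrix} 4 & -2 \\ 9 & -2 \end{bmatrix}, \qquad [\mathcal{H}]_{:,:,1} = \begin{bmatrix} 1 & 0 \\ 5 & 7 \end{bmatrix}.$$

The modes (unfolded matrices) are given by

$$\mathbf{H}_1 = \mathrm{unfold}_1\{\mathcal{H}\} = \begin{bmatrix} 4 & -2 & 1 & 0 \\ 9 & -2 & 5 & 7 \end{bmatrix},$$
$$\mathbf{H}_2 = \mathrm{unfold}_2\{\mathcal{H}\} = \begin{bmatrix} 4 & 9 & 1 & 5 \\ -2 & -2 & 0 & 7 \end{bmatrix},$$
$$\mathbf{H}_3 = \mathrm{unfold}_3\{\mathcal{H}\} = \begin{bmatrix} 4 & -2 & 9 & -2 \\ 1 & 0 & 5 & 7 \end{bmatrix}.$$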
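Note that tensor $\mathcal{H}$ has no particular structure. This is also true for the modes. Now, compute the associated symmetrized tensor $\mathcal{S}$ defined according to (4), where the cardinal of the permutation set is $3!/(1!\,2!) = 3$ for the entries with one repeated index. Its frontal slices are

$$[\mathcal{S}]_{:,:,0} = \begin{bmatrix} 4 & \tfrac{8}{3} \\ \tfrac{8}{3} & 1 \end{bmatrix}, \qquad [\mathcal{S}]_{:,:,1} = \begin{bmatrix} \tfrac{8}{3} & 1 \\ 1 & 7 \end{bmatrix}.$$

Then, its single mode is given by

$$\mathbf{S} = \begin{bmatrix} 4 & \tfrac{8}{3} & \tfrac{8}{3} & 1 \\ \tfrac{8}{3} & 1 & 1 & 7 \end{bmatrix}. \qquad (7)$$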
Let us now consider the general case of symmetric tensors of arbitrary dimension. The mode associated with a symmetric tensor admits an axial blockwise symmetry [10] and thus $\mathbf{S}$ is column redundant. This is a consequence of the symmetry of tensor $\mathcal{S}$. Specifically, some columns of $\mathbf{S}$ appear twice.

Let $\tilde{\mathbf{S}} \triangleq \mathbf{S}\mathbf{J}$ define the $I \times J$ compressed mode, where $\mathbf{J}$ is a selection matrix which cancels the column-redundancy in mode $\mathbf{S}$ (symbol $\triangleq$ stands for definition). The number of columns of the compressed mode is now reduced to

$$J \triangleq \frac{(I+1)I}{2} < I^2. \qquad (8)$$
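The derivation of the selection matrix is not straightforward for an arbitrary size $I$, but a systematic computation of the compressed mode can be obtained following an oblique unfolding, denoted by $\text{o-unfold}(\cdot)$, of tensor $\mathcal{S}$, described in Fig. 1, according to

$$\tilde{\mathbf{S}} = \text{o-unfold}(\mathcal{S}) = \begin{bmatrix} \mathbf{T}_0 & \ldots & \mathbf{T}_{I-1} \end{bmatrix} \qquad (9)$$

where $[\mathbf{T}_{k_3}]_{k_1, k_2} = [\mathcal{S}]_{k_1, k_2, k_3 + k_2}$ ($\forall i \in \{0, \ldots, I-1\}$, the dimensions of $\mathbf{T}_i$ are $I \times (I - i)$).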
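
The oblique unfolding (9) can be sketched with NumPy advanced indexing: block $\mathbf{T}_{k_3}$ collects the columns $[\mathcal{S}]_{:, k_2, k_2 + k_3}$ for $0 \le k_2 < I - k_3$. The function name `o_unfold` mirrors the notation of the paper, but the implementation itself is only an illustrative assumption.

```python
import numpy as np

def o_unfold(S):
    """Oblique unfolding [T_0 ... T_{I-1}] of a symmetric (I, I, I) array.

    Block T_k3 has entries T_k3[k1, k2] = S[k1, k2, k3 + k2] and
    size I x (I - k3), so the result has I * (I + 1) / 2 columns.
    """
    I = S.shape[0]
    blocks = []
    for k3 in range(I):
        k2 = np.arange(I - k3)
        # Paired index arrays select the k3-th "diagonal" of every
        # horizontal slice S[k1, :, :]
        blocks.append(S[:, k2, k2 + k3])
    return np.hstack(blocks)
```

Applied to the symmetrized $2 \times 2 \times 2$ tensor of the example, whose mode is given in (7), this function returns the $2 \times 3$ matrix with rows $(4, 1, 8/3)$ and $(8/3, 7, 1)$.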
3.3.2. Weighted compressed mode
To compensate for the missing columns in the compressed mode, we introduce a diagonal matrix

$$\mathbf{D} = \mathrm{diag}\{d_0, \ldots, d_{J-1}\} \qquad (10)$$

which takes the redundancy of each column in the mode into account. The weighting factors are given by

$$d_k = \begin{cases} 1 & \text{if } 0 \le k < I, \\ 2 & \text{if } I \le k < J. \end{cases} \qquad (11)$$

This means that if the $k$-th column appears twice in the mode then $d_k = 2$; otherwise, $d_k = 1$. We call matrix $\tilde{\mathbf{S}} \mathbf{D}^{1/2}$ the weighted compressed mode.
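Let us come back to the example. Reorganize the columns of the mode according to

$$\mathbf{S}^{(\pi)} = \begin{bmatrix} 4 & 1 & \tfrac{8}{3} & \tfrac{8}{3} \\ \tfrac{8}{3} & 7 & 1 & 1 \end{bmatrix} \qquad (12)$$

where $^{(\pi)}$ means that the columns of $\mathbf{S}$ have been permuted. It is clear that the last two columns are identical. To remove this redundancy, consider the oblique unfolding introduced in (9) and compute the compressed mode $\tilde{\mathbf{S}}$ with $J = 3$ according to

$$\tilde{\mathbf{S}} = \text{o-unfold}(\mathcal{S}) = \begin{bmatrix} \mathbf{T}_0 & \mathbf{T}_1 \end{bmatrix} \qquad (13)$$

where $\mathbf{T}_0 = \begin{bmatrix} 4 & 1 \\ \tfrac{8}{3} & 7 \end{bmatrix}$ and $\mathbf{T}_1 = \begin{bmatrix} \tfrac{8}{3} \\ 1 \end{bmatrix}$. As expected, the repeated columns are removed, and the remaining columns are scaled according to the weighting matrix given by $\mathbf{D} = \mathrm{diag}\{1, 1, 2\}$.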
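
The role of the weights can be checked on this example: the Gram matrix of the full mode (7) coincides with that of the weighted compressed mode. The sketch below hard-codes the two matrices of the example; the helper name `compression_weights` is hypothetical.

```python
import numpy as np

def compression_weights(I):
    """Diagonal of D in (10)-(11): 1 for the first I columns, 2 afterwards."""
    J = I * (I + 1) // 2
    d = np.full(J, 2.0)
    d[:I] = 1.0
    return d

# Mode (7) and compressed mode (13) of the 2 x 2 x 2 example
S = np.array([[4, 8 / 3, 8 / 3, 1],
              [8 / 3, 1, 1, 7]])
S_tilde = np.array([[4, 1, 8 / 3],
                    [8 / 3, 7, 1]])

d = compression_weights(2)
W = S_tilde * np.sqrt(d)        # weighted compressed mode: scales each column
gram_full = S @ S.T             # Gram matrix of the full mode
gram_compressed = W @ W.T       # Gram matrix of the weighted compressed mode
```

Both Gram matrices agree because each column that appears twice in the mode is kept once with a weight of two.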
3.3.3. SVD of the compressed mode
The SVD of the weighted compressed mode is given by

$$\tilde{\mathbf{S}} \mathbf{D}^{1/2} = \mathbf{U} \mathbf{G} \mathbf{V}^H \qquad (14)$$

where $\mathbf{G}$ is a diagonal matrix with non-negative coefficients and $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices. Consequently, the sample covariance of the mode $\mathbf{S}$ verifies

$$\mathbf{R} \triangleq \mathbf{S}\mathbf{S}^H = \tilde{\mathbf{S}} \mathbf{D}^{1/2} (\tilde{\mathbf{S}} \mathbf{D}^{1/2})^H = \mathbf{U} \mathbf{G}^2 \mathbf{U}^H. \qquad (15)$$

This means that the mode $\mathbf{S}$ and the weighted compressed mode $\tilde{\mathbf{S}} \mathbf{D}^{1/2}$ share the same unitary basis $\mathbf{U}$ (the detailed proof can be found in [10]). In this way, the cost of the HOSVD is reduced to that of the SVD of $\tilde{\mathbf{S}} \mathbf{D}^{1/2}$. Using the GR-SVD method [11], the computation of the SVD of an $I \times J$ matrix with $J \ge I$ needs $4JI^2$ flops. For large $I$, we have $4JI^2 \approx 2I^4$. In particular, the compression and weighting of the modes lead to a complexity approximately 6 times lower than that of the full HOSVD computation, and 2 times lower than that of the HOSVD for symmetric tensors. To illustrate this result, the number of flops is plotted in Fig. 2 with respect to the size of the kernel. In addition, the different complexities are summarized in Table 1. Note that the computational cost of the symmetrization of the Volterra kernel is dominated by the computational cost of the SVD.
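
Relations (14) and (15) can be sketched as follows for a symmetric array. The function name `fast_mode_basis` is hypothetical, and `numpy.linalg.svd` is used in place of the GR-SVD method, so the sketch illustrates the algebra rather than the flop count.

```python
import numpy as np

def fast_mode_basis(S):
    """Left singular vectors and singular values of the mode of a symmetric
    (I, I, I) array, computed from the weighted compressed mode."""
    I = S.shape[0]
    blocks, weights = [], []
    for k3 in range(I):
        k2 = np.arange(I - k3)
        blocks.append(S[:, k2, k2 + k3])            # block T_k3 of (9)
        # Columns of T_0 appear once in the mode, the others twice
        weights.append(np.full(I - k3, 1.0 if k3 == 0 else 2.0))
    S_tilde = np.hstack(blocks)                      # I x I(I+1)/2
    d = np.concatenate(weights)                      # diagonal of D
    # SVD of the weighted compressed mode, eq. (14)
    U, g, _ = np.linalg.svd(S_tilde * np.sqrt(d), full_matrices=False)
    return U, g
```

The returned singular values match those of the full $I \times I^2$ mode, and $\mathbf{U}\mathbf{G}^2\mathbf{U}^H$ reproduces $\mathbf{S}\mathbf{S}^H$. The singular vectors themselves are compared through this product because they are only defined up to the usual sign or phase indeterminacies.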
Fig. 1. Oblique unfolding of tensor $\mathcal{S}$ (an $I \times I \times I$ cube sliced into the blocks $\mathbf{T}_0, \mathbf{T}_1, \ldots, \mathbf{T}_{I-1}$)
Table 1. Cost of the HOSVD
Operation                                Cost per iteration (flops)
HOSVD                                    $12I^4$
Symmetrization (optional)                $7I^3$
HOSVD (symmetric tensor)                 $4I^4$
HOSVD (symmetric + compressed mode)      $2I^4$
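
The entries of Table 1 follow from the $4JI^2$ flop count quoted for the SVD of an $I \times J$ matrix. The short sketch below reproduces this arithmetic; the function name `hosvd_flops` and the dictionary keys are hypothetical, and reading the full HOSVD as three SVDs of $I \times I^2$ modes is an interpretation consistent with the figures of the paper.

```python
def hosvd_flops(I):
    """Approximate flop counts for an (I, I, I) kernel."""
    J = I * (I + 1) // 2                  # columns of the compressed mode, eq. (8)
    svd = lambda cols: 4 * cols * I ** 2  # SVD of an I x cols matrix, cols >= I
    return {
        "full": 3 * svd(I ** 2),          # three I x I^2 modes: 12 I^4
        "symmetrization": 7 * I ** 3,     # (1 + 3!) I^3, optional step
        "symmetric": svd(I ** 2),         # single I x I^2 mode: 4 I^4
        "compressed": svd(J),             # single I x J mode: about 2 I^4
    }
```

For $I = 100$, the ratio between the full and the compressed counts is about 5.94, which approaches the asymptotic gain of six.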
4. CONCLUSION
Reducing the number of parameters toward a parallel-cascade realization of a high-order Volterra kernel is an active research topic. In this paper, we propose a fast computation of the multilinear SVD (HOSVD) which fully exploits the redundancies of symmetric/symmetrized Volterra kernels. First, for such kernels and without loss of generality, the HOSVD needs the computation of a single SVD of its unique mode (unfolded matrix). This step reduces the computational cost by a factor of three in the case of a third-order tensor. Next, a second improvement, which is the main contribution of the paper, is that this mode is column redundant, meaning that some columns appear twice. Consequently, we present a technique for deriving an SVD of a mode which has fewer columns than the initial mode since the repeated columns are removed. Finally, we show that this compressed mode, before appropriate scaling, is obtained by an oblique unfolding of the symmetric/symmetrized Volterra kernel. This last point decreases the computational cost by a factor of two and thus by a final factor of six with respect to the direct computation of the HOSVD for the initial Volterra kernel. For the interested reader, further improvements for efficiently computing the HOSVD of structured tensors are presented in [10].

Fig. 2. Number of flops with respect to the kernel size (horizontal axis: kernel size $I$, from 20 to 100; vertical axis: MFlops, log scale; curves: HOSVD, HOSVD (symmetric tensor), HOSVD (symmetric + compressed mode))
5. REFERENCES
[1] M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems, John Wiley and Sons, New York, NY, USA, 1980.
[2] Y. W. Lee and M. Schetzen, "Measurement of the Wiener kernels of a nonlinear system by crosscorrelation," Int. Journal of Control, vol. 2, no. 3, pp. 237–254, Sept. 1965.
[3] V. J. Mathews and G. L. Sicuranza, Polynomial Signal Processing, John Wiley and Sons, New York, NY, USA, 2000.
[4] G. B. Giannakis and E. Serpedin, "A bibliography on nonlinear system identification," Signal Processing, vol. 81, no. 3, pp. 533–580, Mar. 2001.
[5] A. Khouaja and G. Favier, "Identification of Parafac-Volterra cubic models using an alternating recursive least squares algorithm," in European Signal Processing Conference (EUSIPCO), Vienna, Austria, Sept. 2004.
[6] T. M. Panicker and V. J. Mathews, "Parallel-cascade realizations and approximations of truncated Volterra systems," IEEE Trans. Signal Processing, vol. 46, no. 10, pp. 2829–2832, Oct. 1998.
[7] E. Seagraves, B. Walcott, and D. Feinauer, "Efficient implementation of Volterra systems using a multilinear SVD," in International IEEE Symposium on Intelligent Signal Processing and Communication System (ISPACS), Xiamen, China, Nov. 2007, pp. 762–765.
[8] G. Favier and T. Bouilloc, "Parametric complexity reduction of Volterra models using tensor decompositions," in European Signal Processing Conference (EUSIPCO), Glasgow, Scotland, Aug. 2009.
[9] L. De Lathauwer, B. De Moor, and J. Vandewalle, "A multilinear singular value decomposition," SIAM J. Matrix Anal. Appl., vol. 21, no. 4, pp. 1253–1278, Apr. 2000.
[10] R. Badeau and R. Boyer, "Fast multilinear singular value decomposition for structured tensors," SIAM J. Matrix Anal. Appl., vol. 30, no. 3, pp. 1008–1021, Sept. 2008.
[11] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, 3rd edition, 1996.
[12] A. Y. Kibangou and G. Favier, "Wiener-Hammerstein systems modeling using diagonal Volterra kernels coefficients," IEEE Signal Proc. Letters, vol. 13, no. 6, pp. 381–384, June 2006.
