IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 41, NO. 1, JANUARY 1996
Linear Estimation in Krein Spaces-Part I: Theory
Babak Hassibi, Ali H. Sayed, Member, IEEE, and Thomas Kailath, Fellow, IEEE
Abstract- The authors develop a self-contained theory for
linear estimation in Krein spaces. The derivation is based on
simple concepts such as projections and matrix factorizations
and leads to an interesting connection between Krein space
projection and the recursive computation of the stationary points
of certain second-order (or quadratic) forms. The authors use
the innovations process to obtain a general recursive linear
estimation algorithm. When specialized to a state-space structure,
the algorithm yields a Krein space generalization of the celebrated
Kalman filter with applications in several areas such as $H^\infty$ filtering and control, game problems, risk-sensitive control, and
adaptive filtering.
I. INTRODUCTION
In some recent explorations, we have found that $H^\infty$ estimation and control problems and several related problems
(risk-sensitive estimation and control, finite memory adaptive
filtering, stochastic interpretation of the KYP lemma, and
others) can be studied in a simple and unified way by relating
them to Kalman filtering problems, not in the usual (stochastic)
Hilbert space, but in a special kind of indefinite metric space
known as a Krein space (see, e.g., [9], [10]). Although the
two types of spaces share many characteristics, they differ in
special ways that turn out to mark the differences between the
linear-quadratic-Gaussian (LQG) or $H^2$ theories and the more
recent $H^\infty$ theories. The connections with the conventional
Kalman filter theory will allow several of the newer numerical
algorithms, developed over the last three decades, to be applied
to the $H^\infty$ theories [22].
In this paper the authors develop a self-contained theory for
linear estimation in Krein spaces. The ensuing theory is richer
than that of the conventional Hilbert space case, which is why
it yields a unified approach to the above-mentioned problems.
Applications will follow in later papers.
The remainder of the paper is organized as follows. We
introduce Krein spaces in Section II and define projections
in Krein spaces in Section III. Contrary to the Hilbert space
case where projections always exist and are unique, the Krein-space projection exists and is unique if, and only if, a certain
Gramian matrix is nonsingular. In Section IV, we first remark
that while quadratic forms in Hilbert space always have
minima (or maxima), in Krein spaces one can assert only that
they will always have stationary points. Further conditions will
have to be met for these to be minima or maxima. We explore
this by first considering the problem of finding a vector $k$ to
stationarize the quadratic form $\langle \mathbf{z} - k^*\mathbf{y}, \mathbf{z} - k^*\mathbf{y}\rangle$, where $\langle\cdot,\cdot\rangle$
is an indefinite inner product, $*$ denotes conjugate transpose,
$\mathbf{y}$ is a collection of vectors in a Krein space (which we can
regard as generalized random variables), and $\mathbf{z}$ is a vector
outside the linear space spanned by the $\mathbf{y}$. If the Gramian
matrix $R_y = \langle \mathbf{y}, \mathbf{y}\rangle$ is nonsingular, then there is a unique
stationary point $k_0^*\mathbf{y}$, given by the projection of $\mathbf{z}$ onto the
linear space spanned by the $\mathbf{y}$; the stationary point will be
a minimum if, and only if, $R_y$ is strictly positive definite as
well. In a Hilbert space, the nonsingularity of $R_y$ and its strict
positive definiteness are equivalent properties, but this is not
true with $\mathbf{y}$ in a Krein space.
Now in the Hilbert space theory it is well known (motivated by a Bayesian approach to the problem) that a certain
deterministic quadratic form $J(z, y)$, where now $z$ and $y$
are elements of the usual Euclidean vector space, is also
minimized by $k_0^* y$ with exactly the same $k_0$ as before. In the
Krein-space case, $k_0^* y$ also yields a stationary point of the
corresponding deterministic quadratic form, but now this point
will be a minimum if, and only if, a different condition, not
$R_y > 0$, but $R_z - R_{zy}R_y^{-1}R_{yz} > 0$, is satisfied. In Hilbert
space, unlike Krein space, the two conditions for a minimum
hold simultaneously (see Corollary 3 in Section IV). This
simple distinction turns out to be crucial in understanding the
difference between $H^2$ and $H^\infty$ estimation, as we shall show
in detail in Part II of this series of papers.
In this first part, we continue with the general theory by
exploring the consequences of assuming that $\{\mathbf{z}, \mathbf{y}\}$ are based
on some underlying state-space model. The major ones are
a reduction in computational effort, $O(Nn^3)$ versus $O(N^3)$,
where $N$ is the number of observations and $n$ is the number
of states, and the possibility of recursive solutions. In fact,
it will be seen that the innovations-based derivation of the
Hilbert-space Kalman filter extends to Krein spaces, except
that now the Riccati variable $P_i$ and the innovations Gramian
$R_{e,i}$ are not necessarily positive (semi)definite. The Krein-space
Kalman filter continues to have the interpretation of
performing the triangular factorization of the Gramian matrix
of the observations, $R_y$; this reduces the test for $R_y > 0$ to
recursively checking that the $R_{e,i} > 0$.
Similar results are expected for the corresponding indefinite
quadratic form. While global expressions for the stationary point of such quadratic forms and of the minimization
Manuscript received March 4, 1994; revised June 16, 1995. Recommended
by Associate Editor at Large, B. Pasik-Duncan. This work was supported
in part by the Advanced Research Projects Agency of the Department of
Defense monitored by the Air Force Office of Scientific Research under
Contract F49620-93-1-0085 and in part by a grant from NSF under award
MIP-9409319.
B. Hassibi and T. Kailath are with the Information Systems Laboratory,
Stanford University, Stanford, CA 94305 USA.
A. H. Sayed is with the Department of Electrical and Computer Engineering,
University of California, Santa Barbara, CA 93106 USA.
Publisher Item Identifier S 0018-9286(96)00386-8.
0018-9286/96$05.00 © 1996 IEEE
HASSIBI et al.: LINEAR ESTIMATION IN KREIN SPACES-PART I
condition were readily obtained, as previously mentioned,
recursive versions are not easy to obtain. Dynamic programming arguments are the ones usually invoked, and they
turn out to be algebraically more complex than the simple
innovations (Gram-Schmidt orthogonalization) ideas available
in the stochastic (Krein space) case.
Briefly, given a possibly indefinite quadratic form, our
approach is to associate with it (by inspection) a Krein-space
model whose stationary point will have the same gain $k_0^*$ as for
the deterministic problem. The Kalman filter (KF) recursions
can now be invoked and give a recursive algorithm for the
stationary point of the deterministic quadratic form; moreover,
the condition for a minimum can also be expressed in terms of
quantities easily related to the basic Riccati equations of the
Kalman filter. These results are developed in Sections V and
VI, with Theorems 5 and 6 being the major results.
While it is possible to pursue many of the results of this
paper in greater depth, the development here is sufficient to
solve several problems of interest in estimation theory. In the
companion paper [1], we shall apply these results to $H^\infty$
and risk-sensitive estimation and to finite memory adaptive
filtering. In a future paper we shall study various dualities and
apply them to obtain dual (or so-called complementary) state-space models and to solve the $H^2$, $H^\infty$, and risk-sensitive
control problems. We may mention that using these results
we have also been able to develop the (possibly) numerically more attractive square root arrays and Chandrasekhar
recursions for $H^\infty$ problems [22], to study robust adaptive
filtering [23], to obtain a stochastic interpretation of the
Kalman-Yacubovich-Popov lemma, and to study convergence
issues and obtain steady-state results. The point is that the
many years of experience and intuition gained from the LQG
or $H^2$ theory can be used as a guide to the corresponding
$H^\infty$ results.
A. Notation
A remark on the notation used in the paper. Elements in
a Krein space are denoted by bold face letters, and elements
in the Euclidean space of complex numbers are denoted by
normal letters. Whenever the Krein-space elements and the
Euclidean space elements satisfy the same set of constraints,
we shall denote them by the same letters with the former ones
being bold and the latter ones being normal. (This convention
is similar to the one used in probability theory, where random
variables are denoted by bold face letters and their assumed
values are denoted by normal letters.)
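II. ON KREIN SPACES
We briefly introduce the definitions and basic properties of
Krein spaces, focusing on those results that we shall need later.
Detailed expositions can be found in books [9]-[11]. Most
readers will be familiar with finite-dimensional (often called
Euclidean) and infinite-dimensional Hilbert spaces. Finite-dimensional (often called Minkowski) and infinite-dimensional
Krein spaces share many of the properties of Hilbert spaces but
differ in some important ways that we shall emphasize in the
following.
Definition 1 (Krein Spaces): An abstract vector space
$\{\mathcal{K}, \langle\cdot,\cdot\rangle\}$ that satisfies the following requirements is called
a Krein space:
i) $\mathcal{K}$ is a linear space over $\mathbb{C}$, the complex numbers.
ii) There exists a bilinear form $\langle\cdot,\cdot\rangle \in \mathbb{C}$ on $\mathcal{K}$ such that
a) $\langle \mathbf{y}, \mathbf{x}\rangle = \langle \mathbf{x}, \mathbf{y}\rangle^*$
b) $\langle a\mathbf{x} + b\mathbf{y}, \mathbf{z}\rangle = a\langle \mathbf{x}, \mathbf{z}\rangle + b\langle \mathbf{y}, \mathbf{z}\rangle$
for any $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathcal{K}$, $a, b \in \mathbb{C}$, and where $*$ denotes
complex conjugation.
iii) The vector space $\mathcal{K}$ admits a direct orthogonal sum
decomposition
$$\mathcal{K} = \mathcal{K}_+ \oplus \mathcal{K}_-$$
such that $\{\mathcal{K}_+, \langle\cdot,\cdot\rangle\}$ and $\{\mathcal{K}_-, -\langle\cdot,\cdot\rangle\}$ are Hilbert
spaces, and
$$\langle \mathbf{x}, \mathbf{y}\rangle = 0$$
for any $\mathbf{x} \in \mathcal{K}_+$ and $\mathbf{y} \in \mathcal{K}_-$.
Remarks:
1) Recall that Hilbert spaces satisfy not only i), ii)-a), and
ii)-b) above, but also the requirement that
$$\langle \mathbf{x}, \mathbf{x}\rangle > 0 \quad \text{when } \mathbf{x} \neq 0.$$
2) The fundamental decomposition of $\mathcal{K}$ defines two projection operators $P_+$ and $P_-$ such that
$$P_+\mathcal{K} = \mathcal{K}_+ \quad \text{and} \quad P_-\mathcal{K} = \mathcal{K}_-.$$
Therefore, for every $\mathbf{x} \in \mathcal{K}$ we can write
$$\mathbf{x} = P_+\mathbf{x} + P_-\mathbf{x} = \mathbf{x}_+ + \mathbf{x}_-, \quad \mathbf{x}_\pm \in \mathcal{K}_\pm.$$
Note that for every $\mathbf{x} \in \mathcal{K}_+$, we have $\langle \mathbf{x}, \mathbf{x}\rangle \geq 0$, but
the converse is not true: $\langle \mathbf{x}, \mathbf{x}\rangle \geq 0$ does not necessarily
imply that $\mathbf{x} \in \mathcal{K}_+$.
3) A vector $\mathbf{x} \in \mathcal{K}$ will be said to be positive if $\langle \mathbf{x}, \mathbf{x}\rangle > 0$,
neutral if $\langle \mathbf{x}, \mathbf{x}\rangle = 0$, or negative if $\langle \mathbf{x}, \mathbf{x}\rangle < 0$. Correspondingly, a subspace $\mathcal{M} \subset \mathcal{K}$ can be positive, neutral,
or negative, if all its elements are so, respectively.
We now focus on linear subspaces of $\mathcal{K}$. We shall define
$\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ as the linear subspace of $\mathcal{K}$ spanned by the
elements $\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_N$ in $\mathcal{K}$. The Gramian of the collection
of elements $\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ is defined as the $(N+1) \times (N+1)$
matrix
$$R_y \triangleq [\langle \mathbf{y}_i, \mathbf{y}_j\rangle].$$
The reflexivity property, $\langle \mathbf{y}_i, \mathbf{y}_j\rangle = \langle \mathbf{y}_j, \mathbf{y}_i\rangle^*$, shows that the
Gramian is a Hermitian matrix.
It is useful to introduce some matrix notation here. We shall
write the column vector of the $\{\mathbf{y}_i\}$ as
$$\mathbf{y} = \mathrm{col}\{\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_N\}$$
and denote the above Gramian of the $\{\mathbf{y}_i\}$ as
$$R_y \triangleq \langle \mathbf{y}, \mathbf{y}\rangle.$$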
(A useful mnemonic device for recalling this is to think of the
$\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ as "random variables" and their Gramian as the
"covariance matrix"
$$R_y = E\,\mathbf{y}\mathbf{y}^*$$
where $E(\cdot)$ denotes "expectation." We use the quotation marks
because in our context, the covariance matrix will generally
be indefinite, so we are dealing with some kind of generalized
"random variables." We do not pursue this interpretation here
since our aim is only to provide readers with a convenient
device for interpreting the shorthand notation.)
Also, if we have two sets of elements $\{\mathbf{z}_0, \ldots, \mathbf{z}_M\}$ and
$\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ we shall write
$$\mathbf{z} = \mathrm{col}\{\mathbf{z}_0, \mathbf{z}_1, \ldots, \mathbf{z}_M\}$$
and
$$\mathbf{y} = \mathrm{col}\{\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_N\}$$
and introduce the $(M+1) \times (N+1)$ cross-Gramian matrix
$$R_{zy} \triangleq [\langle \mathbf{z}_i, \mathbf{y}_j\rangle] = \langle \mathbf{z}, \mathbf{y}\rangle.$$
Note the property
$$R_{zy} = R_{yz}^*.$$
We now proceed with a simple result.
Lemma 1 (Positive and Negative Linear Subspaces):
Suppose $\mathbf{y}_0, \ldots, \mathbf{y}_N$ are linearly independent elements of
$\mathcal{K}$. Then $\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ is a "positive" (negative) subspace
of $\mathcal{K}$ if, and only if
$$R_y > 0 \quad (R_y < 0).$$
Proof: Since the $\mathbf{y}_i$ are linearly independent, for any $\mathbf{z} \neq 0$ in
$\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ there exists a unique $k \in \mathbb{C}^{N+1}$ such that
$\mathbf{z} = k^*\mathbf{y}$. Now
$$\langle \mathbf{z}, \mathbf{z}\rangle = k^*\langle \mathbf{y}, \mathbf{y}\rangle k = k^* R_y k$$
so that $\langle \mathbf{z}, \mathbf{z}\rangle > 0$ for all $\mathbf{z} \in \mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, if, and only if,
$R_y > 0$. The proof for $R_y < 0$ is similar.
Note that any linear subspace whose Gramian has mixed
inertia (both positive and negative eigenvalues) will have
elements in both the positive and negative subspaces.
A. A Geometric Interpretation
Indefinite metric spaces were perhaps first introduced into
the solution of physical problems via the finite-dimensional
Minkowski spaces of special relativity [12], and some geometric insight may be gained by considering the special
three-dimensional Minkowski space of Fig. 1, defined by the
inner product
$$\langle \mathbf{v}_1, \mathbf{v}_2\rangle = x_1x_2 + y_1y_2 - t_1t_2$$
when
$$\mathbf{v}_1 = (x_1, y_1, t_1), \quad \mathbf{v}_2 = (x_2, y_2, t_2) \quad \text{and} \quad x_i, y_i, t_i \in \mathbb{C}.$$
Fig. 1. Three-dimensional Minkowski space, showing the neutral cone and the negative subspace.
The (indefinite) squared norm of each vector $\mathbf{v} = (x, y, t)$ is
equal to
$$\langle \mathbf{v}, \mathbf{v}\rangle = x^2 + y^2 - t^2.$$
In this case, we can take $\mathcal{K}_+$ to be the $x$-$y$ plane and
$\mathcal{K}_-$ as the $t$-axis. The neutral subspace is given by the cone,
$x^2 + y^2 - t^2 = 0$, with points inside the cone belonging to the
negative subspace, $x^2 + y^2 - t^2 < 0$, and points outside the
cone corresponding to the positive subspace, $x^2 + y^2 - t^2 > 0$.
Moreover, any plane passing through the origin but lying
outside the neutral cone will have positive definite Gramian,
and any line passing through the origin and inside the neutral
cone will have negative definite Gramian. Also, any plane
passing through the origin that intersects the neutral cone will
have Gramian with mixed inertia, and any plane tangent to the
cone will have singular Gramian.
Two key differences between Krein spaces and Hilbert
spaces are the existence of neutral and isotropic vectors. As
mentioned earlier, a neutral vector is a nonzero vector that has
zero length; an isotropic vector is a nonzero vector lying in
a linear subspace of $\mathcal{K}$ that is orthogonal to every element in
that linear subspace. There are obviously no such vectors in
Euclidean or Hilbert spaces. In the Minkowski space described
above, $[1 \;\; 1 \;\; \sqrt{2}]$ is a neutral vector, and if one considers the
linear subspace $\mathcal{L}\{[1 \;\; 1 \;\; \sqrt{2}], [\sqrt{2} \;\; 0 \;\; 1]\}$, then $[1 \;\; 1 \;\; \sqrt{2}]$ is also
an isotropic vector in this linear subspace.
III. PROJECTIONS IN KREIN SPACES
An important notion in both Hilbert and Krein spaces is that
of the projection onto a subspace.
Definition 2 (Projections): Given the element $\mathbf{z}$ in $\mathcal{K}$ and
the elements $\{\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_N\}$ also in $\mathcal{K}$, we define $\hat{\mathbf{z}}$ to be
the projection of $\mathbf{z}$ onto $\mathcal{L}\{\mathbf{y}_0, \mathbf{y}_1, \ldots, \mathbf{y}_N\}$ if
$$\mathbf{z} = \hat{\mathbf{z}} + \tilde{\mathbf{z}}$$
where $\hat{\mathbf{z}} \in \mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ and $\tilde{\mathbf{z}}$ satisfies the orthogonality
condition
$$\tilde{\mathbf{z}} \perp \mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\} \tag{2}$$
or equivalently, $\langle \tilde{\mathbf{z}}, \mathbf{y}_i\rangle = 0$ for $i = 0, 1, \ldots, N$.
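The geometric statements of Section II-A can be checked numerically. The following is a minimal sketch, assuming real-valued vectors and a signature matrix `J = diag(1, 1, -1)` to represent the Minkowski inner product; the helper names `inner`, `gramian`, and `inertia` are illustrative and do not appear in the paper.

```python
import numpy as np

# Signature matrix for <v1, v2> = x1*x2 + y1*y2 - t1*t2 (real vectors only).
J = np.diag([1.0, 1.0, -1.0])

def inner(v1, v2):
    """Indefinite inner product of two real 3-vectors (x, y, t)."""
    return float(np.asarray(v1, dtype=float) @ J @ np.asarray(v2, dtype=float))

def gramian(vectors):
    """Gramian [<y_i, y_j>] of a list of real 3-vectors (one per row)."""
    Y = np.asarray(vectors, dtype=float)
    return Y @ J @ Y.T

def inertia(R, tol=1e-9):
    """Counts of (positive, negative, zero) eigenvalues of a symmetric matrix."""
    eig = np.linalg.eigvalsh(R)
    return (int(np.sum(eig > tol)),
            int(np.sum(eig < -tol)),
            int(np.sum(np.abs(eig) <= tol)))

v = [1.0, 1.0, np.sqrt(2.0)]   # lies on the neutral cone
w = [np.sqrt(2.0), 0.0, 1.0]   # orthogonal to v in the indefinite metric

is_neutral = abs(inner(v, v)) < 1e-12
is_isotropic = all(abs(inner(v, u)) < 1e-12 for u in (v, w))

plane_outside = inertia(gramian([[1, 0, 0], [0, 1, 0]]))  # x-y plane
line_inside = inertia(gramian([[0, 0, 1]]))               # t-axis
plane_cutting = inertia(gramian([[1, 0, 0], [0, 0, 1]]))  # x-t plane
plane_tangent = inertia(gramian([v, w]))                  # contains v

print(is_neutral, is_isotropic)
print(plane_outside, line_inside, plane_cutting, plane_tangent)
```

The four inertia triples correspond to the positive definite, negative definite, mixed-inertia, and singular Gramians described above; the tolerance `tol` is an assumption needed because $\sqrt{2}$ is not exactly representable in floating point.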
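In Hilbert space, projections always exist and are unique. In
Krein space, however, this is not always the case. Indeed we
have the following result, where for simplicity we shall write
$$\mathcal{L}\{\mathbf{y}\} \triangleq \mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}.$$
Lemma 2 (Existence and Uniqueness of Projections): In the
Hilbert space setting, projections always exist and are unique.
In the Krein-space setting, however:
a) If the Gramian matrix $R_y = \langle \mathbf{y}, \mathbf{y}\rangle$ is nonsingular, then
the projection of $\mathbf{z}$ onto $\mathcal{L}\{\mathbf{y}\}$ exists, is unique, and is
given by
$$\hat{\mathbf{z}} = \langle \mathbf{z}, \mathbf{y}\rangle\langle \mathbf{y}, \mathbf{y}\rangle^{-1}\mathbf{y} = R_{zy}R_y^{-1}\mathbf{y}. \tag{3}$$
b) If the Gramian matrix $R_y = \langle \mathbf{y}, \mathbf{y}\rangle$ is singular, then
i) If $\mathcal{R}(R_{yz}) \subseteq \mathcal{R}(R_y)$ (where $\mathcal{R}(A)$ denotes the
column range space of the matrix $A$), the projection
$\hat{\mathbf{z}}$ exists but is nonunique. In fact, $\hat{\mathbf{z}} = k_0^*\mathbf{y}$, where $k_0$
is "any" solution to the linear matrix equation
$$R_y k_0 = R_{yz}. \tag{4}$$
ii) If $\mathcal{R}(R_{yz}) \not\subseteq \mathcal{R}(R_y)$, the projection $\hat{\mathbf{z}}$ does not exist.
Proof: Suppose $\hat{\mathbf{z}}$ is a projection of $\mathbf{z}$ onto the desired space.
By (2), we can write
$$\mathbf{z} = k_0^*\mathbf{y} + \tilde{\mathbf{z}}$$
for some $k_0 \in \mathbb{C}^{N+1}$. Since $\langle \tilde{\mathbf{z}}, \mathbf{y}\rangle = 0$
$$R_{zy} = \langle \mathbf{z}, \mathbf{y}\rangle = k_0^*\langle \mathbf{y}, \mathbf{y}\rangle + 0 = k_0^* R_y. \tag{5}$$
If $R_y$ is nonsingular, then the solution for $k_0$ in (5) is unique
and the projection is given by (3). If $R_y$ is singular, two things
may happen: either $\mathcal{R}(R_{yz}) \subseteq \mathcal{R}(R_y)$, in which case (5) will
have a nonunique solution (since any $k_1^*$ in the left null space
of $R_y$ can be added to $k_0^*$), or $\mathcal{R}(R_{yz}) \not\subseteq \mathcal{R}(R_y)$, in which
case the projection does not exist since a solution to (5) does
not exist.
In Hilbert spaces the projection always exists because it
is always true that $\mathcal{R}(R_{yz}) \subseteq \mathcal{R}(R_y)$, or equivalently, that
$\mathcal{N}(R_y) \subseteq \mathcal{N}(R_{zy})$ where $\mathcal{N}(A)$ is the right nullspace of the
matrix $A$. To show this, suppose that $l \in \mathcal{N}(R_y)$. Then
$$R_y l = 0 \;\Rightarrow\; l^* R_y l = 0 \;\Rightarrow\; l^*\langle \mathbf{y}, \mathbf{y}\rangle l = \langle l^*\mathbf{y}, l^*\mathbf{y}\rangle = 0 \;\Rightarrow\; l^*\mathbf{y} = 0$$
where the last equality follows from the fact that in Hilbert
spaces $\langle \mathbf{x}, \mathbf{x}\rangle = 0 \Rightarrow \mathbf{x} = 0$. We now readily conclude
that $\langle \mathbf{z}, l^*\mathbf{y}\rangle = R_{zy}l = 0$, i.e., $l \in \mathcal{N}(R_{zy})$ and hence
$\mathcal{N}(R_y) \subseteq \mathcal{N}(R_{zy})$. Therefore a solution to (5) (and hence
a projection) always exists in Hilbert spaces.
In Hilbert spaces the projection is also unique because if $k_1$
and $k_2$ are two different solutions to (5), then $(k_1 - k_2)^* R_y = 0$.
But the above argument shows that we must then have
$(k_1 - k_2)^*\mathbf{y} = 0$. Hence the projection
$$\hat{\mathbf{z}} = k_1^*\mathbf{y} = k_2^*\mathbf{y}$$
is unique.
The proof of the above lemma shows that in Hilbert
spaces the singularity of $R_y$ implies that the $\{\mathbf{y}_i\}$ are linearly
dependent, i.e.,
$$\det(R_y) = 0 \;\Rightarrow\; k^*\mathbf{y} = 0 \text{ for some vector } k \in \mathbb{C}^{N+1}.$$
In the Krein-space setting, all we can deduce from the singularity of $R_y$ is that there exists a linear combination of the
$\{\mathbf{y}_i\}$ that is orthogonal to every vector in $\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, i.e.,
that $\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ contains an isotropic vector. This follows
by noting that for any complex matrix $k_1$, and for any $k$ in
the null space of $R_y$, we have
$$k_1^* R_y k = \langle k_1^*\mathbf{y}, k^*\mathbf{y}\rangle = 0$$
which shows that the linear combination $k^*\mathbf{y}$ is orthogonal to
$k_1^*\mathbf{y}$, for every $k_1$, i.e., $k^*\mathbf{y}$ is an isotropic vector in $\mathcal{L}\{\mathbf{y}\}$.
Standing Assumption: Since existence and uniqueness will
be important for all our future results, we shall make the
standing assumption that the Gramian
$$R_y \text{ is nonsingular.}$$
A. Vector-Valued Projections
Consider the $n$-vector $\mathbf{z} = \mathrm{col}\{\mathbf{z}_1, \ldots, \mathbf{z}_n\}$ composed of
elements $\mathbf{z}_i \in \mathcal{K}$, and the set $\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, where $\mathbf{y}_j \in \mathcal{K}$;
project each element $\mathbf{z}_i$ onto $\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ to obtain $\hat{\mathbf{z}}_i$.
We define $\hat{\mathbf{z}} = \mathrm{col}\{\hat{\mathbf{z}}_1, \ldots, \hat{\mathbf{z}}_n\}$ as the projection of $\mathbf{z}$ onto
$\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$. (Strictly speaking, we should call $\hat{\mathbf{z}} \in \mathcal{K}^n$
the projection of $\mathbf{z} \in \mathcal{K}^n$ onto $\mathcal{L}^n\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, since it
is an element of $\mathcal{L}^n\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ and not $\mathcal{L}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$.
For simplicity, however, we shall generally use the looser
terminology.)
It is easy to see that the results on the existence and
uniqueness of projections in Lemma 2 continue to hold in
the vector case as well.
In this connection, it will be useful to introduce a slight
generalization of the definition of Krein spaces that was given
in Section II. There, in Definition 1, we mentioned that $\mathcal{K}$
should be linear over the field of complex numbers, $\mathbb{C}$. It turns
out, however, that we can replace $\mathbb{C}$ with any ring $S$. In other
words, the first two axioms for Krein spaces can be replaced
by:
i) $\mathcal{K}$ is a linear space over the ring $S$.
ii) There exists a bilinear form $\langle\cdot,\cdot\rangle \in S$ on $\mathcal{K}$ such that
a) $\langle \mathbf{y}, \mathbf{x}\rangle = \langle \mathbf{x}, \mathbf{y}\rangle^*$
b) $\langle a\mathbf{x} + b\mathbf{y}, \mathbf{z}\rangle = a\langle \mathbf{x}, \mathbf{z}\rangle + b\langle \mathbf{y}, \mathbf{z}\rangle$
for any $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathcal{K}$ and $a, b \in S$, and where the
operation $*$ depends on the ring $S$.
When the inner product $\langle\cdot,\cdot\rangle \in S$ is positive, $\{\mathcal{K}, \langle\cdot,\cdot\rangle\}$
is referred to as a module. Thus the third axiom for
Krein spaces can be replaced by iii).
iii) The vector space $\mathcal{K}$ admits a direct orthogonal sum
decomposition
$$\mathcal{K} = \mathcal{K}_+ \oplus \mathcal{K}_-$$
such that $\{\mathcal{K}_+, \langle\cdot,\cdot\rangle\}$ and $\{\mathcal{K}_-, -\langle\cdot,\cdot\rangle\}$ are modules,
and $\langle \mathbf{x}, \mathbf{y}\rangle = 0$ for any $\mathbf{x} \in \mathcal{K}_+$ and $\mathbf{y} \in \mathcal{K}_-$.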
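The two cases of Lemma 2 can be illustrated in the three-dimensional Minkowski space of Section II-A. The sketch below assumes real vectors and the signature matrix `J = diag(1, 1, -1)`; the function names `gramians`, `project`, and `projection_exists` are hypothetical helpers, and the integer vectors `[3, 4, 5]` and `[5, 0, 3]` are chosen here (they are not from the paper) so that the singular Gramian is exact.

```python
import numpy as np

J = np.diag([1.0, 1.0, -1.0])  # assumed Minkowski signature, real vectors only

def gramians(z, Y):
    """Return (R_y, R_yz) for a vector z and the elements stored as rows of Y."""
    R_y = Y @ J @ Y.T
    R_yz = Y @ J @ z
    return R_y, R_yz

def project(z, Y):
    """Projection of z onto the span of the rows of Y via (3); needs nonsingular R_y."""
    R_y, R_yz = gramians(z, Y)
    k0 = np.linalg.solve(R_y, R_yz)   # solves R_y k0 = R_yz, cf. (4)
    return k0 @ Y                     # z_hat = k0^* y

def projection_exists(z, Y, tol=1e-9):
    """Range test of Lemma 2 b): is R_yz in the column range of R_y?"""
    R_y, R_yz = gramians(z, Y)
    augmented = np.column_stack([R_y, R_yz])
    return bool(np.linalg.matrix_rank(augmented, tol=tol)
                == np.linalg.matrix_rank(R_y, tol=tol))

# Case a): nonsingular but indefinite Gramian (span of the x-axis and the t-axis).
Y = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
z = np.array([2.0, 3.0, 5.0])
z_hat = project(z, Y)
residual_gram = Y @ J @ (z - z_hat)   # <z - z_hat, y_i> for each i

# Case b): singular Gramian. [3, 4, 5] is neutral and orthogonal to [5, 0, 3],
# so it is an isotropic vector of the subspace they span.
Y_sing = np.array([[3.0, 4.0, 5.0], [5.0, 0.0, 3.0]])
exists_outside = projection_exists(np.array([1.0, 0.0, 0.0]), Y_sing)  # case b-ii
exists_member = projection_exists(Y_sing[1], Y_sing)                   # case b-i

print(z_hat, residual_gram, exists_outside, exists_member)
```

In case a) the residual is orthogonal to every $\mathbf{y}_i$ even though $R_y$ is indefinite, which is all that Definition 2 requires.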
The most important case for us is when $S$ is a ring of
complex matrices, and the operation $*$ denotes Hermitian
transpose.
The point of this generalization is that we can now directly
define the projection of a vector $\mathbf{z} \in \mathcal{K}^n$ onto $\mathcal{L}^n\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$
as an element $\hat{\mathbf{z}} \in \mathcal{L}^n\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, such that
$$\hat{\mathbf{z}} = k_0^*\mathbf{y}, \quad k_0^* \in \mathbb{C}^{n \times (N+1)}$$
where $k_0$ is such that
$$0 = \langle \mathbf{z} - k_0^*\mathbf{y}, \mathbf{y}\rangle = R_{zy} - k_0^* R_y$$
or
$$k_0^* R_y = R_{zy}.$$
Finally, let us remark that to avoid additional notational
burden, we shall often refrain from writing $\mathcal{K}^n$ and shall
simply use the notation $\mathcal{K}$ for any Krein space. The ring $S$
over which the Krein space is defined will be obvious from
the context.
IV. PROJECTIONS AND QUADRATIC FORMS
In Hilbert space, projections extremize (minimize) certain
quadratic forms, as we shall briefly first describe. In Krein
spaces, we can in general only assert that projections stationarize such quadratic forms; further conditions need to be met
for the stationary points to be extrema (minima). This will be
elaborated in Section IV-A, in the context of (what we shall
call) a stochastic minimization problem. In Section IV-B, we
shall study a closely related quadratic form arising in what
we shall call a partially equivalent deterministic minimization
problem.
A. Stochastic Minimization Problems in Hilbert and Krein Spaces
Consider a collection of elements $\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$ in a
Krein space $\mathcal{K}$ with indefinite inner product $\langle\cdot,\cdot\rangle$. Let $\mathbf{z} =
\mathrm{col}\{\mathbf{z}_0, \ldots, \mathbf{z}_M\}$ be some column vector of elements in $\mathcal{K}$, and
consider an arbitrary linear combination of $\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$, say
$k^*\mathbf{y}$, where $k^* \in \mathbb{C}^{(M+1) \times (N+1)}$ and $\mathbf{y} = \mathrm{col}\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$. A
natural object to study is the error Gramian
$$P(k) = \langle \mathbf{z} - k^*\mathbf{y}, \mathbf{z} - k^*\mathbf{y}\rangle. \tag{6}$$
To motivate the subsequent discussion, let us first assume
that the $\{\mathbf{y}_i\}$ and $\{\mathbf{z}_j\}$ belong to a Hilbert space of zero-mean
random variables and that their variance and cross-variances
are known. In this case the inner product is $\langle \mathbf{z}_i, \mathbf{y}_j\rangle_{\mathcal{H}} = E\mathbf{z}_i\mathbf{y}_j^*$
(where $E(\cdot)$ denotes expectation), and $P(k)$ is simply the
mean-square-error (or error variance) matrix in estimating $\mathbf{z}$
using $k^*\mathbf{y}$, viz.
$$P(k) = E(\mathbf{z} - k^*\mathbf{y})(\mathbf{z} - k^*\mathbf{y})^* = \|\mathbf{z} - k^*\mathbf{y}\|_{\mathcal{H}}^2.$$
It is well known that the linear least-mean-square estimate,
which minimizes $P(k)$, is given by the projection of $\mathbf{z}$ on $\mathcal{L}\{\mathbf{y}\}$
$$\hat{\mathbf{z}} = k_0^*\mathbf{y}$$
where
$$k_0^* = E\mathbf{z}\mathbf{y}^*[E\mathbf{y}\mathbf{y}^*]^{-1} = R_{zy}R_y^{-1}.$$
The simple proof will be instructive. Thus note that
$$P(k) = \|\mathbf{z} - k^*\mathbf{y}\|_{\mathcal{H}}^2 = \|\mathbf{z} - \hat{\mathbf{z}} + \hat{\mathbf{z}} - k^*\mathbf{y}\|_{\mathcal{H}}^2 = \|\mathbf{z} - \hat{\mathbf{z}}\|_{\mathcal{H}}^2 + \|\hat{\mathbf{z}} - k^*\mathbf{y}\|_{\mathcal{H}}^2$$
since by the definition of $\hat{\mathbf{z}}$, it holds that
$$\langle \mathbf{z} - \hat{\mathbf{z}}, \hat{\mathbf{z}} - k^*\mathbf{y}\rangle_{\mathcal{H}} = 0.$$
Clearly, since $\hat{\mathbf{z}} = k_0^*\mathbf{y}$
$$P(k) \geq P(k_0)$$
with equality achieved only when $k = k_0$.
This argument breaks down, however, when the elements
are in a Krein space, since then we could have
$$\|\hat{\mathbf{z}} - k^*\mathbf{y}\|^2 = \|k_0^*\mathbf{y} - k^*\mathbf{y}\|^2 = 0, \quad \text{even if } k_0 \neq k.$$
All we can assert is that
$k_0^*\mathbf{y} - k^*\mathbf{y}$ is an isotropic vector in the linear
subspace spanned by $\{\mathbf{y}_0, \ldots, \mathbf{y}_N\}$.
Moreover, since $\|k_0^*\mathbf{y} - k^*\mathbf{y}\|^2$ could be negative, it is not true
that $P(k)$ will be minimized by choosing $k = k_0$. So a closer
study is necessary.
We shall start with a definition.
Definition 3 (Stationary Point): The matrix $k_0 \in \mathbb{C}^{(N+1) \times (M+1)}$
is said to be a stationary point of an $(M+1) \times (M+1)$
matrix quadratic form in $k$, say
$$P(k) = A + Bk + k^*B^* + k^*Ck$$
iff $k_0 a$ is a stationary point of the "scalar" quadratic form
$a^*P(k)a$ for all complex column vectors $a \in \mathbb{C}^{M+1}$, i.e., iff
$$\left.\frac{\partial\, a^*P(k)a}{\partial (ka)}\right|_{k = k_0} = 0.$$
Now we can prove the following.
Lemma 3 (Condition for Minimum): A stationary point of
$P(k)$ is a minimum iff for all $a \in \mathbb{C}^{M+1}$
$$\frac{\partial^2\, a^*P(k)a}{\partial (ka)\, \partial (ka)^*} \geq 0. \tag{7}$$
Moreover, it is a unique minimum iff
$$\frac{\partial^2\, a^*P(k)a}{\partial (ka)\, \partial (ka)^*} > 0. \tag{8}$$
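The breakdown of the Hilbert-space argument in Section IV-A can be seen with a single observation and a scalar gain. The sketch below is an illustration under stated assumptions, not part of the paper's development: vectors are real, the Krein space is the Minkowski space with signature `diag(1, 1, -1)`, and the names `error_gramian` and `stationary_gain` are hypothetical. The same pair `z`, `y` is evaluated once with the Euclidean inner product and once with the indefinite one.

```python
import numpy as np

def error_gramian(k, z, y, J):
    """P(k) = <z - k*y, z - k*y> for real vectors z, y and a real scalar gain k."""
    e = z - k * y
    return float(e @ J @ e)

def stationary_gain(z, y, J):
    """k0 = R_y^{-1} R_yz for a single observation y (requires <y, y> != 0)."""
    return float(y @ J @ z) / float(y @ J @ y)

z = np.array([1.0, 0.0, -1.0])
y = np.array([0.0, 0.0, 1.0])   # a negative vector in the Minkowski metric

results = {}
for name, J in (("Euclidean", np.eye(3)),
                ("Minkowski", np.diag([1.0, 1.0, -1.0]))):
    k0 = stationary_gain(z, y, J)
    p0 = error_gramian(k0, z, y, J)
    # Evaluate P(k) on both sides of the stationary point.
    neighbours = [error_gramian(k0 + d, z, y, J) for d in (-0.5, 0.5)]
    results[name] = (k0, p0, neighbours)
    print(name, k0, p0, neighbours)
```

In the Euclidean case the neighbouring values exceed $P(k_0)$, so the projection minimizes the error; in the Minkowski case they fall below it, because $R_y = \langle \mathbf{y}, \mathbf{y}\rangle < 0$ and the stationary point is a maximum.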
Proof: Writing the Taylor series expansion of $a^*P(k)a$
around the stationary point $k_0$ yields (since $a^*P(k)a$ is
quadratic in $ka$)
$$a^*P(k)a - a^*P(k_0)a = a^*(k - k_0)^* \left[\frac{\partial^2\, a^*P(k)a}{\partial (ka)\, \partial (ka)^*}\right] (k - k_0)a.$$
Using the above expression, we see that $k_0$ is a minimum,
i.e., $a^*P(k)a - a^*P(k_0)a \geq 0$ for all $k \neq k_0$ iff (7)
is satisfied. Moreover, $k_0$ will be a unique minimum, i.e.,
$a^*P(k)a - a^*P(k_0)a > 0$ for all $k \neq k_0$ iff (8) is satisfied.
Let us now return to the error Gramian $P(k)$ in (6) and
expand it as
$$P(k) = R_z - R_{zy}k - k^*R_{yz} + k^*R_yk \tag{9a}$$
or more compactly
$$P(k) = \begin{bmatrix} I & -k^* \end{bmatrix} \begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix} \begin{bmatrix} I \\ -k \end{bmatrix}. \tag{9b}$$
Note that the center matrix appearing in (9b) is the Gramian
of the vector $\mathrm{col}\{\mathbf{z}, \mathbf{y}\}$.
For this particular quadratic form, we can use the easily
verified triangular factorization (recall our standing assumption
that $R_y$ is nonsingular)
$$\begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix} = \begin{bmatrix} I & R_{zy}R_y^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} R_z - R_{zy}R_y^{-1}R_{yz} & 0 \\ 0 & R_y \end{bmatrix} \begin{bmatrix} I & 0 \\ R_y^{-1}R_{yz} & I \end{bmatrix} \tag{10}$$
to write
$$a^*P(k)a = \begin{bmatrix} a^* & a^*k^* - a^*R_{zy}R_y^{-1} \end{bmatrix} \begin{bmatrix} R_z - R_{zy}R_y^{-1}R_{yz} & 0 \\ 0 & R_y \end{bmatrix} \begin{bmatrix} a \\ ka - R_y^{-1}R_{yz}a \end{bmatrix}. \tag{11}$$
Calculating the stationary point of $P(k)$ and the corresponding
condition for a minimum is now straightforward. Note, moreover, that $R_y$ nonsingular implies that the stationary point is
unique.
Theorem 1 (Stationary Point of the Error Gramian): When
$R_y$ is nonsingular, $k_0$, the unique coefficient matrix in the
projection of $\mathbf{z}$ onto $\mathcal{L}\{\mathbf{y}\}$
$$\hat{\mathbf{z}} = k_0^*\mathbf{y}, \quad k_0 = R_y^{-1}R_{yz}$$
yields the unique stationary point of the error Gramian
$$P(k) \triangleq \langle \mathbf{z} - k^*\mathbf{y}, \mathbf{z} - k^*\mathbf{y}\rangle = \begin{bmatrix} I & -k^* \end{bmatrix} \begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix} \begin{bmatrix} I \\ -k \end{bmatrix} \tag{12}$$
over all $k \in \mathbb{C}^{(N+1) \times (M+1)}$. Moreover, the value of $P(k)$ at
the stationary point is given by
$$P(k_0) = R_z - R_{zy}R_y^{-1}R_{yz}.$$
Fig. 2. The projection $\hat{\mathbf{z}} = k_0^*\mathbf{y}$ stationarizes the error Gramian $P(k) =
\langle \mathbf{z} - k^*\mathbf{y}, \mathbf{z} - k^*\mathbf{y}\rangle$ over all $k^*\mathbf{y} \in \mathcal{L}\{\mathbf{y}\}$.
Proof: The claims follow easily from (11) by differentiation.
Further differentiation and use of Lemma 3 yields the
following result.
Corollary 1 (Condition for a Minimum): In Theorem 1, $k_0$
is a unique minimum iff
$$R_y > 0$$
i.e., $R_y$ is not only nonsingular but also positive definite.
B. A Partially Equivalent Deterministic Problem
We shall now consider what we call a partially equivalent
deterministic problem. We refer to it as deterministic because
it involves computing the stationary point of a certain scalar
quadratic form over ordinary complex variables (not Krein
space ones). Moreover, it is called partially equivalent since
its solution, i.e., the stationary point, is given by the same
expression as the projection of one suitably defined Krein-space vector onto another, while the condition for a minimum
is different from that for the Krein-space projection.
To this end, consider the scalar second-order form
$$J(z, y) \triangleq \begin{bmatrix} z^* & y^* \end{bmatrix} \begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix}^{-1} \begin{bmatrix} z \\ y \end{bmatrix} \tag{13}$$
where the central matrix is the inverse of the Gramian matrix
in the stochastic problem of Theorem 1 [see (9b)]. Suppose
we seek the stationarizing element $z_0$ for a given $y$. [Of course
now we assume not only that $R_y$ is nonsingular, but so also
the block matrix appearing in (13).] Note that $z$ and $y$ are
no longer boldface, meaning that they are to be regarded as
(ordinary) vectors of complex numbers.
Referring to the discussion at the beginning of Section IV-A on Hilbert spaces, the motivation for this problem is the
fact that for jointly Gaussian random vectors $\{\mathbf{z}, \mathbf{y}\}$, the linear
least-mean-squares estimate can be found as the conditional
mean of the conditional density $p_{zy}(z, y)/p_y(y)$. When $\{\mathbf{z}, \mathbf{y}\}$
are zero-mean with covariance matrix
$$\begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix}$$
taking logarithms of the conditional density results in the quadratic
form (13) which is the negative of the so-called log-likelihood
function. In this case, the relation between (13) and the
projection follows from the fact that the linear least-mean-squares estimate is the same as the maximum likelihood
estimate [obtained by minimizing (13)]. With this motivation,
we now introduce and study the quadratic form $J(z, y)$ without
any reference to $\{z, y\}$ being Gaussian.
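Theorem 1, Corollary 1, and the factorization (10) can be verified on a small numerical instance. The following is a sketch under stated assumptions: the Krein space is taken to be real three-dimensional Minkowski space with signature `diag(1, 1, -1)`, the rows of `Z` and `Y` play the role of the elements $\mathbf{z}_j$ and $\mathbf{y}_i$, and the specific numbers are chosen here for illustration only.

```python
import numpy as np

J = np.diag([1.0, 1.0, -1.0])                      # assumed signature
Z = np.array([[1.0, 1.0, 1.0]])                    # one element z_0 (M = 0)
Y = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 2.0]])   # elements y_0, y_1 (N = 1)

# Gramians and cross-Gramians.
R_z, R_zy = Z @ J @ Z.T, Z @ J @ Y.T
R_yz, R_y = R_zy.T, Y @ J @ Y.T

def P(k):
    """Error Gramian (6) for a real gain matrix k of shape (N+1, M+1)."""
    E = Z - k.T @ Y
    return E @ J @ E.T

# Stationary point of Theorem 1 and the predicted value of P there.
k0 = np.linalg.solve(R_y, R_yz)
schur = R_z - R_zy @ k0                            # R_z - R_zy R_y^{-1} R_yz

# Triangular factorization (10) of the Gramian of col{z, y}.
G = np.block([[R_z, R_zy], [R_yz, R_y]])
U = np.block([[np.eye(1), R_zy @ np.linalg.inv(R_y)],
              [np.zeros((2, 1)), np.eye(2)]])
D = np.block([[schur, np.zeros((1, 2))],
              [np.zeros((2, 1)), R_y]])

# Moving away from k0 changes P by a purely second-order term delta^* R_y delta.
delta = np.array([[0.3], [-0.7]])
increment = P(k0 + delta) - P(k0)

print(k0.ravel(), P(k0), increment)
```

Here $R_y = \mathrm{diag}(1, -3)$ is nonsingular but indefinite, so `k0` is a stationary point of $P(k)$ without being a minimum, consistent with Corollary 1; the chosen `delta` makes the increment negative.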
Theorem 2 (Deterministic Stationary Point): Suppose both
$R_y$ and the block matrix in (13) are nonsingular. Then
a) The stationary point $z_0$ of $J(z, y)$ over $z$ is given by
$$z_0 = R_{zy}R_y^{-1}y.$$
b) The value of $J(z, y)$ at the stationary point is
$$J(z_0, y) = y^*R_y^{-1}y.$$
Corollary 2 (Condition for a Minimum): In Theorem 2, $z_0$
is a minimum iff
$$R_z - R_{zy}R_y^{-1}R_{yz} > 0.$$
Proof: We note that [see (10)]
$$\begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix}^{-1} = \begin{bmatrix} I & 0 \\ -R_y^{-1}R_{yz} & I \end{bmatrix} \begin{bmatrix} (R_z - R_{zy}R_y^{-1}R_{yz})^{-1} & 0 \\ 0 & R_y^{-1} \end{bmatrix} \begin{bmatrix} I & -R_{zy}R_y^{-1} \\ 0 & I \end{bmatrix}$$
so that we can write
$$J(z, y) = \begin{bmatrix} z^* - y^*R_y^{-1}R_{yz} & y^* \end{bmatrix} \begin{bmatrix} (R_z - R_{zy}R_y^{-1}R_{yz})^{-1} & 0 \\ 0 & R_y^{-1} \end{bmatrix} \begin{bmatrix} z - R_{zy}R_y^{-1}y \\ y \end{bmatrix}.$$
It now follows by differentiation that the stationary point of
$J(z, y)$ is equal to $z_0 = R_{zy}R_y^{-1}y$, and that $J(z_0, y) =
y^*R_y^{-1}y$. To prove the Corollary, we differentiate once again,
and use Lemma 3.
Remark 1: Comparing the results of Theorems 1 and 2
shows that the stationary point $z_0$ of the scalar quadratic form
(13) is given by a formula that is exactly the same as that in
Theorem 1 for the Krein-space projection of a vector $\mathbf{z}$ onto
the linear span $\mathcal{L}\{\mathbf{y}\}$. In Theorem 2, however, there is no
Krein space: $z$ and $y$ are just vectors (in general of different
dimensions) in Euclidean space and $z_0$ is not the projection
of $z$ onto the vector $y$. What we have shown in Theorem 2
is that by properly defining the scalar quadratic form as in
(13) using coefficient matrices $R_z$, $R_y$, $R_{zy}$, and $R_{yz}$ that are
arbitrary but can be regarded as being obtained from Gramians
and cross-Gramians of some Krein-space vectors $\{\mathbf{z}, \mathbf{y}\}$, we
can calculate the stationary point using the same recipe as in
Theorem 1.
Remark 2: Although the stationary points of the matrix
quadratic form $P(k)$ and the scalar quadratic form $J(z, y)$
are found by the same computations, the two forms do
not necessarily simultaneously have a minimum, since one
requires the condition $R_y > 0$ (Corollary 1), and the other
requires the condition $R_z - R_{zy}R_y^{-1}R_{yz} > 0$ (Corollary 2).
This is the major difference from the classical Hilbert space
context where we have
$$\begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix} > 0. \tag{14}$$
When (14) holds, the approaches of Theorems 1 and 2 give
equivalent results.
Corollary 3 (Simultaneous Minima): For vectors $\mathbf{z}$ and $\mathbf{y}$
of linearly independent elements in a Hilbert space $\mathcal{H}$, the
conditions $R_z - R_{zy}R_y^{-1}R_{yz} > 0$ and $R_y > 0$ occur
simultaneously.
Proof: Immediate from the factorization (10).
We shall see in more detail in Part II, and to some extent in
Section VI-B of this paper, that this difference is what makes
$H^\infty$ (and risk-sensitive and finite memory adaptive filtering)
results different from $H^2$ results. Briefly, $H^\infty$ problems will
lead directly to certain indefinite quadratic forms: to stationarize
them we shall find it useful to set up the corresponding
Krein-space problem and appeal to Theorem 1. While this will
give an algorithm, further work will be necessary to check for
the minimum condition of Theorem 2 in the $H^\infty$ problem.
It is this difference that leads us to say that the deterministic
problem is only partially equivalent to the stochastic problem
of Section IV-A. (We may remark that we are making a
distinction between equivalence and "duality": one can in fact
define duals to both the above problems, but we defer this
topic to another occasion.)
Remark 3: Finally, recall that Lemma 2 on the existence
and uniqueness of the projection implies that the stochastic
problem of Theorem 1 has a unique solution if, and only if, $R_y$
is nonsingular, thus explaining our standing assumption. The
following result is the analog for the deterministic problem.
Lemma 4 (Existence of Stationarizing Solutions): The deterministic problem of Theorem 2 has a unique stationarizing
solution for all $y$ if, and only if, $R_y$ is nonsingular.
Proof: Let us denote
$$\begin{bmatrix} R_z & R_{zy} \\ R_{yz} & R_y \end{bmatrix}^{-1} = \begin{bmatrix} A & B \\ B^* & C \end{bmatrix}$$
so that
$$J(z, y) = \begin{bmatrix} z^* & y^* \end{bmatrix} \begin{bmatrix} A & B \\ B^* & C \end{bmatrix} \begin{bmatrix} z \\ y \end{bmatrix}.$$
If $J(z, y)$ has a unique stationarizing solution for all $y$,
then $A$ must be nonsingular (since by differentiation the
stationary point must satisfy the equation $Az_0 = -By$). But
the invertibility of $A$ and the whole center matrix appearing in
$J(z, y)$ imply the invertibility of the Schur complement $C -
B^*A^{-1}B$. But it is easy to check that this Schur complement
must be the inverse of $R_y$. Thus $R_y$ must be invertible.
On the other hand if $R_y$ is invertible, then the deterministic
problem has a unique stationarizing solution as given by
Theorem 2.
C. Alternative Inertia Conditions for Minima
In many cases it can be complicated to directly check for
the positivity condition of the deterministic problem, namely
$R_z - R_{zy}R_y^{-1}R_{yz} > 0$. On the other hand, it is often easier
to compute the inertia (the number of positive, negative, and
zero eigenvalues) of $R_y$ itself. This often suffices [24].
Lemma 5 (Inertia Conditions for Deterministic Minimization):
If R, and R, are nonsingular, then the deterministic
problem of Theorem 2 will have a minimizing solution
(i.e., R, - R,,R;lR,, will be > 0) if, and only if
unknown complex vectors. The output y j E C P is assumed
known for all j.
In many applications one is confronted with the following
deterministic minimization problem: Given { yj}j",,, minimize
over xo and
the quadratic form
zyxwvutsrq
zyxwvutsrq
zy
I-[R,] 1I-[%]
{U~}Y=~
+ I-[(Ry- R y z R ~ ' R z y ) ] (15)
where I-[A] denotes the negative inertia (number of
negative eigenvalues) of A.
When R, > 0 (rather than just being nonsingular) then
we will have a minimizing solution iff
I-[Ry]= I-[R, - R,,R,lR,,]
(16)
i.e., if, and only if, R, and R, - R,,R;lR,, have the
same inertia.
Proof: If R, and R, are both nonsingular, then equating the
lower-upper and upper-lower block triangular factorizations of
the Gramian matrix in (10) will yield the result that
subject to the state-space constraints (17), and where Q, E
, S, E C m x p , R, E CPxp, IIo E C n X n are (possibly
indefinite) given Hermitian matrices.
The above deterministic quadratic form is usually encountered in filtering problems; a special case that we shall see in
the companion paper is the 23"-filtering problem where the
cmxm
['
zyxwvu
weighting matrices are
IIo, Q,
= I , and R, =
-;;I].
and where H, is now replaced by col{H,,L,}. Another
application arises in adaptive filtering in which case we
usually have U ,
0 and F, = I 1151, [23]. In the general
and
case, however, IIo represents the penalty on the initial state,
0 R,R,,R;lR,,
O
I
RY
and {Q,, R,, S,} represents the penalty on the driving and
are congruent. By Sylvester's Law that congruent matrices measurement disturbances {U,, w,}. (There is also a "dual"
have the same inertia 1161, we have
quadratic form that arises in control applications which we
shall study elsewhere.)
I-[R, - R,,R;lR,,]
I-[R,]
Such deterministic problems can be solved via a variety of
= I- [R,] I- [ ( R y - Ry,R;'R,y)].
methods, such as dynamic programming or Lagrange multipliers (see, e.g., [5]), but we shall find it easier to use the
Now if (15) holds, then I-[R, - R,,R;'R,,]
= 0, so that
equivalence discussed in Section IV: construct a (partially)
R, - R,,R;lR,,
> 0.
equivalent Krein space (or stochastic) problem. To do so we
Conversely if I-[R, - R,,R;lR,,] = 0, then (15) holds.
first need to express the J(XO,U,y) of (18) in the form of (13)
When R, > 0, we have I-[R,] = 0, and (16) follows
of Section IV-B.
immediately.
0
For this, we first introduce some vector notation. Note that
The general results presented so far can be made even
the states {x,} and the outputs {y,} are linear, combinations
more explicit when there is more structure in the problems. In
of the fundamental quantities ( 2 0 ,{U,, w,},"=,}. We introduce
particular, we shall see that when we have state-space structure
(the state transition matrix)
both R, and R, - R,, R;' R,, are block-diagonal. Moreover,
a Krein space-Kalman filter will yield a direct method for
computing the inertia of R,. Thus, when we have state-space
structure, it will be much easier to use the results of Lemma 5
and define
than to directly check for the positivity of R, - R,,R;lR,,
~ 2 1 ~, 4 1 .
'1
[".
+
+
zyxwvutsr
zyxwvutsrq
V. STATE-SPACE
STRUCTURE
One approach at this point is to begin by assuming that the
components {y,} of y arise from an underlying Krein space
state-space model. To better motivate the introduction of such
state-space models, however, we shall start with the following
(indefinite) quadratic minimization problem.
Consider a system described by the state-space equations
c
X ~ + I= F,x,
+ G,U,,
Y, = HJX,
+ U,
as the response at time j to an impulse at time k
both 20 = 0 and 01, E 0).
Then with
<j
(assuming
zy
0I
j5 N
(17)
where F, E CnXn, G, E C n X m , and H, E C p x n are given
matrices and the initial state xo E Cn, the driving disturbance
U , E C m , and the measurement disturbance v, E C p , are
the state-space equations (17) allow us to write
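$$y = \mathcal{O} x_0 + \Gamma u + v$$

where

$$\mathcal{O} = \begin{bmatrix} H_0 \\ H_1 \Phi(1, 0) \\ H_2 \Phi(2, 0) \\ \vdots \\ H_N \Phi(N, 0) \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} 0 & & & & \\ h_{10} & 0 & & & \\ h_{20} & h_{21} & 0 & & \\ \vdots & & \ddots & \ddots & \\ h_{N0} & h_{N1} & \cdots & h_{N,N-1} & 0 \end{bmatrix}.$$

Finally we make the change of coordinates

$$\begin{bmatrix} x_0 \\ u \\ v \end{bmatrix} = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ -\mathcal{O} & -\Gamma & I \end{bmatrix} \begin{bmatrix} x_0 \\ u \\ y \end{bmatrix}$$

to obtain

$$J(x_0, u, y) = \begin{bmatrix} x_0 \\ u \\ y \end{bmatrix}^* \left\{ \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \mathcal{O} & \Gamma & I \end{bmatrix} \begin{bmatrix} \Pi_0 & 0 & 0 \\ 0 & Q & S \\ 0 & S^* & R \end{bmatrix} \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \mathcal{O} & \Gamma & I \end{bmatrix}^* \right\}^{-1} \begin{bmatrix} x_0 \\ u \\ y \end{bmatrix} \tag{21}$$

where $Q = Q_0 \oplus \cdots \oplus Q_N$, $S = S_0 \oplus \cdots \oplus S_N$, and $R = R_0 \oplus \cdots \oplus R_N$.

This is now of the desired form (13) (with $z \triangleq \mathrm{col}\{x_0, u\}$). Therefore, comparing with (12) in Theorem 1, we introduce a Krein space state-space model

$$x_{j+1} = F_j x_j + G_j u_j, \qquad y_j = H_j x_j + v_j, \qquad 0 \le j \le N \tag{22a}$$

where the initial state, $x_0$, and the driving and measurement disturbances, $\{u_j\}$ and $\{v_j\}$, are such that

$$\left\langle \begin{bmatrix} u_j \\ v_j \\ x_0 \end{bmatrix}, \begin{bmatrix} u_k \\ v_k \\ x_0 \end{bmatrix} \right\rangle = \begin{bmatrix} \begin{bmatrix} Q_j & S_j \\ S_j^* & R_j \end{bmatrix} \delta_{jk} & 0 \\ 0 & \Pi_0 \end{bmatrix}. \tag{22b}$$

The condition (22b) is the Krein-space version of the usual assumption made in the stochastic (Hilbert space) state-space models, viz., that the initial condition $x_0$ and the driving and measurement disturbances $\{u_i, v_i\}$ are zero-mean uncorrelated random variables with variance matrices $\Pi_0$ and $\begin{bmatrix} Q_i & S_i \\ S_i^* & R_i \end{bmatrix}$, respectively, and that the $\{u_i, v_i\}$ form a white (uncorrelated) sequence. As mentioned before, the Krein-space elements can be thought of as some kind of generalized random variables.

Now if, as was done earlier, we define

$$y = \mathrm{col}\{y_0, \ldots, y_N\}, \qquad u = \mathrm{col}\{u_0, \ldots, u_N\}, \qquad v = \mathrm{col}\{v_0, \ldots, v_N\}$$

then we can use the state-space model (22a) to write

$$\begin{bmatrix} x_0 \\ u \\ y \end{bmatrix} = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \mathcal{O} & \Gamma & I \end{bmatrix} \begin{bmatrix} x_0 \\ u \\ v \end{bmatrix}$$

and to see that

$$\left\langle \begin{bmatrix} x_0 \\ u \\ y \end{bmatrix}, \begin{bmatrix} x_0 \\ u \\ y \end{bmatrix} \right\rangle = \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \mathcal{O} & \Gamma & I \end{bmatrix} \begin{bmatrix} \Pi_0 & 0 & 0 \\ 0 & Q & S \\ 0 & S^* & R \end{bmatrix} \begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \mathcal{O} & \Gamma & I \end{bmatrix}^*$$

which is exactly the inverse of the central matrix appearing in expression (21) for $J(x_0, u, y)$. Therefore, referring to Theorems 1 and 2, the main point is that to find the stationary point of $J(x_0, u, y)$ over $\{x_0, u\}$, we can alternatively find the projection of $\{x_0, u\}$ onto $\mathcal{L}\{y\}$ in the Krein-space model (22a).

Now that we have identified the stochastic and deterministic problems when a state-space structure is assumed, we can give the analogs of Theorems 1 and 2.

Lemma 6 (Stochastic Interpretation): Suppose $z = \mathrm{col}\{x_0, u\}$ and $y$ are related through the state-space model (22a) and (22b), and that $R_y$ given by (27) is nonsingular. Then the stationary point of the error Gramian

$$\langle z - k^* y, z - k^* y \rangle$$

over all $k^* y$ is given by the projection

$$\begin{bmatrix} \hat{x}_{0|N} \\ \hat{u}_{|N} \end{bmatrix} = \begin{bmatrix} \Pi_0 \mathcal{O}^* \\ Q \Gamma^* + S \end{bmatrix} R_y^{-1} y$$

where

$$R_y = \langle y, y \rangle = \mathcal{O} \Pi_0 \mathcal{O}^* + \Gamma Q \Gamma^* + \Gamma S + S^* \Gamma^* + R. \tag{27}$$

Moreover, this stationary point is a minimum if, and only if, $R_y > 0$.

We can now also give the analog result to Theorem 2.

Lemma 7 (Deterministic Quadratic Form): The expression

$$\begin{bmatrix} \hat{x}_{0|N} \\ \hat{u}_{|N} \end{bmatrix} = \begin{bmatrix} \Pi_0 \mathcal{O}^* \\ Q \Gamma^* + S \end{bmatrix} R_y^{-1} y$$

yields the stationary point of the quadratic form

$$J(x_0, u, y) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^{N} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix}^* \begin{bmatrix} Q_j & S_j \\ S_j^* & R_j \end{bmatrix}^{-1} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix} \tag{29a}$$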
over $x_0$ and $u = \mathrm{col}\{u_0, \ldots, u_N\}$, and subject to the state-space constraints

$$x_{j+1} = F_j x_j + G_j u_j, \qquad y_j = H_j x_j + v_j, \qquad 0 \le j \le N.$$

In particular, when $S_j \equiv 0$, the quadratic form is

$$J(x_0, u, y) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^{N} u_j^* Q_j^{-1} u_j + \sum_{j=0}^{N} (y_j - H_j x_j)^* R_j^{-1} (y_j - H_j x_j).$$

The value of $J(x_0, u, y)$ (with either $S_j \equiv 0$ or $S_j \neq 0$) at the stationary point is

$$J(\hat{x}_{0|N}, \hat{u}_{|N}, y) = y^* R_y^{-1} y.$$

A. The Conditions for a Minimum
As mentioned earlier, the important point is that the conditions for minima in these two problems are different: $R_y > 0$ in the stochastic problem, and

$$M \triangleq R_z - R_{zy} R_y^{-1} R_{yz} > 0$$

where $z = \mathrm{col}\{x_0, u\}$, in the deterministic problem. In the state-space case $R_y$ is given by (27). In this section we shall explore the condition for a deterministic minimum under the state-space assumption.

First note that for $M$ we have (30) as shown at the bottom of the page. Now we know that $M > 0$ iff both the (1, 1) block entry in (30) and its Schur complement are positive definite. The (1, 1) block entry may be identified as the Gramian of the error $x_0 - \hat{x}_{0|N}$, i.e.,

$$\Pi_0 - \Pi_0 \mathcal{O}^* R_y^{-1} \mathcal{O} \Pi_0 = \langle x_0 - \hat{x}_{0|N}, x_0 - \hat{x}_{0|N} \rangle \triangleq P_{0|N}. \tag{31}$$

To obtain a nice form for the Schur complement of the (1, 1) block entry, say $\Delta$, we have to use a little matrix algebra. Recall that

$$R_y = \mathcal{O} \Pi_0 \mathcal{O}^* + \Gamma Q \Gamma^* + \Gamma S + S^* \Gamma^* + R = \begin{bmatrix} \mathcal{O} & \Gamma + S^* Q^{-1} \end{bmatrix} \begin{bmatrix} \Pi_0 & 0 \\ 0 & Q \end{bmatrix} \begin{bmatrix} \mathcal{O}^* \\ \Gamma^* + Q^{-1} S \end{bmatrix} + R - S^* Q^{-1} S.$$

Using the second expression for $R_y$ and a well-known matrix inversion formula leads to the expression

$$M^{-1} = \begin{bmatrix} \Pi_0^{-1} & 0 \\ 0 & Q^{-1} \end{bmatrix} + \begin{bmatrix} \mathcal{O}^* \\ \Gamma^* + Q^{-1} S \end{bmatrix} (R - S^* Q^{-1} S)^{-1} \begin{bmatrix} \mathcal{O} & \Gamma + S^* Q^{-1} \end{bmatrix}. \tag{32}$$

Now we use another well-known fact: the (2, 2) block element of $M^{-1}$ is just $\Delta^{-1}$ (where $\Delta^{-1}$ exists since $M$ is positive-definite). Therefore the condition now becomes

$$Q^{-1} + (\Gamma^* + Q^{-1} S)(R - S^* Q^{-1} S)^{-1}(\Gamma + S^* Q^{-1}) > 0$$

so that we have the following result.

Lemma 8 (A Condition for a Minimum): If $Q$ and $R - S^* Q^{-1} S$ are invertible, a necessary and sufficient condition for the stationary point of Lemma 7 to be a minimum is that
i) $P_{0|N} > 0$.
ii) $Q^{-1} + (\Gamma^* + Q^{-1} S)(R - S^* Q^{-1} S)^{-1}(\Gamma + S^* Q^{-1}) > 0$.
When $S \equiv 0$, the second condition becomes $Q^{-1} + \Gamma^* R^{-1} \Gamma > 0$.

The conditions of Lemma 8 need to be reduced further to provide useful computational tests. This can be done in several ways, leading to more specific tests. One interesting way is by showing that $Q^{-1} + (\Gamma^* + Q^{-1} S)(R - S^* Q^{-1} S)^{-1}(\Gamma + S^* Q^{-1})$ may be regarded as the Gramian matrix of the output of a so-called backward dual state-space model. This identification will be useful in studying the $H^\infty$-control problem (and in other ways), but we shall not pursue it here.

Instead we shall use the alternative inertia conditions of Lemma 5 to circumvent the need for direct analysis of the matrix $R_z - R_{zy} R_y^{-1} R_{yz}$. Recall from Lemma 5 that if $R_z > 0$, a unique minimizing solution to the deterministic problem of Theorem 2 exists if, and only if, $R_y$ and $R_y - R_{yz} R_z^{-1} R_{zy}$ have the same inertia. For the state-space structure that we are considering, however,

$$R_z = \begin{bmatrix} \Pi_0 & 0 \\ 0 & Q \end{bmatrix}, \qquad R_{zy} = \begin{bmatrix} \Pi_0 \mathcal{O}^* \\ Q \Gamma^* + S \end{bmatrix}$$

so that after some simple algebra we have

$$R_y - R_{yz} R_z^{-1} R_{zy} = R - S^* Q^{-1} S.$$

Thus $R_y - R_{yz} R_z^{-1} R_{zy}$ is block-diagonal, and we have the following result.

Lemma 9 (Inertia Condition for Minimum): If $\Pi_0 > 0$ and $Q > 0$, then a necessary and sufficient condition for the stationary point of Lemma 7 to be a minimum is that the matrices $R_y$ and $R - S^* Q^{-1} S$ have the same inertia. In particular, if $S \equiv 0$, then $R_y$ and $R$ must have the same inertia.

As we shall see in the next section, the Krein space-Kalman filter provides the block triangular factorization of $R_y$, and thereby allows one to easily compare the inertia of $R_y$ and $R - S^* Q^{-1} S$.
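As an illustration of Lemma 9, the following Python sketch builds the joint Gramian of $\mathrm{col}\{y, x_0, u\}$ for a hypothetical scalar time-invariant model. The values $f = 1/2$, $g = h = 1$, $\Pi_0 = Q = 1$, $S = 0$ and all function names are assumptions made for this example and do not come from the paper. Elimination without row exchanges yields first the pivots of $R_y$ and then those of the Schur complement $M = R_z - R_{zy}R_y^{-1}R_{yz}$, so the inertia test and the direct positivity test can be compared in exact rational arithmetic.

```python
from fractions import Fraction as Fr


def ldl_pivots(a):
    """Pivots d_k of A = L D L^T, found by elimination without row exchanges.

    Assumes A is symmetric and strongly regular (nonzero leading minors).
    """
    a = [row[:] for row in a]
    n = len(a)
    pivots = []
    for k in range(n):
        d = a[k][k]
        if d == 0:
            raise ValueError("matrix is not strongly regular")
        pivots.append(d)
        for i in range(k + 1, n):
            factor = a[i][k] / d
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return pivots


def inertia(pivots):
    """(number of positive, number of negative) pivots; by Sylvester's law
    this is the inertia of the factored matrix."""
    return (sum(d > 0 for d in pivots), sum(d < 0 for d in pivots))


def joint_gramian(f, g, h, pi0, q, r, n_steps):
    """Gramian of col{y, x0, u} for x_{j+1} = f x_j + g u_j, y_j = h x_j + v_j,
    with <x0,x0> = pi0, <u_j,u_j> = q, <v_j,v_j> = r and S = 0 (scalar case)."""
    size = n_steps + 1
    rows = []
    for j in range(size):  # y_j in terms of (x0, u_0..u_N, v_0..v_N)
        row = [h * f**j]
        row += [h * f**(j - k - 1) * g if k < j else Fr(0) for k in range(size)]
        row += [Fr(int(k == j)) for k in range(size)]
        rows.append(row)
    for i in range(size + 1):  # x0 and u_0..u_N themselves
        rows.append([Fr(int(k == i)) for k in range(2 * size + 1)])
    w = [pi0] + [q] * size + [r] * size  # diagonal Gramian of (x0, u, v)
    dim = len(rows)
    return [[sum(rows[a][k] * w[k] * rows[b][k] for k in range(len(w)))
             for b in range(dim)] for a in range(dim)]


def compare_tests(r, n_steps=2):
    """Return (In(R_y), In(R), In(M)) for the example model with weight r."""
    f, g, h, pi0, q = Fr(1, 2), Fr(1), Fr(1), Fr(1), Fr(1)
    piv = ldl_pivots(joint_gramian(f, g, h, pi0, q, r, n_steps))
    size = n_steps + 1
    return inertia(piv[:size]), inertia([r] * size), inertia(piv[size:])


for r in (Fr(-10), Fr(-1, 2)):
    in_ry, in_r, in_m = compare_tests(r)
    lemma9 = in_ry == in_r   # inertia test of Lemma 9
    direct = in_m[1] == 0    # M > 0: no negative pivots
    print(r, in_ry, in_r, in_m, lemma9, direct)
```

In this example the two tests agree for both weights: with $R_j = -10$ the stationary point is a minimum, while with $R_j = -1/2$ the Gramian $R_y$ is positive definite, its inertia differs from that of $R$, and $M$ is indefinite.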
VI. RECURSIVE FORMULAS

So far we have obtained global expressions for computing projections and for checking the conditions for deterministic and stochastic minimization. Computing the projection requires inverting the Gramian matrix $R_y$ and checking for the minimization conditions requires checking the inertia of $R_y$, both of which require $O(N^3)$ (where $N$ is the dimension of $R_y$) computations.

The key consequence of state-space structure in Hilbert space is that the computational burden of finding projections can be significantly reduced, to $O(Nn^3)$ (where $n$ is the dimension of the state-space model), by using the Kalman filter recursions. Moreover, the Kalman filter also recursively factors the positive definite Gramian matrix $R_y$ as $LDL^*$, with $L$ lower triangular with unit diagonal, and $D$ diagonal.

We shall presently see that similar recursions hold in Krein space as well, provided

$R_y$ is strongly nonsingular (or strongly regular)    (34)

in the sense that all its (block) leading minors are nonzero. Recall that in Hilbert space if the $\{y_i\}$ are linearly independent, then $R_y$ is strictly positive definite, so that (34) holds automatically. In the Krein-space theory, we have so far only assumed that $R_y$ is invertible, which does not necessarily imply (34). Recursive projection, i.e., projection onto $\mathcal{L}\{y_0, \ldots, y_i\}$ for all $i$, however, requires that all the (block) leading submatrices of $R_y$ are nonsingular; recall also that (34) implies that $R_y$ has a unique triangular decomposition

$$R_y = LDL^*. \tag{35}$$

Therefore, $\mathrm{In}(R_y) = \mathrm{In}(D)$, and in particular, $R_y > 0$ iff $D > 0$. This is the standard way of recursively computing the inertia of $R_y$.

The standard method of recursive estimation, which also gives a very useful geometric insight into the triangular factorization of $R_y$, is to introduce the innovations

$$e_j = y_j - \hat{y}_j, \qquad 0 \le j \le N \tag{36}$$

where $\hat{y}_j = \hat{y}_{j|j-1}$ = the projection of $y_j$ onto $\mathcal{L}\{y_0, \ldots, y_{j-1}\}$. Note that due to the construction (36), the innovations form an orthogonal basis for $\mathcal{L}\{y_0, \ldots, y_N\}$ (with respect to the Krein-space inner product), which simplifies the calculation of projections. For example, we can express the projection of the fundamental quantities $x_0$ and $u_j$ onto $\mathcal{L}\{y_0, \ldots, y_N\}$ as

$$\hat{x}_{0|N} = \sum_{i=0}^{N} \langle x_0, e_i \rangle \langle e_i, e_i \rangle^{-1} e_i \tag{37}$$

and

$$\hat{u}_{j|N} = \sum_{i=0}^{N} \langle u_j, e_i \rangle \langle e_i, e_i \rangle^{-1} e_i \tag{38}$$

where the state-space structure may be used to calculate the above inner products recursively.

Before proceeding to show this, however, let us note that any method for computing the innovations yields the triangular factorization of the Gramian $R_y$. To this end, let us write

$$y_i = \hat{y}_i + e_i = \langle y_i, e_0 \rangle R_{e,0}^{-1} e_0 + \cdots + \langle y_i, e_{i-1} \rangle R_{e,i-1}^{-1} e_{i-1} + e_i$$

and collect such expressions in matrix form

$$\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix} = L \begin{bmatrix} e_0 \\ e_1 \\ \vdots \\ e_N \end{bmatrix}$$

where $L$ is lower triangular with unit diagonal. Therefore, since the $e_i$ are orthogonal, the Gramian of $y$ is

$$R_y = L R_e L^*, \qquad \text{where } R_e = R_{e,0} \oplus R_{e,1} \oplus \cdots \oplus R_{e,N}.$$

We thus have the following result.

Lemma 10 (Inertia of $R_y$): The Gramian $R_y$ of $y$ has the same inertia as the Gramian of the innovations, $R_e$. The strong regularity of $R_y$ implies the nonsingularity of $R_{e,i}$, $0 \le i \le N$. In particular, $R_y > 0$ if, and only if,

$$R_{e,i} > 0 \quad \text{for all } i = 0, 1, \ldots, N.$$

We should also point out that the value at the stationary point of the quadratic form in Theorem 2 can also be expressed in terms of the innovations

$$J(z_0, y) = y^* R_y^{-1} y = y^* L^{-*} R_e^{-1} L^{-1} y = e^* R_e^{-1} e = \sum_{j=0}^{N} e_j^* R_{e,j}^{-1} e_j. \tag{39}$$

A. The Krein Space-Kalman Filter
Now we shall show that the state-space structure allows us to efficiently compute the innovations by an immediate extension of the Kalman filter.

Theorem 3 (Kalman Filter in Krein Space): Consider the Krein-space state equations

$$x_{j+1} = F_j x_j + G_j u_j, \qquad y_j = H_j x_j + v_j, \qquad 0 \le j \le N$$

with

$$\left\langle \begin{bmatrix} u_j \\ v_j \\ x_0 \end{bmatrix}, \begin{bmatrix} u_k \\ v_k \\ x_0 \end{bmatrix} \right\rangle = \begin{bmatrix} \begin{bmatrix} Q_j & S_j \\ S_j^* & R_j \end{bmatrix} \delta_{jk} & 0 \\ 0 & \Pi_0 \end{bmatrix}.$$

Assume that $R_y = [\langle y_i, y_j \rangle]$ is strongly regular. Then the innovations can be computed via the formulas

$$e_i = y_i - H_i \hat{x}_i, \qquad 0 \le i \le N \tag{41}$$

$$\hat{x}_{i+1} = F_i \hat{x}_i + K_{p,i}(y_i - H_i \hat{x}_i), \qquad \hat{x}_0 = 0 \tag{42}$$

$$K_{p,i} = (F_i P_i H_i^* + G_i S_i) R_{e,i}^{-1} \tag{43}$$

where

$$R_{e,i} = R_i + H_i P_i H_i^*, \qquad P_0 = \Pi_0$$

and

$$P_{i+1} = F_i P_i F_i^* + G_i Q_i G_i^* - K_{p,i} R_{e,i} K_{p,i}^*. \tag{44}$$

The number of computations is dominated by those in (44) and is readily seen to be $O(n^3)$ per iteration.
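The recursions of Theorem 3 can be tried out on a small example. The Python sketch below is a scalar version ($n = m = p = 1$) with hypothetical parameter values and function names chosen for illustration only. It runs the Krein-space Kalman filter, builds $R_y$ directly from the model, and compares the filter's $R_{e,i}$ and $e_i$ with the pivots $D$ and the vector $L^{-1}y$ obtained by plain forward elimination on $[R_y \mid y]$.

```python
from fractions import Fraction as Fr


def krein_kalman(f, g, h, pi0, q, r, s, ys):
    """Scalar version of the recursions (41)-(44); returns (e_i) and (R_{e,i})."""
    x_hat, p = Fr(0), pi0
    innovations, gramians = [], []
    for y in ys:
        re = r + h * p * h                        # R_{e,i} = R_i + H_i P_i H_i^*
        if re == 0:
            raise ValueError("R_y is not strongly regular")
        kp = (f * p * h + g * s) / re             # K_{p,i}, eq. (43)
        e = y - h * x_hat                         # innovation, eq. (41)
        x_hat = f * x_hat + kp * e                # state estimate, eq. (42)
        p = f * p * f + g * q * g - kp * re * kp  # Riccati recursion, eq. (44)
        innovations.append(e)
        gramians.append(re)
    return innovations, gramians


def gramian_y(f, g, h, pi0, q, r, s, size):
    """R_y = <y, y> built directly from the model, for comparison."""
    dim = 2 * size + 1                            # coordinates (x0, u, v)
    w = [[Fr(0)] * dim for _ in range(dim)]
    w[0][0] = pi0
    for j in range(size):
        u, v = 1 + j, 1 + size + j
        w[u][u], w[v][v] = q, r
        w[u][v] = w[v][u] = s                     # <u_j, v_j> = S_j
    t = []
    for j in range(size):                         # y_j = h f^j x0 + sum_k h_jk u_k + v_j
        row = [h * f**j]
        row += [h * f**(j - k - 1) * g if k < j else Fr(0) for k in range(size)]
        row += [Fr(int(k == j)) for k in range(size)]
        t.append(row)
    return [[sum(t[a][i] * w[i][k] * t[b][k] for i in range(dim) for k in range(dim))
             for b in range(size)] for a in range(size)]


def eliminate(ry, ys):
    """Forward elimination on [R_y | y]: returns L^{-1} y and the pivots D."""
    a = [row[:] + [y] for row, y in zip(ry, ys)]
    n = len(a)
    for k in range(n):
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n + 1):
                a[i][j] -= factor * a[k][j]
    return [a[k][n] for k in range(n)], [a[k][k] for k in range(n)]


params = dict(f=Fr(1, 2), g=Fr(1), h=Fr(1), pi0=Fr(1), q=Fr(1), r=Fr(-2), s=Fr(1, 2))
ys = [Fr(1), Fr(2), Fr(3)]
e, re = krein_kalman(ys=ys, **params)
e_ref, d_ref = eliminate(gramian_y(size=len(ys), **params), ys)
value = sum(ei * ei / rei for ei, rei in zip(e, re))  # sum_j e_j^* R_{e,j}^{-1} e_j
print(e == e_ref, re == d_ref, re, value)
```

For these values the $R_{e,i}$ change sign along the recursion, which is exactly the situation that cannot arise in the Hilbert space case. The exact rational arithmetic is a convenience for checking equalities and is not something the theorem requires.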
Remark: The only difference from the conventional Kalman filter expressions is that the matrices $P_i$ and $R_{e,i}$ (and, by assumption, $\Pi_0$, $Q_i$, and $R_i$) may now be indefinite.

Proof: The same as in the usual Kalman filter theory (see, e.g., [13]). For completeness and to show the power of the geometric viewpoint, however, we present a simple derivation. There is absolutely no formal difference between the steps in the (usual) Hilbert space case and in the Krein-space case.

Begin by noting that

$$e_i = y_i - \hat{y}_i = y_i - (H_i \hat{x}_i + \hat{v}_i) = y_i - H_i \hat{x}_i = H_i \tilde{x}_i + v_i \tag{45}$$

where $\hat{x}_i$ is the projection of $x_i$ on $\mathcal{L}\{y_0, \ldots, y_{i-1}\}$ and where we have defined $\tilde{x}_i = x_i - \hat{x}_i$. It follows readily that

$$R_{e,i} = \langle e_i, e_i \rangle = R_i + H_i P_i H_i^*, \qquad P_i \triangleq \langle \tilde{x}_i, \tilde{x}_i \rangle. \tag{46}$$

Recall (see Lemma 10) that the strong nonsingularity (all leading minors nonzero) of $R_y$ implies that the $\{R_{e,i}\}$ are nonsingular (rather than positive-definite, as in the Hilbert space case). The Kalman filter can now be readily derived by using the orthogonality of the innovations and the state-space structure. Thus we first write

$$\hat{x}_{i+1} = \hat{x}_{i+1|i} = \sum_{j=0}^{i} \langle x_{i+1}, e_j \rangle \langle e_j, e_j \rangle^{-1} e_j$$

and to seek a recursion we decompose the above as

$$\hat{x}_{i+1} = \sum_{j=0}^{i-1} \langle x_{i+1}, e_j \rangle R_{e,j}^{-1} e_j + K_{p,i} e_i, \qquad K_{p,i} \triangleq \langle x_{i+1}, e_i \rangle R_{e,i}^{-1}.$$

Now

$$\langle x_{i+1}, e_i \rangle = F_i \langle x_i, e_i \rangle + G_i \langle u_i, e_i \rangle = F_i P_i H_i^* + G_i S_i.$$

Note also that the first summation can be rewritten as

$$F_i \sum_{j=0}^{i-1} \langle x_i, e_j \rangle R_{e,j}^{-1} e_j + G_i \sum_{j=0}^{i-1} \langle u_i, e_j \rangle R_{e,j}^{-1} e_j = F_i \hat{x}_i + 0.$$

Combining these facts we find

$$\hat{x}_{i+1} = F_i \hat{x}_i + K_{p,i} e_i. \tag{47}$$

Moreover, since $x_i = \hat{x}_i + \tilde{x}_i$ with $\hat{x}_i$ orthogonal to $\tilde{x}_i$, the Gramians $\Pi_i \triangleq \langle x_i, x_i \rangle$ and $\Sigma_i \triangleq \langle \hat{x}_i, \hat{x}_i \rangle$ satisfy

$$P_i = \Pi_i - \Sigma_i.$$

The state-space equations (22a) show that the state variance $\Pi_i$ obeys the recursion

$$\Pi_{i+1} = F_i \Pi_i F_i^* + G_i Q_i G_i^*.$$

Likewise, the orthogonality of the innovations implies that (47) will yield

$$\Sigma_{i+1} = F_i \Sigma_i F_i^* + K_{p,i} R_{e,i} K_{p,i}^*.$$

Subtracting the above two equations yields the desired Riccati recursion for $P_i$,

$$P_{i+1} = F_i P_i F_i^* + G_i Q_i G_i^* - K_{p,i} R_{e,i} K_{p,i}^*.$$

Equations (46)-(49) constitute the Kalman filter of Theorem 3. □

In Kalman filter theory there are many variations of the above formulas and we note one here. Let us define the filtered estimate, $\hat{x}_{i|i}$ = the projection of $x_i$ onto $\mathcal{L}\{y_0, \ldots, y_i\}$.

Theorem 4 (Measurement and Time Updates): Consider the Krein state-space equations of Theorem 3 and assume that $R_y$ is strongly regular. Then when $S_i \equiv 0$, the filtered estimates $\hat{x}_{i|i}$ can be computed via the following (measurement and time update) formulas

$$\hat{x}_{i|i} = \hat{x}_i + K_{f,i} e_i, \qquad K_{f,i} = P_i H_i^* R_{e,i}^{-1}$$

$$\hat{x}_{i+1} = F_i \hat{x}_{i|i}$$

where $e_i$, $R_{e,i}$, and $P_i$ are as in Theorem 3.

Corollary 4 (Filtered Recursions): The two step recursions of Theorem 4 can be combined into the single recursion

$$\hat{x}_{i+1|i+1} = F_i \hat{x}_{i|i} + K_{f,i+1}(y_{i+1} - H_{i+1} F_i \hat{x}_{i|i}), \qquad \hat{x}_{-1|-1} = 0. \tag{52}$$

For numerical reasons, certain square-root versions of the KF are now more often used in state-space estimation. Furthermore, for constant systems or in fact for systems where the time-variation is structured in a certain way, the Riccati recursions and the square-root recursions, both of which take $O(n^3)$ elementary computations (flops) per iteration, can be replaced by the more efficient Chandrasekhar recursions which require only $O(n^2)$ flops per iteration [17], [18]. The square-root and Chandrasekhar recursions can both be extended to the Krein-space setting, as described in [22].

Before closing this section we shall note how the innovations computed in Theorem 3 can be used to determine the projections $\hat{x}_{0|i}$ and $\hat{u}_{j|i}$ using the formulas (37) and (38).
Lemma 11 (Computation of Inner Products): We can write

$$\langle x_0, e_i \rangle = \Pi_0 \Phi_{F-KH}^*(i, 0) H_i^* \tag{53}$$

$$\langle u_j, e_i \rangle = \begin{cases} (Q_j G_j^* - S_j K_{p,j}^*) \Phi_{F-KH}^*(i, j+1) H_i^*, & i > j \\ S_j, & i = j \\ 0, & i < j \end{cases} \tag{54}$$

where

$$\Phi_{F-KH}(i, j) = (F_{i-1} - K_{p,i-1} H_{i-1}) \cdots (F_j - K_{p,j} H_j).$$

These lead to the recursions (55) and (56), found at the bottom of the page, where $\Phi_{F-KH}^*(i, j)$ ($i \ge j$) satisfies the recursion

$$\Phi_{F-KH}^*(i+1, j) = \Phi_{F-KH}^*(i, j)(F_i - K_{p,i} H_i)^*, \qquad \Phi_{F-KH}^*(j, j) = I.$$

Proof: Straightforward computation. □

B. Recursive State-Space Estimation and Quadratic Forms
Theorems 5 and 6 below are essentially restatements of Theorems 1 and 2 when a state-space model is assumed and a recursive solution is sought.

The error Gramian associated with the problem of projecting $\{x_0, u\}$ onto $\mathcal{L}\{y\}$ has already been identified in Lemma 6, and (55) and (56) furnish a recursive procedure for calculating this projection. The condition for a minimum is $R_y > 0$, where $R_y$ has been shown to be congruent to the diagonal matrix $R_e$. This gives the following theorem.

Theorem 5 (Stochastic Problem): Suppose $z = \mathrm{col}\{x_0, u\}$ and $y$ are related through the state-space model (22a) and (22b) and that $R_y$ is strongly regular. Then the state-space estimation algorithm (55), (56) recursively computes the stationary point of the error Gramian

$$\langle z - k^* y, z - k^* y \rangle$$

over all $k^* y$. Moreover, this stationary point is a minimum if, and only if,

$$R_{e,j} > 0 \quad \text{for } j = 0, \ldots, i.$$

Similarly, the scalar quadratic form associated with the (partially) equivalent deterministic problem has already been identified in Lemma 7. In particular, $\hat{x}_{0|N}$ and $\hat{u}_{j|N}$ are the stationary points of $J_N(x_0, u, y)$ over $x_0$ and $u_j$ and subject to the state-space constraints $x_{j+1} = F_j x_j + G_j u_j$, $j = 0, \ldots, N$. In the recursions, for each time $i$, we find $\hat{x}_{0|i}$ and $\hat{u}_{j|i}$ which are the stationary points of

$$J_i(x_0, u, y) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^{i} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix}^* \begin{bmatrix} Q_j & S_j \\ S_j^* & R_j \end{bmatrix}^{-1} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix}.$$

Theorem 6 (Deterministic Problem): If $R_y$ is strongly regular, the stationary point of the quadratic form

$$J_i(x_0, u, y) = x_0^* \Pi_0^{-1} x_0 + \sum_{j=0}^{i} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix}^* \begin{bmatrix} Q_j & S_j \\ S_j^* & R_j \end{bmatrix}^{-1} \begin{bmatrix} u_j \\ y_j - H_j x_j \end{bmatrix} \tag{59}$$

over $x_0$ and $u_j$, subject to the state-space constraints $x_{j+1} = F_j x_j + G_j u_j$, $j = 0, 1, \ldots, i$, can be recursively computed as

$$\hat{x}_{0|i} = \hat{x}_{0|i-1} + \Pi_0 \Phi_{F-KH}^*(i, 0) H_i^* R_{e,i}^{-1} e_i, \qquad \hat{x}_{0|-1} = 0$$

and see (x), shown at the bottom of the page, where the innovations $e_j$ can be computed via the recursions

$$\hat{x}_{i+1} = F_i \hat{x}_i + K_{p,i} e_i, \qquad \hat{x}_0 = 0$$

with $K_{p,i} = (F_i P_i H_i^* + G_i S_i) R_{e,i}^{-1}$, $R_{e,i} = R_i + H_i P_i H_i^*$, $e_i = y_i - H_i \hat{x}_i$, and $P_i$ satisfying the Riccati recursion

$$P_{i+1} = F_i P_i F_i^* + G_i Q_i G_i^* - K_{p,i} R_{e,i} K_{p,i}^*, \qquad P_0 = \Pi_0.$$

Moreover, the value of $J_i(x_0, u, y)$ at the stationary point is given by

$$J_i(\hat{x}_{0|i}, \hat{u}_{|i}, y) = \sum_{j=0}^{i} e_j^* R_{e,j}^{-1} e_j.$$

Proof: The proof follows from the basic equivalence between the deterministic and stochastic problems. The recursions for $\hat{x}_{0|i}$ and $\hat{u}_{j|i}$ are the same as those in the stochastic problem of Lemma 11, and the innovations $e_j$ are found via the Krein space-Kalman filter of Theorem 3. □
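To make the recursion of Theorem 6 for the initial-state estimate concrete, here is a scalar Python sketch with $S = 0$. The parameter values, the function names, and the small Gauss-Jordan solver are assumptions introduced for this illustration. It propagates $\Phi_{F-KH}(i, 0)$ alongside the Krein-space Kalman filter, accumulates $\hat{x}_{0|i}$ and the value of $J_i$, and then checks the final step against the global expressions $\Pi_0 \mathcal{O}^* R_y^{-1} y$ and $y^* R_y^{-1} y$.

```python
from fractions import Fraction as Fr


def smooth_initial_state(f, g, h, pi0, q, r, ys):
    """Scalar sketch (S = 0): run the Krein-space Kalman filter and update
    x0_hat_{|i} = x0_hat_{|i-1} + pi0 * phi(i,0) * h * e_i / R_{e,i}."""
    x_hat, p = Fr(0), pi0
    phi = Fr(1)                 # Phi_{F-KH}(i, 0), starting from Phi(0, 0) = 1
    x0_hat, cost = Fr(0), Fr(0)
    history = []
    for y in ys:
        re = r + h * p * h
        kp = f * p * h / re
        e = y - h * x_hat
        x0_hat += pi0 * phi * h * e / re
        cost += e * e / re      # running value of J_i at its stationary point
        history.append((x0_hat, cost))
        x_hat = f * x_hat + kp * e
        p = f * p * f + g * q * g - kp * re * kp
        phi = (f - kp * h) * phi
    return history


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination without row exchanges."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for k in range(n):
        m[k] = [v / m[k][k] for v in m[k]]
        for i in range(n):
            if i != k:
                m[i] = [vi - m[i][k] * vk for vi, vk in zip(m[i], m[k])]
    return [row[n] for row in m]


def global_estimate(f, g, h, pi0, q, r, ys):
    """x0_hat_{|N} = pi0 O^* R_y^{-1} y and J = y^* R_y^{-1} y, computed globally."""
    size = len(ys)
    o = [h * f**j for j in range(size)]
    gam = [[h * f**(j - k - 1) * g if k < j else Fr(0) for k in range(size)]
           for j in range(size)]
    ry = [[o[a] * pi0 * o[b] + q * sum(gam[a][k] * gam[b][k] for k in range(size))
           + (r if a == b else Fr(0)) for b in range(size)] for a in range(size)]
    w = solve(ry, ys)
    return pi0 * sum(oj * wj for oj, wj in zip(o, w)), sum(y * wj for y, wj in zip(ys, w))


args = (Fr(1, 2), Fr(1), Fr(1), Fr(1), Fr(1), Fr(-2), [Fr(1), Fr(2), Fr(3)])
hist = smooth_initial_state(*args)
print(hist[-1] == global_estimate(*args), hist)
```

The recursive and global results coincide in this example even though the last $R_{e,i}$ is positive while $R_i$ is negative, so the final stationary point is not a minimum. The recursion computes stationary points regardless; the minimum conditions have to be checked separately.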
As mentioned earlier, the deterministic quadratic form of Theorem 6 is often encountered in estimation problems. By appeal to Gaussian assumptions on the $u_j$, $v_j$, and $x_0$, and maximum likelihood arguments, it is well known that state estimates can be obtained via a deterministic quadratic minimization problem. Here we have shown this result using simple projection arguments and have generalized it to indefinite quadratic forms.

The result of Theorem 6 is probably the most important result of this paper, and we shall make frequent use of it in the companion paper [1] to solve the problems of $H^\infty$ and risk-sensitive estimation and finite-memory adaptive filtering. In those problems we shall also need to recursively check for the condition for a minimum, and therefore we will now study these conditions in more detail.

Recall from Lemma 9 that the above deterministic problem has a minimum iff $R_y$ and $R - S^* Q^{-1} S$ have the same inertia. Since $R_y$ is congruent to the block diagonal matrix $R_e$, and since $R - S^* Q^{-1} S$ is also block diagonal, the solution of the recursive stationarization problem will give a minimum at each step if and only if all the block diagonal elements of $R_e$ and $R - S^* Q^{-1} S$ have the same inertia. This leads to the following result.

Lemma 12 (Inertia Conditions for a Minimum): If $\Pi_0 > 0$, $Q > 0$, and $R$ is nonsingular, then the (unique) stationary points of the quadratic forms (59), for $i = 0, 1, \ldots, N$, will each be a unique minimum iff the matrices

$$R_{e,j} \quad \text{and} \quad R_j - S_j^* Q_j^{-1} S_j$$

have the same inertia for all $j = 0, 1, \ldots, N$. In particular, when $S_j \equiv 0$, the condition becomes that $R_{e,j}$ and $R_j$ should have the same inertia for all $j = 0, 1, \ldots, N$.

The conditions of the above lemma are easy to check since the Krein space-Kalman filter used to compute the stationary point also computes the matrices $R_{e,j}$. There is another condition, more frequently quoted in the $H^\infty$ literature, which we restate here (see, e.g., [4]).

Lemma 13 (Condition for a Minimum): If $\Pi_0 > 0$, $Q > 0$, $R$ is invertible, $Q - S R^{-1} S^* > 0$, and $[F_j \;\; G_j]$ has full rank for all $j$, then the quadratic forms (59) will each have a unique minimum if, and only if,

$$P_{j|j}^{-1} = P_j^{-1} + H_j^* R_j^{-1} H_j > 0, \qquad j = 0, 1, \ldots, N.$$

It also follows in the minimum case that $P_{j+1} > 0$ for $j = 0, 1, \ldots, N$.

Remark: In comparison to our result in Lemma 12, we here have the additional requirement that the $[F_j \;\; G_j]$ must be full rank. Furthermore, we not only have to compute the $P_j$ (which is done via the Riccati recursion of the Kalman filter), but we also have to invert $P_j$ (and $R_j$) at each step and then check for the positivity of $P_j^{-1} + H_j^* R_j^{-1} H_j$. The test of Lemma 12 uses only quantities already present in the Kalman filter recursion, viz. $R_{e,j}$ and $R_j$. Moreover, these are $p \times p$ matrices (as opposed to $P_{j|j}^{-1}$, which is $n \times n$) with $p$ typically less than $n$ and whose inertia is easily determined via a triangular factorization. Furthermore it can be shown [22] that even this computation can be effectively blended into the filter recursions by going to a square-root-array version of the Riccati recursion. Here, however, for completeness we shall show how Lemma 13 follows from our Lemma 12.

Proof of Lemma 13: We shall prove the lemma by induction. Consider the matrix

$$\begin{bmatrix} -\Pi_0^{-1} & 0 & H_0^* \\ 0 & -Q_0^{-1} & Q_0^{-1} S_0 \\ H_0 & S_0^* Q_0^{-1} & R_0 - S_0^* Q_0^{-1} S_0 \end{bmatrix}.$$

Two different triangular factorizations (lower-upper and upper-lower) of the above matrix show that

$$\begin{bmatrix} -\Pi_0^{-1} & 0 & 0 \\ 0 & -Q_0^{-1} & 0 \\ 0 & 0 & R_0 + H_0 \Pi_0 H_0^* \end{bmatrix}$$

and (y), shown at the bottom of the page, have the same inertia. Thus, since $\Pi_0 > 0$, $Q_0 > 0$, and $Q_0 - S_0 R_0^{-1} S_0^* > 0$, the matrices $R_{e,0} = R_0 + H_0 \Pi_0 H_0^*$ and $R_0 - S_0^* Q_0^{-1} S_0$ will have the same inertia (and we will have a minimum for $J_0$) iff

$$\Pi_0^{-1} + H_0^* R_0^{-1} H_0 > 0.$$

Now with some effort we may write the first step of the Riccati recursion as

$$P_1 = \begin{bmatrix} F_0 & G_0 \end{bmatrix} \left( \begin{bmatrix} \Pi_0^{-1} & 0 \\ 0 & Q_0^{-1} \end{bmatrix} + \begin{bmatrix} H_0^* \\ Q_0^{-1} S_0 \end{bmatrix} (R_0 - S_0^* Q_0^{-1} S_0)^{-1} \begin{bmatrix} H_0 & S_0^* Q_0^{-1} \end{bmatrix} \right)^{-1} \begin{bmatrix} F_0^* \\ G_0^* \end{bmatrix}.$$

Moreover, the center matrix appearing in the above expression is congruent to

$$\begin{bmatrix} \Pi_0^{-1} + H_0^* R_0^{-1} H_0 & 0 \\ 0 & (Q_0 - S_0 R_0^{-1} S_0^*)^{-1} \end{bmatrix}$$

and hence is positive definite. Thus if $[F_0 \;\; G_0]$ has full rank, we can conclude that $P_1 > 0$. We can now repeat the argument for the next time instant and so on. □
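The two tests can be placed side by side on a small example. The Python sketch below handles a scalar model with $S = 0$; the parameter values and the function name `minimum_tests` are assumptions made for illustration. At each step it evaluates the inertia test of Lemma 12 (same sign for $R_{e,j}$ and $R_j$) and the positivity test of Lemma 13 ($P_j^{-1} + H_j^* R_j^{-1} H_j > 0$), using only the quantities of the Riccati recursion.

```python
from fractions import Fraction as Fr


def minimum_tests(f, g, h, pi0, q, r, n_steps):
    """Scalar sketch with S = 0: at each step return
    (Lemma 12 test, Lemma 13 test) as a pair of booleans."""
    p = pi0
    results = []
    for _ in range(n_steps + 1):
        re = r + h * p * h
        lemma12 = (re > 0) == (r > 0)     # R_{e,j} and R_j have the same sign
        lemma13 = 1 / p + h * h / r > 0   # P_j^{-1} + H_j^* R_j^{-1} H_j > 0
        results.append((lemma12, lemma13))
        kp = f * p * h / re
        p = f * p * f + g * q * g - kp * re * kp
    return results


# Hypothetical values: f = 1/2, g = h = 1, Pi_0 = Q = 1, so [f g] has full rank.
for r in (Fr(-10), Fr(-2)):
    print(r, minimum_tests(Fr(1, 2), Fr(1), Fr(1), Fr(1), Fr(1), r, 2))
```

In this example the two tests give the same verdict at every step, and with $R_j = -2$ both fail at the third step. Note that the Lemma 12 test needs no inversion of $P_j$, in line with the remark above.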
We close this section with yet another condition which will
be useful in control problems.
Lemma 14 (Condition for a Minimum): If in addition to the conditions of Lemma 13, the matrices $F_j - G_j S_j R_j^{-1} H_j$ are invertible for all $j$, then the deterministic problems of Theorem 6 will each have a unique minimum iff $P_{N+1} > 0$ and
Proof: Let us first note that the Riccati recursion can be rewritten as

The proof, which uses the last of the above equalities, now follows from the sequence of congruences, found in (z) at the top of the page, and Lemma 13. □

VII. CONCLUDING REMARKS

We developed a self-contained theory for linear estimation in Krein spaces. We started with the notion of projections and discussed their relation to stationary points of certain quadratic forms encountered in a pair of partially equivalent stochastic and deterministic problems. By assuming an additional state-space structure, we showed that projections could be recursively computed by a Krein space-Kalman filter, several applications for which are described in the companion paper [1].

The approach, in all these applications, is that given an indefinite deterministic quadratic form to which $H^\infty$, risk-sensitive, and finite-memory problems lead almost by inspection, one can relate them to a corresponding Krein-space stochastic problem for which the Kalman filter can be written down immediately and used to obtain recursive solutions of the above problems.

ACKNOWLEDGMENT

The authors would like to thank P. P. Khargonekar and D. J. N. Limebeer for helpful discussions during the preparation of this manuscript. Seminars by P. Park on the KYP Lemma were also helpful in leading us to begin the research.

REFERENCES

[1] B. Hassibi, A. H. Sayed, and T. Kailath, "Linear estimation in Krein spaces-Part II: Applications," this issue, pp. 34-49.
[2] P. P. Khargonekar and K. M. Nagpal, "Filtering and smoothing in an $H^\infty$-setting," IEEE Trans. Automat. Contr., vol. 36, pp. 151-166, 1991.
[3] M. J. Grimble, "Polynomial matrix solution of the $H^\infty$-filtering problem and the relationship to Riccati equation state-space results," IEEE Trans. Signal Processing, vol. 41, no. 1, pp. 67-81, Jan. 1993.
[4] U. Shaked and Y. Theodor, "$H^\infty$-optimal estimation: A tutorial," in Proc. IEEE Conf. Decision Contr., Tucson, AZ, Dec. 1992, pp. 2278-2286.
[5] T. Basar and P. Bernhard, $H^\infty$-Optimal Control and Related Minimax Design Problems-A Dynamic Game Approach. Boston, MA: Birkhauser, 1991.
[6] D. Limebeer, B. D. O. Anderson, P. P. Khargonekar, and M. Green, "A game theoretic approach to $H^\infty$ control for time varying systems," SIAM J. Contr. Optimization, vol. 30, pp. 262-283, 1992.
[7] G. Tadmor, "$H^\infty$ in the time domain: The standard problem," in Am. Contr. Conf., 1989, pp. 772-773.
[8] P. Whittle, Risk Sensitive Optimal Control. New York: Wiley, 1990.
[9] J. Bognar, Indefinite Inner Product Spaces. New York: Springer-Verlag, 1974.
[10] V. I. Istratescu, Inner Product Structures, Theory and Applications, Mathematics and Its Applications. Dordrecht, Holland: Reidel, 1987.
[11] I. S. Iohvidov, M. G. Krein, and H. Langer, "Introduction to the spectral theory of operators in spaces with an indefinite metric," in Mathematical Research. Berlin, Germany: Akademie-Verlag, 1982.
[12] A. Einstein, Relativity: The Special and General Theory, transl. by R. W. Lawson. New York: Crown, 1931.
[13] T. Kailath, Lectures on Wiener and Kalman Filtering. Berlin, Germany: Springer-Verlag, 1981.
[14] M. Green and D. J. N. Limebeer, Linear Robust Control. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[15] A. H. Sayed and T. Kailath, "A state-space approach to adaptive RLS filtering," IEEE Signal Processing Mag., pp. 18-60, July 1994.
[16] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: Johns Hopkins Univ., 1989.
[17] M. Morf, G. S. Sidhu, and T. Kailath, "Some new algorithms for recursive estimation in constant, linear, discrete-time systems," IEEE Trans. Automat. Contr., vol. AC-19, pp. 315-323, 1974.
[18] A. H. Sayed and T. Kailath, "Extended Chandrasekhar recursions," IEEE Trans. Automat. Contr., vol. 39, pp. 2265-2269, Nov. 1994.
[19] A. E. Bryson and Y. C. Ho, Applied Optimal Control. Blaisdell, 1969.
[20] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic, 1970.
[21] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.
[22] B. Hassibi, A. H. Sayed, and T. Kailath, "Square-root arrays and Chandrasekhar recursions for $H^\infty$ problems," submitted to IEEE Trans. Automat. Contr., also in Proc. 33rd IEEE Conf. Dec. Cont., 1994, pp. 2237-2243.
[23] B. Hassibi, A. H. Sayed, and T. Kailath, "$H^\infty$ optimality of the LMS algorithm," IEEE Trans. Signal Processing, to appear. Also in Proc. 32nd IEEE Conf. Dec. and Cont., 1993, pp. 74-80.
[24] B. Hassibi, A. H. Sayed, and T. Kailath, "Fundamental inertia conditions for the solution of $H^\infty$ problems," in Proc. ACC, June 1995.
Babak Hassibi was born in Tehran, Iran, in 1967.
He received the B.S. degree from the University of
Tehran in 1989, and the M.S. degree from Stanford
University, Stanford, CA, in 1993, both in electrical
engineering. He is currently pursuing the Ph.D.
degree at Stanford University.
From June 1992 to September 1992 he was a
Summer Intern with Ricoh, California Research
Center, Menlo Park, CA, and from August 1994
to December 1994 he was a short-term Research
Fellow at the Indian Institute of Science, Bangalore,
India. His research interests include robust estimation and control, adaptive
signal processing and neural networks, blind equalization of communication
channels, and linear algebra.
Ali H. Sayed (S'90-M'92) was born in São Paulo,
Brazil, in 1963. He received the B.S. and M.S.
degrees in electrical engineering from the University
of São Paulo, in 1987 and 1989, respectively. In
1992 he received the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA.
From September 1992 to August 1993, he was
a Research Associate with the Information Systems
Laboratory at Stanford University, after which he
joined the Department of Electrical and Computer
Engineering at the University of California, Santa
Barbara, as an Assistant Professor. His research interests include adaptive and
statistical signal processing, robust filtering and control, interplays between
signal processing and control methodologies, interpolation theory, and structured computations in systems and mathematics.
Dr. Sayed is a recipient of the Institute of Engineering Prize, 1987 (Brazil),
the Conde Armando Alvares Penteado Prize, 1987 (Brazil), and a 1994 NSF
Research Initiation Award.
Thomas Kailath (S’57-M’62-F’70) received the
S.M. degree in 1959 and the Sc.D degree in 1961
from the Massachusetts Institute of Technology.
From October 1961 to December 1962, he
worked at the Jet Propulsion Laboratories, Pasadena,
CA, where he also taught part-time at the California
Institute of Technology. He then went to Stanford
University, where he served as Director of the
Information Systems Laboratory from 1971 through
1980, as Associate Department Chairman from 1981
to 1987, and currently holds the Hitachi America
Professorship in Engineering. He has held short-term appointments at several
institutions around the world. His recent research interests include applications
of signal processing, computation and control to problems in semiconductor
manufacturing, and wireless communications. He is the author of Linear
Systems (Englewood Cliffs, NJ: Prentice Hall, 1980) and Lectures on Wiener
and Kalman Filtering (New York: Springer-Verlag, 1981).
Dr. Kailath is a fellow of the Institute of Mathematical Statistics and is a
member of the National Academy of Engineering and the American Academy
of Arts and Sciences. He has held Guggenheim, Churchill, and Royal Society
fellowships, among others, and received awards from the IEEE Information
Theory Society and the American Control Council, in addition to the Technical
Achievement and Society Awards of the IEEE Signal Processing Society. He
served as President of the IEEE Information Theory Society in 1975.