Matrices A(t) Depending on a Parameter t
If a square matrix A(t) depends smoothly on a parameter t, are its eigenvalues and eigenvectors also smooth functions of t? The answer is "yes" most of the time, but not always. This story, while old, is interesting and elementary, and deserves to be better known. One can ask the same question for operators such as the Schrödinger operator whose potential depends on a parameter; it is in that setting that much of the current understanding arose.
Warm-up Exercise
Given a polynomial

p(x,t) = x^n + a_{n−1}(t) x^{n−1} + · · · + a_1(t) x + a_0(t)

whose coefficients depend smoothly on a parameter t, assume that at t = 0 the number x = c is a simple root of this polynomial, p(c, 0) = 0. Show that for all t sufficiently near 0 there is a unique root x(t) with x(0) = c that depends smoothly on t. Moreover, if p(x,t) is a real analytic function of t, that is, if it has a convergent power series expansion in t near t = 0, then so does x(t).
SOLUTION: Given that p(c, 0) = 0, we want to solve p(x,t) = 0 for x(t) with x(0) = c. The assertions are immediate from the implicit function theorem. Since x = c is a simple zero of p(x, 0), we may write p(x, 0) = (x − c)g(x) with g(c) ≠ 0. Thus the derivative p_x(c, 0) = g(c) ≠ 0.
The example p(x,t) := x^3 − t = 0, whose root is x(t) = t^{1/3}, shows that x(t) may not be a smooth function of t at a multiple root. In this case the best one can get is a Puiseux expansion in fractional powers of t (see [Kn, ¶15]).
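To see the warm-up concretely, here is a minimal numerical sketch (my illustration, not from the text), assuming the polynomial p(x,t) = x^3 − x + t, which at t = 0 has the simple root c = 1. Newton's method in x tracks that root smoothly, while the root of x^3 − t = 0 shows the Puiseux (cube-root) behavior at a triple root.

```python
import numpy as np

def track_root(p, px, x, t, iters=25):
    """Newton's method in x with t held fixed; px is the x-derivative of p."""
    for _ in range(iters):
        x -= p(x, t) / px(x, t)
    return x

# Assumed example: p(x,t) = x^3 - x + t has the simple root x = 1 at t = 0,
# and p_x(1, 0) = 2 != 0, so the implicit function theorem applies.
p  = lambda x, t: x**3 - x + t
px = lambda x, t: 3.0 * x**2 - 1.0

for t in [-0.05, 0.0, 0.05]:
    # Near t = 0 the root is x(t) = 1 - t/2 + O(t^2), since x'(0) = -p_t/p_x = -1/2.
    print(f"t={t:+.2f}  root={track_root(p, px, 1.0, t):.6f}")

# By contrast, for p(x,t) = x^3 - t the root x(t) = t^(1/3) is continuous but
# has infinite slope at t = 0: no smooth dependence at a multiple root.
for t in [1e-6, 1e-3, 1.0]:
    print(f"t={t:.0e}  cube root={np.cbrt(t):.4f}")
```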
PROOF:∗ Although we won't use it, the eigenvalue part is immediate from the warm-up exercise above applied to the characteristic polynomial. It is the eigenvector aspect that takes a bit more work.
Given A(0)X₀ = λ₀X₀ for some vector X₀ with ‖X₀‖ = 1, we want a function λ(t) and a vector X(t) that depend smoothly on t with the properties

A(t)X(t) = λ(t)X(t),   ⟨X₀, X(t)⟩ = 1,   and   λ(0) = λ₀,   X(0) = X₀.

Here ⟨X, Y⟩ is the standard inner product. Of course we could also have used a different normalization, such as ‖X(t)‖² = 1.
SOME BACKGROUND ON THE IMPLICIT FUNCTION THEOREM.
Suppose H : ℝᴺ × ℝ → ℝᴺ and we want to solve the equations H(Z,t) = 0 for Z = Z(t). These are N equations for the N unknowns Z(t). Assume that Z = Z₀ is a solution at t = 0, so H(Z₀, 0) = 0. Expanding H in a Taylor series in the variable Z near Z = Z₀ we get

H(Z,t) = H(Z₀,t) + H_Z(Z₀,t)(Z − Z₀) + · · · ,

where H_Z is the derivative matrix and · · · represents the higher order terms. If these higher order terms were missing, then the solution of H(Z,t) = 0 would be simply

Z − Z₀ = −[H_Z(Z₀,t)]⁻¹ H(Z₀,t),

that is,

Z = Z₀ − [H_Z(Z₀,t)]⁻¹ H(Z₀,t).

This assumes that the first derivative matrix H_Z(Z₀, 0) is invertible (it is then invertible for all t near zero). The implicit function theorem says that this is still true even if the higher order terms are present; the key assumption is that the first derivative matrix H_Z(Z₀, 0) is invertible. Although we may think of the special case where t ∈ ℝ, this works without change if the parameter t ∈ ℝᵏ is a vector.
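To illustrate (a minimal sketch of the idea, with an assumed toy system, not anything from the text): the formula Z = Z₀ − [H_Z]⁻¹H is exactly one Newton step, and repeating it while stepping t follows the solution branch Z(t).

```python
import numpy as np

def solve_branch(H, HZ, Z0, ts, newton_steps=10):
    """Follow the solution branch Z(t) of H(Z, t) = 0, starting from a known
    solution Z0 at t = ts[0].  Each update Z <- Z - HZ(Z,t)^{-1} H(Z,t) is the
    linearized formula above; invertibility of HZ is the key hypothesis of
    the implicit function theorem."""
    Z = np.array(Z0, dtype=float)
    for t in ts:
        for _ in range(newton_steps):
            Z = Z - np.linalg.solve(HZ(Z, t), H(Z, t))
        yield t, Z.copy()

# Toy system (an assumed example):
#   H(Z, t) = (z1^2 + z2 - 1 - t,  z1 - z2).
H  = lambda Z, t: np.array([Z[0]**2 + Z[1] - 1.0 - t, Z[0] - Z[1]])
HZ = lambda Z, t: np.array([[2.0 * Z[0], 1.0], [1.0, -1.0]])

phi = (np.sqrt(5.0) - 1.0) / 2.0   # H((phi, phi), 0) = 0 since phi^2 + phi = 1
for t, Z in solve_branch(H, HZ, [phi, phi], np.linspace(0.0, 0.5, 6)):
    print(f"t={t:.1f}  Z(t)={Z}")
```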
(CONTINUATION OF THE PROOF) We may assume that λ₀ = 0. Write our equations as

F(X, λ, t) := \begin{pmatrix} f(X, λ, t) \\ g(X, λ, t) \end{pmatrix} := \begin{pmatrix} A(t)X − λX \\ ⟨X₀, X⟩ − 1 \end{pmatrix},

where we have written f(X, λ, t) := A(t)X − λX and g(X, λ, t) := ⟨X₀, X⟩ − 1. We wish to solve F(X, λ, t) = 0 for both X(t) and λ(t) near t = 0. In the notation of the previous paragraph, Z = (X, λ) and H(Z,t) = F(X, λ, t). Thus the derivative matrix H_Z involves differentiation with respect to both X and λ.
The derivative with respect to the variables X and λ is the partitioned matrix

F′(X, λ, t) = \begin{pmatrix} f_X & f_λ \\ g_X & g_λ \end{pmatrix} = \begin{pmatrix} A(t) − λI & −X \\ X₀ᵀ & 0 \end{pmatrix}.

Here we used ⟨X₀, X⟩ = X₀ᵀX, where X₀ᵀ is the transpose of the column vector X₀. Thus at t = 0

F′(X₀, 0, 0) = \begin{pmatrix} A(0) & −X₀ \\ X₀ᵀ & 0 \end{pmatrix}.
∗ This was worked out at the blackboard with Dennis DeTurck.
For the implicit function theorem we check that the matrix on the right is invertible. It is enough to show that its kernel is zero. Thus, say F′(X₀, 0, 0)W = 0, where W = \begin{pmatrix} V \\ r \end{pmatrix}. Then

A(0)V − rX₀ = 0   and   ⟨X₀, V⟩ = 0.

Applying A(0) to the first equation and using A(0)X₀ = λ₀X₀ = 0, we find

A(0)²V = rA(0)X₀ = 0.

By assumption, the eigenvalue λ₀ = 0 is simple, so its generalized eigenspace is one-dimensional; thus the only solutions of A(0)²V = 0 are V = (const)X₀. But then ⟨X₀, V⟩ = 0 gives V = 0. The first equation then gives rX₀ = 0, so also r = 0, and hence W = 0.
Since the derivative matrix F′(X₀, 0, 0) is invertible, by the implicit function theorem the equation F(X, λ, t) = 0 has the desired smooth solution X = X(t), λ = λ(t) near t = 0. If F is real analytic in t near t = 0, then so is this solution.
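The proof is effectively an algorithm. Here is a hedged numerical sketch (my own illustration, with an assumed family A(t)) that tracks the eigenpair by applying Newton's method to F(X, λ, t) = 0, using the bordered Jacobian computed above.

```python
import numpy as np

def eig_branch(A, X0, lam, ts, newton_steps=12):
    """Track (X(t), λ(t)) with the normalization <X0, X(t)> = 1 by applying
    Newton's method to F(X, λ, t) = (A(t)X - λX, <X0, X> - 1)."""
    n = len(X0)
    X = np.array(X0, dtype=float)
    for t in ts:
        for _ in range(newton_steps):
            F = np.append(A(t) @ X - lam * X, X0 @ X - 1.0)
            J = np.zeros((n + 1, n + 1))        # bordered Jacobian from the proof:
            J[:n, :n] = A(t) - lam * np.eye(n)  # [[A(t) - λI, -X], [X0^T, 0]]
            J[:n, n]  = -X
            J[n, :n]  = X0
            step = np.linalg.solve(J, F)
            X, lam = X - step[:n], lam - step[n]
        yield t, lam, X.copy()

# Assumed example: A(t) = [[0, 1], [t, 0]] near t = 1, where λ0 = 1 is a
# simple eigenvalue of A(1) with unit eigenvector X0 = (1, 1)/sqrt(2).
A  = lambda t: np.array([[0.0, 1.0], [t, 0.0]])
X0 = np.array([1.0, 1.0]) / np.sqrt(2.0)

for t, lam, X in eig_branch(A, X0, 1.0, np.linspace(1.0, 1.5, 6)):
    print(f"t={t:.1f}  λ(t)={lam:.6f}  exact sqrt(t)={np.sqrt(t):.6f}")
```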
We conclude with examples showing that if either the self-adjointness or the analyticity assumption is dropped, the eigenvalues and/or eigenvectors may not depend smoothly on t.
EXAMPLE 1. At t = 0 the matrix

A(t) = \begin{pmatrix} 0 & 1 \\ t & 0 \end{pmatrix}

has 0 as a double eigenvalue. Since the characteristic polynomial is p(λ) = λ² − t, the eigenvalues ±√t are not smooth functions of t for t near 0.
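A quick numerical check (a sketch using numpy; eigenvalue ordering is not guaranteed) confirms the ±√t behavior, including complex eigenvalues for t < 0:

```python
import numpy as np

# Eigenvalues of A(t) = [[0, 1], [t, 0]] are ±sqrt(t): continuous at t = 0
# but with infinite slope there, and complex for t < 0.
for t in [1e-8, 1e-4, 1e-2, -1e-2]:
    ev = np.linalg.eigvals(np.array([[0.0, 1.0], [t, 0.0]]))
    print(f"t={t:+.0e}  eigenvalues={np.sort_complex(ev)}")
```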
EXAMPLE 2a. This is a symmetric matrix depending smoothly (but not analytically) on t. Near t = 0 the eigenvectors are not even continuous functions of t. This is from Rellich's nice book [Re, page 41]. Let

B(t) = \begin{pmatrix} a(t) & 0 \\ 0 & −a(t) \end{pmatrix}   and   R(t) = \begin{pmatrix} cos ϕ(t) & −sin ϕ(t) \\ sin ϕ(t) & cos ϕ(t) \end{pmatrix},

where a(0) = 0, and for t ≠ 0 we let a(t) := exp(−1/t²) and ϕ(t) := 1/t.
The desired symmetric matrix is A(t) = R(t)B(t)R⁻¹(t). It is similar to B(t), but the new basis determined by the orthogonal matrix R(t) is spinning quickly near t = 0. We find

A(t) = a(t) \begin{pmatrix} cos 2ϕ(t) & sin 2ϕ(t) \\ sin 2ϕ(t) & −cos 2ϕ(t) \end{pmatrix}.
Its eigenvalues are λ± = ±a(t). For t ≠ 0 the orthonormal eigenvectors are

V₊(t) = \begin{pmatrix} cos ϕ(t) \\ sin ϕ(t) \end{pmatrix}   and   V₋(t) = \begin{pmatrix} −sin ϕ(t) \\ cos ϕ(t) \end{pmatrix}.
Since a(t) goes to 0 so quickly near t = 0, the matrix A(t) is a C∞ function of t even though ϕ(t) = 1/t. However, the eigenvectors keep spinning and do not even converge as t → 0.
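One can watch this numerically (a sketch using numpy; `eigh` fixes an eigenvector only up to sign, which is all that matters here): the entries of A(t) shrink like exp(−1/t²), yet the eigenvector direction keeps rotating as t → 0.

```python
import numpy as np

def A(t):
    """Rellich's matrix: a(t) = exp(-1/t^2), phi(t) = 1/t, A = R B R^{-1}."""
    a, phi = np.exp(-1.0 / t**2), 1.0 / t
    return a * np.array([[np.cos(2 * phi),  np.sin(2 * phi)],
                         [np.sin(2 * phi), -np.cos(2 * phi)]])

for t in [0.5, 0.25, 0.125, 0.0625]:
    w, V = np.linalg.eigh(A(t))        # columns of V are orthonormal eigenvectors
    v = V[:, 1]                        # eigenvector for the eigenvalue +a(t)
    angle = np.degrees(np.arctan2(v[1], v[0])) % 180.0   # direction mod 180 degrees
    print(f"t={t:.4f}  max|A|={np.abs(A(t)).max():.2e}  eigenvector angle={angle:6.1f}")
```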
References
[AKML] Alekseevsky, D., Kriegl, A., Michor, P., and Losik, M., "Choosing Roots of Polynomials Smoothly, II," Israel Journal of Mathematics 139 (2004), 183–188.
[Ka] Kato, Tosio, Perturbation Theory for Linear Operators, Grundlehren 132, Springer-Verlag, Berlin and New York, 1976.
[Kn] Knopp, K., Theory of Functions, Vol. II, Dover, New York, 1947 (translated from the German).
[Lax] Lax, Peter D., Linear Algebra, Wiley-Interscience, 1996, ISBN 0471111112.
[Re] Rellich, Franz, Perturbation Theory of Eigenvalue Problems, New York University Lecture Notes, 1953; reprinted by Gordon and Breach, 1968.