Matrices A(t) Depending on a Parameter t
If a square matrix A(t) depends smoothly on a parameter t, are its eigenvalues and eigenvectors also smooth functions of t? The answer is "yes" most of the time, but not always. This story, while old, is interesting and elementary, and deserves to be better known. One can ask the same question for operators such as the Schrödinger operator whose potential depends on a parameter; it is in that setting that much of the current understanding arose.
Warm-up Exercise
Given a polynomial

p(x,t) = x^n + a_{n−1}(t) x^{n−1} + · · · + a_1(t) x + a_0(t)

whose coefficients depend smoothly on a parameter t, assume that at t = 0 the number x = c is a simple root of this polynomial, p(c, 0) = 0. Show that for all t sufficiently near 0 there is a unique root x(t) with x(0) = c that depends smoothly on t. Moreover, if p(x,t) is a real analytic function of t, that is, if it has a convergent power series expansion in t near t = 0, then so does x(t).
SOLUTION: Given that p(c, 0) = 0, we want to solve p(x,t) = 0 for x(t) with x(0) = c. The assertions are immediate from the implicit function theorem. Since x = c is a simple zero of p(x, 0), we may write p(x, 0) = (x − c)g(x) with g(c) ≠ 0. Thus the derivative p_x(c, 0) = g(c) ≠ 0.
The example p(x,t) := x^3 − t = 0, whose root is x(t) = t^{1/3}, shows that x(t) may not be a smooth function of t at a multiple root. In this case the best one can get is a Puiseux expansion in fractional powers of t (see [Kn, ¶15]).
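To see the warm-up concretely, here is a minimal numerical sketch (my illustration, not from the text), assuming the polynomial p(x,t) = x^3 − x + t, which at t = 0 has the simple root c = 1. Newton's method in x tracks that root smoothly, while the root of x^3 − t = 0 shows the Puiseux (cube-root) behavior at a triple root.

```python
import numpy as np

def track_root(p, px, x, t, iters=25):
    """Newton's method in x with t held fixed; px is the x-derivative of p."""
    for _ in range(iters):
        x -= p(x, t) / px(x, t)
    return x

# Assumed example: p(x,t) = x^3 - x + t has the simple root x = 1 at t = 0,
# and p_x(1, 0) = 2 != 0, so the implicit function theorem applies.
p  = lambda x, t: x**3 - x + t
px = lambda x, t: 3.0 * x**2 - 1.0

for t in [-0.05, 0.0, 0.05]:
    # Near t = 0 the root is x(t) = 1 - t/2 + O(t^2), since x'(0) = -p_t/p_x = -1/2.
    print(f"t={t:+.2f}  root={track_root(p, px, 1.0, t):.6f}")

# By contrast, for p(x,t) = x^3 - t the root x(t) = t^(1/3) is continuous but
# has infinite slope at t = 0: no smooth dependence at a multiple root.
for t in [1e-6, 1e-3, 1.0]:
    print(f"t={t:.0e}  cube root={np.cbrt(t):.4f}")
```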
PROOF:∗ Although we won't use it, the eigenvalue part is immediate from the warm-up exercise above applied to the characteristic polynomial. It is the eigenvector aspect that takes a bit more work.
Given A(0)X₀ = λ₀X₀ for some vector X₀ with ‖X₀‖ = 1, we want a function λ(t) and a vector X(t) that depend smoothly on t with the properties

A(t)X(t) = λ(t)X(t),   ⟨X₀, X(t)⟩ = 1,   and   λ(0) = λ₀,   X(0) = X₀.

Here ⟨X, Y⟩ is the standard inner product. Of course we could also have used a different normalization, such as ‖X(t)‖² = 1.
SOME BACKGROUND ON THE IMPLICIT FUNCTION THEOREM.
Suppose H : ℝᴺ × ℝ → ℝᴺ and we want to solve the equations H(Z,t) = 0 for Z = Z(t). These are N equations for the N unknowns Z(t). Assume that Z = Z₀ is a solution at t = 0, so H(Z₀, 0) = 0. Expanding H in a Taylor series in the variable Z near Z = Z₀ we get

H(Z,t) = H(Z₀,t) + H_Z(Z₀,t)(Z − Z₀) + · · · ,

where H_Z is the derivative matrix and · · · represents the higher order terms. If these higher order terms were missing, then the solution of H(Z,t) = 0 would be simply

Z − Z₀ = −[H_Z(Z₀,t)]⁻¹ H(Z₀,t),

that is,

Z = Z₀ − [H_Z(Z₀,t)]⁻¹ H(Z₀,t).

This assumes that the first derivative matrix H_Z(Z₀, 0) is invertible (it is then invertible for all t near zero). The implicit function theorem says that this is still true even if the higher order terms are present; the key assumption is that the first derivative matrix H_Z(Z₀, 0) is invertible. Although we may think of the special case where t ∈ ℝ, this works without change if the parameter t ∈ ℝᵏ is a vector.
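To illustrate (a minimal sketch of the idea, with an assumed toy system, not anything from the text): the formula Z = Z₀ − [H_Z]⁻¹H is exactly one Newton step, and repeating it while stepping t follows the solution branch Z(t).

```python
import numpy as np

def solve_branch(H, HZ, Z0, ts, newton_steps=10):
    """Follow the solution branch Z(t) of H(Z, t) = 0, starting from a known
    solution Z0 at t = ts[0].  Each update Z <- Z - HZ(Z,t)^{-1} H(Z,t) is the
    linearized formula above; invertibility of HZ is the key hypothesis of
    the implicit function theorem."""
    Z = np.array(Z0, dtype=float)
    for t in ts:
        for _ in range(newton_steps):
            Z = Z - np.linalg.solve(HZ(Z, t), H(Z, t))
        yield t, Z.copy()

# Toy system (an assumed example):
#   H(Z, t) = (z1^2 + z2 - 1 - t,  z1 - z2).
H  = lambda Z, t: np.array([Z[0]**2 + Z[1] - 1.0 - t, Z[0] - Z[1]])
HZ = lambda Z, t: np.array([[2.0 * Z[0], 1.0], [1.0, -1.0]])

phi = (np.sqrt(5.0) - 1.0) / 2.0   # H((phi, phi), 0) = 0 since phi^2 + phi = 1
for t, Z in solve_branch(H, HZ, [phi, phi], np.linspace(0.0, 0.5, 6)):
    print(f"t={t:.1f}  Z(t)={Z}")
```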
(CONTINUATION OF THE PROOF) We may assume that λ₀ = 0. Write our equations as

F(X, λ, t) := \begin{pmatrix} f(X, λ, t) \\ g(X, λ, t) \end{pmatrix} := \begin{pmatrix} A(t)X − λX \\ ⟨X₀, X⟩ − 1 \end{pmatrix},

where we have written f(X, λ, t) := A(t)X − λX and g(X, λ, t) := ⟨X₀, X⟩ − 1. We wish to solve F(X, λ, t) = 0 for both X(t) and λ(t) near t = 0. In the notation of the previous paragraph, Z = (X, λ) and H(Z,t) = F(X, λ, t). Thus the derivative matrix H_Z involves differentiation with respect to both X and λ.
The derivative with respect to the variables X and λ is the partitioned matrix

F′(X, λ, t) = \begin{pmatrix} f_X & f_λ \\ g_X & g_λ \end{pmatrix} = \begin{pmatrix} A(t) − λI & −X \\ X₀ᵀ & 0 \end{pmatrix}.

Here we used ⟨X₀, X⟩ = X₀ᵀX, where X₀ᵀ is the transpose of the column vector X₀. Thus at t = 0

F′(X₀, 0, 0) = \begin{pmatrix} A(0) & −X₀ \\ X₀ᵀ & 0 \end{pmatrix}.
∗ This was worked out at the blackboard with Dennis DeTurck.
For the implicit function theorem we check that the matrix on the right is invertible. It is enough to show that its kernel is zero. Thus, say F′(X₀, 0, 0)W = 0, where W = \begin{pmatrix} V \\ r \end{pmatrix}. Then

A(0)V − rX₀ = 0   and   ⟨X₀, V⟩ = 0.

Applying A(0) to the first equation and using A(0)X₀ = λ₀X₀ = 0, we find

A(0)²V = rA(0)X₀ = 0.

By assumption, the eigenvalue λ₀ = 0 is simple, so its generalized eigenspace is one-dimensional; thus the only solutions of A(0)²V = 0 are V = (const)X₀. But then ⟨X₀, V⟩ = 0 gives V = 0. The first equation then gives rX₀ = 0, so also r = 0, and hence W = 0.
Since the derivative matrix F′(X₀, 0, 0) is invertible, by the implicit function theorem the equation F(X, λ, t) = 0 has the desired smooth solution X = X(t), λ = λ(t) near t = 0. If F is real analytic in t near t = 0, then so is this solution.
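The proof is effectively an algorithm. Here is a hedged numerical sketch (my own illustration, with an assumed family A(t)) that tracks the eigenpair by applying Newton's method to F(X, λ, t) = 0, using the bordered Jacobian computed above.

```python
import numpy as np

def eig_branch(A, X0, lam, ts, newton_steps=12):
    """Track (X(t), λ(t)) with the normalization <X0, X(t)> = 1 by applying
    Newton's method to F(X, λ, t) = (A(t)X - λX, <X0, X> - 1)."""
    n = len(X0)
    X = np.array(X0, dtype=float)
    for t in ts:
        for _ in range(newton_steps):
            F = np.append(A(t) @ X - lam * X, X0 @ X - 1.0)
            J = np.zeros((n + 1, n + 1))        # bordered Jacobian from the proof:
            J[:n, :n] = A(t) - lam * np.eye(n)  # [[A(t) - λI, -X], [X0^T, 0]]
            J[:n, n]  = -X
            J[n, :n]  = X0
            step = np.linalg.solve(J, F)
            X, lam = X - step[:n], lam - step[n]
        yield t, lam, X.copy()

# Assumed example: A(t) = [[0, 1], [t, 0]] near t = 1, where λ0 = 1 is a
# simple eigenvalue of A(1) with unit eigenvector X0 = (1, 1)/sqrt(2).
A  = lambda t: np.array([[0.0, 1.0], [t, 0.0]])
X0 = np.array([1.0, 1.0]) / np.sqrt(2.0)

for t, lam, X in eig_branch(A, X0, 1.0, np.linspace(1.0, 1.5, 6)):
    print(f"t={t:.1f}  λ(t)={lam:.6f}  exact sqrt(t)={np.sqrt(t):.6f}")
```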
We conclude with examples showing that if either the self-adjointness or the analyticity assumption is dropped, the eigenvalues and/or eigenvectors may not depend smoothly on t.
EXAMPLE 1. At t = 0 the matrix

A(t) = \begin{pmatrix} 0 & 1 \\ t & 0 \end{pmatrix}

has 0 as a double eigenvalue. Since the characteristic polynomial is p(λ) = λ² − t, the eigenvalues ±√t are not smooth functions of t for t near 0.
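A quick numerical check (a sketch using numpy; eigenvalue ordering is not guaranteed) confirms the ±√t behavior, including complex eigenvalues for t < 0:

```python
import numpy as np

# Eigenvalues of A(t) = [[0, 1], [t, 0]] are ±sqrt(t): continuous at t = 0
# but with infinite slope there, and complex for t < 0.
for t in [1e-8, 1e-4, 1e-2, -1e-2]:
    ev = np.linalg.eigvals(np.array([[0.0, 1.0], [t, 0.0]]))
    print(f"t={t:+.0e}  eigenvalues={np.sort_complex(ev)}")
```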
EXAMPLE 2a. This is a symmetric matrix depending smoothly (but not analytically) on t. Near t = 0 the eigenvectors are not even continuous functions of t. This is from Rellich's nice book [Re, page 41]. Let

B(t) = \begin{pmatrix} a(t) & 0 \\ 0 & −a(t) \end{pmatrix}   and   R(t) = \begin{pmatrix} cos ϕ(t) & −sin ϕ(t) \\ sin ϕ(t) & cos ϕ(t) \end{pmatrix},

where a(0) = 0, and for t ≠ 0 we let a(t) := exp(−1/t²) and ϕ(t) := 1/t.
The desired symmetric matrix is A(t) = R(t)B(t)R⁻¹(t). It is similar to B(t), but the new basis determined by the orthogonal matrix R(t) is spinning quickly near t = 0. We find

A(t) = a(t) \begin{pmatrix} cos 2ϕ(t) & sin 2ϕ(t) \\ sin 2ϕ(t) & −cos 2ϕ(t) \end{pmatrix}.
Its eigenvalues are λ± = ±a(t). For t ≠ 0 the orthonormal eigenvectors are

V₊(t) = \begin{pmatrix} cos ϕ(t) \\ sin ϕ(t) \end{pmatrix}   and   V₋(t) = \begin{pmatrix} −sin ϕ(t) \\ cos ϕ(t) \end{pmatrix}.
Since a(t) goes to 0 so quickly near t = 0, the matrix A(t) is a C∞ function of t even though ϕ(t) = 1/t. However, the eigenvectors keep spinning and do not even converge as t → 0.
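One can watch this numerically (a sketch using numpy; `eigh` fixes an eigenvector only up to sign, which is all that matters here): the entries of A(t) shrink like exp(−1/t²), yet the eigenvector direction keeps rotating as t → 0.

```python
import numpy as np

def A(t):
    """Rellich's matrix: a(t) = exp(-1/t^2), phi(t) = 1/t, A = R B R^{-1}."""
    a, phi = np.exp(-1.0 / t**2), 1.0 / t
    return a * np.array([[np.cos(2 * phi),  np.sin(2 * phi)],
                         [np.sin(2 * phi), -np.cos(2 * phi)]])

for t in [0.5, 0.25, 0.125, 0.0625]:
    w, V = np.linalg.eigh(A(t))        # columns of V are orthonormal eigenvectors
    v = V[:, 1]                        # eigenvector for the eigenvalue +a(t)
    angle = np.degrees(np.arctan2(v[1], v[0])) % 180.0   # direction mod 180 degrees
    print(f"t={t:.4f}  max|A|={np.abs(A(t)).max():.2e}  eigenvector angle={angle:6.1f}")
```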
References
[AKML] Alekseevsky, D., Kriegl, A., Michor, P., and Losik, M., "Choosing Roots of Polynomials Smoothly, II," Israel Journal of Mathematics 139 (2004), 183–188.
[Ka] Kato, Tosio, Perturbation Theory for Linear Operators, Grundlehren 132, Springer-Verlag, Berlin and New York, 1976.
[Kn] Knopp, K., Theory of Functions, Vol. II, Dover, New York, 1947 (translated from the German).
[Lax] Lax, Peter D., Linear Algebra, Wiley-Interscience, 1996, ISBN 0471111112.
[Re] Rellich, Franz, Perturbation Theory of Eigenvalue Problems, New York University Lecture Notes, 1953; reprinted by Gordon and Breach, 1968.