Matrices A(t) Depending on a Parameter t
If a square matrix $A(t)$ depends smoothly on a parameter $t$, are its eigenvalues and
eigenvectors also smooth functions of $t$? The answer is “yes” most of the time, but not always.
This story, while old, is interesting and elementary, and it deserves to be better known.
One can also ask the same question for objects such as the Schrödinger operator whose
potential depends on a parameter, which is where much of the current understanding arose.
Warm-up Exercise
Consider a polynomial $p(x,t) = x^n + a_{n-1}(t)x^{n-1} + \cdots + a_1(t)x + a_0(t)$ whose
coefficients depend smoothly on a parameter $t$. Assume that at $t = 0$ the number $x = c$ is a
simple root of this polynomial, so $p(c, 0) = 0$. Show that for all $t$ sufficiently near 0 there
is a unique root $x(t)$ with $x(0) = c$ that depends smoothly on $t$. Moreover, if $p(x,t)$ is a
real analytic function of $t$, that is, it has a convergent power series expansion in $t$ near
$t = 0$, then so does $x(t)$.
SOLUTION: Given that $p(c, 0) = 0$, we want to solve $p(x,t) = 0$ for $x(t)$ with $x(0) = c$.
Since $x = c$ is a simple zero of $p(x, 0)$, we can write $p(x, 0) = (x - c)g(x)$, where
$g(c) \neq 0$. Thus the derivative $p_x(c, 0) = g(c) \neq 0$, and the assertions are immediate
from the implicit function theorem.
The example $p(x,t) := x^3 - t = 0$, so $x(t) = t^{1/3}$, shows that $x(t)$ may not be a smooth
function at a multiple root. In this case the best one can get is a Puiseux expansion in
fractional powers of $t$ (see [Kn, ¶15]).
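The contrast between a simple root and a multiple root can be checked numerically. The sketch below is only an illustration under assumed choices: the polynomial $x^3 - x - t$ (simple root $c = 1$ at $t = 0$), the helper names, and the step size `h` are all hypothetical and not taken from the text. Implicit differentiation gives $x'(0) = -p_t/p_x = 1/2$ for this polynomial, whereas the difference quotient of $t^{1/3}$ at 0 grows like $h^{-2/3}$.

```python
def newton(p, dp, x0, tol=1e-14, max_iter=100):
    """Newton's method for p(x) = 0, started at x0."""
    x = x0
    for _ in range(max_iter):
        step = p(x) / dp(x)
        x -= step
        if abs(step) < tol:
            break
    return x


def simple_root(t):
    """Root of the assumed example p(x, t) = x^3 - x - t that continues c = 1."""
    return newton(lambda x: x**3 - x - t, lambda x: 3 * x**2 - 1, 1.0)


def triple_root(t):
    """Real root of x^3 - t for t >= 0 (the branch through the triple root)."""
    return t ** (1.0 / 3.0)


h = 1e-4
# Central difference at t = 0 for the simple root: close to x'(0) = 1/2.
slope_simple = (simple_root(h) - simple_root(-h)) / (2 * h)
# One-sided difference quotient at the triple root: grows like h**(-2/3).
slope_triple = (triple_root(h) - triple_root(0.0)) / h

print("simple root slope:", slope_simple)
print("triple root slope:", slope_triple)
```

Shrinking `h` leaves the first slope near 1/2 and makes the second one larger, which is the numerical signature of the missing derivative at $t = 0$.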
PROOF: ∗ Although we won’t use it, the eigenvalue part is immediate from the warm-up
exercise above applied to the characteristic polynomial. It is the eigenvector aspect that
takes a bit more work.
Given $A(0)X_0 = \lambda_0 X_0$ for some vector $X_0$ with $\|X_0\| = 1$, we want a function
$\lambda(t)$ and a vector $X(t)$ that depend smoothly on $t$ with the properties
$$A(t)X(t) = \lambda(t)X(t), \quad \langle X_0, X(t)\rangle = 1, \quad\text{and}\quad \lambda(0) = \lambda_0, \quad X(0) = X_0.$$
Here, $\langle X, Y\rangle$ is the standard inner product. Of course we could also have used a
different normalization, such as $\|X(t)\|^2 = 1$.
If $H : \mathbb{R}^N \times \mathbb{R} \to \mathbb{R}^N$, say we want to solve the equations
$H(Z,t) = 0$ for $Z = Z(t)$. These are $N$ equations for the $N$ unknowns $Z(t)$. Assume that
$Z = Z_0$ is a solution at $t = 0$, so $H(Z_0, 0) = 0$. Expanding $H$ in a Taylor series in the
variable $Z$ near $Z = Z_0$ we get
$$H(Z,t) = H(Z_0,t) + H_Z(Z_0,t)(Z - Z_0) + \cdots,$$
where $H_Z$ is the derivative matrix and $\cdots$ represents higher order terms. If these higher
order terms were missing then the solution of $H(Z,t) = 0$ would be simply
$$Z - Z_0 = -[H_Z(Z_0,t)]^{-1} H(Z_0,t),$$
that is,
$$Z = Z_0 - [H_Z(Z_0,t)]^{-1} H(Z_0,t).$$
This assumes that the first derivative matrix $H_Z(Z_0, 0)$ is invertible (since it is then
invertible for all $t$ near zero). The implicit function theorem says that $H(Z,t) = 0$ still has
a smooth solution $Z(t)$ near $Z_0$ even if there are higher order terms. The key assumption is
that the first derivative matrix $H_Z(Z_0, 0)$ is invertible. Although we may think of the
special case where $t \in \mathbb{R}$, this works without change if the parameter
$t \in \mathbb{R}^k$ is a vector.
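To see how good the linearized formula is, here is a minimal scalar sketch. The function $H(Z,t) = Z^2 - (1 + t)$ with $Z_0 = 1$ is an assumed example chosen only because its exact solution $\sqrt{1+t}$ is known; it is not from the text. The linearized solution is $1 + t/2$, and the neglected higher order terms show up as an error of order $t^2$.

```python
import math


def linearized_solution(t):
    """Z0 - H_Z(Z0, t)^{-1} H(Z0, t) for the assumed H(Z, t) = Z^2 - (1 + t)."""
    z0 = 1.0
    h_at_z0 = z0**2 - (1.0 + t)  # H(Z0, t)
    hz_at_z0 = 2.0 * z0          # H_Z(Z0, t), a 1x1 "matrix" here
    return z0 - h_at_z0 / hz_at_z0


def exact_solution(t):
    """Exact root of Z^2 - (1 + t) near Z0 = 1."""
    return math.sqrt(1.0 + t)


for t in (0.1, 0.01, 0.001):
    err = abs(linearized_solution(t) - exact_solution(t))
    print(f"t = {t:<6} error = {err:.3e}")
```

Each tenfold reduction in `t` should reduce the printed error by roughly a factor of 100, consistent with a quadratic remainder.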
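(CONTINUATION OF THE PROOF) We may assume that $\lambda_0 = 0$ (otherwise replace $A(t)$ by
$A(t) - \lambda_0 I$). Write our equations as
$$F(X, \lambda, t) := \begin{pmatrix} f(X, \lambda, t) \\ g(X, \lambda, t) \end{pmatrix} = \begin{pmatrix} A(t)X - \lambda X \\ \langle X_0, X\rangle - 1 \end{pmatrix},$$
where we have written $f(X, \lambda, t) := A(t)X - \lambda X$ and
$g(X, \lambda, t) := \langle X_0, X\rangle - 1$. We wish to solve $F(X, \lambda, t) = 0$ for both
$X(t)$ and $\lambda(t)$ near $t = 0$. In the notation of the previous paragraph,
$Z = (X, \lambda)$ and $H(Z,t) = F(X, \lambda, t)$. Thus the derivative matrix $H_Z$ involves
differentiation with respect to both $X$ and $\lambda$.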
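The derivative with respect to the variables $X$ and $\lambda$ is the partitioned matrix
$$F'(X, \lambda, t) = \begin{pmatrix} f_X & f_\lambda \\ g_X & g_\lambda \end{pmatrix} = \begin{pmatrix} A(t) - \lambda & -X \\ X_0^T & 0 \end{pmatrix}.$$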
Here we used $\langle X_0, X\rangle = X_0^T X$, where $X_0^T$ is the transpose of the column
vector $X_0$. Thus at $t = 0$
$$F'(X_0, 0, 0) = \begin{pmatrix} A(0) & -X_0 \\ X_0^T & 0 \end{pmatrix}.$$
[∗ This was worked out at the blackboard with Dennis DeTurck.]
For the implicit function theorem we check that the matrix on the right is invertible. It is
enough to show that its kernel is zero. Thus, say $F'(X_0, 0, 0)W = 0$, where
$W = \begin{pmatrix} V \\ r \end{pmatrix}$. Then $A(0)V - X_0 r = 0$ and
$\langle X_0, V\rangle = 0$. From the first equation we find
$$A(0)^2 V = rA(0)X_0 = 0.$$
By assumption, the eigenvalue $\lambda_0 = 0$ is simple. Thus the only solutions of
$A(0)^2 V = 0$ are $V = (\text{const})\, X_0$. But then $\langle X_0, V\rangle = 0$ gives $V = 0$.
Consequently $X_0 r = A(0)V = 0$, so also $r = 0$ and hence $W = 0$.
Since the derivative matrix $F'(X_0, 0, 0)$ is invertible, by the implicit function theorem the
equation $F(X, \lambda, t) = 0$ has the desired smooth solution $X = X(t)$, $\lambda = \lambda(t)$
near $t = 0$. If $F$ is real analytic in $t$ near $t = 0$ then so is this solution.
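The proof suggests a practical way to follow an eigenpair numerically: apply Newton's method to $F(X, \lambda, t) = 0$, using the partitioned derivative matrix at each step. The sketch below is an illustration under assumptions, not the method of the text: the 2×2 family `A(t)`, the starting eigenvalue 1 (rather than the normalized $\lambda_0 = 0$), and the small Gaussian elimination helper `solve` are all hypothetical choices made to keep the example self-contained.

```python
def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - s) / aug[i][i]
    return y


def A(t):
    # Assumed example family: A(0) = diag(1, 3) has the simple eigenvalue 1.
    return [[1.0 + t, 2.0 * t], [t, 3.0]]


X0 = [1.0, 0.0]  # unit eigenvector of A(0) for the eigenvalue 1


def eigenpair(t, iterations=20):
    """Newton's method on F(X, lam, t) = (A(t)X - lam X, <X0, X> - 1)."""
    x, lam = X0[:], 1.0  # start from the known solution at t = 0
    n = len(x)
    a = A(t)
    for _ in range(iterations):
        # Residual F(X, lam, t).
        f = [sum(a[i][j] * x[j] for j in range(n)) - lam * x[i] for i in range(n)]
        g = sum(X0[j] * x[j] for j in range(n)) - 1.0
        # Partitioned derivative matrix [[A(t) - lam I, -X], [X0^T, 0]].
        jac = [[a[i][j] - (lam if i == j else 0.0) for j in range(n)] + [-x[i]]
               for i in range(n)]
        jac.append(X0[:] + [0.0])
        delta = solve(jac, [-v for v in f] + [-g])
        x = [x[i] + delta[i] for i in range(n)]
        lam += delta[n]
    return x, lam


x, lam = eigenpair(0.1)
print("eigenvalue:", lam, "eigenvector:", x)
```

The normalization $\langle X_0, X\rangle = 1$ keeps the first component of `x` equal to 1 throughout, so only the second component and `lam` move as `t` changes.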
We conclude with three examples showing that if either the self-adjoint or the analyticity
assumption is dropped, the eigenvalue and/or eigenvector may not depend smoothly on $t$.
EXAMPLE 1 At $t = 0$ the matrix $A(t) = \begin{pmatrix} 0 & 1 \\ t & 0 \end{pmatrix}$ has 0 as a
double eigenvalue. Since the characteristic polynomial is $p(\lambda) := \lambda^2 - t$, the
eigenvalues $\lambda = \pm\sqrt{t}$ are not smooth functions of $t$ for $t$ near 0.
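A short numerical sketch makes the failure of smoothness visible. The helper name `eigenvalues` and the sample values of `t` are illustrative assumptions; the computation uses only the roots $\pm\sqrt{t}$ of $\lambda^2 - t$, with a complex square root so that negative `t` is handled as well.

```python
import cmath


def eigenvalues(t):
    """Eigenvalues of [[0, 1], [t, 0]], i.e. the two roots of lambda^2 - t."""
    root = cmath.sqrt(t)
    return root, -root


for t in (1e-2, 1e-4, 1e-6):
    lam_plus, _ = eigenvalues(t)
    # Difference quotient of lambda_+ between 0 and t; it equals 1/sqrt(t).
    print(f"t = {t:g}  lambda_+ = {lam_plus.real:g}  quotient = {lam_plus.real / t:g}")

# For t < 0 the eigenvalues leave the real axis altogether.
print(eigenvalues(-1e-2))
```

The difference quotient grows without bound as `t` decreases to 0, so the eigenvalue has no derivative there.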
EXAMPLE 2a This is a symmetric matrix depending smoothly (but not analytically) on $t$.
Near $t = 0$ the eigenvectors are not even continuous functions of $t$. This is from Rellich’s
nice book [Re, page 41]. Let
$$B(t) = \begin{pmatrix} a(t) & 0 \\ 0 & -a(t) \end{pmatrix} \quad\text{and}\quad R(t) = \begin{pmatrix} \cos\varphi(t) & -\sin\varphi(t) \\ \sin\varphi(t) & \cos\varphi(t) \end{pmatrix},$$
where $a(0) = 0$. For $t \neq 0$ we let $a(t) := \exp(-1/t^2)$ and $\varphi(t) := 1/t$.
The desired symmetric matrix is $A(t) = R(t)B(t)R^{-1}(t)$. It is similar to $B(t)$, but the new
basis determined by the orthogonal matrix $R(t)$ is spinning quickly near $t = 0$. We find
$$A(t) = a(t) \begin{pmatrix} \cos 2\varphi(t) & \sin 2\varphi(t) \\ \sin 2\varphi(t) & -\cos 2\varphi(t) \end{pmatrix}.$$
Its eigenvalues are $\lambda_\pm = \pm a(t)$. For $t \neq 0$ the orthonormal eigenvectors are
$$V_+(t) = \begin{pmatrix} \cos\varphi(t) \\ \sin\varphi(t) \end{pmatrix} \quad\text{and}\quad V_-(t) = \begin{pmatrix} -\sin\varphi(t) \\ \cos\varphi(t) \end{pmatrix}.$$
Since $a(t)$ goes to 0 so quickly near $t = 0$, even though $\varphi(t) = 1/t$ the matrix $A(t)$
is a $C^\infty$ function of $t$. However, the eigenvectors keep spinning and don’t even converge
as $t \to 0$.
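The spinning can be observed directly. The sketch below evaluates $V_+(t)$ along two sequences tending to 0, chosen (as an assumption for illustration) so that $\varphi = 2\pi k$ on one and $\varphi = 2\pi k + \pi/2$ on the other; the helper names are hypothetical. It also checks the eigenvector equation $A(t)V_+ = a(t)V_+$ at a sample point.

```python
import math


def a(t):
    """a(t) = exp(-1/t^2) for t != 0, and a(0) = 0."""
    return 0.0 if t == 0 else math.exp(-1.0 / t**2)


def phi(t):
    """phi(t) = 1/t; only used for t != 0."""
    return 1.0 / t


def A(t):
    """The symmetric matrix a(t) * [[cos 2phi, sin 2phi], [sin 2phi, -cos 2phi]]."""
    if t == 0:
        return [[0.0, 0.0], [0.0, 0.0]]
    c, s = math.cos(2 * phi(t)), math.sin(2 * phi(t))
    return [[a(t) * c, a(t) * s], [a(t) * s, -a(t) * c]]


def v_plus(t):
    """Unit eigenvector for the eigenvalue +a(t), valid for t != 0."""
    return [math.cos(phi(t)), math.sin(phi(t))]


def residual(t):
    """Largest entry of A(t) V+ - a(t) V+; should be at rounding level."""
    m, v = A(t), v_plus(t)
    return max(abs(m[i][0] * v[0] + m[i][1] * v[1] - a(t) * v[i]) for i in range(2))


print("residual at t = 0.5:", residual(0.5))

# Two sequences t -> 0 along which V+ takes two different constant values.
for k in (1, 10, 100):
    t_even = 1.0 / (2 * math.pi * k)               # phi = 2*pi*k, V+ near (1, 0)
    t_odd = 1.0 / (2 * math.pi * k + math.pi / 2)  # phi = 2*pi*k + pi/2, V+ near (0, 1)
    print(k, v_plus(t_even), v_plus(t_odd))
```

Because $V_+$ stays near $(1, 0)$ on one sequence and near $(0, 1)$ on the other, it has no limit as $t \to 0$, even though every entry of $A(t)$ tends to 0 together with all its derivatives.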
[AKML] Alekseevsky, D., Kriegl, A., Michor, P., Losik, M., “Choosing Roots of Polynomials
Smoothly, II”, Israel Journal of Mathematics 139 (2004), 183–188.
[Ka] Kato, Tosio, Perturbation Theory for Linear Operators, Grundlehren 132, 1976,
Springer-Verlag, Berlin, New York.
[Kn] Knopp, K., Theory of Functions, Vol. II, 1947, Dover, New York (translated from the
[Lax] Lax, Peter D., Linear Algebra, Wiley-Interscience, 1996, ISBN: 0471111112.
[Re] Rellich, Franz, Perturbation Theory of Eigenvalue Problems, 1953 New York University
Lecture Notes, reprinted by Gordon and Breach, 1968.