The Numerical Range of 3 X 3 Matrices: Dennis S. Keeler
Dennis S. Keeler*
Department of Mathematics
Miami University
Oxford, Ohio 45056
and
ABSTRACT
1. INTRODUCTION
* Supported by a Research Experience for Undergraduates grant from the National Science
Foundation, Grant number 9300395.
’ Partially supported by the NSF Grant DMS 9123841.
’ Partially supported by the NSF Grant DMS 9401848.
eigenvalues of A and therefore its convex hull $\operatorname{conv}(\sigma(A))$. For A normal
(that is, commuting with A*), $W(A) = \operatorname{conv}(\sigma(A))$; the converse statement
holds when n < 4.
For 2 × 2 matrices A a complete description of the numerical range
W(A) is well known. Namely, W(A) is an ellipse with foci at the eigenvalues
$\lambda_1$, $\lambda_2$ of A and a minor axis of the length
$$s = \left(\operatorname{trace}(A^*A) - |\lambda_1|^2 - |\lambda_2|^2\right)^{1/2}.$$
Of course, s = 0 for normal A, and the ellipse in this case degenerates into a
line segment connecting $\lambda_1$ with $\lambda_2$. On the other hand, for 2 × 2 matrices
A with coinciding eigenvalues the ellipse W(A) degenerates into a disk.
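The 2 × 2 description above can be checked numerically. The following is a minimal sketch, assuming the standard elliptical range formula $s^2 = \operatorname{trace}(A^*A) - |\lambda_1|^2 - |\lambda_2|^2$ for the minor axis and $\sqrt{|\lambda_1 - \lambda_2|^2 + s^2}$ for the major axis; the function name `ellipse_data` and the two sample matrices are illustrative choices, not taken from the paper.

```python
import cmath
import math

def ellipse_data(a, b, c, d):
    """Foci and axis lengths of W(A) for the 2 x 2 matrix A = [[a, b], [c, d]]."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    lam1, lam2 = (tr + root) / 2, (tr - root) / 2   # eigenvalues = foci
    # trace(A*A) is the sum of the squared moduli of the entries
    frob = sum(abs(z) ** 2 for z in (a, b, c, d))
    # clamp tiny negative rounding errors before taking the square root
    s = math.sqrt(max(frob - abs(lam1) ** 2 - abs(lam2) ** 2, 0.0))
    major = math.sqrt(abs(lam1 - lam2) ** 2 + s ** 2)
    return lam1, lam2, s, major

# Coinciding eigenvalues: the ellipse is a disk (both axes equal).
jordan = ellipse_data(0, 2, 0, 0)
# Normal matrix diag(1, i): s vanishes and W(A) is the segment from 1 to i.
normal = ellipse_data(1, 0, 0, 1j)
```

For the first matrix both axes have length 2, so W(A) is the disk of radius 1 centered at 0; for the second the minor axis vanishes up to rounding.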
For general n, the following procedure is useful: Write A = H + iK with
H, K Hermitian, and let
$$L_A(u, v, w) = \det(uH + vK + wI).$$
Case 1. $L_A$ factors into three linear factors. Then C(A) consists of three
(not necessarily distinct) points, A is normal (and therefore reducible¹), and
W(A) is the convex hull of its eigenvalues.
¹ We say that a matrix A is reducible if there exists a unitary matrix U such that
$U^*AU = \operatorname{diag}[A_1, A_2]$, where both diagonal blocks are of nonzero size. For reducible A,
$W(A) = \operatorname{conv}(W(A_1), W(A_2))$.
NUMERICAL RANGE 117
and an ellipse E. The numerical range W(A) is either an ellipse (if $\lambda_0$ lies
inside E) or a "cone-like" figure otherwise; in the latter case A is reducible
(but not normal).
In the next two cases the polynomial $L_A$ (and therefore the matrix A) is
irreducible.
Case 3. The degree of C(A) (that is, the degree of its point equation)
equals 4. Then C(A) has a “double tangent,” and the boundary of W(A)
contains one flat portion but no angular points.
Case 4. The degree of C(A) equals 6. Then C(A) consists of two parts,
one inside another; an outer part (and therefore W(A)) has an “ovular”
shape.
2. W(A) IS AN ELLIPSE
The generalization to the case n > 3 and its proof were suggested to us by
the referee.
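$$= \sum_{1 \le i < j \le n} \left[ (u\,\Re a_{ii} + v\,\Im a_{ii})(u\,\Re a_{jj} + v\,\Im a_{jj}) - \tfrac{1}{4}(u^2 + v^2)|a_{ij}|^2 \right]$$
$$= \sum_{1 \le i < j \le n} \left[ (u\,\Re \lambda_i + v\,\Im \lambda_i)(u\,\Re \lambda_j + v\,\Im \lambda_j) - \tfrac{1}{4}(u^2 + v^2)|a_{ij}|^2 \right].$$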
² Here and in what follows, we denote by $\Re z$ and $\Im z$ the real and imaginary parts,
respectively, of a complex number z.
(2.4)
Due to (2.2),
Hence, conditions (2.5), (2.6) are necessary for matrices (2.4) and
$$B = \begin{bmatrix} \lambda_1 & s & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}$$
If these conditions are satisfied, then C(A) is the union of $\lambda$ with the ellipse
having its foci at two other eigenvalues of A and minor axis of length $s = \sqrt{d}$.
If these conditions are satisfied, then C(A) is the union of $\lambda$ with the
ellipse having its foci at two other eigenvalues of A and minor axis of length
$s = \sqrt{d}$.
3. $(|\lambda_1 - \lambda_3| + |\lambda_2 - \lambda_3|)^2 - |\lambda_1 - \lambda_2|^2 < d$, where the eigenvalue
coinciding with $\lambda$ is labeled $\lambda_3$.
1. A has a multiple eigenvalue $\mu$ (so that its eigenvalues equal $\mu$, $\mu$, and
$\lambda$),
2. $2\mu\,\operatorname{trace}(A^*A) = \operatorname{trace}(A^*A^2) + 2|\mu|^2\mu + (2\mu - \lambda)|\lambda|^2$, and
3. $4|\mu - \lambda|^2 + 2|\mu|^2 + |\lambda|^2 < \operatorname{trace}(A^*A)$.
For A in a triangular form (2.4), conditions 2 and 3 may be substituted by
If these conditions are satisfied, then W(A) is centered at $\mu$ and has radius
$\frac{1}{2}\sqrt{\operatorname{trace}(A^*A) - 2|\mu|^2 - |\lambda|^2}$ ($= \frac{1}{2}\sqrt{|x|^2 + |y|^2 + |z|^2}$ in the case (2.4)).
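The trace conditions above are easy to evaluate by hand or by machine. The sketch below assumes conditions 2 and 3 read $2\mu\,\operatorname{trace}(A^*A) = \operatorname{trace}(A^*A^2) + 2|\mu|^2\mu + (2\mu - \lambda)|\lambda|^2$ and $4|\mu - \lambda|^2 + 2|\mu|^2 + |\lambda|^2 < \operatorname{trace}(A^*A)$, and that the eigenvalues $\mu$, $\mu$, $\lambda$ are supplied by the caller (condition 1 is not verified). The helper names and the two test matrices are hypothetical.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_adj_times(A, B):
    """trace(A* B), computed as the sum over i, j of conj(a_ij) * b_ij."""
    n = len(A)
    return sum(complex(A[i][j]).conjugate() * B[i][j]
               for i in range(n) for j in range(n))

def disk_test(A, mu, lam):
    """Evaluate conditions 2 and 3 and the candidate radius (eigenvalues mu, mu, lam assumed)."""
    t1 = trace_adj_times(A, A).real
    t2 = trace_adj_times(A, matmul(A, A))
    rhs = t2 + 2 * abs(mu) ** 2 * mu + (2 * mu - lam) * abs(lam) ** 2
    cond2 = abs(2 * mu * t1 - rhs) < 1e-12
    cond3 = 4 * abs(mu - lam) ** 2 + 2 * abs(mu) ** 2 + abs(lam) ** 2 < t1
    radius = 0.5 * math.sqrt(max(t1 - 2 * abs(mu) ** 2 - abs(lam) ** 2, 0.0))
    return cond2, cond3, radius

# Reducible example: a 2 x 2 Jordan-type block plus the eigenvalue 0.5 inside the disk.
A1 = [[0, 2, 0], [0, 0, 0], [0, 0, 0.5]]
# Nilpotent example with x = y = z = 1: condition 2 fails, so W(A) is not a disk.
A2 = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
result1 = disk_test(A1, 0, 0.5)
result2 = disk_test(A2, 0, 0)
```

For `A1` both conditions hold and the radius evaluates to 1; for `A2` the second condition fails because $\operatorname{trace}(A^*A^2) = 1 \neq 0$.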
For triangular matrices, this corollary was first proved directly by Chien
and Tam, although in a very different manner [7]. Necessary and sufficient
conditions for W(A) to be the unit disk centered at 0 for a 3 X 3 matrix A
were obtained earlier by N. K. Tsing (unpublished) and stated in [7]; these
conditions appear as a particular case of Corollary 2.5, when we specialize
$a = b = \mu = 0$, $|x|^2 + |y|^2 + |z|^2 = 4$.
122 DENNIS S. KEELER ET AL.
where $c_1$, $c_2$, $\Re(\zeta)$ are positive, if and only if W(A) has a flat portion on its
boundary. In this form, W(A) has a flat portion extending from 0 to i and is
contained in the closed right half-plane.
Proof. Let W(A) have a flat portion on its boundary. After rotation,
shifting, and scaling (by scaling we mean multiplication by a positive number),
we may assume that a flat portion stretches from 0 to i. Since W(A) is
convex, it must be contained entirely in the right or the left half-plane.
Applying yet another rotation and translation, if necessary, we may assume
that W(A) is in the right half-plane.
Since 0 and i are in W(A), there exist $x_1, x_2 \in \mathbb{C}^3$, $x_1^*x_1 = x_2^*x_2 = 1$,
such that $x_1^*Ax_1 = 0$, $x_2^*Ax_2 = i$. Let $\mathcal{L} = \operatorname{Span}\{x_1, x_2\}$. Since $\mathcal{L}$ is a
2-dimensional subspace, we may represent the linear transformation of A
restricted to $\mathcal{L}$, $A|_{\mathcal{L}} = A'$, by a 2 × 2 matrix. By choosing a proper orthonormal
basis, we may take A' to be the leading principal submatrix of A.
Now W(A') is an ellipse (possibly degenerate), as are the numerical ranges of all 2 × 2
matrices. Since W(A') is convex, $[0, i] \subset W(A')$. Also, $W(A') \subseteq W(A)$. Since
W(A) does not extend into the left half-plane, the only possible ellipse
W(A') can be is the degenerate ellipse [0, i]. This implies that A' is normal
with eigenvalues 0 and i. So with proper basis
1
0 v1
r
va.
1
o
1
-
Vl + Cl
-
v2 + c2 >
2%(l)
Under these conditions, the flat portion of the boundary lies on the line
$ux + vy + w = 0$.
Proof. If A is a real matrix, then W(A) is symmetric about the real axis.
So, the (unique) flat portion of the boundary of W(A) must lie on a vertical line.
According to Proposition 3.2, this happens if and only if the matrix $1 \cdot H + 0 \cdot K = H$
has a multiple eigenvalue. ∎
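The criterion used in this proof (a repeated eigenvalue of H) can be tested exactly for real integer matrices by checking whether the discriminant of the characteristic polynomial of $2H = A + A^T$ vanishes. This is only a sketch: the hypotheses of the result being proved are not restated here, and the function names and sample matrices are illustrative assumptions.

```python
def char_poly_sym3(M):
    """Coefficients (c2, c1, c0) of det(tI - M) = t^3 - c2 t^2 + c1 t - c0."""
    c2 = M[0][0] + M[1][1] + M[2][2]
    c1 = (M[0][0] * M[1][1] - M[0][1] * M[1][0]
          + M[0][0] * M[2][2] - M[0][2] * M[2][0]
          + M[1][1] * M[2][2] - M[1][2] * M[2][1])
    c0 = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    return c2, c1, c0

def has_multiple_eigenvalue(A):
    """True when 2H = A + A^T has a repeated eigenvalue (A real; integer entries keep this exact)."""
    M = [[A[i][j] + A[j][i] for j in range(3)] for i in range(3)]
    c2, c1, c0 = char_poly_sym3(M)
    a, b, c, d = 1, -c2, c1, -c0
    # discriminant of the cubic a t^3 + b t^2 + c t + d
    disc = (18 * a * b * c * d - 4 * b ** 3 * d + b * b * c * c
            - 4 * a * c ** 3 - 27 * a * a * d * d)
    return disc == 0

flat = has_multiple_eigenvalue([[0, 1, 1], [0, 0, 1], [0, 0, 0]])
no_flat = has_multiple_eigenvalue([[0, 1, 0], [0, 0, 2], [0, 0, 0]])
```

In the first example $2H$ has characteristic polynomial $t^3 - 3t - 2 = (t - 2)(t + 1)^2$, so the eigenvalue $-1/2$ of H is repeated; in the second the eigenvalues of $2H$ are $0, \pm\sqrt{5}$.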
Then W(A) has a flat portion on its boundary if and only if there exist real
u, v not both zero such that
and
$\det(uH + vK - \lambda I)$
Setting
M2 + ly12+ Id2
=“JM. (3.8)
3
the equality (3.8) holds exactly when both inequalities in (3.9) are actually
equalities, that is, when $|x| = |y| = |z|$ (for the first inequality in (3.9)) and
$x\bar{y}z \in \mathbb{R}$ (for the second). The two conditions obtained are exactly the same
as (3.3) and (3.4), respectively. ∎
= uH + uK - $trace(uH + uK))Z.
(3.11)
and
1
h; + vk; + w’ h 12 h 13
G j-G W’
-
w’/ h,, = &s/h12 = h,,/( h’, + vk; + w’). (3.14)
h,,h,,
W’ = -
h 12 ’
For v, w’ defined by (3.15) the equalities (3.14) yield, respectively, (3.12) and
It is easily checked that under the restriction (3.12) the latter is equivalent to
(3.11).
~7% = hnh23&hl;)
-- -
42 h3h,, . hu,/h,,
h, 4, h,, h,, = h,,/h,,.
= h,h,,h,,h,, = h,,h,,h,,
Therefore, the first of equalities (3.14) also holds.
Finally, (3.11) and (3.12) imply (3.16), which, in turn, leads to the second
of the equalities (3.14). Due to (3.13), (3.14), I? is a (nonzero) matrix with
collinear columns and therefore has rank 1. ∎
where
k, - k,
___ k3- k,
~ k, - k2
~
p= lh,,12 + lh1J2 + lhJ2 ’
Since h,(k, - k,) + h,(k, - k,) + h,(k, - k,) and p are both real, it
means that condition (3.12) follows from (3.11) if p is nonzero.
The above corollary also works with H and not K diagonal. To see this,
multiply A by i. This makes H’ = - K and K’ = H. Clearly W(iA) has a flat
portion on its boundary if and only if W(A) has a flat portion.
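The effect of multiplying by i on the Hermitian parts can be confirmed with a few lines of arithmetic. The sketch below uses an arbitrary sample matrix (not from the paper) and the usual definitions $H = (A + A^*)/2$, $K = (A - A^*)/(2i)$; the helper names are hypothetical.

```python
def hermitian_parts(A):
    """Return (H, K) with A = H + iK, H = (A + A*)/2, K = (A - A*)/(2i)."""
    n = len(A)
    H = [[(A[i][j] + complex(A[j][i]).conjugate()) / 2 for j in range(n)]
         for i in range(n)]
    K = [[(A[i][j] - complex(A[j][i]).conjugate()) / 2j for j in range(n)]
         for i in range(n)]
    return H, K

def max_entry_gap(X, Y):
    """Largest entrywise distance between two matrices of the same size."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

A = [[1 + 2j, 3, 0], [1j, 0, 2], [0, 1, -1j]]
iA = [[1j * a for a in row] for row in A]

H, K = hermitian_parts(A)
H_rot, K_rot = hermitian_parts(iA)          # Hermitian parts of iA
minus_K = [[-k for k in row] for row in K]
```

Here `H_rot` agrees with `minus_K` and `K_rot` agrees with `H`, which is the relation H' = -K, K' = H.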
In this section we apply our results to the special case of matrices with a
triple eigenvalue. In their triangular form (2.4), of course, all diagonal
elements coincide:
(4.1)
(1) W(A) is a disk if and only if $xyz = 0$; in this case the disk has radius $\frac{1}{2}\sqrt{|x|^2 + |y|^2 + |z|^2}$.
A version of Part 1 of this theorem for the nilpotent case was first shown by
Marcus and Pesce, who also developed a unitarily invariant form of this
condition [9].
Proof. Part 1 follows easily from Corollary 2.5. In the rest of the proof
we may therefore suppose that xyz # 0, so that A is irreducible.
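Part 2: To simplify further calculations, consider the matrix 2A instead of
A:
$$2A = \begin{bmatrix} 2\Re(\mu) & x & y \\ \bar{x} & 2\Re(\mu) & z \\ \bar{y} & \bar{z} & 2\Re(\mu) \end{bmatrix} + i \begin{bmatrix} 2\Im(\mu) & -ix & -iy \\ i\bar{x} & 2\Im(\mu) & -iz \\ i\bar{y} & i\bar{z} & 2\Im(\mu) \end{bmatrix}.$$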
By Corollary 3.4, W(A) has a flat portion if and only if there exist real u, v
not both zero such that
We can easily choose u, v so that this is true. And so the only condition we
have is $|x| = |y| = |z|$.
We now prove that under this condition C(A) is a cardioid. Using unitary
transformations $A \mapsto U^*AU$ with $U = \operatorname{diag}[e^{i\nu_1}, e^{i\nu_2}, e^{i\nu_3}]$ (which do not
change C(A)) and multiplying A by scalars (which rotate and dilate C(A)),
we may reduce the general case to x = y = z = 1. Shifting then A by $\lambda I$
(which shifts C(A) by $\lambda$), we may suppose also that its eigenvalue is 1/3 (such a
choice of the eigenvalue ensures that the cusp of the cardioid would be at the
origin). In other words, without loss of generality
Using Fiedler’s formula (see [6]) for the point equation of C(A) and
transforming to polar coordinates we find
0
A computer image of W(A), where
$$A = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},$$
was given in [9]. By Theorem 4.1, W(A) is the convex hull of a cardioid.
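For this nilpotent example the boundary can be described explicitly, which gives a quick numerical sanity check. The sketch below rests on a short side computation that is not quoted from the paper: for $M(\theta) = 2\,\Re(e^{-i\theta}A)$ one gets $\det(tI - M) = t^3 - 3t - 2\cos\theta$, so $t = 2\cos(\theta/3)$ is a root, and for $|\theta| \le \pi$ the support function of W(A) is $h(\theta) = \cos(\theta/3)$. Boundary points are then $e^{i\theta}(h + ih')$. All function names are hypothetical.

```python
import cmath
import math

N = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def char_value(theta, t):
    """det(tI - M) for M = e^{-i theta} N + e^{i theta} N^T = 2 Re(e^{-i theta} N)."""
    a = cmath.exp(-1j * theta)
    M = [[a * N[i][j] + a.conjugate() * N[j][i] for j in range(3)]
         for i in range(3)]
    return det3([[(t if i == j else 0) - M[i][j] for j in range(3)]
                 for i in range(3)])

def boundary_point(theta):
    """Point of the boundary of W(N) with outward normal direction theta, |theta| <= pi."""
    h = math.cos(theta / 3)            # assumed support function
    dh = -math.sin(theta / 3) / 3      # its derivative
    return cmath.exp(1j * theta) * complex(h, dh)

thetas = [-math.pi + k * math.pi / 10 for k in range(21)]
residuals = [abs(char_value(th, 2 * math.cos(th / 3))) for th in thetas]
```

The residuals vanish up to rounding, and `boundary_point(math.pi)` and `boundary_point(-math.pi)` are the endpoints $-1/2 \pm i\sqrt{3}/6$ of the flat portion on the line $\Re z = -1/2$, consistent with the repeated eigenvalue $-1/2$ of H.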
As in Sections 2 and 3, the unitarily invariant version exists:
2. W(A) has aflat portion on its boundary if and only if w = g3m >
0.
THEOREM 5.1. Let W(A) be a 2-dimensional shape with only one flat
portion on its boundary. Then A is an irreducible matrix, which can be
restored up to unitary similarity.
Proof. Having only one flat portion on the boundary of its numerical
range, A belongs to Case 3 of Kippenhahn's classification and is therefore
irreducible. After scaling, rotation, and shifting of W(A), we may assume that the flat
portion is the line segment [0, i] and that W(A) lies entirely in the right
half-plane. We restore A in this case. After the restoration, one can obtain
the original A by reversing the scaling, rotation, and shifting.
According to Theorem 3.1, A must be unitarily similar to (3.1). Let us
assume A is in that form. The real part H of A is then $\operatorname{diag}(0, 0, \Re(\zeta))$, with
$\Re(\zeta)$ positive. Since $W(\Re(A))$ is the projection of W(A) onto the real axis,
which is a line segment from 0 to $\Re(\zeta)$, we can determine $\Re(\zeta)$.
Since there is only one flat portion, the real part of any point on that
portion is 0. So $\zeta$ is not on that portion. Because $\Re(\zeta)$ is an endpoint of the
projection of W(A) onto the real axis and $\zeta$ is not on the flat portion, $\zeta$ is
uniquely determined as the point on the boundary of W(A) having a
maximum real part, namely $\Re(\zeta)$. So $\Im(\zeta)$ is also determined.
The imaginary part K of A is
The determinant of this system is Aa - A,, which is nonzero since the flat
portion is of nonzero length, causing the projection of W(A) onto the
imaginary axis to be of nonzero length. So the system (5.1) has a unique
solution:
c; = A,h,(A, + A, - 1 - 5( 6)).
Since $c_1$, $c_2$ are positive, we thus have unique values for them. Therefore
we know all the elements of A in this canonical form, which determines A
up to unitary similarity. ∎
Proof. Cases 1-3 are well known; case 4 was discussed in Theorem 5.1.
In case 5 A is normal, with at least two distinct eigenvalues and all three
eigenvalues collinear. The eigenvalues corresponding to the endpoints can be
determined, but the third eigenvalue cannot. There is a continuum of choices for
this third eigenvalue.
If W(A) is an ellipse and A is reducible, A cannot be restored since the
point defined by its 1 X 1 block may be anywhere within the ellipse defined
by the 2 X 2 block. Again, there is a continuum of choices for the 1 X 1
block.
The proof in the remaining two situations (W(A) is an ellipse produced
by an irreducible A or an ovular shape) is based on a series of lemmas and is
therefore relegated to the end of this section. ∎
determines two others: (1) W(A) and the trace of A; (2) W(A) and the
eigenvalues of A; (3) C(A).
Indeed, C(A) determines uniquely W(A) (because W(A) is the convex
hull of C(A)) and the eigenvalues of A (because these are the foci of C(A)).
On the other hand, if W(A) is known then the maximal and minimal
eigenvalues of every linear combination $H\cos\xi + K\sin\xi$ (here A = H +
iK with Hermitian H and K and $\xi$ is a real number) are determined by
using the orthogonal projection of $W(e^{-i\xi}A)$ onto the real axis; note that
$H\cos\xi + K\sin\xi$ is the real part of $e^{-i\xi}A$. If, in addition, the trace of A is
known, then all eigenvalues of $H\cos\xi + K\sin\xi$ are known, and therefore
the polynomial $\det(uH + vK + wI)$ is known, which determines C(A).
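The role of the trace in this argument can be illustrated in the simplest setting. The sketch below assumes a normal (diagonal) 3 × 3 matrix, so that the eigenvalues of $H\cos\xi + K\sin\xi$ are just $\Re(e^{-i\xi}\lambda_j)$; the projection of W(A) supplies the largest and smallest of them, and the trace supplies the third. The function names and sample eigenvalues are hypothetical.

```python
import cmath

def projection_extremes(eigs, xi):
    """Max and min of Re(e^{-i xi} z) over W(A) when A = diag(eigs) (normal case)."""
    vals = [(cmath.exp(-1j * xi) * lam).real for lam in eigs]
    return max(vals), min(vals)

def middle_eigenvalue(h_max, h_min, trace_A, xi):
    """Third eigenvalue of H cos(xi) + K sin(xi), recovered from the trace of A."""
    # trace(H cos xi + K sin xi) = Re(e^{-i xi} trace(A))
    trace_rot = (cmath.exp(-1j * xi) * trace_A).real
    return trace_rot - h_max - h_min

eigs = [0, 2, 1 + 1j]
xi = 0.3
h_max, h_min = projection_extremes(eigs, xi)
recovered = middle_eigenvalue(h_max, h_min, sum(eigs), xi)
expected = sorted((cmath.exp(-1j * xi) * lam).real for lam in eigs)[1]
```

The value `recovered` matches the middle eigenvalue `expected`; for a non-normal A the same bookkeeping applies, but the extremes must come from W(A) itself rather than from the eigenvalues of A.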
It will be clear from the proof of Theorem 5.2 that, in addition to the
cases when a 3 X 3 matrix A can be restored from W(A), such a matrix can
be restored from C(A) (or equivalently from W(A) and the trace of A) if
W(A) is a line segment. On the contrary, if W(A) is ovular or W(A) is an
ellipse (without any information concerning the reducibility of A), then there
are uncountably many unitarily inequivalent matrices B such that C(B) =
C(A). However, if W(A) is an ellipse and it is known that A is reducible,
then A can be restored (up to unitary similarity) from C(A).
The rest of this section is devoted to completion of the proof of Theorem
5.2.
We use the two matrices
with $\alpha_1 > \alpha_2 > \alpha_3$, $\alpha_1' > \alpha_2' > \alpha_3'$, $\beta_i$, $\beta_i'$ real, and off-diagonal elements
such that
(5.4)
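LEMMA 5.3. Let A and B be written in the form (5.2) and (5.3). Then
$L_A = L_B$ if and only if
1. All the diagonal elements are equal: $\alpha_j = \alpha_j'$, $\beta_j = \beta_j'$ for j = 1, 2, 3
2. $e' = \sqrt{e^2 + ((\alpha_1 - \alpha_2)/(\alpha_2 - \alpha_3))(|g'|^2 - |g|^2)}$
3. $f' = \sqrt{f^2 - ((\alpha_1 - \alpha_3)/(\alpha_2 - \alpha_3))(|g'|^2 - |g|^2)}$
4. $e'f'(g' + \overline{g'}) = ef(g + \bar{g}) + (|g|^2 - |g'|^2)(\alpha_1(\beta_2 - \beta_3) + \alpha_2(\beta_3 - \beta_1) + \alpha_3(\beta_1 - \beta_2))/(\alpha_2 - \alpha_3)$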
+( -a2 - ab - b2)U2w
Comparing this to the equation for $\det(uH' + vK' + wI)$, we see from the
coefficients of $u^2v$ and $uvw$ that we must have
which can be viewed as a linear system of equations in $e'^2$, $f'^2$. By solving the
system and using our assumption that e', f' are nonnegative, we obtain
f’=J.
Finally from the coefficient of v3 we have
u-b 2a +b
c-(c+d)a-d-
I
3(bc - ad)
= ef( g + g) + (lg’i2 - lgt2)
=
(a2 - 4ef + (a3 - al)eg + (a1 - c4fg (5.5)
(5.6)
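$$\mu = \frac{\alpha_1 - \alpha_2}{\alpha_2 - \alpha_3}, \qquad \tau = \frac{\alpha_1 - \alpha_3}{\alpha_2 - \alpha_3},$$
$$e' = \sqrt{e^2 + \mu p} \qquad (5.7)$$
$$f' = \sqrt{f^2 - \tau p} \qquad (5.8)$$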
To satisfy (5.7), (5.8), let us choose $p \in I = (-e^2/\mu, f^2/\tau)$. Note that the
length of I is positive, because otherwise f = e = 0, and A would be
$$\Re g' = \frac{2ef\,\Re g + pq}{2\sqrt{(e^2 + \mu p)(f^2 - \tau p)}} \qquad (5.10)$$
For a number $g' \in \mathbb{C}$ with $|g'|^2 = p + |g|^2$ and $\Re g'$ given by (5.10) to exist, it
is necessary and sufficient that
If g is not real, then P(0) > 0, so that there is an $\varepsilon > 0$ such that
P(p) > 0 for $|p| < \varepsilon$. Every $p \in I \cap (-\varepsilon, \varepsilon)$ generates a matrix (5.3) with
C(B) = C(A). Different values of p correspond to different matrices B, and
none of them are unitarily similar due to Lemma 5.4.
If g is real, then P(0) = 0, and
$$\left.\frac{dP}{dp}\right|_{p=0} = 4\mu f^2g^2 - 4\tau e^2g^2 + 4e^2f^2 - 4efgq.$$
Substituting in the values of $\mu$, $\tau$, and q from (5.6) we see that (5.5) is
equivalent to $dP/dp|_{p=0} = 0$. Hence, if (5.5) does not hold, there is a
one-sided neighborhood N of zero such that P(p) > 0 for $p \in N$. Observe
that this neighborhood is positive if e = 0 and negative if f = 0, so that in
any case $N \cap I$ is a continuum. All $p \in N \cap I$ generate matrices with the
same associated curve as C(A), and, as above, all these matrices belong to
different unitary equivalence classes. ∎
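The one-parameter family used in this proof can be sketched computationally. The code below assumes the readings $e' = \sqrt{e^2 + \mu p}$, $f' = \sqrt{f^2 - \tau p}$, $|g'|^2 = p + |g|^2$ and $\Re g' = (2ef\,\Re g + pq)/(2e'f')$, and takes $P(p) = 4(e^2 + \mu p)(f^2 - \tau p)(p + |g|^2) - (2ef\,\Re g + pq)^2$ as the quantity whose nonnegativity makes $g'$ exist; this choice of P agrees with the derivative at p = 0 quoted above, but it is an assumption, as are the function name and the sample values (q is treated as a given input).

```python
import math

def perturbed_entries(e, f, g, mu, tau, q, p):
    """Return (e', f', g') for the parameter p, or None when no admissible g' exists."""
    e2 = e * e + mu * p          # e'^2
    f2 = f * f - tau * p         # f'^2
    g2 = p + abs(g) ** 2         # |g'|^2
    if e2 <= 0 or f2 <= 0 or g2 < 0:
        return None              # p lies outside the interval I = (-e^2/mu, f^2/tau)
    num = 2 * e * f * complex(g).real + p * q
    P = 4 * e2 * f2 * g2 - num ** 2
    if P < 0:
        return None              # (Re g')^2 would exceed |g'|^2
    re_g = num / (2 * math.sqrt(e2 * f2))
    im_g = math.sqrt(max(g2 - re_g ** 2, 0.0))   # one of the two sign choices
    return math.sqrt(e2), math.sqrt(f2), complex(re_g, im_g)

# Sample data: alpha = (3, 2, 1) gives mu = 1 and tau = 2; q = 0.5 is arbitrary.
inside = perturbed_entries(1.0, 1.0, 1j, 1.0, 2.0, 0.5, 0.1)
outside = perturbed_entries(1.0, 1.0, 1j, 1.0, 2.0, 0.5, 0.6)
```

Since $\mu - \tau + 1 = 0$, the sum $e'^2 + f'^2 + |g'|^2$ does not depend on p, which gives a simple consistency check on the output; the second call returns `None` because p = 0.6 lies outside I.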
We thank the referee for carefully reading the original version of this
article and suggesting many improvements in exposition, as well as Theorem
2.1.
REFERENCES