Diagonalization by A Unitary Similarity Transformation
In the present note, we wish to examine a special case of matrix diagonalization in
which the diagonalizing matrix P is unitary. In this case, the basis of eigenvectors
B is orthonormal. We demonstrate below that a matrix A is diagonalizable by a
unitary similarity transformation if and only if A is normal.
Before proceeding, we record a few facts about unitary and hermitian matrices.
A unitary matrix $U$ is a matrix that satisfies $UU^\dagger = U^\dagger U = I$. By writing out these matrix equations in terms of the matrix elements, one sees that the columns [or rows] of $U$, treated as vectors, are orthonormal. That is, if the columns of $U$ are denoted by $\hat{e}_j$, then the inner product is given by $\langle \hat{e}_i, \hat{e}_j\rangle = \delta_{ij}$. In particular, each column is a vector that is normalized to unity. Note that a unitary matrix is also a normal matrix.

An hermitian matrix satisfies $A^\dagger = A$. By the definition of the adjoint, for any vectors $v, w \in V$,
$$\langle Av, w\rangle = \langle v, A^\dagger w\rangle\,. \qquad (1)$$
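These facts are easy to check numerically. The following sketch (our own example, assuming numpy; the Q factor of a QR factorization is used to manufacture a unitary matrix) verifies that the columns of a unitary matrix are orthonormal and that a unitary matrix is normal:

```python
import numpy as np

# Build a concrete 3x3 unitary matrix: the Q factor of a QR factorization
# of a random complex matrix is unitary.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))

# Columns are orthonormal: <e_i, e_j> = delta_ij, i.e., U^dagger U = I.
assert np.allclose(U.conj().T @ U, np.eye(3))
# A unitary matrix is also normal: U U^dagger = U^dagger U.
assert np.allclose(U @ U.conj().T, U.conj().T @ U)
```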
See the handout entitled: Coordinates, matrix elements and changes of basis.
The inner product of two vectors can be expressed, in terms of their coordinates with respect to an orthonormal basis, by $\langle v, w\rangle = \sum_k v_k^* w_k$, where $v = \sum_k v_k \hat{e}_k$ and $w = \sum_k w_k \hat{e}_k$. Note the appearance of the complex conjugate, $v_k^*$, in the expression for the inner product.
Since $A^\dagger = A$, it follows that
$$\langle Av, w\rangle = \langle v, Aw\rangle \qquad (2)$$
must be true for any vectors $v, w \in V$. In particular, if we choose $v = w$ to be an eigenvector of $A$; i.e., $Av = \lambda v$ for $v \neq 0$, then
$$\langle \lambda v, v\rangle = \langle v, \lambda v\rangle\,.$$
Using the properties of the inner product of a complex vector space,
$$\lambda^* \langle v, v\rangle = \lambda \langle v, v\rangle\,,$$
where $\lambda^*$ is the complex conjugate of $\lambda$. Since $v \neq 0$, it follows that $\langle v, v\rangle \neq 0$. Thus, we may conclude that $\lambda^* = \lambda$, so that $\lambda$ is real. One can also prove that eigenvectors corresponding to non-degenerate eigenvalues of $A$ are orthogonal. To prove this, take $v$ and $w$ to be eigenvectors of $A$ with corresponding eigenvalues $\lambda$ and $\mu$. Then, using the reality of the eigenvalues, eq. (2) yields
$$\lambda \langle v, w\rangle = \mu \langle v, w\rangle\,.$$
Once again, we make use of the properties of the inner product to conclude that
$$(\lambda - \mu)\langle v, w\rangle = 0\,.$$
Since $\lambda \neq \mu$, it follows that $\langle v, w\rangle = 0$, i.e., $v$ and $w$ are orthogonal.

2. The unitary diagonalization of an hermitian matrix

We first prove that an hermitian matrix is diagonalizable by a unitary similarity transformation. Let $v_1$ be a normalized eigenvector of $A$ corresponding to the eigenvalue $\lambda_1$, and define a unitary matrix $U_1$ whose first column is $v_1$:
$$U_1 = \bigl(\,v_1 \;\big|\; Y\,\bigr)\,,$$
where the vertical dashed line is inserted for the reader's convenience as a reminder that this is a partitioned matrix that is $n \times 1$ to the left of the dashed line and $n \times (n-1)$ to the right of the dashed line. Since the columns of $U_1$ comprise an
orthonormal set of vectors, we can write the matrix elements of Y in the form
$$Y_{ij} = (v_j)_i\,, \qquad \text{for } i = 1, 2, \ldots, n \text{ and } j = 2, 3, \ldots, n\,,$$
where $v_1, v_2, \ldots, v_n$ is an orthonormal set of vectors. Here $(v_j)_i$ is the $i$th coordinate (with respect to a fixed orthonormal basis) of the $j$th vector of the orthonormal set. It then follows that:
$$\langle v_j, v_1\rangle = \sum_{k=1}^n (v_j)_k^*\,(v_1)_k = 0\,, \qquad \text{for } j = 2, 3, \ldots, n, \qquad (3)$$
where $(v_j)_k^*$ is the complex conjugate of the $k$th component of the vector $v_j$. We can rewrite eq. (3) as a matrix product (where $v_1$ is an $n \times 1$ matrix) as:
$$\bigl(Y^\dagger v_1\bigr)_j = \sum_{k=1}^n (Y^\dagger)_{jk}(v_1)_k = \sum_{k=1}^n (v_j)_k^*\,(v_1)_k = 0\,. \qquad (4)$$
That is, $Y^\dagger v_1 = 0$.
We now compute the following product of matrices:
$$U_1^\dagger A U_1 = \begin{pmatrix} v_1^\dagger \\ Y^\dagger \end{pmatrix} A \bigl(\,v_1 \;\big|\; Y\,\bigr) = \begin{pmatrix} v_1^\dagger A v_1 & v_1^\dagger A Y \\ Y^\dagger A v_1 & Y^\dagger A Y \end{pmatrix}. \qquad (5)$$
Note that the partitioned matrix above has the following structure:
$$\begin{pmatrix} 1 \times 1 & 1 \times (n-1) \\ (n-1) \times 1 & (n-1) \times (n-1) \end{pmatrix},$$
where we have indicated the dimensions (number of rows $\times$ number of columns)
of the matrices occupying the four possible positions of the partitioned matrix.
In particular, there is one row above the horizontal dashed line and $(n-1)$ rows below; there is one column to the left of the vertical dashed line and $(n-1)$ columns
to the right. Using Av
1
=
1
v
1
, with v
1
normalized to unity (i.e., v
1
v
1
= 1), we
see that:
v
1
Av
1
=
1
v
1
v
1
=
1
,
Y
Av
1
=
1
Y
v
1
= 0 .
after making use of eq. (4). Using these results in eq. (5) yields:
$$U_1^\dagger A U_1 = \begin{pmatrix} \lambda_1 & v_1^\dagger A Y \\ 0 & Y^\dagger A Y \end{pmatrix}. \qquad (6)$$
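Eq. (6) can be illustrated numerically. In the sketch below (our own example, assuming numpy; completing $v_1$ to a unitary matrix via a QR factorization is a detail not specified in the text), one checks that for an arbitrary complex matrix $A$, the first column of $U_1^\dagger A U_1$ is $(\lambda_1, 0, \ldots, 0)^{\mathsf{T}}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # arbitrary complex A

# One eigenpair of A; np.linalg.eig returns unit-norm eigenvector columns.
evals, evecs = np.linalg.eig(A)
lam1, v1 = evals[0], evecs[:, [0]]

# Complete v1 to a unitary U1 = (v1 | Y): the QR factorization of
# [v1 | random columns] yields a unitary matrix whose first column
# is v1 up to an irrelevant overall phase.
U1, _ = np.linalg.qr(np.hstack([v1, rng.standard_normal((n, n - 1))]))
B = U1.conj().T @ A @ U1

# eq. (6): the first column of U1^dagger A U1 is (lam1, 0, ..., 0)^T.
assert np.isclose(B[0, 0], lam1)
assert np.allclose(B[1:, 0], 0, atol=1e-8)
```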
At this point, we impose the condition that $A$ is hermitian. Note that $A^\dagger = A$ implies that $(U_1^\dagger A U_1)^\dagger = U_1^\dagger A U_1$ [recall that $(AB)^\dagger = B^\dagger A^\dagger$], i.e., $U_1^\dagger A U_1$ is also hermitian. This latter condition can then be used to deduce that the upper right $1 \times (n-1)$ matrix block in eq. (6) must be zero. We therefore conclude that:
$$U_1^\dagger A U_1 = \begin{pmatrix} \lambda_1 & 0 \\ 0 & Y^\dagger A Y \end{pmatrix}. \qquad (7)$$
In particular, $(Y^\dagger A Y)^\dagger = Y^\dagger A Y$. In fact, since $U_1^\dagger A U_1$ is hermitian, it follows that $\lambda_1$ is real and $Y^\dagger A Y$ is hermitian, as expected.
Thus, we have reduced the problem to the diagonalization of the $(n-1)\times(n-1)$ hermitian matrix $Y^\dagger A Y$. In particular, the set of eigenvalues of $Y^\dagger A Y$ must coincide with the set of eigenvalues of $A$ with $\lambda_1$ omitted. To prove this, consider the eigenvalue problem $Y^\dagger A Y\,v = \lambda v$. Multiply both sides by $Y$ and use $YY^\dagger = I_n - v_1 v_1^\dagger$, where $I_n$ is the $n \times n$ identity matrix, together with $v_1^\dagger A Y = 0$ [eq. (7)]. Hence we end up with $AYv = \lambda Yv$. Putting $w \equiv Yv$, we obtain $Aw = \lambda w$. This is the eigenvalue problem for $A$. Thus, in the solution to $Y^\dagger A Y\,v = \lambda v$, $\lambda$ must be one of the remaining eigenvalues of $A$. Defining a new unitary matrix $U_2$ whose first column is one of the normalized eigenvectors of $Y^\dagger A Y$, we will end up reducing the matrix further. We can keep going until we end up with a fully diagonal matrix. At each step, one is simply multiplying on the left with the inverse of a unitary matrix and on the right with a unitary matrix. Since the product of unitary matrices is unitary (check this!), at the end of the process one arrives at:
$$U^\dagger A U = D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}, \qquad (8)$$
where the eigenvalues of A are the diagonal elements of D and the eigenvectors
of A are the columns of U. The latter can be verified explicitly from the equation
AU = UD. Thus, we have proven that an hermitian matrix is diagonalizable by
a unitary similarity transformation. Note that some of the eigenvalues of A may
be degenerate (this imposes no difficulty in the above proof). In fact, one of the
consequences of this analysis is that the eigenvectors of an hermitian matrix can
be chosen to be orthonormal. These eigenvectors are precisely the columns of the
matrix U, which was explicitly constructed in deriving eq. (8).
The result $YY^\dagger = I_n - v_1 v_1^\dagger$ follows from writing $U_1 U_1^\dagger = I_n$ in partitioned form; since the columns of the $n \times (n-1)$ matrix $Y$ are orthonormal, we also have $Y^\dagger Y = I_{n-1}$.
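In practice one rarely carries out this recursion by hand; numpy's `eigh` routine (our choice of eigensolver here; any hermitian eigensolver would serve) returns precisely the unitary $U$ and real eigenvalues of eq. (8):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = X + X.conj().T                 # hermitian by construction: A^dagger = A

evals, U = np.linalg.eigh(A)       # eigensolver for hermitian matrices
assert np.allclose(U.conj().T @ U, np.eye(4))           # U is unitary
assert np.allclose(U.conj().T @ A @ U, np.diag(evals))  # U^dagger A U = D
assert evals.dtype == np.float64   # eigenvalues of an hermitian matrix are real
```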
One important corollary to the above result involves the case of a real symmetric matrix $A$ (i.e., $A$ is a real matrix that satisfies $A = A^{\mathsf{T}}$). Since the eigenvalues of a hermitian matrix are real, it follows that the eigenvalues of a real symmetric matrix are real (since a real symmetric matrix is also hermitian). Thus, it is possible to diagonalize a real symmetric matrix by a real orthogonal similarity transformation:
$$R^{\mathsf{T}} A R = D\,,$$
where $R$ is a real matrix that satisfies $RR^{\mathsf{T}} = R^{\mathsf{T}} R = I$ (note that a real orthogonal matrix is also unitary). The real orthonormal eigenvectors of $A$ are the columns of $R$, and $D$ is a diagonal matrix whose diagonal elements are the eigenvalues of $A$.
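A sketch of the real symmetric case (again with numpy's `eigh`, which returns a real orthogonal $R$ when given real symmetric input):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 4))
A = X + X.T                        # real symmetric: A = A^T

evals, R = np.linalg.eigh(A)
assert R.dtype == np.float64       # the eigenvectors can be chosen real
assert np.allclose(R.T @ R, np.eye(4))           # R is real orthogonal
assert np.allclose(R.T @ A @ R, np.diag(evals))  # R^T A R = D
```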
3. Simultaneous diagonalization of two commuting hermitian matrices
Two hermitian matrices are simultaneously diagonalizable by a unitary similarity transformation if and only if they commute. That is, given two hermitian matrices $A$ and $B$, we can find a unitary matrix $V$ such that both $V^\dagger A V = D_A$ and $V^\dagger B V = D_B$ are diagonal matrices. Note that the two diagonal matrices $D_A$ and $D_B$ are not equal in general. But, since $V$ is a matrix whose columns are the eigenvectors of both $A$ and $B$, it must be true that the eigenvectors of $A$ and $B$ coincide.
Since all diagonal matrices commute, it follows that $D_A D_B = D_B D_A$. Hence, if $V^\dagger A V = D_A$ and $V^\dagger B V = D_B$, then $(V^\dagger A V)(V^\dagger B V) = (V^\dagger B V)(V^\dagger A V)$. Using $VV^\dagger = I$, this yields $V^\dagger A B V = V^\dagger B A V$. Hence, we conclude that $AB = BA$. To complete the proof, we must prove the converse:
if two hermitian matrices A and B commute, then A and B are simultaneously
diagonalizable by a unitary similarity transformation.
Suppose that $AB = BA$, where $A$ and $B$ are hermitian matrices. Then, we can find a unitary matrix $U$ such that $U^\dagger A U = D_A$, where $D_A$ is diagonal. Using the same matrix $U$, we shall define $B' \equiv U^\dagger B U$. Explicitly,
$$U^\dagger A U = D_A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}, \qquad U^\dagger B U \equiv B' = \begin{pmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nn} \end{pmatrix}.$$
Note that because $B$ is hermitian, it follows that $B'$ is hermitian as well:
$$(B')^\dagger = (U^\dagger B U)^\dagger = U^\dagger B^\dagger U = U^\dagger B U = B'\,.$$
The relation $AB = BA$ imposes a strong constraint on the form of $B'$. First, observe that:
$$D_A B' = (U^\dagger A U)(U^\dagger B U) = U^\dagger A B U = U^\dagger B A U = (U^\dagger B U)(U^\dagger A U) = B' D_A\,.$$
Explicitly,
$$D_A B' = \begin{pmatrix} a_1 b_{11} & a_1 b_{12} & \cdots & a_1 b_{1n} \\ a_2 b_{21} & a_2 b_{22} & \cdots & a_2 b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_n b_{n1} & a_n b_{n2} & \cdots & a_n b_{nn} \end{pmatrix}, \qquad B' D_A = \begin{pmatrix} a_1 b_{11} & a_2 b_{12} & \cdots & a_n b_{1n} \\ a_1 b_{21} & a_2 b_{22} & \cdots & a_n b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_1 b_{n1} & a_2 b_{n2} & \cdots & a_n b_{nn} \end{pmatrix}.$$
Then, $D_A B' - B' D_A = 0$ yields $(a_i - a_j)\,b_{ij} = 0$. If all the $a_i$ were distinct, then we would be able to conclude that $b_{ij} = 0$ for $i \neq j$; that is, $B'$ would be diagonal. Thus,
let us examine carefully what happens if some of the diagonal elements are equal
(i.e., some of the eigenvalues of A are degenerate).
If $A$ has some degenerate eigenvalues, we can order the columns of $U$ so that the degenerate eigenvalues are contiguous along the diagonal. Henceforth, we assume this to be the case. We would then conclude that $b_{ij} = 0$ if $a_i \neq a_j$. One can then write $D_A$ and $B'$ in block form:
$$D_A = \begin{pmatrix} a_1 I_1 & 0 & \cdots & 0 \\ 0 & a_2 I_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_k I_k \end{pmatrix}, \qquad B' = \begin{pmatrix} B_1 & 0 & \cdots & 0 \\ 0 & B_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & B_k \end{pmatrix}, \qquad (9)$$
assuming that $A$ possesses $k$ distinct eigenvalues. Here $I_j$ indicates the identity matrix whose dimension is equal to the multiplicity of the corresponding eigenvalue $a_j$. The corresponding $B_j$ is an hermitian matrix with the same dimension as $I_j$. Since $B_j$ is hermitian, it can be diagonalized by a unitary similarity transformation. In particular, we can find a unitary matrix of the form:
$$U' = \begin{pmatrix} U_1 & 0 & \cdots & 0 \\ 0 & U_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & U_k \end{pmatrix},$$
such that $(U')^\dagger B' U' = D_B$ is diagonal. One can easily see by explicit multiplication of the matrices that $(U')^\dagger D_A U' = D_A$. Hence, we have succeeded in finding a unitary matrix $V \equiv UU'$ such that:
$$V^\dagger A V = D_A\,, \qquad V^\dagger B V = D_B\,.$$
That is, A and B are simultaneously diagonalizable by a unitary similarity trans-
formation. The columns of V are the simultaneous eigenvectors of A and B.
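The two-step construction above can be traced numerically. In the sketch below (our own example, assuming numpy), $A$ is given a degenerate eigenvalue, so that after the first step $B' = U^\dagger B U$ is only block diagonal, and a second unitary $U'$ acting within the degenerate block completes the simultaneous diagonalization:

```python
import numpy as np

rng = np.random.default_rng(4)
# A common orthonormal eigenbasis Q guarantees that A and B commute.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
a = np.array([2.0, 2.0, 5.0, 7.0])   # note the degenerate eigenvalue a_1 = a_2
b = np.array([1.0, 3.0, 4.0, 6.0])
A = Q @ np.diag(a) @ Q.conj().T
B = Q @ np.diag(b) @ Q.conj().T
assert np.allclose(A @ B, B @ A)

# Step 1: diagonalize A. B' = U^dagger B U need not be diagonal, but
# b_ij = 0 whenever a_i != a_j, so B' is block diagonal.
_, U = np.linalg.eigh(A)             # eigh sorts: eigenvalues 2, 2, 5, 7
Bp = U.conj().T @ B @ U
assert np.allclose(Bp[:2, 2:], 0, atol=1e-10)

# Step 2: diagonalize the hermitian block B_1 in the degenerate subspace;
# U' = diag(U_1, 1, 1), and V = U U' diagonalizes both A and B.
Uprime = np.eye(4, dtype=complex)
_, Uprime[:2, :2] = np.linalg.eigh(Bp[:2, :2])
V = U @ Uprime
DA = V.conj().T @ A @ V
DB = V.conj().T @ B @ V
assert np.allclose(DA, np.diag(np.diag(DA)), atol=1e-10)
assert np.allclose(DB, np.diag(np.diag(DB)), atol=1e-10)
```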
The proof we have just given can be extended to prove a stronger result.
Two diagonalizable matrices are simultaneously diagonalizable if and only if they
commute. That is, given two diagonalizable matrices $A$ and $B$, we can find one invertible operator $S$ such that both $S^{-1} A S = D_A$ and $S^{-1} B S = D_B$ are diagonal matrices. The proof follows the same steps given above. However, one
step requires more care. Although $B$ is diagonalizable (which implies that $B'$ is diagonalizable), one must prove that each of the $B_j$ that appears in eq. (9) is diagonalizable. Details of this proof can be found in Matrix Analysis, by Roger A. Horn and Charles R. Johnson (Cambridge University Press, Cambridge, England, 1985) p. 49. We do not require this stronger version of the theorem in these
notes.
4. The unitary diagonalization of a normal matrix
We first prove that if $A$ can be diagonalized by a unitary similarity transformation, then $A$ is normal. If $U^\dagger A U = D$, where $D$ is diagonal, then $A = U D U^\dagger$ and $A^\dagger = U D^\dagger U^\dagger$. It follows that
$$A A^\dagger = (U D U^\dagger)(U D^\dagger U^\dagger) = U D D^\dagger U^\dagger = U D^\dagger D U^\dagger = (U D^\dagger U^\dagger)(U D U^\dagger) = A^\dagger A\,.$$
In this proof, we use the fact that diagonal matrices commute with each other, so that $D D^\dagger = D^\dagger D$.
Conversely, if $A$ is normal then it can be diagonalized by a unitary similarity transformation. To prove this, we note that any complex matrix $A$ can be uniquely written in the form:
$$A = B + iC\,, \qquad \text{where } B \text{ and } C \text{ are hermitian matrices}\,.$$
To verify this assertion, we simply identify: $B = \tfrac{1}{2}(A + A^\dagger)$ and $C = \tfrac{1}{2i}(A - A^\dagger)$. One easily checks that $B = B^\dagger$, $C = C^\dagger$, and $A = B + iC$. If $A$ is normal ($A A^\dagger = A^\dagger A$), then
$$0 = A A^\dagger - A^\dagger A = (B + iC)(B - iC) - (B - iC)(B + iC) = 2i\,(CB - BC)\,,$$
so that $B$ and $C$ commute. By the results of section 3, two commuting hermitian matrices can be simultaneously diagonalized: we can find a unitary matrix $V$ such that $V^\dagger B V = D_B$ and $V^\dagger C V = D_C$, where $D_B$ and $D_C$ are diagonal. Then
$$V^\dagger A V = V^\dagger (B + iC) V = D_B + i D_C\,, \qquad (10)$$
which is a diagonal matrix. Therefore, we have explicitly demonstrated that
any normal matrix can be diagonalized by a unitary similarity transformation.
Moreover, as was the case for the hermitian matrix, the eigenvectors of a normal
matrix can be chosen to be orthonormal and correspond to the columns of V .
However, eq. (10) shows that the eigenvalues of a normal matrix are in general
complex (in contrast to the real eigenvalues of an hermitian matrix).
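The decomposition $A = B + iC$ and the resulting diagonalization can be checked numerically. In this sketch (our own construction, assuming numpy), the hermitian part $B$ happens to have distinct eigenvalues, so diagonalizing $B$ alone already supplies the common eigenvector matrix $V$; in the degenerate case one would need the block-by-block procedure of section 3:

```python
import numpy as np

rng = np.random.default_rng(5)
# A normal, non-hermitian matrix: unitary eigenbasis, complex eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
A = Q @ np.diag(np.array([1 + 2j, 3 - 1j, 0.5j])) @ Q.conj().T
assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # A is normal

B = 0.5 * (A + A.conj().T)        # B = (A + A^dagger)/2 is hermitian
C = (A - A.conj().T) / 2j         # C = (A - A^dagger)/(2i) is hermitian
assert np.allclose(B, B.conj().T) and np.allclose(C, C.conj().T)
assert np.allclose(A, B + 1j * C)
assert np.allclose(B @ C, C @ B)  # normality forces B and C to commute

# B has distinct eigenvalues (the real parts 0, 1, 3), so its eigenvectors
# are unique up to phases and diagonalize C (and hence A) as well.
_, V = np.linalg.eigh(B)
DA = V.conj().T @ A @ V           # = D_B + i D_C: diagonal, complex entries
assert np.allclose(DA, np.diag(np.diag(DA)), atol=1e-8)
assert not np.allclose(np.diag(DA).imag, 0)   # eigenvalues genuinely complex
```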
5. The unitary diagonalization of a normal matrix, revisited
In the last section, we used the unitary diagonalization of hermitian matrices
and the simultaneous unitary diagonalization of two hermitian matrices to prove
that a normal matrix can be diagonalized by a unitary similarity transformation.
Nevertheless, one can provide a direct proof of the unitary diagonalization of a
normal matrix that does not rely on the diagonalization of hermitian matrices. In
particular, the same proof given for the unitary diagonalization of an hermitian
matrix can also be applied to the case of a normal matrix with only minor changes.
For completeness, we provide the details here.
Our starting point is eq. (6), which is valid for any complex matrix. If A is
normal, then $U_1^\dagger A U_1$ is normal, since
$$U_1^\dagger A U_1 \bigl(U_1^\dagger A U_1\bigr)^\dagger = U_1^\dagger A U_1 U_1^\dagger A^\dagger U_1 = U_1^\dagger A A^\dagger U_1\,,$$
$$\bigl(U_1^\dagger A U_1\bigr)^\dagger U_1^\dagger A U_1 = U_1^\dagger A^\dagger U_1 U_1^\dagger A U_1 = U_1^\dagger A^\dagger A U_1\,,$$
where we have used the fact that $U_1$ is unitary ($U_1 U_1^\dagger = I$). Imposing $A A^\dagger = A^\dagger A$, we conclude that
$$U_1^\dagger A U_1 \bigl(U_1^\dagger A U_1\bigr)^\dagger = \bigl(U_1^\dagger A U_1\bigr)^\dagger U_1^\dagger A U_1\,. \qquad (11)$$
However, eq. (6) implies that
$$U_1^\dagger A U_1 \bigl(U_1^\dagger A U_1\bigr)^\dagger = \begin{pmatrix} \lambda_1 & v_1^\dagger A Y \\ 0 & Y^\dagger A Y \end{pmatrix} \begin{pmatrix} \lambda_1^* & 0 \\ Y^\dagger A^\dagger v_1 & Y^\dagger A^\dagger Y \end{pmatrix} = \begin{pmatrix} |\lambda_1|^2 + v_1^\dagger A Y\, Y^\dagger A^\dagger v_1 & v_1^\dagger A Y\, Y^\dagger A^\dagger Y \\ Y^\dagger A Y\, Y^\dagger A^\dagger v_1 & Y^\dagger A Y\, Y^\dagger A^\dagger Y \end{pmatrix},$$
$$\bigl(U_1^\dagger A U_1\bigr)^\dagger U_1^\dagger A U_1 = \begin{pmatrix} \lambda_1^* & 0 \\ Y^\dagger A^\dagger v_1 & Y^\dagger A^\dagger Y \end{pmatrix} \begin{pmatrix} \lambda_1 & v_1^\dagger A Y \\ 0 & Y^\dagger A Y \end{pmatrix} = \begin{pmatrix} |\lambda_1|^2 & \lambda_1^*\, v_1^\dagger A Y \\ \lambda_1\, Y^\dagger A^\dagger v_1 & Y^\dagger A^\dagger v_1 v_1^\dagger A Y + Y^\dagger A^\dagger Y\, Y^\dagger A Y \end{pmatrix}.$$
Imposing the result of eq. (11), we first compare the upper left hand block of the two matrices above. We conclude that:
$$v_1^\dagger A Y\, Y^\dagger A^\dagger v_1 = 0\,. \qquad (12)$$
But $Y^\dagger A^\dagger v_1$ is an $(n-1)$-dimensional vector, so that eq. (12) is the matrix version of the following equation:
$$\langle Y^\dagger A^\dagger v_1\,,\, Y^\dagger A^\dagger v_1\rangle = 0\,. \qquad (13)$$
Since $\langle w, w\rangle = 0$ implies that $w = 0$ (and $w^\dagger = 0$), we conclude that
$$Y^\dagger A^\dagger v_1 = \bigl(v_1^\dagger A Y\bigr)^\dagger = 0\,, \qquad \text{i.e.,} \qquad v_1^\dagger A Y = 0\,. \qquad (14)$$
Using eq. (14) in the expressions for $U_1^\dagger A U_1 (U_1^\dagger A U_1)^\dagger$ and $(U_1^\dagger A U_1)^\dagger U_1^\dagger A U_1$ above, we see that eq. (11) requires that eq. (14) and the following condition are both satisfied:
$$Y^\dagger A Y\, Y^\dagger A^\dagger Y = Y^\dagger A^\dagger Y\, Y^\dagger A Y\,.$$
The latter condition simply states that $Y^\dagger A Y$ is normal, since $(Y^\dagger A Y)^\dagger = Y^\dagger A^\dagger Y$. Moreover, inserting $v_1^\dagger A Y = 0$ into eq. (6) yields
$$U_1^\dagger A U_1 = \begin{pmatrix} \lambda_1 & 0 \\ 0 & Y^\dagger A Y \end{pmatrix}, \qquad (15)$$
where $Y^\dagger A Y$ is a normal $(n-1)\times(n-1)$ matrix, and we can now repeat the procedure again. The end result is once again the unitary diagonalization of $A$:
$$U^\dagger A U = D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$$
Moreover, the eigenvalues of A are the diagonal elements of D and the eigenvec-
tors of A are the columns of U. This should be clear from the equation AU = UD.
Thus, we have proven that a normal matrix is diagonalizable by a unitary simi-
larity transformation.
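The deflation argument of this section translates directly into an algorithm. The sketch below (our own implementation, assuming numpy; a production code would instead use a library eigensolver or the Schur decomposition) diagonalizes a normal matrix by exactly the recursion in the text: place one normalized eigenvector in the first column of a unitary $U_1$, and recurse on the trailing $(n-1)\times(n-1)$ block, which eq. (15) guarantees is again normal:

```python
import numpy as np

def unitary_diagonalize(A):
    """Return a unitary U with U^dagger A U diagonal, for A normal.

    Mirrors the proof: U1 = (v1 | Y) with v1 a normalized eigenvector,
    so U1^dagger A U1 = diag(lam1, Y^dagger A Y); recurse on Y^dagger A Y.
    """
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    rng = np.random.default_rng(0)
    # One normalized eigenvector of A (np.linalg.eig returns unit columns).
    _, evecs = np.linalg.eig(A)
    v1 = evecs[:, [0]]
    # Complete v1 to a unitary U1 = (v1 | Y) via QR of [v1 | random columns].
    M = np.hstack([v1, rng.standard_normal((n, n - 1))
                       + 1j * rng.standard_normal((n, n - 1))])
    U1, _ = np.linalg.qr(M)
    B = U1.conj().T @ A @ U1          # eq. (15): diag(lam1, Y^dagger A Y)
    W = np.eye(n, dtype=complex)
    W[1:, 1:] = unitary_diagonalize(B[1:, 1:])
    return U1 @ W

# Check on a random normal (non-hermitian) matrix.
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
d = rng.standard_normal(4) + 1j * rng.standard_normal(4)
A = Q @ np.diag(d) @ Q.conj().T
U = unitary_diagonalize(A)
D = U.conj().T @ A @ U
assert np.allclose(U.conj().T @ U, np.eye(4), atol=1e-8)   # U is unitary
assert np.allclose(D, np.diag(np.diag(D)), atol=1e-6)      # D is diagonal
assert np.allclose(np.sort_complex(np.diag(D)), np.sort_complex(d), atol=1e-6)
```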