Matrices
Topics: The eigenvalue problem - eigenvalues and eigenvectors - properties of eigenvalues and eigenvectors - Cayley-Hamilton theorem and its applications - symmetric matrices - similarity of matrices - diagonalisation of a real symmetric matrix - quadratic form.
Note: These lecture notes covering the above topics are prepared from various books just to help
the students. The study material is expected to be useful but not exhaustive. For detailed study,
the students are advised to attend the lecture/tutorial classes regularly, and consult the reference
books.
Appeal: Please do not print these lecture notes unless it is really necessary.
Contents
1 Matrices
1.1 Eigenvalues and Eigenvectors
1.1.1 Properties of eigenvalues
1.2 Similar matrices and diagonalization
1.2.1 Diagonalization of a real symmetric matrix
1.3 Quadratic form
1.4 Row echelon form and rank of a matrix
1.5 Cayley-Hamilton theorem
1.6 Applications of the Eigenvalues and Eigenvectors
1.1 Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors find a large number of applications in science and engineering (see Section 1.6). In what follows, you will learn the mathematics of eigenvalues and eigenvectors.
A real number λ is an eigenvalue of an n-square matrix A iff there exists a non-zero n-vector X such that AX = λX or (A − λIn)X = 0. The non-zero vector X is called an eigenvector of A corresponding to the eigenvalue λ. Since the non-zero vector X is a non-trivial solution of the homogeneous system (A − λIn)X = 0, we must have |A − λIn| = 0. This equation, known as the characteristic equation of A, yields the eigenvalues of A. So to find the eigenvalues of A, we solve the equation |A − λIn| = 0.
The set Eλ = {X : AX = λX} is known as the eigenspace of λ. Note that Eλ contains all
eigenvectors of A corresponding to the eigenvalue λ in addition to the vector X = 0 since A0 = λ0.
Of course, by definition X = 0 is not an eigenvector
of A.
Ex. Find the eigenvalues and eigenvectors of A = \begin{pmatrix} 12 & -51 \\ 2 & -11 \end{pmatrix}.
Sol. Here, the characteristic equation of A, that is, |A − λI2| = 0, reads as
\begin{vmatrix} 12 - \lambda & -51 \\ 2 & -11 - \lambda \end{vmatrix} = 0.
This leads to a quadratic equation in λ given by
\lambda^2 - \lambda - 30 = 0.
Its roots are λ = 6, −5, the eigenvalues of A.
Now, the eigenvectors corresponding to λ = 6 are the non-trivial solutions X of the homogeneous system (A − 6I2)X = 0. So to find the eigenvectors of A corresponding to the eigenvalue λ = 6, we need to solve the homogeneous system:
\begin{pmatrix} 6 & -51 \\ 2 & -17 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Applying R1 → (1/6)R1, we get
\begin{pmatrix} 1 & -17/2 \\ 2 & -17 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Applying R2 → R2 − 2R1, we get
\begin{pmatrix} 1 & -17/2 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
So the system reduces to x1 − (17/2)x2 = 0. Letting x2 = a, we get x1 = (17/2)a. So
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} (17/2)a \\ a \end{pmatrix} = a \begin{pmatrix} 17/2 \\ 1 \end{pmatrix}.
So the eigenvectors corresponding to λ = 6 are non-zero multiples of the vector \begin{pmatrix} 17/2 \\ 1 \end{pmatrix}. The eigenspace corresponding to λ = 6, therefore, is
E_6 = \left\{ a \begin{pmatrix} 17/2 \\ 1 \end{pmatrix} : a \in \mathbb{R} \right\}.
Likewise, to find the eigenvectors corresponding to λ = −5, we solve the homogeneous system
(A + 5I2 )X = 0, that is,
\begin{pmatrix} 17 & -51 \\ 2 & -6 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
Applying R1 → (1/17)R1 and then R2 → R2 − 2R1, we have
\begin{pmatrix} 1 & -3 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
So the system reduces to x1 − 3x2 = 0, and the eigenvectors corresponding to λ = −5 are non-zero multiples of the vector \begin{pmatrix} 3 \\ 1 \end{pmatrix}. The eigenspace is
E_{-5} = \left\{ a \begin{pmatrix} 3 \\ 1 \end{pmatrix} : a \in \mathbb{R} \right\}.
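Such hand computations can be cross-checked numerically. Below is a minimal NumPy sketch (not part of the original notes); numpy.linalg.eig returns the eigenvalues together with unit-length eigenvectors, so the computed eigenvectors appear as scalar multiples of the vectors (17/2, 1) and (3, 1) found above.

    import numpy as np

    A = np.array([[12.0, -51.0],
                  [2.0, -11.0]])

    # eig returns the eigenvalues and a matrix whose columns are eigenvectors
    vals, vecs = np.linalg.eig(A)
    print(vals)   # 6 and -5, in some order
    print(vecs)   # columns proportional to (17/2, 1) and (3, 1)

    # check the defining relation A X = lambda X for each pair
    for lam, x in zip(vals, vecs.T):
        assert np.allclose(A @ x, lam * x)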
Ex. Find the eigenvalues and eigenvectors of A = \begin{pmatrix} -4 & 8 & -12 \\ 6 & -6 & 12 \\ 6 & -8 & 14 \end{pmatrix}.
Sol. The characteristic equation of A, that is, |A − λI3| = 0, reads as
\begin{vmatrix} -4 - \lambda & 8 & -12 \\ 6 & -6 - \lambda & 12 \\ 6 & -8 & 14 - \lambda \end{vmatrix} = 0.
This leads to the cubic equation
\lambda^3 - 4\lambda^2 + 4\lambda = 0, that is, \lambda(\lambda - 2)^2 = 0.
So the eigenvalues of A are λ = 0, 2, 2.
For λ = 0, row reduction of the homogeneous system (A − 0·I3)X = 0 gives the reduced system of equations
x1 + x3 = 0, x2 − x3 = 0.
Letting x3 = a, we get x1 = −a and x2 = a, so the eigenvectors corresponding to λ = 0 are non-zero multiples of X1 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}.
For λ = 2, every equation of the homogeneous system (A − 2I3)X = 0 is a multiple of the single equation
x1 − (4/3)x2 + 2x3 = 0.
Letting x2 = a and x3 = b, we get x1 = (4/3)a − 2b, so there are two free variables.
Note: The algebraic multiplicity of an eigenvalue is the number of times it occurs as a root of the characteristic equation. In the above example, the eigenvalue λ = 2 is repeated twice, so its algebraic multiplicity is 2.
Also, taking (a, b) = (1, 0) and (0, 1), we get two linearly independent eigenvectors X2 = \begin{pmatrix} 4/3 \\ 1 \\ 0 \end{pmatrix} and X3 = \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix} corresponding to λ = 2. The following example shows that there may not exist as many linearly independent eigenvectors as the algebraic multiplicity of an eigenvalue.
Ex. If A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, then the eigenvalues of A are λ = 0, 0, 0. The eigenvectors corresponding to λ = 0 are non-zero multiples of the vector X = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. The eigenspace corresponding to λ = 0, therefore, is
E_0 = \left\{ a \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} : a \in \mathbb{R} \right\}.
Please try this example yourself. Notice that there is only one linearly independent eigenvector X = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} corresponding to the repeated eigenvalue (repeating thrice) λ = 0.
Online tool: A nice online tool for eigenvalue, eigenvector and various other matrix calculations is available at Matrix Calculator. It is very useful for verifying your manual calculations and answers.
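As a complement to the online tool, the previous example can also be checked with a few lines of NumPy. The sketch below (an illustration, not part of the original notes) computes the geometric multiplicity of λ = 0 as n − rank(A − λI); matrix_rank uses a numerical tolerance internally.

    import numpy as np

    A = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [0.0, 0.0, 0.0]])
    lam = 0.0

    # number of linearly independent eigenvectors for lam = n - rank(A - lam*I)
    geo_mult = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(3))
    print(geo_mult)   # 1, although the algebraic multiplicity of lambda = 0 is 3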
1.2 Similar matrices and diagonalization
Two n-square matrices A and B are said to be similar if there exists a non-singular matrix P such that P^{-1}AP = B. A square matrix A is said to be diagonalizable if it is similar to a diagonal matrix, that is, if there exists a non-singular matrix P such that P^{-1}AP = D, a diagonal matrix.
Suppose X1, X2, ..., Xn are n linearly independent eigenvectors of A corresponding to the eigenvalues λ1, λ2, ..., λn, and let P = [X1 X2 ... Xn] be the matrix with these eigenvectors as its columns. Then
AP = [AX1 AX2 ... AXn] = [λ1X1 λ2X2 ... λnXn] = PD, where D = \begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix}.
This shows that if there exist n linearly independent eigenvectors of A, then A is diagonalizable. Further, we construct P from these eigenvectors of A, and obtain P^{-1}AP = D, wherein the eigenvalues of A appear at the diagonal places.
Note: If A has n different eigenvalues, then it can be proved that there exist n linearly independent eigenvectors of A, and consequently A is diagonalizable. However, there may exist n linearly independent eigenvectors even if A has repeated eigenvalues, as we have seen earlier. Such a matrix is also, of course, diagonalizable. In case A does not have n linearly independent eigenvectors, it is not diagonalizable.
Ex. If A = \begin{pmatrix} 12 & -51 \\ 2 & -11 \end{pmatrix}, then P = \begin{pmatrix} 17 & 3 \\ 2 & 1 \end{pmatrix} and P^{-1}AP = \begin{pmatrix} 6 & 0 \\ 0 & -5 \end{pmatrix}. (Verify!)
Ex. If A = \begin{pmatrix} -4 & 8 & -12 \\ 6 & -6 & 12 \\ 6 & -8 & 14 \end{pmatrix}, then P = \begin{pmatrix} 4 & -2 & -1 \\ 3 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} and P^{-1}AP = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}. (Verify!)
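The "Verify!" steps can be done by hand or numerically. A minimal NumPy sketch for the second example is given below (assuming NumPy is available); the first example can be checked in exactly the same way.

    import numpy as np

    A = np.array([[-4.0, 8.0, -12.0],
                  [6.0, -6.0, 12.0],
                  [6.0, -8.0, 14.0]])
    P = np.array([[4.0, -2.0, -1.0],
                  [3.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])

    # P^(-1) A P should be the diagonal matrix diag(2, 2, 0)
    D = np.linalg.inv(P) @ A @ P
    print(np.round(D, 10))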
If A is diagonalizable, say P^{-1}AP = D, then A = PDP^{-1}, and so
A^2 = (PDP^{-1})^2 = PDP^{-1}PDP^{-1} = PD^2P^{-1}.
Likewise, A^3 = PD^3P^{-1}. So in general, A^n = PD^nP^{-1}.
This result can be utilized to evaluate powers of a diagonalizable matrix easily.
Ex. Determine A^2, where A = \begin{pmatrix} -4 & 8 & -12 \\ 6 & -6 & 12 \\ 6 & -8 & 14 \end{pmatrix}.
Sol. Here P = \begin{pmatrix} 4 & -2 & -1 \\ 3 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, P^{-1} = \begin{pmatrix} 1 & -1 & 2 \\ 3 & -4 & 7 \\ -3 & 4 & -6 \end{pmatrix} and D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
So A^2 = PD^2P^{-1} = \begin{pmatrix} 4 & -2 & -1 \\ 3 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & -1 & 2 \\ 3 & -4 & 7 \\ -3 & 4 & -6 \end{pmatrix} = \begin{pmatrix} -8 & 16 & -24 \\ 12 & -12 & 24 \\ 12 & -16 & 28 \end{pmatrix}.
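For higher powers the same computation applies with D^n in place of D^2. A short NumPy sketch (an illustration using the P and D found above) is:

    import numpy as np

    A = np.array([[-4.0, 8.0, -12.0],
                  [6.0, -6.0, 12.0],
                  [6.0, -8.0, 14.0]])
    P = np.array([[4.0, -2.0, -1.0],
                  [3.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
    d = np.array([2.0, 2.0, 0.0])   # diagonal entries of D

    n = 2
    An = P @ np.diag(d ** n) @ np.linalg.inv(P)   # A^n = P D^n P^(-1)
    print(An)

    # cross-check against the direct matrix power
    assert np.allclose(An, np.linalg.matrix_power(A, n))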
Ex. Show that similar matrices have the same eigenvalues. Also, discuss their eigenvectors.
Sol. Suppose A and B are similar matrices. Then there exists a non-singular matrix P such that P^{-1}AP = B. If λ is an eigenvalue of B, then we have
|B − λI| = 0, that is, |P^{-1}AP − λI| = |P^{-1}(A − λI)P| = |P^{-1}| |A − λI| |P| = |A − λI| = 0.
So λ is an eigenvalue of A as well, and hence similar matrices have the same eigenvalues. Further, if X is an eigenvector of B corresponding to λ, then BX = λX gives P^{-1}APX = λX, that is, A(PX) = λ(PX). So PX is an eigenvector of A corresponding to the eigenvalue λ.
1.2.1 Diagonalization of a real symmetric matrix
Here first we need to learn some definitions and results.
The norm (or length) of a vector X = (x1, x2, ..., xn)^T, denoted by ||X||, is defined as
||X|| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}.
The dot product of the vectors X = (x1, x2, ..., xn)^T and Y = (y1, y2, ..., yn)^T, denoted by X.Y, is defined as
X.Y = X^T Y = x_1y_1 + x_2y_2 + \dots + x_ny_n.
The vectors X and Y are orthogonal if X.Y = 0. Further, X and Y are orthonormal if X.Y = 0 and ||X|| = 1 = ||Y||.
So, if two nonzero vectors X and Y are orthogonal, then X/||X|| and Y/||Y|| are orthonormal.
A square matrix P whose columns form an orthonormal set is called an orthogonal (or orthonormal) matrix. Such a matrix satisfies P^T P = I, that is,
P^{-1} = P^T.
The Gram-Schmidt Process: This process converts a linearly independent set of vectors, say {X1, X2, X3, ..., Xn}, into an orthogonal set of vectors {Y1, Y2, Y3, ..., Yn} as follows:
(i) Y1 = X1,
(ii) Y2 = X2 − (X2.Y1)Y1/||Y1||^2,
(iii) Y3 = X3 − (X3.Y1)Y1/||Y1||^2 − (X3.Y2)Y2/||Y2||^2,
and so on.
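A small Python sketch of this process is given below (for illustration only; the function name gram_schmidt is ours, not from any library). It implements exactly the formulas above and reproduces the vectors Y1, Y2 used in the example that follows.

    import numpy as np

    def gram_schmidt(vectors):
        """Convert a linearly independent list of vectors into an orthogonal list."""
        ortho = []
        for x in vectors:
            y = x.astype(float)
            for q in ortho:
                y = y - (x @ q) / (q @ q) * q   # subtract the projection (X.Y_i) Y_i / ||Y_i||^2
            ortho.append(y)
        return ortho

    X1 = np.array([-1.0, 1.0, 0.0])
    X2 = np.array([-1.0, 0.0, 1.0])
    Y1, Y2 = gram_schmidt([X1, X2])
    print(Y1, Y2)    # (-1, 1, 0) and (-1/2, -1/2, 1)
    print(Y1 @ Y2)   # 0 (up to rounding)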
Theorem: If A is a real symmetric matrix, then (i) all the eigenvalues of A are real, (ii) eigenvectors of A corresponding to different eigenvalues are mutually orthogonal, and (iii) there exists an orthonormal matrix P such that P^{-1}AP = P^T AP is a diagonal matrix.
The above theorem tells us some useful properties of a real symmetric matrix. First, all the eigenvalues of A are real. Further, A is diagonalizable, and it is diagonalized by an orthonormal matrix P whose column vectors are orthonormal eigenvectors of A. Also, note that the eigenvectors of A corresponding to different eigenvalues are mutually orthogonal. Thus, to find P, first we need the eigenvalues of the matrix. If all the eigenvalues are different, then the corresponding eigenvectors are mutually orthogonal, and we only need to normalize them. In case of repeated eigenvalues, we can use the Gram-Schmidt process to generate mutually orthogonal eigenvectors, as explained in the following example.
Ex. Determine an orthonormal matrix P, and use it to diagonalize A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}.
Sol. The eigenvalues of A are −1, −1 and 2. The eigenvectors corresponding to the repeated eigenvalue −1 are X1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} and X2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, while the eigenvector corresponding to the eigenvalue 2 is X3 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. We find that X1.X3 = 0 and X2.X3 = 0. So X1 and X2 are orthogonal to X3, as expected. But X1.X2 ≠ 0. Using the Gram-Schmidt process, we find the orthogonal vectors:
Y1 = X1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},
Y2 = X2 − (X2.Y1)Y1/||Y1||^2 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} − \begin{pmatrix} -1/2 \\ 1/2 \\ 0 \end{pmatrix} = \begin{pmatrix} -1/2 \\ -1/2 \\ 1 \end{pmatrix}.
Also, Y1.X3 = 0 and Y2.X3 = 0. Thus, {Y1, Y2, X3} is an orthogonal set, and {Y1/||Y1||, Y2/||Y2||, X3/||X3||} is an orthonormal set. It follows that the orthonormal matrix P that diagonalizes the given matrix A is given by
P = \begin{pmatrix} -1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{6} & 1/\sqrt{3} \\ 0 & 2/\sqrt{6} & 1/\sqrt{3} \end{pmatrix}.
Then P^{-1}AP = P^T AP = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
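A quick numerical check of this diagonalization (a sketch, assuming NumPy) is:

    import numpy as np

    A = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0]])
    P = np.array([[-1/np.sqrt(2), -1/np.sqrt(6), 1/np.sqrt(3)],
                  [ 1/np.sqrt(2), -1/np.sqrt(6), 1/np.sqrt(3)],
                  [ 0.0,           2/np.sqrt(6), 1/np.sqrt(3)]])

    print(np.round(P.T @ P, 10))       # identity matrix, so P is orthonormal
    print(np.round(P.T @ A @ P, 10))   # diag(-1, -1, 2)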
1.3 Quadratic form
If A is a symmetric matrix of order n × n, and X is a column vector of n variables, then X^T AX is a homogeneous expression of second degree in the n variables, called a quadratic form. For example, if
A = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 3 & 1 \\ -1 & 1 & 4 \end{pmatrix} and X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, then
X^T AX = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 1 & 2 & -1 \\ 2 & 3 & 1 \\ -1 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x_1^2 + 3x_2^2 + 4x_3^2 + 4x_1x_2 - 2x_1x_3 + 2x_2x_3
is a quadratic form, which is a homogeneous expression of second degree in x1, x2 and x3.
If P is an orthonormal matrix that diagonalizes A, that is, P^T AP = D = diag(λ1, λ2, ..., λn), then the orthogonal transformation X = PY, where Y = (y1, y2, ..., yn)^T, gives
X^T AX = (PY)^T A(PY) = Y^T (P^T AP) Y = Y^T DY = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \dots + \lambda_n y_n^2.
Notice that the transformed quadratic form Y^T DY does not carry the cross product terms. It is called the canonical form of the quadratic form X^T AX.
The number of nonzero terms in the canonical form (the number of nonzero eigenvalues of A) is called the rank (r); the number of positive terms in the canonical form (the number of positive eigenvalues of A) is called the index (i); and the difference of the numbers of positive and negative terms in the canonical form (the difference of the numbers of positive and negative eigenvalues of A) is called the signature (s) of the quadratic form X^T AX. Further, the quadratic form X^T AX is said to be (i) positive definite if all the eigenvalues of A are positive; (ii) negative definite if all the eigenvalues of A are negative; (iii) positive semi-definite if the smallest eigenvalue of A is 0; (iv) negative semi-definite if the largest eigenvalue of A is 0; (v) indefinite if A has both positive and negative eigenvalues.
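These quantities are easy to compute once the eigenvalues are known. The sketch below is illustrative only (the helper name classify_quadratic_form and the tolerance tol are our own choices); numpy.linalg.eigvalsh returns the real eigenvalues of a symmetric matrix.

    import numpy as np

    def classify_quadratic_form(A, tol=1e-10):
        """Rank, index, signature and nature of the quadratic form X^T A X (A symmetric)."""
        vals = np.linalg.eigvalsh(A)
        pos = int(np.sum(vals > tol))
        neg = int(np.sum(vals < -tol))
        rank, index, signature = pos + neg, pos, pos - neg
        if neg == 0 and pos == len(vals):
            nature = "positive definite"
        elif pos == 0 and neg == len(vals):
            nature = "negative definite"
        elif neg == 0:
            nature = "positive semi-definite"
        elif pos == 0:
            nature = "negative semi-definite"
        else:
            nature = "indefinite"
        return rank, index, signature, nature

    # the quadratic form x1^2 + 3x2^2 + 4x3^2 + 4x1x2 - 2x1x3 + 2x2x3 from above
    A = np.array([[1.0, 2.0, -1.0],
                  [2.0, 3.0, 1.0],
                  [-1.0, 1.0, 4.0]])
    print(classify_quadratic_form(A))   # indefinite: det(A) = -12 < 0 while trace(A) = 8 > 0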
Ex. Transform the quadratic form Q = x_1^2 − x_3^2 − 4x_1x_2 + 4x_2x_3 to its canonical form, and hence find its rank, index and signature.
Sol. The given quadratic form can be expressed in matrix notation as
Q = X^T AX = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} 1 & -2 & 0 \\ -2 & 0 & 2 \\ 0 & 2 & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.
The eigenvalues of the matrix A are 0, −3, 3, and the corresponding orthonormal eigenvectors are
\begin{pmatrix} 2/3 \\ 1/3 \\ 2/3 \end{pmatrix}, \begin{pmatrix} -1/3 \\ -2/3 \\ 2/3 \end{pmatrix}, \begin{pmatrix} -2/3 \\ 2/3 \\ 1/3 \end{pmatrix},
respectively. (Verify!)
So the orthogonal transformation
X = PY = \begin{pmatrix} 2/3 & -1/3 & -2/3 \\ 1/3 & -2/3 & 2/3 \\ 2/3 & 2/3 & 1/3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
transforms Q to its canonical form
Y^T DY = \begin{pmatrix} y_1 & y_2 & y_3 \end{pmatrix} \begin{pmatrix} 0 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = -3y_2^2 + 3y_3^2.
Hence the rank is r = 2, the index is i = 1, and the signature is s = 1 − 1 = 0.
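A numerical check of this reduction (a sketch, assuming NumPy) is:

    import numpy as np

    A = np.array([[1.0, -2.0, 0.0],
                  [-2.0, 0.0, 2.0],
                  [0.0, 2.0, -1.0]])
    P = np.array([[2/3, -1/3, -2/3],
                  [1/3, -2/3,  2/3],
                  [2/3,  2/3,  1/3]])

    print(np.round(P.T @ P, 10))       # identity: the columns of P are orthonormal
    print(np.round(P.T @ A @ P, 10))   # diag(0, -3, 3), i.e. Q = -3*y2**2 + 3*y3**2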
1.4 Row echelon form and rank of a matrix
A matrix is said to be in row echelon form (REF) if (i) all zero rows, if any, appear below the nonzero rows, and (ii) the leading (first nonzero) entry of every nonzero row lies to the right of the leading entry of the row above it, so that all entries below a leading entry are 0. (Dictionary meaning of echelon: a formation of troops in which each unit is positioned successively to the left or right of the rear unit to form an oblique or steplike line.) In addition, if each leading entry is 1 and all the entries above the leading entries are 0, then the matrix is said to be in reduced row echelon form (RREF).
The number of nonzero rows in the REF or RREF of a matrix is called its rank.
Ex. The matrices
\begin{pmatrix} 1 & 1 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 3 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}
are all in REF, with ranks 3, 2 and 2 respectively.
Ex. The matrices
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 3 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 5 \\ 0 & 0 & 0 \end{pmatrix}
are all in RREF, with ranks 3, 2 and 2 respectively.
The following example illustrates how we find a REF, the rank and the RREF of a given matrix.
Ex. Find a REF, the rank and the RREF of the matrix
A = \begin{pmatrix} 2 & 4 & 5 \\ 1 & 2 & 3 \\ 3 & 5 & 6 \end{pmatrix}.
Sol.
Applying R1 ↔ R2 , we obtain
A ∼ \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{pmatrix}.
Applying R2 → R2 − 2R1 and R3 → R3 − 3R1, and then R2 ↔ R3, we obtain
A ∼ \begin{pmatrix} 1 & 2 & 3 \\ 0 & -1 & -3 \\ 0 & 0 & -1 \end{pmatrix},
which is a REF of A. It has 3 nonzero rows, so the rank of A is 3. Continuing, we apply R2 → −R2 and R3 → −R3 to make the leading entries 1, and then clear the entries above the leading entries (R2 → R2 − 3R3, R1 → R1 − 3R3, R1 → R1 − 2R2) to obtain
A ∼ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
the RREF of A.
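Such reductions can be checked with SymPy and NumPy (a sketch, assuming both libraries are installed; Matrix.rref returns the RREF together with the pivot column indices):

    import numpy as np
    import sympy as sp

    A = sp.Matrix([[2, 4, 5],
                   [1, 2, 3],
                   [3, 5, 6]])
    rref, pivots = A.rref()
    print(rref)          # the 3x3 identity matrix
    print(len(pivots))   # 3, the rank of A

    # the numerical rank agrees
    print(np.linalg.matrix_rank(np.array([[2, 4, 5], [1, 2, 3], [3, 5, 6]], dtype=float)))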
Useful Tip: From the above example, one may notice that for getting the RREF of a matrix, we make use of the first row to make zeros in the first column, the second row to make zeros in the second column, and so on.
Note: The REF of a matrix is not unique, but the RREF is unique.
Also, note that the rank of a real symmetric (more generally, diagonalizable) matrix is equal to the number of its nonzero eigenvalues, counted with multiplicity. (This may fail for other matrices: the matrix with eigenvalues 0, 0, 0 considered earlier has rank 2.) In general, the rank of a matrix is equal to the number of linearly independent rows (or columns) in the matrix. So the rank of a matrix can be found in different ways.
1.5 Cayley-Hamilton theorem
Every square matrix satisfies its characteristic equation. In other words, the Cayley-Hamilton theorem states that substituting a square matrix A for λ in its characteristic polynomial, p(λ) = |A − λIn|, results in the zero matrix: p(A) = 0.
Ex. Verify the Cayley-Hamilton theorem for A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.
Sol. The characteristic polynomial of A is
p(λ) = |A − λI2| = \lambda^2 - 5\lambda - 2.
⟹ p(A) = A^2 − 5A − 2I_2 = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix} − \begin{pmatrix} 5 & 10 \\ 15 & 20 \end{pmatrix} − \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0.
Thus, the Cayley-Hamilton theorem is verified. To find the inverse of A, multiplying the equation
A^2 − 5A − 2I_2 = 0
by A^{-1}, we obtain
A − 5I_2 − 2A^{-1} = 0.
It follows that
A^{-1} = \frac{1}{2}A − \frac{5}{2}I_2 = \begin{pmatrix} -2 & 1 \\ 3/2 & -1/2 \end{pmatrix}.
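Both the verification and the inverse formula can be checked numerically (a minimal NumPy sketch):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
    I2 = np.eye(2)

    # Cayley-Hamilton: A^2 - 5A - 2I = 0
    print(A @ A - 5 * A - 2 * I2)

    # inverse obtained from the theorem: A^(-1) = (1/2)A - (5/2)I
    A_inv = 0.5 * A - 2.5 * I2
    assert np.allclose(A_inv, np.linalg.inv(A))
    print(A_inv)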
Ex. Find A^{78}, where A = \begin{pmatrix} 2 & -1 \\ 2 & 5 \end{pmatrix}.
Sol. Here the characteristic polynomial of A is \lambda^2 - 7\lambda + 12, and the eigenvalues are 3 and 4. By the division algorithm, we can write
\lambda^{78} = (\lambda^2 - 7\lambda + 12)\,q(\lambda) + a\lambda + b,  (1.1)
where q(λ) is the quotient and aλ + b is the remainder. Putting λ = 3 and λ = 4 in (1.1), we obtain
3^{78} = 3a + b,
4^{78} = 4a + b.
Solving these two equations, we obtain a = 4^{78} − 3^{78} and b = 4 × 3^{78} − 3 × 4^{78}. Also, by the Cayley-Hamilton theorem, A^2 − 7A + 12I_2 = 0. So equation (1.1) implies
A^{78} = aA + bI_2 = \begin{pmatrix} 2a + b & -a \\ 2a & 5a + b \end{pmatrix}.
Remark: In case the eigenvalues are equal, we also use the derivative of equation (1.1) for determining the constants in the remainder, since if f(x) = 0 has a repeated root α, then f(α) = 0 and f′(α) = 0.
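Because 3^78 and 4^78 are huge, an exact check is best done with Python integers and SymPy rather than floating point. The sketch below (illustrative, assuming SymPy is available) confirms that aA + bI2 equals A^78 computed directly.

    import sympy as sp

    A = sp.Matrix([[2, -1],
                   [2, 5]])
    a = 4**78 - 3**78
    b = 4 * 3**78 - 3 * 4**78

    B = a * A + b * sp.eye(2)   # A^78 according to the division-algorithm argument
    print(B - A**78)            # the zero matrix, confirming the result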
1.6 Applications of the Eigenvalues and Eigenvectors
Some Applications of the Eigenvalues and Eigenvectors are given below:
(Source: http://www.soest.hawaii.edu)
1. Communication systems: Eigenvalues were used by Claude Shannon to determine the the-
oretical limit to how much information can be transmitted through a communication medium like
your telephone line or through the air. This is done by calculating the eigenvectors and eigenval-
ues of the communication channel (expressed as a matrix), and then waterfilling on the eigenvalues.
The eigenvalues are then, in essence, the gains of the fundamental modes of the channel, which
themselves are captured by the eigenvectors.
2. Google’s PageRank: Google’s extraordinary success as a search engine was due to their
clever use of eigenvalues and eigenvectors. From the time it was introduced in 1998, Google’s
methods for delivering the most relevant results for our search queries have evolved in many ways.
See the link Google’s PageRank for more details.
3. Designing bridges: The natural frequency of the bridge is the eigenvalue of smallest mag-
nitude of a system that models the bridge. The engineers exploit this knowledge to ensure the
stability of their constructions.
4. Designing car stereo system: Eigenvalue analysis is also used in the design of the car stereo
systems, where it helps to reproduce the vibration of the car due to the music.
5. Electrical Engineering: The application of eigenvalues and eigenvectors is useful for decou-
pling three-phase systems through symmetrical component transformation.
6. Underground oil search: Oil companies frequently use eigenvalue analysis to explore land for
oil. Oil, dirt, and other substances all give rise to linear systems which have different eigenvalues,
so eigenvalue analysis can give a good indication of where oil reserves are located. Oil companies
place probes around a site to pick up the waves that result from a huge truck used to vibrate the
ground. The waves are changed as they pass through the different substances in the ground. The
analysis of these waves directs the oil companies to possible drilling sites.
Eigenvalues are not only used to explain natural occurrences, but also to discover new and
better designs for the future. Some of the results are quite surprising. If you were asked to build
the strongest column that you could to support the weight of a roof using only a specified amount
of material, what shape would that column take? Most of us would build a cylinder like most other
columns that we have seen. However, Steve Cox of Rice University and Michael Overton of New
York University proved, based on the work of J. Keller and I. Tadjbakhsh, that the column would
be stronger if it were largest at the top, middle, and bottom. At certain points between the ends, the column could be smaller because the column would not naturally buckle there anyway.
Does that surprise you? This new design was discovered through the study of the eigenvalues of
the system involving the column and the weight from above. Note that this column would not be
the strongest design if any significant pressure came from the side, but when a column supports a
roof, the vast majority of the pressure comes directly from above.
Very roughly, the eigenvalues of a linear mapping are a measure of the distortion induced by the transformation, and the eigenvectors tell you how that distortion is oriented. It is precisely this rough picture which makes PCA (Principal Component Analysis, a statistical procedure) very useful.