Linear Algebra
Kjell Elfström
Copyright © Kjell Elfström
First edition, December 2020
Contents
1 Matrices
  1.1 Definition
  1.2 Addition and Multiplication of Matrices
  1.3 The Transpose of a Matrix
  1.4 The Inverse of a Matrix
  Exercises
2 Linear Spaces
  2.1 Definition
  2.2 Bases
  2.3 More on Matrices
  2.4 Direct Sums
  2.5 The Rank-Nullity Theorem
  Exercises
3 Inner Product Spaces
  3.1 Definition
  3.2 Orthonormal Bases
  3.3 Orthogonal Complement
  3.4 The Rank of a Matrix
  3.5 The Method of Least Squares
  Exercises
4 Determinants
  4.1 Multilinear Forms
  4.2 Definition of Determinants
  4.3 Properties of Determinants
  Exercises
5 Linear Transformations
  5.1 Matrix Representations of Linear Transformations
  5.2 Change of Basis
  5.3 Projections and Reflections
  5.4 Isometries
  Exercises
6 Eigenvalues and Eigenvectors
  6.1 Definition
  6.2 Diagonalisability
1 Matrices
1.1 Definition
A matrix is a rectangular array of real numbers:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$
The size of a matrix with m rows and n columns is m × n, which is read as ‘m by n’.
The number aik in row i and column k is called an entry. We shall also use the more
compact notation A = [aik ]m×n , A = [aik ], A = [Aik ]m×n or A = [Aik ]. If m = n, we say
that A is a square matrix of order n, and then the entries aii are said to form the main
diagonal of A. By a column matrix we shall mean an m × 1 matrix with a single column,
and a row matrix is a 1 × n matrix having a single row.
Example 1.1.
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 2 & 5 \\ 3 & 7 \end{bmatrix}, \quad C = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{bmatrix}.$$
The sizes of these matrices are 2 × 3, 3 × 2, 3 × 1 and 3 × 3, respectively. D is a square
matrix of order 3, and its main diagonal comprises the entries 1, 3 and 5.
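For readers who want to experiment, matrices are conveniently represented as arrays in, for instance, NumPy; a minimal sketch mirroring the matrices of Example 1.1:

```python
import numpy as np

# The matrices of Example 1.1.
A = np.array([[1, 2, 3],
              [1, 1, 1]])
B = np.array([[1, 2],
              [2, 5],
              [3, 7]])
C = np.array([[1],
              [2],
              [3]])
D = np.array([[1, 2, 3],
              [2, 3, 4],
              [3, 4, 5]])

print(A.shape, B.shape, C.shape, D.shape)   # (2, 3) (3, 2) (3, 1) (3, 3)
print(np.diag(D))                           # main diagonal of D: [1 3 5]
```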
When the size of the zero matrix is clear from context, we shall simply denote it by 0.
Example 1.3.
$$\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 1+1 & 2+1 & 3+1 \\ 0+2 & 1+1 & 1+2 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 4 \\ 2 & 2 & 3 \end{bmatrix},$$
$$0 + \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 0+1 & 0+2 \\ 0+2 & 0+3 \\ 0+3 & 0+4 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix},$$
whereas
$$\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \end{bmatrix} + \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix}$$
is not defined.
Definition 1.4. Let A = [aik ]m×n be a matrix and s a real number. We define the
product of s and A as
$$sA = [sa_{ik}]_{m \times n} = \begin{bmatrix} sa_{11} & sa_{12} & \cdots & sa_{1n} \\ sa_{21} & sa_{22} & \cdots & sa_{2n} \\ \vdots & \vdots & & \vdots \\ sa_{m1} & sa_{m2} & \cdots & sa_{mn} \end{bmatrix}.$$
Example 1.5.
$$3 \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 6 & 9 \\ 0 & 3 & 3 \end{bmatrix}.$$
Theorem 1.6. Below, s and t are real numbers, and A, B and C are matrices of the
same size.
(i) A + B = B + A.
(ii) A + (B + C) = (A + B) + C.
(iii) 0 + A = A.
(iv) A + (−A) = 0.
(v) s(A + B) = sA + sB.
(vi) (st)A = s(tA).
Then
$$AX = Y \;\Leftrightarrow\; \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix} \;\Leftrightarrow\; \begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = y_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = y_2 \\ \quad\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = y_m. \end{cases}$$
Thus we have achieved the goal we set for ourselves. We shall now extend the definition
to n × p matrices B. Denote by Bk the kth column of B. The matrix AB is the m × p
matrix comprising the columns ABk , k = 1, 2, . . . , p.
Definition 1.7. Let A = [aik ]m×n and B = [bik ]n×p be matrices of size m × n and n × p,
respectively. The product AB is the m × p matrix C = [cik ]m×p for which
$$c_{ik} = a_{i1}b_{1k} + a_{i2}b_{2k} + \cdots + a_{in}b_{nk} = \sum_{j=1}^{n} a_{ij}b_{jk}, \qquad 1 \le i \le m,\ 1 \le k \le p.$$
Hence, the entry in position i, k of the product AB is the sum of the products of the
corresponding entries of row i of A and column k of B.
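As an illustration of Definition 1.7 (a sketch, not an efficient algorithm), each entry of the product can be computed exactly as the sum described above:

```python
import numpy as np

def matmul_by_definition(A, B):
    """Compute AB entry by entry as in Definition 1.7."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "AB is defined only when A has as many columns as B has rows"
    C = np.zeros((m, p))
    for i in range(m):
        for k in range(p):
            # c_ik = a_i1*b_1k + a_i2*b_2k + ... + a_in*b_nk
            C[i, k] = sum(A[i, j] * B[j, k] for j in range(n))
    return C

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[1, 2], [1, 2], [1, 2]])
print(matmul_by_definition(A, B))                      # matches Example 1.8
print(np.allclose(matmul_by_definition(A, B), A @ B))  # True
```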
Example 1.8.
$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1\cdot1+2\cdot1+3\cdot1 & 1\cdot2+2\cdot2+3\cdot2 \\ 4\cdot1+5\cdot1+6\cdot1 & 4\cdot2+5\cdot2+6\cdot2 \\ 7\cdot1+8\cdot1+9\cdot1 & 7\cdot2+8\cdot2+9\cdot2 \end{bmatrix} = \begin{bmatrix} 6 & 12 \\ 15 & 30 \\ 24 & 48 \end{bmatrix},$$
whereas
$$\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$
is undefined. As we see, BA need not be defined even though AB is.
$$\begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 1\cdot2+1\cdot2 & 1\cdot(-1)+1\cdot(-1) \\ 2\cdot2+2\cdot2 & 2\cdot(-1)+2\cdot(-1) \end{bmatrix} = \begin{bmatrix} 4 & -2 \\ 8 & -4 \end{bmatrix},$$
$$\begin{bmatrix} 2 & -1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 2\cdot1+(-1)\cdot2 & 2\cdot1+(-1)\cdot2 \\ 2\cdot1+(-1)\cdot2 & 2\cdot1+(-1)\cdot2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0.$$
Here we see that the commutative law and the cancellation law fail for matrix multiplication.
Theorem 1.9. In the identities below, s ∈ R and A, B, C are matrices. All members
of an identity are defined whenever any member is defined.
(i) A(B + C) = AB + AC.
(ii) (A + B)C = AC + BC.
(iii) A(BC) = (AB)C.
(iv) s(AB) = (sA)B = A(sB).
Proof. (i) In order that any member, hence both members, be defined, it is necessary
and sufficient that the sizes of A, B and C are m × n, n × p and n × p, respectively. We
then have
$$(A(B + C))_{ik} = \sum_{\nu=1}^{n} A_{i\nu}(B + C)_{\nu k} = \sum_{\nu=1}^{n} A_{i\nu}(B_{\nu k} + C_{\nu k}) = \sum_{\nu=1}^{n} A_{i\nu}B_{\nu k} + \sum_{\nu=1}^{n} A_{i\nu}C_{\nu k} = (AB)_{ik} + (AC)_{ik} = (AB + AC)_{ik}$$
for all i and k such that 1 ≤ i ≤ m and 1 ≤ k ≤ p. This proves the statement.
The proof of (ii) is similar and is left to the reader.
(iii) We may here assume that the sizes of A, B and C are m × n, n × p and p × q,
respectively. Then
$$(A(BC))_{ik} = \sum_{\nu=1}^{n} A_{i\nu}(BC)_{\nu k} = \sum_{\nu=1}^{n} A_{i\nu} \sum_{\mu=1}^{p} B_{\nu\mu}C_{\mu k} = \sum_{\nu=1}^{n} \sum_{\mu=1}^{p} A_{i\nu}B_{\nu\mu}C_{\mu k} = \sum_{\mu=1}^{p} \left( \sum_{\nu=1}^{n} A_{i\nu}B_{\nu\mu} \right) C_{\mu k} = \sum_{\mu=1}^{p} (AB)_{i\mu}C_{\mu k} = ((AB)C)_{ik}.$$
1.3 The Transpose of a Matrix
Hence, the transpose of A is the matrix whose columns are the rows of A.
Theorem 1.13. Below, s ∈ R and A and B are matrices. Both sides of an identity are
defined whenever any side is defined.
(i) (A + B)t = At + B t .
(ii) (sA)t = sAt .
(iii) (AB)t = B t At .
Proof. The first two statements are simple consequences of the definition. To prove the
last statement, suppose that the sizes of A and B are m × n and n × p, respectively.
Then
$$((AB)^t)_{ik} = (AB)_{ki} = \sum_{j=1}^{n} A_{kj}B_{ji} = \sum_{j=1}^{n} (A^t)_{jk}(B^t)_{ij} = (B^tA^t)_{ik},$$
which proves the statement.
Theorem 1.19. Let A be a square matrix of order n. Then A is invertible if and only
if, for every n × 1 matrix Y , there is a unique n × 1 matrix X such that AX = Y .
Proof. First assume that A is invertible with inverse A−1 . If AX = Y , then we have
X = IX = A−1 AX = A−1 Y , which shows that the equation AX = Y can have no more
than one solution X. Since AA−1 Y = IY = Y , we see that X = A−1 Y is, in fact, a
solution. This proves the implication to the right.
To show the converse, we assume that, for every n × 1 matrix Y , there is a unique
n × 1 matrix X such that AX = Y . Let Ik be the kth column of the unit matrix I.
By assumption, there is an n × 1 matrix Bk such that ABk = Ik . Let B be the matrix
comprising the columns Bk , k = 1, 2, . . . , n. Then AB = I. It remains to show that
BA = I. Let C = BA. Then we have AC = A(BA) = (AB)A = IA = A = AI. Hence
ACk = AIk for k = 1, 2, . . . , n, and it follows from the uniqueness that Ck = Ik for
k = 1, 2, . . . , n. From this we conclude that BA = C = I.
Proof. If AX = BX for all n × 1 matrices X, then AIk = BIk for all columns Ik of the
n × n unit matrix I. Hence, A = AI = BI = B. The reverse implication is immediate.
Theorems 1.19 and 1.20 can be used to devise a method for finding the inverse when it
exists or disclosing its non-existence. If we find that the system AX = Y has a unique
solution X = BY , then A is invertible by Theorem 1.19. Hence also X = A−1 Y is a
solution. It follows that BY = A−1 Y for all n × 1 matrices Y and so, by Theorem 1.20,
B = A−1. If, instead, we find that the system does not have a unique solution for some right-hand side Y, then A is not invertible by Theorem 1.19.
is invertible. We have
$$AX = Y \;\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 3x_3 = y_1 \\ x_1 + 3x_2 + 5x_3 = y_2 \\ x_1 + 4x_2 + 6x_3 = y_3 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 3x_3 = y_1 \\ x_2 + 2x_3 = -y_1 + y_2 \\ 2x_2 + 3x_3 = -y_1 + y_3 \end{cases}$$
$$\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 3x_3 = y_1 \\ x_2 + 2x_3 = -y_1 + y_2 \\ -x_3 = y_1 - 2y_2 + y_3 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 = 2y_1 - y_3 \\ x_2 = y_1 - 3y_2 + 2y_3 \\ x_3 = -y_1 + 2y_2 - y_3. \end{cases}$$
Of course, it suffices to keep track of the coefficients of the xi and the yi . Thus the above
computations can be written as
$$AX = Y \;\Leftrightarrow\; \left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 1 & 3 & 5 & 0 & 1 & 0 \\ 1 & 4 & 6 & 0 & 0 & 1 \end{array}\right] \;\Leftrightarrow\; \left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 2 & 3 & -1 & 0 & 1 \end{array}\right]$$
$$\Leftrightarrow\; \left[\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -1 & 1 & 0 \\ 0 & 0 & -1 & 1 & -2 & 1 \end{array}\right] \;\Leftrightarrow\; \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 2 & 0 & -1 \\ 0 & 1 & 0 & 1 & -3 & 2 \\ 0 & 0 & 1 & -1 & 2 & -1 \end{array}\right].$$
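The same Gauss–Jordan procedure is easy to carry out numerically; a minimal sketch that row reduces [A | I] and returns the right half, applied to the matrix above:

```python
import numpy as np

def inverse_by_gauss_jordan(A):
    """Row reduce [A | I]; if A is invertible, the right half becomes its inverse."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))   # pick a usable pivot row
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("A is not invertible")
        M[[col, pivot]] = M[[pivot, col]]               # swap it into place
        M[col] /= M[col, col]                           # scale the pivot to 1
        for row in range(n):                            # clear the rest of the column
            if row != col:
                M[row] -= M[row, col] * M[col]
    return M[:, n:]

A = np.array([[1, 2, 3], [1, 3, 5], [1, 4, 6]])
print(inverse_by_gauss_jordan(A))   # [[ 2.  0. -1.], [ 1. -3.  2.], [-1.  2. -1.]]
```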
We see that the system has an infinite number of solutions when y1 − 2y2 + y3 = 0 and
no solutions otherwise. Hence A is not invertible.
Theorem 1.23. Let A and B be invertible square matrices. Then AB and At are
invertible, (AB)−1 = B −1 A−1 and (At )−1 = (A−1 )t .
Exercises
1.1. Let
$$A = \begin{bmatrix} 1 & 2 & 3 \\ -1 & 1 & 2 \\ 1 & 1 & -1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 1 \\ 2 & 1 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 2 & 1 \\ 1 & 1 & 3 \end{bmatrix}.$$
Perform the following operations or explain why they are not defined.
(a) A + B, (b) A + C, (c) AB, (d) BA,
(e) AC, (f) CA, (g) C(A + B), (h) C(A + 2B).
1.2. Let
$$A = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & 0 \\ 1 & 2 \end{bmatrix}.$$
Compute A2 − B 2 and (A + B)(A − B) and explain what you observe.
1.3. Let
$$A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}.$$
Find all 2 × 2 matrices B such that
AB = BA = 0.
1.4. (a) Let A and B be n × n matrices such that AB = BA. Show that the binomial theorem
$$(A + B)^n = \sum_{k=0}^{n} \binom{n}{k} A^{n-k} B^k$$
holds.
1.5. Let A, B and C be the matrices in Exercise 1.1. Compute At B t and (At + B t )C t .
1.6. Show that At A and AAt are symmetric matrices for all matrices A.
1.7. A matrix A is said to be skew-symmetric if At = −A.
(a) Let A be a square matrix. Show that A + At is symmetric and that A − At
is skew-symmetric.
(b) Show that, for every square matrix A, there exist unique matrices B and C
such that A = B + C, B is symmetric and C is skew-symmetric.
1.8. Find the inverse of each matrix below or explain why it does not exist.
$$\text{(a)}\ \begin{bmatrix} 1 & 2 & 3 \\ 1 & 1 & 2 \\ 1 & 1 & 1 \end{bmatrix}, \qquad \text{(b)}\ \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \\ 1 & 1 & 4 \end{bmatrix}, \qquad \text{(c)}\ \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \\ 2 & 1 & 4 \end{bmatrix}.$$
1.9. Compute the inverses of A, At and A2 where
$$A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 1 & 2 \\ 2 & 1 & 2 \end{bmatrix}.$$
I + A + A2 + · · · + Ak
2 Linear Spaces
2.1 Definition
Definition 2.1. Let V be a non-empty set, and let there be defined two operations
V × V → V and R × V → V called addition and scalar multiplication, respectively.
We denote the value of the addition at (u, v) by u + v and the value of the scalar
multiplication at (s, u) by su. We call an element of V a vector and real numbers are
usually called scalars. We say that the set V , together with the addition and scalar
multiplication, forms a linear space provided that the following conditions are met:
The conditions are called the axioms for linear spaces. We call 0 the zero vector,
and −u is called the additive inverse of u. The linear space is in fact a triple (V, +, ·)
where + and · denote the addition and scalar multiplication, respectively. When it is
clear from context which operations + and · are intended, we shall, by abuse of language,
use V to denote also the linear space.
The following statements follow from the definition.
Theorem 2.2.
(i) The vector 0 is uniquely determined.
(ii) For every u ∈ V , the vector −u is uniquely determined.
(iii) 0u = 0 for all u ∈ V .
(iv) s0 = 0 for all s ∈ R.
(v) (−1)u = −u for all u ∈ V .
Proof. (i) Suppose that 01 and 02 both satisfy Axiom (iii) in the definition. Then it
follows from that axiom and Axiom (i) that 01 = 01 + 02 = 02 .
(ii) Suppose that u + v = 0 and u + w = 0. Then it follows from Axioms (i), (ii)
and (iii) that v = v + 0 = v + (u + w) = (v + u) + w = 0 + w = w + 0 = w.
(iii) 0u = 0u + 0 = 0u + 0u + (−(0u)) = (0 + 0)u + (−(0u)) = 0u + (−(0u)) = 0.
(iv) s0 = s0 + 0 = s0 + s0 + (−(s0)) = s(0 + 0) + (−(s0)) = s0 + (−(s0)) = 0.
Note that it follows from the proof of Theorem 2.11 that the zero vector of a linear
space V is also the zero vector of its subspaces and that the inverse of a vector in a
subspace of V is the inverse of that vector in V .
Example 2.13. Let V be any linear space. If 0 is the zero vector of V , then U = {0} is a
subset of V . In fact, U is a subspace by Corollary 2.12. Firstly, U is non-empty. Secondly,
if u and v belong to U , then both are the zero vector. Hence, su + tv = s0 + t0 = 0 ∈ U
for all real numbers s and t. We call U the zero subspace of V .
0 = a1 x 1 + a2 x 2 + · · · + a n x n ,
0 = a1 y 1 + a2 y 2 + · · · + a n y n .
Hence
0 = s0 + t0 = s(a1 x1 + a2 x2 + · · · + an xn ) + t(a1 y1 + a2 y2 + · · · + an yn )
= a1 (sx1 + ty1 ) + a2 (sx2 + ty2 ) + · · · + an (sxn + tyn ),
Definition 2.17. Let A be an m×n matrix. The kernel ker A of A is the set of solutions
in Rn of the equation Ax = 0. The image im A of A is the set of vectors y ∈ Rm for
which the equation Ax = y has a solution.
Example 2.19. The plane a1 x1 + a2 x2 + a3 x3 = 0 through the origin can be regarded as the kernel of the matrix [a1 a2 a3]. This plane also has a parametric equation
$$\begin{cases} x_1 = u_1t_1 + v_1t_2 \\ x_2 = u_2t_1 + v_2t_2 \\ x_3 = u_3t_1 + v_3t_2 \end{cases} \;\Leftrightarrow\; \begin{bmatrix} u_1 & v_1 \\ u_2 & v_2 \\ u_3 & v_3 \end{bmatrix} \begin{bmatrix} t_1 \\ t_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.$$
If the two planes are not identical, their intersection is a line through the origin. Since
lines have parametric equations, this line can also be regarded as the image of a matrix.
2.2 Bases
Definition 2.20. Let u and u1 , u2 , . . . , uk be vectors in a linear space V . We say that
u is a linear combination of the ui if there exist real numbers s1 , s2 , . . . , sk such that
u = s 1 u1 + s 2 u2 + · · · + s k uk .
Definition 2.22. Let u1 , u2 , . . . , uk be vectors in a linear space V . We call the set of all
linear combinations of those vectors the span of them and denote it by [u1 , u2 , . . . , uk ].
and hence, by Corollary 2.12, U is a subspace of V . The fact that the vectors span U
follows directly from the definition.
Example 2.24. The plane in Example 2.19 is the span of the two vectors u = (u1 , u2 , u3 )
and v = (v1 , v2 , v3 ). The line there is the span of a single vector w = (w1 , w2 , w3 ).
Definition 2.26. We say that the vectors u1 , u2 , . . . , uk in a linear space V are linearly
dependent if there exist real numbers s1 , s2 , . . . , sk , not all zero, such that
s1 u1 + s2 u2 + · · · + sk uk = 0.
If the vectors are not linearly dependent, we say that they are linearly independent.
s 1 u1 + s 2 u2 + · · · + s k uk = 0 ⇒ s1 = s2 = · · · = sk = 0.
Also note that a single vector u is linearly dependent if and only if it is the zero vector.
Example 2.27. Here we want to find out if the vectors u1 = (1, 1, 1), u2 = (1, 3, 1),
u3 = (1, 4, 3) in R3 are linearly dependent. We solve the equation s1 u1 +s2 u2 +s3 u3 = 0:
$$\begin{cases} s_1 + s_2 + s_3 = 0 \\ s_1 + 3s_2 + 4s_3 = 0 \\ s_1 + s_2 + 3s_3 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} s_1 + s_2 + s_3 = 0 \\ 2s_2 + 3s_3 = 0 \\ 2s_3 = 0 \end{cases} \;\Leftrightarrow\; s_1 = s_2 = s_3 = 0.$$
Example 2.28. Consider the vectors u1 = (1, 1, 1), u2 = (1, 3, 5), u3 = (1, 4, 7) in R3 .
This time the equation s1 u1 + s2 u2 + s3 u3 = 0 is equivalent to
$$\begin{cases} s_1 + s_2 + s_3 = 0 \\ s_1 + 3s_2 + 4s_3 = 0 \\ s_1 + 5s_2 + 7s_3 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} s_1 + s_2 + s_3 = 0 \\ 2s_2 + 3s_3 = 0 \\ 4s_2 + 6s_3 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} s_1 + s_2 + s_3 = 0 \\ 2s_2 + 3s_3 = 0. \end{cases}$$
Since this equation has non-trivial solutions, the vectors u1 , u2 , u3 are linearly dependent.
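Such checks can also be done numerically: the vectors are linearly independent exactly when the rank of the matrix having them as rows equals the number of vectors. A small sketch covering both examples:

```python
import numpy as np

def linearly_independent(vectors):
    """True when the given vectors are linearly independent."""
    M = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(M) == len(vectors)

print(linearly_independent([(1, 1, 1), (1, 3, 1), (1, 4, 3)]))   # Example 2.27: True
print(linearly_independent([(1, 1, 1), (1, 3, 5), (1, 4, 7)]))   # Example 2.28: False
```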
Proof. Suppose that the vectors are linearly dependent. Then there exist real numbers s1 , . . . , si , . . . , sk , where si ≠ 0, such that s1 u1 + · · · + si ui + · · · + sk uk = 0. Dividing by si and moving terms, we find that
$$u_i = \sum_{\substack{1 \le j \le k \\ j \neq i}} \frac{-s_j}{s_i}\, u_j$$
x = (x1 , x2 , x3 , . . . , xn ) = x1 ε1 + x2 ε2 + x3 ε3 + · · · + xn εn
x1 ε1 + x2 ε2 + x3 ε3 + · · · + xn εn = (x1 , x2 , x3 , . . . , xn ) = 0
only if x1 = x2 = x3 = · · · = xn = 0.
Definition 2.32. The basis ε1 , . . . , εn in Example 2.31 is called the standard basis for
the linear space Rn .
Theorem 2.33. Let u1 , . . . , uk be vectors in a linear space V . Then they form a basis
for V if and only if every vector u ∈ V can be written as a linear combination
u = x1 u1 + · · · + xk uk
Proof. First assume that the vectors form a basis for V and let u ∈ V . Then, by the
definition of bases, u = x1 u1 +· · ·+xk uk is a linear combination of the vectors. It remains
to show that the coefficients are uniquely determined. Assume, to that end, that we also
have u = y1 u1 + · · · + yk uk . Then 0 = u − u = (x1 − y1 )u1 + · · · + (xk − yk )uk , and
it follows from the linear independence of u1 , . . . , uk that x1 − y1 = · · · = xk − yk = 0,
whence xi = yi for i = 1, . . . , k.
To show the converse, we assume that every vector of V has a unique representation as
in the theorem. Then, certainly, every vector of V is a linear combination of u1 , . . . , uk .
If x1 u1 + · · · + xk uk = 0, then the uniqueness and the fact that 0u1 + · · · + 0uk = 0 give
that x1 = · · · = xk = 0. Hence u1 , . . . , uk are also linearly independent, and therefore
they form a basis for V .
Definition 2.34. Let u1 , . . . , uk be a basis for a linear space V and let u be a vector
of V . If u = x1 u1 + · · · + xk uk , we call (x1 , . . . , xk ) the coordinates of u with respect to
the basis u1 , . . . , uk .
Example 2.36. The polynomials 1, x, x2 , . . . , xn form a basis for the linear space Pn
of polynomials of degree at most n. This is so because every polynomial in Pn can be
written as p = a0 + a1 x + a2 x2 + · · · + an xn with unique coefficients a0 , a1 , a2 , . . . , an .
The coordinates of p with respect to this basis are (a0 , a1 , a2 , . . . , an ).
Example 2.37. The linear space P of all polynomials has no basis. No finite collection
p1 , . . . , pk of polynomials span P since a polynomial of degree greater than the maximum
degree of the pi cannot be a linear combination of them.
Example 2.38. We set out to find bases for ker A and im A where
$$A = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 4 & 5 & 6 \\ 2 & 5 & 7 & 9 & 11 \end{bmatrix}.$$
We begin by solving the equation Ax = 0:
$$\begin{cases} x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5 = 0 \\ x_1 + 3x_2 + 4x_3 + 5x_4 + 6x_5 = 0 \\ 2x_1 + 5x_2 + 7x_3 + 9x_4 + 11x_5 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 3x_3 + 4x_4 + 5x_5 = 0 \\ x_2 + x_3 + x_4 + x_5 = 0 \\ x_2 + x_3 + x_4 + x_5 = 0 \end{cases}$$
$$\Leftrightarrow\; \begin{cases} x_1 = -r - 2s - 3t \\ x_2 = -r - s - t \\ x_3 = r \\ x_4 = s \\ x_5 = t \end{cases} \;\Leftrightarrow\; x = ru + sv + tw$$
where u = (−1, −1, 1, 0, 0), v = (−2, −1, 0, 1, 0) and w = (−3, −1, 0, 0, 1). This shows
that x ∈ ker A if and only if x = ru + sv + tw. Hence the vectors u, v and w span
ker A. The generators obtained when solving a system in the usual way will always be
linearly independent. The free variables x3 , x4 and x5 in the last system correspond
to the parameters r, s and t in the solution, which in turn correspond to the patterns
(1, 0, 0), (0, 1, 0) and (0, 0, 1) in the third, fourth and fifth positions of the generators.
Hence ru + sv + tw = 0 if and only if (∗, ∗, r, s, t) = 0 which implies that r = s = t = 0.
Thus (−1, −1, 1, 0, 0), (−2, −1, 0, 1, 0) and (−3, −1, 0, 0, 1) form a basis for ker A.
We can use the same computations to find a basis for im A. Since im A is spanned by
the columns A1 , A2 , A3 , A4 , A5 of A, every vector y ∈ im A can be written as
y = x1 A1 + x2 A2 + x3 A3 + x4 A4 + x5 A5 . (2.1)
From the solution of the system we have
(−r − 2s − 3t)A1 + (−r − s − t)A2 + rA3 + sA4 + tA5 = 0.
By setting r = 1, s = 0, t = 0, we see that A3 = r1 A1 + r2 A2 is a linear combination
of A1 , A2 . By setting r = 0, s = 1, t = 0, we see that also A4 = s1 A1 + s2 A2 is
a linear combination of A1 , A2 . Finally, by setting r = 0, s = 0, t = 1, we see that
A5 = t1 A1 + t2 A2 is a linear combination of A1 , A2 . Substituting these expressions for
A3 , A4 and A5 into (2.1) and collecting terms, we find that y is a linear combination of
A1 , A2 , which therefore span im A. The computations also reveal that these vectors are
linearly independent. In fact,
$$x_1A_1 + x_2A_2 = 0 \;\Leftrightarrow\; x_1A_1 + x_2A_2 + 0A_3 + 0A_4 + 0A_5 = 0 \;\Leftrightarrow\; \begin{cases} x_1 = -r - 2s - 3t \\ x_2 = -r - s - t \\ 0 = r \\ 0 = s \\ 0 = t \end{cases}$$
from which it follows that x1 = x2 = 0. Hence, A1 , A2 form a basis for im A.
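The same bases can be obtained with a computer algebra system; a sketch using SymPy, whose nullspace and columnspace routines should return vectors agreeing with those found above:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3, 4, 5],
               [1, 3, 4, 5, 6],
               [2, 5, 7, 9, 11]])

for v in A.nullspace():      # basis for ker A
    print(list(v))           # expected: the vectors u, v and w above
for c in A.columnspace():    # basis for im A
    print(list(c))           # expected: the first two columns of A
```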
Lemma 2.39. Let u1 , . . . , uj , uj+1 , . . . , uk be vectors and assume that 1 ≤ j < k and
that u1 , . . . , uj are linearly dependent. Then the vectors u1 , . . . , uj , uj+1 , . . . , uk are
also linearly dependent.
Proof. By the assumption, there exist real numbers s1 , . . . , sj , not all zero, such that
s1 u1 + · · · + sj uj = 0. But then s1 u1 + · · · + sj uj + 0uj+1 + · · · + 0uk = 0, and at least
one of these coefficients is non-zero.
Theorem 2.40. If the vectors v 1 , . . . , v k belong to the span [u1 , . . . , uj ] and 1 ≤ j < k,
then v 1 , . . . , v k are linearly dependent.
If a1p = · · · = app = a(p+1)p = 0, then v 1 , . . . , v p+1 belong to the span [u1 , . . . , up−1 ]
and are therefore linearly dependent by hypothesis and Lemma 2.39. Hence, renaming
the vectors if needed, we may assume that a(p+1)p ≠ 0. We can then eliminate up from
all equations except the last one by adding suitable multiples of that equation to them.
Doing so, we obtain
$$\begin{aligned} b_{11}u_1 + \cdots + b_{1(p-1)}u_{p-1} &= v_1 + c_1v_{p+1} \\ &\;\;\vdots \\ b_{p1}u_1 + \cdots + b_{p(p-1)}u_{p-1} &= v_p + c_pv_{p+1}. \end{aligned}$$
Hence, the vectors v 1 +c1 v p+1 , . . . , v p +cp v p+1 belong to [u1 , . . . , up−1 ] and are therefore
linearly dependent by the induction hypothesis. Thus there exist scalars s1 , . . . , sp , not
all zero, such that
s1 (v 1 + c1 v p+1 ) + · · · + sp (v p + cp v p+1 ) = 0,
and hence
s1 v 1 + · · · + sp v p + (s1 c1 + · · · + sp cp )v p+1 = 0.
This shows that v 1 , . . . , v p+1 are linearly dependent.
Theorem 2.41. Let u1 , . . . , um and v 1 , . . . , v n be bases for the same linear space V .
Then m = n.
Definition 2.42. Let V be a linear space. If V = {0}, we say that the dimension of
V is zero. If V has a basis consisting of n ≥ 1 vectors, we say that the dimension of V
is n. In these two cases we say that V is finite-dimensional and denote the dimension of
V by dim V . In the remaining case where V ≠ {0} and has no basis, we say that V is
infinite-dimensional.
Example 2.43. We saw in Example 2.31 that dim Rn = n. From Example 2.36 we have
dim Pn = n + 1, and by Example 2.37, P is infinite-dimensional. The dimensions of the
kernel and image in Example 2.38 are 3 and 2, respectively.
Proof. By assumption, there exist real numbers s1 , . . . , sk , sk+1 , not all zero, such that
s1 u1 + · · · + sk uk + sk+1 uk+1 = 0. If sk+1 = 0, then s1 u1 + · · · + sk uk = 0 where at
least one of the coefficients is non-zero. Since this contradicts the linear independence of
u1 , . . . , uk , we must have sk+1 6= 0. Hence, dividing by sk+1 and moving terms, we can
express uk+1 as a linear combination of u1 , . . . , uk .
Proof. Set n = dim V and consider any finite set S of linearly independent vectors of
U containing the vectors u1 , . . . , uk . Then, by Theorem 2.40, S cannot contain more
than n vectors, for the vectors in S are also vectors of V and V is spanned by n vectors.
Therefore, among all such sets S, there is a set S0 = {u1 , . . . , um } with a maximum
number of vectors. If u is any other vector of U , it follows from the maximality of S0
that u1 , . . . , um , u must be linearly dependent. Hence, by Lemma 2.44, u is a linear
combination of the vectors in S0 . This shows that u1 , . . . , um form a basis for U .
Proof. The claim is trivial when U = {0}. Otherwise, U contains a non-zero vector u1 ,
which by Theorem 2.45 can be extended to a basis for U . The inequality now follows
from Theorem 2.40.
Theorem 2.49. Let V 6= {0} be a linear space and assume that the vectors u1 , . . . , uk
span V . Then there exists a basis for V comprising the vectors in a subset of {u1 , . . . , uk }.
Proof. Among all non-empty subsets S of {u1 , . . . , uk } with the property that the vectors
in S span V there must be a set S0 with a minimum number of vectors. Suppose that the
vectors in S0 are linearly dependent. Then S0 contains more than one vector, and one
vector u ∈ S0 is a linear combination of the other vectors in S0 . Hence, by Lemma 2.48,
the vectors in S0 \{u} span V . Since this contradicts the minimality of S0 , the vectors
in S0 are linearly independent and hence form a basis for V .
Proof. Suppose that u1 , . . . , un span V . If they are also linearly dependent, then, by
using Theorem 2.49, we can obtain a basis for V by removing one or more of the vectors
u1 , . . . , un . Since this contradicts the fact that all bases for an n-dimensional space
consist of n vectors, u1 , . . . , un must be linearly independent.
Now suppose that u1 , . . . , un are linearly independent. If they do not span V , then,
by using Theorem 2.45, we can obtain a basis for V with more than n vectors. We get
the same contradiction as before. Hence u1 , . . . , un span V .
Hence, in order to find out whether n vectors of an n-dimensional space form a basis for
that space, it is enough to check if they are linearly independent or to check if they span
the space.
Corollary 2.51. Let U be a subspace of V and assume that the two spaces have the
same finite dimension. Then U = V .
Example 2.52. We want to show that the vectors e1 = (1, 2, 1), e2 = (1, 1, 2) and
e3 = (1, 4, 0) form a basis for R3 and find the coordinates of u = (3, 7, 3) with respect
to that basis. The equation s1 e1 + s2 e2 + s3 e3 = 0 is equivalent to
$$\begin{cases} s_1 + s_2 + s_3 = 0 \\ 2s_1 + s_2 + 4s_3 = 0 \\ s_1 + 2s_2 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} s_1 + s_2 + s_3 = 0 \\ -s_2 + 2s_3 = 0 \\ s_2 - s_3 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} s_1 + s_2 + s_3 = 0 \\ -s_2 + 2s_3 = 0 \\ s_3 = 0. \end{cases}$$
From this we see that s1 = s2 = s3 = 0. Hence the vectors are linearly independent.
Since the number of vectors is 3 and dim R3 = 3, they must form a basis for R3 . In order
to find the coordinates (x1 , x2 , x3 ) of u, we solve the equation x1 e1 + x2 e2 + x3 e3 = u,
which is equivalent to
$$\begin{cases} x_1 + x_2 + x_3 = 3 \\ 2x_1 + x_2 + 4x_3 = 7 \\ x_1 + 2x_2 = 3 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + x_2 + x_3 = 3 \\ -x_2 + 2x_3 = 1 \\ x_2 - x_3 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + x_2 + x_3 = 3 \\ -x_2 + 2x_3 = 1 \\ x_3 = 1 \end{cases} \;\Leftrightarrow\; x_1 = x_2 = x_3 = 1.$$
This shows that the coordinates of u with respect to e1 , e2 , e3 are (1, 1, 1). Note that the
coordinates of u with respect to the ordinary basis ε1 , ε2 , ε3 for R3 are (3, 7, 3). When
more than one basis are involved, some care must be taken when stating the coordinates
of a vector.
Example 2.53. We solve the problem in the previous example by, instead, showing that
e1 , e2 , e3 span R3 and using the fact that dim R3 = 3. We must then show that any
vector y = (y1 , y2 , y3 ) of R3 is a linear combination of e1 , e2 , e3 . We check if this holds
by solving the equation x1 e1 + x2 e2 + x3 e3 = y.
$$\begin{cases} x_1 + x_2 + x_3 = y_1 \\ 2x_1 + x_2 + 4x_3 = y_2 \\ x_1 + 2x_2 = y_3 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + x_2 + x_3 = y_1 \\ -x_2 + 2x_3 = y_2 - 2y_1 \\ x_2 - x_3 = y_3 - y_1 \end{cases}$$
$$\Leftrightarrow\; \begin{cases} x_1 + x_2 + x_3 = y_1 \\ -x_2 + 2x_3 = y_2 - 2y_1 \\ x_3 = -3y_1 + y_2 + y_3 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 = 8y_1 - 2y_2 - 3y_3 \\ x_2 = -4y_1 + y_2 + 2y_3 \\ x_3 = -3y_1 + y_2 + y_3. \end{cases}$$
Indeed, the equation has a solution (x1 , x2 , x3 ) for every y. Therefore, the vectors span
R3 and thus form a basis for R3 . We can now find the coordinates of u by substituting
its components y1 = 3, y2 = 7 and y3 = 3 in the last system above. This gives the same
result as before, namely (x1 , x2 , x3 ) = (1, 1, 1).
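In matrix form, the coordinates are the solution of Ex = u, where the columns of E are the basis vectors; a short sketch reproducing the answer of Examples 2.52 and 2.53:

```python
import numpy as np

E = np.array([[1, 1, 1],     # columns are e1, e2, e3 of Example 2.52
              [2, 1, 4],
              [1, 2, 0]])
u = np.array([3, 7, 3])

print(np.linalg.solve(E, u))   # [1. 1. 1.]  -> u = e1 + e2 + e3
print(np.linalg.inv(E) @ u)    # same coordinates, via the inverse as in Example 2.53
```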
both spaces have dimension 2. Therefore, by the corollary, it is sufficient to show that
one of the spaces is a subspace of the other. To do so, we begin by solving the equation
$$x_1u_1 + x_2u_2 + y_1v_1 + y_2v_2 = 0.$$
$$\begin{cases} x_1 + 2x_2 + 4y_1 + y_2 = 0 \\ x_1 + 3x_2 + 5y_1 = 0 \\ x_1 + x_2 + 3y_1 + 2y_2 = 0 \\ x_1 - x_2 + y_1 + 4y_2 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 4y_1 + y_2 = 0 \\ x_2 + y_1 - y_2 = 0 \\ -x_2 - y_1 + y_2 = 0 \\ -3x_2 - 3y_1 + 3y_2 = 0 \end{cases} \;\Leftrightarrow\; \begin{cases} x_1 + 2x_2 + 4y_1 + y_2 = 0 \\ x_2 + y_1 - y_2 = 0. \end{cases}$$
From this we see that we can choose any values for y1 and y2 and then solve for x1 and x2 .
By choosing y1 = −1, y2 = 0, we see that there are numbers x1 and x2 such that
v 1 = x1 u1 + x2 u2 . Hence, v 1 is a linear combination of u1 and u2 and therefore belongs
to [u1 , u2 ]. By choosing y1 = 0, y2 = −1, we see that also v 2 is a linear combination
of u1 and u2 and therefore belongs to [u1 , u2 ]. Hence, every linear combination of
v 1 and v 2 , and therefore every vector of [v 1 , v 2 ], belongs to [u1 , u2 ]. This shows that
[v 1 , v 2 ] is a subspace of [u1 , u2 ] and therefore, by the corollary, [u1 , u2 ] = [v 1 , v 2 ].
Suppose that the dimension of the linear space V is n > 0 and that e1 , . . . , en form
a basis for V . If u and v are vectors of V having coordinates x = (x1 , . . . , xn ) and
y = (y1 , . . . , yn ), respectively, then u = x1 e1 + · · · + xn en and v = y1 e1 + · · · + yn en .
Hence, the coordinates of
u + v = (x1 + y1 )e1 + · · · + (xn + yn )en and su = sx1 e1 + · · · + sxn en
are
x + y = (x1 + y1 , . . . , xn + yn ) and sx = (sx1 , . . . , sxn ).
By means of a basis for V , we may therefore identify V with Rn . The two spaces behave
the same in every linear respect. We exploit this fact in the next example.
Example 2.55. Let V be the space P2 of polynomials of degree at most 2 and consider
the polynomials p1 = 1 + 2x + x2 , p2 = 1 + x + 2x2 and p3 = 1 + 4x. We intend to
show that these vectors form a basis for P2 and find the coordinates with respect to that
basis of p = 3 + 7x + 3x2 . The three polynomials π 1 = 1, π 2 = x, π 3 = x2 form a basis
for P2 . With respect to this basis, the coordinates of the vectors p1 , p2 , p3 and p are
e1 = (1, 2, 1), e2 = (1, 1, 2), e3 = (1, 4, 0) and u = (3, 7, 3), respectively. Hence, it is
enough to show that e1 , e2 , e3 form a basis for R3 and find the coordinates of u with
respect to this basis. We did this in Example 2.52 and found that the coordinates of
u with respect to e1 , e2 , e3 are (1, 1, 1). Hence, p1 , p2 , p3 form a basis for P2 and the
coordinates of p with respect to this basis are (1, 1, 1). Indeed, p = p1 + p2 + p3 .
Proof. The first three statements are equivalent by Theorem 2.50. The equivalences
(i) ⇔ (iv), (ii) ⇔ (v) and (iii) ⇔ (vi) follow from the observation made before the theorem.
The equivalence of (vi) and (vii) is the content of Theorem 1.19. The equivalence of (vii)
and (viii) follows from Theorem 1.23. The last three statements are equivalent to (viii)
since the rows of A are the columns of At .
Definition 2.59. Assume that V = U ′ ⊕ U ′′ is the direct sum of U ′ and U ′′ and let
u ∈ V . The unique vectors u′ and u′′ such that u = u′ + u′′ are called the projections
of u on U ′ along U ′′ and of u on U ′′ along U ′ , respectively.
[Figure: the decomposition u = u′ + u′′, with u′ in U′ and u′′ in U′′.]
Proof. The statement follows from the uniqueness and the fact that
Proof. If one of the subspaces is the zero subspace, then the other subspace equals V , and
the statement is trivial. We may therefore assume that none of the subspaces is the zero
subspace. Since both subspaces are finite-dimensional, we can choose bases e1 , . . . , ek and
f 1 , . . . , f m for U ′ and U ′′ , respectively. We show that dim V = k + m = dim U ′ + dim U ′′
by showing that e1 , . . . , ek , f 1 , . . . , f m form a basis for V . If u ∈ V , we can write
u = u′ + u′′ where u′ ∈ U ′ and u′′ ∈ U ′′ . Since u′ and u′′ are linear combinations
of e1 , . . . , ek and f 1 , . . . , f m , respectively, it follows that u is a linear combination of
the vectors e1 , . . . , ek , f 1 , . . . , f m . It remains to show that these vectors are linearly
independent. Suppose that
s1 e1 + · · · + sk ek + t1 f 1 + · · · + tm f m = 0.
s1 e1 + · · · + sk ek = 0 and t1 f 1 + · · · + tm f m = 0.
Example 2.63. Let V be 3-space and consider the plane U ′ and the line U ′′ defined
by x1 + 2x2 + 3x3 = 0 and x = t(2, 1, 2), respectively. It is easily checked that the
intersection of the two subspaces is the zero space. Hence, their sum is direct, and
therefore its dimension is 2 + 1 = 3. Thus V = U ′ ⊕ U ′′ . Let u = (2, 3, 4). In order to
find u′ and u′′ , we form the line x = t(2, 1, 2) + (2, 3, 4) through u. Its intersection with
the plane is given by
for all u and v in U and all real numbers s and t. If U = V , we also say that F is a
linear transformation on U .
The single condition in the definition can be replaced with the two conditions
and
F (su) = F (su + 0u) = sF (u) + 0F (u) = sF (u).
Conversely, if the two conditions hold, then
As in the proof of Theorem 2.18, one sees that ker F and im F are subspaces of U and V ,
respectively. We also note that if A is an m × n matrix, then F (x) = Ax defines a linear
transformation F from Rn to Rm whose kernel and image agree with the kernel and
image of A.
When n = 2 and the roots λ1 and λ2 of the characteristic equation are real and unequal,
ker F is spanned by e1 and e2 defined by e1 (t) = eλ1 t and e2 (t) = eλ2 t . The vectors
e1 and e2 are linearly independent, for if
s1 eλ1 t + s2 eλ2 t = 0, t ∈ R,
λ1 s1 eλ1 t + λ2 s2 eλ2 t = 0, t ∈ R.
By inserting t = 0, we get
$$\begin{cases} s_1 + s_2 = 0 \\ \lambda_1 s_1 + \lambda_2 s_2 = 0. \end{cases}$$
Since λ1 ≠ λ2, this is possible only if s1 = s2 = 0. Thus we have shown that ker F is
two-dimensional with basis e1 , e2 .
Proof. If ker F = U , then im F = {0}, and the statement holds trivially. Otherwise,
ker F 6= U , and hence U 6= {0}. We can therefore choose a possibly empty set of basis
vectors {e1 , . . . , ek } for ker F and extend it to a basis e1 , . . . , ek , ek+1 , . . . , en for U . Set
f i = F (ei ) for i = 1, . . . , n. Then f i ∈ im F for i = 1, . . . , n and f i = 0 for i = 1, . . . , k.
We show that dim im F = n − k = n − dim ker F by showing that f k+1 , . . . , f n form a
basis for im F .
First, we show that they are linearly independent. Suppose that
sk+1 f k+1 + · · · + sn f n = 0.
sk+1 ek+1 + · · · + sn en = 0,
Example 2.68. We use the theorem to find the dimensions of ker A and im A where
$$A = \begin{bmatrix} 1 & -1 & 1 & -1 & 1 \\ 2 & 3 & 4 & 2 & 5 \\ 3 & 7 & 7 & 5 & 9 \\ 4 & 6 & 8 & 4 & 10 \end{bmatrix}.$$
Proof. Let u and v be vectors of V and s and t real numbers. Set u′ = F −1 (u) and
v ′ = F −1 (v). The linearity of F then yields su + tv = sF (u′ ) + tF (v ′ ) = F (su′ + tv ′ ),
and hence
sF −1 (u) + tF −1 (v) = su′ + tv′ = F −1 (su + tv).
Proof. Let n be the common dimension of U and V . Then the rank-nullity theorem and
Corollary 2.51 yield
Exercises
2.1. Which of the following sets are subspaces of R2 ? Justify your answers.
(a) {x ∈ R2 ; x1 = 2x2 }, (b) {x ∈ R2 ; (x1 , x2 ) = t(1, 2) + (1, 1), t ∈ R},
(c) {x ∈ R2 ; x1 = x22 }, (d) {x ∈ R2 ; (x1 , x2 ) = t(1, 2), t ∈ R}.
2.2. Show that the set of symmetric n × n matrices is a subspace of Mn×n .
2.3. Express the plane through the origin and the two points (1, 1, 0) and (2, 0, 1) as
the image of a matrix and as the kernel of a matrix.
2.4. Which of the following sets of vectors are linearly dependent?
(a) (1, 2, 3), (2, 3, 3), (2, 5, 7) in R3 ,
(b) (1, 2, 3, 1), (2, 3, 2, 3), (1, 1, −1, 2) in R4 ,
(c) (1, 2, 3, 1, 2), (2, 3, 2, 3, 1), (1, 1, −1, 2, 3) in R5 .
(1, 3, 2, 1), (1, 2, 1, 1), (1, 1, 0, 1), (1, 2, 2, 1), (3, 4, 1, 3).
(1, 0, 1, 0), (1, 1, 1, 1), (2, 1, −1, 2), (1, −2, −2, 1)
form a basis for R4 . What are the coordinates of (2, 2, −1, 1) with respect to this
basis?
2.10. Find the dimensions of the spans of the following vectors in R4 .
(a) (2, 1, 0, 1), (1, 0, 1, 2), (1, 1, −1, −1),
(b) (1, 2, 3, 1), (1, 1, 1, 2), (1, 1, −1, 1).
2.11. Show that u1 = (1, 1, 1, −1), u2 = (1, 2, 3, 4) span the same subspace of R4 as
v 1 = (−1, 1, 3, 11), v 2 = (3, 1, −1, −13).
2.12. Let u1 , . . . , un be a basis for a linear space. What is the dimension of the subspace
[u1 − u2 , u2 − u3 , . . . , un−1 − un , un − u1 ]?
2.13. Let A and B be n × n matrices such that A2 − AB = I. Show that A2 − BA = I.
2.14. For which of the subspaces U and V of R4 is the sum U + V direct?
(a) U = {x ∈ R4 ; x1 + x2 + x3 + x4 = 0}, V = {x ∈ R4 ; x1 − x2 + x3 − x4 = 0}.
(b) U = {(t, 0, −t, t) ∈ R4 ; t ∈ R}, V = {x ∈ R4 ; x1 + 2x2 + 3x3 + 4x4 = 0}.
2.15. Show that R3 = U ⊕ V where
Is F one-to-one? Is it onto?
2.19. Show that the linear transformation F on R3 defined by
3 Inner Product Spaces
3.1 Definition
Definition 3.1. Let V be a linear space, and let there be defined a function V × V → R whose value at (u, v) is denoted by ⟨u, v⟩. We call the function an inner product on V if the following conditions are satisfied.
(i) ⟨su + tv, w⟩ = s⟨u, w⟩ + t⟨v, w⟩ for all u, v and w in V and all s and t in R.
(ii) ⟨u, v⟩ = ⟨v, u⟩ for all u and v in V .
(iii) ⟨u, u⟩ ≥ 0 for all u ∈ V with equality only if u = 0.
A linear space, furnished with an inner product, is called an inner product space. We call ‖u‖ = √⟨u, u⟩ the norm or length of a vector u in an inner product space.
When we talk about a subspace of an inner product space V , we shall assume that it is
equipped with the same inner product as V .
Note that Axiom (i) means that u ↦ ⟨u, w⟩ is a linear transformation from V to R for every fixed w ∈ V . Hence, Axiom (i) is equivalent to ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩ and ⟨su, w⟩ = s⟨u, w⟩ for all u, v, w in V and all s ∈ R.
Theorem 3.2.
(i) ⟨w, su + tv⟩ = s⟨w, u⟩ + t⟨w, v⟩ for all u, v and w in V and all s and t in R.
(ii) ⟨0, u⟩ = ⟨u, 0⟩ = 0 for all u ∈ V .
(iii) ‖su‖ = |s| ‖u‖ for all u ∈ V and s ∈ R.
(x1 , x2 , . . . , xn ) · (y1 , y2 , . . . , yn ) = x1 y1 + x2 y2 + · · · + xn yn .
When we mention the inner product space Rn without further specification, it is always
assumed that the inner product is the dot product.
Example 3.6. Let V = C[0, 1]. Then ⟨u, v⟩ = ∫₀¹ u(x)v(x) dx is an inner product on V . Axioms (i) and (ii) are easily verified. It is also clear that ⟨u, u⟩ = ∫₀¹ (u(x))² dx ≥ 0 for all u ∈ V . Suppose that u ≠ 0. Then u(a) ≠ 0 for some a ∈ [0, 1]. Hence (u(a))² = b > 0. Since u is continuous, u² is continuous. Hence, there exists a real number δ > 0 such that (u(x))² > b/2 when |x − a| < δ and x ∈ [0, 1]. We may clearly assume that δ < 1/2. At least one of the intervals [a, a + δ] and [a − δ, a] is contained in the interval [0, 1]. Therefore ∫₀¹ (u(x))² dx ≥ δb/2 > 0. This shows that also Axiom (iii) holds.
Example 3.7. Let V be the linear space of Riemann integrable functions in [0, 1]. Then ⟨u, v⟩ = ∫₀¹ u(x)v(x) dx is not an inner product. Let, for example, u be the function defined by u(0) = 1 and u(x) = 0 for x ∈ (0, 1]. Then ⟨u, u⟩ = 0 despite the fact that u is not the zero vector.
Definition 3.9. We say that two vectors u and v in an inner product space are orthogonal if ⟨u, v⟩ = 0. The vector e is called a unit vector if ‖e‖ = 1.
If u ≠ 0, we can form the vector e = (1/‖u‖)u. Then ‖e‖ = (1/‖u‖)‖u‖ = 1. Hence e is a unit vector. We say that we normalise u to e.
Proof. If v = 0, then both sides are zero and the inequality holds. Assume that v ≠ 0. Then we have
$$\|u\|^2 - \frac{\langle u, v\rangle^2}{\|v\|^2} \ge 0.$$
Consequently, ⟨u, v⟩² ≤ ‖u‖²‖v‖², and hence |⟨u, v⟩| ≤ ‖u‖‖v‖.
In ordinary 3-space one starts out with the notions of length and angle and defines the inner product of two non-zero vectors u and v by ⟨u, v⟩ = ‖u‖‖v‖ cos θ. In this general setting we started out with an inner product and defined the length, or norm as we also call it. When u and v are non-zero vectors, the Cauchy–Schwarz inequality can be written as
$$-1 \le \frac{\langle u, v\rangle}{\|u\|\,\|v\|} \le 1.$$
This enables us to complete the situation by also defining angles.
Definition 3.12. Let u and v be non-zero vectors of an inner product space. The angle between u and v is the unique real number θ for which
$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|}, \qquad 0 \le \theta \le \pi.$$
The inner product on ordinary 3-space is an inner product in the sense of this chapter.
By defining angles as we do here, we see that we get our old angles back in 3-space.
Example 3.13. The lengths of the vectors u = (4, 3, −1, −1) and v = (1, 1, −1, −1) in
R4 are
$$\|u\| = \sqrt{4^2 + 3^2 + (-1)^2 + (-1)^2} = \sqrt{27} = 3\sqrt{3},$$
$$\|v\| = \sqrt{1^2 + 1^2 + (-1)^2 + (-1)^2} = \sqrt{4} = 2.$$
Example 3.14. Consider the functions u and v in C[0, 1] defined by u(x) = 1 and v(x) = 6x − 2. With the inner product ⟨u, v⟩ = ∫₀¹ u(x)v(x) dx we have
$$\|u\|^2 = \int_0^1 (u(x))^2\,dx = \int_0^1 1\,dx = 1,$$
$$\|v\|^2 = \int_0^1 (v(x))^2\,dx = \int_0^1 (36x^2 - 24x + 4)\,dx = 4,$$
$$\langle u, v\rangle = \int_0^1 u(x)v(x)\,dx = \int_0^1 (6x - 2)\,dx = 1.$$
Hence cos θ = 1/(1 · 2) = 1/2, and the angle between u and v is π/3.
You cannot find the angle in the example in a figure depicting the graphs of the two
functions. Instead, it is small if the functions are close to being directly proportional and
large if they are close to being inversely proportional.
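The angle in Example 3.14 can also be checked numerically by approximating the integrals; the grid size below is an arbitrary choice:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)
u = np.ones_like(x)       # u(x) = 1
v = 6 * x - 2             # v(x) = 6x - 2

def inner(f, g):
    return np.trapz(f * g, x)   # approximates the integral of f(x)g(x) over [0, 1]

cos_theta = inner(u, v) / np.sqrt(inner(u, u) * inner(v, v))
print(cos_theta)                          # approximately 0.5
print(np.degrees(np.arccos(cos_theta)))   # approximately 60 degrees, i.e. pi/3
```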
Theorem 3.15 (Triangle inequality). Let u and v be vectors in an inner product space. Then ‖u + v‖ ≤ ‖u‖ + ‖v‖.
The inequality now follows from the fact that both ‖u + v‖ and ‖u‖ + ‖v‖ are non-negative.
3.2 Orthonormal Bases
Proof. If s1 u1 + · · · + sk uk = 0, then
$$0 = \langle 0, u_i\rangle = \langle s_1u_1 + \cdots + s_iu_i + \cdots + s_ku_k,\ u_i\rangle = s_1\langle u_1, u_i\rangle + \cdots + s_i\langle u_i, u_i\rangle + \cdots + s_k\langle u_k, u_i\rangle = s_i\langle u_i, u_i\rangle.$$
Since ui ≠ 0, we have ⟨ui , ui ⟩ = ‖ui ‖² ≠ 0, and hence si = 0.
Definition 3.17. Let V be an inner product space. We say that the vectors e1 , . . . , ek
in V form an orthonormal set if
$$\langle e_i, e_j\rangle = \begin{cases} 1 & \text{when } i = j, \\ 0 & \text{when } i \neq j. \end{cases}$$
If the vectors also form a basis for V , we say that they form an orthonormal basis for V .
By Theorem 3.16, the vectors of an orthonormal set form a basis for V if and only if they
span V .
Theorem 3.18. Let V be an inner product space with orthonormal basis e1 , . . . , en and assume that the coordinates of u and v with respect to this basis are x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ), respectively. Then ⟨u, v⟩ = x · y and ‖u‖² = ‖x‖², where the last norm is the ordinary norm in Rn.
Proof. We have
$$\langle u, v\rangle = \langle x_1e_1 + \cdots + x_ne_n,\ y_1e_1 + \cdots + y_ne_n\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} x_iy_j\langle e_i, e_j\rangle = \sum_{i=1}^{n} x_iy_i = x \cdot y,$$
Assume that the vectors v 1 , . . . , v n form a basis for an inner product space V . These
vectors can then be used to construct an orthonormal basis for V . To do so, we first
construct a basis for V consisting of pairwise orthogonal vectors u1 , . . . , un . First, set
u1 = v 1 .
If n = 1, we are done. Otherwise, by using Theorem 3.20, we can find a number s12 such
that
u2 = s12 u1 + v 2
is orthogonal to u1 . Then u2 must be non-zero, for if u2 = 0, then v 1 and v 2 would be
linearly dependent. If n > 2, we find numbers s13 and s23 such that
u3 = s13 u1 + s23 u2 + v 3
$$\begin{aligned} u_1 &= v_1 \\ u_2 &= s_{12}u_1 + v_2 \\ u_3 &= s_{13}u_1 + s_{23}u_2 + v_3 \\ &\;\;\vdots \\ u_n &= s_{1n}u_1 + s_{2n}u_2 + \cdots + s_{(n-1)n}u_{n-1} + v_n \end{aligned}$$
where
$$s_{ik} = -\frac{\langle u_i, v_k\rangle}{\|u_i\|^2}, \qquad 1 \le i < k \le n,$$
then the vectors
$$e_i = \frac{1}{\|u_i\|}\, u_i, \qquad i = 1, \ldots, n,$$
form an orthonormal basis for V .
[Figure: in the Gram–Schmidt step, u2 = s12 u1 + v2 is orthogonal to u1 = v1.]
Corollary 3.22. Every non-zero finite-dimensional inner product space has an orthonormal basis.
Example 3.23. We demonstrate the process by finding an orthonormal basis for the
subspace of R4 spanned by v 1 = (1, 1, 1, 1), v 2 = (1, 2, 2, 1), v 3 = (2, 3, 1, 6). We set
u1 = v 1 . Then we determine the number r so that
u2 = ru1 + v 2
To avoid fractional numbers, we can replace u2 with 2u2 . This works, because also
u1 and 2u2 are orthogonal, and 2u2 is a linear combination of v 1 and v 2 . Hence, we set
u2 = (−1, 1, 1, −1). Next, we set out to find numbers s and t that make
u3 = su1 + tu2 + v3
The Gram–Schmidt process works also if the vi merely span V . As long as the ui
constructed so far are non-zero, they are linearly independent and span the same subspace
as the corresponding vectors vi . If ui = 0, then either i = 1 and v 1 = 0 or v i is a linear
combination of v 1 , . . . , v i−1 . Hence, ui and vi can be discarded. One can then repeat
the last step using the next vector and proceed from there.
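A compact numerical sketch of the procedure (orthogonalise against the vectors already produced, normalise, and discard vectors that add nothing to the span), applied to the vectors of Example 3.23:

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        u = np.asarray(v, dtype=float)
        for b in basis:                 # subtract the components along earlier vectors
            u = u - np.dot(u, b) * b
        norm = np.linalg.norm(u)
        if norm > 1e-12:                # skip vectors already in the span
            basis.append(u / norm)
    return basis

for e in gram_schmidt([(1, 1, 1, 1), (1, 2, 2, 1), (2, 3, 1, 6)]):
    print(np.round(e, 4))
```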
Theorem 3.25. Let A be a square matrix. Then the following statements are equivalent.
(i) A is orthogonal.
(ii) At A = I.
(iii) AAt = I.
(iv) The rows of A form an orthonormal set.
[u1 , . . . , uk ]⊥ = {u1 , . . . , uk }⊥ .
Then u′ ∈ U and u = u′ + u′′ . By using Lemma 3.28 and the fact that
$$\langle e_i, u''\rangle = \langle e_i, u\rangle - \langle e_i,\ \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n\rangle = \langle e_i, u\rangle - \langle e_i, u\rangle = 0, \qquad i = 1, \ldots, n,$$
[Figure: the orthogonal decomposition u = u′ + u′′, with u′ in U and u′′ in U⊥.]
Note that the conclusion V = U ⊕ U⊥ of Theorem 3.29 need not be true if we drop the assumption that U be finite-dimensional, and if V ≠ U ⊕ U⊥, the conclusion of
Theorem 3.30 need not be true. Consider the space l2 defined in Example 3.8 and let U be the subspace of l2 consisting of those sequences (x_n)_{n=1}^∞ for which only a finite number of components are non-zero. Then the vector εn having a one in position n and zeros elsewhere belongs to U . If x ∈ U⊥, then xn = ⟨x, εn⟩ = 0 for all n, and hence x = 0. This means that U⊥ = {0}, whence (U⊥)⊥ = {0}⊥ = l2 ≠ U .
Let U be a finite-dimensional subspace of an inner product space V and assume that
e1 , . . . , en form an orthonormal basis for U . Then the proof of Theorem 3.29 shows that
the orthogonal projection u′ on U of a vector u ∈ V is given by
$$u' = \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n. \tag{3.1}$$
Example 3.34. Here, we want to find the orthogonal projection of u = (5, 7, 3) on the
plane U defined by x1 + 2x2 + 3x3 = 0. One way of solving this problem would be to
first find a basis v1 , v 2 for U , then apply the Gram–Schmidt process to v 1 and v 2 to get
an orthonormal basis e1 , e2 for U and finally use the formula.
To avoid this rather cumbersome procedure, we use the fact that U ⊥ is spanned by
the single vector (1, 2, 3). Hence
$$e = \frac{1}{\sqrt{14}}\,(1, 2, 3)$$
constitutes an orthonormal basis for U⊥. By using the formula, we find that the orthogonal projection of u on U⊥ is
$$u'' = \langle u, e\rangle e = \frac{28}{\sqrt{14}}\, e = (2, 4, 6).$$
The orthogonal projection of u on U is, therefore, u′ = u − u′′ = (5, 7, 3) − (2, 4, 6) = (3, 3, −3).
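Numerically, the two projections of Example 3.34 come out of a few lines (the normal vector (1, 2, 3) spans U⊥):

```python
import numpy as np

u = np.array([5.0, 7.0, 3.0])
n = np.array([1.0, 2.0, 3.0])                 # normal of x1 + 2x2 + 3x3 = 0

u_perp = (np.dot(u, n) / np.dot(n, n)) * n    # projection of u on U⊥
print(u_perp)                                 # [2. 4. 6.]
print(u - u_perp)                             # [ 3.  3. -3.], the projection on U
```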
As we saw in this example, it may be worthwhile to give some thought to which of the
vectors u′ and u′′ should be computed first.
Example 3.35. Consider once again the subspace U and the vector u in Example 3.33. We found there that the vector closest to u in U is u′ = ½(3, 3, 2, 4). The distance from u to U is
$$\|u''\| = \|u - u'\| = \left\|(1, 2, 1, 2) - \tfrac{1}{2}(3, 3, 2, 4)\right\| = \left\|\tfrac{1}{2}(-1, 1, 0, 0)\right\| = \frac{1}{\sqrt{2}}.$$
$$u'' = \langle u, e\rangle e = \frac{ax + by + cz}{\sqrt{a^2 + b^2 + c^2}}\, e.$$
Consequently, the distance from u to the plane is
$$\|u''\| = \frac{|ax + by + cz|}{\sqrt{a^2 + b^2 + c^2}}\, \|e\| = \frac{|ax + by + cz|}{\sqrt{a^2 + b^2 + c^2}}.$$
Theorem 3.37. Let V be a finite-dimensional inner product space and let U be a sub-
space of V . Then dim U + dim U ⊥ = dim V .
Proof. The statement follows directly from Theorems 3.29 and 2.62.
Hence, dim im A = dim im At . This means that the maximum number of linearly inde-
pendent columns of A equals the maximum number of linearly independent rows of A.
Definition 3.38. The common value of dim im A and dim im At is called the rank of A.
AtAx = At y ′ = At y.
AtAx = At y
give the same result as the two-step method. This method is called the method of least
squares and its name stems from the fact that the approximate solutions minimise the
sum
$$\|Ax - y\|^2 = ((Ax)_1 - y_1)^2 + \cdots + ((Ax)_m - y_m)^2$$
of squares.
[Figure: y is decomposed as y = y′ + y′′, with y′ in im A and y′′ in ker At.]
Note that y ∈ im A if the system has solutions. Hence, in this case y ′ = y and the
solutions in the sense of least squares are the ordinary solutions.
Example 3.39. We seek the solution in the sense of least squares of the system
$$\begin{cases} x_1 + x_2 = 6 \\ 4x_1 - x_2 = 8 \\ 3x_1 + 2x_2 = 5. \end{cases}$$
We set
$$A = \begin{bmatrix} 1 & 1 \\ 4 & -1 \\ 3 & 2 \end{bmatrix} \quad\text{and}\quad Y = \begin{bmatrix} 6 \\ 8 \\ 5 \end{bmatrix}.$$
Then
$$A^tA = \begin{bmatrix} 1 & 4 & 3 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 4 & -1 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} 26 & 3 \\ 3 & 6 \end{bmatrix} \quad\text{and}\quad A^tY = \begin{bmatrix} 1 & 4 & 3 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} 6 \\ 8 \\ 5 \end{bmatrix} = \begin{bmatrix} 53 \\ 8 \end{bmatrix}.$$
Note that At A is a symmetric matrix and its entry in position i, k is the dot product
Ai · Ak . This observation might save you some time and effort.
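Numerically, the normal equations are solved in one line; for this system the least-squares solution turns out to be x1 = 2, x2 = 1/3, which the reader can verify against AᵗAX = AᵗY above:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [4.0, -1.0],
              [3.0, 2.0]])
Y = np.array([6.0, 8.0, 5.0])

x = np.linalg.solve(A.T @ A, A.T @ Y)        # solve the normal equations
print(x)                                     # [2.         0.33333333]
print(np.linalg.lstsq(A, Y, rcond=None)[0])  # built-in least squares, same answer
```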
The following example illustrates a common application of the method of least squares.
Example 3.40. We have reason to believe that some process is described by a linear
model y = at + b where y is some quantity and, for example, t is time. Measurements
give the following data:
t 0 1 2 3
y 1.5 2.9 5.3 6.6
From these data we want to estimate the values of a and b. Ideally, we should be able
to solve for a and b in the following system of equations.
$$\begin{cases} b = 1.5 \\ a + b = 2.9 \\ 2a + b = 5.3 \\ 3a + b = 6.6. \end{cases}$$
This is, however, seldom possible owing to measure errors or perhaps to the fact that we
are mistaken in our assumption about the model. We decide, instead, to minimise the
distance in the sense of least squares between the vectors (b, a + b, 2a + b, 3a + b) and
(1.5, 2.9, 5.3, 6.6). That is to say that we decide to solve the system in the same sense.
We can write the system as AX = Y where
$$A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix}, \qquad Y = \begin{bmatrix} 1.5 \\ 2.9 \\ 5.3 \\ 6.6 \end{bmatrix}, \qquad X = \begin{bmatrix} a \\ b \end{bmatrix}.$$
We get
$$A^tA = \begin{bmatrix} 14 & 6 \\ 6 & 4 \end{bmatrix}, \qquad A^tY = \begin{bmatrix} 33.3 \\ 16.3 \end{bmatrix}$$
and find that the solution of the normal equations At AX = At Y is given by a = 1.77,
b = 1.42.
The applicability of the method in the last example has nothing to do with the assumption
that the model is linear. It would have worked equally well under the assumption that
y = at2 + bt + c. The important thing here is that the coefficients appear linearly in the
expression. Thus, the method is not directly applicable to the exponential model y = ceat .
However, in this case there is a way to bypass this limitation. This is demonstrated in
the next example.
Example 3.41. Assume that the model is y = ceat and that we have the following data:
t 0 1 2 3
y 3 3 5 9
Taking the logarithm, we get the equivalent relation ln y = at + ln c. Setting z = ln y
and b = ln c, we can write this as z = at + b. We compute the values of z and construct
a new table:
t 0 1 2 3
z ln 3 ln 3 ln 5 ln 9
The matrix A is the same matrix as in the previous example. Hence, also At A is the
same. From now on, the problem is not well suited for manual calculations. By means
of some numerical software, we should be able to find that
$$A^tZ = \begin{bmatrix} 10.90916185 \\ 6.003887068 \end{bmatrix}.$$
The approach in the above example usually serves its purpose well. Note, however, that
minimising the distance between the vectors comprising the logarithmic values is not the
same as minimising the distance between the vectors themselves.
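A sketch of this log-transform fit: it solves the normal equations for z = at + b and then recovers c = e^b (with these data, a and c come out close to 0.38 and 2.53):

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([3.0, 3.0, 5.0, 9.0])

A = np.column_stack([t, np.ones_like(t)])   # model z = a t + b, with z = ln y
z = np.log(y)
a, b = np.linalg.solve(A.T @ A, A.T @ z)    # normal equations A^t A X = A^t Z
print(a, np.exp(b))                         # roughly 0.38 and 2.53, so y ≈ 2.53 e^{0.38 t}
```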
The method of least squares also gives us a means to compute orthogonal projections.
Let U = [u1 , . . . , un ] be a subspace of an m-dimensional inner product space V . By
introducing an orthonormal basis for V , we can regard V as Rm and the ui as elements
of Rm . If A is the m × n matrix having the ui as columns, then U = im A. Hence, if
x is any solution of the normal equations At Ax = At u, then u′ = Ax is the orthogonal
projection of u on U . If At Ax = 0, then Ax ∈ im A∩ (im A)⊥ = {0}, and hence Ax = 0.
If the ui are linearly independent, this implies that At Ax = 0 has the unique solution
x = 0 and therefore that At A is invertible. Hence, if the ui are linearly independent, we
have the following formula:
$$u' = A(A^tA)^{-1}A^t u. \tag{3.4}$$
If the basis u1 , . . . , un above is an orthonormal basis for U , then At A = I (n) , and (3.4)
reads
u′ = AAt u.
This is Formula (3.1) written in the language of matrices.
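A quick sketch of this matrix route to projections; the columns of A below are one possible basis of the plane from Example 3.34, chosen here only for illustration:

```python
import numpy as np

A = np.array([[-2.0, -3.0],     # columns span the plane x1 + 2x2 + 3x3 = 0
              [ 1.0,  0.0],
              [ 0.0,  1.0]])
u = np.array([5.0, 7.0, 3.0])

# u' = A (A^t A)^{-1} A^t u, valid since the columns of A are linearly independent.
u_proj = A @ np.linalg.solve(A.T @ A, A.T @ u)
print(u_proj)                   # [ 3.  3. -3.], as in Example 3.34
```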
Exercises
3.1. Find the angle between the two vectors u = (−1, 1, 1, −1, 0) and v = (0, 2, 1, 0, 2)
in R5 .
3.2. Show that the four points (1, 1, 2, 2), (2, 2, 3, 3), (3, 1, 4, 2), (2, 0, 3, 1) in R4 are
the vertices of a square.
3.3. Show that the parallelogram identity
$$2\left(\|u\|^2 + \|v\|^2\right) = \|u + v\|^2 + \|u - v\|^2$$
3.5. Find an orthonormal basis for the subspace of R4 spanned by (2, 1, 1, 1), (1, 2, 3, 0)
and (1, 1, 1, 1).
3.6. Find an orthonormal basis for the subspace of R4 given by
x1 + 2x2 − 2x3 − x4 = 0.
is orthogonal.
3.8. Let A and B be orthogonal n × n matrices. Show that AB is an orthogonal
matrix.
3.9. (a) Show that ⟨p, q⟩ = ∫₀¹ p(x)q(x) dx defines an inner product on P2 .
(b) Use the Gram–Schmidt process to find an orthonormal basis for P2 equipped
with this inner product.
3.10. Find an orthonormal basis for the orthogonal complement of the subspace of R4
spanned by (1, 1, 1, 1) and (0, 1, 2, 1).
3.11. Find the orthogonal projections of u = (1, 2, 3, 4) on the subspaces of R4 spanned
by
(a) (1, 1, 1, 1) and (1, −1, 1, −1),
(b) (1, 1, 1, 1) and (1, 1, 1, 0),
and compute the distances from u to the two subspaces.
3.12. Show that the vectors e1 = (1/√3)(1, 1, 1, 0) and e2 = (1/3)(0, 2, −2, 1) form an orthonormal set in R4 . Extend this set to an orthonormal basis for R4 by first finding a
basis v1 , v 2 for the orthogonal complement of U = [e1 , e2 ] and then applying the
Gram–Schmidt process to v 1 and v 2 .
3.13. Let e1 , . . . , en be an orthonormal set in an inner product space V and let u be
any vector in V .
(a) Show that
$$\|u'\|^2 = \sum_{k=1}^{n} \langle u, e_k\rangle^2$$
4 Determinants
Proof. We begin by showing the first statement in the special case where j = i + 1. Since
F is alternating, we have
F (. . . , ui + uj , ui + uj , . . . ) = 0.
0 = F (. . . , ui + uj , ui + uj , . . . ) = F (. . . , ui , ui + uj , . . . ) + F (. . . , uj , ui + uj , . . . )
= F (. . . , ui , ui , . . . ) + F (. . . , ui , uj , . . . ) + F (. . . , uj , ui , . . . ) + F (. . . , uj , uj , . . . )
= F (. . . , ui , uj , . . . ) + F (. . . , uj , ui , . . . ).
Hence, F (. . . , ui , uj , . . . ) = −F (. . . , uj , ui , . . . ).
For the last statement, we interchange successively adjacent vectors until we obtain an
n-tuple of vectors having two equal adjacent vectors. Since the resulting function value
is zero and can differ from the original value only by a sign, also the original function
value must be zero.
To show the first statement in general, we assume that i 6= j. It then follows from the
last statement and the multilinearity that
0 = F (. . . , ui + uj , . . . , ui + uj , . . . )
= F (. . . , ui , . . . , ui , . . . ) + F (. . . , ui , . . . , uj , . . . )
+ F (. . . , uj , . . . , ui , . . . ) + F (. . . , uj , . . . , uj , . . . )
= F (. . . , ui , . . . , uj , . . . ) + F (. . . , uj , . . . , ui , . . . ).
Corollary 4.3. The value of F does not change if a multiple of the vector in one position
is added to the vector in another position.
Proof. If i 6= j, we have
F (. . . , ui , . . . , uj + sui , . . . ) = F (. . . , ui , . . . , uj , . . . ) + sF (. . . , ui , . . . , ui , . . . )
= F (. . . , ui , . . . , uj , . . . ).
Theorem 4.4. Let V be an n-dimensional linear space with basis e1 , . . . , en and let F
be an n-multilinear alternating form on V . If F (e1 , . . . , en ) = 0, then F (u1 , . . . , un ) = 0
for all n-tuples (u1 , . . . , un ) of vectors in V .
where the sum is taken over all n-tuples (i1 , . . . , in ) of indices. The assertion now follows
from the fact that all terms of the sum are zero.
Corollary 4.5. Let V be an n-dimensional linear space with basis e1 , . . . , en and let
F and G be n-multilinear alternating forms on V . If F (e1 , . . . , en ) = G(e1 , . . . , en ),
then F (u1 , . . . , un ) = G(u1 , . . . , un ) for all n-tuples (u1 , . . . , un ) of vectors in V .
4.2 Definition of Determinants
Consider the linear space V of n×1 columns. The columns Ik , k = 1, . . . , n, of the unit
matrix I form a basis for V . When D is viewed as a function V n → R, conditions (i)
and (ii) mean that D is an n-multilinear alternating form on V . By Corollary 4.5 and
condition (iii), a determinant is uniquely determined if it exists.
We shall now define determinants of all orders recursively. Let A be an n × n matrix
where n ≥ 2. By Aik we mean the (n − 1) × (n − 1) matrix obtained from A by deleting
row i and column k. Hence, if
$$A = \begin{bmatrix} a_{11} & \cdots & a_{1(k-1)} & a_{1k} & a_{1(k+1)} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{(i-1)1} & \cdots & a_{(i-1)(k-1)} & a_{(i-1)k} & a_{(i-1)(k+1)} & \cdots & a_{(i-1)n} \\ a_{i1} & \cdots & a_{i(k-1)} & a_{ik} & a_{i(k+1)} & \cdots & a_{in} \\ a_{(i+1)1} & \cdots & a_{(i+1)(k-1)} & a_{(i+1)k} & a_{(i+1)(k+1)} & \cdots & a_{(i+1)n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n(k-1)} & a_{nk} & a_{n(k+1)} & \cdots & a_{nn} \end{bmatrix},$$
then
$$A_{ik} = \begin{bmatrix} a_{11} & \cdots & a_{1(k-1)} & a_{1(k+1)} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{(i-1)1} & \cdots & a_{(i-1)(k-1)} & a_{(i-1)(k+1)} & \cdots & a_{(i-1)n} \\ a_{(i+1)1} & \cdots & a_{(i+1)(k-1)} & a_{(i+1)(k+1)} & \cdots & a_{(i+1)n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n(k-1)} & a_{n(k+1)} & \cdots & a_{nn} \end{bmatrix}.$$
$$D_n(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D_{n-1}(A_{ij}) = (-1)^{i+1}a_{i1}D_{n-1}(A_{i1}) + (-1)^{i+2}a_{i2}D_{n-1}(A_{i2}) + \cdots + (-1)^{i+n}a_{in}D_{n-1}(A_{in})$$
is a determinant of order n.
Proof. We denote Dn−1 by D in this proof. Assume that A = [aik ]n×n and let A′ and B
be the matrices obtained from A by replacing its kth column with
′
a1k sa1k + ta′1k
.. ..
. .
′
a and saik + ta′ ,
ik ik
.. ..
. .
′
ank ′
sank + tank
respectively. Plainly, Bik = Aik = A′ik and bik = saik +ta′ik . If j 6= k, then bij = aij = a′ij .
In this case we also have D(Bij ) = sD(Aij ) + tD(A′ij ) since D is assumed to satisfy
condition (i). Therefore,
$$\begin{aligned}
D_n(B) &= \sum_{j=1}^{n} (-1)^{i+j} b_{ij} D(B_{ij}) = (-1)^{i+k} b_{ik} D(B_{ik}) + \sum_{j\neq k} (-1)^{i+j} b_{ij} D(B_{ij})\\
&= (-1)^{i+k}(sa_{ik}+ta'_{ik})D(B_{ik}) + \sum_{j\neq k} (-1)^{i+j} b_{ij}\bigl(sD(A_{ij})+tD(A'_{ij})\bigr)\\
&= (-1)^{i+k}\bigl(sa_{ik}D(A_{ik})+ta'_{ik}D(A'_{ik})\bigr) + \sum_{j\neq k} (-1)^{i+j}\bigl(sa_{ij}D(A_{ij})+ta'_{ij}D(A'_{ij})\bigr)\\
&= s\sum_{j=1}^{n} (-1)^{i+j} a_{ij} D(A_{ij}) + t\sum_{j=1}^{n} (-1)^{i+j} a'_{ij} D(A'_{ij}) = sD_n(A)+tD_n(A').
\end{aligned}$$
Definition 4.7.
We define the function $D_n : M_n \to \mathbf{R}$ recursively as follows. For $1 \times 1$ matrices $A = [a]$, we set $D_1(A) = a$. When A is an n × n matrix where n ≥ 2, we set
$$D_n(A) = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} D_{n-1}(A_{1j}) = a_{11}D_{n-1}(A_{11}) - a_{12}D_{n-1}(A_{12}) + \cdots + (-1)^{1+n}a_{1n}D_{n-1}(A_{1n}).$$
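The recursion in Definition 4.7 translates directly into a short (though very inefficient) program. The following Python sketch is not part of the text; it merely illustrates the definition by expanding along the first row.

def det(A):
    """Determinant by cofactor expansion along the first row (Definition 4.7)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]   # delete row 1 and column j + 1
        total += (-1) ** j * A[0][j] * det(minor)          # (-1)**j equals (-1)**(1 + (j + 1))
    return total

print(det([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))              # 0, as in Example 4.9 below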
Theorem 4.8. The function Dn is a determinant of order n for every positive integer n.
Proof. The function D1 satisfies condition (ii) for the simple reason that a square matrix
of order 1 has no adjacent columns. The other two conditions are trivially satisfied.
Hence D1 is a determinant. The statement now follows by induction on n, the induction
step being supplied by Theorem 4.6.
Hence, determinants exist of all orders and are unique. In order not to overload the
notation, we shall from now on denote determinants of all orders by D. Other notations
used for D(A), where A = [aik ]n×n , are det A and
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn}\end{vmatrix}.$$
For 2 × 2 matrices, the definition yields
$$\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\end{vmatrix}
= (-1)^{1+1}a_{11}\,|a_{22}| + (-1)^{1+2}a_{12}\,|a_{21}| = a_{11}a_{22} - a_{12}a_{21}.$$
Do not mistake a22 and a21 for absolute values here. To avoid this ambiguity, we shall
never again use this notation for determinants of order 1.
Example 4.9. Using the definition and the above formula for determinants of order 2,
we find that the determinant of the matrix
$$A = \begin{bmatrix}1&2&3\\4&5&6\\7&8&9\end{bmatrix}$$
is
$$\det A = \begin{vmatrix}1&2&3\\4&5&6\\7&8&9\end{vmatrix}
= (-1)^{2}\cdot 1\cdot\begin{vmatrix}5&6\\8&9\end{vmatrix}
+ (-1)^{3}\cdot 2\cdot\begin{vmatrix}4&6\\7&9\end{vmatrix}
+ (-1)^{4}\cdot 3\cdot\begin{vmatrix}4&5\\7&8\end{vmatrix}
= 5\cdot 9 - 6\cdot 8 - 2(4\cdot 9 - 6\cdot 7) + 3(4\cdot 8 - 5\cdot 7) = 0.$$
The reader probably recognises this as Sarrus’s rule for determinants of order 3. Hence,
for determinants of order 2 and 3, our definition agrees with the ones usually stated for
such determinants.
Theorem 4.10. Let A be an n × n matrix, where n ≥ 2, and i an integer such that 1 ≤ i ≤ n. Then
$$D(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D(A_{ij}).$$
Proof. By Theorem 4.6, the right-hand side is a determinant of order n. The equality
therefore follows from the uniqueness of determinants.
4.3 Properties of Determinants
Since the columns of A are the rows of At , Theorems 4.10 and 4.11 yield the following
theorem.
Theorem 4.12. Let A be an n × n matrix, where n ≥ 2, and j an integer such that
1 ≤ j ≤ n. Then
$$D(A) = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} D(A_{ij})
= (-1)^{1+j}a_{1j}D(A_{1j}) + (-1)^{2+j}a_{2j}D(A_{2j}) + \cdots + (-1)^{n+j}a_{nj}D(A_{nj}).$$
Theorem 4.13. The value of a determinant does not change when a multiple of a column
is added to another column or when a multiple of a row is added to another row.
Proof. The statement concerning columns is contained in Corollary 4.3. Since the rows
of a matrix are the columns of its transpose, the statement about rows follows from
Theorem 4.11.
Theorem 4.14. D(A) = 0 if two columns of A are equal or two rows of A are equal.
D(A) changes by a sign if two columns of A are interchanged or two rows of A are
interchanged.
Proof. The statements about columns are translations into the language of determinants
of the statements of Theorem 4.2. Their row analogues follow from Theorem 4.11.
Theorems 4.10, 4.12, 4.13 and 4.14 provide efficient tools for evaluation of determinants.
The formulae of the first two theorems are called expansion along a row and column,
respectively.
Using the recursive definition of determinants amounts to successive expansions along
the first row. A determinant of order 4 first splits into 4 determinants of order 3, and
then each of these splits into 3 determinants of order 2. Hence, we must evaluate 12
determinants of order 2.
By Theorem 4.13, a determinant can be transformed into one containing a row or
column having at most one non-zero entry. The expansion along the transformed row
or column contains at most one non-zero term. Correct use of the tools decreases the
workload significantly.
Example 4.15. We demonstrate the tools by, once again, evaluating the determinant
in Example 4.9. For the sake of easy calculations we choose to produce zeros in the third
column. By subtracting twice the first row from the second row and thrice the first row
from the last row, we obtain
$$\begin{vmatrix}1&2&3\\4&5&6\\7&8&9\end{vmatrix} = \begin{vmatrix}1&2&3\\2&1&0\\4&2&0\end{vmatrix}.$$
Expanding along column 3, we get the same value as before:
$$\begin{vmatrix}1&2&3\\2&1&0\\4&2&0\end{vmatrix}
= (-1)^{1+3}\cdot 3\cdot\begin{vmatrix}2&1\\4&2\end{vmatrix}
+ (-1)^{2+3}\cdot 0\cdot\begin{vmatrix}1&2\\4&2\end{vmatrix}
+ (-1)^{3+3}\cdot 0\cdot\begin{vmatrix}1&2\\2&1\end{vmatrix}
= 3(2\cdot 2 - 1\cdot 4) = 0.$$
The zero terms are written out for the convenience of the reader. In fact, the whole
evaluation fits on a single line:
$$\begin{vmatrix}1&2&3\\4&5&6\\7&8&9\end{vmatrix}
= \begin{vmatrix}1&2&3\\2&1&0\\4&2&0\end{vmatrix}
= (-1)^{1+3}\cdot 3\cdot\begin{vmatrix}2&1\\4&2\end{vmatrix}
= 3(2\cdot 2 - 1\cdot 4) = 0.$$
The reader is discouraged from using Sarrus’s rule. The reason for this is twofold.
Firstly, it applies only to determinants of order 3. Secondly, it requires unnecessarily
long calculations.
Let us evaluate the determinant
$$\begin{vmatrix}3&8&4&6\\5&3&2&4\\7&11&4&3\\11&13&6&10\end{vmatrix}.$$
Column 3 is best suited for elimination since its entries are integral multiples of its entry
in row 2. We subtract twice the second row from the first and third rows and thrice the
same row from the last row and get
$$\begin{vmatrix}-7&2&0&-2\\5&3&2&4\\-3&5&0&-5\\-4&4&0&-2\end{vmatrix}.$$
Expanding along the third column, we get
$$(-1)^{2+3}\cdot 2\cdot\begin{vmatrix}-7&2&-2\\-3&5&-5\\-4&4&-2\end{vmatrix}
= -2\begin{vmatrix}-7&2&-2\\-3&5&-5\\-4&4&-2\end{vmatrix}.$$
We now choose to eliminate in row 3 for the same reason as before. Hence, we subtract
twice the last column from the first column and add twice the same column to the second
column. Thus we get
$$-2\begin{vmatrix}-3&-2&-2\\7&-5&-5\\0&0&-2\end{vmatrix}.$$
Expanding along the third row and then using the formula for determinants of order 2,
we obtain
$$-2\,(-1)^{3+3}(-2)\begin{vmatrix}-3&-2\\7&-5\end{vmatrix} = 4\bigl((-3)(-5) - (-2)\cdot 7\bigr) = 116.$$
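As a purely numerical cross-check (not part of the original text), NumPy's built-in determinant routine reproduces both values computed above:

import numpy as np
print(round(np.linalg.det(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float))))   # 0
B = np.array([[3, 8, 4, 6], [5, 3, 2, 4], [7, 11, 4, 3], [11, 13, 6, 10]], dtype=float)
print(round(np.linalg.det(B)))                                                          # 116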
An upper triangular square matrix has zeros below its main diagonal. Likewise, a
lower triangular square matrix has zeros above its main diagonal. A diagonal square
matrix has zeros outside its main diagonal. Hence, a diagonal matrix is upper and lower
triangular. The determinant of a matrix of any of these kinds equals the product of the
diagonal entries of that matrix. For an upper triangular matrix, this can be seen by
successively expanding along the first column:
Let us evaluate the determinant
$$\begin{vmatrix} x&1&1&\cdots&1\\ 1&x&1&\cdots&1\\ 1&1&x&\cdots&1\\ \vdots&\vdots&\vdots&&\vdots\\ 1&1&1&\cdots&x\end{vmatrix}$$
of order n. Observing that all the row sums are equal, we add the last n − 1 columns to
the first column and obtain
$$\begin{vmatrix} x+n-1&1&1&\cdots&1\\ x+n-1&x&1&\cdots&1\\ x+n-1&1&x&\cdots&1\\ \vdots&\vdots&\vdots&&\vdots\\ x+n-1&1&1&\cdots&x\end{vmatrix}.$$
Now, subtracting the first row from the other rows, we get
$$\begin{vmatrix} x+n-1&1&1&\cdots&1\\ 0&x-1&0&\cdots&0\\ 0&0&x-1&\cdots&0\\ \vdots&\vdots&\vdots&&\vdots\\ 0&0&0&\cdots&x-1\end{vmatrix}.$$
Since this determinant is upper triangular, its value is $(x+n-1)(x-1)^{n-1}$.
Proof. Let, for a fixed matrix A, the mappings D ′ and D ′′ from Mn to R be defined by
D ′ (B) = D(A)D(B) and D ′′ (B) = D(AB). When viewed as functions V n → R, where
V is the linear space of n×1 columns, these mappings are n-multilinear alternating forms
on V . Since D is such a form, this statement is trivial for D ′ . For D ′′ , it follows from
the fact that D ′′ (B1 , . . . , Bn ) = D(AB1 , . . . , ABn ). Therefore, and since D ′ (I) = D ′′ (I),
Corollary 4.5 yields that D ′ (B) = D ′′ (B) for all n × n matrices B.
Theorem 4.19. A square matrix A is invertible if and only if D(A) 6= 0, and in that
case D(A−1 ) = (D(A))−1 .
Proof. Assume that A is invertible with inverse A−1 . Then it follows from Theorem 4.18
that D(A)D(A−1 ) = D(AA−1 ) = D(I) = 1, whence D(A) 6= 0 and D(A−1 ) = (D(A))−1 .
The converse amounts to saying that D(A) = 0 if A is not invertible.
Assume that A is
not invertible and let n be the order of A. If n = 1, then A = 0 and hence D(A) = 0 in
this case. Otherwise, the columns of A are linearly dependent, and therefore one column
Ak is a linear combination of the other columns. Thus, $A_k = \sum_{i\neq k} s_iA_i$
for some real numbers s1, ..., sk−1, sk+1, ..., sn, and therefore, by linearity in the kth
argument,
$$D(A) = D(A_1, \ldots, A_{k-1}, A_k, A_{k+1}, \ldots, A_n)
= \sum_{i\neq k} s_i\,D(A_1, \ldots, A_{k-1}, A_i, A_{k+1}, \ldots, A_n).$$
Each determinant in the sum has two equal columns, and is therefore equal to zero by
Theorem 4.14. Hence, D(A) = 0 also in this case.
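Both the product theorem and Theorem 4.19 are easy to confirm numerically. The following NumPy sketch is only an illustration; the matrices are chosen arbitrarily.

import numpy as np
A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 2.]])
B = np.array([[1., 0., 2.], [0., 1., 1.], [1., 1., 0.]])
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))   # True: D(AB) = D(A)D(B)
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))       # True: D(A^-1) = D(A)^-1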
Example 4.20. Let us, for every a ∈ R, find the dimensions of ker A and im A where
$$A = \begin{bmatrix} a&2&1&2\\ 2&a&2&1\\ 1&2&a&2\\ 2&1&2&a\end{bmatrix}.$$
By Theorem 4.19, A is invertible if and only if D(A) 6= 0, and then dim ker A = 0 and
dim im A = 4. We observe that the column sums are equal. We choose to add the first
three rows to the last row. Thus we get
Example 4.21. A plane is parallel to the vectors u = (1, 2, 3) and v = (1, −1, 2) and
passes through the point Q = (2, 1, 4). A point P = (x, y, z) lies in the plane if and only
if the vectors $\overrightarrow{QP} = (x - 2, y - 1, z - 4)$, u and v are linearly dependent. This in turn is
equivalent to the determinant of the matrix with columns equal to the coordinate vectors
being zero. Hence, the equation of the plane is
$$0 = \begin{vmatrix} x-2 & 1 & 1\\ y-1 & 2 & -1\\ z-4 & 3 & 2\end{vmatrix}
= (x-2)\begin{vmatrix}2&-1\\3&2\end{vmatrix}
+ (y-1)(-1)\begin{vmatrix}1&1\\3&2\end{vmatrix}
+ (z-4)\begin{vmatrix}1&1\\2&-1\end{vmatrix}
= 7(x-2) + 1\cdot(y-1) - 3(z-4) = 7x + y - 3z - 3.$$
Theorem 4.22 (Cramer’s rule). Let A be a square matrix of order n and assume that
AX = Y where X and Y are column vectors. Then
The second equality follows from the linearity in the ith argument and the third equality
follows from the fact that the kth determinant in the sum has two equal columns when
k 6= i. The last statement of the theorem should now be obvious.
Theorem 4.24. Let A = [aik ]n×n be a square matrix. Then AÃ = ÃA = D(A)I. In
particular, if A is invertible, then
$$A^{-1} = \frac{1}{D(A)}\tilde{A}.$$
Proof. The assertions are trivial for n = 1. Otherwise, the ikth entry of AÃ is
$$b_{ik} = \sum_{j=1}^{n} a_{ij}(-1)^{k+j}D(A_{kj}).$$
If k = i, this sum is the expansion of D(A) along the ith row. Hence, bii = D(A) for
i = 1, . . . , n. If k 6= i, let A′ be the matrix obtained from A by replacing the kth row by
the ith row and leaving all other rows unchanged. Then A′kj = Akj , and hence
$$b_{ik} = \sum_{j=1}^{n} a_{ij}(-1)^{k+j}D(A'_{kj}).$$
This is the expansion of D(A′ ) along the ith row. Since two rows of A′ are equal, we
have D(A′) = 0. Hence, bik = D(A′) = 0 if i ≠ k. This shows that AÃ = D(A)I. It
follows from the definition of the adjugate that $\tilde{A}^t = \widetilde{A^t}$. Hence
$$(\tilde{A}A)^t = A^t\tilde{A}^t = A^t\widetilde{A^t} = D(A^t)I = D(A)I,$$
and therefore also ÃA = D(A)I.
The methods mentioned earlier for solving systems of equations and finding inverses
usually involve much shorter calculations than the methods of the last two theorems. One
exception to this is the following formula for the inverse of an invertible 2 × 2 matrix:
$$\begin{bmatrix} a_{11}&a_{12}\\ a_{21}&a_{22}\end{bmatrix}^{-1}
= \frac{1}{a_{11}a_{22}-a_{12}a_{21}}\begin{bmatrix} a_{22}&-a_{12}\\ -a_{21}&a_{11}\end{bmatrix}.$$
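The adjugate construction in Theorem 4.24 can be written out as a small program. The sketch below is not taken from the text; it builds the adjugate from cofactors and recovers the 2 × 2 inverse formula above.

import numpy as np

def adjugate(A):
    """Transpose of the cofactor matrix, so that A @ adjugate(A) equals det(A) times the identity."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    adj = np.empty((n, n))
    for i in range(n):
        for k in range(n):
            minor = np.delete(np.delete(A, k, axis=0), i, axis=1)
            adj[i, k] = (-1) ** (i + k) * np.linalg.det(minor)
    return adj

A = np.array([[1., 2.], [3., 4.]])
print(adjugate(A) / np.linalg.det(A))   # the inverse of A, as given by the 2 x 2 formula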
Example 4.25. Let A be an invertible square matrix with integer entries. Then A−1
has integer entries if and only if D(A) = ±1.
First assume that A−1 has integer entries. Then D(A) and D(A−1 ) are integers. By
the product theorem, D(A)D(A−1 ) = 1, and hence D(A) = ±1.
Since the entries of A are integers, also the entries of its adjugate are integers. If
D(A) = ±1, it therefore follows from Theorem 4.24 that the entries of A−1 are integers.
Exercises
4.1. Evaluate the following determinants.
(a) $\begin{vmatrix}1&1&2\\3&2&9\\7&2&5\end{vmatrix}$,  (b) $\begin{vmatrix}666&667&669\\667&668&670\\669&670&671\end{vmatrix}$,  (c) $\begin{vmatrix}1&a&b+c\\1&b&c+a\\1&c&a+b\end{vmatrix}$.
$$\begin{vmatrix}0&1&2&\cdots&n-1\\1&0&1&\cdots&n-2\\2&1&0&\cdots&n-3\\\vdots&\vdots&\vdots&&\vdots\\n-1&n-2&n-3&\cdots&0\end{vmatrix}.$$
$$\begin{vmatrix}1&-(x_1+x_2)&x_1x_2&0\\0&1&-(x_1+x_2)&x_1x_2\\1&-(y_1+y_2)&y_1y_2&0\\0&1&-(y_1+y_2)&y_1y_2\end{vmatrix}
= (x_1-y_1)(x_1-y_2)(x_2-y_1)(x_2-y_2).$$
$$\begin{vmatrix}a_0&a_1&a_2&0\\0&a_0&a_1&a_2\\b_0&b_1&b_2&0\\0&b_0&b_1&b_2\end{vmatrix} = 0.$$
Hint for the induction step: Perform column operations to obtain a determinant
with zeros in all positions of the first row except the first position.
4.17. Let
$$A = \begin{bmatrix}a&b&c&d\\b&-a&d&-c\\c&-d&-a&b\\d&c&-b&-a\end{bmatrix}.$$
Show that det A = 0 only if a = b = c = d = 0. Hint: Consider AAt .
4.18. Let V be the volume of the parallelepiped spanned by the vectors u, v and w in
3-space. Show that
$$V^2 = \begin{vmatrix}\langle u,u\rangle&\langle u,v\rangle&\langle u,w\rangle\\\langle v,u\rangle&\langle v,v\rangle&\langle v,w\rangle\\\langle w,u\rangle&\langle w,v\rangle&\langle w,w\rangle\end{vmatrix}.$$
Hint: Choose an orthonormal basis and use the product theorem.
4.19. Use Theorem 4.24 to find the inverses of the following matrices.
(a) $\begin{bmatrix}1&2\\3&4\end{bmatrix}$,  (b) $\begin{bmatrix}1&1&2\\0&3&1\\2&1&0\end{bmatrix}$.
5 Linear Transformations
According to Definition 2.64 on page 26, a linear transformation F from U to V is a
function F : U → V such that $F(su + tv) = sF(u) + tF(v)$
for all u and v in U and all real numbers s and t. If U = V, we also say that F is a
linear transformation on V . We shall here study linear transformations on a linear space
in more detail.
Y = x1 A1 + · · · + xn An = AX
Example 5.3. Let I be the identity mapping on an n-dimensional linear space V and
let e1 , . . . , en be any basis for V . Since I(ek ) = ek for k = 1, . . . , n, the matrix of I with
respect to the basis is the unit matrix I of order n.
Example 5.4. Let there be given an orthonormal basis e1 , e2 for the plane and let F
be rotation about the origin through an angle θ. It is apparent from the figure below
that F is a linear transformation from the plane to itself.
[Figure: the images under F of u, v, u + v, su and of the basis vectors e1, e2.]
We also see that F(e1) = (cos θ)e1 + (sin θ)e2 and F(e2) = (− sin θ)e1 + (cos θ)e2, so that the matrix of F with respect to e1, e2 is
$$A = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}.$$
If θ = π/3, then
$$A = \frac{1}{2}\begin{bmatrix}1 & -\sqrt{3}\\ \sqrt{3} & 1\end{bmatrix},$$
and the vector u with coordinates (1, 2) is mapped to the vector F(u) with coordinates
$$A\begin{bmatrix}1\\2\end{bmatrix} = \frac{1}{2}\begin{bmatrix}1-2\sqrt{3}\\ \sqrt{3}+2\end{bmatrix}.$$
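Numerically, the rotation through π/3 can be reproduced as follows (a NumPy sketch, not part of the text):

import numpy as np
theta = np.pi / 3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(A @ np.array([1.0, 2.0]))   # [(1 - 2*sqrt(3))/2, (sqrt(3) + 2)/2], as computed above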
$$F(p) = \frac{d^2p}{dx^2} - \frac{dp}{dx}.$$
F (e1 ) = 0,
F (e2 ) = −1 = −e1 ,
F (e3 ) = 2 − 2x = 2e1 − 2e2 ,
F (e4 ) = 6x − 3x2 = 6e2 − 3e3 .
Example 5.7. Let there be given an orthonormal basis e1 , e2 , e3 for 3-space and let F
be orthogonal projection on the plane 2x1 + 2x2 + x3 = 0. By Theorem 2.60, F is a
linear transformation. The vector e = 13 (2, 2, 1) is a unit normal vector to the plane.
The orthogonal projections of the basis vectors on the normal of the plane are
$$e_1'' = \langle e_1, e\rangle e = \frac{1\cdot 2 + 0\cdot 2 + 0\cdot 1}{3}\cdot\frac{1}{3}(2,2,1) = \frac{2}{9}(2,2,1),$$
$$e_2'' = \langle e_2, e\rangle e = \frac{0\cdot 2 + 1\cdot 2 + 0\cdot 1}{3}\cdot\frac{1}{3}(2,2,1) = \frac{2}{9}(2,2,1),$$
$$e_3'' = \langle e_3, e\rangle e = \frac{0\cdot 2 + 0\cdot 2 + 1\cdot 1}{3}\cdot\frac{1}{3}(2,2,1) = \frac{1}{9}(2,2,1).$$
Hence, their projections on the plane are
$$e_1' = e_1 - e_1'' = (1,0,0) - \tfrac{2}{9}(2,2,1) = \tfrac{1}{9}(5,-4,-2),$$
$$e_2' = e_2 - e_2'' = (0,1,0) - \tfrac{2}{9}(2,2,1) = \tfrac{1}{9}(-4,5,-2),$$
$$e_3' = e_3 - e_3'' = (0,0,1) - \tfrac{1}{9}(2,2,1) = \tfrac{1}{9}(-2,-2,8),$$
and the matrix of F with respect to the given basis is
$$\frac{1}{9}\begin{bmatrix}5&-4&-2\\-4&5&-2\\-2&-2&8\end{bmatrix}.$$
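The same matrix can be produced numerically as I − e eᵗ, where e is the unit normal written as a column; this is simply the computation of the example packaged as an outer product. A NumPy sketch (illustrative only):

import numpy as np
e = np.array([2.0, 2.0, 1.0]) / 3.0      # unit normal of the plane 2x1 + 2x2 + x3 = 0
P = np.eye(3) - np.outer(e, e)           # subtract the projection on the normal
print(9 * P)                             # [[5, -4, -2], [-4, 5, -2], [-2, -2, 8]]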
If A is an n × n matrix such that Ax = y whenever
F(x1 e1 + · · · + xn en) = y1 e1 + · · · + yn en,
then A is the matrix of F with respect to the basis. In fact, by setting x = εi we find
that the ith column of A is the coordinate vector of F (ei ). We use this observation in
the next example.
Example 5.8. Let us find the matrix of projection on the plane U ′ : x1 + 2x2 + 3x3 = 0
along the line U ′′ : x = t(2, 1, 2) in Example 2.63. Let the coordinates of u be (ξ1 , ξ2 , ξ3 )
and form the line x = (ξ1 , ξ2 , ξ3 ) + t(2, 1, 2). The image u′ of u is the intersection of this
line with the plane. The intersection is given by
$$\xi_1 + 2t + 2(\xi_2 + t) + 3(\xi_3 + 2t) = 0 \iff t = -\frac{\xi_1 + 2\xi_2 + 3\xi_3}{10}.$$
[Figure: the vector u, its projection u′ on the plane U′ along the line U′′, and the component u′′.]
Hence,
$$\begin{aligned}u' &= (\xi_1,\xi_2,\xi_3) + t(2,1,2)\\
&= \tfrac{1}{10}\bigl((10\xi_1, 10\xi_2, 10\xi_3) - (2\xi_1+4\xi_2+6\xi_3,\ \xi_1+2\xi_2+3\xi_3,\ 2\xi_1+4\xi_2+6\xi_3)\bigr)\\
&= \tfrac{1}{10}(8\xi_1-4\xi_2-6\xi_3,\ -\xi_1+8\xi_2-3\xi_3,\ -2\xi_1-4\xi_2+4\xi_3).\end{aligned}$$
The matrix is therefore
$$\frac{1}{10}\begin{bmatrix}8&-4&-6\\-1&8&-3\\-2&-4&4\end{bmatrix}.$$
Theorem 5.9. Let F and G be linear transformations on a linear space V with basis
e1 , . . . , en and assume that the matrices of F and G are A and B, respectively. Then
the composition F G is a linear transformation on V with matrix AB.
Proof. F G is a linear transformation by Theorem 2.69. Assume that w = F G(u) where
the coordinates of w and u are z and x, respectively. Set v = G(u) and let the coordin-
ates of v be y. Then w = F (v), z = Ay, y = Bx, and therefore z = ABx.
From Theorem 2.73 we know that a linear transformation F on a finite-dimensional linear
space V is one-to-one if and only if it is onto. In that case Theorem 2.71 yields that F −1
is a linear transformation on V .
Theorem 5.10. Let F be a linear transformation on a linear space V with basis
e1 , . . . , en and let A be its matrix. Then F is invertible if and only if A is invertible, and
in that case the matrix of F −1 is A−1 .
Proof. The invertibility of F means that the equation F (u) = v has a unique solution
for every v ∈ V . This is equivalent to the equation Ax = y having a unique solution
for every y ∈ Rn , which by Theorem 1.19 means that A is invertible. Assume that F
is invertible and let B be the matrix of F −1 . Since F F −1 = I is the identity mapping,
Theorem 5.9 gives that AB = I, and hence B = A−1 .
Example 5.11. Let F1 and F2 be rotations in 2-space about the origin through the
angles θ1 and θ2 , respectively. Clearly, F = F1 F2 is rotation about the origin through
the angle θ = θ1 + θ2 . From this and Example 5.4, we get
$$\begin{bmatrix}\cos\theta_1&-\sin\theta_1\\\sin\theta_1&\cos\theta_1\end{bmatrix}
\begin{bmatrix}\cos\theta_2&-\sin\theta_2\\\sin\theta_2&\cos\theta_2\end{bmatrix}
= \begin{bmatrix}\cos(\theta_1+\theta_2)&-\sin(\theta_1+\theta_2)\\\sin(\theta_1+\theta_2)&\cos(\theta_1+\theta_2)\end{bmatrix}.$$
This can, of course, also be shown by using the angle addition formulae. In particular,
F1 F2 = I when θ1 = −θ2 . Hence,
$$\begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}^{-1}
= \begin{bmatrix}\cos(-\theta)&-\sin(-\theta)\\\sin(-\theta)&\cos(-\theta)\end{bmatrix}
= \begin{bmatrix}\cos\theta&\sin\theta\\-\sin\theta&\cos\theta\end{bmatrix}.$$
Assume that the coordinates of u with respect to the two bases are x = (x1 , . . . , xn ) and
x′ = (x′1 , . . . , x′n ), respectively. Then
Since the coordinates of u are unique, the first equality now gives that
Hence, x = T x′ where T is the matrix whose columns are the coordinate vectors of
e′1 , . . . , e′n with respect to the basis e1 , . . . , en . Conversely, if x = T x′ whenever
then the kth column of T must be the coordinate vector of e′k with respect to e1 , . . . , en .
This can be seen by setting x′ = εk . We call T the transition matrix from basis e′1 , . . . , e′n
to basis e1 , . . . , en .
If S is the transition matrix from a basis e′′1 , . . . , e′′n to e′1 , . . . , e′n and if
Example 5.12. Consider the bases e1 = (1, 2, 1), e2 = (1, 1, 2), e3 = (1, 4, 0) and
ε1, ε2, ε3 for R3 in Example 2.52. Here the transition matrix from e1, e2, e3 to ε1, ε2, ε3 is
$$T = \begin{bmatrix}1&1&1\\2&1&4\\1&2&0\end{bmatrix}.$$
In order to find the coordinates x = (x1 , x2 , x3 ) of u = (3, 7, 3) = 3ε1 + 7ε2 + 3ε3 with
respect to e1 , e2 , e3 , we can solve the system
$$T\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}3\\7\\3\end{bmatrix}.$$
We did that in Example 2.52, where we obtained x = (1, 1, 1). Another possibility would
be to compute the transition matrix
$$T^{-1} = \begin{bmatrix}8&-2&-3\\-4&1&2\\-3&1&1\end{bmatrix}$$
by matrix multiplication.
Theorem 5.13. Let V be an inner product space. If T is the transition matrix from an
orthonormal basis e′1 , . . . , e′n to an orthonormal basis e1 , . . . , en , then T is orthogonal.
Proof. Let ti = (t1i , . . . , tni ) be the coordinates of e′i with respect to e1 , . . . , en . Since
e1 , . . . , en is an orthonormal basis, it follows from Theorem 3.18 that he′i , e′j i = ti · tj .
Since also e′1 , . . . , e′n is an orthonormal basis, we find that t1 , . . . , tn , and hence the
columns of T , form an orthonormal set in Rn .
Let F be a linear transformation on a linear space V and assume that the matrix of
F is A with respect to a basis e1 , . . . , en for V . Then Ax = y whenever
F (x1 e1 + · · · + xn en ) = y1 e1 + · · · + yn en .
Let us introduce a new basis e′1 , . . . , e′n for V . Denote the transition matrix from
e′1 , . . . , e′n to e1 , . . . , en by T . If
F (x′1 e′1 + · · · + x′n e′n ) = y1′ e′1 + · · · + yn′ e′n ,
then AT x′ = T y ′ and hence T −1AT x′ = y ′ . This shows that the matrix of F with
respect to the new basis is A′ = T −1AT . Thus we have proved the following theorem.
Theorem 5.14. Let F : V → V be a linear transformation with matrix A with respect
to a basis e1 , . . . , en and let T be the transition matrix from a basis e′1 , . . . , e′n to
e1 , . . . , en . Then the matrix of F with respect to e′1 , . . . , e′n is
A′ = T −1AT. (5.1)
When V is an inner product space and both bases are orthonormal, it follows from
Theorem 5.13 that (5.1) can also be written as A′ = T tAT .
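Formula (5.1) is easy to evaluate numerically. In the sketch below (illustrative only; the matrices are invented for the example) A is the matrix of F in the old basis and the columns of T are the coordinates of the new basis vectors:

import numpy as np
A = np.array([[1., 2.], [0., 3.]])        # matrix of F with respect to e1, e2
T = np.array([[1., 1.], [1., 2.]])        # transition matrix from e1', e2' to e1, e2
A_prime = np.linalg.inv(T) @ A @ T        # matrix of F with respect to e1', e2'
print(A_prime)
print(np.isclose(np.linalg.det(A_prime), np.linalg.det(A)))   # True, cf. Definition 5.15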
If A and A′ are the matrices of F with respect to two bases, then A′ = T −1AT for
some matrix T , and hence det A′ = det (T −1AT ) = det T −1 det A det T = det A. This
justifies the following definition.
Definition 5.15. Let F be a linear transformation on a finite-dimensional non-zero
linear space V . We define the determinant det F as det A where A is the matrix of F
with respect to any basis for V .
To show the converse, we assume that P 2 = P . Let u be any vector of V and set
u′ = P (u) and u′′ = u − u′ . Then u = u′ + u′′ and u′ ∈ im P . Since
we also see that u′′ ∈ ker P . Hence V = im P + ker P . We show that V = im P ⊕ ker P
by showing that im P ∩ ker P = {0}. If u ∈ im P ∩ ker P , then u = P (v) for some v ∈ V
and u = P (v) = P 2 (v) = P (u) = 0. Hence, V = im P ⊕ ker P by Theorem 2.61. Now
let u = u′ + u′′ where u′ ∈ im P and u′′ ∈ ker P . Then u′ = P (v) for some v ∈ V ,
and therefore P (u) = P (u′ ) + P (u′′ ) = P (P (v)) = P (v) = u′ . This shows that P is
projection on im P along ker P .
Corollary 5.18. Let V be a linear space with basis e1 , . . . , en and let A be the matrix
of a linear transformation P on V . Then P is a projection if and only if A2 = A.
Since U ′ = ker (P − I), we get U ′ by solving the system (A − I)x = 0 ⇔ 3(A − I)x = 0.
$$\left[\begin{array}{ccc|c}1&3&2&0\\-2&-6&-4&0\\1&3&2&0\end{array}\right]
\iff \left[\begin{array}{ccc|c}1&3&2&0\end{array}\right].$$
Hence, U ′ is the plane x1 + 3x2 + 2x3 = 0. In order to find U ′′ = ker A, we solve the
system Ax = 0.
$$\left[\begin{array}{ccc|c}4&3&2&0\\-2&-3&-4&0\\1&3&5&0\end{array}\right]
\iff \left[\begin{array}{ccc|c}4&3&2&0\\2&0&-2&0\\-3&0&3&0\end{array}\right]
\iff x = t(1,-2,1).$$
Thus we have shown that A is the matrix of projection on the plane x1 + 3x2 + 2x3 = 0
along the line x = t(1, −2, 1).
[Figure: the vector u with components u′ ∈ U′ and u′′ ∈ U′′, together with the reflections R′(u) in U′ along U′′ and R′′(u) in U′′ along U′.]
Corollary 5.22. Let V be a linear space with basis e1 , . . . , en and let A be the matrix
of a linear transformation R on V . Then R is a reflection if and only if A2 = I.
Proof. The statement follows from the fact that ker(R− I) = ker(−2P ′′ ) = ker(P ′′ ) = U ′
and ker(R + I) = ker(2P ′ ) = ker(P ′ ) = U ′′ .
Theorem 5.24. Let V = U ′ ⊕ U ′′ and assume that n = dim V > 0 and k = dim U ′ .
The determinant of the reflection F in U ′ along U ′′ is then (−1)n−k .
Lemma 5.26. Assume that V = U ⊕W where V is an inner product space and U and W
are subspaces of V . If hu, wi = 0 for all u ∈ U and w ∈ W , then W = U ⊥ .
Note that the classes of linear transformations we have discussed so far are not ex-
clusive. For example, the identity mapping I on a 2-dimensional inner product space V
is rotation through the angle 0 about the origin, orthogonal projection on V as well as
orthogonal reflection in V .
5.4 Isometries
Definition 5.32. Let F be a linear transformation on an inner product space V . We
say that F is an isometry if kF (u)k = kuk for all u ∈ V .
Theorem 5.33. F is an isometry if and only if hF (u), F (v)i = hu, vi for all u and v
in V .
Proof. Let u and v be vectors of V and x and y their coordinates. Since the basis is
orthonormal, hu, vi = xt y = xt Iy and hF (u), F (v)i = (Ax)t Ay = xt At Ay. Hence,
by Theorem 5.33, F is an isometry if and only if xt At Ay = xt Iy for all x and y. By
Lemma 5.29, this in turn is equivalent to At A = I.
Note that A need not be orthogonal even if det A = ±1. Hence, the converse of the
statement of the theorem is not true.
Theorem 5.36. Let F be orthogonal reflection in a subspace U of an inner product
space V = U ⊕ U ⊥ . Then F is an isometry.
$$\|F^2(u) - u\|^2 = \|F^2(u)\|^2 + \|u\|^2 - 2\langle F^2(u), u\rangle
= \|u\|^2 + \|u\|^2 - 2\langle F(u), F(u)\rangle = 2\|u\|^2 - 2\|F(u)\|^2 = 0, \qquad u \in V.$$
[Figures: the two possible positions of F(e2) once F(e1) makes the angle θ with e1 — either F is a rotation through θ, or F(e2) is reflected and F is a reflection in the line making the angle θ/2 with e1.]
In the first case, F is rotation through the angle θ about the origin. As we saw in
Example 5.4, the matrix of F is
$$A = \begin{bmatrix}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{bmatrix}.$$
Hence, det F = det A = cos2 θ + sin2 θ = 1 in this case.
In the second case, F is reflection in the bisector of the angle between e1 and F (e1 ).
Comparing with the previous case, we understand that F (e2 ) here is obtained from F (e2 )
there by a change of sign. Hence, the matrix is
$$A = \begin{bmatrix}\cos\theta&\sin\theta\\\sin\theta&-\cos\theta\end{bmatrix},$$
and det F = det A = −1.
Theorem 5.38. An isometry F on a 2-dimensional inner product space is either a ro-
tation about the origin or reflection in a line. In the first case, det F = 1, and in the
second case, det F = −1.
with respect to an orthonormal, positively oriented basis for 3-space. Since AAt = I,
A is orthogonal, and therefore F is an isometry. We solve the system Ax = x.
$$\begin{cases}2x_1 - x_2 + 2x_3 = 3x_1\\ 2x_1 + 2x_2 - x_3 = 3x_2\\ -x_1 + 2x_2 + 2x_3 = 3x_3\end{cases}
\iff \left[\begin{array}{ccc|c}-1&-1&2&0\\2&-1&-1&0\\-1&2&-1&0\end{array}\right]
\iff x = t(1,1,1).$$
Hence, F is a rotation about the line U = [(1, 1, 1)]. We set w = (1, 1, 1). A vector
orthogonal to w is, for example, u = (1, −1, 0). We have v = F (u) = (1, 0, −1), and the
angle θ between u and v is given by
$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|} = \frac{1}{2}.$$
The angle of rotation is therefore θ = π/3. The determinant of the matrix having columns
u, v, w is
$$\begin{vmatrix}1&1&1\\-1&0&1\\0&-1&1\end{vmatrix} = 3 > 0.$$
Therefore, the rotation appears anticlockwise on looking from the point (1, 1, 1) towards
the origin.
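For readers who wish to check such computations numerically, the sketch below reconstructs the matrix from the system above (so A = (1/3)[[2, −1, 2], [2, 2, −1], [−1, 2, 2]]) and recovers the axis and the angle; the trace formula used for the angle is a standard fact not derived in the text.

import numpy as np
A = np.array([[2., -1., 2.], [2., 2., -1.], [-1., 2., 2.]]) / 3.0
print(np.allclose(A @ A.T, np.eye(3)))                 # True: A is orthogonal
w, V = np.linalg.eig(A)
axis = np.real(V[:, np.isclose(w, 1)]).ravel()         # eigenvector for the eigenvalue 1
print(axis / axis[0])                                  # proportional to (1, 1, 1)
print(np.degrees(np.arccos((np.trace(A) - 1) / 2)))    # 60 degrees, i.e. the angle pi/3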
By Theorem 5.39, F is a rotation if and only if t = 1. Hence (a, b, c) = (2, −1, 2).
Exercises
5.1. Let (x1 , x2 ) be the coordinates with respect to an orthonormal basis e1 , e2 for
2-space. Find the matrices with respect to that basis of the following linear
transformations on 2-space.
(a) Rotation a quarter-turn in the direction from the x1 -axis to the x2 -axis.
(b) Orthogonal projection on the line 3x1 = 4x2 .
5.2. Find the matrices of the following linear transformations on 3-space with respect
to an orthonormal, positively oriented basis e1 , e2 , e3 .
(a) Anticlockwise rotation about the x2-axis through the angle π/6.
(b) Orthogonal projection on the plane x1 + 2x2 − 2x3 = 0.
5.3. Let e1 , e2 be a basis for 2-space. Find the matrix with respect to that basis of
projection on the line x1 = x2 along the x1 -axis.
5.4. Let e1 , e2 , e3 be a basis for 3-space.
(a) Find the matrix of projection on the plane x1 + x2 + x3 = 0 along the line
x = t(1, 2, 3).
(b) Find the matrix of projection on the line along the plane.
with respect to a basis e1 , e2 , e3 . Find the matrix of F with respect to the basis
e′1 = e1 + e2 − e3 ,
e′2 = 2e1 + e2 + e3 ,
e′3 = 2e1 + e2 + 2e3 .
5.12. Let e1 , e2 , e3 be an orthonormal basis for R3 and consider the plane with equation
x1 + 2x2 − 2x3 = 0. Find the matrix of orthogonal reflection in that plane with
respect to the given basis.
5.13. Let e1 , e2 , e3 be a basis for R3 . Find the matrix with respect to that basis of
reflection in the plane 2x1 − x2 − 3x3 = 0 along the line x = t(1, −2, 1).
5.14. Let I be the unit matrix of order n and B an n × 1 column vector of unit length.
Explain the geometric meaning of the so-called Householder matrix I − 2BB t .
5.15. The matrices below are matrices of linear transformations on 2-space with respect
to an orthonormal basis. Show that the linear transformations are isometries and
find their geometric meaning.
(a) $\frac{1}{2}\begin{bmatrix}\sqrt{3}&1\\-1&\sqrt{3}\end{bmatrix}$,  (b) $\frac{1}{2}\begin{bmatrix}-1&\sqrt{3}\\\sqrt{3}&1\end{bmatrix}$,  (c) $\frac{1}{\sqrt{2}}\begin{bmatrix}1&-1\\1&1\end{bmatrix}$.
5.16. The matrices below are matrices of linear transformations on 3-space with respect
to an orthonormal, positively oriented basis. Show that the linear transformations
are isometries and find their geometric meaning.
(a) $\frac{1}{9}\begin{bmatrix}8&-4&1\\-1&-4&-8\\4&7&-4\end{bmatrix}$,  (b) $\frac{1}{9}\begin{bmatrix}1&4&8\\4&7&-4\\8&-4&1\end{bmatrix}$,  (c) $\begin{bmatrix}0&1&0\\0&0&1\\1&0&0\end{bmatrix}$,
(d) $\frac{1}{7}\begin{bmatrix}-6&2&3\\2&-3&6\\3&6&2\end{bmatrix}$,  (e) $\frac{1}{3}\begin{bmatrix}-2&1&-2\\-2&-2&1\\1&-2&-2\end{bmatrix}$.
5.17. (a) Let F and G be rotations about lines in 3-space. Show, for example by using
determinants, that F G and GF are also rotations about lines.
(b) Let F and G be orthogonal reflections in two different planes through the
origin in 3-space. Show that F G and GF are rotations about the line of
intersection between the planes. What are the angles of rotation?
6 Eigenvalues and Eigenvectors
6.1 Definition
Definition 6.1. Let F be a linear transformation on a linear space V . We say that
λ ∈ R is an eigenvalue of F if there exists a non-zero vector u ∈ V such that F (u) = λu.
A non-zero vector u ∈ V for which F (u) = λu is called an eigenvector of F belonging
to the eigenvalue λ. By an eigenvalue and an eigenvector of an n × n matrix A we shall
mean an eigenvalue and an eigenvector of the linear transformation x 7→ Ax on Rn .
Definition 6.2. Let A be a square matrix. The polynomial det (A − λI) is called the
characteristic polynomial of A.
Theorem 6.4. Let A be an n × n matrix. If the complex zeros of det (A − λI) are
λ1, ..., λn, where each zero is counted as many times as its multiplicity, then tr A = λ1 + · · · + λn and det A = λ1 · · · λn.
Proof. Let A(k) be the matrix obtained from I by replacing the kth column of I with
Ak . Then by multilinearity,
This shows that det (A − λI) is of the form stated and that bn−1 = tr A and b0 = det A.
The equalities bn−1 = λ1 +· · ·+λn and b0 = λ1 · · · λn follow from the relationship between
roots and coefficients.
Example 6.8. Let V be a 2-dimensional linear space endowed with a basis e1 , e2 , and
consider the linear transformation F on V whose matrix is
$$A = \begin{bmatrix}0&-1\\-1&0\end{bmatrix}.$$
The eigenvalues of F are given by
$$0 = \det(A - \lambda I) = \begin{vmatrix}-\lambda&-1\\-1&-\lambda\end{vmatrix} = \lambda^2 - 1 = (\lambda+1)(\lambda-1),$$
and are therefore λ1 = −1 and λ2 = 1. The coordinate vectors of the eigenvectors
belonging to λ1 satisfy the system
$$\left[\begin{array}{cc|c}1&-1&0\\-1&1&0\end{array}\right] \iff x = t(1,1).$$
Hence, the eigenvectors u belonging to that eigenvalue are u = t(e1 + e2 ), t 6= 0. In the
same way, we find that the eigenvectors belonging to λ2 are u = t(e1 − e2 ), t 6= 0.
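The same result can be obtained numerically (a NumPy sketch, not part of the text; NumPy returns normalised eigenvectors, and the order of the eigenvalues is not fixed):

import numpy as np
A = np.array([[0., -1.], [-1., 0.]])
w, V = np.linalg.eig(A)
print(w)   # the eigenvalues 1 and -1, in some order
print(V)   # columns are unit eigenvectors, proportional to (1, -1) and (1, 1)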
6.2 Diagonalisability
Let F be a linear transformation on a linear space V and suppose there exists a basis
f 1 , . . . , f n for V consisting of eigenvectors of F belonging to the eigenvalues λ1 , . . . , λn .
Since F (f i ) = λi f i for i = 1, . . . , n, the matrix of F with respect to that basis is the
diagonal matrix
$$D = \begin{bmatrix}\lambda_1&0&\cdots&0\\0&\lambda_2&\cdots&0\\\vdots&\vdots&&\vdots\\0&0&\cdots&\lambda_n\end{bmatrix}.$$
Let A be the matrix of F with respect to any basis e1 , . . . , en for V . If T is the n × n
matrix whose ith column is the coordinate vector of f i with respect to e1 , . . . , en , then
T −1AT = D. According to the following definition, A is diagonalisable.
Definition 6.9. Let A be an n × n matrix. We say that A is diagonalisable if there
exists an invertible n × n matrix T such that T −1AT is a diagonal matrix.
Theorem 6.10. Let F be a linear transformation on a linear space V and let A be the
matrix of F with respect to a basis e1 , . . . , en for V . Then A is diagonalisable if and
only if there exists a basis for V consisting of eigenvectors of F .
Proof. We have already shown that A is diagonalisable if such a basis exists. To show the
converse, we assume that A is diagonalisable, which means that there exist an invertible
matrix T and a diagonal matrix D such that T −1AT = D. Since T is invertible, its
columns t1 , . . . , tn form a basis for Rn . Hence, the vectors f 1 , . . . , f n with coordinates
t1 , . . . , tn with respect to e1 , . . . , en form a basis for V . Let the diagonal entries of D
be λ1 , . . . , λn . Since AT = T D, we have Ati = λi ti , and hence F (f i ) = λi f i , for
i = 1, . . . , n. This shows that f 1 , . . . , f n are also eigenvectors of F .
Example 6.12. The eigenvectors e1 = (1, −2) and e2 = (2, 1) of the matrix
$$A = \begin{bmatrix}1&2\\2&-2\end{bmatrix}$$
in Example 6.6 form a basis for R2 . Hence, A is diagonalisable, and we have T −1AT = D
where
$$T = \begin{bmatrix}1&2\\-2&1\end{bmatrix}\qquad\text{and}\qquad D = \begin{bmatrix}-3&0\\0&2\end{bmatrix}.$$
We have
$$\det(A - \lambda I) = \begin{vmatrix}1-\lambda&0\\1&1-\lambda\end{vmatrix} = (1-\lambda)^2.$$
The only eigenvalue of A is therefore λ = 1. The eigenvectors x belonging to this
eigenvalue are given by
$$\left[\begin{array}{cc|c}0&0&0\\1&0&0\end{array}\right] \iff x = t(0,1).$$
Hence, no basis for R2 consisting of eigenvectors of A exists, and A is not diagonalisable.
Proof. We use induction on k. The statement holds for k = 1 since eigenvectors are
non-zero. Assume that the statement holds for k eigenvectors and let e1 , . . . , ek , ek+1 be
eigenvectors belonging to distinct eigenvalues λ1 , . . . , λk , λk+1 . Suppose that
$$\begin{vmatrix}6-\lambda&-2&-2\\1&2-\lambda&-1\\3&-2&1-\lambda\end{vmatrix}
= \begin{vmatrix}6-\lambda&-2&-2\\1&2-\lambda&-1\\\lambda-3&0&3-\lambda\end{vmatrix}
= \begin{vmatrix}6-\lambda&-2&4-\lambda\\1&2-\lambda&0\\\lambda-3&0&0\end{vmatrix}
= -(\lambda-2)(\lambda-3)(\lambda-4).$$
For λ2 , we have
$$\left[\begin{array}{ccc|c}3&-2&-2&0\\1&-1&-1&0\\3&-2&-2&0\end{array}\right]
\iff \left[\begin{array}{ccc|c}1&0&0&0\\1&-1&-1&0\end{array}\right]
\iff x = t(0,1,-1),$$
and for λ3 ,
$$\left[\begin{array}{ccc|c}2&-2&-2&0\\1&-2&-1&0\\3&-2&-3&0\end{array}\right]
\iff \left[\begin{array}{ccc|c}0&2&0&0\\1&-2&-1&0\\0&4&0&0\end{array}\right]
\iff x = t(1,0,1).$$
By Corollary 6.16, the eigenvectors (1, 1, 1), (0, 1, −1), (1, 0, 1) form a basis for R3 . Thus
T −1AT = D, and hence A = T DT −1 , where
$$T = \begin{bmatrix}1&0&1\\1&1&0\\1&-1&1\end{bmatrix}\qquad\text{and}\qquad D = \begin{bmatrix}2&0&0\\0&3&0\\0&0&4\end{bmatrix}.$$
It follows that
$$\begin{aligned}A^n &= (TDT^{-1})^n = TDT^{-1}TDT^{-1}\cdots TDT^{-1} = TD^nT^{-1}\\
&= \begin{bmatrix}1&0&1\\1&1&0\\1&-1&1\end{bmatrix}\begin{bmatrix}2^n&0&0\\0&3^n&0\\0&0&4^n\end{bmatrix}\begin{bmatrix}-1&1&1\\1&0&-1\\2&-1&-1\end{bmatrix}\\
&= \begin{bmatrix}-2^n+2\cdot 4^n&2^n-4^n&2^n-4^n\\-2^n+3^n&2^n&2^n-3^n\\-2^n-3^n+2\cdot 4^n&2^n-4^n&2^n+3^n-4^n\end{bmatrix}.\end{aligned}$$
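The same power can be computed numerically; the sketch below (illustrative only, with A read off from the characteristic determinant above) confirms the formula for one value of n:

import numpy as np
A = np.array([[6., -2., -2.], [1., 2., -1.], [3., -2., 1.]])
T = np.array([[1., 0., 1.], [1., 1., 0.], [1., -1., 1.]])
n = 5
Dn = np.diag([2.0 ** n, 3.0 ** n, 4.0 ** n])
print(np.allclose(T @ Dn @ np.linalg.inv(T), np.linalg.matrix_power(A, n)))   # True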
Proof. Assume that dim V = n. We choose a basis e1 , . . . , ek for ker (F − λI) and extend
it to a basis e1 , . . . , ek , ek+1 , . . . , en for V . Since F (ei ) = λei for i = 1, . . . , k, the matrix
of F with respect to this basis is of the form
$$A = \begin{bmatrix}
\lambda&0&\cdots&0&a_{1(k+1)}&\cdots&a_{1n}\\
0&\lambda&\cdots&0&a_{2(k+1)}&\cdots&a_{2n}\\
\vdots&\vdots&&\vdots&\vdots&&\vdots\\
0&0&\cdots&\lambda&a_{k(k+1)}&\cdots&a_{kn}\\
0&0&\cdots&0&a_{(k+1)(k+1)}&\cdots&a_{(k+1)n}\\
\vdots&\vdots&&\vdots&\vdots&&\vdots\\
0&0&\cdots&0&a_{n(k+1)}&\cdots&a_{nn}
\end{bmatrix}.$$
Proof. We have n = deg (det (F − λI)) = dim V . By Theorem 6.15, an eigenvector basis
exists if and only if the sum of the geometric multiplicities of the distinct eigenvalues
equals n. By Theorem 6.20, this cannot happen if det (F − λI) has non-real zeros, and if
all zeros are real, then the sum of the geometric multiplicities of the distinct eigenvalues
equals n if and only if the algebraic and geometric multiplicities of each eigenvalue are
equal.
Hence, the geometric multiplicity of λ1 equals 2. As a basis for the eigenspace associated
with λ1 we can choose e1 = (1, 1, 0), e2 = (0, 0, 1). This time it is worthwhile to find
also the eigenvectors belonging to λ2 .
$$\left[\begin{array}{ccc|c}1&1&0&0\\1&1&0&0\\0&0&2&0\end{array}\right] \iff x = t(1,-1,0).$$
We can, therefore, choose e3 = (1, −1, 0) as a basis for the eigenspace associated with λ2 .
The eigenvectors e1 , e2 , e3 form a basis for R3 since eigenvectors belonging to different
eigenvalues are linearly independent. Hence, A is diagonalisable, and T −1AT = D where
$$T = \begin{bmatrix}1&0&1\\1&0&-1\\0&1&0\end{bmatrix}\qquad\text{and}\qquad D = \begin{bmatrix}1&0&0\\0&1&0\\0&0&-1\end{bmatrix}.$$
$$\begin{aligned}
u_1 &= Au_0 = A(c_1e_1 + \cdots + c_ne_n) = c_1\lambda_1 e_1 + \cdots + c_n\lambda_n e_n,\\
u_2 &= Au_1 = A(c_1\lambda_1 e_1 + \cdots + c_n\lambda_n e_n) = c_1\lambda_1^2 e_1 + \cdots + c_n\lambda_n^2 e_n,\\
&\;\;\vdots\\
u_k &= c_1\lambda_1^k e_1 + \cdots + c_n\lambda_n^k e_n.
\end{aligned}$$
Hence, if we know u0 , the eigenvalues and the eigenvectors, we know uk for all k.
Example 6.24. Let the numbers ak and bk be defined by
$$\begin{cases}a_{k+1} = a_k + 2b_k\\ b_{k+1} = 2a_k + b_k\end{cases},\qquad
\begin{cases}a_0 = 3\\ b_0 = 1\end{cases}.$$
$$\begin{vmatrix}1-\lambda&2\\2&1-\lambda\end{vmatrix} = (1-\lambda)^2 - 4 = (\lambda+1)(\lambda-3).$$
Thus the eigenvalues are λ1 = −1 and λ2 = 3. For λ1 , the eigenvectors are given by
$$\left[\begin{array}{cc|c}2&2&0\\2&2&0\end{array}\right] \iff x = t(1,-1),$$
and for λ2 by
$$\left[\begin{array}{cc|c}-2&2&0\\2&-2&0\end{array}\right] \iff x = t(1,1).$$
The eigenvectors e1 = (1, −1) and e2 = (1, 1) form a basis for R2 . We have
$$c_1e_1 + c_2e_2 = u_0 \iff \left[\begin{array}{cc|c}1&1&3\\-1&1&1\end{array}\right] \iff c_1 = 1,\ c_2 = 2.$$
Hence,
$$u_k = \begin{bmatrix}a_k\\b_k\end{bmatrix} = c_1\lambda_1^ke_1 + c_2\lambda_2^ke_2
= (-1)^k\begin{bmatrix}1\\-1\end{bmatrix} + 2\cdot 3^k\begin{bmatrix}1\\1\end{bmatrix}.$$
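The closed form can be checked against direct iteration of the recurrence (a small Python sketch, not part of the text):

a, b = 3, 1
for k in range(6):
    assert a == (-1) ** k + 2 * 3 ** k        # first coordinate of u_k
    assert b == -(-1) ** k + 2 * 3 ** k       # second coordinate of u_k
    a, b = a + 2 * b, 2 * a + b               # the recurrence of Example 6.24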
Provided that
$$A = \begin{bmatrix}0&1&0\\0&0&1\\2&1&-2\end{bmatrix}$$
is diagonalisable, we can now apply the method used in the previous example. It turns
out that the eigenvectors e1 = (1, −2, 4), e2 = (1, −1, 1), e3 = (1, 1, 1) belonging to the
eigenvalues λ1 = −2, λ2 = −1, λ3 = 1 form a basis for R3 . Setting
(a0 , b0 , c0 ) = d1 e1 + d2 e2 + d3 e3
and in particular,
an = (−2)n − 2 · (−1)n − 1.
We have
$$\begin{vmatrix}-\lambda&1\\1&1-\lambda\end{vmatrix} = \lambda^2 - \lambda - 1 = 0
\iff \lambda = \lambda_1 = \frac{1+\sqrt{5}}{2}\quad\text{or}\quad\lambda = \lambda_2 = \frac{1-\sqrt{5}}{2}.$$
$$x \cdot y = x_1\overline{y_1} + \cdots + x_n\overline{y_n}$$
6.4 The Spectral Theorem
If we replace condition (ii) in Definition 3.1 with the condition $\langle u, v\rangle = \overline{\langle v, u\rangle}$, we get
a complex inner product. The dot product on Cn is then a complex inner product. We
can also allow complex numbers in the theory of matrices and determinants. Let A be
a complex n × n matrix. Then Ax = 0 has a non-zero solution x ∈ Cn if and only if
det A = 0. In particular, if λ ∈ C, then there exists a non-zero vector x ∈ Cn such that
Ax = λx if and only if det (A − λI) = 0. Let A be a complex matrix. By $\bar{A}$ we shall
mean the matrix obtained from A by taking the complex conjugates of the entries of A.
We say that a square matrix A is Hermitian if $A^t = \bar{A}$.
Lemma 6.27. Let A be a Hermitian n×n matrix. Then all the zeros of the characteristic
polynomial det (A − λI) are real.
Proof. Let λ be a zero of the characteristic polynomial. Then there exists a non-zero
vector x ∈ Cn such that
Ax = λx. (6.4)
Taking the complex conjugate and using the assumption on A, we get
$$A^t\bar{x} = \bar{A}\bar{x} = \overline{Ax} = \overline{\lambda x} = \bar{\lambda}\bar{x}. \tag{6.5}$$
By (6.4),
$$(Ax)\cdot x = (Ax)^t\bar{x} = (\lambda x)^t\bar{x} = \lambda x^t\bar{x} = \lambda\|x\|^2,$$
and by (6.5),
$$(Ax)\cdot x = x^tA^t\bar{x} = \bar{\lambda}x^t\bar{x} = \bar{\lambda}\|x\|^2.$$
Hence $\lambda\|x\|^2 = \bar{\lambda}\|x\|^2$, and since x is non-zero, $\lambda = \bar{\lambda}$, so λ is real.
The vectors v 1 = (1, −1, 0) and v 2 = (2, 0, −1) form a basis for this eigenspace. We apply
the Gram–Schmidt process to them. We set u1 = v 1 and u2 = su1 + v 2 and get s = −1
and u2 = −u1 + v 2 = (1, 1, −1). Normalising u1 and u2 , we get the orthonormal basis
e1 = (1/√2)(1, −1, 0), e2 = (1/√3)(1, 1, −1) for the eigenspace ker (A − λ1 I). By Theorem 6.32,
we know that the eigenvectors belonging to λ2 are orthogonal to e1 and e2. Hence, the
unit normal vector e3 = (1/√6)(1, 1, 2) of the plane x1 + x2 + 2x3 = 0 must be an eigenvector
belonging to λ2 . With
$$T = \begin{bmatrix}\frac{1}{\sqrt 2}&\frac{1}{\sqrt 3}&\frac{1}{\sqrt 6}\\-\frac{1}{\sqrt 2}&\frac{1}{\sqrt 3}&\frac{1}{\sqrt 6}\\0&-\frac{1}{\sqrt 3}&\frac{2}{\sqrt 6}\end{bmatrix}
= \frac{1}{\sqrt 6}\begin{bmatrix}\sqrt 3&\sqrt 2&1\\-\sqrt 3&\sqrt 2&1\\0&-\sqrt 2&2\end{bmatrix}
\qquad\text{and}\qquad D = \begin{bmatrix}2&0&0\\0&2&0\\0&0&8\end{bmatrix},$$
of first-order linear differential equations. For every t ∈ R, x(t) = (x1 (t), . . . , xn (t))
and x′ (t) = (x′1 (t), . . . , x′n (t)) are elements of Rn , and the system can be written as
x′ (t) = Ax(t) where A = [aik ]n×n . Suppose that Rn has a basis e1 , . . . , en consisting of
eigenvectors of A and let y(t) = (y1 (t), . . . , yn (t)) be the coordinates of x(t) with respect
to that basis. Then
x(t) = y1 (t)e1 + · · · + yn (t)en ,
and hence
x′ (t) = y1′ (t)e1 + · · · + yn′ (t)en .
If the eigenvalue associated with ei is λi for i = 1, . . . , n, then
$$x'(t) = Ax(t) = y_1(t)Ae_1 + \cdots + y_n(t)Ae_n = \lambda_1y_1(t)e_1 + \cdots + \lambda_ny_n(t)e_n,$$
and therefore
$$y_i'(t) = \lambda_iy_i(t), \qquad i = 1, \ldots, n. \tag{6.6}$$
Hence,
$$y_i(t) = c_ie^{\lambda_it}, \qquad i = 1, \ldots, n,$$
for some constants ci, and thus
$$x(t) = c_1e^{\lambda_1t}e_1 + \cdots + c_ne^{\lambda_nt}e_n.$$
are λ1 = −2, λ2 = 1, λ3 = 3 with associated eigenvectors e1 = (1, 1, −3), e2 = (1, −2, 3),
e3 = (1, 0, 1). Since the eigenvalues are distinct, the eigenvectors form a basis for R3 .
By applying the above method, we therefore get
$$\begin{bmatrix}x_1(t)\\x_2(t)\\x_3(t)\end{bmatrix}
= c_1e^{-2t}\begin{bmatrix}1\\1\\-3\end{bmatrix}
+ c_2e^{t}\begin{bmatrix}1\\-2\\3\end{bmatrix}
+ c_3e^{3t}\begin{bmatrix}1\\0\\1\end{bmatrix}.$$
Exactly the same method can be used to solve a system of the form
$$\begin{cases}x_1''(t) = a_{11}x_1(t) + a_{12}x_2(t) + \cdots + a_{1n}x_n(t)\\
x_2''(t) = a_{21}x_1(t) + a_{22}x_2(t) + \cdots + a_{2n}x_n(t)\\
\quad\vdots\\
x_n''(t) = a_{n1}x_1(t) + a_{n2}x_2(t) + \cdots + a_{nn}x_n(t)\end{cases}$$
(6.6) is now replaced by yi′′ (t) = λi yi (t), i = 1, . . . , n. We can solve these second-order
equations and proceed as before.
This time, the eigenvalues and associated eigenvectors are λ1 = −2, λ2 = 2, e1 = (1, −3),
e2 = (1, 1). We have
$$y_1''(t) = -2y_1(t) \iff y_1(t) = c_1\sin\sqrt{2}\,t + c_2\cos\sqrt{2}\,t$$
and
$$y_2''(t) = 2y_2(t) \iff y_2(t) = d_1e^{\sqrt{2}\,t} + d_2e^{-\sqrt{2}\,t}.$$
Hence,
$$\begin{bmatrix}x_1(t)\\x_2(t)\end{bmatrix}
= \bigl(c_1\sin\sqrt{2}\,t + c_2\cos\sqrt{2}\,t\bigr)\begin{bmatrix}1\\-3\end{bmatrix}
+ \bigl(d_1e^{\sqrt{2}\,t} + d_2e^{-\sqrt{2}\,t}\bigr)\begin{bmatrix}1\\1\end{bmatrix}.$$
[Figure: a string under tension T with n = 3 point masses at the positions 1, 2, 3, displaced transversally by y1, y2, y3; the segments of the string make angles α0, α1, α2, α3 with the horizontal axis.]
Hence, the force exerted on the mass at j is approximately
−T (yj − yj−1 ) + T (yj+1 − yj ) = T (yj−1 − 2yj + yj+1 ), j = 1, . . . , n.
According to Newton’s second law, force equals mass times acceleration. Therefore,
T (yj−1 (t) − 2yj (t) + yj+1 (t)) = myj′′ (t), j = 1, . . . , n.
Setting $q = \sqrt{T/m}$, we can write this as
Example 6.36. Consider a string with two point masses. The corresponding matrix is
$$A = q^2\begin{bmatrix}-2&1\\1&-2\end{bmatrix}.$$
[Figures: the two eigenmodes of the string with two point masses, with displacements y1 and y2.]
Exercises
6.1. Find the eigenvalues and eigenvectors of the following matrices.
(a) $\begin{bmatrix}5&-2\\6&-2\end{bmatrix}$,  (b) $\begin{bmatrix}3&-2\\4&-1\end{bmatrix}$,  (c) $\begin{bmatrix}-1&2&1\\2&-1&1\\-1&1&2\end{bmatrix}$,  (d) $\begin{bmatrix}4&-1&0\\4&0&0\\2&-1&2\end{bmatrix}$.
6.4. Let A be a square matrix such that the sum of the entries in each row equals λ.
Show that λ is an eigenvalue of A.
6.5. Let F be a linear transformation on a linear space V and assume that every non-
zero vector of V is an eigenvector of F . Show that there exists a real number λ
such that F = λI.
6.6. Determine, for each of the following matrices A, whether it is diagonalisable, and
when it is, find a matrix T such that T −1AT is a diagonal matrix.
(a) $\begin{bmatrix}-3&3&1\\4&-1&-2\\-14&9&6\end{bmatrix}$,  (b) $\begin{bmatrix}3&-1&-1\\4&-1&-2\\-2&1&2\end{bmatrix}$,  (c) $\begin{bmatrix}1&4&6\\-3&-7&-7\\4&8&7\end{bmatrix}$.
6.10. Let A be a diagonalisable square matrix. Show that At is diagonalisable with the
same eigenvalues as A.
6.11. Let V be an n-dimensional inner product space, where n > 0, and let F be the
linear transformation on V defined by
where b and c are vectors of V for which hb, ci 6= 0. Show that V has a basis
consisting of eigenvectors of F and find the matrix of F with respect to some such
basis.
6.15. Find, for each of the following matrices A, an orthogonal matrix T such that
T tAT is a diagonal matrix.
(a) $\begin{bmatrix}1&0&3\\0&1&4\\3&4&1\end{bmatrix}$,  (b) $\begin{bmatrix}13&-4&-2\\-4&13&2\\-2&2&10\end{bmatrix}$.
6.16. Let A be an invertible square matrix.
(a) Show that the eigenvalues of the symmetric matrix AtA are positive. Hint: $x^tA^tAx = \|Ax\|^2$.
(b) Show that there exists a unique symmetric matrix B with positive eigenvalues such that $B^2 = A^tA$. Hint: Show that $Bx = \sqrt{\lambda}\,x$ if $A^tAx = \lambda x$.
(c) Show that A = QB where Q is an orthogonal matrix and B a symmetric matrix with positive eigenvalues. Hint: Try $Q = (A^{-1})^tB$.
6.17. Solve the following initial value problem.
$$\begin{cases}x_1'(t) = x_1(t) + 3x_2(t) + 2x_3(t)\\ x_2'(t) = 3x_1(t) - 4x_2(t) + 3x_3(t)\\ x_3'(t) = 2x_1(t) + 3x_2(t) + x_3(t)\end{cases},\qquad
\begin{cases}x_1(0) = 8\\ x_2(0) = -5\\ x_3(0) = 10\end{cases}.$$
6.18. Find the general solution of the following system of differential equations.
$$\begin{cases}x_1''(t) = x_1(t) + 2x_2(t) + x_3(t)\\ x_2''(t) = 2x_1(t) + x_2(t) + x_3(t)\\ x_3''(t) = 3x_1(t) + 3x_2(t) + 4x_3(t)\end{cases}.$$
6.19. Find the eigenfrequencies and describe the corresponding eigenmodes for a string
with three point masses.
7 Quadratic Forms
b(x, y) = xt By
where B = [bik ].
Theorem 7.3. Let b be a bilinear form on a linear space V with basis e1 , . . . , en . If the
coordinates of u and v with respect to that basis are x and y, respectively, then
b(u, v) = xt By
where
$$B = \begin{bmatrix}b(e_1,e_1)&\cdots&b(e_1,e_n)\\\vdots&&\vdots\\b(e_n,e_1)&\cdots&b(e_n,e_n)\end{bmatrix}.$$
Hence, by bilinearity,
$$b(u,v) = b\Bigl(\sum_{i=1}^{n}x_ie_i,\ \sum_{k=1}^{n}y_ke_k\Bigr)
= \sum_{i=1}^{n}x_i\,b\Bigl(e_i,\ \sum_{k=1}^{n}y_ke_k\Bigr)
= \sum_{i=1}^{n}\sum_{k=1}^{n}x_iy_k\,b(e_i,e_k) = x^tBy.$$
By using the basis ε1 , . . . , εn for Rn , we see that every bilinear form on Rn is of the form
described in Example 7.2.
Definition 7.5. A bilinear form b on a linear space V is said to be symmetric if
b(u, v) = b(v, u) for all u and v in V.
Theorem 7.6. Let b be a bilinear form on a linear space V with basis e1 , . . . , en and
let B be the matrix of b with respect to that basis. Then b is symmetric if and only if B
is symmetric.
Proof. Let x and y be the coordinates of u and v, respectively. Then b(u, v) = xt By and
b(v, u) = y t Bx = xt (y t B)t = xt B t y. Hence, b is symmetric if and only if xt By = xt B t y
for all x and y in Rn , and this is equivalent to B = B t .
Theorem 7.7. Let b be a bilinear form on a linear space V with bases e1 , . . . , en and
e′1 , . . . , e′n . If B is the matrix of b with respect to e1 , . . . , en and T is the transition
matrix from e′1 , . . . , e′n to e1 , . . . , en , then the matrix of b with respect to e′1 , . . . , e′n is
B ′ = T tBT.
Proof. If the coordinates of u and v with respect to the bases are x, y and x′ , y ′ , then
This shows that the matrix with respect to e′1 , . . . , e′n is B ′ = T tBT .
is a quadratic form on R3 since q(x) = b(x, x) where b is the bilinear form on R3 defined
by
b(x, y) = x1 y1 + 2x2 y2 + x3 y3 + 2x1 y2 + 4x1 y3 + 6x2 y3 .
We also have q(x) = c(x, x) where c is the symmetric bilinear form on R3 defined by
Theorem 7.10. Let q be a quadratic form on a linear space V . Then there exists a
unique symmetric bilinear form c on V such that q(u) = c(u, u) for all u ∈ V .
Proof. Let b be any bilinear form on V such that q(u) = b(u, u) for all u ∈ V and
define c by c(u, v) = 12 (b(u, v) + b(v, u)). Then c is symmetric and q(u) = c(u, u) for
all u ∈ V . To show the uniqueness, we let c be any symmetric bilinear form on V for
which q(u) = c(u, u) for all u ∈ V. Then
$$q(u + v) = c(u + v, u + v) = c(u, u) + 2c(u, v) + c(v, v) = q(u) + 2c(u, v) + q(v).$$
Consequently,
$$c(u, v) = \frac{q(u + v) - q(u) - q(v)}{2}$$
is uniquely determined by q.
Example 7.12. The matrix with respect to the basis ε1 , ε2 , ε3 of the quadratic form in
Example 7.9 is
$$\begin{bmatrix}1&1&2\\1&2&3\\2&3&1\end{bmatrix}.$$
Proof. Let e1 , . . . , en be any basis for V and let B be the matrix of q with respect to
that basis. Since B is a symmetric matrix, it follows from Theorem 6.31 that there exist
an orthogonal matrix T and a diagonal matrix D such that T tBT = D. Let e′k be the
vector in V whose coordinate vector with respect to e1 , . . . , en is the kth column of T .
Then the matrix of q with respect to the basis e′1 . . . , e′n is D.
Theorem 7.16. Let q be a quadratic form on an inner product space V with ortho-
normal bases e1 , . . . , en and f 1 , . . . , f n . If
Proof. It suffices to show that the matrices B and C of q with respect to the two bases
have the same characteristic polynomial. Since both bases are orthonormal, the transition
matrix T from f 1 , . . . , f n to e1 , . . . , en is orthogonal. Hence, C = T tBT = T −1BT , and
it follows that
$$e_1' = \tfrac{1}{\sqrt 6}(2,-1,-1),\qquad e_2' = \tfrac{1}{\sqrt 2}(0,1,-1),\qquad e_3' = \tfrac{1}{\sqrt 3}(1,1,1)$$
form an orthonormal basis for R3 of eigenvectors of B. Hence, if x = x′1 e′1 + x′2 e′2 + x′3 e′3 ,
then
q(x) = −2(x′1 )2 + 4(x′2 )2 + 4(x′3 )2 .
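Diagonalising a quadratic form numerically amounts to an orthogonal eigendecomposition of its symmetric matrix. The sketch below is not part of the text and uses an arbitrarily chosen symmetric matrix, since the matrix of this particular example is not reproduced here:

import numpy as np
B = np.array([[6., -2., 2.], [-2., 3., -1.], [2., -1., 3.]])   # an arbitrary symmetric matrix
w, T = np.linalg.eigh(B)                       # eigenvalues and an orthonormal eigenvector basis
print(w)                                       # the coefficients of the diagonal representation of q
print(np.allclose(T.T @ B @ T, np.diag(w)))    # True: T^t B T is diagonal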
Proof. There exists an orthonormal basis that diagonalises q, and its vectors e1 , . . . , en
can be ordered so that ei corresponds to λi for i = 1, . . . , n. Let u be any vector of
length 1 in V . If (x1 , . . . , xn ) are the coordinates of u with respect to e1 , . . . , en , then
$x_1^2 + \cdots + x_n^2 = \|u\|^2 = 1$ and $q(u) = \lambda_1x_1^2 + \cdots + \lambda_nx_n^2$. Hence,
$$\lambda_1 = \min_{u\neq 0}\frac{q(u)}{\|u\|^2}\qquad\text{and}\qquad\lambda_n = \max_{u\neq 0}\frac{q(u)}{\|u\|^2}.$$
Proof. When u ranges over all non-zero vectors, $\frac{1}{\|u\|}u$ ranges over all vectors of length 1.
The statement of the corollary now follows from the fact that
$$q\Bigl(\frac{1}{\|u\|}u\Bigr) = \frac{q(u)}{\|u\|^2}.$$
7.4 Quadratic Equations
[Figures: the solution sets drawn in the (x1, x2)-plane together with the rotated axes x′1 and x′2.]
consists of two lines intersecting at the origin. If c > 0, the equation can be written as
$$\frac{(x_1')^2}{a_1^2} - \frac{(x_2')^2}{a_2^2} = 1$$
where
$$a_1 = \sqrt{\frac{c}{\lambda_1}}\qquad\text{and}\qquad a_2 = \sqrt{\frac{c}{-\lambda_2}}.$$
Hence, the solution set is a hyperbola with centre at the origin. Its transverse axis is
spanned by e′1 and its conjugate axis by e′2 . If c < 0, the equation can be written as
$$-\frac{(x_1')^2}{a_1^2} + \frac{(x_2')^2}{a_2^2} = 1.$$
This is also the equation of a hyperbola with centre at the origin, but now the transverse
axis is spanned by e′2 and the conjugate axis by e′1 .
All the remaining cases with non-zero eigenvalues can be brought back to one of the
previous cases by changing the signs of both sides of the equation or reindexing the
eigenvalues and eigenvectors or both.
If one eigenvalue is zero, the solution set is empty or consists of one or two lines
depending on the value of c.
Also in the general case
q(x1 , x2 ) + b1 x1 + b2 x2 = c,
If both eigenvalues are non-zero, we can complete the two squares and get
$$\lambda_1\Bigl(x_1' + \frac{b_1'}{2\lambda_1}\Bigr)^2 + \lambda_2\Bigl(x_2' + \frac{b_2'}{2\lambda_2}\Bigr)^2 = c' + \frac{(b_1')^2}{4\lambda_1} + \frac{(b_2')^2}{4\lambda_2}.$$
If b′2 = 0, the solution set is empty or consists of one line or two parallel lines. Otherwise,
it is a parabola with vertex at
$$\Bigl(-\frac{b_1'}{2\lambda_1},\ \frac{4\lambda_1c' + (b_1')^2}{4\lambda_1b_2'}\Bigr).$$
q(x1 , x2 , x3 ) + b1 x1 + b2 x2 + b3 x3 = c
where q is a quadratic form on R3 , not identically zero. Also now we begin by studying
the equation
q(x1 , x2 , x3 ) = c.
We can write this equation as
where (x′1 , x′2 , x′3 ) are the coordinates of x with respect to an orthonormal basis e′1 , e′2 , e′3
that diagonalises q. At least one eigenvalue is non-zero.
Suppose first that the eigenvalues are positive. If c < 0, the solution set is empty, and
if c = 0, the only solution is (0, 0, 0). If c > 0, the surface is called an ellipsoid. The
intersection between the surface and any of the coordinate planes x′i = 0 is an ellipse. In
fact, the intersection between the surface and the plane x′i = d is an ellipse if λi d2 < c.
Assume that λ1 > 0, λ2 > 0 and λ3 < 0. If c > 0, the surface is a hyperboloid of
one sheet. The intersection between the surface and a plane x′3 = d is an ellipse. The
intersections between the surface and planes of the form x′1 = d or x′2 = d are hyperbolae.
If c = 0, the surface is a cone. The intersection between the surface and the plane x′3 = d
is an ellipse when d 6= 0 and (0, 0, 0) when d = 0. The intersection between the surface
and one of the coordinate planes x′1 = 0 and x′2 = 0 consists of two intersecting lines. If
c < 0, the surface is a hyperboloid of two sheets. The intersection between the surface
and the plane x′3 = d is empty when λ3 d2 > c, consists of one point when λ3 d2 = c and is
an ellipse when λ3 d2 < c. The intersection between the surface and one of the coordinate
planes x′1 = 0 and x′2 = 0 is a hyperbola.
Suppose that λ1 > 0, λ2 > 0 and λ3 = 0. If c < 0, the solution set is empty. If c = 0,
the solution set consists of the x′3 -axis. If c > 0, the surface is an elliptic cylinder.
Assume that λ1 > 0, λ2 < 0 and λ3 = 0. If c = 0, the solution set consists of two
intersecting planes. Otherwise, the surface is a hyperbolic cylinder.
If λ1 6= 0 and λ2 = λ3 = 0, the solution set is empty, a plane or two parallel planes,
depending on the value of c.
After diagonalisation, the general equation becomes
Completing squares takes us back to the previous cases when the eigenvalues are non-zero.
When λ1 6= 0, λ2 6= 0 and λ3 = 0, we get an equation of the form
We need only consider the case where b′′3 6= 0. If λ1 and λ2 have the same sign, the
surface is an elliptic paraboloid, otherwise a hyperbolic paraboloid.
When λ1 6= 0 and λ2 = λ3 = 0, the equation becomes
If at least one of b′′2 and b′′3 is non-zero, the surface is a parabolic cylinder.
Above we have regarded spheres as ellipsoids. In general, if two or more eigenvalues
are equal and the quadratic equation represents a surface, we call that surface a surface
of revolution.
Example 7.22. We set out to find the type of the surface
We also wish to find the points on the surface closest to the origin and the distance from
those points to the origin. The eigenvalues of
$$B = \begin{bmatrix}1&3&2\\3&-4&3\\2&3&1\end{bmatrix}$$
are λ1 = −6, λ2 = −1 and λ3 = 5. The surface is therefore a hyperboloid of two
sheets. Let e′1 , e′2 , e′3 be an orthonormal basis of eigenvectors associated with λ1 , λ2 , λ3
and (x′1 , x′2 , x′3 ) the coordinates of x = (x1 , x2 , x3 ) with respect to that basis. Then
7.5 Sylvester’s Law of Inertia
If a quadratic form q is diagonalised with respect to two orthonormal bases for an inner
product space V , then the coefficients in the two representations are the same according
to Theorem 7.16. For any two diagonalising bases, this need not be true. Let
for all x ∈ Rn , then the number of positive λi equals σ+ and the number of negative λi
equals σ− .
Proof. After reindexing the basis vectors and coefficients if necessary, we may assume
that λ1 , . . . , λk are positive and λk+1 , . . . , λn are non-positive. Set U+ = [e1 , . . . , ek ]
and U− = [ek+1 , . . . , en ]. We use here the convention that the subspace spanned by no
vectors is the zero space {0}. Then q is positive definite on U+ . Let U be any subspace
of V on which q is positive definite. Since q(u) > 0 for all non-zero vectors u ∈ U and
q(u) ≤ 0 for all vectors u ∈ U− , we must have U ∩ U− = {0}. Hence, by Theorem 2.62,
and therefore dim U ≤ k. This shows that k is the maximum dimension of subspaces
on which q is positive definite. We now obtain the statement about σ− by applying the
statement about σ+ to the quadratic form −q.
of the quadratic form is not well suited for manual computation of eigenvalues. Instead,
we set out to find the representation with respect to some diagonalising basis, not ne-
cessarily orthonormal. Then we can use Theorem 7.25 to find the number of positive
and negative eigenvalues of the form. We find a diagonal representation by completing
squares as follows.
Setting
$$\begin{cases}x_1' = x_1 + 3x_2 - 2x_3\\ x_2' = x_2 - 2x_3\\ x_3' = x_3\end{cases},$$
we obtain
q(x) = (x′1 )2 − 2(x′2 )2 + 6(x′3 )2 .
The coefficient matrix
$$\begin{bmatrix}1&3&-2\\0&1&-2\\0&0&1\end{bmatrix}$$
is clearly invertible, whence it is the transition matrix from ε1 , ε2 , ε3 to some basis
e1 , e2 , e3 . Its inverse T is then the transition matrix from e1 , e2 , e3 to ε1 , ε2 , ε3 . Hence,
the matrix of q with respect to the basis e1 , e2 , e3 is
$$T^tBT = D = \begin{bmatrix}1&0&0\\0&-2&0\\0&0&6\end{bmatrix}.$$
The representation of q with respect to the diagonalising basis e1 , e2 , e3 has two positive
and one negative coefficients. Hence, q has two positive and one negative eigenvalues.
Therefore and since the right-hand side of the equation of the surface is positive, the
surface is a hyperboloid of one sheet. The reader should be aware that the coefficients
1, −2 and 6 are not eigenvalues of B and hence not of q.
When carried out correctly, the above method always yields an invertible coefficient
matrix and hence a basis that diagonalises q. Sometimes, however, there are no squares
to complete as in the following example.
Example 7.27. The quadratic form
q(x) = x1 x2 + x1 x3 + x2 x3
on R3 has no squares. As a remedy for this we begin with the following change of
coordinates.
$$\begin{cases}x_1 = x_1' + x_2'\\ x_2 = x_1' - x_2'\\ x_3 = x_3'\end{cases}.$$
Since the coefficient matrix is invertible, this yields a change of basis. We get
q(x) = (x′1 )2 − (x′2 )2 + 2x′1 x′3 .
Now we can proceed as in the previous example.
q(x) = (x′1 )2 − (x′2 )2 + 2x′1 x′3 = (x′1 + x′3 )2 − (x′2 )2 − (x′3 )2 = (x′′1 )2 − (x′′2 )2 − (x′′3 )2 .
The method used in the above two examples works well for finding the type of a surface
but is useless for exploring metric properties of the surface. For example, it cannot be
used to find the points on the surface closest to the origin. The reason for this is that
the diagonalising basis need not be orthonormal. Nor can it reveal whether the surface
is a surface of revolution. Even if two coefficients happen to be equal in the diagonal
representation, nothing says that two eigenvalues of the quadratic form must be equal.
The following result is an immediate consequence of Definition 7.24 but can also be
regarded as a corollary to Theorem 7.25.
Corollary 7.28. Let q be a quadratic form on an n-dimensional linear space V . Then
the following statements hold.
• q is positive definite if and only if σ+ = n.
• q is positive semidefinite if and only if σ+ < n and σ− = 0.
• q is negative definite if and only if σ− = n.
• q is negative semidefinite if and only if σ+ = 0 and σ− < n.
• q is indefinite if and only if σ+ > 0 and σ− > 0.
Definition 7.29. The sign function on R is defined by
$$\operatorname{sgn}(x) = \begin{cases}1, & \text{if } x > 0,\\ 0, & \text{if } x = 0,\\ -1, & \text{if } x < 0.\end{cases}$$
with respect to some basis e1 , . . . , en for V and let D be the matrix of q with respect
to some basis for V that diagonalises q. Then D = T tBT for some invertible matrix T ,
whence
det D = det(T tBT ) = (det T )2 det B.
In the first case, dm and dm+1 have the same sign, and in the second case, dm and dm+1
have opposite signs. Hence, σ− equals the number of sign changes in the sequence
d0 , d1 , . . . , dn .
Summing up, we have the following theorem.
Here
$$d_0 = 1,\qquad d_1 = 1,\qquad d_2 = \begin{vmatrix}1&3\\3&7\end{vmatrix} = 7 - 9 = -2,\qquad
d_3 = \begin{vmatrix}1&3&-2\\3&7&-2\\-2&-2&2\end{vmatrix} = -12.$$
Since all the determinants are non-zero and there is only one change of sign in the
sequence 1, 1, −2, −12, we see that σ+ = 2 and σ− = 1.
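The sign-change count is easy to automate. The sketch below (not part of the text) computes the leading principal minors of the matrix in this example and counts the sign changes, assuming, as in the theorem, that all the minors are non-zero:

import numpy as np
B = np.array([[1., 3., -2.], [3., 7., -2.], [-2., -2., 2.]])
d = [1.0] + [np.linalg.det(B[:k, :k]) for k in range(1, 4)]
print([round(x) for x in d])                               # [1, 1, -2, -12]
print(sum(d[k] * d[k + 1] < 0 for k in range(3)))          # 1 sign change, so sigma_minus = 1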
Example 7.32. We can now fulfil the promise made in Section 6.6. Let Bn be the
symmetric n × n matrix
$$B_n = \begin{bmatrix}-2&1&0&\cdots&0\\1&-2&1&\cdots&0\\0&1&-2&\cdots&0\\\vdots&\vdots&\vdots&&\vdots\\0&0&0&\cdots&-2\end{bmatrix}.$$
Then d1 = det B1 = −2, d2 = det B2 = 3 and, for n ≥ 3,
$$d_n = \det B_n = \begin{vmatrix}-2&1&0&\cdots&0\\1&-2&1&\cdots&0\\0&1&-2&\cdots&0\\\vdots&\vdots&\vdots&&\vdots\\0&0&0&\cdots&-2\end{vmatrix}
= -2\begin{vmatrix}-2&1&\cdots&0\\1&-2&\cdots&0\\\vdots&\vdots&&\vdots\\0&0&\cdots&-2\end{vmatrix}
- \begin{vmatrix}1&0&\cdots&0\\1&-2&\cdots&0\\\vdots&\vdots&&\vdots\\0&0&\cdots&-2\end{vmatrix}
= -2d_{n-1} - d_{n-2}.$$
We can now prove by induction that $d_n = (-1)^n(n+1)$. The statement holds for n = 1
and n = 2, and if it holds for k < n where n ≥ 3, then
$$d_n = -2d_{n-1} - d_{n-2} = -2(-1)^{n-1}n - (-1)^{n-2}(n-1) = (-1)^n\bigl(2n - (n-1)\bigr) = (-1)^n(n+1).$$
Hence, there are n sign changes in the sequence d0, d1, ..., dn, and therefore all the
eigenvalues of Bn are negative.
Exercises
7.1. Find, for each of the following quadratic forms q on R3 , an orthonormal basis
for R3 that diagonalises q and find the corresponding diagonal representation.
(a) q(x) = 6x21 + 3x22 + 3x23 − 4x1 x2 + 4x1 x3 − 2x2 x3 ,
(b) q(x) = x21 + x22 + x23 − 2x2 x3 .
7.2. Find the maximum and minimum values of
q(x_1, x_2, x_3) = 7x_1^2 + 3x_2^2 + 7x_3^2 + 2x_1x_2 + 4x_2x_3
subject to the constraint x_1^2 + x_2^2 + x_3^2 = 1. Also find the points where they occur.
7.3. Find the maximum and minimum values of
q(x_1, x_2, x_3) = x_1^2 + 2x_2^2 + 2x_3^2 + 8x_1x_2 + 8x_1x_3 + 6x_2x_3
subject to the constraint x_1^2 + x_2^2 + x_3^2 ≤ 9.
7.4. (a) Find the minimum value of r(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 subject to the constraint
q(x_1, x_2, x_3) = x_1^2 + 3x_2^2 + x_3^2 + 2x_1x_2 - 2x_1x_3 - 2x_2x_3 = 1.
(b) Does r(x1 , x2 , x3 ) have a maximum value in the set where q(x1 , x2 , x3 ) = 1?
7.5. Find the least value of a for which
3x_1^2 + 5x_2^2 + 3x_3^2 - 2x_1x_2 + 2x_1x_3 - 2x_2x_3 ≤ a(x_1^2 + x_2^2 + x_3^2)
for all x ∈ R3 .
7.6. Let A be a square matrix. Show that if λ is the least eigenvalue of the symmetric
matrix A^tA, then
\min_{\|x\|=1} \|Ax\| = \sqrt{\lambda}.
7.7. Find the least possible distance between the points (x1 , x2 , x3 ) and (−x3 , x1 , x2 )
on the unit sphere.
7.8. Show that the curve described by the equation
18x_1^2 + 12x_2^2 - 8x_1x_2 = 40
with respect to an orthonormal coordinate system for 2-space is an ellipse. Find
the lengths and directions of the semi-major and semi-minor axes.
7.9. A quadratic surface has, with respect to an orthonormal coordinate system for
3-space, the equation
3x_1^2 + 3x_2^2 - 8x_1x_2 + 4x_1x_3 - 4x_2x_3 = 1.
Identify the type of surface and find the least distance from a point on the surface
to the origin.
7.10. A quadratic surface has, with respect to an orthonormal coordinate system for
3-space, the equation
Show that it is an ellipsoid and find the points on the surface closest to the origin
and furthest from the origin.
7.11. A quadratic surface has, with respect to an orthonormal coordinate system for
3-space, the equation
Show that it is a surface of revolution, identify its type and find the axis of
revolution.
7.12. A quadratic surface has, with respect to an orthonormal coordinate system for
3-space, the equation
Identify the type of surface and determine the distance from the surface to the
origin.
7.13. A quadratic surface has, with respect to an orthonormal coordinate system for
3-space, the equation
Identify the type of surface and find the least distance from a point on the surface
to the origin.
7.17. Identify, for each value of the real constant a, the type of the surface
7.18. Determine whether there is a change of coordinates that takes the quadratic form
Answers to Exercises
1.1. (a) \begin{pmatrix} 2 & 2 & 4 \\ 0 & 3 & 3 \\ 3 & 2 & 2 \end{pmatrix}, (b) not defined, (c) \begin{pmatrix} 9 & 7 & 12 \\ 4 & 4 & 6 \\ 0 & 1 & -1 \end{pmatrix},
(d) \begin{pmatrix} 2 & 3 & 2 \\ 0 & 5 & 6 \\ 4 & 8 & 5 \end{pmatrix}, (e) not defined, (f) \begin{pmatrix} 1 & 7 & 9 \\ 3 & 6 & 2 \end{pmatrix},
(g) \begin{pmatrix} 7 & 12 & 16 \\ 11 & 11 & 13 \end{pmatrix}, (h) \begin{pmatrix} 13 & 17 & 23 \\ 19 & 16 & 24 \end{pmatrix}.
1.2. A^2 - B^2 = \begin{pmatrix} -2 & 4 \\ -5 & -5 \end{pmatrix}, (A + B)(A - B) = \begin{pmatrix} -4 & 2 \\ -6 & -3 \end{pmatrix}.
1.3. B = t\begin{pmatrix} 2 & -4 \\ -1 & 2 \end{pmatrix}, t ∈ R.
1.4. (b) \begin{pmatrix} 1 & 12 & 138 \\ 0 & 1 & 24 \\ 0 & 0 & 1 \end{pmatrix}.
1.5. A^tB^t = \begin{pmatrix} 2 & 0 & 4 \\ 3 & 5 & 8 \\ 2 & 6 & 5 \end{pmatrix}, (A^t + B^t)C^t = \begin{pmatrix} 7 & 11 \\ 12 & 11 \\ 16 & 13 \end{pmatrix}.
1.8. (a) \begin{pmatrix} -1 & 1 & 1 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix}, (b) not invertible, (c) \frac{1}{5}\begin{pmatrix} -10 & 5 & 5 \\ 0 & 2 & -1 \\ 5 & -3 & -1 \end{pmatrix}.
1.9. A^{-1} = \begin{pmatrix} 0 & -1 & 1 \\ 2 & 2 & -3 \\ -1 & 0 & 1 \end{pmatrix}, (A^t)^{-1} = \begin{pmatrix} 0 & 2 & -1 \\ -1 & 2 & 0 \\ 1 & -3 & 1 \end{pmatrix}, (A^2)^{-1} = \begin{pmatrix} -3 & -2 & 4 \\ 7 & 2 & -7 \\ -1 & 1 & 0 \end{pmatrix}.
1.10. \frac{1}{6(a - 3)}\begin{pmatrix} 3a - 8 & 4 & -3 \\ 4 - a & 2a - 2 & -3 \\ -2 & -8 & 6 \end{pmatrix}, a ≠ 3.
1.11. X = \begin{pmatrix} -2 & 6 & -5 \\ 1 & -4 & 4 \end{pmatrix}.
2.1. The sets in (a) and (d) are subspaces, the sets in (b) and (c) are not.
2.3. E.g. im \begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} = ker \begin{pmatrix} 1 & -1 & -2 \end{pmatrix}.
2.4. Only the set in (b) is linearly dependent.
2.5. No. Yes.
2.7. (a) ker A: E.g. (−2, −1, 1, 0), (−1, −2, 0, 1). im A: E.g. (1, 1, 2), (1, −1, 1).
(b) ker A: E.g. (−1, −1, 1, 0). im A: E.g. (3, 1, 5), (4, 2, 7), (1, −1, 2).
2.8. E.g. (1, 1, 0, 1), (1, 2, 2, 1), (3, 4, 1, 3).
2.9. (1, −2, 2, −1).
2.10. (a) 2, (b) 3.
2.12. n − 1.
2.14. Only the sum in (b) is direct.
2.15. The projection on U along V is (1, 2, −1). The projection on V along U is (3, 3, 6).
2.16. Only the function in (b) is a linear transformation.
2.17. dim ker A = 2, dim im A = 3.
2.18. F is one-to-one but not onto.
2.19. F^{-1}(x_1, x_2, x_3) = (3x_1 - 4x_2 + 2x_3, -5x_1 + 7x_2 - 3x_3, 4x_1 - 5x_2 + 2x_3).
2.20. F is not one-to-one but onto. The kernel is the set of constant polynomials. The
image is P .
3.1. π/3.
3.16. (a) 2.
(b) E.g. (1, 2, 3, 4), (1, −1, 2, 3) and (1, 1, 1), (2, −1, 5), respectively.
1 23
1 t
3.17. 3 − 42 2 .
3.18. y = -(8/5)t + 22/5.
3.19. y = (1/2)t^2 + (9/10)t + 23/10.
5.6. \begin{pmatrix} 2 & -7 & -12 \\ -4 & 13 & 23 \\ 3 & -6 & -12 \end{pmatrix}.
5.7. \frac{1}{4}\begin{pmatrix} 2+\sqrt{3} & 2-\sqrt{3} & \sqrt{2} \\ 2-\sqrt{3} & 2+\sqrt{3} & -\sqrt{2} \\ -\sqrt{2} & \sqrt{2} & 2\sqrt{3} \end{pmatrix}.
5.8. (a) 1, (b) 1.
5.9. U ′ : x1 − 2x2 − x3 = 0, U ′′ : x = t(2, −1, 3).
5.10. U ′ : x = s(0, 3, 3, −2) + t(1, −1, −2, 1), U ′′ : x = s(1, 0, 0, 1) + t(0, 1, 1, 0).
5.11. U ′ : x1 + x2 + 2x3 = 0, U ′′ : x = t(1, 3, −1).
5.12. \frac{1}{9}\begin{pmatrix} 7 & -4 & 4 \\ -4 & 1 & 8 \\ 4 & 8 & 1 \end{pmatrix}.
5.13. \begin{pmatrix} -3 & 2 & 6 \\ 8 & -3 & -12 \\ -4 & 2 & 7 \end{pmatrix}.
5.15. (a) Rotation about the origin through the angle π/6 in the direction from e2 towards e1.
(b) Orthogonal reflection in the line x2 = √3 x1.
(c) Rotation about the origin through the angle π/4 in the direction from e1 towards e2.
5.16. (a) Rotation about the line x = t(5, -1, 1) through the angle 2π/3 in the anticlockwise direction when looking from the point (5, -1, 1) towards the origin.
(b) Orthogonal reflection in the plane 2x1 - x2 - 2x3 = 0.
(c) Rotation about the line x = t(1, 1, 1) through the angle 2π/3 in the clockwise direction when looking from the point (1, 1, 1) towards the origin.
(d) Rotation about the line x = t(1, 2, 3) through the angle π.
(e) Rotation about the line x = t(1, 1, 1) through the angle π/3 in the anticlockwise direction when looking from the point (1, 1, 1) towards the origin, followed by reflection in the origin.
6.1. (a) 1: t(1, 2), t ≠ 0, 2: t(2, 3), t ≠ 0.
(b) No eigenvalues.
(c) -3: t(9, -11, 4), t ≠ 0, 1: t(1, 1, 0), t ≠ 0, 2: t(1, 1, 1), t ≠ 0.
(d) 2: s(0, 0, 1) + t(1, 2, 0), s ≠ 0 or t ≠ 0.
6.3. (a) 0 and 1. (b) −1 and 1.
6.6. (a) Diagonalisable, e.g.
T = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 2 & 1 & -1 \end{pmatrix}, \quad T^{-1}AT = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
(b) E.g.
T = \frac{1}{3}\begin{pmatrix} 1 & 2 & -2 \\ 2 & 1 & 2 \\ -2 & 2 & 1 \end{pmatrix}, \quad T^tAT = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 18 \end{pmatrix}.
6.17. \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix} = 3e^{-6t}\begin{pmatrix} 1 \\ -3 \\ 1 \end{pmatrix} + e^{-t}\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} + 2e^{5t}\begin{pmatrix} 3 \\ 2 \\ 3 \end{pmatrix}.
6.18. \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ -1 & 1 & 1 \\ 0 & -2 & 3 \end{pmatrix}\begin{pmatrix} a_1\cos t + b_1\sin t \\ a_2e^{t} + b_2e^{-t} \\ a_3e^{\sqrt{6}\,t} + b_3e^{-\sqrt{6}\,t} \end{pmatrix}.
6.19. Eigenfrequencies
\frac{\sqrt{2-\sqrt{2}}\,q}{2\pi}, \quad \frac{\sqrt{2}\,q}{2\pi}, \quad \frac{\sqrt{2+\sqrt{2}}\,q}{2\pi}
with associated eigenvectors
\begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix},
respectively.
7.1. (a) E.g. (1/√2)(0, 1, 1), (1/√3)(1, 1, -1), (1/√6)(2, -1, 1); 2(x_1')^2 + 2(x_2')^2 + 8(x_3')^2.
(b) E.g. (1/√2)(0, 1, 1), (1, 0, 0), (1/√2)(0, -1, 1); (x_2')^2 + 2(x_3')^2.
7.2. Minimum value 2 at ±(1/√30)(1, -5, 2), maximum value 8 at ±(1/√6)(1, 1, 2).
7.3. Minimum value −27, maximum value 81.
7.4. (a) 1/4. (b) No.
7.5. 6.
7.7. 1.
7.8. The semi-major axis has length 2 and direction (1, 2). The semi-minor axis has length √2 and direction (-2, 1).
7.9. Hyperboloid of two sheets. 1/√8.
7.10. The points closest to the origin are ±\frac{1}{6\sqrt{3}}(1, 1, -2). The points furthest from the origin are ±\frac{1}{3\sqrt{2}}(1, 1, 1).
7.17. Ellipsoid if a > 0, elliptic cylinder if a = 0, hyperboloid of one sheet if −1 < a < 0,
hyperbolic cylinder if a = −1 and hyperboloid of two sheets if a < −1.
7.18. (a) Yes. (b) No.
7.19. Two sheets if abc > 0, one sheet if abc < 0.
7.20. (2, 1).
Index
A
addition
of matrices, 1
of vectors, 11
additive inverse, 11
adjugate, 62
algebraic multiplicity, 92
alternating, 51
angle, 35
B
basis, 16
orthonormal, 37
Bessel's inequality, 48
bilinear form, 107
C
Cauchy–Schwarz inequality, 35
characteristic polynomial, 87
column matrix, 1
composition of linear transformations, 28
cone, 115
coordinate, 17
Cramer's rule, 62
cylinder
elliptic, 115
hyperbolic, 116
parabolic, 116
D
determinant, 53
of a linear transformation, 73
diagonal matrix, 59
diagonalisable matrix, 89
diagonalisation of a quadratic form, 109
dimension, 20
direct sum, 24
distance, 43
dot product, 33
E
eigenfrequency, 102
eigenmode, 102
eigenspace, 92
eigenvalue
of a linear transformation, 87
of a quadratic form, 110
eigenvector, 87
ellipse, 112
ellipsoid, 114
elliptic cylinder, 115
elliptic paraboloid, 116
entry, 1
expansion along a row or column, 57
F
finite-dimensional, 20
G
generate, 15
generator, 15
geometric multiplicity, 92
Gram–Schmidt orthogonalisation, 38
H
Hermitian, 97
Householder matrix, 85
hyperbola, 113
hyperbolic cylinder, 116
hyperbolic paraboloid, 116
hyperboloid
of one sheet, 115
of two sheets, 115
I
identity mapping, 67
S
scalar, 11
scalar multiplication
of matrices, 2
of vectors, 11
sign function, 119
signature, 117
size of a matrix, 1
skew-symmetric matrix, 9
span, 15
spectral theorem, 97
square matrix, 1
standard basis, 17
subspace, 12
sum
direct, 24
of matrices, 1
of subspaces, 24
of vectors, 11
surface of revolution, 116
Sylvester’s law of inertia, 117
symmetric
bilinear form, 108
linear transformation, 76
matrix, 5
T
trace, 87
transition matrix, 72
transpose, 5
triangle inequality, 36
U
unit matrix, 5
unit vector, 34
upper triangular matrix, 59
V
vector, 11
Z
zero matrix, 1
zero subspace, 13
zero vector, 11