LECTURES ON
LINEAR ALGEBRA
Donald S. Passman
University of Wisconsin-Madison, USA
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI • TOKYO
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center,
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from
the publisher.
Printed in Singapore
Preface
These are the notes from a one-year, junior-senior level linear alge-
bra course I offered at the University of Wisconsin-Madison some 40
plus years ago. The notes were actually hand written on ditto masters,
run off and given to the class at the time of the lectures. Each section
corresponds to about one week or three one-hour lectures.
The course was very enjoyable and I even remember three particular
A students, but not their names. One always had his hand raised to
answer every question I posed to the class. I told him he reminded me
of myself when I was an undergraduate. The second had a wonderful
sense of humor and always included jokes in each of his homework
assignments. The third was a sophomore basketball player. Of course,
the U.W. athletics department just couldn’t accept the fact that he
was a strong math student.
In those years, I believed that elementary row and column opera-
tions should not be part of such a course, but when I began to translate
these notes into TeX, I relented. So I added material on this topic. I
also added a brief final chapter on infinite dimensional vector spaces,
including the existence of bases and dimension. The proofs here might
be considered a bit skimpy.
Most undergraduate linear algebra courses are not as sophisticated
as this one was. Most math graduate students are assumed to have
had a good course in linear algebra, but sadly many have not. Reading
these notes might be an appropriate way to fill in the gap.
D. S. Passman
Madison, Wisconsin
San Diego, California
February 2021
Contents
Preface

Chapter I. Vector Spaces
1. Fields
2. Vector Spaces
3. Subspaces
4. Spanning and Linear Independence
5. The Replacement Theorem
6. Matrices and Elementary Operations
7. Linear Equations and Bases

Chapter II. Linear Transformations
8. Linear Transformations
9. Kernels and Images
10. Quotient Spaces
11. Matrix Correspondence
12. Products of Linear Transformations
13. Eigenvalues and Eigenvectors

Chapter III. Determinants
14. Volume Functions
15. Determinants
16. Consequences of Uniqueness
17. Adjoints and Inverses
18. The Characteristic Polynomial
19. The Cayley-Hamilton Theorem
20. Nilpotent Transformations
21. Jordan Canonical Form
CHAPTER I
Vector Spaces
1. Fields
Linear algebra is basically the study of linear equations. Let us
start with a very simple example like 4x + 3 = 5 or more generally
ax + b = c. Here of course x is the unknown and a, b and c are known
quantities. Now this is really a simple equation and we all know how
to solve for x. But before we seek the unknown, perhaps we had better
be sure we know the knowns.
First, a, b and c must belong to some system S with an arithmetic.
There is an addition denoted by + and a multiplication indicated by
juxtaposition, that is the elements to be multiplied are written adjacent
to each other. For example, S could be the set of integers, rational
numbers Q, real numbers R or complex numbers C. But, as we will
see, the set of integers is really inadequate.
Let us now try to solve for x. Clearly ax+b = c yields ax = c−b and
then x = (c − b)/a. We see that in order to solve even the simplest of
all equations, S must also have a subtraction (the opposite of addition)
and a division (the opposite of multiplication). Now the quotient of two
integers need not be an integer and we see that the original equation
4x + 3 = 5 in fact has no integer solution (since of course x must be
1/2). Thus the integers are not adequate, but it turns out that Q, R
and C are.
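To see concretely why Q suffices where the integers do not, here is a minimal Python sketch (the helper name solve_linear is my own) that solves ax + b = c exactly using the standard library's Fraction type.

from fractions import Fraction

def solve_linear(a, b, c):
    """Solve a*x + b = c for x, assuming the coefficient a is nonzero."""
    if a == 0:
        raise ValueError("need a nonzero coefficient a")
    return (c - b) / a

# 4x + 3 = 5 has no integer solution, but over Q the answer is 1/2.
x = solve_linear(Fraction(4), Fraction(3), Fraction(5))
print(x)                        # 1/2
assert Fraction(4) * x + 3 == 5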
We can now develop three theories of linear algebra, one for each
of Q, R and C. What we would discover is that most theorems proved
in one of these situations would carry over to the other two cases and
the proofs would be the same. Since it is rather silly to do things three
times when one will suffice, we seek the common property of these three
sets that makes everything work. The common property is that they
are fields.
Formally, a field F is a set of elements with two operations, addition
and multiplication, defined that satisfy a number of axioms. First there
are the axioms of addition.
In particular, if a ≠ 0, then x = a−1 (c + (−b)) solves ax + b = c. Indeed, substituting gives

ax + b = a(a−1 (c + (−b))) + b
= (aa−1 )(c + (−b)) + b   by the associative law for multiplication
= 1(c + (−b)) + b   by the definition of a−1
= (c + (−b)) + b   by the definition of 1
= c + ((−b) + b)   by the associative law for addition
= c + 0   by the definition of −b
= c   by the definition of 0

Moreover, every a ∈ F satisfies a0 = 0, since

a0 = a(0 + 0) = a0 + a0

and adding −(a0) to both sides gives 0 = a0.
[Figure: the real number line, with the integer points −3, −2, −1, 0, 1, 2, 3 marked.]
fields. At some point, we will see that by assuming that our field F is
algebraically closed, we will obtain much nicer theorems on the struc-
ture of certain functions called linear transformations.
Example 1.4. The above fields all have infinitely many elements.
There are however finite fields. The simplest has just two elements.
Think about the arithmetic of the ordinary integers and the way even
and odd numbers add and multiply. We define a field with two elements
“even” and “odd” with addition and multiplication given by

even + even = even    even + odd = odd    odd + odd = even
even · even = even    even · odd = even   odd · odd = odd
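Writing 0 for “even” and 1 for “odd”, this is just arithmetic modulo 2. The following Python sketch (my own, not from the text) models the field and verifies several field axioms by brute force.

F2 = (0, 1)   # 0 = "even", 1 = "odd"

def add(a, b):
    return (a + b) % 2    # odd + odd = even, even + odd = odd, ...

def mul(a, b):
    return (a * b) % 2    # odd * odd = odd, everything else is even

for a in F2:
    for b in F2:
        assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
        for c in F2:
            assert add(add(a, b), c) == add(a, add(b, c))
            assert mul(mul(a, b), c) == mul(a, mul(b, c))
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))

# The only nonzero element, 1, is its own multiplicative inverse.
assert mul(1, 1) == 1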
Problems
2. Vector Spaces
Let F be a fixed field and let e, f, g ∈ F . Consider the linear
equation
ex + f y + gz = 0
with x, y, z as unknowns. We denote solutions of this, with x, y, z ∈ F ,
by ordered triples (x, y, z) and let S denote the set of all such solutions.
Note that the same symbol 0 is used to denote the zero in F as well as
the zero vector in V . However this almost never causes confusion.
Thus a vector space V satisfies the same addition axioms as does
a field. This means, for example, that we can define unambiguously
α1 + α2 + · · · + αn the sum of any finite number of vectors and that this
sum is independent of the order in which the summands are written.
It means also that we can always solve equations of the form x + α = β
for x ∈ V . In fact, just about anything true about the addition in fields
carries over to V . It is easy to check that the addition defined on S
satisfies these axioms.
The axioms of multiplication are a little different. Remember we
can only multiply a vector by a field element and the field element
always occurs on the left. (This last part is really just a matter of
convention.) In particular, there is no commutative law and only one
possibility for associativity.
For example, consider the equation

ax + α = β

with a ≠ 0. Substituting x = a−1 (β + (−α)) gives

ax + α = a(a−1 (β + (−α))) + α
= (aa−1 )(β + (−α)) + α   by the associative law of multiplication
= 1(β + (−α)) + α   by definition of a−1
= (β + (−α)) + α   by the unital law
= β + ((−α) + α)   by the associative law of addition
= β + 0   by definition of −α
= β   by definition of 0
Therefore x is indeed a solution. Notice how the unital law gets used
above. In the future we denote β + (−α) by β − α.
The remaining two axioms intertwine addition and multiplication.
Observe that unlike the two distributive laws for fields which are basi-
cally the same, the two laws here are really different. This is because
scalar multiplication is not only not commutative, but in fact it is not
even defined in the other order.
Let us consider some examples of vector spaces.
Example 2.2. Let U be a set and let V be the set of all functions
from U into F . If α, β ∈ V and a ∈ F , then we define the functions
α + β and aα from U to F by
(α + β)(u) = α(u) + β(u)
(aα)(u) = a(α(u)) for all u ∈ U
It is easy to check that with these definitions V is a vector space over
the field F .
Example 2.3. Now the field C is endowed with a nice arithmetic
and it contains R as a subfield. Thus there is an addition in C and a
natural multiplication of elements of C by elements of R. In this way
we see easily that C is a vector space over R.
More generally, if K is any field containing F , then by suitably
restricting the multiplication we can make K a vector space over F .
This example is not as silly as it first appears. For suppose that we
know a good deal about F , but very little about K. Then it is indeed
possible to use certain vector space results to deduce information about
K. This is in fact a very important tool in the study of fields.
Example 2.4. Perhaps the most popular examples of vector spaces
are as follows. For each integer n ≥ 1, let F n denote the set of n-tuples
of elements of F . Thus F n = {(a1 , a2 , . . . , an ) | ai ∈ F } and of course
(a1 , a2 , . . . , an ) = (b1 , b2 , . . . , bn ) if and only if the corresponding entries
are equal, that is a1 = b1 , a2 = b2 , . . . , an = bn . Now let α, β ∈ F n
with
α = (a1 , a2 , . . . , an ), β = (b1 , b2 , . . . , bn )
and let c ∈ F . Then we define
α + β = (a1 + b1 , a2 + b2 , . . . , an + bn ) ∈ F n
and
cα = (ca1 , ca2 , . . . , can ) ∈ F n
The axioms of F carry over easily to show that F n is a vector space
over F .
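The componentwise operations are easy to model on a computer. A minimal Python sketch (the names vadd and vscale are my own) of the F n arithmetic, here with F = Q:

from fractions import Fraction

def vadd(alpha, beta):
    """Componentwise sum of two n-tuples."""
    return tuple(a + b for a, b in zip(alpha, beta))

def vscale(c, alpha):
    """Multiply every entry of the n-tuple alpha by the scalar c."""
    return tuple(c * a for a in alpha)

alpha = (Fraction(1), Fraction(2), Fraction(3))
beta = (Fraction(4), Fraction(5), Fraction(6))
print(vadd(alpha, beta))                # (5, 7, 9)
print(vscale(Fraction(1, 2), alpha))    # (1/2, 1, 3/2)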
Example 2.5. Let F [x] denote the set of all polynomials in the variable x with coefficients in F , that is, all formal sums α = ∑_{i≥0} ai xi with ai ∈ F , with the proviso that only finitely many of their coefficients are not zero.
Let c ∈ F . Then with α = ∑_i ai xi and β = ∑_i bi xi as above, we define

α + β = ∑_{i=0}^∞ (ai + bi )xi

cα = ∑_{i=0}^∞ (cai )xi

and we define the product αβ = ∑_n cn xn by

cn = ∑_{i+j=n} ai bj = ∑_{i=0}^n ai bn−i = ∑_{j=0}^n an−j bj
It is easy to check that with this multiplication F [x] satisfies all the axioms of a field with the exception of the existence of multiplicative inverses.
(iii ) We have
α + (−1)α = (1)α + (−1)α
= (1 + (−1))α = 0α = 0
Thus adding −α to both sides, we obtain (−1)α = −α and the lemma
is proved.
Problems
2.1. In the text we showed that a solution of the equation
ax + α = β
with a ≠ 0 is x = a−1 (β − α). Prove that this is the unique solution.
2.2. Solve the simultaneous vector equations
ax + by = α
cx + dy = β
under the assumption that ad − bc ≠ 0.
2.3. Show that the set of functions from U into F , as described in
Example 2.2, is a vector space.
2.4. Prove that F n is a vector space over the field F . If necessary,
try it first for n = 2 or 3.
2.5. In F n define αi = (0, 0, . . . , 1, . . . , 0) where the 1 occurs as the
ith entry, and all the other entries are 0. Show that every element of
F n can be written uniquely as a sum
α = a1 α1 + a2 α2 + · · · + an αn
for suitable ai ∈ F .
2.6. Clearly R2 is the ordinary Euclidean plane. Let α, β ∈ R2 .
Describe geometrically the quadrilateral with vertices 0, α, α + β and
β. Describe geometrically the relationship between aα and α for any
a ∈ R.
2.7. Let α1 = (1, 1, 2), α2 = (0, 1, 4) and α3 = (1, 0, −1) be vectors
in Q3 . Show that every element α ∈ Q3 can be written uniquely as a
sum α = a1 α1 + a2 α2 + a3 α3 for suitable a1 , a2 , a3 ∈ Q.
2.8. Can we find finitely many polynomials α1 , α2 , . . . , αn ∈ F [x]
with the property that every element α in F [x] is of the form α =
a1 α1 + a2 α2 + · · · + an αn for suitable ai ∈ F ?
3. Subspaces
Perhaps the subtitle of this section should be “new spaces from
old”. In mathematics, as soon as an object is defined, one seeks ways
of constructing new objects from the original and the obvious first step
is to look inside the original for subobjects. Thus we consider subfields
of fields, like the reals inside the complex numbers, and of course we
consider subsets of sets.
Let V be a vector space over F and let W be a subset of V . Then
W is said to be a subspace of V if W is a vector space in its own right
with the same addition and scalar multiplication as in V . Let us make
this last statement more explicit. Suppose α, β ∈ W and a ∈ F . Since
W is a vector space, we can compute α + β and aα in W . Since α and
β also belong to V ⊇ W , we can also compute α + β and aα in V .
Well, the fact that the arithmetic is the same in both W and V means
precisely that the two sums α + β are the same and again that the two
products aα are identical.
At this point, we could start listing examples of vector spaces V
and subspaces W , but in each case we would have to verify that W
satisfies all ten axioms and this is rather tedious. So what we do first
is to decide which of these axioms we really have to check. Suppose W
is a subspace of V . Then W has a zero element and negatives, and of
course the same is true of V .
Lemma 3.1. Let W be a subspace of V . Then the zero element of
W is the same as that of V , and negatives in W are the same as in V .
Proof. Let 0W be the zero element of W . Then in W we have
0W + 0W = 0W . But this is the same addition as in V , so viewing this
as an equation in V and adding −0W (the negative in V ) to both sides,
we get immediately 0W = 0, the zero of V .
Let α ∈ W and let β be its negative in W . Then β + α = 0W = 0.
Again, this is the same addition as in V , so adding −α to both sides
yields β = −α.
We can now obtain our simplified subspace criteria.
Theorem 3.1. Let W be a subset of a vector space V . Then W is
a subspace of V if and only if
i. 0 ∈ W ,
ii. α, β ∈ W implies that α + β ∈ W , and
iii. α ∈ W and a ∈ F imply that aα ∈ W .
Proof. Suppose first that W is a subspace of V . Then W is a
vector space, so W has a zero element 0W . By the previous lemma,
Example 3.3. Let U be a set and let V be the set of all functions
from U to F . As we have seen earlier, V is a vector space over F with
addition and scalar multiplication given by

(α + β)(u) = α(u) + β(u)
(aα)(u) = a(α(u)) for all u ∈ U
Example 3.4. Let V be the set of all real valued functions defined
on the interval [0, 1]. That is, V is the set of all functions from U = [0, 1]
to F = R and hence, as above, V is a vector space over R. Suppose C
denotes the set of all continuous functions in V . Then 0 is continuous,
and the sum of two continuous functions is continuous. Furthermore,
C is closed under scalar multiplication, so C is in fact a subspace of V .
Next, let V = F n and consider the system of homogeneous linear equations

e1 x1 + e2 x2 + · · · + en xn = 0
f1 x1 + f2 x2 + · · · + fn xn = 0
g1 x1 + g2 x2 + · · · + gn xn = 0
Observe that the word homogeneous means that zeros occur on the
right hand side of the equal sign. Let S denote the set of all n-tuples
(x1 , x2 , . . . , xn ) ∈ F n that are solutions to all three equations. We show
that S is a subspace of V .
First 0 = (0, 0, . . . , 0) is clearly in S. Now let α = (x1 , x2 , . . . , xn ) ∈
S, β = (y1 , y2 , . . . , yn ) ∈ S and let a ∈ F . Then
e1 (x1 + y1 ) + e2 (x2 + y2 ) + · · · + en (xn + yn )
= (e1 x1 + · · · + en xn ) + (e1 y1 + · · · + en yn ) = 0 + 0 = 0

and similarly for the other two equations, so α + β ∈ S. Also

e1 (ax1 ) + e2 (ax2 ) + · · · + en (axn ) = a(e1 x1 + · · · + en xn ) = a0 = 0

and again similarly for the other two equations, so aα ∈ S. Thus S is indeed a subspace of V .

Next we consider sums of subspaces. If W1 and W2 are subspaces of V , we define their sum to be
W1 + W2 = {α1 + α2 | α1 ∈ W1 , α2 ∈ W2 }
To see that this is a subspace, let α, β ∈ W1 + W2 and a ∈ F , and write

α = α1 + α2 , β = β1 + β2

with α1 , β1 ∈ W1 and α2 , β2 ∈ W2 .
Then
α + β = (α1 + α2 ) + (β1 + β2 )
= (α1 + β1 ) + (α2 + β2 ) ∈ W1 + W2
since α1 + β1 ∈ W1 and α2 + β2 ∈ W2 . Also
aα = a(α1 + α2 )
= (aα1 ) + (aα2 ) ∈ W1 + W2
since aα1 ∈ W1 and aα2 ∈ W2 . Finally, 0 = 0 + 0 ∈ W1 + W2 , so
W1 + W2 is indeed a subspace.
Now every element of W1 + W2 can be written as a sum of an
element of W1 and one of W2 , but the summands need not be uniquely
determined. If it happens that all such summands are unique, then
we say that the sum is direct and write W1 ⊕ W2 for W1 + W2 . The
following lemma yields a simple test for deciding when a sum is direct.
Problems
3.1. Let V = F 2 and let α = (1, 0), β = (0, 1) and γ = (1, 1) be three vectors in V . (Here ⟨α⟩ denotes the subspace spanned by α.) By computing both sides of each inequality, show that

⟨α⟩ + (⟨β⟩ ∩ ⟨γ⟩) ≠ (⟨α⟩ + ⟨β⟩) ∩ (⟨α⟩ + ⟨γ⟩)

and

⟨α⟩ ∩ (⟨β⟩ + ⟨γ⟩) ≠ (⟨α⟩ ∩ ⟨β⟩) + (⟨α⟩ ∩ ⟨γ⟩)
Thus appropriate analogs of the distributive law do not hold for these
operations on subspaces.
3.2. Let α1 , α2 , α3 , α4 be four vectors in V that satisfy
α1 − 3α2 + 2α3 − 5α4 = 0
Show that ⟨α1 , α2 , α3 , α4 ⟩ = ⟨α2 , α3 , α4 ⟩.
3.3. Let W1 and W2 be subspaces of V . Show that W1 ∪ W2 is
a subspace if and only if W1 ⊆ W2 or W2 ⊆ W1 . (Hint. If neither
of these two inclusions hold, then we can choose α1 ∈ W1 \ W2 and
α2 ∈ W2 \ W1 . Consider the element α1 + α2 .)
4. Spanning and Linear Independence
Proof. Suppose first that uniqueness holds and let γi1 , γi2 , . . . γim
be a finite subset of C. Say a1 , a2 , . . . , am ∈ F with
a1 γi1 + a2 γi2 + · · · + am γim = 0
Since we know that
0γi1 + 0γi2 + · · · + 0γim = 0
uniqueness for the vector α = 0 implies that a1 = a2 = · · · = am = 0
and C is linearly independent.
Conversely suppose that C is linearly independent and suppose α ∈
V can be written as
b1 γi1 + b2 γi2 + · · · + br γir = α

and

c1 γj1 + c2 γj2 + · · · + cs γjs = α
two possibly different finite F -linear combinations of elements of C. By
adding zero terms to each equation, we may assume that the vectors of
C that occur in each equation are the same, and then by renumbering
we may assume that r = s and γik = γjk . Thus, we have
b1 γi1 + b2 γi2 + · · · + br γir = α

and

c1 γi1 + c2 γi2 + · · · + cr γir = α
Subtracting the second equation from the first then yields
(b1 − c1 )γi1 + (b2 − c2 )γi2 + · · · + (br − cr )γir = α − α = 0
Since C is linearly independent, each of these coefficients must vanish.
Thus bi − ci = 0, so bi = ci and the coefficients for α are unique.
Thus B is a basis if and only if it is a linearly independent spanning
set. We now consider ways to find bases.
Suppose C is a subset of V . If C spans V , then obviously any set
bigger than C also spans V . However, it is not true that we can in-
discriminately remove elements from C and still maintain the spanning
property. We say that C is a minimal spanning set of V if C spans V ,
but for every vector γ ∈ C, the set C \ {γ} does not span.
In the other direction, suppose C is a linearly independent subset
of V . Then clearly, every subset of C is also linearly independent, but
we cannot add vectors indiscriminately to C and still maintain this
property. We say that C is a maximal linearly independent set if C is
linearly independent, but for every vector γ ∈ V \ C, the set C ∪ {γ} is
not linearly independent. The interrelations between these definitions
is given by
Proof. C has only finitely many subsets and one of these subsets,
namely C itself, spans V . Thus we may choose a subset B of C that is a
spanning set of smallest possible size. Clearly B is a minimal spanning
set of V and therefore by Theorem 4.1, B is a basis.
If V is a finite dimensional vector space, then V does have such a
finite spanning set, and hence V does have a basis.
It is in fact true that every vector space has a basis. However a
proof of this requires going beyond the above finite type argument,
and one must use transfinite methods. We will put this study off until
the end of these notes since, for the most part, we are concerned with
finite dimensional spaces.
Problems
4.1. Find a basis for the solution space of the real linear equation
x1 − 2x2 + x3 − x4 = 0
4.2. Find a basis for the space of simultaneous solutions of the real
linear equations
x1 − 2x2 + x3 − x4 = 0
2x1 − 3x2 − x3 + 2x4 = 0
4.3. For each i ≥ 0, let βi ∈ F [x] be a polynomial of degree i. Prove
that B = {β0 , β1 , β2 , . . .} is a basis for F [x].
4.4. Verify that B = {β1 , β2 , . . . , βn } as given in Example 4.3 is a
basis for F n .
4.5. Let V be a vector space with basis B = {β1 , β2 , β3 , β4 }. Prove
that C = {β1 , β2 − β1 , β3 − β2 , β4 − β3 } is also a basis.
4.6. Prove that the subset
C = {(1, 0, −1, 2), (1, 1, 0, 1), (1, 0, 2, 3), (0, 0, 1, 2)}
of R4 is linearly independent.
4.7. Prove that the subset C = {(1, 0, 2), (1, 1, 0), (1, 0, −1)} of Q3
is a spanning set.
4.8. Let γ ∈ V . Show that {γ} is linearly independent if and only if γ ≠ 0.
4.9. Let C be a subset of vector space V . Show that C is linearly
independent if and only if C is a basis for some subspace of V .
5. The Replacement Theorem

Theorem. If W1 and W2 are finite dimensional subspaces of a vector space V , then W1 + W2 and W1 ∩ W2 are also finite dimensional and

dimF (W1 + W2 ) = dimF W1 + dimF W2 − dimF (W1 ∩ W2 )
Proof. We know that all of the above spaces are finite dimen-
sional. Let C = {γ1 , γ2 , . . . , γt } be a basis for W1 ∩ W2 , so that
dimF (W1 ∩ W2 ) = t. Since W1 ∩ W2 is a subspace of W1 , we can
extend C to a basis A = {γ1 , γ2 , . . . , γt , α1 , α2 , . . . , αr } of W1 . Simi-
larly, we can extend C to a basis B = {γ1 , γ2 , . . . , γt , β1 , β2 , . . . , βs } of
W2 . Observe that dimF W1 = t + r and dimF W2 = t + s, so
dimF W1 + dimF W2 − dimF (W1 ∩ W2 )
= (t + r) + (t + s) − t = r + s + t
In other words, what we want to prove is that dimF (W1 +W2 ) = r+s+t.
Now we have a nice set
D = {α1 , α2 , . . . , αr , β1 , β2 , . . . , βs , γ1 , γ2 , . . . , γt }
of r + s + t seemingly distinct vectors that are all clearly contained in
W1 + W2 . Thus the obvious approach is to prove that D is a basis for
W1 + W2 .
First let δ ∈ W1 + W2 . Then δ = α + β with α ∈ W1 and β ∈ W2 .
Since A spans W1 and B spans W2 , we have
α = a1 α1 + · · · + ar αr + c1 γ1 + · · · + ct γt
β = b1 β1 + · · · + bs βs + d1 γ1 + · · · + dt γt
for suitable field elements ai , bi , ci and di . Thus
δ = α + β = a1 α1 + · · · + ar αr + b1 β1 + · · · + bs βs
+ (c1 + d1 )γ1 + · · · + (ct + dt )γt
and D spans W1 + W2 .
Now suppose that some F -linear sum of the elements of D sums to
zero. Say
a1 α1 + · · · + ar αr + b1 β1 + · · · + bs βs + c1 γ1 + · · · + ct γt = 0
Then
a1 α1 + · · · + ar αr + c1 γ1 + · · · + ct γt = (−b1 )β1 + · · · + (−bs )βs
Now the above left-hand side is clearly in W1 and the right-hand side
is in W2 , so this common vector which we call δ is in W1 ∩ W2 . Thus,
since C spans W1 ∩ W2 , we have
δ = d1 γ1 + · · · + dt γt
for suitable di ∈ F . This yields the equations
a1 α1 + · · · + ar αr + c1 γ1 + · · · + ct γt = d1 γ1 + · · · + dt γt
(−b1 )β1 + · · · + (−bs )βs = d1 γ1 + · · · + dt γt
or equivalently
a1 α1 + · · · + ar αr + (c1 − d1 )γ1 + · · · + (ct − dt )γt = 0
b1 β1 + · · · + bs βs + d1 γ1 + · · · + dt γt = 0
Therefore, by the linear independence of A and B, we conclude that
a1 = · · · = ar = 0, b1 = · · · = bs = 0, d1 = · · · = dt = 0
and then c1 = · · · = ct = 0. Thus D is linearly independent and hence a
basis for W1 + W2 . Observe that the above also tells us that all r + s + t
elements of D are distinct so
dimF (W1 + W2 ) = |D| = r + s + t
and the result follows.
Problems
5.1. Suppose that A and B are finite disjoint index sets. Convince
yourself that for vectors αi ∈ V we have
∑_{i∈A∪B} αi = ∑_{i∈A} αi + ∑_{i∈B} αi
What does this say when A = ∅ is the empty set? What is a basis for
the space V = 0?
Find a basis and the dimension of each of the following F -vector
spaces.
5.2. V = F n .
5.3. V = F [x].
5.4. V = C over the field F = R.
5.5. V is the set of all functions from a finite set U into the field F .
Let V be a vector space of finite dimension n.
5.6. Let W1 and W2 be subspaces of V with
dimF W1 + dimF W2 > n
Prove that W1 ∩ W2 ≠ 0.
5.7. Let V = Wk > Wk−1 > · · · > W1 > W0 = 0 be a finite chain
of distinct subspaces of V . Show that k ≤ n.
6. Matrices and Elementary Operations

The elementary row operations on an m × n matrix A = [aij ] over F come in three types.
R1. Interchange the ith and kth rows, for i ≠ k. We denote this operation by R1 (i, k).
R2. Multiply the ith row by a nonzero constant c so that the en-
try aij becomes c·aij for all j = 1, 2, . . . , n. We denote this
operation by R2 (i; c).
R3. Finally, for i ≠ k, we add c ∈ F times the kth row to the ith,
so that the entry aij becomes aij + c·akj for all j = 1, 2, . . . , n.
We denote this operation by R3 (i, k; c).
Notice that R2 (i; 1) and R3 (i, k; 0) are both equal to Id, the identity
operation that leaves A unchanged. The key property of all these
operations is that they are invertible. Indeed, one can undo each of
these with another elementary row operation.
Lemma 6.1. Each elementary row operation is invertible with its
inverse being an elementary row operation of the same type.
Proof. Clearly R1 (i, k)R1 (i, k) = Id and R2 (i; c−1 )R2 (i; c) = Id.
Finally since R3 (i, k; c) does not change the kth row of A, we see that
R3 (i, k; −c)R3 (i, k; c) = Id.
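A minimal Python sketch of the three operations (matrices stored as lists of rows of Fractions; the function names mirror R1, R2, R3 but the code is my own), including a check of the inverse relation used for R3 in this proof:

from fractions import Fraction

def R1(A, i, k):
    """Interchange rows i and k (0-indexed)."""
    A[i], A[k] = A[k], A[i]

def R2(A, i, c):
    """Multiply row i by the nonzero constant c."""
    assert c != 0
    A[i] = [c * x for x in A[i]]

def R3(A, i, k, c):
    """Add c times row k to row i, for i != k."""
    assert i != k
    A[i] = [x + c * y for x, y in zip(A[i], A[k])]

A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
B = [row[:] for row in A]      # keep a copy of A
R3(A, 0, 1, Fraction(5))       # apply R3(i, k; c)
R3(A, 0, 1, Fraction(-5))      # ... then R3(i, k; -c)
assert A == B                  # the first operation has been undone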
With this, we can quickly prove that elementary row operations do
not change the row space. Specifically, we have
Lemma 6.2. If A is an m × n matrix over F , and if R is an ele-
mentary row operation, then the row spaces of A and of R(A) are the
same.
Proof. Write B = R(A). By considering the three operations in
turn, we see easily that each row vector of B belongs to the row space
of A and hence the row space of B is contained in the row space of A.
Furthermore, since R−1 (B) = R−1 (R(A)) = A, we obtain the reverse
inclusion and consequently the two row spaces are equal.
Now let us see what we can do with these operations. To start with,
we describe a fairly nice matrix structure. We say that a matrix A is
in row echelon form if
E1. The zero rows of A, if any, are at the bottom, that is they are
all below the nonzero rows.
E2. Each nonzero row, if any, starts with a leading 1. That is, the
first nonzero entry, from left to right, is a 1.
E3. The leading 1s slope down and to the right. Specifically, if the
rows i and i + 1 are both nonzero, then the leading 1 in row
i+1 is contained in a column strictly to the right of the leading
1 of row i.
For example, the matrix

⎡0 1 3 0 2⎤
⎢0 0 0 1 5⎥
⎣0 0 0 0 0⎦

is in row echelon form.
αr−2 , . . . , α1 , in turn, we see that V , and hence A, determines all the
rows of A in order. Thus (iii ) is proved.
At this point, we have no choice but to divide the last row by 3. For-
tunately all the other entries in that row are 0, so again we do not
introduce fractions in the matrix
A6 = ⎡0 1 2 0 4 ⎤
     ⎢0 0 0 1 −5⎥
     ⎣0 0 0 0 1 ⎦
Finally, we subtract 4 times the third row from the first, and add 5
times the third row to the second to obtain the reduced row echelon
matrix

A7 = ⎡0 1 2 0 0⎤
     ⎢0 0 0 1 0⎥
     ⎣0 0 0 0 1⎦
and the procedure is complete.
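Combining the three operations gives the usual reduction algorithm. The sketch below (my own arrangement of the standard procedure) converts a matrix over Q to reduced row echelon form; starting from the rows (0, 1, 2, 0, 4), (0, 0, 0, 1, −5) and (0, 0, 0, 0, 3), it reproduces the matrix A7 above.

from fractions import Fraction

def rref(A):
    """Return the reduced row echelon form of A (a list of rows)."""
    A = [row[:] for row in A]
    m, n = len(A), len(A[0])
    r = 0                                   # index of the next pivot row
    for j in range(n):
        p = next((i for i in range(r, m) if A[i][j] != 0), None)
        if p is None:
            continue                        # no pivot in column j
        A[r], A[p] = A[p], A[r]             # R1: bring the pivot row up
        A[r] = [x / A[r][j] for x in A[r]]  # R2: make the leading entry 1
        for i in range(m):                  # R3: clear the rest of column j
            if i != r and A[i][j] != 0:
                c = A[i][j]
                A[i] = [x - c * y for x, y in zip(A[i], A[r])]
        r += 1
    return A

M = [[Fraction(x) for x in row]
     for row in [[0, 1, 2, 0, 4], [0, 0, 0, 1, -5], [0, 0, 0, 0, 3]]]
for row in rref(M):
    print(row)    # rows (0,1,2,0,0), (0,0,0,1,0), (0,0,0,0,1)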
Problems
6.1. If A is a matrix in row echelon form, prove that the nonzero
rows of A are linearly independent. This can be done without serious
computation.
6.2. List the analogous elementary column operations and prove
the column version of Theorem 6.1.
6.3. Prove that any elementary row operation R and any elemen-
tary column operation C commute in their action on matrices. To be
precise, show that R(C(A)) = C(R(A)) for all matrices A.
6.4. Assume that row subscripts {i1 , i2 } and {k1 , k2 } are disjoint.
Show that the elementary row operations R3 (i1 , k1 ; c1 ) and R3 (i2 , k2 ; c2 )
commute.
6.5. Let A ∈ F m×n . Using a sequence of elementary row and column
operations show that any matrix A can be transformed into a matrix
of the form Dr = [dij ], where d11 = d22 = · · · = drr = 1 and all the
remaining dij are 0.
6.6. Find an example of an integer matrix whose corresponding
reduced row echelon form matrix does not have all integer entries.
6.7. Find a “slanted” basis for the row space of the matrix

A = ⎡1 1  1  2 0⎤
    ⎢2 3  0  1 3⎥
    ⎢1 2 −1 −1 3⎥
    ⎣0 1 −2 −2 1⎦
That is, find a basis that comes from an appropriate reduced row ech-
elon matrix.
6.8. Suppose A ∈ F n×n is a square matrix with linearly independent rows. If A is converted into the reduced row echelon form matrix A′, find the structure of A′.
6.9. If we “unwrap” matrices into straight horizontal lines, then F m×n surely looks like the vector space F mn . With this idea in hand, define addition and scalar multiplication so that F m×n becomes a vector space over F .
6.10. If F m×n is viewed as a vector space as above, determine its
dimension and find a nice basis.
7. Linear Equations and Bases

A system of linear equations over F can be encoded in
the so-called augmented matrix with the constants on the right. The
vertical line is of course not formally part of the matrix structure,
but it does help us to better visualize how the matrix is partitioned.
This matrix clearly contains all the information given by the system of equations, except perhaps for the names of the unknowns.
Problems
7.1. Using Gaussian elimination, solve the system of real linear
equations given by
5x1 + 5x2 − x3 + 7x4 + 2x5 + 5x6 = 9
x1 + x2 − x3 − x4 + x5 − x6 = 2
4x1 + 4x2 − x3 + 5x4 + 2x5 + 4x6 = 8
Which of the variables are free, which are bound, and what is the
solution when all free variables are set to 0?
7.2. Find a basis for the solution space to the system of homoge-
neous linear equations given by
x1 + 2x2 + x3 + 3x4 + x5 + x6 = 0
2x1 + 4x2 + 2x3 + 6x4 + 3x5 + 5x6 = 0
x1 + 2x2 + x3 + 2x4 + 0x5 − x6 = 0
7.3. Without computation, show that the elementary column op-
erations on A do not change the dimension of the row space of A.
Similarly show that the elementary row operations on A do not change
the dimension of the column space of A.
7.4. Show that the integer r in Problem 6.5 is uniquely determined.
Indeed it is the dimension of the row space of A and of the column
space of A.
7.5. Find a subset of the rows of the matrix A of the Problem 6.7
that form a basis for its row space.
7.6. Let A be a square matrix so that A has the same number of
rows as columns. Prove that the rows of A are linearly independent if
and only if the columns are linearly independent.
7.7. Again let A ∈ F n×n be a square matrix and assume that its
rows are linearly independent. Consider the system of linear equations
associated with A|B and let the reduced row echelon matrix of this
augmented matrix be A′|B′. Using Problem 6.8, show that B′ describes
the unique solution to the system of equations.
7.8. State and prove the appropriate analog of Corollary 7.1 in the
nonhomogeneous situation.
Let A ∈ F m×n and let B1 , B2 , . . . , Bt be m × 1 column matrices.
Consider the system of equations associated with the augmented ma-
trices A|Bj for j = 1, 2, . . . , t and form the large augmented matrix
A|B, where B is an m × t matrix with columns B1 , B2 , . . . , Bt . Via
elementary row operations, convert A|B to the reduced row echelon
form matrix A′|B′.
7.9. Explain how A′|B′ can be used to solve the systems associated
to the various A|Bj for all j.
7.10. The comments in Example 7.1 do not hold precisely in this
context. How can one tell from the matrix A′|B′ that the system A|Bj
is inconsistent?
CHAPTER II
Linear Transformations
8. Linear Transformations
So far our study of linear algebra has been confined to vector spaces.
But in fact the essence of this subject is the study of certain functions
defined on these spaces. This is analogous to the situation in calculus.
There one first considers the real line, but the main interest of course
is in the study of real valued functions defined on the real line.
Let us consider a set of simultaneous linear equations over a field
F . Say, for simplicity
e1 x1 + e2 x2 + e3 x3 = b
f1 x1 + f2 x2 + f3 x3 = c
We will think of the coefficients e1 , e2 , e3 , f1 , f2 , f3 as being fixed. What
we would like to know is for which choices of b and c do solutions exist
and then how many solutions are there. Now starting with b and c and
finding x1 , x2 , x3 takes a certain amount of work. On the other hand,
starting with x1 , x2 , x3 and finding b and c from the above is trivial.
Of course, if we look at all pairs b, c and all triples x1 , x2 , x3 , then the
above two considerations are really the same. Therefore we take the
second point of view since it is certainly simpler.
What we have here is a map T that takes ordered triples (x1 , x2 , x3 )
with entries in F to ordered pairs (b, c) with entries in F . In other
words,
T : F3 → F2
But T is not any old function. It is defined in a linear fashion and
therefore we expect it to have some nice properties.
Let α = (x1 , x2 , x3 ), β = (y1 , y2 , y3 ) ∈ F 3 with (x1 , x2 , x3 )T = (b, c) and (y1 , y2 , y3 )T = (b′, c′). Observe that we have written this function T to the right of its argument. Then

e1 (x1 + y1 ) + e2 (x2 + y2 ) + e3 (x3 + y3 )
= (e1 x1 + e2 x2 + e3 x3 ) + (e1 y1 + e2 y2 + e3 y3 ) = b + b′

and similarly

f1 (x1 + y1 ) + f2 (x2 + y2 ) + f3 (x3 + y3 ) = c + c′

Thus we have

(α + β)T = (x1 + y1 , x2 + y2 , x3 + y3 )T = (b + b′, c + c′) = αT + βT
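These identities are easy to test numerically. The Python sketch below fixes arbitrary coefficients ei and fi (my own choices) and checks additivity, together with the companion rule (aα)T = a(αT ), on sample vectors.

from fractions import Fraction

e = (Fraction(1), Fraction(-2), Fraction(3))   # arbitrary e1, e2, e3
f = (Fraction(0), Fraction(4), Fraction(-1))   # arbitrary f1, f2, f3

def T(x):
    """(x1, x2, x3)T = (e1x1 + e2x2 + e3x3, f1x1 + f2x2 + f3x3)."""
    return (sum(ei * xi for ei, xi in zip(e, x)),
            sum(fi * xi for fi, xi in zip(f, x)))

def vadd(u, v):
    return tuple(a + b for a, b in zip(u, v))

alpha = (Fraction(1), Fraction(2), Fraction(3))
beta = (Fraction(4), Fraction(5), Fraction(6))

assert T(vadd(alpha, beta)) == vadd(T(alpha), T(beta))  # (α+β)T = αT+βT
assert T(tuple(7 * x for x in alpha)) == tuple(7 * y for y in T(alpha))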
Example 8.4. Let V be a vector space over the field F . For each
a ∈ F define Ta : V → V by αTa = aα for all α ∈ V . It follows from
the distributive and associative laws that Ta is a linear transformation.
Observe that T0 = 0 and that, by the unital law, T1 = I.
Example 8.5. Let C be considered as a vector space over R. For each complex number α, let Re(α) denote its real part and Im(α) its imaginary part. Then both Re : C → R and Im : C → R are linear transformations of real vector spaces.
Example 8.6. Let C denote the vector space of real valued con-
tinuous functions on the interval [0, 1]. Let S : C → R be defined by
f (x)S = ∫₀¹ f (t) dt
for all f (x) ∈ C. Then S is a linear transformation from C to R.
Example 8.7. Let C be as above and define J : C → C by
(f J)(x) = ∫₀ˣ f (t) dt
Then J is easily seen to be a linear transformation.
Example 8.8. Given integers m, n ≥ 1, we define T : F n → F m
as follows. First fix mn elements aij ∈ F with i = 1, 2, . . . , m and
j = 1, 2, . . . , n. Then set
(c1 , c2 , . . . , cn )T = ( ∑_{j=1}^n a1j cj , ∑_{j=1}^n a2j cj , . . . , ∑_{j=1}^n amj cj )
so
βi T = 0γ1 + 0γ2 + · · · + 1γi + · · · + 0γn = γi
and the theorem is proved.
There are a number of ways of constructing new linear transforma-
tions from old ones. We consider one such now.
Let T : V → W be a linear transformation. As usual we say that
T is one-to-one if for each γ ∈ W there exists at most one α ∈ V with
αT = γ. We say that T is onto if for each γ ∈ W there exists at least
one α ∈ V with αT = γ. Thus if T is one-to-one and onto, then for
each γ ∈ W there exists one and only one α ∈ V with αT = γ. But
this means that α is really a function of γ and we can therefore define
naturally a back map T −1 : W → V given by
γT −1 = α for γ ∈ W
where α is the unique element of V with αT = γ. As we might expect,
T −1 is also a linear transformation.
For brevity we call such a one-to-one and onto linear transformation
an isomorphism.
Lemma 8.2. Let T : V → W be an isomorphism. Then the map
T −1 : W → V is also an isomorphism.
Proof. Let γ1 , γ2 ∈ W and let a ∈ F with γ1 T −1 = α1 and
γ2 T −1 = α2 . Then by definition α1 T = γ1 and α2 T = γ2 . Since T
is a linear transformation, this yields
(α1 + α2 )T = α1 T + α2 T = γ1 + γ2
(aα1 )T = a(α1 T ) = aγ1
Thus again by definition of T −1 we have
(γ1 + γ2 )T −1 = α1 + α2 = γ1 T −1 + γ2 T −1
(aγ1 )T −1 = aα1 = a(γ1 T −1 )
and T −1 is a linear transformation.
Suppose now that γ1 T −1 = γ2 T −1 . Then α1 = α2 so

γ1 = α1 T = α2 T = γ2

and hence T −1 is one-to-one. Finally, let α ∈ V and set γ = αT . Then by definition of T −1 we have γT −1 = α. Thus T −1 is onto and the result follows.
Problems
8.1. Verify that the projection map P of Example 8.3 is a linear
transformation.
8.2. Verify that the map Ta given in Example 8.4 is a linear trans-
formation.
8.3. If you already know calculus, convince yourself that the maps
S and J given in Examples 8.6 and 8.7 are linear transformations.
8.4. Let D : F [x] → F [x] be the formal derivative map. Show that
(αβ)D = α(βD) + (αD)β
for all α, β ∈ F [x]. (Hint. First verify this for α = axⁿ and β = bxᵐ ,
and then prove the result by induction on deg α + deg β.)
8.5. Describe geometrically the linear transformations Ti : R2 → R2
given by
(a, b)T1 = (2a, 2b)
(a, b)T2 = (2a, b)
(a, b)T3 = (−a, −b)
For each Ti find a nonzero vector αi and a scalar ci ∈ R with αi Ti = ci αi .
8.6. Describe geometrically the linear transformation Sθ : R2 → R2
given by
(a, b)Sθ = (a cos θ − b sin θ, a sin θ + b cos θ)
For which angles θ can one find 0 ≠ α ∈ R2 and a ∈ R with αSθ = aα?
8.7. Let T : F 3 → F 2 be given by
(a1 , a2 , a3 )T = (a1 − a2 , a3 − 2a2 + a1 )
Show that T is onto, but not one-to-one.
8.8. Let T : F 3 → F 4 be given by
(a1 , a2 , a3 )T = (a1 − 2a2 + a3 , a1 + a2 , a2 − a3 , a3 )
Prove that T is one-to-one, but not onto.
8.9. Let V be an n-dimensional vector space over F with basis
B = {β1 , β2 , . . . , βn }. Show that the map T : F n → V given by
(a1 , a2 , . . . , an )T = a1 β1 + a2 β2 + · · · + an βn
is an isomorphism. Find T −1 .
8.10. Let V be a finite dimensional vector space and let W1 and
W2 be subspaces of the same dimension. Construct an isomorphism
T : W1 → W2 such that γT = γ for all γ ∈ W1 ∩ W2 .
9. Kernels and Images
Example 9.4. Let us see now how all this applies to the study
of linear equations. We consider a set of m linear equations in the n
unknowns x1 , x2 , . . . , xn . This is given by
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
······
am1 x1 + am2 x2 + · · · + amn xn = bm
Observe that, as usual, the coefficients aij ∈ F are double subscripted.
The first subscript corresponds to the row or the equation, and the
second subscript corresponds to the column of the unknown.
We think of the set {aij } as being fixed. Then the above equations
define a linear transformation T : F n → F m given by
(x1 , x2 , . . . , xn )T = (b1 , b2 , . . . , bm )
Now the solution set of the homogeneous equations, that is when b1 = b2 = · · · = bm = 0, is clearly (0)T←, the kernel of T . This is of course
a subspace of F n and hence has a basis, say {α1 , α2 , . . . , αr }. Thus
every solution of the system of homogeneous equations can be written
uniquely as
a1 α1 + a2 α2 + · · · + ar αr
for suitable ai ∈ F . See Corollary 7.2 for efficient ways to construct
such bases.
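In practice such a kernel basis can be computed mechanically. Here is a brief sketch assuming the third-party sympy library, which uses the column convention Ax = 0 rather than the row convention of the text:

from sympy import Matrix

# Coefficient matrix [a_ij] of a sample homogeneous system over Q.
A = Matrix([[1, 2, 1, 3],
            [2, 4, 0, 2]])

# nullspace() returns a basis of the solution space of A x = 0,
# playing the role of {alpha_1, ..., alpha_r} above.
for v in A.nullspace():
    print(v.T)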
Now we ask for which (b1 , b2 , . . . , bm ) ∈ F m do solutions exist. But
clearly a solution exists if and only if (b1 , b2 , . . . , bm ) ∈ im T . In par-
ticular, the set of m-tuples of constant terms for which a solution ex-
ists is in fact a subspace of F m . Suppose (b1 , b2 , . . . , bm ) ∈ im T and
let β = (y1 , y2 , . . . , yn ) be a solution to the associated equations, so that βT = (b1 , b2 , . . . , bm ). Then the set of all solutions is precisely the coset β + ker T .
Problems
Let T : V → W be a linear transformation with V and W both
vector spaces over F .
9.1. Suppose T is onto. Prove that T and T← define a one-to-one
correspondence between subspaces of V that contain ker T and all sub-
spaces of W .
9.2. Let V1 be a subspace of V and let W1 be a subspace of W with
(V1 )T ⊆ W1 . Show that the restriction map T1 : V1 → W1 given by
αT1 = αT for α ∈ V1 is a linear transformation. Moreover show that
ker T1 = V1 ∩ (ker T ).
9.3. Suppose T is onto and let V1 be a complement for ker T in V .
Prove that the restriction map T1 : V1 → W is an isomorphism.
Consider the following set of three linear equations in four unknowns
with coefficients in Q.
2x1 + x2 + x3 − 4x4 = b1
3x1 + x2 + 3x3 − x4 = b2
x1 + x2 − x3 − 7x4 = b3
and let T : Q4 → Q3 be the corresponding linear transformation.
9.4. Find a basis for ker T and extend this to a basis for Q4 .
9.5. Find a basis for im T and extend this to a basis for Q3 . What
is the rank of T ?
9.6. Find all solutions with b1 = 2, b2 = 3, b3 = 1.
9.7. Let V be an F -vector space of dimension n < ∞ and let
T : V → V be a linear transformation. Define subspaces Vj and Wj
inductively by
V0 = V, Vj+1 = (Vj )T
W0 = 0, Wj+1 = (Wj )T←
Show that Vj ⊇ Vj+1 and Wj+1 ⊇ Wj and deduce from this that V2n =
Vn and W2n = Wn .
Let V be a two-dimensional real vector space with basis {α1 , α2 }.
Let T : V → V be the linear transformation given by
α1 T = 4α1 − 5α2
α2 T = 2α1 − 3α2
9.8. Find nonzero vectors β1 , β2 ∈ V with β1 T = −β1 and β2 T =
2β2 .
9.9. Suppose 0 ≠ γ ∈ V with γT = aγ for some a ∈ R. Show that
a = −1 or 2.
9.10. Prove that {β1 , β2 } is a basis for V and describe T as above
in terms of this basis.
10. Quotient Spaces

Let W be a subspace of V . For α, β ∈ V , write α ∼W β when α − β ∈ W , and let cl(α) denote the equivalence class of α. The basic properties of this relation are as follows.
i. ∼W is an equivalence relation on V .
ii. If α ∈ V , then cl(α) is equal to the coset α + W .
iii. If α ∼ α′ and β ∼ β′ , then α + β ∼ α′ + β′ .
iv. If α ∼ α′ and c ∈ F , then cα ∼ cα′ .
Parts (iii) and (iv) above say that the equivalence relation respects
the arithmetic in V . As we will see, results of this nature allow us to
define an appropriate arithmetic on the set of equivalence classes.
Problems
10.1. In Example 10.2, where is the disjointness of the various subsets Ai used in the proof that ∼ is an equivalence relation? In particular, show that transitivity fails if the subsets are not disjoint.
10.8. Starting with the known ring Z, use these ideas to construct the rational field Q. Sketch a proof that Q satisfies all the appropriate field axioms, that Q contains a copy of Z, and that every element of Q is a fraction with numerator and denominator in this copy of Z.
10.9. Let V = W ⊕ W ′ be a vector space over F and let P : V → W denote the corresponding projection map described in Example 9.3. If
∼ is the equivalence relation on V determined by W , show that the
linear transformation P : V /W → W , as given by Theorem 10.3, is
an isomorphism that assigns to each equivalence class of V the unique
element of W it contains.
11. Matrix Correspondence

Let V and W be vector spaces over F and let L(V, W ) denote the set of all linear transformations from V to W . For S, T ∈ L(V, W ) and a ∈ F , we define the maps S + T and aS from V to W by
α(S + T ) = αS + αT
α(aS) = a(αS) for all α ∈ V
We observe now that these maps are also linear transformations from
V to W .
Let α, β ∈ V and b ∈ F . Then by definition and the fact that both
S and T are linear transformations, we have
(α + β)(S + T ) = (α + β)S + (α + β)T
= (αS + βS) + (αT + βT )
= (αS + αT ) + (βS + βT ) = α(S + T ) + β(S + T )

and similarly

(bα)(S + T ) = (bα)S + (bα)T = b(αS) + b(αT ) = b(αS + αT ) = b(α(S + T ))

For the map aS, we have (α + β)(aS) = a((α + β)S) = a(αS + βS) = α(aS) + β(aS), and
(bα)(aS) = a·(bα)S
= a·b(αS) = b·a(αS)
= b·α(aS)
Thus S + T ∈ L(V, W ) and aS ∈ L(V, W ). In other words, L(V, W ) is
a set with an addition and a scalar multiplication defined on it. It is
in fact a vector space over F .
Theorem 11.1. Let V and W be vector spaces over F . Then with
addition and scalar multiplication defined as above, L(V, W ) is also a
vector space over F .
Proof. We have already verified the closure axioms. The associa-
tive, commutative, distributive and unital laws are routine to check
and so we relegate these to Problem 11.1 at the end of this section.
Finally the zero map clearly plays the role of 0 ∈ L(V, W ) and −T is
just (−1)T as defined above. The result follows.
At this point we could go ahead and compute dimF L(V, W ) by
constructing a basis. However we take another approach.
Let m and n be positive integers and consider the space F mn . This
is of course the vector space of mn-tuples over F . Now it doesn’t really
matter how we write these mn entries, namely whether we write them
in a straight line or in a circle or perhaps in a rectangular m by n
array. So for reasons that will be apparent later on we will take this
latter approach and write all such elements as m by n arrays. When
we do this, we of course designate F mn by F m×n and we call this the
space of m × n (m by n) matrices over F . We have seen these matrices
before in Sections 6 and 7 as formal arrays, but without considering
their arithmetic.
Now, if α ∈ F m×n , then α is an m by n array of elements of F ,
where m indicates the number of rows and n the number of columns.
As usual, we write α as

α = ⎡ a11 a12 . . . a1n ⎤
    ⎢ a21 a22 . . . a2n ⎥
    ⎢ . . . . . . . .  ⎥
    ⎣ am1 am2 . . . amn ⎦
in terms of double subscripted entries aij ∈ F . Here i is the row
subscript and runs between 1 and m, and j is the column subscript and
runs between 1 and n. For brevity we sometimes denote the matrix α by α = [aij ].
Since F m×n is really just F mn in disguise, we know how addition and scalar multiplication are defined. Nevertheless, to avoid confusion we restate this below. Let α = [aij ] and β = [bij ] be matrices in F m×n and let c ∈ F . Then

α + β = [aij ] + [bij ] = [aij + bij ]

and

cα = c[aij ] = [caij ]

In other words, the i, j-th entry of α + β is aij + bij and the i, j-th entry of cα is caij . Clearly

dimF F m×n = dimF F mn = mn
Schematically, if α is m × n and β is n × r, then

(m × n) · (n × r) = (m × r)

and we can think of the intermediate n’s as somehow being cancelled.
Now αβ = [cij ] has the same number of rows as does α and the
same number of columns as does β. Thus it would make sense to have
cij be a function of the ith row of α and the jth column of β. In
addition, the rows of α and the columns of β have the same number of
entries, namely n. Therefore we can and do define
cij = ∑_{k=1}^n aik bkj
We see that this meets all of the above criteria and therefore at least
makes sense.
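The definition transcribes directly into code. A minimal Python sketch (the name matmul is my own; no external library is used) for the product of an m × n matrix and an n × r matrix stored as lists of rows:

def matmul(alpha, beta):
    """Product of an m x n and an n x r matrix, each a list of rows."""
    m, n, r = len(alpha), len(alpha[0]), len(beta[0])
    assert len(beta) == n, "inner dimensions must agree"
    # c_ij is the sum over k of a_ik * b_kj
    return [[sum(alpha[i][k] * beta[k][j] for k in range(n))
             for j in range(r)]
            for i in range(m)]

A = [[1, 2], [3, 4], [5, 6]]     # 3 x 2
B = [[7, 8, 9], [10, 11, 12]]    # 2 x 3
print(matmul(A, B))              # the 3 x 3 product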
We repeat the definition formally. Let α = [aij ] be an m × n matrix over F and let β = [bij ] be an r × s matrix over F . Then the product αβ = [cij ] is defined if and only if n = r, in which case αβ is m × s and

cij = ∑_{k=1}^n aik bkj
To check the associative law, write βγ = [ek′j ] so that ek′j = ∑_{k=1}^r bk′k ckj . If α(βγ) = [fij ], then

fij = ∑_{k′=1}^n aik′ ek′j = ∑_{k′=1}^n aik′ ( ∑_{k=1}^r bk′k ckj )
= ∑_{k′=1}^n ∑_{k=1}^r aik′ bk′k ckj

Similarly, if (αβ)γ = [fij′ ], then

fij′ = ∑_{k=1}^r ∑_{k′=1}^n aik′ bk′k ckj = fij
Next, for the distributive law, suppose β and γ are both n × r and write β + γ = [bij + cij ]. Then the i, j-th entry of α(β + γ) is

∑_k aik (bkj + ckj ) = ∑_k aik bkj + ∑_k aik ckj

Now the first term directly above is clearly the i, j-th entry in αβ = [aij ][bij ] and the second term is the i, j-th entry in αγ = [aij ][cij ]. Thus by definition of addition we have immediately α(β + γ) = αβ + αγ. The other distributive law follows in a similar manner and the lemma is proved.
It will be necessary to develop a certain amount of manipulative skill in computing matrix products. If α = [aij ] is an m × n matrix and β = [bij ] is n × r, then αβ = [cij ] is m × r and we have to find all mr entries. Clearly the easiest way to scan through all these mr entries is to work column by column or row by row.
Suppose we consider the column by column method. Say we wish to
find the jth column of αβ. As we have already observed, this depends
upon all of α but only the jth column of β. So we start by physically
(or mentally) lifting the jth column of β, turning it on its side and
placing it above the matrix α, as indicated in the following diagram.
We then apply the b-row to each of the a-rows in turn, multiplying the
corresponding b- and a-entries and summing these products. The sum
we get from the ith row of α is then
ai1 b1j + ai2 b2j + · · · + ain bnj = cij
Thus, in this way we have found the ith entry of the jth column of αβ,
namely cij .
We can also proceed row by row. Here to find the ith row of the
product, we take the ith row of α, turn it on its side and place it next
to β.
[Diagram: the ith row (ai1 , ai2 , . . . , ain ) of α turned on its side and placed next to the matrix β = [bij ], whose jth column has entries b1j , b2j , . . . , bnj .]
Problems
11.1. Complete the proof of Theorem 11.1 that L(V, W ) is a vector
space.
Let V be a vector space over R with basis A = {α1 , α2 } and let W have basis B = {β1 , β2 }. Define α1′ = α1 + 2α2 , α2′ = 2α1 + 3α2 in V and β1′ = β1 + 2β2 , β2′ = β1 + β2 in W so that A′ = {α1′ , α2′ } is also a basis for V and B′ = {β1′ , β2′ } is a second basis for W .
11.2. Let T : V → W be given by

A [T ]B = ⎡1 4⎤
         ⎣5 2⎦
12. Products of Linear Transformations

If S : U → V and T : V → W are linear transformations, then their product ST : U → W is defined by composition, α(ST ) = (αS)T for all α ∈ U . To check the associative law, suppose R : U → V , S : V → W and T : W → X . Then for all α ∈ U ,
α((RS)T ) = (α(RS))T = ((αR)S)T

and

α(R(ST )) = (αR)(ST ) = ((αR)S)T
Since these are equal for all α ∈ U , the associative law holds. Observe
that what we have shown above is that no matter how we write the
product RST , the element α(RST ) is found by first applying R, then
S and then T .
Now suppose R : U → V and S, T : V → W . Then for all α ∈ U ,
α(R(S + T )) = (αR)(S + T ) = (αR)S + (αR)T = α(RS) + α(RT )

Thus R(S + T ) = RS + RT .
Next, let S, T : U → V and R : V → W . Then for all α ∈ U ,
α((S + T )R) = (α(S + T ))R = (αS + αT )R
= (αS)R + (αT )R = α(SR) + α(T R)

since R is linear. Thus (S + T )R = SR + T R. Finally, let S : U → V , T : V → W and a ∈ F . Then

α(S(aT )) = (αS)(aT ) = a((αS)T ) = a(α(ST )) = α(a(ST ))
and
α((aS)T ) = (α(aS))T by definition of (aS)T
= (a(αS))T by definition of aS
= a((αS)T ) since T is linear
= a(α(ST )) by definition of ST
= α(a(ST )) by definition of a(ST )
Thus S(aT ) = a(ST ) = (aS)T , as required.
Notice that the associative and first distributive law follow formally
from the definition of product and sum of functions. But the second
distributive law requires that R is a linear transformation. Similarly
for the two formulas involving a ∈ F . One is formal, but one requires
linearity. Our main result here is the correspondence between multi-
plication of matrices and of linear transformations.
Theorem 12.1. Let S : U → V and T : V → W be linear trans-
formations with U, V, W finite dimensional vector spaces over the same
field F . Let A be a basis for U , B a basis for V , and C a basis for W .
Then
A [S]B · B [T ]C = A [ST ]C
Proof. Write A [S]B = [aik ] and B [T ]C = [bkj ], so that

αi S = ∑_k aik βk

and

βk T = ∑_j bkj γj

Then

αi (ST ) = (αi S)T = ( ∑_k aik βk )T = ∑_k aik (βk T )
= ∑_k aik ( ∑_j bkj γj ) = ∑_j ( ∑_k aik bkj )γj
On the other hand, by the definition of A [ST ]C = [cij ], we have

αi (ST ) = ∑_j cij γj

This yields

cij = ∑_k aik bkj

so [cij ] = [aij ]·[bij ] and the result follows.
This simple result of course explains our definition of matrix mul-
tiplication. Indeed matrix multiplication was defined so that it would
correspond to the composition of linear transformations.
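The correspondence can be illustrated numerically. The sketch below (reusing the matmul function from the earlier sketch, and the text's row-vector convention in which applying a transformation is right multiplication) checks that applying S and then T agrees with applying the single matrix product:

S = [[1, 2, 0],
     [0, 1, 1]]      # a 2 x 3 matrix, i.e. a map F^2 -> F^3
T = [[1, 0],
     [2, 1],
     [0, 3]]         # a 3 x 2 matrix, i.e. a map F^3 -> F^2

def apply(alpha, M):
    """Right-multiply the row vector alpha by the matrix M."""
    return tuple(sum(a * M[i][j] for i, a in enumerate(alpha))
                 for j in range(len(M[0])))

alpha = (5, 7)
assert apply(apply(alpha, S), T) == apply(alpha, matmul(S, T))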
The result also has numerous corollaries. Suppose T : V → W and T is given in the form of its corresponding matrix A [T ]B for some pair of bases A and B. Now it is quite possible that if another pair A′ and B′ were chosen, then the matrix A′ [T ]B′ would be so simple that we could easily visualize the action of T . We will see an example of this later. A natural problem is then to find the relationship between A [T ]B and A′ [T ]B′ . Recall that matrix multiplication is associative, so a product of matrices can be defined unambiguously without the use of parentheses.
Corollary 12.1. Let T : V → W be a linear transformation with V and W finite dimensional vector spaces over F . Let A, A′ be bases of V and let B, B′ be bases for W . Then

A′ [T ]B′ = A′ [IV ]A · A [T ]B · B [IW ]B′

where IV : V → V and IW : W → W are the identity maps.

Proof. By the previous theorem

A′ [IV ]A · A [T ]B · B [IW ]B′ = A′ [IV ]A · A [T ·IW ]B′
Let A = {α1 , α2 , . . . , αn } and A′ = {α1′ , α2′ , . . . , αn′ } be two bases of V and let I : V → V be the identity transformation. If A′ [I]A = [aij ], then

αi′ = αi′ I = ai1 α1 + ai2 α2 + · · · + ain αn

In other words, the entries in the ith row of [aij ] are merely the coefficients that occur when we write αi′ in terms of the basis A. If A′ ≠ A then of course [aij ] might be quite complicated. On the other hand if A′ = A (in the same order) then clearly A [I]A = In is the n × n identity
matrix

In = ⎡1 0 · · · 0⎤
     ⎢0 1 · · · 0⎥
     ⎢· · · · · ·⎥
     ⎣0 0 · · · 1⎦

That is, In = [δij ] where

δij = 1, for i = j and δij = 0, for i ≠ j
Certainly A [T ]A = In if and only if T = I. Thus we can easily identify the identity transformation from the matrix A [I]A , but not so easily from a random A′ [I]A .
Now In is easily seen to be the identity element in F n×n , that is
In α = αIn = α for all α ∈ F n×n . Next let β ∈ F n×n . We say that
β is nonsingular if and only if β has an inverse β −1 ∈ F n×n with
ββ −1 = β −1 β = In .
Lemma 12.2. Let β = A [T ]A with T : V → V . Then β is non-
singular if and only if T is nonsingular and when this occurs we have
β −1 = A [T −1 ]A .
Proof. Suppose T is nonsingular. Then T is one-to-one and onto
and hence T −1 exists. From the definition of T −1 we have clearly
T T −1 = T −1 T = I. Thus
A [T ]A · A [T −1 ]A = A [T T −1 ]A = A [I]A = In

and

A [T −1 ]A · A [T ]A = A [T −1 T ]A = A [I]A = In
Therefore β has an inverse, namely A [T −1 ]A .
Conversely, suppose β is nonsingular and, by Theorem 11.2, let
S : V → V be given by β −1 = A [S]A . Then
A [ST ]A = A [S]A · A [T ]A = β −1 β = In
so ST = I. Similarly
A [T S]A = A [T ]A · A [S]A = ββ −1 = In
so T S = I. Observe that for α ∈ V , (αS)T = αI = α so T is onto.
Also (αT )S = αI = α, so αT = 0 implies α = 0 and T is one-to-one.
Thus T is nonsingular and then clearly S = T −1 , so β −1 = A [T −1 ]A
and the result follows.
In particular, since I is nonsingular with inverse I −1 = I, the above shows that every change of basis matrix β = A [I]A′ is nonsingular with inverse β −1 = A′ [I]A . Now for the converse.
Lemma 12.3. Let β be a nonsingular matrix and let A be a basis for V . Then there exist bases A′ and A′′ of V with

β = A [I]A′ = A′′ [I]A

In particular, a square matrix is a change of basis matrix if and only if it is nonsingular.
Linear transformations from V to the field F , viewed as a one-dimensional vector space over itself, are called linear functionals. Of course, these are just linear transformations so that (α1 + α2 )λ = α1 λ + α2 λ and (cα1 )λ = c(α1 λ) for all α1 , α2 ∈ V and c ∈ F . In the notation of the preceding section, the set of all such functionals is L(V, F ); it is a vector space over F which we denote by V ∗ and call the dual space of V .
Lemma 12.4. Let dimF V = n and let A = {α1 , α2 , . . . , αn } be a
basis for V . Then the functionals αi∗ : V → F defined by
αi∗ : αj → 1 if j = i, and αj → 0 if j ≠ i
form a basis A∗ = {α1∗ , α2∗ , . . . , αn∗ }, the dual basis, for V ∗ . In particu-
lar, dimF V ∗ = dimF V .
to (αi T )βj∗ = ( ∑_k aik βk )βj∗ = aij . Thus a∗ji = aij and the matrices A [T ]B and B∗ [T ∗ ]A∗ are indeed transposes of each other.
Problems
12.1. Suppose T1 : V1 → V2 , T2 : V2 → V3 , . . ., Tk : Vk → Vk+1 are
linear transformations or in fact any functions. Show that the composi-
tion product T1 T2 · · · Tk with any choice of parentheses merely amounts
to first applying T1 , then T2 , . . ., and then Tk .
12.2. Let S : U → V and T : V → W be linear transformations
with U, V and W finite dimensional. Prove that
min{rank S, rank T } ≥ rank ST ≥ rank S + rank T − dim V
For the second inequality, let R be the restriction of T to im S ⊆ V .
Then im R = im ST and ker R = im S ∩ ker T ⊆ ker T , so
dim ker R ≤ dim ker T = dim V − rank T
12.3. Let α ∈ F m×n and β ∈ F n×r so that αβ ∈ F m×r . Use
Theorem 11.2 and the preceding problem with U = F m , V = F n and
W = F r to show that
min{rank α, rank β} ≥ rank αβ ≥ rank α + rank β − n
12.4. Prove that In is the unique identity element of the ring F n×n .
More generally, if α ∈ F m×n and β ∈ F n×k show that αIn = α and
In β = β.
12.5. Show how to deduce the associative and distributive laws of
matrix multiplication from those of linear transformation multiplica-
tion.
12.6. Let T : V → W be a linear transformation with A [T ]B = [cij ]
where A = {α1 , α2 , . . . , αm } and B = {β1 , β2 , . . . , βn } are bases. Let
α = a1 α1 + a2 α2 + · · · + am αm ∈ V
and
αT = β = b1 β1 + b2 β2 + · · · + bn βn ∈ W
Show that matrix multiplication yields
                 ⎡ c11 c12 · · · c1n ⎤
(a1 a2 · · · am ) ⎢ c21 c22 · · · c2n ⎥ = (b1 b2 · · · bn )
                 ⎢ · · · · · · · ·  ⎥
                 ⎣ cm1 cm2 · · · cmn ⎦
How does this relate to Theorem 12.1?
13. Eigenvalues and Eigenvectors

It would certainly be convenient if there existed a fixed function of the entries of the matrix A [S]A which
would tell us at a glance just when S is singular. As we will see in the
next chapter, there is a natural candidate for such a function and it
indeed does the job.
CHAPTER III
Determinants
14. Volume Functions

[Figure: the parallelogram in the plane spanned by α1 and α2 , with vertices 0, α1 , α2 and α1 + α2 .]
[Figure: the parallelepiped in 3-space spanned by α1 , α2 , α3 , with labeled vertices including α1 + α2 , α2 + α3 and α1 + α2 + α3 .]
concept of positive and negative. Additional reasons even for the field
R occur later in this section.
Now how does v(α1 , α2 , . . . , αn ) behave as a function of each vari-
able individually. Namely, suppose we fix all but the ith vector. We
then have a map from V to R given by αi → v(α1 , α2 , . . . , αi , . . . , αn ).
What sort of a function is it?
Let a ∈ R and consider v(α1 , α2 , . . . , aαi , . . . , αn ). By multiplying
αi by a we have clearly expanded the parallelepiped linearly in one di-
rection by a factor of a and thus we expect the volume to be multiplied
by a. In other words
v(α1 , α2 , . . . , aαi , . . . , αn ) = a·v(α1 , α2 , . . . , αi , . . . , αn )
This can clearly be seen in the figure below where we have compared
v(3α1 , α2 ) and v(α1 , α2 ). If a is negative, then multiplication by a
reverses the sign of a real number. Again negative volumes seem to
appear. Of course, we could still multiply v(α1 , α2 , . . . , αi , . . . , αn ) by
|a|, the absolute value of a, to keep things positive, but the formula is
just not as nice and besides, absolute values do not exist in all fields
that might be of interest to us.
[Figure: the parallelograms spanned by α1 , 2α1 and 3α1 with α2 ; stretching α1 to 3α1 triples the area, comparing v(3α1 , α2 ) with v(α1 , α2 ).]
[Figures: two parallelogram diagrams illustrating additivity in the first variable, v(α1 + α1′ , α2 ) = v(α1 , α2 ) + v(α1′ , α2 ); congruent regions are marked with matching letters.]
    v : V × V × · · · × V → F   (n factors)
The first two conditions say that v is multilinear while the third
condition says that v is alternating.
Example 14.3. The zero map v : V × V × · · · × V → F is always
trivially a volume function.
Example 14.4. If dimF V = n = 1, then any linear functional
T : V → F is a volume function since the third axiom is vacuously
satisfied.
Proof. Since the vectors are linearly dependent, for some i, we can solve for αi in terms of the remaining αj and obtain αi + Σ_{j≠i} aj αj = 0. We now apply part (ii ) of the previous lemma and successively add a1 α1 , a2 α2 , . . . , an αn (for j ≠ i) to the ith variable without changing v. Thus

    v(α1 , . . . , αi , . . . , αn ) = v(α1 , . . . , αi + Σ_{j≠i} aj αj , . . . , αn )
                                    = v(α1 , . . . , 0, . . . , αn ) = 0

and the lemma is proved.
This solves part of the problem. Namely if the n vectors are linearly
dependent then the volume function vanishes. What happens if the
vectors are linearly independent? We will show below that if v is not
the identically zero function, then v does not vanish on any linearly
independent set.
Lemma 14.3. Suppose σ is any permutation of the set {1, 2, . . . , n}.
Then
v(ασ(1) , . . . , ασ(i) , . . . , ασ(n) ) = ± v(α1 , . . . , αi , . . . , αn )
where the ± sign is determined solely by σ.
Proof. By Lemma 14.1(iii ), we can interchange any two entries and only change v by a factor of −1. So we first interchange α1 with ασ(1) if 1 ≠ σ(1) so that v(ασ(1) , . . . , ασ(i) , . . . , ασ(n) ) now has α1 as its first variable. Then we interchange α2 if necessary with the second entry and continue this process. At each step we either leave v alone or multiply it by −1. This clearly yields the result since the ± sign is determined in this way by σ.
It is important to observe that the choice of the ± sign above de-
pends on our specific procedure for reordering the αi s. There is cer-
tainly no a priori reason to believe that a different reordering scheme
will yield the same sign. Indeed this is the fundamental problem in
trying to define a nontrivial volume function. The following argument may look complicated, but it is really a simple application of the multilinearity of v.
Theorem 14.1. Let V be a vector space over the field F having
dimension n < ∞. Let A = {α1 , α2 , . . . , αn } be a subset of V and let
B = {β1 , β2 , . . . , βn } be a basis. If v is a volume function then there
exists c ∈ F depending only upon A and B with
v(α1 , α2 , . . . , αn ) = c · v(β1 , β2 , . . . , βn )
a sum of n^n terms. Consider one such term

    a1j1 a2j2 · · · anjn · v(βj1 , βj2 , . . . , βjn )
Since there are precisely n vectors βj and precisely n entries, we see
that either j1 , j2 , . . . , jn is a permutation of the numbers 1, 2, . . . , n or
else two of the ji s are equal. Of course, in the latter case we see that
v(βj1 , βj2 , . . . , βjn ) = 0 and the entire term vanishes. Thus, by deleting
these zero terms, we have

    v(α1 , α2 , . . . , αn ) = Σ_σ a1σ(1) a2σ(2) · · · anσ(n) · v(βσ(1) , βσ(2) , . . . , βσ(n) )

where the sum is over all n! permutations σ of the set {1, 2, . . . , n}.
Finally, by the previous lemma
v(βσ(1) , βσ(2) , . . . , βσ(n) ) = ± v(β1 , β2 , . . . , βn )
Problems
14.1. Verify that the function defined in Example 14.5 is a volume
function.
15. Determinants
We seem to know everything about volume functions except whether
nonzero ones exist. In this section, we prove their existence. There are
several possible ways of doing this. One approach is to study the ±
sign that occurs in
v(ασ(1) , ασ(2) , . . . , ασ(n) ) = ± v(α1 , α2 , . . . , αn )
Indeed, once this sign is properly understood, we would have no dif-
ficulty in defining v. A second approach and the one we take here is
inductive.
Let α be an n × n matrix over F . Then each row of α is an n-tuple
of elements of F and hence can be thought of as being a vector in F n .
In this way, α is composed of n vectors of F n and thus any function
sending α to F can be thought of as a function of n elements of F n . We
can therefore translate the notion of a volume function to this situation
and we call the resulting function the determinant.
Formally the determinant is a map det : F n×n → F satisfying the following three axioms.

    D1. det is a linear function of each row of the matrix separately.
    D2. If two rows of the matrix are identical, then det vanishes.
    D3. det In = 1.

The first two above tell us, as we have already indicated, that det is a volume function on F n . The third axiom is a normalization. It says that the volume function evaluated at the basis {β1 , β2 , . . . , βn }, where βi = (0, 0, . . . , 1, . . . , 0) has a 1 in the ith spot, is equal to 1. Thus, by Corollary 14.1 we have
Theorem 15.1. If det : F n×n → F exists, then it is unique.
We now proceed to show that det does in fact exist by induction
on n. If α ∈ F n×n then we denote by αij the submatrix of α obtained
by deleting the ith row and jth column. Clearly αij ∈ F (n−1)×(n−1) .
Lemma 15.1. Let n > 1 and suppose that the determinant map det : F (n−1)×(n−1) → F exists. Fix an integer s with 1 ≤ s ≤ n and define a function ds : F n×n → F by

    ds (α) = Σ_{i=1}^{n} (−1)^{i+s} ais det αis
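The map ds is just the familiar cofactor expansion down column s. As an added illustration (not part of the original notes), here is a minimal Python sketch of this recursion; the function name det_cofactor is our own.

    def det_cofactor(a, s=0):
        # Cofactor expansion down column s (0-indexed), mirroring d_s above.
        # a is a square matrix given as a list of lists.
        n = len(a)
        if n == 1:
            return a[0][0]
        total = 0
        for i in range(n):
            # delete row i and column s to form the submatrix alpha_{is}
            minor = [row[:s] + row[s+1:] for k, row in enumerate(a) if k != i]
            total += (-1) ** (i + s) * a[i][s] * det_cofactor(minor)
        return total

    print(det_cofactor([[2, 1], [1, 2]]))   # 3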
Now for i ≠ k, the kth row is not deleted in βis . Thus αis and βis agree in all rows but one and in that row all entries of βis are equal to b times the corresponding entry in αis . By properties of det we therefore have det βis = b(det αis ) for i ≠ k. On the other hand, for i = k, the kth row is deleted so βks = αks . This yields

    ds (β) = (−1)^{k+s} (b·aks ) det αks + Σ_{i≠k} (−1)^{i+s} eis ·b(det αis )
           = b[(−1)^{k+s} aks det αks + Σ_{i≠k} (−1)^{i+s} eis det αis ]
           = b·ds (α)
Now let α, β and γ be given by

    α = ⎡ eij ⎤     β = ⎡ eij ⎤     γ = ⎡    eij    ⎤
        ⎣ akj ⎦         ⎣ bkj ⎦         ⎣ akj + bkj ⎦

that is, three matrices agreeing in every row i ≠ k, with kth rows [akj ], [bkj ] and [akj + bkj ] respectively. Then

    ds (γ) = (−1)^{k+s} (aks + bks ) det γks + Σ_{i≠k} (−1)^{i+s} eis det γis
Now for i ≠ k, the kth row is not deleted in γis . Clearly αis , βis and γis agree in all the other rows and they add appropriately in this row. Thus

    det γis = det αis + det βis

for i ≠ k. On the other hand, for i = k we have γks = αks = βks . This yields

    ds (γ) = [(−1)^{k+s} aks det αks + Σ_{i≠k} (−1)^{i+s} eis det αis ]
           + [(−1)^{k+s} bks det βks + Σ_{i≠k} (−1)^{i+s} eis det βis ]
           = ds (α) + ds (β)
and ds satisfies the first axiom.
D2) Let us suppose that the kth and ℓth rows of α are identical and say ℓ > k. If α = [aij ], then

    ds (α) = Σ_i (−1)^{i+s} ais det αis
            ⎡       ..        ⎤
            ⎢     row k       ⎥
            ⎢     row k+1     ⎥
            ⎢     row k+2     ⎥
    αℓs  =  ⎢       ..        ⎥
            ⎢     row ℓ−2     ⎥
            ⎢     row ℓ−1     ⎥
            ⎢ row ℓ (missing) ⎥
            ⎣       ..        ⎦
By doing it this way, we have not changed the ordering of the other rows and yet we have shifted row k down to position ℓ. Moreover this was achieved in ℓ − k − 1 interchanges (since, for example, if k = ℓ − 1 then no interchanges are needed). Thus we have

    det αℓs = (−1)^{ℓ−k−1} det αks

and

    ds (α) = (−1)^{k+s} aks [det αks + (−1)^{ℓ−k} (−1)^{ℓ−k−1} det αks ]
           = (−1)^{k+s} aks [det αks − det αks ] = 0

as required.
D3) If In = α = [aij ] then

    ds (In ) = Σ_i (−1)^{i+s} ais det αis
    v : V × V × · · · × V → F   (n factors)

by v(α1 , α2 , . . . , αn ) = det[aij ] where αi = Σ_j aij βj . Then v is a nonzero volume function with v(β1 , β2 , . . . , βn ) = 1.
Proof. We must show that v satisfies all the axioms for a volume
function. Suppose first that αk is replaced by bαk . Then clearly the
kth row of [aij ] is multiplied by b ∈ F so (D1) yields
v(α1 , . . . , bαk , . . . , αn ) = b det[aij ] = b·v(α1 , . . . , αk , . . . , αn )
Secondly, consider v(α1 , . . . , αk + αk′ , . . . , αn ) where αk = Σ_j akj βj and αk′ = Σ_j a′kj βj . Then αk + αk′ = Σ_j (akj + a′kj )βj . This means that the corresponding matrices for the definition of v(α1 , . . . , αk , . . . , αn ) and v(α1 , . . . , αk′ , . . . , αn ) agree except in the kth row and there they add to give the kth row of the matrix for v(α1 , . . . , αk + αk′ , . . . , αn ). Thus again by (D1) we have

    v(α1 , . . . , αk + αk′ , . . . , αn ) = v(α1 , . . . , αk , . . . , αn ) + v(α1 , . . . , αk′ , . . . , αn )
Next, if the two vectors αr and αs are identical, then certainly the rth
and sth rows of [aij ] are identical so
v(α1 , . . . , αr , . . . , αs , . . . , αn ) = det[aij ] = 0
We have therefore shown that v is a volume function.
Finally
v(β1 , β2 , . . . , βn ) = det In = 1
so v is nonzero and the result follows.
Corollary 15.2. Let S : V → V be a linear transformation and let B and B′ be bases for V . Then S is singular if and only if det B [S]B′ = 0.
Suppose n = 3 and again use the cofactor expansion down the first column. Then

    | a11 a12 a13 |
    | a21 a22 a23 |  =  a11 | a22 a23 |  −  a21 | a12 a13 |  +  a31 | a12 a13 |
    | a31 a32 a33 |         | a32 a33 |         | a32 a33 |         | a22 a23 |

        = a11 (a22 a33 − a32 a23 ) − a21 (a12 a33 − a32 a13 ) + a31 (a12 a23 − a22 a13 )
        = a11 a22 a33 − a11 a23 a32 + a13 a21 a32 − a12 a21 a33 + a12 a23 a31 − a13 a22 a31
Finally, we have

Theorem 15.3. Let [aij ] ∈ F n×n . Then

    det[aij ] = |aij | = Σ ±a1j1 a2j2 · · · anjn

where the sum is over all n-tuples (j1 , j2 , . . . , jn ) of distinct column subscripts. Here the ± sign is uniquely determined and the main diagonal term a11 a22 · · · ann occurs with a plus sign.
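As an added illustration of this expansion (not part of the original notes), the following Python sketch computes a determinant as the sum over all n! permutations, fixing each ± sign by counting inversions.

    from itertools import permutations
    from math import prod

    def sign(p):
        # sign of the permutation p, counted by the number of inversions
        inv = sum(1 for i in range(len(p))
                    for j in range(i + 1, len(p)) if p[i] > p[j])
        return -1 if inv % 2 else 1

    def det_perm(a):
        # the n!-term permutation expansion of Theorem 15.3
        n = len(a)
        return sum(sign(p) * prod(a[i][p[i]] for i in range(n))
                   for p in permutations(range(n)))

    print(det_perm([[1, 2], [3, 4]]))   # -2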
Problems
15.1. Evaluate the determinant of α ∈ R4×4 where

    α = ⎡  2  1  0 −1 ⎤
        ⎢ −1  0  1  3 ⎥
        ⎢  0  2  7  1 ⎥
        ⎣  0  0 −1  2 ⎦
15.2. Let α ∈ R3×3 be given by

    α = ⎡  3 −1  2 ⎤
        ⎢  0  2  4 ⎥
        ⎣ −1  0  1 ⎦

Evaluate the determinants of α and of αT , the transpose of α. Compare your results.
15.3. Let α, β ∈ R2×2 be given by

    α = ⎡ 4 1 ⎤     β = ⎡ 2 −1 ⎤
        ⎣ 1 2 ⎦         ⎣ 1  1 ⎦

Evaluate det α, det β, det(αβ) and det(βα). Compare your results.
15.4. Recall Example 13.6 where S : V → V is given by

    A [S]A = ⎡ a + 10    −9      −6   ⎤
             ⎢  −32    a + 23    18   ⎥
             ⎣   61     −47    a − 35 ⎦

Find det A [S]A . For what values of a ∈ R is S singular?
15.5. Let α ∈ F 5×5 be given by

    α = ⎡ a11 a12 a13 a14 a15 ⎤
        ⎢  0  a22 a23 a24 a25 ⎥
        ⎢  0   0  a33 a34 a35 ⎥
        ⎢  0   0   0  a44 a45 ⎥
        ⎣  0   0   0   0  a55 ⎦

Find det α.
15.6. Let β ∈ F n×n . Prove that β is nonsingular if and only if det β ≠ 0.
15.7. Let α, β ∈ F n×n . Is the formula
det(α + β) = det α + det β
true in general?
15.8. Let v be a volume function on the n-dimensional space V and let T : V → V be a linear transformation. Define

    vT : V × V × · · · × V → F   (n factors)

by

    (vT )(α1 , α2 , . . . , αn ) = v(α1 T, α2 T, . . . , αn T )
Show that vT is a volume function.
15.9. If a ∈ F define av by
(av)(α1 , α2 , . . . , αn ) = a·(v(α1 , α2 , . . . , αn ))
Show that av is a volume function and that every volume function on
V is of this form for a unique a ∈ F .
15.10. From the above we see that there exists a unique c ∈ F with
vT = cv. Prove that c = 0 if and only if T is singular.
16. Consequences of Uniqueness
Observe that if we fix all the columns of α other than the kth, then each αik is kept fixed and the above formula is just

    det α = Σ_{i=1}^{n} aik ci
so if the two columns are identical, then a11 = a12 , a21 = a22 and
|aij | = 0. Finally, let n > 2 and suppose the result is true for n − 1.
Let α ∈ F n×n with say its rth and sth columns identical. Since n ≥ 3, we can choose k ≠ r, s and then by expanding det α with respect to the kth column we have

    det α = Σ_{i=1}^{n} (−1)^{i+k} aik det αik
where α = [aij ]. Now in obtaining αik , we deleted neither the rth nor
the sth column. Thus αik has two columns identical and by induction
det αik = 0 for all i. Hence det α = 0 and the lemma is proved.
Let α = [aij ] ∈ F n×n . Recall that the transpose αT of α is defined to be the n × n matrix αT = [a′ij ] where a′ij = aji . In other words, αT
is obtained from α by reflecting it about the main diagonal. In this
process we see that rows and columns are interchanged. But one thing
that is not changed is the determinant.
Theorem 16.1. Let α ∈ F n×n . Then
det αT = det α
Proof. Let αT = [a′ij ] so that a′ij = aji . Clearly (αT )ij = (αji )T so using the rth column expansion of det αT we obtain

    det αT = Σ_{j=1}^{n} (−1)^{j+r} a′jr det(αT )jr
           = Σ_{j=1}^{n} (−1)^{r+j} arj det(αrj )T
           = Σ_{j=1}^{n} (−1)^{r+j} arj det αrj

where det(αrj )T = det αrj by induction; the last sum is the rth row cofactor expansion of det α.
where δ ∗ denotes δ with its first column deleted. By induction, det γ11 =
det β, so the result follows.
We can now prove
Theorem 16.3. If

    γ = ⎡ α 0 ⎤
        ⎣ δ β ⎦
    Bn = ⎡ Bn−1  0  ⎤
         ⎣  C   ann ⎦

where

    bij = aij − ((2n − 1)/(n + i − 1))·anj

Thus, by the cofactor expansion with respect to the last column of Bn , we have
Problems
16.1. Show that the 2 × 2 determinant

    | a11 a12 |
    | a21 a22 |

is given by the product a11 a22 , corresponding to the line slanting down and to the right, minus the product a21 a12 , corresponding to the line slanting up and to the right.
16.2. We learn this trick in calculus class for evaluating the 3 × 3
determinant |aij |. Copy the first two columns of the matrix to the right
hand side, as indicated, and then draw the six slanting lines.
a11 a12 a13 a11 a12
a21 a22 a23 a21 a22
a31 a32 a33 a31 a32
Show that |aij | is equal to the sum of the three products corresponding
to the lines slanting down and to the right, minus the three products
corresponding to the lines slanting up and to the right.
16.3. Can the above trick work in general for n × n determinants
with n ≥ 4 ? For this, consider Theorem 15.3.
16.4. If a, b, c ∈ F , show that

    | 1 1 |                    | 1  1  1  |
    | a b |  = b − a,   and    | a  b  c  |  = (c − a)(c − b)(b − a)
                               | a² b² c² |

Why do you think these right hand factors occur?
16.5. Let α, β ∈ F 2×2 with

    α = ⎡ a b ⎤ ,    β = ⎡ e f ⎤
        ⎣ c d ⎦          ⎣ g h ⎦

Compute αβ and then det αβ. Prove directly that det αβ factors as the product (det α)(det β).
16.6. Let α and β be n×n matrices with integer entries and assume
that αβ = diag(1, 2, . . . , n). Show that det α is an integer dividing n!.
16.7. Compute the determinant

    |  1  1  2  3 |
    |  0  2  4  5 |
    | −1  3 −2  1 |
    |  2  4  1  9 |
17. Adjoints and Inverses
and this makes no sense. The three zeroes in the last row tell us that
A must have been a singular matrix, so A−1 does not exist.
Example 17.3. Let us start again, this time with

    A = ⎡ 1 0 1 ⎤
        ⎢ 1 1 2 ⎥
        ⎣ 2 1 4 ⎦
so that

    A|I3 = ⎡ 1 0 1 | 1 0 0 ⎤
           ⎢ 1 1 2 | 0 1 0 ⎥
           ⎣ 2 1 4 | 0 0 1 ⎦
Again, we subtract the first row from the second and subtract twice the first row from the third. This yields

    ⎡ 1 0 1 |  1 0 0 ⎤
    ⎢ 0 1 1 | −1 1 0 ⎥
    ⎣ 0 1 2 | −2 0 1 ⎦
Next, subtracting row 2 from row 3, gives us

    ⎡ 1 0 1 |  1  0 0 ⎤
    ⎢ 0 1 1 | −1  1 0 ⎥
    ⎣ 0 0 1 | −1 −1 1 ⎦
Thus

    B = ⎡  2  1 −1 ⎤
        ⎢  0  2 −1 ⎥
        ⎣ −1 −1  1 ⎦
must equal A−1 and we can check this by verifying that AB = BA = I3 .
We state this procedure as a formal lemma and offer a proof.
Lemma 17.3. If A ∈ F n×n is nonsingular, then the augmented
matrix A|In is row equivalent to the reduced row echelon matrix In |A−1 .
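As an added illustration of this lemma (not part of the original notes), here is a Python sketch, assuming numpy, that row reduces the augmented matrix [A | In ] to [In | A−1 ]. For numerical safety it also interchanges rows to obtain a nonzero pivot (partial pivoting), a refinement the lemma itself does not require.

    import numpy as np

    def inverse_by_row_reduction(A):
        # Reduce [A | I] to [I | A^{-1}] by elementary row operations.
        n = len(A)
        M = np.hstack([np.array(A, dtype=float), np.eye(n)])
        for c in range(n):
            p = c + np.argmax(np.abs(M[c:, c]))   # choose a pivot row
            if abs(M[p, c]) < 1e-12:
                raise ValueError("matrix is singular")
            M[[c, p]] = M[[p, c]]                 # interchange rows
            M[c] /= M[c, c]                       # scale the pivot row to 1
            for r in range(n):
                if r != c:
                    M[r] -= M[r, c] * M[c]        # clear the rest of column c
        return M[:, n:]

    A = [[1, 0, 1], [1, 1, 2], [2, 1, 4]]         # the matrix of Example 17.3
    print(np.round(inverse_by_row_reduction(A)))  # recovers the matrix B above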
Problems
17.1. Let α ∈ F n×n with n ≥ 2. If det α = 0, show that det(adj α) = 0. For this, note that if det(adj α) ≠ 0, then adj α is invertible, and then (adj α)α = 0 implies that α = 0. As a consequence, conclude that for all α, we have det(adj α) = (det α)^{n−1} .
17.2. If α, β are nonsingular matrices in F n×n , show that (αβ)−1 =
β −1 α−1 and then that adj(αβ) = (adj β)(adj α).
17.3. For

    α = ⎡ a b ⎤
        ⎣ c d ⎦

find adj α and adj(adj α). If α is nonsingular, find α−1 .
17.4. If α ∈ F n×n has rank n − 1, use Problem 12.3 to prove that
adj α has rank 1. In particular, if n ≥ 3, conclude that adj(adj α) = 0.
17.5. Compute the adjoint of the 3 × 3 real matrix

    ⎡ 1 1  0 ⎤
    ⎢ 2 3 −1 ⎥
    ⎣ 1 4 −1 ⎦
Find the inverse of this matrix using Corollary 17.1 and also by using
the technique of Lemma 17.3.
17.6. Solve the system of real linear equations
x1 + x2 + 2x3 = 4
x1 + 2x2 + 2x3 = 9
2x1 + 3x2 + 7x3 = 7
using Cramer’s Rule.
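As an added illustration (not part of the original notes), the following Python sketch, assuming numpy, applies Cramer's Rule to this very system.

    import numpy as np

    def cramer(A, b):
        # Cramer's Rule: x_i = det(A_i)/det(A), where A_i is A with its
        # ith column replaced by b.
        A, b = np.array(A, dtype=float), np.array(b, dtype=float)
        d = np.linalg.det(A)
        x = []
        for i in range(len(b)):
            Ai = A.copy()
            Ai[:, i] = b
            x.append(np.linalg.det(Ai) / d)
        return x

    # the system of Problem 17.6
    print(cramer([[1, 1, 2], [1, 2, 2], [2, 3, 7]], [4, 9, 7]))   # [3.0, 5.0, -2.0]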
17.7. Prove Cramer’s Rule using the inverse of the matrix A given
by Corollary 17.1 and suitable column cofactor expansions.
17.8. State and prove the column analog of Lemma 17.4. Show that
the associative law of matrix multiplication implies that any elementary
row operation commutes with any elementary column operation. For
this, see Problem 6.3.
17.9. Show that any elementary matrix obtained from an elemen-
tary column operation applied to In is equal to an elementary matrix
obtained from an elementary row operation applied to In .
17.10. Prove that elementary matrices are all nonsingular with ele-
mentary inverses, and that any nonsingular square matrix is a product
of elementary matrices.
18. The Characteristic Polynomial
Since this holds in the trivial situation when two of the parameters are
equal, we can assume in the proof by induction that all ai are distinct.
Obviously this result holds for n = 1 and 2, so let n > 2.
Now work in F (x)n×n and form the n × n matrix

    α = ⎡    1        1     · · ·     1     ⎤
        ⎢    a1       a2    · · ·     x     ⎥
        ⎢    a1²      a2²   · · ·     x²    ⎥
        ⎢  . . . . . . . . . . . . . . . .  ⎥
        ⎣ a1^{n−1} a2^{n−1} · · ·  x^{n−1}  ⎦
where an is replaced by the variable x. By considering the column
cofactor expansion with respect to the nth column, we see that det α
is a polynomial in x of degree ≤ n − 1. Indeed, it has degree precisely
n − 1 since the coefficient of x^{n−1} is cn−1 = det αnn , the Vandermonde determinant determined by a1 , a2 , . . . , an−1 . Furthermore, cn−1 ≠ 0 since the parameters are distinct.
Finally note that det α is 0 when evaluated at x = a1 , a2 , . . . , an−1
since in each of these cases, two columns of the matrix are equal. Since
these n − 1 roots are all distinct, we must have
det α = cn−1 (x − a1 )(x − a2 ) · · · (x − an−1 )
and evaluating at x = an yields

    det ν = cn−1 · Π_{n>j} (an − aj ) = Π_{n>i>j} (ai − aj ) · Π_{n>j} (an − aj )
          = Π_{i>j} (ai − aj )
as required.
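The product formula is easy to check numerically. The following Python sketch (an added illustration, assuming numpy) compares the determinant of a Vandermonde matrix with the product Π_{i>j} (ai − aj ).

    import numpy as np
    from math import prod

    a = [2.0, 3.0, 5.0, 7.0]
    n = len(a)
    V = np.array([[a[j] ** i for j in range(n)] for i in range(n)])  # rows 1, a, a^2, ...

    lhs = np.linalg.det(V)
    rhs = prod(a[i] - a[j] for i in range(n) for j in range(i))      # product over i > j
    print(round(lhs, 6), rhs)   # both equal 240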
Problems
18.1. If α ∈ F n×n and a ∈ F , show that α(aIn ) = (aIn )α.
Compute the characteristic polynomials of the following two matri-
ces with entries in the field F .
18.2.

    α = ⎡ a11 a12 a13 a14 a15 ⎤
        ⎢  0  a22 a23 a24 a25 ⎥
        ⎢  0   0  a33 a34 a35 ⎥
        ⎢  0   0   0  a44 a45 ⎥
        ⎣  0   0   0   0  a55 ⎦
18.3.

    α = ⎡ 0 1 0 0 ⎤
        ⎢ 0 0 1 0 ⎥
        ⎢ 0 0 0 1 ⎥
        ⎣ a b c d ⎦
If α = [aij ] ∈ F n×n , then the trace of α is defined to be
tr α = a11 + a22 + · · · + ann
the sum of the entries of α on the main diagonal.
18.4. Prove that tr : F n×n → F is a linear functional and that
tr(αβ) = tr(βα) for all α, β ∈ F n×n .
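The identity tr(αβ) = tr(βα) can be spot-checked numerically; the following short Python test (an added illustration, assuming numpy) does so for random real matrices, for which αβ ≠ βα in general.

    import numpy as np

    rng = np.random.default_rng(1)
    A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
    print(np.allclose(np.trace(A @ B), np.trace(B @ A)))   # True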
19. The Cayley-Hamilton Theorem
Now

    β11 = ⎡ x  −1   0  · · ·   0      0     ⎤
          ⎢ 0   x  −1  · · ·   0      0     ⎥
          ⎢ . . . . . . . . . . . . . . . . ⎥
          ⎢ 0   0   0  · · ·  −1      0     ⎥
          ⎢ 0   0   0  · · ·   x     −1     ⎥
          ⎣ a1  a2  a3 · · · an−2  x + an−1 ⎦

so β11 = xIn−1 − α∗ where α∗ is the companion matrix of the polynomial
so we have

    f (T )g(T ) = Σ_{i,j} (ai bj )T^{i+j} = Σ_k ( Σ_{i+j=k} ai bj ) T^k = h(T )
Problems
A nonempty subset A of F [x] is said to be an ideal of the ring if
i. α, β ∈ A implies that α + β ∈ A.
ii. α ∈ A and β ∈ F [x] implies that αβ ∈ A.
For example A = {0} is an ideal and so is the set of all F [x]-multiples
of any fixed element of the ring. In the following three problems, let A
be an ideal of F [x] with A = {0}.
19.1. Let m be the minimal degree of all nonzero polynomials in A.
Show that A contains a unique monic polynomial μ(x) of degree m. μ
is called the minimal polynomial of A.
19.2. Show that for all f (x) ∈ A, there exists some polynomial
g(x) ∈ F [x] with f (x) = μ(x)g(x). (Hint, apply induction on the de-
gree of f (x). If deg f = n ≥ m and f (x) = an xn + lower degree terms,
then f (x) − an xn−m μ(x) ∈ A has degree less than n.)
19.3. Conclude that A = F [x]μ(x) is the set of all F [x]-multiples
of its minimal polynomial μ(x).
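The division step in Problem 19.2 is the Euclidean algorithm in F [x]. As an added illustration (not part of the original notes), the following Python sketch uses numpy's polynomial division to compute the monic generator of the ideal F [x]f + F [x]g; it is intended only for small exact examples over R.

    from numpy.polynomial import polynomial as P

    def monic_gcd(f, g):
        # Euclidean algorithm in R[x]; coefficients are listed from the
        # constant term up.  The ideal generated by f and g is generated
        # by the monic polynomial returned here.
        f, g = list(f), list(g)
        while any(abs(c) > 1e-12 for c in g):
            _, r = P.polydiv(f, g)        # f = q*g + r with deg r < deg g
            f, g = g, list(r)
        return [c / f[-1] for c in f]     # normalize to a monic polynomial

    # gcd of (x-1)(x-2) = x^2-3x+2 and (x-1)(x-3) = x^2-4x+3 is x - 1
    print(monic_gcd([2, -3, 1], [3, -4, 1]))   # [-1.0, 1.0], that is, x - 1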
20. Nilpotent Transformations
Let us consider each term aijk (βij T k+m ) in this sum. If i − k > m
then, by assumption, aijk = 0 so this term is 0. On the other hand, if
i − k < m, then i < k + m so
    βij T^{k+m} = (βij T^i )T ·T^{k+m−i−1} = (αij T )·T^{k+m−i−1} = 0·T^{k+m−i−1} = 0
since βij T^i = αij ∈ W = ker T . Thus the only terms that occur in the above sum have i − k = m, so i = k + m and

    aijk (βij T^{k+m} ) = aijk (βij T^i ) = aijk αij

Thus we have

    Σ_{i,j,k; k=i−m} aijk αij = 0
But {αij } is a basis for W and there is at most one subscript k for each
i, j, so we conclude that all aijk = 0 with i − k = m, a contradiction by
the definition of m. This proves that B is a linearly independent set of
distinct vectors.
We show now, by inverse induction on q = n + 1, n, . . . , 0, that the
subset of B given by Bq = {βij T k | k ≥ q} spans Vq = im T q = (V )T q .
Of course, Bq ⊆ Vq since all the exponents of T in the vectors of Bq are
at least equal to q. If q = n+1, then Vq = 0 and Bq = ∅, so the result is
trivially true. Let us now suppose that Bq+1 spans Vq+1 and let γ ∈ Vq .
Since γ ∈ Vq = im T q , it follows easily that γT ∈ im T q+1 = Vq+1 and
thus, by induction,
    γT = Σ_{i,j,k; k≥q+1} bijk (βij T^k )
for suitable bijk ∈ F . Note that q ≥ 0 implies that k ≥ q + 1 ≥ 1 and
thus if we define
    δ = γ − Σ_{i,j,k; k≥q+1} bijk (βij T^{k−1} )
then δT = 0.
Two facts about δ ∈ V are now apparent. First, δT = 0 so δ ∈
ker T = W . Second, δ ∈ Vq since γ ∈ Vq and since k−1 ≥ q implies that
βij T k−1 ∈ Vq . Thus δ ∈ W ∩ Vq = Wq . Now we know that {αij | i ≥ q}
is a basis for Wq and therefore, for suitable elements cij ∈ F , we have
    δ = Σ_{i,j; i≥q} cij αij = Σ_{i,j; i≥q} cij (βij T^i )
Since

    γ = δ + Σ_{i,j,k; k≥q+1} bijk (βij T^{k−1} )
      = Σ_{i,j; i≥q} cij (βij T^i ) + Σ_{i,j,k; k≥q+1} bijk (βij T^{k−1} )
Problems
20.1. Let F be a field with 1 + 1 + 1 = 0. Discuss the formal
derivative map and the difference map on the full polynomial ring F [x]
and show that these maps are nilpotent.
Recall that R is a commutative ring if it is a set with an addition and
multiplication that satisfies all the axioms for a field with the possible
exception of the existence of multiplicative inverses.
20.2. If p ∈ Z is a prime, show that p divides the binomial coefficients (p choose i) for i = 1, 2, . . . , p − 1. Now let R be a commutative ring with

    p·1 = 1 + 1 + · · · + 1  (p times)  = 0
and let ns denote the number of block sizes sj that are equal to s. Use the preceding result to show that

    rank T^k = Σ_{s>k} ns (s − k) = nk+1 + 2nk+2 + 3nk+3 + · · ·
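As an added numerical illustration (not part of the original notes), the following Python sketch, assuming numpy, assembles a nilpotent matrix from Jordan blocks of prescribed sizes and checks this rank formula.

    import numpy as np

    def nilpotent_block(s):
        # s x s block with ones on the superdiagonal and zeros elsewhere
        return np.eye(s, k=1)

    sizes = [3, 3, 2, 1]            # block sizes s_j, so n_3 = 2, n_2 = 1, n_1 = 1
    T = np.zeros((sum(sizes), sum(sizes)))
    pos = 0
    for s in sizes:
        T[pos:pos+s, pos:pos+s] = nilpotent_block(s)
        pos += s

    for k in range(4):
        lhs = np.linalg.matrix_rank(np.linalg.matrix_power(T, k))
        rhs = sum(max(s - k, 0) for s in sizes)   # sum over s > k of n_s (s - k)
        print(k, lhs, rhs)                        # the two columns agree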
21. Jordan Canonical Form
Also, γi ∈ W so

    γi T = γi TW = Σ_j cij γj = Σ_k 0·αk + Σ_j cij γj
Even so, one can prove that F [x] has divisibility and factorization properties very similar to those of the ordinary integers.
If f (x) is a monic polynomial in F [x], let c(f ) denote its companion
matrix. We recall that if f (x) = xs then
b(0, s) = c(xs )
Now suppose that T : V → V is a linear transformation with charac-
teristic polynomial ϕT (x) = xn . Then our main result on nilpotent
transformations, Theorem 20.1, states that there exists a basis B such
that

    B [T ]B = diag(c(x^{s1} ), c(x^{s2} ), . . . , c(x^{sr} ))

where diag has the obvious meaning. Of course

    x^{s1} x^{s2} · · · x^{sr} = x^n = ϕT (x)
Problems
Let V be a finite dimensional vector space over the algebraically
closed field F and let T : V → V . For each a ∈ F let
Va = {α ∈ V | α(T − aI)n = 0 for some n ≥ 1}
21.1. Prove that Va is a subspace of V invariant under T . (Va is the
space of generalized eigenvectors for T with eigenvalue a.)
21.2. Show that Va = 0 if and only if a is an eigenvalue of T .
21.3. Let f (x) be a nonzero polynomial in F [x] and let a ∈ F with f (a) ≠ 0. Suppose α ∈ V satisfies

    αf (T ) = 0   and   α(T − aI)^n = 0 for some n ≥ 1
CHAPTER IV
Bilinear Forms
22. Bilinear Forms
Example 22.5. Let V be as above and now let κ(x, y) be some fixed
continuous real valued function defined on the unit square 0 ≤ x ≤ 1,
0 ≤ y ≤ 1. Then we can set

    B(α, β) = ∫₀¹ ∫₀¹ κ(u, v)α(u)β(v) du dv
In this unpleasant situation, the theory just does not work well. There-
fore, we will usually restrict our attention to normal bilinear forms, that
is bilinear forms B satisfying
B(α, β) = 0 if and only if B(β, α) = 0
There are two important special cases here.
First, we say that a bilinear form B is symmetric if
B(α, β) = B(β, α) for all α, β ∈ V
Obviously such a form is also normal. Second, we say that B is skew-
symmetric if
B(α, α) = 0 for all α ∈ V
The following lemma explains the name and shows that these forms
are also normal.
Lemma 22.1. Let B be a skew-symmetric bilinear form on V . Then
for all α, β ∈ V we have
B(β, α) = −B(α, β)
Hence B is normal.
Proof. We have B(α + β, α + β) = 0 and therefore
0 = B(α + β, α + β) = B(α, α + β) + B(β, α + β)
= B(α, α) + B(α, β) + B(β, α) + B(β, β)
= B(α, β) + B(β, α)
since also B(α, α) = B(β, β) = 0. Thus B(β, α) = −B(α, β) and this
clearly implies that B is normal.
We now see that these are not only some examples of normal forms,
but they are in fact the only examples.
Theorem 22.1. Let B : V × V → F be a normal bilinear form.
Then B is either symmetric or skew-symmetric.
Proof. Let σ, τ, η ∈ V . Then by linearity

    B(σ, B(σ, η)τ − B(σ, τ )η) = B(σ, η)B(σ, τ ) − B(σ, τ )B(σ, η) = 0

Hence, since B is normal, we have

    0 = B(B(σ, η)τ − B(σ, τ )η, σ) = B(σ, η)B(τ, σ) − B(σ, τ )B(η, σ)
v. ⟨S⟩⊥ = S⊥ .
Proof. If α ∈ V , then {α}⊥ is clearly the kernel of the linear transformation β ↦ B(α, β) and is therefore a subspace of V . Since clearly

    S⊥ = ∩_{α∈S} {α}⊥

we see that S⊥ is a subspace. This yields (i ). Parts (ii ), (iii ) and (iv ) are of course obvious.

Now ⟨S⟩ ⊇ S so by (iii ) we have S⊥ ⊇ ⟨S⟩⊥ . To obtain the reverse inclusion, we note that S⊥⊥ ⊇ S and since S⊥⊥ is a subspace, we must have S⊥⊥ ⊇ ⟨S⟩. This says that every element of S⊥ is perpendicular to every element of ⟨S⟩ and hence S⊥ ⊆ ⟨S⟩⊥ . This yields (v ).
Finally we show
Theorem 22.2. Let V be a finite dimensional vector space over F
and let B : V × V → F be a normal bilinear form. If W is a subspace
of V , then
dimF W + dimF W ⊥ ≥ dimF V
In particular, if W ∩ W ⊥ = 0, then V = W ⊕ W ⊥ .
Proof. If W = 0, then W ⊥ = V and the result is clear. Now
suppose dimF W = s ≥ 1 and let S = {α1 , α2 , . . . , αs } be a basis
for W . Then by the previous lemma, W ⊥ = S⊥ . Now W ⊥ = S⊥ is certainly the kernel of the linear transformation T : V → F^s given by

    β ↦ (B(α1 , β), B(α2 , β), . . . , B(αs , β))

Hence since

    dimF im T ≤ dimF F^s = s = dimF W
Theorem 9.3 yields
dimF V = dimF im T + dimF ker T
≤ dimF W + dimF W ⊥
as required.
Finally if W ∩ W ⊥ = 0, then W + W ⊥ = W ⊕ W ⊥ is a subspace of
V of dimension dimF W + dimF W ⊥ ≥ dimF V , so W ⊕ W ⊥ = V and
the theorem is proved.
Problems
22.1. Suppose we defined a skew-symmetric bilinear form by the
relation B(α, β) = −B(β, α). Would we get the same answer?
23. Symmetric and Skew-Symmetric Forms
Problems
Let V be a vector space over F and let B : V × V → F be a normal
bilinear form.
23.1. Let S be a nonempty subset of V . Show that S ⊥⊥⊥ = S ⊥ .
23.2. Let V = W1 + W2 + · · · + Wk be a perpendicular direct sum.
Prove that
rad V = rad W1 + rad W2 + · · · + rad Wk
23.3. In Theorem 23.1, show that the subspace U is in fact equal
to rad V .
23.4. Let V = W1 + W2 + · · · + Wk be a sum of subspaces with Wi ⊥ Wj for all i ≠ j and with each Wi nonsingular. Prove that the above sum is direct.
23.5. Let F be a field with 1+1 = 0. What is the difference between
symmetric and skew-symmetric forms?
24. Congruent Matrices
Since the a’s and b’s occur linearly in this formula, it is clear that B is
a bilinear form. Finally since γi = 0γ1 + 0γ2 + · · · + 1γi + · · · + 0γn we
have B(γi , γj ) = cij so this particular B maps to the matrix [cij ] and
the map is onto.
We show now that the map is one-to-one or in other words that B is uniquely determined by [B]C . Let α, β ∈ V and write α = Σ_i ai γi and β = Σ_i bi γi . Then by the linear properties of B, we have

    B(α, β) = B(Σ_i ai γi , Σ_j bj γj ) = Σ_i ai B(γi , Σ_j bj γj ) = Σ_{i,j} ai B(γi , γj )bj

Now the a's and b's depend only upon α, β and C and the field elements B(γi , γj ) depend only upon [B]C . Thus we see that C and [B]C uniquely determine B and the theorem is proved.
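As an added illustration of this correspondence (not part of the original notes), the following Python sketch, assuming numpy, evaluates a bilinear form from its matrix by the coordinate formula B(α, β) = Σ_{i,j} ai B(γi , γj )bj just derived.

    import numpy as np

    # an illustrative matrix [B]_C = [B(gamma_i, gamma_j)]
    M = np.array([[1.0, 2.0],
                  [0.0, 3.0]])

    def B(a, b):
        # a and b are the coordinate rows of alpha and beta relative to C
        return np.array(a) @ M @ np.array(b)

    print(B([1, 1], [2, -1]))   # 1*1*2 + 1*2*(-1) + 1*0*2 + 1*3*(-1) = -3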
and B is symmetric.
Now let B be skew-symmetric. Then by Lemma 22.1 we have
B(γi , γi ) = 0 and B(γi , γj ) = −B(γj , γi ) and hence [B(γi , γj )] is a
skew-symmetric matrix. Finally let us suppose that the matrix is skew
symmetric and let α = β in equation (∗). Then
    B(α, α) = Σ_{i,j} ai B(γi , γj )aj

Since the diagonal entries of the matrix are 0, the terms with i = j do not appear in the above sum. Thus we may assume that i ≠ j and then the contribution from the pair {i, j} is

    ai B(γi , γj )aj + aj B(γj , γi )ai = ai aj [B(γi , γj ) + B(γj , γi )] = 0
can write

    V = W1 + W2 + · · · + Wk + U

a perpendicular direct sum of the hyperbolic planes Wi and also the isotropic space U . Let Wi = ⟨γi , δi ⟩ with B(γi , δi ) = 1, B(δi , γi ) = −1 and let {β1 , β2 , . . . , βs } be a basis for U . Then certainly
C = {γ1 , δ1 , γ2 , δ2 , . . . , γk , δk , β1 , β2 , . . . , βs }
is a basis for V . We compute the matrix [B]C . Since the above is a
perpendicular direct sum and since U is an isotropic subspace we have
easily
B(βi , γj ) = 0 = B(γj , βi )
B(βi , δj ) = 0 = B(δj , βi )
B(βi , βj ) = 0 = B(βj , βi )
for all i, j. Furthermore, for all i ≠ j we have B(γi , γj ) = 0 = B(δi , δj ) and B(γi , δj ) = 0 = B(δj , γi ). Thus the only nonzero entries of [B]C
occur when we consider two elements of each Wi . Here we have
B(γi , γi ) = 0 B(γi , δi ) = 1
B(δi , γi ) = −1 B(δi , δi ) = 0
Therefore [B]C has the indicated block diagonal form with precisely k
blocks equal to H. Finally, by Theorem 24.3, α = [B]A is congruent to
the matrix [B]C , as required.
Finally we should consider whether the matrices we get in the above
theorem are actually unique. For symmetric matrices α we get diagonal
matrices but the diagonal entries are by no means unique. For certain
fields we can suitably restrict the entries to get a uniqueness theorem
but the problem is certainly not solved in general. On the other hand,
if α is skew-symmetric then the only parameter here is the number of
H blocks that appear and this is in fact uniquely determined.
Problems
24.1. Let V be the real subspace of R[x] consisting of all polynomials of degree < n and let B : V × V → R be defined by

    B(α, β) = ∫₀¹ α(x)β(x) dx

If C is the basis of V given by {1, x, x², . . . , x^{n−1} } show that the matrix [B]C is the Hilbert matrix of Example 16.2.
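The computation behind this problem is just B(x^i , x^j ) = ∫₀¹ x^{i+j} dx = 1/(i + j + 1). As an added illustration (not part of the original notes), this Python sketch builds the resulting Hilbert matrix exactly using rational arithmetic.

    from fractions import Fraction

    n = 4
    # entry (i, j) is the integral of x^i x^j over [0, 1], namely 1/(i+j+1)
    H = [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]
    for row in H:
        print([str(c) for c in row])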
24.2. Prove that the map B : V × V → F defined in the first para-
graph of the proof of Theorem 24.1 is a bilinear form.
25. Inner Product Spaces
[Figure: a triangle with sides of lengths a, b and c, where the sides a and b meet at the origin with angle θ, illustrating the law of cosines.]
Moreover

    α • α = a1² + a2² + · · · + an²
and thus since F = R we have α • α ≥ 0 and α • α = 0 if and only if
α = 0. In this way, Rn becomes an inner product space.
Formally an inner product space is a vector space V over R with
an inner product or dot product α • β defined on it satisfying
1. The inner product • maps V × V to R and is a symmetric
bilinear form.
2. For all α ∈ V , α • α ≥ 0 and α • α = 0 if and only if α = 0.
Let us consider some examples.
Example 25.1. We have as above V = Rn and α • β defined by

    α • β = a1 b1 + a2 b2 + · · · + an bn

where α = (a1 , a2 , . . . , an ) and β = (b1 , b2 , . . . , bn ).
Observe that α • α = ∫_{−1}^{1} α(x)² dx ≥ 0 and clearly α • α = 0 if and only if α = 0.
ab cos θ = α • β
Proof. Most of this follows by virtue of the fact that the inner
product is a symmetric bilinear form.
(i ) If W ≠ 0 we can choose 0 ≠ α ∈ W . By definition α • α > 0 and hence W is nonisotropic.

(ii ) If α ∈ rad W then we must have α ⊥ α and thus α = 0. Therefore rad W = 0 and W is nonsingular so Theorem 22.2 implies that V = W ⊕ W ⊥ .

(iii ) By Theorem 23.1 we have V = Σ_{1}^{k} Wi + U , an orthogonal direct sum of the nonisotropic lines Wi and the isotropic subspace U . Since (i ) implies that U = 0, the lemma is proved.
γ1 = α1 and

    γi+1 = αi+1 − Σ_{k=1}^{i} ((αi+1 • γk )/(γk • γk )) γk
Proof. We remark first that in the above formula for γi+1 we have
denominators of the form (γk •γk ). Therefore we will have to show that
each γk is nonzero.
The proof of the theorem proceeds by induction on i. When i = 1 we certainly have γ1 = α1 ≠ 0 so C1 = {(1/‖γ1 ‖)γ1 } is an orthonormal basis for ⟨α1 ⟩ by Lemma 25.2.
Now assume that Ci is an orthonormal basis for ⟨α1 , α2 , . . . , αi ⟩. In particular each γk with k ≤ i is nonzero. Thus it makes sense to define

    γi+1 = αi+1 − Σ_{k=1}^{i} ((αi+1 • γk )/(γk • γk )) γk

Now observe that αi+1 can be solved from this equation and hence αi+1 ∈ ⟨γ1 , γ2 , . . . , γi+1 ⟩. But we also know, by induction, that αk ∈ ⟨γ1 , γ2 , . . . , γi ⟩ for all k ≤ i. Hence

    ⟨α1 , α2 , . . . , αi+1 ⟩ ⊆ ⟨γ1 , γ2 , . . . , γi+1 ⟩
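As an added illustration of the Gram-Schmidt formula above (not part of the original notes), here is a Python sketch, assuming numpy, that orthogonalizes a list of vectors in Rn and normalizes at the end.

    import numpy as np

    def gram_schmidt(alphas):
        # gamma_{i+1} = alpha_{i+1} minus its projections onto the earlier
        # gammas, exactly as in the formula above; normalize at the end.
        gammas = []
        for a in map(np.asarray, alphas):
            g = a.astype(float)
            for prev in gammas:
                g = g - (a @ prev) / (prev @ prev) * prev
            gammas.append(g)
        return [g / np.linalg.norm(g) for g in gammas]

    C = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
    print(np.round([[u @ v for v in C] for u in C], 10))   # identity: orthonormal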
Problems
25.1. Prove that an orthogonal set of nonzero vectors is always
linearly independent.
26. Inequalities
Let V be a finite dimensional inner product space. Then we know
that V has an orthonormal basis and we expect this basis to completely
determine the inner product structure. Indeed we have
Lemma 26.1. Let V be an inner product space with orthonormal basis {γ1 , γ2 , . . . , γn }. If α = Σ_i ai γi and β = Σ_i bi γi are vectors in V , then ai = α • γi and bi = β • γi so

    α = Σ_i (α • γi )γi
    α • β = Σ_i ai bi = Σ_i (α • γi )(β • γi )
    ‖α‖² = Σ_i ai² = Σ_i (α • γi )²

In particular, if β = γi this yields α • γi = ai and similarly β • γi = bi . Thus α = Σ_i (α • γi )γi and

    α • β = Σ_i (α • γi )(β • γi )

Finally if β = α then

    ‖α‖² = α • α = Σ_i ai² = Σ_i (α • γi )²
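In other words, against an orthonormal basis the coordinates of a vector are just dot products. The following small Python check (an added illustration, assuming numpy) confirms this in R².

    import numpy as np

    # an orthonormal basis of R^2 (rotation of the standard basis by 45 degrees)
    g1 = np.array([1.0, 1.0]) / np.sqrt(2)
    g2 = np.array([-1.0, 1.0]) / np.sqrt(2)

    alpha = np.array([3.0, 5.0])
    coords = [alpha @ g1, alpha @ g2]             # a_i = alpha . gamma_i
    rebuilt = coords[0] * g1 + coords[1] * g2     # alpha = sum (alpha . gamma_i) gamma_i

    print(np.allclose(rebuilt, alpha))                            # True
    print(np.isclose(sum(c**2 for c in coords), alpha @ alpha))   # ||alpha||^2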
Clearly

    a = √(a1² + a2² + · · · + an² ) = ‖α‖
    b = √(b1² + b2² + · · · + bn² ) = ‖β‖

so

    cos θ = (α • β)/(‖α‖·‖β‖)

Finally | cos θ| ≤ 1 so we have shown here that

    1 ≥ |α • β|/(‖α‖·‖β‖)
While the above inequality is of interest, the above proof is certainly
unsatisfying. It is based on certain geometric reasoning that just is not
set on as firm a foundation as our algebraic reasoning. Admittedly,
geometry can be axiomatized in such a way that the above proof will
have validity in our situation, but most of us have not seen this formal
approach to the subject. Thus it is best for us to look elsewhere for
a proof of this inequality. In fact, we will offer three entirely different
proofs of the result, the first of which is probably best since it avoids
the use of bases.
The following is called the Cauchy-Schwarz inequality.
Theorem 26.1. Let V be an inner product space and let α, β ∈ V .
Then
    ‖α‖·‖β‖ ≥ |α • β|
Moreover equality occurs if and only if one of α or β is a scalar multiple
of the other.
Proof. We observe that the theorem is trivially true if one of α
or β is zero. Thus we can certainly assume that both α and β are
nonzero. Moreover if α and β are scalar multiples of each other then
we easily get equality here. Finally since α and β are contained in ⟨α, β⟩, a finite dimensional subspace of V , we may clearly assume that V is finite dimensional.
First Proof. Let x ∈ R. Then

    0 ≤ ‖α − xβ‖² = (α − xβ) • (α − xβ) = ‖α‖² − 2x(α • β) + x²‖β‖²

What we have here is a parabola

    y = ‖α‖² − 2x(α • β) + x²‖β‖²
in the (x, y)-plane that does not go below the x-axis. We obviously derive the most information from this by choosing x to be that real number that minimizes y, namely we take x = (α • β)/‖β‖². Then

    0 ≤ ‖α‖² − 2((α • β)/‖β‖²)(α • β) + ((α • β)²/‖β‖⁴)‖β‖²
      = ‖α‖² − (α • β)²/‖β‖²

This yields (α • β)² ≤ ‖α‖²‖β‖² and the first part follows. Moreover we have equality if and only if, for this particular x, we have ‖α − xβ‖² = 0 and hence if and only if α = xβ.
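As an added numerical spot check (not part of the original notes), the following Python sketch, assuming numpy, tests the inequality on many random pairs of vectors.

    import numpy as np

    rng = np.random.default_rng(2)
    for _ in range(1000):
        a, b = rng.standard_normal(5), rng.standard_normal(5)
        assert np.linalg.norm(a) * np.linalg.norm(b) >= abs(a @ b)
    print("Cauchy-Schwarz held in all trials")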
    γ1 = (1/‖β1 ‖)β1 = (1/‖β‖)β
[Figure: the triangle with sides α, β and α + β, illustrating the triangle inequality.]
    (β + γ) • α = β • α + γ • α

    rα • β ≤ (r²/2a)‖α‖² + (a/2)‖β‖²

If we now choose a so that |r|/2 < a < 2|r| the above easily becomes

    rα • β ≤ |r|(‖α‖² + ‖β‖²)

and thus

    |rα • β| ≤ |r|(‖α‖² + ‖β‖²)   (∗∗)
Problems
26.1. Prove geometrically that the sum of the squares of the lengths
of the four sides of a parallelogram is equal to the sum of the squares
of the lengths of the two diagonals. (Hint. An obvious approach is to
try two applications of the Law of Cosines.)
Let V be an inner product space.
26.2. Show that equality occurs in the Cauchy-Schwarz inequality
if α = aβ or β = bα for some a, b ∈ R.
26.3. Prove that ‖α‖ − ‖β‖ ≤ ‖α − β‖ for all α, β ∈ V .
A linear transformation T : V → V is said to be unitary if ‖αT ‖ = ‖α‖ for all α ∈ V .
26.4. Let T : V → V be a linear transformation. Show that T is
unitary if and only if (αT ) • (βT ) = α • β for all α, β ∈ V .
26.5. Let C = {γ1 , γ2 , . . . , γn } be an orthonormal basis for V . Prove
that T : V → V is unitary if and only if (C)T = {γ1 T, γ2 T, . . . , γn T } is
also an orthonormal basis.
26.6. Let C be as above. Show that T is unitary if and only if C [T ]C ·(C [T ]C )^T = In where ^T is the matrix transpose map.
Let V denote the vector space of all bounded real valued functions α : Z+ → R as defined immediately before Problem 25.9 and recall that

    α • β = Σ_{n=1}^{∞} α(n)β(n)/2^n
Then we know that V is an inner product space. Suppose by way of
contradiction that V has an orthonormal basis C.
26.7. Show that V is infinite dimensional and therefore that C has
a countably infinite subset {γ1 , γ2 , . . . , γi , . . .}.
26.8. For each γi as above, let ci > 0 denote a finite upper bound for {|γi (n)| | n ∈ Z+ }. Now define α : Z+ → R by

    α(n) = Σ_{i=1}^{∞} γi (n)/(2^i ci )
(Note that we are not adding infinitely many functions since this is not
defined on V , but rather we are adding infinitely many real function
values.) Prove that α ∈ V .
26.9. For all i show that

    α • γi = 1/(2^i ci ) ≠ 0
(Hint. α • γi is a double sum and can be computed by interchanging
the order of summation and using the values of the various γj • γi .)
26.10. Use the preceding problem and the ideas of Lemma 26.1 to
deduce that when α is written in terms of the basis C that all γi must
occur. Conclude that V does not have an orthonormal basis.
27. Real Symmetric Matrices
with diagonal entries 1, −1 and 0 and with the number of these entries
determined by B and hence by α. Since α = [B]A is congruent to [B]C
by Lemma 24.1, we see that α is congruent to an appropriate diagonal
matrix.
Conversely assume that α is congruent to a suitable diagonal matrix
γ and let V = Rn have basis A. Then B can be chosen so that α = [B]A
and B depends only upon α. Since α is congruent to γ, Lemma 24.1
implies that [B]C = γ for some basis C of V . The preceding theorem
now implies that the number of diagonal entries of γ equal to 1, −1 or
0 is uniquely determined by B and hence by α.
of degree n > 1 and assume that the result holds for all polynomials
of smaller degree. We embed R in the complex numbers C and then
R[x] ⊆ C[x].
Let ¯ : C → C denote complex conjugation. Since C is algebraically closed, we can let b ∈ C be a root of f . Then since the coefficients ai are real, we have

    0 = 0̄ = ( Σ_i ai b^i )¯ = Σ_i ai b̄^i = f (b̄)
Write α = [aij ] and let β = Σ_i bi γi and δ = Σ_j dj γj be vectors in V . Then

    βT = Σ_{i,j} bi aij γj
    δT = Σ_{i,j} dj aji γi

and

    β • δT = ( Σ_i bi γi ) • ( Σ_{i,j} dj aji γi ) = Σ_{i,j} bi aji dj
Problems
27.1. Let α ∈ Rn×n . Show that α is similar to a diagonal matrix if
and only if it is similar to a symmetric matrix.
27.2. Let V be an inner product space with orthonormal basis C = {γ1 , γ2 , . . . , γn } and let α ∈ Rn×n . Define the linear transformations T and T^T by

    C [T ]C = α,    C [T^T ]C = α^T

Prove that βT • δ = β • δT^T for all vectors β, δ ∈ V .
Let V be an n-dimensional real vector space and let B : V × V → R be a symmetric bilinear form. Suppose there exists a basis C of V with

    [B]C = diag(1, 1, . . . , 1, −1, −1, . . . , −1, 0, 0, . . . , 0)

with r entries equal to 1, s entries equal to −1 and t entries equal to 0.
28. Complex Analogs
Problems
Let B : V × V → C be a normal Hermitian bilinear form. In the
next four problems, we prove Theorem 28.2.
28.1. For σ, τ, η ∈ V , derive the identity
B(η, σ)B(τ, η) = B(σ, η)B(η, τ ) (∗)
(Hint. See the proof of Theorem 22.1.)
28.2. Assume that there exists γ ∈ V with B(γ, γ) > 0 and fix this
element. Setting τ = η = γ, σ = β in (∗), deduce that
B(γ, β) = B(β, γ)
for all β ∈ V .
CHAPTER V

Infinite Dimensional Spaces
29. Existence of Bases
that are more useful. Among these are: the Compactness theorem,
the Hausdorff maximal principle, Kuratowski’s lemma, Tukey’s lemma,
the Well-ordering principle, Zermelo’s postulate, and Zorn’s lemma.
Algebraists tend to use Zorn's lemma or Well-ordering, so we will restrict
our attention to these. The fact that Zorn’s lemma is equivalent to the
Axiom of Choice means that one can prove Zorn’s lemma using the
Axiom of Choice and conversely one can prove the Axiom of Choice
using Zorn’s lemma.
We need some definitions.
Definition 29.1. We say that (A, ≤) is a partially ordered set or
a poset if the inequality ≤ is defined on A. By this we mean that for
some pairs a, b ∈ A we have a ≤ b. Furthermore, for all a, b, c ∈ A, the
inequality satisfies
PO1. a ≤ a.
PO2. a ≤ b and b ≤ c imply a ≤ c.
PO3. a ≤ b and b ≤ a imply a = b.
Of course the second condition above is the transitive law. As usual,
we will use b ≥ a to mean a ≤ b, and we will use a < b or b > a to
mean less than or equal to, but not equal.
An element m ∈ A is said to be maximal if there are no properly
larger elements in A. That is, if m ≤ a then m = a. Similarly n ∈ A
is minimal if there are no properly smaller elements in A.
A standard example here is as follows. Let F be a set and let
A be the set of all subsets of F . Then A is a partially ordered set
with inequality being setwise inclusion ⊆. Obviously F is the unique
maximal member of A and ∅ is the unique minimal member of A.
Definition 29.2. Next, (A, ≤) is said to be a linearly ordered set
or a totally ordered set if
LO1. (A, ≤) is a partially ordered set.
LO2. For all a, b ∈ A we have a ≤ b or b ≤ a.
Thus this occurs when A is partially ordered and all pairs of elements
are comparable. Of course both the set of rational numbers Q and the
set of real numbers R are totally ordered with the usual inequality.
Definition 29.3. Finally, (A, ≤) is a well-ordered set if
WO1. (A, ≤) is a linearly ordered set.
WO2. Every nonempty subset of A has a minimal member.
Notice that Z+ the set of positive integers is well-ordered, but Q+
the set of positive rationals is not. Indeed, for example, the subset B = {q ∈ Q+ | q > 1} does not have a minimal member; note that the lower bound q = 1 is not in B.
Theorem 29.5. Let V be a vector space over the field F and let
L ⊆ S ⊆ V be given where L is a linearly independent set and where S
spans V . Then there exists a basis B of V with L ⊆ B ⊆ S.
Problems
Let (A, ≤) be a partially ordered set.
29.1. Show that the following two conditions are equivalent.
i. A satisfies the maximal condition (MAX), which means that
every nonempty subset of A has a maximal member.
ii. A satisfies the ascending chain condition (ACC), namely A has
no strictly ascending chain
a1 < a2 < · · · < an < · · ·
indexed by the positive integers.
29.2. Show that the following two conditions are equivalent.
i. A satisfies the minimal condition (MIN), which means that
every nonempty subset of A has a minimal member.
ii. A satisfies the descending chain condition (DCC), namely A
has no strictly descending chain
a1 > a2 > · · · > an > · · ·
indexed by the positive integers.
29.3. Show that A is well-ordered if and only if every nonempty sub-
set has a unique minimal member. (Hint. To prove that the ordering
is linear consider subsets of size 2.)
29.4. Define the relation ≼ on A by a ≼ b if and only if b ≤ a. Show that (A, ≼) is also a partially ordered set that has the effect of interchanging maximal and minimal and also upper bound and lower bound.
30. Existence of Dimension
    W = ⟨C⟩ = ⟨C′⟩

so C and C′ are bases of W since they span and since they are both linearly independent subsets of the corresponding bases B and B′. If either C or C′ is finite, then W is finite dimensional and Theorem 5.2 implies that C ∼ C′. On the other hand if both C and C′ are infinite, then part (iii ) of Lemma 30.2 yields C ∼ N ∼ C′.
As a consequence of the above and Zorn’s lemma, we obtain
Theorem 30.3. Let V be a vector space over the field F . Then any
two bases of V have the same size.
Proof. Let B and B′ be bases of V . We consider the family C of all pairs (C, f ) where C ⊆ B, f : C → B′ is a one-to-one map and where we write C′ = f (C). As usual we write (C, f ) ≼ (D, g) if C ⊆ D and if f
Problems
30.1. Let V ≠ 0 be a finite dimensional vector space over Q. Show that V ∼ N. (Hint. Let {γ1 , γ2 , . . . , γk } be a basis for V and for each natural number n let Vn be the set of all vectors

    (a1 /b1 )γ1 + (a2 /b2 )γ2 + · · · + (ak /bk )γk

with ai , bi ∈ Z and |ai |, |bi | ≤ n for all i. Now apply Lemma 30.2.)
30.2. If A is a set, then its power set P(A) is the set of all subsets of A. Observe that A ∼ A′ implies that P(A) ∼ P(A′). Now prove that |A| ≠ |P(A)|. (Hint. If f : A → P(A) is given, define a subset B of A so that b ∈ B if and only if b ∉ f (b). Is B in the image of f ?)
30.3. Show that |R| ≤ |P(Q)| = |P(N)|. (Hint. Each real number
r determines a “cut” {q ∈ Q | r < q} in the rational line.)
30.4. Conversely show that |P(N)| ≤ |R| and deduce that equality
occurs. (Hint. We construct a map f from P(N) to R, or actually to
elements in the half open interval [0, 1), as follows. For each subset B
of N, let f (B) be the real number whose nth decimal digit, after the
decimal point, is equal to 1 if n ∈ B and equal to 0 otherwise. Thus
    f (B) = Σ_{n∈B} 1/10^n .)