ALGEBRA
B.A. Sethuraman
California State University Northridge
Copyright 2015 B.A. Sethuraman.
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license
is included in the section entitled GNU Free Documentation License.
History
2012 Version 1.0. Created. Author B.A. Sethuraman.
2015 Version 2.0-B. Book Version. Author B.A. Sethuraman.
Contents

To the Student: How to Read a Mathematics Book . . . . . . . . . . .   3

1  Divisibility in the Integers . . . . . . . . . . . . . . . . . . .  27
       Further Exercises . . . . . . . . . . . . . . . . . . . . . .  45

2  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  57
   2.2  Subrings . . . . . . . . . . . . . . . . . . . . . . . . . .  83
   2.4  Ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

3  Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 165

4  Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  247

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  361
List of Videos

To the Student: Proofs: Exercise 0.10 . . . . . . . . . . . . . . . .  23
To the Student: Proofs: Exercise 0.15 . . . . . . . . . . . . . . . .  24
To the Student: Proofs: Exercise 0.19 . . . . . . . . . . . . . . . .  25
To the Student: Proofs: Exercise 0.20 . . . . . . . . . . . . . . . .  25
To the Student: Proofs: Exercise 0.21 . . . . . . . . . . . . . . . .  25
Chapter 1: GCD via Division Algorithm . . . . . . . . . . . . . . . .  46
Chapter 1: Number of divisors of an integer . . . . . . . . . . . . .  49
Chapter 2: Ideal generated by a1, . . . , an . . . . . . . . . . . .  107
Chapter 2: Evaluation Homomorphism . . . . . . . . . . . . . . . . .  126
Chapter 2: Linear independence in Q[√2, √3] . . . . . . . . . . . . . 138

To the Student: How to Read a Mathematics Book
How should you read a mathematics book? The answer, which applies
to every book on mathematics, and in particular to this one, can be given
in one word: actively. You may have heard this before, but it can never be
overstressed: you can only learn mathematics by doing mathematics. This
means much more than attempting all the problems assigned to you (although
attempting every problem assigned to you is a must). What it means is that
you should take time out to think through every sentence and confirm every
assertion made. You should accept nothing on trust; instead, not only should
you check every statement, you should also attempt to go beyond what is
stated, searching for patterns, looking for connections with other material
that you may have studied, and probing for possible generalizations.
Let us consider an example:
Example 0.1
On page 65 in Chapter 2, you will find the following sentence:
Yet, even in this extremely familiar number system, multiplication is not commutative; for instance,

[0 1][0 0]     [0 0][0 1]
[0 0][0 1]  ≠  [0 1][0 0].
(The number system referred to is the set of 2×2 matrices whose entries are
real numbers.) When you read a sentence such as this, the first thing that you
should do is verify the computation yourselves. Mathematical insight comes
from mathematical experience, and you cannot expect to gain mathematical
experience if you merely accept somebody else's word that the product on
the left side of the equation does not equal the product on the right side.
The very process of multiplying out these matrices will make the set of 2×2
matrices a more familiar system of objects, but as you do the calculations,
more things can happen if you keep your eyes and ears open. Some or all of
the following may occur:
1. You may notice that not only are the two products not the same, but
that the product on the right side gives you the zero matrix. This
should make you realize that although it may seem impossible that two
nonzero numbers can multiply out to zero, this is only because you
are confining your thinking to the real or complex numbers. Already,
the set of 2×2 matrices (with which you have at least some familiarity)
contains nonzero elements whose product is zero.
2. Intrigued by this, you may want to discover other pairs of nonzero
matrices that multiply out to zero. You will do this by taking arbitrary
pairs of matrices and determining their product. It is quite probable
that you will not find an appropriate pair. At this point you may be
tempted to give up. However, you should not. You should try to be
creative, and study how the entries in the various pairs of matrices you
have selected affect the product. It may be possible for you to change
one or two entries in such a way that the product comes out to be zero.
For instance, suppose you consider the product

[1 1][4 0]   [6 0]
[1 1][2 0] = [6 0].
You should observe that no matter what the entries of the first matrix
are, the product will always have zeros in the (1, 2) and the (2, 2) slots.
This gives you some freedom to try to adjust the entries of the first
matrix so that the (1, 1) and the (2, 1) slots also come out to be zero.
After some experimentation, you should be able to do this.
3. You may notice a pattern in the two matrices that appear in our inequality on page 4. Both matrices have only one nonzero entry, and
that entry is a 1. Of course, the 1 occurs in different slots in the two
matrices. You may wonder what sorts of products occur if you take
similar pairs of matrices, but with the nonzero 1 occurring at other locations. To settle your curiosity, you will multiply out pairs of such
matrices, such as

[0 0][0 1]      [0 0][0 0]
[1 0][0 0]  or  [1 0][1 0].
You will try to discern a pattern behind how such matrices multiply.
To help you describe this pattern, you will let e_{i,j} stand for the matrix
with 1 in the (i, j)-th slot and zeros everywhere else, and you will try
to discover a formula for the product of e_{i,j} and e_{k,l}, where i, j, k, and
l can each be any element of the set {1, 2}.
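If you would like a machine to join in this experiment, the sketch below forms each matrix e_{i,j} and prints all sixteen products, so that the pattern can be spotted by eye. The helper names (`e`, `matmul`) are ours, not the text's; the payoff — a formula for e_{i,j}·e_{k,l} — is left for you to conjecture.

```python
# A small aid for the experiment described above: build each 2x2 matrix
# e_{i,j} (1 in slot (i, j), zeros elsewhere) and print all sixteen
# products e_{i,j} e_{k,l}, so the pattern can be spotted.

def e(i, j):
    """Return the 2x2 matrix with a 1 in slot (i, j) (1-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in (1, 2)] for r in (1, 2)]

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists of rows."""
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

for i in (1, 2):
    for j in (1, 2):
        for k in (1, 2):
            for l in (1, 2):
                print(f"e_{i}{j} * e_{k}{l} =", matmul(e(i, j), e(k, l)))
```

Staring at the sixteen printed products should suggest a single clean rule governing when the product is zero and where the surviving 1 lands.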
Notice that a single sentence can lead to an enormous amount of mathematical activity! Every step requires you to be alert and actively involved in what
you are doing. You observe patterns for yourselves, you ask yourselves questions, and you try to answer these questions on your own. In the process, you
discover most of the mathematics yourselves. This is really the only way to
learn mathematics (and in particular, it is the way every professional mathematician has learned the subject). Mathematical concepts are developed
precisely because mathematicians observe patterns in various mathematical
objects (such as the 2×2 matrices), and to have a good understanding of
these concepts you must try to notice these patterns for yourselves.
May you spend many, many hours happily playing in the rich and beautiful
world of mathematics!
On Proofs
proofs, stated in that generality, this means that not only do you understand
all the mathematics that is currently known, but that you understand all
the mathematics that might ever be known. This, to many, would be one
definition of God, and we may safely assume that we are all mortal here.
The good news lies in what constitutes a proof. A proof is simply a step-by-step revelation of some mathematical truth. A proof lays bare connections
between various mathematical objects and in a series of logical steps, leads
you via these connections to the truth. It is like a map that depicts in detail
how to find buried treasure. Thus, if you have written one proof correctly, this
means that you have discovered for yourself the route to some mathematical
treasure, and this is the good news! To write a proof of some result is to
fully understand all the mathematical objects that are connected with that
result, to understand all the relations between them, and eventually to see
instantly why the result must be true. There is joy in the whole process: in
the search for connections, in the quest to understand what these connections
mean, and finally, in the "aha!" moment, when the truth is revealed.
(It is in this sense too that no one can claim to know how to do proofs.
They would in effect be claiming to know all mathematical truths!)
Thus, when students say that they do not know how to do proofs, what
they really mean, possibly without being aware of this themselves, is that
they do not fully understand the mathematics involved in the specific result
that they are trying to prove. It is characteristic of most students who have
been used to the "plug and chug" style alluded to before that they have
simply not learned to delve deep into mathematics. If this describes you as
well, then I would encourage you to read the companion essay To the Student:
How to Read a Mathematics Book (Page 3). There are habits of thought
that must become second nature as you move into advanced mathematics,
and these are described there.
Besides reading that essay, you can practice thinking deeply about mathematics by trying to prove a large number of results that involve just elementary concepts that you would have seen in high school (but alas, perhaps
could never really explore in depth then). Doing so will force you to start
examining concepts in depth, and start thinking about them like a mathematician would. We will collect a few examples of proofs of such results in
this chapter, and follow it with more results left as exercises for you to prove.
And of course, as you read through the rest of this book, you will be
forced to think deeply about the mathematics presented here: there really
is no other way to learn this material. And as you think deeply, you will
find that it becomes easier and easier to write proofs. This will happen
automatically, because you will understand the mathematics better, and once
you understand better, you will be able to articulate your thoughts better,
and will be able to present them in the form of cogent and logical arguments.
Now for some practice examples and exercises. We will invoke a few
definitions that will be familiar to you already (although we will introduce
them in later chapters too): Integers are members of the set {0, ±1, ±2, . . . }.
A prime is an integer n (not equal to 0 or ±1) whose only divisors are ±1 and
±n. If you are used only to divisibility among the positive integers, just keep
the following example in mind: −2 divides 6 because −2 times −3 equals
6. Similarly, 2 · (−3) = −6, so 2 divides −6 as well. By the same token, −2
divides −6 because −2 · 3 = −6. In general, if m and n are positive integers
and m divides n, then ±m divides ±n.
Example 0.3
Let us start with a very elementary problem: Prove that the sum of
the squares of two odd integers is even.
You can try to test the truth of this statement by taking a few pairs
of odd integers at random, squaring them, and adding the squares. For
instance, 3^2 + 7^2 = 58, and 58 is even; 5^2 + 1^2 = 26, and 26 is even; and
so on. Now this of course doesn't constitute a proof: a proof should
reveal why this statement must be true.
You need to invoke the fact that the given integers are odd. Odd
integers are precisely those that are expressible as 2x+1 for some integer
x (and of course, even integers are precisely those that are expressible
as 2y for some integer y). Recall that the sum of two odd integers is
even: if one integer is expressed as 2x + 1 for some integer x and the
other as 2y + 1 for some integer y (note that we are using y the second
time around; we must use a different letter, or else the two integers
we start with will be equal!), then their sum is 2x + 1 + 2y + 1 =
2(x + y) + 2 = 2(x + y + 1), and this is even because it is a multiple of
two.
Exercise 0.3.1
Modify the argument above and show that the sum of two even
integers is even, and the sum of an even integer and an odd
integer is odd.
Exercise 0.3.2
Now modify the argument above further and show that the product of two odd integers is odd, the product of two even integers
is even, and the product of an even integer and an odd integer is
even.
Now let us prove the assertion at the start of this example:
Proof. Let the first integer be 2x + 1, and the second be 2y + 1. We
square them and add: (2x + 1)^2 + (2y + 1)^2 = (4x^2 + 4x + 1) + (4y^2 +
4y + 1) = 2(2x^2 + 2x + 2y^2 + 2y + 1), which is a multiple of 2 and is
hence even, as desired.
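A claim like this can also be spot-checked by machine before (or after) proving it. The loop below is only numerical evidence over a finite range, of course, and no substitute for the algebra:

```python
# Spot-check: the sum of the squares of two odd integers is even.
# Numerical evidence only; the algebraic proof is what actually matters.
for x in range(-20, 21):
    for y in range(-20, 21):
        a, b = 2 * x + 1, 2 * y + 1   # two arbitrary odd integers
        assert (a * a + b * b) % 2 == 0
print("checked all odd pairs built from x, y in [-20, 20]")
```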
Example 0.4
Here is something a bit more involved: Show that if n is a positive
integer, then n^5 − n is divisible by 5.
How would one go about this? There are no rules of course, but
your experience with odds (2x + 1) and evens (2y) might suggest
to you that perhaps when trying to show that some final expression
is divisible by 5, we should consider the remainders when various integers are divided by 5. (This is the sort of insight that comes from
experience; there really is no substitute for having done lots of mathematics before!) The various possible remainders are 0, 1, 2, 3, and 4.
Thus, we write n = 5x + r for some integer x, and some r in the set
{0, 1, 2, 3, 4}, and then expand n^5 − n, hoping that in the end, we get a
multiple of 5. Knowledge of the binomial theorem will be helpful here.
Proof. Write n = 5x + r as above. Then n^5 − n = (5x + r)^5 − (5x + r),
and using the binomial theorem and the symmetry of the binomial
coefficients (C(n, r) = C(n, n − r)), this is (5x)^5 + 5(5x)^4 r +
((5 · 4)/2)(5x)^3 r^2 + ((5 · 4)/2)(5x)^2 r^3 +
5(5x)r^4 + r^5 − (5x + r). Studying the terms, we see that all summands
except possibly r^5 − r are divisible by 5. We may hence write n^5 − n =
5y + r^5 − r, where y is obtained by factoring 5 from all summands other
than r^5 − r. It is sufficient therefore to prove that for any r in the set
{0, 1, 2, 3, 4}, r^5 − r is divisible by 5, for if so, we may write r^5 − r = 5z
for suitable z, and then write n^5 − n = 5y + 5z = 5(y + z), which is a
multiple of 5. Since r only takes on five values, all of them small, we
can test easily that r^5 − r is divisible by 5: 0^5 − 0 = 0, 1^5 − 1 = 0,
2^5 − 2 = 30, 3^5 − 3 = 240, and 4^5 − 4 = 1020, all divisible by 5 as
needed!
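As a sanity check on both the final claim and the key step of the proof, a short loop can test r^5 − r for each possible remainder and n^5 − n for many values of n (numerical evidence only, of course):

```python
# Check the two ingredients of the proof above:
# (1) r^5 - r is divisible by 5 for each possible remainder r, and
# (2) n^5 - n is divisible by 5 for a range of positive integers n.
for r in range(5):
    assert (r**5 - r) % 5 == 0

for n in range(1, 1001):
    assert (n**5 - n) % 5 == 0
print("n^5 - n is divisible by 5 for n = 1, ..., 1000")
```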
Exercise 0.4.1
What was so special about the 5 in the example above? As
n varies through the positive integers, play with expressions of
the form n^k − n for small values of k, such as k = 2, 3, 4, 6, 7,
etc. Can you prove that for all positive integers n, n^k − n is
divisible by k, at least for these small values of k? (For instance,
you can try to modify the proof above appropriately.) If you
cannot prove this assertion for some of these values of k, can
you find some counterexamples, i.e., some value of n for which
n^k − n is not divisible by k? Based on your explorations, can you
formulate a conjecture on what values of k will make the assertion
"n^k − n is divisible by k for all positive integers n" true? (These
results are connected with some deep results: Fermat's little
theorem, Carmichael numbers, and so on, and have applications
in cryptography, among other places. Incidentally, the cases k =
3 and k = 5 appear again as Exercises 1.34 and 1.35 in Chapter
1 ahead, with a hint that suggests a slightly different technique
of proof.)
Exercise 0.4.2
Show that n^5 − n is divisible by 30 as well.
(Hint: Since we have seen that n^5 − n is divisible by 5, it is
sufficient to show that it is also divisible by 2 and by 3.)
Question 0.4.3
Suppose you were asked to prove that a certain integer is divisible
by 90. Notice that 90 = 15 × 6. Is it sufficient to check that
the integer is divisible by both 15 and 6 to be able to conclude
that it is divisible by 90? If not, why not? Can you provide a
counterexample?
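If you would like to explore the question computationally before answering it, the sketch below simply searches a range of integers for candidates divisible by both 15 and 6; whether it turns up anything not divisible by 90 is exactly the point of the question.

```python
# Exploration aid for the question above: search for integers divisible by
# both 15 and 6 but not by 90. (Whether any exist is exactly the question.)
hits = [m for m in range(1, 200) if m % 15 == 0 and m % 6 == 0 and m % 90 != 0]
print(hits)
```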
Example 0.5
Prove that if n is any positive integer and x and y are any two distinct
integers, then x^n − y^n is divisible by x − y.
When confronted with an infinite sequence of statements, one for
each n = 1, 2, . . . , it is worthwhile playing with these statements for
small values of n, and checking if they are true for these values. Then,
while playing with them, you might see a pattern that might give you
some ideas.
The statement is clearly true for n = 1: x − y is of course divisible
by x − y! For n = 2, we know that x^2 − y^2 = (x − y)(x + y), so clearly
x^2 − y^2 is divisible by x − y. When n = 3, you may remember the
identity x^3 − y^3 = (x − y)(x^2 + xy + y^2). So far so good. For n = 4? Or
even, for n = 3 if you didn't remember the identity? How would you
have proceeded?
One possibility is to see if you can't be clever, and somehow reduce
the n = 3 case to the n = 2 case. If we could massage x^3 − y^3 somehow
so as to incorporate x^2 − y^2 in it, we would be able to invoke the
fact that x^2 − y^2 is divisible by x − y, and with luck, it would be
obvious that the rest of the expression for x^3 − y^3 is also divisible by
x − y.
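The claim of this example is also easy to test numerically before the general proof is found. The following loop is evidence only, not a proof (note that Python's `%` test for divisibility works for negative divisors as well):

```python
# Numerical check of the claim: x^n - y^n is divisible by x - y
# whenever x and y are distinct integers and n is a positive integer.
for n in range(1, 8):
    for x in range(-10, 11):
        for y in range(-10, 11):
            if x != y:
                assert (x**n - y**n) % (x - y) == 0
print("x^n - y^n divisible by x - y for all tested x, y, n")
```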
Remark 0.5.1
Remember, a statement is simply a (grammatically correct) sentence. The statement need not actually be true: for instance,
"All humans live forever" is a perfectly valid statement, even
though the elixir of life has yet to be found. When we use constructs like P(n), we mean that we have an infinite family of
statements, labeled by the positive integers. Thus, P(1) is the
statement that x^1 − y^1 is divisible by x − y, P(2) is the statement
that x^2 − y^2 is divisible by x − y, etc., etc. The Principle of Induction states that if P(n), n = 1, 2, . . . is a family of statements
such that P(1) is true, and whenever P(k) is true for some k ≥ 1,
then P(k + 1) is also true, then P(n) is true for all n ≥ 1. (You
are asked to prove this statement in Exercise 1.37 of Chapter 1
ahead.)
Remark 0.5.2
A variant of the principle of induction, sometimes referred to as
the Principle of Strong Induction, but logically equivalent to the
principle of induction, states that given the statements P(n) as
above, if P(1) is true, and whenever P(j) is true for all j from 1
to k, then P(k + 1) is also true, then P(n) is true for all n ≥ 1.
Another variant of the principle of induction is that if P(s)
is true for some integer s (possibly greater than 1), and if whenever P(k)
is true for some k ≥ s, then P(k + 1) is also true, then P(n) is
true for all n ≥ s (note!).
Example 0.6
Prove that given any 6 integers, there must be at least one pair among
them whose difference is divisible by 5.
Let us first work on an easier problem:
Exercise 0.6.1
Prove that in a group of 13 people, there must be at least two
people whose month of birth is the same.
The proof is very simple, but there is a lovely principle behind
it which has powerful applications! The idea is the following:
there are 12 possible months of birth, January through December.
Think of each month as a room, and place each person in the
room corresponding to their month of birth. Then it is clear
that because there are 13 people but only 12 rooms, there must
be at least one room in which more than just one person has been
placed. That proves it.
Exercise 0.6.2
Show that if there are 64 people in a room, there must be at least
six people whose months of birth are the same.
Now let us prove the statement that started this example: given six
integers, we wish to show that for at least one pair, the difference is
divisible by 5. If a1, . . . , a6 are the six integers, let r1, . . . , r6 denote the
remainders when a1, . . . , a6, respectively, are divided by 5. Note that
each ri is either 0, 1, 2, 3, or 4. Apply the pigeonhole principle: there
are 5 possible values of remainders, namely, 0 through 4 (the pigeon
holes), and there are six actual remainders r1 through r6 (the letters).
Placing the six letters into their corresponding pigeon holes, we find
that at least two of the ri must be equal. Suppose for instance that r2
and r5 are equal. Then a2 and a5 leave the same remainder on dividing
by 5, so when a2 − a5 is divided by 5, these two remainders cancel, so
a2 − a5 will be divisible by 5. (Described more precisely, a2 must be
of the form 5k + r2 for some integer k since it leaves a remainder of
r2 when divided by 5, and a5 must similarly be of the form 5l + r5 for
some integer l. Hence, a2 − a5 = 5(k − l) + (r2 − r5). Since r2 = r5, we
find a2 − a5 = 5(k − l); it is thus a multiple of 5!) Obviously, the same
idea applies to any two ri and rj that are equal: the difference of the
corresponding ai and aj will be divisible by 5.
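The argument above translates directly into a small program: bin each integer by its remainder mod 5 and stop at the first collision. The function name below is ours, but the logic is exactly the pigeonhole argument just given.

```python
# Illustration of the example above: among any six integers, some pair has
# a difference divisible by 5. Bin the integers by remainder mod 5 (the
# pigeon holes); six integers into five holes forces a collision.
import random

def pair_with_difference_divisible_by_5(nums):
    """Return a pair from nums whose difference is divisible by 5."""
    holes = {}
    for a in nums:
        r = a % 5
        if r in holes:            # collision: this remainder was seen before
            return holes[r], a
        holes[r] = a
    return None                   # cannot happen for six integers

for _ in range(1000):
    six = [random.randint(-10**6, 10**6) for _ in range(6)]
    pair = pair_with_difference_divisible_by_5(six)
    assert pair is not None and (pair[0] - pair[1]) % 5 == 0
print("a valid pair was found in every random trial")
```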
Exercise 0.6.3
Show that from any set of 100 integers one can pick 15 integers
such that the difference of any two of these is divisible by 7.
Here are further exercises for you to work on. As always, keep the precepts
in the essay To the Student: How to Read a Mathematics Book (Page 3)
uppermost in your mind. Go beyond what is stated, search for patterns,
look for connections, probe for possible generalizations. . . .
Further Exercises
Exercise 0.7
If n is any odd positive integer and x and y are any two integers, show
that x^n + y^n is divisible by x + y.
Exercise 0.8
1 = 1 = (1 · 2)/2
1 + 2 = 3 = (2 · 3)/2
1 + 2 + 3 = 6 = (3 · 4)/2
1 + 2 + 3 + 4 = 10 = (4 · 5)/2
Conjecture the general formula suggested by these equations and prove
your conjecture!
Exercise 0.9
1^2 = 1 = (1 · 2 · 3)/6
1^2 + 2^2 = 5 = (2 · 3 · 5)/6
1^2 + 2^2 + 3^2 = 14 = (3 · 4 · 7)/6
1^2 + 2^2 + 3^2 + 4^2 = 30 = (4 · 5 · 9)/6
Conjecture the general formula suggested by these equations and prove
your conjecture!
Exercise 0.10
1 = 1
2 + 3 + 4 = 1 + 8
5 + 6 + 7 + 8 + 9 = 8 + 27
10 + 11 + 12 + 13 + 14 + 15 + 16 = 27 + 64
Conjecture the general formula suggested by these equations and prove
your conjecture! After you have tried this problem yourself, follow the
link on the side!
(For discussion of Exercise 0.10, see http://tinyurl.com/GIAA-Proofs-1.)
Exercise 0.11
Prove that 1 + 3 + 5 + · · · + (2n − 1) = n^2, for n = 1, 2, 3, . . . .
Exercise 0.12
Prove that 2^n < n!, for n = 4, 5, 6, . . . .
Exercise 0.13
For n = 1, 2, 3, . . . , prove that

1/(1 · 2) + 1/(2 · 3) + · · · + 1/(n · (n + 1)) = n/(n + 1).
Exercise 0.14
The following exercise deals with the famous Fibonacci sequence. Let
a1 = 1, a2 = 1, and for n ≥ 3, let an be given by the formula an = a_{n-1} + a_{n-2}.
(Thus, a3 = 1 + 1 = 2, a4 = 2 + 1 = 3, etc.) Show the following:
1. a1 + a2 + · · · + an = a_{n+2} − 1.
2. a1 + a3 + a5 + · · · + a_{2n-1} = a_{2n}.
3. a2 + a4 + a6 + · · · + a_{2n} = a_{2n+1} − 1.
Exercise 0.15
For
dis-
cussion
of
Exercise 0.15,
see
http:
an =
2
2
5
//tinyurl.
(The amazing thing is that that ugly mess of radical signs on the right
com/
turns out to be an integer!) After you have tried this problem yourself,
GIAA-Proofs-2.
Exercise 0.16
If a ≥ 2 and n ≥ 2 are integers such that a^n − 1 is prime, show that
a = 2 and n must be a prime. (Primes of the form 2^p − 1, where p is
a prime, are known as Mersenne primes.)
Remark 0.16.1
Note that it is not true that for every prime p, 2^p − 1 must
be prime. For more on Mersenne primes, including the Great
Internet Mersenne Prime Search, see http://primes.utm.edu/mersenne/
Exercise 0.17
(For more on Fermat primes, see http://tinyurl.com/GIAA-Fermat.)
Exercise 0.18
Suppose copies of a single regular polygon are placed around a
common vertex such that there is no overlap, and together they fully
surround the vertex. Show that the only possibilities are six triangles,
four squares, or three hexagons.

Exercise 0.19
Show that if six integers are picked at random from the set {1, . . . , 10},
then at least two of them must add up to exactly 11. After you have
tried this problem yourself, follow the link on the side!
(For discussion of Exercise 0.19, see http://tinyurl.com/GIAA-Proofs-3.)
Exercise 0.20
2l, then at least two of them must be within a distance l of each other.
After you have tried this problem yourself, follow the link on the side!
(For discussion of Exercise 0.20, see http://tinyurl.com/GIAA-Proofs-4.)
Exercise 0.21
Based on your answers to Exercise 0.8 and Exercise 0.9, guess at a
formula for 1^3 + 2^3 + · · · + n^3 in terms of n, and prove that your
formula is correct. After you have tried this problem yourself, follow
the link on the side!
(For discussion of Exercise 0.21, see http://tinyurl.com/GIAA-Proofs-5.)
Chapter 1
Divisibility in the Integers
We will begin our study with a very concrete set of objects, the integers,
that is, the set {0, 1, −1, 2, −2, . . . }. This set is traditionally denoted Z and is
very familiar to us; in fact, we were introduced to this set so early in our lives
that we think of ourselves as having grown up with the integers. Moreover, we
view ourselves as having completely absorbed the process of integer division;
we unhesitatingly say that 3 divides 99 and equally unhesitatingly say that
5 does not divide 101.
As it turns out, this very familiar set of objects has an immense amount of
structure to it. It turns out, for instance, that there are certain distinguished
integers (the primes) that serve as building blocks for all other integers. These
primes are rather beguiling objects; their existence has been known for over
two thousand years, yet there are still several unanswered questions about
them. They serve as building blocks in the following sense: every positive
integer greater than 1 can be expressed uniquely as a product of primes.
(Negative integers less than −1 also factor into a product of primes, except
that they have a minus sign in front of the product.)
The fact that nearly every integer breaks up uniquely into building blocks
is an amazing one; this is a property that holds in very few number systems,
and our goal in this chapter is to establish this fact. (In the exercises to
Chapter 2 we will see an example of a number system whose elements do not
factor uniquely into building blocks. Chapter 2 will also contain a discussion
of what a number system issee Remark 2.8.)
We will begin by examining the notion of divisibility and defining divisors
and multiples. We will study the division algorithm and how it follows from
the Well-Ordering Principle. We will explore greatest common divisors and
the notion of relative primeness. We will then introduce primes and prove
our factorization theorem. Finally, we will look at what is widely considered
as the ultimate illustration of the elegance of pure mathematics: Euclid's
proof that there are infinitely many primes.
Let us start with something that seems very innocuous, but is actually
rather profound. Write N for the set of nonnegative integers, that is, N =
{0, 1, 2, . . . }. This set has the property that every nonempty subset of it has a
least element. This fact, namely that every nonempty subset of N has a least
element, turns out to be a crucial reason why the integers possess all the
properties that they do.

(Some authors define N as the set {1, 2, 3, . . . }, i.e., without the 0 that we
have included. It is harmless to use that definition, as long as one is
consistent. We will stick to our definition in this text.)

Compare the integers with another very familiar number system, the
rationals, that is, the set {a/b | a and b are integers, with b ≠ 0}. (This set
is traditionally denoted Q.)
Question 1.1
Can you think of a nonempty subset of the positive rationals that fails
to have a least element?
We will take this property of the integers as a fundamental axiom, that
is, we will merely accept it as given and not try to prove it from more
fundamental principles. Also, we will give it a name:
Well-Ordering Principle: Every nonempty subset of the nonnegative
integers has a least element.
Now let us look at divisibility. Why do we say that 2 divides 6? It is
because there is another integer, namely 3, such that the product 2 times 3
exactly gives us 6. On the other hand, why do we say that 2 does not divide
7? This is because no matter how hard we search, we will not be able to find
an integer b such that 2 times b equals 7. This idea will be the basis of our
definition:
Definition 1.2
A (nonzero) integer d is said to divide an integer a (denoted d|a) if
there exists an integer b such that a = db. If d divides a, then d is
referred to as a divisor of a or a factor of a, and a is referred to as a
multiple of d.
Observe that this is a slightly more general definition than most of us are
used to: according to this definition, −2 divides 6 as well, since there exists
an integer, namely −3, such that −2 times −3 equals 6. Similarly, 2 divides
−6, since 2 times −3 equals −6. More generally, if d divides a, then all of the
following are also true: d | −a, −d | a, −d | −a. (Take a minute to prove this
formally!) It is quite reasonable to include negative integers in our concept
of divisibility, but for convenience, we will often focus on the case where the
divisor is positive.
The following easy result will be very useful:
Lemma 1.3. If d is a nonzero integer such that d|a and d|b for two integers
a and b, then for any integers x and y, d|(xa + yb). (In particular, d|(a + b)
and d|(a − b).)

(Lemma 1.3 is used extensively in problems in integer divisibility!)
Question 1.4
If a nonzero integer d divides both a and a + b, must d divide b as well?

The following lemma holds the key to the division process. Its statement
is often referred to as the division algorithm. The Well-Ordering Principle
(Page 29) plays a central role in its proof.

(The division algorithm (Lemma 1.5) seems so trivial, yet it is a central
theoretical result. In fact, the existence of unique prime factorization in
the integers (Theorem 1.20) can be traced back to the division algorithm.)
Lemma 1.5. (Division Algorithm) Given integers a and b with b > 0, there
exist unique integers q and r, with 0 ≤ r < b, such that a = bq + r.
Remark 1.6
First, observe the range that r lies in. It is constrained to lie between
0 and b − 1 (with both 0 and b − 1 included as possible values for r).
Next, observe that the lemma does not just state that integers q and
r exist with 0 ≤ r < b and a = bq + r; it goes further: it states that
these integers q and r are unique. This means that if somehow one
were to have a = bq1 + r1 and a = bq2 + r2 for integers q1, r1, q2, and
r2 with 0 ≤ r1 < b and 0 ≤ r2 < b, then q1 must equal q2 and r1 must
equal r2. The integer q is referred to as the quotient and the integer r
is referred to as the remainder.
Proof of Lemma 1.5. Let S be the set {a − bn | n ∈ Z}. Thus, S contains
the following integers: a (= a − b · 0), a − b, a + b, a − 2b, a + 2b, a − 3b,
a + 3b, etc. Let S′ be the set of all those elements in S that are nonnegative,
that is, S′ = {a − bn | n ∈ Z, and a − bn ≥ 0}. It is not immediate that
S′ is nonempty, but if we think a bit harder about this, it will be clear that
S′ indeed has elements in it. For if a is nonnegative, then a ∈ S′. If a is
negative, then a − ba is nonnegative (check! remember that b itself is positive,
by hypothesis), so a − ba ∈ S′. By the Well-Ordering Principle (Page 29),
since S′ is a nonempty subset of N, S′ has a least element; call it r. (The
notation r is meant to be suggestive; this element will be the r guaranteed
by the lemma.)
Since r is in S′ (and therefore in S as well), r must be expressible as a − bq
for some integer q, since every element of S is expressible as a − bn for some
integer n. (The notation q is also meant to be suggestive; this integer will
be the q guaranteed by the lemma.) Since r = a − bq, we find a = bq + r.
What we need to do now is to show that 0 ≤ r < b, and that q and r are
unique.
Observe that since r is in S′ and since all elements of S′ are nonnegative,
r must be nonnegative, that is, 0 ≤ r. Now suppose r ≥ b. We will arrive
at a contradiction: Write r = b + x, where x ≥ 0 (why is x ≥ 0?). Writing
b + x for r in a = bq + r, we find a = bq + b + x, or a = b(q + 1) + x, or
x = a − b(q + 1). This form of x shows that x belongs to the set S (why?).
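The existence part of the proof can be mirrored in code: step a up or down by multiples of b until landing on the least nonnegative value of the form a − bq, which is exactly the remainder r. This is a sketch of the argument, not an efficient algorithm; Python's built-in divmod does the same job directly, and the sketch is checked against it.

```python
# Sketch of the division algorithm's existence argument (Lemma 1.5):
# among the nonnegative numbers of the form a - b*n, take the least one.
# That least element is the remainder r; the corresponding n is q.

def division_algorithm(a, b):
    """Return (q, r) with a = b*q + r and 0 <= r < b, for b > 0."""
    assert b > 0
    q, r = 0, a
    while r < 0:        # climb up by b until nonnegative
        r += b
        q -= 1
    while r >= b:       # step down by b to the least nonnegative value
        r -= b
        q += 1
    return q, r

for a in range(-50, 51):
    for b in range(1, 12):
        q, r = division_algorithm(a, b)
        assert a == b * q + r and 0 <= r < b
        assert (q, r) == divmod(a, b)   # agrees with Python's divmod
print("q and r found for all test cases, matching divmod")
```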
follows that every common divisor of a and b must be less than or equal to
the lesser of |a| and |b|, and must be greater than or equal to the negative of
the greater of |a| and |b|. Thus, there are only finitely many common divisors
of a and b, and they all lie in the range −max(|a|, |b|) to min(|a|, |b|).
We will now focus on a very special common divisor of a and b.
Definition 1.7
Given two (nonzero) integers a and b, the greatest common divisor of
a and b (written as gcd(a, b)) is the largest of the common divisors of
a and b.
Note that since there are only finitely many common divisors of a and b,
it makes sense to talk about the largest of the common divisors.
Question 1.8
By contrast, must an infinite set of integers necessarily have a largest
element? Must an infinite set of integers necessarily fail to have a
largest element? What would your answers to these two questions be
if we restricted our attention to an infinite set of positive integers?
How about if we restricted our attention to an infinite set of negative
integers?
Notice that since 1 is already a common divisor, the greatest common
divisor of a and b must be at least as large as 1. We can conclude from this
that the greatest common divisor of two nonzero integers a and b must be
positive.
Question 1.9
If p and q are two positive integers and if q divides p, what must
gcd(p, q) be?
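Definition 1.7 can be explored directly on a computer: since every common divisor of a and b lies between −min(|a|, |b|) and min(|a|, |b|), a finite search finds them all. The Python sketch below (an illustration of mine, not part of the text) computes gcd(a, b) exactly as the definition says, as the largest of the common divisors:

```python
def common_divisors(a, b):
    """All common divisors of two nonzero integers a and b.

    Any common divisor d satisfies |d| <= min(|a|, |b|),
    so a finite search suffices.
    """
    bound = min(abs(a), abs(b))
    return [d for d in range(-bound, bound + 1)
            if d != 0 and a % d == 0 and b % d == 0]

def gcd_by_definition(a, b):
    """gcd(a, b) as in Definition 1.7: the largest common divisor."""
    return max(common_divisors(a, b))

print(common_divisors(12, 18))   # [-6, -3, -2, -1, 1, 2, 3, 6]
print(gcd_by_definition(12, 18)) # 6
```

Note how the output illustrates the remark above: 1 is always on the list, so the gcd is at least 1, hence positive.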
See the notes on Page 53 for a discussion on the restriction that both a
and b be nonzero.
Theorem 1.10. Given two nonzero integers a and b, let P be the set {xa +
yb | x, y ∈ Z, xa + yb > 0}. Let d be the least element in P. Then d = gcd(a, b).
Moreover, d divides every element of P.

[Margin note: This alternative formulation of gcd in Theorem 1.10 is very
useful.]

Proof. First observe that P is not empty. For if a > 0, then a ∈ P (take
x = 1, y = 0), and if a < 0, then −a ∈ P (take x = −1, y = 0).
Let us show that d is a common divisor of a and b. Write a = dq + r for
integers q and r with 0 ≤ r < d (division algorithm). We need to show that
r = 0. Suppose to the contrary that r > 0. Write r = a − dq. Substituting
xa + yb for d, we find that r = (1 − xq)a + (−yq)b. Thus, r is a positive
linear combination of a and b that is less than d, a contradiction, since d is
the smallest positive linear combination of a and b. Hence r must be zero,
that is, d must divide a. Similarly, one can prove that d divides b as well, so
that d is indeed a common divisor of a and b.
Now let us show that d is the largest of the common divisors of a and
b. This is the same as showing that if c is any common divisor of a and b,
then c must be no larger than d. So let c be any common divisor of a and b.
Then, by Lemma 1.3 and the fact that d = xa + yb, we find that c|d. Thus,
c ≤ |d| (why?). But since d is positive, |d| is the same as d. Thus, c ≤ d, as
desired.
To prove the last statement of the theorem, note that we have already
proved that d|a and d|b. By Lemma 1.3, d must divide all linear combinations
of a and b, and must hence divide every element of P .
We have thus proved our theorem.
d is really to say that c divides the greatest common divisor of a and b, thus
proving the proposition.
Exercise 1.39 will yield yet another description of the greatest common
divisor.
Question 1.12
Given two nonzero integers a and b for which one can find integers x
and y such that xa + yb = 2, can you conclude from Theorem 1.10
that gcd(a, b) = 2? If not, why not? What, then, are the possible
values of gcd(a, b)? Now suppose there exist integers x′ and y′ such
that x′a + y′b = 1. Can you conclude that gcd(a, b) = 1? (See the
notes on Page 53 after you have thought about these questions for at
least a little bit yourselves!)
Given two nonzero integers a and b, we noted that 1 is a common divisor
of a and b. In general, a and b could have other common divisors greater than
1, but in certain cases, it may turn out that the greatest common divisor of
a and b is precisely 1. We give a special name to this:
Definition 1.13
Two nonzero integers a and b are said to be relatively prime if
gcd(a, b) = 1.
We immediately have the following:
Corollary 1.14. Given two nonzero integers a and b, gcd(a, b) = 1 if and only
if there exist integers x and y such that xa + yb = 1.
Proof. You should be able to prove this yourselves! (See Question 1.12
above.)
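Theorem 1.10 can also be tested numerically: search a small window of pairs (x, y) for the least positive value of xa + yb and compare it with the gcd. The Python sketch below is an illustration of mine; the window size 50 is an arbitrary choice, adequate only for small inputs:

```python
from math import gcd

def least_positive_combination(a, b, search=50):
    """Least element of P = {xa + yb : x, y in Z, xa + yb > 0},
    found by brute force over x, y in [-search, search]."""
    values = [x * a + y * b
              for x in range(-search, search + 1)
              for y in range(-search, search + 1)
              if x * a + y * b > 0]
    return min(values)

# Theorem 1.10 predicts these pairs agree:
print(least_positive_combination(12, 18), gcd(12, 18))  # 6 6
print(least_positive_combination(35, 21), gcd(35, 21))  # 7 7
```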
The following lemma will be useful:
Lemma 1.15. Let a and b be positive integers, and let c be a third integer. If
a|bc and gcd(a, b) = 1, then a|c.
Proof. Since gcd(a, b) = 1, Theorem 1.10 shows that there exist integers x
and y such that 1 = xa + yb. Multiplying by c, we find that c = xac + ybc.
Since a|xac and a|ybc (because a|bc), a must divide c by Lemma 1.3.
[Margin note: For fascinating recent progress on the twin primes question,
see the article on Zhang and twin primes: http://tinyurl.com/Zheng-Article.]

questions about them that are still unanswered, or at best, only partially
answered. (The answer is unknown.) Are there infinitely many twin primes?
(The answer to this is also unknown, but see the margin!) Is there any
pattern to the occurrence of the primes among the integers? Here, some
partial answers are known. The following is just a sample: There are
arbitrarily large gaps between consecutive primes, that is, given any n, it
is possible to find two
that for any n > 1, there is always a prime between n and 2n. (It is unknown
whether there is a prime between n2 and (n + 1)2 , however!) It is known that
as n becomes very large, the number of primes less than n is approximately
n/ ln(n), in the sense that the ratio between the number of primes less than n
and n/ ln(n) approaches 1 as n becomes large. (This is the celebrated Prime
Number Theorem.) Also, it is known that given any arithmetic sequence a,
a+d, a+2d, a+3d, . . . , where a and d are nonzero integers with gcd(a, d) = 1,
infinitely many of the integers that appear in this sequence are primes!
Those of you who find this fascinating should delve deeper into number
theory, which is the branch of mathematics that deals with such questions.
It is a wonderful subject with hordes of problems that will seriously challenge
your creative abilities! For now, we will content ourselves with proving the
unique prime factorization property and the infinitude of primes already
referred to at the beginning of this chapter.
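The Prime Number Theorem mentioned above can be watched in action with a short computation. The Python sketch below (a simple sieve, illustrative only) prints the ratio between the number of primes less than n and n / ln(n) for a few values of n; the ratios drift slowly toward 1:

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [p for p, is_p in enumerate(sieve) if is_p]

for n in (10**3, 10**4, 10**5):
    pi_n = len(primes_up_to(n))
    print(n, pi_n, round(pi_n / (n / log(n)), 3))
```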
The following lemmas will be needed:
Lemma 1.17. Let p be a prime and a an arbitrary integer. Then either p|a
or else gcd(p, a) = 1.
Proof. If p already divides a, we have nothing to prove, so let us assume
that p does not divide a. We need to prove that gcd(p, a) = 1. Write x
for gcd(p, a). By definition x divides p. Since the only positive divisors of p
are 1 and p, either x = 1 (which is what we want to show), or else x = p.
Suppose x = p. Then, as x divides a as well, we find p divides a. But we
have assumed that p does not divide a. Hence x = 1.
□
Lemma 1.18. Let p be a prime. If p|ab for two integers a and b, then either
p|a or else p|b.
Proof. If p already divides a, we have nothing to prove, so let us assume that
p does not divide a. Then by Lemma 1.17, gcd(p, a) = 1. It now follows from
Lemma 1.15 that p|b.
Proof of Theorem 1.20. We will prove the existence part first. The proof is
very simple. Assume to the contrary that there exists an integer greater than
1 that does not admit prime factorization. Then, the set of positive integers
greater than 1 that do not admit prime factorization is nonempty, and hence,
by the Well-Ordering Principle (Page 29), there must be a least positive
integer greater than 1, call it a, that does not admit prime factorization. Now
a cannot itself be prime, or else, a = a would be its prime factorization,
contradicting our assumption about a. Hence, a = bc for suitable positive
integers b and c, with 1 < b < a and 1 < c < a. But then, b and c must
both admit factorization into primes, since they are greater than 1 and less
than a, and a was the least positive integer greater than 1 without a prime
factorization. If b = p1 p2 ··· pk and c = q1 q2 ··· ql are prime factorizations
of b and c respectively, then a (= bc) = p1 p2 ··· pk · q1 q2 ··· ql yields a prime
factorization of a, contradicting our assumption about a. Hence, no such
integer a can exist, that is, every positive integer must factor into a product
of primes.
Let us move on to the uniqueness part of the theorem. The basic ideas
behind the proof of this portion of the theorem are quite simple as well. The
key is to recognize that if an integer a has two prime factorizations, then
some prime in the first factorization must equal some prime in the second
factorization. This will then allow us to cancel the two primes, one from each
factorization, and arrive at two factorizations of a smaller integer. The rest
is just induction.
So assume to the contrary that there exists a positive integer greater than
1 with two different (i.e., other than for rearrangement) prime factorizations.
Then, exactly as in the proof of the existence part above, the Well-Ordering
Principle applied to the (nonempty) set of positive integers greater than
1 that admit two different prime factorizations shows that there must be a
least positive integer greater than 1, call it a, that admits two different prime
factorizations. Suppose that
a = p1^{n1} ··· ps^{ns} = q1^{m1} ··· qt^{mt},
where the pi (i = 1, . . . , s) are distinct primes, and the qj (j = 1, . . . , t) are
distinct primes, and the ni and the mj are positive integers. (By distinct
primes we mean that p1 , p2 , . . . , ps are all different from one another, and
similarly, q1 , q2 , . . . , qt are all different from one another.) Since p1 divides a,
and since a = q1^{m1} ··· qt^{mt}, p1 must divide q1^{m1} ··· qt^{mt}. Now, by Exercise 1.19
above (which simply generalizes Lemma 1.18), we find that since p1 divides
the product q1^{m1} ··· qt^{mt}, it must divide one of the factors of this product,
that is, it must divide one of the qj . Relabeling the primes qj if necessary
(remember, we do not consider a rearrangement of primes to be a different
factorization), we may assume that p1 divides q1 . Since the only positive
divisors of q1 are 1 and q1 , we find p1 = q1 .
Since now p1 = q1, consider the integer a′ = a/p1 = a/q1. If a′ = 1, this
means that a = p1 = q1, and there is nothing to prove; the factorization of a
is already unique. So assume that a′ > 1. Then a′ is a positive integer greater
than 1 and less than a, so by our assumption about a, any prime factorization
of a′ must be unique (that is, except for rearrangement of factors). But then,
since a′ is obtained by dividing a by p1 (= q1), we find that a′ has the prime
factorizations
a′ = p1^{n1−1} p2^{n2} ··· ps^{ns} = q1^{m1−1} q2^{m2} ··· qt^{mt}.
So, by the uniqueness of prime factorization of a′, we find that n1 − 1 = m1 − 1
(so n1 = m1), s = t, and after relabeling the primes if necessary, pi = qi, and
similarly, ni = mi, for i = 2, . . . , s (= t). This establishes that the two prime
factorizations of a we began with are indeed the same, except perhaps for
rearrangement.
□
Remark 1.22
While Theorem 1.20 only talks about integers greater than 1, a similar
result holds for integers less than −1 as well: every integer less than
−1 can be factored as −1 times a product of primes, and these primes
are unique, except perhaps for order. This is clear, since, if a is a
negative integer less than −1, then a = −1 · |a|, and of course, |a| > 1
and therefore admits unique prime factorization.
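The existence half of Theorem 1.20 is also an algorithm: repeatedly divide out the smallest prime factor. The following Python sketch (my own illustrative version, handling negative integers as in Remark 1.22 by recording a sign) produces the factorization:

```python
def prime_factorization(n):
    """Factor an integer with |n| > 1 as a sign times primes
    in nondecreasing order (Theorem 1.20 and Remark 1.22)."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:      # divide out p as often as possible
            factors.append(p)
            n //= p
        p += 1
    if n > 1:                  # whatever remains is prime
        factors.append(n)
    return sign, factors

print(prime_factorization(360))  # (1, [2, 2, 2, 3, 3, 5])
print(prime_factorization(-15))  # (-1, [3, 5])
```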
The following result follows easily from studying prime factorizations and

[Margin note: Proposition 1.23 will prove very useful in the exercises, when
determining the number of divisors of an integer (Exercise 1.38), or when
determining the gcd of two integers (Exercise 1.39).]

if and only if the prime factors of b are a subset of the prime factors of a

suppose c > 1. Then c also has a factorization into primes, and multiplying

a product of primes. On the other hand, bc is just a, and a has its own
power y in the factorization of b, and to the power z in the factorization of c.
Multiplying together the factorizations of b and c, we find that p occurs to
the power y + z in the factorization of bc. Since the factorization of bc is just
the factorization of a and since p occurs to the power x in the factorization
of a, we find that x = y + z. In particular, y ≤ x. This proves one half of
the proposition.
As for the converse, assume that b has the prime factorization b =
p1^{n1} ··· ps^{ns}.
Thus, the prime factorization of a must look like
a = p1^{m1} ··· ps^{ms} p_{s+1}^{m_{s+1}} ··· pt^{mt}.
Writing c for p1^{m1−n1} ··· ps^{ms−ns} p_{s+1}^{m_{s+1}} ··· pt^{mt},
and noting that mi − ni ≥ 0 for
1.1    Further Exercises
Exercise 1.26
In this exercise, we will formally prove the validity of various quick
tests for divisibility that we learn in high school!
1. Prove that an integer is divisible by 2 if and only if the digit
in the units place is divisible by 2. (Hint: Look at a couple of
examples: 58 = 5 · 10 + 8, while 57 = 5 · 10 + 7. What does
Lemma 1.3 suggest in the context of these examples?)
2. Prove that an integer (with two or more digits) is divisible by 4
if and only if the integer represented by the tens digit and the
units digit is divisible by 4. (To give you an example, the integer
represented by the tens digit and the units digit of 1024 is 24,
and the assertion is that 1024 is divisible by 4 if and only if 24
is divisible by 4, which it is!)
3. Prove that an integer (with three or more digits) is divisible by
8 if and only if the integer represented by the hundreds digit and
the tens digit and the units digit is divisible by 8.
4. Prove that an integer is divisible by 3 if and only if the sum of
its digits is divisible by 3. (For instance, the sum of the digits of
1024 is 1+0+2+4 = 7, and the assertion is that 1024 is divisible
by 3 if and only if 7 is divisible by 3, and therefore, since 7 is
not divisible by 3, we can conclude that 1024 is not divisible by
3.)
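The rule in part 4 is easy to check by machine before you set out to prove it; a small Python sketch (illustrative, not part of the exercise):

```python
def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(abs(n)))

# The claim of part 4: n is divisible by 3 iff its digit sum is.
for n in range(1, 10000):
    assert (n % 3 == 0) == (digit_sum(n) % 3 == 0)

print(digit_sum(1024), 1024 % 3)  # digit sum 7; 1024 is not divisible by 3
```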
Exercise 1.27
(This exercise forms the basis for the Euclidean algorithm for finding
the gcd of two integers.)

[Margin note: For details of Exercise 1.27, see the following (but first try
the problem yourself!): http://tinyurl.com/GIAA-Integers-1]
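The Euclidean algorithm itself is short enough to state as code. The Python sketch below is an illustration of mine; it assumes the standard identity that if a = bq + r, then gcd(a, b) = gcd(b, r), which is the idea behind Exercise 1.27:

```python
def euclid_gcd(a, b):
    """The Euclidean algorithm: repeatedly replace (a, b) by (b, a mod b)."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b
    return a

print(euclid_gcd(252, 198))  # 18
print(euclid_gcd(7, 13))     # 1
```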
Exercise 1.28
Given nonzero integers a and b, let h = a/gcd(a, b) and k = b/gcd(a, b).
Show that gcd(h, k) = 1.
Exercise 1.29
Show that if a and b are nonzero integers with gcd(a, b) = 1, and if c
is an arbitrary integer, then a|c and b|c together imply ab|c. Give a
counterexample to show that this result is false if gcd(a, b) ≠ 1. (Hint:
Just as in the proof of Lemma 1.15, use the fact that gcd(a, b) = 1 to
write 1 = xa + yb for suitable integers x and y, and then multiply both
sides by c. Now stare hard at your equation!)
Exercise 1.30
The Fibonacci Sequence, 1, 1, 2, 3, 5, 8, 13, . . . , is defined as follows: If
a_i stands for the ith term of this sequence, then a_1 = 1, a_2 = 1, and
for n ≥ 3, a_n is given by the formula a_n = a_{n−1} + a_{n−2}. Prove that for
all n ≥ 2, gcd(a_n, a_{n−1}) = 1.
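The claim of this exercise can be spot-checked by machine (a check, of course, is not a proof). A Python sketch:

```python
from math import gcd

# Spot-check: consecutive Fibonacci numbers are relatively prime.
a, b = 1, 1  # a_1, a_2
for n in range(2, 50):
    assert gcd(a, b) == 1  # gcd(a_{n-1}, a_n)
    a, b = b, a + b
print("consecutive Fibonacci numbers are relatively prime up to n = 50")
```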
Exercise 1.32
Use Exercise 1.31 to prove that given any positive integer n, one can
always find consecutive primes p and q such that q − p ≥ n.
Exercise 1.33
If m and n are odd integers, show that 8 divides m^2 − n^2.
Exercise 1.34
Show that 3 divides n^3 − n for any integer n. (Hint: Factor n^3 − n as
n(n^2 − 1) = n(n − 1)(n + 1). Write n as 3q + r, where r is one of 0, 1,
or 2, and examine, for each value of r, the divisibility of each of these
factors by 3. This result is a special case of Fermat's Little Theorem,
which you will encounter as Theorem 4.50 in Chapter 4 ahead.)
Exercise 1.35
Here is another instance of Fermat's Little Theorem: show that 5
divides n^5 − n for any integer n. (Hint: As in the previous exercise,
factor n^5 − n appropriately, and write n = 5q + r for 0 ≤ r < 5.)
Exercise 1.36
Show that 7 divides n^7 − n for any integer n.
Exercise 1.37
Use the Well-Ordering Principle to prove the following statement,
known as the Principle of Induction: Let P (n), n = 1, 2, . . . be a
family of statements. Assume that P (1) is true, and whenever P (k)
is true for some k ≥ 1, then P (k + 1) is also true. Then P (n) is
true for all n = 1, 2, . . . . (Hint: Assume that P (n) is not true for all
n = 1, 2, . . . . Then the set S of positive integers n such that P (n)
is false is nonempty, and by the well-ordering principle, has a least
element m. Study P (m) as well as P (n) for n near m.)
Exercise 1.38
Use Proposition 1.23 to show that the number of positive divisors of
n = p1^{n1} ··· pk^{nk} (the pi are the distinct prime factors of n) is (n1 +
1)(n2 + 1) ··· (nk + 1).
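The counting formula of this exercise can be sanity-checked numerically. In the Python sketch below (the function names are mine), one function counts divisors by brute force and the other applies the formula (n1 + 1)(n2 + 1)···(nk + 1) to the prime factorization:

```python
def num_divisors(n):
    """Count positive divisors of n directly."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def num_divisors_by_formula(n):
    """(n1 + 1)(n2 + 1)...(nk + 1), read off the prime factorization."""
    result = 1
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            e += 1
            n //= p
        result *= e + 1
        p += 1
    if n > 1:       # a leftover prime factor contributes exponent 1
        result *= 2
    return result

print(num_divisors(360), num_divisors_by_formula(360))  # 24 24
```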
Exercise 1.39
[Margin note: For details of Exercise 1.39, see the following (but first try
the problem yourself!): http://tinyurl.com/GIAA-Integers-2]
that m = p1^{m1} p2^{m2} ··· pk^{mk} and n = p1^{n1} p2^{n2} ··· pk^{nk},
where for i = 1, . . . , k,
p1^{l1} p2^{l2} ··· pk^{lk}.
will not delve deeper into it in this book, but you are encouraged to
read about it in any introductory textbook on number theory.)
Exercise 1.42
The series 1 + 1/2 + 1/3 + · · · is known as the harmonic series. This
exercise concerns the partial sums (see below) of this series.
1. Fix an integer n ≥ 1, and let Sn denote the set {1, 2, . . . , n}. Let
2^t be the highest power of 2 that appears in Sn. Show that 2^t
does not divide any element of Sn other than itself.
Exercise 1.43
Fix an integer n ≥ 1, and let Sn denote the set {1, 3, 5, . . . , 2n − 1}.
Let 3^t be the highest power of 3 that appears in Sn. Show that 3^t does
not divide any element of Sn other than itself. Can you use this result
to show that the nth partial sums (n ≥ 2) of a series analogous to the
harmonic series (see Exercise 1.42 above) are not integers?
√2 is not a
rational number. Using essentially the same ideas, show that √p is not
a rational number for any prime p. (Hint: Suppose that √2 = a/b for
some two integers a and b with b ≠ 0. Rewrite this as a^2 = 2b^2. What
can you say about the exponent of 2 in the prime factorizations of a^2
and 2b^2 ?)
[Margin note: For details of this exercise, see the following:
http://tinyurl.com/GIAA-Integers-3]
Notes
Remarks on Definition 1.7. The alert reader may wonder why we have restricted both integers a and b to be nonzero in Definition 1.7 above. Let us
explore this question further: Suppose first that a and b are both zero. Note
that every nonzero integer divides 0, since given any nonzero integer n, we
certainly have the relation n · 0 = 0. Thus, if a and b are both zero, we find
that every nonzero integer is a common divisor of a and b, and thus, there
is no greatest common divisor at all. The concept of the greatest common
divisor therefore has no meaning in this situation. Next, let us assume just
one of a and b is nonzero. For concreteness, let us assume a ≠ 0 and b = 0.
Then, as we have seen in the discussions preceding Definition 1.7, |a| is a
divisor of a, and is the largest of the divisors of a. Also, since every nonzero
integer divides 0 and we have assumed b = 0, we find |a| divides b. It follows
that |a| is a common divisor of a and b, and since |a| is the largest among
the divisors of a, it has to be the greatest of the common divisors of a and
b. We find therefore that if exactly one of a and b, say a, is nonzero, then
the concept of gcd(a, b) has meaning, and the gcd in this case equals |a|.
However, this situation may be viewed as somewhat less interesting, since
every integer anyway divides b. The more interesting case, therefore, is when
both a and b are nonzero, and we have chosen to focus on that situation in
Definition 1.7.
a moment to think about it). The reason why many books prefer to define
the greatest common divisor as above is that this definition applies (with a
tiny modification) to other number systems where the concept of a largest
common divisor may not exist.
In the case of the integers, however, we prefer our Definition 1.7, since the
largest of the common divisors of a and b is exactly what we would intuitively
expect gcd(a, b) to be!
Chapter 2
Rings and Fields
2.1
Abstract algebra begins with the observation that several sets that occur
naturally in mathematics, such as the set of integers, the set of rationals, the
set of 2 × 2 matrices with entries in the reals, the set of functions from the
reals to the reals, all come equipped with certain operations that allow one
to combine any two elements of the set and come up with a third element.
These operations go by different names, such as addition, multiplication, or
composition (you would have seen the notion of composing two functions in
calculus). Abstract algebra studies mathematics from the point of view of
these operations, asking, for instance, what properties of a given mathematical set can be deduced just from the existence of a given operation on the
set with a given list of properties. We will be dealing with some of the more
rudimentary aspects of this approach to mathematics in this book.
However, do not let the abstract nature of the subject fool you into thinking that mathematics no longer deals with concrete objects! Abstraction
grows only from extensive studies of the concrete, it is merely a device (albeit an extremely effective one) for codifying phenomena that simultaneously
occur in several concrete mathematical sets. In particular, to understand an
abstract concept well, you must work with the specific examples from which
the abstract concept grew (remember the advice on active learning).
Let us look at Z, focusing on the operations of addition and multiplication.
Given a set S, recall that a binary operation on S is a process that takes
an ordered pair of elements from S and gives us a third member of the set.
[Margin note: For a more detailed discussion of sets and functions, see
Appendix 4.5.]
Now select some of these binary operations and check whether they

[Margin note: For a discussion of this question, see the following (but make
sure you have tried to answer it yourself!): http://tinyurl.com/GIAA-Rings-1]
Definition 2.2
A group is a set S with a binary operation ∗ : S × S → S such that
1. ∗ is associative, i.e., a ∗ (b ∗ c) = (a ∗ b) ∗ c for all a, b, and c in S,
2. S has an identity element with respect to ∗, i.e., an element id
such that a ∗ id = id ∗ a = a for all a in S, and
3. every element of S has an inverse with respect to ∗, i.e., for
every element a in S there exists an element a^{-1} such that
a ∗ a^{-1} = a^{-1} ∗ a = id.
To emphasize that there are two ingredients in this definition, the
set S and the operation ∗ with these special properties, the group is
sometimes written as (S, ∗), and S is often referred to as a group with
respect to the operation ∗.
The reason that the integers form a group with respect to addition is that
if we take the set S of this definition to be Z, and if we take the binary
operation to be +, then the three conditions of the definition are met.
There is a vast and beautiful theory about groups, the beginnings of which
we will pursue in Chapter 4 ahead.
Observe that there is one more property of addition that we have not
listed yet, namely commutativity. This is the property that for all integers
a and b, a + b = b + a. In the language of group theory, this makes (Z, +) an
abelian group:
Definition 2.3
An abelian group is one in which the operation ∗ in Definition 2.2
above satisfies the additional condition a ∗ b = b ∗ a for all a and b in
S.
Commutativity of addition is a crucial property of the integers; the only
reason we delayed introducing it was to allow us first to introduce the notion
of a group.
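For a finite set, the three conditions of Definition 2.2, together with the commutativity condition of Definition 2.3, can be verified mechanically. The Python sketch below (an illustration of mine, not the book's) checks them for addition and for multiplication modulo 4; addition passes, while multiplication fails because 0 has no inverse:

```python
def is_abelian_group(elements, op):
    """Check Definitions 2.2 and 2.3 for a finite set with operation op."""
    elements = list(elements)
    # closure
    if any(op(a, b) not in elements for a in elements for b in elements):
        return False
    # associativity
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a in elements for b in elements for c in elements):
        return False
    # identity element
    ids = [e for e in elements
           if all(op(a, e) == op(e, a) == a for a in elements)]
    if not ids:
        return False
    e = ids[0]
    # inverses
    if any(all(op(a, b) != e or op(b, a) != e for b in elements)
           for a in elements):
        return False
    # commutativity (Definition 2.3)
    return all(op(a, b) == op(b, a) for a in elements for b in elements)

print(is_abelian_group(range(4), lambda a, b: (a + b) % 4))  # True
print(is_abelian_group(range(4), lambda a, b: (a * b) % 4))  # False
```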
Now let us consider multiplication. As with addition, we write (Z, ·) to
emphasize the fact that we are considering Z as a set with the binary operation of multiplication, temporarily ignoring the operation addition. As with
addition, we find that multiplication is associative, that is, for all integers
a, b, and c, a · (b · c) = (a · b) · c.

Question 2.4
[Margin note: For a discussion of Question 2.4, see the following (but make
sure you have tried to answer it yourself!): http://tinyurl.com/GIAA-Rings-2]
(0 1; 0 0)(1 0; 0 0) = (0 0; 0 0),  while  (1 0; 0 0)(0 1; 0 0) = (0 1; 0 0) ≠ (0 0; 0 0).
Rings in which multiplication is not commutative are fairly common in mathematics, and hence requiring commutativity of multiplication in the definition
of a ring would be too restrictive. On the other hand, there is no denying
that a significant proportion of the rings that we come across indeed have
multiplication that is commutative. Thus, it is reasonable to single them out
as special cases of rings, and we have the following:
Definition 2.9
A commutative ring is a ring R in which a · b = b · a for all a and b in
R.
(Rings in which the multiplication is not commutative are referred to as
noncommutative rings.)
The following are various examples of rings. (Once again, recall the advice
in the preliminary chapter To the Student, page 3, on reading actively.)
Example 2.11
In a like manner, both the reals, R, and the complexes, usually denoted
C, are rings under the usual operations of addition and multiplication.
Again, we will not try to prove that the ring axioms hold; we will just
invoke our intimate knowledge of R and C to recognize that they are
rings.
Example 2.12
Let Q[√2] denote the set of all real numbers of the form a + b√2, where
a and b are arbitrary rational numbers. For instance, this includes
numbers like 1/2 + 3√2, 1/7 + (1/5)√2, etc. You know from your
experience with real numbers how to add and multiply two elements
Question 2.12.2
Why should associativity of addition and multiplication and distributivity of multiplication over addition all follow from the fact
that this set is contained in R?
Question 2.12.3
Are all other ring axioms satisfied? Check!
Question 2.12.4
You know that √2 is not a rational number (see Chapter 1, Exercise 1.44). Why does it follow that if a and b are rational
Question 2.13.4
[Margin note: For a detailed analysis of Question 2.13.4, see the following
(make sure you have tried to answer it yourself first!):
http://tinyurl.com/GIAA-Rings-3]

Example 2.14
Q[i], which is the set of all complex numbers of the form a + bi, where
Exercise 2.14.1
Show that if a and b are real numbers, then a + bi = 0 if and only
if both a and b are zero. (See the notes on page 154 for a clue.)
Example 2.15
Consider the set of rational numbers q that have the property that when
q is written in the reduced form a/b with a, b integers and gcd(a, b) = 1,
the denominator b is odd. This set is usually denoted by Z(2), and
contains elements like 1/3, 5/7, 6/19, etc., but does not contain 1/4
or 5/62.
Question 2.15.1
Does Z(2) contain 2/6?
[Margin note: For a discussion of Question 2.15.1, see the following:
http://tinyurl.com/GIAA-Rings-4]
Question 2.15.3
Why do associativity and distributivity follow from the fact that
Z(2) ⊆ Q?
Question 2.15.4
Do the other ring axioms hold? Check!
Question 2.15.5
Can you generalize this construction to other subsets of Q where
the denominators have analogous properties?
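Membership in Z(2) is easy to test by machine, since Python's Fraction type reduces a/b to lowest terms automatically; we need only look at the reduced denominator. A sketch (illustrative):

```python
from fractions import Fraction

def in_Z_2(q):
    """Is the rational q in Z_(2)?  Fraction reduces automatically,
    so we just test whether the reduced denominator is odd."""
    return Fraction(q).denominator % 2 == 1

print(in_Z_2(Fraction(2, 6)))   # True: 2/6 = 1/3, denominator 3 is odd
print(in_Z_2(Fraction(5, 62)))  # False: denominator 62 is even
```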
Example 2.16
The set of n × n matrices with entries in R (denoted Mn(R)), where n is a
positive integer, forms a ring with respect to the usual operations of
matrix addition and multiplication. For almost all values of n, matrix
multiplication is not commutative.
Question 2.16.1
What is the exception?
Exercise 2.16.2
[Margin note: For a detailed analysis of Exercise 2.16.2, see the following
(make sure you have tried to solve it yourself first!):
http://tinyurl.com/GIAA-Rings-5]
(A + B) + C = A + (B + C).
What is important is that you get a feel for how associativity and
distributivity in Mn(R) derive from the fact that associativity and
distributivity hold for R.

Question 2.16.3
What about the ring axioms other than associativity and distributivity: do they hold?

Question 2.16.4
What are the additive and multiplicative identities?
Question 2.16.5
Let e_{i,j} denote the matrix with 1 in the (i, j)-th slot and 0 everywhere else. Study the case of 2 × 2 matrices and guess at a
formula for the product e_{i,j} · e_{k,l}. (You need not try to prove
formally that your formula is correct, but after you have made
your guess, substitute various values for i, j, k, and l and test
your guess.)
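After you have made your guess, products of the e_{i,j} can be tested mechanically. A small Python sketch (pure lists, 1-indexed like the text; the helper names are mine):

```python
def e(i, j, n=2):
    """The n x n matrix unit with 1 in the (i, j) slot (1-indexed)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, n + 1)]
            for r in range(1, n + 1)]

def mat_mult(A, B):
    """Ordinary matrix multiplication for square matrices."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# Test products against your guessed formula for e(i,j) * e(k,l):
print(mat_mult(e(1, 2), e(2, 1)))  # [[1, 0], [0, 0]]
print(mat_mult(e(1, 2), e(1, 2)))  # [[0, 0], [0, 0]]
```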
Question 2.16.6
Would the ring axioms still be satisfied if we only considered the
set of n × n matrices whose entries came from Q? From Z?
Question 2.16.7
Now suppose R is any ring. Let us consider the set Mn(R) of n × n
matrices with entries in R with the usual definitions of matrix
addition and multiplication. Is Mn (R) with these operations a
ring? What if R is not commutative? Does this affect whether
Mn (R) is a ring or not?
Example 2.17
R[x], the set of polynomials in one variable with coefficients from R,
forms a ring with respect to the usual operations of polynomial addition and multiplication. (We have considered this before.) Here, x
Exercise 2.17.1
Now just as with Example 2.16, prove that if f , g, and h are any
three polynomials in R[x], then (f + g) + h = f + (g + h). Your
proof should invoke the fact that associativity holds in R.
Example 2.18
Instead of polynomials with coefficients from R, we can consider polynomials in the variable x with coefficients from an arbitrary ring R,
with the usual definition of addition and multiplication of polynomials.
We get a ring, denoted R[x]. Thus, if we were to consider polynomials
in the variable x whose coefficients are all integers, we get the ring Z[x].
Question 2.18.1
As always, convince yourself that for a general ring R, the set of
polynomials R[x] forms a ring. For arbitrary R, is R[x] commutative?
(See the notes on page 155 for some hints and more remarks.)
Example 2.19
Generalizing Example 2.17, the set R[x, y] of polynomials in two variables x and y forms a ring. A polynomial in x and y is of the form
Σ_{i,j} f_{i,j} x^i y^j. (For example, consider the polynomial 4 + 2x + 3y +
x^2 y + 5xy^3: here, f_{0,0} is the coefficient of x^0 y^0, i.e., the coefficient of 1,
so f_{0,0} = 4. Similarly, f_{1,3} is the coefficient of x^1 y^3, so it equals 5. On
the other hand, f_{1,1} is zero, since there is no xy term.) Two polynomials
Σ_{i,j} f_{i,j} x^i y^j and Σ_{i,j} g_{i,j} x^i y^j are equal if and only if for each pair
Example 2.20
Here is a ring with only two elements! Divide the integers into two
sets, the even integers and the odd integers. Let [0]2 denote the set
of even integers, and let [1]2 denote the set of odd integers. (Notice
that [0]2 and [1]2 are precisely the equivalence classes of Z under the
equivalence relation defined by a ∼ b iff a − b is even.) Denote by Z/2Z
the set {[0]2 , [1]2 }. Each element of {[0]2 , [1]2 } is itself a set containing
an infinite number of integers, but we will ignore this fact. Instead,
we will view all the even integers together as one number of Z/2Z,
and we will view all the odd integers together as another number of
Z/2Z. How should we add and multiply these new numbers? Recall
that if we add two even integers we get an even integer, if we add an
even and an odd integer we get an odd integer, and if we add two odd
integers we get an even integer. This suggests the following addition
rules in Z/2Z:
        +      [0]2   [1]2
        [0]2   [0]2   [1]2
        [1]2   [1]2   [0]2
        ·      [0]2   [1]2
        [0]2   [0]2   [0]2
        [1]2   [0]2   [1]2
Later in this chapter (see Example 2.80 and the discussions preceding
that example), we will interpret the ring Z/2Z differently: as a quotient
ring of Z. This interpretation, in particular, will prove that Z/2Z is
indeed a ring under the given operations. Just accept for now the fact
that we get a ring, and play with it to develop a feel for it.
Question 2.20.1
How would you get a ring with three elements in it? With four?
Example 2.21
Here is the answer to the previous two questions! We have observed
that [0]2 and [1]2 are just the equivalence classes of Z under the equivalence relation a ∼ b iff a − b is even. Analogously, let us consider the
equivalence classes of Z under the equivalence relation aRb iff a − b is
divisible by 3. Since a − b is divisible by 3 exactly when a and b each
leaves the same remainder when divided by 3, there are three equivalence classes: (i) [0]3, the set of all those integers that yield a remainder
of 0 when you divide them by 3. In other words, [0]3 consists of all multiples of 3, that is, all integers of the form 3k, k ∈ Z. (ii) [1]3 for the
set of all those integers that yield a remainder of 1, so [1]3 consists of
all integers of the form 3k + 1, k ∈ Z. (iii) [2]3 for the set of all those
integers that yield a remainder of 2, so [2]3 consists of all integers of
the form 3k + 2, k ∈ Z. Write Z/3Z for the set {[0]3, [1]3, [2]3}. Just as
in the case of Z/2Z, every element of this set is itself a set consisting of
an infinite number of integers, but we will ignore this fact. How would
you add two elements of this set? In Z/2Z, we defined addition using
observations like an odd integer plus an odd integer gives you an even
integer. The corresponding observations here are an integer of the
form 3k + 1 plus another integer of the form 3k + 1 gives you an integer
of the form 3k + 2, an integer of the form 3k + 1 plus another integer
of the form 3k + 2 gives you an integer of the form 3k, an integer of
the form 3k + 2 plus another integer of the form 3k + 2 gives you an
integer of the form 3k + 1, etc. We thus get the following addition
table:
        +      [0]3   [1]3   [2]3
        [0]3   [0]3   [1]3   [2]3
        [1]3   [1]3   [2]3   [0]3
        [2]3   [2]3   [0]3   [1]3
Exercise 2.21.1
Similarly, study how the remainders work out when we multiply
two integers. (For instance, we find that an integer of the form
3k + 2 times an integer of the form 3k + 2 gives you an integer of
the form 3k + 1, etc.) Derive the following multiplication table:
        ·      [0]3   [1]3   [2]3
        [0]3   [0]3   [0]3   [0]3
        [1]3   [0]3   [1]3   [2]3
        [2]3   [0]3   [2]3   [1]3
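Both tables, and their analogues for any modulus n, can be generated by machine. A Python sketch (entries [r]_n are represented simply by r; the names are mine, not the book's):

```python
def mod_table(n, op):
    """Operation table for Z/nZ: row a, column b holds op(a, b) mod n."""
    return [[op(a, b) % n for b in range(n)] for a in range(n)]

add3 = mod_table(3, lambda a, b: a + b)
mul3 = mod_table(3, lambda a, b: a * b)
print(add3)  # [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(mul3)  # [[0, 0, 0], [0, 1, 2], [0, 2, 1]]
```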
Example 2.22
Suppose R and S are two rings. (For example, take R = Z/2Z, and
take S = Z/3Z.) Consider the Cartesian product T = R × S, which is
the set of ordered pairs (r, s) with r ∈ R and s ∈ S. Define addition in
T by (r, s) + (r′, s′) = (r + r′, s + s′). Here, r + r′ refers to the addition
of two elements of R according to the definition of addition in R, and
similarly, s + s′ refers to the addition of two elements of S according
Question 2.22.4
Now take R = S = Z. Can you find pairs of nonzero elements
a and b in the ring T = Z × Z such that a · b = 0? (Note that
Z itself does not contain pairs of such elements.) If R and S are
arbitrary rings, can you find a pair of nonzero elements a and b
in T = R × S such that a · b = 0?
(See the notes on page 156 for hints.)
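Before reading the hints, you can experiment with componentwise arithmetic in T = Z × Z in Python. The helper names below are ours; this is only a sketch of the construction in Example 2.22.

```python
# Componentwise ring operations on ordered pairs, as in Example 2.22.
def pair_add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def pair_mul(p, q):
    return (p[0] * q[0], p[1] * q[1])

a = (1, 0)   # nonzero in Z x Z
b = (0, 1)   # nonzero in Z x Z
print(pair_mul(a, b))   # (0, 0): the product of two nonzero elements is zero
```

The same pair (1, 0) · (0, 1) = (0, 0) works in any product R × S, provided each factor has 1 ≠ 0.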
Remark 2.24
The examples above should have convinced you that our definition
of a ring (Definition 2.5 above) is rather natural, and that it very
effectively models several number systems that arise in mathematics.
Here is further evidence that our axioms are the correct ones. Notice
that in all the rings that we have come across, the following properties
hold:
1. The additive identity is unique, that is, there is precisely one
element 0 in the ring that has the property that a + 0 = a for all
elements a in the ring.
2. The multiplicative identity is unique, that is, there is precisely
one element 1 in the ring that has the property that a1 = 1a = a
for all elements a in the ring.
to verify that at least some of these properties above follow from the
ring axioms in the exercises at the end of this chapter (see also the
remarks on page 156).

(Follow this link to see the proofs of some of these assertions in
Remark 2.24 above: http://tinyurl.com/GIAA-Rings-7)
Question 2.25
Can you think of an example of a ring R and elements a, b, and c in
R such that ab = ac yet b ≠ c?
2.2 Subrings
In Examples 2.12, 2.13, 2.14, and 2.15 above, we came across the following
phenomenon: a ring R and a subset S of R that had the following two
properties: for any s1 and s2 in S, s1 + s2 was in S and s1 · s2 was in S.
In Example 2.12, the ring R was R, and the subset S was the set of all real
numbers of the form a + b√2 with a and b rational numbers. In Example 2.14,
R was C and S was the set of all complex numbers of the form a + bi with a
and b rational numbers. In Example 2.15, R was Q, and S was the set of all
reduced fractions with odd denominator. Moreover, in all three examples, we
endowed S with binary operations in the following way: given s1 and s2 in S,
we viewed them as elements of R, and formed the sum s1 + s2 (the sum being
defined according to the definition of addition in R). Next, we observed that
s1 + s2 was actually in S (this is one of the two properties alluded to above).
Similarly, we observed that s1 · s2 (the product being formed according to the
definition of multiplication in R) was also in S. These two facts hence gave
us two binary operations on S. We then found that with respect to these
binary operations, S was not just an arbitrary subset of R; it was actually a
ring in its own right.
The crucial reason (although not the only reason) why the set S in all
our examples was itself a ring was that S had the properties described at the
beginning of the previous paragraph. We give these properties a name.
Definition 2.27
Let S be a subset of a ring R that is closed with respect to addition
and multiplication. Suppose that 1 ∈ S. Suppose further that with
respect to these addition and multiplication operations on S that are
induced from those on R, S is itself a ring. We say that S is a subring
of R. We also describe R as a ring extension of S, and refer to R and
S jointly as the ring extension R/S.
Examples 2.12, 2.13, 2.14, and 2.15 above are therefore all instances of
subrings.
Question 2.28
Consider the subset S of Z consisting of the positive even integers, that
is, the set {2n | n ∈ Z and n > 0}. Check that S is closed with respect
to both addition and multiplication. Does this make S a subring of
Z? Next, consider the set T of all nonnegative integers. Check that T
is also closed with respect to addition and multiplication. Clearly, T
contains 1. Does this make T a subring of Z?
(For details of this Exercise, see the following:
http://tinyurl.com/GIAA-Rings-8)
The following are further examples of subrings. Play with these examples
to gain familiarity with them. Check that they are indeed examples of subrings
of the given rings by applying Lemma 2.30.
Example 2.31
The set of all real numbers of the form a + b√2, where a and b are
integers, is a subring of the reals. (See the notes on page 162, in
particular see Exercise 2.115 ahead. Now is a good time to try that
exercise if you haven't done so already!)

Example 2.34

Question 2.34.1
Can √2 · √3 = √6 be in this set?
Example 2.35
If S is a subring of a ring R, then S[x] is a subring of R[x].
Exercise 2.35.1
Prove this assertion!
Question 2.37.2
For what values of n will Un (R) be the same as Mn (R)?
Question 2.37.3
Suppose we considered the set of strictly upper triangular matrices, namely the set of all (ai,j) in Mn(R) with ai,j = 0 for i ≥ j.
Would we still get a subring of Mn(R)?
Example 2.38
Here is another subring of Mn (R). For each real number r, let diag(r)
denote the matrix in which each diagonal entry is just r and in which
the off-diagonal entries are all zero. The set of matrices in Mn (R) of
the form diag(r) (as r ranges through R) is then a subring.
Question 2.38.1
What observations can you make about the function from R to
Mn (R) that sends r to diag(r)?
(See Example 2.103 ahead.)
2.3 Integral Domains and Fields
In passing from the concrete example of the integers to the abstract definition of a ring, observe that we have introduced some phenomena that at
first seem pathological. The first, which we have already pointed out explicitly and is already present in M2 (R), is noncommutativity of multiplication.
The second, which is also present in M2 (R), and examples of which you have
seen as far back as in the preliminary chapter To the Student, page 3, is the
existence of zero-divisors.
Definition 2.39
A zero-divisor in a ring R is a nonzero element a for which there exists
a nonzero element b such that either a · b = 0 or b · a = 0.
Just as noncommutativity of multiplication, on closer observation, turns
out to be quite a natural phenomenon after all, the existence of zero-divisors
is really not very pathological either. It merely seems so because most of our
experience has been restricted to various rings that appear as subrings of the
complex numbers.
Besides matrix rings (try to discover lots of zero-divisors in M2(R) for
yourselves), zero-divisors occur in several rings that arise naturally in
mathematics, including many commutative ones. For instance, the direct product
of two rings always contains zero-divisors (see Example 2.22 above). Also
(see Exercise 2.21.2), Z/4Z contains zero-divisors: [2]4 · [2]4 = [0]4! In fact,
as long as n is not prime, you should be able to discover zero-divisors in any
of the rings Z/nZ (see 2.57 ahead). (It can be proved, however, that Z/nZ
has no zero-divisors when n is prime; see Example 2.58 ahead.)
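A brute-force search makes the composite-versus-prime pattern easy to see. The function name below is ours, and the search is only a finite check, not a proof.

```python
# Search Z/nZ for zero-divisors: nonzero classes [a]_n for which some
# nonzero [b]_n satisfies [a]_n * [b]_n = [0]_n.
def zero_divisors(n):
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(4))   # [2]           since [2][2] = [0]
print(zero_divisors(6))   # [2, 3, 4]
print(zero_divisors(7))   # []            7 is prime: no zero-divisors
```

Running this for several n suggests exactly the dichotomy in the text: zero-divisors appear precisely when n is composite.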
Definition 2.40
An integral domain is a commutative ring with no zero-divisors.
(Alternatively, an integral domain is a commutative ring R with the
property that whenever a · b = 0 for two elements a and b in R, then
either a must be 0 or else b must be 0.)
Z, Q, R, and C are all obvious examples of integral domains. (Again, we
are simply invoking our knowledge of these rings when we make this claim.)
Question 2.41
Is R[x] an integral domain? More generally, if R is an arbitrary ring,
can you determine necessary and sufficient conditions on R that will
guarantee that R[x] has no zero-divisors?
(See the notes on page 155 for a definition of R[x], and for some discussions that may help you answer this question.)
Now, integral domains are definitely very nice rings, but one can go out
on a limb and require that rings be even nicer! We can require that we be
able to divide any element a by any nonzero element b. This would certainly
make the ring behave much more like Q or R.
To understand division better, let us look at the process of dividing two
integers a little closer. To divide 3 by 5 is really to multiply together 3 and
1/5 (just as to subtract, say, 6 from 9 is really to add together 9 and −6).
The reason this cannot be done within the context of the integers is that
1/5 is not an integer. (After all, if 1/5 were an integer, then the product
of 3 and 1/5 would also be an integer.) Now let us look at 1/5 a different
way: 1/5 has the property that 1/5 · 5 = 5 · 1/5 = 1. In other words,
1/5 is the inverse of 5 with respect to multiplication (just as −6 is the
inverse of 6 with respect to addition). First, let us pause to give a name
to this:
Definition 2.44
If R is an arbitrary ring, a nonzero element a is said to be invertible or
to have a multiplicative inverse if there exists an element b ∈ R such
that ab = ba = 1. In such a situation, b is known as the multiplicative
inverse of a, and a is known as the multiplicative inverse of b.
Invertible elements of a ring R are also known as units of R.
Question 2.45
What are the units of Z?
Putting all this together, the reason that we cannot divide within the
context of the integers is that given an arbitrary (nonzero integer) m, it need
not be invertible. With this in mind, we have the following definition:
Definition 2.46
A field is an integral domain in which every nonzero element a is
invertible. The multiplicative inverse of a nonzero element a is usually
denoted either by 1/a or by a⁻¹.
We will often use the letter F to denote a field. The set of nonzero
elements of a field F is often denoted by F*.
Question 2.47
If F is a field, is F* a group with respect to multiplication?
(See also Exercise 2.112 at the end of the chapter.)
Remark 2.48
Notice that 0 can never have a multiplicative inverse, since a · 0 = 0
for any a. (See Remark 2.24.) We describe this by saying that division
by 0 is not defined.
Perhaps the most familiar example of a field is Q. We have already seen
that it is a ring (Example 2.10). The multiplicative inverse of the nonzero
rational number m/n is, of course, n/m. Here are more examples:
Example 2.49
The reals, R.
(Exercise 2.50: see the following, but only after you have tried the
problem yourself: http://tinyurl.com/GIAA-Rings-9)

Example 2.50
Q[√2].
Question 2.50.1
Question 2.50.2
Is Z[√2] a field?
Example 2.51
The complex numbers, C.
Question 2.51.1
What is the inverse of the nonzero number a + bi? (Give the
inverse as c + di for suitable real numbers c and d: think in terms
of "real-izing" denominators.)
(Exercise 2.52: modify the arguments for Exercise 2.50 above.)

Example 2.52
Q[i].

Question 2.52.1
Why is Q[i] a field?

Question 2.52.2
Is Z[i] a field?
Example 2.53
Here is a new example: the set of rational functions with coefficients
from the reals, R(x). (Note the parentheses around the x.) This is
the set of all quotients of polynomials with coefficients from the reals,
that is, the set of all f(x)/g(x), where f(x) and g(x) are elements of
R[x], and g(x) ≠ 0. (Of course, we take f(x)/g(x) = f′(x)/g′(x) if
f(x)g′(x) = g(x)f′(x).) Addition and multiplication in R(x) are similar
to addition and multiplication in Q:

    f1(x)/g1(x) + f2(x)/g2(x) = (f1(x)·g2(x) + f2(x)·g1(x)) / (g1(x)·g2(x)),

and

    (f1(x)/g1(x)) · (f2(x)/g2(x)) = (f1(x)·f2(x)) / (g1(x)·g2(x)).

The multiplicative inverse of a nonzero element f(x)/g(x) is just
g(x)/f(x).
Example 2.54
More generally, if F is any field, we may consider the set of rational
functions with coefficients from F, denoted F(x). This is analogous to
R(x): it is the set of all f(x)/g(x), where f(x) and g(x) are now elements
of F[x] instead of R[x], and g(x) ≠ 0. (As with R(x), we take f(x)/g(x) =
f′(x)/g′(x) if f(x)g′(x) = g(x)f′(x).) Addition and multiplication are
defined just as in R(x), and we can check that we get a field.
Example 2.56
The ring Z/3Z is also a field!
Question 2.56.1
Study the multiplication table in Z/3Z in Example 2.21. There
are no zeros in the table other than in the first row and in the
first column (which correspond to multiplication by zero). Why
does this show that there are no zero-divisors in this ring? Now
notice that every row and every column (other than the first)
has [1]3 in it. Why does this show that every nonzero element is
invertible?
Example 2.58
However, Examples 2.55 and 2.56 do generalize suitably: it turns out
that for any prime p, the ring Z/pZ is a field (with p elements). Recall
from the discussions in Examples 2.20 and 2.21 that the elements of
Z/pZ are equivalence classes of integers under the relation a ∼ b if and
only if a − b is divisible by p. The equivalence class [a]p of an integer a
is thus the set of integers of the form a, a ± p, a ± 2p, a ± 3p, . . . .
Addition and multiplication are defined on these classes by
[a]p + [b]p = [a + b]p and [a]p · [b]p = [a · b]p.

(For details of this Exercise, see the following:
http://tinyurl.com/GIAA-Rings-10)
Exercise 2.58.1
Show that addition and multiplication are well-defined, that is,
if a ∼ a′ and b ∼ b′, then a + b ∼ a′ + b′ and a · b ∼ a′ · b′.
Exercise 2.58.2
Show that the zero in this ring is [0]p , and the 1 in this ring is
[1]p . (In particular, [a]p is nonzero in Z/pZ precisely when a is
not divisible by p.)
Exercise 2.58.3
Now let [a]p be a nonzero element in Z/pZ. Show that [a]p is
invertible. (Hint: Invoking the fact that a and p are relatively
prime, we find that there must exist integers x and y such that
xa + yp = 1. So?)
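The hint in Exercise 2.58.3 is effective: the x in xa + yp = 1 can be computed by the extended Euclidean algorithm, and [x]p is then the inverse of [a]p. Here is a Python sketch of that computation; the function name is ours.

```python
# Compute the inverse of [a]_p in Z/pZ via the extended Euclidean
# algorithm: find x with x*a + y*p = 1, so that [x]_p * [a]_p = [1]_p.
def inverse_mod(a, p):
    old_r, r = a % p, p       # remainders
    old_x, x = 1, 0           # coefficients of a in old_r = x*a + y*p
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    assert old_r == 1, "a must be relatively prime to p"
    return old_x % p

print(inverse_mod(3, 7))   # 5, since 3 * 5 = 15 and 15 leaves remainder 1 mod 7
```

For prime p every class [1]p, . . . , [p − 1]p has an inverse, which is exactly what Exercise 2.58.4 asks you to conclude.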
Exercise 2.58.4
Now conclude using Exercise 2.46.1 and Exercise 2.58.3 above
that Z/pZ is a field.
We end this section with the concept of a subfield. The idea is very simple
(compare with Definition 2.27 above):
Definition 2.59
A subset F of a field K is called a subfield of K if F is a subring of
K and is itself a field. In this situation, we also describe K as a field
extension of F , and refer to F and K jointly as the field extension
K/F .
The difference between being a subring of K and a subfield of K is as
follows: Suppose R is a subring of K. Given a nonzero element a in R,
its multiplicative inverse 1/a certainly exists in K (why?). However, 1/a
may not live inside R. If 1/a happens to live inside R, we say that a has
a multiplicative inverse in R itself. Now, if every nonzero a in R has a
multiplicative inverse in R itself, then by Definition 2.46 (why is R an integral
domain?), R is a field. Therefore, by Definition 2.59 above, R is then a
subfield of K.
Thus, Q is a subfield of R, but Z is only a subring of R; it is not a
subfield.
2.4 Ideals
Consider the ring Z, and consider the subset of even integers, denoted
(suggestively) 2Z. The set 2Z is closed under addition (the sum of two even
integers is again an even integer), and in fact, (2Z, +) is even an abelian
group (this is because (i) 0 is an even integer and hence in 2Z, (ii) for any
even integer a, −a is also an even integer and hence in 2Z, and (iii) of course,
addition of integers, restricted to 2Z, is both an associative and commutative
operation). Moreover, the set 2Z has one extra property that will be of
interest: for any integer a ∈ 2Z and for any arbitrary integer m, am is also an
even integer and hence in 2Z. Subsets such as these play a crucial role in the
structure of rings, and are given a special name: they are referred to as ideals.
Definition 2.62
Let R be a ring. A subset I of R is called an ideal of R if I is closed
under the addition operation of R and under this induced binary operation (I, +) is an abelian group, and if for any i ∈ I and arbitrary
r ∈ R, both ri ∈ I and ir ∈ I. An ideal I is called proper if I ≠ R.
Remark 2.63
Of course, if R is commutative, as in the example of Z and 2Z above,
ri ∈ I if and only if ir ∈ I, but in an arbitrary ring, one must specify
in the definition that both ri and ir be in I.
Remark 2.64
Notice in the definition of ideals above that if ir ∈ I for all r ∈ R,
then in particular, taking r to come from I, we find that I must be
closed under multiplication as well, that is, for any i and j in I, ij must
also be in I. Once we find that ideals are closed under multiplication,
the associative and distributive laws will then be inherited from R, so
ideals seem like they should be the same as subrings. However, they
differ from subrings in one crucial aspect: ideals do not have to contain
the multiplicative identity 1. (Recall the definition of subrings, and see
the example of 2Z above: it certainly does not contain 1.)
Exercise 2.64.1
Show that if I is an ideal of a ring R, then 1 ∈ I implies I = R.
Exercise 2.65.1
If I is an ideal of R, then by definition, (I, +) is an abelian group.
Consequently, it has an identity element, call it 0I, that satisfies the
property that i + 0I = 0I + i = i for all i ∈ I. On the other hand, the
element 0 in R is the identity element for the group (R, +). Prove
that the element 0I must be the same as the element 0.
(See Exercise 2.29 of this chapter for some clues if you need them. Both
this exercise and Exercise 2.29 of this chapter are just special cases of
Exercise 3.5 in Chapter 4 ahead.)
The significance of ideals will become clear when we study quotient rings
and ring homomorphisms a little ahead, but first let us consider several
examples of ideals in rings:
Example 2.66
Convince yourselves that if R is any ring, then R and the set {0}
are both ideals of R. The ideal {0} is often referred to informally as
the "zero ideal."
Example 2.67
Just as with the set 2Z, we may consider, for any integer m, the set of
all multiples of m, denoted mZ.
Exercise 2.67.1
Prove that mZ is an ideal of Z.
Question 2.67.2
What does mZ look like when m = 1?
Question 2.67.3
What does mZ look like when m = 0?
Example 2.68
In the ring R[x], let ⟨x⟩ denote the set of all polynomials that are a
multiple of x, i.e., the set {x·g(x) | g(x) ∈ R[x]}.

Exercise 2.68.1
Prove that ⟨x⟩ is an ideal of R[x].

Exercise 2.68.2
More generally, let f(x) be an arbitrary polynomial, and let
⟨f(x)⟩ denote the set of all polynomials that are a multiple of
f(x), i.e., the set {f(x)·g(x) | g(x) ∈ R[x]}. Show that ⟨f(x)⟩ is
an ideal of R[x].
Example 2.69
In the ring R[x, y], let ⟨x, y⟩ denote the set of all polynomials that can
be expressed as x·f(x, y) + y·g(x, y) for suitable polynomials f(x, y) and
g(x, y). For example, the polynomial x + 2y + x²y + xy³ is in ⟨x, y⟩
because it can be rewritten as x(1 + xy) + y(2 + xy²). (Note that this
rewrite is not unique: it can also be written as x(1 + xy + y³) + 2y, but
this will not be an issue.)
(Exercises 2.69.1 and 2.69.2 are special cases of Example 2.74 ahead;
compare to Exercises 2.74.1 and 2.74.2 there.)

Exercise 2.69.1
Show that ⟨x, y⟩ is an ideal of R[x, y].

Exercise 2.69.2

Example 2.70
Fix an integer n ≥ 1. In the ring Mn(Z) (see Exercise 2.16.6), the
subset Mn(2Z), consisting of all matrices all of whose entries are even,
is an ideal.

Exercise 2.70.1
Prove this.
Question 2.70.2
Given an arbitrary integer m, is the subset Mn (mZ) consisting
of all matrices all of whose entries are a multiple of m an ideal
of Mn (Z)?
Example 2.71
Let R be an arbitrary ring, and let I be an ideal of R. Fix an integer
n ≥ 1. In Mn(R), let Mn(I) denote the subset of all matrices all of
whose entries come from I.
Exercise 2.71.1
Prove that Mn(I) is an ideal of Mn(R).

(For details of this Exercise, see the following (but only after you have
tried the problem yourself!): http://tinyurl.com/GIAA-Rings-11)

Example 2.72
In the ring Z(2), denote by ⟨2⟩(2) the set of all fractions of the (reduced)
form a/b in which the numerator a is even.

Question 2.72.1

Exercise 2.72.2
Prove that ⟨2⟩(2) is an ideal of Z(2).
Example 2.73
Let R and S be rings, and let I1 be an ideal of R and I2 an ideal of S.
Let I1 × I2 denote the set {(a, b) | a ∈ I1, b ∈ I2}.

Exercise 2.73.1
Prove that I1 × I2 is an ideal of R × S.
Example 2.74
For simplicity, we will restrict ourselves in this example to commutative
rings. First, just to point out terminology that we have already
introduced in Example 2.68: by a multiple of r in a general commutative
ring R, we mean an element of the form ra for some a ∈ R, so the set
of all multiples of r is {ra | a ∈ R}. (This obviously generalizes
the notion of multiple that we use in Z.) In Examples 2.67 and 2.68, we
considered the set of all multiples of a given element of our ring (multiples
of m in the case of Z, multiples of f(x) in the case of R[x]), and
observed that these formed an ideal. In Example 2.69, we considered
something more general: the set ⟨p(x, y), q(x, y)⟩ is the set of sums of
multiples of p(x, y) and q(x, y). This process can be generalized even
further. If a1, . . . , an are elements of a commutative ring R, we denote
by ⟨a1, . . . , an⟩ the set of all sums r1·a1 + · · · + rn·an, where the ri
range over R. (As in Example 2.69, the ri may not be uniquely determined,
but this will not be an issue.)

(For details of Exercises 2.74.1 and 2.74.2, see the following (but only
after you have tried the problem yourself!):
http://tinyurl.com/GIAA-Rings-12)
Exercise 2.74.1
Show that ⟨a1, . . . , an⟩ is an ideal of R.
Question 2.74.3
Convince yourselves that ⟨1⟩ = R and ⟨0⟩ is just the zero ideal
{0}.
Exercise 2.74.4
Suppose that R is a field, and let a be a nonzero element of R.
Show that ⟨a⟩ = R. (Hint: play with the fact that a⁻¹ exists in
R and that ⟨a⟩ is an ideal.)
Exercise 2.74.5
Conclude that the only ideals in a field F are the set {0} and F .
2.5 Quotient Rings
Let us write R/I ("R mod I") for the set of equivalence classes of R under
the relation above. Thanks to Lemma 2.75, we know that the equivalence
class of r ∈ R is the same as the coset r + I, so we will use the notation [r]
and r + I interchangeably for the equivalence class of r. The key observation
we make is that the set R/I can be endowed with two binary operations
+ (addition) and · (multiplication) by the following rather natural definitions:
Definition 2.76
[a] + [b] = [a + b] and [a] · [b] = [a · b] for all [a] and [b] in R/I. (In
coset notation, this would read (a + I) + (b + I) = (a + b) + I, and
(a + I)(b + I) = ab + I.) As always, if the context is clear, we will often
omit the · sign and write [a][b] for [a] · [b].
Before proceeding any further, we need to settle the issue of whether
these definitions make sense, in other words, whether these operations are
well-defined. If we use a and b as representatives for the equivalence classes
to which they belong, our definition of the sum of the two classes is the class
to which a + b belongs. However, if we use a′ and b′ as representatives for
the classes [a] and [b], our definition says that the sum of the two classes is
the class to which a′ + b′ belongs. Can we be certain that the class to which
a + b belongs is the same as the class to which a′ + b′ belongs? If yes, then
we can be certain that our definition of addition is well-defined.

(Actually, we have already considered the issue of these operations being
well-defined earlier: see Exercise 2.58.1 and the hints there!)
Remark 2.78
The proof above illustrates why we require in the definition of ideals
that they be closed under addition and that ir ∈ I and ri ∈ I for all
i ∈ I and all r ∈ R (see Lemma 2.65). It was this that allowed us
to say that addition and multiplication are well-defined: we needed to
know above that i + j ∈ I in the proof that addition is well-defined,
and that aj ∈ I and ib ∈ I and ij ∈ I and then aj + ib + ij ∈ I in the
proof that multiplication is well-defined, and for this, we invoked the
corresponding properties of ideals.
Having proved that the operations + and · on R/I are well-defined, let
us proceed to prove that all ring axioms hold in R/I:
Theorem 2.79. (R/I, +, ·) is a ring.
Proof. We proceed to check all axioms one by one. □
Definition 2.80
(R/I, +, ·) is called the quotient ring of R by the ideal I.
How should one visualize R/I? Here is one intuitive way. Note that the
zero of R/I is the element [0], which is just the coset 0 + I (see Lemma 2.75).
But the coset 0 + I is the set of all elements of R of the form 0 + i for some
i ∈ I, and of course, the set of all such elements is just I. Thus, we may
view the quotient construction as something that takes the ring R and simply
converts all elements in I to zero; more colloquially, the construction "kills"
all elements in I, or "divides out" all elements in I. This last description
explains the term quotient ring, and pushing the analogy one step further,
R/I can then be thought of as the set of all "remainders" after dividing out
by I, endowed with the natural quotient binary operations of Definition
2.76.
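The well-definedness just discussed can also be checked numerically for R = Z and I = mZ. The following Python sketch (our own finite spot-check, not a proof) verifies that the class of a sum or product does not depend on which representatives are chosen.

```python
# Spot-check that Definition 2.76 is well-defined in Z/mZ: replacing the
# representatives a, b by a + m*i, b + m*j never changes the class of the
# sum or the product.
import random

m = 6
def same_class(x, y):
    return (x - y) % m == 0        # x ~ y  iff  x - y lies in mZ

random.seed(0)
for _ in range(100):
    a = random.randrange(-50, 50)
    b = random.randrange(-50, 50)
    a2 = a + m * random.randrange(-5, 6)   # another representative of [a]
    b2 = b + m * random.randrange(-5, 6)   # another representative of [b]
    assert same_class(a + b, a2 + b2)      # addition is well-defined
    assert same_class(a * b, a2 * b2)      # multiplication is well-defined
print("Definition 2.76 is well-defined for m =", m)
```

The algebraic reason it works is visible in the product case: (a + mi)(b + mj) = ab + m(aj + bi + mij), which differs from ab by an element of mZ.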
Example 2.81
As our first example, take R to be R[x], and I to be ⟨x⟩ (Example
2.68). What does R/I look like here? Any polynomial in R[x] is of
the form a0 + a1x + a2x² + · · · + anxⁿ for some n ≥ 0 and some ai ∈ R.
The monomials a1x, a2x², . . . , anxⁿ are all in I since each of these is a
multiple of x. If we set these to zero, we are left with simply a0, which
is a real number. Thus, R[x]/⟨x⟩ is just the set of constant terms (the
coefficients of x⁰) as we range through all the polynomials in R[x]. But
the set of constant terms is precisely the set of all real numbers, since
every constant term is just a real number, and every real number shows
up as the constant term of some polynomial. Thus, R[x]/⟨x⟩ "equals"
R. But this equality is more than just an equality of sets: it is an
equality that preserves the ring structure. (We will make the notion
of "preserving ring structure" more precise in the next section; see
Example 2.101; Example 2.93 is also relevant.)
Example 2.82
Here is another example that will help us understand how to visualize
R/I. Consider R[x] again, but this time take I to be ⟨x² + 1⟩ (Example
2.68, Exercise 2.68.2). Notice that x² is in the same equivalence class as
−1, since x² − (−1) = x² + 1 is clearly in I. What this means is: in the
quotient ring R[x]/⟨x² + 1⟩, we may represent the coset x² + I by −1 + I.
(Another way of thinking about this is to note that x² may be written
as (x² + 1) + (−1). If we kill off the first summand x² + 1, which is in
I, we arrive at the representative −1 + I for x² + I.) But there is more.
As we have seen while proving the well-definedness of multiplication
in R/I (Lemma 2.77 above), if x² ∼ −1, then x² · x² ∼ (−1) · (−1).
Thus, x⁴ ∼ 1, so we may replace x⁴ + I by 1 + I. Proceeding, we
find x⁶ + I is the same as −1 + I, x⁸ + I is the same as 1 + I, etc.
Moreover, x³ + I = (x + I)(x² + I) = (x + I)(−1 + I) = −(x + I), etc.
The coset of any monomial xⁿ is thus either ±1 + I or ±(x + I). For
instance, while considering the equivalence class of a polynomial such
as 2 − 5x + 3x² + 2x³ − 2x⁴ + x⁵, which is (2 + I) − (5 + I)(x + I) + (3 +
I)(x² + I) + (2 + I)(x³ + I) − (2 + I)(x⁴ + I) + (x⁵ + I), we may make
the replacements above to find that it is the same as (2 + I) − (5 +
I)(x + I) + (3 + I)(−1 + I) − (2 + I)(x + I) − (2 + I)(1 + I) + (x + I).
Multiplying out, we find this is the same as (2 − 5x − 3 − 2x − 2 + x) + I,
which simplifies to (−3 − 6x) + I, or (−3 + I) − 6(x + I). Temporarily
writing x̄ for x + I, we loosely think of (−3 − 6x) + I as the element
−3 − 6x̄ subject to the relation x̄² + 1 = 0, or what is the same thing,
the relation x̄² = −1.
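The monomial replacements x² ↦ −1, x³ ↦ −x, x⁴ ↦ 1, x⁵ ↦ x, . . . are mechanical enough to automate. Here is a Python sketch (our own; the function name is hypothetical) that reduces any polynomial modulo x² + 1 to a representative a + bx.

```python
# Reduce a polynomial modulo x^2 + 1 by repeatedly replacing x^2 with -1.
# coeffs[k] is the coefficient of x^k; the result is the pair (a, b)
# representing the coset (a + b*x) + I.
def reduce_mod_x2_plus_1(coeffs):
    a, b = 0, 0
    for k, c in enumerate(coeffs):
        sign = -1 if (k // 2) % 2 else 1   # x^k ~ (-1)^(k // 2) * x^(k % 2)
        if k % 2 == 0:
            a += sign * c                  # even powers contribute to a
        else:
            b += sign * c                  # odd powers contribute to b
    return a, b

# The polynomial 2 - 5x + 3x^2 + 2x^3 - 2x^4 + x^5 from the example:
print(reduce_mod_x2_plus_1([2, -5, 3, 2, -2, 1]))   # (-3, -6), i.e. -3 - 6x
```

This matches the hand computation in the example, and the rule x̄² = −1 is exactly how i behaves in C, a connection the book develops further.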
Example 2.84
In a similar manner, the ring Z/3Z of Example 2.21 is the quotient ring
of Z by the ideal 3Z.
Question 2.84.1
Do you see this?
More generally, one can consider the ideal mZ for m ≥ 4 and construct
the ring Z/mZ with operations as in Definition 2.76.
Exercise 2.84.2
Redo Exercise 2.21.2 in this new light.
2.6 Ring Homomorphisms and Isomorphisms
R → R/I was surjective, but let us be more general, and not assume that
our map f from R to S is surjective. It will turn out that the image of f
will, all the same, be a subring of S (see Lemma 2.100 ahead). In such a
situation too, it will turn out, the ring operations in the ring R and in the
image of f (a subring of S) will essentially be the same except perhaps for
dividing out by some ideal. We will give this a name:
Definition 2.85
Let R and S be two rings, and let f : R → S be a function. Suppose
that f has the following properties:
1. f(a) + f(b) = f(a + b) for all a, b in R,
2. f(a)f(b) = f(ab) for all a, b in R,
3. f(1R) = 1S.
Then f is said to be a ring homomorphism from R to S.
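The three conditions of Definition 2.85 are easy to test numerically for a concrete map, such as the reduction-mod-m map from Z to Z/mZ (compare Example 2.92 ahead). The following Python sketch, with classes encoded as remainders, is a finite spot-check of ours rather than a proof.

```python
# Check the three homomorphism conditions of Definition 2.85, over a finite
# test range, for f: Z -> Z/6Z sending a to its class, encoded as a % 6.
m = 6
def f(a):
    return a % m

for a in range(-20, 20):
    for b in range(-20, 20):
        assert (f(a) + f(b)) % m == f(a + b)   # f preserves addition
        assert (f(a) * f(b)) % m == f(a * b)   # f preserves multiplication
assert f(1) == 1                               # f(1_R) = 1_S
print("f satisfies all three conditions on the tested range")
```

Note that the sums and products of classes are themselves reduced mod m, since the operations on the left-hand sides take place in Z/6Z, not in Z.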
Remark 2.86
There are some features of this definition that are worth noting:
1. In the equation f(a) + f(b) = f(a + b), note that the operation on
the left side represents addition in the ring S, while the operation
on the right side represents addition in the ring R. Loosely, we
say that any function f : R → S satisfying f(a) + f(b) = f(a + b)
"preserves" addition.
2. Similarly for the equation f(a)f(b) = f(ab): the operation on
the left side represents multiplication in S, while the operation
on the right side represents multiplication in R. Loosely, we
say that any function f : R → S satisfying f(a)f(b) = f(ab)
"preserves" multiplication.
3. By the very definition of a function, f is defined on all of R. The
image of R under f, however, need not be all of S (i.e., f need not
be surjective). We will see examples of this ahead (see Example
2.94 and Example 2.95 for instance). However, the image of R
under f is not an arbitrary subset of S. The definition of a ring
Example 2.92
As a special case of Example 2.91, we have, for any m ≥ 2, a ring
homomorphism from Z to Z/mZ defined by f(a) = a + mZ, whose
kernel is precisely the ideal mZ (see Example 2.84).
Example 2.93
Consider the function f : R[x] → R that sends x to 0 and, more
generally, a polynomial p(x) to p(0).

Exercise 2.93.2
Prove that the kernel of f is precisely the ideal ⟨x⟩.
See the discussion on page 113. We will have more to say on this example ahead (see Example 2.101 and Theorem 2.107). See also Example
2.98 ahead for a generalization.
Example 2.94
Consider Z as a subset of Q. The function f : Z → Q that sends n ∈ Z
to the fraction n/1 is a ring homomorphism.
Exercise 2.94.1
Prove this.
Exercise 2.94.2
Prove that the kernel of f is the zero ideal in Z.
Note that the image of f is just the integers, and in particular, f is not
surjective.
Example 2.95
More generally, if R is a subring of S, the function f : R → S that
sends r to r is a ring homomorphism. The image of f is just R, so if
R is a proper subset of S, then f will not be surjective.
Example 2.96
Consider the function f : Q[x] → Q[√2] that sends x to √2 and, more
generally, a polynomial p(x) to p(√2). (Note that any expression p(√2)
simplifies into one of the form a + b√2, by using the fact that (√2)² = 2,
(√2)³ = 2√2, etc.)
Exercise 2.96.1
Prove that f is a ring homomorphism.
125
Exercise 2.96.2
Prove that f is surjective.
(Hint: Given rationals a and b what is the image of a + bx?)
(x maps to √2 under f, so why is x − √2 not in the kernel of f?)
Example 2.97
Here is an example similar in spirit to Example 2.96 above.

Exercise 2.97.1
Show that the function f : Q[x] → Q[i] that sends x to i and,
more generally, p(x) to p(i) is a surjective ring homomorphism,
whose kernel is the ideal ⟨x² + 1⟩.
Example 2.98
After seeing in Examples 2.96 and 2.97 above how long division can be
used to determine kernels of homomorphisms from Q[x] to other rings,
the following should be easy:

Exercise 2.98.1
Let F be any field, and let a ∈ F be arbitrary. Show that the
function f : F[x] → F that sends x to a and, more generally, p(x)
to p(a) is a surjective ring homomorphism whose kernel is the
ideal ⟨x − a⟩.
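The key fact behind Exercise 2.98.1 is that dividing p(x) by (x − a) leaves remainder p(a), so p(x) is in the kernel exactly when (x − a) divides p(x). Synthetic division (Horner's scheme) computes both quotient and remainder at once; the function name below is ours.

```python
# Synthetic division of p(x) by (x - a).  Coefficients are listed from the
# highest power down; the returned remainder equals p(a), illustrating why
# ker f = <x - a> for the evaluation map p(x) |-> p(a).
def divide_by_x_minus_a(coeffs, a):
    quotient, acc = [], 0
    for c in coeffs:
        acc = acc * a + c        # Horner step
        quotient.append(acc)
    return quotient[:-1], quotient[-1]   # (coefficients of q(x), p(a))

# p(x) = x^2 - 3x + 2 = (x - 1)(x - 2); dividing by (x - 1):
print(divide_by_x_minus_a([1, -3, 2], 1))   # ([1, -2], 0): q(x) = x - 2, p(1) = 0
```

Since the remainder is p(a), it vanishes precisely when a is a root, which is the statement that the kernel consists of the multiples of (x − a).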
(For details of this Exercise, see the following (but only after you have
tried the problem yourself!): http://tinyurl.com/GIAA-Rings-13)

somehow the addition and multiplication in two rings are essentially the
same except perhaps for dividing out by some ideal, isomorphisms capture
a stronger notion: that multiplication in two rings are essentially the
We now quantify our observation (see the discussion on page 113) that
somehow, the rings R[x]/⟨x⟩ and R are "equal." Let us revisit this example
again in a new light:
Example 2.101. Let us define f : R[x]/⟨x⟩ → R by f(p(x) + ⟨x⟩) = p(0).
Let us explain this: the equivalence class of a polynomial p(x) under the
equivalence relation that defines the ring R/I is the coset p(x) + I (see Lemma
2.75). Our function sends the equivalence class of p(x) to the constant term
of p(x). We first need to check that this function is well-defined: we have
defined f in terms of one representative of an equivalence class; what if we
had used another representative? So, suppose p(x) + ⟨x⟩ = q(x) + ⟨x⟩; then if
we had used q(x), we would have defined f(p(x) + ⟨x⟩) = f(q(x) + ⟨x⟩) = q(0).
Earlier, we had defined f(p(x) + ⟨x⟩) to be p(0): are these definitions the
same? In other words, is p(0) = q(0)? The answer is yes! For, the fact that
p(x) + ⟨x⟩ = q(x) + ⟨x⟩ means that p(x) − q(x) ∈ ⟨x⟩ (why?), or alternatively,
p(x) − q(x) is a multiple of x. Hence, the constant term of p(x) − q(x), which
is p(0) − q(0), must be zero, i.e., p(0) = q(0). It follows that f is indeed
well-defined.
Now that we know f is well-defined, it is easy to check that f is a ring
homomorphism (do it!). What is the kernel of f? It consists of all equivalence
classes p(x) + ⟨x⟩ such that the constant term p(0) is zero. But to say that
p(0) is zero is to say that p(x) is divisible by x (why?), or in other words, that
p(x) is already in ⟨x⟩. Thus, the kernel of f consists of just the equivalence
class ⟨x⟩, but this is the zero element in the ring R[x]/⟨x⟩. Thus, the kernel
of f is just the zero ideal, so by Lemma 2.99, f is injective. Moreover, f is
clearly surjective, since every real number r arises as the constant term of
some polynomial in R[x] (for example, the polynomial r + 0x + 0x² + · · ·).
The function f quantifies why R[x]/⟨x⟩ and R are really "equal" to each
other. There are two ingredients to this: the function f, being injective and
surjective, provides a one-to-one correspondence between R[x]/⟨x⟩ and R as
129
sets, and the fact that f is a ring homomorphism tells us that the addition
and multiplication in R are essentially the same as those in R[x]/⟨x⟩. Moreover, since f has kernel zero, we do not even have to divide out by any ideal
in R[x]/⟨x⟩ to realize this sameness of ring operations. Thus, R[x]/⟨x⟩ and
R are really the same ring, even though they look different. We say that
R[x]/⟨x⟩ is isomorphic to R via the map f.
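The well-definedness argument above can be played with numerically. This sketch (Python, illustrative only; the helpers `constant_term` and `add` are our own names) represents polynomials as coefficient lists and confirms that adding an element of ⟨x⟩ to p(x), i.e., moving to another representative of the same coset, leaves the value assigned by f unchanged.

```python
def constant_term(p):
    """f sends the coset p(x) + <x> to p(0), the constant term of p
    (p is a list of coefficients, lowest degree first)."""
    return p[0] if p else 0

def add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

p = [5, 2, -1]            # p(x) = 5 + 2x - x^2
h = [0, 7, 3, -4]         # h(x) = 7x + 3x^2 - 4x^3, an element of <x>
q = add(p, h)             # q(x) + <x> is the same coset as p(x) + <x>

print(constant_term(p), constant_term(q))  # 5 5 -- f is well-defined
```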
Definition 2.102
Let f : R → S be a ring homomorphism. If f is both injective and
surjective, then f is said to be an isomorphism between R and S. Two
rings R and S are said to be isomorphic (written R ≅ S) if there is
some function f : R → S that is an isomorphism between R and S.
Let us look at some examples of ring isomorphisms:
Example 2.103
Let us revisit Example 2.38. Denote by f the function that sends r ∈ R to
diag(r).
Exercise 2.103.1
Check that f is bijective as a function from R to the subring of
Mn (R) consisting of matrices of the form diag(r).
Exercise 2.103.2
Also, check that f (r + s) = f (r) + f (s), and f (rs) = f (r)f (s).
Moreover, f(1) is clearly the identity matrix. Thus, the function f is
indeed a ring homomorphism from R to the subring of Mn(R) consisting
of matrices of the form diag(r).
Exercise 2.104.2
Show that f is a ring homomorphism.
Exercise 2.104.3
Show that f is surjective.
Exercise 2.104.4
Show that f is injective. (Hint: Recall that we have proved in
Example 2.105
The following examples show that well-known fields can show up as
subrings of matrices!
Exercise 2.105.1
Let S denote the subset of M2(Q) consisting of all matrices of
the form

    [ a  2b ]
    [ b   a ]

where a and b are rational numbers.
Exercise 2.105.2
Let S denote the subset of M2(R) consisting of all matrices of
the form

    [ a  −b ]
    [ b   a ]

where a and b are real numbers.

The two examples in the exercises above are referred to as the regular
representations of these fields by matrices.
Example 2.106
It is not necessary that the rings R and S in the definition of a ring
isomorphism be different rings. A ring isomorphism f : R → R is to
be thought of as a one-to-one onto map from R to R that preserves the
ring structure. (Such a map is also known as an automorphism of R.)
Here are some examples:
Exercise 2.106.1
Exercise 2.106.2
Let F be a field, and let a be a nonzero element of F. Let b be an
arbitrary element of F. Prove that the map f : F[x] → F[x] that
sends x to ax + b and, more generally, a polynomial p₀ + p₁x +
··· + pₙxⁿ to the polynomial p₀ + p₁(ax + b) + ··· + pₙ(ax + b)ⁿ
is an automorphism of F[x].
Exercise 2.106.3
Prove that the complex conjugation map f : C → C that sends
a + bi (for real numbers a and b) to the complex number a − bi
is an automorphism of C. Determine the set of complex numbers
on which f acts as the identity map.
R[x]/⟨x⟩ and R. Observe the close connection between how the functions f
and f are defined in the two examples, and observe that the ring R[x]/⟨x⟩
is obtained by modding R[x] by the kernel of f. Now as another instance,
compare Examples 2.96 and 2.104. Here too, in the first example, we defined
f(p(x) + ⟨x² − 2⟩) = p(√2), and observed that it was well-defined and that
Exercise 2.107.1
Justify all the equalities above.
Example 2.108
We have the isomorphism (see Example 2.97) Q[x]/⟨x² + 1⟩ ≅ Q[i].
Example 2.109
By the same token, we find R[x]/⟨x² + 1⟩ ≅ C.
Exercise 2.109.1
Mimic Example 2.97 and construct a homomorphism from R[x]
to C that sends p(x) to p(i); prove that it is surjective with
kernel ⟨x² + 1⟩. Then apply Theorem 2.107 to establish the claim
that R[x]/⟨x² + 1⟩ ≅ C.
Example 2.110
Example 2.98 along with Theorem 2.107 above establishes that for any
field F and any a ∈ F, F[x]/⟨x − a⟩ ≅ F.
2.7 Further Exercises
Exercise 2.111
Starting from the ring axioms, prove that the properties stated in Remark 2.24 hold for any ring R.
(See the notes on page 156 for some hints.)
Exercise 2.112
This generalizes Exercise 2.47: If R is a ring, let R∗ denote the set of
invertible elements of R. Prove that R∗ forms a group with respect to
multiplication.
Exercise 2.113
This exercise determines the units of the ring Z[i]:
1. Define a function N : Z[i] → Z by N(a + bi) = a² + b². Show that
N(xy) = N(x)N(y) for all x and y in Z[i].
2. If x is invertible in Z[i], show that N(x) must equal 1.
3. Conclude that the only units of Z[i] are ±1 and ±i.
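A quick computational companion to this exercise (a sketch of our own, not part of the text; Gaussian integers are represented here as pairs (a, b)) verifies the multiplicativity of the norm on sample elements and searches a small box for all elements of norm 1:

```python
def norm(z):
    """N(a + bi) = a^2 + b^2, with z represented as the pair (a, b)."""
    a, b = z
    return a * a + b * b

def mul(z, w):
    """Multiply two Gaussian integers: (a + bi)(c + di)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

x, y = (3, 2), (1, -4)
assert norm(mul(x, y)) == norm(x) * norm(y)   # N is multiplicative

# The only Gaussian integers of norm 1 are the four units +-1 and +-i:
units = [(a, b) for a in range(-2, 3) for b in range(-2, 3)
         if norm((a, b)) == 1]
print(units)   # [(-1, 0), (0, -1), (0, 1), (1, 0)]
```

The search box is of course finite, but since N(a + bi) = a² + b², any element of norm 1 must have |a|, |b| ≤ 1, so nothing outside the box is missed.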
Exercise 2.114
Consider the ring Q[√m] of Example 2.12. Now assume for this exercise
that m is not a perfect square. Show that a + b√m = 0 (for a and
Exercise 2.115
The following concerns the ring Q[√2, √3] of Example 2.34, and is
designed to show that if a, b, c, and d are rational numbers, then
2. Show that otherwise we can write √3 = (a + b√2)/(c + d√2). Why is
this last equality a contradiction?)
(For details of this Exercise, see http://tinyurl.com/GIAA-Rings-14)
Exercise 2.116
We will prove in this exercise that Q[√2, √3] is actually a field.
1. You know that if a and b are rational numbers, then (a + b√2)(a − b√2)
is rational. Show that the product
(a + b√2 + c√3 + d√6) · (a + b√2 − c√3 − d√6)
· (a − b√2 + c√3 − d√6) · (a − b√2 − c√3 + d√6)
is also rational. (This just involves multiplying out all the terms
above, so do it! However, you can save yourselves a lot of work
by multiplying the first two terms together using the formula
(x + y)(x − y) = x² − y², and then multiplying the remaining two
terms together, and looking out for patterns.)
2. Now show using part (1) above that Q[√2, √3] is a field. (Hint:
Given a nonzero element a + b√2 + c√3 + d√6 in Q[√2, √3], first
(a − b√2 + c√3 − d√6) or (a − b√2 − c√3 + d√6) can be zero
Exercise 2.117
Let R be an integral domain. Show that an element in R[x] is invertible
if and only if it is the constant polynomial r (= r + 0x + 0x² + ···) for
some invertible element r ∈ R. In particular, if R is a field, then a
polynomial in R[x] is invertible if and only if it is a nonzero element
of R. (See the notes on page 155 for a discussion on polynomials with
coefficients from an arbitrary ring.)
By contrast, show that the (nonconstant) polynomial 1 + [2]₄x in the
polynomial ring Z/4Z[x] is invertible, by explicitly finding the inverse
of 1 + [2]₄x. Repeat the exercise by finding the inverse of 1 + [2]₈x in the
polynomial ring Z/8Z[x]. (Hint: Think in terms of the usual Binomial
Series for 1/(1 + t) from your Calculus courses. Do not worry about
convergence issues. Instead, think about what information you would
glean from this series if, due to some miracle, tⁿ = 0 for some positive
integer n.)
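The truncated-series hint can be checked mechanically. In the sketch below (ours, not the text's; `polymul_mod` is an assumed helper name), the series 1 − 2t + 4t² − 8t³ + ··· collapses to finitely many terms once the coefficients vanish mod n, and multiplying back confirms the candidate inverses:

```python
def polymul_mod(p, q, n):
    """Multiply polynomials p and q (coefficient lists, lowest degree
    first) with coefficients reduced mod n."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    # drop trailing zero coefficients
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# In Z/4Z[x]: the series 1 - 2t + 4t^2 - ... collapses to 1 - 2t, and
# -2 = 2 mod 4, so 1 + 2x should be its own inverse:
print(polymul_mod([1, 2], [1, 2], 4))          # [1]

# In Z/8Z[x]: the series gives 1 - 2t + 4t^2 (later terms vanish mod 8):
print(polymul_mod([1, 2], [1, -2, 4], 8))      # [1]
```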
Exercise 2.118
We will revisit some familiar identities from high school in the context
of rings! Let R be a ring:
1. Show that a² − b² = (a − b)(a + b) for all a and b in R if and only
if R is commutative.
2. Show that (a + b)² = a² + 2ab + b² for all a and b in R if and only
if R is commutative.
3. More generally, if R is a commutative ring, prove that the Binomial Theorem holds in R: for all a and b in R and for all positive
integers n,

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \cdots + \binom{n}{n-1}ab^{n-1} + \binom{n}{n}b^n.$$
Exercise 2.119
An element a in a ring is said to be nilpotent if aⁿ = 0 for some positive
integer n.
1. Show that if a is nilpotent, then 1 − a and 1 + a are both invertible.
(Hint: Just as in Exercise 2.117 above, think in terms of the
Binomial Series for 1/(1 − t) and 1/(1 + t). Do not worry about
convergence, but ask yourself what you can learn from the series
if tⁿ = 0 for some positive integer n.)
2. Let R be a commutative ring. Show that the set of all nilpotent
elements in R forms an ideal in R. (Hint: Suppose that aⁿ = 0
and bᵐ = 0. What can you say about (a + b)ⁿ⁺ᵐ⁻¹, given your
knowledge of the Binomial Theorem for commutative rings from
Exercise 2.118 above?)
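Part (1) can be illustrated with a concrete nilpotent element. In the sketch below (our own, not from the text), we take a strictly upper triangular matrix a, which satisfies a³ = 0, and check that the truncated geometric series 1 + a + a² really does invert 1 − a:

```python
def matmul(x, y):
    """Multiply two 3x3 integer matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matadd(x, y):
    return [[x[i][j] + y[i][j] for j in range(3)] for i in range(3)]

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
a = [[0, 2, 5], [0, 0, 3], [0, 0, 0]]   # strictly upper triangular
a2 = matmul(a, a)
assert matmul(a2, a) == [[0, 0, 0]] * 3  # a is nilpotent: a^3 = 0

# The truncated geometric series 1 + a + a^2 inverts 1 - a,
# since (1 - a)(1 + a + a^2) = 1 - a^3 = 1:
inv = matadd(matadd(I, a), a2)
one_minus_a = [[I[i][j] - a[i][j] for j in range(3)] for i in range(3)]
print(matmul(one_minus_a, inv) == I)    # True
```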
Exercise 2.120
Let S denote the set of all functions f : R → R. Given f and g in S,
define two binary operations + and · on S by the rules
(f + g)(x) = f(x) + g(x)
(f · g)(x) = f(x)g(x)
(These are referred to, respectively, as the pointwise addition and multiplication of functions.)
1. Convince yourselves that (S, +, ·) is a ring. What is the 0 of
S? What is the 1 of S?
2. Show that S is not an integral domain. (Hint: Play with functions like f(x) = x + |x| or g(x) = x − |x|.)
3. More generally, show that every nonzero f ∈ S is either a unit
or a zero-divisor by showing:
(a) f is a unit if and only if f(x) ≠ 0 for all x ∈ R.
(b) f is a zero-divisor if and only if f(x) = 0 for at least one
x ∈ R.
4. Let s : R → S be the function that sends the real number r to
the function sᵣ defined by sᵣ(x) = r for all x ∈ R. Show that
s is an injective ring homomorphism from R to S. The image of
s in S is therefore a subring of S that is isomorphic to R. It is
known as the set of constant functions.
Exercise 2.121
Let R be a ring.
Definition 2.121.1
The center of R, written Z(R), is defined to be the set
{r ∈ R | rx = xr for all x ∈ R}.
the ring Z[√−5], two elements x and y are associates if and only
an associate of either 1 + √−5 or 1 − √−5.
12. A commutative ring R is said to possess unique prime factorization if every element a ∈ R that is not a unit factors into a
product of irreducibles, and if a = x₁x₂···xₛ and a = y₁y₂···yₜ
are two factorizations of a into irreducibles, then s must equal
t, and after relabeling if necessary, each xᵢ must be an associate
of the corresponding yᵢ. (Again, it turns out that this is the
correct generalization of the concept of unique prime factorization in the integers to arbitrary commutative rings.) Prove that
Z[√−5] does not possess unique prime factorization by considering two different factorizations of 6 into irreducibles. (Hint:
Look at parts 8, 9, 10, and 11.)
Exercise 2.124
Prove that any finite integral domain must be a field. (Hint: Write
R for the integral domain. Given a nonzero a ∈ R, you need to show
that a is invertible. What can you say about the function fₐ : R → R
that sends any r to ar? Is it injective? Is it surjective? So?)
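The hint can be watched in action in the finite integral domain Z/7Z (a sketch of ours, not the text's): the map fₐ(r) = ar is injective by cancellation, hence surjective by finiteness, so 1 lies in its image and a has an inverse.

```python
# Check, for every nonzero a in Z/7Z, that f_a(r) = a*r is injective
# (its image has 7 distinct elements), hence surjective, so 1 is hit.
p = 7
for a in range(1, p):
    image = {(a * r) % p for r in range(p)}
    assert len(image) == p                      # f_a is injective
    inverse = next(r for r in range(p) if (a * r) % p == 1)
    assert (a * inverse) % p == 1               # so a is a unit

print("every nonzero element of Z/7Z is invertible")
```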
Exercise 2.125
Let K be a field, and let R be a subring of K. Assume that every
element of K satisfies a monic polynomial with coefficients in R: this
means that given any k in K, there exists a positive integer n and
nonzero). Show that r(x) − (rₘ/gₙ)xᵐ⁻ⁿ g(x) has degree less
than r(x).
(b) Now show that the element f(x) − g(x)(q(x) + (rₘ/gₙ)xᵐ⁻ⁿ)
is an element of S that has degree less than that of r(x).
Conclude that deg(r(x)) < deg(g(x)).
6. Conclude that we have proved the existence of q(x) and r(x) with
the required properties in the case where S does not contain 0,
and have hence proved our result in all cases.
Exercise 2.129
We saw in Exercise 2.127 that Z is a principal ideal domain. The key
to that proof was the division algorithm in the integers. Now that we
have established a corresponding division algorithm in the ring F [x],
where F is any field (see Exercise 2.128), we will use it to show that
F [x] is also a principal ideal domain.
Accordingly, let I be an ideal of F[x]. If I consists of just the element 0,
then I = ⟨0⟩ and is already principal. Similarly, if I = R, then I = ⟨1⟩
(see Exercise 2.74.3), so I is principal in this case as well. So, assume
in what follows that I is a nonzero proper ideal of R (proper simply
means that I ≠ R). In particular, I cannot contain any constant
polynomials other than 0, since, if some nonzero a ∈ F is in I, then
a · a⁻¹ = 1 is also in I, contradicting what we have assumed about I.
Let f(x) be a polynomial in I whose degree is least among all (nonzero)
polynomials in I. (Such a polynomial exists by the Well-Ordering Principle.) Note that f(x) must have positive degree by our assumption
about I. Let g(x) be an arbitrary polynomial in I. Apply the division
algorithm and, using ideas similar to those in Exercise 2.128, prove that g(x)
I implies I + J = R, then I is
I. The
hypothesis then says I +J = R. But what else can you say about
I + J that then gives a contradiction?)
(Thus either property could be used to define maximal ideals.)
3. Show that a proper ideal I is maximal if and only if R/I is a field.
(Hint: Assume that I is maximal. Pick a nonzero element [x] in
R/I. Since [x] is nonzero, x ∉ I. Study the set K = {i + rx | i ∈
I and r ∈ R}, which you showed in part (1) to be an ideal of
R. By maximality of I, show that there must be some i ∈ I and
r ∈ R such that i + rx = 1. What does this relation say in
R/I? Now invoke Exercise 2.46.1. A similar argument should
also establish that if R/I is a field, then I must be maximal.)
It is instructive to note that maximal ideals always exist; see Theorem
B.6 in Appendix 4.5.
Exercise 2.133
Let R be a commutative ring. A proper ideal I of R is said to be prime
if whenever ab ∈ I for a and b in R, then either a or b must be in I.
1. Show that I is a prime ideal if and only if R/I is an integral
domain. (Hint: This is just a matter of translating the definition
of a prime ideal over to the ring R/I: for instance, assume that
I is prime. If we have a relation [a][b] = 0 in R/I, then this
means that ab ∈ I in R.)
2. Show that every maximal ideal is necessarily prime.
3. Show that if p is a prime integer, then the ideal ⟨p⟩ in Z is a
prime ideal.
152
Exercise 2.134
Let R be any ring containing the rationals. Prove that the only ring
homomorphism f : Q → R is the identity map that sends any rational
number to itself. (Hint: Given that f(1) must be 1, what can you say
about f(2), f(3), etc.? Next, what can you say about f(1/2), f(1/3),
etc.? So now, what can you say about f(m/n) for arbitrary integers
m and n with n ≠ 0?)
Exercise 2.135
Prove that the following are all ring isomorphisms from Q[√2, √3] to
itself. Here, a, b, c, and d are, as usual, rational numbers.
Notes
Remarks on Example 2.10 Every nonzero element in Q has a multiplicative
inverse; that is, given any q ∈ Q with q ≠ 0, we can find a rational number
q′ such that qq′ = 1. The same cannot be said for the integers: not every
nonzero integer has a multiplicative inverse within the integers. For example,
there is no integer a such that 2a = 1, so 2 does not have a multiplicative
inverse.
Remarks on Example 2.12 The sum and product of any two elements a +
numbers, the sum and product also lie in Q[√2]. Thus, the standard method
that in addition to being in Q[√2], u, v, and w are also real numbers. Since
associativity holds in the reals, we find upon viewing u, v, and w as real
numbers that (u + v) + w = u + (v + w). Now viewing u, v, and w in this
Remarks on Example 2.17 For any ring R, we can consider the set of
polynomials with coefficients in R with the usual definition of addition and
multiplication of polynomials. This will be a ring, with additive identity the
constant polynomial 0 and multiplicative identity the constant polynomial
1. If R is commutative, R[x] will also be commutative. (Why? Play with
two general polynomials f = Σᵢ₌₀ⁿ fᵢxⁱ and g = Σⱼ₌₀ᵐ gⱼxʲ and study fg and
gf.) If R is not commutative, R[x] will also not be commutative. To see this
last assertion, suppose a and b in R are such that ab 6= ba. Then viewing
a and b as constant polynomials in R[x], we find that we get two different
products of the polynomials a and b depending on the order in which we
multiply them!
Here is something strange that can happen with polynomials with coefficients
in an arbitrary ring R. First, the degree and highest coefficient of polynomials
in R[x] (where R is arbitrary) are defined exactly as for polynomials
with coefficients in the reals. Now over the reals, if f(x) and g(x) are two nonzero
polynomials, then deg(f(x)g(x)) = deg(f(x)) + deg(g(x)). But for an arbitrary ring R, the degree of f(x)g(x) can be less than deg(f(x)) + deg(g(x))!
To see why this is, suppose f(x) = fₙxⁿ + lower-degree terms (with fₙ ≠
0), and suppose g(x) = gₘxᵐ + lower-degree terms (with gₘ ≠ 0). On
multiplying out f(x) and g(x), the highest power of x that will show up in
the product is xⁿ⁺ᵐ, and its coefficient will be fₙgₘ. If we are working over
the reals, then fₙ ≠ 0 and gₘ ≠ 0 will force fₙgₘ to be nonzero, so the degree of
f(x)g(x) will be exactly n + m. But over arbitrary rings, it is quite possible
for fₙgₘ to be zero even though fₙ and gₘ are themselves nonzero. (You
have already seen examples of this in matrix rings. Elements a and b in a
ring R such that a ≠ 0 and b ≠ 0 but ab = 0 will be referred to later in
the chapter as zero-divisors.) When this happens, the highest nonzero term
in f(x)g(x) will be something lower than the xⁿ⁺ᵐ term, so the degree of
f(x)g(x) will be less than n + m!
Clearly, this phenomenon will not occur if the coefficient ring R does not
have any zero-divisors. As will be explained further along in the chapter,
fields do not have any zero-divisors (i.e., they are integral domains). Hence
if F is a field and f(x) and g(x) are two nonzero polynomials in F[x], then
deg(f(x)g(x)) = deg(f(x)) + deg(g(x)). (In particular, this shows that if F
is any field, F[x] also does not have zero-divisors. Why?)
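The degree-drop phenomenon is easy to exhibit concretely. In this sketch (ours, not the text's; `degree_of_product_mod` is an assumed helper name), the zero-divisors 2 and 3 in Z/6Z kill the top coefficient of a product, while over the field Z/5Z no such collapse occurs:

```python
def degree_of_product_mod(p, q, n):
    """Return the degree of p(x)q(x) with coefficients taken mod n
    (polynomials are coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    nonzero = [k for k, c in enumerate(out) if c != 0]
    return max(nonzero) if nonzero else None

# Over Z/6Z, 2 and 3 are zero-divisors: (1 + 2x)(1 + 3x) = 1 + 5x + 6x^2,
# and 6 = 0 mod 6, so the product has degree 1, not 1 + 1 = 2.
print(degree_of_product_mod([1, 2], [1, 3], 6))   # 1
# Over the field Z/5Z the degrees always add: (1 + x)(1 + x) has degree 2.
print(degree_of_product_mod([1, 1], [1, 1], 5))   # 2
```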
Remarks on Example 2.22 The additive identity is (0, 0) and the multiplicative identity is (1, 1). What is the product of (1, 0) and (0, 1)? Of (2, 0)
and (0, 2)?
have used notation like Q[√2], Q[i], Z[1/2] to denote various rings that we
have studied. There is a reason for this notation: these are all examples of
rings generated by a subring and an element. We consider this notion here.
We will consider only commutative rings, even though the notion exists
for noncommutative rings as well. Accordingly, let R be a commutative ring,
and let S be a subring. (Must S be commutative as well?) Let a be any
element in R. (For instance, let R be the reals, let S be the rationals, and let
Similarly, the sum of, say, 2 and 1 + √2, which is 3 + √2, is not in Q ∪ {1 + √2}.)
One could then ask: If in general S ∪ {a} is not a subring of R, what are the
elements of R that you should adjoin to the set S ∪ {a} to get a set that is
actually a subring of R?
Notice that S[a] includes both S and a. Our arguments preceding the
lemma above show that any subring of R that contains both S and a must
contain all polynomial expressions in a with coefficients in S, that is, it must
contain S[a]. S[a] should thus be thought of as the smallest subring of R
that contains both S and a.
Here is an exercise: In the setup above, if two polynomial expressions
s₀ + s₁a + s₂a² + ··· + sₙaⁿ and s′₀ + s′₁a + s′₂a² + ··· + s′ₘaᵐ are equal (as
elements of R), can you conclude that n = m and sᵢ = s′ᵢ for i = 0, . . . , n?
(Hint: See the examples below.)
Now let us consider some examples:
Example 2.138
What, according to our definition above, is the subring of the reals
is just 4q₄, etc. Similarly, q₃(√2)³ is just 2q₃√2, q₅(√2)⁵ is just 4q₅√2,
etc. By collecting terms together, it follows that every polynomial
(1/4)(√2)³ can be rewritten as 2 + (5/2)√2.) Hence, the subring of the
reals generated by the rationals and √2 is the set of all real numbers of
Example 2.143
Chapter 3
Vector Spaces
3.1
The elements of V are called vectors and the elements of F are called
scalars.
Thus, R² and R³ are both vector spaces over R. Let us look at several examples of vector spaces that arise from considerations other than geometric ones:
Example 3.2
We have looked at R² and R³; why not generalize these and consider
R⁴, R⁵, etc.? These would of course correspond to higher-dimensional
worlds. It is certainly hard to visualize such spaces, but there is
no problem considering them in a purely algebraic manner. Recall
that every vector in R² can be described uniquely by the pair (a, b),
consisting of the x and y components of the vector. (Uniquely means
that the vector (a, b) equals the vector (a′, b′) if and only if a = a′ and
b = b′.) Similarly, every vector in R³ can be described uniquely by the
triple (a, b, c), consisting of the x, y, and z components of the vector.
Thus, R² and R³ can be described respectively as the set of all pairs
(a, b) and the set of all triples (a, b, c), where a, b, and c are arbitrary
real numbers. Proceeding analogously, for any positive integer n, we
will let Rⁿ denote the set of n-tuples (a₁, a₂, . . . , aₙ), where the aᵢ are
arbitrary real numbers. (As with R² and R³, the understanding here
is that two n-tuples (a₁, a₂, . . . , aₙ) and (a′₁, a′₂, . . . , a′ₙ) are equal if and
only if their respective components are equal, that is, a₁ = a′₁, a₂ = a′₂,
. . . , and aₙ = a′ₙ.) These n-tuples will be our vectors; how should
169
we add them? Recall that in R² we add the vectors (a, b) and (a′, b′)
by adding a and a′ together and b and b′ together, that is, by adding
componentwise.
Exercise 3.2.1
Deduce from the parallelogram law of addition of vectors in R²
that the sum of (a, b) and (a′, b′) is (a + a′, b + b′).
What should our scalars be? Just as in R² and R³, let us take our
scalars to be the field R. How about scalar multiplication? In R²,
the product of the scalar r and the vector (a, b) is (ra, rb), that is, we
multiply each component of the vector (a, b) by the real number r. (Is
that so? Check!) We will multiply scalars and vectors in Rⁿ in the
same way: we will decree that the product of the real number r and
the n-tuple (a₁, a₂, . . . , aₙ) is (ra₁, ra₂, . . . , raₙ).
Exercise 3.2.3
Check that this definition satisfies the axioms of scalar multiplication in Definition 3.1.
Thus, Rⁿ is a vector space over R.
Example 3.3
Now, why restrict the examples above to n-tuples of R? For any field
F, let Fⁿ stand for the set of n-tuples (a₁, a₂, . . . , aₙ), where the aᵢ
are arbitrary elements of F. Add two such n-tuples componentwise,
that is, define addition via the rule (a₁, a₂, . . . , aₙ) + (a′₁, a′₂, . . . , a′ₙ) =
(a₁ + a′₁, a₂ + a′₂, . . . , aₙ + a′ₙ). Take the field F to be the field of scalars,
and define scalar multiplication just as in Rⁿ: given an arbitrary f ∈ F
and an arbitrary n-tuple (a₁, a₂, . . . , aₙ), define their scalar product to
be the n-tuple (fa₁, fa₂, . . . , faₙ).
Exercise 3.3.1
Check that these definitions of vector addition and scalar multiplication make Fⁿ a vector space over F.
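The componentwise operations on Fⁿ can be sketched concretely, here with F = Q (a Python illustration of our own, not part of the text; `vadd` and `smul` are names we chose), with a spot-check of one of the distributivity axioms:

```python
from fractions import Fraction

def vadd(u, v):
    """Componentwise addition of two n-tuples over a field."""
    return tuple(a + b for a, b in zip(u, v))

def smul(f, u):
    """Scalar multiplication: multiply every component of u by f."""
    return tuple(f * a for a in u)

u = (Fraction(1, 2), Fraction(3), Fraction(-2))
v = (Fraction(1, 2), Fraction(0), Fraction(2))

assert vadd(u, v) == (Fraction(1), Fraction(3), Fraction(0))
assert smul(Fraction(2), u) == (Fraction(1), Fraction(6), Fraction(-4))
# distributivity axiom f(u + v) = fu + fv, spot-checked:
f = Fraction(5, 7)
assert smul(f, vadd(u, v)) == vadd(smul(f, u), smul(f, v))
print("axioms hold on these samples")
```

Of course, a finite spot-check is no substitute for the proof the exercise asks for; it only illustrates what the axioms say.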
Example 3.4
Similarly, for any field F, let
Example 3.5
Consider the ring Mn (R). Focusing just on the addition operation on
Mn (R), recall that (Mn (R), +) is an abelian group. (Remember, for
any ring R, (R, +) is always an abelian group.) We will treat the reals
as scalars. Given any real number r and any matrix (ai,j ) in Mn (R),
we will define their product to be the matrix (rai,j ). (See the notes
on page 242 for a comment on this product.) Verify that with this
definition, Mn (R) is a vector space over R. In a similar manner, if F
is any field, Mn (F ) will be a vector space over F .
Example 3.6
Consider the field Q[√2]. Then (Q[√2], +) is an abelian group (why?).
Think of the rationals as scalars. There is a very natural way of multi-
Example 3.7
Let us generalize Example 3.6. What we needed above were that
need the full force of the fact that Q[√2] is a field? No, all we need is
the fact that Q[√2] is a ring that contains the field Q; this is enough
numbers have a dual role here: when we see a real number r by itself,
we want to think of it as a vector, and when we see it in an expression
Example 3.10
Now think about this: Suppose V is a vector space over a field K.
Suppose F is a subfield of K. Then V is also a vector space over F !
Question 3.10.1
Why? What do you think the scalar multiplication ought to be?
(See the notes on page 242 for some remarks on this.)
Example 3.11
Here is an example that may seem pathological at first, but is not
really so! Consider the trivial abelian group V: this consists of a single
element, namely, the identity element 0V. The only addition rule here
is 0V + 0V = 0V, and it is easy to check that the set {0V} with the
addition rule above is indeed an abelian group. Now let F be any field.
Then V is a vector space over F with the product rule f · 0V = 0V.
There is only one vector in this space, namely 0V, although there are lots
of scalars! This vector space is known as the trivial vector space or the
zero vector space over F, and shows up quite naturally as the kernel of an
injective linear transformation (see Lemma 3.87 ahead, for instance).
Remark 3.12
Now observe that all these examples of vector spaces have the following
properties:
1. For any scalar f, f times the zero vector is just the zero vector.
2. For any vector v, the scalar 0 times v is the zero vector.
3. For any scalar f and any vector v, (−f) · v = −(f · v).
4. If v is a nonzero vector, then f · v = 0 for some scalar f implies
f = 0.
These properties somehow seem very natural, and one would expect
them to hold for all vector spaces. Just as in Remark 2.24, where we
considered a similar set of properties for rings, we would like these properties to be deducible from the vector space axioms themselves. This
would, among other things, convince us that our vector space axioms
are the correct ones, that is, they yield objects that behave more or
less like the examples above instead of objects that are rather pathological. As it turns out, our expectations are not misguided: these
properties are deducible from the vector space axioms, and therefore
do hold in all vector spaces. We will leave the verification of this to
the exercises (see Exercise 3.97).
3.2
sets of coordinate axes of the same vector space may have different numbers
of axes in each set! If either of these possibilities were to occur, we would not
have a unique number that we could assign as the dimension of the vector
space. As it turns out, neither of these can happen, and our second task is
to consider the impossibility of these two scenarios.
Let us turn to the first task. Focusing on R2 for convenience, let us denote
the vector with tip at the point (1, 0) by i, and the one with the tip at the
point (0, 1) by j. From vector calculus, we know that if we take an arbitrary
vector in R2 , say u, with its tip at (a, b), then the projection of u onto the
x-axis is just a times the vector i and the projection on the y-axis is just b
times the vector j. The parallelogram law then shows that u is the sum of
a i and b j, that is, u = a i + b j. Since u was an arbitrary vector in this
discussion, we find that every vector in R2 can be written as a scalar times i
added to another scalar times j. This example motivates two definitions.
Definition 3.13
Let V be a vector space over a field F. A linear combination of vectors
v₁, . . . , vₙ (or, an F-linear combination of vectors v₁, . . . , vₙ, if we wish
to emphasize the field over which the vector space is defined) is any
vector in V that can be written as f₁·v₁ + ··· + fₙ·vₙ for suitable
scalars f₁, . . . , fₙ.
(For instance, the vectors i + j, 2i − 3j = 2i + (−3)j, and i + 3√2 j are all linear combinations
of i and j.)
The other definition motivated by the example of the vectors i and j in
R² is the following:
Definition 3.14
Let V be a vector space over a field F. A subset S of V is said to span
V (or S is said to be a spanning set for V) if every vector v ∈ V can be
written as Σᵢ₌₁ⁿ fᵢvᵢ for some integer n ≥ 1, some choice of vectors v₁,
Remark 3.17
By convention, the empty set is taken as a spanning set for the zero
vector space. Moreover, by convention, the trivial space is the only
space spanned by the empty set. This convention will be useful later,
when defining the dimension of the zero vector space.
Example 3.19
For an example of a spanning set with redundancy in it, we do not have
to look very far: Going back to R2 , let us write w for the vector with
Question 3.19.1
This is of course very trivial to see: the vector with tip at (a, b)
can be written as the sum a·i + b·j + 0·w. More interestingly,
(1/√2) j + w?
spanning set for R². To see this, note that j = i + 2w. Thus, any
Example 3.20
Let S be a spanning set for a vector space V. If v is any vector in V
that is not in S, then S ∪ {v} is also a spanning set for V in which
there is redundancy. More generally, if T is any nonempty subset of V
that is disjoint from S, then S ∪ T is also a spanning set for V in which
there is redundancy.
Exercise 3.20.1
Convince yourself of this!
For instance, we have seen in Example 3.16 above that the set
{1, x, x², . . . } is a spanning set for the polynomial ring R[x] considered as a vector space over R. Taking T = {1 + x, 1 + x + x², 1 +
x + x² + x³, . . . }, it follows that the set U = {1, x, 1 + x, x², 1 + x +
x², x³, 1 + x + x² + x³, . . . } is a spanning set for R[x] in which there is
redundancy.
Exercise 3.20.2
Continuing with the example of the polynomial ring R[x] considered as a vector space over R, show that there is no redundancy
in the spanning set {1, x, x², . . . }.
Remember, we are trying to formulate an algebraic definition of coordinate axes. Our intuition from Example 3.19, as well as Example 3.16 and
Exercise 3.20.2 above, would suggest that a set of coordinate axes, first,
should span our vector space, and next, should not have more vectors than
are needed to span the space, that is, should not have redundancy in it.
It would be very useful to have alternative characterizations of redun-
Now assume that m > 1. Then, dividing by fₘ and moving vₘ to the other
side, we find
Exercise 3.22.1
Show that if v is a nonzero vector, then the set {v} must be linearly
independent. See Property (4) in Remark 3.12.
Exercise 3.22.2
Show that two vectors are linearly dependent if and only if one is a
scalar multiple of the other.
Exercise 3.22.3
Are the following subsets of the given vector spaces linearly independent? (Very little computation, if any, is necessary.)
1. In R³: {(1, 1, 1), (10, 20, 30), (23, 43, 63)}
2. In R³: {(1, 0, 0), (2, 2, 0), (3, 3, 3)}
3. In R[x]: {(x + 1)³, x² + x, x³ + 1}
Exercise 3.22.4
We know that C² is a vector space over both C (Example 3.3) and
over R (Example 3.10). Show that v = (1 + i, 2) and w = (1, 1 − i)
are linearly dependent when C² is considered as a C vector space, but
linearly independent when considered as an R vector space.
Also, let us illustrate the meaning of the last two sentences of Definition 3.22 above. Let us consider the following:
Example 3.23
Consider the subset S = {1, x, x², x³, . . . } of R[x], with R[x] viewed as
a vector space over R (we have already considered this set in Examples
3.16 and 3.20 above). This is, of course, an infinite set. Consider any
nonempty finite subset of S, for instance, the subset {x, x⁵, x¹⁷}, or
the subset {1, x, x², x²⁰}, or the subset {1, x³, x⁹⁹, x¹⁰⁰, x¹⁰⁰¹, x¹⁰⁰⁴}. In
general, a nonempty finite subset of S would contain n elements (for
some n ≥ 1), and these elements would be various powers of x, say x^{i₁},
x^{i₂}, . . . , x^{iₙ}. These elements are definitely linearly independent, since
if a₁x^{i₁} + ··· + aₙx^{iₙ} is the zero polynomial, then by the definition of the
zero polynomial, each aᵢ must be zero. This is true regardless of which
finite subset of S we take; all that would be different in different finite
subsets is the number of elements (the integer n) and the particular
powers of x (the integers i₁ through iₙ) chosen. Thus, according to our
definition, the set S is linearly independent.
On the other hand, consider the subset S′ = S ∪ {1 + x}. Any finite
subset of S′ that does not contain all three vectors 1, x, and 1 + x will
be linearly independent (check!). However, this alone is not enough for
you to conclude that S′ is a linearly independent set. For the subset
{1, x, 1 + x} of S′ is linearly dependent: 1·1 + 1·x + (−1)·(1 + x) = 0.
By the definition above, S′ is a linearly dependent set.
Remark 3.24
Note that the zero vector is linearly dependent: for example, the
nonzero scalar 1 multiplied by 0V gives 0V . Thus, if V is the zero
vector space, then {0V } is a linearly dependent spanning set, so by
Lemma 3.21, this set has to have redundancy. Hence, some subset of
{0V } must already span the trivial space. But the only subset of {0V }
is the empty set, hence this lemma tells us that the empty set must
span {0V }. This is indeed consistent with the convention adopted in
Remark 3.17 above.
We are now ready to construct the algebraic analog of coordinate axes.
We will choose as our candidate any set of vectors that spans our vector
space and in which there is no redundancy. Moreover, instead of using the
term coordinate axes (which is inspired by the geometric examples of R2
and R3 ), we will coin a new term: the algebraic analog of coordinate axes
will be called a basis of our vector space. Since redundancy is equivalent to
Exercise 3.26.2
Show that the set consisting of the vectors i and w = (1/√2, 1/√2)
is also a basis for R2.
Example 3.27
Recall the definition of the vector space Rn in Example 3.2. Let ei
stand for the vector whose components are all zero except in the i-th
slot, where the component is 1. (For example, in R4 , e1 = (1, 0, 0, 0),
e3 = (0, 0, 1, 0), etc.). Then the ei form a basis for Rn as an R-vector
space. They clearly span Rn since any n-tuple (r1 , . . . , rn ) ∈ Rn is just
r1 e1 + · · · + rn en . As for the linear independence, assume that r1 e1 +
· · · + rn en = 0 for some scalars r1 , . . . , rn . Since the sum r1 e1 + · · · + rn en
is just the vector (r1 , . . . , rn ), we find (r1 , . . . , rn ) = (0, . . . , 0), so each
ri must be zero.
This basis is known as the standard basis for Rn . Of course, in R2 , e1
and e2 are more commonly written as i and j, and in R3 , e1 , e2 , and e3
are more commonly written as i, j, and k.
Exercise 3.27.1
Show that the vectors e1 , e2 − e1 , e3 − e2 , . . . , en − en−1 also form
a basis for Rn .
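As a numerical sanity check of this exercise (not a proof, and not part of the text), one can verify the case n = 4 with numpy: four independent vectors in R^4 are automatically a basis.

```python
import numpy as np

# The vectors e1, e2 - e1, e3 - e2, e4 - e3 of R^4, as rows of a matrix:
B = np.array([
    [1, 0, 0, 0],
    [-1, 1, 0, 0],
    [0, -1, 1, 0],
    [0, 0, -1, 1],
], dtype=float)

# Full rank means independence; four independent vectors in a
# 4-dimensional space form a basis.
print(np.linalg.matrix_rank(B))       # 4
print(round(abs(np.linalg.det(B))))   # 1 (nonzero determinant)
```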
Example 3.28
The set consisting of the elements 1 and √2 forms a basis for Q[√2]
as a vector space over Q. (We have seen in Example 3.15 above that
Exercise 3.29.1
Prove that the set B = {1, 1 + x, 1 + x + x2 , 1 + x + x2 + x3 , . . . }
is also a basis for R[x] as a vector space over R.
(Hint: Writing v0 = 1, v1 = 1 + x, v2 = 1 + x + x2 , etc., note
that for i = 1, 2, . . . , x^i = vi − vi−1 . It follows that all powers of
x (including x0 ) are expressible as linear combinations of the vi .
Why does it follow from this that the vi span R[x]? As for linear
independence, suppose that for some finite collection vi1 , . . . , vik
(with i1 < i2 < · · · < ik ), there exist scalars r1 , . . . , rk such that
r1 vi1 + · · · + rk vik = 0. What is the highest power of x in this
expression? In how many of the elements vi1 , . . . , vik does it show
up? What is its coefficient? So?)
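The hint's change-of-basis idea can be illustrated numerically in low degree. Below, polynomials of degree at most 4 are identified with their coefficient vectors (constant term first), and column i of the matrix P holds the coefficients of v_i = 1 + x + · · · + x^i. The use of numpy, and the degree cutoff, are illustrative assumptions, not part of the text.

```python
import numpy as np

# Column i holds the coefficients of v_i = 1 + x + ... + x^i (degree <= 4).
P = np.array([[1 if row <= col else 0 for col in range(5)]
              for row in range(5)], dtype=float)

# P is upper triangular with 1s on the diagonal, hence invertible: the v_i
# and the monomials 1, x, ..., x^4 span the same space.
print(np.linalg.matrix_rank(P))  # 5

# Recovering x^3 = v_3 - v_2 as in the hint: solve P @ c = (coeffs of x^3).
c = np.linalg.solve(P, np.array([0.0, 0.0, 0.0, 1.0, 0.0]))
# c gives the coordinates of x^3 in the basis {v_0, ..., v_4}: -v_2 + v_3.
```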
Example 3.30
Consider Fn [x] as an F -vector space (see Example 3.9 above). You
should easily be able to describe a basis for this space and prove that
your candidate is indeed a basis.
Example 3.31
The set {1, √2, √3, √6} forms a basis for Q[√2, √3] as a vector space
over Q. You have seen in Example 2.34 that, by our very definition of
Example 3.32
The n2 matrices ei,j (see Exercise 2.16.5 of Chapter 2 for this notation)
are a basis for Mn (R).
Exercise 3.32.1
Prove this! To start you off, here is a hint: In M2 (R), for example,
a matrix such as

[ 1 2 ]
[ 3 4 ]

can be written as the linear combination e1,1 + 2e1,2 + 3e2,1 + 4e2,2 .
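The hint can be carried out literally in numpy. The helper e(i, j) below uses 0-based indices (unlike the 1-based e_{i,j} of the text); it is an illustration, not the book's notation.

```python
import numpy as np

def e(i, j, n=2):
    """The n-by-n matrix with a 1 in position (i, j) and 0s elsewhere."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

# [[1, 2], [3, 4]] = 1*e11 + 2*e12 + 3*e21 + 4*e22 (0-indexed below).
A = 1 * e(0, 0) + 2 * e(0, 1) + 3 * e(1, 0) + 4 * e(1, 1)
print(A)
```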
Example 3.33
Certain linear combinations of basis vectors also give us a basis:
Exercise 3.33.1
Exercise 3.33.2
Now show that if V is any vector space over any field with basis
{v1 , v2 }, then the vectors v1 , v1 + v2 also form a basis. How
would you generalize this pattern to a vector space that has a
basis consisting of n elements {v1 , v2 , . . . , vn }? Prove that your
candidate forms a basis.
Exercise 3.33.3
Let V be a vector space with basis {v1 , . . . , vn }. Study Exercise
3.27.1 and come up with a linear combination of the vi , similar
to that exhibited in that exercise, that also forms a basis for V .
Prove that your candidate forms a basis.
Example 3.34
Consider the vector space ∏_{i=0}^∞ F of all infinite-tuples
(f0 , f1 , f2 , . . . ) of elements of a field F . You may find
it hard to describe explicitly a basis for this space. However, let ei (for
i = 0, 1, . . . ) be the infinite-tuple with 1 in the position indexed by i
and zeros elsewhere. (Thus, e0 = (1, 0, 0, . . . ), e1 = (0, 1, 0, . . . ), etc.)
Exercise 3.34.1
Why is the set S = {e0 , e1 , e2 , . . . } not a basis for ∏_{i=0}^∞ F ? Is S at
least linearly independent? (See the notes on page 243 for some
comments on this example.)
Example 3.35
The empty set is a basis for the trivial vector space. This follows from
Remark 3.17 (see also Remark 3.24), since the empty set spans the
trivial space, and since the empty set is vacuously linearly independent.
Here is a result that describes a useful property of bases and is very easy
to prove.
Proposition 3.36. Let V be a vector space over a field F , and let S be a
basis. Then in any expression of a vector v ∈ V as v = f1 b1 + · · · + fn bn for
suitable vectors bi ∈ S and nonzero scalars fi , the bi and the fi are uniquely
determined.
Proof. What we need to show is that if v is expressible as f1 b1 + · · · + fn bn
for suitable vectors bi ∈ S and nonzero scalars fi , and is also expressible as
g1 c1 + · · · + gm cm for suitable vectors ci ∈ S and nonzero scalars gi , then
n = m, and after relabelling if necessary, each bi = ci and each fi = gi
(i = 1, . . . , n). To do this, assume, after relabelling if necessary, that b1 = c1 ,
. . . , bt = ct (for some t ≤ min(m, n)), and that the sets {bt+1 , . . . , bn } and
{ct+1 , . . . , cm } are disjoint. Then, bringing all terms to one side, we may
rewrite our equality as

(f1 − g1 )b1 + · · · + (ft − gt )bt + ft+1 bt+1 + · · · + fn bn
− gt+1 ct+1 − · · · − gm cm = 0

By the linear independence of the subset {b1 , . . . , bt , bt+1 , . . . , bn , ct+1 , . . . , cm }
of S, we find that f1 = g1 , . . . , ft = gt , ft+1 = · · · = fn = 0, gt+1 = · · · =
gm = 0. But since the scalars were assumed to be nonzero, ft+1 = 0 and
gt+1 = 0 are impossible, so, to begin with, there must have been no ft+1
or gt+1 to speak of! Thus, t must have equaled n, and similarly, t must
have equaled m. From this, we get n = m (= t), and then, by our very
definition of t, we find that b1 = c1 , . . . , bn = cn . Coupled with our derivation
that f1 = g1 , . . . , ft = gt , we have our desired result.
(Zorn's Lemma, in spite of its name, is really not a lemma, but an axiom of logic. See the Appendix.)
For a first introduction to abstract algebra, any usage
of Zorn's Lemma can seem dense and somewhat foreboding (what else will
the Gods of Logic hurl at us?), so we will relegate the full proof to the
Appendix (see Theorem B.7 there). However, to help
build a more concrete feel for the existence of bases, we will also give a proof
of the existence of a basis in the special case when we know that the vector
space in question has a finite spanning set.
We will assume that our vector space is not the trivial space, since we
already know that the trivial space has a basis (see Example 3.35 above).
Proposition 3.37. Let V be a vector space over a field F . Let S be a spanning
set for V , and assume that S is a finite set. Then some subset of S is a basis
of V . In particular, every vector space with a finite spanning set has a basis.
Proof. Note that S is nonempty, since V has been assumed to not be the
trivial space (see Remark 3.17). If the zero vector appears in S, then the
set S′ = S \ {0} that we get by throwing out the zero vector will still span
V (why?) and will still be finite. Any subset of S′ will also be a subset of
S, so if we can show that some subset of S′ must be a basis of V , then we
would have proved our theorem. Hence, we may assume that we are given
a spanning set S for V that is not only finite, but one in which none of the
vectors is zero.
Let S = {v1 , v2 , . . . , vn } for some n ≥ 1. If there is no redundancy in S,
then there is nothing to prove: S would be a basis by the very definition of a
basis. So assume that there is redundancy in S. By relabelling if necessary,
we may assume that vn is redundant. Thus, S1 = {v1 , v2 , . . . , vn−1 } is itself
a spanning set for V . Once again, if there is no redundancy in S1 , then we
would be done; S1 would be a basis. So assume that there is redundancy
in S1 . Repeating the arguments above and shrinking our set further and
195
further, we find that this process must stop somewhere, since at worst, we
would shrink our spanning set down to one vector, say Sn1 = {v1 }, and a set
containing just one nonzero vector must be linearly independent (Exercise
3.22.1), so Sn−1 would form a basis. (Note that this is only the worst case; in
actuality, this process may stop well before we shrink our spanning set down
to just one vector.) When this process stops, we would have a subset of S
that would be a basis of V .
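The shrinking process in this proof can be sketched as a small program. The greedy routine below (an illustration in numpy, not the book's notation) discards one redundant vector at a time, using the rank to test whether the span is unchanged.

```python
import numpy as np

def shrink_to_basis(vectors):
    """Remove redundant vectors one at a time, as in the proof of
    Proposition 3.37: drop any vector whose removal leaves the span intact."""
    vecs = [np.array(v, dtype=float) for v in vectors]
    i = 0
    while i < len(vecs):
        rest = vecs[:i] + vecs[i + 1:]
        if rest and np.linalg.matrix_rank(rest) == np.linalg.matrix_rank(vecs):
            vecs = rest          # vecs[i] was redundant; discard it
        else:
            i += 1               # vecs[i] is needed; keep it
    return vecs

S = [(1, 0), (2, 0), (0, 1), (1, 1)]     # spans R^2 with redundancy
print(len(shrink_to_basis(S)))           # 2
```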
Remark 3.38
Notice that to prove that bases exist (in the special case where V has
a finite spanning set) what we really did was to show that every finite
spanning set of V can be shrunk down to a basis of V . This result is
true more generally: Given any spanning set S of a vector space V (in
other words, not just a finite spanning set S), there exists a subset S′
of S that forms a basis of V . See the notes on page 345 and the
Appendix.
Having proved that every vector space has a basis, we now need to show
that different bases of a vector space have the same number of elements in
them. (Remember our original program. We wish to measure the size of
a vector space, and based on our examples of R2 and R3 , we think that a
good measure of the size would be the number of coordinate axes, or basis
elements, that a vector space has. However, for this to make sense, we need
to be guaranteed that every vector space has a basis (we just convinced
ourselves of this) and that different bases of a vector space have the same
number of elements in them.) In preparation, we will prove an important
lemma. Our desired results will fall out as corollaries.
We continue to assume that our vector space is not the trivial space.
Lemma 3.39 (Exchange Lemma). Let V be a vector space over a field F , and
let B be a finite subset of V that spans V . If C is a finite linearly independent
subset of V , then |C| ≤ |B|.
We are now ready to prove that different bases of a given vector space
have the same number of elements. We will distinguish between two cases:
vector spaces having bases with finitely many elements, and those having
bases with infinitely many elements. We will take care of the infinite case
first.
Corollary 3.40. If a vector space V has one basis with an infinite number
of elements, then every other basis of the vector space also has an infinite
number of elements.
Proof. Let S be the basis of V with an infinite number of elements (that
exists by hypothesis), and let T be any other basis. Assume that T has only
finitely many elements, say m. Since S has infinitely many elements, we can
certainly pick m + 1 vectors from it. So pick any m + 1 vectors from S and
denote this selected set of vectors by S 0 . Since the vectors in S 0 are part of
the basis S, they are certainly linearly independent. We may think of the
set T as the set B of Lemma 3.39 (after all, T being a basis, will span V ),
and we may think of the set S 0 as the set C of the same lemma (after all,
S 0 is linearly independent). The lemma then shows that m + 1 m, a clear
contradiction. Hence T must also be infinite!
We settle the finite case now. Recall that we are assuming that our vector
space is not the trivial space. The trivial space has only one basis anyway,
the empty set (see Remark 3.24).
Corollary 3.41. If a vector space V has one basis with a finite number of
elements n, then every basis of V contains n elements.
Proof. Let S = {x1 , . . . , xn } be the given basis of V with n elements, and
let T be any other basis. If T were infinite, Corollary 3.40 above says that
S must also be infinite. Since this is not true, we find that T must have
a finite number of elements. So, assume that T has m elements, say T =
{y1 , . . . , ym }. We wish to show that m = n. We may think of S as the set
B of Lemma 3.39, since it clearly spans V . Also, we may think of the set
T as the set C of the lemma, since T , being a basis, is certainly linearly
independent. Then the lemma says that m must be less than or equal to n.
Now let us reverse this situation: let us think of T as the set B, and let
us think of S as the set C. (Why can we do this?) Then the lemma says
that n must be less than or equal to m. Thus, we have m ≤ n and n ≤ m,
so we find that n = m.
We are finally ready to make the notion of the size of a vector space
precise!
Definition 3.42
A (nontrivial) vector space V over a field F is said to be finite-dimensional (or finite-dimensional over F ) if it has a basis with a finite
number of elements in it; otherwise, it is said to be infinite-dimensional
(or infinite-dimensional over F ). If V is finite-dimensional, the dimension of V is defined to be the number of elements in any basis. If V
is infinite-dimensional, the dimension of V is defined to be infinite. If
V has dimension n, then V is also referred to as an n-dimensional
space (or as being n-dimensional over F ); this is often written as
dimF (V ) = n.
Remark 3.43
By convention, the dimension of the trivial space is taken to be zero.
This is consistent with the fact that it has as basis the empty set, which
has zero elements.
Let us consider the dimensions of some of the vector spaces in the examples on page 168 (see also the examples on page 188, where we consider bases
of these vector spaces). R2 and R3 have dimensions 2 and 3 (respectively) as
vector spaces over R.
Question 3.44
What is the dimension of Rn ?
Question 3.46
By Example 3.7, C is a vector space over C, and as well, over R. What
is the dimension of C as a C-vector space? As an R-vector space?
With the definition of dimension under our belt, the following is another
corollary to the Exchange Lemma (Lemma 3.39):
Corollary 3.47. Let V be an n-dimensional vector space. Then every subset S
of V consisting of more than n elements is linearly dependent. (Alternatively,
if S is a linearly independent subset of V then S has at most n elements.)
Proof. Assume, to the contrary, that V contains a linearly independent subset S that contains more than n elements. Therefore, we can find n + 1
distinct elements v1 , v2 , . . . , vn+1 in S. Write C for the set {v1 , v2 , . . . , vn+1 }
and let B be any basis. By the very definition of dimension, B must have
n elements. Now apply Lemma 3.39 to the sets B and C: we find that
n + 1 ≤ n, which is a contradiction. Hence every subset of V consisting of
more than n elements must be linearly dependent, or, what is the same, any
linearly independent subset of V must have at most n elements.
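For a concrete instance of this corollary, any four vectors in the 3-dimensional space R^3 must be linearly dependent, since the rank of the matrix they form cannot exceed 3. A numerical check with numpy (the particular vectors are an arbitrary choice for illustration):

```python
import numpy as np

# Four vectors in R^3, as rows; their rank can be at most 3,
# so by Corollary 3.47 they are linearly dependent.
vectors = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 10],
    [1, 1, 1],
], dtype=float)
print(np.linalg.matrix_rank(vectors))  # 3, which is less than 4
```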
Similarly, with the definition of dimension under our belt, the following
is an easy corollary of Proposition 3.37:
Corollary 3.48. Let V be an n-dimensional vector space. Then any spanning
set for V has at least n elements.
Proof. Let S be a spanning set, and assume that |S| = t < n. By Proposition
3.37, some subset of S is a basis of V . Since this subset can have at most t
elements, it follows that the dimension of V , which is the size of this basis,
is at most t. This contradicts the fact that the dimension of V is n.
Example 3.50
For example, in R2 , consider the linearly independent set {i}. The contention of the theorem above is that one can adjoin one other vector to
this to get a basis for R2 : for instance, the set {i, j} is a basis, and so,
for that matter, is the set {i, w}. (Here, just as earlier in the chapter,
w denotes the vector (1/√2, 1/√2).)
Hence, as V is n-
Of course, the statements in both (5b) and (5e) above hold even when
V is infinite-dimensional.
3.3 Subspaces and Quotient Spaces
The idea behind subspaces is very similar to the idea behind subrings,
while the idea behind quotient spaces is very similar to the idea behind
quotient rings. (There is one key difference: quotient rings are obtained
by modding out rings by ideals; modding out by subrings will not work.
However, quotient spaces can be made by modding out by subspaces. We
will see this later in the chapter.)
We will consider subspaces first:
Definition 3.53
Given a vector space V over a field F , a subspace of V is a nonempty
subset W of V that is closed with respect to vector addition and scalar
multiplication, such that with respect to this addition and scalar multiplication, W is itself a vector space (that is, W satisfies all the axioms
of a vector space).
Now, we saw in the context of rings (Exercise 2.28 in Chapter 2) that
one could have a subset S of a ring R such that S is closed with respect to
addition and multiplication, and yet S is not a subring of R. It turns out
that in the case of vector spaces, it is enough for a (nonempty) subset W
of a vector space V to be closed with respect to vector addition and scalar
multiplication; W will then automatically satisfy all the axioms of a vector
space. This is the content of Theorem 3.55 below.
But first, a quick exercise, which is really a special case of Exercise 3.5 in
Chapter 4 ahead:
We have the following, which captures both closure conditions of the test
in Theorem 3.55 above:
Corollary 3.56. Let V be a vector space over a field F , and let W be a
nonempty subset of V that is closed under linear combinations, i.e., for all
w1 , w2 in W and all f1 , f2 in F , the element f1 w1 + f2 w2 is also in W . Then
W is a subspace of V . Conversely, if W is a subspace, then W is closed
under linear combinations.
Proof. Assume that W is closed under linear combinations. Taking f1 =
f2 = 1, we find that w1 + w2 is in W for all w1 , w2 in W , i.e., W is closed
under addition. Taking f2 = 0 we find f1 w1 is in W for all w1 in W and
all f1 in F , i.e., W is closed under scalar multiplication. Thus, by Theorem
3.55, W is a subspace. Conversely, if W is a subspace, then for w1 , w2 in W
and all f1 , f2 in F , f1 w1 and f2 w2 are both in W because W is closed under
scalar multiplication, and then, f1 w1 + f2 w2 is in W because W is closed
under vector addition. Hence, W is closed under linear combinations.
Here are some examples of subspaces. In each case, check that the conditions of Theorem 3.55 apply.
Example 3.57
The set consisting of just the element 0 is a subspace.
Question 3.57.1
Why?
Q[√2] is a subspace of the Q-vector space Q[√2, √3]. Of course, we
first way (viewing Q[√2] as a subspace of Q[√2, √3]), we first think of
Q[√2, √3]. Doing so, the vector sum of a + b√2 + 0√3 + 0√6 (= a + b√2)
Example 3.62
The example above generalizes as follows: Suppose F ⊆ K ⊆ L are
fields. The field extension L/F makes L an F -vector space. Since
K is closed with respect to vector addition and scalar multiplication,
K becomes a subspace of L. But the field extension K/F exhibits K
directly as an F -vector space. The two F -vector space structures on
K, one that we get from viewing K as a subspace of the F -vector space
L and the other that we get directly from the field extension K/F , are
the same.
Example 3.63
The direct sum ⊕_{i=0}^∞ F , consisting of those infinite-tuples in
∏_{i=0}^∞ F with only finitely many nonzero entries, is a subspace of
∏_{i=0}^∞ F .
Exercise 3.63.1
Prove this!
Exercise 3.63.2
Show that the set S = {e0 , e1 , e2 , . . . } is a basis for ⊕_{i=0}^∞ F .
Example 3.64
For any field F , F [x2 ] (that is, the set of all polynomials of the form
∑_{i=0}^{n} fi x^{2i} , n ≥ 0) is a subspace of F [x].
Question 3.64.1
What is the dimension of this subspace? Can you discover a basis
for this subspace?
Example 3.65
Let V be a vector space over a field F , and let S be any nonempty
subset of V .
Definition 3.65.1. The linear span of S is defined as the set
of all linear combinations of elements of S, that is, the set of
all vectors in V that can be written as c1 s1 + c2 s2 + · · · + ck sk
for some integer k ≥ 1, some scalars ci , and some vectors
si ∈ S.
Exercise 3.65.1
Show that the linear span of S is a subspace of V .
Question 3.66
Which of the following are subspaces of R3 ?
1. {(a, b, c) | a + 3b = c}
2. {(a, b, c) | a = b2 }
3. {(a, b, c) | ab = 0}
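A quick numerical probe of these three sets (illustrative only: a failed closure check disproves the subspace property, while passing checks are merely evidence, not a proof):

```python
def in_set1(v):  # {(a, b, c) | a + 3b = c}
    return v[0] + 3 * v[1] == v[2]

def in_set2(v):  # {(a, b, c) | a = b^2}
    return v[0] == v[1] ** 2

def in_set3(v):  # {(a, b, c) | ab = 0}
    return v[0] * v[1] == 0

v = (1, 1, 4)                                    # lies in set 1
print(in_set1(tuple(2 * x for x in v)))          # True: scaling stays in set 1

w = (1, 1, 0)                                    # lies in set 2, since 1 = 1^2
print(in_set2(tuple(2 * x for x in w)))          # False: (2,2,0) has 2 != 2^2

u1, u2 = (1, 0, 0), (0, 1, 0)                    # both lie in set 3
print(in_set3(tuple(a + b for a, b in zip(u1, u2))))  # False: (1,1,0) has ab = 1
```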
We turn our attention now to quotient spaces. Recall how we constructed
the quotient ring R/I given a ring R and an ideal I: we first defined an
equivalence relation on R by a ∼ b if and only if a − b ∈ I (see page 109 in
Chapter 2). We found that the equivalence class of an element a is precisely
the coset a + I (Lemma 2.75 in that chapter). We then defined the ring R/I
to be the set of equivalence classes of R under the naturally induced definitions
[a] + [b] = [a + b] and [a][b] = [ab] (see Definition 2.76 in that chapter). Of
course, we had to check that our operations were well-defined and that we
indeed obtained a ring by this process (see Lemma 2.77 and Theorem 2.79
in that chapter). We will follow the same approach here.
So, given a vector space V over a field F , and a subspace W , we define
an equivalence relation on V by v ∼ w if and only if v − w ∈ W . Exactly
as on page 109, we can see that this is indeed an equivalence relation. We
define the coset a + W to be the set of all elements of the vector space of the
form a + w as w varies in W , and we call this the coset of W with respect to
a. We have the following, whose proof is exactly as in Lemma 2.75 of Chapter
2 and is therefore omitted:
Lemma 3.67. The equivalence class [a] is precisely the coset a + W .
As with quotient rings, we will denote the set of equivalence classes of V
by V /W , whose members we will denote as both [a] and a + W . We define an
3.4 Linear Transformations
Remark 3.75
As with ring homomorphisms, there are some features of this definition
that are worth noting:
1. In the equation f (u) + f (v) = f (u + v), note that the operation
on the left side represents vector addition in the vector space X,
while the operation on the right side represents addition in the
vector space V .
2. Similarly for the equation rf (u) = f (ru): the operation on the
left side represents scalar multiplication in X, while the operation
on the right side represents scalar multiplication in V .
3. By the very definition of a function, f is defined on all of V ;
however, the image of V under f need not be all of X, i.e., f
need not be surjective (see Example 3.83 or Example 3.84 for
instance, although such examples are really very easy to write
down). However, the image of V under f is not an arbitrary
subset of X; the definition of a linear transformation ensures that
Remark 3.78
Here is another way to prove the statement of the lemma above: Pick
any v V . Then, 0V = 0F v, so f (0V ) = f (0F v) = 0F f (v) = 0X .
(Here, the first equality is due to Remark 3.12.2, and the last but one
equality is because f (rv) = rf (v) for any scalar r since f is a linear
transformation.)
Definition 3.79
Given a linear transformation f : V → X between two F -vector spaces,
the kernel of f is the set {u ∈ V | f (u) = 0X }. It is denoted ker(f ).
As in the case of kernels of ring homomorphisms, the following statement
should come as no surprise:
Proposition 3.80. Let V and X be vector spaces over a field F . The kernel
of a linear transformation f : V → X is a subspace of V .
Proof. By Corollary 3.56, it is sufficient to check that ker(f ) is a nonempty
subset of V that is closed under linear combinations. Since 0V ∈ ker(f )
(Lemma 3.77), ker(f ) is nonempty. Now, for any w1 , w2 in ker(f ) and any
r1 , r2 in F , we find f (r1 w1 +r2 w2 ) = r1 f (w1 )+r2 f (w2 ) = r1 0X +r2 0X = 0X .
Hence r1 w1 +r2 w2 is indeed in the kernel of f , so ker(f ) is closed under linear
combinations.
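For a concrete instance of this proposition, take the linear map v ↦ Av from R^3 to R given by the 1-by-3 matrix below (an arbitrary choice for illustration, not from the text): linear combinations of kernel vectors remain in the kernel.

```python
import numpy as np

# The linear map f(v) = A @ v from R^3 to R^1; ker(f) is the plane
# of vectors whose coordinates sum to zero.
A = np.array([[1.0, 1.0, 1.0]])
w1 = np.array([1.0, -1.0, 0.0])     # A @ w1 = 0, so w1 is in ker(f)
w2 = np.array([0.0, 1.0, -1.0])     # likewise w2
r1, r2 = 3.0, -5.0
print(A @ (r1 * w1 + r2 * w2))      # the combination is still in the kernel
```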
Remark 3.81
As in the case of ring homomorphisms, for any linear transformation
f : V → X between two F -vector spaces, we will have f (−v) = −f (v).
One proof is exactly the same as in Remark 2.88 in Chapter 2, and this
is not surprising: this is really a proof that in any group homomorphism
f from a group G to a group H, f (g−1 ) will equal (f (g))−1 for all
g ∈ G (see Corollary 4.68 in Chapter 4). Another proof, of course, is
to invoke scalar multiplication and Remark 3.12.3: f (−v) = f ((−1)v) =
(−1)f (v) = −f (v).
We are now ready to study examples of linear transformations. The
first example is really the master-example: it provides an algorithm for con-
Exercise 3.82.1
Which vector space axioms were used in the two chains of equalities in the proof above that f (u + v) = f (u) + f (v) and
f (rv) = rf (v)?
Exercise 3.82.2
Would the proof be any more complicated if V were not assumed
to be finite-dimensional? (Work it out!)
Conversely, if f is any linear transformation from V to X and if f (bi ) =
wi (i = 1, . . . , n), then, since f is a linear transformation, f (r1 b1 + · · · +
rn bn ) = r1 f (b1 ) + · · · + rn f (bn ) = r1 w1 + · · · + rn wn . Since any vector
in V is a linear combination of the vectors b1 , . . . , bn , this formula
completely determines what f sends each vector in V to.
Now let us carry this one step further. Let f : V → X be a linear
transformation, and suppose (for simplicity) that X is also finite-dimensional, with some basis {c1 , . . . , cm }. Thus, every vector w ∈ X
can be uniquely expressed as s1 c1 + · · · + sm cm for suitable scalars si . In
particular, each of the vectors wi (= f (bi )) therefore can be expressed
as a linear combination of the cj as follows:

w1 = p1,1 c1 + · · · + p1,m cm
w2 = p2,1 c1 + · · · + p2,m cm
⋮
wn = pn,1 c1 + · · · + pn,m cm
(The pi,j are scalars. Note how they are indexed: pi,j stands for the
coefficient of cj in the expression of wi as a linear combination of the
cj .) Now, writing u = r1 b1 + · · · + rn bn , we find

f (u) = r1 (p1,1 c1 + · · · + p1,m cm )
+ r2 (p2,1 c1 + · · · + p2,m cm )
⋮
+ rn (pn,1 c1 + · · · + pn,m cm )
Now let us regroup the right side so that all the scalars that are attached
to the basis vector c1 are together, all scalars attached to the basis
vector c2 are together, etc. Doing so, we find
f (u) = (p1,1 r1 + p2,1 r2 + · · · + pn,1 rn ) c1
+ (p1,2 r1 + p2,2 r2 + · · · + pn,2 rn ) c2
⋮
+ (p1,m r1 + p2,m r2 + · · · + pn,m rn ) cm
(Study this relation carefully: note how the indices of the pi,j behave:
pi,j multiplies ri and is attached to cj . Notice that across each row of
this equation, it is the first index of pi,j that changes: this is in contrast
to the behavior of the indices in the previous equations. There, it was
the second index of pi,j that changed in each row.)
Now suppose that we adopt the convention that we will write any vector
u = r1 b1 + · · · + rn bn of V as the column vector

u = [ r1 ]
    [ r2 ]
    [ ⋮  ]
    [ rn ]

and any vector w ∈ X, w = s1 c1 + · · · + sm cm , as the column vector

w = [ s1 ]
    [ s2 ]
    [ ⋮  ]
    [ sm ]
Let us rewrite our equation for f (u) above in the form f (u) = s1 c1 +
· · · + sm cm for suitable scalars si . Since the coefficient of c1 in f (u)
is p1,1 r1 + p2,1 r2 + · · · + pn,1 rn (see the equation above), we find s1 =
p1,1 r1 + p2,1 r2 + · · · + pn,1 rn . Similarly, since the coefficient of c2 in f (u)
is p1,2 r1 + p2,2 r2 + · · · + pn,2 rn , we find s2 = p1,2 r1 + p2,2 r2 + · · · + pn,2 rn .
Proceeding thus, we find that the vectors u and f (u) are related by the
matrix equation
[ s1 ]   [ p1,1  p2,1  · · ·  pn,1 ] [ r1 ]
[ s2 ]   [ p1,2  p2,2  · · ·  pn,2 ] [ r2 ]
[ ⋮  ] = [  ⋮     ⋮            ⋮   ] [ ⋮  ]
[ sm ]   [ p1,m  p2,m  · · ·  pn,m ] [ rn ]
                                              (3.1)
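Equation (3.1) can be exercised on a small made-up example. Here V = R^2 and X = R^3 with their standard bases, and the images of the basis vectors are chosen arbitrarily for illustration (none of these numbers come from the text).

```python
import numpy as np

# Column i holds the coordinates of f(b_i) in the basis {c1, c2, c3}:
# f(b1) = 1*c1 + 2*c2 + 3*c3 and f(b2) = 4*c1 + 5*c2 + 6*c3.
M = np.array([
    [1.0, 4.0],
    [2.0, 5.0],
    [3.0, 6.0],
])
u = np.array([2.0, -1.0])      # u = 2*b1 - 1*b2
s = M @ u                      # coordinates of f(u), as in equation (3.1)
print(s)                       # f(u) = -2*c1 - 1*c2 + 0*c3
```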
Question 3.82.4
What are the coordinates, in the standard basis for R3 (see Example 3.27), of the vector xi + yj, after it undergoes the linear
transformation f : R2 → R3 given by the matrix

[ a b ]
[ c d ]
[ e f ]

where the matrix is written with respect to the basis
{i, w = (1/√2, 1/√2)} of R2 and the basis
{(1, 0, 0), (1, 1, 0), (0, 1, 1)} of R3 ?
Question 3.82.5
How will the treatment in this example change if either V or X
(or both) were to be infinite-dimensional F -vector spaces? (See
the remarks on page 244 in the notes for some hints.)
Example 3.83
Let V be an F -vector space. The map f : V → V that sends any v ∈ V
to 0 is a linear transformation.
Question 3.83.1
If V is n-dimensional with basis {b1 , . . . , bn }, what is the matrix
of f with respect to this basis?
Example 3.84
Let V be an F -vector space, and let W be a subspace. The map
f : W → V defined by f (w) = w is a linear transformation.
Question 3.84.1
Assume that W is m-dimensional and V is n-dimensional. Pick
a basis B = {b1 , . . . , bm } of W and expand to a basis C =
{b1 , . . . , bm , bm+1 , . . . , bn } of V . What is the matrix of f with
respect to the basis B of W and the basis C of V ?
Example 3.85
Let F be a field, and view Mn (F ) as a vector space over F (see Example
3.5). Now view F as an F -vector space (see Example 3.7: note that F
is trivially an extension field of F ). Then the function f : Mn (F ) → F
that sends a matrix to its trace is a linear transformation. (Recall that
the trace of a matrix is the sum of its diagonal entries.)
To prove this, note that this is really a function that sends basis vectors
of the form ei,i to 1 and ei,j (i ≠ j) to 0, and an arbitrary matrix
∑_{i,j} mi,j ei,j to m1,1 · 1 + · · · + mn,n · 1. Now apply Lemma 3.82.1 to
conclude that f must be a linear transformation.
See Exercise 3.100 at the end of the chapter.
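The linearity of the trace can be spot-checked numerically (one arbitrary choice of matrices and scalars; a check, not a proof, and not part of the text):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
r, s = 2.0, -3.0

# tr(rA + sB) should equal r*tr(A) + s*tr(B).
print(np.trace(r * A + s * B))            # 10.0
print(r * np.trace(A) + s * np.trace(B))  # 10.0 as well
```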
Example 3.86
Let V be a vector space over a field F and let W be a subspace. Assume
that V is finite-dimensional (for simplicity). Let dimF (V ) = n, and
dimF (W ) = m. Let {b1 , . . . , bm } be a basis for W , and let us expand
Exercise 3.86.2
Is π surjective? Describe a basis for ker(π).
Exercise 3.86.3
The basis {b1 , . . . , bm } of W can be expanded to a basis
{b1 , . . . , bm , bm+1 , . . . , bn } of V in many different ways (see Example 3.50). The definition of π above depends on which choice
of {bm+1 , . . . , bn } we make. For example, take V = R2 and W
the subspace represented by the x-axis. Take the vector b1 = i
(= (1, 0)) as a basis for W . Show that the definition of π depends
crucially on the choice of vector b2 used to expand {b1 } to a basis
for R2 as follows: Select b2 in two different ways and show that
for suitable v ∈ R2 , π(v) defined with one choice of b2 will be
different from π(v) defined by the other choice of b2 .
vector space structures in two spaces are essentially the same without even
having to divide out by any subspace. As with rings, we need a couple of
lemmas first:
Lemma 3.87. Let V and X be two vector spaces over a field F and let f :
V → X be a linear transformation. Then f is an injective function if and
only if ker(f ) is the zero subspace.
Exercise 3.87.1
The proof of this is very similar to the proof of the corresponding
Lemma 2.99 in Chapter 2: study that proof and write down a careful
proof of Lemma 3.87 above.
Lemma 3.91. Continuing with the notation of Lemma 3.90, assume further
that f is injective. Then the vectors {f (b) | b ∈ B} form a basis for f (V ).
Proof. With the additional assumption that f is injective, we need to show
that the vectors {f (b) | b ∈ B} are linearly independent, since we already
know from Lemma 3.90 that they span f (V ). Assume that r1 f (b1 ) + · · · +
rn f (bn ) = 0X for some scalars r1 , . . . , rn and some vectors b1 , . . . , bn from
B. Since f is a linear transformation, the left side is just f (r1 b1 + · · · + rn bn ).
By the injectivity of f , we find r1 b1 + · · · + rn bn = 0V . But since the bi
are linearly independent in V , r1 , . . . , rn must all be zero, showing that the
vectors {f (b) | b ∈ B} are indeed linearly independent.
We now have the following, completely in analogy with rings:
Definition 3.92
Let V and X be vector spaces over a field F , and let f : V → X be
a linear transformation. If f is both injective and surjective, then f
is said to be an isomorphism between V and X. Two vector spaces
V and X are said to be isomorphic (written V ≅ X) if there is some
function f : V → X that is an isomorphism between V and X.
Example 3.93
Any two vector spaces over the same field F of the same dimension n are isomorphic. For if V and W are two vector spaces
over F both of dimension n, and if, say, {v1 , v2 , . . . , vn } is a basis for V and {w1 , w2 , . . . , wn } is a basis for W , then the function
f : V → W defined by f (v1 ) = w1 , f (v2 ) = w2 , . . . , f (vn ) = wn ,
and f (r1 v1 + r2 v2 + · · · + rn vn ) = r1 w1 + r2 w2 + · · · + rn wn is an
F -linear transformation, by Lemma 3.82.1. This map is injective: if
v = r1 v1 + r2 v2 + · · · + rn vn is such that f (v) = 0, then this means
r1 w1 + r2 w2 + · · · + rn wn = 0, and since the wi form a basis for W , each ri
must be zero, so v must be zero. Also, f is surjective: clearly, given any
w = r1 w1 + r2 w2 + · · · + rn wn in W , the vector v = r1 v1 + r2 v2 + · · · + rn vn
maps to w under f . Thus, f is an isomorphism between V and W .
Remark 3.93.1
If f : V → W is an isomorphism between two vector spaces V and
W over a field F then, since f provides a bijection between V and
W, we may define f⁻¹ : W → V by letting f⁻¹(w) be the unique
v ∈ V such that f(v) = w. Clearly, the composite function
f⁻¹ ∘ f is the identity function on V, and f ∘ f⁻¹ is the identity
function on W.
Exercise 3.93.2
If f : V → W is an isomorphism, show that the map f⁻¹ of
Remark 3.93.1 above is a linear transformation from W to V.
3.5
Further Exercises
Exercise 3.97. Starting from the vector space axioms, prove that the properties
listed in Remark 3.12 hold for all vector spaces. (Hint: You should get ideas
from the solutions to the corresponding Exercise 2.111 of Chapter 2: the proofs
of the first three properties are quite similar in spirit. As for the last property,
look to f 1 for help!)
Exercise 3.98. Prove that the polynomials 1, 1 + x, (1 + x)², (1 + x)³, ...
also form a basis for R[x] as an R-vector space. (Hint: To show that these
polynomials span R[x], it is sufficient to show that the polynomials 1, x, x², ...
are in the linear span (see Example 3.65 above) of 1, 1 + x, (1 + x)², (1 + x)³,
... (Why?) The vector 1 is of course in the linear span. Assuming inductively
that the vectors 1, x, ..., and x^{n−1} are in the linear span, show that x^n is also in
the linear span by considering the binomial expansion of (1 + x)^n. As for linear
independence, suppose that ∑_{i=0}^{n} d_i (1 + x)^i = 0. You may assume that d_n ≠ 0.
(Why?) Now expand each term (1 + x)^i above and consider the coefficient of
x^n. What do you find?)
If you find the hint too computational, you can also establish this result by
invoking Exercise 3.106 ahead and Exercise 2.106.2 in Chapter 2. (However,
note that Exercise 2.106.2 is in turn computational, so this merely shifts all the
computations to a different place!)
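As a computational aside (not part of the exercise), the change-of-basis idea in the hint can be seen concretely: writing each (1 + x)^i in the standard basis 1, x, x², ... produces a triangular array of binomial coefficients with 1s on the diagonal, which is what drives both the spanning and the independence arguments. A short Python sketch (the variable names are ours):

```python
from math import comb

# Coefficient of x^j in (1 + x)^i is C(i, j); listing each (1+x)^i in the
# standard basis 1, x, x^2, ... gives a triangular matrix with unit diagonal.
n = 6
M = [[comb(i, j) for j in range(n + 1)] for i in range(n + 1)]

# Triangular: C(i, j) = 0 whenever j > i, and C(i, i) = 1.
assert all(M[i][j] == 0 for i in range(n + 1) for j in range(i + 1, n + 1))
assert all(M[i][i] == 1 for i in range(n + 1))

# Spanning in action: x^2 = (1+x)^2 - 2(1+x) + 1, mirroring the hint's induction.
coeffs = {2: 1, 1: -2, 0: 1}  # c_i multiplying (1+x)^i
expansion = [sum(c * comb(i, j) for i, c in coeffs.items()) for j in range(3)]
assert expansion == [0, 0, 1]  # the polynomial 0 + 0*x + 1*x^2
```

An invertible triangular change-of-basis matrix is exactly what the exercise asks you to establish by hand.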
Exercise 3.99. Show that the matrices e_{i,j} and √2 e_{i,j} (1 ≤ i, j ≤ 2) form a
basis for M2(Q[√2]) considered as a Q-vector space. (√2 e_{i,j} is the 2 × 2 matrix
with √2 in the (i, j) slot, and zeros in the remaining slots.) Now discover a basis
for M2(C) considered as a vector space over R.
Exercise 3.100. Show that the set of all matrices in Mn (R) whose trace is
zero is a subspace of Mn (R) by exhibiting this space as the kernel of a suitable
homomorphism that we have considered in the text. Use Theorem 3.96 to prove
that this subspace has dimension n² − 1. Discover a basis for this subspace.
Exercise 3.101. Let V be an F-vector space. So far, we have considered individual linear transformations of the form f : V → V; this exercise deals with the
collection of all such F-linear transformations. Let EndF(V) denote the set of
all F-linear transformations from V to V. (End is short for the word endomorphism, which is another word for a homomorphism from one (abelian) group
to itself, while the subscript F indicates that we are considering those (abelian)
Exercise 3.111. Let V be a vector space over a field F, and let U and W be
two subspaces.
1. Show that U ∩ W is a subspace of V. (Is U ∪ W a subspace of V?)
2. Denote by U + W the set {u + w | u ∈ U and w ∈ W}. Show that U + W
is a subspace of V.
3. Now assume that V is finite-dimensional. The aim of this part is to establish the following:
dim(U + W) = dim(U) + dim(W) − dim(U ∩ W)
(a) Let {v1, ..., vp} be a basis for U ∩ W (so dim(U ∩ W) = p). Expand this to a basis {v1, ..., vp, u1, ..., uq} of U, and also to a basis
{v1, ..., vp, w1, ..., wr} of W (so dim(U) = p + q and dim(W) =
p + r). Show that the set B = {v1, ..., vp, u1, ..., uq, w1, ..., wr}
spans U + W.
(b) Show that the set B is linearly independent. (Hint: Assume that we
have the relation f1 v1 + ··· + fp vp + g1 u1 + ··· + gq uq + h1 w1 +
··· + hr wr = 0. Rewrite this as g1 u1 + ··· + gq uq = −(f1 v1 + ··· +
fp vp + h1 w1 + ··· + hr wr). Observe that the left side is in U while
the right side is in W, so g1 u1 + ··· + gq uq must be in U ∩ W. Hence,
g1 u1 + ··· + gq uq = j1 v1 + ··· + jp vp for some scalars j1, ..., jp.
Why does this show that the gi must be zero? Now proceed to show
that the fi and the hi must also be zero.)
(c) Conclude that dim(U + W) = dim(U) + dim(W) − dim(U ∩ W).
(d) Prove that any two 2-dimensional subspaces of R³ must intersect in a
subspace of dimension at least 1.
Exercise 3.112. Show that the nth Bernstein polynomials B_i^{(n)}(x) = C(n, i) x^i (1 −
x)^{n−i}, (i = 0, 1, ..., n), where C(n, i) denotes the binomial coefficient, form a
basis for Rn[x] (n ≥ 1) as follows:
1. Show that 1 = ∑_{i=0}^{n} B_i^{(n)}.
2. The equation in part 1 above continues to hold if we replace n by n − 1
everywhere. (Why?) Make this replacement, multiply throughout by x,
and derive the relation x = ∑_{i=0}^{n} (i/n) B_i^{(n)}. (Hint: you will need to use
the relation C(n−1, i−1) = (i/n) C(n, i). Why does this last relation hold?)
3. Iterating this process, derive, for each k = 0, 1, ..., n, a relation of the form
x^k = ∑_{i=0}^{n} (i(i−1)···(i−k+1) / (n(n−1)···(n−k+1))) B_i^{(n)},
and conclude that the B_i^{(n)} span Rn[x].
4. Since Rn[x] has dimension n + 1, conclude that the n + 1 polynomials B_i^{(n)}
form a basis.
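If you would like to see parts 1 and 2 numerically before proving them, here is a small Python check (the function name bernstein is our own device, not notation from the text):

```python
from math import comb, isclose

def bernstein(n, i, x):
    """B_i^(n)(x) = C(n, i) x^i (1 - x)^(n - i)."""
    return comb(n, i) * x**i * (1 - x)**(n - i)

n = 5
for x in [0.0, 0.25, 0.5, 0.9]:
    # Part 1: the B_i^(n) sum to 1 (the binomial theorem for (x + (1-x))^n).
    assert isclose(sum(bernstein(n, i, x) for i in range(n + 1)), 1.0)
    # Part 2: weighting by i/n recovers x.
    assert isclose(sum((i / n) * bernstein(n, i, x) for i in range(n + 1)), x)
```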
Notes
Remarks on Example 3.5 It is worth remarking that our definition of scalar
multiplication is a very natural one. First, observe that we can consider R to
be a subring of Mn (R) in the following way: the set of matrices of the form
diag(r), as r ranges through R, is essentially the same as R (see Example
2.103 in Chapter 2). (Observe that this makes the set of diagonal matrices of
the form diag(r) a field in its own right!) Under this identification of r R
with diag(r), what is the most natural way to multiply a scalar r and a vector
(ai,j )? Well, we think of r as diag(r), and then define r (ai,j ) as just the
usual product of the two matrices diag(r) and (ai,j ). But, as you can check
easily, the product of diag(r) and (ai,j ) is just (rai,j )! It is in this sense that
our definition of scalar multiplication is naturalit arises from the rules of
matrix multiplication itself. Notice that once R has been identified with the
subring of Mn (R) consisting of the set of matrices of the form diag(r), this
example is just another special case of Example 3.8.
Remarks on Example 3.10 (V, +) remains an abelian group. This does not
change when we restrict our attention to the subfield F . So we only need to
worry about what the new scalar multiplication ought to be. But there is a
natural way to multiply any element f of F with any element v of V : simply
consider f as an element of K, and use the multiplication already defined
between elements of K and elements of V ! The scalar multiplication axioms
clearly hold: for any f and g in F and any v and w in V , we may first think of
f and g as elements of K, and since the scalar multiplication axioms hold for
V viewed as a vector space over K, we certainly have f · (v + w) = f · v + f · w,
(f + g) · v = f · v + g · v, (f g) · v = f · (g · v), and 1 · v = v.
Remarks on Example 3.34 This example is a bit tricky. Why are the ei
not a basis? They are certainly linearly independent, since if ∑_{i=0}^{n} ci ei = 0 for
real analysis). Such notions may not exist for arbitrary fields.
Remarks on the proof of Theorem 3.70 The reason why the proofs that
(V/W, +) and (R/I, +) are abelian groups are so similar is that what we
are essentially proving in both is that if (G, +) is an abelian group and if
H is a subgroup, then the set of equivalence classes of G under the relation
g1 ∼ g2 if and only if g1 − g2 ∈ H, with the operation [g1] + [g2] = [g1 + g2], is
indeed an abelian group in its own right! We will take this up in Chapter 4
ahead.
in F in which each column has only finitely many nonzero entries, one can
define a linear transformation f : V → X exactly as in Example 3.82, with
the column indexed by b corresponding to f(b).
Chapter 4
Groups
4.1
such that a ∗ id = id ∗ a = a for all a in S, and
3. every element of S has an inverse with respect to ∗, i.e., for
every element a in S there exists an element a⁻¹ such that
a ∗ a⁻¹ = a⁻¹ ∗ a = id.
4.1.1 Symmetric groups
Example 4.2
Consider the set 3 = {1, 2, 3}, and consider one-to-one and onto maps
from 3 to itself: in more common language, such maps are known
as permutations of {1, 2, 3}. Let us, for example, write
1 2 3
2 3 1
for the permutation that sends 1 to 2, 2 to 3, and 3 to 1 (so we write the image
of an element under the element; we will call this the stack notation).
Then it is easy to see that there are exactly six permutations, and they
are listed in the following table (where we have given a name to each
permutation):
Permutations of {1, 2, 3}
(each column lists the images of 1, 2, and 3 under the named permutation)

      id   r1   r2   f1   f2   f3
1 →    1    2    3    1    3    2
2 →    2    3    1    3    2    1
3 →    3    1    2    2    1    3
Now let us see how these permutations compose. You will observe that
r1 ∘ r1 takes 1 to 2 under the first application of r1 and then 2 to 3
under the second application of r1. Likewise, r1 ∘ r1 takes 2 to 3 and
then to 1, and similarly, 3 to 1 and then to 2. The net result: r1 ∘ r1 is
the permutation
1 2 3
3 1 2
that is, r1 ∘ r1 = r2!
The full composition table works out as follows (the entry in the row
labeled x and the column labeled y is x ∘ y):

 ∘  | id   r1   r2   f1   f2   f3
----+-----------------------------
 id | id   r1   r2   f1   f2   f3
 r1 | r1   r2   id   f3   f1   f2
 r2 | r2   id   r1   f2   f3   f1
 f1 | f1   f2   f3   id   r1   r2
 f2 | f2   f3   f1   r2   id   r1
 f3 | f3   f1   f2   r1   r2   id
The permutation id acts as the identity element: this is clear from the
first row and the first column of the table above, and finally, (iv) every
permutation of S3 has an inverse: r1 ∘ r2 = r2 ∘ r1 = id, id ∘ id = id,
f1 ∘ f1 = id, f2 ∘ f2 = id, and f3 ∘ f3 = id. Hence, the set of permutations
of 3 forms a group under composition. We denote this group as S3, and
call it the symmetric group on three elements. (S3 can be interpreted
as the set of symmetries of 3 with the trivial structure: see Example
4.96 in the notes at the end of the chapter.)
Observe something about this group: it is not a commutative group!
For instance, as we observed above, r1 ∘ f1 = f3 while f1 ∘ r1 = f2. We
say that the group is nonabelian.
From now on we will suppress the ∘ symbol, and simply write fg
for the composition f ∘ g. Not only is there less writing involved,
but it is notation that we are used to: it is the notation we use for
multiplication. Continuing the analogy, we write f f as f², and so on,
and we sometimes write 1 for the identity (see Remark 4.22 ahead for
more on the notation used for the identity and the group operation).
In this notation, note that r1³ = r2³ = 1 and f1² = f2² = f3² = 1.
A table such as the one above, describing how pairs of elements in
a group compose under the given binary operation, is called the group
table for the group.
Exercise 4.2.1
Use the group table to show that every element of S3 can be
written as r1^i f1^j for uniquely determined integers i ∈ {0, 1, 2} and
j ∈ {0, 1}.
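For readers who like to experiment, the group table and Exercise 4.2.1 lend themselves to a quick machine check. A Python sketch (the tuple encoding of permutations, and the names compose and power, are our own devices, not notation from the text):

```python
# Permutations of {1, 2, 3} as tuples of images: p[k-1] is the image of k.
id_ = (1, 2, 3)
r1  = (2, 3, 1)   # 1 -> 2, 2 -> 3, 3 -> 1
f1  = (1, 3, 2)   # swaps 2 and 3

def compose(s, t):
    """(s o t)(k) = s(t(k)): apply t first, then s, as in the group table."""
    return tuple(s[t[k - 1] - 1] for k in (1, 2, 3))

def power(s, j):
    out = id_
    for _ in range(j):
        out = compose(out, s)
    return out

# Exercise 4.2.1: the six products r1^i f1^j (i in {0,1,2}, j in {0,1})
# are pairwise distinct, hence exhaust S3.
elements = {compose(power(r1, i), power(f1, j)) for i in range(3) for j in range(2)}
assert len(elements) == 6

# Nonabelian, as observed above: r1 o f1 != f1 o r1.
assert compose(r1, f1) != compose(f1, r1)
```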
Example 4.3
Just as we considered the set of permutations of the set 3 = {1, 2, 3}
above, we can consider, for any integer n ≥ 1, the permutations of the
set n = {1, 2, . . . , n}. This set forms a group under composition, just
as S2 and S3 did above.
Definition 4.3.1. The set of permutations of n , which forms
a group under composition, is denoted Sn and is called the
symmetric group on n elements.
Exercise 4.3.1
Write down the set of permutations of the set 2 = {1, 2} and construct the table that describes how the permutations compose.
Verify that the set of permutations of 2 forms a group. Is it
abelian? This group is denoted S2 , and called the symmetric
group on two elements.
Exercise 4.3.2
Compare the group table of S2 that you get in the exercise above
with the table for (Z/2Z, +) on Page ??. What similarities do
you see?
Exercise 4.3.3
Prove that Sn has n! elements.
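A brute-force sanity check of this count, offered purely as an aside (it is no substitute for the proof the exercise asks for):

```python
from itertools import permutations
from math import factorial

# |S_n| = n!: the image of 1 can be chosen in n ways, then the image of 2
# in n - 1 ways, and so on.
for n in range(1, 7):
    assert sum(1 for _ in permutations(range(1, n + 1))) == factorial(n)
```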
Exercise 4.3.4
Find an element g ∈ Sn such that g^n = 1 but g^t ≠ 1 (see Remark
4.22 ahead on notation for the identity element) for any positive
integer t < n.
Here is an alternative notation that is used for a special class of permutations, which we will call the cycle notation. Working for the sake
of concreteness in 5, consider the permutation that sends 1 to 3, 3
to 4, and 4 back to 1, and acts as the identity on the remaining elements 2 and 5. (This is the permutation we have denoted up to now as
1 2 3 4 5
3 2 4 1 5
in the stack notation.) Notice the cyclic nature of this permutation: it moves 1
to 3 to 4 back to 1, and leaves 2 and 5 untouched. We will use the notation (1, 3, 4) for this special permutation and call it a 3-cycle. In
general, if a1, ..., ad are distinct elements of the set n (so 1 ≤ d ≤ n),
we will denote by (a1, a2, ..., ad) the permutation that sends a1 to a2,
a2 to a3, ..., a_{d−1} to a_d, a_d back to a1, and acts as the identity on all
elements of n other than these ai. We will refer to (a1, a2, ..., ad) as
a d-cycle or a cycle of length d. A 2-cycle (a1, a2) is known as a transposition, since it only swaps a1 and a2 and leaves all other elements
unchanged. Of course a 1-cycle (a1) is really just the identity element
since it sends a1 to a1 and acts as the identity on all other elements of
n.
Notice something about cycles: the cycle (1, 3, 4) is the same as
(3, 4, 1), as they both clearly represent the same permutation. More
generally, the cycle (a1 , a2 , . . . , ad ) is the same as (a2 , a3 , . . . , ad , a1 ),
which is the same as (a3 , a4 , . . . , ad , a1 , a2 ), etc. We will refer
to these different representations of the same cycle as internal cyclic
rearrangements.
Since a d-cycle is just a special case of a permutation, it makes perfect
sense to compose a d-cycle and an e-cycle: it is just the composition of
two (albeit special) permutations. For instance, in any Sn (for n ≥ 3),
we have the relation (1, 3)(1, 2) = (1, 2, 3) (check!). (We will see
shortly, in Corollary 4.7 ahead, that every permutation in Sn can be
Exercise 4.3.6
Show that any k-cycle in Sn (here n ≥ k ≥ 2) can be written as
the product of k − 1 transpositions.
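The factorization in Exercise 4.3.6 can be tested mechanically: composing the k − 1 transpositions (a1, ak), (a1, a_{k−1}), ..., (a1, a2) from right to left recovers the k-cycle, generalizing the relation (1, 3)(1, 2) = (1, 2, 3) noted above. A Python sketch (the helper names are our own):

```python
def cycle(n, elts):
    """The permutation of {1..n} given by the cycle (elts[0], ..., elts[-1])."""
    img = {k: k for k in range(1, n + 1)}
    for a, b in zip(elts, elts[1:] + elts[:1]):
        img[a] = b
    return tuple(img[k] for k in range(1, n + 1))

def compose(s, t):
    """(s o t)(k) = s(t(k)): apply t first, then s."""
    return tuple(s[t[k - 1] - 1] for k in range(1, len(s) + 1))

n = 5
# (a1, ..., ak) = (a1, ak)(a1, a_{k-1}) ... (a1, a2): k - 1 transpositions,
# composed right to left.  For k = 3 this is (1, 3)(1, 2) = (1, 2, 3).
elts = [1, 3, 4, 2]
prod = tuple(range(1, n + 1))  # the identity permutation
for a in elts[1:]:
    prod = compose(cycle(n, [elts[0], a]), prod)
assert prod == cycle(n, elts)
```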
This computation is of course very explicit, but the intuitive idea behind why s and t commute is the following: s only moves the elements
1, 4, and 5 among themselves, and in particular, it leaves the elements
2 and 3 untouched. On the other hand, t swaps the elements 2 and 3,
and leaves the elements 1, 4 and 5 untouched. Since s and t operate
on disjoint sets of elements, the action of s is not affected by t and the
action of t is not affected by s. In particular, it makes no difference
whether we perform s first and then t or the other way around.
Notice that since disjoint cycles commute, (4, 6, 5)(1, 3) is the same
as (1, 3)(4, 6, 5). Notice, too, that had we started with, for instance,
6 and followed it around, and then picked 3 and followed it around,
we would have found s = (3, 1)(6, 5, 4). Any other decomposition of s
into disjoint cycles must be related to the first decomposition s =
(4, 6, 5)(1, 3) in a similar manner as these two above: either the
cycles could have been swapped, or, internally, a cycle could have been
rearranged cyclically (such as (6, 5, 4) instead of (4, 6, 5)). This is
because the product of disjoint cycles simply follows, one by one, the
various elements of {1, 2, 3, 4, 5, 6} under repeated action of s, and no
matter in which manner the cycles are written, the repeated action of
s must be the same.
These same ideas apply to arbitrary permutations, and we have the following (whose proof we omit because it is somewhat tedious to write in full
generality):
Proposition 4.6. Every permutation in Sn factors into a product of disjoint
cycles. Two factorizations can only differ in the order in which the cycles
appear, or, within any one cycle, by an internal cyclic rearrangement.
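Although we omit the formal proof, Proposition 4.6 suggests an algorithm: pick the smallest element not yet visited and follow it around under the permutation, exactly as in the discussion above. A Python sketch (the function name is our own):

```python
def disjoint_cycles(p):
    """Factor the permutation p of {1..n} (p[k-1] = image of k) into disjoint
    cycles by following each element around under repeated action of p."""
    n = len(p)
    seen, cycles = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cyc, k = [], start
        while k not in seen:
            seen.add(k)
            cyc.append(k)
            k = p[k - 1]
        if len(cyc) > 1:  # 1-cycles are the identity and are omitted
            cycles.append(tuple(cyc))
    return cycles

# s sends 1 -> 3, 3 -> 1, 4 -> 6, 6 -> 5, 5 -> 4, and fixes 2:
# the decomposition s = (1, 3)(4, 6, 5) from the discussion above.
s = (3, 2, 1, 6, 4, 5)
assert disjoint_cycles(s) == [(1, 3), (4, 6, 5)]
```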
Corollary 4.7. Every permutation in Sn can be written as a product of transpositions.
Proof. This is just a combination of Proposition 4.6 and Exercise 4.3.6 above,
which establishes that every cycle can be written as a product of transpositions.
Remark 4.8
Unlike the factorization of a permutation into disjoint cycles, there is
no uniqueness to the factorization into transpositions. (For instance,
in addition to the factorization (1, 3)(1, 2) = (1, 2, 3) we had before,
we also find (1, 2)(3, 2) = (1, 2, 3).) But something a little weaker
than uniqueness holds even for factorizations into transpositions: if
a permutation s has two factorizations s = d1 d2 ··· dl and s =
e1 e2 ··· em, where the di and ej are transpositions, then either l and
m will both be even or both be odd! (The proof is slightly complicated,
and we will omit it since this is an introduction to the subject.) This
allows us to define unambiguously the parity of a permutation: we call
a permutation even if the number of transpositions that appear in any
factorization into transpositions is even, and likewise, we call it odd if
this number is odd.
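Though the proof is omitted, the well-definedness of parity can be checked experimentally: counting d − 1 transpositions for each d-cycle in a disjoint-cycle factorization always agrees, mod 2, with an independent count of inversions. A Python sketch (both helper names are our own):

```python
from itertools import combinations, permutations

def parity_via_cycles(p):
    """Transpositions in a factorization: each d-cycle contributes d - 1."""
    n, seen, count = len(p), set(), 0
    for start in range(1, n + 1):
        d, k = 0, start
        while k not in seen:
            seen.add(k)
            d += 1
            k = p[k - 1]
        count += max(d - 1, 0)
    return count % 2

def parity_via_inversions(p):
    """Independent check: the number of pairs i < j with p(i) > p(j), mod 2."""
    return sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j]) % 2

# The two counts agree mod 2 on all of S4, illustrating that parity is
# well defined no matter how the factorization is produced.
for p in permutations((1, 2, 3, 4)):
    assert parity_via_cycles(p) == parity_via_inversions(p)
```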
4.1.2 Dihedral groups
Example 4.9
Consider a piece of cardboard in the shape of an equilateral triangle.
Now consider all operations we can perform on the piece of cardboard
that do not shrink, stretch, or in any way distort the triangle, but are
such that after we perform the operation, nobody can tell that we did
anything to the triangle! To help determine what such operations could
be, pretend that the piece of cardboard has been placed at a fixed location on a table, and the location has been marked by lines drawn
under the edges of the cardboard. Also, label the points on the table
that lie directly under the vertices of the triangle as a, b, and c respectively. After we have done our (yet to be determined!) operation on
the cardboard, the triangle should stay at the same locationotherwise
it would be obvious that somebody has done something to the piece
of cardboard. This means that after our operation, each vertex of the
triangle must somehow end up once again on top of one of the three
points a, b, and c marked on the table.
a
are not allowed to distort the triangle, once we know where the vertices
have gone to under our operation, we would immediately know where
every other point on the triangle would have gone to. For, if a point P
is at a distance x from a vertex A, a distance y from a vertex B and a
distance z from the third vertex C, then the image of P must be at a
distance x from the image of A, a distance y from the image of B and a
distance z from the image of C, and this fixes the location of the image
of P . (Actually, more is true: it is sufficient to know where any two
vertices have gone to under our operation to know where every point
has gone: see Remark 4.11 ahead if you are interested. But of course, if
you know where two vertices have gone, then you automatically know
where the third vertex has gone.) Hence, it is enough to study the
possible rearrangements, or permutations, of the vertices of the triangle
to determine our operations. A key sticking point is that while every
symmetry of the triangle corresponds to a permutation of the vertices,
it is conceivable that not every permutation of the vertices comes from
a symmetry. As it turns out, this is not the case, as we will see below.
Let us, for example, write
a b c
b c a
for the permutation of the vertices that takes whichever vertex was on the
point on the table marked a and moves it to the point marked b, whichever
vertex was on the point marked b and moves it to the point marked c, and
whichever vertex was on the point marked c and moves it to the point marked
a. Notice that since there are three vertices, there are only six permutations
to consider. With this notation, let us consider each of the six permutations
in turn, and show that they can be realized as a symmetry of the triangle:
1. id =
a b c
a b c
. This is the "do nothing" operation:
it is clearly a rigid motion of the triangle (there is no distortion
of the cardboard), and after we have performed this operation,
we would not be able to tell whether anybody has disturbed the
triangle or not!
2. ρ =
a b c
b c a
. This can be realized by rotating the triangle counterclockwise by
120°. This is a rigid motion, and after the rotation, we would not
be able to tell that the cardboard has been moved.
3. ρ² =
a b c
c a b
. This can be realized by rotating the triangle counterclockwise by
240°, that is, by performing the rotation ρ twice.
4. τa =
a b c
a c b
. This can be realized by flipping the triangle about
the line joining the point a and the midpoint of the opposite side
bc. This too is a rigid motion, and after the flip is over, we would
not be able to tell if the cardboard has been moved.
5. τb =
a b c
c b a
. This can be realized by flipping the triangle about
the line joining the point b and the midpoint of the opposite side
ac. Like τa, this too is a rigid motion, and after the flip is over,
we would not be able to tell if the cardboard has been moved.
6. τc =
a b c
b a c
. This can be realized, similarly, by
flipping the triangle about the line joining the point c and the
midpoint of the opposite side ab.
Thus, we have obtained all six permutations as symmetries of the triangle! These six symmetries compose exactly as the corresponding permutations of {1, 2, 3} compose in S3: the composition table for D3 has the
same pattern as the group table for S3 given earlier, with the two rotations playing the roles of r1 and r2 and the three flips playing the roles
of f1, f2, and f3.
shrink, stretch, or in any way distort the square, but are such that after
we perform the operation, nobody can tell that we did anything to the
square! To help determine what such operations could be, pretend that
the piece of cardboard has been placed at a fixed location on a table,
and the location has been marked by lines drawn under the edges of
the cardboard. Also, label the points on the table that lie directly
under the vertices of the square as a, b, c, and d respectively. After we
have done our (yet to be determined!) operation on the cardboard, the
square should stay at the same locationotherwise it would be obvious
that somebody has done something to the piece of cardboard.
We will refer to our operations as symmetries of the square and will refer
to each operation as a rigid motion. Just as in the previous example,
each vertex of the square must somehow end up once again on top of
one of the four points a, b, c, and d marked on the table after the
application of a symmetry. As before, the preservation of the rigidity
of the square ensures that once we know where the vertices have gone
to under the application of a symmetry, we would immediately know
where every other point on the square would have gone to. (In fact, it
is enough to know where two adjacent vertices have gonesee Remark
4.11 ahead.) Hence, it is enough to study the possible permutations
of the vertices of the square to determine its symmetries. Unlike the
previous example, however, it is not true that every permutation of the
vertices comes from a symmetry.
Let us again write, for example,
a b c d
b c d a
for the permutation of the
vertices that takes whichever vertex was on the point on the table
marked a and moves it to the point marked b, the vertex on b to c,
the vertex on c to d, and the vertex on d to a. Notice that since
there are four vertices, there are 4! = 24 permutations to consider (see
Exercise 4.3.3 above). With this notation, let us see which of these 24
permutations can be realized as a symmetry of the square:
1. id =
a b c d
a b c d
. This is the "do nothing" operation.
2. ρ =
a b c d
b c d a
. This can be effected by rotating the square counterclockwise by 90°.
3. ρ² =
a b c d
c d a b
. This is effected by rotating the square counterclockwise (or clockwise) by 180°, and corresponds to the composition ρ ∘ ρ. Hence the name ρ² for this symmetry.
4. ρ³ =
a b c d
d a b c
. This is effected by rotating the square counterclockwise by 270°, and corresponds to applying ρ three times.
5. H =
a b c d
b a d c
. This corresponds to flipping the square about its
horizontal axis (i.e., the line joining the midpoints of the sides ab
and cd).
6. V =
a b c d
d c b a
. This corresponds to flipping the square about its
vertical axis (i.e., the line joining the midpoints of the sides ad
and bc).
7. δac =
a b c d
a d c b
. This corresponds to flipping the square about the
diagonal through the vertices resting on a and c.
8. δbd =
a b c d
c b a d
. This corresponds to flipping the square about the
diagonal through the vertices resting on b and d.
Remark 4.11
Here is another way of seeing that there are only eight symmetries:
Observe that once we know where a pair of adjacent vertices have
gone under the application of a symmetry, we immediately know where
every other point on the square has gone, because of the rigidity. This
is because if a point P of the square is at a distance x from a vertex
Definition 4.12.3
The center of a group is defined to be the set of all elements in
the group that commute with all other elements in the group.
(For instance, the identity element is always in the center of a
group as it commutes with all other elements.)
Exercise 4.12.4
Determine the elements in D4 that lie in its center.
4.1.3 Cyclic groups
Example 4.13
Notice that the subset {1, 1} of Z endowed with the usual multiplication operation of the integers is a group!
Question 4.13.1
What similarities do you see between this group and the group
(Z/2Z, +)?
Question 4.13.2
Let G be any group that has exactly two elements. Can you see
that G must be similar to the group (Z/2Z, +) in exactly the
same way that this group {1, −1} is similar to (Z/2Z, +)? Now
that you have seen the notion of isomorphism in the context of
rings and vector spaces, can you formulate precisely how any
group with exactly two elements must be similar to (Z/2Z, +)?
[Figure: the powers z, z², z³, z⁴, z⁵, and 1 = z⁶ of z, pictured as equally spaced points on the unit circle.]
Question 4.14.3
Consider the group (Z/nZ, +), for a fixed integer n ≥ 1. Notice that
every element in this group is obtained by adding [1]n to itself various
numbers of times. For instance, [2]n = [1]n + [1]n (which we write as
2 · [1]n), [3]n = [1]n + [1]n + [1]n (which we write as 3 · [1]n), etc. What
similarities do you see between (Z/nZ, +) and the group Cn above?
Now that you have seen the notion of isomorphism in the context of
rings and of vector spaces, can you formulate precisely how (Z/nZ, +)
and Cn are similar?
4.1.4
Example 4.15
Let G and H be groups. We endow the cartesian product G × H with
the operation (g1, h1) · (g2, h2) = (g1 g2, h1 h2) (compare with Example
2.22 in Chapter 2). Here, the product g1 g2 refers to the operation in
G, while the product h1 h2 refers to the operation in H.
Exercise 4.15.1
Verify that with this definition of operation, the set G × H forms
a group.
Question 4.15.3
If G and H are abelian, must G × H necessarily be abelian? If
G × H is not abelian, can G or H be abelian? Can both G and
H be abelian?
Exercise 4.15.4
Consider the direct product (Z/2Z, +) × (Z/3Z, +). Show by direct computation that every element of this group is a multiple of
the element ([1]2, [1]3). What similarities do you see between this
group and (Z/6Z, +)? With your experience with isomorphisms
in the context of rings and vector spaces, can you formulate precisely how (Z/2Z, +) × (Z/3Z, +) and (Z/6Z, +) are similar?
4.1.5 Matrix groups
Example 4.16
We know (see Exercise 2.112 in Chapter 2) that the set of invertible
elements of a ring R, denoted R×, forms a group under the multiplication operation in the ring. In particular, taking R to be Mn(F) for
a fixed field F, we find that the set of n × n invertible matrices with
entries in F forms a group with respect to matrix multiplication. This
is a very important group in mathematics, and has its own notation
and its own name: it is denoted by Gln(F) and is called the general
linear group of order n over F. Recall (see the parenthetical remarks
in Exercise 2.55.1 in Chapter 2) that a matrix with entries in a field is
invertible if and only if its determinant is nonzero. Thus, Gln(F) may
be thought of as the group of all n × n matrices with entries in F whose
determinant is nonzero.
Exercise 4.16.1
Write down the group table for Gl2(Z/2Z), the group of units
of the ring M2(Z/2Z) (see Exercise 2.55.1 in Chapter 2). What
familiar group is this isomorphic to?
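As a computational aside (it does not replace writing down the table by hand), one can enumerate Gl2(Z/2Z) in Python and observe its size and noncommutativity:

```python
from itertools import product

# All 2x2 matrices over Z/2Z with nonzero determinant (det computed mod 2).
matrices = [((a, b), (c, d))
            for a, b, c, d in product((0, 1), repeat=4)
            if (a * d - b * c) % 2 != 0]
assert len(matrices) == 6  # six invertible matrices

def mul(A, B):
    """Matrix product over Z/2Z."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

# The group is nonabelian; compare its order and noncommutativity
# with those of S3 when answering the exercise.
assert any(mul(A, B) != mul(B, A) for A in matrices for B in matrices)
```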
Remark 4.17
Recall from Exercise 3.104 in Chapter 3 that if V is an n-dimensional
vector space over F, then the ring of F-linear transformations from V
to V, namely EndF(V) (see Exercise 3.101 of that same chapter), as
Example 4.19
Let B2(R) be the set of matrices in M2(R) of the form
g =
a b
0 d
where ad ≠ 0. Since the determinant of g is precisely ad, the condition
ad ≠ 0 shows that g is invertible, i.e., g ∈ Gl2(R). B2(R) is a group
with respect to matrix multiplication.
Question 4.19.1
What does the product of two such matrices g and h above look
like?
Question 4.19.2
What does the inverse of a matrix such as g above look like?
More generally, consider the subset Bn (F ) of upper triangular matrices in Mn (F ) whose product of diagonal entries is nonzero. Since the
determinant of an upper triangular matrix is just the product of its
diagonal entries, Bn(F) is a subset of Gln(F). Bn(F) forms a group with
respect to matrix multiplication.
Exercise 4.19.4
Show that the product of two upper triangular matrices is also
upper triangular.
Exercise 4.19.5
Show that the inverse of an invertible upper triangular matrix is
also upper triangular.
Example 4.20
Let U2(R) be the subset of matrices in M2(R) of the form
ga =
1 a
0 1
where a ∈ R. Note that the determinant of ga is 1 for
all a. U2(R) is a group with respect to matrix multiplication.
Question 4.20.1
What is the product of ga and gb in terms of a and b?
Question 4.20.2
What is the multiplicative inverse of ga ?
Question 4.20.3
What similarity do you see between U2 (R) and (R, +)?
Question 4.20.4
View the elements of U2(R) as matrices of linear transformations of R² with respect to the usual basis i and j. Where do i
and j go to under the action of ga?
Question 4.20.5
How would any of these calculations above in Questions (4.20.1)
or (4.20.2) or (4.20.4) be changed if we had restricted a to be an
integer? What similarity would you then have seen between this
modified set (with a now restricted to be an integer) and (Z, +)?
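Questions 4.20.1 and 4.20.2 can be checked mechanically; a Python sketch (the helper name g is our own shorthand for ga):

```python
def g(a):
    """The matrix [[1, a], [0, 1]] of Example 4.20, as nested tuples."""
    return ((1, a), (0, 1))

def mul(A, B):
    """Ordinary 2x2 matrix multiplication."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

# Question 4.20.1: g_a g_b = g_{a+b} -- the group law mirrors addition in R.
assert mul(g(2.5), g(4.0)) == g(6.5)

# Question 4.20.2: the inverse of g_a is g_{-a}.
identity = ((1, 0), (0, 1))
assert mul(g(3.0), g(-3.0)) == identity
```

The computation g_a g_b = g_{a+b} is the similarity with (R, +) that Question 4.20.3 is pointing at.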
Example 4.21
As we will see in Exercise 4.86, the set of 2 × 2 matrices with entries in R
satisfying AᵗA = I, where I is the identity matrix and Aᵗ stands for the
transpose of A, forms a group: it is precisely the group of symmetries
of the set described in Example 4.100 on Page 322. This set of matrices
is indeed a subset of Gl2(R), since, as you are asked to prove as well in
Exercise 4.86, the relation AᵗA = I yields that the determinant of A is
±1, and in particular, nonzero. More generally, one can consider the
set of n × n matrices A with entries in any field F satisfying AᵗA = I.
This set will form a group, known as the orthogonal group of order n
over F. We will use the notation On(F) for such groups.
See also Remark 4.5 at the end of the chapter for more general
orthogonal groups than the one we have introduced above.
Exercise 4.21.1
Prove that the set of n × n matrices A with entries in F satisfying
AᵗA = I forms a group under matrix multiplication. (Hint: you
will need to show that AᵗA = I is equivalent to AAᵗ = I, and,
from this, that (A⁻¹)ᵗ = (Aᵗ)⁻¹.)
4.1.6
Remark 4.22
In an abstract group, several different symbols are used to denote the
binary operation, the identity element, and the inverse of an element.
Sometimes, one uses the symbol id for the identity element, as we have
done above for the groups S3 , D3 , etc. Sometimes the symbol e is used
for the identity element. Very often, one imagines the group operation
to be some sort of multiplication between group elements (Warning:
this is just an informal way to think about the operation: in general,
the operation may not represent any sort of actual multiplication in
the sense of multiplication in rings), and in such cases, one uses the familiar symbol 1 to represent the identity element. (In such a situation
we say that the group is written in multiplicative notation, or written
multiplicatively.) When writing the group multiplicatively, one simply
writes the binary operation without any symbol, thus, the product of
two elements a and b is simply written ab. (We have followed this convention already with S3 , for example.) In the case where the group is
abelian, one often imagines the group operation as some sort of addition in analogy with the addition operation in rings (Same Warning:
this is just an informal way to think about the operation), and one
writes + for the group operation and 0 for the identity. And continuing with the analogy, one writes −a for the inverse of an element a. (In
such a situation, we say that the group is written in additive notation,
or written additively.)
Before proceeding further, here are some exercises that would be useful.
You would have encountered many of these results already in the context of
the additive group of a ring or a vector space. (For instance, see Remark
2.24, Exercise 2.111, and the notes on page 156 in Chapter 2):
Exercise 4.23
Show that the identity element in a group is unique.
Exercise 4.24
Show that the inverse of an element in a group is unique.
Exercise 4.25
Show that for any element a in a group G, (a^{-1})^{-1} = a.
Exercise 4.26
(Cancellation in Groups) If ab = ac for elements a, b, and c in a group
G, show that b = c (left cancellation). Similarly, if ba = ca, show that
b = c (right cancellation).
Exercise 4.27
Let G be a group, and let a and b be elements in G. Show that
(ab)^{-1} = b^{-1} a^{-1}.
Exercise 4.28
Let G be a group written multiplicatively. For any element a ∈ G and
for any positive integer j, it is customary to write a^j for the j-fold
product a · a · · · a (j times). One also writes a^{-j} for (a^{-1})^j, and
a^0 for the identity 1. Prove the following:
1. If y = a^j for some integer j, then y^{-1} = a^{-j}. (Hint: Compute
a^j a^{-j} by invoking the definitions of a^j and a^{-j}; you would of
course have to divide your proof into cases according to whether j is
positive, negative, or zero.)
2. For integers s and t, prove that a^s a^t = a^{s+t}. (Hint: First dispose
of the case where either s or t is zero, and then divide your proof
into four cases according to whether s is positive or negative and
t is positive or negative.)
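Both parts of this exercise can be sanity-checked by machine in a small concrete group. Here is a sketch in S3, with permutations stored as tuples; the representation and all function names are ours, not the book's.

```python
# Integer powers a^j in a group: a^0 is the identity, a^(-j) means (a^(-1))^j.
# Sketch in S3, realized as permutations of (0, 1, 2) stored as tuples.

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(p)))   # (p ∘ q)(i) = p(q(i))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def power(a, j):
    base = a if j >= 0 else inverse(a)
    result = tuple(range(len(a)))                  # the identity permutation
    for _ in range(abs(j)):
        result = compose(result, base)
    return result

a = (1, 2, 0)                                      # a 3-cycle in S3
assert all(compose(power(a, s), power(a, t)) == power(a, s + t)
           for s in range(-4, 5) for t in range(-4, 5))
assert all(compose(power(a, j), power(a, -j)) == (0, 1, 2)
           for j in range(-4, 5))
print("a^s a^t = a^(s+t) and (a^j)^(-1) = a^(-j) check out in S3")
```

Of course, a finite check is no substitute for the case-by-case proof the exercise asks for; it only confirms the statements in one example.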
4.2
After our practice with subrings and subspaces, the following concept
must now be quite intuitive:
Definition 4.29. Let G be a group. A subgroup of G is a subset H that is
closed with respect to the binary operation such that with respect to this
operation, H is itself a group.
Exercise 4.30
Let G be a group and let H be a subgroup. Prove that the identity
element of H must be the same as the identity element of G. (Hint:
Write idG and idH for the respective identities. Then idH idH = idH .
Also, idG idH = idH . So?)
The following lemma allows us to check if a nonempty subset of a group
is a subgroup.
Lemma 4.31. (Subgroup Test) Let G be a group, and let H be a nonempty
subset. If for all a and b in H the product ab^{-1} is also in H, then H is a
subgroup of G.
Proof. Since H is nonempty (note that we are invoking the nonemptiness
hypothesis!), H has at least one element in it, call it a. Then, taking b = a
in the statement of the lemma, we find aa^{-1} = e ∈ H. Thus, H contains
the identity. Next, given any x ∈ H, we take a = e and b = x in the
statement of the lemma to find that the product ex^{-1} = x^{-1} must be in H,
so H contains inverses of all its elements. Finally, given any x and y in H,
note that y^{-1} must also be in H by what we just saw, so, taking a = x and
b = y^{-1} in the statement of the lemma, we find that x(y^{-1})^{-1} = xy must
be in H. Thus H is closed under the binary operation, and is therefore a
subgroup of G. □
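For a finite subset of a finite group, the subgroup test can be checked directly by machine. A minimal sketch, using the additive group Z/12Z as the ambient group; the example and function names are ours, not the book's.

```python
# A brute-force version of the subgroup test (Lemma 4.31) for a finite
# subset H: check that a * b^{-1} lies in H for all a, b in H.
# Ambient group here: (Z/12Z, +), so "a * b^{-1}" is (a - b) mod 12.

def is_subgroup(H, op, inv):
    return bool(H) and all(op(a, inv(b)) in H for a in H for b in H)

op = lambda a, b: (a + b) % 12      # the group operation
inv = lambda a: (-a) % 12           # the inverse

print(is_subgroup({0, 3, 6, 9}, op, inv))   # True: the multiples of 3
print(is_subgroup({0, 1, 2}, op, inv))      # False: not closed
```

The `bool(H)` guard reflects the nonemptiness hypothesis of the lemma, which the proof above explicitly invokes.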
Example 4.32
Let G be a group. The subset {1_G} is a subgroup, called the trivial
subgroup.
Exercise 4.32.1
Prove this by applying the subgroup test (Lemma 4.31).
Example 4.33
In the group S3 , the subset {id, r1 , r2 } is a subgroup, as are the subsets
{id, f1 }, {id, f2 }, and {id, f3 }.
Exercise 4.33.1
Prove these assertions by studying the group table of S3 on page
252.
Example 4.34
In the group Sn of permutations of {1, 2, . . . , n}, let H be the subset
consisting of all permutations that fix n (that is, that send n to itself).
Exercise 4.34.1
Prove that H is a subgroup of Sn using the subgroup test (Lemma
4.31).
Exercise 4.34.2
Compare H with S_{n−1}. What similarities do you see?
Example 4.35
The various matrix groups we considered above, such as Sl_n(F), B_n(F),
O_n(F), etc., are all subgroups of Gl_n(F).
Example 4.36
Let G be a group. Recall that we have defined the center of G (see
Definition 3.5 ) to be the subset consisting of all elements of G that
commute with every other element of G.
Exercise 4.36.1
Prove that the center of G is a subgroup of G.
Question 4.36.2
What can you say about the center of G when G is abelian?
4.2.1
Example 4.37
Let G be a group, and let a be an element in G. What would be the
smallest subgroup of G that contains a? (By smallest, we mean smallest
with respect to set theoretic inclusion, that is, we seek a subgroup H
of G that contains a such that if K is any other subgroup of G that
contains a, then H ⊆ K.) Let us write G multiplicatively. Then, any
subgroup H that contains a must contain, along with a, the elements
a · a = a^2, a · a^2 = a^3, . . . , because the subgroup must be closed with
respect to the group operation. It must contain the identity 1 (= a^0)
since it is a subgroup. Similarly, it must contain the inverse a^{-1}, and
then, it must contain all the products a^{-1} · a^{-1} = a^{-2}, a^{-1} · a^{-2} = a^{-3}, . . . .
We have the following:
Lemma 4.37.1. The set ⟨a⟩ = {a^n | n ∈ Z} is a subgroup of G. It is
the smallest subgroup of G that contains a, in the sense that if H is
any subgroup of G that contains a, then ⟨a⟩ ⊆ H.
Proof. The discussion just before the statement of this lemma shows
that if H is any subgroup of G that contains a, then H must contain
all a^i for i ∈ Z, that is, H must contain ⟨a⟩. Thus, we only need to
show that ⟨a⟩ is a subgroup of G. But this is easy by the subgroup
test (Lemma 4.31): ⟨a⟩ is nonempty since a is in it. Given any two
elements x and y in ⟨a⟩, x = a^i for some i ∈ Z, and y = a^j for some
j ∈ Z. Note that y^{-1} = a^{-j} (Exercise 3.5). Then xy^{-1} = a^i a^{-j}. Hence
(Exercise 3.5 again), xy^{-1} = a^{i−j} ∈ ⟨a⟩, proving that ⟨a⟩ is indeed a
subgroup of G. □
Before proceeding further, we pause to give a name to the object considered in the lemma above:
Definition 4.37.1
Let G be a group, and let a be an element in G. The subgroup
⟨a⟩ is called the subgroup generated by a. A subgroup H of G is
called cyclic if H = ⟨g⟩ for some g ∈ G. In particular, G itself is
called cyclic if G = ⟨g⟩ for some g ∈ G.
Exercise 4.37.2
Let G be a group written multiplicatively, and let a ∈ G. For
integers s and t, prove that a^s a^t = a^{s+t} by mimicking the proof
that (a^j)^{-1} = a^{-j} in the lemma above. (Hint: First dispose of
the case where either s or t is zero, and then divide your proof
into four cases according to whether s is positive or negative and
t is positive or negative.)
In S3, for instance, we see that ⟨r1⟩ is the (finite) set {id, r1, r1^2 = r2}.
This is because we need no further powers: r1^3 = id, so r1^4 = r1^7 =
· · · = r1, and r1^5 = r1^8 = · · · = r1^2 = r2. Similarly, r1^{-1} = r2, so
r1^{-2} = r1^{-1} r1^{-1} = r2^2 = r1, and from this, we see that every power r1^n
(n = ±1, ±2, . . . ) is one of id, r1, or r2.
By contrast, the subgroup ⟨1⟩ of the additive group (Z, +) is all of Z.
This is easy to see: ⟨1⟩ = {1, 1 + 1 = 2, 1 + 2 = 3, . . . , 0, −1, (−1) +
(−1) = −2, (−1) + (−2) = −3, . . . }.
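In a finite group, the powers of a must eventually cycle, so ⟨a⟩ can be listed by brute force. A sketch in the additive group (Z/12Z, +), written additively so that the "n-th power" of a is n·a mod 12; the example is ours.

```python
# Listing the cyclic subgroup <a> of a finite group by taking "powers"
# of a until they return to the identity.

def cyclic_subgroup(a, op, identity):
    elements, x = [identity], op(identity, a)
    while x != identity:
        elements.append(x)
        x = op(x, a)
    return elements            # [a^0, a^1, ..., a^(o(a)-1)]

op = lambda x, y: (x + y) % 12     # the operation of (Z/12Z, +)

print(cyclic_subgroup(3, op, 0))   # [0, 3, 6, 9]: 3 has order 4
print(cyclic_subgroup(5, op, 0))   # twelve elements: 5 generates Z/12Z
```

The loop terminates precisely because the group is finite; in (Z, +) the analogous powers of 1 never repeat, matching the contrast drawn above.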
These examples suggest an interesting and important concept:
Definition 4.37.3
Let G be a group (written multiplicatively), and let a G. The
order of a (written o(a)) is the least positive integer n (if it exists)
such that an = 1. If no such integer exists, we say that a has
infinite order.
We now have the following:
Lemma 4.37.2. Let G be a group and let a ∈ G. Then o(a) is finite
if and only if ⟨a⟩ is a finite set. When these (equivalent) conditions
hold, o(a) equals the number of elements in the subgroup ⟨a⟩, and if
this common integer is m, then the elements 1, a, . . . , a^{m−1} are all
distinct, and ⟨a⟩ = {1, a, . . . , a^{m−1}}.
Proof. Assume that o(a) is finite, say m. Then, any integer l can be
written as l = bm + q for some integers b and q with 0 ≤ q < m, so
a^l = a^{bm+q} = (a^m)^b a^q = 1 · a^q = a^q.
Hence, every power of a can be written as a^q for some q between 0 and
m − 1, that is, ⟨a⟩ = {1, a, . . . , a^{m−1}}. This shows that ⟨a⟩ is a finite
set.
Now assume that ⟨a⟩ is a finite set. Then, the powers 1, a, a^2, . . . cannot
all be distinct (otherwise ⟨a⟩ would be infinite), so there exist nonnegative integers k and l, with k < l, such that a^k = a^l. Multiplying by
a^{-k} = (a^k)^{-1}, we find 1 = a^{l−k}. Note that l − k is positive. Thus, the
set of positive integers t such that a^t = 1 is nonempty, since l − k is in
this set. By the well-ordering principle, there is a least positive integer
s such that a^s = 1. This shows that a has finite order, namely s.
Now assume that these equivalent conditions hold. We have already
seen above that if o(a) is finite and equal to some m, then ⟨a⟩ =
{1, a, . . . , a^{m−1}}. Note that these elements are all distinct, since if
a^j = a^k for 0 ≤ j < k ≤ m − 1, then, multiplying both sides by
a^{-j}, we would find a^{k−j} = 1 with 0 < k − j < m, contradicting the
fact that m is the least positive integer n with a^n = 1. Hence o(a) = m
is also the number of elements in ⟨a⟩. □
The following result is useful; its proof uses an idea that we have already
encountered in the proof of Lemma 4.37.2 above:
Lemma 4.37.3. Let a be an element of a group G and suppose that
a^l = 1 for some nonzero integer l. Then the order of a divides l. (In
particular, the order of a is finite.)
Proof. Note that if l is negative, then a^{-l} = (a^l)^{-1} = 1. Hence, the set
of positive integers n such that a^n = 1 is nonempty, since either l or
−l is in that set. By the well-ordering principle, this set has a least
element, so indeed the order of a is finite.
Now suppose the order of a is m. Write l = bm + r for integers b and
r with 0 ≤ r < m. Then a^r = a^l a^{-bm} = a^l (a^m)^{-b} = 1, because both
a^l and a^m equal 1. Since m is the least positive integer n such that
a^n = 1, it follows that r = 0, i.e., that m divides l. □
order m. To see that ⟨a^d⟩ = ⟨a⟩ in the case where d and m are relatively
prime, note that by Lemma 4.37.2, the subgroup ⟨a⟩ has m elements
since a has order m, and likewise, the subgroup ⟨a^d⟩ has m elements
since a^d has order m. But ⟨a^d⟩ is a subset of ⟨a⟩, since any power of
a^d is also a power of a. Since a subset T of a finite set S that has the
same number of elements as S must equal S, we find that ⟨a^d⟩ = ⟨a⟩.
□
Here is a quick exercise to show you that cyclic groups can come in
hidden forms!
Exercise 4.37.4
Show that Z/2Z × Z/3Z is cyclic. (Hint: What is the order of
the element ([1]2 , [1]3 )?) After you work this out, see Exercise
4.80.1 ahead and Exercise 4.94 at the end of the chapter.
Definition 4.38
Let G be a group. The order of G (written o(G)) is the number of
elements in G, if this number is finite. If the number of elements in G
is infinite, we say G is of infinite order.
Remark 4.39
Do not confuse the order of an element a in a group G with the order of
the group G; these refer to two separate concepts. (All the same, even
though these are separate concepts, we will see (Corollary 4.48 ahead)
that the two integers are related.) Note that Lemma 4.37.2 above says
that the order of a equals the order of the subgroup generated by a.
Thus in the special case when G is cyclic, i.e., when G = ⟨a⟩ for some
a ∈ G (see Definition 3.5 above), the order of a and the order of the
group G = hai are indeed the same integers, even though they arise
out of different concepts.
Remark 4.40
Continuing with the special situation at the end of Remark 4.39 above,
let G be any cyclic group of order n. Thus G = ⟨a⟩ for some a ∈ G,
and since G has order n, Lemma 4.37.2 shows that the element a must
have order n, and that G = {1, a, . . . , a^{n−1}}. Notice the similarity with
Example 4.14 above. See also Exercise 4.80.1.
4.2.2 Cosets
We have already seen the notion of the coset of a subgroup with respect
to an element before. We saw this in the context of subgroups I of abelian
groups of the form (R, +), where R is a ring and I is an ideal (see page 109).
We also saw this in the context of subgroups W of abelian groups of the
form (V, +), where V is a vector space and W is a subspace (see page 211).
The following should therefore come as no surprise; the only novel feature
is that we need to distinguish between right and left cosets since the group
operation in an arbitrary group need not be commutative:
Definition 4.41
Let G be a group and let H be a subgroup. Given any a ∈ G, the left
coset of H with respect to a is the set of all elements of the form ah as
h varies in H, and is denoted aH. Similarly, the right coset of H with
respect to a is the set of all elements of the form ha as h varies in H,
and is denoted Ha.
Example 4.42
Let us consider an example that will show that indeed left and right
cosets can be different. Take G to be S3 (see Example 4.2), and let
H = ⟨f1⟩. Since the order of f1 is 2 (see the group table for S3 on page
252), ⟨f1⟩ = {1, f1} by Lemma 4.37.2. Take a to be the element r1.
Then the left coset r1⟨f1⟩ = {r1, r1 f1} = {r1, f3} (see the group table),
while the right coset ⟨f1⟩r1 = {r1, f1 r1} = {r1, f2}. Clearly, the left
and right cosets of r1 with respect to the subgroup ⟨f1⟩ are not equal!
Continuing with this example, let us make a table of all left and right
cosets of ⟨f1⟩:
a      a⟨f1⟩         ⟨f1⟩a
id     {1, f1}       {1, f1}
r1     {r1, f3}      {r1, f2}
r2     {r2, f2}      {r2, f3}
f1     {1, f1}       {1, f1}
f2     {f2, r2}      {f2, r1}
f3     {f3, r1}      {f3, r2}
Notice that every coset (left or right) has exactly two elements, which
is the same as the number of elements in the subgroup ⟨f1⟩
that we are considering. This will be useful in understanding the proof
of Lagrange's theorem (Theorem 4.47) below.
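A table like the one above can be generated mechanically. Here is a sketch with S3 realized as permutations of (0, 1, 2); our identification of f1 with a particular transposition is an assumption, so the labels may differ from the book's, but the left/right asymmetry is visible all the same.

```python
from itertools import permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)) for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(3))

G = list(permutations(range(3)))          # the six elements of S3
identity, f1 = (0, 1, 2), (0, 2, 1)       # f1: a flip (our labeling)
H = [identity, f1]                        # the subgroup <f1>

left  = {a: frozenset(compose(a, h) for h in H) for a in G}   # aH
right = {a: frozenset(compose(h, a) for h in H) for a in G}   # Ha

for a in G:
    print(a, sorted(left[a]), sorted(right[a]),
          "equal" if left[a] == right[a] else "different")
```

Running this reproduces the pattern of the table: every coset has two elements, the cosets of the elements of H itself agree, and the cosets of the 3-cycles do not.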
Exercise 4.43
Take G = S3 and take H = ⟨r1⟩. Write down all left cosets of H
and all right cosets of H with respect to all the elements of G. What
observation do you make?
The following equivalence relation in Lemma 4.44 below is analogous to
the corresponding equivalence relations for rings (see page 109) and vector
spaces (see page 211), except that once again, we need to distinguish two
cases because the group operation need not be commutative. Note that in
the case of rings, for example, we define a ∼ b if and only if a − b ∈ I (where
I is some given ideal). Now note that a − b is really a + (−b). Thus, in the
group situation, the expression analogous to a + (−b) would be ab^{-1}, and
this is indeed the expression we consider in the lemma below. (And while
a + (−b) = (−b) + a in the situation of rings, the operation in a group need not
be commutative, so we need to consider the expression analogous to (−b) + a
as well, which is b^{-1}a.)
Lemma 4.44. Let G be a group and H a subgroup. Define two relations on
G, denoted ∼_L and ∼_R, by the following rules: a ∼_L b if and only if
b^{-1}a ∈ H, and a ∼_R b if and only if ab^{-1} ∈ H. Then ∼_L and ∼_R are both
equivalence relations on G. The equivalence class [a]_L of an element a with
respect to the relation ∼_L is the left coset aH, while its equivalence class [a]_R
with respect to the relation ∼_R is the right coset Ha.
Proof. The proof that ∼_L is an equivalence relation is similar to the proof of
Lemma 2.75 in Chapter 2, except that we have to account for the fact that
the group operation need not be commutative.
To show that a ∼_L a, simply note that a^{-1}a = 1 ∈ H. To show that
a ∼_L b implies that b ∼_L a, note that a ∼_L b gives (by definition) b^{-1}a = h
for some h ∈ H, and taking inverses of both sides (see Exercise 3.5 above),
we find (b^{-1}a)^{-1} = a^{-1}b = h^{-1}. Since h^{-1} is also in H as H is a subgroup,
we find a^{-1}b is in H, which shows that b ∼_L a. Finally, given a ∼_L b and
b ∼_L c, note that (by definition) b^{-1}a = h1 and c^{-1}b = h2 for some h1 and
h2 in H. Then h2 h1 = c^{-1}b b^{-1}a = c^{-1}a, and since h2 h1 is also in H (as H
is a subgroup), we find a ∼_L c as well.
The proof that ∼_R is an equivalence relation is similar.
To prove that [a]_L = aH, note that any element b in aH is of the form ah
for some h ∈ H. Multiplying by a^{-1}, we find a^{-1}b = h and hence a^{-1}b ∈ H.
This shows that b ∼_L a. Thus, all elements in aH are in the equivalence
class of a, i.e., aH ⊆ [a]_L. For the other direction, take any b ∈ [a]_L. Then
b ∼_L a, so (by definition) a^{-1}b = h for some h ∈ H. Thus, multiplying both
sides by a, we find b = ah, so b ∈ aH. Hence, [a]_L ⊆ aH as well.
The proof that [a]_R = Ha is similar. □
4.2.3 Lagrange's theorem
The rest of the proof of the theorem is as described in the first paragraph.
(Note that essentially the same proof shows that any two right cosets of
H have the same number of elements.)
□
Here is an immediate corollary containing a result promised in Remark
4.39 above:
Corollary 4.48. Let G be a group of finite order. Then the order of any
element of G divides the order of G.
Proof. By Lemma 4.37.2, the order of any element a equals the order of the
subgroup ⟨a⟩ generated by a. But, by the theorem above, o(⟨a⟩) divides
o(G). It follows that o(a) divides o(G). □
Here is a corollary to Corollary 4.48:
Corollary 4.49. Let G be a group of finite order d. Then a^d = 1 for all a ∈ G.
Proof. Let the order of a be q. We saw in Corollary 4.48 that q | d, so d = mq
for some integer m. Then, a^d = (a^q)^m = 1^m = 1. □
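Corollary 4.49 is easy to test numerically. A sketch in the group of units of Z/nZ, whose order is φ(n); the choice of n and all names are ours.

```python
# Corollary 4.49 in the group (Z/nZ)^*: a^d = 1 for every element a of a
# group of order d.  Here d = phi(n), the number of units mod n.
from math import gcd

n = 20
units = [a for a in range(1, n) if gcd(a, n) == 1]   # the group (Z/nZ)^*
d = len(units)                                       # its order, phi(20) = 8

assert all(pow(a, d, n) == 1 for a in units)
print(f"a^{d} = 1 (mod {n}) holds for all {d} units")
```

Note that d need not be the *least* exponent that works; the corollary only says the group's order is *an* exponent, and Corollary 4.48 explains why (each element's order divides d).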
4.3
Recall how we formed a quotient ring R/I (see page 109) from a ring
R and an ideal I; the elements of R/I were the cosets a + I as a ranged
through R, and the addition and multiplication were defined respectively by
(a + I) + (b + I) = (a + b) + I, and (a + I) · (b + I) = (ab) + I. We showed that
these rules for addition and multiplication were well-defined (Lemma 2.77)
and then went on to show (Theorem 2.79) that R/I with these operations
was indeed a ring. Similarly, recall how we formed a quotient space V /W
(see page 212) from a vector space V over a field F and a subspace W :
the elements of V /W were the cosets u + W as u ranged through V , and
the vector addition and scalar multiplication were defined respectively by
(u + W ) + (v + W ) = (u + v) + W , and f (u + W ) = f u + W . Once again, we
observed that these rules for vector addition and scalar multiplication were
well-defined (Exercise 3.69) and then went on to show that V /W with these
operations was indeed a vector space over F (Theorem 3.70).
We would of course like to mimic these constructions and form a quotient
group G/H from a group G and a subgroup H: we would take the elements
of G/H to be the various (say, left) cosets gH as g ranges through G, and we
would define the group operation on G/H by aH · bH = (ab)H. But when
we carry out this program, we run into a slight problem: in general, the
operation aH · bH = (ab)H is not well defined! For, suppose that aH = a′H
and bH = b′H. Viewing aH as a′H and bH as b′H, our desired operation
should yield that aH · bH = a′H · b′H = (a′b′)H. Thus, (ab)H ought to equal
(a′b′)H whenever aH = a′H and bH = b′H (or put differently, whenever
a ∼_L a′ and b ∼_L b′).
In general, this need not happen. For instance, take G = S3, and take
H = ⟨f1⟩. Consider the cosets r1⟨f1⟩ = {r1, f3} and r2⟨f1⟩ = {r2, f2} (see the
table in Example 4.42). Now, it is clear from the table that r1⟨f1⟩ = f3⟨f1⟩
and that r2⟨f1⟩ = f2⟨f1⟩. So, the question is: is (r1 r2)⟨f1⟩ = (f3 f2)⟨f1⟩? The
answer is no! We find that (r1 r2)⟨f1⟩ = 1⟨f1⟩ = {1, f1}, while (f3 f2)⟨f1⟩ =
r2⟨f1⟩ = {r2, f2}.
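This failure of well-definedness can be replayed by machine. A sketch; our identification of r1, r2, f1, f2, f3 with particular permutations is an assumption, chosen so that the coset computations match the table in Example 4.42.

```python
# Coset "multiplication" (aH)(bH) = (ab)H is not well defined for H = <f1>.

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

idp = (0, 1, 2)
r1, r2 = (1, 2, 0), (2, 0, 1)                   # the rotations
f1, f2, f3 = (0, 2, 1), (2, 1, 0), (1, 0, 2)    # the flips (our labeling)
H = [idp, f1]

def left_coset(a):
    return frozenset(compose(a, h) for h in H)

# r1 H = f3 H and r2 H = f2 H, yet the "products" of the cosets disagree:
assert left_coset(r1) == left_coset(f3)
assert left_coset(r2) == left_coset(f2)
print(left_coset(compose(r1, r2)) == left_coset(compose(f3, f2)))  # False
```

The same computation run with H taken to be the rotation subgroup would never disagree, which is exactly the point of the definition of normal subgroups below.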
So how should one fix this problem? Let us first analyze the situation
some more. Since a′ = a′ · 1 ∈ a′H and since a′H = aH, we find a′ ∈ aH,
so a′ = ah for some h ∈ H. Similarly, b′ = bk for some k ∈ H. Then
a′b′ = ahbk. If (ab)H ought to equal (a′b′)H, then a′b′ ought to equal abl
for some l ∈ H (see Exercise 3.5). We have gotten a′b′ to look like ahbk;
let us massage this a bit and write it as abb^{-1}hbk. Now, suppose that b^{-1}hb
is also in H by some miracle, say that b^{-1}hb = j for some j ∈ H. Then,
a′b′ = ahbk = abb^{-1}hbk = abjk, and of course, jk ∈ H as both j and k are
in H. It would follow that if this miracle were to happen, then a′b′ would
look like ab times an element of H, and therefore, abH would equal a′b′H.
As the example of G = S3 and H = ⟨f1⟩ above shows, this miracle will not
always happen, but there are some special situations where this will happen,
and we give this a name:
Definition 4.51
Let G be a group. A subgroup H of G is called a normal subgroup if
for any g ∈ G, g^{-1}hg ∈ H for all h ∈ H.
Remark 4.52
Alternatively, write g^{-1}Hg for the set {g^{-1}hg | h ∈ H}. Then we
may rewrite the definition above as follows: H is said to be normal if
g^{-1}Hg ⊆ H for all g ∈ G. Note that this is equivalent to requiring
that gHg^{-1} ⊆ H for all g ∈ G. For, setting y to be g^{-1}, note that as
g      g{1, r1, r2}g^{-1}
id     {1, r1, r2}
r1     {1, r1, r2}
r2     {1, r1, r2}
f1     {1, r2, r1}
f2     {1, r2, r1}
f3     {1, r2, r1}
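The conjugation computations in this table, and Definition 4.51 itself, can be checked by brute force. A sketch; the labeling of the elements of S3 as permutations is our assumption.

```python
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def is_normal(G, N):
    """Definition 4.51: g^{-1} n g lies in N for all g in G, n in N."""
    return all(compose(compose(inverse(g), n), g) in N
               for g in G for n in N)

G = set(permutations(range(3)))
rotations = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}     # {1, r1, r2}
flip_sub  = {(0, 1, 2), (0, 2, 1)}                # {1, f1} (our labeling)
print(is_normal(G, rotations))   # True: matches the table above
print(is_normal(G, flip_sub))    # False: <f1> is not normal
```

This is consistent with the coset computations earlier: the subgroup whose left and right cosets agreed is normal, and the one whose cosets differed is not.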
Remark 4.56
As a result of this corollary, if N is normal in G, we may simply talk
of the cosets of N without specifying whether these are left or right
cosets.
Exercise 4.57
Prove the converse of Corollary 4.55: If N is a subgroup of G such that
for every g G, the left coset gN equals the right coset N g, then N
is normal.
Exercise 4.58
Prove that the center of a group (see Definition 3.5) is a normal subgroup.
Exercise 4.59
Prove that every subgroup of an abelian group is normal.
The following is now a consequence of all our discussions:
Lemma 4.60. Let G be a group, and let N be a normal subgroup. Denote by
G/N the set of cosets (see Remark 4.56) of N . Then, the binary operation
defined on G/N by (aN )(bN ) = (ab)N is well-defined.
Proof. The proof of this lemma is contained in the discussions just before
Definition 4.51. In fact, it was precisely the analysis of what would make
the operation (aH)(bH) = (ab)H on the (left) cosets of an arbitrary subgroup
H well-defined that led us to the definition of normal subgroups. It would
be a good idea to read that discussion and furnish the proof of this lemma
yourselves.
□
Theorem 4.61. Let G be a group, and let N be a normal subgroup. Then, the
set G/N , with the operations as defined in the statement of Lemma 4.60, is
a group.
Proof. We have observed in Lemma 4.60 that this operation is well-defined. We have to check that all the group axioms are satisfied.
1. Associativity: Given aN , bN , and cN in G/N , we have (aN )[(bN )(cN )] =
(aN )[(bc)N ] = [a(bc)]N = [(ab)c]N (the last equality because of associativity in G). On the other hand, [(aN )(bN )](cN ) = [(ab)N ](cN ) =
[(ab)c]N . Hence, (aN )[(bN )(cN )] = [(aN )(bN )](cN ).
2. Prove that G/N has order 4.
3. Prove that G/N is abelian.
4. Prove that G/N is not cyclic.
4.4
Having had enough experience with quantifying the fact that sometimes
the ring operations in two given rings may be the same except perhaps for
dividing out by an ideal, or that, sometimes, the vector space operations
in two given vector spaces over a field may be the same except perhaps
for dividing out by a subspace, the following concept should now be very
intuitive:
Definition 4.65
Let G and H be groups. A function f : G → H is called a group
homomorphism if f(g)f(h) = f(gh) for all g, h ∈ G.
Remark 4.66
Just as with the definitions of ring homomorphisms and vector space
homomorphisms (linear transformations) there are some features of
this definition that are worth noting:
1. In the equation f (g)f (h) = f (gh), note that the operation on the
left side represents the group operation in the group H, while the
operation on the right side represents the group operation in the
group G.
2. By the very definition of a function, f is defined on all of G. The
image of G under f, however, need not be all of H (i.e., f need not
be surjective). We will see examples of this ahead (see Example
4.71 and Example 4.72 for instance). However, the image of G
under f is not an arbitrary subset of H: the definition of a group
homomorphism ensures that the image of G under f is actually
a subgroup of H (see Lemma 4.76 later in this section).
3. Note that it is not necessary to stipulate that f(1_G) = 1_H, since
the property holds automatically; see Lemma 4.67 below.
The following definition should be natural at this point, after your experiences with ring homomorphisms and vector space homomorphisms:
Definition 4.69
Given a group homomorphism f : G → H, the kernel of f is the set
{g ∈ G | f(g) = 1_H}. It is denoted ker(f).
No surprise here:
Proposition 4.70. The kernel of a group homomorphism f : G → H is a
normal subgroup of G.
Proof. Let us prove first that ker(f) is a subgroup. Since 1_G ∈ ker(f)
(see Lemma 4.67), ker(f) is certainly nonempty. Now that we know it is
nonempty, by Lemma 4.31, it is sufficient to show that whenever g and k are
in ker(f), then gk^{-1} is also in ker(f). First note that by Corollary 4.68, f(k)
and f(k^{-1}) are inverses of each other in the group H. With this at hand, we
have f(gk^{-1}) = f(g)f(k^{-1}) = f(g)(f(k))^{-1} = 1_H · 1_H = 1_H (we have invoked
here the fact that both g and k are in the kernel of f, so they get mapped to
1_H under f). We thus find gk^{-1} ∈ ker(f), as desired.
To show ker(f) is normal, we need to show that gkg^{-1} ∈ ker(f) for all
g ∈ G and all k ∈ ker(f). But this is easy: for any g ∈ G and k ∈ ker(f),
f(gkg^{-1}) = f(g)f(k)f(g^{-1}) = f(g) · 1_H · (f(g))^{-1} = f(g)(f(g))^{-1} = 1_H, so
indeed, gkg^{-1} ∈ ker(f). □
Example 4.71
Given groups G and H, the map f : G → H that sends every g ∈ G to
1_H is a group homomorphism.
Question 4.71.1
Why is this f a group homomorphism? What is the kernel of f ?
Notice that if H has more than just the identity element, then f is not
surjective.
Example 4.72
Let R and S be rings, and let f : R S be a ring homomorphism.
Then, focusing just on the addition operations in R and S (with respect
to which we know that R and S are abelian groups), the function
f : (R, +) (S, +) is a group homomorphism. In particular, if f
is not surjective as a ring homomorphism (for example, the natural
inclusion map Z → Q; see Example 2.94 in Chapter 2), then f is not
surjective as a group homomorphism either.
Example 4.73
Let G and H be groups (see Example 4.15). Define a function f :
G × H → H by f(g, h) = h.
Question 4.73.1
Why is this f a group homomorphism? What is the kernel of f ?
Example 4.74
Define a function f : S3 → {1, −1} (see Example 4.13) by f(r1^i f1^j) =
(−1)^j (see Exercise 4.2.1).
Question 4.74.1
Why is this f a group homomorphism? What is the kernel of f ?
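If r1 is taken to be a rotation (a 3-cycle) and f1 a flip (a transposition), the map in Example 4.74 plausibly coincides with the sign of the permutation. A sketch of that sign map and its kernel; the identification is our assumption.

```python
from itertools import permutations

def sign(p):
    """(-1)^(number of inversions): the sign of a permutation."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

G = list(permutations(range(3)))
# sign is a homomorphism S3 -> {1, -1}: sign(pq) = sign(p) sign(q)
assert all(sign(compose(p, q)) == sign(p) * sign(q) for p in G for q in G)

kernel = [p for p in G if sign(p) == 1]
print(sorted(kernel))    # three even permutations: the rotation subgroup
```

The kernel consists of the identity and the two 3-cycles, i.e. the subgroup ⟨r1⟩, which is consistent with Proposition 4.70: kernels are normal subgroups.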
Just as ring isomorphisms capture the notion that the ring operations in two
rings are essentially the same without even having to divide out by any ideal,
and just as vector
space isomorphisms capture the notion that the vector space operations in
two vector spaces are essentially the same without even having to divide
out by any subspace, group isomorphisms capture the notion that the group
operations in two groups are essentially the same without even having to
divide out by any normal subgroup.
As with rings and vector spaces, we need a couple of lemmas first:
Lemma 4.75. Let G and H be two groups and let f : G → H be a group
homomorphism. Then f is an injective function if and only if ker(f) is the
trivial subgroup {1_G}.
Proof. The proof of this is very similar to the proof of the corresponding
Lemma 2.99 in Chapter 2: let us redo that proof in the context of groups.
Suppose f is injective. Suppose that g ∈ ker(f), so f(g) = 1_H. By Lemma
4.67, f(1_G) = 1_H. Since both g and 1_G map to the same element in H and
since f is injective, we find g = 1_G. Thus, the kernel of f consists of just
the element 1_G, which is precisely the trivial subgroup. Conversely, suppose
that ker(f) = {1_G}. Suppose that f(g1) = f(g2) for g1, g2 in G. Since f
is a group homomorphism, we find f(g1 g2^{-1}) = f(g1)f(g2^{-1}) = f(g1)(f(g2))^{-1}
(the last equality is because of Remark 4.68), and of course f(g1)(f(g2))^{-1} =
f(g1)(f(g1))^{-1} = 1_H. Thus, g1 g2^{-1} ∈ ker(f). But ker(f) = {1_G}, so g1 g2^{-1} =
1_G, i.e., g1 = g2. Hence, f is injective. □
Our next lemma is analogous to Lemma 2.100 of Chapter 2 and Lemma
3.88 of Chapter 3:
Lemma 4.76. Let G and H be two groups and let f : G → H be a group
homomorphism. Write f(G) for the image of G under f. Then f(G) is a
subgroup of H.
Example 4.80
Recall that Remark 4.40 showed that if G is a cyclic group of order
n, generated by an element g, then g also has order n and that G =
{1, g, . . . , g^{n−1}}.
Exercise 4.80.1
Extend this statement to prove: If G and H are any two cyclic
groups of order n, then G ≅ H.
Example 4.81
Let G be a cyclic group of order n and H a cyclic group of order m. If m
and n are relatively prime, then the direct product G × H is isomorphic
to C_{nm}, the cyclic group of order nm. (See Exercise 4.37.4, as also Exercise 4.94 at the end of the
chapter.)
Exercise 4.81.1
Prove this by showing first that if G = ⟨g⟩ and H = ⟨h⟩, then
(g, h) must have order mn. Since mn is also the order of G × H,
G × H must equal the cyclic subgroup generated by (g, h). Now
use Exercise 4.80.1 above.
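The order computation in this exercise can be checked numerically by writing the two cyclic groups additively as Z/mZ and Z/nZ, so that (1, 1) plays the role of (g, h). The helper below is ours, not the book's.

```python
# Order of the element (1, 1) in Z/mZ x Z/nZ: the least k > 0 with
# k = 0 mod m and k = 0 mod n, which is lcm(m, n).

def order_in_product(m, n):
    k = 1
    while (k % m, k % n) != (0, 0):
        k += 1
    return k

assert order_in_product(2, 3) == 6      # gcd = 1: order mn, so cyclic
assert order_in_product(4, 6) == 12     # gcd = 2: order lcm(4, 6) < 24
print("order of (1, 1) in Z/2Z x Z/3Z:", order_in_product(2, 3))
```

The second assertion previews Exercise 4.94: when gcd(m, n) > 1, the order of (1, 1) falls short of mn, so the product cannot be cyclic of order mn via this generator.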
Example 4.82
Recall the group G/N where G = D4 and N is the subgroup generated
by 2 (see Exercise 3.5).
Exercise 4.82.1
Prove that G/N is isomorphic to (Z/2Z, +) (Z/2Z, +).
4.5 Further Exercises
Exercise 4.85. You have seen the dihedral groups of index 3 and 4 in the text
(Examples 4.9 and 4.10). The groups Dn are defined analogously for n ≥ 5.
Determine the group table for D5 and determine its center.
Exercise 4.86. We will determine the group of symmetries of the set in Example
4.100 (see Page 322). Recall from Example 3.82 in Chapter 3 that after fixing
a basis of R2 , we can identify the set of all linear transformations of R2 with
M2 (R).
Let T be a linear transformation that preserves the structure of our set, and
let MT be its matrix representation with respect to, say, the standard basis {i, j}
of R^2. Then,

    M_T = | a  b |
          | c  d |

for suitable real numbers a, b, c, and d. We will describe the conditions that a,
b, c, and d must satisfy:
1. By considering the lengths of an arbitrary vector (x, y) before and after
applying T, prove that (ax + by)^2 + (cx + dy)^2 = x^2 + y^2 for all x and y
in R.
2. Show that this relation leads to the following necessary and sufficient conditions for MT to represent a symmetry of our set:
(a) a^2 + c^2 = 1,
(b) b^2 + d^2 = 1, and
(c) ab + cd = 0
3. Show that the conditions in (2) above are equivalent to the condition
(M_T)^t M_T = I, where I is the identity matrix, and (M_T)^t stands for the
transpose of M_T. Conclude from this that any matrix that satisfies the
conditions in (2) above must have determinant equal to ±1.
4. Now assume that MT satisfies these conditions. Observe that this means
that the columns of MT are of length 1, and that the two columns are
perpendicular to each other. (Such a matrix is called orthonormal.) We
have thus determined the symmetries of the set in Example 4.100 to be
the set of 2 × 2 orthonormal matrices with entries in R. Now prove that
this set actually forms a group under matrix multiplication. This group is
known as the orthogonal group of order 2 over R. You should go back and
revisit Example 4.21 as well. In alignment with that example, the group in
this exercise should be denoted O2 (R).
Exercise 4.87. Here is a group that generalizes Example 4.20. Let Un (R) denote
the set of n n upper triangular matrices with entries in R, all of whose diagonal
entries equal 1. Thus, every matrix in Un (R) can be expressed as the sum of the
identity matrix I and a strictly upper triangular matrix N . We will show that
Un (R) forms a group with respect to multiplication.
1. For any matrix M in Mn(R), define the level l of M as follows: the level
of the zero matrix is ∞, and for nonzero matrices M, l(M) = min{j −
i | M_{i,j} ≠ 0}, where M_{i,j} stands for the (i, j) entry of M. Thus, a matrix
is of level 0 if and only if it is upper triangular with at least one nonzero
entry along the diagonal, and it is of level 1 if and only if it is strictly upper
triangular, with at least one nonzero entry along the super diagonal (the
super diagonal is the set of entries that run diagonally from the (1, 2) slot
down to the (n − 1, n) slot), etc. Show that l(MN) ≥ l(M) + l(N). Give
an example of matrices M and N such that MN ≠ 0 and l(MN) >
l(M) + l(N).
2. Conclude that any strictly upper triangular matrix N is nilpotent (see Exercise 2.119 in Chapter 2).
3. Now show using Parts (1) and (2) above that Un (R) forms a group with
respect to matrix multiplication. (You may also want to look at Exercise
2.119 in Chapter 2 for some ideas.)
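The level function of Part (1) is easy to experiment with. A sketch for small matrices stored as nested lists; the code and names are ours, and it only illustrates the inequality in one example, as Part (1) asks for a proof.

```python
# The "level" of a matrix M (Exercise 4.87): the minimum of j - i over
# the nonzero entries M[i][j], with level infinity for the zero matrix.

INF = float("inf")

def level(M):
    n = len(M)
    lvls = [j - i for i in range(n) for j in range(n) if M[i][j] != 0]
    return min(lvls) if lvls else INF

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A strictly upper triangular 3x3 matrix has level >= 1 ...
N = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
assert level(N) == 1
# ... and l(MN) >= l(M) + l(N): multiplication pushes entries up-right.
N2 = matmul(N, N)
assert level(N2) >= 2 * level(N)
assert level(matmul(N2, N)) == INF     # N^3 = 0: N is nilpotent (Part 2)
```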
Exercise 4.88. Let G be a group with an even number of elements. Prove that
G has at least one nonidentity element a such that a^2 = 1. (Hint: To say that a^2 = 1
is to say that a = a^{-1}. Now pair the elements in the group suitably, and invoke
the fact that the group has an even number of elements.)
Exercise 4.89. Prove that a group G is abelian if and only if the function f :
G → G that sends any g to g^{-1} is a homomorphism.
Exercise 4.90. Prove that a group G is abelian if and only if (gh)^2 = g^2 h^2 for
all g and h in G.
Exercise 4.91. The discussions preceding Definition 4.51 established the following: if N is a normal subgroup of G, then the operation on the left cosets of N
determined by (aN)(bN) = abN is well-defined. Prove the converse of this: if
N is a subgroup of G such that the operation on the left cosets of N determined
by (aN)(bN) = abN is well-defined, then N must be normal in G. (Hint: For
any g ∈ G, consider the product of left cosets g⁻¹N · gN = 1_G N = N. For any
n ∈ N, note that the coset (g⁻¹n)N is the same as g⁻¹N (why?). Since the
product is well-defined, (g⁻¹n)N · gN should also equal N. So?)
Exercise 4.92. What is the last digit of 43^99999? (Hint: Work mod 2 and mod
5 separately, applying Corollary 4.49 above, and then combine the results.)
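The structure of the hint can be checked by machine using Python's three-argument `pow` for modular exponentiation; this only confirms that the mod 2 and mod 5 computations determine the answer mod 10, without spelling out the answer itself:

```python
# The last digit is the residue mod 10, and it is pinned down by the
# residues mod 2 and mod 5 (the Chinese-remainder idea behind the hint).
last_digit = pow(43, 99999, 10)
assert last_digit % 2 == pow(43, 99999, 2)
assert last_digit % 5 == pow(43, 99999, 5)

# The exponent 99999 matters only through its residue mod 4, since the
# last digits of powers of a number ending in 3 cycle with period 4:
assert last_digit == pow(43, 99999 % 4, 10)
```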
Exercise 4.93. By Exercise 3.5, the center Z(G) of a group G is a normal
subgroup of G. Hence, it makes sense to talk of the quotient group G/Z(G).
Prove that if G/Z(G) is cyclic, then G must be abelian, and thus, Z(G) must
equal G.
Exercise 4.94. Let G be a cyclic group of order m and H a cyclic group of order
n. Show that G × H is cyclic if and only if gcd(m, n) = 1.
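The statement can be tested empirically for small m and n by searching Z/m × Z/n (written additively) for an element of order mn; the function names are ours, and an exhaustive check over small cases is of course not a proof:

```python
# Brute-force check: Z/m x Z/n is cyclic iff gcd(m, n) = 1, for m, n <= 6.
from math import gcd

def order(a, b, m, n):
    """Order of (a, b) in Z/m x Z/n under componentwise addition."""
    x, y, k = a % m, b % n, 1
    while (x, y) != (0, 0):
        x, y = (x + a) % m, (y + b) % n
        k += 1
    return k

def is_cyclic_product(m, n):
    # Cyclic of order m*n exactly when some element has order m*n.
    return any(order(a, b, m, n) == m * n
               for a in range(m) for b in range(n))

for m in range(1, 7):
    for n in range(1, 7):
        assert is_cyclic_product(m, n) == (gcd(m, n) == 1)
```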
Notes
Remarks on sets with structure and their symmetry groups Recall from
the text (Page 249) that a set with structure is simply a set with a certain
feature that we wish to focus on, and a symmetry of such a set is a one-to-one
onto map from the set to itself that preserves this feature. If f and g are two
such maps, then the composition f ∘ g as well as g ∘ f will also be one-to-one
onto maps that preserve the feature. (Recall that if f : S → S and g : S → S
are two functions from a set S to itself, then the composition of f and g,
written f ∘ g, takes s ∈ S to f(g(s)), and similarly, g ∘ f takes s to g(f(s)).)
Often, if f is such a feature-preserving map, then f⁻¹ (which exists because
f is a bijection) will also preserve the feature, although this is not always
guaranteed. (See the remarks on Page 326 later in these notes for some
examples where the inverse of a structure-preserving map is not structure-preserving.) So, if we restrict our attention to those structure-preserving
maps whose inverse is also structure-preserving, then these maps constitute
a group, called the symmetry group of the set with the given structure.
We consider some examples below of sets with structure and their symmetry groups:
Example 4.95
The set in question could be any set, such as 3 = {1, 2, 3}, with the
feature of interest being merely the fact that it is a set. (This structure
is called the trivial structure.) Of course this particular set has lots
of other features (for example, each element in 3 corresponds to a
length on a number line; see Example 4.96 below), but we do not focus
on any other feature for the moment. Any one-to-one onto map from
a set such as 3 to itself will certainly preserve the feature that 3 is
a set, so, the symmetries of a set with trivial structure are precisely
the various one-to-one maps from the set to itself. We have already
Example 4.98
Just as in the last example, the set could be a piece of cardboard cut
in the shape of a square, with the structure that it is a rigid object.
We have seen the symmetries of this set: it is D₄ (Example 4.10).
Question 4.98.1
Pick one edge of the square, and refer to its direction as the
horizontal direction. Suppose the piece of cardboard of this
example had, additionally, been colored in alternating horizontal
strips of black and white. Suppose that the total number of
strips is odd (and at least three), so that the strips along the two
horizontal edges are both of the same color and there is at least
one strip of the other color. What would be the symmetries of
this new set? What would be the symmetries if the total number
of strips were even, so that the strips along the two horizontal
edges are of different color?
Example 4.99
The set could be R², and the feature of interest could be the fact that
it is a vector space over the reals. What would be the symmetries of
a vector space that preserve its vector space structure? In fact, what
should it mean to preserve the vector space structure? A vector space
is characterized by two operations, an addition operation on vectors,
and a scalar multiplication operation between a scalar and a vector.
We say that a map f : R² → R² preserves its vector space structure if
for any two vectors v and w in R² and for any scalar a ∈ R, f respects
these operations, i.e., if f sends v to some f(v) and w to some f(w),
Example 4.100
This examples puts further conditions on Example 4.99: We could take
the set to be R2 as before, with the structure being that it is a vector
space over the reals, and that each vector has a length. (Recall that
field Q[√2], and the structure could be that (i) Q[√2] is a field, and
rationals.
see that any ring isomorphism from Q[√2] to Q[√2] that is the identity on the rationals necessarily preserves the minimal polynomial over
2.134 in Chapter 2, any ring homomorphism from Q to Q[√2] must automatically be the identity map on the rationals, so this extra condition
do-nothing symmetry, that is, the identity map on Q[√2], and this
conjugation map.
Example 4.102
(This is also from Galois theory, and as with the previous example, you
may wish to postpone this for a future reading.) The set could be the
field Q[√2, √3], and the structure could be that (i) Q[√2, √3] is a field
(see Exercise 2.116 in Chapter 2), and (ii) every element in Q[√2, √3]
a + b√2 + c√3 + d√6 to a + b√2 − c√3 − d√6 (here a, b, c, and d
are rational numbers). Thus, there are precisely two symmetries
of this set. Note, however, that Exercise 2.135 of Chapter 2 shows
that there are other ring isomorphisms from Q[√2, √3] to itself:
Remarks on Exercise 4.5.1 Here is how you may start this exercise: Let j
be an integer in the set n. Let us consider the case where j is one of the
b's, say j = b_k, for some k. Then t(j) = b_{k+1}, where the subscript is taken
modulo e so as to lie in the set {1, 2, . . . , n}. Hence st(j) would be s(b_{k+1}).
Now, because s and t are disjoint cycles, b_{k+1} will not appear among the a's,
and hence, s(b_{k+1}) would equal b_{k+1}. Now work out ts(j) for this particular
case. Next consider the case where j is not one of the b's and work out the
details.
the reals, and that every vector has a length (Example 4.100). Now let us
examine length more closely. The length of a vector pi + qj is defined to be
√(p² + q²). Temporarily ignoring the square root (we will put it back later),
the squared-length of a general vector xi + yj is thus given by x² + y². This
is an example of a quadratic form (a polynomial all of whose monomials
are of degree 2) in two variables. Now note that the polynomial x² + y²
can be written as
(x, y) [[1, 0], [0, 1]] (x, y)ᵗ

(Here, recall from elementary linear algebra that the product of a row vector
(s, t) and a column vector (p, q)ᵗ is given by sp + tq. Thus, since
[[1, 0], [0, 1]] (x, y)ᵗ
is just (x, y)ᵗ, the product above becomes (x, y)(x, y)ᵗ = x² + y², as claimed.)
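The identity above can be confirmed numerically; the short NumPy check below evaluates (x, y) M (x, y)ᵗ for the identity matrix M on a few sample vectors:

```python
# Check that (x, y) M (x, y)^t reproduces x^2 + y^2 when M is the identity.
import numpy as np

M = np.array([[1, 0],
              [0, 1]])
for (x, y) in [(1.0, 2.0), (-3.0, 0.5), (0.0, 0.0)]:
    v = np.array([x, y])
    assert np.isclose(v @ M @ v, x**2 + y**2)
```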
Mathematicians have found it useful to define length differently as well
(we will see a famous example of this ahead). More generally, let q = ax² +
2bxy + cy² be any quadratic form with coefficients a, b, c from the reals. (It
is convenient to write the coefficient of xy as 2b.) Then, q may be written as

(x, y) [[a, b], [b, c]] (x, y)ᵗ

(Check this! Notice how the fact that we wrote the coefficient of xy as 2b
allows us to write the (1, 2) and (2, 1) entries of the matrix above as b. Had
we taken the coefficient of xy as b, then these entries would have had to be
b/2.) Using this quadratic form, we define the q-length of a vector pi + qj
as √(ap² + 2bpq + cq²). (The length may well turn out to be imaginary, as the
quantity under the square root sign may be negative, but that only makes
matters more interesting!) Moreover, we define the q-dot product of two
vectors si + tj and pi + qj to be asp + b(sq + tp) + ctq. Writing M_q for the
matrix [[a, b], [b, c]] above, we find that the q-length of pi + qj is given by

√( (p, q) [[a, b], [b, c]] (p, q)ᵗ ),

and the q-dot product of si + tj and pi + qj is given by

(s, t) [[a, b], [b, c]] (p, q)ᵗ,
for all v and w in R2 . Show that T satisfies Property (1) if and only if
it satisfies Property (2).
More generally, an arbitrary quadratic form q in n variables x₁, x₂, . . . ,
xₙ over a field F is a polynomial in these variables with coefficients in F, all
of whose monomials are of degree 2. As long as 2 ≠ 0 in this field (so we rule
out fields like Z/2Z), we may form a symmetric n × n matrix M_q as above,
where the entries in the slots (i, j) and (j, i) both equal half the coefficient
of x_i x_j in the quadratic form q. (We have to impose the 2 ≠ 0 condition,
because otherwise, we would not be able to divide by 2!) The set of n × n
matrices A satisfying Aᵗ M_q A = M_q forms a group Oₙ(F, q), referred to as
the orthogonal group of q over F.
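The defining condition Aᵗ M_q A = M_q can be tested directly in the case q = x² + y² (so M_q is the identity): rotations satisfy it, while a shear does not. This is an illustrative check, not part of the text's development:

```python
# Rotations lie in O_2(R, q) for q = x^2 + y^2; a shear does not.
import numpy as np

Mq = np.eye(2)
theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])                        # a shear

assert np.allclose(A.T @ Mq @ A, Mq)              # A^t M_q A = M_q holds
assert not np.allclose(S.T @ Mq @ S, Mq)          # ... but fails for S
```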
Perhaps the most famous example of the length of vectors in Rⁿ being measured by quadratic forms other than x₁² + x₂² + · · · + xₙ² is given by
Einstein's theory of relativity. There, space-time is considered as a four-dimensional space, and the length of the vector (t, x, y, z)ᵗ, where t is the time
coordinate and x, y, and z are the usual spatial coordinates, is given by
√(t² − x² − y² − z²). (Actually, this is a drastic simplification: space-time is
not really a vector space but a four-dimensional manifold, and the length
formula above applies on the tangent spaces, which are actual vector spaces,
but that is too mathematically advanced for now.) This quadratic form
t² − x² − y² − z² has associated symmetric matrix

[[1, 0, 0, 0],
[0, −1, 0, 0],
[0, 0, −1, 0],
[0, 0, 0, −1]]
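As a concrete instance of the orthogonal-group condition for this form, a standard Lorentz boost (written here with rapidity φ along the x-axis, in units with c = 1; this specific matrix is a textbook fact, not something derived in the text above) preserves the matrix diag(1, −1, −1, −1):

```python
# Check B^t M B = M for a Lorentz boost B and M = diag(1, -1, -1, -1),
# i.e. the boost preserves the form t^2 - x^2 - y^2 - z^2.
import numpy as np

M = np.diag([1.0, -1.0, -1.0, -1.0])
phi = 0.3
B = np.eye(4)
B[0, 0] = B[1, 1] = np.cosh(phi)
B[0, 1] = B[1, 0] = -np.sinh(phi)

assert np.allclose(B.T @ M @ B, M)   # uses cosh^2 - sinh^2 = 1
```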
Appendix A
Sets, Functions, and Relations
We review here some basic notions that you would have seen in an earlier
course on proofs or on discrete mathematics.
A set is simply a collection of objects. We are of course being informal
here: there are more formal definitions of sets that are based on various axioms designed to avoid paradoxes, but we will not go into such depths in this
appendix. If A is a set, the objects that make up the set A are
also referred to as the elements of A. You will be familiar with both notations
for sets: the explicit notation, such as A = {2, 3, 5, 7}, as well as the implicit or
set-builder notation, such as A = {n | n is a prime integer between 2 and 10}.
You will also be familiar with the notation ∈ for "element of."
If A and B are two sets, we say A is a subset of B (written A ⊆ B) if
x ∈ A implies x ∈ B. If A ⊆ B and B ⊆ A, we say A = B. If A ⊆ B but
A ≠ B, we say that A is a proper subset of B, and we write A ⊊ B.
The union of two sets A and B, denoted A ∪ B, is simply the set {x | x ∈
A or x ∈ B}. The intersection of two sets A and B, denoted A ∩ B, is the set
{x | x ∈ A and x ∈ B}. The difference of two sets A and B, denoted A \ B,
is the set {x | x ∈ A and x ∉ B}. (Note that in general, A \ B ≠ B \ A.)
2. g(n) = n, if n is odd, and g(n) = n/2, if n is even.
3. h(n) = n² + 1.
4. b(n) = n + 1.
Then f is injective but not surjective, g is surjective but not injective,
h is neither injective nor surjective, and b is bijective.
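Three of these claims can be probed on a finite window of inputs; the check below assumes the maps go from Z to Z (an assumption on our part, consistent with b(n) = n + 1 being bijective), and finite evidence is of course not a proof:

```python
# Finite-window evidence for the examples above, with domain/codomain Z.
def g(n): return n if n % 2 else n // 2   # surjective, not injective
def h(n): return n * n + 1                # neither injective nor surjective
def b(n): return n + 1                    # bijective

window = range(-10, 11)
assert g(1) == g(2) == 1                   # g is not injective
assert all(g(2 * n) == n for n in window)  # g is onto: g(2n) = n
assert h(1) == h(-1)                       # h is not injective
assert all(h(n) != 0 for n in window)      # h never takes the value 0
assert sorted(b(n) for n in window) == list(range(-9, 12))  # b just shifts
```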
The Cartesian product of two sets A and B, denoted A × B, is simply
the set of all ordered pairs {(a, b) | a ∈ A, b ∈ B}. A relation on a set A is
simply a subset of A × A. Let R be a relation on a set A. If (a, b) ∈ R, we
say a is related to b and we often write a R b to indicate that a is related
to b under the relation R. The relation R is said to be reflexive if for each
a ∈ A, a R a. R is said to be symmetric if whenever a R b, then b R a as
well. Finally, R is said to be transitive if whenever a R b and b R c, then a R c
as well.
A relation R on a set A that is reflexive, symmetric, and transitive is
called an equivalence relation on A. For any a ∈ A, let us write [a] for the
set of all elements of A that are related to a, that is, [a] = {b | a R b}. The
set [a] is called the equivalence class of a. We have the following: if R is an
equivalence relation on A, then for any two elements a and b in A, either
[a] = [b] or else, [a] and [b] are disjoint. In particular, this means that the
equivalence classes divide A into disjoint sets of the form [a], whose union is
all of A.
The symbol ∼ is often used instead of R to denote a relation on a set.
Example A.2
Perhaps the easiest and most central example of an equivalence relation on a set is the relation on Z defined by saying that m is related
to n (or m ∼ n) iff m − n is even. Convince yourself that this relation is indeed an equivalence relation, and that there are precisely two
equivalence classes: the class [0] and the class [1].
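The three defining properties and the two-class partition can be verified mechanically on a finite window of integers (a spot check, not the "convince yourself" argument the text asks for):

```python
# Parity relation m ~ n iff m - n is even, checked on a finite window.
def related(m, n):
    return (m - n) % 2 == 0

window = range(-5, 6)
assert all(related(a, a) for a in window)                       # reflexive
assert all(related(b, a) for a in window for b in window
           if related(a, b))                                    # symmetric
assert all(related(a, c) for a in window for b in window
           for c in window if related(a, b) and related(b, c))  # transitive

# Exactly two equivalence classes appear: the evens and the odds.
classes = {frozenset(b for b in window if related(a, b)) for a in window}
assert len(classes) == 2
```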
A binary operation on a set A is simply a function f : A × A → A.
As we have seen, the usual operations of addition and multiplication in, for
example, the integers, are just binary operations on Z, that is, functions
Z × Z → Z.
Question A.3
Is division a binary operation on the rationals? How about on the set
Q \ {0}?
A set A is said to be countable if there exists a one-to-one correspondence
between A and some subset of N. If no such correspondence exists, then A is
said to be uncountable. If there exists a one-to-one correspondence between
A and the subset {1, 2, . . . , n} of Z (for some n), then A is said to be finite. If
no such n ∈ Z exists for which there is a one-to-one correspondence between
A and {1, 2, . . . , n}, then A is said to be infinite. Note that an infinite set
can be either countable or uncountable.
Example A.4
Any set with a finite number of elements is countable, by definition of
finiteness and countability.
Example A.5
Any subset of a countable set is also countable.
Example A.6
The integers are countable. One one-to-one correspondence between Z
and N is the one that sends a to 2a if a ≥ 0, and a to 2(−a) − 1 if
a < 0.
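This correspondence can be checked on a window of integers: non-negative integers land on the even naturals and negative integers on the odd naturals, with no collisions (here N is taken to include 0, which the formula presupposes):

```python
# The map Z -> N from Example A.6, checked to be one-to-one and to cover
# an initial segment of N on a symmetric window of integers.
def to_nat(a):
    return 2 * a if a >= 0 else 2 * (-a) - 1

images = [to_nat(a) for a in range(-10, 11)]
assert sorted(images) == list(range(21))   # hits 0..20, each exactly once
```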
Example A.7
The Cartesian product of two countable sets is countable. Here is a
sketch of a proof when both A and B are infinite. There exists a
one-to-one correspondence between A and N (why?), and in turn, there
exists a one-to-one correspondence between N and the set {2ⁿ | n ∈ N}.
Composing, we get a one-to-one correspondence f between A and the
set {2ⁿ | n ∈ N}. Similarly, we have a one-to-one correspondence g
between B and the set {3ⁿ | n ∈ N}. Now define the map
h : A × B → N by h(a, b) = f(a)g(b), and show that this is a one-to-one
map, so that A × B is in one-to-one correspondence with a subset of N.
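Taking A = B = N for concreteness, the injectivity of h(a, b) = 2ᵃ3ᵇ rests on unique factorization; a grid check makes this tangible (finite evidence only, and the choice A = B = N is ours):

```python
# h(a, b) = 2^a * 3^b produces no collisions on a finite grid, illustrating
# the injection in Example A.7.
pairs = [(a, b) for a in range(8) for b in range(8)]
values = [2**a * 3**b for (a, b) in pairs]
assert len(set(values)) == len(values)   # all 64 values are distinct
```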
Example A.8
The rationals Q are countable. This is because we may view Q ⊆ Z × Z
by identifying the rational number a/b, written in lowest terms, with
the ordered pair (a, b). By Example A.7 above, Z × Z is countable, and
hence by Example A.5, Q is also countable.
Example A.9
The reals R are uncountable. The proof of this is the famous Cantor
diagonalization argument.
Appendix B
Partially Ordered Sets, and Zorn's
Lemma
on T.
This partial order could also have been defined on the set of all subsets
of S; we chose to define it only on the set of proper subsets to make
the situation more interesting (see Example B.4 ahead, for instance)!
Given a partial order ≤ on a set, two elements x and y are said to
be comparable if either x ≤ y or y ≤ x. If neither x ≤ y nor y ≤ x, then
x and y are said to be incomparable. For instance, in Example B.1, 2 and
3 are incomparable, since neither 2|3 nor 3|2. Similarly, in the set of all
proper subsets of, say, the set {1, 2, 3}, the subsets {1, 2} and {1, 3} are
incomparable, since neither of these sets is a subset of the other.
Given a partial order ≤ on a set S, and given a subset A of S, an upper
bound of A is an element z ∈ S such that x ≤ z for all x ∈ A.
Example B.3
In Example B.1, if we take A to be the set {1, 2, 3, 4, 5, 6}, then
lcm(1, 2, 3, 4, 5, 6) = 60 is an upper bound for A.
Note that not all subsets of S need have an upper bound. For instance,
if we take B in this same example to be the set of all powers of 2, then
there is no integer divisible by 2ᵐ for all values of m, so B will not have
an upper bound.
Given a partial order ≤ on a set S, a maximal element in S is an element x such that for any other element y, either y ≤ x or else x and y are
incomparable.
Example B.4
In Example B.2, suppose we took S = {1, 2, 3}, so
T = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}}.
Then {1, 2} is maximal: each of ∅, {1}, and {2} is ≤ {1, 2}, while
{3}, {1, 3}, and {2, 3} cannot be compared with {1, 2}.
Of course, these same arguments show that {1, 3} and {2, 3} are also
maximal elements.
Note that if, instead, we had taken T to be the set of all subsets of
{1, 2, 3}, then there would only have been one maximal element, namely
{1, 2, 3}, and every other subset X would have satisfied X ≤ {1, 2, 3}.
Having several maximal elements incomparable to one another is certainly a more intriguing situation!
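The claim that exactly the three two-element subsets are maximal can be confirmed by brute-force enumeration (the helper names are ours):

```python
# Enumerate the proper subsets of {1, 2, 3} ordered by inclusion and find
# the maximal elements: the three two-element subsets.
from itertools import combinations

S = {1, 2, 3}
T = [frozenset(c) for r in range(3) for c in combinations(S, r)]

def is_maximal(x):
    # Maximal: no strictly larger element of T ('<' is proper inclusion).
    return not any(x < y for y in T)

maximal = {x for x in T if is_maximal(x)}
assert maximal == {frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})}
```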
A partial order on a set that has the further property that any two elements are comparable is called a linear order. For example, the usual order
relation ≤ on R is a linear order.
Given a partial order ≤ on a set S, a chain in S is a nonempty subset A
of S that is linearly ordered with respect to ≤, i.e., for all x and y from A,
either x ≤ y or y ≤ x.
Example B.5
In Example B.3, note that B is a chain, since every element of B is a
power of 2, and given elements 2ᵐ and 2ⁿ in B, if m ≤ n then 2ᵐ|2ⁿ,
else 2ⁿ|2ᵐ. On the other hand, A is not a chain: we have already seen
that 2 and 3 are incomparable.
Zorn's Lemma, in spite of its name, is really not a lemma, but a universally
accepted axiom of logic. It states the following:
Zorn's Lemma Let S be a nonempty set with a partial order ≤. If every
chain in S has an upper bound in S, then S has a maximal element.
Zorn's Lemma is equivalent to certain other axioms of logic, most famously, to the Axiom of Choice. What this means is that if one were to accept
the statement of Zorn's Lemma as a fundamental axiom of logic, then in conjunction with other accepted axioms of logic, one can derive the statement of
the Axiom of Choice. Conversely, if one were to accept the Axiom of Choice
as a fundamental axiom of logic, then in conjunction with other accepted
axioms of logic, one can derive the statement of Zorn's Lemma.
Here is a typical application of Zorns Lemma. Recall from Exercise 2.132
of Chapter 2 the definition of maximal ideals.
Theorem B.6. Let R be a ring. Then R contains maximal ideals.
Proof. Let S be the set of all proper ideals of R. Note that S is nonempty,
since the zero ideal ⟨0⟩ is in S. We define a partial order ≤ on S by I ≤ J
if and only if I ⊆ J (see Example B.2 above). Let T be a chain in S. Recall
what this means: T is a collection of proper ideals of R such that if I and
J are in the collection, then either I ⊆ J or else J ⊆ I. We claim that T
has an upper bound in S, i.e., there exists a proper ideal K in R such that
I ⊆ K for all I in our chain T. The proof of the claim is simple. By the
definition of being a chain, T is nonempty, so T contains at least one ideal
of R. We define K, as a set, to be the union of all the ideals I in T. We
need to show that K is a proper ideal of R. This is easy. Note that since
there is at least one ideal in T, and since this ideal contains 0, K must be
nonempty as it must contain at least the element 0. Now given a and b in
K, note that a must live in some ideal I in T and b must live in some ideal J
in T, since K is, after all, the union of all the ideals in T. Since T is linearly
ordered (this is where the property that chains are linearly ordered comes
in), either I ⊆ J or else J ⊆ I. Say I ⊆ J. Then both a and b are in J.
Hence, a + b is also in J, as J is an ideal. Since J in turn is contained in K,
we find a + b ∈ K. This shows that K is closed under addition. Now given
any a ∈ K, as before, a ∈ I for some ideal I in T. Since I is an ideal, both
ar and ra are in I for all r ∈ R. Since I ⊆ K, we find ar and ra are in K.
By Lemma 2.65 of Chapter 2, we find K is an ideal. Of course, K is clearly
an upper bound for T, since I ⊆ K for all I in T by the very manner in
which we have defined K.
Note that indeed K is a proper ideal of R, i.e., K is in S. For, if not, then
K = R, so in particular, this means that 1 ∈ K. Since K is the union of the
ideals in T, we find 1 ∈ I for some ideal I in T. But this is a contradiction,
since I is a proper ideal of R (remember that the set S was defined as the
set of all proper ideals of R, and I is a member of S).
Since T was arbitrary, we have found that every chain in S has an upper
bound in S. By Zorn's lemma, S has a maximal element. But a maximal
element of S is precisely a maximal ideal of R!
□
Now we will present the proof that bases exist in all vector spaces, not just
in those with a finite spanning set; this proof invokes Zorn's Lemma. Recall
that we can assume that our vector space is nontrivial, thanks to Example
3.35 of Chapter 3.
Theorem B.7. Every vector space has a basis.
Proof. Let S be the set of all linearly independent subsets of V . Since V is
not trivial by assumption, it has at least one nonzero vector, say v, and the
set {v} is then linearly independent (Exercise 3.22.1). It follows that S is a
nonempty set.
Define a partial order ≤ on S by declaring, for any two linearly independent
subsets X and Y, that X ≤ Y if and only if X ⊆ Y. It is easy to check that
this is indeed a partial order: First, given any linearly independent subset
X of V, clearly X ⊆ X, so indeed X ≤ X. Next, if X and Y are two
linearly independent subsets of V and if X ≤ Y and Y ≤ X, this means that
X ⊆ Y and Y ⊆ X, so indeed X = Y. Finally, if X ≤ Y and Y ≤ Z for
three linearly independent subsets X, Y, and Z of V, then this means that
X ⊆ Y ⊆ Z, i.e., X ⊆ Z, so indeed X ≤ Z.
Our strategy will be to first establish that S has a maximal element with
respect to this partial order, and then to show that this maximal element
must be a basis for V .
Given any chain T in S (recall that this means that T consists of linearly
independent subsets of V with the property that if X and Y are in T, then
either X ⊆ Y or Y ⊆ X), we will show that T has an upper bound in
S. Write K for the union of all linearly independent subsets X that are
contained in T. We claim that K is an upper bound for T. Let us first show
that K is a linearly independent subset of V. By Definition 3.22 of Chapter
3, we need to show that every finite subset of K is linearly independent.
Given any finite set of vectors v₁, . . . , vₙ from K, note that each vᵢ must
live in some linearly independent subset Xᵢ in the chain T. Since T is a
chain, the subsets in T are linearly ordered (this is where we use the defining
property that the elements of a chain are linearly ordered!), so we must have
X_{i_1} ⊆ X_{i_2} ⊆ · · · ⊆ X_{i_n} for some permutation (i_1, i_2, . . . , i_n) of the integers
(1, 2, . . . , n). Thus, all the vectors v₁, . . . , vₙ belong to X_{i_n}. But since X_{i_n}
is a linearly independent set, Definition 3.22 of Chapter 3 implies that the
vectors v₁, . . . , vₙ must be linearly independent! Since this is true for any
finite set of vectors in K, we find that K is a linearly independent set. In
particular, K is in S.
Now note that given any linearly independent subset X contained in the
chain T, we have X ⊆ K by the very definition of K, so by definition of the
order relation, X ≤ K. This shows that indeed T has an upper bound in S.
By Zorn's Lemma, S has a maximal element, call it B. We will show
that B must be a basis of V. Since B is already linearly independent, we
only need to show that B spans V. So let v be any nonzero vector in V: we
need to show that v can be written as a linear combination of elements of
B. If v is already in B, there is nothing to prove (why?). If v is not in B,
B ∪ {v} must be linearly dependent; otherwise, B ∪ {v} would be a linearly
independent subset of V strictly containing B, violating the maximality of B.
Thus, there exists a relation f₀v + f₁b₁ + f₂b₂ + · · · + fₖbₖ = 0 for some scalars
f₀, f₁, . . . , fₖ (not all zero), and some vectors b₁, b₂, . . . , bₖ of B. Notice that
f₀ ≠ 0, since otherwise, our relation would read f₁b₁ + f₂b₂ + · · · + fₖbₖ = 0
(with not all fᵢ equal to zero), which is impossible since the bᵢ are in B
and B is a linearly independent set. Therefore, we can divide by f₀ to find
v = −(f₁/f₀)b₁ − (f₂/f₀)b₂ − · · · − (fₖ/f₀)bₖ. Hence v can be written as
a linear combination of elements of B, so B spans V.
Thus, B is a basis of V. □
B. For, assume that we have shown this. Then, given any vector v ∈ V,
first write it as v = f₁u₁ + · · · + fₙuₙ for suitable vectors uᵢ and scalars
fᵢ, invoking the fact that the spanning set spans V. Next, since we would have shown that
every vector in the spanning set is expressible as a linear combination of elements in B, we
find that each uᵢ is expressible as uᵢ = f_{i,1}b_{i,1} + · · · + f_{i,n_i}b_{i,n_i} for some vectors
b_{i,j} ∈ B and scalars f_{i,j}. Substituting these expressions for each uᵢ into the
expression above for v, we find that v is expressible as a linear combination
of elements of B, i.e., that B spans V.
To show that every vector in the spanning set is expressible as a linear combination
of elements of B, assume that some vector u in it is not expressible as a linear
combination of elements of B. Then, exactly as in the proof of Proposition
3.49 (see how we showed C₁ = C ∪ {v_{t+1}} must be linearly independent), we
would find that B ∪ {u} is linearly independent. But this contradicts the
maximality of B! Hence every vector in the spanning set must be expressible as a linear
combination of elements of B, which means that B must be a basis. Since
B is contained in the spanning set, we have succeeded in shrinking it down to a basis.
of V that contains C.
Appendix C
GNU Free Documentation License
Version 1.3, 3 November 2008
Copyright 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
<http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies of this license
document, but changing it is not allowed.
Preamble
The purpose of this License is to make a manual, textbook, or other functional
and useful document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either
commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered
responsible for modifications made by others.
This License is a kind of copyleft, which means that derivative works of the
document must themselves be free in the same sense. It complements the GNU
General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software,
because free software needs free documentation: a free program should come with
manuals providing the same freedoms that the software does. But this License is
not limited to software manuals; it can be used for any textual work, regardless of
subject matter or whether it is published as a printed book. We recommend this
License principally for works whose purpose is instruction or reference.
1. APPLICABILITY AND DEFINITIONS
This License applies to any manual or other work, in any medium, that contains
a notice placed by the copyright holder saying it can be distributed under the terms
of this License. Such a notice grants a world-wide, royalty-free license, unlimited in
duration, to use that work under the conditions stated herein. The Document,
below, refers to any such manual or work. Any member of the public is a licensee,
and is addressed as you. You accept the license if you copy, modify or distribute
the work in a way requiring permission under copyright law.
A Modified Version of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or
translated into another language.
A Secondary Section is a named appendix or a front-matter section of the
Document that deals exclusively with the relationship of the publishers or authors
of the Document to the Documents overall subject (or to related matters) and
contains nothing that could fall directly within that overall subject. (Thus, if
the Document is in part a textbook of mathematics, a Secondary Section may
not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial,
philosophical, ethical or political position regarding them.
The Invariant Sections are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the
Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The
Document may contain zero Invariant Sections. If the Document does not identify
any Invariant Sections then there are none.
The Cover Texts are certain short passages of text that are listed, as FrontCover Texts or Back-Cover Texts, in the notice that says that the Document is
released under this License. A Front-Cover Text may be at most 5 words, and a
Back-Cover Text may be at most 25 words.
A Transparent copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that
is suitable for revising the document straightforwardly with generic text editors
or (for images composed of pixels) generic paint programs or (for drawings) some
widely available drawing editor, and that is suitable for input to text formatters
or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or
absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used
for any substantial amount of text. A copy that is not Transparent is called
Opaque.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using
a publicly available DTD, and standard-conforming simple HTML, PostScript
or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats
that can be read and edited only by proprietary word processors, SGML or XML
for which the DTD and/or processing tools are not generally available, and the
machine-generated HTML, PostScript or PDF produced by some word processors
for output purposes only.
The Title Page means, for a printed book, the title page itself, plus such
following pages as are needed to hold, legibly, the material this License requires to
appear in the title page. For works in formats which do not have any title page
as such, Title Page means the text near the most prominent appearance of the
works title, preceding the beginning of the body of the text.
The publisher means any person or entity that distributes copies of the
Document to the public.
and Back-Cover Texts on the back cover. Both covers must also clearly and legibly
identify you as the publisher of these copies. The front cover must present the full
title with all words of the title equally prominent and visible. You may add other
material on the covers in addition. Copying with changes limited to the covers, as
long as they preserve the title of the Document and satisfy these conditions, can
be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you
should put the first ones listed (as many as fit reasonably) on the actual cover,
and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more
than 100, you must either include a machine-readable Transparent copy along
with each Opaque copy, or state in or with each Opaque copy a computer-network
location from which the general network-using public has access to download using
public-standard network protocols a complete Transparent copy of the Document,
free of added material. If you use the latter option, you must take reasonably
prudent steps, when you begin distribution of Opaque copies in quantity, to ensure
that this Transparent copy will remain thus accessible at the stated location until
at least one year after the last time you distribute an Opaque copy (directly or
through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document
well before redistributing any large number of copies, to give them a chance to
provide you with an updated version of the Document.
4. MODIFICATIONS
You may copy and distribute a Modified Version of the Document under the
conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the
Document, thus licensing distribution and modification of the Modified Version
to whoever possesses a copy of it. In addition, you must do these things in the
Modified Version:
A. Use in the Title Page (and on the covers, if any) a title distinct from that of
the Document, and from those of previous versions (which should, if there
were any, be listed in the "History" section of the Document). You may use
the same title as a previous version if the original publisher of that version
gives permission.
B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together
with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this
requirement.
C. State on the Title page the name of the publisher of the Modified Version,
as the publisher.
D. Preserve all the copyright notices of the Document.
E. Add an appropriate copyright notice for your modifications adjacent to the
other copyright notices.
F. Include, immediately after the copyright notices, a license notice giving the
public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
G. Preserve in that license notice the full lists of Invariant Sections and required
Cover Texts given in the Document's license notice.
H. Include an unaltered copy of this License.
I. Preserve the section Entitled "History", Preserve its Title, and add to it
an item stating at least the title, year, new authors, and publisher of the
Modified Version as given on the Title Page. If there is no section Entitled
"History" in the Document, create one stating the title, year, authors, and
publisher of the Document as given on its Title Page, then add an item
describing the Modified Version as stated in the previous sentence.
J. Preserve the network location, if any, given in the Document for public access
to a Transparent copy of the Document, and likewise the network locations
given in the Document for previous versions it was based on. These may
be placed in the "History" section. You may omit a network location for a
work that was published at least four years before the Document itself, or if
the original publisher of the version it refers to gives permission.
K. For any section Entitled "Acknowledgements" or "Dedications", Preserve
the Title of the section, and preserve in the section all the substance and
tone of each of the contributor acknowledgements and/or dedications given
therein.
L. Preserve all the Invariant Sections of the Document, unaltered in their text
and in their titles. Section numbers or the equivalent are not considered
part of the section titles.
M. Delete any section Entitled "Endorsements". Such a section may not be
included in the Modified Version.
N. Do not retitle any existing section to be Entitled "Endorsements" or to
conflict in title with any Invariant Section.
O. Preserve any Warranty Disclaimers.
If the Modified Version includes new front-matter sections or appendices that
qualify as Secondary Sections and contain no material copied from the Document,
you may at your option designate some or all of these sections as invariant. To
do this, add their titles to the list of Invariant Sections in the Modified Version's
license notice. These titles must be distinct from any other section titles.
You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties, for example,
statements of peer review or that the text has been approved by an organization
as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage
of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the
Modified Version. Only one passage of Front-Cover Text and one of Back-Cover
Text may be added by (or through arrangements made by) any one entity. If the
Document already includes a cover text for the same cover, previously added by
you or by arrangement made by the same entity you are acting on behalf of, you
may not add another; but you may replace the old one, on explicit permission from
the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give
permission to use their names for publicity for or to assert or imply endorsement
of any Modified Version.
5. COMBINING DOCUMENTS
You may combine the Document with other documents released under this
License, under the terms defined in section 4 above for modified versions, provided
that you include in the combination all of the Invariant Sections of all of the original
documents, unmodified, and list them all as Invariant Sections of your combined
work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple
identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title
of each such section unique by adding at the end of it, in parentheses, the name of
the original author or publisher of that section if known, or else a unique number.
Make the same adjustment to the section titles in the list of Invariant Sections in
the license notice of the combined work.
In the combination, you must combine any sections Entitled "History" in the
various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".
6. COLLECTIONS OF DOCUMENTS
You may make a collection consisting of the Document and other documents
released under this License, and replace the individual copies of this License in
the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the
documents in all other respects.
You may extract a single document from such a collection, and distribute it
individually under this License, provided you insert a copy of this License into
the extracted document, and follow this License in all other respects regarding
verbatim copying of that document.
7. AGGREGATION WITH INDEPENDENT WORKS
A compilation of the Document or its derivatives with other separate and
independent documents or works, in or on a volume of a storage or distribution
medium, is called an "aggregate" if the copyright resulting from the compilation
is not used to limit the legal rights of the compilation's users beyond what the
individual works permit. When the Document is included in an aggregate, this
License does not apply to the other works in the aggregate which are not themselves
derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the
Document, then if the Document is less than one half of the entire aggregate,
the Document's Cover Texts may be placed on covers that bracket the Document
within the aggregate, or the electronic equivalent of covers if the Document is in
electronic form. Otherwise they must appear on printed covers that bracket the
whole aggregate.
8. TRANSLATION
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders,
but you may include translations of some or all Invariant Sections in addition to
the original versions of these Invariant Sections. You may include a translation
of this License, and all the license notices in the Document, and any Warranty
Disclaimers, provided that you also include the original English version of this
License and the original versions of those notices and disclaimers. In case of a
disagreement between the translation and the original version of this License or a
notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled "Acknowledgements", "Dedications",
or "History", the requirement (section 4) to Preserve its Title (section 1) will
typically require changing the actual title.
9. TERMINATION
You may not copy, modify, sublicense, or distribute the Document except as
expressly provided under this License. Any attempt otherwise to copy, modify,
sublicense, or distribute it is void, and will automatically terminate your rights
under this License.
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if
the copyright holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable
means, this is the first time you have received notice of violation of this License
(for any work) from that copyright holder, and you cure the violation prior to 30
days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses
of parties who have received copies or rights from you under this License. If your
rights have been terminated and not permanently reinstated, receipt of a copy of
some or all of the same material does not give you any rights to use it.
10. FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU
Free Documentation License from time to time. Such new versions will be similar
in spirit to the present version, but may differ in detail to address new problems
or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the
Document specifies that a particular numbered version of this License or any later
version applies to it, you have the option of following the terms and conditions
either of that specified version or of any later version that has been published (not
as a draft) by the Free Software Foundation. If the Document does not specify a
version number of this License, you may choose any version ever published (not as
a draft) by the Free Software Foundation. If the Document specifies that a proxy
can decide which future versions of this License can be used, that proxy's public
statement of acceptance of a version permanently authorizes you to choose that
version for the Document.
11. RELICENSING
"Massive Multiauthor Collaboration Site" (or "MMC Site") means any World
Wide Web server that publishes copyrightable works and also provides prominent
facilities for anybody to edit those works. A public wiki that anybody can edit is
an example of such a server. A "Massive Multiauthor Collaboration" (or "MMC")
contained in the site means any set of copyrightable works thus published on the
MMC site.
"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 license
published by Creative Commons Corporation, a not-for-profit corporation with a
principal place of business in San Francisco, California, as well as future copyleft
versions of that license published by that same organization.
"Incorporate" means to publish or republish a Document, in whole or in part,
as part of another Document.
An MMC is "eligible for relicensing" if it is licensed under this License, and
if all works that were first published under this License somewhere other than
this MMC, and subsequently incorporated in whole or in part into the MMC, (1)
had no cover texts or invariant sections, and (2) were thus incorporated prior to
November 1, 2008.
The operator of an MMC Site may republish an MMC contained in the site
under CC-BY-SA on the same site at any time before August 1, 2009, provided
the MMC is eligible for relicensing.
ADDENDUM: How to use this License for your documents
To use this License in a document you have written, include a copy of the
License in the document and put the following copyright and license notices just
after the title page:
Copyright © YEAR YOUR NAME. Permission is granted to copy,
distribute and/or modify this document under the terms of the GNU
Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections,
no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation
License".
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with . . . Texts." line with this:
with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being
LIST.
If you have Invariant Sections without Cover Texts, or some other combination
of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend
releasing these examples in parallel under your choice of free software license, such
as the GNU General Public License, to permit their use in free software.
Index
Active learning, 3
center, 268
cyclic group, 269, 287
Dihedral group
D3 , 260
D4 , 263
Chain, 341
Closure
homomorphism, 307
under addition, 84
under multiplication, 84
kernel, 309
isomorphism, 312
examples, 312
Euclid, 43
Euler's φ-function, 50
nonabelian, 253
Field, 93
order, 291
examples, 94
finite, 97, 98
orthogonal group
multiplicative group, 94
subgroup, 283
coset, 293
SLn(F), 276
symmetric group
abelian, 61
S2 , 254
S3 , 251
Sn , 254
d-cycle, 255
symmetry group of set with structure, 249, 319
table, 251, 252
upper triangular invertible matrices,
276
Harmonic series, 50
Ideal, 101
coset with respect to, 109
examples, 103
Polynomial
Bernstein, 240
expression, 159
Polynomial expression, 159
Polynomials
division algorithm, 147
Prime, 37
infinitely many, 43
linear combination, 34
Proofs
mod 2, 75
mod n, 76
multiple, 29
multiplication, 61
How to do, 9
Q[√2], 66
Q[√2, √3], 87
quadratic form, 327
prime, 37
relatively prime, 36
unique prime factorization, 39
Ring, 63
center, 142
commutative, 65
examples, 65
homomorphism, 117
examples, 122
Fundamental Theorem, 134
kernel, 121
Vector Space
linear transformations
Fundamental Theorem, 233
Vector space, 167
basis, 188
examples, 188
integral domains, 91
invertible element, 93
dimension, 199
examples, 168
isomorphism, 232
automorphism, 132
examples, 129
kernel, 220
noncommutative, 65
ring extension, 84
unit, 93
zero-divisors, 90
scalars, 168
spanning set, 179
subspace, 205
coset with respect to, 211
examples, 207
vectors, 168
examples, 86
generated by an element, 158
examples, 160
test, 85
Subspace
test, 206
Unique prime factorization, 39