
Aspects of Symmetry
Robert B. Howlett
Contents

Chapter 1: Symmetry
  1a An example of abstract symmetry
  1b Structure preserving transformations
  1c The symmetries of some structured sets
  1d Some groups of transformations of a set with four elements
Chapter 2: Introductory abstract group theory
  2a Group axioms
  2b Basic deductions from the axioms
  2c Subgroups and cosets
  2d On the number of elements in a set
  2e Equivalence relations
  2f Cosets revisited
  2g Some examples
  2h The index of a subgroup
Chapter 3: Homomorphisms, quotient groups and isomorphisms
  3a Homomorphisms
  3b Quotient groups
  3c The Homomorphism Theorem
Chapter 4: Automorphisms, inner automorphisms and conjugacy
  4a Automorphisms
  4b Inner automorphisms
  4c Conjugacy
  4d On the number of elements in a conjugacy class
Chapter 5: Reflections
  5a Inner product spaces
  5b Dihedral groups
  5c Higher dimensions
  5d Some examples
Chapter 6: Root systems and reflection groups
  6a Root systems
  6b Positive, negative and simple roots
  6c Diagrams
  6d Existence and inadmissibility proofs
Index of notation
Index
Chapter 1: Symmetry
Of the following geometrical objects, some are rather symmetrical, others
less so:

[Figure: an assortment of plane figures of varying degrees of symmetry]
Objects exhibiting a high degree of symmetry are more special, and
possibly (depending on your taste) more appealing than the less symmetrical
ones. They are, therefore, natural objects of mathematical interest.
When studying symmetrical objects one can always exploit the symmetry to
limit the amount of work one has to do. It is only necessary to measure one
side of a square, since symmetry says that all other sides are the same. It is
not necessary to measure any of the angles: it is a consequence of symmetry
that they are all ninety degrees.
Although measuring a square is a trivial application of symmetry, it
gives a tiny glimpse of the importance of symmetry in mathematics. Sym-
metry occurs not only in concrete, geometrical situations, but also in highly
complex abstract situations, where exploitation of the symmetrical aspects of
the situation can provide methods for dealing with problems that otherwise
would be hopelessly intractable.
Group theory is the mathematical theory of symmetry, in which the
basic tools for utilizing symmetry are developed. The purpose of these notes
is to provide an introduction to group theory, concentrating in particular on
a very important class of geometrical groups: the finite Euclidean reflection
groups.
1a An example of abstract symmetry
Before launching into a theoretical discussion of symmetry, we take a brief
look at an example of the use of symmetry in an abstract situation. After
this section our examples of symmetry will be almost entirely geometrical in
character, but the example we consider here is drawn from number theory.
Consider the equation x^2 + y^2 = 1, and suppose we wish to find solutions
of this which have the form x = p/q and y = r/s, where p, q, r and s
are integers. It is not hard to see that this problem amounts to finding
Pythagorean triples: triples (a, b, c) of integers satisfying a^2 + b^2 = c^2. The
most famous of these triples, (3, 4, 5), yields the solution x = 3/5, y = 4/5.
Now, how can we bring symmetry into this?
If an equation possesses symmetry, once one solution has been found,
the symmetry may present you with other solutions for free. There is an
obvious symmetry to the equation x^2 + y^2 = 1, given by interchanging x
and y. So of course we have the solution x = 4/5, y = 3/5 also. This is
not very exciting. But there are other symmetries of the equation which you
probably haven't noticed yet. For example, the transformation

    x ↦ x' = (3/5)x + (4/5)y
    y ↦ y' = (4/5)x - (3/5)y

can be seen to be a symmetry, since if x^2 + y^2 = 1 then

    (x')^2 + (y')^2 = ((3/5)x + (4/5)y)^2 + ((4/5)x - (3/5)y)^2
                    = ((9/25)x^2 + (24/25)xy + (16/25)y^2)
                          + ((16/25)x^2 - (24/25)xy + (9/25)y^2)
                    = x^2 + y^2
                    = 1.
So, starting with the solution x = 15/17, y = 8/17, derived from the well
known Pythagorean triple (8, 15, 17), we deduce that

    x' = (3/5)(15/17) + (4/5)(8/17) = 77/85
    y' = (4/5)(15/17) - (3/5)(8/17) = 36/85
is another solution to our original equation. Symmetry has combined with
two well-known facts to give us a somewhat less well-known one.
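The computation above is easy to check mechanically. Here is a small Python sketch (our illustration, not part of the original notes; the function name apply_symmetry is ours) that applies the transformation using exact rational arithmetic:

```python
from fractions import Fraction

def apply_symmetry(x, y):
    """Apply (x, y) -> ((3/5)x + (4/5)y, (4/5)x - (3/5)y) exactly."""
    a, b = Fraction(3, 5), Fraction(4, 5)
    return a * x + b * y, b * x - a * y

# start from the solution derived from the triple (8, 15, 17)
x, y = Fraction(15, 17), Fraction(8, 17)
x2, y2 = apply_symmetry(x, y)
assert (x2, y2) == (Fraction(77, 85), Fraction(36, 85))
assert x2**2 + y2**2 == 1          # still a solution of x^2 + y^2 = 1
```

Because the coefficients 3/5 and 4/5 are rational, the transformation sends rational solutions to rational solutions, which is exactly why it is useful here.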
Before moving on, a confession is necessary: symmetry is not really
needed in the study of rational solutions to the equation x^2 + y^2 = 1. A
complete account of these solutions can be found in many standard texts on
number theory, symmetry not being mentioned.
1b Structure preserving transformations
The two most basic concepts of modern mathematics are the concepts of set
and function. In most formal treatments of the foundations of mathematics,
set is a primitive (that is, undefined) concept. You can't define everything!
But a set is to be thought of as a collection of things, called the elements of
the set, and the axioms that sets are assumed to satisfy are consistent with
this intuitive description of sets.
In these formal treatments, which first became fashionable at the start
of the twentieth century, every mathematical object is, ultimately, a set.
Thus, for example, the number zero is commonly defined as the empty set
(the set with no elements), the number one as the set whose single element
is the number zero, two as the set whose two elements are zero and one,
and so on. Of course, in practice no one but researchers into the foundations
of mathematics works directly with such definitions. The purpose of the
work on foundations is to create theories which are proved to be free from
internal contradictions, and which accurately model intuitive approaches to
mathematics. The rest of us can then continue using the intuitive approach,
happy in the knowledge that contradictions will never arise.
The concept of function, unlike that of set, is not usually regarded as a
primitive concept. Functions are defined in terms of sets. Indeed, like
everything else in mathematics, a function is, ultimately (at least in the formal
treatments) a set. But the formal definition is irrelevant for our purposes;
instead we should focus on the basic properties which characterize functions,
relying on the fact that consistent formal theories have been created which
validate this.
A function consists of two sets, the domain and the codomain of the
function, and a rule that associates with each element of the domain a unique
element of the codomain. Thus, for example, there is a function whose
domain is the set of all people and whose codomain is the set R of all real
numbers, the rule being to find the height of the person in centimetres.
Observe that this rule does indeed give an element of the codomain (a number)
for each element of the domain (each person).
The words map and mapping are commonly used synonymously with
function, as also is transformation. However, for the purposes of this course
only those functions for which the codomain is the same as the domain will
be called transformations. So a transformation is a function from a set to
itself. Transformations are important for us as they enable us to explicate
the notion of symmetry:
1.1 Definition A symmetry of an object is a transformation of the object
which leaves it unchanged in its essential features.
This definition is not really acceptable, except as a guide to intuition,
since it is too vague and imprecise. Accepting that all mathematical objects
are (ultimately) sets, the notion of a transformation of an object is satisfactory,
but what does it mean to say that a function from a set to itself leaves
the set unchanged in its essential features? To answer this objection, we need
some more concepts.
1.2 Definition  Let X_1, X_2, ..., X_n be arbitrary sets. The Cartesian
product X_1 × X_2 × ... × X_n is the set of all ordered n-tuples (x_1, x_2, ..., x_n)
such that x_i ∈ X_i for each i. That is,

    X_1 × X_2 × ... × X_n = { (x_1, x_2, ..., x_n) | x_1 ∈ X_1, x_2 ∈ X_2, ..., x_n ∈ X_n }.

If X_i = S for all i then X_1 × X_2 × ... × X_n is called the n-fold Cartesian
power of the set S, sometimes written as S^n.
1.3 Definition  Let n be a nonnegative integer and S a set. An n-ary
relation on S is a function ρ from the n-fold Cartesian power S^n to the
two element set {true, false}. The elements x_1, x_2, ..., x_n of S are said
to satisfy the relation if ρ(x_1, x_2, ..., x_n) = true, and not to satisfy it if
ρ(x_1, x_2, ..., x_n) = false.
Thus, for example, if S is the Euclidean plane then we can define a
ternary (or 3-ary) relation Perp on S as follows: for all a, b, c ∈ S,

    Perp(a, b, c) = true   if ∠abc = 90°,
                    false  otherwise.
Binary (or 2-ary) relations are the most familiar kind; in this case the symbol
for the relation is commonly written between the two arguments. For example,
the "less than" relation on the set of all real numbers: a < b is true if
b - a is a positive number, false otherwise. Single argument relations (that
is, unary relations) are simply properties which a single object may or may
not possess. Thus there is a unary relation Green on the set of all visible
objects: Green(jumper) is true if the jumper is green.
If R is an n-ary relation on S, then we will abbreviate the statement
"R(x_1, x_2, ..., x_n) is true" to "R(x_1, x_2, ..., x_n)". For example, "a < b" has
the same meaning as "a < b is true".
Virtually anything that one might want to say about sets and their
elements can be reformulated in terms of relations on sets. For example, the
concept of an operation on a set can be recast so that it becomes an example
of a relation on the set. We could define a ternary relation Plus on the set R
by the rule

    Plus(a, b, c) if and only if a + b = c.

A vector space could be defined as a set equipped with a ternary relation
Plus and binary relations Mult_λ for each scalar λ (where Mult_λ(u, v) if
and only if, in the usual notation, v = λu), such that the appropriate axioms
are satisfied.
Be warned that the definition of the term "relation" that we have
employed is somewhat non-standard: most mathematical authors define a
relation to be a subset of S^n, namely the subset consisting of all n-tuples
(x_1, x_2, ..., x_n) which satisfy the relation.
Another non-standard definition which is useful for our present purposes
is the following:

1.4 Definition  A structured set is a pair (S, R), consisting of a set S
together with a collection R of relations on S.
This definition is useful since it is easily seen that a great variety of
commonplace mathematical objects are examples of structured sets. So,
for example, a square is a set of four elements a, b, c and d which has
(amongst other relations) a ternary relation Perp such that Perp(a, b, c),
Perp(c, b, a), Perp(b, c, d), Perp(d, c, b), Perp(c, d, a), Perp(a, d, c), Perp(b, a, d),
and Perp(d, a, b) are all true, while Perp(x, y, z) is false in the remaining 16
cases. As we have seen above, a vector space is an example of a structured
set, since it is a set with a ternary addition relation and a collection of
binary scalar multiplication relations. And in the case when the collection R
of relations is empty, the structured set is just a set, with no extra structure.
When talking about structured sets, the most natural kinds of
transformations to consider are those that preserve the relations, in the following
sense:
1.5 Definition  Let ρ be an n-ary relation on the set S. A transformation
f: S → S is said to preserve ρ if ρ(f(x_1), f(x_2), ..., f(x_n)) whenever
ρ(x_1, x_2, ..., x_n).
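On a finite structured set, preservation in the sense of Definition 1.5 can be tested by brute force. The following Python sketch is our own illustration (the names preserves and rel are ours, not from the notes): it checks a transformation against every n-tuple that satisfies the relation.

```python
from itertools import product

def preserves(f, rel, S, n):
    """True if f preserves the n-ary relation rel on the finite set S.

    rel maps each n-tuple over S to True/False; f maps S to S.
    """
    return all(rel(tuple(f(x) for x in t))
               for t in product(S, repeat=n) if rel(t))

# Example: the "less than" (binary) relation on {0, 1, 2, 3}
S = [0, 1, 2, 3]
lt = lambda t: t[0] < t[1]
assert preserves(lambda x: x, lt, S, 2)          # the identity preserves <
assert not preserves(lambda x: 3 - x, lt, S, 2)  # order reversal does not
```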
Consider, for example, a transformation of a vector space which
preserves the addition relation and all the scalar multiplication relations. If
V is the vector space and T the transformation, preservation of addition
says that Plus(Tu, Tv, Tw) whenever Plus(u, v, w). Re-expressing this in the
usual notation, it says that Tw = Tu + Tv whenever w = u + v. More
succinctly, T(u + v) = Tu + Tv, for all u, v ∈ V. This is the way
preservation of addition is usually defined in texts on vector spaces. Similarly,
preservation of the binary relation Mult_λ says that Mult_λ(Tu, Tv)
whenever Mult_λ(u, v); in other words, Tv = λTu whenever v = λu, or, more
succinctly, T(λu) = λTu for all u ∈ V. So preservation of all the
relations Mult_λ is preservation of scalar multiplication in the usual sense from
vector space theory. Of course, a transformation which does preserve Plus
and Mult_λ for each λ is nothing other than a linear transformation on the
space V.
We are now at last able to explain the intended meaning of
Definition 1.1, and give a satisfactorily precise definition of symmetry. Certainly
a transformation cannot be said to leave an object essentially unchanged if
it is not possible to undo the effects of the transformation. In other words,
the transformation must have an inverse. Recall that a function has an
inverse if and only if it is bijective (one to one and onto). So a symmetry of
an object must at least be a bijective transformation of the object. Now,
what about the vague term "essential features" used in Definition 1.1? Of
course, the features which are essential are whatever relations we happen to
be interested in. We therefore arrive at the following improved definition:
1.6 Definition  A symmetry of a structured set (S, R) is a bijective
transformation T: S → S such that T preserves ρ for all ρ ∈ R.
1c The symmetries of some structured sets
The simplest case is, of course, a set with no extra structure to worry about.
So consider the structured set (S, R) where R = ∅, the empty set. If T: S → S
is any transformation it is vacuously true that T preserves all the relations
in R, since there are no such relations. Hence a symmetry of the structured
set (S, ∅) is simply a bijective function from S to itself. Recall that bijective
transformations of a set are commonly known as permutations. The set
of all permutations of S is called the symmetric group on S, and in this
course we will use the notation Sym(S) for the symmetric group on S. This
traditional name (the symmetric group) is a little unfortunate for us, since
it is not particularly compatible with our definition of symmetry. When R
is not empty, it will not be the case that all elements of the symmetric group
Sym(S) are symmetries of the structured set (S, R). As for the word "group",
we shall not give the general definition of this term until we have seen some
more examples.
We digress briefly to discuss permutations, using the example of the set
S = {1, 2, 3, 4, 5} to illustrate the terminology and notation. The permutation
p: S → S which satisfies

    p(1) = 4, p(2) = 3, p(3) = 5, p(4) = 1, p(5) = 2

is commonly denoted by

    ( 1 2 3 4 5 )
    ( 4 3 5 1 2 ).
The elements of the set are listed along the top row; the image under the
permutation of each element is written underneath it.
Recall that multiplication of permutations p and q of a set S is defined
to be composition of functions. That is, pq is the permutation of S defined
by the rule

    (pq)(x) = p(q(x))   for all x ∈ S.

So if p is the permutation of {1, 2, 3, 4, 5} given above and if

    q = ( 1 2 3 4 5 )
        ( 5 4 1 3 2 )
then a short calculation yields

    pq = ( 1 2 3 4 5 )( 1 2 3 4 5 ) = ( 1 2 3 4 5 )
         ( 4 3 5 1 2 )( 5 4 1 3 2 )   ( 2 1 4 5 3 ).

For example, (pq)(3) = p(q(3)) = p(1) = 4.
A briefer and hence more convenient notation for permutations is the
so-called disjoint cycle notation. In this notation the permutation q above is
written as (1, 5, 2, 4, 3), indicating that q(1) = 5, q(5) = 2, q(2) = 4, q(4) = 3
and q(3) = 1. Note that q could equally well be written as (5, 2, 4, 3, 1), or
(2, 4, 3, 1, 5), or (4, 3, 1, 5, 2), or (3, 1, 5, 2, 4). It is only the cyclic ordering
of the elements that matters. The permutation p is written as (1, 4)(2, 3, 5).
(Again there are other possibilities, such as (5, 2, 3)(4, 1), obtained by varying
the starting elements within each cycle and the order in which the cycles are
written.) Multiplication of permutations is just as easy in this notation as
in the long notation; it can be checked that pq = (1, 2)(3, 4, 5). It is normal in
this notation to omit cycles of length 1. Thus we write p^2 = (2, 5, 3) rather
than p^2 = (1)(4)(2, 5, 3). The identity permutation, in which all cycles have
length 1, will be written simply as i.
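These calculations are easy to mechanize. The sketch below (in Python; our own illustration, not from the notes) represents a permutation as a dictionary, composes two permutations, and recovers the disjoint cycle notation:

```python
def compose(p, q):
    """Return pq, i.e. apply q first and then p, as in the text."""
    return {x: p[q[x]] for x in q}

def cycles(p):
    """Disjoint cycle notation, omitting cycles of length 1."""
    seen, result = set(), []
    for x in sorted(p):
        if x not in seen:
            cycle, y = [x], p[x]
            seen.add(x)
            while y != x:
                cycle.append(y)
                seen.add(y)
                y = p[y]
            if len(cycle) > 1:
                result.append(tuple(cycle))
    return result

p = {1: 4, 2: 3, 3: 5, 4: 1, 5: 2}   # (1,4)(2,3,5)
q = {1: 5, 2: 4, 3: 1, 4: 3, 5: 2}   # (1,5,2,4,3)
assert cycles(compose(p, q)) == [(1, 2), (3, 4, 5)]   # pq = (1,2)(3,4,5)
assert cycles(compose(p, p)) == [(2, 5, 3)]           # p^2 = (2,5,3)
```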
Returning now to the primary topic of this section, consider the set
of all symmetries of a square S with vertices a, b, c, d. Any permutation of
{a, b, c, d} which preserves the perpendicularity relation will be a symmetry
of S. There are only 24 permutations altogether in the symmetric group
Sym{a, b, c, d}, and it is a straightforward task to find which of them preserve
Perp. They are

    p1 = i               p5 = (a, c)
    p2 = (a, b, c, d)    p6 = (a, d)(b, c)
    p3 = (a, c)(b, d)    p7 = (b, d)
    p4 = (a, d, c, b)    p8 = (a, b)(c, d).
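The claim that exactly these eight permutations preserve Perp can be confirmed by exhaustive search. In the Python sketch below (our illustration, not from the notes) the vertices are taken in cyclic order a, b, c, d, so that ∠xyz = 90° exactly when x and z are the two neighbours of y; this reproduces the eight true triples listed in section 1b.

```python
from itertools import permutations

square = ['a', 'b', 'c', 'd']          # vertices in cyclic order

def perp(x, y, z):
    """True when the angle xyz is 90 degrees: x, z are the neighbours of y."""
    i = square.index(y)
    return {x, z} == {square[i - 1], square[(i + 1) % 4]}

triples = [(x, y, z) for x in square for y in square for z in square
           if len({x, y, z}) == 3]

symmetries = []
for img in permutations(square):
    f = dict(zip(square, img))         # a candidate permutation
    if all(perp(f[x], f[y], f[z]) for (x, y, z) in triples if perp(x, y, z)):
        symmetries.append(f)

assert len(symmetries) == 8            # exactly the eight symmetries p1, ..., p8
```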
It is clear that if two transformations p and q both have the property of
leaving an object essentially unchanged, then the composite transformation
pq (obtained by applying q first and following it by p) will also leave the object
essentially unchanged. Thus, if we define G = {p1, p2, p3, p4, p5, p6, p7, p8},
it must be true that every product p_i p_j of permutations in the set G will
also be in G. With some calculation, it can be verified that the following
multiplication table is satisfied:
         p1  p2  p3  p4  p5  p6  p7  p8
    p1   p1  p2  p3  p4  p5  p6  p7  p8
    p2   p2  p3  p4  p1  p6  p7  p8  p5
    p3   p3  p4  p1  p2  p7  p8  p5  p6
    p4   p4  p1  p2  p3  p8  p5  p6  p7
    p5   p5  p8  p7  p6  p1  p4  p3  p2
    p6   p6  p5  p8  p7  p2  p1  p4  p3
    p7   p7  p6  p5  p8  p3  p2  p1  p4
    p8   p8  p7  p6  p5  p4  p3  p2  p1
It is also clear that the inverse of a transformation which leaves an object
essentially unchanged will also have the same property. And, indeed, it is
easily verified that p_i^-1 ∈ G for each i. Specifically, p2^-1 = p4 and p4^-1 = p2,
while p_i^-1 = p_i in all the other cases.
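Both the closure and the inverse claims can be checked by machine. A possible Python sketch (our own illustration; the dictionaries encode p1, ..., p8 as permutations of the vertices):

```python
def compose(p, q):
    """(pq)(x) = p(q(x))"""
    return {x: p[q[x]] for x in q}

def inverse(p):
    return {v: k for k, v in p.items()}

p = {
    'p1': dict(zip('abcd', 'abcd')),   # i
    'p2': dict(zip('abcd', 'bcda')),   # (a,b,c,d)
    'p3': dict(zip('abcd', 'cdab')),   # (a,c)(b,d)
    'p4': dict(zip('abcd', 'dabc')),   # (a,d,c,b)
    'p5': dict(zip('abcd', 'cbad')),   # (a,c)
    'p6': dict(zip('abcd', 'dcba')),   # (a,d)(b,c)
    'p7': dict(zip('abcd', 'adcb')),   # (b,d)
    'p8': dict(zip('abcd', 'badc')),   # (a,b)(c,d)
}
G = list(p.values())
assert all(compose(g, h) in G for g in G for h in G)   # closure under products
assert inverse(p['p2']) == p['p4'] and inverse(p['p4']) == p['p2']
assert all(inverse(g) == g for k, g in p.items() if k not in ('p2', 'p4'))
```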
The two properties of G which we have mentioned (that pq ∈ G
whenever p, q ∈ G, and p^-1 ∈ G whenever p ∈ G) are the key properties of sets
of symmetries. Any nonempty set of transformations possessing these two
properties is called a group of transformations.

1.7 Definition  Let S be any set. A set G of transformations of S is
called a group of transformations if G ≠ ∅ and
(i) pq ∈ G for all p, q ∈ G,   (closure under multiplication)
(ii) p is bijective and p^-1 ∈ G for all p ∈ G.   (closure under inversion)
If S is any object whatever, then the set of all symmetries of S will
be a group of transformations. Observe that the identity transformation is
always a symmetry.
For our final example in this section we consider the group of all symmetries
of a Euclidean vector space V. A Euclidean vector space, or real inner
product space, is a vector space equipped with an inner product, which is a
symmetric, bilinear and positive definite function V × V → R. In these notes
we will always use the dot notation for inner products: the inner product
of vectors u, v ∈ V will be written as u · v. The three defining properties of
inner products are
(i) u · v = v · u for all u, v ∈ V,
(ii) (λu + μv) · w = λ(u · w) + μ(v · w) for all λ, μ ∈ R and u, v, w ∈ V,
(iii) v · v > 0 for all nonzero v ∈ V.
To describe a Euclidean space V as a structured set we need the Plus and
Mult_λ relations introduced previously for vector spaces, and further binary
relations Dot_λ defined as follows: Dot_λ(u, v) if u · v = λ. A transformation
T: V → V preserves Dot_λ if and only if Tu · Tv = λ whenever u · v = λ. So
T preserves all these relations if and only if

(1.7.1)    Tu · Tv = u · v   for all u, v ∈ V.

Thus a symmetry of V is a bijective linear transformation satisfying (1.7.1).
This is the usual definition of an orthogonal transformation: a bijective linear
transformation which preserves the inner product. The set of all orthogonal
transformations of a Euclidean space V is called the orthogonal group of V,
and is denoted by O(V).
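As a small numerical illustration (ours, not from the notes), a plane rotation satisfies (1.7.1) up to floating-point error:

```python
import math

def rotation(theta):
    """2x2 rotation matrix, given as a list of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

T = rotation(math.pi / 6)              # rotation by 30 degrees
u, v = [3.0, 1.0], [-2.0, 5.0]
assert abs(dot(apply(T, u), apply(T, v)) - dot(u, v)) < 1e-12
```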
The group O(V) has infinitely many elements; so it is not possible to
write out a complete multiplication table for O(V) as we could for the
eight-element group considered above. In these notes we will mainly concentrate
on groups with only finitely many elements, but even so it will not usually
be practicable to write out multiplication tables, as the groups will be too
big. In the next chapter we will look at some of the conceptual tools which
group theorists use in their attempts to make big groups understandable.

Bijective structure-preserving transformations of algebraic systems such
as fields, vector spaces and the like, are usually called automorphisms rather
than symmetries. Thus the usual terminology is to call the orthogonal group
O(V) of a Euclidean space V not the symmetry group, but the automorphism
group, of V. In these notes we will use the terms "symmetry" and
"automorphism" interchangeably, but with a tendency to prefer "symmetry"
in geometric situations and "automorphism" in algebraic ones.
1d Some groups of transformations of a set with four elements
In the previous section we exhibited a set of eight transformations which
form a group of transformations of the set {a, b, c, d}, and we wrote down the
multiplication table for this eight-element group. In this section we will list
some more examples.
Recall that for a set G of transformations of a set S to be a group,
all that is necessary is for the inverse of every element of G to also be in G
and the product of every pair of elements of G to also be in G. Clearly, one
method of finding examples of such sets G is to start with a set containing
just a few bijective transformations, randomly chosen, calculate x^-1 and xy
for all choices of x and y in this initial set, and add all of the resulting
transformations to the set; then repeat this process, and keep on going until
the set stops getting bigger at each stage. Maybe the process will not stop
until the set contains every bijective transformation of S; in this case the
initial set of transformations is said to generate the full symmetric group
Sym(S). But sometimes we can find other examples of groups by this method.
If we start with just a single transformation g, it is not hard to see
that the group of transformations that we end up with will just consist of all
powers of g and all powers of g^-1, and the identity. These single-generator
groups are known as cyclic groups. Some of the groups of transformations of
S = {1, 2, 3, 4} that can be obtained like this are

    G1 = {i}
    G2 = {i, (1, 2)}
    G3 = {i, (1, 2)(3, 4)}
    G4 = {i, (1, 2, 3), (1, 3, 2)}
    G5 = {i, (1, 2, 3, 4), (1, 3)(2, 4), (1, 4, 3, 2)}.

The element (1, 2, 3, 4) generates the group G5, in the sense that every
element of G5 can be expressed as a power of (1, 2, 3, 4). The element (1, 4, 3, 2)
is also a generator, but (1, 3)(2, 4) is not; in fact, the group generated by
(1, 3)(2, 4) is just {i, (1, 3)(2, 4)}, a proper subset of G5.
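The "keep multiplying until nothing new appears" procedure described above is straightforward to code. The sketch below is our own illustration (the name generate is ours); it recovers G5 from the 4-cycle and shows that (1, 3)(2, 4) generates a smaller group:

```python
def compose(p, q):
    return {x: p[q[x]] for x in q}

def inverse(p):
    return {v: k for k, v in p.items()}

def generate(gens):
    """Close a set of permutations under products and inverses:
    repeat until the set stops getting bigger, as described in the text."""
    G = [{v: v for v in gens[0]}] + list(gens)
    grew = True
    while grew:
        grew = False
        for g in list(G):
            for h in [inverse(g)] + [compose(g, k) for k in list(G)]:
                if h not in G:
                    G.append(h)
                    grew = True
    return G

g = {1: 2, 2: 3, 3: 4, 4: 1}                            # the 4-cycle (1,2,3,4)
assert len(generate([g])) == 4                          # G5 has order 4
assert len(generate([{1: 3, 3: 1, 2: 4, 4: 2}])) == 2   # (1,3)(2,4) gives order 2
```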
1.8 Definition The order of a group G is the number of elements of G.
The multiplication table for a cyclic group of order four looks like this:
         i    x    x^2  x^3
    i    i    x    x^2  x^3
    x    x    x^2  x^3  i
    x^2  x^2  x^3  i    x
    x^3  x^3  i    x    x^2
Observe that each row of the table is the same as the previous row moved
across one place to the left, the leftmost element jumping across to become
the rightmost element. It is easily seen that the same will be true for the
multiplication table of a cyclic group with any number of elements, provided
the elements are initially listed in the order i, x, x^2, x^3, ... . Note, however,
that if we take the elements in a different order then the multiplication table
will look a little different: it may take more than just a glance at a
multiplication table to determine whether or not a group of transformations is a
cyclic group.
Two groups which have the same multiplicative structure, in the sense
that the same multiplication table applies to both groups, are said to be
isomorphic. (We will give a more precise definition of this term in a later
chapter.) The groups G2 and G3 above are isomorphic, since they are both
cyclic groups of order two. So although (1, 2) and (1, 2)(3, 4) are rather
different as transformations, nevertheless the groups they generate are isomorphic
to each other. Internally, so to speak, the two groups are the same. A
slightly more elaborate example of the same phenomenon occurs with
transformations of the set {1, 2, 3, 4, 5, 6}. Here the transformations (1, 2)(3, 4, 5)
and (1, 2, 3, 4, 5, 6) both generate cyclic groups of order six, although in some
other ways they are not particularly alike. Isomorphism is a valuable
concept, since any facts we may discover about the internal structure of a group
will automatically be true also for any isomorphic group.
Returning to our discussion of groups of transformations of {1, 2, 3, 4},
we give some examples of groups which are not cyclic, but can be generated
by two elements:

    G6 = {i, (1, 2), (3, 4), (1, 2)(3, 4)}
    G7 = {i, (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)}
    G8 = {i, (1, 2), (1, 3), (2, 3), (1, 2, 3), (1, 3, 2)}.

The groups G6 and G7 are isomorphic: it can be checked easily that the
following multiplication table is applicable to both of them:

        i  a  b  c
    i   i  a  b  c
    a   a  i  c  b
    b   b  c  i  a
    c   c  b  a  i
Groups with this multiplication table are said to be isomorphic to Klein's four
group. It can be shown quite easily that any group of order four must either
be cyclic or else isomorphic to Klein's four group. Hence group theorists
often say there are only two groups of order four (namely, the cyclic group
and the Klein four group). This is somewhat slipshod (we probably ought
to say there are only two isomorphism types of groups of order four) but
makes the point that there are only two possibilities for the internal structure
of a group with four elements.
The group G8 above consists of all the permutations of {1, 2, 3, 4} which
fix 4. It is clearly essentially the same as Sym{1, 2, 3}. It can be checked
that the two elements (1, 2) and (2, 3) generate this group. That is,
every element of G8 can be expressed in terms of these two. For example,
(1, 3) = (1, 2)(2, 3)(1, 2). There are several other two-element generating sets
for G8 as well.
One final group of transformations of {1, 2, 3, 4} that we wish to
mention at this point is the alternating group on this set. It consists of all the
even permutations. It is a well known fact that the product of two even
permutations is also even, and the inverse of an even permutation is also
even; so it follows that the set of all even permutations is indeed a group.
In this case (dealing with a set of size four) the even permutations are the
identity, the eight 3-cycles (such as (1, 2, 3), (1, 2, 4), and so on), and the
three permutations which are the product of two disjoint two-cycles (namely,
(1, 2)(3, 4), (1, 3)(2, 4) and (1, 4)(2, 3)). This makes 12 even permutations
altogether, which is as it should be, since we would naturally expect the
numbers of even and odd permutations to be the same, and the order of
Sym{1, 2, 3, 4} (the total number of permutations) is 24. Here is the
multiplication table for Alt{1, 2, 3, 4}:
         i   a   b   c   t1  t2  t3  t4  s1  s2  s3  s4
    i    i   a   b   c   t1  t2  t3  t4  s1  s2  s3  s4
    a    a   i   c   b   t2  t1  t4  t3  s4  s3  s2  s1
    b    b   c   i   a   t3  t4  t1  t2  s2  s1  s4  s3
    c    c   b   a   i   t4  t3  t2  t1  s3  s4  s1  s2
    t1   t1  t4  t2  t3  s1  s3  s4  s2  i   a   b   c
    t2   t2  t3  t1  t4  s4  s2  s1  s3  a   i   c   b
    t3   t3  t2  t4  t1  s2  s4  s3  s1  b   c   i   a
    t4   t4  t1  t3  t2  s3  s1  s2  s4  c   b   a   i
    s1   s1  s2  s3  s4  i   a   b   c   t1  t4  t2  t3
    s2   s2  s1  s4  s3  a   i   c   b   t3  t2  t4  t1
    s3   s3  s4  s1  s2  b   c   i   a   t4  t1  t3  t2
    s4   s4  s3  s2  s1  c   b   a   i   t2  t3  t1  t4
In this table a = (1, 2)(3, 4) and t1 = (1, 2, 3). We leave it to the reader to
identify the other permutations.
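The count of twelve can be confirmed by enumerating the even permutations directly. The following Python sketch (our illustration, not from the notes) classifies a permutation by counting inversions, whose parity equals the parity of the permutation:

```python
from itertools import permutations

def parity(p):
    """Parity of a permutation, given as a tuple of images of 1..n (0 = even)."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return inversions % 2

alt4 = [p for p in permutations((1, 2, 3, 4)) if parity(p) == 0]
assert len(alt4) == 12                       # half of the 24 permutations

def compose(p, q):
    """pq as tuples of images: position k holds the image of k+1."""
    return tuple(p[q[k] - 1] for k in range(4))

assert all(compose(p, q) in alt4 for p in alt4 for q in alt4)   # closed
```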
Observe that for every group G of transformations of {1, 2, 3, 4} that
we have been able to construct, the order of G is a divisor of 24, the order
of the whole symmetric group on the set.
Chapter 2: Introductory abstract group theory
In Chapter 1 we defined the concept of a group of transformations. Since
the whole purpose of group theory is to study groups of transformations,
there is really no need to define any other kinds of groups. But, on the other
hand, we can obtain a theory which is (superficially) more general by defining
groups by means of a set of axioms. Elements of a group then do not have
to be transformations; they can be any kind of thing, just so long as the axioms
are satisfied. Nevertheless, we should keep firmly in our minds the fact that
the way groups arise in practice is as groups of symmetries of objects.
2a Group axioms
2.1 Definition  A group is a set G equipped with an operation (g, h) ↦ gh
(usually called multiplication), satisfying the following axioms:
(i) (xy)z = x(yz) for all x, y, z ∈ G,   (Associativity)
(ii) there exists an element e ∈ G such that
  (a) ex = xe = x for all x ∈ G,   (Existence of an identity element)
  (b) for each x ∈ G there exists y ∈ G such that xy = yx = e.
      (Existence of inverses)
By an operation on a set G we simply mean a rule which gives an
element of G for every pair of elements of G. This is, in fact, the same thing
as a function from G × G to G. The only difference is notational: we would
usually write f(x, y) for the result of applying a function f to a given pair
of elements (x, y), but if we were calling the function an operation we would
use a notation like x ∗ y, or x + y, or x · y, or simply xy, for the result of
applying the operation to the pair (x, y). In fact, for groups we usually opt
for the last of these alternatives, as indeed we have done in Definition 2.1
itself.
One immediate consequence of part (ii) of Definition 2.1 is that the
empty set cannot be a group. A group must always have at least one element,
namely, the element e which satisfies Definition 2.1 (ii). In fact, a group
need not have any other elements; it is easy to check that a set with a single
element, with multiplication defined in the only way possible, does satisfy
Definition 2.1, and is therefore a group.
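For a finite multiplication table, the axioms of Definition 2.1 can be checked mechanically. A possible Python sketch (our illustration; the name is_group is ours), tested on the one-element group and on the Klein four group table from Chapter 1:

```python
def is_group(elements, mul):
    """Check the axioms of Definition 2.1 for a finite multiplication table."""
    # closure: mul must really be an operation on the set
    if any(mul[x][y] not in elements for x in elements for y in elements):
        return False
    # associativity: (xy)z = x(yz)
    if any(mul[mul[x][y]][z] != mul[x][mul[y][z]]
           for x in elements for y in elements for z in elements):
        return False
    # identity element and inverses
    for e in elements:
        if all(mul[e][x] == x == mul[x][e] for x in elements):
            return all(any(mul[x][y] == e == mul[y][x] for y in elements)
                       for x in elements)
    return False

# the one-element group
assert is_group(['e'], {'e': {'e': 'e'}})

# the Klein four group table
K = ['i', 'a', 'b', 'c']
mul = {'i': {'i': 'i', 'a': 'a', 'b': 'b', 'c': 'c'},
       'a': {'i': 'a', 'a': 'i', 'b': 'c', 'c': 'b'},
       'b': {'i': 'b', 'a': 'c', 'b': 'i', 'c': 'a'},
       'c': {'i': 'c', 'a': 'b', 'b': 'a', 'c': 'i'}}
assert is_group(K, mul)
```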
An element e which satisfies Definition 2.1 (ii) (a) is called an identity
element. If e is an identity element then elements x and y satisfying
xy = yx = e (as in 2.1 (ii) (b)) are said to be inverses of each other. It is
not necessary for an identity element to be called e; indeed, for groups of
transformations the identity element has to be the identity transformation,
and we have already decided to denote the identity transformation by i.
A group of transformations of a set S, as defined in Definition 1.7, is
easily seen to be a group in the sense of Definition 2.1. Indeed, suppose G
satisfies the requirements of Definition 1.7. Observe first of all that G satisfies
2.1 (i), since it is trivial (we omit the proof) that composition of functions is
associative. Since 1.7 requires that G is nonempty, we can choose an element
p ∈ G. By part (ii) of 1.7 we know that p is a bijective transformation of S
and that p^-1 ∈ G. Now since p, p^-1 ∈ G we can apply part (i) of 1.7 and
deduce that pp^-1 ∈ G. So i, the identity transformation on S, is in the set G.
Since it is readily verified that ix = xi = x for all transformations x of S,
we can conclude that part (ii) (a) of 2.1 is satisfied, with e = i. Finally,
part (ii) (b) of 2.1 is an immediate consequence of 1.7 (ii).
Something which is notable for its absence from Definition 2.1 is the
commutative law: xy = yx for all x and y. Groups do not have to satisfy
this, and indeed we have already met examples of groups which do not. The
elements (1, 2) and (2, 3) in the group Sym{1, 2, 3} satisfy

    (1, 2)(2, 3) = (1, 2, 3) ≠ (1, 3, 2) = (2, 3)(1, 2).

A glance at the multiplication table of the group of symmetries of a square
(calculated in Chapter 1) shows that this group also fails to satisfy the
commutative law. Groups which do satisfy the commutative law are called
Abelian groups. Cyclic groups, for example, are Abelian, and so is the Klein
four group.
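The computation just displayed is easy to check mechanically. The following Python sketch (the dictionary representation and helper name are ours, not the text's) stores a permutation as a dictionary and forms the product pq by applying the right-hand factor first, which is the convention that makes (1, 2)(2, 3) come out as (1, 2, 3).

```python
def compose(p, q):
    """Product pq of permutations, acting as functions: (pq)(s) = p(q(s))."""
    return {s: p[q[s]] for s in q}

a = {1: 2, 2: 1, 3: 3}   # the transposition (1, 2)
b = {1: 1, 2: 3, 3: 2}   # the transposition (2, 3)

ab = compose(a, b)   # the 3-cycle (1, 2, 3): 1 -> 2, 2 -> 3, 3 -> 1
ba = compose(b, a)   # the 3-cycle (1, 3, 2)

print(ab == ba)  # False: Sym{1, 2, 3} is not Abelian
```

Swapping the order of the factors gives a different permutation, which is exactly the failure of the commutative law.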
Chapter Two: Introductory abstract group theory

Groups of transformations are not the only groups. At least, there are
groups which, at first sight, do not appear to be groups of transformations,
although it may only take minor changes in nomenclature to make them into
groups of transformations. The set of all nonzero real numbers, with the
operation being ordinary multiplication of real numbers, is our first example.
Define R∗ = { x ∈ R | x ≠ 0 }. We must check first of all that
multiplication of real numbers does give an operation on the set R∗. This
is slightly less obvious than you might think. Certainly you can multiply
any two elements of R∗ and get an answer. But for multiplication to be an
operation on R∗ we need also that this answer is always in the set R∗. It is
true, of course: the product of two nonzero real numbers is always a nonzero
real number. It now only remains to check that the axioms (i) and (ii) of 2.1
are satisfied, and this is trivial. The associative law is well known to hold for
multiplication of real numbers, the number 1 has the property required of an
identity element, and for each x ∈ R∗ the number y = 1/x has the property
required for axiom (ii) (b). Note that this group, R∗, is an Abelian group,
since multiplication of real numbers is commutative.
Note that everything we have just said about the set of all nonzero real
numbers applies equally to the set of all nonzero elements of any field. If F
is a field then F∗ = { x ∈ F | x ≠ 0 } is a group under the multiplication
operation of F. This group is generally known as the multiplicative group
of F.
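For a finite illustration (the choice of field is ours, not the text's), take F to be the field of integers modulo 7. Its multiplicative group F∗ = {1, 2, . . . , 6} is small enough that closure and the existence of inverses can be checked by brute force:

```python
# Multiplicative group of the field Z/7Z (7 is prime, so Z/7Z is a field).
p = 7
F_star = [a for a in range(p) if a != 0]

# Closure: the product of two nonzero residues is again nonzero mod 7.
closed = all((a * b) % p != 0 for a in F_star for b in F_star)

# Group axiom (ii) (b): every element has an inverse y with x*y = 1 (mod 7).
inverses = {x: next(y for y in F_star if (x * y) % p == 1) for x in F_star}
print(closed, inverses[3])  # 3 * 5 = 15 = 1 (mod 7), so the inverse of 3 is 5
```

The number 1 plays the role of the identity element, just as it does in R∗.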
Fields, of course, have two operations, addition and multiplication. Under
its addition operation, the set of all elements of a field F forms a group.
This is called the additive group of F. So the field and its additive group
are the same set of elements; the difference is merely that multiplication is
ignored when thinking of the field as an additive group. Checking that the
addition operation of a field does satisfy the group axioms is trivial:
associativity of addition is a field axiom, the zero element of the field is an additive
identity (since a field axiom states that 0 + x = x + 0 = x for all x in the
field), and for each x the element y = −x has the property that x + y = y + x = 0,
as required for group axiom (ii) (b). Note that both the additive and the
multiplicative groups of a field are Abelian groups.
We can generalize the preceding examples by considering matrices over
a field F. If m and n are positive integers then, as usual, we define the sum of
two m × n matrices A and B by the rule (A + B)ᵢⱼ = Aᵢⱼ + Bᵢⱼ. This addition
operation makes the set of all m × n matrices over F into an Abelian group,
as can be easily checked. In the case m = n = 1 we recover the additive
group of the field.
Define GLₙ(F) to be the set of all n × n invertible matrices over the
field F. Matrix multiplication, defined as usual by (AB)ᵢⱼ = Σₖ₌₁ⁿ AᵢₖBₖⱼ,
gives an operation on GLₙ(F), since the product of two invertible n × n
matrices is always an invertible n × n matrix. Associativity of matrix
multiplication is well known; so group axiom (i) holds. The n × n identity
matrix I has the property that XI = IX = X for all X ∈ GLₙ(F); so group
axiom (ii) (a) is satisfied. And for each invertible matrix X there is an
invertible matrix Y (namely, Y = X⁻¹) with the property that XY = Y X = I;
so group axiom (ii) (b) holds. Hence GLₙ(F) is a group. It is known as the
general linear group of degree n over the field F. Note that GLₙ(F) is not
Abelian (since matrix multiplication is not commutative), except in the case
n = 1. When n = 1 we recover the multiplicative group of F. Note also that,
in view of the correspondence between matrices and linear transformations,
as described in linear algebra textbooks, GLₙ(F) is essentially the same as
the set of all invertible linear transformations on a vector space of dimension
n over F; so we really are just talking about transformations in a different
guise.
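Both closure and non-commutativity can be seen concretely for n = 2. The sketch below (our own, using plain nested lists rather than any matrix library; the matrices X and Y are our choice) checks that the product of two invertible matrices has nonzero determinant, and that XY ≠ Y X.

```python
def matmul(A, B):
    """(AB)_ij = sum over k of A_ik * B_kj, for 2x2 matrices as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

X = [[1, 1], [0, 1]]
Y = [[1, 0], [1, 1]]

# Closure: det(XY) = det(X)det(Y) is nonzero, so XY is again invertible.
print(det(matmul(X, Y)))             # 1
# GL_2 is not Abelian:
print(matmul(X, Y) == matmul(Y, X))  # False
```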
2b Basic deductions from the axioms
Let G be a group. As a trivial notational matter, observe (by axiom (i)) that
for x, y, z ∈ G we may use xyz, without any brackets, to unambiguously
denote the element x(yz) = (xy)z. We have to be careful about the order
in which the three elements are written, since G may not be Abelian, but
the bracketing is unimportant. If w is also an element of G then applying
axiom (i) with z replaced by zw gives x(y(zw)) = (xy)(zw), and similarly
(xy)(zw) = ((xy)z)w. So the expression xyzw is also unambiguous, as all
possible bracketings yield the same result. The same applies for products
with any number of factors, and so henceforth we will usually omit brackets
from long products.
2.2 Proposition Let x, y, z ∈ G. Then
(i) if xy = xz then y = z,
(ii) if yx = zx then y = z.
Proof. Suppose that xy = xz. By 2.1 (ii) (a) there exists e ∈ G such that
eg = ge = g for all g ∈ G, and by 2.1 (ii) (b) there exists w ∈ G such that
xw = wx = e. For this element w we have that w(xy) = w(xz), since xy and
xz are the same element. So we have

    ey = (wx)y = w(xy) = w(xz) = (wx)z = ez,

and we deduce that y = z since the basic property of e tells us that ey = y
and ez = z.
The proof of the right cancellation property is totally analogous to that
just given for left cancellation. Assuming that yx = zx we find, with w as
above, that

    y = ye = y(xw) = (yx)w = (zx)w = z(xw) = ze = z.
Note that Proposition 2.2 does not say that if xy = zx then y = z.
Remember that groups do not necessarily satisfy the commutative law, and
hence it is important to keep the factors in products in their rightful order.
A trivial but important fact is that the identity element of a group is
unique.
2.3 Proposition If e, f ∈ G and
(i) ex = xe = x for all x ∈ G
(ii) fx = xf = x for all x ∈ G
then e = f.
Proof. By hypothesis (i) with f in place of x we have that ef = f. But by
hypothesis (ii) with e in place of x we have that ef = e.
Note that group axiom (ii) (a) guarantees the existence of an identity
element, and Proposition 2.3 says that there cannot be two different identity
elements. So we can, henceforth, safely talk about the identity element of a
group. Similarly, each element of G has a unique inverse.
2.4 Proposition Let x ∈ G and let e be the identity element of G. If
y, z ∈ G satisfy
(i) xy = yx = e, and
(ii) xz = zx = e,
then y = z.
Proof. Since hypotheses (i) and (ii) give xy = xz (both equal to e), the
conclusion y = z is immediate from Proposition 2.2.
We know from Definition 2.1 (ii) (b) that each x ∈ G has an inverse (an
element satisfying xy = yx = e), and Proposition 2.4 says that x cannot have
two different inverses. So we can, henceforth, safely talk about the inverse
of x. It is customary to denote the inverse of x by x⁻¹. We should also
observe the following trivial fact, the proof of which is virtually the same as
that of Proposition 2.4.

2.5 Proposition If x, y ∈ G and either xy = e or yx = e then y = x⁻¹.
The following is also easy to prove.

2.6 Proposition If x, y ∈ G then (xy)⁻¹ = y⁻¹x⁻¹.
Another similar property of groups, easy but important, concerns the
solvability of simple kinds of equations.
2.7 Proposition Let g, h ∈ G. Then
(i) the equation gx = h has a unique solution x ∈ G, and
(ii) the equation yg = h has a unique solution y ∈ G.
Proof. If we define x = g⁻¹h then gx = g(g⁻¹h) = (gg⁻¹)h = eh = h.
This establishes the existence of a solution to gx = h. Uniqueness of the
solution follows easily from 2.2: if z is another solution then gx = gz (both
equal to h), and cancelling g gives x = z.
The proof of the second part is totally analogous, and is omitted.
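Proposition 2.7 can be watched in action in a small concrete group; here we use the multiplicative group of Z/7Z (our choice of example, not the text's). The unique solution of gx = h is x = g⁻¹h, and a brute-force search confirms there is no other.

```python
# Solving gx = h in the multiplicative group of Z/7Z: x = g^{-1} h.
p = 7
def inv(g):
    """Inverse of g modulo p, found by brute force (fine for tiny groups)."""
    return next(y for y in range(1, p) if (g * y) % p == 1)

g, h = 3, 4
x = (inv(g) * h) % p
solutions = [t for t in range(1, p) if (g * t) % p == h]
print(x, solutions)  # the formula and the search agree: exactly one solution
```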
An n × n array of n symbols is called a Latin square if each symbol occurs
exactly once in each row of the array and exactly once in each column of the
array. It is a consequence of Proposition 2.7 that the multiplication table of a
group with n elements is a Latin square. For example, consider an arbitrary
row of a group multiplication table corresponding to an arbitrary element, g.
That is, consider the row which gives the values of all the products gx, as x
varies over all elements of the group. The statement that an arbitrary element
h occurs exactly once in this row is precisely the statement that gx = h has
exactly one solution, and this is part (i) of 2.7. The reader can easily observe
that all the multiplication tables that appear in Chapter 1 really are Latin
squares. Unfortunately, the converse statement, that all Latin squares are
group multiplication tables, is not true; the smallest counterexample occurs
when n = 5. The problem of describing all Latin squares, and the problem
of describing all multiplication tables that satisfy the group axioms, are both
unsolved.
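The Latin square property is easy to test mechanically. The sketch below (ours, not the text's) builds the addition table of the cyclic group Z/5Z and checks that every symbol appears exactly once in each row and column.

```python
# The addition table of Z/5Z, checked to be a Latin square.
n = 5
table = [[(i + j) % n for j in range(n)] for i in range(n)]

def is_latin_square(t):
    symbols = set(t[0])
    rows_ok = all(set(row) == symbols for row in t)
    cols_ok = all({row[j] for row in t} == symbols for j in range(len(t)))
    return rows_ok and cols_ok

print(is_latin_square(table))  # True
```

The same test applied to an array with a repeated entry in some row fails, as it should.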
To end this section we discuss some more notational matters. There
is a convention, almost universally observed throughout mathematics, that
the symbol + is only ever used for operations which are commutative. So
additive groups (that is, groups for which the operation is designated by
the + symbol) are always Abelian. Hand in hand with this is another
convention: when the operation is written as addition the identity element is
always written as 0, and the inverse of an element x is always written as −x
rather than x⁻¹. Indeed, for additive groups the term identity element is
not used: instead, 0 is called the zero element. Similarly, for additive groups
it is usual to call −x the negative of x, and not use the word inverse.
2c Subgroups and cosets
2.8 Definition Let G be a group and H a subset of G. We say that H
is a subgroup of G if H is itself a group, and for all x, y ∈ H the product xy
is the same whether calculated via G's operation or H's.

If ∗ is an operation on a set S we say that a subset T of S is closed
under the operation ∗ if x ∗ y ∈ T whenever x, y ∈ T. Clearly if T is closed
under ∗ then an operation ∘ can be defined on T by the rule x ∘ y = x ∗ y for
all x, y ∈ T. This does not define an operation on T if T is not closed, since
the definition of the term operation requires that if ∘ is to be an operation
on T, then x ∘ y must be in T for all x, y ∈ T. If T is closed and if ∘ is defined
in this way, we say that ∘ is the operation T inherits from the operation ∗
on S. It is immediate from Definition 2.8 that a subgroup H of a group G
has to be closed under the operation that makes G into a group, and the
operation for H must be the inherited operation.
To avoid any further use of the tedious phrase "the operation that
makes G into a group" we will henceforth call the operation multiplication.
Everything we prove will still be able to be applied to additive groups by
replacing the word multiplication by addition wherever appropriate.
2.9 Proposition Let H be a subgroup of the group G. Then the identity
element of H is the same as the identity element of G. Furthermore, if h is
any element of H then the inverse of h in the group G is the same as the
inverse of h in the group H.

Proof. Let e be the identity element of G and f the identity element of H.
Then fh = h for all h ∈ H, and so in particular f² = f. But ex = x for all
x ∈ G, and so in particular ef = f (since f ∈ H ⊆ G). Hence ef = ff, and
by Proposition 2.2 it follows that e = f.
Let h be an arbitrary element of H. Let g be the inverse of h in the
group G and let k be the inverse of h in the group H. The definition of
inverse says that hg = e and hk = f, but since we have already seen that
e = f we can conclude that hg = hk. By Proposition 2.2 again, g = k.
The second part of Proposition 2.9 shows that if H is a subgroup of G
and h ∈ H then h⁻¹ is unambiguously defined: it doesn't matter whether
you are thinking of h as an element of H or as an element of G, its inverse
is the same. This is just as well, for we would have had serious notational
problems otherwise.
It will be convenient to say that a subset S of a group G is closed under
inversion if x⁻¹ ∈ S whenever x ∈ S. In particular, if H is a subgroup
of G then H must be closed under inversion, since for H to satisfy group
axiom (ii) (b) it is necessary for the inverse of each element of H to also be
in H.
It turns out that the subsets of a group G which are subgroups are
precisely those nonempty subsets which satisfy the two closure properties we
have just discussed, closure under multiplication and closure under inversion.
2.10 Theorem Let G be a group and H a subset of G. Then H is a
subgroup of G if and only if H is nonempty and closed under multiplication
and inversion.
Proof. We observed earlier that the empty set is not a group, and we have
just shown that a subgroup is necessarily closed under multiplication and
inversion; so it remains only to show that a nonempty subset of G which is
closed under multiplication and inversion is a subgroup.
Suppose that ∅ ≠ H ⊆ G and that H is closed under multiplication
and inversion. Then, as explained earlier in this section, H inherits a
multiplication from G; so our task is simply to show that this operation on H
satisfies (i) and (ii) of Definition 2.1. We know that the multiplication on G
itself does satisfy these axioms.
Axiom (i) is trivially disposed of: we know that (xy)z = x(yz) for all
x, y, z ∈ G, and so it follows that (xy)z = x(yz) for all x, y, z ∈ H (since
H ⊆ G).
Since H ≠ ∅ we know that there exists at least one element in H. Fix
one such element, and call it h. By closure under inversion we deduce that
h⁻¹ ∈ H, and so by closure under multiplication it follows that hh⁻¹ ∈ H
also. That is, e ∈ H, where e is the identity element of G. Now by group
axiom (ii) (a) applied to G we know that ex = xe = x for all x ∈ G, and
so it is certainly true that ex = xe = x for all x ∈ H (since H ⊆ G). Since
e ∈ H this shows that group axiom (ii) (a) holds for H.
Group axiom (ii) (b) is all that remains: we must show that for all x ∈ H
there exists y ∈ H with xy = yx = e. Let x ∈ H, and put y = x⁻¹, the
inverse of x in G. Then by closure of H under inversion we know that y ∈ H,
and by definition of x⁻¹ we know that xy = yx = e.
Comparing Theorem 2.10 and Definition 1.7, it can be seen that a group
of transformations of a set S is just the same thing as a subgroup of Sym(S)
(the group of all invertible transformations of S).
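Theorem 2.10 translates directly into a finite test. The sketch below (our example; the text itself does no computations) applies the criterion to subsets of the additive group Z/12Z, with addition playing the role of multiplication and negation mod 12 playing the role of inversion.

```python
# Theorem 2.10 as a test, applied to subsets of the additive group Z/12Z.
n = 12
def is_subgroup(H):
    """Nonempty, closed under the group operation, closed under inversion."""
    return (len(H) > 0
            and all((a + b) % n in H for a in H for b in H)
            and all((-a) % n in H for a in H))

print(is_subgroup({0, 3, 6, 9}))  # True: the multiples of 3
print(is_subgroup({0, 1, 2}))     # False: 1 + 2 = 3 is missing
```

Note that the test never has to mention the identity or associativity: as the proof of Theorem 2.10 shows, those come for free.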
If H is a subgroup of the group G and x ∈ G then we define

    xH = { xh | h ∈ H },

and similarly

    Hx = { hx | h ∈ H }.

Those subsets of G of the form xH for some x ∈ G are called the left cosets
of H in G, and similarly the subsets Hx are the right cosets of H in G. In
our initial discussion of cosets, we give detailed proofs of various properties
of left cosets only, although there are corresponding results for right cosets
which can be proved by totally analogous arguments. In a subsequent section
we will cover the same ground again using some alternative arguments, and
in this subsequent section we will, for variety, use right cosets rather than
left cosets.
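For a concrete picture (our example), take G to be the additive group Z/12Z and H = {0, 4, 8}. In additive notation the left coset xH becomes x + H, and computing x + H for every x ∈ G shows the behaviour the next few propositions establish in general: only a few distinct cosets arise, they are pairwise disjoint, and each has the same size as H.

```python
# Cosets x + H of the subgroup H = {0, 4, 8} in the additive group Z/12Z.
n = 12
H = {0, 4, 8}
cosets = {frozenset((x + h) % n for h in H) for x in range(n)}

for c in sorted(map(sorted, cosets)):
    print(c)
# Twelve choices of x produce only 4 distinct cosets, each of size 3,
# and together they cover all 12 elements of the group.
```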
It is important to realize that if x₁ and x₂ are two distinct elements
of G, it is nevertheless possible that x₁H = x₂H. Our next proposition
describes precisely when this happens.
2.11 Proposition Let H be a subgroup of the group G and let x₁, x₂ ∈ G.
Then the following conditions are all equivalent to each other:
(i) x₁H = x₂H,
(ii) x₂ ∈ x₁H,
(iii) x₂ = x₁h for some h ∈ H,
(iv) x₁⁻¹x₂ ∈ H.
Proof. Our strategy for proving this is as follows. First of all we will prove
that if (i) holds then (ii) holds also. Then we will give a separate proof that if
(ii) holds then (iii) holds. Then we will give a separate proof that if (iii) holds
then (iv) holds, and finally we will give yet another separate proof that if (iv)
holds then (i) holds. Once all of these implications have been established it
will follow that if any one of the four conditions holds then all the others do
too.
First of all, then, assume that (i) holds; that is, x₁H = x₂H. By
Proposition 2.9 we have that e ∈ H (where e is the identity element of G),
and so

    x₂ = x₂e ∈ { x₂h | h ∈ H } = x₂H.

By our hypothesis that x₁H = x₂H we conclude that x₂ ∈ x₁H; that is,
(ii) holds. So we have established that (ii) holds whenever (i) holds; this was
our first objective.
Dispense now with the assumption that (i) holds, but assume that (ii)
holds. Then x₂ ∈ x₁H = { x₁h | h ∈ H }, and so x₂ = x₁h for some h ∈ H.
That is, (iii) holds. So we have shown that (ii) implies (iii).
Now assume that (iii) holds. Then there is an element h ∈ H with
x₂ = x₁h. Multiplying both sides of this equation on the left by x₁⁻¹ gives
x₁⁻¹x₂ = x₁⁻¹x₁h = h, and so it follows that x₁⁻¹x₂ ∈ H. Hence (iv) holds.
Finally, assume that (iv) holds, so that x₁⁻¹x₂ ∈ H. Theorem 2.10
guarantees that H is closed under inversion, and so using Proposition 2.6 we
can say that x₂⁻¹x₁ = (x₁⁻¹x₂)⁻¹ ∈ H. Our aim is to prove that x₁H = x₂H,
and we do this by first showing that x₂H ⊆ x₁H, then that x₁H ⊆ x₂H.
Let g be an arbitrary element of x₂H. Then g = x₂h for some h ∈ H,
and it follows that

    g = ex₂h = x₁x₁⁻¹x₂h = x₁k

where k = (x₁⁻¹x₂)h. But x₁⁻¹x₂ and h are both elements of H, and by
Theorem 2.10 we know that H is closed under multiplication. Hence k ∈ H,
and so g = x₁k ∈ x₁H. We have now shown that every element of x₂H is
also in x₁H, and therefore we have shown that x₂H ⊆ x₁H.
For the reverse inclusion, assume instead that g is an arbitrary element
of x₁H. Then we have g = x₁h for some h ∈ H, and so

    g = ex₁h = x₂x₂⁻¹x₁h ∈ x₂H

since (x₂⁻¹x₁)h ∈ H by closure of H under multiplication. Thus all elements
of x₁H are in x₂H; that is, x₁H ⊆ x₂H.
We have now shown that (iv) implies both x₂H ⊆ x₁H and x₁H ⊆ x₂H,
hence x₁H = x₂H. So (iv) implies (i), and the proof is complete.
An important corollary of Proposition 2.11 is that two unequal left
cosets cannot have any elements in common.

2.12 Proposition Let H be a subgroup of the group G and let x₁, x₂ ∈ G.
If x₁H ∩ x₂H ≠ ∅ then x₁H = x₂H.
Proof. Suppose x₁H ∩ x₂H ≠ ∅. Then there exists an element g ∈ G with
g ∈ x₁H ∩ x₂H. This gives g ∈ x₁H and g ∈ x₂H. Applying Proposition 2.11
with g in place of x₂, using in particular that 2.11 (ii) implies 2.11 (i), we
deduce that x₁H = gH. By similar reasoning, g ∈ x₂H gives x₂H = gH.
So x₁H = x₂H, since both equal gH.
The other important fact about left cosets is that each left coset of the
subgroup H has the same number of elements as the subgroup H itself.

2.13 Proposition Let H be a subgroup of the group G and let x ∈ G.
Then the function f: H → xH defined by f(h) = xh for all h ∈ H, is bijective.

Proof. Let h₁, h₂ ∈ H and suppose that f(h₁) = f(h₂). Then xh₁ = xh₂,
and so by 2.2 it follows that h₁ = h₂. We have shown that f(h₁) = f(h₂)
implies h₁ = h₂; that is, f is injective.
Let g be an arbitrary element of the coset xH. Then g = xh for some
h ∈ H; that is, g = f(h). We have shown that every element of xH has the
form f(h) for some h ∈ H; that is, f: H → xH is surjective.
We have shown that f is both injective and surjective; that is, f is
bijective, as claimed.
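The bijection h ↦ xh of Proposition 2.13 can be seen concretely. Our example (not the text's): the multiplicative group of Z/13Z, the subgroup H = {1, 3, 9} consisting of the powers of 3, and x = 2.

```python
# The map f(h) = x * h from H onto the coset xH, checked to be injective.
# Group: the multiplicative group of Z/13Z (13 prime); H = powers of 3.
p = 13
H = [1, 3, 9]
x = 2
coset = [(x * h) % p for h in H]   # the coset xH
injective = len(set(coset)) == len(H)
print(coset, injective)
```

Since f is injective and, by definition of xH, automatically surjective, the coset has exactly as many elements as H.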
2d On the number of elements in a set.
In this section we digress completely from group theory. We shall return to
group theory later.
In Proposition 2.13 we proved the existence of a bijective function
from one set to another. We had claimed that we were going to show that
the two sets in question had the same number of elements. The situation
is plain enough for sets which have a finite number of elements. It is clear,
for example, that if two sets both have five elements then a one-to-one
correspondence can be set up between the elements of the two sets. In other
words, a bijective function exists from one set to the other. (Indeed, 5! = 120
such bijective functions exist.) On the other hand, if one set has five elements
and the other six, then no bijective function exists from one set to the other.
(Indeed, a function from a five element set to a six element set cannot be
surjective, while a function from a six element set to a five element set cannot
be injective.) For finite sets, then, there is no doubt that two sets have the
same number of elements if and only if a bijective function from one to the
other can be found.
For infinite sets, the situation is by no means clear. The concept of the
number of elements of a set S is much less familiar if S in fact has infinitely
many elements. But even without defining what is meant by the number of
elements of a set S, we can still easily say what it means for two sets S and
T to have the same number of elements. By analogy with the finite case we
make the following definition.

2.14 Definition Sets S and T have equal cardinality, or the same number
of elements, if there exists a bijective function from S to T.

The slightly paradoxical thing about infinite sets is that it is possible
for one set to be strictly contained in another, and yet have the same number
of elements as the set that contains it. From one point of view, one of the
sets is clearly smaller than the other; from another point of view the two
sets have the same size! We will illustrate this, and some similar matters, by
means of examples.
First of all, consider the set S = { n ∈ Z | n ≥ 0 }, the set of all
nonnegative integers, and the set T = { n ∈ Z | n ≥ 1 }, the set of all positive
integers. Then T ⊆ S and T ≠ S, but the function S → T given by a ↦ a + 1
is clearly bijective, and so S and T have the same number of elements.
Next, consider Q, the set of all rational numbers. Recall that a number
is rational if it can be expressed as n/m, where n and m are integers. Every
real number can be expressed by means of an infinite decimal expansion,
and it is well known that the number is rational if and only if, from some
point onwards in the decimal expansion, the same sequence of digits repeats
indefinitely. For example, .5000 . . . is rational, as the 0 repeats forever, and
similarly the decimal expansion for 1/7 consists of the same block of six digits
repeating forever: 1/7 = .142857142857142857 . . . . Some real numbers are
not rational, as for example π = 3.14159 . . ., whose decimal expansion never
repeats. But of course, given any real number it is always possible to find
rational numbers which are as close as you please to the given real number.
For example, the difference between π and the rational number 314/100 is
less than 2 × 10⁻³; if you want a better approximation than that, the rational
number 31415/10000 differs from π by less than 10⁻⁴, while 314159/100000
is closer still, and so on. The technical term used to describe this situation
is denseness: the set of all rational numbers is dense in the set of all real
numbers. Since obviously the set of all integers is not dense in the set of all
real numbers, it is quite plain that there are many more rational numbers
than there are integers.
Not so! The set Q and the set Z have the same number of elements.
This can be proved readily by a famous argument due to Georg Cantor
(1845–1918), which we present in a moment. First, though, let us state a
definition.

2.15 Definition A set S is said to be countable, or enumerable, if there
exists a sequence s₁, s₂, s₃, . . . of elements of S which includes every element
of S at least once.
That is, a set is countable if its elements can be listed. By crossing
out repetitions one can obtain a sequence, which could be finite or could
be infinite, such that each element of S occurs exactly once in the list. Of
course the list will be finite if and only if S is a finite set; so we see that an
infinite set S is countable if and only if there is an infinite list s₁, s₂, s₃, . . . of
elements of S in which each element of S appears exactly once. Under these
circumstances the function f from the set Z⁺ = { n ∈ Z | n > 0 } to S defined
by f(n) = sₙ is bijective: surjective since every element of S occurs in the
list, injective since each element occurs once only. So an infinite countable
set has the same number of elements as the set of all positive integers.
By means of a simple trick, Cantor was able to write down a list of all
the positive rational numbers, thereby proving that the set Q⁺ of all positive
rational numbers is countable. The list is

    1/1, 2/1, 1/2, 3/1, 2/2, 1/3, 4/1, 3/2, 2/3, 1/4, 5/1, . . .

and it comes from the diagonals of the rectangular array

    1/1  2/1  3/1  4/1  5/1  6/1  . . .
    1/2  2/2  3/2  4/2  5/2  . . .
    1/3  2/3  3/3  4/3  . . .
    1/4  2/4  3/4  . . .
    1/5  2/5  . . .
    1/6  . . .
    . . .

That is, Cantor first lists those fractions n/m with n + m = 2, then moves on
to those with n + m = 3, then n + m = 4, and so on, and clearly in this way
he catches every positive rational number at least once. (In fact, he catches
them all infinitely often.)
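Cantor's listing is easy to generate mechanically. The sketch below (ours) walks the diagonals n + m = 2, 3, 4, . . . with n descending, exactly as described above, and reproduces the first terms of the displayed list as pairs (n, m) standing for n/m.

```python
# Cantor's diagonal listing of the positive rationals: fractions n/m grouped
# by n + m = 2, 3, 4, ..., each diagonal read with n descending.
def cantor_list(terms):
    out = []
    s = 2
    while len(out) < terms:
        for n in range(s - 1, 0, -1):
            out.append((n, s - n))     # the fraction n/m with m = s - n
            if len(out) == terms:
                break
        s += 1
    return out

print(cantor_list(6))  # [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3)]
```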
So there exists a bijective function f: Z⁺ → Q⁺. This immediately
gives a bijective function F: Z → Q, defined by

    F(n) = f(n)     if n > 0,
    F(n) = 0        if n = 0,
    F(n) = −f(−n)   if n < 0,

and this justifies the claim we made above, that the set of all integers and
the set of all rational numbers have the same number of elements.
At this point one might think, since Z has the same number of elements
as Q, a set which initially looked much bigger, that all infinite sets have the
same number of elements as each other. For example, we have seen that Q is
dense in R; surely there cannot be too much difference between the number
of elements of Q and the number of elements of R!
We know that intuition can be misleading, and, sure enough, though
the set Q is countable, the set R is not. Cantor proved this also, by another
famous argument with a diagonal theme. Since Q is a subset of R, we can
conclude that the number of elements of Q is definitely less than the number
of elements of R. (We have not actually defined what it means for a set S
to have fewer elements, or lesser cardinality, than another set T. A suitable
definition is as follows: the cardinality of S is less than the cardinality of T if
there exists an injective function from S to T but no bijective function from
S to T.)
If it were possible to list all the real numbers it would be possible, by
striking out those numbers on the list which do not lie between 0 and 1, to
obtain a list of all positive real numbers less than 1. Each such number x
has an infinite decimal expansion,

    x = .a₁a₂a₃a₄a₅ . . .

where the aᵢ are integers from the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. Suppose now
that we have a list x₁, x₂, x₃, . . . of all the real numbers between 0 and 1.
Writing out decimal expansions for them all, we have

    x₁ = .a₁⁽¹⁾a₂⁽¹⁾a₃⁽¹⁾a₄⁽¹⁾a₅⁽¹⁾a₆⁽¹⁾ . . .
    x₂ = .a₁⁽²⁾a₂⁽²⁾a₃⁽²⁾a₄⁽²⁾a₅⁽²⁾a₆⁽²⁾ . . .
    x₃ = .a₁⁽³⁾a₂⁽³⁾a₃⁽³⁾a₄⁽³⁾a₅⁽³⁾a₆⁽³⁾ . . .
    x₄ = .a₁⁽⁴⁾a₂⁽⁴⁾a₃⁽⁴⁾a₄⁽⁴⁾a₅⁽⁴⁾a₆⁽⁴⁾ . . .
    x₅ = .a₁⁽⁵⁾a₂⁽⁵⁾a₃⁽⁵⁾a₄⁽⁵⁾a₅⁽⁵⁾a₆⁽⁵⁾ . . .
    . . .
Now let us define another infinite decimal expansion .a₁a₂a₃ . . ., as follows:
for each i,

    aᵢ = 1 if aᵢ⁽ⁱ⁾ ≠ 1,  and  aᵢ = 2 if aᵢ⁽ⁱ⁾ = 1.
So this decimal expansion consists of an infinite string of 1s and 2s,
presumably with a preponderance of 1s, although that is irrelevant. What
is certain is that the number x defined by this decimal expansion lies
between 0 and 1. Furthermore, the decimal expansions .a₁⁽ⁱ⁾a₂⁽ⁱ⁾a₃⁽ⁱ⁾a₄⁽ⁱ⁾ . . . and
.a₁a₂a₃a₄ . . . cannot be the same, for any value of i, since by the construction
aᵢ ≠ aᵢ⁽ⁱ⁾. In other words, we have chosen things so that the decimal expansion
of x differs from the decimal expansion of x₁ in the first decimal place,
differs from the decimal expansion of x₂ in the second decimal place, differs
from the decimal expansion of x₃ in the third decimal place, and so on. So
x ≠ xᵢ for any value of i, and we have found a number between 0 and 1 which
is not on the list. This is a contradiction, since the list was meant to include
all the numbers between 0 and 1. So no such list can exist, and so the set of
all real numbers is not countable.
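The diagonal construction can be imitated on a finite sample (a sketch of the idea only, with digit rows of our own choosing; genuine diagonalization of course needs the whole infinite list). Given any finite list of digit sequences, the rule above produces a sequence differing from the i-th one in place i.

```python
# The diagonal construction on a finite sample: build a digit sequence that
# differs from the i-th listed sequence in the i-th place.
listed = [
    [1, 4, 1, 5, 9],
    [5, 0, 0, 0, 0],
    [3, 3, 3, 3, 3],
    [2, 7, 1, 8, 2],
    [9, 9, 9, 9, 9],
]
diagonal = [1 if row[i] != 1 else 2 for i, row in enumerate(listed)]
print(diagonal)                                # [2, 1, 1, 1, 1]
print(all(diagonal != row for row in listed))  # True: not on the list
```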
2e Equivalence relations
The matters discussed in the previous section have little relevance for
anything else that will be discussed in these notes, since we will mostly be
concerned with finite sets rather than infinite ones. The present section,
however, though still digressing from group theory, is concerned with matters
which will be important for us later on.
2.16 Definition Let R be a binary relation on a set S. Then
(i) R is said to be reflexive if aRa is true for all a ∈ S,
(ii) R is said to be symmetric if bRa implies aRb,
(iii) R is said to be transitive if aRb and bRc together imply aRc.
If the relation R is reflexive, symmetric and transitive, then it is called an
equivalence relation.
The typical way equivalence relations arise in mathematics is as follows:
if S is a set and T a collection of functions defined on S, let ∼ be the relation
on S defined by the rule that for all x, y ∈ S,

    x ∼ y if and only if f(x) = f(y) for all f ∈ T.

In other words, we say that x is equivalent to y if all the functions take the
same value at x as at y. It is trivial to check that ∼ is reflexive, symmetric
and transitive. Furthermore, this is all very natural and intuitive, since in
everyday situations we would regard two things as equivalent if they are the
same in all the aspects that interest us. In the mathematical context, the
aspects that interest us are the values of the functions we are dealing with.
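A small illustration (the choice of S and T here is ours): take S = {0, 1, . . . , 11} and let T consist of the two functions x ↦ x mod 2 and x ↦ x mod 3. Two elements are equivalent exactly when they agree in both aspects, and grouping by the pair of values carves S into its equivalence classes.

```python
# The relation x ~ y iff f(x) = f(y) for all f in T, on a small set S.
S = range(12)
T = [lambda x: x % 2, lambda x: x % 3]   # two "aspects" of x

def key(x):
    """The tuple of all function values at x; x ~ y iff key(x) == key(y)."""
    return tuple(f(x) for f in T)

classes = {}
for x in S:
    classes.setdefault(key(x), []).append(x)

for c in classes.values():
    print(c)   # e.g. 0 and 6 fall together: both even, both 0 mod 3
```

Note that the classes are pairwise disjoint and their union is all of S, as the next proposition proves in general.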
The purpose of the concept of equivalence, in mathematics as in everyday
life, is to limit the number of different things we have to deal with. All
equivalent things can be treated as one. (A minor point was slurred over in
the previous section: it is not quite true that every real number has a unique
decimal expansion; for example, .5000 . . . = .4999 . . . . It is easily checked
that this does not invalidate the diagonal argument.) Rather than make separate rules
for each separate object, we can have one rule applying to a whole class of
equivalent objects. Thus we are led naturally to the following definition.
2.17 Definition Let ∼ be an equivalence relation on a set S, and let
x ∈ S. The equivalence class containing x is the set { y ∈ S | y ∼ x }.

Suppose, for example, that the set S consists of a large assortment of
coloured pencils, and suppose that the relation ∼ is defined on S by the rule

    p₁ ∼ p₂ if p₁ and p₂ correspond to the same colour.

(I had better assume that the pencils are monochromatic!) It is readily seen
that ∼ is an equivalence relation. There will be one equivalence class for
each colour that is represented in the set of pencils, the equivalence class
consisting of all pencils of that colour. It is conceivable that an equivalence
class might have only one member (if there is only one pencil of that colour),
or that a single equivalence class includes most of the elements of S. But
note that every element of S lies in some equivalence class, and no element
can lie in two different equivalence classes. We can prove these facts quite
generally.
2.18 Proposition Let ∼ be an equivalence relation on a set S. Then the
equivalence classes of ∼ form a partitioning of S. That is, the equivalence
classes are pairwise disjoint nonempty sets, and their union is the whole of S.

Proof. For each x ∈ S define C(x) = { y ∈ S | y ∼ x }, the equivalence
class containing x, and let Q = { C(x) | x ∈ S }, the set of all equivalence
classes of ∼. We show first that two distinct equivalence classes in Q cannot
overlap.
Let x, y ∈ S and suppose that C(x) ∩ C(y) ≠ ∅. We will show that
C(x) = C(y). (Note that this does not mean that x = y.) Choose an element
z ∈ C(x) ∩ C(y). Then z ∈ C(x), whence z ∼ x, and similarly z ∼ y. Since
∼ is symmetric it follows that y ∼ z, and combining this with z ∼ x gives,
by transitivity of ∼, that y ∼ x. Now if t is an arbitrary element of C(y) we
have that t ∼ y, and with y ∼ x this gives t ∼ x, whence t ∈ C(x). So all
elements of C(y) are in C(x); that is, C(y) ⊆ C(x). On the other hand, if t
is an arbitrary element of C(x) then t ∼ x, and since we also have x ∼ y we
can conclude that t ∼ y, whence t ∈ C(y). So C(x) ⊆ C(y), and therefore
C(x) = C(y), as claimed.
† I had better assume that the pencils are monochromatic!
All that remains is to show that the equivalence classes are all nonempty, and that each element of S lies in some equivalence class. Both of these facts are clear. Firstly, if x ∈ S is arbitrary then x ∈ C(x), since x ∼ x, and so x lies in an equivalence class; secondly, if C ∈ Q is arbitrary then by definition C = C(x) for some x, and since x ∈ C(x) we conclude that C ≠ ∅.
2.19 Definition If ∼ is an equivalence relation on the set S, then the set Q, consisting of all equivalence classes of ∼ on S, is called the quotient of S by the equivalence relation.
Initially the concept of a quotient may seem unduly abstract, dealing
as it does with sets whose elements are sets. Nevertheless, it is not far
removed from everyday practice. Consider numbers, for example. It is a
straightforward task to decide whether or not a set has three elements, but
what is the number three itself? What kind of thing is it? A plausible answer
to this is to say that the number three is exactly the set of all three-element
sets. Certainly it is an abstract thing that is intimately allied with the set of
all three-element sets, and so to identify it with this set of sets seems quite
natural, and avoids postulating the existence of an additional abstract object
of uncertain nature. According to the theory we are suggesting here, the set
of all natural numbers is an example of a quotient set. If S is the set of all
finite sets of physical objects, an equivalence relation ∼ can be defined on S by the rule that X ∼ Y if the sets X and Y have the same number of elements. There is then one equivalence class for each natural number, and, according to this theory, the number and the equivalence class are one and the same thing. The set of all natural numbers is the set of all equivalence classes, and this is precisely the quotient of S by the equivalence relation ∼.
It is worth noting, incidentally, that these considerations provide a method
of defining infinite numbers, an issue which we sidestepped in the previous
section.
We have been talking philosophy rather than mathematics. Mathe-
maticians do not really have to answer the philosophical question "what is a number?"; they just have to know what rules numbers obey. But whether or
not the set of natural numbers is an example of a quotient by an equivalence
relation, the fact is that when one has an equivalence relation, considering
the set of equivalence classes rather than the original set itself reduces the
number of objects one has to contend with, and this is the whole purpose of
equivalence relations.
Our next proposition, although still extremely general and abstract, is
at least mathematics rather than philosophy.
2.20 Proposition Let f: A → B be an arbitrary function. Then there exist sets Q and I, and a surjective function π: A → Q, a bijective function β: Q → I and an injective function ι: I → B, such that f is equal to the composite ιβπ. That is, the sequence
    A --π--> Q --β--> I --ι--> B
provides a factorization of f as a surjective function, followed by a bijective, followed by an injective. Furthermore, the set Q is the quotient of A by an equivalence relation, and I is a subset of B.
Proof. Let ∼ be the equivalence relation defined on A by the rule that x ∼ y if and only if f(x) = f(y), and let Q be the quotient of A by ∼. Let I = { f(x) | x ∈ A }, the image of the function f. For each x ∈ A let C(x) = { y ∈ A | f(y) = f(x) }, the equivalence class of x.
Define π: A → Q by the rule that π(x) = C(x) for all x ∈ A. Since by definition Q = { C(x) | x ∈ A } = { π(x) | x ∈ A }, which is the image of π, it follows that π is surjective.
Observe that I ⊆ B. Hence we can define ι: I → B by the rule that ι(b) = b for all b ∈ I. With this definition it is immediate that if ι(b₁) = ι(b₂) then b₁ = b₂; so ι is injective.
If C ∈ Q and x, y ∈ C, then x ∼ y (since C is an equivalence class), and so f(x) = f(y) (by the definition of ∼). So the function f is constant on C: all elements x ∈ C give the same value for f(x). Define β(C) to be this constant value; that is, define β(C) = f(x), where x ∈ C is chosen arbitrarily. Because equivalence classes are always nonempty (see 2.18) there is always an x ∈ C to choose; so β(C) is always defined, and defined unambiguously since f is constant on C. Furthermore, β(C) = f(x) ∈ im f = I. Hence C ↦ β(C) defines a function β: Q → I. It remains to show that β is bijective and that f = ιβπ.
For all x ∈ A we have that x ∈ C(x), and so, by the definition of β, we have β(C(x)) = f(x). Hence
    (ιβπ)(x) = ι(β(π(x))) = β(π(x)) = β(C(x)) = f(x)
and therefore f = ιβπ. If b ∈ I is arbitrary then, by the definition of I, there exists an x ∈ A such that b = f(x). This gives b = β(C(x)), and so
b is in the image of β. So β is surjective. Finally, suppose that C₁, C₂ ∈ Q are such that β(C₁) = β(C₂). Choose x₁, x₂ ∈ A such that C₁ = C(x₁) and C₂ = C(x₂). Then
    f(x₁) = β(C(x₁)) = β(C₁) = β(C₂) = β(C(x₂)) = f(x₂),
whence x₁ ∼ x₂, by the definition of ∼. Now since x₁ and x₂ are equivalent it follows that their equivalence classes are the same (as in the proof of 2.18 above). Hence C₁ = C₂. So we have shown that β(C₁) = β(C₂) implies C₁ = C₂; that is, β is injective.
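The factorization of Proposition 2.20 is easy to carry out concretely. The following Python sketch uses the made-up example A = {0, …, 9} with f(x) = x mod 3, and the names pi, beta and iota stand in for the three functions of the proposition; it builds Q as a set of frozensets and checks that the composite recovers f.

```python
# A toy instance of Proposition 2.20: A = {0,...,9}, f(x) = x % 3.
A = range(10)
def f(x):
    return x % 3

# Q: the quotient of A by the relation x ~ y iff f(x) = f(y).
Q = {frozenset(y for y in A if f(y) == f(x)) for x in A}
I = {f(x) for x in A}                                # the image of f

def pi(x):                    # surjection A -> Q: x goes to its class C(x)
    return next(C for C in Q if x in C)

def beta(C):                  # bijection Q -> I: the constant value of f on C
    return f(next(iter(C)))

def iota(b):                  # injection I -> B: the inclusion map
    return b

assert all(iota(beta(pi(x))) == f(x) for x in A)     # f = iota . beta . pi
assert len(Q) == len(I) == 3                         # beta is a bijection
```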
2f Cosets revisited
The material on equivalence relations that we have just been discussing pro-
vides us with a slightly improved way of describing cosets.
2.21 Definition Let G be a group and H a subgroup of G. We say that elements g₁, g₂ ∈ G are left congruent modulo H if g₁ = hg₂ for some h ∈ H. Similarly, g₁ and g₂ are right congruent modulo H if g₁ = g₂h for some h ∈ H.
Note that g₁ = hg₂ if and only if h = g₁g₂⁻¹, and so we see that g₁ and g₂ are left congruent modulo H if and only if g₁g₂⁻¹ ∈ H. The first thing to observe is that both of these congruence relations are equivalence relations on G. We present only the proofs for left congruence, since those for right congruence are totally similar.
2.22 Proposition Let H be a subgroup of the group G, and let ∼ be the relation of left congruence modulo H, so that for all g₁, g₂ ∈ G we have g₁ ∼ g₂ if and only if g₁g₂⁻¹ ∈ H. Then ∼ is an equivalence relation on G.
Proof. Let g ∈ G be arbitrary. Then g = eg, where e is the identity element of G. By Proposition 2.9 we know that e ∈ H, and so we have shown that g = hg for some h ∈ H. Hence g ∼ g; that is, ∼ is reflexive.
Let g₁, g₂ ∈ G with g₁ ∼ g₂. Then we have that g₁ = hg₂ for some h ∈ H, and multiplying both sides of this on the left by h⁻¹ gives g₂ = h⁻¹hg₂ = h⁻¹g₁. Since H is closed under inversion (by Theorem 2.10) we know that h⁻¹ ∈ H, and hence g₂ ∼ g₁. So ∼ is symmetric.
Let g₁, g₂, g₃ ∈ G with g₁ ∼ g₂ and g₂ ∼ g₃. Then there exist h, k ∈ H with g₁ = hg₂ and g₂ = kg₃. Substituting the value for g₂ given by the second
of these two equations into the first gives g₁ = hkg₃. Now closure of H under multiplication (Theorem 2.10) tells us that hk ∈ H, and so g₁ ∼ g₃. Hence ∼ is transitive.
Maintaining the notation of Proposition 2.22, the fact that ∼ is an equivalence relation allows us to conclude (in view of Proposition 2.18) that G is the disjoint union of the equivalence classes of ∼. But if x ∈ G then the equivalence class containing x is
    { y ∈ G | y ∼ x } = { y ∈ G | y = hx for some h ∈ H } = { hx | h ∈ H },
which is exactly the definition of the right coset Hx. Thus the equivalence classes of ∼ are exactly the right cosets of H in G. The analogues for right cosets of Propositions 2.11 and 2.12 now follow almost immediately.
2.23 Proposition Let H be a subgroup of the group G and let x₁, x₂ ∈ G. Then the following conditions are all equivalent to each other:
(i) Hx₁ = Hx₂,
(ii) x₂ ∈ Hx₁,
(iii) x₂ = hx₁ for some h ∈ H,
(iv) x₂x₁⁻¹ ∈ H.
Proof. Parts (iii) and (iv) are different formulations of the statement that x₂ ∼ x₁, whereas (ii) says that x₂ is in the equivalence class containing x₁ and part (i) that the equivalence class containing x₁ is the same as the equivalence class containing x₂. Thus all four are equivalent.
2.24 Proposition Let H be a subgroup of the group G and let x₁, x₂ ∈ G. If Hx₁ ∩ Hx₂ ≠ ∅ then Hx₁ = Hx₂.
Proof. This simply says that distinct equivalence classes are disjoint (which follows from 2.18).
On the other hand, the third basic property of cosets is not a consequence of general facts about equivalence classes, but must instead be proved by the method we used previously.
2.25 Proposition Let H be a subgroup of the group G and let x ∈ G. Then H and Hx have the same number of elements.
The proof consists of showing that h ↦ hx is a bijective function from H to Hx.
2g Some examples
Let G be a group and x ∈ G. Then x generates a cyclic subgroup of G, consisting of the identity and all the powers of x and x⁻¹. We will denote this subgroup by ⟨x⟩. For our first example, let G = Sym{1, 2, 3} and let H = ⟨(1, 2)⟩, the cyclic subgroup generated by the transposition (1, 2). Since (1, 2)² = i we see that H has exactly two elements:
    H = {i, (1, 2)}.
If x ∈ Sym{1, 2, 3} then the left coset xH is the two element set {x, x(1, 2)}. If x = i or x = (1, 2) then xH coincides with H. If we choose x = (2, 3) we find that the two elements of xH are (2, 3) and (1, 3, 2):
    (2, 3)H = {(2, 3), (1, 3, 2)}.
Since (1, 3, 2) ∈ (2, 3)H we must have (1, 3, 2)H = (2, 3)H, a fact which is also easy to check by direct calculation. The remaining two elements of G give a third coset:
    (1, 3)H = {(1, 3), (1, 2, 3)}.
Since we knew from Proposition 2.13 that each of the cosets would turn out to have the same number of elements as H, namely two, we could have predicted that the total number of cosets would have to be the number of elements of G divided by the number of elements of H.
If we had worked with right cosets rather than left cosets we would have found the same number of cosets, but the cosets themselves would be different. In fact, if y is right congruent to x modulo H, so that y = xh for some h ∈ H, then y⁻¹ = h⁻¹x⁻¹, and since h⁻¹ ∈ H this means that y⁻¹ is left congruent to x⁻¹ modulo H. Conversely, and by similar reasoning, if y⁻¹ is left congruent to x⁻¹ then y is right congruent to x. It follows that the right coset Hx⁻¹ consists of the inverses of the elements in the left coset xH. It is a general fact, true for all groups G and subgroups H, that taking inverses changes right cosets into left cosets, and vice versa. Applying this in the particular example we have been considering, we find that
    {i, (1, 2)} = H = Hi = H(1, 2)
    {(1, 3), (1, 2, 3)} = H(1, 3) = H(1, 2, 3)
    {(2, 3), (1, 3, 2)} = H(2, 3) = H(1, 3, 2).
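These calculations are easy to automate. In the sketch below (our own notation: a permutation is a tuple (p(1), p(2), p(3)), and mult composes two of them), the left and right cosets of H = {i, (1, 2)} come out as three two-element sets each, and the two partitions of Sym{1, 2, 3} differ, so H is not normal.

```python
from itertools import permutations

G = list(permutations((1, 2, 3)))        # Sym{1,2,3} as tuples (p(1),p(2),p(3))

def mult(p, q):                          # composition: (p*q)(i) = p(q(i))
    return tuple(p[q[i] - 1] for i in range(len(p)))

H = [(1, 2, 3), (2, 1, 3)]               # the subgroup {i, (1,2)}

left = {frozenset(mult(x, h) for h in H) for x in G}    # all left cosets xH
right = {frozenset(mult(h, x) for h in H) for x in G}   # all right cosets Hx

assert len(left) == len(right) == 3      # three cosets of each kind
assert left != right                     # the partitions differ: H is not normal
```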
For our second example, let G = Sym{1, 2, 3} and let H = ⟨(1, 2, 3)⟩. This time, H has three elements:
    H = {i, (1, 2, 3), (1, 3, 2)}.
If x is any element of G which is not in H then we know that the coset xH is different from H, and hence has no elements in common with H, and also has the same number of elements as H. Since there are altogether only three elements of G which are not in H, the coset must consist exactly of these three elements. So in this case there are two left cosets of H in G:
    {i, (1, 2, 3), (1, 3, 2)} = H = iH = (1, 2, 3)H = (1, 3, 2)H
    {(1, 2), (1, 3), (2, 3)} = (1, 2)H = (1, 3)H = (2, 3)H.
In this example, if we had chosen to work with right cosets rather than left cosets the reasoning would have been exactly the same, and we would have reached the same conclusion: there are two right cosets of H in G, one being H itself, the other consisting of all the elements of G which are not in H. So in this case, unlike the previous one, the partitioning of G into right cosets modulo H is exactly the same as the partitioning of G into left cosets modulo H. Subgroups which have this property, that every right coset is a left coset and vice versa, are known as normal subgroups. They play an important role in group theory, and in the next chapter we will investigate them further.
Every group has a trivial subgroup, consisting of just one element, the identity. If G = Sym{1, 2, 3} and H = {i} then we see that
    (1, 2)H = {(1, 2)} = H(1, 2),
and a similar statement applies for all the other elements of G too. So once again left cosets are the same as right cosets: the subgroup is normal. There are six cosets altogether, each consisting of just a single element.
At the other extreme, still with G = Sym{1, 2, 3}, suppose that H = G. It is clear from the definition that a group is always a subgroup of itself! In this case there is only one coset, namely H itself. The subgroup is therefore normal.
For our next example, let G = Sym{1, 2, 3, 4} and let H be the subgroup of G generated by the permutations (1, 2, 3, 4) and (1, 3). Then H has eight
elements. In fact, H is just the group of all symmetries of a square, as we considered in Chapter 1, except that we have here changed a, b, c and d to 1, 2, 3 and 4. Because G has twenty-four elements and H has eight we will find that there are three left cosets altogether. After writing down all the elements of H, choose some element x ∈ G such that x ∉ H and calculate all the products xh for h ∈ H. This will give us a coset xH different from H. The remaining eight elements of G will constitute the third left coset. The result of these calculations is as follows: the three left cosets of H in G are
    {i, (1,2,3,4), (1,3)(2,4), (1,4,3,2), (1,3), (1,4)(2,3), (2,4), (1,2)(3,4)} = H
    {(1,2), (2,3,4), (1,3,2,4), (1,4,3), (1,3,2), (1,4,2,3), (1,2,4), (3,4)} = (1,2)H
    {(1,4), (1,2,3), (1,3,4,2), (2,4,3), (1,3,4), (2,3), (1,4,2), (1,2,4,3)} = (1,4)H.
Taking the inverses of all the elements we find the decomposition of G into right cosets modulo H:
    {i, (1,4,3,2), (1,3)(2,4), (1,2,3,4), (1,3), (1,4)(2,3), (2,4), (1,2)(3,4)} = H
    {(1,2), (2,4,3), (1,4,2,3), (1,3,4), (1,2,3), (1,3,2,4), (1,4,2), (3,4)} = H(1,2)
    {(1,4), (1,3,2), (1,2,4,3), (2,3,4), (1,4,3), (2,3), (1,2,4), (1,3,4,2)} = H(1,4).
Since it is not true that all left cosets are right cosets, the subgroup H is not normal in G.
As our final example, let G = Alt{1, 2, 3, 4}, the group of all even permutations of {1, 2, 3, 4}, and let K be the subgroup of G consisting of the identity and the three permutations (1, 2)(3, 4), (1, 3)(2, 4) and (1, 4)(2, 3). We find that the three left cosets of K in G are as follows:
    K = {i, (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)}
    (1, 2, 3)K = {(1, 2, 3), (1, 3, 4), (2, 4, 3), (1, 4, 2)}
    (1, 3, 2)K = {(1, 3, 2), (1, 4, 3), (2, 3, 4), (1, 2, 4)}.
In the notation we used in Chapter 1 when discussing the multiplication table of Alt{1, 2, 3, 4}, the cosets are {i, a, b, c}, {t₁, t₂, t₃, t₄} and {s₁, s₂, s₃, s₄}.
The right coset K(1, 3, 2) consists of the inverses of the elements of the left coset (1, 2, 3)K, but on calculating these inverses we find that they are exactly the elements of (1, 3, 2)K. So K(1, 3, 2) = (1, 3, 2)K. Similarly K(1, 2, 3) = (1, 2, 3)K. Since the left cosets are also right cosets, the subgroup K is normal in G.
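The normality of K can be confirmed mechanically. This sketch (with our own helper names mult and sign) lists the elements of Alt{1, 2, 3, 4}, forms all left and all right cosets of K, and checks that the two partitions coincide.

```python
from itertools import permutations

def mult(p, q):                          # composition of permutation tuples
    return tuple(p[q[i] - 1] for i in range(len(p)))

def sign(p):                             # parity via the number of inversions
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return (-1) ** inversions

G = [p for p in permutations((1, 2, 3, 4)) if sign(p) == 1]   # Alt{1,2,3,4}
# K: the identity, (1,2)(3,4), (1,3)(2,4) and (1,4)(2,3) as tuples.
K = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]

left = {frozenset(mult(x, k) for k in K) for x in G}
right = {frozenset(mult(k, x) for k in K) for x in G}

assert left == right                     # every left coset is a right coset: K is normal
assert len(left) == 3                    # three cosets, as listed above
```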
2h The index of a subgroup
2.26 Definition Let G be a group and H a subgroup of G. The index of H in G, denoted by [G : H], is defined to be the number of left cosets of H in G.
As we observed above, taking inverses of elements changes left cosets into right cosets. So there is a one-to-one correspondence between the set of all left cosets of H in G and the set of all right cosets of H in G, and hence the index could equally well be defined as the number of right cosets of H in G.
If S is any set, we will write |S| for the number of elements in S. Recall that if G is a group then |G| is called the order of G.
Suppose now that G is a group with |G| finite, so that if H is a subgroup of G then |H| and [G : H] will also be finite. Since we know that |xH| = |H| for all x ∈ G, we can conclude that G is the disjoint union of [G : H] sets each of which has |H| elements. It follows that |G| = [G : H] |H|. This result is known as Lagrange's Theorem.
2.27 Theorem Let G be a finite group and H a subgroup of G. Then
    |G| = [G : H] |H|,
and, in particular, the order and index of H are both divisors of |G|.
We know that if x ∈ G is arbitrary then the set of all powers of x forms a subgroup ⟨x⟩, known as the cyclic subgroup generated by x. The order of ⟨x⟩ is also called the order of the element x; it is the least positive integer n such that xⁿ is the identity element of G. As a corollary of Theorem 2.27 we deduce the following fact about orders of elements of G.
2.28 Corollary If G is a finite group and x ∈ G, then the order of x is a divisor of |G|.
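Corollary 2.28 can be spot-checked by brute force. The sketch below (with our own helper names) computes the order of every element of Sym{1, 2, 3, 4}, a group of order 24, and confirms that each order divides 24.

```python
from itertools import permutations

def mult(p, q):                           # composition of permutation tuples
    return tuple(p[q[i] - 1] for i in range(len(p)))

G = list(permutations((1, 2, 3, 4)))      # Sym{1,2,3,4}, |G| = 24
e = (1, 2, 3, 4)                          # the identity permutation

def order(x):                             # least n >= 1 with x^n = e
    n, power = 1, x
    while power != e:
        power = mult(x, power)
        n += 1
    return n

orders = {order(x) for x in G}
assert all(len(G) % n == 0 for n in orders)   # every order divides |G| = 24
print(sorted(orders))                          # [1, 2, 3, 4]
```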
3
Homomorphisms, quotient groups and isomorphisms
As explained in Chapter 1, the symmetries of a structured set always form a
group, and this is the reason for the importance of groups in mathematics.
But from the axiomatic description of group theory given in Chapter 2 we
see that groups themselves are also examples of structured sets, since the
multiplication operation that a group G must possess can alternatively be
described as a ternary relation on G. As always when considering struc-
tured sets, the most important kind of functions to consider are those which
preserve the structure.
3a Homomorphisms
3.1 Definition If G and H are groups then a function φ: G → H is called a homomorphism if φ(xy) = φ(x)φ(y) for all x, y ∈ G.
To express the multiplication operation as a relation, as suggested in the introduction above, we would define a relation Mult on a group by
    Mult(x, y, z) if and only if xy = z.
A homomorphism φ from G to H is then a function that preserves multiplication, in the sense that if Mult(x, y, z) then Mult(φ(x), φ(y), φ(z)) for all x, y, z ∈ G.
Before continuing with theoretical matters, we give a few examples of homomorphisms. First of all, let F be a field and G = GLₙ(F), the group of all invertible n × n matrices over F. For each X ∈ GLₙ(F) let det(X) be the determinant of X, and note that det(X) ≠ 0 since X is invertible. The function det: G → F* defined by X ↦ det(X) is a homomorphism, since
    det(XY) = det(X) det(Y)
for all n × n matrices X and Y.
Recall that if z = x + iy is a complex number, with real part x and imaginary part y, then the modulus of z is the real number |z| = √(x² + y²). Since |z| = 0 if and only if z = 0, the rule z ↦ |z| defines a function from C* to R*. Furthermore, it is a well known property of complex numbers that |z₁z₂| = |z₁| |z₂| for all z₁, z₂ ∈ C; so this function is a homomorphism.
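Both of these homomorphism properties are easy to sanity-check numerically. Here is a small sketch with a hand-rolled 2 × 2 determinant (so as not to assume any matrix library) and Python's built-in complex modulus; the particular matrices and numbers are arbitrary choices of ours.

```python
import math

# det is multiplicative: check it for a pair of 2x2 integer matrices.
def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def matmul2(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[1, 2], [3, 5]]
Y = [[2, 0], [1, 4]]
assert det2(matmul2(X, Y)) == det2(X) * det2(Y)

# The modulus is multiplicative too: |z1 z2| = |z1| |z2|.
z1, z2 = 3 + 4j, 1 - 2j
assert math.isclose(abs(z1 * z2), abs(z1) * abs(z2))
```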
If p ∈ Sym{1, 2, . . . , n}, define ℓ(p) to be the number of ordered pairs (i, j) of integers in the set {1, 2, . . . , n} such that i < j and p(i) > p(j). For example, if
    p = ( 1 2 3 4 5 6 7 8
          5 2 8 3 6 1 7 4 )
then it can be seen that ℓ(p) = 14. Indeed, the pairs (i, j) with i < j and p(i) > p(j) are (1, 2), (1, 4), (1, 6), (1, 8), (2, 6), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (4, 6), (5, 6), (5, 8) and (7, 8). (These correspond to the cases where a number in the second row of p has a smaller number somewhere to its right.) The sign, or parity, of p is ε(p) = (−1)^ℓ(p); if ε(p) = 1 then p is an even permutation, otherwise it is an odd permutation. The most important property of parity, and the reason for its importance, is the fact that if two permutations have the same parity then their product is even, while if they have opposite parity then their product is odd. That is,
    ε(pq) = ε(p)ε(q)
for all p, q ∈ Sym{1, 2, . . . , n}. It is trivial that the numbers ±1 form a cyclic group of order 2 (the operation being ordinary multiplication of numbers), and the above equation shows that ε: Sym{1, 2, . . . , n} → {1, −1} is a homomorphism.
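The inversion count and the sign are short computations. This sketch (the function names ell and eps are ours) reproduces ℓ(p) = 14 for the permutation displayed above and verifies ε(pq) = ε(p)ε(q) exhaustively over Sym{1, 2, 3, 4}.

```python
from itertools import permutations

def ell(p):      # number of pairs i < j with p(i) > p(j)
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

def eps(p):      # the sign (-1)^ell(p)
    return (-1) ** ell(p)

def mult(p, q):  # composition of permutation tuples
    return tuple(p[q[i] - 1] for i in range(len(p)))

p = (5, 2, 8, 3, 6, 1, 7, 4)             # the permutation from the text
assert ell(p) == 14

# eps is a homomorphism: check all pairs in Sym{1,2,3,4}.
G = list(permutations((1, 2, 3, 4)))
assert all(eps(mult(a, b)) == eps(a) * eps(b) for a in G for b in G)
```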
Bridge is a game played by four people, two against two in partnership. Given four people P₁, P₂, P₃ and P₄, there are precisely three different ways to choose the partnerships: P₄ can partner P₁, P₂ or P₃, and once P₄'s partner has been chosen the partnerships are completely determined. A more formal mathematical way of expressing this fact is to say that there are three equivalence relations on the set {1, 2, 3, 4} having the property that there are exactly two elements in each equivalence class. We shall call these equivalence relations R₁, R₂ and R₃; they correspond to the following partitionings of {1, 2, 3, 4} into equivalence classes:
    {1, 2, 3, 4} = {4, 1} ∪ {2, 3}
    {1, 2, 3, 4} = {4, 2} ∪ {1, 3}
    {1, 2, 3, 4} = {4, 3} ∪ {1, 2}.
Thus, for the relation R₁ we have 4 R₁ 1 and 2 R₁ 3, but not 1 R₁ 2 or 1 R₁ 3. The equivalence classes correspond to the partnerships; for the relation Rᵢ, player Pᵢ is the partner of player P₄.
Now if p is any permutation of {1, 2, 3, 4} then we see that permuting the players in accordance with p will permute the three equivalence relations in some manner. To be precise, if p is a permutation of {1, 2, 3, 4}, then for each equivalence relation R on {1, 2, 3, 4} there is another equivalence relation S on {1, 2, 3, 4} such that
    iRj if and only if p(i)Sp(j).
If we choose R = R₁ we will find that S = Rₗ for some l, then choosing R = R₂ will give S = Rₘ for some m ≠ l, and finally R = R₃ will give S = Rₙ with n ≠ l and n ≠ m. In this way the permutation p has given rise to a permutation
    ( 1 2 3
      l m n )
of {1, 2, 3}. Calling this permutation θ(p), the following formula summarizes all that we have said: for each p ∈ Sym{1, 2, 3, 4} there is a permutation θ(p) ∈ Sym{1, 2, 3} such that
(3.1.1)    iRₖj if and only if p(i) R_{θ(p)(k)} p(j).
We shall show in a moment that this function θ from Sym{1, 2, 3, 4} to Sym{1, 2, 3} is a homomorphism. But before doing so, let us calculate θ(p) for several different permutations p; this may make the situation more understandable.
First of all, choose p = (1, 2)(3, 4). The equivalence relation R₃ has players P₄ and P₃ as one partnership, P₁ and P₂ the other, and so the permutation p says that each player swaps places with his/her partner. The partnerships are thus unchanged. So p in fact preserves the equivalence relation R₃. Next consider R₂, for which the partnerships are {P₄, P₂} and {P₁, P₃}. The permutation (1, 2)(3, 4) makes each player swap places with someone from the other partnership. The total effect of this is that the partnerships are the same as before:
    {P₄, P₂}, {P₁, P₃}  →  {P₃, P₁}, {P₂, P₄}
Thus p preserves R₂ as well. Similarly we find that p preserves R₁:
    {P₄, P₁}, {P₂, P₃}  →  {P₃, P₂}, {P₁, P₄}
We have thus shown that if p = (1, 2)(3, 4) then θ(p) = i, the identity permutation of {1, 2, 3}.
Next consider p = (1, 2). For the relation R₃, where P₄ and P₃ are partners, the permutation p tells P₄ and P₃ to stay where they are, while the other partners, P₁ and P₂, swap places. This does not affect the partnerships. So p preserves R₃. But for R₂ the players P₄ and P₂ are partners, and p keeps P₄ in the same place but makes P₂ swap with P₁. So p changes R₂ into R₁ (where P₄ partners P₁). Similarly, p transforms R₁ into R₂, and we conclude that p = (1, 2) gives θ(p) = (1, 2).
More generally, let p ∈ Sym{1, 2, 3, 4} be any permutation such that p(4) = 4. Of course, there are exactly six of these, namely i, (1, 2), (1, 3), (2, 3), (1, 2, 3) and (1, 3, 2). The equivalence relation Rᵢ has P₄ partnering Pᵢ; apply p and we have P_{p(4)} partnering P_{p(i)}. But since p(4) = 4, we have P₄ partnering P_{p(i)}, and the relation corresponding to this is R_{p(i)}. In other words, for these permutations p which leave 4 fixed, the equivalence relations R₁, R₂ and R₃ are permuted in exactly the same way as P₁, P₂ and P₃ are permuted. So we have θ((1, 3)) = (1, 3), θ((1, 3, 2)) = (1, 3, 2) and so on.
Finally, let us calculate θ(p) when p = (1, 4, 3, 2). Consider R₃ first: here P₁ and P₂ are partners. Apply p and we find that P_{p(1)} and P_{p(2)} are partners; that is, P₄ and P₁ are partners. So R₃ has been transformed to R₁. For R₁, where P₁ and P₄ are partners, applying p makes P₄ and P₃ partners, showing that R₁ is transformed to R₃. We see that R₂ must be fixed, and so θ((1, 4, 3, 2)) = (1, 3).
Now for the promised proof that θ is a homomorphism. Let p and q be two permutations in Sym{1, 2, 3, 4}, and apply (3.1.1) with k replaced by θ(q)(k) and i, j replaced by p⁻¹(i), p⁻¹(j). This gives
    p⁻¹(i) R_{θ(q)(k)} p⁻¹(j) if and only if i R_{(θ(p)θ(q))(k)} j.
Now apply (3.1.1) again with i, j replaced by (q⁻¹p⁻¹)(i), (q⁻¹p⁻¹)(j), and p replaced by q. This gives
    (q⁻¹p⁻¹)(i) R_k (q⁻¹p⁻¹)(j) if and only if p⁻¹(i) R_{θ(q)(k)} p⁻¹(j).
Finally, apply (3.1.1) again, this time with p replaced by pq and i, j replaced by (q⁻¹p⁻¹)(i), (q⁻¹p⁻¹)(j). We find that
    (q⁻¹p⁻¹)(i) R_k (q⁻¹p⁻¹)(j) if and only if i R_{θ(pq)(k)} j.
Combining these three equivalences shows that for all i, j ∈ {1, 2, 3, 4},
    i R_{θ(pq)(k)} j if and only if i R_{(θ(p)θ(q))(k)} j,
and hence the relations R_{θ(pq)(k)} and R_{(θ(p)θ(q))(k)} are the same. Thus
    θ(pq)(k) = (θ(p)θ(q))(k)
for all k ∈ {1, 2, 3}, and hence the permutations θ(pq) and θ(p)θ(q) are equal. Thus we have shown that θ preserves multiplication, as required.
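The whole construction can also be verified by direct computation. In this sketch (all names are ours) the three pairings are stored as sets of pairs, theta(p) is found by seeing where p sends each pairing, and the homomorphism property is checked over all twenty-four permutations.

```python
from itertools import permutations

def mult(p, q):                          # composition of permutation tuples
    return tuple(p[q[i] - 1] for i in range(len(p)))

# The three pairings R_1, R_2, R_3: R_k is the partition in which 4 partners k.
R = {1: {frozenset({4, 1}), frozenset({2, 3})},
     2: {frozenset({4, 2}), frozenset({1, 3})},
     3: {frozenset({4, 3}), frozenset({1, 2})}}

def theta(p):
    """The permutation of {1,2,3} induced by p acting on the pairings."""
    def image(k):
        moved = {frozenset(p[i - 1] for i in pair) for pair in R[k]}
        return next(m for m in R if R[m] == moved)
    return (image(1), image(2), image(3))

assert theta((2, 1, 4, 3)) == (1, 2, 3)   # (1,2)(3,4) fixes all three pairings
assert theta((4, 1, 2, 3)) == (3, 2, 1)   # (1,4,3,2) induces (1,3), as computed above

G = list(permutations((1, 2, 3, 4)))
assert all(theta(mult(p, q)) == mult(theta(p), theta(q)) for p in G for q in G)
```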
The preceding example is fairly typical of the way homomorphisms arise. Given a group G of symmetries of some object X, if one can find another associated object Y, perhaps a part of X, then it may turn out that every symmetry of X in the group G gives rise to a symmetry of Y. In such cases there will be a homomorphism from G to the group of symmetries of Y.
For example, consider the set G of all permutations of {1, 2, 3, 4, 5} that preserve the subset {1, 2, 3}. That is,
    G = { p ∈ Sym{1, 2, 3, 4, 5} | p(i) ∈ {1, 2, 3} for all i ∈ {1, 2, 3} }.
Then G is a group, and for each element p ∈ G we can define φ(p) to be that permutation of {1, 2, 3} such that (φ(p))(i) = p(i) for all i ∈ {1, 2, 3}. Thus, if p = (1, 3)(4, 5) then p preserves {1, 2, 3}, and so p ∈ G, and we see that φ(p), the permutation of {1, 2, 3} to which p gives rise, is just (1, 3). This mapping φ, given by p ↦ φ(p), is a homomorphism.
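This restriction map is equally easy to check by machine. In the sketch below (our own notation again), a permutation tuple p in G restricts to its first three entries, and the restriction respects composition.

```python
from itertools import permutations

def mult(p, q):                  # composition of permutation tuples
    return tuple(p[q[i] - 1] for i in range(len(p)))

# G: permutations of {1,...,5} that preserve the subset {1,2,3}.
G = [p for p in permutations((1, 2, 3, 4, 5))
     if all(p[i] in (1, 2, 3) for i in range(3))]

def phi(p):                      # the induced permutation of {1,2,3}
    return p[:3]

assert len(G) == 12              # 3! choices on {1,2,3} times 2! on {4,5}
assert all(phi(mult(p, q)) == mult(phi(p), phi(q)) for p in G for q in G)
```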
To close this section we return briefly to general theoretical matters. We know that a bijective function from one set to another is just a one-to-one correspondence between the elements of the two sets. If the sets are groups and the function a homomorphism, this one-to-one correspondence also preserves the group structure. So, for example, if φ: G → H is a bijective homomorphism and G is a finite group, with elements g₁, g₂, . . . , gₙ, then H also has n elements, namely φ(g₁), φ(g₂), . . . , φ(gₙ), and if the product in G of gᵢ and gⱼ is gₖ, then in H we have
    φ(gᵢ)φ(gⱼ) = φ(gᵢgⱼ) = φ(gₖ).
So the multiplication table of G becomes the multiplication table of H if, for each i, the element gᵢ is replaced by φ(gᵢ) wherever it occurs. As we mentioned in Chapter 1, groups G and H which are related in this way are said to be isomorphic to each other.
3.2 Definition A bijective homomorphism is called an isomorphism, and groups G and H are said to be isomorphic if there is an isomorphism from G to H. We will write G ≅ H to indicate that G and H are isomorphic. The relation ≅, defined on the class of all groups, is also called isomorphism.
It would clearly be undesirable to use the terminology we have just introduced were ≅ not an equivalence relation, or, at the very least, symmetric. We would not be prepared to say "G and H are isomorphic" if we were not also prepared to say "H and G are isomorphic". But fortunately it is true that isomorphism of groups is an equivalence relation. We leave the proof of this as an exercise.
Finally, we prove two simple general properties of homomorphisms.
3.3 Proposition Let G, H be groups and φ: G → H a homomorphism.
(i) If e_G is the identity element of G and e_H the identity element of H, then φ(e_G) = e_H.
(ii) If x is any element of G, then φ(x⁻¹) = φ(x)⁻¹.
Proof. Since φ(e_G) ∈ H, we have
    φ(e_G)e_H = φ(e_G) = φ(e_G²) = φ(e_G)²,
and cancellation yields e_H = φ(e_G), as desired. Now for all x ∈ G,
    φ(x⁻¹)φ(x) = φ(x⁻¹x) = φ(e_G) = e_H,
and by Proposition 2.5 it follows that φ(x⁻¹) = φ(x)⁻¹.
3b Quotient groups
It will often be convenient for us to deal with the product of two subsets of a group, as defined in the following natural sense.
3.4 Definition If G is a group and S, T ⊆ G, then we define
    ST = { st | s ∈ S and t ∈ T }.
Observe that this multiplication of subsets of G satisfies the associative law.
3.5 Proposition If G is a group and S, T, U ⊆ G, then (ST)U = S(TU).
Proof. By the definition of the product of two subsets of G,
    (ST)U = { xu | x ∈ ST and u ∈ U }
          = { (st)u | s ∈ S, t ∈ T and u ∈ U }
          = { s(tu) | s ∈ S, t ∈ T and u ∈ U }
          = { sy | s ∈ S and y ∈ TU }
          = S(TU),
as required.
Of course, Proposition 3.5 allows us to omit the bracketing from products of three or more subsets of G.
If x ∈ G and T ⊆ G then we define xT = {x}T = { xt | t ∈ T }. Similarly, Tx is defined to equal T{x}. Note that this is consistent with, and generalizes, the notation for cosets which we introduced previously: now xT and Tx are defined for all T ⊆ G, whereas previously they were only defined if T were a subgroup.
In the previous chapter we mentioned normal subgroups, describing them as subgroups with the property that every right coset of the subgroup is also a left coset of the subgroup, and vice versa. If K is a normal subgroup of G and x ∈ G, then the left coset of K containing x is xK, and the right coset of K containing x is Kx. But the left coset containing x must be a right coset, and so it must be the right coset containing x. It follows that xK = Kx for all x ∈ G; this would be a suitable alternative definition of normality. However, let us choose the following third alternative as our official definition of normality in these notes.
3.6 Definition Let G be a group and K a subgroup of G. We say that K is normal in G if x⁻¹kx ∈ K for all k ∈ K and x ∈ G.
It is not totally obvious that this actually is equivalent to the others; so let us prove it.
3.7 Proposition Let G be a group and K a subgroup of G. The following are equivalent:
(i) x⁻¹kx ∈ K for all k ∈ K and x ∈ G,
(ii) x⁻¹Kx ⊆ K for all x ∈ G,
(iii) x⁻¹Kx = K for all x ∈ G,
(iv) xK = Kx for all x ∈ G,
(v) every left coset of K in G is also a right coset of K in G.
Proof. Let x G. Note that, in accordance with our denitions concerning
products of subsets, x
1
Kx = x
1
kx [ k K. Thus (i) says that every
element of x
1
Kx is in K; that is, x
1
Kx K. Hence (i) and (ii) are
equivalent.
The equivalence of (iv) and (v) was proved in the preamble above. If (iv) holds then using 3.5 we see that for all x ∈ G,

    x⁻¹Kx = x⁻¹(Kx) = x⁻¹(xK) = (x⁻¹x)K = eK = K,

whence (iii) holds. Conversely, if (iii) holds then

    Kx = (xx⁻¹)Kx = x(x⁻¹Kx) = xK        (for all x ∈ G),

and so (iv) holds. Hence the equivalence of (iii) and (iv) is established, and all that remains is to prove that (ii) and (iii) are equivalent; furthermore, it is trivial that (iii) implies (ii), and so the task reduces to proving that (ii) implies (iii).

Assume that (ii) holds, and let x ∈ G. By (ii) we have that x⁻¹Kx ⊆ K. But the same must also be true with x replaced by x⁻¹, since the formula in (ii) is assumed to hold for all x ∈ G. So we also have that xKx⁻¹ ⊆ K, and it follows that x⁻¹(xKx⁻¹)x ⊆ x⁻¹Kx. But since

    x⁻¹(xKx⁻¹)x = (x⁻¹x)K(x⁻¹x) = eKe = K

we have that K ⊆ x⁻¹Kx. Combining this with the previously established fact that x⁻¹Kx ⊆ K, we can conclude that x⁻¹Kx = K, and therefore that (iii) holds, as required.
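The equivalent conditions of Proposition 3.7 can also be checked by brute force on a small group. The following Python sketch (illustrative only, and not part of the original text) encodes permutations of {1, 2, 3} 0-based as tuples, with (pq)(i) = p(q(i)) as in these notes, and confirms that conditions (i) and (iv) agree on a normal and a non-normal subgroup of Sym{1, 2, 3}:

```python
from itertools import permutations

# Permutations of {0, 1, 2} as tuples: p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):               # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(3))

def inv(p):                  # the inverse permutation
    r = [0] * 3
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def conj_closed(K):          # condition (i): x⁻¹kx in K for all k in K, x in G
    return all(mul(mul(inv(x), k), x) in K for k in K for x in G)

def cosets_match(K):         # condition (iv): xK = Kx for all x in G
    return all({mul(x, k) for k in K} == {mul(k, x) for k in K} for x in G)

A3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}   # Alt{1, 2, 3}: normal in G
K2 = {(0, 1, 2), (1, 0, 2)}              # generated by a transposition: not normal

for K in (A3, K2):
    assert conj_closed(K) == cosets_match(K)
print(conj_closed(A3), conj_closed(K2))  # True False
```

As expected, the alternating subgroup passes both tests and the two-element subgroup fails both.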
Let K be a normal subgroup of the group G. Since the left and right cosets of K in G coincide, it follows that the equivalence relations on G of left congruence modulo K and right congruence modulo K must coincide also. Indeed, if x, y ∈ G are left congruent modulo K, then x = ky for some k ∈ K; but this gives x = y(y⁻¹ky), and since normality of K yields that y⁻¹ky ∈ K, we can conclude that x, y are right congruent modulo K as well. In this situation we say simply that x and y are congruent modulo K.

The crucial property of congruence modulo a normal subgroup, which makes it far more valuable than left or right congruence modulo a non-normal subgroup, is that it interacts with multiplication in the best way imaginable.
3.8 Proposition Let K be a normal subgroup of the group G, and let ∼ be the relation on G of congruence modulo K. If x, x′, y, y′ ∈ G are such that x′ ∼ x and y′ ∼ y, then x′y′ ∼ xy.

Proof. Given x′ ∼ x and y′ ∼ y, there exist h, k ∈ K such that x′ = xh and y′ = yk, and hence

    x′y′ = (xh)(yk) = xy(y⁻¹hy)k.

But since K is normal we know that y⁻¹hy ∈ K, and since k ∈ K it follows from closure of K under multiplication that (y⁻¹hy)k ∈ K. Thus the equation x′y′ = xy(y⁻¹hy)k says that x′y′ ∼ xy, as required.
As an almost immediate consequence of 3.8 we see that if C₁, C₂ ⊆ G are equivalence classes for the equivalence relation ∼, then the product C₁C₂ is also an equivalence class. For, let g be an arbitrary element of C₁C₂. Then we have g = xy for some x ∈ C₁ and y ∈ C₂. If g′ is also an arbitrary element of C₁C₂ then, similarly, g′ = x′y′, where x′ ∈ C₁ and y′ ∈ C₂. Since x and x′ are in the same equivalence class we have that x′ ∼ x, and similarly y′ ∼ y, and now by 3.8 we conclude that g′ ∼ g. So every element of C₁C₂ is congruent to g. Conversely, suppose that g′ is an arbitrary element of G that is congruent to g. Then g′ = gk for some k ∈ K, and this gives g′ = x(yk) ∈ C₁C₂, since x is in C₁, and yk, being congruent to y, is in C₂. So an element of G is congruent to g if and only if it is in the set C₁C₂; thus C₁C₂ is an equivalence class, as claimed.
3.9 Proposition Let K be a normal subgroup of the group G. Then (xK)(yK) = xyK for all x, y ∈ G.

Proof. As above, let ∼ be the relation of congruence modulo K. From our discussion of congruence modulo a subgroup in Chapter 2, we know that xK is the equivalence class containing x, and yK the equivalence class containing y. By the discussion above it follows that (xK)(yK) is an equivalence class; but since x ∈ xK and y ∈ yK it follows that xy ∈ (xK)(yK), whence (xK)(yK) is the equivalence class that contains xy. So (xK)(yK) = xyK, as claimed.
Continuing with the assumption that K is a normal subgroup of G and ∼ the relation of congruence modulo K, let Q be the quotient of G by ∼. That is, by Definition 2.19, Q is the set of all equivalence classes of G under ∼. We have seen that the product of two equivalence classes is an equivalence class; so subset multiplication defines an operation on Q. It turns out that this operation makes Q into a group; however, before proving this in general, let us look at an example.

Let G = Alt{1, 2, 3, 4} and K the subgroup consisting of the identity permutation i and the three permutations a = (1, 2)(3, 4), b = (1, 3)(2, 4) and c = (1, 4)(2, 3). We investigated the cosets of K in G at the end of Chapter 2, and concluded that K is normal in G, and that the cosets of K in G are, in the notation used in the multiplication table in Chapter 1,

    iK = {i, a, b, c}
    (1, 2, 3)K = {t₁, t₂, t₃, t₄}
    (1, 3, 2)K = {s₁, s₂, s₃, s₄}.
Since when writing out the multiplication table we grouped the elements of G according to their cosets modulo K, it is easy to observe from a glance at the table that the assertion of Proposition 3.8 is indeed satisfied. The product of two elements of the coset {t₁, t₂, t₃, t₄} is an element of the coset {s₁, s₂, s₃, s₄}, the product of an element in {s₁, s₂, s₃, s₄} with one in {t₁, t₂, t₃, t₄} lies in {i, a, b, c}, and so on. If we write the cosets as I = {i, a, b, c}, T = {t₁, t₂, t₃, t₄} and S = {s₁, s₂, s₃, s₄} then we see that the multiplication table for the cosets is as follows.

        I  T  S
    I   I  T  S
    T   T  S  I
    S   S  I  T

Since this is exactly the same as the multiplication table for a cyclic group of order three, we conclude that in this case at least, Q, the set of all equivalence classes for the relation of congruence modulo K, is a group. Now let us prove it in general.
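The closure observed in the coset table can be confirmed mechanically. Here is a small Python sketch (illustrative only; permutations of {1, 2, 3, 4} are encoded 0-based as tuples, with (pq)(i) = p(q(i))) that builds the three cosets of K in Alt{1, 2, 3, 4} and checks that the setwise product of any two of them is again a coset:

```python
from itertools import permutations

def mul(p, q):                            # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(len(p)))

def even(p):                              # even permutations: even inversion count
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

A4 = [p for p in permutations(range(4)) if even(p)]
K = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # {i, a, b, c}, 0-based

cosets = {frozenset(mul(x, k) for k in K) for x in A4}        # the cosets xK

def product(C, D):                        # setwise product of two subsets of A4
    return frozenset(mul(c, d) for c in C for d in D)

# Every product of two cosets is again a coset, so subset multiplication
# gives an operation on the three-element quotient Q.
assert all(product(C, D) in cosets for C in cosets for D in cosets)
print(len(A4), len(cosets))               # 12 3
```

The three cosets multiply exactly as the I, T, S table above predicts.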
3.10 Proposition Let G be a group and let K be a normal subgroup of G. Let Q = { xK | x ∈ G }, the set of all cosets of K in G, which equals the set of all equivalence classes of G under congruence modulo K. Then Q forms a group, with multiplication in Q satisfying the rule that (xK)(yK) = xyK for all x, y ∈ G. The identity element of this group is the coset K = eK (where e is the identity element of G), and for all x ∈ G the inverse in Q of the coset xK is the coset x⁻¹K.

Proof. We have seen that subset multiplication yields an operation on Q, and in 3.9 above we have seen that it satisfies (xK)(yK) = xyK for all x, y ∈ G. Furthermore, we know from Proposition 3.5 that it is associative. So all that remains is to check the group axioms (ii) (a) and (b) pertaining to the identity element and inverses (see Definition 2.1).

Let C be an arbitrary element of Q, so that C = xK for some x ∈ G. By 3.9 we have that

    KC = (eK)(xK) = exK = xK = C,

and similarly

    CK = (xK)(eK) = xeK = xK = C.

So the element K ∈ Q has the property that CK = KC = C for all C ∈ Q; that is, K is an identity element for Q.

Again, let C = xK, an arbitrary element of Q. By 3.9

    C(x⁻¹K) = (xK)(x⁻¹K) = xx⁻¹K = eK = K,

and similarly

    (x⁻¹K)C = (x⁻¹K)(xK) = x⁻¹xK = eK = K,

showing that x⁻¹K has the property required of an inverse of C. So the group axioms are satisfied, and all parts of the proposition are proved.
3.11 Definition The group Q in Proposition 3.10 is called the quotient
group, or factor group, G/K.
Surely, one might think, this same construction will work for any subgroup K of a group G; the restriction to normal subgroups must be unnecessary. For, with Q = { xK | x ∈ G } and with multiplication defined by (xK)(yK) = xyK, all the group axioms are satisfied. The associative law is trivial, since

    (xKyK)zK = xyKzK = (xy)zK = x(yz)K = xKyzK = xK(yKzK),

the coset K = eK is an identity element, since by our definition of multiplication

    (xK)(eK) = xeK = xK = exK = (eK)(xK),

and x⁻¹K is the inverse of xK since

    (x⁻¹K)(xK) = (x⁻¹x)K = eK = (xx⁻¹)K = (xK)(x⁻¹K).

The error in this reasoning comes right at the start, where we blithely asserted that multiplication could be defined on Q in such a way that the formula (xK)(yK) = xyK holds. For the subsequent reasoning to work, we need this formula for all x, y ∈ G. However, if K is not normal in G this formula is inconsistent with itself. If x, x′ ∈ G are right congruent modulo K then x and x′ correspond to the same element C ∈ Q, the coset C = xK = x′K. If we multiply this element of Q by another, yK say, then on the one hand the product C(yK) must be (xK)(yK) = xyK; on the other hand it must also be (x′K)(yK) = x′yK. But the product C(yK) cannot be defined in two different ways: the reasoning will be invalid unless x′yK = xyK, for all y ∈ G, whenever x′K = xK. Rearranging this using 2.11, the required condition becomes that (xy)⁻¹(x′y), which equals y⁻¹(x⁻¹x′)y, must be in K whenever x⁻¹x′ ∈ K. Thus putting x′ = xk we see that our multiplication rule is unambiguously defined only if y⁻¹ky ∈ K whenever k ∈ K; that is, K must be normal in G.
An example should make this clearer. Let G = Sym{1, 2, 3} and let K = {i, (1, 2)}, the cyclic subgroup generated by (1, 2). The left cosets of K in G all have two elements, and we can check readily that {(1, 3), (1, 2, 3)} and {(2, 3), (1, 3, 2)} are both left cosets. Let us attempt to evaluate the product of these by means of the so-called formula (xK)(yK) = xyK. We have that

    {(1, 3), (1, 2, 3)} = (1, 3)K
    {(2, 3), (1, 3, 2)} = (2, 3)K

and therefore

    {(1, 3), (1, 2, 3)}{(2, 3), (1, 3, 2)} = (1, 3)(2, 3)K = (1, 3, 2)K = {(2, 3), (1, 3, 2)}.

But equally we have

    {(1, 3), (1, 2, 3)} = (1, 2, 3)K
    {(2, 3), (1, 3, 2)} = (1, 3, 2)K

and therefore

    {(1, 3), (1, 2, 3)}{(2, 3), (1, 3, 2)} = (1, 2, 3)(1, 3, 2)K = iK = {i, (1, 2)}.

Clearly, the attempted definition of coset multiplication is unsatisfactory in this case.
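The same failure shows up if we compute the genuine setwise product of the two cosets: it has four elements, so it cannot be a coset of the two-element subgroup K at all. A Python sketch (illustrative only; permutations encoded 0-based as tuples, with (pq)(i) = p(q(i))):

```python
def mul(p, q):                       # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(3))

# Encodings on {0, 1, 2}: (1,3) -> (2,1,0), (1,2,3) -> (1,2,0),
#                         (2,3) -> (0,2,1), (1,3,2) -> (2,0,1)
C1 = {(2, 1, 0), (1, 2, 0)}          # the left coset {(1,3), (1,2,3)}
C2 = {(0, 2, 1), (2, 0, 1)}          # the left coset {(2,3), (1,3,2)}

product = {mul(x, y) for x in C1 for y in C2}
print(len(product))                  # 4: the setwise product is not a two-element coset
```

All four products of pairs are distinct here, which is exactly the ambiguity the text describes.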
Return now to the case when K is normal in G, so that the quotient G/K has a well-defined multiplication, and is a group. Given that the construction works in this case, the following observation is trivial.

3.12 Proposition Let K be a normal subgroup of the group G. Then the function π: G → G/K, defined by the rule that π(x) = xK for all x ∈ G, is a surjective homomorphism.

Proof. By 3.9, π(x)π(y) = (xK)(yK) = xyK = π(xy), for all x, y ∈ G. Hence π is a homomorphism. It is surjective since, by definition of G/K, every element of G/K has the form xK = π(x) for some x ∈ G.

The homomorphism π appearing in Proposition 3.12 is called the natural or canonical homomorphism from G to G/K.
3c The Homomorphism Theorem

Our main objective in this section is to analyse homomorphisms. The next definition provides a key concept for this task.

3.13 Definition Let G, H be groups and φ: G → H a homomorphism. The kernel of φ is the set ker φ = { g ∈ G | φ(g) = e_H }, where e_H is the identity element of H.

There is an intimate connection between normal subgroups and homomorphisms, the first aspect of which is the following proposition.
3.14 Proposition Let G, H be groups and φ: G → H a homomorphism. Then the kernel of φ is a normal subgroup of G.

Proof. We first use Proposition 2.10 to prove that ker φ is a subgroup of G. By Proposition 3.3 we know that e_G ∈ ker φ, and so ker φ ≠ ∅. If x, y ∈ ker φ then φ(x) = φ(y) = e_H, and it follows that

    φ(xy) = φ(x)φ(y) = e_H e_H = e_H,

whence xy ∈ ker φ. Thus ker φ is closed under multiplication. And if x ∈ ker φ then since φ(x) = e_H we deduce from 3.3 that

    φ(x⁻¹) = φ(x)⁻¹ = e_H⁻¹ = e_H,

whence ker φ is closed under inversion. We have now verified that the hypotheses of 2.10 are satisfied, and can conclude that ker φ is a subgroup.

Let k ∈ ker φ and x ∈ G. Then φ(k) = e_H, and since φ(x⁻¹) = φ(x)⁻¹ (by 3.3), we have

    φ(x⁻¹kx) = φ(x⁻¹)φ(k)φ(x) = φ(x)⁻¹ e_H φ(x) = φ(x)⁻¹φ(x) = e_H,

whence x⁻¹kx ∈ ker φ. Since this holds for all k ∈ ker φ and all x ∈ G, we have shown that the subgroup ker φ is normal in G.
A homomorphism φ from G to H also gives rise to a subgroup of H, as the next proposition shows. In this case, however, the subgroup need not be normal.

3.15 Proposition Let G, H be groups and φ: G → H a homomorphism. Then the image of φ,

    im φ = { φ(g) | g ∈ G },

is a subgroup of H.

Proof. In view of 2.10, our task is to show that im φ is nonempty and closed under multiplication and inversion.

Since G ≠ ∅ (G must at least have an identity element), it follows that im φ ≠ ∅. Specifically, we know that φ(e_G) ∈ im φ. (In fact, by Proposition 3.3, we know that φ(e_G) = e_H, although technically this fact is not needed here.)

Let h, k ∈ im φ. Then, by definition of im φ, there exist x, y ∈ G with h = φ(x) and k = φ(y). Now

    hk = φ(x)φ(y) = φ(xy) ∈ im φ,

and we have shown that the product of two arbitrary elements of im φ always lies in im φ. That is, im φ is closed under multiplication.

Let h ∈ im φ. Then h = φ(x) for some x ∈ G, and by 3.3

    h⁻¹ = φ(x)⁻¹ = φ(x⁻¹) ∈ im φ,

and so we see that im φ is also closed under inversion, as required.
Obviously, by the definition of surjectivity, a homomorphism φ from a group G to a group H is surjective if and only if the subgroup im φ is equal to the whole group H. Parallel to this, but a little less trivial, we have the following criterion for injectivity of φ.

3.16 Proposition Let G and H be groups and φ: G → H a homomorphism. Then φ is injective if and only if the identity element of G is the only element of ker φ.

Proof. Assume first that φ is injective. We know from 3.3 that φ(e_G) = e_H; so e_G is an element of ker φ. To prove that it is the only element, let x ∈ ker φ be arbitrary, and observe that we then have φ(x) = e_H = φ(e_G). Injectivity of φ then gives immediately that x = e_G, as required.

Conversely, assume that ker φ = {e_G}; we will prove that φ is injective. Accordingly, assume that x, y ∈ G with φ(x) = φ(y). By 3.3 we deduce that

    φ(x⁻¹y) = φ(x⁻¹)φ(y) = φ(x)⁻¹φ(y) = φ(x)⁻¹φ(x) = e_H,

whence x⁻¹y ∈ ker φ. But this means that x⁻¹y = e_G, and hence x = y. So we have shown that φ(x) = φ(y) implies x = y; that is, φ is injective.
We now come to the Homomorphism Theorem, or the First Isomor-
phism Theorem, which is the main theorem of introductory group theory.
3.17 Theorem Let G and H be groups, and φ: G → H a homomorphism. Let K = ker φ, the kernel of φ, and I = im φ, the image of φ. Then

    G/K ≅ I.

Indeed, there is an isomorphism ψ: G/K → I such that ψ(xK) = φ(x) for all x ∈ G.

Proof. Let C be an arbitrary element of G/K, and recall that C is an equivalence class for the relation of congruence modulo K. If x, y are arbitrary elements of C then x = yk for some k ∈ K, and so

    φ(x) = φ(yk) = φ(y)φ(k) = φ(y)e_H = φ(y),

since K = ker φ. Therefore we can, without ambiguity, define ψ(C) so that ψ(C) = φ(x) for all x ∈ C. Since φ(x) ∈ im φ = I, this rule defines a function ψ from G/K to I. Furthermore, if x ∈ G is arbitrary then x ∈ xK, and so ψ(xK) = φ(x). We have thus shown that there exists a function ψ: G/K → I such that ψ(xK) = φ(x) for all x ∈ G.

True, we could have started this proof with the apparently simple prescription "Define ψ: G/K → I by ψ(xK) = φ(x) for all x." However, this runs the risk of falling into the trap that appeared when we tried to define quotient groups for non-normal subgroups: it is possible to have xK = x′K without having x = x′, and so we would simultaneously be defining ψ(xK) = ψ(x′K) as φ(x) and as φ(x′). To show that ψ is well-defined, it is necessary to show that φ(x) = φ(x′) whenever xK = x′K; this is in fact what we did in the first paragraph of the proof.

Given that ψ is well-defined, the rest of the proof becomes easy. First of all, if C₁, C₂ ∈ G/K are arbitrary, then we have C₁ = xK and C₂ = yK for some x, y ∈ G, and now

    ψ(C₁C₂) = ψ((xK)(yK)) = ψ(xyK) = φ(xy) = φ(x)φ(y) = ψ(xK)ψ(yK) = ψ(C₁)ψ(C₂).

Hence ψ is a homomorphism.

Let h ∈ I. Since I = im φ we must have h = φ(x) for some x ∈ G, and hence ψ(xK) = φ(x) = h. So we have shown that every element of I, the codomain of ψ, has the form ψ(C) for some C ∈ G/K. That is, ψ is surjective.

Finally, let C₁, C₂ ∈ G/K with ψ(C₁) = ψ(C₂). We have C₁ = xK and C₂ = yK for some x, y ∈ G, and now

    φ(x) = ψ(xK) = ψ(C₁) = ψ(C₂) = ψ(yK) = φ(y),

from which it follows that

    φ(x⁻¹y) = φ(x)⁻¹φ(y) = e_H.

Thus x⁻¹y ∈ ker φ = K, and, by 2.11, xK = yK. That is, C₁ = C₂. We have thus shown that ψ(C₁) = ψ(C₂) implies C₁ = C₂, and therefore that ψ is injective.

Since ψ is an injective and surjective homomorphism from G/K to I, it follows that G/K ≅ I, as claimed.
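The theorem can be checked in a concrete case. The sign map on Sym{1, 2, 3, 4} is a homomorphism onto the two-element group {1, −1} whose kernel is Alt{1, 2, 3, 4}; a checkable numerical consequence of the theorem is that the quotient by the kernel has exactly as many elements as the image. A Python sketch (illustrative only; permutations encoded 0-based as tuples):

```python
from itertools import permutations

def mul(p, q):                         # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(len(p)))

def sign(p):                           # the sign homomorphism: +1 for even, -1 for odd
    n = len(p)
    inversions = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return 1 if inversions % 2 == 0 else -1

G = list(permutations(range(4)))
assert all(sign(mul(p, q)) == sign(p) * sign(q) for p in G for q in G)

K = [p for p in G if sign(p) == 1]     # the kernel: Alt{1, 2, 3, 4}
I = {sign(p) for p in G}               # the image: {1, -1}
print(len(G) // len(K), len(I))        # 2 2: |G/K| = |I|, as the theorem predicts
```

Here G/K is the two-element quotient group, isomorphic to the image {1, −1} under even-coset ↦ 1, odd-coset ↦ −1.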
Although we did not make use of Proposition 2.20 in our proof of Theorem 3.17, it is nevertheless true that Theorem 3.17 is little more than Proposition 2.20 applied to a function which happens to be a homomorphism of groups. According to Proposition 2.20, a homomorphism φ: G → H can be factorized as a surjection π: G → Q, followed by a bijection β: Q → I, followed by an injection ι: I → H. Furthermore, examining the proof of 2.20, we find that I is the image of φ and Q the quotient of G by the equivalence relation ∼ defined by

    x ∼ y if and only if φ(x) = φ(y).

We showed in the proof of 3.17 that φ(x) = φ(y) if and only if x is congruent to y modulo K = ker φ; so the equivalence relation derived from Proposition 2.20 is exactly the same as the equivalence relation used in the definition of G/K. So the Q from 2.20 is, in this context, exactly the quotient group G/K. Furthermore, looking once more at the proof of 2.20, we find that the mapping β: Q → I satisfies β(C) = φ(x) whenever x ∈ C. The mapping ψ in Theorem 3.17 is defined in exactly the same way. It can be checked easily that the proofs of the bijectivity of β given in the proof of 2.20 and in the proof of 3.17 are essentially the same.

It is worth noting also that the mapping π: G → Q, defined in the proof of 2.20, is precisely the natural homomorphism G → G/K as defined in Proposition 3.12. The mapping ι: I → H is defined by ι(h) = h for all h ∈ I; so clearly ι is a homomorphism also. So the content of Theorem 3.17, given Proposition 2.20, is really that the three factors π, β and ι are all homomorphisms, if the original function φ: G → H is a homomorphism.
4
Automorphisms, inner automorphisms and conjugacy
According to the general conventions we have adopted, if we regard a group
G as a structured set then a symmetry, or automorphism, of G is a bijective
transformation of G that preserves the group structure. We start this chapter
by looking at a few examples of automorphisms of groups.
4a Automorphisms
In view of the above remarks and the definition of isomorphism given in Chapter 3, we see that the following definition is appropriate.
4.1 Definition An automorphism of a group G is an isomorphism from
G to itself.
Recalling that an isomorphism is a bijective homomorphism, and that a bijective function from G to itself can be called a permutation of G, the definition can be rephrased as follows: an automorphism is a permutation of G which is also a homomorphism.
The set of all symmetries of anything, no matter what, is always a
group; so it is certainly true that the set of all automorphisms of a group is
a group.
4.2 Definition If G is a group then Aut(G), the automorphism group of G, is the set { φ | φ: G → G is an automorphism }, with multiplication of elements of Aut(G) defined to be composition of functions.
The multiplication operation on Aut(G) is thus inherited from the multiplication operation on Sym(G), the group of all permutations of the set G. If a separate proof were required that Aut(G) is indeed a group, the simplest way to do it would be to use 2.10 to show that Aut(G) is a subgroup of Sym(G). The main task then is to show that the composite of two homomorphisms is a homomorphism, and the inverse of an isomorphism is also an isomorphism. We leave this as an exercise, and instead look at a few examples.
Let G = {I, T, S} be a cyclic group of order three. The multiplication table of G is as follows:

        I  T  S
    I   I  T  S
    T   T  S  I
    S   S  I  T

If we interchange S and T in this table we obtain the following:

        I  S  T
    I   I  S  T
    S   S  T  I
    T   T  I  S

However, this is still a multiplication table for G: the product of two elements of G will be the same whichever table you choose. For example, both tables say that T² = S, and both tables say that ST = I. The same would not have worked had we chosen to interchange T and I rather than T and S. The permutation

    α = ( I  T  S )
        ( I  S  T )

preserves the multiplicative structure of the group G, whereas the permutation

    β = ( I  T  S )
        ( T  I  S )

does not. That is, α is an automorphism of G and β is not. It is easy to show that in fact the only other automorphism of G is the identity transformation.
We proved in Proposition 3.3 that if φ: G → H is any homomorphism then φ(e_G) = e_H. It follows that any automorphism of a group must take the identity element to itself. If G is the Klein 4-group then it turns out that every permutation of G which fixes the identity element is an automorphism of G. This can be seen as follows. Write G = {e, a, b, c}, where e is the identity element, and multiplication is given by the table which we wrote down in Chapter 1. The important thing to note here is that three simple rules describe the multiplication completely. First, x² = e for all x ∈ G; second, xe = ex = x for all x ∈ G; third, if x and y are distinct nonidentity elements of G then xy is the third nonidentity element of G. Now let φ: G → G be any permutation which fixes e, and let x, y ∈ G. We will show, using a case-by-case analysis, that φ(x)φ(y) = φ(xy). If x = y then, using the first of the three rules,

    φ(x)φ(y) = φ(x)² = e = φ(e) = φ(x²) = φ(xy).

So suppose that x ≠ y. If y = e then, by the second rule, we have

    φ(x)φ(y) = φ(x)φ(e) = φ(x)e = φ(x) = φ(xe) = φ(xy).

A similar argument applies if x = e. Finally, if x and y are both nonidentity elements then the third rule tells us that xy = z, where z is the third nonidentity element. Furthermore, since φ is bijective and φ(e) = e, the elements φ(x), φ(y) and φ(z) must be distinct nonidentity elements of G; so the third rule also tells us that φ(x)φ(y) = φ(z). So φ(x)φ(y) = φ(xy) in this case too, and so our claim is established. Since φ(x)φ(y) = φ(xy) for all x, y ∈ G, it follows that φ is a homomorphism, and hence an automorphism of G.
From our previous discussion of the cyclic group of order three, it can be seen that the cyclic group of order three also has the property that all permutations of it which fix the identity element are automorphisms, although in this case there are only two such permutations, as opposed to six in the case of the Klein group. It trivially holds also for the cyclic groups of orders 1 and 2, when the only permutation which fixes the identity element is the identity permutation. However, there are no other groups with this property.
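The case analysis for the Klein 4-group is easy to automate. The following Python sketch (illustrative only) encodes the group as {0, 1, 2, 3} with 0 as the identity, implements the three rules, and counts how many of the six permutations fixing the identity respect the multiplication:

```python
from itertools import permutations

# Klein 4-group on {e, a, b, c}, encoded as 0..3 with 0 the identity.
def mul(x, y):
    if x == y:
        return 0                         # first rule: x² = e
    if x == 0:
        return y                         # second rule: ex = x
    if y == 0:
        return x                         # second rule: xe = x
    return ({1, 2, 3} - {x, y}).pop()    # third rule: the remaining element

autos = 0
for images in permutations([1, 2, 3]):   # the six permutations fixing the identity
    f = (0,) + images
    if all(f[mul(x, y)] == mul(f[x], f[y]) for x in range(4) for y in range(4)):
        autos += 1
print(autos)   # 6: every permutation fixing e is an automorphism
```

All six permutations pass, confirming the claim in the text.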
Let G be the cyclic group of finite order n. Expressed in terms of some fixed generating element x, the distinct elements of G are e = x⁰ (the identity element), x, x², . . . , x^(n−2) and x^(n−1). Subsequent powers of x give the same elements again: x^n = x⁰, x^(n+1) = x, and so on. In fact, if r and s are any integers, then x^r = x^s if and only if r − s is a multiple of n.

For each integer k there is a function φ = φ_k: G → G such that φ(x^r) = x^(kr) for all integers r. To prove this we must show that x^(kr) = x^(ks) whenever x^r = x^s; otherwise φ will not be uniquely defined. But if x^r = x^s then, as we remarked above, r − s must be a multiple of n. And if r − s is a multiple of n then kr − ks, which equals k(r − s), must be a multiple of n also, and so x^(kr) = x^(ks), as required.
Now if r and s are arbitrary integers we have that

    φ(x^r)φ(x^s) = x^(kr)x^(ks) = x^(kr+ks) = x^(k(r+s)) = φ(x^(r+s)) = φ(x^r x^s).

Since all elements of G are powers of x we conclude that φ(gh) = φ(g)φ(h) for all g, h ∈ G. So φ is a homomorphism from G to G. This does not mean that φ is an automorphism: as yet we do not know whether or not φ is bijective. To determine whether φ is injective, we investigate the kernel of φ.

Suppose that x^r ∈ ker φ. Then x^(kr) = φ(x^r) = e, and so kr must be a multiple of n. Now let n = dm and k = dh, where d is the greatest common divisor of n and k. Note that m and h cannot have any common factors greater than 1, since if they did we could find a common factor of n and k greater than d. Now since kr is a multiple of n it follows, after dividing through by d, that hr is a multiple of m. Since m and h have no nontrivial common factors it follows that r is a multiple of m. The converse is also true: if r is a multiple of m then x^r ∈ ker φ. So

    ker φ = { x^r | r is a multiple of m }, where m = n/gcd(n, k).

We see therefore that if the gcd of n and k is greater than 1 then ker φ contains at least one element x^r such that r is not a multiple of n. So ker φ ≠ {e} in this case, and hence (by Proposition 3.16) φ is not injective. On the other hand, if the gcd of n and k is 1 then

    ker φ = { x^r | r is a multiple of n } = {e},

and φ is injective. It is easily seen that an injective transformation of a finite set is necessarily surjective also, and so we conclude that φ is an automorphism of G if and only if the greatest common divisor of n and k is 1.
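This criterion can be tested numerically. On the exponents, the map x^r ↦ x^(kr) is simply r ↦ kr mod n, so a Python sketch only needs modular arithmetic (the choice n = 12 below is an arbitrary illustration):

```python
from math import gcd

def is_automorphism(n, k):
    # On exponents mod n, the map x^r -> x^(kr) becomes r -> k*r mod n;
    # it is an automorphism exactly when this map is a bijection.
    images = {(k * r) % n for r in range(n)}
    return len(images) == n

n = 12
autos = [k for k in range(n) if is_automorphism(n, k)]
print(autos)                             # [1, 5, 7, 11]
assert autos == [k for k in range(n) if gcd(n, k) == 1]
```

The surviving values of k are exactly those coprime to n, so the cyclic group of order 12 has four automorphisms.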
4b Inner automorphisms
The groups whose automorphisms we discussed in 4a were all Abelian. This
section, however, is primarily concerned with groups which are not Abelian,
because the construction we present yields only the identity automorphism
in the Abelian case.
(Homomorphisms from a group to itself, such as the maps considered at the end of the previous section, are sometimes called endomorphisms.)
4.3 Proposition Let G be a group and g ∈ G, and define τ_g: G → G by the rule τ_g(h) = ghg⁻¹ for all h ∈ G. Then τ_g is an automorphism of G.

Proof. If h, k are arbitrary elements of G then

    τ_g(h)τ_g(k) = (ghg⁻¹)(gkg⁻¹) = ghekg⁻¹ = g(hk)g⁻¹ = τ_g(hk),

and so τ_g is a homomorphism.

Suppose that h, k ∈ G with τ_g(h) = τ_g(k). Then ghg⁻¹ = gkg⁻¹, and applying 2.2 (i) and 2.2 (ii) we conclude that h = k. Since τ_g(h) = τ_g(k) implies that h = k, it follows that τ_g is injective.

Let h ∈ G be arbitrary. Since

    τ_g(g⁻¹hg) = g(g⁻¹hg)g⁻¹ = (gg⁻¹)h(gg⁻¹) = ehe = h

it follows that h ∈ im τ_g. Since h was arbitrary, this shows that τ_g is surjective.
4.4 Definition If G is a group, then an automorphism φ of G is called an inner automorphism if there exists g ∈ G such that φ(h) = ghg⁻¹ for all h ∈ G.

The reason for the name "inner" is clear enough: inner automorphisms of G are produced, in some sense, by elements from within G itself.
4.5 Proposition Let G be a group, and for each g ∈ G let τ_g ∈ Aut(G) be as defined in 4.3. Then the function Φ: G → Aut(G), defined by Φ(g) = τ_g for all g ∈ G, is a homomorphism.

Proof. We must show that Φ(g)Φ(h) = Φ(gh), for all g, h ∈ G. That is, we must show that τ_g τ_h = τ_{gh}, for all g, h ∈ G. Now since τ_g τ_h and τ_{gh} are functions from G to G, to say that they are equal is to say that they have the same effect on all elements of G.

So, let g, h and x be arbitrary elements of G. Since multiplication in Aut(G) is composition of functions, (τ_g τ_h)(x) is, by definition, τ_g(τ_h(x)). Therefore

    (τ_g τ_h)(x) = τ_g(hxh⁻¹) = ghxh⁻¹g⁻¹ = (gh)x(gh)⁻¹ = τ_{gh}(x),

and since this is valid for all x ∈ G we conclude that the functions τ_g τ_h and τ_{gh} are equal, as required.
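The identity just proved, namely that conjugation by g followed after conjugation by h equals conjugation by gh, can be verified exhaustively for a small non-Abelian group such as Sym{1, 2, 3}. A Python sketch (illustrative only; each inner automorphism is represented as a dictionary from the group to itself):

```python
from itertools import permutations

G = list(permutations(range(3)))         # Sym{1, 2, 3}, 0-based tuples

def mul(p, q):                           # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def tau(g):                              # the inner automorphism h -> g h g⁻¹
    return {h: mul(mul(g, h), inv(g)) for h in G}

for g in G:
    for h in G:
        composite = {x: tau(g)[tau(h)[x]] for x in G}   # τ_g after τ_h
        assert composite == tau(mul(g, h))              # equals conjugation by gh
print("ok")
```

All 36 pairs (g, h) satisfy the identity, as the proposition requires.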
In Chapter 3 we expended some effort proving the Homomorphism Theorem, which is applicable whenever one has a homomorphism from one group to another. In Proposition 4.5 we showed that a certain function is a homomorphism; we would be foolish not to apply the Homomorphism Theorem and see if it tells us anything interesting.
Propositions 3.14 and 3.15 should really be regarded as part of the Homomorphism Theorem, since they have the same hypotheses as 3.17 and are needed for its statement. To apply these results to the homomorphism Φ defined in 4.5, the first task is to calculate the kernel and the image of Φ.

The kernel of Φ is the set of all elements g ∈ G such that the function τ_g: G → G is the identity function. That is, ker Φ is the set of all g ∈ G such that τ_g(h) = h for all h ∈ G.

4.6 Definition The centre of a group G is the set

    Z(G) = { g ∈ G | gh = hg for all h ∈ G }.

That is, Z(G) consists of those elements which commute with all elements of G.
Multiplying the equation gh = hg on the right by g⁻¹ converts it into ghg⁻¹ = h. Recalling that τ_g(h) was defined as ghg⁻¹, we conclude that the centre of G is the set of all g ∈ G such that τ_g(h) = h for all h ∈ G.

4.7 Proposition The kernel of the homomorphism Φ of Proposition 4.5 is Z(G), the centre of G.

By Proposition 3.14 it follows that the centre is a normal subgroup.

4.8 Proposition The centre of a group G is always a normal subgroup of G.

Note that Proposition 4.8 is also very easy to prove directly, using 2.10 and the definition of normality.

The image of the homomorphism Φ is Inn(G) = { τ_g | g ∈ G }, the set of all inner automorphisms of G.
4.9 Proposition The set Inn(G) is a subgroup of Aut(G).
This follows directly from 3.15, or, alternatively, can be proved readily
using 2.10. We call Inn(G) the inner automorphism group of G.
4.10 Proposition If G is any group, then the central quotient group,
G/Z(G), is isomorphic to the inner automorphism group of G.
Proof. This is immediate from 3.17.
4c Conjugacy

4.11 Definition Let G be a group and x, y ∈ G. Then x and y are said to be conjugate in G if there exists g ∈ G such that gxg⁻¹ = y. In this situation we will sometimes say that g transforms x into y.

An alternative formulation of this concept is as follows: x, y ∈ G are conjugate if there exists φ ∈ Inn(G) such that φ(x) = y.

It is a straightforward exercise to show that conjugacy is an equivalence relation, which therefore partitions G into mutually disjoint equivalence classes. These are called the conjugacy classes of G. (In fact, in the literature, the conjugacy classes of a group are usually just called the classes of the group.)

Observe that if G is an Abelian group then gxg⁻¹ = y becomes x = y; so, in an Abelian group, x cannot be conjugate to anything but itself. So the conjugacy classes of an Abelian group are just the single element subsets of the group. Accordingly, the first interesting example to look at is the smallest group which is not Abelian, namely, the symmetric group Sym{1, 2, 3}.

Note that (2, 3)(1, 2)(2, 3)⁻¹ = (1, 3), and (1, 2)(1, 3)(1, 2)⁻¹ = (2, 3). It follows that the elements (1, 2), (1, 3) and (2, 3) of G = Sym{1, 2, 3} are all conjugate to each other. Let p ∈ G be arbitrary, and let p(1) = a and p(2) = b. Then

    (p(1, 2)p⁻¹)(a) = (p(1, 2))(p⁻¹(a)) = (p(1, 2))(1) = p(2) = b,

and similarly it can be checked that p(1, 2)p⁻¹ takes b to a. Since a and b are interchanged by p(1, 2)p⁻¹, the remaining element of {1, 2, 3} must be fixed. Hence p(1, 2)p⁻¹ = (a, b). It follows that (1, 2), (1, 3) and (2, 3) are the only elements of this conjugacy class. By similar reasoning it can be verified that (1, 2, 3) and (1, 3, 2) (which equals (1, 2)(1, 2, 3)(1, 2)) are the only elements of G conjugate to (1, 2, 3). The remaining element of G is i, the identity, which must form a class by itself. Indeed, it is obvious that i cannot be conjugate to anything but itself, since pip⁻¹ = i for all p. So the partitioning of G into conjugacy classes is as follows:

    G = {i} ∪ {(1, 2), (1, 3), (2, 3)} ∪ {(1, 2, 3), (1, 3, 2)}.
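This partition can be recomputed by brute force: conjugate each element by every group element and collect the resulting classes. A Python sketch in the style of the earlier ones (illustrative only; 0-based tuples):

```python
from itertools import permutations

G = list(permutations(range(3)))         # Sym{1, 2, 3}

def mul(p, q):                           # (pq)(i) = p(q(i)), as in the text
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def conjugacy_class(x):                  # { g x g⁻¹ : g in G }
    return frozenset(mul(mul(g, x), inv(g)) for g in G)

classes = {conjugacy_class(x) for x in G}
print(sorted(len(c) for c in classes))   # [1, 2, 3]
```

The class sizes 1, 2 and 3 match the partition displayed above: the identity, the two 3-cycles, and the three transpositions.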
Next we investigate conjugacy in Sym{1, 2, . . . , n}, for arbitrary n. The key observation is that if q is an element of Sym{1, 2, . . . , n} that takes a to b, and p ∈ Sym{1, 2, . . . , n} is arbitrary, then pqp⁻¹ takes p(a) to p(b). For, given q(a) = b, we find that

    (pqp⁻¹)(p(a)) = (pq)(p⁻¹(p(a))) = (pq)(a) = p(q(a)) = p(b).

Thus, for example, if q is the cycle (1, 4, 2, 3, 5) then pqp⁻¹ is the cycle (p(1), p(4), p(2), p(3), p(5)), for since q takes 1 to 4 it follows that pqp⁻¹ takes p(1) to p(4), and since q takes 4 to 2 it follows that pqp⁻¹ takes p(4) to p(2), and so on. By choosing p appropriately, we can arrange for (p(1), p(4), p(2), p(3), p(5)) to be any given 5-cycle. For example, if we want pqp⁻¹ to be (5, 4, 3, 2, 1), then

    p = ( 1 2 3 4 5 )
        ( 5 3 2 4 1 )

will do. Similarly, if q = (2, 6)(3, 4) then pqp⁻¹ = (p(2), p(6))(p(3), p(4)), and by suitable choice of p we can make pqp⁻¹ equal any product of two disjoint transpositions. In general, two elements of Sym{1, 2, . . . , n} are conjugate if and only if they are of the same cycle type. That is, when written as products of disjoint cycles, they have the same number of cycles of length 1, the same number of length 2, the same number of length 3, and so on. For example, in Sym{1, 2, . . . , 24} a permutation might have three cycles of length 1, one of length two, four of length three and one of length seven. Such a permutation is

    q = (3, 7)(8, 21, 17)(11, 12, 13)(16, 6, 22)(20, 24, 23)(4, 9, 19, 10, 5, 18, 15),

and another is

    r = (11, 12)(1, 2, 3)(7, 15, 14)(13, 16, 17)(19, 22, 20)(5, 6, 21, 24, 23, 10, 9).

Because q and r have the same cycle type, it follows that they must be conjugate in Sym{1, 2, . . . , 24}. Here is a permutation p such that pqp⁻¹ = r:

    (  3  7  8 21 17 11 12 13 16  6 22 20 24 23  4  9 19 10  5 18 15  1  2 14 )
    ( 11 12  1  2  3  7 15 14 13 16 17 19 22 20  5  6 21 24 23 10  9  4  8 18 )

Arranging the numbers in the top row into increasing order (which is more usual) this becomes:

    (  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 )
    (  4  8 11  5 23 16 12  1  6 24  7 15 14 18  9 13  3 10 21 19  2 17 20 22 )
In Sym{1, 2, 3, 4, 5}, which has 120 elements, there are exactly seven possible
cycle types, and so seven classes, as follows.
The identity element is a one element conjugacy class.
The elements conjugate to (1, 2) form a ten element class.
The elements conjugate to (1, 2, 3) form a twenty element class.
The elements conjugate to (1, 2, 3, 4) form a thirty element class.
The elements conjugate to (1, 2, 3, 4, 5) form a twenty-four element class.
The elements conjugate to (1, 2)(3, 4) form a fifteen element class.
The elements conjugate to (1, 2)(3, 4, 5) form a twenty element class.
Add these up and check that it comes to 120. Also, verify the numbers
themselves!
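Both checks can be done by brute force. A short Python sketch (using the standard itertools and collections modules; the helper name cycle_type is ours) counts the elements of Sym{1, . . . , 5} by cycle type:

```python
from itertools import permutations
from collections import Counter

def cycle_type(perm):
    """Cycle type of a permutation given as a tuple p with p[i] the image of i."""
    n = len(perm)
    seen, lengths = [False] * n, []
    for s in range(n):
        if seen[s]:
            continue
        c, i = 0, s
        while not seen[i]:
            seen[i] = True
            i = perm[i]
            c += 1
        lengths.append(c)
    return tuple(sorted(lengths))

counts = Counter(cycle_type(p) for p in permutations(range(5)))
assert sum(counts.values()) == 120
assert counts[(1, 1, 1, 1, 1)] == 1    # the identity
assert counts[(1, 1, 1, 2)] == 10      # conjugates of (1,2)
assert counts[(1, 1, 3)] == 20         # conjugates of (1,2,3)
assert counts[(1, 4)] == 30            # conjugates of (1,2,3,4)
assert counts[(5,)] == 24              # conjugates of (1,2,3,4,5)
assert counts[(1, 2, 2)] == 15         # conjugates of (1,2)(3,4)
assert counts[(2, 3)] == 20            # conjugates of (1,2)(3,4,5)
```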
For groups of symmetries of geometrical objects, elements which are
conjugate invariably turn out to be geometrically similar kinds of transfor-
mations. For example, let G be the group of all symmetries of a square
with vertices a, b, c and d (see Chapter 1). The clockwise rotation through
90° and the anticlockwise rotation through 90°, corresponding to the per-
mutations (a, b, c, d) and (a, d, c, b), are conjugate in G; indeed, the element
(a, c) ∈ G transforms one to the other. The element (a, b)(c, d) is a reflection
in a line parallel to, and midway between, a pair of opposite sides. The ele-
ment (a, d)(b, c) is also such a reflection. These elements are conjugate in G;
the element (a, b, c, d) transforms one to the other. The elements (a, c) and
(b, d) are the reflections in diagonals of the square. They are conjugate in G,
and (a, b, c, d) transforms one to the other. The other two elements of G, the
identity and the central inversion (a, c)(b, d), both form one element classes.
So the class decomposition of G is
    G = {i} ∪ {(a, c)(b, d)} ∪ {(a, b)(c, d), (a, d)(b, c)}
          ∪ {(a, c), (b, d)} ∪ {(a, b, c, d), (a, d, c, b)}.
Let G be the group of all invertible 3 × 3 matrices over the complex
field. That is, G = GL_3(C). It is proved in textbooks on linear algebra that
if a matrix A ∈ G has three distinct eigenvalues, λ, μ and ν say, then A is
similar (to use the linear algebra term) to the matrix
    D = ( λ 0 0 )
        ( 0 μ 0 )
        ( 0 0 ν ).
That is, there exists an invertible matrix T such that T⁻¹AT = D. In group
theory terminology, the elements A, D ∈ G are conjugate. If the matrix A
has a repeated eigenvalue the story is not quite so simple, but nevertheless it
is fully understood: every matrix is similar to a Jordan normal form matrix.
This classical fact from linear algebra describes the conjugacy classes in the
group GL_3(C) (and, indeed, the theory applies to GL_n(C) for all n).
4d On the number of elements in a conjugacy class
Let g be a fixed element of the group G. Since the elements of G which are
conjugate to g are exactly the elements of the form tgt⁻¹, there is a function
f from G to the conjugacy class containing g, given by t ↦ tgt⁻¹. In keeping
with the idea used in Proposition 2.20, let us define an equivalence relation on
G by the rule that s ∼ t if and only if f(s) = f(t). The different conjugates
of g in G are then in one-to-one correspondence with the equivalence classes
of G under this equivalence relation.
4.12 Proposition Let G be a group and g ∈ G. The set of all elements
of G which commute with g,
    C_G(g) = { t ∈ G | gt = tg },
is a subgroup of G.
Proof. Since ge = eg (where e is the identity element of G) we see that
e ∈ C_G(g), and so C_G(g) is nonempty.
Let s, t ∈ C_G(g). Then gt = tg and gs = sg, and so
    g(st) = (gs)t = (sg)t = s(gt) = s(tg) = (st)g,
and it follows that st ∈ C_G(g). Hence C_G(g) is closed under multiplication.
Let t ∈ C_G(g). Then gt = tg, and multiplying this equation on the
left by t⁻¹, and on the right by t⁻¹ as well, gives t⁻¹(gt)t⁻¹ = t⁻¹(tg)t⁻¹,
which simplifies to gt⁻¹ = t⁻¹g. So t⁻¹ ∈ C_G(g). Hence C_G(g) is also closed
under inversion, and by Proposition 2.10 it follows that C_G(g) is a subgroup
of G.
4.13 Definition The subgroup C_G(g) defined in Proposition 4.12 is called
the centralizer of g in G.
Since the equation gt = tg is equivalent to tgt⁻¹ = g, the centralizer
of g can also be described as the set of elements of G that transform g to
itself. Note also that t is in the centralizer of g if and only if g is in the
centralizer of t.
The relevance of the centralizer in the present context is clarified by
the following result.
4.14 Proposition Let G be a group and g ∈ G. If s, t are arbitrary
elements of G then sgs⁻¹ = tgt⁻¹ if and only if sC_G(g) = tC_G(g). That is,
s and t transform g to the same element if and only if they lie in the same
left coset of the centralizer of g.
Proof. The equation sgs⁻¹ = tgt⁻¹ is equivalent to t⁻¹sg = gt⁻¹s, which
says that t⁻¹s ∈ C_G(g). By 2.11, this is equivalent to sC_G(g) = tC_G(g), as
required.
At the beginning of this section we defined an equivalence relation on
G by the rule that s ∼ t if and only if sgs⁻¹ = tgt⁻¹; Proposition 4.14 shows
that this equivalence relation is exactly right congruence modulo the central-
izer of g, so that the equivalence classes are just the left cosets of the cen-
tralizer. Because the conjugates of g are in one-to-one correspondence with
these equivalence classes, it follows that the number of conjugates of g in G
equals the index in G of the centralizer. The next proposition summarizes
what we have proved.
4.15 Proposition Let G be a group and g ∈ G. Then the number of
conjugates of g in G is equal to [G : C_G(g)]. Moreover, there is a bijective
correspondence between the conjugates of g and the left cosets of C_G(g) such
that tC_G(g) ↦ tgt⁻¹ for all t ∈ G.
An example will clarify the idea a little. Let G = Sym{1, 2, 3, 4} and
let g = (1, 3)(2, 4). For each t ∈ G we have tgt⁻¹ = (t(1), t(3))(t(2), t(4)).
Let us first calculate C_G(g), which is the set of those t such that tgt⁻¹ = g.
There are in fact eight such elements t; they are as follows:
(i) t(1) = 1, t(3) = 3, t(2) = 2, t(4) = 4 (the identity element);
(ii) t(1) = 1, t(3) = 3, t(2) = 4, t(4) = 2;
(iii) t(1) = 3, t(3) = 1, t(2) = 2, t(4) = 4;
(iv) t(1) = 3, t(3) = 1, t(2) = 4, t(4) = 2;
(v) t(1) = 2, t(3) = 4, t(2) = 1, t(4) = 3;
(vi) t(1) = 2, t(3) = 4, t(2) = 3, t(4) = 1;
(vii) t(1) = 4, t(3) = 2, t(2) = 1, t(4) = 3;
(viii) t(1) = 4, t(3) = 2, t(2) = 3, t(4) = 1.
Writing them out in cycle notation, they are
    i, (2, 4), (1, 3), (1, 3)(2, 4), (1, 2)(3, 4), (1, 2, 3, 4), (1, 4, 3, 2), (1, 4)(2, 3).
Multiplying all these elements on the left by (1, 2) we find that the eight
elements of the left coset (1, 2)C_G(g) are
    (1, 2), (1, 2, 4), (1, 3, 2), (1, 3, 2, 4), (3, 4), (2, 3, 4), (1, 4, 3), (1, 4, 2, 3).
For these eight values of t we have that tgt⁻¹ = (1, 2)g(1, 2)⁻¹ = (2, 3)(1, 4).
In a similar fashion we find that the eight elements of the left coset (1, 4)C_G(g)
are
    (1, 4), (1, 4, 2), (1, 3, 4), (1, 3, 4, 2), (1, 2, 4, 3), (1, 2, 3), (2, 4, 3), (2, 3).
These eight values of t all give tgt⁻¹ = (1, 4)g(1, 4)⁻¹ = (1, 2)(3, 4). The three
elements which have the same cycle type as (1, 3)(2, 4) have been matched
with the left cosets of C_G(g), in the way specified in Proposition 4.15.
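The same calculation can be done by brute force in Python; a sketch (helper names are ours), confirming that the centralizer has eight elements and that the index 24/8 = 3 equals the number of conjugates:

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p(q(i)); permutations are tuples acting on 0..n-1."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(4)))
g = (2, 3, 0, 1)   # the text's (1,3)(2,4), written 0-based
centralizer = [t for t in G if compose(compose(t, g), inverse(t)) == g]
conjugates = {compose(compose(t, g), inverse(t)) for t in G}
assert len(centralizer) == 8                 # the eight elements listed above
assert len(conjugates) == len(G) // len(centralizer) == 3
```

The last line is exactly Proposition 4.15: the class size equals the index of the centralizer.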
Let G be a finite group and let 𝒞_1, 𝒞_2, . . . , 𝒞_s be all the conjugacy
classes of G. Since conjugacy is an equivalence relation on G we know that
the classes 𝒞_i form a partitioning of G: each element of G lies in exactly one
class. In particular, therefore, the total number of elements of G equals the
sum of the sizes of the classes:
    |G| = |𝒞_1| + |𝒞_2| + · · · + |𝒞_s|.
For each class 𝒞_i, choose a representative element g_i ∈ 𝒞_i. By Proposi-
tion 4.15, |𝒞_i| = [G : C_G(g_i)], and so the equation above can be rewritten
as
    |G| = [G : C_G(g_1)] + [G : C_G(g_2)] + · · · + [G : C_G(g_s)].
This is called the class equation of the group G.
It is useful to collect together the terms [G : C_G(g_i)] in the class equa-
tion which are equal to 1. Now [G : C_G(g_i)] = 1 if and only if the subgroup
C_G(g_i) is the whole of G (so that C_G(g_i) itself is the one and only coset
of C_G(g_i) in G), and this means that every t ∈ G commutes with g_i. But
g_i t = t g_i for all t ∈ G if and only if g_i ∈ Z(G). In other words, the single
element conjugacy classes of G correspond to the elements of the centre of G.
The class equation now becomes
    |G| = |Z(G)| + [G : C_G(g_r)] + [G : C_G(g_{r+1})] + · · · + [G : C_G(g_s)],
where g_r, g_{r+1}, . . . , g_s are representatives of the conjugacy classes of
elements of G that lie outside the centre of G.
4.16 Proposition Let G be a group such that |G| = p^n, where p is a
prime number and n > 0. Then the centre of G contains at least one non-
identity element.
Proof. Since |G| = p^n and p is prime, every divisor of p^n is also a power
of p. We know that the index of any subgroup of G is necessarily a divisor
of |G|, and so each of the terms [G : C_G(g_i)] appearing in the class equation
is a power of p. Now we have
    |Z(G)| = |G| − ([G : C_G(g_r)] + [G : C_G(g_{r+1})] + · · · + [G : C_G(g_s)]),
and all the terms on the right hand side are powers of p greater than 1. So
all the terms on the right hand side of the equation are divisible by p, and it
follows that |Z(G)| is divisible by p also. In particular, |Z(G)| ≠ 1, whence
Z(G) has more elements than just the identity.
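Proposition 4.16 can be checked numerically for a small example. The following Python sketch (generator choices are ours) builds the dihedral group of order 8 = 2³ as permutations of the four vertices of a square, and confirms that its centre is nontrivial and that the class sizes sum to |G|:

```python
def compose(p, q):
    """(p o q)(i) = p(q(i)); permutations are tuples acting on 0..3."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

rot = (1, 2, 3, 0)   # the rotation (a, b, c, d)
ref = (1, 0, 3, 2)   # the reflection (a, b)(c, d)
G, frontier = set(), {tuple(range(4))}
while frontier:      # close {identity} under multiplication by the generators
    G |= frontier
    frontier = {compose(x, gen) for x in G for gen in (rot, ref)} - G

centre = [z for z in G if all(compose(z, t) == compose(t, z) for t in G)]
classes = {frozenset(compose(compose(t, x), inverse(t)) for t in G) for x in G}
assert len(G) == 8
assert len(centre) == 2    # nontrivial, as Proposition 4.16 predicts
assert sum(len(c) for c in classes) == len(G)   # the class equation
```

The class sizes here are 1, 1, 2, 2, 2, so the class equation reads 8 = 2 + 2 + 2 + 2 with |Z(G)| = 2.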
5
Reflections
At this point we will abruptly abandon abstract group theory, although
we have not even scratched the surface of that enormous subject. Instead,
we turn our attention to n-dimensional Euclidean geometry.
There is an extremely naive viewpoint according to which, since we live
in three-dimensional space, higher dimensional geometry has no real appli-
cation, and is just some totally useless nonsense invented by mathematicians
to amuse themselves as they sit in their ivory tower. The truth is that the
mathematical analysis of very simple, concrete, physical problems immedi-
ately and inevitably involves higher dimensional spaces. To describe the po-
sition of a particle in three dimensional space requires three coordinates, but
if you need to worry about the positions of two particles simultaneously you
need six. Immediately you are working in six dimensional space. True, what
mathematicians call six dimensional geometry, the person in the street would
not recognise as geometry. It becomes hard to draw pictures to illustrate the
arguments that are used, which therefore appear more like algebra than ge-
ometry. Indeed, there is no clear distinction between algebra and geometry.
But, as we shall see, concepts from two and three dimensional geometry do
have natural generalizations in higher dimensions, and it is natural to use
the term geometry to apply to reasoning involving such concepts.
5a Inner product spaces
Let R^n be the set of all n-component column vectors:
    R^n = { (x_1, x_2, . . . , x_n)^t | x_i ∈ R for all i }
(for typographical convenience we write column vectors as transposed rows).
If x, y ∈ R^n we define the dot product of x and y by the formula
    x · y = x_1y_1 + x_2y_2 + · · · + x_ny_n,
where x_i, y_i are the ith coordinates of x, y respectively. Note that x · x ≥ 0
for all x, with equality if and only if x = 0. We define
    ‖x‖ = √(x · x).
When n = 2 or 3 we can use Cartesian coordinates to identify R^n with the
space of position vectors of points, in two or three dimensional Euclidean
space, relative to a fixed origin O. It turns out that in these cases (provided
we use a rectangular coordinate system) the formula
    distance = √((x − y) · (x − y)) = ‖x − y‖
gives the distance between the points P and Q whose position vectors are
OP = x and OQ = y. So it is natural to use this as the definition of the
distance between x and y whatever the value of n. Similarly, if θ = ∠POQ
then
    cos θ = (x · y) / (‖x‖ ‖y‖),
and so we use this same formula to define the angle between x and y in
general.
With the dot product as defined above, R^n becomes an inner product
space (as defined in Chapter 1). Furthermore, if V is any inner product space
over R, and if the dimension of V is n, then a basis v_1, v_2, . . . , v_n of V can
be found which is orthonormal, in the sense that
    v_i · v_j = 0 if i ≠ j, and v_i · v_j = 1 if i = j,
and then there is an inner product preserving vector space isomorphism be-
tween V and R^n given by
    λ_1 v_1 + λ_2 v_2 + · · · + λ_n v_n ↦ (λ_1, λ_2, . . . , λ_n)^t.
So, effectively, R^n is the only n-dimensional inner product space over R. The
corollary of this fact which will be important for us later is the following
proposition.
5.1 Proposition If v_1, v_2, . . . , v_n are elements of a real inner product
space V, then there exist x_1, x_2, . . . , x_n in R^n with x_i · x_j = v_i · v_j
for all i, j. In particular, if θ_ij is the angle between x_i and x_j, then
    cos θ_ij = (v_i · v_j) / (‖v_i‖ ‖v_j‖).
We will also apply the concepts of distance and angle to arbitrary real
inner product spaces, defining them by the same formulas as for R^n.
It will sometimes be necessary for us to deal with spaces which are
almost inner product spaces, but not quite.
5.2 Definition Let V be a vector space over R. A bilinear form on V is
a function f: V × V → R such that
(i) f(λv + μu, w) = λf(v, w) + μf(u, w) for all u, v, w ∈ V and λ, μ ∈ R,
(ii) f(w, λv + μu) = λf(w, v) + μf(w, u) for all u, v, w ∈ V and λ, μ ∈ R.
The bilinear form f is said to be symmetric if in addition
(iii) f(u, v) = f(v, u) for all u, v ∈ V.
Observe that a symmetric bilinear form is an inner product if it satisfies
the extra condition that f(v, v) > 0 for all nonzero v ∈ V.
5.3 Proposition Let v_1, v_2, . . . , v_n be a basis of a vector space V over
the field R, and let A = (a_ij) be an arbitrary n × n matrix over R. Then
there exists a bilinear form f on V such that f(v_i, v_j) = a_ij for all i, j. The
form f is symmetric if A is a symmetric matrix.
Proof. Every element of V can be written uniquely as a linear combination
of the basis vectors v_1, v_2, . . . , v_n. If v = λ_1v_1 + · · · + λ_nv_n and
u = μ_1v_1 + · · · + μ_nv_n, define
    f(v, u) = Σ_{i=1}^{n} Σ_{j=1}^{n} λ_i a_ij μ_j.
It is straightforward to check that f has the required properties.
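The construction in the proof is easy to express in code. A minimal Python sketch (the name bilinear_form is ours), working with coordinate vectors relative to the chosen basis:

```python
def bilinear_form(A):
    """Given an n x n matrix A, return f with f(v, u) = sum_i sum_j v_i A[i][j] u_j,
    for vectors v, u given by their coordinates in the basis v_1, ..., v_n."""
    def f(v, u):
        n = len(A)
        return sum(v[i] * A[i][j] * u[j] for i in range(n) for j in range(n))
    return f

A = [[2, 1], [1, 3]]          # a symmetric 2 x 2 matrix
f = bilinear_form(A)
# f(v_i, v_j) = a_ij when v_i, v_j are standard coordinate vectors:
assert f([1, 0], [0, 1]) == 1 and f([0, 1], [0, 1]) == 3
# bilinearity in the first argument, with v = 2 v_1 + 5 v_2:
assert f([2, 5], [1, 1]) == 2 * f([1, 0], [1, 1]) + 5 * f([0, 1], [1, 1])
```

Since A here is symmetric, f([x], [y]) == f([y], [x]) for all coordinate vectors, as the proposition asserts.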
5b Dihedral groups
We look rst at two dimensional space, and examine the group of symmetries
of a regular n-sided polygon. Number its vertices 1, 2, . . . , n, where 1 is
adjacent to 2, 2 adjacent to 3, and so on, and n adjacent to 1. Let O be
the centre of the polygon; the perpendicular bisectors of all the sides, and
the bisectors of all the angles, all pass through O. These lines are all axes
of symmetry of the polygon. Note that n bisectors of sides plus n bisectors
of angles only makes n axes of symmetry altogether, since each is counted
twice. There is a dierence between the case of even n, when the bisector of
an angle coincides with the bisector of the opposite angle, and similarly for
sides, and odd n, where the bisector of an angle is also the bisector of the
opposite side. The cases n = 5 and n = 6 are illustrated.
An axis of symmetry is a line in which the object can be reflected
without being changed. Given a line ℓ in the plane, the reflection in ℓ is the
transformation of the plane which takes each point P to its mirror image,
which is that point P′ such that ℓ is the perpendicular bisector of the line
segment PP′. Expressed as a permutation of the vertices, the reflection in
the line which is the perpendicular bisector of the side joining vertices 1 and n
is
    r = (1, n)(2, n − 1)(3, n − 2) · · ·
      = (1, 2k)(2, 2k − 1) · · · (k, k + 1) if n = 2k is even,
      = (1, 2k + 1)(2, 2k) · · · (k, k + 2) if n = 2k + 1 is odd.
Similarly, the reflection in the bisector of the angle at vertex n is
    s = (1, n − 1)(2, n − 2)(3, n − 3) · · ·
      = (1, 2k − 1)(2, 2k − 2) · · · (k − 1, k + 1) if n = 2k is even,
      = (1, 2k)(2, 2k − 1) · · · (k, k + 1) if n = 2k + 1 is odd.
The product of these reflections is
    rs = (1, 2, 3, . . . , n),
which is a rotation about O through an angle of 2π/n. The powers of rs give
n rotational symmetries of the polygon, which, when combined with the n
reflection symmetries corresponding to the n axes of symmetry mentioned
above, give 2n symmetries altogether. These 2n symmetries constitute the
full symmetry group of the regular n-sided polygon; the group is called the
dihedral group of order 2n.
Note that (rs)^n = i. That is, rsrs · · · rs = i, where there are 2n factors
altogether on the left hand side. Multiplying this equation on the right by the
alternating product srs · · · to n factors gives us the relation rsr · · · = srs · · ·,
where there are n factors on each side. (For example, if n = 3 the relation
(rs)^3 = i gives srs = (rsrsrs)(srs) = rsr.) If 1 ≤ k < n then the two
alternating products of length k, namely rsr · · · and srs · · ·, correspond to
two distinct elements of the dihedral group. There are n − 1 possible values
for k, giving 2(n − 1) elements, and together with the identity and the one
element of length n this accounts for all 2n elements of the group. Thus,
expressed in terms of the reflections r and s, the elements of the dihedral
group of order 2n are
    i, r, s, rs, sr, rsr, srs, rsrs, srsr, . . . . . . , rsr · · · = srs · · ·,
where the expressions are arranged in order of nondecreasing length, the
longest expression having length n.
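The permutations r and s, and the count of 2n elements, can be verified mechanically. A Python sketch (helper names ours) for n = 5 and n = 6:

```python
def compose(p, q):
    """(p o q)(i) = p(q(i)); permutations of {1,...,n} as tuples with p[i-1] = p(i)."""
    return tuple(p[q[i - 1] - 1] for i in range(1, len(p) + 1))

for n in (5, 6):
    # r = (1,n)(2,n-1)...  and  s = (1,n-1)(2,n-2)... with vertex n fixed
    r = tuple(n + 1 - i for i in range(1, n + 1))
    s = tuple(n - i if i != n else n for i in range(1, n + 1))
    rs = compose(r, s)
    assert rs == tuple(list(range(2, n + 1)) + [1])   # the rotation (1,2,...,n)
    # close {identity} under right multiplication by r and s
    G, frontier = set(), {tuple(range(1, n + 1))}
    while frontier:
        G |= frontier
        frontier = {compose(x, g) for x in G for g in (r, s)} - G
    assert len(G) == 2 * n       # the dihedral group of order 2n
```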
If ℓ is a line through the origin O, and if σ is the reflection in ℓ, then
it is easily seen that σ is a linear transformation of the vector space of po-
sition vectors relative to O. Since addition in this space is governed by the
parallelogram law, the following diagram illustrates the fact that σ preserves
addition.
    [Diagram: σ takes the parallelogram with vertices O, u, v and u + v to the
    parallelogram with vertices O, σ(u), σ(v) and σ(u + v) = σ(u) + σ(v).]
A similar kind of diagram can be drawn to illustrate preservation of scalar
multiplication.
Since the reflection symmetries of the regular polygon discussed above
all correspond to lines through the centre of the polygon, if we choose the
centre as our origin of coordinates then the reflections in question are all lin-
ear transformations. Since the product of two linear transformations is again
linear, all the elements of the dihedral group correspond to linear transfor-
mations of the plane. Our aim in the remainder of these notes will be to
classify all finite groups of linear transformations of Euclidean space which
are generated by reflections. The dihedral groups are the simplest examples,
and they are, in several ways, of fundamental importance in the study of
the more complicated examples.
5c Higher dimensions
If σ is a reflection of n-dimensional space then, as in the 2-dimensional case
described above, if P is any point then σ(OP) = OP′, where the mirror
image point P′ has the property that the mirror perpendicularly bisects the
line segment PP′. We have assumed that the origin O lies on the surface
of the mirror, in order to ensure that σ is a linear transformation. In the
2-dimensional case the mirror is a line, in the 3-dimensional case it is a
plane, and in n dimensions it is an (n − 1)-dimensional subspace. This means
that there is a unique direction (or, rather, a unique pair of mutually opposite
directions) perpendicular to the surface of the mirror.
An (n − 1)-dimensional subspace of n-dimensional space is usually called a
hyperplane.
Let H be a hyperplane and a be a nonzero vector perpendicular to H.
Points on H correspond to vectors u which are perpendicular to a. Expressing
this in terms of the dot product, u ∈ H if and only if u · a = 0. Now if P is
the point with position vector OP = a, and if P′ is the mirror image of P,
then OP′ = −OP. That is, σ(a) = −a. If u is the position vector of a point
on the surface of the mirror, then σ(u) = u. We can now easily obtain a
general formula for σ(v) for all v ∈ R^n.
5.4 Proposition Let a be a nonzero vector in R^n and let H be the hy-
perplane of points perpendicular to a. If σ is the reflection in H, then
    σ(v) = v − 2((v · a)/(a · a)) a for all v ∈ R^n.
Proof. Let v ∈ R^n, and let λ = (v · a)/(a · a). Bilinearity of the dot product
gives
    (v − λa) · a = v · a − λ(a · a) = v · a − v · a = 0,
and therefore v − λa ∈ H. Hence σ(v − λa) = v − λa, and it follows that
    σ(v) = σ((v − λa) + λa)
         = σ(v − λa) + λσ(a)
         = (v − λa) + λ(−a)
         = v − 2λa,
as claimed.
Proposition 5.4 was stated in terms of R^n, but it is natural to extend the
terminology to all real inner product spaces.
5.5 Definition If V is an arbitrary real inner product space and if a ∈ V
is nonzero, then the transformation σ_a : V → V defined by
    σ_a(v) = v − 2((v · a)/(a · a)) a
is called the reflection in the hyperplane orthogonal to a.
The proof of the following easy fact is left as an exercise for the reader.
5.6 Proposition If a, b are nonzero elements of the inner product space V,
then σ_a = σ_b if and only if a is a scalar multiple of b. In particular,
σ_{−a} = σ_a.
Recall that an orthogonal transformation of an inner product space is
a linear transformation which preserves the inner product. Geometrically,
this means that distances and angles are preserved, since distance and angle
are dened in terms of the inner product. It is straightforward to show that
reflections are orthogonal transformations.
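Definition 5.5 translates directly into code. A Python sketch (function names ours) of σ_a on R³, checking that it negates a, that it is an involution, and that it preserves the dot product as the next proposition asserts:

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def reflect(a, v):
    """sigma_a(v) = v - 2((v.a)/(a.a)) a, the reflection of Definition 5.5."""
    c = 2 * dot(v, a) / dot(a, a)
    return tuple(vi - c * ai for vi, ai in zip(v, a))

a = (1.0, 2.0, 2.0)
u, v = (3.0, 1.0, 0.0), (-1.0, 4.0, 2.0)
assert reflect(a, a) == (-1.0, -2.0, -2.0)        # sigma_a(a) = -a
assert all(abs(x - y) < 1e-12                      # sigma_a is an involution
           for x, y in zip(reflect(a, reflect(a, v)), v))
assert abs(dot(reflect(a, u), reflect(a, v)) - dot(u, v)) < 1e-12
```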
5.7 Proposition Let V be a real inner product space and 0 ≠ a ∈ V.
Then σ_a(u) · σ_a(v) = u · v, for all u, v ∈ V.
Proof. Given u, v ∈ V, put λ = (v · a)/(a · a) and μ = (u · a)/(a · a). Then
    σ_a(u) · σ_a(v) = (u − 2μa) · (v − 2λa)
        = u · (v − 2λa) − 2μ(a · (v − 2λa))
        = u · v − 2λ(u · a) − 2μ(a · v) + 4λμ(a · a)
        = u · v − 2λ(μ(a · a)) − 2μ(λ(a · a)) + 4λμ(a · a)
        = u · v.
It is also geometrically clear that reflections do, in fact, preserve dis-
tances and angles.
5.8 Proposition Let φ be an orthogonal transformation of the inner prod-
uct space V, and let 0 ≠ a ∈ V. Then φσ_aφ⁻¹ = σ_{φ(a)}.
Proof. Let v ∈ V be arbitrary, and let x = φ⁻¹(v). Then we have
    (φσ_aφ⁻¹)(v) = φ(σ_a(φ⁻¹(v)))
        = φ(σ_a(x))
        = φ(x − 2((x · a)/(a · a)) a)                  (by definition of σ_a)
        = φ(x) − 2((x · a)/(a · a)) φ(a)               (as φ is linear)
        = φ(x) − 2((φ(x) · φ(a))/(φ(a) · φ(a))) φ(a)   (as φ preserves the dot product)
        = v − 2((v · φ(a))/(φ(a) · φ(a))) φ(a)         (as x = φ⁻¹(v) gives v = φ(x))
        = σ_{φ(a)}(v)                                  (by definition of σ_{φ(a)}).
Since φσ_aφ⁻¹ and σ_{φ(a)} are transformations of V which have the same effect
on all elements of V, it follows that they are equal.
Our next task is to investigate the product of two reflections. In the
case of the Euclidean plane, geometrical methods can be used to show that
the product of two reflections is a rotation. Alternatively, we can use linear
algebra to derive the same result, as follows. Let
    a = (x, y)^t = (r cos α, r sin α)^t,
the position vector of an arbitrary point in the plane. Assume that a ≠ 0,
and let σ be the reflection in the hyperplane orthogonal to a. (Of course, in
this situation, a hyperplane is simply a line.) Using the formula in 5.5 we
find that
    σ((1, 0)^t) = (1, 0)^t − 2(((1, 0)^t · a)/(a · a)) a
                = (1, 0)^t − (2r cos α / r^2)(r cos α, r sin α)^t
                = (1 − 2 cos^2 α, −2 cos α sin α)^t
                = (−cos 2α, −sin 2α)^t,
and similarly
    σ((0, 1)^t) = (0, 1)^t − (2r sin α / r^2)(r cos α, r sin α)^t
                = (−2 sin α cos α, 1 − 2 sin^2 α)^t
                = (−sin 2α, cos 2α)^t.
Thus the matrix of the linear transformation σ relative to the standard basis
(1, 0)^t, (0, 1)^t of R^2 is
    ( −cos 2α  −sin 2α )
    ( −sin 2α   cos 2α )
(which is the same as
    ( cos θ   sin θ )
    ( sin θ  −cos θ ),
where θ = 2α + π). Now if σ′ is the reflection corresponding to another
vector a′, then σ′ will have matrix
    ( −cos 2α′  −sin 2α′ )
    ( −sin 2α′   cos 2α′ ),
where α′ is the anticlockwise angle from the x-axis to a′, and the product σσ′
has matrix
    ( −cos 2α  −sin 2α )( −cos 2α′  −sin 2α′ )
    ( −sin 2α   cos 2α )( −sin 2α′   cos 2α′ )
      = ( cos 2α cos 2α′ + sin 2α sin 2α′   cos 2α sin 2α′ − sin 2α cos 2α′ )
        ( sin 2α cos 2α′ − cos 2α sin 2α′   sin 2α sin 2α′ + cos 2α cos 2α′ )
      = ( cos 2(α − α′)  −sin 2(α − α′) )
        ( sin 2(α − α′)   cos 2(α − α′) ),
which is well known to be the matrix of an anticlockwise rotation about the
origin through an angle of 2(α − α′). Note that the angle of rotation is twice
the angle between a and a′.
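The matrix identity above can be confirmed numerically. A Python sketch (with arbitrarily chosen angles) multiplying the two reflection matrices and comparing with the rotation through 2(α − α′):

```python
import math

def refl_matrix(alpha):
    """Matrix of the reflection in the line orthogonal to (cos alpha, sin alpha)."""
    c, s = math.cos(2 * alpha), math.sin(2 * alpha)
    return [[-c, -s], [-s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

alpha, beta = 0.7, 0.2           # arbitrary sample angles
R = matmul(refl_matrix(alpha), refl_matrix(beta))
t = 2 * (alpha - beta)           # expected anticlockwise rotation angle
expected = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
assert all(abs(R[i][j] - expected[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```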
We move now to the general situation, and consider nonzero linearly
independent vectors a and b in an n-dimensional inner product space V.
The vectors perpendicular to both a and b form an (n − 2)-dimensional sub-
space, K, which is the intersection of H_a and H_b, the reflecting hyperplanes
corresponding to σ_a and σ_b. If v ∈ K then v is fixed by both reflections
σ_a and σ_b, and hence also by the product σ_aσ_b. On the other hand, if u is a
vector which is a linear combination of a and b, then σ_a(u) and σ_b(u) are also
linear combinations of a and b. This says that the 2-dimensional subspace
P spanned by a and b is both σ_a-invariant and σ_b-invariant. In fact, P is a
plane, and σ_a and σ_b act on P as reflections in the lines H_a ∩ P and H_b ∩ P
respectively. Hence σ_aσ_b acts on P as a rotation through twice the angle
between a and b.
Note that the subspaces K and P are complementary to each other,
meaning that every element of V is uniquely expressible in the form x + y
with x ∈ K and y ∈ P, and so it follows that the effect of σ_aσ_b on the whole
of V is determined by what it does on the subspaces K and P. Specifically,
    (σ_aσ_b)(x + y) = (σ_aσ_b)(x) + (σ_aσ_b)(y) = x + ρ(y),
where ρ is a rotation of P. The K-component is fixed, the P-component is
rotated.
5.9 Proposition Suppose that a and b are nonzero elements of a real inner
product space V, and suppose that there is a finite group G of transformations
of V which contains both the reflection σ_a and the reflection σ_b. Then the
angle between a and b is (p/q)π for some integers p and q.
Proof. Let P be the subspace spanned by a and b, and θ the angle be-
tween a and b. If P is 1-dimensional then θ is either 0 or π, satisfying the
Proposition. Otherwise P is 2-dimensional, and σ_aσ_b acts on P as a rotation
through 2θ. It follows that (σ_aσ_b)^q acts on P as a rotation through 2qθ. But
since σ_aσ_b ∈ G, which is finite, it follows from 2.28 that (σ_aσ_b)^q = i (the
identity transformation) for some integer q which is a divisor of |G|. So for
this q the angle 2qθ must be an integral multiple of 2π. Thus 2qθ = 2pπ for
some integer p, and the desired conclusion follows.
5.10 Definition We will say that a set S of vectors in an inner product
space V is π-commensurable if all elements of S are nonzero, and for every
pair a, b of elements of S there exist integers p, q such that the angle between
a and b is (p/q)π.
There are some trivial examples of large π-commensurable sets of vec-
tors in n-dimensional space. For example, it is possible to have n mutually
perpendicular vectors: the elements of the standard basis have this property.
In 2-dimensional space it is easy to find a set of 2k vectors such that the angle
between any two of them has the form (r/k)π for some r ∈ Z: just take the
position vectors of the vertices and the midpoints of the edges of a regular
k-sided polygon with centre at the origin. We will call π-commensurable sets
of this type polygonal. Now if V and V′ are two inner product spaces then
there exists another inner product space U, called the orthogonal direct sum
of V and V′, which contains V and V′ as subspaces, in such a way that every
nonzero vector of V is perpendicular to every nonzero vector of V′. Indeed,
each element of U is uniquely expressible as a sum v + v′ with v ∈ V and
v′ ∈ V′, and the inner product of two such elements v_1 + v′_1 and v_2 + v′_2 is
given by
    (v_1 + v′_1) · (v_2 + v′_2) = v_1 · v_2 + v′_1 · v′_2.
Now if S is a π-commensurable set of vectors in V, and S′ a π-commensurable
set of vectors in V′, then the subset S ∪ S′ of U will be π-commensurable
too, since anything in S will make an angle of π/2 with anything in S′.
This enables us to construct π-commensurable sets consisting of pairwise
orthogonal subsets each of which is polygonal. But it is a nontrivial task to
find examples of π-commensurable sets of vectors which are not of this kind.
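The polygonal construction is easy to test. A Python sketch (helper names ours) generating the 2k directions determined by a regular k-gon and checking that every pairwise angle is an integer multiple of π/k:

```python
import math
from itertools import combinations

def polygonal_set(k):
    """Unit vectors in the directions of the vertices and edge midpoints of a
    regular k-gon centred at the origin: the 2k directions j*pi/k."""
    return [(math.cos(j * math.pi / k), math.sin(j * math.pi / k))
            for j in range(2 * k)]

def angle(u, v):
    d = u[0] * v[0] + u[1] * v[1]
    return math.acos(max(-1.0, min(1.0, d)))   # both are unit vectors

k = 5
S = polygonal_set(k)
for u, v in combinations(S, 2):
    m = angle(u, v) * k / math.pi
    assert abs(m - round(m)) < 1e-9   # angle is an integer multiple of pi/k
```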
5d Some examples
Let G be the group of all symmetries of a regular tetrahedron T. A tetrahe-
dron can be described as a pyramid on a triangular base; it has four vertices
and four triangular faces. Regularity means that the triangular faces are equi-
lateral, and all congruent to each other. Every permutation of the vertices
in fact corresponds to a symmetry of T, and so it follows that |G| = 24.
For every pair of vertices of T there is an edge joining the two; thus
there are six edges altogether. For each edge ℓ there is a unique opposite
edge ℓ′, joining the two vertices which are not endpoints of ℓ. The directions
of ℓ and ℓ′ are perpendicular to each other, and ℓ lies in the plane which
perpendicularly bisects ℓ′. The reflection in this plane is a symmetry of the
tetrahedron; it interchanges the two vertices which are the endpoints of ℓ′,
and fixes the two which are the endpoints of ℓ. Thus we see that there are
exactly six reflections in the group G; each edge of T has exactly one plane
of symmetry through it.
To visualize the situation adequately, it helps to embed T in a cube. The
vertices of the cube can be coloured red and blue, so that each edge has one
red endpoint and one blue endpoint. The four red vertices of the cube are the
vertices of T, the four blue ones the vertices of a second regular tetrahedron T′,
which shares the same symmetry group G as T. The diagonals of the faces of
the cube are the edges of the tetrahedra. The cube has twelve edges, which
split into six pairs of opposite edges, and for each pair of opposite edges
there is a plane that includes both the edges, and which passes through O,
the centre of the cube, as well as including one edge of T and one edge of T′.
These are the six planes corresponding to the reflections in G. If P is one of
these planes, then the line through O perpendicular to P bisects another two
edges of the cube. So we see that if a is the position vector of the midpoint
of an edge of the cube, then the plane orthogonal to a is one of the six planes
of symmetry we have described, and so σ_a is an element of G. There are
twelve possible choices for a, corresponding to the twelve edges of the cube
(but only six reflections since σ_{−a} = σ_a). Proposition 5.9 guarantees that
this set of vectors is π-commensurable. In fact, if X and Y are midpoints of
edges of the cube then the angle between OX and OY is 0, π or π/2 if the
two edges are parallel, and π/3 or 2π/3 otherwise.
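The angle claim can be verified by listing the twelve edge midpoints of the cube [−1, 1]³: each has one coordinate 0 and the other two equal to ±1. A Python sketch (names ours):

```python
import math
from itertools import permutations, product, combinations

mids = {tuple(p) for signs in product((-1, 1), repeat=2)
        for p in permutations((0,) + signs)}
assert len(mids) == 12           # the twelve edge midpoints of the cube

def angle(u, v):
    d = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return math.acos(max(-1.0, min(1.0, d / norm)))

angles = {round(angle(u, v) / math.pi, 6)
          for u, v in combinations(sorted(mids), 2)}
# distinct midpoints only ever make the angles pi/3, pi/2, 2pi/3 and pi
assert angles == {round(x, 6) for x in (1/3, 1/2, 2/3, 1.0)}
```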
For our next example, consider the full group of symmetries of a cube.
As well as the six reflections we described above, the cube has another three
reflections. There are three pairs of opposite faces, and the plane which is
parallel to a pair of opposite faces and midway between the two is a plane of
symmetry. If a = OX, where X is the central point of a face of the cube, then
the plane orthogonal to a is one of these planes of symmetry. Six possible
values of a correspond to only three reflections, since σ_{−a} = σ_a. If X is the
midpoint of a face and Y the midpoint of an edge then it is easily checked
that the angle between OX and OY is either π/4, π/2 or 3π/4.
For our final 3-dimensional example, consider a regular dodecahedron,
and let a, b and c be the position vectors of A, B and C, midpoints of
three suitably chosen edges. We can arrange that
    angle(a, b) = 4π/5,   σ_aσ_b has order 5,
    angle(b, c) = 2π/3,   σ_bσ_c has order 3,
    angle(a, c) = π/2,    σ_aσ_c has order 2.
Each of the twelve pentagonal faces has five lines of symmetry bisecting it,
and each such line determines a plane through O, the reflection in which is a
symmetry of the dodecahedron. Each of these planes bisects four faces, and so
we obtain a set H of (5 × 12)/4 = 15 planes corresponding to fifteen reflection
symmetries. Each plane in H passes through a uniquely determined pair of
opposite edges (note that there are thirty edges altogether) and perpendic-
ularly bisects another pair of opposite edges. Furthermore, the line through
O normal to the plane bisects a third pair of opposite edges. In particular
the reflection in the plane orthogonal to the position vector of the midpoint
of an edge is indeed a symmetry of the dodecahedron. The thirty position
vectors of midpoints of edges of the dodecahedron form a π-commensurable
set.
To find suitable points A, B and C, proceed as follows. Let P be the
centre of one of the faces, let Q be the midpoint of one of the edges of this
face and let R be one of the vertices on this edge. The plane containing the
triangle QRO is in the set H, and so its normal through O passes through
points A and A′ which are midpoints of opposite edges. Choose A to be the
one which is on the same side of the plane QRO as the point P. Similarly,
choose B on the same side of PRO as Q with BO normal to PRO, and C
on the same side of PQO as R with CO normal to PQO. It can be shown
that the reflections σ_a, σ_b and σ_c thus determined (that is, the reflections in
the planes QRO, PRO and PQO) generate the full group of symmetries of
the dodecahedron.
(This group is in fact isomorphic to Alt(5) × C₂. As we have seen above,
each plane in H determines three pairs of opposite edges; these three pairs
are mutually perpendicular to each other, and the fifteen pairs of opposite
edges split into five such sets of three. The symmetry group permutes these
five sets and also contains the transformation −I, which fixes each of the sets
by taking each edge to its opposite.)
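The orders quoted above for products of two reflections (for instance, σ_a σ_b of order 5 when angle(a, b) = 4π/5) can be checked numerically: two reflections of the plane whose normal vectors meet at angle θ compose to a rotation through 2θ. A minimal sketch in the plane, with all names ours:

```python
import math

def sigma(a):
    """2x2 matrix of the reflection v -> v - 2(a.v)/(a.a) a."""
    n = a[0]*a[0] + a[1]*a[1]
    return [[1 - 2*a[0]*a[0]/n, -2*a[0]*a[1]/n],
            [-2*a[0]*a[1]/n, 1 - 2*a[1]*a[1]/n]]

def matmul(p, q):
    return [[sum(p[i][k]*q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def order(m, limit=100):
    """Smallest k >= 1 with m^k equal to the identity (up to rounding)."""
    p = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, limit + 1):
        p = matmul(p, m)
        if all(abs(p[i][j] - (i == j)) < 1e-9 for i in range(2) for j in range(2)):
            return k
    return None

def pair_order(angle):
    """Order of sigma_a sigma_b when the angle between a and b is `angle`."""
    a = (1.0, 0.0)
    b = (math.cos(angle), math.sin(angle))
    return order(matmul(sigma(a), sigma(b)))

assert pair_order(4 * math.pi / 5) == 5   # angle 4*pi/5 gives order 5
```

In agreement with the table above, the angles 2π/3 and π/2 give orders 3 and 2 respectively.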
6
Root systems and reflection groups
If G is a finite group of transformations of n-dimensional Euclidean
space, and if S is a set of vectors such that σ_a ∈ G for all a ∈ S, then (as
we proved in Proposition 5.9) the set S is π-commensurable, meaning that
for every pair a, b of elements of S, the angle between a and b is a rational
multiple of π. Not all configurations of angles are feasible, however.
For example, it is impossible to find three points P, Q, R in R³ such that
∠POQ = ∠POR = ∠QOR = 3π/4. Try to do it!

In this chapter we will continue the investigation of sets of vectors in an
n-dimensional inner product space V for which the corresponding reflections
lie in a finite group, and determine precisely which configurations are possible.
This will provide a classification of all finite groups of transformations of V
which are generated by reflections.
6a  Root systems

6.1 Definition  Let V be a real inner product space. A set S ⊆ V is called
a root system in V if
(i) S is finite,
(ii) all elements of S are nonzero,
(iii) S spans V, and
(iv) σ_a(b) ∈ S whenever a, b ∈ S.

Of course, conditions (i) and (iv) of this definition are the crucial ones.
Condition (ii) has to be there, since σ_0 is undefined, but condition (iii) could
reasonably be omitted. If a subset S of V satisfies (i), (ii) and (iv), but
not (iii), then S is not a root system in V, but it is a root system in the
subspace of V that it spans.
If S is a root system then the vectors a ∈ S are called roots.

Let V be an inner product space. Given a set S of nonzero elements
of V, we can form the group G of transformations of V that is generated
by ℛ = { σ_a | a ∈ S }. This group consists of the identity transformation, all
the reflections in ℛ, all transformations which are products of two elements
of ℛ, all that are products of three elements of ℛ, and so on. That is,

    G = { σ_{a_1} σ_{a_2} ⋯ σ_{a_k} | k ∈ Z is nonnegative, and a_i ∈ S for each i }.

Note that we do not need to mention inverses when describing G, since each
σ_a is its own inverse. Note also that the group G is necessarily a subgroup
of O(V), the group of all orthogonal transformations of V, since 5.7 shows
that σ_a ∈ O(V) for all a.
It is very likely that G will be an infinite group. For example, by
Proposition 5.9, if { a ∈ V | σ_a ∈ ℛ } is not π-commensurable, then the
group will certainly be infinite. (If the angle between a and b is not a rational
multiple of π then σ_a and σ_b by themselves generate an infinite group, since
σ_a σ_b has infinite order.) However, the next proposition shows that if S is a
root system then the group is finite.
6.2 Proposition  Let W be a set of bijective linear transformations of a
vector space V, and suppose that S is a finite set of vectors which spans V.
Suppose that S is preserved by all elements of W, in the sense that g(v) ∈ S
whenever g ∈ W and v ∈ S. Then W is a finite set.

Proof. For each g ∈ W define φ_g : S → S by

    φ_g(v) = g(v)    for all v ∈ S.

In other words, φ_g is simply the restriction to S of the transformation g of V.
Note that it is because of our hypothesis that g(v) ∈ S for
all v ∈ S that this definition yields a function from S to S; it is at this point
of the proof that the hypothesis is used.

Let g ∈ W. Since g is injective it follows that φ_g is injective. Since an
injective transformation of a finite set is necessarily surjective, we conclude
that φ_g ∈ Sym(S), the group of all permutations of S. So we can define a
function θ : W → Sym(S) by θ(g) = φ_g for all g ∈ W. We will show that θ
is injective.

Suppose that g, h ∈ W with θ(g) = θ(h). Since one of our hypotheses
is that S is finite, we may write S = {v₁, v₂, ..., v_n}. Since φ_g = φ_h we have

    g(v_i) = φ_g(v_i) = φ_h(v_i) = h(v_i)

for all i from 1 to n. Now since g and h are linear transformations it follows
that

    g(Σ_{i=1}^n λ_i v_i) = Σ_{i=1}^n λ_i g(v_i) = Σ_{i=1}^n λ_i h(v_i) = h(Σ_{i=1}^n λ_i v_i)

for all choices of the scalars λ_i. But one of our hypotheses is that S spans V,
and so for every v ∈ V there exist scalars λ_i with v = Σ_{i=1}^n λ_i v_i. Hence we
can conclude that g(v) = h(v) for all v ∈ V, and since by definition g and h
are transformations of V, it follows that g = h. So we have proved that
θ(g) = θ(h) implies g = h; that is, θ is injective.

Since there is an injective function from W to Sym(S) it follows that
the number of elements of W is less than or equal to the number of elements
of Sym(S). Since the number of permutations of a set of size n is n!, a finite
number, we can conclude that Sym(S) is finite, and hence W is finite, as
required.  □
6.3 Corollary  Let S be a root system in the inner product space V, and
G the subgroup of O(V) generated by { σ_a | a ∈ S }. Then G is finite.

Proof. Since S is finite and spans V, all we need in order to apply Proposition 6.2
is that g(b) ∈ S for all b ∈ S and g ∈ G. But if g ∈ G then
g = σ_{a_1} σ_{a_2} ⋯ σ_{a_k} for some a_i ∈ S, and so

    g(b) = σ_{a_1} σ_{a_2} ⋯ σ_{a_k}(b) = σ_{a_1}(σ_{a_2}(⋯(σ_{a_k}(b))⋯))

for all b ∈ S. An obvious induction using Part (iv) of the definition of a root
system now shows that g(b) ∈ S.  □
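Corollary 6.3 can be illustrated by brute force: starting from the reflections σ_a and repeatedly multiplying until nothing new appears must terminate. A sketch for the eight planar roots (±1, 0), (0, ±1), (±1, ±1) (our choice of example; the matrices have exact rational entries, so set membership is reliable):

```python
from fractions import Fraction

def refl_matrix(a):
    """2x2 matrix of sigma_a, as a tuple of tuples of Fractions."""
    n = a[0]*a[0] + a[1]*a[1]
    return ((Fraction(n - 2*a[0]*a[0], n), Fraction(-2*a[0]*a[1], n)),
            (Fraction(-2*a[0]*a[1], n), Fraction(n - 2*a[1]*a[1], n)))

def mul(p, q):
    return tuple(tuple(sum(p[i][k]*q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def generate(gens):
    """All words in the generators: finite, by Proposition 6.2 / Corollary 6.3."""
    identity = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
    group, frontier = {identity}, {identity}
    while frontier:
        new = {mul(w, g) for w in frontier for g in gens} - group
        group |= new
        frontier = new
    return group

# One root from each +- pair suffices, since sigma_{-a} = sigma_a.
G = generate([refl_matrix(a) for a in [(1, 0), (0, 1), (1, 1), (1, -1)]])
assert len(G) == 8   # the symmetry group of the square
```

No inverses are needed in the closure, exactly as remarked above: each generator is its own inverse.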
Corollary 6.3 shows that for every root system there is a corresponding
finite subgroup of O(V) generated by reflections. In the other direction, the
next proposition shows that every finite subgroup of O(V) gives rise to a root
system.
6.4 Proposition  Let V be an inner product space and G a finite subgroup
of O(V). Then the set

    S = { a ∈ V | a · a = 1 and σ_a ∈ G }

is a root system in the subspace of V that it spans.

Proof. Since G is a finite group it contains only finitely many reflections.
If a, b ∈ S are such that σ_a = σ_b then, by Proposition 5.6, there exists a
scalar λ such that a = λb. But a · a = 1 = b · b (since a, b ∈ S), and so

    1 = a · a = (λb) · (λb) = λ²(b · b) = λ².

Hence a = ±b, and so there are at most two elements of S for each reflection
in G. Hence S is finite.

We have shown that S satisfies condition (i) of Definition 6.1, and so by
the remarks following 6.1 it remains to show that conditions (ii) and (iv) are
also satisfied. Now (ii) is trivial, since if a · a = 1 then a is certainly nonzero.
So it remains to show that if a, b ∈ S then σ_a(b) ∈ S.

Let a, b ∈ S, and let c = σ_a(b). By Proposition 5.7 we know that σ_a is
an orthogonal transformation; so by Proposition 5.8,

    σ_c = σ_{σ_a(b)} = σ_a σ_b σ_a⁻¹,

which is in G since σ_a and σ_b are in G. Furthermore, the fact that σ_a is
orthogonal also tells us that

    c · c = σ_a(b) · σ_a(b) = b · b = 1.

Hence c ∈ S, as required.  □
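Conditions (i), (ii) and (iv) of Definition 6.1 can be verified mechanically for a concrete set of vectors. The sketch below (our example; the eight vectors happen to form the root system usually called B₂, which is not normalized) applies the reflection formula σ_a(b) = b − (2(a·b)/(a·a))a with exact rational arithmetic:

```python
from fractions import Fraction

def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

def reflect(a, b):
    """sigma_a(b) = b - (2(a.b)/(a.a)) a, computed with exact fractions."""
    t = Fraction(2 * dot(a, b), dot(a, a))
    return (b[0] - t*a[0], b[1] - t*a[1])

# Eight roots in the plane: (+-1, 0), (0, +-1) and (+-1, +-1).
S = {(1,0), (-1,0), (0,1), (0,-1), (1,1), (1,-1), (-1,1), (-1,-1)}

assert all(v != (0, 0) for v in S)                      # condition (ii)
assert all(reflect(a, b) in S for a in S for b in S)    # condition (iv)
```

Condition (iii) holds too, since (1, 0) and (0, 1) already span the plane.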
We will say that a root system S is normalized if a · a = 1 for all a ∈ S.
It is easily checked that if S is any root system, then the set

    { ‖a‖⁻¹ a | a ∈ S }

is a normalized root system. Of course, the reflections corresponding to the
vectors in this normalized system are exactly the same as those corresponding
to the vectors in S itself. So as far as groups generated by reflections are
concerned, there is nothing lost by restricting attention to normalized root
systems; hence we shall usually do so.
6b  Positive, negative and simple roots

Let S be a root system in the inner product space V. Observe that if a ∈ S
then, by (iv) of Definition 6.1, σ_a(a) ∈ S. But

    σ_a(a) = a − (2(a · a)/(a · a)) a = −a.

So −a ∈ S whenever a ∈ S. We now propose to divide S into two halves, called
S⁺ and S⁻, in such a way that, for all a ∈ S, if a ∈ S⁺ then −a ∈ S⁻, and
vice versa. We do this in a rather arbitrary fashion, simply choosing some
hyperplane, and declaring the elements of S on one side of it to constitute S⁺,
those on the other side to constitute S⁻. We only need to be sure that none
of the elements of S actually lie on the hyperplane, but this is a very easy
requirement to meet since there are only finitely many vectors in S and an
uncountably infinite supply of hyperplanes available.
6.5 Proposition  Let S be a root system in the inner product space V.
There exists a vector v₀ ∈ V such that a · v₀ ≠ 0 for all a ∈ S.

Proof. Since S is finite, for each v ∈ V the set Q(v) = { a ∈ S | a · v = 0 }
has at most a finite number of elements. Choose v₀ ∈ V so that the number
of elements in Q(v₀) is minimized. We prove that in fact Q(v₀) is empty.

Suppose that Q(v₀) ≠ ∅, and let a ∈ Q(v₀). Since S is a finite set, the
set of numbers

    A = { (b · v₀)/(b · a) | b ∈ S and b · a ≠ 0 }

is also finite. Let λ be any real number which is nonzero and not in the set A,
and define v₁ = v₀ − λa. Then

    a · v₁ = a · (v₀ − λa) = a · v₀ − λ(a · a) = −λ(a · a) ≠ 0

since a ≠ 0 and λ ≠ 0. Hence a ∉ Q(v₁). Furthermore, if b ∈ Q(v₁) then

    0 = b · v₁ = b · (v₀ − λa) = b · v₀ − λ(b · a),

and if b · a were nonzero this would give λ = (b · v₀)/(b · a), contrary to the
definition of λ. So b · a = 0, and hence b · v₀ = b · v₁ = 0. So b ∈ Q(v₀). Thus we
have shown that Q(v₁) is a subset of Q(v₀), and not equal to Q(v₀) since
a ∉ Q(v₁) whereas a ∈ Q(v₀). So Q(v₁) has fewer elements than Q(v₀),
contradicting the definition of v₀. This contradiction shows that Q(v₀) must
be empty.  □
We now choose arbitrarily a v₀ ∈ V such that a · v₀ ≠ 0 for all a in our
root system S, and we keep v₀ fixed for the rest of the discussion. We can
now define S⁺ and S⁻.

6.6 Definition  Let S be a root system, and fix v₀ ∈ V such that a · v₀ ≠ 0
for all a ∈ S. Define

    S⁺ = { a ∈ S | a · v₀ > 0 },

the set of positive roots, and

    S⁻ = { a ∈ S | a · v₀ < 0 },

the set of negative roots.
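As an illustration (our choice of example), take the eight planar roots (±1, 0), (0, ±1), (±1, ±1); any v₀ not orthogonal to a root will do, and v₀ = (2, 1) is one such choice:

```python
def dot(u, v):
    return u[0]*v[0] + u[1]*v[1]

S = [(1,0), (-1,0), (0,1), (0,-1), (1,1), (1,-1), (-1,1), (-1,-1)]
v0 = (2, 1)

assert all(dot(a, v0) != 0 for a in S)       # v0 misses every root's hyperplane
S_plus  = [a for a in S if dot(a, v0) > 0]   # positive roots
S_minus = [a for a in S if dot(a, v0) < 0]   # negative roots

assert sorted(S_plus) == [(0, 1), (1, -1), (1, 0), (1, 1)]
assert sorted(S_minus) == sorted((-x, -y) for (x, y) in S_plus)
```

The last assertion checks the defining feature of the split: a ∈ S⁺ exactly when −a ∈ S⁻.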
The following technical definition will be of great use in our investigations
of root systems.

6.7 Definition  If B = {b₁, b₂, ..., b_n} is a finite set of vectors in the vector
space V, then a vector v ∈ V is said to be a positive linear combination of B
if v ≠ 0 and

    v = λ₁b₁ + λ₂b₂ + ⋯ + λ_n b_n

for some scalars λ_i such that λ_i ≥ 0 for all i. The set of all v ∈ V which are
positive linear combinations of B will be denoted by plc(B).
6.8 Definition  Let S be a root system in V, with sets S⁺ and S⁻ of
positive and negative roots relative to some fixed v₀ ∈ V, as in Definition 6.6.
A subset B ⊆ S⁺ is called a base of S if
(i) B is a basis of V, and
(ii) S⁺ ⊆ plc(B).

The next result is the key to our analysis of root systems.

6.9 Theorem  Let S be a root system in V and S⁺ a set of positive roots
in S. Then S has a base B ⊆ S⁺.
Proof. Let 𝒜 = { B ⊆ S⁺ | S⁺ ⊆ plc(B) }. That is, 𝒜 is the set of all subsets
B of S⁺ with the following property, which we shall call Property (P):

(P)  B ⊆ S⁺ and every a ∈ S⁺ is a positive linear combination of B.

Observe that S⁺ itself has Property (P). For, let S⁺ = {a₁, a₂, ..., a_k};
now for all a ∈ S⁺ we can achieve a = λ₁a₁ + λ₂a₂ + ⋯ + λ_k a_k with λ_i ≥ 0
by putting λ_i = 1 if a_i = a and λ_i = 0 otherwise. Hence 𝒜 ≠ ∅. Now choose
B ∈ 𝒜 with |B| as small as possible. So B has Property (P), but no proper
subset of B has Property (P). We will prove that B is a basis of V.
Step 1. Let B = {b₁, b₂, ..., b_n}. For all i, the vector b_i is not a positive
linear combination of B∖{b_i} = {b₁, b₂, ..., b_{i−1}, b_{i+1}, ..., b_n}.

Proof. Suppose that

(6.9.1)    b_i = μ₁b₁ + ⋯ + μ_{i−1}b_{i−1} + μ_{i+1}b_{i+1} + ⋯ + μ_n b_n

with all the coefficients μ_j ≥ 0. We will prove that B∖{b_i} has Property (P),
contradicting minimality of B.

Let a ∈ S⁺. Since B has (P),

    a = λ₁b₁ + λ₂b₂ + ⋯ + λ_n b_n    for some λ_i ≥ 0.

Substituting into this the value of b_i from Equation (6.9.1) gives

    a = λ₁b₁ + ⋯ + λ_i(μ₁b₁ + ⋯ + μ_{i−1}b_{i−1} + μ_{i+1}b_{i+1} + ⋯ + μ_n b_n) + ⋯ + λ_n b_n,

which is clearly a positive linear combination of B∖{b_i} since all the λ_i and
μ_j are nonnegative. So we have shown that S⁺ ⊆ plc(B∖{b_i}), as required.  □
Step 2. Let B = {b₁, b₂, ..., b_n}. If b_i ≠ b_j, then b_i · b_j ≤ 0.

Proof. Suppose that b_i · b_j > 0. Let a = σ_{b_i}(b_j), which is an element of the
root system S since b_i, b_j ∈ S. Now

    a = b_j − λb_i    where λ = 2(b_i · b_j)/(b_i · b_i) > 0.

If a ∈ S⁺ then a = λ₁b₁ + λ₂b₂ + ⋯ + λ_n b_n for some coefficients λ_i ≥ 0.
So

    b_j − λb_i = λ₁b₁ + λ₂b₂ + ⋯ + λ_n b_n,

and rearranging this gives

(6.9.2)    (1 − λ_j)b_j = λb_i + Σ_{l≠j} λ_l b_l.

Taking the dot product with v₀ we find that

    (1 − λ_j)(b_j · v₀) = λ(b_i · v₀) + Σ_{l≠j} λ_l (b_l · v₀) ≥ λ(b_i · v₀) > 0.

Since b_j · v₀ > 0, it follows that 1 − λ_j > 0. Now by (6.9.2)

    b_j = (λ/(1 − λ_j)) b_i + Σ_{l≠j} (λ_l/(1 − λ_j)) b_l ∈ plc(B∖{b_j}),

contradicting Step 1. Therefore a ∉ S⁺.

Since a ∉ S⁺, it follows that a ∈ S⁻, and so −a ∈ S⁺. So there exist
coefficients μ_i ≥ 0 with −a = μ₁b₁ + μ₂b₂ + ⋯ + μ_n b_n, and this gives

    λb_i − b_j = μ₁b₁ + μ₂b₂ + ⋯ + μ_n b_n,

or, rearranging,

(6.9.3)    (λ − μ_i)b_i = b_j + Σ_{l≠i} μ_l b_l.

Taking the dot product with v₀ we deduce that

    (λ − μ_i)(b_i · v₀) = b_j · v₀ + Σ_{l≠i} μ_l (b_l · v₀) ≥ b_j · v₀ > 0,

and therefore λ − μ_i > 0 (since b_i · v₀ > 0). Now dividing through by
λ − μ_i in (6.9.3) we see that b_i ∈ plc(B∖{b_i}), contradicting Step 1.  □
Step 3. B is a basis of V.

Proof. Let v ∈ V. Now S spans V, since it is a root system in V, and so
there exist scalar coefficients λ_a such that v = Σ_{a∈S} λ_a a. But if a ∈ S then
εa ∈ S⁺ for some sign ε = ±1, and since B has Property (P) it follows that
εa ∈ plc(B). So there exist scalars μ_{ab} (which are nonnegative if a ∈ S⁺ and
nonpositive if a ∈ S⁻) such that a = Σ_{b∈B} μ_{ab} b, and thus

    v = Σ_{b∈B} ( Σ_{a∈S} λ_a μ_{ab} ) b.

So every element of V is a linear combination of the elements of B, and we
have shown that B spans V.

It remains to show that the elements of B are linearly independent. Suppose,
to the contrary, that λ₁b₁ + λ₂b₂ + ⋯ + λ_n b_n = 0 for some scalars λ_i, at
least one of which is nonzero. Now let H be the set of indices i ∈ {1, 2, ..., n}
such that λ_i ≥ 0, and J the set of those such that λ_i < 0 (the rest). So
{1, 2, ..., n} is the disjoint union of H and J, whence

    0 = Σ_{i=1}^n λ_i b_i = Σ_{i∈H} λ_i b_i + Σ_{i∈J} λ_i b_i = Σ_{i∈H} |λ_i| b_i − Σ_{i∈J} |λ_i| b_i,

and it follows that we can define

    v = Σ_{i∈H} |λ_i| b_i = Σ_{i∈J} |λ_i| b_i.

Now v · v₀ = Σ_{i∈H} |λ_i| (b_i · v₀) = Σ_{i∈J} |λ_i| (b_i · v₀); furthermore, b_i · v₀ > 0 for
all i (since b_i ∈ S⁺), whence each term |λ_i| (b_i · v₀) is nonnegative, and strictly
positive if λ_i ≠ 0. Since we have assumed that at least one λ_i is nonzero, we
conclude that at least one term is strictly positive and the others nonnegative,
and hence v · v₀ > 0. In particular, v ≠ 0. Therefore,

    0 < v · v = ( Σ_{i∈H} |λ_i| b_i ) · ( Σ_{j∈J} |λ_j| b_j ) = Σ_{i∈H} Σ_{j∈J} |λ_i| |λ_j| (b_i · b_j) ≤ 0

since b_i · b_j ≤ 0 whenever i ∈ H and j ∈ J, by Step 2.

This contradiction proves that the vectors in B are linearly independent,
and therefore form a basis of V.

Since B is a basis and has Property (P), it is a base for the root system,
and Theorem 6.9 is proved.  □
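For small examples a base can be computed directly. The sketch below uses a standard characterization that we have not proved here (and which does hold for integer-coordinate root systems such as this one): the simple roots are exactly the positive roots that are not sums of two positive roots. We take the positive system {(1, 0), (0, 1), (1, 1), (1, −1)} of the planar root system (±1, 0), (0, ±1), (±1, ±1), with v₀ = (2, 1):

```python
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

S_plus = [(1, 0), (0, 1), (1, 1), (1, -1)]   # a positive system in the plane

def simple_roots(pos):
    """Positive roots that are not a sum of two positive roots."""
    sums = {add(b, c) for b in pos for c in pos}
    return sorted(a for a in pos if a not in sums)

B = simple_roots(S_plus)
assert B == [(0, 1), (1, -1)]   # the base: two simple roots
```

Every positive root is indeed a positive linear combination of B, e.g. (1, 0) = (0, 1) + (1, −1) and (1, 1) = 2(0, 1) + (1, −1); and the angle between the two simple roots is 3π/4, in line with Proposition 6.10 below.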
It can be shown that the base B is in fact uniquely determined by the set
S⁺ of positive roots. The positive roots which lie in this uniquely determined
base are called the simple or fundamental roots of the positive system. Note
that, since B is a basis for V, an arbitrary root a can be expressed as a
linear combination of simple roots (since B spans V), and furthermore the
expression is unique (since B is linearly independent). If a is positive then
all the scalar coefficients in this unique expression will be nonnegative (since
B has Property (P)), while if a is negative then all the coefficients will be
nonpositive (since −a is a positive root in this case).

We know from Proposition 5.9 that the angle between any two roots
in a root system S is necessarily a rational multiple of π, and this applies in
particular to any two simple roots. We also know, in fact, from Step 2 of the
proof of Theorem 6.9, that the angle between any two simple roots must be
obtuse or a right angle (as the cosine of the angle is nonpositive). Our next
result gives even more accurate information.
6.10 Proposition  Let b, b′ ∈ B, where B is a base for the root system S.
Then there exists an integer m ≥ 2 such that the angle between b and b′
is ((m − 1)/m)π.

Proof. The vectors b and b′ span a 2-dimensional subspace of V which can
be identified with the Euclidean plane, and the transformation σ_b σ_{b′} acts on
this plane as a rotation through 2θ, where θ is the angle between b and b′.
Let ρ be this rotation, and let the order of ρ be m. Applying powers of ρ
to b and b′ yields 2m vectors, all of which are in the root system and are
linear combinations of b and b′. Clearly these roots are equally spaced (in
a rotational sense) around the origin, each separated from its two nearest
neighbours by an angle of π/m. That is to say, the unit vectors in the
directions of these roots form a polygonal π-commensurable set, to use the
terminology of Chapter 5.

Let b = a₀, a₁, ..., a_{2m−1} be the 2m roots we have been discussing,
listed in the order they are encountered when circumnavigating the origin,
from b back to b again. Choose the direction of circumnavigation so
that b′ is encountered before getting halfway round. Our aim is to show
that b′ = a_{m−1}.

[Diagram: the 2m roots around the origin, showing b = a₀, the root a_m = −b,
b′ = a_i, and a_{m−1} = (−λ)b + μb′, where λ and μ are positive scalars.]

Suppose, to the contrary, that b′ = a_i for some i such that 1 ≤ i ≤ m − 2.
(Note that b′ ≠ a_m, and b′ ≠ a₀, since b and b′ are linearly independent.)
In this situation it is geometrically clear that when a_{m−1} is expressed as
a linear combination of b and b′, the coefficient of b is negative and the
coefficient of b′ is positive. This is because all the vectors
which are rotationally between a_m = −b and a_i = b′ must be positive linear
combinations of −b and b′, as illustrated in the diagram above (in which λ
and μ are positive scalars). However, we know that in the unique expression
for the root a_{m−1} as a linear combination of simple roots, the coefficients
must either be all nonnegative or all nonpositive. You cannot have mixed
coefficients (some positive, others negative). This contradicts the fact that
a_{m−1} = (−λ)b + μb′ with λ and μ positive. So our assumption that
b′ ≠ a_{m−1} cannot be sustained: b′ has to be a_{m−1}, whence the angle between
b and b′ is ((m − 1)/m)π, as required.  □
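The 2m roots in the proof can be produced mechanically: closing {b, b′} under the two reflections σ_b and σ_{b′} (whose products include all the powers of ρ) yields 2m vectors spaced π/m apart. A numerical sketch, with names and the rounding tolerance ours:

```python
import math

def reflect(a, v):
    """sigma_a(v) = v - 2(a.v)/(a.a) a."""
    t = 2 * (a[0]*v[0] + a[1]*v[1]) / (a[0]*a[0] + a[1]*a[1])
    return (v[0] - t*a[0], v[1] - t*a[1])

def root_count(m):
    """Close {b, b'} under sigma_b and sigma_b', where angle(b, b') = (m-1)pi/m."""
    b = (1.0, 0.0)
    bp = (math.cos((m - 1) * math.pi / m), math.sin((m - 1) * math.pi / m))
    key = lambda v: (round(v[0], 6), round(v[1], 6))   # dedup despite float noise
    found = {key(b), key(bp)}
    frontier = [b, bp]
    while frontier:
        v = frontier.pop()
        for a in (b, bp):
            w = reflect(a, v)
            if key(w) not in found:
                found.add(key(w))
                frontier.append(w)
    return len(found)

assert root_count(4) == 8   # 2m = 8 roots when m = 4
```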
6c  Diagrams

A graph Γ is a set Vert(Γ), whose elements are called the vertices of Γ, with a
binary relation ∼, which in our case will always be assumed to be symmetric
and irreflexive. That is, a ∼ b holds if and only if b ∼ a holds, and a ∼ a
is always false. The pairs {a, b} of vertices such that a ∼ b holds are called
the edges of Γ.

Graphs are most conveniently represented by drawings consisting of
dots joined by lines. There should be one dot for each vertex, and a line
joining the dots corresponding to the vertices a and b if and only if {a, b} is
an edge. Here is a graph with six vertices and seven edges.

[Drawing not reproduced.]

If B is a base for the root system S, then, by Proposition 6.10, for all
a, b ∈ B there exists an integer m_{ab} ≥ 2 such that the angle between a and b
is ((m_{ab} − 1)/m_{ab})π. We now associate to S a graph Γ, called the Coxeter
diagram of the root system, as follows.
(i) Vert(Γ) is in one-to-one correspondence with B.
(ii) If a, b ∈ B are not perpendicular, then the vertices corresponding to a
and b are joined by an edge labelled with the integer m_{ab}.

Note that a, b ∈ B are perpendicular if and only if m_{ab} = 2. The vertices
corresponding to a and b are not joined by an edge if m_{ab} = 2. It is also
customary to omit the label on the edge if m_{ab} = 3. So, for example, if
B = {a, b, c, d}, with

    m_{ac} = m_{ad} = m_{bd} = 2,
    m_{ab} = m_{cd} = 3,
    m_{bc} = 4,

then the associated Coxeter diagram is

    o---o--4--o---o

where, from left to right, the vertices correspond to a, b, c and d. On the
other hand, no root system can give rise to the diagram

        o
      4/ \4
      o---o
        4

since, as we commented at the start of this chapter, it is impossible to arrange
three vectors in Euclidean space so that the angle between any two of them
is (3/4)π.

It is our aim in this section to determine exactly which graphs can occur
as Coxeter diagrams of root systems, for this will essentially classify finite
Euclidean reflection groups.

Suppose that Γ is a graph which resembles a Coxeter diagram, in that
its edges are labelled with integers greater than 2 (unlabelled edges being
understood to have the label 3). Define a function m = m_Γ, to be called the
labelling function, as follows: the domain of m is Vert(Γ) × Vert(Γ), and for
all x, y ∈ Vert(Γ),

    m(x, y) = 1  if x = y,
              2  if x ≠ y and {x, y} is not an edge,
              l  if {x, y} is an edge labelled l.

Suppose that in fact Γ has n vertices; indeed, with no loss of generality, let us
assume that Vert(Γ) = {1, 2, ..., n}, and (for brevity) write m_{ij} for m_Γ(i, j).
Let V_Γ be a vector space with basis v₁, v₂, ..., v_n, and let f_Γ be the bilinear
form defined on V_Γ and having the property that

    f_Γ(v_i, v_j) = cos( ((m_{ij} − 1)/m_{ij}) π )

for all i, j. Note that the existence and uniqueness of f_Γ are guaranteed
by 5.3. Note also that f_Γ(v_i, v_i) = 1 for all i. We will call V_Γ the space
associated with Γ, and f_Γ the form associated with Γ. We will also call
v₁, v₂, ..., v_n the canonical basis of V_Γ.

If this bilinear form is positive definite, it means that one can find
linearly independent vectors b_i in Euclidean space such that the angle between
b_i and b_j is ((m_{ij} − 1)/m_{ij})π (see Proposition 5.1). If the form is not
positive definite then it is impossible to find such vectors in Euclidean space.

6.11 Definition  We say that a diagram Γ is admissible if the form f_Γ
associated with Γ is positive definite, and inadmissible otherwise.
Suppose, for example, that Γ is the diagram o---o--6--o, with the
vertices numbered from left to right, so that m₁₂ = 3, m₁₃ = 2 and m₂₃ = 6.
The vector space V_Γ then has basis v₁, v₂, v₃, and the matrix of the form f_Γ
relative to this basis (that is, the matrix M whose (i, j)-entry is f_Γ(v_i, v_j)) is

    [   1     −1/2      0    ]
    [ −1/2      1     −√3/2  ]
    [   0     −√3/2     1    ]

It turns out that the form f_Γ is not positive definite, since it is possible to find
a nonzero vector v ∈ V_Γ with f_Γ(v, v) = 0. Indeed, let v = v₁ + 2v₂ + √3 v₃.
By the definition of V_Γ, the vectors v₁, v₂, v₃ form a basis, and are therefore
linearly independent. So v ≠ 0. Moreover,

    f_Γ(v, v) = (1, 2, √3) M (1, 2, √3)ᵀ = (0, 0, 0) (1, 2, √3)ᵀ = 0.

Hence this diagram is inadmissible.
In a similar way we can see that the diagram

        o
       / \
      o---o

is also inadmissible. In this case the matrix of the form is

    [   1    −1/2   −1/2 ]
    [ −1/2     1    −1/2 ]
    [ −1/2   −1/2     1  ]

and it can be seen that

    (1, 1, 1) M (1, 1, 1)ᵀ = (0, 0, 0) (1, 1, 1)ᵀ = 0,

where M is the matrix just displayed.
On the other hand, the diagram o---o--4--o is admissible. To prove this
one has simply to find three linearly independent vectors a, b, c in Euclidean
space such that the angle between a and b is 2π/3, the angle between a and c
is π/2, and the angle between b and c is 3π/4. Three suitable vectors are

    a = (0, −√2/2, √2/2),    b = (−√2/2, √2/2, 0),    c = (1, 0, 0).
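Checks like the two above are mechanical: build the matrix with entries cos(((m_{ij} − 1)/m_{ij})π) and test whether it is positive definite, for instance by Sylvester's criterion (all leading principal minors positive). A sketch for string diagrams, with names and the numerical tolerance ours:

```python
import math

def gram(labels):
    """Matrix of f_Gamma for a string diagram with the given consecutive
    edge labels; non-adjacent pairs have m_ij = 2, giving entry 0."""
    n = len(labels) + 1
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, m in enumerate(labels):
        M[i][i+1] = M[i+1][i] = math.cos((m - 1) * math.pi / m)
    return M

def det(M):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def positive_definite(M):
    """Sylvester's criterion, with a small tolerance for rounding error."""
    return all(det([row[:k] for row in M[:k]]) > 1e-9
               for k in range(1, len(M) + 1))

assert not positive_definite(gram([3, 6]))   # o---o--6--o : inadmissible
assert positive_definite(gram([3, 4]))       # o---o--4--o : admissible
```

The inadmissible diagram fails because its 3 × 3 determinant is zero, matching the vector v = v₁ + 2v₂ + √3 v₃ found above.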
Note that the diagram does not have to be connected. For example, two
disjoint diagrams drawn side by side [drawing not reproduced] can perfectly
well be regarded as a single diagram, the space V_Γ being
6-dimensional. In situations like this, if the i-th and j-th vertices of Γ belong
to different components, then m_{ij} = 2; so if v_i and v_j are the corresponding
elements of the canonical basis of V_Γ, then f_Γ(v_i, v_j) = 0. Thus if Γ has k
connected components altogether, the canonical basis splits into k mutually
disjoint subsets such that f_Γ(v_i, v_j) = 0 whenever v_i and v_j are from different
subsets. This yields a decomposition of V_Γ as V_Γ₁ ⊕ V_Γ₂ ⊕ ⋯ ⊕ V_Γₖ, where the
Γ_i are the connected components of Γ, and

    f_Γ(x₁ + x₂ + ⋯ + x_k, x′₁ + x′₂ + ⋯ + x′_k) = f_Γ₁(x₁, x′₁) + ⋯ + f_Γₖ(x_k, x′_k)

whenever x_i, x′_i ∈ V_Γᵢ (since the cross terms, like f_Γ(x₄, x′₂), are all zero).
Now if Γ is admissible, then f_Γ is positive definite, and so

    f_Γ(x₁ + x₂ + ⋯ + x_k, x₁ + x₂ + ⋯ + x_k) > 0

whenever x₁ + x₂ + ⋯ + x_k ≠ 0. In particular, if 0 ≠ x_i ∈ V_Γᵢ, then (putting
x_j = 0 for all j ≠ i) it follows that f_Γᵢ(x_i, x_i) > 0. Hence the component Γ_i is
also an admissible diagram. Conversely, if each Γ_i is admissible then Γ must
be admissible also, since if 0 ≠ x ∈ V_Γ, then x = x₁ + x₂ + ⋯ + x_k for some
x_i ∈ V_Γᵢ, and since x_i ≠ 0 for at least one i, we conclude that

    f_Γ(x₁ + x₂ + ⋯ + x_k, x₁ + x₂ + ⋯ + x_k) = f_Γ₁(x₁, x₁) + ⋯ + f_Γₖ(x_k, x_k)
                                              = Σ_{x_i ≠ 0} f_Γᵢ(x_i, x_i) > 0.

We have now proved the following proposition.

6.12 Proposition  A disconnected diagram is admissible if and only if all
its connected components are admissible.
Let us give another, slightly different, proof that if the connected components
of Γ are admissible then so is Γ itself.

Proof. Observe first that the set of all v′ ∈ R^{n+m} whose last m coordinates
are all zero constitutes an n-dimensional subspace V which is isomorphic
to Rⁿ in an obvious way. In other words, there is a vector space isomorphism
φ : Rⁿ → V which preserves angles between vectors. Similarly, the set of all
w′ ∈ R^{n+m} whose first n coordinates are zero constitutes an m-dimensional
subspace W which is isomorphic to Rᵐ; this gives a vector space isomorphism
ψ : Rᵐ → W which preserves angles. Furthermore, if v′ ∈ V and w′ ∈ W then
v′ · w′ = 0: the spaces V and W are perpendicular to each other. Note also
that R^{n+m} is the direct sum of V and W.

Let Γ₁ and Γ₂ be admissible diagrams, and let Γ be the disconnected
diagram consisting of Γ₁ alongside Γ₂. Let Vert(Γ₁) = {1, 2, ..., n} and
Vert(Γ₂) = {n+1, n+2, ..., n+m}, and let m be the labelling function. Since
Γ₁ is admissible there exists a basis u₁, u₂, ..., u_n of Rⁿ such that the angle
between u_i and u_j is ((m(i,j) − 1)/m(i,j))π for all i, j ∈ {1, 2, ..., n}. Similarly,
since Γ₂ is admissible there exists a basis w_{n+1}, w_{n+2}, ..., w_{n+m} of Rᵐ such that the
angle between w_i and w_j is ((m(i,j) − 1)/m(i,j))π for all i, j ∈ {n+1, n+2, ..., n+m}.

Now define v_i = φ(u_i) for all i ∈ {1, 2, ..., n}, and v_i = ψ(w_i) for all
i ∈ {n+1, n+2, ..., n+m}. If i, j ∈ {1, 2, ..., n} then the angle between
v_i and v_j equals the angle between u_i and u_j; if i, j ∈ {n+1, n+2, ..., n+m}
then the angle between v_i and v_j equals the angle between w_i and w_j; if i ≤ n
and j ≥ n+1 then the angle between v_i and v_j is π/2. Hence the angle
between v_i and v_j is ((m(i,j) − 1)/m(i,j))π in all cases. Since v₁, v₂, ..., v_{n+m} is clearly a
basis of R^{n+m}, this proves that Γ is admissible.  □
Clearly, Proposition 6.12 reduces the problem of classifying admissible
diagrams to the problem of classifying connected admissible diagrams.

Our strategy for obtaining the complete list of admissible diagrams is as
follows. We obtain a long list of inadmissible diagrams, proved inadmissible
by methods similar to those we used above for the triangle and for the
diagram o---o--6--o. The key step is to prove that if Γ is any inadmissible
diagram, then any diagram which (in some sense) is more complicated than Γ is
also inadmissible. It is a straightforward task to describe all diagrams which
are not more complicated than any of the inadmissible diagrams on our list,
and which are therefore the only possible admissible diagrams. Each of these
possibly admissible diagrams is proved to be admissible by explicitly finding
linearly independent vectors b_i in Euclidean space such that the angle
between b_i and b_j is ((m_{ij} − 1)/m_{ij})π.

The definition of "more complicated" is as follows: Γ′ is more complicated
than Γ if Γ′ can be obtained from Γ by adding extra vertices and/or
adding extra edges and/or increasing the labels on some edges.

6.13 Definition  If Γ and Γ′ are diagrams, we say that Γ′ is more complicated
than Γ, and we write Γ′ ≽ Γ, if the following conditions both hold.
(i) If Γ, Γ′ have n, n′ vertices respectively, then n ≤ n′.
(ii) Given a numbering 1, 2, ..., n of the vertices of Γ, there exists a numbering
1, 2, ..., n′ of the vertices of Γ′, such that m_{Γ′}(i, j) ≥ m_Γ(i, j) for
all i, j ∈ {1, 2, ..., n}, where m_{Γ′} and m_Γ are the labelling functions for
Γ′ and Γ (respectively).

For example, if Γ′ is a diagram with edge labels 8, 7 and 6 and Γ is a diagram
with a single edge labelled 4 [drawings not reproduced], then Γ′ ≽ Γ.

Let us prove the key proposition forthwith.
6.14 Proposition  If Γ, Γ′ are diagrams with Γ′ ≽ Γ, and if Γ is inadmissible,
then Γ′ is also inadmissible.

Proof. Let n = |Vert(Γ)| and n′ = |Vert(Γ′)|, and assume that, in accordance
with Definition 6.13,

    m′(i, j) ≥ m(i, j)    for all i, j ∈ {1, 2, ..., n},

where m and m′ are the labelling functions for Γ and Γ′ (for some numberings
of the vertices of these diagrams). Let V and f be the space and form associated
with Γ, so that V has basis v₁, v₂, ..., v_n and f(v_i, v_j) = cos(((m(i,j) − 1)/m(i,j))π).
Similarly, let V′ and f′ be the space and form associated with Γ′, the relevant
basis of V′ being v′₁, v′₂, ..., v′_{n′}.

Let i, j ∈ {1, 2, ..., n} with i ≠ j. Then 2 ≤ m(i, j) ≤ m′(i, j), whence

    π/2 ≥ π/m(i, j) ≥ π/m′(i, j),

and since cos is decreasing on the interval [0, π/2],

    0 ≤ cos(π/m(i, j)) ≤ cos(π/m′(i, j)).

Since cos(π − θ) = −cos θ for all θ,

    0 ≥ cos(((m(i,j) − 1)/m(i,j))π) = −cos(π/m(i,j))
      ≥ −cos(π/m′(i,j)) = cos(((m′(i,j) − 1)/m′(i,j))π),

and so, by the definitions of f and f′,

    0 ≥ f(v_i, v_j) ≥ f′(v′_i, v′_j).

We have shown that this holds for all i, j ∈ {1, 2, ..., n} with i ≠ j.

(Note that a disconnected diagram is more complicated than any of its
connected components; so 6.14 provides another proof that every connected
component is admissible if the whole diagram is.)
Given that Γ is inadmissible, there must exist a nonzero v ∈ V with
f(v, v) ≤ 0. Since v must be expressible as a linear combination of the basis
vectors v₁, v₂, ..., v_n, let λ₁, λ₂, ..., λ_n be scalars such that

    v = Σ_{i=1}^n λ_i v_i = λ₁v₁ + λ₂v₂ + ⋯ + λ_n v_n.

Now define v′ by

    v′ = Σ_{i=1}^n |λ_i| v′_i = |λ₁|v′₁ + |λ₂|v′₂ + ⋯ + |λ_n|v′_n.

(Recall that n′ = dim V′ ≥ n = dim V. The coefficient of v′_i in the expression
for v′ is zero if i > n.) Since at least one λ_i is nonzero and the v′_i are part
of a basis of V′, we have v′ ≠ 0.

Bilinearity of f yields that

    0 ≥ f(v, v) = f( Σ_{i=1}^n λ_i v_i, Σ_{j=1}^n λ_j v_j )
                = Σ_{i=1}^n λ_i² f(v_i, v_i) + Σ_{i≠j} λ_i λ_j f(v_i, v_j).

But f(v_i, v_i) = 1 for all i, and since f(v_i, v_j) ≤ 0 whenever i ≠ j it follows
that λ_i λ_j f(v_i, v_j) ≥ |λ_i| |λ_j| f(v_i, v_j). Moreover, since f(v_i, v_j) ≥ f′(v′_i, v′_j),
we have |λ_i| |λ_j| f(v_i, v_j) ≥ |λ_i| |λ_j| f′(v′_i, v′_j) whenever i ≠ j. Hence

    0 ≥ Σ_{i=1}^n λ_i² + Σ_{i≠j} |λ_i| |λ_j| f(v_i, v_j)
      ≥ Σ_{i=1}^n |λ_i|² f′(v′_i, v′_i) + Σ_{i≠j} |λ_i| |λ_j| f′(v′_i, v′_j).

But this last expression equals f′( Σ_{i=1}^n |λ_i| v′_i, Σ_{j=1}^n |λ_j| v′_j ) = f′(v′, v′),
and we have shown that the nonzero element v′ has the property that
f′(v′, v′) ≤ 0. So f′ is not positive definite, and so Γ′ is inadmissible, as
required.  □
We now need a long list of inadmissible diagrams. We defer the proofs
for the time being, but here is the list.

(i) The simple† circuits

    [circuits of each length: triangle, square, pentagon, and so on;
    drawings not reproduced]

are all inadmissible.

(ii) The following diagrams are all inadmissible.

    [Drawings not reproduced: a diagram with a vertex of valency 4, and
    diagrams with two vertices of valency 3.]

(iii) So are these.

    [Three diagrams, not reproduced; each has an edge labelled 4.]

(iv) And these.

    [Diagrams not reproduced; the surviving edge labels are 4, 4, 4, 4, 4, 4.]

(v) Here are four more inadmissible diagrams.

    [Drawings not reproduced; the surviving edge labels are 6, 5, 5 and 4.]

(vi) Finally, the following three diagrams are also inadmissible.

    [Drawings not reproduced.]

† By "simple" we mean that the edge labels are all equal to three.
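For item (i) the computation is easy to set up: in the Gram matrix of a simple n-circuit each vertex is joined to exactly two others with entry cos(2π/3) = −1/2, so every row sums to 1 − 1/2 − 1/2 = 0, and the all-ones vector v therefore satisfies f_Γ(v, v) = 0. A sketch of the check (function name ours):

```python
def circuit_form_at_ones(n):
    """f_Gamma(v, v) for v = (1, ..., 1) on the simple n-circuit."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                total += 1.0                  # diagonal entries of the form
            elif (i - j) % n in (1, n - 1):
                total += -0.5                 # edges of the circuit
    return total

# The quadratic form vanishes on a nonzero vector, so no circuit is admissible.
assert all(circuit_form_at_ones(n) == 0.0 for n in range(3, 9))
```

All the quantities involved are exactly representable as floats, so the comparison with 0.0 is exact here.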
Accepting that it is true that the diagrams just listed are all inadmissible, let us determine exactly which diagrams might be admissible.

6.15 Theorem  Let Γ be a connected admissible diagram. Then Γ is one of the following types.

Type Aₙ: a string, all edges simple (n vertices, for any n ≥ 1)
Type Bₙ: a string with one end edge labelled 4, all other edges simple (n vertices, for any n ≥ 2)
Type Dₙ: a string of n − 1 vertices, all edges simple, with one extra vertex joined to the second vertex from one end (n vertices, for any n ≥ 4)
Type I₂(p): two vertices joined by an edge labelled p (any p > 4)
Type H₃: a string of three vertices with one end edge labelled 5
Type H₄: a string of four vertices with one end edge labelled 5
Type F₄: a string of four vertices with the middle edge labelled 4
Type E₆: a vertex of valency 3 with simple branches of lengths 1, 2 and 2
Type E₇: a vertex of valency 3 with simple branches of lengths 1, 2 and 3
Type E₈: a vertex of valency 3 with simple branches of lengths 1, 2 and 4
Proof. If Γ has exactly one vertex then it is of type A₁, and if it has exactly two vertices then it is either of type A₂ or B₂, or of type I₂(p) for some p > 4. So we may assume that Γ has at least three vertices.

If Γ has an edge label which is 6 or greater, then Γ is more complicated than the first diagram in (iv) of our list of inadmissible diagrams, and by Proposition 6.14 it follows that Γ is inadmissible, a contradiction. So 3, 4 and 5 are the only labels that occur.

If Γ had a circuit then it would be more complicated than one of the simple circuits in (i) of our list of inadmissible diagrams, and if Γ had a vertex of valency 4 or more, then it would be more complicated than the first diagram in (ii) of our list of inadmissible diagrams. By 6.14, this is impossible. Similarly, it is impossible for Γ to have two or more vertices of valency 3, for otherwise Γ would be more complicated than one of the other diagrams in (ii) of our list of inadmissibles. These facts combine to tell us that Γ is either a string, with various labels 3, 4 or 5 on the edges, or else consists of three branches of various lengths emanating from the only vertex of valency 3, again with variously labelled edges.

The diagrams in (iii) of the inadmissibles list show that Γ cannot have a vertex of valency 3 and an edge labelled 4 or more; so in the three-branch case Γ can have only simple edges (labelled 3). The first diagram in (vi) of the inadmissibles list shows that the three branches cannot all have length two or more; that is, at least one of the branches has length one. If two of the branches have length one then Γ is of type Dₙ for some n, and this is listed as a possibility in the theorem statement. So we can assume that exactly one of the branches has length one. Now if both the other branches had length three or more, then Γ would be more complicated than the second diagram in (vi) of the inadmissibles, which is impossible. So at least one of the other branches has length exactly two. The third branch can have length two, three or four, corresponding to types E₆, E₇ and E₈, but no more than that, or else Γ would be more complicated than the third diagram in (vi). So all the three-branch possibilities are covered.

Suppose, on the other hand, that Γ is a string, so that there are exactly two vertices of valency 1, the rest having valency 2. The diagrams in (iv) show that Γ cannot have two edges labelled 4 or more. That is, there is at most one non-simple edge. If all the edges are simple then Γ is of type Aₙ; so we may assume that there is exactly one non-simple edge. If the label on this edge is 5, then its endpoints cannot both have valency 2, or Γ would be more complicated than the second diagram in (v) of the list of inadmissible diagrams. In other words, if the label is 5 then the non-simple edge is one of the two end edges. And Γ must be either of type H₃ or H₄, since if it had five or more vertices it would be more complicated than the third diagram in (v). So it remains to deal with the cases when the non-simple edge is labelled 4. If the non-simple edge is an end edge then Γ is of type Bₙ. If not, then Γ must be of type F₄, since if it had five or more vertices it would be more complicated than the fourth diagram in (v). So all the string possibilities are covered too.
6d  Existence and inadmissibility proofs

Overlooking the fact that a few proofs have been skipped, and a technicality to be mentioned in a moment, we have now completely classified the finite groups of transformations of Euclidean space which are generated by reflections. For, if G is such a group, it must have a root system, and the root system must have a base, and the base must correspond to an admissible diagram. The technicality is that, for all we have proved so far, several different groups generated by reflections might give the same diagram. In fact, it is not too difficult to prove (although we will not do it) that if G is generated by reflections, then it is also generated by the reflections corresponding to the roots in any base for its root system. This means that the diagram does determine G up to isomorphism. So the classification theorem for Euclidean reflection groups (which we have not quite proved) is as follows.

6.16 Theorem  There is a one-to-one correspondence between isomorphism classes of finite Euclidean reflection groups and diagrams whose connected components come from the list in Theorem 6.15.

We have also yet to prove that all the types listed in Theorem 6.15 are actually admissible, and correspond to finite reflection groups. To show that the diagrams are admissible simply requires finding, in each case, n vectors in Euclidean space with the right configuration of angles. Proving that there is a corresponding finite reflection group is more difficult, and requires constructing the entire root system (so that Proposition 6.2 can be applied). Finally, we have yet to prove the inadmissibility of all those diagrams.

The proofs of inadmissibility all use the same method, and so we will leave most of them as exercises. They depend on the following lemma.
6.17 Lemma  Suppose that f is a bilinear form on a vector space V over the field ℝ, and suppose that v₁, v₂, …, vₙ is a basis of V. If there exist scalars λᵢ ≥ 0 that are not all zero and have the property that Σᵢ₌₁ⁿ λᵢ f(vᵢ, vⱼ) ≤ 0 for all j ∈ {1, 2, …, n}, then f is not positive definite.

Proof. Suppose that there exist such λᵢ, and put v = Σᵢ₌₁ⁿ λᵢvᵢ. Then clearly v ≠ 0, since the λᵢ are not all zero and the vᵢ are linearly independent. However,

    f(v, v) = f(v, Σⱼ₌₁ⁿ λⱼvⱼ) = Σⱼ₌₁ⁿ λⱼ f(v, vⱼ)
            = Σⱼ₌₁ⁿ λⱼ f(Σᵢ₌₁ⁿ λᵢvᵢ, vⱼ) = Σⱼ₌₁ⁿ λⱼ (Σᵢ₌₁ⁿ λᵢ f(vᵢ, vⱼ)) ≤ 0

since λⱼ (Σᵢ₌₁ⁿ λᵢ f(vᵢ, vⱼ)) ≤ 0 for all j. Hence f is not positive definite.
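As a small numerical sanity check of the lemma (our own sketch, not part of the text): for the simple circuit on four vertices, taking every λᵢ = 1 makes each sum Σᵢ λᵢ f(vᵢ, vⱼ) equal to 1 − ½ − ½ = 0 ≤ 0, so the lemma applies and exhibits f(v, v) = 0.

```python
# f(v_i, v_j) for the simple circuit on four vertices: 1 on the diagonal and
# -cos(pi/3) = -1/2 for each of the four edges of the circuit.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
f = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for a, b in edges:
    f[a][b] = f[b][a] = -0.5

lam = [1.0] * n   # the scalars lambda_i of the lemma: nonnegative, not all zero

# hypothesis: sum_i lambda_i f(v_i, v_j) <= 0 for every j (here each sum is 0)
col_sums = [sum(lam[i] * f[i][j] for i in range(n)) for j in range(n)]
assert all(abs(s) < 1e-12 for s in col_sums)

# conclusion: v = sum_i lambda_i v_i is nonzero with f(v, v) <= 0
fvv = sum(lam[j] * col_sums[j] for j in range(n))
assert abs(fvv) < 1e-12
```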
To apply this lemma in practice, one should attempt to find λᵢ such that Σᵢ₌₁ⁿ λᵢ f(vᵢ, vⱼ) = 0 for all but one value of j. Since the values f(vᵢ, vⱼ) are known, this involves solving a system of n − 1 homogeneous linear equations in the n unknowns λᵢ. The solution will probably be unique up to a scalar multiple. Take any nonzero solution and see whether Σᵢ₌₁ⁿ λᵢ f(vᵢ, vⱼ) ≤ 0 for the remaining value of j. It will be, in every case we need.
For example, let Γ be the third diagram in (vi), and let v₁, v₂, …, v₉ be the canonical basis of V_Γ. Choose the numbering so that vertex i is adjacent to vertex i + 1 for all i from 2 to 8, and vertex 1 is adjacent to vertex 4. Then the values of f_Γ on the vectors of the canonical basis are as follows: firstly, f_Γ(vᵢ, vᵢ) = 1 for all i, then

    f_Γ(v₁, v₄) = f_Γ(v₂, v₃) = f_Γ(v₃, v₄) = f_Γ(v₄, v₅) = −½
    f_Γ(v₅, v₆) = f_Γ(v₆, v₇) = f_Γ(v₇, v₈) = f_Γ(v₈, v₉) = −½

and f_Γ(vᵢ, vⱼ) = 0 in all other cases. If we now consider the equations
Σᵢ₌₁⁹ λᵢ f(vᵢ, vⱼ) = 0 for all j ≠ 2, we find the requirements to be

    0 = −½λ₈ + λ₉
    0 = −½λ₇ + λ₈ − ½λ₉
    0 = −½λ₆ + λ₇ − ½λ₈
    0 = −½λ₅ + λ₆ − ½λ₇
    0 = −½λ₄ + λ₅ − ½λ₆
    0 = −½λ₃ − ½λ₁ + λ₄ − ½λ₅
    0 = −½λ₄ + λ₁
    0 = −½λ₂ + λ₃ − ½λ₄

and if we put λ₉ = c we quickly find that λ₈ = 2c, λ₇ = 3c, λ₆ = 4c, λ₅ = 5c, λ₄ = 6c, λ₁ = 3c, λ₃ = 4c and λ₂ = 2c. Now, lo and behold!, we see that Σᵢ₌₁⁹ λᵢ f(vᵢ, v₂) = λ₂ − ½λ₃ = 0. So the conditions of the lemma are satisfied, and the form f_Γ is not positive definite. Indeed, we have found that the nonzero vector

    v = 2v₂ + 4v₃ + 6v₄ + 5v₅ + 4v₆ + 3v₇ + 2v₈ + v₉ + 3v₁

satisfies f_Γ(v, v) = 0.
It actually works just like this in most of the other cases, and we wind up with a nonzero vector v such that f_Γ(v, vⱼ) = 0 for all j (which certainly gives f_Γ(v, v) = 0). Only the diagrams with edge labels of 5 give slightly more complicated calculations.
As for the existence proofs, again we will content ourselves with one example: type E₈. This is, in fact, the most difficult. We start with an orthogonal basis e₁, e₂, …, e₈ of 8-dimensional Euclidean space such that each eᵢ has length 1/√2. For example, in ℝ⁸, we could choose eᵢ to be the 8-tuple whose i-th component is 1/√2 and whose other components are all 0. Define

    S = { ±eᵢ ± eⱼ | i, j ∈ {1, 2, …, 8}, i ≠ j } ∪ { ½Σᵢ₌₁⁸ εᵢeᵢ | each εᵢ = ±1 and Πᵢ₌₁⁸ εᵢ = 1 }.

There are 240 vectors altogether in S, since in the first piece there are 112 (since there are 28 choices for the pair i, j and 4 choices for the signs) and 128 in the other piece (since the first seven signs can be chosen arbitrarily, giving 2⁷ possibilities, and then the last sign is determined uniquely).

It is not hard to check that u · u = 1 and u · v ∈ {−½, 0, ½} for all u, v ∈ S with u ≠ ±v. Thus the angle between two vectors in S is always either π/3, π/2 or 2π/3, and so we have that

    σᵤ(v) = v − u   if the angle is π/3
    σᵤ(v) = v       if the angle is π/2
    σᵤ(v) = v + u   if the angle is 2π/3

It needs to be checked that σᵤ(v) ∈ S in all cases. Again, this is not hard. The point is that once it has been checked for one pair u, v, permuting the coordinates yields many other pairs which need not be checked separately.
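These counting and closure checks are easy to carry out by machine. The following sketch is ours (the names are hypothetical): it enumerates S in coordinates, confirms that |S| = 240, that every u ∈ S has u · u = 1, that u · v always lies in {−1, −½, 0, ½, 1}, and that S is closed under every reflection σᵤ(v) = v − 2(u · v)u.

```python
import itertools, math

r = 1 / math.sqrt(2)   # each basis vector e_i has length 1/sqrt(2)

S = set()
# first piece: the 112 vectors +-e_i +- e_j with i != j
for i, j in itertools.combinations(range(8), 2):
    for s, t in itertools.product([1, -1], repeat=2):
        v = [0.0] * 8
        v[i], v[j] = s * r, t * r
        S.add(tuple(v))
# second piece: the 128 vectors (1/2) sum eps_i e_i with product of signs +1
for signs in itertools.product([1, -1], repeat=7):
    prod = 1
    for s in signs:
        prod *= s          # the last sign is forced by the product condition
    S.add(tuple(0.5 * s * r for s in list(signs) + [prod]))

assert len(S) == 240

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

key = lambda w: tuple(round(x, 9) for x in w)   # tolerate float round-off
S_keys = {key(w) for w in S}
vals = set()
for u in S:
    assert abs(dot(u, u) - 1) < 1e-9            # every root has length 1
    for v in S:
        d = dot(u, v)
        vals.add(round(d, 6))
        w = tuple(x - 2 * d * y for x, y in zip(v, u))   # sigma_u(v) = v - 2(u.v)u
        assert key(w) in S_keys                 # S is closed under every sigma_u

assert vals == {-1.0, -0.5, 0.0, 0.5, 1.0}      # angles 0, pi/3, pi/2, 2pi/3, pi
```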
Define a₁ = ½(e₁ − e₂ − e₃ − e₄ − e₅ − e₆ − e₇ + e₈) and a₂ = e₁ + e₂, and, for 3 ≤ i ≤ 8, define aᵢ = eᵢ₋₁ − eᵢ₋₂. Then if we take B = { aᵢ | 1 ≤ i ≤ 8 } it can be checked that the inner products aᵢ · aⱼ are as they should be for the diagram of type E₈. (That is, aᵢ · aⱼ = cos(2π/3) = −½ if the vertices i and j are adjacent, and aᵢ · aⱼ = 0 for nonadjacent vertices.) It is an interesting fact that when an arbitrary root is expressed as a linear combination of the simple roots aᵢ, all the coefficients turn out to be integers. A root Σᵢ₌₁⁸ λᵢeᵢ ∈ S is positive if the largest i with λᵢ ≠ 0 has λᵢ > 0.
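The inner products of these simple roots can be verified with exact rational arithmetic. The sketch below is ours: each vector is represented by its components with respect to e₁, …, e₈ (so, since eᵢ · eᵢ = ½, the inner product carries a factor ½), and the adjacency set asserted is the one these formulas produce, namely the string 1-3-4-5-6-7-8 with vertex 2 joined to vertex 4.

```python
from fractions import Fraction as F

# Components are taken with respect to e_1, ..., e_8.  Since e_i . e_i = 1/2,
# the inner product of two component-tuples is half the ordinary dot product.
def dot(u, v):
    return F(1, 2) * sum(x * y for x, y in zip(u, v))

a = {}
a[1] = tuple(F(s, 2) for s in (1, -1, -1, -1, -1, -1, -1, 1))
a[2] = (F(1), F(1), F(0), F(0), F(0), F(0), F(0), F(0))
for i in range(3, 9):
    c = [F(0)] * 8
    c[i - 3], c[i - 2] = F(-1), F(1)   # a_i = e_{i-1} - e_{i-2}
    a[i] = tuple(c)

# each a_i is a unit vector, and the Gram matrix realises the E8 diagram
edges = {(1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)}
for i in range(1, 9):
    assert dot(a[i], a[i]) == 1
    for j in range(i + 1, 9):
        expected = F(-1, 2) if (i, j) in edges else F(0)
        assert dot(a[i], a[j]) == expected
```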
It is possible to explicitly describe the linear transformations g in the reflection group G corresponding to this root system as matrices relative to the basis e₁, …, e₈. Firstly, there are the 8! permutation matrices, and the 2⁷ diagonal matrices with diagonal entries ±1 and determinant 1. These generate a group of order 2⁷ · 8! = 5160960 which is a subgroup H of G. (In fact, H is itself a Euclidean reflection group: it is of type D₈.) The idea now is to investigate the cosets of H in G. If v₁, …, v₈ is any orthonormal basis of the space of eight-component row vectors then there are 2⁷ · 8! orthogonal matrices of the form xg where x ∈ H and g is the matrix whose rows are v₁, …, v₈. (These matrices are obtained from g by permuting the rows and multiplying an even number of rows by −1.) We proceed to describe a large number of orthonormal bases which give rise to elements of G.

Let ε₁, ε₂, …, ε₈ be signs εᵢ = ±1 with Πᵢ₌₁⁸ εᵢ = 1, and define

    vᵢⱼ = ¾          if i = j,
    vᵢⱼ = −¼ εᵢεⱼ    if i ≠ j.

Let vᵢ be the row whose jth entry is vᵢⱼ. Then v₁, …, v₈ form an orthonormal basis and give rise to a coset of H in G. In fact this gives 64 cosets, corresponding to the 64 choices for the signs εᵢ. These 64 cosets give us another 64|H| = 330301440 elements of G. (Who would have thought that there would be so many 8 × 8 orthogonal matrices whose entries are all plus or minus three quarters or one quarter!)
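The count of 64 can be confirmed directly. In the sketch below (ours; the off-diagonal entry −¼εᵢεⱼ is our reading of the formula, whose subscripts are garbled in this copy), every admissible sign vector yields a matrix with orthonormal rows, and ε and −ε yield the same matrix.

```python
import itertools

def matrix(eps):
    # v_ij = 3/4 on the diagonal and -(1/4) eps_i eps_j off it
    return tuple(tuple(0.75 if i == j else -0.25 * eps[i] * eps[j]
                       for j in range(8)) for i in range(8))

matrices = set()
for eps in itertools.product([1, -1], repeat=8):
    prod = 1
    for s in eps:
        prod *= s
    if prod != 1:          # keep only sign vectors whose product is +1
        continue
    m = matrix(eps)
    for i in range(8):     # the rows are orthonormal; every entry is dyadic,
        for j in range(8): # so this floating-point arithmetic is exact
            d = sum(m[i][k] * m[j][k] for k in range(8))
            assert d == (1.0 if i == j else 0.0)
    matrices.add(m)

assert len(matrices) == 64   # eps and -eps give the same matrix
```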
Choose a division of the set {1, 2, …, 8} into two subsets J and J′ of four elements each (there are thirty-five ways of doing this) and let δ = ±1. Let J = {j₁, j₂, j₃, j₄} and J′ = {j₅, j₆, j₇, j₈}, and let ξᵢⱼ be the (i, j)-entry of the matrix

    X = [  1 −1 −1 −1   0   0   0   0
           1  1  1 −1   0   0   0   0
           1  1 −1  1   0   0   0   0
           1 −1  1  1   0   0   0   0
           0  0  0  0   δ  −δ  −δ  −δ
           0  0  0  0   δ   δ   δ  −δ
           0  0  0  0   δ   δ  −δ   δ
           0  0  0  0   δ  −δ   δ   δ ]

(so that the rows of ½X are orthonormal). Let vᵢ be the vector whose kth entry is ½ξ_{i jₖ}; then the matrix g whose rows are the vᵢ is in G. Since there were 35 possible partitions of {1, 2, …, 8} as J ∪ J′ and two choices for δ, we have in fact obtained another 70 cosets of H. This gives 70|H| = 361267200 elements of G.

We have now described 1 + 64 + 70 = 135 cosets of H in G, and in fact this is all of them. The group G has 135|H| = 696729600 elements altogether.
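The arithmetic in this count is easy to confirm (a check of ours):

```python
import math

H = 2**7 * math.factorial(8)     # |H|, the subgroup of type D8
assert H == 5160960
assert 64 * H == 330301440       # elements from the 64 extra cosets
assert 70 * H == 361267200       # elements from the 70 extra cosets
assert (1 + 64 + 70) * H == 696729600   # |G| for type E8: 135 cosets in all
```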
Index of notation

⊥ . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
+ . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
× . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Sym(S) . . . . . . . . . . . . . . . . . . . . . . . 7
i . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
· . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
O(V ) . . . . . . . . . . . . . . . . . . . . . . . . 10
[G : H] . . . . . . . . . . . . . . . . . . . . . . . 39
|G| . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Aut(G) . . . . . . . . . . . . . . . . . . . . . . . 57
Z(G) . . . . . . . . . . . . . . . . . . . . . . . . . 62
Inn(G) . . . . . . . . . . . . . . . . . . . . . . . 62
C_G(g) . . . . . . . . . . . . . . . . . . . . . . . 66
σ_a . . . . . . . . . . . . . . . . . . . . . . . . . . 76
plc(B) . . . . . . . . . . . . . . . . . . . . . . . . 89
V_Γ . . . . . . . . . . . . . . . . . . . . . . . . . . 96
f_Γ . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Index
A
Abelian groups . . . . . . . . . . . . . . . . . . . . . . 16
additive group of a field . . . . . . . . . . . 17
admissible diagram. . . . . . . . . . . . . . . . . . 96
angle between two vectors . . . . . . . . . . . 71
automorphism of a group . . . . . . . . . . . . 57
automorphisms . . . . . . . . . . . . . . . . . . . . . . 10
of cyclic groups . . . . . . . . . . . . . . . . 58, 59
of Klein 4-group. . . . . . . . . . . . . . . . . . . 58
axis of symmetry . . . . . . . . . . . . . . . . . . . . 73
B
base . . . . . . . . . . . . . . . . . . . . . . . . . . 89
bilinear form . . . . . . . . . . . . . . . . . . . . 72
C
canonical homomorphism. . . . . . . . . . . . 52
Cantor, Georg . . . . . . . . . . . . . . . . . . . . . . . 27
cardinality. . . . . . . . . . . . . . . . . . . . . . . 26, 29
Cartesian coordinates. . . . . . . . . . . . . . . . 71
Cartesian product . . . . . . . . . . . . . . . . . . . . 4
central quotient . . . . . . . . . . . . . . . . . . . . . 63
centralizer . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
centre . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
of a p-group. . . . . . . . . . . . . . . . . . . . . . . 69
class equation . . . . . . . . . . . . . . . . . . . . . . . 69
classes of a group. . . . . . . . . . . . . . . . . . . . 63
of Sym{1, 2, 3} . . . . . . . . . . . . . . . . 64
of Sym{1, 2, . . . , n} . . . . . . . . . . . . 64
of Sym{1, 2, 3, 4, 5} . . . . . . . . . . . . 65
of the dihedral group of order 8 . . . 66
of GL₃(ℂ) . . . . . . . . . . . . . . . . . . . . 66
closure under an operation. . . . . . . . . . . 21
under inversion . . . . . . . . . . . . . . . . . 9, 22
under multiplication. . . . . . . . . . . . . . . . 8
codomain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
π-commensurable . . . . . . . . . . . . . . . . 80
conjugate elements . . . . . . . . . . . . . . . . . . 63
cosets of a subgroup . . . . . . . . . . . . . . . . . 23
of C_G(g) . . . . . . . . . . . . . . . . . . . . . 67
countable set . . . . . . . . . . . . . . . . . . . . . . . . 27
Coxeter diagram. . . . . . . . . . . . . . . . . . . . . 94
cycle notation . . . . . . . . . . . . . . . . . . . . . . . . 8
cycle type . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
cyclic groups . . . . . . . . . . . . . . . . . . . . . . . . 11
D
dense subset. . . . . . . . . . . . . . . . . . . . . . . . . 27
desmic tetrahedra . . . . . . . . . . . . . . . . . . . 81
determinant homomorphism. . . . . . . . . 40
dihedral group of order 8 . . . . . . . . . . . . . 9
dihedral group of order 2n. . . . . . . . . . . 73
distance between two points . . . . . . . . . 71
domain. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
dot product . . . . . . . . . . . . . . . . . . . . . . . 9, 71
E
endomorphism. . . . . . . . . . . . . . . . . . . . . . . 60
enumerable set . . . . . . . . . . . . . . . . . . . . . . 27
equivalence relation. . . . . . . . . . . . . . . . . . 30
equivalence class . . . . . . . . . . . . . . . . . . . . 31
Euclidean space . . . . . . . . . . . . . . . . . . . . . . 9
even permutation. . . . . . . . . . . . . . . . . . . . 41
F
First Isomorphism Theorem . . . . . . . . . 54
foundations. . . . . . . . . . . . . . . . . . . . . . . . . . . 3
functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
fundamental roots . . . . . . . . . . . . . . . . . . . 92
G
general linear group . . . . . . . . . . . . . . . . . 18
generating a group . . . . . . . . . . . . . . . . . . 11
graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
greatest common divisor . . . . . . . . . . . . . 60
group of transformations. . . . . . . . . . . . . . 9
H
homomorphism. . . . . . . . . . . . . . . . . . 40, 45
injective . . . . . . . . . . . . . . . . . . . . . . . . . . 54
kernel of . . . . . . . . . . . . . . . . . . . . . . . . . . 52
natural . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Sym{1, 2, 3, 4} to Sym{1, 2, 3} . . . 42
Homomorphism Theorem. . . . . . . . . . . . 54
hyperplane . . . . . . . . . . . . . . . . . . . . . . . . . . 75
I
identity permutation. . . . . . . . . . . . . . . . . . 8
image of a homomorphism. . . . . . . . . . . 53
inadmissible diagram. . . . . . . . . . . . . . . . 96
index of a subgroup . . . . . . . . . . . . . . . . . 39
inherited operation . . . . . . . . . . . . . . . . . . 21
injectivity of a homomorphism. . . . . . . 54
inner automorphism. . . . . . . . . . . . . . . . . 61
inner product space. . . . . . . . . . . . . . . . . . . 9
invariant subspace . . . . . . . . . . . . . . . . . . . 79
isomorphic groups . . . . . . . . . . . . . . . . . . . 12
isomorphism . . . . . . . . . . . . . . . . . . . . . . . . 45
K
kernel of a homomorphism . . . . . . . . 52
Klein's four-group . . . . . . . . . . . . . . . 12
L
labelling function . . . . . . . . . . . . . . . . 95
Latin square . . . . . . . . . . . . . . . . . . . . 20
M
map, mapping . . . . . . . . . . . . . . . . . . . . . . . . 4
modulus homomorphism. . . . . . . . . . . . . 41
multiplicative group of a field . . . . . . 17
N
natural homomorphism. . . . . . . . . . . . . . 52
negative roots . . . . . . . . . . . . . . . . . . . . . . . 89
normal subgroup . . . . . . . . . . . . . . . . . . . . 46
normalized root system. . . . . . . . . . . . . . 87
O
odd permutation . . . . . . . . . . . . . . . . . . . . 41
operations as relations . . . . . . . . . . . . . . . . 5
operation on a set . . . . . . . . . . . . . . . . . . . 15
orthogonal direct sum . . . . . . . . . . . . . . . 80
orthogonal transformation . . . . . . . . . . . 10
orthogonal group . . . . . . . . . . . . . . . . . . . . 10
orthonormal basis . . . . . . . . . . . . . . . . . . . 71
P
parallelogram law . . . . . . . . . . . . . . . . . . . 74
parity of a permutation. . . . . . . . . . . . . . 41
permutation . . . . . . . . . . . . . . . . . . . . . . . . . . 7
permutation multiplication . . . . . . . . . . . 7
perpendicularity relation. . . . . . . . . . . . . . 4
π-commensurable . . . . . . . . . . . . . . . . 80
polygonal π-commensurable set . . . . 80
position vectors . . . . . . . . . . . . . . . . . . . . . 71
positive definiteness . . . . . . . . . . . . . . . 9
positive linear combination . . . . . . . . . . 89
positive roots. . . . . . . . . . . . . . . . . . . . . . . . 89
Pythagorean triples. . . . . . . . . . . . . . . . . . . 2
Q
quotient by an equivalence relation . . 32
R
rational number . . . . . . . . . . . . . . . . . . . . . 27
reflection formula . . . . . . . . . . . . . . . . 75
regular polygon. . . . . . . . . . . . . . . . . . . . . . 72
relations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
reflexive . . . . . . . . . . . . . . . . . . . . . . 30
symmetric . . . . . . . . . . . . . . . . . . . . . . . . 30
transitive . . . . . . . . . . . . . . . . . . . . . . . . . 30
root system . . . . . . . . . . . . . . . . . . . . . . . . . 84
rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
rotation matrix. . . . . . . . . . . . . . . . . . . . . . 79
S
sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
sign of a permutation. . . . . . . . . . . . . . . . 41
simple edge . . . . . . . . . . . . . . . . . . . . . . . . 101
simple roots . . . . . . . . . . . . . . . . . . . . . . . . . 92
squares as structured sets . . . . . . . . . . . . . 5
structured sets . . . . . . . . . . . . . . . . . . . . . . . 5
subgroup. . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
normal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
subset multiplication . . . . . . . . . . . . . . . . 45
surjectivity of a homomorphism. . . . . . 54
symmetric group . . . . . . . . . . . . . . . . . . . . . 7
symmetries of a square . . . . . . . . . . . . . . . 8
symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
intuitive denition. . . . . . . . . . . . . . . . . . 4
precise denition . . . . . . . . . . . . . . . . . . . 6
T
tetrahedron . . . . . . . . . . . . . . . . . . . . 81
transformation . . . . . . . . . . . . . . . . . . . 4
relation-preserving . . . . . . . . . . . . . . . . . 6
V
vector spaces, as structured sets . . . . . . 5