
MATH 210A: Modern Algebra I

Lecturer: Professor Richard Taylor


Notes by: Andrew Lin

Autumn 2022

Introduction
MATH 210A is a course that aims to cover the material covered on the graduate qualifying exam. It’s meant to be a
collection of useful techniques, but it won’t culminate in any big theorem, so it’s not a course we take for relaxation –
we should be dedicated to it. This course will focus on mathematical structures (algebraic objects and maps between
them), rather than how to combine their elements. So in some sense, the level of abstraction will be “one level up.”
This class will cover (only) commutative algebra. The first three weeks of the class will discuss rings – constructions
of rings, products and coproducts, quotients, localizations, completions, and polynomial rings. The Noetherian property
and prime ideals will make an appearance, and there will be some discussion of unique factorization domains. The
next week will cover category theory, which abstracts the ideas of structures and maps in the same way that groups
generalize collections of symmetries. After that, the next three weeks will discuss modules over rings (basic properties,
multilinear algebra, Nakayama’s lemma, and the structure of modules over a principal ideal domain and its applications
to linear algebra). The final three weeks will cover an introduction to homological algebra (abelian categories, injectives,
projectives, exact sequences, derived functors, Ext and Tor), and sheaves will be covered as an example (though they
are not officially on the qualifying exam syllabus).
Quadratic and bilinear forms are topics that we're responsible for on the qualifying exam as well, but we
won't have time to cover them in class, so there is a handout on Canvas about this material which we should review
for the exam. (In other words, it’s examinable on the qual but not for this class.) On that note, this class will be
fast-paced, and we’ll have to do the work to keep up.
There won’t be much prerequisite knowledge, but we will assume previous experience with abstract mathematics.
There are lots of good textbooks on this material – Dummit and Foote is relatively elementary compared to what
we’ll cover, but it’s a good one to learn from and sufficient for the qualifying exam. Jacobson’s book is the official
recommended text, and it's a great reference to have. But it's difficult to learn from (at least for this course)
because it develops noncommutative algebra first for the sake of efficiency, and the intuitions are different
in commutative and noncommutative algebra. Finally, previous iterations of this class have used Aluffi’s book, which
follows a similar path of emphasizing categorical underpinnings.
Mathematics has to be learned by doing, so exercises are crucial for the course – we will have weekly assignments,
and the first one is on Canvas and due at the start of class next Monday. Professor Taylor can be reached at
rltaylor@stanford.edu, and the course assistant (grader) for this class is Lie Qian (lqian@stanford.edu). Office hours
will provisionally be set at Thursday 1:30-2:30pm (in person) for Professor Taylor and Tuesday 8-9am (Zoom) and
Friday 4-6pm (in person) for Lie. We’re encouraged to work on homework together but to work on the problems
ourselves before doing so, or else we might be a passive participant and not really realize the difficulties and important
parts of the problems.

The class will also have weekly quizzes – Professor Taylor sees them as better than midterms because we have to
keep on top of the material every week this way. They’ll be open book and online, and we have a 24-hour window on
Thursday to complete them (but we have only 65 minutes to upload our solutions after we download the quiz). Finally,
there will be a traditional closed-book final exam during exam period. (No collaboration is allowed on the quizzes or
final exam.)
We’re encouraged to ask questions during lecture – the point of an in-person instruction is to be interactive, so we
should speak up whenever we have a question or comment. And attending office hours is good too – mathematicians
often get stuck on what turns out to be an easy point in the end. It’s often just about a change of perspective and
can get sorted out if we ask others for help!
For this quarter, the university requires students to wear masks when taking classes for credit (so we should be
aware of that). Finally, anyone who needs accommodations should let Professor Taylor know as soon as possible.

1 September 26, 2022


Many of us have probably seen rings in some sense before, so this will be covered quickly. There will be notes about
rings on Canvas if we want to check our understanding in any way.

Definition 1
A ring is a set R containing two distinguished elements 0, 1 and two binary operations + (addition) and · (multi-
plication), such that (1) (R, +, 0) is an abelian group (addition is associative and commutative, 0 is the identity
element, and every r has an additive inverse −r ), (2) · is associative, commutative, and has identity element 1,
and (3) the distributive property r (s + t) = r s + r t holds.

We will only consider commutative rings with a multiplicative identity, though some textbooks may consider rings
in more generality. Associativity means that brackets don’t matter, so we can write r + s + t or r st without needing
to worry about the order in which operations are performed.

Example 2
The trivial ring {0}, containing only the element 0 (so that 0 = 1), satisfies all of the axioms. More interesting
examples of rings that we may have seen include Z, Q, R, C, and Z/nZ (which will also be denoted Z/(n)).

Example 3
The set of continuous functions C([0, 1]) = {f : [0, 1] → C : f continuous} forms a ring with addition and
multiplication defined pointwise (so that (f + g)(t) = f (t) + g(t) and (f g)(t) = f (t)g(t)). The additive and
multiplicative identities are then the constant 0 and 1 functions, respectively.

Example 4
The set {(a, b) ∈ Z2 : a ≡ b mod 3} with component-wise addition and multiplication also forms a ring.

The following simple facts are good exercises if we don’t see how they work immediately:

• For any r ∈ R, we have −r = (−1) · r .

• For any r ∈ R, 0 · r = 0.

• We only have 0 = 1 if R is the trivial ring.
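
For instance, the second fact follows from distributivity alone: since 0 = 0 + 0, we have 0 · r = (0 + 0) · r = 0 · r + 0 · r , and adding −(0 · r ) to both sides gives 0 · r = 0. The first fact then follows as well: r + (−1) · r = (1 + (−1)) · r = 0 · r = 0, so (−1) · r is the additive inverse of r .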

Definition 5
Even though multiplication is not required to have inverses, we can define the group of units of R

R× = {r ∈ R : ∃s ∈ R with r s = 1}.

(We can check that multiplicative inverses are unique if they exist.)

For example, the group of units of Z is Z× = {±1}, and the group of units of Q is Q× = Q \ {0}. (And the group of units
in C([0, 1]) is the set of nowhere-zero functions.)

Definition 6
An element r ∈ R is nilpotent if r n = 0 for some integer n > 0.

For example, 0 is always nilpotent, and 2^2 = 4 = 0 in Z/4Z, so 2 is nilpotent in Z/4Z.
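
(As a quick computational check – a sketch in Python, an addition not from the notes, using only the standard library – we can list the nilpotents of Z/nZ by brute force:)

    def nilpotents(n):
        # r is nilpotent in Z/nZ iff r^k = 0 mod n for some k; checking k up to n is enough
        return [r for r in range(n) if any(pow(r, k, n) == 0 for k in range(1, n + 1))]

    print(nilpotents(4))   # [0, 2], matching the example above
    print(nilpotents(12))  # [0, 6], since 6^2 = 36 = 0 in Z/12Z

(In general, the nilpotents of Z/nZ are exactly the multiples of the product of the distinct primes dividing n.)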

Definition 7
An element r ∈ R is a zero-divisor if there is some nonzero s ∈ R such that r s = 0.

All nilpotent elements are zero divisors, and in C([0, 1]) the function

f (x) = { x − 1/2 if x ∈ [0, 1/2];  0 if x ∈ [1/2, 1] }

is a zero-divisor because it can be multiplied by the nonzero function

g(x) = { 0 if x ∈ [0, 1/2];  x − 1/2 if x ∈ [1/2, 1] }

to give f g = 0.

Definition 8
A ring is reduced if 0 is the only nilpotent element, an integral domain (ID) if 0 is the only zero divisor, and a
field if R× = R \ {0}.

In particular, any field is an integral domain because all nonzero elements have inverses, and any integral domain
is reduced. For example, Z is an integral domain but not a field, Q is a field, Z/6Z is reduced, and Z/4Z isn’t even
reduced. The trivial ring is a special case – it is not an integral domain or a field because of technicalities with zero
itself, but it is reduced. (And it’s important that the definitions are set up so that this is the case.)
Whenever we have a structure, we’ll want to consider the maps that preserve that structure:

Definition 9
A map φ : R → S is a ring (homo)morphism if it preserves the structure of addition and multiplication – in other
words, φ(0) = 0, φ(1) = 1, and φ(r + s) = φ(r ) + φ(s) and φ(r s) = φ(r )φ(s) for r, s ∈ R.

Some books may not make the requirement that φ(1) = 1 (in which case some “categorical properties” break
down), but it’s the right thing to do when we’re studying commutative algebra. And those conditions above are
enough to guarantee that negatives are preserved as well: φ(−r ) = −φ(r ).

Example 10
The inclusion Z ,→ C and the evaluation map C([0, 1]) → C sending f to f (1/2) are both ring homomorphisms.

Example 11
The map R → {0} sending everything to 0 is a ring homomorphism for any R (“any ring has a unique homomor-
phism to the trivial ring”), but the inclusion {0} ,→ Z is not a ring homomorphism because 1 is not sent to 1. On
the flip side, “there is a unique morphism from Z to any other ring,” since 0 must be sent to 0 and n must be sent
to 1 + 1 + · · · + 1 (added together n times) for any n > 0, while −n must be sent to −(1 + 1 + · · · + 1). (It’s a
tedious exercise to check that this does work.)

Definition 12
A subset R ⊂ S is a subring if 0, 1 ∈ R and R is closed under addition, multiplication, and negatives. (In other
words, R itself should be a ring with respect to those operations.)

Our early discussions of rings will focus on constructing new rings, and the easiest is the following:

Definition 13
For any rings R, S, the product ring R×S = {(r, s) : r ∈ R, s ∈ S} is defined by having addition and multiplication
performed component-wise (so that 0 = (0, 0) and 1 = (1, 1)).

This product ring R × S comes with natural projections π1 : R × S → R (sending (r, s) → r ) and π2 : R × S → S
(sending (r, s) → s), both of which are ring morphisms. But this doesn't work the other way around: the map
Z → Z × Z sending n to (n, 0) is not a ring morphism because it doesn't send 1 to 1. (Other algebraic objects,
like groups, have both "maps in" and "maps out" of a product, but this is not the case for rings.)

Lemma 14
Let T be any ring, and suppose that f : T → R and g : T → S are morphisms of rings. Then there exists a
unique map f × g : T → R × S such that π1 ◦ (f × g) = f and π2 ◦ (f × g) = g.

This is often written in a diagram as shown below:

[Diagram: f : T → R and g : T → S on the outside, with the dotted arrow f × g : T → R × S in the middle and the projections π1 : R × S → R and π2 : R × S → S.]

Basically, the map shown with the dotted arrow must uniquely exist, and the diagram commutes because we get
the same answer no matter which way we follow the arrows around from one point to another. And constructing the
map itself is easy here: we must have (f × g)(t) = (f (t), g(t)), and this does indeed define a morphism. And this
lemma “uniquely characterizes” the product ring R × S – any other ring satisfying this universal property must be
isomorphic to the product ring. But we’ll discuss this fact more next time.
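
For a concrete instance (an illustration, not from lecture): taking T = Z with the reduction maps f : Z → Z/2Z and g : Z → Z/3Z, the map f × g : Z → Z/2Z × Z/3Z sends n to (n + 2Z, n + 3Z), and it is automatically the unique ring morphism compatible with both projections (indeed, by Example 11 it is the unique ring morphism from Z at all).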

2 September 28, 2022


As a reminder, homework should be submitted in class on paper (though we can type up our solutions and print them
out).
Last lecture, we started discussing constructions of rings, starting with the product ring R × S = {(r, s) : r ∈
R, s ∈ S} defined by component-wise addition and multiplication. In particular, we mentioned that such a ring always
comes with projection maps π1 : R × S → R and π2 : R × S → S, and that for any ring T with maps f : T → R and
g : T → S, there is a natural map f × g : T → R × S completing the “commutative diagram” yielding f = π1 ◦ (f × g)
and g = π2 ◦ (f × g).
It’s not too hard to show why this universal property holds, but we’ll see many arguments of this form in the rest
of the course, so we’ll go through the argument here.

Proposition 15
Suppose there were another ring U with morphisms ρ1 : U → R and ρ2 : U → S satisfying the same universal property.
Then U ≅ R × S.

Proof. Apply the universal property of U with T = R × S. Then we have the commutative diagram shown below,
yielding a unique map α : R × S → U.

[Diagram: the universal property of U applied with T = R × S: the projections π1 : R × S → R and π2 : R × S → S on the outside, and the unique dotted map α : R × S → U with ρ1 ◦ α = π1 and ρ2 ◦ α = π2 .]
But we can also swap the roles of R × S and U, applying the universal property of R × S with T = U, to get a
map ρ1 × ρ2 : U → R × S.

[Diagram: the universal property of R × S applied with T = U: the maps ρ1 : U → R and ρ2 : U → S on the outside, and the unique dotted map ρ1 × ρ2 : U → R × S with π1 ◦ (ρ1 × ρ2 ) = ρ1 and π2 ◦ (ρ1 × ρ2 ) = ρ2 .]
Finally, if U appears in both spots, we see that the diagram commutes whether the dotted map is the identity
map (since ρ1 = ρ1 ◦ IdU and so on) or α ◦ (ρ1 × ρ2 ) (by following the arrows in the previous diagrams, we see that
ρ2 = π2 ◦ (ρ1 × ρ2 ) = ρ2 ◦ α ◦ (ρ1 × ρ2 ) so the top loop commutes, and similarly the bottom loop commutes).

[Diagram: U in both spots, with ρ1 : U → R and ρ2 : U → S out of each copy, and the dotted map U → U marked "?".]
Thus IdU = α ◦ (ρ1 × ρ2 ) (because the map must be unique), and similarly we find that IdR×S = (ρ1 × ρ2 ) ◦ α,
which yields an isomorphism between U and R × S.

Two rings can be isomorphic in various ways, and the one we’ve described is in some way “natural:” the isomorphism
α we described above is unique if we require that ρ2 ◦ α = π2 and ρ1 ◦ α = π1 . And later, when we discuss things like
tensor products, we’ll see that this type of argument works more easily than trying to wrestle with the constructions
directly.
Extending our definition of the product, we can take any index set I and construct the product ring ∏i∈I Ri . Then
everything we've said generalizes whether I is finite or infinite.

Definition 16
Suppose R, S, T are rings, and we have ring morphisms φ : R → T and ψ : S → T . The relative product
R ×φ,T,ψ S (often denoted R ×T S) is the set {(r, s) ∈ R × S : φ(r ) = ψ(s)} as a subring of R × S.

This relative product also has a universal property: we still have projection maps π1 : R ×T S → R and π2 :
R ×T S → S defined by restriction. If φ ◦ f = ψ ◦ g in the diagram below, then there exists a unique map
f × g : U → R ×T S which makes the diagram commute.

[Diagram: f : U → R and g : U → S on the outside (with φ ◦ f = ψ ◦ g), the dotted arrow f × g : U → R ×T S, the projections π1 and π2 , and the maps φ : R → T and ψ : S → T .]

Example 17
We mentioned the ring {(a, b) ∈ Z2 : a ≡ b mod 3} last time, which can also be represented as the relative
product Z ×Z/3Z Z.

We’ll next discuss polynomial rings:

Definition 18
Let xi be a set of (commuting) indeterminates (where i ranges over some index set I). We may form monomials
x_{i_1}^{n_1} · · · x_{i_r}^{n_r} as finite products of the xi . Then for any ring R, the polynomial ring R[xi ] is the set of formal finite
sums of elements of R multiplied by monomials, with addition and multiplication defined in the usual way, and the
formal power series ring R[[xi ]] is the same but with infinite formal sums allowed.

There is an embedding R ,→ R[xi ] sending r to r times the trivial monomial, and R[xi ] ,→ R[[xi ]] because any
polynomial is a power series. Also, we can check that (R[x])[y ] = R[x, y ] and so on.

Remark 19. It’s okay to have uncountably many variables in all of these definitions, but we can still define (for example)
multiplication of formal power series because there are only finitely many (pairs of) terms that can contribute to any
particular monomial in the product.

Definition 20
Let f ∈ R[x] be a polynomial. The degree of f is deg(f ) = −∞ if f = 0, and otherwise if f = a0 + a1 x + · · · + ad x^d
with ad ≠ 0, then deg(f ) = d. We call f monic if ad = 1.

Lemma 21
If R is an integral domain, then deg(f g) = deg(f ) + deg(g) for all f , g ∈ R[x].

Proof. If the highest-degree terms of f and g are ad x^d and be x^e , respectively, then ad be ≠ 0 because R is an integral
domain, so ad be x^{d+e} is the highest-degree term of f g.

(For a counterexample when R is not an integral domain, notice that (2x + 1)(3x + 1) = 6x^2 + 5x + 1 = 5x + 1 in Z/6Z[x].)

Lemma 22
If R is an integral domain, so are R[x] and R[xi ]i∈I , and so is R[[xi ]].

(The argument for power series is a bit different from the other ones – we have to look at the lowest power of x
instead of the highest one.)
It turns out that polynomials also have a universal property:

Proposition 23
Suppose f : I → S is any function to a ring S and φ : R → S is a morphism. Then there exists a unique ring
morphism ψ : R[xi ]i∈I → S such that ψ|R = φ and ψ(xi ) = f (i ) for all i ∈ I.

In other words, what’s important about polynomials is that we can substitute in any values in for our variables.

Lemma 24 (Division algorithm)


Let f , g ∈ R[x], and suppose g is monic (or more generally that g’s leading term is a unit in R). Then there exist
unique polynomials q, r ∈ R[x] such that f = qg + r and deg(r ) < deg(g).

(This is proved by the usual long division algorithm, which works the same way as it does for dividing integers.)
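
For a worked instance (an illustration, not from lecture): dividing f = x^3 + 2x + 1 by the monic polynomial g = x^2 + 1 in Z[x] gives

x^3 + 2x + 1 = x · (x^2 + 1) + (x + 1),

so q = x and r = x + 1, with deg(r ) = 1 < 2 = deg(g).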

Definition 25
A subset I of a ring R is an ideal (denoted I ◁ R) if it contains 0, is closed under addition, and for any r ∈ R and
s ∈ I we have r s ∈ I.

Example 26
For any ring R, {0} and R are both ideals (we call an ideal proper if I ̸= R). Also, the set of even integers, which
we can denote (2), is an ideal in Z.

Example 27
If φ : R → S is a morphism and J ◁ S, then the preimage φ−1 J = {r ∈ R : φ(r ) ∈ J} is an ideal in R. In particular,
the kernel ker φ = φ−1 (0) of any morphism is an ideal.

Example 28
The image of an ideal is not necessarily an ideal. For example, take the embedding Z ,→ Q. Then (2) is an ideal
in Z, but it is not an ideal in Q because (1/2) · 2 = 1 ∉ (2). (However, if φ : R → S is surjective and I ◁ R, then
φI ◁ S.)

Definition 29
Let X ⊆ R be a subset. The ideal generated by X, denoted (X), is the set (X) = { ∑_{i=1}^n ri xi : xi ∈ X, ri ∈ R }.
A principal ideal is an ideal generated by only one element.

We can check that this is indeed an ideal of R, and for any ideal I ⊃ X we have I ⊃ (X). And this explains the
notation (2) from Example 26.

3 September 30, 2022


Last lecture, we introduced the concept of ideals, seeing that ideals behave well under inverse images (and also forward
images if we have a surjective map). Today, we’ll start with ways to construct new ideals from existing ones.

Definition 30
Let I, J ◁ R be two ideals. The sum of the ideals is I + J = {r + s : r ∈ I, s ∈ J}, the intersection ideal I ∩ J is
their set intersection, and the product ideal IJ is the set of finite sums { ∑_{i=1}^n ri si : ri ∈ I, si ∈ J }.

We can see that I + J is indeed an ideal by direct checking of the definitions – it’s actually the smallest ideal
containing I ∪ J, because it contains both I and J and any such ideal must be closed under addition so it must include
I + J. Making similar arguments for the others, we can check that IJ ⊂ I ∩ J ⊂ I, J ⊂ I + J .

Example 31
In R = Z, taking I = (6) and J = (10) yields I + J = (2) (because 2 = 2 · 6 − 10, and any element in I + J must
be even), I ∩ J = (30), and IJ = (60).

Remark 32. Notice that 2 = gcd(6, 10) and 30 = lcm(6, 10). As a general heuristic, having I ⊃ J can be thought of
as having “I|J,” I + J as the “gcd” of the ideals, I ∩ J as the “lcm”, and IJ as the “product.” (In the case where we
have principal ideals in Z, we’ve just seen that this does make sense.)

Definition 33
Two ideals I, J are comaximal if I + J = R.

(In the language of integers, this is being “coprime.”) Because R = (1), being comaximal is equivalent to being
able to find r ∈ I and s ∈ J with r + s = 1.

Lemma 34
Any ideal I ◁ Z is of the form (n) for some n. Similarly, if K is a field and I ◁ K[x], then I = (f ) for some polynomial
f ∈ K[x].

So in these special cases, the principal ideals are the only ones.

Proof. For the integer case, we either have I = (0) or some positive integer in I (by closure under multiplication by
−1). Let n be the minimal such integer in I; we claim that I = (n). Indeed, by the division algorithm, we can write
any m ∈ I as m = qn + r with 0 ≤ r < n, but this means that r = m − qn ∈ I (by ideal closure properties) so it must
be 0 by minimality of n, meaning that n|m.
The same proof works for K[x] using the division algorithm for polynomials and choosing nonzero f ∈ I of minimal
degree – importantly, we can choose f to be monic because K is a field and we can multiply by the inverse of the
leading coefficient. (So if g ∈ I and g = qf + r with deg r < deg f , we must have r = 0.)

Example 35
Consider I = (1407, 917) ◁ Z. Since 1407 = 917 + 490, 490 must be in the ideal as well. Then 917 = 490 + 427,
so 427 is in the ideal too, and repeating this process yields that 63, 49, 14, and 7 are in the ideal as well. But
now 7 does generate the ideal, because working backwards we see that every other number we’ve considered is a
multiple of 7 as well, and thus I = (7).

This is the Euclidean algorithm, and we can notice that it also allows us to write 7 as a multiple of 1407 and 917
in a systematic way (by plugging in 14 = 63 − 49, 49 = 427 − 6 · 63, and so on):

7 = 49 − 3 · 14 = 4 · 49 − 3 · 63 = 4 · 427 − 27 · 63 = 31 · 427 − 27 · 490 = 31 · 917 − 58 · 490 = 89 · 917 − 58 · 1407.

(We’ll see that such an expression can be useful for solving certain kinds of problems.)

Lemma 36
A ring R is the trivial ring {0} if and only if it has exactly one ideal (namely R itself), since {0} and R are always
ideals of R. Similarly, R is a field if and only if it has exactly two ideals, since this is equivalent to saying that any
ideal containing (or generated by) a nonzero r ∈ R also contains 1.

Definition 37
For any I ◁ R, the quotient ring is the set of cosets R/I = {r + I : r ∈ R} with addition and multiplication defined
in the usual way – (r + I) + (s + I) = (r + s) + I, and (r + I)(s + I) = r s + I – and the additive and multiplicative
identities 0 + I and 1 + I, respectively.

We can check that this is indeed well-defined, particularly for multiplication: if we chose different representatives
r ′ , s ′ instead of r, s (meaning that r ′ − r, s ′ − s ∈ I), we must check that r ′ s ′ − r s ∈ I. But indeed, r ′ s ′ − r s =
r ′ (s ′ − s) + s(r ′ − r ) is a sum of two products, each with a factor in I, and is thus in I, meaning that r s + I = r ′ s ′ + I.
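
For a small sanity check (an illustration, not from lecture): in Z/(5), the coset 2 + (5) also has representatives 7 and 12, and 7 · 12 = 84 = 4 + 5 · 16, so the product coset is 4 + (5), the same as the one computed from 2 · 2 = 4.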


There is a (surjective) ring morphism π : R → R/I, which we call the quotient map, sending any r to the coset
r + I. And like with the other properties we’ve mentioned, the quotient ring has the following universal property:

Lemma 38
Let I ◁ R, and suppose φ : R → S is a morphism such that φ(I) = {0}. Then there is a unique ring morphism
φ̄ : R/I → S with φ = φ̄ ◦ π.

The diagram that needs to commute is shown below – any ring with this property must be isomorphic to the quotient
ring R/I. (The notation ∃! means "there exists a unique map.")

[Diagram: φ : R → S, the quotient map π : R → R/I, and the unique dotted map ∃!φ̄ : R/I → S.]

In particular, if we take I to be ker φ, we see that any surjective map φ : R → S yields an isomorphism φ̄ :
R/ ker φ → S (this is sometimes called the first isomorphism theorem).

Lemma 39
The ideals of R × S are exactly the ideals of the form I × J, where I ◁ R and J ◁ S, and (R × S)/(I × J) ≅ R/I × S/J
(through the map (r, s) + I × J ↦ (r + I, s + J)).

Proof. It’s easy to see that any such I × J is indeed an ideal from the definition. Now suppose K ◁ R × S and that
K contains some element (r, s). Then K also contains (1, 0)(r, s) = (r, 0) and (0, 1)(r, s) = (0, s), so we can verify
that K is in fact of the form (R ∩ K) × (S ∩ K).

Lemma 40
If I ◁ R, then I[x] = { ∑_{i=0}^d ri x^i : ri ∈ I } is an ideal of R[x]. (But there are many other ideals in R[x] in general.)
We then have R[x]/I[x] ≅ (R/I)[x] (through the map ∑ ri x^i + I[x] ↦ ∑ (ri + I)x^i ).

Lemma 41
If I ◁ R, then there is a bijection between ideals of R/I and ideals of R containing I. In particular, if π : R → R/I
is the quotient map, we can take an ideal J in R/I and get an ideal π −1 J of R containing I, and we can apply the
surjective map π to get an ideal πJ of R/I from an ideal J of R containing I.

We just need to check that these two maps are mutually inverse, which is left as an exercise to us.
In particular, we then find that (R/I)/(πJ) ≅ R/J through the map r + J ↦ (r + I) + πJ. The following type of
argument can be useful in situations like this:

Corollary 42
For any r, s ∈ R, we have
R/(r, s) ≅ (R/(r ))/(s + (r )) ≅ (R/(s))/(r + (s)),

so we can choose the order in which to mod out by elements in R.

We’ll now turn to the Chinese remainder theorem:

Lemma 43
For any ideals I, J ◁ R, we have R/(I ∩ J) ≅ (R/I) ×R/(I+J) (R/J), where the map used in the relative product is
r + (I ∩ J) ↦ (r + I, r + J).

For example, applying this to Example 31 above yields Z/30Z ≅ Z/6Z ×Z/2Z Z/10Z.

Lemma 44
If I and J are comaximal, then I ∩ J = IJ, so R/IJ ≅ R/(I ∩ J) ≅ (R/I) ×R/(I+J) (R/J) ≅ R/I × R/J.

For example, because 5 and 7 are coprime in Z, we find that Z/35Z ≅ Z/5Z × Z/7Z. And the only additional
thing here to prove is that anything in I ∩ J is also in IJ: indeed, for any x ∈ I ∩ J we can write (finding r ∈ I and
s ∈ J so that r + s = 1) x = xr + xs, and both xr and xs are in IJ, so x is also in the product ideal. Generalizing this
to several ideals (and using induction) yields the result:

Proposition 45 (Chinese remainder theorem)


If I1 , · · · , In ◁ R are pairwise comaximal ideals (meaning that Ii + Ij = R for any i ≠ j), then I1 I2 · · · In = I1 ∩ · · · ∩ In ,
and R/(I1 · · · In ) ≅ R/(I1 ∩ · · · ∩ In ) ≅ R/I1 × · · · × R/In .
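
(For a computational illustration of the two-ideal case – a sketch in Python, an addition not from lecture, using only the standard library – we can invert the map Z/35Z → Z/5Z × Z/7Z explicitly using the comaximality witness 3 · 5 − 2 · 7 = 1:)

    def crt_35(a, b):
        # 21 maps to (1, 0) and 15 maps to (0, 1) in Z/5Z x Z/7Z, since 3*5 - 2*7 = 1
        return (21 * a + 15 * b) % 35

    for x in range(35):
        assert crt_35(x % 5, x % 7) == x  # so x -> (x mod 5, x mod 7) is a bijection
    print(crt_35(2, 3))  # 17, the unique x mod 35 with x = 2 mod 5 and x = 3 mod 7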

4 October 3, 2022
Last week, we discussed rings and ideals, particularly looking at the Chinese remainder theorem. Today, we'll continue with two especially important classes of ideals:

Definition 46
An ideal I ◁ R is a maximal ideal if it is not the whole ring (it is proper) and not properly contained in any other
proper ideal.

Notice that I is maximal if and only if R/I is a field (because the ideals of R/I are in correspondence with the
ideals of R containing I, and we have a field if and only if there are exactly two ideals).

Definition 47
An ideal I ◁ R is a prime ideal if it is proper, and whenever r s ∈ I, either r ∈ I or s ∈ I. The set of prime ideals
of R is denoted Spec R.

We can see that I is prime if and only if R/I is an integral domain (because r s ∈ I is the same as r s + I being zero
in R/I). Since any field is an integral domain, this means that any maximal ideal is prime.

Example 48
The prime ideals in Z are the zero ideal (0) and the ideals generated by prime numbers (p). (This explains the
name "prime.") Indeed, every ideal is principal, and if (m) is a nonzero prime ideal where m = r s, then we must
either have r ∈ (m) or s ∈ (m), which can only occur if r = ±m or s = ±m (so the other factor is a unit).

Spec R turns out to be a topological space rather than just a set, but we won’t really go into that here. This set
of prime ideals turns out to behave well under pullbacks:

Lemma 49
If φ : R → S is a ring morphism, and J ◁ S is a prime ideal, then φ−1 J is a prime ideal of R.

Proof. We already know that the pullback of an ideal is an ideal, so we just need to check that it is prime. We know
that φ−1 J is proper, because 1 ∈ φ−1 J would imply that 1 ∈ J, which is not the case (because J is a prime ideal and
thus proper). And if r s ∈ φ−1 J, then φ(r s) ∈ J, meaning that φ(r )φ(s) ∈ J, so either φ(r ) or φ(s) is in J. So indeed
either r or s is in φ−1 J.

This means we have a map φ−1 : Spec S → Spec R for any ring morphism φ : R → S. However, the pullback of a
maximal ideal is not necessarily maximal – the inclusion Z ,→ Q is a ring morphism, but the preimage of (0) is (0), and
(0) is maximal in Q but not in Z. So prime ideals have better functoriality properties than maximal ones, and that’s
why the development of the subject has followed prime ideals.

Lemma 50
The prime ideals of a product ring R × S are of the form I × S and R × J, where I ◁ R and J ◁ S are prime ideals.

(So the ideals in a product are just products of ideals, but the prime ideals are not just products of prime ideals.)

Proof. If K ◁ R × S is a prime ideal, then it must contain 0 = (0, 1) · (1, 0), so it contains either (0, 1) or (1, 0). In the
first case K contains (0, 1) · (R × S) = {0} × S, and in the second it contains R × {0}; we can then check that the other
factor must be a prime ideal.

In other words, we can rephrase this as a “disjoint union”

Spec(R × S) = Spec R ⊔ Spec S.


Remark 51. However, we cannot extend this argument to an infinite product. Consider ∏_{i=1}^∞ Q – the kernel of the
projection onto any factor is a prime ideal, but there are other prime ideals as well. If X is a collection of subsets of
Z>0 (the index set here), define I(X ) = {(ri ) ∈ ∏_{i=1}^∞ Q : {i : ri = 0} ∈ X }. In other words, given a collection of
subsets of the positive integers, we get a subset of the ring. Then we can check what properties X must have for I(X )
to be a prime ideal (exercise, but for the projection πi it would be all sets containing i ), and we can check that such
an X does exist.

Lemma 52
Let I be an ideal of R. Then there is a bijection between prime ideals of R/I and prime ideals of R containing I.

Proof. We already know there is a bijection between the set of all ideals: if π : R → R/I is a quotient map sending
J to πJ, with inverse map sending J to π −1 J, we want to check that prime ideals are preserved. The latter has been
shown in Lemma 49, so we just need to check that if J is prime, then πJ is also prime. But if (r + I)(s + I) ∈ πJ,
then r s + I ∈ πJ is equivalent to saying that r s ∈ J, meaning that either r ∈ J or s ∈ J. Thus either r + I ∈ πJ or
s + I ∈ πJ, as desired.

We’ll now discuss the noetherian property, an important property of rings that holds for most of the rings that
we will encounter.

Lemma 53
The following two properties of a ring R are equivalent:
1. Any ideal of R is finitely generated,

2. If X is any nonempty set of ideals of R, then X has a maximal element I ∈ X (meaning that I is not
properly contained in any other ideal of X , though it is not required to contain every other element of X ).

Proof. First assume (1) but assume (2) is false, so there is some set of ideals X with no maximal element. Take any
I1 ∈ X ; since property (2) is false, I1 is not maximal and there is some I2 ∈ X such that I1 ⊊ I2 . Similarly we can
find I3 ∈ X such that I2 ⊊ I3 ; continuing on, we get an infinite increasing chain of ideals in X . Then the nested union
I = ⋃_{i=1}^∞ Ii is also an ideal (indeed, any element of I is in some In , so multiplying by any other element keeps us in In
and thus in I, and if we have two elements they are both in some sufficiently large In ), so it is finitely generated by
some set of elements (r1 , · · · , rn ). But each ri is in some In , so there is some N such that r1 , · · · , rn ∈ IN and thus
I ⊂ IN (we already have the whole union at step N). This contradicts the fact that IN ⊊ IN+1 , so (2) cannot be false.
On the other hand, suppose (2) is true. For any ideal I ◁ R, let X be the set of finitely generated ideals contained
in I. This is a nonempty set of ideals because it contains (0), so it has a maximal element J0 . We know that J0 ⊂ I,
but if J0 is not all of I then there is some r ∈ I − J0 , and then the ideal (J0 , r ) is still finitely generated (so is in X )
and strictly contains J0 , which contradicts maximality. So we must actually have J0 = I, and thus I itself is finitely
generated.

Definition 54
A ring R is noetherian if the properties in Lemma 53 hold.

(Noether and Abel are two mathematicians whose names are often not capitalized when used in math.)

Example 55
Z and K[x] (for any field K) are noetherian (because any ideal is principal and thus generated by one element), and
so is K for any field (because it only has the two ideals (0) and (1)). However, the polynomial
ring C[x1 , x2 , · · · ] is not noetherian, because the ideal (x1 , x2 , · · · ) is not finitely generated (any finite
set of generators only involves finitely many of the xi s).

Remark 56. Prompted by a question in class, the set of formal Laurent series C((x)) is actually a field, because we've
basically taken C[[x]] and added an inverse 1/x for x, which (up to units) is the only element that isn't invertible.

Lemma 57
If R and S are noetherian, then so is R × S, and if R is noetherian and I ◁ R, then so is R/I.

(Both of these follow by our characterization of ideals of products and quotients.) As a warning, though, relative
products R ×T S do not need to be noetherian even if R, S, and T are all noetherian.

Proposition 58 (Hilbert’s basis theorem)


If R is noetherian, then so is R[x] (and thus a polynomial ring over R in finitely many variables).

Proof. Let I ◁ R[x] be an ideal. To "get back down to R," consider the set of elements

Ld = {r ∈ R : r is the coefficient of x^d of some f ∈ I of degree at most d}.

(In other words, r is either zero or it's a leading coefficient.) This is an ideal of R, because we can add such polynomials
in I together, and we can also multiply f by any s ∈ R to get another polynomial of degree at most d whose x^d
coefficient is r s (or zero, which is also in Ld ). Furthermore, Ld ⊂ Ld+1 , because we can always multiply any polynomial
by x and preserve the "leading coefficient." By the noetherian property for R (applied to the chain of Ld s), we see that
LN = LN+1 = · · · for some sufficiently large N.
By the noetherian property in the other sense, each Ld is finitely generated, so there exist finitely many polynomials
fd,1 , · · · , fd,sd ∈ I of degree at most d whose coefficients of degree d generate Ld . Specifically, for d ≥ N, we can just
take sd = sN and fd,j = fN,j x^{d−N} as a generating set for Ld = LN . Now the ideal J = (fi,j : 0 ≤ i ≤ N, 1 ≤ j ≤ si ) is
a finitely generated ideal. Each fi,j is in I so J ⊂ I, and J contains the fd,j for d > N as well (since those are just the
fN,j s multiplied by a power of x). We now claim that J = I; otherwise, there is some g ∈ I \ J of smallest degree d.
Then removing its leading term by subtracting off some combination ∑j rj fd,j from g gets us another element in I \ J
with smaller degree, which is a contradiction.
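
(As a small illustration of the ideals Ld , not from lecture: for I = (2, x) ◁ Z[x], we have L0 = (2), since the constants in I are the even integers, while Ld = Z for all d ≥ 1, since x^d ∈ I; so the chain stabilizes already at N = 1.)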

5 October 5, 2022
Last lecture, we mentioned two equivalent (and both useful) characterizations for being a noetherian ring, namely
that (1) any ideal is finitely generated and (2) any non-empty collection of ideals has a maximal element. We saw
that products and quotients of noetherian rings are still noetherian, and we also saw (by Hilbert’s basis theorem) that
polynomial rings (in finitely many variables) R[x1 , · · · , xn ] are noetherian if R is noetherian. It turns out that the formal
power series ring R[[x]] is also noetherian – the proof is similar to the one for polynomial rings, but we look at the lowest-degree
terms rather than the highest-degree ones. Here’s a reformulation of Hilbert’s basis theorem:

Corollary 59
Suppose R is noetherian and φ : R → S is a ring morphism. Then if S is finitely generated over R (meaning that
there is a finite subset X ⊂ S, such that S has no proper subring containing im(φ) and X), then S is noetherian.

Proof. If we write X = {X1 , · · · , Xn }, then there is a map ψ : R[x1 , · · · , xn ] → S such that r maps to φ(r ) and
xi ↦ Xi (this is the universal property of the polynomial ring). This map must be surjective in order for X to
generate S over R (otherwise the image would be a proper subring of S), and that means S ≅ R[x1 , · · · , xn ]/ ker ψ.
But R[x1 , · · · , xn ] is noetherian, so the quotient is also noetherian.

It will turn out that many rings that we will meet in “real life” are finitely generated over some simple ring, and
thus being noetherian is a reasonable assumption.
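For example, Z[i] ≅ Z[x]/(x^2 + 1) is finitely generated over Z (by the single element i), so it is noetherian by Corollary 59.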
Our next construction will be rings of fractions, and the example to keep in mind is to get from Z to Q:

Definition 60
A subset D ⊂ R is multiplicative if it contains 1 and is closed under multiplication.

It is easy to check that if φ : R → S is a morphism and D ⊂ R is multiplicative, then φ(D) ⊂ S is multiplicative.


D will be the set of elements that can go into the denominator, and in general the fraction r/a should be the same as
s/b (so br = as). So we'll consider the set R × D, and suppose

(r, a) ∼ (s, b) if c(br − as) = 0 for some c ∈ D.

To make sure this definition makes sense, we need to check that we actually have an equivalence relation. Transitivity
is the only hard part: if (r, a) ∼ (s, b) ∼ (t, c), then d(br − as) = 0 and e(cs − bt) = 0 for some d, e ∈ D, so

ec · d(br − as) + ad · e(cs − bt) = 0 =⇒ ecdbr = adebt =⇒ ebd(cr − at) = 0.

And now we see why we need the factor of c in the definition – we wouldn't have transitivity otherwise. (The c is not
actually necessary if we have an integral domain, though.) And now we can define D−1 R to be the set of equivalence
classes of R × D under ∼, and it's important to note that writing r/a = [(r, a)] does not actually uniquely determine
r and a.

Proposition 61
D−1 R is a ring with 0 = 0/1, 1 = 1/1, and addition and multiplication defined via

r/a + s/b = (br + as)/(ab),   (r/a) · (s/b) = (r s)/(ab)

(which are elements of D−1 R because ab ∈ D). Furthermore, we have a natural ring morphism R → D−1 R with
r ↦ r/1 (though it isn't always injective), and this map turns elements d ∈ D into units (with inverses 1/d).

Proof. Most of the work here is in showing that all of these operations are well-defined. For example, if r/a = r ′/a′ and
s/b = s ′/b′ , we must check that (br + as)/(ab) = (b′ r ′ + a′ s ′ )/(a′ b′ ); indeed, notice that c(a′ r − ar ′ ) = 0 and
d(b′ s − bs ′ ) = 0 imply cd(a′ b′ br + a′ b′ as − abb′ r ′ − aba′ s ′ ) = 0, by pairing up the first and third terms, as well as
the second and fourth terms.
A similar check needs to be done for multiplication, and then we have to check the ring axioms (such as associativity).
Finally, checking that R → D−1 R is a ring morphism is more straightforward.

We’ve mentioned that D is turned into units under the map R → D−1 R, and it turns out this ring D−1 R is the
“cheapest way” to do so (this is a universal property):

Proposition 62
Suppose D ⊂ R is multiplicative, and φ : R → S is a morphism with φ(D) ⊂ S × . Then there is a unique morphism
φ̄ : D−1 R → S such that the diagram below commutes.

[Diagram: φ : R → S, the natural map R → D−1 R, and the unique dotted map φ̄ : D−1 R → S.]

We won't check the proof carefully, but to make sure the diagram commutes, we must define

φ̄(r/a) = φ(r )φ(a)−1 .

Verifying that this is well-defined and indeed a ring morphism is just working a bit with the definitions. And indeed,
if we have any other such ring morphism ψ, then

ψ(r/a)φ(a) = ψ(r/a)ψ(a/1) = ψ(r/1) = φ(r ),

and because a ∈ D we can invert φ(a), so we must have ψ(r/a) = φ(r )φ(a)−1 .

Example 63
If 0 ∈ D, then D−1 R will be the zero ring, because everything in R × D is equivalent. Indeed, we always have
r/a = 0/1 because 0 · (r · 1 − a · 0) = 0.

Example 64
If D is the set of elements of R that are not zero divisors, then D is multiplicative (since if ab were a zero divisor,
then one of a and b would be one as well). In this case, D−1 R is often called the total quotient ring QR. Then
R ,→ QR is injective, because r/1 = s/1 if and only if a(r − s) = 0 for some a ∈ D, meaning r = s because a is not
a zero divisor. For example, QZ = Q.

Lemma 65
If R is an integral domain (so that we may take D = R \ {0}), then QR is always a field.

Proof. For any fraction r/a ≠ 0/1 we have r ≠ 0, so that fraction has inverse a/r . In particular, QR then ends up being
the smallest field containing R.

Example 66
For some explicit calculations, we have Q(Z × Z) = Q × Q and Q(C[x]/(x^2 )) = C[x]/(x^2 ). In the latter case, we
can check that D = {a + bx : a ≠ 0}, which is the group of units of C[x]/(x^2 ), so we don't need to change the
ring at all.

Example 67
For any f ∈ R, {1, f , f^2 , · · · } is multiplicative, so we can define R[1/f ] = Rf = {1, f , f^2 , · · · }−1 R.

We claim that Rf ≅ R[x]/(f x − 1). For one direction, we can construct the morphism Rf → R[x]/(f x − 1) sending
r/f^n to r x^n (and we must check that this is indeed well-defined). But a more natural way to do this is to notice that
we have a map R → R[x]/(f x − 1) such that f^n is a unit for all n (because f^n x^n = (f x)^n = 1). So by the universal
property (and factoring through the map R → Rf ) we also have a map Rf → R[x]/(f x − 1).

[Diagram: R → Rf and R → R[x]/(f x − 1), with the induced dotted map Rf → R[x]/(f x − 1).]

For the other direction, we can use the universal property of polynomial rings: the map R → Rf can be extended
to a map R[x] → Rf by sending x ↦ 1/f , under which f x − 1 goes to zero, so we get a map from the quotient
R[x]/(f x − 1) → Rf by the universal property of quotients.

[Diagram: R[x] → Rf sending x to 1/f , the quotient map R[x] → R[x]/(f x − 1), and the induced dotted map R[x]/(f x − 1) → Rf .]

From here, we must show that these two maps are inverses, and we’re basically doing that by “gluing commutative
diagrams together.” By our work above, the two triangles in the diagram below will commute. So the composition of
the two maps Rf → R[x]/(f x − 1) → Rf is an extension of the map R → Rf , but the identity map is the unique such
map which does this, so our two morphisms do indeed compose to the identity.
[Diagram: the composite Rf → R[x]/(f x − 1) → Rf , with both maps lying over R; the composite is marked id.]
For the other direction, we can construct the following diagram, where the first map in the middle row sends x to 1/f
(and hence f x − 1 to 0). Looking at the bottom half of the diagram, the universal property of R[x] tells us that there
is a unique map R[x] → R[x]/(f x − 1), namely the quotient map. Thus that map must factor through the quotient
R[x]/(f x − 1) (on the top half of the diagram), meaning the dashed map (the composition R[x]/(f x − 1) → Rf →
R[x]/(f x − 1)) must again be the identity map.

[Diagram: the middle row R[x] → Rf → R[x]/(f x − 1), with the quotient map R[x] → R[x]/(f x − 1) on top and the dashed composite R[x]/(f x − 1) → Rf → R[x]/(f x − 1) marked id.]

Example 68
The ring Z[1/n] is the set of rational numbers whose denominator is a power of n, and R[[x]][1/x] = R[[x]]x is
the ring of formal Laurent series over R (containing elements ∑_{n≥N} an x^n where an ∈ R and N ∈ Z is some finite
"lowest power").

6 October 7, 2022
We constructed rings of fractions last time – for any multiplicative subset D ⊂ R of a ring R (which contains 1
and is closed under multiplication), we can define D−1 R to be the set of equivalence classes [r/d] = [(r, d)] under the
equivalence relation (r, d) ∼ (s, e) if a(r e − sd) = 0 for some a ∈ D. In particular, when D is the set of all elements
that are not zero divisors, we get the ring QR. (And the map R → D−1 R is injective as long as D contains no zero
divisors.) After that, we defined Rf = R[1/f ] to be the ring of fractions defined by D = {1, f , f^2 , · · · } – this is known
as a localization.
We’ll start today with some further constructions:

Example 69
If D ⊂ R and E ⊂ S are multiplicative subsets, then D × E ⊂ R × S is also multiplicative (by checking the
definition), and we get (D × E)−1 (R × S) ≅ D−1 R × E −1 S. In other words, the map (r, s)/(d, e) ↦ (r/d, s/e) is
well-defined and produces an isomorphism.

Example 70
Let φ : R → S be a ring morphism, and let D ⊂ R be multiplicative. Then φD ⊂ S is multiplicative, so we can
form (φD)−1 S, which is sometimes just abbreviated to D−1 S. Then we have the map D−1 R → D−1 S sending r/d
to φ(r )/φ(d). So then by the universal property of rings of fractions, any map R → S → D−1 S will factor through
that map D−1 R → D−1 S.

Next (this can be checked from the universal properties for rings of fractions and polynomial rings), if we have a
polynomial ring,

(D−1 R)[x] ≅ D−1 (R[x]).

(More directly, the isomorphism here sends ∑ (ri /di ) x^i to a fraction with the single denominator d0 d1 · · · dn , whose
numerator is a polynomial over R.) On the other hand, if we consider formal power series, we can consider Z[[x]][1/2]
(in which we invert the elements 1, 2, 4, 8, and so on), or we can consider Z[1/2][[x]]. But the former contains only
things of the form (power series)/2^n , while the latter contains elements like 1 + x/2 + x^2/4 + x^3/8 + · · · . (In
particular, we haven't described any universal property for the power series ring yet.)

Definition 71
Suppose p ◁ R is a prime ideal. Then R − p is a multiplicative set (if there were two things in R − p whose
product was not also in it, that would contradict the primeness of p). The localization of R at p is defined to
be Rp = (R − p)−1 R.

Example 72
For any prime ideal (p) in Z for a prime number p, Z(p) inverts any integer not divisible by p, yielding the set
of reduced rational numbers with no powers of p in the denominator. (This can be kind of confusing, because
we defined Zp = Z[1/p], and the notation also looks like the p-adic numbers. So in general we should be careful about
the difference between Rf and R(f ) , which invert completely different elements.) On the other hand, Z(0) = Q.
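For instance (an illustration, not from lecture), 3/4 ∈ Z(5) since its denominator is prime to 5, but 3/4 ∉ Z(2) , while 4/3 ∈ Z(2) .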

Example 73
We have C[X](0) = C(X), the field of rational functions of X. The localization C[X](X) is then the set of
(reduced) rational functions p(X)/q(X) ∈ C(X) for which X ∤ q(X), or equivalently q(0) ≠ 0; in other words,

C[X](X) = {f ∈ C(X) : f has no pole at 0}.

This explains the term "localization" – this operation makes us care about the functions that make sense localized
near zero.

Example 74
For any prime ideal p ◁ S, we know that R × p ◁ R × S is a prime ideal, so we can define

(R × S)R×p = (R × S − R × p)−1 (R × S) = (R × (S − p))−1 (R × S) = R−1 R × (S − p)−1 S,

and the first term in the product here is the zero ring so we just get back Sp (and R doesn’t do very much).

In particular, R × p often contains zero divisors, but localizing by it doesn’t give us the zero ring – it just throws
away some factors. (Containing zero is what immediately collapses the ring of fractions.)

Example 75
Let D ⊂ R be multiplicative, and let I ◁ R be an ideal. Then D−1 I = { r/a : r ∈ I, a ∈ D } ◁ D−1 R is an ideal.
We say that I is saturated with respect to D if for any r ∈ R and a ∈ D, having ar ∈ I implies that we already
had r ∈ I.

Lemma 76
If I is saturated with respect to D and r/a ∈ D−1 I, then r ∈ I.

(This is not obvious – r/a ∈ D−1 I only means that some representative equivalent to it has numerator in I and
denominator in D – and saturation is required.)

Proof. Having r/a ∈ D−1 I means that r/a = s/b with s ∈ I and b ∈ D. This means that c(br − as) = 0 for some
c ∈ D. Since cas ∈ I, we must have cbr ∈ I, where cb ∈ D, so applying saturation to the element cb we see that
r ∈ I.

Proposition 77
Let φ : R → D−1 R be the natural map.
1. For any ideal J ◁ D−1 R, the ideal φ−1 J ◁ R is saturated with respect to D, and D−1 (φ−1 J) = J.

2. For any I ◁ R, φ−1 (D−1 I) is the smallest ideal containing I which is saturated with respect to D, and
D−1 (φ−1 (D−1 I)) = D−1 I.
Thus there is a bijection between ideals of D−1 R and ideals of R that are saturated with respect to D, sending J
to φ−1 J (and with inverse sending I to D−1 I). Additionally, if D̄ is the image of D in R/I (which is multiplicative),
there is an isomorphism D̄−1 (R/I) ≅ D−1 R/D−1 I, sending (r + I)/(d + I) to r/d + D−1 I.

Proof sketch. To prove that φ−1 J is saturated, suppose r ∈ R, a ∈ D, and ar/1 ∈ J. Then (1/a) · (ar/1) = r/1 ∈ J,
meaning that r ∈ φ−1 J. Meanwhile, D−1 φ−1 J is the set of fractions r/a such that a ∈ D and r/1 ∈ J. Any element
r/a of J is in that set because r/a ∈ J implies r/1 = (a/1) · (r/a) ∈ J, and for the other inclusion, r/a = (1/a) · (r/1).
(2) can be proved similarly and is left as an exercise to us. Finally, to show the isomorphism, our map D̄−1 (R/I) →
D−1 R/D−1 I must send D̄ to units and kill I, so we start with the map R → D−1 R/D−1 I sending r to r/1 + D−1 I.
This map sends I to zero, so by the universal property of the quotient our map factors through to a map R/I →
D−1 R/D−1 I. Now any element d + I with d ∈ D is sent to d/1 + D−1 I, which has an inverse 1/d + D−1 I, so D̄ is
indeed sent to units, and thus by the universal property we get a map D̄−1 (R/I) → D−1 R/D−1 I, sending (r + I)/(d + I)
(for any r ∈ R, d ∈ D) to

(r/1 + D−1 I)(d/1 + D−1 I)−1 = (r/1 + D−1 I)(1/d + D−1 I) = r/d + D−1 I.

So this map is indeed the one we wrote down in the statement.
To produce the map in the other direction, we first construct a map D−1 R → D̄−1 (R/I), which we can get by first
constructing the map R → D̄−1 (R/I) sending r to (r + I)/(1 + I). Now anything in D gets sent to a unit, so this map
factors through to a map D−1 R → D̄−1 (R/I). And then any element r/d ∈ D−1 I (with r ∈ I) is sent to zero, so we
get a map D−1 R/D−1 I → D̄−1 (R/I) sending r/d + D−1 I to ((r + I)/(1 + I)) · ((d + I)/(1 + I))−1 . Since d + I is in
D̄, we can simplify this to ((r + I)/(1 + I)) · ((1 + I)/(d + I)) = (r + I)/(d + I), showing that we do indeed have an
inverse.

Corollary 78
If R is noetherian, then any ring of fractions D−1 R is noetherian, because any nonempty set of ideals in D−1 R
corresponds to a set of ideals in R and must have a maximal element.

Corollary 79
A prime ideal p ◁ R is saturated with respect to D if and only if p ∩ D = ∅. So there is a bijection between
prime ideals of D−1 R and prime ideals of R not intersecting D, given by the same maps J 7→ φ−1 J, I 7→ D−1 I
for the forward and backwards directions. (We just need to check that we actually preserve primeness.) One
interpretation of this is that Spec(D−1 R) ⊂ Spec(R).

Finally, Spec(Rp ) is the set of prime ideals I ∈ Spec R such that I ⊂ p, so localization throws away prime ideals
except those contained in p.
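For example (an illustration, not from lecture), Spec(Z(p) ) corresponds to the two prime ideals (0) ⊂ (p) of Z contained in (p).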

Remark 80. There are some notes on completion that we should read to complete the problem set; they’re not on
the syllabus for the qualifying exam but they are for this class.

7 October 10, 2022


(As a logistical note, we should start submitting our homeworks on Gradescope now that there is a grader for the
course helping out Lie.)
We’ll discuss the last of our ring constructions today, the tensor product. The approach we’ll take here is not the
standard one – usually this is developed as a tensor product of modules, but there’s some advantage in doing it for
rings first. Recall that if we have two rings S and T with maps into R, then we formed the relative product S ×R T
with the following universal property:

[Diagram: S ×R T with its projections to S and T , and the maps S → R and T → R.]

It makes sense to ask whether there is an analogous construction when we reverse the arrows as shown below, forming a pushout diagram.

[Diagram: R with maps to S and T , and both S and T mapping to an unknown ring "?".]

The answer is yes, and it’s what gives rise to the tensor product:

Definition 81
Suppose that we have rings R, S, T with maps φ : R → S and ψ : R → T . The tensor product S ⊗R T is

S ⊗R T = R[xs , yt ]s∈S,t∈T / (x_{s1+s2} − x_{s1} − x_{s2} , y_{t1+t2} − y_{t1} − y_{t2} , x_{s1 s2} − x_{s1} x_{s2} , y_{t1 t2} − y_{t1} y_{t2} , x_{φ(r)} − r, y_{ψ(r)} − r ),

where these relations span over all s1 , s2 ∈ S, t1 , t2 ∈ T , and r ∈ R. (This definition does depend on φ and ψ, but
those are usually omitted in the notation.)

In other words, we form a polynomial ring in a huge number of variables (one variable for each element of S and one
for each element of T ) and then mod out by the constraints required to make s ↦ xs and t ↦ yt into ring morphisms
compatible with φ and ψ. Then we can form the following commutative diagram, which commutes because r is sent
to (the class of) r along both paths:

[Diagram: φ : R → S and ψ : R → T on the left, with s ↦ xs : S → S ⊗R T and t ↦ yt : T → S ⊗R T on the right.]

Unfortunately, this concrete definition is hard to work with directly, so we'll often appeal to other arguments. But
first, we'll set up the notation that s ⊗ t = xs yt (these elements are called pure tensors) – notice that every element
of S ⊗R T is a finite linear combination of pure tensors, because any polynomial is a finite sum of things of the form
r · ∏ x_{si}^{ni} ∏ y_{tj}^{mj} , and the relations let us rewrite such a monomial as r (s ⊗ t) with s = ∏ si^{ni} and t = ∏ tj^{mj} .
But it's usually not possible to write down a map from S ⊗R T just by deciding where its pure tensors go, because
it's hard to know whether we actually have relations like s1 ⊗ t1 + s2 ⊗ t2 + s3 ⊗ t3 = 0 (there can be additional
relations between the elements that are not immediately apparent).
Next, we find that

(s1 + s2 ) ⊗ t = s1 ⊗ t + s2 ⊗ t, s ⊗ (t1 + t2 ) = s ⊗ t1 + s ⊗ t2

because of the first two relations multiplied by yt and xs respectively, and similarly

(s1 ⊗ t1 )(s2 ⊗ t2 ) = s1 s2 ⊗ t1 t2

because of the next two relations, and then for any r ∈ R, we have

(φ(r )s) ⊗ t = s ⊗ (ψ(r )t) = r (s ⊗ t)

because both of these can be thought of as the monomial r xs yt = xs r yt .

Lemma 82 (Universal property of tensor product of rings)


Suppose we have a diagram as in the construction of S ⊗R T but with a different ring U, as shown below and
with maps f : S → U and g : T → U. Then there is a unique ring morphism f ⊗ g making the diagram below
commute.

[Diagram: φ : R → S and ψ : R → T on the left; f : S → U and g : T → U on the right; and the unique dotted map f ⊗ g : S ⊗R T → U in the middle.]

In general, we should think of using this universal property whenever we want to map out of a tensor product.

Proof. To make the two triangles on the right commute, we must have the map R[xs , yt ] → U sending xs to f (s)
and yt to g(t) for each s ∈ S, t ∈ T . We must then check that it factors through the quotient, by showing that each
generator of that ideal gets sent to zero. But that's true because f and g are ring morphisms compatible with the
maps from R, and this is the unique map because we've shown that every variable xs and yt must be sent to a specific
element of U.

The point now is that we only need this universal property and work with it instead of the original definition.

Remark 83. By composing diagrams, we see that if S ⊗′R T also has the same property as S ⊗R T , we get a canonical
isomorphism. It looks as shown below:
[Diagram: S and T each mapping into both S ⊗R T and S ⊗′R T , over the maps from R.]
Indeed, the outer triangles (marked in blue in lecture) commute because each of the subtriangles inside them
commutes, and we have a unique map S ⊗R T → S ⊗R T which must be the identity map.
[Diagram: the composite S ⊗R T → S ⊗′R T → S ⊗R T , compatible with the maps from S and T , which must be the identity.]
We’ll now mention some other basic properties of tensor products:

Lemma 84
Suppose we have maps α : S → S ′ and β : T → T ′ such that α commutes with maps R → S and R → S ′ , and
β commutes with maps R → T and R → T ′ . Then there is a unique map α ⊗ β : S ⊗R T → S ′ ⊗R T ′ such that
s ⊗ t maps to α(s) ⊗ β(t).

Proof. By the universal property, we want to map from S ⊗R T out to S ′ ⊗R T ′ , so we can just describe maps
S → S ′ ⊗R T ′ and T → S ′ ⊗R T ′ . One way to do that is to send s to α(s) ⊗ 1 and t to 1 ⊗ β(t), but another way is
to draw the diagram as follows:
[Diagram: α : S → S ′ on top and β : T → T ′ on the bottom, with the dotted map α ⊗ β : S ⊗R T → S ′ ⊗R T ′ in the middle.]

Then the outer diagram commutes (the dashed maps come from the originally given maps R → S ′ and R → T ′ ),
and then we can describe where any pure tensor goes via

(α ⊗ β)(s ⊗ t) = (α ⊗ β)(s ⊗ 1 · 1 ⊗ t) = (α ⊗ β)(s ⊗ 1)(α ⊗ β)(1 ⊗ t) = (α(s) ⊗ 1)(1 ⊗ β(t)) = α(s) ⊗ β(t).

Lemma 85
For any rings R, T and map ψ : R → T , we have R ⊗R T ≅ T , with map sending r ⊗ t to ψ(r )t.

Proof. We can construct the maps explicitly between the two rings, but alternatively we can show that T has the
universal property that R ⊗R T has. Given any ring U with maps f : R → U and g : T → U satisfying f = g ◦ ψ, we
must show that the diagram below commutes with a unique dashed map T → U. The dashed map must be g if we
want the bottom triangle to commute, but then the whole thing commutes because f = g ◦ ψ gives the top triangle,
and so it's true for the big square as well.

[Diagram: R mapping to itself by id and to T by ψ; f : R → U and g : T → U; and the dashed map T → U, which must be g.]

Either one of those arguments works for the next result:

Lemma 86
We have an isomorphism S ⊗R T ≅ T ⊗R S sending s ⊗ t to t ⊗ s.

Proof. We’ll produce maps in both directions using the universal property of S ⊗R T and T ⊗R S. Specifically, we get a
unique map α : S ⊗R T → T ⊗R S which sends α(s ⊗ t) = α(s ⊗ 1 · 1 ⊗ t) = α(s ⊗ 1)α(1 ⊗ t) = (1 ⊗ s) · (t ⊗ 1) = t ⊗ s.
Similarly we get a unique map β : (t ⊗ s) = s ⊗ t. Then we can note that β ◦ α(s ⊗ t) = s ⊗ t, and being the identity
on pure tensors also means we have the identity everywhere. The same works for α ◦ β.

Definition 87
Let I be an ideal of S. Let I ⊗R T denote the ideal generated by pure tensors s ⊗ t, where s ∈ I (which contains
all finite sums of pure tensors of this form).

We can indeed check that this is an ideal (exercise for us).

Lemma 88
For any I ◁ S, I ⊗R T is an ideal of S ⊗R T , and we have

(S ⊗R T )/(I ⊗R T ) ≅ (S/I) ⊗R T

by sending an element s ⊗ t + I ⊗R T to (s + I) ⊗ t.

Proof. We have a map S ⊗R T to (S/I) ⊗R T sending s ⊗ t to (s + I) ⊗ t (by applying Lemma 84 with the quotient
map S → S/I and the identity map T → T ), and we can check that this factors through the quotient. For the other
direction, we want to map out of the tensor product (S/I) ⊗R T . We need to draw the following diagram:

[Diagram: S/I on top, R → (S/I) ⊗R T below, and (S ⊗R T )/(I ⊗R T ) on the right, with the blue dashed map S/I → (S ⊗R T )/(I ⊗R T ) and the dashed map (S/I) ⊗R T → (S ⊗R T )/(I ⊗R T ).]

The outside diagram commutes, and any element s ∈ I gets sent to 0 under the top-right map. Thus (by the
universal property of the quotient) we factor through a unique (blue dashed) map S/I → (S ⊗R T )/(I ⊗R T ) sending
s + I to s ⊗ 1 + I ⊗R T . Applying the universal property of the tensor product (together with the map T →
(S ⊗R T )/(I ⊗R T ) sending t to 1 ⊗ t + I ⊗R T ), we can thus construct the dashed map (S/I) ⊗R T → (S ⊗R T )/(I ⊗R T ).
We see that

(s + I) ⊗ t ↦ (s ⊗ 1 + I ⊗R T )(1 ⊗ t + I ⊗R T ) = s ⊗ t + I ⊗R T .

Since we've kept track of what happens to pure tensors in both directions, the two composites are the identity on
pure tensors and hence everywhere, so we do indeed get the desired isomorphism.

Combining the previous result with S = R gives us the following result as well (which is easier than proving it from
scratch):

Lemma 89
If I ◁ R and ψ : R → T , then (R/I) ⊗R,ψ T ≅ T /⟨ψ(I)⟩.
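As a quick sanity check of Lemma 89 (a standard computation, not one worked out in lecture): take R = Z, I = (m), and T = Z/(n), with ψ the quotient map. Then

Z/(m) ⊗Z Z/(n) ≅ (Z/(n))/⟨m + (n)⟩ = Z/(gcd(m, n)),

so for instance Z/(2) ⊗Z Z/(3) = 0, while Z/(4) ⊗Z Z/(6) ≅ Z/(2).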

Lemma 90
If D ⊂ S is multiplicative, then D ⊗ 1 = {d ⊗ 1 : d ∈ D} is multiplicative, and

(D−1 S) ⊗R T ≅ (D ⊗ 1)−1 (S ⊗R T )

with the map sending (s/d) ⊗ t to (s ⊗ t)/(d ⊗ 1) and vice versa.

Combining this with another previous result gives us the following:

Lemma 91
For any multiplicative subset D ⊂ R and any map ψ : R → T , we get (D−1 R) ⊗R T ≅ D−1 T (localizing T at the multiplicative set ψ(D)), with the map sending (r /d) ⊗ t to ψ(r )t/ψ(d).

We also have a few other assorted useful facts:

Lemma 92
We have the polynomial ring isomorphism S ⊗R (T [x]) ≅ (S ⊗R T )[x].

Lemma 93
Tensor products are associative: we have an isomorphism S ⊗R (T ⊗R U) ≅ (S ⊗R T ) ⊗R U sending s ⊗ (t ⊗ u) to (s ⊗ t) ⊗ u.

Lemma 94
We have the isomorphism S ⊗R (T × U) ≅ (S ⊗R T ) × (S ⊗R U).

However, this last result may require modules to prove (there’s no proof that Professor Taylor knows using only
rings).

8 October 12, 2022


We’ll spend today and next lecture on the final topic of rings, factorization. We know that integers can be written
uniquely as powers of prime numbers and potentially a negative sign, and we now want to generalize this to other rings.

Definition 95
For any ring R and r, s ∈ R, we say that r divides s, denoted r |s, if s = r t for some t ∈ R. An element r ∈ R is
irreducible if r is not a unit, but whenever r = st in R, either s or t is a unit.

For example, the irreducibles in Z are primes and negative primes (because the only units are ±1). And for any
field K, the irreducibles in K[x] are the irreducible polynomials in the usual sense.

Lemma 96
If R is a noetherian integral domain, then any nonzero r ∈ R can be written as a product r = uπ1 · · · πn , where u
is a unit and πi are irreducibles.

Proof. This is a typical application of noetherianness: let X be the set of principal ideals (r ) such that r does not have such a factorization. If X is empty, we are done. Otherwise, X has some maximal element (r ). Now r is not a unit (otherwise r = r itself is a factorization with n = 0), and r is not irreducible (otherwise r would be its own factorization of the desired form). Thus r = st for some s, t not units. But (s) contains (r ), and if (r ) = (s) then we must have s = ur =⇒ st = tur =⇒ 1 = tu (here we use that st = r and that R is an integral domain), a contradiction because t is not a unit. So (s) strictly contains (r ) and hence cannot be in X by maximality of (r ), and similarly for (t), meaning that s = uπ1 · · · πn and t = v π1′ · · · πm′ for units u, v and irreducibles πi , πj′ . But this gives us a factorization r = (uv )π1 · · · πn π1′ · · · πm′ , a contradiction.

Definition 97
Two elements r, s ∈ R are associates, denoted r ∼ s, if r = su for some unit u ∈ R× .

(For example, p and −p are associates in Z, but different primes are not.) In the integers we know not only that every nonzero integer has a factorization, but also that it is unique up to order and switching signs (in other words, up to associates). More formally, if r = uπ1 · · · πn = v π1′ · · · πm′ , where u and v are units and the πi , πj′ are all irreducible, then we know that m = n and for each i we have πi ∼ πj′ for some j. However, this is not true for all rings:

Example 98
Consider the ring Z[√−5]. Then we have 6 = 2 · 3 = (1 + √−5)(1 − √−5), but all of 2, 3, 1 + √−5, and 1 − √−5 are irreducible, and 2 and 1 ± √−5 are not associates.

To check that 2 is irreducible, suppose 2 = (a + b√−5)(c + d√−5) for some integers a, b, c, d. Taking the squared absolute values of these complex numbers (also called their norms), we see that 4 = (a² + 5b²)(c² + 5d²), which forces b = d = 0, and this leaves only trivial factorizations (with units) like 2 = (−1) · (−2) and so on. The other three elements are handled the same way, as in the sketch below.
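Here is a small brute-force version of that norm argument (a supplementary sketch in Python, not from the lecture; it only uses that the norm N(a + b√−5) = a² + 5b² is multiplicative and that the elements of norm 1 are the units ±1):

```python
# Norms in Z[sqrt(-5)]: N(a + b*sqrt(-5)) = a^2 + 5b^2, which is multiplicative.
def norm(a, b):
    return a * a + 5 * b * b

# All norms attained by small elements of Z[sqrt(-5)] (enough for norms up to 9).
attained = {norm(a, b) for a in range(-4, 5) for b in range(-2, 3)}

# If x = y*z nontrivially, then norm(x) = norm(y)*norm(z) with both factors > 1
# and both attained; if no such splitting exists, x is irreducible.
def irreducible_by_norms(a, b):
    n = norm(a, b)
    return not any(n % m == 0 and m in attained and (n // m) in attained
                   for m in range(2, n))

for a, b in [(2, 0), (3, 0), (1, 1), (1, -1)]:
    print((a, b), irreducible_by_norms(a, b))  # all True
```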

Definition 99
A ring R is a unique factorization domain (UFD) if it is an integral domain, every element r of R \ {0} has a factorization r = uπ1 · · · πn with u a unit and the πi irreducible, and such a factorization is unique, meaning that if r = uπ1 · · · πn = v π1′ · · · πm′ for u, v units and πi , πj′ irreducible, then n = m and up to reordering we have πi ∼ πi′ .

(The existence of a factorization is usually not a problem – we just saw that it follows from noetherianness – so it is the uniqueness that matters.) For example, we know that Z is a UFD, and any field is a UFD because any nonzero element is already a unit (there are no irreducibles in a field). On the other hand, we've just shown that Z[√−5] is not a UFD.

Lemma 100
Let R be an integral domain in which every nonzero element has a factorization. Then the following are equivalent:
1. For nonzero r , the principal ideal (r ) is prime if and only if r is irreducible.

2. If an element π is irreducible, then (π) is prime.

3. If π is irreducible and π divides r s, then π divides r or π divides s.

4. R is a UFD.

The point is that (2) or (3) are the weakest-looking conditions, so they are the easiest to check in practice.

Proof. Clearly (1) implies (2) (the latter is a weaker statement), and (2) and (3) are just restatements of each other.
To show (3) implies (4), suppose we have two factorizations r = uπ1 · · · πn = v π1′ · · · πm′ . We will prove that n = m and πi ∼ πi′ up to reordering by induction on m. For the case m = 0, each πi would need to be a unit (which is not possible), so n = 0, and then the other statement is vacuous. Now if m > 0, πm′ divides uπ1 · · · πn , so by property (3) it either divides πn or divides uπ1 · · · πn−1 , meaning it either divides πn , divides πn−1 , or divides uπ1 · · · πn−2 , and so on. Repeating this process (πm′ cannot divide the unit u), we find that πm′ divides some πi ; since πi is irreducible, πi = πm′ w for a unit w , and by reordering we can assume i = n. Substituting in πn = πm′ w and cancelling πm′ (we are in an integral domain), we find that (uw )π1 · · · πn−1 = v π1′ · · · πm−1′ , so by induction m − 1 = n − 1 and up to order the remaining πi s and πi′ s are associates.
To show that (4) implies (1), suppose R is a UFD. For one direction, the fact that if (r ) is prime (with r ̸= 0), then r is irreducible does not rely on the other properties (only on the fact that R is an integral domain): r is not a unit since (r ) ̸= R, and if r were not irreducible, then r = st for non-units s, t, and by primeness of (r ) we must have s ∈ (r ) or t ∈ (r ). But then we get the same argument as before: without loss of generality we have s = r u, so st = r ut =⇒ 1 = ut, which means t is a unit, a contradiction. For the other direction, suppose r is irreducible. Then if st ∈ (r ) for some s, t, then st = r x for some x ∈ R. Writing out the factorizations s = uπ1 · · · πn , t = v π1′ · · · πm′ , x = w π1′′ · · · πp′′ and remembering that r is irreducible, we then get

uπ1 · · · πn v π1′ · · · πm′ = st = w r π1′′ · · · πp′′ .

But by unique factorization of st, we see that r ∼ πi or πi′ for some i , meaning that r divides s or r divides t. Thus
either s ∈ (r ) or t ∈ (r ), which proves that (r ) is prime.

Definition 101
A ring R is a principal ideal domain (PID) if it is an integral domain and any ideal I ◁ R is principal (meaning
that I = (r ) for some r ∈ R).

For example, the division algorithm tells us that Z and K[x] are principal ideal domains.

Lemma 102
Suppose R is noetherian. Then R is a PID if and only if R is a UFD in which each nonzero prime ideal is a maximal
ideal.

In particular, this tells us that K[x] is always a UFD.

Proof. The reverse direction is on our homework as an exercise. For the forward direction, suppose R is a PID and π ∈ R is irreducible. We wish to show that (π) is a prime ideal so that we can apply Lemma 100. If r s ∈ (π), then (r, π) = (t) for some t because we have a PID, and in particular π = tu (because t divides π). By irreducibility, there are two cases: if u is a unit, then the ideals (t) and (π) are the same, so r ∈ (t) =⇒ r ∈ (π), which is what we want. On the other hand, if t is a unit, then r and π generate the whole ring R, so we have 1 = ar + bπ for some a, b ∈ R. Multiplying by s, we find that s = a(r s) + (bs)π, and now both terms on the right-hand side are divisible by π; thus π divides s, meaning s ∈ (π).
It now remains to check that any nonzero prime ideal is maximal. Suppose we have two nonzero prime ideals p ⊂ q. In a PID, p = (π) and q = (π ′ ), and both π and π ′ are irreducible (by the direction of Lemma 100 we just used). Since π ∈ (π ′ ), we have π = π ′ u for some u ∈ R. But because π is irreducible and π ′ isn't a unit, u must be a unit, and that means (π) = (π ′ ). So one nonzero prime ideal cannot strictly contain another. Thus any nonzero prime ideal cannot be strictly contained in a maximal ideal (which would be prime), and thus the ideal itself must be maximal.

Lemma 103
Let R be a UFD and let r, s ∈ R. A greatest common divisor gcd(r, s) ∈ R is an element such that gcd(r, s)|r and gcd(r, s)|s, and it is "greatest" in the sense that whenever t|r and t|s, we also have t| gcd(r, s). Such a gcd always exists and is unique up to associates.

We can also similarly define gcds of more than two elements in an analogous way, as well as lcms.

Proof. Suppose r = uπ1 · · · πn and s = v π1′ · · · πm′ . Reorder so that π1 ∼ π1′ , π2 ∼ π2′ , . . . , πp ∼ πp′ , but πi ̸∼ πj′ for any i, j > p. Then we can set gcd(r, s) = π1 · · · πp and check that all of the properties hold.

Definition 104
We say that two elements r, s ∈ R in a UFD are coprime if gcd(r, s) = 1.
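As a concrete computation (a supplementary sketch assuming the sympy library is available; this is not part of the course material), gcds in the UFD Z[x] can be computed directly, keeping in mind that the answer is only well defined up to the units ±1:

```python
# A gcd computation in the UFD Z[x].
from sympy import Poly, gcd
from sympy.abc import x

f = Poly(2*x**3 - 2*x, x, domain='ZZ')  # factors as 2 * x * (x - 1) * (x + 1)
g = Poly(4*x**2 + 4*x, x, domain='ZZ')  # factors as 2^2 * x * (x + 1)
print(gcd(f, g))                        # 2*x**2 + 2*x = 2 * x * (x + 1)
```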

Definition 105
Let R be a UFD. A polynomial f ∈ R[x] is primitive if the gcd of all of its coefficients is 1.

It will turn out that the product of two primitive polynomials is still primitive, and we’ll see that and some related
ideas next time.

9 October 14, 2022


Last lecture, we introduced unique factorization domains, which are integral domains where every element can be
written as a unit times a bunch of irreducibles unique up to ordering and associates. We showed that being a UFD is
equivalent to being an integral domain where factorizations exist and (π) is a prime ideal for every irreducible π. In a
UFD, it makes sense to talk about the gcd and lcm of two elements, saying whether they’re coprime, and so on. In
the polynomial ring case, we said that a polynomial f is primitive if the coefficients have no common factor. This will
play a role in factorization in the following way:

Lemma 106 (Gauss)


Let R be a UFD. If f , g ∈ R[x] are primitive, then f g is primitive as well.

Proof. Suppose otherwise, so that there exists some irreducible π ∈ R such that π divides all coefficients of f g. Since (π) is prime, we can let f̄ and ḡ denote the images of f and g in (R/(π))[x], which is a polynomial ring over an integral domain. But f̄ ḡ = 0 in (R/(π))[x] implies that either f̄ = 0 or ḡ = 0, meaning π divides every coefficient of f or every coefficient of g, which would contradict f and g being primitive.
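As a quick numerical illustration of Gauss's lemma over R = Z (a sketch assuming sympy is available):

```python
# The content of a product of primitive polynomials in Z[x] is 1.
from sympy import Poly
from sympy.abc import x

f = Poly(2*x + 3, x, domain='ZZ')     # content gcd(2, 3) = 1, so primitive
g = Poly(5*x**2 + 4, x, domain='ZZ')  # content gcd(5, 4) = 1, so primitive
content, prim = (f * g).primitive()   # split f*g as content * (primitive part)
print(content)                        # 1, so the product f*g is primitive too
```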

Lemma 107
Let R be a UFD. Then a polynomial f ∈ R[x] is irreducible if and only if the following condition holds: either f ∈ R and f is irreducible in R, or f ∈ R[x] is primitive and irreducible in Q(R)[x], the polynomial ring over the fraction field of R.

Proof. For the forward direction, suppose f is irreducible in R[x]. If f ∈ R and f = gh in R, then g or h is a unit in R[x]× = R× (R is an integral domain, so the units of R[x] are just the units of R). This means f is indeed irreducible in R. On the other hand, if f ∈ R[x] \ R and some irreducible π ∈ R divides all of the coefficients of f , then we can write f = π · (f /π), and neither π nor f /π is a unit in R[x]× = R× . Thus f is indeed primitive in this case, and now we must check that
it is irreducible over Q(R)[x]. If we can write f = gh ∈ Q(R)[x] with degrees of g and h positive, we can pull out
factors in the denominator of g and h by writing g = ag̃ with a ∈ Q(R) and g̃ ∈ R[x] primitive (basically multiply by
the lcm of the denominators), and similarly h = bh̃. Thus

f = (ag̃)(bh̃) = abg̃ h̃,

and by Gauss’s lemma g̃ h̃ ∈ R[x] is primitive. But now ab cannot have a denominator (we can always cancel common
factors in a fraction in a UFD) – indeed, if there were some irreducible π in the denominator of ab, in order for
abg̃ h̃ = f to be an element of R[x] we must have π divide g̃ h̃, which contradicts g̃ h̃ being primitive. Thus ab ∈ R,
and f = (abg̃)h̃ would give us a factorization in R[x]. This contradicts f being irreducible, so we must actually have
irreducibility over Q(R)[x] as desired.

The other direction is more straightforward: if f ∈ R is irreducible in R and f = gh in R[x], then g and h must be in R (by comparing degrees), so g or h is in R× , and f is also irreducible in R[x]. And if f is primitive and irreducible in Q(R)[x], then for any factorization f = gh in R[x], viewing the factors in Q(R)[x] we may assume without loss of generality that g is a unit there, i.e. g ∈ Q(R) ∩ R[x] = R. But because f is primitive, this can only occur if g is a unit in R× = R[x]× , and thus f is again irreducible.

Theorem 108
If R is a unique factorization domain, so is R[x] (and thus R[x1 , · · · , xn ] for any n).

Proof. To prove existence of a factorization, first notice that Q(R)[x] is a UFD because Q(R) is a field. So suppose f ∈ R[x] is nonzero and write f = uπ1 · · · πn with u ∈ Q(R)[x]× = Q(R)× , where the πi ∈ Q(R)[x] are irreducible. Without loss of generality (by moving constants into u) we can take each πi to be a primitive element of R[x], so each πi is irreducible in R[x] by our previous result. By Gauss's lemma, the product π1 · · · πn is primitive, and then the same denominator-clearing argument as before shows that u must be in R. Factoring u into a unit times irreducibles of R (using that R is a UFD) then gives us a factorization of f in R[x] as well.
For uniqueness, we’ll instead check the fact that if π ∈ R[x] is irreducible, then (π) is prime. Again by our previous
result, we can check two cases: one case is where π is an irreducible element of R. Then R[x]/(π) = R/(π)[x], and
because R/(π) is an integral domain so is R/(π)[x]. Thus (π) ◁ R[x] is prime. The other case is where π ∈ R[x] \ R
is primitive and irreducible in Q(R)[x]. Suppose π|f g. Then since Q(R)[x] is a UFD, we know that π divides either f
or g in Q(R)[x]; without loss of generality let it be f . This means f = πh for some h ∈ Q(R)[x], and we can write
h = ah̃ for some a ∈ Q(R) and some primitive h̃ ∈ R[x]. Then f = a(π h̃), but π h̃ is primitive so a cannot have any
factors in the denominator. This means a ∈ R and h was already in R[x], meaning π|f in R[x] as well. That proves
again that (π) is prime.

We’ll finish by discussing some other tricks for showing irreducibility of polynomials:

• The polynomial x³ − 3x + 1 is irreducible in Q[x]. Indeed, because it is primitive, irreducibility in Q[x] is equivalent to irreducibility in Z[x], and if it factored there would have to be a linear factor (the degree is 3), meaning that x³ − 3x + 1 = (x² + ax + b)(x + c) for integers a, b, c. But then bc = 1 implies that c = 1 or c = −1, and neither 1 nor −1 is a root of x³ − 3x + 1. More generally, we get restrictions on the coefficients given primarily by the constant term if we're trying to look at irreducibility in Z[x].

• Next, x³ + x + 105 is irreducible in Q[x]. We can make a similar argument as before, but there's another trick here: if x³ + x + 105 were reducible in Z[x], it would also be reducible in Z/(p)[x] (by reducing each factor individually). Reducing mod 2 gives us x³ + x + 1, and by the argument from above this has no roots in Z/(2) and thus cannot have a linear factor, so it is irreducible (see the sketch after this list).
However, we should be a bit careful with the leading coefficient here. For example, 2x² + x = x(2x + 1) is reducible, but mod 2 this is the polynomial x, which is irreducible. (The problem is that what was originally a valid irreducible factor becomes a unit.) So we just need to make sure we don't decrease the degree when we make such a reduction.
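A quick machine check of both points in the second bullet (a sketch assuming sympy; passing `modulus=2` asks for a factorization over the field with two elements):

```python
# The reduction-mod-p trick, checked with sympy.
from sympy import factor_list
from sympy.abc import x

# x^3 + x + 105 reduces mod 2 to x^3 + x + 1, which stays irreducible (and the
# degree is preserved), so the original polynomial is irreducible in Z[x].
print(factor_list(x**3 + x + 105, modulus=2))  # (1, [(x**3 + x + 1, 1)])

# The cautionary example: 2x^2 + x = x*(2x + 1) is reducible in Z[x], but mod 2
# the degree drops and the reduction is just x.
print(factor_list(2*x**2 + x, modulus=2))      # (1, [(x, 1)])
```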

We’ll now mention a less straightforward check for irreducibility:

Lemma 109 (Eisenstein’s criterion)


Suppose R is an integral domain, ℘ ◁ R is a prime ideal. Suppose f (x) = f0 + f1 x + · · · + fd x d ∈ R[x]. Suppose
fd ̸∈ ℘, fi ∈ ℘ for all i < d, and f0 ̸∈ ℘2 . Then f is not a product of two lower-degree polynomials.

To explain the technicality in the conclusion, notice that 2x + 6 = 2(x + 3) is not irreducible, but it does satisfy the criterion for ℘ = (3) in R = Z – the criterion only rules out factorizations into two lower-degree polynomials. As an example of the criterion in action, x⁴ + 10x + 5 is irreducible in Z[x] by applying Eisenstein's criterion with the ideal (5) (it is also primitive).

Proof. Suppose f = gh in R[x], with g(x) = g0 + g1 x + · · · + ge x^e and h(x) = h0 + h1 x + · · · + hd−e x^(d−e) and deg g, deg h < deg f . Comparing leading terms, we know that ge hd−e = fd ̸∈ ℘, so ge ̸∈ ℘ and hd−e ̸∈ ℘ (because ℘ is an ideal). Choose i , j minimal so that gi , hj ̸∈ ℘. Then the coefficient fi+j contains the term gi hj , which is not in ℘ (this is the only place where we use that ℘ is a prime ideal), and its other terms gi−k hj+k and gi+k hj−k are all in ℘ because gi−k and hj−k are in ℘ by minimality of i and j. A sum of terms in ℘ plus something not in ℘ cannot be in ℘, so fi+j ̸∈ ℘. This is only possible if i + j = d, and thus i = e and j = d − e. This would imply that all coefficients of g and h except the leading ones are in ℘. But then f0 = g0 h0 ∈ ℘² , which contradicts our original assumption. Thus f cannot be written as this product.

Corollary 110
The p-cyclotomic polynomial φp (x) = (x^p − 1)/(x − 1) = x^(p−1) + x^(p−2) + · · · + 1 is irreducible in Z[x], because it's irreducible if and only if φp (x + 1) is irreducible. And φp (x + 1) = ((x + 1)^p − 1)/x has all coefficients divisible by p except the leading one, with constant term p (the coefficients are the binomial coefficients (p choose k + 1)), so it satisfies Eisenstein's criterion with the ideal (p).
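We can sanity-check this corollary for a specific p (a sketch assuming sympy; the coefficient formula for φp (x + 1) is just the binomial theorem):

```python
# Eisenstein's criterion at (p) applied to phi_p(x + 1).
from sympy import binomial

def eisenstein(coeffs, p):
    # coeffs = [f_0, f_1, ..., f_d]; test Eisenstein's criterion at the ideal (p)
    return (coeffs[-1] % p != 0                       # leading coefficient not in (p)
            and all(c % p == 0 for c in coeffs[:-1])  # all other coefficients in (p)
            and coeffs[0] % (p * p) != 0)             # constant term not in (p^2)

p = 5
coeffs = [int(binomial(p, k + 1)) for k in range(p)]  # coefficients of phi_5(x + 1)
print(coeffs)                 # [5, 10, 10, 5, 1]
print(eisenstein(coeffs, p))  # True, so phi_5 is irreducible in Z[x]
```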

10 October 17, 2022


We’ve finished our discussion on rings now, and we’ll take a week of lectures to talk over the basics of category theory.
A good reference for us would be Mac Lane’s “Categories for the Working Mathematician.”

Fact 111
We’ll assume the existence of a universe U, which is a set with the following properties:
• If X ∈ Y and Y is in U, then X is in U.

• If X, Y ∈ U, then {X, Y } ∈ U.

• If X ∈ U, then the powerset P(X) is in U.

• The infinite set {0, 1, 2, 3, · · · } is in U.

• If X ∈ U and f : X → U is some function, then Im(f ) ∈ U.

(The point is that the axioms of a universe give us models for the axioms of set theory: elements of U give us a
model of ZFC. But ZFC itself doesn’t give us the existence of such a U – we have to add some additional axioms.)
We’ll call any x ∈ U a small set, but we shouldn’t worry too much about this detail.

Definition 112
A category C consists of two sets ob(C) ⊂ U (called the objects) and mor(C) ⊂ U (called the morphisms),
together with functions dom : mor(C) → ob(C) and cod : mor(C) → ob(C) (that is, for every morphism we can
specify its domain and codomain), Id : ob(C) → mor(C) (identity maps on each object), and a way to compose
morphisms ◦ : {(f , g) ∈ mor(C) × mor(C) : cod(g) = dom(f )} → mor(C), such that the following axioms hold:
• dom(IdX ) = X = cod(IdX ) for any X,

• f ◦ IdX = f = IdY ◦ f if f has domain X and codomain Y . (The set of morphisms starting at X and ending at Y will be denoted HomC (X, Y ), and such a morphism may be written f : X → Y .)

• Associativity holds: f ◦ (g ◦ h) = (f ◦ g) ◦ h if cod(h) = dom(g) and cod(g) = dom(f ).

Remark 113. Here, we don’t want to think of “containment” or “elements of objects,” which is why we require
codomains and domains to be equal during composition. The point is to think as abstractly as possible.

Definition 114
A morphism f : X → Y is an isomorphism if there exists a morphism g : Y → X with f ◦ g = IdY and g ◦ f = IdX .
If such a morphism exists, it is unique (because if g, g ′ have this property then g = g ◦ f ◦ g ′ = g ′ ), and we call
this map the inverse of f and denote it f −1 .

Definition 115
A morphism f : X → Y is an epimorphism (sometimes epi) if whenever we have maps g, h : Y → Z with
g ◦ f = h ◦ f (and that composition makes sense), then g = h. f : X → Y is a monomorphism (sometimes
mono) if whenever g, h : Z → X with f ◦ g = f ◦ h, then g = h.

This last definition makes more sense with the following concrete example:

Example 116
We have a category of sets (denoted Set), in which the objects ob(C) are the small sets (so everything in U)
and the morphisms mor(C) are the set of functions from one small set to another. Then domain, codomain,
identity, and composition mean what they usually do for functions. In this category, we can check that being an
epimorphism is equivalent to being surjective, and being a monomorphism is equivalent to being injective.

Example 117
The category Grp of small groups (groups whose elements form a small set) and group homomorphisms is defined
with the usual composition of homomorphisms and so on, and being an epimorphism (resp. monomorphism)
again corresponds to surjectivity and injectivity. The same is true for the category Ring of small rings and ring
homomorphisms.

Example 118
Small topological spaces and continuous functions also form a category Top, and monomorphisms are still exactly the injective maps. In Top itself epimorphisms are still the surjections, but in the full subcategory of Hausdorff spaces an epimorphism only needs dense image: for example, the inclusion Q ,→ R (with the usual topologies) is an epimorphism there but is not surjective, since two continuous maps to a Hausdorff space agreeing on the dense subset Q agree everywhere.

Example 119
We will denote by K-vect the category of small K-vector spaces and K-linear maps and Ab the category of abelian
groups and group homomorphisms – these are also indeed categories.

Definition 120
A covariant functor F : C → D between categories consists of the two functions F : ob(C) → ob(D) and
F : mor(C) → mor(D), such that F (dom(f )) = dom(F (f )), F (cod(f )) = cod(F (f )), F (IdX ) = IdF (X) , and
F (f ◦ g) = F (f ) ◦ F (g). A contravariant functor is very similar but instead satisfies F (dom(f )) = cod(F (f )),
F (cod(f )) = dom(F (f )), F (IdX ) = IdF (X) , and F (f ◦ g) = F (g) ◦ F (f ).

We can think of covariant functors as preserving the direction of arrows and contravariant functors as reversing the direction of arrows: indeed, a covariant functor sends f : X → Y to F (f ) : F (X) → F (Y ), but a contravariant one sends f : X → Y to F (f ) : F (Y ) → F (X).

Example 121
We have a forgetful functor from Grp to Set, in which we forget about the group structure and think about the
groups and homomorphisms as just sets and functions. We similarly have a forgetful functor Ab to Grp, as well
as one from K-vect to Ab.

Example 122
The map H^i : Top → Ab sending X to the cohomology H^i (X, Z) is a contravariant functor (and similarly homology H_i is a covariant functor).
Fundamental groups are a bit more tricky – the map π1 sends pointed topological spaces to groups, but Top
itself does not contain information about base points. And there is no category of “group up to isomorphism”
because we don’t know how to map between those objects.

Example 123
GLn : Ring → Grp is a functor that sends any ring R to GLn (R).

Example 124
For any category C, we have the identity functor IdC : C → C.

Example 125
We have a contravariant functor ∗ : K-vect → K-vect sending V to its dual space V ∗ .

Definition 126
A functor F : C → D is faithful if it keeps distinct morphisms distinct – in other words, the map from HomC (X, Y ) to either HomD (F (X), F (Y )) (if covariant) or HomD (F (Y ), F (X)) (if contravariant) is injective. Similarly, a functor is full if that map is surjective.

For example, forgetful functors are faithful but not full, the identity functor is fully faithful, and the duality functor
∗ is faithful but not full. Indeed, if the duals f ∗ and g ∗ of two linear maps f , g : V → W are equal, then λ ◦ f = λ ◦ g
for all functionals λ : W → K, but if f (v ) ̸= g(v ) then there would be some λ such that λ(f (v )) ̸= λ(g(v )), so λ ◦ f
would not be equal to λ ◦ g. But there are infinite-dimensional vector spaces whose duals are much bigger than the original spaces, so ∗ is not full. On the other hand, ∗ is fully faithful on the category of finite-dimensional vector spaces Fin-K-vect (using that every finite-dimensional space is canonically isomorphic to its double dual).

Definition 127
We call C ⊂ D a subcategory if ob(C) ⊂ ob(D) and mor(C) ⊂ mor(D), where C preserves domain, codomain,
identity, and composition from D.

For any subcategory C ⊂ D, we get an inclusion functor I : C → D (for example we have one from Fin-K-vect
to K-vect), which is always faithful. If I is also full, we call C a full subcategory (for example Fin-K-vect is a full
subcategory of K-vect).
We can also compose functors F : C → D and G : D → E to get a functor G ◦ F : C → E. Everything up to this
point should be familiar in formalism with what we’ve already done, but now we have something a layer deeper:

Definition 128
Let F, G : C → D be two functors. A natural transformation φ : F → G is a map which, for each object X ∈ C,
yields a morphism φX : F (X) → G(X) which is compatible with all morphisms in C, meaning that the following
diagram commutes:

[Diagram: the square with F (f ) : F (X) → F (Y ) on top, G(f ) : G(X) → G(Y ) on the bottom, and φX , φY vertical; commutativity says φY ◦ F (f ) = G(f ) ◦ φX .]

Example 129
There is a natural transformation det : GLn → GL1 . In other words, for any ring R, we have a map sending GLn (R) to GL1 (R) (the group of units of R) by taking the determinant.

That determinant is defined as a universal polynomial in the entries, and det is a natural transformation. Indeed,
for any map f : R → S we get a map GLn (f ) : GLn (R) → GLn (S) (applying f to each entry) such that "taking the determinant is the same before or after applying the morphism":

[Diagram: the naturality square with det : GLn (R) → GL1 (R) and det : GLn (S) → GL1 (S) horizontal and GLn (f ), GL1 (f ) vertical; commutativity says GL1 (f ) ◦ det = det ◦ GLn (f ).]
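To make the square concrete, here is a toy check on a single matrix (a supplementary sketch, not from the lecture), with n = 2 and f : Z → Z/(5) the reduction map:

```python
# Naturality of det on one element of GL_2(Z), with f: Z -> Z/(5).
a, b, c, d = 7, 3, 2, 1  # the matrix [[7, 3], [2, 1]] has det 1, so it lies in GL_2(Z)

det_then_f = (a * d - b * c) % 5                          # det over Z, then apply f
f_then_det = ((a % 5) * (d % 5) - (b % 5) * (c % 5)) % 5  # apply f entrywise, then det
print(det_then_f == f_then_det)                           # True: the square commutes here
```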

We’ll see some more examples of this next time!

11 October 19, 2022


Last lecture, we introduced categories C and functors F : C → D (which are abstract sets of objects and maps between
them). (For example, one useful object we didn’t explicitly state last time is the constant functor CY for some Y ∈ D,

which sends any object X ∈ ob(C) to Y and any f ∈ mor(C) to IdY .) We then introduced natural transformations
between functors φ : F → G, which associate to each X ∈ ob(C) a map φX : F (X) → G(X) which commutes with
any morphism f : X → Y in C. The point is that such a transformation is “so natural” that it doesn’t depend on
whether we’re applying F or G. We saw an example with the determinant map GLn (R) → GL1 (R) (where the idea
was that if we have a ring morphism R → S, computing the polynomial in the entries in R or in S "gives the same
thing”).

Example 130
Recall that we have a contravariant functor ∗ from K-vect to K-vect (where we send a vector space V to V ∗ ),
and applying ∗ twice gives us a covariant functor. From linear algebra, there is a canonical map V → V ∗∗ which
sends any v to the evaluation map (λ 7→ λ(v )), and notice that this definition didn’t really depend on v in any
essential way. In other words, there should be a natural transformation from the identity functor to the double
dual functor.

Indeed, for any vector space V ∈ K-vect, define a map DV : V → V ∗∗ so that for any v ∈ V and λ ∈ V ∗ , DV (v ) is the evaluation map
DV (v )(λ) = λ(v ).

We must check that the following diagram commutes for any f : V → W :


[Diagram: the square with DV : V → V ∗∗ on top, DW : W → W ∗∗ on the bottom, and f : V → W , f ∗∗ : V ∗∗ → W ∗∗ vertical; we need f ∗∗ ◦ DV = DW ◦ f .]
To do this, we must check that starting with some x ∈ V and going around the diagram yields the same thing in
both directions, and we can do that by checking that the result is the same applied to any λ ∈ W ∗ . Indeed, we have
(going around in one direction)

((DW ◦ f )(x))(λ) = DW (f (x))(λ) = λ(f (x)) = (λ ◦ f )(x),

and (in the other direction, using in the second step the definition of the dual of a linear transformation)

((f ∗∗ ◦ DV )(x))(λ) = (f ∗∗ (DV (x)))(λ) = (DV (x) ◦ f ∗ )(λ) = DV (x)(λ ◦ f ) = (λ ◦ f )(x),

which are indeed the same. Thus we do indeed have a natural transformation.

Example 131
The identity natural transformation from a functor F to itself assigns to each object X the morphism IdF (X) – we can check that this satisfies the definitions.

Example 132
If f : Y → Y ′ is a morphism in D, then we get a natural transformation Cf : CY → CY ′ between the constant functors. For each object X ∈ C we must define Cf ,X : CY (X) → CY ′ (X), and this basically just applies f (because CY (X) = Y and CY ′ (X) = Y ′ ). And this is a natural transformation because the commutative diagram is just saying that f ◦ id = id ◦ f .

Example 133
Given natural transformations φ : F → G and ψ : G → H, we get a natural transformation ψ ◦ φ : F → H defined by (ψ ◦ φ)X = ψX ◦ φX for any X.

We now may be curious about “equivalence of categories” – it turns out that requiring morphisms between them
that compose to the identity is too rigid of an assumption. We’ll first discuss equivalence of functors:

Definition 134
We say that two functors F, G : C → D are naturally isomorphic, denoted F ≃ G, if there are natural transfor-
mations φ : F → G and ψ : G → F such that φ ◦ ψ = IdG and ψ ◦ φ = IdF . In other words, for any X ∈ ob(C), we want φX : F (X) → G(X) and ψX : G(X) → F (X) to be mutually inverse isomorphisms.

Example 135
For the category Fin-K-vect, the identity map and the double dual are naturally isomorphic functors (though they
aren’t actually equal objects, they’re as good as each other for any real purpose).

Definition 136
A covariant functor F : C → D is an equivalence of categories if there is some covariant functor G : D → C
such that F ◦ G ≃ IdD and G ◦ F ≃ IdC . If F and G are contravariant instead and we have the same condition,
we say that we have an anti-equivalence of categories.

Example 137
The functor ∗ : Fin-K-vect → Fin-K-vect is an anti-equivalence of categories (since we can also use ∗ to go in
the opposite direction).

Definition 138
We call C a small category if ob(C) and mor(C) are both in the universe U.

For any small category J, we can consider functors F : J → C, and we’ll now discuss limits and colimits:

Definition 139
A limit (also inverse limit) of F , denoted lim←− F , is an object X ∈ ob(C) and a natural transformation φ : CX → F which is universal: that is, if we have any object X ′ ∈ ob(C) and a natural transformation ψ : CX ′ → F , then there is a unique α : X ′ → X such that ψ = φ ◦ Cα .

Concretely, this means that for every j ∈ ob(J), we have a map φj : X → Fj which commutes with all morphisms: for any f : i → j in J, we require φj = Ff ◦ φi (the diagram with X drawn twice, idX on the left and Ff on the right, commutes).

Then the universal property says that given an object X ′ and maps ψi : X ′ → Fi with ψj = Ff ◦ ψi for all f : i → j, there is a unique map α : X ′ → X such that ψi = φi ◦ α for all i .

Example 140
Suppose J has two objects 1 and 2 and the two identity maps Id1 and Id2 . Then a functor F : J → C means we
just need to specify that 1 goes to some X1 ∈ C and 2 goes to some X2 ∈ C, and the identity morphisms go to
the identity morphisms.

In this case, the limit lim←− F must be some object in C with a map to X1 and a map to X2 which commute with the morphisms in J, and for any other X ′ with maps into X1 and X2 we have a unique map X ′ → lim←− F :

[Diagram: X ′ has maps to X1 and X2 , and the dashed map α : X ′ → lim←− F factors both of them through lim←− F .]

So this actually gives us the product (if the limit exists – it may not exist for a general category). More generally, for any small set I, we can create a category with objects I and only the identity morphisms; then the limit of a functor i ↦ Xi , if it exists, is the product of the Xi , with maps to each Xi satisfying the product universal property that we've already seen.

Example 141
Now suppose J has two objects 1 and 2, the two identity maps Id1 and Id2 , and two additional morphisms from 1
to 2. Then having a functor F : J → K-vect means that we specify two vector spaces V and W and two linear
transformations f , g : V → W .

The limit of this functor is then a vector space X with maps to V and W such that the result is the same whether
we take f or g. In other words, it’s a vector space with a map φ to V so that f ◦ φ = g ◦ φ which also satisfies the
universal property. So what we have here is the kernel ker(f − g) with the natural inclusion into V , and indeed for any
other map ψ : X ′ → V with that same property, we have that (f − g)ψ(x) = 0 for all x ∈ X ′ . But that means the
image of ψ lands in ker(f − g), so any such map ψ must factor through ker(f − g).

Definition 142
A colimit (also direct limit) of a functor F : J → C, denoted lim−→ F , is an object X ∈ ob(C) and a natural transformation φ : F → CX which is universal, meaning that for any object X ′ ∈ C and natural transformation ψ : F → CX ′ , there is a unique α : X → X ′ such that ψ = Cα ◦ φ.

Example 143
Consider the category J of three objects 1, 2, 3 with the identity maps and a map 1 → 2 and a map 1 → 3. Then
a functor F : J → Ring is a specification of three rings R, S, T , along with maps R → S and R → T .

Checking the definitions, a colimit lim−→ F of this functor is then the tensor product S ⊗R T (we saw the corresponding commutative diagram back when we defined the tensor product, since we need maps S → S ⊗R T and T → S ⊗R T that are compatible with the maps R → S and R → T ). And it turns out that limits and colimits are unique (just like we saw that the tensor product and other limit objects are unique) – we'll see that next time.

12 October 21, 2022


Last lecture, we looked at limits and colimits of functors: if F : J → C is a functor, then a limit of F (which may or may not exist) is defined in the following way: it is an object lim←− F together with a natural transformation φ : Clim←− F → F such that if X ∈ ob(C) and we have a natural transformation φ′ : CX → F , then there is a unique morphism α : X → lim←− F with φ′ = φ ◦ Cα . Concretely, for all j ∈ ob(J) and f : j → j ′ we have the following picture:

[Diagram: α : X → lim←− F ; maps φj : lim←− F → Fj and φj ′ : lim←− F → Fj ′ with φj ′ = Ff ◦ φj ; and φ′j = φj ◦ α for each j.]

The colimit is defined similarly. And as with many other universal properties, we have a uniqueness property:

Lemma 144
Suppose (X, φ) and (X ′ , φ′ ) are two limits of the functor F : J → C. Then there is a unique isomorphism
α : X → X ′ such that φ′ ◦ Cα = φ (and an analogous isomorphism β : X ′ → X). Similarly, if (X, φ) and (X ′ , φ′ )
are two colimits of F : J → C, then there is a unique isomorphism α : X → X ′ with Cα ◦ φ = φ′ (and an analogous isomorphism β : X ′ → X).

(By the universal property, we already know that the map α uniquely exists; what we’re saying is that this is an
isomorphism.) Basically, we have to check that α and β are inverses in both directions, but we’ve done this kind of
argument many times before.

Example 145
It turns out that in the category of sets, small limits and small colimits always exist – indeed, for any functor F
we can define

lim←− F = { (xj ) ∈ ∏j∈ob(J) Fj : F (f )(xj ) = xj ′ for all f : j → j ′ }

with maps φj to Fj being projection onto the jth factor, and this is compatible with the maps Ff : Fj → Fj ′ because
of the condition on the coordinates. Similarly, rings have all small limits and colimits (as we verified ourselves)
– the limits are constructed similarly in the two cases, but the colimits are different (we basically have a disjoint
union for sets and a tensor product for rings).
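For a finite illustration of this formula (a toy sketch, not from the lecture), take J to have two objects 1, 2 and two parallel arrows 1 → 2; the limit in Set is then the equalizer of the two functions, cut out inside the product exactly as above:

```python
# The limit-in-Set formula for two parallel arrows f, g: F1 -> F2.
from itertools import product

F1, F2 = [0, 1, 2, 3], [0, 1]
f = lambda x: x % 2
g = lambda x: 0

# Impose the compatibility condition F(arrow)(x_1) = x_2 for both arrows.
limit = [(x1, x2) for x1, x2 in product(F1, F2)
         if f(x1) == x2 and g(x1) == x2]
print(limit)  # [(0, 0), (2, 0)]: projecting to F1 gives the equalizer {0, 2}
```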

Example 146
Let J again be the category with two objects 1 and 2 and only their identity morphisms. As discussed, a functor
F : J → C is just a collection of two objects (X1 , X2 ) ∈ ob(C).

We’ll see what this is more concretely for various categories.
• In Set, we know that the limit of F is the product set X1 × X2 , and we can check that the colimit of F is the
disjoint union X1 ⊔ X2 .

• In Ring, the limit is X1 × X2 again, and the colimit is X1 ⊗Z X2 (because X1 and X2 map into the tensor product,
and Z maps uniquely into any ring.)

• In K-vect, the limit is again the product of vector spaces, but we usually call it the direct sum X1 ⊕ X2 . And
the colimit is again X1 ⊕ X2 (because we have the map that sends x ∈ X1 to (x, 0), which is not something
that works for rings); here we have the universal property because whenever we have maps fi : Xi → X, we get
a map f1 + f2 : X1 ⊕ X2 → X by sending (x1 , x2 ) to f1 (x1 ) + f2 (x2 ). So the point is that K-vect is behaving
differently from Set and Ring.

• In Grp, the limit is again a product X1 × X2 , and the colimit is the amalgam X1 ∗ X2 (because of the lack of
commutativity).

Definition 147
Let F : C → D and G : D → C be functors. We say that F is a left adjoint for G, and that G is a right adjoint
for F , if for all X ∈ ob(C) and Y ∈ ob(D), there is an isomorphism φX,Y : HomD (F (X), Y ) → HomC (X, G(Y ))
which is natural in the following way: for any f : X → X ′ , we have the commutative diagrams shown below.

For any map f : X → X ′ , we have the following diagram (“precomposing by f ,” so the notation − ◦ F (f ) means
that we take some h ∈ HomD (F (X ′ ), Y ) and send it to h ◦ F (f )):

[Diagram: the square with φX,Y : HomD (F (X), Y ) → HomC (X, G(Y )) and φX ′ ,Y : HomD (F (X ′ ), Y ) → HomC (X ′ , G(Y )) horizontal, and − ◦ F (f ), − ◦ f vertical; commutativity says φX,Y (h ◦ F (f )) = φX ′ ,Y (h) ◦ f .]

Similarly, for any map g : Y → Y ′ , we can "postcompose by G(g)":

[Diagram: the square with φX,Y : HomD (F (X), Y ) → HomC (X, G(Y )) and φX,Y ′ : HomD (F (X), Y ′ ) → HomC (X, G(Y ′ )) horizontal, and g ◦ −, G(g) ◦ − vertical; commutativity says φX,Y ′ (g ◦ h) = G(g) ◦ φX,Y (h).]

For another way of thinking about this definition, if C op denotes the opposite category of C, then we have a functor C op × D → Set sending (X, Y ) to HomD (F (X), Y ) and another sending (X, Y ) to HomC (X, G(Y )); φ is exactly a natural isomorphism between those two functors.
In particular, if we take Y = F (X), we have HomD (F (X), F (X)) in the top left, which has a natural element (the identity map). Then ηX = φX,F (X) (IdF (X) ) is a map X → G(F (X)), and similarly µY = φ−1G(Y ),Y (IdG(Y ) ) is a map F (G(Y )) → Y . So an adjunction gives a natural transformation η : IdC → G ◦ F and a natural transformation µ : F ◦ G → IdD .
It turns out that the maps η and µ determine the adjunction (they don't always give rise to an adjunction, but if they do then knowing η alone is enough). Indeed, take any f ∈ HomD (F (X), Y ) and apply the second naturality square with Y replaced by F (X) on the top row and g = f : chasing IdF (X) ∈ HomD (F (X), F (X)) around the square in both directions shows that

φX,Y (f ) = G(f ) ◦ ηX ,

so knowing η determines all of the maps φX,Y . Similarly, if f : X → G(Y ), then φ−1X,Y (f ) = µY ◦ F (f ), so µ determines the inverse maps φ−1X,Y .

Example 148
Let G : Ring → Set be the forgetful functor. Then G has a left adjoint – indeed, what we need is a functor F so that if Ω is a set and R is a ring, then we have an isomorphism

HomRing (F (Ω), R) ≅ HomSet (Ω, R).

And what we should do is define F (Ω) = Z[Xω ]ω∈Ω to be the polynomial ring on variables indexed by Ω (to give a map Z[Xω ]ω∈Ω → R, we just need to designate where each Xω goes). Here ηΩ : Ω → Z[Xω ]ω∈Ω sends ω to Xω , and µR : Z[Xr ]r ∈R → R sends Xr to r .

Lemma 149
Let F : C → D and G : D → C be functors so that F is left adjoint to G. Then G preserves small limits (so it
sends limits to limits) and F preserves small colimits.

In particular, the forgetful functor is a right adjoint, so it preserves limits (and this explains why we saw that limits were the same in sets and rings, but not colimits). The converse turns out to be essentially true as well:

Theorem 150
Up to some set-theoretic considerations, the converse also holds: let F : C → D and G : D → C be functors. If G preserves limits and some set-theoretic conditions hold, then G has a left adjoint. Similarly, if F preserves colimits and some set-theoretic conditions hold, then F has a right adjoint.

13 October 24, 2022


Today’s lecture will begin our discussion of modules, and we’ll start with the basics:

Definition 151
Let R be a ring. An R-module M is an abelian group (M, +) with a binary operation (action) R × M → M (which
will send (r, m) to r m) such that 1 · m = m (identity), (r + s) · m = r · m + s · m and r · (m + n) = r · m + r · n
(distributivity), and r · (s · m) = (r · s) · m (associativity) for all r, s ∈ R and m, n ∈ M.

(This also implies a few other immediate consequences – for example, 0 · m = 0 and (−1) · m = −m, by plugging appropriate values of r, s, m, and n into the axioms.)

Example 152
If R is a field, then R-modules are the same as R-vector spaces.

However, modules can be much more complicated than vector spaces, much like rings can be much more complicated
than fields.

Example 153
An abelian group is the same as a Z-module – indeed, any Z-module is an abelian group, and for the other direction
we know that (1 + 1 + · · · + 1)m, where we add together n ones, must be m added to itself n times, and similarly
for negatives. So the Z-action is already specified.

Example 154
The ring R itself is an R-module; more generally, for any ideal I ◁ R, R/I is an R-module (in which r (s + I) = r s + I).
And even more generally, if we have a ring homomorphism φ : R → S, then S is an R-module with r · s = φ(r )s
(the axioms hold because φ is a morphism and S is a ring).

Example 155
Q2 is a module over the polynomial ring Q[T ], where (for example) we can let T act by sending (x, y ) to (y , −x).
More generally, any endomorphism of a K-vector space produces a K[T ]-module, since we already know the action
of K and we just need to specify what T does.

Definition 156
A map φ : M → N is an R-module morphism (also an R-linear map) if it preserves the relevant structure,
meaning that φ(r · m + n) = r · φ(m) + φ(n) for all r ∈ R and m, n ∈ M.

In particular, we see that φ(0) = 0 by setting r = 1 and m = n = 0.

Definition 157
A subset N ⊂ M of an R-module is a submodule if N is nonempty and for all m, n ∈ N and r ∈ R, r m + n ∈ N.

Example 158
The R-submodules of R are the ideals of R (since both definitions require being a subset and being closed under
addition and multiplication by any element of R). And for any module M, (0) is a submodule.

Definition 159
We denote by R-Mod the category of small R-modules.

There are a few useful properties of this category:

• (0) is an initial object of this category, since it maps uniquely to everything else. Similarly, (0) is a final object.
This is actually a pretty special property – in rings we had an initial object of Z but a final object of (0).
• The set of homomorphisms HomR-Mod (M, N) has an additive group structure, and in fact it is an R-module.
(This is not something that we had in rings, because there the multiplicative structure wouldn’t be preserved.)
Indeed, M always has a unique map to the zero object, which has a unique map to N, so that is our identity
element. Then if f , g are two homomorphisms from M to N and r ∈ R, then r f + g will be a homomorphism
defined by
(r f + g)(m) = r f (m) + g(m).

Further, notice that for any m, n ∈ M and r, s ∈ R,

(r f + g)(sm + n) = r f (sm + n) + g(sm + n) = r sf (m) + r f (n) + sg(m) + g(n)

= s(r f (m) + g(m)) + r f (n) + g(n) = s(r f + g)(m) + (r f + g)(n),

so we do indeed have an R-linear map.


• We have a map HomR-mod (M, N) × HomR-mod (N, P ) → HomR-mod (M, P ) sending (f , g) to g ◦ f , and this
map is R-bilinear, meaning that if we hold f or g constant, the map is linear in the other (or in other words,
−◦f : HomR-mod (N, P ) → HomR-mod (M, P ) sending g to g◦f is a linear map, and so is g◦− : HomR-mod (M, N) →
HomR-mod (M, P ) sending f → g ◦ f ).
• We can define the direct sum of two R-modules M ⊕ N to be the set of ordered pairs (m, n) with m ∈ M and n ∈ N, with addition and scalar multiplication performed component-wise. This comes with a few natural maps: we have the R-linear inclusions j1 : M ,→ M ⊕ N sending m to (m, 0) and j2 : N ,→ M ⊕ N sending n to (0, n), and we also have the projections π1 : M ⊕ N → M sending (m, n) to m and π2 : M ⊕ N → N sending (m, n) to n. Then we see directly that π1 ◦ j1 = idM , π2 ◦ j2 = idN , π2 ◦ j1 = 0, π1 ◦ j2 = 0, and finally j1 ◦ π1 + j2 ◦ π2 = idM⊕N .
• It turns out that this direct sum R-module is both a product and a coproduct in R-mod. To verify that we have
a coproduct, if we have maps f : M → P and g : N → P , then there is a unique map f + g : M ⊕ N → P such
that (f + g) ◦ j1 = f and (f + g) ◦ j2 = g (sending (m, n) to f (m) + g(n)), and to verify that we have a product,
then given any maps f : P → M and g : P → N, we have a unique map f ⊕ g : P → M ⊕ N sending p to
(f (p), g(p)) such that π1 ◦ (f ⊕ g) = f and π2 ◦ (f ⊕ g) = g. And thus we get a canonical map between products
and coproducts, which helps us recover more structure among the maps: if we have two maps f , g : M → N, we
can construct (f ⊕ g) : M → N ⊕ N and then map Id + Id : N ⊕ N → N, and the composite gives us the map
f + g. (Similarly, we can map M into M ⊕ M and then map into N via f + g in the coproduct sense.) So the
additive structure on the homomorphisms is uniquely defined once we identify the product with the coproduct.

Definition 160
Let f : M → N be a morphism of R-modules. The kernel of f is the set ker f = {m ∈ M : f (m) = 0}, the image
of f is im f = {f (m) : m ∈ M}, and the cokernel of f is coker f = N/ im f .

It is easy to check that ker f is a submodule of M and that im f is a submodule of N.

Example 161
Let N be a submodule of M. Then the quotient abelian group M/N is an R-module by setting r (m +N) = r m +N,
and the corresponding surjective map π : M → M/N has kernel N. The cokernel of N ,→ M is then M/N.

We can redefine the image in terms of the kernel and cokernel, because we always have (thinking of coker f as a
quotient of N)
im f = ker(N → coker f ).

So if we know the kernel of our map, knowing about the image and cokernel are equivalent.

Example 162
If we have an injective map f : N → M, then N ≅ ker(M → coker f ), and if we have a surjective map f : N → M, then M ≅ coker(ker f ,→ N). (In some sense these are the "isomorphism theorems" for R-modules.)

We’ll now get into the concept of exact sequences:

Definition 163
Let f : M → N and g : N → P be two maps. Then the sequence M → N → P is exact at N if im f = ker g (equivalently, g ◦ f = 0 and if g(n) = 0, then n ∈ im f ). A sequence 0 → M → N → P → 0 is short exact if it is exact at M, N, and P (meaning that the map M → N is injective, the map N → P is surjective, and exactness holds at N as before).

Equivalently, we have a short exact sequence if M is isomorphic to ker g and coker f is isomorphic to P (we can
check that the implications go both ways).
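For a concrete example (standard, and consistent with the definitions above): the sequence 0 → Z → Z → Z/(2) → 0, where the first map is multiplication by 2 and the second is the quotient map, is short exact – multiplication by 2 is injective, the quotient map is surjective, and the image of the first map, namely (2), is exactly the kernel of the second.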

Fact 164
All of the properties we’ve described above make R-mod an abelian category (having direct sums that are both
products and coproducts and so on). We’ll see more about this when we discuss homological algebra – it turns
out to be useful because we can have this notion of exactness.

We’ll now talk about properties that don’t hold so generally: let I be a small set and Mi an R-module for all i ∈ I.
Then we can define the direct product (also just product)
∏i∈I Mi = { (mi ) : mi ∈ Mi }

and the direct sum

⊕i∈I Mi = { (mi ) ∈ ∏i∈I Mi : mi = 0 for all but finitely many i }.

In particular, these are only different when we get to infinitely many elements in the index set.
It turns out that R-mod always has small limits and colimits: if F : J → R-mod is a functor, then a limit of F is
(basically a subset of the product)

lim←− F = { (mj ) ∈ ∏j∈ob(J) Fj : F (φ)(mi ) = mj for all φ : i → j },

and similarly the colimit is (viewed as a quotient of the coproduct)

lim−→ F = ⊕j∈ob(J) Fj /⟨mj ej − F (φ)(mj )ej ′ for all φ : j → j ′ , mj ∈ Fj ⟩,

where mj ej is shorthand for the element which has mj in the jth component of the direct sum. Checking that we
do satisfy the definitions of limits and colimits here is left as an exercise to us.

Example 165
For any small set Ω, we can form the free module with Ω as a basis, written as
FR (Ω) = ⊕ω∈Ω R = { f : Ω → R with f (ω) = 0 for all but finitely many ω } .

In particular, for any ω ∈ Ω, we have the "basis element" eω ∈ FR (Ω) with eω (ω ′ ) = 1 for ω = ω ′ and 0 otherwise, and then we can write an element of the free module as (rω ) = Σω∈Ω rω eω (where rω = 0 for all but finitely many ω).
It turns out that FR is left adjoint to the forgetful functor G : R-mod → Set. In other words, we have an isomorphism

HomR-mod (FR (Ω), M) ≅ HomSet (Ω, M)

which sends φ to (ω ↦ φ(eω )) and ψ to the map sending (rω ) to Σω rω ψ(ω). In other words, if ψ maps Ω to M, then there is a unique R-linear map FR (Ω) → M under which eω maps to ψ(ω). (So this is kind of like the property for polynomial rings.)
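Here is a toy computational model of this universal property (a sketch, not from the lecture), representing elements of FZ (Ω) as finitely supported dictionaries:

```python
# Elements of the free Z-module F_Z(Omega) as dicts {omega: coefficient}.
def e(omega):
    # the basis element e_omega
    return {omega: 1}

def induced_hom(psi):
    # Given a set map psi: Omega -> Z (Z viewed as a Z-module), return the
    # unique Z-linear map F_Z(Omega) -> Z sending e_omega to psi(omega).
    return lambda elt: sum(r * psi(w) for w, r in elt.items())

psi = len                       # a set map from Omega = strings to Z
phi = induced_hom(psi)
print(phi(e('abc')))            # 3 = psi('abc'): phi agrees with psi on basis elements
print(phi({'a': 2, 'bc': -1}))  # 2*1 + (-1)*2 = 0, by Z-linearity
```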

14 October 26, 2022


We’ll be going over more basic constructions with modules today – last time, we mentioned the free R-module on Ω,
which can be thought of as direct sums of R with itself indexed by Ω. We saw last time that this is left adjoint to the
forgetful functor from R-mod to Set, so we have the universal property that specifying a map FR (Ω) → M is uniquely
determined by specifying where each basis vector eω goes.

Definition 166
An R-module M is free if M is isomorphic to FR (Ω) for some Ω. A subset B ⊂ M is a basis of M if the natural
map FR (B) → M (sending each basis element eb to b) is an isomorphism.

This definition is equivalent to saying that every element of M can be uniquely written as a finite R-linear combination of the elements of B (in other words as Σi=1..N ri bi with ri ∈ R and bi ∈ B). We know that any vector space has a basis, so any module over a field is free. But there are definitely modules that aren't free – for example, Q is not a free Z-module: a single element q spans only qZ ̸= Q, while any two elements a/b and c/d satisfy the linear relation (bc)(a/b) − (ad)(c/d) = 0. And more simply, something like Z/(2) is not a free Z-module (since 0 can't be a basis element because 1 · 0 = 0, and 1 can't either because 2 · 1 = 0). The point is that being a free R-module is generally a very rare property.
It turns out that if R is nonzero and FR (Ω) ≅ FR (Ω′ ), then we can put Ω and Ω′ in bijection with each other. (The idea is to choose a maximal ideal m ◁ R and deduce that the free modules FR/m (Ω) and FR/m (Ω′ ) are isomorphic as R/m-vector spaces, and then use the standard result for vector spaces.)

Definition 167
The rank of a free R-module M is the cardinality of the set Ω if M ≅ FR (Ω).

Definition 168
For any subset Ω ⊂ M, the image of FR (Ω) in M, denoted ⟨Ω⟩, is the submodule generated by Ω.

More concretely, the submodule generated by Ω is the set of finite linear combinations Σi=1..n ri ωi with ri ∈ R and ωi ∈ Ω. We can check that it is indeed a submodule, and whenever N ⊃ Ω is a submodule we also have N ⊃ ⟨Ω⟩.

Lemma 169
For any two submodules N1 , N2 ⊂ M, the intersection N1 ∩ N2 and sum N1 + N2 are also submodules of M.

The idea is that N1 ∩ N2 is the largest submodule contained in both N1 and N2 , and N1 + N2 is the smallest submodule containing both. So we have containments much like in rings:

[Diagram: the lattice with N1 + N2 on top, N1 and N2 below it, and N1 ∩ N2 at the bottom.]

It then turns out that we have an isomorphism "along the opposite arrows": we have

N1 /(N1 ∩ N2 ) ≅ (N1 + N2 )/N2 ,

defined by sending n + (N1 ∩ N2 ) to n + N2 (as we can check).

Definition 170
Let N ⊂ M be a submodule. Then there is a bijection between submodules of M/N and submodules of M containing N (sending a submodule P ⊇ N of M to P/N, and sending a submodule Q of the quotient to its preimage π −1 (Q) under the projection π : M → M/N).

Definition 171
The dual module of M is the set of module morphisms M ∗ = HomR (M, R).

These dual modules are not as well-behaved as dual spaces of vector spaces – for example, the dual module (Z/(2))∗ is zero, because a morphism f : Z/(2) → Z must satisfy 2f (1) = f (0) = 0, which forces f (1) = 0. So we lose a lot of information when we pass to duals.

Definition 172
The set of endomorphisms of M is denoted EndR (M) = HomR (M, M).

We saw the following as an example last time:

Lemma 173
Let M be an R-module, and let T ∈ EndR (M). Then M is also an R[X]-module by defining (f0 + f1 X + · · · + fd X d )m = f0 m + f1 T m + f2 T 2 m + · · · + fd T d m.

We’ll now mention a submodule construction which does not have any nontrivial analogy for vector spaces:

Definition 174
Let I ◁ R be an ideal. We define the multiplication IM = { Σi=1..n ri mi : ri ∈ I, mi ∈ M }.

For example, (2)(Z ⊕ Z) = (2) ⊕ (2) (by checking inclusions both ways).

Definition 175
Let Ω be a subset of an R-module M. The annihilator of Ω in R is

AnnR (Ω) = {r ∈ R : r ω = 0 ∀ω ∈ Ω}.

We can check that this is always an ideal of R. For example, AnnR (R/I) = I: any r ∈ I kills every element of R/I, and conversely if r (1 + I) = 0 + I then r ∈ I.

Example 176
Last time, we had an example where Q2 is a Q[X]-module, with X acting via the operator T (x, y ) = (y , −x). If we want the annihilator of Q2 in Q[X], then we want the set of polynomials f ∈ Q[X] such that f (T ) = 0.

It must be an ideal of Q[X], so it is principal and thus generated by some minimal polynomial (which in this case is X 2 + 1, since T 2 = −Id). Any other polynomial is of the form q(X)(X 2 + 1) + (aX + b), and this can only be in the annihilator if aT + b = 0, which only happens when a = b = 0.
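We can confirm this numerically (a sketch assuming sympy; in the standard basis the operator is the matrix [[0, 1], [−1, 0]]):

```python
# The operator T(x, y) = (y, -x) on Q^2 and the polynomial X^2 + 1.
from sympy import Matrix, symbols

X = symbols('X')
T = Matrix([[0, 1], [-1, 0]])   # T * (x, y)^t = (y, -x)^t
print(T.charpoly(X).as_expr())  # X**2 + 1 (which here is also the minimal polynomial)
print(T**2 + Matrix.eye(2))     # the zero matrix, so X^2 + 1 annihilates Q^2
```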
Just like for rings, there are two equivalent definitions of noetherianness that are useful:

Lemma 177
Let M be an R-module. Then the following are equivalent:
1. Every submodule is finitely generated,

2. Any non-empty set of submodules contains a maximal element.

The proof here is the same as for ideals in R, and if these properties hold for an R-module M, we call M noetherian.

Lemma 178
We have the following properties:
1. R is noetherian as an R-module if and only if R is noetherian as a ring (because submodules of R are ideals).

2. All submodules and quotients of a noetherian module are noetherian (by property 2, since collections of
submodules in each case can be thought of in terms of the submodules of the whole module).

3. As a kind of converse, if M/N and N are noetherian, then so is M.

4. If M and N are noetherian, then M ⊕ N is noetherian.

5. If R is noetherian and M is a finitely generated R-module, then M is noetherian.

Proof. We already gave the arguments for (1) and (2) above. We'll show (3) and (4) by using property 1: if P is a submodule of M, then N ∩ P is a submodule of N and is thus finitely generated as ⟨n1 , · · · , nr ⟩ for some ni ∈ N. But P/(N ∩ P ) embeds as a submodule in M/N (via the natural map, which is injective), so it is also finitely generated as ⟨m1 + N ∩ P, · · · , ms + N ∩ P ⟩ with mi ∈ P . We can then take ⟨n1 , · · · , nr , m1 , · · · , ms ⟩ to finitely generate P , because any p ∈ P satisfies p − r1 m1 − · · · − rs ms ∈ N ∩ P for some ri ∈ R, and that difference is a linear combination of the ni . And (4) follows by applying (3) to (M ⊕ N)/N ≅ M.
Finally for (5), saying that M is finitely generated says that there is some finite set Ω such that FR (Ω) maps
surjectively into M. But FR (Ω) is the direct sum of finitely many copies of R (which is noetherian) and is thus
noetherian, so M, a quotient of it, is also noetherian.

We’ll now start talking about localizations:

Definition 179
Suppose D ⊂ R is multiplicative. We can define an equivalence relation on M × D via

(m, d) ∼ (n, e) if f (em − dn) = 0 for some f ∈ D.

Then we let D−1 M be the set of equivalence classes under this relation, and we write m/d = [(m, d)].

We claim that D−1 M is a D−1 R-module – indeed, we define addition and scalar multiplication in the usual ways, with m/d + n/e = (em + dn)/(de) and (r /d) · (m/e) = r m/(de) (and we need to check well-definedness and the axioms, but we won't do that here). Then the map M → D−1 M sending m to m/1 is a morphism of R-modules (note that being a D−1 R-module means we're also an R-module). We have a few other analogous definitions to the ring ones:

Definition 180
For any prime ideal ℘ ◁ R, we denote (R − ℘)−1 M = M℘ , and for any f ∈ R, we denote {1, f , f 2 , · · · }−1 M = Mf .

Definition 181
A submodule N ⊂ M is saturated with respect to D if whenever m ∈ M and d ∈ D satisfy dm ∈ N, we must actually have m ∈ N. The D-saturation of N is the submodule {m ∈ M : ∃d ∈ D with dm ∈ N}.

The D-saturation of N is always a submodule containing N, and it is in fact the smallest saturated submodule containing N. We'll do some examples next time!

15 October 28, 2022


We started discussing localization of modules last time – if D ⊂ R is a multiplicative subset and M is an R-module, we let D−1 M be the set of equivalence classes of M × D under the usual equivalence relation, and (as usual) we write m/d for the equivalence class of (m, d) – we saw that this is also a D−1 R-module. We finished by defining the D-saturation of a submodule N ⊂ M, which is the set of m ∈ M such that there is some d ∈ D with dm ∈ N (so "things in N divided by something in D").

Example 182
Let R = Z and D = Z − (3). Then D−1 (Z/(2)) has elements of the form (a + (2))/b for some b not divisible by 3, but that element is equal to (2a + (2))/(2b) = 0/(2b), since 2 ∈ D. So everything in this localization is zero and we get the trivial module.

Example 183
On the other hand, to compute D−1 (Z/(3)), we know there's a Z-linear map Z/(3) → D−1 (Z/(3)), and we want to see if it is injective.

a+(3)
We know that a + (3) is sent to 1 , which cannot be zero unless a = 0 because that would mean a + (3) is a zero
divisor with some d in Z − (3), which is not true (since that would require 3 to divide da). So Z/(3) → D−1 (Z/(3))
a+(3)
is injective, and it is surjective as well (because r + (3) is equal to d if r d ≡ a mod 3, and d is always invertible
−1
mod 3 so we can always find an r ). That means D (Z/(3)) = Z/(3).

Example 184
The D saturation of (3)/(6) as a submodule of Z/(6) (with the same D as before) is the set of elements a + (6)
such that d(a + (6)) ∈ (3)/(6). But this is the same as saying that 3|da, and since 3 can’t divide d we must have
3|a. So (3)/(6) is already saturated.

Example 185
On the other hand, the D saturation of (2)/(6) is the set of a + (6) such that 2|da for some d not divisible by
3. But we can always just take d = 2, so the D saturation is all of Z/(6).

(To see how well this generalizes to other prime ideals, we can try this out with the example R = Z[x] and using
the ideals (x) and (2).) We’ll now list a series of localization properties (some of which we may be asked to prove).
First of all, we have a universal property:

Lemma 186
For any R-module M, recall that there is a natural R-linear map i : M → D−1 M sending m to m1 . Then if N is
a D−1 R-module and f : M → N is R-linear, then there is a unique D−1 R-linear map f˜ : D−1 M → N such that
f˜ ◦ i = f , given by m 1
f˜ = f˜(m).
d d

(We can think of D−1 M as the “simplest D−1 R-module” that inverts all elements of D.) If such a map f˜ exists, it
would have to have that form above because f˜ needs to be D−1 R-linear, but we do have to check that the definition is
m n
well-defined. Indeed, if d = e (meaning c(em − dn) = 0 for some c ∈ D), then we want to check if d1 f (m) = e1 f (m).
1
But because f is R-linear, we have c(ef (m) − df (n)) = 0, which implies that df (m) − e1 f (n) = 0 by dividing by cde
(which is in D).

Lemma 187
Let f : M → N be a morphism of R-modules. Then there is a morphism of D−1 R-modules D−1 f : D−1 M → D−1 N
m f (m)
sending d to d .

Alternatively, we can construct this map with the universal property – we have maps f : M → N and the natural
map N → D−1 N which are both R-linear, so the composition M → D−1 N is R-linear and thus (by the universal
m 1 f (m)
property) factors to a D−1 R-linear map D−1 M → D−1 N. But then any d ∈ D−1 M must be sent to d 1 because
we can follow the images of m under both paths.

47
Lemma 188
We have D−1 (M/N) ∼
= D−1 M/D−1 N, corresponding m+N
d with m
d + D−1 N.

Lemma 189
We have D−1 (M ⊕ N) = D−1 (M) ⊕ D−1 (N). More generally, D−1 commutes with any small colimit and any
finite limit (but not all limits).

Q∞
Z localized at the prime ideal (0), and compare that with ∞
Q Q∞
For example, if we take i=1 i=1 (Z)(0) = i=1 Q. So
(mi ) mi

we have a natural map from the former module to the later, sending d → d , but it’s not an isomorphism (not
surjective) because we have bounded denominators on the left-hand side but not on the right-hand side.

Lemma 190
The localization of a free module is given by D−1 FR (Ω) = FD−1 R (Ω).

Lemma 191
For any ideal I of R, we have D−1 (IM) ∼
= (D−1 I)(D−1 M).

Lemma 192
f g D −1 f D −1 g
If we have a sequence M → − P which is exact at N, then D−1 M −−−→ D−1 N −−−→ D−1 P is exact at D−1 N.
− N→
Furthermore, if 0 → M → N → P → 0 is short exact, then so is 0 → D−1 M → D−1 N → D−1 P → 0.

Lemma 193
Localization preserves kernels and cokernels: if f : M → N is a module morphism, then ker(D−1 f ) = D−1 ker(f )
and coker(D−1 f ) = D−1 (coker f ).

Lemma 194
Let i : M → D−1 M be the natural map. Then we can map between (1) submodules of D−1 M, (2) the D-saturated
submodules of M, and (3) all submodules of M in the following ways:
• (1) → (2): we can pull back a submodule of D−1 M via i −1 and always get a D-saturated submodule of M.

• (3) → (1): since localization preserves injections, we can send a submodule N to D−1 N

• We can map (2) → (3) via inclusion and (3) → (2) via D saturation.
Then the diagram commutes (that is, applying (3) → (1) → (2) is the same as saturation, the maps (2) →
(3) → (2) give us the identity map, and this shows (1) → (2) is a bijection.

All of the properties so far should be fairly routine to check, but this next one is a bit more difficult:

Lemma 195
If R is noetherian and M is finitely generated over R, then

D−1 HomR-mod (M, N) ∼


= HomD−1 R-mod (D−1 M, D−1 N).

48
Example 196
Let R = Z and D = Z − (3). Then D−1 (Z/(6)) = D−1 (Z/(2)) ⊕ D−1 (Z/(3)) (as rings we’d say × but here we’ll
use direct sum), and by functoriality this is D−1 (Z/(2)) ⊕ D−1 (Z/(3)) ∼
= Z/(3) from our earlier calculations. We
can check also that D−1 (Z/(9)) = Z/(9).

We’ve mentioned that D−1 M is the “simplest D−1 R-module,” and motivated by that, let φ : R → S be any ring
morphism. We want a way to get an S-module from an R-module, and for that we’ll make the following definition:

Definition 197
Let φ : R → S be a ring morphism, and let M be an R-module. Then define the tensor product S ⊗R M = S ⊗φ,R M
to be
S ⊗R M = FS (M)/⟨en + em − en+m , er n − φ(r )en : m, n ∈ M, r ∈ R⟩.

By definition this is an S-module (since we have a quotient of a free S-module) – any element of S ⊗R M is a
finite sum of the pure tensors s ⊗ m, and we have the properties t(s ⊗ m) = (ts) ⊗ m, s ⊗ (m + n) = s ⊗ m + s ⊗ n,
(s + t) ⊗ m = s ⊗ m + t ⊗ m, and s ⊗ (r m) = sφ(r ) ⊗ m. Furthermore, we have a natural map M → S ⊗R M sending
m 7→ 1 ⊗ m. But what’s difficult (just like with tensor products of rings) is that we don’t know what all the relations
are because we can have complicated linear combinations. Instead, we work with the universal property:

Lemma 198
Let N be any S-module. Then if f : M → N is R-linear (meaning that f (r m1 + m2 ) = φ(r )f (m1 ) + f (m2 ) for all
m1 , m2 ∈ M and r ∈ R), then there is a unique S-linear map f˜ : S ⊗R M → N such that f is f˜ composed with the
natural map M → S ⊗R M – in particular, we must have f˜(s ⊗ m) = f˜(s(1 ⊗ m)) = sf (m) on the pure tensors.

The only question we must ask is existence, and we do that by constructing the following diagram:
f
M N
f1

FS (M)

S ⊗R M

As sets, we know that M embeds into FS (M), so by the universal property of the free module we can construct
the map f1 given the map f . Then we get f˜ from f1 by the universal property of the quotient, because everything
in the kernel FS (M) → S ⊗R M does get sent to zero (indeed, f1 (en + em − en+m ) = f (n) + f (m) − f (n + m) = 0
and similar for the other generators). Next time, we’ll see that if M is actually a ring, then this coincides with the
definition before of tensor products of rings!

16 October 31, 2022


Last lecture, we introduced tensor products of an R-module with a ring: specifically, if φ : R → S is a ring morphism
and M is an R-module, we defined the S-module S ⊗R M, consisting of finite sums of pure tensors s ⊗ m. In this tensor
product, we turn out to have t(s ⊗ m) = ts ⊗ m, linearity in each argument, and s ⊗ r m = φ(r )(s ⊗ m)(φ(r )s) ⊗ m; we

49
also have a natural R-linear map M → S ⊗R M sending m → 1 ⊗ m. And the key universal property here is that for any
S-module N and any R-linear map f : M → N (with respect to φ, meaning that f (r m1 + m2 ) = φ(r )f (m1 ) + f (m2 )),
we get a unique map f˜ : S ⊗R M → N (sending s ⊗ m to f (s) ⊗ m) such that f factors through S ⊗R M.
We have that S ⊗R − is left adjoint to the forgetful functor φ∗ : S-mod → R-mod, where for any S-module N
this forgetful functor preserves the abelian group structure and r · n = φ(r )n. Then we have

HomS (S ⊗R M, N) ∼
= HomR (M, N).

We’ll state some basic properties of this tensor product mostly without proof:

Lemma 199
The following tensor product constructions are all equivalent for any rings R, S, T and M an R-module:
1. S ⊗R R ∼
= S,

2. R/I ⊗R M ∼
= M/IM,

3. (D−1 R) ⊗R M ∼
= D−1 M,

4. T ⊗S (S ⊗R M) ∼
= T ⊗R M,

5. S ⊗R (M ⊕ N) = (S ⊗R M) ⊕ (S ⊗R N) (because left adjoints always preserve coproducts and the direct


sum is both a product and a coproduct – this is the sort of thing that adjoints are good for),

6. For any set Ω, S ⊗R FR (Ω) = FS (Ω) (again this is because we really have a coproduct).

The first four of these basically follow by checking that we have the same universal property for the left and right
sides.

Lemma 200
If f : M → N is a morphism of R-modules, then there is a unique morphism 1 ⊗ f : S ⊗R M → S ⊗R N sending
s ⊗ m → s ⊗ f (m).

To show existence, we want a map from S ⊗R M → S ⊗R N, which we construct by first constructing a map
M → S ⊗R N sending m to 1 ⊗ f (m). (So we’re plugging in S ⊗R N into the universal property as N.) This is
manifestly well-defined, and it’s linear because

1 ⊗ f (r m1 + m2 ) = 1 ⊗ (r f (m1 ) + f (m2 )) = φ(r )(1 ⊗ f (m1 )) + (1 ⊗ f (m2 ))

by linearity of f and the properties of pure tensors. Thus we must also get a unique map S ⊗R M → S ⊗R N.

Lemma 201
Let φ : R → S and ψ : R → T be ring morphisms. Then the ring tensor product S ⊗ring
R T is isomorphic (as an
S-module) to the module tensor product S ⊗mod
R T that we’ve defined, sending s ⊗ t to s ⊗ t.

(It’s slightly remarkable that our constructions are the same – in one case, we looked at the free module over
elements of T and modded out by relations, and in the other we looked at the polynomial ring over elements of both
S and T and modded out by an ideal.)

50
Proof. To construct a map S ⊗mod
R T → S ⊗ring ring
R T , consider the map T → S ⊗R T sending t to 1 ⊗ t. This
map is R-linear because r t1 + t2 is sent to 1 ⊗ (t1 + r t2 ) = (1 ⊗ t1 ) + r (1 ⊗ t2 ), so it extends to an S-linear map
S ⊗mod
R T → S ⊗ring
R T sending s ⊗ t to s(1 ⊗ t) = s ⊗ t, as desired.
Checking injectivity and surjectivity is generally very hard, so we usually want to construct an inverse map. But for
the other direction the trouble is that we don’t even know that S ⊗mod
R T is a ring – it’s an S-module, but we haven’t
defined any multiplication on it yet. So we should establish a ring structure first. Given any x ∈ S ⊗mod
R T , we need
to define a “multiplication-by-x” map m(x) : S ⊗mod
R T → S ⊗mod
R T (so m(x) ∈ EndS (S ⊗mod
R T )), and now this is a
little easier to deal with because we know what linear maps look like even if we don’t strictly have “multiplication.”
But we also know that we want m(s ⊗ t) to be sent to the map (u ⊗ v 7→ su ⊗ tv ) (that’s what multiplication
by s ⊗ t does), and now we’re in the realm where we can use the universal property: we have a map T → EndR (T )
(sending t to “multiply by t”), which then maps into EndS (S ⊗mod
R T ) (extending by S-linearity, sending f to 1 ⊗ f ).
The composite of these maps m1 : T → EndS (S ⊗mod
R T ) sends t to the map (u ⊗ v 7→ u ⊗ tv ), and this is R-linear,
because u ⊗ (r t1 + t2 )v = r (u ⊗ t1 v ) + (u ⊗ t2 v ) as usual by the basic properties of tensors. Thus by the universal
property of tensor products (of modules), we get a map S ⊗R T → EndS (S ⊗mod
R (T ), which we call m, which sends
s ⊗ t to the map m(s ⊗ t) such that our composite map m1 factors through it, meaning

m(s ⊗ t)(u ⊗ v ) = s(m1 t)(u ⊗ v ) = s(u ⊗ tv ) = su ⊗ tv .

So we do have a unique S-linear map m, and now we can define a product (S ⊗mod
R T ) × (S ⊗mod
R T ) → S ⊗mod
R T
sending (x, y ) → m(x)y – we claim this is the multiplication structure that we want to make S ⊗mod
R T into a ring.
• To check distributivity, we must check that m(x)(y + z) = m(x)y + m(x)z, but this is true by linearity of m(x).
Similarly, we must check that m(x + y )(z) = (m(x) + m(y ))z = m(x)z + m(y )z, which is true by linearity of
m itself.

• To check commutativity, we will make use of distributivity – any x is a finite sum of pure tensors, so we can just
check that m(x)y = m(y )x when x and y are pure tensors. But we already have a formula m(s ⊗ t)(u ⊗ v ) =
su ⊗ tv , and we have commutativity in S and T so this is the same as m(u ⊗ v )(s ⊗ t).

• Similarly, we can reduce associativity to pure tensors (where the result is clear).

• The identity multiplication element is 1 ⊗ 1, and we can check that it is actually an identity by reducing to pure
tensors.
So we’ve constructed a natural multiplication map on S ⊗mod
R T , and to actually get our map from the ring tensor
product to the module tensor product, we need ring morphisms from S and T into S ⊗mod mod
R T . We map T → S ⊗R T
by sending 1 to 1 ⊗ t – this is a morphism of rings because t1 + t2 is indeed sent to 1 ⊗ (t1 + t2 ) and t1 t2 is sent to
1 ⊗ (t1 t2 ) = (1 · 1) ⊗ (t1 · t2 ) = (1 ⊗ t1 )(1 ⊗ t2 ) (by our formula for multiplication on S ⊗mod
R T for pure tensors).
Similarly we map S → S ⊗mod
R T by sending s onto s ⊗ 1. So now we have the familiar tensor product of rings diagram:

φ
R S
ψ

T S ⊗ring
R T

S ⊗mod
R T

The outer diagram commutes because going around the top sends r to φ(r ) to φ(r ) ⊗ 1, and going around the
bottom sends r to ψ(r ) to 1 ⊗ ψ(r ). But the point is that linearity gives 1 ⊗ ψ(r ) = 1 ⊗ ψ(r ) · 1 = φ(r ) ⊗ 1 in the

51
module tensor product. Thus we get a unique dashed map which must send s ⊗ t to (s ⊗ 1)(1 ⊗ t) = s ⊗ t. These
maps are clearly mutual isomorphisms, as desired.

The point is that we actually can do more complicated arguments with universal properties, and not all proofs with
them will be short!
Next time, we’ll look at a way to construct the tensor products of two general R-modules, and this will require us
to think about multilinear algebra:

Definition 202
Let M1 , · · · , Mn , N be R-modules. A map φ : M1 × · · · × Mn → N (where we do not think of the left-hand side as
a direct sum of R-modules, just as a tuple) is multilinear if it is linear if we fix the arguments for all but one Mi . In
other words, φ is multilinear if φ(m1 , · · · , mi−1 , r mi + mi′ , mi+1 , · · · , mn ) = r φ(m1 , · · · , mi−1 , mi , mi+1 , · · · , mn ) +
φ(m1 , · · · , mi−1 , mi′ , mi+1 , · · · , mn ) for all r ∈ R, all 1 ≤ i ≤ n, and all mi , mi′ ∈ Mi .

For example, if R = R and M = R3 , then the map M × M → R sending (x, y ) to the dot product x · y is bilinear
(multilinear for two variables), and the map M × M → M sending (x, y ) to the cross product x × y is also bilinear.
But we’ll get into it more next time.

17 November 2, 2022
We defined multilinear maps last time – given R-modules M1 , · · · , Ma , P , a multilinear map is a map from the set-
theoretic product ψ : M1 × · · · × Ma → P which is linear in each variable when we fix all other variables. The usual
dot (inner) product and cross (vector) product from vector calculus are multilinear maps, and we’ll start today by
mentioning some other important properties:

Definition 203
A multilinear map ψ is symmetric if M1 = · · · = Ma = M (all modules are the same) and for any permutation
σ ∈ Sa , we have ψ(m1 , · · · , ma ) = ψ(mσ(1) , · · · , mσ(a) ).

In other words, it doesn’t matter what order we input our arguments in – the dot product is symmetric but not
the cross product. The cross product instead falls into another category of linear maps:

Definition 204
A multilinear map ψ is alternating if M1 = · · · = Ma = M and whenever mi = mj for some i ̸= j, ψ(m1 , · · · , ma ) =
0.

We can restate this definition to look more similar to the symmetric case:

Lemma 205
If ψ is alternating and σ ∈ Sa is any permutation, then ψ(mσ(1) , · · · , mσ(a) ) = sgn(σ)ψ(m1 , · · · , ma ) (this is
sometimes called being antisymmetric). The converse also holds if 2 ∈ R× .

Proof. Since the permutation group is generated by transpositions, it’s sufficient to check the case where σ is a
transposition. To simplify the notation, we’ll just consider the transposition (12), but the argument is the same in all

52
cases. By multilinearity and the definition of being alternating,

0 = ψ(m1 + m2 , m1 + m2 , m3 , · · · , ma ) = ψ(m1 , m1 , m3 , · · · , ma ) + ψ(m1 , m2 , m3 , · · · , ma )


+ ψ(m2 , m1 , m3 , · · · , ma ) + ψ(m2 , m2 , m3 , · · · , ma ).

But the first and last terms on the right-hand side are zero by the alternating property again, so ψ(m1 , m2 , m3 , · · · , ma ) =
−ψ(m2 , m1 , m3 , · · · , ma ), as desired (because any transposition has sign −1). And for the converse, if ψ(m1 , m1 , m3 , · · · , ma ) =
0, then by the antisymmetric property we have ψ(m1 , m1 , m3 , · · · , ma ) = −ψ(m1 , m1 , m3 , · · · , ma ), so combining terms
and dividing by 2 proves the result.

We will denote the set of bilinear maps (multilinear maps with two inputs) M1 × M2 → P by BilR (M1 × M2 , P ), and
we could do something similar for the set of multilinear maps. This is not only a set but also naturally an R-module,
since (r ψ + φ) is a bilinear map if ψ and φ are by setting

(r ψ + φ)(m1 , m2 ) = r ψ(m1 , m2 ) + φ(m1 , m2 ).

Lemma 206
There is a natural isomorphism BilR (M1 × M2 , P ) ∼
= HomR (M1 , HomR (M2 , P )) sending ψ to (m1 → (m2 →
ψ(m1 , m2 ))) in the forward direction and sending any f to ((m1 , m2 ) 7→ f (m1 )(m2 ).

We can check that both of these are R-linear maps, they do indeed end up being module morphisms / bilinear
maps in the corresponding directions, and they are inverses of each other. It turns out all of this helps us define the
tensor product of two R-modules:

Proposition 207
There is a universal multilinear map from M1 × · · · × Ma to an R-module that we will denote M1 ⊗ · · · ⊗ Ma (called
the tensor product of M1 , · · · , Ma ), sending (m1 , · · · , ma ) → m1 ⊗ · · · ⊗ ma , so that if ψ : M1 × · · · × Ma → P
is any multilinear map, then there is a unique R-linear (not multilinear) map ψ̃ : M1 ⊗ · · · ⊗ Ma → P such that ψ
factors through M1 ⊗ · · · ⊗ Ma via ψ̃.

Proof. We use the usual construction: consider the free R-module FR (M1 × · · · × Ma ) with one generator for each
element of the set-theoretic product. Then any function M1 × · × Ma → P factors through FR (M1 × · · · × Ma ), but
we need some additional constraints if we want this to be true for only multilinear maps. Thus, we must mod out by
some relations, so we will define
D E
FR (M1 × · · · × Ma )/ e(m1 ,··· ,mi +r mi′ ,··· ,ma ) − e(m1 ,··· ,mi ,··· ,ma ) − r e(m1 ,··· ,mi′ ,··· ,ma ) ∀i , mi′ ∈ Mi , mj ∈ Mj , r ∈ R

FR (M1 × · · · × Ma ) surjects onto M1 ⊗ · · · ⊗ Ma , and for any multilinear map M1 × · · · × Ma → P everything in the
kernel that we modded out by above gets sent to zero under ψ. Thus we do indeed get a unique map (by universal
property of the quotient) M1 ⊗ · · · ⊗ Ma → P .

With this construction, we can define the pure tensors

m1 ⊗ · · · ⊗ ma = [e(m1 ,··· ,ma ) ],

which span M1 ⊗ · · · ⊗ Ma , and we see that we must actually have ψ̃ defined as

ψ̃(m1 ⊗ · · · ⊗ ma ) = ψ(m1 , · · · , ma ).

53
Also, we see immediately from the relations that we have

m1 ⊗ · · · ⊗ (mi + r mi′ ) ⊗ · · · ⊗ ma = m1 ⊗ · · · ⊗ mi ⊗ · · · ⊗ ma + r m1 ⊗ · · · ⊗ mi′ ⊗ · · · ⊗ ma .

But as usual, we don’t want to work with the actual construction of the tensor product – it’s possible to map into the
tensor product, but it’s hard to define maps out of the tensor product from the generators because there are lots of
relations that are hard to check.

Lemma 208
There is a universal symmetric multilinear map M ×· · ·×M → S a (M) (this latter R-module is called the symmetric
power) such that any symmetric multilinear map ψ : M × · · · × M → P factors through a map ψ̃ : S a (M) → P .

Basically the same proof works – we look at the free module FR (M × · · · × M) and mod out by the relations above,
but also mod out by e(m1 ,··· ,ma ) − e(mσ(1) ,··· ,mσ(a) ) . And we’ll often just write the elements of S a (M) as m1 ⊗ · · · ⊗ ma
like the tensor product.

Lemma 209
There is a universal alternating multilinear map M × · · · × M → Λa (M) such that any alternating multilinear map
ψ : M × · · · × M → P factors through a map ψ̃ : Λa (M) → P .

Again we just write down similar relations as before, and the notation we use is that (m1 , · · · , ma ) is sent to
m1 ∧ · · · ∧ ma in Λa (M).

Lemma 210
Suppose that fi : Mi → Ni is R-linear. Then there is a unique R-linear map f1 ⊗ · · · ⊗ fa : M1 ⊗ · · · ⊗ Ma →
N1 ⊗ · · · ⊗ Na such that m1 ⊗ · · · ⊗ ma is mapped to f1 (m1 ) ⊗ · · · ⊗ fa (ma ).

Proof. Such a map must be unique if it exists (since we define it on all pure tensors and can extend by linearity).
To show existence, we first define a multilinear map ψ : M1 × · · · × Ma → N1 ⊗ · · · ⊗ Na sending (m1 , · · · , ma ) to
f1 (m1 ) ⊗ · · · ⊗ fa (ma ). This is indeed R-multilinear because

ψ(m1 , · · · , mi + r mi′ , · · · , ma ) = f1 (m1 ) ⊗ · · · ⊗ fi (mi + r mi′ ) ⊗ · · · ⊗ fa (ma )


= f1 (m1 ) ⊗ · · · ⊗ (fi (mi ) + r fi (mi′ )) ⊗ · · · ⊗ fa (ma )
= f1 (m1 ) ⊗ · · · ⊗ fi (mi ) ⊗ · · · ⊗ fa (ma ) + r f1 (m1 ) ⊗ · · · ⊗ fi (mi′ ) ⊗ · · · ⊗ fa (ma )

by R-linearity of fi . Thus by the universal property of the tensor product we do indeed get f1 ⊗· · ·⊗fa : M1 ⊗· · ·⊗Ma →
N1 ⊗ · · · ⊗ Na , doing the right thing to pure tensors.

Similar results hold for the subclasses of multilinear maps as well:

Lemma 211
If f : M → N is R-linear, then there is a unique R-linear map S a (f ) : S a (M) → S a (N) sending m1 ⊗ · · · ⊗ ma to
f (m1 ) ⊗ · · · ⊗ f (ma ). Similarly, there is a unique R-linear map Λa (f ) : Λa (M) → Λa (N) sending m1 ∧ · · · ma to
f (m1 ) ∧ · · · ∧ f (ma ).

54
We prove these facts similarly by constructing a map starting with M × · · · × M and use the fact that we map into
S (N) or Λa (N) to verify the additional requirement for the universal properties.
a

Remark 212. Sometimes tensor products behave in a funny way and we should be careful: for example, take the ideal
(X, Y ) of C[X, Y ], which is torsion-free (meaning that no non-zero divisor of the ring kills any element of the ideal),
but (X, Y ) ⊗R (X, Y ) turns out to have torsion. So it takes some time to get the right intuition for all of this.

Lemma 213
Let σ ∈ Sa . Then there is an R-linear isomorphism σ ∗ : M1 ⊗ · · ·⊗Ma → Mσ(1) ⊗ · · ·⊗Mσ(a) sending m1 ⊗ · · ·⊗ma
to mσ(1) ⊗ · · · ⊗ mσ(a) .

Proof. As usual, first write down a multilinear map M1 × · · · × Ma → Mσ(1) ⊗ · · · ⊗ Mσ(a) sending (m1 , · · · , ma ) 7→
mσ(1) ⊗ · · · ⊗ mσ(1) ; this is multilinear so it factors through the tensor product. To show this is an isomorphism, we
can also check that (στ )∗ = τ ∗ σ ∗ by checking that the maps agree on pure tensors (we should be careful about the
reversing of order and check it carefully), so id = id∗ = (σ ◦ σ −1 )∗ = (σ −1 )∗ ◦ σ ∗ (and the same in the other direction)
so we do have an inverse map.

Lemma 214
For any a > b, we have

(M1 ⊗ · · · ⊗ Mb ) ⊗ (Mb+1 ⊗ · · · ⊗ Ma ) ∼
= M1 ⊗ · · · ⊗ Ma .

Proof to be continued. The forward direction is a bit tricky, because we need to construct a bilinear map and the
universal property doesn’t tell us anything about how to construct such maps. So we start with the reverse direction:
we can construct a map M1 × · · · × Ma → (M1 ⊗ · · · ⊗ Mb ) ⊗ (Mb+1 ⊗ · · · ⊗ Ma ) sending (m1 , · · · , ma ) to (m1 ⊗
· · · ⊗ mb ) ⊗ (mb+1 ⊗ · · · ⊗ ma ). This is indeed a multilinear map, which is true by the multilinearity of tensor products
twice (on the inner one, then the outer one), so we get the desired map in the reverse direction. We’ll do the other
direction next time.

18 November 4, 2022
We discussed properties of tensor products of modules last time – we’ll go through material a bit faster than usual
today because we’re a bit behind. We started showing last time that the tensor product (M1 ⊗ · · · ⊗ Ma ) is isomorphic
to the tensor product of (M1 ⊗ · · · ⊗ Mb ) and (Mb+1 ⊗ · · · ⊗ Ma ) (sending pure tensors in the way that is visually clear
– m1 ⊗ · · · ⊗ ma is sent to (m1 ⊗ · · · ⊗ mb ) ⊗ (mb+1 ⊗ · · · ⊗ ma and vice versa). Starting from (M1 ⊗ · · · ⊗ Ma ) and
constructing a map to the other space is easy, because we can map from M1 × · · · × Ma into (M1 ⊗ · · · ⊗ Mb ) and
(Mb+1 ⊗ · · · ⊗ Ma ) and show that it is R-linear. But the proof in the other direction is slightly more complicated:

Proof. To construct a map (M1 ⊗ · · · ⊗ Mb ) × (Mb+1 ⊗ · · · ⊗ Ma ) to M1 ⊗ · · · ⊗ Ma , we need that map to be bilinear.


Recall that we have the isomorphism BilR (M1 × M2 , P ) ∼
= HomR (M1 , HomR (M2 , P )), so we want to construct a map
Hom(M1 ⊗ · · · ⊗ Mb , Hom(Mb+1 ⊗ · · · ⊗ Ma , M1 ⊗ · · · ⊗ Ma )). Indeed, the construction should be

m1 ⊗ · · · ⊗ mb 7→ (mb+1 ⊗ · · · ⊗ ma 7→ m1 ⊗ · · · ⊗ ma ) .

55
We must check that for any fixed m1 , · · · , mb , we do indeed have such a morphism of R-modules. But we know
that the map from M1 × · · · × Mb (the product, not tensor product) to Hom(Mb+1 ⊗ · · · ⊗ Ma , M1 ⊗ · · · ⊗ Ma ) is
an R-multilinear map, so it factors through the tensor product and thus we do have a linear map f(m1 ,··· ,mb ) for each
m1 , · · · , mb which is well-defined. (In particular, this tells us that even if we have a weird combination of pure tensors
which is actually zero, it will get sent to zero.) So now the map f : M1 × · · · × Mb (again product, not tensor product)
to Hom(Mb+1 ⊗ · · · ⊗ Ma , M1 ⊗ · · · ⊗ Ma ) sending (m1 , · · · , mb ) → f(m1 ,··· ,mb ) is multilinear (because f(r m1 +m1′ ,m2 ,··· ,mb )
and r f(m1 ,··· ,mb ) + f(m1′ ,m2 ,··· ,mb ) evaluate to the same thing on all pure tensors by multilinearity of the tensor product in
each entry). Thus f also factors through the tensor product to a map f˜ such that f˜(m1 ⊗· · ·⊗mb ) = (mb+1 ⊗· · ·⊗ma ).
To make that into a bilinear form, we define φ(x, y ) = f˜(x)(y ). Such a bilinear form is a bilinear map (M1 ⊗ · · · ⊗
Mb ) × (Mb+1 ⊗ · · · ⊗ Ma ) to M1 ⊗ · · · ⊗ Ma , meaning it factors to the tensor product. So we have the map in the
other direction as what we constructed last time, and it does so in a way that makes the two maps inverses (because
it is on all pure tensors). Thus we have the desired isomorphism.

We’ll now go through some other properties:

Lemma 215
Let ψ : R → S be a ring morphism and M be an R-module. Then the tensor product S ⊗mod
R M can be thought
of as a getting an S-module from an R-module under the action of S (the ring-module tensor product we first
defined, giving us an S-module and thus an R-module via ψ), or as an R-module S ⊗bil
R M (where we think of
both S and M as R-modules to start with and then do our bilinear map construction). But these constructions
are isomorphic by sending s ⊗ m to s ⊗ m in both directions.

Much like in the previous argument, going backwards is easy, but going forward is tricky because we first have to
show that the R-module is actually an S-module. So we have to explain how multiplication by an element of S actually
looks before we can invoke the universal property.

Lemma 216
If M, N, P are all R-modules, then M ⊗ (N ⊕ P ) ∼
= (M ⊗ N) ⊕ (M ⊗ P ), sending m ⊗ (n, p) to (m ⊗ n, m ⊗ p) and
vice versa.

(This turns out to be true for arbitrary direct sums, not just finite ones.)

Lemma 217
As a special case, we have FR (X) ⊗ FR (Y ) ∼
= FR (X × Y ), sending ex ⊗ ey to e(x,y ) and vice versa.

Lemma 218
The ath symmetric power S a (FR (X)) is isomorphic to the free module FR (S a (X)), where S a (X) is the a-fold
product X × · · · × X quotiented out by the action of the symmetric group Sa (in other words, multisets of a
elements), where we get an isomorphism via ex1 ⊗ · · · ⊗ exa to e[(x1 ,··· ,xa )] .

So on free modules, tensor products behave quite well – it’s only when we get to arbitrary tensor products that
things get messier.

56
The corresponding ath exterior power Λa (FR (X)) can be described in a more complicated way: let < be a total
order on X (which always exists but is not unique). Then we define

Λa< (X) = {(x1 , · · · , xa ) : xi ∈ X, x1 < · · · < xa },

Lemma 219
We have an isomorphism Λa (FR (X)) ∼
= FR (Λa< (X)), sending ex1 ∧ · · · ∧ exa to 0 if xi = xj for some i ̸= j and
sgn(σ)e(xσ(1) ,··· ,xσ(a) ) if σ is the permutation that puts the xi s in increasing order.

For example, if R⊕a is the direct sum of a copies of R, which is a free module on {1, · · · , a}, then there is only
one element of Λa< (X), so
Λa (R⊕a ) ∼
= R,

though this isomorphism is not canonical. Similarly, we have Λa−1 (R⊕a ) ∼


= R⊕a , because there are a ways to pick an
increasing sequence of length (a − 1)on a elements.

Lemma 220
There is a map ψ : M → HomR (Λa M, Λa+1 M) sending m to the map ψ(m), where

ψ(m)(m1 ∧ · · · ∧ ma ) = m1 ∧ · · · ∧ ma ∧ m.

(We could have also wedged m in at the beginning.) It’s easy to see that this is indeed a valid morphism and that
ψ is R-linear.

Lemma 221
The map ψ described above is functorial: if f : M → N is a module morphism, then we can construct a map
Λa M → Λa+1 N in two ways which are equivalent:

Λa+1 f ◦ ψ(m) = ψ(f (m)) ◦ Λa f .

Proposition 222
Let f ∈ EndR (R⊕a ), which is equivalent to specifying an a × a matrix with entries in R. Then Λa f : Λa (R⊕a ) →
Λa (R⊕a ) is a morphism between rank-1 modules, so taking a basis element e, we know that e must be sent to
some multiple of e. We write that e is sent to det(f )e, where det is the determinant of f .

This determinant turns out to be the usual one from linear algebra: we can check that det(f ◦ g) = det(f ) det(g),
P
and we can check that we recover the usual formula if we let e = e(1,··· ,a) and represent f ei = j bij ej for some
bij ∈ R. We will indeed find (plugging in e1 ∧ · · · ∧ ea ) that
X
det(f ) = sgn(σ)b1,σ(1) · · · ba,σ(a) .
σ∈Sa

But continuing to fix an endomorphism f , recall also that we can send R⊕a to Hom(Λa−1 R⊗a , Λa (R⊕a )) by Lemma 220,
and in this case because we have free modules this is actually an isomorphism. We can then map Hom(Λa−1 R⊗a , Λa (R⊕a ))
to itself by precomposing by Λa−1 f . We end up getting the following commutative diagram to construct the blue map
(by applying ψ, then − ◦ Λa−1 f , then the inverse of ψ), which we call the adjugate of f :

57
ψ
R⊕a Hom(Λa−1 R⊕a , Λa R⊕a )
adj(f ) −◦Λa−1 f
ψ
R⊕a Hom(Λa−1 R⊕a , Λa R⊕a )

It turns out that if we unravel the definitions,

adj(f ) ◦ f = det(f )id.

(And with this, we can recover the matrix cofactor expression for the inverse of a matrix.) The identity is also true if
we reverse the two terms on the left-hand side, but it somehow becomes more difficult to prove.

Definition 223
If f ∈ EndR (R⊕a ), then we can think of f as an element of EndR[T ] (R[T ]⊕a ), since R[T ]⊕a is really R[T ] ⊗R R⊕a
and we can think of f as 1 ⊗ f . If T is the multiplication by T endomorphism, then

det T idR[T ]⊕a − 1 ⊗ f

is an element of R[T ], which we call the characteristic polynomial of f , which we denote Cf (T ).

Proposition 224 (Cayley-Hamilton)


We have Cf (f ) = 0. (In linear algebra terms, if we plug in a matrix into its own characteristic polynomial, we will
always get zero.)

Proof. By our definition above, we know that (an equality of endomorphisms)

det (T id − f ) = adj(T id − f ) · (T id − f ).

The left-hand side is a polynomial in T with coefficients ci in R. Meanwhile, the adjugate is an endomorphism of
R[T ]⊕a , but we can write it as a polynomial in T with coefficients Bi in EndR (R⊕a ). Setting T -coefficients equal, we
see that Bi−1 − Bi ◦ f = ci for all i . But now
X
Cf (f ) = (Bi−1 − Bi ◦ f )f i ,
i

and we see that the coefficient of each f i cancels out by telescoping sum, so Cf (f ) = 0 as desired.

Corollary 225 (Nakayama’s lemma)


Let M be a finitely generated R-module and I be an ideal of R. Suppose that IM = M (recall IM is the submodule
of M in which everything is multipled by an element of I). Then there is some r ∈ I such that (1 + r )M = 0.

Proof. Since M is finitely generated by some m1 , · · · , ma , we get a map π : R⊕a → M. Then IM = M means that
X
mi = aij mj

for some aij ∈ I, so the linear map from the matrix A = (aij ) is such that we actually have π = π ◦ (aij ) (as in the
diagram below):

58
π
R⊕a M
A=(aij ) =

π
R⊕a M
So for any polynomial f ∈ R[T ], we also get a diagram commuting as below:
π
R⊕a M
f (A) f (1)
π
R⊕a M

In particular, taking f to be the characteristic polynomial of A and using Cayley-Hamilton, we see that charA (1) = 0,
so charA (1)M = 0. But expanding out the polynomial, we see that charA (T ) has all T -coefficients in I except the
leading coefficient 1T a , so charA (1) is indeed 1 + r for some r ∈ I, as desired.

Corollary 226
Let M be a torsion-free, finitely-generated R-module (meaning that for any nonzero r ∈ R and m ̸= 0 in M,
r m ̸= 0). Then if I is a proper ideal and IM = M, then M = (0).

Proof. By Nakayama’s lemma, we know that (1 + r )M = (0) for some r ∈ I, so (1 + r )m = 0 for any m ∈ M. Then
1 + r ̸∈ I (otherwise 1 would be in I), so in particular it is nonzero. Since M is torsion-free this means m must be
zero.

Corollary 227
Let M be a finitely-generated R-module, and suppose I is contained in all maximal ideals of R. If IM = M, then
M = (0).

Proof. We know that r ∈ I is in all maximal ideals, but 1 + r is in no maximal ideal (otherwise (1 + r ) − r = 1 would
be in a maximal ideal). Thus it must be a unit, so if (1 + r )M = (0) then M must indeed be (0).

Corollary 228
Suppose M is finitely generated over R, I is an ideal contained in all maximal ideals, and m1 , · · · , ma ∈ M with M
generated as ⟨m1 , · · · , ma , IM⟩. Then we can suppress IM as a generator, and we in fact have M = ⟨m1 , · · · , ma ⟩.

In other words, we can produce generators mod IM and use that to generate M.

Proof. Apply the previous result to the quotient M/⟨m1 , · · · , ma ⟩.

19 November 7, 2022
We’ll be discussing finitely generated modules over a PID this week, starting with the main result:

59
Theorem 229
Let R be a principal ideal domain, and suppose N is a submodule of a free module R⊕n . Then N is free, and there
is a basis e1 , · · · , en of R⊕n and elements a1 |a2 | · · · |am of R (with am ̸= 0) such that a1 e1 , · · · , am em form a basis
for N. Also, these ai s are unique up to associates.

In other words, we can choose a basis for the larger module so that a subset of those basis elements, scaled
appropriately, gives us a basis for the smaller module. Before proving this (which is the least important part), we’ll
state a few of the theorem’s useful applications and work through some examples.

Corollary 230
Let R be a principal ideal domain, and let M be a finitely generated R-module. Then there are elements
a1 |a2 | · · · |am of R (with am ̸= 0), with a1 not a unit, and some integer d ≥ 0, such that
m
M∼
M
= R⊕d ⊕ R/(ai ).
i=1

Also, d, m, and (ai ) are uniquely determined up to associates by the module M. We call the ai s invariant factors
of M.

In other words, we have some number of copies of the whole ring R. This follows from the theorem before,
because we can pick some n generators for M and get a surjection π : R⊕n → πM. Then applying the kernel of π to
Theorem 229, we see that
m
M∼
= R⊕n / ker π ∼
M
= R⊕(n−m) ⊕ R/(ai ),
i=1

and where we can drop any terms with ai a unit (and absorb them into the free part). Notice also that the Chinese
remainder theorem says that whenever (a, b) = R, R/(ab) ∼ = R/(a)⊕R/(b), so we can split up each ai into irreducibles
Q m (π)
ππ (times a unit). This gives us the following reformulation:
i

Corollary 231
Again let R be a PID and M a finitely generated R-module. Then we have
n(π)
M∼
M M
= R⊕d ⊕ R/(π mi (π) ),
(π) prime ideal ̸=(0) i=1

where n(π) = 0 for all but finitely many π and the integers mi (π) are positive and in nondecreasing order. Also,
M uniquely determines d, n(π), and mi (π).

We’ll start by thinking about the case of finitely-generated Z-modules (so finitely generated abelian groups):

Example 232
We can list all isomorphism classes of abelian groups of order 16.

Listing out ways to split up 16 into powers of 2, we have the abelian groups

Z/(16), Z/(2) × Z/(8), Z/(4) × Z/(4), Z/(2) × Z/(2) × Z/(2) × Z/(2), Z/(2) × Z/(2) × Z/(4).

(since we need a product of positive integers a1 a2 a3 · · · multiplying to 16 with none equal to 1 and a1 |a2 |a3 | · · · ).

60
Example 233
Next, we can calculate all abelian groups M that fit into a short exact sequence

0 → Z ⊕ Z/(5) → M → Z ⊕ Z/(10) → 0.

We know that M must have the form we’ve been discussing – we can determine some of the invariants with some
algebraic manipulation. Notice that if M ∼ R/(ai ), then we can localize at 0 and get M(0) ∼
Lm
= R⊕d ⊕ i=1 = R⊕d , so (0)
localizing our short exact sequence yields

0 → Q → M(0) → Q → 0,
Lm
which means M(0) = Q⊕2 just by checking dimensions and thus M must be Z2 ⊕ i=1 Z/(ai ) (with 1 < a1 |a2 | · · · |am ̸=
0). But now we can consider the torsion submodule

M tor = {m ∈ M : ∃a ̸= 0 ∈ R with am = 0}.

The sum of two elements of M tor is still in M tor , because a1 m1 = 0 and a2 m2 = 0 implies (a1 a2 )(m1 + m2 ) = 0. And
this is closed under scalar multiplication as well. So now because R is an integral domain, R⊕d has no torsion, and
thus M tor ∼
Lm
= i=1 R/(ai ) – this is a way for us to “get rid of the free part.” So the torsion of Z ⊕ Z/(5) is Z/(5), and
things with torsion must go to things with torsion (because multiplying by the corresponding element of R would still
kill our element even after being mapped) – this means we have an exact sequence

0 → Z/(5) → M tor → Z/(10)


×2
(we have injectivity on the left, but we do not need to have surjectivity – consider 0 → Z −→ Z → Z/(2) → 0; taking
torsion gives 0 → 0 → 0 → Z/(2) → 0), but this still tells us that the order of M tor must be divisible by 5 and must
be a factor of 50. So that tells us that 5|a1 · · · am |50 and a1 | · · · |am ; this only works if we have one of the sequences
(ai ) = (5), (10), (25), (50), (5, 5), (5, 10).
So there are at most six possibilities for fitting into the short exact sequence, but we don’t actually know that any
of those work yet:
• Suppose we want to construct a short exact sequence 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(5) → Z ⊕ Z/(10) → 0.
Then we could send (a, b) to (10a, 0, b) (so that the cokernel matches up with Z ⊕ Z/(10)), and then we could
send (x, y , z) → (y , x mod 10). So this possibility can arise.
• Next, we try 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(10) → Z ⊕ Z/(10) → 0. This time, we can send (a, b) to (5a, 0, 2b),
so that the cokernel is now Z/(5) ⊕ Z ⊕ Z/(2) = Z ⊕ Z/(10) (by the Chinese remainder theorem). So this works
as well if we send (x, y , z) to (y , 5z + 2x) in the second map (we can check exactness at all points explicitly).
• Now we can try 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(5) ⊕ Z/(5) → Z ⊕ Z/(10) → 0 – this time, we send (a, b) to
(2a, 0, b, 0), so the cokernel is Z/(2) ⊕ Z ⊕ (0) ⊕ Z/(5), which is again correct. Then we send (x, y , z, w ) to
(y , 2w + 5x) much like before, and this is also possible.
• Similarly, we can construct 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(25) → Z ⊕ Z/(10) → 0 by sending (a, b) to (2a, 0, 5b)
– the cokernel works out again, and we need to send (x, y , z) to (y , 5x + 2(zmod 5)), giving us another valid
exact sequence.
• It is easy to construct 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(5) ⊕ Z/(10) → Z ⊕ Z/(10) → 0 by sending (a, b) to
(a, 0, b, 0) and then sending (x, y , z, w ) to (y , w ) – this is clearly exact too.

61
• Finally, we can construct 0 → Z ⊕ Z/(5) → Z ⊕ Z ⊕ Z/(50) → Z ⊕ Z/(10) → 0 by sending (a, b) to (a, 0, 10b),
giving us yet another exact sequence.

Basically, we can just try to construct two valid maps by checking that cokernel is of the right type, show injectivity
and surjectivity of the first and second maps, and then check exactness at the middle. So in this case, this strategy
works, but it’s generally surprising how many of the possibilities will work and there typically has to be a “good reason”
for it to fail. (For example, one situation where we do run into issues is that M cannot use more generators than the
first and last modules combined.)
Our next application will be to linear algebra:

Example 234
Let K be a field, and let V be a finite-dimensional K-vector space. Given a linear map T ∈ EndK (V ), we can think
P i
ai x v = ai T i (v ). Then V is finitely generated
P
of V as a K[x]-module in which x acts by T , meaning that
over K[x] (because it was already finitely generated over K), so we have an isomorphism as K[x]-modules
m
V ∼
M
= K[x]⊕d ⊕ K[x]/(ai )
i=1

with a1 | · · · |am ̸= 0 are all polynomials over K, which we can assume to be monic by appropriately multiplying,
and a1 is not a unit (meaning it has positive degree).

In such a case, we know that d is actually zero, because K[x] is already infinite-dimensional over K and V has to
be finite-dimensional, so we just have a direct sum of K[x]/(ai )s. Furthermore, K[x]/(ai ) has dimension deg(ai ) (it
has a K-basis {1, x, · · · , x deg(ai )−1 } by the division algorithm), so we must have dim V = m
P
i=1 deg(ai ). And since T
acts as multiplication by x, the action of T on each K[x]/(ai ) is also multiplication by x.
Now if we have another vector space V ′ with action T ′ , we can write V ′ = ∼ Lm′ K[x]/(ai ). Then there is an
i=1

→ V ′ if and only if f ◦ T = T ′ ◦ f (in other words, there is a linear map that commutes with the
isomorphism f : V −
structure of K[x]-modules), and we know this only happens if m = m′ and ai = ai′ for all i . So this gives us a way to
test whether two vector spaces with an endomorphism are isomorphic with that endomorphism. We’ll talk more about
this next time!

20 November 9, 2022
Last time, we stated the classification theorem for finitely generated modules M over a PID R, saying that such
modules always take the form M = ∼ R⊕d ⊕ Lm R/(ai ), where d is some nonnegative integer, a1 | · · · | · · · am (with
i=1
a1 not a unit and am nonzero), where the ideals (ai ) are uniquely determined (meaning the ai s are determined up to
associates) and so are d and m. We saw some examples in the context of abelian groups, and we started looking at
the case where R = K[x] (in which case giving a module is like giving a vector space over K plus an endomorphism
on that space). In particular, where V is finite-dimensional over K, we know there are a1 | · · · |am in K[x] (we called
these invariant factors and we can take them to be monic) with all degrees positive, such that
m
V ∼
M
= K[x]/(ai (x)),
i=1

where the action of our endomorphism T is multiplication by x (in each direct summand) and dim V = m
P
i=1 dim K[x]/(ai (x)) =
Pm
i=1 deg(ai ). This then lets us write down a matrix form for T (multiplying by x): it is block diagonal (with one block

62
for each ai ), where if we pick the monomials as a basis, each block looks like
 
0 0 ··· 0 −ai,0
··· −ai,1
 
1 0 0 
 
T = 0 1
 ··· 0 −ai,2 .

K[x]/ai (x) . . .. ..
. . .. 
. . . . .


0 0 ··· 1 −ai,deg(ai )−1

(The last row here comes from the fact that ai,0 + ai,1 x + · · · + ai,deg(ai )−1 x deg(ai −1) + x deg(ai ) = 0.) A matrix with
diagonal blocks that look like this is said to be in rational canonical form, and it is unique because given a matrix
of this form representing T (where a1 (x)|a2 (x)| · · · |am (x) we know that each block is isomorphic to K[x]/(ai (x)) and
the conditions are satisfied. And from this we can also calculate the characteristic polynomial of T – it turns out that
we actually have
charT (x) = a1 (x) · · · am (x).

Indeed, we must calculate the polynomials


 
x 0 ··· 0 −ai,0
··· −ai,1
 
1 x 0 
 
 ?
det 0 1 ··· 0 −ai,2  = a0 + a1 x + · · · + ad x d

. .. .. .. .. 
. .
. . . .


0 0 ··· 1 x − ai,deg(ai )−1

and multiply them together across all i to get the result, and this is indeed true by induction if we expand the
determinant along the first column.

Lemma 235
There exists a monic polynomial m(x) such that mT (T ) = 0 and such that whenever f (T ) = 0 we have mT |f ;
we call this the minimal polynomial.

In the particular case above, because a1 | · · · |am , the minimal polynomial is mT (x) = am (x).

Proof. Consider the annihilator


AnnK[x] (V ) = {f ∈ K[x] : f (T ) = 0};

since the annihilator is an ideal and R is a PID, this must be equal to (mT ) for some mT unique up to units and thus
unique if we insist that it is monic.

Last time, we mentioned that for any (V, T ) and (V ′ , T ′ ) (for V, V ′ finite-dimensional K-vector spaces) and T, T ′
corresponding endomorphisms, we see that V ∼= V ′ as a K[x] module if and only if there is some isomorphism g : V → V ′
with g ◦ T = T ′ ◦ g (meaning that we have the same number of invariant factors of T and of T ′ and ai = ai′ for all i ).
This can in fact be restated in terms of matrices:

Definition 236
Two matrices A, B ∈ Mn×n (K) are similar if they are conjugate, meaning that there is some g ∈ GLn (K) with
B = gAg −1 .

63
In other words, similarity means that the K[x]-modules (K ⊕n , A) and (K ⊕n , B) are isomorphic via a map g, since
that is the same as saying that Bg = gA. And that’s then equivalent to saying that A and B must have the same
invariant factors. (So in some sense, this is a better version of Jordan normal form.)

Example 237
With this method, we can determine the conjugacy classes in a group like GL3 (F2 ) = GL3 (Z/(2)).

It suffices to look for monic invariant factors a1 | · · · |am ∈ Z/(2)[x], such that all degrees are positive and
P
deg(ai ) = 3. But we also need an additional condition – remember that the constant term of the characteris-
tic polynomial is the determinant, which we want to be nonzero because we want invertible matrices. Thus we must
require that x does not divide any am .
But now we can just list possibilities: in one case we can have m = 1, meaning we have a single cubic polynomial
with nonzero constant term. The possibilities here are x 3 + 1, x 3 + x 2 + 1, x 3 + x + 1, and x 3 + x 2 + x + 1. Alternatively,
we can have m = 2, in which case we must have a linear and a quadratic polynomial. So then a1 must be x + 1 and
a2 = (x + 1)2 = x 2 + 1. Finally, if m = 3, then we can have a1 = a2 = a3 = x + 1. Putting all of this together, we
see that there are six conjugacy classes of matrices. We can then go back to our rational canonical form and find a
representative for each class: in the order we listed them, they are
          
0 0 1 0 0 1 0 0 1 0 0 1 1 0 0 1 0 0
          
1 0
 0 , 1 0
  0 , 1
  0 1 , 1 0 1 0
    0 1 , 0
  1 0

0 1 0 0 1 1 0 1 0 0 1 1 0 1 0 0 0 1

(colors indicating diagonal blocks in the last two cases).

Example 238
Next, we’ll determine the number of conjugacy classes of g ∈ GL3 (K) such that g 5 = 1 (just to avoid having to
write out the matrices).

P
This time, we’re again looking for monic polynomials a1 | · · · |am such that all degrees are positive and deg(ai ) = 3,
5 5
but if g = 1 that means the minimal polynomial mg (x) = am (x) must divide x − 1 (and in particular this does mean
the constant terms will be nonzero so we will get an invertible matrix).
We’ll first do the case where K = C. Factoring over the complex numbers,

x 5 − 1 = (x − 1)(x − ζ)(x − ζ 2 )(x − ζ 3 )(x − ζ 4 )

where ζ = e 2πi/5 . We can again do casework on m: for m = 1, we take any cubic factor of x 5 − 1, so there are
5
 i
3 = 10 ways to pick three distinct ζ s. For m = 2, we pick any linear factor (in 5 ways), and then we pick any
quadratic factor which includes that linear factor (in 4 more ways), giving us 5 · 4 = 20 conjugacy classes. Finally, for
m = 3 we just have 5 ways to pick a single linear factor three times. This gives us 10 + 20 + 5 = 35 conjugacy
classes in total.
Next, we can consider the case K = R. This time for m = 1, we again want a1 (x) + (x − ζ1 )(x − ζ2 )(x − ζ3 ),
but in order for the polynomial to be real all roots must come in conjugate pairs. This means we must have 1 and
a pair of the other roots, giving us 2 possibilities. But for m = 2, there are no possibilities – we would need to have
a1 (x) = (x − 1), but then a2 (x) is (x − 1)(x − ζ i ) for some ζ i ̸= 1 but that will never be real. And finally for m = 3
there is only one possibility where all factors are (x − 1). Thus there are only 2 + 1 = 3 conjugacy classes in this
case.

64
After that, we consider K = F2 . This time, the polynomial can be factored as

x 5 − 1 = (x − 1)(x 4 + x 3 + x 2 + x + 1),

and this time x 4 + x 3 + x 2 + x + 1 turns out to be irreducible (because there are no linear factors, and we can check
that no product of quadratic factors works either – the only possibility would have been (x 2 + x + 1)2 to avoid roots,
and that doesn’t work). So now m = 1 and m = 2 are not possible (because we can’t have a quadratic or cubic
factor), and the only possibility is (x − 1) three times and there is only 1 conjugacy class.
Finally, consider K = F5 , in which case x 5 − 1 = (x − 1)5 . And in this case each of m = 1, 2, 3 has a unique
possibility ((x − 1)3 for m = 1, (x − 1), (x − 1)2 for m = 2, and (x − 1), (x − 1), (x − 1) for m = 3), so we have 3
conjugacy classes.

Remark 239. Remember that we have been calculating conjugacy classes of matrices satisfying g 5 = 1, but when we
consider conjugacy we can still conjugate by any matrix in GL3 (K). (Indeed, the set of matrices where g 5 = 1 doesn’t
form a group because GL3 (K) is nonabelian.)

Next time, we’ll briefly consider Jordan normal form and then sketch the proof of this clasification theorem.

21 November 11, 2022


We’ll start with Jordan normal form today – start with any algebraically closed field K. Then the only irreducible
polynomials in K[x] are the linear ones X − λ (since any polynomial has a root and we can pull out the corresponding
linear factors). Thus if V is a finite-dimensional K-vector space, and T is a K-linear endomorphism of V (recall this is
the same thing as having a K[x]-module V ), then we have a normal form for finitely generated K[x]-modules in two
ways. One is where we choose polynomials a1 | · · · |an , and the other is where we choose irreducible polynomials π and
get
MM
R⊕d ⊕ R/(π mi,π ).
(π) i

We’ll use this latter description here, and we then find that over an algegbraically closed K, we must have

V ∼
MM
= K[x]/(x − λ)mλ,i ,
λ∈K i=1

where mλ,i are positive and nλ = 0 for all but finitely many of the λs. Then we can also describe the action of T quite
easily – if we choose the basis {1, (x − λ), · · · , (x − λ)mλ −1 } for each of these summands, then notice that

x(x − λ)j = (x − λ)j+1 + λ(x − λ)j ,

so we can choose a basis so that our matrix for T will be block diagonal (blocks corresponding to the different direct
summands) of the form
 
λ 0 0 ··· 0 0
···
 
1 λ 0 0 0
 
0
 1 λ ··· 0 0
.
. .. .. .. 
. .
. . . λ 0

0 0 0 ··· 1 λ
So any matrix in Mn×n (K) (thought of as an endomorphism acting on K ⊕n ) will be similar (conjugate) to a block
diagonal matrix with such Jordan blocks, with the λs, nλ s, and mλ,i s uniquely determined. So we have uniqueness up

65
to reordering the blocks, and we call this representation Jordan normal form. It has the disadvantage of only working
over algebraically closed field, but it’s more useful for computing matrix powers than rational canonical form.
We’re now ready to turn back to the main classification theorem we stated two lectures ago and do the proof.
Recall the statement: for any PID R, any submodule N of a free module R⊕n is free, and we can choose a basis
y1 , · · · , yn of R⊕n and a1 |a2 | · · · |am elements of R (with am ̸= 0) such that a1 e1 , · · · , am em form a basis for N.

Proof. We can assume N is nonzero (otherwise there’s nothing to prove). If this result were true, then for any linear
map R⊕n → R, everything in N would get sent to a multiple of a1 (since a1 |a2 | · · · ), so we want to identify a1 as the
largest element of R with this property. Towards that, define

X = {φN ◁ R : φ : R⊕n → R is R-linear}.

(Unpacking the notation, each element of this set is the image of N under an R-linear map to R, so it’s an ideal of
R.) This contains a nonzero ideal because we can look at the projections πi onto the i th coordinate; those cannot
all be zero if N is nonzero. Since R is a principal ideal domain, it is noetherian, and thus we can pick some maximal
element φ1 N = (a1 ). In particular, this means there is some y ∈ N such that φ1 (y ) = a1 (we expect it to be a1 y1 ).
So now take ψ : R⊕n → R to be any other linear map. If we look at the ideal (a1 , ψ(y )) in R, because R is a
PID it must be (d), and we must be able to write x = αa1 + βψ(y ) = (αφ1 + βψ)(y ). So (αφ1 + βψ)(N) contains
(d), which contains (a1 ), but now (αφ + βψ)(N) is also an element of X so by maximality all of these ideals must
be equal. So that means (a1 ) = (d) and a1 must divide ψ(y ), as desired. In particular, choosing ψ to be each of the
coordinate projections in turn, we see that a1 must divide every coordinate of y . So we can indeed write y = a1 y1 for
some y1 ∈ R⊕n , which gives us our first basis element. (And we have φ1 (y1 ) = 1, so in some sense we can’t divide y1
any further.)
So now if m ∈ R⊕n , we can write m as a linear combination of y1 and the rest, which we denote m = φ1 (m)y1 +
(m − φ1 (m)y1 ). But the latter term is in the kernel of φ1 (because φ1 (y1 ) = 1), so R⊕n = Ry1 + ker(φ1 ). And this is in
fact a direct sum because being in both Ry1 and ker(φ1 ) would make the element zero. Thus R⊕n = Ry1 ⊕ ker(φ1 ) ,
and we want to describe how N looks under this decomposition too. For any element m ∈ N, we know φ1 (m) is
divisible by a1 , so N = Ra1 y1 ⊕ (N ∩ ker(φ1 )) . Thus we’ve actually managed to split up N in the same way as R⊕n ,
which is the goal.
Unfortunately we can’t just apply induction directly from here, since we don’t necessarily know that ker(φ1 ) (the
bigger module) is free yet. So we have to proceed in two steps here: first, we show that any submodule N (not just the
specific N in the statement) is free by induction on the dimension of the localization N(0) over the quotient ring Q(R).
For the base case, if N(0) = (0) then N = (0). Then for the inductive step, we know that N = Ra1 y1 ⊕ (N ∩ ker(φ))
from above, so localizing at zero yields

N(0) = Q(R) ⊕ (N ∩ ker(φ))(0) .

The dimension of (N ∩ ker(φ))(0) is one less than that of N(0) , so (the unlocalized) (N ∩ ker(φ)) must be free by the
inductive hypothesis. So N is the direct sum of two free modules and thus N itself is free.
So now we can complete the proof: we induct on n. If n = 0 there is nothing to prove, and in general we use the
boxed decomposition above. We have just shown that ker(φ1 ) is free (because it is a submodule of the bigger module
so the argument above holds), and rank is additive so ker φ1 has rank (n − 1). Thus there is a basis y2 , · · · , yn of
ker(φ1 ) and a2 | · · · |am ̸= 0 so that N ∩ ker(φ1 ) has basis a2 y2 , · · · , am ym . From the first boxed statement we know
that y1 , y2 , · · · , ym do form a basis of R⊕n , and from the second we see that a1 y1 , · · · , am ym is a basis of N. So we
just need to show that a1 divides a2 ; indeed, take the map ψ which sends y1 and y2 to 1 and yi to 0 for any other i .

66
Then ψ(N) = (a1 , a2 ) is an element of X containing (a1 ), but by maximality these must be equal and thus a1 |a2 .

And as we mentioned in a previous lecture, this allows us to recover that for any finitely generated R-module M
over a PID R, we do have
m
M∼
M
= R⊕d ⊕ R/(ai )
i=1

by applying our classification theorem to the kernel of the map R⊕n → M (and noting that we can throw away any of
the terms here where ai is a unit, since R/(ai ) = 0). The remaining thing to show is that these d, m, (ai ) are uniquely
determined by M:

Lemma 240
If c1 |c2 | · · · |ct , where c1 is not a unit in R, then t is the minimal number of generators for M = R/(c1 )⊕· · ·⊕R/(ct ).

Proof. Choose a maximal ideal m containing (c1 ) (the latter is a proper ideal because c1 is not a unit). If M can be
generated by s elements, then so can M/mM ∼= R/m ⊕ · · · ⊕ R/m. But this is now a vector space over a field, and a
vector space of dimension t can only be generated if we have at least t elements, so s ≥ t.

And we can now finally show uniqueness:

Proposition 241
R/(ai ) ∼
Lm Ln
Let R be a PID. If M can be written as R⊕d ⊕ i=1 = R⊕e ⊕ i=1 R/(bi ), and a1 | · · · |am and b1 | · · · |bn
with a1 , b1 not units and am , bn ̸= 0, then d = e, m = n, and (ai ) = (bi ) for all i .

Proof. Localizing at zero, we have Q(R)⊕d ∼


= Q(R)⊕e , so d = e by invariance of dimension of a vector space. Now
R/(ai ) ∼
L L
M has a torsion submodule, which is = R/(bi ), so we can say without loss of generality that d = e = 0.
But now i=1 R/(ai ) is generated by m things, and ni=1 R/(bi ) is generated by n things, so by our lemma above
Lm L

m ≥ n and n ≥ m, so n = m. Now for any element a ∈ R, define the a-torsion

M[a] = {m ∈ M : am = 0}.

Looking at M/M[a], we see that

(R/(ai ))[a] = {r + (ai ) : ar ∈ (ai )} = {r + (ai ) : ai |ar },

a1 a ai
which is the same thing as requiring that gcd(ai ,a) | gcd(ai ,a) r , meaning that we require gcd(ai ,a) |r . Thus (R/(ai ))[a] =
(ai /(gcd(ai , a)))/(ai ), so

R/((bi / gcd(bi , a))) ∼


= M/M[a] ∼
M M
= R/((ai / gcd(ai , a))).

Then by Lemma 240, the minimal number of generators on the left is the number of bi s that don't divide a, and the
minimal number of generators on the right is the number of ai s that don't divide a, so these two counts agree. Now
take a = a1 (and use that m = n): since a1 | a1 , at most m − 1 of the ai s fail to divide a1 , so at least one bi must
divide a1 , and since b1 divides every bi we get b1 | a1 . Similarly a1 | b1 , so a1 and b1 are associates. More generally,
taking a = ai shows (inductively) that at least i of the bj s divide ai , and since b1 | · · · | bn this forces bi | ai ;
symmetrically ai | bi , so ai and bi are associates for all i , and (ai ) = (bi ).
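As a computational aside (added; not from lecture), the invariant factors ai of a finitely generated Z-module can be computed in practice via the Smith normal form of a presentation matrix. A minimal sketch, assuming sympy's smith_normal_form (from sympy.matrices.normalforms) is available:

    # Sketch: invariant factors of M = Z^2 / N, where N is spanned by the
    # columns of A, i.e., by 2*e1 and 3*e2, so M ≅ Z/2 ⊕ Z/3.
    from sympy import Matrix, ZZ
    from sympy.matrices.normalforms import smith_normal_form

    A = Matrix([[2, 0],
                [0, 3]])

    # Smith normal form diagonalizes A to diag(a1, a2) with a1 | a2; here it
    # should give diag(1, 6), so M ≅ Z/(1) ⊕ Z/(6) ≅ Z/6, matching the
    # uniqueness statement (the unit invariant factor contributes nothing).
    print(smith_normal_form(A, domain=ZZ))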

22 November 14, 2022
In these last three weeks, we’ll briefly introduce homological algebra – this is somewhat difficult to motivate but
turns out to be very useful in various branches of mathematics (including algebra, algebraic geometry, and algebraic
topology). We’ll fix a setting in which to do homological algebra, and the most useful one is to work in abelian
categories. Recall that the category R-mod has a few special properties, namely that we can add morphisms and
describe kernels and cokernels, that finite products and finite coproducts are the same, and so on. These can all be
abstracted into a more general concept – for the sake of presentation, we’ll state results in abelian categories and
prove them in R-mod.

Definition 242
A category C is additive if it satisfies the following properties:
• C has an initial object (an object that maps uniquely to any object) and a final object (an object for which
any object maps to it uniquely), and the map from the initial object to the final object is required to be
an isomorphism. We then call this (initial/final) object the null object and denote it (0). In particular,
composing the maps X → (0) → Y , we get a unique zero morphism 0 : X → Y for any X, Y ∈ C.

• Binary products and coproducts exist, meaning that we have X × Y along with its projection maps pX , pY to
X and Y , as well as X ⨿ Y along with maps from X and Y into it. So if we take the identity map idX : X → X
and the zero map 0 : X → Y , we get a unique map jX : X → X × Y (this is intuitively “embedding into the
first factor and zero in the second”) and similarly jY : Y → X × Y . Thus by the universal property of the
coproduct we get a map α : X ⨿ Y → X × Y . We then also require α to be an isomorphism – we then
call this object the direct sum X ⊕ Y .

So in an additive category, thinking of X ⊕ X as a product, we have two natural maps p1 : X ⊕ X → X and
p2 : X ⊕ X → X (the projections onto the two coordinates). Then taking the identity map X → X in each
coordinate, we get a unique map ∆ : X → X ⊕ X commuting with those maps, which we call the diagonal map.
Similarly, thinking of X ⊕ X as a coproduct, the identity maps from the two copies of X into X give us a unique map
+ : X ⊕ X → X – this can be thought of as the addition map.
If we now have two maps f , g : X → Y , we can consider the composite map

X --∆--> X ⊕ X --(f ,g)--> Y ⊕ Y --+--> Y.

(To explain the middle map, consider the following commutative diagram, where the vertical maps are the inclusions
ι1 and ι2 of the two factors:

X --f--> Y
ι1 ↓      ↓ ι1
X ⨿ X → Y ⨿ Y
ι2 ↑      ↑ ι2
X --g--> Y

We get two maps from X to Y ⨿ Y , which induces a map X ⨿ X → Y ⨿ Y .) So now we get a new map
f + g : X → Y , and it turns out that (+, 0) make HomC (X, Y ) into an abelian group for any X, Y ∈ C. Furthermore,
HomC (X, Y ) × HomC (Y, Z) → HomC (X, Z), defined via composition of maps, will be bilinear.
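(As a sanity check – an added worked computation – in R-mod, where we may compute on elements, this composite is exactly pointwise addition: x 7→ (x, x) 7→ (f (x), g(x)) 7→ f (x) + g(x).)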

Remark 243. Some additional conditions are required for this part – in general what we’ve written here isn’t enough
to guarantee additive inverses – and this is clarified in a few lectures.

Definition 244
We call a functor F : C → D between additive categories additive if it preserves finite products and coproducts
(including the initial and final object, which are the empty coproduct and product). In particular, this means
F : HomC (X, Y ) → HomD (F X, F Y ) is a group homomorphism.

This way of setting up abelian categories is nice because it shows that we don’t need to additionally define addition
as a new structure – it’s just following from the axioms. But there is an alternative way to describe all of this: we can
say that a pre-additive category is a category in which each HomC (X, Y ) is endowed with the structure of an abelian
group, such that the compositions HomC (X, Y ) × HomC (Y, Z) → HomC (X, Z) are all bilinear. Then in a pre-additive
category, a functor F : C → D is called additive if F (f + g) = F (f ) + F (g) for all f , g : X → Y for X, Y ∈ C.

Proposition 245
A category C is additive if and only if it is pre-additive and has finite products and coproducts.

Proof. The forward direction is clear. For the reverse direction, notice that we have an isomorphism between the
initial and final objects: the composite map initial → final --0--> initial is the identity (we always have a zero map
from the final object to the initial object because any abelian group has a zero element, and the only map from the
initial object to itself is the identity), and similarly final --0--> initial → final is the identity, so we do have an
isomorphism initial → final with inverse map 0. And we also
have to show that the map α : X ⨿ Y → X × Y is an isomorphism, but looking back to how we defined α: we have
pX αιX = idX , pY αιY = idY , pX αιY = 0, pY αιX = 0. But we get a map ιX ◦ pX + ιY ◦ pY (addition coming because
of the abelian group structure given by being preadditive), and (ιX ◦ pX + ιY ◦ pY )(α) is the identity because we can
compose those maps with ιX and ιY , seeing that

(ιX ◦ pX + ιY ◦ pY )(α)ιX = ιX + 0 = ιX ,

(ιX ◦ pX + ιY ◦ pY )(α)ιY = 0 + ιY = ιY .

Thus by the universal property of the coproduct, (ιX ◦ pX + ιY ◦ pY )(α) = idX⨿Y . A very similar analysis (now using
the universal property of the product) shows that composing in the other order gives idX×Y .
We should check that the additive structures coincide with our two definitions, but we won’t do that here – it does
turn out that they are exactly equivalent.

Definition 246
An additive category C is abelian if it satisfies the following properties:
• If f ∈ HomC (X, Y ), then the combination of maps f : X → Y and 0 : X → Y has a limit ker(f ) → X (which
we call the kernel of f ) and a colimit Y → coker(f ) (which we call the cokernel of f ). We can check that
these are indeed a generalization of the definitions for R-modules, the kernel is always a monomorphism,
and the cokernel is always an epimorphism.

• Any monomorphism is a kernel and any epimorphism is a cokernel.

Example 247
As discussed, R-mod is an example of an abelian category. Somewhat similarly, if Γ is a group, then we can define
Γ-mod to be the category of abelian groups with an action of Γ (where morphisms must commute with the action
of Γ), and this is also an abelian category.

Example 248 (Not on quals syllabus but useful for motivation)


Let X be a topological space. Then Open(X) is the category in which objects are open sets of X and morphisms
are inclusions of open sets (so there is always either zero or one morphism between two objects). A presheaf (of
abelian groups) on X is a contravariant functor F : Open(X) → Ab.

More concretely, for every open set U ⊆ X we associate an abelian group F(U), such that whenever V ⊂ U we
also have a restriction map F(U) → F(V ) sending m to m|V , such that the restriction F(U) → F(U) is the identity
and the restriction is compatible with triples (meaning that if W ⊂ V ⊂ U are open in X, then the composition of
restrictions F(U) → F(V ) → F(W ) yields the same result as the restriction F(U) → F(W )). Then a morphism
between presheaves F and G is a natural transformation f : F → G – in other words, for every U ⊂ X, we want a map
fU : F(U) → G(U) compatible with restriction (so applying fU and then restricting to V is the same as restricting to
V and then applying fV ).

Definition 249
A presheaf F is a sheaf if the following holds: for every open U ⊂ X and every open cover {Ui }i∈I of U, there
are two maps ∏_{i∈I} F(Ui ) → ∏_{(i,j)∈I×I} F(Ui ∩ Uj ), namely (1) the one sending (si ) to (si |Ui ∩Uj )(i,j) and (2) the
one sending (si ) to (sj |Ui ∩Uj )(i,j) . Then we require that the component-wise restriction F(U) → ∏_{i∈I} F(Ui )
identifies F(U) with the limit (equalizer) of the two maps (1) and (2).

The idea is that we can basically “determine F” locally, and another way to say this sheaf condition is that (1) if
s ∈ F(U) and s|Ui = 0 for all i , then s = 0, and (2) if si ∈ F(Ui ) for all i and these si s (called sections) are compatible
on intersections, meaning si |Ui ∩Uj = sj |Ui ∩Uj , then there is some s ∈ F(U) such that s|Ui = si for all i (and in fact by
(1) this must be unique if it exists).
We thus have the categories Sh(X) and PreSh(X) (the category of sheaves and category of presheaves on X),
with Sh(X) ⊂ PreSh(X) the full subcategory whose objects are the sheaves (morphisms are inherited).

Example 250
Let C(U) be the set of all continuous functions f : U → C. This is a presheaf because we can restrict a
continuous function to any open subset, and it's also a sheaf because a function is zero if and only if it is zero
everywhere locally, and compatible local functions glue. Indeed, if fi : Ui → C are continuous and fi , fj agree on
intersections Ui ∩ Uj , then we have a continuous function f : U → C on all of U = ∪_{i∈I} Ui (just by defining
f (x) = fi (x) if x ∈ Ui ).

Example 251
The constant sheaf C assigns to U the set C(U) of functions f : U → C that are locally constant (meaning that
around every point there is a neighborhood on which f is constant) – since this doesn't use the topology of C, for
any abelian group A we can similarly define A(U) = {f : U → A : f locally constant}.

23 November 16, 2022
Last time, we mentioned the categories of presheaves and sheaves on a topological space, which assign to each open
set U an abelian group F(U) in a way that lets us restrict a section s ∈ F(U) to some s|V ∈ F(V ) whenever V ⊂ U
(which is compatible with repeated restriction). And the condition for being a sheaf is that this construction is “local:”
if U = ∪_{i∈I} Ui and we have si ∈ F(Ui ) such that si = sj on Ui ∩ Uj , then we have a unique s ∈ F(U) which restricts
to si on each Ui .

Definition 252
Let F be a presheaf, and let x ∈ X. The stalk of F at x is the direct limit Fx = lim_{U ∋ x open} F(U) (a colimit
over the open neighborhoods of x, ordered by reverse inclusion).

For example, if F is the set of continuous functions from U to C, then the stalk would be the continuous functions
defined on a neighborhood of x, where two functions are equal if they’re equal on some small enough set (sometimes
this is called the “germs of continuous functions at x” in analysis terminology). And for the constant sheaf A, we have
Ax = A.

Fact 253
We can detect monomorphisms in sheaves in the obvious way: if F → G is a monomorphism, that’s equivalent to
having a monomorphism (for abelian groups, injection) F(U) ,→ G(U) for all U ⊂ X open, which is equivalent to
Fx ,→ Gx for all x ∈ X. But we have to be more careful with epimorphisms: F → G is an epimorphism if and only
if Fx → Gx is surjective for all x ∈ X, but it’s possible to have an epimorphism where the map F(U) → G(U) is
not surjective.
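A standard example of this asymmetry to keep in mind (added for illustration): take X = C \ {0}, let F be the sheaf of continuous C-valued functions on X and G the sheaf of continuous nowhere-vanishing C-valued functions, and consider the map F → G sending f 7→ e^f on each open set. Around any point we can choose a continuous logarithm on a small disc, so Fx → Gx is surjective for every x and the map is an epimorphism of sheaves; but the section z ∈ G(X) has no continuous logarithm on all of the punctured plane, so F(X) → G(X) is not surjective.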

Returning now to general abelian categories, we’ll introduce some terminology similar to that which we previously
defined for R-modules:

Definition 254
Let C be an abelian category, and suppose we have a series of morphisms between objects of C:

· · · → X0 --f0--> X1 --f1--> X2 --f2--> X3 → · · · .

If fi+1 ◦ fi = 0 for all i , we say that this sequence is a complex, and if im(fi ) = ker(fi+1 ) for all i , we say this
sequence is exact.

We didn't formally define the image in a general abelian category last time, but we know that we have maps
ker f → X --f--> Y → coker(f ) for any morphism f : X → Y . Then we can look at either coker(ker f → X) or
ker(Y → coker(f )); they turn out to be canonically isomorphic, and we call that object the image of f .

Definition 255
An exact sequence of the form 0 → X → Y → Z → 0 is called short exact. Similarly, an exact sequence
0 → X → Y → Z is called left exact (and equivalent to the statement X = ker(Y → Z)), and an exact sequence
X → Y → Z → 0 is right exact (and equivalent to Z = coker(X → Y )).

Being short exact is equivalent to being both left and right exact, and a long exact sequence can always be split
into a sequence of short exact sequences: if we have a long exact sequence · · · → X0 --f0--> X1 --f1--> X2 --f2--> X3 → · · · ,

that’s equivalent to requiring that we have a short exact sequence

0 → coker(fi−2 ) → Xi → ker(fi+1 ) → 0

for all i . (So for example, at i = 2 above, we need the cokernel of f0 : X0 → X1 , which is X1 /im(f0 ) ≅ X1 / ker(f1 ),
and the kernel of f3 , which is the image of f2 : X2 → X3 , to fit into the sequence.)
Recall that additive functors preserve the zero object and finite direct sums, but they are not required to (and
may not) preserve kernels and cokernels. So homological algebra is about how additive functors interact with those
kernels and cokernels:

Example 256
Consider the functor Ab → Ab given by tensoring ⊗Z Z/(2). Then 0 → Z --×2--> Z → Z/(2) → 0 is short exact,
but tensoring gives us a sequence 0 → Z/(2) --0--> Z/(2) --id--> Z/(2) → 0, and the zero map Z/(2) → Z/(2) is
not injective. So tensoring does not preserve kernels. Similarly, the functor HomZ (Z/(2), −) would send that
exact sequence to 0 → (0) → (0) → Z/(2) → 0, so we lose surjectivity and the Hom functor does not preserve
cokernels.

Definition 257
A covariant functor F : C → D is left exact (resp. right exact) if and only if it preserves kernels (resp.
cokernels), which is the same as preserving left (resp. right) exact sequences. F is exact if it is both left and
right exact, meaning that it preserves kernels and cokernels, or equivalently preserving short exact sequences
(which is equivalent to preserving all exact sequences).

This last string of logic includes the fact that preserving kernels and cokernels also means we preserve images, and
clearly preserving all exact sequences means we preserve short exact sequences. But showing that preserving short
exact sequences implies preserving kernels and cokernels takes a bit more work: for any f : X → Y , we can write down
two exact sequences as shown below:

0 → ker(f ) → X → im(f ) → 0
0 → im(f ) → Y → coker(f ) → 0

(where the two copies of im(f ) are identified)

Applying our exact functor, we get an exactly analogous diagram where the rows will still be short exact:

0 → F (ker(f )) → F (X) → F (im(f )) → 0
0 → F (im(f )) → F (Y ) → F (coker(f )) → 0

(where again the two copies of F (im(f )) are identified, and the composite F (X) → F (im(f )) → F (Y ) is F (f ))

We wish to show that F (ker(f )) is the kernel of F (f ). Since having 0 → A → B exact is the same as having a
monomorphism A → B, the map F (im(f )) → F (Y ) is a monomorphism. Consider any Z → F (X) which composes
with F (f ) to the zero map – since F (f ) factors as F (X) → F (im(f )) → F (Y ) and the latter map is a monomorphism,
the composite Z → F (X) → F (im(f )) is already the zero map. But the top row tells us that F (ker f ) is the kernel
of F (X) → F (im(f )), so Z → F (X) factors through F (ker(f )). That shows F (ker(f )) is indeed the kernel of F (f ).
The cokernel is checked similarly.
For contravariant functors we have to choose our convention, since a left exact sequence becomes a right exact
sequence and vice versa:

Definition 258
Let F : C → D be a contravariant functor. Then F is left exact, right exact, or exact, respectively, if the
covariant functor F op : C op → D is correspondingly left exact, right exact, or exact.

In other words, F is left exact if and only if for all X → Y → Z → 0 right exact, we end up with a left exact
sequence in the target space 0 → F Z → F Y → F X. (And put another way, left exactness for contravariant functors
means that we take cokernels to kernels.)

Example 259
In R-mod, the localization functor sending M 7→ D−1 M is exact, the tensor product functor M 7→ M ⊗R N (for
fixed N) is right exact (but not exact), and the Hom functor M 7→ HomR (N, M) (for fixed N) is left exact (but
not exact). Finally, the contravariant functor M 7→ HomR (M, N) is left exact (but not exact) as well. The failure
of exactness here leads us to the Tor and Ext functors.

Example 260
In Sh(X) (the category of sheaves), the global section functor F 7→ F(X) does preserve kernels but not cokernels
(by the reasoning in Fact 253), so it is left exact but not generally right exact. Failure of exactness here leads to
the usual cohomology theories.

Definition 261
In the category Γ-mod (as defined last lecture), we can send M to M Γ , the set of fixed points {m ∈ M : γm = m ∀γ ∈
Γ}; this turns out to be left exact but not generally right exact. And the failure of exactness here turns out to be
group cohomology.

Lemma 262
If 0 → M → N → FR (X) → 0 is a short exact sequence of R-modules, and F : R-mod → D is an additive
functor, then 0 → F (M) → F (N) → F (FR (X)) → 0 is also exact.

Proof. By the universal property of the free module, if we have a short exact sequence 0 → M --f--> N --g--> FR (X) → 0,
there exists some s : FR (X) → N with g ◦ s = id. (Indeed, for each basis element ei ∈ FR (X), we can choose s(ei ) to
be any preimage of ei under g and extend by linearity.) We call s a section – notice that s ◦ g is not necessarily the
identity. Then N is isomorphic to M ⊕ FR (X) by sending (m, p) 7→ f (m) + s(p); in the other direction, any n maps to
(f −1 (n − s(g(n))), g(n)), which makes sense since g(n − s(g(n))) = 0 places n − s(g(n)) in the image of f , and we
can check that these maps are mutually inverse.
So now we also have an analogous sequence 0 → M → M ⊕ FR (X) → FR (X) → 0, such that the following diagram
commutes:

0 → M --f--> N --g--> FR (X) → 0
     ‖        ↓ ≅           ‖
0 → M --ι1--> M ⊕ FR (X) --π2--> FR (X) → 0

Applying the additive functor F and noting that F preserves direct sums, the top and bottom rows are still isomorphic
and the bottom row is still short exact, so we still have a short exact sequence 0 → F (M) → F (N) → F (FR (X)) → 0,
as desired.

There’s no notion of being a “free module” in a general abelian category, but what we really needed was the section
s, so we make definitions that allow for that:

Definition 263
An object P ∈ ob(C) is projective if for any epimorphism X → Y , any map P → Y factors through X.

(This is similar to the universal property of a free module – free R-modules are projective in R-mod.) There is a
dual notion as well:

Definition 264
An object I ∈ ob(C) is injective if for any monomorphism X → Y , any map X → I factors through Y .

Lemma 265
If 0 → X → Y → Z → 0 is short exact in C, and F : C → D is additive, then 0 → F (X) → F (Y ) → F (Z) → 0 is
short exact in D if either X is injective or Z is projective.

(This argument is basically the same as the one in the module case – if we have X injective then we construct a
map Y → X, and if we have Z projective then we construct a map Z → Y .) The idea is that replacing objects with a
collection of projective or injective objects is often useful in homological algebra:

Definition 266
A category C has enough projectives if for any X ∈ ob(C), there is an epimorphism P → X (so any object is
a quotient of a projective object). Similarly, C has enough injectives if for any X ∈ ob(C) there is some map
X → I with I injective.

We can then iterate this: once we get a map P 0 → X, the kernel of that will be the image of some P −1 → P 0 ,
and then the kernel of that will be the image of P −2 → P −1 , and so on, giving us an exact sequence · · · → P −2 →
P −1 → P 0 → X → 0, which we call a projective resolution of X. Similarly, if we have enough injectives, repeatedly
looking at cokernels gives us an injective resolution X → I 0 → I 1 → I 2 → · · · . (For example, the category of sheaves
has enough injectives but not enough projectives.)
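For a concrete example (added for illustration): over R = Z, the module X = Z/n has the projective resolution 0 → Z --×n--> Z → Z/n → 0, where both terms P −1 = P 0 = Z are free (hence projective); here the iteration stops immediately because the kernel of Z → Z/n is already free.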

24 November 18, 2022


We’ll start with a clarification – last time, we said that an additive category is a category in which there are finite
products and coproducts, including the empty ones (so the initial and final objects). We then said that the map from

the initial to final object should be an isomorphism, and so should the map X ⨿ Y → X × Y . This gives us an addition
on HomC (X, Y ), but we require the additional assumption that for every object X ∈ C, there is a special element
−IdX ∈ HomC (X, X) such that −IdX + IdX = 0.
Last time, we defined projectives and injectives – projectives are objects P such that whenever we have a surjection
X → Y , any map P → Y can be lifted to a map P → X, and injectives are objects such that whenever we have an
injection X → Y , any map X → I can be extended to a map Y → I. The concepts of “enough projectives” and “enough
injectives” are then that we always have some P such that P → X is an epimorphism, and that we always have some
I such that X → I is a monomorphism, respectively. (In particular, the category R-mod has enough projectives and
enough injectives, and so does Γ-mod, but Sh(X) usually has enough injectives but not enough projectives.) Those
two conditions give us long exact sequences · · · → P −1 → P 0 → X → 0 and X → I 0 → I 1 → · · · , respectively (by
repeatedly applying the condition to the kernel or cokernel of our maps, respectively). Today, we’ll discuss injectives
and projectives in R-mod, and we’ll see the use of those projective and injective resolutions next week.

Lemma 267
Let P be an R-module. The following are equivalent:
1. P is projective,

2. P is a direct summand of a free module, meaning that we can write FR (Ω) ≅ P ⊕ Q for some Q.
If P is finitely presented, meaning that there is a right exact sequence R⊕b → R⊕a → P → 0, then these are
also equivalent to these additional conditions:
3. Pp is free over Rp for all prime ideals p,

4. Pm is free over Rm for all maximal ideals m.

Being finitely presented basically means that we want the kernel of the map R⊕a → P to also be finitely generated,
and one way this is satisfied is if P is finitely generated and R is noetherian.
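Before the proof, a small example of condition (2) without freeness (added for concreteness): over R = Z/6, the Chinese remainder theorem gives Z/6 ≅ Z/2 ⊕ Z/3 as R-modules, so P = Z/2 is a direct summand of the free module R itself and hence projective; but P is not free, since any nonzero free Z/6-module has at least 6 elements.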

Proof. To show that (1) implies (2), we know that there is a natural surjection π : FR (P ) → P , sending each generator
to the corresponding element of P . The property of being projective then means that the identity map P → P lifts to
a map s : P → FR (P ) such that π ◦ s = IdP , so we have the isomorphism FR (P ) ≅ ker(π) ⊕ P (where in the backward
direction we send (a, b) to a + s(b), and in the forward direction we send c to (c − sπ(c), π(c))), so P is a direct
summand.
For (2) implies (1), we know that P ⊕ Q ≅ FR (Ω). We want to show that if X → Y is a surjection, then P → Y
lifts to a map P → X. But P ⊕ Q = FR (Ω) is a free module (always projective), so given the composite map
P ⊕ Q --πP--> P → Y , we get a lift P ⊕ Q → X (the diagonal map in the diagram below). Then P maps into P ⊕ Q
via the inclusion ιP in the first factor, and πP ◦ ιP is the identity. Thus the composite map P --ιP--> P ⊕ Q → X gives
us the desired map, showing that P is projective.

    P
    ↓ ιP
  P ⊕ Q
 ↙      ↘
X  ———→  Y

(the right diagonal is P ⊕ Q --πP--> P → Y , and the left diagonal is the lift P ⊕ Q → X)

Showing equivalence to (3) and (4) is a bit more tricky. Assuming (1) and (2), since P is finitely generated, there
is a surjection FR (Ω) → P , where |Ω| is finite. We then get a splitting s : P → FR (Ω) which allows us to write
FR (Ω) ≅ P ⊕ Q, meaning P is a direct summand of a finite free module. Then for any p, we know that localizing at
that prime ideal gives

FRp (Ω) ≅ Pp ⊕ Qp .

We then find that (tensoring by the residue field Rp /pRp )

FRp /pRp (Ω) ≅ Pp /pPp ⊕ Qp /pQp ,

and Rp /pRp is a field, so these are finite-dimensional vector spaces. Let the basis elements on the right-hand side
be ē1 , · · · , ēa in the first summand and ēa+1 , · · · , ēb in the second. If we choose e1 , · · · , ea ∈ Pp lifting ē1 , · · · , ēa
and ea+1 , · · · , eb ∈ Qp lifting ēa+1 , · · · , ēb , then by Nakayama's lemma (specifically the corollary which allows us to
omit generators in pPp ), we get a surjection Rp⊕a → Pp and a surjection Rp⊕(b−a) → Qp , so we get a surjection Rp⊕b →
Pp ⊕ Qp = FRp (Ω). Since finite-dimensional vector spaces have a well-defined dimension, b = |Ω|, and in particular this
map Rp⊕b → FRp (Ω) is represented by some b × b matrix A. Reducing that matrix modulo the prime ideal gives us an
isomorphism, so det(A) is nonzero mod p. But that's the same as saying that det(A) is not in p, and pRp is the maximal
ideal in Rp , so det(A) is a unit. Thus A is invertible, meaning Rp⊕b → FRp (Ω) is injective, so Rp⊕a → Pp is also injective
and thus an isomorphism, as desired.
Clearly (3) implies (4) (since all maximal ideals are prime), and now we prove that (4) implies (1). Given a surjection
M → N and a map P → N, we wish to find a map P → M. We have a map HomR (P, M) → HomR (P, N), and if we
define Q to be its cokernel we have a right exact sequence HomR (P, M) → HomR (P, N) → Q → 0. We now localize
this at m (which preserves exactness) to get HomR (P, M)m → HomR (P, N)m → Qm → 0. And we claim that if P is
finitely presented, then this can be rewritten as HomRm (Pm , Mm ) → HomRm (Pm , Nm ) → Qm → 0. But since Pm is
free by assumption, it is certainly projective, so each Qm is zero. And we proved that if Qm = 0 for all maximal ideals
m, then Q = 0. So it remains to prove the following assertion:

Lemma 268
Let M and N be R-modules and M be finitely presented. If D ⊂ R is multiplicative, then the map

D−1 HomR (M, N) → HomD−1 (R) (D−1 M, D−1 N)

is an isomorphism.

We won’t go through the full proof, but the idea is to first deal with the case where M is finite and free. Then
homomorphisms from M → N are just a-vectors, so the Hom set is isomorphic to N ⊕a . Thus we have a map
D−1 N ⊕a → (D−1 N)⊕a , which is an isomorphism because localization commutes with direct sums.
Next, if we have a general finitely presented M, meaning we have R⊕a → R⊕b → M → 0, then applying Hom(−, N)
(which is left exact contravariant) yields 0 → Hom(M, N) → Hom(R⊕b , N) → Hom(R⊕a , N); localizing then gives
0 → D−1 Hom(M, N) → D−1 Hom(R⊕b , N) → D−1 Hom(R⊕a , N). But we can also localize first and then take Homs,
which will yield 0 → Hom(D−1 M, D−1 N) → Hom(D−1 R⊕b , D−1 N) → Hom(D−1 R⊕a , D−1 N). Putting those into a
commutative diagram with two rows, since we know that the maps between the last two columns are isomorphisms,
so is the one between the two modules we care about.

We next talk about injectives, starting with a special case:

Lemma 269
A Z-module I is injective if and only if it is divisible, meaning that for all m ∈ I and nonzero a ∈ Z, there is some
m′ ∈ I such that am′ = m.

For example, Q is a divisible Z-module, as is Q/Z, so we have a few examples of injective abelian groups.
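(On the other hand, Z itself is not divisible – there is no m′ ∈ Z with 2m′ = 1 – so Z is not an injective Z-module.)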

Proof. If I is injective, then for any m ∈ I and nonzero a ∈ Z we look at the map Z → I sending 1 to m. Then Z injects
into itself by multiplication by a, so by injectivity there is a map (from the latter copy of Z to I) sending 1 7→ m′ ,
and for the maps to be compatible we must have m = am′ .
The other direction is a Zorn's lemma argument: if I is divisible and we have an inclusion of abelian groups X → Y
and a map α : X → I, we consider extending α to Y "bit by bit." Specifically, consider the set X of pairs (Z, β) where
X ⊆ Z ⊂ Y and β : Z → I is an extension of α (meaning β|X = α). This is nonempty because it contains (X, α),
and we can place a partial ordering on it where (Z, β) ≥ (Z ′ , β ′ ) if Z ⊇ Z ′ and β|Z ′ = β ′ . But if we have a chain
of elements {(Z, β)}, it has an upper bound where we take the union W of the submodules Z (still a submodule)
and define a map γ : W → I by having γ(w ) agree with β(w ) if w ∈ Z for some (Z, β) (this is consistent because
everything is nested).
So by Zorn’s lemma, there is some (Z, β) maximal in this set X . If Z = Y we’re happy; otherwise, choose some
y ∈ Y \ Z. We will try to extend β to the module generated by Z and y . Consider the set of integer multiples of y
that land in Z, which we write as J = {n ∈ Z : ny ∈ Z} (here the first Z is the ring of integers and the second is the
submodule). This is an ideal of Z and thus generated by some a. In I, we know that β(ay ) = am
for some m ∈ I (because I is divisible), and now it makes sense to define γ : ⟨Z, y ⟩ → I to send z + by 7→ β(z) + bm.
It remains to check that this is well-defined: indeed, if z + by = z ′ + b′ y , then (z − z ′ ) = (b′ − b)y is an element of
Z, which means b′ − b = ca for some c. Then the images of the two sides agree because

(β(z) + bm) − (β(z ′ ) + b′ m) = β(z − z ′ ) − cam = β(z − z ′ − c(ay )) = β((z + by ) − (z ′ + b′ y )) = β(0) = 0.

So γ is a well-defined extension of β to ⟨Z, y ⟩, which contradicts maximality; thus Z = Y , meaning we've extended
our map in the desired way.

Corollary 270
If I is an injective Z-module and M is a submodule of I, then I/M is also injective (because divisibility is preserved
under quotients).

Corollary 271
Z-mod has enough injectives, because given a Z-module M we have a short exact sequence 0 → K → FZ (M) →
M → 0. In other words, FZ (M)/K is isomorphic to M. But FQ (M) is a free Q-module that contains FZ (M), so
M sits inside FQ (M)/K – since FQ (M) is divisible (it’s a Q-vector space), so is its quotient FQ (M)/K.

This result can be bootstrapped to a general R:

Lemma 272
We have the following:
1. If I is an injective Z-module, then HomZ (R, I) as an R-module (defined by saying that for any a ∈ R and
f ∈ HomZ (R, I), af is the map (af )(b) = f (ab)) is an injective R-module.

2. Thus R-mod has enough injectives.

The idea is that I → HomZ (R, I) is right adjoint to the forgetful functor, and right adjoints preserve injectivity.
And given any R-module M, M embeds as an abelian group in some I, so we can embed M ,→ HomZ (R, I) to get an
injective R-module.

25 November 28, 2022


Last time, we showed that there are enough projectives and injectives in R-mod, giving a concrete example. We’ll
now talk about homological algebra properly on more general abelian categories with that in mind:

Lemma 273
Suppose C is an abelian category with enough injectives. Given any morphism f : X → Y in C, suppose 0 → X →
I 0 → I 1 → · · · and 0 → Y → J 0 → J 1 → · · · are injective resolutions (that is, exact sequences with all I n s and
J n s injective). We also denote these injective resolutions 0 → X → I • and 0 → Y → J • . Then there is a chain
map (map of complexes) f • : I • → J • extending f , unique up to homotopy (defined below).

If we have two complexes (where remember that we don’t need exactness, just that the composite of any two
adjacent maps is zero), a chain map between them is a collection of maps C i → Di (vertical maps shown below) such
that all squares commute:

··· → C i−1 → C i → C i+1 → ···
        ↓       ↓       ↓
··· → Di−1 → Di → Di+1 → ···

Then two maps of complexes f • , g • : C • → D• are homotopic (denoted f • ≃ g • ) if there are maps k i : C i+1 → Di
such that f i − g i = δDi−1 ◦ k i−1 + k i ◦ δCi , as in the diagram below (where k i−1 : C i → Di−1 and k i : C i+1 → Di
are the diagonal maps, and the vertical maps are f i−1 − g i−1 , f i − g i , f i+1 − g i+1 ):

··· → C i−1 → C i --δCi--> C i+1 → ···
··· → Di−1 --δDi−1--> Di → Di+1 → ···

So the point is that once we replace X and Y with injective objects, we get a map between the extensions, and
the extension is unique up to some complicated definition. (The case for enough projectives holds similarly, but we’ll
leave that to our imagination.)
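Before the proof, a tiny example of a homotopy (added for concreteness): let D• be the complex 0 → Z --id--> Z → 0, placed in degrees 0 and 1. Then the identity map of complexes D• → D• is homotopic to the zero map: take k 0 = id : D1 → D0 (and all other k i = 0), so that in degree 0 we have id − 0 = k 0 ◦ δ 0 and in degree 1 we have id − 0 = δ 0 ◦ k 0 . This matches the fact that this complex is exact, so it should be "trivial" up to homotopy.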

Proof. We wish to construct f 0 : I 0 → J 0 , but J 0 is injective and we have an injection X → I 0 , so there exists
a map f 0 : I 0 → J 0 extending the composite map X → Y → J 0 . To extend at degree i + 1, notice that we
can map from I i to I i+1 by mapping I i → coker δIi−1 → I i+1 , and the latter of these is injective (this is just encoding
exactness). Now the map I i−1 → I i → J i → J i+1 is the same as the map I i−1 → J i−1 → J i → J i+1 (by the chain map
property), so it is zero. Thus it factors through coker δIi−1 , and now by injectivity of J i+1 the map f i+1 : I i+1 → J i+1
exists.
For uniqueness, we start with the following diagram:

0 → X → I 0 → I 0 /X ↪ I 1
    ↓        ↓ f 0 −g 0
0 → Y → J 0

Now the map X → I 0 --f 0 −g 0--> J 0 is zero (both f 0 and g 0 restrict to the same map on X), so f 0 − g 0 factors
through I 0 /X. Then because I 0 /X → I 1 is an injection, we can define the map k 0 : I 1 → J 0 by injectivity of J 0 .
Then f 0 − g 0 = k 0 ◦ δI0 (there's no k −1 map, so this is what we want at this first stage).
For the general construction of k i , consider the following diagram:

I i−1 → I i → I i+1
   ↓ f i−1 −g i−1    ↓ f i −g i
J i−1 → J i

(with the diagonal map k i−1 : I i → J i−1 )

Then we again split the map I i → I i+1 into two parts as I i → coker(δIi−1 ) ,→ I i+1 , and we claim the map
f i − g i − δJi−1 ◦ k i−1 factors through that cokernel. To do so we must check that it's zero on the image of I i−1 , but

(f i − g i − δJi−1 ◦ k i−1 ) ◦ δIi−1 = δJi−1 ◦ (f i−1 − g i−1 − k i−1 ◦ δIi−1 ) = δJi−1 ◦ (δJi−2 ◦ k i−2 ) = 0,

where we've inductively used the homotopy identity at stage i − 1. Then by injectivity of J i and the fact that
coker(δIi−1 ) → I i+1 is an injection, we get an extension k i : I i+1 → J i which satisfies the desired homotopy property.

The point in this proof is that once we’ve come up with the right notion of homotopy, there’s only one way the
argument can really proceed.

Corollary 274
Let 0 → X → I • and 0 → X → J • be two injective resolutions of X. Then there is a (unique up to homotopy)
f • : I • → J • extending the identity map IdX , and also g • : J • → I • extending IdX . Then g • ◦ f • is a map of
complexes from I • → I • extending the identity, so g • ◦ f • is homotopic to the identity map on the complexes.
The same holds for f • ◦ g • .

In general we say that two complexes I • and J • are homotopic if there are maps f • : I • → J • and g • : J • → I •
such that the compositions in both directions are homotopic to their respective identities. So now that we’ve replaced
an object with a complex of injectives up to homotopy, we might want to ask what properties are preserved under
homotopy. The answer is cohomology:

Definition 275
Let C • be a complex, not necessarily exact. The i th (co)homology H i (C • ) is defined as follows: we have maps
C i−1 --δCi−1--> C i --δCi--> C i+1 , and we know that im δCi−1 is contained in ker δCi , so we may set
H i (C • ) = ker δCi / im δCi−1 .

For example, if 0 → X → I • is an injective resolution, then H i (I • ) is zero for all i ≠ 0 by exactness. And at i = 0,
the kernel of the map I 0 → I 1 is exactly X, so H 0 (I • ) ≅ X.
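For a small non-exact example (added for concreteness), consider the complex 0 → Z --×2--> Z → 0, placed in degrees 0 and 1: then H 0 = ker(×2) = 0 and H 1 = Z/im(×2) = Z/2, so the cohomology records exactly the failure of exactness.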
This cohomology has nice functorial properties: if f • : C • → D• is a map of complexes, we get a map
H i (f • ) : H i (C • ) → H i (D• ), because f i maps the image of δCi−1 to the image of δDi−1 and also the kernel of δCi to
the kernel of δDi . So it maps the quotient appropriately as well.

Lemma 276
If f • and g • are two homotopic maps C • → D• , then H i (f • ) = H i (g • ) as maps H i (C • ) → H i (D• ).

Proof. We'll do the proof for just R-modules. For any x ∈ ker δCi , we know that x maps to f i (x) and g i (x) under the
two maps of complexes, but

f i (x) − g i (x) = (δDi−1 ◦ k i−1 + k i ◦ δCi )(x) = δDi−1 (k i−1 (x)),

which lies in the image of δDi−1 . Since cohomology mods out that image, f i (x) and g i (x) give the same class under
the cohomology maps.

Definition 277
Let C and D be abelian categories. Suppose C has enough injectives, and suppose F : C → D is a left exact,
additive functor. If X ∈ ob(C), then it has an injective resolution 0 → X → I • , and F I • is a complex
(though not necessarily exact anymore), and we can look at its cohomology. We can then define the i th right
derived functor of F via
Ri F (X) = H i (F I • )

This appears to depend on the injective resolution, but it turns out it does not: if 0 → X → I • and 0 → X → J •
are two injective resolutions, then we get extensions of the identity map f • : I • → J • and g • : J • → I • , unique up
to homotopy, such that f • ◦ g • ≃ IdJ • and g • ◦ f • ≃ IdI • . Then applying F to get F g • : F J • → F I • and
F f • : F I • → F J • , we know that F f • ◦ F g • ≃ IdF J • because F preserves homotopy. (This is a general property of
additive functors: if f • ≃ g • via the maps k • , then f i − g i = δ i−1 ◦ k i−1 + k i ◦ δ i , so
F (f i ) − F (g i ) = F (δ i−1 ) ◦ F (k i−1 ) + F (k i ) ◦ F (δ i ), meaning that F (f • ) ≃ F (g • ).) So F f • ◦ F g • is homotopic to
the identity on F J • , and similarly F g • ◦ F f • is homotopic to the identity on F I • , so H i (F f • ) ◦ H i (F g • ) is the identity
map and so is H i (F g • ) ◦ H i (F f • ). And if f ′• also extended the identity map, then f ′• is homotopic to f • , so F f ′•
and F f • are homotopic and thus give the same map on cohomology. So it's not just unique up to isomorphism – it's unique
So now if we have a map f : X → Y , we can choose resolutions I • and J • and extend f on the resolutions to
get well-defined maps H i (F f • ) : H i (F I • ) → H i (F J • ), and that is how we define Ri F (f ). So we know how objects
and morphisms are sent under Ri F , and we can check that Ri F is an additive functor (identity and composition are
preserved). And R0 F can be described explicitly: if F is left exact, then 0 → X → I 0 → I 1 being left exact means
0 → F X → F I 0 → F I 1 is also left exact, so in fact R0 F = F (since we compute this zeroth cohomology by looking
at the kernel of F I 0 → F I 1 ). And if I is injective, Ri F (I) = (0) for all i > 0. Indeed, an injective resolution for
I is just 0 → I --id--> I → 0 → 0 → · · · , and the cohomology of the complex F I → 0 → 0 → · · · is trivial except in
degree i = 0.
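Here is a small worked example of the whole construction (added; it uses the Ext language mentioned earlier). Take F = HomZ (Z/(2), −) : Ab → Ab, which is left exact, and X = Z with injective resolution 0 → Z → Q → Q/Z → 0 (both Q and Q/Z are divisible, hence injective). Applying F and dropping the X term gives the complex 0 → HomZ (Z/(2), Q) → HomZ (Z/(2), Q/Z) → 0, which is 0 → 0 → Z/(2) → 0. So R0 F (Z) = HomZ (Z/(2), Z) = 0, and R1 F (Z) = Z/(2) – this R1 F is exactly the Ext functor Ext1Z (Z/(2), −) applied to Z.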

Example 278
Let X be a topological space, and consider the global section functor Γ : Sh(X) → Ab sending F to the abelian
group F(X) (basically taking U to be the entire space). This is left exact, and Sh(X) has enough injectives, so we
get functors Ri Γ : Sh(X) → Ab for all i ≥ 0.

Theorem 279
Suppose X is a second countable topological space (meaning it has a countable base), and suppose every point
has an open neighborhood homeomorphic to an open set in Rn (for example any manifold). Then the derived
functors of the global section functor can be applied to the constant sheaf Z. We then have

Ri Γ(ZX ) ≅ H i_sing (X, Z),

where H i_sing (X, Z) is the usual singular cohomology of X with coefficients in Z from algebraic topology.

26 November 30, 2022


Last time, we considered the following situation: if C, D are abelian categories with C having enough injectives, and we
have an additive functor F : C → D which is covariant and left exact, then we can define the right derived (covariant)
functors Ri F : C → D in the following way: any object X has an injective resolution 0 → X → I • , and throwing
away X and then applying F gives us a sequence F I • (no longer necessarily exact, but still a complex). Then we set
Ri F (X) = H i (F I • ); we showed last time that this is indeed functorial and well-defined. (And in particular, R0 F = F .)
We can similarly see that if F is contravariant but still left exact, and C has enough projectives, we can again
define (now contravariant) right derived functors Ri F : C → D. Here, we take a projective resolution P • → X → 0,
throw away X, and apply F to get a complex F P • . (In other words, given P −2 → P −1 → P 0 → X → 0, we look at
the complex F P 0 → F P −1 → F P −2 → · · · . Just to keep notation consistent, we define (F P )i = F P −i , and then we
can define Ri F (X) = H i ((F P )• ) – all of the analogous properties will still hold.)
The two other similar situations also work out the way we might expect – if F is covariant and right exact, and
C has enough projectives, we now get (covariant) left derived functors Li F : C → D, in which we take a projective
resolution P • → X → 0, throw away X and apply F to get F P • , and define Li F (X) = H −i (F P • ). (So for example,
L1 F is the kernel of the map F P −1 → F P 0 , modulo the image of the map F P −2 → F P −1 .) And finally, if F is
contravariant and right exact, and C has enough injectives, we get (contravariant) left derived functors Li F : C → D
by taking the injective resolution 0 → X → I • , throwing away X, and applying F to get F I • . From I 0 → I 1 → I 2 → · · ·
we thus get a complex · · · → F I 2 → F I 1 → F I 0 , and we can again renumber and define ((F I)• )i = F I −i so that
Li F (X) = H −i ((F I)• ).
To say a bit more about these left and right derived functors, they were introduced because the original functors
F are not fully exact. Motivated by that, if we have 0 → X → Y → Z → 0, and X, Y, Z have injective resolutions, we
may want to relate the short exact sequence on X, Y, Z to short exact sequences simultaneously on all terms of the
injective resolutions.

Proposition 280
Let C have enough injectives, and suppose 0 → X → Y → Z → 0 is short exact in C. Then there exist injective
resolutions 0 → X → I • , 0 → Y → K • , 0 → Z → J • , such that the diagram below commutes and the rows are
exact:

     0        0        0
     ↓        ↓        ↓
0 →  X   →   Y    →   Z   → 0
     ↓        ↓        ↓
0 → I •  →  K •   →  J •  → 0

(Like last time, there is an exactly analogous situation for projectives.)

Proof. We've shown previously that if 0 → I → Y → Z → 0 is short exact with I injective, then we in fact have
Y ≅ I ⊕ Z. So the only way this result can hold is if K i ≅ I i ⊕ J i . Thus we can choose injective
resolutions 0 → X → I • and 0 → Z → J • and set K i = I i ⊕ J i . (So notice that in fact this result is stronger than
stated – we can choose any I • and J • and there is still a suitable K • .)
We’ll do the proof in the case of R-modules. Our goal is to show that 0 → Y → K • is an injective resolution, and
we must define maps Y → I 0 ⊕ J 0 and I i ⊕ J i → I i+1 ⊕ J i+1 . Since we need the square with Y, Z, K 0 , J 0 to commute,
we must send y to (h−1 y , δJ−1 y ), such that (now looking at the square X, Y, I 0 , K 0 ) h−1 restricted to X is δI−1 . (Here
note that the −1s are indices, not inverses.) And such a map h−1 exists because I 0 is injective, so a map X → I 0
extends to Y through our map X → Y . So that gives us 0 → I 0 → I 0 ⊕ J 0 → J 0 → 0, and the diagram we want
commutes at i = 0. And now we can check that Y → I 0 ⊕ J 0 is injective: if y maps to zero, then going around the
right square shows y maps to zero in Z, so y comes from X, and then h−1 (y ) = δI−1 (y ) = 0 forces y = 0 because
δI−1 is injective.
For a general stage i , we know the map from I i ⊕ J i should send (x, y ) to (something, δJi y ) by commutativity of
the right square, and in fact we must have (x, y ) 7→ (δIi x + hi y , δJi y ) where hi is a map J i → I i+1 . It remains to check
that this process gives an injective resolution 0 → Y → I 0 ⊕ J 0 → I 1 ⊕ J 1 → · · · if we choose hi correctly, so we must
check exactness. First of all, we must have δ ◦ δ = 0; computing the composite I i ⊕ J i → I i+2 ⊕ J i+2 yields

(x, y ) 7→ (δIi x + hi y , δJi y ) 7→ (δIi+1 hi y + hi+1 δJi y , 0)

and thus we need δIi+1 hi + hi+1 δJi = 0 . (And we also need to think about the case i = −1 separately, but in that
case it turns out we do require δI0 h−1 + h0 δJ−1 = 0.) But it turns out this condition also guarantees exactness –
if (x, y ) ∈ I i ⊕ J i maps to zero in I i+1 ⊕ J i+1 , then δJi y = 0 and δIi x + hi y = 0, but by exactness of J we know
y = δJi−1 y ′ for some y ′ , and thus the second equation becomes 0 = δIi x + hi δJi−1 y ′ = δIi (x − hi−1 y ′ ) (last step by
our condition). Then by exactness of I we see that x − hi−1 y ′ = δIi−1 x ′ for some x ′ , so in fact (x ′ , y ′ ) maps to
(δIi−1 x ′ + hi−1 y ′ , δJi−1 y ′ ) = (x, y ). So exactness is automatic and we just need to construct hi satisfying that boxed
condition.
And we can do this inductively – we’ve already constructed h−1 , and the construction of h0 is left as an exercise
(it’s similar). For the general case, we want to construct a map hi+1 : J i+1 → I i+2 for any i ≥ 0 which makes the
map J i → I i+2 : −δIi+1 ◦ hi factor through J i → J i+1 . But we can first map J i → coker(δJi−1 ), and we can check
that J i → I i+2 factors through that cokernel. Indeed, the composite map J i−1 → J i → I i+2 is −δIi+1 hi δJi−1 , and by
inductive hypothesis we can use the condition for i to show that this is in fact δIi+1 δIi hi−1 = 0. So we get a map
coker(δJi−1 ) → I i+2 , and then by injectivity of I i+2 this gives us a map J i+1 → I i+2 because coker(δJi−1 ) → J i+1 is
injective.

So now we know that we can choose compatible I • , K • , J • injective resolutions – applying F to them, we get an
exact sequence 0 → F I • → F K • → F J • → 0. What we’ve shown is that the columns (each of the complexes)
will no longer be exact, but the rows are exact (so we lose exactness F X → F Y → F Z, but we still have exactness
F I • → F K • → F J • ).

Lemma 281
Suppose C • , D• , E • are complexes that fit into a commutative diagram 0 → C • --f •--> D• --g •--> E • → 0 with exact
rows. By functoriality, we get a sequence H i (C • ) → H i (D• ) → H i (E • ), and it turns out we can extend
this to a sequence H i (C • ) → H i (D• ) → H i (E • ) → H i+1 (C • ) → H i+1 (D• ) → H i+1 (E • ) → · · · which is exact
everywhere. (The map H i (E • ) → H i+1 (C • ) is called the boundary map.)

Corollary 282
If F : C → D is left exact, C has enough injectives, and we have 0 → X → Y → Z → 0 exact in C, then we get
an exact sequence

0 → R0 F X → R0 F Y → R0 F Z → R1 F X → R1 F Y → R1 F Z → R2 F X → · · · ,

and because this sequence starts as 0 → F X → F Y → F Z we have a way of measuring the failure of exactness
under F .

Beginning of proof of Lemma 281. Consider the diagram below:

0 → C i−1 → Di−1 → E i−1 → 0
0 → C i → Di → E i → 0
0 → C i+1 → Di+1 → E i+1 → 0

(with the vertical maps being the differentials δ)

The kernel of the map δCi : C i → C i+1 maps to the kernel of the map δDi , which maps to the kernel of δEi
(since the image of something in C i going to zero in C i+1 goes to zero in Di+1 , and so on). Furthermore, the
composite map ker δCi → ker δDi → ker δEi is zero because the composite map C i → Di → E i is zero. We also
similarly have maps im(δCi−1 ) → im(δDi−1 ) → im(δEi−1 ). So to show exactness at H i (D• ), we start with some
d ∈ ker δDi whose class d + im(δDi−1 ) maps to zero in H i (E • ). Then g i (d) = δEi−1 (e) for some e, and by surjectivity
e = g i−1 (d ′ ) for some d ′ , so g i (d) = δEi−1 (g i−1 (d ′ )) = g i (δDi−1 (d ′ )) – thus g i (d − δDi−1 d ′ ) = 0, so
d − δDi−1 d ′ = f i (c) for some c. But by injectivity of f i+1 , to show δCi (c) = 0 it suffices to compute
f i+1 (δCi (c)) = δDi (f i (c)) = δDi (d) − δDi δDi−1 d ′ = 0. So δCi (c) = 0, and now the class c + im(δCi−1 ) maps onto
f i (c) + im(δDi−1 ) = d + im(δDi−1 ). So in summary, starting with something in H i (D• ) (specifically d + im(δDi−1 ))
which maps under g to zero, we found some c + im(δCi−1 ) which maps to it. That shows exactness at H i (D• ), and
next time we'll show exactness at H i (C • ) and H i (E • ) as well.

27 December 2, 2022
Last lecture, we mentioned that given an exact sequence of complexes 0 → C • → D• → E • → 0, there are maps in
cohomology H i (C • ) → H i (D• ) → H i (E • ) for each i , and we can in fact add boundary maps H i (E • ) → H i+1 (C • ) to
make an exact sequence · · · → H i (C • ) → H i (D• ) → H i (E • ) → H i+1 (C • ) → H i+1 (D• ) → H i+1 (E • ) → · · · .

Proof of Lemma 281, continued. We look at the diagram from last time again:

        Di−1 --g i−1--> E i−1 → 0
          ↓ ∂Di−1          ↓ ∂Ei−1
0 → C i --f i--> Di --g i--> E i → 0
      ↓ ∂Ci       ↓ ∂Di       ↓ ∂Ei
0 → C i+1 --f i+1--> Di+1 --g i+1--> E i+1 → 0
      ↓ ∂Ci+1     ↓ ∂Di+1
0 → C i+2 → Di+2

To construct a map H i (E • ) → H i+1 (C • ), we start with some element e ∈ ker ∂Ei and want to consider where
e + im(∂Ei−1 ) goes. By surjectivity, there is some d ∈ Di so that g i d = e. Then we know that g i+1 (∂Di d) = ∂Ei e = 0
(by going around the bottom right square in both ways, and by using the definition of e), so by exactness there is some
c ∈ C i+1 such that f i+1 c = ∂Di d. We claim c is in the kernel of ∂Ci+1 : the map f i+2 : C i+2 → Di+2 is injective
and f i+2 (∂Ci+1 c) = ∂Di+1 f i+1 c = ∂Di+1 ∂Di d = 0, which means ∂Ci+1 c = 0. We thus define the boundary map by
sending e + im(∂Ei−1 ) to c + im(∂Ci ).
We must check that this is well-defined, since we made arbitrary choices of e and d in this construction. Indeed,
suppose e ′ + im(∂Ei−1 ) = e + im(∂Ei−1 ), and let d ′ be a lift of e ′ with f i+1 c ′ = ∂Di d ′ as above. Then
e ′ − e = ∂Ei−1 e ′′ for some e ′′ ∈ E i−1 . Lifting e ′′ to d ′′ ∈ Di−1 (since the map Di−1 → E i−1 is surjective), we have
g i−1 d ′′ = e ′′ , so

e ′ − e = ∂Ei−1 e ′′ = ∂Ei−1 g i−1 d ′′ = g i ∂Di−1 d ′′

(by looking at the top right square), and thus d ′ − d − ∂Di−1 d ′′ ∈ ker(g i ), so by exactness in the i th row we can
write it as f i c ′′ for some c ′′ ∈ C i . Applying ∂Di to both sides, ∂Di (d ′ ) − ∂Di (d) = ∂Di ∂Di−1 d ′′ + ∂Di f i c ′′ – the first
term on the right-hand side is zero, and the second term becomes f i+1 ∂Ci c ′′ . And now remembering that we defined
f i+1 (c ′ ) = ∂Di (d ′ ) and f i+1 (c) = ∂Di (d), we see that

f i+1 (c ′ ) − f i+1 (c) = f i+1 ∂Ci (c ′′ ) =⇒ c ′ − c = ∂Ci c ′′

by injectivity of f i+1 . But this means that c ′ + im(∂Ci ) = c + im(∂Ci ), so regardless of our choice of d and e we end
up with the same element in cohomology, meaning our map H i (E • ) → H i+1 (C • ) is well-defined.
We now need to check that we have a complex: first, we check that the composite map H i (D• ) → H i (E • ) →
H i+1 (C • ) is zero. A typical element of H i (D• ) is of the form d + im(∂Di−1 ) where ∂Di d = 0; this is then sent to
g i (d) + im(∂Ei−1 ). But then following the prescription of our boundary map, we first find a preimage of g i (d), and an
obvious choice is d itself; by definition we are then sent to ∂Di d + im(∂Ci ) (viewing ∂Di d as an element of
C i+1 ⊂ Di+1 ). But the composite map must then be zero because ∂Di d = 0 by definition. Similarly, the composite
H i (E • ) → H i+1 (C • ) → H i+1 (D• ), sending e + im(∂Ei−1 ) 7→ c + im(∂Ci ) 7→ f i+1 c + im(∂Di ), is zero, because
f i+1 c = ∂Di d is in im(∂Di ) by definition.
Next, we must check exactness itself, first at H i (E • ) – suppose e + im(∂Ei−1 ) maps to 0 in H i+1 (C • ). That means
that c is an element of im(∂Ci ), so c = ∂Ci c ′ for some c ′ ∈ C i . We wish to show that this means e + im(∂Ei−1 ) is in
the image of H i (D• ) → H i (E • ) – indeed, ∂Di d = f i+1 c = f i+1 ∂Ci c ′ = ∂Di f i c ′ , so d − f i c ′ is in the kernel of ∂Di .
Then applying g i to (d − f i c ′ ) + im(∂Di−1 ) ∈ H i (D• ), we see that

(d − f i c ′ ) + im(∂Di−1 ) 7→ g i d − g i f i c ′ + im(∂Ei−1 ),

but g i f i is zero by exactness of row i , and g i d = e by definition, meaning that e is indeed mapped to by some element
in H i (D• ). Finally, to check exactness at H i+1 (C • ), start with some c + im(∂Ci ) ∈ H i+1 (C • ) such that
f i+1 c ∈ im(∂Di ) (that is, mapping to zero in H i+1 (D• )); we wish to prove this is hit by the boundary map. If we say
that f i+1 c = ∂Di d, then ∂Ei (g i d) = g i+1 ∂Di d = g i+1 f i+1 c = 0, so we can consider
g i d + im(∂Ei−1 ) ∈ H i (E • ), and under our boundary map it is mapped to c + im(∂Ci ), as desired.

We’ve done the proof here just for R-modules, but the argument works for general abelian categories – it just looks
messier because we can’t talk about elements in the same way.

Lemma 283
Suppose there is a commutative diagram of complexes with exact rows as shown below:

0 → C • → D• → E • → 0
     ↓ f •    ↓ g •    ↓ h•
0 → C ′• → D′• → E ′• → 0

Then the diagram below also commutes:

··· → H i (C • ) → H i (D• ) → H i (E • ) → H i+1 (C • ) → ···
         ↓ H i (f • )    ↓ H i (g • )    ↓ H i (h• )    ↓ H i+1 (f • )
··· → H i (C ′• ) → H i (D′• ) → H i (E ′• ) → H i+1 (C ′• ) → ···

We basically just need to check that the new square formed by the boundary map commutes – this is a similar
argument to what we just did, and it’s left as an exercise to us.
Recall that if we have an additive left-exact functor F : C → D with C having enough injectives, then 0 → X →
Y → Z → 0 gives us a long exact sequence 0 → R0 F X → R0 F Y → R0 F Z → R1 F X → · · · coming from the exact
sequence of injective resolutions 0 → F I • → F K • → F J • → 0 (here using injectivity).

Lemma 284
Suppose C, D are abelian categories, F : C → D is left exact and additive, and C has enough injectives. Then
given a commutative diagram with exact rows between 0 → X → Y → Z → 0 and 0 → X ′ → Y ′ → Z ′ → 0, we
get the following commutative diagram:

0 → R0 F X → R0 F Y → R0 F Z → R1 F X → R1 F Y → R1 F Z → R2 F X → ···
     ↓ R0 F f   ↓ R0 F g   ↓ R0 F h   ↓ R1 F f   ↓ R1 F g   ↓ R1 F h   ↓ R2 F f
0 → R0 F X ′ → R0 F Y ′ → R0 F Z ′ → R1 F X ′ → R1 F Y ′ → R1 F Z ′ → R2 F X ′ → ···

Proof sketch. Using the proof last time, we can find injective resolutions I • , I • ⊕ J • , J • of X, Y, Z so that 0 → I • →
I • ⊕ J • → J • → 0 fits into the commutative diagram, using the maps k i : J i → I i+1 and then having the map
I i ⊕ J i → I i+1 ⊕ J i+1 send (x, y ) 7→ (∂I x + k i y , ∂J y ). Then we want to apply our result about long exact sequences
to 0 → F I • → F (I • ⊕ J • ) → F J • → 0. But we can compare the results we get from 0 → X → Y → Z → 0 and
0 → X ′ → Y ′ → Z ′ → 0. Given a map f : X → X ′ , we saw that the injective resolutions give us a map f • : I • → I ′• .
Similarly, for a map h : Z → Z ′ , we get a map h• : J • → J ′• . So now we want to take our map g : Y → Y ′ and get a
map I • ⊕ J • → I ′• ⊕ J ′• built out of f • and h• : we send (x, y ) to (f i x + ℓi y , hi y ), where we choose
ℓi : J i → I ′i to satisfy the properties that (1) the resulting square involving Y → Y ′ commutes, and (2) this really is
a map of complexes I • ⊕ J • → I ′• ⊕ J ′• . (So the key point is that we do everything on 0 → X → Y → Z → 0 first,
and then we must construct the map ℓi similarly to how we constructed k i but now with an extra condition coming
from compatibility – it's important that this is the last thing we do.)

These right derived functors we’ve defined are part of a more general notion:

Definition 285
A δ-functor C → D, denoted {S • }, is a sequence of additive functors S 0 , S 1 , S 2 , · · · all from C to D, such that for
any 0 → X → Y → Z → 0 exact in C, we get a long exact sequence 0 → S 0 X → S 0 Y → S 0 Z → S 1 X → S 1 Y →
S 1 Z → · · · (so these boundary maps S 0 Z → S 1 X are part of the definition of the δ-functor). Furthermore, the
“extra square” formed from functoriality must commute, meaning that for any commutative diagram with exact
rows 0 → X → Y → Z → 0 and 0 → X ′ → Y ′ → Z ′ → 0, we must have commutativity within the square formed
by S i Z, S i+1 X, S i Z ′ , and S i+1 X ′ .

This definition is a bit hard to work with, but there’s an extra property we may have that makes working with these
objects easier:

Definition 286
A δ-functor {S • } is a universal δ-functor if for any other δ-functor {T • } and any natural transformation φ0 : S 0 →
T 0 , there is a unique natural transformation φi : S i → T i for all i such that the square formed by S i Z, S i+1 X, T i Z,
and T i+1 X commutes for any exact sequence 0 → X → Y → Z → 0.

So the point is that we often understand the map at degree 0 quite well, and that can help us get maps between the
higher degrees without the construction. And it turns out that Ri F is a universal δ-functor – this is a useful technique
called dimension shifting, and we’ll go through the proof of that next time.

28 December 5, 2022
Last time, we considered the following situation: if C has enough injectives and F : C → D is a left exact additive
functor, we defined the right derived functors Ri F : C → D by taking an injective resolution 0 → X → I • , applying
F to it, and taking its cohomology. Canonically this doesn’t depend on the choice of injective resolution, and it
is indeed a functor. Then if 0 → X → Y → Z → 0 is exact, we found that we get a long exact sequence
Ri F X → Ri F Y → Ri F Z → Ri+1 F X → · · · (and in fact this is functorial). A generalization of this idea is a δ-functor,
which is a sequence of functors {S i } such that such a long exact sequence (including the boundary maps) can be
constructed. Specifically, it’s useful to understand universal δ-functors, which are δ-functors {S i } where given a
natural transformation φ0 : S 0 → T 0 , we can produce φi : S i → T i for all i in a way that is compatible with exact
sequences.

Lemma 287
The sequence {Ri F } is a universal δ-functor.

Proof. We’ve already seen that this sequence is a δ-functor, so we just need to check universality. Let S i be another
δ-functor, and suppose φ0 : R0 F = F → S 0 is a natural transformation. We will inductively construct φi : Ri F → S i ,
prove it is a natural transformation, and show that it is compatible with boundary maps. Basically, given any X ∈ ob(C),
we must construct φiX : Ri F (X) → S i (X). The way we do this is by embedding X in a sequence 0 → X → I → Q → 0
for some injective I (where Q is the resulting quotient). We then get a sequence Ri−1 F I → Ri−1 F Q → Ri F X as part
of our long exact sequence. But we showed that the higher derived functors Ri F I vanish for injectives I, so in fact the
map Ri−1 F Q → Ri F X is surjective. We also have S i−1 I → S i−1 Q → S i X as part of the long exact sequence for S,

so putting these together we can fill in this part of the diagram, with the left square commuting because φi−1 is a
natural transformation (and the dashed map φiX not yet constructed):

Ri−1 F I → Ri−1 F Q → Ri F X → 0
   ↓ φi−1 I     ↓ φi−1 Q     ↓ φiX
S i−1 I → S i−1 Q → S i X → 0

Since Ri F X is the cokernel of the map Ri−1 F I → Ri−1 F Q, we can factor the composite map Ri−1 F Q → S i−1 Q →
S i X through Ri F X as long as it is zero on im(Ri−1 F I). And indeed, the composite Ri−1 F I → Ri−1 F Q → S i−1 Q → S i X
is zero because it's the same as traveling along Ri−1 F I → S i−1 I → S i−1 Q → S i X, and the composition of those last
two maps is zero by exactness. So we do factor through Ri F X in a unique way, and now we need to check that this
is independent of I and that this is indeed a natural transformation. We can do those together by considering an
arbitrary map f : X → Y and considering exact sequences 0 → X → I → Q → 0 and 0 → Y → J → Q′ → 0 (the
point is that checking independence of I can be done by taking X = Y but choosing different injective resolutions).
Then we have a diagram as below (where g and h are the maps yet to be constructed):

0 → X → I → Q → 0
     ↓ f    ↓ g    ↓ h
0 → Y → J → Q′ → 0
J is injective and X → I is a monomorphism, so the composite X → Y → J extends to a map g : I → J making the left
square commute. Then again Q is the cokernel of X → I, so to produce a map h : Q → Q′ we just need to check
that the map X → I → J → Q′ is zero, which it is by exactness of the bottom row.
natural transformation condition (and remember that setting f to be the identity will show that our definition of φiX
does not depend on whether we use I or J as the injective resolution for X):
Ri F X →(Ri F f ) Ri F Y
↓φiX,I                  ↓φiY,J
S i X  →(S i f )  S i Y

For that, we first look at this commutative diagram, which commutes because φi−1 is natural:

Ri−1 F Q →(Ri−1 F h) Ri−1 F Q′
↓φi−1Q                        ↓φi−1Q′
S i−1 Q  →(S i−1 h)  S i−1 Q′

Then there is a surjective map from the top left corner here (Ri−1 F Q) to the top left corner Ri F X in the previous
diagram, as well as a map from the top right corner Ri−1 F Q′ to the top right corner Ri F Y . Because {Ri F } is a δ-functor,
the square formed by those four objects commutes. Similarly, we get a commutative square from the four objects
on the bottom. So we know those faces of our "cube" commute, and the front and back squares commute because of
the definition of φiX,I and φiY,J . So basically everything commutes except the original diagram we drew, and we
can check that that original face also commutes: start from an element of Ri F X and pull it back to
Ri−1 F Q (by surjectivity); then following the two paths to S i Y and using commutativity of all of the other
faces shows that they agree. So our maps are well-defined natural transformations.
Next, we check that this is compatible with boundary maps. For that, look at a short exact sequence 0 → X →
Y → Z → 0, yielding a boundary map Ri−1 F Z → Ri F X and also one S i−1 Z → S i X:

Ri−1 F Z → Ri F X
↓φi−1Z          ↓φiX
S i−1 Z  →  S i X
From our previous construction, we know this square already commutes when Y is injective. So we'll map X into
some injective: we know the corresponding square for 0 → X → I → Q → 0 commutes. Since X → Y is a
monomorphism and I is injective, the map X → I extends to a map Y → I, and Q is a cokernel so we get a map Z → Q:

0 → X → Y → Z → 0
      ∥       ↓       ↓
0 → X → I → Q → 0

Thus we get a commutative square by definition of φi :

Ri−1 F Q → Ri F X
↓φi−1Q          ↓φiX
S i−1 Q  →  S i X

We compare those two squares. The square formed by the four "right" objects commutes because the two S i Xs
and the two Ri F Xs are equal. The map Z → Q induces maps Ri−1 F Z → Ri−1 F Q and S i−1 Z → S i−1 Q, so again we form a
cube. Then because φi−1 is natural, the square formed by the four "left" objects also commutes. Also, the four top
objects and the four bottom objects form commutative squares, because {Ri F } and {S i } are δ-functors whose
boundary maps commute with morphisms of short exact sequences. So again all faces of the cube commute except the
original one, and that one follows because the map S i X → S i X is an injection (in fact an equality): we can chase the
arrows Ri−1 F Z → Ri F X → S i X → S i X and
Ri−1 F Z → S i−1 Z → S i X → S i X around both ways and find that we indeed get the same map.

The whole principle of this proof is that we reduced something in degree i to something in degree (i − 1) by
considering injectives instead.
We’ll next make another general definition:

Definition 288
An object X ∈ ob(C) is acyclic for an additive functor F if Ri F X = 0 for all i > 0.

For example, we’ve seen that all injective objects are acyclic, but sometimes there are acyclic objects that are not
injective and it’s useful to work with the more general class (at least for the study of the functor F alone).

Lemma 289
If X embeds into some acyclic object A via 0 → X → A → Q → 0, then we get exact sequences F A → F Q → R1 F X → 0
and 0 → Ri F Q → Ri+1 F X → 0 for all i > 0 (so Ri F Q ∼= Ri+1 F X in positive degrees).

In other words, we can understand the higher right derived functors applied to X in terms of how they are applied
to the corresponding quotient one degree down. And we can repeat this process as well:

Lemma 290
Suppose we have a sequence of objects Qi and a sequence of acyclic objects Ai , such that Q0 = X and 0 → Qi →
Ai → Qi+1 → 0 is short exact for all i . Then applying the previous argument, we have

Rm F X ∼= Rm−1 F Q1 ∼= Rm−2 F Q2 ∼= · · · ∼= R1 F Qm−1 = coker(F Am−1 → F Qm ).

So if we can embed in this way, we can understand the higher derived functors in terms of the cokernel of F
applied to a single map. And we can do even better: if 0 → Qm → Am → Qm+1 → 0 is our sequence and we apply
F to it, we get 0 → F Qm → F Am → F Qm+1 because F is left exact. Then Qm+1 embeds into Am+1 , so F Qm+1
injects into F Am+1 (left exactness again); since F Am → F Am+1 factors as F Am → F Qm+1 → F Am+1 , we still
have left exactness 0 → F Qm → F Am → F Am+1 , so F Qm = ker(F Am → F Am+1 ).
Plugging this into the previous fact, we actually find that

Rm F X ∼= H m (F A• ),

since this is basically saying that we have an acyclic resolution 0 → X → A0 → A1 → A2 → · · · of X. So that brings
us back to the idea that we had with injective resolutions.
Our last topic of the class will be the Ext and Tor functors, which are perhaps the simplest applications of
homological algebra to describe, though not the most interesting. (But they appear regularly on the qualifying exam.)
We’ll begin with Ext – suppose C is a general abelian category. Fixing any X ∈ ob(C), we know that HomC (X, ·)
is a covariant, left exact functor from C to Ab. We can then think about the right derived functors of it:

Definition 291
If C has enough injectives, we define
Ext̄iC (X, ·) = Ri HomC (X, ·)
(the bar records that this version is derived using injective resolutions, to distinguish it from the next definition).

Similarly, if Y is any object of C, we have a contravariant functor HomC (·, Y ) : C → Ab which will again be left
exact. We can then look at its right derived functors as well:

Definition 292
If C has enough projectives, we define

ExtiC (·, Y ) = Ri HomC (·, Y ).

So more concretely, if 0 → Y1 → Y2 → Y3 → 0 is a short exact sequence, then we get a long exact sequence

0 → HomC (X, Y1 ) → HomC (X, Y2 ) → HomC (X, Y3 ) → Ext̄1 (X, Y1 ) → Ext̄1 (X, Y2 ) → · · · ,

and similarly if we have a short exact sequence 0 → X1 → X2 → X3 → 0, we get a long exact sequence

0 → HomC (X3 , Y ) → HomC (X2 , Y ) → HomC (X1 , Y ) → Ext1 (X3 , Y ) → Ext1 (X2 , Y ) → · · · .

But the point is that these are actually the same if they both exist (so measuring lack of exactness will yield the same
result whether we are taking homomorphisms into the objects or out of them).

Theorem 293
If C has enough injectives and enough projectives, then ExtiC (X, Y ) ∼= Ext̄iC (X, Y ).

To show this, we’ll need a better characterization of Ext:

Lemma 294
The following are equivalent:
1. X is projective,

2. Exti (X, Y ) = 0 for all i > 0 and all Y ,

3. Ext1 (X, Y ) = 0 for all Y .

Proof. To show that (1) implies (2), we calculate Exti (X, Y ) by finding a projective resolution of X; since X is itself
projective, we can just use · · · → 0 → 0 → X → X → 0. Then Exti (X, Y ) is the cohomology of the complex

Hom(X, Y ) → Hom(0, Y ) → Hom(0, Y ) → · · · ,

which is Hom(X, Y ) if i = 0 and 0 otherwise, so indeed condition (2) is satisfied. (2) clearly implies (3). Finally, suppose
(3) holds. Given X we can find some projective P such that we get a sequence 0 → K → P → X → 0 – we claim that
this sequence splits, so that X is a direct summand of a projective module and thus projective itself. For that, we need
to construct a map P → K by looking at homomorphisms into K: there is an exact sequence 0 → Hom(X, K) →
Hom(P, K) → Hom(K, K) → Ext1 (X, K) → · · · , but by assumption Ext1 (X, K) is zero. Thus the map Hom(P, K) →
Hom(K, K) is surjective, meaning the identity map 1K ∈ Hom(K, K) lifts to some map f : P → K. That means that the
composition K →(ι) P →(f ) K is the identity (by definition of the map Hom(P, K) → Hom(K, K)), so P is isomorphic
to K ⊕ ker f and ker f is isomorphic to X.
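For instance (a concrete check, not worked out in the lecture): over Z the module Z/nZ is not projective for n > 1, and criterion (3) detects this. Applying Hom(·, Z) to the short exact sequence 0 → Z →(×n) Z → Z/nZ → 0 gives an exact sequence

0 → Hom(Z/nZ, Z) → Hom(Z, Z) →(×n) Hom(Z, Z) → Ext1 (Z/nZ, Z) → Ext1 (Z, Z) = 0,

and since Hom(Z, Z) = Z, we read off Ext1 (Z/nZ, Z) ∼= Z/nZ ≠ 0.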

We’ll develop some more results on Ext and prove that these two definitions are indeed equivalent next time!

29 December 7, 2022
Last time, we considered the covariant, left exact functor HomC (X, ·), which gives us right derived functors Ext̄iC (X, ·)
if C has enough injectives, and the contravariant, left exact functor HomC (·, Y ), which gives right derived functors
ExtiC (·, Y ) if C has enough projectives. Our goal is to show that these two Ext functors are actually the same when
they both exist – last time, we showed that (when C has enough projectives) an object X is projective if and only if
Exti (X, Y ) = 0 for all i > 0 and all Y (that is, X is acyclic for HomC (·, Y ) for all Y ), and in fact it's also equivalent to
just checking that Ext1 (X, Y ) = 0 for all Y . We'll now prove a similar criterion for injectivity:

Lemma 295
Suppose C has enough projectives (we’ll assume this throughout the rest of the lecture). Then the following are
equivalent:
1. Y is injective,

2. Exti (X, Y ) = 0 for all i > 0 and all X (in other words, Exti (·, Y ) = 0 for all i > 0),

3. Ext1 (X, Y ) = 0 for all X.

Proof. For (1) implies (2), start with a projective resolution P • → X → 0. We then get 0 → Hom(X, Y ) →
Hom(P • , Y ), and then throwing away Hom(X, Y ) gives us cohomology. So we’re just trying to prove that we have
an exact sequence Hom(P • , Y ), but Hom(·, Y ) is exact if and only if Y is injective, so we indeed have trivial Exti
for all i > 0 and all X, as desired. (To explain why Hom(·, Y ) is exact, start with 0 → A → B → C → 0. Taking
homomorphisms into Y , we know it’s left exact so we have 0 → Hom(C, Y ) → Hom(B, Y ) → Hom(A, Y ), so the
question is whether that last map is surjective. So we want to ask whether a map from A to Y is a restriction of a
map B to Y , but that’s the definition of being injective.)
(2) implies (3) is again clear. For (3) implies (1), suppose we have 0 → A → B, and there is a map A → Y
and we want to show that we can extend it to a map B → Y . Indeed, we can complete the exact sequence to
0 → A → B → Q → 0 – the corresponding exact sequence is

0 → Hom(Q, Y ) → Hom(B, Y ) → Hom(A, Y ) → Ext1 (Q, Y ) → · · · ,

and by assumption Ext1 (Q, Y ) = 0 so Hom(B, Y ) → Hom(A, Y ) is indeed surjective, meaning that any map A → Y
can indeed be extended to a map B → Y .
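As an example (a standard fact, not proved here): in Ab the injective objects are exactly the divisible abelian groups, and Q/Z is divisible. Consistently with the lemma, applying Hom(·, Q/Z) to 0 → Z →(×n) Z → Z/nZ → 0 gives an exact sequence

Hom(Z, Q/Z) →(×n) Hom(Z, Q/Z) → Ext1 (Z/nZ, Q/Z) → Ext1 (Z, Q/Z) = 0,

and multiplication by n is surjective on Hom(Z, Q/Z) = Q/Z by divisibility, so Ext1 (Z/nZ, Q/Z) = 0 as predicted.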

(Note that in the previous proof, we similarly used the fact that X is projective if and only if Hom(X, ·) is exact.)
Next, recall that Exti (X, Y ) was defined as a functor in X, but we want to show that it actually behaves as a functor
in Y as well:

Lemma 296
The map Y 7→ Exti (X, Y ) is a functor.

Proof. Given a map f : Y1 → Y2 , we need to produce a map Exti (X, Y1 ) → Exti (X, Y2 ). We'll do so by producing a
natural transformation Exti (·, Y1 ) → Exti (·, Y2 ) (and then we can substitute any X in). In degree 0, we
have a map φ : Hom(·, Y1 ) → Hom(·, Y2 ) sending each morphism g to f ◦ g. We must check that this is natural by
taking any h : X → X ′ and checking the diagram:
Hom(X ′ , Y1 ) →(φX ′ ) Hom(X ′ , Y2 )
↓(− ◦ h)                      ↓(− ◦ h)
Hom(X, Y1 )  →(φX )  Hom(X, Y2 )

Indeed, going around the bottom takes a morphism g to g ◦ h to f ◦ g ◦ h, and going around the top takes
us from g to f ◦ g to f ◦ g ◦ h, so we do have a natural transformation. (The other diagram is checked similarly.)
And because Exti (·, Y2 ) is a δ-functor (also universal, but that's not important here) and Exti (·, Y1 ) is a universal δ-functor, for
each i there is a unique natural transformation φi : Exti (·, Y1 ) → Exti (·, Y2 ) compatible with the boundary maps. So
in fact Exti (X, Y1 ) → Exti (X, Y2 ) can just be φiX – we can then check that functoriality does actually hold because of
naturality and the uniqueness of the φi .
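For instance (a small computation in Ab using the recipe of Definition 292, not from the lecture): resolving X = Z/4Z by 0 → Z →(×4) Z → Z/4Z → 0 shows that Ext1 (Z/4Z, Y ) ∼= Y /4Y for any abelian group Y , and a map f : Y1 → Y2 induces the evident map Y1 /4Y1 → Y2 /4Y2 . For the reduction map Z → Z/2Z, this is the surjection Z/4Z → Z/2Z.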

Lemma 297
For any 0 → Y → Z → W → 0, we get a long exact sequence Exti (X, Z) → Exti (X, W ) → Exti+1 (X, Y ) →
Exti+1 (X, Z) → · · · .

Again, we already know we have a long exact sequence in the first variable, but we’re saying that we get this
δ-functor-like property in the second variable as well.

Proof. The recipe for Exti (X, ·) was to take a projective resolution P • → X → 0 and apply homomorphisms, giving
0 → Hom(P • , Y ) → Hom(P • , Z) → Hom(P • , W ) → 0. If this sequence is short exact for each P i , we get the desired
long exact sequence, so we must just check that short exactness holds. But that's true because P i is projective, so
Hom(P i , ·) is exact.

Lemma 298
Exti (X, ·) is a δ-functor.

Proof sketch. We’ve already constructed the long exact sequence, and we just need to check that it is functorial. But
each long exact sequence 0 → Hom(P • , Y ) → Hom(P • , Z) → Hom(P • , W ) → 0 can be thought of as a “plane” with
P i s in one direction and Y, Z, W in the other, and we just need to diagram chase with two such planes next to each
other. This is left as an exercise to us.

Lemma 299
Now suppose C also has enough injectives. Then there is an isomorphism Ext̄i (X, Y ) → Exti (X, Y ) – in fact, there is
a natural isomorphism Ext̄i (X, ·) → Exti (X, ·).

(This means that Ext̄i (X, Y ) is isomorphic to Exti (X, Y ) in a way functorial in Y .)

Proof. The usual way of constructing maps between right derived functors is as follows: we start with the identity map
Hom(X, ·) → Hom(X, ·), which is a natural transformation. By construction, Ext̄i (X, ·) is a universal δ-functor, and
we've just checked that Exti (X, ·) is a δ-functor as well in Lemma 298, though this time we don't know it is universal
yet because it didn't arise as a right derived functor. (It turns out that a δ-functor which vanishes on injectives must be
a universal δ-functor, so we could check that it vanishes on injectives, which is true by the first lemma of this lecture.
But we haven't discussed that fact.) Thus there are unique maps Ext̄i (X, ·) → Exti (X, ·) compatible with boundary
maps which extend the identity in degree 0.
We will show this is an isomorphism by induction on i using a dimension-shifting argument: since we have enough
injectives, we have a short exact sequence 0 → Y → I → Q → 0, which yields a long exact sequence Ext̄i−1 (X, I) →
Ext̄i−1 (X, Q) → Ext̄i (X, Y ) → Ext̄i (X, I). But because i > 0 and I is injective, Ext̄i (X, I) = 0. Similarly, we have
the same long exact sequence for Exti , namely Exti−1 (X, I) → Exti−1 (X, Q) → Exti (X, Y ) → Exti (X, I) = (0). We
then get maps between them:

· · · → Ext̄i−1 (X, I) → Ext̄i−1 (X, Q) → Ext̄i (X, Y ) → Ext̄i (X, I) = (0) → · · ·
               ↓∼=                 ↓∼=                ↓
· · · → Exti−1 (X, I) → Exti−1 (X, Q) → Exti (X, Y ) → Exti (X, I) = (0) → · · ·

By induction, we have an isomorphism in the first two columns, and we want to check that the third column is an
isomorphism. That holds because Ext̄i (X, Y ) and Exti (X, Y ) are both cokernels of the map from the first column to
the second (in the top and bottom rows respectively), and those two maps are identified by the isomorphisms, so the
cokernels are isomorphic. And in fact when i > 1 the first column vanishes as well, so we have isomorphisms between
Y in degree i and Q in degree (i − 1).

Example 300
We’ll work in the example of Z-modules (abelian groups). Notice that ExtiZ (Z, A) = A if i = 0 and 0 otherwise,
because Z is a projective Z-module. On the other hand, we can compute ExtiZ (Z/nZ, A) by looking at the short
exact sequence 0 → Z →(×n) Z → Z/nZ → 0, yielding a long exact sequence

0 → Hom(Z/nZ, A) → Hom(Z, A) → Hom(Z, A) → Ext1 (Z/nZ, A) → Ext1 (Z, A),

where the last term is in fact zero.

We know that the map Hom(Z, A) → Hom(Z, A) is induced by the multiplication-by-n map, so Ext1 (Z/nZ, A) is the cokernel of that,
which is A/nA, and Hom(Z/nZ, A) is the n-torsion of A, denoted A[n]. And then because Exti (Z, A) = 0 for all i > 0,
we find that Exti (Z/nZ, A) = 0 if i > 1, A/nA if i = 1, and A[n] if i = 0. Additionally, by functoriality, a short exact
sequence 0 → A → B → C → 0 yields 0 → Ext0 (Z/nZ, A) → Ext0 (Z/nZ, B) → Ext0 (Z/nZ, C) → · · · , or more
specifically an exact sequence

0 → A[n] → B[n] → C[n] → A/nA → B/nB → C/nC → 0 → 0 → · · · .

(This is basically the snake lemma if we have multiplication-by-n maps between two copies of 0 → A → B → C → 0.)
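To make this concrete (an instance not computed in the lecture), take n = 2 and the short exact sequence 0 → Z →(×2) Z → Z/2Z → 0. Then A[2] = B[2] = 0 and C[2] = Z/2Z, while A/2A = B/2B = C/2C = Z/2Z, so the six-term sequence reads

0 → 0 → 0 → Z/2Z → Z/2Z → Z/2Z → Z/2Z → 0.

The map A/2A → B/2B is induced by multiplication by 2 and is therefore zero, so exactness forces the boundary map C[2] → A/2A and the final map B/2B → C/2C to be isomorphisms.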

Example 301
As a challenge, we should try computing Exti (Q, Z). There are no nonzero homomorphisms Q → Z, so it turns out this is
zero for i = 0 and for i > 1, but there is a surprisingly complicated answer for i = 1.

This is all we’ll say about Ext, and we’re now ready to talk about Tor. We’ll just make the definition for now:
recall that M ⊗ · is a functor R-mod → R-mod sending N to M ⊗ N. This is covariant and right-exact, and R-mod
has enough projectives. Thus, we produce functors

TorRi (M, ·) = Li (M ⊗R ·)

for each i . This means that for any exact sequence 0 → N1 → N2 → N3 → 0, we get a long exact sequence (showing
failure of left exactness)

· · · → TorR1 (M, N1 ) → TorR1 (M, N2 ) → TorR1 (M, N3 ) → M ⊗ N1 → M ⊗ N2 → M ⊗ N3 → 0.

Definition 302
An R-module M is flat if M ⊗R · is exact.

We’ll see that flat R-modules turn out to be the acyclic ones, so we can use flat modules in place of projectives.
But we’ll understand this more next time.

30 December 9, 2022
Last time, we started discussing Tor: using the right-exact, covariant tensor M ⊗R · taking R-mod to itself, we get
the left derived functors Li (M ⊗R ·) = TorRi (M, ·), and we get the usual long exact sequence from the short exact
sequence. We mentioned that a module M is flat if M ⊗R · is exact (not just right exact), and this immediately gives
us the following equivalent definitions (since the derived functors of an exact functor are trivial):

Lemma 303
The following are equivalent:
1. M is a flat R-module,

2. Tori (M, ·) = 0 for all i > 0 (in other words, Tori (M, N) = 0 for all i > 0 and all modules N),

3. Tor1 (M, ·) = 0.

Proof. For (1) implies (2), take a projective resolution P • → N → 0. Then Tori (M, N) is the (−i)th cohomology
H −i (M ⊗ P • ) of the complex M ⊗ P • ; but because M is flat, tensoring the resolution with M still gives us an exact
sequence, and thus Tor is zero for all i > 0.
(2) implies (3) is clear. For (3) implies (1), suppose we have an exact sequence 0 → N1 → N2 → N3 → 0.
Tensoring with M yields a long exact sequence Tor1 (M, N3 ) → M ⊗ N1 → M ⊗ N2 → M ⊗ N3 → 0, but we assume
Tor1 (M, N3 ) = 0 so this is indeed short exact.

Lemma 304
M1 ⊕ M2 is flat if and only if M1 and M2 are flat (because tensor products distribute over direct sums). Also, R
is a flat R-module, and so is a general free R-module. In particular, projective modules are direct summands of
free modules, so they are also flat.

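On the other hand (a standard example, not from the lecture), flat does not imply projective: Q is a flat Z-module, since for any abelian group N we have

Q ⊗Z N ∼= S −1 N, where S = Z ∖ {0},

and localization is exact. But Q is not projective: over a PID every projective module is free (submodules of free modules are free), and Q is not free, since any two rationals are Z-linearly dependent while Q is not cyclic.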
Recall that Ext had the strange property that it can be considered a functor in either variable, and we’ll now
establish something similar for Tor.

Lemma 305
Given a short exact sequence 0 → M1 → M2 → M3 → 0, there is a long exact sequence “the wrong way around:”

· · · → Tor1 (M2 , N) → Tor1 (M3 , N) → M1 ⊗ N → M2 ⊗ N → M3 ⊗ N → 0

Proof. Take a projective resolution P • → N → 0. But since each P i is projective, it is also flat, which means that the
rows of 0 → M1 ⊗ P • → M2 ⊗ P • → M3 ⊗ P • → 0 are exact. The long exact sequence arising from this then gives
us the desired long exact sequence.
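As a quick application (a computation anticipating Example 311 below, not from the lecture): taking 0 → Z →(×2) Z → Z/2Z → 0 and N = Z/2Z, the "wrong way around" sequence reads

0 = Tor1 (Z, Z/2Z) → Tor1 (Z/2Z, Z/2Z) → Z ⊗ Z/2Z →(×2=0) Z ⊗ Z/2Z → Z/2Z ⊗ Z/2Z → 0,

so Tor1 (Z/2Z, Z/2Z) is the kernel of the zero map Z/2Z → Z/2Z, which is Z/2Z.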

Lemma 306
The following are equivalent:
1. N is a flat R-module,

2. Tori (M, N) = 0 for all i > 0 and all M (that is, N is acyclic for the functor M ⊗R · for all M),

3. Tor1 (M, N) = 0 for all M.

Proof. For (1) implies (2), we will argue by induction on i . For any such M, we can write it as part of a short exact
sequence 0 → K → P → M → 0 with P projective (and K the kernel of that map P → M). By the previous lemma,
and the fact (the first thing we proved today) that Tori (P, N) = 0 because P is projective and thus flat, we get a long
exact sequence
Tori (P, N) → Tori (M, N) → Tori−1 (K, N) → Tori−1 (P, N).

If i = 1, the last two modules here are instead K ⊗ N → P ⊗ N, and since K injects in P and N is flat, K ⊗ N will also
inject into P ⊗ N, meaning that exactness must yield Tor1 (M, N) = 0. (So this is really our base case.) Otherwise
if i > 1, we know that Tori−1 (K, N) = 0 by the inductive hypothesis, and we know Tori (P, N) = 0 because P is
projective and flat. Thus we have exactness 0 → Tori (M, N) → 0 and thus Tori (M, N) = 0 as desired.
(2) implies (3) is trivial. For (3) implies (1), we have an exact sequence 0 → M1 → M2 → M3 → 0 and we tensor
it with N. We then get (the “wrong way around” long exact sequence) the long exact sequence

Tor1 (M3 , N) → M1 ⊗ N → M2 ⊗ N → M3 ⊗ N → 0,

and by assumption the first module here is zero and we do have exactness.
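For example (a quick check, not from the lecture): N = Z/2Z is not flat, and we can see both sides of the criterion. Tensoring the injection Z →(×2) Z with Z/2Z gives the zero map Z/2Z → Z/2Z, which is not injective, so Z/2Z fails the definition of flatness; correspondingly, Tor1 (Z/2Z, Z/2Z) ∼= Z/2Z ≠ 0, as computed after Lemma 305.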

Proposition 307
The map {M 7→ Tori (M, N)} is a δ-functor (“in the wrong variable”).

Proof. To check that this is a functor, we take a morphism f : M1 → M2 and get a natural transformation φ :
M1 ⊗ · → M2 ⊗ ·, where φN = f ⊗ IdN sends M1 ⊗ N → M2 ⊗ N in the usual way. (Naturality can be checked from the
definition.) Because Tori (M1 , ·) is a universal δ-functor (being a derived functor), we get natural
transformations φi : Tori (M1 , ·) → Tori (M2 , ·) and thus φiN : Tori (M1 , N) → Tori (M2 , N).
Alternatively, we could also check more directly: taking a projective resolution P • → N → 0, we get a map
M1 ⊗ P • → M2 ⊗ P • which gives rise to cohomology H i (M1 ⊗ P • ) → H i (M2 ⊗ P • ).
Finally, by the lemmas we’ve just proved, we do get long exact sequences, and we need to check that they are
functorial – that last part is left as an exercise to us.

Lemma 308
Tor is symmetric in a natural way – that is, there is an isomorphism Tori (M, N) → Tori (N, M).

Proof. Fix M. Then Tori (M, ·) is a universal δ-functor, and Tori (·, M) is a (not-necessarily universal, for now) δ-
functor. Furthermore, in degree 0 we have the usual map M ⊗ · → · ⊗ M sending m ⊗ n 7→ n ⊗ m, which we
can check is a natural transformation.
Thus universality gives us natural transformations φi : Tori (M, ·) → Tori (·, M) compatible with boundary maps,
and we want to prove that we have an isomorphism. But like last time, we can use a dimension shifting argument
for this. We use induction on the statement “φi,N is an isomorphism for all N.” As mentioned, the base case i = 0
is true because M ⊗ N → N ⊗ M is an isomorphism, and now for i > 0 we can put N into an exact sequence
0 → K → P → N → 0 with P projective. We then get two long exact sequences in the rows

· · · → Tori (M, P ) → Tori (M, N) → Tori−1 (M, K) → Tori−1 (M, P ) → · · ·
             ↓                  ↓                  ↓∼=                  ↓∼=
· · · → Tori (P, M) → Tori (N, M) → Tori−1 (K, M) → Tori−1 (P, M) → · · ·

P is projective and thus flat, so the two modules in the left column are zero. Also, by the inductive hypothesis, the two
vertical maps in the right two columns are isomorphisms. But then Tori (M, N) and Tori (N, M) are the kernels of the
two maps from the third to the fourth column, which are identified by those isomorphisms, so they are isomorphic.

In particular, since Tori (M, ·) are the derived functors of the tensor product and thus form a universal δ-functor,
and Tori (·, M) are isomorphic to them in the way described above, Tori (·, M) also form a universal δ-functor with

Tori (·, M) = Li (· ⊗R M).

We’ll now mention some more properties of flat modules, starting with a “locality” property:

Lemma 309
The following are equivalent:
1. M is flat over R,

2. Mp is flat over Rp for all prime ideals p,

3. Mm is flat over Rm for all maximal ideals m.

Proof. This is basically using the fact that localization is an exact functor. For (1) implies (2), suppose 0 → N1 →
N2 → N3 → 0 is short exact, where each Ni is an Rp module. Viewing them as R-modules, since M is flat, we get an
exact sequence of R-modules
0 → M ⊗R N1 → M ⊗R N2 → M ⊗R N3 → 0.

But M ⊗R N1 is the same as M ⊗R (Rp ⊗Rp N1 ), and now by the associative law (even when the tensor products are
over different rings) this is the same as (M ⊗R Rp ) ⊗Rp N1 = Mp ⊗Rp N1 . So Mp is indeed flat over Rp because we get the short
exact sequence we want.
(2) implies (3) is trivial because all maximal ideals are prime. Now for (3) implies (1), take an exact sequence of
R-modules 0 → N1 → N2 → N3 → 0. We wish to tensor it with M, and because the tensor product is right exact we
have an exact sequence
0 → K → M ⊗R N1 → M ⊗R N2 → M ⊗R N3 → 0,
where K is the kernel of M ⊗R N1 → M ⊗R N2 .

We wish to show K = 0, and as we have previously shown it is sufficient to show that Km = 0 for all maximal ideals
m. Since localization is exact, and (M ⊗R N)m = Mm ⊗Rm Nm , localizing that sequence gives us

0 → Km → Mm ⊗Rm (N1 )m → Mm ⊗Rm (N2 )m → Mm ⊗Rm (N3 )m → 0.

But we also know that we have an exact sequence 0 → (N1 )m → (N2 )m → (N3 )m → 0 by localizing the original exact
sequence, and then by assumption (3) that sequence stays exact after tensoring with Mm over Rm , so we must have Km = 0, as
desired.

Lemma 310
If R is a noetherian local ring (meaning that it has a unique maximal ideal) and M is a finitely generated R-module,
then M is free over R if and only if M is flat over R.

(So for finitely generated modules over noetherian rings, flatness is equivalent to being locally free.)

Proof. The forward direction is clear. For the backwards direction, we use Nakayama's lemma: M/mM (where m is
the unique maximal ideal) is finitely generated over the residue field R/m and is thus free, so there is an isomorphism
M/mM → (R/m)⊕d . Lifting a basis gives a map R⊕d → M reducing to that isomorphism mod m, and it is surjective
by Nakayama. So there is a short exact sequence 0 → K → R⊕d → M → 0 for some kernel K coming from that
surjection. Tensoring that sequence with R/m, we get

TorR1 (R/m, M) → K/mK → (R/m)⊕d →(∼=) M/mM → 0.

But now we can use the fact that M is flat again to see that the Tor module on the left is zero. Thus K/mK = 0, but K
is a submodule of a finitely-generated module over a noetherian ring, so it is also finitely generated. So by Nakayama’s
lemma again this means K = 0 and thus M is indeed free.
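As an illustration of the difference between free and locally free (a classical example, not from the lecture): in R = Z[√−5], the ideal M = (2, 1 + √−5) is invertible, hence projective and flat, but it is not principal and therefore not free as an R-module. Its localizations Mp are finitely generated and flat over the noetherian local rings Rp , hence free of rank 1 by Lemma 310. So M is locally free but not free.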

Example 311
To calculate TorZi (Z/(n), Z/(m)), we take a projective resolution · · · → 0 → Z →(×m) Z → Z/(m) → 0. To
calculate Tor, we tensor with Z/(n) and take the (−i)th cohomology of (Z/(n) →(×m) Z/(n)), where that map goes
from degree −1 to degree 0.

Thus for i = 0 we want the cokernel of this map, which is Z/(m, n), and for i > 1 we just have zero. Finally, for
i = 1, we take the elements of Z/(n) which are killed by multiplication by m: these are the classes a + nZ such that n
divides ma, which is the same as the a such that n/ gcd(n, m) divides a. In other words, we end up finding that

TorZi (Z/(n), Z/(m)) = Z/(m, n) if i = 0, Z/(m, n) if i = 1, and 0 if i > 1,

though if we want to check functoriality with this we do need to be a bit careful with how we constructed the
isomorphism (n/ gcd(n, m))Z/nZ ∼= Z/ gcd(n, m)Z (it's not quite canonical).
So now if we consider the short exact sequence

0 → Z/(2) →(×2) Z/(4) → Z/(2) → 0

and we tensor with Z/(2), we end up with the long exact sequence

· · · → 0 → TorZ1 (Z/(2), Z/(2)) → TorZ1 (Z/(2), Z/(4)) → TorZ1 (Z/(2), Z/(2)) → Z/(2) →(×2=0) Z/(2) →(∼=) Z/(2) → 0,

where all three of the Tor terms are isomorphic to Z/(2). And then we see that the maps from left to right are an
isomorphism (because it is injective), then zero (by exactness), then an isomorphism for the boundary map:

0 → Z/(2) →(∼=) Z/(2) →(0) Z/(2) →(∼=) Z/(2) →(0) Z/(2) →(∼=) Z/(2) → 0.

And in particular this means that the pattern of isomorphisms and zero maps has shifted between the original sequence and the corresponding one between the Tor terms.
