Formalizing The Proof of An Intermediate-Level Algebra Theorem - An Experiment
Antoine Chambert-Loir
Abstract. — Proof assistants are computer programs that allow us to write mathematical
proofs so as to assess their correctness. In November 2021, I started the project of checking
the simplicity of the alternating groups within the Lean theorem prover and its mathlib
library. This text aims at reviewing this experiment.
1. Introduction
Human mathematics is written in plain language, and we all know examples of
shortcomings that lead to “proofs” of wrong results. We have also known for more
than a hundred years, notably through the works of Peano (1889) or Whitehead &
Russell (1927), that mathematics can be written using axiomatic systems and,
at least in principle, in a rigid syntactic way, so as to avoid such problems, at least
if the chosen axiomatic system does not lead to contradiction. I write “in principle”
because this rigid syntactic writing is extremely verbose: it took Whitehead and
Russell hundreds of pages to prove that 1 + 1 = 2. One may find a pleasant account
of this quest, aimed at a wide audience, in the comic book by Doxiadis & Papadimitriou
(2009).
Since the 1950s, the development of computers has led mathematicians to propose
using their mechanical power to develop fully formalized proofs of the mathematical
corpus. Among such proof assistants, let us mention N. G. de Bruijn’s Automath (1967),
A. Trybulec’s Mizar (1973), G. Huet’s team project Coq (1989), C. Coquand’s Agda
(1999) and L. de Moura’s Lean (2013).
In recent years, these programs have allowed us to check delicate parts of the
mathematical corpus: Appel and Haken’s proof of the Four color theorem (the regions
delimited by a finite planar graph can be colored in four colors so that any two
neighboring regions have different colors) and Feit and Thompson’s proof of the Odd
order theorem (any finite group of odd cardinality is solvable), by Gonthier (2008)
and Gonthier et al. (2013) respectively (both in Coq); Hales’s proof of the Kepler
conjecture (the standard, “cubic close”, sphere packing is the densest one), by Hales et al.
(2017) (in HOL Light); following the challenge of Scholze (2022), the proof of a
delicate homological algebra result of Clausen and Scholze (in Lean, the so-called
“Liquid tensor experiment”, 2022, by Commelin and Topaz, with the help of many
other people); or Gromov’s proof of the h-principle and the sphere eversion
theorem by Massot, van Doorn & Nash (2022), also in Lean.
Actually, the latter results were not formalized in plain Lean, but were built
on the Lean mathematical library mathlib. Led by a group of approximately
25 people, plus some 15 reviewers, this mathematical library is an ongoing
effort of roughly 300 people, with (as of today) approximately 45 000 definitions and
110 000 mathematical statements (“theorems”) that cover many fields of
mathematics, such as additive combinatorics, complex variables, differential geometry
and Lebesgue integration. For such a collective effort to be possible at all, the initial
authors of mathlib had to make careful architecture and design decisions, described
in The mathlib Community (2020). As Lean/mathlib is an open source project,
it is also relatively easy to install it on one’s own computer and to start joining this
collective effort. This is also facilitated by a comprehensive website and an online
discussion board where contributors share their problems and, remarkably
generously, their insights.
In November 2021, I embarked on checking in Lean/mathlib the proof that the
alternating group of a finite set of cardinality at least 5 is a simple group. While
this mathematical result is of a smaller scale than the accomplishments quoted
above, it belongs to the classical (under)graduate mathematical corpus, and
I found it interesting to try out the formalization process on a result of this
intermediate level. For reasons I will try to share, I chose a nonstandard way to do that,
which led me to unsuspected mathematical territories.
This text is a retrospective account of this journey.
I thank Javier Fresán for having invited me to write this paper, a Spanish version
of which should appear in La Gaceta de la Real Sociedad Matemática Española. I
also thank him for his insightful comments, as well as Riccardo Brasca and
Patrick Massot for theirs, and Martin Liebeck and Raphaël Rouquier for their help. I
also thank the mathlib community for their enthusiasm in welcoming newcomers to
the game, and for the support they provide so generously.
2. Solvability, simplicity
Let us first recall the terms of the statement we have in mind.
Simple groups are those (nontrivial) groups whose only normal subgroups are
the two obvious examples, the full group and the trivial group {e}. When a nontrivial
group G is not simple, it admits a normal subgroup H such that H ≠ {e} and H ≠ G;
then one can (try to) study G through its projection to the quotient group G/H, whose
kernel is H. When we restrict ourselves to finite groups, a full “dévissage” is possible
and a common metaphor presents finite simple groups as the “elementary particles”
of finite group theory. In this direction, a legendary theorem, whose proof involved
hundreds of mathematicians and hundreds of mathematical papers written over
a period of 50 years, is the classification of finite simple groups: All finite simple
groups appear in a list of groups of the following form:
The difficult part of the classification of finite simple groups asserts that the groups
of this list are the only finite simple groups, but we are only concerned here with the
easy part of the classification, that these groups are indeed simple.
The first ones, cyclic groups of prime order, are simple: it follows from Lagrange’s
theorem (the order of a subgroup divides the order of the group) that they have no
other subgroup than themselves and the trivial subgroup.
As an aside, let us note that the center Z(G) of a group G, the set of
elements g ∈ G which commute with every element of G, is also a normal subgroup.
Consequently, if G is simple, then either Z(G) = G, in which case G is commutative,
hence a cyclic group of prime order, or Z(G) = {e}. This explains why, from the second
item on, all groups of the above list have trivial center.
Second on that list come the alternating groups, which are the very
subject of this note, and whose simplicity is often established in algebra lectures
related to Galois theory and the solvability of algebraic equations (in one variable).
While Abel and Ruffini had proved that general algebraic equations of degree ≥ 5
cannot be solved by radicals, Galois’s theorem refines that result by proving that
a given algebraic equation is solvable by radicals if and only if its Galois group is
solvable. The notion of a group of an equation was introduced by Galois, as well as
the notions of a normal subgroup and of a solvable group, although he did not give a
name to these two concepts: the Galois group is the subgroup of the permutations of
the roots that preserve all algebraic relations with rational coefficients; and a finite
group G is solvable if it is trivial or if, by induction, it admits a normal
subgroup H ≠ G which is itself solvable and such that the quotient group G/H is
commutative. In modern terms, we say that a group G is solvable if its “derived series”
G, D(G), D(D(G)), …, the decreasing sequence of subgroups obtained by successively
taking commutator subgroups, eventually reaches the identity subgroup.
In that perspective, the Abel–Ruffini theorem boils down to the fact that a general
equation of degree n has Galois group the full symmetric group $S_n$, and that, for
n ≥ 5, this group is not solvable, which is itself a direct consequence of the following
more precise result.
We see that any 3-cycle can be written as a commutator, so that $D(S_n)$ contains all
3-cycles, which are known to generate the alternating group $A_n$. (This works for
n ≥ 3.)
To prove that $D(A_n)$ is $A_n$ itself, we prove that the quotient group $K = A_n/D(A_n)$
is trivial. The group $A_n$ is generated by 3-cycles g, so their images generate K. The
hypothesis n ≥ 5 implies that all 3-cycles are conjugate in $A_n$; consequently, they
all have the same image in K, say k, and K = 〈k〉. Since the square of a 3-cycle
g = (a b c) is again a 3-cycle, namely $g^2 = (a\,c\,b)$, one has $k = k^2$, hence k = e and
K = {e}.
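The two equalities just discussed can be checked by brute force on small cases. The following Python sketch is only an illustration and is unrelated to the Lean formalization: permutations are encoded as tuples of images, and all function names (`compose`, `generated_subgroup`, `derived_subgroup`, ...) are chosen here for the example.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even iff its number of inversions is even."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(generators, n):
    """Closure of a set of permutations of range(n) under composition.

    In a finite group, this closure is the subgroup generated by the set.
    """
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def derived_subgroup(group, n):
    """Subgroup generated by all commutators g h g^-1 h^-1 of the given group."""
    commutators = {
        compose(compose(g, h), compose(inverse(g), inverse(h)))
        for g in group
        for h in group
    }
    return generated_subgroup(commutators, n)

S5 = set(permutations(range(5)))
A5 = {p for p in S5 if is_even(p)}
print(derived_subgroup(S5, 5) == A5)  # D(S5) = A5
print(derived_subgroup(A5, 5) == A5)  # D(A5) = A5
```

For n = 4 the same functions return a derived subgroup of $A_4$ with only 4 elements, which illustrates why the hypothesis n ≥ 5 is needed in the second statement.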
Theorem 3.1. — Let a group G act on a set X, and assume that we are given, for
every x ∈ X, a subgroup T(x) of G, such that the following properties hold:
– For every x ∈ X, the group T(x) is commutative;
– For every g ∈ G and every x ∈ X, one has $T(g \cdot x) = g T(x) g^{-1}$;
– The groups T(x) generate G.
If, moreover, the action of G on X is quasiprimitive, then any normal subgroup N
of G that acts nontrivially on X contains the commutator subgroup D(G) of G.
An action of a group G on a set X is said to be quasiprimitive if any normal sub-
group of G which acts nontrivially on X acts transitively. This property may look
obscure, but it appears naturally in the framework of primitive actions, a classic
theme of 19th century group theory which remained very important in finite group
theory but seems to have disappeared from the algebra package we offer to
undergraduate students. Let us define it in terms of partitions of X (sets of nonempty
disjoint subsets of X whose union is X):
Lemma 3.3. — Let us assume that the action of G is 2-fold transitive: X has at least
two elements and for any two pairs (x, y) and (x′, y′) of distinct elements of X, there
exists g ∈ G such that g · x = x′ and g · y = y′. Then this action is primitive.
The proof is elementary: consider an element B of a G-invariant partition Σ of X which
has at least two elements x, y, and let us show that B = X. Let z ∈ X be such that z ≠ x, y.
By the 2-fold transitivity condition applied to (x, y) and (x, z), there exists g ∈ G such
that g · x = x and g · y = z. The set g · B belongs to Σ but has a common point with B,
namely x, so that g · B = B. In particular, z ∈ B. This proves that B = X.
We just observed that members of a G-invariant partition are subsets B of X such
that, for every g ∈ G, either g · B ∩ B = ∅ or g · B = B; in the traditional terminology
of group theory, they are called blocks, and blocks which are neither empty, nor
singletons, nor the full set are called blocks of imprimitivity. Conversely, if B is a
nonempty block and if the action is transitive, then the set of all g · B, for g ∈ G, gives
a G-invariant partition of X.
As an example of a transitive, but not primitive action, one may consider the
action of $S_4$ on the set of pairs of elements of {1, 2, 3, 4}: in this case, there are
nontrivial blocks, such as B = {{1, 2}, {3, 4}}. In fact, we will meet this example
again later, together with some variants of it.
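This example is small enough to be checked by enumeration. The Python sketch below is an illustration only, with names (`translate`, `is_block`, ...) invented for the occasion: it encodes the block condition "g · B equals B or is disjoint from B" and applies it to B = {{1, 2}, {3, 4}}.

```python
from itertools import combinations, permutations

POINTS = (1, 2, 3, 4)
PAIRS = [frozenset(c) for c in combinations(POINTS, 2)]

# The 24 elements of S4, each stored as a dictionary i -> g(i).
S4 = [dict(zip(POINTS, images)) for images in permutations(POINTS)]

def act_on_pair(g, pair):
    """Image of a 2-element subset under the permutation g."""
    return frozenset(g[i] for i in pair)

def translate(g, block):
    """Image g·B of a set B of pairs."""
    return frozenset(act_on_pair(g, pair) for pair in block)

def is_block(group, block):
    """B is a block iff every translate g·B is equal to B or disjoint from B."""
    return all(
        translate(g, block) == block or translate(g, block).isdisjoint(block)
        for g in group
    )

B = frozenset({frozenset({1, 2}), frozenset({3, 4})})
translates = {translate(g, B) for g in S4}

print(is_block(S4, B))   # True
print(len(translates))   # 3 translates, which partition the 6 pairs
```

By contrast, `is_block(S4, frozenset({frozenset({1, 2}), frozenset({1, 3})}))` returns `False`: the transposition exchanging 1 and 2 sends this set to a different set that still contains {1, 2}.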
The terminology “primitive” comes from Galois, in the language of equations: as
explained by Neumann (2006, p. 390), when the Galois group G of an irreducible
polynomial equation f(x) = 0 acts on its roots, there are m blocks of size n if and
only if there is an auxiliary equation of degree m the adjunction of one root of which
allows f to be factored as $f = f_1 f_2$, where $f_1$ has degree n.
The Lean definitions follow these descriptions, see listing 2, with a few
adjustments to follow the general mathlib conventions.
First of all, definitions are always given under very minimal hypotheses, one idea
being that they could serve in more general contexts than the ones that are
generally considered, so as to avoid the need for endless variations of otherwise identical
proofs. Another reason to make definitions as general as possible is that
changing a definition later on requires adjusting all theorems that refer to it, a long
and painful task. In our case, an “action” of a type G on another type X just presumes a
map G → X → X, embodied in the predicate has_smul G X and then denoted by the
symbol ·, not even requiring that G has an inner multiplicative structure! It is
reminiscent of the “groups with operators” introduced in the first chapter of Bourbaki
(1998) with a similar intention.
Then a “subset” B of X (a term of type set X) is a block if and only if the sets
g · B, for g in G, are pairwise equal or disjoint. The (possibly) obscure definition
makes use of mathlib’s general predicate set.pairwise_disjoint.
(1) Recall that a subgroup H of G is maximal if H ≠ G and if any subgroup H′ of G
containing H is H or G.
these three proofs by ...; the code of each of the first two is 2 lines long, and that of the
third is 17 lines long.
3.5. — We end this section with a proof of the Iwasawa criterion (theorem 3.1).
Fix a point a ∈ X. We first prove that the subgroup 〈N, T(a)〉 generated by N
and T(a) is equal to G. By assumption, N acts nontrivially on X. Since N is normal,
the hypothesis that the action is quasiprimitive implies that N acts transitively on X:
for every b ∈ X, there exists n ∈ N such that n · a = b. Since $n T(a) n^{-1} = T(b)$, this
implies that 〈N, T(a)〉 contains T(b). Since b is arbitrary, the subgroup 〈N, T(a)〉
contains the subgroup generated by all T(x), for x ∈ X, which, by assumption, is G.
The subgroup N is normal; the desired conclusion that it contains the
derived subgroup of G is equivalent to the commutativity of the quotient G/N. Since
〈N, T(a)〉 = G, the composition T(a) → G → G/N is surjective; since T(a) is
commutative, we conclude that G/N is commutative, as desired.
Proposition 4.1. — Let k and n be integers such that 0 < k < n − k < n. If 4 ≤ n,
then the actions of $A_n$ and $S_n$ on $X^{[k]}$ are primitive.
Given this primitivity result, the approach of Iwasawa allows us to understand
the normal subgroups of the symmetric and alternating groups. We will only need
to use the cases k = 2, k = 3 and k = 4.
4.2. — Let us first consider the case k = 2. For any 2-element subset x = {a, b} of X,
let us consider the subgroup T(x) generated by the transposition (a b): it is
commutative of order 2; the relation $(g \cdot a \;\, g \cdot b) = g\,(a\,b)\,g^{-1}$ implies that these
subgroups satisfy the relation $T(g \cdot x) = g T(x) g^{-1}$; and since $S_n$ is generated by all
transpositions, they generate the symmetric group. Consequently, Iwasawa’s criterion
implies that if this action is primitive, then any normal subgroup N of $S_n$ such that
N ≠ {e} contains $D(S_n)$, which, as we have seen, is equal to $A_n$. Since $S_n/A_n$ has order 2,
the only subgroups of $S_n$ that contain $A_n$ are $A_n$ and $S_n$.
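The conjugation relation for transpositions used above is easy to confirm by exhaustive search. The following Python sketch is an illustration only (permutations are tuples of images on {0, …, n − 1}, and the function names are made up for this example); it checks the relation for every element of $S_n$ and every pair.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def transposition(a, b, n):
    """The permutation of range(n) that swaps a and b and fixes everything else."""
    images = list(range(n))
    images[a], images[b] = b, a
    return tuple(images)

def conjugation_relation_holds(n):
    """Check that g (a b) g^-1 equals (g·a g·b) for all g in S_n and all a < b."""
    return all(
        compose(compose(g, transposition(a, b, n)), inverse(g))
        == transposition(g[a], g[b], n)
        for g in permutations(range(n))
        for a in range(n)
        for b in range(a + 1, n)
    )

print(conjugation_relation_holds(5))  # True
```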
What about the primitivity assumption? Note that the action of $S_n$ on $X^{[2]}$ is not
2-fold transitive, because one cannot map {1, 2} and {1, 3} to the sets {1, 2} and {3, 4}.
Let us observe that it is nevertheless primitive; here, we will use that 2 < n − 2, that
is, n > 4. Wilson (2009, §2.5.1) shows that the stabilizer of any element of $X^{[2]}$ is a
maximal subgroup, and we will discuss this in greater generality in the next section,
but let me give right away the following proof, as explained to me by G. Chenevier.
Let B be an imprimitivity block of $X^{[2]}$, and let {a, b} be a pair in B.
First assume that B contains another pair of the form {a, c}. Consider g ∈ $S_n$
such that g · a = c and g · b = a; then B and g · B share the element {a, c}, so that
g · B = B; consequently, B contains the pair {g · a, g · c} = {c, g · c}, hence all pairs of the
form {c, d}. Redoing the argument from {a, c} and {c, d}, we deduce that B contains
every pair, hence $B = X^{[2]}$.
Assume then that B contains a pair {c, d} which is disjoint from {a, b}. Since n ≥ 5,
we may consider a fifth element e in X; let us prove that {c, e} ∈ B. Indeed, there
exists g ∈ $S_n$ which maps a to a, b to b, c to c and d to e, hence {a, b} to itself, and
{c, d} to {c, e}; then B and g · B have {a, b} in common, so that g · B = B and {c, e} ∈ B.
In particular, B contains two pairs {c, d} and {c, e} whose supports are not disjoint
and the first part of the argument implies that $B = X^{[2]}$.
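The role of the hypothesis n ≥ 5 in this argument can be observed numerically. The Python sketch below is an illustration under stated assumptions, not the formalized proof: for a transitive action, it computes the finest invariant partition that puts two given points in the same class, and declares the action primitive when this partition always has a single class. All names are hypothetical.

```python
from itertools import combinations, permutations

def action_on_pairs(n):
    """Return the 2-element subsets of range(n) and the action of S_n on them.

    Each group element is stored as a dictionary sending a pair to its image.
    """
    pairs = [frozenset(c) for c in combinations(range(n), 2)]
    group = [
        {pair: frozenset(g[i] for i in pair) for pair in pairs}
        for g in permutations(range(n))
    ]
    return pairs, group

def classes_of_generated_partition(points, group, x, y):
    """Number of classes of the finest invariant partition with x and y together."""
    label = {p: p for p in points}  # label[p] identifies the class of p
    label[y] = label[x]
    changed = True
    while changed:
        changed = False
        for g in group:
            for p, q in combinations(points, 2):
                if label[p] == label[q] and label[g[p]] != label[g[q]]:
                    # Invariance forces the classes of g·p and g·q to be merged.
                    old, new = label[g[q]], label[g[p]]
                    for r in points:
                        if label[r] == old:
                            label[r] = new
                    changed = True
    return len(set(label.values()))

def is_primitive(points, group):
    """For a transitive action: primitive iff every generated partition is trivial."""
    x = points[0]
    return all(
        classes_of_generated_partition(points, group, x, y) == 1
        for y in points[1:]
    )

print(is_primitive(*action_on_pairs(5)))  # True
print(is_primitive(*action_on_pairs(4)))  # False
```

The answer for n = 4 reflects the block {{1, 2}, {3, 4}} of the action of $S_4$ on pairs met earlier.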
We thus obtain the following result (also a consequence of theorem 2.1).
4.5. — For this, let us consider k = 4. For any 4-element set x = {a, b, c, d } in X,
let us consider Klein’s Vierergruppe V(x) in the alternating group of these four
elements, viewed as a subgroup of $A_n$. It is commutative of order 4, and
consists of the identity and of the three “double transpositions” (a b)(c d), (a c)(b d) and
fact that the alternating group on n letters is (n − 2)-fold transitive: given
systems $x_1, \dots, x_{n-2}$ and $y_1, \dots, y_{n-2}$ of distinct elements, there are two
permutations g such that $g \cdot x_i = y_i$ for all i; one is even and the other is of the
form (a b)g, where a, b are the two elements of {1, …, n} not in $\{y_1, \dots, y_{n-2}\}$. It had
also been observed that beyond these two cases, a permutation subgroup on n letters
has to act much less transitively, and 19th century mathematicians proved many
theorems that aimed at quantifying this limit. For example, Mathieu had proved that
unless it contains the alternating group, a subgroup of $S_n$ isn’t n/2-fold transitive,
while Jordan (1872) proved that it isn’t m-fold transitive if n − m is a prime number > 2.
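The counting argument for (n − 2)-fold transitivity can be checked directly for n = 5. In the Python sketch below, given only as an illustration with made-up names, the points are 0, …, 4, the source system is fixed to (0, 1, 2), and every target system of 3 distinct points is tried.

```python
from itertools import permutations

def is_even(p):
    """A permutation is even iff its number of inversions is even."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def permutations_sending(xs, ys, n):
    """All permutations g of range(n) such that g[x] == y for each paired x, y."""
    return [
        g for g in permutations(range(n))
        if all(g[x] == y for x, y in zip(xs, ys))
    ]

n = 5
xs = (0, 1, 2)  # a system of n - 2 distinct points
parities = [
    sorted(is_even(g) for g in permutations_sending(xs, ys, n))
    for ys in permutations(range(n), n - 2)
]

# Each target system is reached by exactly two permutations, one odd and one even.
print(all(p == [False, True] for p in parities))  # True
```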
As explained in Cameron (1981), once the classification of finite simple groups
had been achieved, it could be checked on the list that a 6-fold transitive subgroup
of $S_n$ must be symmetric or alternating.
Parallel to the classification is the understanding of all maximal subgroups of a
given finite simple group. In the case of the alternating group, an explicit list has
been provided independently by M. O’Nan and L. Scott. As remarked by Cameron
(1981), this question is closely related to the description of all subgroups of the
symmetric group $S_n$ which act primitively on {1, …, n}.
This classification theorem takes the following form: Let G be a proper subgroup of $A_n$
or $S_n$; then G is conjugate to a subgroup of one of six types, of which the first three
take the form:
(a) A product $S_m \times S_{n-m}$, where 0 < m < n (the intransitive case);
(b) The “wreath product” $S_m \wr S_p$, where n = pm, namely the subgroup generated
by the product of p symmetric groups acting on p disjoint sets of m letters
(isomorphic to $S_m \times \dots \times S_m$), and a permutation that permutes cyclically these
p sets (the imprimitive case);
(c) An affine group of an $\mathbf{F}_p$-vector space of dimension d, where $n = p^d$ is the
power of a prime number.
It applies in particular to maximal subgroups, and Liebeck et al. (1987) established
the converse assertion, deciding which of the groups of this list are maximal. That
case (a) is maximal when m ≠ n − m is exactly the statement of proposition 4.1.
However, when n = 2m, case (a) is not maximal but case (b) gives the corresponding
maximal case. For n = 4, for example, the subgroup given by (b) has order 8, hence
is a 2-Sylow subgroup of $S_4$, while the subgroup $S_2 \times S_2$ has order 4.
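The two orders quoted for n = 4 can be recovered by generating both subgroups explicitly. In this Python sketch (an illustration only; the points are 0, 1, 2, 3, the two sets are {0, 1} and {2, 3}, and the variable names are chosen for the example), a subgroup is computed as the closure of its generators under composition.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def generated_subgroup(generators, n):
    """Closure of a set of permutations of range(n) under composition."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

# Permutations of {0, 1, 2, 3}, written as tuples of images.
swap_01 = (1, 0, 2, 3)          # the transposition (0 1)
swap_23 = (0, 1, 3, 2)          # the transposition (2 3)
exchange_sets = (2, 3, 0, 1)    # (0 2)(1 3), which exchanges {0, 1} and {2, 3}

product_subgroup = generated_subgroup([swap_01, swap_23], 4)                  # case (a)
wreath_subgroup = generated_subgroup([swap_01, swap_23, exchange_sets], 4)    # case (b)

print(len(product_subgroup), len(wreath_subgroup))  # 4 8
```

Since the first subgroup is strictly contained in the second, it is not maximal, in agreement with the discussion of the case n = 2m.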
Cases of the form (c) were of particular interest to Galois, who proved that they
appear for the Galois groups of irreducible equations of prime degree which are
solvable by radicals. In other words, solvable and transitive subgroups of $S_p$ can
be viewed, up to conjugacy, as groups of permutations of the form $x \mapsto ax + b$ on
$\mathbf{F}_p$, for $a \in \mathbf{F}_p^\times$ and $b \in \mathbf{F}_p$. Since the identity is the only permutation of that form
that fixes two elements, Galois obtains that an irreducible equation of prime degree is
solvable by radicals if and only if any of its roots can be expressed rationally in terms
of any two of them.
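The key property of affine maps invoked here, that only the identity fixes two elements, can be verified for small primes. The Python sketch below is an illustration with hypothetical function names; it represents $\mathbf{F}_p$ by the integers modulo p and assumes that p is an odd prime.

```python
def fixed_points(a, b, p):
    """Fixed points of the affine map x -> a*x + b on the integers modulo p."""
    return [x for x in range(p) if (a * x + b) % p == x]

def max_fixed_points_of_nonidentity_maps(p):
    """Largest number of fixed points of a map x -> a*x + b, a != 0, other than the identity."""
    return max(
        len(fixed_points(a, b, p))
        for a in range(1, p)
        for b in range(p)
        if (a, b) != (1, 0)
    )

for p in (3, 5, 7, 11):
    # The second number printed is 1 for each of these primes.
    print(p, max_fixed_points_of_nonidentity_maps(p))
```

Indeed, a translation x ↦ x + b with b ≠ 0 has no fixed point, while for a ≠ 1 the equation ax + b = x has the unique solution x = b/(1 − a).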
Galois also defined primitive algebraic equations which correspond exactly to the
case where the Galois group acts primitively on their roots. In the solvable case,
5.2. — But let us go back to the promised proof of proposition 4.1. Let G be a
subgroup of $A_n$ strictly containing $(S_k \times S_{n-k}) \cap A_n$, where 0 < k < n and n ≠ 2k.
We need to prove that G coincides with $A_n$. By symmetry, we may assume that
k < n − k. The case k = 1 is easy. Indeed, the action of $A_n$ on {1, …, n} is (n − 2)-fold
transitive, hence it is 2-fold transitive, because n ≥ 4, hence it is primitive. We now
assume that 2 ≤ k; then n ≥ 5.
A theorem of Jordan (1870, Note C to §398, page 664) asserts that a primitive
subgroup of $S_n$ that contains a cycle of prime order p is at least (n − p + 1)-fold
transitive. When p = 2, we get that this subgroup is (n − 1)-fold transitive, hence it
has to be the whole $S_n$, while when p = 3, it is (n − 2)-fold transitive, and it is not
too difficult to deduce that it contains $A_n$. Since 1 ≤ k < n − k < n and n ≥ 5, we have
n − k ≥ 3 and our subgroup G contains a 3-cycle. To conclude, it remains to establish
that it acts primitively on {1, …, n}.
One first proves that G acts transitively on {1, …, n}. In fact, G contains the
subgroups $S_k$ and $S_{n-k}$; in particular, it acts transitively on the elements of each
subset {1, …, k} and {k + 1, …, n}, hence it has at most two orbits. But since it strictly
contains $(S_k \times S_{n-k}) \cap A_n$, it cannot leave {1, …, k} and {k + 1, …, n} invariant.
Arguing as for transitivity, G acts k-fold transitively on {1, …, k} and (n − k)-fold
transitively on {k + 1, …, n}; since 2 ≤ k < n − k, it acts in particular 2-fold transitively,
hence primitively, on both of these sets.
We consider imprimitivity blocks B for the action of G, assuming that they have
at least two elements and are distinct from {1, …, n}.
First observe that B cannot contain {k + 1, …, n}, because its translates g · B, for
g ∈ G such that g · B ≠ B, would have to be contained in {1, …, k}, which is impossible
since k < n − k. In particular, B meets {k + 1, …, n} in at most one element. If it
is disjoint from {k + 1, …, n}, it is contained in {1, …, k}. Since G acts primitively
on {1, …, k}, one then has B = {1, …, k}. Consider an element g of G which does not
stabilize {1, …, k}. Then g · B is a block distinct from B, hence disjoint from it, so that
g · B is a block contained in {k + 1, …, n}. By primitivity, g · B = {k + 1, …, n},
contradicting the beginning of the proof.
In particular, there are elements a ∈ {1, …, k} ∩ B and b ∈ {k + 1, …, n} ∩ B. To
conclude the proof by a contradiction, it suffices to establish that B contains
{k + 1, …, n}. So let c ∈ {k + 1, …, n} and consider an element g ∈ G that fixes {1, …, k}
such that g · b = c. Then g · B and B both contain a, hence g · B = B, hence c ∈ B, as
was to be shown.
of the set of cycles of g which preserves their lengths, there is a unique element $h_\sigma$
of $Z_g$ such that $h_\sigma(a_c) = a_{\sigma(c)}$ for all c, and the map $\sigma \mapsto h_\sigma$ is a group morphism.
Now, the kernel of ϕ is the subgroup of all elements $h \in Z_g$ such that $h c h^{-1} = c$ for
all cycles c of g. Necessarily, h stabilizes the support of each such c, so it maps $a_c$
to some iterate of $a_c$ under g; fix $k_c \in \mathbf{Z}$ (modulo the cardinality $n_c$ of the
support of c) such that $h(a_c) = g^{k_c}(a_c) = c^{k_c}(a_c)$; using the fact that c is a cycle, it
follows that h acts like $c^{k_c}$ on the support of c. Finally, we see that h is the product
of these powers $c^{k_c}$. In other words, ker(ϕ) is a product of cyclic groups, $\prod_c (\mathbf{Z}/n_c\mathbf{Z})$,
which we rewrite as $\prod_i (\mathbf{Z}/i\mathbf{Z})^{m_i}$, since $m_i$ is the number of cycles c such that $n_c = i$.
In particular, the order of ker(ϕ) is equal to $\prod_i i^{m_i}$.
Finally, $\mathrm{Card}(Z_g) = \mathrm{Card}(\mathrm{Im}(\varphi))\,\mathrm{Card}(\ker(\varphi)) = \prod_i i^{m_i} \prod_i m_i!$, as was to be shown.
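The resulting formula for the cardinality of the centralizer can be compared with a direct count in a small symmetric group. The Python sketch below is an illustration only; it assumes that fixed points are counted as cycles of length 1, so that $m_1$ is the number of fixed points, and its function names are invented for the example.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def cycle_lengths(g):
    """Lengths of the cycles of g, fixed points being cycles of length 1."""
    seen = set()
    lengths = []
    for start in range(len(g)):
        if start not in seen:
            length = 0
            i = start
            while i not in seen:
                seen.add(i)
                i = g[i]
                length += 1
            lengths.append(length)
    return lengths

def centralizer_order_formula(g):
    """Product over i of i**m_i * m_i!, where m_i is the number of cycles of length i."""
    result = 1
    for i, m_i in Counter(cycle_lengths(g)).items():
        result *= i ** m_i * factorial(m_i)
    return result

def centralizer_order_bruteforce(g):
    """Number of permutations h of the same set that commute with g."""
    n = len(g)
    return sum(1 for h in permutations(range(n)) if compose(h, g) == compose(g, h))

g = (1, 0, 3, 2, 4)  # the double transposition (0 1)(2 3) in S5
print(centralizer_order_formula(g), centralizer_order_bruteforce(g))  # 8 8
```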
7.2. — This reasoning can also be applied for other cases of geometric groups. In
his paper, Iwasawa himself indicates that the same method works for the symplectic
group PSp(2n, F) (“complex projective groups” in the earlier terminology) acting on
the projective space $\mathbf{P}^{2n-1}(F)$. Iwasawa does not explicitly consider the notion of a
primitive action in his paper: his arguments are only spelled out for a 2-fold transitive
action. However, he mentions in a footnote that while the action of the symplectic
group on $\mathbf{P}^{2n-1}(F)$ is not 2-fold transitive, it is quasiprimitive, and this suffices for
his proof. On the other hand, King (1981) established that the stabilizers of this
action are maximal subgroups, so that this action is even primitive.
In fact, it seems that the simplicity of the appropriate groups of geometric
transformations can all be established in this way.
I find it remarkable how much this method, which relates the simplicity of a group
to the structure of its maximal subgroups, is in line with the point of
view of Jordan and early group theorists!
References
N. Bourbaki (1998), Elements of Mathematics. Algebra I. Chapters 1–3, translated
from the French, softcover edition of the 2nd printing 1989, Springer, Berlin.
P. J. Cameron (1981), “Finite Permutation Groups and Finite Simple Groups”.
Bulletin of the London Mathematical Society, 13 (1), pp. 1–22.
D. A. Cox (2012), Galois Theory, Pure and Applied Mathematics, John Wiley &
Sons, Hoboken, N.J., 2nd edition.
A. Doxiadis & C. H. Papadimitriou (2009), Logicomix: An Epic Search for
Truth, Bloomsbury Press, New York.
G. Gonthier (2008), “Formal proof: the four-color theorem”. Notices of the
American Mathematical Society, 55 (11), pp. 1382–1393.