JM FM
Andrei Okounkov
Abstract
While the author is a professional mathematician, he is by no means an expert in the sub-
ject area of these notes. The goal of these notes is to share the author’s personal excitement
about some results of James Maynard with mathematics enthusiasts of all ages, using
maximally accessible, yet precise mathematical language. No attempt has been made to
present an overview of the current state of the field, its history, or to place this narrative in any
kind of broader scientific or social context. See the references in Section 11 for both pro-
fessional surveys and popular science accounts that will certainly give the reader a broader
and deeper understanding of the material.
[Table (1): the multiplication table, with the composite numbers in the shaded area] (1)
where the dots indicate that we imagine this table has infinitely many rows and columns. The
numbers 𝑛 that appear in the shaded area are called composite numbers. They can be written
in the form 𝑛 = 𝑎𝑏 where both 𝑎 ≠ 1 and 𝑏 ≠ 1 are positive integers.
Numbers that are not 1 and not composite are called prime. For instance, 2, 3, 5,
and 7 are prime, as one sees from (1). Indeed, every composite number 𝑎𝑏 appears in the
multiplication table in the column 𝑎 and row 𝑏, which are both less than the number 𝑎𝑏. So,
2, 3, 5, 7 will never appear in the shaded part.
It is a fundamental arithmetic fact that every positive integer 𝑛 > 1 can be factored as
a product of primes, and this factorization is unique up to the order of the prime factors. One
can compare and contrast factorization into primes with how molecules are built from atoms.
One clear difference is that the order of prime factors does not matter, unlike the positions
of the atoms in a molecule.
Primes form an infinite sequence which has mesmerized and puzzled mathematicians
for millennia. Many mathematicians were first attracted to mathematics by the magic of
prime numbers and remained true to their first mathematical love — number theory.
“It is the fact that primes are so fundamental (being the building blocks of whole
numbers), but still so mysterious and poorly understood which makes them so fascinating to
me”, says James Maynard, the hero of these notes. Kannan Soundararajan, the presenter of
Maynard’s Fields Medal laudatio at ICM 2022, agrees: “Like many others, I was drawn in by
the extreme simplicity of problems involving primes, and the remarkable difficulty of proving
anything about them. Twin primes and Goldbach in particular were especially fascinating
problems. It’s been amazing to witness such spectacular progress as the Green–Tao theorem
and bounded gaps between primes over the last twenty years.”
The following method for tabulating the primes goes back at least as far as Eratosthenes
(276 – 195/194 BC). To remove the composite numbers from the list of all numbers, we can
successively cross out or punch through all numbers from the grey columns in the multiplication
table (1), that is, remove all nontrivial multiples of 2, of 3, of 5, et cetera. For instance,
the list of natural numbers with 1 and multiples of 2 and 3 removed will look like this:
⃝ 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60 (2)
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
where dots indicate that this table has infinitely many rows. The reader may notice there is
no need to worry about multiples of 4, 6, or any other composite number.
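The sifting described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `sieve_table` and `eratosthenes` are our own, and a removed number is represented by `None`.

```python
def sieve_table(limit=100, primes_to_sift=(2, 3)):
    """Numbers 1..limit, with 1 and all nontrivial multiples of the
    given primes replaced by None (the "holes" of the sieve)."""
    kept = list(range(1, limit + 1))
    kept[0] = None  # 1 is neither prime nor composite
    for p in primes_to_sift:
        for multiple in range(2 * p, limit + 1, p):
            kept[multiple - 1] = None
    return kept


def eratosthenes(limit):
    """All primes <= limit. Only multiples of primes are crossed out,
    since multiples of composite numbers are already gone."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


if __name__ == "__main__":
    table = sieve_table()
    for start in range(0, 100, 10):
        print(" ".join(f"{n:>3}" if n else "  o" for n in table[start:start + 10]))
    print(eratosthenes(100))
```

Running the script prints the table with 1 and the multiples of 2 and 3 punched out, followed by the primes up to 100.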
Once we remove all composite numbers from the numbers up to 100, the result will
look like this (the colors will be explained momentarily):
⃝ 2 3 ⃝ 5 ⃝ 7 ⃝ ⃝ ⃝
11 ⃝ 13 ⃝ ⃝ ⃝ 17 ⃝ 19 ⃝
⃝ ⃝ 23 ⃝ ⃝ ⃝ ⃝ ⃝ 29 ⃝
31 ⃝ ⃝ ⃝ ⃝ ⃝ 37 ⃝ ⃝ ⃝
41 ⃝ 43 ⃝ ⃝ ⃝ 47 ⃝ ⃝ ⃝
⃝ ⃝ 53 ⃝ ⃝ ⃝ ⃝ ⃝ 59 ⃝ (3)
61 ⃝ ⃝ ⃝ ⃝ ⃝ 67 ⃝ ⃝ ⃝
71 ⃝ 73 ⃝ ⃝ ⃝ ⃝ ⃝ 79 ⃝
⃝ ⃝ 83 ⃝ ⃝ ⃝ ⃝ ⃝ 89 ⃝
⃝ ⃝ ⃝ ⃝ ⃝ ⃝ 97 ⃝ ⃝ ⃝
⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
This table has a lot of holes, just like a sieve. For this reason, the methods that produce an
interesting set (e.g. primes) from a less interesting set (e.g. integers) by successively sifting
out the unwanted elements are referred to as sieve methods.
The primes shown in green are the twin primes, that is, primes 𝑝 such that 𝑝 + 2 or
𝑝 − 2 is also prime¹. Twin primes are the simplest rhymes in the mysterious poem of primes.
While it is very easy to see that there are infinitely many primes², the infinitude of twin primes
is a very old conjecture, still open today. However, recent years have seen incredible progress
in our understanding of various patterns in primes, recognized, in particular, by the Fields
Medal, the highest honor in mathematics, awarded in 2022 to James Maynard.
1 Can you prove that 𝑝 + 2 and 𝑝 − 2 cannot both be prime, except for 𝑝 = 5? Questions like
this will be clarified when we talk about admissible patterns.
2 Every divisor 𝑑 > 1 of the number 𝑛! + 1, where 𝑛! = 1 · 2 · 3 · · · 𝑛, has to be larger than 𝑛.
Since 𝑛 is arbitrary, there are infinitely many primes.
3 Rhymes in primes
In these notes, we will try to give a very basic introduction to this area of number
theory and some of the results of Maynard and his predecessors. A more experienced reader
can probably skip many sections of this narrative. We wish all newcomers some patience in
working through these notes, and very much hope this patience will be rewarded by the
sense of awe that this mathematics inspires.
89 mod 10 = 9 .
One also says that the residue of 89 modulo 10 is 9. More generally, we write
𝑎₁ = 𝑎₂ mod 𝑏
to mean that 𝑎₁ − 𝑎₂ is divisible by 𝑏. We say that 𝑎₁ and 𝑎₂ are equal mod 𝑏, or that they
are in the same residue class modulo 𝑏.
If 𝑛 mod 10 = 8 then 𝑛 is even and not equal to 2, hence 𝑛 cannot possibly be prime.
Therefore, the 8th column in (3) is empty. Similar reasoning applies to the 2nd, 4th, 5th,
6th, and 10th columns. In due time we will see that prime numbers are approximately evenly
distributed among the remaining 4 columns of table (3). Whether the column corresponding
to a residue 𝑎 modulo 10 has many or very few primes is determined by the greatest common
divisor gcd(𝑎, 10). The columns with gcd(𝑎, 10) > 1 contain at most one prime.
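As a quick experiment, one can tabulate how the primes up to some bound fall into the residue classes modulo 10. The sketch below is illustrative only; the helper names `primes_up_to` and `primes_by_residue` are our own.

```python
from collections import Counter
from math import gcd


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def primes_by_residue(limit, modulus=10):
    """Count the primes <= limit in each residue class mod `modulus`."""
    return Counter(p % modulus for p in primes_up_to(limit))


if __name__ == "__main__":
    counts = primes_by_residue(10**5)
    for a in range(10):
        print(f"a = {a}, gcd(a, 10) = {gcd(a, 10)}, primes: {counts[a]}")
```

The classes with gcd(𝑎, 10) > 1 contain at most one prime each, while the counts for 𝑎 = 1, 3, 7, 9 come out close to one another.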
The base 10 of the decimal expansion can be replaced by any other base 𝑏 > 1. For
instance, 𝑏 = 2 means binary expansions, as exemplified by
23 = 10111 (binary) = 1 · 2⁴ + 0 · 2³ + 1 · 2² + 1 · 2¹ + 1 · 2⁰ . (4)
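[Plot (5): primes arranged in columns by residue modulo 211; twin primes in green, all other primes in blue] (5)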
Primes indeed seem to be roughly evenly distributed among all columns³, except the very
last one, which contains the multiples of 211. Of course, what catches the eye in this picture
are the diagonal stripes. We invite the reader to explain them using the equality
3 Actually, the number of primes in any given column in (5) varies between 14 and 31, but
it all evens out as we go further and further down the list of primes. It is a fact of life that it
takes a while for primes 𝑝 to equidistribute mod any fixed prime like 𝑞 = 211. It is a very
subtle business to find out how long exactly this while can be, either for some fixed 𝑞 or
averaged over 𝑞. This is, in fact, one of the key technical questions in this part of number
theory.
For instance, for 𝑏 = 10 = 2 · 5, consider the following table. The rows and the columns of
this table are indexed by residues mod 2 and 5, respectively, and we place each residue mod
10 in the corresponding row and column:
          1  2  3  4  0   mod 5
1 mod 2   1  7  3  9  5
0 mod 2   6  2  8  4  0              (7)
We observe the remarkable fact that each residue 𝑎 = 0, 1, . . . , 9 mod 10 finds a unique
place in this table, filling the table completely. In general, the CRT says that the map (6) gives a
one-to-one correspondence
that preserves arithmetic operations. We invite the reader to prove the CRT and to generalize
its statement to the case 𝑏 = 𝑏 1 𝑏 2 · · · 𝑏𝑟 .
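The table above can be reproduced, and the two claims (one-to-one correspondence, compatibility with arithmetic) checked by brute force, with a short script. This is a sketch under the assumption that the map in question sends 𝑎 mod 𝑏₁𝑏₂ to the pair (𝑎 mod 𝑏₁, 𝑎 mod 𝑏₂); the names `crt_table` and `preserves_arithmetic` are our own.

```python
from math import gcd


def crt_table(b1, b2):
    """Map each pair (a mod b1, a mod b2) to the residue a mod b1*b2.
    Assumes gcd(b1, b2) = 1."""
    if gcd(b1, b2) != 1:
        raise ValueError("b1 and b2 must be coprime")
    return {(a % b1, a % b2): a for a in range(b1 * b2)}


def preserves_arithmetic(b1, b2):
    """Check that sums and products of residues mod b1*b2 correspond
    to componentwise sums and products of the pairs."""
    b = b1 * b2
    for a in range(b):
        for c in range(b):
            s, m = (a + c) % b, (a * c) % b
            if (s % b1, s % b2) != ((a + c) % b1, (a + c) % b2):
                return False
            if (m % b1, m % b2) != ((a * c) % b1, (a * c) % b2):
                return False
    return True


if __name__ == "__main__":
    table = crt_table(2, 5)
    for row in (1, 0):
        print(f"{row} mod 2:", [table[(row, col)] for col in (1, 2, 3, 4, 0)])
```

For 𝑏 = 10 = 2 · 5 the dictionary has exactly 10 entries, one for each cell of the table, which is the one-to-one correspondence in this special case.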
Let us revisit table (3) from the point of view of CRT. Shading the residue classes
that contain at most one prime (the row 0 mod 2 and the column 0 mod 5), we get
          1  2  3  4  0   mod 5
1 mod 2   1  7  3  9  5              (9)
0 mod 2   6  2  8  4  0
Here we think of residue classes 𝑎 modulo 10 as all equally likely and we call two events E₁
and E₂ independent if
Prob(E₁ & E₂) = Prob(E₁) Prob(E₂) .
While primes are truly special and not random at all, after centuries of looking into patterns
in primes most mathematicians would probably agree that primes behave as if they were
completely random, subject to, first, all possible constraints imposed by the considerations of
residues and, second, density constraints imposed by the unique factorization of integers into
primes. It is therefore very useful to inject, following Cramér, some probabilistic terminology
and intuition into our discussion.
In mathematics, there are a lot of questions for which one is free to discard an arbitrary
finite part of some infinite data set. As an example, let’s take the concept of a limit, which is
very important when talking about primes. In the discussion that follows, we will very often
have a sequence of real numbers
(𝑎ₙ) = (𝑎₁, 𝑎₂, 𝑎₃, . . . ) ,
that converges to a limit 𝑎, written 𝑎ₙ → 𝑎,
as 𝑛 goes to infinity. Slightly incorrectly, this means that every digit in the decimal expansion
of 𝑎ₙ equals the corresponding digit of 𝑎, except for finitely many values of 𝑛. Any person trained in calculus
will be quick to point out some problems with this definition, namely
𝑎ₙ = 10ⁿ ↛ 0 ,
even though every digit of 𝑎ₙ is zero except for one value of 𝑛, while
\[ a_n = 0.\underbrace{999\ldots9}_{n \text{ times}} \to 1.0000\ldots , \]
despite the fact that all digits after the decimal point are different. Readers who are not sure
how to fix these issues and feel they could use a more rigorous discussion, can find it in
Appendix A.
With the notion of a limit one can define infinite sums and products by
\[
\sum_{n=1}^{\infty} a_n = \lim_{N \to \infty} \sum_{n=1}^{N} a_n , \qquad
\prod_{n=1}^{\infty} a_n = \lim_{N \to \infty} \prod_{n=1}^{N} a_n ,
\]
when these limits exist. For example, for any number |𝑥| < 1, we have
\[ x^{\infty} = \lim_{n \to \infty} x^{n} = 0 , \tag{11} \]
and also
\[ \sum_{n=0}^{\infty} x^{n} = \frac{1}{1-x} , \tag{12} \]
which we invite the reader to deduce from (11).
Limits are needed not only for talking about infinite sets, but also as a way to define
some very important functions⁴
\[
e^{x} = 1 + x + \frac{x^2}{2} + \frac{x^3}{2 \cdot 3} + \frac{x^4}{2 \cdot 3 \cdot 4} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!} , \tag{13}
\]
4 The primary reason the exponential 𝑒ˣ and the natural logarithm ln 𝑦 are so important in
mathematics is that they solve the simplest differential equations, namely (𝑒ˣ)′ = 𝑒ˣ
and (ln 𝑦)′ = 1/𝑦. The reader can check this using the series (13), (15), and the rule (𝑥ⁿ)′ =
𝑛𝑥ⁿ⁻¹.
where 𝑒 = 2.71828 . . . is a famous transcendental number that can be computed by substituting
𝑥 = 1 in the above series. Another important constant that we will meet below is the
Euler constant
\[ \gamma = -\lim_{N \to \infty} \Bigl( \ln N - \sum_{n=1}^{N} \frac{1}{n} \Bigr) = 0.57721\ldots . \tag{14} \]
Here and below ln 𝑦 denotes the function inverse to (13), which means that by definition
ln 𝑒ˣ = 𝑥 .
It is called the natural logarithm, and for arguments in (0, 2) it can be computed using the
series
\[
\ln(1+y) = y - \frac{y^2}{2} + \frac{y^3}{3} - \frac{y^4}{4} + \cdots = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{y^n}{n} , \qquad |y| < 1 . \tag{15}
\]
Readers unfamiliar with these functions will discover that the exponential 𝑒ˣ grows very
quickly with 𝑥, making the inverse function ln 𝑦 grow very slowly. Notice that the sum in
(14), with its minus sign, is the partial sum for 𝑦 = −1 in (15). No wonder it goes to ln 0 = −∞
as 𝑁 grows.
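The slow convergence behind the Euler constant is easy to observe numerically. The sketch below compares the partial sums 1 + 1/2 + · · · + 1/𝑁 with ln 𝑁, using the relation 1 + 1/2 + · · · + 1/𝑁 = ln 𝑁 + 𝛾 + 𝑜(1) that appears later in (22); the function name `euler_gamma_estimate` is our own.

```python
from math import log


def euler_gamma_estimate(N):
    """Approximate the Euler constant by (1 + 1/2 + ... + 1/N) - ln N."""
    harmonic = sum(1.0 / n for n in range(1, N + 1))
    return harmonic - log(N)


if __name__ == "__main__":
    for exponent in range(1, 7):
        N = 10**exponent
        print(N, euler_gamma_estimate(N))
```

Both the harmonic sum and ln 𝑁 grow without bound, yet their difference settles down near 0.57721 as 𝑁 grows.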
While Zeno of Elea (c. 495 – c. 430 BC) made a career out of being confused by the
𝑥 = 1/2 case of (12), we want to stress there are no logical problems whatsoever in thinking
about the infinity of primes and about limits. We encourage the reader to embrace these
notions as something more true and fundamental than any finite approximations to them.
as we will see momentarily. Let us look at the reciprocal of the product (18). We have the
𝑥 = 1/𝑝 special case of (12)
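\[ \frac{1}{1 - \frac{1}{p}} = 1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots = \sum_{m \ge 0} \frac{1}{p^m} , \]
and multiplying those out for different primes 𝑝ᵢ, we get
\[ \prod_{i=1}^{r} \Bigl(1 - \frac{1}{p_i}\Bigr)^{-1} = \sum_{m_1, \dots, m_r \ge 0} \frac{1}{p_1^{m_1} p_2^{m_2} \cdots p_r^{m_r}} . \tag{21} \]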
If the set {𝑝ᵢ} contains all primes that are ≤ 𝑁, then the sum on the right in (21) contains,
in particular, the reciprocals of all natural numbers ≤ 𝑁. Therefore, by the existence of the
prime factorization, we conclude
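\[
\begin{aligned}
\prod_{\text{all primes } p \le N} \Bigl(1 - \frac{1}{p}\Bigr)^{-1} &= 1 + \frac{1}{2} + \cdots + \frac{1}{N} + \text{more terms} \\
&\ge 1 + \frac{1}{2} + \cdots + \frac{1}{N} \\
&= \ln N + \gamma + o(1) ,
\end{aligned}
\tag{22}
\]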
where 𝛾 is the Euler constant from (14) and 𝑜(1) denotes a quantity that goes to 0 as 𝑁 → ∞.
This shows that the rightmost term in
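\[
0 \le \operatorname{density}(\{\text{primes}\}) \le \operatorname{density}(\{\text{coprime to } N!\}) = \prod_{\text{all primes } p \le N} \Bigl(1 - \frac{1}{p}\Bigr) \tag{23}
\]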
goes to 0 as 𝑁 → ∞ and completes the proof of (19).
It is curious to notice that, taking logarithms in (22) and using the approximation
−ln(1 − 𝑝⁻¹) ≈ 𝑝⁻¹ for large 𝑝, which follows from (15), we get
\[ \sum_{\text{primes } p} \frac{1}{p} = +\infty . \tag{24} \]
This means that the same computation (22) proves that the density of primes is zero and yet
there are sufficiently many primes for the series (24) to diverge, as first noted by Euler.
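The inequality (22) and the slow divergence in (24) can both be watched numerically. The following sketch is illustrative only; the helper names `euler_product`, `harmonic` and `prime_reciprocal_sum` are our own.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def euler_product(N):
    """Product of (1 - 1/p)^(-1) over all primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 / (1.0 - 1.0 / p)
    return product


def harmonic(N):
    """Partial sum 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / n for n in range(1, N + 1))


def prime_reciprocal_sum(N):
    """Sum of 1/p over all primes p <= N."""
    return sum(1.0 / p for p in primes_up_to(N))


if __name__ == "__main__":
    for N in (10, 100, 1000, 10**4, 10**5):
        print(N, euler_product(N), harmonic(N), prime_reciprocal_sum(N))
```

The product always dominates the harmonic sum, as (22) requires, while the sum of reciprocals of primes creeps upward extremely slowly.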
While we may be disappointed in the fact that the number (19) vanishes, very similar
considerations often lead to positive results. For instance, let us consider square-free numbers
𝑛, that is, numbers not divisible by 𝑚² for any 𝑚 > 1. This means
𝑛 mod 𝑝² ≠ 0 ,
for any prime 𝑝. Referring back to (4), this means that the last two digits of 𝑛 in the expansion
base 𝑝 do not vanish simultaneously. Since this pair of digits is free to take any of the 𝑝²
possible values, one can conclude
\[
\operatorname{density}(\{\text{squarefree}\}) = \prod_{\text{primes } p} \Bigl(1 - \frac{1}{p^2}\Bigr) = \zeta(2)^{-1} = \frac{6}{\pi^2} \approx 0.6 . \tag{25}
\]
Here we meet the infinitely famous Riemann 𝜁-function
\[
\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{\text{primes } p} \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} , \qquad s > 1 , \tag{26}
\]
and its value 𝜁 (2) first computed by Euler in 1735. Our earlier computation (20) means that
𝜁 (1) = ∞.
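The density of square-free numbers in (25) is easy to test empirically. The sketch below counts square-free numbers up to a bound by crossing out multiples of squares, in the spirit of the sieve of Eratosthenes; the function names are our own.

```python
from math import pi


def squarefree_flags(limit):
    """flags[n] is True exactly when 1 <= n <= limit is square-free."""
    flags = [True] * (limit + 1)
    flags[0] = False
    m = 2
    while m * m <= limit:
        for multiple in range(m * m, limit + 1, m * m):
            flags[multiple] = False
        m += 1
    return flags


def squarefree_density(limit):
    """Fraction of the numbers 1..limit that are square-free."""
    return sum(squarefree_flags(limit)) / limit


if __name__ == "__main__":
    for limit in (100, 10**4, 10**6):
        print(limit, squarefree_density(limit), 6 / pi**2)
```

Already for modest bounds the observed fraction agrees with 6/𝜋² ≈ 0.6079 to several digits.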
Lest the reader conclude that the last column is always positive, it is known that, in fact, the
function Li(𝑥) − 𝜋(𝑥) changes sign infinitely many times. Also, while all 3 functions in (28)
5 A limit procedure is part of the definition of such everyday notions as areas and volumes.
The integral of a univariate or multivariate function 𝑓 is the signed area or volume between
the graph of 𝑓 and the graph of the zero function. It is a continuous limit of summing the
values of 𝑓 over a finer and finer mesh.
grow at the same rate, the logarithmic integral Li(𝑥) gives a much better approximation to
𝜋(𝑥) than the ratio 𝑥/ln(𝑥).
The prime number theorem was first shown by Hadamard and de la Vallée Poussin
in 1896, so more than 2000 years after Eratosthenes. Certainly, many additional ideas were
required, and are still required today to prove (28). Therefore, we will say very little about
the proof. The reader interested in a heuristic derivation of the 1/ln(𝑁) density from unique
factorization can find it here (requires familiarity with integrals).
To extract the distribution of primes from (93), Hadamard and de la Vallée Poussin
had to use some properties of 𝜁 (𝑠) for complex values of 𝑠. What happens with 𝜁 (𝑠) for
complex 𝑠 involves some of the deepest problems in all of mathematics, including the infinitely
famous Riemann hypothesis (RH), still completely open today. The RH says that all solutions
of 𝜁(𝑠) = 0 are either the so-called trivial zeros 𝑠 = −2, −4, −6, . . . or have real part ℜ𝑠 = 1/2.
The remarkable 1/2 from the Riemann Hypothesis can in fact be seen in the table (29)
if one notices that the number of 0’s in the second column is about half the number of digits
of 𝜋(𝑥), meaning that the difference 𝜋(𝑥) − Li(𝑥) is of the order 𝑥^{1/2}, give or take some
logarithmic factors. If there were a zero with ℜ𝑠 = 𝑐 > 1/2, the error 𝜋(𝑥) − Li(𝑥) would be
at least of size 𝑥^𝑐, and the argument of Hadamard and de la Vallée Poussin was really about
proving that ℜ𝑠 < 1 for all zeros of the 𝜁-function.
While this is an incredibly interesting topic, the plot of our narrative follows a dif-
ferent path. Asked about the RH, James Maynard says: “The Riemann Hypothesis suggests
that there is a deep hidden structure within the prime numbers. This must occur for a good
reason - we just do not know what the reason is yet.”
7. Inclusion–exclusion
Let A be a set of integers, or even of objects of arbitrary nature. A very, very abstract
formulation of a sieve involves some subsets A_𝑝 ⊂ A, labelled by 𝑝 in some index set P,
which we wish to remove or sift out from the set A. In other words, our goal is to understand
the complement A \ ⋃_{𝑝∈P} A_𝑝 of all sets A_𝑝 in A.
In its most basic form, the principle of inclusion–exclusion refers to the following
elementary observation. Assuming the number of elements |A | is finite, we have
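\[
\begin{aligned}
\Bigl| A \setminus \bigcup_{p \in P} A_p \Bigr| = {} & |A| && \text{count all elements of } A \\
& - \sum_{p} |A_p| && \text{subtract } |A_p| \text{ for each } p \\
& + \sum_{p_1 < p_2} |A_{p_1} \cap A_{p_2}| && \text{correct for subtracting twice} \\
& - \sum_{p_1 < p_2 < p_3} |A_{p_1} \cap A_{p_2} \cap A_{p_3}| + \dots && \text{et cetera.}
\end{aligned}
\tag{30}
\]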
For example, referring back to table (9) we may take
In other words, subtracting 5 multiples of 2 and 2 multiples of 5, we subtract the zero residue
twice, as the shading in table (9) illustrates. Hence we have to put it back.
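The bookkeeping of inclusion–exclusion can be checked mechanically on this example. The sketch below assumes that A is the set of residues mod 10 and that the sets to be sifted out are the multiples of 2 and the multiples of 5; the function name `sifted_count` is our own.

```python
from itertools import combinations


def sifted_count(universe, subsets):
    """Size of universe minus the union of the subsets, computed by the
    alternating sum over all intersections (inclusion-exclusion)."""
    universe = set(universe)
    total = 0
    for r in range(len(subsets) + 1):
        for chosen in combinations(subsets, r):
            # With r = 0 this is just the universe itself.
            common = universe.intersection(*chosen)
            total += (-1) ** r * len(common)
    return total


if __name__ == "__main__":
    residues = range(10)
    multiples_of_2 = {a for a in residues if a % 2 == 0}
    multiples_of_5 = {a for a in residues if a % 5 == 0}
    # 10 - 5 - 2 + 1: the zero residue is subtracted twice and put back once.
    print(sifted_count(residues, [multiples_of_2, multiples_of_5]))
```

The result agrees with a direct count of the residues 1, 3, 7, 9 that survive the sifting.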
If the subsets A 𝑝 ⊂ A correspond to independent events, meaning that
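\[ \frac{|A_{p_1} \cap A_{p_2} \cap \cdots \cap A_{p_r}|}{|A|} = \prod_{i=1}^{r} \frac{|A_{p_i}|}{|A|} , \tag{31} \]
then formula (30) factors very nicely
\[ \frac{\bigl| A \setminus \bigcup_{p \in P} A_p \bigr|}{|A|} = \prod_{p \in P} \Bigl(1 - \frac{|A_p|}{|A|}\Bigr) , \tag{32} \]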
\[ A_{p_1} \cap A_{p_2} \cap \cdots \cap A_{p_r} = A_{p_1 p_2 \cdots p_r} , \tag{33} \]
(34)
If (33) is the case, the terms in formula (30) correspond to square-free integers 𝑑 all of whose prime
factors belong to P. Thus (30) may be written more compactly
\[ \Bigl| A \setminus \bigcup_{p \in P} A_p \Bigr| = \sum_{d=1}^{\infty} \mu_P(d) \, |A_d| , \tag{35} \]
using a variant of the Möbius function
\[
\mu_P(d) =
\begin{cases}
(-1)^r , & d \text{ is a product of } r \text{ distinct primes in } P , \\
0 , & \text{otherwise.}
\end{cases}
\tag{36}
\]
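To see formula (35) in action, one can take A = {1, . . . , 𝑁} and let A_𝑑 be the multiples of 𝑑 in A, so that |A_𝑑| is the integer part of 𝑁/𝑑. The following sketch makes these assumptions explicit; the function names `mu_P` and `count_coprime` are our own, and `P` is assumed to be a list of distinct primes.

```python
def mu_P(d, P):
    """(-1)^r if d is a product of r distinct primes from P, else 0."""
    r = 0
    for p in P:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0  # the square of p divides the original d
            r += 1
    # Any leftover factor means d has a prime factor outside P.
    return (-1) ** r if d == 1 else 0


def count_coprime(N, P):
    """Number of n in 1..N not divisible by any prime in P, computed as
    the sum of mu_P(d) * |A_d| with |A_d| = N // d."""
    return sum(mu_P(d, P) * (N // d) for d in range(1, N + 1))


if __name__ == "__main__":
    print(count_coprime(10, [2, 5]))         # the residues 1, 3, 7, 9
    print(count_coprime(100, [2, 3, 5, 7]))  # 1 together with the primes from 11 to 97
```

Terms with 𝑑 > 𝑁 contribute nothing because then |A_𝑑| = 0, which is why the infinite sum in (35) is effectively finite here.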
A more flexible language for the inclusion-exclusion principle uses the notion of character-
istic functions. For any subset 𝑆 ⊂ A , we define its characteristic function 𝛿 𝑆 by
\[
\delta_S(n) =
\begin{cases}
1 , & n \in S , \\
0 , & n \notin S .
\end{cases}
\tag{37}
\]
Then (35) can be refined to
\[ \delta_{A \setminus \bigcup_{p \in P} A_p} = \sum_{d=1}^{\infty} \mu_P(d) \, \delta_{A_d} . \tag{38} \]
Since
\[ |S| = \sum_{n} \delta_S(n) , \tag{39} \]
satisfied by (40) and some other choices of 𝜌, implies an analog of independence (31) for
weighted counts. For example, for A = ℕ, A_𝑝 = 𝑝ℕ, and a function 𝜌 satisfying (41), formula
(32) transforms into
\[ \frac{\sum_{n \text{ coprime to } P} \rho(n)}{\sum_{n \in \mathbb{N}} \rho(n)} = \prod_{p \in P} \bigl(1 - \rho(p)\bigr) . \tag{42} \]
We invite the reader to generalize formula (42) for functions 𝜌 satisfying a weaker property
Other than (40), what other interesting functions satisfy (41)? For every 𝑁, the set
(ℤ/𝑁ℤ)^× = {residue classes 𝑎 mod 𝑁 such that gcd(𝑎, 𝑁) = 1} (44)
is a finite abelian group with respect to multiplication. We take a character 𝜒 of the group
(44), that is, a complex-valued multiplicative function with 𝜒(1) = 1, and extend it by zero
to all residues mod 𝑁. Examples of such functions are
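\[
\chi_3(n) =
\begin{cases}
1 , & n = 1 \bmod 3 , \\
-1 , & n = -1 \bmod 3 , \\
0 , & n = 0 \bmod 3 ,
\end{cases}
\qquad
\chi_5(n) =
\begin{cases}
i^m , & n = 2^m \bmod 5 , \\
0 , & n = 0 \bmod 5 ,
\end{cases}
\tag{45}
\]
where the complex number 𝑖 = √−1 ∈ ℂ is the imaginary unit. The weight
\[ \rho_{N, \chi, s}(n) = \frac{\chi(n \bmod N)}{n^s} , \qquad s > 1 , \tag{46} \]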
satisfies (41) and the corresponding analog of the 𝜁-function
\[ L(\chi, s) = \sum_{n=1}^{\infty} \frac{\chi(n \bmod N)}{n^s} , \qquad s > 1 , \tag{47} \]
is called the Dirichlet L-function. Its properties are entirely parallel to those of the 𝜁-function, with
one crucial difference. Namely, if 𝜒 is nontrivial, that is, takes values other than 0 and 1, then,
in contrast to the 𝜁 function having a singularity at 𝑠 = 1 as in (93), the L-function has a finite
nonzero value at 𝑠 = 1. This allowed Dirichlet to show that primes are equally distributed
among the residue classes (44).
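The two characters in (45) are small enough to be tested by hand or by machine. The sketch below encodes them as lookup tables, checks multiplicativity by brute force, and sums the first terms of the series for 𝜒₃ at 𝑠 = 1. The function names are our own, and the comparison value 𝜋/(3√3) for 𝐿(𝜒₃, 1) is a classical fact quoted here for orientation, not something derived in these notes.

```python
from math import pi, sqrt


def chi3(n):
    """Character mod 3 from (45): 1, -1, 0 for n = 1, -1, 0 mod 3."""
    return (0, 1, -1)[n % 3]


# Powers of 2 mod 5: 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 3, so chi5 = i^m there.
CHI5_VALUES = {0: 0, 1: 1, 2: 1j, 4: -1, 3: -1j}


def chi5(n):
    """Character mod 5 from (45)."""
    return CHI5_VALUES[n % 5]


def is_multiplicative(chi, bound=50):
    """Brute-force check of chi(ab) = chi(a) chi(b) for 1 <= a, b < bound."""
    return all(
        abs(chi(a * b) - chi(a) * chi(b)) < 1e-12
        for a in range(1, bound)
        for b in range(1, bound)
    )


def L_partial(chi, s, terms):
    """Partial sum of the series chi(n) / n^s over 1 <= n <= terms."""
    return sum(chi(n) / n**s for n in range(1, terms + 1))


if __name__ == "__main__":
    print(is_multiplicative(chi3), is_multiplicative(chi5))
    print(L_partial(chi3, 1, 30000), pi / (3 * sqrt(3)))
```

The partial sums at 𝑠 = 1 stay bounded and approach a nonzero number, in contrast to the harmonic series that one gets from the 𝜁-function.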
Having seen products of this general shape before, the reader should not be surprised by the
following exact result of F. Mertens
\[ \prod_{\text{primes } p \le \sqrt{x}} \Bigl(1 - \frac{1}{p}\Bigr) \sim \frac{2 e^{-\gamma}}{\ln x} , \tag{50} \]
where 𝛾 is the number from (22) and (93). Since 2𝑒^{−𝛾} ≈ 1.123, this is somewhat close to the
right answer and, in particular, gives the correct logarithmic dependence on 𝑥, but little else
can be said in defence of a wrong formula.
This example is meant to illustrate that it is not easy to construct a good sieve, and
not to discourage the reader from reading on! See also the references in Section 11, and in
particular [7].
9. Patterns in primes
So far, we have looked at primes individually, meaning that we studied expressions
like
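\[
\pi(x) = \sum_{\text{primes } p} \delta_{[1,x]}(p) , \qquad \text{where} \quad
\delta_{[1,x]}(y) =
\begin{cases}
1 , & y \in [1, x] , \\
0 , & \text{otherwise} ,
\end{cases}
\]
\[ \ln \zeta(s) = - \sum_{\text{primes } p} \ln \Bigl(1 - \frac{1}{p^s}\Bigr) , \]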
given by summing some natural function 𝑓 ( 𝑝) over the set of all primes. To a general science
audience, we can say that we have been learning about 1-point correlations in the set of
primes.
Recall we expect the primes to be as “random” as the constraints imposed by residues
and density allow. To really put these ideas to the test, one should study multi-point correla-
tions, that is, events or patterns that involve pairs, triples, etc. of primes.
To start with a concrete example, what is the probability that 𝑛 and 𝑛 + 1 are both
prime? The answer is clearly 0 because one of these numbers will have to be even, and so
𝑛 = 2 is the only solution. What about 𝑛 and 𝑛 + 2 being simultaneously prime? Such pairs
are called twin primes and we saw many such pairs (green) in the sieve of Eratosthenes (3).
Similarly, in the plot (5), twin primes are shown in green, all other primes in blue.
Twin primes provide an excellent test of our probabilistic intuition based on density
and mod 𝑝 considerations. From density alone, we would expect the density of twin
primes around 𝑁 to be about (ln 𝑁)⁻². However, this needs to be corrected from mod 𝑝
considerations. Indeed, if 𝑛 and 𝑛 + 2 were truly independent, the probability of both of them
being coprime to 𝑝 would be (1 − 1/𝑝)², while in reality it is 1/2 for 𝑝 = 2 and (1 − 2/𝑝) for
𝑝 > 2. Whence the following constant in the 1923 conjecture of Hardy and Littlewood
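\[
\pi_2(x) = |\{ p \le x \text{ such that } p + 2 \text{ is prime} \}| \overset{?}{\sim} C_2 \int_2^x \frac{dy}{(\ln y)^2} , \qquad x \to \infty , \tag{51}
\]
where
\[ C_2 = 2 \prod_{\text{primes } p > 2} \frac{1 - \frac{2}{p}}{\bigl(1 - \frac{1}{p}\bigr)^2} = 1.32\ldots . \tag{52} \]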
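The constant in (52) converges quickly and can be approximated by truncating the product. The sketch below does this and also counts twin primes directly, so that the two sides of the conjecture can be compared by hand; the function names and the cutoff `prime_bound` are our own choices.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def twin_prime_constant(prime_bound=10**5):
    """Approximate C_2 by the product over odd primes p <= prime_bound."""
    c = 2.0
    for p in primes_up_to(prime_bound):
        if p > 2:
            c *= (1 - 2 / p) / (1 - 1 / p) ** 2
    return c


def twin_prime_count(x):
    """Number of primes p <= x such that p + 2 is also prime."""
    primes = set(primes_up_to(x + 2))
    return sum(1 for p in primes if p <= x and p + 2 in primes)


if __name__ == "__main__":
    print(twin_prime_constant())
    print(twin_prime_count(100), twin_prime_count(10**6))
```

The factor for a large prime 𝑝 differs from 1 by about 1/𝑝², so the truncated product is already stable to several decimal places.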
In exactly the same fashion, the probability that 𝑛 and 𝑛 + 2𝑚 are both coprime to 𝑝 equals
(1 − 1/𝑝) if 𝑝 divides 2𝑚 and (1 − 2/𝑝) otherwise. Therefore, for any fixed 𝑚 one can
conjecture that
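\[
|\{ p \le x \text{ such that } p + 2m \text{ is prime} \}| \overset{?}{\sim} C_{2m} \int_2^x \frac{dy}{(\ln y)^2} , \qquad x \to \infty , \tag{53}
\]
where
\[ \frac{C_{2m}}{C_2} = \prod_{p \mid m , \; p \ne 2} \frac{p - 1}{p - 2} \ge 1 . \tag{54} \]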
From this, it is clear that products of consecutive odd primes like 1155 = 3 · 5 · 7 · 11 should be
particularly likely to occur as distances 𝑝₂ − 𝑝₁ between primes, while powers of two are the
least likely values of 𝑝₂ − 𝑝₁. In (55) the function (54) is plotted in the ranges 𝑚 ∈ [1 . . . 105]
and 𝑚 ∈ [1 . . . 1155], respectively⁶.
[Plot (55): the function (54) for 𝑚 ∈ [1 . . . 105] and for 𝑚 ∈ [1 . . . 1155]] (55)
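The function (54) is simple to evaluate exactly with rational arithmetic. In the sketch below, the names `odd_prime_divisors` and `gap_weight` are our own.

```python
from fractions import Fraction


def odd_prime_divisors(m):
    """Distinct odd primes dividing the positive integer m."""
    while m % 2 == 0:
        m //= 2
    divisors = []
    p = 3
    while p * p <= m:
        if m % p == 0:
            divisors.append(p)
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:
        divisors.append(m)
    return divisors


def gap_weight(m):
    """The ratio C_{2m} / C_2 from (54), as an exact fraction."""
    ratio = Fraction(1)
    for p in odd_prime_divisors(m):
        ratio *= Fraction(p - 1, p - 2)
    return ratio


if __name__ == "__main__":
    for m in (1, 2, 3, 4, 15, 105, 1155):
        print(m, gap_weight(m), float(gap_weight(m)))
```

Powers of two give the minimal value 1, while 𝑚 = 105 and 𝑚 = 1155 give 16/5 and 32/9, which are the peaks referred to in the text.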
The conjecture (53) is in excellent agreement with data, especially if one considers the relative
frequencies of distances. The following plot (56) compares the function 𝐶2𝑚 with the actual
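distribution of the distances among the first 10⁶ odd primes:
[Plot (56): relative frequencies of the distances 2𝑚, numerical data in light blue and theoretical prediction in dark blue] (56)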
6 The reader may have to adjust the size/resolution of the graph to see the peak at 1155.
In (56) we have plotted the relative frequencies, normalized to exactly 1 for 𝑚 = 1. The numer-
ical data is in light blue and the theoretical prediction is in dark blue. The latter overshoots
(with the exception of 𝑚 = 18) the former by less than 1%, so it is just barely visible in the
plot. Had we gone any deeper in the list of primes, the difference between the graphs would have become
undetectable.
We note that the above discussion is for distances between primes, while a prime gap
of length 2𝑚 means there are no other primes between 𝑝 and 𝑝 + 2𝑚. However, since primes
become sparser and sparser, finding another prime in an interval of fixed length becomes less
and less probable as 𝑝 → ∞.
The exact same heuristic can be applied to any finite set of jumps
that we would like to find between primes. We denote by 𝑛 + 𝐽 = {𝑛 + 𝑗₁ < · · · < 𝑛 + 𝑗_𝑙} the
shift of 𝐽 by 𝑛 ∈ ℕ and by 𝑛 + 𝐽 ⊂ P the event that all numbers 𝑛 + 𝑗ᵢ are prime. In parallel
to (53), it is natural to expect that
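\[
|\{ n \le x \text{ such that } n + J \subset P \}| \overset{?}{\sim} C_J \int_2^x \frac{dy}{(\ln y)^{|J|}} , \qquad x \to \infty , \tag{58}
\]
where
\[ C_J = \prod_{p} \frac{1 - \frac{|J \bmod p|}{p}}{\bigl(1 - \frac{1}{p}\bigr)^{|J|}} . \tag{59} \]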
Here |𝐽 mod 𝑝| is the number of distinct residue classes mod 𝑝 in 𝐽. Since, for fixed 𝐽, this
equals |𝐽| for all sufficiently large 𝑝, each such 𝑝 contributes a factor 1 + 𝑂(1/𝑝²) to (59).
Therefore, the product (59) converges.
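The quantities entering (59) are straightforward to compute. The sketch below evaluates |𝐽 mod 𝑝| and a truncation of the product (59); the function names and the cutoff `prime_bound` are our own, and the truncated product is only an approximation to 𝐶_𝐽.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


def residues_hit(J, p):
    """|J mod p|: the number of distinct residue classes mod p in J."""
    return len({j % p for j in J})


def singular_constant(J, prime_bound=10**5):
    """Approximate C_J by the product (59) over primes p <= prime_bound."""
    k = len(J)
    c = 1.0
    for p in primes_up_to(prime_bound):
        c *= (1 - residues_hit(J, p) / p) / (1 - 1 / p) ** k
    return c


if __name__ == "__main__":
    for J in ([0, 2], [0, 2, 6], [0, 4, 6], [0, 2, 4]):
        print(J, singular_constant(J))
```

For 𝐽 = {0, 2} this reproduces the constant 𝐶₂ of the twin prime conjecture, while for 𝐽 = {0, 2, 4} the factor at 𝑝 = 3 vanishes and so does the whole product.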
It is clear from (59) that the patterns in primes favor those 𝐽 that contain a small fraction of
residues modulo some prime 𝑝 and prohibit those 𝐽 for which |𝐽 mod 𝑝| = 𝑝. It is also clear
from the definitions that it suffices to consider the case 𝑗₁ = 0. The graph of the function
𝐶_{0,2𝑖,2(𝑖+𝑗)} / 𝐶_{0,2,6} is plotted on the left. It vanishes unless 𝑖𝑗(𝑖 + 𝑗) = 0 mod 3, which
explains the missing columns in the plot.
10. Closing the gap
Let us call a pattern 𝐽 as in (57) admissible if 𝐶_𝐽 ≠ 0, that is, if it has a nonzero chance
to occur in prime numbers. As a very, very special case of the above heuristic reasoning,
one expects that any admissible pattern 𝐽 will occur as a sequence of prime gaps infinitely
many times⁷. In particular, one expects the set of twin primes to be infinite. This is known
as the twin prime conjecture, and it is still open today. However, in contrast to the Riemann
Hypothesis, there has been truly dramatic progress in recent years on such infinitude
questions. This progress has been so dramatic that it inspires us to say that these conjectures
are “almost” proven. It is quite incredible to see humans actually reach for the stars.
James Maynard does not quite agree with the narrator here. He says: “Despite all
the recent progress, it seems we are still missing an important idea to prove the Twin Prime
Conjecture. But perhaps it is only one big idea.”
Of course, the actual mathematics involved in proofs compares to what we have
discussed so far like a modern airplane compares to a paper airplane. But if the reader tried
to think about the issues discussed in Section 8, then she or he may begin to appreciate the
amazing creativity and technical mastery required to design sieving arguments leading to the
proofs of the breakthrough results below.
It is clear from the prime number theorem that for any constant 𝑐 > 1 there are
infinitely many pairs of primes 𝑝₁ and 𝑝₂ such that
𝑝₁ < 𝑝₂ < 𝑝₁ + 𝑐 ln 𝑝₁ . (60)
Proving the same statement for some value 𝑐 < 1 is not easy. Many brilliant mathematicians
worked on this, finding proofs for smaller and smaller values of 𝑐, until Goldston, Pintz and
Yıldırım showed that for any constant 𝑐 > 0 there are infinitely many pairs of primes
satisfying (60).
The new important ideas introduced by Goldston, Pintz and Yıldırım opened the
race to replace 𝑐 ln 𝑝₁ in (60) by some fixed constant 𝐵, that is, to prove the infinitude of pairs
of primes that are within a fixed finite distance
𝑝₁ < 𝑝₂ ≤ 𝑝₁ + 𝐵 (61)
from each other. This race was won in a very dramatic fashion in April 2013 by Yitang Zhang.
Even much more modest results in mathematics today require finding a new way
through a real maze of possible ideas, techniques, and logical constructions, and hence
moments of extraordinary concentration and clarity of mind. This is not unlike the need
to be in a really, really top form for an athlete to set a world record. Research mathematicians
(who do have time to do research as part of their job description, in addition to teaching,
advising, and other professional duties) cherish these precious moments. Most athletes and
mathematicians will surely agree that these special moments tend to be spaced further than
ln 𝑁 apart once we are past our prime. Zhang’s proof is therefore particularly incredible
and inspiring, since he had to find his way not just through the mathematical maze, but also
through the many turns of his difficult career outside of academia, not giving up despite the
big success finally coming to him only at the age of 55. His achievement was widely cel-
ebrated by the community, earning him a number of prestigious prizes including the 2013
Ostrowski Prize, the 2014 Cole Prize in Number Theory, and the 2014 Rolf Schock Prize. In
the same year 2014, the Cole Prize in Number Theory was also awarded to Goldston, Pintz
and Yıldırım for their influential work mentioned above.
We hope the reader will turn to [8, 13, 17, 19, 30, 31] to learn more about these
developments, while we turn to the main hero of these popular notes, the winner of many awards
including the 2022 Fields Medal. In the same eventful year 2013, James Maynard realized
he could make the sieve a lot more effective, eclipsing Zhang’s result in two key dimensions:
getting a much stronger result by an easier method.
Speaking about the influences and inspirations that led to this result, James
Maynard says: “I was trying to understand the sieve intuition behind the groundbreaking
work of Goldston-Pintz-Yıldırım, but in studying this I realised that it might be possible to
modify their ideas to go further.”
It is commonly said that great minds think alike, and the same sometimes happens to
the greatest minds, also. In the suspenseful race to close the prime gap, Terry Tao arrived at the
same results independently at the same time as James Maynard. “I was a bit shocked when I
first heard the news, but fortunately Tao was very generous and understanding. Simultaneous
discovery happens more often than you’d imagine!”, says James Maynard.
To explain Maynard’s and Tao’s main result on small gaps in primes, it is important
to make a certain change of perspective. In Section 9, we were interested in the event when
all numbers
𝑛 + 𝐽 = (𝑛 + 𝑗₁, 𝑛 + 𝑗₂, . . . , 𝑛 + 𝑗_𝑙) (62)
are prime. But if one asks for less one can prove more! Let’s instead fix some 𝑚 < 𝑙 and
ask that at least 𝑚 of the numbers (62) are prime for infinitely many values of 𝑛. We will
not know which ones among (62) are prime, but we will know, for instance, that there are
infinitely many primes within distance 𝑗_𝑙 − 𝑗₁ from each other.
The following is a special case of the spectacular main result of [20], which Kannan
Soundararajan compares with “sun amidst the stars” in his Fields Medal laudatio.
Theorem 1. For any 𝑚, for all sufficiently long admissible patterns 𝐽, at least 𝑚 of the
numbers (62) are prime for infinitely many 𝑛.
In fact, for any given 𝑚, the required size of 𝐽 in Theorem 1 can be made explicit.
For 𝑚 = 2, |𝐽| = 50 suffices, and the following set is admissible:
𝐽 = {0, 4, 6, 16, 30, 34, 36, 46, 48, 58, 60, 64, 70, 78, 84, 88, 90, 94, 100, 106,
108, 114, 118, 126, 130, 136, 144, 148, 150, 156, 160, 168, 174, 178, 184,
190, 196, 198, 204, 210, 214, 216, 220, 226, 228, 234, 238, 240, 244, 246} , (63)
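Admissibility of a concrete pattern such as (63) can be verified by machine. Since a set of 𝑙 numbers cannot occupy all residue classes modulo a prime 𝑝 > 𝑙, only the primes 𝑝 ≤ 𝑙 need to be tested. The sketch below does exactly that; the names `J63` and `is_admissible` are our own.

```python
J63 = [
    0, 4, 6, 16, 30, 34, 36, 46, 48, 58, 60, 64, 70, 78, 84, 88, 90, 94, 100, 106,
    108, 114, 118, 126, 130, 136, 144, 148, 150, 156, 160, 168, 174, 178, 184,
    190, 196, 198, 204, 210, 214, 216, 220, 226, 228, 234, 238, 240, 244, 246,
]


def is_admissible(J):
    """True if, for every prime p, the set J misses at least one residue
    class mod p. Only primes p <= len(J) can possibly fail this test."""
    for p in range(2, len(J) + 1):
        if all(p % q for q in range(2, int(p**0.5) + 1)):  # p is prime
            if len({j % p for j in J}) == p:
                return False
    return True


if __name__ == "__main__":
    print(len(J63), max(J63) - min(J63), is_admissible(J63))
```

The script reports the number of elements of the set, its diameter 𝑗_𝑙 − 𝑗₁, and whether the admissibility test passes.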
For 𝑚 = 3, |𝐽| = 35410 suffices, and one can take⁸, for instance, the first 35410
primes larger than 35410
Therefore, there are infinitely many triples of primes within 433992 of each other. In general,
the best estimate for the required length of 𝐽 currently stands at 𝑐𝑒^{3.815𝑚}, see [1].
The more general result proven in [20] guarantees there are at least 𝑚 primes among
the numbers 𝑎₁𝑛 + 𝑗₁, . . . , 𝑎_𝑙 𝑛 + 𝑗_𝑙 provided these are distinct and admissible. This stronger
version of Theorem 1 leads to many further interesting conclusions about patterns in primes.
For example, one can deduce that there are arbitrarily large sets of primes where any pair in
the set differs in only 2 decimal places! Indeed, if we take
then all digits of 𝑎ᵢ𝑛 + 𝑗ᵢ, 𝑖 = 1, . . . , 𝑙 are the same, except the position of the 1 in the (𝑖 + 1)st
decimal place, which is changing its position within the string of 𝑙 zeros.
I hope the readers share the narrator’s sense of awe at this absolutely amazing math-
ematics and join me in warmest congratulations on it being recognized by the Fields Medal.
I also hope the readers got the sense that today’s mathematics is not just extraordinarily pow-
erful, but also concrete, understandable, and fun, once one finds the right idea and the right
point of view. While finding that right point of view is not at all easy, my biggest hope is to
have inspired my youngest readers to believe that mathematics can be beautiful and reward-
ing, both as a subject and as a profession. Maybe this is also a good place for me to thank
James Maynard and Kannan Soundararajan for this special opportunity to be introduced to
their wonderful subject.
8 As an exercise, the reader may check that any 𝑙-tuple of primes larger than 𝑙 is admissible.
12. A glimpse into the argument
To help the reader make the transition to further popular and research reading, we will
indicate some initial logical steps in the argument leading to the proof of Theorem 1. There
is a certain distance that we can fly even on our paper airplane.
Plots of the function $\delta_P$ look like barcodes, and here is an example
(69)
in which 𝑛 takes odd values from $10^6 + 1$ to $10^6 + 599$. In principle, (38) gives a formula for
$\delta_P$, and we can approach the goal of finding a replacement for $\delta_P$ by tinkering with the formula
(38). For instance, we can just truncate the summation over 𝑑 at some maximal value 𝐷. That is, we
define
$$
\tilde\delta_0(n) = \Bigg( \sum_{d \mid n,\; d \le D} \mu(d) \Bigg)^{2} , \tag{70}
$$
where we square the sum to make the result nonnegative. Since this equals 1 if 𝑛 has no
nontrivial divisors 𝑑 ≤ 𝐷, it is natural to compare this function to the characteristic function
$\delta_{\le D}$ of numbers without prime factors 𝑝 ≤ 𝐷.
It is easy to plot the function $\tilde\delta_0 - \delta_{\le D}$ and the result
(71)
for 𝐷 = 100 is not really satisfying. The two peaks in the graph correspond to the numbers
and, in general, the function (70) becomes large not because 𝑛 is prime, but because there
is a significant imbalance between its divisors 𝑑 ≤ 𝐷 with different parity of the number of
prime factors. In other words, $\tilde\delta_0(n)$ is much more sensitive to the artificial cutoff introduced
by us at 𝑑 ≤ 𝐷 than to what we set out to measure in the first place.
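The reader may wish to reproduce this experiment. Below is a minimal Python sketch of the hard cutoff (70) and of the characteristic function of numbers without prime factors 𝑝 ≤ 𝐷; the function names `mobius`, `delta0` and `delta_le` are our own, and the range of 𝑛 is taken from the description of plot (69).

```python
def mobius(d):
    """Möbius function: 0 if d has a square factor, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:  # p^2 divided the original d
                return 0
            result = -result
        p += 1
    return -result if d > 1 else result


def delta0(n, D):
    """Squared Möbius sum over divisors d of n with d <= D, as in (70)."""
    return sum(mobius(d) for d in range(1, D + 1) if n % d == 0) ** 2


def delta_le(n, D):
    """1 if n has no prime factor p <= D, else 0."""
    return int(all(n % d for d in range(2, D + 1)))


D = 100
rows = [(delta0(n, D) - delta_le(n, D), n) for n in range(10**6 + 1, 10**6 + 600, 2)]
# Show the three odd n in the range where the difference is largest.
for diff, n in sorted(rows, reverse=True)[:3]:
    print(n, diff)
```

Factoring the numbers printed at the top of the list is a quick way to see which divisors 𝑑 ≤ 𝐷 are responsible for a large value.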
To get rid of this effect, it makes sense to replace the hard cutoff at 𝑑 ≤ 𝐷 by a more
gentle one, through some weight function of 𝑑 that gives 1 for prime numbers and vanishes
at 𝑑 = 𝐷. Let us try
$$
\tilde\delta_k(n) = \frac{1}{(\ln D)^{2k}} \Bigg( \sum_{d \mid n,\; d \le D} \mu(d) \Big( \ln \frac{D}{d} \Big)^{k} \Bigg)^{2} , \tag{72}
$$
and this works much, much better for 𝑘 ≥ 1. For 𝐷 = 100, the function $\tilde\delta_1 - \delta_{\le D}$ looks like
this:
(73)
Not only does it take values in [0, 1) in this plot, it also peaks at numbers with prime factors 𝑝
of size close to 𝐷. Since the weight $\ln \frac{D}{p}$ gets small for such 𝑝, we certainly expect such
numbers to contribute on par with the prime numbers.
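To see the effect of the smooth cutoff on concrete numbers, here is a minimal Python sketch of (72). The names `mobius` and `delta_k` are ours, and the two test numbers 97 · 101 and 3 · 101 are chosen only for illustration: each is a prime above 𝐷 = 100 times a prime below it.

```python
from math import log


def mobius(d):
    """Möbius function: 0 if d has a square factor, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0
            result = -result
        p += 1
    return -result if d > 1 else result


def delta_k(n, D, k=1):
    """Smoothly truncated and normalized Möbius sum, squared, as in (72)."""
    total = sum(mobius(d) * log(D / d) ** k for d in range(1, D + 1) if n % d == 0)
    return (total / log(D) ** k) ** 2


D = 100
print(delta_k(101, D))                 # a prime larger than D: only d = 1 contributes
print(round(delta_k(97 * 101, D), 3))  # prime factor close to D: about 0.987
print(round(delta_k(3 * 101, D), 3))   # small prime factor: about 0.057
```

For 𝑛 = 𝑝𝑞 with primes 𝑝 ≤ 𝐷 < 𝑞 and 𝑘 = 1, only 𝑑 = 1 and 𝑑 = 𝑝 contribute, so the value is $(\ln p / \ln D)^2$, which is close to 1 exactly when 𝑝 is close to 𝐷.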
By allowing 𝐹 to depend on each divisor $d_i$, the Maynard-Tao method activates
a very powerful principle of measure concentration in high-dimensional geometry. At the
risk of being repetitive, one may note that there is really a lot of space in a space of a large
dimension 𝑁. There is so much space that no probability distribution can cover all of it evenly
as 𝑁 → ∞, and one could put this vague principle in a mathematically precise form, see for
instance [18].
To make a negative statement positive, one can say that any high-dimensional prob-
ability density has to concentrate on some small portion of the whole space. For example, a
probability measure 𝜈 on the line $\mathbb{R}$ is another name for a random variable 𝑥, and a product
measure $\nu^{\otimes N} = \nu \times \cdots \times \nu$ on $\mathbb{R}^N$ is another name for a sequence of independent, identically
distributed (i.i.d.) random variables $x_1, \dots, x_N$. We know from basic probability theory that,
with minimal assumptions about 𝜈, the average $\frac{1}{N} \sum x_i$, and many other functions of i.i.d.
random variables $x_1, \dots, x_N$, will sharply peak, or concentrate, around their expected value
as 𝑁 → ∞.
A reader not familiar with these notions may experiment by working out the example
in which 𝜈 is the uniform density on [0, 1] and $\nu^{\otimes N}$ is a uniform density on an 𝑁-dimensional
cube $[0, 1]^N$. Taking the sum $\sum x_i$ means projecting the cube onto the (1, 1, . . . , 1) axis, and
the reader may enjoy actually plotting these densities for different values of 𝑁. It is also fun
to compute the projection of a uniform measure on a high-dimensional sphere onto any axis.
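One possible way to carry out the cube experiment is by random sampling. The sketch below is a Monte Carlo illustration in Python, not an exact computation of the projected density; the function name `average_of_uniforms`, the sample size and the seed are our own choices. It compares the observed spread of the average with the exact value $1/\sqrt{12N}$, which follows from the variance 1/12 of the uniform density on [0, 1].

```python
import random
from statistics import mean, pstdev


def average_of_uniforms(N, samples=5000, seed=0):
    """Sample the average of N independent uniform [0, 1] variables `samples` times."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(N)) / N for _ in range(samples)]


for N in (1, 10, 100):
    xs = average_of_uniforms(N)
    # columns: N, observed mean, observed spread, exact spread 1/sqrt(12 N)
    print(N, round(mean(xs), 3), round(pstdev(xs), 4), round((12 * N) ** -0.5, 4))
```

The mean stays near 1/2 while the spread shrinks like $N^{-1/2}$, which is the concentration described above in its simplest form.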
It is by harnessing these concentration of measure phenomena that the density (74)
can significantly improve upon (72).
Time and time again in these notes we have stressed the technical importance of
being able to accurately count primes in arithmetic progression in analytic number theory,
also stressing that this may be very delicate if the progression is not much longer than its
common difference.
The counting function (27) may be refined to count primes in a given residue class
modulo 𝑏
$$
\frac{\pi(x, b, a)}{\pi(x)} \to
\begin{cases}
\phi(b)^{-1} , & \gcd(a, b) = 1 , \\
0 , & \text{otherwise} ,
\end{cases}
\tag{79}
$$
as 𝑥 → ∞, where 𝜙(𝑏) is the number of residue classes coprime to 𝑏. For fixed 𝑥, however,
the function
$$
(b, a) \mapsto \frac{\pi(x, b, a)}{\pi(x)} \, \phi(b) - 1 \tag{80}
$$
behaves in a very irregular manner. This is illustrated in the following plot for 𝑎 < 𝑏 ≤ 100
distributions of primes in arithmetic progression can be rigorously proven on average. The
actual estimate one needs here has the form
$$
\sum_{b < x^{1/2 - \varepsilon}} \; \max_{a :\, \gcd(a, b) = 1} \left| \pi(x, b, a) - \frac{\pi(x)}{\phi(b)} \right| \le C(A, \varepsilon) \, \frac{x}{(\ln x)^{A}} , \tag{81}
$$
which holds for any 𝐴 > 0 and 𝜀 > 0 with some positive constant 𝐶 ( 𝐴, 𝜀) that depends on
𝐴 and 𝜀. In our example, the maxima over 𝑎 in (81) and their running average over 𝑏 can be
seen in the following plot
Averaging really does make the behavior a lot more regular and, hence, manageable.
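The reader can repeat this experiment with a few lines of code. The following Python sketch counts primes up to 𝑥 in each residue class and computes, for each modulus 𝑏, the maximum over 𝑎 that appears in (81), together with its running average over 𝑏. The value $x = 10^5$ and the helper names `primes_up_to` and `max_deviation` are our assumptions; we do not know which 𝑥 was used for the plots.

```python
from math import gcd


def primes_up_to(n):
    """Return the list of primes <= n (sieve of Eratosthenes)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(range(p * p, n + 1, p))
    return [p for p in range(2, n + 1) if sieve[p]]


def max_deviation(primes, b):
    """Max over a coprime to b of |pi(x, b, a) - pi(x)/phi(b)|, the summand in (81)."""
    counts = [0] * b
    for p in primes:
        counts[p % b] += 1
    coprime = [a for a in range(b) if gcd(a, b) == 1]
    expected = len(primes) / len(coprime)  # pi(x) / phi(b)
    return max(abs(counts[a] - expected) for a in coprime)


x = 10**5  # illustrative choice only
primes = primes_up_to(x)
total = 0.0
for b in range(2, 101):
    dev = max_deviation(primes, b)
    total += dev
    if b % 20 == 0:
        # columns: b, maximum over a, running average over moduli 2..b
        print(b, round(dev, 1), round(total / (b - 1), 1))
```

Printing `dev` for every 𝑏 rather than every twentieth one makes the contrast between the jumpy maxima and their smoother running average easier to see.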
We have discussed some of the key ingredients that go into the proof of the amazing
result of Maynard and Tao. Perhaps this discussion has given the reader the motivation and
confidence to open more advanced literature written by the experts in the field, including the
papers listed in Section 11. In any case, we hope to have communicated to the reader our
own sense of awe at the beauty of mathematics.
A. Limits
Limits are defined not just for numerical sequences $(a_1, a_2, \dots)$ but for objects of
arbitrary nature for which there is a notion of neighborhoods. Namely, 𝑎 is the limit of the
above sequence, if every neighborhood of 𝑎 contains all elements $a_n$ except maybe finitely
many. The reader may find it useful to picture this as follows:
(82)
where the bin represents a neighborhood of 𝑎 and spheres represent the elements $a_n$. Of
course, since the sequence is infinite, any neighborhood of the limit point contains not just
many, but infinitely many of the $a_n$'s.
For real numbers, or any other set with the notion of distance, we may take the open
balls of arbitrary positive radius 𝑟 > 0
as standard neighborhoods. The reader may check her or his understanding of the definition
by proving (11) and (12), constructing a sequence of real numbers that does not have a limit,
and proving that the limit of a sequence of real numbers is unique when it exists.
The slight issue with defining the limits digit by digit is that the set of all real numbers
whose decimal expansion is fixed up to a certain point is a half-open interval, for instance
To define limits for real numbers correctly, one should take open intervals, that is, those
without either endpoint, as neighborhoods. Back to the main text.
$\{ 1 \le x_1 \le x_2 \le \cdots \le x_r \}$
Which functions 𝑓 (𝑦) should we consider?
In mathematics, the success often depends on choosing the right point of view. If
one has the right point of view, then one is able to see clearly where one is going.
A very nice choice here is to take $f(y) = y^{-s}$, where 𝑠 > 1 is a parameter. This is called
the Mellin transform, and it is a transform because it takes a function $\rho_r(y)$ of one variable 𝑦 to
another function $\rho_r^{\mathrm{Mellin}}(s)$ of the parameter 𝑠. Thus one trades a function of one variable
$\rho_r(y)$ for another function of one variable $\rho_r^{\mathrm{Mellin}}(s)$, which seems like a fair exchange. In
fact, one can reconstruct $\rho_r(y)$ from $\rho_r^{\mathrm{Mellin}}(s)$, so no information is lost.
The Mellin transform is a close relative of the Fourier transform and what makes
the following computation work is the basic identity
$$
(x_1 x_2)^{s} = x_1^{s} \, x_2^{s} .
$$
Because of this, the function $f(x_1 \cdots x_r)$ in (83) factors as $f(x_1) \cdots f(x_r)$ and we can eventually
reduce an 𝑟-fold integral in (83) to a product of 𝑟 integrals.
We compute
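\begin{align}
\rho_r^{\mathrm{Mellin}}(s) &\overset{\mathrm{def}}{=} \int_1^{\infty} y^{-s} \rho_r(y) \, dy \tag{84} \\
&= \int_{1 \le x_1 \le x_2 \le \cdots \le x_r} (x_1 \cdots x_r)^{-s} \prod_{i=1}^{r} \rho_1(x_i) \, dx_i \tag{85} \\
&= \frac{1}{r!} \int_{[1, \infty)^r} \prod_{i=1}^{r} x_i^{-s} \rho_1(x_i) \, dx_i \tag{86} \\
&= \frac{1}{r!} \, \rho_1^{\mathrm{Mellin}}(s)^{r} , \tag{87}
\end{align}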
where in going from (85) to (86) we used the fact that
$$
[1, \infty)^r = \bigcup_{\substack{\text{permutations} \\ w : \{1, \dots, r\} \to \{1, \dots, r\}}} \{ 1 \le x_{w(1)} \le x_{w(2)} \le \cdots \le x_{w(r)} \} \tag{88}
$$
and that the integration over any of the 𝑟! sets in the right-hand side of (88) gives the same
result as (85).
If 𝜌• is the density of numbers 𝑦 having an arbitrary number of factors 𝑟, including
the case when 𝑟 = 0 and 𝑦 = 1, then summing (87) over 𝑟 = 0, 1, 2, . . . gives
$$
\rho_\bullet^{\mathrm{Mellin}}(s) = \exp \rho_1^{\mathrm{Mellin}}(s) , \tag{89}
$$
where exp(𝑥) is another notation for the function $e^x$ from (13). The appearance of the
exponential function here is typical in many inclusion-exclusion situations.
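For readers who would like to see the summation behind (89) written out, here is the one-line computation. It is a formal sketch that uses only (87) and the standard exponential series $e^x = \sum_{r \ge 0} x^r / r!$, with the 𝑟 = 0 term equal to 1; questions of convergence are not addressed.

```latex
\rho_\bullet^{\mathrm{Mellin}}(s)
  = \sum_{r=0}^{\infty} \rho_r^{\mathrm{Mellin}}(s)
  = \sum_{r=0}^{\infty} \frac{1}{r!} \, \rho_1^{\mathrm{Mellin}}(s)^{r}
  = \exp \rho_1^{\mathrm{Mellin}}(s) .
```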
To model unique factorization we want to take 𝜌• = 1 on [1, ∞) which means
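$$
\rho_\bullet^{\mathrm{Mellin}}(s) = \int_1^{\infty} x^{-s} \, dx = \frac{1}{s - 1} , \qquad s > 1 . \tag{90}
$$
Thus, we expect
$$
\int_1^{\infty} x^{-s} \rho_1(x) \, dx \overset{?}{=} \ln \frac{1}{s - 1} , \qquad s > 1 , \tag{91}
$$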
which is both good and bad news for the following reasons.
On the one hand, $\ln \frac{1}{s-1}$ is not a Mellin transform of any density $\rho_1$ on [1, ∞) simply
because it does not have a limit as 𝑠 → +∞. The 𝑠 → +∞ limit in (91) probes $\rho_1(x)$ for 𝑥
very close to 1 because $x^{-s}$ becomes very small on the whole interval (1 + 𝛿, ∞) as 𝑠 → ∞,
for any fixed 𝛿 > 0. In particular, the Mellin transform of a bounded density function $\rho_1(x)$
on [1, ∞) has to go to zero as 𝑠 → +∞.
This means that we cannot accurately model prime numbers with real numbers and
continuous densities. Of course, it was certainly silly to be asking for the density of small
primes to begin with. However, our interest is precisely the opposite, as we want to know the
behavior of $\rho_1(x)$ for large 𝑥. This region is probed by the 𝑠 → 1 limit of the Mellin transform.
In fact
$$
f(x) = f_0 + O(x^{-c}) \quad \Rightarrow \quad \int_1^{\infty} f(x) \, x^{-s} \, dx = \frac{f_0}{s - 1} + \dots , \tag{92}
$$
where $O(x^{-c})$ means that $\frac{f(x) - f_0}{x^{-c}}$ remains bounded as 𝑥 → ∞, the double arrow ⇒ denotes
implication, and dots stand for a function which is analytic for 𝑠 > 1 − 𝑐. (And also analytic
for complex values of 𝑠 such that ℜ𝑠 > 1 − 𝑐.) In the 𝑠 → 1 limit, we may write
$$
\int_1^{\infty} x^{-s} \rho_1(x) \ln(x) \, dx = - \frac{d}{ds} \int_1^{\infty} x^{-s} \rho_1(x) \, dx \sim - \frac{d}{ds} \ln \frac{1}{s - 1} = \frac{1}{s - 1} ,
$$
which strongly suggests $\rho_1(x) \sim 1 / \ln(x)$ for 𝑥 → ∞.
In place of continuous approximations, the proof of Hadamard and de la Vallée
Poussin uses properties of the 𝜁-function (26), which, in the spirit of (84), can be interpreted
as the averaged value of $n^{-s}$ with respect to the measure that gives every positive integer 𝑛
weight 1. The equality between the sum and the product in (26) is the correct discrete version
of the relation (89). It looks different because in the discrete situation we need to account for
the nonzero chance of having two equal prime factors, the possibility of which was ignored
in going from (85) to (86). The exact analog of (90) is the following description
$$
\zeta(s) = \frac{1}{s - 1} + \gamma + o(1) , \qquad s \to 1 , \tag{93}
$$
of the 𝑠 → 1 behavior of the 𝜁-function, where 𝛾 is the constant from (14) and (22). Back
to the main text.
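The behavior (93) can be observed numerically. The sketch below is one possible way to do this in Python, under our own choices: it sums the first 𝑁 terms of the series for 𝜁(𝑠) and estimates the remaining tail by the first terms of the Euler-Maclaurin formula, a standard device that is not used elsewhere in these notes. The function name `zeta_minus_pole` and the cutoff 𝑁 = 1000 are illustrative.

```python
from math import expm1, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, for comparison


def zeta_minus_pole(s, N=1000):
    """Approximate zeta(s) - 1/(s - 1) for real s > 1 (Euler-Maclaurin sketch)."""
    partial = sum(n ** -s for n in range(1, N + 1))
    # N^(1-s)/(s-1) - 1/(s-1), written with expm1 to avoid cancellation near s = 1
    tail_minus_pole = expm1((1 - s) * log(N)) / (s - 1)
    return partial + tail_minus_pole - 0.5 * N ** -s + s * N ** (-s - 1) / 12


for s in (1.1, 1.01, 1.001):
    print(s, zeta_minus_pole(s), EULER_GAMMA)
```

As 𝑠 decreases to 1, the printed values approach Euler's constant, in agreement with (93).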
References
[1] R. C. Baker and A. J. Irving, Bounded intervals containing many primes, Math. Z. 286 (2017), no. 3-4, 821–841.
[2] E. Bombieri, On the large sieve, Mathematika 12 (1965), 201–225.
[3] Harold Davenport, Multiplicative number theory, 3rd ed., Graduate Texts in Mathematics, vol. 74, Springer-Verlag, New York, 2000. Revised and with a preface by Hugh L. Montgomery.
[4] John Friedlander and Henryk Iwaniec, Opera de cribro, American Mathematical Society Colloquium Publications, vol. 57, American Mathematical Society, Providence, RI, 2010.
[5] D. A. Goldston, J. Pintz, and C. Y. Yıldırım, Small gaps between primes, Proceedings of the International Congress of Mathematicians, Seoul 2014. Vol. II, Kyung Moon Sa, Seoul, 2014, pp. 419–441.
[6] Daniel A. Goldston, János Pintz, and Cem Y. Yıldırım, Primes in tuples. I, Ann. of Math. (2) 170 (2009), no. 2, 819–862.
[7] Andrew Granville, Unexpected irregularities in the distribution of prime numbers, Proceedings of the International Congress of Mathematicians, Vol. 1, 2 (Zürich, 1994), Birkhäuser, Basel, 1995, pp. 388–399.
[8] Andrew Granville, Primes in intervals of bounded length, Bull. Amer. Math. Soc. (N.S.) 52 (2015), no. 2, 171–222.
[9] Andrew Granville, Number theory revealed: a masterclass, American Mathematical Society, Providence, RI, [2019] ©2019.
[10] Andrew Granville and Jennifer Granville, Prime suspects, Princeton University Press, Princeton, NJ, 2019. The anatomy of integers and permutations; Illustrated by Robert J. Lewis.
[11] Kevin Hartnett, New Proof Settles How to Approximate Numbers Like Pi, Quanta Magazine (August 14, 2019). https://www.quantamagazine.org/new-proof-settles-how-to-approximate-numbers-like-pi-20190814/.
[12] Henryk Iwaniec and Emmanuel Kowalski, Analytic number theory, American Mathematical Society Colloquium Publications, vol. 53, American Mathematical Society, Providence, RI, 2004.
[13] Erica Klarreich, Unheralded Mathematician Bridges the Prime Gap, Quanta Magazine (May 19, 2013). www.quantamagazine.org/yitang-zhang-proves-landmark-theorem-in-distribution-of-prime-numbers-20130519.
[14] Erica Klarreich, Together and Alone, Closing the Prime Gap, Quanta Magazine (November 19, 2013). www.quantamagazine.org/mathematicians-team-up-on-twin-primes-conjecture-20131119/.
[15] Erica Klarreich, Prime Gap Grows After Decades-Long Lull, Quanta Magazine (December 10, 2014). www.quantamagazine.org/mathematicians-prove-conjecture-on-big-prime-number-gaps-20141210/.
[16] Dimitris Koukoulopoulos, The distribution of prime numbers, Graduate Studies in Mathematics, vol. 203, American Mathematical Society, Providence, RI, [2019] ©2019.
[17] Emmanuel Kowalski, Gaps between prime numbers and primes in arithmetic progressions [after Y. Zhang and J. Maynard], Astérisque 367-368 (2015), Exp. No. 1084, ix, 327–366.
[18] Michel Ledoux, The concentration of measure phenomenon, Mathematical Surveys and Monographs, vol. 89, American Mathematical Society, Providence, RI, 2001.
[19] Thomas Lin, After Prime Proof, an Unlikely Star Rises, Quanta Magazine (April 2, 2015). https://www.quantamagazine.org/yitang-zhang-and-the-mystery-of-numbers-20150402.
[20] James Maynard, Small gaps between primes, Ann. of Math. (2) 181 (2015), no. 1, 383–413.
[21] James Maynard, Digits of primes, European Congress of Mathematics, Eur. Math. Soc., Zürich, 2018, pp. 641–661.
[22] James Maynard, Gaps between primes, Proceedings of the International Congress of Mathematicians, Rio de Janeiro 2018. Vol. II. Invited lectures, World Sci. Publ., Hackensack, NJ, 2018, pp. 345–361.
[23] James Maynard, The twin prime conjecture, Jpn. J. Math. 14 (2019), no. 2, 175–206.
[24] D. H. J. Polymath, New equidistribution estimates of Zhang type, Algebra Number Theory 8 (2014), no. 9, 2067–2199.
[25] D. H. J. Polymath, Variants of the Selberg sieve, and bounded intervals containing many primes, Res. Math. Sci. 1 (2014), Art. 12, 83.
[26] K. Soundararajan, Small gaps between prime numbers: the work of Goldston-Pintz-Yıldırım, Bull. Amer. Math. Soc. (N.S.) 44 (2007), no. 1, 1–18.
[27] Gérald Tenenbaum and Michel Mendès France, The prime numbers and their distribution, Student Mathematical Library, vol. 6, American Mathematical Society, Providence, RI, 2000. Translated from the 1997 French original by Philip G. Spain.
[28] Gérald Tenenbaum, Introduction to analytic and probabilistic number theory, 3rd ed., Graduate Studies in Mathematics, vol. 163, American Mathematical Society, Providence, RI, 2015. Translated from the 2008 French edition by Patrick D. F. Ion.
[29] A. I. Vinogradov, The density hypothesis for Dirichlet 𝐿-series, Izv. Akad. Nauk SSSR Ser. Mat. 29 (1965), 903–934 (Russian).
[30] Yitang Zhang, Small gaps between primes and primes in arithmetic progressions to large moduli, Proceedings of the International Congress of Mathematicians, Seoul 2014. Vol. II, Kyung Moon Sa, Seoul, 2014, pp. 557–567.
[31] Yitang Zhang, Bounded gaps between primes, Ann. of Math. (2) 179 (2014), no. 3, 1121–1174.
Andrei Okounkov
Andrei Okounkov, Department of Mathematics, University of California, Berkeley, 970
Evans Hall Berkeley, CA 94720–3840, okounkov@math.columbia.edu