The Joy of Factoring PDF
Volume 68
© 2013 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
Contents
Preface
Exercise
Bibliography
Index
Preface
For the past thirty-five years, a very important reason for factor-
ing has been the public-key cipher of Rivest, Shamir, and Adleman
(RSA), whose security requires that the problem of factoring integers
is hard. Chapter 1 describes the development of the RSA cipher.
The mathematical details of it are presented in Chapter 4. Chapter
1 also discusses three older reasons for interest in factoring: repunits,
decimal fractions, and perfect numbers. Then it describes the Cunningham
Project, more than a century old and the greatest integer
factoring collaboration in history.
Chapter 2 reviews some elementary number theory found in a
first course in the subject. It considers divisibility, prime numbers,
congruences, Euler’s theorem, arithmetic functions, and Quadratic
Reciprocity. Few proofs are given here. It is assumed that the reader
has learned this material elsewhere. A few algorithms, such as the
Euclidean Algorithm for the greatest common divisor, are stated.
Chapter 3 deals with more advanced number theory, probably
not taught in a first course but needed to understand factoring al-
gorithms and applications of factoring. It discusses the frequency
of occurrence of integers whose greatest prime factor is small, how
to compute modular square roots, cyclotomic polynomials, primality
testing, and divisibility sequences such as the Fibonacci numbers and
the numbers $b^m - 1$, m = 1, 2, 3, ..., for fixed b. Of course, the
recognition of primes is essential to telling when a factorization is
complete, and this topic is treated extensively. Many algorithms are
stated in Chapter 3.
More applications of factoring are given in Chapter 4. It begins
with a set of algebraic factorizations of some numbers in a divisibility
sequence discovered by Aurifeuille and others. A more complete dis-
cussion of perfect numbers follows, with a sample of theorems in this
area. Next come harmonic numbers, prime proving aided by factor-
ing, and linear feedback shift registers, which are hardware devices
used to generate cryptographic keys and random numbers. Testing
conjectures is an important and common application of factoring. We
give three examples. Bernoulli numbers are connected to the struc-
ture of cyclotomic fields and to Fermat’s Last Theorem. While most
work in this area is beyond the scope of this book, we do give a taste
performing the sieve process. Others are computers with special archi-
tecture to facilitate factoring. We also discuss factoring with quantum
objects and with DNA molecules. The author is neither a physicist
nor a biologist, so he can give only a taste of these new factoring
methods. The reader who really wants to learn about these topics
should consult the references.
Chapter 10 discusses practical aspects of factoring and also some
purely theoretical results about the difficulty of factoring. We tell how
computers calculate with very large integers. Another section reveals
special methods for factoring integers quickly when they have special
form or when partial information is known about their factors. These
tricks include applications to breaking the RSA cipher. We describe
some of the ongoing factoring projects and how the reader can help
with them. The final section tosses out some new ideas for possible
future factoring methods.
Most of the algorithms in this work are written in pseudocode
and described in words. We have tried to make the pseudocode clear
enough so that programmers with limited knowledge of number the-
ory can write correct programs. We have also tried to make the
verbal descriptions of algorithms understandable to number theorists
unfamiliar with computer programming.
This book does not discuss factoring integers mentally, although
one reviewer suggested it as a topic. The only items at all related to
mental arithmetic are Example 2.29 and Exercises 0.1 and 2.13.
If you have taken a course in elementary number theory, are
computer-literate, don’t care about applications, and wish to learn
about factoring algorithms immediately, then you could begin reading
with Chapter 5. But then you would wonder why we keep factoring
the number 13290059 over and over again. This choice is explained
in Section 4.6.3.
The author thanks CERIAS, the Center for Education and Re-
search in Information Security and Assurance at Purdue University,
for its support.
The author is grateful to Richard Brent, Greg Childers, Graeme
Cohen, Carl Pomerance, and Richard Weaver, who answered ques-
tions about the material of this book. He is indebted to Robert
Sam Wagstaff
Exercise
Introduction
A modern reason for studying the problem of factoring integers is the
cryptanalysis of certain public-key ciphers, such as the RSA system.
We will explain public-key cryptography and its connection to factoring
in the next section. The rest of the chapter discusses some older
reasons for factoring integers. These include factoring repunits, describing
perfect numbers, and determining the length of repeating decimals. We
introduce the Cunningham Project, which has factored interesting
numbers for more than a century.
¹ The size of these primes would have to be doubled now due to the improvement
in the speed of factoring algorithms since Lenstra wrote these words in 1982.
1. Why Factor Integers?
is easy to compute f (x) for any x, but, given almost any value y in
the range, it is hard to find any x with f (x) = y.
Diffie and Hellman invented a method of key exchange based
on the one-way function $f(x) = b^x \bmod p$. Given large numbers b,
p, and y, with prime p, it is hard to find a “discrete logarithm” x
with f (x) = y. Their protocol allows two users who have never met
or exchanged a secret key to choose a common secret key, for the
DES, say, while communicating over a channel that may have an
eavesdropper listening.
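The exchange described above can be sketched in a few lines. This is a minimal illustration of the standard textbook Diffie-Hellman exchange, not code from the book. The prime p = 2^127 − 1, the base b = 3, and the names `alice_secret` and `bob_secret` are assumptions chosen for the example, and parameters this small are far too weak for real use.

```python
import secrets

# Assumed toy parameters (not from the book): a Mersenne prime and a small base.
p = 2**127 - 1
b = 3

def one_way(x):
    """The one-way function f(x) = b^x mod p from the text."""
    return pow(b, x, p)

# Each user picks a private exponent and publishes f(exponent).
alice_secret = secrets.randbelow(p - 2) + 1
bob_secret = secrets.randbelow(p - 2) + 1
alice_public = one_way(alice_secret)
bob_public = one_way(bob_secret)

# Each side raises the other's public value to its own secret exponent.
alice_key = pow(bob_public, alice_secret, p)
bob_key = pow(alice_public, bob_secret, p)
```

Both sides arrive at b^(alice_secret · bob_secret) mod p, while an eavesdropper sees only the two public values and would have to solve a discrete logarithm to recover either exponent.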
This key-exchange protocol provides secure communication be-
tween one user and whoever is at the other end of the connection.
But it is subject to the “man in the middle” attack in which someone
hijacks a computer between the two parties trying to communicate
and executes the protocol with each of them separately. After that,
the hijacker can read (and even modify) all encrypted correspondence
between the two parties as he or she decrypts a message from one and
re-enciphers it to pass on to the other.
The second innovation in the Diffie-Hellman paper [DH76] solved
this problem. Their idea was to split the key into public and private
parts! Bob would use a cipher with two independent keys. He would
make public his enciphering key and the enciphering algorithm. His
deciphering algorithm would be public, but only Bob would know his
deciphering key. It would be impossible to compute the deciphering
key from the enciphering key and other public data. Such a cipher is
called a public-key cipher.
To send Bob a secret message, Alice would find his public key
and use it to encipher the message. Once it was enciphered, only
Bob could decipher it. If Alice wanted to communicate with Bob
by the DES, she could choose a random DES key, encipher the key
with Bob’s public key, and email it to Bob. When Bob received this
message, he would decipher it and use the DES key to communicate
with Alice. No eavesdropper would be able to learn the DES key.
A public-key cipher lacks authenticity. Anyone could write a
letter to Bob, sign it “Alice,” encipher it using Bob’s public key, and
send it to Bob. Bob would not know whether it came from Alice.
Thus, Eve could send a random DES key to Bob, telling him that
it came from Alice. Then she could communicate with Bob secretly,
pretending to be Alice.
The third innovation of Diffie and Hellman in [DH76] was the
notion of a digital signature. Alice could construct her own public and
private keys, just as Bob did above, and “sign” a message by applying
the deciphering algorithm with her private key to the plaintext mes-
sage. Then she could encipher this signed message with Bob’s public
key and send it to him. Bob would be certain that the message came
from Alice when he deciphered it using his private key, enciphered the
result with Alice’s public key, and obtained meaningful text. Only
Alice could have constructed a message with this property because
only Alice knows her private key. We assume there is no “man in
the middle” attack and that Bob obtained Alice’s public key from a
secure site.
Diffie and Hellman did not propose enciphering and deciphering
algorithms for public-key cryptography in their paper. Ron Rivest,
Adi Shamir, and Leonard Adleman (RSA) read their paper and tried
many algorithms for doing this. Some of these schemes involved the
difficulty of factoring large integers. On April 3, 1977, Rivest dis-
covered the system now called RSA. He gave a simple formula for
enciphering a message. The public key was the product of two large
primes. He also gave a formula for deciphering the ciphertext that
used the two large primes. He created a trap-door one-way function,
that is, a function f that cannot be inverted unless one knows a se-
cret, in which case it is easy to find x given f (x). We will describe the
method in Section 4.8. The RSA paper was published as [RSA78].
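The book defers the actual formulas to Section 4.8, which is not part of this excerpt. As a hedged illustration only, here is the standard textbook form of RSA with toy primes. The values p = 61, q = 53, e = 17 and all function names are assumptions made for this sketch, and real keys use primes hundreds of digits long.

```python
import math

# Toy secret primes (assumed for illustration; trivially factorable).
p, q = 61, 53
n = p * q                      # public modulus: the product of two primes
phi = (p - 1) * (q - 1)        # computable only if the factors are known
e = 17                         # public enciphering exponent (assumed)
assert math.gcd(e, phi) == 1
d = pow(e, -1, phi)            # private deciphering exponent (Python 3.8+)

def encipher(m):
    return pow(m, e, n)

def decipher(c):
    return pow(c, d, n)

def sign(m):
    # Signing applies the deciphering operation with the private key.
    return pow(m, d, n)

def verify(m, signature):
    # Anyone can check a signature with the public key.
    return pow(signature, e, n) == m

message = 65
```

Computing d requires phi, and phi requires the factors of n, which is why factoring the public key breaks the cipher.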
The secret that unlocks the RSA cipher is the pair of prime factors
of the public key. An obvious attack on this cipher is to factor the
public key. The publication of the RSA cipher sparked tremendous
interest in the problem of factoring large integers, which earlier had
been studied by few people.
Martin Gardner³ wrote a column in the August 1977 Scientific
American describing the work of Diffie, Hellman, Rivest, Shamir, and
Adleman. In it, RSA offered a challenge message encoded with a
³ The title of Gardner’s article was “A new kind of cipher that would take millions
of years to break.” It appeared on pages 120–124.
1.2. Repunits
When a child is learning to write, the first numbers he or she writes
often are the repunits 1, 11, 111, 1111, . . .. These same decimal
numbers continue to fascinate young people and adults as they learn
more about arithmetic. Which of these numbers are prime? Which
are composite? What are the factors of the composite ones? As a
child, I found joy in multiplying 3 times 37 and getting a product 111
with all ones. After one discovers that 1111 = 11 · 101, one might ask
which repunits divide which other repunits.
Of course, the decimal number $R_n$ with n ones (and no other
digits) is just $(10^n - 1)/9$. One can prove that $R_m$ divides $R_n$ if and
only if m divides n. Thus, 11 divides not only $R_4 = 1111$ but also
every other repunit with an even number of ones. Likewise, 111, and
therefore also 3 and 37, divides $R_{3n}$ for every n.
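The divisibility rule is easy to check numerically for small indices. The following sketch (the function names are ours, not the book's) builds repunits from the formula above and compares both sides of the rule.

```python
def repunit(n):
    """Decimal repunit R_n = (10^n - 1) / 9, a string of n ones."""
    return (10**n - 1) // 9

def rule_holds(limit):
    """Check that R_m divides R_n exactly when m divides n, for 1 <= m, n <= limit."""
    return all(
        (repunit(n) % repunit(m) == 0) == (n % m == 0)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
    )
```

A finite check like `rule_holds(30)` is evidence, not a proof, but it is a quick way to catch a misremembered statement of the rule.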
Another consequence of the divisibility rule just mentioned is that
$R_n$ cannot be prime unless n is prime. Therefore, $R_6 = 111111$ and
$R_9 = 111111111$ cannot be prime, but $R_2 = 11$, $R_3$, $R_5$, $R_7$, and $R_{11}$
might be prime. Of these five integers, only $R_2 = 11$ is prime. We
⁴ GCHQ means “Government Communications Headquarters” in England.
 n   R_n          R_n factored
 2   11           11
 3   111          3 · 37
 4   1111         11 · 101
 5   11111        41 · 271
 6   111111       3 · 7 · 11 · 13 · 37
 7   1111111      239 · 4649
 8   11111111     11 · 73 · 101 · 137
 9   111111111    3 · 3 · 37 · 333667
10   1111111111   11 · 41 · 271 · 9091
The next four prime repunits after $R_2 = 11$ are $R_{19}$, $R_{23}$, $R_{317}$, and
$R_{1031}$. How does one prove that such large numbers are prime? We
will explain in Example 4.22 using theorems from Section 3.7.
1.3. Repeating Decimal Fractions
primes p for which the length of the period of the decimal fraction
for 1/p is n. In 1801, Gauss determined the period length of the
decimal fraction for every rational number a/b in terms of the factors
of numbers $10^n - 1$. See Articles 308–318 of [Gau01].
Here is a rough summary of what Gauss proved. Let p be a
prime number other than 2 or 5. Let m be a positive integer. If p
does not divide the integer r, then the length of the period of the
decimal fraction for $r/p^m$ is the smallest positive integer e for which
$p^m$ divides $10^e - 1$. This means that e is a divisor of $p^{m-1}(p - 1)$.
Call this period length $\ell(p^m)$. If $M = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$, where the $p_i$
are distinct primes not equal to 2 or 5, then the length of the period
of the decimal fraction for r/M is the least common multiple $\ell(M)$
of the numbers $\ell(p_1^{m_1})$, $\ell(p_2^{m_2})$, ..., $\ell(p_k^{m_k})$. In all of these cases the
period begins with the first digit after the decimal point. In case
$N = 2^a 5^b M$, with M not divisible by 2 or 5, the decimal fraction
for r/N becomes periodic after the first c digits following the decimal
point, where c is the larger of a and b, and the length of the period
is $\ell(M)$.
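Gauss's rule translates directly into a short computation. The sketch below is our own illustration with hypothetical function names; it assumes the numerator r is relatively prime to N, and it finds the period by brute-force search for the least e with M dividing 10^e − 1, which is only practical for small M.

```python
def multiplicative_order(base, modulus):
    """Least e > 0 with base^e ≡ 1 (mod modulus); assumes modulus > 1 and gcd = 1."""
    e, power = 1, base % modulus
    while power != 1:
        power = power * base % modulus
        e += 1
    return e

def decimal_period(N):
    """Return (c, period) for fractions r/N with gcd(r, N) = 1.

    c is the number of digits before the period starts; period is 0 when
    the decimal terminates (N has no prime factor other than 2 and 5).
    """
    a = b = 0
    M = N
    while M % 2 == 0:
        M //= 2
        a += 1
    while M % 5 == 0:
        M //= 5
        b += 1
    if M == 1:
        return max(a, b), 0
    return max(a, b), multiplicative_order(10, M)
```

For example, `decimal_period(12)` returns `(2, 1)`, matching 1/12 = 0.08333..., where two digits precede a period of length 1.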
Example 1.1. The decimal fraction for 1/11 is 0.09 09 ..., with period
length 2, and $10^2 - 1 = 99$ is the least number $10^e - 1$ divisible
by 11. The decimal fraction for $1/11^2$ is
0.0082644628099173553719 0082644628099173553719 . . . ,
There is nothing special about base 10, except that humans have
ten fingers. If you write your fractions in base b rather than base 10,
the Cunningham Project helps you find all the primes whose recipro-
cals written in base b (for 2 ≤ b ≤ 12) have a given period length.
Example 1.2. In base 3, the reciprocal of the prime 11 (in base 10)
is 0.00211 00211 ..., with period length 5 because $3^5 - 1$ is the first
multiple of 11 having the form $3^n - 1$.
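Expansions like the one in Example 1.2 can be reproduced by ordinary long division carried out in base b. This is a small sketch of ours, with a hypothetical function name.

```python
def reciprocal_digits(p, base, count):
    """First `count` digits after the point of 1/p written in the given base."""
    digits = []
    remainder = 1
    for _ in range(count):
        remainder *= base            # shift one place to the left
        digits.append(remainder // p)
        remainder %= p
    return digits
```

Calling `reciprocal_digits(11, 3, 10)` yields the block 0, 0, 2, 1, 1 twice, and `reciprocal_digits(11, 10, 4)` yields 0, 9, 0, 9 as in Example 1.1.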
n ≤ 257 are prime. For most of the past few hundred years, the
largest known prime number has been a Mersenne prime. It is likely
that there are infinitely many primes $2^n - 1$, although this conjecture
has never been proved.
Although perfect numbers have been studied for 2,500 years and
we know the shape of all even perfect numbers, we still don’t know
whether there are any odd perfect numbers. The best we have been
able to do is prove theorems that restrict the size or form of any
hypothetical odd perfect number. Ochem and Rao [OR12] proved
that every odd perfect number is $> 10^{1500}$ and has at least 101 prime
factors, counted with multiplicity. Nielsen [Nie07] showed that an
odd perfect number must have at least nine distinct prime factors.
The proofs of these theorems are done by computer programs because
they have thousands of cases. They all require explicit knowledge of
the prime factors of many numbers $b^n \pm 1$. The Cunningham Project
helps the investigation of odd perfect numbers. We will give examples
in Section 4.2.
⁵ This college later became the Indian Institute of Technology at Roorkee.
this short sample, avoiding repeating prime factors this way produces
much shorter tables.
Another benefit of the parentheses is that they make it easier to
see the multiplicative structure of these numbers. For example, at
least in Table 1, $10^1 + 1 = 11$ divides $10^n + 1$ for every odd n, and
$10^2 + 1 = 101$ divides every fourth number: $10^6 + 1$, $10^{10} + 1$. Table
1 in Section 3.4 has more parentheses. We will say more about this
multiplicative structure in Section 3.4.
The tables in [CW25] omitted the parentheses and their con-
tents, listing only the “M.A.P.F.” (Maximal Algebraic Primitive Fac-
tors) of each number. These tables also credited discoverers of certain
difficult factorizations and left blank spaces to encourage others to
continue the work and enter the factors they find.
For some purposes, such as studying odd perfect numbers, the
table on the left is more useful. But for other purposes, such as
understanding the multiplicative structure of the numbers or finding
period lengths of decimal fractions, the table on the right is more
useful. To use Table 1 to find period lengths, note that a theorem
(Theorem 3.23) says that the primitive prime factors of $10^n + 1$ are
the same as the primitive prime factors of $10^{2n} - 1$. Thus, Table 1
tells us that the decimal fraction for 1/73 has a period of length 8 (in
fact, 1/73 = 0.01369863 01369863 ...) because 73 is a primitive prime
factor of $10^8 - 1$.
After Cunningham died in 1928, the Cunningham Project was
continued by Dick Lehmer and others. A revised and enlarged version
of [CW25] was published as [BLS+02] in 1983 and 1988. The third
edition of this book was published in 2002 as an electronic book by
the American Mathematical Society. The author maintains the latest
versions of the tables at
http://homes.cerias.purdue.edu/~ssw/cun/index.html.
If you examine the tables in [BLS+02], you will notice that there
are four tables with b = 2 but only two tables for b > 2. The numbers
in parentheses in any table refer to earlier lines in that table. For each
base b there is just one table for factors of $b^n - 1$, and it lists only odd
n because of the identity $b^{2n} - 1 = (b^n - 1)(b^n + 1)$. If you want the
factors of $3^{26} - 1$, you must look for $3^{13} - 1$ in the table for factors of
Exercises
Introduction
This chapter recounts basic facts from number theory that you will
need to understand the rest of the book. Most of the proofs are
omitted. If you haven’t had a first course in number theory, you
should read one of the many books with a title Introduction to Number
Theory, where you will find the proofs. For example, see Hardy and
Wright [HW79] or Niven, Zuckerman, and Montgomery [NZM91].
References are given for a few selected proofs.
We will use this notation throughout the book. If x is a real
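number, then $\lfloor x \rfloor$ means the largest integer ≤ x and $\lceil x \rceil$ means the
smallest integer ≥ x. Thus, $\lfloor 3.14 \rfloor = 3$, $\lceil 3.14 \rceil = 4$, $\lfloor -3.14 \rfloor = -4$,
and $\lceil -3.14 \rceil = -3$. If x is an integer, then $\lfloor x \rfloor = x = \lceil x \rceil$.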
Here we explain some terms used to describe the speed of al-
gorithms. If f (n) and g(n) are functions defined for positive in-
tegers n, we say “f (n) is big O of g(n),” written f (n) is O(g(n))
or f (n) = O(g(n)), to mean that there is a constant c > 0 so
that |f (n)| ≤ cg(n) for all sufficiently large n. If the functions
f (n) and g(n) never vanish, then we define f (n) ∼ g(n) to mean
$\lim_{n \to \infty} f(n)/g(n) = 1$.
Let $\ell$ be the length of the input to an algorithm. For example, if
the algorithm factors integers N, then $\ell$ is the number of digits (or
bits) in the input N, so $\ell = O(\log N)$.
2. Number Theory Review
problems are all equally hard and are provably at least as hard as any
problem in NP.
Algorithms are classified by whether they use random numbers.
A deterministic algorithm chooses no random numbers. If you run it
twice with the same input, it will perform exactly the same steps in
the same order each time. A probabilistic algorithm chooses random
numbers during its operation. These random values determine the
steps it performs and may affect the running time. The expected
running time of a probabilistic algorithm is the average taken over
all possible random choices it could make. If you run a probabilistic
algorithm twice with the same input, it will probably perform different
steps each time, it may produce different output, and it may take
different running times. The Elliptic Curve Method is an example of a
probabilistic integer factoring algorithm. Given an input number N to
factor, it chooses some random numbers (the coefficients of an elliptic
curve and the coordinates of a point on it). It performs a certain
calculation using the random numbers and N , and it may or may not
find a proper factor of N . An important statistic of a probabilistic
integer factoring algorithm is the probability that it will succeed.
A Monte Carlo probabilistic polynomial time algorithm always runs
in polynomial time but may give the wrong answer (or no answer)
for some of its random number choices. A Las Vegas probabilistic
polynomial time algorithm always gives the correct answer, runs in
polynomial time on average, but may run in exponential time for
some of its random number choices.
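To make the behavior of a probabilistic factoring algorithm concrete, here is a minimal sketch. It is not the Elliptic Curve Method named above. Pollard's rho method is swapped in because it fits in a few lines. The function name `pollard_rho` and the retry policy are assumptions made for this illustration. Each attempt draws fresh random values, so two runs on the same N may perform different steps, and an attempt that fails is simply retried.

```python
import math
import random

def pollard_rho(n):
    """Return a proper factor of a composite n > 3 (loops forever if n is prime)."""
    if n % 2 == 0:
        return 2
    while True:
        c = random.randrange(1, n)       # random polynomial x^2 + c
        x = y = random.randrange(0, n)   # random starting point
        d = 1
        while d == 1:
            x = (x * x + c) % n          # tortoise: one step
            y = (y * y + c) % n          # hare: two steps
            y = (y * y + c) % n
            d = math.gcd(abs(x - y), n)
        if d != n:                       # d == n means this attempt failed
            return d
```

Run on the book's favorite example, `pollard_rho(13290059)` returns one of its two prime factors; which one, and after how many steps, depends on the random choices.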
Many algorithms in this book are presented in pseudocode, which
we now explain. Variable names, like m and P next, are computer
memory locations where numbers are stored. A name with brackets
is a component of an array (vector). Thus, L[i] is the i-th element
of the array L. An assignment statement like a ← f (a, b, c) means
evaluate the function f (a, b, c) using the numbers currently stored in
the locations a, b, c and then store the result in location a, replacing
the old value that was stored there.
An if statement has a condition enclosed in parentheses and one
or more instructions enclosed in braces. These instructions are per-
formed if the condition is true and not performed if it is false. For
example,
if (m mod p = 0) { m ← m/p }
has the condition (m mod p = 0), which means “p divides m.” If this
is true, the instruction performed is to divide out the factor p from
m, leaving the quotient in m. An if statement may be followed by an
else and then one or more additional instructions in braces. These
instructions are performed when the condition is false. For example,
for (i ← 1 to n) { L[i] ← 1 }
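For readers who prefer a real language, the two pseudocode statements shown in this section might be rendered in Python as follows. The starting values of m, p, and n are assumptions chosen only so that the fragment runs, and index 0 of the list is left unused to keep the book's 1-based indexing.

```python
m, p = 84, 2

# if (m mod p = 0) { m ← m/p }
if m % p == 0:
    m = m // p          # integer division: divide out the factor p

# for (i ← 1 to n) { L[i] ← 1 }
n = 5
L = [0] * (n + 1)       # L[0] is unused padding
for i in range(1, n + 1):
    L[i] = 1
```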
2.1. Divisibility
When m and n are whole numbers and m ≠ 0, we say m divides n
and write m | n if n/m is a whole number. Integer is another word
for whole number. Here are two basic facts about divisibility: If k | m
and m | n, then k | n. If k | m and k | n, then k | (mx + ny) for any
integers x and y. We write m ∤ n if m does not divide n.
We write “n mod m” to mean the remainder when the integer n is
divided by the positive integer m. We always have 0 ≤ n mod m < m.
For example, 15 mod 4 = 3, 100 mod 7 = 2, 30 mod 5 = 0, and
(−17) mod 6 = 1. In computer languages like C and Java, “n mod m”
is written “n%m,” at least when n and m are positive integers.
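The caveat about positive operands matters in practice. In Python, `%` with a positive modulus already returns a value in 0 ≤ r < m, so it matches the definition above even for negative n. C and Java truncate the quotient toward zero and can return a negative remainder. The sketch below checks the four examples from the text and uses `math.fmod` to show the C-style result.

```python
import math

# The four examples from the text, checked with Python's % operator.
examples = {(15, 4): 3, (100, 7): 2, (30, 5): 0, (-17, 6): 1}
all_match = all(n % m == r for (n, m), r in examples.items())

# math.fmod follows the C convention: the result takes the sign of n.
c_style_remainder = int(math.fmod(-17, 6))

# Adding m and reducing again converts a C-style remainder to the book's.
normalized = (c_style_remainder + 6) % 6
```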
If at least one of the integers m, n is nonzero, define the greatest
common divisor of m and n, written gcd(m, n), to be the largest
integer that divides both m and n. It is clear that gcd(m, n) ≥ 1 and
that gcd(m, n) = gcd(n, m). Integers m, n are said to be relatively
prime if gcd(m, n) = 1. The least common multiple of two or more
integers is the smallest integer divisible by all of them. It is less than
or equal to their product. For example, the least common multiple of
4, 6, and 8 is 24.
75 = 21 · 3 + 12
21 = 12 · 1 + 9
12 = 9 · 1 + 3
9 = 3 · 3 + 0.
gcd(75, 21) = 3 = 12 − 9 · 1
= 12 − (21 − 12 · 1) · 1
= 12 · 2 − 21 · 1
= (75 − 21 · 3) · 2 − 21 · 1
= 75 · 2 − 21 · 7,
so x = 2 and y = −7.
The numbers x and y are not unique, but the algorithm returns
x and y with smallest absolute values. The algorithm works because
every triple (a, b, c) satisfies a = bm + cn throughout the algorithm.
If you ignore the second and third components of the triples, this al-
gorithm is exactly the (ordinary) Euclidean Algorithm above. There-
fore, its time complexity is given by Lamé’s Theorem 2.3. Since the
final value of the first component of the triple u is gcd(m, n), the
algorithm is correct.
Example 2.8. We repeat Example 2.6 via the Extended Euclidean
Algorithm. This table shows the value of the two triples u and v and
the integer q each time q is computed and also at the end:
u0   u1   u2   v0   v1   v2   q
75    1    0   21    0    1   3
21    0    1   12    1   −3   1
12    1   −3    9   −1    4   1
 9   −1    4    3    2   −7   3
 3    2   −7    0   −7   25
The last line shows that gcd(75, 21) = 3 = 75(2) + 21(−7).
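The statement of the Extended Euclidean Algorithm is not part of this excerpt, so the following is a sketch reconstructed from the description and the table above: two triples u and v, each satisfying a = bm + cn, updated with the quotient q. The function name is ours.

```python
def extended_euclid(m, n):
    """Return (g, x, y) with g = gcd(m, n) = m*x + n*y."""
    u = (m, 1, 0)            # m = 1*m + 0*n
    v = (n, 0, 1)            # n = 0*m + 1*n
    while v[0] != 0:
        q = u[0] // v[0]
        # w = u - q*v keeps the invariant a = b*m + c*n in every triple.
        w = tuple(ui - q * vi for ui, vi in zip(u, v))
        u, v = v, w
    return u
```

Tracing `extended_euclid(75, 21)` by hand reproduces the rows of the table and ends with u = (3, 2, −7).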
¹ Theorem 2.12 is Proposition 20 of Book IX of Euclid’s Elements. His proof is
essentially the one given here.
 n   $p_n$            $1 + \prod_{i=1}^{n} p_i$
 1   2                3
 2   3                7
 3   7                43
 4   43               1807 = 13 · 139
 5   13               23479 = 53 · 443
 6   53               1244335 = 5 · 248867
 7   5                6221671 (prime)
 8   6221671          38709183810571 (prime)
 9   38709183810571   1498400911280533294827535471 = 139 · 25621 · 420743244646304724409
10   139              208277726667994127981027430331 = 2801 · 2897 · 489241 · 119812279 · 437881957
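The text of the example that introduces this table is missing from the excerpt. Judging from the rows, each new $p_n$ appears to be the smallest prime factor of one plus the product of the earlier primes; the sketch below makes that assumption explicit. It uses plain trial division, which reaches the eighth term quickly but is far too slow for the later rows.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of n >= 2 by trial division."""
    if n % 2 == 0:
        return 2
    p = 3
    while p * p <= n:
        if n % p == 0:
            return p
        p += 2
    return n                      # no divisor up to sqrt(n): n is prime

def prime_sequence(count):
    """Assumed rule: p_n is the smallest prime factor of 1 + p_1 * ... * p_(n-1)."""
    primes, product = [], 1
    for _ in range(count):
        p = smallest_prime_factor(product + 1)
        primes.append(p)
        product *= p
    return primes
```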
Theorem 2.14. If n is composite, then n has a prime factor $p \le \sqrt{n}$.
2.3. Congruences
If m is a positive integer and a and b are integers, we say a is congruent
to b modulo m and write a ≡ b (mod m) if m divides a − b. If m ∤ (a − b),
we write a ≢ b (mod m). The integer m is called the modulus (plural
moduli). When a ≡ b (mod m), each of a, b is called a residue of the
other (modulo m). Congruence modulo m is an equivalence relation,
meaning that if a, b, and c are integers, then
(1) a ≡ a (mod m),
(2) if a ≡ b (mod m), then b ≡ a (mod m), and
(3) if a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m).
Do not confuse the “mod” in the relation a ≡ b (mod m) with
the arithmetic operation “mod” (remainder) in a mod m. We have
a ≡ b (mod m) if and only if (a mod m) = (b mod m).
The congruence a ≡ b (mod m) is equivalent to saying that there
exists an integer k so that a = b + km.
Note that part (4) of Theorem 2.16 says that if a ≡ b (mod m),
then $a^n \equiv b^n \pmod{m}$. However, if a ≡ b (mod m), then usually
$n^a \not\equiv n^b \pmod{m}$.
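A tiny numerical check shows the asymmetry. The values m = 5, a = 1, b = 6, n = 2 are our own choice for illustration.

```python
m, a, b, n = 5, 1, 6, 2
congruent = (a % m) == (b % m)                        # 1 ≡ 6 (mod 5)

# Congruent bases give congruent powers: 1^2 ≡ 6^2 (mod 5).
same_base_powers = pow(a, n, m) == pow(b, n, m)

# Congruent exponents need not: 2^1 = 2 but 2^6 = 64 ≡ 4 (mod 5).
same_exponent_powers = pow(n, a, m) == pow(n, b, m)
```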
Theorem 2.17. If m > 1 and gcd(a, m) = 1, then there is a unique
integer x in 0 < x < m for which ax ≡ 1 (mod m).
The two methods are about equally fast, but the Extended Euclidean
Algorithm works even when the modulus is composite. This calcula-
tion is often done when p is a prime number with hundreds of decimal
digits. The power of n is computed efficiently by the Fast Exponen-
tiation Algorithm.
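The sentences that introduce the two methods are missing from this excerpt. The sketch below assumes they are the usual pair: the power $n^{p-2} \bmod p$ (by Fermat's theorem, valid only for prime p) and the Extended Euclidean Algorithm (valid for any modulus relatively prime to the number). Function names are ours.

```python
def inverse_by_power(n, p):
    """Inverse of n modulo a prime p not dividing n, as n^(p-2) mod p."""
    return pow(n, p - 2, p)

def inverse_by_euclid(a, m):
    """Inverse of a modulo m > 1 via the Extended Euclidean Algorithm."""
    # Each pair (value, c) satisfies value ≡ c * a (mod m).
    u, v = (a % m, 1), (m, 0)
    while v[0] != 0:
        q = u[0] // v[0]
        u, v = v, (u[0] - q * v[0], u[1] - q * v[1])
    if u[0] != 1:
        raise ValueError("a has no inverse modulo m")
    return u[1] % m
```

Only the second function works for a composite modulus: `inverse_by_euclid(3, 10)` returns 7, whereas `inverse_by_power(3, 10)` would silently give a wrong answer.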
To compute $n^e \pmod{m}$, write the exponent e as a binary number
$e = \sum_i b_i 2^i$, where $b_i \in \{0, 1\}$. Then
$$n^e = n^{\sum_i b_i 2^i} = \prod_i \left(n^{2^i}\right)^{b_i}.$$
The variable z in the algorithm holds successively the powers $n^{2^i}$,
for i = 0, 1, 2, .... When the exponent e is repeatedly divided by
2, discarding any fractional part, the parity of the quotients is the
bits $b_i$, i = 0, 1, 2, .... The variable y, initially 1, remembers the
product of some of the powers $n^{2^i}$, held in z. The condition in the
if statement in the algorithm is true if $b_i = 1$. When this happens,
$z = n^{2^i}$ is multiplied into the running product y. At the end, y holds
$n^e$. To get $n^e \bmod m$, each product (yz or $z^2$) is reduced modulo m
as soon as it is formed, to keep the numbers small.
Algorithm 2.25. Fast Exponentiation Algorithm.
Input: Integers m > 0, n ≥ 0, e ≥ 0.
    y ← 1
    z ← n
    while (e > 0) {
        if (e is odd) y ← (y · z) mod m
        z ← (z · z) mod m
        e ← ⌊e/2⌋
    }
Output: $n^e \bmod m$ = the final value of y.
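A direct Python transcription of the pseudocode follows; only the function name is ours. Python's built-in three-argument `pow` computes the same value and is a convenient cross-check.

```python
def fast_exponentiation(n, e, m):
    """Return n^e mod m by repeated squaring (Algorithm 2.25)."""
    y = 1
    z = n
    while e > 0:
        if e % 2 == 1:          # current low bit b_i is 1
            y = (y * z) % m     # multiply n^(2^i) into the running product
        z = (z * z) % m         # z becomes n^(2^(i+1))
        e //= 2                 # drop the low bit
    return y
```

The trace table that follows appears to use n = 2 and e = 162166 with modulus 162167; the modulus is an inference from the z column, since $65536^2 \bmod 162167 = 136468$.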
 i   y        z        e
 0   1        2        162166
 1   1        4        81083
 2   4        16       40541
 3   64       256      20270
 4   64       65536    10135
 5   140129   136468   5067
 6   67398    94577    2533
 7   2377     1543     1266
 8   2377     110511   633
 9   136274   46518    316
10   136274   130043   158
11   136274   82755    79
12   99523    77615    39
13   139101   70676    19
14   52235    29042    9
15   98752    7197     4
16   98752    65536    2
17   98752    136468   1
18   85902    94577    0
2.4. Fermat and Euler
Definition 2.30. For a positive integer m define the Euler phi func-
tion φ(m) to be the number of n in 1 ≤ n ≤ m with gcd(m, n) = 1.
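Definition 2.30 can be turned into code word for word. This counting version is a sketch for small m only, since it performs m gcd computations.

```python
import math

def euler_phi(m):
    """Number of n in 1 <= n <= m with gcd(m, n) = 1 (Definition 2.30)."""
    return sum(1 for n in range(1, m + 1) if math.gcd(m, n) == 1)
```

For instance, `euler_phi(12)` is 4, counting 1, 5, 7, and 11.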
We define the Jacobi symbol, mentioned later in the book but not
really used. If Q is an odd positive integer, so that $Q = q_1 q_2 \cdots q_t$,
where the $q_i$ are odd primes, and P is an integer, then the Jacobi
symbol $(P/Q) = \prod_{i=1}^{t} (P/q_i)$, where the $(P/q_i)$ are Legendre symbols.
If gcd(P, Q) > 1, then (P/Q) = 0, but if gcd(P, Q) = 1, then (P/Q) =
±1. The Jacobi symbol satisfies the properties listed in Theorem
2.59, except for Euler’s Criterion (4). These properties allow one to
compute in polynomial time the value of (P/Q) even when the prime
factors of Q (and P ) are unknown. They also facilitate evaluation
of Legendre symbols. If P is a quadratic residue modulo Q, then
(P/Q) = +1. But (P/Q) = +1 does not imply that P is a quadratic
residue modulo Q when Q is composite.
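Theorem 2.59 is not reproduced in this excerpt, so the following is a sketch of the standard reciprocity-based evaluation rather than the book's own procedure. It never factors Q; it only strips powers of 2 and swaps the arguments, much like the Euclidean Algorithm. The function name is ours.

```python
def jacobi(P, Q):
    """Jacobi symbol (P/Q) for odd positive Q, computed without factoring Q."""
    if Q <= 0 or Q % 2 == 0:
        raise ValueError("Q must be an odd positive integer")
    a, n, result = P % Q, Q, 1
    while a != 0:
        while a % 2 == 0:              # pull out factors of 2: uses (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # quadratic reciprocity swap
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0     # n > 1 here means gcd(P, Q) > 1
```

The last remark of the paragraph can be seen with Q = 15: `jacobi(2, 15)` is +1, yet 2 is not a square modulo 15 because it is not a square modulo 3.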
Exercises
2.1. Find the greatest common divisor g of 321 and 381. Find
integers x and y so that 321x + 381y = g.
2.2. Extend the table of Example 2.13 as far as you can with your
factoring resources.
2.3. Estimate the number of random 300-digit numbers you would
have to test for primality to find one prime.
2.4. Solve 31x ≡ 17 (mod 109).
2.5. Find x so that x ≡ 5 (mod 9) and x ≡ 8 (mod 11).
2.6. Find the two low-order digits of 83765 .
2.7. Compute φ(m), d(m), σ(m), μ(m), and ω(m) for each integer
m in 30 ≤ m ≤ 40.
2.8. Find all 0 ≤ r < 8 for which $x^2 \equiv r \pmod{8}$ has a solution.
Repeat with 8 replaced by 16.
2.9. Is the Carmichael λ function multiplicative?
2.10. Show that the least exponent t for which $5^t \equiv 1 \pmod{2^e}$ is
$t = 2^{e-2} = \lambda(2^e)$ when e > 2.
2.11. Use Theorem 2.59 to evaluate these Legendre symbols: (−1/23),
(2/31), (7/37), (69/103), and (35/67). Try to answer without
using a computer.
2.12. Which primes p have 11 as a quadratic residue?
2.13. Let $m = \sum_{i=0}^{k} d_i 10^i$ be the decimal number $(d_k d_{k-1} \ldots d_1 d_0)_{10}$.
(a) Prove that 3 divides m if and only if 3 divides the sum
$d_0 + d_1 + \cdots + d_{k-1} + d_k$ of the digits of m.
(b) Prove that 11 divides m if and only if 11 divides the
alternating sum $d_0 - d_1 + \cdots \pm d_{k-1} \mp d_k$.
(c) Prove that 37 divides m if and only if 37 divides
$d_0 + 10d_1 - 11d_2 + d_3 + 10d_4 - 11d_5 + d_6 + 10d_7 - 11d_8 + \cdots$.
Chapter 3
Number Theory
Relevant to Factoring
Introduction
This chapter presents more advanced topics from number theory that
you will need to understand the rest of the book. This material does
not appear in most introductory number theory books. References
or proofs are given for these theorems. The fastest known factoring
algorithms work by factoring millions of small auxiliary numbers us-
ing simple techniques like Trial Division and sieves. They succeed
in factoring an auxiliary number when it is smooth. The main re-
sult stated in the first section is that a positive fraction of auxiliary
numbers is smooth. This fraction determines the running time of the
integer factoring algorithm. The next section deals with solving the
congruence $x^2 \equiv r \pmod{m}$. Several fast factoring algorithms do
this as the penultimate step, the last step being finding a greatest
common divisor. (For one such algorithm see Example 6.11.) Square
roots of quadratic residues are also needed for certain cryptographic
algorithms discussed in Section 4.8. The next four sections deal with
the algebraic factorizations of the numbers $b^n \pm 1$ and the Fibonacci
and Lucas numbers, some of the most interesting numbers to factor.
Simple algebra factors some of these numbers for free. One should
take advantage of this free factoring before one begins to use the more
to p. This makes p take on the values 2, 3, 5, 7, 11, 13, 17, 19, 23, 25,
29, 31, 35, . . .. In order to make the algorithm correct, the variable
p must not skip any prime. The reason why no primes are missed
when one adds 2 and 4 alternately is that, after the primes 2 and 3,
all larger primes are either 1 more or 1 less than a multiple of 6, and
the alternation of adding 2 and 4 makes the variable p run through
exactly these numbers. We will say more about this algorithm in
Section 5.1.
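The book's own Trial Division algorithm is not in this excerpt, so here is a hedged sketch of the idea just described: try 2 and 3, then run p through 5, 7, 11, 13, 17, 19, 23, 25, ... by adding 2 and 4 alternately. The function name and the stopping rule p·p ≤ n are our choices.

```python
def trial_division(n):
    """Return the prime factors of n >= 2 in nondecreasing order."""
    factors = []
    for p in (2, 3):
        while n % p == 0:
            factors.append(p)
            n //= p
    p, step = 5, 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += step
        step = 6 - step          # alternate between adding 2 and adding 4
    if n > 1:
        factors.append(n)        # the remaining cofactor is prime
    return factors
```

Composite trial divisors such as 25 and 35 are harmless: their prime factors have already been divided out, so they never divide the remaining n.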
Trial Division is most successful when factoring a “smooth” num-
ber, that is, one with only small prime factors.
$$\lim_{x \to \infty} \frac{\psi(x, x^{1/u})}{x} = \rho(u).$$
A useful approximation is $\rho(u) \approx u^{-u}$, that is,
Example 3.7. Use the Tonelli algorithm to find the square roots of
50 modulo 73.
Trying small (nonsquare) positive integers 2, 3, 5, 6, ..., we find
that 5 is the smallest positive quadratic nonresidue modulo 73. We
write $73 - 1 = 72 = 2^3 \cdot 9$, so e = 3 and f = 9. We set $R = 50^9 \bmod 73 = 27$
and $N = 5^9 \bmod 73 = 10$. Set j = 0 and begin the for loop.
When i = 1, the condition in the if statement is true and j is changed
to $0 + 2^1 = 2$. When i = 2, the condition is true and j is changed to
$2 + 2^2 = 6$. When i = 3, the condition is false and j is not changed.
Finally, $x = 50^{(9+1)/2} \cdot 10^{6/2} \bmod 73 = 59$ or 73 − 59 = 14. The two
square roots of 50 are 14 and 59 modulo 73.
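The statement of the Tonelli algorithm itself is missing from this excerpt. The sketch below is one standard formulation that uses the same quantities as the example (e, f, R, N, j); its loop runs over i = 1, ..., e − 1, which may differ from the book's loop bounds, and the function name is ours. It assumes p is an odd prime.

```python
def tonelli_sqrt(r, p):
    """Return one x with x^2 ≡ r (mod p), p an odd prime; the other root is p - x."""
    r %= p
    if r == 0:
        return 0
    if pow(r, (p - 1) // 2, p) != 1:
        raise ValueError("r is not a quadratic residue modulo p")
    e, f = 0, p - 1
    while f % 2 == 0:                 # write p - 1 = 2^e * f with f odd
        f //= 2
        e += 1
    n = 2
    while pow(n, (p - 1) // 2, p) != p - 1:
        n += 1                        # smallest quadratic nonresidue
    R = pow(r, f, p)
    N = pow(n, f, p)
    j = 0
    for i in range(1, e):
        # Choose bit i of j so that (R * N^j)^(2^(e-i-1)) ≡ 1 (mod p).
        if pow(R * pow(N, j, p) % p, 2 ** (e - i - 1), p) != 1:
            j += 2 ** i
    # Now R * N^j ≡ 1, so x^2 = r^(f+1) * N^j ≡ r (mod p).
    return pow(r, (f + 1) // 2, p) * pow(N, j // 2, p) % p
```

With r = 50 and p = 73 this finds j = 6 and returns 59, in agreement with the example.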
¹ The Extended Riemann Hypothesis is a conjecture about the zeros of certain
L-functions similar to the Riemann zeta function.
3.3. Cyclotomic Polynomials 47
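and
$$x^m - 1 = \prod_{d \mid m} \Phi_d(x) \quad \text{and} \quad \Phi_m(x) = \prod_{d \mid m} (x^{m/d} - 1)^{\mu(d)}.$$
(3.2) $\Phi_{mp}(x) = \Phi_m(x^p) / \Phi_m(x)$,
(3.4) $\Phi_p(x) = \dfrac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + x + 1$.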
Table 1. Factors of $5^m - 1$.
Theorem 3.15. Let q be a prime that does not divide the positive
integer m. Then the congruence
(3.5) $\Phi_m(x) \equiv 0 \pmod{q}$
is solvable if and only if $q \equiv 1 \pmod{m}$. If $q \equiv 1 \pmod{m}$, then
the solutions to (3.5) are exactly the x satisfying $x^m \equiv 1 \pmod{q}$ but
not $x^k \equiv 1 \pmod{q}$ for any 0 < k < m. If x satisfies (3.5), then the
highest power of q dividing $\Phi_m(x)$ is the same as the highest power of
q dividing $x^m - 1$.
The next theorem tells when intrinsic factors occur. Many parts
of this theorem go back at least 150 years to Sylvester.
If $q \mid \Phi_m(x)$, then
(3.7) $q \mid \Phi_{m_1}(x^{q^a})$.
By Theorem 3.15, we see that $m_1$ is the least k > 0 such that $(x^{q^a})^k \equiv
1 \pmod{q}$. Therefore, $m_1$ is the smallest positive integer k with
$x^k \equiv 1 \pmod{q}$ (since $x^q \equiv x \pmod{q}$). For q = 2 this would imply
that $m_1 = 1$. Since we are supposing here that m is not a power of
2, q must be odd.
If x = 1, then $m_1 = 1$ and $\Phi_m(1) = q$ by the remark after (3.4).
If x = −1, then $m_1 = 2$ and $\Phi_m(-1) = \Phi_{m/2}(1) = q$ by the same
remark.
Suppose next that $x \ne \pm 1$. If $m_1$ is the least k > 0 such that
$x^k \equiv 1 \pmod{q}$, then (3.7) holds and q divides $\Phi_{m_1}(x^{q^a})$ to the same
power as it divides $(x^{q^a})^{m_1} - 1 = x^m - 1$. Suppose $q^s \,\|\, (x^{m/q} - 1)$, so
$x^{m/q} = 1 + t q^s$, where $q \nmid t$. Raise this equation to the q-th power
and get $x^m = 1 + q t q^s + t_1 q^{2s} = 1 + t_2 q^{s+1}$ for integers $t_1$ and $t_2$.
Note that $q \nmid t_2$ because q > 2. Therefore, if $q^s \,\|\, \Phi_{m_1}(x^{q^{a-1}})$, then
$q^{s+1} \,\|\, \Phi_{m_1}(x^{q^a})$. Then equation (3.6) shows that $\Phi_m(x)$ is divisible
by q but not by $q^2$.
3.4. Divisibility Sequences and bm − 1 53
but now use $b^k - 1 > b^{k-1}$ for the denominator and $b^k - 1 < b^k$ for
the numerator. The result is
$$\Phi_{m_1}(b) < b^{\varphi(m_1) + 2^{\omega(m_1) - 1}}.$$
Proof. Let $P_m(b) = \Phi_m(b) / \gcd(\Phi_m(b), m)$ denote $\Phi_m(b)$ with any
intrinsic factor removed. The numbers in the infinite sequence
$P_m(3), P_{2m}(3), P_{3m}(3), P_{4m}(3), \ldots, P_{km}(3), \ldots$
are all relatively prime and, by Theorem 3.19, each contains at least
one prime divisor ≡ 1 (mod m).
The proof of the corollary could have used any base b > 1; we
chose base 3 to avoid $2^6 - 1$.
3.5. Factors of $b^m + 1$
Note that $\{b^m + 1\}$, with fixed b, is not a divisibility sequence. For
example, with b = 5 we have $1 \mid 2$, but $6 = 5^1 + 1 \nmid 5^2 + 1 = 26$. The
next theorem describes the divisibility properties of $b^m + 1$.
Theorem 3.21. If b is an integer > 1 and $m \mid n$ and n/m is odd,
then $(b^m + 1) \mid (b^n + 1)$.
 n  u_n        n  u_n               n  u_n
 0  0         10  5.11             20  3.5.11.41
 1  1         11  89               21  2.13.421
 2  1         12  2.2.2.2.3.3      22  89.199
 3  2         13  233              23  28657
 4  3         14  13.29            24  2.2.2.2.2.3.3.7.23
 5  5         15  2.5.61           25  5.5.3001
 6  2.2.2     16  3.7.47           26  233.521
 7  13        17  1597             27  2.17.53.109
 8  3.7       18  2.2.2.17.19      28  3.13.29.281
 9  2.17      19  37.113           29  514229
2. Technically, a Las Vegas probabilistic algorithm.
Proof. Let $F = \prod p_i^{e_i}$ be the factorization of F. Let q be a prime
factor of m. Let $f_i$ be the smallest positive integer for which $a_i^{f_i} \equiv
1 \pmod{q}$. Then $f_i$ divides q − 1. Since $a_i^{m-1} \equiv 1 \pmod{m}$, we also
have that $f_i$ divides m − 1. But $\gcd(a_i^{(m-1)/p_i} - 1, q) = 1$, so $f_i$ does
One can prove that for integers a > 1, most probable primes to
base a are prime and that very few of them are pseudoprimes. For
example, Erdős [Erd56] proved that the number of pseudoprimes to
base 2 up to x is less than $x \exp\bigl(-c \sqrt{(\ln x) \ln \ln x}\bigr)$ for some positive
One can show that for every positive integer a there are infinitely
many pseudoprimes to base a. One might try to reduce the chance
of getting a pseudoprime when one wants a large prime by requiring
that it be a pseudoprime to several bases simultaneously. One prob-
lem with this approach is that there are infinitely many Carmichael
numbers.
(2) ⇒ (3): Suppose m is composite and square free and for each
prime p dividing m we have p − 1 divides m − 1. Since m is square free,
it is enough to show that $a^m \equiv a \pmod{p}$ for each prime p dividing
m. Let a be an integer and let p be a prime factor of m. If p does not
divide a, then $a^{p-1} \equiv 1 \pmod{p}$ by Fermat’s Little Theorem 2.23.
Since we are given that $p - 1 \mid m - 1$, we have $a^{m-1} \equiv 1 \pmod{p}$ and
so $a^m \equiv a \pmod{p}$. It is clear that $a^m \equiv a \pmod{p}$ when p divides a.
(3) ⇒ (1): We are given that $a^m \equiv a \pmod{m}$ for all a. When
gcd(a, m) = 1, we can cancel an a from each side to get $a^{m-1} \equiv
1 \pmod{m}$, that is, m is a pseudoprime to base a.
Example 3.37. Note that 3 − 1, 11 − 1, and 17 − 1 all divide 561 − 1.
Likewise, 7 − 1, 13 − 1, and 19 − 1 all divide 1729 − 1.
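The criterion used in this proof (m composite, square free, and p − 1 dividing m − 1 for every prime p dividing m) is easy to check mechanically for small m. The following Python sketch does so with naive trial division; the helper names are assumptions chosen for the example.

```python
def prime_factorization(m):
    """Return {prime: exponent} for m >= 2, by naive trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def is_carmichael(m):
    """Check the criterion: composite, square free, and p - 1 | m - 1 for each prime p | m."""
    if m < 4:
        return False
    factors = prime_factorization(m)
    if len(factors) < 2:                      # prime or prime power
        return False
    if any(exp > 1 for exp in factors.values()):
        return False                          # not square free
    return all((m - 1) % (p - 1) == 0 for p in factors)
```

For 561 = 3 · 11 · 17 and 1729 = 7 · 13 · 19 the function returns `True`, in agreement with Example 3.37, while it returns `False` for 341 = 11 · 31 because 30 does not divide 340.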
Proof. Since the Legendre symbol (1/m) = +1, there are exactly
two solutions to $x^2 \equiv 1 \pmod{m}$. But +1 and −1 are two solutions
to this congruence, so there are no others. Look at the sequence of
integers $a^f, a^{2f}, \ldots, a^{m-1}$ modulo m. Each number is the square of
the number before it. The last one is +1 by Fermat’s Little Theorem
2.23. Either they are all +1 or else the last one that is not +1 is −1.
But this is the definition of strong probable prime to base a.
Theorem 3.41. Every strong probable prime m is a probable prime
to the same base.
3.7. Primality Testing 67
See Corollary 4.5 for a proof that there are infinitely many strong
pseudoprimes to every (prime) base. There are 3291 strong pseudoprimes
to base 2 below $10^{10}$ and 24220195 of them below $10^{19}$.
The next theorem says that there is no “strong” Carmichael num-
ber, that is, no m is a strong pseudoprime to every base relatively
prime to it.
Theorem 3.42 (Monier [Mon80], Rabin [Rab80]). If m is an odd
composite integer > 9, then the number of bases a in 1 ≤ a ≤ m to
which m is a strong pseudoprime is ≤ φ(m)/4.
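As a small illustration of the strong probable prime condition and of the bound in Theorem 3.42, here is a Python sketch. It assumes the usual definition (write m − 1 = 2^e f with f odd; then either a^f ≡ 1 or a^(2^i f) ≡ −1 for some 0 ≤ i < e); the function name is hypothetical.

```python
def is_strong_probable_prime(m, a):
    """Return True if the odd integer m > 2 is a strong probable prime to base a."""
    f, e = m - 1, 0
    while f % 2 == 0:
        f //= 2
        e += 1
    x = pow(a, f, m)
    if x == 1 or x == m - 1:
        return True
    for _ in range(e - 1):
        x = x * x % m
        if x == m - 1:
            return True
    return False


# 2047 = 23 * 89 is composite but passes the test to base 2.
m = 2047
good_bases = sum(is_strong_probable_prime(m, a) for a in range(1, m + 1))
phi_m = 22 * 88                       # phi(2047) = (23 - 1)(89 - 1)
print(good_bases, "<=", phi_m // 4)   # the count respects the phi(m)/4 bound
```

Running several independent bases therefore drives the chance of accepting a composite down quickly, which is the practical content of the theorem.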
There are also primality tests that choose random numbers while
deciding whether m is prime. The answer is always correct and they
usually run in polynomial time, but there is a chance that some bad
random numbers might make the algorithm run for longer than poly-
nomial time. One example of such an algorithm is the elliptic curve
version of Theorem 3.29 by Goldwasser and Kilian [GK86]. See The-
orem 7.12.
One can prove that if m is prime and ≡ 3 or 7 mod 10, then
m divides the Fibonacci number $u_{m+1}$. We explained at the end of
Section 3.4 how to compute $u_{m+1} \bmod m$ when m is a large integer.
In 1980, Baillie, Pomerance, Selfridge, and the author [BW80],
[PSW80] conjectured that no odd composite integer m, whose last
digit is 3 or 7, is a strong probable prime to base 2 and also divides the
Fibonacci number $u_{m+1}$. (We made a similar, but slightly more
complicated, conjecture for numbers with last digit 1 or 9. See [BW80]
and [PSW80] for details.) No one has ever proved or disproved this
conjecture and it has been used millions of times to construct large
primes without a single failure. Jeff Gilchrist has verified that the
test is correct for all $m < 2^{64}$ using Jan Feitsma’s [Fei13] list of
pseudoprimes to base 2 up to $2^{64}$. The authors of [PSW80] offer
a cash prize to the first person who either proves that this test is
always correct or exhibits a composite number that the test says is
probably prime. The original value of the prize was $30 but has been
increased to $620. Heuristic arguments suggest that composite num-
bers exist that the test says are probably prime, and that the least
such examples have hundreds or even thousands of decimal digits.
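For numbers whose last digit is 3 or 7, the two conditions in the conjecture can be tested together. The sketch below is only an illustration under stated assumptions: it uses the standard fast-doubling identities for Fibonacci numbers (which may differ from the method at the end of Section 3.4), and the names `fib_pair_mod` and `passes_37_test` are invented for the example.

```python
def is_strong_probable_prime(m, a):
    """Strong probable prime test for odd m > 2 to base a."""
    f, e = m - 1, 0
    while f % 2 == 0:
        f //= 2
        e += 1
    x = pow(a, f, m)
    if x == 1 or x == m - 1:
        return True
    for _ in range(e - 1):
        x = x * x % m
        if x == m - 1:
            return True
    return False


def fib_pair_mod(k, m):
    """Return (u_k mod m, u_{k+1} mod m) using fast doubling."""
    a, b = 0, 1                                # (u_0, u_1)
    for bit in bin(k)[2:]:
        c = a * ((2 * b - a) % m) % m          # u_{2j}   = u_j (2 u_{j+1} - u_j)
        d = (a * a + b * b) % m                # u_{2j+1} = u_j^2 + u_{j+1}^2
        a, b = (d, (c + d) % m) if bit == "1" else (c, d)
    return a, b


def passes_37_test(m):
    """For odd m ending in 3 or 7: strong probable prime to base 2 and m | u_{m+1}."""
    if m % 10 not in (3, 7):
        raise ValueError("this sketch only covers last digit 3 or 7")
    return is_strong_probable_prime(m, 2) and fib_pair_mod(m + 1, m)[0] == 0
```

The composite 2047 = 23 · 89 is a strong pseudoprime to base 2, yet it fails the Fibonacci condition, so `passes_37_test(2047)` is `False`; primes such as 13, 17, and 23 pass.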
write “m is prime”
Output: The test tells whether m is composite or prime.
Exercises
3.1. Estimate the number of numbers between $10^{36}$ and $10^{36} + 10^7$
whose greatest prime factor is less than $10^6$.
3.2. Estimate the number of numbers between $18^{36}$ and $18^{36} + 10^7$
whose greatest prime factor is less than $18^6$.
3.3. Find all x with $x^2 \equiv 9 \pmod{253}$.
3.4. Tonelli’s algorithm requires a quadratic nonresidue modulo p.
Prove that the least positive quadratic nonresidue modulo p
must be prime.
3.5. Find the cyclotomic polynomials $\Phi_{10}(x)$, $\Phi_{12}(x)$, $\Phi_{15}(x)$, and
$\Phi_{30}(x)$.
3.6. Show that for m > 1 the cyclotomic polynomial $\Phi_m(x)$ is
self-reciprocal, that is, $\Phi_m(x) = x^{\varphi(m)} \Phi_m(1/x)$.
3.7. Show that at least one coefficient of $\Phi_{105}(x)$ is different from
1, 0, −1.
3.8. Show that the repunits $R_n$ of Section 1.2 form a divisibility
sequence.
3.9. Let $\{c_n\}$ be a divisibility sequence. Prove that
(a) for all n ≥ 1, $c_n / c_1$ is an integer,
(b) the sequence $\{c_n / c_1\}$ is a divisibility sequence, and
(c) if $(a + 1) \mid (b + 1)$, then $\sigma(p^a) \mid \sigma(p^b)$.
3.10. For n ≥ 1, let $a_n = \varphi(n)$, $b_n = \gcd(2^n - 1, 3^n - 1)$, and
$c_n = \lambda(n)$, Carmichael’s function. Prove that $\{a_n\}$, $\{b_n\}$,
and $\{c_n\}$ are divisibility sequences.
3.11. Use Theorem 3.17 to verify that 73, 11, 23, 61, and 7 are
intrinsic factors of the primitive parts of $2^{657} - 1$, $3^{605} - 1$,
$3^{253} - 1$, $11^{122} + 1$, and $2^{1029} - 1$, respectively.
3.12. Prove that if p and q are distinct primes and pq > 6, then
$2^{pq} - 1$ has at least three different prime factors.
3.13. Prove that $\gcd(2^m - 1, 2^n - 1) = 2^{\gcd(m,n)} - 1$.
3.14. Prove that $\gcd(u_m, u_n) = u_{\gcd(m,n)}$.
3.15. Factor the first 30 Lucas numbers $v_n$. For which m and n
does $v_m \mid v_n$ in your table?
3.16. Some versions of Pepin’s test use 3 in place of 5. Prove that
this version is correct for all k ≥ 1.
3.17. Find all 22 pseudoprimes to base 2 below 10000. Which
of these numbers are Carmichael numbers? Which ones are
strong pseudoprimes to base 2?
3.18. Show that every composite factor of $b^n \pm 1$ is a pseudoprime
to base b.
3.19. Let N be the primitive part of $b^n \pm 1$ with any intrinsic factor
removed. Show that every composite factor of N is a strong
pseudoprime to base b.
3.20. True or False? Give proof or counterexample.
(a) If m is a pseudoprime to base a, then m is a pseudoprime
to base $a^2$.
(b) If m is a strong pseudoprime to base a, then m is a strong
pseudoprime to base $a^2$.
(c) If m is a pseudoprime to bases a and b, then m is a
pseudoprime to base ab.
(d) If m is a strong pseudoprime to bases a and b, then m is
a strong pseudoprime to base ab.
3.21. Use Korselt’s Criterion, Theorem 3.36, to prove that if 6k + 1,
12k + 1, and 18k + 1 are all prime, then their product is a
Carmichael number.
3.22. Use the probable primality test of Baillie, Pomerance, Self-
ridge, and Wagstaff to prove that 16493 is probably prime.
3.23. If the primality test of Agrawal, Kayal, and Saxena finds that
its input m is composite, does it produce a factor of m?
3.24. Use the primality test of Agrawal, Kayal, and Saxena to prove
that 16493 is prime.
Chapter 4
Introduction
Hamming was referring to numerical analysis, but the quotation ap-
plies equally to computational number theory and, in particular, to
factoring integers.
Chapter 1 mentioned a few uses of tables of factored integers.
This chapter lists many more examples of the uses of factorization.
One use is discovering new algebraic identities that tell how to factor
certain integers. Aurifeuille discovered an infinite number of these by
examining tables of factored numbers. We mentioned perfect num-
bers N (with σ(N ) = 2N ) in Chapter 1. In this chapter we tell more
about perfect numbers and the related harmonic numbers. We dis-
cuss the use of factoring in proving that certain types of numbers are
prime. The study of Bernoulli numbers, used in combinatorics and
the structure of fields, requires the knowledge of factors of some in-
tegers. We discuss several topics from cryptography, including linear
feedback shift registers (LFSR), the RSA cryptosystem and signa-
tures, secure random number generation, and zero-knowledge proofs.
76 4. How Are Factors Used?
The fact that the sequences $\{b^m - 1\}$ and $\{u_m\}$ are divisibility
sequences was discovered by the study of tables of these numbers
factored.
More than a century ago, mathematicians studied tables of
numbers $b^n \pm 1$ factored and observed that for each b, the numbers in
which n lies in a certain arithmetic progression split into more and
smaller prime factors than did other numbers in the same table. They
suspected and found an algebraic identity that breaks these integers
into two factors of roughly the same size.
In 1869, Landry factored
$$2^{58} + 1 = 536838145 \cdot 536903681,$$
where the first factor is 5 · 107367629 and the second factor is prime.
He observed that the two displayed factors are close together, their
difference is $2^{16}$, and the equation is the special case j = 15 of the
identity
$$2^{4j-2} + 1 = (2^{2j-1} - 2^j + 1)(2^{2j-1} + 2^j + 1).$$
Landry’s task would have been much easier had he noticed this
formula when he began to factor $2^{58} + 1$. Later, Lucas proved the
identities (4.1), (4.2) and other similar formulas.
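Landry's factorization can be checked numerically. The sketch below assumes that the identity in question is the classical one, 2^(4j−2) + 1 = (2^(2j−1) − 2^j + 1)(2^(2j−1) + 2^j + 1); the function name `aurifeuillian_pair` is a label chosen for the example.

```python
def aurifeuillian_pair(j):
    """Return the two factors (L, M) of 2^(4j-2) + 1 given by the assumed identity."""
    half = 2 ** (2 * j - 1)
    return half - 2 ** j + 1, half + 2 ** j + 1


L, M = aurifeuillian_pair(15)
print(L, M)                      # the two nearly equal factors of 2^58 + 1
print(L * M == 2 ** 58 + 1)      # the product recovers 2^58 + 1
print(M - L == 2 ** 16)          # the factors differ by 2^16
print(L == 5 * 107367629)        # the smaller factor splits further
```

The same function works for every j ≥ 1, since the identity is purely algebraic: both sides expand to 2^(4j−2) + 1.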
Let us examine Table 2 of factors of $3^m + 1$ and try to discover
an algebraic factorization. Note the two nearly equal factors 19441
 m           3^m + 1       m            3^m + 1          m             3^m + 1
 1           2.2          10 (2)        5∗.1181         19 (1)         2851.101917
 2           2∗.5         11 (1)        67.661          20 (4)         42521761
 3 (1)       7            12 (4)        6481            21 (1, 3, 7)   7∗.43.2269
 4           2∗.41        13 (1)        398581          22 (2)         5501.570461
 5 (1)       61           14 (2)        29.16493        23 (1)         23535794707
 6 (2)       73           15 (1, 3, 5)  31.271          24 (8)         97.577.769
 7 (1)       547          16            2∗.21523361     25 (1, 5)      151.22996651
 8           2∗.17.193    17 (1)        103.307.1021    26 (2)         53.4795973261
 9 (1, 3)    19.37        18 (2, 6)     530713          27 (1, 3, 9)   19441.19927
Corollary 4.5. For every integer b > 1 and integer m > 2, every
composite divisor of the primitive part of $b^m - 1$ or $b^m + 1$, with any
intrinsic factor removed, is a strong pseudoprime to base b. For every
square free b there are infinitely many strong pseudoprimes to base b.
Proof. Let $P_m(b) = \Phi_m(b) / \gcd(\Phi_m(b), m)$ denote $\Phi_m(b)$ with any
intrinsic factor removed. By Theorems 3.15 and 3.23, the prime
factors of $P_m(b)$ are all ≡ 1 (mod m). Let n be a composite divisor of
$P_m(b)$. Then $m \mid n - 1$, so $b^{n-1} \equiv 1 \pmod{P_m(b)}$. When gcd(b, t) = 1,
let $\ell_b(t)$ denote the smallest positive integer k so that $b^k \equiv 1 \pmod{t}$.
(There always is one because k = φ(t) is one such integer.) Then
$\ell_b(p^e) = m$ for each prime power $p^e$ with $p^e \,\|\, n$. Hence there is an
integer k so that $2^k \,\|\, \ell_b(p^e)$ for each prime power $p^e$ with $p^e \,\|\, n$. This
fact and $b^{n-1} \equiv 1 \pmod{n}$ show that n is a strong pseudoprime to
base b.
Now let b be square free. Let η = 1 if b ≡ 1 (mod 4) and η = 2
if b = 2 or b ≡ 3 (mod 4). Theorem 4.1 shows that $P_{h\eta b}(b)$ factors
into two nearly equal pieces for every odd positive integer h, so that
$P_{h\eta b}(b)$ must be composite, and therefore a strong pseudoprime to
base b.
Using Schinzel’s [Sch62] full theorem, one can show that
Corollary 4.5 is valid without the restriction that b be square free. See
Theorem 1 of [PSW80] for this theorem. A closer analysis of the
proof of Corollary 4.5 shows that the number of strong pseudoprimes
to base b up to x is $> \ln x / (4b \ln b)$. See [EP86] for a stronger
inequality.
What happens when b is not square free? First, if $b = c^t$, then
the factors of $b^n - 1$ are just those of $c^{tn} - 1$, and likewise for $b^n + 1$.
Now suppose b is not a power and not square free. Write $b = cd^2$
where c is square free and $d^2$ is the largest square dividing b. Then
the Aurifeuillian factorization of $b^n \pm 1$ comes from that of $c^n \pm 1$.
3. Since the “m” is easily inferred from the line of the table, the actual
Cunningham tables write “mL” and “mM” simply as “L” and “M.”
 m           3^m + 1        m              3^m + 1
 1           2.2           16              2∗.21523361
 2           2∗.5          17 (1)          103.307.1021
 3 (1)       7             18 (2, 6)       530713
 4           2∗.41         19 (1)          2851.101917
 5 (1)       61            20 (4)          42521761
 6 (2)       73            21 (1, 7)       21L.21M
 7 (1)       547           21L (3)         7∗.43
 8           2∗.17.193     21M             2269
 9 (1, 3)    9L.9M         22 (2)          5501.570461
 9L          19            23 (1)          23535794707
 9M          37            24 (8)          97.577.769
10 (2)       5∗.1181       25 (1, 5)       151.22996651
11 (1)       67.661        26 (2)          53.4795973261
12 (4)       6481          27 (1, 3, 9)    27L.27M
13 (1)       398581        27L             19441
14 (2)       29.16493      27M             19927
15 (1, 5)    15L.15M       28 (4)          430697.647753
15L (3)      31            29 (1)          523.6091.5385997
15M          271           30 (2, 6, 10)   47763361
Proof. If M | N , we have
$$\frac{\sigma(M)}{M} = \sum_{d \mid M} \frac{1}{d} \le \sum_{d \mid N} \frac{1}{d} = \frac{\sigma(N)}{N}.$$
If also M < N , then the inequality is strict because every term in the
first sum is in the second sum, while 1/N appears in the second sum
but not in the first. The third statement is immediate from the first
one.
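$$2 = \frac{\sigma(N)}{N} = \prod_{p^a \| N} \frac{p^{a+1} - 1}{p^a (p - 1)} = \prod_{p^a \| N} \frac{p - p^{-a}}{p - 1} < \prod_{p \mid N} \frac{p}{p - 1} \le \frac{5}{4} \cdot \frac{7}{6} \cdot \frac{11}{10} < 2,$$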
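Since
$$n \sum_{d \mid n} \frac{1}{d} = \sum_{d \mid n} \frac{n}{d} = \sum_{d \mid n} d = \sigma(n),$$
we have
$$H(n) = \left( \frac{1}{d(n)} \cdot \frac{\sigma(n)}{n} \right)^{-1} = \frac{n \, d(n)}{\sigma(n)},$$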
which is a multiplicative function because each of n, d(n), σ(n) is
multiplicative.
Definition 4.14. A positive integer n is a harmonic number if H(n)
is an integer.
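Using the formula H(n) = n d(n)/σ(n), a number n is harmonic exactly when σ(n) divides n · d(n). The following Python sketch tabulates d(n) and σ(n) with a simple divisor sieve and lists the harmonic numbers up to a limit; the function name is an assumption made for the example, and the method is the straightforward one rather than anything more sophisticated.

```python
def harmonic_numbers(limit):
    """Return all n <= limit for which H(n) = n*d(n)/sigma(n) is an integer."""
    d = [0] * (limit + 1)        # d[n] = number of divisors of n
    sigma = [0] * (limit + 1)    # sigma[n] = sum of divisors of n
    for k in range(1, limit + 1):
        for multiple in range(k, limit + 1, k):
            d[multiple] += 1
            sigma[multiple] += k
    return [n for n in range(1, limit + 1) if (n * d[n]) % sigma[n] == 0]


print(harmonic_numbers(700))     # [1, 6, 28, 140, 270, 496, 672]
```

The perfect numbers 6, 28, and 496 appear in the list, as do non-perfect examples such as 140, for which d(140) = 12, σ(140) = 336, and H(140) = 5.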
4.3. Harmonic Numbers 89
terms of d(n) and σ(n), the construction of such a table involves some
factoring. Currently, one can compute these arithmetic functions and
find all the harmonic numbers up to about $10^{12}$ in this straightforward
way. Of course, one can determine whether an integer n of any size is
harmonic provided one can factor that n. A more sophisticated way
of computing a table of harmonic numbers is based on the following
two lemmas.
Lemma 4.17. For each real number A, only a finite number of har-
monic numbers n satisfy H(n)A > n.
Lemma 4.18. For each real number B, only a finite number of har-
monic numbers n satisfy H(n) ≤ B.
But $d(p_2^2 p_3) = 6$, so
and $\sigma(5^{100})$ must divide $p_2^2 p_3$. However, the Cunningham table gives
$\sigma(5^{100})$ as the product of the three different primes 5937018283241,
3434487311396589821473854121, 483593153887747265029536907421,
so it cannot divide a number of the form $p_2^2 p_3$.
This fourth example requires factoring, too, but this time the
Cunningham tables do not help.
4.5. Linear Feedback Shift Registers 93
at the left end at each clock tick. Number the bit positions of the
shift register from 1 to n from left to right. Let the i-th bit position
hold the bit $a_i \in \{0, 1\}$ and let $t_i = 1$ if the i-th bit position is selected
as input to the exclusive-or gate and $t_i = 0$ if it is not selected. Then
the output of the exclusive-or gate is $a = \bigl(\sum_{i=1}^{n} a_i t_i\bigr) \bmod 2$. At each
clock tick, the content of the i-th bit position changes from $a_i$ to $a_{i-1}$
if 1 < i ≤ n, and the content of the left-most bit position changes
from $a_1$ to a.
The characteristic polynomial of the LFSR just described is
$f(x) = 1 + \sum_{i=1}^{n} t_i x^i$. There is a way to describe an LFSR with an n ×
n matrix. The characteristic polynomial of the LFSR is just the
characteristic polynomial of the matrix.
Example 4.24. Consider the LFSR with four bits and characteristic
polynomial $f(x) = x^4 + x^3 + 1$. This means that the two bits on the
right end are input to the exclusive-or gate (⊕). Suppose the initial
contents of the shift register are 0001 from left to right.
[Figure: the four-bit shift register holding 0 0 0 1.]
Then the contents of the shift register at each successive clock tick
are shown in Table 4. The characteristic polynomial is primitive and
a_1 a_2 a_3 a_4     a_1 a_2 a_3 a_4     a_1 a_2 a_3 a_4
 0   0   0   1       1   1   0   0       1   1   0   1
 1   0   0   0       0   1   1   0       1   1   1   0
 0   1   0   0       1   0   1   1       1   1   1   1
 0   0   1   0       0   1   0   1       0   1   1   1
 1   0   0   1       1   0   1   0       0   0   1   1
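The successive register contents in this example can be regenerated by a direct simulation of the shift rule. The sketch below is illustrative only; the function name `lfsr_states` and the representation of the taps as a tuple of 1-based bit positions are assumptions.

```python
def lfsr_states(taps, state):
    """List the register contents until the starting pattern would repeat.

    taps  : 1-based positions whose bits feed the exclusive-or gate
    state : tuple (a_1, ..., a_n), read from left to right
    """
    seen = []
    while state not in seen:
        seen.append(state)
        feedback = sum(state[i - 1] for i in taps) % 2   # output of the XOR gate
        state = (feedback,) + state[:-1]                 # shift right, feed in at the left
    return seen


# f(x) = x^4 + x^3 + 1: positions 3 and 4 feed the gate; start from 0001.
states = lfsr_states((3, 4), (0, 0, 0, 1))
for s in states:
    print(*s)
print(len(states))    # 15 = 2^4 - 1 distinct nonzero states
```

The register passes through all 15 nonzero four-bit patterns before returning to 0001, which is the longest period a four-bit LFSR can have.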
Proof. For i ≥ 0 let $r_i$ be the value of the bit $a_n$ after i clock ticks.
Then $r_i = \sum_{j=1}^{n} t_j r_{i-j} \bmod 2$ for i ≥ n. Define the generating
function $G(x) = \sum_{i=0}^{\infty} r_i x^i$. A short calculation with the recursion
formula for the $r_i$ shows that G(x) = g(x)/f(x), where f(x) is the
characteristic polynomial (of degree n) and g(x) is a polynomial of
degree < n.
Now suppose that the characteristic polynomial is not irreducible,
that is, f(x) = s(x)t(x) for some polynomials s(x), t(x) of degrees $n_1$
and $n_2$ both < n with $n_1 + n_2 = n$. Assume $s(x) \ne t(x)$. Then G(x)
has the partial fraction decomposition
$$G(x) = \frac{g(x)}{f(x)} = \frac{\alpha(x)}{s(x)} + \frac{\beta(x)}{t(x)},$$
where degree α < degree s and degree β < degree t. Then α(x)/s(x)
and β(x)/t(x) are generating functions for two LFSRs with (minimum)
periods dividing $2^{n_1} - 1$ and $2^{n_2} - 1$, respectively. The period
of the LFSR with generating function G(x) cannot exceed the least
common multiple of the periods of the two new LFSRs. Therefore,
$$2^n - 1 \le \operatorname{lcm}(2^{n_1} - 1, 2^{n_2} - 1) \le (2^{n_1} - 1)(2^{n_2} - 1) \le 2^n - 3,$$
a contradiction. The case s(x) = t(x) also leads to a contradiction.
Hence the assumption that f(x) is not irreducible was wrong.
if and only if f(x) divides $x^{2^n} + x$. The first step of the algorithm
is to compute $x^{2^n} \bmod f(x)$. If this polynomial is x, then f(x) is
irreducible and might be primitive. If f(x) passes this test and $2^n - 1$
is prime, then f(x) is primitive. But if $2^n - 1$ is composite, then
a second step is needed. For each prime factor q of $2^n - 1$, test
whether f(x) divides $x^{(2^n - 1)/q} - 1$ by computing $x^{(2^n - 1)/q} \bmod f(x)$
and comparing with the polynomial 1. If f(x) does divide $x^{(2^n - 1)/q} - 1$
for some factor q of $2^n - 1$, then f(x) is not primitive and the period
of the LFSR will divide $(2^n - 1)/q$. However, if for every factor q
of $2^n - 1$, f(x) does not divide $x^{(2^n - 1)/q} - 1$, then f(x) is primitive.
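The two-step test just described can be sketched in Python by storing a polynomial over GF(2) as an integer whose bits are its coefficients (so x^4 + x^3 + 1 is `0b11001`). This encoding, the helper names, and the use of trial division to factor 2^n − 1 are all assumptions made for the illustration; the sketch also assumes degree at least 2 and constant term 1, as for an LFSR characteristic polynomial.

```python
def gf2_mulmod(a, b, f):
    """Multiply polynomials a and b over GF(2) modulo f (all encoded as ints)."""
    n = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if (a >> n) & 1:      # degree reached n: subtract (xor) f
            a ^= f
    return result


def gf2_powmod(base, e, f):
    """Compute base^e modulo f over GF(2) by repeated squaring."""
    result = 1
    while e:
        if e & 1:
            result = gf2_mulmod(result, base, f)
        base = gf2_mulmod(base, base, f)
        e >>= 1
    return result


def prime_factors(m):
    """Distinct prime factors of m by trial division (fine for small 2^n - 1)."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes


def is_primitive(f):
    """Apply the two-step test to f of degree n >= 2 with constant term 1."""
    n = f.bit_length() - 1
    if n < 2 or f & 1 == 0:
        return False
    x = 0b10
    if gf2_powmod(x, 2 ** n, f) != x:          # step 1: x^(2^n) mod f must be x
        return False
    order = 2 ** n - 1
    # step 2: x^((2^n - 1)/q) must differ from 1 for every prime q | 2^n - 1
    return all(gf2_powmod(x, order // q, f) != 1 for q in prime_factors(order))
```

For example, `is_primitive(0b11001)` is `True` for x^4 + x^3 + 1, while x^4 + x^3 + x^2 + x + 1 (`0b11111`) passes the first step but fails the second, because x^5 ≡ 1 modulo that polynomial.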
$$\frac{1}{n} \sum_{d \mid n} 2^d \mu(n/d),$$
$$\varphi(2^n - 1)/n.$$
4.6. Testing Conjectures 97
During the next 250 years mathematicians found that his list had five
errors: p = 67 and 257 should be removed from the list while p = 61,
89, and 107 should be added to it. (See Bateman et al. [BSW89]
for a new “Mersenne” conjecture.) Much of the checking involved
factoring. For example, the prime 43 should not be on the list because
431 divides $2^{43} - 1$. Lucas used Theorem 3.31 to show that $2^{127} - 1$
is prime, and it was the largest known prime for 75 years. Lucas did
the same calculation to test whether $2^{67} - 1$ is prime, and it showed
that this number is composite, but he was not certain that his work
was correct. Lucas died in 1891. Cole factored this number in 1903.
See the quote from Bell [Bel51] in Section 4.9 for more of the story.
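The corrections to the list are easy to confirm today. The sketch below uses the standard Lucas-Lehmer test for Mersenne numbers; whether this is exactly the form stated in Theorem 3.31 is an assumption, and the function name is chosen for the example.

```python
def lucas_lehmer(p):
    """Return True if 2^p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


# Exponents that should be added to the list, plus Lucas's 127.
print([p for p in (61, 89, 107, 127) if lucas_lehmer(p)])
# Exponents that should be removed from the list.
print([p for p in (67, 257) if not lucas_lehmer(p)])
# The factor 431 of 2^43 - 1, checked by modular exponentiation.
print(pow(2, 43, 431) == 1)
```

Note that the test only reports that 2^67 − 1 is composite; it gives no factor, which is why the factorization itself was a separate achievement.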
Currently, the largest known Mersenne prime is $2^{57885161} - 1$. It
is conjectured that the number of Mersenne primes ≤ x is asymptotically
$(e^{\gamma} / \ln 2) \ln \ln x$ as x → ∞. Here γ ≈ 0.577215665 is Euler’s
constant and $e^{\gamma} / \ln 2 \approx 2.5695$. Numerical evidence supports this
conjecture. See [Wag83].
4.6.2. Bell Numbers. The Bell numbers B(n) are named after the
same E. T. Bell who wrote about Cole. They appear in combinatorial
problems. For example, B(n) is the number of ways a product of
1. This is the reason why the $2^p - 1$ are called the Mersenne numbers.
n: 0 1 2 3 4 5 6 7 8 9
B(n) : 1 1 2 5 15 52 203 877 4140 21147
B(n) mod 2 : 1 1 0 1 1 0 1 1 0 1
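The rows of this table can be regenerated with the Bell triangle, a standard way of computing Bell numbers that is used here as an assumption (the text may define B(n) differently). The function name is hypothetical.

```python
def bell_numbers(count):
    """Return [B(0), B(1), ..., B(count - 1)] using the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        bells.append(row[0])
        # Each new row starts with the last entry of the previous row;
        # every later entry adds the entry above-left to its left neighbor.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells


values = bell_numbers(10)
print(values)                      # [1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147]
print([b % 2 for b in values])     # [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
```

The residues modulo 2 repeat with period 3 in this range, which is the kind of pattern one then tries to explain or to test for other moduli.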
m s1 (m) s2 (m) . . .
12 16 15 9 4 3 1 0
24 36 55 17 1 0
28 28 28 28 . . .
30 42 54 66 78 90 144 259 45 33 15 9 4 3 1 0
220 284 220 284 . . .
276 396 696 1104 1872 3770 3790 3050 2716 2772 5964 . . .
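The rows above are consistent with iterating s(m) = σ(m) − m, the sum of the divisors of m that are less than m. Under that assumption, a short Python sketch reproduces them; the function names and the `max_terms` cutoff are choices made for the example.

```python
from math import isqrt


def s(m):
    """Sum of the positive divisors of m that are less than m (s(1) = 0)."""
    total = 0
    for d in range(1, isqrt(m) + 1):
        if m % d == 0:
            total += d
            if d != m // d:
                total += m // d
    return total - m


def aliquot_sequence(m, max_terms=20):
    """Return m, s(m), s(s(m)), ... stopping at 0 or after max_terms terms."""
    terms = [m]
    while terms[-1] != 0 and len(terms) < max_terms:
        terms.append(s(terms[-1]))
    return terms


print(aliquot_sequence(12))        # [12, 16, 15, 9, 4, 3, 1, 0]
print(aliquot_sequence(220, 4))    # [220, 284, 220, 284]
```

Each step needs the divisors of the current term, so long sequences such as the one starting at 276 quickly lead to large numbers that must be factored.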
or by $B_0 = 1$ and
$$B_n = \frac{-1}{n+1} \sum_{k=0}^{n-1} \binom{n+1}{k} B_k$$
for n ≥ 1, where $\binom{n}{k} = n!/((n-k)!\,k!)$ is the binomial coefficient. The
first few are
$$B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}, \quad B_5 = 0, \quad B_6 = \frac{1}{42}.$$
One can prove that $B_{2k+1} = 0$ for k ≥ 1. The signs of the nonzero
Bernoulli numbers alternate. The next few nonzero ones are
$$B_8 = -\frac{1}{30}, \quad B_{10} = \frac{5}{66}, \quad B_{12} = -\frac{691}{2730}, \quad B_{14} = \frac{7}{6}, \quad B_{16} = -\frac{3617}{510}.$$
After a slow start, the absolute value $|B_{2k}|$ increases rapidly.
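The recursion B_0 = 1, B_n = −(1/(n+1)) Σ_{k<n} C(n+1, k) B_k can be run exactly with rational arithmetic. The following is a minimal Python sketch using the standard library's `fractions` module; the function name is an assumption.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, count):
        total = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(Fraction(-1, n + 1) * total)
    return B


B = bernoulli_numbers(17)
for n in (1, 2, 4, 6, 8, 10, 12, 14, 16):
    print(n, B[n])
```

The numerator 691 of B_12 and the numerator 3617 of B_16 are the first signs of the rapid growth mentioned above, and they are also the first interesting numerators to factor.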
Jacob Bernoulli introduced the numbers named after him to give
a closed form for the sum of the first k n-th powers of integers:
$$1^n + 2^n + 3^n + \cdots + k^n = \frac{1}{n+1} \sum_{j=0}^{n} \binom{n+1}{j} B_j \cdot (k+1)^{n+1-j}.$$
Example 4.35.
$$\sum_{i=1}^{\infty} \frac{1}{i^4} = (-1)^{2-1} \frac{(2\pi)^4}{2(4)!} B_4 = -\frac{16\pi^4}{2 \cdot 24} \cdot \frac{-1}{30} = \frac{\pi^4}{90}.$$
4.7. Bernoulli Numbers 103
Siegel [Sie64] and much numerical evidence suggest that more than
60% (the fraction is $e^{-1/2}$) of primes are regular.
Example 4.38. The first few irregular primes are 37, which divides
$P_{32}$, 59, which divides $P_{44}$, and 67, which divides $P_{58}$. The prime 157
is the first one to divide two Bernoulli numerators, namely, $P_{62}$ and
$P_{110}$.
Alice makes n and e public but keeps d secret. The primes p and
q are not needed after d is computed. Anyone who wants to send
a secret message to Alice will encipher it using n and e. Once it is
enciphered, only Alice can decipher it because only she knows d. The
cipher will be secure provided it is impossible to find d given n and
e. The most obvious attack on RSA is to factor n and then compute
d from e and the prime factors p and q of n in the same way that
Alice computed d originally. If the attacker cannot factor n, then this
attack will fail.
There is a trick Alice can use to speed her RSA signature gen-
eration. Suppose her modulus is n = pq, where the primes p and q
have about the same length. Let b be the number of bits in n, so that
the length of p and q is about b/2 bits. If the decryption exponent
is d, then Alice signs the plaintext M as $S = D(M) = M^d \bmod n$.
The trick replaces this Fast Exponentiation with b-bit numbers by
two Fast Exponentiations with b/2-bit numbers. This makes the
signature generation run about four times faster. Let $M_p = M \bmod p$,
$M_q = M \bmod q$, $d_p = d \bmod (p - 1)$, and $d_q = d \bmod (q - 1)$. The
length of each of these four numbers is about b/2 bits. Alice computes
$S_p = M_p^{d_p} \bmod p$ and $S_q = M_q^{d_q} \bmod q$ by Fast Exponentiation. Now
the signature $S \equiv S_p \pmod{p}$ and $S \equiv S_q \pmod{q}$, so Alice
computes S = D(M) from $S_p$ and $S_q$ by the Chinese Remainder Theorem.
The use of the Chinese Remainder Theorem may be accelerated. One
can show that $S = (f S_p + g S_q) \bmod n$ where f and g are the
precomputed constants $f = q \cdot (q^{-1} \bmod p)$ and $g = p \cdot (p^{-1} \bmod q)$. The
same trick can be used to speed deciphering RSA messages, but not
enciphering them because the person encrypting does not know the
secret factors p and q.
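The speed-up can be illustrated with toy parameters. In the sketch below the primes p = 61 and q = 53, the exponent e = 17, and the message M = 65 are arbitrary small values chosen for the example (far too small for real use), and the function name is hypothetical. It requires Python 3.8 or later for modular inverses via `pow(x, -1, m)`.

```python
def rsa_sign_crt(M, d, p, q):
    """Compute S = M^d mod pq using two half-size exponentiations and the CRT."""
    n = p * q
    d_p, d_q = d % (p - 1), d % (q - 1)
    S_p = pow(M % p, d_p, p)
    S_q = pow(M % q, d_q, q)
    f = q * pow(q, -1, p)      # f is 1 mod p and 0 mod q (can be precomputed)
    g = p * pow(p, -1, q)      # g is 0 mod p and 1 mod q (can be precomputed)
    return (f * S_p + g * S_q) % n


# Toy key: illustrative values only.
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
M = 65
S = rsa_sign_crt(M, d, p, q)
print(S == pow(M, d, n))       # the CRT result agrees with the direct computation
print(pow(S, e, n) == M)       # and the signature verifies
```

The rough factor of four comes from halving both the exponent length and the modulus length in each of the two exponentiations.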
Proof. The eavesdropper chooses any t for which the Jacobi symbol
(t/n) = −1. Then t is a quadratic nonresidue modulo n. Let $x_{r+1} =
t^2 \bmod n$. Compute $x_r$ in polynomial time. Then $x_r^2 \equiv x_{r+1} \equiv
t^2 \pmod{n}$, but $x_r \not\equiv \pm t \pmod{n}$, so $\gcd(t + x_r, n)$ is a proper
factor of n by Theorem 6.17.
4.40. Which one of the four square roots of $x_{r+1}$ is the quadratic
residue $x_r$ modulo n?
Modulo p = 7, we have C = 58 ≡ 2, and we found that the two
square roots of 2 are 3 and 4. Clearly, $4 = 2^2$ is a quadratic residue,
and therefore 3 = 7 − 4 is a quadratic nonresidue.
Modulo p = 11, we have C = 58 ≡ 3, and we found that the two
square roots of 3 are 5 and 6. Evaluating the Legendre symbols by
Euler’s Criterion, part (4) of Theorem 2.59, we find $(5/11) \equiv 5^{(11-1)/2} =
5^5 \equiv +1 \pmod{11}$ and $(6/11) \equiv 6^{(11-1)/2} = 6^5 \equiv -1 \pmod{11}$, so 5
is the quadratic residue.
The square root $x_r$ of 58 that is a quadratic residue modulo n =
77 is the one that is a quadratic residue modulo both p and q, that
is, the one ≡ 4 (mod 7) and ≡ 5 (mod 11). We saw in Example 4.40
that $x_r \equiv 60 \pmod{77}$.
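For a modulus as small as 77 the selection can be confirmed by brute force. The sketch below lists all square roots of 58 modulo 77 and keeps those that are quadratic residues modulo both 7 and 11, using Euler's Criterion as in the text; the function name is an assumption, and the exhaustive search is only sensible for tiny n.

```python
def residue_square_roots(c, p, q):
    """Square roots of c mod pq that are quadratic residues mod both primes p and q."""
    n = p * q
    roots = [x for x in range(n) if x * x % n == c % n]
    return [x for x in roots
            if pow(x, (p - 1) // 2, p) == 1 and pow(x, (q - 1) // 2, q) == 1]


print(residue_square_roots(58, 7, 11))   # [60]
```

Exactly one of the four square roots survives because 7 and 11 are both ≡ 3 (mod 4), so in each pair ±r modulo a prime exactly one member is a quadratic residue.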
Given n and $x_i$, one can compute $x_{i+j}$ for any j > 0 by $x_{i+j} =
x_i^{2^j} \pmod{n}$. In case j is very large, we can do this in $O((\log j)(\log n)^2)$
steps by computing $t = 2^j \bmod \varphi(n)$ first and then $x_{i+j} = x_i^t \bmod n$.
This property is useful if one wishes to decipher the middle of a long
message and not start from its beginning.
say c with c2 ≡ b (mod N ). Theorem 3.5 and Algorithm 3.6 tell how
the Prover can compute square roots of quadratic residues modulo N
quickly, provided she knows the prime factors of N . We will see in
Corollary 6.19 that anyone who can compute square roots of arbitrary
quadratic residues modulo N quickly can also factor N . The danger in
doing the protocol in this simple way is that the Verifier will probably
be able to factor N , by Theorem 6.18. Here is one standard way to
avoid this trap and perform the protocol safely.
(1) Prover chooses a in $\sqrt{N} < a < N$ and lets $b = a^2 \bmod N$.
(2) Verifier chooses c in $\sqrt{N} < c < N$ and lets $d = c^2 \bmod N$.
(3) Prover sends b to Verifier and Verifier sends d to Prover.
(4) Prover receives d and solves $x^2 \equiv bd \pmod{N}$. Let $x_1$ be
one of the (four) solutions.
(5) Verifier chooses a random bit 0 or 1, each with probability
0.5, and sends the bit to the Prover.
(6) Prover receives the bit. If it is 0, she sends a to the Verifier.
If it is 1, she sends $x_1$ to the Verifier.
(7) Verifier receives a or $x_1$. If the bit was 0, he checks that
$b = a^2 \bmod N$. If it was 1, he checks that $x_1^2 \equiv bd \pmod{N}$.
The Prover and Verifier repeat steps (1) through (7) a few dozen
times. If the check in step (7) is always correct, then the Verifier
accepts that the Prover really knows the factors of N . But if the
check in step (7) ever fails, then the Verifier concludes that the Prover
does not know the factorization of N .
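Steps (1) through (7) can be simulated in a few lines of Python. This is a sketch under several assumptions that are not in the text: both parties are played by one function, the primes p = 103 and q = 283 are toy values, and both are ≡ 3 (mod 4) so that a square root modulo each prime is a single exponentiation. All function names are hypothetical. It requires Python 3.8 or later.

```python
import random
from math import isqrt


def sqrt_mod_pq(v, p, q):
    """One square root of the quadratic residue v modulo p*q.

    Assumes p and q are distinct primes congruent to 3 modulo 4.
    """
    r_p = pow(v, (p + 1) // 4, p)
    r_q = pow(v, (q + 1) // 4, q)
    # Combine the two roots with the Chinese Remainder Theorem.
    return (r_p * q * pow(q, -1, p) + r_q * p * pow(p, -1, q)) % (p * q)


def run_round(p, q, rng):
    """Simulate one round; return True if the Verifier's check in step (7) passes."""
    N = p * q
    low = isqrt(N) + 1
    a = rng.randrange(low, N)              # step (1), Prover
    b = a * a % N
    c = rng.randrange(low, N)              # step (2), Verifier
    d = c * c % N
    x1 = sqrt_mod_pq(b * d % N, p, q)      # step (4), needs the factors of N
    bit = rng.randrange(2)                 # step (5), Verifier
    if bit == 0:                           # steps (6) and (7)
        return a * a % N == b
    return x1 * x1 % N == b * d % N


rng = random.Random(2013)
print(all(run_round(103, 283, rng) for _ in range(30)))
```

In a real exchange only the Prover would hold p and q, and the Verifier would see only N, b, the bit, and the single value sent in step (6).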
If the Prover really knows the factors of N , then she can compute
all the square roots, as explained above. But if the Prover does not
know the factors of N , then she cannot compute both of the square
roots needed in step (6). (It turns out that she could fake either one
of them if she knew in advance whether the bit would be 0 or 1 in
step (5).) If the protocol is repeated 30 times, there is one chance
in $2^{30} \approx 10^9$ that the Prover could correctly guess the bit in step
(5) each time and supply the needed square root in step (6) if she
did not know the factors of N . Verifier does not learn the factors of
N no matter how many times the protocol is repeated because the
quadratic residues are new each time and because he learns only one
square root of each one. He would have to get two different square
roots (whose sum is not N ) in order to factor N by Theorem 6.17.
But we will see in Chapter 10 that the Verifier can cheat and learn
the factors of N anyway, and that is why this simple version is not
used.
Example 4.45. Which of the numbers 30, 843, 29149, 30629 is the
sum of two squares?
The number 30 has the prime factor 3 to the first power, so it is
not the sum of two squares. Since 843 ≡ 3 (mod 4), it is not the sum of
two squares. Both 29149 and 30629 are ≡ 1 (mod 4), so we need their
prime factors to answer the question. We factor 29149 = 103 · 283.
Since 103 and 283 ≡ 3 (mod 4), 29149 is not the sum of two squares.
We factor 30629 = 109 · 281. Since both 109 and 281 ≡ 1 (mod 4),
30629 is the sum of two squares.
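The reasoning in Example 4.45 relies on the classical criterion that n is a sum of two squares exactly when every prime ≡ 3 (mod 4) divides n to an even power. A short Python sketch of that criterion, with naive trial division and a hypothetical function name, is:

```python
def is_sum_of_two_squares(n):
    """Return True if the positive integer n equals x^2 + y^2 for integers x, y >= 0."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            while n % p == 0:
                n //= p
                exponent += 1
            if p % 4 == 3 and exponent % 2 == 1:
                return False
        p += 1
    # Any leftover n > 1 is a single prime appearing to the first power.
    return not (n > 1 and n % 4 == 3)


for n in (30, 843, 29149, 30629):
    print(n, is_sum_of_two_squares(n))
```

The expensive part is the factoring: for 29149 and 30629 the residue modulo 4 alone decides nothing, and the answer only appears once the prime factors are known.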
Those who factor the largest and hardest composite numbers are
rewarded with the bragging rights for this achievement. The cham-
pion factorizations for each factoring algorithm are promulgated on
the Internet together with the names of those who did the work.
Cole could certainly brag after factoring $M_{67} = 2^{67} - 1$. Bell
[Bel51, p. 128] tells this tale, which is probably exaggerated:
When I asked Cole in 1911 how long it had taken him
to crack $M_{67}$, he said “three years of Sundays.” But
this, though interesting, is not the history. At the
October, 1903, meeting in New York of the Ameri-
can Mathematical Society, Cole had a paper on the
program with the modest title On the factorization
of large numbers. When the chairman called on him
for his paper, Cole—who was always a man of very
few words—walked to the board and, saying noth-
ing, proceeded to chalk up the arithmetic for rais-
ing 2 to the sixty-seventh power. Then he care-
fully subtracted 1. Without a word he moved over
In 1983, Diane Holdridge and Jim Davis, working for Gus Simmons
at Sandia National Labs, factored the 69-digit cofactor of $2^{251} - 1$
and the 60-digit cofactor of $2^{211} - 1$ using a Cray-1 computer. These
were the last two numbers considered by Cunningham and Woodall in
their 1925 book [CW25] to be completely factored. These numbers
were also considered by Mersenne, as mentioned earlier, and were the
last of these numbers $M_p$, p ≤ 257, to be factored completely. This
achievement was reported in Time magazine of February 13, 1984.
In 1988, Mark Manasse and Arjen Lenstra factored the first hard
100-digit composite number. They organized more than a dozen
collaborators running about 400 computers around the world. The number
was the 100-digit cofactor of $11^{104} + 1$. They used the Quadratic
Sieve Algorithm. Their work was reported on the front page of the
New York Times newspaper for October 12, 1988, in an article ti-
tled “A Most Ferocious Math Problem Tamed,” written by Malcolm
W. Browne. The 100-digit composite number turned out to be the
product of a 41-digit prime and a 60-digit prime.
In 1994, Derek Atkins, Michael Graff, Arjen Lenstra, and Paul
Leyland [AGLL95] factored the 129-digit RSA challenge number
published in the August 1977 Scientific American. Gina Kolata an-
nounced their result in articles in the New York Times on March 22
and April 27, 1994. They used the Quadratic Sieve Algorithm. See
the end of Section 8.2 for more details.
Hard-to-factor composite numbers, like RSA challenge numbers
and the unfactored parts of some Cunningham numbers, furnish an
interesting test for new factoring algorithms. One way to prove the
Exercises
 m    $5^m - 1$
 5    (1) 11 · 71
15    (1, 3) 11 · 71 · 181 · 1741
25    (1, 5) 101 · 251 · 401 · 9384251
35    (1, 7) 11 · 71 · 211 · 631 · 4201 · 85280581
has two nearly equal (prime) factors. Also, the number $12^{193} - 1$
has two nearly equal 77-digit (prime) factors:
4521744280918 . . . 81162723213257,
4657568121081 . . . 71111751624211.
Simple Factoring
Algorithms
Introduction
This chapter describes some of the slow methods of factoring integers.
Although they are slow, most of them have short, simple programs
and require little auxiliary storage. Fifty years ago there were no
better algorithms. Furthermore, they work well for factoring small
integers, up to $10^9$, at least, so they are used for many applications
where the numbers are not large.
Until about 100 years ago, people computed and published tables
of primes and of prime factors of integers from 1 to some limit. Er-
atosthenes made the first such table more than 2,500 years ago. In
1603, Cataldi published a table of factors of the integers 1 to 750.
Chernac published a factor table to 1020000 in 1811. Burckhardt
published one for the second million three years later. Crelle ex-
tended this work to five million, but his tables were too inaccurate to
be published. The last factor table (to 10017000) was published by
D. N. Lehmer [Leh09] in 1909. Five years later he published a table
of all primes up to 10006721. No more such tables will be published
because one can compute them in seconds using Algorithms 8.2, 8.3,
and 8.5.
We will revisit the Trial Division Algorithm 3.1 and tell simple
tricks for making it slightly faster. Fermat’s Factoring Method quickly
factors the product of two primes of about the same size. Hart’s
algorithm has a very short program and works well for integers with
some special forms. Lawrence and Lehman invented other variations
on Fermat’s Method. Euler, Legendre, Gauss, Chebyshev, and the
Lehmers found ways to factor integers that can be expressed in the
form $ax^2 + bxy + cy^2$. In the 1970s, Pollard created two interesting
factoring methods, called the Rho Method and the p − 1 Method.
Like Trial Division, they find small factors of large integers.
Several algorithms in this and later chapters have a block of in-
structions repeated many times, and one of the instructions is expen-
sive, but it is performed only during one of every k iterations of the
block. When we analyze the running time for the block of instruc-
tions, we add 1/k of the time for the expensive instruction to the time
for the other instructions to obtain the total time for one iteration
of the block. This calculation is called amortizing the time for the
expensive instruction.
Example 5.1. In the following instructions, assume n is much larger
than k, that Instruction 2 takes 50 time units, while Instructions 1
and 3 each take 1 time unit. Then the total time for the entire for
loop to execute is approximately n(50/k + 2) time units.
Example of Amortization
for i ← 1 to n {
Instruction 1
If (i ≡ 1 (mod k)) { Instruction 2 }
Instruction 3
}
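The estimate in Example 5.1 can be checked by simply counting time units. The following is a minimal sketch, not taken from the text; the function name `loop_time` and its default costs (1 and 50 time units) are assumptions chosen to mirror the example.

```python
def loop_time(n, k, cheap=1, expensive=50):
    """Count the time units used by the loop of Example 5.1."""
    total = 0
    for i in range(1, n + 1):
        total += cheap              # Instruction 1
        if i % k == 1 % k:          # i ≡ 1 (mod k)
            total += expensive      # Instruction 2, run once every k iterations
        total += cheap              # Instruction 3
    return total


n, k = 100_000, 10
print(loop_time(n, k), n * (50 / k + 2))   # exact count and amortized estimate
```

The exact count and the amortized estimate n(50/k + 2) differ by at most one execution of the expensive instruction.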
Wheels that skip multiples of the first eight or ten primes have
been used on computers to accelerate Trial Division.
Alternatively, one could compute a table of primes and divide the
candidate number only by these primes. A sieve can be used to build
a table of primes between two limits faster than the primes could be
read from disk, so there is no need to save a large table of primes in
a file. See Section 8.1 for more about sieves.
One can use quadratic residues to speed Trial Division by skipping
some primes that cannot be divisors. This device was used by Euler,
Gauss, and others hundreds of years ago. Let N be the number to
factor. Suppose we know a nonsquare quadratic residue r modulo N .
Then r is also a quadratic residue modulo any prime factor p of N .
If r is not a square, the Law of Quadratic Reciprocity restricts p to
only half of the possible residue classes modulo 4|r|.
Example 5.3. Suppose we are trying to factor N and we know that
13 is a quadratic residue modulo N . If p is a prime divisor of N , then
13 is also a quadratic residue modulo p. We saw in Example 2.60
that the odd primes $p \ne 13$ for which 13 is a quadratic residue are
the primes p ≡ 1, 3, 4, 9, 10, or 12 (mod 13). Therefore, the primes
3, 13, 17, 23, 29, 43, . . . may divide N , but not the primes 5, 7, 11,
19, 31, 37, 41, . . ..
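The selection of primes in Example 5.3 can be reproduced with Euler's criterion. This is only an illustrative sketch; the helper names `legendre` and `candidate_primes` are assumptions and do not come from the text.

```python
def legendre(r, p):
    """Legendre symbol (r/p) for an odd prime p, computed by Euler's criterion."""
    t = pow(r, (p - 1) // 2, p)      # t is 0, 1, or p - 1
    return -1 if t == p - 1 else t


def candidate_primes(r, primes):
    """Odd primes that may divide N when r is a quadratic residue modulo N."""
    return [p for p in primes if p % 2 == 1 and legendre(r, p) != -1]


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43]
print(candidate_primes(13, odd_primes))
```

Only the surviving primes need to be tried as divisors, which removes about half of the trial divisions.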
}
Output: $x = \lfloor\sqrt{N}\rfloor$.
 x    t = 2x + 1    $x^2$    $r = x^2 - N$
22        45         484      −43
23        47         529        2
24        49         576       49 = $7^2$
The last line of the table gives $527 = 24^2 - 7^2 = (24 - 7)(24 + 7)$.
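The search in the table can be written as a short loop. The sketch below is an assumption-laden illustration rather than the book's algorithm verbatim: it starts at the ceiling of the square root (so the row with negative r is skipped) and the name `fermat_factor` is hypothetical. It updates r by adding t = 2x + 1 instead of squaring x each time.

```python
from math import isqrt


def fermat_factor(N):
    """Return (a, b) with N = a * b and a <= b, for odd N > 1."""
    x = isqrt(N)
    if x * x < N:
        x += 1                      # x = ceil(sqrt(N))
    r = x * x - N
    while isqrt(r) ** 2 != r:       # stop when r = x^2 - N is a perfect square
        r += 2 * x + 1              # (x + 1)^2 - N = r + (2x + 1)
        x += 1
    y = isqrt(r)
    return x - y, x + y


print(fermat_factor(527))
```

For a prime N the loop still terminates, but only with the trivial factorization 1 · N.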
1
The “last digit” of −43 is 7 because −43 ≡ 7 (mod 10).
\[
1 + \frac{1}{2}\left(a + \frac{N}{a} - 2\sqrt{N}\right)
  = 1 + \frac{(\sqrt{N} - a)^2}{2a}
  = 1 + \frac{(1 - k)^2}{2k}\sqrt{N}
\]
times.
Theorem 5.11 does not say that the time complexity of Fermat’s
algorithm is $O(\sqrt{N})$ to factor $N$. In the worst case, $N = 3p$ for
some prime $p$, we have $k = 3/\sqrt{N}$ and the while loop is performed
essentially $N/6$ times. If $a \approx N^{1/3}$ and $b \approx N^{2/3}$, then $k \approx N^{-1/6}$
and the number of steps needed is
\[
\frac{(1 - k)^2}{2k}\sqrt{N} \approx \frac{(1 - N^{-1/6})^2}{2N^{-1/6}}\sqrt{N} \approx \frac{N^{2/3}}{2},
\]
much slower than Trial Division. The algorithm works well when $a$ is
very close to $\sqrt{N}$, say, within $O(\sqrt[4]{N})$ of $\sqrt{N}$.
One can build special hardware devices (called sieves) capable
of executing the while loop at high speed. See Lehmer [Leh33a],
[Leh66] and Williams and Patterson [WP83] for information about
the construction and use of sieves. See also Section 9.1.
Sometimes one can enhance Fermat’s Difference of Squares Meth-
od with the tricks from Trial Division.
Example 5.12. Factor the Mersenne number $M_{29} = 2^{29} - 1$.
Use the tricks in Example 5.7. Any prime factor q of $M_{29}$ must
have the form q = 2k · 29 + 1 = 58k + 1, where k ≡ 0 or 3 (mod 4).
Now 58 · 3 + 1 = 175 is composite, but 58 · 4 + 1 = 233 is prime,
5.3. Hart’s One-Line Factoring Algorithm
if (m is a square) { break }
}
t ← √m
Output: gcd(s − t, N ) is a factor of N .
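For readers who want to experiment, here is a hedged Python sketch of the loop whose last lines appear above. It assumes the usual form of Hart's method, in which s is the ceiling of the square root of N·i and m = s² mod N; the name `hart_olf`, the iteration cap, and the extra check that the gcd is a proper factor are additions for illustration, and the preliminary Trial Division step is omitted.

```python
from math import gcd, isqrt


def hart_olf(N, max_iter=1_000_000):
    """Return a proper factor of N, or None if none is found in max_iter steps."""
    for i in range(1, max_iter + 1):
        s = isqrt(N * i)
        if s * s < N * i:
            s += 1                   # s = ceil(sqrt(N * i))
        m = (s * s) % N
        t = isqrt(m)
        if t * t == m:               # m is a square, so s^2 ≡ t^2 (mod N)
            g = gcd(s - t, N)
            if 1 < g < N:
                return g
    return None


print(hart_olf(527))
```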
Remove the factors of ab from the two trinomial factors to obtain the
factors of N . That is, gcd(x + j + y, N ) and gcd(x + j − y, N ) will be
the factors of N .
5.4. Lehman’s Variation of Fermat
The first factor, 7297, is prime, while the second factor, 7053, is
3 · 2351, so N = 2351 · 7297. Note that 2351/7297 ≈ 0.322, which is
near 1/3.
When one of a, b is even and the other is odd, the calculations are
slightly more complicated because one must deal with half integers.
See Lawrence [Law95] for details. Lehman avoids this problem by
multiplying a, b, and other numbers in the algorithm by 2.
Lawrence’s Factoring Method is similar to Fermat’s in that it will
factor N eventually, but if N does not have factors whose ratio is near
a/b, then it will take a very long time to find them.
Note that the only way that the small positive integers a, b are
used in the algorithm is as their product k = ab. Suppose a pos-
itive integer k can be factored in more than two ways and we run
Lawrence’s algorithm with k in place of ab. Then we are searching
simultaneously for factors of N whose ratio is near a/b for any fac-
torization k = ab. For example, if k = 12, then Lawrence’s algorithm
searches simultaneously for factors of N whose ratio is near either
1/12 or 3/4.
Lehman’s idea [Leh74] is to divide up the interval between 0 and
1 into parts, each consisting of the real numbers “near” the fraction
a/b with 1 ≤ ab ≤ r for some limit r. If N = pq with p ≤ q, then the
ratio p/q must lie in one of the parts and we will factor N when we
examine that fraction a/b.
2
This list of fractions is similar to a Farey series.
If we write $r = N^c$ for some constant c, then $\sqrt{N/r} + r$ is minimized
(for large N) with c = 1/3. It follows that if we let $r = \sqrt[3]{N}$, we get
a factoring algorithm with time complexity $O(N^{1/3})$.
Lehman [Leh74] gives a complete Algol program for this algo-
rithm. Here is the loop on x for one fixed k. In the algorithm
$u = x^2 - 4kN$. The variables u and x are initialized and i1 is chosen
to ensure that the two congruences of Theorem 5.17 are satisfied and
that the variable i counts correctly through all values of x in Inequality
(5.1). Note how the program avoids computing $x^2$ and any other
multiplication, as these operations are slower than addition.
Multiplication by a power of 2 can be done swiftly by shifting a binary
number.
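The overall shape of Lehman's method can be sketched in a few lines of Python. This is not Lehman's Algol program: it ignores the congruence conditions and the multiplication-free updates described above, uses a slightly generous upper limit for x, and the name `lehman` is an assumption. It tests whether $u = x^2 - 4kN$ is a square for x just above $2\sqrt{kN}$.

```python
from math import gcd, isqrt


def lehman(N):
    """Return a proper factor of N > 2, or None if N appears to be prime."""
    r = round(N ** (1 / 3)) + 1
    for d in range(2, r + 1):                 # Trial Division up to about N^(1/3)
        if d < N and N % d == 0:
            return d
    sixth_root = N ** (1 / 6)
    for k in range(1, r + 1):
        lo = isqrt(4 * k * N - 1) + 1         # ceil(sqrt(4kN))
        hi = lo + int(sixth_root / (4 * k ** 0.5)) + 1   # slightly generous bound
        for x in range(lo, hi + 1):
            u = x * x - 4 * k * N
            y = isqrt(u)
            if y * y == u:                    # x^2 - y^2 = 4kN
                g = gcd(x + y, N)
                if 1 < g < N:
                    return g
    return None


print(lehman(2351 * 7297))
```

A production version would follow Lehman's program and replace the squaring by additions, as the text explains.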
Proof. We only sketch the proof. The details are on page 266 of
Mathews [Mat92].
The theory of quadratic forms shows that if N were prime, then
it would have at most one representation as $N = ax^2 + by^2$. Thus, N
must be composite because it has two such representations.
We may assume that gcd(ab, N ) = 1. If any of x, y, u, v were 0,
then we could factor N easily. Thus, we may assume all of x, y, u, v
are positive integers. Note that
\[
\begin{aligned}
a(xv + yu)(xv - yu) &= ax^2v^2 - ay^2u^2 \\
                    &= v^2(ax^2 + by^2) - y^2(au^2 + bv^2) \\
                    &= (v^2 - y^2)N.
\end{aligned}
\]
Therefore, since gcd(a, N) = 1, N divides (xv + yu)(xv − yu). Using the distinctness of
the two representations, one can prove that N does not divide either
xv + yu or xv − yu. Therefore, gcd(xv + yu, N ) and gcd(xv − yu, N )
are proper factors of N .
Compute
1065 · 5286 + 5186 · 295 = 7159460
and
1065 · 5286 − 5186 · 295 = 4099720.
Then gcd(7159460, N ) = 4649 and gcd(4099720, N ) = 6029, so N =
4649 · 6029.
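The arithmetic of this example is easy to confirm by machine. In the sketch below the function name `split_with_cross_terms` is hypothetical; it only evaluates the two greatest common divisors used in the proof of Theorem 5.20.

```python
from math import gcd


def split_with_cross_terms(N, xv, yu):
    """Return (gcd(xv + yu, N), gcd(xv - yu, N))."""
    return gcd(xv + yu, N), gcd(abs(xv - yu), N)


N = 4649 * 6029
print(split_with_cross_terms(N, 1065 * 5286, 5186 * 295))
```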
One problem with Theorem 5.20 is that it may take a long time
to find even one representation of N as $ax^2 + by^2$. An even greater
problem is that N may have no representation at all in this form.
Dick and Emma Lehmer [LL74] devised a variation of the method of
Theorem 5.20 to solve the second problem. They used representations
$\lambda N = x^2 - Dy^2$ with $D \ne 1$ and $D \ne 0$ and gave the range of possible
solutions y. For example, if $D < 0$, then $0 \le y < \sqrt{|\lambda N/D|}$. (When
D > 0, the theory of Pell equations is needed to give the range of
y. See the end of Section 6.8 for this range.) In case two distinct
solutions are found with y in this range, one can factor λN as in the
proof of Theorem 5.20. The multiplier λ is small and can be removed
from the factors of λN to factor N .
The Lehmers [LL74] provided a list of ten quadratic forms $\lambda N = x^2 - Dy^2$
and told how to choose three of them, depending on the
The Lehmers used this method in the early 1970s to factor num-
bers from aliquot sequences having 20 to 25 decimal digits and no
small factor. They used a sieve algorithm, as in Section 8.1, to speed
the search for x and y. They also used a special hardware device, the
Delay Line Sieve (see Section 9.1), to test one million values of y per
second, quite fast for that time.
Emma Trotskaya was born in 1906 and grew up in Harbin, Man-
churia. Her interest in mathematics was sparked by a wonderful high
school teacher. She did not wish to attend college in Moscow in the
early 1920s because the Russian revolution was still in progress there.
She convinced her parents that Berkeley, California, was closer to
Harbin than was Moscow. She applied to the University of California
and was offered a scholarship. She accepted and crossed the Pacific via
Japan and Vancouver, where she obtained a tourist visa for the United
States. University officials told her that she could attend classes but
not access her scholarship funds until she obtained a student visa,
which would take several months. Running out of money, she went to
the mathematics department for help. They told her that Professor
D. N. Lehmer wanted to hire a student to compute some numbers.
She took the job and worked with the professor and his son, D. H.
(Dick) Lehmer doing number theory computations. Dick and Emma
fell in love and married a few years later. Their marriage lasted until
he died in 1991. She lived until 2007.
3
The birthday problem asks how many people are needed so that the probability
is approximately 0.5 that two have the same birthday. If there are p possible birthdays,
then the answer is $\sqrt{(2 \ln 2)p} \approx 1.18\sqrt{p}$.
fair chance that we can separate them and find just one of them if we
restart the algorithm with new random b and s.
The factor g of N written in the next-to-last line is not guaranteed
to be prime. It is possible that we may find two or more prime
factors of N together. One should always test g for primality when
the algorithm finishes.
As noted above, assuming f is a random mapping, the complexity
of the Pollard Rho Method is $O(\sqrt{p})$ steps, where p is the smallest
prime factor of N. Since $p \le \sqrt{N}$, this complexity is $O(N^{1/4})$. Trial
Division would find a prime factor p of N in O(p) steps, so the Pollard
Rho Method is faster than Trial Division unless p is very small. The
Rho Method will find a 20-digit prime factor of N with about the
same work needed to find a 10-digit factor by Trial Division.
Example 5.24. Factor N = 25279 by Pollard Rho with s = b = 1.
The first 13 iterates (modulo N ) are s0 = 1, s1 = 2, 5, 26, 677,
3308, s6 = 22337, 9947, 804, 14442, 19615, 1846, s12 = 20331. We
have gcd(s12 −s6 , N ) = gcd(20331−22337, 25279) = 17. The (hidden)
values of si mod 17 are shown in the figure below. The shape is the
reason for the algorithm’s name.
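[Figure: Pollard Rho modulo 17. The values $s_i \bmod 17$ run 1 → 2 → 5 → 9 → 14 → 10 → 16 → 2; the tail starting at 1 joins a cycle of length 6, which gives the shape of the letter ρ.]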
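Example 5.24 can be replayed with the following sketch. It assumes the iteration $f(x) = x^2 + b \bmod N$ starting from s and compares $s_i$ with $s_{2i}$ (Floyd's cycle finding), taking a gcd at every step; the function name `pollard_rho` and the step limit are assumptions for illustration.

```python
from math import gcd


def pollard_rho(N, s=1, b=1, max_steps=1_000_000):
    """Return a proper factor of N, or None on failure."""
    def f(x):
        return (x * x + b) % N

    slow = fast = s
    for _ in range(max_steps):
        slow = f(slow)               # s_i
        fast = f(f(fast))            # s_{2i}
        g = gcd(abs(fast - slow), N)
        if g == N:
            return None              # cycle closed modulo every prime factor at once
        if g > 1:
            return g
    return None


print(pollard_rho(25279))
```

With N = 25279 and s = b = 1 the first nontrivial gcd appears when $s_6$ is compared with $s_{12}$, as in the example.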
The most expensive step in the while loop is the greatest com-
mon divisor. Its cost may be amortized by adding a new variable C,
initialized at 1, replacing the greatest common divisor operation by
p − 1 has a prime factor > B, then the algorithm will fail. We could
fail to find a prime factor p as small as p = 2q + 1, where q is the first
prime > B (or > B2 , if the second stage is used).
Exercises
5.1. Prove that Algorithm 5.9 for $\lfloor\sqrt{N}\rfloor$ is correct and that it takes
O(log log N) iterations.
5.2. Find the 22 squares modulo 100 and the 12 squares modulo 64.
5.3. Use Fermat’s Difference of Squares Algorithm to factor 5293.
5.4. Factor N = 299944727 by Lehman’s algorithm using k = 12.
Omit the Trial Division step.
5.5. Factor the Fermat number $F_7 = 2^{2^7} + 1$ given that $F_7 = x^2 + y^2$
with x = 16382350221535464479 and y = 8479443857936402504.
Use Theorem 5.20.
Continued Fractions
Introduction
Although the factoring methods in this chapter are slow compared
to the best known ones, the ideas in them led to some of the fastest
known factoring algorithms. All of the algorithms in this chapter use
simple continued fractions in some way. Several of them use continued
fractions to produce quadratic residues modulo the number N to be
factored. The Continued Fraction Factoring Algorithm (CFRAC) was
the first factoring algorithm with subexponential time complexity.
Some facts about continued fractions are used in later chapters.
For example, Theorem 6.7 is needed in the proof of Theorem 10.8
and for quantum computing in Chapter 9. Theorems 6.17 and 6.18
in Section 6.4 are used first in Section 6.5 but are really not about
continued fractions. They are used many times in Chapters 8 and 10.
for every k ≥ 1.
 k    $q_k$     $A_k$     $B_k$    $A_k/B_k$
−1      −         1         0      −
 0      3         3         1      3.0
 1      7        22         7      3.1428
 2     15       333       106      3.141509
 3      1       355       113      3.1415929
 4    292    103993     33102      3.1415926530
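The numerators and denominators in this table follow the usual recurrences $A_k = q_k A_{k-1} + A_{k-2}$ and $B_k = q_k B_{k-1} + B_{k-2}$ with the starting row k = −1 shown above. A small sketch (the function name `convergents` is an assumption) recomputes them from the partial quotients:

```python
def convergents(quotients):
    """Return the list of (A_k, B_k) for partial quotients q_0, q_1, ..."""
    A_prev, A = 1, quotients[0]      # A_{-1}, A_0
    B_prev, B = 0, 1                 # B_{-1}, B_0
    result = [(A, B)]
    for q in quotients[1:]:
        A_prev, A = A, q * A + A_prev
        B_prev, B = B, q * B + B_prev
        result.append((A, B))
    return result


for A, B in convergents([3, 7, 15, 1, 292]):
    print(A, B, A / B)
```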
a proper factor of N .
If we knew the factors p and q, then we could let r = q/p
and the solution would be y = 2r, $x = r^2 p + q - by$, $z = q - r^2 p$. But,
of course, we don’t know p and q in advance. The inequalities in the
theorem will help us find x, y, z in $O(N^{1/4+\varepsilon})$ steps.
Suppose some integer m divides z but not y. Let $x_0 \equiv xy^{-1} \pmod{m^2}$,
where $y^{-1}$ is the inverse of y modulo $m^2$. Since $Q(x, y) = z^2 \equiv 0 \pmod{m^2}$
we have $Q(x_0, 1) \equiv z^2/y^2 \equiv 0 \pmod{m^2}$. Also,
since $x_0 \equiv xy^{-1} \pmod{m^2}$, we have $x = x_0 y - \lambda m^2$ for some integer
λ, so that
\[
\frac{x_0}{m^2} - \frac{\lambda}{y} = \frac{x}{m^2 y}. \tag{6.5}
\]
If x > 0 and
\[
m^2 > 2xy, \tag{6.6}
\]
where the overline indicates the string of p partial quotients that are
repeated forever.
The word “surd” is an old word meaning “irrational number.” A
“quadratic surd” is an irrational number that is a zero of a quadratic
polynomial with integer coefficients.
• easier to compute,
• easier to factor,
• more likely to be smooth, and
• easier to test for being a square
than larger ones near N/2. As we noted in Section 5.1, small quadratic
residues r modulo N have a more manageable set of primes p for
which r is a quadratic residue modulo p. These primes p are the only
potential divisors of N when Trial Division is performed on N .
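We define the i-th complete quotient by
\[
x_i = \begin{cases}
\sqrt{N} & \text{if } i = 0, \\
1/(x_{i-1} - q_{i-1}) & \text{if } i \ge 1.
\end{cases}
\]
One can prove that $x_i = (P_i + \sqrt{N})/Q_i$ for $i \ge 0$, where
\[
P_i = \begin{cases}
0 & \text{if } i = 0, \\
q_0 & \text{if } i = 1, \\
q_{i-1} Q_{i-1} - P_{i-1} & \text{if } i \ge 2
\end{cases} \tag{6.7}
\]
and
\[
Q_i = \begin{cases}
1 & \text{if } i = 0, \\
N - q_0^2 & \text{if } i = 1, \\
Q_{i-2} + (P_{i-1} - P_i)\, q_{i-1} & \text{if } i \ge 2.
\end{cases} \tag{6.8}
\]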
Example 6.16. Table 2 gives the sequences for the continued fraction
of $\sqrt{85}$ = 9.219544457292887....
Table 2. Continued fraction expansion for $\sqrt{85}$.
  i   $q_i$  $P_i$  $Q_i$      $A_i$      $B_i$   $A_i/B_i$
 −1    −      −      −            1          0   −
  0    9      0      1            9          1   9.0
  1    4      9      4           37          4   9.25
  2    1      7      9           46          5   9.20
  3    1      2      9           83          9   9.2222
  4    4      7      4          378         41   9.21951
  5   18      9      1         6887        747   9.2195448
  6    4      9      4        27926       3029   9.219544404
  7    1      7      9        34813       3776   9.2195444915
  8    1      2      9        62739       6805   9.2195444526
  9    4      7      4       285769      30996   9.21954445735
 10   18      9      1      5206581     564733   9.2195444572922
 11    4      9      4     21112093    2289928   9.21954445729298
The reader should use the formulas to check the values in the
table. Note also that the approximations $A_i/B_i$ alternate above and
below the actual value of $\sqrt{85}$ and rapidly approach this limit. One
can prove that they always alternate this way.
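One way to carry out that check is to program recurrences (6.7) and (6.8) directly. The sketch below is illustrative only; the function name `sqrt_continued_fraction` is an assumption, and the partial quotient is computed as $\lfloor (P_i + q_0)/Q_i \rfloor$, which equals $\lfloor (P_i + \sqrt{N})/Q_i \rfloor$ for integers.

```python
from math import isqrt


def sqrt_continued_fraction(N, last):
    """Rows (i, q_i, P_i, Q_i, A_i, B_i) for i = 0..last; N must not be a square."""
    q0 = isqrt(N)
    P, Q, q = 0, 1, q0               # P_0, Q_0, q_0
    Q_before = None                  # holds Q_{i-2} once i >= 2
    A_prev, A = 1, q0                # A_{-1}, A_0
    B_prev, B = 0, 1                 # B_{-1}, B_0
    rows = [(0, q, P, Q, A, B)]
    for i in range(1, last + 1):
        P_new = q * Q - P            # (6.7); equals q_0 when i = 1
        if i == 1:
            Q_new = N - q0 * q0      # (6.8), case i = 1
        else:
            Q_new = Q_before + (P - P_new) * q   # (6.8), case i >= 2
        Q_before, P, Q = Q, P_new, Q_new
        q = (q0 + P) // Q            # q_i
        A_prev, A = A, q * A + A_prev
        B_prev, B = B, q * B + B_prev
        rows.append((i, q, P, Q, A, B))
    return rows


for row in sqrt_continued_fraction(85, 11):
    print(row)
```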
Proof. Suppose that N is odd and has k > 1 distinct prime factors
and that $x^2 \equiv y^2 \pmod N$. Then $x^2 \equiv y^2 \pmod{p^e}$ for each of the
k distinct prime power divisors $p^e$ of N. The number $y^2$ is clearly a
quadratic residue modulo p. The congruence $z^2 \equiv y^2 \pmod{p^e}$ has
exactly two solutions z. Since y and −y are clearly two solutions,
$z \equiv \pm y \pmod{p^e}$. By the Chinese Remainder Theorem, given y,
there are $2^k$ solutions z to $z^2 \equiv y^2 \pmod N$, one for each choice of
the ± sign in each congruence $z \equiv \pm y \pmod{p^e}$. The solutions with
$x \equiv \pm y \pmod N$ are two of these $2^k$ solutions. Therefore, if x and
y are chosen randomly subject to $x^2 \equiv y^2 \pmod N$, the probability
that $x \not\equiv \pm y \pmod N$ is $(2^k - 2)/2^k = 1 - 2^{1-k}$. Since k > 1, the
probability is at least 1/2 that a random congruence $x^2 \equiv y^2 \pmod N$
will yield a factorization of N.
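The count in this proof can be observed on a toy modulus. The sketch below uses N = 105 = 3 · 5 · 7 and y = 2, values chosen here purely for illustration (they do not appear in the text), and finds the solutions by brute force; the function name `square_roots_of` is likewise an assumption.

```python
from math import gcd


def square_roots_of(y, N):
    """All z in [0, N) with z^2 ≡ y^2 (mod N), found by brute force."""
    return [z for z in range(N) if (z * z - y * y) % N == 0]


N, y = 105, 2                        # k = 3 distinct prime factors
roots = square_roots_of(y, N)
useful = [z for z in roots if z not in (y % N, -y % N)]
factors = sorted({gcd(z - y, N) for z in useful})
print(len(roots), len(useful), factors)
```

Here $2^k = 8$ solutions appear, and the six with $z \not\equiv \pm y$ each give a proper factor of N through gcd(z − y, N).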
  k   $q_k$   $P_k$    $Q_k$              $A_{k-1} \bmod N$
  0   3645      0      1                         1
  1      1   3645      2 · 2017               3645
  2      1    389      3257                   3646
  3      4   2868      5 · 311                7291
  4      5   3352      1321                  32810
  5      3   3253      2 · $5^2$ · 41       171341
  ⋮      ⋮      ⋮      ⋮                         ⋮
 22      1   1134      41 · 113            5235158
 23     31   3499      2 · 113             1914221
 24      1   3507      5 · 877            11415773
 25      1    878      5 · 571               39935
 26      1   1977      2 · 31 · 53        11455708
 27      1   1309      13 · 271           11495643
 28      2   2214      2381                9661292
 29      2   2548      5 · 571             4238109
3
One might think that the probability that Qi is B-smooth would be lessened
by having only about half of the possible primes available. But there is a heuristic
argument (see Section 4.5.4 of Knuth [Knu81]) that if p ≤ B does not divide N and
(N/p) = +1, then p divides Qi with probability 2/(p + 1) rather than the expected
1/p. This higher chance of dividing Qi compensates for the smaller number of useful
primes < B and leaves the estimate in equation (3.1) essentially unchanged.
$F_i$
  i   $(-1)^{i-1} Q_{i-1}$   $2 \cdot P_i$   $(-1)^i Q_i$
  1          1               2 · 3645        −4034
  2      −4034               2 · 389          3257
  3       3257               2 · 2868        −1555
  4      −1555               2 · 3352         1321
  5       1321               2 · 3253        −2050
  ···
 50      −2327               2 · 2869         2174
 51       2174               2 · 1479        −5107
 52      −5107               2 · 3628           25
$G_i$
  i   $(-1)^{i-1} S_{i-1}$   $2 \cdot R_i$   $(-1)^i S_i$
  0         −5               2 · 3643         3722
  1       3722               2 · 79          −3569
  2      −3569               2 · 3490          311
  3        311               2 · 3352        −6605
  4      −6605               2 · 3253          410
  ···
 21       1130               2 · 2603        −5765
 22      −5765               2 · 3162          571
 23        571               2 · 3119        −6238
 24      −6238               2 · 3119          571
of the square form found in the first part of the algorithm. When
Ri = Ri+1 is found, the factor of N is Si if Si is odd and Si /2 if Si
is even.
Example 6.24. We factor N = 13290059 by the SQUFOF. Then
$q_0 = \lfloor\sqrt{N}\rfloor = 3645$ and $q_0^2 - N = -4034$. Some of the forms are
shown in Table 4. Compare Table 4 with Table 3. At the end of
Table 4 we have found the square form $F_{52} = (-5107, 2 \cdot 3628, 5^2)$.
5
A fancier and often faster version of the SQUFOF uses a Queue data structure
instead of a List.
trivial factor 1. The purpose of putting the small Q’s on the List and
checking for r being on the List is to avoid this trap.
Here is the first part of the SQUFOF algorithm:
When the first part of the SQUFOF ends by going to Part 2, it has
just found the square form $F = (-\mathit{Qprev}, 2P, r^2)$. Its inverse square
root is $F^{-1/2} = (-r, 2P, r \cdot \mathit{Qprev})$. The first lines of the second part
reduce this form to produce the form (−Qprev, 2P, Q) just before
the while loop begins. The division by Qprev in the third line is
exact. Then the SQUFOF cycles through reduced forms seeking two
consecutive forms with the same middle term: 2P = 2Pnext. When
they are found, the factor of N is either Q or Q/2.
Qprev ← r
P ← P + r⌊(s − P)/r⌋
Q ← (N − P · P)/Qprev
i ← 0
while (i < B) {
    q ← ⌊(s + P)/Q⌋
    Pnext ← q · Q − P
    if (P = Pnext) {
        Q ← Q/ gcd(Q, 2)
        write “Q divides N” ; exit
    }
    t ← Qprev + q · (P − Pnext)
    Qprev ← Q
    Q ← t
    P ← Pnext
    i ← i + 1
}
Output: A factor of N.
Note that the code for the next form in the reverse cycle in Part
2 is exactly the same as that for the next form in the forward cycle
in Part 1. The number of iterations of the while loop in Part 2 is
approximately half of that in Part 1.
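A direct Python transcription of Part 2 is sketched below. It assumes that $s = \lfloor\sqrt{N}\rfloor$ and that P and r come from the square form found in Part 1 (which is not reproduced here); the function name `squfof_part2` and the iteration cap are assumptions for illustration.

```python
from math import gcd, isqrt


def squfof_part2(N, P, r, max_iter=100_000):
    """Second part of the SQUFOF, started from the square form (-Qprev, 2P, r^2)."""
    s = isqrt(N)
    Qprev = r
    P = P + r * ((s - P) // r)
    Q = (N - P * P) // Qprev         # this division is exact
    for _ in range(max_iter):
        q = (s + P) // Q
        Pnext = q * Q - P
        if P == Pnext:               # two consecutive forms share the middle term
            return Q // gcd(Q, 2)
        Qprev, Q = Q, Qprev + q * (P - Pnext)
        P = Pnext
    return None


# Square form F_52 = (-5107, 2 * 3628, 5^2) from Example 6.24
print(squfof_part2(13290059, 3628, 5))
```

Starting from the square form of Example 6.24, the first forms produced agree with the rows of the $G_i$ table, and the loop stops when the middle term 2 · 3119 repeats.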
Under plausible hypotheses, one can prove that the average running
time for the SQUFOF is $C\sqrt[4]{N}$, where C depends on the number
of prime factors of N and whether N ≡ 1 or 3 (mod 4). See Gower
and Wagstaff [GW08] for the hypotheses, proof, and formula for C.
Shanks introduced two important new ideas in the SQUFOF:
• keeping virtually all numbers $< 2\sqrt{N}$ when factoring N and
• using the connections among continued fractions, binary
quadratic forms, and the structure of real quadratic fields
to compute the factors of N in a simple way and to analyze
the time complexity of the SQUFOF.
6.8. Pell’s Equation
i qi Pi Qi Ai Bi
0 1 0 1 1 1
1 1 1 2 2 1
2 2 1 1 5 3
3 1 1 2 7 4
4 2 1 1 19 11
5 1 1 2 26 15
Exercises
6.3. Find the length p of the period of the simple continued fraction
of $\sqrt{N}$, where N = 13290059.
6.4. Let p be a 15-digit prime number. Prove that the CFRAC will
fail to factor $N = p^3$.
Chapter 7
Elliptic Curves
Introduction
This chapter describes how elliptic curves are used to factor integers.
We also explain how one can prove that large integers are prime with
elliptic curves. Finally, we give applications of factoring integers to
the construction of certain elliptic curves with desirable properties.
Elliptic curves have been studied for hundreds of years. In 1985,
H. W. Lenstra, Jr. [Len87] invented a factoring method that used
elliptic curves. Basically, Lenstra freed Pollard’s p − 1 factoring algo-
rithm from its dependency on a single number (p − 1) being smooth.
He replaced the multiplicative group of integers modulo p by an ellip-
tic curve modulo p. The integer that needs to be smooth is the size
of the elliptic curve modulo p. This number varies through a small
interval around p depending on the parameters of the elliptic curve.
Soon after this discovery, Miller [Mil87] and Koblitz [Kob87b] gave
applications of elliptic curves to cryptography. Many cryptographic
algorithms use the multiplicative group of integers modulo p. They
replaced this group with an elliptic curve modulo p. This change al-
lows one to use smaller p and design faster cryptographic algorithms
offering the same level of security. Today elliptic curves are an im-
portant tool in cryptography.
$y^2 = x^3 + ax^2 + bx + c$,
Example 7.4. Add the points (3, 3)+(6, 1) on the curve whose points
were just listed.
We have s = (1 − 3)/(6 − 3) = −2/3. The Extended Euclidean
Algorithm gives $3^{-1} \equiv 5 \pmod 7$, so $s \equiv (-2)(5) \equiv -10 \equiv 4 \pmod 7$.
Thus, $x_3 = 4^2 - 3 - 6 = 7 \equiv 0 \pmod 7$ and $y_3 \equiv 4(3 - 0) - 3 \equiv
2 \pmod 7$, so the sum is (0, 2).
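The computation in Example 7.4 uses only the slope formula for two points with different x-coordinates, so it can be sketched without knowing the curve's coefficients. The function name `ec_add_distinct` is an assumption; the three-argument `pow` with exponent −1 (Python 3.8 or later) supplies the modular inverse.

```python
def ec_add_distinct(P1, P2, p):
    """Add two points with distinct x-coordinates on an elliptic curve modulo a prime p."""
    (x1, y1), (x2, y2) = P1, P2
    if (x1 - x2) % p == 0:
        raise ValueError("this sketch only handles points with different x-coordinates")
    s = (y2 - y1) * pow(x2 - x1, -1, p) % p     # slope of the chord
    x3 = (s * s - x1 - x2) % p
    y3 = (s * (x1 - x3) - y1) % p
    return x3, y3


print(ec_add_distinct((3, 3), (6, 1), 7))
```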
Theorem 7.7. Let p be an odd prime. Let (r/p) denote the Legendre
symbol. The number $M_{p,a,b}$ of points on the elliptic curve
$y^2 \equiv x^3 + ax + b \pmod p$ satisfies
\[
M_{p,a,b} = p + 1 + \sum_{x=0}^{p-1} \left(\frac{x^3 + ax + b}{p}\right).
\]
\[
M_{p,a,b} = 1 + \sum_{x=0}^{p-1} \left(1 + \left(\frac{x^3 + ax + b}{p}\right)\right)
          = p + 1 + \sum_{x=0}^{p-1} \left(\frac{x^3 + ax + b}{p}\right).
\]
Theorem 7.8 (Hasse). Let the elliptic curve $E_{a,b}$ modulo a prime p
have $M_{p,a,b}$ points. Then
\[
p + 1 - 2\sqrt{p} \le M_{p,a,b} \le p + 1 + 2\sqrt{p}.
\]
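Both theorems are easy to test numerically for small p. The following sketch, with hypothetical helper names, counts points with the Legendre-symbol sum of Theorem 7.7, compares the result with a brute-force count, and checks the Hasse bound in the integer form $(M - p - 1)^2 \le 4p$. The sample curve $y^2 \equiv x^3 + 4x \pmod{11}$ is the one from Exercise 7.3.

```python
def legendre(r, p):
    """Legendre symbol (r/p) for an odd prime p, by Euler's criterion."""
    t = pow(r % p, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


def count_points(a, b, p):
    """Number of points, including the point at infinity, via Theorem 7.7."""
    return p + 1 + sum(legendre(x**3 + a * x + b, p) for x in range(p))


def count_points_brute(a, b, p):
    """Same count by testing every pair (x, y)."""
    affine = sum(1 for x in range(p) for y in range(p)
                 if (y * y - x**3 - a * x - b) % p == 0)
    return affine + 1


def hasse_ok(M, p):
    """Hasse's bound |M - (p + 1)| <= 2 sqrt(p), checked without square roots."""
    return (M - p - 1) ** 2 <= 4 * p


M = count_points(4, 0, 11)
print(M, count_points_brute(4, 0, 11), hasse_ok(M, 11))
```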
7.2. Factoring with Elliptic Curves
The Elliptic Curve Method has a second stage, just like the Pol-
lard p − 1 Method. The second stage of the algorithm chooses a
bound B2 > B. At the end of the first stage (Algorithm 7.9), the
variable P is a point Q equal to L times the original point P . Let
q1 < q2 < · · · < qt be the primes between B and B2 . One computes
successively (Lqi )P for i = 1, 2, . . . , t, where P is the original point.
The first point (Lq1 )P is computed directly as (q1 )Q. The differences
qi+1 − qi are all even numbers and are much smaller than the qi them-
selves. Precompute (Ld)P = dQ for d = 2, 4, . . ., up to a few hundred.
To find (Lqi+1 )P from (Lqi )P , add the latter to (Ld)P = dQ, where
d = qi+1 − qi . The amortized cost of computing each (Lqi )P for
$1 \le i \le t$ is a single addition of two points on $E_{a,b}$.
The second stage finds a factor p of N when the largest prime
factor of Mp,a,b is less than B2 and all the other prime factors of
Mp,a,b are less than B.
Montgomery and Silverman each discovered more than 500 fac-
tors of Cunningham numbers by the ECM. Suyama found a good
form for an elliptic curve that has become standard for the ECM.
Brent [Bre86] and Montgomery [Mon87] made many improvements
to the second stage.
As I write this, the ECM has discovered one 79-digit prime factor,
one 77-digit prime factor, and two 75-digit prime factors of large
composite numbers, but no larger ones. Efficient modern versions
of the algorithm discover prime factors up to about 20 digits in a
few seconds and those up to about 40 digits in a few hours. Luck is
required to discover factors having more than 60 digits. See Silverman
and Wagstaff [SW93] for some practical aspects of the ECM. See
Zimmermann and Dodson [ZD06] for more about the ECM. A table
in [ZD06] recommends suitable parameters for the ECM, based on
the suspected size of the unknown prime factor. For example, when
seeking a prime factor of 50 digits, they suggest $B = 43 \cdot 10^6$ and
$B_2 = 4620B$. With these choices, an average of 7771 curves will be
needed to find the unknown prime. (These values assume a different
form for the elliptic curve and a faster way of performing the second
stage than we described above.)
Exercises
7.1. Derive the formulas for adding points $(x_1, y_1) + (x_2, y_2)$ stated
after Theorem 7.1.
7.2. Consider the curve $y^2 = x^3 - 43x + 166$. Let P = (3, 8).
Compute 2P, 3P, 4P, and 5P. What is the smallest positive
integer k with kP = ∞? Be sure to check that the given point
and your answers all lie on the curve.
7.3. Consider the curve $y^2 \equiv x^3 + 4x \pmod{11}$. Let P = (2, 4).
Compute 2P, 3P, and 4P. What is the smallest positive integer
k with kP = ∞? Find the number of points on the elliptic
curve. Be sure to check that the given point and your answers
all lie on the curve.
7.4. Prove that if p is prime, p ≡ 3 (mod 4), and b = 0, then the
elliptic curve $E_{a,0}$ modulo p has exactly p + 1 points.
7.5. Let g be a quadratic nonresidue modulo the prime p. Let M
and N be the sizes of the two elliptic curves $y^2 \equiv x^3 + ax + b$
and $y^2 \equiv x^3 + g^2 ax + g^3 b$ modulo p. Prove that M + N = 2p + 2.
7.6. Factor N = 7151393 using the ECM by computing multiples
of the point P = (0, 3) on the elliptic curve $y^2 = x^3 + 4x + 9$
modulo N. Try B = 10.
Chapter 8
Sieve Algorithms
Introduction
This chapter describes sieve algorithms used to factor integers. The
sieve is one of the oldest efficient algorithms in mathematics. The ba-
sic idea goes back to Eratosthenes in ancient Greece 2,500 years ago.
He used it to compute the first few prime numbers. Modern computer
programs use the same technique to form a table of the primes up to
a few million in a fraction of a second. A simple variation finds all in-
tegers in a short interval of large numbers free of small prime factors.
Another variation finds all primes or all numbers without small prime
factors in the range of a polynomial with arguments in some interval.
These are all called sieves and are described in the next section.
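As a point of reference for the variations that follow, the basic sieve of Eratosthenes fits in a dozen lines. This is a generic sketch, not one of the book's numbered algorithms, and the function name `primes_up_to` is an assumption.

```python
def primes_up_to(n):
    """Return the list of primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # cross out the multiples of p, starting at p^2
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(19))
```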
[Figure: the integers 1 through 19, with the entries removed by sieving struck out.]
of each f (x) that lie in a finite set P of primes. The polynomial f (x)
is assumed fixed and is not part of the input. This sieve algorithm is
the heart of the Quadratic and Number Field Sieve Factoring Algo-
rithms. This algorithm works just like the preceding one, except that
L[i] holds the prime factors of f (i) instead of those of i. The number
of roots of $f(x) \equiv 0 \pmod{p^a}$ is no more than the degree of f.
Algorithm 8.6. Sieve to Factor the Range of a Polynomial.
Input: Integers J > I > 1 and a finite set P of primes.
for (i ← I to J) { L[i] ← empty }
for each p ∈ P {
    Find the roots r_1, ..., r_d of f(x) ≡ 0 (mod p)
    for (j ← 1 to d) {
        i ← the least integer ≥ I and ≡ r_j (mod p)
        while (i ≤ J) { append p to L[i]; i ← i + p }
        a ← 2
        while (p^a ≤ √J) {
            Lift r_j to a root r of f(x) ≡ 0 (mod p^a)
            i ← the least integer ≥ I and ≡ r (mod p^a)
            while (i ≤ J) { append p to L[i]; i ← i + p^a }
            a ← a + 1
        }
    }
}
Output: For I ≤ i ≤ J, L[i] gives the factors in P of f(i).
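A simplified Python rendering of Algorithm 8.6 is sketched below. It departs from the algorithm in two labeled ways: roots modulo each prime power are found by brute force rather than by lifting, and the limit on prime powers is passed in as the parameter `power_bound`. The names `sieve_polynomial` and `power_bound` are assumptions. The sample call uses $f(s) = s^2 - N$ with N = 13290059, the polynomial and factor base of the Quadratic Sieve example in this chapter.

```python
def sieve_polynomial(f, I, J, primes, power_bound):
    """L[i] lists the primes of `primes` dividing f(i), with multiplicity up to power_bound."""
    L = {i: [] for i in range(I, J + 1)}
    for p in primes:
        q = p                                        # q runs through p, p^2, p^3, ...
        while True:
            roots = [r for r in range(q) if f(r) % q == 0]   # brute force, no lifting
            if not roots:
                break
            for r in roots:
                i = I + (r - I) % q                  # least i >= I with i ≡ r (mod q)
                while i <= J:
                    L[i].append(p)
                    i += q
            q *= p
            if q > power_bound:
                break
    return L


N = 13290059
base = [2, 5, 13, 31, 41, 43, 53, 67, 83, 89, 97]
L = sieve_polynomial(lambda s: s * s - N, 3646, 4645, base, 4645)
print(L[4403], L[4637])
```

Dividing f(4403) by the primes listed in L[4403] leaves the large prime 113, which is how a partial relation is recognized.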
factor base is
P = {2, 5, 13, 31, 41, 43, 53, 67, 83, 89, 97}.
For the large prime variation, we allow primes p in the range 100 <
p < 200. Using Algorithm 8.6, sieve f(s) for the first 1000 values
of $s > \sqrt{N}$, that is, the sieve interval is 3646 ≤ s ≤ 4645. We save
relations $s^2 \equiv f(s) \pmod N$ produced by the sieve whenever f(s)
is factored completely using only primes in the factor base (called
a full relation) or if the cofactor of f (s) remaining after all primes
in the factor base have been divided out is a number between 100
and 200 (called a partial relation). Since this number between 100
and 200 has no prime factor less than 100 (which is greater than
$\sqrt{200}$), it must be prime, a large prime. Table 1 shows the relations
harvested from the sieving. The four full relations are shown first,
followed by the fourteen partial relations in order of their large primes.
We omit partial relations whose large prime does not appear in any
other partial relation, as these are not useful since their large prime
cannot be matched to form a square. The repeated large primes
(appearing twice each) are 109, 113, 127, 131, 137, 149, and 157.
Now the relations with repeated large prime factors are combined.
For example, the relations numbered 7 and 8 in Table 1 have large
prime 113 and represent the congruences
$4403^2 \equiv 2 \cdot 5^2 \cdot 13 \cdot 83 \cdot 113 \pmod N$,
$4637^2 \equiv 2 \cdot 5 \cdot 13^2 \cdot 43 \cdot 113 \pmod N$.
Multiply these two congruences to get
$4403^2 \cdot 4637^2 \equiv 2^2 \cdot 5^3 \cdot 13^3 \cdot 43 \cdot 83 \cdot 113^2 \pmod N$.
When we combine all pairs of relations with repeated large primes,
we get the congruences in Table 2. At this point we have the full
relations in the first four lines of Table 1 and the seven combined
relations in Table 2. The prime 67 occurs only in one of these eleven
relations, namely relation 19 in Table 2. Therefore, that 67 cannot
be combined with any other 67, so relation 19 is useless. Each of the
other nine primes in the factor base appears in at least two of the
remaining ten relations. We wish to find a subset of the ten relations
whose product has a right side in which every prime is raised to an
even power. To do this, we form the matrix in Table 3 with one row
per relation and one column per prime in the factor base. The matrix
entry is 1 if the prime appears to an odd power on the right side of
the relation and 0 otherwise. The numbers i on the left side give the
number of the relation. Using linear algebra (we might find a basis for
the null space of the matrix over $F_2$) and remembering that
1 + 1 = 0 in the field $F_2$ with two elements, one finds that the rows
for relations 1, 2, 3, and 20 sum to the 0 vector, as do the rows for
i 2 5 13 31 41 43 53 83 89
1 0 1 0 0 0 1 0 1 0
2 0 1 0 0 0 0 0 0 0
3 0 1 1 0 0 0 0 0 0
4 0 1 0 1 0 1 0 0 0
20 0 1 1 0 0 1 0 1 0
21 1 1 1 0 0 0 1 1 0
22 1 1 0 1 1 0 0 0 1
23 0 0 0 0 0 1 0 1 0
24 1 0 1 1 0 0 1 0 0
25 1 1 1 0 1 1 0 0 1
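One way to find such subsets of rows is Gaussian elimination over $F_2$. The sketch below is an illustration under our own choices (rows packed into Python integers, the hypothetical function names `dependencies` and `sums_to_zero`); it is not claimed to be the procedure used in the text. The rows are copied from the matrix above.

```python
PRIMES = [2, 5, 13, 31, 41, 43, 53, 83, 89]
ROWS = {                      # relation number -> exponent parities
    1:  [0, 1, 0, 0, 0, 1, 0, 1, 0],
    2:  [0, 1, 0, 0, 0, 0, 0, 0, 0],
    3:  [0, 1, 1, 0, 0, 0, 0, 0, 0],
    4:  [0, 1, 0, 1, 0, 1, 0, 0, 0],
    20: [0, 1, 1, 0, 0, 1, 0, 1, 0],
    21: [1, 1, 1, 0, 0, 0, 1, 1, 0],
    22: [1, 1, 0, 1, 1, 0, 0, 0, 1],
    23: [0, 0, 0, 0, 0, 1, 0, 1, 0],
    24: [1, 0, 1, 1, 0, 0, 1, 0, 0],
    25: [1, 1, 1, 0, 1, 1, 0, 0, 1],
}

def dependencies(rows):
    """Return sets of row labels whose rows sum to the zero vector over F_2."""
    pivots = {}               # leading bit -> (reduced row, labels combined so far)
    found = []
    for label, bits in rows.items():
        vec = int("".join(map(str, bits)), 2)
        combo = {label}
        while vec:
            top = vec.bit_length() - 1
            if top not in pivots:
                pivots[top] = (vec, combo)
                break
            pvec, pcombo = pivots[top]
            vec ^= pvec               # addition in F_2 is XOR
            combo = combo ^ pcombo    # symmetric difference of label sets
        else:
            found.append(combo)       # row reduced to zero: a dependency
    return found

def sums_to_zero(labels, rows):
    """Check that the chosen rows add to the zero vector modulo 2."""
    return all(sum(rows[i][c] for i in labels) % 2 == 0
               for c in range(len(PRIMES)))

deps = dependencies(ROWS)
```

Each set in `deps` names relations whose product has every factor-base prime raised to an even power on the right side.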
and y is the square root of the product of the primes on the right
sides, that is,
and
The Quadratic Sieve was used to factor the 129-digit RSA chal-
lenge number mentioned in Sections 1.1 and 4.9. Atkins, Graff,
Lenstra, and Leyland [AGLL95] used a variation of the Quadratic
Sieve with two large primes allowed in each relation. Their factor
base contained 524339 primes. They generated more than 8.4 million
relations. Allowing “two large primes” means that each relation may
be full or have one or two primes larger than the greatest one in the
factor base. When the relations are harvested at the end of the sieve,
after the primes in the factor base have been removed from f (s), if
the remaining cofactor is composite, then some effort is made to fac-
tor this number, usually with the ECM or the SQUFOF. If it can
be factored easily, then the relation is saved. Using two large primes
complicates the linear algebra slightly.
A faster version of the Quadratic Sieve uses multiple polynomials.
See Silverman [Sil87] and Alford and Pomerance [AP93].
In the 1990s, George Sassoon organized all the personal comput-
ers on the Isle of Mull, Scotland, into a factoring group. They used
the multiple polynomial version of the Quadratic Sieve, with each
machine sieving a different set of polynomials. As these computers
were not interconnected, the relations were collected on floppy disks
and carried to one machine for the linear algebra. They factored more
(6) Once $\ell > k$ B-smooth a's have been saved, create a list of
new relations $a^2 \equiv a(a + N) \pmod{N}$.
(7) Form an $\ell \times k$ matrix with one row for each new relation
and one column for each prime factor of a(a + N).
The following table shows the first three and last three relations:
a                         b = N + a
8 = 2^3                   13290067 = 7 · 23^2 · 37 · 97
61 = 61                   13290120 = 2^3 · 3^2 · 5 · 19 · 29 · 67
133 = 7 · 19              13290192 = 2^4 · 3^2 · 17 · 61 · 89
...                       ...
6100 = 2^2 · 5^2 · 61     13296159 = 3^2 · 17 · 43^2 · 47
6241 = 79^2               13296300 = 2^2 · 3 · 5^2 · 23 · 41 · 47
6253 = 13^2 · 37          13296312 = 2^3 · 3^4 · 17^2 · 71
Each row of the table represents a relation $a \equiv b = a + N \pmod{N}$
in which both a and b are B-smooth. We form new relations by
multiplying both sides of $a \equiv b \pmod{N}$ by a to get
$a^2 \equiv ab = a(a + N) \pmod{N}$. Here are the first three and last three new relations:
a^2        ab = a(N + a)
8^2        2^3 · 7 · 23^2 · 37 · 97
61^2       2^3 · 3^2 · 5 · 19 · 29 · 61 · 67
133^2      2^4 · 3^2 · 7 · 17 · 19 · 61 · 89
...        ...
6100^2     2^2 · 3^2 · 5^2 · 17 · 43^2 · 47 · 61
6241^2     2^2 · 3 · 5^2 · 23 · 41 · 47 · 79^2
6253^2     2^3 · 3^4 · 13^2 · 17^2 · 37 · 71
We form an $\ell \times k$ matrix with one row for each relation and one
column for each prime as a factor of ab. The (i, j) entry of the matrix
is 1 if the j-th prime divides a(a + N) of the i-th relation to an odd
power, and it is 0 otherwise. Since the matrix has more rows ($\ell = 31$)
than columns (k = 25), some linear combination of the rows will be
the zero vector. The coefficients of the linear combinations will be
0 or 1. When the relations that appear in this linear combination
are multiplied, the product of the $a^2$ will be square, say $x^2$, and the
product of the a(a + N) will be square, say $y^2$, because the prime
factors have been matched so that each prime factor occurs an even
number of times. Then $x^2 \equiv y^2 \pmod{N}$ and we have a chance to
factor N by Theorem 6.18. Since $\ell = 31$ and k = 25, we will have at
least six chances to factor N. We leave the rest to the reader.
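The relation-collection step can be sketched as follows. The assumptions are ours: N = 13290059 (read off the table, since 13290067 = N + 8), a factor base of the k = 25 primes below 100, and a search bound of 6253 taken from the last table row. The function names are hypothetical.

```python
N = 13290059   # inferred from the table: 13290067 = N + 8
PRIMES = [p for p in range(2, 100)
          if all(p % q for q in range(2, int(p ** 0.5) + 1))]

def parity_vector(n):
    """Exponent parities of n over PRIMES, or None if n is not smooth."""
    vec = []
    for p in PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        vec.append(e % 2)
    return vec if n == 1 else None

def relations(limit):
    """Map each a <= limit with a and a + N both smooth to the parity row of a(a + N)."""
    rows = {}
    for a in range(1, limit + 1):
        va, vb = parity_vector(a), parity_vector(a + N)
        if va is not None and vb is not None:
            rows[a] = [x ^ y for x, y in zip(va, vb)]   # parities add modulo 2
    return rows

rows = relations(6253)
```

Each value in `rows` is one row of the matrix described above, ready for elimination modulo 2.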
8.4. Schroeppel’s Linear Sieve 205
54   59   411037 = 11^2 · 43 · 79
54   60   414736 = 2^4 · 7^2 · 23^2
We form an $\ell \times (k + L)$ matrix with one row for each relation, one
column for each prime as a factor of S(a, b), and one column for each
a or b. For 1 ≤ j ≤ k, the (i, j) entry of the matrix is 1 if the j-th
prime divides S(a, b) of the i-th relation to an odd power, and it is
0 otherwise. For k < j ≤ k + L, the (i, j) entry of the matrix is 1
if j = a or j = b, and it is 0 otherwise. Since the matrix has more
rows ($\ell = 98$) than columns (k + L = 85), some linear combinations
(modulo 2) of the rows will be zero vectors. The coefficients of the
linear combinations will be 0 or 1. When the relations that appear in
such a linear combination are multiplied, the product of the S(a, b)
will be square, say $x^2$, because the prime factors have been matched
so that each prime factor occurs an even number of times; and the
product of the (K + a)(K + b) will be square, say $y^2$, because the
factors K + a and K + b have been matched so that each (K + j)
occurs an even number of times. Then $x^2 \equiv y^2 \pmod{N}$ and we have
a chance to factor N by Theorem 6.18. Since $\ell = 98$ and k + L = 85,
we will have at least 13 chances to factor N. We leave the rest to the
reader.
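The two table rows above can be reproduced with a few lines of code. The definitions used here are assumptions, since they are not restated in this passage: N = 13290059, K = ⌊√N⌋, and S(a, b) = (K + a)(K + b) − N. They are inferred from the table rows, and both rows agree with them.

```python
from math import isqrt

N = 13290059          # assumed modulus (same toy example as earlier in the chapter)
K = isqrt(N)          # assumed definition: K = floor(sqrt(N))

def S(a, b):
    """Assumed sieve value: (K + a)(K + b) - N, so (K + a)(K + b) = S(a, b) mod N."""
    return (K + a) * (K + b) - N

row1 = S(54, 59)      # compare with 11^2 * 43 * 79
row2 = S(54, 60)      # compare with 2^4 * 7^2 * 23^2
```

Because S(a, b) is congruent to (K + a)(K + b) modulo N, matching both the primes and the factors K + j makes both sides squares.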
8.5. The Number Field Sieve 207
The set of all algebraic integers in Q(α) is written Z(α). This set
forms a commutative ring with unity. A unit in Z(α) is an element
having a multiplicative inverse in Z(α). A nonzero, nonunit element
γ of Z(α) is irreducible if it can be factored in Z(α) only as γ = uβ
where u is a unit. When γ = uβ, where u is a unit, β is called an
associate of γ (and γ is an associate of β). An algebraic integer γ
has unique factorization (in Z(α)) if any two factorizations of γ into
the product of irreducible elements and units are the same except for
replacing irreducibles by their associates and using different units.
Example 8.11. One can show that $Z(\sqrt{-6})$ is the set of all $a + b\sqrt{-6}$,
where a, b are integers. In this set, the algebraic integer 10 (the zero
of the monic polynomial x − 10) does not have unique factorization
because
$$10 = 2 \cdot 5 = (2 + \sqrt{-6})(2 - \sqrt{-6})$$
and all four factors 2, 5, $2 + \sqrt{-6}$, $2 - \sqrt{-6}$ are irreducible.
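One standard way to check the irreducibility claim, offered here as a sketch and not necessarily the argument the text has in mind, uses the multiplicative norm on $Z(\sqrt{-6})$:

```latex
N(a + b\sqrt{-6}) = (a + b\sqrt{-6})(a - b\sqrt{-6}) = a^2 + 6b^2,
\qquad N(\beta\gamma) = N(\beta)\,N(\gamma),
\\
N(2) = 4, \qquad N(5) = 25, \qquad N(2 \pm \sqrt{-6}) = 4 + 6 = 10 .
```

A unit has norm 1, so the only units are ±1. A factorization of 2, 5, or $2 \pm \sqrt{-6}$ into two nonunits would need a factor of norm 2 or 5. But $a^2 + 6b^2 = 2$ and $a^2 + 6b^2 = 5$ have no integer solutions: b = 0 would make 2 or 5 a perfect square, and |b| ≥ 1 already gives a norm of at least 6. Since the norms 4 and 25 differ from 10, neither 2 nor 5 is an associate of $2 \pm \sqrt{-6}$, so the two factorizations are genuinely different.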
times and thus the product of the associated relations will be a con-
gruence with two squares of integers congruent modulo the number
N to factor.
The first difficulty is in writing a congruence modulo N with
an algebraic integer on one side. This problem is solved by using a
homomorphism h from the algebraic integers Z(α) to $Z_N$, the integers
modulo N. Suppose we have many algebraic integers $\theta_i$, each factored
into irreducibles, and also every $h(\theta_i)$ factored into the product of
primes. Then we may match the irreducibles and match the primes
to choose a subset of the $\theta_i$ whose product is a square $\gamma^2$ in Z(α) and
so that the product of the $h(\theta_i)$ is a square $y^2$ in the integers. See
Montgomery [Mon94] for a way to find γ from a set of $\theta_i$'s whose
product is $\gamma^2$. Let x = h(γ), a residue class modulo N. We have
$$x^2 = (h(\gamma))^2 = h(\gamma^2) = h\Bigl(\prod_{i \in S} \theta_i\Bigr) = \prod_{i \in S} h(\theta_i) \equiv y^2 \pmod{N},$$
The numbers θ will all have the simple form a − bα. We will seek
a set S of pairs (a, b) of integers such that
(8.2)   $\prod_{(a,b) \in S} (a - bm)$ is a square in Z
and
(8.3)   $\prod_{(a,b) \in S} (a - b\alpha)$ is a square in Z[α].
Let the integer y be a square root of the first product. Let γ ∈ Z[α] be
a square root of the second product. We have $h(\gamma^2) \equiv y^2 \pmod{N}$,
since $h(a - b\alpha) \equiv a - bm \pmod{N}$. Let x = h(γ). Then
$x^2 \equiv y^2 \pmod{N}$, which will factor N with probability at least 1/2, by
Theorem 6.18.
In practical applications, the degree d of the polynomial f (x) is
typically 4, 5, or 6. In addition to being irreducible and having a
known zero m modulo N , we want f (x) to have “small” coefficients
compared to N . (Actually, we want the norm function N ( ) to have
small values, so it is important for the high-order coefficients of f (x)
to be very small.) There are several ways one might satisfy all these
conditions.
The requirements on f(x) are easily met in the Special Number
Field Sieve, which factors numbers $N = r^e - s$, where r and |s| are
small positive integers. Cunningham numbers have this form with
s = ±1. Let k be the least positive integer for which kd ≥ e. Let
$t = s r^{kd-e}$. Let f(x) be the polynomial $x^d - t$. Let $m = r^k$. Then
$$f(m) = r^{kd} - s r^{kd-e} = r^{kd-e} N \equiv 0 \pmod{N}.$$
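The construction is short enough to check by machine. The sketch below follows the recipe just given; the function name `snfs_polynomial` and the sample values (r = 2, e = 93, s = −1, d = 5, that is, the Cunningham number 2^93 + 1) are our own choices for illustration.

```python
def snfs_polynomial(r, e, s, d):
    """Return (t, m, N) so that f(x) = x**d - t has the zero m modulo N = r**e - s."""
    k = -(-e // d)                 # least k with k*d >= e (ceiling division)
    t = s * r ** (k * d - e)
    m = r ** k
    return t, m, r ** e - s

# Example: N = 2^93 + 1 with a polynomial of degree 5.
t, m, N = snfs_polynomial(2, 93, -1, 5)
residue = (m ** 5 - t) % N         # f(m) mod N, expected to vanish
```

Here k = 19, so f(x) = x^5 + 4 and m = 2^19, and f(m) = 2^95 + 4 = 4N.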
In the general case, called the General Number Field Sieve, one
standard approach to finding a good polynomial (of degree 5, say) to
factor N is to let m be an integer slightly smaller than $N^{1/5}$, so that
N has six digits in base m. Write
$N = \sum_{i=0}^{5} d_i m^i$ in base m. The digits $d_i$ will be in the interval $0 \le d_i < m$,
small compared to N. Then let the polynomial be $f(x) = \sum_{i=0}^{5} d_i x^i$.
This method makes all the coefficients of f (x) somewhat small but
does not make the high-order coefficients especially small. See the
article [BLP93] by Buhler, Lenstra, and Pomerance for the origins of
the General Number Field Sieve. Montgomery and Murphy [MM99]
give better ways to choose a polynomial for the General Number Field
Sieve.
Example 8.14. Let us find a polynomial for factoring
N = 37965134430918647673876906901814739053246839817137.
Let d = 5. Now $\sqrt[5]{N}$ is approximately m = 8239047153. If we write
N in base m, we find
$$N = m^5 + 2m^4 + 5m^3 + 2267016550m^2 + 6448349153m + 3629348338.$$
Thus we let
$$f(x) = x^5 + 2x^4 + 5x^3 + 2267016550x^2 + 6448349153x + 3629348338.$$
We automatically have f(m) = N ≡ 0 (mod N). The number field is
K = Q(α), where α is a zero of f.
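The base-m method is easy to reproduce. In this minimal sketch the function names `base_m_polynomial` and `evaluate` are hypothetical; N and m are the values of Example 8.14.

```python
def base_m_polynomial(N, m):
    """Return the base-m digits [d_0, d_1, ...] of N, lowest order first."""
    digits = []
    while N:
        N, d = divmod(N, m)
        digits.append(d)
    return digits

def evaluate(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) by Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

N = 37965134430918647673876906901814739053246839817137
m = 8239047153
coeffs = base_m_polynomial(N, m)   # coefficients of f, constant term first
```

Printing `coeffs[::-1]` lists the coefficients from the leading term down, for comparison with the polynomial f(x) in the example. By construction, `evaluate(coeffs, m)` returns N, which is the statement f(m) = N.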
for some constant c > 0. The constant c is a bit smaller for the Special
than for the General Number Field Sieve because the coefficients can
be made smaller.
In summary, the Number Field Sieve has these steps.
One can find good polynomials for the Special Number Field Sieve
for numbers in any divisibility sequence (and numbers near numbers
in a divisibility sequence).
Example 8.18. Find a fifth degree polynomial for factoring the Fibonacci
number $u_{1289}$ by the Special Number Field Sieve.
The Fibonacci numbers enjoy many identities. The identity
(8.4)   $u_n^5 + 10u_n^3 u_{n+1}^2 + 10u_n^2 u_{n+1}^3 + 10u_n u_{n+1}^4 + 3u_{n+1}^5 = u_{5n+4}$
is easy to prove from the formula $u_n = (\alpha^n - \beta^n)/(\alpha - \beta)$, where α
and β are the roots of $x^2 - x - 1 = 0$. In it let n = 257 and divide
through by $u_{258}^5$. The identity shows that $m = u_n/u_{n+1} = u_{257}/u_{258}$
is a zero modulo $u_{5n+4} = u_{1289}$ of the polynomial
$$f(x) = x^5 + 10x^3 + 10x^2 + 10x + 3.$$
Compute m by inverting $u_{258}$ modulo $u_{1289}$ via the Extended Euclidean
Algorithm and multiplying $u_{257}$ by this inverse modulo $u_{1289}$.
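The computation of m can be sketched in a few lines. The code assumes Python 3.8 or later, where the built-in `pow` with exponent −1 computes a modular inverse (it plays the role of the Extended Euclidean Algorithm here); the function names are our own.

```python
def fib(n):
    """Return the Fibonacci number u_n with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def f(x):
    """The polynomial of Example 8.18."""
    return x**5 + 10 * x**3 + 10 * x**2 + 10 * x + 3

n = 257
N = fib(5 * n + 4)                         # u_1289, the number to factor
m = fib(n) * pow(fib(n + 1), -1, N) % N    # u_257 / u_258 modulo u_1289
residue = f(m) % N                         # should be 0 by identity (8.4)
```

The inverse exists because gcd(u_258, u_1289) = u_gcd(258, 1289) = u_1 = 1.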
Exercises
3 6,930L and 6,930M are the Aurifeuillian factors of $6^{930} + 1$.
Chapter 9
Factoring Devices
Introduction
This chapter describes some hardware used to factor integers. The
next section discusses devices to perform the sieve algorithms of Chap-
ter 8, from paper strips to electronic shift registers. The last section
lists several computers with special hardware attachments designed to
speed parts of certain factoring algorithms. Some of these machines
were actually built and factored numbers. Others were proposed but
never built. The last section also describes factoring by the new tech-
nologies of quantum computing and DNA computing.
[Figure 1. Lawrence's paper strip sieve. Rows: values of a from 13 to 29; columns: the moduli 4, 3, 5, 7.]
toothpicks to plug the holes for unacceptable residues; the holes for
the $s_{ij}$ were left open. The 30 gears had a common line of tangency
so that a light could shine through the holes not plugged. The gears
were rotated from some initial position. When a light ray shined
through all the gears, a photoelectric cell would detect it and halt
the gears. See Figure 4. Then a human could discover the value of
a with help from a counter. This mechanical device processed 5000
values of a per second in 1932, faster than any machine could execute
Algorithm 9.2 until it was programmed on the IBM 7094 computer
in the early 1960s. The photoelectric sieve factored many numbers
from the Cunningham Project, including
$$(2^{93} + 1)/(3 \cdot 3 \cdot 715827883) = 529510939 \cdot 2903110321.$$
D. N. Lehmer wrote a whimsical account [Leh33b] of the operation
of his son’s photoelectric sieve. He begins as follows:
On the 19th of October a little group of mathemati-
cians gathered in the Burt Laboratories in Pasadena,
California, around a mysterious little machine to
watch it attack a problem in mathematics. It was a
simple enough problem to state. It had only to find
9.1. Sieve Devices 225
$$\Phi_{240}(2) = \frac{(2^{120} + 1)(2^{8} + 1)}{(2^{24} + 1)(2^{40} + 1)} = 394783681 \cdot 46908728641.$$
In 1937, Gérardin [Gér37] built a sieve, but little is known about it.
Dick Lehmer and Paul Morton built the Delay Line Sieve DLS-
127 in 1965. A delay line is a circuit that transmits an electric pulse
at a constant speed. The DLS-127 had 31 delay lines with lengths
proportional to powers of the 31 primes up to 127. The pulses in the
i-th delay line circulated (mod $p_i$) and were initialized to represent
the residues $s_{ij}$. As the pulses passed a solution tap, their logical
AND was computed, and the current value of a was saved when all
31 values were 1, meaning that a solution was detected. This machine
See Sections 7.3, 8.2, 8.3, and 16.1 of Williams [Wil98] for much more
about the history of sieve devices.
An even faster sieve of this type could use k lasers, one for
each modulus $p_i$, to generate streams of light pulses to represent the
residues $s_{ij}$. An optical sensor would observe the streams and report
a whenever it saw a pulse from each laser simultaneously. A machine
like this in theory could process more than $10^{15}$ values of a per second.
When I last checked on the feasibility of constructing such a device,
the problem was that it is hard for an optical sensor to distinguish
between pulses from all k lasers and from only k − 1 or k − 2 of them.
If it reports a whenever it sees at least k − 1 pulses together, there
will be very many false solutions, perhaps too many for its computer
to handle1 .
Another application of sieve devices is to compute pseudosquares.
These are positive integers that behave like squares modulo all of the
first few primes but which are not square.
Definition 9.3. Let p be an odd prime. The pseudosquare $L_p$ is
the smallest positive nonsquare integer with $L_p \equiv 1 \pmod 8$ and the
Legendre symbol $(L_p/q) = +1$ for all odd primes q ≤ p.
Example 9.4. The first few pseudosquares (and some larger ones)
are shown in Table 1.
p     L_p
3 73
5 241
7 1009
11 2641
13 8089
17 18001
19 53881
53 22000801
101 10310263441
193 2854909648103881
241 2327687064124474441
277 69848288320900186969
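For small p, Definition 9.3 can be applied directly by brute force. The following sketch is ours, not a description of any sieve device: it steps through the integers congruent to 1 modulo 8 and tests the Legendre symbols with Euler's criterion. The function names are hypothetical.

```python
from math import isqrt

def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    r = pow(a, (q - 1) // 2, q)
    return -1 if r == q - 1 else r

def pseudosquare(p):
    """Smallest nonsquare n = 1 (mod 8) with (n/q) = +1 for all odd primes q <= p."""
    odd_primes = [q for q in range(3, p + 1, 2)
                  if all(q % d for d in range(3, isqrt(q) + 1, 2))]
    n = 9                                   # candidates are 9, 17, 25, ...
    while True:
        if isqrt(n) ** 2 != n and all(legendre(n, q) == 1 for q in odd_primes):
            return n
        n += 8

small = [pseudosquare(p) for p in (3, 5, 7, 11, 13, 17, 19)]
```

This direct search is practical only for the first few table entries; the larger pseudosquares are far beyond the reach of a loop like this one.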
and the process repeated. The main processor used the report vec-
tors to determine whether Qi was smooth enough for the relation
to be saved. A personal computer connected to the main processor
stored the smooth relations. The linear algebra was done on a larger
computer, an IBM 370.
qubits. For example, for an AND gate with output x and inputs a
and b, x would be in state 1 if and only if both a and b are in state
1. Since classical computers are built from logic gates, one can build
quantum computers to compute any function.
The basic idea of Shor's quantum computer algorithm to factor an
integer N is to compute the order r of a random integer x modulo N,
that is, find the smallest positive integer r for which $x^r \equiv 1 \pmod{N}$.
If r is even, one can factor N by computing $\gcd(x^{r/2} - 1, N)$, via
Theorem 6.18. Each x has probability at least 1/2 of leading to a
factor of N.
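The classical part of this idea can be tried on tiny numbers. In the sketch below the order is found by brute force, which merely stands in for the quantum step and is hopeless for large N; the function names are our own, and x is assumed to satisfy 1 < x < N.

```python
from math import gcd

def order(x, N):
    """Multiplicative order of x modulo N (x coprime to N), by brute force."""
    r, y = 1, x % N
    while y != 1:
        y = y * x % N
        r += 1
    return r

def factor_from_order(x, N):
    """Return a nontrivial factor of N obtained from x, or None if this x fails."""
    g = gcd(x, N)
    if g != 1:
        return g                      # lucky: x already shares a factor with N
    r = order(x, N)
    if r % 2:
        return None                   # odd order: try another x
    g = gcd(pow(x, r // 2, N) - 1, N)
    return g if 1 < g < N else None
```

For N = 15 and x = 7 the order is 4 and gcd(7^2 − 1, 15) = 3. For x = 14 the order is 2 but 14 ≡ −1 (mod 15), so that choice of x fails.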
To compute r, one might put all possible r in one quantum register,
compute all possible $x^r \bmod N$ in another quantum register
entangled with the first one, and try to observe the value 1 in the
second register. This observation would fix all the qubits and one
could read the value of r from the first register. Unfortunately, this
method doesn’t quite work because one cannot “observe the value 1.”
One can only observe whatever value is in the second qubit register,
and it probably won’t be 1.
Instead, we let $q = 2^t$ be the power of 2 with $N^2 \le q < 2N^2$. Let a
quantum register with t qubits hold the superposition of all integers
a modulo q. Compute $x^a \pmod{N}$ in another quantum register.
Perform a discrete Fourier transform on the first register. This is done
by multiplying the register's contents by a series of unitary matrices (a
rotation done by a microwave pulse or flipping an external magnetic
field). The Fourier transform changes the probability distribution of
the values in the first register so as to focus on periodicity in the
second register, that is, values that make $x^a \bmod N$ repeat, passing
through the value 1. Now observe the qubits in the second register.
With high probability, the second register will contain 1 and the first
register will contain an integer c with
$$\left| \frac{c}{q} - \frac{d}{r} \right| \le \frac{1}{2q},$$
for some integer d. Since $q \ge N^2$, at most one fraction d/r with
r < N can satisfy this inequality, by Theorem 6.7. Using a classical
computer, expand c/q in a simple continued fraction, computing the
convergents $A_k/B_k$ by Theorem 6.5. One of these convergents will
2
The Hamiltonian path problem is to determine whether a graph has a path that
visits each node exactly once. N P-complete means that the problem is as hard as any
problem in class N P.
9.2. Special Computers 235
(2) Merge. Given two tubes P1 and P2 , form the tube ∪(P1 , P2 )
containing all DNA strands in either P1 or P2 . Pour the two
tubes into one.
(3) Detect. Given a tube P , answer “yes” if P contains at least
one DNA strand and answer “no” if P is empty.
(4) Copy. Given a tube P , create a new tube P1 identical to P .
(5) Append. Given a tube P and a DNA strand S, append S
to the end of each DNA strand in P .
(6) Read. Given a tube P , print the sequence of elementary
pieces of one DNA strand in P .
One can also delete and rename tubes.
Here is an example of an algorithm that generates a tube containing
DNA representing all $2^k$ k-bit numbers. In the algorithm, 0 and
1 mean DNA strings representing these bits. The algorithm begins
with a tube containing both 0 and 1. It copies it and appends 0 to
each DNA strand in the first tube, creating a tube with DNA 00 and
10. It appends 1 to each DNA in the second tube, creating a tube
with DNA 01 and 11. It merges these two tubes and repeats this
process until a tube containing all k-bit numbers is formed.
Algorithm 9.7. Create a tube with all k-bit numbers.
Input: An integer k > 1.
Create a tube P containing 0 and 1
for (i ← 2 to k) {
copy (P , P1 )
append (P , 0)
append (P1 , 1)
P2 ← ∪(P, P1 )
delete P and P1
rename P2 as P
}
Output: Tube P contains all k-bit numbers.
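Algorithm 9.7 can be simulated on an ordinary computer by modeling a tube as a set of bit strings. This is only a software analogy under that assumption: Python sets stand in for tubes, and set operations stand in for the copy, append, and merge operations.

```python
def all_k_bit_strands(k):
    """Simulate Algorithm 9.7 with a tube modeled as a set of bit strings."""
    tube = {"0", "1"}                        # tube P containing 0 and 1
    for _ in range(2, k + 1):
        tube1 = set(tube)                    # copy(P, P1)
        tube = {s + "0" for s in tube}       # append(P, 0)
        tube1 = {s + "1" for s in tube1}     # append(P1, 1)
        tube = tube | tube1                  # merge, then rename the result as P
    return tube
```

After the loop the set holds one string for each of the 2^k k-bit numbers, mirroring the tube that the algorithm outputs.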
Using the simple operations on tubes, one can define more com-
plicated procedures to add numbers, multiply numbers, and compare
numbers. To factor an integer N that is the product of two k-bit
primes, load a tube with all k-bit primes (or all k-bit numbers), mul-
tiply every pair of them with each product labeled by its two factors,
compare each product with N (bit by bit with extract), and read a
DNA strand that contains N . The two factors of N are also part of
this strand. According to Chang et al. [CGH05], a tube can hold
up to $10^{18}$ DNA molecules. The number of elementary biological
operations needed to factor the product N of two k-bit primes is only
$O(k^3)$ steps, which is polynomial time. However, the total number of
DNA strands needed to factor N is $O(2^k)$, so k is limited to about 64.
Since it is easy to factor an integer $N < 2^{128}$ with the Elliptic Curve
Method or the Quadratic or Number Field Sieve, DNA computing
with known methods is not more powerful than conventional algo-
rithms. The problem is that the known algorithm for DNA factoring
is Trial Division, which is slow, even though highly parallel. This
negative assessment would improve if someone could find a way to
perform a faster factoring algorithm on a DNA computer.
Shamir [Sha99] proposed an optoelectronic device called Twinkle
for the Number Field Sieve. It stands for The Weizmann INstitute
Key Locating Engine. It performs the sieve step for the Quadratic or
Number Field Sieve one candidate number at a time. Each prime in
the factor base is represented by a light-emitting diode (LED). The
LED for p glows once every p time units and its intensity is propor-
tional to log p. An optical sensor monitors the combined brightness of
all the LEDs and reports the current value of a counter whenever the
total intensity exceeds a threshold. In effect, the device adds the log-
arithms of all the prime divisors of the candidate number and reports
the B-smooth ones (those with large enough sum of logarithms). This
is the way the Quadratic and Number Field Sieves work on regular
computers, adding the logarithms of the prime divisors as scaled bi-
nary integers. Only a crude prototype was built; it never factored
any serious number.
Shamir and Tromer [ST03] designed the TWIRL in 2003. This
acronym stands for The Weizmann Institute Relation Locator. It is
a hypothetical hardware device that performs the sieving step of the
Number Field Sieve in a highly parallel fashion, using optical sensors
to sum the logarithms of the prime factors of the candidate numbers.
It handles small, medium, and large primes differently. It would cost a
Exercise
9.1. Suppose a laser sieve could be built and it could process $10^{15}$
values of a per second. How large a number N could it factor
reliably by Algorithm 9.2?
Chapter 10
Theoretical and
Practical Factoring
Introduction
Oliver Atkin was referring to expressing a positive integer as a sum of
two squares, but the remark applies equally to factoring. Some the-
oretical mathematicians and computer scientists want to prove that
an algorithm works correctly in all cases. Some algorithms, like the
fastest ones discussed in this book, do not have this property. That
is, there is no proof that they always succeed. However, they have
worked on every number tried, which is thousands or even millions
of numbers. Practical mathematicians and computer scientists like
Oliver Atkin take the view that these algorithms are perfectly good.
They would say that once a number has been factored, it does not
matter whether you could have proved before the algorithm started
that it would factor the number. Theoretical folks argue that we do
not fully understand an algorithm until we can prove when it works
and when it fails. The world needs both theoretical and practical
people.
The theorem says that when N is large, almost all of the algorithms
$A_S$ with correct parameters n and v will factor N successfully
and that all $A_S$ run in subexponential time $L(N)^{3\sqrt{2}}$.
There is a more complicated algorithm due to Lenstra and Pomer-
ance [LP92] that factors N with the rigorously proved time bound
of L(N ) steps. The algorithm uses class groups of binary quadratic
forms and is currently the fastest known integer factoring algorithm
with rigorous expected time bound.
Although they have rigorous proofs of their time complexities,
the algorithms mentioned above are much slower than the Quadratic
and Number Field Sieves.
The RSA cryptosystem and several other cryptographic algo-
rithms depend on factoring integers being a hard problem. Can we
prove that factoring integers is hard?
One must phrase this question carefully. Suppose N has a million
decimal digits and the low-order digit is 6. Then obviously 2 is a prime
factor of N . One can trivially “factor” N as 2 · (N/2).
Here is one way to ask the question. As a function of N , what
is the minimum, taken over all integer factoring algorithms, of the
maximum, taken over all integers M between 2 and N , of the num-
ber of steps the algorithm takes to factor M ? The integer factoring
algorithm has input M and outputs a list of all prime factors of M .
Integer factoring would be a polynomial-time problem if the answer
were $O((\log N)^c)$ for some constant c.
The model of computation, that is, what constitutes a “step,”
probably matters a lot in answering the question. Suppose we decide
that a single arithmetic operation (+, −, ×, ÷) is one step. Assume
that integers are stored in binary notation and that each register
(memory location) may hold one integer of any size. Shamir [Sha79]
proved that, in this model, one can factor N in O(log N ) steps. We
give a simplified version of his algorithm that runs in $O(\log^3 N)$ steps.
Assume N > 4.
10.1. Theoretical Factoring 243
$$(2^i + 1)^{2m} = \sum_{j=0}^{2m} \binom{2m}{j} 2^{i \cdot j}.$$
Writing in binary, each term $\binom{2m}{j} 2^{i \cdot j}$ is just $\binom{2m}{j}$ shifted left ij bits.
If i is large enough, each $\binom{2m}{j}$ will occupy a separate block of i bits,
so it can be isolated. Since $\binom{2m}{j} \le 2^{2m}$, i = 2m is large enough. The
low-order k bits of x are $x \bmod 2^k = x - \lfloor x/2^k \rfloor \cdot 2^k$. To isolate bits
$k_1 + 1$ through $k_2$ of x, subtract the lower $k_1$ bits of x from the lower
$k_2$ bits of x and divide by $2^{k_1}$.
Finally, $(2^i + 1)^{2m}$ may be computed in O(log m) steps by Fast
Exponentiation. Combining all these results demonstrates that we
can compute k! in $O(\log^2 k)$ steps.
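The bit-extraction step can be illustrated with ordinary big-integer arithmetic. This sketch covers only the isolation of one binomial coefficient, not the whole factoring algorithm; the function name is hypothetical and m ≥ 1 is assumed.

```python
from math import comb

def central_binomial(m):
    """Read C(2m, m) out of the binary digits of (2**i + 1)**(2m), with i = 2m."""
    i = 2 * m
    x = (2 ** i + 1) ** (2 * m)
    k1, k2 = m * i, (m + 1) * i     # the block for j = m is bits k1+1 through k2
    low_k2 = x % 2 ** k2            # lower k2 bits of x
    low_k1 = x % 2 ** k1            # lower k1 bits of x
    return (low_k2 - low_k1) // 2 ** k1
```

For m = 2 we have x = 17^4 = 83521, whose 4-bit blocks are 1, 4, 6, 4, 1, and the function returns the middle block, 6.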
Shamir’s result shows that we can factor N in a polynomial (in
log N ) number of steps provided that an arithmetic operation with
numbers as large as N ! counts as one step. This shows why we need
to consider the difficulty of arithmetic with large numbers when an-
alyzing number-theoretic algorithms. This complexity is the subject
of the next section.
The following algorithm multiplies two integers $X = \sum_{i=0}^{k} x_i B^i$,
$Y = \sum_{i=0}^{m} y_i B^i$ to form their product $Z = \sum_{i=0}^{k+m+1} z_i B^i$. The normal
schoolboy method would multiply X times each digit $y_i$ of Y, writing
the intermediate products in a slanted parallelogram and then
adding the shifted partial products to form the overall product. On
a computer, space is saved by adding the intermediate products into
the final product as they are formed. The space for the final product
is initialized to 0 by the first for loop. The two nested for loops
multiply pairs of 1-digit numbers and add them into the final product
being formed. The variable carry remembers the carry from one
column to the next one to the left.
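A minimal Python sketch of the method just described follows. It is our own rendering, not a transcription of the book's algorithm: digits are stored least significant first in lists, and the base B defaults to 10 for readability.

```python
def multiply(x, y, B=10):
    """Schoolbook product of nonempty digit lists x and y (least significant digit first)."""
    z = [0] * (len(x) + len(y))               # clear the space for the product
    for j, yj in enumerate(y):
        carry = 0
        for i, xi in enumerate(x):
            total = z[i + j] + xi * yj + carry
            z[i + j] = total % B              # digit kept in this column
            carry = total // B                # carry into the next column
        z[j + len(x)] = carry                 # this column has not been used yet
    return z

product = multiply([3, 2, 1], [6, 5, 4])      # 123 * 456, digits reversed
```

The running total never exceeds B^2 − 1, so the carry always fits in a single digit.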
3
q.g. = quod googla = “which you can Google,” from the first declension Latin
verb googlare, “to Google.”
10.3. Factoring—There’s an App for That 247
(2) Look for small factors first. Maybe the input was mistyped or
part of it was omitted in a cut-and-paste operation.
(5) Remember that algorithms like the Pollard Rho Method and
the Elliptic Curve Method are probabilistic. They may discover
a large factor before a small one. Do not expect prime factors
to be discovered in order of size.
avoid this trap (and another trap) is for the Prover and Verifier to send
each other the low-order bits of b and d and send the remaining bits
only after receiving the low-order bits of the other person’s number.
10.4.3. Factor an RSA public key. The rest of the dirty tricks
are ways to factor an RSA public modulus N when some information
is known about it or the parameters are chosen poorly. None of these
tricks actually threaten the RSA cipher because, when the cipher
is used properly, the needed information about the modulus is not
leaked. There are other attacks on the RSA cipher besides factoring
N . For example, if Alice happens to encipher the same message M
via RSA using different enciphering exponents e1 and e2 , but the
same RSA modulus N , then an attacker can recover M from the two
ciphertexts without factoring N . Boneh [Bon99] describes all the
attacks below and many more.
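The common-modulus attack mentioned above can be sketched as follows. The sketch assumes gcd(e1, e2) = 1 and that the ciphertexts are invertible modulo N; neither condition is stated in the text. The tiny key N = 3233 = 61 · 53 is a made-up example, and the code needs Python 3.8 or later for negative exponents in `pow`.

```python
def egcd(a, b):
    """Return (g, u, v) with u*a + v*b = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = egcd(b, a % b)
    return g, v, u - (a // b) * v

def common_modulus_attack(c1, c2, e1, e2, N):
    """Recover M from c1 = M**e1 and c2 = M**e2 (mod N) when gcd(e1, e2) = 1."""
    g, u, v = egcd(e1, e2)
    if g != 1:
        raise ValueError("exponents must be coprime")
    # u*e1 + v*e2 = 1, so c1**u * c2**v = M; a negative exponent uses a modular inverse.
    return pow(c1, u, N) * pow(c2, v, N) % N

N, e1, e2, M = 3233, 17, 7, 65                # toy values, for illustration only
recovered = common_modulus_attack(pow(M, e1, N), pow(M, e2, N), e1, e2, N)
```

No factor of N is ever computed, which is the point of the remark in the text.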
Theorem 10.6 shows why one must not reveal φ(N ) when N is
an RSA public key.
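Theorem 10.6 is not reproduced in this passage. As a hedged illustration of why φ(N) must stay secret, here is one standard argument for N = pq with distinct primes p and q: since φ(N) = (p − 1)(q − 1), the sum p + q equals N − φ(N) + 1, and p and q are the roots of a quadratic. The function name and the toy key are our own.

```python
from math import isqrt

def factor_from_phi(N, phi):
    """Recover (p, q) with p >= q from N = p*q and phi = (p - 1)*(q - 1)."""
    s = N - phi + 1                  # s = p + q
    disc = s * s - 4 * N             # (p - q)**2
    root = isqrt(disc)
    if root * root != disc:
        raise ValueError("phi is not consistent with N = p*q")
    return (s + root) // 2, (s - root) // 2

p, q = factor_from_phi(3233, 3120)   # toy key: 3233 = 61 * 53
```

The whole computation is a few arithmetic operations and one integer square root.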
One must also not reveal the deciphering exponent d when N is an
RSA public key. Not only could one decipher all ciphertext knowing
d, but one can factor N , too. Of course, the enciphering exponent e
is public. This theorem appears in the original RSA paper [RSA78].
p and q are (unknown) primes with q < p < 2q, and there is an
(unknown) integer $d < N^{1/4}/3$ satisfying ed ≡ 1 (mod φ(N)).
This shows that k/d ≈ e/φ(N). Now e is known, but φ(N) is unknown.
But we can approximate φ(N) by N, as we now show. Since
q < p < 2q and pq = N, we have $q < \sqrt{N}$ and
$$0 < N - \phi(N) = p + q - 1 < 3q - 1 < 3\sqrt{N}.$$
Therefore,
$$\left| \frac{e}{N} - \frac{k}{d} \right| = \left| \frac{ed - kN}{dN} \right| = \left| \frac{1 - k(N - \phi(N))}{dN} \right|$$
because ed = 1 + kφ(N). Using $0 < N - \phi(N) < 3\sqrt{N}$ gives
$$\left| \frac{e}{N} - \frac{k}{d} \right| \le \frac{3k\sqrt{N}}{dN} = \frac{3k}{d\sqrt{N}}.$$
Since kφ(N) = ed − 1 < ed and e < φ(N), we have $k < d < N^{1/4}/3$.
Therefore,
$$\left| \frac{e}{N} - \frac{k}{d} \right| \le \frac{1}{dN^{1/4}} < \frac{1}{2d^2}.$$
This inequality is an instance of formula (6.4), so by Theorem 6.7, the
fraction k/d must be one of the convergents of the simple continued
fraction expansion of the number x = e/N. Now N and e are given,
so we know x and can compute the convergents $A_m/B_m$ of its continued
fraction in polynomial time using formulas (6.3), computing
the partial quotients $q_i$ by Algorithm 6.1. Try each $B_m$ as d and test
whether $M^{ed} \equiv M \pmod{N}$ for a few random M. As the $B_m$ grow
exponentially with m and $d < N^{1/4}/3$, we try at most O(log N) of
them.
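The search over convergents can be sketched as follows. This is an illustration under stated assumptions, not the book's code: the convergents are generated by the usual recurrence, and a few fixed bases M replace the random M of the text so that the run is reproducible. The toy key N = 90581 = 379 · 239 with e = 17993 and d = 5 is our own example.

```python
def convergents(num, den):
    """Yield the convergents (A, B) of the simple continued fraction of num/den."""
    a_prev, a_cur = 0, 1          # A_{-2}, A_{-1}
    b_prev, b_cur = 1, 0          # B_{-2}, B_{-1}
    while den:
        q, r = divmod(num, den)   # next partial quotient
        a_prev, a_cur = a_cur, q * a_cur + a_prev
        b_prev, b_cur = b_cur, q * b_cur + b_prev
        yield a_cur, b_cur
        num, den = den, r

def wiener(e, N, bases=(2, 3, 5, 7)):
    """Return a small deciphering exponent d among the convergent denominators, or None."""
    for _, d in convergents(e, N):
        if all(pow(M, e * d, N) == M % N for M in bases):
            return d
    return None

d = wiener(17993, 90581)
```

Here e/N = [0; 5, ...], so the second convergent is 1/5 and the test succeeds with d = 5.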
Note that for any vector u, u · u ≥ 0 since it is the sum of squares.
The zero vector 0 = (0, . . . , 0) is the only vector with length 0.
The Gram-Schmidt process accepts a basis $\{v_1, v_2, \ldots, v_r\}$ and
computes an orthogonal basis $\{w_1, w_2, \ldots, w_r\}$ for the same vector
space. It sets $w_1 = v_1$. To compute $w_i$ for i > 1, it projects $v_i$
orthogonally onto the subspace U generated by $\{w_1, \ldots, w_{i-1}\}$, which
is the same as the subspace generated by $\{v_1, \ldots, v_{i-1}\}$. Then $w_i$ is
defined to be the difference between $v_i$ and this projection. This
construction makes $w_i$ orthogonal to all the vectors in the subspace
U. The projection is computed one vector at a time by the for loop.
Example 10.10. In $R^4$, let V be the subspace with basis $\{v_1, v_2, v_3\}$,
where $v_1 = (1, 2, 3, 0)$, $v_2 = (1, 2, 0, 0)$, and $v_3 = (1, 0, 0, 1)$. Find an
orthogonal basis $\{w_1, w_2, w_3\}$ for V.
First we have $w_1 = v_1 = (1, 2, 3, 0)$.
Then
$$w_2 = v_2 - ((v_2 \cdot w_1)/(w_1 \cdot w_1))\,w_1 = (1/14)(9, 18, -15, 0),$$
$$w_3 = v_3 - ((v_3 \cdot w_1)/(w_1 \cdot w_1))\,w_1 - ((v_3 \cdot w_2)/(w_2 \cdot w_2))\,w_2 = (1/5)(4, -2, 0, 5).$$
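The example can be verified with exact rational arithmetic. The sketch below is a minimal rendering of the Gram-Schmidt process as described in the text, using Python's `fractions.Fraction` to avoid rounding; the function names are our own.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return an orthogonal basis for the span of linearly independent vectors."""
    ortho = []
    for v in basis:
        w = [Fraction(c) for c in v]
        for u in ortho:                      # subtract the projection of v on each earlier w
            coeff = Fraction(dot(v, u)) / dot(u, u)
            w = [wc - coeff * uc for wc, uc in zip(w, u)]
        ortho.append(w)
    return ortho

w1, w2, w3 = gram_schmidt([(1, 2, 3, 0), (1, 2, 0, 0), (1, 0, 0, 1)])
```

The vectors stay rational, so the results can be compared exactly with (1/14)(9, 18, −15, 0) and (1/5)(4, −2, 0, 5).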
defined in terms of any basis. If L has full rank, then B is square and
det(L) = | det(B)|. This is the only case we will use below.
There exists a vector $v_1$ in a lattice L with minimum positive
norm, the shortest vector. Let $\lambda_1(L)$ be the norm of this shortest
$v_1$. Every vector w ∈ L which is linearly dependent on $v_1$ must be
$w = tv_1$ for some integer t. At least two vectors have this minimum
positive length since the norm of −v equals the norm of v.
We generalize $\lambda_1(L)$ as follows. For integer k ≥ 1, let $\lambda_k(L)$ be
the smallest positive real number so that there is at least one set of
k linearly independent vectors of L, with each vector having length
≤ $\lambda_k(L)$. This defines a sequence
Clearly, if we can solve (3), then we can solve (2); and if we can
solve (2), then we can solve (1).
In fact, (3) is equivalent to (2). It is not known whether (3)
is equivalent to (1). All three problems seem hard, although Shor
showed that one can solve all three quickly on a quantum computer.
We will give a complete proof of the equivalence of (2) and (3) in
case p and q have the same length in bits. Then we will sketch some
other results.
The coefficient vector of a polynomial $h(x) = h_0 + h_1 x + h_2 x^2 + \cdots + h_n x^n$
is the vector $v = (h_0, h_1, \ldots, h_n)$ and $\|h(x)\| = \|v\|$. We
begin with a useful lemma. In it, h(xX) is the polynomial with
coefficient vector $(h_0, h_1 X, h_2 X^2, \ldots, h_n X^n)$.
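Proof. We have
\[
|h(x_0)| = \Bigl|\sum_i h_i x_0^i\Bigr|
         = \Bigl|\sum_i h_i X^i \Bigl(\frac{x_0}{X}\Bigr)^i\Bigr|
         \le \sum_i \Bigl|h_i X^i \Bigl(\frac{x_0}{X}\Bigr)^i\Bigr|
         \le \sum_i \bigl|h_i X^i\bigr|
         \le \sqrt{r}\,\|h(xX)\| < M.
\]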
Theorem 10.14 (May). Let N = pq, where p and q are two primes
of the same bit length. Let positive integers e and d satisfy ed ≡
1 (mod φ(N)) and ed ≤ N^2. There is a deterministic polynomial-
time algorithm which factors N given the input N, e, d.
while the penultimate factor of F10 has 40 digits, which the ECM can
discover only with considerable effort (at least in the 1990s). Can one
guess from this table in what year F12 will be completely factored?
Six small factors of F12 are already known. The remaining cofactor
is composite and has 1133 decimal digits. This number is much too
large to factor by the QS or the NFS. The only way it will be factored
soon by a known method is if it has just one very large prime factor
and we can discover one or more small prime factors by the ECM.
The table shows that six different algorithms were used to factor the
seven numbers. Perhaps someone will discover a new algorithm to
finish F12 .
A new factoring algorithm was invented about every five years
from 1970 to 1995, as shown in this table:
faster programs for these algorithms. For example, the multiple
polynomial QS uses many quadratic polynomials for sieving, not just
one. New forms for representing elliptic curves make the
ECM more efficient. Line sieving and the use of special q speed the
NFS. The improved algorithms are ten to one hundred times faster
than their first versions, but they still have the same asymptotic time
complexity as the first version. Have we already discovered the fastest
integer factoring algorithms? I doubt it.
Computers have become faster in the past 50 years. Between 1963
and 2013, computers have become 1000 times faster and factoring
algorithms 1000 times faster; factoring is about a million times faster
in 2013 compared to 1963.
We need a new factoring algorithm. The time complexities of the
fastest known algorithms have the form
\[
\exp\bigl(c\,(\ln N)^{t}\,(\ln\ln N)^{1-t}\bigr), \tag{10.2}
\]
for some constants c and 0 < t < 1. For the QS and the ECM,
t = 1/2; for the NFS, t = 1/3. The time complexity has this shape
because each algorithm must find one or more smooth numbers: the
group order for the ECM, and the numbers in the relations for the
QS and the NFS; see formula (3.1). Any new factoring algorithm
that succeeds by finding smooth numbers would likely also have time
complexity of the form (10.2). A truly fast factoring algorithm, for
example, a polynomial-time algorithm, probably would not rely on
smooth numbers.
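To get a feeling for formula (10.2), one can evaluate it numerically. The sketch below is illustrative only: the function name `subexp_cost` is invented here, and the constants c = 1 (for a QS-like method) and c = (64/9)^(1/3) (for a general NFS-like method) are commonly quoted heuristic values that are assumed, not taken from this section. The o(1) terms hidden in real running times are ignored.

```python
import math

def subexp_cost(ln_n, c, t):
    """Evaluate exp(c * (ln N)^t * (ln ln N)^(1 - t)), given ln N."""
    return math.exp(c * ln_n ** t * math.log(ln_n) ** (1 - t))

ln_n = 200 * math.log(10)                 # N near 10^200
# Assumed constants, for illustration only.
qs_like = subexp_cost(ln_n, 1.0, 1 / 2)
nfs_like = subexp_cost(ln_n, (64 / 9) ** (1 / 3), 1 / 3)
```

The endpoints show why 0 < t < 1 sits between polynomial and exponential time: t = 0 gives (ln N)^c, a polynomial in the length of N, while t = 1 gives N^c.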
Genetic algorithms might be an approach to factoring N . A ge-
netic algorithm runs on an ordinary computer and mimics biological
evolution. Begin with a population of random data structures con-
nected to factoring N . A function rates each data structure, with a
higher rating meaning that it is closer to factoring N . Delete a frac-
tion, say 50%, of the data with lowest ratings. Mutate and recombine
some of the remaining data to form new data from pieces of the old
data. (DNA does this in reproduction of living things.) Rate the new
data and repeat the process. (The rating and deleting of items with
low ratings is similar to survival of the fittest.) See Davis [Dav91]
for more details. A few years ago, I tried this approach to factoring
with the data being formulas that input N , perhaps with hints, and
compute an output. The rating function was the number out of ten
composite input numbers for which the formula gave a correct proper
factor. The operations allowed in the formulas were the arithmetic
operations, greatest common divisor, Fast Exponentiation, modular
inverse, Jacobi symbol, etc. When the input was a set of ten compos-
ite N and the hint for N was a base b to which N was a pseudoprime,
but not a strong pseudoprime, the genetic algorithm discovered the
method of the proof of Theorem 10.4 in a few generations of formu-
las. However, when the input was a set of ten composite divisors N
of Cunningham numbers b^n ± 1 and the hint for N was b, the genetic
algorithm could not find a formula to give factors of all ten N after
searching for thousands of generations of formulas. Likewise, the ge-
netic algorithm failed when given only a set of ten composite N with
no hints.
We give a few suggestions for discovering new, faster ways to
factor large numbers.
(1) The fastest general modern algorithms, the QS and the GNFS,
solve x^2 ≡ y^2 (mod N) and use Theorem 6.18. Can one construct
congruent squares modulo N in a faster way than using
smooth numbers to build relations?
(2) How about finding a base b so that N is a pseudoprime to
base b but not a strong pseudoprime to base b and applying
Theorem 10.4? Given N , can you find an m such that mN is
a pseudoprime to many bases or even a Carmichael number?
(3) Find an efficient way (perhaps similar to Fast Exponentiation)
to compute m! mod N for any 1 < m < N on a computer
with fixed word size. Then Shamir’s algorithm would become
practical.
(4) Find an efficient algorithm to solve this puzzle: Given a real
number R > 1, compute an integer multiple kR of R that is
within 1 of a square: |kR − m^2| < 1. Example: If R = 2π, then
k = 4 works: |8π − 5^2| < 1. It is an easy exercise (Exercise
10.7) to convert the solution to the puzzle into a fast factoring
algorithm. (This suggestion is due to Schroeppel.)
(5) Try to factor Cunningham numbers via Exercise 10.5. That is,
use the known base b to which the number is a strong pseudo-
prime to construct a base a to which the number is a pseudo-
prime but not a strong pseudoprime.
(6) Theorem 10.4 tells how to factor N , given a base a to which
N is a pseudoprime but not a strong pseudoprime. But some-
times the method in the proof succeeds even when N is not a
pseudoprime to base a. If you compute gcd(a^(2^j f) − 1, N) for
random a and all j, sometimes you will factor N . For exam-
ple, N = 341 is factored by this method for every a, whether
N is a pseudoprime to base a or not. How often does this tech-
nique work? Given N , can you predict an a (not necessarily a
pseudoprime base) for which it factors N ?
(7) Try to discover a formula F (N ) for a factor of N using a genetic
algorithm, perhaps with a hint. Maybe I made a mistake,
omitted a vital operation, or did not try enough generations.
(8) According to [GW08], when the SQUFOF is used to factor
a given integer N and many small multipliers m are used, so
that the algorithm essentially computes with the mN , there is
a large variation5 in the number of steps (and running time)
taken to factor N as m varies. Given a composite number N ,
can one quickly predict a multiplier m for which the SQUFOF
will factor N especially fast, perhaps running in much less time
than the average complexity O(N^(1/4))?
(9) Can you discover an algorithm to answer this question quickly?
Given integers 1 < B < N , does N have a (prime) factor p ≤
B? This question sounds easier than factoring N because the
answer is either “yes” or “no.” It is called the decision problem
version of the factoring problem. In fact, it is equivalent to
factoring. If you had a fast algorithm to answer the yes/no
question, then you could use it log_2 N times in a binary search
for the least prime factor of N. Note that when √N ≤ B <
N, the question simply asks whether N is composite, and we
5. If W(N) is the number of iterations of the while loop in Algorithm 6.25 (Part
1 of the SQUFOF), then the mean and standard deviation of the statistic W(N)/N^(1/4)
are equal, which suggests an exponential distribution for this random variable.
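The binary search described in suggestion (9) can be sketched as follows. No fast decision procedure is known, so the oracle here is a slow stand-in based on trial division; it exists only so that the example runs. The names `has_factor_at_most` and `least_prime_factor` are hypothetical.

```python
def has_factor_at_most(n, bound):
    """Stand-in decision oracle: does n have a prime factor p <= bound?
    Trial division is used only so the sketch runs; it is not fast."""
    return any(n % p == 0 for p in range(2, bound + 1))

def least_prime_factor(n, oracle=has_factor_at_most):
    """Binary search for the least prime factor of n >= 2 using yes/no answers."""
    lo, hi, calls = 2, n, 0
    while lo < hi:
        mid = (lo + hi) // 2          # always satisfies 1 < mid < n
        calls += 1
        if oracle(n, mid):
            hi = mid                  # the least prime factor lies in [lo, mid]
        else:
            lo = mid + 1              # every prime factor exceeds mid
    return lo, calls

factor, calls = least_prime_factor(1649)
```

Each query halves the interval that must contain the least prime factor, so about log_2 N queries suffice regardless of how the oracle is implemented.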
Exercises
49023533868409411085186322722914188749822273357609280786433
10.3. Twin brothers Bill and Bob use the same RSA modulus N
but have different enciphering exponents, e1 for Bill and e2
for Bob. Alice enciphers the same message M and sends it
to both brothers. Eve records the two ciphertexts. Assuming
that gcd(e1 , e2 ) = 1, explain how Eve can read M without
factoring N . Can Eve use this information to factor N ?
10.4. While experimenting with Bob’s RSA public keys N and e,
Eve discovered an interesting property. If she enciphered a
certain message M seven times, she got M back. That is, if
E(M) = M^e mod N is the enciphering algorithm, then
E(E(E(E(E(E(E(M ))))))) = M.
(Such an M is called a fixed point of order 7 for RSA with
these keys.) Can Eve use this information to factor N ?
10.5. Corollary 4.5 shows that almost any composite divisor N of
the primitive part of b^n ± 1 is a strong pseudoprime to base b.
Can you use this information to compute (quickly) a base a
to which N is a pseudoprime but not a strong pseudoprime?
If so, then you could factor Cunningham numbers quickly via
Theorem 10.4. Recall Exercise 3.20.
10.6. Suppose N is a large odd composite integer and you discover
an integer b so that N is a strong pseudoprime to base b. Can
you use this information to construct a good polynomial for
factoring N by the Special Number Field Sieve?
10.7. Assume there is a fast way to solve the puzzle in suggestion
(4). Develop an efficient algorithm to factor N . Hint: The
puzzle solution allows you to convert a quadratic residue mod-
ulo N with known square root into another quadratic residue
with known square root and smaller absolute value. Iterate
this process until you reach a square root of 1 or obtain a
set of small residues whose product is square. In either case,
apply Theorem 6.18.
Appendix
Introduction
In this appendix we give answers to some exercises and hints for some
other ones.
A.1. Chapter 1
Answers and hints for exercises in Chapter 1.
270 A. Answers and Hints for Exercises
A.2. Chapter 2
Answers and hints for exercises in Chapter 2.
2.1. We have 321(19) + 381(−16) = 3.
2.2. We find that the pn , n = 11, 12, 13, . . ., are 2801, 11, 17, 5471,
52662739, 23003, 30693651606209, 37, 1741, 1313797957, 887,
71, 7127, 109, 23, 97, 159227, 643679794963466223081509857,
. . ..
2.3. The number is roughly ln 10^300 ≈ 691. It would be half as
much if we considered only odd 300-digit numbers.
2.4. 99.
2.5. 41.
2.6. 43.
2.8. 0, 1, 4 (mod 8).
2.9. No.
2.11. −1, +1, +1, −1, +1.
2.12. Hint: Consider two cases: p ≡ 1 (mod 4) and p ≡ 3 (mod 4).
2.13. (a) 10 ≡ 1 (mod 3). (b) 10 ≡ −1 (mod 11). (c) 10^2 ≡
−11 (mod 37) and 10^3 ≡ 1 (mod 37).
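The identity in answer 2.1 is the kind of output the extended Euclidean Algorithm produces. Below is a minimal iterative sketch; the function name `extended_gcd` is an illustrative choice, and the version stated in Chapter 2 may be organized differently.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Apply the same quotient step to remainders and to both coefficients.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = extended_gcd(321, 381)
```

For these inputs the sketch yields g = 3 with coefficients 19 and −16, matching 321(19) + 381(−16) = 3.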
A.3. Chapter 3
Answers and hints for exercises in Chapter 3.
3.1. The probability that a number near 10^36 is 10^6-smooth is ρ(u),
where u = 36/6 = 6, that is, ρ(6) ≈ 0.0000197. The number
of 10^6-smooth numbers between 10^36 and 10^36 + 10^7 is about
10^7 ρ(6) ≈ 197.
3.2. Same answer as for the previous exercise.
3.3. Note that 253 = 11 · 23. The answer is 3 or −3 modulo each
of these primes. Combine with four applications of the CRT.
The answers are x = 3, 118, 135, 250 modulo 253.
3.12. Hint: Use Bang’s Theorem 3.19.
3.19. This is Corollary 4.5.
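The four applications of the CRT in answer 3.3 can be carried out directly. This sketch assumes the exercise asks for the solutions of x^2 ≡ 9 (mod 253), which is what the listed answers satisfy; `crt_pair` is a hypothetical helper name.

```python
from itertools import product

def crt_pair(r1, m1, r2, m2):
    """Solve x = r1 (mod m1), x = r2 (mod m2) for coprime m1, m2."""
    inv = pow(m1, -1, m2)                 # modular inverse (Python 3.8+)
    return (r1 + m1 * ((r2 - r1) * inv % m2)) % (m1 * m2)

# One CRT application for each sign choice: x = ±3 (mod 11), x = ±3 (mod 23).
roots = sorted(crt_pair(r1, 11, r2, 23)
               for r1, r2 in product((3, -3), repeat=2))
```

The sorted list agrees with the answers 3, 118, 135, 250 modulo 253.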
A.4. Chapter 4
Answers and hints for exercises in Chapter 4.
4.1. We have Φ5(x) = x^4 + x^3 + x^2 + x + 1 = (x^2 + 3x + 1)^2 − 5x(x + 1)^2,
so
\[
\Phi_5(5^{h}) = \bigl[5^{2h} + 3\cdot 5^{h} + 1 - 5^{(h+1)/2}(5^{h} + 1)\bigr]\,
                \bigl[5^{2h} + 3\cdot 5^{h} + 1 + 5^{(h+1)/2}(5^{h} + 1)\bigr].
\]
4.2. We have 13^{13h} − 1 = (13^h − 1)Φ13(13^h) and Φ13(13^h) = L_{13h} M_{13h},
where L_{13h}, M_{13h} are
\[
13^{6h} + 7\cdot 13^{5h} + 15\cdot 13^{4h} + 19\cdot 13^{3h} + 15\cdot 13^{2h} + 7\cdot 13^{h} + 1
\mp 13^{(h+1)/2}\bigl(13^{5h} + 3\cdot 13^{4h} + 5\cdot 13^{3h} + 5\cdot 13^{2h} + 3\cdot 13^{h} + 1\bigr).
\]
4.17. 14316 is the smallest element of a set of 28 sociable numbers.
4.21. Assume p < q. Then q | Q_{q−1} but q ∤ Q_{p−1}.
4.23. e and d must be relatively prime to φ(n), which is always even
for n > 2.
4.26. 193 and 24847873 are the sum of two squares.
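The factorizations in answers 4.1 and 4.2 can be confirmed numerically for small odd h. This is a quick check, not a proof; the function names are illustrative, and h is assumed to be odd so that (h + 1)/2 is an integer.

```python
def phi5(x):
    """Cyclotomic polynomial Phi_5 evaluated at x."""
    return x**4 + x**3 + x**2 + x + 1

def phi13(x):
    """Cyclotomic polynomial Phi_13 evaluated at x, as (x^13 - 1)/(x - 1)."""
    return (x**13 - 1) // (x - 1)

def aurifeuillian_5(h):
    """The two factors of Phi_5(5^h) from answer 4.1."""
    x, s = 5**h, 5**((h + 1) // 2)
    c, d = x**2 + 3*x + 1, x + 1
    return c - s * d, c + s * d

def aurifeuillian_13(h):
    """The two factors L, M of Phi_13(13^h) from answer 4.2."""
    x, s = 13**h, 13**((h + 1) // 2)
    c = x**6 + 7*x**5 + 15*x**4 + 19*x**3 + 15*x**2 + 7*x + 1
    d = x**5 + 3*x**4 + 5*x**3 + 5*x**2 + 3*x + 1
    return c - s * d, c + s * d

for h in (1, 3, 5):                      # h must be odd
    l5, m5 = aurifeuillian_5(h)
    assert l5 * m5 == phi5(5**h)
    l13, m13 = aurifeuillian_13(h)
    assert l13 * m13 == phi13(13**h)
```

For h = 1 the first identity reads Φ5(5) = 781 = 11 · 71.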
A.5. Chapter 5
Answers and hints for exercises in Chapter 5.
5.6. This N is not the sum of two squares because it has a prime
factor 3119 ≡ 3 (mod 4).
A.6. Chapter 6
Answers and hints for exercises in Chapter 6.
6.3. p = 1068.
6.4. Hint: When N = p^3, if the CFRAC finds x, y with x^2 ≡
y^2 (mod N) and 1 < gcd(x − y, N) < N, then p must be
in the factor base.
A.7. Chapter 7
Answers and hints for exercises in Chapter 7.
7.1. Hint: The sum of the roots of the cubic x^3 + ax + b = 0 is 0
because there is no x^2 term.
7.2. 2P = (−5, −16), 3P = (11, −32), 4P = (11, 32), 5P = (−5, 16),
k = 7.
7.3. 2P = (0, 0), 3P = (2, 7), 4P = ∞, k = 4.
A.8. Chapter 8
Answers and hints for exercises in Chapter 8.
8.1. The formula is p (I + p − 1)/p.
A.9. Chapter 10
Answers and hints for exercises in Chapter 10.
10.1. One of the prime factors is 576297563010049.
10.5. This is a research problem.
10.6. This is a research problem.
10.7. We tell how to solve the problem in the hint using the solution
to the puzzle. Given x and a with x^2 ≡ a (mod N), we must
find y and b with y^2 ≡ b (mod N) and 0 < |b| < |a|. We may
assume that |a| < N. Let R = N/|a|. Then R > 1. The puzzle
solution gives k and m with kR = m^2 + ε, where |ε| < 1. Let
y = xm and b = −εa. Then
\[
y^2 = x^2 m^2 = x^2 (kR - \varepsilon)
    = x^2\Bigl(\frac{kN}{|a|} - \varepsilon\Bigr)
    \equiv a\Bigl(\frac{kN}{|a|} - \varepsilon\Bigr)
    = \pm kN - \varepsilon a \equiv -\varepsilon a = b \pmod{N}
\]
and |b| < |a|.
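One reduction step from answer 10.7 can be sketched in code. Since no fast puzzle solver is known, the sketch simply tries k = 1, 2, 3, . . . until some m works; that brute-force search is a stand-in and is not efficient. The function name `reduce_residue` and the sample values N = 1649, x = 41, a = 32 are chosen here for illustration.

```python
from math import isqrt

def reduce_residue(x, a, n):
    """Given x^2 = a (mod n) with 0 < |a| < n, return (y, b) with
    y^2 = b (mod n) and |b| < |a|, using a brute-force puzzle search."""
    abs_a = abs(a)
    sign = 1 if a > 0 else -1
    k = 1
    while True:
        root = isqrt(k * n // abs_a)
        for m in (root, root + 1):
            # |k*R - m^2| < 1 with R = n/|a| is equivalent to this integer test.
            if abs(k * n - m * m * abs_a) < abs_a:
                b = m * m * a - sign * k * n     # b = -eps * a
                return (x * m) % n, b
        k += 1

y, b = reduce_residue(41, 32, 1649)      # 41^2 = 1681 = 32 (mod 1649)
```

In this instance the search stops at k = 7, m = 19 and returns y = 779, b = 9. Because 9 is a square, the congruence 779^2 ≡ 3^2 (mod 1649) is already a pair of congruent squares, and gcd(779 − 3, 1649) = 97 is a proper factor.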
Bibliography
ISBN 978-1-4704-1048-3
For additional information
and updates on this book, visit
www.ams.org/bookpages/stml-68
AMS on the Web
STML/68 www.ams.org