STUDENT MATHEMATICAL LIBRARY

Volume 68

The Joy
of Factoring

Samuel S. Wagstaff, Jr.


American Mathematical Society


Providence, Rhode Island

Editorial Board
Satyan L. Devadoss
Gerald B. Folland (Chair)
John Stillwell
Serge Tabachnikov

The photographs of Samantha, Mahala, and Autumn Rider that appear
on the front cover are used courtesy of Dean Rider.

2010 Mathematics Subject Classification. Primary 11A51, 11Y05, 11Y11,
11Y16; Secondary 11A25, 11A55, 11B68, 11N35.

For additional information and updates on this book, visit
www.ams.org/bookpages/stml-68

Library of Congress Cataloging-in-Publication Data


Wagstaff, Samuel S., Jr., 1945–
The joy of factoring / Samuel S. Wagstaff, Jr.
pages cm. — (Student mathematical library ; volume 68)
Includes bibliographical references and index.
ISBN 978-1-4704-1048-3 (alk. paper)
1. Factorization (Mathematics) 2. Number theory. I. Title.
QA241.W29 2013
512.72—dc23
2013026680

Copying and reprinting. Individual readers of this publication, and nonprofit
libraries acting for them, are permitted to make fair use of the material, such as to
copy a chapter for use in teaching or research. Permission is granted to quote brief
passages from this publication in reviews, provided the customary acknowledgment of
the source is given.
Republication, systematic copying, or multiple reproduction of any material in this
publication is permitted only under license from the American Mathematical Society.
Requests for such permission should be addressed to the Acquisitions Department,
American Mathematical Society, 201 Charles Street, Providence, Rhode Island
02904-2294 USA. Requests can also be made by e-mail to reprint-permission@ams.org.


© 2013 by the American Mathematical Society. All rights reserved.
The American Mathematical Society retains all rights
except those granted to the United States Government.
Printed in the United States of America.

∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at http://www.ams.org/
10 9 8 7 6 5 4 3 2 1 18 17 16 15 14 13
Contents

Preface ix
Exercise xiv

Chapter 1. Why Factor Integers? 1


Introduction 1
§1.1. Public-Key Cryptography 2
§1.2. Repunits 5
§1.3. Repeating Decimal Fractions 6
§1.4. Perfect Numbers 8
§1.5. The Cunningham Project 9
Exercises 12

Chapter 2. Number Theory Review 13


Introduction 13
§2.1. Divisibility 17
§2.2. Prime Numbers 20
§2.3. Congruences 24
§2.4. Fermat and Euler 28
§2.5. Arithmetic Functions 33
§2.6. Quadratic Congruences 36
Exercises 39

Chapter 3. Number Theory Relevant to Factoring 41


Introduction 41
§3.1. Smooth Numbers 42
§3.2. Finding Modular Square Roots 44
§3.3. Cyclotomic Polynomials 47
§3.4. Divisibility Sequences and $b^m - 1$ 49
§3.5. Factors of $b^m + 1$ 55
§3.6. Factors of Fibonacci and Lucas Numbers 56
§3.7. Primality Testing 59
Exercises 71

Chapter 4. How Are Factors Used? 75


Introduction 75
§4.1. Aurifeuillian Factorizations 76
§4.2. Perfect Numbers 83
§4.3. Harmonic Numbers 88
§4.4. Prime Proving 91
§4.5. Linear Feedback Shift Registers 93
§4.6. Testing Conjectures 97
§4.7. Bernoulli Numbers 101
§4.8. Cryptographic Applications 104
§4.9. Other Applications 113
Exercises 116

Chapter 5. Simple Factoring Algorithms 119


Introduction 119
§5.1. Trial Division 120
§5.2. Fermat’s Difference of Squares Method 123
§5.3. Hart’s One-Line Factoring Algorithm 127
§5.4. Lehman’s Variation of Fermat 128
§5.5. The Lehmers’ Factoring Method 132
§5.6. Pollard’s Rho Method 135
§5.7. Pollard’s p − 1 Method 138
Exercises 141

Chapter 6. Continued Fractions 143


Introduction 143
§6.1. Basic Facts about Continued Fractions 144
§6.2. McKee’s Variation of Fermat 147
§6.3. Periodic Continued Fractions 149
§6.4. A General Plan for Factoring 153
§6.5. Lehmer and Powers 155
§6.6. Continued Fraction Factoring Algorithm 158
§6.7. SQUFOF—SQUare FOrms Factoring 163
§6.8. Pell’s Equation 169
Exercises 170

Chapter 7. Elliptic Curves 173


Introduction 173
§7.1. Basic Properties of Elliptic Curves 174
§7.2. Factoring with Elliptic Curves 181
§7.3. Primality Proving with Elliptic Curves 187
§7.4. Applications of Factoring to Elliptic Curves 188
Exercises 190

Chapter 8. Sieve Algorithms 191


Introduction 191
§8.1. The Basic Sieve 192
§8.2. The Quadratic Sieve 195
§8.3. The Double Sieve 202
§8.4. Schroeppel’s Linear Sieve 205
§8.5. The Number Field Sieve 207
Exercises 217

Chapter 9. Factoring Devices 219


Introduction 219
§9.1. Sieve Devices 219
§9.2. Special Computers 230
Exercise 237

Chapter 10. Theoretical and Practical Factoring 239


Introduction 239
§10.1. Theoretical Factoring 240
§10.2. Multiprecise Arithmetic 244
§10.3. Factoring—There’s an App for That 246
§10.4. Dirty Tricks 248
§10.5. Dirty Tricks with Lattices 253
§10.6. The Future of Factoring 262
Exercises 267

Appendix. Answers and Hints for Exercises 269


Introduction 269
§A.1. Chapter 1 269
§A.2. Chapter 2 270
§A.3. Chapter 3 270
§A.4. Chapter 4 271
§A.5. Chapter 5 271
§A.6. Chapter 6 271
§A.7. Chapter 7 272
§A.8. Chapter 8 272
§A.9. Chapter 10 272

Bibliography 273

Index 287
Preface

The problem of distinguishing prime numbers from
composite numbers and of resolving the latter into
their prime factors is known to be one of the most
important and useful in arithmetic. . . . Neverthe-
less we must confess that all methods that have been
proposed thus far are either restricted to very spe-
cial cases or are so laborious and prolix that even for
numbers that do not exceed the limits of tables con-
structed by estimable men, i.e. for numbers that do
not yield to artificial methods, they try the patience
of even the practiced calculator. . . . The dignity of
the science itself seems to require that every possible
means be explored for the solution of a problem so
elegant and so celebrated.
C. F. Gauss [Gau01, Art. 329]

Factoring integers is important. Gauss said so in 1801.


The problem of distinguishing prime numbers from composite
numbers has been solved completely, both theoretically and practi-
cally. We have made some progress on factoring composite integers,
but it remains a difficult problem.
Some mathematicians play a version of the television game show
Jeopardy! with multiplication tables. Host Alex Trebek reads the
answer “thirty-five.” A contestant rings in and gives the question,
“What is five times seven?” Mathematicians currently play this game
with 200-digit numbers rather than 2-digit numbers.
This book is intended for readers who have taken an introduc-
tory number theory course and want to learn more about factoring.
This work offers many reasons why factoring is important. Chapter
2 reviews the elementary number theory material the reader is as-
sumed to know. To fully understand this book, the reader will also
need calculus and linear algebra. As factoring integers usually in-
volves computers, the reader is assumed to be computer-literate and
to understand simple pseudocode and protocols. In a few places we
assume the reader is familiar with the notions of polynomial time
(easy problem) and the nondeterministic polynomial-time class NP
(hard problem).
This book explains and motivates the Cunningham Project, the
largest factoring enterprise in the world today. The official tables of
the Cunningham Project are published as [BLS+02]; the first part of
that book includes some material from this work in condensed form.
For readers not interested in the Cunningham Project, this book
offers numerous other applications of factoring, especially to cryptog-
raphy, and gives important results in these areas.
There is tremendous pleasure in devising a new factoring method,
programming it, and using it to factor a number no one else could
split. I hope that some readers will participate in this endeavor and
experience the joy. The end of the last chapter suggests where the
reader might begin.
In the chapters to follow we will give many reasons why the fac-
torizations of certain numbers are important and useful. We will
describe some of the major algorithms and devices for factoring and
tell a little about the people who invented them. Finally, we will tell
how you can help with some of the factoring projects currently in
progress.
Chapters 1 and 4 tell some reasons why people factor integers.
The discussion in Chapter 1 requires no more than high school math-
ematics. Chapter 4 gives additional reasons understood better with
the (college-level) number theory of Chapters 2 and 3.

For the past thirty-five years, a very important reason for factoring
has been the public-key cipher of Rivest, Shamir, and Adleman
(RSA), whose security requires that the problem of factoring integers
be hard. Chapter 1 describes the development of the RSA cipher.
Its mathematical details are presented in Chapter 4. Chapter
1 also discusses three older reasons for interest in factoring: repunits,
decimal fractions, and perfect numbers. Then it describes the Cunningham
Project, more than a century old and the greatest integer
factoring collaboration in history.
Chapter 2 reviews some elementary number theory found in a
first course in the subject. It considers divisibility, prime numbers,
congruences, Euler’s theorem, arithmetic functions, and Quadratic
Reciprocity. Few proofs are given here. It is assumed that the reader
has learned this material elsewhere. A few algorithms, such as the
Euclidean Algorithm for the greatest common divisor, are stated.
Chapter 3 deals with more advanced number theory, probably
not taught in a first course but needed to understand factoring al-
gorithms and applications of factoring. It discusses the frequency
of occurrence of integers whose greatest prime factor is small, how
to compute modular square roots, cyclotomic polynomials, primality
testing, and divisibility sequences such as the Fibonacci numbers and
the numbers $b^m - 1$, $m = 1, 2, 3, \ldots$, for fixed $b$. Of course, the
recognition of primes is essential to telling when a factorization is
complete, and this topic is treated extensively. Many algorithms are
stated in Chapter 3.
More applications of factoring are given in Chapter 4. It begins
with a set of algebraic factorizations of some numbers in a divisibility
sequence discovered by Aurifeuille and others. A more complete dis-
cussion of perfect numbers follows, with a sample of theorems in this
area. Next come harmonic numbers, prime proving aided by factor-
ing, and linear feedback shift registers, which are hardware devices
used to generate cryptographic keys and random numbers. Testing
conjectures is an important and common application of factoring. We
give three examples. Bernoulli numbers are connected to the struc-
ture of cyclotomic fields and to Fermat’s Last Theorem. While most
work in this area is beyond the scope of this book, we do give a taste
of the possible results. The chapter ends with a deeper discussion of
public-key cryptography, more applications of factoring to cryptogra-
phy, and other assorted uses of factoring. These include accelerating
RSA signature generation, zero-knowledge proofs, and sums of two or
four squares.
The remaining chapters discuss methods of factoring integers.
Each chapter presents algorithms with related ideas, roughly in his-
torical order. Most algorithms are described both in words and in
pseudocode. Simple examples are given for each method. Chapter 5
gives the oldest, simplest, and slowest algorithms, from Trial Division
and Fermat’s Method to techniques Pollard developed in the 1970s.
Chapter 6 treats simple continued fractions and several factoring
algorithms that use them. While most of these have been superseded
by faster algorithms, at least one of them (SQUFOF) is often used as
a procedure in the powerful sieve algorithms of Chapter 8. Chapter
6 also proves a simple but important theorem (Theorem 6.18) that
tells how most of the factoring algorithms in Chapter 6 and Chap-
ter 8 finish, that is, how the factors are produced at the end of the
algorithm.
In Chapter 7, we examine the basic properties of elliptic curves
and tell how they lead to good algorithms for factoring and primal-
ity proving. Elliptic curves have many uses in cryptography and data
security. In some of these applications, factoring integers is an impor-
tant tool for constructing elliptic curves with desirable properties to
make computing with them efficient while maintaining security. Some
of these techniques are mentioned in the final section of Chapter 7.
This chapter uses the newest mathematics in the book.
Chapter 8 deals with the notion of a sieve, from the Sieve of Eratosthenes
more than 2,000 years old to the Number Field Sieve factoring
algorithm about 25 years old. On the way we describe the Quadratic
Sieve factoring algorithm and a couple of other sieves for factoring.
The Number Field Sieve works especially well for numbers in the
Cunningham Project.
In Chapter 9 we describe special hardware rather than software
for factoring. Some of these are mechanical or electronic devices for
performing the sieve process. Others are computers with special archi-
tecture to facilitate factoring. We also discuss factoring with quantum
objects and with DNA molecules. The author is neither a physicist
nor a biologist, so he can give only a taste of these new factoring
methods. The reader who really wants to learn about these topics
should consult the references.
Chapter 10 discusses practical aspects of factoring and also some
purely theoretical results about the difficulty of factoring. We tell how
computers calculate with very large integers. Another section reveals
special methods for factoring integers quickly when they have special
form or when partial information is known about their factors. These
tricks include applications to breaking the RSA cipher. We describe
some of the ongoing factoring projects and how the reader can help
with them. The final section tosses out some new ideas for possible
future factoring methods.
Most of the algorithms in this work are written in pseudocode
and described in words. We have tried to make the pseudocode clear
enough so that programmers with limited knowledge of number the-
ory can write correct programs. We have also tried to make the
verbal descriptions of algorithms understandable to number theorists
unfamiliar with computer programming.
This book does not discuss factoring integers mentally, although
one reviewer suggested it as a topic. The only items at all related to
mental arithmetic are Example 2.29 and Exercises 0.1 and 2.13.
If you have taken a course in elementary number theory, are
computer-literate, don’t care about applications, and wish to learn
about factoring algorithms immediately, then you could begin reading
with Chapter 5. But then you would wonder why we keep factoring
the number 13290059 over and over again. This choice is explained
in Section 4.6.3.
The author thanks CERIAS, the Center for Education and Re-
search in Information Security and Assurance at Purdue University,
for its support.
The author is grateful to Richard Brent, Greg Childers, Graeme
Cohen, Carl Pomerance, and Richard Weaver, who answered ques-
tions about the material of this book. He is indebted to Robert
Baillie, Arjen Lenstra, Richard Schroeppel, Hugh Williams, and at
least one anonymous reviewer for helpful comments. Hugh Williams
generously allowed the use of the sieve photos in Chapter 9. The au-
thor thanks Junyu Chen for checking and programming many of the
algorithms in this work. Any remaining errors are the responsibility
of the author.

Keep the factors coming!

Sam Wagstaff

Exercise

0.1. No computers are allowed in multiplication table Jeopardy!.
Alex Trebek reads the clue “299.” You ring in and say what?
Chapter 1

Why Factor Integers?

Suppose, for example, that two 80-digit¹ numbers
p and q have been proved prime; . . . Suppose further,
that the cleaning lady gives p and q by mistake
to the garbage collector, but that the product pq is
saved. How to recover p and q? It must be felt
as a defeat for mathematics that, in these circumstances,
the most promising approaches are searching
the garbage dump and applying mnemo-hypnotic
techniques.
H. W. Lenstra, Jr. [Len82]

Introduction
A modern reason for studying the problem of factoring integers is the
cryptanalysis of certain public-key ciphers, such as the RSA system.
We will explain public-key cryptography and its connection to factoring
in the next section. The rest of the chapter discusses some older
reasons for factoring integers. These include studying repunits,
determining the length of repeating decimals, and describing perfect
numbers. We introduce the Cunningham Project, which has factored
interesting numbers for more than a century.

¹ The size of these primes would have to be doubled now due to the improvement
in the speed of factoring algorithms since Lenstra wrote these words in 1982.

1.1. Public-Key Cryptography


Cryptography has been used to hide messages for more than 2,000
years. Usually the sender and receiver meet before the secret commu-
nication and decide how to hide their future secret message(s). They
choose an algorithm, which remains fixed for a long time, and a secret
key, which they change often.
The National Security Agency (NSA) was created by President
Truman in 1952 to protect the secret communications of the United
States and to attack those of other countries. It performed this mis-
sion secretly and apparently with great success until at least the 1970s.
Some businesses need cryptography to communicate secretly with
their offices overseas. Some individuals want good cryptography for
their personal secrets. On March 17, 1975, the National Bureau
of Standards announced a cipher, the Data Encryption Standard
(DES), developed by IBM with help from the NSA and approved for
use by individuals and businesses. It was soon suspected that the
NSA had approved the DES after putting secret weaknesses into it
(in the S-boxes) that would allow the agency, but not others, to break
it easily. In any case, the original key size was reduced, perhaps so
that a brute force attack could break the DES with enough hardware.
In the early 1970s, people began to connect computers together
in large numbers into networks² and think about email and other
forms of electronic communication. Computer theoreticians began
to ponder how one could “sign” an electronic document so that the
reader could be certain who wrote it. If each person used a unique
number as a signature, say, it could be copied perfectly by a forger.
Others pondered how to create a system where people who had
never met could communicate securely. One-key ciphers like the DES
require users to exchange keys securely before their secret communi-
cation happens. All ciphers known then were of this type.
Whit Diffie and Marty Hellman thought about these matters and
came up with several brilliant solutions in a 1976 paper [DH76].
First, they considered one-way functions, which are easy to compute
(forwards) but almost impossible to invert. A function $f$ is one-way if it
is easy to compute $f(x)$ for any $x$, but, given almost any value $y$ in
the range, it is hard to find any $x$ with $f(x) = y$.

² The first computer networks were created in the 1960s.

Diffie and Hellman invented a method of key exchange based
on the one-way function $f(x) = b^x \bmod p$. Given large numbers $b$,
$p$, and $y$, with prime $p$, it is hard to find a “discrete logarithm” $x$
with $f(x) = y$. Their protocol allows two users who have never met
or exchanged a secret key to choose a common secret key, for the
DES, say, while communicating over a channel that may have an
eavesdropper listening.
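
The exchange just described can be sketched in a few lines of Python. This is a toy illustration only: the prime $p = 2^{127} - 1$, the base $b = 3$, and the helper names `dh_public` and `dh_shared` are assumptions chosen for this example, not values or names from [DH76], and real systems use larger, carefully chosen parameters.

```python
import secrets

# Toy public parameters (assumed for illustration; not taken from [DH76]).
P = 2**127 - 1   # a Mersenne prime
B = 3            # public base

def dh_public(secret_exponent: int) -> int:
    """One-way function f(x) = b^x mod p, via fast modular exponentiation."""
    return pow(B, secret_exponent, P)

def dh_shared(their_public: int, my_secret: int) -> int:
    """Raise the other party's public value to my own secret exponent."""
    return pow(their_public, my_secret, P)

# Each user picks a secret exponent and sends only f(x) over the channel.
alice_secret = secrets.randbelow(P - 2) + 1
bob_secret = secrets.randbelow(P - 2) + 1
alice_public = dh_public(alice_secret)
bob_public = dh_public(bob_secret)

# Both sides arrive at b^(xy) mod p; an eavesdropper sees only the public values.
alice_key = dh_shared(bob_public, alice_secret)
bob_key = dh_shared(alice_public, bob_secret)
print(alice_key == bob_key)
```

Recovering either secret exponent from the public values is the discrete logarithm problem mentioned above; the sketch does nothing to prevent the attack described in the next paragraph.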
This key-exchange protocol provides secure communication be-
tween one user and whoever is at the other end of the connection.
But it is subject to the “man in the middle” attack in which someone
hijacks a computer between the two parties trying to communicate
and executes the protocol with each of them separately. After that,
the hijacker can read (and even modify) all encrypted correspondence
between the two parties as he or she decrypts a message from one and
re-enciphers it to pass on to the other.
The second innovation in the Diffie-Hellman paper [DH76] solved
this problem. Their idea was to split the key into public and private
parts! Bob would use a cipher with two independent keys. He would
make public his enciphering key and the enciphering algorithm. His
deciphering algorithm would be public, but only Bob would know his
deciphering key. It would be impossible to compute the deciphering
key from the enciphering key and other public data. Such a cipher is
called a public-key cipher.
To send Bob a secret message, Alice would find his public key
and use it to encipher the message. Once it was enciphered, only
Bob could decipher it. If Alice wanted to communicate with Bob
by the DES, she could choose a random DES key, encipher the key
with Bob’s public key, and email it to Bob. When Bob received this
message, he would decipher it and use the DES key to communicate
with Alice. No eavesdropper would be able to learn the DES key.
A public-key cipher lacks authenticity. Anyone could write a
letter to Bob, sign it “Alice,” encipher it using Bob’s public key, and
send it to Bob. Bob would not know whether it came from Alice.
Thus, Eve could send a random DES key to Bob, telling him that
it came from Alice. Then she could communicate with Bob secretly,
pretending to be Alice.
The third innovation of Diffie and Hellman in [DH76] was the
notion of a digital signature. Alice could construct her own public and
private keys, just as Bob did above, and “sign” a message by applying
the deciphering algorithm with her private key to the plaintext mes-
sage. Then she could encipher this signed message with Bob’s public
key and send it to him. Bob would be certain that the message came
from Alice when he deciphered it using his private key, enciphered the
result with Alice’s public key, and obtained meaningful text. Only
Alice could have constructed a message with this property because
only Alice knows her private key. We assume there is no “man in
the middle” attack and that Bob obtained Alice’s public key from a
secure site.
Diffie and Hellman did not propose enciphering and deciphering
algorithms for public-key cryptography in their paper. Ron Rivest,
Adi Shamir, and Leonard Adleman (RSA) read their paper and tried
many algorithms for doing this. Some of these schemes involved the
difficulty of factoring large integers. On April 3, 1977, Rivest dis-
covered the system now called RSA. He gave a simple formula for
enciphering a message. The public key was the product of two large
primes. He also gave a formula for deciphering the ciphertext that
used the two large primes. He created a trap-door one-way function,
that is, a function f that cannot be inverted unless one knows a se-
cret, in which case it is easy to find x given f (x). We will describe the
method in Section 4.8. The RSA paper was published as [RSA78].
The secret that unlocks the RSA cipher is the pair of prime factors
of the public key. An obvious attack on this cipher is to factor the
public key. The publication of the RSA cipher sparked tremendous
interest in the problem of factoring large integers, which earlier had
been studied by few people.
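
To make the attack concrete, here is a hedged toy sketch in Python. It uses the usual textbook form of RSA (encipher with $c = m^e \bmod n$, decipher with $m = c^d \bmod n$), which this chapter does not spell out; the book's own description is in Section 4.8. The primes 1009 and 1013, the exponent $e = 17$, and all function names are assumptions made for illustration, and real keys are hundreds of digits long.

```python
from math import isqrt

def make_keys(p: int, q: int, e: int = 17):
    """Return (public, private) toy RSA keys built from the primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse of e (needs Python 3.8+)
    return (n, e), (n, d)

def encipher(m: int, public) -> int:
    n, e = public
    return pow(m, e, n)

def decipher(c: int, private) -> int:
    n, d = private
    return pow(c, d, n)

def factor_by_trial_division(n: int):
    """Split an odd composite n into two factors by trial division (slow)."""
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("no odd factor found")

public, private = make_keys(1009, 1013)
ciphertext = encipher(424242, public)
assert decipher(ciphertext, private) == 424242

# The attack: factor the public modulus, then rebuild the private key.
p, q = factor_by_trial_division(public[0])
_, recovered_private = make_keys(p, q)
print(decipher(ciphertext, recovered_private))
```

Trial division succeeds here only because the modulus has seven digits; the point of the later chapters is how far better algorithms can push this.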
Martin Gardner³ wrote a column in the August 1977 Scientific
American describing the work of Diffie, Hellman, Rivest, Shamir, and
Adleman. In it, RSA offered a challenge message encoded with a
129-digit product of two secret primes. This article was startling
because (a) it revealed to the general public a powerful cipher that
even the government couldn’t break and (b) it showed that number
theory, which most educated people regarded as the purest of pure
mathematics, had a concrete, dirty application to the real world.

³ The title of Gardner’s article was “A new kind of cipher that would take millions
of years to break.” It appeared on pages 120–124.

The RSA product of two primes in Gardner’s article was factored
in 1993–1994 by Derek Atkins, Michael Graff, Arjen Lenstra, and
Paul Leyland [AGLL95] who supervised 600 volunteers using 1,600
machines. We tell how this was done in Section 8.2.
It was revealed in 1997 that James Ellis of the GCHQ⁴ invented
public-key cryptography in 1969 and Cliff Cocks of the GCHQ in-
vented the RSA cipher in 1973. Of course, these discoveries were
kept secret by the GCHQ at the time. The general public first learned
about them when they were rediscovered by Diffie, Hellman, Rivest,
Shamir, and Adleman as explained above.

1.2. Repunits
When a child is learning to write, the first numbers he or she writes
often are the repunits 1, 11, 111, 1111, . . .. These same decimal
numbers continue to fascinate young people and adults as they learn
more about arithmetic. Which of these numbers are prime? Which
are composite? What are the factors of the composite ones? As a
child, I found joy in multiplying 3 times 37 and getting a product 111
with all ones. After one discovers that 1111 = 11 · 101, one might ask
which repunits divide which other repunits.
Of course, the decimal number $R_n$ with $n$ ones (and no other
digits) is just $(10^n - 1)/9$. One can prove that $R_m$ divides $R_n$ if and
only if $m$ divides $n$. Thus, 11 divides not only $R_4 = 1111$ but also
every other repunit with an even number of ones. Likewise, 111, and
therefore also 3 and 37, divides $R_{3n}$ for every $n$.
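
One way to see the divisibility rule is through a geometric-series identity. The display below is a brief sketch of a standard argument, not necessarily the proof the author has in mind; the quotient $k$ and the remainder $r$ are names introduced here.

```latex
% If n = mk, then R_m divides R_n:
\[
R_{mk} = \frac{10^{mk} - 1}{9}
       = \frac{10^{m} - 1}{9} \cdot \frac{10^{mk} - 1}{10^{m} - 1}
       = R_m \sum_{j=0}^{k-1} 10^{mj}.
\]
% If n = mk + r with 0 < r < m, then
\[
R_{mk+r} = 10^{r} R_{mk} + R_r , \qquad 0 < R_r < R_m ,
\]
% so R_m divides the first term but not the sum, and R_m does not divide R_n.
```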
Another consequence of the divisibility rule just mentioned is that
$R_n$ cannot be prime unless $n$ is prime. Therefore, $R_6 = 111111$ and
$R_9 = 111111111$ cannot be prime, but $R_2 = 11$, $R_3$, $R_5$, $R_7$, and $R_{11}$
might be prime. Of these five integers, only $R_2 = 11$ is prime. We
have already mentioned that $111 = 3 \cdot 37$. It is easy to factor 11111,
but the last two numbers take more work to factor. Here is a table
of factors of the first few $R_n$:

$n$   $R_n$        $R_n$ factored
 2   11           11
 3   111          3 · 37
 4   1111         11 · 101
 5   11111        41 · 271
 6   111111       3 · 7 · 11 · 13 · 37
 7   1111111      239 · 4649
 8   11111111     11 · 73 · 101 · 137
 9   111111111    3 · 3 · 37 · 333667
10   1111111111   11 · 41 · 271 · 9091

⁴ GCHQ means “Government Communications Headquarters” in England.
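
The table of repunit factors can be reproduced with a short Python sketch. Trial division is used here only because the numbers are tiny; the function names `repunit` and `trial_factor` are assumptions made for this example. The last lines also check the rule that $R_m$ divides $R_n$ exactly when $m$ divides $n$, for a small range of indices.

```python
def repunit(n: int) -> int:
    """R_n = (10^n - 1) / 9, the number written with n ones."""
    return (10**n - 1) // 9

def trial_factor(n: int) -> list:
    """Prime factors of n, with multiplicity, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1 if d == 2 else 2   # after 2, try odd divisors only
    if n > 1:
        factors.append(n)
    return factors

# Reproduce the table rows for n = 2, ..., 10.
for n in range(2, 11):
    print(n, repunit(n), " · ".join(map(str, trial_factor(repunit(n)))))

# Check the divisibility rule for all 1 <= m <= n <= 30.
rule_holds = all(
    (repunit(n) % repunit(m) == 0) == (n % m == 0)
    for n in range(1, 31)
    for m in range(1, n + 1)
)
print(rule_holds)
```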

The next four prime repunits after $R_2 = 11$ are $R_{19}$, $R_{23}$, $R_{317}$, and
$R_{1031}$. How does one prove that such large numbers are prime? We
will explain in Example 4.22 using theorems from Section 3.7.

1.3. Repeating Decimal Fractions


You learned in elementary school how to convert fractions into repeating
decimals. For example, 1/3 = 0.333 . . ., with period 3, 1/7 =
0.142857 142857 . . ., with period 142857, 1/11 = 0.09 09 . . ., with period
09, and 1/37 = 0.027 027 . . ., with period 027. Did you ever
wonder about the length of the periods? When $p$ is a prime number,
other than 2 or 5, the length of the period for the decimal fraction
for $1/p$ is the smallest positive integer $n$ for which $p$ divides $10^n - 1$.
Thus, 3 divides $10^1 - 1 = 9$, so the period for 1/3 has length 1. Since
11 divides 99 but not 9, the period for 1/11 has length 2. Also, 37
divides 999 but not 99 or 9, so the period for 1/37 has length 3. The
prime 7 divides $10^6 - 1$ but not $10^n - 1$ for any $0 < n < 6$, so the
period for 1/7 has length 6.
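
The rule for the period length translates directly into a loop that multiplies by 10 modulo $p$ until the result is 1. The sketch below is a minimal illustration; the function name `period_length` and its optional `base` parameter are assumptions introduced here.

```python
from math import gcd

def period_length(p: int, base: int = 10) -> int:
    """Smallest n > 0 such that p divides base^n - 1."""
    if p < 2 or gcd(p, base) != 1:
        raise ValueError("p must be at least 2 and coprime to the base")
    n, power = 1, base % p
    while power != 1:            # power holds base^n mod p
        power = power * base % p
        n += 1
    return n

for p in (3, 7, 11, 37):
    print(p, period_length(p))
```

For the four primes in the paragraph above, this prints the lengths 1, 6, 2, and 3.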
The primitive prime factors of $10^n - 1$ are the primes that divide
$10^n - 1$ but not $10^m - 1$ for any $0 < m < n$. These are exactly the
primes $p$ for which the length of the period of the decimal fraction
for $1/p$ is $n$. In 1801, Gauss determined the period length of the
decimal fraction for every rational number $a/b$ in terms of the factors
of numbers $10^n - 1$. See Articles 308–318 of [Gau01].
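
As a small illustration of the definition, the following Python sketch lists the primitive prime factors of $10^n - 1$ for the first few $n$. It factors by trial division, so it is only suitable for small $n$; the function names are assumptions made for this example.

```python
def prime_factors(n: int) -> set:
    """Set of distinct primes dividing n, by trial division."""
    primes, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes

def primitive_prime_factors(n: int, base: int = 10) -> list:
    """Primes dividing base^n - 1 but not base^m - 1 for any 0 < m < n."""
    return sorted(
        p for p in prime_factors(base**n - 1)
        # p divides base^m - 1 exactly when base^m is 1 modulo p
        if all(pow(base, m, p) != 1 for m in range(1, n))
    )

for n in range(1, 9):
    print(n, primitive_prime_factors(n))
```

Each prime in the row for $n$ is a prime whose reciprocal has decimal period length $n$; for example, the row for $n = 6$ contains 7 and 13.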
Here is a rough summary of what Gauss proved. Let $p$ be a
prime number other than 2 or 5. Let $m$ be a positive integer. If $p$
does not divide the integer $r$, then the length of the period of the
decimal fraction for $r/p^m$ is the smallest positive integer $e$ for which
$p^m$ divides $10^e - 1$. This means that $e$ is a divisor of $p^{m-1}(p - 1)$.
Call this period length $\ell(p^m)$. If $M = p_1^{m_1} p_2^{m_2} \cdots p_k^{m_k}$, where the $p_i$
are distinct primes not equal to 2 or 5, then the length of the period
of the decimal fraction for $r/M$ is the least common multiple $\ell(M)$
of the numbers $\ell(p_1^{m_1})$, $\ell(p_2^{m_2})$, $\ldots$, $\ell(p_k^{m_k})$. In all of these cases the
period begins with the first digit after the decimal point. In case
$N = 2^a 5^b M$, with $M$ not divisible by 2 or 5, the decimal fraction
for $r/N$ becomes periodic after the first $c$ digits following the decimal
point, where $c$ is the larger of $a$ and $b$, and the length of the period
is $\ell(M)$.

Example 1.1. The decimal fraction for 1/11 is 0.09 09 . . ., with period
length 2, and $10^2 - 1 = 99$ is the least number $10^e - 1$ divisible
by 11. The decimal fraction for $1/11^2$ is

0.0082644628099173553719 0082644628099173553719 . . . ,

with period length $22 = 2 \cdot 11$, and $e = 22$ is the least number $e$
such that $10^e - 1$ is divisible by $11^2$. The decimal fraction for 7/37
is 0.189 189 . . ., and $e = 3$ is the least number $e$ such that $10^e - 1$
is divisible by 37. The decimal fraction for $7/740 = 7/(20 \cdot 37)$ is
0.00 945 945 . . .. Since $20 = 2^2 5^1$ and $c = 2$ is the larger of 2 and 1, the
period begins after the first 2 digits (“00”) following the decimal point
and has length 3, as for 7/37. The decimal fraction for
$163/407 = 163/(11 \cdot 37)$ is 0.400491 400491 . . .. The period length is 6, the least
common multiple of $\ell(11) = 2$ and $\ell(37) = 3$.
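
The summary of Gauss's result can be checked against Example 1.1 with a short Python sketch. It assumes the fraction is in lowest terms, so only the denominator matters; the function names and the convention of returning period length 0 for a terminating decimal are assumptions made here, not notation from the text.

```python
from math import gcd

def order_of_10(m: int) -> int:
    """Smallest e > 0 with m dividing 10^e - 1 (needs m > 1, gcd(m, 10) = 1)."""
    e, power = 1, 10 % m
    while power != 1:
        power = power * 10 % m
        e += 1
    return e

def prime_power_parts(m: int) -> list:
    """Split m into its prime-power factors p^k by trial division."""
    parts, d = [], 2
    while d * d <= m:
        if m % d == 0:
            q = 1
            while m % d == 0:
                q *= d
                m //= d
            parts.append(q)
        d += 1
    if m > 1:
        parts.append(m)
    return parts

def decimal_shape(n: int):
    """Return (c, length) for a fraction r/n in lowest terms: the number of
    digits before the period starts and the period length (0 if terminating)."""
    a = b = 0
    while n % 2 == 0:
        n //= 2
        a += 1
    while n % 5 == 0:
        n //= 5
        b += 1
    length = 1
    for q in prime_power_parts(n):      # least common multiple of the orders
        e = order_of_10(q)
        length = length * e // gcd(length, e)
    return max(a, b), (length if n > 1 else 0)

for denominator in (11, 121, 37, 740, 407):
    print(denominator, decimal_shape(denominator))
```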

If you had a table of the primitive prime factors of $10^n - 1$, then
you could find in it all the primes whose reciprocals had a given
decimal period length. The Cunningham table is such a table. We
will describe the Cunningham Project after we give one more example
in the next section.

There is nothing special about base 10, except that humans have
ten fingers. If you write your fractions in base $b$ rather than base 10,
the Cunningham Project helps you find all the primes whose reciprocals
written in base $b$ (for $2 \le b \le 12$) have a given period length.

Example 1.2. In base 3, the reciprocal of the prime 11 (in base 10)
is 0.00211 00211 . . ., with period length 5 because $3^5 - 1$ is the first
multiple of 11 having the form $3^n - 1$.
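
The expansion in Example 1.2 can be generated by ordinary long division carried out in base 3. The sketch below is a minimal illustration; the function name `expand_reciprocal` is an assumption, and the digit formatting works only for bases up to 10.

```python
def expand_reciprocal(p: int, base: int, digits: int) -> str:
    """First `digits` digits after the point of 1/p written in `base` (base <= 10)."""
    remainder, out = 1, []
    for _ in range(digits):
        remainder *= base                 # shift one place, as in long division
        out.append(str(remainder // p))   # next digit
        remainder %= p
    return "0." + "".join(out)

print(expand_reciprocal(11, 3, 10))   # 0.0021100211
print(expand_reciprocal(7, 10, 12))   # 0.142857142857
```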

1.4. Perfect Numbers


Some integers, like 4, are greater than the sum of their proper divisors:
4 > 1 + 2. (A “proper” divisor of n is a divisor that is positive but
less than n.) Other integers, like 12, are less than the sum of their
proper divisors: 12 < 1 + 2 + 3 + 4 + 6. If you add the divisors of many
integers, you will notice that few integers exactly equal the sum of
their proper divisors. The ancient Greeks knew that 6 = 1 + 2 + 3 and
28 = 1 + 2 + 4 + 7 + 14 do have this property and called these numbers
“perfect.” Perfect numbers are the first topic discussed in Dickson’s
monumental 1919 History of the Theory of Numbers [Dic71].
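
The comparisons in this paragraph are easy to try by machine. The following Python sketch sums proper divisors by trial division and searches a small range for perfect numbers; the function name and the search bound of 1000 are assumptions chosen for illustration.

```python
def sum_proper_divisors(n: int) -> int:
    """Sum of the positive divisors of n that are less than n."""
    if n <= 1:
        return 0
    total = 1                      # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:        # avoid counting a square root twice
                total += n // d
        d += 1
    return total

print(sum_proper_divisors(4))      # 3, so 4 exceeds the sum
print(sum_proper_divisors(12))     # 16, so 12 is less than the sum
print([n for n in range(2, 1000) if sum_proper_divisors(n) == n])
```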
An important theorem in Euclid’s Elements tells how to construct
(some) perfect numbers. It says that if 2^n − 1 is prime, then the
number 2^{n−1} (2^n − 1) is perfect. For example, 2^2 − 1 = 3 is prime, so
2^1 (2^2 − 1) = 6 is perfect. The next two primes of the form 2^n − 1 are
2^3 − 1 = 7 and 2^5 − 1 = 31. These primes produce the perfect numbers
2^2 · 7 = 28 and 2^4 · 31 = 496. Can you find the next prime of the form
2^n − 1? What perfect number does it produce? If you had a table
of the factorizations of numbers 2^n − 1 into primes, you could easily
find the primes that produce perfect numbers because these numbers
would have only one prime factor in their entry, namely 2^n − 1 itself.
The Cunningham Project provides such a table.
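
Euclid's construction can be checked directly for small exponents. The sketch below is only an illustration under our own naming (`is_prime`, `sum_proper_divisors`, `euclid_perfect_numbers` are hypothetical helpers); it uses naive trial division and stops at n = 5 so as not to give away the question posed above.

```python
def is_prime(m: int) -> bool:
    """Naive trial-division primality test, adequate for tiny inputs."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(m: int) -> int:
    """Sum of the positive divisors of m that are less than m."""
    return sum(d for d in range(1, m) if m % d == 0)


def euclid_perfect_numbers(max_n: int) -> list[int]:
    """Perfect numbers 2^(n-1) * (2^n - 1) for 2 <= n <= max_n with 2^n - 1 prime."""
    result = []
    for n in range(2, max_n + 1):
        mersenne = 2**n - 1
        if is_prime(mersenne):
            perfect = 2**(n - 1) * mersenne
            # Confirm the defining property before reporting the number.
            assert sum_proper_divisors(perfect) == perfect
            result.append(perfect)
    return result


print(euclid_perfect_numbers(5))  # [6, 28, 496]
```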
In 1747, Euler proved that every even perfect number has the
form 2^{n−1} (2^n − 1) with 2^n − 1 prime. His theorem, a converse of
the theorem in Euclid’s book, completely characterizes even perfect
numbers. Cataldi proved that if 2^n − 1 is prime, then n must be
prime. The primes 2^n − 1 are named after Mersenne because in 1644
he made a famous prediction about which numbers of this form with

n ≤ 257 are prime. For most of the past few hundred years, the
largest known prime number has been a Mersenne prime. It is likely
that there are infinitely many primes 2^n − 1, although this conjecture
has never been proved.
Although perfect numbers have been studied for 2,500 years and
we know the shape of all even perfect numbers, we still don’t know
whether there are any odd perfect numbers. The best we have been
able to do is prove theorems that restrict the size or form of any
hypothetical odd perfect number. Ochem and Rao [OR12] proved
that every odd perfect number is > 10^{1500} and has at least 101 prime
factors, counted with multiplicity. Nielsen [Nie07] showed that an
odd perfect number must have at least nine distinct prime factors.
The proofs of these theorems are done by computer programs because
they have thousands of cases. They all require explicit knowledge of
the prime factors of many numbers b^n ± 1. The Cunningham Project
helps the investigation of odd perfect numbers. We will give examples
in Section 4.2.

1.5. The Cunningham Project


The Cunningham Project computes and publishes tables of the prime
factors of many numbers b^n ± 1 for various small integers b.
For more than 150 years, people have published tables of factors
of numbers b^n ± 1. For example, in 1851, Looff [Loo51] listed some
factors of 10^n − 1 for 1 ≤ n ≤ 60, while in 1856 Reuschle [Reu56]
published many factors of b^n ± 1 for 2 ≤ b ≤ 10 and 1 ≤ n ≤ 42.
Many people published factors of just one or two of these numbers at
a time. See Chapter XVI of Dickson [Dic71] for numerous examples
of such publications.
A. J. C. Cunningham was born in Delhi in 1842. He served as a
military engineer in the British army in India, helped to design the
Ganges Canal, and taught mathematics at the Thomason Civil
Engineering College⁵ in Roorkee. After he retired from the army in 1891,
he devoted the remainder of his life to number theory and especially
to factoring numbers a^n ± b^n. In 1925, Cunningham and Woodall
published a small book [CW25] that assembled many scattered known
factors of b^n ± 1 for 2 ≤ b ≤ 12 “up to high powers n.” The latter
phrase means that for b = 2, n goes up to 500, while for 2 < b ≤ 12,
n goes up to about 110.

⁵ This college later became the Indian Institute of Technology at Roorkee.

Most earlier tables similar to [CW25] listed all the prime factors
of b^n ± 1 for each n, resulting in much repetition of factors listed
earlier. The book [CW25] was the first work to avoid this repetition
by listing essentially only the primitive factors. Table 1 compares the
first ten lines of the table for 10^n + 1 by a typical earlier author (left)
with the same table in [CW25] (right).

Table 1. Factors of 10^n + 1.

         pre-Cunningham                       Cunningham
  n   Prime factors of 10^n + 1      n   Prime factors of 10^n + 1
  1   11                             1   11
  2   101                            2   101
  3   7.11.13                        3   (1) 7.13
  4   73.137                         4   73.137
  5   11.9091                        5   (1) 9091
  6   101.9901                       6   (2) 9901
  7   11.909091                      7   (1) 909091
  8   17.5882353                     8   17.5882353
  9   7.11.13.19.52579               9   (1, 3) 19.52579
 10   101.3541.27961                10   (2) 3541.27961

Note that a period “.” is used to indicate multiplication. The
numbers in parentheses in a Cunningham table refer to earlier lines
in that table. They mean that one should copy all the factors
appearing after the parentheses (if any) in the earlier lines. The “(1, 3)”
in line 9, for example, tells the reader to copy the factor 11 from
line 1 and the factors 7 and 13 from line 3. These factors, together
with the primitive factors 19 and 52579 in line 9, give all the prime
factors of 10^9 + 1 = 7 · 11 · 13 · 19 · 52579. Although not obvious from

this short sample, avoiding repeating prime factors this way produces
much shorter tables.
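
The rule for reading the parenthesized references can be sketched in Python. The dictionary below transcribes the Cunningham column of Table 1; the layout (a pair of "referenced lines" and "factors printed on the line") and the names `CUNNINGHAM_ROWS` and `all_prime_factors` are our own assumptions for illustration, not the format used by the project.

```python
from math import prod

# n -> (earlier lines named in parentheses, factors printed after the parentheses)
CUNNINGHAM_ROWS = {
    1: ((), (11,)),
    2: ((), (101,)),
    3: ((1,), (7, 13)),
    4: ((), (73, 137)),
    5: ((1,), (9091,)),
    6: ((2,), (9901,)),
    7: ((1,), (909091,)),
    8: ((), (17, 5882353)),
    9: ((1, 3), (19, 52579)),
    10: ((2,), (3541, 27961)),
}


def all_prime_factors(n: int) -> list[int]:
    """Copy the printed factors of each referenced line, then add this line's own."""
    refs, own = CUNNINGHAM_ROWS[n]
    copied = [p for r in refs for p in CUNNINGHAM_ROWS[r][1]]
    return sorted(copied + list(own))


for n in CUNNINGHAM_ROWS:
    factors = all_prime_factors(n)
    # In this short sample every prime occurs once, so the product recovers 10^n + 1.
    assert prod(factors) == 10**n + 1

print(all_prime_factors(9))  # [7, 11, 13, 19, 52579]
```

Note that the references are not followed recursively: only the factors printed after the parentheses on each referenced line are copied, exactly as the text describes.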
Another benefit of the parentheses is that they make it easier to
see the multiplicative structure of these numbers. For example, at
least in Table 1, 10^1 + 1 = 11 divides 10^n + 1 for every odd n, and
10^2 + 1 = 101 divides every fourth number: 10^6 + 1, 10^{10} + 1. Table
1 in Section 3.4 has more parentheses. We will say more about this
multiplicative structure in Section 3.4.
The tables in [CW25] omitted the parentheses and their con-
tents, listing only the “M.A.P.F.” (Maximal Algebraic Primitive Fac-
tors) of each number. These tables also credited discoverers of certain
difficult factorizations and left blank spaces to encourage others to
continue the work and enter the factors they find.
For some purposes, such as studying odd perfect numbers, the
table on the left is more useful. But for other purposes, such as
understanding the multiplicative structure of the numbers or finding
period lengths of decimal fractions, the table on the right is more
useful. To use Table 1 to find period lengths, note that a theorem
(Theorem 3.23) says that the primitive prime factors of 10^n + 1 are
the same as the primitive prime factors of 10^{2n} − 1. Thus, Table 1
tells us that the decimal fraction for 1/73 has a period of length 8 (in
fact, 1/73 = 0.01369863 01369863 . . .) because 73 is a primitive prime
factor of 10^8 − 1.
After Cunningham died in 1928, the Cunningham Project was
continued by Dick Lehmer and others. A revised and enlarged version
of [CW25] was published as [BLS+02] in 1983 and 1988. The third
edition of this book was published in 2002 as an electronic book by
the American Mathematical Society. The author maintains the latest
versions of the tables at
http://homes.cerias.purdue.edu/~ssw/cun/index.html.
If you examine the tables in [BLS+02], you will notice that there
are four tables with b = 2 but only two tables for b > 2. The numbers
in parentheses in any table refer to earlier lines in that table. For each
base b there is just one table for factors of b^n − 1, and it lists only odd
n because of the identity b^{2n} − 1 = (b^n − 1)(b^n + 1). If you want the
factors of 3^{26} − 1, you must look for 3^{13} − 1 in the table for factors of

3^n − 1 and for 3^{13} + 1 in the table for factors of 3^n + 1. A single table
for factors of 2^n + 1 would have been overly long, so it was split into a
table for factors of 2^n + 1 with odd n, one with n a multiple of 4, and
one with n = 4k + 2. The numbers in the latter table (n = 4k + 2)
have an algebraic factorization that splits them into two nearly equal
pieces. This factorization is described in Section 4.1. Because of this
algebraic factorization, the numbers are easier to factor than other
numbers 2^n + 1 and so this table extends twice as far as the other
base 2 tables.
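
Both algebraic facts used in this paragraph are easy to confirm numerically. In the sketch below, the split of 2^{4k+2} + 1 is the classical Aurifeuillian identity; we assume it is the factorization the text defers to Section 4.1, and the function name `aurifeuillian_halves` is ours.

```python
# Difference of squares: b^(2n) - 1 = (b^n - 1)(b^n + 1), here with b = 3, n = 13.
b, n = 3, 13
assert b**(2 * n) - 1 == (b**n - 1) * (b**n + 1)


def aurifeuillian_halves(k: int) -> tuple[int, int]:
    """Two nearly equal factors of 2^(4k+2) + 1.

    Uses 2^(4k+2) + 1 = (2^(2k+1) - 2^(k+1) + 1) * (2^(2k+1) + 2^(k+1) + 1).
    """
    middle = 2**(k + 1)
    base = 2**(2 * k + 1) + 1
    return base - middle, base + middle


for k in range(10):
    low, high = aurifeuillian_halves(k)
    assert low * high == 2**(4 * k + 2) + 1

print(aurifeuillian_halves(3))  # the two halves of 2^14 + 1
```

The two halves have about half as many digits as 2^{4k+2} + 1 itself, which is why each remaining factoring problem is so much smaller.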

Exercises

1.1. Factor the repunits R_n for 11 ≤ n ≤ 13.
1.2. Multiply 2071723 × 5363222357 by hand. Feel the joy.
1.3. What is the length of the period of the decimal fraction for
1/9091? What is the length of the period of the decimal
fraction for 1/9901?
1.4. Find the period of the decimal expansion of 1/49 and its length.
1.5. Find the period of the base 3 fraction for 1/13.
1.6. Find the next two Mersenne primes 2^p − 1 after 2^2 − 1, 2^3 − 1,
and 2^5 − 1. What perfect numbers do they produce?
Chapter 2

Number Theory Review

Introduction
This chapter recounts basic facts from number theory that you will
need to understand the rest of the book. Most of the proofs are
omitted. If you haven’t had a first course in number theory, you
should read one of the many books with a title Introduction to Number
Theory, where you will find the proofs. For example, see Hardy and
Wright [HW79] or Niven, Zuckerman, and Montgomery [NZM91].
References are given for a few selected proofs.
We will use this notation throughout the book. If x is a real
number, then ⌊x⌋ means the largest integer ≤ x and ⌈x⌉ means the
smallest integer ≥ x. Thus, ⌊3.14⌋ = 3, ⌈3.14⌉ = 4, ⌊−3.14⌋ = −4,
and ⌈−3.14⌉ = −3. If x is an integer, then ⌊x⌋ = x = ⌈x⌉.
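
For readers who want to experiment, the floor and ceiling functions are available in Python's standard `math` module. This short sketch repeats the numerical examples above and shows one pitfall: converting with `int()` truncates toward zero, which differs from the floor for negative numbers.

```python
import math

assert math.floor(3.14) == 3
assert math.ceil(3.14) == 4
assert math.floor(-3.14) == -4
assert math.ceil(-3.14) == -3

# For an integer x, floor and ceiling both return x.
assert math.floor(7) == 7 == math.ceil(7)

# Integer division with // rounds toward minus infinity, like the floor.
assert -7 // 2 == math.floor(-7 / 2) == -4

# int() truncates toward zero, so it is NOT the floor for negative inputs.
assert int(-3.14) == -3
```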
Here we explain some terms used to describe the speed of
algorithms. If f(n) and g(n) are functions defined for positive
integers n, we say “f(n) is big O of g(n),” written f(n) is O(g(n))
or f(n) = O(g(n)), to mean that there is a constant c > 0 so
that |f(n)| ≤ c g(n) for all sufficiently large n. If the functions
f(n) and g(n) never vanish, then we define f(n) ∼ g(n) to mean
lim_{n→∞} f(n)/g(n) = 1.
Let ℓ be the length of the input to an algorithm. For example, if
the algorithm factors integers N, then ℓ is the number of digits (or
bits) in the input N, so ℓ = O(log N).


Some algorithms run in exponential time, which means that there
is a constant C > 1 so that the time the algorithm takes when the
input has length ℓ is at least C^ℓ time units. Exponential time
algorithms are considered slow. The integer factoring algorithms in
Chapter 5 take exponential time.
The phrase polynomial time means that the running time of the
algorithm is less than a polynomial in the length of the input. This
means that there is a constant d so that when the input has length ℓ,
the algorithm takes O(ℓ^d) time units. A polynomial time algorithm is
usually an efficient method of computing something, especially when
d is small. No polynomial time integer factoring algorithm is known.
However, there is a polynomial time test for primality.
Subexponential time is intermediate between exponential time and
polynomial time. An example of an algorithm with subexponential
time is one that takes f(ℓ) = exp(√(ℓ log ℓ)) time units when the input
has length ℓ. The function f(ℓ) grows faster than ℓ^d for every d and
slower than C^ℓ for every C > 1. The fastest known integer factoring
algorithms have subexponential time.
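
To get a feel for the three growth rates, the sketch below tabulates a polynomial ℓ^3, the subexponential example exp(√(ℓ log ℓ)), and the exponential 2^ℓ. The choices d = 3 and C = 2 are ours, picked only for illustration. The ordering "polynomial < subexponential < exponential" is asymptotic, so it need not hold for very small ℓ.

```python
import math


def polynomial(length: int, d: int = 3) -> float:
    return float(length) ** d


def subexponential(length: int) -> float:
    return math.exp(math.sqrt(length * math.log(length)))


def exponential(length: int, c: float = 2.0) -> float:
    return c ** length


print(f"{'length':>6} {'poly':>12} {'subexp':>12} {'exp':>12}")
for length in (10, 100, 1000):
    print(f"{length:>6} {polynomial(length):>12.3e} "
          f"{subexponential(length):>12.3e} {exponential(length):>12.3e}")
```

At ℓ = 10 the subexponential value is still below ℓ^3, which is a reminder that these classifications describe behavior for large inputs only.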
Roughly speaking, it is reasonable to run a polynomial time algo-
rithm with input length in the thousands, or even millions, provided
the degree of the polynomial is small. In contrast, an exponential
time algorithm will finish in a reasonable time only when the input
length is not more than a few dozen.
The fastest known integer factoring algorithms can factor general
integers with up to about two hundred decimal digits. Of course,
these rough estimates depend on the algorithm, the computer that
runs it, and how the input is represented.
The complexity class NP is the set of problems whose alleged
answers can be checked in polynomial time. For example, factoring
integers is in class NP because if you are told that the factors of N
are a and b, then you can check this claim by multiplying a · b in
polynomial time and comparing the product to N. Every problem
that can be solved in polynomial time lies in class NP. It is widely
believed that some problems in NP cannot be solved in polynomial
time. The hardest problems in NP are called NP-complete. These

problems are all equally hard and are provably at least as hard as any
problem in NP.
Algorithms are classified by whether they use random numbers.
A deterministic algorithm chooses no random numbers. If you run it
twice with the same input, it will perform exactly the same steps in
the same order each time. A probabilistic algorithm chooses random
numbers during its operation. These random values determine the
steps it performs and may affect the running time. The expected
running time of a probabilistic algorithm is the average taken over
all possible random choices it could make. If you run a probabilistic
algorithm twice with the same input, it will probably perform different
steps each time, it may produce different output, and it may take
different running times. The Elliptic Curve Method is an example of a
probabilistic integer factoring algorithm. Given an input number N to
factor, it chooses some random numbers (the coefficients of an elliptic
curve and the coordinates of a point on it). It performs a certain
calculation using the random numbers and N , and it may or may not
find a proper factor of N . An important statistic of a probabilistic
integer factoring algorithm is the probability that it will succeed.
A Monte Carlo probabilistic polynomial time algorithm always runs
in polynomial time but may give the wrong answer (or no answer)
for some of its random number choices. A Las Vegas probabilistic
polynomial time algorithm always gives the correct answer, runs in
polynomial time on average, but may run in exponential time for
some of its random number choices.
Many algorithms in this book are presented in pseudocode, which
we now explain. Variable names, like m and P next, are computer
memory locations where numbers are stored. A name with brackets
is a component of an array (vector). Thus, L[i] is the i-th element
of the array L. An assignment statement like a ← f (a, b, c) means
evaluate the function f (a, b, c) using the numbers currently stored in
the locations a, b, c and then store the result in location a, replacing
the old value that was stored there.
An if statement has a condition enclosed in parentheses and one
or more instructions enclosed in braces. These instructions are per-
formed if the condition is true and not performed if it is false. For

example,
if (m mod p = 0) { m ← m/p }
has the condition (m mod p = 0), which means “p divides m.” If this
is true, the instruction performed is to divide out the factor p from
m, leaving the quotient in m. An if statement may be followed by an
else and then one or more additional instructions in braces. These
instructions are performed when the condition is false. For example,

if (m mod p = 0) { m ← m/p } else { p ← p + 1 }

is the same as the previous example when p divides m, but it adds 1
to p when p does not divide m.
A for loop in this book usually has the simple form

for (i ← 1 to n) { L[i] ← 1 }

which means perform the assignment L[i] ← 1 for i = 1, 2, 3, . . ., n.
The one or more repeated instructions are enclosed in braces. In case
n < 1, no instructions are executed. Another possible form of the for
loop is
for (i ∈ S) { L[i] ← L[i] + 1 }
which adds 1 to L[i] for each i in the set S.
A while loop has a condition in parentheses followed by one or
more instructions in braces. The condition is evaluated first. If it is
true, the instructions are performed. Probably the instructions will
change part of the condition. The condition is evaluated again. If it is
still true, the instructions are performed again. This process continues
as long as the condition is true. See the Euclidean Algorithm below
for an example.
However, a for or while loop may end early if a break or goto
statement is executed within it. A break statement stops the loop
and continues with the first instruction following the loop. A goto
statement specifies with a label the location in the algorithm of the
next instruction to perform.
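
Readers who know a programming language may find it helpful to see the pseudocode constructs next to a real equivalent. The Python fragment below is our own rough translation, not part of the book's notation; the starting values are arbitrary. Python has no goto statement, so that construct has no direct counterpart here.

```python
m, p = 12, 2

# if (m mod p = 0) { m ← m/p } else { p ← p + 1 }
if m % p == 0:
    m = m // p          # integer division keeps m an integer
else:
    p = p + 1

# for (i ← 1 to n) { L[i] ← 1 }
n = 5
L = [0] * (n + 1)       # index 0 is unused so that L[i] matches the 1-based pseudocode
for i in range(1, n + 1):
    L[i] = 1

# for (i ∈ S) { L[i] ← L[i] + 1 }
S = {2, 4}
for i in S:
    L[i] = L[i] + 1

# A while loop that ends early with break: remove factors of 2 from m.
count = 0
while True:
    if m % 2 != 0:
        break           # continue with the first statement after the loop
    m = m // 2
    count = count + 1

print(m, p, L, count)
```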
Each algorithm begins with an “Input” line that lists the
variables that are specified before it begins. For example, the Euclidean
Algorithm begins “Input: Integers m ≥ n > 0.” These memory
locations are assumed to hold integers satisfying these inequalities when
the algorithm starts. Every algorithm ends with an “Output”
statement that tells what answer is given. For example, the Euclidean
Algorithm ends “Output: gcd(m, n) = the final value of m.” When
this algorithm stops (by having no more instructions to execute), the
answer, which is the greatest common divisor of m and n, is found in
the memory location m.

2.1. Divisibility
When m and n are whole numbers and m ≠ 0, we say m divides n
and write m | n if n/m is a whole number. Integer is another word
for whole number. Here are two basic facts about divisibility: If k | m
and m | n, then k | n. If k | m and k | n, then k | (mx + ny) for any
integers x and y. We write m ∤ n if m does not divide n.
We write “n mod m” to mean the remainder when the integer n is
divided by the positive integer m. We always have 0 ≤ n mod m < m.
For example, 15 mod 4 = 3, 100 mod 7 = 2, 30 mod 5 = 0, and
(−17) mod 6 = 1. In computer languages like C and Java, “n mod m”
is written “n%m,” at least when n and m are positive integers.
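
The caveat about negative operands matters in practice. In Python, `n % m` with m > 0 already returns a value in 0 ≤ r < m, matching the convention used here, whereas the C-style remainder takes the sign of n. The sketch below shows both; the helper name `mod` is our own, and its formula is a common portable idiom that gives the book's convention in either kind of language.

```python
import math

# The examples from the text, using Python's % operator.
assert 15 % 4 == 3
assert 100 % 7 == 2
assert 30 % 5 == 0
assert (-17) % 6 == 1

# math.fmod follows the C convention: the result has the sign of the first argument.
assert math.fmod(-17, 6) == -5.0


def mod(n: int, m: int) -> int:
    """Remainder in the range 0 <= r < m, for any integer n and positive integer m."""
    return ((n % m) + m) % m


print(mod(-17, 6))  # 1
```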
If at least one of the integers m, n is nonzero, define the greatest
common divisor of m and n, written gcd(m, n), to be the largest
integer that divides both m and n. It is clear that gcd(m, n) ≥ 1 and
that gcd(m, n) = gcd(n, m). Integers m, n are said to be relatively
prime if gcd(m, n) = 1. The least common multiple of two or more
positive integers is the smallest positive integer divisible by all of
them. It is less than or equal to their product. For example, the least
common multiple of 4, 6, and 8 is 24.

Theorem 2.1. If m is a positive integer and n, q, r are integers such
that n = mq + r, then gcd(n, m) = gcd(m, r).

Proof. Write a = gcd(n, m) and b = gcd(m, r). Since a divides both
n and m, it must also divide r = n − mq. This shows that a is a
common divisor of m and r, so it must be ≤ b, their greatest common
divisor. Likewise, since b divides both m and r, it must divide n, so
b ≤ a = gcd(n, m). Therefore a = b. □

The Euclidean Algorithm uses Theorem 2.1 to compute the
greatest common divisor of two positive integers. It has been used for at
least 2,500 years and is the oldest efficient mathematical algorithm.
To compute gcd(m, n), with m ≥ n > 0, the algorithm repeatedly
replaces the pair (m, n) by the pair (n, m mod n) until n = 0, at which
time m is the required greatest common divisor.
Algorithm 2.2. Euclidean Algorithm.
Input: Integers m ≥ n > 0.
while (n > 0) {
    r ← m mod n
    m ← n
    n ← r
}
Output: gcd(m, n) = the final value of m.
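
Algorithm 2.2 translates almost line for line into Python. The version below is a sketch under our own naming (`euclid_gcd`); the standard library already provides `math.gcd`, which is used here only as a cross-check.

```python
import math


def euclid_gcd(m: int, n: int) -> int:
    """Greatest common divisor of integers m >= n > 0 by the Euclidean Algorithm."""
    while n > 0:
        r = m % n   # r ← m mod n
        m = n       # m ← n
        n = r       # n ← r
    return m


print(euclid_gcd(75, 21))  # 3

# Cross-check against the library routine on a range of inputs.
for m in range(1, 60):
    for n in range(1, m + 1):
        assert euclid_gcd(m, n) == math.gcd(m, n)
```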

The Euclidean Algorithm was the first nontrivial algorithm to
have its worst-case time complexity determined.

Theorem 2.3 (Lamé, 1845). The number of mod operations
performed by the Euclidean Algorithm to compute the greatest common
divisor of two positive integers is no more than five times the number
of decimal digits in the smaller of the two numbers. The algorithm
runs in polynomial time.

See Theorem 3.12 of Wagstaff [Wag03] for a proof. One can
show that the Euclidean Algorithm is slowest when computing the
greatest common divisor of two consecutive Fibonacci numbers. (The
Fibonacci numbers are defined by u_0 = 0, u_1 = 1, and u_{n+1} = u_n +
u_{n−1} for n ≥ 1.) See also Example 6.3.
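
Lamé's bound can be observed experimentally. The sketch below counts the mod operations performed by the Euclidean Algorithm and compares the count with five times the number of decimal digits of the smaller input, both for consecutive Fibonacci numbers and for all small pairs. The helper names `mod_steps` and `fibonacci` are ours.

```python
def mod_steps(m: int, n: int) -> int:
    """Number of mod operations the Euclidean Algorithm performs on m >= n > 0."""
    steps = 0
    while n > 0:
        m, n = n, m % n
        steps += 1
    return steps


def fibonacci(k: int) -> int:
    """u_k with u_0 = 0, u_1 = 1, u_{n+1} = u_n + u_{n-1}."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


for k in (5, 10, 20, 40):
    smaller, larger = fibonacci(k), fibonacci(k + 1)
    bound = 5 * len(str(smaller))
    print(k, larger, smaller, mod_steps(larger, smaller), bound)

# The bound of Theorem 2.3 on every pair m >= n > 0 with m < 300.
for m in range(1, 300):
    for n in range(1, m + 1):
        assert mod_steps(m, n) <= 5 * len(str(n))
```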

Example 2.4. The Euclidean Algorithm computes these values to
find gcd(75, 21) = 3:

     m    n    r
    75   21   12
    21   12    9
    12    9    3
     9    3    0
     3    0    −

Theorem 2.5. If m and n are integers and at least one of them is
not 0, then there exist integers x and y with mx + ny = gcd(m, n).

Example 2.6. In Example 2.4 we have, using Theorem 2.1,

75 = 21 · 3 + 12
21 = 12 · 1 + 9
12 = 9 · 1 + 3
9 = 3 · 3 + 0.

We can work backwards to get

gcd(75, 21) = 3 = 12 − 9 · 1
= 12 − (21 − 12 · 1) · 1
= 12 · 2 − 21 · 1
= (75 − 21 · 3) · 2 − 21 · 1
= 75 · 2 − 21 · 7,

so x = 2 and y = −7.

An important use of Theorem 2.5 is to compute inverses modulo
m. See Section 2.3.
Rather than working backwards as in Example 2.6, one can use
the extended Euclidean Algorithm, which computes the same x and y
by working forwards only. We write it in a compact form using triples
of integers, like u = (u_0, u_1, u_2), which are added and multiplied by
integers using the rules for vector addition and scalar multiplication.
Algorithm 2.7. Extended Euclidean Algorithm.
Input: Integers m ≥ n > 0.
u ← (m, 1, 0)
v ← (n, 0, 1)
while (v_0 > 0) {
    q ← ⌊u_0/v_0⌋
    w ← u − qv
    u ← v
    v ← w
}
Output: (gcd(m, n), x, y) = the final value of the triple u.
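
A direct Python transcription of Algorithm 2.7 follows, with the triples stored as tuples. This is a sketch; the function name `extended_euclid` is our own choice.

```python
def extended_euclid(m: int, n: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(m, n) = m*x + n*y, for integers m >= n > 0."""
    u = (m, 1, 0)
    v = (n, 0, 1)
    while v[0] > 0:
        q = u[0] // v[0]                                  # q ← ⌊u_0/v_0⌋
        w = tuple(ui - q * vi for ui, vi in zip(u, v))    # w ← u − qv
        u, v = v, w
    return u


g, x, y = extended_euclid(75, 21)
print(g, x, y)                 # 3 2 -7
assert 75 * x + 21 * y == g    # every triple (a, b, c) satisfies a = b*m + c*n
```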

The numbers x and y are not unique, but the algorithm returns
x and y with smallest absolute values. The algorithm works because
every triple (a, b, c) satisfies a = bm + cn throughout the algorithm.
If you ignore the second and third components of the triples, this al-
gorithm is exactly the (ordinary) Euclidean Algorithm above. There-
fore, its time complexity is given by Lamé’s Theorem 2.3. Since the
final value of the first component of the triple u is gcd(m, n), the
algorithm is correct.
Example 2.8. We repeat Example 2.6 via the Extended Euclidean
Algorithm. This table shows the value of the two triples u and v and
the integer q each time q is computed and also at the end:

    u_0   u_1   u_2   v_0   v_1   v_2    q
     75     1     0    21     0     1    3
     21     0     1    12     1    −3    1
     12     1    −3     9    −1     4    1
      9    −1     4     3     2    −7    3
      3     2    −7     0    −7    25

The last line shows that gcd(75, 21) = 3 = 75(2) + 21(−7).

2.2. Prime Numbers


A prime number is an integer greater than 1 divisible only by 1 and
itself. A composite number is an integer greater than 1 which is not
prime. Thus, if n is composite, then there is some integer 1 < m < n
with m | n. Integers < 2 are neither prime nor composite.
It is easy to prove the next theorem and its corollary using basic
facts about divisibility, the definition of greatest common divisor, and
mathematical induction.
Theorem 2.9. If ℓ, m, and n are positive integers and ℓ | mn and
gcd(ℓ, m) = 1, then ℓ | n.
Corollary 2.10. If a prime divides a product of positive integers,
then it divides at least one of them.

This corollary is used to prove that every integer > 1 can be
written as the product of prime numbers, which is part of the following
important theorem.

Theorem 2.11 (Fundamental Theorem of Arithmetic). Every
integer greater than 1 can be written as a product of primes, perhaps with
just one prime in the product, and this product is unique when the
primes are written in nondecreasing order.

Suppose we collect together repeated prime factors of an integer
and write e copies of the prime p as p^e. Then we get the standard
factorization of an integer n as

    n = p_1^{e_1} p_2^{e_2} · · · p_k^{e_k} = ∏_{i=1}^{k} p_i^{e_i},

where p_1, p_2, . . ., p_k are the distinct primes that divide n and e_i ≥ 1
is the number of times p_i divides n. If n is prime, there is only one
“factor.” We also allow n = 1 with the “empty product” for its
standard factorization.
Knowledge of the standard factorization of n is essential for com-
puting arithmetic functions of n, as we will see in Section 2.5, and for
many other purposes.
The Fundamental Theorem of Arithmetic is basically an existence
theorem. No (known) proof of it provides an efficient algorithm for
computing the standard factorization. This computation is equivalent
to factoring n, the subject of this book.
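
For small n the standard factorization can be found by dividing out primes one at a time. The sketch below does this by plain trial division and returns a dictionary mapping each prime p_i to its exponent e_i. It is offered only as an illustration of the definition (the name `standard_factorization` is ours) and is far too slow for the large numbers this book is concerned with.

```python
def standard_factorization(n: int) -> dict[int, int]:
    """Map each prime divisor p of n >= 1 to its exponent e in n."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:            # divide out every copy of p
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                        # whatever remains is a prime
        factors[n] = factors.get(n, 0) + 1
    return factors


print(standard_factorization(360))   # {2: 3, 3: 2, 5: 1}
print(standard_factorization(1))     # {} (the empty product)
```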

Theorem 2.12. The number of primes is infinite.

Proof. Suppose that p_1, . . ., p_k were all the primes. Let n = p_1 · · · p_k
+ 1. Then n is not divisible by any p_i because n mod p_i = 1 for every
i. By the Fundamental Theorem of Arithmetic, n can be written as
the product of (one or more) primes, so n is divisible by some prime,
which cannot be one of the p_i. Therefore the assumption that p_1, . . .,
p_k are all the primes is false, and the number of primes is infinite. □

We have given this proof, which goes back to Euclid¹, because
it leads to an interesting sequence of integers whose computation
requires extensive factoring.

¹ Theorem 2.12 is Proposition 20 of Book IX of Euclid’s Elements. His proof is
essentially the one given here.

Example 2.13. Suppose we try to use the proof of Theorem 2.12
to construct an infinite sequence of primes. Suppose we know only
the first prime, p_1 = 2. The proof tells us that p_1 + 1 = 3 has a
prime factor different from 2. That prime is p_2 = 3. Next we form
p_1 · p_2 + 1 = 7. It must have a prime factor different from 2 and 3. That
prime is p_3 = 7. The next number in the sequence is p_1 p_2 p_3 + 1 = 43,
which is prime, so p_4 = 43. The next number in the sequence is
p_1 p_2 p_3 p_4 + 1 = 1807 = 13 · 139, which is composite and we get two
new primes. In order to form a deterministic sequence of primes, we
must choose one of the new ones. One way to do this is always to
choose the smallest prime factor, that is, define p_{n+1} = the smallest
prime factor of 1 + ∏_{i=1}^{n} p_i. This leads to the sequence of integers
shown in Table 1. See [Wag93] for more about this sequence and
related ones.
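
The first few terms of this sequence are within reach of trial division. The sketch below, with hypothetical helper names `smallest_prime_factor` and `euclid_sequence`, generates the first eight terms; later terms need the serious factoring methods developed in this book, because the numbers to be factored grow very quickly.

```python
def smallest_prime_factor(n: int) -> int:
    """Smallest prime factor of n >= 2, found by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def euclid_sequence(count: int) -> list[int]:
    """First `count` terms, where each new term is the smallest prime factor of 1 + product so far."""
    primes: list[int] = []
    product = 1
    for _ in range(count):
        p = smallest_prime_factor(product + 1)
        primes.append(p)
        product *= p
    return primes


print(euclid_sequence(8))  # [2, 3, 7, 43, 13, 53, 5, 6221671]
```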

Table 1. A sequence of primes suggested by Euclid’s proof.

  n   p_n              1 + ∏_{i=1}^{n} p_i
  1   2                3
  2   3                7
  3   7                43
  4   43               1807 = 13 · 139
  5   13               23479 = 53 · 443
  6   53               1244335 = 5 · 248867
  7   5                6221671 (prime)
  8   6221671          38709183810571 (prime)
  9   38709183810571   1498400911280533294827535471 =
                         139 · 25621 · 420743244646304724409
 10   139              208277726667994127981027430331 =
                         2801 · 2897 · 489241 · 119812279 · 437881957


Theorem 2.14. If n is composite, then n has a prime factor p ≤ √n.

Proof. If n is composite, then it has at least two prime factors
(counted with multiplicity). Let p and q be two of them. Assume
p ≤ q. Then n ≥ pq ≥ p^2, so p ≤ √n. □

What fraction of positive integers are prime? Are most positive
integers prime? Or are most of them composite? If we tested random
100-digit integers until we found a prime, about how many integers
would we have to test? How many primes are less than x? Let
π(x) denote the number of prime numbers ≤ the real number x. We
already know from Theorem 2.12 that π(x) → ∞ as x → ∞. The
next theorem, proved more than a century ago by Hadamard and
de la Vallée Poussin, gives the growth rate of π(x) and answers our
questions.
Theorem 2.15 (Prime Number Theorem). The ratio of π(x) to
x/ ln x tends to 1 as x → ∞; that is, π(x) ∼ x/ ln x.

The theorem says that x/ln x is an approximation to π(x), where
ln x is the natural logarithm of x, and that the percentage error in
this approximation gets small as x gets large. Gauss computed large
tables of prime numbers and observed that the density of primes near
x is approximately 1/ln x. This implies that π(x) is approximately
∫_2^x dt/ln t. In fact, this integral is a closer approximation to π(x)
than is x/ln x. The probability that a random integer near 10^{100} is
prime is about 1/ln 10^{100} = 1/(100 ln 10) ≈ 0.00434. The reciprocal
of this probability (about 230) is the expected number of random
integers near 10^{100} that we would have to test to find one prime with
one hundred digits.
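
The quality of the approximation x/ln x can be checked for moderate x by counting primes with a sieve. The sketch below uses a simple sieve of Eratosthenes (our choice of method; the function name `prime_count` is hypothetical) and also reproduces the estimate of about 230 trials for 100-digit numbers.

```python
import math


def prime_count(x: int) -> int:
    """π(x): the number of primes <= x, for an integer x >= 1, by sieving."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"          # 0 and 1 are not prime
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


for x in (10**3, 10**4, 10**5, 10**6):
    estimate = x / math.log(x)
    print(x, prime_count(x), round(estimate), round(prime_count(x) / estimate, 4))

# Expected number of random integers near 10^100 tested per prime found.
print(math.log(10**100))
```

The printed ratios π(x)/(x/ln x) drift slowly toward 1, which is consistent with the theorem but also shows that convergence is slow.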
It is fortunate (for finding large primes) that there are so many
of them. For example, if π(x) were approximately √x, rather than
x/ln x, then there would be about 10^{50} 100-digit primes. This may
seem like a lot of them, but it would mean that only one in about 10^{50}
100-digit integers would be prime. Someone trying to find a random
100-digit prime would have to try about 10^{50} numbers just to get one
prime!
What is a typical factorization of an integer n? Hardy and Ra-
manujan [HR17] proved that an integer n has ln ln n prime factors on
average. Moreover, the Erdős-Kac theorem [EK40] tells us that the
probability distribution of the number of prime factors of an integer n
is asymptotically normal with mean and variance ∼ ln ln n. This an-
swer is the same whether repeated prime factors of n are counted only
once or according to their multiplicity. In his Ph.D. thesis [Bac85]

Bach gave a fast algorithm for generating pre-factored random
numbers. Very roughly speaking, to make a factored random number n
between x/2 and x, his algorithm chooses a random prime power
factor q of n so that ln q has a uniform distribution between 0 and ln x.
Then it recursively selects a pre-factored random number m between
x/(2q) and x/q and lets n = mq. This factorization is a typical output
of the algorithm:
n = 17 · 1217 · 148961 · 24517014940753.
Note that ln ln n = 3.96 and that there are four prime factors.
However, numbers with special form, such as b^k ± 1, often have atypical
factorizations. (The numbers b^k ± 1 usually have more prime factors
than typical numbers of the same size because of algebraic divisibility
properties discussed in Section 3.4.) The number n shown factored
above is actually 2^{76} + 1, but it just happens to have a typical prime
factorization.

2.3. Congruences
If m is a positive integer and a and b are integers, we say a is congruent
to b modulo m and write a ≡ b (mod m) if m divides a − b. If m ∤ (a − b),
we write a ≢ b (mod m). The integer m is called the modulus (plural
moduli). When a ≡ b (mod m), each of a, b is called a residue of the
other (modulo m). Congruence modulo m is an equivalence relation,
meaning that if a, b, and c are integers, then
(1) a ≡ a (mod m),
(2) if a ≡ b (mod m), then b ≡ a (mod m), and
(3) if a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m).
Do not confuse the “mod” in the relation a ≡ b (mod m) with
the arithmetic operation “mod” (remainder) in a mod m. We have
a ≡ b (mod m) if and only if (a mod m) = (b mod m).
The congruence a ≡ b (mod m) is equivalent to saying that there
exists an integer k so that a = b + km.

Theorem 2.16. Let a, b, c, and d be integers, m a positive inte-


ger, and f (x) a polynomial with integer coefficients. Suppose a ≡

b (mod m) and c ≡ d (mod m). Then:


(1) a + c ≡ b + d (mod m).
(2) a − c ≡ b − d (mod m).
(3) ac ≡ bd (mod m).
(4) f (a) ≡ f (b) (mod m).
(5) If d | m, then a ≡ b (mod d).

Note that part (4) of Theorem 2.16 says that if a ≡ b (mod m),
then a^n ≡ b^n (mod m). However, if a ≡ b (mod m), then usually
n^a ≢ n^b (mod m).
Theorem 2.17. If m > 1 and gcd(a, m) = 1, then there is a unique
integer x in 0 < x < m for which ax ≡ 1 (mod m).

The Extended Euclidean Algorithm provides a way to compute
x given m and a. Use the algorithm to find x, y such that ax +
my = gcd(a, m) = 1. The number x returned by the algorithm
might not be in the interval 0 < x < m; if it is not, then either add
or subtract m from it to find an x in the interval. The number x
is called the multiplicative inverse of a modulo m and is sometimes
written a^{−1} mod m. Remember that a has a multiplicative inverse
modulo m only when gcd(a, m) = 1.
Example 2.18. Find an x in 0 < x < 103 with 7x ≡ 1 (mod 103).
We use the Extended Euclidean Algorithm with m = 103, n = 7.

    u_0   u_1   u_2   v_0   v_1   v_2    q
    103     1     0     7     0     1   14
      7     0     1     5     1   −14    1
      5     1   −14     2    −1    15    2
      2    −1    15     1     3   −44    2
      1     3   −44     0    −7   103

The last line shows that gcd(103, 7) = 1 = (103)(3) + (7)(−44). This
implies that (7)(−44) ≡ 1 (mod 103). To find an x in 0 < x < 103
with 7x ≡ 1 (mod 103), add 103 to −44 to get x = 59 = 7^{−1} mod 103.
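
The computation in this example can be sketched in Python as follows. The function `mod_inverse` is our own illustrative helper built on the triple form of the Extended Euclidean Algorithm; as a cross-check we use the built-in `pow(a, -1, m)`, which computes modular inverses in Python 3.8 and later.

```python
def mod_inverse(a: int, m: int) -> int:
    """Return x in 0 < x < m with a*x ≡ 1 (mod m); requires m > 1 and gcd(a, m) = 1."""
    u = (m, 1, 0)
    v = (a % m, 0, 1)
    while v[0] > 0:
        q = u[0] // v[0]
        u, v = v, tuple(ui - q * vi for ui, vi in zip(u, v))
    g, _, x = u                 # g = gcd, and g = (coefficient of m)*m + x*a
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m                # shift x into the interval 0 < x < m


print(mod_inverse(7, 103))      # 59
assert mod_inverse(7, 103) == pow(7, -1, 103)
```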

Congruences suggest equations. Just as we can solve an equation


f (x) = 0, we can consider how to solve congruences such as f (x) ≡

0 (mod m). If f (x) is a polynomial with integer coefficients and x = a


is an integer for which f (a) ≡ 0 (mod m), then by part (4) of Theorem
2.16, we have f (b) ≡ 0 (mod m) for every integer b ≡ a (mod m). It
is customary to count all such congruent b as just one solution. We
say the “number of solutions to f (x) ≡ 0 (mod m)” is the number of
integers x in 0 ≤ x < m for which f (x) ≡ 0 (mod m).
The simplest case is the linear equation ax − b ≡ 0 (mod m),
that is, ax ≡ b (mod m). The general solution to this equation is a
bit messy. (Find it in an introductory number theory text.) Usually,
we need to solve ax ≡ b (mod m) only when gcd(a, m) = 1. In
this situation, the solution is easy and there is always exactly one
solution. Use the Extended Euclidean Algorithm to find an integer u
with au ≡ 1 (mod m). Multiply both sides of ax ≡ b (mod m) by u
to get
x ≡ aux ≡ ub (mod m).

Example 2.19. Solve 7x ≡ 23 (mod 103).


In Example 2.18 we found a number u = 59 with 7u ≡ 1 (mod 103).
Thus the solution is x ≡ (u)(23) ≡ (59)(23) = 1357 ≡ 18 (mod 103).
In fact, x = 18 is the only integer in 0 ≤ x < 103 such that
7x ≡ 23 (mod 103).
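
The recipe "multiply both sides by the inverse of a" is short enough to write out. This sketch assumes gcd(a, m) = 1, as in the text, and uses Python's built-in modular inverse `pow(a, -1, m)` (Python 3.8 or later); the function name `solve_linear_congruence` is ours.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> int:
    """Return the unique x in 0 <= x < m with a*x ≡ b (mod m), assuming gcd(a, m) = 1."""
    if gcd(a, m) != 1:
        raise ValueError("this sketch handles only gcd(a, m) = 1")
    u = pow(a, -1, m)       # u satisfies a*u ≡ 1 (mod m)
    return (u * b) % m      # x ≡ u*b (mod m)


x = solve_linear_congruence(7, 23, 103)
print(x)                    # 18
assert (7 * x) % 103 == 23
```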

Consider a system a_i x ≡ b_i (mod m_i), 1 ≤ i ≤ k, of
congruences in which gcd(a_i, m_i) = 1 for 1 ≤ i ≤ k and gcd(m_i, m_j) = 1
whenever 1 ≤ i < j ≤ k. Our goal is to describe all x that make all
k congruences true simultaneously. The condition gcd(m_i, m_j) = 1
ensures that there are solutions x. If we first solve each congruence
separately, we get a system x ≡ c_i (mod m_i), 1 ≤ i ≤ k, where
c_i ≡ b_i a_i^{−1} (mod m_i). (If one congruence can’t be solved because
some gcd(a_i, m_i) > 1, then the entire system has no solution.)

Theorem 2.20 (Chinese Remainder Theorem). Let m1 , . . ., mk be k


positive integers satisfying gcd(mi , mj ) = 1 whenever 1 ≤ i < j ≤ k.
Let ci , 1 ≤ i ≤ k, be any integers. Then the system of k congruences
x ≡ ci (mod mi ), 1 ≤ i ≤ k,
has a solution x. Any two solutions are congruent modulo M =
m1 · · · mk .
Proof. For 1 ≤ i ≤ k, the number $M/m_i$ is an integer. One can show
that $\gcd(M/m_i, m_i) = 1$ for 1 ≤ i ≤ k follows from the hypothesis
$\gcd(m_i, m_j) = 1$ whenever 1 ≤ i < j ≤ k. By Theorem 2.17, for each
1 ≤ i ≤ k there is an integer $b_i$ such that $(M/m_i) b_i \equiv 1 \pmod{m_i}$.
Then $(M/m_j) b_j \equiv 0 \pmod{m_i}$ when i ≠ j because $m_i$ divides
$M/m_j$ in that situation. Let $x_0 = \sum_{i=1}^{k} (M/m_i) b_i c_i$. Let $\delta_{ij} = 1$ if
i = j and 0 if i ≠ j. Then

$$x_0 = \sum_{j=1}^{k} (M/m_j) b_j c_j \equiv \sum_{j=1}^{k} \delta_{ij} c_j \equiv c_i \pmod{m_i}.$$

Therefore, the system of k congruences has a common solution $x_0$.
If $x_1$ is another common solution, then $x_1 \equiv c_i \equiv x_0 \pmod{m_i}$
for each i. The hypothesis $\gcd(m_i, m_j) = 1$ implies that $x_1 \equiv x_0 \pmod M$. □

We have sketched the proof of the Chinese Remainder Theorem


2.20 because it gives the following polynomial time algorithm for solv-
ing a system of simultaneous congruences. The variables mentioned
in the statement and proof of the Chinese Remainder Theorem 2.20
have the same meaning in the algorithm below, except that x in the
algorithm is x0 in the proof. The variable ni = M/mi is the prod-
uct of all mj except for mi , and ai = ni bi . The algorithm simply
computes x0 by the formula for it in the proof.
Algorithm 2.21. Algorithm for the Chinese Remainder Theorem
Input: Moduli $m_i$ and integers $c_i$ for 1 ≤ i ≤ k.
    We must have $\gcd(m_i, m_j) = 1$ for 1 ≤ i < j ≤ k.
    M ← $\prod_{i=1}^{k} m_i$
    for (i ← 1 to k) {
        $n_i$ ← $M/m_i$
        $b_i$ ← $(n_i \bmod m_i)^{-1} \bmod m_i$
        $a_i$ ← $n_i \cdot b_i$
    }
    x ← 0
    for (i ← 1 to k) {
        x ← x + $a_i \cdot c_i$
    }
    x ← x mod M
Output: The algorithm returns the solution x to $x \equiv c_i \pmod{m_i}$.
Example 2.22. Find x so that x ≡ 4 (mod 7) and x ≡ 24 (mod 103).


Since M = 7 · 103 = 721, the answer will be a congruence
class modulo 721. In the Chinese Remainder Theorem 2.20, we have
m1 = 7 and m2 = 103, so n1 = M/m1 = 103 and n2 = M/m2 = 7.
When k > 2, we would have to use the Extended Euclidean Algo-
rithm k times to compute the bi ’s. But when k = 2, a single appli-
cation of the Extended Euclidean Algorithm computes both bi . In
Example 2.18 we found that 1 = (−44)(7) + (3)(103). Therefore,
$b_1 = (103 \bmod 7)^{-1} \bmod 7 = 3$ and $b_2 = (7 \bmod 103)^{-1} \bmod 103 =
-44 \equiv 59 \pmod{103}$. Then $a_1 = n_1 b_1 = 103 \cdot 3 = 309$ and $a_2 =
n_2 b_2 = 7 \cdot 59 = 413$. Finally,
x = a1 c1 + a2 c2 = 309 · 4 + 413 · 24 = 11148 ≡ 333 (mod 721).
The answer is x ≡ 333 (mod 721) or x = 333 + 721t for any integer t.
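
Algorithm 2.21 translates almost line for line into code. The sketch below is one possible rendering in Python (3.8 or later, for `math.prod` and for `pow` with exponent −1); the function name `crt` and its argument order are assumptions made for illustration.

```python
from math import prod

def crt(residues, moduli):
    """Return (x, M) with x ≡ c_i (mod m_i) for all i and 0 <= x < M.

    The moduli are assumed to be pairwise relatively prime.
    """
    M = prod(moduli)
    x = 0
    for c_i, m_i in zip(residues, moduli):
        n_i = M // m_i                   # product of the other moduli
        b_i = pow(n_i % m_i, -1, m_i)    # inverse of n_i modulo m_i
        x += n_i * b_i * c_i             # a_i * c_i in the text's notation
    return x % M, M

print(crt([4, 24], [7, 103]))  # (333, 721)
```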

A typical use of the Chinese Remainder Theorem 2.20 is to compute
an integer x modulo M when we know the values of $x \bmod p^e$ for
each prime power factor $p^e$ of M. For example, we might want to solve
a congruence f(x) ≡ 0 (mod M). It may be easier to solve
$f(x) \equiv 0 \pmod{p^e}$ for each prime power factor $p^e$ of M. If $M = \prod_{i=1}^{k} p_i^{e_i}$
and $x \equiv c_i \pmod{p_i^{e_i}}$ is the solution to $f(x) \equiv 0 \pmod{p_i^{e_i}}$ for i = 1,
. . ., k, then the Chinese Remainder Theorem 2.20 and the algorithm
above (with $m_i = p_i^{e_i}$) will give the solution x to f(x) ≡ 0 (mod M).

2.4. Fermat and Euler


Fermat proved this useful theorem more than 350 years ago.
Theorem 2.23 (Fermat’s Little Theorem). If p is prime and n is
an integer not divisible by p, then p divides $n^{p-1} - 1$, that is,
$n^{p-1} \equiv 1 \pmod p$.
Corollary 2.24. If p is prime and n is an integer, then $n^p \equiv n \pmod p$.

One use of Theorem 2.23 is to compute the multiplicative inverse
of n modulo a prime p, as an alternative to using the Extended
Euclidean Algorithm as in Example 2.18. If gcd(n, p) = 1, then
$n^{-1} \equiv n^{p-2} \pmod p$. This works because $n \cdot n^{p-2} = n^{p-1} \equiv 1 \pmod p$.
The two methods are about equally fast, but the Extended Euclidean
Algorithm works even when the modulus is composite. This calculation
is often done when p is a prime number with hundreds of decimal
digits. The power of n is computed efficiently by the Fast
Exponentiation Algorithm.
To compute $n^e \pmod m$, write the exponent e as a binary number
$e = \sum_i b_i 2^i$, where $b_i \in \{0, 1\}$. Then

$$n^e = n^{\sum_i b_i 2^i} = \prod_i \left(n^{2^i}\right)^{b_i}.$$

The variable z in the algorithm holds successively the powers $n^{2^i}$,
for i = 0, 1, 2, . . .. When the exponent e is repeatedly divided by
2, discarding any fractional part, the parities of the successive
quotients are the bits $b_i$, i = 0, 1, 2, . . .. The variable y, initially 1,
remembers the product of some of the powers $n^{2^i}$ held in z. The
condition in the if statement in the algorithm is true if $b_i = 1$. When
this happens, $z = n^{2^i}$ is multiplied into the running product y. At the
end, y holds $n^e$. To get $n^e \bmod m$, each product (yz or $z^2$) is reduced
modulo m as soon as it is formed, to keep the numbers small.
Algorithm 2.25. Fast Exponentiation Algorithm.
Input: Integers m > 0, n ≥ 0, e ≥ 0.
    y ← 1
    z ← n
    while (e > 0) {
        if (e is odd) y ← (y · z) mod m
        z ← (z · z) mod m
        e ← ⌊e/2⌋
    }
Output: $n^e \bmod m$ = the final value of y.
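
The following Python sketch mirrors Algorithm 2.25 and also counts the passes through the while loop. The name `fast_pow` and the extra return value are illustrative additions; the final reduction `y % m` is a small assumption added so that the degenerate case m = 1 returns 0.

```python
def fast_pow(n, e, m):
    """Return (n**e mod m, number of while-loop iterations)."""
    y, z = 1, n
    iterations = 0
    while e > 0:
        if e % 2 == 1:          # current low-order bit of e is 1
            y = (y * z) % m
        z = (z * z) % m         # z runs through n^(2^i) mod m
        e //= 2                 # drop the low-order bit
        iterations += 1
    return y % m, iterations

# Inverse of 7 modulo the prime 103 via Fermat: 7^(103-2) mod 103.
print(fast_pow(7, 101, 103))  # (59, 7)
```

Python's built-in three-argument `pow(n, e, m)` computes the same value and is the practical choice in real code.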

Theorem 2.26. For e ≥ 1, the number of iterations of the while loop
in the Fast Exponentiation Algorithm is $1 + \lfloor \log_2 e \rfloor$. The algorithm
runs in polynomial time.

See Knuth [Knu81, Section 4.6.3] for more discussion.


Another application of Fermat’s Little Theorem is to identify
large composite numbers.
Corollary 2.27. If p is an integer > 1 and p does not divide the
integer n and $n^{p-1} \not\equiv 1 \pmod p$, then p is composite.

Proof. If p were prime, then we would have $n^{p-1} \equiv 1 \pmod p$
by Theorem 2.23, contrary to the hypothesis. □

Example 2.28. Show that 162167 is composite without factoring it.


Let p = 162167. We will compute $2^{p-1} \bmod p$. Table 2 shows
the values of y, z, and e in the Fast Exponentiation Algorithm with
m = p, n = 2, and e = p − 1. The first column shows the number
of times the while loop has been executed. Then $2^{p-1} \bmod p$ is the
final value of y, which is 85902. Since this value is not 1 (mod p), we
have proved that p is composite.

Table 2. Example of Fast Exponentiation.

Iterations        y        z        e
     0            1        2   162166
     1            1        4    81083
     2            4       16    40541
     3           64      256    20270
     4           64    65536    10135
     5       140129   136468     5067
     6        67398    94577     2533
     7         2377     1543     1266
     8         2377   110511      633
     9       136274    46518      316
    10       136274   130043      158
    11       136274    82755       79
    12        99523    77615       39
    13       139101    70676       19
    14        52235    29042        9
    15        98752     7197        4
    16        98752    65536        2
    17        98752   136468        1
    18        85902    94577        0
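
Corollary 2.27 gives a one-line compositeness test. The sketch below uses Python's built-in modular `pow`; the function name `fermat_proves_composite` is hypothetical. The composite number 341 illustrates why a failed test proves nothing: base 2 does not expose it, but base 3 does.

```python
def fermat_proves_composite(p, n=2):
    """Return True if n^(p-1) mod p != 1, which proves p composite.

    Requires p > 1 and p not dividing n. A False result is inconclusive.
    """
    if p <= 1 or n % p == 0:
        raise ValueError("need p > 1 and p not dividing n")
    return pow(n, p - 1, p) != 1

print(fermat_proves_composite(162167))   # the number from Example 2.28
print(fermat_proves_composite(341, 2))   # False, although 341 = 11 * 31
print(fermat_proves_composite(341, 3))   # True
```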
Fermat’s Little Theorem cannot be used to prove that a large


number is prime. For a converse to Fermat’s Little Theorem that
does prove primality see Theorem 3.27.

Example 2.29. Find the two low-order digits of $57^{543}$.

We take $57^{543}$ modulo 100 to get its two low-order digits. The
answer is $x = 57^{543} \bmod 100$. We could compute this number with
the Fast Exponentiation Algorithm, but that is tedious and x can
almost be found with mental arithmetic.
Note that 100 = 4 · 25 is the factorization of 100 into its prime
power factors. We will compute x mod 4 and x mod 25 and then
determine x with the Chinese Remainder Theorem 2.20.
First, 57 ≡ 7 (mod 25), so $57^2 \equiv 7^2 = 49 \equiv -1 \pmod{25}$. Hence,
$57^4 \equiv (-1)^2 \equiv 1 \pmod{25}$. Now 543 = 4 · 135 + 3, so

$$57^{543} = (57^4)^{135} \cdot 57^3 \equiv 1^{135} \cdot 7^3 = 343 \equiv 18 \pmod{25}.$$

It is even easier to compute $57^{543} \bmod 4$: 57 ≡ 1 (mod 4), so
$57^{543} \equiv 1 \pmod 4$. Use the Chinese Remainder Theorem algorithm to solve
the system x ≡ 1 (mod 4) and x ≡ 18 (mod 25). The answer is
x ≡ 93 (mod 100), and the two low-order digits of $57^{543}$ are 93.
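
The hand computation in Example 2.29 can be confirmed in a few lines. This is only a verification sketch in Python; the small search over 0 ≤ x < 100 stands in for the Chinese Remainder Theorem algorithm and is not how one would proceed for large moduli.

```python
# Residues of 57^543 modulo the prime power factors of 100.
r4 = pow(57, 543, 4)     # 1
r25 = pow(57, 543, 25)   # 18

# Combine them by brute force: the unique x in [0, 100) with both residues.
x = next(t for t in range(100) if t % 4 == r4 and t % 25 == r25)
print(r4, r25, x)        # 1 18 93
```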

Euler generalized Fermat’s Little Theorem to composite moduli.


To state Euler’s theorem, we need the Euler phi function.

Definition 2.30. For a positive integer m define the Euler phi func-
tion φ(m) to be the number of n in 1 ≤ n ≤ m with gcd(m, n) = 1.

Since it is easy to show that a ≡ b (mod m) implies gcd(a, m) =


gcd(b, m), we see that φ(m) is the number of numbers that are rela-
tively prime to m in any interval of m consecutive integers. If m is
prime, then all of the numbers 1, 2, . . ., m are relatively prime to m
except m itself. Thus, φ(m) = m − 1 when m is prime.

Theorem 2.31 (Euler’s Theorem). If m > 1 and a are integers with
gcd(m, a) = 1, then $a^{\varphi(m)} \equiv 1 \pmod m$.
Corollary 2.32. If m > 1 and a are integers with gcd(m, a) = 1,
then $a^{-1} = a^{\varphi(m)-1} \bmod m$ is a multiplicative inverse of a mod m.
Definition 2.33. A primitive root modulo a positive integer m is an
integer g for which t = φ(m) is the smallest positive integer such that
$g^t \equiv 1 \pmod m$.

A primitive root g modulo m must be relatively prime to m be-


cause otherwise no power of g would be ≡ 1 (mod m). Not all integers
m have primitive roots.
Theorem 2.34. An integer m > 1 has a primitive root if and only
if m = 2, m = 4, m = pe , or m = 2pe , where p is an odd prime and
e is a positive integer. If m has a primitive root, then it has exactly
φ(φ(m)) of them in 1 ≤ g ≤ m.
Example 2.35. The number g = 2 is a primitive root modulo 5
because its powers are 2, 4, 8 ≡ 3 and 16 ≡ 1 (mod 5). Since
φ(φ(5)) = φ(4) = 2, there is one more primitive root modulo 5 and it
is 3.
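
For small moduli, Definition 2.33 and Theorem 2.34 can be explored by brute force. The Python sketch below is meant only for experimentation with m > 1; the helper names `phi`, `multiplicative_order`, and `primitive_roots` are illustrative.

```python
from math import gcd

def phi(m):
    """Euler phi function by direct counting (fine for small m)."""
    return sum(1 for n in range(1, m + 1) if gcd(m, n) == 1)

def multiplicative_order(g, m):
    """Least t > 0 with g^t ≡ 1 (mod m); requires m > 1 and gcd(g, m) = 1."""
    t, power = 1, g % m
    while power != 1:
        power = (power * g) % m
        t += 1
    return t

def primitive_roots(m):
    """All primitive roots g modulo m with 1 <= g <= m, for m > 1."""
    target = phi(m)
    return [g for g in range(1, m + 1)
            if gcd(g, m) == 1 and multiplicative_order(g, m) == target]

print(primitive_roots(5))   # [2, 3]
print(primitive_roots(8))   # []
```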
Theorem 2.36. If g is a primitive root modulo m and b is relatively
prime to m, then there is exactly one exponent i in 0 ≤ i < φ(m) so
that $g^i \equiv b \pmod m$.
Definition 2.37. If g is a primitive root modulo m and b is relatively
prime to m, then the exponent i whose existence is guaranteed by
Theorem 2.36 is called the index of b to base g modulo m by number
theorists and is called the discrete logarithm of b to base g modulo m
by computer scientists.

In Corollary 3.28, we will tell how to compute a primitive root


modulo a prime m, which is the most common case.
We make a few remarks here about how hard it is to compute a
discrete logarithm modulo m. The discrete logarithm problem (DLP)
is to find x such that $g^x \equiv b \pmod m$ for given m, g, and b. This
problem is important because the security of many cryptographic
functions depends on the DLP being hard to solve. It is relevant to
this book because some factoring may be needed to determine the
difficulty of particular instances of the DLP.
If m is composite, then one can solve $g^x \equiv b \pmod p$ separately
for each prime factor p of m and combine the answers via the Chinese
Remainder Theorem 2.20 to solve the DLP modulo m. For this
reason, DLPs in cryptography usually have a prime modulus.
Now suppose p is prime with primitive root g and we must solve
$g^x \equiv b \pmod p$. The Index Calculus Method, which is similar to the
Quadratic Sieve Factoring Algorithm, described in Section 8.2, will
solve the DLP modulo p in time $O\left(\exp\left(\sqrt{2 \ln p \ln \ln p}\right)\right)$, which is
subexponential time, although not polynomial time. There is also an
analogue of the Number Field Sieve, described in Section 8.5, which
solves this DLP even faster. Finally, if q is the largest prime factor
of p − 1, then one can solve the DLP modulo p in time $O(\sqrt{q})$. Thus,
p − 1 should have a large prime factor q in order for the DLP modulo
p to be hard. For more detail, see Pohlig and Hellman [PH78].

2.5. Arithmetic Functions


See [NZM91] or [HW79] for proofs of the theorems in this section.

Definition 2.38. An arithmetic function is a function defined on the


positive integers.

Most arithmetic functions in this book take integer values.

Definition 2.39. A multiplicative function is an arithmetic function


f (x) such that f (mn) = f (m)f (n) whenever gcd(m, n) = 1.

Theorem 2.40. The Euler phi function is multiplicative.

Multiplicative functions are determined by their values on prime


powers, and therefore the obvious way to evaluate them involves fac-
toring integers.

Theorem 2.41. If $m = \prod_{i=1}^{k} p_i^{e_i}$ is the standard factorization of m
into a product of primes and if f(x) is any multiplicative function,
then $f(m) = \prod_{i=1}^{k} f(p_i^{e_i})$.

Definition 2.42. If m is a positive integer, let d(m) denote the num-


ber and σ(m) the sum of the positive divisors of m.
Example 2.43. If m is prime, then its positive divisors are 1 and


m, so d(m) = 2 and σ(m) = m + 1. If m and n are distinct prime
numbers, then the positive divisors of mn are 1, m, n, and mn, so
d(mn) = 4 and σ(mn) = 1 + m + n + mn = (m + 1)(n + 1).

Theorem 2.44. The functions d(m) and σ(m) are multiplicative.



Theorem 2.45. If $m = \prod_{i=1}^{k} p_i^{e_i}$, where the $p_i$ are distinct primes, then

$$d(m) = \prod_{i=1}^{k} (1 + e_i), \qquad \sigma(m) = \prod_{i=1}^{k} \frac{p_i^{e_i+1} - 1}{p_i - 1}, \qquad \varphi(m) = \prod_{i=1}^{k} p_i^{e_i-1}(p_i - 1).$$

Proof. The divisors of $p^e$ are 1, p, $p^2$, . . ., $p^e$. There are 1 + e of them
and their sum is $(p^{e+1} - 1)/(p - 1)$. Note that $\varphi(p^e) = p^{e-1}(p - 1) =
p^e - p^{e-1}$ because all of the $p^e$ numbers m in $1 \le m \le p^e$ are relatively
prime to $p^e$ except the multiples of p, and there are $p^{e-1}$ of these
multiples. Now apply Theorem 2.41. □
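
The formulas of Theorem 2.45 can be evaluated directly once m is factored. The Python sketch below factors by naive trial division, which is adequate only for small m; the names `factorize`, `d`, `sigma`, and `phi` are chosen here for illustration.

```python
def factorize(m):
    """Return {p: e} with m equal to the product of p**e (naive trial division)."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:                       # leftover cofactor is prime
        factors[m] = factors.get(m, 0) + 1
    return factors

def d(m):
    """Number of positive divisors of m."""
    result = 1
    for e in factorize(m).values():
        result *= 1 + e
    return result

def sigma(m):
    """Sum of the positive divisors of m."""
    result = 1
    for p, e in factorize(m).items():
        result *= (p ** (e + 1) - 1) // (p - 1)
    return result

def phi(m):
    """Euler phi function of m."""
    result = 1
    for p, e in factorize(m).items():
        result *= p ** (e - 1) * (p - 1)
    return result

print(d(12), sigma(12), phi(12))  # 6 28 4
```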

Definition 2.46. A perfect number is a positive integer m for which


σ(m) = 2m.

We need one more multiplicative function.

Definition 2.47. The Möbius function μ(m) is defined for positive
integers m to be 0 if m has a repeated prime factor and $(-1)^k$ if m
is the product of exactly k different primes. Define μ(1) = 1.

A positive integer is called square free if it is the product of dis-


tinct primes, that is, if it is not divisible by any square > 1. Thus
μ(m) = 0 if m is not square free and either +1 or −1 if m is square
free. If m is prime, then μ(m) = −1.

Example 2.48. Since 18 is divisible by the square 9, we have μ(18) = 0.
Since 30 = 2 · 3 · 5 is the product of three different primes, we have
$\mu(30) = (-1)^3 = -1$.

Theorem 2.49. The function μ(m) is multiplicative.

Theorem 2.50. If m is a positive integer, then

$$\varphi(m) = m \prod_{p \mid m} \left(1 - \frac{1}{p}\right) = m \sum_{d \mid m} \frac{\mu(d)}{d}.$$
In Theorem 2.50, the product is taken over all distinct prime


factors p of m and the sum is taken over all positive divisors d of m.

Theorem 2.51. The sum $\sum_{d \mid m} \mu(d)$ equals 1 if m = 1 and it equals
0 if m is an integer > 1.
Theorem 2.52. If f(m) is a multiplicative function, then so is
$g(m) = \sum_{d \mid m} f(d)$.

Theorem 2.53 (Möbius Inversion Formula). If $g(m) = \sum_{d \mid m} f(d)$,
then $f(m) = \sum_{d \mid m} \mu(m/d) g(d)$.
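
Theorems 2.50 and 2.51 are easy to test numerically for small m. The following Python sketch is a brute-force check, not an efficient implementation; the names `mobius` and `divisors` are illustrative. The sum in Theorem 2.50 is evaluated as $\sum_{d \mid m} \mu(d)\,(m/d)$ to stay in integer arithmetic.

```python
from math import gcd

def mobius(m):
    """Möbius function: 0 if m has a repeated prime factor, else (-1)^k."""
    k = 0
    p = 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:      # p^2 divided the original value
                return 0
            k += 1
        p += 1
    if m > 1:                   # one remaining prime factor
        k += 1
    return -1 if k % 2 else 1

def divisors(m):
    return [t for t in range(1, m + 1) if m % t == 0]

for m in (1, 18, 30):
    print(m, mobius(m), sum(mobius(t) for t in divisors(m)))
# 1 1 1
# 18 0 0
# 30 -1 0
```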

Another arithmetic function that we will use is ω(m) = the number
of different prime factors of the positive integer m. For example,
$\omega(p^a) = 1$ if $p^a$ is any prime power, ω(24) = 2, and ω(105) = 3. The
function ω is not multiplicative, but it is additive, that is, ω(mn) =
ω(m) + ω(n) whenever gcd(m, n) = 1. We need the function ω for
this theorem about φ(m), which will be used to prove Theorem 3.19.
Theorem 2.54. If m is a positive integer, then $\varphi(m) \ge 2^{\omega(m)-1}$.

Proof. If p is an odd prime and e ≥ 1, then $\varphi(p^e) = p^{e-1}(p - 1) \ge 2$.
Now m has ω(m) distinct prime factors, so it has at least ω(m) − 1
distinct odd prime factors, each of which contributes a factor
$\varphi(p^e) \ge 2$ to φ(m). □

We use the notation $p^e \parallel m$ to mean that $p^e \mid m$ but $p^{e+1} \nmid m$.


Definition 2.55. For a positive integer m define the Carmichael
λ function by λ(1) = 1, $\lambda(2^e) = \varphi(2^e) = 2^{e-1}$ for e = 1 and 2,
$\lambda(2^e) = \varphi(2^e)/2 = 2^{e-2}$ for e > 2, $\lambda(p^e) = \varphi(p^e) = p^{e-1}(p - 1)$ for
all odd primes p and all positive integers e, and λ(m) = the least
common multiple of $\lambda(p^e)$ for all prime powers $p^e$ such that $p^e \parallel m$.
Example 2.56. Here are the values of λ(m) for some small m:
m:     1  2  3  4  5  6  7  8  9  10  11  12  16  24  35
λ(m):  1  1  2  2  4  2  6  2  6   4  10   2   4   2  12
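
Definition 2.55 can be turned into a short program that regenerates the table of Example 2.56. This is a Python sketch for small m using naive factoring; the names `lambda_prime_power` and `carmichael_lambda` are hypothetical.

```python
from math import gcd

def lambda_prime_power(p, e):
    """Carmichael lambda of the prime power p**e, following Definition 2.55."""
    if p == 2 and e > 2:
        return 2 ** (e - 2)
    return p ** (e - 1) * (p - 1)

def carmichael_lambda(m):
    """Least common multiple of lambda(p**e) over the prime powers p**e || m."""
    result = 1
    p = 2
    while p * p <= m:
        if m % p == 0:
            e = 0
            while m % p == 0:
                m //= p
                e += 1
            term = lambda_prime_power(p, e)
            result = result * term // gcd(result, term)   # lcm
        p += 1
    if m > 1:                                             # leftover prime, e = 1
        result = result * (m - 1) // gcd(result, m - 1)
    return result

ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 16, 24, 35]
print([carmichael_lambda(m) for m in ms])
# [1, 1, 2, 2, 4, 2, 6, 2, 6, 4, 10, 2, 4, 2, 12]
```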
Theorem 2.57. If m is a positive integer and a is relatively prime
to m, then $a^{\lambda(m)} \equiv 1 \pmod m$, and λ(m) is the least exponent t for
which $b^t \equiv 1 \pmod m$ for every integer b relatively prime to m.
Proof. By Euler’s Theorem 2.31, if $p^e \parallel m$, then $a^{\varphi(p^e)} \equiv 1 \pmod{p^e}$.
Since $\lambda(p^e) = \varphi(p^e)$ when p is odd or e ≤ 2, we have $a^{\lambda(p^e)} \equiv
1 \pmod{p^e}$ in these cases. Now $2^e$ has no primitive root when e > 2
by Theorem 2.34. We have $\varphi(2^e) = 2^{e-1}$ for e ≥ 1 by Theorem 2.45,
so $\lambda(2^e) \le \varphi(2^e)/2$ when e > 2. By Exercise 2.10, the least exponent t
for which $5^t \equiv 1 \pmod{2^e}$ is $t = 2^{e-2} = \lambda(2^e)$ when e > 2. Therefore,
$\lambda(2^e) = \varphi(2^e)/2$ when e > 2. By group theory, the least exponent t
for which $b^t \equiv 1 \pmod m$ is the least common multiple of the least
exponents $t_p$ for which $b^{t_p} \equiv 1 \pmod{p^e}$ for each $p^e \parallel m$. □

We will use Theorem 2.57 to prove Korselt’s Theorem 3.36.

2.6. Quadratic Congruences


In Section 2.3, we noted that congruences are similar to equations and
saw how one can easily solve a linear congruence ax ≡ b (mod m) using
the Extended Euclidean Algorithm. In this section we consider
the solution of a quadratic congruence $ax^2 + bx + c \equiv 0 \pmod m$. The
quadratic formula gives the two roots. It uses the four arithmetic
operations, add, subtract, multiply, and divide, and also a square root.
The division operation may be done using the Extended Euclidean
Algorithm just as for the linear congruence case. The new operation
here is the square root of the discriminant. To solve a congruence
$y^2 \equiv r \pmod m$, one solves the same congruence modulo each prime
power factor of m and combines the solutions using the Chinese
Remainder Theorem 2.20. (Example 4.40 illustrates this calculation.)
Solutions modulo a prime power $p^k$ are obtained from those for the
congruence modulo p by “lifting” those solutions. (See Theorem 3.8
and Example 3.9.) The real difficulty and novelty lies in solving
$y^2 \equiv r \pmod p$ when p is prime.
Consider first the prime p = 2. It is easy to solve $x^2 \equiv r \pmod 2$:
If r = 0, then x ≡ 0 (mod 2), and if r = 1, then x ≡ 1 (mod 2).
These solutions do not “lift” well to powers of 2 because 2 behaves in
a special way for square roots. The solutions to $x^2 \equiv r \pmod 4$ are:
If r = 0, then x ≡ 0 (mod 2); if r = 1, then x ≡ 1 (mod 2); while if
r = 2 or 3, then there is no solution x. The reader should examine
the cases with moduli 8 and 16. See also Exercise 2.10.
Now let p be an odd prime. The congruence $x^2 \equiv r \pmod p$
has one solution, x = 0, when r = 0; two solutions, x and p − x,
for (p − 1)/2 nonzero values of r; and no solution for the remaining
(p − 1)/2 nonzero values of r. Study this table of squares modulo 13:
x:           0  1  2  3  4   5   6   7   8  9  10  11  12
x^2 mod 13:  0  1  4  9  3  12  10  10  12  3   9   4   1
Observe that 1, 3, 4, 9, 10, and 12 are squares modulo 13, but 2, 5,
6, 7, 8, and 11 are not. The solutions to $x^2 \equiv 4 \pmod{13}$ are x ≡ 2
and 11 (mod 13). The congruence $x^2 \equiv 6 \pmod{13}$ has no solution.
The nonzero squares modulo m are called quadratic residues and
the other numbers relatively prime to m are called quadratic non-
residues.
Theorem 2.58. If p is an odd prime, then exactly (p − 1)/2 of the
numbers 1, 2, . . ., p − 1 are quadratic residues modulo p; the other
(p − 1)/2 of these numbers are quadratic nonresidues modulo p.

Legendre defined the symbol (r/p), where p is an odd prime and r


is an integer, to be +1 when r is a quadratic residue modulo p and −1
when r is a quadratic nonresidue modulo p. We also define (r/p) = 0
when p | r. The most important properties of the Legendre symbol
are given in the next theorem.
Theorem 2.59. Let p be an odd prime and let r and s be integers.
Then:
(1) (rs/p) = (r/p)(s/p).
(2) If r ≡ s (mod p), then (r/p) = (s/p).
(3) If $p \nmid r$, then $(r^2/p) = +1$ and $(r^2 s/p) = (s/p)$.
(4) $r^{(p-1)/2} \equiv (r/p) \pmod p$.
(5) (−1/p) = +1 when p ≡ 1 (mod 4) and (−1/p) = −1 when
p ≡ 3 (mod 4).
(6) (2/p) = +1 when p ≡ 1 or 7 (mod 8), and (2/p) = −1 when
p ≡ 3 or 5 (mod 8).
(7) If q is an odd prime different from p, then (p/q) = (q/p)
when p ≡ 1 (mod 4) or q ≡ 1 (mod 4), and (p/q) = −(q/p)
when p ≡ q ≡ 3 (mod 4).
Part (4) of the theorem is called Euler’s Criterion and provides


(with Fast Exponentiation) a quick way to determine whether r is a
quadratic residue or nonresidue modulo p. Part (7) is called the Law
of Quadratic Reciprocity and parts (5) and (6) are the two supple-
ments to this law. See [NZM91] or [HW79] for proofs.
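
Euler's Criterion, part (4) of Theorem 2.59, yields the Legendre symbol with one modular exponentiation. A minimal Python sketch follows; the function name `legendre` is an illustrative choice.

```python
def legendre(r, p):
    """Legendre symbol (r/p) for an odd prime p, via Euler's Criterion."""
    t = pow(r, (p - 1) // 2, p)     # t is 0, 1, or p - 1
    return -1 if t == p - 1 else t

# Quadratic residues modulo 13, matching the table of squares in the text.
print([r for r in range(1, 13) if legendre(r, 13) == 1])
# [1, 3, 4, 9, 10, 12]
```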
If g is a primitive root modulo an odd prime p, then the even powers
$g^{2k}$ are all of the quadratic residues modulo p and the odd powers
$g^{2k+1}$ are all of the quadratic nonresidues modulo p. In particular,
the primitive root g must be a quadratic nonresidue modulo p.

Example 2.60. Which odd primes p have 13 as a quadratic residue?


The Law of Quadratic Reciprocity says that (13/p) = (p/13) because
13 ≡ 1 (mod 4). The quadratic residues modulo 13 are 1, 3, 4, 9, 10,
12, so the answer is all odd primes p ≡ 1, 3, 4, 9, 10, or 12 (mod 13).
These primes lie in the six arithmetic progressions 1 + 13k, 3 + 13k,
etc. The other odd primes, those ≡ 2, 5, 6, 7, 8, or 11 (mod 13), have
13 as a quadratic nonresidue.

We define the Jacobi symbol, mentioned later in the book but not
really used. If Q is an odd positive integer, so that $Q = q_1 q_2 \cdots q_t$,
where the $q_i$ are odd primes, and P is an integer, then the Jacobi
symbol $(P/Q) = \prod_{i=1}^{t} (P/q_i)$, where the $(P/q_i)$ are Legendre symbols.
If gcd(P, Q) > 1, then (P/Q) = 0, but if gcd(P, Q) = 1, then (P/Q) =
±1. The Jacobi symbol satisfies the properties listed in Theorem
2.59, except for Euler’s Criterion (4). These properties allow one to
compute in polynomial time the value of (P/Q) even when the prime
factors of Q (and P ) are unknown. They also facilitate evaluation
of Legendre symbols. If P is a quadratic residue modulo Q, then
(P/Q) = +1. But (P/Q) = +1 does not imply that P is a quadratic
residue modulo Q when Q is composite.

Example 2.61. Evaluate the Legendre symbol (15/73).


Since 73 is prime, the Jacobi symbol (15/73) equals the Legendre
symbol (15/73). By property (7) of Theorem 2.59, (15/73) = (73/15).
By property (2) of Theorem 2.59, (73/15) = (13/15). By property
(7) of Theorem 2.59, (13/15) = (15/13). By property (2) of The-
orem 2.59, (15/13) = (2/13). By property (6) of Theorem 2.59,
(2/13) = −1. Therefore, (15/73) = −1 and 15 is a quadratic nonresidue
modulo 73.
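
The chain of steps in Example 2.61 is the standard way to evaluate a Jacobi symbol without factoring. The Python sketch below is one common formulation of that procedure, using properties (2), (6), and (7) of Theorem 2.59 as they carry over to Jacobi symbols; the name `jacobi` is an assumption.

```python
def jacobi(P, Q):
    """Jacobi symbol (P/Q) for an odd positive integer Q."""
    if Q <= 0 or Q % 2 == 0:
        raise ValueError("Q must be odd and positive")
    P %= Q                                  # property (2)
    result = 1
    while P != 0:
        while P % 2 == 0:                   # pull out factors of 2, property (6)
            P //= 2
            if Q % 8 in (3, 5):
                result = -result
        P, Q = Q, P                         # reciprocity, property (7)
        if P % 4 == 3 and Q % 4 == 3:
            result = -result
        P %= Q
    return result if Q == 1 else 0          # Q > 1 here means gcd(P, Q) > 1

print(jacobi(15, 73))  # -1
print(jacobi(2, 15))   # 1, although 2 is not a square modulo 15
```

The second call illustrates the warning in the text: (P/Q) = +1 does not imply that P is a quadratic residue modulo a composite Q.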

Exercises

2.1. Find the greatest common divisor g of 321 and 381. Find
integers x and y so that 321x + 381y = g.
2.2. Extend the table of Example 2.13 as far as you can with your
factoring resources.
2.3. Estimate the number of random 300-digit numbers you would
have to test for primality to find one prime.
2.4. Solve 31x ≡ 17 (mod 109).
2.5. Find x so that x ≡ 5 (mod 9) and x ≡ 8 (mod 11).
2.6. Find the two low-order digits of 83765 .
2.7. Compute φ(m), d(m), σ(m), μ(m), and ω(m) for each integer
m in 30 ≤ m ≤ 40.
2.8. Find all 0 ≤ r < 8 for which $x^2 \equiv r \pmod 8$ has a solution.
Repeat with 8 replaced by 16.
2.9. Is the Carmichael λ function multiplicative?
2.10. Show that the least exponent t for which $5^t \equiv 1 \pmod{2^e}$ is
$t = 2^{e-2} = \lambda(2^e)$ when e > 2.
2.11. Use Theorem 2.59 to evaluate these Legendre symbols: (−1/23),
(2/31), (7/37), (69/103), and (35/67). Try to answer without
using a computer.
2.12. Which primes p have 11 as a quadratic residue?
2.13. Let $m = \sum_{i=0}^{k} d_i 10^i$ be the decimal number $(d_k d_{k-1} \ldots d_1 d_0)_{10}$.
(a) Prove that 3 divides m if and only if 3 divides the sum
d0 + d1 + · · · + dk−1 + dk of the digits of m.
(b) Prove that 11 divides m if and only if 11 divides the
alternating sum d0 − d1 + · · · ± dk−1 ∓ dk .
(c) Prove that 37 divides m if and only if 37 divides
d0 + 10d1 − 11d2 + d3 + 10d4 − 11d5 + d6 + 10d7 − 11d8 + · · · .
Chapter 3

Number Theory
Relevant to Factoring

Introduction
This chapter presents more advanced topics from number theory that
you will need to understand the rest of the book. This material does
not appear in most introductory number theory books. References
or proofs are given for these theorems. The fastest known factoring
algorithms work by factoring millions of small auxiliary numbers us-
ing simple techniques like Trial Division and sieves. They succeed
in factoring an auxiliary number when it is smooth. The main re-
sult stated in the first section is that a positive fraction of auxiliary
numbers is smooth. This fraction determines the running time of the
integer factoring algorithm. The next section deals with solving the
congruence $x^2 \equiv r \pmod m$. Several fast factoring algorithms do
this as the penultimate step, the last step being finding a greatest
common divisor. (For one such algorithm see Example 6.11.) Square
roots of quadratic residues are also needed for certain cryptographic
algorithms discussed in Section 4.8. The next four sections deal with
the algebraic factorizations of the numbers $b^n \pm 1$ and the Fibonacci
and Lucas numbers, some of the most interesting numbers to factor.
Simple algebra factors some of these numbers for free. One should
take advantage of this free factoring before one begins to use the more
expensive factoring algorithms described in this book. The final
section treats primality testing, essential for telling when factorization
is complete.

3.1. Smooth Numbers


The most naive factoring algorithm is Trial Division. It factors N by
dividing N by each prime in turn, removing each prime factor it finds
and stopping when the next prime trial divisor exceeds the square
root of the remaining cofactor. This algorithm is correct because of
Theorem 2.14. A copy m of N is made because m may change during
the algorithm. The while loop runs through the primes p (or other
trial divisors; see below). For each p, the algorithm tests whether
p | m. If so, it records the factor p and removes it from m. Then
it tests whether the same p divides the new m. When $p \nmid m$, the
loop continues with a new trial divisor p. The loop stops as soon as
$p > \sqrt{m}$. Finally, it prints the last prime factor m of N. In case N is
prime, the algorithm finds no factor of N and stops when $p > \sqrt{N}$,
printing “N is prime” as it exits.
Algorithm 3.1. Trial Division.
Input: An integer N > 1.
    m ← N
    p ← 2
    while (p ≤ $\sqrt{m}$) {
        if (m mod p) = 0 {
            write “p divides N ”
            m ← m/p
        }
        else { p ← the next prime after p }
    }
    if (m = N ) { write “N is prime” }
    else if (m > 1) { write “m divides N ” }
Output: The algorithm writes the prime factors of N .

The algorithm would work correctly, but perhaps more slowly, if


it let p be p + 1 rather than “the next prime after p.” In a typical
implementation of the algorithm there is a compromise between these
two extremes. One might add 1 to p the first time to reach p = 3, then
add 2 to get to p = 5, and after that alternate between adding 2 and 4
to p. This makes p take on the values 2, 3, 5, 7, 11, 13, 17, 19, 23, 25,
29, 31, 35, . . .. In order to make the algorithm correct, the variable
p must not skip any prime. The reason why no primes are missed
when one adds 2 and 4 alternately is that, after the primes 2 and 3,
all larger primes are either 1 more or 1 less than a multiple of 6, and
the alternation of adding 2 and 4 makes the variable p run through
exactly these numbers. We will say more about this algorithm in
Section 5.1.
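
Here is a sketch of Algorithm 3.1 in Python using the compromise just described: the trial divisors are 2, 3, 5 and then alternate steps of 2 and 4. It returns the factors in a list instead of writing them; the function name `trial_division` and the `step` variable are illustrative.

```python
def trial_division(N):
    """Return the prime factors of N > 1 in nondecreasing order, with repetition."""
    factors = []
    m = N
    p = 2
    step = 2                      # used once p has reached 5
    while p * p <= m:             # same test as p <= sqrt(m), in integers
        if m % p == 0:
            factors.append(p)
            m //= p
        elif p == 2:
            p = 3
        elif p == 3:
            p = 5
        else:
            p += step             # 5, 7, 11, 13, 17, 19, 23, 25, ...
            step = 6 - step       # alternate between adding 2 and 4
    if m > 1:
        factors.append(m)         # the last prime factor (N itself if N is prime)
    return factors

print(trial_division(721))   # [7, 103]
print(trial_division(360))   # [2, 2, 2, 3, 3, 5]
```

Composite trial divisors such as 25 or 35 cost a wasted division but never appear in the output, because their prime factors have already been removed from m.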
Trial Division is most successful when factoring a “smooth” num-
ber, that is, one with only small prime factors.

Definition 3.2. A positive integer n is y-smooth if all of its prime


factors are ≤ y. The de Bruijn function, ψ(x, y), is the number of
y-smooth integers between 1 and x, inclusive.
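
For small x, the de Bruijn function can be computed directly from Definition 3.2. The Python sketch below is a brute-force illustration only; the names `is_smooth` and `psi` are assumptions.

```python
def is_smooth(n, y):
    """Return True if every prime factor of the positive integer n is <= y."""
    p = 2
    while p <= y and n > 1:
        while n % p == 0:       # strip every factor p <= y from n
            n //= p
        p += 1
    return n == 1               # anything left has a prime factor > y

def psi(x, y):
    """Number of y-smooth integers n with 1 <= n <= x."""
    return sum(1 for n in range(1, x + 1) if is_smooth(n, y))

print(psi(10, 3))     # 7: the integers 1, 2, 3, 4, 6, 8, 9
print(psi(100, 7))    # 46
print(psi(10**4, 100) / 10**4)   # fraction of n <= 10^4 that are 100-smooth
```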

If n is composite with largest prime factor $n_1$ and second largest
prime factor $n_2$, then n is $n_1$-smooth and the number of steps Trial
Division takes to factor n is $O(\max(n_2, \sqrt{n_1}))$. See Knuth and Trabb
Pardo [KP76] for a complete analysis of Algorithm 3.1 and more
about smooth numbers.
We will encounter smooth numbers later in the analysis of the
fastest known factoring algorithms. For that purpose we will need to
estimate ψ(x, y). The fraction of integers ≤ x which are y-smooth is
ψ(x, y)/x. The proper relation between x and y for studying the de
Bruijn function is $y = x^{1/u}$ for some real number $u = (\ln x)/\ln y \ge 1$.
Often u is fixed. Dickman [Dic30] gave a heuristic argument
and Ramaswami [Ram49] proved that there is a positive continuous
function ρ(u), defined for u > 0, so that

$$\lim_{x \to \infty} \frac{\psi(x, x^{1/u})}{x} = \rho(u).$$

A useful approximation is $\rho(u) \approx u^{-u}$, that is,

$$\psi(x, x^{1/u}) \approx x\rho(u) \approx x u^{-u}, \tag{3.1}$$

which is valid when u ≥ 1 is fixed and also holds when u increases
slowly with x so long as u is small compared to (ln x)/ ln ln x. See
Canfield et al. [CEP83] for more details and a proof. This table gives
values of ρ(u) and $u^{-u}$ for some small positive integers u:

u    ρ(u)         u^{-u}
1    1.0000000    1.0000000
2    0.3068528    0.2500000
3    0.0486084    0.0370370
4    0.0049109    0.0039062
5    0.0003547    0.0003200
6    0.0000197    0.0000214

We will use the approximation $\psi(x, x^{1/u})/x \approx u^{-u}$ in estimating the
time complexity of the fastest factoring algorithms, like the Elliptic
Curve Method in Theorem 7.11.
Example 3.3. The table shows that about 31% of numbers n have
no prime factor greater than $\sqrt{n}$, about 4.9% of numbers n have no
prime factors greater than $\sqrt[3]{n}$, etc.
Example 3.4. Estimate the number of integers between $10^{30} - 10^5$
and $10^{30}$, all of whose prime factors are less than one million.
The probability that an integer near $10^{30}$ is $10^6$-smooth is
approximately the same as the probability that a random number between 1
and $10^{30}$ is $10^6$-smooth, namely $\psi(10^{30}, 10^6)/10^{30}$. Since the interval
of interest has length $10^5$, the number we must estimate is
approximately $10^5$ times $\psi(10^{30}, 10^6)/10^{30}$. Since $10^6 = (10^{30})^{1/5}$, we have
u = 5. The table shows that ρ(u) ≈ 0.00035, so by (3.1) the answer
is approximately $10^5 \cdot 0.00035 = 35$. Using Algorithm 8.3, one finds
that the exact count of the $10^6$-smooth numbers between $10^{30} - 10^5$
and $10^{30}$ is 30, including $10^{30}$.

3.2. Finding Modular Square Roots


In Section 2.6 we asked how to solve a quadratic congruence and
reduced the problem to computing a square root of a quadratic residue
modulo an odd prime p. We now give an efficient algorithm for doing
this. The algorithm is especially simple when p ≡ 3 (mod 4).
Theorem 3.5. Let p ≡ 3 (mod 4) be prime and let r be a quadratic
residue modulo p. Then the square roots of r, that is, the solutions to
the congruence $x^2 \equiv r \pmod p$, are $x \equiv \pm r^{(p+1)/4} \pmod p$.
Proof. We simply note that

$$x^2 \equiv r^{(p+1)/2} \equiv r \cdot r^{(p-1)/2} \equiv r \cdot (r/p) \equiv r \pmod p$$

by Euler’s Criterion and the fact that (r/p) = +1. □
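
Theorem 3.5 amounts to a single modular exponentiation. A minimal Python sketch, with the hypothetical name `sqrt_mod_p3` and an added check that r really is a quadratic residue:

```python
def sqrt_mod_p3(r, p):
    """Square root of r modulo a prime p ≡ 3 (mod 4); the other root is p - x."""
    if p % 4 != 3:
        raise ValueError("this formula needs p ≡ 3 (mod 4)")
    x = pow(r, (p + 1) // 4, p)
    if (x * x) % p != r % p:
        raise ValueError("r is not a quadratic residue modulo p")
    return x

x = sqrt_mod_p3(2, 23)
print(x, 23 - x)  # 18 5
```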

When p ≡ 5 (mod 8) the algorithm is slightly more complicated.
If r is a quadratic residue modulo p, then its square roots are ±x,
where x is computed as follows: Let $x = r^{(p+3)/8} \bmod p$. If
$x^2 \not\equiv r \pmod p$, then replace x with $x \cdot 2^{(p-1)/4} \bmod p$.
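
The p ≡ 5 (mod 8) rule can be checked the same way. The Python sketch below follows the two steps just stated; the name `sqrt_mod_p5` and the final residue check are additions made for illustration.

```python
def sqrt_mod_p5(r, p):
    """Square root of r modulo a prime p ≡ 5 (mod 8); the other root is p - x."""
    if p % 8 != 5:
        raise ValueError("this formula needs p ≡ 5 (mod 8)")
    x = pow(r, (p + 3) // 8, p)
    if (x * x) % p != r % p:
        # First guess squared gives -r; multiply by a square root of -1.
        x = (x * pow(2, (p - 1) // 4, p)) % p
    if (x * x) % p != r % p:
        raise ValueError("r is not a quadratic residue modulo p")
    return x

print(sqrt_mod_p5(3, 13))    # 9, found without the correction step
print(sqrt_mod_p5(10, 13))   # 7, found after the correction step
```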
Both algorithms just given are special cases of an algorithm in-
vented by Tonelli that works in all cases. A bit of group theory
is needed to explain how Tonelli’s algorithm works. If you haven’t
studied group theory, then skip to the algorithm. The multiplicative
group G of integers modulo a prime p is cyclic of order p − 1. Write
$p - 1 = 2^e f$ with f odd. Euler’s Criterion shows that the multiplicative
order of $R = r^f$ modulo p divides $2^{e-1}$. If n is a quadratic
nonresidue modulo p, then Euler’s Criterion shows that the multiplicative
order of $N = n^f$ equals $2^e$. Therefore, $R \equiv N^{-j} \pmod p$ for
some even integer j. If we know $j \bmod 2^i$, then $j \bmod 2^{i+1}$ can only
be j or $j + 2^i$; the if statement decides which it is. Modulo $2^0 = 1$,
j = 0. The for loop computes j. Finally, we have $RN^j \equiv 1 \pmod p$,
that is, $r^f N^j \equiv 1 \pmod p$, so $r^{f+1} N^j \equiv r \pmod p$. Since both
exponents f + 1 and j are even numbers, we can divide them by 2 to
get a square root x of r.

Algorithm 3.6. Tonelli Modular Square Root.
Input: An odd prime p and a quadratic residue r mod p.
    Find a quadratic nonresidue n modulo p.
    Express $p - 1 = 2^e f$ with f odd and e > 0.
    R ← $r^f \bmod p$
    N ← $n^f \bmod p$
    j ← 0
    for (i ← 1 to e) {
        if ($(RN^j)^{2^{e-i-1}} \equiv -1 \pmod p$) { j ← j + $2^i$ }
    }
    x ← $r^{(f+1)/2} N^{j/2} \bmod p$
Output: x and p − x are the two solutions to $x^2 \equiv r \pmod p$.
In the first line of the algorithm we must find a quadratic nonresidue
$n$ modulo $p$. One way to do this is to try small nonsquare
positive integers $n$ and test each one with Euler’s Criterion. There
is no known deterministic polynomial time algorithm for finding a
quadratic nonresidue $n$ modulo a prime $p$. If one assumes the Extended
Riemann Hypothesis¹, then one can prove that some positive
integer $n < 2 \ln^2 p$ is a quadratic nonresidue modulo $p$. However, half
of the integers between 1 and $p - 1$ are quadratic nonresidues, so, if
we choose random $n$ as suggested above, the average number of them
we will have to test is 2.
In the second line, one removes all factors of 2 from the even
number $p - 1$. If there are $e$ twos, then $2^e$ divides $p - 1$ and we let
$f = (p - 1)/2^e$ be the odd cofactor. The exponentiation in subsequent
lines is done by the Fast Exponentiation Algorithm.
For a proof that the Tonelli algorithm works, see Algorithm 2.3.8
of Crandall and Pomerance [CP05]. Compare with Theorem 7.1.3
of Bach and Shallit [BS96]. Shanks [Sha73] called this algorithm
RESSOL. See Section 2.9 of [NZM91] for more discussion of this
algorithm. See also page 106 of Wagstaff [Wag03].

Example 3.7. Use the Tonelli algorithm to find the square roots of
50 modulo 73.
Trying small (nonsquare) positive integers 2, 3, 5, 6, . . ., we find
that 5 is the smallest positive quadratic nonresidue modulo 73. We
write $73 - 1 = 72 = 2^3 \cdot 9$, so $e = 3$ and $f = 9$. We set
$R = 50^9 \bmod 73 = 27$ and $N = 5^9 \bmod 73 = 10$. Set $j = 0$ and begin the for loop.
When $i = 1$, the condition in the if statement is true and $j$ is changed
to $0 + 2^1 = 2$. When $i = 2$, the condition is true and $j$ is changed to
$2 + 2^2 = 6$. When $i = 3$, the condition is false and $j$ is not changed.
Finally, $x = 50^{(9+1)/2} \cdot 10^{6/2} \bmod 73 = 59$ or $73 - 59 = 14$. The two
square roots of 50 are 14 and 59 modulo 73.
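A minimal Python sketch of Algorithm 3.6 follows. It rests on several assumptions that are not in the text: the name `tonelli` is hypothetical, the nonresidue $n$ is found by trying 2, 3, 4, . . . with Euler’s Criterion, and the loop stops at $i = e - 1$, because the exponent $2^{e-i-1}$ is not an integer when $i = e$ and by then $R N^j \equiv 1$ already holds.

```python
def tonelli(r, p):
    """Return a square root of the quadratic residue r modulo the odd prime p.

    Hypothetical sketch of Algorithm 3.6; assumes p is an odd prime.
    """
    r %= p
    if r == 0:
        return 0
    if pow(r, (p - 1) // 2, p) != 1:
        raise ValueError("r is not a quadratic residue modulo p")
    # Find a quadratic nonresidue n by Euler's Criterion.
    n = 2
    while pow(n, (p - 1) // 2, p) != p - 1:
        n += 1
    # Write p - 1 = 2^e * f with f odd.
    e, f = 0, p - 1
    while f % 2 == 0:
        e += 1
        f //= 2
    R = pow(r, f, p)
    N = pow(n, f, p)
    j = 0
    # Assumption: i runs only to e - 1 so that 2^(e-i-1) is an integer.
    for i in range(1, e):
        if pow(R * pow(N, j, p) % p, 2 ** (e - i - 1), p) == p - 1:
            j += 2 ** i
    return pow(r, (f + 1) // 2, p) * pow(N, j // 2, p) % p
```

With the data of Example 3.7, `tonelli(50, 73)` finds $n = 5$, builds $j = 6$ in the two loop passes, and returns 59; the other root is $73 - 59 = 14$.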

Theorem 3.8. Let $r$ be a quadratic residue modulo an odd prime $p$.
Then for every $e \ge 1$, the congruence $x^2 \equiv r \pmod{p^e}$ has exactly
two solutions. If one solution is $r_e$, the other is $p^e - r_e$.

¹ The Extended Riemann Hypothesis is a conjecture about the zeros of certain
L-functions similar to the Riemann zeta function.

The theorem is proved by “lifting” a solution $r_e$ modulo $p^e$ to a
solution $r_{e+1}$ modulo $p^{e+1}$ by Hensel’s Lemma, a variation of Newton’s
method. We will illustrate this technique with $e = 1$. Suppose
$x^2 \equiv r \pmod p$. We wish to solve $y^2 \equiv r \pmod{p^2}$. Clearly
$y \equiv x \pmod p$. Write $y = x + tp$ for some integer $t$. Then
$r \equiv y^2 \equiv x^2 + 2xtp \pmod{p^2}$. This gives
$pt \equiv (2x)^{-1}(r - x^2) \pmod{p^2}$, which
is equivalent to $t \equiv (2x)^{-1}((r - x^2)/p) \pmod p$. Note that $(r - x^2)/p$
is an integer, $\gcd(2x, p) = 1$, and $(2x)^{-1}$ is the inverse of $2x$ modulo
$p$. This formula determines $t$ and therefore $y$.

Example 3.9. Find the square roots of $r = 37987$ modulo $p^2$, where
$p = 239$. We will need these values for Example 6.11.
First, note that $r \bmod p = 225 = 15^2$, so $x = \pm 15$ are the solutions
to $x^2 \equiv r \pmod p$. Take $x = 15$ and let $y = x + tp$ be the solution
to $y^2 \equiv r \pmod{p^2}$ with $y \equiv x \pmod p$.
From the discussion above, we have $t \equiv (2x)^{-1}((r - x^2)/p) \pmod p$.
We have $2x = 30$ and $30 \cdot 8 = 240 = p + 1$, so $(2x)^{-1} \equiv 8 \pmod p$.
Also, $(r - x^2)/p = (37987 - 15^2)/239 = 158$. Thus,

$$t = (2x)^{-1}((r - x^2)/p) = 8 \cdot 158 = 1264$$

and

$$y = x + tp = 15 + 1264 \cdot 239 = 302111 \equiv 16506 \pmod{p^2}$$

is one square root of $r$ modulo $p^2$. The other square root is

$$p^2 - y = 57121 - 16506 = 40615.$$
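The lifting step from $p$ to $p^2$ can be sketched as follows. The function name `lift_sqrt` is hypothetical, and the sketch assumes Python 3.8 or later, where `pow(a, -1, p)` returns a modular inverse. It reduces $t$ modulo $p$, so it reaches the same $y$ as Example 3.9 by a smaller intermediate value.

```python
def lift_sqrt(x, r, p):
    """Given x*x = r (mod p) for an odd prime p not dividing x,
    return y with y*y = r (mod p*p) and y = x (mod p).

    Hypothetical helper illustrating one Hensel lifting step.
    """
    inv = pow(2 * x, -1, p)               # (2x)^(-1) mod p, Python 3.8+
    t = inv * ((r - x * x) // p) % p      # (r - x^2)/p is an exact integer
    return (x + t * p) % (p * p)
```

For the numbers in Example 3.9, `lift_sqrt(15, 37987, 239)` gives $t = 69$ (which is $1264 \bmod 239$) and returns 16506.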

3.3. Cyclotomic Polynomials


A nonconstant polynomial $f(x)$ with rational coefficients is called
irreducible (over the rational numbers) if it is not the product of
two nonconstant polynomials with rational coefficients of lower degree
than $f(x)$. Irreducible polynomials are analogous to prime numbers.
One can prove that every nonconstant polynomial can be written as
the product of irreducible ones.
The polynomial $x^m - 1$ has $m$ zeros in the complex numbers,
namely, all the $m$-th roots of unity. If $m > 1$, then $x^m - 1$ can
be factored over the rational numbers. The irreducible factors (over
the rationals) of $x^m - 1$ are called the cyclotomic polynomials. For
each positive integer $m$, $x^m - 1$ has exactly one irreducible factor
$\Phi_m(x)$ which does not divide $x^k - 1$ for any $k < m$. The zeros of
$\Phi_m(x)$ are the $m$-th roots of unity that are not $k$-th roots of unity
for any $k < m$. These zeros are called the primitive $m$-th roots of
unity. See Weintraub [Wei13] for other definitions of the cyclotomic
polynomials and proofs that they are irreducible. The coefficients of
$\Phi_m(x)$ are integers. The degree of $\Phi_m(x)$ is $\phi(m)$.

Theorem 3.10. If $m$ is a positive integer, then

$$x^m - 1 = \prod_{d \mid m} \Phi_d(x) \quad\text{and}\quad \Phi_m(x) = \prod_{d \mid m} (x^{m/d} - 1)^{\mu(d)}.$$

Theorem 3.10 is Exercise 32 in Section 4.6.2 of Knuth [Knu81].
See Section 35 of van der Waerden [vdW70] for a complete proof.
The two formulas in Theorem 3.10 are easily seen to be equivalent by
the Möbius Inversion Formula, Theorem 2.53. From Theorem 3.10
and the definition of $\mu(d)$ we find that if $p$ is prime and $p \nmid m$, then

(3.2)    $\Phi_{mp}(x) = \Phi_m(x^p) / \Phi_m(x)$,

while if $p$ is prime and $p \mid m$, then

(3.3)    $\Phi_{mp}(x) = \Phi_m(x^p)$.

Also, if $p$ is prime, then

(3.4)    $\Phi_p(x) = \dfrac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + x + 1$.

It is easy to show from these formulas that $\Phi_1(0) = -1$, $\Phi_m(0) = 1$
if $m > 1$, $\Phi_m(1) = p$ if $m$ is a power of the prime $p$, and $\Phi_m(1) = 1$
if $m$ has at least two different prime factors.

Example 3.11. The first few cyclotomic polynomials are

$\Phi_1(x) = x - 1$,
$\Phi_2(x) = x + 1$,
$\Phi_3(x) = x^2 + x + 1$,
$\Phi_4(x) = x^2 + 1$,
$\Phi_5(x) = x^4 + x^3 + x^2 + x + 1$,
$\Phi_6(x) = x^2 - x + 1$,
$\Phi_7(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$,
$\Phi_8(x) = x^4 + 1$,
$\Phi_9(x) = x^6 + x^3 + 1$.
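The list above can be reproduced from the first formula of Theorem 3.10 by dividing $x^m - 1$ by $\Phi_d(x)$ for every proper divisor $d$ of $m$. The following Python sketch does this with integer coefficient lists (constant term first). The names `poly_divide` and `cyclotomic` are hypothetical, and the sketch is meant for small $m$ only.

```python
from functools import lru_cache

def poly_divide(num, den):
    """Exact quotient num/den of integer polynomials given as coefficient
    lists, constant term first. Assumes den is monic and divides num."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]      # leading coefficient of den is 1
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return quot

@lru_cache(maxsize=None)
def cyclotomic(m):
    """Coefficients of the m-th cyclotomic polynomial, constant term first."""
    poly = [-1] + [0] * (m - 1) + [1]        # x^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = poly_divide(poly, list(cyclotomic(d)))
    return tuple(poly)
```

For example, `cyclotomic(6)` returns `(1, -1, 1)`, that is, $x^2 - x + 1$, and `cyclotomic(9)` returns `(1, 0, 0, 1, 0, 0, 1)`, in agreement with the list.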

The Cunningham Project [BLS+02] factors $\Phi_m(b)$ for fixed $b$ in
$2 \le b \le 12$ and $m = 1, 2, 3, \ldots$. In contrast, Kida, Morimoto, Saito,
and Kobayashi [MK87], [MKS89], [MKK92] published tables of
factors of $\Phi_m(b)$ for fixed $m$ and $b = 2, 3, 4, \ldots$. They especially
sought prime values of the cyclotomic numbers $\Phi_m(b)$.

3.4. Divisibility Sequences and $b^m - 1$


Definition 3.12. A sequence of integers $\{c_n\}$ ($n = 1, 2, 3, \ldots$) is a
divisibility sequence if $c_m$ divides $c_n$ whenever $m$ divides $n$.

See Bliss et al. [BFLS13] for recent work on divisibility sequences.

Theorem 3.13. If $b$ is an integer $> 1$, then the sequence $\{c_n\}$ with
$c_n = b^n - 1$ is a divisibility sequence.

Proof. Let $m \mid n$. We must show that $(b^m - 1) \mid (b^n - 1)$. Let $x = b^m$
and $k = n/m$. Observe that
$$b^n - 1 = x^k - 1 = (x - 1)(x^{k-1} + \cdots + x + 1)$$
is divisible by $x - 1 = b^m - 1$. □
Example 3.14. Observe the divisibility in Table 1 of numbers $5^m - 1$
factored. (The stars (∗) will be explained shortly.) You can see that
$5^1 - 1 = 4$ divides $5^m - 1$ for every $m \ge 1$. Also, $5^2 - 1 = 24$ divides
$5^m - 1$ for every even $m \ge 2$, and $5^3 - 1 = 124$ divides $5^m - 1$ for
every $m \ge 3$ which is a multiple of 3. As shown by Theorem 3.13,
the sequence $c_m = 5^m - 1$ is a divisibility sequence.

Table 1. Factors of $5^m - 1$.

 m   Prime factors of 5^m − 1       m   Prime factors of 5^m − 1
 1   2.2                           11   (1) 12207031
 2   (1) 2∗.3                      12   (1, 2, 3, 4, 6) 601
 3   (1) 31                        13   (1) 305175781
 4   (1, 2) 2∗.13                  14   (1, 2, 7) 29.449
 5   (1) 11.71                     15   (1, 3, 5) 181.1741
 6   (1, 2, 3) 3∗.7                16   (1, 2, 4, 8) 2∗.17.11489
 7   (1) 19531                     17   (1) 409.466344409
 8   (1, 2, 4) 2∗.313              18   (1, 2, 3, 6, 9) 3∗.5167
 9   (1, 3) 19.829                 19   (1) 191.6271.3981071
10   (1, 2, 5) 521                 20   (1, 2, 4, 5, 10) 41.9161

It is not true that if both $d$ and $e$ divide $m$, then $(b^d - 1)(b^e - 1)$
divides $b^m - 1$. All we can say is that the least common multiple of
$(b^d - 1)$ and $(b^e - 1)$ must divide $b^m - 1$.
The cyclotomic factor $\Phi_m(b)$ is called the primitive part of $b^m - 1$.
The other part, $(b^m - 1)/\Phi_m(b)$, is called the algebraic part of $b^m - 1$.
A prime factor of $b^m - 1$ is called primitive if it does not divide
$b^k - 1$ for any $0 < k < m$. Otherwise, it is called algebraic. Every
algebraic factor must divide the algebraic part of $b^m - 1$. Likewise,
every primitive factor must divide the primitive part of $b^m - 1$. By
definition, a primitive factor of $b^m - 1$ cannot divide the algebraic
part. However, it is possible that an algebraic factor $q$ of $b^m - 1$ may
divide the primitive part. Such a prime factor $q$ is called intrinsic
and must divide $m$. A primitive part may have at most one intrinsic
factor. Therefore, $\gcd(\Phi_m(b), m)$ is either 1 or an intrinsic prime
factor of $b^m - 1$. Intrinsic factors in the Cunningham tables are marked
with a star (∗). The next two theorems describe the prime factors of
$\Phi_m(b)$. These theorems and proofs are from Nagell [Nag51] but the
results are due to Zsigmondy [Zsi92]. The first theorem describes
the primitive factors of $x^m - 1$. Recall that the notation $p^e \,\|\, m$ means
that $p^e \mid m$ but $p^{e+1} \nmid m$.

Theorem 3.15. Let $q$ be a prime that does not divide the positive
integer $m$. Then the congruence
(3.5)    $\Phi_m(x) \equiv 0 \pmod q$
is solvable if and only if $q \equiv 1 \pmod m$. If $q \equiv 1 \pmod m$, then
the solutions to (3.5) are exactly the $x$ satisfying $x^m \equiv 1 \pmod q$ but
not $x^k \equiv 1 \pmod q$ for any $0 < k < m$. If $x$ satisfies (3.5), then the
highest power of $q$ dividing $\Phi_m(x)$ is the same as the highest power of
$q$ dividing $x^m - 1$.

Proof. Since $\Phi_m(0) = \pm 1$, no solution $x$ of (3.5) can be a multiple
of $q$. If $q \mid \Phi_m(x)$, then $q \mid x^m - 1$ by the first equation in Theorem
3.10. Let $k$ be the smallest positive integer such that $q \mid x^k - 1$.
Then $k \mid m$. If $q \mid x^{m/d} - 1$, where $d \mid m$, then every prime dividing
$m/d$ must also divide $m/k$. Suppose $q^s \,\|\, x^k - 1$, so that $x^k = 1 + tq^s$
for some integer $t$. Raise this equation to the $i$-th power and get
$x^{ki} = 1 + itq^s + t_1 q^{2s} = 1 + t_2 q^s$ for integers $t_1$ and $t_2$. Then $q \mid t_2$ if
and only if $q \mid i$. Therefore, the highest power of $q$ dividing $x^{ik} - 1$ is
the same as the highest power of $q$ dividing $x^k - 1$. In $x^{m/d} - 1$ we
have $m/d = ik$, where $q \nmid i$, since $q \nmid m$. By Theorem 3.10,

$$\Phi_{m/k}(x) = \prod_{d \mid (m/k)} (x^{m/(kd)} - 1)^{\mu(d)}.$$

It follows from the above that $q^s \,\|\, x^{m/(kd)} - 1$ for every $d$ dividing
$m/k$. Therefore, the power of $q$ dividing $\Phi_{m/k}(x)$ is

$$s \sum_{d \mid (m/k)} \mu(d) = \begin{cases} s & \text{if } m = k, \\ 0 & \text{if } m > k \end{cases}$$

by Theorem 2.51. Therefore, if $m > k$, then $q \nmid \Phi_m(x)$. If $m = k$,
then $q$ divides $\Phi_m(x)$ to the same power as it divides $x^m - 1$. □

Example 3.16. In Table 1, we see that $q = 19$ divides $\Phi_9(5)$ (but
$q \nmid 9$) and so $19 \equiv 1 \pmod 9$. The highest power of 19 that divides
$\Phi_9(5)$ is the same (the first power of 19) as the highest power of 19
that divides $5^9 - 1$.
Likewise, $q = 2$ divides $\Phi_1(5)$ (but $q \nmid 1$) and so $2 \equiv 1 \pmod 1$.
The highest power of 2 that divides $\Phi_1(5)$ is the same (the second
power of 2) as the highest power of 2 that divides $5^1 - 1 = 4$.

The next theorem tells when intrinsic factors occur. Many parts
of this theorem go back at least 150 years to Sylvester.

Theorem 3.17. Suppose $q$ is a prime factor of $m$ and write
$m = q^a m_1$ with $q \nmid m_1$. If $q$ divides $\Phi_m(x)$, then $q \equiv 1 \pmod{m_1}$ and $q$
divides $\Phi_{m_1}(x)$. If $q \equiv 1 \pmod{m_1}$, then $q$ divides $\Phi_m(x)$ whenever
$m_1$ is the smallest positive integer $n$ for which $x^n \equiv 1 \pmod q$. The
number of different $x$ modulo $q$ for which $q$ divides $\Phi_m(x)$ is $\phi(m_1)$.
If $q$ divides $\Phi_m(x)$ and $m > 2$, then $q^2$ does not divide $\Phi_m(x)$.

Proof. Suppose first that $m$ is not a power of 2. Note that $m_1 > 2$
when $q = 2$. From equations (3.2) and (3.3), we have

(3.6)    $\Phi_m(x) = \Phi_{m_1}(x^{q^a}) / \Phi_{m_1}(x^{q^{a-1}})$.

If $q \mid \Phi_m(x)$, then

(3.7)    $q \mid \Phi_{m_1}(x^{q^a})$.

By Theorem 3.15, we see that $m_1$ is the least $k > 0$ such that
$(x^{q^a})^k \equiv 1 \pmod q$. Therefore, $m_1$ is the smallest positive integer $k$ with
$x^k \equiv 1 \pmod q$ (since $x^q \equiv x \pmod q$). For $q = 2$ this would imply
that $m_1 = 1$. Since we are supposing here that $m$ is not a power of
2, $q$ must be odd.
If $x = 1$, then $m_1 = 1$ and $\Phi_m(1) = q$ by the remark after (3.4).
If $x = -1$, then $m_1 = 2$ and $\Phi_m(-1) = \Phi_{m/2}(1) = q$ by the same
remark.
Suppose next that $x \ne \pm 1$. If $m_1$ is the least $k > 0$ such that
$x^k \equiv 1 \pmod q$, then (3.7) holds and $q$ divides $\Phi_{m_1}(x^{q^a})$ to the same
power as it divides $(x^{q^a})^{m_1} - 1 = x^m - 1$. Suppose $q^s \,\|\, (x^{m/q} - 1)$, so
$x^{m/q} = 1 + tq^s$, where $q \nmid t$. Raise this equation to the $q$-th power
and get $x^m = 1 + qtq^s + t_1 q^{2s} = 1 + t_2 q^{s+1}$ for integers $t_1$ and $t_2$.
Note that $q \nmid t_2$ because $q > 2$. Therefore, if $q^s \,\|\, \Phi_{m_1}(x^{q^{a-1}})$, then
$q^{s+1} \,\|\, \Phi_{m_1}(x^{q^a})$. Then equation (3.6) shows that $\Phi_m(x)$ is divisible
by $q$ but not by $q^2$.

Finally, if $m$ is a power of 2, then $m \ge 4$ because $m > 2$. In this
case, $\Phi_m(x) = x^{2j} + 1$, where $j = m/4$ is an integer, so $\Phi_m(x)$ is not
divisible by 4. □

Example 3.18. The prime $q = 7$ divides $m = 147$ and $\Phi_m(2)$, so 7
is an intrinsic factor of $2^{147} - 1$. Now $147 = 7^2 \cdot 3$ and $7 \equiv 1 \pmod 3$
and 3 is the smallest positive integer $n$ for which $2^n \equiv 1 \pmod 7$.
Also, 3 is the smallest positive integer $n$ for which $11^n \equiv 1 \pmod 7$,
so 7 divides $\Phi_m(11)$. Only one 7 divides $\Phi_m(2)$ and $\Phi_m(11)$. Finally,
3 is not the smallest positive integer $n$ for which $5^n \equiv 1 \pmod 7$, so 7
does not divide $\Phi_m(5)$. Instead, 6 is the smallest positive integer $n$ for
which $5^n \equiv 1 \pmod 7$, so 7 divides $\Phi_{2m}(5)$, that is, 7 is an intrinsic
factor of $5^{147} + 1$. (See Section 3.5 for the definition of intrinsic factor
of $b^m + 1$.)
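The claims in Example 3.18 can be checked numerically. The sketch below evaluates $\Phi_m(b)$ from the second formula of Theorem 3.10 and uses the fact stated earlier that $\gcd(\Phi_m(b), m)$ is either 1 or the intrinsic prime factor. The names `mobius` and `phi_value` are hypothetical, and $b \ge 2$ is assumed so that no factor $b^{m/d} - 1$ is zero.

```python
from math import gcd

def mobius(n):
    """Moebius function mu(n) by trial division (fine for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def phi_value(m, b):
    """Value of the m-th cyclotomic polynomial at the integer b >= 2,
    computed as the product of (b^(m/d) - 1)^mu(d) over d dividing m."""
    num = den = 1
    for d in range(1, m + 1):
        if m % d == 0:
            mu = mobius(d)
            if mu == 1:
                num *= b ** (m // d) - 1
            elif mu == -1:
                den *= b ** (m // d) - 1
    return num // den             # the quotient is an exact integer

# Intrinsic factors from Example 3.18 (1 means there is none).
print(gcd(phi_value(147, 2), 147))    # 7
print(gcd(phi_value(147, 11), 147))   # 7
print(gcd(phi_value(147, 5), 147))    # 1
print(gcd(phi_value(294, 5), 294))    # 7
```

The same function reproduces entries of Table 1; for instance `phi_value(9, 5)` is $15751 = 19 \cdot 829$.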

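Theorem 3.19 (Bang). If $b \ge 2$ and $m \ge 2$ are integers, then $b^m - 1$
has a primitive prime factor, except for the case $b = 2$, $m = 6$: $2^6 - 1$,
and the cases $m = 2$ with $b + 1$ a power of 2 (for example, $3^2 - 1 = 8$,
whose only prime factor 2 already divides $3^1 - 1$).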

Proof. By Theorem 3.10, we have

$$b^m - 1 = \prod_{d \mid m} \Phi_d(b) \quad\text{and}\quad \Phi_m(b) = \prod_{d \mid m} (b^{m/d} - 1)^{\mu(d)}.$$

By Theorem 3.17, the primitive part $\Phi_m(b)$ of $b^m - 1$ has at most
one intrinsic prime factor. Suppose first that there is an intrinsic
prime factor $q$ of $\Phi_m(b)$. By Theorem 3.17, $q$ divides $\Phi_m(b)$ only
once. Therefore, $P_m(b) = \Phi_m(b)/q$ is the product of the primitive
prime factors of $b^m - 1$, if any. We will show that $b^m - 1$ has at least
one primitive prime factor by proving that $P_m(b) > 1$.
By Theorem 3.17 again, $q \mid m$, say, $m = q^a m_1$, where $q \nmid m_1$.
Then $q \equiv 1 \pmod{m_1}$ and $q \mid \Phi_{m_1}(b)$. Since $\Phi_{m_1}(b) \ne 0$, we have
$\Phi_{m_1}(b) \ge q$ and therefore $P_m(b) \ge \Phi_m(b)/\Phi_{m_1}(b)$. We will prove a
lower bound for $\Phi_m(b)$ and an upper bound for $\Phi_{m_1}(b)$ to get a lower
bound for $P_m(b)$.
Now also suppose that $m_1 > 1$. Write

$$\Phi_m(b) = \prod_{d \mid m} (b^{m/d} - 1)^{\mu(d)} = \frac{\prod_{d \mid m;\ \mu(d) = +1} (b^{m/d} - 1)}{\prod_{d \mid m;\ \mu(d) = -1} (b^{m/d} - 1)}.$$

We have $b^k - 1 > b^{k-1}$ since $b \ge 2$. Apply this inequality in each
factor in the numerator and apply the inequality $b^k - 1 < b^k$ in each
factor in the denominator. This gives the lower bound

$$\Phi_m(b) > \frac{\prod_{d \mid m;\ \mu(d) = +1} b^{m/d - 1}}{\prod_{d \mid m;\ \mu(d) = -1} b^{m/d}} = \prod_{d \mid m} b^{\mu(d) m/d} \prod_{d \mid m;\ \mu(d) = +1} b^{-1} = b^{X - Y},$$

where $X = \sum_{d \mid m} \mu(d) m/d = \phi(m)$ by Theorem 2.50 and
$Y = \sum_{d \mid m;\ \mu(d) = +1} 1 = 2^{\omega(m) - 1}$. The result is

(3.8)    $\Phi_m(b) > b^{\phi(m) - 2^{\omega(m) - 1}}$.

Now use $m_1 > 1$ and make the same estimate for

$$\Phi_{m_1}(b) = \frac{\prod_{d \mid m_1;\ \mu(d) = +1} (b^{m_1/d} - 1)}{\prod_{d \mid m_1;\ \mu(d) = -1} (b^{m_1/d} - 1)},$$

but now use $b^k - 1 > b^{k-1}$ for the denominator and $b^k - 1 < b^k$ for
the numerator. The result is

$$\Phi_{m_1}(b) < b^{\phi(m_1) + 2^{\omega(m_1) - 1}}.$$

Note that $\omega(m_1) = \omega(m) - 1$ because $m = q^a m_1$. Combining the two
estimates gives

$$P_m(b) \ge \Phi_m(b)/\Phi_{m_1}(b) > b^{\phi(m) - \phi(m_1) - 3 \cdot 2^{\omega(m) - 2}}.$$

It suffices to prove that the exponent $\phi(m) - \phi(m_1) - 3 \cdot 2^{\omega(m) - 2} \ge 0$.


By Theorem 2.54 we have $\phi(m_1) \ge 2^{\omega(m_1) - 1} = 2^{\omega(m) - 2}$. Note
that $\phi(m) = \phi(q^a)\phi(m_1)$ because $m = q^a m_1$. We can’t have $q = 2$
here since $q \equiv 1 \pmod{m_1}$ and we are still assuming that $m_1 > 1$.
Now $\phi(q^a) = q^{a-1}(q - 1) \ge 4$ except when $q = 3$ and $a = 1$. Rewrite
this as $\phi(q^a) - 1 \ge 3$. Multiplying this inequality times the one for
$\phi(m_1)$ yields
$$\phi(m_1)(\phi(q^a) - 1) \ge 3 \cdot 2^{\omega(m) - 2},$$
or $\phi(m) - \phi(m_1) - 3 \cdot 2^{\omega(m) - 2} \ge 0$ except when $q = 3$, $a = 1$, and
$m_1 = 2$, that is, $m = 6$. Now $\Phi_6(b) = b^2 - b + 1 > 3$ unless $b = 2$, so
$P_6(b) = \Phi_6(b)/3 > 1$ unless $b = 2$. Note that $2^6 - 1 = 3^2 \cdot 7$ has no
primitive prime factor.
Next, when $m_1 = 1$, we have $m = q^a$ and so, with $z = b^{q^{a-1}}$,
$$\Phi_{q^a}(b) = \frac{b^{q^a} - 1}{b^{q^{a-1}} - 1} = \frac{z^q - 1}{z - 1} = z^{q-1} + \cdots + z + 1 > q,$$
so $P_{q^a}(b) = \Phi_{q^a}(b)/q > 1$.
Finally, if there is no intrinsic factor, we have
$$\Phi_m(b) > b^{\phi(m) - 2^{\omega(m) - 1}} \ge 1$$
by Theorem 2.54 and formula (3.8), whose proof above did not depend
on whether $\Phi_m(b)$ has an intrinsic factor. □

The reader may wonder what happens when $m = 1$. Then the
primitive part of $b^1 - 1$ is just $b - 1$. Any prime factor of it is primitive,
so $b^1 - 1$ has a primitive prime factor except in the case of $b = 2$.
Corollary 3.20. For any positive integer $m$ there are infinitely many
primes $p \equiv 1 \pmod m$.

Proof. Let $P_m(b) = \Phi_m(b)/\gcd(\Phi_m(b), m)$ denote $\Phi_m(b)$ with any
intrinsic factor removed. The numbers in the infinite sequence
$$P_m(3),\ P_{2m}(3),\ P_{3m}(3),\ P_{4m}(3),\ \ldots,\ P_{km}(3),\ \ldots$$
are all relatively prime and, by Theorem 3.19, each contains at least
one prime divisor $\equiv 1 \pmod m$. □

The proof of the corollary could have used any base $b > 1$; we
chose base 3 to avoid $2^6 - 1$.
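Theorem 3.19 is easy to test by machine for small $b$ and $m$. The sketch below strips from $b^m - 1$ every prime that also divides some $b^k - 1$ with $0 < k < m$, using only gcds, so what remains is built from primitive prime factors. The name `primitive_cofactor` is hypothetical. The loop at the end skips $b = 2$, $m = 6$ and also the cases $m = 2$ with $b + 1$ a power of 2 (such as $3^2 - 1 = 8$), which are the classical exceptions in Zsigmondy’s theorem.

```python
from math import gcd

def primitive_cofactor(b, m):
    """Part of b^m - 1 left after removing every prime that divides
    b^k - 1 for some 0 < k < m (hypothetical helper, small inputs only)."""
    n = b ** m - 1
    for k in range(1, m):
        g = gcd(n, b ** k - 1)
        while g > 1:              # remove shared primes with multiplicity
            n //= g
            g = gcd(n, b ** k - 1)
    return n

# b^m - 1 has a primitive prime factor exactly when the cofactor exceeds 1.
for b in range(2, 7):
    for m in range(2, 25):
        is_exception = (b, m) == (2, 6) or (m == 2 and ((b + 1) & b) == 0)
        assert (primitive_cofactor(b, m) > 1) == (not is_exception)
```

For example, `primitive_cofactor(5, 6)` is 7 and `primitive_cofactor(5, 9)` is $15751 = 19 \cdot 829$, matching the primitive factors listed in Table 1.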

3.5. Factors of $b^m + 1$
Note that $\{b^m + 1\}$, with fixed $b$, is not a divisibility sequence. For
example, with $b = 5$ we have $1 \mid 2$, but $6 = 5^1 + 1 \nmid 5^2 + 1 = 26$. The
next theorem describes the divisibility properties of $b^m + 1$.
Theorem 3.21. If $b$ is an integer $> 1$ and $m \mid n$ and $n/m$ is odd,
then $(b^m + 1) \mid (b^n + 1)$.

Proof. Let $x = b^m$ and $k = n/m$. Observe that since $k$ is odd,
$$b^n + 1 = x^k + 1 = (x + 1)(x^{k-1} - x^{k-2} + \cdots - x + 1)$$
is divisible by $x + 1 = b^m + 1$. □

Example 3.22. Look at the factors of $10^n + 1$ in Table 1 in Chapter
1. Note that $10^1 + 1 = 11$ divides $10^n + 1$ for every odd $n$, but not
for any even $n$. Note also that $10^2 + 1$ divides $10^6 + 1$ because both 2
and 6 are divisible by 2 just once, but $10^2 + 1$ does not divide $10^4 + 1$
or $10^8 + 1$ because 4 and 8 have more factors of 2 than 2 has.

The sequence $b^m + 1$ has primitive parts ($\Phi_{2m}(b)$) and algebraic
parts ($(b^m + 1)/\Phi_{2m}(b)$), just as for $b^m - 1$. Primitive, algebraic, and
intrinsic prime factors of $b^m + 1$ are defined just as for $b^m - 1$. Thus,
a prime factor of $b^m + 1$ is primitive if it does not divide $b^k + 1$ for
any $0 < k < m$. Theorems 3.15 and 3.17, with $m$ replaced by $2m$,
describe the primitive and intrinsic factors of $x^m + 1$ because of the
following theorem.

Theorem 3.23. Let $b \ge 2$ and $m \ge 2$ be integers and let $p$ be an odd
prime. Then $p$ is a primitive factor of $b^m + 1$ if and only if it is a
primitive factor of $b^{2m} - 1$.

Proof. Suppose $p$ is a primitive factor of $b^m + 1$. Clearly $p \mid (b^{2m} - 1)$
since $(b^m + 1) \mid (b^{2m} - 1)$. Suppose $p \mid (b^k - 1)$ for some $0 < k < 2m$.
Then $p \mid (b^m + 1 + b^k - 1)$, so $p \mid (b^m + b^k)$. Of course, $p \nmid b$. If $k < m$,
then $p \mid b^k(b^{m-k} + 1)$, so $p \mid (b^{m-k} + 1)$ and $0 < m - k < m$, so $p$ is
not a primitive factor of $b^m + 1$. If $k > m$, then $p \mid b^m(b^{k-m} + 1)$, so
$p \mid (b^{k-m} + 1)$ and $0 < k - m < m$, so $p$ is not a primitive factor of
$b^m + 1$. Finally, if $k = m$, then $p \mid 2b^m$, which is impossible because
$p$ is odd and $p \nmid b$.
Now suppose $p$ is a primitive factor of $b^{2m} - 1 = (b^m - 1)(b^m + 1)$.
Since $p$ does not divide $b^m - 1$, it must divide $b^m + 1$. If $p$ divided $b^k + 1$
for some $0 < k < m$, then it would divide $b^{2k} - 1$. But $0 < 2k < 2m$,
so it would not be a primitive factor of $b^{2m} - 1$. □

3.6. Factors of Fibonacci and Lucas Numbers


The Fibonacci numbers $\{u_n\}$ are defined by $u_0 = 0$, $u_1 = 1$, and
$u_{n+1} = u_n + u_{n-1}$ for $n \ge 1$. Table 2 shows the prime factorizations
of the first 30 Fibonacci numbers. Note that for integers $m$, $n$ in
the range of Table 2, $u_m \mid u_n$ whenever $m \mid n$. In fact, the Fibonacci
numbers form a divisibility sequence. See page 85 of Williams [Wil98]
for a proof.

Table 2. Fibonacci numbers factored.

 n   u_n           n   u_n                 n   u_n
 0   0            10   5.11               20   3.5.11.41
 1   1            11   89                 21   2.13.421
 2   1            12   2.2.2.2.3.3        22   89.199
 3   2            13   233                23   28657
 4   3            14   13.29              24   2.2.2.2.2.3.3.7.23
 5   5            15   2.5.61             25   5.5.3001
 6   2.2.2        16   3.7.47             26   233.521
 7   13           17   1597               27   2.17.53.109
 8   3.7          18   2.2.2.17.19        28   3.13.29.281
 9   2.17         19   37.113             29   514229
The Lucas numbers $\{v_n\}$ are defined by $v_0 = 2$, $v_1 = 1$, and
$v_{n+1} = v_n + v_{n-1}$ for $n \ge 1$. One can show that if $\alpha$ and $\beta$ are
the two roots of $x^2 - x - 1 = 0$, then $u_n = (\alpha^n - \beta^n)/(\alpha - \beta)$ and
$v_n = \alpha^n + \beta^n$ for all $n \ge 0$. These formulas show that the Lucas
numbers are related to the Fibonacci numbers in the same way that
the sequence $b^n + 1$ is related to $b^n - 1$. For example, the formulas
imply that $u_{2n} = u_n v_n$, analogous to $b^{2n} - 1 = (b^n - 1)(b^n + 1)$.
There is an analogue of Theorem 3.21 which says that if $m \mid n$ and
$n/m$ is odd, then $v_m \mid v_n$. See page 85 of Williams [Wil98] for a
proof. Primitive, algebraic, and intrinsic prime factors of $u_m$ and $v_m$
may be defined just as for $b^m - 1$ and $b^m + 1$.
The sequences mentioned above are interesting also because they
satisfy second-order linear recurrence relations. When $c_n$ equals either
$b^n - 1$ or $b^n + 1$, the recurrence relation is $c_{n+1} = (b + 1)c_n - bc_{n-1}$.
Linear recurrence relations arise in many problems in mathematics,
computer science, and other sciences. For example, the sequence
$2^n - 1$ is the solution to the Tower of Hanoi problem and the Fibonacci
numbers arise in the growth rate of a rabbit population.
Later we will need to compute $u_n \bmod m$ when $n$ and $m$ are
large integers. It is easy to do this with Fast Exponentiation of
$2 \times 2$ matrices. Define

$$A_0 = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}, \quad L = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}, \quad\text{and}\quad A_n = \begin{pmatrix} u_{n+1} & v_{n+1} \\ u_n & v_n \end{pmatrix}$$

for $n \ge 0$. A simple induction shows that $A_n = L^n A_0$ for $n \ge 0$,
where $L^0$ means the identity matrix. This formula
provides a quick way to compute $u_n$ and $v_n$ when $n$ is huge. The Fast
Exponentiation Algorithm applies to anything one can multiply
associatively, including matrices. To compute $u_n \bmod m$ or $v_n \bmod m$,
reduce each matrix entry modulo $m$ as it is computed. This keeps
the numbers small.

Algorithm 3.24. Fast Exponentiation Algorithm for Matrices.

Input: Integers $m > 0$ and $e \ge 0$, and a square matrix $L$.
$Y \leftarrow I$, the identity matrix of the same size as $L$.
$Z \leftarrow L$
while ($e > 0$) {
    if ($e$ is odd) $Y \leftarrow (Y \cdot Z) \bmod m$
    $Z \leftarrow (Z \cdot Z) \bmod m$
    $e \leftarrow \lfloor e/2 \rfloor$
}
Output: $L^e \bmod m$ = the final value of the matrix $Y$.

Example 3.25. Compute $L^{10} \bmod m$, where $L$ is a square matrix
and $m > 1$ is an integer.
We have $e = 10$. Let $Y = I$ and $Z = L$. In the while loop, $e$
is even at first, so $Y$ remains $I$ and $Z$ becomes $L^2 \bmod m$. Then $e$
is halved to 5. Since $e = 5 > 0$, the loop continues. This $e = 5$ is
odd, so $Y$ becomes $IZ = L^2 \bmod m$. Then $Z$ is squared again and
becomes $L^4 \bmod m$. After that, $e$ is halved (rounding down) to 2. Since $e = 2 > 0$,
the loop continues. Now $e = 2$ is even, so $Y$ remains $L^2 \bmod m$, $Z$ is
squared to $L^8 \bmod m$, and $e$ is halved to 1. Since $e = 1 > 0$, the loop
continues. This time $e = 1$ is odd, so $Y$ becomes $L^2 L^8 = L^{10} \bmod m$.
Then $Z$ is squared, $e$ becomes 0, and the while loop ends. The output
is $Y = L^{10} \bmod m$.
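Fast exponentiation of matrices, applied to the matrices $L$ and $A_0$ defined above, can be sketched in Python as follows. The function names `mat_mul`, `mat_pow`, and `fib_lucas_mod` are hypothetical, and matrices are stored as plain lists of rows.

```python
def mat_mul(A, B, m):
    """Product of matrices A and B with every entry reduced modulo m."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % m
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_pow(L, e, m):
    """L^e mod m by the binary method (L^0 is the identity matrix)."""
    size = len(L)
    Y = [[int(i == j) % m for j in range(size)] for i in range(size)]
    Z = [row[:] for row in L]
    while e > 0:
        if e % 2 == 1:
            Y = mat_mul(Y, Z, m)
        Z = mat_mul(Z, Z, m)
        e //= 2                   # halve e, rounding down
    return Y

def fib_lucas_mod(n, m):
    """Return (u_n mod m, v_n mod m) from the bottom row of A_n = L^n A_0."""
    A0 = [[1, 1], [0, 2]]
    L = [[1, 1], [1, 0]]
    An = mat_mul(mat_pow(L, n, m), A0, m)
    return An[1][0], An[1][1]
```

For instance, `fib_lucas_mod(10, 10**9)` returns `(55, 123)`, which agrees with $u_{10} = 5 \cdot 11$ in Table 2 and with $v_{10} = 123$.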

Brillhart, Montgomery, and Silverman [BMS88] published tables


of factors of Fibonacci numbers un for 1 < n < 1000 and Lucas
numbers vn for 1 < n ≤ 500.

3.7. Primality Testing


Although this book is about factoring integers, we must say something
about testing whether a number is prime. We should not try to factor
a number unless we are fairly certain it is not prime. If we want the
complete factorization of a composite number N and we factor it as
N = i · j, we need to know whether i and j are prime or composite
so that we can tell whether our work is finished. Some methods of
proving that an integer is prime, like Theorem 3.27, require factoring
a different integer.
A century ago, primality testing was as hard as factoring. Math-
ematicians would try for a while to factor a large number and if they
could not factor it, they might conjecture that it is prime. Later,
some other mathematician might factor it or prove its primality.
Rigorous tests were developed to decide the primality of special
numbers, such as Mersenne numbers $M_p = 2^p - 1$ and Fermat numbers
$F_k = 2^{2^k} + 1$. About one hundred years ago, mathematicians invented
fast tests that reported whether an integer is certainly composite or
“probably prime.” At about the same time, theorems were proved
that gave a rigorous proof that a probable prime number is prime,
but some hard factoring might be needed to discover the proof. These
tests were refined during the past century until thirty years ago people
invented a very fast test for primeness that has never been proved
correct but that has never failed either. The theorems were also
improved to the point that twenty years ago one could prove with
modest effort that almost every 1000-digit probable prime really is
prime. Some of these primality tests are probabilistic², meaning that:
(a) the algorithm chooses random numbers as it runs, (b) most choices
for these numbers will lead to a short running time, (c) some random
choices could make the program run for a long time, and (d) the
program always gives a correct answer when it finishes. In 2002, a
deterministic general polynomial-time algorithm for primality testing
was invented. Although it runs in deterministic polynomial time and
always gives the correct answer, this primality test is still slower for
large numbers than some probabilistic primality tests.

² Technically, a Las Vegas probabilistic algorithm.

The remainder of this section presents highlights of the development
of primality testing during the past century. Pomerance
[Pom10] describes a common theme of most of the prime proving
methods below.
We know that if m is prime, then φ(m) = m − 1. The converse
holds.

Theorem 3.26. If φ(m) = m − 1, then m is prime.

Proof. By definition, $\phi(m)$ is the number of integers $i$ in $1 \le i \le m$
with $\gcd(i, m) = 1$. If $m$ is composite, then it has a divisor $i$ in
$1 < i < m$ and $\gcd(m, i) = i > 1$ for this $i$. Therefore,
$\phi(m) \le m - 2 < m - 1$. □

Lehmer [Leh27] proved the following theorem in 1927.

Theorem 3.27 (Lehmer). Let $m$ be a positive integer. Then $m$
is prime if and only if there exists an integer $a$ such that
$a^{m-1} \equiv 1 \pmod m$ and $a^{(m-1)/q} \not\equiv 1 \pmod m$ for every prime $q$ that divides
$m - 1$.

Proof. Suppose $m$ is prime. Let $a$ be a primitive root for $m$. Then
$a^{m-1} \equiv 1 \pmod m$ and $a^{(m-1)/q} \not\equiv 1 \pmod m$ for every prime $q$ that
divides $m - 1$.
Now assume there exists an integer $a$ such that $a^{m-1} \equiv 1 \pmod m$
and $a^{(m-1)/q} \not\equiv 1 \pmod m$ for every prime $q$ that divides $m - 1$. Let
$e$ be the smallest positive integer for which $a^e \equiv 1 \pmod m$. The fact
that $a^{m-1} \equiv 1 \pmod m$ shows that $e \mid (m - 1)$. If $e < m - 1$, then
there must be a prime $q$ that divides $(m - 1)/e$, say, $(m - 1)/e = qk$.
For this $q$ we have $a^{(m-1)/q} \equiv a^{ek} \equiv 1 \pmod m$, a contradiction.
Hence, $e = m - 1$. By Euler’s Theorem, $a^{\phi(m)} \equiv 1 \pmod m$, so
$e \le \phi(m)$. Therefore, $\phi(m) = m - 1$ and $m$ is prime by Theorem
3.26. □

Corollary 3.28. If $m$ is proved prime using Theorem 3.27, then the
number $a$ is a primitive root modulo $m$.

Proof. The proof of Theorem 3.27 shows that $m - 1$ is the smallest
positive integer $e$ for which $a^e \equiv 1 \pmod m$. □

Theorem 3.27 is useful for proving that m is prime only when we
can factor m − 1 completely. It turns out that this is the most serious
obstacle to its use. There are φ(m − 1) primitive roots, each of which
will work for the number a in the theorem. If we choose random a in
2 ≤ a ≤ m − 2, we will almost certainly find a primitive root quickly.
Selfridge [BS67] proved in 1967 that Theorem 3.27 still holds even if
a different a is used for each prime factor q of m − 1. But if it is used
this way, then no a is proved to be a primitive root for m.
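The two conditions of Theorem 3.27 are cheap to check once the prime factors of $m - 1$ are known. A minimal Python sketch follows; the name `lehmer_witness` is hypothetical, and the caller is assumed to supply the complete list of primes dividing $m - 1$.

```python
def lehmer_witness(m, prime_factors, a):
    """Return True if a satisfies both conditions of Theorem 3.27 for m.

    prime_factors must list every prime q dividing m - 1 (an assumption
    the function does not verify).
    """
    if pow(a, m - 1, m) != 1:
        return False
    return all(pow(a, (m - 1) // q, m) != 1 for q in prime_factors)

# 73 - 1 = 2^3 * 3^2, and a = 5 proves that 73 is prime.
print(lehmer_witness(73, [2, 3], 5))        # True
# a = 2 is not a primitive root modulo 73, so it proves nothing.
print(lehmer_witness(73, [2, 3], 2))        # False
# 341 = 11 * 31 passes a^(m-1) = 1 for a = 2 but fails the second condition.
print(lehmer_witness(341, [2, 5, 17], 2))   # False
```

A return value of `False` for one $a$ does not show that $m$ is composite; it only means that this $a$ is not a witness.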
Pratt [Pra75] used Theorem 3.27 to show that every prime num-
ber m has a succinct certificate of its primeness. The certificate lists
the prime factors p of m − 1 and gives a primitive root a for m that
makes the conclusion true. Then, for each prime factor p > 2 of
m − 1, the certificate tells the prime factors q of p − 1 and primitive
roots a for each p that makes the conclusion true in a proof that p
is prime. The certificate continues in this recursive fashion until the
primes are small enough to be obviously prime. Extensive factoring
may be required to discover a certificate, but one can use the certifi-
cate to make a quick verification of primeness. Pratt’s result shows
that primality is in class NP because the length of the certificate for
the primeness of m is a polynomial in log m.
The following theorem of Pocklington [Poc16] is useful when only
a partial factorization of $m - 1$ is known: $F$ is the factored part of $m - 1$
and $R$ is the unfactored composite remainder. In case $F \ge \sqrt{m}$, we
can prove that $m$ is prime. Even when $F < \sqrt{m}$, the theorem limits
the possible trial divisors of $m$ to numbers $iF + 1$ for $i = 1, 2, \ldots$.

Theorem 3.29 (Pocklington). Let $m$ be an odd positive integer.
Suppose $m - 1 = FR$, where $\gcd(F, R) = 1$ and the complete
factorization of $F$ is known. Suppose that there is an integer $a$ such
that for every prime factor $p$ of $F$, we have $a^{m-1} \equiv 1 \pmod m$
and $\gcd(a^{(m-1)/p} - 1, m) = 1$. Then every prime factor of $m$ is
$\equiv 1 \pmod F$. If also $F \ge \sqrt{m}$, then $m$ is prime.


Proof. Let $F = \prod p_i^{e_i}$ be the factorization of $F$. Let $q$ be a prime
factor of $m$. Let $f_i$ be the smallest positive integer for which
$a^{f_i} \equiv 1 \pmod q$. Then $f_i$ divides $q - 1$. Since $a^{m-1} \equiv 1 \pmod m$, we also
have that $f_i$ divides $m - 1$. But $\gcd(a^{(m-1)/p_i} - 1, q) = 1$, so $f_i$ does
not divide $(m - 1)/p_i$. Therefore, $p_i^{e_i}$ divides $q - 1$ for each $i$, and so
$F$ must divide $q - 1$, that is, $q \equiv 1 \pmod F$. This shows that every
prime factor of $m$ is $> F$. If also $F \ge \sqrt{m}$, then every prime factor
of $m$ is $> \sqrt{m}$, so $m$ must be prime. □
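The hypotheses of Theorem 3.29 translate directly into code. In the sketch below, the name `pocklington` and the dictionary format for the factored part $F$ are assumptions made for illustration. The function returns `True` only when the theorem proves $m$ prime, that is, when all conditions hold and $F \ge \sqrt{m}$ (tested as $F^2 \ge m$ to stay in integers).

```python
from math import gcd

def pocklington(m, F_factors, a):
    """Try to prove the odd integer m prime with Theorem 3.29.

    F_factors maps each prime p dividing F to its exponent in F
    (hypothetical input format). Returns True only if m is proved prime.
    """
    F = 1
    for p, e in F_factors.items():
        F *= p ** e
    if (m - 1) % F != 0 or gcd(F, (m - 1) // F) != 1:
        raise ValueError("need m - 1 = F*R with gcd(F, R) = 1")
    if pow(a, m - 1, m) != 1:
        return False
    for p in F_factors:
        if gcd(pow(a, (m - 1) // p, m) - 1, m) != 1:
            return False
    return F * F >= m

# 73 - 1 = 9 * 8 with F = 3^2 >= sqrt(73), and a = 5 works.
print(pocklington(73, {3: 2}, 5))            # True
# 341 = 11 * 31: the gcd condition fails for p = 17.
print(pocklington(341, {5: 1, 17: 1}, 2))    # False
```

When the conditions hold but $F < \sqrt{m}$, this sketch also returns `False`, although the theorem still restricts the prime factors of $m$ to the form $iF + 1$.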

Wunderlich [Wun83] used Pocklington’s Theorem (and similar
results) to prove recursively that certain numbers are prime. (The
similar results [BLS75] allow partial factorizations of $m - 1$ and $m + 1$
to be combined in a proof that $m$ is prime.) To show that $m = m_0$ is
prime this way, construct integers $f_i$, $m_i$ for $i = 1, \ldots, k$ with
$m_i - 1 = f_{i+1} m_{i+1}$, $f_{i+1}$ completely factored into small primes,
$f_{i+1} \ge \sqrt{m_i}$,
$2^{m_{i+1} - 1} \equiv 1 \pmod{m_{i+1}}$ (so that $m_{i+1}$ is probably prime) for $i = 0,
\ldots, k - 1$, and $m_k$ proved prime, perhaps because it is small enough.
Then apply Pocklington’s Theorem $k$ times with $m = m_i$, $F = f_{i+1}$,
$R = m_{i+1}$. The theorem proves that $m_i$ is prime for $i = k - 1$,
$i = k - 2$, . . ., $i = 0$, so that finally $m = m_0$ is prime. This method
works only for certain numbers $m$. The use of elliptic curves allows
the method to work for all primes $m$. See Section 7.3 and Theorem
7.12, where elliptic curves give the fastest current practical prime
proving method.
Now we consider primality tests for Fermat and Mersenne numbers.
Both of these tests run in polynomial time, assuming the input
numbers are written in binary or decimal. By the “input,” we mean
the full number $F_k$ or $M_p$ whose primality is tested, rather than simply
$k$ or $p$. If the input were $k$ or $p$ written in binary, then these
primality tests would not run in polynomial time.
Theorem 3.30 (Pepin’s Test). If $k \ge 2$, then $F_k = 2^{2^k} + 1$ is prime
if and only if $5^{(F_k - 1)/2} \equiv -1 \pmod{F_k}$.

Proof. Suppose $F_k$ is prime. We first show that 5 is a quadratic
nonresidue modulo $F_k$. Since $2^4 \equiv 1 \pmod 5$, it is easy to see that
$2^{2^k} \equiv 1 \pmod 5$ for $k \ge 2$. By $5 \equiv 1 \pmod 4$ and the Law of
Quadratic Reciprocity

$$\left(\frac{5}{F_k}\right) = \left(\frac{F_k}{5}\right) = \left(\frac{2}{5}\right) = -1,$$

so 5 is a quadratic nonresidue modulo $F_k$. By Euler’s Criterion, part
(4) of Theorem 2.59, we have $5^{(F_k - 1)/2} \equiv -1 \pmod{F_k}$.
For the converse, apply Theorem 3.27 with $a = 5$. Only the
prime $q = 2$ divides $F_k - 1$. By hypothesis, $5^{(F_k - 1)/2} \equiv -1 \pmod{F_k}$.
Therefore, $5^{(F_k - 1)/2} \not\equiv 1 \pmod{F_k}$ and $5^{F_k - 1} \equiv 1 \pmod{F_k}$, so $F_k$ is
prime. □

The proof of Theorem 3.30 shows that 5 is a primitive root for
$F_k$ whenever $F_k$ is prime. (When $F_k$ is not prime, it doesn’t have
a primitive root.) Exercise 3.16 shows that Pepin’s Theorem holds
with 5 replaced by 3.
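Pepin’s Test is a single modular exponentiation, so a sketch in Python is very short. The function name `pepin` is hypothetical; the built-in three-argument `pow` performs the fast exponentiation.

```python
def pepin(k):
    """Pepin's Test (Theorem 3.30) for the Fermat number F_k, k >= 2."""
    if k < 2:
        raise ValueError("the test is stated for k >= 2")
    F = 2 ** (2 ** k) + 1
    # F is prime exactly when 5^((F-1)/2) is congruent to -1 modulo F.
    return pow(5, (F - 1) // 2, F) == F - 1

# F_2 = 17, F_3 = 257, F_4 = 65537 are prime; F_5 through F_8 are composite.
print([pepin(k) for k in range(2, 9)])
```

The running time grows quickly with $k$ because $F_k$ has about $2^k$ bits, which illustrates the remark above about measuring the input size by $F_k$ rather than by $k$.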

Theorem 3.31 (Lucas-Lehmer Test). Let $p$ be an odd prime. Define
$S_1 = 4$ and $S_{k+1} = S_k^2 - 2$ for $k \ge 1$. Then $M_p = 2^p - 1$ is prime if
and only if $M_p$ divides $S_{p-1}$.

See Theorem 9.2.4 of Bach and Shallit [BS96] for a proof of
Theorem 3.31.
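The Lucas-Lehmer Test can be sketched as below. The function name `lucas_lehmer` is hypothetical, and the sketch reduces each $S_k$ modulo $M_p$ as it goes, which does not change whether $M_p$ divides $S_{p-1}$.

```python
def lucas_lehmer(p):
    """Lucas-Lehmer Test (Theorem 3.31) for M_p = 2^p - 1, p an odd prime."""
    M = 2 ** p - 1
    S = 4                          # S_1
    for _ in range(p - 2):         # p - 2 steps lead from S_1 to S_(p-1)
        S = (S * S - 2) % M
    return S == 0

# Odd primes p up to 61 for which M_p is prime.
odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61]
print([p for p in odd_primes if lucas_lehmer(p)])
```

The function assumes that $p$ is an odd prime and does not check this.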
What if an integer m does not have a special form like $M_p$ or $F_k$
and we cannot factor m − 1? How can we tell whether m is prime or
composite? If we are willing to allow a tiny chance of error, there are
some fast algorithms.
Suppose m is a large odd number and a is an integer unrelated
to m. By “unrelated” we mean that gcd(a, m) = 1 and if m divides
$b^n \pm 1$, say, then gcd(a, b) = 1 as well. If we compute $a^{m-1} \bmod m$
and its value is 1, then it is very likely that m is prime. Of course,
if m is prime, then $a^{m-1} \equiv 1 \pmod m$ by Fermat’s Little Theorem
2.23.

Definition 3.32. We call an integer m > 1 a probable prime to
base a if $a^{m-1} \equiv 1 \pmod m$. If m is composite and satisfies this
congruence, then we call m a pseudoprime to base a.

One can prove that for integers a > 1, most probable primes to
base a are prime and that very few of them are pseudoprimes. For
example, Erdős [Erd56] proved that the number of pseudoprimes to
base 2 up to x is less than $x \cdot \exp\left(-c\sqrt{(\ln x)\ln\ln x}\right)$ for some positive
constant c. More recently, Pomerance [Pom81] showed that the number
of pseudoprimes to base 2 up to x is less than $x^{1-(\ln\ln\ln x)/(2\ln\ln x)}$
for all sufficiently large x. Both of these upper bounds grow faster
than $x^{1-\varepsilon}$, for any ε > 0, but slower than $x/\ln^n x$, for any positive
integer n. By Theorem 2.15 the number of primes up to x is about
$x/\ln x$, which grows faster than the upper bound on the number of
base 2 pseudoprimes. These results show that almost all probable
primes are prime.
If one had a table of all pseudoprimes to base 2, say, up to L,
one could construct a fast primality test as follows. Given an odd
number m ≤ L, test whether $2^{m-1} \equiv 1 \pmod m$. If this congruence
fails, then m is composite by Fermat’s Little Theorem 2.23. If the
congruence holds, look for m in the table. If m is in the table, then
m is composite. Otherwise, m is prime. This test works fine for
relatively small L. There are 14884 pseudoprimes to base 2 below
$L = 10^{10}$ and 91210364 of them below $10^{19}$. One problem with this
test for larger L is that the table would be too big. Another problem
is that the pseudoprimes to base 2 have been computed only up to
about $10^{19}$. See [Fei13]. The following four or five pages tell how
people tried to get around these difficulties. See also Theorem 9.5
for a primality test that uses pseudosquares instead of pseudoprimes
(and works for slightly larger numbers).

Example 3.33. The smallest pseudoprime to base 2 is 341 = 11 · 31.
One computes $2^{340} \equiv 1 \pmod{341}$ either by the Fast Exponentiation
Algorithm or via the Chinese Remainder Theorem 2.20.
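
The computation in Example 3.33 can be reproduced in a few lines. This is a sketch only: Python’s built-in `pow(a, e, m)` stands in for the Fast Exponentiation Algorithm, and the helper name `is_probable_prime` is ours.

```python
def is_probable_prime(m, a):
    """Definition 3.32: m > 1 is a probable prime to base a if a^(m-1) = 1 (mod m)."""
    return m > 1 and pow(a, m - 1, m) == 1


# 341 = 11 * 31 is composite, yet it passes the base 2 test.
print(is_probable_prime(341, 2))

# Chinese Remainder view: 2^340 is 1 modulo 11 and modulo 31 separately.
print(pow(2, 340, 11), pow(2, 340, 31))

# Changing the base exposes 341 as composite.
print(is_probable_prime(341, 3))
```

The last line shows why one might hope that testing several bases helps, a hope that Carmichael numbers defeat.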

One can show that for every positive integer a there are infinitely
many pseudoprimes to base a. One might try to reduce the chance
of getting a pseudoprime when one wants a large prime by requiring
that it be a pseudoprime to several bases simultaneously. One prob-
lem with this approach is that there are infinitely many Carmichael
numbers.

Definition 3.34. A Carmichael number is a composite positive in-


teger m that is a pseudoprime to every base relatively prime to m.

Example 3.35. The smallest Carmichael number is 561 = 3 · 11 · 17.



One can show that if all three of 6k + 1, 12k + 1, and 18k + 1


are prime, then their product is a Carmichael number. The first
Carmichael number of this form is 1729 = 7 · 13 · 19 (with k = 1).

There are 1547 Carmichael numbers below $10^{10}$ and 3381806 of
them below $10^{19}$.
Pomerance [Pom81] showed that the number of Carmichael numbers
up to x is less than $x^{1-(\ln\ln\ln x)/(\ln\ln x)}$ for all sufficiently large x.
Alford, Granville, and Pomerance [AGP94] proved that for all
sufficiently large x, the number of Carmichael numbers < x is > $x^{2/7}$.
One can recognize a Carmichael number from its prime factorization,
as first proved in 1899 by Korselt [Kor99].

Theorem 3.36 (Korselt). Let m be a composite positive integer. The


following statements are equivalent.

(1) m is a Carmichael number.


(2) m has at least three prime factors, is odd and square free,
and for each prime p dividing m we have p − 1 divides m − 1.
(3) For every integer a we have $a^m \equiv a \pmod m$.

Proof. (1) ⇒ (2): Let m be a Carmichael number and let a be
relatively prime to m. Then $a^{m-1} \equiv 1 \pmod m$. But by Theorem
2.57, λ(m) is the smallest exponent so that $a^{\lambda(m)} \equiv 1 \pmod m$
for every a relatively prime to m.
Therefore, λ(m) divides m − 1 and m is relatively prime to λ(m).
Also, if p | m, then p − 1 | λ(m), so p − 1 | m − 1.
Suppose m were not square free, say, $p^2 \mid m$ for some prime p.
Then p | λ(m) by the definitions of λ(m) and φ(m). This implies that
gcd(m, λ(m)) > 1, a contradiction. Hence, m is square free.
Suppose m = pq were the product of only two (distinct) primes.
Let p > q. Then p − 1 | m − 1 = pq − 1, so (pq − 1)/(p − 1) is an
integer. But (pq − 1)/(p − 1) = q + (q − 1)/(p − 1) is not an integer
since p > q. Therefore, m has at least three prime factors.
If m were even, then m − 1 would be odd. But λ(m) | m − 1,
so λ(m) would be odd. However, λ(m) is even for every m > 2.
Therefore, m is odd.
(2) ⇒ (3): Suppose m is composite and square free and for each
prime p dividing m we have p − 1 divides m − 1. Since m is square free,
it is enough to show that $a^m \equiv a \pmod p$ for each prime p dividing
m. Let a be an integer and let p be a prime factor of m. If p does not
divide a, then $a^{p-1} \equiv 1 \pmod p$ by Fermat’s Little Theorem 2.23.
Since we are given that p − 1 | m − 1, we have $a^{m-1} \equiv 1 \pmod p$ and
so $a^m \equiv a \pmod p$. It is clear that $a^m \equiv a \pmod p$ when p divides a.
(3) ⇒ (1): We are given that $a^m \equiv a \pmod m$ for all a. When
gcd(a, m) = 1, we can cancel an a from each side to get $a^{m-1} \equiv
1 \pmod m$, that is, m is a pseudoprime to base a. □
Example 3.37. Note that 3 − 1, 11 − 1, and 17 − 1 all divide 561 − 1.
Likewise, 7 − 1, 13 − 1, and 19 − 1 all divide 1729 − 1.
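
Statement (2) of Theorem 3.36 gives a practical test once the factorization of m is known. The sketch below factors by trial division, which is an assumption suitable only for small m; the names `prime_factorization` and `is_carmichael` are illustrative.

```python
def prime_factorization(n):
    """Trial division; returns {prime: exponent}.  Adequate for small n only."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael(m):
    """Korselt's criterion: statement (2) of Theorem 3.36."""
    f = prime_factorization(m)
    return (m % 2 == 1
            and len(f) >= 3
            and all(e == 1 for e in f.values())              # square free
            and all((m - 1) % (p - 1) == 0 for p in f))      # p - 1 divides m - 1


print([m for m in range(3, 3000) if is_carmichael(m)])
k = 6
print(is_carmichael((6 * k + 1) * (12 * k + 1) * (18 * k + 1)))   # 37 * 73 * 109
```

The second print uses the family 6k + 1, 12k + 1, 18k + 1 mentioned above with k = 6, where all three numbers are prime.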

One way to avoid Carmichael numbers when using probable prime


tests on large numbers is to use a strong probable prime test.
Definition 3.38. A strong probable prime to base a is an odd positive
integer m with this property: If we write $m - 1 = 2^e f$ with f
odd, then either $a^f \equiv 1 \pmod m$ or $a^{f \cdot 2^c} \equiv -1 \pmod m$ for some c
in 0 ≤ c < e. A strong pseudoprime is a composite strong probable
prime.
Example 3.39. The first strong pseudoprime to base 2 is m =
2047 = 23 · 89. We have 2047 − 1 = 2 · 1023 and $2^{1023} \equiv 1 \pmod m$.
The second strong pseudoprime to base 2 is m = 3277. We have
$3277 - 1 = 2^2 \cdot 819$, $2^{819} \equiv 128 \pmod m$, and $2^{2 \cdot 819} \equiv -1 \pmod m$.
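
Definition 3.38 and Example 3.39 can be checked with the following Python sketch. The function name `is_strong_probable_prime` is an assumption for this example; the loop squares $a^f$ at most e − 1 times, covering the exponents c = 1, . . ., e − 1.

```python
def is_strong_probable_prime(m, a):
    """Definition 3.38 for an odd integer m > 1 and a base a."""
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be an odd integer greater than 1")
    e, f = 0, m - 1
    while f % 2 == 0:            # write m - 1 = 2^e * f with f odd
        e += 1
        f //= 2
    x = pow(a, f, m)
    if x == 1 or x == m - 1:     # a^f = 1, or the case c = 0
        return True
    for _ in range(e - 1):       # the cases c = 1, ..., e - 1
        x = x * x % m
        if x == m - 1:
            return True
    return False


print(is_strong_probable_prime(2047, 2), is_strong_probable_prime(3277, 2))
print(pow(2, 819, 3277))                 # the value 128 quoted in Example 3.39
print(is_strong_probable_prime(341, 2))  # 341 is a pseudoprime but not a strong one
```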
Theorem 3.40. Every prime m is a strong probable prime to any
base a that it does not divide.

Proof. Since the Legendre symbol (1/m) = +1, there are exactly
two solutions to $x^2 \equiv 1 \pmod m$. But +1 and −1 are two solutions
to this congruence, so there are no others. Write $m - 1 = 2^e f$ with f
odd, as in Definition 3.38, and look at the sequence of
integers $a^f, a^{2f}, a^{4f}, \ldots, a^{2^e f} = a^{m-1}$ modulo m. Each number is the square of
the number before it. The last one is +1 by Fermat’s Little Theorem
2.23. Either they are all +1 or else the last one that is not +1 is −1.
But this is the definition of strong probable prime to base a. □
Theorem 3.41. Every strong probable prime m is a probable prime
to the same base.

Proof. Write $m - 1 = 2^e f$ with f odd. We can compute $a^{m-1} \bmod m$
by squaring $a^{f \cdot 2^c} \bmod m$ (for some c in 0 ≤ c < e) at least once. Since
by hypothesis one of these numbers is ≡ ±1 (mod m) and it is squared
at least once, we must have $a^{m-1} \equiv 1 \pmod m$. □

See Corollary 4.5 for a proof that there are infinitely many strong
pseudoprimes to every (prime) base. There are 3291 strong pseudoprimes
to base 2 below $10^{10}$ and 24220195 of them below $10^{19}$.
The next theorem says that there is no “strong” Carmichael num-
ber, that is, no m is a strong pseudoprime to every base relatively
prime to it.
Theorem 3.42 (Monier [Mon80], Rabin [Rab80]). If m is an odd
composite integer > 9, then the number of bases a in 1 ≤ a ≤ m to
which m is a strong pseudoprime is ≤ φ(m)/4.

For a proof, see Theorem 5.10 of Rosen [Ros88] or Theorem 3.4.4


of Crandall and Pomerance [CP05].
The theorem can be used to construct a good probable³ primality
test. Miller [Mil76] and Rabin [Rab80] recommended choosing k
random bases a. If an odd integer m > 9 is a strong probable prime
to every one of these k bases, then the probability that m is composite
is smaller than $4^{-k}$ because, if m were composite, then each strong
probable prime test has at least three chances in four to show that
m is composite. With k = 10, say, there is less than one chance in
one million that a composite m would pass all ten individual strong
probable prime tests.
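
The random-base test just described, together with the bound of Theorem 3.42, can be sketched as follows. All function names are assumptions made for this illustration, and `count_strong_liars` uses brute force, so it is meant for small m only.

```python
import math
import random


def is_strong_probable_prime(m, a):
    """Definition 3.38 for odd m > 1."""
    e, f = 0, m - 1
    while f % 2 == 0:
        e += 1
        f //= 2
    x = pow(a, f, m)
    if x == 1 or x == m - 1:
        return True
    for _ in range(e - 1):
        x = x * x % m
        if x == m - 1:
            return True
    return False


def miller_rabin(m, k=10, rng=random):
    """False means m is certainly composite; True means m passed k random bases."""
    return all(is_strong_probable_prime(m, rng.randrange(2, m - 1))
               for _ in range(k))


def count_strong_liars(m):
    """Number of bases 1 <= a <= m to which m is a strong probable prime."""
    return sum(is_strong_probable_prime(m, a) for a in range(1, m + 1))


def phi(m):
    """Euler's function by direct counting (small m only)."""
    return sum(1 for a in range(1, m + 1) if math.gcd(a, m) == 1)


for m in (15, 91, 561, 1729, 2047):
    print(m, count_strong_liars(m), phi(m) / 4)
```

For each odd composite m > 9 in the loop, the count in the second column should not exceed the value φ(m)/4 in the third, as Theorem 3.42 asserts.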
Another closely related approach to primality testing is to get
an upper bound on the size of the smallest a to which a composite
number m is not a strong pseudoprime. Rigorous bounds on this
size are huge, but if one assumes (as many people do) the Extended
Riemann Hypothesis, then one can show that the smallest such a is
< $2(\ln m)^2$. This was proved by Bach [Bac85]. See Theorem 3.4.12
of [CP05]. Thus, if you believe the Extended Riemann Hypothesis,
then m is prime if and only if it is a strong probable prime to every
base a < $2(\ln m)^2$.
³ Technically, a Monte Carlo probabilistic algorithm.

There are also primality tests that choose random numbers while
deciding whether m is prime. The answer is always correct and they
usually run in polynomial time, but there is a chance that some bad
random numbers might make the algorithm run for longer than poly-
nomial time. One example of such an algorithm is the elliptic curve
version of Theorem 3.29 by Goldwasser and Kilian [GK86]. See The-
orem 7.12.
One can prove that if m is prime and ≡ 3 or 7 mod 10, then
m divides the Fibonacci number $u_{m+1}$. We explained at the end of
Section 3.4 how to compute $u_{m+1} \bmod m$ when m is a large integer.
In 1980, Baillie, Pomerance, Selfridge, and the author [BW80],
[PSW80] conjectured that no odd composite integer m, whose last
digit is 3 or 7, is a strong probable prime to base 2 and also divides the
Fibonacci number $u_{m+1}$. (We made a similar, but slightly more
complicated, conjecture for numbers with last digit 1 or 9. See [BW80]
and [PSW80] for details.) No one has ever proved or disproved this
conjecture and it has been used millions of times to construct large
primes without a single failure. Jeff Gilchrist has verified that the
test is correct for all $m < 2^{64}$ using Jan Feitsma’s [Fei13] list of
pseudoprimes to base 2 up to $2^{64}$. The authors of [PSW80] offer
a cash prize to the first person who either proves that this test is
always correct or exhibits a composite number that the test says is
probably prime. The original value of the prize was $30 but has been
increased to $620. Heuristic arguments suggest that composite numbers
exist that the test says are probably prime, and that the least
such examples have hundreds or even thousands of decimal digits.

Algorithm 3.43. BPSW probable prime test.

Input: An odd integer m > 10 with last digit 3 or 7.
if ((m is a strong probable prime to base 2)
        and (m divides $u_{m+1}$)) {
    write “m is probably prime”
} else write “m is composite”
Output: Tells whether m is composite or probably prime.

The algorithm reports that m is probably prime if m is a strong
probable prime to base 2 and m divides the Fibonacci number $u_{m+1}$;
otherwise, it says that m is composite. The time complexity of this
probable prime test is $O((\log m)^3)$, that is, polynomial time.
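
Algorithm 3.43 might be coded as in the sketch below. Two assumptions are worth flagging: the Fibonacci number is computed with the standard fast-doubling identities $u_{2k} = u_k(2u_{k+1} - u_k)$ and $u_{2k+1} = u_k^2 + u_{k+1}^2$, which need not be the method of Section 3.4, and the function names are ours.

```python
def is_strong_probable_prime(m, a):
    """Definition 3.38 for odd m > 1."""
    e, f = 0, m - 1
    while f % 2 == 0:
        e += 1
        f //= 2
    x = pow(a, f, m)
    if x == 1 or x == m - 1:
        return True
    for _ in range(e - 1):
        x = x * x % m
        if x == m - 1:
            return True
    return False


def fib_mod(n, m):
    """Fibonacci number u_n modulo m (u_0 = 0, u_1 = 1) by fast doubling."""
    a, b = 0, 1                                # (u_k, u_{k+1}) with k = 0
    for bit in bin(n)[2:]:                     # bits of n, most significant first
        c = a * ((2 * b - a) % m) % m          # u_{2k}
        d = (a * a + b * b) % m                # u_{2k+1}
        a, b = (d, (c + d) % m) if bit == "1" else (c, d)
    return a


def bpsw(m):
    """Algorithm 3.43: True means 'probably prime', False means 'composite'."""
    if m <= 10 or m % 10 not in (3, 7):
        raise ValueError("input must be an integer > 10 with last digit 3 or 7")
    return is_strong_probable_prime(m, 2) and fib_mod(m + 1, m) == 0


print(bpsw(16493), bpsw(2047), bpsw(3277))
```

Here 16493 is the prime of Exercise 3.22, while 2047 and 3277 are the strong pseudoprimes to base 2 from Example 3.39; the Fibonacci condition is what rejects the latter two.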
In 2002, Agrawal, Kayal, and Saxena [AKS04] found the first
general deterministic polynomial-time primality algorithm. It decides
the primality of all integers, has running time a polynomial in the
size of its input m, uses no random numbers, and its correctness
depends on no unproved hypothesis. Their original version had running
time $O((\log m)^{12})$, but others [Mor04], [Gra05] soon reduced this
to about $O((\log m)^6)$. See Agrawal, Kayal, and Saxena [AKS04] and
articles that refer to it for details.
Here is the original version of their primality test. In the
algorithm, x is a variable (placeholder). Its use is explained in the next
algorithm, as is the notation “mod $(m, x^s - 1)$”. See [AKS04] for an
explanation of how and why the algorithm works.

Algorithm 3.44. Agrawal, Kayal, and Saxena prime test.

Input: An odd integer m > 10.
if (m is of the form $n^k$ for integers n ≥ 2 and k ≥ 2) {
    write “m is composite” and stop
}
L ← $\log_2 m$ + 1
s ← 2
while (s < m) {
    if (gcd(m, s) > 1) {
        write “m is composite” and stop
    }
    if (s is prime) {
        Let q be the largest prime factor of s − 1
        if (q ≥ $6\sqrt{s}\,L$ and $m^{(s-1)/q} \not\equiv 1 \pmod s$) {
            goto NEXT
        }
    }
    s ← s + 1
}
NEXT:
for b ← 1 to $3\sqrt{s}\,L$ {
    if ($(x + b)^m \not\equiv x^m + b$ mod $(m, x^s - 1)$) {
        write “m is composite” and stop
    }
}
write “m is prime”
Output: The test tells whether m is composite or prime.

Agrawal, Kayal, and Saxena proved that the algorithm is correct
and that the while and for loops terminate within $O((\log m)^{12})$
iterations. The congruence in the last if statement is computed both
modulo m and modulo $x^s - 1$. The test whether $m = n^k$ in the first
step is quite fast. See Bernstein [Ber98].
The following algorithm indicates how to compute $(x + b)^m$ mod
$(m, x^s - 1)$. Note the similarity of this procedure to Fast Exponentiation.
The variables y and z represent polynomials of degree at most s − 1. The
procedure mult(y, z, m, s) multiplies two polynomials y, z of degree
at most s − 1 modulo $(m, x^s - 1)$. Compare this procedure with Algorithm
10.3.
Algorithm 3.45. Compute $(x + b)^m$ mod $(m, x^s - 1)$.
Input: Integers m > 0, b > 0, s > 1.
e ← m
for (i ← 1 to s − 1) { y[i] ← 0 ; z[i] ← 0 }
y[0] ← 1 ; z[0] ← b ; z[1] ← 1
while (e > 0) {
    if (e is odd) y ← mult(y, z, m, s)
    z ← mult(z, z, m, s)
    e ← ⌊e/2⌋
}
return(y)
end
function mult(y, z, m, s)
    for (i ← 0 to 2s − 2) { w[i] ← 0 }
    for (i ← 0 to s − 1) {
        for (j ← 0 to s − 1) {
            w[i + j] ← (w[i + j] + y[i] · z[j]) mod m
        }
    }
    i ← s − 2
    while (i ≥ 0) {
        w[i] ← (w[i] + w[i + s]) mod m
        i ← i − 1
    }
    return(w)
end
Output: $(x + b)^m$ mod $(m, x^s - 1)$ = the final value of y.

When the algorithm returns with $y = (x + b)^m$ mod $(m, x^s - 1)$,
one tests whether this is ≡ $x^m + b$ as follows: If y[0] = b and
y[m mod s] = 1 and y[i] = 0 for all 1 ≤ i < s with i ≠ m mod s,
then $(x + b)^m \equiv x^m + b$ mod $(m, x^s - 1)$; otherwise $(x + b)^m \not\equiv x^m +
b$ mod $(m, x^s - 1)$ (which means m is composite).
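
A runnable version of this polynomial exponentiation and the final comparison might look as follows. This is a sketch under stated assumptions: coefficients are stored lowest degree first, exponents are folded modulo s during multiplication instead of in a separate loop afterwards, and the names `poly_mult`, `poly_pow`, and `congruence_holds` are ours.

```python
def poly_mult(y, z, m, s):
    """Multiply two length-s coefficient lists modulo (m, x^s - 1)."""
    w = [0] * s
    for i, yi in enumerate(y):
        if yi:
            for j, zj in enumerate(z):
                k = (i + j) % s            # x^s = 1, so exponents wrap around
                w[k] = (w[k] + yi * zj) % m
    return w


def poly_pow(b, m, s):
    """Coefficients of (x + b)^m mod (m, x^s - 1), lowest degree first."""
    y = [1] + [0] * (s - 1)                # the constant polynomial 1
    z = [b % m, 1] + [0] * (s - 2)         # the polynomial x + b
    e = m
    while e > 0:
        if e & 1:
            y = poly_mult(y, z, m, s)
        z = poly_mult(z, z, m, s)
        e >>= 1
    return y


def congruence_holds(b, m, s):
    """Is (x + b)^m congruent to x^m + b modulo (m, x^s - 1)?"""
    target = [0] * s
    target[0] = b % m
    target[m % s] = (target[m % s] + 1) % m
    return poly_pow(b, m, s) == target


print(poly_pow(1, 7, 3), congruence_holds(1, 7, 3))   # m = 7 is prime
print(poly_pow(1, 9, 5), congruence_holds(1, 9, 5))   # m = 9 is composite
```

Each call to `poly_mult` costs about $s^2$ coefficient multiplications, which matches the double loop of the function mult in Algorithm 3.45.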

Exercises

3.1. Estimate the number of numbers between $10^{36}$ and $10^{36} + 10^7$
whose greatest prime factor is less than $10^6$.
3.2. Estimate the number of numbers between $18^{36}$ and $18^{36} + 10^7$
whose greatest prime factor is less than $18^6$.
3.3. Find all x with $x^2 \equiv 9 \pmod{253}$.
3.4. Tonelli’s algorithm requires a quadratic nonresidue modulo p.
Prove that the least positive quadratic nonresidue modulo p
must be prime.
3.5. Find the cyclotomic polynomials $\Phi_{10}(x)$, $\Phi_{12}(x)$, $\Phi_{15}(x)$, and
$\Phi_{30}(x)$.
3.6. Show that for m > 1 the cyclotomic polynomial $\Phi_m(x)$ is
self-reciprocal, that is, $\Phi_m(x) = x^{\varphi(m)} \Phi_m(1/x)$.
3.7. Show that at least one coefficient of $\Phi_{105}(x)$ is different from
1, 0, −1.
3.8. Show that the repunits $R_n$ of Section 1.2 form a divisibility
sequence.
3.9. Let $\{c_n\}$ be a divisibility sequence. Prove that
(a) for all n ≥ 1, $c_n/c_1$ is an integer,
(b) the sequence $\{c_n/c_1\}$ is a divisibility sequence, and
(c) if (a + 1) | (b + 1), then $\sigma(p^a) \mid \sigma(p^b)$.
3.10. For n ≥ 1, let $a_n = \varphi(n)$, $b_n = \gcd(2^n - 1, 3^n - 1)$, and
$c_n = \lambda(n)$, Carmichael’s function. Prove that $\{a_n\}$, $\{b_n\}$,
and $\{c_n\}$ are divisibility sequences.

3.11. Use Theorem 3.17 to verify that 73, 11, 23, 61, and 7 are
intrinsic factors of the primitive parts of $2^{657} - 1$, $3^{605} - 1$,
$3^{253} - 1$, $11^{122} + 1$, and $2^{1029} - 1$, respectively.
3.12. Prove that if p and q are distinct primes and pq > 6, then
$2^{pq} - 1$ has at least three different prime factors.
3.13. Prove that $\gcd(2^m - 1, 2^n - 1) = 2^{\gcd(m,n)} - 1$.
3.14. Prove that $\gcd(u_m, u_n) = u_{\gcd(m,n)}$.
3.15. Factor the first 30 Lucas numbers $v_n$. For which m and n
does $v_m \mid v_n$ in your table?
3.16. Some versions of Pepin’s test use 3 in place of 5. Prove that
this version is correct for all k ≥ 1.
3.17. Find all 22 pseudoprimes to base 2 below 10000. Which
of these numbers are Carmichael numbers? Which ones are
strong pseudoprimes to base 2?
3.18. Show that every composite factor of $b^n \pm 1$ is a pseudoprime
to base b.
3.19. Let N be the primitive part of $b^n \pm 1$ with any intrinsic factor
removed. Show that every composite factor of N is a strong
pseudoprime to base b.
3.20. True or False? Give proof or counterexample.
(a) If m is a pseudoprime to base a, then m is a pseudoprime
to base $a^2$.
(b) If m is a strong pseudoprime to base a, then m is a strong
pseudoprime to base $a^2$.
(c) If m is a pseudoprime to bases a and b, then m is a
pseudoprime to base ab.
(d) If m is a strong pseudoprime to bases a and b, then m is
a strong pseudoprime to base ab.
3.21. Use Korselt’s Criterion, Theorem 3.36, to prove that if 6k + 1,
12k + 1, and 18k + 1 are all prime, then their product is a
Carmichael number.
3.22. Use the probable primality test of Baillie, Pomerance, Self-
ridge, and Wagstaff to prove that 16493 is probably prime.

3.23. If the primality test of Agrawal, Kayal, and Saxena finds that
its input m is composite, does it produce a factor of m?
3.24. Use the primality test of Agrawal, Kayal, and Saxena to prove
that 16493 is prime.
Chapter 4

How Are Factors Used?

The purpose of computing is insight, not numbers.


R. W. Hamming [Ham62, Dedication]

Introduction
Hamming was referring to numerical analysis, but the quotation ap-
plies equally to computational number theory and, in particular, to
factoring integers.
Chapter 1 mentioned a few uses of tables of factored integers.
This chapter lists many more examples of the uses of factorization.
One use is discovering new algebraic identities that tell how to factor
certain integers. Aurifeuille discovered an infinite number of these by
examining tables of factored numbers. We mentioned perfect num-
bers N (with σ(N ) = 2N ) in Chapter 1. In this chapter we tell more
about perfect numbers and the related harmonic numbers. We dis-
cuss the use of factoring in proving that certain types of numbers are
prime. The study of Bernoulli numbers, used in combinatorics and
the structure of fields, requires the knowledge of factors of some in-
tegers. We discuss several topics from cryptography, including linear
feedback shift registers (LFSR), the RSA cryptosystem and signa-
tures, secure random number generation, and zero-knowledge proofs.


Finally, we list a few miscellaneous applications, such as counting the


representations of an integer as the sum of 2 or 4 squares.

4.1. Aurifeuillian Factorizations


Another use of tables like the Cunningham Project is to discover
new algebraic identities. If one did not know the identity $x^2 - 1 =
(x - 1)(x + 1)$, one could discover it from a table of the numbers $n^2 - 1$
factored. See Table 1. Even if the factors were not rewritten as in
the third column, one would notice the identity when x − 1 and x + 1
were (twin) primes.

Table 1. Numbers $n^2 - 1$ factored.

  n    Prime factors of $n^2 - 1$    Factors of $n^2 - 1$ grouped
  3    2.2.2                         2.4
  4    3.5                           3.5
  5    2.2.2.3                       4.6
  6    5.7                           5.7
  7    2.2.2.2.3                     6.8
 12    11.13                         11.13

The fact that the sequences $\{b^m - 1\}$ and $\{u_m\}$ are divisibility
sequences was discovered by the study of tables of these numbers
factored.
More than a century ago, mathematicians studied tables of numbers
$b^n \pm 1$ factored and observed that for each b, the numbers in
which n lies in a certain arithmetic progression split into more and
smaller prime factors than did other numbers in the same table. They
suspected and found an algebraic identity that breaks these integers
into two factors of roughly the same size.
In 1869, Landry factored

$$2^{58} + 1 = 5 \cdot 107367629 \cdot 536903681.$$

He wrote that none of the factorizations of $2^n \pm 1$ that he obtained
gave as much trouble and labor as that of $2^{58} + 1$. In 1871, Aurifeuille
(see [CW25, p. v]) rewrote this factorization as

$$2^{58} + 1 = 536838145 \cdot 536903681,$$

where the first factor is 5 · 107367629 and the second factor is prime.
He observed that the two displayed factors are close together, their
difference is $2^{16}$, and the equation is the special case j = 15 of the
identity

(4.1)  $\Phi_4(2^{2j-1}) = 2^{4j-2} + 1 = (2^{2j-1} - 2^j + 1)(2^{2j-1} + 2^j + 1).$

Landry’s task would have been much easier had he noticed this formula
when he began to factor $2^{58} + 1$. Later, Lucas proved the identities
(4.1), (4.2) and other similar formulas.
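
Identity (4.1) and Aurifeuille’s observation about $2^{58} + 1$ are easy to confirm numerically. In this small Python sketch the function name `aurifeuillian_2` is an assumption made for illustration.

```python
def aurifeuillian_2(j):
    """The two factors of 2^(4j-2) + 1 given by identity (4.1)."""
    x = 2 ** (2 * j - 1)
    return x - 2 ** j + 1, x + 2 ** j + 1


L, M = aurifeuillian_2(15)
print(L, M)
print(L * M == 2 ** 58 + 1, M - L == 2 ** 16, L == 5 * 107367629)
```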
Let us examine Table 2 of factors of $3^m + 1$ and try to discover
an algebraic factorization.

Table 2. Numbers $3^m + 1$ factored.

  m  $3^m + 1$             m  $3^m + 1$                 m  $3^m + 1$
  1  2.2                  10  (2) 5∗.1181              19  (1) 2851.101917
  2  2∗.5                 11  (1) 67.661               20  (4) 42521761
  3  (1) 7                12  (4) 6481                 21  (1, 3, 7) 7∗.43.2269
  4  2∗.41                13  (1) 398581               22  (2) 5501.570461
  5  (1) 61               14  (2) 29.16493             23  (1) 23535794707
  6  (2) 73               15  (1, 3, 5) 31.271         24  (8) 97.577.769
  7  (1) 547              16  2∗.21523361              25  (1, 5) 151.22996651
  8  2∗.17.193            17  (1) 103.307.1021         26  (2) 53.4795973261
  9  (1, 3) 19.37         18  (2, 6) 530713            27  (1, 3, 9) 19441.19927

Note the two nearly equal factors 19441 and 19927 of $3^{27} + 1$. The
fact that 27 is a multiple of b = 3 suggests
looking at how $\Phi_{3k}(3)$ factors. When k is even, this primitive part
could be prime, as shown by the examples k = 2 ($\Phi_6(3) = 73$) and
k = 4 ($\Phi_{12}(3) = 6481$). But when k = 3, 5, 7, and 9, $\Phi_{3k}(3)$ is
composite. A possible new algebraic factorization of $\Phi_{3k}(3)$ should
also hold when k = 1, but $\Phi_3(3) = 7$ is so small that the algebraic
cofactor might be 1. Consider the factorization of the polynomial

$$x^3 + 1 = (x + 1)(x^2 - x + 1) = \Phi_2(x)\Phi_6(x).$$


Now $\Phi_6(x) = x^2 - x + 1 = (x + 1)^2 - 3x$ is irreducible over the
rational numbers. But when $x = 3^{2j-1}$ (that is, 3 to an odd power),
it becomes the difference of two squares and it factors:

(4.2)  $\Phi_6(3^{2j-1}) = (3^{2j-1} + 1)^2 - 3^{2j} = (3^{2j-1} - 3^j + 1)(3^{2j-1} + 3^j + 1).$

For 27 = 3 · (2 · 5 − 1) we take j = 5. Then $3^{2j-1} - 3^j + 1 = 3^9 - 3^5 + 1 =
19441$ and $3^{2j-1} + 3^j + 1 = 3^9 + 3^5 + 1 = 19927$. This explains the
nearly equal factors of $3^{27} + 1$. Once the identity has been discovered,
it is easy to prove it by multiplying the factors using algebra.
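
For completeness, the multiplication behind (4.2) can be written out in one line. This is a worked check, with $x = 3^{2j-1}$ so that $3^{2j} = 3x$:

```latex
\begin{aligned}
(x - 3^{j} + 1)(x + 3^{j} + 1)
  &= (x + 1)^2 - \left(3^{j}\right)^2 \\
  &= x^2 + 2x + 1 - 3x \\
  &= x^2 - x + 1 = \Phi_6(x).
\end{aligned}
```

The only step that uses the special form of x is the replacement of $3^{2j}$ by $3x$ in the second line.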
There are similar identities for $b^n + 1$ when b = 2, 6, 7, 10, 11,
12, and for $b^n - 1$ when b = 5. For example, with b = 2 the identity
$\Phi_4(x) = x^2 + 1 = (x + 1)^2 - 2x$ becomes a difference of two squares
when $x = 2^{2j-1}$. This leads to equation (4.1). For more details, see
[BLS+ 02] or Riesel [Rie94].
Two important papers on Aurifeuillian factorizations are Schinzel
[Sch62], which proves the existence of Aurifeuillian factorizations
of $a^n - b^n$, and Granville and Pleasants [GP07], which shows that
Schinzel found all such factorizations. (See also [Wag12].) For an
algorithm to compute the coefficients in Aurifeuillian factorizations,
see Brent [Bre95], [Bre93].
Here is a version of a special case of the theorem of Schinzel
[Sch62]. The first part of the theorem was proved by Lucas [Luc78].

Theorem 4.1. Let b > 2 be a square-free integer. Then there exist
polynomials $C_b(x)$ and $D_b(x)$ with integer coefficients and degrees
φ(b)/2 and φ(b)/2 − 1, respectively, having the following properties.
Let h be an odd positive integer. If b ≡ 1 (mod 4), then

$$\Phi_b(b^h) = \left(C_b(b^h) - b^{(h+1)/2} D_b(b^h)\right)\left(C_b(b^h) + b^{(h+1)/2} D_b(b^h)\right),$$

and if b ≡ 2 or 3 (mod 4), then

$$\Phi_{2b}(b^h) = \left(C_b(b^h) - b^{(h+1)/2} D_b(b^h)\right)\left(C_b(b^h) + b^{(h+1)/2} D_b(b^h)\right).$$

In particular, this theorem says that there exists a formula for
factoring $\Phi_b(b^{2j-1})$ when b = 4k + 1 is square free and $\Phi_{2b}(b^{2j-1})$
when b = 4k − 1 or b = 4k − 2 is square free. This means that when
b is square free, the Aurifeuillian factorization happens in the $b^n - 1$
table if b = 4k + 1 and in the $b^n + 1$ table if b = 4k − 1 or b = 4k − 2.
See Riesel [Rie94] for a table of the coefficients of these polynomials.

Example 4.2. Find the Aurifeuillian factorization for $14^{14h} + 1$.

Let h be odd. We have

$$14^{14h} + 1 = \Phi_4(14^h)\Phi_{28}(14^h) = (14^{2h} + 1)\Phi_{28}(14^h).$$

Since 14 ≡ 2 (mod 4), Theorem 4.1 with b = 14 factors $\Phi_{28}(14^h)$ as
$(C - 14^{(h+1)/2} D)(C + 14^{(h+1)/2} D)$. Riesel’s table [Rie94] gives the
coefficients of the polynomials C and D as 1, 7, 3, −7, 3, 7, 1 for C
and 1, 2, −1, −1, 2, 1 for D, that is, $C(x) = x^6 + 7x^5 + 3x^4 - 7x^3 +
3x^2 + 7x + 1$. The two Aurifeuillian factors of $\Phi_{28}(14^h)$ are

$$14^{6h} + 7 \cdot 14^{5h} + 3 \cdot 14^{4h} - 7 \cdot 14^{3h} + 3 \cdot 14^{2h} + 7 \cdot 14^{h} + 1
  \mp 14^{(h+1)/2}\left(14^{5h} + 2 \cdot 14^{4h} - 14^{3h} - 14^{2h} + 2 \cdot 14^{h} + 1\right).$$

Example 4.3. Find the Aurifeuillian factorization for $15^{15h} + 1$.

Let h be odd. We have

$$15^{15h} + 1 = \Phi_2(15^h)\Phi_6(15^h)\Phi_{10}(15^h)\Phi_{30}(15^h)
  = (15^h + 1)(15^{2h} - 15^h + 1)(15^{4h} - 15^{3h} + 15^{2h} - 15^h + 1)\Phi_{30}(15^h).$$

Since 15 ≡ 3 (mod 4), Theorem 4.1 with b = 15 factors $\Phi_{30}(15^h)$ as
$(C - 15^{(h+1)/2} D)(C + 15^{(h+1)/2} D)$. Riesel’s table [Rie94] gives the
coefficients of polynomials C and D as 1, 8, 13, 8, 1 for C and 1, 3,
3, 1 for D, that is, $D(x) = (x + 1)^3$. The two Aurifeuillian factors of
$\Phi_{30}(15^h)$ are

$$15^{4h} + 8 \cdot 15^{3h} + 13 \cdot 15^{2h} + 8 \cdot 15^{h} + 1 \mp 15^{(h+1)/2}(15^h + 1)^3.$$
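
Multiplying the two factors in Examples 4.2 and 4.3 gives $C(x)^2 - b\,x\,D(x)^2$ with $x = b^h$, so each example amounts to a polynomial identity that can be checked mechanically. The sketch below does this with plain coefficient lists (highest degree first). The coefficients of C and D are those quoted from Riesel’s table above; the expansions of $\Phi_{28}$ and $\Phi_{30}$ and the helper names are our own.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            r[i + j] += a * c
    return r


def aurifeuillian_product(b, C, D):
    """Coefficients of C(x)^2 - b*x*D(x)^2, highest degree first."""
    c2 = poly_mul(C, C)
    d2 = poly_mul(D, D) + [0]                # multiply D(x)^2 by x
    d2 = [0] * (len(c2) - len(d2)) + d2      # pad on the left to the length of C^2
    return [u - b * v for u, v in zip(c2, d2)]


PHI_28 = [1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0, 1]   # x^12 - x^10 + ... - x^2 + 1
PHI_30 = [1, 1, 0, -1, -1, -1, 0, 1, 1]               # x^8 + x^7 - x^5 - x^4 - x^3 + x + 1

print(aurifeuillian_product(14, [1, 7, 3, -7, 3, 7, 1], [1, 2, -1, -1, 2, 1]) == PHI_28)
print(aurifeuillian_product(15, [1, 8, 13, 8, 1], [1, 3, 3, 1]) == PHI_30)
```

Substituting $x = b^h$ with h odd then makes $b\,x = b^{h+1}$ a perfect square, which is what turns the identity into a factorization.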

Example 4.4. The author [Wag96] factored

$$\frac{173^{173} - 1}{173 - 1} = 347 \cdot 685081 \cdot 161297590410850151 \cdot P176 \cdot P184,$$

where Pxxx is a prime with xxx decimal digits. If one began naively
to factor this number, it would be easy to discover the three small
prime factors by Trial Division and the Elliptic Curve Method, but no
algorithm known at this time could split the product of the two large
prime factors. However, by Theorem 4.1 with b = 173, this number
has an Aurifeuillian factorization that breaks it into two nearly equal
pieces, each of which is easy to factor.

Corollary 4.5. For every integer b > 1 and integer m > 2, every
composite divisor of the primitive part of $b^m - 1$ or $b^m + 1$, with any
intrinsic factor removed, is a strong pseudoprime to base b. For every
square free b there are infinitely many strong pseudoprimes to base b.

Proof. Let $P_m(b) = \Phi_m(b)/\gcd(\Phi_m(b), m)$ denote $\Phi_m(b)$ with any
intrinsic factor removed. By Theorems 3.15 and 3.23, the prime factors
of $P_m(b)$ are all ≡ 1 (mod m). Let n be a composite divisor of
$P_m(b)$. Then m | n − 1, so $b^{n-1} \equiv 1 \pmod{P_m(b)}$. When gcd(b, t) = 1,
let $\ell_b(t)$ denote the smallest positive integer k so that $b^k \equiv 1 \pmod t$.
(There always is one because k = φ(t) is one such integer.) Then
$\ell_b(p^e) = m$ for each prime power $p^e$ with $p^e \,\|\, n$. Hence there is an
integer k so that $2^k \,\|\, \ell_b(p^e)$ for each prime power $p^e$ with $p^e \,\|\, n$. This
fact and $b^{n-1} \equiv 1 \pmod n$ show that n is a strong pseudoprime to
base b.
Now let b be square free. Let η = 1 if b ≡ 1 (mod 4) and η = 2
if b ≡ 2 or 3 (mod 4). Theorem 4.1 shows that $P_{h\eta b}(b)$ factors
into two nearly equal pieces for every odd positive integer h, so that
$P_{h\eta b}(b)$ must be composite, and therefore a strong pseudoprime to
base b. □

Using Schinzel’s [Sch62] full theorem, one can show that Corollary
4.5 is valid without the restriction that b be square free. See
Theorem 1 of [PSW80] for this theorem. A closer analysis of the
proof of Corollary 4.5 shows that the number of strong pseudoprimes
to base b up to x is > ln x/(4b ln b). See [EP86] for a stronger
inequality.
What happens when b is not square free? First, if $b = c^t$, then
the factors of $b^n - 1$ are just those of $c^{tn} - 1$, and likewise for $b^n + 1$.
Now suppose b is not a power and not square free. Write $b = cd^2$
where c is square free and $d^2$ is the largest square dividing b. Then
the Aurifeuillian factorization of $b^n \pm 1$ comes from that of $c^n \pm 1$.

Example 4.6. Find the Aurifeuillian factorization of $12^n + 1$.

We saw above that $\Phi_6(x) = x^2 - x + 1 = (x + 1)^2 - 3x$ becomes
the difference of two squares when $x = 3^{2j-1}$. It has the same form
and it factors whenever x is a square times 3 to an odd power. Write
h = 2j − 1 for brevity. Let $x = 12^h$. Then x equals a square ($2^{2h}$)
times 3 to an odd power ($3^h$), so

$$\Phi_6(12^h) = (12^h + 1)^2 - \left(2^h 3^j\right)^2
  = \left(12^h - 2^h 3^j + 1\right)\left(12^h + 2^h 3^j + 1\right).$$

This formula factors $12^{3h} + 1 = (12^h + 1)\Phi_6(12^h)$; that is, every sixth
number in the Cunningham table for $12^n + 1$ is factored.

Example 4.7. Find the Aurifeuillian factorization of $18^n + 1$.

We saw above that $\Phi_4(x) = x^2 + 1 = (x + 1)^2 - 2x$ becomes the
difference of two squares when $x = 2^{2j-1}$. It has the same form and
it factors whenever x is a square times 2 to an odd power. Write
h = 2j − 1. Let $x = 18^h$. Then x equals a square ($3^{2h}$) times 2 to an
odd power ($2^h$), so

$$\Phi_4(18^h) = (18^h + 1)^2 - \left(2^j 3^h\right)^2
  = \left(18^h - 2^j 3^h + 1\right)\left(18^h + 2^j 3^h + 1\right).$$

This formula factors $18^{2h} + 1 = \Phi_4(18^h)$; that is, every fourth
number in the Cunningham-type table for $18^n + 1$ is factored.

Example 4.8. Find the Aurifeuillian factorization of $20^n - 1$.

Begin with the Aurifeuillian factorization:

$$\Phi_5(x) = (x^2 + 3x + 1)^2 - 5x(x + 1)^2.$$

Let $x = 20^h$, where h is odd. The result is a factorization for every
number $20^n - 1$ with n = 5h, that is, n ≡ 5 (mod 10). We leave the
details to the reader.
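
Examples 4.6 to 4.8 can be checked numerically. The sketch below returns the two factors in each case, writing h = 2j − 1. For base 20 it uses $5 \cdot 20^h = (2^h 5^j)^2$, one way of filling in the details left to the reader; the function names are ours.

```python
def factors_12(j):
    """Aurifeuillian factors of Phi_6(12^h), h = 2j - 1 (Example 4.6)."""
    h = 2 * j - 1
    x, r = 12 ** h, 2 ** h * 3 ** j
    return x - r + 1, x + r + 1


def factors_18(j):
    """Aurifeuillian factors of Phi_4(18^h) = 18^(2h) + 1, h = 2j - 1 (Example 4.7)."""
    h = 2 * j - 1
    x, r = 18 ** h, 2 ** j * 3 ** h
    return x - r + 1, x + r + 1


def factors_20(j):
    """Aurifeuillian factors of Phi_5(20^h), h = 2j - 1 (Example 4.8)."""
    h = 2 * j - 1
    x, r = 20 ** h, 2 ** h * 5 ** j          # r^2 = 5 * 20^h
    a = x * x + 3 * x + 1
    return a - r * (x + 1), a + r * (x + 1)


for j in range(1, 6):
    h = 2 * j - 1
    l, m = factors_12(j)
    assert l * m * (12 ** h + 1) == 12 ** (3 * h) + 1
    l, m = factors_18(j)
    assert l * m == 18 ** (2 * h) + 1
    l, m = factors_20(j)
    assert l * m * (20 ** h - 1) == 20 ** (5 * h) - 1

print(factors_12(1), factors_18(1), factors_20(1))
```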

In the Cunningham tables, the Aurifeuillian factor with the minus
sign is labeled “L” and the one with the plus sign is labeled
“M.” In the base 3 Cunningham table, the primitive part $\Phi_6(3^1)$
has Aurifeuillian factorization 3L.3M = 1.7, so the table omits “3L”
and writes simply (3) for 7 in references to this primitive part. But
$\Phi_6(3^3)$ = 9L.9M = 19.37. With this notation, Table 2 becomes Table
3. Since the “m” is easily inferred from the line of the table, the actual
Cunningham tables write “mL” and “mM” simply as “L” and “M.”

Table 3. Numbers $3^m + 1$ factored.

   m   $3^m + 1$                 m   $3^m + 1$
   1   2.2                      16   2∗.21523361
   2   2∗.5                     17   (1) 103.307.1021
   3   (1) 7                    18   (2, 6) 530713
   4   2∗.41                    19   (1) 2851.101917
   5   (1) 61                   20   (4) 42521761
   6   (2) 73                   21   (1, 7) 21L.21M
   7   (1) 547                  21L  (3) 7∗.43
   8   2∗.17.193                21M  2269
   9   (1, 3) 9L.9M             22   (2) 5501.570461
   9L  19                       23   (1) 23535794707
   9M  37                       24   (8) 97.577.769
  10   (2) 5∗.1181              25   (1, 5) 151.22996651
  11   (1) 67.661               26   (2) 53.4795973261
  12   (4) 6481                 27   (1, 3, 9) 27L.27M
  13   (1) 398581               27L  19441
  14   (2) 29.16493             27M  19927
  15   (1, 5) 15L.15M           28   (4) 430697.647753
  15L  (3) 31                   29   (1) 523.6091.5385997
  15M  271                      30   (2, 6, 10) 47763361

After he proved the existence of Aurifeuillian factorizations, Schinzel
[Sch62] proved an analogue to Bang’s Theorem 3.19. He showed
that when $b^m \pm 1$ has an Aurifeuillian factorization, there are at least
two primitive prime factors except for a small number of exceptional
cases when the “L” Aurifeuillian factor is 1 or an intrinsic factor.
These cases are $3^3 + 1$, mentioned above with 3L = 1 and the first
three entries in the table for $2^n + 1$ with n = 4k − 2: 2L = 1, 6L = 1,
and 10L = 5, an intrinsic factor. Actually, Schinzel [Sch62] solved
the more general problem of finding the Aurifeuillian factorizations
of $a^n - b^n$ and determining when one of the factors is 1 or intrinsic.

4.2. Perfect Numbers


Euler studied perfect numbers and proved the following theorems
about 250 years ago.

Lemma 4.9. If M | N, then σ(M)/M ≤ σ(N)/N. If M | N and
M < N, then σ(M)/M < σ(N)/N. If M | N and σ(M) > 2M, then
N is not perfect.

Proof. If M | N, we have

$$\frac{\sigma(M)}{M} = \sum_{d \mid M} \frac{1}{d} \le \sum_{d \mid N} \frac{1}{d} = \frac{\sigma(N)}{N}.$$

If also M < N, then the inequality is strict because every term in the
first sum is in the second sum, while 1/N appears in the second sum
but not in the first. The third statement is immediate from the first
one. □

The first statement of the next theorem is proved in Euclid’s


Elements.

Theorem 4.10 (Euclid and Euler). If 2n −1 is prime, then 2n−1 (2n −


1) is perfect. Every even perfect number has the form 2n−1 (2n − 1),
where 2n − 1 is prime.

Proof. If $p = 2^n - 1$ is prime, then $\gcd(2^{n-1}, p) = 1$ and we have
\[
\sigma(2^{n-1} p) = \sigma(2^{n-1})\sigma(p) = (2^n - 1)(p + 1) = (2^n - 1)2^n = 2 \cdot 2^{n-1} p,
\]
so $2^{n-1} p$ is perfect.
Conversely, suppose N is an even perfect number. Write
$N = 2^{n-1} m$, where m is odd. Then n ≥ 2 because N is even. Since
$\gcd(2^{n-1}, m) = 1$, we have
\[
2^n m = 2N = \sigma(N) = \sigma(2^{n-1})\sigma(m) = (2^n - 1)\sigma(m).
\]
Then $2^n - 1$ divides m because $2^n - 1$ and $2^n$ are relatively prime.
Using Lemma 4.9, we find
\[
\frac{2^n}{2^n - 1} = \frac{\sigma(m)}{m} \ge \frac{\sigma(2^n - 1)}{2^n - 1}
\ge \frac{(2^n - 1) + 1}{2^n - 1} = \frac{2^n}{2^n - 1}.
\]
Then in fact equality must hold throughout the expression so that,
first, m must equal $2^n - 1$ and, second, $2^n - 1$ must be prime. Thus
$N = 2^{n-1}(2^n - 1)$, where $2^n - 1$ is prime. □
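
Theorem 4.10 can be checked numerically for small exponents. The following Python sketch is only an illustration: the helper names `sigma`, `is_prime`, `euclid_perfect_numbers`, and `even_perfect_numbers_below` are invented here, and both helpers use plain trial division, which is adequate only for small arguments.

```python
def sigma(n):
    """Sum of all positive divisors of n, by trial division (small n only)."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def is_prime(n):
    """Trial-division primality test, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def euclid_perfect_numbers(max_n):
    """Return 2^(n-1) * (2^n - 1) for each n <= max_n with 2^n - 1 prime."""
    return [2 ** (n - 1) * (2 ** n - 1)
            for n in range(2, max_n + 1) if is_prime(2 ** n - 1)]


def even_perfect_numbers_below(limit):
    """Brute-force search for even N < limit with sigma(N) = 2N."""
    return [N for N in range(2, limit, 2) if sigma(N) == 2 * N]


print(euclid_perfect_numbers(13))
print(even_perfect_numbers_below(10000))
```

Both directions of the theorem show up in the output: every number produced by Euclid's formula is perfect, and the brute-force search below 10000 finds no even perfect number outside that list.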
Theorem 4.11 (Euler). If N is an odd perfect number, then
$N = p_0^{a_0} Q^2$, where $p_0$ is an odd prime, $p_0 \equiv a_0 \equiv 1 \pmod 4$, and Q
is odd. In other words, an odd perfect number N has the standard
factorization $N = \prod_{i=0}^{t} p_i^{a_i}$, where $p_0 \equiv a_0 \equiv 1 \pmod 4$ and $a_i$ is
even for i = 1, . . ., t.

Before we prove Theorem 4.11, we present some facts about the
parity of σ(m). If p is prime, then by Theorem 2.45
\[
\sigma(p^n) = 1 + p + \cdots + p^n = \frac{p^{n+1} - 1}{p - 1}.
\]
When p = 2, only the first term in the sum is odd, so $\sigma(2^n)$ is odd
for all n ≥ 1. When p is an odd prime, every one of the n + 1 terms is
odd, so $\sigma(p^n)$ is odd if and only if n + 1 is odd, that is, if and only if
n is even. Finally, if $m = \prod_{i=1}^{t} p_i^{a_i}$ is odd, then $\sigma(m) = \prod_{i=1}^{t} \sigma(p_i^{a_i})$
is odd if and only if each $\sigma(p_i^{a_i})$ is odd, that is, if and only if each $a_i$
is even, that is, if and only if m is an odd square.

Proof. Now we prove Euler’s Theorem 4.11. Let N be an odd perfect
number. By Theorem 2.11, N has a standard factorization
$N = \prod_{i=0}^{t} p_i^{a_i}$, where the $p_i$ are odd primes. Since σ(N) = 2N, the prime
factors of σ(N) must be the same set as the prime factors of N, except
that σ(N) has exactly one factor of 2 while N has no factor of 2. Since
$\sigma(N) = \prod_{i=0}^{t} \sigma(p_i^{a_i})$, exactly one of the factors $\sigma(p_i^{a_i})$ is even. Let
the even one be $\sigma(p_0^{a_0})$. Since $\sigma(p_i^{a_i})$ is odd for i = 1, . . ., t, $a_i$ must
be even for i = 1, . . ., t. Thus, $\prod_{i=1}^{t} p_i^{a_i}$ is an odd square $Q^2$.
What about $p_0^{a_0}$ with exactly one factor of 2 in $\sigma(p_0^{a_0})$? Odd
primes are either 4k + 1 or 4k − 1. Suppose first that $p_0 = 4k - 1$. Then
the powers $p_0^i$ alternate between −1 and +1 modulo 4. Therefore,
\[
\sigma(p_0^{a_0}) = 1 + p_0 + \cdots + p_0^{a_0} \equiv 1 \text{ or } 0 \pmod 4,
\]
according as $a_0$ is even or odd. It is never ≡ 2 (mod 4), so $\sigma(p_0^{a_0})$ can
never have exactly one factor of 2. Therefore, $p_0$ must have the form
4k + 1. Its powers $p_0^i$ are all ≡ 1 (mod 4), so
\[
\sigma(p_0^{a_0}) = 1 + p_0 + \cdots + p_0^{a_0} \equiv a_0 + 1 \pmod 4.
\]
This shows that $a_0 + 1 \equiv 2 \pmod 4$, which means $a_0 \equiv 1 \pmod 4$. □

We will give an example below to show how Euler’s Theorem 4.11
may be used with a lot of factoring (of $b^n - 1$) to prove a theorem
about odd perfect numbers.
The proof uses the factor chain method. It chooses an upper
bound for a hypothetical odd perfect number N and begins by assuming
a particular prime power factor $p^a$ of N. Since $\sigma(p^a) \mid \sigma(N) = 2N$,
each odd prime factor q of $\sigma(p^a)$ must divide N. Assume an exponent
b for q in N. Then the odd prime factors r of $\sigma(q^b)$ must divide N.
This process continues until a contradiction is reached (or an odd
perfect number constructed). Then one backtracks and changes the most
recent assumption. Eventually, every assumption is contradicted and
the theorem is proved. The sequence of primes p, q, r, . . . is the factor
chain.
Recall that $p^a \parallel N$ means that $p^a \mid N$ but $p^{a+1} \nmid N$.
We will use the following lemma often in the example below.

Lemma 4.12. If p is prime, a ≥ 2, and $p^a$ divides the odd perfect
number N, then $N > p^{2a}$.

Proof. We may assume that $p^a \parallel N$. Write $N = p^a M$, where $p \nmid M$.
We have $\gcd(p^a, \sigma(p^a)) = 1$ because 1 is the only divisor of $p^a$ that
is not a multiple of p. Then $p^a \mid \sigma(M)$ since $2p^a M = 2N = \sigma(N) =
\sigma(p^a)\sigma(M)$ and $\gcd(p^a, \sigma(p^a)) = 1$. If $p^a < \sigma(M)$, we are done
since then $p^a \le \sigma(M)/3 < 2M/3$, so $p^{2a} < 2N/3 < N$. (We have
σ(M) < 2M by Lemma 4.9.)
It suffices to show that σ(M) has a prime factor other than p
because then $\sigma(M) \ne p^a$, so $p^a < \sigma(M)$. By Bang’s Theorem 3.19,
$p^{a+1} - 1$ has a primitive prime factor q, so $q \mid \sigma(p^a)$. Since a > 0,
we have q ≠ 2 and q ≠ p, so $q \mid M$, say $q^b \parallel M$. We may assume that
$\sigma(q^b) = p^c$ for some c, since otherwise N has yet another prime factor
and we are done. Then c ≤ a, so c < a + 1. But the equation $\sigma(q^b) = p^c$ with
c < a + 1 implies $p^c \equiv 1 \pmod q$. That is, $q \mid (p^c - 1)$ with c < a + 1,
so q is not a primitive factor of $p^{a+1} - 1$. Therefore, $\sigma(q^b) \ne p^c$ and
σ(M) has a prime factor r other than p. □

Example 4.13. Let us prove that there is no odd perfect number
$N < 10^6$. This limit is so small that one could compute σ(N) for
every odd $N < 10^6$ and check whether σ(N) = 2N. But our proof
illustrates some techniques used when $10^6$ is replaced by a limit with
hundreds of decimal digits.
Assume that N is an odd perfect number $< 10^6$. We show first
that the smallest prime factor of N is 3. If it exceeds 3, then N must
have at least four different prime factors since if it had no more than
three of them, then
\[
2 = \frac{\sigma(N)}{N} = \prod_{p^a \parallel N} \frac{p^{a+1} - 1}{p^a (p - 1)}
  = \prod_{p^a \parallel N} \frac{p - p^{-a}}{p - 1}
  < \prod_{p \mid N} \frac{p}{p - 1}
  \le \frac{5}{4} \cdot \frac{7}{6} \cdot \frac{11}{10} < 2,
\]
which is impossible. By Euler’s Theorem 4.11, we have that
$N \ge (5 \cdot 7 \cdot 11)^2 \cdot 13 > 10^6$. Therefore, N can’t have four different prime
factors and the smallest prime factor of N must be 3.
Now we need to show that if $3 \mid N$, then $N > 10^6$. By Theorem
4.11, if $3^a \parallel N$, then a is even.
Suppose first that $3^2 \parallel N$. Then $\sigma(3^2) = 13 \mid N$. If $p_0 = 13$, then,
since $\sigma(13^1) = 14 = 2 \cdot 7$, we have $7 \mid N$. Suppose first that $7^2 \parallel N$. We
have $\sigma(7^2) = 57 = 3 \cdot 19$, so $19 \mid N$. But then $N \ge (3 \cdot 7 \cdot 19)^2 \cdot 13 > 10^6$.
If $7^4 \mid N$, then $N > 7^8 > 10^6$ by Lemma 4.12. This contradiction
shows that $13^1 \parallel N$ is false, so 13 is not $p_0$. We now try 13 as $p_1$ and
assume $13^2 \parallel N$. Then $\sigma(13^2) = 183 = 3 \cdot 61$, so $61 \mid N$. Possibly,
$p_0 = 61$. But then, since $\sigma(61^1) = 62 = 2 \cdot 31$, we have $31 \mid N$ and
$N \ge (3 \cdot 13 \cdot 31)^2 \cdot 61 > 10^6$, another contradiction. If either $61^2 \mid N$
or $13^4 \mid N$, then $N > 10^6$ by Lemma 4.12. The result of all this is
that it is false that $3^2 \parallel N$.
Next suppose that $3^4 \parallel N$. Then $\sigma(3^4) = 11^2 \mid N$. We have
$\sigma(11^2) = 7 \cdot 19$, so $N \ge 3^4 (7 \cdot 11 \cdot 19)^2 > 10^6$ if $11^2 \parallel N$. If $11^4 \mid N$,
then $N > 11^8 > 10^6$ by Lemma 4.12.
Now suppose that $3^6 \parallel N$. Then $\sigma(3^6) = 1093 \mid N$. If $p_0 = 1093$,
then, since $\sigma(1093^1) = 2 \cdot 547$, we have $N \ge 3^6 \cdot 547^2 \cdot 1093 > 10^6$.
Lemma 4.12 shows that we cannot have $1093^2 \mid N$. Therefore, we
cannot have $3^6 \parallel N$. Finally, we cannot have $3^8 \mid N$ by Lemma 4.12.
This completes the proof that there is no odd perfect number $< 10^6$.
This example is due to G. L. Cohen.
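
The direct computation mentioned at the start of Example 4.13 is easy to sketch. The Python code below is an illustration only; the function names are invented here, and the sieve simply adds each odd d to σ of every odd multiple of d instead of factoring each N.

```python
def odd_sigma_table(limit):
    """Return a list t with t[n] = sigma(n) for every odd n < limit.

    Entries at even indices are left at 0 and are not meaningful.
    """
    table = [0] * limit
    for d in range(1, limit, 2):
        # The odd multiples of d are d, 3d, 5d, ...
        for m in range(d, limit, 2 * d):
            table[m] += d
    return table


def odd_perfect_numbers_below(limit):
    """List every odd N < limit with sigma(N) = 2N."""
    table = odd_sigma_table(limit)
    return [n for n in range(1, limit, 2) if table[n] == 2 * n]


# Takes a few seconds in pure Python for limit = 10**6.
print(odd_perfect_numbers_below(10**6))
```

The printed list is empty, in agreement with the example. The sieve needs memory proportional to the limit, so it does not scale to the bounds with hundreds of digits that the factor chain method addresses.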

We now mention some other ideas used in a proof that there is
no odd perfect number $< 10^L$ for some large L.
First, we can make L quite large even if the proof eliminates just
a few small primes as possible factors of N. For example, eliminating
3 as a factor of N shows that $N > 10^{13}$. This works by showing first
that if $3 \nmid N$, then N must have at least seven different prime factors,
for if N has only six prime factors, then, as in the example,
\[
2 < \prod_{p \mid N} \frac{p}{p - 1}
  \le \frac{5}{4} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{12} \cdot \frac{17}{16} \cdot \frac{19}{18} < 2.
\]
Then $N \ge (5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19)^2 \cdot 23 > 10^{13}$. The product of fractions
is approximately 1.95. If the next factor 23/22 were included, the
product would be about 2.04. Therefore, L = 13 is the greatest L
for which one can prove $N > 10^L$ in this straightforward way by
eliminating only the prime factor 3 from N.
To finish the proof that $N > 10^{13}$, one would consider factor
chains beginning with $3^a$ for a = 2, 4, . . ., 12 because if $3^{14} \mid N$, then
$N > 3^{28} > 10^{13}$ by Lemma 4.12.
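
The arithmetic behind the bound $N > 10^{13}$ can be reproduced exactly with rational numbers. This is a small illustrative sketch; the function name `abundancy_bound` is made up for this example.

```python
from fractions import Fraction


def abundancy_bound(primes):
    """Exact value of the product of p/(p-1) over the given primes."""
    bound = Fraction(1)
    for p in primes:
        bound *= Fraction(p, p - 1)
    return bound


six_primes = [5, 7, 11, 13, 17, 19]

# With six primes the product stays below 2; adding 23 pushes it above 2.
print(float(abundancy_bound(six_primes)))
print(float(abundancy_bound(six_primes + [23])))

# Lower bound for N when 3 does not divide N and N has seven prime factors.
product = 1
for p in six_primes:
    product *= p
lower_bound = product**2 * 23
print(lower_bound, lower_bound > 10**13)
```

Using `Fraction` avoids any doubt about rounding when comparing the products with 2.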
The first of these factor chains would postulate $3^2 \parallel N$. Then
$\sigma(3^2) = 13 \mid N$. If $13^2 \parallel N$, then $\sigma(13^2) = 3 \cdot 61 \mid N$. If $61^2 \parallel N$,
then $\sigma(61^2) = 3 \cdot 13 \cdot 97 \mid N$. If $97^2 \parallel N$, then $\sigma(97^2) = 3 \cdot 3169 \mid N$.
But now we have too many 3’s: $3^3 \mid \sigma(13^2 \cdot 61^2 \cdot 97^2)$, contradicting the
assumption that $3^2 \parallel N$. Therefore, we cannot have $97^2 \parallel N$, etc. This
kind of contradiction did not occur in Example 4.13.
Another factor chain in the same proof would begin with $3^4 \parallel N$.
Then $\sigma(3^4) = 11^2 \mid N$. If $11^4 \parallel N$, then $\sigma(11^4) = 5 \cdot 3221 \mid N$. Suppose
$5^2 \mid N$. Let $M = 3^4 \cdot 5^2 \cdot 11^4$. Then $M \mid N$. But σ(M) > 2M,
so by Lemma 4.9, N cannot be perfect. This is another kind of
contradiction that did not occur in Example 4.13.
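
Both kinds of contradiction in these two factor chains can be confirmed with a few lines of arithmetic. The sketch below is illustrative only; `sigma_prime_power` is a helper name chosen here, using the closed form for $\sigma(p^a)$.

```python
def sigma_prime_power(p, a):
    """sigma(p^a) = (p^(a+1) - 1) / (p - 1) for a prime p."""
    return (p ** (a + 1) - 1) // (p - 1)


# First chain: 13^2, 61^2 and 97^2 together force too many factors of 3.
sigma_product = 1
for p in (13, 61, 97):
    sigma_product *= sigma_prime_power(p, 2)

power_of_three = 0
while sigma_product % 3 == 0:
    sigma_product //= 3
    power_of_three += 1
print(power_of_three)  # exponent of 3 in sigma(13^2 * 61^2 * 97^2)

# Second chain: M = 3^4 * 5^2 * 11^4 is already abundant.
M = 3**4 * 5**2 * 11**4
sigma_M = (sigma_prime_power(3, 4) * sigma_prime_power(5, 2)
           * sigma_prime_power(11, 4))
print(sigma_M, 2 * M, sigma_M > 2 * M)
```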
If one eliminates the two primes 3 and 5 as possible factors of N,
then N must have at least 15 different prime factors and $N \ge 10^{41}$.
Once a prime factor of N has been eliminated, its appearance in a
later factor chain is a contradiction.
Note that if $(a + 1) \mid (b + 1)$, then $\sigma(p^a) \mid \sigma(p^b)$ by Exercise
3.9(c) of Chapter 3. Therefore, it suffices to consider factor chains
that begin $p^a \parallel N$ with a + 1 prime.
Primes need not be eliminated in increasing order as possible factors
of N. Convenience and simplicity led Brent and Cohen [BC89]
to consider and eliminate the primes 127, 19, 7, 11, 31, 13, 3, 5 in
that order in their proof that $N > 10^{160}$.
Recently, Ochem and Rao [OR12] proved that every odd perfect
number is $> 10^{1500}$ and has at least 101 prime factors, counted with
multiplicity. Nielsen [Nie07] showed that an odd perfect number
must have at least nine distinct prime factors, and at least twelve of
them if it is not a multiple of 3.

4.3. Harmonic Numbers


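The harmonic mean of k positive real numbers $a_1, \ldots, a_k$ is
\[
\left( \frac{1}{k} \sum_{i=1}^{k} \frac{1}{a_i} \right)^{-1}.
\]
For a positive integer n, define H(n) to be the harmonic mean of the
positive divisors of n:
\[
H(n) = \left( \frac{1}{d(n)} \sum_{d \mid n} \frac{1}{d} \right)^{-1}.
\]
Since
\[
n \sum_{d \mid n} \frac{1}{d} = \sum_{d \mid n} \frac{n}{d} = \sum_{d \mid n} d = \sigma(n),
\]
we have
\[
H(n) = \left( \frac{1}{d(n)} \cdot \frac{\sigma(n)}{n} \right)^{-1} = \frac{n\,d(n)}{\sigma(n)},
\]
which is a multiplicative function because each of n, d(n), σ(n) is
multiplicative.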
Definition 4.14. A positive integer n is a harmonic number if H(n)
is an integer.

Example 4.15. H(1) = 1 and H(6) = 6 · 4/12 = 2, so 1 and 6 are
harmonic numbers. But H(2) = 2 · 2/3 = 4/3 and H(4) = 4 · 3/7 =
12/7, so 2 and 4 are not harmonic numbers.
The harmonic numbers ≤ 10000 are 1, 6, 28, 140, 270, 496, 672,
1638, 2970, 6200, 8128, 8190.
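
The list in Example 4.15 can be regenerated from the formula H(n) = n d(n)/σ(n). The Python sketch below is an illustration with invented function names; it finds d(n) and σ(n) by trial division, which is fine for n ≤ 10000 but far too slow for a large table.

```python
from fractions import Fraction


def divisor_count_and_sum(n):
    """Return (d(n), sigma(n)) by trial division up to sqrt(n)."""
    count = total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            count += 1
            total += d
            if d != n // d:
                count += 1
                total += n // d
        d += 1
    return count, total


def harmonic_mean_of_divisors(n):
    """H(n) = n * d(n) / sigma(n), as an exact fraction."""
    count, total = divisor_count_and_sum(n)
    return Fraction(n * count, total)


def harmonic_numbers_up_to(limit):
    """All n <= limit for which H(n) is an integer."""
    return [n for n in range(1, limit + 1)
            if harmonic_mean_of_divisors(n).denominator == 1]


print(harmonic_numbers_up_to(10000))
```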

Harmonic numbers generalize perfect numbers.

Theorem 4.16 (Ore). Every perfect number is a harmonic number.

Proof. Let n be perfect. Then σ(n) = 2n, so
\[
H(n) = \frac{n\,d(n)}{\sigma(n)} = \frac{n\,d(n)}{2n} = \frac{d(n)}{2}.
\]
Now d(n) is an even number unless n is square. But by Euler’s
Theorems 4.10 and 4.11 no square can be perfect. Therefore, d(n) is even
when n is perfect, and so H(n) is an integer. □

If p and $2^p - 1$ are prime, so that $n = 2^{p-1}(2^p - 1)$ is perfect,
then H(n) = d(n)/2 = p.
Another way that harmonic numbers generalize perfect numbers
is that if one ignores the “trivial” harmonic number 1, then it is not
known whether there are any odd harmonic numbers. Ore [Ore48]
conjectured that there is no odd harmonic number > 1.
Some harmonic numbers are not perfect. For example, H(140) =
5, H(270) = 6, and H(672) = 8. It is not known whether there are
infinitely many harmonic numbers. Many number theorists believe
that there are infinitely many Mersenne primes. If this is true, then
there are infinitely many (even) perfect numbers and so infinitely
many harmonic numbers. But it might be easier to prove that there
are infinitely many harmonic numbers than to show that there are
infinitely many perfect numbers.
In Section B2 of [Guy04] Guy asked what values H(n) can have.
It is known that H(n) cannot equal 4 or 12 but can equal any other
integer between 1 and 15 and many larger integers. See Goto and
Okeya [GO07] for a list of possible values of H(n) ≤ 1200.
The study of these and related questions might be aided by a table
of all harmonic numbers up to some limit. Since H(n) is defined in
terms of d(n) and σ(n), the construction of such a table involves some
factoring. Currently, one can compute these arithmetic functions and
find all the harmonic numbers up to about $10^{12}$ in this straightforward
way. Of course, one can determine whether an integer n of any size is
harmonic provided one can factor that n. A more sophisticated way
of computing a table of harmonic numbers is based on the following
two lemmas.

Lemma 4.17. For each real number A, only a finite number of harmonic
numbers n satisfy $H(n)^A > n$.

Since the harmonic mean of positive real numbers is always less
than or equal to their geometric mean and since the geometric mean
of the positive divisors of n is $\sqrt{n}$, there is no harmonic number n
with $H(n) > \sqrt{n}$. Hence Lemma 4.17 holds trivially when A ≤ 2.
See Theorem 1.4 of Goto and Okeya [GO07] for a proof of Lemma
4.17.

Lemma 4.18. For each real number B, only a finite number of harmonic
numbers n satisfy H(n) ≤ B.

Lemma 4.18 follows from the theorem of Kanold [Kan57] that
for each real number c, only a finite number of n satisfy H(n) = c.
Lemmas 4.17 and 4.18 are copied from [GO07] and might also be
expressed as $H(n) = O(n^{\varepsilon})$ for any ε > 0 and $\lim_{n \to \infty} H(n) = \infty$.
Goto and Okeya [GO07] give algorithms for finding the finite lists
of harmonic numbers whose H(n) satisfies the inequality of each of
these lemmas. The algorithms take longer as A and B increase. Given
the finite lists of the lemmas for particular A and B, it is relatively
easy to find all harmonic numbers $n \le B^A$ as follows. If H(n) ≤ B,
then n is on the list for Lemma 4.18. Otherwise, H(n) > B and so
$H(n)^A > B^A \ge n$, and n is on the list for Lemma 4.17. Therefore,
every harmonic number $n \le B^A$ appears in the union of the lists for
the two lemmas.
Goto and Okeya [GO07] did this for A = 4.55 and B = 1200.
They found all 937 harmonic numbers less than $10^{14}$. (Here
$1200^{4.55} \approx 1.02 \cdot 10^{14}$.) Their algorithm to construct the list for Lemma 4.18
uses the equality H(n) = c to limit the possible exponent vectors
$(e_1, \ldots, e_r)$ in $n = p_1^{e_1} \cdots p_r^{e_r}$ for each 1 ≤ c ≤ 1200 to a finite set of
vectors. Then it tests values of appropriate size for the primes $p_1, \ldots, p_r$.
Here is one example from their paper.

Example 4.19. Suppose the exponent vector is (100, 2, 1), c = 303
and we are trying $p_1 = 5$. We will show that there is no harmonic
number $n = 5^{100} p_2^2 p_3$ with primes $p_2 \ne p_3$ and $\gcd(p_2 p_3, 10) = 1$.
The equation $H(n) = H(5^{100} p_2^2 p_3) = 303$ implies (since H(n) is
multiplicative)
\[
H(p_2^2 p_3) = \frac{303}{H(5^{100})} = \frac{303}{101 \cdot 5^{100} / \sigma(5^{100})} = \frac{3\,\sigma(5^{100})}{5^{100}}.
\]
But $d(p_2^2 p_3) = 6$, so
\[
H(p_2^2 p_3) = \frac{d(p_2^2 p_3)\, p_2^2 p_3}{\sigma(p_2^2 p_3)} = \frac{6\, p_2^2 p_3}{\sigma(p_2^2 p_3)},
\]
and $\sigma(5^{100})$ must divide $p_2^2 p_3$. However, the Cunningham table gives
$\sigma(5^{100})$ as the product of the three different primes 5937018283241,
3434487311396589821473854121, 483593153887747265029536907421,
so it cannot divide a number of the form $p_2^2 p_3$.

4.4. Prime Proving


When one proves that a large integer m is prime using Theorem
3.27, one must factor m − 1. When m divides $b^n \pm 1$, sometimes
the Cunningham table in which m appears helps with this factoring
and sometimes it does not. Here are three examples to illustrate this
phenomenon.

Example 4.20. According to the Cunningham table,
$2^{1004} + 1 = (2^4 + 1) \cdot P302$, where P302 is a prime number with 302 decimal digits.
The number P302 − 1 has several large prime factors which complicate
its factorization by a direct approach with simple methods. The task
becomes easier if one notices that
\[
P302 - 1 = \frac{2^{1004} + 1}{2^4 + 1} - 1 = \frac{2^{1004} - 2^4}{17} = \frac{16}{17} (2^{1000} - 1),
\]
and the factors of $2^{1000} - 1$ appear in four base 2 Cunningham tables.
In fact,
\[
P302 - 1 = 16 \cdot \frac{2^{500} + 1}{17} \cdot (2^{250} + 1) \cdot (2^{125} + 1) \cdot (2^{125} - 1).
\]
Using Equation (4.1), the factor $2^{250} + 1$ splits into two nearly equal
Aurifeuillian factors. The second largest prime factor of each of the
five pieces is small enough to be discovered by simple methods. With
these preliminary algebraic factorizations, the complete factorization
of P302 − 1 is easy to obtain and the proof that P302 is prime via
Theorem 3.27 is easy to finish.
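
The algebra in Example 4.20 can be confirmed with Python's arbitrary-precision integers. This sketch checks only the algebraic identities, not the primality of P302. Equation (4.1) is not reproduced in this section, so the Aurifeuillian split below assumes the usual identity $2^{4k-2} + 1 = (2^{2k-1} - 2^k + 1)(2^{2k-1} + 2^k + 1)$ with k = 63; the variable names are chosen for this illustration.

```python
# Example 4.20: P302 = (2^1004 + 1) / (2^4 + 1).
P302 = (2**1004 + 1) // (2**4 + 1)

# Algebraic pieces of P302 - 1 = 16 * (2^1000 - 1) / 17.
pieces = [
    16,
    (2**500 + 1) // 17,  # 17 divides 2^500 + 1
    2**250 + 1,
    2**125 + 1,
    2**125 - 1,
]

product = 1
for piece in pieces:
    product *= piece

# Assumed form of the Aurifeuillian split of 2^(4k-2) + 1 with k = 63.
k = 63
L = 2**(2 * k - 1) - 2**k + 1
M = 2**(2 * k - 1) + 2**k + 1

print(product == P302 - 1)   # is the factorization of P302 - 1 exact?
print(L * M == 2**250 + 1)   # do the two Aurifeuillian factors multiply back?
print(len(str(P302)))        # number of decimal digits of P302
```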
Example 4.21. The line for $2^{800} + 1$ in the Cunningham table reads
800 (32, 160) P193.
Write $t = 2^{160}$ for brevity. The table entry says that $2^{800} + 1 =
t^5 + 1 = (t + 1) \cdot P193$, where $P193 = t^4 - t^3 + t^2 - t + 1$ is a 193-digit
prime. To use Theorem 3.27 to show that P193 is prime, we need to
factor P193 − 1. We have
\[
P193 - 1 = t^4 - t^3 + t^2 - t = t(t - 1)(t^2 + 1),
\]
and the numbers $t - 1 = 2^{160} - 1$ and $t^2 + 1 = 2^{320} + 1$ are factored
in the Cunningham tables, so P193 − 1 is completely factored.
Example 4.22. In 1978, Williams [Wil78] proved that the repunit
$R317 = (10^{317} - 1)/9$ is prime. His first step was to factor
\[
R317 - 1 = (10^{317} - 10)/9 = 10 \cdot \frac{10^{79} - 1}{9} \cdot (10^{79} + 1) \cdot (10^{158} + 1).
\]
These numbers are all factored completely in the present Cunningham
table and the primality proof by Theorem 3.27 is routine. Back
in 1978, only tiny prime factors of $10^{158} + 1$ were known, and the
remaining cofactor was composite, so Williams had to use a theorem
like Theorem 3.29 to complete the prime proof. See [Wil78] for
details. In 1986, Williams and Dubner [WD86] proved that N = R1031
is prime. They used prime factors of $N \pm 1$, $N^2 + 1$, and $N^2 \pm N + 1$
in their proof.

This fourth example requires factoring, too, but this time the
Cunningham tables do not help.

Example 4.23. The base 12 Cunningham table claims that
$12^{128} + 1 = 257 \cdot P136$. The second largest prime factor of P136 − 1 has 30
decimal digits, which might make it hard to find. The number P136
is interesting because it is not the entire primitive part that must be
proved prime. However, we can still do some algebraic factoring:
\[
P136 - 1 = \frac{12^{128} + 1}{257} - 1 = \frac{12^{128} - 2^8}{257},
\]
and we have, writing $s = 12^{16}$ for brevity,
\[
P136 - 1 = \frac{s^8 - 2^8}{257} = \frac{s^4 + 2^4}{257} (s^2 + 2^2)(s + 2)(s - 2).
\]
Each of the four factors is easy to factor into primes and then it is
easy to finish the proof by Theorem 3.27 that P136 is prime. This
example is due to R. deVogelaere.
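
The algebraic identities used in Examples 4.21 through 4.23 are exact integer identities, so they can be checked directly. The following Python sketch is an illustration only; it verifies the algebra and the digit counts, not the primality of the numbers involved, and the variable names are chosen here.

```python
# Example 4.21: t = 2^160 and P193 = t^4 - t^3 + t^2 - t + 1.
t = 2**160
P193 = (2**800 + 1) // (t + 1)
ex21_ok = (P193 == t**4 - t**3 + t**2 - t + 1
           and P193 - 1 == t * (t - 1) * (t**2 + 1))

# Example 4.22: the repunit R317 = (10^317 - 1) / 9.
R317 = (10**317 - 1) // 9
ex22_ok = (R317 - 1
           == 10 * ((10**79 - 1) // 9) * (10**79 + 1) * (10**158 + 1))

# Example 4.23: s = 12^16 and P136 = (12^128 + 1) / 257.
s = 12**16
P136 = (12**128 + 1) // 257
ex23_ok = (P136 - 1
           == ((s**4 + 2**4) // 257) * (s**2 + 2**2) * (s + 2) * (s - 2))

print(ex21_ok, ex22_ok, ex23_ok)
print(len(str(P193)), len(str(P136)))  # decimal digit counts
```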

4.5. Linear Feedback Shift Registers


Linear feedback shift registers (LFSRs) are devices used to generate
pseudorandom bit streams. They may be used for encryption, privacy,
error-correcting codes, and simulation. The devices are simple and
efficient. Usually, they are designed to produce a bit stream with
the longest possible period, which is $2^n - 1$ when the shift register
holds n bits. The period is longest exactly when the characteristic
polynomial for the LFSR is primitive. A polynomial of degree n over
$F_2$ is primitive if it is irreducible, it divides $x^{2^n - 1} + 1$, but it does
not divide $x^d + 1$ for any divisor d of $2^n - 1$ less than $2^n - 1$. This
condition is easy to check when the complete factorization of $2^n - 1$
is known, but it is impossible to check when one does not know all
the factors of $2^n - 1$. See Golomb [Gol82] for details and all results
in this section.
A linear feedback shift register consists of a shift register and
an exclusive-or gate. The shift register holds n bits. A clock ticks
whenever a new pseudorandom bit is needed. At each clock tick, the
bits shift one position to the right. The bit that exits the right end is
the next pseudorandom bit. The exclusive-or gate computes the sum
modulo 2 of the bits in selected positions of the shift register (before
the shift). The output of the exclusive-or gate enters the shift register
at the left end at each clock tick. Number the bit positions of the
shift register from 1 to n from left to right. Let the i-th bit position
hold the bit $a_i \in \{0, 1\}$ and let $t_i = 1$ if the i-th bit position is selected
as input to the exclusive-or gate and $t_i = 0$ if it is not selected. Then
the output of the exclusive-or gate is $a = \left( \sum_{i=1}^{n} a_i t_i \right) \bmod 2$. At each
clock tick, the content of the i-th bit position changes from $a_i$ to $a_{i-1}$
if 1 < i ≤ n, and the content of the left-most bit position changes
from $a_1$ to a.
The characteristic polynomial of the LFSR just described is
$f(x) = 1 + \sum_{i=1}^{n} t_i x^i$. There is a way to describe an LFSR with an n ×
n matrix. The characteristic polynomial of the LFSR is just the
characteristic polynomial of the matrix.

Example 4.24. Consider the LFSR with four bits and characteristic
polynomial $f(x) = x^4 + x^3 + 1$. This means that the two bits on the
right end are input to the exclusive-or gate (⊕). Suppose the initial
contents of the shift register are 0001 from left to right.

[Figure: the four-bit shift register holding 0 0 0 1, with the
exclusive-or gate (⊕) feeding the left end.]

Then the contents of the shift register at each successive clock tick
are shown in Table 4. The characteristic polynomial is primitive and
the period length is $2^4 - 1 = 15$, the maximum possible length for a
four-bit LFSR.

Table 4. Contents of a Linear Feedback Shift Register.

tick  a1 a2 a3 a4      tick  a1 a2 a3 a4      tick  a1 a2 a3 a4
  0    0  0  0  1        5    1  1  0  0       10    1  1  0  1
  1    1  0  0  0        6    0  1  1  0       11    1  1  1  0
  2    0  1  0  0        7    1  0  1  1       12    1  1  1  1
  3    0  0  1  0        8    0  1  0  1       13    0  1  1  1
  4    1  0  0  1        9    1  0  1  0       14    0  0  1  1
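
The register of Example 4.24 is small enough to simulate directly. The Python sketch below follows the description in this section (tap positions are numbered 1 to n from the left, the feedback bit enters at the left, and the bits shift right); the function names `lfsr_states` and `lfsr_period` are invented for this illustration.

```python
from itertools import islice


def lfsr_states(taps, state):
    """Yield successive register contents (a1, ..., an), starting with `state`.

    `taps` lists the 1-based positions i with t_i = 1.
    """
    state = tuple(state)
    while True:
        yield state
        feedback = sum(state[i - 1] for i in taps) % 2  # exclusive-or of taps
        state = (feedback,) + state[:-1]                # shift right by one


def lfsr_period(taps, state):
    """Number of clock ticks until the register first returns to `state`.

    Assumes position n is tapped, so that the update is invertible and the
    starting state really does recur.
    """
    start = tuple(state)
    states = lfsr_states(taps, start)
    next(states)  # skip the starting state itself
    for ticks, current in enumerate(states, start=1):
        if current == start:
            return ticks


# f(x) = x^4 + x^3 + 1 means t_3 = t_4 = 1.
for row in islice(lfsr_states([3, 4], (0, 0, 0, 1)), 15):
    print(*row)
print(lfsr_period([3, 4], (0, 0, 0, 1)))
```

The fifteen printed rows are the successive register contents of Table 4, and the output bit at each tick is the last entry of the row before the shift.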

Theorem 4.25. If an LFSR with n bits has maximum period, which
is $2^n - 1$, then its characteristic polynomial is irreducible.

Proof. For i ≥ 0 let $r_i$ be the value of the bit $a_n$ after i clock ticks.
Then $r_i = \sum_{j=1}^{n} t_j r_{i-j} \bmod 2$ for i ≥ n. Define the generating
function $G(x) = \sum_{i=0}^{\infty} r_i x^i$. A short calculation with the recursion
formula for the $r_i$ shows that G(x) = g(x)/f(x), where f(x) is the
characteristic polynomial (of degree n) and g(x) is a polynomial of
degree < n.
Now suppose that the characteristic polynomial is not irreducible,
that is, f(x) = s(x)t(x) for some polynomials s(x), t(x) of degrees $n_1$
and $n_2$ both < n with $n_1 + n_2 = n$. Assume s(x) ≠ t(x). Then G(x)
has the partial fraction decomposition
\[
G(x) = \frac{g(x)}{f(x)} = \frac{\alpha(x)}{s(x)} + \frac{\beta(x)}{t(x)},
\]
where degree α < degree s and degree β < degree t. Then α(x)/s(x)
and β(x)/t(x) are generating functions for two LFSRs with (minimum)
periods dividing $2^{n_1} - 1$ and $2^{n_2} - 1$, respectively. The period
of the LFSR with generating function G(x) cannot exceed the least
common multiple of the periods of the two new LFSRs. Therefore,
\[
2^n - 1 \le \operatorname{lcm}(2^{n_1} - 1, 2^{n_2} - 1) \le (2^{n_1} - 1)(2^{n_2} - 1) \le 2^n - 3,
\]
a contradiction. The case s(x) = t(x) also leads to a contradiction.
Hence the assumption that f(x) is not irreducible was wrong. □

Here is an algorithm to test whether a polynomial $f(x) \in F_2[x]$ is
primitive. Since f must have constant term 1, f(x) divides $x^{2^n - 1} + 1$
if and only if f(x) divides $x^{2^n} + x$. The first step of the algorithm
is to compute $x^{2^n} \bmod f(x)$. If this polynomial is x, then f(x) is
irreducible and might be primitive. If f(x) passes this test and $2^n - 1$
is prime, then f(x) is primitive. But if $2^n - 1$ is composite, then
a second step is needed. For each prime factor q of $2^n - 1$, test
whether f(x) divides $x^{(2^n - 1)/q} - 1$ by computing $x^{(2^n - 1)/q} \bmod f(x)$
and comparing with the polynomial 1. If f(x) does divide $x^{(2^n - 1)/q} - 1$
for some factor q of $2^n - 1$, then f(x) is not primitive and the period
of the LFSR will divide $(2^n - 1)/q$. However, if for every factor q
of $2^n - 1$, f(x) does not divide $x^{(2^n - 1)/q} - 1$, then f(x) is primitive.
Note the similarity of this test to that in Theorem 3.27.

Example 4.26. Which of these polynomials f(x) are primitive:
(a) $x^5 + x^2 + 1$, (b) $x^4 + x + 1$, (c) $x^4 + x^3 + x^2 + x + 1$?
(a) Since n = 5, we compute $x^{32} \bmod f(x)$. We square x five
times modulo f(x) to see whether we get x. The first two squarings
give $x^2$ and $x^4$, which need no reduction. But $x^8$ has degree ≥ 5, so
we divide it by f(x) and get the remainder $x^3 + x^2 + 1$. Then $x^{16}$ is
$(x^3 + x^2 + 1)^2 = x^6 + x^4 + 1$. (The cross product terms vanish since
1 + 1 = 0 in $F_2$.) We reduce $x^6 + x^4 + 1$ modulo f(x) to get $x^4 + x^3 + x + 1$
for $x^{16}$. Then $x^{32}$ is $(x^4 + x^3 + x + 1)^2 = x^8 + x^6 + x^2 + 1$. Reducing
modulo f(x) gives x, so f(x) is irreducible. Since $2^5 - 1 = 31$ is prime,
this already shows that f(x) is primitive.
(b) Since n = 4, we compute $x^{16} \bmod f(x)$. We square x four
times modulo f(x) to see whether we get x. We find $x^4$ modulo f(x)
is x + 1, $x^8$ modulo f(x) is $x^2 + 1$, and $x^{16}$ modulo f(x) is x, so f(x)
is irreducible. Now $2^4 - 1 = 15 = 3 \cdot 5$, so we must test whether f(x)
divides either $x^5 - 1$ or $x^3 - 1$. To do this, we compute $x^5 \bmod f(x)$
and $x^3 \bmod f(x)$ to see if we get 1. We reduce $x^5$ modulo f(x) and
get $x^2 + x$. $x^3$ needs no reduction. Neither $x^2 + x$ nor $x^3$ is the
constant polynomial 1. This shows that f(x) is primitive.
(c) As in part (b), we find $x^{16} \bmod f(x)$ is x, so f(x) is irreducible.
(On the way, $x^4$ reduces to $x^3 + x^2 + x + 1$ and $x^8$ reduces to $x^3$.)
But $x^5$ reduces to 1 modulo f(x), so f(x) is not primitive.
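
The two-step test used in Example 4.26 can be sketched in Python by storing a polynomial over $F_2$ as an integer whose bit i is the coefficient of $x^i$. This encoding and all function names below are choices made for this illustration, not notation from the text; the prime factors of $2^n - 1$ are found by trial division, so the sketch is practical only for small n, and it assumes the degree of f is at least 2.

```python
def poly_mod(a, f):
    """Reduce polynomial a modulo f over F_2 (bit i = coefficient of x^i)."""
    deg_f = f.bit_length() - 1
    while a.bit_length() - 1 >= deg_f:
        a ^= f << (a.bit_length() - 1 - deg_f)
    return a


def poly_mulmod(a, b, f):
    """Multiply a and b modulo f over F_2 (a must already be reduced)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a = poly_mod(a << 1, f)
    return poly_mod(result, f)


def x_power_mod(e, f):
    """Compute x^e mod f by repeated squaring."""
    result, base = 1, 0b10  # the polynomials 1 and x
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f)
        base = poly_mulmod(base, base, f)
        e >>= 1
    return result


def prime_factors(m):
    """Distinct prime factors of m by trial division (small m only)."""
    factors = []
    d = 2
    while d * d <= m:
        if m % d == 0:
            factors.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors


def is_primitive(f):
    """Apply the two-step test described above to f of degree n >= 2."""
    n = f.bit_length() - 1
    if x_power_mod(2**n, f) != 0b10:  # step 1: x^(2^n) must reduce to x
        return False
    order = 2**n - 1
    # Step 2: x^((2^n - 1)/q) must differ from 1 for every prime q.
    return all(x_power_mod(order // q, f) != 1 for q in prime_factors(order))


print(is_primitive(0b100101))  # (a) x^5 + x^2 + 1
print(is_primitive(0b10011))   # (b) x^4 + x + 1
print(is_primitive(0b11111))   # (c) x^4 + x^3 + x^2 + x + 1
```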

One can show that the total number of irreducible polynomials
of degree n modulo 2 is
\[
\frac{1}{n} \sum_{d \mid n} 2^d \mu(n/d),
\]
and the number of these polynomials that are primitive is
\[
\varphi(2^n - 1)/n.
\]
When $2^n - 1$ is prime, both numbers equal $(2^n - 2)/n$, and every
irreducible polynomial modulo 2 of degree n is primitive. See Golomb
[Gol82] for proofs.
Example 4.27. The number of irreducible polynomials modulo 2 of
degree 8 is
\[
\frac{1}{8} \sum_{d \mid 8} 2^d \mu(8/d) = \frac{1}{8} (0 + 0 - 2^4 + 2^8) = \frac{240}{8} = 30.
\]
The number of primitive polynomials modulo 2 of degree 8 is
\[
\frac{\varphi(2^8 - 1)}{8} = \frac{\varphi(255)}{8} = \frac{\varphi(3 \cdot 5 \cdot 17)}{8} = \frac{2 \cdot 4 \cdot 16}{8} = 16.
\]
Many articles list primitive polynomials modulo 2, especially
trinomials. See for example Brent and Zimmermann [BZ09], [BZ11]
and their references. Most of these articles favor Mersenne prime
exponents because there is no need to factor $2^n - 1$, but any n will work,
provided the factors of $2^n - 1$ are known. A surprising recent result of
Brent, Hart, Kruppa, and Zimmermann is that when n = 57885161,
so that $2^n - 1$ is the current largest known prime, there is no
primitive trinomial. For each s in 1 ≤ s ≤ n/2, they found a nontrivial
factor of $x^n + x^s + 1$. The second largest known prime, $2^p - 1$ with
p = 43112609, yielded four primitive trinomials, $x^p + x^s + 1$ with
s = 3569337, 4463337, 17212521, and 21078848; the third largest
known prime gave five primitive trinomials.
See [DXS91] for more about LFSRs and their variations.

4.6. Testing Conjectures


Mathematicians often compute examples to test conjectures; factoring
is frequently involved in this computation.

4.6.1. Mersenne Primes. One of the oldest conjectures is that
there are infinitely many primes of the form $2^p - 1$. Theorem 3.31
gives an efficient way of testing whether $2^p - 1$ is prime. But often
there is a small prime factor q of this number. One can test
whether q divides $2^p - 1$ using the Fast Exponentiation Algorithm to
find $2^p \bmod q$. Then q divides $2^p - 1$ if and only if $2^p \equiv 1 \pmod q$.
This work multiplies numbers smaller than q. In contrast, the test of
Theorem 3.31 multiplies numbers of the size of $2^p - 1$, that is, p-bit
numbers. We will see in Example 5.7 that if a prime q divides $2^p - 1$,
then q = 2kp + 1, where k ≡ 0 or −p (mod 4). When p is large
(say, p > 30), it is worthwhile to test whether $2^p \equiv 1 \pmod q$ for
small q (say, $k < 2^{20}$) in these two arithmetic progressions. There is
a good chance that a factor q of $2^p - 1$ will be found and that the
Lucas-Lehmer test Theorem 3.31 will not be needed.

Example 4.28. The prime 78511 divides $2^{2617} - 1$, so the latter is
composite.
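
The trial-factoring idea of this subsection can be sketched with Python's built-in modular exponentiation `pow(2, p, q)`. The function name `small_mersenne_factor` and its stopping rule are choices made for this illustration: candidates q = 2kp + 1 are tried in increasing order with k ≡ 0 or −p (mod 4), and the search stops once $q^2$ exceeds $2^p - 1$ so that $2^p - 1$ itself is never reported as a factor.

```python
def small_mersenne_factor(p, k_limit=2**20):
    """Return the smallest q = 2kp + 1 (k < k_limit) dividing 2^p - 1, or None.

    Only k = 0 or -p (mod 4) is tried. No primality test on q is needed:
    the first divisor found this way is the smallest prime factor.
    """
    mersenne = 2**p - 1
    allowed = {0, (-p) % 4}
    for k in range(1, k_limit):
        if k % 4 not in allowed:
            continue
        q = 2 * k * p + 1
        if q * q > mersenne:
            return None  # no proper factor this small exists
        if pow(2, p, q) == 1:  # q divides 2^p - 1
            return q
    return None


print(small_mersenne_factor(11))
print(small_mersenne_factor(43))
print(pow(2, 2617, 78511) == 1)  # the divisibility claimed in Example 4.28
```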

Mersenne famously stated in 1644 that, of the fifty-five primes
p ≤ 257, $2^p - 1$ is prime only for the eleven values¹

p = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, and 257.

During the next 250 years mathematicians found that his list had five
errors: p = 67 and 257 should be removed from the list while p = 61,
89, and 107 should be added to it. (See Bateman et al. [BSW89]
for a new “Mersenne” conjecture.) Much of the checking involved
factoring. For example, the prime 43 should not be on the list because
431 divides $2^{43} - 1$. Lucas used Theorem 3.31 to show that $2^{127} - 1$
is prime, and it was the largest known prime for 75 years. Lucas did
the same calculation to test whether $2^{67} - 1$ is prime, and it showed
that this number is composite, but he was not certain that his work
was correct. Lucas died in 1891. Cole factored this number in 1903.
See the quote from Bell [Bel51] in Section 4.9 for more of the story.
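
The corrected list is quick to recompute today. Theorem 3.31 is not restated in this section, so the Python sketch below assumes it is the usual Lucas-Lehmer test: for an odd prime p, put $s_0 = 4$ and $s_{k+1} = s_k^2 - 2 \bmod (2^p - 1)$; then $2^p - 1$ is prime exactly when $s_{p-2} = 0$. The function names are invented for this illustration.

```python
def is_small_prime(n):
    """Trial-division primality test for the exponents p <= 257."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def lucas_lehmer(p):
    """Assumed form of the Lucas-Lehmer test; p must be an odd prime."""
    m = 2**p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


primes = [p for p in range(2, 258) if is_small_prime(p)]
# p = 2 is handled separately because the test needs an odd prime.
mersenne_exponents = [2] + [p for p in primes if p > 2 and lucas_lehmer(p)]

print(len(primes))
print(mersenne_exponents)
```

Comparing the output with Mersenne's list shows the five errors: 67 and 257 are absent, while 61, 89, and 107 appear.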
Currently, the largest known Mersenne prime is $2^{57885161} - 1$. It
is conjectured that the number of Mersenne primes ≤ x is asymptotically
$(e^{\gamma} / \ln 2) \ln \ln x$ as x → ∞. Here γ ≈ 0.577215665 is Euler’s
constant and $e^{\gamma} / \ln 2 \approx 2.5695$. Numerical evidence supports this
conjecture. See [Wag83].

¹ This is the reason why the $2^p - 1$ are called the Mersenne numbers.

4.6.2. Bell Numbers. The Bell numbers B(n) are named after the
same E. T. Bell who wrote about Cole. They appear in combinatorial
problems. For example, B(n) is the number of ways a product of
exactly n different primes can be factored. The Bell numbers are
defined by the generating function
\[
e^{e^x - 1} = \sum_{n=0}^{\infty} B(n) \frac{x^n}{n!}.
\]
Williams [Wil45] proved that for each prime p the sequence
{B(n) mod p}, n = 0, 1, . . ., is periodic and the period length divides
$N_p = (p^p - 1)/(p - 1)$. It is conjectured that the minimum
period length equals $N_p$ for every prime p. The conjecture is known
to hold for all p < 126 and for several larger primes.

Example 4.29. The first few Bell numbers are

n: 0 1 2 3 4 5 6 7 8 9
B(n) : 1 1 2 5 15 52 203 877 4140 21147
B(n) mod 2 : 1 1 0 1 1 0 1 1 0 1

This table and Touchard’s [Tou33] congruence $B(n + p) \equiv B(n) +
B(n + 1) \pmod p$, valid for n ≥ 0 and prime p, show that the Bell
numbers modulo 2 are periodic with period length $(2^2 - 1)/(2 - 1) = 3$.
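
For very small primes the period can be found by brute force. The Python sketch below is an illustration with invented function names: it computes Bell numbers with the Bell triangle, then uses Touchard's congruence as a recurrence on a window of p consecutive residues and counts steps until the initial window recurs. This direct search is feasible only for tiny p, since the period grows like $p^{p-1}$.

```python
def bell_numbers(count):
    """Return [B(0), ..., B(count - 1)] using the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        bells.append(row[0])
        new_row = [row[-1]]  # each row starts with the end of the previous one
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return bells


def bell_period_mod(p):
    """Minimum period of B(n) mod p, stepping with Touchard's congruence."""
    start = [b % p for b in bell_numbers(p)]
    window = list(start)
    steps = 0
    while True:
        # B(n + p) = B(n) + B(n + 1) (mod p)
        window.append((window[0] + window[1]) % p)
        window.pop(0)
        steps += 1
        if window == start:
            return steps


print(bell_numbers(10))
for p in (2, 3, 5):
    print(p, bell_period_mod(p), (p**p - 1) // (p - 1))
```

For each of these primes the period found equals $N_p$, consistent with the conjecture.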

To prove the conjecture for a given prime p, one simply checks
that for each prime $q \mid N_p$, the period does not divide $N_p/q$. This test
is similar to Theorem 3.27 for primality. See [Wag96] for how to tell
whether the period divides $N_p/q$. Thus, one can test the conjecture
for a prime p if one can factor $N_p$ completely. The conjecture has
been proved exactly for those primes p for which $N_p$ has been factored
completely. Whenever p ≡ 1 (mod 4), $N_p$ has an Aurifeuillian
factorization that splits it into two nearly equal factors by Theorem
4.1. See Example 4.4.

4.6.3. Aliquot Sequences.

Definition 4.30. An aliquot part of a positive integer m is a positive
divisor of m other than m. Let s(m) = σ(m) − m denote the sum of
the aliquot parts of m.

Example 4.31. An integer m is perfect if and only if it equals the
sum of its aliquot parts, that is, s(m) = m. An integer m is prime if
and only if 1 is its only aliquot part, that is, s(m) = 1.

Definition 4.32. The aliquot sequence starting at m is the integer
sequence $\{s^i(m), i \ge 0\}$ of iterates of the function s: $s^0(m) = m$ and
$s^{i+1}(m) = s(s^i(m))$ for i ≥ 0.

Example 4.33. Here are some aliquot sequences:

  m   s^1(m)  s^2(m)  . . .
 12   16  15  9  4  3  1  0
 24   36  55  17  1  0
 28   28  28  28  . . .
 30   42  54  66  78  90  144  259  45  33  15  9  4  3  1  0
220   284  220  284  . . .
276   396  696  1104  1872  3770  3790  3050  2716  2772  5964  . . .

The way to compute an aliquot sequence is to factor each number
m, use Theorem 2.41 to evaluate σ(m), and then subtract m to get
the next term s(m).
An aliquot sequence can either end with 0 or enter a cycle, such
as {28} of length 1 or {220, 284} of length 2, or perhaps grow without
bound. The cycles of length 1 are perfect numbers. Cycles of length
2 are called amicable pairs. Cycles of length > 2 are called sociable
numbers. For example, {12496, 14288, 15472, 14536, 14264} is a set of
sociable numbers.
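
Short aliquot sequences like those in Example 4.33 can be generated with a few lines of Python. In this illustrative sketch σ(m) is found by trial division instead of by factoring m, which is adequate only for small terms; the function names and the `max_terms` cutoff are choices made here.

```python
def sigma(n):
    """Sum of the positive divisors of n, by trial division (n >= 1)."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return total


def aliquot_sequence(m, max_terms=50):
    """Iterate s(m) = sigma(m) - m until 0, a repeated term, or max_terms."""
    terms = [m]
    while terms[-1] != 0 and len(terms) < max_terms:
        nxt = sigma(terms[-1]) - terms[-1]
        terms.append(nxt)
        if nxt in terms[:-1]:
            break  # the sequence has entered a cycle
    return terms


print(aliquot_sequence(12))
print(aliquot_sequence(30))
print(aliquot_sequence(220))
print(aliquot_sequence(12496))
print(aliquot_sequence(276, max_terms=11))
```

A sequence that ends by repeating an earlier term has entered a cycle; the last call stops after eleven terms because the sequence starting at 276 keeps growing.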
Aliquot sequences, perfect numbers, and amicable pairs have had
a long and colorful history. God made Heaven and Earth in 6 days.
The moon circles the earth every 28 days. Both 6 and 28 are perfect.
In Genesis 32:14, Jacob gave his brother 200 ewes and 20 rams. Jew-
ish commentators have noted that the total number of sheep is one
number of an amicable pair. The philosopher Iamblichus of Chalcis
(ca. 250–330 A.D.) wrote this:

    Certain men steeped in mistaken opinion thought that the
    perfect number was called love by the Pythagoreans on account
    of the union of different elements and affinity which exists
    in it; for they call certain other numbers, on the contrary,
    amicable numbers, adopting virtues and social qualities to
    numbers, as 284 and 220, for the [aliquot] parts of each
    have the power to generate the other, according to
    the rule of friendship, as Pythagoras affirmed. When
    asked what is a friend, he replied, “another I,” which
    is shown in these numbers. Aristotle so defined a
    friend in his Ethics.
    (Quoted in Dickson [Dic71, p. 38])

The aliquot sequence starting with 276 shown in Example 4.33 is
the first one not known to terminate or cycle. It has been followed for
more than 1600 terms and leads to numbers with more than 160
decimal digits. Work on this sequence has been done at least since
1930. In 1931, Dick Lehmer factored the term

\[ s_{53}(276) = 254903331620 = 2^2 \cdot 5 \cdot 7 \cdot 137 \cdot 13290059, \]

and the composite number 13290059 will be used as an example of
some factoring methods in later chapters.
No one has ever proved that any aliquot sequence is unbounded,
but the current guess is that infinitely many, and perhaps most, of
them grow without bound. See Guy and Selfridge [GS75] for more
about the growth rate of aliquot sequences.

4.7. Bernoulli Numbers


The Bernoulli numbers $B_n$, n ≥ 0, are rational numbers with interesting
arithmetic properties. They may be defined by

\[ \frac{t}{e^t - 1} = \sum_{n=0}^{\infty} B_n \frac{t^n}{n!} \]

or by $B_0 = 1$ and

\[ B_n = \frac{-1}{n+1} \sum_{k=0}^{n-1} \binom{n+1}{k} B_k \]

for n ≥ 1, where $\binom{n}{k} = n!/((n-k)!\,k!)$ is the binomial coefficient. The
first few are

\[ B_1 = -\frac{1}{2},\quad B_2 = \frac{1}{6},\quad B_3 = 0,\quad B_4 = -\frac{1}{30},\quad B_5 = 0,\quad B_6 = \frac{1}{42}. \]

One can prove that $B_{2k+1} = 0$ for k ≥ 1. The signs of the nonzero
Bernoulli numbers alternate. The next few nonzero ones are

\[ B_8 = -\frac{1}{30},\quad B_{10} = \frac{5}{66},\quad B_{12} = -\frac{691}{2730},\quad B_{14} = \frac{7}{6},\quad B_{16} = -\frac{3617}{510}. \]

After a slow start, the absolute value $|B_{2k}|$ increases rapidly.
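
The recurrence above translates directly into exact rational arithmetic. The following Python sketch (the function name is ours) uses the standard `fractions` module and follows the convention $B_1 = -1/2$ used in this section; it is meant for checking small cases, not as an efficient method.

```python
from fractions import Fraction
from math import comb


def bernoulli(n_max):
    """Return [B_0, B_1, ..., B_{n_max}] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        # B_n = -1/(n+1) * sum_{k=0}^{n-1} C(n+1, k) * B_k
        total = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-total / (n + 1))
    return B


for n, b in enumerate(bernoulli(16)):
    if b != 0:
        print(n, b)
```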
Jacob Bernoulli introduced the numbers named after him to give
a closed form for the sum of the first k n-th powers of integers:

\[ 1^n + 2^n + 3^n + \cdots + k^n = \frac{1}{n+1} \sum_{j=0}^{n} \binom{n+1}{j} B_j \cdot (k+1)^{n+1-j}. \]

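Example 4.34. When n = 2, we have

\[ \begin{aligned}
1^2 + 2^2 + 3^2 + \cdots + k^2 &= \frac{1}{3}\left[ (k+1)^3 - \frac{3}{2}(k+1)^2 + \frac{3}{6}(k+1)^1 \right] \\
&= \frac{k+1}{3}\left[ k^2 + 2k + 1 - \frac{3k}{2} - \frac{3}{2} + \frac{1}{2} \right] \\
&= \frac{k(k+1)(2k+1)}{6}.
\end{aligned} \]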

The well-known formula $\sum_{i=1}^{\infty} i^{-2} = \pi^2/6$ generalizes to

\[ \sum_{i=1}^{\infty} \frac{1}{i^{2k}} = (-1)^{k-1} B_{2k} \frac{(2\pi)^{2k}}{2(2k)!}. \]

Example 4.35.

\[ \sum_{i=1}^{\infty} \frac{1}{i^4} = (-1)^{2-1} B_4 \frac{(2\pi)^4}{2(4)!} = -\frac{16\pi^4}{2 \cdot 24} \cdot \frac{-1}{30} = \frac{\pi^4}{90}. \]

Write the rational number $B_n = P_n/Q_n$ with $\gcd(P_n, Q_n) = 1$
and $Q_n > 0$. The denominators $Q_n$ are determined by a corollary to
the von Staudt–Clausen Theorem. If n is an even positive integer,
then

\[ Q_n = \prod_{(p-1) \mid n,\ p \text{ prime}} p, \]

that is, $Q_n$ is the product of all primes p for which p − 1 divides n.
Thus, 6 divides $Q_n$ for all even n ≥ 2 because both 2 − 1 and 3 − 1
divide every even number. The denominator $Q_n$ is always square free.
One can compute $Q_n$ provided one can factor n: For each even divisor
d of n, test whether d + 1 is prime; if so, it is a factor of $Q_n$; all odd
prime factors of $Q_n$ arise this way. One can prove that for each even
positive integer n there are infinitely many even positive integers m
for which $Q_m = Q_n$. See Erdős and Wagstaff [EW80].
Example 4.36. Find the denominator $Q_{12}$ of $B_{12}$.
The divisors of 12 are 1, 2, 3, 4, 6, 12. Add 1 to each divisor to
get 2, 3, 4, 5, 7, 13. All of these numbers are prime except 4, so
$Q_{12} = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 13 = 2730$.
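
The recipe of Example 4.36 is easy to mechanize. The Python sketch below is illustrative only: the helper names are ours, and both the divisor search and the primality test use naive trial division, so it suits only small n.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def bernoulli_denominator(n):
    """Q_n for even n >= 2: the product of all primes p with (p - 1) dividing n."""
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be an even positive integer")
    q = 1
    for d in range(1, n + 1):
        if n % d == 0 and is_prime(d + 1):
            q *= d + 1
    return q


print(bernoulli_denominator(12))
```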

Much less is known about the numerator $P_n$ of $B_n$. When one
mentions “factoring a Bernoulli number,” one always means factoring
its numerator. The main theoretical result about the prime factors
of $P_n$ is the pathetic

Theorem 4.37. If p ≡ 3 (mod 4) is prime, then p does not divide
$P_{(p+1)/2}$.

See Remark 3.6 in [MW06] for a simple proof.
The prime factors of $P_n$ have important applications to the structure
of cyclotomic fields. See Chapters 5 and 6 of Washington [Was96]
for details. See Buhler and Harvey [BH11] for recent work in this
area. Near the end of Section 8.5, we describe how $P_{200}$ was factored.
A prime p > 3 is called regular if $p \nmid P_k$ for any even k in 2 ≤ k ≤ p − 3
and irregular if p divides $P_k$ for some even k in 2 ≤ k ≤ p − 3. It has been
proved that there are infinitely many irregular primes. (See Theorems
3.18, 3.19, and 3.20 of [MW06].) Although it has never been proved
that the number of regular primes is infinite, a heuristic argument by
Siegel [Sie64] and much numerical evidence suggest that more than
60% (the fraction is $e^{-1/2}$) of primes are regular.

Example 4.38. The first few irregular primes are 37, which divides
$P_{32}$, 59, which divides $P_{44}$, and 67, which divides $P_{58}$. The prime 157
is the first one to divide two Bernoulli numerators, namely, $P_{62}$ and
$P_{110}$.
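
Small cases of Example 4.38 can be checked by brute force. The Python sketch below (function names are ours) computes Bernoulli numerators exactly from the recurrence for $B_n$ and then tests divisibility for even k in 2 ≤ k ≤ p − 3. It is a naive illustration and far too slow for the searches reported in the literature.

```python
from fractions import Fraction
from math import comb


def bernoulli_numerators(n_max):
    """Return [P_0, ..., P_{n_max}], the numerators of the Bernoulli numbers."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        total = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-total / (n + 1))
    return [b.numerator for b in B]


def irregular_pairs(limit):
    """All pairs (p, k) with p <= limit prime, k even, 2 <= k <= p - 3, and p | P_k."""
    P = bernoulli_numerators(limit)
    pairs = []
    for p in range(5, limit + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):   # p is prime
            for k in range(2, p - 2, 2):
                if P[k] % p == 0:
                    pairs.append((p, k))
    return pairs


print(irregular_pairs(70))
```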

If p is a regular prime, then it is relatively easy to prove Fermat’s
Last Theorem for exponent p, that is, there are no positive integers x,
y, z, with $x^p + y^p = z^p$. Before Wiles proved Fermat’s Last Theorem
for all primes, some mathematicians proved it one exponent at a time,
with extra work needed for each instance of $p \mid P_k$ with 2 ≤ k ≤ p − 3,
k even.

4.8. Cryptographic Applications


Factoring has many uses in cryptography. We have already mentioned
linear feedback shift registers in Section 4.5. The security of some
ciphers depends on the difficulty of factoring integers. Other ciphers
and protocols depend on the discrete logarithm problem (DLP) being
hard to solve. There are special cases in which a DLP that looks hard
because it has large parameters is actually easy to solve because of
some trick. Knowing the factors of a parameter helps to avoid some
of these traps. Blum, Blum, and Shub invented a random number
generator based on a large composite number N whose random output
bits can be predicted if and only if one can factor N . We describe
RSA signatures and zero-knowledge proofs, each based on a large
number with secret factors. In Section 10.4, we will explain how to
break these protocols by factoring a large number.

4.8.1. RSA and Factoring. The well-known Rivest-Shamir-Adleman
public-key cipher works as follows. If Alice uses RSA, she chooses
two large primes p and q. They should have the property that n =
pq cannot be factored easily. Alice also chooses an e in 1 < e <
n − 1 with gcd(e, φ(n)) = 1 and computes a number d with ed ≡
1 (mod φ(n)). Note that it is easy to do this step via the Extended
Euclidean Algorithm when p and q are known because φ(n) = φ(pq) =
(p − 1)(q − 1). Plaintext and ciphertext are represented by integers
between 0 and n − 1. A plaintext M is enciphered as $E(M) = M^e \bmod n$
and ciphertext C is deciphered as $D(C) = C^d \bmod n$. The basic
property of any cipher is that when any message is enciphered and
then deciphered, the original message is recovered. In the case of RSA,
this means that for all 0 < M < n we have D(E(M)) = M. This
property holds because of Euler’s Theorem 2.31. Let ed = 1 + kφ(n)
for an integer k. Then for any M in 0 < M < n,

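\[ D(E(M)) = (M^e \bmod n)^d \bmod n \equiv M^{ed} = M^{1+k\phi(n)} = M^1 (M^{\phi(n)})^k \equiv M \cdot 1^k = M \pmod{n}. \]

(Euler’s Theorem gives $M^{\phi(n)} \equiv 1 \pmod{n}$ only when gcd(M, n) = 1. For the few M
divisible by p or q, the congruence $M^{ed} \equiv M$ still holds, as one checks
modulo p and modulo q separately.)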

Alice makes n and e public but keeps d secret. The primes p and
q are not needed after d is computed. Anyone who wants to send
a secret message to Alice will encipher it using n and e. Once it is
enciphered, only Alice can decipher it because only she knows d. The
cipher will be secure provided it is impossible to find d given n and
e. The most obvious attack on RSA is to factor n and then compute
d from e and the prime factors p and q of n in the same way that
Alice computed d originally. If the attacker cannot factor n, then this
attack will fail.

Example 4.39. Alice uses RSA with p = 5, q = 13, n = pq = 65, and
e = 11. To find d, Alice first computes φ(n) = φ(65) = 4 · 12 = 48.
Then $d = e^{-1} \bmod \phi(n) = 11^{-1} \bmod 48$. The Extended Euclidean
Algorithm applied to 11 and 48 gives 48 · 3 − 11 · 13 = 1, so d ≡ −13 ≡
48 − 13 = 35 (mod 48).
Alice makes n = 65 and e = 11 public. If Bob wants to send
the message M = 42 to Alice, he enciphers it as C = E(M) =
E(42) = $42^{11} \bmod 65 = 48$ by the Fast Exponentiation Algorithm.
Bob sends C = 48 to Alice. Alice deciphers C as D(C) = D(48) =
$48^{35} \bmod 65 = 42 = M$ by the Fast Exponentiation Algorithm.
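
Example 4.39 can be replayed in Python, whose built-in `pow` performs fast modular exponentiation and, from version 3.8 on, modular inversion when the exponent is −1. The function names below are ours. This is a toy sketch with tiny primes and no padding, and it must not be mistaken for a usable RSA implementation.

```python
from math import gcd


def rsa_keys(p, q, e):
    """Return (n, e, d) for primes p, q and a public exponent e coprime to phi(n)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to phi(n)")
    d = pow(e, -1, phi)        # modular inverse of e modulo phi(n)
    return n, e, d


def encipher(M, e, n):
    return pow(M, e, n)        # M^e mod n


def decipher(C, d, n):
    return pow(C, d, n)        # C^d mod n


n, e, d = rsa_keys(5, 13, 11)
C = encipher(42, e, n)
print(n, d, C, decipher(C, d, n))
```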

An important application of factoring is to determine which numbers
n can be factored and which cannot be factored with known
algorithms and current computers. Users of RSA should choose the
latter. For example, n should be large enough so that the Number
Field Sieve will not factor it in a reasonable time, both p − 1 and q − 1
should have large prime factors to thwart a Pollard p − 1 attack, and
p and q should be approximately the same size, but not so close that
Fermat’s Method will factor n. Attackers of RSA should try to find
new factoring methods or build faster computers that can factor n
already in use for RSA.
No one has ever proved that breaking RSA is equivalent to fac-
toring n. Key management is important. Where does Alice store d?
If an attacker can get d, then there is no need to factor n. (In fact, if
you know d and e, then you can factor n. See Theorem 10.7 and Sec-
tion 10.5.) Many other issues are involved in building a secure cipher.
The reader should not attempt to design a serious RSA system using
only the information in this book. For example, if the same message
is enciphered and sent to a user twice, then the ciphertext will be the
same. The fact that the same ciphertext was sent twice may be useful
information to an eavesdropper even if he cannot decipher it.

4.8.2. Rabin, Williams, and Factoring. Rabin and Williams each
devised public-key ciphers with the property that one can prove that
breaking the cipher is equivalent to factoring a large integer n. In Rabin’s
[Rab79] system, Alice chooses two large primes² p and q with
p ≡ q ≡ 3 (mod 4). She publishes the product n = pq and keeps p
and q secret. When Bob wants to send a message M in 0 < M < n
to Alice, he enciphers it as $C = M^2 \bmod n$. When Alice receives C,
which is a quadratic residue modulo n, she uses her knowledge of p
and q to find the four square roots of C modulo n. She uses Theorem
3.5 to find the square roots of C modulo p and q separately and she
uses the Chinese Remainder Theorem to combine the four pairs of
square roots. If M is ordinary plaintext, then only one square root
will make sense and M is that one. If M is a binary string or otherwise
indistinguishable from the other three square roots of $M^2 \bmod n$,
then Bob must indicate which square root is M.
² The primes are chosen ≡ 3 (mod 4) because it is so easy to compute square roots
modulo such primes using Theorem 3.5.

Example 4.40. Suppose Alice chooses the primes 7 and 11 for her
Rabin cipher. She publishes n = 77. If Bob wants to send the secret
message M = 38 to Alice, he enciphers it as $C = M^2 \bmod n = 38^2 \bmod 77 = 58$.
Alice receives the ciphertext C = 58. She knows that 77 = 7 · 11.
The first step is to find the square roots of 58 ≡ 2 (mod 7) and
58 ≡ 3 (mod 11). Using the formula from Theorem 3.5 for the square
root of a quadratic residue modulo a prime ≡ 3 (mod 4), she has,
modulo 7:
\[ x_1 \equiv 2^{(7+1)/4} = 2^2 = 4 \pmod{7} \]
and $x_2 \equiv 7 - x_1 = 7 - 4 = 3 \pmod{7}$, so the square roots are 3, 4
modulo 7.
Modulo 11 she has
\[ x_1 \equiv 3^{(11+1)/4} = 3^3 \equiv 5 \pmod{11} \]
and $x_2 \equiv 11 - x_1 = 11 - 5 = 6 \pmod{11}$, so the square roots are 5, 6
modulo 11.
Now Alice combines these roots with the Chinese Remainder Theorem.
Note first that $11^{-1} \bmod 7 = 4^{-1} \bmod 7 = 2$ and $7^{-1} \bmod 11 = 8$,
so by the Chinese Remainder Theorem the solution to the
system
\[ x \equiv a \pmod{7} \quad\text{and}\quad x \equiv b \pmod{11} \]
is
\[ x \equiv 11(11^{-1} \bmod 7)a + 7(7^{-1} \bmod 11)b \equiv 11 \cdot 2a + 7 \cdot 8b \equiv 22a + 56b \pmod{77}. \]

For our four pairs of a, b, Alice finds

a, b :
3, 5 : x = 22 · 3 + 56 · 5 = 66 + 280 ≡ 38 (mod 77),
3, 6 : x = 22 · 3 + 56 · 6 = 66 + 336 ≡ 17 (mod 77),
4, 5 : x = 22 · 4 + 56 · 5 = 88 + 280 ≡ 60 (mod 77),
4, 6 : x = 22 · 4 + 56 · 6 = 88 + 336 ≡ 39 (mod 77).
Note that 38+39 = 77 and 17+60 = 77, so that she could have found
the last two solutions by subtracting the first two solutions from 77.
After she finds the four square roots, how does Alice know that
38 is Bob’s secret message? Perhaps “38” is the only meaningful
message. Otherwise, Bob would have to tell her something to identify
his message.
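
Alice's computation in Example 4.40 can be sketched in Python as follows. The function names are ours. The sketch assumes p ≡ q ≡ 3 (mod 4) and that C is a quadratic residue relatively prime to n, and it makes no attempt to decide which root is Bob's message.

```python
def sqrt_mod_prime(c, p):
    """One square root of the quadratic residue c modulo a prime p with p % 4 == 3."""
    return pow(c, (p + 1) // 4, p)


def rabin_roots(C, p, q):
    """The four square roots of C modulo n = p*q, in increasing order."""
    n = p * q
    xp = sqrt_mod_prime(C % p, p)
    xq = sqrt_mod_prime(C % q, q)
    f = q * pow(q, -1, p)      # f is 1 mod p and 0 mod q
    g = p * pow(p, -1, q)      # g is 0 mod p and 1 mod q
    return sorted({(f * a + g * b) % n
                   for a in (xp, p - xp)
                   for b in (xq, q - xq)})


C = pow(38, 2, 77)             # Bob enciphers M = 38
print(C, rabin_roots(C, 7, 11))
```

With p = 7 and q = 11 the constants f and g are 22 and 56, the same multipliers that appear in the example.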

Rabin [Rab79] proved that if one has an algorithm to decipher
any message M enciphered with Rabin’s cipher in a reasonable time,
then there is an algorithm to factor the modulus n in a reasonable
time.
Williams [Wil80] constructed a similar cipher using primes p ≡
3 (mod 8) and q ≡ 7 (mod 8), and n = pq. His cipher eliminates the
ambiguity in deciphering and also has the property that breaking the
cipher is equivalent to factoring n.

4.8.3. Hard DLP and Factoring. Another use of factoring in
cryptography is in the design of cryptographic functions that must
have a hard discrete logarithm problem. As we mentioned in the discussion
of the DLP at the end of Section 2.4, if q is the largest prime
factor of p − 1, then one can solve the DLP $g^x \equiv b \pmod{p}$ in time
$O(\sqrt{q})$. Therefore, when designing a cipher like Pohlig-Hellman or
ElGamal or a key exchange algorithm like Diffie-Hellman, one must
ensure that the prime modulus p is chosen so that p − 1 has at least
one prime factor q so large that one cannot perform $\sqrt{q}$ operations
in a reasonable time. One can make this check by factoring p − 1.
Elliptic curve analogues of the DLP are discussed in Chapter 8.
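
For small moduli the check on p − 1 can be illustrated with trial division. The helper below is a hypothetical name of ours, and the method is a stand-in: for a prime p of cryptographic size one would need the factoring methods of later chapters.

```python
def largest_prime_factor(m):
    """Largest prime factor of m >= 2, found by trial division."""
    largest = 1
    d = 2
    while d * d <= m:
        while m % d == 0:
            largest = d        # divisors are met in increasing order
            m //= d
        d += 1
    if m > 1:                  # the remaining cofactor is a prime larger than any found
        largest = m
    return largest


for p in (1019, 65537):
    print(p, largest_prime_factor(p - 1))
```

The prime 65537 is a poor choice of modulus by this criterion because p − 1 is a power of 2, whereas 1019 − 1 = 2 · 509 has a prime factor about half the size of p.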

4.8.4. RSA Signatures. We mentioned in Section 1.1 that Diffie
and Hellman invented the idea of digital signature and that Rivest,
Shamir, and Adleman found a way to do it. We now explain a method
of signing a message.
Suppose Alice uses RSA with keys n, e, d and wants to sign a
message M in 0 < M < n. She has a (public) enciphering algorithm
E and a (secret) deciphering algorithm D. To sign M, she just applies
D to it: $S = D(M) = M^d \bmod n$. If Bob sees the signed message S,
he verifies the signature by obtaining Alice’s public RSA key n, e and
computing $T = E(S) = S^e \bmod n$. The proof above that D(E(M)) ≡
M (mod n) also shows that E(D(M)) ≡ M (mod n) because ed = de.
Therefore, Bob will find that T = M when the signature is valid. Only
Alice could have created this signature because only Alice knows d.
In case Bob also uses RSA and Alice wants to encipher the message
M as well as sign it, she can encipher the signed message S with
Bob’s public RSA key.

There is a trick Alice can use to speed her RSA signature generation.
Suppose her modulus is n = pq, where the primes p and q
have about the same length. Let b be the number of bits in n, so that
the length of p and q is about b/2 bits. If the decryption exponent
is d, then Alice signs the plaintext M as $S = D(M) = M^d \bmod n$.
The trick replaces this Fast Exponentiation with b-bit numbers by
two Fast Exponentiations with b/2-bit numbers. This makes the signature
generation run about four times faster. Let $M_p = M \bmod p$,
$M_q = M \bmod q$, $d_p = d \bmod (p-1)$, and $d_q = d \bmod (q-1)$. The
length of each of these four numbers is about b/2 bits. Alice computes
$S_p = M_p^{d_p} \bmod p$ and $S_q = M_q^{d_q} \bmod q$ by Fast Exponentiation. Now
the signature $S \equiv S_p \pmod{p}$ and $S \equiv S_q \pmod{q}$, so Alice computes
S = D(M) from $S_p$ and $S_q$ by the Chinese Remainder Theorem.
The use of the Chinese Remainder Theorem may be accelerated. One
can show that $S = (f S_p + g S_q) \bmod n$ where f and g are the precomputed
constants $f = q \cdot (q^{-1} \bmod p)$ and $g = p \cdot (p^{-1} \bmod q)$. The
same trick can be used to speed deciphering RSA messages, but not
enciphering them because the person encrypting does not know the
secret factors p and q.
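
The speed-up just described can be sketched in Python as follows. The function names and the small test primes 61 and 53 are our own choices for illustration. The sketch recomputes f and g on every call, whereas a real implementation would store them with the key.

```python
def sign_plain(M, d, n):
    """Signature S = M^d mod n computed with one full-size exponentiation."""
    return pow(M, d, n)


def sign_crt(M, d, p, q):
    """The same signature computed from two half-size exponentiations."""
    n = p * q
    dp, dq = d % (p - 1), d % (q - 1)
    Sp = pow(M % p, dp, p)
    Sq = pow(M % q, dq, q)
    f = q * pow(q, -1, p)      # precomputable: 1 mod p, 0 mod q
    g = p * pow(p, -1, q)      # precomputable: 0 mod p, 1 mod q
    return (f * Sp + g * Sq) % n


p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
S = sign_crt(1234, d, p, q)
print(S == sign_plain(1234, d, n), pow(S, e, n) == 1234)
```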

4.8.5. Secure Random Number Generation. In 1986, Blum,
Blum, and Shub [BBS86] invented a pseudorandom number generator
that may be used for public-key cryptography. They proved
that it can be broken if and only if one can factor a large integer.
The integer n is the product of two primes p ≡ q ≡ 3 (mod 4). Let
s be an integer (the seed) relatively prime to n. Define a sequence
$x_0 = s^2$ and $x_i = x_{i-1}^2 \bmod n$ for i > 0. The i-th pseudorandom bit
is the low-order bit $b_i$ of $x_i$: $b_i = x_i \bmod 2$.
The bits may be used to encipher a message bit by bit. If the
message is the bit string $m_0 m_1 \ldots$, then the ciphertext is $c_0 c_1 \ldots$
defined by $c_i = m_i \oplus b_i$, where ⊕ is exclusive-or (the sum modulo
2) as in the LFSR. To recover the message from the ciphertext, use
$m_i = c_i \oplus b_i$.
One can prove that, while it is easy to compute the sequence $x_i$
forwards from any starting point $x_k$, one cannot compute the sequence
backwards, that is, find $x_{k-1}$ from $x_k$, unless one knows the factors
p, q of n, in which case it is easy to compute backwards.

Alice can use this property to set up a public-key cipher as follows.
She chooses secret large primes p ≡ q ≡ 3 (mod 4) and makes n = pq
public. Bob enciphers a message $m_0 m_1 \ldots m_r$ as follows. He chooses
a random s relatively prime to n and generates $x_i$, $b_i$, and $c_i$ as
above. He sends Alice $c_0 c_1 \ldots c_r$ and $x_{r+1}$ over a public channel,
where eavesdroppers may see it. Using the secret factors of n, Alice
deciphers the message by computing $x_r, x_{r-1}, \ldots, x_0$ from $x_{r+1}$.
An eavesdropper who knows $x_{r+1}$ but not the secret factors cannot
perform this computation and so cannot decipher the message.
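
The cipher just described can be sketched in Python. The function names are ours, the sketch reduces $x_0 = s^2$ modulo n (an assumption on our part), and the modulus 77 is far too small for any real use. The backward step relies on the fact that, for a prime p ≡ 3 (mod 4), the root $x^{(p+1)/4} \bmod p$ of a quadratic residue x is itself a quadratic residue, being a power of x.

```python
def bbs_encipher(bits, s, n):
    """Bob: XOR the message bits with the BBS bits; return (cipher bits, x_{r+1})."""
    x = s * s % n                      # x_0
    cipher = []
    for m in bits:
        cipher.append(m ^ (x & 1))     # c_i = m_i XOR b_i
        x = x * x % n                  # advance to x_{i+1}
    return cipher, x


def bbs_previous(x, p, q):
    """Alice: the square root of x modulo p*q that is itself a quadratic residue."""
    rp = pow(x, (p + 1) // 4, p)
    rq = pow(x, (q + 1) // 4, q)
    return (q * pow(q, -1, p) * rp + p * pow(p, -1, q) * rq) % (p * q)


def bbs_decipher(cipher, x_last, p, q):
    """Alice: step backwards from x_{r+1} and undo the XOR bit by bit."""
    x = x_last
    bits = []
    for c in reversed(cipher):
        x = bbs_previous(x, p, q)
        bits.append(c ^ (x & 1))
    return bits[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0]
cipher, x_last = bbs_encipher(message, 3, 77)
print(bbs_decipher(cipher, x_last, 7, 11) == message)
```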

Theorem 4.41. An eavesdropper who can compute in polynomial
time $x_r$ for any given $x_{r+1}$ can factor n in polynomial time.

Proof. The eavesdropper chooses any t for which the Jacobi symbol
(t/n) = −1. Then t is a quadratic nonresidue modulo n. Let $x_{r+1} = t^2 \bmod n$.
Compute $x_r$ in polynomial time. Then $x_r^2 \equiv x_{r+1} \equiv t^2 \pmod{n}$,
but $x_r \not\equiv \pm t \pmod{n}$, so $\gcd(t + x_r, n)$ is a proper
factor of n by Theorem 6.17. □

Theorem 4.42. Alice can use knowledge of the factors p, q of n to
compute $x_r$ from $x_{r+1}$ in polynomial time.

Proof. The number −1 is a quadratic nonresidue of p and of q by
part (5) of Theorem 2.59. Thus, u is a quadratic residue modulo
p if and only if −u is a quadratic nonresidue modulo p by part (1)
of Theorem 2.59, and likewise modulo q. The square roots of $x_{r+1}$
modulo p are $\pm x_{r+1}^{(p+1)/4}$ by Theorem 3.5, and exactly one of these two
numbers is a quadratic residue. This statement holds also modulo q.
The four square roots of $x_{r+1}$ modulo n may be constructed from
the square roots of $x_{r+1}$ modulo p and modulo q by the Chinese
Remainder Theorem 2.20, as we saw for Rabin’s cipher above. Now x
is a quadratic residue modulo n if and only if it is a quadratic residue
modulo both p and q. Therefore, exactly one of the four square roots
of $x_{r+1}$ modulo n is a quadratic residue modulo n; this one is $x_r$. All
the steps of this calculation may be done in polynomial time. □

Example 4.43. Suppose p, q, and n for Blum-Blum-Shub have the
values in Example 4.40 above. Suppose $x_{r+1} = 58$, the C in Example
4.40. Which one of the four square roots of $x_{r+1}$ is the quadratic
residue $x_r$ modulo n?
Modulo p = 7, we have C = 58 ≡ 2, and we found that the two
square roots of 2 are 3 and 4. Clearly, $4 = 2^2$ is a quadratic residue,
and therefore 3 = 7 − 4 is a quadratic nonresidue.
Modulo q = 11, we have C = 58 ≡ 3, and we found that the two
square roots of 3 are 5 and 6. Evaluating the Legendre symbols by Euler’s
Criterion, part (4) of Theorem 2.59, we find $(5/11) \equiv 5^{(11-1)/2} = 5^5 \equiv +1 \pmod{11}$
and $(6/11) \equiv 6^{(11-1)/2} = 6^5 \equiv -1 \pmod{11}$, so 5
is the quadratic residue.
The square root $x_r$ of 58 that is a quadratic residue modulo n =
77 is the one that is a quadratic residue modulo both p and q, that
is, the one ≡ 4 (mod 7) and ≡ 5 (mod 11). We saw in Example 4.40
that $x_r \equiv 60 \pmod{77}$.

Given n and $x_i$, one can compute $x_{i+j}$ for any j > 0 by
$x_{i+j} = x_i^{2^j} \bmod n$. In case j is very large, we can do this in $O((\log j)(\log n)^2)$
steps by computing $t = 2^j \bmod \phi(n)$ first and then $x_{i+j} = x_i^t \bmod n$.
This property is useful if one wishes to decipher the middle of a long
message and not start from its beginning.
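
A minimal Python sketch of this shortcut follows. The function names are ours, and the sketch assumes that the factors p and q are known (so that φ(n) can be computed) and that $x_i$ is relatively prime to n.

```python
def bbs_jump(x_i, j, p, q):
    """Compute x_{i+j} from x_i with two modular exponentiations."""
    n = p * q
    phi = (p - 1) * (q - 1)
    t = pow(2, j, phi)         # t = 2^j mod phi(n)
    return pow(x_i, t, n)      # x_{i+j} = x_i^t mod n


def bbs_step_by_step(x_i, j, n):
    """The same value obtained by squaring j times, for comparison."""
    x = x_i
    for _ in range(j):
        x = x * x % n
    return x


print(bbs_jump(4, 1000, 7, 11) == bbs_step_by_step(4, 1000, 77))
```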

4.8.6. Zero-Knowledge Proofs. A zero-knowledge proof is a protocol
between two parties, called the Prover and the Verifier, in which
the Prover convinces the Verifier that she knows a certain secret. She
does this without revealing to the Verifier any part of the secret. After
the protocol finishes, neither the Verifier nor any eavesdropper
can convince someone else that he or she knows the secret (perhaps
by replaying parts of the protocol he or she has seen). Although zero-knowledge
proofs exist for many different types of secrets, we consider
here only the one in which the secret is the factorization of a large
integer N.
To keep the protocol simple, assume N = pq is the product of
two different prime numbers. The Prover knows p and q. The Verifier
is not supposed to learn p or q but is supposed to become convinced
that the Prover knows them. Roughly speaking, the Verifier will
give the Prover several quadratic residues modulo N computed as
$b = a^2 \bmod N$, and the Prover will reply with a square root of each,
say c with $c^2 \equiv b \pmod{N}$. Theorem 3.5 and Algorithm 3.6 tell how
the Prover can compute square roots of quadratic residues modulo N
quickly, provided she knows the prime factors of N. We will see in
Corollary 6.19 that anyone who can compute square roots of arbitrary
quadratic residues modulo N quickly can also factor N. The danger in
doing the protocol in this simple way is that the Verifier will probably
be able to factor N, by Theorem 6.18. Here is one standard way to
avoid this trap and perform the protocol safely.

(1) Prover chooses a in $\sqrt{N} < a < N$ and lets $b = a^2 \bmod N$.
(2) Verifier chooses c in $\sqrt{N} < c < N$ and lets $d = c^2 \bmod N$.
(3) Prover sends b to Verifier and Verifier sends d to Prover.
(4) Prover receives d and solves $x^2 \equiv bd \pmod{N}$. Let $x_1$ be
one of the (four) solutions.
(5) Verifier chooses a random bit 0 or 1, each with probability
0.5, and sends the bit to the Prover.
(6) Prover receives the bit. If it is 0, she sends a to the Verifier.
If it is 1, she sends $x_1$ to the Verifier.
(7) Verifier receives a or $x_1$. If the bit was 0, he checks that
$b = a^2 \bmod N$. If it was 1, he checks that $x_1^2 \equiv bd \pmod{N}$.

The Prover and Verifier repeat steps (1) through (7) a few dozen
times. If the check in step (7) is always correct, then the Verifier
accepts that the Prover really knows the factors of N . But if the
check in step (7) ever fails, then the Verifier concludes that the Prover
does not know the factorization of N .
If the Prover really knows the factors of N, then she can compute
all the square roots, as explained above. But if the Prover does not
know the factors of N, then she cannot compute both of the square
roots needed in step (6). (It turns out that she could fake either one
of them if she knew in advance whether the bit would be 0 or 1 in
step (5).) If the protocol is repeated 30 times, there is one chance
in $2^{30} \approx 10^9$ that the Prover could correctly guess the bit in step
(5) each time and supply the needed square root in step (6) if she
did not know the factors of N. Verifier does not learn the factors of
N no matter how many times the protocol is repeated because the
quadratic residues are new each time and because he learns only one
square root of each one. He would have to get two different square
roots (whose sum is not N) in order to factor N by Theorem 6.17.
But we will see in Chapter 10 that the Verifier can cheat and learn
the factors of N anyway, and that is why this simple version is not
used.
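
One round of steps (1) through (7) with an honest Prover can be simulated in Python. Everything here is a sketch under our own assumptions: the function names are ours, the primes 1019 and 1031 are toy values, and both are taken ≡ 3 (mod 4) so that the square root in step (4) can be found by the formula of Theorem 3.5 rather than by a general algorithm.

```python
import random
from math import isqrt


def sqrt_mod_n(v, p, q):
    """A square root of the quadratic residue v modulo p*q, for p % 4 == q % 4 == 3."""
    rp = pow(v, (p + 1) // 4, p)
    rq = pow(v, (q + 1) // 4, q)
    return (q * pow(q, -1, p) * rp + p * pow(p, -1, q) * rq) % (p * q)


def run_round(p, q, rng):
    """Simulate one round; return True if the Verifier's check in step (7) passes."""
    N = p * q
    a = rng.randrange(isqrt(N) + 1, N)     # step (1): Prover's secret a
    b = a * a % N
    c = rng.randrange(isqrt(N) + 1, N)     # step (2): Verifier's secret c
    d = c * c % N
    x1 = sqrt_mod_n(b * d % N, p, q)       # step (4): needs the factors p, q
    bit = rng.randrange(2)                 # step (5): Verifier's challenge
    if bit == 0:                           # steps (6), (7): Prover reveals a
        return a * a % N == b
    return x1 * x1 % N == b * d % N        # steps (6), (7): Prover reveals x1


rng = random.Random(2013)
print(all(run_round(1019, 1031, rng) for _ in range(30)))
```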

4.9. Other Applications


Theorems like 3.15, 3.17, 3.19, and 3.23 were discovered by examining
tables of numbers $b^m \pm 1$ factored.
Factorization is needed to evaluate arithmetic functions and to
identify Carmichael numbers via Theorems 2.41 and 3.36. There are
many more arithmetic functions in addition to those in Section 2.5.
Another one is the sum of the odd positive divisors of an integer.
Other examples come from expressing an integer n as the sum of k
squares of integers. Let $r_k(n)$ denote the number of solutions to
\[ n = x_1^2 + x_2^2 + \cdots + x_k^2 \]
in integers $x_1, x_2, \ldots, x_k$ (positive, negative, or zero). Then
$f_k(n) = r_k(n)/r_k(1)$ is a multiplicative function (of n) for k = 1, 2, 4, and 8,
but for no other positive integer k. Jacobi proved that
\[ r_4(n) = \begin{cases} 8\sigma(n) & \text{if } n \text{ is odd,} \\ 24\sigma(m) & \text{if } n \text{ is even and } m \text{ is its largest odd divisor.} \end{cases} \]
See Theorem 2.6 in [MW06] for a proof. Since the sum of divisors
function σ(n) is positive for every integer n > 0, every positive integer
is the sum of four squares. We can compute the number of ways to
write n as the sum of four squares provided we know the prime factors
of n.

Example 4.44. In how many ways can 30 be written as the sum of
four squares?
The largest odd divisor of 30 is 15 = 3 · 5, and
\[ \sigma(15) = (3 + 1)(5 + 1) = 24. \]
By Jacobi’s theorem, $r_4(30) = 24\sigma(15) = 24 \cdot 24 = 576$.
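
The count in Example 4.44 can be confirmed by exhaustive search. The Python sketch below (function names are ours) compares Jacobi's formula with a brute-force count over all 4-tuples of integers. The divisor sum is computed naively, which is fine for the tiny n used here.

```python
from itertools import product
from math import isqrt


def r4_brute(n):
    """Count integer solutions of n = x1^2 + x2^2 + x3^2 + x4^2 by exhaustive search."""
    r = isqrt(n)
    values = range(-r, r + 1)
    return sum(1 for x in product(values, repeat=4)
               if sum(v * v for v in x) == n)


def sigma(m):
    """Sum of the positive divisors of m (naive)."""
    return sum(d for d in range(1, m + 1) if m % d == 0)


def r4_jacobi(n):
    """Jacobi's formula: 8*sigma(n) for odd n, 24*sigma(odd part of n) for even n."""
    m = n
    while m % 2 == 0:
        m //= 2
    return 8 * sigma(n) if n % 2 else 24 * sigma(m)


print(r4_jacobi(30), r4_brute(30))
```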

Fermat proved that an integer n can be expressed as the sum of
the squares of two integers if and only if no prime ≡ 3 (mod 4) exactly
divides n to an odd power. Therefore, in many cases, one must know
the prime factors of n to be able to tell whether it is the sum of two
squares. But some cases are easy and do not require knowledge of
the factors of n. If n ≡ 3 (mod 4), then n is not the sum of two
squares because it must have at least one prime factor ≡ 3 (mod 4)
that divides it an odd number of times.

Example 4.45. Which of the numbers 30, 843, 29149, 30629 is the
sum of two squares?
The number 30 has the prime factor 3 to the first power, so it is
not the sum of two squares. Since 843 ≡ 3 (mod 4), it is not the sum of
two squares. Both 29149 and 30629 are ≡ 1 (mod 4), so we need their
prime factors to answer the question. We factor 29149 = 103 · 283.
Since 103 and 283 ≡ 3 (mod 4), 29149 is not the sum of two squares.
We factor 30629 = 109 · 281. Since both 109 and 281 ≡ 1 (mod 4),
30629 is the sum of two squares.
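
Fermat's criterion can be turned into a short test once n is factored. In the Python sketch below the function name is ours and the factoring is done by trial division, so it only illustrates the criterion on small numbers such as those of Example 4.45.

```python
def is_sum_of_two_squares(n):
    """True if the positive integer n is a sum of two squares of integers (0 allowed)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            exponent = 0
            while n % d == 0:
                n //= d
                exponent += 1
            # a prime = 3 (mod 4) dividing n to an odd power rules n out
            if d % 4 == 3 and exponent % 2 == 1:
                return False
        d += 1
    # any cofactor left is a prime that divides n exactly once
    return not (n > 1 and n % 4 == 3)


for n in (30, 843, 29149, 30629):
    print(n, is_sum_of_two_squares(n))
```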

Those who factor the largest and hardest composite numbers are
rewarded with the bragging rights for this achievement. The cham-
pion factorizations for each factoring algorithm are promulgated on
the Internet together with the names of those who did the work.
Cole could certainly brag after factoring $M_{67} = 2^{67} - 1$. Bell
[Bel51, p. 128] tells this tale, which is probably exaggerated:
When I asked Cole in 1911 how long it had taken him
to crack $M_{67}$, he said “three years of Sundays.” But
this, though interesting, is not the history. At the
October, 1903, meeting in New York of the Ameri-
can Mathematical Society, Cole had a paper on the
program with the modest title On the factorization
of large numbers. When the chairman called on him
for his paper, Cole—who was always a man of very
few words—walked to the board and, saying noth-
ing, proceeded to chalk up the arithmetic for rais-
ing 2 to the sixty-seventh power. Then he care-
fully subtracted 1. Without a word he moved over
to a clear space on the board and multiplied out,
by longhand, 193,707,721 × 761,838,257,287. The
two calculations agreed. Mersenne’s conjecture—if
such it was—vanished into the limbo of mathemati-
cal mythology. For the first and only time on record,
an audience of the American Mathematical Society
vigorously applauded the author of a paper delivered
before it. Cole took his seat without having uttered
a word. Nobody asked him a question.
E.T. Bell [Bel51]

In 1983, Diane Holdridge and Jim Davis, working for Gus Simmons
at Sandia National Labs, factored the 69-digit cofactor of $2^{251} - 1$
and the 60-digit cofactor of $2^{211} - 1$ using a Cray-1 computer. These
were the last two numbers considered by Cunningham and Woodall in
their 1925 book [CW25] to be completely factored. These numbers
were also considered by Mersenne, as mentioned earlier, and were the
last of these numbers $M_p$, p ≤ 257, to be factored completely. This
achievement was reported in Time magazine of February 13, 1984.
In 1988, Mark Manasse and Arjen Lenstra factored the first hard
100-digit composite number. They organized more than a dozen collaborators
running about 400 computers around the world. The number
was the 100-digit cofactor of $11^{104} + 1$. They used the Quadratic
Sieve Algorithm. Their work was reported on the front page of the
New York Times newspaper for October 12, 1988, in an article ti-
tled “A Most Ferocious Math Problem Tamed,” written by Malcolm
W. Browne. The 100-digit composite number turned out to be the
product of a 41-digit prime and a 60-digit prime.
In 1994, Derek Atkins, Michael Graff, Arjen Lenstra, and Paul
Leyland [AGLL95] factored the 129-digit RSA challenge number
published in the August 1977 Scientific American. Gina Kolata an-
nounced their result in articles in the New York Times on March 22
and April 27, 1994. They used the Quadratic Sieve Algorithm. See
the end of Section 8.2 for more details.
Hard-to-factor composite numbers, like RSA challenge numbers
and the unfactored parts of some Cunningham numbers, furnish an
interesting test for new factoring algorithms. One way to prove the
worth of a new factoring method is to factor a number that nobody
else could factor using older methods. Cryptographers who use RSA
and other ciphers that rely on the difficulty of factoring numbers of a
certain size follow developments in factoring to determine when they
should increase the size of their moduli.
A few years ago a prison inmate wrote to me saying that he
had invented a fast new factoring algorithm. He asked for money in
exchange for the algorithm. I sent him some Cunningham numbers
that no one could factor and said I would send him money for the
algorithm if he would send me the factors of these numbers first. He
never replied.
In Chapter 8, we give some applications of factoring integers to
elliptic curves.

Exercises

4.1. Study Table 5. It gives factorizations of $5^m - 1$. Deduce the Aurifeuillian
factorization for $\Phi_5(5^{5h})$, where h is an odd positive
integer. The primes following the parentheses are the factors
of $\Phi_5(5^{5h})$ for h = 1, 3, 5, and 7. You do not need to know the
earlier lines³ referenced inside the parentheses. Try to group
the factors following the parentheses to form a product of two
nearly equal integers.

Table 5. Numbers $5^m - 1$ factored.

m     $5^m - 1$
5     (1) 11.71
15    (1, 3) 11.71.181.1741
25    (1, 5) 101.251.401.9384251
35    (1, 7) 11.71.211.631.4201.85280581

³ The primitive factors 11 and 71 of $5^5 - 1$ have been copied into the lines with
m = 15 and 35 because they participate in the Aurifeuillian factorization.

4.2. Factor some numbers $13^{13h} - 1$, where h is a small odd integer.
Deduce an Aurifeuillian factorization for these numbers.
4.3. Find the Aurifeuillian factorization of $20^{5h} - 1$; that is, finish
Example 4.8.
4.4. Aurifeuillian factorizations were discovered when mathematicians
noticed nearly equal factors of certain numbers. Note
that

\[ 6^{106} + 1 = 37 \cdot 26713 \cdot 175436926004647658810244613736479118917 \cdot 175787157418305877173455355755546870641, \]

has two nearly equal (prime) factors. Also, the number $12^{193} - 1$
has two nearly equal 77-digit (prime) factors:

4521744280918 . . . 81162723213257,
4657568121081 . . . 71111751624211.

Try to deduce algebraic factorizations from these examples.
4.5. Note that $2^{43} - 1 = 431 \cdot 9719 \cdot 2099863 = 2099863 \cdot 4188889$
and the last two factors are close to $2^{21}$ and $2^{22}$. Can you
generalize this example to a formula like (4.1)?
4.6. Write (prime) factors of $b^n \pm 1$ in number base b. Observe
patterns in the digits. Try to deduce algebraic factorizations
from the patterns.
4.7. Prove that p = 3 is the only odd prime for which both
$(p^p - 1)/(p - 1)$ and $(p^p + 1)/(p + 1)$ are prime.
4.8. Show that every odd perfect number has at least seven different
prime factors.
4.9. Tell which lines in which Cunningham tables you would consult
for help in proving that the Aurifeuillian factor 2997 + 2499 + 1
of 21994 + 1 is a prime number using Theorem 3.27.
4.10. Complete the proof by Theorem 3.27 that the P 302 divisor of
21004 + 1 is prime. Find the prime factors of P 302 − 1 in the
online Cunningham tables.
4.11. Trace the contents of the 4-bit LFSR with characteristic poly-
nomial f (x) = x4 + x + 1 and initial contents 0001 through one
complete cycle.
4.12. Draw diagrams of the LFSRs whose characteristic polynomials
appear in Example 4.26, showing which bits of the shift register
are connected to the exclusive-or gate. Determine the period
length for each LFSR.
4.13. Show that for 1 ≤ s ≤ n/2, the polynomial $x^n + x^s + 1$ is
primitive if and only if $x^n + x^{n-s} + 1$ is primitive.
4.14. The Bell numbers modulo 3 are periodic with period length
$N_3 = 13$. Find the terms of the period.
4.15. Show that 5020 and 5564 are an amicable pair.
4.16. Verify that {12496, 14288, 15472, 14536, 14264} is a set of so-
ciable numbers.
4.17. Show that 14316 is one element of a set of sociable numbers.
4.18. Compute the first 25 terms of the aliquot sequence starting
with 276.
4.19. Use Bernoulli numbers to express the sum $1^3 + 2^3 + 3^3 + \cdots + k^3$
as a polynomial of degree 4 in k.
4.20. Compute the Bernoulli denominator $Q_{100}$.
4.21. Prove that if p and q are different odd primes, then $Q_{p-1} \ne Q_{q-1}$.
4.22. Prove that 691 and 3617 are irregular primes using only the
information in Section 4.7.
4.23. Show that the RSA encryption and decryption exponents e and
d are always odd integers.
4.24. Would RSA work if N = pqr were the product of three primes?
Perhaps N would be easier to factor than if it had only two
prime factors, but would the encryption and decryption func-
tions work correctly for all messages 0 < M < N ?
4.25. In how many ways can 31, 32, 33, and 34 be written as the
sum of four squares?
4.26. Which of the numbers 193, 211, 1663633, 24847873 are the sum
of two squares?
Chapter 5

Simple Factoring Algorithms

Introduction
This chapter describes some of the slow methods of factoring integers.
Although they are slow, most of them have short, simple programs
and require little auxiliary storage. Fifty years ago there were no
better algorithms. Furthermore, they work well for factoring small
integers, up to $10^9$ at least, so they are used for many applications
where the numbers are not large.
Until about 100 years ago, people computed and published tables
of primes and of prime factors of integers from 1 to some limit.
Eratosthenes made the first such table about 2,200 years ago. In
1603, Cataldi published a table of factors of the integers 1 to 750.
Chernac published a factor table to 1020000 in 1811. Burckhardt
published one for the second million three years later. Crelle ex-
tended this work to five million, but his tables were too inaccurate to
be published. The last factor table (to 10017000) was published by
D. N. Lehmer [Leh09] in 1909. Five years later he published a table
of all primes up to 10006721. No more such tables will be published
because one can compute them in seconds using Algorithms 8.2, 8.3,
and 8.5.


We will revisit the Trial Division Algorithm 3.1 and tell simple
tricks for making it slightly faster. Fermat’s Factoring Method quickly
factors the product of two primes of about the same size. Hart’s
algorithm has a very short program and works well for integers with
some special forms. Lawrence and Lehman invented other variations
on Fermat’s Method. Euler, Legendre, Gauss, Chebyshev, and the
Lehmers found ways to factor integers that can be expressed in the
form $ax^2 + bxy + cy^2$. In the 1970s, Pollard created two interesting
factoring methods, called the Rho Method and the p − 1 Method.
Like Trial Division, they find small factors of large integers.
Several algorithms in this and later chapters have a block of in-
structions repeated many times, and one of the instructions is expen-
sive, but it is performed only during one of every k iterations of the
block. When we analyze the running time for the block of instruc-
tions, we add 1/k of the time for the expensive instruction to the time
for the other instructions to obtain the total time for one iteration
of the block. This calculation is called amortizing the time for the
expensive instruction.
Example 5.1. In the following instructions, assume n is much larger
than k, that Instruction 2 takes 50 time units, while Instructions 1
and 3 each take 1 time unit. Then the total time for the entire for
loop to execute is approximately n(50/k + 2) time units.
Example of Amortization
for i ← 1 to n {
Instruction 1
If (i ≡ 1 (mod k)) { Instruction 2 }
Instruction 3
}
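The estimate in Example 5.1 can be checked by simulating the loop and counting time units. This is only a sketch: the function name `loop_cost` and its parameters are illustrative and do not come from the text.

```python
def loop_cost(n, k, cheap=1, expensive=50):
    """Count the time units used by the loop of Example 5.1."""
    total = 0
    for i in range(1, n + 1):
        total += cheap              # Instruction 1
        if i % k == 1 % k:          # i ≡ 1 (mod k): the expensive step
            total += expensive      # Instruction 2
        total += cheap              # Instruction 3
    return total

n, k = 100_000, 10
print(loop_cost(n, k))              # measured cost
print(n * (50 / k + 2))             # amortized estimate n(50/k + 2)
```

When k divides n the two numbers agree exactly; otherwise they differ by at most the cost of one expensive instruction.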

5.1. Trial Division


The basic Trial Division Algorithm 3.1 is given in Section 3.1. Of
course, it is a very old and slow algorithm. For three hundred years,
factorers who used Trial Division sought ways to skip trial divisors
that could not possibly divide the candidate number. The algorithm
does not need to use composite trial divisors. One way to skip some
composites is to alternate adding 2 and 4 to a trial divisor to form
the next one. This trick skips multiples of 2 and 3. The technique is
called a wheel and can be extended to skip multiples of 2, 3, and 5 by
adding the differences between consecutive residue classes relatively
prime to 30.
Example 5.2. The residue classes relatively prime to 30 are
1, 7, 11, 13, 17, 19, 23, 29.
After Trial Division by 2, 3, and 5, one should add successively 7−5 =
2, 11 − 7 = 4, 13 − 11 = 2, 17 − 13 = 4, 19 − 17 = 2, 23 − 19 = 4,
29 − 23 = 6, 31 − 29 = 2, 37 − 31 = 6, 2, 4, . . ., to the previous trial
divisor to form the next one. This wheel has φ(30) = 8 “spokes.”
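The wheel of Example 5.2 might be programmed as follows. This is a minimal sketch, not the book's Algorithm 3.1: the function name `wheel_trial_division` is ours, and the gap table starts at the trial divisor 7, after 2, 3, and 5 have been handled separately.

```python
def wheel_trial_division(n):
    """Return the prime factors of n (with multiplicity) using a mod-30 wheel."""
    factors = []
    for p in (2, 3, 5):                 # divide out the wheel primes first
        while n % p == 0:
            factors.append(p)
            n //= p
    # Differences between consecutive residues prime to 30, starting at 7:
    # 7 -> 11 -> 13 -> 17 -> 19 -> 23 -> 29 -> 31 -> 37
    gaps = (4, 2, 4, 2, 4, 6, 2, 6)
    d, i = 7, 0
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += gaps[i]
        i = (i + 1) % 8                 # eight "spokes", then repeat
    if n > 1:                           # whatever remains is prime
        factors.append(n)
    return factors

print(wheel_trial_division(11231))      # [11, 1021]
```

The wheel still tries some composite divisors (49, 77, ...), but only 8 of every 30 integers are tried at all.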

Wheels that skip multiples of the first eight or ten primes have
been used on computers to accelerate Trial Division.
Alternatively, one could compute a table of primes and divide the
candidate number only by these primes. A sieve can be used to build
a table of primes between two limits faster than the primes could be
read from disk, so there is no need to save a large table of primes in
a file. See Section 8.1 for more about sieves.
One can use quadratic residues to speed Trial Division by skipping
some primes that cannot be divisors. This device was used by Euler,
Gauss, and others hundreds of years ago. Let N be the number to
factor. Suppose we know a nonsquare quadratic residue r modulo N .
Then r is also a quadratic residue modulo any prime factor p of N .
If r is not a square, the Law of Quadratic Reciprocity restricts p to
only half of the possible residue classes modulo 4|r|.
Example 5.3. Suppose we are trying to factor N and we know that
13 is a quadratic residue modulo N. If p is a prime divisor of N, then
13 is also a quadratic residue modulo p. We saw in Example 2.60
that the odd primes p ≠ 13 for which 13 is a quadratic residue are
the primes p ≡ 1, 3, 4, 9, 10, or 12 (mod 13). Therefore, the primes
3, 13, 17, 23, 29, 43, . . . may divide N, but not the primes 5, 7, 11,
19, 31, 37, 41, . . ..
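The list in Example 5.3 can be reproduced with Euler's criterion, which says that a is a quadratic residue modulo an odd prime p (not dividing a) exactly when $a^{(p-1)/2} \equiv 1 \pmod p$. The sketch below is illustrative; the helper name `is_qr` and the hand-typed list of primes are our own.

```python
def is_qr(a, p):
    """Euler's criterion: is a a nonzero square modulo the odd prime p?"""
    return pow(a, (p - 1) // 2, p) == 1

# Odd primes up to 43, omitting 13 itself.
primes = [3, 5, 7, 11, 17, 19, 23, 29, 31, 37, 41, 43]

allowed = [p for p in primes if is_qr(13, p)]
print(allowed)                          # [3, 17, 23, 29, 43]
```

Each allowed prime lies in one of the residue classes 1, 3, 4, 9, 10, 12 modulo 13, as the Law of Quadratic Reciprocity predicts.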

If two nonsquare quadratic residues $r_1$ and $r_2$ modulo N are
known and neither is a square times the other, then the set of possible
prime factors of N is cut in half twice, so that only one-fourth
of all primes could divide N. If one tries to use this technique with
many nonsquare quadratic residues modulo N, the bookkeeping (data
structures) becomes unwieldy and impractical. But special hardware
(see Section 9.1) can help.
Where do we find nonsquare quadratic residues to use in the
prime divisor limiting technique just described? In the example above,
how could one know that 13 is a quadratic residue modulo N? First,
it is easy to find quadratic residues modulo N. Just pick a random
integer $x > \sqrt{N}$ and compute $r = x^2 \bmod N$. (We need $x > \sqrt{N}$
because if $x < \sqrt{N}$, then $r = x^2$ is a square and therefore a quadratic
residue modulo every prime that does not divide x.) Second, a small
r is better than a large r because the set of residue classes allowed
for the candidate divisors is more manageable. The size of this set is
about |r| − 1 or |r| − 2 residue classes modulo 4|r|, that is, about half
of the possible classes. See Table 22 of Riesel [Rie94]. One source
of small quadratic residues is $x^2 \bmod N$ where x is slightly larger
than an integer multiple of $\sqrt{N}$. (This idea led to the quadratic sieve
algorithm of Section 8.2.)

Example 5.4. Factor N = 11231 completely by Trial Division.
The square root is $\sqrt{N} \approx 105.976$. Let $x = \lceil \sqrt{N} \rceil = 106$. Then
$x^2 \bmod N = 11236 - 11231 = 5$. Clearly, 5 does not divide N. By
the Law of Quadratic Reciprocity, the primes with 5 as a quadratic
residue, which are the possible divisors of N, are p ≡ 1 or 9 (mod 10).
Trying p = 11, 19, 29, 31, 41, 59, 61, . . ., we find that N = 11 · 1021.
Is 1021 prime? It must have the same set of possible prime divisors
because 5 is a quadratic residue modulo 1021, too, by part (5) of
Theorem 2.16. After the first four candidates p have been tried and
do not divide 1021, we have proved that 1021 is prime because the
fifth trial divisor, 41, is $> \sqrt{1021}$.
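The computation in Example 5.4 can be sketched in a few lines. The function names are hypothetical, and for simplicity the sketch tries every odd d ≡ 1 or 9 (mod 10), including composites such as 9 and 21, rather than only the primes in those classes.

```python
from math import isqrt

def small_residue(n):
    """Return (x, x^2 mod n) for x = floor(sqrt(n)) + 1; n is assumed nonsquare."""
    x = isqrt(n) + 1
    return x, x * x % n

def first_divisor_mod10(n):
    """Smallest d ≡ 1 or 9 (mod 10) with 9 <= d <= sqrt(n) dividing n, else None."""
    for d in range(9, isqrt(n) + 1, 2):
        if d % 10 in (1, 9) and n % d == 0:
            return d
    return None

print(small_residue(11231))         # (106, 5): so 5 is a quadratic residue mod N
print(first_divisor_mod10(11231))   # 11
print(first_divisor_mod10(1021))    # None: no allowed divisor up to sqrt(1021)
```

Restricting to two residue classes modulo 10 means only about one odd number in five is tried.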

Another source of small quadratic residues is the continued fraction
expansion of $\sqrt{N}$, as we shall see in Section 6.3. (This idea led
to the factoring algorithms of Sections 6.5 and 6.6.)
Sometimes the special form of N suggests small quadratic residues.
For example, if $N = b^n + 1$ and n is even, then −1 is a quadratic
residue modulo N (because $x^2 \equiv -1 \pmod N$ when $x = b^{n/2}$), so
by part (5) of Theorem 2.59 every odd (prime) factor of N must be
≡ 1 (mod 4). For another example, if $N = b^n - 1$ and n is odd,
then b is a quadratic residue modulo N because $x^2 \equiv b \pmod N$
with $x = b^{(n+1)/2}$. When N is a Cunningham number $b^n \pm 1$, there
is another restriction on possible prime factors of N. The next two
theorems follow from Theorems 3.15 and 3.23, but the proofs here are
simpler.
Theorem 5.5. If p is a primitive prime factor of $b^n - 1$, then p ≡
1 (mod n) and, if n is odd, then p ≡ 1 (mod 2n).

Proof. By definition of primitive factor, n is the least positive integer
m for which p divides $b^m - 1$. But p divides $b^{p-1} - 1$ by Fermat’s
Little Theorem. Therefore, n divides p − 1. If n is odd and p is odd,
then p − 1 is even, so 2n divides p − 1 as well. □
Theorem 5.6. If p is a primitive prime factor of $b^n + 1$, then p ≡
1 (mod 2n).

Proof. By definition of primitive factor, n is the least positive integer
m for which $b^m \equiv -1 \pmod p$ and so 2n is the least positive integer
m for which p divides $b^m - 1$. But p divides $b^{p-1} - 1$ by Fermat’s
Little Theorem. Therefore, 2n divides p − 1. □
Example 5.7. If q is a primitive prime factor of $2^p - 1$, where p is odd,
then q ≡ 1 (mod 2p) by Theorem 5.5. We can also use the technique
mentioned just before Theorem 5.5. Notice that 2 is a quadratic
residue modulo q, so q ≡ ±1 (mod 8) by part (6) of Theorem 2.59.
Write q = 2kp + 1. Then q ≡ 1 (mod 8) if and only if k ≡ 0 (mod 4).
Also, q ≡ −1 (mod 8) if and only if kp ≡ −1 (mod 4), that is, if and
only if k ≡ −p (mod 4). This proves the claim about possible factors
of Mersenne numbers made near the beginning of Section 4.6.
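The restriction in Example 5.7 translates directly into a generator of candidate divisors of $2^p - 1$. The following sketch uses a function name of our own choosing and lists candidates q = 2kp + 1 with k ≡ 0 or −p (mod 4) without testing them for primality.

```python
def mersenne_candidates(p, count):
    """First `count` numbers q = 2kp + 1 with k ≡ 0 or -p (mod 4); p an odd prime."""
    out = []
    k = 1
    while len(out) < count:
        if k % 4 == 0 or (k + p) % 4 == 0:
            out.append(2 * k * p + 1)
        k += 1
    return out

cands = mersenne_candidates(29, 4)
print(cands)                                   # [175, 233, 407, 465]
print([q for q in cands if (2**29 - 1) % q == 0])
```

Every candidate is ≡ ±1 (mod 8); for p = 29 the second one, 233, is prime and divides $2^{29} - 1$.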

5.2. Fermat’s Difference of Squares Method


To factor an odd number N, Fermat tried to express N as a difference
of two squares, $x^2 - y^2$, with the pair x, y different from (N + 1)/2,
(N − 1)/2. This pair gives x + y = N and x − y = 1. Any other
representation of N as $x^2 - y^2$ gives a nontrivial factorization N =
(x − y)(x + y). Note that $x \ge \sqrt{N}$.
Fermat used the following algorithm to factor some numbers.

Algorithm 5.8. Fermat’s Difference of Squares Factoring Algorithm.

Input: An odd composite positive integer N to factor.
x ← ⌊√N⌋
t ← 2x + 1
r ← x² − N
while (r is not a square) {
    r ← r + t
    t ← t + 2
}
x ← (t − 1)/2
y ← √r
Output: The factors x − y and x + y of N.

The variable r in the program takes on the values $x^2 - N$,
$(x + 1)^2 - N$, $(x + 2)^2 - N$, etc., until r is a square, $r = y^2$. The
way r takes on these values without multiplication is that successive
odd numbers t are added to r, beginning with t = 2x + 1. This works
because $(x + 1)^2 = x^2 + (2x + 1)$. When the while loop ends, we have
$r = x^2 - N$, where x = (t − 1)/2, and also r is a square $r = y^2$, say.
Then $N = x^2 - y^2$ is factored.
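Algorithm 5.8 might be rendered in Python as follows. This is a sketch rather than an optimized program: it uses `math.isqrt` for the square test, which is exactly the expensive step the later discussion tries to avoid, and the function name is ours.

```python
from math import isqrt

def fermat_factor(n):
    """Return (x - y, x + y) with n = x^2 - y^2, for odd composite n."""
    x = isqrt(n)                    # floor of the square root
    t = 2 * x + 1                   # next odd number to add
    r = x * x - n                   # r = x^2 - n, initially <= 0
    while r < 0 or isqrt(r) ** 2 != r:
        r += t                      # (x+1)^2 - n = (x^2 - n) + (2x + 1)
        t += 2
    x = (t - 1) // 2
    y = isqrt(r)
    return x - y, x + y

print(fermat_factor(527))           # (17, 31)
```

If n is prime, the loop stops only at the trivial representation and returns (1, n).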
In two lines of the algorithm one must find the integer part of
the square root of an integer. A good way to do this is with a modification
of Newton’s method. The initial value of x in the algorithm
below can be any integer $\ge \sqrt{N}$, the closer to $\sqrt{N}$ the better. The
value in the first line of the algorithm is easy to compute on a binary
computer. The assignment statements $y \leftarrow \lfloor (x + \lfloor N/x \rfloor)/2 \rfloor$ are the
usual Newton iteration for the square root of N, adapted to use only
integers. The successive values of y decrease monotonically as long
as $y > \sqrt{N}$. Then they oscillate between the two integers closest to
$\sqrt{N}$. That is why the while loop terminates when y ≥ x.
Algorithm 5.9. Integer part of the square root of a positive integer.

Input: A positive integer N.
x ← 2^⌈(log₂ N)/2⌉
y ← ⌊(x + ⌊N/x⌋)/2⌋
while (y < x) {
    x ← y
    y ← ⌊(x + ⌊N/x⌋)/2⌋
}
Output: x = ⌊√N⌋.

It is easy to show that this algorithm is correct and requires about
O(log log N) iterations.
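A Python version of Algorithm 5.9 could look like the sketch below. One assumption differs slightly from the text: the starting value is the power of 2 derived from the bit length of N, which is also at least $\sqrt{N}$ but need not equal the value in the first line of the algorithm.

```python
def integer_sqrt(n):
    """Return floor(sqrt(n)) for a positive integer n, using integer Newton steps."""
    # n < 2^b where b = n.bit_length(), so 2^ceil(b/2) >= sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    y = (x + n // x) // 2
    while y < x:                    # y decreases until it reaches floor(sqrt(n))
        x = y
        y = (x + n // x) // 2
    return x

print(integer_sqrt(527))            # 22
```

For N = 527 the successive values of x are 32, 24, 22.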

Example 5.10. Use Fermat’s Difference of Squares Algorithm to
factor N = 527. The following table traces the variables in the algorithm.
(Only t and r are actually computed during the while loop.)

 x    t = 2x + 1    $x^2$    $r = x^2 - N$
22    45            484      −43
23    47            529      2
24    49            576      $49 = 7^2$

The last line of the table gives $527 = 24^2 - 7^2 = (24 - 7)(24 + 7)$.

The condition in the while loop in Fermat’s difference of squares
factoring algorithm may be tested as, “Is $r = (\lfloor \sqrt{r} \rfloor)^2$?”, where the
square root is computed by the algorithm just given. However, the
rest of the loop contains only two additions. The integer square root
algorithm uses several divisions and would dominate the time for the
loop. Fermat, working by hand with decimal numbers, solved this
problem by recognizing possible squares by their low-order digits.
Every square has last decimal digit 0, 1, 4, 5, 6, or 9. In the example
in the table above, r = −43 and 2 cannot be squares because their
last digits¹ are not in the list. Only 22 2-digit numbers may occur
as the last two decimal digits of a square. A binary computer can
test whether r might be a square with the logical operation (r&63)
to find (r mod 64), followed by looking up the remainder in a table
of the twelve possible squares modulo 64. If r passes this test, then
look up $(r \bmod p^e)$ in a table of possible squares modulo $p^e$ for a few
small odd prime powers $p^e$. Only in case r passes all these tests need
one check “$r = (\lfloor \sqrt{r} \rfloor)^2$?” Tricks like these amortize the evaluation
of the while condition to a cost comparable to the cost of the two
addition operations inside the loop.

¹ The “last digit” of −43 is 7 because −43 ≡ 7 (mod 10).
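The filtering idea just described can be sketched as a square test that consults small tables before taking a square root. The choice of the odd moduli 9, 5, 7, 11 is our own assumption; the text only says "a few small odd prime powers."

```python
from math import isqrt

SQUARES_MOD_64 = frozenset(x * x % 64 for x in range(64))     # twelve residues
ODD_MODULI = (9, 5, 7, 11)                                    # illustrative choice
SQUARES_MOD = {m: frozenset(x * x % m for x in range(m)) for m in ODD_MODULI}

def is_square(r):
    """Return True if r is a perfect square, rejecting most non-squares cheaply."""
    if r < 0 or (r & 63) not in SQUARES_MOD_64:   # r & 63 equals r mod 64
        return False
    for m in ODD_MODULI:
        if r % m not in SQUARES_MOD[m]:
            return False
    s = isqrt(r)                                  # reached only rarely
    return s * s == r

print(len(SQUARES_MOD_64))                        # 12
print(is_square(49), is_square(2), is_square(-43))
```

The table modulo 64 alone rejects 52 of every 64 residues, so the exact square root is seldom computed.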

Theorem 5.11 (Complexity of Fermat’s factoring algorithm). Let
the odd composite positive integer N = ab, where a is the largest
divisor of N which is $\le \sqrt{N}$. Let $k = a/\sqrt{N}$, so that 0 < k ≤ 1. Then
the instructions of the while loop in Fermat’s difference of squares
factoring algorithm are executed $1 + \lfloor (1 - k)^2 \sqrt{N}/(2k) \rfloor$ times.

Proof. If a = b, then N is a square, k = 1, and the while loop
ends after the first iteration. In any case, when the algorithm ends,
x − y = a and x + y = b, that is, x = (a + b)/2 and y = (b − a)/2.
At this time, x = (t − 1)/2, so t = 1 + a + b = 1 + a + N/a. The
variable t increases by 2 at each iteration, begins at the first odd
number $> 2\sqrt{N}$, and stops at 1 + a + N/a. Hence, the while loop is
executed

$$1 + \left\lfloor \frac{1}{2}\left(a + \frac{N}{a} - 2\sqrt{N}\right) \right\rfloor = 1 + \left\lfloor \frac{(\sqrt{N} - a)^2}{2a} \right\rfloor = 1 + \left\lfloor \frac{(1 - k)^2}{2k}\sqrt{N} \right\rfloor$$

times. □

Theorem 5.11 does not say that the time complexity of Fermat’s
algorithm is $O(\sqrt{N})$ to factor N. In the worst case, N = 3p, for
some prime p, we have $k = 3/\sqrt{N}$ and the while loop is performed
essentially N/6 times. If $a \approx N^{1/3}$ and $b \approx N^{2/3}$, then $k \approx N^{-1/6}$
and the number of steps needed is

$$\frac{(1 - k)^2}{2k}\sqrt{N} \approx \frac{(1 - N^{-1/6})^2}{2N^{-1/6}}\sqrt{N} \approx \frac{N^{2/3}}{2},$$

much slower than Trial Division. The algorithm works well when a is
very close to $\sqrt{N}$, say, within $O(\sqrt[4]{N})$ of $\sqrt{N}$.
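The iteration count of Theorem 5.11 can be compared with an actual run of Fermat's loop. The sketch below counts loop iterations directly and evaluates the formula $1 + \lfloor (\sqrt{N} - a)^2/(2a) \rfloor$ in floating point; both function names are ours, and the floating-point evaluation is an assumption that is adequate only for modest N.

```python
from math import floor, isqrt, sqrt

def fermat_iterations(n):
    """Number of times the while loop of Fermat's algorithm runs for odd composite n."""
    x = isqrt(n)
    t, r, count = 2 * x + 1, x * x - n, 0
    while r < 0 or isqrt(r) ** 2 != r:
        r += t
        t += 2
        count += 1
    return count

def predicted_iterations(n, a):
    """The count from Theorem 5.11, where a is the largest divisor of n below sqrt(n)."""
    return 1 + floor((sqrt(n) - a) ** 2 / (2 * a))

for n, a in [(527, 17), (2304167, 1103), (3 * 1009, 3)]:
    print(n, fermat_iterations(n), predicted_iterations(n, a))
```

For N = 3 · 1009 = 3027 the loop runs 451 times, compared with N/6 ≈ 504, which illustrates how badly the method behaves when the factors are far apart.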
One can build special hardware devices (called sieves) capable
of executing the while loop at high speed. See Lehmer [Leh33a],
[Leh66] and Williams and Patterson [WP83] for information about
the construction and use of sieves. See also Section 9.1.
Sometimes one can enhance Fermat’s Difference of Squares Meth-
od with the tricks from Trial Division.
Example 5.12. Factor the Mersenne number $M_{29} = 2^{29} - 1$.
Use the tricks in Example 5.7. Any prime factor q of $M_{29}$ must
have the form q = 2k · 29 + 1 = 58k + 1, where k ≡ 0 or 3 (mod 4).
Now 58 · 3 + 1 = 175 is composite, but 58 · 4 + 1 = 233 is prime,
and we find $233 \mid 2^{29} - 1$. The cofactor $N = M_{29}/233 = 2304167$ is
composite. Write

$$N = x^2 - y^2 = (x - y)(x + y) = ab = (58k + 1)(58\ell + 1).$$

Then $2y = b - a = 58(\ell - k)$, so $29 \mid y$ and $29^2 = 841 \mid y^2$. We have

$$x^2 = N + y^2 \equiv N = 2304167 \equiv 668 \pmod{841}.$$

Taking square roots as in Example 3.9, we find x ≡ ±86 (mod 841).
Now we begin Fermat’s Difference of Squares Method, but rather
than trying all $x \ge \sqrt{N}$, we use only those x ≡ ±86 (mod 841).
We find $\sqrt{N} = \sqrt{2304167} \approx 1517.9$, so the first x we try is x =
2 · 841 − 86 = 1596. This gives at once

$$x^2 - N = 1596^2 - 2304167 = 243049 = 493^2 = y^2.$$

The factors of N are x ± y = 1596 ± 493 = 1103 and 2089.
This example is due to Richard Schroeppel.
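The search in Example 5.12 can be sketched as a Fermat loop that tests only the values of x in given residue classes. The function name and its parameters are illustrative; for clarity the sketch steps x by 1 and skips unwanted classes instead of jumping between them.

```python
from math import isqrt

def fermat_restricted(n, modulus, residues):
    """Fermat's method, testing only x with x mod modulus in residues."""
    x = isqrt(n)
    if x * x < n:
        x += 1                          # x = ceil(sqrt(n))
    while x <= (n + 1) // 2:            # beyond this only the trivial pair remains
        if x % modulus in residues:
            r = x * x - n
            y = isqrt(r)
            if y * y == r:
                return x - y, x + y
        x += 1
    return None

# x ≡ ±86 (mod 841), i.e. x mod 841 is 86 or 755
print(fermat_restricted(2304167, 841, {86, 755}))   # (1103, 2089)
```

Only one square test is performed here, at x = 1596, instead of the 79 loop iterations the unrestricted method would need.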

5.3. Hart’s One-Line Factoring Algorithm


Hart [Har12] invented a variation of Fermat’s Factoring Method
which has a very short, simple program. He gave a heuristic argument
that it factors N in $O(N^{1/3+\varepsilon})$ steps.
Hart’s algorithm begins by checking whether N is square. If N is
not square, then it does Trial Division, Algorithm 3.1, but quits when
p reaches $N^{1/3}$. In case N has not yet been factored, it performs the
following steps.

For i = 1, 2, 3, . . ., test whether $\lceil \sqrt{Ni} \rceil^2 \bmod N$ is a square. If
this number equals $t^2$, then $\gcd(\lceil \sqrt{Ni} \rceil - t, N)$ is a factor of N.

The heuristic argument predicts success before i gets much larger
than $O(N^{1/3})$. Here are the steps after the square check and Trial
Division.
Algorithm 5.13. Hart’s One-Line Factoring Algorithm.
Input: A positive integer N and a limit L.
for (i ← 1 to L) {
    s ← ⌈√(Ni)⌉
    m ← s² mod N
    if (m is a square) { break }
}
t ← √m
Output: gcd(s − t, N) is a factor of N.

Example 5.14. Factor N = 13290059 by Hart’s One-Line Factoring
Algorithm.
When i = 165, we have s = 46828 and $m = 1849 = 43^2$, so
t = 43. Then gcd(s − t, N) = gcd(46785, 13290059) = 3119, a factor
of N. In this case i had to go up to about $0.7N^{1/3}$ to find a square
m.
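Algorithm 5.13 is short enough to transcribe almost line for line. The sketch below assumes, as the text does, that the square check and Trial Division up to $N^{1/3}$ have already been done; the function name `hart_olf` is ours, and the function returns None if no square appears within the limit.

```python
from math import gcd, isqrt

def hart_olf(n, limit):
    """Hart's One-Line Factoring loop: return a factor of n, or None."""
    for i in range(1, limit + 1):
        s = isqrt(n * i)
        if s * s < n * i:
            s += 1                      # s = ceil(sqrt(n * i))
        m = s * s % n
        t = isqrt(m)
        if t * t == m:                  # m is a square, m = t^2
            return gcd(s - t, n)
    return None

print(hart_olf(13290059, 1000))
```

With N = 13290059 the values at i = 165 are s = 46828 and m = 1849, as in Example 5.14.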

This algorithm is especially fast for integers of the special form
$(c^a + d)(c^b + e)$, where c, |d|, |e|, and |a − b| are small positive integers.
For example, if p and q are the next primes after $10^{200}$ and $10^{201}$,
respectively, then the One-Line Factoring Algorithm will factor their
402-digit product N = pq in a fraction of a second.
Hart’s algorithm is related to that of Lehman, described in the
next section.

5.4. Lehman’s Variation of Fermat


We begin with Lawrence’s [Law95] variation of Fermat’s Factoring
Method. In 1895, Lawrence proposed a way to factor N when it is
believed that N = pq with p ≤ q, where the ratio p/q is approximately
a/b and a and b are small relatively prime positive integers. When
a = b = 1, this algorithm is the same as Fermat’s. Assume that
gcd(ab, N ) = 1.

Suppose first that both a and b are odd. Write $x = \lceil \sqrt{abN} \rceil$.
Test the integers $(x + i)^2 - abN$, i = 0, 1, 2, . . . , for being square as
in Fermat. Suppose j is the first value of i for which this number is
square, say, $(x + j)^2 - abN = y^2$. Then

$$abN = (x + j)^2 - y^2 = (x + j + y)(x + j - y).$$

Remove the factors of ab from the two trinomial factors to obtain the
factors of N. That is, gcd(x + j + y, N) and gcd(x + j − y, N) will be
the factors of N.

Example 5.15. Factor N = 17155247, assuming that the ratio of its
factors is near a/b = 1/3.
We have

$$abN = 1 \cdot 3 \cdot N = 51465741$$

and

$$x = \lceil \sqrt{abN} \rceil = 7174.$$

With i = 0 we get $(x + 0)^2 - abN = 7174^2 - 51465741 = 535$, which is
not square. But i = 1 gives $(x + 1)^2 - abN = 7175^2 - 51465741 = 122^2$,
so y = 122. Then

$$abN = 7175^2 - 122^2 = (7175 + 122)(7175 - 122) = 7297 \cdot 7053.$$

The first factor, 7297, is prime, while the second factor, 7053, is
3 · 2351, so N = 2351 · 7297. Note that 2351/7297 ≈ 0.322, which is
near 1/3.
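Lawrence's variation for odd a and b can be sketched as Fermat's method applied to abN, followed by two gcds. The function name and the `max_steps` cutoff are our own additions, and the half-integer case (one of a, b even) is not handled.

```python
from math import gcd, isqrt

def lawrence_factor(n, a, b, max_steps=1_000_000):
    """Search for n = p*q with p/q near a/b (a, b odd). Returns (p, q) or None."""
    m = a * b * n
    x = isqrt(m)
    if x * x < m:
        x += 1                          # x = ceil(sqrt(a*b*n))
    for _ in range(max_steps):
        r = x * x - m
        y = isqrt(r)
        if y * y == r:                  # a*b*n = (x + y)(x - y)
            p, q = gcd(x - y, n), gcd(x + y, n)
            if 1 < p < n:
                return tuple(sorted((p, q)))
        x += 1
    return None

print(lawrence_factor(17155247, 1, 3))  # (2351, 7297)
```

With a/b = 1/3 the factors appear at the second value of x; plain Fermat (a = b = 1) would need far more steps on this N because the factors are not close together.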

When one of a, b is even and the other is odd, the calculations are
slightly more complicated because one must deal with half integers.
See Lawrence [Law95] for details. Lehman avoids this problem by
multiplying a, b, and other numbers in the algorithm by 2.
Lawrence’s Factoring Method is similar to Fermat’s in that it will
factor N eventually, but if N does not have factors whose ratio is near
a/b, then it will take a very long time to find them.
Note that the only way that the small positive integers a, b are
used in the algorithm is as their product k = ab. Suppose a pos-
itive integer k can be factored in more than two ways and we run
Lawrence’s algorithm with k in place of ab. Then we are searching
simultaneously for factors of N whose ratio is near a/b for any fac-
torization k = ab. For example, if k = 12, then Lawrence’s algorithm
searches simultaneously for factors of N whose ratio is near either
1/12 or 3/4.
Lehman’s idea [Leh74] is to divide up the interval between 0 and
1 into parts, each consisting of the real numbers “near” the fraction
a/b with 1 ≤ ab ≤ r for some limit r. If N = pq with p ≤ q, then the
ratio p/q must lie in one of the parts and we will factor N when we
examine that fraction a/b.
Example 5.16. When r = 12, the fractions a/b with 1 ≤ ab ≤ 12
are²

$$\frac{1}{12}, \frac{1}{11}, \frac{1}{10}, \frac{1}{9}, \frac{1}{8}, \frac{1}{7}, \frac{1}{6}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{1}{1}.$$
In Lehman’s algorithm, each fraction a/b is represented by the
product k = ab, and one k may represent more than one fraction.
The following theorem is the basis for Lehman’s factoring algorithm.
Theorem 5.17 (Lehman). Let N = pq, where p and q are odd
primes, and let r be an integer between 2 and $\sqrt{N}$. If $\sqrt{N/r} < p < \sqrt{N}$,
then there are nonnegative integers x, y, and k such that

$$x^2 - y^2 = 4kN, \quad 1 \le k < r,$$
$$x \equiv k + 1 \pmod 2,$$
$$x \equiv k + N \pmod 4 \quad \text{if } k \text{ is odd},$$
$$0 \le x - \sqrt{4kN} \le \frac{1}{4r}\sqrt{N/k} \tag{5.1}$$

and {p, q} = {gcd(x + y, N), gcd(x − y, N)}. If N is prime, then no
integers x, y, k satisfy the four displayed conditions.

Lehman used this theorem to devise a factoring algorithm. Suppose
N is given and the integer r has been chosen. First, use Trial
Division up to $\sqrt{N/r}$ to find any small prime factor of N. Then,
for each 1 ≤ k ≤ r and for each x satisfying (5.1) and the two congruences
in Theorem 5.17, test whether $x^2 - 4kN$ is a square as in
Fermat’s Method. If this quantity equals $y^2$, then gcd(x − y, N) is a
proper factor of N.
The Trial Division takes $O(\sqrt{N/r})$ steps. Counting a square
test as one step, the task of testing all x in the given range for one
fixed k takes $O(1 + (1/r)\sqrt{N/k})$ steps. The total time complexity
is

$$O\left(\sqrt{N/r}\right) + \sum_{k=1}^{r} O\left(1 + \frac{1}{r}\sqrt{N/k}\right) = O\left(\sqrt{N/r}\right) + O(r) + O\left(\frac{1}{r}\sqrt{rN}\right).$$

² This list of fractions is similar to a Farey series.

If we write $r = N^c$ for some constant c, then $\sqrt{N/r} + r$ is minimized
(for large N) with c = 1/3. It follows that if we let $r = \sqrt[3]{N}$, we get
a factoring algorithm with time complexity $O(N^{1/3})$.
Lehman [Leh74] gives a complete Algol program for this algorithm.
Here is the loop on x for one fixed k. In the algorithm
$u = x^2 - 4kN$. The variables u and x are initialized and i1 is chosen
to ensure that the two congruences of Theorem 5.17 are satisfied and
that the variable i counts correctly through all values of x in Inequality
(5.1). Note how the program avoids computing $x^2$ and any other
multiplication, as these operations are slower than addition. Multiplication
by a power of 2 can be done swiftly by shifting a binary
number.

Algorithm 5.18. Inner loop of Lehman’s factoring algorithm.

Input: Integers N (to factor), r and k < r (parameters).
x ← ⌈√(4kN)⌉; u ← x² − 4kN; j ← ⌊(⌊√(N/k)⌋ − 1)/(4(r + 1))⌋
if (x ≡ k (mod 2)) { i1 ← 1; u ← u + 2x + 1; x ← x + 1 }
else { i1 ← 0 }
w ← 2
if (k ≡ 1 (mod 2)) {
    w ← 4
    if (x ≢ k + N (mod 4)) {
        i1 ← i1 + 2; u ← u + 4(x + 1); x ← x + 2
    }
}
for (i ← i1; i ≤ j + 1; i ← i + w) {
    if (u is square) {
        y ← √u
        write “factor gcd(x − y, N) found”
        exit
    }
    if (k ≡ 1 (mod 2)) { u ← u + 8(x + 2); x ← x + 4 }
    else { u ← u + 4(x + 1); x ← x + 2 }
}
Output: A factor of N may be printed.

Example 5.19. Factor N = 13290059 by Lehman’s algorithm with
k = 6.
The value of the parameter r must be > k, say, r = 7. The
parameter k = 6 represents either ratio a/b = 1/6 or 2/3. The values
of x and u generated by the for loop with k = 6 are

    x         u
17861     53905
17863    125353
17865    196809
17867    268273
17869    339745
17871    411225
17873    482713
17875    554209
17877    625713
17879    697225

We find $u = 697225 = 835^2 = y^2$, so x − y = 17879 − 835 = 17044
and we get gcd(x − y, N) = gcd(17044, 13290059) = 4261. The other
factor of N is 3119. The ratio 3119/4261 ≈ 0.73, which is close enough
to 2/3 ≈ 0.67 to succeed for an integer N of this size.
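The search over x for one fixed k can be sketched without the addition-only bookkeeping of Algorithm 5.18. The version below recomputes $u = x^2 - 4kN$ at each step, which is slower but easier to follow; the function name is ours, and the search range follows the quantity j of Algorithm 5.18.

```python
from math import gcd, isqrt

def lehman_inner(n, r, k):
    """Test the x allowed by Theorem 5.17 for one k; return a proper factor or None."""
    m = 4 * k * n
    x0 = isqrt(m)
    if x0 * x0 < m:
        x0 += 1                                  # ceil(sqrt(4kN))
    j = (isqrt(n // k) - 1) // (4 * (r + 1))     # length of the search range
    x = x0
    if k % 2 == 0:
        step = 2
        if x % 2 == 0:                           # need x ≡ k + 1 (mod 2)
            x += 1
    else:
        step = 4
        while x % 4 != (k + n) % 4:              # need x ≡ k + N (mod 4)
            x += 1
    while x - x0 <= j + 1:
        u = x * x - m
        y = isqrt(u)
        if y * y == u:
            g = gcd(x - y, n)
            if 1 < g < n:
                return g
        x += step
    return None

print(lehman_inner(13290059, 7, 6))              # 4261
```

For N = 13290059, r = 7, k = 6 the values of x visited are exactly those in the table of Example 5.19, ending at x = 17879.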

McKee gives another variation of Fermat’s Method. It is described
in Section 6.2 because it uses continued fractions.

5.5. The Lehmers’ Factoring Method


Fermat proposed factoring N by finding a pair of integers x, y so that
$N = x^2 - y^2$. In 1647, Mersenne noticed that if you can express N as
the sum of two squares in two different ways, then you can factor N
easily. Euler, Legendre, Gauss, and Chebyshev observed that if you
can find a quadratic form $Q(x, y) = ax^2 + bxy + cy^2$ and two different
pairs x, y for which Q(x, y) = N, then you can factor N directly.
Here is an example of the sort of result they proved. Let a = b = 1
to get Mersenne’s statement.

Theorem 5.20. Let a and b be relatively prime integers. There is a
polynomial time algorithm to find a proper factor of a positive integer
N, given two representations

$$N = ax^2 + by^2, \qquad N = au^2 + bv^2$$

with {±x, ±y} ≠ {±u, ±v}.
Proof. We only sketch the proof. The details are on page 266 of
Mathews [Mat92].
The theory of quadratic forms shows that if N were prime, then
it would have at most one representation as $N = ax^2 + by^2$. Thus, N
must be composite because it has two such representations.
We may assume that gcd(ab, N) = 1. If any of x, y, u, v were 0,
then we could factor N easily. Thus, we may assume all of x, y, u, v
are positive integers. Note that

$$a(xv + yu)(xv - yu) = ax^2v^2 - ay^2u^2 = v^2(ax^2 + by^2) - y^2(au^2 + bv^2) = (v^2 - y^2)N.$$

Therefore, N divides (xv + yu)(xv − yu). Using the distinctness of
the two representations, one can prove that N does not divide either
xv + yu or xv − yu. Therefore, gcd(xv + yu, N) and gcd(xv − yu, N)
are proper factors of N. □

Example 5.21. Factor N = 28028821, given that

$$N = 1065^2 + 5186^2 = 295^2 + 5286^2.$$

Compute
1065 · 5286 + 5186 · 295 = 7159460
and
1065 · 5286 − 5186 · 295 = 4099720.
Then gcd(7159460, N ) = 4649 and gcd(4099720, N ) = 6029, so N =
4649 · 6029.
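The two gcds from the proof of Theorem 5.20 are easy to compute once both representations are known. In this sketch the function name is ours; note that neither a nor b is needed, only the four numbers x, y, u, v.

```python
from math import gcd

def factor_from_two_representations(n, x, y, u, v):
    """Given n = a*x^2 + b*y^2 = a*u^2 + b*v^2, return the two gcds of Theorem 5.20."""
    return gcd(x * v + y * u, n), gcd(abs(x * v - y * u), n)

# Example 5.21: N = 1065^2 + 5186^2 = 295^2 + 5286^2
print(factor_from_two_representations(28028821, 1065, 5186, 295, 5286))   # (4649, 6029)
```

Finding the two representations is the hard part; the gcds themselves cost only polynomial time in the number of digits of N.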

One problem with Theorem 5.20 is that it may take a long time
to find even one representation of N as $ax^2 + by^2$. An even greater
problem is that N may have no representation at all in this form.
Dick and Emma Lehmer [LL74] devised a variation of the method of
Theorem 5.20 to solve the second problem. They used representations
$\lambda N = x^2 - Dy^2$ with D ≠ 1 and D ≠ 0 and gave the range of possible
solutions y. For example, if D < 0, then $0 \le y < \sqrt{|\lambda N/D|}$. (When
D > 0, the theory of Pell equations is needed to give the range of
y. See the end of Section 6.8 for this range.) In case two distinct
solutions are found with y in this range, one can factor λN as in the
proof of Theorem 5.20. The multiplier λ is small and can be removed
from the factors of λN to factor N.
The Lehmers [LL74] provided a list of ten quadratic forms $\lambda N = x^2 - Dy^2$
and told how to choose three of them, depending on the
remainder of N modulo 24, so that at least one of the three forms
has solutions. If N is the product of two primes, then there are two
solutions and one can factor N as in Theorem 5.20.

Example 5.22. Factor N = 13290059 by the Lehmers’ method.
Since N mod 24 = 11, the Lehmers prescribe using the three
quadratic forms $N = x^2 + 2y^2$, $-N = x^2 - 3y^2$, and $2N = x^2 + 6y^2$. A
search of $0 \le y < \sqrt{N/2}$ finds no solution to the first quadratic form.
The second quadratic form has the two solutions (x, y) = (1297, 2234)
and (1468, 2269). (See Example 6.30 for the search limits on y in the
second quadratic form.) As in the proof of Theorem 5.20, we find

$$\gcd(1297 \cdot 2269 + 1468 \cdot 2234, N) = 3119$$

and

$$\gcd(1297 \cdot 2269 - 1468 \cdot 2234, N) = 4261.$$

The Lehmers used this method in the early 1970s to factor num-
bers from aliquot sequences having 20 to 25 decimal digits and no
small factor. They used a sieve algorithm, as in Section 8.1, to speed
the search for x and y. They also used a special hardware device, the
Delay Line Sieve (see Section 9.1), to test one million values of y per
second, quite fast for that time.
Emma Trotskaya was born in 1906 and grew up in Harbin, Man-
churia. Her interest in mathematics was sparked by a wonderful high
school teacher. She did not wish to attend college in Moscow in the
early 1920s because the Russian revolution was still in progress there.
She convinced her parents that Berkeley, California, was closer to
Harbin than was Moscow. She applied to the University of California
and was offered a scholarship. She accepted and crossed the Pacific via
Japan and Vancouver, where she obtained a tourist visa for the United
States. University officials told her that she could attend classes but
not access her scholarship funds until she obtained a student visa,
which would take several months. Running out of money, she went to
the mathematics department for help. They told her that Professor
D. N. Lehmer wanted to hire a student to compute some numbers.
She took the job and worked with the professor and his son, D. H.
(Dick) Lehmer doing number theory computations. Dick and Emma
fell in love and married a few years later. Their marriage lasted until
he died in 1991. She lived until 2007.

5.6. Pollard’s Rho Method


Pollard invented two nice factoring algorithms in the 1970s. They are
described in this section and the following one. Both methods are
faster than Trial Division at finding relatively small prime factors of
a large integer.
Let N be the composite number to factor and let p be an un-
known prime factor of N . Pollard [Pol75] proposed choosing a ran-
dom function f from the set {0, 1, . . . , N − 1} into itself, picking a
random starting number s in the set and iterating f :

s, f (s), f (f (s)), f (f (f (s))), . . . .

If we reduce these numbers modulo the unknown prime p, we get a
sequence of integers in the smaller set {0, 1, . . . , p − 1}. Because of
the “birthday problem”³ in probability theory, some number in the
smaller set will be repeated after about $\sqrt{p}$ iterations of f. If u, v are
iterates of f with u ≡ v (mod p), then it is likely that gcd(u − v, N) = p
because p divides u − v and N, and probably no other prime factor
of N divides u − v. But how can we detect this repeated value when
it happens? We don’t know p and must iterate f modulo N.
Define $s_i$ by $s_0 = s$ and $s_i = f(s_{i-1})$ for i > 0. If $s_i \equiv s_j \pmod p$,
then p divides $s_i - s_j$ and also $\gcd(s_i - s_j, N)$. However, we can’t
compute a greatest common divisor for every pair $i, j < \sqrt{p}$ because
there would be about $\frac{1}{2}(\sqrt{p})^2 = p/2$ pairs and we might as well use
Trial Division to find p.

³ The birthday problem asks how many people are needed so that the probability
is approximately 0.5 that two have the same birthday. If there are p possible birthdays,
then the answer is $\sqrt{(2 \ln 2)p} \approx 1.18\sqrt{p}$.

Floyd solved this problem by computing two iterates of f together
in the same loop, with one instance running twice as fast as the other.
(See Exercise 6b in Section 3.1 of [Knu81].) This trick generates $s_m$
and $s_{2m}$ together and forms $\gcd(s_{2m} - s_m, N)$, hoping to find p. Here
is why the trick works. Suppose $s_i \equiv s_j \pmod{p}$ for some i < j. By
the birthday problem, the first j for which this congruence holds for
some i < j is $O(\sqrt{p})$. Let k = j − i. Then for any m ≥ i and t ≥ 0,
$s_m \equiv s_{m+tk} \pmod{p}$. When $m = k\lceil i/k \rceil$ and $t = \lceil i/k \rceil$, we have
$s_m \equiv s_{2m} \pmod{p}$ and m ≤ j, so m is $O(\sqrt{p})$.
What is a good choice for the random function f(x)? Experiments
and a bit of theory suggest that quadratic polynomials
$f(x) = (x^2 + b) \bmod N$ are good choices when $b \ne 0$ and $b \ne -2$.
See Section 10.2 of [Wag03] for more about bad choices for b.
Here is the algorithm. After defining the function f (x), the main
loop iterates the function in two ways, with one iteration per step for
A and two iterations per step for B. Then it computes the greatest
common divisor of N with A−B and stops when the gcd first exceeds
1. The Pollard Rho Method is a Monte Carlo probabilistic algorithm
and is sometimes called the Pollard Monte Carlo algorithm.

Algorithm 5.23. Pollard Rho Factorization Method.

Input: A composite number N to factor.

Choose a random b in 1 ≤ b ≤ N − 3
Choose a random s in 0 ≤ s ≤ N − 1
A ← s ; B ← s
Define a function f(x) ← (x^2 + b) mod N
g ← 1
while (g = 1) {
    A ← f(A)
    B ← f(f(B))
    g ← gcd(A − B, N)
}
if (g < N) { write g “is a proper factor of” N }
else { either give up or try again with new s and/or b }
Output: A proper factor g of N, or else “give up.”
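
As an illustration, Algorithm 5.23 can be sketched in Python as follows. This is a minimal sketch, not a tuned implementation: the function name `pollard_rho` and the optional arguments `s` and `b` (which let a caller fix the starting value and the constant instead of drawing them at random) are choices made here, and N ≥ 4 is assumed so that the random ranges are nonempty.

```python
from math import gcd
import random

def pollard_rho(N, s=None, b=None):
    """One run of Algorithm 5.23: return a proper factor of N, or None to give up."""
    if b is None:
        b = random.randint(1, N - 3)      # random b with 1 <= b <= N - 3
    if s is None:
        s = random.randint(0, N - 1)      # random start with 0 <= s <= N - 1
    f = lambda x: (x * x + b) % N
    A = B = s
    g = 1
    while g == 1:
        A = f(A)                          # slow sequence: one step
        B = f(f(B))                       # fast sequence: two steps
        g = gcd(A - B, N)                 # math.gcd accepts a negative argument
    return g if g < N else None           # g == N means the run failed
```

With the fixed choices s = b = 1, the call `pollard_rho(25279, s=1, b=1)` returns 17. A caller that receives `None` would typically retry with fresh random values.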

If we reach the else line of the algorithm, it means that g = N ,


that is, we have found all prime factors of N together. There is a
fair chance that we can separate them and find just one of them if we
restart the algorithm with new random b and s.
The factor g of N written in the next-to-last line is not guaranteed
to be prime. It is possible that we may find two or more prime
factors of N together. One should always test g for primality when
the algorithm finishes.
As noted above, assuming f is a random mapping, the complexity
of the Pollard Rho Method is $O(\sqrt{p})$ steps, where p is the smallest
prime factor of N. Since $p \le \sqrt{N}$, this complexity is $O(N^{1/4})$. Trial
Division would find a prime factor p of N in O(p) steps, so the Pollard
Rho Method is faster than Trial Division unless p is very small. The
Rho Method will find a 20-digit prime factor of N with about the
same work needed to find a 10-digit factor by Trial Division.
Example 5.24. Factor N = 25279 by Pollard Rho with s = b = 1.
The first 13 iterates (modulo N) are $s_0 = 1$, $s_1 = 2$, 5, 26, 677,
3308, $s_6 = 22337$, 9947, 804, 14442, 19615, 1846, $s_{12} = 20331$. We
have $\gcd(s_{12} - s_6, N) = \gcd(20331 - 22337, 25279) = 17$. The (hidden)
values of $s_i \bmod 17$ are shown in the figure below. The shape is the
reason for the algorithm’s name.


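[Figure: Pollard Rho modulo 17. The values $s_i \bmod 17$ start with the
tail 1 → 2 and then run around the cycle 2 → 5 → 9 → 14 → 10 → 16 → 2,
tracing the shape of the Greek letter ρ.]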

The most expensive step in the while loop is the greatest com-
mon divisor. Its cost may be amortized by adding a new variable C,
initialized at 1, replacing the greatest common divisor operation by
the instruction C = C(A − B) mod N , and computing g = gcd(C, N )


occasionally. One strategy performs the greatest common divisor op-
eration only when the iteration number is a power of 2. This causes
the expensive operation to be done less frequently but increases the
chance of finding all factors of N at once.
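
The amortization idea can be sketched as below. The sketch makes two choices that are assumptions of this illustration rather than part of the text: it takes the gcd after every fixed batch of `batch` steps (the text mentions powers of 2 as one possible schedule), and when a batch yields g = N it replays that batch one step at a time from a saved state. The function name and parameters are hypothetical.

```python
from math import gcd

def pollard_rho_batched(N, s, b, batch=8):
    """Pollard Rho with the gcd taken once per batch of iterations."""
    f = lambda x: (x * x + b) % N
    A = B = s
    while True:
        saved = (A, B)                    # state at the start of this batch
        C = 1
        for _ in range(batch):
            A = f(A)
            B = f(f(B))
            C = C * (A - B) % N           # accumulate the differences modulo N
        g = gcd(C, N)
        if g == 1:
            continue                      # nothing found in this batch
        if g < N:
            return g
        # g == N: several factors were caught together, so replay step by step
        A, B = saved
        for _ in range(batch):
            A = f(A)
            B = f(f(B))
            g = gcd(A - B, N)
            if g > 1:
                return g if g < N else None
        return None
```

For N = 25279 with s = b = 1 this returns 17, the same factor as the step-by-step version, while calling `gcd` far less often on longer runs.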
Brent [Bre80] gives a faster version of the Pollard Rho Method.

Using a special processor, Dubner found several 19-digit prime


factors of large numbers by the Pollard Rho Method. Suyama also
found many factors by this algorithm.

5.7. Pollard’s p − 1 Method


Pollard’s second factoring method [Pol74] is the p − 1 Method. The
p − 1 Method is based on Fermat’s Little Theorem (Theorem 2.23),
which says that $a^{p-1} \equiv 1 \pmod{p}$ when p is a prime which does not
divide a. Therefore, $a^L \equiv 1 \pmod{p}$ for any multiple L of p − 1. If also
p | N, then p divides $\gcd(a^L - 1, N)$. Of course, we cannot compute
$a^L \bmod p$ because p is an unknown prime factor of N. However, we
can compute $a^L \bmod N$. Pollard’s idea is to let L have many divisors
of the form p − 1 and thus try many potential prime factors p of N
at once.
If p − 1 is B-smooth, that is, the largest prime factor of p − 1 is
≤ B, then p−1 will divide L if L is the product of all primes ≤ B, each
repeated an appropriate number of times. If a prime q ≤ B divides
p−1, then q cannot divide p−1 more than logq p−1 = (log p/ log q)−1
times. This number is an upper bound on the “appropriate number of
times” q divides L. However, large primes rarely divide large random
integers more than one time. A reasonable compromise for L is to
choose a bound B, which tells how much work one is willing to do in
an effort to factor N, and define L to be the least common multiple
of the positive integers up to B. One can show that this $L = \prod q^e$,
where q runs over all primes ≤ B and, for each q, $q^e$ is the largest
power of q which is ≤ B. Typically, B is in the millions and L is
enormous. There is no need to compute L. As each $q^e$ is formed, one
computes $a = a^{q^e} \bmod N$. Here is the first stage of the algorithm.

Algorithm 5.25. Simple Pollard p − 1 Factorization Method.

Input: A composite positive integer N to factor and a bound B.
Find the primes p_1 = 2, p_2 = 3, p_3, . . . , p_k ≤ B
a ← 2
for (i ← 1 to k) {
    e ← ⌊(log B)/ log p_i⌋
    f ← p_i^e
    a ← a^f mod N
}
g ← gcd(a − 1, N)
if (1 < g < N) { write “g divides N” }
else { give up }
Output: A proper factor g of N, or else give up.
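
A possible Python rendering of the first stage is sketched below. The helper `primes_up_to` is a plain Sieve of Eratosthenes written for this illustration, and the largest power of each prime not exceeding B is found by repeated multiplication rather than by floating-point logarithms, which sidesteps rounding issues; both are choices of this sketch. It takes the gcd only once, at the end, and assumes B ≥ 2.

```python
from math import gcd, isqrt

def primes_up_to(B):
    """Sieve of Eratosthenes: return the list of primes <= B."""
    is_prime = [True] * (B + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(B) + 1):
        if is_prime[i]:
            for j in range(i * i, B + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def pollard_p_minus_1(N, B, a=2):
    """First stage of the p - 1 Method: a proper factor of N, or None."""
    for q in primes_up_to(B):
        f = q
        while f * q <= B:                 # f becomes the largest power of q <= B
            f *= q
        a = pow(a, f, N)                  # fast modular exponentiation
    g = gcd(a - 1, N)
    return g if 1 < g < N else None
```

For instance, 299 = 13 · 23 has 13 − 1 = 2^2 · 3, so `pollard_p_minus_1(299, 5)` returns 13, while 23 − 1 = 2 · 11 is not 5-smooth and 23 stays hidden at this bound.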

The primes may be generated quickly by the Sieve of Eratos-


thenes described in Section 8.1. Exponentiation is done by the Fast
Exponentiation Algorithm. The greatest common divisor operation
should be computed once every few thousand iterations of the for
loop rather than just once at the end. The for loop continues if
g = 1. If g = 1 at the end, one can either give up or try the second
stage described below. If g = N , then all prime divisors p of N have
been discovered together. When this happens, if one has saved the
value of a at the previous greatest common divisor operation, one
can return to it and compute a greatest common divisor after each
exponentiation in an effort to separate the prime divisors p of N . But
even this device won’t work in case p − 1 has the same largest prime
divisor q for every prime factor p of N .
Example 5.26. The Pollard p − 1 Method fails to factor 1247 = 29 · 43
because $29 - 1 = 2^2 \cdot 7$ and $43 - 1 = 2 \cdot 3 \cdot 7$. They have the same
largest prime factor 7, and both 29 and 43 will be discovered during
the same iteration of the for loop. That is, a will become 1 modulo
N when i = 4.

Erdős [Erd35] proved that the numbers p − 1, where p is prime,


are just as likely to be smooth as other numbers of the same size. If
we use the Pollard p − 1 Algorithm with bound B to try to factor N
and N has a prime factor p, then the probability of finding p is the
probability that a number near p is B-smooth, which is ψ(p, B)/p.
By formula (3.1), this is $\rho(u) \approx u^{-u}$, where u = (log p)/ log B. If
p − 1 has a prime factor > B, then the algorithm will fail. We could
fail to find a prime factor p as small as p = 2q + 1, where q is the first
prime > B (or > $B_2$, if the second stage is used).

Example 5.27. In 1986, Baillie found the 16-digit prime divisor
p = 1256132134125569 of the Fermat number $F_{12} = 2^{4096} + 1$ using
the Pollard p − 1 Method with B = 30000000. He succeeded because
the largest prime factor of p − 1 is less than B:
$$p - 1 = 2^{14} \cdot 7^2 \cdot 53 \cdot 29521841.$$
In 2010, Vang, Zimmerman, and Kruppa found a 54-digit prime factor
q of $F_{12}$ by the Elliptic Curve Method. This factor could not have
been found by the Pollard p − 1 Method because $q - 1 = 2^{15} r$, where
r is a 50-digit prime number.

The algorithm has a second stage in which one chooses a second
bound $B_2 > B$ and seeks a factor p of N for which the largest prime
factor of p − 1 is ≤ $B_2$ and the second largest prime factor is ≤ B. Here
is one version of the second stage. At the end of the first stage (the
algorithm above), a has the value $2^L \bmod N$. Let $q_1 < q_2 < \cdots < q_t$
be the primes between B and $B_2$. The idea is to compute successively
$2^{Lq_i} \bmod N$ and then $\gcd(2^{Lq_i} - 1, N)$ for 1 ≤ i ≤ t. The first
power $2^{Lq_1} \bmod N$ is computed directly as $a^{q_1} \bmod N$. The
differences $q_{i+1} - q_i$ are even numbers and much smaller than the $q_i$
themselves. Precompute $2^{Ld} \bmod N$ for d = 2, 4, . . . up to a few
hundred. To find $2^{Lq_{i+1}} \bmod N$ from $2^{Lq_i} \bmod N$, multiply the
latter by $2^{Ld} \bmod N$, where $d = q_{i+1} - q_i$. The amortized cost
of computing $2^{Lq_i} \bmod N$ for 2 ≤ i ≤ t is a single multiplication
modulo N.
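
The second stage just described might be sketched as follows. The function name and its interface are assumptions of this illustration: the caller passes the stage-one value a = 2^L mod N and the increasing list of primes between B and B_2, all of which must be odd so that every gap is even. The table of gap powers is sized from the largest gap actually present instead of a fixed "few hundred."

```python
from math import gcd

def p_minus_1_stage2(N, a, primes):
    """Second stage of the p - 1 Method.

    a      -- the value 2^L mod N left by the first stage
    primes -- the odd primes q_1 < ... < q_t between B and B_2
    """
    max_gap = max((q2 - q1 for q1, q2 in zip(primes, primes[1:])), default=0)
    # Precompute a^d mod N for the even gaps d = 2, 4, ..., max_gap.
    a2 = a * a % N
    gap_power = {}
    power = 1
    for d in range(2, max_gap + 1, 2):
        power = power * a2 % N
        gap_power[d] = power
    x = pow(a, primes[0], N)              # 2^(L q_1) mod N, computed directly
    g = gcd(x - 1, N)
    for prev, q in zip(primes, primes[1:]):
        if g != 1:
            break
        x = x * gap_power[q - prev] % N   # one multiplication per new prime
        g = gcd(x - 1, N)
    return g if 1 < g < N else None
```

As a small check, take N = 299 = 13 · 23 with B = 3, so that L = 6 and a = 2^6 mod 299 = 64. The first stage finds nothing, but with B_2 = 11 the call `p_minus_1_stage2(299, 64, [5, 7, 11])` returns 23, because 23 − 1 = 2 · 11.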

Example 5.28. In 1984, Brent found the 32-digit prime factor p =
49858990580788843054012690078841 of $N = 2^{977} - 1$ with this method.
Since
$$p - 1 = 2^3 \cdot 5 \cdot 13 \cdot 19 \cdot 977 \cdot 1231 \cdot 4643 \cdot 74941 \cdot 1045397 \cdot 11535449,$$
he must have used B ≥ 1045397 and $B_2$ ≥ 11535449.

When Pollard’s p − 1 Method reports a factor, one should test


it for being prime. In the early days, this check was not always
performed. Sometime between 1978 and 1981, the “prime” factor


1223165341640099735851 of $6^{175} - 1$ was entered into the Cunningham
tables. It appeared as a primitive “prime” factor of this number in
the first edition of [BLS+02] in 1983. Atkin factored this “prime”
number in 1986 as the product pq of two 11-digit primes. The largest
prime factor of each of p − 1 and q − 1 lies between 45000 and 50000,
so the 22-digit “prime” was probably discovered by the p − 1 Method,
checking the gcd only once for the primes pi in every block of 5000
integers. The Cunningham Project maintains a Ten Most Wanted list
of important numbers to factor. Atkin’s factorization of a published
“prime” was a Least Wanted factorization. Another Least Wanted
factorization was the intrinsic factor 11 of $12^{121} - 1$ found by Baillie
by the p − 1 Method after Trial Division to $2^{35}$ had already been done
for this number. Trial Division missed this factor because it assumed
that all factors of the primitive part of $12^{121} - 1$ were primitive and
therefore ≡ 1 (mod 242) by Theorem 5.5.
Baillie, Brent, Buell, Silverman, and Suyama factored many Cun-
ningham numbers by Pollard’s p−1 Method, as did the team of Atkin
and Rickert.
There is a complementary algorithm, due to Williams [Wil82]
and called the p + 1 Factoring Method, that discovers a prime divisor
p of N provided p + 1 is smooth.

Exercises

5.1. Prove that Algorithm 5.9 for $\lfloor\sqrt{N}\rfloor$ is correct and that it takes
O(log log N) iterations.
5.2. Find the 22 squares modulo 100 and the 12 squares modulo 64.
5.3. Use Fermat’s Difference of Squares Algorithm to factor 5293.
5.4. Factor N = 299944727 by Lehman’s algorithm using k = 12.
Omit the Trial Division step.
5.5. Factor the Fermat number $F_7 = 2^{2^7} + 1$ given that $F_7 = x^2 + y^2$
with x = 16382350221535464479 and y = 8479443857936402504.
Use Theorem 5.20.
5.6. Why did we not use our canonical number N = 13290059 in


Example 5.21?
5.7. Use Pollard’s Rho Method to factor 5293.
5.8. Experiment with different functions f in Pollard’s Rho Method.
Discover why b = 0 and b = −2 are bad choices to use in
f (x) = (x2 + b) mod N .
5.9. Use Pollard’s p − 1 Method to factor 5293.
5.10. Factor Atkin’s number 1223165341640099735851 into the prod-
uct pq of two primes. Find and compare the largest prime
factors of p − 1 and q − 1.
5.11. For each (large) composite divisor N of the primitive part of
a Cunningham number $b^n \pm 1$ that has been completely factored,
test whether each of the factoring algorithms of Fermat,
Hart, Lawrence, and Lehman could have factored N easily. Do
not run the algorithms. Use your knowledge of the factors p,
q of N to check whether N has the required form to be fac-
tored quickly. For example, for Lehman’s algorithm, determine
whether p/q is close to a fraction a/b with small integers a, b.
One can test this condition easily using continued fractions.
Try a = Ai , b = Bi with small i in Theorem 6.5 with Ak = p,
Bk = q.
5.12. Prove that if p divides the Fermat number $F_k = 2^{2^k} + 1$, then
$p \equiv 1 \pmod{2^{k+2}}$.
Chapter 6

Continued Fractions

When one is about to study a large number, it is


necessary to begin by determining several quadratic
residues. Maurice Kraitchik [Kra29, p. 1]

Introduction
Although the factoring methods in this chapter are slow compared
to the best known ones, the ideas in them led to some of the fastest
known factoring algorithms. All of the algorithms in this chapter use
simple continued fractions in some way. Several of them use continued
fractions to produce quadratic residues modulo the number N to be
factored. The Continued Fraction Factoring Algorithm (CFRAC) was
the first factoring algorithm with subexponential time complexity.
Some facts about continued fractions are used in later chapters.
For example, Theorem 6.7 is needed in the proof of Theorem 10.8
and for quantum computing in Chapter 9. Theorems 6.17 and 6.18
in Section 6.4 are used first in Section 6.5 but are really not about
continued fractions. They are used many times in Chapters 8 and 10.


6.1. Basic Facts about Continued Fractions


A simple continued fraction is an expression of the form

(6.1) $$x = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cdots}}},$$

which we denote by $x = [q_0; q_1, q_2, q_3, \ldots]$. The numbers $q_i$ are
required to be integers for all i and also positive when i > 0. A simple
continued fraction is allowed to be finite,

(6.2) $$x = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cdots + \cfrac{1}{q_k}}}},$$

which we write as $[q_0; q_1, q_2, q_3, \ldots, q_k]$. The numbers $q_1, q_2, \ldots$ are
called the partial quotients of either continued fraction.
Every real number x has a simple continued fraction expansion which
may be computed (in theory) by this algorithm. Separate x into its
integer part $q_0$ and fractional part, the new value of x. The main
loop alternates taking the reciprocal with this separation operation,
forming the sequence of partial quotients $q_i$ of x.
Algorithm 6.1. Continued Fraction Algorithm.
Input: A real number x.
i ← 0
q_0 ← ⌊x⌋
x ← x − q_0
while (x > 0) {
    i ← i + 1
    q_i ← ⌊1/x⌋
    x ← 1/x − q_i
}
Output: [q_0; q_1, q_2, q_3, . . .] is the continued fraction for x.

Computing a continued fraction this way via floating point arith-


metic requires great precision to find more than the first few qi .
Theorem 6.2. The Continued Fraction Algorithm with input x ter-


minates if and only if x is a rational number.

Example 6.3. If $q_i = 1$ for all i ≥ 0 (which is the smallest possible
value when i > 0), then the continued fraction (6.1) equals the golden
mean $\alpha = (1 + \sqrt{5})/2$, the larger root of $x^2 = x + 1$. If β is the
other root of this quadratic equation, then the Fibonacci and Lucas
numbers satisfy $u_n = (\alpha^n - \beta^n)/(\alpha - \beta)$ and $v_n = \alpha^n + \beta^n$ for n ≥ 0.
This connection was used by Lamé in his proof of Theorem 2.3. These
α and β also appeared in Section 3.6.

Example 6.4. The continued fraction for e = 2.718281828 . . . is


e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, 1, 1, 12, 1, . . .].
However, the continued fraction for π = 3.1415926 . . . is not regular.
It begins
π = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, . . .].

If x = n/m, then the $q_i$ for i > 0 are the quotients $\lfloor m/n \rfloor$ in
the Euclidean Algorithm for gcd(n, m), and the successive values of x
in the Continued Fraction Algorithm are the ratios n/m in the while
loop of the Euclidean Algorithm. To see this, assume that m > n > 0
for simplicity. Define $m_0 = m$, $n_0 = n$, and $x_0 = n/m = n_0/m_0$.
The Continued Fraction Algorithm with input x begins by computing
$q_0 = \lfloor x \rfloor = \lfloor n/m \rfloor = 0$, which leaves $x_0 - q_0 = x_0 = n/m$. So long as
$x_i = n_i/m_i > 0$, the algorithm computes $q_{i+1} = \lfloor m_i/n_i \rfloor$ and
$$x_{i+1} = \frac{m_i}{n_i} - q_{i+1} = \frac{m_i - n_i \lfloor m_i/n_i \rfloor}{n_i} = \frac{m_i \bmod n_i}{n_i}.$$
If we let $m_{i+1} = n_i$ and $n_{i+1} = m_i \bmod n_i$, then we will have
$x_i = n_i/m_i$ always. But the $(m_i, n_i)$ are exactly the values of m and n
that appear in the Euclidean Algorithm, Algorithm 2.2.
If an infinite continued fraction (6.1) is truncated at qk , then it
has the value (6.2), which is clearly a rational number, xk , say, called
the k-th complete quotient of (6.1).
Given a finite continued fraction (6.2), we can find its value
xk = Ak /Bk as a rational number working backwards clearing the
denominators starting from qk−1 + 1/qk . The next theorem gives a
way of finding this rational number working forwards. See Theorem


7.4 of [NZM91] for a proof.

Theorem 6.5. The rational number $A_k/B_k = [q_0; q_1, q_2, \ldots, q_k]$ is
determined from $q_0, \ldots, q_k$ by $A_{-1} = 1$, $B_{-1} = 0$, $A_0 = q_0$, $B_0 = 1$,
and

(6.3) $$A_i = q_i A_{i-1} + A_{i-2}, \qquad B_i = q_i B_{i-1} + B_{i-2} \quad \text{for } i = 1, 2, \ldots, k.$$

The rational number $A_i/B_i = [q_0; q_1, q_2, q_3, \ldots, q_i]$ is called the
i-th convergent to the continued fractions (6.1) and (6.2). The reason
for the name is that $\lim_{i\to\infty} A_i/B_i = x$, that is, the $A_i/B_i$ converge
to x. In fact, $A_i/B_i$ is the best rational number approximation for x
with denominator no larger than $B_i$.
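
The forward recurrence (6.3) translates directly into a short loop. The following is an illustrative sketch (the function name is chosen here) that returns the pairs $(A_i, B_i)$ for i = 0, 1, . . . , k from a list of partial quotients.

```python
def convergents(quotients):
    """Convergents (A_i, B_i) of [q0; q1, ..., qk] via the recurrence (6.3)."""
    A_prev, B_prev = 1, 0                 # A_{-1}, B_{-1}
    A, B = quotients[0], 1                # A_0, B_0
    result = [(A, B)]
    for q in quotients[1:]:
        A, A_prev = q * A + A_prev, A     # A_i = q_i A_{i-1} + A_{i-2}
        B, B_prev = q * B + B_prev, B     # B_i = q_i B_{i-1} + B_{i-2}
        result.append((A, B))
    return result
```

Applied to the first partial quotients of π, `convergents([3, 7, 15, 1, 292])` produces 3/1, 22/7, 333/106, 355/113, and 103993/33102.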

Theorem 6.6. With $A_k$ and $B_k$ as above,
$$\left| \frac{A_k}{B_k} - x \right| < \frac{1}{B_k B_{k+1}} < \frac{1}{B_k^2}$$
for every k ≥ 1.

See Theorem 7.11 of [NZM91] for a proof.

Theorem 6.7. If A and B are integers with gcd(A, B) = 1 and

(6.4) $$\left| \frac{A}{B} - x \right| < \frac{1}{2B^2},$$

then there is a k > 0 for which $A = A_k$ and $B = B_k$.

See Theorem 7.14 of [NZM91] for a proof.

Example 6.8. Table 1 shows the partial quotients and convergents


to π. Note also that the convergents Ak /Bk alternate above and
below the actual value of π. They always alternate this way. The
convergent 355/113 is a well-known excellent approximation to π. It
is very close to π = 3.1415926535 . . . because the first omitted partial
quotient (that is, 292) is unusually large.
Table 1. Continued fraction expansion for π.

   k    q_k      A_k     B_k   A_k/B_k
  −1      −        1       0   −
   0      3        3       1   3.0
   1      7       22       7   3.1428
   2     15      333     106   3.141509
   3      1      355     113   3.1415929
   4    292   103993   33102   3.1415926530

6.2. McKee’s Variation of Fermat



Let N be an odd composite integer to factor. Let $b = \lceil\sqrt{N}\rceil$ and
define $Q(x, y) = (x + by)^2 - Ny^2$. We will factor N by finding integers
x, y, z so that $Q(x, y) = z^2$. The factor of N will be gcd(x + by − z, N).
If we let y = 1, then we get Fermat’s Factoring Method of Section
5.2.
McKee’s [McK99] factoring algorithm begins with this theorem.

Theorem 6.9 (McKee). Suppose that N = pq with $2\sqrt[4]{N} < p < q$.
Then there exist integers x, y, z so that $Q(x, y) = z^2$, y is even,
$2 \le y \le \sqrt[4]{N}$, $|x|y < 2\sqrt{N}$, $0 \le z < 2\sqrt{N}$, and gcd(x + by − z, N) is
a proper factor of N.

If we knew the factors p and q, then we could let $r = \lfloor\sqrt{q/p}\rfloor$
and the solution would be y = 2r, $x = r^2 p + q - by$, $z = q - r^2 p$. But,
of course, we don’t know p and q in advance. The inequalities in the
theorem will help us find x, y, z in $O(N^{1/4+\varepsilon})$ steps.
Suppose some integer m divides z but not y. Let $x_0 \equiv xy^{-1}$
(mod $m^2$), where $y^{-1}$ is the inverse of y modulo $m^2$. Since $Q(x, y) =
z^2 \equiv 0 \pmod{m^2}$ we have $Q(x_0, 1) \equiv z^2/y^2 \equiv 0 \pmod{m^2}$. Also,
since $x_0 \equiv xy^{-1} \pmod{m^2}$, we have $x = x_0 y - \lambda m^2$ for some integer
λ, so that

(6.5) $$\frac{x_0}{m^2} - \frac{\lambda}{y} = \frac{x}{m^2 y}.$$

If x > 0 and

(6.6) $$m^2 > 2xy,$$
then (6.5) implies
$$0 < \frac{x_0}{m^2} - \frac{\lambda}{y} < \frac{1}{2y^2},$$
so that λ/y is a convergent in the simple continued fraction expansion
of $x_0/m^2$ by Theorem 6.7.
Here is the preliminary algorithm to factor N. Perform Trial
Division up to $2\sqrt[4]{N}$. Choose several $m > 2\sqrt[4]{N}$ in the hope that
m divides z. For each m, compute all solutions $x_0$ to the congruence
$Q(x_0, 1) \equiv 0 \pmod{m^2}$ by the method of Example 3.9. For each $x_0$,
compute the convergents λ/y in the continued fraction expansion of
$x_0/m^2$ for which $\lambda/y < x_0/m^2$ and $y \le \sqrt[4]{N}$. If $Q(x_0 y - \lambda m^2, y)$ is a
square, then we can factor N by Theorem 6.9.
It is easier to compute a square root of N modulo $m^2$ when m is
prime, so we restrict m to be prime. The preliminary algorithm will
succeed if m is the smallest prime factor of z greater than $2\sqrt[4]{N}$. A
difficulty with the preliminary algorithm is that m might not exist or
be too large to guess. The next theorem guarantees that we will have
enough solutions if we look just a bit longer.

Theorem 6.10 (McKee). Let T > 1 be an integer. Suppose that
N = pq with $2\sqrt[4]{N} < p < q$. Then there exist at least T triples of
integers x, y, z so that $Q(x, y) = z^2$, y is even, $2 \le y \le \sqrt[4]{N} + 2(T - 1)$,
$|x|y < T^4\sqrt{N}$, $0 \le z < (T^2 - 1)\sqrt{N}$, and gcd(x + by − z, N) is a
nontrivial factor of N. Furthermore, at least T − 1 of the solutions
have x > 0.

The inequalities of Theorem 6.10 require that $m > T^2\sqrt[4]{N}$, slightly
larger than in the preliminary algorithm. Now let T = c log N for
some constant c. By the Prime Number Theorem, the heuristic
probability that a single value of z in Theorem 6.10 is relatively prime
to all primes between $T^2\sqrt[4]{N}$ and $2T^2\sqrt[4]{N}$ is 1/ log N. Therefore, we
expect to find a suitable $m < c\sqrt[4]{N}(\log N)^2$, for some small c > 0.

Example 6.11. Factor N = 13290059 by McKee’s algorithm.

We have $b = \lceil\sqrt{N}\rceil = 3646$ and $2\sqrt[4]{N} \approx 121$ and
$$Q(x, y) = (x + by)^2 - Ny^2 = x^2 + 7292xy + 3257y^2$$

with discriminant 4N. The roots of $Q(x_0, 1) = 0$ are $-b \pm \sqrt{N}$. If
we try the primes m = 127, 131, . . ., the first one that leads to a
factor of N is m = 239. We have $N \bmod m^2 = 37987$. By Example
3.9, the square roots of 37987 modulo $m^2$ are ±16506, so the roots of
$Q(x_0, 1) \equiv 0 \pmod{m^2}$ are
$$x_0 \equiv -b \pm \sqrt{N} \equiv -3646 \pm 16506 \equiv 12860 \text{ or } 36969 \pmod{m^2}.$$

The appropriate convergents for $x_0/m^2 = 12860/57121$ are 0/1, 2/9,
9/40, . . .. The convergent λ/y = 9/40 gives
$$Q(x_0 y - \lambda m^2, y) = Q(12860 \cdot 40 - 9 \cdot 57121, 40) = Q(311, 40) = 9799^2 = z^2.$$
The factor is gcd(311 + by − z, N) = gcd(136352, 13290059) = 4261.


If we allowed composite m, then m = 155 is the first one that
works. In this case, there are four square roots of N modulo $m^2$
because m = 5 · 31. The reader may check that $x_0 = 18557$ satisfies
$Q(x_0, 1) \equiv 0 \pmod{m^2}$ and the second convergent λ/y = 3/4 to
$x_0/m^2$ leads to $Q(2153, 4) = 8215^2 = z^2$ and gcd(2153 + by − z, N) =
gcd(8522, 13290059) = 4261.
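
The inner step of the preliminary algorithm, for one given m and one given root $x_0$, can be checked mechanically. The sketch below is an illustration under stated assumptions: the name `mckee_try` is chosen here, the root $x_0$ is supplied by the caller (finding it by the method of Example 3.9 is not shown), and the convergents are generated with the recurrence of Theorem 6.5 while running the Euclidean Algorithm on $x_0/m^2$.

```python
from math import gcd, isqrt

def mckee_try(N, m, x0):
    """Search the convergents lam/y of x0/m^2 for a square Q(x0*y - lam*m^2, y)."""
    b = isqrt(N - 1) + 1                  # ceiling of sqrt(N)
    y_max = isqrt(isqrt(N))               # floor of the fourth root of N
    M = m * m
    num, den = x0, M                      # Euclidean Algorithm state
    A2, B2 = 0, 1                         # convergent two steps back
    A1, B1 = 1, 0                         # previous convergent (A_{-1}, B_{-1})
    while den != 0:
        q, r = divmod(num, den)
        num, den = den, r
        A2, B2, A1, B1 = A1, B1, q * A1 + A2, q * B1 + B2
        lam, y = A1, B1
        if y > y_max:
            break
        x = x0 * y - lam * M
        if x <= 0:                        # keep only lam/y < x0/m^2
            continue
        z2 = (x + b * y) ** 2 - N * y * y
        if z2 >= 0 and isqrt(z2) ** 2 == z2:
            g = gcd(x + b * y - isqrt(z2), N)
            if 1 < g < N:
                return g
    return None
```

With the data of Example 6.11, `mckee_try(13290059, 239, 12860)` returns 4261 via the convergent 9/40, and `mckee_try(13290059, 155, 18557)` returns 4261 via the convergent 3/4.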

6.3. Periodic Continued Fractions


We have seen that a continued fraction is finite if and only if it rep-
resents a rational number. When is an infinite continued fraction
periodic? By “periodic” we mean that the partial quotients repeat
after some point, that is,
$$x = [q_0; q_1, \ldots, q_j, \overline{q_{j+1}, \ldots, q_{j+p}}],$$

where the overline indicates the string of p partial quotients that are
repeated forever.
The word “surd” is an old word meaning “irrational number.” A
“quadratic surd” is an irrational number that is a zero of a quadratic
polynomial with integer coefficients.

Theorem 6.12. Let x be a real number. Then the continued fraction
for x is infinite and periodic if and only if x is a quadratic surd.
See Theorem 7.19 of [NZM91] for a proof. In case $x = \sqrt{N}$,
where N is a nonsquare positive integer, the period begins with $q_1$
and the last partial quotient $q_p$ in the period is $2q_0$.
Theorem 6.13 (Galois). Let N be a positive integer and not a
square. Then the continued fraction for $\sqrt{N}$ has the form
$$\sqrt{N} = [q_0; \overline{q_1, q_2, \ldots, q_{p-1}, 2q_0}].$$
Furthermore, the sequence $q_1, q_2, \ldots, q_{p-1}$ is symmetric about its
middle; that is, $q_{p-i} = q_i$ for 1 ≤ i ≤ p − 1.

See Theorem 9.12 of Levesque [Lev77] for a proof.


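Example 6.14.
$$\sqrt{19} = [4; \overline{2, 1, 3, 1, 2, 8}],$$
$$\sqrt{22} = [4; \overline{1, 2, 4, 2, 1, 8}],$$
$$\sqrt{29} = [5; \overline{2, 1, 1, 2, 10}],$$
$$\sqrt{44} = [6; \overline{1, 1, 1, 2, 1, 1, 1, 12}],$$
$$\sqrt{85} = [9; \overline{4, 1, 1, 4, 18}].$$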

In applications to factoring, N will be a large integer whose factors
we seek and the algorithm will compute only a small portion
of the continued fraction, usually much less than half of the period.
Typically, the length p of the period of the continued fraction for $\sqrt{N}$
is approximately $\sqrt{N}$. But note that the period can be much shorter
for a few special N.
Theorem 6.15. Let N be a nonsquare positive integer and let p be
the length of the period of the continued fraction for $\sqrt{N}$. Then p is
1 if and only if $N = y^2 + 1$ for some integer y ≥ 1. Also, p is 2 if and
only if either (a) $N = y^2 z^2 + y$ for some integers y > 1 and z ≥ 1 or
(b) $N = y^2 z^2 + 2y$ for some integers y ≥ 1 and z ≥ 1.

The statement about length p = 1 is an easy exercise. A proof


of the statement about length p = 2 may be found as Theorem 3 in
Arnold [Arn09]. See also the article by Beach and Williams [BW71]
for some other continued fractions with short periods.
There is a simple iteration that computes the $q_i$ in the continued
fraction for $\sqrt{N}$ using only integer arithmetic. Several other integers
are computed during the iteration. One of them is an integer $Q_i$
which satisfies $A_{i-1}^2 - N B_{i-1}^2 = (-1)^i Q_i$ and $0 < Q_i < 2\sqrt{N}$. If
we regard the equation as a congruence modulo N, we have
$A_{i-1}^2 \equiv (-1)^i Q_i \pmod{N}$. In other words, the continued fraction iteration
produces a sequence $\{(-1)^i Q_i\}$ of quadratic residues modulo N whose
absolute values are $< 2\sqrt{N}$, very small indeed. Continued fractions


provide a great way to find many small quadratic residues modulo
N . Of course, one can form many quadratic residues modulo N by
choosing random x modulo N and letting $r = x^2 \bmod N$. These
quadratic residues are “large,” with average value N/2. We want
small quadratic residues because they are

• easier to compute,
• easier to factor,
• more likely to be smooth, and
• easier to test for being a square

than larger ones near N/2. As we noted in Section 5.1, small quadratic
residues r modulo N have a more manageable set of primes p for
which r is a quadratic residue modulo p. These primes p are the only
potential divisors of N when Trial Division is performed on N .
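We define the i-th complete quotient by
$$x_i = \begin{cases} \sqrt{N} & \text{if } i = 0, \\ 1/(x_{i-1} - q_{i-1}) & \text{if } i \ge 1. \end{cases}$$
One can prove that $x_i = (P_i + \sqrt{N})/Q_i$ for i ≥ 0, where

(6.7) $$P_i = \begin{cases} 0 & \text{if } i = 0, \\ q_0 & \text{if } i = 1, \\ q_{i-1} Q_{i-1} - P_{i-1} & \text{if } i \ge 2 \end{cases}$$

and

(6.8) $$Q_i = \begin{cases} 1 & \text{if } i = 0, \\ N - q_0^2 & \text{if } i = 1, \\ Q_{i-2} + (P_{i-1} - P_i)\, q_{i-1} & \text{if } i \ge 2. \end{cases}$$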
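The $q_i$ can be computed using

(6.9) $$q_i = \lfloor x_i \rfloor = \begin{cases} \lfloor \sqrt{N} \rfloor & \text{if } i = 0, \\[4pt] \left\lfloor \dfrac{q_0 + P_i}{Q_i} \right\rfloor = \left\lfloor \dfrac{\sqrt{N} + P_i}{Q_i} \right\rfloor & \text{if } i > 0. \end{cases}$$

Define $A_i$ and $B_i$ as in Theorem 6.5 so that $[q_0; q_1, q_2, \ldots, q_i] = A_i/B_i$
for i ≥ 0.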
Here are some important facts we will need:

(6.10) $$(-1)^i Q_i = A_{i-1}^2 - B_{i-1}^2 N,$$

(6.11) $$N = P_i^2 + Q_i Q_{i-1},$$

(6.12) $$0 \le P_i,\ Q_i < 2\sqrt{N}.$$

See Riesel [Rie94] for a proof of these facts.
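
The recurrences (6.7), (6.8), (6.9) and Theorem 6.5 need only integer arithmetic, as the following sketch illustrates. The function name `sqrt_cf` and the row layout are choices made for this illustration; N is assumed to be a nonsquare positive integer.

```python
from math import isqrt

def sqrt_cf(N, terms):
    """Rows (q_i, P_i, Q_i, A_i, B_i), i = 0..terms-1, for the expansion of sqrt(N)."""
    q0 = isqrt(N)
    P = [0, q0]                           # P_0, P_1 from (6.7)
    Q = [1, N - q0 * q0]                  # Q_0, Q_1 from (6.8)
    q = [q0]
    A, B = [q0], [1]                      # A_0, B_0 from Theorem 6.5
    for i in range(1, terms):
        if i >= 2:
            P.append(q[i - 1] * Q[i - 1] - P[i - 1])
            Q.append(Q[i - 2] + (P[i - 1] - P[i]) * q[i - 1])
        q.append((q0 + P[i]) // Q[i])     # (6.9), integers only
        A.append(q[i] * A[i - 1] + (A[i - 2] if i >= 2 else 1))
        B.append(q[i] * B[i - 1] + (B[i - 2] if i >= 2 else 0))
    return list(zip(q, P, Q, A, B))
```

For N = 85 the first six rows have partial quotients 9, 4, 1, 1, 4, 18, and each row can be checked against (6.10): for example $A_1^2 - 85 B_1^2 = 37^2 - 85 \cdot 4^2 = 9 = Q_2$.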

Example 6.16. Table 2 gives the sequences for the continued fraction
of $\sqrt{85}$ = 9.219544457292887 . . ..

Table 2. Continued fraction expansion for $\sqrt{85}$.

   i   q_i   P_i   Q_i        A_i       B_i   A_i/B_i
  −1     −     −     −          1         0   −
   0     9     0     1          9         1   9.0
   1     4     9     4         37         4   9.25
   2     1     7     9         46         5   9.20
   3     1     2     9         83         9   9.2222
   4     4     7     4        378        41   9.21951
   5    18     9     1       6887       747   9.2195448
   6     4     9     4      27926      3029   9.219544404
   7     1     7     9      34813      3776   9.2195444915
   8     1     2     9      62739      6805   9.2195444526
   9     4     7     4     285769     30996   9.21954445735
  10    18     9     1    5206581    564733   9.2195444572922
  11     4     9     4   21112093   2289928   9.21954445729298
The reader should use the formulas to check the values in the
table. Note also that the approximations $A_i/B_i$ alternate above and
below the actual value of $\sqrt{85}$ and rapidly approach this limit. One
can prove that they always alternate this way.

Legendre used equation (6.10) to find small quadratic residues


modulo N and used them to limit the possible divisors in Trial Divi-
sion of N , as described in Section 5.1.

6.4. A General Plan for Factoring


The following theorem has been known for a long time.

Theorem 6.17. If N is a composite positive integer, if x and y
are integers, and if $x^2 \equiv y^2 \pmod{N}$, but $x \not\equiv \pm y \pmod{N}$, then
gcd(x − y, N) and gcd(x + y, N) are proper factors of N.

Proof. The congruence shows that N divides $x^2 - y^2 = (x - y)(x + y)$,


while the incongruences imply that N does not divide either x − y or
x + y. Hence, at least one prime factor of N does not divide x − y and
so must divide x + y. Likewise, at least one prime factor of N divides
x − y. Therefore both greatest common divisors exceed 1. Neither
greatest common divisor can equal N because of the incongruences.
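
In code, Theorem 6.17 amounts to two gcd computations. The helper below is a small sketch (its name and return convention are chosen here): it checks the congruence and returns whichever of the two greatest common divisors are proper factors, so an empty result signals that $x \equiv \pm y \pmod{N}$ or a similar degenerate case.

```python
from math import gcd

def split_from_squares(x, y, N):
    """Given x^2 = y^2 (mod N), return the proper factors among gcd(x -+ y, N)."""
    if (x * x - y * y) % N != 0:
        raise ValueError("x^2 and y^2 are not congruent modulo N")
    candidates = (gcd(x - y, N), gcd(x + y, N))
    return tuple(g for g in candidates if 1 < g < N)
```

For N = 85, the pair x = 37, y = 3 gives `(17, 5)`, while the pair x = 83, y = 2 gives `()` because 83 ≡ −2 (mod 85).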


Later we will tell how to find x and y with $x^2 \equiv y^2 \pmod{N}$.
However, it is difficult to ensure that $x \not\equiv \pm y \pmod{N}$, so we ignore
this condition. In case N is a prime power (higher than the first
power), the theorem could fail without the condition. For example,
$9^2 \equiv 16^2 \pmod{25}$ and gcd(16 − 9, 25) = 1 and gcd(16 + 9, 25) = 25,
neither of which is a proper factor of 25. A problem might arise also
when N is even. For example, $4^2 \equiv 6^2 \pmod{10}$ and gcd(4 + 6, 10) =
10, which is not a proper factor of 10. We can easily factor even
numbers. We can also factor prime powers N by computing $N^{1/k}$ (by
Newton’s method, perhaps) for $2 \le k \le \log_2 N$. The next theorem
tells how several modern factoring algorithms finish.

Theorem 6.18. If N is an odd positive integer with at least two


different prime factors and if x and y are chosen randomly subject to
$x^2 \equiv y^2 \pmod{N}$, then, with probability ≥ 1/2, gcd(x − y, N) is a


proper factor of N .

Proof. Suppose that N is odd and has k > 1 distinct prime factors
and that $x^2 \equiv y^2 \pmod{N}$. Then $x^2 \equiv y^2 \pmod{p^e}$ for each of the
k distinct prime power divisors $p^e$ of N. The number $y^2$ is clearly a
quadratic residue modulo p. The congruence $z^2 \equiv y^2 \pmod{p^e}$ has
exactly two solutions z. Since y and −y are clearly two solutions,
$z \equiv \pm y \pmod{p^e}$. By the Chinese Remainder Theorem, given y,
there are $2^k$ solutions z to $z^2 \equiv y^2 \pmod{N}$, one for each choice of
the ± sign in each congruence $z \equiv \pm y \pmod{p^e}$. The solutions with
$x \equiv \pm y \pmod{N}$ are two of these $2^k$ solutions. Therefore, if x and
y are chosen randomly subject to $x^2 \equiv y^2 \pmod{N}$, the probability
that $x \not\equiv \pm y \pmod{N}$ is $(2^k - 2)/2^k = 1 - 2^{1-k}$. Since k > 1, the
probability is at least 1/2 that a random congruence $x^2 \equiv y^2 \pmod{N}$
will yield a factorization of N. □
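
The count in this proof can be checked by brute force for small N. The sketch below enumerates pairs (x, y) with $x^2 \equiv y^2 \pmod{N}$ and reports the fraction with $x \not\equiv \pm y \pmod{N}$. It restricts x and y to residues relatively prime to N, which is an assumption added here so that each congruence modulo $p^e$ has exactly two solutions; the function name is hypothetical.

```python
from fractions import Fraction
from math import gcd

def splitting_fraction(N):
    """Fraction of unit pairs with x^2 = y^2 (mod N) that have x != +-y (mod N)."""
    units = [u for u in range(1, N) if gcd(u, N) == 1]
    total = good = 0
    for x in units:
        for y in units:
            if (x * x - y * y) % N == 0:
                total += 1
                if (x - y) % N != 0 and (x + y) % N != 0:
                    good += 1
    return Fraction(good, total)
```

For N = 15 (k = 2) the result is 1/2, and for N = 105 (k = 3) it is 3/4, matching $1 - 2^{1-k}$.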

We saw in Section 3.2 that we can compute in probabilistic poly-


nomial time a square root of any quadratic residue r modulo N , pro-
vided we know the factors of N . The following corollary shows that
computing square roots modulo N is polynomial-time equivalent to
factoring N .
Corollary 6.19. Let N have at least two different odd prime factors.
If there is a probabilistic polynomial time algorithm A to find a
solution x to $x^2 \equiv r \pmod{N}$ for any quadratic residue r modulo N, then
there is a probabilistic polynomial time algorithm B to find a factor
of N.

Proof. The algorithm B works as follows. It chooses a random y


in 0 < y < N . If g = gcd(y, N ) > 1, then g is a proper factor of
N. Otherwise, $r = y^2 \bmod N$ is a quadratic residue modulo N. Call
A with input r. If A returns with output y or N − y, repeat the
above steps with a new random y. Otherwise, let x be the output of
A. Then gcd(x − y, N ) is a proper factor of N . The probability of
success for each random y is at least 1/2 by Theorem 6.18. 

The general plan of several factoring algorithms, including some


old ones and some of the fastest known ones, is to generate (some)
pairs of integers x, y with $x^2 \equiv y^2 \pmod{N}$ and hope that at least


one of gcd(x − y, N ), gcd(x + y, N ) is a proper factor of N . Theorem
6.18 says that we will not be disappointed often. It says that each
such pair gives at least a 50% chance to factor N . If N has more
than two (different) prime factors, then at least one of the greatest
common divisors will be composite and we will have more factoring
to do. In the fastest modern factoring algorithms it may take a long
time to produce the first pair x, y, but after it is found, many more
random pairs are produced quickly, and these are likely to yield all
prime factors of N . Example 10.5 illustrates the use of two congruent
pairs to factor an integer having three prime factors.

6.5. Lehmer and Powers


Before Lehmer and Powers, a few mathematicians used continued
fractions to factor numbers by seeking a square $(-1)^k Q_k$, which can
happen only when $k$ is even.

Example 6.20. Suppose we wish to factor 85. Example 6.16 gives
the continued fraction for $\sqrt{85}$. With numbers from Table 2, equation
(6.10) shows that
    $37^2 - 4^2 \cdot 85 = A_1^2 - B_1^2 N = (-1)^2 Q_2 = 9 = 3^2$,
which implies that $37^2 \equiv 3^2 \pmod{85}$. This is an instance of Theorem
6.18 with $x = 37$, $y = 3$, and $N = 85$. Then $\gcd(37 - 3, 85) = 17$
and $\gcd(37 + 3, 85) = 5$ are proper factors of 85. But this calculation
doesn't always succeed, as shown by
    $83^2 - 9^2 \cdot 85 = A_3^2 - B_3^2 N = (-1)^4 Q_4 = 4 = 2^2$,
which implies that $83^2 \equiv 2^2 \pmod{85}$. Here we find $\gcd(83 - 2, 85) = 1$
and $\gcd(83 + 2, 85) = 85$, which are not proper factors of 85.
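The numbers in this example can be regenerated mechanically. The following sketch assumes the standard recurrences for the continued fraction of $\sqrt N$ (which we take to be the content of equations (6.7), (6.8), (6.9), and (6.3)): $q_k = \lfloor (q_0 + P_k)/Q_k \rfloor$, $P_{k+1} = q_k Q_k - P_k$, $Q_{k+1} = (N - P_{k+1}^2)/Q_k$, and $A_k = q_k A_{k-1} + A_{k-2}$. The function name `cf_sqrt_rows` is ours, and $N$ is assumed not to be a perfect square.

```python
from math import gcd, isqrt

def cf_sqrt_rows(N, steps):
    """Rows (k, q_k, P_k, Q_k, A_{k-1} mod N) of the continued fraction of sqrt(N)."""
    q0 = isqrt(N)
    P, Q = 0, 1                 # P_0, Q_0
    A_prev2, A_prev = 0, 1      # A_{-2}, A_{-1}
    rows = []
    for k in range(steps):
        q = (q0 + P) // Q
        rows.append((k, q, P, Q, A_prev))
        A_prev2, A_prev = A_prev, (q * A_prev + A_prev2) % N
        P_next = q * Q - P
        Q = (N - P_next * P_next) // Q   # exact division
        P = P_next
    return rows

N = 85
for k, q, P, Q, A in cf_sqrt_rows(N, 6):
    # A square Q_k with k even gives A_{k-1}^2 = y^2 (mod N).
    if k > 0 and k % 2 == 0 and isqrt(Q) ** 2 == Q:
        y = isqrt(Q)
        print(k, A, y, gcd(A - y, N), gcd(A + y, N))
```

For $N = 85$ the loop reports the two squares used above: $k = 2$ splits 85, while $k = 4$ yields only the trivial factors.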
Square values of $(-1)^k Q_k$ are somewhat rare, as we will see when
we discuss square forms factoring (SQUFOF). The paper of Lehmer
and Powers [LP31] addressed this scarcity and proposed two methods
to factor $N$, both based on the continued fraction expansion of $\sqrt N$,
and proved that both would succeed or both would fail for any given
$N$, except in rare circumstances.¹

¹ The rare circumstance is that $\gcd(Q_k, N) > 1$.

The first method uses the numbers $P_k$ and $Q_k$. Equation (6.11)
implies the congruence $-Q_{k-1} Q_k \equiv P_k^2 \pmod N$. For $k = 1$ this
is $-Q_1 \equiv P_1^2 \pmod N$ since $Q_0 = 1$. For $k = 2$ we get
$Q_2 P_1^2 \equiv P_2^2 \pmod N$. Continuing this way, one shows by induction that

(6.13)    $(-1)^k Q_k (P_{k-1} P_{k-3} \cdots P_r)^2 \equiv (P_k P_{k-2} \cdots P_s)^2 \pmod N$,

where $(r, s) = (1, 2)$ or $(2, 1)$, according as $k$ is even or odd.
Now suppose we can find $Q_i$ and $Q_j$ whose product is a square and
where $i$ and $j$ have the same parity. Then $(-1)^i Q_i x^2 = (-1)^j Q_j y^2$
for some integers $x$ and $y$. Using congruence (6.13) twice, we get

(6.14)    $(x P_{i+1} P_{i+3} \cdots P_{j-1})^2 \equiv (y P_{i+2} P_{i+4} \cdots P_j)^2 \pmod N$,

which is an instance of Theorem 6.18 and might lead to a factorization
of $N$. This method extends to the case in which the product of more
than two $(-1)^k Q_k$'s is a square.
The second method uses equation (6.10), which implies the
congruence

(6.15)    $A_{k-1}^2 \equiv (-1)^k Q_k \pmod N$.

Now if we find $(-1)^i Q_i x^2 = (-1)^j Q_j y^2$, then congruence (6.15) gives
    $(x A_{i-1})^2 \equiv (y A_{j-1})^2 \pmod N$,
which is an instance of Theorem 6.18 and might lead to a factorization
of $N$. This method extends to the case in which the product of more
than two $(-1)^k Q_k$'s is a square even more easily than the earlier
method. For example, if $(-1)^i Q_i (-1)^j Q_j x^2 = (-1)^k Q_k y^2$, then
    $(x A_{i-1} A_{j-1})^2 \equiv (y A_{k-1})^2 \pmod N$,
another instance of Theorem 6.18.

Example 6.21. Factor $N = 13290059$, a factor of the fifty-third term
$s^{53}(276)$ of the aliquot sequence that begins with 276 in Example
4.33. Table 3 is an excerpt from the continued fraction table (similar
to Table 2) for $\sqrt N$. Note that the numbers $A_{k-1}$ are reduced modulo
$N$, that their subscripts are 1 less than the other subscripts in that
row, and that the $Q_k$ are factored. The continued fraction begins
$\sqrt{13290059} = [3645; 1, 1, 4, 5, 3, \ldots]$.

Table 3. Continued fraction for $\sqrt{13290059}$.

  k    q_k    P_k    Q_k              A_{k-1} mod N
  0   3645      0    1                            1
  1      1   3645    2 · 2017                  3645
  2      1    389    3257                      3646
  3      4   2868    5 · 311                   7291
  4      5   3352    1321                     32810
  5      3   3253    2 · 5^2 · 41            171341
  ⋮      ⋮      ⋮    ⋮                            ⋮
 22      1   1134    41 · 113               5235158
 23     31   3499    2 · 113                1914221
 24      1   3507    5 · 877               11415773
 25      1    878    5 · 571                  39935
 26      1   1977    2 · 31 · 53           11455708
 27      1   1309    13 · 271              11495643
 28      2   2214    2381                   9661292
 29      2   2548    5 · 571                4238109

First note that $Q_{25} = Q_{29} = 5 \cdot 571$ and both 25 and 29 are odd,
so congruence (6.14) becomes
    $(P_{26} P_{28})^2 \equiv (P_{27} P_{29})^2 \pmod N$
and $\gcd(P_{26} P_{28} - P_{27} P_{29}, N)$ gives a proper factor of $N$.


To illustrate the use of three $Q$'s, observe that
    $(-1)^5 Q_5 \, (-1)^{22} Q_{22} \, (-1)^{23} Q_{23} = 2 \cdot 5^2 \cdot 41 \cdot 41 \cdot 113 \cdot 2 \cdot 113$
is a square. After canceling equal factors on each side of the
congruence obtained from multiplying (6.13) for $k = 5$, 22, and 23, we
obtain
    $(5 P_2 P_4 P_{23})^2 \equiv (113 P_1 P_3 P_5)^2 \pmod N$
and $\gcd(5 P_2 P_4 P_{23} - 113 P_1 P_3 P_5, N)$ factors $N$.


Using the same three $Q$'s, but congruence (6.15) instead of
congruence (6.13), gives a simpler calculation and shows how the $A$ are
used. This leads to the congruence²
    $(5 A_{21} A_{22})^2 \equiv (113 A_4)^2 \pmod N$
and $\gcd(5 A_{21} A_{22} - 113 A_4, N)$ factors $N$.
The reader should check that all three approaches factor $N$.
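As a partial answer to the suggested check, here is a small sketch of the third approach, the one based on congruence (6.15). The values of $A_4$, $A_{21}$, $A_{22}$ and $Q_5$, $Q_{22}$, $Q_{23}$ are copied from Table 3; the dictionary layout is just a convenience for this illustration.

```python
from math import gcd

N = 13290059
# Entries copied from Table 3; A[k-1] sits in the row labelled k.
A = {4: 171341, 21: 5235158, 22: 1914221}
Q = {5: 2 * 5**2 * 41, 22: 41 * 113, 23: 2 * 113}

# Congruence (6.15): A_{k-1}^2 = (-1)^k Q_k (mod N) for each row used.
for k in Q:
    assert (A[k - 1] ** 2 - (-1) ** k * Q[k]) % N == 0

x = 5 * A[21] * A[22] % N
y = 113 * A[4] % N
assert (x * x - y * y) % N == 0      # an instance of Theorem 6.18
g = gcd(x - y, N)
print(g, N // g)
```

The other two approaches can be checked the same way once the needed $P_k$ are taken from the table or recomputed.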

Lehmer and Powers computed with hand calculators in the 1930s.
The sets of $Q$ whose product was a square were obtained by inspecting
the factored $Q$ written on paper and trying to match the prime
factors. The method was abandoned because computing the continued
fraction sequences was tedious and, as Lehmer said many years later,
too often “your butterfly net was empty,” meaning that either no
square product of $Q$ could be found or one was found that produced
only the trivial factors 1 and $N$ of $N$.

6.6. Continued Fraction Factoring Algorithm


This factoring algorithm has been superseded by faster algorithms
in later chapters. We study it here because, historically, it was the
first subexponential integer factoring algorithm and because the later
algorithms build on the ideas invented to make this one work.
Morrison and Brillhart reprised the algorithm of Lehmer and
Powers—the version using the A’s—and programmed it on a com-
puter. The computer did the tedious computation of the continued
fraction sequences and factored the Q’s. But how did the computer
“inspect” the factored Q to form a square? The brilliant idea of
Morrison and Brillhart was to use Gaussian elimination on vectors
of exponents modulo 2, a task easily programmed, to form squares.
If the computer’s “butterfly net was empty,” it would just swing it
again.
Like the algorithm of Lehmer and Powers, the Continued Fraction
Factoring Algorithm (CFRAC) of Morrison and Brillhart [MB75]
uses the fact that, since the $Q_i$ are small (near $\sqrt N$), they are more
likely to be smooth than numbers near $N/2$, say, because $u$ will be
only half as big in equation (3.1). The algorithm uses the continued
fraction expansion for $\sqrt N$ to generate the sequences $\{Q_i\}$ and
$\{A_i \bmod N\}$ and tries to factor each $Q_i$ by Trial Division. An
innovation of Morrison and Brillhart was to restrict the primes in the Trial
Division to those below some bound $B$; this set of primes is called the
factor base. That is, if a $Q_i$ has prime factors larger than $B$, then
that $Q_i$ is discarded.
The CFRAC saves the $B$-smooth $Q_i$, together with the corresponding
$A_{i-1}$, representing the relation $A_{i-1}^2 \equiv (-1)^i Q_i \pmod N$. When
enough relations have been collected, Gaussian elimination is used to
find linear dependencies (modulo 2) among the exponent vectors of
the relations. There are enough relations when there are more of them
than primes in the factor base. Each linear dependency produces a
congruence $x^2 \equiv y^2 \pmod N$ and a chance to factor $N$ by Theorem
6.18.

² Remember that $A_{21}$ is in the row with $k = 22$.
There is a second restriction beyond $p \le B$ on the primes in
the factor base. Suppose the prime $p$ divides $Q_i$. The equation
$A_{i-1}^2 - N B_{i-1}^2 = (-1)^i Q_i$ shows that $(A_{i-1}/B_{i-1})^2 \equiv N \pmod p$,
so $N$ is a quadratic residue modulo $p$. Therefore, the factor base
should contain only primes $p$ for which $N$ is a quadratic residue, that
is, about half³ of the primes up to $B$. This trick was discussed in
Section 5.1. Assuming a couple of plausible hypotheses, Pomerance
[Pom82] proved that the time complexity of the CFRAC is $L(N)^{\sqrt 2}$,
where $L(x) = \exp\left(\sqrt{(\ln x) \ln\ln x}\right)$. Schroeppel proved this
complexity earlier, but he did not publish it. Even earlier, several people had
found incorrect formulas for this complexity.
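The quadratic-residue restriction is easy to apply in practice. The sketch below is one possible way to select the factor base, using Euler's criterion in place of a Legendre-symbol routine; the function name `factor_base` and the choice to always keep the prime 2 for odd $N$ are assumptions of this illustration.

```python
from math import isqrt

def factor_base(N, B):
    """Primes p <= B for which N is a quadratic residue (2 is kept for odd N)."""
    primes = [p for p in range(2, B + 1)
              if all(p % d for d in range(2, isqrt(p) + 1))]
    # Euler's criterion: for an odd prime p not dividing N,
    # N^((p-1)/2) = 1 (mod p) exactly when N is a quadratic residue mod p.
    return [p for p in primes if p == 2 or pow(N, (p - 1) // 2, p) == 1]

print(factor_base(13290059, 120))
```

Every prime that appears in a factored $Q_k$ of Table 3 and lies below the bound must show up in this list, which gives a quick consistency check.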
Let me say a bit more about the linear algebra step. Suppose
there are $K$ primes in the factor base. Call them $p_1, p_2, \ldots, p_K$.
As mentioned above, these are the primes $p \le B$ for which $N$ is a
quadratic residue modulo $p$. They comprise about half of the primes $\le B$.
Our goal is to find a set $S$ of $i$ for which the product $\prod_{i \in S} (-1)^i Q_i$
is the square of an integer. Since a square must be positive, we add
the “prime” $p_0 = -1$ to the factor base. For each $i$ for which $(-1)^i Q_i$
is $B$-smooth, write $(-1)^i Q_i = \prod_{j=0}^{K} p_j^{e_{ij}}$. Thus $e_{i0} = 1$ if $i$ is odd
and 0 if $i$ is even. When $(-1)^i Q_i$ is $B$-smooth, define the vector
$v_i = (e_{i0}, e_{i1}, \ldots, e_{iK})$. Note that when $(-1)^i Q_i$ and $(-1)^j Q_j$ are
multiplied, the corresponding vectors $v_i$, $v_j$ are added. A product
like $\prod_{i \in S} (-1)^i Q_i$ is a square if and only if all entries in the vector
sum $\sum_{i \in S} v_i$ are even numbers. Form a matrix with $K + 1$ columns
whose rows are the vectors $v_i$ (reduced modulo 2) for which $(-1)^i Q_i$
is $B$-smooth. If there are more rows than columns in this matrix,
then Gaussian elimination will find nontrivial dependencies among
the rows modulo 2. Since there are only two numbers modulo 2 (0
and 1), either a row is in such a dependency or it is not. Let $S$ be
the set of $i$ for which the row $v_i$ is in the dependency. Each nontrivial
dependency, say, $\sum_{i \in S} v_i = 0$, gives a product $\prod_{i \in S} (-1)^i Q_i$ which is
a square, say, $y^2$. Let $x = \prod_{i \in S} A_{i-1} \bmod N$. Then $x^2 \equiv y^2 \pmod N$,
an instance of Theorem 6.18.

³ One might think that the probability that $Q_i$ is $B$-smooth would be lessened
by having only about half of the possible primes available. But there is a heuristic
argument (see Section 4.5.4 of Knuth [Knu81]) that if $p \le B$ does not divide $N$ and
$(N/p) = +1$, then $p$ divides $Q_i$ with probability $2/(p + 1)$ rather than the expected
$1/p$. This higher chance of dividing $Q_i$ compensates for the smaller number of useful
primes $< B$ and leaves the estimate in equation (3.1) essentially unchanged.
Here is a simple version of the CFRAC algorithm:

Algorithm 6.22. Continued Fraction Integer Factoring Algorithm.

Input: A composite integer $N > 1$ to factor.
    Choose an upper bound $B$ for the factor base.
    $p_0 \leftarrow -1$
    Let $p_1, \ldots, p_K$ be the primes $\le B$ with $(N/p_i) = +1$.
    $R \leftarrow 0$
    $i \leftarrow 0$
    while ($R < K + 10$) {
        Use equations (6.7), (6.8), (6.9), and (6.3)
            to compute $P_i$, $Q_i$, $q_i$, $A_{i-1} \bmod N$.
        Try to factor $Q_i$ using only the primes in the factor base.
        If you succeed, save $i$, $Q_i$, $A_{i-1}$ in a file and add 1 to $R$.
        $i \leftarrow i + 1$
    }
    Read $i$, $Q_i$, $A_{i-1}$ from the file.
    Factor the $(-1)^i Q_i$ again.
    Form the matrix with row vectors $v_i$.
    Use linear algebra to find some dependencies among the $v_i$.
    For each dependency, say, $\sum_{i \in S} v_i = 0$,
        Let $y^2 = \prod_{i \in S} (-1)^i Q_i$ and $x = \prod_{i \in S} A_{i-1} \bmod N$.
        If $\gcd(x - y, N)$ is a proper factor of $N$, write it and stop.
Output: A factor of $N$.
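A compact sketch of Algorithm 6.22 follows. It is an illustration under several assumptions, not a tuned implementation: relations are kept in memory rather than in a file, the continued fraction uses the standard recurrences for $P_i$, $Q_i$, $q_i$, $A_i$, the trivial relation from $Q_0 = 1$ is skipped, the elimination packs each exponent vector into the bits of an integer, and the parameter names (`B`, `extra`, `max_steps`) are ours. $N$ is assumed to be odd with no prime factor in the factor base.

```python
from math import gcd, isqrt

def small_primes(bound):
    return [p for p in range(2, bound + 1)
            if all(p % d for d in range(2, isqrt(p) + 1))]

def cfrac(N, B=120, extra=10, max_steps=100_000):
    """Return a proper factor of N, or None if every dependency is trivial."""
    q0 = isqrt(N)
    if q0 * q0 == N:
        return q0
    # Factor base: 2 plus the odd primes p <= B with (N/p) = +1.
    base = [p for p in small_primes(B)
            if p == 2 or pow(N, (p - 1) // 2, p) == 1]
    P, Q = 0, 1                 # P_0, Q_0
    A_prev2, A_prev = 0, 1      # A_{-2}, A_{-1}
    relations = []              # (A_{i-1} mod N, [e_0, e_1, ..., e_K])
    for i in range(max_steps):
        if len(relations) >= len(base) + extra:
            break
        if i > 0:               # skip the trivial relation from Q_0 = 1
            m, exps = Q, [i % 2]        # e_0 records the sign (-1)^i
            for p in base:
                e = 0
                while m % p == 0:
                    m //= p
                    e += 1
                exps.append(e)
            if m == 1:                  # Q_i is smooth over the factor base
                relations.append((A_prev, exps))
        q = (q0 + P) // Q
        A_prev2, A_prev = A_prev, (q * A_prev + A_prev2) % N
        P_next = q * Q - P
        Q = (N - P_next * P_next) // Q
        P = P_next
    # Gaussian elimination modulo 2 on bit-packed exponent vectors.
    pivots = {}                 # top bit -> (reduced vector, rows combined)
    for idx, (_, exps) in enumerate(relations):
        v = sum((e & 1) << j for j, e in enumerate(exps))
        combo = 1 << idx
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = (v, combo)
                break
            v ^= pivots[top][0]
            combo ^= pivots[top][1]
        else:                   # v reduced to 0: combo encodes a dependency S
            S = [k for k in range(idx + 1) if combo >> k & 1]
            x, total = 1, [0] * (len(base) + 1)
            for k in S:
                x = x * relations[k][0] % N
                total = [t + e for t, e in zip(total, relations[k][1])]
            y = 1
            for p, t in zip(base, total[1:]):
                y = y * pow(p, t // 2, N) % N
            g = gcd(x - y, N)
            if 1 < g < N:
                return g
    return None

print(cfrac(13290059))
```

Unlike the algorithm as stated, this sketch tries each dependency as soon as elimination produces it, which avoids storing the dependencies separately.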
Example 6.23. Factor $N = 13290059$ by the CFRAC.
Let the factor base be $\{-1, 2, 5, 31, 43, 53, 113\}$. (We chose this
set of primes to make the example fit on a page.) The continued
fraction expansion for $\sqrt N$, part of which is shown in Table 3, yields
the relations below and many others:

  i    A_{i-1} mod N    (-1)^i    Q_i     Q_i factored
 10          6700527      +1      1333    31 · 43
 23          1914221      -1       226    2 · 113
 26         11455708      +1      3286    2 · 31 · 53
 31          1895246      -1      5650    2 · 5^2 · 113
 40          3213960      +1      4558    2 · 43 · 53

Since a square cannot be negative, we let $-1$ be another “prime”
factor of $(-1)^i Q_i$. Each relation in the table above is represented by
one row in the next table. Each row holds one exponent vector $v_i$
reduced modulo 2:

  i    -1    2    5    31    43    53    113
 10     0    0    0     1     1     0      0
 23     1    1    0     0     0     0      1
 26     0    1    0     1     0     1      0
 31     1    1    0     0     0     0      1
 40     0    1    0     0     1     1      0

By Gaussian elimination modulo 2, or otherwise, one sees that the
rows with $i = 10$, 26, and 40 are linearly dependent, as are the rows
with $i = 23$ and 31. The first dependency gives
    $(6700527 \cdot 11455708 \cdot 3213960)^2 \equiv (2 \cdot 31 \cdot 43 \cdot 53)^2 \pmod{13290059}$
or $141298^2 \equiv 141298^2 \pmod{13290059}$, which fails to factor $N$. The
second dependency gives
    $(1914221 \cdot 1895246)^2 \equiv (2 \cdot 5 \cdot 113)^2 \pmod{13290059}$
or
    $12677605^2 \equiv 1130^2 \pmod{13290059}$,
which yields the factors
    $\gcd(12677605 - 1130, 13290059) = \gcd(12676475, 13290059) = 4261$,
    $\gcd(12677605 + 1130, 13290059) = \gcd(12678735, 13290059) = 3119$.
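The arithmetic of this example can be replayed in a few lines. The relation data below are copied from the first table of the example; the helper name `try_dependency` is ours, and it expects the index set of a genuine dependency (otherwise the product is not a square and the internal check fails).

```python
from math import gcd, isqrt

N = 13290059
# i -> (A_{i-1} mod N, (-1)^i Q_i), copied from the relation table above.
relations = {
    10: (6700527, 1333),
    23: (1914221, -226),
    26: (11455708, 3286),
    31: (1895246, -5650),
    40: (3213960, 4558),
}

def try_dependency(S):
    """Combine the relations indexed by S; return (x, y, gcd(x - y, N), gcd(x + y, N))."""
    x, product = 1, 1
    for i in S:
        a, signed_q = relations[i]
        x = x * a % N
        product *= signed_q
    y = isqrt(product)
    assert y * y == product          # the product really is a square
    return x, y, gcd(x - y, N), gcd(x + y, N)

print(try_dependency([10, 26, 40]))
print(try_dependency([23, 31]))
```

The second call reproduces the congruence $12677605^2 \equiv 1130^2$ and both prime factors.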
There is a problem using the CFRAC when the continued fraction
of $\sqrt N$ is unusually short, such as those cases mentioned in Theorem
6.15. The period may end before enough relations are found. If one
continues the while loop beyond the end of the period, one will just
rediscover the same relations already found. Morrison and Brillhart
[MB75] encountered this problem when they tried to factor
$F_7 = 2^{2^7} + 1 = 2^{128} + 1$. The continued fraction for $\sqrt{F_7}$ has period length
1 by Theorem 6.15. They solved the problem by factoring $257 F_7$
instead, that is, they expanded $\sqrt{257 F_7}$ in a continued fraction and
reduced the $A_{i-1}$ in this continued fraction modulo $F_7$.
The idea of combining several relations $x_i^2 \equiv r_i \pmod N$ to form a
congruence $x^2 \equiv y^2 \pmod N$ and get a chance to factor $N$ goes back
at least to Kraitchik [Kra29] and Lehmer and Powers [LP31].
Here are some of the important new ideas in [MB75]:
• the fixed factor base,
• using linear algebra modulo 2 to combine relations to form
a square, and
• using large primes to augment the factor base.
Here is how the third idea works. When you try to factor $Q_i$
using the primes in the factor base, you often fail because a
cofactor $> 1$ remains. It is easy to test the cofactor for being prime (or
probably prime). If it is prime (and larger than the primes in the
factor base but not too large), save the relation. It is called a “large
prime relation” because it has one prime factor larger than those
in the factor base. If another relation with the same large prime is
found, then one can make a useful relation by multiplying them. Say
$A_{i-1}^2 \equiv (-1)^i Q_i \pmod N$ and $A_{j-1}^2 \equiv (-1)^j Q_j \pmod N$. Then their
product, $(A_{i-1} A_{j-1})^2 \equiv (-1)^{i+j} Q_i Q_j \pmod N$, will have one prime
not in the factor base appearing on the right side, but this prime will
be squared, so it will not appear in the matrix, and the product
relation can be used in the construction of a congruence $x^2 \equiv y^2 \pmod N$.
If the number of possible large primes is $t$, then repeated large primes
will begin to appear roughly when the number of relations⁴ with large
primes reaches $1.18\sqrt t$. The use of large primes this way does not
change the theoretical time complexity of the CFRAC, but it does
have huge practical value. If the upper limit $B$ on the primes in the
factor base and the upper limit on the size of large primes are chosen
well, then most of the relations will come from pairing repeated large
primes. The CFRAC with large primes runs about ten times faster
than the CFRAC without large primes.

⁴ This happens because of the “birthday problem”. See the footnote in Section 5.6.
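The pairing step can be sketched with a dictionary keyed by the large prime. In the illustration below the function name `pair_large_primes` and the triple layout are assumptions; the two sample relations are rows $k = 25$ and $k = 29$ of Table 3, where $Q_k = 5 \cdot 571$, with 571 playing the role of a large prime outside a small factor base.

```python
from collections import defaultdict

N = 13290059

def pair_large_primes(partials):
    """partials: (A_{i-1} mod N, (-1)^i Q_i, large prime) triples.
    Returns (x, smooth, large) with x^2 = smooth * large^2 (mod N)."""
    by_prime = defaultdict(list)
    for a, signed_q, large in partials:
        by_prime[large].append((a, signed_q))
    combined = []
    for large, group in by_prime.items():
        first_a, first_q = group[0]
        for a, signed_q in group[1:]:        # pair each repeat with the first
            x = first_a * a % N
            smooth = first_q * signed_q // (large * large)   # exact division
            combined.append((x, smooth, large))
    return combined

# Rows k = 25 and k = 29 of Table 3, both with odd k, so the sign is -1.
partials = [(39935, -5 * 571, 571), (4238109, -5 * 571, 571)]
for x, smooth, large in pair_large_primes(partials):
    print(x, smooth, large, (x * x - smooth * large * large) % N == 0)
```

The combined relation has a smooth part that factors over the factor base, so its exponent vector can join the matrix like any other row.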

6.7. SQUFOF—SQUare FOrms Factoring


Shanks [Sha75] invented this algorithm soon after the CFRAC al-
gorithm was published and before its time complexity was known.
It turns out that the SQUFOF is much slower than the CFRAC for
factoring sufficiently large numbers. Because of its simplicity and
modest memory requirements, the SQUFOF is often used to factor
small composite numbers that arise when two or more “large primes”
are used in the Quadratic and Number Field Sieves.
The main loop of the SQUFOF is similar to that of the CFRAC
in that it computes the sequences $q_i$, $P_i$, and $Q_i$ of the continued
fraction expansion of $\sqrt N$. However, it does not compute the $A_i$
sequence. The $A_i$ are much larger than the other variables in the
CFRAC. Their computation consumes much of the effort of iterating
the main loop of the CFRAC. Shanks once quipped that, when he
invented the SQUFOF, he removed the A-ness but not the P-ness or
the Q-ness from the CFRAC.
Basically, the SQUFOF expands $\sqrt N$ in a simple continued
fraction, computing the sequences $q_i$, $P_i$, and $Q_i$ until it finds a square
$(-1)^k Q_k$ (so $k$ must be even). Since the SQUFOF does not compute
the $A_i$ sequence, it does not have the $A_{k-1}$ with $A_{k-1}^2 \equiv Q_k \pmod N$,
so it cannot factor $N$ via Theorem 6.18. Instead, the SQUFOF factors
$N$ in a completely different way using the theory of binary quadratic
forms.
See Buell [Bue89] for the needed background on binary quadratic
forms. A function $f(x, y) = ax^2 + bxy + cy^2$ is a binary quadratic form
in the variables $x$ and $y$. The constants $a$, $b$, and $c$ are integers. The
discriminant of $f$ is $b^2 - 4ac$. We will abbreviate $f = (a, b, c)$.
In view of the equation (6.11), the binary quadratic forms
    $F_i = \left((-1)^{i-1} Q_{i-1},\ 2P_i,\ (-1)^i Q_i\right)$
have discriminant $4P_i^2 + 4Q_{i-1} Q_i = 4N$. We have
    $F_1 = (Q_0, 2P_1, -Q_1)$, $F_2 = (-Q_1, 2P_2, Q_2)$,
    $F_3 = (Q_2, 2P_3, -Q_3)$, $F_4 = (-Q_3, 2P_4, Q_4)$, etc.
The forms $F_i$ are the principal period of reduced forms of discriminant
$4N$. The $F_i$ are easy to compute using formulas (6.7), (6.8), and
(6.9). Formulas (6.7) and (6.8) show that the first form is
$F_1 = (1, 2q_0, q_0^2 - N)$, where $q_0 = \lfloor \sqrt N \rfloor$.

Because of the size restriction in (6.12), the sequence $\{F_i\}$ must
repeat. One can show that the period $p$ is the same as the period of
the continued fraction for $\sqrt N$. In fact, we have $F_p = (q_0^2 - N, 2q_0, 1)$
and $F_{p+1} = (1, 2q_0, q_0^2 - N) = F_1$.
The SQUFOF computes the forms $F_1, F_2, \ldots$ until it finds a
square form whose third coefficient $(-1)^n Q_n$ is a square $r^2$ of an
integer $r$ (so $n$ must be even). It turns out that such a form is also
a “square” with respect to composition of binary quadratic forms,
that is, it is equivalent to the composition of a form with itself. If
the square form is $F_n = (-Q, 2P, r^2)$, then one can define its
inverse square root with respect to composition of forms as
$F_n^{-1/2} = (-r, 2P, rQ)$. This form is reduced by the formulas
    $R_0 = P + r \lfloor (q_0 - P)/r \rfloor$, $S_{-1} = r$, $S_0 = (N - R_0^2)/r$,
to give the reduced form $G_0 = (-S_{-1}, 2R_0, S_0)$ equivalent to $F_n^{-1/2}$.
Then the SQUFOF generates a sequence of forms
    $G_i = \left((-1)^{i-1} S_{i-1},\ 2R_i,\ (-1)^i S_i\right)$
via the formulas
    $s_i = \left\lfloor \dfrac{q_0 + R_i}{S_i} \right\rfloor$,
    $R_{i+1} = s_i S_i - R_i$, $S_{i+1} = S_{i-1} + (R_i - R_{i+1}) s_i$,
which are completely analogous to formulas (6.7), (6.8), and (6.9).
In the sequence of forms $\{G_i\}$, the algorithm seeks two
consecutive forms with the same middle term: $2R_i = 2R_{i+1}$. The subscript $i$
for which this happens is approximately $n/2$, where $n$ is the subscript
of the square form found in the first part of the algorithm. When
$R_i = R_{i+1}$ is found, the factor of $N$ is $S_i$ if $S_i$ is odd and $S_i/2$ if $S_i$
is even.

Table 4. Factor $N = 13290059$ by the SQUFOF: The forward forms $F_i$.

  i    (-1)^{i-1} Q_{i-1}    2 · P_i     (-1)^i Q_i
  1                     1    2 · 3645         -4034
  2                 -4034    2 · 389           3257
  3                  3257    2 · 2868         -1555
  4                 -1555    2 · 3352          1321
  5                  1321    2 · 3253         -2050
  ···
 50                 -2327    2 · 2869          2174
 51                  2174    2 · 1479         -5107
 52                 -5107    2 · 3628            25

Table 5. Factor $N = 13290059$ by the SQUFOF: The reverse forms $G_i$.

  i    (-1)^{i-1} S_{i-1}    2 · R_i     (-1)^i S_i
  0                    -5    2 · 3643          3722
  1                  3722    2 · 79           -3569
  2                 -3569    2 · 3490           311
  3                   311    2 · 3352         -6605
  4                 -6605    2 · 3253           410
  ···
 21                  1130    2 · 2603         -5765
 22                 -5765    2 · 3162           571
 23                   571    2 · 3119         -6238
 24                 -6238    2 · 3119           571

Example 6.24. We factor $N = 13290059$ by the SQUFOF. Then
$q_0 = \lfloor \sqrt N \rfloor = 3645$ and $q_0^2 - N = -4034$. Some of the forms are
shown in Table 4. Compare Table 4 with Table 3. At the end of
Table 4 we have found the square form $F_{52} = (-5107, 2 \cdot 3628, 5^2)$.
We compute its inverse square root as $F^{-1/2} = (-5, 2 \cdot 3628, 25535)$
and reduce it to get $G_0 = (-5, 2 \cdot 3643, 3722)$. In Table 5, which
shows the forms $G_i$, we find $R_{23} = R_{24} = 3119$. The factor of $N$
comes from $S_{23} = 6238$. Since this number is even, the factor of $N$
is $S_{23}/2 = 6238/2 = 3119$. Thus, $N = 3119 \cdot 4261$. Note that 23
approximately equals 52/2.

We will not explain more of the details of the SQUFOF, but we
state the algorithm below. See Gower and Wagstaff [GW08] for a
fuller discussion of the algorithm.
In the algorithm below, the variable $N$ is the integer to factor, $s$
remembers $q_0$, $q$ holds the current $q_i$, $P$ and $Pnext$ hold $P_i$ and $P_{i+1}$,
$Qprev$ and $Q$ hold $Q_{i-1}$ and $Q_i$, and $t$ is a temporary variable used
in updating $Q$. Formulas (6.7), (6.8), and (6.9) are used to advance
from one form to the next. The List remembers certain square roots
$r$ that must be avoided because they lead to the trivial factor 1 of $N$.
Only a handful of integers is ever placed on the List.⁵ Finally, $B$ is
an upper bound on the number of forms tested for being square forms
before the algorithm gives up. It could be replaced by a larger bound
with little change. The purpose of $B$ in this simple version is to give
up when the period of the continued fraction is short and it contains
no square form. When this happens, one can try the SQUFOF again
with $N$ replaced by $mN$ for some small, nonsquare integer $m$. Then
the factor of $N$ in the second part will be $Q/\gcd(Q, 2m)$. See [GW08]
for more details.
Just before the while loop begins, the principal form is $(1, 2P, -Q)$
$= (1, 2q_0, -(N - q_0^2))$. Just before the parity test of $Q$ in the middle
of the while loop, the current form is $((-1)^{i-1} Qprev, 2P, (-1)^i Q)$.
If this form has an especially small $Q$ ($\le L$ if $Q$ is even and $\le L/2$
in any case), then the square of this form will appear later (at about
twice $i$ iterations) in the principal period. If we recognize it as a
square form and go to Part 2 of the algorithm, then we will return to
the form with the small $Q$ in the principal period of reduced forms,
cycle back to the beginning of the principal period, and discover the
trivial factor 1. The purpose of putting the small $Q$'s on the List and
checking for $r$ being on the List is to avoid this trap.

⁵ A fancier and often faster version of the SQUFOF uses a Queue data structure
instead of a List.
Here is the first part of the SQUFOF algorithm:

Algorithm 6.25. The SQUFOF, Part 1.

Input: A composite integer $N > 1$ to factor.
    $s \leftarrow \lfloor \sqrt N \rfloor$
    if ($N = s \cdot s$) { write “$N = s \cdot s$” ; exit }
    Create an empty List
    $Qprev \leftarrow 1$
    $P \leftarrow s$
    $Q \leftarrow N - P \cdot P$
    $L \leftarrow \lfloor 2\sqrt{2s} \rfloor$
    $B \leftarrow 3L$
    $i \leftarrow 1$
    while ($i < B$) {
        $q \leftarrow \lfloor (s + P)/Q \rfloor$
        $Pnext \leftarrow q \cdot Q - P$
        if ($Q \le L$) {
            if ($Q$ is even) { put $Q/2$ on the List }
            else if ($Q \le L/2$) { put $Q$ on the List }
        }
        $t \leftarrow Qprev + q \cdot (P - Pnext)$
        $Qprev \leftarrow Q$
        $Q \leftarrow t$
        $P \leftarrow Pnext$
        if ($i$ is odd and $Q = r^2$ and $r$ is an
            integer not on the List) { goto the SQUFOF Part 2 }
        $i \leftarrow i + 1$
    }
    write “the SQUFOF failed; try again with a multiplier” ; exit

When the first part of the SQUFOF ends by going to Part 2, it has
just found the square form $F = (-Qprev, 2P, r^2)$. Its inverse square
root is $F^{-1/2} = (-r, 2P, r \cdot Qprev)$. The first lines of the second part
reduce this form to produce the form $(-Qprev, 2P, Q)$ just before
the while loop begins. The division by $Qprev$ in the third line is
exact. Then the SQUFOF cycles through reduced forms seeking two
consecutive forms with the same middle term: $2P = 2Pnext$. When
they are found, the factor of $N$ is either $Q$ or $Q/2$.
Here is the second part of the SQUFOF algorithm:

Algorithm 6.26. The SQUFOF, Part 2.

    $Qprev \leftarrow r$
    $P \leftarrow P + r \lfloor (s - P)/r \rfloor$
    $Q \leftarrow (N - P \cdot P)/Qprev$
    $i \leftarrow 0$
    while ($i < B$) {
        $q \leftarrow \lfloor (s + P)/Q \rfloor$
        $Pnext \leftarrow q \cdot Q - P$
        if ($P = Pnext$) {
            $Q \leftarrow Q/\gcd(Q, 2)$
            write “$Q$ divides $N$” ; exit
        }
        $t \leftarrow Qprev + q \cdot (P - Pnext)$
        $Qprev \leftarrow Q$
        $Q \leftarrow t$
        $P \leftarrow Pnext$
        $i \leftarrow i + 1$
    }
Output: A factor of $N$.
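The two parts fit together as in the following sketch. It follows the variable names of the algorithms above, with a few choices that are assumptions of this illustration: the List is a Python set called `avoid`, the square test is applied when the value held in $Q$ has an even subscript (that is, after the update in an odd-numbered pass), a square root $r = 1$ is rejected outright, and failure is reported by returning `None` instead of retrying with a multiplier. $N$ is assumed to be odd and composite.

```python
from math import gcd, isqrt

def squfof(N):
    """Return a proper factor of N, or None if this simple version gives up."""
    s = isqrt(N)
    if s * s == N:
        return s
    L = isqrt(8 * s)            # floor(2 * sqrt(2 * s))
    B = 3 * L
    avoid = set()               # the List of square roots leading to the factor 1
    Qprev, P, Q = 1, s, N - s * s
    # Part 1: walk the principal cycle; after the update in pass i, Q holds Q_{i+1}.
    for i in range(1, B):
        q = (s + P) // Q
        P_next = q * Q - P
        if Q <= L:
            if Q % 2 == 0:
                avoid.add(Q // 2)
            elif Q <= L // 2:
                avoid.add(Q)
        Qprev, Q = Q, Qprev + q * (P - P_next)
        P = P_next
        r = isqrt(Q)
        if i % 2 == 1 and r * r == Q and r > 1 and r not in avoid:
            break               # square form found: go to Part 2
    else:
        return None             # no usable square form; a multiplier might help
    # Part 2: reduce the inverse square root and walk until P repeats.
    Qprev = r
    P = P + r * ((s - P) // r)
    Q = (N - P * P) // Qprev    # exact division
    for _ in range(B):
        q = (s + P) // Q
        P_next = q * Q - P
        if P == P_next:
            f = Q // gcd(Q, 2)
            return f if 1 < f < N else None
        Qprev, Q = Q, Qprev + q * (P - P_next)
        P = P_next
    return None

print(squfof(13290059))
```

For $N = 13290059$ the first passes of each loop reproduce the rows of Tables 4 and 5.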

Note that the code for the next form in the reverse cycle in Part
2 is exactly the same as that for the next form in the forward cycle
in Part 1. The number of iterations of the while loop in Part 2 is
approximately half of that in Part 1.
Under plausible hypotheses, one can prove that the average
running time for the SQUFOF is $C \sqrt[4]{N}$, where $C$ depends on the number
of prime factors of $N$ and whether $N \equiv 1$ or $3 \pmod 4$. See Gower
and Wagstaff [GW08] for the hypotheses, proof, and formula for $C$.
Shanks introduced two important new ideas in the SQUFOF:
• keeping virtually all numbers $< 2\sqrt N$ when factoring $N$ and
• using the connections among continued fractions, binary
quadratic forms, and the structure of real quadratic fields
to compute the factors of $N$ in a simple way and to analyze
the time complexity of the SQUFOF.

6.8. Pell’s Equation


We would be remiss not to discuss Pell's equation here, although the
subject has little to do with factoring integers. Pell's equation is
$x^2 - Dy^2 = Q$, where $D$ and $Q$ are given integers and $x$ and $y$ are
unknown integers. Brahmagupta was the first to study this equation.
It is named after Pell because he helped Johann Heinrich Rahn, who
wrote an algebra book in 1658 containing an example of the equation.
Assume in this section that $D$ is not the square of an integer.
Equation (6.10), with $N = D$, gives a lot of solutions to Pell's
equation and connects it to continued fractions.

Theorem 6.27. Let $D$ be a positive integer but not a square. Let
the integer $Q$ satisfy $|Q| < \sqrt D$. If there exist positive integers $x$, $y$
with $x^2 - Dy^2 = Q$, then there exists an integer $k$ with $x = A_{k-1}$,
$y = B_{k-1}$, and $Q = (-1)^k Q_k$.

See Theorem 7.24 of [NZM91] for a proof.

We now concentrate on the solutions with $Q = \pm 1$. Let $p$ denote
the length of the period of the continued fraction for $\sqrt D$. It is the
number $p$ in Theorems 6.13 and 6.15.

Theorem 6.28. Let $D$ be a positive integer but not a square. If the
period length $p$ of the continued fraction for $\sqrt D$ is even, then the
equation $x^2 - Dy^2 = -1$ has no solution and all positive solutions of
$x^2 - Dy^2 = +1$ are given by $x = A_{kp-1}$, $y = B_{kp-1}$, for $k = 1, 2, \ldots$.
However, if $p$ is odd, then $x = A_{kp-1}$, $y = B_{kp-1}$ gives all positive
solutions to $x^2 - Dy^2 = -1$ when $k = 1, 3, 5, \ldots$, and all positive
solutions to $x^2 - Dy^2 = +1$ when $k = 2, 4, 6, \ldots$.

See Theorem 7.25 of [NZM91] for a proof.

Now we can finish the description of the Lehmers' Factoring
Method of Section 5.5. The fundamental solution $(T, U)$ to
$T^2 - DU^2 = 1$ is the solution in smallest positive integers. By Theorem
6.28, we have $(T, U) = (A_{p-1}, B_{p-1})$ when the period $p$ is even and
$(T, U) = (A_{2p-1}, B_{2p-1})$ when $p$ is odd.
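Theorem 6.28 gives a direct way to compute the fundamental solution: run through the convergents $A_k/B_k$ of $\sqrt D$ and stop at the first one with $A_k^2 - D B_k^2 = 1$. The sketch below does exactly that, assuming the standard recurrences for $P_k$, $Q_k$, $q_k$, $A_k$, $B_k$ with $A_{-1} = 1$, $B_{-1} = 0$; the function name `pell_fundamental` is ours.

```python
from math import isqrt

def pell_fundamental(D):
    """Smallest positive (T, U) with T^2 - D U^2 = 1, for nonsquare D > 0."""
    q0 = isqrt(D)
    if q0 * q0 == D:
        raise ValueError("D must not be a perfect square")
    P, Q = 0, 1
    A_prev2, A_prev = 0, 1      # A_{-2}, A_{-1}
    B_prev2, B_prev = 1, 0      # B_{-2}, B_{-1}
    while True:
        q = (q0 + P) // Q
        A = q * A_prev + A_prev2
        B = q * B_prev + B_prev2
        if A * A - D * B * B == 1:
            return A, B
        A_prev2, A_prev = A_prev, A
        B_prev2, B_prev = B_prev, B
        P_next = q * Q - P
        Q = (D - P_next * P_next) // Q
        P = P_next

for D in (2, 3, 7, 61):
    print(D, pell_fundamental(D))
```

The convergents are not reduced modulo anything here, so for some $D$ (61 is a classic case) the solution has many digits even though $D$ is small.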
Example 6.29. Table 6 gives the continued fraction expansion for
$\sqrt 3$. The period length is $p = 2$ and the fundamental solution to
$x^2 - 3y^2 = 1$ is $(T, U) = (A_1, B_1) = (2, 1)$. Note that we have
$A_k^2 - 3B_k^2 = 1$ for all odd $k$ and $A_k^2 - 3B_k^2 = -2$ for all even $k$.

Table 6. Continued fraction expansion for $\sqrt 3$.

  i    q_i    P_i    Q_i    A_i    B_i
  0     1      0      1      1      1
  1     1      1      2      2      1
  2     2      1      1      5      3
  3     1      1      2      7      4
  4     2      1      1     19     11
  5     1      1      2     26     15

The Lehmers' Factoring Method factors $N$ by finding two
representations of $N$ by a quadratic form $\lambda N = x^2 - Dy^2$. We mentioned
in Section 5.5 that when $D < 0$ the search range for $y$ is
$0 \le y < \sqrt{|\lambda N/D|}$. Now we can describe the search range when $D > 1$. When
$D > 1$ and $\lambda > 0$, we have $0 \le y < \sqrt{\lambda (T - 1) N/(2D)}$. When $D > 1$
and $\lambda < 0$, we have $\sqrt{|\lambda N/D|} \le y < \sqrt{|\lambda| (T + 1) N/(2D)}$.

Example 6.30. In Example 5.22 we factored $N = 13290059$ by
searching for solutions to $-N = x^2 - 3y^2$. Here we have $D = 3$
and $\lambda = -1$. By Example 6.29, the fundamental solution to the
Pell equation $T^2 - DU^2 = 1$ is $(T, U) = (2, 1)$. Therefore the search
limit on $y$ is $\sqrt{N/3} \le y < \sqrt{(T + 1)N/(2D)} = \sqrt{3N/6} = \sqrt{N/2}$, or
$2104 \le y < 2577$. In Example 5.22, the values of $y$ were 2234 and
2269, which lie in this interval.
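The search in this example is small enough to run exhaustively. The sketch below scans the stated range for $y$ and keeps every $y$ for which $3y^2 - N$ is a perfect square. The final step, combining two representations $(x_1, y_1)$, $(x_2, y_2)$ through $\gcd(x_1 y_2 - x_2 y_1, N)$, is our assumption about how the two representations are used (it rests on $(x_1 y_2)^2 \equiv (x_2 y_1)^2 \pmod N$); the method itself is described in Section 5.5, not here.

```python
from itertools import combinations
from math import gcd, isqrt

N, D, T = 13290059, 3, 2            # lambda = -1: solve x^2 - 3 y^2 = -N
lo = isqrt(N // D)                  # floor(sqrt(N / 3))
hi = isqrt((T + 1) * N // (2 * D))  # floor(sqrt(N / 2))

solutions = []
for y in range(lo, hi + 1):
    x2 = D * y * y - N
    if x2 >= 0 and isqrt(x2) ** 2 == x2:
        solutions.append((isqrt(x2), y))

# Any two representations give x1^2 y2^2 = x2^2 y1^2 (mod N).
factors = {gcd(x1 * y2 - x2 * y1, N)
           for (x1, y1), (x2, y2) in combinations(solutions, 2)}
print(lo, hi, solutions, factors)
```

Without the Pell bound the scan would have no natural stopping point, since $x^2 - 3y^2 = -N$ has infinitely many solutions.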

Exercises

6.1. Prove the first statement of Theorem 6.15.
6.2. Compute the length of the period of the continued fraction
expansion of $\sqrt N$ for many small $N$ and try to determine when
this length is three. Can you prove your answer?
6.3. Find the length $p$ of the period of the simple continued fraction
of $\sqrt N$, where $N = 13290059$.
6.4. Let $p$ be a 15-digit prime number. Prove that the CFRAC will
fail to factor $N = p^3$.
Chapter 7

Elliptic Curves

Introduction
This chapter describes how elliptic curves are used to factor integers.
We also explain how one can prove that large integers are prime with
elliptic curves. Finally, we give applications of factoring integers to
the construction of certain elliptic curves with desirable properties.
Elliptic curves have been studied for hundreds of years. In 1985,
H. W. Lenstra, Jr. [Len87] invented a factoring method that used
elliptic curves. Basically, Lenstra freed Pollard’s p − 1 factoring algo-
rithm from its dependency on a single number (p − 1) being smooth.
He replaced the multiplicative group of integers modulo p by an ellip-
tic curve modulo p. The integer that needs to be smooth is the size
of the elliptic curve modulo p. This number varies through a small
interval around p depending on the parameters of the elliptic curve.
Soon after this discovery, Miller [Mil87] and Koblitz [Kob87b] gave
applications of elliptic curves to cryptography. Many cryptographic
algorithms use the multiplicative group of integers modulo p. They
replaced this group with an elliptic curve modulo p. This change al-
lows one to use smaller p and design faster cryptographic algorithms
offering the same level of security. Today elliptic curves are an im-
portant tool in cryptography.

In order to understand this chapter, the reader should know what
groups and fields are. See Sections 2.10 and 2.11 of [NZM91] for a
quick review of this topic. The books [ST92] by Silverman and Tate
and [TW02] by Trappe and Washington give a good introduction to
elliptic curves at the undergraduate level.

7.1. Basic Properties of Elliptic Curves


An elliptic curve is the graph of an equation
    $y^2 = x^3 + ax^2 + bx + c$,
where $a$, $b$, $c$, $x$, and $y$ are numbers in an appropriate set (usually a
field) $K$. Typically, $K$ is the set of real numbers, rational numbers,
complex numbers, integers, integers modulo a prime $p$, or any finite
field. In applications to factoring, $K$ is the integers modulo $p$ or $N$.
For cryptography, $K$ is a finite field. An elliptic curve also contains
a special point $\infty$ which is not on the graph. Thus, an elliptic curve
$E$ with parameters $a$, $b$, $c$ is
    $E = \{(x, y) : x, y \in K,\ y^2 = x^3 + ax^2 + bx + c\} \cup \{\infty\}$.
We first look at elliptic curves with $K$ = the real numbers so that
we can draw pictures. In this situation, think of the point $\infty$ as being
infinitely far up (or down) the $y$-axis. This means that every vertical
line passes through $\infty$, but nonvertical lines do not contain $\infty$. The
graph $E$ may have either one or two components, as shown in Figure
1. There are two components when the cubic polynomial has three
real zeros. There is one component when the cubic polynomial has
one real zero.
A simple change of variables removes the $x^2$ term from the
equation for an elliptic curve. To simplify some of the formulas, we assume
that this change has been made, so that the equation has the
Weierstrass form $y^2 = x^3 + ax + b$.
The discriminant of the quadratic polynomial $ax^2 + bx + c$ is
$D = b^2 - 4ac$. When $D > 0$, the polynomial has two real zeros.
When $D < 0$, the polynomial has no real zeros. When $D = 0$, the
polynomial has a repeated real zero.
Figure 1. Graphs of $y^2 = x^3 - 4x + 1$ (left) and $y^2 = x^3 + 1$ (right).

Likewise, the discriminant of the cubic polynomial $x^3 + ax + b$ is
$D = 4a^3 + 27b^2$. When $D > 0$, the polynomial has just one real zero
(and two complex zeros). When $D < 0$, the polynomial has three
distinct real zeros. When $D = 0$, the polynomial has a repeated real
zero, either one triple zero or a double zero and a single one. For
technical reasons, we exclude the case $D = 0$, so that the cubic does
not have a repeated zero. Thus, we are excluding elliptic curves like
those in Figure 2, which have a “double point” and a “cusp.”
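The case analysis above translates directly into a few lines of code. The sketch below (the function name `real_zero_count` is ours) classifies $x^3 + ax + b$ by the sign of $D = 4a^3 + 27b^2$ and applies it to the four curves drawn in Figures 1 and 2.

```python
def real_zero_count(a, b):
    """Number of distinct real zeros of x^3 + a x + b, from D = 4a^3 + 27b^2."""
    D = 4 * a**3 + 27 * b**2
    if D > 0:
        return 1                 # one real zero, two complex zeros
    if D < 0:
        return 3                 # three distinct real zeros
    # D = 0: a triple zero (only x^3 itself) or a double zero plus a single one.
    return 1 if a == 0 and b == 0 else 2

# (a, b) for the curves of Figure 1 and Figure 2, in order.
for a, b in [(-4, 1), (0, 1), (-3, 2), (0, 0)]:
    print((a, b), 4 * a**3 + 27 * b**2, real_zero_count(a, b))
```

The two curves of Figure 1 have nonzero discriminant, while both curves of Figure 2 have $D = 0$, which is why they are excluded.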

Figure 2. Graphs of $y^2 = x^3 - 3x + 2$ (left) and $y^2 = x^3$ (right).


176 7. Elliptic Curves

We now define a way to “add” points of the set
$$E_{a,b} = \{(x, y) : x, y \in K,\ y^2 = x^3 + ax + b\} \cup \{\infty\}.$$
This addition is not vector addition in the plane.
If P = (x, y) lies on the graph of $y^2 = x^3 + ax + b$, define −P =
(x, −y), that is, −P is P reflected in the x-axis. Also define −∞ = ∞.
Given two points P and Q on the graph but not on the same
vertical line, define P + Q = R, where −R is the third point on the
straight line through P and Q.

Figure 3. Adding points P + Q = R on $y^2 = x^3 - 4x + 1$.

If P and Q are distinct points on the graph and on the same
vertical line, then they must have the form (x, ±y), that is, Q = −P,
and we define P + Q = P + (−P) = ∞.
Also define P + ∞ = ∞ + P = P for any element P of the elliptic
curve (including P = ∞). This makes ∞ the identity element of the
group $E_{a,b}$.
To add a point P ≠ ∞ to itself, draw the tangent line to the
graph at P. If the tangent line is vertical, then P = (x, 0) and we
define P + P = ∞. If the tangent line is not vertical, then it intersects
the graph in exactly one more point R, and we define P + P = −R.
(If P is a point of inflection, then R = P.)
Theorem 7.1. An elliptic curve $E_{a,b}$ with the addition operation +
forms an abelian group with identity ∞. The inverse of P is −P.

Proof. The operation + is well defined. It assigns an element P + Q
of $E_{a,b}$ to every pair of elements P, Q of $E_{a,b}$. One checks easily that
∞ is the identity, that the inverse of P is −P, and that P + Q = Q + P.
The associative law (P + Q) + R = P + (Q + R) may be verified by
a tedious calculation using the addition formulas that follow. □

See [Was03] or [ST92] for a proof of the associative law.
Next we present formulas for computing the coordinates of P + Q
in terms of those of P and Q. If one of P, Q is ∞, then we have
already defined P + Q. Let $P = (x_1, y_1)$ and $Q = (x_2, y_2)$ be on the
graph of $y^2 = x^3 + ax + b$. If $x_1 = x_2$ and $y_1 = -y_2$, then P = −Q
and P + Q = ∞. Otherwise, let s be the slope defined as follows.
When P ≠ Q, let $s = (y_2 - y_1)/(x_2 - x_1)$ be the slope of the line
through P and Q. When P = Q, let $s = (3x_1^2 + a)/(2y_1)$ be the slope¹
of the tangent line to the graph at P. Then $P + Q = (x_3, y_3)$, where
$x_3 = s^2 - x_1 - x_2$ and $y_3 = s(x_1 - x_3) - y_1$. (The point $(x_3, -y_3)$ is
the third point on the line through P and Q that lies on the graph.)
Here is the point addition algorithm in pseudocode. It just fol-
lows the rules and computes the numbers explained in the previous
paragraph.
Algorithm 7.2. Elliptic Curve Point Addition.
Input: Two points P, Q on E_{a,b}.
    if (P = ∞) { R ← Q }
    else if (Q = ∞) { R ← P }
    else {
        let P = (x1, y1) and Q = (x2, y2)
        if (x1 = x2 and y2 = −y1) { R ← ∞ }
        else {
            if (P = Q) { s ← (3 x1^2 + a)/(2 y1) }
            else { s ← (y2 − y1)/(x2 − x1) }
            x3 ← s^2 − x1 − x2
            y3 ← s(x1 − x3) − y1
            R ← (x3, y3)
        }
    }
Output: R = P + Q.

¹ The easy way to derive the formula for the slope of the tangent line
is by implicit differentiation of $y^2 = x^3 + ax + b$.

Example 7.3. On the elliptic curve $y^2 = x^3 - 4x + 1$, perform these
point additions: (0, −1) + (2, 1); (0, −1) + (0, −1); (4, 7) + (2, −1).
The first step is to check that these points are actually on the
graph, that is, $(-1)^2 = 0^3 - 4(0) + 1$, . . ., $(-1)^2 = 2^3 - 4(2) + 1$.
For P + Q = (0, −1) + (2, 1), the slope is s = (1 − (−1))/(2 − 0) = 1.
Then $x_3 = (1)^2 - 0 - 2 = -1$ and $y_3 = (1)(0 - (-1)) - (-1) = 2$, so
R = (0, −1) + (2, 1) = (−1, 2). This addition is shown in Figure 3.
One should check the arithmetic by verifying that the sum is a point
on the curve. Here the check is $2^2 = (-1)^3 - 4(-1) + 1$.
For (0, −1) + (0, −1), the slope is $s = (3(0)^2 + (-4))/(2(-1)) = 2$.
Then $x_3 = (2)^2 - 0 - 0 = 4$ and $y_3 = (2)(0 - 4) - (-1) = -7$, so
(0, −1) + (0, −1) = (4, −7).
For (4, 7) + (2, −1), the slope is s = (−1 − 7)/(2 − 4) = 4. Then
$x_3 = (4)^2 - 4 - 2 = 10$ and $y_3 = (4)(4 - 10) - 7 = -31$, so (4, 7) +
(2, −1) = (10, −31).
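The three sums of Example 7.3 can be reproduced with a short Python
sketch of Algorithm 7.2 over the rational numbers. This is only an
illustration: the names ec_add, on_curve, and INF (a stand-in for ∞)
are assumptions of the sketch, and exact arithmetic is delegated to
the standard fractions module.

```python
from fractions import Fraction

INF = None  # assumed stand-in for the point at infinity

def ec_add(P, Q, a):
    """Add two points of y^2 = x^3 + ax + b over the rationals (b is not needed)."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    x1, y1 = map(Fraction, P)
    x2, y2 = map(Fraction, Q)
    if x1 == x2 and y2 == -y1:
        return INF                        # same vertical line: P + (-P) = infinity
    if (x1, y1) == (x2, y2):
        s = (3 * x1**2 + a) / (2 * y1)    # slope of the tangent line
    else:
        s = (y2 - y1) / (x2 - x1)         # slope of the chord
    x3 = s**2 - x1 - x2
    y3 = s * (x1 - x3) - y1
    return (x3, y3)

def on_curve(P, a, b):
    """Check that P satisfies y^2 = x^3 + ax + b."""
    return P is INF or P[1]**2 == P[0]**3 + a * P[0] + b

a, b = -4, 1
for P, Q in [((0, -1), (2, 1)), ((0, -1), (0, -1)), ((4, 7), (2, -1))]:
    R = ec_add(P, Q, a)
    print(P, "+", Q, "=", R, "on curve:", on_curve(R, a, b))
```

The test x1 = x2 and y2 = −y1 also covers doubling a point with y = 0,
where the tangent line is vertical.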

Now we consider elliptic curves modulo a prime p. Examination
of the formulas for adding points shows that if a and b and the
coordinates of points P and Q on the elliptic curve $E_{a,b}$ are all rational
numbers, then the coordinates of P + Q will be rational numbers
(unless P + Q = ∞). Therefore, if a and b and the coordinates of points
P and Q on the elliptic curve $E_{a,b}$ are integers modulo p, then the
coordinates of P + Q will be integers modulo p, unless P + Q = ∞,
provided that any division needed in the slope calculation is by a
number relatively prime to p. The modulus p cannot be even because
we have to divide by 2 in the formula for the slope s when P = Q.
The condition on the discriminant becomes $4a^3 + 27b^2 \not\equiv 0 \pmod p$.
Of course, the graph is just a set of pairs of numbers modulo p, not
a curve in the plane.

Let us look at the points of the elliptic curve $y^2 \equiv x^3 + 2x - 3 \pmod 7$.
Of course, −3 ≡ 4 (mod 7). Note that
$D = 4a^3 + 27b^2 \equiv 4 \cdot 2^3 + 27 \cdot 4^2 \equiv 2 \not\equiv 0 \pmod 7$:

    x    (x^3 + 2x + 4) mod 7    points (x, y)
    0    4                       (0, 2), (0, 5)
    1    0                       (1, 0)
    2    2                       (2, 3), (2, 4)
    3    2                       (3, 3), (3, 4)
    4    6                       none
    5    6                       none
    6    1                       (6, 1), (6, 6)

There are ten points on this elliptic curve, counting ∞.
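A table like this one can be produced by brute force. The Python
sketch below (the helper name curve_points is an assumption) tries
every pair (x, y) modulo p, which is practical only for tiny primes.

```python
def curve_points(p, a, b):
    """List the points (x, y) with y^2 = x^3 + ax + b (mod p), excluding infinity."""
    pts = []
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        for y in range(p):
            if (y * y) % p == rhs:
                pts.append((x, y))
    return pts

pts = curve_points(7, 2, 4)       # b = -3 is congruent to 4 modulo 7
print(pts)
print("size including infinity:", len(pts) + 1)
```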

Example 7.4. Add the points (3, 3) + (6, 1) on the curve whose points
were just listed.
We have s = (1 − 3)/(6 − 3) = −2/3. The Extended Euclidean
Algorithm gives $3^{-1} \equiv 5 \pmod 7$, so s ≡ (−2)(5) ≡ −10 ≡ 4 (mod 7).
Thus, $x_3 = 4^2 - 3 - 6 = 7 \equiv 0 \pmod 7$ and $y_3 \equiv 4(3 - 0) - 3 \equiv
2 \pmod 7$, so the sum is (0, 2).

Example 7.5. Double the point (2, 4) on the same curve.
We must add (2, 4) + (2, 4). We have $s = (3 \cdot 2^2 + 2)/(2 \cdot 4) \equiv
0 \pmod 7$, $x_3 = 0^2 - 2 - 2 \equiv 3 \pmod 7$, and $y_3 \equiv 0 \cdot (2 - 3) - 4 \equiv
3 \pmod 7$, so 2(2, 4) = (2, 4) + (2, 4) = (3, 3).
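Examples 7.4 and 7.5 can be checked with a modular version of the
addition formulas. In this Python sketch the names ec_add_mod and INF
are assumptions, coordinates are assumed to be reduced to the range
0..p−1, and the division is done with the built-in pow(den, -1, p),
which needs Python 3.8 or later (the text uses the Extended Euclidean
Algorithm for the same purpose).

```python
INF = None  # assumed stand-in for the point at infinity

def ec_add_mod(P, Q, a, p):
    """Add points of y^2 = x^3 + ax + b modulo an odd prime p."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return INF                            # Q = -P
    if P == Q:
        num, den = 3 * x1 * x1 + a, 2 * y1    # tangent slope
    else:
        num, den = y2 - y1, x2 - x1           # chord slope
    s = num * pow(den, -1, p) % p             # divide by inverting den modulo p
    x3 = (s * s - x1 - x2) % p
    y3 = (s * (x1 - x3) - y1) % p
    return (x3, y3)

print(ec_add_mod((3, 3), (6, 1), 2, 7))   # Example 7.4
print(ec_add_mod((2, 4), (2, 4), 2, 7))   # Example 7.5
```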

When k is a positive integer and P is a point on an elliptic curve,
let kP mean P added to itself k times. It is easy to compute kP when
k is large by a modification of the Fast Exponentiation Algorithm,
which takes about $\log_2 k$ point additions on the elliptic curve. The
algorithm converts the multiplier k into binary, just like the exponent
e in the Fast Exponentiation Algorithm. The variable point R takes
the value $2^i P$ during the i-th iteration of the while loop. The point
$R = 2^i P$ is added to Q, initially the group identity, whenever the i-th
bit of k is 1. The result is that Q = kP when the algorithm finishes.

Algorithm 7.6. Fast Point Multiplication.
Input: A point P on E_{a,b} modulo m and an integer k ≥ 0.
    Q ← ∞
    R ← P
    while (k > 0) {
        if (k is odd) { Q ← (Q + R) mod m }
        R ← (R + R) mod m
        k ← ⌊k/2⌋
    }
Output: kP = the final value of Q.
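Here is a hedged Python sketch of Algorithm 7.6 for a prime modulus,
tried on the ten-point curve $y^2 \equiv x^3 + 2x + 4 \pmod 7$ from above.
The names ec_add_mod, ec_mul, and INF are assumptions of the sketch.
Because the group has ten elements, 10P should be ∞ for every point P.

```python
INF = None  # assumed stand-in for the point at infinity

def ec_add_mod(P, Q, a, p):
    """Point addition on y^2 = x^3 + ax + b modulo an odd prime p."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return INF
    if P == Q:
        num, den = 3 * x1 * x1 + a, 2 * y1
    else:
        num, den = y2 - y1, x2 - x1
    s = num * pow(den, -1, p) % p          # Python 3.8+ modular inverse
    x3 = (s * s - x1 - x2) % p
    return (x3, (s * (x1 - x3) - y1) % p)

def ec_mul(k, P, a, p):
    """Compute kP by scanning the bits of k from least to most significant."""
    Q, R = INF, P                          # R runs through P, 2P, 4P, ...
    while k > 0:
        if k & 1:                          # k is odd
            Q = ec_add_mod(Q, R, a, p)
        R = ec_add_mod(R, R, a, p)
        k >>= 1                            # k <- floor(k / 2)
    return Q

P = (3, 3)
for k in range(1, 11):
    print(k, ec_mul(k, P, 2, 7))
```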

One can count the number of points on an elliptic curve modulo
a prime in terms of the Legendre symbol.

Theorem 7.7. Let p be an odd prime. Let (r/p) denote the Legendre
symbol. The number $M_{p,a,b}$ of points on the elliptic curve
$y^2 \equiv x^3 + ax + b \pmod p$ satisfies
$$M_{p,a,b} = p + 1 + \sum_{x=0}^{p-1} \left(\frac{x^3 + ax + b}{p}\right).$$

Proof. Each x between 0 and p − 1 gives one value for $x^3 + ax + b$.
The number of y between 0 and p − 1 with $y^2 \equiv x^3 + ax + b \pmod p$
is 0, 1, or 2 according as $x^3 + ax + b$ is a quadratic nonresidue, is
≡ 0, or is a quadratic residue, all modulo p. This number is exactly
$1 + \left(\frac{x^3 + ax + b}{p}\right)$ by definition of the Legendre symbol. Including
∞, we have
$$M_{p,a,b} = 1 + \sum_{x=0}^{p-1} \left(1 + \left(\frac{x^3 + ax + b}{p}\right)\right) = p + 1 + \sum_{x=0}^{p-1} \left(\frac{x^3 + ax + b}{p}\right).$$ □
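The formula of Theorem 7.7 translates directly into code. In this
Python sketch the Legendre symbol is evaluated by Euler's criterion,
which is one standard choice and not necessarily the method the text
has in mind; the function names are assumptions.

```python
def legendre(r, p):
    """Legendre symbol (r/p) for an odd prime p, via Euler's criterion."""
    t = pow(r % p, (p - 1) // 2, p)    # t is 0, 1, or p - 1
    return -1 if t == p - 1 else t

def count_points(p, a, b):
    """M = p + 1 + sum over x of the Legendre symbol ((x^3 + ax + b)/p)."""
    return p + 1 + sum(legendre(x**3 + a * x + b, p) for x in range(p))

# The curve y^2 = x^3 + 2x + 4 (mod 7) tabulated earlier
print(count_points(7, 2, 4))
```

For the curve modulo 7 listed earlier, the symbols for x = 0, ..., 6 are
+1, 0, +1, +1, −1, −1, +1, so the sum is 2 and the count agrees with
the ten points found by listing them.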

According to Theorem 2.58, there are as many quadratic residues
as quadratic nonresidues in the interval 1 ≤ r ≤ p − 1. Thus the
Legendre symbol in Theorem 7.7 is +1 about as often as it is −1.
Hence, we expect the number of points on a random elliptic curve
modulo p to be close to p + 1. H. Hasse proved that this is true.

Theorem 7.8 (Hasse). Let the elliptic curve $E_{a,b}$ modulo a prime p
have $M_{p,a,b}$ points. Then
$$p + 1 - 2\sqrt{p} \le M_{p,a,b} \le p + 1 + 2\sqrt{p}.$$

See Washington [Was03] for a proof. The range of possible values
for $M_{p,a,b}$ is called the Hasse interval. Deuring [Deu41] proved that
for each prime p, every integer M in the Hasse interval
$p + 1 - 2\sqrt{p} \le M \le p + 1 + 2\sqrt{p}$ actually occurs as the size of an
elliptic curve $y^2 \equiv x^3 + ax + b \pmod p$ for some pair a, b.
In some cases, we can find the size of an elliptic curve immediately.
For example, if p ≡ 3 (mod 4) and b = 0, then the curve $E_{a,0}$ modulo
p has exactly p + 1 points for every a.
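These statements are easy to test numerically for a small prime. The
Python sketch below takes p = 23 (a choice made here for illustration)
and counts points on every curve with nonzero discriminant; the
function names are assumptions.

```python
def count_points(p, a, b):
    """Size of y^2 = x^3 + ax + b modulo p, including the point at infinity."""
    roots = [0] * p                      # roots[r] = number of y with y^2 = r (mod p)
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))

def in_hasse_interval(M, p):
    """|M - (p + 1)| <= 2*sqrt(p), tested exactly as (M - p - 1)^2 <= 4p."""
    return (M - p - 1) ** 2 <= 4 * p

p = 23
sizes = {count_points(p, a, b)
         for a in range(p) for b in range(p)
         if (4 * a**3 + 27 * b**2) % p != 0}
print(sorted(sizes))
print(all(in_hasse_interval(M, p) for M in sizes))

# p = 23 is congruent to 3 modulo 4, so curves with b = 0 should have p + 1 points
print({count_points(p, a, 0) for a in range(1, p)})
```

For p = 23 the Hasse interval contains the integers 15 through 33, and
by Deuring's result each of them should appear among the sizes.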

7.2. Factoring with Elliptic Curves


In 1985, H. W. Lenstra, Jr. [Len87] invented a factoring algorithm
using elliptic curves. It is called the Elliptic Curve Method or the
ECM. Let $R_p$ denote the multiplicative group of integers modulo
a prime p. It consists of the integers {1, 2, . . ., p − 1}; the group
operation is multiplication modulo p. Recall that Pollard’s p − 1
Factoring Method, Algorithm 5.25, performs a calculation ($a^L \bmod N$,
where L is the product of the prime powers below some bound B)
in the integers modulo N that hides a calculation ($a^L \bmod p$) in $R_p$.
The factor p of N is discovered when the size p − 1 of the group $R_p$
divides L. The p − 1 Method fails to find p when p − 1 has a prime
divisor larger than B. Lenstra replaced $R_p$ with an elliptic curve group
$E_{a,b}$ modulo p. By Hasse’s Theorem, the two groups have roughly the
same size, namely, approximately p. Lenstra’s algorithm discovers p
when $M_{p,a,b}$ divides L. It fails to find p when $M_{p,a,b}$ has a prime
factor larger than B. There is only one group $R_p$, but many elliptic
curve groups $E_{a,b}$ modulo p. If the size of $R_p$ has a prime factor > B,
we are stuck. But if the size of $E_{a,b}$ modulo p has a prime factor
> B, we just change a and b and try another elliptic curve. Each
curve gives an independent chance to find the prime factor p.
Compare the algorithm with Pollard’s p − 1 Method. The two
algorithms work exactly the same way, except that Pollard’s p − 1
Method raises a to the power $p_i^e$ while the Elliptic Curve Method
multiplies P by $p_i^e$. The former algorithm explicitly computes a greatest
common divisor with N while the latter algorithm hides this
operation in the slope calculation of the elliptic curve point addition.

Algorithm 7.9. Simple Elliptic Curve Factorization Method.
Input: A composite positive integer N to factor and a bound B.
    Find the primes p_1 = 2, p_2, . . ., p_k ≤ B
    Choose a random elliptic curve E_{a,b} modulo N
        and a random point P ≠ ∞ on it
    g ← gcd(4a^3 + 27b^2, N)
    if (g = N) { choose a new curve and point P }
    if (g > 1) { report the factor g of N and stop }
    for (i ← 1 to k) {
        e ← ⌊(log B)/log p_i⌋
        P ← (p_i^e)P or else find a factor g of N
    }
    Give up or try another random elliptic curve.
Output: A proper factor p of N, or else give up.

It is not easy to choose a random elliptic curve $E_{a,b}$ and then a
random point P on it. A better way to make these random choices
is to choose a random a modulo N and a random point $P = (x_1, y_1)$
modulo N and then let $b = (y_1^2 - x_1^3 - ax_1) \bmod N$. In other words,
b has the only possible value modulo N so that the point P is on
$E_{a,b}$. Lenstra [Len87] proved that when many curves and points are
chosen this way, the sizes $M_{p,a,b}$ of the curves modulo a prime p are
well distributed in the Hasse interval.
Whenever we add two points during the computation of $p_i^{e_i} P$, we
reduce the coordinates modulo N. Imagine that the coordinates are
also reduced modulo p, an unknown prime divisor of N. Here is why
the algorithm works. If the size $M_{p,a,b}$ of the elliptic curve modulo
p divides $L = \prod_{i=1}^{k} p_i^{e_i}$, then one can show using group theory that
LP = ∞. Since P ≠ ∞, at some point during the calculation we
must have $P_1 + P_2 = \infty$ for two points $P_1, P_2 \ne \infty$, working with
coordinates modulo p. This means that $P_1$ and $P_2$ will have the same
x-coordinate modulo p. But we are computing modulo N, and the
x-coordinates of the two points will probably not be the same modulo
N. When we try to compute the slope, we will try to invert a number
not relatively prime to N and we will factor N instead.
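The following Python sketch runs one curve of Algorithm 7.9, with the
curve chosen from a and P as just described. It is a minimal
illustration under stated assumptions, not a tuned implementation: the
names FactorFound, add, mul, primes_up_to, and ecm_one_curve are
invented here, the failed inversion is signalled with an exception,
and the last doubling of Algorithm 7.6 is skipped when it is not
needed. The inputs in the last line are those of Example 7.10 in the
text (N = 13290059, B = 50, a = 4, P = (3, 11)).

```python
from math import gcd

class FactorFound(Exception):
    """Signals that a slope denominator was not invertible modulo N."""
    def __init__(self, g):
        self.g = g

INF = None  # assumed stand-in for the point at infinity

def add(P, Q, a, N):
    """Point addition modulo N; raises FactorFound if the slope cannot be formed."""
    if P is INF:
        return Q
    if Q is INF:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % N == 0:
        return INF
    if P == Q:
        num, den = 3 * x1 * x1 + a, 2 * y1
    else:
        num, den = y2 - y1, x2 - x1
    g = gcd(den % N, N)
    if g > 1:
        raise FactorFound(g)               # the hidden gcd computation
    s = num * pow(den, -1, N) % N          # Python 3.8+ modular inverse
    x3 = (s * s - x1 - x2) % N
    return (x3, (s * (x1 - x3) - y1) % N)

def mul(k, P, a, N):
    """Binary point multiplication in the style of Algorithm 7.6."""
    Q, R = INF, P
    while k > 0:
        if k & 1:
            Q = add(Q, R, a, N)
        k >>= 1
        if k:                              # skip the unused final doubling
            R = add(R, R, a, N)
    return Q

def primes_up_to(B):
    """Primes <= B by trial division (adequate for small B)."""
    return [q for q in range(2, B + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]

def ecm_one_curve(N, B, a, P):
    """Try one curve; return a proper factor of N, or None on failure."""
    x1, y1 = P
    b = (y1 * y1 - x1**3 - a * x1) % N     # forces P onto the curve
    g = gcd(4 * a**3 + 27 * b * b, N)
    if g == N:
        return None
    if g > 1:
        return g
    try:
        for q in primes_up_to(B):
            qe = q
            while qe * q <= B:             # largest power of q not exceeding B
                qe *= q
            P = mul(qe, P, a, N)
    except FactorFound as f:
        return f.g if f.g < N else None
    return None

print(ecm_one_curve(13290059, 50, 4, (3, 11)))
```

A caller would normally loop over random choices of a and P, perhaps
increasing B slowly, until ecm_one_curve returns a factor.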

Example 7.10. Factor N = 13290059 by the Elliptic Curve Method.
We use the simple algorithm above with B = 50. We choose a
random value for a, say, a = 4, and a random point, say, P = (3, 11)
on E. Then $b = 11^2 - 3^3 - 4 \cdot 3 = 82$, so the elliptic curve is
$y^2 = x^3 + 4x + 82$. All point additions are done modulo N.
We write $Q_1$, $Q_2$, etc., for the successive values of the variable
point P during the for loop. (Thus, $Q_i = p_i^{e_i} Q_{i-1}$ for i ≥ 1, where
$Q_0 = P$.) The variables i, $e_i = \lfloor (\log B)/\log p_i \rfloor$, and $Q_i$ are shown
in the following table. The first line gives (1719049, 11957272) =
$2^5(3, 11)$, which was computed as 2(3, 11), 4(3, 11), 8(3, 11), 16(3, 11),
$Q_1$ = 32(3, 11), doubling the original point five times:

    i    p_i    e_i    p_i^{e_i}    Q_i
    1    2      5      32           (1719049, 11957272)
    2    3      3      27           (8856541, 13159308)
    3    5      2      25           (2623616, 4754609)
    4    7      2      49           (10781957, 8108628)
    5    11     1      11           (865708, 8187137)
    6    13     1      13           (2941774, 12043031)
So far we have computed $Q_6 = 2^5 \cdot 3^3 \cdot 5^2 \cdot 7^2 \cdot 11 \cdot 13 \cdot P$. Next we try
to compute $Q_7 = 17 \cdot Q_6$. The Fast Point Multiplication Algorithm
computes $2Q_6 = Q_6 + Q_6$, $4Q_6 = (2Q_6) + (2Q_6)$, etc., to $16Q_6$, as
shown in this table:

    i    2^i    2^i · Q_6
    1    2      (10831961, 3406968)
    2    4      (10610615, 9209777)
    3    8      (2574592, 3334955)
    4    16     (2732985, 10825756)

The next step is to compute
$$Q_7 = 17 \cdot Q_6 = (16 \cdot Q_6) + Q_6 = (2732985, 10825756) + (2941774, 12043031).$$
The slope is
$$\frac{12043031 - 10825756}{2941774 - 2732985} = \frac{1217275}{208789}.$$
But we cannot invert 208789 modulo N because gcd(208789, N) =
4261 ≠ 1. Thus, we cannot finish the calculation of $Q_7$. But we don’t
care; we have factored N.
Here is what happened. When we tried to add $16Q_6$ and $Q_6$, it
appeared that we should use the formula $s = (y_2 - y_1)/(x_2 - x_1)$ for the
slope since $16Q_6 \ne \pm Q_6$ modulo N. But if we reduce the coordinates
of these two points modulo 4261, we find they are $16Q_6 \bmod 4261 =
(1684, 2816)$ and $Q_6 \bmod 4261 = (1684, 1445) = (1684, -2816)$.
Modulo 4261, $Q_6$ is $-16Q_6$, so $Q_7 = 16Q_6 + Q_6 = \infty$. But modulo 3119,
the other prime factor of N, the sum is not ∞. We have factored N
by making the two prime factors of N behave differently.

See Proposition VI.3.1 of Koblitz [Kob87a] for more details of
why the algorithm works.
If we make some reasonable assumptions, we can determine the
complexity of the Elliptic Curve Method. To factor a large number
N, the basic algorithm for one curve, given above, is repeated until a
factor p is found. If the probability is 1/m that p is discovered by one
instance of the algorithm, then the expected number of curves that
must be tried is m. The factor p is discovered by the algorithm when
the size $M_{p,a,b}$ of the elliptic curve modulo p is B-smooth. The Hasse
interval $p + 1 - 2\sqrt{p} \le M \le p + 1 + 2\sqrt{p}$ is too short to prove that it
contains a B-smooth M. The main assumption we make is that the
probability that $M_{p,a,b}$ is B-smooth is $u^{-u}$, where u = (ln p)/ln B.
We saw in formula (3.1) that $u^{-u}$ is a fair approximation to this
probability (when the interval is long enough).
We also assume that the optimal value of B is used in the
algorithm. This optimal value depends on p, which is unknown, so we
must guess. But if we increase B slowly as we try more curves, the
effect is almost as if the correct value were used.

Theorem 7.11. Let N be a positive integer with an unknown (least)
prime factor p. Let B be the optimal bound for finding p by the Elliptic
Curve Method. Assume that a random elliptic curve $E_{a,b}$ modulo p
with $4a^3 + 27b^2 \not\equiv 0 \pmod p$ has a B-smooth size with probability
$u^{-u}$, where u = (ln p)/ln B. Define $L(x) = \exp\left(\sqrt{(\ln x) \ln\ln x}\right)$.
Then $B = L(p)^{\sqrt{2}/2}$. The expected total number of point additions
performed when the Elliptic Curve Method is used to discover p is
$L(p)^{\sqrt{2}}$. The expected total work needed to discover one prime factor
of N is at most L(N) point additions.

Proof. By the Prime Number Theorem, there are about B/ln B
primes ≤ B. For nearly all of these primes q, the largest power
$q^e \le B$ has e = 1. The Fast Point Multiplication used to compute
$(q^e)P$ takes about log B point additions. Hence the total number of
point additions per curve with bound B is about (B/ln B) ln B = B.
Since the probability of finding p with any single curve is $u^{-u}$, the
expected number of curves required is $1/u^{-u} = u^u$. Therefore, if we
use bound B in the algorithm, the total number of group operations
needed to find p is $f(B) = Bu^u$. We must find the B that minimizes
f(B).
Let a = (ln B)/ln L(p) so that $B = L(p)^a$. We will express
f(B) in terms of a. We have $\ln B = a \ln L(p) = a\sqrt{(\ln p)\ln\ln p}$, so
$$u = \frac{\ln p}{\ln B} = \frac{\ln p}{a\sqrt{(\ln p)\ln\ln p}} = \frac{1}{a}\sqrt{\frac{\ln p}{\ln\ln p}}$$
and $\ln u = \frac{1}{2}\ln\ln p - \frac{1}{2}\ln\ln\ln p - \ln a \approx \frac{1}{2}\ln\ln p$ since the other two
terms are small compared to $\frac{1}{2}\ln\ln p$. Hence,
$$u \ln u \approx \frac{1}{a}\sqrt{\frac{\ln p}{\ln\ln p}} \cdot \frac{1}{2}\ln\ln p = \frac{1}{2a}\ln L(p).$$
Therefore, $u^u = e^{u \ln u} \approx L(p)^{1/(2a)}$ and the function we seek to
minimize is
$$f(B) = Bu^u \approx L(p)^a L(p)^{1/(2a)} = L(p)^{a + 1/(2a)}.$$
Since L(p) is a positive constant (for $p > e^e$), the minimum of f(B)
occurs when a + 1/(2a) is minimal. It is an easy calculus exercise to
show that the minimum of a + 1/(2a) occurs when $a = \sqrt{2}/2$ and
the minimum value is $\sqrt{2}$. Therefore, the optimal B is $L(p)^{\sqrt{2}/2}$.
With this B, the expected total number of point additions is
$f(B) = L(p)^{\sqrt{2}}$.
Let p be the smallest prime factor of N. Then $p \le \sqrt{N}$, $\ln p \le
(1/2)\ln N$, and ln ln p < ln ln N, so the expected total number of point
additions is
$$L(p)^{\sqrt{2}} = \exp\left(\sqrt{2(\ln p)\ln\ln p}\right) < \exp\left(\sqrt{(\ln N)\ln\ln N}\right) = L(N),$$
and the proof is complete. □
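To get a feeling for the sizes involved, the quantities in Theorem 7.11
can be evaluated numerically. The Python sketch below is only a rough
illustration of the heuristic estimates; the function names and the
sample digit counts are assumptions. It also confirms on a grid that
a + 1/(2a) is smallest near $a = \sqrt{2}/2$.

```python
from math import exp, log, sqrt

def L(x):
    """L(x) = exp(sqrt((ln x) ln ln x)), as defined in Theorem 7.11."""
    return exp(sqrt(log(x) * log(log(x))))

def optimal_B(p):
    """Heuristic optimal bound B = L(p)^(sqrt(2)/2)."""
    return L(p) ** (sqrt(2) / 2)

def expected_additions(p):
    """Heuristic expected number of point additions, L(p)^sqrt(2)."""
    return L(p) ** sqrt(2)

# Grid search for the minimum of the exponent a + 1/(2a)
grid = [i / 1000 for i in range(1, 3001)]
best = min(grid, key=lambda a: a + 1 / (2 * a))
print(best, best + 1 / (2 * best))

for digits in (20, 30, 40, 50):
    p = 10.0 ** digits
    print(digits, f"{optimal_B(p):.3g}", f"{expected_additions(p):.3g}")
```

These numbers come from the $u^{-u}$ heuristic and should be read as
orders of magnitude; practical parameter tables are tuned empirically.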



The Elliptic Curve Method has a second stage, just like the
Pollard p − 1 Method. The second stage of the algorithm chooses a
bound $B_2 > B$. At the end of the first stage (Algorithm 7.9), the
variable P is a point Q equal to L times the original point P. Let
$q_1 < q_2 < \cdots < q_t$ be the primes between B and $B_2$. One computes
successively $(Lq_i)P$ for i = 1, 2, . . . , t, where P is the original point.
The first point $(Lq_1)P$ is computed directly as $(q_1)Q$. The differences
$q_{i+1} - q_i$ are all even numbers and are much smaller than the $q_i$
themselves. Precompute (Ld)P = dQ for d = 2, 4, . . ., up to a few hundred.
To find $(Lq_{i+1})P$ from $(Lq_i)P$, add the latter to (Ld)P = dQ, where
$d = q_{i+1} - q_i$. The amortized cost of computing each $(Lq_i)P$ for
1 ≤ i ≤ t is a single addition of two points on $E_{a,b}$.
The second stage finds a factor p of N when the largest prime
factor of $M_{p,a,b}$ is less than $B_2$ and all the other prime factors of
$M_{p,a,b}$ are less than B.
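The bookkeeping of the second stage can be sketched independently of
the curve arithmetic. In the Python sketch below, add and mul stand for
the group operations; to keep the example self-contained and checkable
they are instantiated with addition of integers modulo m, which is a
toy substitute chosen here, not the elliptic curve group. All names and
the values of m, Q, and the bounds are assumptions. In the ECM proper,
add would be point addition modulo N and a failed inversion would
reveal the factor.

```python
def primes_between(lo, hi):
    """Primes q with lo < q <= hi, by trial division (fine for a small demo)."""
    return [q for q in range(lo + 1, hi + 1)
            if q > 1 and all(q % d for d in range(2, int(q**0.5) + 1))]

def stage2_multiples(Q, qs, add, mul, max_gap=200):
    """Return [(q, qQ)] for the odd primes in qs using one add per prime after the first."""
    table = {d: mul(d, Q) for d in range(2, max_gap + 1, 2)}   # dQ for even d
    cur = mul(qs[0], Q)                    # first multiple computed directly
    out = [(qs[0], cur)]
    for prev, nxt in zip(qs, qs[1:]):
        cur = add(cur, table[nxt - prev])  # step by the even gap between primes
        out.append((nxt, cur))
    return out

# Toy group: integers modulo m under addition
m = 1000003
add = lambda x, y: (x + y) % m
mul = lambda k, x: (k * x) % m

Q = 123456
qs = primes_between(50, 500)               # plays the role of B = 50, B2 = 500
result = stage2_multiples(Q, qs, add, mul)
print(all(val == q * Q % m for q, val in result))
```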
Montgomery and Silverman each discovered more than 500 fac-
tors of Cunningham numbers by the ECM. Suyama found a good
form for an elliptic curve that has become standard for the ECM.
Brent [Bre86] and Montgomery [Mon87] made many improvements
to the second stage.
As I write this, the ECM has discovered one 79-digit prime factor,
one 77-digit prime factor, and two 75-digit prime factors of large
composite numbers, but no larger ones. Efficient modern versions
of the algorithm discover prime factors up to about 20 digits in a
few seconds and those up to about 40 digits in a few hours. Luck is
required to discover factors having more than 60 digits. See Silverman
and Wagstaff [SW93] for some practical aspects of the ECM. See
Zimmermann and Dodson [ZD06] for more about the ECM. A table
in [ZD06] recommends suitable parameters for the ECM, based on
the suspected size of the unknown prime factor. For example, when
seeking a prime factor of 50 digits, they suggest $B = 43 \cdot 10^6$ and
$B_2 = 4620B$. With these choices, an average of 7771 curves will be
needed to find the unknown prime. (These values assume a different
form for the elliptic curve and a faster way of performing the second
stage than we described above.)

7.3. Primality Proving with Elliptic Curves


Elliptic curve prime proving algorithms run in probabilistic
polynomial time. The best versions can prove n is prime in expected time
$O((\log n)^4)$. The first such algorithms are due to Goldwasser and
Kilian [GK86]. Atkin and Morain [AM93] improved the algorithm and
made it practical.
We first describe the original algorithm by Goldwasser and Kilian.
The following theorem is an elliptic curve analogue of Theorem 3.29.
(The variables m, s, and P in the next theorem correspond to m − 1,
F , and a in the Pocklington Theorem.)
Theorem 7.12. Let n be a positive integer relatively prime to 6.
Let s and m be positive integers with s | m. Let $E_{a,b}$ be an elliptic
curve² modulo n. Suppose there is a point P of $E_{a,b}$ such that we can
perform the curve operations to compute mP and find mP = ∞, and
for every prime p dividing s we can perform the curve operations to
compute (m/p)P and find (m/p)P ≠ ∞. Then s divides the size of
$E_{a,b}$ modulo any prime divisor of n. If also $s > (n^{1/4} + 1)^2$, then n
is prime.

Proof. Let q be a prime factor of n. The calculations on $E_{a,b}$ modulo
n, when reduced modulo q, show that s divides the smallest positive
integer e such that eP = ∞ on $E_{a,b}$ modulo q, just as in the proof of
Theorem 3.29.
If also $s > (n^{1/4} + 1)^2$, then the size of $E_{a,b}$ modulo q must also be
$> (n^{1/4} + 1)^2$. But by Hasse’s Theorem 7.8, the size of $E_{a,b}$ modulo q
is $\le (\sqrt{q} + 1)^2$. Therefore, $(\sqrt{q} + 1)^2 > (n^{1/4} + 1)^2$, so $q > \sqrt{n}$. Since
this is true for every prime factor q of n, n must be prime. □

Theorem 7.12 may be used to prove recursively that a large
number is prime, similar to Wunderlich’s procedure described after
Pocklington’s Theorem. The difference is that the elliptic curve version
works for many more numbers than the version using Theorem 3.29
because there are many choices for $E_{a,b}$.
To prove that a large number n is prime using Theorem 7.12
recursively, try to find an elliptic curve $E_{a,b}$ modulo n and a point P
on $E_{a,b}$ with pP = ∞, s = p probably prime, and $p > (n^{1/4} + 1)^2$.
If the conditions of Theorem 7.12 hold, then this computation shows
that if p is prime, then n is prime. How do we show that p is prime?
Apply the theorem recursively to produce a decreasing sequence of
numbers, each of which is prime if the next smaller one is prime. The
sequence ends when it reaches a number small enough to be proved
prime directly. When the last number is proved prime, the primality
of all the other numbers, including n, is demonstrated.

² We don’t actually know it is an elliptic curve until n is proved prime.
The proper choice for m in the theorem is the size of $E_{a,b}$ modulo
n. Schoof [Sch85] found a reasonably fast (but complicated)
algorithm for counting the points on $E_{a,b}$ modulo a prime n. Schoof’s
algorithm uses division polynomials and is too complicated to present
here. See [Sch85] or Algorithm 7.5.6 of [CP05] for details.
Atkin and Morain [AM93] improved this method by first choosing
the size p to be a probable prime in the Hasse interval
$n + 1 - 2\sqrt{n} \le p \le n + 1 + 2\sqrt{n}$ and then constructing an elliptic curve $E_{a,b}$
modulo n having size p and the properties needed for Theorem 7.12.
They avoided using Schoof’s method to count the points on $E_{a,b}$ and
created a fast probabilistic polynomial time prime proving algorithm.
Their programs routinely prove the primality of 1000-digit primes in
a short time.

7.4. Applications of Factoring to Elliptic Curves


Elliptic curves are used in many ways in cryptography. Here we men-
tion two ways factoring helps to construct elliptic curves that are
especially well suited for use in cryptography.
One simple application is in choosing an elliptic curve with simple
formulas for adding points. An elliptic curve (for cryptography) is a
group whose points are defined by an equation over a finite field. For
every prime power $q = p^k$ there is exactly one finite field $F_q$ with size
q. (When p is prime, $F_p$ is the same set with the same arithmetic
as the integers modulo p, but when k > 1, the arithmetic of $F_{p^k}$ is
different from that of the integers modulo $p^k$, although both sets have
size $p^k$.) For use in cryptography, the size of q is typically about 128
or 256 bits. Some of the most useful elliptic curves have size equal
to q ± 1. The points (other than ∞) of an elliptic curve over $F_q$ are
pairs of numbers in $F_q$. The formulas for adding two points (other
than ∞) involve arithmetic in $F_q$. When q is prime, this arithmetic is
in the integers modulo q and it is slow, especially for smart cards and
other devices with limited computing power. But if you choose p to
be a small prime (2 ≤ p ≤ 11, say), then arithmetic in $F_q = F_{p^k}$ can
be done in steps with numbers no larger than p, so it is fast on a slow
machine. Elliptic curves are often constructed over $F_{p^k}$ with k > 1.
Then the size of such a curve is $p^k + 1$, a number in the Cunningham
table. It is important to know the factorization of the size to ensure
that the elliptic curve discrete logarithm problem (ECDLP) is hard.
For example, one might want to know that $p^k + 1$ has at least one very
large prime factor. The ECDLP is to find x so that Q = xP, where P
and Q are two given points on an elliptic curve. Many cryptographic
algorithms rely on the ECDLP being intractable.
Let $E_{a,b}$ be an elliptic curve over a finite field $F_q$. A pairing on
$E_{a,b}$ is a function from pairs of points of $E_{a,b}$ to the n-th roots of
unity $\mu_n$ in $F_q$ (or in an algebraic closure of $F_q$) satisfying several
mathematical properties (bilinear, nondegenerate). The earliest use
of a pairing in cryptography was in an attack on the ECDLP in 1993.
The attack by Menezes, Okamoto, and Vanstone [MOV93] converts
the problem Q = xP in an elliptic curve over $F_{p^k}$ to an equivalent
problem $a = b^x$ in $F_{p^k}$ itself. The latter problem is easier to solve.
Since 2000, many constructive uses of pairings have been found in
cryptographic protocols, including efficient key agreement,
identity-based encryption, and short signatures. These protocols could be
broken if the ECDLP were easy.
The naive computation of a pairing via its definition is slow.
Faster ways to compute pairings exist for supersingular elliptic curves,
whose sizes are Cunningham numbers $p^k \pm 1$ and their Aurifeuillian
factors. Unfortunately, supersingular elliptic curves are the ones for
which the pairing-based attack on the ECDLP works best. Therefore,
the parameter choice for these curves is a delicate balance between
ease of computing the pairing and the need to keep the ECDLP hard.
This decision requires knowledge of the factorization of the elliptic
curve size, which is either a Cunningham number $p^n \pm 1$ or an
Aurifeuillian factor of such a number.

See Washington [Was03] for definitions of the pairings. Boneh
and Franklin [BF01] designed an identity-based encryption system
using pairings. It allows someone to use your email address as your
public key. See Trappe and Washington [TW02] for a description of
the Weil pairing and identity-based encryption at the undergraduate
level. See Estibals [Est10] for ways of using factors of $2^n - 1$ and
$3^n - 1$ to help construct elliptic curves in which pairings are easy to
compute, but the ECDLP is hard.

Exercises

7.1. Derive the formulas for adding points $(x_1, y_1) + (x_2, y_2)$ stated
after Theorem 7.1.
7.2. Consider the curve $y^2 = x^3 - 43x + 166$. Let P = (3, 8).
Compute 2P, 3P, 4P, and 5P. What is the smallest positive
integer k with kP = ∞? Be sure to check that the given point
and your answers all lie on the curve.
7.3. Consider the curve $y^2 \equiv x^3 + 4x \pmod{11}$. Let P = (2, 4).
Compute 2P, 3P, and 4P. What is the smallest positive integer
k with kP = ∞? Find the number of points on the elliptic
curve. Be sure to check that the given point and your answers
all lie on the curve.
7.4. Prove that if p is prime, p ≡ 3 (mod 4), and b = 0, then the
elliptic curve $E_{a,0}$ modulo p has exactly p + 1 points.
7.5. Let g be a quadratic nonresidue modulo the prime p. Let M
and N be the sizes of the two elliptic curves $y^2 \equiv x^3 + ax + b$
and $y^2 \equiv x^3 + g^2 ax + g^3 b$ modulo p. Prove that M + N = 2p + 2.
7.6. Factor N = 7151393 using the ECM by computing multiples
of the point P = (0, 3) on the elliptic curve $y^2 = x^3 + 4x + 9$
modulo N. Try B = 10.
Chapter 8

Sieve Algorithms

The invention of new [factorization] methods may
push off the limits of the unknown a little farther,
just as the invention of a new astronomical
instrument may push off a little the boundaries of the
physical universe; but the unknown regions are infinite,
and if we could come back a thousand years from
now, we should no doubt find workers in the theory
of numbers announcing in the journals new schemes
and new processes for the resolution of a given
number into its factors. D. N. Lehmer [Leh18]

Introduction
This chapter describes sieve algorithms used to factor integers. The
sieve is one of the oldest efficient algorithms in mathematics. The ba-
sic idea goes back to Eratosthenes in ancient Greece 2,500 years ago.
He used it to compute the first few prime numbers. Modern computer
programs use the same technique to form a table of the primes up to
a few million in a fraction of a second. A simple variation finds all in-
tegers in a short interval of large numbers free of small prime factors.
Another variation finds all primes or all numbers without small prime
factors in the range of a polynomial with arguments in some interval.
These are all called sieves and are described in the next section.


Different variations of the Sieve of Eratosthenes factor, either
completely or partially, all integers in some interval or those in the
range of a polynomial whose arguments lie in some interval. These
algorithms are also called sieves and form the heart of the fastest
known integer factoring algorithms, the Quadratic and Number Field
Sieves, described later in this chapter. Pomerance [Pom96] gives a
very readable account of the Quadratic and Number Field Sieves.

8.1. The Basic Sieve


The Sieve of Eratosthenes finds the primes less than some limit J.
Begin by writing the numbers 1, 2, . . ., J. Cross out the number 1,
which is not prime. After that, let p be the first number not crossed
out. Then p is prime, so leave it intact, but cross out every p-th
number (including those already crossed out) starting with 2p. That
is, cross out all multiples of p greater than p. Repeat this operation,
replacing p by the next number not yet crossed out, so long as $p \le \sqrt{J}$.
Then stop. All numbers crossed out are composite (or 1) and all
numbers not crossed out are prime. This algorithm works because
every composite number ≤ J has a prime factor $p \le \sqrt{J}$ and so it
would be crossed out as a multiple of p. Recall Theorem 2.14.

Example 8.1. Let J = 19. Then $\sqrt{J} \approx 4.4$. Write the numbers 1 to
19. Cross out 1 and all multiples of 2 starting with 4. We cross out
(/) 4, 6, 8, 10, 12, 14, 16, 18. After that, 3 is the next number not
crossed out, so cross out all multiples of 3 starting with 6. We cross
out (\) 6, 9, 12, 15, 18. The next number not crossed out is 5, which
is greater than $\sqrt{19}$, so we stop. The numbers not crossed out are 2,
3, 5, 7, 11, 13, 17, 19, exactly the primes between 1 and 19
(crossed-out numbers are shown in brackets):

[1] 2 3 [4] 5 [6] 7 [8] [9] [10] 11 [12] 13 [14] [15] [16] 17 [18] 19.

A computer program would use an array (of bits, say) to represent
the numbers 1 to J. Let the value “1” mean that the number is not
crossed out and let the value “0” mean that the number is crossed out.
The algorithm begins by marking 1 as “crossed out” and the other
numbers as “not crossed out.” The first inner while loop crosses out
all multiples of the prime p. The second inner while loop finds the
next prime p. The outer while loop runs through all primes $\le \sqrt{J}$.
Here is the algorithm.

Algorithm 8.2. Sieve of Eratosthenes.
Input: An integer J > 1.
    P[1] ← 0
    for (i ← 2 to J) { P[i] ← 1 }
    p ← 2
    while (p ≤ √J) {
        i ← p + p
        while (i ≤ J) { P[i] ← 0; i ← i + p }
        i ← p + 1
        while (i ≤ √J and P[i] = 0) { i ← i + 1 }
        p ← i
    }
Output: The array P[·] lists the primes ≤ J.

When the algorithm finishes, the value of P[i] is 1 if i is prime
and 0 if i is 1 or composite.
The Sieve of Eratosthenes finds all primes ≤ J in O(J log log J)
steps. While this is fast, it is possible to make it slightly faster. See
Pritchard [Pri81] for a sieve that runs in O(J/ log log J) steps.
The next variation finds all integers in an interval not divisible
by any prime in a finite set of primes. First write the numbers in the
interval. Then, for each prime p in the set, cross out every multiple
of p in the interval. The set of numbers not crossed out is the answer.

Algorithm 8.3. Modified Sieve of Eratosthenes.

Input: Integers J > I > 1 and a finite set P of primes.
    for (i ← I to J) { A[i] ← 1 }
    for each p ∈ P {
        i ← the smallest multiple of p that is ≥ I
        while (i ≤ J) { A[i] ← 0; i ← i + p }
    }
Output: The array A[·] lists the numbers between I and J
    free of factors in P.

When the algorithm finishes, A[i] = 0 if some prime p ∈ P divides
i and A[i] = 1 if no prime in P divides i.

Example 8.4. In the special case that P contains all primes ≤ √J
and the largest prime in P is < I, this sieve finds all primes between
I and J.
For example, if P = {2, 3, 5, 7, 11, 13, 17, 19}, I = 300, and J =
400, then the sieve will return with A[i] = 1 if 300 ≤ i ≤ 400 and i is
prime, and A[i] = 0 if 300 ≤ i ≤ 400 and i is composite. This works
because 300 > 19, 23 is the next prime after 19, and 400 < 23².
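
The special case of Example 8.4 can be tried with a short Python sketch of Algorithm 8.3. The function name and the use of a dictionary for the array A are our own choices.

```python
def modified_sieve(I, J, primes):
    """Return A with A[i] = 1 if no prime in `primes` divides i, else 0, for I <= i <= J."""
    A = {i: 1 for i in range(I, J + 1)}
    for p in primes:
        i = -(-I // p) * p          # smallest multiple of p that is >= I
        while i <= J:
            A[i] = 0
            i += p
    return A

A = modified_sieve(300, 400, [2, 3, 5, 7, 11, 13, 17, 19])
survivors = [i for i in range(300, 401) if A[i]]
print(survivors)
# [307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397]
```

The survivors are the sixteen primes between 300 and 400.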

The next variation factors the integers between I and J. Each
integer i in this interval is represented by a list L[i], initially empty,
which will contain the prime factors of i when the algorithm finishes.
The algorithm first finds the prime factors ≤ √J of each i with a
sieve. The first while loop places one p in L[i] whenever p | i. Then
the second while loop places one more p in L[i] whenever p² | i, one
more p in L[i] whenever p³ | i, etc., until j p’s have been appended
to L[i] if p^j ∥ i, that is, if p^j is the highest power of p dividing i.
Then a second for loop divides i by each prime factor
found by the sieve. If the remaining cofactor exceeds 1, then this
cofactor is one last prime factor of i, so it is added to the list.
Algorithm 8.5. Factoring with the Sieve of Eratosthenes.
Input: Integers J > I > 1.
    for (i ← I to J) { L[i] ← empty }
    for each prime p ≤ √J {
        i ← the smallest multiple of p with i ≥ I
        while (i ≤ J) { append p to L[i]; i ← i + p }
        a ← 2
        while (p^a ≤ J) {
            i ← the smallest multiple of p^a with i ≥ I
            while (i ≤ J) { append p to L[i]; i ← i + p^a }
            a ← a + 1
        }
    }
    for (i ← I to J) {
        j ← i
        for each prime p in L[i] { j ← j/p }
        if (j > 1) { append j to L[i] }
    }
Output: For I ≤ i ≤ J, L[i] gives the prime factors of i.
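
Here is a hedged Python sketch of Algorithm 8.5. The function name is ours, the primes ≤ √J come from a small auxiliary sieve, and the prime-power loop runs while p^a ≤ J so that every power of p dividing i is recorded before the cofactor is examined.

```python
from math import isqrt

def factor_interval(I, J):
    """Return a dict L with L[i] = prime factors of i (with multiplicity), I <= i <= J."""
    root = isqrt(J)
    flags = [True] * (root + 1)
    small_primes = []
    for p in range(2, root + 1):            # auxiliary sieve for the primes <= sqrt(J)
        if flags[p]:
            small_primes.append(p)
            for m in range(p * p, root + 1, p):
                flags[m] = False

    L = {i: [] for i in range(I, J + 1)}
    for p in small_primes:
        pa = p                              # pa runs through p, p^2, p^3, ...
        while pa <= J:
            first = -(-I // pa) * pa        # smallest multiple of pa that is >= I
            for i in range(first, J + 1, pa):
                L[i].append(p)
            pa *= p
    for i in range(I, J + 1):
        j = i
        for p in L[i]:
            j //= p
        if j > 1:                           # the cofactor is one last prime factor
            L[i].append(j)
    return L

L = factor_interval(300, 400)
print(L[320], L[391], L[397])   # [2, 2, 2, 2, 2, 2, 5] [17, 23] [397]
```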

The last variation factors the numbers in the range of a polynomial
f(x) with integer coefficients, but it just finds the prime factors
of each f(x) that lie in a finite set P of primes. The polynomial f(x)
is assumed fixed and is not part of the input. This sieve algorithm is
the heart of the Quadratic and Number Field Sieve Factoring Algorithms.
This algorithm works just like the preceding one, except that
L[i] holds the prime factors of f(i) instead of those of i. The number
of roots of f(x) ≡ 0 (mod p^a) is no more than the degree of f.
Algorithm 8.6. Sieve to Factor the Range of a Polynomial.
Input: Integers J > I > 1 and a finite set P of primes.
    for (i ← I to J) { L[i] ← empty }
    for each p ∈ P {
        Find the roots r_1, ..., r_d of f(x) ≡ 0 (mod p)
        for (j ← 1 to d) {
            i ← the least integer ≥ I and ≡ r_j (mod p)
            while (i ≤ J) { append p to L[i]; i ← i + p }
            a ← 2
            while (p^a ≤ √J) {
                Lift r_j to a root r of f(x) ≡ 0 (mod p^a)
                i ← the least integer ≥ I and ≡ r (mod p^a)
                while (i ≤ J) { append p to L[i]; i ← i + p^a }
                a ← a + 1
            }
        }
    }
Output: For I ≤ i ≤ J, L[i] gives the factors in P of f(i).

The root r_j is “lifted” to a root r via Hensel’s Lemma, as was
done for a quadratic polynomial in Theorem 3.8 and Example 3.9. As
this may be complicated, sometimes the second while loop is omitted.
When this is done, the output L[i] lists only the distinct prime factors
of f (i) in P. An alternative to lifting roots is to form the number f (i)
and divide it by each p in the first while loop repeatedly to determine
how many factors of p f (i) has and append that many p’s to L[i]. The
advantage of the algorithm as written is that one (partially) factors
f (i) without ever forming this number, which may be huge.
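
The following Python sketch illustrates the alternative just described: it sieves with the roots modulo p only, then forms f(i) and divides by p repeatedly instead of lifting roots. The function name, the brute-force root search, and the sample polynomial f(x) = x² + 1 are our own assumptions, chosen only to keep the example small.

```python
def sieve_polynomial(f, I, J, primes):
    """Return L with L[i] = prime factors of f(i) that lie in `primes`, I <= i <= J."""
    L = {i: [] for i in range(I, J + 1)}
    for p in primes:
        roots = [r for r in range(p) if f(r) % p == 0]   # brute-force root finding
        for r in roots:
            first = I + (r - I) % p      # least integer >= I that is ≡ r (mod p)
            for i in range(first, J + 1, p):
                v = f(i)
                while v != 0 and v % p == 0:   # count how many factors p f(i) has
                    L[i].append(p)
                    v //= p
    return L

def f(x):
    return x * x + 1

L = sieve_polynomial(f, 1, 20, [2, 5, 13])
print(L[7], L[8], L[18])   # [2, 5, 5] [5, 13] [5, 5, 13]
```

Indeed f(7) = 50 = 2 · 5², f(8) = 65 = 5 · 13, and f(18) = 325 = 5² · 13. Only those i lying in one of the arithmetic progressions are ever touched, which is the point of sieving.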

8.2. The Quadratic Sieve


The Quadratic Sieve Factoring Algorithm is similar to the Continued
Fraction Factoring Algorithm. The difference lies in the method of
producing relations x² ≡ q (mod N) with q factored completely. The
CFRAC forms x and q from the continued fraction expansion of √N
and factors q by Trial Division, which is slow. The quadratic residues
q in the CFRAC are likely to be smooth because they are < 2√N.
The Quadratic Sieve Factoring Algorithm (QS) was invented by
Pomerance [Pom82], but many of its ideas go back to Kraitchik
[Kra29]. It produces x and q using a quadratic polynomial q =
f(x) and factors the q with a sieve, a much faster process than Trial
Division. The quadratic polynomial f(x) is chosen so that the q will
be as small as possible. This means that most of them will exceed
2√N, but not by a large factor, so that they are nearly as likely to
be smooth as the q in the CFRAC.

Let f(x) = x² − N and s = ⌈√N⌉. Consider the numbers

f (s), f (s + 1), f (s + 2), . . . .

The Quadratic Sieve factors some of these numbers by Algorithm 8.6.
If there are K primes p ≤ B and we can find R > K B-smooth numbers
f(x), then we will have R relations involving K primes and linear
algebra will give us at least R − K congruences x² ≡ y² (mod N),
each of which has probability at least 1/2 of factoring N, by Theorem
6.18.
We sieve using Algorithm 8.6 to find the B-smooth numbers
among f(s), f(s + 1), f(s + 2), .... The factor base P consists of the
primes p < B (for which the Legendre symbol (N/p) ≠ −1). Write
down the numbers f(s + i) for i in an interval a ≤ i < b of convenient
length, say a few million. The first interval will have a = s.
Subsequent intervals will begin with a equal to the endpoint b of the
previous interval. For each prime p < B, remove all factors of p from
those f(s + i) which p divides. Since f(x) = x² − N, p divides f(x)
precisely when x² ≡ N (mod p). The solutions x to this congruence
lie in the union of two arithmetic progressions with common difference
p. If the roots of x² ≡ N (mod p) are x_1 and x_2, then the arithmetic
progressions begin with the first numbers ≡ x_1 and x_2 (mod p) which
are ≥ a. The prime factor p is removed from each f(s + i) which it
divides.
The number of sieve operations for a prime p is about
(2/p)(b − a) because exactly two of every p numbers are divided by p.
The complexity of the sieve is ∑_{p<B, p prime} (2/p)(b − a). It can be shown
that this sum is O((b − a) ln ln B). The amortized cost of sieving
one i value is thus ln ln B. Trial Division would have taken about
O((b − a)B/ln B) steps to find the B-smooth numbers between a and
b, or B/ln B steps per i value. The sieve saves much time.
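
The sieving step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are ours, the factor base and the roots of x² ≡ N (mod p) are found by brute force, exact division is used, and the sample N = 13290059 with B = 100 and 1000 sieve values is borrowed from Example 8.7.

```python
from math import isqrt

def small_primes(limit):
    """Primes below `limit`, by trial division (adequate for a tiny factor base)."""
    return [p for p in range(2, limit)
            if all(p % q for q in range(2, isqrt(p) + 1))]

def factor_base(N, B):
    """Primes p < B for which x*x ≡ N (mod p) has a solution."""
    return [p for p in small_primes(B)
            if any((r * r - N) % p == 0 for r in range(p))]

def qs_sieve(N, base, lo, hi):
    """Remove all factor-base primes from f(x) = x*x - N for lo <= x < hi.

    Assumes lo > sqrt(N), so every f(x) is positive. Returns a dict mapping
    x to the cofactor of f(x) that remains after the sieve.
    """
    rest = {x: x * x - N for x in range(lo, hi)}
    for p in base:
        roots = [r for r in range(p) if (r * r - N) % p == 0]   # brute force
        for r in roots:
            first = lo + (r - lo) % p        # first x >= lo with x ≡ r (mod p)
            for x in range(first, hi, p):    # one arithmetic progression
                while rest[x] % p == 0:
                    rest[x] //= p
    return rest

N = 13290059
base = factor_base(N, 100)
lo = isqrt(N) + 1                            # 3646, the first x above sqrt(N)
rest = qs_sieve(N, base, lo, lo + 1000)
full = [x for x, c in rest.items() if c == 1]               # B-smooth values
partial = {x: c for x, c in rest.items() if 100 < c < 200}  # one large prime left
print(full)
```

A value f(x) is B-smooth exactly when its cofactor has been reduced to 1. The dictionary partial records the x whose cofactor lies between 100 and 200, as in the large prime variation.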
If one replaces f (s + i) and p by their logarithms, one can replace
the slow division of large numbers with subtraction of small numbers.
The logarithms may be approximate, scaled, and stored one per byte.
In the final step of the algorithm, x in x² ≡ y² (mod N) is formed
as the product modulo N of the x_i on the left sides of the relations
x_i² ≡ q_i (mod N) which participate in the dependency. The number
y² is the product of the q_i in the same relations.
The size K of the factor base is about (1/2)π(B) ≈ (1/2)B/ln B and
should be optimized to minimize the total work. Pomerance showed
that the time complexity of the Quadratic Sieve Algorithm is

    L(N) = exp(√((ln N) ln ln N)).

The best value for the smoothness bound B is about √L(N). The total
number of values of f(s + i) sieved is about

    M = L(N)/ln ln B = exp(√(ln N / ln ln N)).

The proof of these results is similar to that of Theorem 7.11.
Several variations speed the practical Quadratic Sieve Algorithm,
although they do not change its theoretical complexity. They include:
• using large primes (larger than B), as in the CFRAC,
• multiple polynomials (not just f(x) = x² − N) [Sil87],
• self-initializing polynomials (amortize the root-finding)
[AP93].

Example 8.7. Factor N = 13290059 by the Quadratic Sieve.


We will use only the polynomial f(x) = x² − N and we will
use the large prime variation. The factor base is the set of primes
p < B = 100 for which the Legendre symbol (N/p) = +1, that is, the
factor base is
    P = {2, 5, 13, 31, 41, 43, 53, 67, 83, 89, 97}.
For the large prime variation, we allow primes p in the range 100 <
p < 200. Using Algorithm 8.6, sieve f(s) for the first 1000 values
of s > √N, that is, the sieve interval is 3646 ≤ s ≤ 4645. We save
relations s² ≡ f(s) (mod N) produced by the sieve whenever f(s)
is factored completely using only primes in the factor base (called
a full relation) or if the cofactor of f(s) remaining after all primes
in the factor base have been divided out is a number between 100
and 200 (called a partial relation). Since this number between 100
and 200 has no prime factor less than 100 (which is greater than
√200), it must be prime, a large prime. Table 1 shows the relations
harvested from the sieving. The four full relations are shown first,
followed by the fourteen partial relations in order of their large primes.
We omit partial relations whose large prime does not appear in any
other partial relation, as these are not useful since their large prime
cannot be matched to form a square. The repeated large primes
(appearing twice each) are 109, 113, 127, 131, 137, 149, and 157.
Now the relations with repeated large prime factors are combined.
For example, the relations numbered 7 and 8 in Table 1 have large
prime 113 and represent the congruences
    4403² ≡ 2 · 5² · 13 · 83 · 113 (mod N),
    4637² ≡ 2 · 5 · 13² · 43 · 113 (mod N).
Multiply these two congruences to get
    4403² · 4637² ≡ 2² · 5³ · 13³ · 43 · 83 · 113² (mod N).
When we combine all pairs of relations with repeated large primes,
we get the congruences in Table 2. At this point we have the full
relations in the first four lines of Table 1 and the seven combined
relations in Table 2. The prime 67 occurs only in one of these eleven
relations, namely relation 19 in Table 2. Therefore, that 67 cannot
be combined with any other 67, so relation 19 is useless. Each of the
other nine primes in the factor base appears in at least two of the
remaining ten relations. We wish to find a subset of the ten relations
whose product has a right side in which every prime is raised to an
even power.

Table 1. Quadratic sieve relations for factoring 13290059.

     i     s      f(s)   f(s) factored
     1  3648     17845   5 · 43 · 83
     2  3652     47045   5 · 97 · 97
     3  3662    120185   5 · 13 · 43 · 43
     4  4178   4165625   5 · 5 · 5 · 5 · 5 · 31 · 43
     5  4037   3007310   2 · 5 · 31 · 89 · 109
     6  4247   4746950   2 · 5 · 5 · 13 · 67 · 109
     7  4403   6096350   2 · 5 · 5 · 13 · 83 · 113
     8  4637   8211710   2 · 5 · 13 · 13 · 43 · 113
     9  3822   1317625   5 · 5 · 5 · 83 · 127
    10  4203   4375150   2 · 5 · 5 · 13 · 53 · 127
    11  3758    832505   5 · 31 · 41 · 131
    12  4151   3940742   2 · 13 · 13 · 89 · 131
    13  3963   2415310   2 · 5 · 41 · 43 · 137
    14  4237   4662110   2 · 5 · 41 · 83 · 137
    15  3727    600470   2 · 5 · 13 · 31 · 149
    16  4468   6672965   5 · 13 · 13 · 53 · 149
    17  3922   2092025   5 · 5 · 13 · 41 · 157
    18  4393   6008390   2 · 5 · 43 · 89 · 157

To do this, we form the matrix in Table 3 with one row
per relation and one column per prime in the factor base. The matrix
entry is 1 if the prime appears to an odd power on the right side of
the relation and 0 otherwise. The numbers i on the left side give the
number of the relation. Using linear algebra (we might find a basis
for the null space of the matrix over F_2) and remembering that
1 + 1 = 0 in the field F_2 with two elements, one finds that the rows
for relations 1, 2, 3, and 20 sum to the 0 vector, as do the rows for
relations 3, 4, 22, and 25. Consider these two linear dependencies in
turn.

Table 2. Combined relations for factoring 13290059.

    combining    i    s_1    s_2   right side factored
     5, 6       19   4037   4247   2² · 5³ · 13 · 31 · 67 · 89 · 109²
     7, 8       20   4403   4637   2² · 5³ · 13³ · 43 · 83 · 113²
     9, 10      21   3822   4203   2 · 5⁵ · 13 · 53 · 83 · 127²
    11, 12      22   3758   4151   2 · 5 · 13² · 31 · 41 · 89 · 131²
    13, 14      23   3963   4237   2² · 5² · 41² · 43 · 83 · 137²
    15, 16      24   3727   4468   2 · 5² · 13³ · 31 · 53 · 149²
    17, 18      25   3922   4393   2 · 5³ · 13 · 41 · 43 · 89 · 157²

Table 3. Matrix for factoring 13290059.

     i    2   5  13  31  41  43  53  83  89
     1    0   1   0   0   0   1   0   1   0
     2    0   1   0   0   0   0   0   0   0
     3    0   1   1   0   0   0   0   0   0
     4    0   1   0   1   0   1   0   0   0
    20    0   1   1   0   0   1   0   1   0
    21    1   1   1   0   0   0   1   1   0
    22    1   1   0   1   1   0   0   0   1
    23    0   0   0   0   0   1   0   1   0
    24    1   0   1   1   0   0   1   0   0
    25    1   1   1   0   1   1   0   0   1

In the first one, x is the product of the values of s, that is,

    x = 3648 · 3652 · 3662 · 4403 · 4637 ≡ 3262075 (mod N),

and y is the square root of the product of the primes on the right
sides, that is,

    y = 2 · 5³ · 13² · 43² · 83 · 97 · 113 ≡ 10027984 (mod N).

Unfortunately, x + y = N ≡ 0 (mod N), so gcd(x + y, N) = N and
gcd(x − y, N) = 1.
In the second dependency,

    x = 3662 · 4178 · 3758 · 4151 · 3922 · 4393 ≡ 12231590 (mod N)

and

    y = 2 · 5⁵ · 13² · 31 · 41 · 43² · 89 · 131 · 157 ≡ 8515998 (mod N).

This leads to gcd(x + y, N) = 3119 and gcd(x − y, N) = 4261, which
splits N.

The Quadratic Sieve was used to factor the 129-digit RSA chal-
lenge number mentioned in Sections 1.1 and 4.9. Atkins, Graff,
Lenstra, and Leyland [AGLL95] used a variation of the Quadratic
Sieve with two large primes allowed in each relation. Their factor
base contained 524339 primes. They generated more than 8.4 million
relations. Allowing “two large primes” means that each relation may
be full or have one or two primes larger than the greatest one in the
factor base. When the relations are harvested at the end of the sieve,
after the primes in the factor base have been removed from f (s), if
the remaining cofactor is composite, then some effort is made to fac-
tor this number, usually with the ECM or the SQUFOF. If it can
be factored easily, then the relation is saved. Using two large primes
complicates the linear algebra slightly.
A faster version of the Quadratic Sieve uses multiple polynomials.
See Silverman [Sil87] and Alford and Pomerance [AP93].
In the 1990s, George Sassoon organized all the personal comput-
ers on the Isle of Mull, Scotland, into a factoring group. They used
the multiple polynomial version of the Quadratic Sieve, with each
machine sieving a different set of polynomials. As these computers
were not interconnected, the relations were collected on floppy disks
and carried to one machine for the linear algebra. They factored more
than a dozen Cunningham numbers this way, including a composite
101-digit factor of 2,1286M = 2^643 + 2^322 + 1.
We have said little about using linear algebra to match the prime
factors so that each occurs an even number of times. There is a
preliminary phase called filtering in which relations are processed in
disk files. Then a large dense matrix like that in Table 3 is loaded
into memory and some vectors in its null space are computed. These
vectors give the linear dependencies that may factor N. When factoring
a large integer by either the Quadratic or Number Field Sieve,
hundreds of millions of relations are stored. Many of them contain
a large prime that appears in no other relation. The filtering phase
discards these relations and combines pairs of relations with the same
large prime, as was done in Table 2. Gaussian elimination suffices to
find the null space for bit matrices up to a few tens of thousands of
rows. Beyond that size, other techniques, like block Lanczos or block
Wiedemann, are used. See, for example, Montgomery [Mon95] or
Coppersmith [Cop94].

8.3. The Double Sieve


This section treats a useless factoring algorithm that has never been
programmed. It introduces in a gentle way one basic idea of the
Number Field Sieve. To factor N, the Double Sieve forms many
relations a ≡ b (mod N) in which a ≠ b and both a and b have been
completely factored. It uses linear algebra as in the CFRAC or the
Quadratic Sieve to match the prime factors on the two sides of the
congruences to produce two congruent squares x² ≡ y² (mod N). The
result of the linear algebra modulo 2 is a set of relations in which each
prime occurs an even number of times on each side of the relations in
the set. Here x² is the product of the a in the set of relations and y²
is the product of the b in the set of the relations.
Here is a longer description of the Double Sieve to factor N .

(1) Choose a bound B and a limit L.
(2) Choose a factor base P of all primes ≤ B. Let k = π(B) be
    the size of P.
(3) Use Algorithm 8.6 with f(x) = x to find all B-smooth a in
    1 ≤ a ≤ L.
(4) Use Algorithm 8.6 with f(x) = x to find all B-smooth b in
    N + 1 ≤ b ≤ N + L.
(5) For each a in 1 ≤ a ≤ L, if a is B-smooth and b = a + N is
    B-smooth, save a to represent the relation a ≡ a + N (mod N).
(6) Once ℓ > k B-smooth a’s have been saved, create a list of
    new relations a² ≡ a(a + N) (mod N).
(7) Form an ℓ × k matrix with one row for each new relation
    and one column for each prime factor of a(a + N).
(8) Find vectors in the null space of the matrix modulo 2.
(9) Each vector in the null space gives a congruence x² ≡ y²
    (mod N) and a chance to factor N by Theorem 6.18.

Example 8.8. Factor N = 13290059 by the Double Sieve.


Choose B = 100 so that k = 25. How large should L be? According
to equation (3.1), the probability that an integer near N is
B-smooth is roughly ρ(u) ≈ u^(−u), where u = (ln N)/ln B. Thus,
u = (ln 13290059)/ln 100 = 3.56176 and u^(−u) = 0.01084. We want
the number of smooth relations to be Lρ(u) > k, so L > k·u^u =
25/0.01084 ≈ 2306. This crude estimate ignores the fact that we
need a and b = N + a to be simultaneously 100-smooth and also
the fact that the a’s are smaller than N, so they are more likely to
be 100-smooth than the b’s. Let us guess the probability that a is
100-smooth. If we try a near 100², then u = (ln 100²)/ln 100 = 2,
so the probability that a is 100-smooth is roughly ρ(u) ≈ 0.31. If
we assume further that the probabilities of a and b = a + N being
100-smooth are independent, then the probability that a and b are
simultaneously 100-smooth is (0.01084)(0.31) = 0.003360. Thus we
will need L > 25/0.003360 = 7439.6. This estimate turns out to be
poor, perhaps because the probabilities are not independent, and we
actually need L = 6300. With this L, ℓ = 31 relations are produced.
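
The arithmetic behind these two estimates can be reproduced with a few lines of Python. The variable names are ours, and the value 0.31 for ρ(2) is simply taken from the text.

```python
from math import log

N, B, k = 13290059, 100, 25

u = log(N) / log(B)              # u = (ln N)/ln B
rho_b = u ** (-u)                # rough chance that an integer near N is B-smooth
L_crude = k / rho_b              # first estimate, ignoring the smoothness of a

rho_a = 0.31                     # rough chance that a near 100^2 is 100-smooth (from the text)
L_joint = k / (rho_a * rho_b)    # second estimate, assuming independence

print(round(u, 5), round(rho_b, 5), round(L_crude), round(L_joint))
```

The second estimate differs slightly from the 7439.6 in the text because the text rounds the intermediate probabilities before dividing.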
The following table shows the first three and last three relations:

    a                        b = N + a
    8 = 2³                   13290067 = 7 · 23² · 37 · 97
    61 = 61                  13290120 = 2³ · 3² · 5 · 19 · 29 · 67
    133 = 7 · 19             13290192 = 2⁴ · 3² · 17 · 61 · 89
    ...                      ...
    6100 = 2² · 5² · 61      13296159 = 3² · 17 · 43² · 47
    6241 = 79²               13296300 = 2² · 3 · 5² · 23 · 41 · 47
    6253 = 13² · 37          13296312 = 2³ · 3⁴ · 17² · 71

Each row of the table represents a relation a ≡ b = a + N (mod N)
in which both a and b are B-smooth. We form new relations by
multiplying both sides of a ≡ b (mod N) by a to get a² ≡ ab =
a(a + N) (mod N). Here are the first three and last three new relations:

    a²       ab = a(N + a)
    8²       2³ · 7 · 23² · 37 · 97
    61²      2³ · 3² · 5 · 19 · 29 · 61 · 67
    133²     2⁴ · 3² · 7 · 17 · 19 · 61 · 89
    ...      ...
    6100²    2² · 3² · 5² · 17 · 43² · 47 · 61
    6241²    2² · 3 · 5² · 23 · 41 · 47 · 79²
    6253²    2³ · 3⁴ · 13² · 17² · 37 · 71

We form an ℓ × k matrix with one row for each relation and one
column for each prime as a factor of ab. The (i, j) entry of the matrix
is 1 if the j-th prime divides a(a + N) of the i-th relation to an odd
power, and it is 0 otherwise. Since the matrix has more rows (ℓ = 31)
than columns (k = 25), some linear combination of the rows will be
the zero vector. The coefficients of the linear combinations will be
0 or 1. When the relations that appear in this linear combination
are multiplied, the product of the a² will be square, say x², and the
product of the a(a + N) will be square, say y², because the prime
factors have been matched so that each prime factor occurs an even
number of times. Then x² ≡ y² (mod N) and we have a chance to
factor N by Theorem 6.18. Since ℓ = 31 and k = 25, we will have at
least six chances to factor N. We leave the rest to the reader.

The point of the Double Sieve is that it is not necessary to have
a square on one side of a relation, as was done in the CFRAC and
the Quadratic Sieve. If there are smooth numbers on both sides of
the relations, linear algebra can match the primes on both sides to
form two congruent squares. Linear algebra is easy; finding smooth
numbers is hard.

8.4. Schroeppel’s Linear Sieve


A major problem with the Double Sieve is that it seeks smooth numbers
near N, where they are comparatively rare. In 1975, Schroeppel
(see [Pom82]) invented a new sieve that does not have this problem.
He used a linear function in two variables that takes small values and
can be sieved to find its smooth values. Let K = ⌊√N⌋. The linear
function is S(a, b) = (K + a)(K + b) − N. When the variables a and
b are integers with small absolute values, S(a, b) is not much larger
than K and so is more likely to be smooth than an integer near N.
The linear algebra matches the prime factors of the S(a, b) values
and also ensures that the values of a and b appear an even number of
times in making both products

    ∏_i S(a_i, b_i),    ∏_i (K + a_i)(K + b_i)

have square values, which of course are congruent modulo N. Then
one applies Theorem 6.18.
Here is a description of Schroeppel’s Linear Sieve.
(1) Choose a bound B and a limit L.
(2) Choose a factor base P of all primes ≤ B. Let k = π(B) be
    the size of P.
(3) For each 1 ≤ a ≤ L, use Algorithm 8.6 with f(x) = S(a, x)
    to find all B-smooth S(a, b) in a ≤ b ≤ L. Save the pairs
    (a_i, b_i) with B-smooth S(a_i, b_i) until you have ℓ such
    pairs, where ℓ is a few more than k + L.
(4) Form an ℓ × (k + L) matrix with one row for each new
    relation, one column for each prime factor of S(a_i, b_i), and
    one column for each a and b in 1 ≤ a, b ≤ L.
(5) Find vectors in the null space of the matrix modulo 2.
(6) Each vector in the null space gives a congruence x² ≡ y²
    (mod N) and a chance to factor N by Theorem 6.18.

The two squares formed by linear algebra are x² = ∏_i S(a_i, b_i) and
y² = ∏_i (K + a_i)(K + b_i).

Example 8.9. Factor N = 13290059 by Schroeppel’s Linear Sieve.

We have K = ⌊√N⌋ = 3645. Let us choose B = 100 so that
k = 25. Let L = 60. The sieving produces ℓ = 98 relations, more
than k + L = 85. Here are the first three and last three relations:

     a     b    (K + a)(K + b) − N
     1     2    6903 = 3² · 13 · 59
     1    38    138159 = 3³ · 7 · 17 · 43
     2    22    83490 = 2 · 3 · 5 · 11² · 23
    ...   ...   ...
    48    58    385120 = 2⁵ · 5 · 29 · 83
    54    59    411037 = 11² · 43 · 79
    54    60    414736 = 2⁴ · 7² · 23²

We form an ℓ × (k + L) matrix with one row for each relation, one
column for each prime as a factor of S(a, b), and one column for each
a or b. For 1 ≤ j ≤ k, the (i, j) entry of the matrix is 1 if the j-th
prime divides S(a, b) of the i-th relation to an odd power, and it is
0 otherwise. For k < j ≤ k + L, the (i, j) entry of the matrix is 1
if exactly one of a and b equals j − k, and it is 0 otherwise. Since the
matrix has more rows (ℓ = 98) than columns (k + L = 85), some linear
combinations (modulo 2) of the rows will be zero vectors. The
coefficients of the linear combinations will be 0 or 1. When the
relations that appear in such a linear combination are multiplied, the
product of the S(a, b) will be square, say x², because the prime factors
have been matched so that each prime factor occurs an even number of
times; and the product of the (K + a)(K + b) will be square, say y²,
because the factors K + a and K + b have been matched so that each
(K + j) occurs an even number of times. Then x² ≡ y² (mod N) and we
have a chance to factor N by Theorem 6.18. Since ℓ = 98 and
k + L = 85, we will have at least 13 chances to factor N. We leave
the rest to the reader.
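
The relation-gathering step of Schroeppel’s Linear Sieve can be sketched in Python as follows. For brevity the sketch tests each S(a, b) for smoothness by trial division, where the text would sieve with Algorithm 8.6. The function names are ours.

```python
from math import isqrt

def is_smooth(n, primes):
    """True if n > 0 and every prime factor of n lies in `primes`."""
    if n <= 0:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

def linear_sieve_relations(N, B, L):
    """Return the triples (a, b, S(a, b)) with 1 <= a <= b <= L and S(a, b) B-smooth."""
    K = isqrt(N)
    primes = [p for p in range(2, B + 1)
              if all(p % q for q in range(2, isqrt(p) + 1))]
    rels = []
    for a in range(1, L + 1):
        for b in range(a, L + 1):
            S = (K + a) * (K + b) - N
            if is_smooth(S, primes):
                rels.append((a, b, S))
    return rels

rels = linear_sieve_relations(13290059, 100, 60)
print(rels[0])   # (1, 2, 6903)
```

With the parameters of Example 8.9, the first triple found is (1, 2, 6903), matching the first row of the table.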

Like the Double Sieve, Schroeppel’s Linear Sieve shows that it is
not necessary to have a square on one side of a relation, as was done
in the CFRAC and the Quadratic Sieve. Schroeppel’s Linear Sieve
also shows that the numbers that need to be smooth can be not much
larger than √N. The Number Field Sieve in the next section shows
that the smooth numbers, which can be found with a sieve, can be
even smaller than √N.
Schroeppel actually invented his Linear Sieve before Pomerance
invented the Quadratic Sieve. He used it in an attempt to factor the
Fermat number F_8. After he had found all the relations and before he
had finished the linear algebra, Brent and Pollard factored F_8 by the
Pollard Rho Algorithm 5.23. Pomerance noticed that the restriction
a = b changes the Linear Sieve into the Quadratic Sieve.

8.5. The Number Field Sieve


We now discuss the Number Field Sieve. Pollard [Pol93] was the first
to suggest raising the degree of the polynomial in the Quadratic Sieve
from 2 to a higher value, but only for numbers with special form. He
factored the Fermat number F_7 (which had been factored earlier by
Morrison and Brillhart) using the cubic polynomial 2x³ + 2 on a small
computer. Manasse and the Lenstra brothers soon extended Pollard’s
ideas to higher degree polynomials, still only for numbers of the form
r^e − s. Their goal was to factor F_9, the smallest Fermat number
with no known prime factor. They hoped to use the special form of
F_9 to make the numbers that had to be smooth smaller than those
needed for the Quadratic Sieve. After they factored F_9 in 1990, they
and others extended the Number Field Sieve to general numbers. See
Lenstra and Lenstra [LL93] for more details of the early history of
this factoring algorithm. See Pomerance [Pom94] for an excellent
summary of the algorithm. Crandall and Pomerance [CP05] give a
thorough description of the modern algorithm.
Recall that the Quadratic Sieve produces many relations x_i² ≡
q_i (mod N) with q_i factored completely. When we have enough
relations, we match the prime factors of the q_i and create a subset of
the q_i whose product is square. In this way, we find congruences
x² ≡ y² (mod N) which may factor N by Theorem 6.18.
Drop the requirement that the left side of a relation must be
square. Instead seek relations r_i ≡ q_i (mod N) in which both r_i and
q_i have been factored completely, as in the Double and Linear Sieves.
We will use linear algebra to match the prime factors of r_i and the
prime factors of q_i and select a subset of the relations for which both
the product of the r_i and the product of the q_i are square. This is
a fine idea, but too slow to be practical. The main difficulty is that
at least one of |r_i|, |q_i| must exceed N/2, so it has low probability of
being smooth.
The Number Field Sieve makes the idea fast and practical by
letting the numbers on one side of each relation be algebraic integers
from an algebraic number field. The idea is to match the irreducible
factors so that each occurs an even number of times and the product
of the algebraic integers in the selected subset of the relations might
be a square in the algebraic number field.
Here we give a brief review of number fields. See one of the books
[IR98], [Jan98], [DF04], [Her99] for more about this topic. An al-
gebraic number is the zero of a polynomial with integer coefficients. If
the polynomial is monic, then the algebraic number is called an alge-
braic integer. An algebraic number field is a field which contains only
algebraic numbers. The smallest algebraic number field containing
the algebraic number α is written Q(α).

Example 8.10. The complex number α = √−6 is an algebraic integer
because it is the zero of the monic polynomial x² + 6. The set
Q(α) of all numbers a + b√−6, where a and b are any rational numbers,
forms a field. All of a + b√−6 are algebraic numbers, so this
set is an algebraic number field. Indeed, it is the smallest algebraic
number field containing √−6.

The set of all algebraic integers in Q(α) is written Z(α). This set
forms a commutative ring with unity. A unit in Z(α) is an element
having a multiplicative inverse in Z(α). A nonzero, nonunit element
γ of Z(α) is irreducible if it can be factored in Z(α) only as γ = uβ
where u is a unit. When γ = uβ, where u is a unit, β is called an
associate of γ (and γ is an associate of β). An algebraic integer γ
has unique factorization (in Z(α)) if any two factorizations of γ into
the product of irreducible elements and units are the same except for
replacing irreducibles by their associates and using different units.

Example 8.11. One can show that Z(√−6) is the set of all a + b√−6,
where a, b are integers. In this set, the algebraic integer 10 (the zero
of the monic polynomial x − 10) does not have unique factorization
because

    10 = 2 · 5 = (2 + √−6)(2 − √−6)

and all four factors 2, 5, (2 + √−6), (2 − √−6) are irreducible.

The polynomial of lowest degree having an algebraic number α as
a zero must be irreducible, that is, it doesn’t factor into the product
of two polynomials of lower degree. If an algebraic number α is a zero
of the irreducible polynomial f(x) ∈ Z[x], then the conjugates of α
are all of the zeros of f(x) (in the complex numbers C). The norm
N(α) of α is the product of all of the conjugates of α (including α).
The norm is a real number. The norm of an algebraic integer is an
integer. The norm function is multiplicative: N(αβ) = N(α)N(β).
If β = γ² for some γ ∈ Z(α), then N(β) is the square of the integer
N(γ). If the algebraic integer α is a zero of the irreducible polynomial
f(x) = x^d + c_{d−1}x^{d−1} + ··· + c_1 x + c_0 and a and b are integers, then the
norm of a − bα is N(a − bα) = F(a, b), where F is the homogeneous
polynomial

(8.1)    F(x, y) = x^d + c_{d−1} x^{d−1} y + ··· + c_1 x y^{d−1} + c_0 y^d = y^d f(x/y).



Example 8.12. In Example 8.11, the norm of a + b√−6 is

    N(a + b√−6) = a² + 6b².

The algebraic integer a + b√−6 ∈ Z(√−6) can be factored as
a + b√−6 = αβ only if 1 < N(α) < N(a + b√−6) = a² + 6b².
We have N(a + b√−6) ≥ 6 when b ≠ 0. This shows that 2 and 5 are
irreducible in Z(√−6), since a proper factor of 2 or of 5 would have
norm 2 or 5, and a² + 6b² takes neither value.
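
The computations of Examples 8.11 and 8.12 can be checked with a small Python sketch. Here an element a + b√−6 is stored as the integer pair (a, b); this representation and the function names are our own choices.

```python
def mul(u, v):
    """Multiply a + b√−6 by c + d√−6, each stored as an integer pair."""
    a, b = u
    c, d = v
    return (a * c - 6 * b * d, a * d + b * c)

def norm(u):
    """Norm of a + b√−6, namely a² + 6b²."""
    a, b = u
    return a * a + 6 * b * b

print(mul((2, 1), (2, -1)))          # (10, 0), that is, (2 + √−6)(2 − √−6) = 10
print(norm((2, 1)), norm((2, -1)))   # 10 10

# A proper factor of 2, 5, or 2 ± √−6 would have norm 2 or 5.
# Because a² + 6b² <= 5 forces b = 0 and |a| <= 2, this small search is exhaustive.
hits = [(a, b) for a in range(-3, 4) for b in range(-1, 2) if norm((a, b)) in (2, 5)]
print(hits)                          # []
```

The norm is multiplicative, so norm(mul(u, v)) equals norm(u) · norm(v) for every pair u, v.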

We now return to discussion of the Number Field Sieve. Recall
that we are letting the numbers on one side of each relation be algebraic
integers from an algebraic number field. Then we proposed to
match the irreducible factors so that each occurs an even number of
times and thus the product of the associated relations will be a
congruence with two squares of integers congruent modulo the number
N to factor.
The first difficulty is in writing a congruence modulo N with
an algebraic integer on one side. This problem is solved by using a
homomorphism h from the algebraic integers Z(α) to Z_N, the integers
modulo N. Suppose we have many algebraic integers θ_i, each factored
into irreducibles, and also every h(θ_i) factored into the product of
primes. Then we may match the irreducibles and match the primes
to choose a subset of the θ_i, say those with i ∈ S, whose product is a
square γ² in Z(α) and so that the product of the h(θ_i) is a square y²
in the integers. See Montgomery [Mon94] for a way to find γ from a
set of θ_i’s whose product is γ². Let x = h(γ), a residue class modulo N.
We have

    x² = (h(γ))² = h(γ²) = h(∏_{i∈S} θ_i) = ∏_{i∈S} h(θ_i) ≡ y² (mod N),

which may factor N by Theorem 6.18.


Now we tell how to choose the algebraic number field and construct
the homomorphism. We want to have an irreducible monic
polynomial

    f(x) = x^d + c_{d−1}x^{d−1} + ··· + c_1 x + c_0

with integer coefficients. Let α be a zero of f in C. The algebraic
number field will be Q(α). Let Z[α] be the set of all ∑_{j=0}^{d−1} a_j α^j,
where the a_j are integers. This is a ring contained in the ring Z(α)
of integers of Q(α). We also need an integer m for which f(m) ≡
0 (mod N). The homomorphism from Z[α] to Z_N will be defined by
setting h(α) ≡ m (mod N), that is,

    h(∑_{j=0}^{d−1} a_j α^j) ≡ ∑_{j=0}^{d−1} a_j m^j (mod N).

The numbers $\theta$ will all have the simple form $a - b\alpha$. We
will seek a set $S$ of pairs $(a, b)$ of integers such that

(8.2)   $\prod_{(a,b) \in S} (a - bm)$ is a square in $\mathbb{Z}$

and

(8.3)   $\prod_{(a,b) \in S} (a - b\alpha)$ is a square in $\mathbb{Z}[\alpha]$.

Let the integer $y$ be a square root of the first product. Let
$\gamma \in \mathbb{Z}[\alpha]$ be a square root of the second product.
We have $h(\gamma^2) \equiv y^2 \pmod{N}$, since
$h(a - b\alpha) \equiv a - bm \pmod{N}$. Let $x = h(\gamma)$. Then
$x^2 \equiv y^2 \pmod{N}$, which will factor $N$ with probability at
least 1/2, by Theorem 6.18.
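The last step, turning a congruence of squares into a factor, can be sketched as follows. This is an illustration with toy numbers of our own choosing (N = 91 is not from the text), and the function name `split_from_squares` is hypothetical.

```python
from math import gcd

def split_from_squares(n, x, y):
    """Given x^2 ≡ y^2 (mod n), return gcd(n, x - y) and gcd(n, x + y)."""
    assert (x * x - y * y) % n == 0, "x^2 and y^2 must be congruent mod n"
    return gcd(n, x - y), gcd(n, x + y)

# Toy data: 10^2 = 100 ≡ 9 = 3^2 (mod 91), and 10 is not ≡ ±3 (mod 91).
print(split_from_squares(91, 10, 3))    # -> (7, 13)

# When x ≡ ±y (mod n) the congruence is useless: only trivial gcds appear.
print(split_from_squares(91, 94, 3))    # -> (91, 1)
```

The second call shows the unlucky case, which is why the congruence factors N only with some probability.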
In practical applications, the degree d of the polynomial f (x) is
typically 4, 5, or 6. In addition to being irreducible and having a
known zero m modulo N , we want f (x) to have “small” coefficients
compared to N . (Actually, we want the norm function N ( ) to have
small values, so it is important for the high-order coefficients of f (x)
to be very small.) There are several ways one might satisfy all these
conditions.
The requirements on $f(x)$ are easily met in the Special Number
Field Sieve, which factors numbers $N = r^e - s$, where $r$ and $|s|$
are small positive integers. Cunningham numbers have this form with
$s = \pm 1$. Let $k$ be the least positive integer for which
$kd \ge e$. Let $t = s r^{kd-e}$. Let $f(x)$ be the polynomial
$x^d - t$. Let $m = r^k$. Then

$f(m) = r^{kd} - s r^{kd-e} = r^{kd-e} N \equiv 0 \pmod{N}.$

Example 8.13. The number $N$ to factor is $6^{469} + 1$.

Let $d = 5$. Note that $6(6^{469} + 1) = 6^{470} + 6$ and
$470 = 5 \cdot 94$, so if we let $f(x) = x^5 + 6$ and $m = 6^{94}$,
then $f(m) = 6N \equiv 0 \pmod{N}$. The number field is
$K = \mathbb{Q}(\alpha)$, where $\alpha = (-6)^{1/5}$ is a (complex)
zero of $f$.
A second possible polynomial would be $f(x) = x^6 + 6^5$ with $d = 6$,
$m = 6^{79}$, and $\alpha = (-6^5)^{1/6}$. One should perform a tiny
bit of sieving on each of these two polynomials to see which produces
relations faster.
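The recipe for the Special Number Field Sieve polynomial is short enough to run directly. The sketch below follows the definitions of $k$, $t$, and $m$ given above; the function name `snfs_polynomial` is our own, and the check uses the numbers of Example 8.13.

```python
def snfs_polynomial(r, e, s, d):
    """For N = r**e - s return (k, t, m) with f(x) = x**d - t and f(m) ≡ 0 (mod N)."""
    k = -(-e // d)                 # least k with k*d >= e (ceiling division)
    t = s * r ** (k * d - e)
    m = r ** k
    return k, t, m

# Example 8.13: N = 6^469 + 1, so r = 6, e = 469, s = -1, d = 5.
r, e, s, d = 6, 469, -1, 5
N = r ** e - s
k, t, m = snfs_polynomial(r, e, s, d)
print(k, t)                        # k = 94 and t = -6, i.e. f(x) = x^5 + 6
print((m ** d - t) % N == 0)       # f(m) is a multiple of N
```

Calling `snfs_polynomial(6, 469, -1, 6)` gives $k = 79$ and $t = -6^5$, the second polynomial of the example.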

In the general case, called the General Number Field Sieve, one
standard approach to finding a good polynomial (of degree 5, say) to
factor $N$ is to let $m$ be an integer slightly smaller than
$N^{1/5}$, so that the base-$m$ expansion of $N$ has exactly six
digits. Write $N = \sum_{i=0}^{5} d_i m^i$ in base $m$. The digits
$d_i$ will be in the interval $0 \le d_i < m$, small compared to $N$.
Then let the polynomial be $f(x) = \sum_{i=0}^{5} d_i x^i$.
This method makes all the coefficients of f (x) somewhat small but
does not make the high-order coefficients especially small. See the
article [BLP93] by Buhler, Lenstra, and Pomerance for the origins of
the General Number Field Sieve. Montgomery and Murphy [MM99]
give better ways to choose a polynomial for the General Number Field
Sieve.
Example 8.14. Let us find a polynomial for factoring

$N = 37965134430918647673876906901814739053246839817137.$

Let $d = 5$. Now $\sqrt[5]{N}$ is approximately $m = 8239047153$. If we
write $N$ in base $m$, we find

$N = m^5 + 2m^4 + 5m^3 + 2267016550m^2 + 6448349153m + 3629348338.$

Thus we let

$f(x) = x^5 + 2x^4 + 5x^3 + 2267016550x^2 + 6448349153x + 3629348338.$

We automatically have $f(m) = N \equiv 0 \pmod{N}$. The number field is
$K = \mathbb{Q}(\alpha)$, where $\alpha$ is a zero of $f$.
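The base-$m$ construction can be sketched in a few lines. In this illustration we assume $m$ is taken to be the integer part of the fifth root of $N$; the example's $m$ is only described as approximate, so the digits printed here need not match the ones quoted above. The helper names are our own.

```python
def integer_root(n, d):
    """Largest integer m with m**d <= n, found by binary search."""
    lo, hi = 1, 1 << (n.bit_length() // d + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def base_m_digits(n, m):
    """Digits d_0, d_1, ... of n in base m (coefficients of f, low order first)."""
    digits = []
    while n > 0:
        n, digit = divmod(n, m)
        digits.append(digit)
    return digits

N = 37965134430918647673876906901814739053246839817137
m = integer_root(N, 5)
coeffs = base_m_digits(N, m)
print(m, coeffs[::-1])             # m and the coefficients, high order first
```

Whatever $m$ is used, evaluating the resulting polynomial at $m$ reproduces $N$ exactly, which is why $f(m) \equiv 0 \pmod{N}$ holds automatically.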

We never need to compute the zero $\alpha$ of $f$. It is used in
understanding why the algorithm works but does not appear in the
program for the Number Field Sieve.
When choosing f (x) for either the Special or the General Number
Field Sieve, we may assume f is irreducible. If f is not irreducible,
then we can factor N immediately. If f (x) = g(x)h(x) in Z[x], then
the integer factorization N = g(m)h(m) gives a nontrivial factoriza-
tion of N , just as a cyclotomic factorization factors a Cunningham
number in Theorem 3.10.
We will have two sieves, one for a − bm and one for a − bα. The
sieve on a − bm is simple: For each fixed 0 < b < M we try to factor
the numbers a − bm for −M < a < M by sieve Algorithm 8.6.
The goal of the sieve on the numbers a − bα is to allow us to
choose a set S of pairs (a, b) so that the product in equation (8.3)
is a square. Rather than try to factor the algebraic integers a − bα,
let us work with their norms. If the product in equation (8.3) is a
square, then its norm is a square, and its norm is the product of all
N(a − bα) with (a, b) ∈ S. Since the norms are rational integers,
rather than algebraic integers, it is easy to match the prime factors
of norms to form squares. Furthermore, the norm of a − bα is a
polynomial, F (a, b) in (8.1), and therefore something we can factor
with sieve Algorithm 8.6. For each fixed b between 0 and M , sieve
the polynomial F (x, b) = N (x − bα) for x between −M and M to
find smooth values of N (a − bα).
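To make the norm computation concrete, here is a small sketch that evaluates the homogeneous polynomial $F(a, b) = b^d f(a/b)$ of (8.1) with integer arithmetic and tests a value for smoothness by trial division. The function names and the sample pairs are our own; a real implementation would sieve as in Algorithm 8.6 rather than test each value separately.

```python
def homogeneous_norm(coeffs, a, b):
    """F(a, b) = b^d f(a/b) for f(x) = sum(coeffs[i] * x**i), using integers only."""
    d = len(coeffs) - 1
    return sum(c * a ** i * b ** (d - i) for i, c in enumerate(coeffs))

def is_smooth(n, bound):
    """True if every prime factor of |n| is <= bound (plain trial division)."""
    n = abs(n)
    if n == 0:
        return False
    p = 2
    while p <= bound and n > 1:
        while n % p == 0:
            n //= p
        p += 1
    return n == 1

f = [6, 0, 0, 0, 0, 1]               # f(x) = x^5 + 6 from Example 8.13
print(homogeneous_norm(f, 2, 1))     # 2^5 + 6 = 38 = 2 * 19
print(is_smooth(homogeneous_norm(f, 2, 1), 19))
```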
Whenever both a − bm and N (a − bα) are smooth, save the pair
(a, b) to represent the relation h(a − bα) ≡ a − bm (mod N ). When
we have found many relations, use linear algebra to construct sets S
of pairs (a, b) for which the product of a − bm is a square and the
product of the norms of a − bα is a square.
Several problems arise, even in this simple description of the
Number Field Sieve Algorithm. For example, the fact that N (θ)
is square need not imply θ is square. One problem is that the norm
function does not distinguish among associates. Another is the lack of
unique factorization in most number fields. A third problem is com-
puting the square root of algebraic numbers. All of these problems
can be solved. See Crandall and Pomerance [CP05] for the details.
One can show that the Number Field Sieve is faster than the
Quadratic Sieve when N is large and the degree d is chosen properly.
A careful analysis shows that the complexity of the Number Field
Sieve is

$\exp\bigl(c (\ln N)^{1/3} (\ln \ln N)^{2/3}\bigr)$

for some constant c > 0. The constant c is a bit smaller for the Special
than for the General Number Field Sieve because the coefficients can
be made smaller.
In summary, the Number Field Sieve has these steps.

(1) Select a polynomial $f(x) \in \mathbb{Z}[x]$ and an integer $m$
    with $f(m) \equiv 0 \pmod{N}$.
(2) Sieve numbers $a - bm$ and $N(a - b\alpha)$; save $(a, b)$ whenever
    both $a - bm$ and $N(a - b\alpha)$ are smooth.
(3) Filter the relations to remove duplicates and those containing
    a prime that appears in no other relation.
(4) Use linear algebra modulo 2 to find sets $S$ as in formulas
    (8.2) and (8.3).
(5) Find the square roots $y$ and $\gamma$ of the squares in formulas
    (8.2) and (8.3).
(6) Let $x = h(\gamma)$; try to factor $N$ via $\gcd(N, x \pm y)$.
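Step (4) can be illustrated on a toy scale. The sketch below is our own simplified model, not the method used in production code (which relies on sparse techniques such as block Lanczos or block Wiedemann). It performs Gaussian elimination over GF(2) on exponent vectors, using Python integers as bit vectors, and returns a set of relations whose product has all exponents even.

```python
def find_square_subset(exponent_vectors):
    """Return indices of a nonempty subset whose vectors sum to 0 mod 2, or None."""
    basis = {}                        # pivot bit -> (reduced row, subset bitmask)
    for idx, vec in enumerate(exponent_vectors):
        row = 0
        for j, exp in enumerate(vec):
            if exp % 2:
                row |= 1 << j
        combo = 1 << idx              # records which relations were combined
        while row:
            pivot = row.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (row, combo)
                break
            basis_row, basis_combo = basis[pivot]
            row ^= basis_row
            combo ^= basis_combo
        else:                         # row reduced to zero: dependency found
            return [i for i in range(len(exponent_vectors)) if (combo >> i) & 1]
    return None

# Toy exponent vectors over the primes (2, 3, 5): 6 * 10 * 15 = 900 = 30^2.
relations = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
print(find_square_subset(relations))  # -> [0, 1, 2]
```

In the Number Field Sieve each vector would hold the exponents from both sides of a relation, so that one dependency makes both products squares.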
Variations of the Number Field Sieve allow one or even two large
primes on each side of each relation, as in the Quadratic Sieve. When
two large primes are used, their product is often split by the SQUFOF
or the ECM. Other variations include using free relations, a line sieve,
special q’s, etc.
The examples below illustrate ways to find polynomials for the
Special Number Field Sieve to factor Cunningham numbers and Fi-
bonacci numbers.

Example 8.15. The number $N$ to factor is $12^{257} - 1$.

If we proceed as in Example 8.13, we find the polynomials
$f(x) = x^5 - 12^3$ with $m = 12^{52}$ and $f(x) = x^6 - 12$ with
$m = 12^{43}$. It turns out that the polynomial $f(x)$ does not have
to be monic. A few new difficulties are introduced when it is not
monic, but they are not hard to overcome. Thus we consider nonmonic
polynomials with small coefficients. Note that

$12^3 \cdot N = 12^{260} - 12^3 = (12^{52})^5 - 12^3 = 4^5 (12^{52}/4)^5 - 12^3.$

Dividing by $4^3$ gives

$3^3 \cdot N = 4^2 (12^{52}/4)^5 - 3^3.$

Therefore, $m = 12^{52}/4$ is a zero of the polynomial
$f(x) = 4^2 x^5 - 3^3 = 16x^5 - 27$ modulo $N$. Here
$\alpha = (27/16)^{1/5}$ is a (complex) zero of $f$. The coefficients
of $16x^5 - 27$ are smaller than those of $x^5 - 12^3 = x^5 - 1728$
and consequently $a + b\alpha$ has a slightly smaller norm. Thus
$a + bm$ and $N(a + b\alpha)$ are smaller and might be more likely to
be smooth. This trick can be used for any composite base $b$, such as
6, 10, or 12, but not for a prime base. It was not used in
Example 8.13 because $x^5 + 6$ already has small coefficients.
If we are not restricted to monic polynomials, we could also use
$f(x) = 144x^5 - 1$, with $m = 12^{51}$, or $f(x) = 12x^4 - 1$, with
$m = 12^{64}$. To decide which of these polynomials is best, we perform
a tiny amount of sieving for each using the same sieve parameters. We
find the number of relations shown in the following table (see
footnote 2):

Polynomial       m              Relations
16x^5 − 27       12^52/4        3524388
144x^5 − 1       12^51          3644849
x^5 − 1728       12^52          3469764
12x^4 − 1        12^64          2511945
x^6 − 12         12^43          5523975

It turns out that the simplest monic polynomial, $x^6 - 12$, is best.
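Each candidate in this example can be checked mechanically: the root $m$ must satisfy $f(m) \equiv 0 \pmod{N}$. The sketch below does only that check; it does not reproduce the sieving experiment or its relation counts. The dictionary layout and the helper `evaluate` are our own.

```python
N = 12 ** 257 - 1

# Coefficients (low order first) and root m for each candidate polynomial.
candidates = {
    "16x^5 - 27": ([-27, 0, 0, 0, 0, 16], 12 ** 52 // 4),
    "144x^5 - 1": ([-1, 0, 0, 0, 0, 144], 12 ** 51),
    "x^5 - 1728": ([-1728, 0, 0, 0, 0, 1], 12 ** 52),
    "12x^4 - 1": ([-1, 0, 0, 0, 12], 12 ** 64),
    "x^6 - 12": ([-12, 0, 0, 0, 0, 0, 1], 12 ** 43),
}

def evaluate(coeffs, x):
    """Horner evaluation of sum(coeffs[i] * x**i)."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value

for name, (coeffs, m) in candidates.items():
    print(name, evaluate(coeffs, m) % N == 0)
```

For instance, $16 (12^{52}/4)^5 - 27$ equals $27N$, in agreement with the derivation in the example.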
Example 8.16. The number $N$ to factor is

$\Phi_{1606}(3) = \frac{(3^{803} + 1)(3^1 + 1)}{(3^{73} + 1)(3^{11} + 1)}.$

Note that $803 = 11 \cdot 73$. If we assumed that $N = 3^{803} + 1$,
then $N$ would be larger and we would have to do more sieving than if
we assumed that $N = (3^{803} + 1)/(3^{73} + 1)$. But this $N$ does not
have the form $r^e - s$ and we must work harder to find a suitable
polynomial and root. Write $k = -3^{73}$. Then
$N = (k^{11} - 1)/(k - 1) = g(k)$, where
$g(x) = x^{10} + x^9 + \cdots + x + 1$. Degree 10 is too large. We
attempt to reduce the degree to about 5. Factor an $x^5$ from $g$:

$g(x) = x^5 (x^5 + x^4 + x^3 + x^2 + x + 1 + x^{-1} + x^{-2} + x^{-3} + x^{-4} + x^{-5}).$

Since the expression in parentheses is unchanged when $x$ is replaced
by $x^{-1}$, it can be written as a polynomial in the variable
$u = x + x^{-1}$. One computes that this polynomial is

$f(u) = u^5 + u^4 - 4u^3 - 3u^2 + 3u + 1.$

Then $x^5 f(x + x^{-1}) = g(x)$. Let $m = k + k^{-1} \bmod N$. We have
$k^5 f(m) \equiv g(k) \equiv 0 \pmod{N}$. Since $\gcd(k, N) = 1$, we
have $f(m) \equiv 0 \pmod{N}$. Let $\alpha$ be a zero of $f$. When $a$
and $b$ are small, the norm of $a + b\alpha$ is near $3^{146}$, which
is smaller by a factor of $3^{15}$ (and more likely to be smooth) than
if we had used $N = 3^{803} + 1$, $f(x) = x^5 + 9$, and $m = 3^{161}$.
One can use this trick for $r^e - s$ whenever 11 divides $e$. Similar
tricks work when $e$ is a multiple of 7 or 13.
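The degree-halving trick can be verified on a scaled-down instance. The sketch below first checks the identity $x^5 f(x + x^{-1}) = g(x)$ exactly at a few rational points, then replaces $3^{73}$ by 3 (our own small stand-in, so $k = -3$ and $N = (3^{11} + 1)/(3 + 1)$) and confirms that $m = k + k^{-1} \bmod N$ is a zero of $f$ modulo $N$.

```python
from fractions import Fraction

def f(u):
    return u**5 + u**4 - 4 * u**3 - 3 * u**2 + 3 * u + 1

def g(x):
    return sum(x**i for i in range(11))       # x^10 + x^9 + ... + x + 1

# Exact check of x^5 f(x + 1/x) = g(x) at a few rational points.
for x in (Fraction(2), Fraction(-3), Fraction(5, 7)):
    assert x**5 * f(x + 1 / x) == g(x)

# Scaled-down analogue: k = -3 gives N = g(k) = (3^11 + 1)/(3 + 1).
k = -3
N = g(k)
m = (k + pow(k % N, -1, N)) % N               # k + k^(-1) mod N
print(N, f(m) % N)
```

The three-argument `pow` with exponent −1 computes a modular inverse and needs Python 3.8 or later.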
Footnote 2. These relations come from sieving special q's between
300000000 and 320000000 for each polynomial.

Example 8.17. The number to factor is (a divisor of) $2^{1846} + 1$.

Of course, this number has the Aurifeuillian factorization shown
in equation (4.1). The particular number to factor happens to be
$N$ = 2,1846L = $2^{923} - 2^{462} + 1$. This $N$ does not have the
form $r^e - s$, but its shape suggests the polynomials
$f(x) = x^4 - 2x^2 + 2$ with $m = 2^{231}$ and
$f(x) = x^6 - 2x^3 + 2$ with $m = 2^{154}$. This type of polynomial
should be investigated whenever $N$ is an Aurifeuillian factor.
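A quick check, written as a sketch with function names of our own, confirms that both polynomials vanish modulo $N$ at the stated roots; in each case $f(m) = 2N$.

```python
N = 2**923 - 2**462 + 1            # the Aurifeuillian factor 2,1846L

def f4(x):
    return x**4 - 2 * x**2 + 2

def f6(x):
    return x**6 - 2 * x**3 + 2

print(f4(2**231) == 2 * N, f6(2**154) == 2 * N)
```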

One can find good polynomials for the Special Number Field Sieve
for numbers in any divisibility sequence (and numbers near numbers
in a divisibility sequence).

Example 8.18. Find a fifth degree polynomial for factoring the
Fibonacci number $u_{1289}$ by the Special Number Field Sieve.
The Fibonacci numbers enjoy many identities. The identity

(8.4)   $u_n^5 + 10 u_n^3 u_{n+1}^2 + 10 u_n^2 u_{n+1}^3 + 10 u_n u_{n+1}^4 + 3 u_{n+1}^5 = u_{5n+4}$

is easy to prove from the formula
$u_n = (\alpha^n - \beta^n)/(\alpha - \beta)$, where $\alpha$ and
$\beta$ are the roots of $x^2 - x - 1 = 0$. In it let $n = 257$ and
divide through by $u_{258}^5$. The identity shows that
$m = u_n/u_{n+1} = u_{257}/u_{258}$ is a zero modulo
$u_{5n+4} = u_{1289}$ of the polynomial

$f(x) = x^5 + 10x^3 + 10x^2 + 10x + 3.$

Compute $m$ by inverting $u_{258}$ modulo $u_{1289}$ via the Extended
Euclidean Algorithm and multiplying $u_{257}$ by this inverse modulo
$u_{1289}$.
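The construction can be carried out directly. The sketch below checks identity (8.4) numerically for small $n$ (a sanity check, not a proof), then builds $m$ for $n = 257$ and confirms $f(m) \equiv 0$ modulo $u_{1289}$. The helper names are ours, and Python's built-in modular inverse stands in for the Extended Euclidean Algorithm.

```python
def fib_pair(n):
    """Return (u_n, u_{n+1}) with u_0 = 0, u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a, b

def identity_lhs(n):
    """Left side of identity (8.4)."""
    x, y = fib_pair(n)
    return x**5 + 10 * x**3 * y**2 + 10 * x**2 * y**3 + 10 * x * y**4 + 3 * y**5

# Identity (8.4) for the first few n.
assert all(identity_lhs(n) == fib_pair(5 * n + 4)[0] for n in range(1, 20))

# The construction of the example with n = 257.
n = 257
u_n, u_n1 = fib_pair(n)
N = fib_pair(5 * n + 4)[0]                  # u_1289
m = u_n * pow(u_n1, -1, N) % N              # u_257 / u_258 modulo u_1289
f_m = m**5 + 10 * m**3 + 10 * m**2 + 10 * m + 3
print(f_m % N == 0)
```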

The group NFS@Home, led by G. Childers, used the Special Number
Field Sieve to factor the Mersenne number $2^{1061} - 1$. This
320-digit number split into prime factors with 143 and 177 digits.
In a footnote, Mazur [Maz11] claimed that the numerator $P_{200}$
of the Bernoulli number $B_{200}$ is

$P_{200} = 389 \cdot 691 \cdot 5370056528687 \cdot N,$

where $N$ is a 204-digit prime. (See Section 4.7.) In fact this $N$ is
composite. A large group of factorers pooled their efforts to factor
the 204-digit composite $N$ using the General Number Field Sieve.
Shi Bai found a good General Number Field Sieve polynomial using
Kleinjung’s algorithm in CADO-NFS. Many volunteers led by
Childers did the sieving. Two groups performed the filtering and
linear algebra independently. The group of Thomé and Zimmermann
finished first, so the group of Childers and Hart stopped its work.
The number has prime factors of 90 and 115 digits.
At least one person using the Number Field Sieve lives in a city
with intermittent electricity. He has to save his relations and check-
point his linear algebra frequently so that his machine can resume the
work when the electricity returns. In spite of this difficulty, he has
factored several large numbers with the Number Field Sieve.

Exercises

8.1. Find a simple formula for the smallest multiple $i$ of $p$ with
     $i \ge I$.
8.2. Find a simple formula for the smallest $i \equiv r \pmod{p}$ with
     $i \ge I$.
8.3. Use the Sieve of Eratosthenes to make a table of primes either
     to 100 by hand or to 1000000 with a computer. (The program
     should run for less than one second.)
8.4. Use the Sieve of Eratosthenes to count the numbers between
     $10^9$ and $10^9 + 10^3$ with no prime factor $< 10^2$. (Hint: Use
     the program of the previous exercise to find the primes $< 10^2$
     first. The program should run for less than one second.)
8.5. Use Algorithm 8.3 to verify the claim in Example 3.4 that there
     are exactly 30 numbers $n$ in the interval
     $10^{30} - 10^5 \le n \le 10^{30}$ with all prime factors $< 10^6$.
8.6. When sieving in the Number Field Sieve to find smooth values
     of $N(x - b\alpha)$ for fixed $b$, what is the polynomial $f(x)$ in
     Algorithm 8.6?
8.7. Suppose $f(x) = 2^x + x - 1$, which is not a polynomial in $x$.
     Can you use an algorithm similar to the sieve to quickly find all
     $x$ in $100 < x < 200$ for which $f(x)$ has no prime factor
     $< 1000$? By “quickly” we mean faster than Trial Division to 1000
     for each of $f(101), f(102), \ldots, f(199)$.
8.8. Show that (see footnote 3)
     $f(x) = x^8 - 6x^7 - 30x^6 + 216x^5 + 144x^4 - 1944x^3 + 5184x + 1296,$
     with zero $m = (6^{31} + 1)(6^{-15} \bmod 6,930M)$, is a good
     Special Number Field Sieve polynomial for factoring the
     Aurifeuillian factor 6,930M and that $g(x) = f(-x)$, with zero
     $m = (6^{31} + 1)(6^{-15} \bmod 6,930L)$, is a good polynomial for
     factoring 6,930L. This polynomial was found by Serge Batalov.
8.9. Prove equation (8.4).

Footnote 3. 6,930L and 6,930M are the Aurifeuillian factors of
$6^{930} + 1$.
Chapter 9

Factoring Devices

Introduction
This chapter describes some hardware used to factor integers. The
next section discusses devices to perform the sieve algorithms of Chap-
ter 8, from paper strips to electronic shift registers. The last section
lists several computers with special hardware attachments designed to
speed parts of certain factoring algorithms. Some of these machines
were actually built and factored numbers. Others were proposed but
never built. The last section also describes factoring by the new tech-
nologies of quantum computing and DNA computing.

9.1. Sieve Devices


We begin with an example to illustrate the ideas that follow.

Example 9.1. Factor $N = 407$ using a sieve to express $N = a^2 - b^2$.

We will try $a = \lceil\sqrt{N}\rceil, \lceil\sqrt{N}\rceil + 1, \lceil\sqrt{N}\rceil + 2, \ldots$
and seek those $a$ for which $a^2 - N$ is a square $b^2$. Our plan is
to consider only those $a$ for which $a^2 - N$ is a square modulo all
of the first few primes or prime powers.
The squares modulo 8 are 0, 1, and 4. Since $N \equiv 7 \pmod 8$,
the only way to write $N = a^2 - b^2$ is with $a^2 \equiv 0 \pmod 8$
and $b^2 \equiv 1 \pmod 8$. (Try the nine cases of 0, 1, 4 for $a^2$
and $b^2$.) Then $a \equiv 0 \pmod 4$. (Try the cases of $a \bmod 8$.)
The squares modulo 3 are 0 and 1. Since $N \equiv 2 \pmod 3$, the only
way to write $N = a^2 - b^2$ is with $a^2 \equiv 0 \pmod 3$ and
$b^2 \equiv 1 \pmod 3$. Then $a \equiv 0 \pmod 3$.
The squares modulo 5 are 0, 1, and −1. Since $N \equiv 2 \pmod 5$,
the only way to write $N = a^2 - b^2$ is with $a^2 \equiv 1 \pmod 5$
and $b^2 \equiv -1 \pmod 5$. Then $a \equiv 1$ or $4 \pmod 5$ (and
$b \equiv 2$ or $3 \pmod 5$).
The squares modulo 7 are 0, 1, 2, and 4. Since $N \equiv 1 \pmod 7$,
the only way to write $N = a^2 - b^2$ is with either
$a^2 \equiv 1 \pmod 7$ and $b^2 \equiv 0 \pmod 7$ or
$a^2 \equiv 2 \pmod 7$ and $b^2 \equiv 1 \pmod 7$. Then $a \equiv 1$,
6, 3, or $4 \pmod 7$.
So far we know

$a \equiv 0 \pmod 4$ and $a \equiv 0 \pmod 3$ and $a \equiv 1$ or $4 \pmod 5$
and $a \equiv 1, 3, 4,$ or $6 \pmod 7$.

These congruences may also be written as linear forms for $a$:

$a = 4w$ and $a = 3x$ and $\{a = 1 + 5y$ or $a = 4 + 5y\}$
and $\{a = 1 + 7z$, $a = 3 + 7z$, $a = 4 + 7z$, or $a = 6 + 7z\}$,

for some integers $w, x, y, z$. It is easy to combine the first two
linear forms into $a = 12v$, but it is more complicated to combine
them all into eight linear forms.

We also know from $N = a^2 - b^2$ that $\sqrt{N} < a < N$, or
$20 < a < 407$. The following table indicates with a check mark (✓)
which moduli work for each $a > 20$; an X marks a modulus that rules
out that $a$. Each column represents one value of $a$:

modulus   21   22   23   24   25   26   27   28   29
   4       X    X    X    ✓    X    X    X    ✓    X
   3       ✓    X    X    ✓    X    X    ✓    X    X
   5       ✓    X    X    ✓    X    ✓    X    X    ✓
   7       X    ✓    X    ✓    ✓    X    ✓    X    ✓

The column of check marks under $a = 24$ indicates that this value
might work. We compute

$a^2 - N = 24^2 - N = 576 - 407 = 169 = 13^2 = b^2,$

so $a = 24$ works with $b = 13$ and we have

$N = a^2 - b^2 = 24^2 - 13^2 = (24 - 13)(24 + 13) = 11 \cdot 37.$

If $a^2 - N$ had not been the square of an integer, we would have
continued the table with more values of $a$ until we found another
column of check marks.

This example basically uses Fermat’s Factoring Method of Section
5.2 and tries to recognize squares as quadratic residues modulo each
prime up to some limit. As we mentioned after Example 5.3, it is
hard to combine very many linear forms, and we would like to use
dozens of moduli, not just four of them. When one builds a table
like that in Example 9.1, it is tedious to keep track of all the residue
classes. The ideas we now present mechanize this process.
Here is the algorithm we wish to execute efficiently. We are given
a set of moduli $p_1, p_2, \ldots, p_k$ and sets of allowable residues
modulo $p_i$, $S_i = \{s_{i1}, s_{i2}, \ldots, s_{i n_i}\}$ of size
$|S_i| = n_i$. Suppose we want solutions $a$ in the interval $[L, H]$
to $(a \bmod p_i) \in S_i$ for all $1 \le i \le k$.
Algorithm 9.2. Combine linear forms using a sieve:
Input: L, H, k, p_i, n_i, s_ij as above.
    for (a ← L to H) {
        gooda ← 1
        for (i ← 1 to k) {
            goodp ← 0 ; ap ← a mod p_i
            for (j ← 1 to n_i) {
                if (ap = s_ij) { goodp ← 1 }
            }
            if (goodp = 0) { gooda ← 0 }
        }
        if (gooda = 1) { output a and pause }
    }
Output: The algorithm pauses at each solution a.
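Algorithm 9.2 translates almost line for line into Python. The sketch below is our own rendering: a generator plays the role of "pause", and the residue sets are computed from $N$ rather than typed in. It uses modulus 8 with residues {0, 4} in place of the equivalent condition $a \equiv 0 \pmod 4$ from Example 9.1.

```python
def allowed_residues(N, p):
    """Residues a mod p for which a^2 - N is a square modulo p."""
    squares = {(b * b) % p for b in range(p)}
    return {a for a in range(p) if (a * a - N) % p in squares}

def sieve_solutions(low, high, moduli, residue_sets):
    """Yield every a in [low, high] with a mod p_i in S_i for all i."""
    for a in range(low, high + 1):
        if all(a % p in s for p, s in zip(moduli, residue_sets)):
            yield a                    # "output a and pause"

N = 407
moduli = [8, 3, 5, 7]
residue_sets = [allowed_residues(N, p) for p in moduli]
first = next(sieve_solutions(21, 406, moduli, residue_sets))
print(first, first * first - N)        # -> 24 169
```

The caller resumes the search simply by asking the generator for its next value, which mirrors resuming the algorithm after a solution fails the larger problem.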

The inner for loop tests whether $a \bmod p_i \in S_i$ for a single
$i$. The variable goodp is initialized to 0, meaning that no $j$ has
yet been found with $a \bmod p_i = s_{ij}$ for this particular $i$. It
becomes 1 if and when such a $j$ is found. The middle for loop tests
whether $a \bmod p_i \in S_i$ for all $1 \le i \le k$. The variable
gooda is initially 1, meaning that every $p_i$ tested so far has
$a \bmod p_i \in S_i$. It is changed to 0 when an $i$ is found for
which $a \bmod p_i \notin S_i$, that is, goodp is 0. The outer for loop
runs through all values of $a$. If gooda is still 1 just before the
end of this loop, it means that this $a$ satisfies
$a \bmod p_i \in S_i$ for every $1 \le i \le k$, and the algorithm
pauses.
When the algorithm pauses, it has found a solution a to the
system of congruences and one should check whether this a solves
the larger problem. In case the larger problem is factoring N as in
Example 9.1, one checks whether a2 − N is a square b2 . If it is, then
one may be able to factor N . If the solution a does not solve the
larger problem, then one resumes the algorithm with the next a.
The algorithm is called a sieve because it accepts only certain
residues modulo each small prime (or prime power).
Lawrence (of Section 5.4) was one of the first to suggest ways to
mechanize the combination of linear forms. His first suggestion was
to use paper strips. Use a strip $p_i$ units long to represent the set
$S_i$. Draw a heavy line across the shorter dimension of the strip at
$s_{ij}$ for each $j$. Lay the strips side by side on a table with
proper alignment so that a line across all strips represents one $a$.
If a heavy line extends across all strips at $a$, then $a$ is a
solution. If no solution is found, slide the strip that stops at the
smallest $a$ down the table by $p_i$ units if it is the strip for
$p_i$, and repeat the process. Figure 1 shows some strips for the data
in Example 9.1.

His second suggestion was to represent the modulus $p_i$, which
need not be prime, and the set $S_i$ of acceptable residues by a gear
with $p_i$ teeth. Put electric contacts in a circle on each gear at the
positions of the $s_{ij}$. Turn the gears together so that one tooth on
each gear passes a fixed point in each time unit. When an electric
current passes through a contact for every gear, it means that a
solution has been found and the rotation of the gears stops. Then a
human operator can look at a counter to learn the total number $a$ of
gear teeth that have passed the fixed point.
Figure 1. Lawrence’s paper strip sieve. (One strip for each modulus
4, 3, 5, 7, aligned against the values a = 13, ..., 29.)

In 1922, Carissan built such a machine using rings instead of
gears. It was called the Machine à Congruences. It had 14 concentric
brass rings with $p \le 59$ studs per ring. Each stud triggered a
switch. When all 14 switches were triggered together, an alarm sounded.
The operator turned the rings with a hand crank. When the alarm
sounded, he stopped cranking, read the value of the solution $a$ on a
counter, and checked whether it solved the big problem. See [SWM95] and
Figure 2. The machine factored
$7141075053842 = 2 \cdot 841249 \cdot 4244329$.
In the 1920s, Dick Lehmer built a similar machine using gears
and electric contacts on bicycle chains of length $p_i$. See Figure 3.
He factored some numbers with this machine, which was run by a motor,
including

$\Phi_{40}(10) = \frac{10^{20} + 1}{10^4 + 1} = 9999000099990001 = 1676321 \cdot 5964848081.$

Figure 2. Carissan sieve with all 14 rings, 1922.

In 1932, Dick Lehmer [Leh33b], [Leh33a] built another, much
faster, machine using 30 gears for the moduli $p_i$ = 128, 81, 125,
49, 121 and the primes 13, 17, . . ., 109, 113. (Lehmer told stories
about the reaction of the machinist when Lehmer asked him to make
gears with 37 teeth, 59 teeth, etc.) A hole was drilled at the base of
each tooth to represent the residue classes modulo each $p_i$. He used
toothpicks to plug the holes for unacceptable residues; the holes for
the $s_{ij}$ were left open. The 30 gears had a common line of tangency
so that a light could shine through the holes not plugged. The gears
were rotated from some initial position. When a light ray shone
through all the gears, a photoelectric cell would detect it and halt
the gears. See Figure 4. Then a human could discover the value of
$a$ with help from a counter. This mechanical device processed 5000
values of $a$ per second in 1932, faster than any machine could execute
Algorithm 9.2 until it was programmed on the IBM 7094 computer
in the early 1960s. The photoelectric sieve factored many numbers
from the Cunningham Project, including

$(2^{93} + 1)/(3 \cdot 3 \cdot 715827883) = 529510939 \cdot 2903110321.$
Figure 3. Lehmer’s bicycle chain sieve, 1927.

D. N. Lehmer wrote a whimsical account [Leh33b] of the operation
of his son’s photoelectric sieve. He begins as follows:
On the 19th of October a little group of mathemati-
cians gathered in the Burt Laboratories in Pasadena,
California, around a mysterious little machine to
watch it attack a problem in mathematics. It was a
simple enough problem to state. It had only to find
two numbers which when multiplied together would
give 5,283,065,753,709,209. Any person with a few
hundred years of leisure time on his hands could work
it out.
“Here we step out into the unknown!” said the
young inventor as he threw the switch and set whirl-
ing a complicated mass of gears. It was of no use
for the human eye to try to make anything out of
the rapidly rotating wheels. One might as well try
to identify the teeth on a buzz-saw. Besides it was
quite unnecessary, for a fixed, unwinking eye was
turned on the machine waiting for a ray of light to
slip through certain holes in the wheels, which should
be the signal for it to stop the motion and to gather
in the solution of the baffling problem in the theory
of numbers.

Figure 4. Lehmer’s photoelectric sieve, 1932.

D. N. Lehmer explains how this machine factored the 16-digit
number $N$ by finding two representations of it as $N = x^2 + 7y^2$, as
in Theorem 5.20. He reports the factors. After a quotation from
Macaulay’s Horatius, Lehmer continues:
And after all we had taken only an important out-
work in the assault upon a real fortress. This vic-
tory had merely cleared the decks for action against
another and much larger number which was under
grave suspicion of being a prime; that is, not the
product of any two smaller numbers. This number is
the great unconquered factor of $2^{95} + 1$. It is the nineteen
digit number 3,011,347,479,614,249,131. A very
powerful test invented some three hundred years ago
by a French jurist, Fermat, had failed to show the
character of this number; whether it was prime or
composite. A more delicate test discovered some five
years ago by the inventor of this machine must be
applied to it; but this test demanded the knowledge
of the factors of the sixteen digit number which the
machine had just been examining. Now that this job
was finished the advance on the main citadel was easy
and in a few more minutes of work the big nineteen
digit number was branded for all time as a prime;
one of the vast undivided and indivisible sums of the
first magnitude.

In fact, $2^{95} + 1 = 3 \cdot 11 \cdot 2281 \cdot 174763 \cdot P19$,
where $P19$ is the 19-digit number in the quotation. Fermat’s Little
Theorem 2.23 showed that this number is probably prime. To prove that
$P19$ is prime by Theorem 3.27, one needs all the prime factors of
$P19 - 1 = 2 \cdot 3 \cdot 5 \cdot 19 \cdot N$, where $N$ is the
16-digit number factored earlier by the photoelectric sieve.
Dick Lehmer also built a sieve in 1936 using 16mm movie film
with holes to represent the $s_{ij}$. See Figure 5. It factored the
number

$\Phi_{240}(2) = \frac{(2^{120} + 1)(2^8 + 1)}{(2^{24} + 1)(2^{40} + 1)} = 394783681 \cdot 46908728641.$

In 1937, Gérardin [Gér37] built a sieve, but little is known about it.
Figure 5. Lehmer’s movie film sieve, 1934.

Dick Lehmer and Paul Morton built the Delay Line Sieve DLS-127
in 1965. A delay line is a circuit that transmits an electric pulse
at a constant speed. The DLS-127 had 31 delay lines with lengths
proportional to powers of the 31 primes up to 127. The pulses in the
$i$-th delay line circulated (mod $p_i$) and were initialized to
represent the residues $s_{ij}$. As the pulses passed a solution tap,
their logical AND was computed, and the current value of $a$ was saved
when all 31 values were 1, meaning that a solution was detected. This
machine processed 1000000 values of $a$ per second and factored many
numbers. Later, six more delay lines were added to the DLS-127 to
create the DLS-157. A computer program read the number $N$ to factor
and computed the initial values $s_{ij}$ for these sieves. Another
program checked each solution $a$ to see whether it led to a factor of
$N$.
Any device that executes Algorithm 9.2 can find solutions to any
quadratic form, not just squares. These sieves were used to factor
numbers by the Lehmers’ method of Section 5.5.
Modern sieves use shift registers with multiple taps to parallelize
solution detection. Machines built by Williams and his associates
[WP83], and by others, process $10^8$ to $10^9$ values of $a$ per
second. See Sections 7.3, 8.2, 8.3, and 16.1 of Williams [Wil98] for
much more about the history of sieve devices.
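The idea behind the delay-line and shift-register sieves, one bit stream per modulus combined by a logical AND, can be imitated in software by using a large integer as a bit vector. The sketch below is our own model of that idea, not a description of any particular machine; it reuses the moduli and residues of Example 9.1.

```python
def residue_mask(low, high, p, allowed):
    """Bit (a - low) is 1 exactly when a mod p is an allowed residue."""
    mask = 0
    for a in range(low, high + 1):
        if a % p in allowed:
            mask |= 1 << (a - low)
    return mask

def parallel_sieve(low, high, moduli, residue_sets):
    """AND the per-modulus bit patterns, then read off the surviving positions."""
    combined = (1 << (high - low + 1)) - 1
    for p, allowed in zip(moduli, residue_sets):
        combined &= residue_mask(low, high, p, allowed)
    return [low + i for i in range(high - low + 1) if (combined >> i) & 1]

# Data of Example 9.1: moduli 4, 3, 5, 7 and their acceptable residues.
moduli = [4, 3, 5, 7]
residue_sets = [{0}, {0}, {1, 4}, {1, 3, 4, 6}]
print(parallel_sieve(21, 29, moduli, residue_sets))    # -> [24]
```

Each AND processes every candidate $a$ in the interval at once, which is the software analogue of several taps examining the streams in parallel.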
An even faster sieve of this type could use $k$ lasers, one for
each modulus $p_i$, to generate streams of light pulses to represent
the residues $s_{ij}$. An optical sensor would observe the streams and
report $a$ whenever it saw a pulse from each laser simultaneously. A
machine like this in theory could process more than $10^{15}$ values of
$a$ per second. When I last checked on the feasibility of constructing
such a device, the problem was that it is hard for an optical sensor to
distinguish between pulses from all $k$ lasers and from only $k - 1$ or
$k - 2$ of them. If it reports $a$ whenever it sees at least $k - 1$
pulses together, there will be very many false solutions, perhaps too
many for its computer to handle (see footnote 1).
Another application of sieve devices is to compute pseudosquares.
These are positive integers that behave like squares modulo all of the
first few primes but which are not square.
Definition 9.3. Let $p$ be an odd prime. The pseudosquare $L_p$ is
the smallest positive nonsquare integer with $L_p \equiv 1 \pmod 8$ and
the Legendre symbol $(L_p/q) = +1$ for all odd primes $q \le p$.
Example 9.4. The first few pseudosquares (and some larger ones)
are shown in Table 1.

The pseudosquares have been computed further than the
pseudoprimes or strong pseudoprimes to base 2. They have been used in
primality tests for relatively small numbers, as in this theorem, which
is Theorem 16.2.6 of Williams [Wil98].
Theorem 9.5. Let $m > 2$ be a positive integer and let $p > 2$ be
prime. Suppose $m$ has no prime factor $\le B$, $m/B < L_p$, and for
every prime $q$ in $2 \le q \le p$, we have
$q^{(m-1)/2} \equiv \pm 1 \pmod{m}$. In case $m \equiv 5 \pmod 8$,
assume that $2^{(m-1)/2} \equiv -1 \pmod{m}$. In case
$m \equiv 1 \pmod 8$, assume that for some prime $q$ in
$2 < q \le p$, we have $q^{(m-1)/2} \equiv -1 \pmod{m}$.
Then $m$ is either prime or a prime power.
Footnote 1. A friend informs me that new developments in laser
technology may solve this problem.

Table 1. Some pseudosquares $L_p$.

  p     L_p
  3     73
  5     241
  7     1009
 11     2641
 13     8089
 17     18001
 19     53881
 53     22000801
101     10310263441
193     2854909648103881
241     2327687064124474441
277     69848288320900186969
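The small entries of Table 1 can be recomputed straight from Definition 9.3. The sketch below is a naive search of our own (the records in the table were found with sieve devices, not this way); it tests candidates $1 \pmod 8$ one at a time with Euler's criterion.

```python
from math import isqrt

def is_residue(n, q):
    """True if the Legendre symbol (n/q) is +1 for an odd prime q."""
    return pow(n, (q - 1) // 2, q) == 1

def pseudosquare(odd_primes):
    """Smallest nonsquare L ≡ 1 (mod 8) with (L/q) = +1 for every q given."""
    L = 17                                    # 1 and 9 are squares
    while True:
        if isqrt(L) ** 2 != L and all(is_residue(L, q) for q in odd_primes):
            return L
        L += 8

odd_primes = [3, 5, 7, 11, 13, 17, 19]
table = {p: pseudosquare(odd_primes[: i + 1]) for i, p in enumerate(odd_primes)}
print(table)
```

The search becomes hopeless for larger $p$ because $L_p$ grows quickly, which is exactly where the sieve devices of this section earn their keep.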

If one performs Trial Division on $m$ up to about $B = 10^{10}$ and
uses $L_p = L_{277}$, then Theorem 9.5 gives a reasonably efficient
primality test for integers $m < 10^{30}$.
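A scaled-down version of this test can be written as follows. The sketch checks the hypotheses of Theorem 9.5 as stated above, with the small parameters $p = 19$, $L_{19} = 53881$ (from Table 1) and $B = 100$ chosen by us for illustration; the function names are hypothetical. A return value of True means only that $m$ is a prime or a prime power.

```python
def primes_up_to(n):
    """Primes <= n by the Sieve of Eratosthenes."""
    flags = [True] * (n + 1)
    flags[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            flags[i * i :: i] = [False] * len(flags[i * i :: i])
    return [i for i, is_p in enumerate(flags) if is_p]

def passes_theorem_9_5(m, B, p, L_p):
    """True if m meets every hypothesis of Theorem 9.5."""
    if m <= B or m % 2 == 0 or m >= B * L_p:
        return False                          # outside the range covered here
    if any(m % q == 0 for q in primes_up_to(B)):
        return False                          # has a prime factor <= B
    powers = {q: pow(q, (m - 1) // 2, m) for q in primes_up_to(p)}
    if any(v not in (1, m - 1) for v in powers.values()):
        return False
    if m % 8 == 5:
        return powers[2] == m - 1
    if m % 8 == 1:
        return any(v == m - 1 for q, v in powers.items() if q > 2)
    return True

print(passes_theorem_9_5(1000003, 100, 19, 53881))    # 1000003 is prime
print(passes_theorem_9_5(1000001, 100, 19, 53881))    # 1000001 = 101 * 9901
```

With these parameters the test covers odd $m$ between 100 and $100 \cdot 53881$.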

9.2. Special Computers


In the 1980s, Smith and Wagstaff [SW83], [WS87] fabricated an
Extended-Precision Operand Computer (EPOC) for factoring large
integers by the Continued Fraction Factoring Algorithm 6.22. It had
a 128-bit wide main processor with a bit-slice architecture to
generate the $A_i$ and $Q_i$, and 16 simple remaindering units (the
“Mod Squad”) to divide a $Q_i$ by 16 different primes $p_j$ in
parallel, finding only the remainders. As each $Q_i$ was generated, it
was loaded into a wide shift register. Sixteen trial divisors $p_j$
were loaded into registers of simple arithmetic-logic units (ALUs).
While the main processor generated the next $Q_i$ and $A_i$, the
current $Q_i$ was shifted out of its register, one bit at a time, and
broadcast to the Mod Squad of simple ALUs. Each simple ALU computed the
remainder $Q_i \bmod p_j$ by subtraction and conditional load. When the
remaindering was complete, the Mod Squad returned a vector of 16 bits
to tell which remainders were 0. Then a new set of trial divisors was
loaded into the ALUs and the process repeated. The main processor used
the report vectors to determine whether $Q_i$ was smooth enough for the
relation to be saved. A personal computer connected to the main
processor stored the smooth relations. The linear algebra was done on a
larger computer, an IBM 370.
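The remaindering scheme is easy to model in software. The sketch below is our own simulation of the idea (bits broadcast most significant first, each unit doing a shift, a subtraction, and a conditional load); it is not the EPOC's actual microcode, and it assumes the bits are sent high order first.

```python
def bit_serial_remainders(q_value, divisors):
    """Reduce q_value modulo each divisor, consuming q_value one bit at a time."""
    remainders = [0] * len(divisors)
    for shift in range(q_value.bit_length() - 1, -1, -1):
        bit = (q_value >> shift) & 1           # bit broadcast to every unit
        for j, p in enumerate(divisors):
            r = 2 * remainders[j] + bit        # shift in the new bit
            if r >= p:                         # subtract and conditionally load
                r -= p
            remainders[j] = r
    return remainders

def report_vector(q_value, divisors):
    """One flag per divisor: True where the remainder is 0."""
    return [r == 0 for r in bit_serial_remainders(q_value, divisors)]

print(report_vector(2 * 3 * 3 * 11 * 101, [2, 3, 5, 7, 11, 13]))
```

One subtraction per bit suffices because each running remainder stays below its divisor, so doubling it and adding a bit gives a value below twice the divisor.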

Example 9.6. This special computer factored several integers
including the 62-digit Cunningham number

3,204+ $= \Phi_{408}(3) = \frac{(3^{204} + 1)(3^4 + 1)}{(3^{12} + 1)(3^{68} + 1)}.$

Smith and Wagstaff announced the EPOC on January 6, 1983,
at the AMS annual meeting in Denver, Colorado. It was not
finished until the summer of 1984, when it factored $7^{135} - 1$. At several
mathematics conferences in 1983 and early 1984, the author told how
great the machine would be when it was finished. The audiences were
skeptical. He promised that he would show the first factors found by
the machine at the Illinois Number Theory Conference in Normal,
Illinois, on April 14, 1984. In early 1984, a hamburger chain ran a
television commercial in which an elderly lady received a hamburger
with an oversized bun and a small patty from a competitor. The lady
exclaimed, “Where’s the beef?” When the author began his April 14
talk in Normal by explaining that the EPOC had not quite finished
factoring its first number, the audience began chanting, “Where’s the
beef?”
Pomerance, Smith, and Tuler [PST88] designed Quasimodo, the
Quadratic Sieve Motor. It collected relations for the Quadratic Sieve
and had a pipelined architecture. It was only partly debugged and
never worked.
Since 1990, personal computers have been widely available. They
are often used for factoring. One may run the Elliptic Curve Method
on each of many personal computers, using different random param-
eters for the curves. Suyama [Suy81] was one of the first to factor
numbers using a personal computer. He started this endeavor in 1980
and found several factors of Fermat numbers by Trial Division and
Exercise 5.12. He also used the Pollard Rho Method, the Pollard
p − 1 Method, and later the Elliptic Curve Method to factor dozens
of other numbers from the Cunningham Project. Another use of multiple
personal computers is to distribute the sieving of the Quadratic
or Number Field Sieve on them and use one large machine for the
linear algebra. See, for example, Lenstra and Manasse [LM89] for
factoring by email.
In 1986, the Dubners [DD86] built and marketed a coprocessor
that plugged into a personal computer. It performed factorization via
several algorithms, and other number theory functions, much faster
than the personal computer could do them by itself. It found small
factors of many numbers, including the factors 1059099980653317121
of $2^{1049} - 1$ by Pollard’s Rho Method and 380623849488714809 of
$10^{142} + 1$ by the Elliptic Curve Method. The same device also set
records for finding large primes of special form: palindromic, largest
non-Mersenne, twin, and factorial: n! + 1 with n = 1477.
In 1994, Shor [Sho94] stunned the world by giving algorithms for
factoring integers and computing discrete logarithms in polynomial
time using a quantum computer. We give a brief and very oversim-
plified account of his discovery.
Certain objects (molecules, atoms, photons, subatomic particles)
may exist in two quantum states simultaneously (spin up or spin
down; left-handed or right-handed helicity, energy level, etc.). The
two states are said to be in “superposition.” (Actually, anything, even
Schrödinger’s cat, can be in a superposition of states. Atoms can be
kept reasonably uninfluenced by ambient junk and effective observa-
tions, so that they can keep their mixed state for useful amounts of
time.) When the object is “observed,” its state becomes fixed. A
1-bit register called a qubit may be built from one such object. With
its two states it represents both a 0 and a 1 bit simultaneously. When
the qubit is observed, it takes one of these two values and no longer
changes state until after the computation is done. One can combine
n qubits to build a quantum register that holds all $2^n$ n-bit integers
simultaneously.
Two objects may have their quantum states “entangled,” that is,
they are always in the same state or are always in opposite states.
When either object is observed, the states of both objects become
fixed. One can build logic gates whose input and output are entangled
qubits. For example, for an AND gate with output x and inputs a
and b, x would be in state 1 if and only if both a and b are in state
1. Since classical computers are built from logic gates, one can build
quantum computers to compute any function.
The basic idea of Shor’s quantum computer algorithm to factor an
integer N is to compute the order r of a random integer x modulo N,
that is, find the smallest positive integer r for which $x^r \equiv 1 \pmod{N}$.
If r is even, one can factor N by computing $\gcd(x^{r/2} - 1, N)$, via
Theorem 6.18. Each x has probability at least 1/2 of leading to a
factor of N.
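The classical part of this idea is easy to try on tiny numbers. The sketch below assumes the order r is found by brute force, which is exactly the step a quantum computer would replace; the function names are hypothetical.

```python
from math import gcd


def multiplicative_order(x, n):
    """Smallest r > 0 with x**r = 1 (mod n); brute force, needs gcd(x, n) == 1."""
    r, y = 1, x % n
    while y != 1:
        y = y * x % n
        r += 1
    return r


def factor_from_order(x, n):
    """Return a proper factor of n obtained from the order of x, or None."""
    g = gcd(x, n)
    if g != 1:
        return g if g < n else None   # lucky case: x already shares a factor with n
    r = multiplicative_order(x, n)
    if r % 2 == 1:
        return None                   # odd order: try another x
    f = gcd(pow(x, r // 2, n) - 1, n)
    return f if 1 < f < n else None   # a trivial gcd means this x fails


print(factor_from_order(7, 15), factor_from_order(2, 21))
```

A result of None corresponds to an unlucky choice of x, in which case another random x is tried.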
To compute r, one might put all possible r in one quantum register,
compute all possible $x^r \bmod N$ in another quantum register
entangled with the first one, and try to observe the value 1 in the
second register. This observation would fix all the qubits and one
could read the value of r from the first register. Unfortunately, this
method doesn’t quite work because one cannot “observe the value 1.”
One can only observe whatever value is in the second qubit register,
and it probably won’t be 1.
Instead, we let $q = 2^t$ be the power of 2 with $N^2 \le q < 2N^2$. Let a
quantum register with t qubits hold the superposition of all integers
a modulo q. Compute $x^a \pmod{N}$ in another quantum register.
Perform a discrete Fourier transform on the first register. This is done
by multiplying the register’s contents by a series of unitary matrices (a
rotation done by a microwave pulse or flipping an external magnetic
field). The Fourier transform changes the probability distribution of
the values in the first register so as to focus on periodicity in the
second register, that is, values that make $x^a \bmod N$ repeat, passing
through the value 1. Now observe the qubits in the second register.
With high probability, the second register will contain 1 and the first
register will contain an integer c with
$$\left| \frac{c}{q} - \frac{d}{r} \right| \le \frac{1}{2q},$$
for some integer d. Since $q \ge N^2$, at most one fraction d/r with
r < N can satisfy this inequality, by Theorem 6.7. Using a classical
computer, expand c/q in a simple continued fraction, computing the
convergents $A_k/B_k$ by Theorem 6.5. One of these convergents will
have $B_k = r$. Test each $B_k$ for $x^{B_k} \equiv 1 \pmod{N}$, and one of them
will work. As the $B_k$ grow exponentially with k, and r < N, the
classical computer must try at most O(log N) of them.
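The continued fraction step can be checked by hand on small numbers. This is a minimal sketch, assuming the measured value c is simply supplied as input; the function names are hypothetical. In the example, N = 21, x = 2 has order r = 6, q = 512 and c = 85 is the integer nearest to q/6.

```python
def convergent_denominators(c, q):
    """Yield the denominators B_k of the convergents of c/q."""
    num, den = c, q
    b_prev2, b_prev1 = 1, 0                  # B_{-2} = 1, B_{-1} = 0
    while den:
        a = num // den                       # next partial quotient
        num, den = den, num - a * den
        b = a * b_prev1 + b_prev2            # B_k = a_k * B_{k-1} + B_{k-2}
        yield b
        b_prev2, b_prev1 = b_prev1, b


def recover_order(c, q, x, n):
    """Return the first convergent denominator B < n with x**B = 1 (mod n), or None."""
    for b in convergent_denominators(c, q):
        if b < n and pow(x, b, n) == 1:
            return b
    return None


print(list(convergent_denominators(85, 512)), recover_order(85, 512, 2, 21))
```

The sketch returns None when no denominator passes the test, in which case the measurement would be repeated.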
The quantum algorithm for discrete logarithms uses three quan-
tum registers and is slightly more complicated. See Shor [Sho99] for
details of both algorithms. Vandersypen et al. [VSB+01] fabricated
a 4-qubit quantum computer and used it to factor the 4-bit number
15. Xu et al. [XZL+12] built a liquid-crystal NMR quantum processor
and used it to factor the 8-bit number 143. It is not clear whether
it is possible to build a quantum computer with enough qubits to
perform serious computing. The technical challenge is decoherence,
in which ambient energies perform effective measurements before the
computation is ready for them.
In 1994 Adleman [Adl94] used DNA computing to solve in polynomial
time an instance of the NP-complete Hamiltonian path² problem.
Later, Lipton [Lip95] found a way to use DNA computing to
solve in polynomial time the satisfiability problem, which is also
NP-complete. These results show that any problem in class NP can be
solved in polynomial time using DNA. Since factoring integers is in
class NP, there must be a polynomial-time algorithm for it with DNA
computing. There is a nice description of how to do this by Chang
et al. [CGH05]. Here is a rough summary of their method. DNA
consists of long molecules made up of strands of four types of elemen-
tary pieces called A, C, G, T, for the first letters in their chemical
names. Binary numbers can be encoded in DNA strands. Biologists
can easily perform these simple operations on test tubes (or simply
tubes):

(1) Extract. Given a tube P and a strand S of DNA, produce
two tubes: +(P, S), which contains all strands in P having
S as a substrand, and −(P, S), which contains all strands in
P not having S as a substrand.

² The Hamiltonian path problem is to determine whether a graph has a path that
visits each node exactly once. NP-complete means that the problem is as hard as any
problem in class NP.
(2) Merge. Given two tubes P1 and P2 , form the tube ∪(P1 , P2 )
containing all DNA strands in either P1 or P2 . Pour the two
tubes into one.
(3) Detect. Given a tube P , answer “yes” if P contains at least
one DNA strand and answer “no” if P is empty.
(4) Copy. Given a tube P , create a new tube P1 identical to P .
(5) Append. Given a tube P and a DNA strand S, append S
to the end of each DNA strand in P .
(6) Read. Given a tube P , print the sequence of elementary
pieces of one DNA strand in P .
One can also delete and rename tubes.
Here is an example of an algorithm that generates a tube containing
DNA representing all $2^k$ k-bit numbers. In the algorithm, 0 and
1 mean DNA strings representing these bits. The algorithm begins
with a tube containing both 0 and 1. It copies it and appends 0 to
each DNA strand in the first tube, creating a tube with DNA 00 and
10. It appends 1 to each DNA in the second tube, creating a tube
with DNA 01 and 11. It merges these two tubes and repeats this
process until a tube containing all k-bit numbers is formed.
Algorithm 9.7. Create a tube with all k-bit numbers.
Input: An integer k > 1.
Create a tube P containing 0 and 1
for (i ← 2 to k) {
copy (P , P1 )
append (P , 0)
append (P1 , 1)
P2 ← ∪(P, P1 )
delete P and P1
rename P2 as P
}
Output: Tube P contains all k-bit numbers.
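
Algorithm 9.7 can be simulated on a conventional computer. The sketch below assumes a tube is modeled as a Python set of bit strings, so copy, append, and merge become set operations; the function name is hypothetical.

```python
def all_k_bit_strings(k):
    """Simulate Algorithm 9.7: return a 'tube' holding all 2**k k-bit strings."""
    tube = {"0", "1"}                          # tube P containing 0 and 1
    for _ in range(2, k + 1):
        tube_copy = set(tube)                  # copy(P, P1)
        tube = {s + "0" for s in tube}         # append(P, 0)
        tube_copy = {s + "1" for s in tube_copy}   # append(P1, 1)
        tube = tube | tube_copy                # merge, then rename the result as P
    return tube


print(sorted(all_k_bit_strings(3)))
```

The loop runs k − 1 times, while the tube doubles in size on every pass, which mirrors the trade of few operations for many strands.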

Using the simple operations on tubes, one can define more com-
plicated procedures to add numbers, multiply numbers, and compare
numbers. To factor an integer N that is the product of two k-bit
primes, load a tube with all k-bit primes (or all k-bit numbers), mul-
tiply every pair of them with each product labeled by its two factors,
compare each product with N (bit by bit with extract), and read a
DNA strand that contains N . The two factors of N are also part of
this strand. According to Chang et al. [CGH05], a tube can hold
up to $10^{18}$ DNA molecules. The number of elementary biological operations
needed to factor the product N of two k-bit primes is only
$O(k^3)$ steps, which is polynomial time. However, the total number of
DNA strands needed to factor N is $O(2^k)$, so k is limited to about 64.
Since it is easy to factor an integer $N < 2^{128}$ with the Elliptic Curve
Method or the Quadratic or Number Field Sieve, DNA computing
with known methods is not more powerful than conventional algo-
rithms. The problem is that the known algorithm for DNA factoring
is Trial Division, which is slow, even though highly parallel. This
negative assessment would improve if someone could find a way to
perform a faster factoring algorithm on a DNA computer.
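The logic of the DNA method is plain exhaustive search, as a small simulation makes clear. This is a rough sketch, assuming a tube is modeled as a Python list of labeled products and that all numbers from 2 up to $2^k - 1$ are loaded; it is not the procedure of Chang et al., and the function name is hypothetical.

```python
def dna_style_factor(n, k):
    """Search all pairs of numbers below 2**k for a pair whose product is n."""
    limit = 1 << k
    # "Multiply every pair," labeling each product with its two factors.
    tube = [(a, b, a * b) for a in range(2, limit) for b in range(a, limit)]
    # "Extract" the strands whose product equals n.
    matches = [(a, b) for a, b, product in tube if product == n]
    # "Detect" whether the tube is empty, then "read" one strand.
    return matches[0] if matches else None


print(dna_style_factor(143, 4))
```

On a conventional computer the list has about $2^{2k-1}$ entries, so this sketch is only usable for very small k.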
Shamir [Sha99] proposed an optoelectronic device called Twinkle
for the Number Field Sieve. It stands for The Weizmann INstitute
Key Locating Engine. It performs the sieve step for the Quadratic or
Number Field Sieve one candidate number at a time. Each prime in
the factor base is represented by a light-emitting diode (LED). The
LED for p glows once every p time units and its intensity is propor-
tional to log p. An optical sensor monitors the combined brightness of
all the LEDs and reports the current value of a counter whenever the
total intensity exceeds a threshold. In effect, the device adds the log-
arithms of all the prime divisors of the candidate number and reports
the B-smooth ones (those with large enough sum of logarithms). This
is the way the Quadratic and Number Field Sieves work on regular
computers, adding the logarithms of the prime divisors as scaled bi-
nary integers. Only a crude prototype was built; it never factored
any serious number.
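The summing of logarithms can be mimicked in a few lines. The following is a minimal sketch, assuming the candidate at time t is t itself and that each prime contributes log p only once (prime powers are ignored); the function name and the threshold are hypothetical.

```python
from math import log


def twinkle_hits(max_time, factor_base, threshold):
    """Report the times t in 1..max_time at which total brightness reaches threshold."""
    hits = []
    for t in range(1, max_time + 1):
        # The LED for p glows at time t exactly when p divides t.
        brightness = sum(log(p) for p in factor_base if t % p == 0)
        if brightness >= threshold:
            hits.append(t)
    return hits


print(twinkle_hits(250, [2, 3, 5, 7], log(200)))
```

With the factor base {2, 3, 5, 7} and threshold log 200, only t = 210 = 2 · 3 · 5 · 7 is reported in this range.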
Shamir and Tromer [ST03] designed the TWIRL in 2003. This
acronym stands for The Weizmann Institute Relation Locator. It is
a hypothetical hardware device that performs the sieving step of the
Number Field Sieve in a highly parallel fashion, using optical sensors
to sum the logarithms of the prime factors of the candidate numbers.
It handles small, medium, and large primes differently. It would cost a
few million dollars to build a TWIRL capable of factoring a 1024-bit
composite number.
Lenstra et al. [LTS+03] estimate the parameters and complexity
for factoring a 1024-bit number with the Number Field Sieve using
special hardware. They extrapolate these values from actual factor-
izations of smaller numbers and from experiments in which they per-
formed a tiny fraction of the sieving for a 1024-bit number. A major
source of uncertainty in these estimates lies in the matching behavior
of the large primes.
Geiselmann and Steinwandt proposed two special devices to speed
sieving in the Number Field Sieve: DSH [GS03] in 2003 and YASD
[GS04] in 2004. In 2005, Franke, Kleinjung, Paar, Pelzl, Priplata,
and Stahlke [FKP+05] proposed the Shark, another device for the
NFS sieving. None of these devices have actually been built.

Exercise

9.1. Suppose a laser sieve could be built and it could process $10^{15}$
values of a per second. How large a number N could it factor
reliably by Algorithm 9.2?
Chapter 10

Theoretical and
Practical Factoring

I assume that we are content to do it, & do not need
to prove before we start that we shall do it.
A.O.L. Atkin, email, 1992

Introduction
Oliver Atkin was referring to expressing a positive integer as a sum of
two squares, but the remark applies equally to factoring. Some the-
oretical mathematicians and computer scientists want to prove that
an algorithm works correctly in all cases. Some algorithms, like the
fastest ones discussed in this book, do not have this property. That
is, there is no proof that they always succeed. However, they have
worked on every number tried, which is thousands or even millions
of numbers. Practical mathematicians and computer scientists like
Oliver Atkin take the view that these algorithms are perfectly good.
They would say that once a number has been factored, it does not
matter whether you could have proved before the algorithm started
that it would factor the number. Theoretical folks argue that we do
not fully understand an algorithm until we can prove when it works
and when it fails. The world needs both theoretical and practical
people.

After considering some theorems about the complexity of factoring,
this chapter describes arithmetic with large numbers, some
programs available on the Internet that factor integers, and other
practical aspects of factoring actual numbers. It discusses tricks to
factor certain kinds of numbers, such as pseudoprimes, Carmichael
numbers, and secret numbers in zero-knowledge proofs. There is a
short cryptanalysis of the RSA cipher, part of which involves lat-
tices. Finally, we speculate about the future of factoring and suggest
possible new ways to factor large numbers.

10.1. Theoretical Factoring


Unlike Atkin, some mathematicians want to prove rigorously that an
algorithm for factoring N always works and runs in a specified time.
Of course, Trial Division described in Section 5.1 has this property
and factors N in $O(N^{1/2})$ steps. Lehman’s algorithm described in
Section 5.4 factors N in $O(N^{1/3+\varepsilon})$ steps. There is an algorithm of
Pollard [Pol74] and Strassen [Str77] (see Section 5.5 of [CP05]) that
uses fast polynomial evaluation to factor N in $O(N^{1/4+\varepsilon})$ steps. All
these algorithms are rigorous and deterministic; that is, they do not
choose random numbers and one can prove before they begin that
they will succeed in a specified number of steps. Shanks [Sha71] invented
a fast algorithm¹ for factoring N while computing the class
number of primitive binary quadratic forms of discriminant N. This
complicated algorithm runs in $O(N^{1/4+\varepsilon})$ steps but can be modified
to factor N in $O(N^{1/5+\varepsilon})$ steps, assuming the Extended Riemann Hypothesis.
Algorithms like Shanks’s $O(N^{1/5+\varepsilon})$ one, the CFRAC, and
the Quadratic and Number Field Sieves are fast and deterministic,
but the proofs of their running times depend on heuristic (unproved
but plausible) hypotheses, so they are not rigorous.
A probabilistic algorithm² for factoring integers uses random numbers
and may or may not factor the number, depending on the random
choices it makes. For a given input, it may perform different steps
each time it is invoked and may produce different factorizations or
none at all. For example, the Elliptic Curve Method of Section 7.2
chooses a random elliptic curve and a random point on it and then
works deterministically to compute a large multiple of the point. For
some choices it will factor the input number, perhaps finding different
factors (say, if N has two prime factors of about the same size).

¹ This algorithm is different from the SQUFOF, which he invented later.
² Technically, a Las Vegas probabilistic algorithm.

A probabilistic algorithm may or may not be rigorous. The Ellip-
tic Curve Method is probabilistic and not rigorous because it assumes
without proof that, with a certain probability ($u^{-u}$ in formula (3.1)),
there is an elliptic curve having B-smooth size in the Hasse interval.
The ability to make random choices is a powerful tool that can make
a probabilistic algorithm faster than a deterministic algorithm for the
same job. In order to say that a probabilistic algorithm for factoring
is rigorous, there must be a proof without unproved assumptions that
it will factor the input number N with positive probability in a speci-
fied number of steps. It turns out that there is a rigorous probabilistic
algorithm for factoring integers with a subexponential running time.
Dixon [Dix81] invented such an algorithm in 1981.
Dixon’s algorithm for factoring N has two parameters, n and v, to
be specified later. It begins by choosing a set S of n random integers
between 1 and N. The remainder of the algorithm is deterministic.
Call the algorithm $A_S$. Let the factor base consist of all primes < v.
For each z ∈ S, try to factor $w = (z^2 \bmod N)$ using only the primes
in the factor base. If you succeed in factoring w completely, save
the relation $z^2 \equiv$ (the product of the factors of w) (mod N). After
the factoring is finished, combine the relations using linear algebra
modulo 2 as in the Quadratic Sieve. The method used to factor the
numbers w doesn’t matter; even Trial Division would be fast enough
for the theorem.
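The two halves of the method, collecting smooth relations and combining them, can be tried on a toy number. This is a minimal sketch, assuming the relations to combine are chosen by hand rather than by linear algebra modulo 2; the function names and the example N = 1649 are illustrative choices, not taken from Dixon's paper.

```python
from math import gcd


def smooth_exponents(w, primes):
    """Trial-divide w over the factor base; return its exponent vector or None."""
    if w == 0:
        return None
    exps = []
    for p in primes:
        e = 0
        while w % p == 0:
            w //= p
            e += 1
        exps.append(e)
    return exps if w == 1 else None          # None: w is not smooth


def combine(zs, n, primes):
    """Combine relations for the given z's whose exponent vectors sum to an even vector."""
    x, total = 1, [0] * len(primes)
    for z in zs:
        x = x * z % n
        exps = smooth_exponents(z * z % n, primes)
        total = [t + e for t, e in zip(total, exps)]
    if any(t % 2 for t in total):
        raise ValueError("exponent vectors do not sum to an even vector")
    y = 1
    for p, t in zip(primes, total):
        y = y * pow(p, t // 2, n) % n        # square root of the right-hand side
    return gcd(x - y, n)                     # x^2 = y^2 (mod n)


# 41^2 = 32 = 2^5 and 43^2 = 200 = 2^3 * 5^2 (mod 1649); the product is 80^2.
print(combine([41, 43], 1649, [2, 3, 5]))
```

Here x = 41 · 43 mod 1649 = 114 and y = 80, so gcd(114 − 80, 1649) = 17 and 1649 = 17 · 97.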
Let
$$L(N) = \exp\left(\sqrt{(\ln N) \ln \ln N}\right).$$
Dixon proved this theorem about the set of algorithms
$\{A_S : S \subseteq [1, N],\ \mathrm{size}(S) = n\}$.

Theorem 10.1 (Dixon). Let N be an odd integer with at least two
different prime factors. Let $v = L(N)^{\sqrt{2}}$ and $n = v^2$. Then the
average number of steps taken in the execution of algorithm $A_S$ is
$L(N)^{3\sqrt{2}}$. The probability that algorithm $A_S$ fails to factor N is
proportional to $L(N)^{-\sqrt{2}}$, uniformly in N.

The theorem says that when N is large, almost all of the algorithms
$A_S$ with correct parameters n and v will factor N successfully
and that all $A_S$ run in subexponential time $L(N)^{3\sqrt{2}}$.
There is a more complicated algorithm due to Lenstra and Pomer-
ance [LP92] that factors N with the rigorously proved time bound
of L(N ) steps. The algorithm uses class groups of binary quadratic
forms and is currently the fastest known integer factoring algorithm
with rigorous expected time bound.
Although they have rigorous proofs of their time complexities,
the algorithms mentioned above are much slower than the Quadratic
and Number Field Sieves.
The RSA cryptosystem and several other cryptographic algo-
rithms depend on factoring integers being a hard problem. Can we
prove that factoring integers is hard?
One must phrase this question carefully. Suppose N has a million
decimal digits and the low-order digit is 6. Then obviously 2 is a prime
factor of N . One can trivially “factor” N as 2 · (N/2).
Here is one way to ask the question. As a function of N , what
is the minimum, taken over all integer factoring algorithms, of the
maximum, taken over all integers M between 2 and N , of the num-
ber of steps the algorithm takes to factor M ? The integer factoring
algorithm has input M and outputs a list of all prime factors of M .
Integer factoring would be a polynomial-time problem if the answer
were $O((\log N)^c)$ for some constant c.
The model of computation, that is, what constitutes a “step,”
probably matters a lot in answering the question. Suppose we decide
that a single arithmetic operation (+, −, ×, ÷) is one step. Assume
that integers are stored in binary notation and that each register
(memory location) may hold one integer of any size. Shamir [Sha79]
proved that, in this model, one can factor N in O(log N ) steps. We
give a simplified version of his algorithm that runs in $O(\log^3 N)$ steps.
Assume N > 4.
First, it is enough to find a single proper factor of N. If f divides
N and 1 < f < N, then apply the algorithm recursively to f and
N/f to obtain the complete prime factorization of N.
Suppose now that we could compute k! quickly. Let m be the
smallest positive integer for which N divides m!. We can find m by a
binary search of the interval 1 ≤ m ≤ N using O(log N ) evaluations
of m!. Then N is prime if and only if m = N . It is easy to show that
f = gcd(m, N ) is a proper factor of N if N is composite. We can
compute the greatest common divisor in O(log N ) steps by Theorem
2.3.
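The binary search can be illustrated on small numbers. The sketch below assumes m! is computed modulo N by a plain loop, which takes about m multiplications per evaluation and so does not achieve the step count claimed in Shamir's model; it only shows the logic. The function names are hypothetical.

```python
from math import gcd


def factorial_mod(m, n):
    """Return m! mod n by a simple loop (a stand-in for a fast evaluation of m!)."""
    r = 1
    for j in range(2, m + 1):
        r = r * j % n
    return r


def split_by_factorials(n):
    """For n > 4: return a proper factor of n, or None if n is prime."""
    lo, hi = 1, n                      # n always divides n!
    while lo < hi:                     # binary search for the least m with n | m!
        mid = (lo + hi) // 2
        if factorial_mod(mid, n) == 0:
            hi = mid
        else:
            lo = mid + 1
    m = lo
    return None if m == n else gcd(m, n)


print(split_by_factorials(91), split_by_factorials(13))
```

The search is valid because divisibility is monotone: if N divides m!, then N divides (m + 1)! as well.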
Now we show how to compute k! in $O(\log^2 k)$ steps. The formulas
$k! = k \cdot (k - 1)!$ when k is odd and $k! = \binom{k}{k/2} \left((k/2)!\right)^2$ when k is even
reduce the problem of computing k! to computing (k/2)! and $\binom{2m}{m}$, where m = k/2.
We compute (k/2)! recursively in $O(\log^2 k)$ steps.
To compute $\binom{2m}{m}$, consider the identity
$$(2^i + 1)^{2m} = \sum_{j=0}^{2m} \binom{2m}{j} 2^{i \cdot j}.$$
Writing in binary, each term $\binom{2m}{j} 2^{i \cdot j}$ is just $\binom{2m}{j}$ shifted left ij bits.
If i is large enough, each $\binom{2m}{j}$ will occupy a separate block of i bits,
so it can be isolated. Since $\binom{2m}{j} \le 2^{2m}$, i = 2m is large enough. The
low-order k bits of x are $x \bmod 2^k = x - \lfloor x/2^k \rfloor \cdot 2^k$. To isolate bits
$k_1 + 1$ through $k_2$ of x, subtract the lower $k_1$ bits of x from the lower
$k_2$ bits of x and divide by $2^{k_1}$.
Finally, $(2^i + 1)^{2m}$ may be computed in O(log m) steps by Fast
Exponentiation. Combining all these results demonstrates that we
can compute k! in $O(\log^2 k)$ steps.
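The bit-block trick is easy to verify with Python's unbounded integers. This is a minimal sketch of the idea, using the built-in power operator in place of a hand-written Fast Exponentiation; the function names are hypothetical.

```python
from math import comb, factorial


def central_binomial(m):
    """Read binom(2m, m) out of the binary expansion of (2**i + 1)**(2m), i = 2m."""
    i = 2 * m
    x = (2**i + 1) ** (2 * m)
    lower_k2 = x % 2 ** (i * (m + 1))          # lower k2 = i(m+1) bits of x
    lower_k1 = x % 2 ** (i * m)                # lower k1 = i*m bits of x
    return (lower_k2 - lower_k1) // 2 ** (i * m)


def fact(k):
    """Compute k! with the odd/even recurrences from the text."""
    if k <= 1:
        return 1
    if k % 2 == 1:
        return k * fact(k - 1)
    m = k // 2
    return central_binomial(m) * fact(m) ** 2


print(central_binomial(3), fact(10))
```

For m = 3 the number $65^6$ written in base 64 has digits 1, 6, 15, 20, 15, 6, 1, and the middle digit is $\binom{6}{3} = 20$.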
Shamir’s result shows that we can factor N in a polynomial (in
log N ) number of steps provided that an arithmetic operation with
numbers as large as N ! counts as one step. This shows why we need
to consider the difficulty of arithmetic with large numbers when an-
alyzing number-theoretic algorithms. This complexity is the subject
of the next section.
10.2. Multiprecise Arithmetic


There is much more to arithmetic than you learned in third grade.
See Knuth [Knu81, Chapter 4] or Brent and Zimmermann [BZ10,
Chapters 1 and 2] for more about the material of this section.
Each computer has a fixed word size, usually a power of 2. The
sizes 32 bits and 64 bits are most common. Larger integers are stored
in arrays of digits in some number base B, usually a power of 2. The
bases $B = 2^{30}$ and $B = 2^{32}$ are often used on 32-bit machines.
Each positive integer n has a unique representation as $n = \sum_{i=0}^{k} d_i B^i$,
where the digits $d_i$ satisfy $0 \le d_i < B$ and $d_k > 0$.
The digits are stored in arrays. Each arithmetic operation is implemented
in a procedure. For example, this algorithm adds two integers
$X = \sum_{i=0}^{k} x_i B^i$, $Y = \sum_{i=0}^{k} y_i B^i$ of the same length to form their sum
$Z = \sum_{i=0}^{k+1} z_i B^i$. Note that the sum Z may have one more digit than
X or Y. The algorithm adds the (k + 1)-digit numbers from low order
to high order one digit at a time, with a carry from one digit position
to the next.

Algorithm 10.2. Multiprecise integer addition: Z = X + Y.
Input: Two arrays x_i, y_i of digits in base B.
carry ← 0
for i ← 0 to k {
    z_i ← x_i + y_i + carry
    if (z_i < B) { carry ← 0 }
    else { carry ← 1 ; z_i ← z_i − B }
}
z_{k+1} ← carry
Output: An array of digits z_i.
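
Algorithm 10.2 translates directly into a short program. The sketch below assumes digits are stored in Python lists with the low-order digit first; the function name is hypothetical.

```python
def mp_add(x, y, base):
    """Add two equal-length digit lists (low-order digit first) in the given base."""
    z = []
    carry = 0
    for xi, yi in zip(x, y):
        s = xi + yi + carry
        if s < base:
            carry = 0
        else:
            carry = 1
            s -= base
        z.append(s)
    z.append(carry)            # the sum may have one more digit than X or Y
    return z


print(mp_add([9, 9], [1, 0], 10))   # 99 + 1 in base 10, low-order digit first
```

The carry is never larger than 1 because two digits and a carry sum to at most 2B − 1.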

The following algorithm multiplies two integers $X = \sum_{i=0}^{k} x_i B^i$,
$Y = \sum_{j=0}^{m} y_j B^j$ to form their product $Z = \sum_{i=0}^{k+m+1} z_i B^i$. The normal
schoolboy method would multiply X times each digit $y_j$ of Y, writing
the intermediate products in a slanted parallelogram and then
adding the shifted partial products to form the overall product. On
a computer, space is saved by adding the intermediate products into
the final product as they are formed. The space for the final product
is initialized to 0 by the first for loop. The two nested for loops
multiply pairs of 1-digit numbers and add them into the final product
being formed. The variable carry remembers the carry from one
column to the next one to the left.

Algorithm 10.3. Multiprecise integer multiplication: Z = X · Y.
Input: Two arrays x_i, y_j of digits in base B.
for i ← 0 to k + m + 1 { z_i ← 0 }
for i ← 0 to k {
    carry ← 0
    for j ← 0 to m {
        t ← x_i · y_j + z_{i+j} + carry
        z_{i+j} ← t mod B
        carry ← ⌊t/B⌋
    }
    z_{i+m+1} ← carry
}
Output: An array of digits z_i.
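
The multiplication procedure can be written out as follows. This is a minimal sketch, assuming digit lists with the low-order digit first and an inner loop index j that is distinct from the outer index i; the function name is hypothetical.

```python
def mp_mul(x, y, base):
    """Multiply two digit lists (low-order digit first) in the given base."""
    k, m = len(x) - 1, len(y) - 1
    z = [0] * (k + m + 2)              # the product can need k + m + 2 digits
    for i in range(k + 1):
        carry = 0
        for j in range(m + 1):
            t = x[i] * y[j] + z[i + j] + carry
            z[i + j] = t % base
            carry = t // base
        z[i + m + 1] = carry           # final carry of this row
    return z


print(mp_mul([9, 9], [9, 9], 10))      # 99 * 99 = 9801, low-order digit first
```

The intermediate value t never exceeds $B^2 - 1$, so it fits in a double-length word on a real machine.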

Subtraction and division are performed by similar procedures.
Some of these may have several versions. For example, when division
is performed, one may require only the quotient or only the remainder
or both values. When one operand is single precision and the other
is multiprecision, the procedure is simpler and faster than when both
are multiprecision. There are tricks for multiplying and dividing large
integers that run much faster than the simple algorithms you learned
in grade school. One may square a large integer by a faster method
than that used to multiply two different integers of the same size.
Montgomery [Mon85] invented a way to perform repeated modular
multiplication, as happens during the Fast Exponentiation $n^e \bmod m$,
without doing any division. Division is the slowest arithmetic
operation.
Some of the basic arithmetic procedures may be written in assem-
bly language to take advantage of certain fast machine instructions.
This choice of language makes the procedures nonportable, but one
must consider the machine architecture anyway to choose the number
base B.
The basic arithmetic procedures all run in polynomial time in the
length of the input, and the polynomial has degree 1 or 2.
After one has the arithmetic operations programmed, one may
add other number theory procedures, such as input and output of
multiprecise integers, Fast Exponentiation, the Euclidean Algorithm,
Jacobi symbols, and other algorithms described in this book. See also
Lehmer [Leh69] for a list of some useful number theory procedures
to add to the library.

10.3. Factoring—There’s an App for That


By the time you read this, there may be apps for cell phones to
factor integers. Here we describe some of the factoring programs for
computers and give some advice for factoring integers.
Various packages for number theory are freely available on the
Internet. Computer algebra systems like Maple, Pari, and Mathe-
matica let the user do number theory (and much more) by entering
commands in a language similar to algebra. They are only moder-
ately fast at factoring large integers. Sage is an excellent system for
work with elliptic curves. It was used to draw the figures in Chap-
ter 7. Readers who wish to write fast number theory programs in a
computer language like C should use a package like the GNU multi-
precise library GMP. See Brent and Zimmermann [BZ10] for more
about number theory packages. Several very fast factoring programs
use GMP. The GMP-ECM project has the fastest Elliptic Curve
integer factoring program that I know of. There are at least two
GMP implementations of the Number Field Sieve, the GGNFS and
the CADO-NFS, q.g.³ One can also find several Quadratic Sieve programs
that use GMP on the Internet. All of these programs can be
located via Google or other search engines. Note that these programs
use more complicated versions of the algorithms than those described
in this book.
Several organized groups factor large integers by the ECM or the
NFS. The reader is encouraged to join such groups.

³ q.g. = quod googla = “which you can Google,” from the first declension Latin
verb googlare, “to Google.”
Practical tips for factoring:

(1) Don’t try to factor a prime. Always perform at least a probable
prime test on the number before starting the factoring
program. Some powerful factoring programs just assume the
input is composite.

(2) Look for small factors first. Maybe the input was mistyped or
part of it was omitted in a cut-and-paste operation.

(3) Use the algorithms appropriately. Perform some Trial Division
first. Then use the Elliptic Curve Method with small bound
for B in Algorithm 7.9. Gradually increase the bound. Do not
use the Quadratic or Number Field Sieve until simpler, faster,
tentative algorithms have been tried. Choose the parameters
for all factoring algorithms appropriately.

(4) If you have several computers available, you may parallelize
the algorithms on them. The Elliptic Curve Method and the
sieving part of the Quadratic and Number Field Sieves are easy
to run on many machines at once.

(5) Remember that algorithms like the Pollard Rho Method and
the Elliptic Curve Method are probabilistic. They may discover
a large factor before a small one. Do not expect prime factors
to be discovered in order of size.

(6) When factoring a number known to have no small factor, choose
the best algorithm for the size of the number. The Quadratic
Sieve is fastest for 50- to 100-digit numbers and the Number
Field Sieve is fastest for numbers with more than 100 digits.

(7) When a factor p is discovered, test it for primality. Some
algorithms may discover two prime factors together. See Examples
10.5 and 5.26. Use the fast Baillie-Pomerance-Selfridge-
Wagstaff probable prime test rather than the much slower
Agrawal-Kayal-Saxena test. If you need to prove that the factor
is prime, then use the Elliptic Curve Prime Proving Method
or, if you can factor p − 1, Theorem 3.27 or 3.29.
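
Several of these tips can be strung together in a small driver program. The following is a rough sketch, assuming a Miller-Rabin test with fixed bases as a stand-in for the Baillie-Pomerance-Selfridge-Wagstaff test and the Pollard Rho Method as a stand-in for the Elliptic Curve Method and the sieves; all function names and the trial division bound are hypothetical choices.

```python
from math import gcd

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n):
    """Miller-Rabin strong probable prime test to a fixed set of bases."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False               # a is a witness that n is composite
    return True


def pollard_rho(n, c=1):
    """Return a proper factor of the odd composite n (Floyd cycle finding)."""
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n
        y = (y * y + c) % n
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d if d != n else pollard_rho(n, c + 1)   # retry with a new polynomial


def factor(n, trial_bound=1000):
    """Return the sorted list of prime factors of n, with multiplicity."""
    factors = []
    for p in range(2, trial_bound):    # tips (2), (3): small factors first
        while n % p == 0:
            factors.append(p)
            n //= p
    stack = [n] if n > 1 else []
    while stack:
        m = stack.pop()
        if is_probable_prime(m):       # tips (1), (7): test before factoring
            factors.append(m)
        else:
            f = pollard_rho(m)
            stack += [f, m // f]
    return sorted(factors)             # tip (5): factors need not arrive in order


print(factor(2**64 + 1))
```

A serious program would replace each stage with the stronger methods named in the tips and would choose the bounds to suit the size of the input.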
10.4. Dirty Tricks


In this section we consider various ways to factor a large integer with
the help of special information available about it. Some of the special
information might include its use as a public key for the RSA cipher.

10.4.1. Factor a pseudoprime.

Theorem 10.4. There is a polynomial-time algorithm for factoring
a composite integer N, given a base a to which N is a pseudoprime
but not a strong pseudoprime.

Proof. Write $N - 1 = 2^e f$, where f is odd. Consider the numbers
$$(10.1) \qquad a^f,\ a^{2f},\ a^{4f},\ \ldots,\ a^{2^e f} \pmod{N}.$$
We know that $a^{2^e f} = a^{N-1} \equiv 1 \pmod{N}$ because N is a pseudoprime
to base a. If all numbers in (10.1) were $\equiv 1 \pmod{N}$, then N would be
a strong pseudoprime to base a. As we are told that this is not so, one
of the numbers must be $\not\equiv 1 \pmod{N}$. Let i be the largest integer in
$0 \le i < e$ for which $x := a^{2^i f} \not\equiv 1 \pmod{N}$. If $x \equiv -1 \pmod{N}$, then
N would be a strong pseudoprime to base a. Since this is not true,
we have $x \not\equiv \pm 1 \pmod{N}$. But $x^2 \equiv 1 \pmod{N}$, so by Theorem 6.17
with y = 1 the two numbers gcd(x − 1, N) and gcd(x + 1, N) must
be proper factors of N. The exponentiation and greatest common
divisor can be done in polynomial time. □
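
The proof is constructive and can be turned into a short program. This is a minimal sketch, assuming N is odd; the function name is hypothetical, and a return value of None means that the chosen base gives no information.

```python
from math import gcd


def split_with_base(n, a):
    """Try to split the odd composite n using base a; return (p, q) or None."""
    g = gcd(a, n)
    if 1 < g < n:
        return g, n // g               # a shares a factor with n
    e, f = 0, n - 1
    while f % 2 == 0:                  # write n - 1 = 2**e * f with f odd
        f //= 2
        e += 1
    x = pow(a, f, n)
    if x == 1:
        return None                    # n is a strong pseudoprime to base a
    for _ in range(e):
        y = x * x % n
        if y == 1:                     # x is the last term of (10.1) that is not 1
            if x == n - 1:
                return None            # n is a strong pseudoprime to base a
            return gcd(x - 1, n), gcd(x + 1, n)
        x = y
    return None                        # a**(n-1) != 1: n is not a pseudoprime to base a


print(split_with_base(561, 2))
```

For the Carmichael number 561 and base 2, the sequence (10.1) is 263, 166, 67, 1, 1, so x = 67 and the factors are gcd(66, 561) = 33 and gcd(68, 561) = 17.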

As a corollary, factoring a Carmichael number is easy. By definition,
a Carmichael number is a pseudoprime to every base relatively
prime to it. Could it be a strong pseudoprime to every one of these
bases? No. According to Theorem 3.42, a Carmichael number N can
be a strong pseudoprime to no more than N/4 bases. So, if we choose
random a in 1 < a < N − 1, then the probability is < 1/4 that N is a
strong pseudoprime to base a. (Of course, if a is not relatively prime
to N , then N is factored directly by gcd(a, N ).)

Example 10.5. Factor the Carmichael number N = 23224518901.
Note that $N - 1 = 2^2 \cdot f$ with f = 5806129725. Try random bases
a. With a = 17 and a = 2 we find $a^{2f} \equiv -1 \pmod{N}$, so N is a
strong pseudoprime to these bases. But with a = 29 we find
$$a^f \equiv 6710870280, \quad a^{2f} \equiv 2986995677, \quad a^{4f} \equiv 1 \pmod{N}.$$
This gives the factors
$$\gcd(2986995677 - 1, N) = 6275201$$
and
$$\gcd(2986995677 + 1, N) = 3701.$$
Of course, a Carmichael number has at least three prime factors by
Korselt’s Theorem 3.36, so we have more work to do. With a = 3 we
find
$$a^f \equiv 7578277351, \quad a^{2f} \equiv 555812478, \quad a^{4f} \equiv 1 \pmod{N},$$
which gives the factors
$$\gcd(555812478 - 1, N) = 3301 \quad \text{and} \quad \gcd(555812478 + 1, N) = 7035601.$$
Thus, the factors of N are 3301, 3701, and 6275201/3301 = 1901,
which are easily seen to be primes.

10.4.2. Zero-knowledge proof. Recall the zero-knowledge proof
protocol in Section 4.8. In it, the Prover convinces the Verifier that
she knows the factors of some large integer N. The Verifier is not
supposed to learn anything about the factors during the protocol.
Here is how he can cheat. The Verifier skips step (2). In step (3),
he waits until he receives b from the Prover. He quickly computes
$d = b^3 \bmod N$ and sends this d to the Prover. Note that d is a
quadratic residue modulo N because if $b = a^2 \bmod N$ (step (1)),
then $d = (a^3)^2 \bmod N$. In step (5), the Verifier sends the bit 1 to the
Prover. In step (6), the Prover sends to the Verifier a solution $x_1$ to
the congruence $x_1^2 \equiv bd \equiv b^4 \pmod{N}$. The Verifier already knows
two solutions, $b^2 \bmod N$ and $(N - b^2) \bmod N$, to this congruence.
In case $x_1$ is one of these two numbers, the Verifier learns nothing
and tries again. But if $x_1$ is one of the other two square roots of
$b^4 \pmod{N}$, then the Verifier can factor N via Theorem 6.17.
The Prover could prevent this attack by checking whether $d =
b^3 \bmod N$ when she receives it in step (4). But the Verifier could use
$d = b^5 \bmod N$, $d = b^7 \bmod N$, $d = b^9 \bmod N$, etc. There are too
many possibilities for the Prover to check all of them. One way to
avoid this trap (and another trap) is for the Prover and Verifier to send
each other the low-order bits of b and d and send the remaining bits
only after receiving the low-order bits of the other person’s number.
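The final step of the cheating Verifier's attack, turning two essentially different square roots of the same number into a factor, might look as follows. This is a toy sketch, not the protocol of Section 4.8: the helper name `split_from_square_roots` and the tiny modulus N = 77 are assumptions chosen for illustration.

```python
from math import gcd

def split_from_square_roots(x, y, N):
    """Given x^2 = y^2 (mod N), return a proper factor of N, or None if none is found."""
    if (x * x - y * y) % N != 0:
        raise ValueError("x and y are not square roots of the same residue")
    g = gcd((x - y) % N, N)
    return g if 1 < g < N else None

# Toy run: N = 77, b = 4, so the Verifier already knows the root b^2 = 16 of b^4 = 25.
N, b = 77, 4
known_root = b * b % N
print(split_from_square_roots(5, known_root, N))   # 5 is one of the "other" roots of 25
print(split_from_square_roots(61, known_root, N))  # 61 = -16 (mod 77) teaches nothing
```

In the toy run the first call yields the factor 11, while the second returns `None` because 61 ≡ −16 (mod 77).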

10.4.3. Factor an RSA public key. The rest of the dirty tricks
are ways to factor an RSA public modulus N when some information
is known about it or the parameters are chosen poorly. None of these
tricks actually threaten the RSA cipher because, when the cipher
is used properly, the needed information about the modulus is not
leaked. There are other attacks on the RSA cipher besides factoring
N . For example, if Alice happens to encipher the same message M
via RSA using different enciphering exponents e1 and e2 , but the
same RSA modulus N , then an attacker can recover M from the two
ciphertexts without factoring N . Boneh [Bon99] describes all the
attacks below and many more.

Theorem 10.6. If N is the product of two different primes p and q,
then one can factor N in polynomial time, given N and φ(N).

Proof. We have $\varphi(N) = (p - 1)(q - 1) = N - (p + q) + 1$ and N = pq.
Thus, $N + 1 - \varphi(N) = p + q = p + N/p$, or $p^2 - (N + 1 - \varphi(N))p + N = 0$.
This quadratic equation in the unknown p has the two solutions p and
q, which may be computed easily by the quadratic formula. □
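As a concrete illustration of this proof, the quadratic can be solved with exact integer arithmetic. The function name `factor_from_phi` is hypothetical, and the sketch assumes Python 3.8 or later for `math.isqrt`.

```python
from math import isqrt

def factor_from_phi(N, phi):
    """Solve p^2 - (N + 1 - phi) p + N = 0 for the prime factors p >= q of N."""
    s = N + 1 - phi              # s = p + q
    disc = s * s - 4 * N         # discriminant, equal to (p - q)^2
    if disc < 0:
        raise ValueError("phi is inconsistent with N = pq")
    r = isqrt(disc)
    if r * r != disc or (s + r) % 2:
        raise ValueError("phi is inconsistent with N = pq")
    return (s + r) // 2, (s - r) // 2
```

With N = 3233 = 61 · 53 and φ(N) = 3120, we get s = 114 and discriminant 64, so `factor_from_phi(3233, 3120)` returns `(61, 53)`.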

Theorem 10.6 shows why one must not reveal φ(N ) when N is
an RSA public key.
One must also not reveal the deciphering exponent d when N is an
RSA public key. Not only could one decipher all ciphertext knowing
d, but one can factor N , too. Of course, the enciphering exponent e
is public. This theorem appears in the original RSA paper [RSA78].

Theorem 10.7. There is a probabilistic polynomial-time algorithm
to factor N, given an integer N which is the product of two unknown
primes and given two integers e, d between 1 and N with
ed ≡ 1 (mod φ(N)).

Compare this proof with that of Theorem 10.4.



Proof. We have ed − 1 = kφ(N) for some integer k. Of course, k
and φ(N) are unknown, but their product r = ed − 1 is known, and
we have by Euler’s Theorem 2.31
$$a^r = a^{ed-1} = \left(a^{\varphi(N)}\right)^k \equiv 1 \pmod{N}$$
whenever gcd(a, N) = 1. Note that r is even because φ(N) is even
for N > 2, and so both e and d are odd.
Now write $r = 2^s c$ with c odd. Choose a random a in 1 < a <
N − 1. If gcd(a, N) > 1, then N has been factored and we are done.
Otherwise, compute $b_i = a^{2^i c} \bmod N$ for $0 \le i \le s$. We know that
$b_s = a^r \bmod N = 1$ by the congruence above.
If for some $0 < i \le s$ we have $b_i = 1$ but $b_{i-1} \not\equiv \pm 1 \pmod{N}$, then
$\gcd(b_{i-1} - 1, N)$ is a proper factor of N. If there is no such i, try a
different random a. The reason this works is that $b_{i-1}^2 \equiv 1 \pmod{N}$,
but $b_{i-1} \not\equiv \pm 1 \pmod{N}$, so $\gcd(b_{i-1} - 1, N)$ is a proper factor of N
by Theorem 6.17. In fact, each random a leads to a factorization of N
with probability at least 1/2. All of the $b_i$ and the greatest common
divisor can be computed in polynomial time by Fast Exponentiation
and the Euclidean Algorithm. □
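The procedure in this proof translates almost line by line into code. The following is a minimal sketch; the function name `factor_from_exponents`, the `max_tries` cutoff, and the optional `rng` argument are assumptions added for illustration.

```python
import random
from math import gcd

def factor_from_exponents(N, e, d, max_tries=100, rng=None):
    """Return a proper factor of N = pq given e, d with ed = 1 (mod phi(N))."""
    rng = rng or random.Random()
    r = e * d - 1
    s, c = 0, r
    while c % 2 == 0:                 # write r = 2^s * c with c odd
        s += 1
        c //= 2
    for _ in range(max_tries):
        a = rng.randrange(2, N - 1)   # random a with 1 < a < N - 1
        g = gcd(a, N)
        if g > 1:
            return g                  # lucky: a shares a factor with N
        b = pow(a, c, N)              # b_0
        if b == 1 or b == N - 1:
            continue                  # this a gives no information
        for _ in range(s):
            nxt = b * b % N           # b_i from b_{i-1}
            if nxt == 1:
                return gcd(b - 1, N)  # b_{i-1} is a square root of 1 other than +-1
            if nxt == N - 1:
                break                 # sequence passes through -1; try another a
            b = nxt
    return None
```

For the toy key N = 3233, e = 17, d = 2753 (where 17 · 2753 = 15 · 3120 + 1), the call `factor_from_exponents(3233, 17, 2753)` returns 53 or 61, depending on the random bases drawn.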

The method just presented for factoring N = pq, given e, d with
ed ≡ 1 (mod φ(N)) is probabilistic because it chooses a random a. In
fact, there is a deterministic method for factoring N , given e and d,
but it is more complicated. As we saw above, we can factor N given
N and φ(N ) by solving a quadratic equation. May [May04] shows
how to find φ(N ) from N , e, and d in deterministic polynomial time.
His result depends on a theorem of Coppersmith telling how to find
small solutions to a polynomial congruence modulo N . Coppersmith’s
theorem relies in turn on an algorithm for finding a reduced basis for
a lattice. As this work leads to several other attacks on RSA, we
describe them in the following section.
Here is one more way to factor N when a parameter is chosen
badly. One can factor N when d is unknown but satisfies $d \le \sqrt[4]{N}/3$,
according to Wiener [Wie90].
Theorem 10.8 (Wiener). There is a polynomial-time algorithm which
can factor N, given N and e, provided e < φ(N) and N = pq, where
p and q are (unknown) primes with q < p < 2q, and there is an
(unknown) integer $d < N^{1/4}/3$ satisfying ed ≡ 1 (mod φ(N)).

Proof. Since ed ≡ 1 (mod φ(N)), there is an integer k such that
ed − kφ(N) = 1. Divide by dφ(N) to get
$$\left| \frac{e}{\varphi(N)} - \frac{k}{d} \right| = \frac{1}{d\,\varphi(N)}.$$
This shows that k/d ≈ e/φ(N). Now e is known, but φ(N) is unknown.
But we can approximate φ(N) by N, as we now show. Since
q < p < 2q and pq = N, we have $q < \sqrt{N}$ and $0 < N - \varphi(N) =
p + q - 1 < 3q - 1 < 3\sqrt{N}$. Therefore,
$$\left| \frac{e}{N} - \frac{k}{d} \right| = \left| \frac{ed - kN}{dN} \right| = \left| \frac{1 - k(N - \varphi(N))}{dN} \right|$$
because ed = 1 + kφ(N). Using $0 < N - \varphi(N) < 3\sqrt{N}$ gives
$$\left| \frac{e}{N} - \frac{k}{d} \right| \le \left| \frac{3k\sqrt{N}}{dN} \right| = \frac{3k}{d\sqrt{N}}.$$
Since kφ(N) = ed − 1 < ed and e < φ(N), we have $k < d < N^{1/4}/3$.
Therefore,
$$\left| \frac{e}{N} - \frac{k}{d} \right| \le \frac{1}{dN^{1/4}} < \frac{1}{2d^2}.$$
This inequality is an instance of formula (6.4), so by Theorem 6.7, the
fraction k/d must be one of the convergents of the simple continued
fraction expansion of the number x = e/N. Now N and e are given,
so we know x and can compute the convergents $A_m/B_m$ of its continued
fraction in polynomial time using formulas (6.3), computing
the partial quotients $q_i$ by Algorithm 6.1. Try each $B_m$ as d and test
whether $M^{ed} \equiv M \pmod{N}$ for a few random M. As the $B_m$ grow
exponentially with m and $d < N^{1/4}/3$, we try at most O(log N) of
them. □
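A small sketch of this attack follows. It departs from the proof in one labeled respect: instead of testing each candidate d on random messages M, it takes each convergent k/d, forms the candidate φ(N) = (ed − 1)/k, and accepts it when the quadratic of Theorem 10.6 has integer roots. The names `convergents` and `wiener_factor` are hypothetical, and Python 3.8+ is assumed for `math.isqrt` and `pow(d, -1, phi)`.

```python
from math import isqrt

def convergents(num, den):
    """Yield the convergents (A, B) of the continued fraction of num/den."""
    A_prev, A = 0, 1
    B_prev, B = 1, 0
    while den:
        q, rem = divmod(num, den)          # next partial quotient
        A_prev, A = A, q * A + A_prev
        B_prev, B = B, q * B + B_prev
        yield A, B
        num, den = den, rem

def wiener_factor(N, e):
    """Return (d, p, q) with p > q if a convergent k/d of e/N reveals phi(N), else None."""
    for k, d in convergents(e, N):
        if k == 0 or (e * d - 1) % k:
            continue
        phi = (e * d - 1) // k             # candidate value of phi(N)
        s = N + 1 - phi                    # candidate p + q
        disc = s * s - 4 * N
        if disc < 0:
            continue
        r = isqrt(disc)
        if r * r == disc and (s + r) % 2 == 0:
            return d, (s + r) // 2, (s - r) // 2
    return None

# Toy key with a deliberately small deciphering exponent d = 17.
p, q, d = 10007, 7919, 17
e = pow(d, -1, (p - 1) * (q - 1))
print(wiener_factor(p * q, e))
```

Here N = 79245433 has $N^{1/4}/3 \approx 31$, so d = 17 satisfies the hypothesis of Theorem 10.8 and the search recovers d together with both primes.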

Finally, we give an example of factoring an RSA modulus n by
exploiting a hardware failure during signature generation. We explained
in Section 4.8 how Alice can sign a message M and how she can
accelerate this calculation of S using the Chinese Remainder Theorem
and her knowledge of the factors p and q of n. In brief, she computes
$S_p = S \bmod p$ and $S_q = S \bmod q$ by Fast Exponentiation with
smaller numbers and then combines these values with the Chinese
Remainder Theorem to obtain S.
Now suppose that a hardware error complements one bit of $S_p$
during this calculation, but $S_q$ is computed correctly.
First suppose that an eavesdropper Eve obtains the correct signature
S as well as the incorrect one $S'$ formed by the Chinese Remainder
Theorem using the wrong value $S'_p$ and the correct $S_q$. Then Eve
can factor Alice’s public modulus n, given S, $S'$, and n. We have
$$S \equiv S_p \pmod{p},\quad S \equiv S_q \pmod{q},\quad S' \equiv S'_p \pmod{p},\quad S' \equiv S_q \pmod{q},$$
so that $\gcd(S - S', n) = q$ since q divides $S - S'$ but p does not.
Now suppose instead that Eve obtains $S'$, but not the correct
signature S. Eve can still factor Alice’s public modulus n, given only
$S'$, n, e, and M, where e is Alice’s public encryption exponent. We
have $M = S^e \bmod n$. Since the error occurred in computing $S_p$,
we have $M \equiv (S')^e \pmod{q}$ but $M \not\equiv (S')^e \pmod{p}$. Therefore,
$\gcd(M - (S')^e, n) = q$.
The second attack is due to A. K. Lenstra. See Boneh [Bon99]
for both attacks.
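Both fault attacks can be simulated on a toy key. In the sketch below, the function `crt_sign`, its `flip_bit` parameter (which imitates the hardware error by complementing one bit of $S_p$), and the key p = 61, q = 53, e = 17, d = 2753 are all assumptions made for illustration. Python 3.8+ is assumed for the modular inverse `pow(q, -1, p)`.

```python
from math import gcd

def crt_sign(M, d, p, q, flip_bit=None):
    """Sign M with the Chinese Remainder Theorem, optionally faulting one bit of S_p."""
    Sp = pow(M, d % (p - 1), p)
    Sq = pow(M, d % (q - 1), q)
    if flip_bit is not None:
        Sp ^= 1 << flip_bit               # simulated hardware error in S_p
    h = (pow(q, -1, p) * (Sp - Sq)) % p   # combine the two halves (Garner's form)
    return (Sq + q * h) % (p * q)

p, q, e, d = 61, 53, 17, 2753
n, M = p * q, 65

S = crt_sign(M, d, p, q)                  # correct signature
S_bad = crt_sign(M, d, p, q, flip_bit=0)  # faulty signature

# First attack: Eve holds both S and the faulty S'.
print(gcd(S - S_bad, n))
# Second attack: Eve holds only S', n, e, and M.
print(gcd((M - pow(S_bad, e, n)) % n, n))
```

Both lines print q = 53, because the fault leaves the signature correct modulo q but not modulo p.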

10.5. Dirty Tricks with Lattices


In this section we give a brief overview of lattices and applications to
factoring an RSA public modulus. This section assumes the reader
has studied linear algebra. The definitions and theorems in this sec-
tion are somewhat vague. See [LLL82] for the true definitions and
theorems.
Lattices have many uses in cryptography. They may be used to
define cryptosystems and to attack RSA and other ciphers. Several
attacks on the RSA public-key cipher with poor parameter choices use
lattices and the LLL algorithm (for Lenstra, Lenstra, and Lovász).
We describe here only the ways lattices can be used to factor an RSA
modulus.

We begin with the Gram-Schmidt process from linear algebra.
The Gram-Schmidt process constructs an orthogonal basis for a vector
space V, given any basis for it. It is fast and simple. If $\vec{u} =
(u_1, u_2, \ldots, u_n)$ and $\vec{v} = (v_1, v_2, \ldots, v_n)$ are two vectors in
n-dimensional space, their dot product is $\vec{u} \cdot \vec{v} = \sum_{i=1}^{n} u_i v_i$. Two vectors
are orthogonal if they are perpendicular, which is the same as having
dot product 0. A basis is orthogonal if each pair of vectors in it are
orthogonal. The length or norm of $\vec{u}$ is
$$\|\vec{u}\| = \sqrt{\vec{u} \cdot \vec{u}} = \sqrt{\sum_{i=1}^{n} u_i^2}.$$
Note that for any vector $\vec{u}$, $\vec{u} \cdot \vec{u} \ge 0$ since it is the sum of squares.
The zero vector $\vec{0} = (0, \ldots, 0)$ is the only vector with length 0.
The Gram-Schmidt process accepts a basis $\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_r\}$ and
computes an orthogonal basis $\{\vec{w}_1, \vec{w}_2, \ldots, \vec{w}_r\}$ for the same vector
space. It sets $\vec{w}_1 = \vec{v}_1$. To compute $\vec{w}_i$ for i > 1, it projects $\vec{v}_i$
orthogonally onto the subspace U generated by $\{\vec{w}_1, \ldots, \vec{w}_{i-1}\}$, which
is the same as the subspace generated by $\{\vec{v}_1, \ldots, \vec{v}_{i-1}\}$. Then $\vec{w}_i$ is
defined to be the difference between $\vec{v}_i$ and this projection. This
construction makes $\vec{w}_i$ orthogonal to all the vectors in the subspace
U. The projection is computed one vector at a time by the for loop.

Algorithm 10.9. Gram-Schmidt process.

Input: A basis {v_1, v_2, ..., v_r} for a vector space V.

for (i ← 1 to r) {
    w_i ← v_i
    for (j ← 1 to i − 1) {
        m_{i,j} ← (v_i · w_j)/(w_j · w_j)
        w_i ← w_i − m_{i,j} w_j
    }
}
Output: Orthogonal basis {w_1, w_2, ..., w_r} for V.

Example 10.10. In $\mathbb{R}^4$, let V be the subspace with basis $\{\vec{v}_1, \vec{v}_2, \vec{v}_3\}$,
where $\vec{v}_1 = (1, 2, 3, 0)$, $\vec{v}_2 = (1, 2, 0, 0)$, and $\vec{v}_3 = (1, 0, 0, 1)$. Find an
orthogonal basis $\{\vec{w}_1, \vec{w}_2, \vec{w}_3\}$ for V.

First we have $\vec{w}_1 = \vec{v}_1 = (1, 2, 3, 0)$.
Then
$$\vec{w}_2 = \vec{v}_2 - \frac{\vec{v}_2 \cdot \vec{w}_1}{\vec{w}_1 \cdot \vec{w}_1}\,\vec{w}_1 = (1/14)(9, 18, -15, 0).$$
We simplify the calculation by replacing $\vec{w}_2$ with $14\vec{w}_2 = (9, 18, -15, 0)$.
Finally
$$\vec{w}_3 = \vec{v}_3 - \frac{\vec{v}_3 \cdot \vec{w}_1}{\vec{w}_1 \cdot \vec{w}_1}\,\vec{w}_1 - \frac{\vec{v}_3 \cdot \vec{w}_2}{\vec{w}_2 \cdot \vec{w}_2}\,\vec{w}_2 = (1/5)(4, -2, 0, 5).$$
We could replace $\vec{w}_3$ with $5\vec{w}_3 = (4, -2, 0, 5)$.
The orthogonal basis is $\vec{w}_1 = (1, 2, 3, 0)$, $\vec{w}_2 = (9, 18, -15, 0)$,
$\vec{w}_3 = (4, -2, 0, 5)$.
Check that any two vectors $\vec{w}_i$ are orthogonal: $\vec{w}_1 \cdot \vec{w}_2 = 1 \cdot 9 +
2 \cdot 18 + 3 \cdot (-15) + 0 \cdot 0 = 0$, etc.
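Algorithm 10.9 can be run on the data of Example 10.10 with exact rational arithmetic. The sketch below is one possible transcription; the function names `dot` and `gram_schmidt` are hypothetical, and the output vectors are left unscaled (that is, before multiplying by 14 and 5 as the example does).

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(basis):
    """Return the orthogonal basis w_1, ..., w_r computed as in Algorithm 10.9."""
    ws = []
    for v in basis:
        v = [Fraction(x) for x in v]
        w = list(v)
        for wj in ws:
            m = dot(v, wj) / dot(wj, wj)      # the coefficient m_{i,j}
            w = [wi - m * wji for wi, wji in zip(w, wj)]
        ws.append(w)
    return ws

w1, w2, w3 = gram_schmidt([(1, 2, 3, 0), (1, 2, 0, 0), (1, 0, 0, 1)])
print([str(x) for x in w2])   # entries of (1/14)(9, 18, -15, 0)
print([str(x) for x in w3])   # entries of (1/5)(4, -2, 0, 5)
```

Using `Fraction` avoids rounding, so the orthogonality checks $\vec{w}_i \cdot \vec{w}_j = 0$ hold exactly.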

Now we give a definition of lattice.

Definition 10.11. The lattice generated by linearly independent vectors
$\vec{v}_1, \ldots, \vec{v}_r$ with integer coordinates is the set of all linear
combinations $a_1\vec{v}_1 + \cdots + a_r\vec{v}_r$ with integers $a_i$.

A basis for a lattice L is a set of linearly independent vectors
$\vec{v}_1, \ldots, \vec{v}_r$ with integer coordinates such that any vector in L can be
written as a linear combination $a_1\vec{v}_1 + \cdots + a_r\vec{v}_r$ of the basis vectors
with integer coefficients $a_i$.
Every lattice has a basis. Every basis of a given lattice L has
the same size r, called the rank of L. The rank r is the same as the
dimension of the vector space (a subspace of $\mathbb{R}^n$) spanned by any
basis of L. A lattice has full rank if r = n.
Recall that if $\vec{v}_1, \ldots, \vec{v}_r$ and $\vec{w}_1, \ldots, \vec{w}_r$ are two bases for the same
vector space, then each $\vec{w}_i$ can be written as a linear combination of
the $\vec{v}_j$.
Likewise, if $\vec{v}_1, \ldots, \vec{v}_r$ and $\vec{w}_1, \ldots, \vec{w}_r$ are two bases for the same
lattice, then each $\vec{w}_i$ can be written as a linear combination of the
$\vec{v}_j$ with integer coefficients. Let B be the r × n matrix whose rows
are $\vec{v}_1, \ldots, \vec{v}_r$. The “size” of a lattice L is a real number det(L)
defined in terms of any basis. If L has full rank, then B is square and
det(L) = | det(B)|. This is the only case we will use below.
There exists a vector $\vec{v}_1$ in a lattice L with minimum positive
norm, the shortest vector. Let $\lambda_1(L)$ be the norm of this shortest
$\vec{v}_1$. Every vector $\vec{w} \in L$ which is linearly dependent on $\vec{v}_1$ must be
$\vec{w} = t\vec{v}_1$ for some integer t. At least two vectors have this minimum
positive length since the norm of $-\vec{v}$ equals the norm of $\vec{v}$.
We generalize λ1 (L) as follows. For integer k ≥ 1, let λk (L) be
the smallest positive real number so that there is at least one set of
k linearly independent vectors of L, with each vector having length
≤ λk (L). This defines a sequence

λ1 (L) ≤ λ2 (L) ≤ λ3 (L) ≤ · · · .

Note that we count lengths, not vectors, in this definition.


A lattice is often presented by giving a basis for it. This basis
sometimes consists of very long, nearly parallel vectors. Sometimes it
is more useful to have a basis with shorter vectors that are closer to
being orthogonal to each other. The process of finding such a basis
from a poor one is called reducing the lattice or lattice reduction.
The nicest basis $\{\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_r\}$ for L would have the length of $\vec{v}_i$
be $\lambda_i(L)$ for each i. It turns out to be NP-hard to find a nicest basis.
The Gram-Schmidt process (Algorithm 10.9) does not work for
lattices because the mi,j are usually not integers, so the new basis
vectors are not integer linear combinations of the original vectors.
There are several definitions of reduced basis and they are not equiv-
alent. Lenstra, Lenstra, and Lovász [LLL82] give a precise definition.
Its vectors approximate the shortest possible vectors for a basis of a
given lattice, and the angle between any two reduced basis vectors
is not allowed to be too small. They give a polynomial-time algorithm,
related to the Gram-Schmidt process, but more complicated,
that computes a reduced basis from a given one. They prove that
their algorithm, the LLL algorithm, runs in polynomial time
and constructs a reduced basis.
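To give a feeling for what lattice reduction does, here is a sketch for the special case of rank 2. It is not the LLL algorithm: it is the much simpler classical reduction usually attributed to Lagrange and Gauss, which repeatedly subtracts from the longer vector the nearest integer multiple of the shorter one. The function name `reduce_rank2` is hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def reduce_rank2(u, v):
    """Lagrange-Gauss reduction of a rank-2 lattice basis (not LLL itself)."""
    u, v = list(u), list(v)
    if dot(u, u) > dot(v, v):
        u, v = v, u                              # keep u the shorter vector
    while True:
        m = round(Fraction(dot(u, v), dot(u, u)))  # nearest integer coefficient
        v = [vi - m * ui for vi, ui in zip(v, u)]  # integer step, stays in the lattice
        if dot(v, v) >= dot(u, u):
            return u, v
        u, v = v, u

# The long, nearly parallel basis (1, 1), (3, 4) generates all of Z^2.
short_u, short_v = reduce_rank2((1, 1), (3, 4))
print(short_u, short_v)
```

Unlike the Gram-Schmidt coefficients $m_{i,j}$, the multiplier here is rounded to an integer, so the new vectors remain integer combinations of the original basis.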
We will use only the following special application of the LLL
algorithm.

Theorem 10.12 (LLL [LLL82]). Let L be a lattice spanned by $\vec{v}_1$,
..., $\vec{v}_r$. With input $\vec{v}_1$, ..., $\vec{v}_r$, the LLL algorithm finds in polynomial
time a vector $\vec{b}_1$ with length
$$\|\vec{b}_1\| \le 2^{(r-1)/4} \det(L)^{1/r}.$$

Now we describe some attacks on RSA using LLL.


Recall the notation of RSA: N = pq is the product of two large
primes and is hard to factor. Choose e with gcd(e, φ(N )) = 1, where
φ(N ) = (p − 1)(q − 1). Via the Extended Euclidean Algorithm, find
d with ed ≡ 1 (mod φ(N )). Discard p and q. The public key is N , e
and the private key is d. Encipher plaintext M as $C = M^e \bmod N$.
Decipher ciphertext C as $M = C^d \bmod N$.
Consider these three problems about the RSA cipher.

(1) The RSA problem: Given N , e, and C, find M .


(2) Compute d: Given N and e, find d.
(3) Factor N : Given N , find p and q.

Clearly, if we can solve (3), then we can solve (2); and if we can
solve (2), then we can solve (1).
In fact, (3) is equivalent to (2). It is not known whether (3)
is equivalent to (1). All three problems seem hard, although Shor
showed that one can solve all three quickly on a quantum computer.
We will give a complete proof of the equivalence of (2) and (3) in
case p and q have the same length in bits. Then we will sketch some
other results.
The coefficient vector of a polynomial $h(x) = h_0 + h_1 x + h_2 x^2 +
\cdots + h_n x^n$ is the vector $\vec{v} = (h_0, h_1, \ldots, h_n)$, and $\|h(x)\| = \|\vec{v}\|$. We
begin with a useful lemma. In it, h(xX) is the polynomial with
coefficient vector $(h_0, h_1 X, h_2 X^2, \ldots, h_n X^n)$.

Lemma 10.13 (Howgrave-Graham [HG97]). Let h(x) be a polynomial
which is a sum of at most r monomials. Let X be a positive
real number and let M > 1 and $x_0$ be integers. Suppose that
$h(x_0) \equiv 0 \pmod{M}$, where $|x_0| \le X$ and $\|h(xX)\| < M/\sqrt{r}$. Then
$h(x_0) = 0$.

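Proof. We have
$$|h(x_0)| = \left| \sum_i h_i x_0^i \right| = \left| \sum_i h_i X^i \left(\frac{x_0}{X}\right)^i \right| \le \sum_i \left| h_i X^i \left(\frac{x_0}{X}\right)^i \right| \le \sum_i \left| h_i X^i \right| \le \sqrt{r}\,\|h(xX)\| < M.$$
Since $h(x_0) \equiv 0 \pmod{M}$, we have $h(x_0) = 0$. □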

The next theorem was first proved by May [May04]. An improved
version in Coron and May [CM07] is the basis for the proof
we give below. The proof uses ideas of Coppersmith [Cop96b] and
the LLL algorithm [LLL82].

Theorem 10.14 (May). Let N = pq, where p and q are two primes
of the same bit length. Let positive integers e and d satisfy ed ≡
1 (mod φ(N)) and $ed \le N^2$. There is a deterministic polynomial-
time algorithm which factors N given the input N, e, d.

Proof. Let U = ed − 1 and s = p + q − 1. The algorithm will
compute s from N and U. Since φ(N) = N − s, the theorem follows
from Theorem 10.6.
Suppose for the moment that we know the high-order bits $s_0$ of
s. Let X be an integer to be determined later. Write $s = s_0 X + x_0$,
where $0 \le x_0 < X$. We will describe a way to find $x_0$ given $s_0$. Let
φ = φ(N). Since $\varphi = N - s = N - s_0 X - x_0$, we have $x_0 - N + s_0 X \equiv
0 \pmod{\varphi}$ and $U = ed - 1 \equiv 0 \pmod{\varphi}$. Let m and k be integers to
be chosen later. Define polynomials in the variable x by
$$g_{ij}(x) = x^i \cdot (x - N + s_0 X)^j \cdot U^{m-j}$$
for 0 ≤ j ≤ m and i = 0 and also for j = m and 1 ≤ i ≤ k. For all
these pairs (i, j), we have $g_{ij}(x_0) \equiv 0 \pmod{\varphi^m}$. Some coefficients of
the polynomials $g_{ij}(x)$ are large integers.
Our plan is to use the LLL algorithm to find a linear combination
h(x) of the polynomials $g_{ij}(x)$ which has small coefficients. Then
$h(x_0) \equiv 0 \pmod{\varphi^m}$. We will apply Lemma 10.13 with $M = \varphi^m$ to
get $h(x_0) = 0$. Then we will find $x_0$ by ordinary root-finding methods
of numerical analysis. This will give us $s = s_0 X + x_0$.

Let L be the lattice spanned by the coefficient vectors of the
polynomials $g_{ij}(xX)$ and let B be the matrix with these vectors as its
rows. The lattice has full rank r = m + k + 1. The r × r matrix B is
shown in this diagram for k = 4 and m = 3:

            1      x      x^2    x^3    x^4    x^5    x^6    x^7
g00(xX)     U^3
g01(xX)     *      U^2 X
g02(xX)     *      *      U X^2
g03(xX)     *      *      *      X^3
g13(xX)            *      *      *      X^4
g23(xX)                   *      *      *      X^5
g33(xX)                          *      *      *      X^6
g43(xX)                                 *      *      *      X^7

In the matrix, the stars denote nonzero entries that don’t matter. The
blank entries are all 0. Since the matrix is triangular, its determinant
is the product of the entries on the diagonal. We have
$$\det(L) = \det(B) = X^{(m+k)(m+k+1)/2}\, U^{m(m+1)/2}.$$
The LLL algorithm (Theorem 10.12) applied to L with the basis in
B gives in polynomial time a nonzero short vector $\vec{b}$ with length
$\|\vec{b}\| \le 2^{(r-1)/4} \det(L)^{1/r}$. This vector $\vec{b}$ is the coefficient vector of
some polynomial h(xX) with $\|h(xX)\| = \|\vec{b}\|$. The polynomial h(x)
is an integer linear combination of the $g_{ij}(x)$, so $h(x_0) \equiv 0 \pmod{\varphi^m}$.
In order to apply Lemma 10.13, it is enough to have
$$2^{(r-1)/4} \det(L)^{1/r} < \varphi^m/\sqrt{r}.$$
Using the inequalities $\sqrt{r} \le 2^{(r-1)/2}$, $\varphi > N/2$, and r − 1 = m + k ≥ m,
we have the sufficient condition
$$\det(L) \le N^{mr}\, 2^{-2r(r-1)}.$$
Using the value of det(L) above, the hypothesis $U = ed - 1 < N^2$,
and mr = m(m + 1) + mk gives the sufficient condition
$$X^{(m+k)(m+k+1)/2} \le N^{mk}\, 2^{-2(m+k+1)(m+k)}.$$
Take the (m + k)(m + k + 1)/2 root of both sides. We need
$$X \le N^{f(m,k)}\, 2^{-4}$$

where f(m, k) = 2mk/((m + k)(m + k + 1)). We want X to be as
large as possible so that $s_0$ will be small. Calculus shows that for
fixed m, f(m, k) is maximized when k = m. When k = m, we have
f(m, m) = m/(2m + 1) = 1/2 − 1/(2(2m + 1)), so we need
$$X \le \frac{1}{16} \cdot N^{1/2 - 1/(4m+2)}.$$
Now let X be the largest integer that satisfies this inequality. The
LLL algorithm runs on a lattice of dimension 2m + 1 and with vector
components $O(N^{2m})$. The running time of the LLL algorithm is a
polynomial in the lattice dimension and the bit length of the components.
Thus, LLL runs in time polynomial in m and $\log_2 N$ for each
candidate $s_0$. It will find the correct s and factor N for one of the
values of $s_0$.
How many values of $s_0$ must be tried? Using the largest possible
value of X and $s = p + q - 1 \le 3\sqrt{N}$, we get
$$s_0 \le \frac{s}{X} \le \frac{3\sqrt{N}}{\frac{1}{16}\sqrt{N} \cdot N^{-1/(4m+2)}} = 48 N^{1/(4m+2)}.$$
If we let $m = \log_2 N$, then $s_0$ is bounded by a constant. □

Theorem 10.14 holds without the restriction that p and q have
the same size. See [CM07] for a proof.
Theorem 10.14 is a beautiful theoretical result. It says that know-
ing the RSA deciphering exponent is deterministically polynomial-
time equivalent to knowing the factors of N . Of course, if an RSA
deciphering exponent were accidentally leaked, one could factor N
easily in probabilistic polynomial time by the method of Theorem
10.7.
Assume we have some partial information about p or q or d be-
cause they are limited somehow, perhaps by a poor choice of param-
eters or a faulty random number generator.
Coppersmith [Cop96b] proved that if f (x) is a monic polynomial
and b is an unknown factor of a given integer N , then one can find all
“small” solutions x to f (x) ≡ 0 (mod b) quickly via the LLL lattice
reduction algorithm.

Theorem 10.14 is one application of his method. Here is another
application. We can factor an RSA modulus given a few more than
half of the high-order bits of p (as a number p̃ ≈ p).

Theorem 10.15 (Coppersmith [Cop96b]). One can factor N = pq,
where p > q, in polynomial time, given N and an integer p̃ with
$|p - \tilde{p}| < N^{5/28}/2$.

Proof. We sketch the proof, which is similar to the proof of May’s
Theorem 10.14.
Let $f(x) = x + \tilde{p}$. Then $x_0 = p - \tilde{p}$ is a small zero of $f(x) \equiv
0 \pmod{p}$, so we can find it with Coppersmith’s method.
Define polynomials for i = 0 to 3 by
$$f_i(x) = N^i f^{4-i}(x) = N^i (x + \tilde{p})^{4-i},$$
$$f_{4+i}(x) = x^i f^4(x) = x^i (x + \tilde{p})^4.$$
Then $p^4$ divides all $f_i(x_0)$, i = 0, 1, ..., 7.
Let $X = N^{5/28}/2$.
Define a lattice L of dimension 8 by the basis of coefficient vectors
of $f_i(xX)$.
Apply the LLL algorithm (Theorem 10.12) to get a short vector
$\vec{b}_1$ = the coefficient vector of some polynomial g(xX).
We have $|g(x_0)| < p^4$ and $p^4$ divides $g(x_0)$. Therefore, $g(x_0) = 0$.
Solve g(x) = 0 for an integer $x_0$. Then $p = x_0 + \tilde{p}$. □

Similar theorems allow one to factor N = pq given the high-order
half of the bits of q or the low-order bits of either p or q.
Coppersmith [Cop96a], [Cop97] also proved a theorem that lets
one find small roots of a polynomial with two variables in polynomial
time using the LLL algorithm. This result gives a faster algorithm
than Theorem 10.15 for factoring N = pq when the high- or low-order
half of the bits of either p or q is known. The bivariate polynomial
theorem also gives polynomial-time algorithms for factoring an RSA
modulus N when some bits of d are known. Here is an example of
these results.

Theorem 10.16 (Boneh and Durfee [BD00]). There is a polynomial-
time algorithm to factor N = pq, given N and e, provided that there
exists an integer $d < N^{0.292}$ such that ed ≡ 1 (mod φ(N)).

Compare this theorem with that of Wiener, Theorem 10.8, which
uses continued fractions to prove a similar statement.
The lesson of Theorems 10.8 and 10.16 is that the deciphering
exponent d in RSA should not be too small or else an attacker will
be able to factor N. If you are choosing RSA keys and notice that
$d < N^{2/3}$, say, then you should choose new e and d. So far as I know,
there is no risk in letting e be small; even e = 3 appears to be safe.

10.6. The Future of Factoring


Can one predict the future by looking at the past? Let us consider the
special problem of completely factoring Fermat numbers and look at
the history of how and when the final factors of each Fermat number
were learned.
The Fermat numbers $F_m = 2^{2^m} + 1$ for 5 ≤ m ≤ 32 are composite.
These numbers have been completely factored for 5 ≤ m ≤ 11. The
following table lists the year and method by which the factorization
of these seven numbers was finished⁴:

m    Year   Method                                           Who
5    1732   Trial Division                                   Euler
6    1855   Trial Division and $p \equiv 1 \pmod{2^{m+2}}$   Clausen and Landry
7    1970   CFRAC                                            Morrison and Brillhart
8    1980   Pollard Rho                                      Brent and Pollard
9    1990   NFS                                              Lenstra, Manasse, et al.
10   1995   ECM                                              Brent
11   1988   ECM                                              Brent

⁴ The first five numbers have exactly two prime factors. $F_{10}$ has four and $F_{11}$ has
five prime factors. The table reports only on splitting the two largest prime factors.
The factors of $F_6$ were found in 1855 but not proved prime until 1880.

Can one predict the future of factoring Fermat numbers from this
table? The years increase with m except for the last one. It happens
that $F_{11}$ has four factors small enough to discover by the ECM easily,
while the penultimate factor of $F_{10}$ has 40 digits, which the ECM can
discover only with considerable effort (at least in the 1990s). Can one
guess from this table in what year F12 will be completely factored?
Six small factors of F12 are already known. The remaining cofactor
is composite and has 1133 decimal digits. This number is much too
large to factor by the QS or the NFS. The only way it will be factored
soon by a known method is if it has just one very large prime factor
and we can discover one or more small prime factors by the ECM.
The table shows that six different algorithms were used to factor the
seven numbers. Perhaps someone will discover a new algorithm to
finish F12 .
A new factoring algorithm was invented about every five years
from 1970 to 1995, as shown in this table:

Year   Method                  Who
1970   CFRAC                   Morrison and Brillhart
1975   Pollard p − 1 and Rho   Pollard
1980   Quadratic Sieve         Pomerance
1985   ECM                     H.W. Lenstra, Jr.
1990   SNFS                    Pollard and Lenstra
1995   GNFS                    Pollard and Lenstra

This period was the Golden Age of Factoring Algorithms. Each of
these algorithms did something that previous algorithms could not
do. The CFRAC factors general numbers too hard for Trial Division.
Pollard’s algorithms find small factors beyond the range of Trial Di-
vision of numbers too large for the CFRAC. The Quadratic Sieve
factors general numbers faster than the CFRAC. The Elliptic Curve
Method finds relatively small factors larger than the ones Pollard’s
algorithms can find. The Special Number Field Sieve factors numbers
of the special form $b^n \pm c$, for small integers b, n, c faster than
the Quadratic Sieve can do them. The General Number Field Sieve
factors general numbers not of the right form for the Special Number
Field Sieve faster than the Quadratic Sieve.
Why have no new faster factoring algorithms been discovered
since 1995? Many people have tried to find new ones. People have
invented variations of the QS, the ECM, and the NFS that lead to
faster programs for these algorithms. For example, the multiple
polynomial QS uses many quadratic polynomials for sieving, not just
one of them. New forms for representing elliptic curves make the
ECM more efficient. Line sieving and the use of special q speed the
NFS. The improved algorithms are ten to one hundred times faster
than their first versions, but they still have the same asymptotic time
complexity as the first version. Have we already discovered the fastest
integer factoring algorithms? I doubt it.
Computers have become faster in the past 50 years. Between 1963
and 2013, computers have become 1000 times faster and factoring
algorithms 1000 times faster; factoring is about a million times faster
in 2013 compared to 1963.
We need a new factoring algorithm. The time complexities of the
fastest known algorithms have the form

(10.2)    $\exp\left( c\,(\ln N)^t (\ln \ln N)^{1-t} \right)$,

for some constants c and 0 < t < 1. For the QS and the ECM,
t = 1/2; for the NFS, t = 1/3. The reason for this shape for the time
complexity is the requirement of finding one or more smooth numbers
(the group order for the ECM, and the numbers in the relations for the
QS and the NFS), together with formula (3.1). Any new factoring algorithm
that succeeds by finding smooth numbers would likely also have time
complexity of the form (10.2). A truly fast factoring algorithm, for
example, a polynomial-time algorithm, probably would not rely on
smooth numbers.
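To get a feeling for formula (10.2), one can evaluate it numerically. In the sketch below the function name `l_complexity` is hypothetical, and the constants c = 1 for the QS and $c = (64/9)^{1/3}$ for the general NFS are the commonly quoted heuristic values, an assumption not taken from the text above. The numbers are rough operation counts with lower-order terms ignored.

```python
import math

def l_complexity(N, t, c):
    """Evaluate exp(c * (ln N)^t * (ln ln N)^(1 - t)), the shape of formula (10.2)."""
    ln_n = math.log(N)
    return math.exp(c * ln_n ** t * math.log(ln_n) ** (1 - t))

N = 2 ** 1024                                        # size of a 1024-bit RSA modulus
qs_cost = l_complexity(N, 1 / 2, 1.0)                # assumed constant for the QS
nfs_cost = l_complexity(N, 1 / 3, (64 / 9) ** (1 / 3))  # assumed constant for the GNFS
print(f"QS:  about 2^{math.log2(qs_cost):.0f}")
print(f"NFS: about 2^{math.log2(nfs_cost):.0f}")
```

The smaller exponent t = 1/3 is what makes the NFS estimate fall below the QS estimate for numbers of this size.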
Genetic algorithms might be an approach to factoring N . A ge-
netic algorithm runs on an ordinary computer and mimics biological
evolution. Begin with a population of random data structures con-
nected to factoring N . A function rates each data structure, with a
higher rating meaning that it is closer to factoring N . Delete a frac-
tion, say 50%, of the data with lowest ratings. Mutate and recombine
some of the remaining data to form new data from pieces of the old
data. (DNA does this in reproduction of living things.) Rate the new
data and repeat the process. (The rating and deleting of items with
low ratings is similar to survival of the fittest.) See Davis [Dav91]
for more details. A few years ago, I tried this approach to factoring
with the data being formulas that input N , perhaps with hints, and
compute an output. The rating function was the number out of ten
composite input numbers for which the formula gave a correct proper
factor. The operations allowed in the formulas were the arithmetic
operations, greatest common divisor, Fast Exponentiation, modular
inverse, Jacobi symbol, etc. When the input was a set of ten compos-
ite N and the hint for N was a base b to which N was a pseudoprime,
but not a strong pseudoprime, the genetic algorithm discovered the
method of the proof of Theorem 10.4 in a few generations of formu-
las. However, when the input was a set of ten composite divisors N
of Cunningham numbers $b^n \pm 1$ and the hint for N was b, the genetic
algorithm could not find a formula to give factors of all ten N after
searching for thousands of generations of formulas. Likewise, the ge-
netic algorithm failed when given only a set of ten composite N with
no hints.
We give a few suggestions for discovering new, faster ways to
factor large numbers.

(1) The fastest general modern algorithms, the QS and the GNFS,
solve $x^2 \equiv y^2 \pmod{N}$ and use Theorem 6.18. Can one construct
congruent squares modulo N in a faster way than using
smooth numbers to build relations?
(2) How about finding a base b so that N is a pseudoprime to
base b but not a strong pseudoprime to base b and applying
Theorem 10.4? Given N , can you find an m such that mN is
a pseudoprime to many bases or even a Carmichael number?
(3) Find an efficient way (perhaps similar to Fast Exponentiation)
to compute m! mod N for any 1 < m < N on a computer
with fixed word size. Then Shamir’s algorithm would become
practical.
(4) Find an efficient algorithm to solve this puzzle: Given a real
number R > 1, compute an integer multiple kR of R that is
within 1 of a square: $|kR - m^2| < 1$. Example: If R = 2π, then
k = 4 works: $|8\pi - 5^2| < 1$. It is an easy exercise (Exercise
10.7) to convert the solution to the puzzle into a fast factoring
algorithm. (This suggestion is due to Schroeppel.)

(5) Try to factor Cunningham numbers via Exercise 10.5. That is,
use the known base b to which the number is a strong pseudo-
prime to construct a base a to which the number is a pseudo-
prime but not a strong pseudoprime.
(6) Theorem 10.4 tells how to factor N , given a base a to which
N is a pseudoprime but not a strong pseudoprime. But sometimes
the method in the proof succeeds even when N is not a
pseudoprime to base a. If you compute $\gcd(a^{2^j f} - 1, N)$ for
random a and all j, sometimes you will factor N . For exam-
ple, N = 341 is factored by this method for every a, whether
N is a pseudoprime to base a or not. How often does this tech-
nique work? Given N , can you predict an a (not necessarily a
pseudoprime base) for which it factors N ?
(7) Try to discover a formula F (N ) for a factor of N using a genetic
algorithm, perhaps with a hint. Maybe I made a mistake,
omitted a vital operation, or did not try enough generations.
(8) According to [GW08], when the SQUFOF is used to factor
a given integer N and many small multipliers m are used, so
that the algorithm essentially computes with the mN, there is
a large variation⁵ in the number of steps (and running time)
taken to factor N as m varies. Given a composite number N,
can one quickly predict a multiplier m for which the SQUFOF
will factor N especially fast, perhaps running in much less time
than the average complexity $O(\sqrt[4]{N})$?

⁵ If W(N) is the number of iterations of the while loop in Algorithm 6.25 (Part
1 of the SQUFOF), then the mean and standard deviation of the statistic $W(N)/\sqrt[4]{N}$
are equal, which suggests an exponential distribution for this random variable.

(9) Can you discover an algorithm to answer this question quickly?
Given integers 1 < B < N, does N have a (prime) factor p ≤
B? This question sounds easier than factoring N because the
answer is either “yes” or “no.” It is called the decision problem
version of the factoring problem. In fact, it is equivalent to
factoring. If you had a fast algorithm to answer the yes/no
question, then you could use it $\log_2 N$ times in a binary search
for the least prime factor of N. Note that when $\lfloor \sqrt{N} \rfloor \le B < N$,
the question simply asks whether N is composite, and we
saw how to answer that question in polynomial time in Section
3.7.
(10) Build a quantum computer and use Shor’s method.
(11) Try DNA computing. Can a DNA computer simulate the El-
liptic Curve Method or the Number Field Sieve, rather than
Trial Division? Can it simulate the SQUFOF or one of the
algorithms in Chapter 5?
(12) Try visual computing. Write a graphics program that inputs a
composite integer N and displays an image representing prop-
erties of N . For example, each pixel might show the Jacobi
symbol (a/N ) for a different a. Manipulate the image with a
mouse as one might move around and zoom in on a map. Look
for interesting features of the image that might lead to a factor
of N . Perhaps a “crack” in the image might signal a nearby
split of N into proper factors.
(13) If none of the above works, then try to prove that factoring is
hard. Remember Shamir’s O(log N )-step algorithm of Section
10.1. Can you even prove that there is a constant c > 0 such
that when factoring some numbers N on a computer with fixed
word size, at least $c \log_2 N$ steps are required?
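
Suggestions (6) and (9) above are easy to experiment with. The following Python sketch is only an illustration under stated assumptions, not the book's code. The function `gcd_probe` carries out the gcd computation of suggestion (6), assuming that f is the odd part of N − 1 (so N − 1 = 2^e f) and that j runs from 0 to e. The function `least_prime_factor` performs the binary search of suggestion (9); the oracle `has_factor_at_most` is a hypothetical stand-in that uses slow trial division, since no fast yes/no algorithm is known.

```python
from math import gcd


def gcd_probe(N, a):
    """Suggestion (6): return a proper factor of odd N found among
    gcd(a^(2^j * f) - 1, N) for j = 0..e, or None if every gcd is 1 or N.
    Assumption: N - 1 = 2^e * f with f odd."""
    f, e = N - 1, 0
    while f % 2 == 0:
        f //= 2
        e += 1
    x = pow(a, f, N)              # a^f mod N by fast exponentiation
    for _ in range(e + 1):
        g = gcd(x - 1, N)
        if 1 < g < N:
            return g
        x = x * x % N             # advance to a^(2^(j+1) * f) mod N
    return None


def has_factor_at_most(N, B):
    """Stand-in oracle for suggestion (9): does N have a prime factor <= B?
    Plain trial division, so it is NOT fast; it only fixes the interface."""
    return any(N % d == 0 for d in range(2, B + 1))


def least_prime_factor(N, oracle=has_factor_at_most):
    """Binary search for the least prime factor of N >= 2,
    using about log2(N) oracle calls."""
    lo, hi = 2, N                 # invariant: the least prime factor is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(N, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    print(gcd_probe(341, 2))          # 31
    print(gcd_probe(341, 3))          # 11
    print(least_prime_factor(341))    # 11
```

In this sketch a base to which N is a strong pseudoprime yields only trivial gcds. For instance, 4^85 ≡ 1 (mod 341), so `gcd_probe(341, 4)` returns `None`.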

Exercises

10.1. The number

49023533868409411085186322722914188749822273357609280786433

is a Carmichael number. Factor it in less than a second of
computer time using the trick in Section 10.4. Then prove
that it is a Carmichael number using Korselt’s Criterion, Theorem 3.36.
10.2. Suppose N is known to be the product of three unknown
primes. Can one factor N , given N and φ(N )? If not, then
what additional information about N would help to factor it?
Would σ(N ) work?
10.3. Twin brothers Bill and Bob use the same RSA modulus N
but have different enciphering exponents, $e_1$ for Bill and $e_2$
for Bob. Alice enciphers the same message M and sends it
to both brothers. Eve records the two ciphertexts. Assuming
that $\gcd(e_1, e_2) = 1$, explain how Eve can read M without
factoring N. Can Eve use this information to factor N?
10.4. While experimenting with Bob’s RSA public keys N and e,
Eve discovered an interesting property. If she enciphered a
certain message M seven times, she got M back. That is, if
$E(M) = M^e \bmod N$ is the enciphering algorithm, then
E(E(E(E(E(E(E(M ))))))) = M.
(Such an M is called a fixed point of order 7 for RSA with
these keys.) Can Eve use this information to factor N ?
10.5. Corollary 4.5 shows that almost any composite divisor N of
the primitive part of $b^n \pm 1$ is a strong pseudoprime to base b.
Can you use this information to compute (quickly) a base a
to which N is a pseudoprime but not a strong pseudoprime?
If so, then you could factor Cunningham numbers quickly via
Theorem 10.4. Recall Exercise 3.20.
10.6. Suppose N is a large odd composite integer and you discover
an integer b so that N is a strong pseudoprime to base b. Can
you use this information to construct a good polynomial for
factoring N by the Special Number Field Sieve?
10.7. Assume there is a fast way to solve the puzzle in suggestion
(4). Develop an efficient algorithm to factor N . Hint: The
puzzle solution allows you to convert a quadratic residue mod-
ulo N with known square root into another quadratic residue
with known square root and smaller absolute value. Iterate
this process until you reach a square root of 1 or obtain a
set of small residues whose product is square. In either case,
apply Theorem 6.18.
Appendix

Answers and Hints for Exercises

Introduction
In this appendix we give answers to some exercises and hints for some
other ones.

A.1. Chapter 1
Answers and hints for exercises in Chapter 1.

1.3. The period length of 1/9091 is 10 and that of 1/9901 is 12.
1.4. The period for the decimal expansion of 1/49 is
020408163265306122448979591836734693877551 of length 42.
1.5. The period length of 1/13 in base 3 is 3 because 13 divides
$3^3 - 1$.
1.6. The fourth and fifth Mersenne primes are $2^7 - 1$ and $2^{13} - 1$.
They give perfect numbers $2^6(2^7 - 1) = 8128$ and
$2^{12}(2^{13} - 1) = 33550336$.
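
The period lengths in answers 1.3 to 1.5 can be checked by computing the multiplicative order of the base modulo the denominator, and the perfect numbers in 1.6 by summing divisors. The Python sketch below is an illustration only. The names `period_length` and `is_perfect` are hypothetical, and both routines use naive loops that are adequate only for small inputs.

```python
from math import gcd


def period_length(n, base=10):
    """Length of the repeating block of 1/n written in the given base,
    i.e. the multiplicative order of base modulo n.
    Assumption: n >= 2 and gcd(base, n) == 1 (purely periodic expansion)."""
    if n < 2 or gcd(base, n) != 1:
        raise ValueError("need n >= 2 relatively prime to the base")
    k, power = 1, base % n
    while power != 1:
        power = power * base % n
        k += 1
    return k


def is_perfect(n):
    """True if n equals the sum of its proper divisors (trial division)."""
    total, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
        d += 1
    return n > 1 and total == n


if __name__ == "__main__":
    print(period_length(9091), period_length(9901))   # 10 12
    print(period_length(49), period_length(13, 3))    # 42 3
    print(is_perfect(8128), is_perfect(33550336))     # True True
```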

A.2. Chapter 2
Answers and hints for exercises in Chapter 2.
2.1. We have 321(19) + 381(−16) = 3.
2.2. We find that the $p_n$, n = 11, 12, 13, ..., are 2801, 11, 17, 5471,
52662739, 23003, 30693651606209, 37, 1741, 1313797957, 887,
71, 7127, 109, 23, 97, 159227, 643679794963466223081509857,
....
2.3. The number is roughly $\ln 10^{300} \approx 691$. It would be half as
much if we considered only odd 300-digit numbers.
2.4. 99.
2.5. 41.
2.6. 43.
2.8. 0, 1, 4 (mod 8).
2.9. No.
2.11. −1, +1, +1, −1, +1.
2.12. Hint: Consider two cases: p ≡ 1 (mod 4) and p ≡ 3 (mod 4).
2.13. (a) $10 \equiv 1 \pmod{3}$. (b) $10 \equiv -1 \pmod{11}$. (c) $10^2 \equiv
-11 \pmod{37}$ and $10^3 \equiv 1 \pmod{37}$.
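
The identity in answer 2.1 is the kind of relation the extended Euclidean algorithm produces. A minimal Python sketch follows; the name `extended_gcd` is an assumption for illustration and not the book's notation.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r      # remainder sequence
        old_s, s = s, old_s - q * s      # coefficients of a
        old_t, t = t, old_t - q * t      # coefficients of b
    return old_r, old_s, old_t


if __name__ == "__main__":
    g, s, t = extended_gcd(321, 381)
    print(g, s, t)                       # 3 19 -16
```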

A.3. Chapter 3
Answers and hints for exercises in Chapter 3.
3.1. The probability that a number near $10^{36}$ is $10^6$-smooth is ρ(u),
where u = 36/6 = 6, that is, ρ(6) ≈ 0.0000197. The number
of $10^6$-smooth numbers between $10^{36}$ and $10^{36} + 10^7$ is about
$10^7 \rho(6) \approx 197$.
3.2. Same answer as for the previous exercise.
3.3. Note that 253 = 11 · 23. The answer is 3 or −3 modulo each
of these primes. Combine with four applications of the CRT.
The answers are x = 3, 118, 135, 250 modulo 253.
3.12. Hint: Use Bang’s Theorem 3.19.
3.19. This is Corollary 4.5.
3.20. True, True, True, False.
3.23. Usually not.
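
The CRT combination described in answer 3.3 can be sketched in Python as follows. The helper names `crt_pair` and `square_roots_mod_pq` are hypothetical, and the sketch assumes Python 3.8 or later for the modular inverse `pow(m1, -1, m2)`.

```python
from itertools import product


def crt_pair(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) for coprime m1, m2.
    Uses pow(m1, -1, m2), the modular inverse (Python 3.8+)."""
    k = (r2 - r1) * pow(m1, -1, m2) % m2
    return (r1 + m1 * k) % (m1 * m2)


def square_roots_mod_pq(root_p, p, root_q, q):
    """All x modulo p*q with x = +-root_p (mod p) and x = +-root_q (mod q)."""
    return sorted({crt_pair(sp * root_p % p, p, sq * root_q % q, q)
                   for sp, sq in product((1, -1), repeat=2)})


if __name__ == "__main__":
    print(square_roots_mod_pq(3, 11, 3, 23))   # [3, 118, 135, 250]
```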

A.4. Chapter 4
Answers and hints for exercises in Chapter 4.
4.1. We have $\Phi_5(x) = x^4 + x^3 + x^2 + x + 1 = (x^2 + 3x + 1)^2 - 5x(x + 1)^2$,
so, for odd h,
$\Phi_5(5^h) = [5^{2h} + 3 \cdot 5^h + 1 - 5^{(h+1)/2}(5^h + 1)][5^{2h} + 3 \cdot 5^h + 1 + 5^{(h+1)/2}(5^h + 1)]$.
4.2. We have $13^{13h} - 1 = (13^h - 1)\Phi_{13}(13^h)$ and $\Phi_{13}(13^h) = L_{13h} M_{13h}$,
where $L_{13h}$, $M_{13h}$ are
$13^{6h} + 7 \cdot 13^{5h} + 15 \cdot 13^{4h} + 19 \cdot 13^{3h} + 15 \cdot 13^{2h} + 7 \cdot 13^h + 1$
$\mp 13^{(h+1)/2}(13^{5h} + 3 \cdot 13^{4h} + 5 \cdot 13^{3h} + 5 \cdot 13^{2h} + 3 \cdot 13^h + 1)$.
4.17. 14316 is the smallest element of a set of 28 sociable numbers.
4.21. Assume p < q. Then q | Qq−1 but q ∤ Qp−1.
4.23. e and d must be relatively prime to φ(n), which is always even
for n > 2.
4.26. 193 and 24847873 are the sum of two squares.
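
The factorizations in answers 4.1 and 4.2 can be checked numerically for small odd h. The following Python sketch is a sanity check rather than a proof. The function names are hypothetical, and the code assumes h is odd so that (h + 1)/2 is an integer.

```python
def phi_at(p, x):
    """Value of the p-th cyclotomic polynomial at x for prime p:
    1 + x + ... + x^(p-1)."""
    return sum(x**i for i in range(p))


def aurifeuillian_5(h):
    """The two factors of Phi_5(5^h) from answer 4.1 (h odd)."""
    x, r = 5**h, 5**((h + 1) // 2)
    c, d = x * x + 3 * x + 1, x + 1
    return c - r * d, c + r * d


def aurifeuillian_13(h):
    """The two factors L, M of Phi_13(13^h) from answer 4.2 (h odd)."""
    x, r = 13**h, 13**((h + 1) // 2)
    c = x**6 + 7 * x**5 + 15 * x**4 + 19 * x**3 + 15 * x**2 + 7 * x + 1
    d = x**5 + 3 * x**4 + 5 * x**3 + 5 * x**2 + 3 * x + 1
    return c - r * d, c + r * d


if __name__ == "__main__":
    print(aurifeuillian_5(1))            # (11, 71), and 11 * 71 = 781 = Phi_5(5)
    for h in (1, 3, 5):
        L, M = aurifeuillian_13(h)
        print(h, L * M == phi_at(13, 13**h))
```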

A.5. Chapter 5
Answers and hints for exercises in Chapter 5.
5.6. This N is not the sum of two squares because it has a prime
factor 3119 ≡ 3 (mod 4).

A.6. Chapter 6
Answers and hints for exercises in Chapter 6.
6.3. p = 1068.
6.4. Hint: When $N = p^3$, if the CFRAC finds x, y with $x^2 \equiv y^2 \pmod{N}$
and $1 < \gcd(x - y, N) < N$, then p must be
in the factor base.

A.7. Chapter 7
Answers and hints for exercises in Chapter 7.
7.1. Hint: The sum of the roots of the cubic $x^3 + ax + b = 0$ is 0
because there is no $x^2$ term.
7.2. 2P = (−5, −16), 3P = (11, −32), 4P = (11, 32), 5P = (−5, 16),
k = 7.
7.3. 2P = (0, 0), 3P = (2, 7), 4P = ∞, k = 4.

A.8. Chapter 8
Answers and hints for exercises in Chapter 8.
8.1. The formula is $p \lfloor (I + p - 1)/p \rfloor$.

A.9. Chapter 10
Answers and hints for exercises in Chapter 10.
10.1. One of the prime factors is 576297563010049.
10.5. This is a research problem.
10.6. This is a research problem.
10.7. We tell how to solve the problem in the hint using the solution
to the puzzle. Given x and a with $x^2 \equiv a \pmod{N}$, we must
find y and b with $y^2 \equiv b \pmod{N}$ and $0 < |b| < |a|$. We may
assume that $|a| < N$. Let $R = N/|a|$. Then $R > 1$. The puzzle
solution gives k and m with $kR = m^2 + \varepsilon$, where $|\varepsilon| < 1$. Let
$y = xm$ and $b = -\varepsilon a$. Then
\[
\begin{aligned}
y^2 = x^2 m^2 = x^2 (kR - \varepsilon) &= x^2 \left( \frac{kN}{|a|} - \varepsilon \right) \\
&\equiv a \left( \frac{kN}{|a|} - \varepsilon \right) = \pm kN - \varepsilon a \equiv -\varepsilon a = b \pmod{N}
\end{aligned}
\]
and $|b| < |a|$.
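
The reduction step in answer 10.7 can be tried out in Python, with a brute-force search standing in for the fast puzzle solver, which is not known to exist. This is a sketch under stated assumptions: the names `solve_puzzle_real`, `solve_puzzle`, and `reduce_residue` are hypothetical, the search may be very slow for large N, and the integer version assumes 2 ≤ |a| < N and gcd(x, N) = 1 so that a solution with ε ≠ 0 exists.

```python
from math import isqrt, pi, sqrt


def solve_puzzle_real(R, k_max=10**6):
    """Smallest k <= k_max (with its m) such that |k*R - m^2| < 1, else None.
    Uses floating point, so it is only a demonstration for modest R."""
    for k in range(1, k_max + 1):
        m0 = int(sqrt(k * R))
        for m in (m0, m0 + 1):
            if abs(k * R - m * m) < 1:
                return k, m
    return None


def solve_puzzle(N, a_abs):
    """Exact integer version for R = N/a_abs: find (k, m) with
    0 < |k*N - m*m*a_abs| < a_abs, i.e. 0 < |k*R - m^2| < 1."""
    if not 2 <= a_abs < N:
        raise ValueError("need 2 <= |a| < N")
    k = 1
    while True:
        m0 = isqrt(k * N // a_abs)            # floor of sqrt(k*R)
        for m in (m0, m0 + 1):
            if 0 < abs(k * N - m * m * a_abs) < a_abs:
                return k, m
        k += 1


def reduce_residue(N, x, a):
    """Given x^2 = a (mod N), return (y, b) with y^2 = b (mod N)
    and 0 < |b| < |a|, following y = x*m and b = -eps*a."""
    k, m = solve_puzzle(N, abs(a))
    sign = 1 if a > 0 else -1
    b = m * m * a - sign * k * N              # equals -eps * a
    return x * m % N, b


if __name__ == "__main__":
    print(solve_puzzle_real(2 * pi))          # (4, 5), the example in suggestion (4)
    print(reduce_residue(77, 10, 23))         # (20, 15): 20^2 = 400 = 15 (mod 77)
```

Iterating `reduce_residue` shrinks |b| at each step, which is the process the hint to Exercise 10.7 asks the reader to repeat.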
Bibliography

[Adl94] L. M. Adleman, Molecular computation of solutions to combinatorial
problems, Science 266 (November 11, 1994), 1021–1024.
[AGLL95] D. Atkins, M. Graff, A. K. Lenstra, and P. C. Leyland, The
magic words are squeamish ossifrage, ASIACRYPT ’94 Proceedings
of the 4th International Conference on the Theory and Applications
of Cryptology, pp. 263–277, Springer-Verlag, London,
UK, 1995.
[AGP94] W. Alford, A. Granville, and C. Pomerance, There are infinitely
many Carmichael numbers, Ann. of Math. 139 (1994), 703–722.
[AKS04] M. Agrawal, N. Kayal, and N. Saxena, PRIMES is in P, Ann.
of Math. 160 (2004), 781–793.
[AM93] A. O. L. Atkin and F. Morain, Elliptic curves and primality
proving, Math. Comp. 61 (1993), 29–68.
[AP93] W. R. Alford and C. Pomerance, Implementing the self-initializing
quadratic sieve on a distributed network, Number
Theoretic and Algebraic Methods in Computer Science
(Moscow) (A. van der Poorten, I. Shparlinski, and H. G. Zimmer,
eds.), 1993, pp. 163–174.
[Arn09] V. I. Arnold, Lengths of periods of continued fractions of square
roots of integers, Funct. Anal. Other Math. 2 (2009), 151–164.
[Bac85] E. Bach, Analytic Methods in the Analysis and Design of
Number-Theoretic Algorithms, The MIT Press, Cambridge,
Massachusetts, 1985.
[BBS86] L. Blum, M. Blum, and M. Shub, A simple unpredictable pseudo-random
number generator, SIAM J. Comput. 15 (1986), 364–383.
[BC89] R. P. Brent and G. L. Cohen, A new lower bound for odd perfect
numbers, Math. Comp. 53 (1989), 431–437.
[BD00] D. Boneh and G. Durfee, Cryptanalysis of RSA with private key
d less than $n^{0.292}$, IEEE Trans. on Info. Theory 46(4) (2000),
1339–1349.
[Bel51] E. T. Bell, Mathematics: Queen and Servant of Science,
McGraw-Hill, New York, 1951.
[Ber98] D. J. Bernstein, Detecting perfect powers in essentially linear
time, Math. Comp. 67 (1998), 1253–1283.
[BF01] D. Boneh and M. Franklin, Identity based encryption from the
Weil pairing, Advances in Cryptology—Proc. CRYPTO ’01
(Springer-Verlag, Berlin, New York), Lecture Notes in Com-
puter Science, vol. 2139, 2001, pp. 213–229.
[BFLS13] N. Bliss, B. Fulan, S. Lovett, and J. Sommars, Strong divisibility,
cyclotomic polynomials, and iterated polynomials, Amer. Math.
Monthly 120 (2013), 519–536.
[BH11] J. P. Buhler and D. Harvey, Irregular primes to 163 million,
Math. Comp. 80 (2011), 2435–2444.
[BLP93] J. Buhler, H. W. Lenstra, Jr., and C. Pomerance, Factoring
integers with the number field sieve, The Development of the
Number Field Sieve (Springer-Verlag, Berlin, New York) (A. K.
Lenstra and H. W. Lenstra, Jr., eds.), Lecture Notes in Mathe-
matics, vol. 1554, 1993, pp. 50–94.
[BLS75] J. Brillhart, D. H. Lehmer, and J. L. Selfridge, New primality
criteria and factorizations of $2^m \pm 1$, Math. Comp. 29 (1975),
620–647.
[BLS+02] John Brillhart, D. H. Lehmer, J. L. Selfridge, Bryant Tuckerman,
and S. S. Wagstaff, Jr., Factorizations of $b^n \pm 1$,
b = 2, 3, 5, 6, 7, 10, 11, 12 up to high powers, Third
ed., Contemporary Mathematics, vol. 22, Amer. Math. Soc.,
Providence, Rhode Island, 2002, Electronic book available at
http://www.ams.org/bookstore/conmseries.
[BMS88] J. Brillhart, P. L. Montgomery, and R. D. Silverman, Tables
of Fibonacci and Lucas factorizations, Math. Comp. 50 (1988),
251–260.
[Bon99] D. Boneh, Twenty years of attacks on the RSA cryptosystem,
Notices Amer. Math. Soc. 46 (1999), 202–213.
[Bre80] R. P. Brent, An improved Monte Carlo factorization method,
BIT 20 (1980), 176–184.
[Bre86] R. P. Brent, Some integer factorization algorithms using elliptic
curves, Australian Computer Science Communications 8 (1986),
149–163.
[Bre93] R. P. Brent, On computing factors of cyclotomic polynomials, Math.
Comp. 61 (1993), 131–149.
[Bre95] R. P. Brent, Computing Aurifeuillian factors, Computational Algebra
and Number Theory (Sydney, 1992) (Dordrecht), Math.
Appl., vol. 325, Kluwer Acad. Publ., 1995, pp. 201–212.
[BS67] J. Brillhart and J. L. Selfridge, Some factorizations of $2^n \pm 1$ and
related results, Math. Comp. 21 (1967), 87–96, Corrigendum,
ibid., 251.
[BS96] E. Bach and J. Shallit, Algorithmic Number Theory, Volume
I: Efficient Algorithms, The MIT Press, Cambridge, Mas-
sachusetts, 1996.
[BSW89] P. T. Bateman, J. L. Selfridge, and S. S. Wagstaff, Jr., The new
Mersenne conjecture, Amer. Math. Monthly 96 (1989), 125–128.
[Bue89] D. A. Buell, Binary Quadratic Forms, Springer-Verlag, New
York, 1989.
[BW71] B. D. Beach and H. C. Williams, Some computer results on pe-
riodic continued fractions, Proceedings of the Second Louisiana
Conference on Combinatorics, Graph Theory and Computing,
LSU, Baton Rouge, LA, 1971, pp. 133–146.
[BW80] R. Baillie and S. S. Wagstaff, Jr., Lucas pseudoprimes, Math.
Comp. 35 (1980), 1391–1417.
[BZ09] R. P. Brent and P. Zimmermann, Ten new primitive binary tri-
nomials, Math. Comp. 78 (2009), 1197–1199.
[BZ10] R. P. Brent and P. Zimmermann, Modern Computer Arithmetic, Cambridge University
Press, 2010.
[BZ11] R. P. Brent and P. Zimmermann, The great trinomial hunt, Notices Amer. Math. Soc. 58
(2) (2011), 233–239.
[CEP83] E. Canfield, P. Erdős, and C. Pomerance, On a problem of Op-
penheim concerning “factorisatio numerorum”, J. Number The-
ory 17 (1983), 1–28.
[CGH05] W.-L. Chang, M. Guo, and M. S.-H. Ho, Fast parallel molecu-
lar algorithms for DNA-based computation: factoring integers,
IEEE Trans. on Nanobioscience 4(2) (2005), 149–163.
[CM07] J.-S. Coron and A. May, Deterministic polynomial time equiva-
lence of computing the RSA secret key and factoring, J. Cryp-
tology 20(1) (2007), 39–50.
[Cop94] D. Coppersmith, Solving homogeneous linear equations over
GF(2) via block Wiedemann algorithm, Math. Comp. 62 (1994),
333–350.
[Cop96a] D. Coppersmith, Finding a small root of a bivariate integer equation;
factoring with high bits known, Advances in Cryptology—Eurocrypt ’96
(Springer-Verlag, Berlin), Lecture Notes in Computer
Science, vol. 1070, 1996, pp. 178–189.
[Cop96b] D. Coppersmith, Finding a small root of a univariate modular equation,
Advances in Cryptology—Eurocrypt ’96 (Springer-Verlag,
Berlin), Lecture Notes in Computer Science, vol. 1070, 1996,
pp. 155–165.
[Cop97] D. Coppersmith, Small solutions to polynomial equations, and low exponent
RSA vulnerabilities, J. Cryptology 10(4) (1997), 233–260.
[CP05] R. Crandall and C. Pomerance, Prime Numbers: A Computa-
tional Perspective, Second ed., Springer-Verlag, New York, 2005.
[CW25] A. J. C. Cunningham and H. J. Woodall, Factorisation of $y^n \mp 1$,
y = 2, 3, 5, 6, 7, 10, 11, 12 up to High Powers (n), Francis
Hodgson, London, 1925.
[Dav91] L. Davis (ed.), Handbook of Genetic Algorithms, Van Nostrand
Reinhold, New York, 1991.
[DD86] H. Dubner and R. Dubner, The development of a powerful, low-
cost computer for number theory applications, J. Rec. Math. 18
(1986), 81–86.
[Deu41] M. Deuring, Die Typen der Multiplikatorenringe elliptischer
Funktionenkörper, Abh. Math. Sem. Hansischen Univ. 14
(1941), 197–272.
[DF04] D. S. Dummit and R. M. Foote, Abstract Algebra, Third ed.,
John Wiley & Sons, New York, 2004.
[DH76] W. Diffie and M. Hellman, New directions in cryptography, IEEE
Trans. on Info. Theory IT-22(6) (1976), 644–654.
[Dic30] K. Dickman, On the frequency of numbers containing prime fac-
tors of a certain relative magnitude, Ark. Mat., Astronomi och
Fysik 22A, 10 (1930), 1–14.
[Dic71] L. E. Dickson, History of the Theory of Numbers, Volume I,
Divisibility and Primality, Carnegie Institute of Washington,
Reprinted by Chelsea Publishing Company, New York, 1971,
originally published in 1919.
[Dix81] J. D. Dixon, Asymptotically fast factorization of integers, Math.
Comp. 36 (1981), 255–260.
[DXS91] C. Ding, G. Xiao, and W. Shan, The stability theory of stream
ciphers, Lecture Notes in Computer Science, vol. 561, Springer-Verlag,
New York, 1991.
[EK40] P. Erdős and M. Kac, The Gaussian law of errors in the theory of
additive number theoretic functions, Amer. J. Math. 62 (1940),
738–742.
[EP86] P. Erdős and C. Pomerance, On the number of false witnesses
for a composite number, Math. Comp. 46 (1986), 259–279.
[Erd35] P. Erdős, On the normal number of prime factors of p − 1 and
some related problems concerning Euler’s φ-function, Quarterly
J. Math. 6 (1935), 205–213.
[Erd56] P. Erdős, On pseudoprimes and Carmichael numbers, Publicationes
Mathematicae Debrecen 4 (1956), 201–206.
[Est10] N. Estibals, Compact hardware for computing the Tate pair-
ing over 128-bit-security supersingular curves, Pairing-based
cryptography—Pairing 2010 (Springer-Verlag, Berlin), Lecture
Notes in Computer Science, vol. 6487, 2010, pp. 397–416.
[EW80] P. Erdős and S. S. Wagstaff, Jr., The fractional parts of the
Bernoulli numbers, Illinois J. Math. 24 (1980), 104–112.
[Fei13] Jan Feitsma, The pseudoprimes below $2^{64}$, 2013. See the URL
http://www.janfeitsma.nl/math/psp2/.
[FKP+05] J. Franke, T. Kleinjung, C. Paar, J. Pelzl, C. Priplata, and
C. Stahlke, SHARK: a realizable special hardware sieving device
for factoring 1024-bit integers, Workshop on Special Purpose
hardware for attacking cryptographic systems (SHARCS), 2005,
pp. 27–37.
[Gau01] C. F. Gauss, Disquisitiones Arithmeticae, G. Fleischer, Leipzig,
1801, English translation by A. A. Clarke published by Yale
University Press, 1966, reprinted by Springer-Verlag, 1986.
[Gér37] A. Gérardin, Machine à congruences, 70e Congrès des Sociétés
Savantes de Paris et des Départements, Section des Sciences
(Gauthier-Villars, Paris), vol. II, 1937.
[GK86] S. Goldwasser and J. Kilian, Almost all primes can be quickly
certified, Proc. Eighteenth Annual ACM Symp. on the Theory
of Computing (STOC), Berkeley, May 28-30, 1986, ACM, 1986,
pp. 316–329.
[GO07] T. Goto and K. Okeya, All harmonic numbers less than $10^{14}$,
Japan J. Indust. Appl. Math. 24 (2007), 275–288.
[Gol82] S. W. Golomb, Shift Register Sequences, Revised ed., Aegean
Park, 1982.
[GP07] A. Granville and P. Pleasants, Aurifeuillian factorization, Math.
Comp. 75 (2007), 497–508.
[Gra05] A. Granville, It is easy to determine whether a given integer is
prime, Bull. Amer. Math. Soc. (N.S.) 42 (2005), 3–38.
[GS75] R. K. Guy and J. L. Selfridge, What drives an aliquot sequence?,
Math. Comp. 29 (1975), 101–107.
[GS03] W. Geiselmann and R. Steinwandt, A dedicated sieving hard-
ware, PKC 2003 (Springer-Verlag, Berlin), Lecture Notes in
Computer Science, vol. 2567, 2003, pp. 254–266.
[GS04] W. Geiselmann and R. Steinwandt, Yet another sieving device, CT-RSA 2004
(Springer-Verlag, Berlin), Lecture Notes in Computer Science, vol. 2964,
2004, pp. 278–291.
[Guy04] R. K. Guy, Unsolved Problems in Number Theory, Third ed.,
Springer-Verlag, New York, 2004.
[GW08] J. Gower and S. S. Wagstaff, Jr., Square form factorization,
Math. Comp. 77 (2008), 551–588.
[Ham62] R. W. Hamming, Numerical Methods for Scientists and Engi-
neers, International Series in Pure and Applied Mathematics,
McGraw-Hill, New York, 1962.
[Har12] W. B. Hart, A one line factoring algorithm, J. Aust. Math. Soc.
92 (2012), 61–69.
[Her99] I. N. Herstein, Abstract Algebra, Third ed., John Wiley & Sons,
New York, 1999.
[HG97] N. Howgrave-Graham, Finding small roots of univariate modu-
lar equations revisited, Proceedings of Cryptography and Coding
(Springer-Verlag, Berlin), Lecture Notes in Computer Science,
vol. 1355, 1997, pp. 131–142.
[HR17] G. H. Hardy and S. Ramanujan, The normal number of prime
factors of a number n, Quart. J. Math. 48 (1917), 76–92.
[HW79] G. H. Hardy and E. M. Wright, An Introduction to the Theory
of Numbers, Fifth ed., Clarendon Press, Oxford, England, 1979.
[IR98] K. Ireland and M. Rosen, A Classical Introduction to Modern
Number Theory, Springer-Verlag, Berlin, New York, 1998.
[Jan98] G. Janusz, Algebraic Number Fields, Second ed., Amer. Math.
Soc., Providence, Rhode Island, 1998.
[Kan57] H.-J. Kanold, Über das harmonische Mittel der Teiler einer
natürlichen Zahl, Math. Annalen 133 (1957), 371–374.
[Knu81] D. E. Knuth, The Art of Computer Programming, Volume 2,
Seminumerical Algorithms, Second ed., Addison-Wesley, Read-
ing, Massachusetts, 1981.
[Kob87a] N. Koblitz, A Course in Number Theory and Cryptography,
Springer-Verlag, New York, 1987.
[Kob87b] N. Koblitz, Elliptic curve cryptosystems, Math. Comp. 48 (1987),
203–209.
[Kor99] A. Korselt, Problème chinois, L’Intermédiaire des Mathématiciens
6 (1899), 142–143.
[KP76] D. E. Knuth and L. Trabb Pardo, Analysis of a simple fac-
torization algorithm, Theoretical Computer Science 3 (1976),
321–348.
[Kra29] M. Kraitchik, Recherches sur la Théorie des Nombres,
Gauthier-Villars, Paris, France, 1929.
[Law95] F. W. Lawrence, Factorisation of numbers, Messenger of Math.
24 (1895), 100–109.
[Leh09] D. N. Lehmer, Factor table for the first ten millions containing
the smallest factor of every number not divisible by 2, 3, 5,
or 7 between the limits 0 and 10017000, Carnegie Institute of
Washington, 1909.
[Leh18] D. N. Lehmer, On the history of the problem of separating a number
into its prime factors, Scientific Monthly (September 1918),
227–234.
[Leh27] D. H. Lehmer, Tests for primality by the converse of Fermat’s
theorem, Bull. Amer. Math. Soc. 33 (1927), 327–340.
[Leh33a] D. H. Lehmer, A photo-electric number sieve, Amer. Math. Monthly
40 (1933), 401–406.
[Leh33b] D. N. Lehmer, Hunting big game in the theory of numbers,
Scripta Math. 1 (March 1933), 229–235.
[Leh66] D. H. Lehmer, An announcement concerning the delay line sieve
DLS 127, Math. Comp. 20 (1966), 645–646.
[Leh69] D. H. Lehmer, Computer technology applied to the theory of numbers,
Studies in Number Theory, Math. Assn. Amer. (distributed by
Prentice-Hall), 1969, pp. 117–151.
[Leh74] R. S. Lehman, Factoring large integers, Math. Comp. 28 (1974),
637–646.
[Len82] H. W. Lenstra, Jr., Primality testing, Computational Methods
in Number Theory, Part 1 (CWI, Amsterdam) (H. W. Lenstra,
Jr. and R. Tijdeman, eds.), Math. Centrum Tract, vol. 154,
1982, pp. 55–77.
[Len87] H. W. Lenstra, Jr., Factoring integers with elliptic curves, Ann. of Math.
126 (1987), 649–673.
[Lev77] W. J. Levesque, Fundamentals of Number Theory, Dover, 1977.
[Lip95] R. J. Lipton, DNA solution of hard computational problems,
Science 268 (1995), 542–545.
[LL74] D. H. Lehmer and Emma Lehmer, A new factorization technique
using quadratic forms, Math. Comp. 28 (1974), 625–635.
[LL93] A. K. Lenstra and H. W. Lenstra, Jr. (eds.), The development
of the number field sieve, Lecture Notes in Mathematics, vol.
1554, Springer-Verlag, New York, 1993.
[LLL82] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovász, Factoring poly-
nomials with rational coefficients, Math. Annalen 261 (1982),
515–534.
[LM89] A. K. Lenstra and M. S. Manasse, Factoring by electronic mail,
Advances in Cryptology—EUROCRYPT ’89 (Springer, Berlin),
Lecture Notes in Computer Science, vol. 434, 1989, pp. 355–371.
[Loo51] W. Looff, Über die Periodicität der Decimalbrüche, Archiv der
Mathematik und Physik. 16 (1851), 54–57.
[LP31] D. H. Lehmer and R. E. Powers, On factoring large numbers,
Bull. Amer. Math. Soc. 37 (1931), 770–776.
[LP92] H. W. Lenstra, Jr. and C. Pomerance, A rigorous time bound
for factoring integers, Jour. Amer. Math. Soc. 5 (1992), no. 3,
483–516.
[LTS+03] A. K. Lenstra, E. Tromer, A. Shamir, W. Kortsmit, B. Dodson,
J. Hughes, and P. C. Leyland, Factoring estimates for a 1024-
bit RSA modulus, Lecture Notes in Computer Science, vol. 2894,
pp. 55–74, Springer, Berlin, 2003.
[Luc78] É. Lucas, Sur les formules de Cauchy et de Lejeune Dirichlet,
Assoc. Française pour l’Avanc. Sci., Comptes Rendus 7 (1878),
164–173.
[Mat92] G. B. Mathews, Theory of Numbers, Part I, Deighton, Bell &
Co., Cambridge, England, 1892.
[May04] A. May, Computing the RSA secret key is deterministic poly-
nomial time equivalent to factoring, Advances in Cryptology—
Proc. CRYPTO 2004 (Springer-Verlag, Berlin, New York), Lec-
ture Notes in Computer Science, vol. 3152, 2004, pp. 213–219.
[Maz11] B. Mazur, How can we construct abelian Galois extensions of
basic number fields?, Bull. Amer. Math. Soc. (N.S.) 48 (2011),
155–209.
[MB75] M. A. Morrison and J. Brillhart, A method of factoring and the
factorization of F7 , Math. Comp. 29 (1975), 183–205.
[McK99] J. McKee, Speeding Fermat’s factoring method, Math. Comp.
68 (1999), 1729–1737.
[Mil76] G. Miller, Riemann’s hypothesis and tests for primality, J. Comput.
System Sci. 13 (1976), 300–317.
[Mil87] V. Miller, Use of elliptic curves in cryptography, Advances in
Cryptology—Proc. CRYPTO ’85 (Springer-Verlag, Berlin, New
York), Lecture Notes in Computer Science, vol. 218, 1987,
pp. 417–426.
[MK87] M. Morimoto and Y. Kida, Factorization of Cyclotomic Num-
bers, Sophia University, Tokyo, 1987.
[MKK92] M. Morimoto, Y. Kida, and M. Kobayashi, Factorization of Cy-
clotomic Numbers, III, Sophia University, Tokyo, 1992.
[MKS89] M. Morimoto, Y. Kida, and M. Saito, Factorization of Cyclo-
tomic Numbers, II, Sophia University, Tokyo, 1989.
[MM99] P. L. Montgomery and B. Murphy, Improved polynomial selec-
tion for the number field sieve, The Mathematics of Public Key
Cryptography Conference (Toronto), Fields Institute, 1999.
[Mon80] L. Monier, Evaluation and comparison of two efficient proba-
bilistic primality testing algorithms, Theoret. Comput. Sci. 12
(1980), 97–108.
[Mon85] P. L. Montgomery, Modular multiplication without trial division,
Math. Comp. 44 (1985), 519–521.
[Mon87] P. L. Montgomery, Speeding the Pollard and elliptic curve methods of factoring,
Math. Comp. 48 (1987), 243–264.
[Mon94] P. L. Montgomery, Square roots of products of algebraic numbers, Mathematics
of Computation 1943–1993 (W. Gautschi, ed.), Proc.
Symp. Appl. Math., vol. 48, Amer. Math. Soc., 1994, pp. 567–571.
[Mon95] P. L. Montgomery, A block Lanczos algorithm for finding dependencies over
GF(2), Advances in Cryptology—EUROCRYPT ’95 (Springer-Verlag,
Berlin) (A. J. Menezes and S. A. Vanstone, eds.), Lecture
Notes in Computer Science, vol. 921, 1995, pp. 106–120.
[Mor04] F. Morain, La primalité en temps polynomial (d’après Adleman,
Huang; Agrawal, Kayal, Saxena), Astérisque 294 (2004), 205–
230.
[MOV93] A. J. Menezes, T. Okamoto, and S. A. Vanstone, Reducing ellip-
tic curve logarithms to logarithms in a finite field, IEEE Trans.
on Info. Theory IT-39(5) (1993), 1639–1646.
[MW06] C. J. Moreno and S. S. Wagstaff, Jr., Sums of Squares of Inte-
gers, Chapman & Hall/CRC Press, Boca Raton, Florida, 2006.
[Nag51] T. Nagell, Introduction to Number Theory, John Wiley, New
York, 1951.
[Nie07] P. P. Nielsen, Odd perfect numbers have at least nine distinct
prime factors, Math. Comp. 76 (2007), 2109–2126.
[NZM91] I. Niven, H. S. Zuckerman, and H. L. Montgomery, An Intro-
duction to the Theory of Numbers, Fifth ed., John Wiley, New
York, 1991.
[OR12] P. Ochem and M. Rao, Odd perfect numbers are greater than
$10^{1500}$, Math. Comp. 81 (2012), 1869–1877.
[Ore48] O. Ore, On the averages of the divisors of a number, Amer.
Math. Monthly 55 (1948), 615–619.
[PH78] S. Pohlig and M. Hellman, An improved algorithm for computing
logarithms over GF(p) and its cryptographic significance, IEEE
Trans. on Info. Theory IT-24(1) (1978), 106–110.
[Poc16] H. C. Pocklington, The determination of the prime or composite
nature of large numbers by Fermat’s theorem, Proc. Camb. Phil.
Soc. 18 (1914–16), 29–30.
[Pol74] J. M. Pollard, Theorems on factorization and primality testing,
Proc. Cambridge Philos. Soc. 76 (1974), 521–528.
[Pol75] J. M. Pollard, A Monte Carlo method for factorization, Nordisk Tidskr.
Informationsbehandling (BIT) 15 (1975), 331–335.
[Pol93] J. M. Pollard, Factoring with cubic integers, Lecture Notes in Mathematics,
vol. 1554, pp. 4–10, Springer-Verlag, New York, 1993.
[Pom81] C. Pomerance, On the distribution of pseudoprimes, Math.
Comp. 37 (1981), 587–593.
[Pom82] C. Pomerance, Analysis and comparison of some integer factoring algorithms,
Computational Methods in Number Theory, Part 1
(CWI, Amsterdam) (H. W. Lenstra, Jr. and R. Tijdeman, eds.),
Math. Centrum Tract, vol. 154, 1982, pp. 89–139.
[Pom94] C. Pomerance, The number field sieve, Mathematics of Computation
1943–1993: a half-century of computational mathematics (Vancouver,
BC, 1993) (Amer. Math. Soc., Providence, RI), Proc.
Sympos. Appl. Math., vol. 48, 1994, pp. 465–480.
[Pom96] C. Pomerance, A tale of two sieves, Notices Amer. Math. Soc. 43
(1996), 1473–1485.
[Pom10] C. Pomerance, Primality testing: variations on a theme of Lucas, Congressus
Numerantium 201 (2010), 301–312.
[Pra75] V. R. Pratt, Every prime has a succinct certificate, SIAM J.
Comput. 4 (1975), 214–220.
[Pri81] P. A. Pritchard, A sublinear additive sieve for finding prime
numbers, Comm. ACM 24 (1981), 18–23.
[PST88] C. Pomerance, J. W. Smith, and R. Tuler, A pipeline architecture
for factoring large integers with the quadratic sieve algorithm,
SIAM J. Comput. 17 (1988), no. 2, 387–403.
[PSW80] C. Pomerance, J. L. Selfridge, and S. S. Wagstaff, Jr., The pseudoprimes
to $25 \cdot 10^9$, Math. Comp. 35 (1980), 1003–1026.
[Rab79] M. Rabin, Digitized signatures and public-key functions as in-
tractable as factoring, Tech. Report LCS/TR-212, M.I.T. Lab
for Computer Science, 1979.
[Rab80] M. Rabin, Probabilistic algorithms for testing primality, J. Number
Theory 12 (1980), 128–138.
[Ram49] V. Ramaswami, The number of positive integers < x and free of
prime divisors > $x^c$, and a problem of S. S. Pillai, Duke Math.
J. 16 (1949), 99–109.
[Reu56] K. G. Reuschle, Mathematische Abhandlung, enthaltend: Neue
zahlentheoretische Tabellen, Königl. Gymnasium Stuttgart,
1856.
[Rie94] H. Riesel, Prime Numbers and Computer Methods of Factoriza-
tion, Second ed., Birkhäuser, Boston, Massachusetts, 1994.
[Ros88] K. H. Rosen, Elementary Number Theory and Its Applications,
Second ed., Addison-Wesley, Reading, Massachusetts, 1988.
[RSA78] R. L. Rivest, A. Shamir, and L. Adleman, A method for obtain-
ing digital signatures and public-key cryptosystems, Comm. A.
C. M. 21(2) (1978), 120–126.
[Sch62] A. Schinzel, On primitive prime factors of $a^n - b^n$, Proc. Cambridge
Philos. Soc. 58 (1962), 555–562.
[Sch85] R. Schoof, Elliptic curves over finite fields and the computation
of square roots mod p, Math. Comp. 44 (1985), 483–494.
[Sha71] D. Shanks, Class number, a theory of factorization, and genera,
1969 Number Theory Institute, Stony Brook, N.Y., Proc. Sym-
pos. Pure Math., vol. 20, Amer. Math. Soc., 1971, pp. 415–440.
[Sha73] D. Shanks, Five number-theoretic algorithms, Proceedings of the
Second Manitoba Conference on Numerical Mathematics (Univ.
Manitoba, Winnipeg, Man., 1972) (Winnipeg, Man.), Congressus
Numerantium, no. VII, Utilitas Math., 1973, pp. 51–70.
[Sha75] D. Shanks, Analysis and improvement of the continued fraction
method of factorization. Abstract 720-10-43, Notices Amer.
Math. Soc. 22 (1975), A–68.
[Sha79] A. Shamir, Factoring numbers in O(log n) arithmetic steps, In-
form. Proc. Lett. 8 (1979), 28–31.
[Sha99] A. Shamir, Factoring large numbers with the TWINKLE device (extended
abstract), Cryptographic Hardware and Embedded Systems,
First International Workshop, CHES ’99, Worcester, MA
(Springer-Verlag, New York) (Ç. Koç and C. Paar, eds.), Lecture
Notes in Computer Science, vol. 1717, 1999, pp. 2–12.
[Sho94] P. Shor, Algorithms for quantum computation: discrete loga-
rithms and factoring, Proceedings of the Thirty-Fifth Annual
Symposium on the Foundations of Computer Science, 1994,
pp. 124–134.
[Sho99] P. Shor, Polynomial-time algorithms for prime factorization and
discrete logarithms on a quantum computer, SIAM Review 41
(1999), 303–332.
[Sie64] C. L. Siegel, Zu zwei Bemerkungen Kummers, Nachr. Akad.
Wiss. Göttingen Math-Phys. Kl II 1964 (1964), 51–57.
[Sil87] R. D. Silverman, The multiple polynomial quadratic sieve, Math.
Comp. 48 (1987), 329–339.
[ST92] J. H. Silverman and J. Tate, Rational Points on Elliptic Curves,
Undergraduate Texts in Mathematics, Springer-Verlag, New
York, 1992.
[ST03] A. Shamir and E. Tromer, Factoring large numbers with the
TWIRL device, Advances in Cryptology—Proc. CRYPTO 2003
(Springer-Verlag, Berlin, New York), Lecture Notes in Com-
puter Science, vol. 2729, 2003, pp. 1–26.
[Str77] V. Strassen, Einige Resultate über Berechnungskomplexität,
Jahresber. Deutsch. Math.-Verein. 78 (1976/77), 1–8.
[Suy81] H. Suyama, Searching for prime factors of Fermat numbers with
a microcomputer, BIT (Tokyo) 13 (1981), 240–245.
[SW83] J. W. Smith and S. S. Wagstaff, Jr., An extended precision
operand computer, Proc. of the Twenty-First Southeast Region
ACM Conference (ACM), 1983, pp. 209–216.
[SW93] R. D. Silverman and S. S. Wagstaff, Jr., A practical analysis of
the elliptic curve factoring algorithm, Math. Comp. 61 (1993),
445–462.
[SWM95] J. Shallit, H. C. Williams, and F. Morain, Discovery of a lost
factoring machine, Math. Intelligencer 17 (1995), 41–47.
[Tou33] J. Touchard, Propriétés arithmétiques de certain nombres recur-
rents, Ann. Soc. Sci. Bruxelles 53A (1933), 21–31.
[TW02] W. Trappe and L. C. Washington, Introduction to Cryptography
with Coding Theory, Prentice Hall, Upper Saddle River, New
Jersey, 2002.
[vdW70] B. L. van der Waerden, Algebra, vol. 1, Frederick Ungar, New
York, 1970.
[VSB+01] L. M. K. Vandersypen, M. Steffen, G. Breyta, C. S. Yannoni,
M. H. Sherwood, and I. L. Chuang, Experimental realization
of Shor’s quantum factoring algorithm using nuclear magnetic
resonance, Nature 414 (6866) (2001), 883–887.
[Wag83] S. S. Wagstaff, Jr., Divisors of Mersenne numbers, Math. Comp.
40 (1983), 385–397.
[Wag93] S. S. Wagstaff, Jr., Computing Euclid’s primes, Bull. of the Inst. for
Combinatorics and Its Applications 8 (1993), 23–32.
[Wag96] S. S. Wagstaff, Jr., Aurifeuillian factorizations and the period of the
Bell numbers modulo a prime, Math. Comp. 65 (1996), 383–391.
[Wag03] S. S. Wagstaff, Jr., Cryptanalysis of Number Theoretic Ciphers,
Chapman & Hall/CRC Press, Boca Raton, Florida, 2003.
[Wag12] S. S. Wagstaff, Jr., The search for Aurifeuillian-like factorizations,
Integers 12A (2012), 1449–1461.
[Was96] L. C. Washington, Introduction to Cyclotomic Fields, Second
ed., Springer-Verlag, New York, 1996.
[Was03] L. C. Washington, Elliptic Curves: Number Theory and Cryptography,
Chapman & Hall/CRC Press, Boca Raton, Florida, 2003.
[WD86] H. C. Williams and H. Dubner, The primality of R1031, Math.
Comp. 47 (1986), 703–711.
[Wei13] S. H. Weintraub, Several proofs of the irreducibility of the cyclo-
tomic polynomials, Amer. Math. Monthly 120 (2013), 537–545.
[Wie90] M. J. Wiener, Cryptanalysis of short RSA secret exponents,
IEEE Trans. on Info. Theory 36 (1990), 553–558.
[Wil45] G. T. Williams, Numbers generated by the function $e^{e^x - 1}$, Amer.
Math. Monthly 52 (1945), 323–327.
[Wil78] H. C. Williams, Some primes with interesting digit patterns,
Math. Comp. 32 (1978), 1306–1310.
[Wil80] H. C. Williams, A modification of the RSA public-key encryption
procedure, IEEE Trans. on Info. Theory IT-26(6) (1980), 726–729.
[Wil82] H. C. Williams, A p + 1 method of factoring, Math. Comp. 39 (1982),
225–234.
[Wil98] H. C. Williams, Édouard Lucas and Primality Testing, Canadian
Mathematics Society Series of Monographs and Advanced Texts,
vol. 22, John Wiley & Sons, New York, 1998.
[WP83] H. C. Williams and C. D. Patterson, A report on the University
of Manitoba sieve unit, Congressus Numerantium 37 (1983),
85–98.
[WS87] S. S. Wagstaff, Jr. and J. W. Smith, Methods of factoring
large integers, Number Theory, New York, 1984–1985
(Springer-Verlag, New York) (D. V. Chudnovsky, G. V. Chudnovsky,
H. Cohn, and M. B. Nathanson, eds.), Lecture Notes in
Mathematics, vol. 1240, 1987, pp. 281–303.
[Wun83] M. C. Wunderlich, A performance analysis of a simple prime-
testing algorithm, Math. Comp. 40 (1983), 709–714.
[XZL+12] N. Xu, J. Zhu, D. Lu, X. Zhou, X. Peng, and J. Du, Quan-
tum factorization of 143 on a dipolar-coupling nuclear magnetic
resonance system, Phys. Rev. Lett. 108 (2012), 130501.
[ZD06] P. Zimmermann and B. Dodson, 20 years of ECM, Algorith-
mic Number Theory, Proceedings ANTS 2006 (Berlin), Lecture
Notes in Computer Science, vol. 4076, Springer, 2006, pp. 525–
542.
[Zsi92] K. Zsigmondy, Zur Theorie der Potenzreste, Monatsh. Math. 3
(1892), 265–284.
Index

Abelian group, 177
Additive function, 35
Adleman, L. M., xi, 4, 104, 108, 234
Agrawal, M., 69, 70, 73
Alford, W., 65, 201
Algebraic
  factor, 50, 56, 57
  number, 208
  part, 50, 56
Aliquot sequence, 99–101, 118, 134
American Mathematical Society, 114, 115
Amicable pairs, 100
Amortize, 120, 125, 137, 140, 186, 197
Aristotle, 101
Arithmetic function, 33–36
Arnold, V. I., 150
Associative law, 177
Atkin, A. O. L., 141, 142, 187, 188, 239
Atkins, D., 5, 115, 201
Aurifeuille, L. F. A., 77
Aurifeuillian factorization, 76–82, 92, 99, 116, 117, 189, 216, 218

Bach, E., 24, 46, 63
Bai, S., 216
Baillie, R., xiv, 68, 72, 140, 141
Bang’s Theorem, 53, 82, 85
Bang, A. S., 53
Batalov, S., 218
Bateman, P. T., 98
Beach, B. D., 150
Bell number, 98, 118
Bell, E. T., 98, 114
Bernoulli numbers, 101–104, 118, 216
Bernoulli, J., 102
Bernstein, D. J., 70
Big O notation, 13
Birthday problem, 135, 162
Bliss, N., 49
Block Lanczos, 202
Block Wiedemann, 202
Blum, L., 109
Blum, M., 109
Boneh, D., 190, 250, 253, 261
Brahmagupta, 169
Brent, R. P., xiii, 78, 88, 97, 138, 140, 141, 186, 207, 244, 246, 262
Brillhart, J., 58, 158, 159, 162, 207, 262
Browne, M. W., 115
Buell, D. A., 141, 163
Buhler, J. P., 103, 212
Burckhardt, J. C., 119
Burt Laboratories, 224
Butterfly net, 158

Canfield, E., 43
Carissan, P., 222
Carmichael
  lambda function, 35, 71
  number, 64, 72, 113, 248, 249, 265, 267
Cataldi, P., 8, 119
Ceiling function, 13
CFRAC, 160
Chang, W.-L., 234
Characteristic polynomial, 94
Chebyshev, P., 132
Chen, J., xiv
Chernac, L., 119
Childers, G., xiii, 216, 217
Chinese Remainder Theorem, 26–28, 31, 33, 36, 64, 106, 107, 109, 110, 154, 253
Clausen, T., 262
Cocks, C., 5
Cohen, G. L., xiii, 87, 88
Cole, F. N., 98, 114
Complexity
  of AKS primality test, 69
  of BPSW primality test, 69
  of Elliptic Curve Method, 184
  of Euclidean Algorithm, 18
  of Fermat’s Method, 126
  of Number Field Sieve, 213
  of Pollard Rho, 137
  of Quadratic Sieve, 197
  of the CFRAC, 159
  of the SQUFOF, 168
Composite number, 20
Congruence, 24–28
Continued fraction, 144, 158
  algorithm, 144
  periodic, 149–153
Continued Fraction Factoring Algorithm, 158–163, 195, 230, 262, 263
Coppersmith, D., 202, 251, 260, 261
Coron, J.-S., 258
Crandall, R., 46, 67, 207, 213
Cray-1 computer, 115
Crelle, A. L., 119
Cryptography, public-key, 2–5
Cunningham Project, 7, 9–12, 49, 76
Cunningham table, 81, 82, 91–93, 117, 141
Cunningham, A. J. C., 9, 115

Davis, J., 115
Davis, L., 264
de Bruijn function, 43
Decimals, repeating, 6–8
de la Vallée Poussin, C., 23
Delay Line Sieve, 227
DES, 2
Deterministic algorithm, 15
Deuring, M., 181
deVogelaere, R., 93
Dickman, K., 43
Dickson, L. E., 8, 9, 101
Difference of Squares Factoring Algorithm, 124
Diffie, W., 2, 4, 108
Digital signature, 4, 108–109
Discrete logarithm, 32–33, 108
Discriminant, 175, 178
Divisibility, 17–20
Divisibility sequence, 49–55, 71, 216
Dixon, J. D., 241
DNA computing, 234–236
Dodson, B., 186
Double Sieve, 202
DSH, factoring device, 237
Dubner, H., 92, 138, 232
Dubner, R., 232
Durfee, G., 261

ECM, 181
ECM, second stage of, 186
Elliptic curve
  discrete logarithm problem, 189
  method, 79, 181, 231, 232, 236, 241, 246, 247, 262, 263, 267
  pairing, 189
  point addition, 177
  prime proving, 187
  supersingular, 189
Ellis, J., 5
EPOC, factoring machine, 230
Equivalence relation, 24
Eratosthenes, 119
Erdős, P., 23, 63, 103, 139
Estibals, N., 190
Euclid, 8, 21, 83
Euclidean Algorithm, 18–20, 145
Euler
  and perfect numbers, 83–86, 89
  constant, 98
  Criterion, 38, 45, 46, 63, 111
  phi function, 31, 33
  Theorem, 31, 36, 60, 105, 251
Euler, L., 8, 31, 83, 84, 121, 132, 262
Exponential time, 14
Extended Euclidean Algorithm, 25, 26, 28, 36, 104, 179, 216, 246, 251

Factor base, 159, 162, 202
Factor chain method, 85
Factors of $b^m + 1$, 55–56
Factors of $b^m - 1$, 49–55
Fast Exponentiation Algorithm, 29, 30, 46, 58, 64, 70, 97, 105, 109, 139, 179, 245, 251, 253, 265
Fast Point Multiplication, 180, 185
Feitsma, J., 68
Fermat
  Factoring Method, 106, 123–128, 130, 132, 141, 147, 221
  Last Theorem, 104
  Little Theorem, 28, 29, 31, 63, 64, 66, 123, 138, 227
  number, 59, 62, 63, 140–142, 207, 231, 262
  sum of two squares, 114
Fermat, P., 28, 114, 123, 125, 128, 132, 142
Fibonacci number, 18, 56–58, 68, 145, 216
Filtering, 202
Floor function, 13
Floyd, R. W., 136
Franke, J., 237
Franklin, M., 190
Full rank lattice, 255
Fundamental solution, 169
Fundamental Theorem of Arithmetic, 21

Galois, E., 150
Gardner, M., 4
Gauss, C. F., ix, 7, 23, 121, 132
Geiselmann, W., 237
Generating function, 95
Genetic algorithm, 264, 266
Gérardin, A., 227
Gilchrist, J., 68
Goldwasser, S., 68, 187
Golomb, S. W., 93, 97
Goto, T., 89, 90
Gower, J., 166, 168
Graff, M., 5, 115, 201
Gram-Schmidt process, 254
Granville, A., 65, 78
Greatest common divisor, 17
Guy, R., 89, 101

Hadamard, J., 23
Hamming, R. W., 75
Hardy, G. H., 13, 23
Harmonic
  mean, 88
  number, 88–91
Hart, W. B., 97, 127, 128, 142, 217
Harvey, D., 103
Hasse
  interval, 181, 182, 184, 188, 241
  Theorem, 180, 181, 187
Hasse, H., 180
Hellman, M., 2, 4, 33, 108
Hensel’s Lemma, 47, 195
Hensel, K. W. S., 47
Holdridge, D., 115
Howgrave-Graham, N., 257

Iamblichus, 100
Index = discrete logarithm, 32
Integer square root, 124
Intrinsic factor, 50, 52, 53, 56, 57, 72, 80, 82, 141

Jacobi
  sum of four squares, 113
  symbol, 38, 110, 246, 265, 267
Jacobi, C. G. J., 113
Jeopardy!, ix, xiv

Kac, M., 23
Kanold, H.-J., 90
Kayal, N., 69, 70, 73
Kida, Y., 49
Kilian, J., 68, 187
Kleinjung, T., 217, 237
Knuth, D. E., 29, 43, 48, 159, 244
Kobayashi, M., 49
Koblitz, N., 173, 184
Kolata, G., 115
Korselt’s Theorem, 65, 72, 249, 267
Korselt, A., 65
Kraitchik, M., 143, 162, 196
Kruppa, A., 97, 140

Lamé, G., 18
Landry, F., 76, 77, 262
Las Vegas algorithm, 15, 59, 240
Lattice, 255
Law of Quadratic Reciprocity, 38, 62, 121, 122
Lawrence, F. W., 128, 129, 142, 222
Least common multiple, 17, 95
Legendre symbol, 37, 66, 111, 180, 196, 197, 229
Legendre, A. M., 37, 132, 153
Lehman, R. S., 129–131, 141, 142, 240
Lehmer, D. H., 11, 60, 63, 101, 126, 133–135, 155, 158, 162, 223–228, 246
Lehmer, D. N., 119, 135, 191, 224, 226
Lehmer, E., 133, 134
Lehmers’ Factoring Method, 132–135, 169, 228
Lenstra, A. K., xiv, 5, 115, 201, 207, 232, 237, 253, 256, 262
Lenstra, H. W., Jr., 1, 173, 181, 182, 207, 212, 242, 256
Levesque, W. J., 150
Leyland, P. C., 5, 115, 201
Linear feedback shift register, 93–97, 117
Linear recurrence relation, 57
Lipton, R. J., 234
Looff, W., 9
Lovász, L., 256
Lucas number, 57, 72, 145
Lucas, É., 63, 77, 78, 98
Lucas-Lehmer Test, 63, 98
L(x), 159, 184, 185, 197, 241, 242

Macaulay, T. B., 226
Manasse, M., 115, 207, 232, 262
Mathews, G. B., 133
May, A., 251, 258, 261
Mazur, B., 216
McKee, J., 132, 147–149
Menezes, A. J., 189
Mersenne
  number, 59, 63, 123, 126, 216
  prime, 9, 12, 89, 97, 98
Mersenne, M., 8, 98, 115, 132
Miller, G. L., 67
Miller, V., 173
Mod Squad, 230
Modulus of a congruence, 24
Möbius
  function, 34
  inversion formula, 35, 48
Monier, L. M. G., 67
Monte Carlo algorithm, 15, 67, 136
Montgomery, H. L., 13
Montgomery, P. L., 58, 186, 202, 210, 212, 245
Morain, F., 187, 188
Morimoto, M., 49
Morrison, M. A., 158, 159, 162, 207, 262
Morton, P., 227
Multiplicative function, 33
Murphy, B., 212

Nagell, T., 50
New York Times, 115
Newton’s method, 47, 124, 153
Nielsen, P. P., 9, 88
Niven, I., 13
NP complete, 14
NP complexity class, 14
Number Field Sieve, 33, 105, 207–217, 232, 236, 237, 247, 262, 263, 267

Ochem, P., 9, 88
Okamoto, T., 189
Okeya, K., 89, 90
One-way function, 3
Ore, O., 89

Paar, C., 237
Paper strips as sieve, 222
Partial fraction decomposition, 95
Pell’s equation, 169
Pell, J., 169
Pelzl, J., 237
Pepin, T., 62, 72
Perfect number, 8–9, 83–88
Pleasants, P., 78
Pocklington Theorem, 61, 68, 92, 187, 247
Pocklington, H. C., 61
Pohlig, S., 33
Pollard Rho Method, 135–138, 142, 207, 231, 232, 247, 262, 263
Pollard, J. M., xii, 135, 138, 207, 240, 262
Pollard p − 1 Method, 106, 110, 138–142, 173, 181, 186, 231, 263
Polynomial
  cyclotomic, 47–49, 71
  irreducible, 47, 96
  primitive, 95–97
  self-reciprocal, 71
  time, 14, 33, 110, 154, 234, 243, 250, 262, 267
Pomerance, C., xiii, 46, 60, 64, 65, 67, 68, 72, 159, 192, 197, 201, 207, 212, 213, 231, 242
Powers, R. E., 155, 158, 162
p + 1 Method, 141
Pratt, V. R., 61
Primality testing, 59–71
Prime
  factor, primitive, 6
  irregular, 103
  number, 20
  Number Theorem, 23, 148, 185
  proving, 60–62, 91–93
  regular, 103
Primitive
  part, 50, 53, 55, 56, 72, 77, 81
  polynomial, 93, 96, 97
  prime factor, 6, 7, 10, 11, 50, 53, 56, 57, 82, 85, 123
  root, 32, 33, 38, 60, 61, 63
  root of unity, 48
Priplata, C., 237
Probabilistic algorithm, 15
Probable prime, 63, 188
Probable prime, strong, 66
Pseudocode, 15
Pseudoprime, 63, 72, 248, 265, 266, 268
Pseudoprime, strong, 66, 72, 80, 248, 265, 266, 268
Pseudosquare, 64, 229
Public-key cipher, 3, 4, 104–108

Quadratic
  congruence, 36
  form, 132, 134, 163, 170, 240, 242
  formula, 36
  nonresidue, 37, 45, 46, 62, 63, 71, 190
  polynomial, 136, 149, 174, 196
  residue, 37, 44–46, 106, 107, 110, 121–123, 143, 151, 154, 159, 180, 221
  Sieve, 33, 115, 195–202, 207, 231, 232, 236, 247, 263
  surd, 149
Quantum computing, 232–234, 257

Rabin, M. O., 67, 106, 108
Rahn, J. H., 169
Ramanujan, S., 23
Ramaswami, V., 43
Random number generation, 109–111
Rao, M., 9, 88
Relation
  full, 198
  partial, 198
Repunit, 5–6, 12, 71, 92
Reuschle, K. G., 9
Rickert, N., 141
Riemann Hypothesis, Extended, 46, 67, 240
Riesel, H., 78, 79, 122
Rivest, R., xi, 4, 104, 108
Root of unity, primitive, 48
Rosen, K. H., 67
RSA cipher, 4, 5, 104–106, 242, 250, 253, 268

Saito, M., 49
Sandia National Labs, 115
Sassoon, G., 201
Saxena, N., 69, 70, 73
Schinzel, A., 78, 80, 82
Schoof, R., 188
Schroeppel’s Linear Sieve, 205–207
Schroeppel, R., xiv, 127, 159, 205, 265
Scientific American, 115
Selfridge, J. L., 61, 68, 72, 101
Sequence, divisibility, 49–55, 71
Shallit, J., 46, 63
Shamir, A., xi, 4, 104, 108, 236, 242, 243, 265, 267
Shanks, D., 46, 163, 168, 240
Shark, factoring device, 237
Shor, P., 232, 257, 267
Shub, M., 109
Siegel, C. L., 104
Sieve devices, 126, 134, 219–230
Sieve of Eratosthenes, 139, 192–195, 217
Silverman, J. H., 174
Silverman, R. D., 58, 141, 186, 201
Simmons, G., 115
Size of an elliptic curve modulo p, 180
Smith, J. W., 230, 231
Smooth integer, 43–44, 139, 158, 184, 203, 205, 213, 264
Sociable numbers, 100
Square root modulo p, 44–47, 110
Square-free integer, 34
SQUFOF, 163–168, 266
Stahlke, C., 237
Standard factorization, 21
Stars in Cunningham table, 50
Steinwandt, R., 237
Strassen, V., 240
Subexponential time, 14, 33
Sum of
  divisors function, 33
  four squares, 113
  two squares, 114
Suyama, H., 141, 186, 231
Sylvester, J. J., 52

Tate, J., 174
Ten Most Wanted List, 141
Thomason Civil Engineering College, 9
Thomé, E., 217
Time magazine, 115
Tonelli, A., 45
Touchard, J., 99
Tower of Hanoi problem, 57
Trabb Pardo, L., 43
Trappe, W., 174, 190
Trebek, A., x, xiv
Trial Division, 42, 79, 120–123, 127, 135, 137, 141, 148, 151, 159, 196, 230, 231, 236, 240, 241, 247, 262, 263, 267
Tromer, E., 236
Truman, H., 2
Tuler, R., 231
Twinkle, factoring device, 236
TWIRL, factoring device, 236

van der Waerden, B. L., 48
Vandersypen, L. M. K., 234
Vang, M., 140
Vanstone, S. A., 189
von Staudt–Clausen Theorem, 103

Wagstaff, S. S., Jr., 18, 46, 72, 103, 166, 168, 186, 230, 231
Washington, L. C., 103, 174, 181, 190
Weaver, R., xiii
Weierstrass form, 174
Weil pairing, 190
Weintraub, S. H., 48
Wheel, 121
Wiener, M. J., 251, 262
Wiles, A., 104
Williams, G. T., 99
Williams, H. C., xiv, 57, 92, 106, 108, 126, 141, 150, 228, 229
Woodall, H. J., 10, 115
Wright, E. M., 13
Wunderlich, M. C., 62, 187

Xu, N., 234

YASD, factoring device, 237

Zero-knowledge proof, 111–113, 249
Zimmermann, P., 97, 140, 186, 217, 244, 246
Zsigmondy, K., 50
Zuckerman, H. S., 13

Selected Published Titles in This Series
68 Samuel S. Wagstaff, Jr., The Joy of Factoring, 2013
67 Emily H. Moore and Harriet S. Pollatsek, Difference Sets, 2013
66 Thomas Garrity, Richard Belshoff, Lynette Boos, Ryan Brown,
Carl Lienert, David Murphy, Junalyn Navarra-Madsen, Pedro
Poitevin, Shawn Robinson, Brian Snyder, and Caryn Werner,
Algebraic Geometry, 2013
65 Victor H. Moll, Numbers and Functions, 2012
64 A. B. Sossinsky, Geometries, 2012
63 María Cristina Pereyra and Lesley A. Ward, Harmonic Analysis,
2012
62 Rebecca Weber, Computability Theory, 2012
61 Anthony Bonato and Richard J. Nowakowski, The Game of Cops
and Robbers on Graphs, 2011
60 Richard Evan Schwartz, Mostly Surfaces, 2011
59 Pavel Etingof, Oleg Golberg, Sebastian Hensel, Tiankai Liu, Alex
Schwendner, Dmitry Vaintrob, and Elena Yudovina, Introduction to
Representation Theory, 2011
58 Álvaro Lozano-Robledo, Elliptic Curves, Modular Forms, and Their
L-functions, 2011
57 Charles M. Grinstead, William P. Peterson, and J. Laurie Snell,
Probability Tales, 2011
56 Julia Garibaldi, Alex Iosevich, and Steven Senger, The Erdős
Distance Problem, 2011
55 Gregory F. Lawler, Random Walk and the Heat Equation, 2010
54 Alex Kasman, Glimpses of Soliton Theory, 2010
53 Jiří Matoušek, Thirty-three Miniatures, 2010
52 Yakov Pesin and Vaughn Climenhaga, Lectures on Fractal Geometry
and Dynamical Systems, 2009
51 Richard S. Palais and Robert A. Palais, Differential Equations,
Mechanics, and Computation, 2009
50 Mike Mesterton-Gibbons, A Primer on the Calculus of Variations and
Optimal Control Theory, 2009
49 Francis Bonahon, Low-Dimensional Geometry, 2009
48 John Franks, A (Terse) Introduction to Lebesgue Integration, 2009
47 L. D. Faddeev and O. A. Yakubovskiĭ, Lectures on Quantum
Mechanics for Mathematics Students, 2009
46 Anatole Katok and Vaughn Climenhaga, Lectures on Surfaces, 2008

For a complete list of titles in this series, visit the


AMS Bookstore at www.ams.org/bookstore/stmlseries/.
[Photo: Samuel S. Wagstaff, Jr. Courtesy of Purdue University.]

This book is about the theory and practice of integer factorization
presented in a historical perspective. It describes about twenty
algorithms for factoring and a dozen other number theory algorithms
that support the factoring algorithms. Most algorithms are described
both in words and in pseudocode to satisfy both number theorists and
computer scientists. Each of the ten chapters begins with a concise
summary of its contents.
The book starts with a general explanation of why factoring integers is
important. The next two chapters present number theory results that are
relevant to factoring. Further on there is a chapter discussing, in particular,
mechanical and electronic devices for factoring, as well as factoring using
quantum physics and DNA molecules. Another chapter applies factoring to
breaking certain cryptographic algorithms. Yet another chapter is devoted
to practical vs. theoretical aspects of factoring. The book contains more
than 100 examples illustrating various algorithms and theorems. It also
contains more than 100 interesting exercises to test the reader’s under-
standing. Hints or answers are given for about a third of the exercises. The
book concludes with a dozen suggestions of possible new methods for
factoring integers.
This book is written for readers who want to learn more about the best
methods of factoring integers, many reasons for factoring, and some
history of this fascinating subject. It can be read by anyone who has taken
a first course in number theory.

ISBN 978-1-4704-1048-3
For additional information
and updates on this book, visit
www.ams.org/bookpages/stml-68
