Amazing and Aesthetic

Aspects of Analysis:
On the incredible infinite

(A Course in Undergraduate Analysis, Fall 2006)

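\[ \frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \]

\[ = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdot \frac{7^2}{7^2 - 1} \cdot \frac{11^2}{11^2 - 1} \cdots \]

\[ = \cfrac{1}{0^2 + 1^2 - \cfrac{1^4}{1^2 + 2^2 - \cfrac{2^4}{2^2 + 3^2 - \cfrac{3^4}{3^2 + 4^2 - \cfrac{4^4}{4^2 + 5^2 - \cdots}}}}} \]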

Paul Loya

(This book is free and may not be sold. Please email
paul@math.binghamton.edu to report errors or give criticisms)
Contents

Preface i

Acknowledgement iii

Some of the most beautiful formulæ in the world v

A word to the student vii

Part 1. Some standard curriculum 1

Chapter 1. Sets, functions, and proofs 3


1.1. The algebra of sets and the language of mathematics 4
1.2. Set theory and mathematical statements 11
1.3. What are functions? 15

Chapter 2. Numbers, numbers, and more numbers 21


2.1. The natural numbers 22
2.2. The principle of mathematical induction 27
2.3. The integers 35
2.4. Primes and the fundamental theorem of arithmetic 41
2.5. Decimal representations of integers 49
2.6. Real numbers: Rational and “mostly” irrational 53
2.7. The completeness axiom of R and its consequences 63
2.8. m-dimensional Euclidean space 72
2.9. The complex number system 79
2.10. Cardinality and “most” real numbers are transcendental 83

Chapter 3. Infinite sequences of real and complex numbers 93


3.1. Convergence and ε-N arguments for limits of sequences 94
3.2. A potpourri of limit properties for sequences 102
3.3. The monotone criteria, the Bolzano-Weierstrass theorem, and e 111
3.4. Completeness and the Cauchy criteria for convergence 117
3.5. Baby infinite series 123
3.6. Absolute convergence and a potpourri of convergence tests 131
3.7. Tannery’s theorem, the exponential function, and the number e 138
3.8. Decimals and “most” numbers are transcendental à la Cantor 146

Chapter 4. Limits, continuity, and elementary functions 153


4.1. Convergence and ε-δ arguments for limits of functions 154
4.2. A potpourri of limit properties for functions 160
4.3. Continuity, Thomae’s function, and Volterra’s theorem 166

4.4. Compactness, connectedness, and continuous functions 172


4.5. Monotone functions and their inverses 182
4.6. Exponentials, logs, Euler and Mascheroni, and the ζ-function 187
4.7. The trig functions, the number π, and which is larger, π^e or e^π? 198
4.8. ★ Three proofs of the fundamental theorem of algebra (FTA) 210
4.9. The inverse trigonometric functions and the complex logarithm 217
4.10. ★ The amazing π and its computations from ancient times 226

Chapter 5. Some of the most beautiful formulæ in the world 237


5.1. ★ Euler, Wallis, and Viète 240
5.2. ★ Euler, Gregory, Leibniz, and Madhava 249
5.3. ★ Euler’s formula for ζ(2k) 258

Part 2. Extracurricular activities 267

Chapter 6. Advanced theory of infinite series 269


6.1. Summation by parts, bounded variation, and alternating series 271
6.2. Liminfs/sups, ratio/roots, and power series 279
6.3. A potpourri of ratio-type tests and “big O” notation 290
6.4. Some pretty powerful properties of power series 295
6.5. Double sequences, double series, and a ζ-function identity 300
6.6. Rearrangements and multiplication of power series 311
6.7. ★ Proofs that ∑ 1/p diverges 320
6.8. Composition of power series and Bernoulli and Euler numbers 325
6.9. The logarithmic, binomial, arctangent series, and γ 332
6.10. ★ π, Euler, Fibonacci, Leibniz, Madhava, and Machin 340
6.11. ★ Another proof that π²/6 = ∑_{n=1}^∞ 1/n² (The Basel problem) 344

Chapter 7. More on the infinite: Products and partial fractions 349


7.1. Introduction to infinite products 350
7.2. Absolute convergence for infinite products 355
7.3. Euler, Tannery, and Wallis: Product expansions galore 359
7.4. Partial fraction expansions of the trigonometric functions 366
7.5. ★ More proofs that π²/6 = ∑_{n=1}^∞ 1/n² 370
7.6. ★ Riemann’s remarkable ζ-function, probability, and π²/6 373
7.7. ★ Some of the most beautiful formulæ in the world IV 382

Chapter 8. Infinite continued fractions 389


8.1. Introduction to continued fractions 390
8.2. ★ Some of the most beautiful formulæ in the world V 394
8.3. Recurrence relations, Diophantus’ tomb, and shipwrecked sailors 403
8.4. Convergence theorems for infinite continued fractions 411
8.5. Diophantine approximations and the mystery of π solved! 422
8.6. ★ Continued fractions and calendars, and math and music 433
8.7. The elementary functions and the irrationality of e^{p/q} 437
8.8. Quadratic irrationals and periodic continued fractions 446
8.9. Archimedes’ crazy cattle conundrum and diophantine equations 456
8.10. Epilogue: Transcendental numbers, π, e, and where’s calculus? 464

Bibliography 471

Index 481
Preface

I have truly enjoyed writing this book. Admittedly, some of the writing is
too overdone (e.g. overdoing alliteration at times), but what can I say, I was
having fun. The “starred” sections of the book are meant to be “just for fun”
and don’t interfere with other sections (besides perhaps other starred sections).
Most of the quotes that you’ll find in these pages are taken from the website
http://www-gap.dcs.st-and.ac.uk/~history/Quotations/
This is a first draft, so please email me any errors, suggestions, comments etc.
about the book to
paul@math.binghamton.edu.
The overarching goals of this textbook are similar to any advanced math text-
book, regardless of the subject:
Goals of this textbook. The student will be able to . . .
• comprehend and write mathematical reasonings and proofs.
• wield the language of mathematics in a precise and effective manner.
• state the fundamental ideas, axioms, definitions, and theorems upon which real
analysis is built and flourishes.
• articulate the need for abstraction and the development of mathematical tools
and techniques in a general setting.
The objectives of this book make up the framework of how these goals will be
accomplished, and more or less follow the chapter headings:
Objectives of this textbook. The student will be able to . . .
• identify the interconnections between set theory and mathematical statements
and proofs.
• state the fundamental axioms of the natural, integer, and real number systems
and how the completeness axiom of the real number system distinguishes this
system from the rational system in a powerful way.
• apply the rigorous ε-N definition of convergence for sequences and series and
recognize monotone and Cauchy sequences.
• apply the rigorous ε-δ definition of limits for functions and continuity and the
fundamental theorems of continuous functions.
• determine the convergence and properties of an infinite series, product, or con-
tinued fraction using various tests.
• identify series, product, and continued fraction formulæ for the various elemen-
tary functions and constants.
I’d like to thank Brett Bernstein for looking over the notes and giving many
valuable suggestions.
Finally, some last words about my book. This is not a history book (but we
try to talk history throughout this book) and this is not a “little” book like Herbert
Westren Turnbull’s book The Great Mathematicians, but like Turnbull, I do hope

If this little book perhaps may bring to some, whose acquaintance
with mathematics is full of toil and drudgery, a knowledge
of those great spirits who have found in it an inspiration and
delight, the story has not been told in vain. There is a largeness
about mathematics that transcends race and time: mathematics
may humbly help in the market-place, but it also reaches to the
stars. To one, mathematics is a game (but what a game!) and
to another it is the handmaiden of theology. The greatest math-
ematics has the simplicity and inevitableness of supreme poetry
and music, standing on the borderland of all that is wonderful
in Science, and all that is beautiful in Art. Mathematics trans-
figures the fortuitous concourse of atoms into the tracery of the
finger of God.
Herbert Westren Turnbull (1885–1961). Quoted from [225, p.
141].

Paul Loya
Binghamton University, Vestal Parkway, Binghamton, NY 13902
paul@math.binghamton.edu

Soli Deo Gloria


Acknowledgement

To Jesus, my Lord, Savior and Friend.

Some of the most beautiful formulæ in the world

Here is a very small sample of the many beautiful formulas we’ll prove in this
book involving some of the main characters we’ll meet in our journey.

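\[ e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}} \qquad \text{(Euler; \S 5.2)} \]

\[ \gamma = \sum_{n=2}^{\infty} \frac{(-1)^n}{n}\, \zeta(n) \qquad \text{(Euler; \S 6.9)} \]

\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots \qquad \text{(Seidel; \S 7.1)} \]

\[ \Phi = \frac{1 + \sqrt{5}}{2} = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}}} \qquad \text{(\S 3.3)} \]

\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}} \qquad \text{(\S 3.4)} \]

\[ \frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \qquad \text{(Euler; \S 6.11)} \]

\[ \frac{\pi^4}{90} = \frac{1}{1^4} + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots \qquad \text{(Euler; \S 7.5)} \]

\[ \frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \qquad \text{(Gregory-Leibniz-Madhava; \S 6.10)} \]

\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots \qquad \text{(Viète; \S 4.10)} \]

\[ \frac{\pi}{2} = \frac{1}{1} \cdot \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdots \qquad \text{(Wallis; \S 6.10)} \]

\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}} \qquad \text{(Lord Brouncker; \S 5.2)} \]

\[ e^x = \cfrac{1}{1 - \cfrac{2x}{x + 2 + \cfrac{x^2}{6 + \cfrac{x^2}{10 + \cfrac{x^2}{14 + \ddots}}}}} \qquad \text{(Euler; \S 8.7)} \]

\[ \sin \pi z = \pi z \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2} \right) \qquad \text{(Euler; \S 7.3)} \]

\[ \zeta(z) = \prod_{p} \left( 1 - \frac{1}{p^z} \right)^{-1} = \prod_{p} \frac{p^z}{p^z - 1} \qquad \text{(Euler–Riemann; \S 7.6)} \]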
A word to the student

One can imagine mathematics as a movie with exciting scenes, action, plots,
etc. ... There are a couple things you can do. First, you can simply sit back and
watch the movie playing out. Second, you can take an active role in shaping the
movie. A mathematician does both at times, but is more the actor rather than the
observer. I recommend you be the actor in the great mathematics movie. To do
so, I recommend you read this book with a pencil and paper at hand writing down
definitions, working through examples filling in any missing details, and of course
doing exercises (even the ones that are not assigned).1 Of course, please feel free
to mark up the book as much as you wish with remarks and highlighting and even
corrections if you find a typo or error. (Just let me know if you find one!)

1 There are many footnotes in this book. Most are quotes from famous mathematicians and
others are remarks that I might say to you if I were reading the book with you. All footnotes may
be ignored if you wish!

Part 1

Some standard curriculum


CHAPTER 1

Sets, functions, and proofs

Mathematics is not a deductive science — that’s a cliche. When you try to


prove a theorem, you don’t just list the hypotheses, and then start to reason.
What you do is trial and error, experimentation, guesswork.
Paul R. Halmos (1916– ), I want to be a Mathematician [92].
In this chapter we start being “mathematicians”, that is, we start doing proofs.
One of the goals of this text is to get you proving mathematical statements in real
analysis. Set theory provides a safe environment in which to learn about math
statements, “if ... then”, “if and only if”, etc., and to learn the logic behind proofs.
Actually, I assume that most of you have had some exposure to sets, so many proofs
in this chapter are “left to the reader”; the real meat comes in the next chapter.
The students at Binghamton University, the people in your family, your pets,
the food in your refrigerator are all examples of sets of objects. Mathematically,
a set is defined by some property or attribute that an object must have or must
not have; if an object has the property, then it’s in the set. For example, the
collection of all registered students at Binghamton University who are signed up
for real analysis forms a set. (A BU student is either signed up for real analysis
or not.) For an example of a property that cannot be used to define a set, try to
answer the following question proposed by Bertrand Russell (1872–1970) in 1918:
A puzzle for the student: A barber in a local town puts up a
sign saying that he shaves only those people who do not shave
themselves. “Who, then, shaves the barber?”

Try to answer this question. (Does the barber shave himself or does someone else
shave him?) In any case, the idea of a set is perhaps the most fundamental idea in
all of mathematics. Sets can be combined to form other sets and the study of such
operations is called the algebra of sets, which we cover in Section 1.1. In Section 1.2
we look at the relationship between set theory and the language of mathematics.
Second to sets in fundamental importance is the idea of a function, which we cover
in Section 1.3. In order to illustrate relevant examples of sets, we shall presume
elementary knowledge of the real numbers. A thorough discussion of real numbers
is left for the next chapter.
This chapter is short on purpose since we do not want to spend too much time
on set theory so as to start real analysis ASAP. In the words of Paul Halmos [91,
p. vi], “... general set theory is pretty trivial stuff really, but, if you want to be a
mathematician, you need some, and here it is; read it, absorb it, and forget it.”
Chapter 1 objectives: The student will be able to . . .
• manipulate and create new sets from old ones using the algebra of sets.
• identify the interconnections between set theory and math statements/proofs.
• define functions and the operations of functions on sets.

1.1. The algebra of sets and the language of mathematics


In this section we study sets and various operations, referred to as the algebra
of sets, to form other sets. We shall see that the algebra of sets is indispensable in
many branches of mathematics such as the study of topology in later chapters. Set
theory also provides the language by which mathematics and logic are built.
1.1.1. Sets and intervals. A set is a collection of definite, well-distinguished
objects, also called elements, which are usually defined by a conditional statement
or simply by listing their elements. All sets and objects that we deal with have
the property that given an object and a set, there must be a definite “yes” or “no”
answer to whether or not the object is in the set, because otherwise paradoxes can
arise as seen in the barber paradox; see also Problem 4 for another puzzle.
Example 1.1. Sets where we can list the elements include
N := {1, 2, 3, 4, 5, 6, 7, . . .} and Z := {· · · , −2, −1, 0, 1, 2, · · · },
the natural numbers and integers, respectively. Here, the symbol “:=” means
that the symbol on the left is by definition the expression on the right and we
usually read “:=” as “equals by definition”.1
Example 1.2. We can define the rational numbers by the conditional statement
\[ \mathbb{Q} := \left\{ x \in \mathbb{R} \,;\, x = \frac{a}{b}, \text{ where } a, b \in \mathbb{Z} \text{ and } b \neq 0 \right\}. \]
Here, R denotes the set of real numbers and the semicolon should be read “such
that”. So, Q is the set of all real numbers x such that x can be written as a ratio
x = a/b where a and b are integers with b not zero.
Example 1.3. The empty set is a set with no elements — think of an empty
clear plastic bag. We denote this empty set by ∅. (In the next subsection we prove
that there is only one empty set.)
Example 1.4. Intervals provide many examples of sets defined by conditional
statements. Let a and b be real numbers with a ≤ b. Then the set
{x ∈ R ; a < x < b}
is called an open interval and is often denoted by (a, b). If a = b, then there are
no real numbers between a and b, so (a, a) = ∅. The set
{x ∈ R ; a ≤ x ≤ b}
is called a closed interval and is denoted by [a, b]. There are also half open and
closed intervals,
{x ∈ R ; a < x ≤ b}, {x ∈ R ; a ≤ x < b},
called left-half open and right-half open intervals and are denoted by (a, b] and
[a, b), respectively. The points a and b are called the end points of the intervals.
1 The errors of definitions multiply themselves according as the reckoning proceeds; and lead
men into absurdities, which at last they see but cannot avoid, without reckoning anew from the
beginning. Thomas Hobbes (1588–1679) [160].

Figure 1.1. The left-hand side displays a subset. The right-hand
side deals with complements that we’ll look at in Section 1.1.3.

There are also infinite intervals. The sets
{x ∈ R ; x < a}, {x ∈ R ; a < x}
are open intervals, denoted by (−∞, a) and (a, ∞), respectively, and
{x ∈ R ; x ≤ a}, {x ∈ R ; a ≤ x}

are closed intervals, denoted by (−∞, a] and [a, ∞), respectively. Note that the
sideways eight symbol ∞ for “infinity,” introduced in 1655 by John Wallis (1616–
1703) [45, p. 44], is just that, a symbol, and is not to be taken to be a real number.
The real line is itself an interval, namely R = (−∞, ∞).

1.1.2. Subsets and “if ... then” statements. If a belongs to a set A,
then we usually say a is in A and we write a ∈ A; if a does not belong to A, then
we write a ∉ A. If each element of a set A is also an element of a set B, we write
A ⊆ B and say that A is a subset of, or contained in, B. Thus, A ⊆ B means if a
is in A, then also a is in B. See Figure 1.1. If A is not a subset of B, we write A ⊈ B.

Example 1.5. N ⊆ Z since every natural number is also an integer and Z ⊆ R
since every integer is also a real number, but R ⊈ Z because not every real number
is an integer.

To say that two sets A and B are the same just means that they contain exactly
the same elements; in other words, every element in A is also in B (that is, A ⊆ B)
and also every element in B is also in A (that is, B ⊆ A). Thus, we define

A=B means that A ⊆ B and B ⊆ A.

A set that is a subset of every set is the empty set ∅. To see that ∅ is a subset
of every set, let A be a set. We must show that the statement if x ∈ ∅, then x ∈ A
is true. However, the part “x ∈ ∅” of this statement sounds strange because the
empty set has nothing in it, so x ∈ ∅ is an untrue statement to begin with. Before
evaluating the statement “If x ∈ ∅, then x ∈ A,” let us first discuss general “If ...
then” statements. Consider the following statement made by Joe:
If Professor Loya cancels class on Friday, then I’m driving to New York City.
Obviously, Joe told the truth if indeed I cancelled class and he headed off
to NYC. What if I did not cancel class but Joe still went to NYC? Did Joe tell
the truth, lie, or neither (because his statement simply does not apply)? Mathematicians
would say that Joe told the truth, whether he went to NYC or
stayed in Binghamton. He only said if the professor cancels class, then he would
drive to NYC. All bets are off if the professor does not cancel class. This is the
standing convention mathematicians take for any “If ... then” statement. Thus,
given statements P and Q, we consider the statement “If P, then Q” to be true if,
whenever the statement P is true, the statement Q is also true; we also regard it
as being true if the statement P is false, whether the statement Q is true or
false. There is no such thing as a “neither statement” in this book.2
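The convention for “If P, then Q” can be tabulated by machine. The following is only a small Python sketch; the function name `implies` is my own choice for illustration and is not notation used elsewhere in this book.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of "If P, then Q" under the convention described above."""
    # The statement is false only when P is true and Q is false.
    return (not p) or q

# Print the truth value of "If P, then Q" in all four cases.
for p, q in product([True, False], repeat=2):
    print(f"P={p!s:5}  Q={q!s:5}  (If P, then Q)={implies(p, q)}")
```

Notice that both rows with P false come out true: these are the “all bets are off” cases in Joe’s statement.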
Now back to our problem. We want to prove that if x ∈ ∅, then x ∈ A. Since
x ∈ ∅ is untrue, by our convention, the statement “if x ∈ ∅, then x ∈ A” is true
by default. Thus, ∅ ⊆ A. We can also see that there is only one empty set, for
suppose that ∅′ is another empty set. Then the same (silly) argument that we just
did for ∅ shows that ∅′ is also a subset of every set. Now to say that ∅ = ∅′, we
must show that ∅ ⊆ ∅′ and ∅′ ⊆ ∅. But ∅ ⊆ ∅′ holds because ∅ is a subset of
every set and ∅′ ⊆ ∅ holds because ∅′ is a subset of every set. Therefore, ∅ = ∅′.
There is another, perhaps easier, way to see that ∅ is a subset of any set by
invoking the “contrapositive”. Consider again the statement that A ⊆ B:
(1) If x ∈ A, then x ∈ B.
This is equivalent to the contrapositive statement
(2) If x ∉ B, then x ∉ A.
Indeed, suppose that statement (1) holds, that is, A ⊆ B. We shall prove that
statement (2) holds. So, let us assume that x ∉ B is true; is it true that x ∉ A?3
Well, the object x is either in A or it’s not. If x ∈ A, then, since A ⊆ B, we must
have x ∈ B. However, we know that x ∉ B, and so x ∈ A is not the valid option,
and therefore x ∉ A. Assume now that statement (2) holds: If x ∉ B, then x ∉ A.
We shall prove that statement (1) holds, that is, A ⊆ B. So, let x ∈ A. We must
prove that x ∈ B. Well, either x ∈ B or it’s not. If x ∉ B, then we know that
x ∉ A. However, we are given that x ∈ A, so x ∉ B is not the correct option;
therefore, the other option x ∈ B must be true. Therefore, (1) and (2) really say
the same thing. We now prove that ∅ ⊆ A for any given set A. Assume that
x ∉ A. According to (2), we must prove that x ∉ ∅. But this last statement is true
because ∅ does not contain anything, so x ∉ ∅ is certainly true. Thus, ∅ ⊆ A.
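Both facts just discussed can be checked by brute force on a computer. Here is a minimal Python sketch, in which the helper names `implies` and `is_subset` and the sample set `A` are my own hypothetical choices: it compares a statement with its contrapositive in all four truth cases, and it tests ∅ ⊆ A directly from the definition of subset.

```python
from itertools import product

def implies(p, q):
    # "If P, then Q" is false only when P is true and Q is false.
    return (not p) or q

# Statement (1), "If P, then Q", versus statement (2), "If not Q, then not P".
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p)
    for p, q in product([True, False], repeat=2)
)

def is_subset(A, B):
    # A ⊆ B means: every x in A is also in B.
    return all(x in B for x in A)

empty = set()
A = {1, 2, 3}  # hypothetical sample set

print(contrapositive_agrees)  # True
print(is_subset(empty, A))    # True: there is no x in the empty set to check
```

Python’s `all` returns `True` when there is nothing to check, which is the same “true by default” convention we adopted for “If x ∈ ∅, then x ∈ A”.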
The following theorem states an important law of sets.
Theorem 1.1 (Transitive law). If A ⊆ B and B ⊆ C, then A ⊆ C.
Proof. Suppose that A ⊆ B and B ⊆ C. We need to prove that A ⊆ C,
which by definition means that if x ∈ A, then x ∈ C. So, let x be in A; we need
to show that x is also in C. Since x is in A and A ⊆ B, we know that x is also in
B. Now B ⊆ C, and therefore x is also in C. In conclusion, we have proved that if
x ∈ A, then x ∈ C, which is exactly what we wanted to prove. 

Finally, we remark that the power set of a given set A is the collection con-
sisting of all subsets of A, which we usually denote by P(A).
Example 1.6.
P({e, π}) = {∅, {e}, {π}, {e, π}}.
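For a finite set, the power set can be listed mechanically. The sketch below is one possible way to do it in Python; the function name `power_set` is my own, and the strings "e" and "pi" merely stand in for the numbers e and π of Example 1.6.

```python
from itertools import combinations

def power_set(A):
    """Return the set of all subsets of the finite set A, each as a frozenset."""
    elems = list(A)
    return {
        frozenset(c)
        for r in range(len(elems) + 1)   # subset sizes 0, 1, ..., len(A)
        for c in combinations(elems, r)  # all subsets of size r
    }

P = power_set({"e", "pi"})
print(len(P))  # 4
```

Subsets are stored as `frozenset` objects because Python requires the elements of a set to be immutable.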

2 Later in your math career you will find some “neither statements” such as the continuum
hypothesis . . . but this is another story!
3 Recall our convention that for a false statement P, we always consider a statement “If P,
then Q” to be true regardless of the validity of the statement Q. Therefore, if “x ∉ B” is false,
then statement (2) is automatically true regardless of the validity of the statement “x ∉ A”,
so in order to prove statement (2) is true, we might as well assume that the statement “x ∉ B” is
true and try to show that the statement “x ∉ A” is also true.

1.1.3. Unions, “or” statements, intersections, and set differences.


Given two sets A and B, their union, denoted A ∪ B, is the set of elements
that are in A or B:
A ∪ B := {x ; x ∈ A or x ∈ B}.
Here we come to another difference between English and mathematical language.
Let’s say that your parents come to visit you on campus and your dad asks you:
Would you like to go to McDonald’s or Burger King?
By “or,” your dad means that you can choose only one of the two choices, but not
both. At the restaurant, your mom asks:
Would you like to have ketchup or mustard?
Now by “or” in this case, your mom means that you can choose ketchup, mustard,
or both if you want. Mathematicians always follow mom’s meaning of “or” (mom
is always right!). Thus, A ∪ B is the set of elements that are in A or B, where
“or” means in A, in B, or in both A and B.
Example 1.7.
{0, 1, e, i} ∪ {e, i, π, √2} = {0, 1, e, i, π, √2}.
The intersection of two sets A and B, denoted by A∩B, is the set of elements
that are in both A and B:
A ∩ B := {x ; x ∈ A and x ∈ B}.
(Here, “and” means just what you think it means.)
Example 1.8.
{0, 1, e, i} ∩ {e, i, π, √2} = {e, i}.
If the sets A and B have no elements in common, then A ∩ B = ∅, and the
sets are said to be disjoint. Here are some properties of unions and intersections,
the proofs of which we leave mostly to the reader.
Theorem 1.2. Unions and intersections are commutative and associative in
the sense that if A, B, and C are sets, then
(1) A ∪ B = B ∪ A and A ∩ B = B ∩ A.
(2) (A ∪ B) ∪ C = A ∪ (B ∪ C) and (A ∩ B) ∩ C = A ∩ (B ∩ C).
Proof. Consider the proof that A ∪ B = B ∪ A. By definition of equality
of sets, we must show that A ∪ B ⊆ B ∪ A and B ∪ A ⊆ A ∪ B. To prove that
A ∪ B ⊆ B ∪ A, let x be in A ∪ B. Then by definition of union, x ∈ A or x ∈ B.
This of course is the same thing as x ∈ B or x ∈ A. Therefore, x is in B ∪ A. The
proof that B ∪ A ⊆ A ∪ B is similar. Therefore, A ∪ B = B ∪ A. We leave the proof
that A ∩ B = B ∩ A to the reader. We also leave the proof of (2) to the reader. 
Our last operation on sets is the set difference A \ B (read “A take away
B” or the “complement of B in A”), which is the set of elements of A that do not
belong to B. Thus,
A \ B := {x ; x ∈ A and x ∉ B}.

Example 1.9.
{0, 1, e, i} \ {e, i, π, √2} = {0, 1}.
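Examples 1.7, 1.8, and 1.9 can be reproduced with Python’s built-in set type. This is just an illustrative sketch: the variable names are mine, and e, π, and √2 are replaced by their floating-point approximations.

```python
import math

A = {0, 1, math.e, 1j}                    # {0, 1, e, i}
B = {math.e, 1j, math.pi, math.sqrt(2)}   # {e, i, π, √2}

union = A | B          # elements in A or B (or both)
intersection = A & B   # elements in both A and B
difference = A - B     # elements of A that do not belong to B

print(len(union))                     # 6
print(intersection == {math.e, 1j})   # True
print(difference == {0, 1})           # True
```

Note that `|` is the inclusive “or”: the common elements e and i appear in the union, and they appear only once.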
Figure 1.2. Visualization of the various set operations A ∪ B, A ∩ B, and A \ B. Here, A
and B are overlapping triangles.

It is always assumed that in any given situation we are working with subsets of
some underlying “universal” set X. Given any subset A of X, we denote X \ A,
the set of elements in X that are outside of A, by A^c, called the complement of
A; see Figure 1.1. Therefore,
A^c := X \ A = {x ∈ X ; x ∉ A}.
Example 1.10. Let us take our “universe” to be R. Then,
(−∞, 1]^c = {x ∈ R ; x ∉ (−∞, 1]} = (1, ∞),
and
[0, 1]^c = {x ∈ R ; x ∉ [0, 1]} = (−∞, 0) ∪ (1, ∞).
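A complement only makes sense once the universe X is fixed, and a short Python sketch makes this dependence explicit. The finite universe `X`, the set `A`, and both function names below are hypothetical choices for illustration. Since R cannot be stored as a finite set, the second half tests membership in [0, 1]^c with a condition instead.

```python
def complement(A, X):
    """Return the complement of A inside the universal set X."""
    if not A <= X:
        raise ValueError("A must be a subset of the universal set X")
    return X - A

# A hypothetical finite universe, chosen only for illustration.
X = set(range(10))
A = {0, 1, 2}
print(complement(A, X))                      # the elements of X outside A
print(complement(complement(A, X), X) == A)  # True

def in_complement_of_unit_interval(x):
    """Membership test for [0, 1]^c when the universe is R."""
    return not (0 <= x <= 1)

print(in_complement_of_unit_interval(-0.5))  # True: -0.5 lies in (−∞, 0)
print(in_complement_of_unit_interval(0.5))   # False: 0.5 lies in [0, 1]
```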
In any given situation, the universal set X will always be clear from context,
either because it is stated what X is, or because we are working in, say a section
dealing with real numbers only, so R is by default the universal set. Otherwise, we
assume that X is just “there” but simply not stated. For pictorial representations
of union, intersection, and set difference, see Figure 1.2; these pictures are called
Venn diagrams after John Venn (1834–1923) who introduced them.
1.1.4. Arbitrary unions and intersections. We can also consider arbitrary
(finite or infinite) unions and intersections. Let I be a nonempty set and assume
that for each α ∈ I, there corresponds a set Aα . The sets Aα where α ∈ I are
said to be a family of sets indexed by I, which we often denote by {Aα ; α ∈ I}.
An index set that shows up quite often is I = N; in this case we usually call
{An ; n ∈ N} a sequence of sets.
Example 1.11. For example, A1 := [0, 1], A2 := [0, 1/2], A3 := [0, 1/3], and in
general,
\[ A_n := \left[ 0, \frac{1}{n} \right] = \left\{ x \in \mathbb{R} \,;\, 0 \le x \le \frac{1}{n} \right\}, \]
form a family of sets indexed by N (or a sequence of sets). See Figure 1.3 for a
picture of these sets.
How do we define the union of all the sets Aα in a family {Aα ; α ∈ I}? Consider
the case of two sets A and B. We can write
A ∪ B = {x ; x ∈ A or x ∈ B}
= {x ; x is in at least one of the sets on the left-hand side}.

Figure 1.3. The sequence of sets An = [0, 1/n] for n ∈ N.

With this as motivation, we define the union of all the sets Aα to be
\[ \bigcup_{\alpha \in I} A_\alpha := \{ x \,;\, x \in A_\alpha \text{ for at least one } \alpha \in I \}. \]
To simplify notation, we sometimes just write $\bigcup A_\alpha$ or $\bigcup_\alpha A_\alpha$ for the left-hand side.
Example 1.12. For the sequence {An ; n ∈ N} where An = [0, 1/n], by staring
at Figure 1.3 we see that
\[ \bigcup_{n \in \mathbb{N}} A_n := \{ x \,;\, x \in [0, 1/n] \text{ for at least one } n \in \mathbb{N} \} = [0, 1]. \]

How do we define the intersection of all the sets Aα in a family {Aα ; α ∈ I}?
Consider the case of two sets A and B. We can write
A ∩ B = {x ; x ∈ A and x ∈ B}
= {x ; x is in every set on the left-hand side}.
With this as motivation, we define the intersection of all the sets Aα to be
\[ \bigcap_{\alpha \in I} A_\alpha := \{ x \,;\, x \in A_\alpha \text{ for every } \alpha \in I \}. \]

Example 1.13. For the sequence An = [0, 1/n] in Figure 1.3, we have
\[ \bigcap_{n \in \mathbb{N}} A_n := \{ x \,;\, x \in [0, 1/n] \text{ for every } n \in \mathbb{N} \} = \{0\}. \]
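To simplify notation, we sometimes just write $\bigcap A_\alpha$ or $\bigcap_\alpha A_\alpha$ for the left-hand
side. If I = {1, 2, . . . , N} is a finite set of natural numbers, then we usually denote
$\bigcup_\alpha A_\alpha$ and $\bigcap_\alpha A_\alpha$ by $\bigcup_{n=1}^{N} A_n$ and $\bigcap_{n=1}^{N} A_n$, respectively. If I = N, then we
usually denote $\bigcup_\alpha A_\alpha$ and $\bigcap_\alpha A_\alpha$ by $\bigcup_{n=1}^{\infty} A_n$ and $\bigcap_{n=1}^{\infty} A_n$, respectively.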
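The phrases “for at least one” and “for every” in the two definitions correspond to Python’s `any` and `all`. The sketch below uses the sets An = [0, 1/n] of Examples 1.12 and 1.13, but only for the finite index set {1, 2, . . . , N}, since a computer cannot run through all of N; the function names are my own. It can suggest the answers [0, 1] and {0}, but it proves nothing.

```python
from fractions import Fraction

def in_A(n, x):
    """Membership test for A_n = [0, 1/n]."""
    return 0 <= x <= Fraction(1, n)

def in_union(x, N):
    # x belongs to A_n for at least one n in {1, ..., N}.
    return any(in_A(n, x) for n in range(1, N + 1))

def in_intersection(x, N):
    # x belongs to A_n for every n in {1, ..., N}.
    return all(in_A(n, x) for n in range(1, N + 1))

print(in_union(Fraction(3, 4), 50))           # True: 3/4 is in A_1 = [0, 1]
print(in_intersection(0, 200))                # True: 0 is in every A_n
print(in_intersection(Fraction(1, 100), 50))  # True: 1/100 <= 1/n for n <= 50
print(in_intersection(Fraction(1, 100), 200)) # False: 1/100 is not in A_101
```

The last two lines show why {0} is the right answer for the full intersection: any fixed x > 0 eventually falls outside An once 1/n < x.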
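Theorem 1.3. Let A be a set and {Aα} be a family of sets. Then unions and
intersections distribute in the sense that
\[ A \cap \bigcup_\alpha A_\alpha = \bigcup_\alpha (A \cap A_\alpha), \qquad A \cup \bigcap_\alpha A_\alpha = \bigcap_\alpha (A \cup A_\alpha) \]
and satisfy the Augustus De Morgan (1806–1871) laws:
\[ A \setminus \bigcup_\alpha A_\alpha = \bigcap_\alpha (A \setminus A_\alpha), \qquad A \setminus \bigcap_\alpha A_\alpha = \bigcup_\alpha (A \setminus A_\alpha). \]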

Proof. We shall prove the first distributive law and leave the
second one to the reader. We need to show that $A \cap \bigcup_\alpha A_\alpha = \bigcup_\alpha (A \cap A_\alpha)$, which means that
\[ A \cap \bigcup_\alpha A_\alpha \subseteq \bigcup_\alpha (A \cap A_\alpha) \quad \text{and} \quad \bigcup_\alpha (A \cap A_\alpha) \subseteq A \cap \bigcup_\alpha A_\alpha. \tag{1.1} \]
To prove the first inclusion, let $x \in A \cap \bigcup_\alpha A_\alpha$; we must show that $x \in \bigcup_\alpha (A \cap A_\alpha)$.
The statement $x \in A \cap \bigcup_\alpha A_\alpha$ means that x ∈ A and $x \in \bigcup_\alpha A_\alpha$, which means,
by the definition of union, x ∈ A and x ∈ Aα for some α. Hence, x ∈ A ∩ Aα for
some α, which is to say, $x \in \bigcup_\alpha (A \cap A_\alpha)$. Consider now the second inclusion in
(1.1). To prove this, let $x \in \bigcup_\alpha (A \cap A_\alpha)$. This means that x ∈ A ∩ Aα for some α.
Therefore, by definition of intersection, x ∈ A and x ∈ Aα for some α. This means
that x ∈ A and $x \in \bigcup_\alpha A_\alpha$, which is to say, $x \in A \cap \bigcup_\alpha A_\alpha$. In summary, we have
established both inclusions in (1.1), which proves the equality of the sets.
We shall prove the first De Morgan law and leave the second to the reader. We
need to show that $A \setminus \bigcup_\alpha A_\alpha = \bigcap_\alpha (A \setminus A_\alpha)$, which means that
\[ A \setminus \bigcup_\alpha A_\alpha \subseteq \bigcap_\alpha (A \setminus A_\alpha) \quad \text{and} \quad \bigcap_\alpha (A \setminus A_\alpha) \subseteq A \setminus \bigcup_\alpha A_\alpha. \tag{1.2} \]
To prove the first inclusion, let $x \in A \setminus \bigcup_\alpha A_\alpha$. This means x ∈ A and $x \notin \bigcup_\alpha A_\alpha$.
For x not to be in the union, it must be that x ∉ Aα for any α whatsoever (because
if x happened to be in some Aα, then x would be in the union $\bigcup_\alpha A_\alpha$ which we
know x is not). Hence, x ∈ A and x ∉ Aα for all α, in other words, x ∈ A \ Aα
for all α, which means that $x \in \bigcap_\alpha (A \setminus A_\alpha)$. We now prove the second inclusion
in (1.2). So, let $x \in \bigcap_\alpha (A \setminus A_\alpha)$. This means that x ∈ A \ Aα for all α. Therefore,
x ∈ A and x ∉ Aα for all α. Since x is not in any Aα, it follows that $x \notin \bigcup_\alpha A_\alpha$.
Therefore, x ∈ A and $x \notin \bigcup_\alpha A_\alpha$ and hence, $x \in A \setminus \bigcup_\alpha A_\alpha$. In summary, we have
established both inclusions in (1.2), which proves the equality of the sets.

The best way to remember De Morgan’s laws is the English versions: The
complement of a union is the intersection of the complements and the complement
of an intersection is the union of the complements. For a family {Aα } consisting of
just two sets B and C, the distributive and De Morgan laws are just
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
and
A \ (B ∪ C) = (A \ B) ∩ (A \ C), A \ (B ∩ C) = (A \ B) ∪ (A \ C).
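Before (or after) proving identities like these, it can be reassuring to test them on examples. The following Python sketch checks the two distributive laws and the two De Morgan laws of Theorem 1.3 on randomly chosen finite sets. The universe, the seed, the family size of three, and the function names are all arbitrary choices of mine, and passing such a test is evidence, not a proof.

```python
import random

def laws_hold(A, family):
    """Check the four identities of Theorem 1.3 for a finite, nonempty family of sets."""
    big_union = set().union(*family)
    big_inter = set.intersection(*family)
    distributive_1 = (A & big_union) == set().union(*(A & F for F in family))
    distributive_2 = (A | big_inter) == set.intersection(*(A | F for F in family))
    de_morgan_1 = (A - big_union) == set.intersection(*(A - F for F in family))
    de_morgan_2 = (A - big_inter) == set().union(*(A - F for F in family))
    return distributive_1 and distributive_2 and de_morgan_1 and de_morgan_2

def random_subset(universe, rng):
    # Each element of the universe is included with probability 1/2.
    return {x for x in universe if rng.random() < 0.5}

rng = random.Random(0)   # fixed seed so that the run is reproducible
universe = range(12)
all_ok = all(
    laws_hold(random_subset(universe, rng),
              [random_subset(universe, rng) for _ in range(3)])
    for _ in range(500)
)
print(all_ok)  # True
```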
Here are some exercises where we ask you to prove statements concerning sets.
In Problem 3 it is very helpful to draw Venn diagrams to “see” why the statement
should be true. Some advice that is useful throughout this whole book: If you can’t
see how to prove something after some effort, take a break and come back to the
problem later.4
Exercises 1.1.
1. Prove that ∅ = {x ; x ≠ x}. True, false, or neither: If x ∈ ∅, then real analysis is
everyone’s favorite class.
2. Prove that for any set A, we have A ∪ ∅ = A and A ∩ ∅ = ∅.
3. Prove the following statements:
(a) A \ B = A ∩ B^c.
(b) A ∩ B = A \ (A \ B).
(c) B ∩ (A \ B) = ∅.
(d) If A ⊆ B, then B = A ∪ (B \ A).
(e) A ∪ B = A ∪ (B \ A).
(f) A ⊆ A ∪ B and A ∩ B ⊆ A.
(g) If A ∩ B = A ∩ C and A ∪ C = A ∪ B, then B = C.
(h) (A \ B) \ C = (A \ C) \ (B \ C).

4 Finally, two days ago, I succeeded - not on account of my hard efforts, but by the grace of
the Lord. Like a sudden flash of lightning, the riddle was solved. I am unable to say what was
the conducting thread that connected what I previously knew with what made my success possible.
Carl Friedrich Gauss (1777–1855) [67].
4. (Russell’s paradox)5 Define a “thing” to be any collection of items. The reason
that we use the word “thing” is that these things are not sets. Let
B = { “things” A ; A ∉ A},
that is, B is the collection of all “things” that do not contain themselves. Questions:
Is B a “thing”? Is B ∈ B or is B ∉ B? Is B a set?
5. Find
(a) ⋃_{n=1}^∞ (0, 1/n), (b) ⋂_{n=1}^∞ (0, 1/n), (c) ⋃_{n=1}^∞ (1/2ⁿ, 1/2ⁿ⁻¹),
(d) ⋂_{n=1}^∞ (1/2ⁿ, 1/2ⁿ⁻¹), (e) ⋂_{α∈R} (α, ∞), (f) ⋂_{α∈(0,∞)} (1, 1 + 1/α).

1.2. Set theory and mathematical statements


As already mentioned, set theory provides a comfortable environment in which
to do proofs and to learn the ins and outs of mathematical statements.6 In this
section we give a brief account of the various ways mathematical statements can
be worded using the background of set theory.
1.2.1. More on “if ... then” statements. We begin by exploring different
ways of saying “if ... then.” Consider again the statement that A ⊆ B:
If x ∈ A, then x ∈ B.
We can also write this as
x ∈ A implies x ∈ B or x ∈ A =⇒ x ∈ B;
that is, x belongs to A implies that x also belongs to B. Here, =⇒ is the common
symbol for “implies”. Another way to say this is
x ∈ A only if x ∈ B;
that is, the object x belongs to A only if x also belongs to B. Here is yet one more
way to write the statement:
x ∈ B if x ∈ A;
that is, the object x belongs to B if, or given that, the object x belongs to A.
Finally, we also know that the contrapositive statement says the same thing:
If x ∉ B, then x ∉ A.
Thus, the following statements all mean the same thing:
(1.3)
If x ∈ A, then x ∈ B; x ∈ A implies x ∈ B; Given x ∈ A, x ∈ B;
x ∈ A only if x ∈ B; x ∈ B if x ∈ A; If x ∉ B, then x ∉ A.
We now consider each of these set statements in more generality. First of all,
a “statement” in the mathematical sense is a statement that is either true or false,
but never both; much in the same way that we work only with sets and objects such
5
The point of philosophy is to start with something so simple as not to seem worth stating,
and to end with something so paradoxical that no one will believe it. Bertrand Russell (1872–
1970).
6
Another advantage of a mathematical statement is that it is so definite that it might be
definitely wrong; and if it is found to be wrong, there is a plenteous choice of amendments ready
in the mathematicians’ stock of formulae. Some verbal statements have not this merit; they are
so vague that they could hardly be wrong, and are correspondingly useless. Lewis Fry Richardson
(1881–1953). Mathematics of War and Foreign Politics.

that any given object is either in or not in a given set, but never both. In a day when
“there are no absolutes” is commonly taught in high school, it may take a while
to fully grasp the language of mathematics. A mathematical statement always has
hypotheses or assumptions, and a conclusion. Almost always there are hidden
assumptions, that is, assumptions that are not stated, but taken for granted,
because the context makes it clear what these assumptions are. Whenever you read
a mathematical statement, make sure that you fully understand the hypotheses or
assumptions (including hidden ones) and the conclusion. For the statement “If
x ∈ A, then x ∈ B”, the assumption is x ∈ A and the conclusion is x ∈ B. The
“if-then” wording means: If the assumptions (x ∈ A) are true, then the conclusion
(x ∈ B) is also true, or stated another way, given that the assumptions are true, the
conclusion follows. Let P denote the statement that x ∈ A and Q the statement
that x ∈ B. Then the following statements are all equivalent, that is, the truth
of any one statement implies the truth of any of the other statements:7
If P , then Q; P implies Q; Given P , Q holds;
(1.4)
P only if Q; Q if P ; If not Q, then not P .
Each of these statements is for P being x ∈ A and Q being x ∈ B, but as you
can probably guess, they work for any mathematical statements P and Q. Let us
consider statements concerning real numbers.
Example 1.14. Let P be the statement that x > 5. Let Q be the statement
that x² > 100. Then the following statements are all equivalent:
If x > 5, then x² > 100; x > 5 implies x² > 100; Given x > 5, x² > 100;
x > 5 only if x² > 100; x² > 100 if x > 5; If x² ≤ 100, then x ≤ 5.
The hidden assumptions are that x represents a real number and that the real
numbers satisfy all the axioms you think they do. Of course, any one (and hence
every one) of these six statements is false. For instance, x = 6 > 5 is true, but
x² = 36, which is not greater than 100.
Example 1.15. Let P be the statement that x² = 2. Let Q be the statement
that x is irrational. Then the following statements are all equivalent:
If x² = 2, then x is irrational; x² = 2 implies x is irrational;
Given x² = 2, x is irrational; x² = 2 only if x is irrational;
x is irrational if x² = 2; If x is rational, then x² ≠ 2.
Again, the hidden assumptions are that x represents a real number and that the
real numbers satisfy all their usual properties. Any one (and hence every one) of
these six statements is of course true (since we have been told since high school that ±√2
are irrational; we shall prove this fact in Section 2.6).
As these two examples show, it is very important to remember that none of the
statements in (1.4) assert that P or Q is true; they simply state if P is true, then
Q is also true.
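The equivalence of an implication with its contrapositive, and the failure of Example 1.14 at x = 6, can be illustrated with a short Python sketch. It assumes the usual truth-table reading of “if P, then Q” (false only when P is true and Q is false), which the text does not spell out; the helper name `implies` is a hypothetical choice made for this illustration.

```python
from itertools import product

def implies(p, q):
    # "if p, then q" is false only when p is true and q is false
    # (the standard truth-table convention, assumed here).
    return (not p) or q

# An implication and its contrapositive agree on every truth assignment.
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p)
    for p, q in product([False, True], repeat=2)
)

# Example 1.14 at x = 6: P (x > 5) is true, but Q (x^2 > 100) is false,
# so the implication fails at this x.
x = 6
example_1_14_at_6 = implies(x > 5, x**2 > 100)

print(contrapositive_agrees, example_1_14_at_6)
```

Note that the sketch never asserts that P or Q alone is true; it only evaluates the implication, in line with the remark above.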
7 P implies Q is sometimes translated as “P is sufficient for Q” in the sense that the truth of
P is sufficient or enough or ample to imply that Q is also true. P implies Q is also translated “Q
is necessary for P ” because Q is necessarily true given that P is true. However, we shall not use
this language in this book.

1.2.2. Converse statements and “if and only if” statements. Given a
statement P implies Q, the reverse statement Q implies P is called the converse
statement. For example, back to set theory, the converse of the statement
If x ∈ A, then x ∈ B; that is, A ⊆ B,
is just the statement that
If x ∈ B, then x ∈ A; that is, B ⊆ A.
These set theory statements make it clear that the converse of a true statement may
not be true, for {e, π} ⊆ {e, π, i}, but {e, π, i} ⊈ {e, π}. Let us consider examples
with real numbers.
Example 1.16. The statement “If x² = 2, then x is irrational” is true, but its
converse statement, “If x is irrational, then x² = 2,” is false.
Statements for which the converse is equivalent to the original statement are
called “if and only if” statements.
Example 1.17. Consider the statement “If x = −5, then 2x + 10 = 0.” This
statement is true. Its converse statement is “If 2x + 10 = 0, then x = −5.” By
solving the equation 2x + 10 = 0, we see that the converse statement is also true.
The implication x = −5 =⇒ 2x + 10 = 0 can be written
(1.5) 2x + 10 = 0 if x = −5,
while the implication 2x + 10 = 0 =⇒ x = −5 can be written
(1.6) 2x + 10 = 0 only if x = −5.
Combining the two statements (1.5) and (1.6) into one statement, we get
2x + 10 = 0 if and only if x = −5,
which is often denoted by a double arrow
2x + 10 = 0 ⇐⇒ x = −5,
or in more common terms, 2x + 10 = 0 is equivalent to x = −5. We regard the
statements 2x + 10 = 0 and x = −5 as equivalent because if one statement is true,
then so is the other one; hence the wording “is equivalent to”. In summary, if both
statements
Q if P (that is, P =⇒ Q) and Q only if P (that is, Q =⇒ P )
hold, then we write
Q if and only if P or Q ⇐⇒ P.
Also, if you are asked to prove a statement “Q if and only if P ”, then you have
to prove both the “if” statement “Q if P ” (that is, P =⇒ Q) and the “only if”
statement “Q only if P ” (that is, Q =⇒ P ).
The if and only if notation ⇐⇒ comes in quite handy in proofs whenever we
want to move from one statement to an equivalent one.

Example 1.18. Recall that in the proof of Theorem 1.3, we wanted to show
that A ∩ ⋃_α Aα = ⋃_α (A ∩ Aα), which means that A ∩ ⋃_α Aα ⊆ ⋃_α (A ∩ Aα) and
⋃_α (A ∩ Aα) ⊆ A ∩ ⋃_α Aα; that is,
x ∈ A ∩ ⋃_α Aα =⇒ x ∈ ⋃_α (A ∩ Aα) and x ∈ ⋃_α (A ∩ Aα) =⇒ x ∈ A ∩ ⋃_α Aα,
which is to say, we wanted to prove that
x ∈ A ∩ ⋃_α Aα ⇐⇒ x ∈ ⋃_α (A ∩ Aα).
We can prove this quickly and simply using ⇐⇒:
x ∈ A ∩ ⋃_α Aα ⇐⇒ x ∈ A and x ∈ ⋃_α Aα ⇐⇒ x ∈ A and x ∈ Aα for some α
⇐⇒ x ∈ A ∩ Aα for some α
⇐⇒ x ∈ ⋃_α (A ∩ Aα).
Just make sure that if you use ⇐⇒, the expressions immediately to the left and
right of ⇐⇒ are indeed equivalent.

1.2.3. Negations and logical quantifiers. We already know that a statement
and its contrapositive are always equivalent: “if P , then Q” is equivalent to
“if not Q, then not P ”. Therefore, it is important to know how to “not” something,
that is, find the negation. Sometimes the negation is obvious.
Example 1.19. The negation of the statement that x > 5 is x ≤ 5, and the
negation of the statement that x is irrational is that x is rational. (In both cases,
we are working under the unstated assumptions that x represents a real number.)
But some statements are not so easy especially when there are logical quan-
tifiers: “for every” = “for all” (sometimes denoted by ∀ in class, but not in this
book), and “for some” = “there exists” = “there is” = “for at least one” (sometimes
denoted by ∃ in class, but not in this book). The equal signs represent the fact that
we mathematicians consider “for every” as another way of saying “for all”, “for
some” as another way of saying “there exists”, and so forth. Working under the
assumptions that all numbers we are dealing with are real, consider the statement
(1.7) For every x, x² ≥ 0.
What is the negation of this statement? One way to find out is to think of this in
terms of set theory. Let A = {x ∈ R ; x² ≥ 0}. Then the statement (1.7) is just
that A = R. It is obvious that the negation of the statement A = R is just A ≠ R.
Now this means that there must exist some real number x such that x ∉ A. In
order for x to not be in A, it must be that x² < 0. Therefore, A ≠ R just means
that there is a real number x such that x² < 0. Hence, the negation of (1.7) is just
For at least one x, x² < 0.
Thus, the “for every” statement (1.7) becomes a “there is” statement. In general,
the negation of a statement of the form
“For every x, P ” is the statement “For at least one x, not P .”

Similarly, the negation of a “there is” statement becomes a “for every” statement.
Explicitly, the negation of
“For at least one x, Q” is the statement “For every x, not Q.”
For instance, with the understanding that x represents a real number, the negation
of “There is an x such that x² = 2” is “For every x, x² ≠ 2”.
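The two negation rules for quantifiers have a direct finite analogue in Python's built-in `all` (“for every”) and `any` (“for at least one”). The sketch below is an illustration only: it replaces “every real x” by a finite sample of integers, which is an assumption of the example, and the names `xs`, `P`, and `Q` are hypothetical.

```python
# Finite sample standing in for "all x" (an assumption of this sketch;
# the statements in the text range over all real numbers).
xs = range(-10, 11)

def P(x):
    return x * x >= 0   # the property in statement (1.7)

def Q(x):
    return x * x == 2   # no integer satisfies this

# not (for every x, P)  is the same as  (for at least one x, not P)
neg_forall = (not all(P(x) for x in xs)) == any(not P(x) for x in xs)

# not (for at least one x, Q)  is the same as  (for every x, not Q)
neg_exists = (not any(Q(x) for x in xs)) == all(not Q(x) for x in xs)

print(neg_forall, neg_exists)
```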
Exercises 1.2.
1. In this problem all numbers are understood to be real. Write down the contrapositive
and converse of the following statement:
If x² − 2x + 10 = 25, then x = 5,
and determine which (if any) of the three statements are true.
2. Write the negation of the following statements, where x represents an integer.
(a) For every x, 2x + 1 is odd.
(b) There is an x such that 2x + 1 is prime.8
3. Here are some more set theory proofs to brush up on.
(a) Prove that (Aᶜ)ᶜ = A.
(b) Prove that A = A ∪ B if and only if B ⊆ A.
(c) Prove that A = A ∩ B if and only if A ⊆ B.

1.3. What are functions?


In high school we learned that a function is a “rule that assigns to each input
exactly one output”. In practice, what usually comes to mind is a formula, such as
p(x) = x2 − 3x + 10.
In fact, to Leibniz, who in 1692 (or as early as 1673) introduced the word “function”
[221, p. 272], and to all mathematicians of the eighteenth century, a function was
always associated with some type of analytic expression, “a formula”. However,
because of the needs of problems in mathematical physics, the notion of function was
generalized throughout the years, and in this section we present the modern view
of what a function is; see [118] or [137, 138] for some history.
1.3.1. (Cartesian) product. If A and B are sets, their (Cartesian) prod-
uct, denoted by A × B, is the set of all 2-tuples (or ordered pairs) where the first
element is in A and the second element is in B. Explicitly,
A × B := {(a, b) ; a ∈ A, b ∈ B}.
We use the adjective “ordered” because we distinguish between ordered pairs, e.g.
(e, π) ≠ (π, e), but as sets we regard them as equal, {e, π} = {π, e}. Of course, one
can also define the product of any finite number of sets
A1 × A2 × · · · × Am
as the set of all m-tuples (a1 , . . . , am ) where ak ∈ Ak for each k = 1, . . . , m.
Example 1.20. Of particular interest is m-dimensional Euclidean space
Rᵐ := R × · · · × R (m times),
which is studied in Section 2.8.


8
A number that is not prime is called composite.

Figure 1.4. The function τ : R −→ R defined by τ (x) = sin x. The x-axis is the
domain, the y-axis is the codomain, and the range is R(τ ) = [−1, 1]; a point on the
graph is (x, y) = (x, τ (x)) = (x, sin x).

1.3.2. Functions. Let X and Y be sets. Informally, we say that a function
f from X into Y , denoted f : X −→ Y , is a rule that associates to each element
x ∈ X, a single element y ∈ Y . Mathematically, a function from X into Y is a
subset f of the product X × Y such that each element x ∈ X appears exactly once
as the first entry of an ordered pair in the subset f . Explicitly, for each x ∈ X
there is a unique y ∈ Y such that (x, y) ∈ f .
Example 1.21. For instance,
(1.8) p = {(x, y) ; x ∈ [0, 1] and y = x² − 3x + 10} ⊆ [0, 1] × R
defines a function p : [0, 1] −→ R, since p is an example of a subset of [0, 1] × R such
that each real number x ∈ [0, 1] appears exactly once as the first entry of an ordered
pair in p; e.g., the real number 1 appears as the first entry of (1, 1² − 3 · 1 + 10) = (1, 8),
and there is no other ordered pair in the set (1.8) with 1 as the first entry. Thus,
p satisfies the mathematical definition of a function as you thought it should!
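The set-of-ordered-pairs definition can be tested mechanically when the sets are finite. The following Python sketch is an illustration under assumptions: the interval [0, 1] is replaced by a small finite set X, the codomain by a finite set Y, and the helper `is_function` is a hypothetical name introduced here.

```python
def is_function(pairs, X, Y):
    """Check the definition: pairs is a subset of X x Y in which every
    x in X appears exactly once as a first entry."""
    pairs = set(pairs)
    in_product = all(x in X and y in Y for (x, y) in pairs)
    once_each = all(
        sum(1 for (a, _) in pairs if a == x) == 1 for x in X
    )
    return in_product and once_each

# Finite stand-ins for the domain and codomain (assumptions of this sketch).
X = {0, 1, 2}
Y = set(range(0, 20))

# The rule x -> x^2 - 3x + 10 from (1.8), restricted to the finite set X.
p = {(x, x**2 - 3 * x + 10) for x in X}

two_values_at_1 = p | {(1, 9)}   # 1 appears twice as a first entry
missing_2 = {(0, 10), (1, 8)}    # 2 never appears as a first entry

print(is_function(p, X, Y))
print(is_function(two_values_at_1, X, Y))
print(is_function(missing_2, X, Y))
```

The second and third sets fail the definition for the two possible reasons: an element of X with more than one partner, and an element of X with none.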
If f : X −→ Y is a function, then we say that f maps X into Y . If Y = X so
that f : X −→ X, we say that f is a function on X. For a function f : X −→ Y ,
the domain of f is X, the codomain or target of f is Y , and the range of f ,
sometimes denoted R(f ), is the set of all elements in Y that occur as the second
entry of an ordered pair in f . If (x, y) ∈ f (recall that f is a set of ordered pairs),
then we call the second entry y the value or image of the function at x and we
write y = f (x), and sometimes we write
x 7→ y = f (x).
Using this f (x) notation, which by the way was introduced in 1734 by Leonhard
Euler (1707–1783) [171], [36, p. 443], we have
(1.9) f = {(x, y) ∈ X × Y ; y = f (x)} = {(x, f (x)) ; x ∈ X} ⊆ X × Y,
and
R(f ) = {y ∈ Y ; y = f (x) for some x ∈ X} = {f (x) ; x ∈ X}.
See Figure 1.4 for the familiar graph illustration of domain, codomain, and range
for the trig function τ : R −→ R given by τ (x) = sin x. Also using this f (x)
notation, we can return to our previous ways of thinking of functions. For instance,
we can say “let p : [0, 1] −→ R be the function p(x) = x² − 3x + 10” or “let
p : [0, 1] −→ R be the function x 7→ x² − 3x + 10”, by which we mean of course the
set (1.8). In many situations in this book, we are dealing with a fixed codomain;
for example, with real-valued functions or stated another way, functions whose
codomain is R. Then we can omit the codomain and simply say, “let p be the
function x 7→ x² − 3x + 10 for x ∈ [0, 1]”. In this case we again mean the set (1.8).

Figure 1.5. An attempted graph of Dirichlet’s function.

We shall also deal quite a bit with complex-valued functions, that is, functions
whose codomain is C. Then if we say, “let f be a complex-valued function on
[0, 1]”, we mean that f : [0, 1] −→ C is a function. Here are some more examples.
Example 1.22. Consider the function s : N −→ R defined by
s = {(n, (−1)ⁿ/n) ; n ∈ N} ⊆ N × R.
We usually denote s(n) = (−1)ⁿ/n by sₙ and write {sₙ} for the function s, and we
call {sₙ} a sequence of real numbers. We shall study sequences in great depth in
Chapter 3.
Example 1.23. Here is a “piecewise” defined function: a : R −→ R,
a(x) = x if x ≥ 0,
a(x) = −x if x < 0.
Of course, a(x) is usually denoted by |x| and is called the absolute value function.

Example 1.24. Here’s an example of a “pathological function,” the Dirichlet
function, named after Johann Peter Gustav Lejeune Dirichlet (1805–1859), which
is the function D : R −→ R defined by
D(x) = 1 if x is rational,
D(x) = 0 if x is irrational.
This function was introduced in 1829 in Dirichlet’s study of Fourier series and was
the first function (1) not given by an analytic expression and (2) not continuous
anywhere [118, p. 292]. See Section 4.3 for more on continuous functions. See
Figure 1.5 for an attempted graph of Dirichlet’s function.
In elementary calculus, you often encountered composition of functions when
learning, for instance, the chain rule. Here is the precise definition of composition.
If f : X −→ Y and g : Z −→ X, then the composition f ◦ g is the function
f ◦ g : Z −→ Y
defined by (f ◦ g)(z) := f (g(z)) for all z ∈ Z. As a set of ordered pairs, f ◦ g is
given by (do you see why?)
f ◦ g = {(z, y) ∈ Z × Y ; for some x ∈ X, (z, x) ∈ g and (x, y) ∈ f } ⊆ Z × Y.
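The description of f ◦ g as a set of ordered pairs can be compared with the pointwise formula (f ◦ g)(z) = f (g(z)) on a small example. This Python sketch uses made-up finite sets (the labels "a", "b", "c", "u", "v" are hypothetical) purely to illustrate the definition.

```python
# g : Z -> X and f : X -> Y as sets of ordered pairs
# (the elements are arbitrary illustrative labels).
g = {("a", 1), ("b", 2), ("c", 1)}
f = {(1, "u"), (2, "v")}

# Pairs (z, y) for which some x satisfies (z, x) in g and (x, y) in f.
compose = {(z, y) for (z, x1) in g for (x2, y) in f if x1 == x2}

# The same composition computed pointwise as z -> f(g(z)).
f_map, g_map = dict(f), dict(g)
pointwise = {(z, f_map[g_map[z]]) for z in g_map}

print(compose == pointwise)
print(sorted(compose))
```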
Also, when learning about the exponential or logarithmic functions, you probably
encountered inverse functions. Here are some definitions related to this area. A
function f : X −→ Y is called one-to-one or injective if for each y ∈ R(f ),
there is only one x ∈ X with y = f (x). Another way to state this is

(1.10) f is one-to-one means: If f (x1 ) = f (x2 ), then x1 = x2 .


In terms of the contrapositive, we have
(1.11) f is one-to-one means: If x1 ≠ x2 , then f (x1 ) ≠ f (x2 ).

In case f : X −→ Y is injective, the inverse map f⁻¹ is the map with domain
R(f ) and codomain X:
f⁻¹ : R(f ) −→ X
defined by f⁻¹(y) := x where y = f (x). The function f is called onto or surjective
if R(f ) = Y ; that is,

(1.12) f is onto means: For every y ∈ Y there is an x ∈ X such that y = f (x).


A one-to-one and onto map is called a bijection. Here are some examples.
Example 1.25. Let f : R −→ R be defined by f (x) = x². Then f is not
one-to-one because e.g. (see the condition (1.11)) 2 ≠ −2 yet f (2) = f (−2). This
function is also not onto because it fails (1.12): e.g. for y = −1 ∈ R there is no
x ∈ R such that −1 = f (x).
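For finite sets, conditions (1.10) and (1.12) can be checked by direct enumeration. The sketch below is a finite analogue of Example 1.25 under assumptions: R is replaced by small integer ranges, and the helper names `is_injective` and `is_surjective` are hypothetical.

```python
def is_injective(f, X):
    # (1.10): no two distinct points of X share a value.
    values = [f(x) for x in X]
    return len(values) == len(set(values))

def is_surjective(f, X, Y):
    # (1.12): the range {f(x) ; x in X} is all of Y.
    return {f(x) for x in X} == set(Y)

def square(x):
    return x * x

def shift(x):
    return x + 1

# Finite stand-ins for the domain and codomain (assumptions of this sketch).
X = range(-3, 4)
Y = range(-1, 10)

print(is_injective(square, X))      # f(2) = f(-2), so not one-to-one
print(is_surjective(square, X, Y))  # -1 is never attained, so not onto

# By contrast, x -> x + 1 from {0,...,4} onto {1,...,5} is a bijection.
print(is_injective(shift, range(0, 5)),
      is_surjective(shift, range(0, 5), range(1, 6)))
```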
Example 1.26. In elementary calculus, we learn that the exponential function
exp : R −→ (0, ∞), exp(x) = eˣ,
is both one-to-one and onto, that is, a bijection, with inverse
exp⁻¹ : (0, ∞) −→ R, exp⁻¹(x) = log x.
Here, log x denotes the “natural logarithm”, which in many calculus courses is
denoted by ln x, with log x denoting the base 10 logarithm; however in this book
and in most advanced math texts, log x denotes the natural logarithm. In Chapter
3 we shall define the exponential and logarithmic functions rigorously.
1.3.3. Images and inverse images. Functions act on sets as follows. Given
a function f : X −→ Y and a set A ⊆ X, we define
f (A) := {f (x) ; x ∈ A} = {y ∈ Y ; y = f (x) for some x ∈ A},
and call this set the image of A under f . Thus,
y ∈ f (A) ⇐⇒ y = f (x) for some x ∈ A.
Given a set B ⊆ Y , we define

f⁻¹(B) := {x ∈ X ; f (x) ∈ B},
and call this set the inverse image or preimage of B under f . Thus,
x ∈ f⁻¹(B) ⇐⇒ f (x) ∈ B.
Warning: The notation f⁻¹ in the preimage f⁻¹(B) is only notation and does not
represent the inverse function of f . (Indeed, the function may not have an inverse
so the inverse function may not even be defined.)

Figure 1.6. (Left-hand picture) The function f (x) = x² takes
all the points in [−3, −2] to the set [4, 9], so f ([−3, −2]) = [4, 9].
(Right-hand picture) f⁻¹([4, 9]) consists of every point in R that
f brings inside of [4, 9], so f⁻¹([4, 9]) = [−3, −2] ∪ [2, 3].

Example 1.27. Let f (x) = x² with domain and range in R. Then as we can
see in Figure 1.6,
f ([−3, −2]) = [4, 9] and f⁻¹([4, 9]) = [−3, −2] ∪ [2, 3].
Here are more examples: You are invited to check that
f ((1, 2]) = (1, 4], f⁻¹([−4, −1)) = ∅, f⁻¹((1, 4]) = [−2, −1) ∪ (1, 2].
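Images and inverse images are easy to compute by enumeration when the domain is finite. The following Python sketch is a discrete analogue of Example 1.27, not the example itself: the real line is replaced by the integers from −5 to 5 (an assumption of the sketch), and `image` and `preimage` are hypothetical helper names.

```python
def image(f, A):
    # f(A) = {f(x) ; x in A}
    return {f(x) for x in A}

def preimage(f, X, B):
    # f^{-1}(B) = {x in X ; f(x) in B}
    return {x for x in X if f(x) in B}

def f(x):
    return x * x

# Integer stand-in for the domain R (an assumption of this sketch).
X = set(range(-5, 6))

print(sorted(image(f, {-3, -2})))                # integer analogue of f([-3, -2])
print(sorted(preimage(f, X, set(range(4, 10))))) # integer analogue of f^{-1}([4, 9])
print(preimage(f, X, {-4, -3, -2}))              # nothing maps to a negative number
```

As in the warning above, `preimage` never inverts f; it only tests which points of X land in B.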
The following theorem gives the main properties of images and inverse images.
Theorem 1.4. Let f : X −→ Y , let B, C ⊆ Y , let {Bα } be a family of subsets of
Y , and let {Aα } be a family of subsets of X. Then
f⁻¹(C \ B) = f⁻¹(C) \ f⁻¹(B), f⁻¹(⋃_α Bα) = ⋃_α f⁻¹(Bα),
f⁻¹(⋂_α Bα) = ⋂_α f⁻¹(Bα), f (⋃_α Aα) = ⋃_α f (Aα).

Proof. Using the definition of inverse image and set difference, we have
x ∈ f⁻¹(C \ B) ⇐⇒ f (x) ∈ C \ B ⇐⇒ f (x) ∈ C and f (x) ∉ B
⇐⇒ x ∈ f⁻¹(C) and x ∉ f⁻¹(B)
⇐⇒ x ∈ f⁻¹(C) \ f⁻¹(B).
Thus, f⁻¹(C \ B) = f⁻¹(C) \ f⁻¹(B).
Using the definition of inverse image and union, we have
x ∈ f⁻¹(⋃_α Bα) ⇐⇒ f (x) ∈ ⋃_α Bα ⇐⇒ f (x) ∈ Bα for some α
⇐⇒ x ∈ f⁻¹(Bα) for some α
⇐⇒ x ∈ ⋃_α f⁻¹(Bα).
Thus, f⁻¹(⋃_α Bα) = ⋃_α f⁻¹(Bα). The proofs of the last two properties in this
theorem are similar enough to the proof just presented that we leave their verification
to the reader.
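The four identities of Theorem 1.4 can be spot-checked on a finite example. The Python sketch below is an illustration under assumptions: f (x) = x² on the integers from −4 to 4, with sample sets and families chosen arbitrarily here; the names `pre`, `img`, `family_B`, and `family_A` are hypothetical. Passing such a check does not replace the proof.

```python
def f(x):
    return x * x

# Finite stand-in for the domain (an assumption of this sketch).
X = set(range(-4, 5))

def pre(B):
    # inverse image of B under f, computed inside X
    return {x for x in X if f(x) in B}

def img(A):
    # image of A under f
    return {f(x) for x in A}

B = {0, 1, 4}
C = {1, 4, 9, 16}
family_B = [{0, 1, 4}, {1, 4, 9}, {4, 16}]   # subsets of the codomain
family_A = [{-1, 0}, {1, 2}, {-4}]           # subsets of X

difference_ok = pre(C - B) == pre(C) - pre(B)
union_ok = pre(set().union(*family_B)) == set().union(*(pre(S) for S in family_B))
intersection_ok = (
    pre(set.intersection(*family_B))
    == set.intersection(*(pre(S) for S in family_B))
)
image_union_ok = img(set().union(*family_A)) == set().union(*(img(S) for S in family_A))

print(difference_ok, union_ok, intersection_ok, image_union_ok)
```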

We end this section with some definitions needed for the exercises. Let X be
a set and let A be any subset of X. The characteristic function of A is the
function χA : X −→ R defined by
χA (x) = 1 if x ∈ A,
χA (x) = 0 if x ∉ A.
The sum and product of two characteristic functions χA and χB are the functions
χA + χB : X −→ R and χA · χB : X −→ R defined by
(χA + χB )(x) = χA (x) + χB (x) and (χA · χB )(x) = χA (x) · χB (x), for all x ∈ X.
Of course, the sum and product of any functions f : X −→ R and g : X −→ R
are defined in the same way. We can also replace R by, say C, or by any set Y as
long as “+” and “·” are defined on Y . Given any constant c ∈ R, we denote by
the same letter the function c : X −→ R defined by c(x) = c for all x ∈ X. This
is the constant function c. For instance, 0 is the function defined by 0(x) = 0
for all x ∈ X. The identity map on X is the map defined by I(x) = x for all
x ∈ X. Finally, we say that two functions f : X −→ Y and g : X −→ Y are equal
if f = g as subsets of X × Y , which holds if and only if f (x) = g(x) for all x ∈ X.
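Characteristic functions and their pointwise sum and product are simple to model on a finite set. In the Python sketch below, the set X, the subsets A and B, and the helper `chi` are hypothetical choices for illustration; the comparisons use the notion of equality of functions just defined (equal values at every x ∈ X).

```python
# Finite ambient set and two subsets (arbitrary illustrative choices).
X = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def chi(S):
    # characteristic function of S: value 1 on S and 0 off S
    return lambda x: 1 if x in S else 0

def equal_on_X(g, h):
    # two functions on X are equal when they agree at every point of X
    return all(g(x) == h(x) for x in X)

# Pointwise sum and product, as defined in the text.
def chi_sum(x):
    return chi(A)(x) + chi(B)(x)

def chi_prod(x):
    return chi(A)(x) * chi(B)(x)

# On this sample, the product agrees with the characteristic function
# of the intersection, while the sum takes the value 2 on A ∩ B.
product_is_intersection = equal_on_X(chi_prod, chi(A & B))
sum_values = sorted({chi_sum(x) for x in X})

print(product_is_intersection, sum_values)
```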
Exercises 1.3.
1. Which of the following subsets of R × R define functions from R to R?
(a) A1 = {(x, y) ∈ R × R ; x² = y}, (b) A2 = {(x, y) ∈ R × R ; x = sin y},
(c) A3 = {(x, y) ∈ R × R ; y = sin x}, (d) A4 = {(x, y) ∈ R × R ; x = 4y − 1}.
(Assume well-known properties of trig functions.) Of those sets which do define
functions, find the range of the function. Is the function one-to-one? Is it onto?
2. Let f (x) = 1 − x². Find
f ([1, 4]), f ([−1, 0] ∪ (2, 10)), f⁻¹([−1, 1]), f⁻¹([5, 10]), f (R), f⁻¹(R).
3. If f : X −→ Y and g : Z −→ X are bijective, prove that f ◦ g is a bijection and
(f ◦ g)⁻¹ = g⁻¹ ◦ f⁻¹.
4. Let f : X −→ Y be a function.
(a) Given any subset B ⊆ Y , prove that f (f⁻¹(B)) ⊆ B.
(b) Prove that f (f⁻¹(B)) = B for all subsets B of Y if and only if f is surjective.
(c) Given any subset A ⊆ X, prove that A ⊆ f⁻¹(f (A)).
(d) Prove that A = f⁻¹(f (A)) for all subsets A of X if and only if f is injective.
5. Let f : X −→ Y be a function. Show that f is one-to-one if and only if there is a
function g : Y −→ X such that g ◦ f is the identity map on X. Show that f is onto if
and only if there is a function h : Y −→ X such that f ◦ h is the identity map on Y .
6. (Cf. [152]) In this problem we give various applications of characteristic functions to
prove statements about sets. First, prove at least two of (a) – (e) of the following. (a)
χX = 1, χ∅ = 0; (b) χA · χB = χB · χA = χ_{A∩B} and χA · χA = χA ; (c) χ_{A∪B} =
χA + χB − χA · χB ; (d) χ_{Aᶜ} = 1 − χA ; (e) χA = χB if and only if A = B. Here are
some applications of these properties. Prove the distributive law:
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
by showing that the characteristic functions of each side are equal as functions. Then
invoke (e) to demonstrate equality of sets. Prove the nonobvious equality
(A ∩ Bᶜ) ∩ (Cᶜ ∩ A) = A ∩ (B ∪ C)ᶜ.
Here’s a harder question: Consider the sets (A ∪ B) ∩ C and A ∪ (B ∩ C). When, if
ever, are they equal? When is one set a subset of the other?
CHAPTER 2

Numbers, numbers, and more numbers

I believe there are 15,747,724,136,275,002,577,605,653,961,181,555,468,044,717,914,527,116,709,366,231,425,076,185,631,031,296 protons in the
universe and the same number of electrons.
Sir Arthur Eddington (1882–1944), “The Philosophy of Physical Science”.
Cambridge, 1939.
This chapter is on the study of numbers. Of course, we all have a working
understanding of the real numbers and we use many aspects of these numbers in
everyday life: tallying up tuition and fees, figuring out how much we have left on our
food cards, etc. We have accepted from our childhood all the properties of numbers
that we use everyday. In this chapter we shall actually prove these properties.
In everyday life, what usually comes to mind when we think of “numbers” are
the counting, or natural, numbers 1, 2, 3, 4, . . .. We shall study the natural numbers
and their properties in Sections 2.1 and 2.2. These numbers have been used from
the beginning. Later, the Hindus became the first to systematically use “zero”
and “negative” integers [35], [36, p. 220]; for example, Brahmagupta (598–670)
gave arithmetic rules for multiplying and dividing with such numbers (although
he mistakenly believed that 0/0 = 0). We study the integers in Sections 2.3, 2.4,
and 2.5. Everyday life forces us to talk about fractions, for example, 2/3 of a
pizza “two pieces of a pizza divided into three equal parts”. Such fractions (and
their negatives and zero) make up the so-called rational numbers, which are called
rational not because they are “sane” or “comprehensible”, but simply because they
are ratios of integers. It was a shock to the Greeks who discovered that the rational
numbers are not enough to describe nature. They noticed that according to the
Pythagorean theorem, the length of the hypotenuse of a right triangle with legs of length
1 is √(1² + 1²) = √(1 + 1) = √2. We shall prove that √2 is “irrational”, which simply
means “not rational,” that is, not a ratio of integers. In fact, we’ll see that “most”
numbers that you encountered in high school:
Square roots and more generally n-th roots, roots of polynomials, and
values of trigonometric and logarithmic functions, are mostly irrational!

You’ll have to wait for this mouth-watering subject until Section 2.6! In Section 2.7
we study the all-important property of the real numbers called the completeness
property, which in some sense says that real numbers can describe any length
whatsoever. In Sections 2.8 and 2.9, we leave the one-dimensional real line and
discuss m-dimensional space and the complex numbers (which is really just two-
dimensional space). Finally, in Section 2.10 we define “most” using cardinality and
show that “most” real numbers are not only irrational, they are transcendental.
Chapter 2 objectives: The student will be able to . . .
• State the fundamental axioms of the natural, integer, and real number systems.

• Explain how the completeness axiom of the real number system distinguishes
this system from the rational number system in a powerful way.
• Prove statements about numbers from basic axioms including induction.
• Define Rm and C and the norms on these spaces.
• Explain cardinality and how “most” real numbers are irrational or even tran-
scendental.

2.1. The natural numbers


The numbers we encounter most often in “everyday life” are the counting num-
bers, or the positive whole numbers
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, . . .
These are called the natural numbers and have been used since the beginning.
In this section we study the essential properties of these fundamental numbers.
2.1.1. Axioms for natural numbers. The set, or collection, of natural num-
bers is denoted by N. We all know that if two natural numbers are added, we obtain
a natural number; for example, 3 + 4 = 7. Similarly, if two natural numbers are
multiplied we get a natural number. We say that the natural numbers are closed
under addition and multiplication. Thus, if a, b are in N, then using the familiar
notation for addition and multiplication, a + b and a · b are also in N.1 The following
properties of + and · are also familiar.
Addition satisfies
(A1) a + b = b + a; (commutative law)
(A2) (a + b) + c = a + (b + c). (associative law)
By the associative law, we may “drop” parentheses in sums of more than two
numbers:
a + b + c is unambiguously defined as (a + b) + c = a + (b + c).
Multiplication satisfies
(M1) a · b = b · a; (commutative law)
(M2) (a · b) · c = a · (b · c); (associative law)
(M3) there is a natural number denoted by 1 “one” such that
1 · a = a = a · 1. (existence of multiplicative identity)
By the associative law for multiplication, we may “drop” parentheses:
a · b · c is unambiguously defined as (a · b) · c = a · (b · c).
Addition and multiplication are related by
(D) a · (b + c) = (a · b) + (a · c). (distributive law)
As usual, we sometimes drop the dot · and just write ab for a · b. The natural
numbers are also ordered in the sense that you can compare the magnitude of any
two of them; for example 2 < 5 because five is three greater than two or two is three
less than five. This inequality relationship satisfies the following familiar properties.
Given any natural numbers a and b, exactly one of the following (in)equalities holds:
(O1) a = b;
(O2) a < b, which by definition means that b = a + c for some natural number c;
1 By the way, + and · are functions, as we studied in Section 1.3, from N × N into N. These
are not arbitrary functions but must satisfy the properties (A), (M), and (D) listed.

(O3) b < a, which by definition means that a = b + c for some natural number c.
Thus, 2 < 5 because 5 = 2 + c where c = 3. Of course, we write a ≤ b if a < b
or a = b. There are similar meanings for the opposite inequalities “>” and “≥”.
The inequality signs < and > are called strict. There is one more property of the
natural numbers called induction. Let M be a subset of N.
(I) Suppose that M contains 1 and that M has the following property: If n belongs
to M , then n + 1 also belongs to M . Then M contains all natural numbers.
The statement that M = N is “obvious” with a little thought. M contains 1.
Because 1 belongs to M , by (I), we know that 1+1 = 2 also belongs to M . Because
2 belongs to M , by (I) we know that 2 + 1 = 3 also belongs to M . Assuming we
can continue this process indefinitely makes it clear that M = N.
Everyday experience convinces us that the counting numbers satisfy properties
(A), (M), (D), (O), and (I). However, mathematically we will assume, or take by
faith, the existence of a set N with operations + and · that satisfy properties (A),
(M), (D), (O), and (I).2 From these properties alone we shall prove many well-
known properties of these numbers that we have accepted since grade school. It is
quite satisfying to see that many of the well-known properties about numbers that
are memorized (or even those that are not so well-known) can in fact be proven
from a basic set of axioms! The “rule of the game” for proving such properties is
that we are allowed to prove statements only using facts that we already know are
true either because these facts were given to us in a set of axioms, or because these
facts have already been proven by us in this book, by your teacher in class, or by
you in an exercise.
2.1.2. Proofs of well-known high school rules. Again, you are going to
learn the language of proofs in the same way that a child learns to talk; by observing
others prove things and imitating them, and eventually you will get the hang of it.
We begin by proving the familiar transitive law.
Theorem 2.1 (Transitive law). If a < b and b < c, then a < c.
Proof. Suppose that a < b and b < c. Then by definition of less than (recall
the inequality law (O2) in Section 2.1.1), there are natural numbers d and e such
that b = a + d and c = b + e. Hence, by the associative law,
c = b + e = (a + d) + e = a + (d + e).
Thus, c = a + f where f = d + e ∈ N, so a < c by definition of less than. 
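The proof is constructive: the witnesses d and e for a < b and b < c combine into the witness f = d + e for a < c. The Python sketch below traces this on one numerical instance. It is only an illustration of the bookkeeping, with Python integers standing in for the natural numbers; the helper `less_than_witness`, its search bound, and the sample values 2, 5, 11 are hypothetical choices.

```python
def less_than_witness(a, b, bound=100):
    """Return a natural number c (searched in 1..bound) with b == a + c,
    or None if there is none; this mirrors the definition (O2) of a < b."""
    for c in range(1, bound + 1):
        if a + c == b:
            return c
    return None

a, b, c = 2, 5, 11            # sample values with a < b and b < c

d = less_than_witness(a, b)   # b = a + d
e = less_than_witness(b, c)   # c = b + e
f = d + e                     # the witness built in the proof

print(d, e, f, c == a + f)
```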
Before moving on, we briefly analyze this theorem in view of what we learned
in Section 1.2. The hypotheses or assumptions of this theorem are that a, b,
and c are natural numbers with a < b and b < c and the conclusion is that a < c.
Note that the facts that a, b, and c are natural numbers and that natural numbers
are assumed to satisfy all their arithmetic and order properties were left unwritten
in the statement of the theorem since these assumptions were understood within
the context of this section. The “if-then” wording means: If the assumptions are
true, then the conclusion is also true or given that the assumptions are true, the
conclusion follows. We can also reword Theorem 2.1 as follows:
a < b and b < c implies (also written =⇒) a < c;
² Taking the axioms of set theory by faith, which we are doing in this book even though we
haven’t listed many of them(!), we can define the natural numbers as sets; see [91, Sec. 11].
that is, the truth of the assumptions implies the truth of the conclusion. We can
also state this theorem as follows:
a < b and b < c only if a < c;
that is, the hypotheses a < b and b < c hold only if the conclusion a < c also holds,
or
a < c if a < b and b < c;
that is, the conclusion a < c is true if, or given that, the hypotheses a < b and b < c
are true. The kind of proof used in Theorem 2.1 is called a direct proof , where
we take the hypotheses a < b and b < c as true and prove that the conclusion a < c
is true. We shall see other types of proofs later. We next give another easy and
direct proof of the so-called “FOIL law” of multiplication. However, before proving
this result, we note that the distributive law (D) holds from the right:
(a + b) · c = ac + bc.
Indeed,
(a + b) · c = c · (a + b) commutative law
= (c · a) + (c · b) distributive law
= (a · c) + (b · c) commutative law.
Theorem 2.2 (FOIL law). For any natural numbers a, b, c, d, we have
(a + b) · (c + d) = ac + ad + bc + bd, (first + outside + inside + last).
Proof. We simply compute:
(a + b) · (c + d) = (a + b) · c + (a + b) · d distributive law
= (ac + bc) + (ad + bd) distributive law (from right)
= ac + (bc + (ad + bd)) associative law
= ac + ((bc + ad) + bd) associative law
= ac + ((ad + bc) + bd) commutative law
= ac + ad + bc + bd,
where at the last step we dropped parentheses as we know we can in sums of more
than two numbers (consequence of the associative law). 
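Since the FOIL law is an identity, it can at least be spot-checked numerically. The following is only a sanity check on a small grid of natural numbers, not a substitute for the proof; the function name `foil_sides` is our own and does not come from the text.

```python
def foil_sides(a, b, c, d):
    """Return (left, right) sides of the FOIL law for natural numbers a, b, c, d."""
    left = (a + b) * (c + d)
    right = a * c + a * d + b * c + b * d  # first + outside + inside + last
    return left, right


# Spot-check the identity for all a, b, c, d in {1, ..., 5}.
for a in range(1, 6):
    for b in range(1, 6):
        for c in range(1, 6):
            for d in range(1, 6):
                left, right = foil_sides(a, b, c, d)
                assert left == right
```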
We now prove the familiar cancellation properties of high school algebra.
Theorem 2.3. Given any natural numbers a, b, c, we have
a + c = b + c if and only if a = b.
In particular, given a + c = b + c, we can “cancel” c, obtaining a = b.
Proof. Suppose that a = b. Then, because a and b are just different letters for
the same natural number, we have a + c = b + c.
We now have to prove that if a + c = b + c, then a = b. To prove this, we use
a proof by contraposition. This is how it works. We need to prove that if the
assumption “P : a + c = b + c” is true, then the conclusion “Q : a = b” is also true.
Instead, we shall prove the logically equivalent contrapositive statement: If the
conclusion Q is false, then the assumption P must also be false. The statement that Q
is false is just that a ≠ b and the statement that P is false is just that a + c ≠ b + c.
Thus, we must prove
if a ≠ b, then a + c ≠ b + c.
To this end, assume that a ≠ b; then either a < b or b < a. Because the notation
is entirely symmetric between a and b, we may presume that a < b. Then by
definition of less than, we have b = a + d for some natural number d. Hence, by
the associative and commutative laws,
b + c = (a + d) + c = a + (d + c) = a + (c + d) = (a + c) + d.
Thus, by definition of less than, a + c < b + c, so a + c ≠ b + c.


There is a multiplicative cancellation as well; see Problem 5b. Other examples
of using the fundamental properties (A), (M), (D), and (O) of the natural numbers
are found in the exercises. We now concentrate on the induction property (I).

2.1.3. Induction. We all know that every natural number is greater than or
equal to one. Here is a proof!
Theorem 2.4. Every natural number is greater than or equal to one.
Proof. Rewording this as an “if-then” statement, we need to prove that if n
is a natural number, then n ≥ 1. To prove this, let M = {n ∈ N ; n ≥ 1}, the
collection of all natural numbers greater than or equal to one. Then M contains 1.
If a natural number n belongs to M , then by definition of M , n ≥ 1. This means
that n = 1 or n > 1. In the first case, n + 1 = 1 + 1, so by definition of less than,
1 < n + 1. In the second case, n > 1 means that n = 1 + m for some m ∈ N, so
n + 1 = (1 + m) + 1 = 1 + (m + 1). Again by definition of less than, 1 < n + 1. In
either case, n + 1 also belongs to M . Thus by induction, M = N. 

Now we prove the Archimedean ordering property of the natural numbers.


Theorem 2.5 (Archimedean ordering of N). Given any natural numbers a
and b there is a natural number n so that b < a · n.
Proof. Let a, b ∈ N; we need to produce an n ∈ N such that b < a · n. By
the previous theorem, either a = 1 or a > 1. If a = 1, then we set n = b + 1, in
which case b < b + 1 = 1 · n. If 1 < a, then we can write a = 1 + c for some natural
number c. In this case, let n = b. Then,
a · n = (1 + c) · b = b + c · b > b.
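The proof is constructive: it names an explicit n in each of its two cases. A minimal sketch of that choice follows, assuming a and b are given as Python integers that are at least 1; the name `archimedean_witness` is hypothetical.

```python
def archimedean_witness(a: int, b: int) -> int:
    """Return the n chosen in the proof of Theorem 2.5, so that b < a * n."""
    if a < 1 or b < 1:
        raise ValueError("a and b must be natural numbers (1, 2, 3, ...)")
    if a == 1:
        return b + 1  # then a * n = b + 1 > b
    return b          # a > 1, so a * n = b + c * b > b where a = 1 + c


# Check the defining inequality on a small grid of natural numbers.
for a in range(1, 21):
    for b in range(1, 21):
        assert b < a * archimedean_witness(a, b)
```

The witness is not the smallest possible n; the theorem only asks for existence.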


The following theorem contains an important property of the natural numbers.
Its proof is an example of a proof by contradiction or reductio ad absurdum,
whereby we start with a tentative assumption that the conclusion is false and then
proceed with our argument until we eventually get a logical absurdity.
Theorem 2.6 (Well-ordering (principle) of N). Any nonempty set of natural
numbers has a smallest element; that is, an element less than or equal to any
other member of the set.
Proof. We need to prove that if A is a nonempty set of natural numbers,
then A contains a natural number a so that a < a′ for any other element a′ of
A. Well, A either has this property or not; suppose, for the sake of contradiction,
that A does not have a smallest element. From this assumption we shall derive
a nonsense statement. Let M = {n ∈ N ; n < a for all a ∈ A}. Note that since
a natural number is never less than itself, M does not contain any element of A.
In particular, since A is nonempty, M does not consist of all natural numbers.
However, we shall prove by induction that M is all of N. This of course would be
a contradiction, for we already know that M is not all of N.
To arrive at our contradiction, we first show that M contains 1. By Theorem
2.4, we know that 1 is less than or equal to every natural number; in particular,
1 is less than or equal to every element of A. Hence, if 1 is in A, then 1 would
be the smallest element of A. However, we are assuming that A does not have a
smallest element, so 1 cannot be in A. Hence, 1 is less than every element of A, so
M contains 1.
Suppose that M contains n; we shall prove that M contains n + 1, that is,
n + 1 is less than every element of A. Now either n + 1 ∈ A or n + 1 ∉ A. Suppose
that n + 1 ∈ A and let a be any element of A not equal to n + 1. Since n < a
(as n ∈ M ), we can write a = n + c for some natural number c. Note that c ≠ 1
since by assumption a ≠ n + 1. Thus (by Theorem 2.4) c > 1 and so we can write
c = 1 + d for some natural number d. Hence,
a = n + c = n + 1 + d,
which shows that n + 1 < a. This implies that n + 1 is the smallest element of A,
which we know cannot exist. Hence, our supposition that n + 1 ∈ A must have
been incorrect, so n + 1 ∉ A. In this case, the exact same argument just explained
shows that n + 1 < a for every element a ∈ A. Thus, n + 1 ∈ M , so by induction
M = N, and we arrive at our desired contradiction.
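For a finite, explicitly given set A, the least element guaranteed by Theorem 2.6 can be located by walking upward from 1, in the same spirit as the set M in the proof. This is a minimal sketch, assuming A is supplied as a Python set of positive integers; `least_element` is a hypothetical helper name.

```python
def least_element(A):
    """Return the smallest element of a nonempty set A of natural numbers."""
    if not A:
        raise ValueError("A must be nonempty")
    if any(not isinstance(a, int) or a < 1 for a in A):
        raise ValueError("A must contain only natural numbers 1, 2, 3, ...")
    n = 1
    # Invariant: no natural number smaller than n belongs to A.
    # The loop stops because A is nonempty, so n cannot pass every element of A.
    while n not in A:
        n += 1
    return n
```

The nonemptiness check matters: for an empty A the search would never stop, which mirrors the hypothesis of the theorem.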

Finally, we remark that the symbol 2 denotes, by definition, the natural number
1 + 1. Since 2 = 1 + 1, 1 < 2 by definition of less than. The natural number 3
denotes the number 2 + 1 = 1 + 1 + 1. By definition of less than, 2 < 3. Similarly
4 is the number 3 + 1, and so forth. Continuing in this manner we can assign the
usual symbols to the natural numbers that we are accustomed to in “everyday life”.
In Problem 4 we see that there is no natural number between n and n + 1, so the
sequence of symbols defined will cover all possible natural numbers.
All the letters in the following exercises represent natural numbers. In the
following exercises, you are only allowed to use the axioms and properties of the
natural numbers established in this section. Remember that if you can’t see how
to prove something after some effort, take a break (e.g. take a bus ride somewhere)
and come back to the problem later.³
Exercises 2.1.
1. Prove that any natural number greater than 1 can be written in the form m + 1 where
m is a natural number.
³ I entered an omnibus to go to some place or other. At that moment when I put my foot on
the step the idea came to me, without anything in my former thoughts seeming to have paved the
way for it, that the transformations I had used to define the Fuchsian functions were identical
with non-Euclidean geometry. Henri Poincaré (1854–1912).
2. Are there natural numbers a and b such that a = a + b? What logical inconsistency
happens if such an equation holds?
3. Prove the following statements.
(a) If n^2 = 1 (that is, n · n = 1), then n = 1.
(b) There does not exist a natural number n such that 2n = 1.
(c) There does not exist a natural number n such that 2n = 3.
4. Prove the following statements.
(a) If n ∈ N, then there is no m ∈ N such that n < m < n + 1.
(b) If n ∈ N, then there is a unique m ∈ N satisfying n < m < n + 2; in fact, prove
that the only such natural number is m = n + 1. (That is, prove that n + 1 satisfies
the inequality and if m also satisfies the inequality, then m = n + 1.)
5. Prove the following statements.
(a) (a + b)^2 = a^2 + 2ab + b^2, where a^2 means a · a and b^2 means b · b.
(b) For any fixed natural number c,
a = b if and only if a · c = b · c .
Conclude that 1 is the only multiplicative identity (that is, if a · c = c for some
a, c ∈ N, then a = 1).
(c) For any fixed natural number c,
a < b if and only if a + c < b + c .
Also prove that
a < b if and only if a · c < b · c .
(d) If a < b and c < d, then a · c < b · d.
6. Let A be a finite collection of natural numbers. Prove that A has a largest element,
that is, A contains a number n such that n ≥ m for every element m in A.
7. Many books take the well-ordering property as a fundamental axiom instead of the
induction axiom. Replace the induction axiom by the well-ordering property.
(a) Prove Theorem 2.4 using well-ordering. Suggestion: By well-ordering, N has a
least element, call it n. We need to prove that n ≥ 1. Assume that n < 1 and find
another natural number less than n to derive a contradiction.
(b) Prove the induction property.

2.2. The principle of mathematical induction


We now turn to the principle of mathematical induction and give many
applications of it.

2.2.1. Principle of mathematical induction. The induction axiom of the
natural numbers is the basis for the principle of mathematical induction, which
goes as follows. Suppose that we are given a list of statements:
P_1, P_2, P_3, P_4, P_5, . . . ,
and suppose that (1) P_1 is true and (2) if n is a natural number and the statement
P_n happens to be valid, then the statement P_{n+1} is also valid. Then it must be
that every statement P_1, P_2, P_3, . . . is true. To see why every statement P_n is true,
let M be the collection of all natural numbers n such that P_n is true. Then by
(1), M contains 1 and by (2), if M contains a natural number n, then it contains
n + 1. By the induction axiom, M must be all of N; that is, P_n is true for every
n. Induction is like dominoes: Line up infinitely many dominoes in a row and knock
down the first domino (that is, P_1 is true) and if the n-th domino knocks down the
(n + 1)-st domino (that is, P_n =⇒ P_{n+1}), then every domino gets knocked down
(that is, all the statements P_1, P_2, P_3, . . . are true). See Figure 2.1 for a visual of
this concept.

Figure 2.1. Induction is like dominoes.

We now illustrate this principle through some famous examples. In order to
present examples that have applicability in the sequel, we have to go outside the
realm of natural numbers and assume basic familiarity with integers, real, and
complex numbers. Integers will be discussed in the next section, real numbers in
Sections 2.6 and 2.7, and complex numbers in Section 2.9.
2.2.2. Inductive definitions: Powers and sums. We of course know what
7^3 is, namely 7 · 7 · 7. In general, we define a^n, where a is any complex number called
the base and n is a positive integer called the exponent, as follows:
a^n := a · a · · · a (n times).
(Recall that “:=” means “equals by definition”.)
Example 2.1. We can also define a^n using induction. Let P_n denote the
statement “the power a^n is defined”. We define a^1 := a. Assume that a^n has been
defined. Then we define a^{n+1} := a^n · a. Thus, the statement P_{n+1} holds, so by
induction a^n is defined for any natural number n.
Example 2.2. Using induction, we prove that for any natural numbers m and
n, we have
(2.1) a^{m+n} = a^m · a^n.
Indeed, let us fix the natural number m and let P_n be the statement “Equation
(2.1) holds for the natural number n”. Certainly
a^{m+1} = a^m · a = a^m · a^1
holds by definition of a^{m+1}. Assume that (2.1) holds for a natural number n. Then
by definition of the power and our induction hypothesis,
a^{m+(n+1)} = a^{(m+n)+1} = a^{m+n} · a = a^m · a^n · a = a^m · a^{n+1},
which is exactly the statement P_{n+1}. If a ≠ 0 and we also define a^0 := 1, then as
the reader can readily check, (2.1) continues to hold even if m or n is zero.
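The inductive definition of the power translates directly into a loop that starts from the base case and applies the step a^{n+1} := a^n · a. Here is a minimal sketch, with the hypothetical name `power`, that also spot-checks the law (2.1) on a few integer values; it illustrates the definition and proves nothing.

```python
def power(a, n):
    """Compute a^n from the inductive definition a^1 := a, a^(n+1) := a^n * a."""
    if n < 1:
        raise ValueError("n must be a natural number (1, 2, 3, ...)")
    result = a              # base case: a^1 := a
    for _ in range(n - 1):  # apply the inductive step n - 1 times
        result = result * a
    return result


# Spot-check the law of exponents (2.1): a^(m+n) = a^m * a^n.
for a in (-3, 2, 5):
    for m in range(1, 6):
        for n in range(1, 6):
            assert power(a, m + n) == power(a, m) * power(a, n)
```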
In elementary calculus, we were introduced to the summation notation. Let
a_0, a_1, a_2, a_3, . . . be any list of complex numbers. For any natural number n, we
define \sum_{k=0}^{n} a_k as the sum of the numbers a_0, . . . , a_n:
\sum_{k=0}^{n} a_k := a_0 + a_1 + · · · + a_n.
By the way, in 1755 Euler introduced the sigma notation Σ for summation [171].
Example 2.3. We also can define summation using induction. We define
\sum_{k=0}^{0} a_k := a_0. For a natural number n, let P_n represent the statement “\sum_{k=0}^{n} a_k
is defined”. We define
\sum_{k=0}^{1} a_k := a_0 + a_1.
Suppose that P_n holds for n ∈ N; that is, \sum_{k=0}^{n} a_k is defined. Then we define
\sum_{k=0}^{n+1} a_k := ( \sum_{k=0}^{n} a_k ) + a_{n+1}.
Thus, P_{n+1} holds. We conclude that the sum \sum_{k=0}^{n} a_k is defined for n = 0 and for
any natural number n.
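The same pattern gives a loop for the inductively defined sum: start from a_0 and add one term at a time. The sketch below assumes the list a_0, a_1, a_2, . . . is supplied as a Python function from the index k to the term a_k; both `partial_sum` and that calling convention are our own choices.

```python
def partial_sum(a, n):
    """Compute sum_{k=0}^{n} a(k) from the inductive definition of the sum."""
    if n < 0:
        raise ValueError("n must be a nonnegative integer")
    total = a(0)               # base case: the sum up to k = 0 is a_0
    for k in range(1, n + 1):  # inductive step: add the next term a_k
        total = total + a(k)
    return total
```

For instance, `partial_sum(lambda k: k, 4)` adds 0 + 1 + 2 + 3 + 4.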
2.2.3. Classic examples: The arithmetic and geometric progressions.

Example 2.4. First, we shall prove that for every natural number n, the sum
of the first n integers is n(n + 1)/2; that is,
(2.2) 1 + 2 + · · · + n = n(n + 1)/2.
Here, P_n represents the statement “Equation (2.2) holds”. Certainly,
1 = 1(1 + 1)/2.
Thus, our statement is true for n = 1. Suppose our statement holds for a number
n. Then adding n + 1 to both sides of (2.2), we obtain
1 + 2 + · · · + n + (n + 1) = n(n + 1)/2 + (n + 1)
= [n(n + 1) + 2(n + 1)]/2 = (n + 1)(n + 1 + 1)/2,
which is exactly the statement P_{n+1}. Hence, by the principle of mathematical
induction, every single statement P_n is true.
We remark that the high school way to prove P_n is to write the sum of the first
n integers forward and backwards:
S_n = 1 + 2 + · · · + (n − 1) + n
and
S_n = n + (n − 1) + · · · + 2 + 1.
Notice that the sum of each column is just n + 1. Since there are n columns, adding
these two expressions, we obtain 2S_n = n(n + 1), which implies our result.
What if we only sum the odd integers? We get (proof left to you!)
1 + 3 + 5 + · · · + (2n − 1) = n^2.
Do you see why Figure 2.2 makes this formula “obvious”?
Figure 2.2. Sum of the first n odd numbers.
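Both closed forms in Example 2.4 are easy to test against a brute-force sum. The following is a minimal numerical check for small n only; it verifies finitely many cases and does not replace the induction argument. The names `sum_first` and `sum_first_odd` are hypothetical.

```python
def sum_first(n):
    """Brute-force value of 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


def sum_first_odd(n):
    """Brute-force value of 1 + 3 + 5 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))


for n in range(1, 200):
    assert sum_first(n) == n * (n + 1) // 2  # formula (2.2)
    assert sum_first_odd(n) == n ** 2        # sum of the first n odd numbers
```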

Example 2.5. We now consider the sum of a geometric progression. Let a ≠ 1
be any complex number. We prove that for every natural number n,
(2.3) 1 + a + a^2 + · · · + a^n = (1 − a^{n+1})/(1 − a).
Observe that
(1 − a^2)/(1 − a) = (1 + a)(1 − a)/(1 − a) = 1 + a,
so our assertion holds for n = 1. Suppose that the sum (2.3) holds for the number
n. Then adding a^{n+1} to both sides of (2.3), we obtain
1 + a + a^2 + · · · + a^n + a^{n+1} = (1 − a^{n+1})/(1 − a) + a^{n+1}
= (1 − a^{n+1} + a^{n+1} − a^{n+2})/(1 − a) = (1 − a^{n+2})/(1 − a),
which is exactly the equation (2.3) for n + 1. This completes the proof for the sum
of a geometric progression.
The high school way to establish the sum of a geometric progression is to
multiply
G_n = 1 + a + a^2 + · · · + a^n
by a:
a G_n = a + a^2 + a^3 + · · · + a^{n+1},
and then to subtract this equation from the preceding one and cancel like terms:
(1 − a)G_n = G_n − a G_n = (1 + a + · · · + a^n) − (a + · · · + a^{n+1}) = 1 − a^{n+1}.
Dividing by 1 − a proves (2.3). Splitting the fraction at the end of (2.3) and solving
for 1/(1 − a), we obtain the following version of the geometric progression
(2.4) 1/(1 − a) = 1 + a + a^2 + · · · + a^n + a^{n+1}/(1 − a).
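Formulas (2.3) and (2.4) can be checked exactly for rational values of a by using Python's `fractions.Fraction`, which avoids floating-point rounding. This is a minimal sketch restricted to rational a ≠ 1 (the text allows any complex a ≠ 1); the function names are hypothetical.

```python
from fractions import Fraction


def geometric_sum(a, n):
    """Term-by-term value of 1 + a + a^2 + ... + a^n."""
    return sum(a ** k for k in range(n + 1))


def closed_form(a, n):
    """Right-hand side of (2.3): (1 - a^(n+1)) / (1 - a), for a != 1."""
    return (1 - a ** (n + 1)) / (1 - a)


for a in (Fraction(1, 2), Fraction(-3), Fraction(5, 7)):
    for n in range(1, 8):
        assert geometric_sum(a, n) == closed_form(a, n)  # formula (2.3)
        # formula (2.4): 1/(1 - a) = 1 + a + ... + a^n + a^(n+1)/(1 - a)
        assert 1 / (1 - a) == geometric_sum(a, n) + a ** (n + 1) / (1 - a)
```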

2.2.4. More sophisticated examples. Here’s a famous inequality due to
Jacob (Jacques) Bernoulli (1654–1705) that we’ll have to use on many occasions.
Theorem 2.7 (Bernoulli’s inequality). For any real number a > −1 and
any natural number n,
(1 + a)^n = 1 + na if n = 1 or a = 0,
(1 + a)^n > 1 + na if n > 1 and a ≠ 0. (Bernoulli’s inequality)
Proof. If a = 0, then Bernoulli’s inequality certainly holds (both sides equal
1), so we’ll assume that a ≠ 0. If n = 1, then Bernoulli’s inequality is just 1 + a =
1 + a, which certainly holds. Suppose that Bernoulli’s inequality holds for a number
n. Then (1 + a)^n ≥ 1 + na (where if n = 1, this is an equality and if n > 1, this
is a strict inequality). Multiplying Bernoulli’s inequality by 1 + a > 0, we obtain
(1 + a)^{n+1} ≥ (1 + a)(1 + na) = 1 + na + a + na^2.
Since na^2 is positive, the expression on the right is greater than
1 + na + a = 1 + (n + 1)a.
Combining this equation with the previous inequality proves Bernoulli’s inequality
for n + 1. By induction, Bernoulli’s inequality holds for every n ∈ N.
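To see the two cases of Bernoulli's inequality in numbers, one can compute the gap (1 + a)^n − (1 + na) exactly for a few rational a > −1. A minimal sketch using `fractions.Fraction`; the name `bernoulli_gap` and the sample values are our own, and a finite check of this kind is an illustration only.

```python
from fractions import Fraction


def bernoulli_gap(a, n):
    """Return (1 + a)^n - (1 + n*a); Theorem 2.7 says this is >= 0 when a > -1."""
    return (1 + a) ** n - (1 + n * a)


samples = (Fraction(-9, 10), Fraction(-1, 2), Fraction(1, 3), Fraction(5))
for a in samples:
    assert bernoulli_gap(a, 1) == 0      # equality when n = 1
    for n in range(2, 11):
        assert bernoulli_gap(a, n) > 0   # strict inequality when n > 1, a != 0
for n in range(1, 11):
    assert bernoulli_gap(Fraction(0), n) == 0  # equality when a = 0
```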
If n is a natural number, recall that the symbol n! (read “n factorial”) represents
the product of the first n natural numbers. Thus,
n! := 1 · 2 · 3 · · · (n − 1) · n.
It is convenient to define 0! = 1 so that certain formulas continue to hold for n = 0.
Thus, n! is defined for all nonnegative integers n, that is, n = 0, 1, 2, . . .. Given
nonnegative integers n and k with k ≤ n, we define the binomial coefficient \binom{n}{k}
by
\binom{n}{k} := n!/(k!(n − k)!).
For example, for any nonnegative integer n,
\binom{n}{0} = n!/(0!(n − 0)!) = n!/n! = 1 and \binom{n}{n} = n!/(n!(n − n)!) = n!/n! = 1.
Problem 11 contains a generalization of the following important theorem.
Theorem 2.8 (Binomial theorem). For any complex numbers a and b, and
n ∈ N, we have
(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n−k}, binomial formula.
Proof. If n = 1, the right-hand side of the binomial formula reads
\sum_{k=0}^{1} \binom{1}{k} a^k b^{1−k} = \binom{1}{0} a^0 b^1 + \binom{1}{1} a^1 b^0 = b + a,
so the binomial formula holds for n = 1. Suppose that the binomial formula holds
for a natural number n. Then multiplying the formula by a + b, we get
(a + b)^{n+1} = (a + b) \sum_{k=0}^{n} \binom{n}{k} a^k b^{n−k}
= \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n−k} + \sum_{k=0}^{n} \binom{n}{k} a^k b^{n+1−k}
(2.5) = \sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n−k} + a^0 b^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^k b^{n+1−k}.
Observe that the first term on the right can be rewritten as
\sum_{k=0}^{n} \binom{n}{k} a^{k+1} b^{n−k} = \binom{n}{0} a^1 b^n + \binom{n}{1} a^2 b^{n−1} + · · · + \binom{n}{n−1} a^n b^1 + \binom{n}{n} a^{n+1} b^0
(2.6) = \sum_{k=1}^{n} \binom{n}{k−1} a^k b^{n+1−k} + a^{n+1} b^0.
Also observe that
(2.7) \binom{n}{k−1} + \binom{n}{k} = n!/((k − 1)! (n − k + 1)!) + n!/(k! (n − k)!)
= n! k/(k! (n − k + 1)!) + n! (n − k + 1)/(k! (n − k + 1)!)
= n! (n + 1)/(k! (n + 1 − k)!) = \binom{n+1}{k}.
Now substituting (2.6) into (2.5) and using (2.7), we obtain
(a + b)^{n+1} = \sum_{k=1}^{n} \binom{n}{k−1} a^k b^{n+1−k} + a^{n+1} b^0 + a^0 b^{n+1} + \sum_{k=1}^{n} \binom{n}{k} a^k b^{n+1−k}
= \sum_{k=1}^{n} [ \binom{n}{k−1} + \binom{n}{k} ] a^k b^{n+1−k} + a^{n+1} b^0 + a^0 b^{n+1}
= a^0 b^{n+1} + \sum_{k=1}^{n} \binom{n+1}{k} a^k b^{n+1−k} + a^{n+1} b^0
= \sum_{k=0}^{n+1} \binom{n+1}{k} a^k b^{n+1−k}.
Thus, the binomial formula holds for the natural number n + 1.
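The two ingredients of the proof, the identity (2.7) and the binomial formula itself, can both be spot-checked with integer arithmetic. A minimal sketch follows, computing the binomial coefficient straight from its factorial definition; the names `binom` and `binomial_sum` are hypothetical, and the checks cover only small integer inputs.

```python
from math import factorial


def binom(n, k):
    """Binomial coefficient n! / (k! (n - k)!) for integers 0 <= k <= n."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return factorial(n) // (factorial(k) * factorial(n - k))


def binomial_sum(a, b, n):
    """Right-hand side of the binomial formula: sum of binom(n, k) a^k b^(n-k)."""
    return sum(binom(n, k) * a ** k * b ** (n - k) for k in range(n + 1))


for n in range(1, 10):
    for k in range(1, n + 1):
        # identity (2.7)
        assert binom(n, k - 1) + binom(n, k) == binom(n + 1, k)
    assert binomial_sum(2, -3, n) == (2 + (-3)) ** n
    assert binomial_sum(4, 7, n) == (4 + 7) ** n
```

Integer division is exact here because k!(n − k)! always divides n!.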

2.2.5. Strong form of induction. Sometimes it is necessary to use the following
stronger form of induction. For each natural number n, let P_n be a statement.
Suppose that (1) P_1 is true and (2) if n is a natural number and if the
statement P_m is true for every m ≤ n, then the statement P_{n+1} is also true. Then
every single statement P_1, P_2, P_3, . . . is true. To see this, let M be the set of all natural
numbers n such that P_n is not true. We shall prove that M must be empty, which
shows that P_n is true for every n. Indeed, suppose that M is not empty. Then
by well-ordering, M contains a least element, say n. Since P_1 is true, M does not
contain 1, so n > 1. Since 1, 2, . . . , n − 1 are not in M (because n is the least
element of M ), the statements P_1, P_2, . . . , P_{n−1} must be true. Hence, by Property
(2) of the statements (applied with n − 1 in place of n), P_n must also be true. This
shows that M does not contain n, which contradicts the assumption that n is in M .
Thus, M must be empty.
Problems 6, 9, and 10 contain exercises where strong induction is useful.
As already stated, in order to illustrate nontrivial induction examples, in the
exercises, we assume basic familiarity with integers, real, and complex numbers.
Exercises 2.2.
1. Consider the statement 1 + 2 + 3 + · · · + n = (2n + 1)^2/8. Prove that P_n implies P_{n+1}.
What is wrong here?
Figure 2.3. The towers of Hanoi.

2. Using induction prove that for any complex numbers a and b and for any natural
numbers m and n, we have
(ab)^n = a^n · b^n, (a^m)^n = a^{mn}.
If a and b are nonzero, prove that these equations hold even if m = 0 or n = 0.
3. Prove the following (some of them quite pretty) formulas/statements via induction:
(a) 1/(1 · 2) + 1/(2 · 3) + · · · + 1/(n(n + 1)) = n/(n + 1).
(b) 1^2 + 2^2 + · · · + n^2 = n(n + 1)(2n + 1)/6.
(c) 1^3 + 2^3 + · · · + n^3 = (1 + 2 + · · · + n)^2 = (n(n + 1)/2)^2.
(d) 1/2 + 2/2^2 + 3/2^3 + · · · + n/2^n = 2 − (n + 2)/2^n.
(e) For a ≠ 1,
(1 + a)(1 + a^2)(1 + a^4) · · · (1 + a^{2^n}) = (1 − a^{2^{n+1}})/(1 − a).
(f) n^3 − n is always divisible by 3.
(g) Every natural number n is either even or odd. Here, n is even means that n = 2m
for some m ∈ N and n odd means that n = 1 or n = 2m + 1 for some m ∈ N.
(h) n < 2^n for all n ∈ N. (Can you also prove this using Bernoulli’s inequality?)
(i) Using the identity (2.7), called Pascal’s rule, prove that \binom{n}{k} is a natural number
for all n, k ∈ N with 1 ≤ k ≤ n. (P_n is the statement “\binom{n}{k} ∈ N for all 1 ≤ k ≤ n.”)
4. In this problem we prove some nifty binomial formulas. Prove that
(a) \sum_{k=0}^{n} \binom{n}{k} = 2^n, (b) \sum_{k=0}^{n} (−1)^k \binom{n}{k} = 0,
(c) \sum_{k odd} \binom{n}{k} = 2^{n−1}, (d) \sum_{k even} \binom{n}{k} = 2^{n−1},
where the sums in (c) and (d) are over k = 1, 3, . . . and k = 0, 2, . . ., respectively.
5. (Towers of Hanoi) Induction can be used to analyze games! (See Problem 6 in the
next section for the game of Nim.) For instance, the towers of Hanoi starts with three
pegs and n disks of different sizes placed on one peg, with the biggest disk on the
bottom and with the sizes decreasing to the top as shown in Figure 2.3. A move is
made by taking the top disk off a stack and putting it on another peg so that there is
no smaller disk below it. The object of the game is to transfer all the disks to another
peg. Prove that the puzzle can be solved in 2^n − 1 moves. (In fact, you cannot solve
the puzzle in fewer than 2^n − 1 moves, but the proof of this is another story.)
6. (The coin game) Two people have n coins each and they put them on a table, in
separate piles, then they take turns removing their own coins; they may take as many
as they wish, but they must take at least one. The person removing the last coin(s)
wins. Using strong induction, prove that the second person has a “foolproof winning
strategy.” More explicitly, prove that for each n ∈ N, there is a strategy such that the
second person will win the game with n coins if he follows the strategy.
7. We now prove the arithmetic-geometric mean inequality (AGMI): For any nonnegative
(that is, ≥ 0) real numbers a_1, . . . , a_n, we have
(a_1 · · · a_n)^{1/n} ≤ (a_1 + · · · + a_n)/n or equivalently, a_1 · · · a_n ≤ ((a_1 + · · · + a_n)/n)^n.
The product (a_1 · · · a_n)^{1/n} is the geometric mean and the sum (a_1 + · · · + a_n)/n is the
arithmetic mean, respectively, of the numbers a_1, . . . , a_n.
(i) Show that √(a_1 a_2) ≤ (a_1 + a_2)/2. Suggestion: Expand (√a_1 − √a_2)^2 ≥ 0.
(ii) By induction show the AGMI holds for 2^n terms for every natural number n.
(iii) Now prove the AGMI for n terms where n is not necessarily a power of 2. (Do
not use induction.) Suggestion: Let a = (a_1 + · · · + a_n)/n. By Problem 3h,
we know that 2^n − n is a natural number. Apply the AGMI to the 2^n terms
a_1, . . . , a_n, a, a, . . . , a where there are 2^n − n occurrences of a in this list, to derive
the AGMI in general.
8. Here’s Newman’s proof [157] of the AGMI. The AGMI holds for one term, so assume
it holds for n terms; we shall prove that the AGMI holds for n + 1 terms.
(a) Prove that if the AGMI holds for all n + 1 nonnegative numbers a_1, . . . , a_{n+1} such
that a_1 · · · a_{n+1} = 1, then the AGMI holds for any n + 1 nonnegative numbers.
(b) By (a), we just have to verify that the AGMI holds when a_1 · · · a_{n+1} = 1. Using
the induction hypothesis, prove that a_1 + · · · + a_n + a_{n+1} ≥ n(a_{n+1})^{−1/n} + a_{n+1}.
(c) Prove that for any x > 0, we have nx^{−1/n} + x ≥ n + 1. Suggestion: Replace
n by n + 1 and set a = x^{1/n} − 1 > −1 in Bernoulli’s inequality. Now prove that
a_1 + · · · + a_{n+1} ≥ n + 1, which is the AGMI for n + 1 terms.
9. (Fibonacci sequence) Define F_0 = 0, F_1 = 1, and F_n = F_{n−1} + F_{n−2} for all n ≥ 2.
Using strong induction, prove that for every natural number n,
F_n = (1/√5) [Φ^n − (−Φ)^{−n}], where Φ = (1 + √5)/2 is called the golden ratio.
Suggestion: Note that Φ^2 = Φ + 1 and hence −Φ^{−1} = 1 − Φ = (1 − √5)/2.
10. (Pascal’s method) Using a method due to Pascal, we generalize our formula (2.2)
for the sum of the first n integers to sums of powers. See [18] for more on Pascal’s
method. For any natural number k, put σ_k(n) := 1^k + 2^k + · · · + n^k and set σ_0(n) := n.
(i) Prove that
(n + 1)^{k+1} − 1 = \sum_{ℓ=0}^{k} \binom{k+1}{ℓ} σ_ℓ(n).
Suggestion: The left-hand side can be written as \sum_{m=1}^{n} [(m + 1)^{k+1} − m^{k+1}].
Use the binomial theorem on (m + 1)^{k+1}.
(ii) Using the strong form of induction, prove that for each natural number k,
σ_k(n) = (1/(k + 1)) n^{k+1} + a_{kk} n^k + · · · + a_{k2} n^2 + a_{k1} n (Pascal’s formula),
for some coefficients a_{k1}, . . . , a_{kk} ∈ Q.
(iii) (Cf. [124]) Using the fact that σ_3(n) = (1/4) n^4 + a_{33} n^3 + a_{32} n^2 + a_{31} n, find the
coefficients a_{31}, a_{32}, a_{33}. Suggestion: Consider the difference σ_3(n) − σ_3(n − 1).
Can you find the coefficients in the sum for σ_4(n)?
11. (The multinomial theorem) A multi-index is an n-tuple of nonnegative integers
and is usually denoted by a Greek letter, for instance α = (α_1, . . . , α_n) where each
α_k is one of 0, 1, 2, . . .. We define |α| := α_1 + · · · + α_n and α! := α_1! · α_2! · · · α_n!. By
induction on n, prove that for any complex numbers a_1, . . . , a_n and natural number k,
we have
(a_1 + · · · + a_n)^k = \sum_{|α|=k} (k!/α!) a_1^{α_1} · · · a_n^{α_n}.
Suggestion: For the induction step, write a_1 + · · · + a_{n+1} = a + a_{n+1} where a =
a_1 + · · · + a_n and use the binomial formula on (a + a_{n+1})^k.

2.3. The integers


Have you ever wondered what it would be like in a world where the temperature
was never below zero degrees Celsius? How boring it would be to never see snow!
The natural numbers 1, 2, 3, . . . are closed under addition and multiplication, which
are essential for counting purposes used in everyday life. However, the natural
numbers do not have negatives, which is a problem. In particular, N is not closed
under subtraction. For example, 4 − 7 = −3 is not a natural number, so
the equation
x+7=4
does not have any solution x in the natural numbers. We can either accept that such
an equation does not have solutions⁴ or we can describe a number system where
such an equation does have solutions. We shall go the latter route and in this
section we study the integers or whole numbers, which are closed under subtraction
and have negatives.
2.3.1. Axioms for integer numbers. Incorporating zero and the negatives
of the natural numbers,
0, −1, −2, −3, −4, . . . ,
to the natural numbers forms the integers or whole numbers:
. . . , −4, −3, −2, −1, 0, 1, 2, 3, 4, . . . .
The set of integers is denoted by Z. The natural numbers are also referred to as
the positive integers, their negatives the negative integers, and the numbers
0, 1, 2, . . ., the natural numbers plus zero, are called the nonnegative integers,
and finally, 0, −1, −2, . . . are the nonpositive integers. The following arithmetic
properties of addition and multiplication of integers, like for natural numbers, are
familiar (in the following a, b denote arbitrary integers):
Addition satisfies
(A1) a + b = b + a; (commutative law)
(A2) (a + b) + c = a + (b + c); (associative law)
(A3) there is an integer denoted by 0 “zero” such that
a + 0 = a = 0 + a; (existence of additive identity)
(A4) for each integer a there is an integer denoted by the symbol −a such that⁵
a + (−a) = 0 and (−a) + a = 0. (existence of additive inverse)
⁴ The imaginary expression √(−a) and the negative expression −b, have this resemblance,
that either of them occurring as the solution of a problem indicates some inconsistency or
absurdity. As far as real meaning is concerned, both are imaginary, since 0 − a is as inconceivable as
√(−a). Augustus De Morgan (1806–1871).
⁵ At this moment, there could possibly be another integer, say b ≠ −a, such that a + b = 0,
but in Theorem 2.9 we prove that if such a b exists, then b = −a; so additive inverses are unique.
Multiplication satisfies
(M1) a · b = b · a; (commutative law)
(M2) (a · b) · c = a · (b · c); (associative law)
(M3) there is an integer denoted by 1 “one” such that
1 · a = a = a · 1. (existence of multiplicative identity)
As with the natural numbers, the · is sometimes dropped and the associative laws
imply that expressions involving integers such as a+b+c or abc make sense without
using parentheses. Addition and multiplication are related by
(D) a · (b + c) = (a · b) + (a · c). (distributive law)
Of these arithmetic properties, the only additional properties listed that were not
listed for natural numbers are (A3) and (A4). As usual, we denote
b + (−a) = (−a) + b by b − a,
so that subtraction is, by definition, really just “adding negatives”. A set together
with operations of addition and multiplication that satisfy properties (A1) – (A4),
(M2), (M3), and (D) is called a ring; essentially a ring is just a set of objects closed
under addition, multiplication, and subtraction. If in addition, this multiplication
satisfies (M1), then the set is called a commutative ring.
The natural numbers or positive integers, which we denote by N or by Z+ , are
closed under addition and multiplication, and have the following property: Given any
integer a, exactly one of the following “positivity” properties holds:
(P) a is a positive integer, a = 0, or −a is a positive integer.
Stated another way, property (P) means that Z is a union of disjoint sets,
Z = Z+ ∪ {0} ∪ −Z+ ,
where −Z+ consists of all integers a such that −a ∈ Z+ .
Everyday experience convinces us that the integers satisfy properties (A), (M),
(D), and (P); however, as with the natural numbers, we will assume the existence
of a set Z satisfying properties (A), (M), (D), and (P) such that Z+ = N, the
natural numbers. From the properties listed above, we shall derive some well-
known properties of the integers memorized since grade school.
2.3.2. Proofs of well-known high school rules. Since the integers satisfy
the same arithmetic properties as the natural numbers, the same proofs as in Section
2.1 prove that the distributive law (D) holds from the right and the FOIL law holds.
Also, the cancellation law of Theorem 2.3 holds: Given any integers a, b, c,
a = b if and only if a + c = b + c .
However, now this statement is easily proved using the fact that the integers have
additive inverses. We only prove the “if” part: If a + c = b + c, then adding −c to
both sides of this equation we obtain
(a+c)+(−c) = (b+c)+(−c) =⇒ a+(c+(−c)) = b+(c+(−c)) =⇒ a+0 = b+0,
or a = b. Comparing this proof with that of Theorem 2.3 shows the usefulness of
having additive inverses.
We now show that we can always solve equations such as the one given at
the beginning of this section. Moreover, we prove that there is only one additive
identity.
Theorem 2.9 (Uniqueness of additive identities and inverses). For
a, b, x ∈ Z,
(1) The equation
x+a=b holds if and only if x = b − a.
In particular, the only x that satisfies the equation x + a = a is x = 0. Thus,
there is only one additive identity.
(2) The only x that satisfies the equation
x + a = 0
is x = −a. Thus, each integer has only one additive inverse.
(3) Finally, 0 · a = 0. (zero × anything is zero)
Proof. If x = b − a, then
(b − a) + a = (b + (−a)) + a = b + ((−a) + a) = b + 0 = b,
so the integer x = b − a solves the equation x + a = b. Conversely, if x satisfies
x + a = b, then
x + a = b = b + 0 = b + ((−a) + a) = (b + (−a)) + a,
so by cancellation (adding −a to both sides), x = b + (−a) = b − a. This proves
(1) and taking b = 0 in (1) implies (2).
Since 0 = 0 + 0, we have
0 · a = (0 + 0) · a = 0 · a + 0 · a.
Cancelling 0 · a (that is, adding −(0 · a) to both sides) proves (3). 

By commutativity, a + x = b if and only if x = b − a. Similarly, a + x = a if
and only if x = 0, and a + x = 0 if and only if x = −a. We now prove the very
familiar “rules of signs” memorized since grade school.
Theorem 2.10 (Rules of sign). The following “rules of signs” hold:
(1) −(−a) = a.
(2) a · (−1) = −a = (−1) · a.
(3) (−1) · (−1) = 1.
(4) (−a) · (−b) = ab.
(5) (−a) · b = −(ab) = a · (−b).
(6) −(a + b) = (−a) + (−b). In particular, −(a − b) = b − a.
Proof. We prove (1)–(3) and leave (4)–(6) for you in Problem 1.
To prove (1), note that since a + (−a) = 0, by uniqueness of additive inverses
proved in the previous theorem, the additive inverse of −a is a, that is, −(−a) = a.
To prove (2), observe that
a + a · (−1) = a · 1 + a · (−1) = a · (1 + (−1)) = a · 0 = 0,
so by uniqueness of additive inverses, we have −a = a · (−1). By commutativity,
−a = (−1) · a also holds.
By (1), (2), we get (3): (−1) · (−1) = −(−1) = 1. 
Everyone knows that −0 = 0. This fact follows as an easy application of
(2): −0 = 0 · (−1) = 0, since zero times anything is zero.
Using the positivity assumption (P), we can order the integers in much the
same way as the natural numbers are ordered. Given any integers a and b exactly
one of the following holds:
(O1) a = b, that is, b − a = 0;
(O2) a < b, which means that b − a is a positive integer;
(O3) b < a, which means that −(b − a) is a positive integer.
By our previous theorem, we have −(b − a) = a − b, so (O3) is just that a − b is a
natural number.
Just as for natural numbers, we can define ≤, >, and ≥. For example, 0 < b, or
b > 0, means that b − 0 = b is a positive integer. Thus, “an integer b is greater than 0”
is synonymous with “b is a positive integer.” (Of course, this agrees with our English
usage of b > 0 to mean b is positive!) Similarly, b < 0 means that 0 − b = −b is a
positive integer. As with the natural numbers, we have the transitive law: If a < b
and b < c, then a < c, and we also have the Archimedean ordering of Z: Given
any natural number a and integer b there is a natural number n so that b < a · n.
To see this last property, note that if b ≤ 0, then any natural number n works; if
b > 0, then b is a natural number and the Archimedean ordering of N applies to
show the existence of n. We now prove some of the familiar inequality rules.
Theorem 2.11 (Inequality rules). The following inequality rules hold:
(1) If a < b and c ≤ d, then a + c < b + d.
(2) If a < b and c > 0, then a · c < b · c. (positives preserve inequalities)
(3) If a < b and c < 0, then a · c > b · c. (negatives switch inequalities)
(4) If a > 0 and b > 0, then ab > 0. (positive × positive is positive)
(5) If a > 0 and b < 0 (or vice versa), then ab < 0. (positive × negative is negative)
(6) If a < 0 and b < 0, then ab > 0. (negative × negative is positive)
Proof. We prove (1)–(3) and leave (4)–(6) for you in Problem 1.
To prove (1), we use associativity and commutativity to write
(b + d) − (a + c) = (b − a) + (d − c).
Since a < b, by definition of less than, b − a is a natural number and since c ≤ d,
d − c is either zero (if c = d) or a natural number. Hence, (b − a) + (d − c) is either
adding two natural numbers or a natural number and zero; in either case, the result
is a natural number. Thus, a + c < b + d.
If a < b and c > 0, then
b · c − a · c = (b − a) · c.
Since c > 0, the integer c is a natural number, and since a < b, the integer b − a is a natural number, so
their product (b − a) · c is also a natural number. Thus, a · c < b · c.
If a < b and c < 0, then by our rules of sign,
a · c − b · c = (a − b) · c = −(a − b) · (−c) = (b − a) · (−c).
Since c < 0, the integer −c is a natural number and since a < b, the integer b − a
is a natural number, so their product (b − a) · (−c) is also a natural number. Thus,
a · c > b · c. 
We now prove that zero and one have the familiar properties that you know.
Theorem 2.12 (Properties of zero and one). Zero and one satisfy
(1) If a · b = 0, then a = 0 or b = 0.
(2) If a · b = a where a ≠ 0, then b = 1; that is, 1 is the only multiplicative identity.
Proof. We give two proofs of (1). Although Proof I is acceptable, Proof
II is much preferred because Proof I boils down to a contrapositive statement
anyways, which Proof II goes to directly.
Proof I: Assume that ab = 0. We shall prove that a = 0 or b = 0. Now either
a = 0 or a ≠ 0. If a = 0, then we are done, so assume that a ≠ 0. We need to prove
that b = 0. Well, either b = 0 or b ≠ 0. However, it cannot be true that b ≠ 0, for
according to the properties (4), (5), and (6) of our rules for inequalities,
(2.8) if a ≠ 0 and b ≠ 0, then a · b ≠ 0.
But ab = 0, so b ≠ 0 cannot be true. This contradiction shows that b = 0.
Proof II: Our second proof of (1) is a proof by contraposition, which is
essentially what we did in Proof I without stating it! Recall that, as already explained
in the proof of Theorem 2.3, the technique of a proof by contraposition is that in
order to prove the statement “if a · b = 0, then a = 0 or b = 0,” we can instead try
to prove the contrapositive statement:
if a ≠ 0 and b ≠ 0, then a · b ≠ 0.
However, as explained above in (2.8), the truth of this statement follows from our
inequality rules. This gives another (better) proof of (1).
To prove (2), assume that a · b = a where a ≠ 0. Then,
0 = a − a = a · b − a · 1 = a · (b − 1).
By (1), either a = 0 or b − 1 = 0. We are given that a ≠ 0, so we must have
b − 1 = 0, or adding 1 to both sides, b = 1. 
Property (1) of this theorem is the basis for solving quadratic equations in high
school. For example, let us solve x^2 − x − 6 = 0. We first “factor”; that is, observe
that
(x − 3)(x + 2) = x^2 − x − 6 = 0.
By property (1), we know that x − 3 = 0 or x + 2 = 0. Thus, x = 3 or x = −2.
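As a quick sanity check of the factoring argument, here is a minimal Python sketch. The function names and the search bounds are illustrative assumptions, not part of the text; the code only confirms the worked example on a small range of integers and is no substitute for the proof.

```python
def factored_form_agrees(lo=-50, hi=50):
    """Check that (x - 3)(x + 2) equals x^2 - x - 6 for every integer in [lo, hi]."""
    return all((x - 3) * (x + 2) == x * x - x - 6 for x in range(lo, hi + 1))


def integer_solutions(lo=-50, hi=50):
    """List the integers x in [lo, hi] with x^2 - x - 6 = 0 (hypothetical search bounds)."""
    return [x for x in range(lo, hi + 1) if x * x - x - 6 == 0]


print(factored_form_agrees())  # True
print(integer_solutions())     # [-2, 3]
```

The search finds exactly the two solutions x = −2 and x = 3 obtained from property (1).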
2.3.3. Absolute value. Given any integer a, we know that either a = 0, a is
a positive integer, or −a is a positive integer. The absolute value of the integer
a is denoted by |a| and is defined to be the “nonnegative part of a”:
|a| := a if a ≥ 0, and |a| := −a if a < 0.
Thus, for instance, |5| = 5, while |−2| = −(−2) = 2. In the following theorem, we
prove some (what should be) familiar rules of absolute value. To prove statements
about absolute values, it’s convenient to prove by cases.
Theorem 2.13 (Absolute value rules). For a, b, x ∈ Z,
(1) |a| = 0 if and only if a = 0.
(2) |ab| = |a| |b|.
(3) |a| = | − a|.
(4) For x ≥ 0, |a| ≤ x if and only if −x ≤ a ≤ x.
(5) −|a| ≤ a ≤ |a|.
(6) |a + b| ≤ |a| + |b|. (triangle inequality)
Proof. If a = 0, then by definition, |0| = 0. Conversely, suppose that |a| = 0.
We have two cases: either a ≥ 0 or a < 0. If a ≥ 0, then 0 = |a| = a, so a = 0. If
a < 0, then 0 = |a| = −a, so a = 0 in this case as well. This proves (1).
To prove (2), we consider four cases: a ≥ 0 and b ≥ 0, a < 0 and b ≥ 0,
a ≥ 0 and b < 0, and lastly, a < 0 and b < 0. If a ≥ 0 and b ≥ 0, then ab ≥ 0, so
|ab| = ab = |a|·|b|. If a < 0 and b ≥ 0, then ab ≤ 0, so |ab| = −ab = (−a)·b = |a|·|b|.
The case that a ≥ 0 and b < 0 is handled similarly. Lastly, if a < 0 and b < 0, then
ab > 0, so |ab| = ab = (−a) · (−b) = |a| · |b|.
By (2), we have | − a| = |(−1) · a| = | − 1| · |a| = 1 · |a| = |a|, which proves (3).
To prove (4) we again go to two cases: a ≥ 0, a < 0. In the first case, if a ≥ 0,
then −x ≤ a ≤ x if and only if −x ≤ |a| ≤ x, which holds if and only if |a| ≤ x.
On the other hand, if a < 0, then −x ≤ a ≤ x if and only if (multiplying through
by −1) −x ≤ −a ≤ x or −x ≤ |a| ≤ x, which holds if and only if |a| ≤ x. Property
(5) follows from (4) with x = |a|.
Finally, we prove the triangle inequality. From (5) we have −|a| ≤ a ≤ |a| and
−|b| ≤ b ≤ |b|. Adding these inequalities gives
−(|a| + |b|) ≤ a + b ≤ |a| + |b|.
Applying (4) gives the triangle inequality. 
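The case-by-case definition of |a| translates directly into code, and rules (2), (4), and (6) can be spot-checked on a finite range. The following Python sketch is only an illustration under that assumption: the names abs_int and check_absolute_value_rules and the bound 20 are hypothetical choices, and a finite check does not replace the proof by cases given above.

```python
def abs_int(a):
    """Absolute value by cases: a if a >= 0, and -a if a < 0."""
    return a if a >= 0 else -a


def check_absolute_value_rules(bound=20):
    """Brute-force check of rules (2), (4), (6) for all integers in [-bound, bound]."""
    values = range(-bound, bound + 1)
    for a in values:
        for b in values:
            # Rule (2): |ab| = |a| |b|
            if abs_int(a * b) != abs_int(a) * abs_int(b):
                return False
            # Rule (6): triangle inequality |a + b| <= |a| + |b|
            if abs_int(a + b) > abs_int(a) + abs_int(b):
                return False
        for x in range(0, bound + 1):
            # Rule (4): for x >= 0, |a| <= x if and only if -x <= a <= x
            if (abs_int(a) <= x) != (-x <= a <= x):
                return False
    return True


print(abs_int(5), abs_int(-2))       # 5 2
print(check_absolute_value_rules())  # True
```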
Exercises 2.3.
1. Prove properties (4)–(6) in the “Rules of signs” and “Inequality rules” theorems.
2. For integers a, b, prove the inequalities
| |a| − |b| | ≤ |a ± b| ≤ |a| + |b|.
3. Let b be an integer. Prove that the only integer a satisfying
b − 1 < a < b + 1
is the integer a = b.
4. Let n ∈ N. Assume properties of powers from Example 2.2 in Section 2.2.
(a) Let a, b be nonnegative integers. Using a proof by contraposition, prove that if
a^n = b^n , then a = b.
(b) We now consider the situation when a, b are not necessarily positive. So, let a, b be
arbitrary integers. Suppose that n = 2m for some positive integer m. Prove that
if a^n = b^n , then a = ±b.
(c) Again let a, b be arbitrary integers. Suppose that n = 2m − 1 for some natural
number m. Prove the statement if a^n = b^n , then a = b, using a proof by cases.
Here the cases consist of a, b both nonnegative, both negative, and when one is
nonnegative and the other negative (in this last case, show that a^n = b^n actually
can never hold, so for this last case, the statement is superfluous).
5. In this problem we prove an integer version of induction. Let k be any integer (positive,
negative, or zero) and suppose that we are given a list of statements:
P_k , P_{k+1} , P_{k+2} , . . . ,
and suppose that (1) P_k is true and (2) if n is an integer with n ≥ k and the statement
P_n happens to be valid, then the statement P_{n+1} is also valid. Prove that every single
statement P_k , P_{k+1} , P_{k+2} , . . . is true.
6. (Game of Nim) Here’s a fascinating example using strong induction and proof by
cases; see the coin game in Problem 6 of Exercises 2.2 for a related game. Suppose
that n stones are thrown on the ground. Two players take turns removing one, two, or
three stones each. The last one to remove a stone loses. Let P_n be the statement that
the player starting first has a foolproof winning strategy if n is of the form n = 4k,
4k + 2, or 4k + 3 for some k = 0, 1, 2, . . . and the player starting second has a foolproof
winning strategy if n = 4k + 1 for some k = 0, 1, 2, . . .. In this problem we prove that
P_n is true for all n ∈ N.⁶
(i) Prove that P_1 is true. Assume that P_1 , . . . , P_n hold. To prove that P_{n+1} holds, we
prove by cases. The integer n + 1 can be of four types: n + 1 = 4k, n + 1 = 4k + 1,
n + 1 = 4k + 2, or n + 1 = 4k + 3.
(ii) Case 1: n + 1 = 4k. The first player can remove one, two, or three stones; in
particular, he can remove three stones (leaving 4k − 3 = 4(k − 1) + 1 stones).
Prove that the first person wins.
(iii) Case 2: n + 1 = 4k + 1. Prove that the second player will win regardless of whether the first
person takes one, two, or three stones (leaving 4k, 4(k − 1) + 3, and 4(k − 1) + 2
stones, respectively).
(iv) Case 3, Case 4: n + 1 = 4k + 2 or n + 1 = 4k + 3. Prove that the first player has
a winning strategy in the cases that n + 1 = 4k + 2 or n + 1 = 4k + 3. Suggestion:
Make the first player remove one and two stones, respectively.
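Before attempting the induction in the Game of Nim problem, it can be reassuring to test the claimed pattern by computer. The following Python sketch searches the game exhaustively for small n; the function names mover_wins and predicted_by_pattern and the range tested are our own illustrative choices. It checks the pattern but does not prove it.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def mover_wins(n):
    """True if the player about to move can force a win with n stones left.

    Rules as in the exercise: remove 1, 2, or 3 stones per turn, and
    whoever removes the last stone loses.
    """
    if n == 0:
        # The opponent has just removed the last stone and therefore lost.
        return True
    # The mover wins if some legal move leaves the opponent in a losing position.
    return any(not mover_wins(n - take) for take in (1, 2, 3) if take <= n)


def predicted_by_pattern(n):
    """Claimed pattern: the first player loses exactly when n = 4k + 1."""
    return n % 4 != 1


# Compare the exhaustive search with the claimed pattern for n = 1, ..., 200.
print(all(mover_wins(n) == predicted_by_pattern(n) for n in range(1, 201)))  # True
```

The search mirrors the structure of the induction step: the outcome for n + 1 stones is decided entirely by the outcomes for n, n − 1, and n − 2 stones.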

2.4. Primes and the fundamental theorem of arithmetic


It is not always true that given any two integers a and b, there is a third integer
q (for “quotient”) such that
b = a q.
For instance, 2 = 4q can never hold for any integer q, nor can 17 = 2q. This of
course is exactly the reason rational numbers are needed! (We shall study rational
numbers in Section 2.6.) The existence or nonexistence of such quotients opens up
an incredible wealth of topics concerning prime numbers in number theory.⁷

2.4.1. Divisibility. If a and b are integers, and there is a third integer q such
that
b = a q,
then we say that a divides b or b is divisible by a or b is a multiple of a, in which
case we write a|b and call a a divisor or factor of b and q the quotient (of b
divided by a).
Example 2.6. For example, 4|(−16) with quotient −4 because −16 =
4 · (−4) and (−2)|(−6) with quotient 3 because −6 = (−2) · 3.
We also take the convention that
divisors are by definition nonzero.
To see why, note that
0 = 0 · 0 = 0 · 1 = 0 · (−1) = 0 · 2 = 0 · (−2) = · · · ,

so if 0 were allowed to be a divisor, then every integer would be a quotient when 0 is
divided by itself! However, if a ≠ 0, then b = aq can have only one quotient, for if
in addition b = aq′ for some integer q′ , then
aq = aq′ =⇒ aq − aq′ = 0 =⇒ a(q − q′ ) = 0.
Since a ≠ 0, we must have q − q′ = 0, or q = q′ . So the quotient q is unique.
Because uniqueness is so important in mathematics we always assume that divisors
are nonzero. Hence the high school phrase “You can never divide by 0!” Here
are some important properties of division.

⁶ Here, we are implicitly assuming that any natural number can be written in the form 4k,
4k + 1, 4k + 2, or 4k + 3; this follows from Theorem 2.15 on the division algorithm, which we
assume just for the sake of presenting a cool exercise!
⁷ Mathematicians have tried in vain to this day to discover some order in the sequence of
prime numbers, and we have reason to believe that it is a mystery into which the human mind
will never penetrate. Leonhard Euler (1707–1783) [210].
Theorem 2.14 (Divisibility rules). The following divisibility rules hold:
(1) If a|b and b is positive, then |a| ≤ b.
(2) If a|b, then a|bc for any integer c.
(3) If a|b and b|c, then a|c.
(4) If a|b and a|c, then a|(bx + cy) for any integers x and y.
Proof. Assume that a|b and b > 0. Since a|b, we know that b = aq for some
integer q. Assume for the moment that a > 0. By our inequality rules (Theorem
2.11), we know that “positive × negative is negative”, so q cannot be negative. Also, q
can’t be zero because b ≠ 0. Therefore q > 0, that is, q ≥ 1. By our rules for inequalities,
a = a · 1 ≤ a · q = b =⇒ |a| ≤ b.
Assume now that a < 0. Then b = aq = (−a)(−q). Since (−a) > 0, by our proof
for positive divisors that we just did, we have (−a) ≤ b, that is, |a| ≤ b.
We now prove (2). If a|b, then b = aq for some integer q. Hence,
bc = (aq)c = a(qc),
so a|bc.
To prove (3), suppose that a|b and b|c. Then b = aq and c = bq′ for some
integers q and q′ . Hence,
c = bq′ = (aq)q′ = a(qq′ ),
so a|c.
Finally, assume that a|b and a|c. Then b = aq and c = aq′ for integers q and
q′ . Hence, for any integers x and y,
bx + cy = (aq)x + (aq′ )y = a(qx + q′ y),
so a|(bx + cy). 
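The definition of divisibility and rule (4) can be illustrated with a short Python sketch. The helper names divides, quotient, and check_rule_4 and the bound used in the check are hypothetical; the sketch relies on the fact that, for integers, Python's % operator returns 0 exactly when the division is exact.

```python
def divides(a, b):
    """True if a | b, that is, b = a*q for some integer q (a must be nonzero)."""
    if a == 0:
        raise ValueError("divisors are by definition nonzero")
    return b % a == 0


def quotient(a, b):
    """The unique integer q with b = a*q, assuming a | b."""
    if not divides(a, b):
        raise ValueError(f"{a} does not divide {b}")
    return b // a


def check_rule_4(bound=6):
    """Brute-force check of rule (4): a|b and a|c imply a|(bx + cy)."""
    rng = range(-bound, bound + 1)
    for a in rng:
        if a == 0:
            continue
        multiples = [a * q for q in rng]  # integers b with a | b
        for b in multiples:
            for c in multiples:
                for x in rng:
                    for y in rng:
                        if not divides(a, b * x + c * y):
                            return False
    return True


print(quotient(4, -16))  # -4
print(quotient(-2, -6))  # 3
print(divides(4, 2))     # False
print(check_rule_4())    # True
```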
2.4.2. The division algorithm. Although we cannot always divide one integer
into another we can always do it up to remainders.
Example 2.7. For example, although 2 does not divide 7, we can write
7 = 3 · 2 + 1.
Another example is that although −3 does not divide −13, we do have
−13 = 5 · (−3) + 2.
In general, if a and b are integers and
b = qa + r, where 0 ≤ r < |a|,
then we call q the quotient (of b divided by a) and r the remainder. Such numbers
always exist, as we now prove.
Theorem 2.15 (The division algorithm). Given any integers a and b with
a ≠ 0, there are unique integers q and r so that
b = qa + r with 0 ≤ r < |a|.
Moreover, if a and b are both positive, then q is nonnegative. Furthermore, a divides
b if and only if r = 0.
Proof. Assume for the moment that a > 0. Consider the list of integers
(2.9) . . . , 1 + b − 3a, 1 + b − 2a, 1 + b − a, 1 + b, 1 + b + a, 1 + b + 2a, 1 + b + 3a, . . .
extending indefinitely in both ways. Notice that since a > 0, for any integer n,
1 + b + na < 1 + b + (n + 1)a,
so the integers in the list (2.9) are increasing. Moreover, by the Archimedean
ordering of the integers, there is a natural number n so that −1 − b < an or
1 + b + an > 0. In particular, 1 + b + ak > 0 for k ≥ n. Thus, far enough to the right
in the list (2.9), all the integers are positive. Let A be the set of all natural numbers
appearing in the list (2.9). By the well-ordering principle (Theorem 2.6), this set of
natural numbers has a least element, let us call it 1 + b + ma where m is an integer.
This integer satisfies
(2.10) 1 + b + (m − 1)a < 1 ≤ 1 + b + ma,
for if 1 + b + (m − 1)a ≥ 1, then 1 + b + (m − 1)a would be an element of A smaller
than 1 + b + ma. Put q = −m and r = b + ma = b − qa. Then b = qa + r by
construction, and substituting in q and r into (2.10), we obtain
1 + r − a < 1 ≤ 1 + r.
Subtracting 1 from everything, we see that
r − a < 0 ≤ r.
Thus, 0 ≤ r and r − a < 0 (that is, r < a). Thus, we have found integers q and r
so that b = qa + r with 0 ≤ r < a. Observe from (2.10) that if b is positive, then
m can’t be positive (for otherwise the left-hand inequality in (2.10) wouldn’t hold).
Thus, q is nonnegative if both a and b are positive. Assume now that a < 0. Then
−a > 0, so by what we just did, there are integers s and r with b = s(−a) + r with
0 ≤ r < −a; that is, b = qa + r, where q = −s and 0 ≤ r < |a|.
We now prove uniqueness. Assume that we also have b = q′ a + r′ with 0 ≤ r′ < |a|.
We first prove that r = r′ . Indeed, suppose that r ≠ r′ ; then by
symmetry in the primed and unprimed letters, we may presume that r < r′ . Then
0 < r′ − r ≤ r′ < |a|. Moreover,
qa + r = q′ a + r′ =⇒ (q − q′ )a = r′ − r.
This shows that a|(r′ − r), which is impossible since r′ − r is positive and smaller than |a| (see
property (1) of Theorem 2.14). Thus, we must have r = r′ . Then the equation
(q − q′ )a = r′ − r reads (q − q′ )a = 0. Since a ≠ 0, we must have q − q′ = 0, or
q = q′ . Our proof of uniqueness is thus complete.
Finally, we prove that a|b if and only if r = 0. If a|b, then b = ac = ac + 0
for some integer c. By uniqueness already established, we have q = c and r = 0.
Conversely, if r = 0, then b = aq, so a|b by definition of divisibility. 
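To see the division algorithm in action, here is a minimal Python sketch that returns the pair (q, r) with b = qa + r and 0 ≤ r < |a|. The function name division_algorithm is our own. One caveat, which is a property of Python rather than of the theorem: the built-in divmod gives a remainder with the same sign as the divisor, so for a < 0 the result must be shifted once to land in the range 0 ≤ r < |a|.

```python
def division_algorithm(b, a):
    """Return (q, r) with b == q*a + r and 0 <= r < |a|, for any integers a != 0 and b."""
    if a == 0:
        raise ValueError("divisors are by definition nonzero")
    q, r = divmod(b, a)  # Python: r is 0 or has the same sign as a
    if r < 0:            # can only happen when a < 0
        q, r = q + 1, r - a
    return q, r


print(division_algorithm(7, 2))     # (3, 1), since 7 = 3 * 2 + 1
print(division_algorithm(-13, -3))  # (5, 2), since -13 = 5 * (-3) + 2
```

Both outputs agree with Example 2.7, and in each case a divides b exactly when the returned remainder is 0.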
An integer n is even if we can write n = 2m for some integer m, and odd if
we can write n = 2m + 1 for some integer m.
Example 2.8. For instance, 0 = 2 · 0 so 0 is even, 1 = 2 · 0 + 1 so 1 is odd, and
−1 is odd since −1 = 2 · (−1) + 1.
Using the division algorithm we can easily prove that every integer is either even
or odd. Indeed, dividing n by 2, the division algorithm implies that n = 2m + k
where 0 ≤ k < 2, that is, where k is either 0 or 1. This shows that n is either even
(if k = 0) or odd (if k = 1).
An important application of the division algorithm is to the so-called Euclidean
algorithm for finding greatest common divisors; see Problem 4.
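For readers who would like to experiment, the repeated-division idea behind the Euclidean algorithm (developed in Problem 4) can be sketched in a few lines of Python. The function name gcd_euclid is an assumption of ours, and the loop uses Python's % operator on nonnegative integers as the remainder from the division algorithm.

```python
def gcd_euclid(a, b):
    """Greatest common divisor of two integers, not both zero, by repeated division."""
    a, b = abs(a), abs(b)  # signs do not affect common divisors
    if a == 0 and b == 0:
        raise ValueError("the GCD is undefined when both integers are zero")
    while a != 0:
        # Write b = q*a + r with 0 <= r < a, then replace the pair (a, b) by (r, a).
        a, b = b % a, a
    return b  # the last nonzero remainder


print(gcd_euclid(12, 18))  # 6
print(gcd_euclid(17, 5))   # 1
```

Each pass through the loop strictly decreases the first entry of the pair, which is why the loop must stop.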
2.4.3. Prime numbers. Consider the number 12. This number has 6 positive
factors or divisors, 1, 2, 3, 4, 6, and 12. The number 21 has 4 positive factors, 1, 3,
7, and 21. The number 1 has only one positive divisor, 1. However, as the reader
can check, 17 has exactly two positive factors, 1 and 17. Similarly, 5 has exactly
two positive factors, 1 and 5. Numbers such as 5 and 17 are given a special name: A
natural number that has exactly two positive factors is called a prime number.⁸
Another way to say this is that a prime number is a natural number with exactly
two positive factors, itself and 1. (Thus, 1 is not prime.) A list of the first ten primes is
2, 3, 5, 7, 11, 13, 17, 19, 23, 29, . . . .
A natural number greater than 1 that is not prime is called a composite number. Notice that
12 = 2 × 6 = 2 × 2 × 3, 21 = 3 × 7, 17 = 17, 5 = 5.
In each of these circumstances, we have factored each number, that is, expressed it
as a product of its prime factors. Here, by convention, we consider a
prime number to be already in its factored form.
Lemma 2.16. Every natural number other than 1 can be factored into primes.
Proof. We shall prove that for any natural number m = 1, 2, 3, . . ., the
number m + 1 can be factored into primes; which is to say, any natural number
n = 2, 3, 4, . . . can be factored into primes. We prove this lemma by using
strong induction. By our convention, n = 2 is already in factored form. Assume
our lemma holds for all natural numbers 2, 3, 4, . . . , n; we shall prove our lemma
holds for the natural number n + 1. Now n + 1 is either prime or composite. If it
is prime, then it is already in factored form. If it is composite, then n + 1 = pq
where p and q are natural numbers greater than 1. By Theorem 2.14, p and q are at
most n + 1, and neither can equal n + 1 because the other factor is greater than 1;
hence both p and q are less than n + 1. By induction hypothesis, p and q can be
factored into primes. It follows that n + 1 = pq can also be factored into primes.
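The lemma guarantees that a factorization into primes exists; to actually find one for a small number, a simple method is trial division. The Python sketch below is one such method under our own naming (prime_factors); it is not the inductive argument of the proof, just a convenient way to reproduce the factorizations displayed above.

```python
def prime_factors(n):
    """Return the prime factors of n >= 2 in nondecreasing order, by trial division."""
    if n < 2:
        raise ValueError("only natural numbers other than 1 are factored into primes")
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as often as possible; any d that divides here is prime,
        # because all smaller prime factors have already been removed.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(12))  # [2, 2, 3]
print(prime_factors(21))  # [3, 7]
print(prime_factors(17))  # [17]
```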
One of the first questions that one may ask is how many primes there are.
This was answered by Euclid of Alexandria (325 B.C.–265 B.C.): There are infinitely
many. The following proof is the original due to Euclid and is the classic “proof by
contradiction”.
Theorem 2.17 (Euclid’s theorem). There are infinitely many primes.
⁸ I hope you will agree that there is no apparent reason why one number is prime and another
not. To the contrary, upon looking at these numbers one has the feeling of being in the presence
of one of the inexplicable secrets of creation. Don Bernard Zagier [253, p. 8].
Proof. We start with the tentative assumption that the theorem is false.
Thus, we assume that there are only finitely many primes. There being only finitely
many, we can list them:
p_1 , p_2 , . . . , p_n .
Consider the number
p_1 p_2 p_3 · · · p_n + 1.
This number is either prime or composite. It is greater than all the primes p_1 , . . . , p_n ,
so this number can’t equal any of p_1 , . . . , p_n . We conclude that this number must be
composite, so
(2.11) p_1 p_2 p_3 · · · p_n + 1 = ab,
for some natural numbers a and b greater than 1. By our lemma, both a and b can be
expressed as a product involving p_1 , . . . , p_n , which implies that ab also has such an expression.
In particular, being a product of some of the p_1 , . . . , p_n , the right-hand side of
(2.11) is divisible by at least one of the prime numbers p_1 , . . . , p_n . However, the
left-hand side is certainly not divisible by any such prime because if we divide the
left-hand side by any one of the primes p_1 , . . . , p_n , we always get the remainder
1! This contradiction shows that our original assumption that the theorem is false
must have been incorrect; hence there must be infinitely many primes.
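The key step of the proof, that p_1 p_2 · · · p_n + 1 leaves remainder 1 upon division by each p_i, is easy to watch numerically. The following Python sketch uses the first six primes as a hypothetical "complete list"; the name euclid_number is ours. It also shows that the number built in the proof need not be prime itself, only that its prime factors are missing from the list.

```python
from math import prod


def euclid_number(primes):
    """The number p_1 p_2 ... p_n + 1 formed from a finite list of primes."""
    return prod(primes) + 1


primes = [2, 3, 5, 7, 11, 13]  # a hypothetical "complete" list of primes
N = euclid_number(primes)
remainders = [N % p for p in primes]

print(N)              # 30031
print(remainders)     # [1, 1, 1, 1, 1, 1]
print(59 * 509 == N)  # True: N is composite, with prime factors not in the list
```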
2.4.4. Fundamental theorem of arithmetic. Consider the integer 120,
which can be factored as follows:
120 = 2 × 2 × 2 × 3 × 5.
A little verification shows that it is impossible to factor 120 into any primes other
than the ones displayed. Of course, the order can be different; e.g.
120 = 3 × 2 × 2 × 5 × 2.
It is of fundamental importance in mathematics that any natural number can be
factored into a product of primes in only one way, apart from the order.
Theorem 2.18 (Fundamental theorem of arithmetic). Every natural number
other than 1 can be factored into primes in only one way, except for the order
of the factors.
Proof. For the sake of contradiction, let us suppose that there are natural numbers that
can be factored in more than one way. By the well-ordering principle, there is a
smallest such natural number a. Thus, we can write a as a product of primes in
two ways:
a = p_1 p_2 · · · p_m = q_1 q_2 · · · q_n .
Note that both m and n are greater than 1, for a single prime number has one prime
factorization. We shall obtain a contradiction by showing there is a smaller natural
number that has two factorizations. First, we observe that none of the primes p_j
on the left equals any of the primes q_k on the right. Indeed, if for example p_1 = q_1 ,
then by cancellation, we could divide them out obtaining the natural number
p_2 p_3 · · · p_m = q_2 q_3 · · · q_n .
This number is smaller than a and the two sides must represent two distinct prime
factorizations, for if these prime factorizations were the same apart from the orderings,
then (since p_1 = q_1 ) the factorizations for a would also be the same apart from
orderings. Since a is the smallest such number with more than one factorization, we
conclude that none of the primes p_j equals a prime q_k . In particular, since p_1 ≠ q_1 ,
by symmetry we may assume that p_1 < q_1 . Now consider the natural number
(2.12) b = (q_1 − p_1 ) q_2 q_3 · · · q_n
= q_1 q_2 · · · q_n − p_1 q_2 · · · q_n
= p_1 p_2 · · · p_m − p_1 q_2 · · · q_n (since p_1 · · · p_m = a = q_1 · · · q_n )
(2.13) = p_1 (p_2 p_3 · · · p_m − q_2 q_3 · · · q_n ).
Since 0 < q_1 − p_1 < q_1 , the number b is less than a, so b can only be factored in
one way apart from orderings. Observe that the number q_1 − p_1 cannot have p_1 as
a factor, for if p_1 divides q_1 − p_1 , then p_1 also divides (q_1 − p_1 ) + p_1 = q_1 , which is
impossible because q_1 is prime and p_1 ≠ q_1 . Thus, writing q_1 − p_1 into its prime factors, none
of which is p_1 , the expression (2.12) and the fact that p_1 ≠ q_k for any k shows that
b does not contain the factor p_1 in its factorization. On the other hand, writing
p_2 p_3 · · · p_m − q_2 q_3 · · · q_n into its prime factors, the expression (2.13) clearly shows
that p_1 is in the prime factorization of b. This contradiction ends the proof.
Another popular way to prove the fundamental theorem of arithmetic uses the
concept of the greatest common divisor; see Problem 5 for this proof.
In our first exercise, recall that the notation n! (read “n factorial”) for n ∈ N
denotes the product of the first n positive integers: n! := 1 · 2 · 3 · · · n.
Exercises 2.4.
1. A natural question is: How sparse are the primes? Prove that there are arbitrarily
large gaps in the list of primes in the following sense: Given any positive integer k,
there are k consecutive composite integers. Suggestion: Consider the integers
(k + 1)! + 2, (k + 1)! + 3, . . . , (k + 1)! + k, (k + 1)! + k + 1.
2. Using the fundamental theorem of arithmetic, prove that if a prime p divides ab, where
a, b ∈ N, then p divides a or p divides b. Is this statement true if p is not prime?
3. Prove Lemma 2.16, that every natural number other than 1 can be factored into primes,
using the well-ordering principle instead of induction.
4. (The Euclidean algorithm) Let a and b be any two integers, not both zero. Consider
the set of all positive integers that divide both a and b. This set is nonempty (it contains
1) and is finite (since integers larger than |a| and |b| cannot divide both a and b). This
set therefore has a largest element (Problem 6 in Exercises 2.1), which we denote by
(a, b) and call the greatest common divisor (GCD) of a and b. In this problem we
find the GCD using the Euclidean algorithm.
(i) Show that (±a, b) = (a, ±b) = (a, b) and (0, b) = |b|. Because of these equalities,
we henceforth assume that a and b are positive.
(ii) By the division algorithm we know that there are unique nonnegative integers q_0
and r_0 so that b = q_0 a + r_0 with 0 ≤ r_0 < a. Show that (a, b) = (a, r_0 ).
(iii) By successive divisions by remainders, we can write
(2.14) b = q_0 · a + r_0 , a = q_1 · r_0 + r_1 , r_0 = q_2 · r_1 + r_2 ,
r_1 = q_3 · r_2 + r_3 , . . . , r_{j−1} = q_{j+1} · r_j + r_{j+1} , . . . ,
where the process is continued only as far as we don’t get a zero remainder. Show
that a > r_0 > r_1 > r_2 > · · · and using this fact, explain why we must eventually
get a zero remainder.
(iv) Let r_{n+1} = 0 be the first zero remainder. Show that r_n = (a, b). Thus, the last
positive remainder in the sequence (2.14) equals (a, b). This process for finding
the GCD is called the Euclidean algorithm.
(v) Using the Euclidean algorithm, find (77, 187) and (193, 245).
5. Working backwards through the equations (2.14) show that for any two integers a, b,
we have
(a, b) = r_n = k a + ℓ b,
for some integers k and ℓ. Using this fact concerning the GCD, we shall give an easy
proof of the fundamental theorem of arithmetic.
(i) Prove that if a prime p divides a product ab, then p divides a or p divides b.
(Problem 2 does not apply here because in that problem we used the fundamental
theorem of arithmetic, but now we are going to prove this fundamental theorem.)
Suggestion: Either p divides a or it doesn’t; if it does, we’re done, if not, then the
GCD of p and a is 1. Thus, 1 = (p, a) = k p + ℓ a, for some integers k, ℓ. Multiply
this equation by b and show that p must divide b.
(ii) Using induction prove that if a prime p divides a product a_1 · · · a_n , then p divides
some a_i .
(iii) Using (ii), prove the fundamental theorem of arithmetic.
6. (Modular arithmetic) Given n ∈ N, we say that x, y ∈ Z are congruent modulo
n, written x ≡ y (mod n), if x − y is divisible by n. For a, b, x, y, z, u, v ∈ Z, prove
(a) x ≡ y (mod n), y ≡ x (mod n), x − y ≡ 0 (mod n) are equivalent statements.
(b) If x ≡ y (mod n) and y ≡ z (mod n), then x ≡ z (mod n).
(c) If x ≡ y (mod n) and u ≡ v (mod n), then ax + bu ≡ ay + bv (mod n).
(d) If x ≡ y (mod n) and u ≡ v (mod n), then xu ≡ yv (mod n).
(e) Finally, prove that if x ≡ y (mod n) and m|n where m ∈ N, then x ≡ y (mod m).
7. (Fermat’s theorem) We assume the basics of modular arithmetic from Problem 6.
In this problem we prove that for any prime p and x ∈ Z, we have x^p ≡ x (mod p).
This theorem is due to Pierre de Fermat (1601–1665).
(i) Prove that for any k ∈ N with 1 ≤ k < p, the binomial coefficient (which is an
integer, see e.g. Problem 3i in Exercises 2.2) (p choose k) = p!/(k!(p − k)!)
is divisible by p.
(ii) Using (i), prove that for any x, y ∈ Z, (x + y)^p ≡ x^p + y^p (mod p).
(iii) Using (ii) and induction, prove that x^p ≡ x (mod p) for all x ∈ N. Conclude that
x^p ≡ x (mod p) for all x ∈ Z.
8. (Pythagorean triples) A Pythagorean triple consists of three natural numbers
(x, y, z) such that x^2 + y^2 = z^2 . For example, (3, 4, 5) and (6, 8, 10) are such triples.
The triple is called primitive if x, y, z are relatively prime, or coprime, which
means that x, y, z have no common prime factors. For instance, (3, 4, 5) is primitive
while (6, 8, 10) is not. In this problem we prove that (x, y, z) is primitive ⇐⇒
x = 2mn , y = m^2 − n^2 , z = m^2 + n^2 , or
x = m^2 − n^2 , y = 2mn , z = m^2 + n^2 ,
where m, n are coprime, m > n, and m, n are of opposite parity; that is, one of m, n is
even and the other is odd.
(i) Prove the “⇐=” implication. Henceforth, let (x, y, z) be a primitive triple.
(ii) Prove that x and y cannot both be even.
(iii) Show that x and y cannot both be odd.
(iv) Therefore, one of x, y is even and the other is odd; let us choose x as even and
y as odd. (The other way around is handled similarly.) Show that z is odd and
conclude that u = (z + y)/2 and v = (z − y)/2 are both natural numbers.
(v) Show that y = u − v and z = u + v and then x^2 = 4uv. Conclude that uv is a
perfect square (that is, uv = k^2 for some k ∈ N).
(vi) Prove that u and v must be coprime and from this fact and the fact that uv is a
perfect square, conclude that u and v each must be a perfect square; say u = m^2
and v = n^2 for some m, n ∈ N. Finally, prove the desired result.
9. (Pythagorean triples, again) If you like primitive Pythagorean triples, here’s another
problem: Prove that if m, n are coprime, m > n, and m, n are of the same parity,
then
(x, y, z) is primitive, where x = mn , y = (m^2 − n^2)/2 , z = (m^2 + n^2)/2 .
Combined with the previous problem, we see that given coprime natural numbers
m > n, (x, y, z) is primitive, where
x = 2mn , y = m^2 − n^2 , z = m^2 + n^2 , or
x = mn , y = (m^2 − n^2)/2 , z = (m^2 + n^2)/2 ,
according as m and n have opposite or the same parity.
10. (Mersenne primes) A number of the form $M_n = 2^n - 1$ is called a Mersenne
number, named after Marin Mersenne (1588–1648). If $M_n$ is prime, it’s called a
Mersenne prime. For instance, $M_2 = 2^2 - 1 = 3$ is prime, $M_3 = 2^3 - 1 = 7$
is prime, but $M_4 = 2^4 - 1 = 15$ is not prime. However, $M_5 = 2^5 - 1 = 31$
is prime again. It is not known if there exist infinitely many Mersenne primes.
Prove that if $M_n$ is prime, then n is prime. (The converse is false; for instance, $M_{23}$
is composite.) Suggestion: Prove the contrapositive. Also, the polynomial identity
$x^k - 1 = (x - 1)(x^{k-1} + x^{k-2} + \cdots + x + 1)$ might be helpful.
11. (Perfect numbers) A number n ∈ N is said to be perfect if it is the sum of its proper
divisors (divisors excluding itself). For example, 6 = 1+2+3 and 28 = 1+2+4+7+14
are perfect. It’s not known if there exist any odd perfect numbers! In this problem
we prove that perfect numbers are related to Mersenne primes as follows:
n is even and perfect ⇐⇒ $n = 2^m (2^{m+1} - 1)$ where m ∈ N and $2^{m+1} - 1$ is prime.
For instance, when m = 1, $2^{1+1} - 1 = 3$ is prime, so $2^1 (2^{1+1} - 1) = 6$ is perfect.
Similarly, we get 28 when m = 2 and the next perfect number is 496 when m = 4.
(Note that when m = 3, $2^{m+1} - 1 = 15$ is not prime.)
(i) Prove that if $n = 2^m (2^{m+1} - 1)$ where m ∈ N and $2^{m+1} - 1$ is prime, then n
is perfect. Suggestion: The proper divisors of n are $1, 2, \ldots, 2^m, q, 2q, \ldots, 2^{m-1} q$
where $q = 2^{m+1} - 1$.
(ii) To prove the converse, we proceed systematically as follows. First prove that if
m, n ∈ N, then d is a divisor of m · n if and only if $d = d_1 \cdot d_2$ where $d_1$ and $d_2$
are divisors of m and n, respectively. Suggestion: Write $m = p_1^{m_1} \cdots p_k^{m_k}$ and
$n = q_1^{n_1} \cdots q_\ell^{n_\ell}$ into prime factors. Observe that a divisor of m · n is just a number
of the form $p_1^{i_1} \cdots p_k^{i_k} q_1^{j_1} \cdots q_\ell^{j_\ell}$ where $0 \le i_r \le m_r$ and $0 \le j_r \le n_r$.
(iii) For n ∈ N, define σ(n) as the sum of all the divisors of n (including n itself).
Using (ii), prove that if m, n ∈ N are coprime, then σ(m · n) = σ(m) · σ(n).
(Coprimality is needed: for example, σ(4) = 7 while σ(2) · σ(2) = 9.)
(iv) Let n be even and perfect and write $n = 2^m q$ where m ∈ N and q is odd. By (iii),
$\sigma(n) = \sigma(2^m)\sigma(q)$. Working out both sides of $\sigma(n) = \sigma(2^m)\sigma(q)$, prove that
(2.15) $\sigma(q) = q + \frac{q}{2^{m+1} - 1}$.
Suggestion: Since n is perfect, prove that σ(n) = 2n and by definition of σ, prove
that $\sigma(2^m) = 2^{m+1} - 1$.
(v) From (2.15) and the fact that σ(q) ∈ N, show that $q = k(2^{m+1} - 1)$ for some
k ∈ N. From (2.15) (that σ(q) = q + k), prove that k = 1. Finally, conclude that
$n = 2^m (2^{m+1} - 1)$ where $q = 2^{m+1} - 1$ is prime.
12. In this exercise we show how to factor factorials (cf. [78]). Let n > 1. Show that the
prime factors of n! are exactly those primes less than or equal to n. Explain that to
factor n!, for each prime p less than or equal to n, we need to know the greatest power of p that
divides n!. We shall prove that the greatest power of p that divides n! is $p^{e_p(n)}$, where
(2.16) $e_p(n) := \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor$,
and $\lfloor n/p^k \rfloor$ is the quotient when n is divided by $p^k$.
(a) If $p^k > n$, show that $\lfloor n/p^k \rfloor = 0$, so the sum in (2.16) is actually finite.
(b) Show that
$$\left\lfloor \frac{n+1}{p^k} \right\rfloor - \left\lfloor \frac{n}{p^k} \right\rfloor = \begin{cases} 1 & \text{if } p^k \mid (n + 1) \\ 0 & \text{if } p^k \nmid (n + 1). \end{cases}$$
(c) Prove that
$$\sum_{k=1}^{\infty} \left\lfloor \frac{n+1}{p^k} \right\rfloor - \sum_{k=1}^{\infty} \left\lfloor \frac{n}{p^k} \right\rfloor = j,$$
where j is the largest integer such that $p^j$ divides n + 1.
(d) Now prove (2.16) by induction on n. Suggestion: For the induction step, write
(n + 1)! = (n + 1) n! and show that $e_p(n + 1) = j + e_p(n)$, where j is the largest
integer such that $p^j$ divides n + 1.
(e) Use (2.16) to find $e_2, e_3, e_5, e_7, e_{11}$ for n = 12 and then factor 12! into primes.
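
For readers who like to experiment, here is a minimal Python sketch of formula (2.16) from Problem 12. The function names `prime_power_in_factorial` and `factor_factorial` are our own illustrative choices, not notation from the text, and the loop stops as soon as $p^k > n$, which is justified by part (a).

```python
import math


def prime_power_in_factorial(n: int, p: int) -> int:
    """Return e_p(n), the sum over k >= 1 of floor(n / p**k), as in (2.16)."""
    exponent = 0
    power = p
    while power <= n:            # terms with p**k > n vanish, by part (a)
        exponent += n // power   # n // power is the quotient floor(n / p**k)
        power *= p
    return exponent


def factor_factorial(n: int) -> dict[int, int]:
    """Map each prime p <= n to the exponent of p in n! (hypothetical helper)."""
    # Trial division is enough for the small values of n used here.
    primes = [q for q in range(2, n + 1)
              if all(q % d != 0 for d in range(2, math.isqrt(q) + 1))]
    return {p: prime_power_in_factorial(n, p) for p in primes}
```

For example, `factor_factorial(10)` returns `{2: 8, 3: 4, 5: 2, 7: 1}`, which agrees with $10! = 2^8 \cdot 3^4 \cdot 5^2 \cdot 7$. The sketch does not replace the induction proof asked for in part (d); it only lets you check small cases.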

2.5. Decimal representations of integers


Since grade school we have represented numbers in “base 10”. In this section
we explore the use of arbitrary bases.
2.5.1. Decimal representations of integers. We need to carefully make a
distinction between integers and the symbols used to represent them.
Example 2.9. In our common day notation, we have
2 = 1 + 1, 3 = 2 + 1, 4 = 3 + 1, . . . etc.
where 1 is our symbol for the multiplicative unit.
Example 2.10. The Romans used the symbol I for the multiplicative unit,
and for the other numbers,
II = I + I, III = II + I, IV = III + I, . . . etc;
if you want to be proficient in using Roman numerals, see [208].
Example 2.11. We could be creative and make up our own symbols for inte-
gers: e.g.
i = 1 + 1, like = i + 1, math = like + 1, . . . etc.
As you could imagine, it would be very inconvenient to make up a different
symbol for every single number! For this reason, we write numbers with respect
to “bases”. For instance, undoubtedly because we have ten fingers, the base 10 or
decimal system, is the most widespread system to make symbols for the integers.
In this system, we use the symbols 0, 1, 2, . . . , 9 called digits for zero and the first
nine positive integers, to give a symbol to any integer using the symbol 10 := 9 + 1
as the “base” with which to express numbers.
Example 2.12. Consider the symbol 12. This symbol represents the number
twelve, which is the number
1 · 10 + 2.
Example 2.13. The symbol 4321 represents the number a given in words by
four thousand, three hundred and twenty-one:
$a = 4000 + 300 + 20 + 1 = 4 \cdot 10^3 + 3 \cdot 10^2 + 2 \cdot 10 + 1$.
Note that the digits 1, 2, 3, 4 in the symbol 4321 are exactly the remainders produced
after successive divisions of a and its quotients by 10. For example,
a = 432 · 10 + 1 (remainder 1)
Now divide the quotient 432 by 10:
432 = 43 · 10 + 2 (remainder 2).
Continuing to divide the quotients by 10, we get
43 = 4 · 10 + 3 (remainder 3), and finally, 4 = 0 · 10 + 4 (remainder 4).
We shall use this technique of successive divisions in the proof of Theorem 2.19
below. In general, the symbol $a = a_n a_{n-1} \cdots a_1 a_0$ represents the number
$a = a_n \cdot 10^n + a_{n-1} \cdot 10^{n-1} + \cdots + a_1 \cdot 10 + a_0$ (in base 10).
As with our previous example, the digits $a_0, a_1, \ldots, a_n$ are exactly the remainders
produced after successive divisions of a and the resulting quotients by 10.

2.5.2. Other common bases. We now consider other bases; for instance, the
base 7 or septimal system. Here, we use the symbols 0, 1, 2, 3, 4, 5, 6, 7 to represent
zero and the first seven natural numbers and the numbers 0, 1, . . . , 6 are the digits
in base 7. Then we write an integer a as $a_n a_{n-1} \cdots a_1 a_0$ in base 7 if
$a = a_n \cdot 7^n + a_{n-1} \cdot 7^{n-1} + \cdots + a_1 \cdot 7 + a_0$.
Example 2.14. For instance, the number with symbol 10 in base 7 is really
the number 7 itself, since
10 (base 7) = 1 · 7 + 0.
Example 2.15. The number one hundred one has the symbol 203 in the sep-
timal system because
203 (base 7) = $2 \cdot 7^2 + 0 \cdot 7 + 3$,
and in our familiar base 10 or decimal notation, the number on the right is just
2 · 49 + 3 = 98 + 3 = 101.
The base of choice for computers is base 2 or the binary or dyadic system.
In this case, we write numbers using only the digits 0 and 1. Thus, an integer a is
written as $a_n a_{n-1} \cdots a_1 a_0$ in base 2 if
$a = a_n \cdot 2^n + a_{n-1} \cdot 2^{n-1} + \cdots + a_1 \cdot 2 + a_0$.
Example 2.16. For instance, the symbol 10101 in the binary system represents
the number
10101 (base 2) = $1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1$.
In familiar base 10 or decimal notation, the number on the right is 16 + 4 + 1 = 21.
Example 2.17. The symbol 10 in base 2 is really the number 2 itself, since
10 (base 2) = 1 · 2 + 0.
Not only are binary numbers useful for computing, they can also help you be a
champion in the Game of Nim; see [202]. (See also Problem 6 in Exercises 2.3.)
Another common base is base 3, which is known as the ternary system.
We remark that one can develop addition and multiplication tables in the sep-
timal and binary systems (indeed, with respect to any base); see for instance [57, p.
7]. Once a base is fixed, we shall not make a distinction between a number and its
representation in the chosen base. In particular, throughout this book we always
use base 10 and write numbers with respect to this base unless stated otherwise.
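
The base 7 and base 2 examples above can be checked with a short Python sketch. The function name `value_of_symbol` and the convention of passing digits as a list, most significant digit first, are assumptions made for this illustration only. The loop evaluates $a_n \cdot b^n + \cdots + a_1 \cdot b + a_0$ by repeatedly multiplying by the base and adding the next digit.

```python
def value_of_symbol(digits: list[int], base: int) -> int:
    """Evaluate the symbol a_n ... a_1 a_0 (most significant digit first)."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"{d} is not a digit in base {base}")
        # Shift the running value one place to the left, then add the digit.
        value = value * base + d
    return value
```

With this sketch, `value_of_symbol([2, 0, 3], 7)` gives 101 and `value_of_symbol([1, 0, 1, 0, 1], 2)` gives 21, as in Examples 2.15 and 2.16. Python's built-in `int("203", 7)` performs the same conversion for bases up to 36.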
2.5.3. Arbitrary base expansions of integers. We now show that a num-
ber can be written with respect to any base. Fix a natural number b > 1, called a
base. Let a be a natural number and suppose that it can be written as a sum of
the form
$a = a_n \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$,
where $0 \le a_k < b$ and $a_n \ne 0$. Then the symbol $a_n a_{n-1} \cdots a_1 a_0$ is called the b-adic
representation of a. A couple questions arise: First, does every natural number
have such a representation and second, if a representation exists, is it unique? The
answer to both questions is yes.
In the proof below, we shall use the following useful “telescoping” sum
several times:
$$(b - 1) \sum_{k=0}^{n} b^k = \sum_{k=0}^{n} (b^{k+1} - b^k) = b^1 + \cdots + b^n + b^{n+1} - (1 + b^1 + \cdots + b^n) = b^{n+1} - 1.$$

Theorem 2.19. Every natural number has a unique b-adic representation.


Proof. We first prove existence then uniqueness.
Step 1: We first prove existence using the technique of successive divisions
we talked about before. Using the division algorithm, we form the remainders
produced after successive divisions of a and its quotients by b:
$a = q_0 \cdot b + a_0$ (remainder $a_0$), $q_0 = q_1 \cdot b + a_1$ (remainder $a_1$),
$q_1 = q_2 \cdot b + a_2$ (remainder $a_2$), . . . , $q_{j-1} = q_j \cdot b + a_j$ (remainder $a_j$), . . .
and so forth. By the division algorithm, we have $q_j \ge 0$, $0 \le a_j < b$, and moreover,
since b > 1 (that is, b ≥ 2), from the equation $q_{j-1} = q_j \cdot b + a_j$ it is evident that
as long as the quotient $q_0$ is positive, we have
$a = q_0 \cdot b + a_0 \ge q_0 \cdot b \ge 2 q_0 > q_0$
and in general, as long as the quotient $q_j$ is positive, we have
$q_{j-1} = q_j \cdot b + a_j \ge q_j \cdot b \ge 2 q_j > q_j$.
These inequalities imply that $a > q_0 > q_1 > q_2 > \cdots \ge 0$ where the strict inequality
> holds as long as the quotients remain positive. Since there are only a numbers
from 0 to a − 1, at some point the quotients must eventually reach zero. Let us say
that $q_n = 0$ is the first time the quotients hit zero. If n = 0, then we have
$a = 0 \cdot b + a_0$,
so a has the b-adic representation $a_0$. Suppose that n > 0. Then we have
(2.17) $a = q_0 \cdot b + a_0$ (remainder $a_0$), $q_0 = q_1 \cdot b + a_1$ (remainder $a_1$),
$q_1 = q_2 \cdot b + a_2$ (remainder $a_2$), . . . , $q_{n-1} = 0 \cdot b + a_n$ (remainder $a_n$),
and we stop successive divisions once we get $a_n$. Combining the first and second
equations in (2.17), we get
$a = q_0 \cdot b + a_0 = (q_1 \cdot b + a_1) b + a_0 = q_1 \cdot b^2 + a_1 b + a_0$.
Combining this equation with the third equation in (2.17) we get
$a = (q_2 \cdot b + a_2) \cdot b^2 + a_1 b + a_0 = q_2 \cdot b^3 + a_2 \cdot b^2 + a_1 b + a_0$.
Continuing this process (slang for “by use of induction”) we eventually arrive at
$a = (0 \cdot b + a_n) \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$
$= a_n \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$.
This shows the existence of a b-adic representation of a.
Step 2: We now show that this representation is unique. Suppose that a has
another such representation:
(2.18) $a = \sum_{k=0}^{n} a_k b^k = \sum_{k=0}^{m} c_k b^k$,
where $0 \le c_k < b$ and $c_m \ne 0$. We first prove that n = m. Indeed, let’s suppose
that n ≠ m, say n < m. Then,
$$a = \sum_{k=0}^{n} a_k b^k \le (b - 1) \sum_{k=0}^{n} b^k = b^{n+1} - 1.$$
Since n < m =⇒ n + 1 ≤ m, we have $b^{n+1} \le b^m$, so
$$a \le b^m - 1 < b^m \le c_m \cdot b^m \le \sum_{k=0}^{m} c_k b^k = a \implies a < a.$$
This contradiction shows that n = m. Now let us assume that some digits in
the expressions for a differ; let p be the largest integer such that $a_p$ differs from
the corresponding $c_p$, say $a_p < c_p$. Since $a_p < c_p$ we have $a_p - c_p \le -1$. Now
subtracting the two expressions for a in (2.18), we obtain
$$0 = a - a = \sum_{k=0}^{p} (a_k - c_k) b^k = \sum_{k=0}^{p-1} (a_k - c_k) b^k + (a_p - c_p) b^p \le \sum_{k=0}^{p-1} (b - 1) b^k + (-b^p) = (b^p - 1) - b^p = -1,$$
a contradiction. Thus, the two representations of a must be equal. □
Lastly, we remark that if a is negative, then −a is positive, so −a has a b-
adic representation. The negative of this representation is by definition the b-adic
representation of a.
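
The successive divisions in Step 1 of the proof translate almost line by line into code. The following Python sketch is only an illustration under our own naming (`b_adic_digits` is a hypothetical helper, not part of the text): each pass through the loop performs one division $q_{j-1} = q_j \cdot b + a_j$ and records the remainder $a_j$, stopping the first time the quotient hits zero.

```python
def b_adic_digits(a: int, b: int) -> list[int]:
    """Return [a_n, ..., a_1, a_0], the b-adic digits of a natural number a."""
    if a < 1 or b < 2:
        raise ValueError("need a natural number a >= 1 and a base b >= 2")
    remainders = []
    q = a
    while q > 0:
        # One step of the division algorithm: old q = new q * b + r, 0 <= r < b.
        q, r = divmod(q, b)
        remainders.append(r)      # remainders arrive in the order a_0, a_1, ...
    return remainders[::-1]       # reverse so the leading digit a_n comes first
```

For instance, `b_adic_digits(4321, 10)` returns `[4, 3, 2, 1]` and `b_adic_digits(101, 7)` returns `[2, 0, 3]`, matching Examples 2.13 and 2.15. The loop terminates for the same reason given in the proof: the quotients strictly decrease while they are positive.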
Exercises 2.5.
1. In this exercise we consider the base twelve or duodecimal system. For this system
we need two more digit symbols for ten and eleven. Let α denote ten and β denote
eleven. Then the digits for the duodecimal system are 0, 1, 2, . . . , 9, α, β.
(a) In the duodecimal system, what is the symbol for twelve, twenty-two, twenty-three,
one hundred thirty-one?
(b) What numbers do the following symbols represent? ααα, 12, and 2ββ1.
2. In the following exercises, we shall establish the validity of grade school divisibility
“tricks”, cf. [112]. Let a = an an−1 . . . a0 be the decimal (= base 10) representation
of a natural number a. Let us first consider divisibility by 2, 5, 10.
(a) Prove that a is divisible by 10 if and only if a0 = 0.
(b) Prove that a is divisible by 2 if and only if a0 is even.
(c) Prove that a is divisible by 5 if and only if a0 = 0 or a0 = 5.
3. We now consider 4 and 8.
(a) Prove that a is divisible by 4 if and only if the number a1 a0 (written in decimal
notation) is divisible by 4.
(b) Prove that a is divisible by 8 if and only if a2 a1 a0 is divisible by 8.
4. We consider divisibility by 3, 6, 9. (Unfortunately, there is no slick test for divisibility
by 7.) Suggestion: Before considering these tests, prove that $10^k - 1$ is divisible by 9
for any nonnegative integer k.
(a) Prove that a is divisible by 3 if and only if the sum of the digits (that is, an + · · · +
a1 + a0 ) is divisible by 3.
(b) Prove that a is divisible by 6 if and only if a is even and the sum of the digits is
divisible by 3.
(c) Prove that a is divisible by 9 if and only if the sum of the digits is divisible by 9.
5. Prove that a is divisible by 11 if and only if the difference between the sums of the
even and odd digits:
$$(a_0 + a_2 + a_4 + \cdots) - (a_1 + a_3 + a_5 + \cdots) = \sum_{k=0}^{n} (-1)^k a_k$$
is divisible by 11. Suggestion: First prove that $10^{2k} - 1$ and $10^{2k+1} + 1$ are each divisible
by 11 for any nonnegative integer k.
6. Using the idea of modular arithmetic from Problem 6 in Exercises 2.4, one can easily
deduce the above “tricks”. Take for example the “9 trick” and “11 trick”.
(a) Show that $10^k \equiv 1 \pmod{9}$ for any k = 0, 1, 2, . . .. Using this fact, prove that a is
divisible by 9 if and only if the sum of the digits of a is divisible by 9.
(b) Show that $10^k \equiv (-1)^k \pmod{11}$ for any k = 0, 1, 2, . . .. Using this fact, prove
that a is divisible by 11 if and only if the difference between the sums of the even
and odd digits of a is divisible by 11.
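
Before proving the divisibility “tricks” of Problems 4 and 5, it can be reassuring to test them numerically. The Python sketch below does this by brute force; the helper names (`decimal_digits`, `digit_sum`, `alternating_digit_sum`, `tricks_hold_up_to`) are our own and purely illustrative. A passing check is evidence, not a proof.

```python
def decimal_digits(a: int) -> list[int]:
    """Return the decimal digits [a_0, a_1, ..., a_n], units digit first."""
    return [int(ch) for ch in reversed(str(a))]


def digit_sum(a: int) -> int:
    """Return a_0 + a_1 + ... + a_n."""
    return sum(decimal_digits(a))


def alternating_digit_sum(a: int) -> int:
    """Return the sum of (-1)**k * a_k over all digit positions k."""
    return sum((-1) ** k * d for k, d in enumerate(decimal_digits(a)))


def tricks_hold_up_to(limit: int) -> bool:
    """Check the 3, 9, and 11 tricks for every natural number a <= limit."""
    for a in range(1, limit + 1):
        s, t = digit_sum(a), alternating_digit_sum(a)
        if (a % 3 == 0) != (s % 3 == 0):
            return False
        if (a % 9 == 0) != (s % 9 == 0):
            return False
        if (a % 11 == 0) != (t % 11 == 0):
            return False
    return True
```

Calling `tricks_hold_up_to(5000)` should return `True`. For a single case, `alternating_digit_sum(121)` is 1 − 2 + 1 = 0, consistent with 121 = 11 · 11.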

2.6. Real numbers: Rational and “mostly” irrational


Imagine a world where you couldn’t share half a cookie with your friend or where
you couldn’t buy a quarter pound of cheese at the grocery store; this is a world
without rational numbers. In this section we discuss rational and real numbers and
we shall discover, as the Greeks did 2500 years ago, that the rational numbers are
not sufficient for the purposes of geometry. In particular, a world with only rational
numbers is a world in which you couldn’t measure the circumference of a circular
swimming pool. Irrational numbers make up the missing rational lengths. We shall
discover in the next few sections that there are vastly, immensely, incalculably (any
other synonyms I missed?) more irrational numbers than rational numbers.

2.6.1. The real and rational numbers. The set of real numbers is denoted
by R. The reader is certainly familiar with the following arithmetic properties of
real numbers (in what follows, a, b, c denote real numbers):
Addition satisfies
(A1) a + b = b + a; (commutative law)
(A2) (a + b) + c = a + (b + c); (associative law)

Figure 2.4. The Greeks’ discovery of irrational numbers.

(A3) there is a real number denoted by 0 “zero” such that
a + 0 = a = 0 + a; (existence of additive identity)
(A4) for each a there is a real number denoted by the symbol −a such that
a + (−a) = 0 and (−a) + a = 0. (existence of additive inverse)
Multiplication satisfies
(M1) a · b = b · a; (commutative law)
(M2) (a · b) · c = a · (b · c); (associative law)
(M3) there is a real number denoted by 1 “one” such that
1 · a = a = a · 1; (existence of multiplicative identity)
(M4) for a ≠ 0 there is a real number denoted by the symbol $a^{-1}$ such that
$a \cdot a^{-1} = 1$ and $a^{-1} \cdot a = 1$. (existence of multiplicative inverse)
As with the integers, the · is sometimes dropped and the associative laws imply
that expressions such as a + b + c or abc make sense without using parentheses.
Addition and multiplication are related by
(D) a · (b + c) = (a · b) + (a · c). (distributive law)
Of these arithmetic properties, the only additional property listed that was not
listed for integers is (M4), the existence of a multiplicative inverse for each nonzero
real number. As usual, we denote
a + (−b) = (−b) + a by a − b
and
$a \cdot b^{-1} = b^{-1} \cdot a$ by a/b or $\frac{a}{b}$.
The positive real numbers, denoted R+ , are closed under addition and multiplication,
and have the following property: Given any real number a, exactly one of the following
“positivity” properties holds:
(P) a is a positive real number, a = 0, or −a is a positive real number.
A set together with operations of addition and multiplication that satisfy prop-
erties (A1) – (A4), (M1) – (M4), and (D) is called a field; essentially a field is
just a set of objects closed under addition, multiplication, subtraction, and division
(by nonzero elements). If in addition, the set has a “positive set” closed under
addition and multiplication satisfying (P), then the set is called an ordered field.

A rational number is a number that can be written in the form a/b where
a and b are integers with b ≠ 0 and the set of all such numbers is denoted by Q.
We leave the reader to check that the rational numbers also form an ordered field.
Thus, both the real numbers and the rational numbers are ordered fields. Now
what is the difference between the real and rational numbers? The difference was
discovered more than 2500 years ago by the Greeks, who found out that the length
of the diagonal of a unit square, which according to the Pythagorean theorem
is $\sqrt{2}$, is not a rational number (see Theorem 2.23). Because this length is not
a rational number, the Greeks called a number such as $\sqrt{2}$ irrational.9 Thus,
there are “gaps” in the rational numbers. Now it turns out that every length is a
real number. This fact is known as the completeness axiom of the real numbers.
Thus, the real numbers have no “gaps”. To finish up the list of axioms for the real
numbers, we state this completeness axiom now but we leave the terms in the axiom
undefined until Section 2.7 (so don’t worry if some of these words seem foreign).
(C) (Completeness axiom of the real numbers) Every nonempty set of real
numbers that is bounded above has a supremum, that is, a least upper bound.
We shall assume the existence of a set R such that N ⊆ R+ , Z ⊆ R, and R
satisfies all the arithmetic, positivity, and completeness properties listed above.10
All theorems that we prove in this textbook are based on this assumption.
2.6.2. Proofs of well-known high school rules. Since the real numbers
satisfy the same arithmetic properties as the natural and integer numbers, the
same proofs as in Sections 2.1 and 2.3 prove the uniqueness of additive identities
and inverses, rules of sign, properties of zero and one (in particular, the uniqueness
of the multiplicative identity), etc . . ..
Also, the real numbers are ordered in the same way as the integers. Given any
real numbers a and b exactly one of the following holds:
(O1) a = b;
(O2) a < b, which means that b − a is a positive real number;
(O3) b < a, which means that −(b − a) is a positive real number.
Just as for integers, we can define ≤, >, and ≥ and (O3) is just that a − b is a
positive real number. One can define the absolute value of a real number in the
exact same way as it is defined for integers. Since the real numbers satisfy the
same order properties as the integers, the same proofs as in Section 2.3 prove the
inequality rules, absolute value rules, etc . . ., for real numbers. Using the inequality
rules we can prove the following well-known fact from high school: if a > 0, then
$a^{-1} > 0$. Indeed, by definition of $a^{-1}$, we have $a \cdot a^{-1} = 1$. Since 1 > 0 (recall that
1 ∈ N ⊆ R+ ) and a > 0, we have positive × $a^{-1}$ = positive; the only way this is
possible is if $a^{-1} > 0$ by the inequality rules. Here are other high school facts that
can be proved using the inequality rules: If 0 < a < 1, then $a^{-1} > 1$ and if a > 1,
then $a^{-1} < 1$. Indeed, if a < 1 with a positive, then multiplying by $a^{-1} > 0$, we
obtain
$a \cdot (a^{-1}) < 1 \cdot (a^{-1}) \implies 1 < a^{-1}$.
Similarly, if 1 < a, then multiplying through by $a^{-1} > 0$, we get $a^{-1} < 1$.

9 The idea of the continuum seems simple to us. We have somehow lost sight of the difficulties
it implies ... We are told such a number as the square root of 2 worried Pythagoras and his school
almost to exhaustion. Being used to such queer numbers from early childhood, we must be careful
not to form a low idea of the mathematical intuition of these ancient sages; their worry was
highly credible. Erwin Schrödinger (1887–1961).
10 For simplicity we assumed that N ⊆ R+ (in particular, all natural numbers are positive by
assumption) and Z ⊆ R, but it is possible to define N and Z within R. Actually, one only needs
to define N for then we can put Z := N ∪ {0} ∪ (−N).

Here are some more high school facts.
Theorem 2.20 (Uniqueness of multiplicative inverse). If a and b are real
numbers with a ≠ 0, then x · a = b if and only if $x = b a^{-1} = b/a$. In particular,
setting b = 1, the only x that satisfies the equation x · a = 1 is $x = a^{-1}$. Thus, each
nonzero real number has only one multiplicative inverse.
Proof. If $x = b \cdot a^{-1}$, then
$(b a^{-1}) \cdot a = b (a^{-1} a) = b \cdot 1 = b$,
so the real number x = b/a solves the equation x · a = b. Conversely, if x satisfies
x · a = b, then
$b a^{-1} = (x \cdot a) a^{-1} = x \cdot (a a^{-1}) = x \cdot 1 = x$. □
Recall that x · 0 = 0 for any real number x. (This is Theorem 2.12 in the real
number case.) In particular, 0 has no multiplicative inverse (there is no real number
“$0^{-1}$” such that $0 \cdot 0^{-1} = 1$); thus, the high school saying: “You can’t divide by
zero.”
Theorem 2.21 (Fraction rules). For a, b, c, d ∈ R, the following fraction rules
hold (all denominators are assumed to be nonzero):
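(1) $\frac{a}{a} = 1$, $\frac{a}{1} = a$, (2) $\frac{a}{-b} = -\frac{a}{b}$,
(3) $\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$, (4) $\frac{a}{b} = \frac{ac}{bc}$,
(5) $\frac{1}{a/b} = \frac{b}{a}$, (6) $\frac{a/b}{c/d} = \frac{a}{b} \cdot \frac{d}{c} = \frac{ad}{bc}$, (7) $\frac{a}{b} \pm \frac{c}{d} = \frac{ad \pm bc}{bd}$.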
Proof. The proofs of these rules are really very elementary, so we only prove
(1)–(3) and leave (4)–(7) to you in Problem 1.
We have $a/a = a \cdot a^{-1} = 1$ and since 1 · 1 = 1, by uniqueness of the multiplicative
inverses, we have $1^{-1} = 1$ and therefore $a/1 = a \cdot 1^{-1} = a \cdot 1 = a$.
To prove (2), note that by our rules of sign,
$(-b) \cdot (-b^{-1}) = b \cdot b^{-1} = 1$
and therefore by uniqueness of multiplicative inverses, we must have $(-b)^{-1} = -b^{-1}$.
Thus, $a/(-b) := a \cdot (-b)^{-1} = a \cdot (-b^{-1}) = -a \cdot b^{-1} = -a/b$.
To prove (3), observe that $(b \cdot d) \cdot (b^{-1} \cdot d^{-1}) = (b b^{-1}) \cdot (d d^{-1}) = 1 \cdot 1 = 1$, so by
uniqueness of multiplicative inverses, $(bd)^{-1} = b^{-1} d^{-1}$. Thus,
$$\frac{a}{b} \cdot \frac{c}{d} = a \cdot b^{-1} \cdot c \cdot d^{-1} = a \cdot c \cdot b^{-1} \cdot d^{-1} = ac \cdot (bd)^{-1} = \frac{ac}{bd}.$$ □
We already know what $a^n$ means for n = 0, 1, 2, . . .. For negative integers,
we define powers by
$$a^{-n} := \frac{1}{a^n}, \quad a \ne 0, \quad n = 1, 2, 3, \ldots.$$
Here are the familiar power rules.
Theorem 2.22 (Power rules). For a, b ∈ R and for integers m, n,
$a^m \cdot a^n = a^{m+n}$; $a^m \cdot b^m = (ab)^m$; $(a^m)^n = a^{mn}$,
provided that the individual powers are defined (e.g. a and b are nonzero if an
exponent is negative). If n is a natural number and a, b ≥ 0, then
a < b if and only if $a^n < b^n$.
In particular, a ≠ b (both a, b nonnegative) if and only if $a^n \ne b^n$.
Proof. We leave the proof of the first three rules to the reader since we already
dealt with proving such rules in the problems of Section 2.2. Consider the last rule.
Let n ∈ N and let a, b ≥ 0 be not both zero (if both are zero, then a = b and
$a^n = b^n$ and there is nothing to prove). Observe that
(2.19) $(b - a) \cdot c = b^n - a^n$, where $c = b^{n-1} + b^{n-2} a + \cdots + b a^{n-2} + a^{n-1}$,
which is verified by multiplying out:
$(b - a)(b^{n-1} + b^{n-2} a + \cdots + a^{n-1}) = (b^n + b^{n-1} a + b^{n-2} a^2 + \cdots + b a^{n-1})$
$- (b^{n-1} a + b^{n-2} a^2 + \cdots + b a^{n-1} + a^n) = b^n - a^n$.
The formula for c (and the fact that a, b ≥ 0 are not both zero) shows that c > 0.
Therefore, the equation $(b - a) \cdot c = b^n - a^n$ shows that (b − a) > 0 if and only if
$b^n - a^n > 0$. Therefore, a < b if and only if $a^n < b^n$. □
If n is a natural number, then the n-th root of a real number a is a real number
b such that bn = a, if such a number b exists. For n = 2, we usually call b a square
root and if n = 3, a cube root.
Example 2.18. −3 is a square root of 9 since $(-3)^2 = 9$. Also, 3 is a square
root of 9 since $3^2 = 9$. Here’s a puzzle: Which two real numbers are their own
nonnegative square roots?
If a ≥ 0, then according to the last power rule in Theorem 2.22, a can have at most
one nonnegative n-th root. In Section 2.7 we shall prove that any nonnegative real
number has a unique nonnegative n-th root. We denote this unique n-th root by
$\sqrt[n]{a}$ or $a^{1/n}$. If n = 2, we always write $\sqrt{a}$ or $a^{1/2}$ instead of $\sqrt[2]{a}$.
We now show that “most” real numbers are not rational numbers, that is, ratios
of integers. These examples will convince the reader that there are many “gaps” in
the rational numbers and of the importance of irrational numbers to real life. For the
rest of this section, we shall assume that $\sqrt[n]{a}$ exists for any a ≥ 0 (to be proved in
Section 2.7) and we shall assume basic facts concerning the trig and log functions
(to be proved in Sections 4.7 and 4.6, respectively). We make these assumptions
only to present interesting examples that will convince you without a shadow of a
doubt that irrational numbers are indispensable in mathematics.
2.6.3. Irrational roots and the rational zeros theorem. We begin by
showing that $\sqrt{2}$ is not rational. Before proving this, we establish some terminology.
We say that a rational number a/b is in lowest terms if a and b do not have
common prime factors in their prime factorizations. By Property (4) of the fraction
rules, we can always “cancel” common factors to put a rational number in lowest
terms.
Theorem 2.23 (Irrationality of $\sqrt{2}$). $\sqrt{2}$ is not rational.

Figure 2.5. On the right, we measure the length b along the
longer side a and we draw a perpendicular from side a to the shorter
side b. We get a new isosceles triangle with sides d, c, c. (The
smaller triangle is similar to the original one because it has a 90°
angle just like the original one does and it shares an angle, the
lower left corner, with the original one.)

Proof. We provide three proofs, the first one is essentially a version (the
version?) of the original geometric Pythagorean proof while the second one is a real
analysis version of the same proof! The third proof is the “standard” proof in this
business. (See also Problems 6 and 7.)
Proof I: (Cf. [8] for another version.) This proof is not to be considered
rigorous! We only put this proof here for historical purposes and because we shall
make this proof completely rigorous in Proof II below. We assume common facts
from high school geometry, in particular, similar triangles.
Suppose, by way of contradiction, that $\sqrt{2} = a/b$ where a, b ∈ N. Then $a^2 =
2b^2 = b^2 + b^2$, so by the Pythagorean theorem, the isosceles triangle with sides
a, b, b is a right triangle (see Figure 2.5). Hence, there is an isosceles right triangle
whose lengths are (of course, positive) integers. By taking a smaller triangle if
necessary, we may assume that a, b, b are the lengths of the smallest such triangle.
We shall derive a contradiction by producing another isosceles right triangle with
integer lengths and a smaller hypotenuse. In fact, consider the triangle d, c, c drawn
in Figure 2.5. Note that a = b + c so c = a − b ∈ Z. To see that d ∈ Z, observe
that since corresponding sides of similar triangles are in proportion, we
have
(2.20) $\frac{d}{c} = \frac{a}{b} \implies d = \frac{a}{b} \cdot c = \frac{a}{b}(a - b) = \frac{a^2}{b} - a = 2b - a$,
where we used that $a^2 = b^2 + b^2 = 2b^2$. Therefore, d = 2b − a ∈ Z as well. Thus,
we have indeed produced a smaller isosceles right triangle with integer lengths.
Proof II: (Cf. [218, p. 39], [143], [194].) We now make Proof I rigorous.
Suppose that $\sqrt{2} = a/b$ (a, b ∈ N). By well-ordering, we may assume that a is the
smallest positive numerator that $\sqrt{2}$ can have as a fraction; explicitly,
$$a = \text{least element of } \left\{ n \in N \; ; \; \sqrt{2} = \frac{n}{m} \text{ for some } m \in Z \right\}.$$
Motivated by (2.20), we claim that
(2.21) $\sqrt{2} = \frac{d}{c}$ where d = 2b − a, c = a − b are integers with d ∈ N and d < a.
Once we prove this claim, we contradict the minimality of a. Of course, the facts in
(2.21) were derived from Figure 2.5 geometrically, but now we actually prove these
facts! First, to prove that $\sqrt{2} = d/c$, we simply compute:
$$\frac{d}{c} = \frac{2b - a}{a - b} = \frac{2 - a/b}{a/b - 1} = \frac{2 - \sqrt{2}}{\sqrt{2} - 1} = \frac{2 - \sqrt{2}}{\sqrt{2} - 1} \cdot \frac{\sqrt{2} + 1}{\sqrt{2} + 1} = \frac{2\sqrt{2} + 2 - (\sqrt{2})^2 - \sqrt{2}}{(\sqrt{2})^2 - 1} = \frac{\sqrt{2}}{1} = \sqrt{2}.$$
To prove that 0 < d < a, note that since 1 < 2 < 4, that is, $1^2 < (\sqrt{2})^2 < 2^2$, by
the (last statement of the) power rules in Theorem 2.22, we have $1 < \sqrt{2} < 2$, or
1 < a/b < 2. Multiplying by b, we get b < a < 2b, which implies that
(2.22) d = 2b − a > 0 and d = 2b − a < 2a − a = a.
Therefore, d ∈ N and d < a and we get our contradiction.
Proof III: The following proof is the classic proof. We first establish the fact
that the square of an integer has the factor 2 if and only if the integer itself has
the factor 2. A quick way to prove this fact is using the fundamental theorem of
arithmetic: The prime factorization of $m^2$ is obtained by squaring the prime factorization of m. Therefore,
$m^2$ has a prime factor p if and only if m itself has the prime factor p. In particular,
$m^2$ has the prime factor 2 if and only if m has the factor 2, which establishes our
fact. A proof without using the fundamental theorem goes as follows. An integer
is either even or odd, that is, is of the form 2n or 2n + 1 where n is the quotient of
the integer when divided by 2. The equations
$(2n)^2 = 4n^2 = 2(2n^2)$
$(2n + 1)^2 = 4n^2 + 4n + 1 = 2(2n^2 + 2n) + 1$
confirm the asserted fact. Now suppose that $\sqrt{2}$ were a rational number, say
$$\sqrt{2} = \frac{a}{b},$$
where a/b is in lowest terms. Squaring this equation we get
$$2 = \frac{a^2}{b^2} \implies a^2 = 2b^2.$$
The number $2b^2 = a^2$ has the factor 2, so a must have the factor 2. Therefore,
a = 2c for some integer c. Thus,
$(2c)^2 = 2b^2 \implies 4c^2 = 2b^2 \implies 2c^2 = b^2$.
The number $2c^2 = b^2$ has the factor 2, so b must also have the factor 2. Thus, we
have shown that a and b both have the factor 2. This contradicts the assumption
that a and b have no common factors. □

The following theorem gives another method to prove the irrationality of $\sqrt{2}$
and also many other numbers. Recall that a (real-valued) n-th degree polynomial
is a function $p(x) = a_n x^n + \cdots + a_1 x + a_0$, where $a_k \in R$ for each k and
with the leading coefficient $a_n \ne 0$.
Theorem 2.24 (Rational zeros theorem). If a polynomial equation with
integral coefficients,
$c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0 = 0$, $c_n \ne 0$,
where the $c_k$’s are integers, has a nonzero rational solution a/b where a/b is in
lowest terms, then a divides $c_0$ and b divides $c_n$.
Proof. Suppose that a/b is a rational solution of our equation with a/b in
lowest terms. Being a solution, we have
$$c_n \left(\frac{a}{b}\right)^n + c_{n-1} \left(\frac{a}{b}\right)^{n-1} + \cdots + c_1 \frac{a}{b} + c_0 = 0.$$
Multiplying both sides by $b^n$, we obtain
(2.23) $c_n a^n + c_{n-1} a^{n-1} b + \cdots + c_1 a b^{n-1} + c_0 b^n = 0$.
Bringing everything to the right except for $c_n a^n$ and factoring out a b, we find
$c_n a^n = -c_{n-1} a^{n-1} b - \cdots - c_1 a b^{n-1} - c_0 b^n$
$= b(-c_{n-1} a^{n-1} - \cdots - c_1 a b^{n-2} - c_0 b^{n-1})$.
This formula shows that every prime factor of b occurs in the product $c_n a^n$. By
assumption, a and b have no common prime factors and hence every prime factor
of b must occur in $c_n$. This shows that b divides $c_n$.
We now rewrite (2.23) as
$c_0 b^n = -c_n a^n - c_{n-1} a^{n-1} b - \cdots - c_1 a b^{n-1}$
$= a(-c_n a^{n-1} - c_{n-1} a^{n-2} b - \cdots - c_1 b^{n-1})$.
This formula shows that every prime factor of a occurs in the product $c_0 b^n$. However,
since a and b have no common prime factors, we conclude that every prime factor
of a occurs in $c_0$, which implies that a divides $c_0$. This completes our proof. □
Example 2.19. (Irrationality of $\sqrt{2}$, Proof IV) Observe that $\sqrt{2}$ is a
solution of the polynomial equation $x^2 - 2 = 0$. The rational zeros theorem implies
that if the equation $x^2 - 2 = 0$ has a rational solution, say $a/b$ in lowest terms, then
$a$ must divide $c_0 = -2$ and $b$ must divide $c_2 = 1$. It follows that $a$ can equal ±1 or
±2 and $b$ can only be ±1. Therefore, the only rational solutions of $x^2 - 2 = 0$, if
any, are $x = \pm 1$ or $x = \pm 2$. However,
\[ (\pm 1)^2 - 2 = -1 \ne 0 \quad\text{and}\quad (\pm 2)^2 - 2 = 2 \ne 0, \]
so $x^2 - 2 = 0$ has no rational solutions. Therefore $\sqrt{2}$ is not rational.
A similar argument using the equation $x^n - a = 0$ proves the following
corollary.

Corollary 2.25. The n-th root $\sqrt[n]{a}$, where $a$ and $n$ are positive integers, is
either irrational or an integer; if it is an integer, then $a$ is the n-th power of an
integer.
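The rational zeros theorem reduces the search for rational solutions to a finite check, and that check can be sketched in Python. The helper names `divisors` and `rational_roots` are illustrative only (they do not come from the text), and the sketch assumes integer coefficients with $c_0 \ne 0$ and $c_n \ne 0$.

```python
from fractions import Fraction


def divisors(k):
    """Positive divisors of a nonzero integer k."""
    k = abs(k)
    return [d for d in range(1, k + 1) if k % d == 0]


def rational_roots(coeffs):
    """Nonzero rational roots of c_n x^n + ... + c_1 x + c_0 = 0.

    coeffs = [c_0, c_1, ..., c_n] are integers with c_0 != 0 and c_n != 0.
    By the rational zeros theorem, a root a/b in lowest terms has
    a dividing c_0 and b dividing c_n, so only finitely many
    candidates need to be tested.
    """
    c0, cn = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * a, b)
        for a in divisors(c0)
        for b in divisors(cn)
        for sign in (1, -1)
    }
    # Exact rational arithmetic: keep the candidates that really are roots.
    return sorted(
        x for x in candidates
        if sum(c * x**k for k, c in enumerate(coeffs)) == 0
    )


print(rational_roots([-2, 0, 1]))     # x^2 - 2 = 0
print(rational_roots([-8, 0, 0, 1]))  # x^3 - 8 = 0
```

For $x^2 - 2 = 0$ the candidate list is ±1, ±2 and none of them is a root, which is the computation of Example 2.19. For $x^3 - 8 = 0$ the only rational root found is the integer 2, in line with Corollary 2.25.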
2.6.4. Irrationality of trigonometric numbers. Let $0 < \theta < 90^\circ$ be an
angle whose measurement in degrees is rational. Following [142], we shall prove
that $\cos\theta$ is irrational except when $\theta = 60^\circ$, in which case
\[ \cos 60^\circ = \frac{1}{2}. \]
The proof of this result is based on the rational zeros theorem and Lemma 2.26 below.
See Problem 5 for corresponding statements for sine and tangent. Of course, at this
point, and only for purposes of illustration, we have to assume basic knowledge of
the trigonometric functions. In Section 4.7 we shall define these functions rigorously
and establish their usual properties.

Lemma 2.26. For any natural number n, we can write 2 cos nθ as an n-th degree
polynomial in 2 cos θ with integer coefficients and with leading coefficient one.

Proof. We need to prove that


\[ 2\cos n\theta = (2\cos\theta)^n + a_{n-1}(2\cos\theta)^{n-1} + \cdots + a_1 (2\cos\theta) + a_0, \tag{2.24} \]
where the coefficients $a_{n-1}, a_{n-2}, \ldots, a_0$ are integers. For $n = 1$, we can write
$2\cos\theta = (2\cos\theta)^1 + 0$, so our proposition holds for $n = 1$. To prove our result in
general, we use the strong form of induction. Assume that our proposition holds
for $1, 2, \ldots, n$. Before proceeding to show that our lemma holds for $n + 1$, we shall
prove the identity
\[ 2\cos(n+1)\theta = (2\cos n\theta)(2\cos\theta) - 2\cos(n-1)\theta. \tag{2.25} \]

To verify this identity, consider the identities


cos(α + β) = cos α cos β − sin α sin β
cos(α − β) = cos α cos β + sin α sin β.

Adding these equations, we obtain cos(α + β) + cos(α − β) = 2 cos α cos β, or


cos(α + β) = 2 cos α cos β − cos(α − β).
Setting α = nθ and β = θ, and then multiplying the result by 2, we get (2.25).
Now, since our lemma holds for $1, \ldots, n$, in particular, $2\cos n\theta$ can be written
as an n-degree polynomial in $2\cos\theta$ with integer coefficients and with leading
coefficient one. When $n \ge 2$, $2\cos(n-1)\theta$ can likewise be written as an
$(n-1)$-degree polynomial of the same kind; when $n = 1$, $2\cos(n-1)\theta = 2\cos 0 = 2$
is simply the integer constant 2. Substituting these expressions into the right-hand
side of the identity (2.25) shows that $2\cos(n+1)\theta$ can be expressed as an
$(n+1)$-degree polynomial in $2\cos\theta$ with integer coefficients and with leading
coefficient one. This proves our lemma.
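The induction in this proof is effectively an algorithm: identity (2.25) says that the polynomials $P_n$ with $2\cos n\theta = P_n(2\cos\theta)$ satisfy $P_{n+1}(x) = x\,P_n(x) - P_{n-1}(x)$. The following Python sketch runs that recurrence; the function name `two_cos_poly` and the starting value $P_0 = 2$ (from $2\cos 0 = 2$) are conventions chosen here for illustration.

```python
import math


def two_cos_poly(n):
    """Coefficients [a_0, ..., a_n] of P_n, where 2cos(n t) = P_n(2cos t).

    Uses P_1(x) = x and the recurrence P_{k+1} = x * P_k - P_{k-1}
    from identity (2.25), started with the constant P_0 = 2.
    """
    if n < 1:
        raise ValueError("n must be a natural number")
    prev, cur = [2], [0, 1]  # P_0 = 2 and P_1(x) = x
    for _ in range(n - 1):
        shifted = [0] + cur                               # x * P_k(x)
        padded = prev + [0] * (len(shifted) - len(prev))  # P_{k-1}, padded
        prev, cur = cur, [s - p for s, p in zip(shifted, padded)]
    return cur


coeffs = two_cos_poly(3)
print(coeffs)  # coefficients of P_3, lowest degree first

# Numerical sanity check of 2cos(5t) = P_5(2cos t) at one sample angle.
t = 0.7
x = 2 * math.cos(t)
value = sum(c * x**k for k, c in enumerate(two_cos_poly(5)))
print(abs(value - 2 * math.cos(5 * t)) < 1e-9)
```

Every coefficient produced is an integer and the leading one is 1, as the lemma asserts; for instance $P_2(x) = x^2 - 2$ and $P_3(x) = x^3 - 3x$.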

We are now ready to prove our main result.

Theorem 2.27. Let 0 < θ < 90◦ be an angle whose measurement in degrees is
rational. Then cos θ is rational if and only if θ = 60◦ .

Proof. If θ = 60◦ , then we know that cos θ = 1/2, which is rational.


By hypothesis, the degree measure of θ is rational, say $\theta = (a/b)^\circ$ where $a$ and
$b$ are natural numbers. Then choosing the natural number $n = 360b$, we have
$n\theta = 360b \cdot (a/b)^\circ = a \cdot 360^\circ$. Thus, $n\theta$ is a multiple of $360^\circ$, so
$\cos n\theta = 1$. Substituting $2\cos n\theta = 2$ into the equation (2.24), we obtain
\[ (2\cos\theta)^n + a_{n-1}(2\cos\theta)^{n-1} + \cdots + a_1 (2\cos\theta) + a_0 - 2 = 0, \]
where the coefficients are integers. Hence, $2\cos\theta$ is a solution of the equation
\[ x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 - 2 = 0. \]
By the rational zeros theorem, any rational solution of this equation must be an
integer, because its denominator in lowest terms divides the leading coefficient 1.
So, if $2\cos\theta$ is rational, then it must be an integer. Since $0 < \theta < 90^\circ$ and cosine
is strictly between 0 and 1 for these θ's, we have $0 < 2\cos\theta < 2$, so the only integer
that $2\cos\theta$ can be is 1. Thus, $2\cos\theta = 1$ or $\cos\theta = 1/2$, and so θ must be $60^\circ$.

2.6.5. Irrationality of logarithmic numbers. Recall that the (common)
logarithm to the base 10 of a positive real number $a$ is defined to be the unique
number $x$ such that
\[ 10^x = a. \]
In Section 4.6 we define logarithms rigorously but for now, and only now, in order
to demonstrate another interesting example of irrational numbers, we shall assume
familiarity with such logarithms from high school. We also assume basic facts
concerning powers that we'll prove in the next section.
Theorem 2.28. Let $r > 0$ be any rational number. Then $\log_{10} r$ is rational if
and only if $r = 10^n$ where $n$ is an integer, in which case
\[ \log_{10} r = n. \]
Proof. If $r = 10^n$ where $n \in \mathbb{Z}$, then $\log_{10} r = n$, so $\log_{10} r$ is rational.
Assume now that $\log_{10} r$ is rational; we'll show that $r = 10^n$ for some $n \in \mathbb{Z}$. We
may assume that $r > 1$ because if $r = 1$, then $r = 10^0$, and we're done, and if $r < 1$,
then $r^{-1} > 1$ and $\log_{10} r^{-1} = -\log_{10} r$ is rational, so we can get the $r < 1$ result
from the $r > 1$ result. We henceforth assume that $r > 1$. Let $r = a/b$ where $a$ and
$b$ are natural numbers with no common factors. Assume that $\log_{10} r = c/d$ where $c$
and $d$ are natural numbers with no common factors. Then $r = 10^{c/d}$, which implies
that $r^d = 10^c$, or after setting $r = a/b$, we get $(a/b)^d = 10^c \implies a^d = 10^c \cdot b^d$, or
\[ a^d = 2^c \cdot 5^c \cdot b^d. \tag{2.26} \]
By assumption, $a$ and $b$ do not have any common prime factors. Hence, expressing $a$
and $b$ in their prime factorizations in (2.26) and using the fundamental theorem
of arithmetic, we see that the only way (2.26) can hold is if $b$ has no prime factors
(that is, $b = 1$) and $a$ can only have the prime factors 2 and 5. Thus,
\[ a = 2^m \cdot 5^n \quad\text{and}\quad b = 1 \]
for some nonnegative integers $m$ and $n$. Now according to (2.26),
\[ 2^{md} \cdot 5^{nd} = 2^c \cdot 5^c. \]
Again by the fundamental theorem of arithmetic, we must have $md = c$ and $nd = c$.
Now $c$ and $d$ have no common factors, so $d = 1$, and therefore $m = c = n$. This,
and the fact that $b = 1$, proves that
\[ r = \frac{a}{b} = \frac{a}{1} = 2^m \cdot 5^n = 2^n \cdot 5^n = 10^n. \]
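Theorem 2.28 gives a simple exact test for whether $\log_{10} r$ is rational: repeatedly divide by 10 and see whether exactly 1 is reached. The sketch below uses Python's `fractions.Fraction` for exact arithmetic; the function name `rational_log10` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction


def rational_log10(r):
    """Return the integer n with r == 10**n, or None if no such n exists.

    By Theorem 2.28, a return value of None means that log10(r) is
    irrational for the positive rational number r.
    """
    r = Fraction(r)
    if r <= 0:
        raise ValueError("r must be a positive rational number")
    sign = 1
    if r < 1:
        # Reduce to the case r > 1 exactly as in the proof: replace r by 1/r.
        r, sign = 1 / r, -1
    n = 0
    while r > 1:
        r /= 10
        n += 1
    return sign * n if r == 1 else None


print(rational_log10(Fraction(1, 1000)))  # a power of ten
print(rational_log10(2))                  # not a power of ten
```

Here `rational_log10(2)` returns `None`, so by the theorem $\log_{10} 2$ is irrational, while `Fraction(1, 1000)` is recognized as $10^{-3}$.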

In the following exercises, assume that square roots and cube roots exist for
nonnegative real numbers; again, this fact will be proved in the next section.
Exercises 2.6.
1. Prove properties (4)–(7) in the “Fraction rules” theorem.
2. Let $a$ be any positive real number and let $n, m$ be nonnegative integers with $m < n$. If
$0 < a < 1$, prove that $a^n < a^m$ and if $a > 1$, prove that $a^m < a^n$.
3. Let α be any irrational number. Prove that $-\alpha$ and $\alpha^{-1}$ are irrational. If $r$ is any
nonzero rational number, prove that the addition, subtraction, multiplication, and
division of α and $r$ are again irrational. As an application of this result, deduce that
\[ -\sqrt{2}, \quad \frac{1}{\sqrt{2}}, \quad \sqrt{2} + 1, \quad 4 - \sqrt{2}, \quad 3\sqrt{2}, \quad \frac{\sqrt{2}}{10}, \quad \frac{7}{\sqrt{2}} \]
are each irrational.


4. In this problem we prove that various numbers are irrational.
(a) Prove that $\sqrt{6}$ is irrational using the Proof III in Theorem 2.23. From the fact
that $\sqrt{6}$ is irrational, and without using any irrationality facts concerning $\sqrt{2}$ and
$\sqrt{3}$, prove that $\sqrt{2} + \sqrt{3}$ is irrational. Suggestion: To prove that $\sqrt{2} + \sqrt{3}$ is
irrational, consider its square.
(b) Now prove that $\sqrt{2} + \sqrt{3}$ is irrational using the rational zeros theorem. Suggestion:
Let $x = \sqrt{2} + \sqrt{3}$, then show that $x^4 - 10x^2 + 1 = 0$.
(c) Using the rational zeros theorem, prove that $(2\sqrt[3]{6} + 7)/3$ is irrational and $\sqrt[3]{2} - \sqrt{3}$
is irrational. (If $x = \sqrt[3]{2} - \sqrt{3}$, you should end up with a sixth degree polynomial
equation for $x$ for which you can apply the rational zeros theorem.)
5. In this problem we look at irrational values of sine and tangent. Let 0 < θ < 90◦ be
an angle whose measurement in degrees is rational. You may assume any knowledge
of the trigonometric functions and their identities.
(a) Prove that sin θ is rational if and only if θ = 30◦ , in which case sin 30◦ = 1/2.
Suggestion: Do not try to imitate the proof of Theorem 2.27, instead use a trig
identity to write sine in terms of cosine.
(b) Prove that $\tan\theta$ is rational if and only if $\theta = 45^\circ$, in which case $\tan\theta = 1$.
Suggestion: Use the identity
\[ \cos 2\theta = \frac{1 - \tan^2\theta}{1 + \tan^2\theta}. \]
6. (Cf. [43]) (Irrationality of $\sqrt{2}$, Proof V) This proof is similar to the algebraic
Pythagorean proof of Theorem 2.23. Assume that $\sqrt{2}$ is rational.
(i) Show that there is a smallest natural number $n$ such that $n\sqrt{2}$ is an integer.
(ii) Show that $m = n\sqrt{2} - n = n(\sqrt{2} - 1)$ is a natural number smaller than $n$.
(iii) Finally, show that $m\sqrt{2}$ is an integer, which contradicts the fact that $n$ was the
smallest natural number having this property.
7. (Irrationality of $\sqrt{2}$, Proof VI) Here's a proof due to Marcin Mazur [148].
(i) Show that
\[ \sqrt{2} = \frac{-4\sqrt{2} + 6}{3\sqrt{2} - 4}. \]
(ii) Now suppose that $\sqrt{2} = a/b$ ($a, b \in \mathbb{N}$) where $a$ is the smallest positive numerator
that $\sqrt{2}$ can have as a fraction. Using the formula in (i), derive a contradiction
as in the algebraic Pythagorean proof of Theorem 2.23.

2.7. The completeness axiom of R and its consequences


The completeness axiom of the real numbers essentially states that the real
numbers have no “gaps”. As discovered in the previous section, this property is
quite in contrast to the rational numbers that have many “gaps”. In this section
we discuss the completeness axiom and its consequences. Another consequence of
the completeness axiom is that the set of real numbers is uncountable while the set of
rationals is countable, but we leave this breathtaking subject for Section 2.10.

2.7.1. The completeness axiom. Before discussing the completeness axiom,
we need to talk about lower and upper bounds of sets.
A set A ⊆ R is said to be bounded above if there is a real number b larger
than any number in A in the sense that for each a in A we have a ≤ b. Any such
number b, if such exists, is called an upper bound for A. Suppose that b is an
upper bound for A. Then b is called the least upper bound or supremum for
A if b is just that, the least upper bound for A, in the sense that it is less than any
other upper bound for A. This supremum, if it exists, is denoted by sup A. We
shall use both terminologies “least upper bound” and “supremum” interchangeably
although we shall use least upper bound more often.

Example 2.20. Consider the interval I = [0, 1). This interval is bounded above
by, for instance, 1, 3/2, 22/7, 10, 1000, etc. In fact, any upper bound for I is just
a real number greater than or equal to 1. The least upper bound is 1 since 1 is the
smallest upper bound. Note that $1 \notin I$.
Example 2.21. Now let J = (0, 1]. This set is also bounded above, and any
upper bound for J is as before, just a real number greater than or equal to 1. The
least upper bound is 1. In this case, 1 ∈ J.
These examples show that the supremum of a set, if it exists, may or may not
belong to the set.
Example 2.22. Z is not bounded above (see Lemma 2.34) nor is the set (0, ∞).
We summarize: Let A ⊆ R be bounded above. Then a number b is the least
upper bound or supremum for A means two things concerning b:
(L1) for all a in A, a ≤ b — this just means that b is an upper bound for A;
(L2) if c is an upper bound for A, then b ≤ c — this just means that b is the least,
or smallest, upper bound for A.
Instead of (L2) it is sometimes convenient to substitute the following.
(L2′) if $c < b$, then for some $a$ in $A$ we have $c < a$; this just means that any
number $c$ smaller than $b$ cannot be an upper bound for $A$, which is to say,
there is no upper bound for $A$ that is smaller than $b$.
(L2′) is just the contrapositive of (L2); do you see why? We can also talk
about lower bounds. A set A ⊆ R is said to be bounded below if there is a real
number b smaller than any number in A in the sense that for each a in A we have
b ≤ a. Any such number b, if such exists, is called a lower bound for A. If b is a
lower bound for A, then b is called the greatest lower bound or infimum for A
if b is just that, the greatest lower bound for A, in the sense that it is greater than
any other lower bound for A. This infimum, if it exists, is denoted by inf A. We
shall use both terminologies “greatest lower bound” and “infimum” interchangeably
although we shall use greatest lower bound more often.
Example 2.23. The sets I = [0, 1) and J = (0, 1] are both bounded below (by
e.g. 0, −1/2, −1, −1000, etc.) and in both cases the greatest lower bound is 0.
Thus, the infimum of a set, if it exists, may or may not belong to the set.
Example 2.24. Z (see Lemma 2.34) and (−∞, 0) are not bounded below.
We summarize: Let A ⊆ R be bounded below. Then a number b is the greatest
lower bound or infimum for A means two things concerning b:
(G1) for all a in A, b ≤ a — this just means that b is a lower bound for A,
(G2) if c is a lower bound for A, then c ≤ b — this just means that b is the greatest
lower bound for A.
Instead of (G2) it is sometimes convenient to substitute its contrapositive.
(G2′) if $b < c$, then for some $a$ in $A$ we have $a < c$; this just means that any
number $c$ greater than $b$ cannot be a lower bound for $A$, which is to say,
there is no lower bound for $A$ that is greater than $b$.
In the examples given so far (e.g. the intervals I and J), we have shown that
if a set has an upper bound, then it has a least upper bound. This is a general
phenomenon, called the completeness axiom of the real numbers:

(C) (Completeness axiom of the real numbers) Every nonempty set of real
numbers that is bounded above has a supremum, that is, a least upper bound.
As stated in the last section, we assume that R has this property. Using the
following lemma, we can prove the corresponding statement for infimums.
Lemma 2.29. If $A$ is nonempty and bounded below, then $-A := \{-a \,;\, a \in A\}$
is nonempty and bounded above, and $\inf A = -\sup(-A)$ in the sense that $\inf A$
exists and this formula for $\inf A$ holds.
Proof. Since $A$ is nonempty and bounded below, there is a real number $c$ such
that $c \le a$ for all $a$ in $A$. Therefore, $-a \le -c$ for all $a$ in $A$, and hence the set $-A$
is bounded above by $-c$. By the completeness axiom, $-A$ has a least upper bound,
which we denote by $b$. Our lemma is finished once we show that $-b$ is the greatest
lower bound for $A$. To see this, we know that $-a \le b$ for all $a$ in $A$ and so, $-b \le a$
for all $a$ in $A$. Thus, $-b$ is a lower bound for $A$. Suppose that $b' \le a$ for all $a$ in $A$.
Then $-a \le -b'$ for all $a$ in $A$ and so, $b \le -b'$ since $b$ is the least upper bound for
$-A$. Thus, $b' \le -b$ and hence, $-b$ is indeed the greatest lower bound for $A$.
This lemma immediately gives the following theorem.
Theorem 2.30. Every nonempty set of real numbers that is bounded below has
an infimum, that is, greatest lower bound.
The consequences of the completeness property of the real numbers are quite
profound as we now intend to demonstrate!

2.7.2. Existence of n-th roots. As a first consequence of the completeness
property we show that any nonnegative real number has a unique nonnegative n-th
root where $n \in \mathbb{N}$. In the following theorem we use the fact that if $\xi, \eta > 0$, then
for any $k \in \mathbb{N}$,
\[ 1 < \xi \implies \xi \le \xi^k \quad\text{and}\quad \eta < 1 \implies \eta^k \le \eta. \]
These properties follow from the power rules in Theorem 2.22. E.g. $1 < \xi$ implies
$1 = 1^{k-1} \le \xi^{k-1}$ (with $=$ when $k = 1$ and with $<$ when $k > 1$); then multiplying
$1 \le \xi^{k-1}$ by ξ we get $\xi \le \xi^k$. A similar argument shows that $\eta^k \le \eta$.
Theorem 2.31 (Existence/uniqueness of n-th roots). Every nonnegative
real number has a unique nonnegative n-th root.
Proof. First of all, uniqueness follows from the last power rule in Theorem
2.22. Note that the n-th root of zero exists and equals zero and certainly first roots
always exist. So, let $a > 0$ and $n \ge 2$; we shall prove that $\sqrt[n]{a}$ exists.
Step 1: We first define the tentative $\sqrt[n]{a}$ as a supremum. Let $A$ be the set of
real numbers $x$ such that $x^n \le a$. Certainly $A$ contains 0, so $A$ is nonempty. We
claim that $A$ is bounded above by $a + 1$. To see this, observe that if $x \ge a + 1$, then
in particular $x > 1$, so for such $x$,
\[ a < a + 1 \le x \le x^n \implies x \notin A. \]
This shows that $A$ is bounded above by $a + 1$. Being nonempty and bounded above,
by the axiom of completeness, $A$ has a least upper bound, which we denote by $b \ge 0$.
We shall prove that $b^n = a$, which proves our theorem. Well, either $b^n = a$, $b^n < a$,
or $b^n > a$. We shall prove that the latter two cases cannot occur.

Step 2: Suppose that $b^n < a$. Let $0 < \varepsilon < 1$. Then $\varepsilon^m \le \varepsilon$ for any natural
number $m$, so by the binomial theorem,
\[ (b + \varepsilon)^n = \sum_{k=0}^{n} \binom{n}{k} b^k \varepsilon^{n-k}
 = b^n + \sum_{k=0}^{n-1} \binom{n}{k} b^k \varepsilon^{n-k}
 \le b^n + \sum_{k=0}^{n-1} \binom{n}{k} b^k \varepsilon = b^n + \varepsilon c, \]
where $c$ is the positive number $c = \sum_{k=0}^{n-1} \binom{n}{k} b^k$. Since $b^n < a$, we have
$(a - b^n)/c > 0$. Let ε equal $(a - b^n)/c$ or $1/2$, whichever is smaller.
Then $0 < \varepsilon < 1$ and $\varepsilon \le (a - b^n)/c$, so
\[ (b + \varepsilon)^n \le b^n + \varepsilon c \le b^n + \frac{a - b^n}{c} \cdot c = a. \]
This shows that $b + \varepsilon$ also belongs to $A$, which contradicts the fact that $b$ is an
upper bound for $A$.
Step 3: Now suppose that $b^n > a$. Then $b > 0$ (for if $b = 0$, then $b^n = 0 \not> a$).
Given any $0 < \varepsilon < b$, we have $\varepsilon b^{-1} < 1$, which implies $-\varepsilon b^{-1} > -1$, so by
Bernoulli's inequality (Theorem 2.7),
\[ (b - \varepsilon)^n = b^n \left(1 - \varepsilon b^{-1}\right)^n \ge b^n \left(1 - n \varepsilon b^{-1}\right) = b^n - \varepsilon c, \]
where $c = n b^{n-1} > 0$. Since $a < b^n$, we have $(b^n - a)/c > 0$. Let ε equal $(b^n - a)/c$
or $b/2$, whichever is smaller. Then $0 < \varepsilon < b$
and $\varepsilon \le (b^n - a)/c$, which implies that $-\varepsilon c \ge -(b^n - a)$. Therefore,
\[ (b - \varepsilon)^n \ge b^n - \varepsilon c \ge b^n - (b^n - a) = a. \]
This shows that $b - \varepsilon$ is an upper bound for $A$ (indeed, if $x > b - \varepsilon > 0$, then
$x^n > (b - \varepsilon)^n \ge a$, so $x \notin A$), which contradicts the fact that $b$ is
the least upper bound for $A$.
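The supremum in Step 1 can be approximated numerically by bisection, which makes the role of the set $A = \{x \,;\, x^n \le a\}$ and of the upper bound $a + 1$ concrete. This Python sketch is an illustration only, not the proof's method: it assumes a rational $a \ge 0$ so that exact arithmetic with `fractions.Fraction` is possible, and the name `nth_root_bounds` and the parameter `steps` are made up here.

```python
from fractions import Fraction


def nth_root_bounds(a, n, steps=40):
    """Bracket sup{x >= 0 : x**n <= a} between rationals lo <= hi.

    Invariant: lo**n <= a < hi**n, so lo belongs to the set A of
    Step 1 and hi is an upper bound for the nonnegative part of A.
    """
    a = Fraction(a)
    if a < 0 or n < 1:
        raise ValueError("need a >= 0 and a natural number n")
    lo, hi = Fraction(0), a + 1  # 0 is in A; a + 1 bounds A from above
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid**n <= a:
            lo = mid   # mid is in A, so the supremum is at least mid
        else:
            hi = mid   # mid**n > a, so mid is an upper bound for A
    return lo, hi


lo, hi = nth_root_bounds(2, 2)
print(float(lo), float(hi))  # both close to the square root of 2
```

Each step halves the bracket, so after 40 steps the two rationals differ by $3/2^{40}$; neither can equal $\sqrt{2}$ itself, since $\sqrt{2}$ is irrational.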


In particular, $\sqrt{2}$ exists and, as we already know, is an irrational number. Here
are proofs of the familiar root rules memorized from high school.
Theorem 2.32 (Root rules). For any nonnegative real numbers $a$ and $b$ and
natural numbers $m$ and $n$, we have
\[ \sqrt[n]{ab} = \sqrt[n]{a}\,\sqrt[n]{b}, \qquad \sqrt[m]{\sqrt[n]{a}} = \sqrt[mn]{a}. \]
Moreover,
\[ a < b \iff \sqrt[n]{a} < \sqrt[n]{b}. \]
Proof. Let $x = \sqrt[n]{a}$ and $y = \sqrt[n]{b}$. Then, $x^n = a$ and $y^n = b$, so
\[ (xy)^n = x^n y^n = ab. \]
By uniqueness of n-th roots, we must have $xy = \sqrt[n]{ab}$. This proves the first
identity. The second identity is proved similarly. Finally, by our power rules theorem
(Theorem 2.22), we have $\sqrt[n]{a} < \sqrt[n]{b} \iff (\sqrt[n]{a})^n < (\sqrt[n]{b})^n \iff a < b$, which proves
the last statement of our theorem.

Another way to write these root rules is
\[ (ab)^{1/n} = a^{1/n} b^{1/n}, \qquad (a^{1/n})^{1/m} = a^{1/(mn)}, \qquad\text{and}\qquad a < b \iff a^{1/n} < b^{1/n}. \]
Given any $a \in \mathbb{R}$ with $a \ge 0$ and $r = m/n$ where $m \in \mathbb{Z}$ and $n \in \mathbb{N}$, we define
\[ a^r := \left(a^{1/n}\right)^m, \tag{2.27} \]
provided that $a \ne 0$ when $m < 0$. One can check that the right-hand side is defined
independent of the representation of $r$; that is, if $r = p/q = m/n$ for some other
$p \in \mathbb{Z}$ and $q \in \mathbb{N}$, then $(a^{1/q})^p = (a^{1/n})^m$. Combining the power rules theorem for
integer powers and the root rules theorem above, we get
Theorem 2.33 (Power rules for rational powers). For $a, b \in \mathbb{R}$ with $a, b \ge 0$,
and $r, s \in \mathbb{Q}$, we have
\[ a^r \cdot a^s = a^{r+s}; \qquad a^r \cdot b^r = (ab)^r; \qquad (a^r)^s = a^{rs}, \]
provided that the individual powers are defined (e.g. $a$ and $b$ are nonzero if an
exponent is negative). If $r$ is positive and $a, b \ge 0$, then
\[ a < b \iff a^r < b^r. \]
We shall define $a^x$ for any real number $x$ in Section 4.6 and prove a similar
theorem (see Theorem 4.32); see also Exercise 9 for another way to define $a^x$.
2.7.3. The Archimedean property and its consequences. Another con-
sequence of the completeness property is the following “obvious” fact.
Lemma 2.34. N is not bounded above and Z is not bounded above nor below.
Proof. We only prove the claim for N leaving the claim for Z to you. Assume,
for sake of achieving a contradiction, that N is bounded above. Then the set N must
have a least upper bound, say b. Since the number b − 1 is smaller than the least
upper bound b, there must be a natural number m such that b − 1 < m, which
implies that b < m + 1. However, m + 1 is a natural number, so b cannot be an
upper bound for N, a contradiction. 
This lemma yields many useful results.
Theorem 2.35 (The 1/n-principle). Given any real number $x > 0$, there is
a natural number $n$ such that $\frac{1}{n} < x$.
Proof. Indeed, since $\mathbb{N}$ is not bounded above, $\frac{1}{x}$ is not an upper bound so
there is an $n \in \mathbb{N}$ such that $\frac{1}{x} < n$. This implies that $\frac{1}{n} < x$ and we're done.
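For a rational $x > 0$ a witness $n$ can be written down explicitly: any natural number larger than $1/x$ works, for example $\lfloor 1/x \rfloor + 1$. The Python sketch below makes that choice; restricting to rational input (so that `fractions.Fraction` gives exact comparisons) and the name `one_over_n_witness` are assumptions made for the illustration.

```python
import math
from fractions import Fraction


def one_over_n_witness(x):
    """Return a natural number n with 1/n < x, for a rational x > 0."""
    x = Fraction(x)
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(1/x) + 1 is a natural number strictly greater than 1/x.
    return math.floor(1 / x) + 1


x = Fraction(3, 1000)
n = one_over_n_witness(x)
print(n, Fraction(1, n) < x)
```

The proof itself needs no formula for $n$; it only uses that $\mathbb{N}$ is not bounded above, which is what guarantees that such an $n$ exists for every real $x > 0$.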
Here’s an example showing the 1/n-principle in action.
Example 2.25. Let
\[ A = \left\{ 1 - \frac{3}{n} \,;\, n = 1, 2, 3, \ldots \right\}. \]
We shall prove that $\sup A = 1$ and $\inf A = -2$. (Please draw a few points of $A$ on
a number line to see why these values for the sup and inf are reasonable.)
To show that $\sup A = 1$ we need to prove two things: that 1 is an upper bound
for $A$ and that 1 is the least of all upper bounds for $A$. First, we show that 1 is an
upper bound. To see this, observe that for any $n \in \mathbb{N}$,
\[ \frac{3}{n} \ge 0 \implies -\frac{3}{n} \le 0 \implies 1 - \frac{3}{n} \le 1. \]

Thus, for all $a \in A$, $a \le 1$, so 1 is indeed an upper bound for $A$. Second, we must
show that 1 is the least of all upper bounds. So, assume that $c < 1$; we'll show that
$c$ cannot be an upper bound by showing that there is an $a \in A$ such that $c < a$;
that is, there is an $n \in \mathbb{N}$ such that $c < 1 - 3/n$. Observe that
\[ c < 1 - \frac{3}{n} \iff \frac{3}{n} < 1 - c \iff \frac{1}{n} < \frac{1 - c}{3}. \tag{2.28} \]
Since $c < 1$, we have $(1 - c)/3 > 0$, so by the 1/n-principle, there exists an $n \in \mathbb{N}$
such that $1/n < (1 - c)/3$. Hence, by (2.28), there is an $n \in \mathbb{N}$ such that $c < 1 - 3/n$.
This shows that $c$ is not an upper bound for $A$.
To show that inf A = −2 we need to prove two things: That −2 is a lower
bound and that −2 is the greatest of all lower bounds. First, to prove that −2 is a
lower bound, observe that for any n ∈ N,
\[ \frac{3}{n} \le 3 \implies -3 \le -\frac{3}{n} \implies -2 \le 1 - \frac{3}{n}. \]
Thus, for all a ∈ A, −2 ≤ a, so −2 is indeed a lower bound for A. Second, to see
that −2 is the greatest of all lower bounds, assume that −2 < c; we’ll show that
c cannot be a lower bound by showing there is an a ∈ A such that a < c; that
is, there is an n ∈ N such that 1 − 3/n < c. In fact, simply take n = 1. Then
1 − 3/n = 1 − 3 = −2 < c. This shows that c is not a lower bound for A.
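The two halves of this example can be spot-checked in Python. The sketch assumes a rational $c < 1$ and picks $n = \lfloor 3/(1-c) \rfloor + 1$, one concrete choice satisfying (2.28); the name `element_above` is illustrative. A finite check like this only illustrates the argument and does not replace it.

```python
import math
from fractions import Fraction


def element_above(c):
    """For rational c < 1, return n in N with c < 1 - 3/n.

    By (2.28) it suffices that 1/n < (1 - c)/3, that is, n > 3/(1 - c).
    """
    c = Fraction(c)
    if c >= 1:
        raise ValueError("c must be less than 1")
    return math.floor(3 / (1 - c)) + 1


c = Fraction(999, 1000)
n = element_above(c)
print(n, c < 1 - Fraction(3, n))

# The first few elements of A: all are at most 1, and the smallest
# one is attained at n = 1.
first_terms = [1 - Fraction(3, k) for k in range(1, 11)]
print(min(first_terms), max(first_terms) <= 1)
```

So no $c < 1$ can be an upper bound for $A$, while the lower bound $-2$ is actually attained (at $n = 1$), which is why $\inf A \in A$ but $\sup A \notin A$.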
Here’s another useful consequence of the fact that N is not bounded above.
Theorem 2.36 (Archimedean property).¹¹ Given a real number x > 0 and
a real number y, there is a unique integer n such that
nx ≤ y < (n + 1)x.
In particular, with x = 1, given any real number y there is a unique integer n such
that n ≤ y < n + 1, a fact that is obvious from viewing the real numbers as a line.
Proof. Dividing $nx \le y < (n + 1)x$ by $x$, we need to prove that there is a
unique integer $n$ such that
\[ n \le z < n + 1, \quad\text{where } z = \frac{y}{x}. \]
We first prove existence. Since $\mathbb{N}$ is not bounded above, there is a $k \in \mathbb{N}$
such that $1 - z < k$, or adding $z$'s, $1 < z + k$. Again using that $\mathbb{N}$ is not bounded
above, the set $A = \{m \in \mathbb{N} \,;\, z + k < m\}$ is not empty, so by the well-ordering of
$\mathbb{N}$, $A$ contains a least element, say $\ell \in \mathbb{N}$. Then $z + k < \ell$ (because $\ell \in A$) and
$\ell - 1 \le z + k$ (because on the other hand, if $z + k < \ell - 1$, then setting $m = \ell - 1 < \ell$
and recalling that $1 < z + k$, we see that $m \in \mathbb{N}$ and $z + k < m$, so $m \in A$ is smaller
than $\ell$, contradicting that $\ell$ is the least element of $A$). Thus,
\[ \ell - 1 \le z + k < \ell \implies n \le z < n + 1, \]
where $n = \ell - 1 - k \in \mathbb{Z}$.
To prove uniqueness, assume that n ≤ z < n + 1 and m ≤ z < m + 1 for
m, n ∈ Z. These inequalities imply that n ≤ z < m + 1, so n < m + 1, and that
m ≤ z < n + 1, so m < n + 1. Thus, n < m + 1 < (n + 1) + 1 = n + 2, or
0 < m − n + 1 < 2. This implies that m − n + 1 = 1, or m = n. 
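For rational $x$ and $y$ the integer in the Archimedean property can be computed directly as the floor of $y/x$. The Python sketch below checks the defining inequalities exactly; the name `archimedean_n` and the restriction to rational input are assumptions made for the illustration, and `math.floor` plays the role of the greatest integer function.

```python
import math
from fractions import Fraction


def archimedean_n(x, y):
    """Return the unique integer n with n*x <= y < (n + 1)*x, for x > 0."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0:
        raise ValueError("x must be positive")
    return math.floor(y / x)  # n <= y/x < n + 1


x, y = Fraction(3, 7), Fraction(-5, 2)
n = archimedean_n(x, y)
print(n, n * x <= y < (n + 1) * x)
```

Negative values of $y$ are handled correctly because `math.floor` rounds toward $-\infty$, matching the requirement $n \le y/x < n + 1$.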
¹¹ The “Archimedean property” might equally well be called the “Eudoxus property” after
Eudoxus of Cnidus (408 B.C.–355 B.C.); see [169] and [120, p. 7].
Figure 2.6. The greatest integer function.

We remark that some authors replace the integer n by the integer n − 1 in the
Archimedean property so it reads: Given a real number x > 0 and a real number y,
there is a unique integer n such that (n − 1)x ≤ y < nx. We’ll use this formulation
of the Archimedean property in the proof of Theorem 2.37 below.
Using the Archimedean property, we can define the greatest integer function
as follows (see Figure 2.6): Given any $a \in \mathbb{R}$, we define $\lfloor a \rfloor$ as the greatest
integer less than or equal to $a$, that is, $\lfloor a \rfloor$ is the unique integer $n$ satisfying the
inequalities $n \le a < n + 1$. This function will come up various times in the sequel.
We now prove an important fact concerning the rational and irrational numbers.
Theorem 2.37 (Density of the (ir)rationals). Between any two distinct real
numbers there is a rational number and an irrational number.
Proof. Let $x < y$. We first prove that there is a rational number between $x$
and $y$. Indeed, $y - x > 0$, so by the 1/n-principle (Theorem 2.35) there is a natural
number $m$ such that $1/m < y - x$. By the Archimedean property, in the formulation
remarked on above, there is an integer $n$ such that
\[ n - 1 \le mx < n \implies \frac{n}{m} - \frac{1}{m} \le x < \frac{n}{m}. \]
In particular, $x < n/m$, and
\[ \frac{n}{m} \le \frac{1}{m} + x < (y - x) + x = y \implies \frac{n}{m} < y. \]
Thus, the rational number $n/m$ is between $x$ and $y$.
To prove that between $x$ and $y$ is an irrational number, note that $x - \sqrt{2} < y - \sqrt{2}$,
so by what we just proved above, there is a rational number $r$ such that
$x - \sqrt{2} < r < y - \sqrt{2}$. Adding $\sqrt{2}$, we obtain
\[ x < \xi < y, \quad\text{where } \xi = r + \sqrt{2}. \]
Note that ξ is irrational, for if it were rational, then $\sqrt{2} = \xi - r$ would also be
rational, which we know is false. This completes our proof.
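The first half of this proof is constructive, and its two steps translate directly into code: choose $m$ with $1/m < y - x$, then choose the integer $n$ with $n - 1 \le mx < n$. The Python sketch below follows those steps for rational endpoints, so that all comparisons are exact; the name `rational_between` is illustrative.

```python
import math
from fractions import Fraction


def rational_between(x, y):
    """Return n/m with x < n/m < y, following the proof of Theorem 2.37.

    x and y are rationals here only so that the arithmetic is exact;
    the proof itself works for arbitrary real numbers x < y.
    """
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    m = math.floor(1 / (y - x)) + 1  # natural number with 1/m < y - x
    n = math.floor(m * x) + 1        # integer with n - 1 <= m*x < n
    return Fraction(n, m)


q = rational_between(Fraction(1, 3), Fraction(2, 5))
print(q, Fraction(1, 3) < q < Fraction(2, 5))
```

For rational endpoints the midpoint would of course also do; the point of the sketch is that the construction uses only the 1/n-principle and the Archimedean property, which is why it extends to irrational $x$ and $y$.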

2.7.4. The nested intervals property. A sequence of sets $\{A_n\}$ is said to
be nested if
\[ A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots \supseteq A_n \supseteq A_{n+1} \supseteq \cdots, \]
that is, $A_k \supseteq A_{k+1}$ for each $k$.
Figure 2.7. Nested Intervals.



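Example 2.26. If $A_n = \left(0, \frac{1}{n}\right)$, then $\{A_n\}$ is a nested sequence. Note that
\[ \bigcap_{n=1}^{\infty} A_n = \bigcap_{n=1}^{\infty} \left(0, \frac{1}{n}\right) = \varnothing. \]
Indeed, if $x \in \bigcap A_n$, which means that $x \in \left(0, \frac{1}{n}\right)$ for every $n \in \mathbb{N}$, then $0 < x < 1/n$
for all $n \in \mathbb{N}$. However, by Theorem 2.35, there is an $n$ such that $0 < 1/n < x$. This
shows that $x \notin (0, 1/n)$, contradicting that $x \in \left(0, \frac{1}{n}\right)$ for every $n \in \mathbb{N}$. Therefore,
$\bigcap A_n$ must be empty.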
 
Example 2.27. Now on the other hand, if $A_n = \left[0, \frac{1}{n}\right]$, then $\{A_n\}$ is a nested
sequence, but in this case,
\[ \bigcap_{n=1}^{\infty} A_n = \bigcap_{n=1}^{\infty} \left[0, \frac{1}{n}\right] = \{0\} \ne \varnothing. \]
The difference between the first example and the second is that the second
example is a nested sequence of closed and bounded intervals. Here, bounded
means bounded above and below. It is a general fact that the intersection of a
nested sequence of nonempty closed and bounded intervals is nonempty. This is
the content of the nested intervals theorem.
Theorem 2.38 (Nested intervals theorem). The intersection of a nested
sequence of nonempty closed and bounded intervals in R is nonempty.
Proof. Let {In = [an , bn ]} be a sequence of nonempty closed and bounded
intervals. Since we are given that this sequence is nested, we in particular have
I2 = [a2 , b2 ] ⊆ [a1 , b1 ] = I1 , and so a1 ≤ a2 ≤ b2 ≤ b1 . Since I3 = [a3 , b3 ] ⊆ [a2 , b2 ],
we have a2 ≤ a3 ≤ b3 ≤ b2 , and so a1 ≤ a2 ≤ a3 ≤ b3 ≤ b2 ≤ b1 . In general, we see
that for any n,
a1 ≤ a2 ≤ a3 ≤ · · · ≤ an ≤ bn ≤ · · · ≤ b3 ≤ b2 ≤ b1 .
See Figure 2.7. Let a = sup{ak ; k = 1, 2, . . .}. Since a1 ≤ a2 ≤ a3 ≤ · · · , by
definition of supremum an ≤ a for each n. Also, since any bn is an upper bound
for the set {ak ; k = 1, 2, . . .}, by definition of supremum, a ≤ bn for each n. Thus,
a ∈ In for each n, and our proof is complete. 
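The key step, that $a = \sup\{a_k\}$ lies in every $I_n$, can be illustrated on a finite truncation of a nested sequence, where the supremum of the left endpoints is simply their maximum. In this Python sketch the name `common_point` and the sample intervals $[1 - 1/n,\, 1 + 1/n]$ are choices made here; a finite check only illustrates the theorem, whose content is the infinite case.

```python
from fractions import Fraction


def common_point(intervals):
    """Return max a_k for a finite nested list of closed intervals (a_k, b_k)."""
    for (a1, b1), (a2, b2) in zip(intervals, intervals[1:]):
        # Nestedness of nonempty closed intervals: a1 <= a2 <= b2 <= b1.
        if not (a1 <= a2 <= b2 <= b1):
            raise ValueError("intervals must be nonempty, closed, and nested")
    return max(a for a, _ in intervals)


intervals = [(1 - Fraction(1, n), 1 + Fraction(1, n)) for n in range(1, 50)]
a = common_point(intervals)
print(a, all(lo <= a <= hi for lo, hi in intervals))
```

For the full infinite sequence the supremum of the left endpoints is 1, which is the single point common to all of these intervals.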
Example 2.28. The “bounded” assumption cannot be dropped, for if $A_n = [n, \infty)$,
then $\{A_n\}$ is a nested sequence, but
\[ \bigcap_{n=1}^{\infty} A_n = \varnothing. \]

We end this section with a discussion of maximums and minimums. Given any
set A of real numbers, a number a is called the maximum of A if a ∈ A and
a = sup A, in which case we write a = max A. Similarly, a is called the minimum
of A if a ∈ A and a = inf A, in which case we write a = min A. For instance,
1 = max(0, 1], but (0, 1) has no maximum, only a supremum, which is also 1. In
Problem 4, we prove that any finite set has a maximum.

Exercises 2.7.
1. What are the supremums and infimums of the following sets? Give careful proofs of
your answers. The “1/n-principle” might be helpful in some of your proofs.
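(a) $A = \left\{1 + \frac{5}{n} \,;\, n = 1, 2, 3, \ldots\right\}$  (b) $B = \left\{3 - \frac{8}{n^3} \,;\, n = 1, 2, 3, \ldots\right\}$
(c) $C = \left\{1 + (-1)^n \frac{1}{n} \,;\, n = 1, 2, 3, \ldots\right\}$  (d) $D = \left\{(-1)^n + \frac{1}{n} \,;\, n = 1, 2, 3, \ldots\right\}$
(e) $E = \left\{\sum_{k=1}^{n} \frac{1}{2^k} \,;\, n = 1, 2, \ldots\right\}$  (f) $F = \left\{(-1)^n + \frac{(-1)^{n+1}}{n} \,;\, n = 1, 2, 3, \ldots\right\}$.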

2. Are the following sets bounded above? Are they bounded below? If the supremum or
infimum exists, find it and prove your answer.
(a) $A = \left\{1 + n^{(-1)^n} \,;\, n = 1, 2, 3, \ldots\right\}$, (b) $B = \left\{2^{n(-1)^n} \,;\, n = 1, 2, 3, \ldots\right\}$.

3. (Various properties of supremums/infimums)


(a) If A ⊆ R is bounded above and A contains one of its upper bounds, prove that this
upper bound is in fact the supremum of A.
(b) Let A ⊆ R be a nonempty bounded set. For x, y ∈ R, define a new set xA + y by
$xA + y := \{xa + y \,;\, a \in A\}$. Consider the case $y = 0$. Prove that
\[ x > 0 \implies \inf(xA) = x \inf A, \quad \sup(xA) = x \sup A; \]
\[ x < 0 \implies \inf(xA) = x \sup A, \quad \sup(xA) = x \inf A. \]


(c) With x = 1, prove that inf(A + y) = inf(A) + y and sup(A + y) = sup(A) + y.
(d) What are the formulas for inf(xA + y) and sup(xA + y)?
(e) If A ⊆ B and B is bounded, prove that sup A ≤ sup B and inf B ≤ inf A.
4. In this problem we prove some facts concerning maximums and minimums.
(a) Let A ⊆ R be nonempty. An element a ∈ A is called the maximum, respectively
minimum, element of A if a ≥ x, respectively a ≤ x, for all x ∈ A. Prove A has a
maximum (resp. minimum) if and only if sup A exists and sup A ∈ A (resp. inf A
exists and inf A ∈ A).
(b) Let A ⊆ R and suppose that A has a maximum, say a = max A. Given any b ∈ R,
prove that A ∪ {b} also has a maximum, and max(A ∪ {b}) = max{a, b}.
(c) Prove that a nonempty finite set of real numbers has a maximum and minimum,
where by finite we mean a set of the form {a1 , a2 , . . . , an } where a1 , . . . , an ∈ R.
5. If A ⊆ R+ is nonempty and closed under addition, prove that A is not bounded above.
(As a corollary, we get another proof that N is not bounded above.) If A ⊆ (1, ∞) is
nonempty and closed under multiplication, prove that A is not bounded above.
6. Using the Archimedean property, prove that if $a, b \in \mathbb{R}$ and $b - a > 1$, then there is an
$n \in \mathbb{Z}$ such that $a < n < b$. Using this result can you give another proof that between
any two real numbers there is a rational number?
7. If $a \in \mathbb{R}$, prove that $|a| = \sqrt{a^2}$.
8. Here are some more power rules for you to prove. Let p, q ∈ Q.
(a) If $p < q$ and $a > 1$, then $a^p < a^q$.
(b) If $p < q$ and $0 < a < 1$, then $a^q < a^p$.
(c) Let $a > 0$ and let $p < q$. Prove that $a > 1$ if and only if $a^p < a^q$.
9. (Real numbers to real powers) We define $0^x := 0$ for all $x > 0$; otherwise $0^x$ is
undefined. We now define $a^x$ for $a > 0$ and $x \in \mathbb{R}$. First, assume that $a \ge 1$ and $x \ge 0$.
(a) Prove that $A = \{a^r \,;\, r \in \mathbb{Q},\ 0 \le r \le x\}$ is bounded above, where $a^r$ is defined in (2.27).
Define $a^x := \sup A$. Prove that if $x \in \mathbb{Q}$, then this definition of $a^x$ agrees with the
definition (2.27).
(b) For $a, b, x, y \in \mathbb{R}$ with $a, b \ge 1$ and $x, y \ge 0$, prove that
\[ a^x \cdot a^y = a^{x+y}; \qquad a^x \cdot b^x = (ab)^x; \qquad (a^x)^y = a^{xy}. \tag{2.29} \]
(In the equality $(a^x)^y = a^{xy}$, you should first show that $a^x \ge 1$ so $(a^x)^y$ is defined.)

(c) If $0 < a < 1$ and $x \ge 0$, define $a^x := 1/(1/a)^x$; note that $1/a > 1$ so $(1/a)^x$ is
defined. Finally, if $a > 0$ and $x < 0$, define $a^x := (1/a)^{-x}$; note that $-x > 0$ so
$(1/a)^{-x}$ is defined. Prove (2.29) for any $a, b, x, y \in \mathbb{R}$ with $a, b > 0$.
10. Let $p(x) = ax^2 + bx + c$ be a quadratic polynomial with real coefficients and with
$a \ne 0$. Prove that $p(x)$ has a real root (that is, an $x \in \mathbb{R}$ with $p(x) = 0$) if and only if
$b^2 - 4ac \ge 0$, in which case the root(s) are given by the quadratic formula:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
11. Let $\{I_n = [a_n, b_n]\}$ be a nested sequence of nonempty closed and bounded intervals
and put $A = \{a_n ; n \in \mathbb{N}\}$ and $B = \{b_n ; n \in \mathbb{N}\}$. Show that $\sup A$ and $\inf B$ exist and
\[ \bigcap_{n \in \mathbb{N}} I_n = [\sup A, \inf B]. \]
12. In this problem we give a characterization of the completeness axiom (C) of R in terms
of intervals as explained by Christian [50]. A subset A of R is convex if given any x
and y in A and t ∈ R with x < t < y, we have t ∈ A.
(a) Assume axiom (C). Prove that all convex subsets of R are intervals.
(b) Assume that all convex subsets of R are intervals. Prove the completeness property
(C) of R. Suggestion: Let I be the set of all upper bounds of a nonempty set A
that is bounded above. Show that I is convex.
This problem shows that the completeness axiom is equivalent to the statement that
all convex sets are intervals.

2.8. m-dimensional Euclidean space


The plane $\mathbb{R}^2$ is said to be two-dimensional because to locate a point in the plane
requires two numbers, its ordered pair of coordinates. Similarly, we are all familiar
with $\mathbb{R}^3$, which is said to be three-dimensional because to represent any point in
space we need an ordered triple of real numbers. In this section we generalize these
considerations to $m$-dimensional space $\mathbb{R}^m$ and study its properties.
2.8.1. The vector space structure of Rm . Recall that the set Rm is just
the product $\mathbb{R}^m := \mathbb{R} \times \cdots \times \mathbb{R}$ ($m$ copies of $\mathbb{R}$), or explicitly, the set of all $m$-tuples
of real numbers,
\[ \mathbb{R}^m := \{ (x_1, \ldots, x_m) ; x_1, \ldots, x_m \in \mathbb{R} \}. \]
We call elements of Rm vectors (or points) and we use the notation 0 for the
m-tuple of zeros (0, . . . , 0) (m zeros); it will always be clear from context whether
0 refers to the real number zero or the m-tuple of zeros. In elementary calculus, we
usually focus on the case when $m = 2$ or $m = 3$; e.g. when $m = 2$,
\[ \mathbb{R}^2 = \mathbb{R} \times \mathbb{R} = \{ (x_1, x_2) ; x_1, x_2 \in \mathbb{R} \}, \]
and in this case, the zero vector is “0” $= (0, 0)$.
Given any x = (x1 , . . . , xm ) and y = (y1 , . . . , ym ) in Rm and real number a, we
define
x + y := (x1 + y1 , . . . , xm + ym ) and a x := (ax1 , . . . , axm ).
We also define
−x := (−x1 , . . . , −xm ).
With these definitions, observe that
x + y = (x1 + y1 , . . . , xm + ym ) = (y1 + x1 , . . . , ym + xm ) = y + x,
and
x + 0 = (x1 + 0, . . . , xm + 0) = (x1 , . . . , xm ) = x
and similarly, 0+x = x. These computations prove properties (A1) and (A3) below,
and you can check that the following further properties of addition are satisfied:
Addition satisfies
(A1) x + y = y + x; (commutative law)
(A2) (x + y) + z = x + (y + z); (associative law)
(A3) there is an element 0 such that x + 0 = x = 0 + x; (additive identity)
(A4) for each x there is a −x such that
x + (−x) = 0 and (−x) + x = 0. (additive inverse)
Of course, we usually write x + (−y) as x − y.
Multiplication by real numbers satisfies
(M1) 1 · x = x; (multiplicative identity)
(M2) (a b) x = a (bx); (associative law)
and finally, addition and multiplication are related by
(D) a(x + y) = ax + ay and (a + b)x = ax + bx. (distributive law)
We remark that any set, say with elements denoted by x, y, z, . . ., called vec-
tors, with an operation of “+” and an operation of multiplication by real numbers
that satisfy properties (A1) – (A4), (M1) – (M2), and (D), is called a real vector
space. If the scalars a, b, 1 in (M1) – (M2) and (D) are elements of a field F, then
we say that the vector space is an F vector space or a vector space over F. In
particular, Rm is a real vector space.
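The vector space laws above can be spot-checked on a computer. The following is a minimal sketch, not a proof: it models vectors in $\mathbb{R}^3$ as Python tuples of exact rationals, and the helper names `add`, `scale`, `neg`, and `zero` are our own illustrative choices, not notation from the text. The sample vectors and scalars are arbitrary.

```python
from fractions import Fraction

def add(x, y):
    """Componentwise sum x + y of two m-tuples."""
    assert len(x) == len(y), "vectors must have the same length m"
    return tuple(xj + yj for xj, yj in zip(x, y))

def scale(a, x):
    """Scalar multiple a x of an m-tuple x."""
    return tuple(a * xj for xj in x)

def neg(x):
    """Additive inverse -x."""
    return tuple(-xj for xj in x)

def zero(m):
    """The m-tuple of zeros."""
    return (Fraction(0),) * m

# Arbitrary sample vectors in R^3 and sample scalars (exact rationals,
# so that equality tests are not affected by rounding).
x = (Fraction(1, 2), Fraction(-3), Fraction(2))
y = (Fraction(4), Fraction(1, 3), Fraction(0))
z = (Fraction(-1), Fraction(5), Fraction(7, 2))
a, b = Fraction(2, 3), Fraction(-5)

checks = {
    "A1": add(x, y) == add(y, x),
    "A2": add(add(x, y), z) == add(x, add(y, z)),
    "A3": add(x, zero(3)) == x == add(zero(3), x),
    "A4": add(x, neg(x)) == zero(3) == add(neg(x), x),
    "M1": scale(1, x) == x,
    "M2": scale(a * b, x) == scale(a, scale(b, x)),
    "D": (scale(a, add(x, y)) == add(scale(a, x), scale(a, y))
          and scale(a + b, x) == add(scale(a, x), scale(b, x))),
}
print(checks)  # every entry should be True
```

Exact rationals are used instead of floating-point numbers because several of these laws (associativity and distributivity in particular) can fail by a rounding error in floating-point arithmetic.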
2.8.2. Inner products. We now review inner products, also called dot prod-
ucts in elementary calculus. We all probably know that given any two vectors
x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) in R3 , the dot product x · y is the number
x · y = x1 y1 + x2 y2 + x3 y3 .
We generalize this to $\mathbb{R}^m$ as follows: If $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$, then
we define the inner product (also called the dot product or scalar product)
$\langle x, y \rangle$ as the real number
\[ \langle x, y \rangle := x_1 y_1 + x_2 y_2 + \cdots + x_m y_m = \sum_{j=1}^{m} x_j y_j. \]

It is also common to denote $\langle x, y \rangle$ by $x \cdot y$ or $(x, y)$, but we prefer the angle bracket
notation $\langle x, y \rangle$, which is popular in physics, because the dot “·” can be confused
with multiplication and the parentheses “( , )” can be confused with an ordered pair.
In the following theorem we summarize some of the main properties of $\langle \cdot, \cdot \rangle$.
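Theorem 2.39. For any vectors $x, y, z$ in $\mathbb{R}^m$ and real number $a$,
(i) $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ if and only if $x = 0$.
(ii) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$ and $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$.
(iii) $\langle a x, y \rangle = a \langle x, y \rangle$ and $\langle x, a y \rangle = a \langle x, y \rangle$.
(iv) $\langle x, y \rangle = \langle y, x \rangle$.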
Proof. To prove (i), just note that
\[ \langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_m^2 \]
and $x_j^2 \ge 0$ for each $j$. If $\langle x, x \rangle = 0$, then as the only way a sum of nonnegative
numbers is zero is that each number is zero, we must have $x_j^2 = 0$ for each $j$. Hence,
every $x_j = 0$ and therefore, $x = (x_1, \ldots, x_m) = 0$. Conversely, if $x = 0$, that is,
$x_1 = 0, \ldots, x_m = 0$, then of course $\langle x, x \rangle = 0$ too. This concludes the proof of (i).
To prove (ii), we just compute:
\[ \langle x + y, z \rangle = \sum_{j=1}^{m} (x_j + y_j) z_j = \sum_{j=1}^{m} (x_j z_j + y_j z_j) = \langle x, z \rangle + \langle y, z \rangle. \]
The other identity $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$ is proved similarly. The proofs of (iii)
and (iv) are also simple computations, so we leave their proofs to the reader. □

We remark that any real vector space $V$ with an operation that assigns to
every two vectors $x$ and $y$ in $V$ a real number $\langle x, y \rangle$ satisfying properties (i) – (iv)
of Theorem 2.39 is called a real inner product space and the operation $\langle \cdot, \cdot \rangle$ is
called an inner product on $V$. In particular, $\mathbb{R}^m$ is a real inner product space.
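As an illustration of Theorem 2.39, here is a small sketch that computes the inner product on tuples and tests properties (i) through (iv) on one arbitrary choice of vectors. The function names `inner`, `add`, and `scale` are our own, and a check on sample values is of course not a proof.

```python
from fractions import Fraction

def inner(x, y):
    """Inner product <x, y> = x_1 y_1 + ... + x_m y_m."""
    assert len(x) == len(y), "vectors must have the same length m"
    return sum(xj * yj for xj, yj in zip(x, y))

def add(x, y):
    """Componentwise sum x + y."""
    return tuple(xj + yj for xj, yj in zip(x, y))

def scale(a, x):
    """Scalar multiple a x."""
    return tuple(a * xj for xj in x)

# Arbitrary sample data in R^3 (exact rationals).
x = (Fraction(1), Fraction(-2), Fraction(3))
y = (Fraction(4), Fraction(0), Fraction(-1, 2))
z = (Fraction(2, 3), Fraction(5), Fraction(1))
a = Fraction(-7, 4)

assert inner(x, x) >= 0                                    # (i)
assert inner((0, 0, 0), (0, 0, 0)) == 0                    # (i), the case x = 0
assert inner(add(x, y), z) == inner(x, z) + inner(y, z)    # (ii)
assert inner(x, add(y, z)) == inner(x, y) + inner(x, z)    # (ii)
assert inner(scale(a, x), y) == a * inner(x, y)            # (iii)
assert inner(x, scale(a, y)) == a * inner(x, y)            # (iii)
assert inner(x, y) == inner(y, x)                          # (iv)

print(inner(x, y))  # 1*4 + (-2)*0 + 3*(-1/2) = 5/2
```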

2.8.3. The norm in $\mathbb{R}^m$. Recall that the length of a vector $x = (x_1, x_2, x_3)$
in $\mathbb{R}^3$ is just the distance of the point $x$ from the origin, or
\[ |x| = \sqrt{x_1^2 + x_2^2 + x_3^2}. \]
We generalize these considerations as follows. The length or norm of a vector
$x = (x_1, \ldots, x_m)$ in $\mathbb{R}^m$ is by definition the nonnegative real number
\[ |x| := \sqrt{x_1^2 + \cdots + x_m^2} = \sqrt{\langle x, x \rangle} \ge 0. \]
We interpret the norm $|x|$ as the length of the vector $x$, or the distance of $x$ from
the origin $0$. In particular, the squared length $|x|^2$ of the vector $x$ is given by
\[ |x|^2 = \langle x, x \rangle. \]

Warning: For $m > 1$, $|x|$ does not mean the absolute value of a real number $x$; it
means the norm of a vector $x$. However, if $m = 1$, then “norm” and “absolute value”
are the same, because for $x = x_1 \in \mathbb{R}^1 = \mathbb{R}$, the above definition of norm is $\sqrt{x_1^2}$,
which is exactly the absolute value of $x_1$ according to Problem 7 in Exercises 2.7.
The following inequality relates the norm and the inner product. It is commonly
called the Schwarz inequality or Cauchy-Schwarz inequality after Hermann
Schwarz (1843–1921) who stated it for integrals in 1885 and Augustin Cauchy
(1789–1857) who stated it for sums in 1821. However, (see [94] for the history)
it perhaps should be called the Cauchy-Bunyakovskiı̆-Schwarz inequality be-
cause Viktor Bunyakovskiı̆ (1804–1889), a student of Cauchy, published a related
inequality 25 years before Schwarz. (Note: There is no “t” before the “z.”)

Theorem 2.40 (Schwarz inequality). For any vectors $x, y$ in $\mathbb{R}^m$, we have
\[ |\langle x, y \rangle| \le |x| \, |y| \qquad \text{(Schwarz inequality)}. \]


Figure 2.8. The projection of $x$ onto $y$ is $\frac{\langle x, y \rangle}{|y|^2} y$ and the projection
of $x$ onto the orthogonal complement of $y$ is $x - \frac{\langle x, y \rangle}{|y|^2} y$.

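Proof. If $y = 0$, then both sides of Schwarz’s inequality are zero, so we may
assume that $y \ne 0$. Taking the squared length of the vector $x - \frac{\langle x, y \rangle}{|y|^2} y$, we get
\begin{align*}
0 \le \left| x - \frac{\langle x, y \rangle}{|y|^2} y \right|^2
  &= \left\langle x - \frac{\langle x, y \rangle}{|y|^2} y,\; x - \frac{\langle x, y \rangle}{|y|^2} y \right\rangle \\
  &= \langle x, x \rangle - \frac{\langle x, y \rangle}{|y|^2} \langle x, y \rangle - \frac{\langle x, y \rangle}{|y|^2} \langle y, x \rangle + \frac{\langle x, y \rangle \langle x, y \rangle}{|y|^4} \langle y, y \rangle \\
  &= |x|^2 - \frac{\langle x, y \rangle^2}{|y|^2} - \frac{\langle x, y \rangle^2}{|y|^2} + \frac{\langle x, y \rangle^2}{|y|^2}.
\end{align*}
Cancelling the last two terms, we see that
\[ 0 \le |x|^2 - \frac{|\langle x, y \rangle|^2}{|y|^2} \implies |\langle x, y \rangle|^2 \le |x|^2 |y|^2. \]
Taking square roots proves the Schwarz inequality. As a side remark (skip this if
you’re not interested), the vector $x - \frac{\langle x, y \rangle}{|y|^2} y$ that we took the squared length of
didn’t come out of a hat. Recall from your “multi-variable calculus” or “vector
calculus” course that the projection of $x$ onto $y$ and the projection of $x$ onto the
orthogonal complement of $y$ are given by $\frac{\langle x, y \rangle}{|y|^2} y$ and $x - \frac{\langle x, y \rangle}{|y|^2} y$, respectively; see
Figure 2.8. Thus, all we did above was take the squared length of the projection of
$x$ onto the orthogonal complement of $y$. □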
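To see the projection from the side remark in action, the following sketch computes the projection of $x$ onto $y$, verifies on one example that the remainder is orthogonal to $y$, and then tests the Schwarz inequality on randomly chosen vectors. The names `project` and `schwarz_gap`, the random seed, and the tolerance `1e-9` are assumptions made for this illustration; floating-point arithmetic is used, so the random test allows a small rounding error.

```python
import math
import random

def inner(x, y):
    """Inner product <x, y>."""
    return sum(xj * yj for xj, yj in zip(x, y))

def norm(x):
    """Norm |x| = sqrt(<x, x>)."""
    return math.sqrt(inner(x, x))

def project(x, y):
    """Projection of x onto a nonzero vector y, namely (<x, y>/|y|^2) y."""
    c = inner(x, y) / inner(y, y)
    return tuple(c * yj for yj in y)

def schwarz_gap(x, y):
    """|x| |y| - |<x, y>|, which the Schwarz inequality says is >= 0."""
    return norm(x) * norm(y) - abs(inner(x, y))

# A concrete example in R^2.
x, y = (3.0, 4.0), (1.0, 0.0)
p = project(x, y)                              # projection of x onto y
q = tuple(xj - pj for xj, pj in zip(x, p))     # x minus its projection
print(p, q, inner(q, y))                       # q is orthogonal to y

# Random spot check of the Schwarz inequality in dimensions 1 to 6.
rng = random.Random(0)
for _ in range(1000):
    m = rng.randint(1, 6)
    u = [rng.uniform(-10, 10) for _ in range(m)]
    v = [rng.uniform(-10, 10) for _ in range(m)]
    assert schwarz_gap(u, v) >= -1e-9          # tolerance for rounding
```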

In the following theorem, we list some of the main properties of the norm | · |.

Theorem 2.41. For any vectors x, y in Rm and real number a,


(i) |x| ≥ 0 and |x| = 0 if and only if x = 0.
(ii) |a x| = |a| |x|.
(iii) |x + y| ≤ |x| + |y| (triangle inequality).
(iv) | |x| − |y| | ≤ |x ± y| ≤ |x| + |y|.

Proof. (i) follows from Property (i) of Theorem 2.39. To prove (ii), observe
that
\[ |ax| = \sqrt{\langle ax, ax \rangle} = \sqrt{a^2 \langle x, x \rangle} = |a| \sqrt{\langle x, x \rangle} = |a| \, |x|, \]
and therefore $|ax| = |a| \, |x|$. To prove the triangle inequality, we compute
\begin{align*}
|x + y|^2 = \langle x + y, x + y \rangle &= |x|^2 + \langle x, y \rangle + \langle y, x \rangle + |y|^2 \\
  &= |x|^2 + 2 \langle x, y \rangle + |y|^2 \\
  &\le |x|^2 + 2 |\langle x, y \rangle| + |y|^2 \\
  &\le |x|^2 + 2 |x| \, |y| + |y|^2,
\end{align*}
where we used the Schwarz inequality at the last step. Thus,
\[ |x + y|^2 \le (|x| + |y|)^2. \]
Taking the square root of both sides proves the triangle inequality.
The second half of (iv) follows from the triangle inequality:
|x ± y| = |x + (±1)y| ≤ |x| + |(±1)y| = |x| + |y|.
To prove the first half | |x| − |y| | ≤ |x ± y| we use the triangle inequality to get
|x| − |y| = |(x − y) + y| − |y| ≤ |x − y|+|y| − |y| = |x − y|
(2.30) =⇒ |x| − |y| ≤ |x − y|.
Switching the letters x and y in (2.30), we get |y| − |x| ≤ |y − x| or −(|x| − |y|) ≤
|x − y|. Combining this with (2.30), we see that
|x| − |y| ≤ |x − y| and − (|x| − |y|) ≤ |x − y| =⇒ | |x| − |y| | ≤ |x − y|,
where we used the definition of absolute value of the real number |x|−|y|. Replacing
y with −y and using that | − y| = |y|, we get | |x| − |y| | ≤ |x + y|. This finishes the
proof of (iv). 
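The properties in Theorem 2.41 are easy to test numerically. The sketch below uses one hand-picked pair of vectors in $\mathbb{R}^2$ whose norms happen to be integers; the helper names are ours and the example is only a sanity check, not a substitute for the proof.

```python
import math

def norm(x):
    """Norm |x| = sqrt(x_1^2 + ... + x_m^2)."""
    return math.sqrt(sum(xj * xj for xj in x))

def add(x, y):
    return tuple(xj + yj for xj, yj in zip(x, y))

def sub(x, y):
    return tuple(xj - yj for xj, yj in zip(x, y))

def scale(a, x):
    return tuple(a * xj for xj in x)

x, y, a = (3.0, 4.0), (-5.0, 12.0), -2.0

assert norm(x) == 5.0 and norm(y) == 13.0
assert math.isclose(norm(scale(a, x)), abs(a) * norm(x))   # (ii)
assert norm(add(x, y)) <= norm(x) + norm(y)                # (iii)
assert abs(norm(x) - norm(y)) <= norm(sub(x, y))           # (iv), minus sign
assert abs(norm(x) - norm(y)) <= norm(add(x, y))           # (iv), plus sign

print(norm(add(x, y)), norm(x) + norm(y))
```

Here $|x + y| = \sqrt{260}$ is strictly smaller than $|x| + |y| = 18$, which is the typical situation when $x$ and $y$ do not point in the same direction.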
We remark that any real vector space V with an operation that assigns to every
vector x in V a nonnegative real number |x|, such that | · | satisfies properties (i) –
(iii) of Theorem 2.41 is called a real normed space and the operation | · | is called
a norm on V . In particular, Rm is a real normed space. The exercises explore
different norms on Rm .
In analogy with the distance between two real numbers, we define the distance
between two vectors x and y in Rm to be the number
|x − y|.
In particular, the triangle inequality implies that given any other vector z, we have
|x − y| = |(x − z) + (z − y)| ≤ |x − z| + |z − y|,
that is,
(2.31) |x − y| ≤ |x − z| + |z − y|.
This inequality is the “genuine” triangle inequality since it represents the geometrically
intuitive fact that the distance between two points $x$ and $y$ is no greater than
the distance traversed by going from $x$ to $z$ and then from $z$ to $y$; see Figure 2.9.

Figure 2.9. Why the “genuine” triangle inequality (2.31) is obvious.

Finally, we remark that the norm $|\cdot|$ on $\mathbb{R}^m$ is sometimes called the ball norm
on $\mathbb{R}^m$ for the following reason. Let $r > 0$ and take $m = 3$. Then $|x| < r$ means
that $\sqrt{x_1^2 + x_2^2 + x_3^2} < r$, or squaring both sides, we get
\[ x_1^2 + x_2^2 + x_3^2 < r^2, \]
which simply says that $x$ is inside the ball of radius $r$. So, if $c \in \mathbb{R}^3$, then $|x - c| < r$
just means that
\[ (x_1 - c_1)^2 + (x_2 - c_2)^2 + (x_3 - c_3)^2 < r^2, \]
which is to say, $x$ is inside the ball of radius $r$ that is centered at the point $c =
(c_1, c_2, c_3)$. Generalizing this notion to $m$-dimensional space, given $c$ in $\mathbb{R}^m$, we call
the set of all $x$ such that $|x - c| < r$, or after squaring both sides,
\[ (x_1 - c_1)^2 + (x_2 - c_2)^2 + \cdots + (x_m - c_m)^2 < r^2, \]
the open ball of radius $r$ centered at $c$. We denote this set by $B_r$, or $B_r(c)$ to
emphasize that the center of the ball is $c$. Therefore,
\[ B_r(c) := \{x \in \mathbb{R}^m ; |x - c| < r\}. \tag{2.32} \]
The set of $x$ with $<$ replaced by $\le$ is called the closed ball of radius $r$ centered
at $c$ and is denoted by $\overline{B}_r$ or $\overline{B}_r(c)$,
\[ \overline{B}_r(c) := \{x \in \mathbb{R}^m ; |x - c| \le r\}. \]

If $m = 1$, then the ball concept reduces to intervals in $\mathbb{R}^1 = \mathbb{R}$:
\begin{align*}
x \in B_r(c) &\iff |x - c| < r \iff -r < x - c < r \\
  &\iff c - r < x < c + r \iff x \in (c - r, c + r).
\end{align*}
Thus, for $m = 1$, $B_r(c)$ is just the open interval centered at $c$ of length $2r$. For
$m = 1$, $\overline{B}_r(c)$ is just the closed interval centered at $c$ of length $2r$.
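The definitions of the open and closed balls translate directly into membership tests. The following sketch is only an illustration with names of our own choosing (`dist`, `in_open_ball`, `in_closed_ball`); it checks a point on the boundary of a ball in $\mathbb{R}^3$ and confirms on a few sample values that for $m = 1$ the open ball is the interval $(c - r, c + r)$.

```python
import math

def dist(x, c):
    """Distance |x - c| between two points of R^m."""
    return math.sqrt(sum((xj - cj) ** 2 for xj, cj in zip(x, c)))

def in_open_ball(x, c, r):
    """True when x lies in the open ball B_r(c), that is, |x - c| < r."""
    return dist(x, c) < r

def in_closed_ball(x, c, r):
    """True when x lies in the closed ball, that is, |x - c| <= r."""
    return dist(x, c) <= r

c, r = (1.0, 2.0, 3.0), 5.0
inside = (2.0, 2.0, 3.0)      # at distance 1 from c
boundary = (4.0, 6.0, 3.0)    # at distance exactly 5 from c

print(in_open_ball(inside, c, r), in_closed_ball(inside, c, r))
print(in_open_ball(boundary, c, r), in_closed_ball(boundary, c, r))

# For m = 1 the open ball B_r(c) is the open interval (c - r, c + r).
c1, r1 = 2.0, 3.0
for t in [-1.0, -0.5, 2.0, 4.9, 5.0, 7.0]:
    assert in_open_ball((t,), (c1,), r1) == (c1 - r1 < t < c1 + r1)
```

The boundary point belongs to the closed ball but not to the open ball, which is exactly the difference between $\le$ and $<$ in the two definitions.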
Exercises 2.8.
1. Let $x, y \in \mathbb{R}^m$. Prove that
\[ |x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2 \qquad \text{(parallelogram law)}. \]
Vectors $x$ and $y$ are said to be orthogonal if $\langle x, y \rangle = 0$. Prove that $x$ and $y$ are
orthogonal if and only if
\[ |x + y|^2 = |x|^2 + |y|^2 \qquad \text{(Pythagorean theorem)}. \]
2. (Schwarz’s inequality, Proof II) Here’s another way to prove Schwarz’s inequality.
(a) For any real numbers $a$ and $b$, prove that
\[ ab \le \frac{1}{2} \left( a^2 + b^2 \right). \]
(b) Let $x, y \in \mathbb{R}^m$ with $|x| = 1$ and $|y| = 1$. Using (a), prove that $|\langle x, y \rangle| \le 1$.
(c) Now let $x$ and $y$ be arbitrary nonzero vectors of $\mathbb{R}^m$. Applying (b) to the vectors
$x/|x|$ and $y/|y|$, derive Schwarz’s inequality.

Figure 2.10. Triangle for the law of cosines: vertices $A$, $B$, $C$ with angles $\alpha$, $\beta$, $\gamma$ and opposite sides $a = |C - B|$, $b = |A - C|$, $c = |A - B|$.
3. (Schwarz’s inequality, Proof III) Here’s an “algebraic” proof. Let $x, y \in \mathbb{R}^m$ with
$y \ne 0$ and let $p(t) = |x + ty|^2$ for $t \in \mathbb{R}$. Note that $p(t) \ge 0$ for all $t$.
(a) Show that $p(t)$ can be written in the form $p(t) = a t^2 + 2b t + c$ where $a, b, c$ are
real numbers with $a \ne 0$.
(b) Using the fact that $p(t) \ge 0$ for all $t$, prove the Schwarz inequality. Suggestion:
Write $p(t) = a(t + b/a)^2 + (c - b^2/a)$.
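4. Prove that for any vectors $x$ and $y$ in $\mathbb{R}^m$, we have
\[ 2|x|^2 |y|^2 - 2 \left( \sum_{n=1}^{m} x_n y_n \right)^2 = \sum_{k=1}^{m} \sum_{\ell=1}^{m} (x_k y_\ell - x_\ell y_k)^2 \qquad \text{(Lagrange identity)}, \]
after Joseph-Louis Lagrange (1736–1813). Suggestion: Show that
\begin{align*}
2|x|^2 |y|^2 - 2 \left( \sum_{n=1}^{m} x_n y_n \right)^2
  &= 2 \left( \sum_{k=1}^{m} x_k^2 \right) \left( \sum_{\ell=1}^{m} y_\ell^2 \right) - 2 \left( \sum_{k=1}^{m} x_k y_k \right) \left( \sum_{\ell=1}^{m} x_\ell y_\ell \right) \\
  &= 2 \sum_{k=1}^{m} \sum_{\ell=1}^{m} x_k^2 y_\ell^2 - 2 \sum_{k=1}^{m} \sum_{\ell=1}^{m} x_k y_k x_\ell y_\ell
\end{align*}
and prove that $\sum_{k=1}^{m} \sum_{\ell=1}^{m} (x_k y_\ell - x_\ell y_k)^2$, when expanded out, has the same form.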
5. (Schwarz’s inequality, Proof IV)
(a) Prove the Schwarz inequality from Lagrange’s identity.
(b) Using Lagrange’s identity, prove that equality holds in the Schwarz inequality (that
is, $\left| \sum_{n=1}^{m} x_n y_n \right| = |x| \, |y|$) if and only if $x$ and $y$ are collinear, which is to say, $x = 0$
or $y = c\,x$ for some $c \in \mathbb{R}$.
(c) Now show that equality holds in the triangle inequality (that is, |x + y| = |x| + |y|)
if and only if x = 0 or y = c x for some c ≥ 0.
6. (Laws of trigonometry) In this problem we assume knowledge of the trigonometric
functions; see Section 4.7 for a rigorous development of these functions. By the Schwarz
inequality, given any two nonzero vectors $x, y \in \mathbb{R}^m$, we have $\frac{|\langle x, y \rangle|}{|x| \, |y|} \le 1$. In particular,
there is a unique angle $\theta \in [0, \pi]$ such that $\cos \theta = \frac{\langle x, y \rangle}{|x| \, |y|}$. The number $\theta$ is by definition
the angle between the vectors $x, y$.
(a) Consider the triangle labelled as in Figure 2.10. Prove the following:
\begin{align*}
a^2 &= b^2 + c^2 - 2bc \cos \alpha \\
b^2 &= a^2 + c^2 - 2ac \cos \beta \qquad \text{(Law of cosines)} \\
c^2 &= a^2 + b^2 - 2ab \cos \gamma.
\end{align*}
Suggestion: To prove the last equality, observe that $c^2 = |A - B|^2 = |x - y|^2$ where
$x = A - C$, $y = B - C$. Compute the dot product $|x - y|^2 = \langle x - y, x - y \rangle$.
(b) Using that $\sin^2 \alpha = 1 - \cos^2 \alpha$ and that $a^2 = b^2 + c^2 - 2bc \cos \alpha$, prove that
\[ \frac{\sin^2 \alpha}{a^2} = 4 \, \frac{s(s - a)(s - b)(s - c)}{a^2 b^2 c^2}, \tag{2.33} \]
where $s := (a + b + c)/2$ is called the semiperimeter. From (2.33), conclude that
\[ \frac{\sin \alpha}{a} = \frac{\sin \beta}{b} = \frac{\sin \gamma}{c} \qquad \text{(Law of sines)}. \]
(c) Assume the formula: Area of a triangle $= \frac{1}{2}$ base × height. Use this formula
together with (2.33) to prove that the area of the triangle in Figure 2.10 is
\[ \text{Area} = \sqrt{s(s - a)(s - b)(s - c)} \qquad \text{(Heron’s formula)}, \]
a formula named after Heron of Alexandria (10–75).
7. Given any $x = (x_1, x_2, \ldots, x_m)$ in $\mathbb{R}^m$, define
\[ \|x\|_\infty := \max\{ |x_1|, |x_2|, \ldots, |x_m| \}, \]
called the sup (or supremum) norm . . . of course, we need to show this is a norm.
(a) Show that $\|\cdot\|_\infty$ defines a norm on $\mathbb{R}^m$, that is, $\|\cdot\|_\infty$ satisfies properties (i) – (iii)
of Theorem 2.41.
(b) In $\mathbb{R}^2$, what is the set of all points $x = (x_1, x_2)$ such that $\|x\|_\infty \le 1$? Draw a
picture of this set. Do you see why $\|\cdot\|_\infty$ is sometimes called the box norm?
(c) Show that for any $x$ in $\mathbb{R}^m$,
\[ \|x\|_\infty \le |x| \le \sqrt{m} \, \|x\|_\infty, \]
where $|x|$ denotes the usual “ball norm” of $x$.
(d) Let $\overline{B}_r$ denote the closed ball in $\mathbb{R}^m$ of radius $r$ centered at the origin (the set
of points $x$ in $\mathbb{R}^m$ such that $|x| \le r$). Let $\mathrm{Box}_r$ denote the closed ball in $\mathbb{R}^m$ of
radius $r$ in the box norm centered at the origin (the set of points $x$ in $\mathbb{R}^m$ such
that $\|x\|_\infty \le r$). Show that
\[ \overline{B}_1 \subseteq \mathrm{Box}_1 \subseteq \overline{B}_{\sqrt{m}}. \]
When $m = 2$, give a “proof by picture” of these set inclusions by drawing the
three sets $\overline{B}_1$, $\mathrm{Box}_1$, and $\overline{B}_{\sqrt{2}}$.
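Before attempting Problem 7, it can help to experiment. The sketch below compares the sup norm with the ball norm on random vectors; it is a numerical experiment under our own choices (function names, random seed, tolerance `1e-12`) and proves nothing, but it suggests why the constants $1$ and $\sqrt{m}$ appear in part (c).

```python
import math
import random

def ball_norm(x):
    """The usual norm |x| = sqrt(x_1^2 + ... + x_m^2)."""
    return math.sqrt(sum(xj * xj for xj in x))

def sup_norm(x):
    """The sup norm: the largest of |x_1|, ..., |x_m|."""
    return max(abs(xj) for xj in x)

x = (3.0, -4.0)
print(sup_norm(x), ball_norm(x), math.sqrt(2) * sup_norm(x))

rng = random.Random(1)
for _ in range(1000):
    m = rng.randint(1, 8)
    v = [rng.uniform(-10, 10) for _ in range(m)]
    low, mid, high = sup_norm(v), ball_norm(v), math.sqrt(m) * sup_norm(v)
    # Small tolerance because floating-point square roots are rounded.
    assert low <= mid + 1e-12 and mid <= high + 1e-12
```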

2.9. The complex number system


Imagine a world in which we could not solve the equation $x^2 - 2 = 0$. This
is a rational numbers only world. Such a world is a world where the length of the
diagonal of a unit square would not make sense; a very poor world indeed! Imagine
now a world in which every time we tried to solve a quadratic equation such as
$x^2 + 1 = 0$, we got “stuck” and could not proceed further. This would slow
the progress of mathematics enormously. The complex number system (introducing
“imaginary numbers”)¹² alleviates this potential stumbling block to mathematics
and also to science . . . in fact, complex numbers are necessary to describe nature.¹³

¹² The imaginary number is a fine and wonderful resource of the human spirit, almost an
amphibian between being and not being. Gottfried Leibniz (1646–1716) [141].
¹³ Furthermore, the use of complex numbers is in this case not a calculational trick of applied
mathematics but comes close to being a necessity in the formulation of the laws of quantum
mechanics ... It is difficult to avoid the impression that a miracle confronts us here. Nobel
prize winner Eugene Wigner (1902–1995) responding to the “miraculous” appearance of complex
numbers in the formulation of quantum mechanics [161, p. 208], [244], [245].
2.9.1. Definition of complex numbers. The complex number system is
actually very easy to define; it’s really just $\mathbb{R}^2$! The complex number system
$\mathbb{C}$ is the set $\mathbb{R}^2$ together with the following rules of arithmetic: If $z = (a, b)$ and
w = (c, d), then we already know how to add two such complex numbers:
z + w := (a, b) + (c, d) = (a + c, b + d);
the new ingredient is multiplication, which is defined by
z · w = (a, b) · (c, d) := (ac − bd, ad + bc).

In summary, $\mathbb{C}$ as a set is just $\mathbb{R}^2$, with the usual addition structure but with a
special multiplication. Of course, we also define $-z = (-a, -b)$ and we write $0$ for
$(0, 0)$. Finally, if $z = (a, b) \ne 0$ (that is, $a \ne 0$ or $b \ne 0$), then we define
\[ z^{-1} := \left( \frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2} \right). \tag{2.34} \]
Theorem 2.42. The complex numbers form a field with $(0, 0)$ (denoted henceforth
by $0$) and $(1, 0)$ (denoted henceforth by $1$) being the additive and multiplicative
identities, respectively.
Proof. If z, w, u ∈ C, then we need to show that addition satisfies
(A1) z + w = w + z; (commutative law)
(A2) (z + w) + u = z + (w + u); (associative law)
(A3) z + 0 = z = 0 + z; (additive identity)
(A4) for each complex number z,
z + (−z) = 0 and (−z) + z = 0; (additive inverse)
multiplication satisfies
(M1) z · w = w · z; (commutative law)
(M2) (z · w) · u = z · (w · u); (associative law)
(M3) 1 · z = z = z · 1; (multiplicative identity)
(M4) for $z \ne 0$, we have
$z \cdot z^{-1} = 1$ and $z^{-1} \cdot z = 1$; (multiplicative inverse)
and finally, multiplication and addition are related by
(D) z · (w + u) = (z · w) + (z · u). (distributive law)
The proofs of all these properties are very easy and merely involve using the defini-
tion of addition and multiplication, so we leave all the proofs to the reader, except
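for (M4). Here, by definition of multiplication,
\begin{align*}
z \cdot z^{-1} &= (a, b) \cdot \left( \frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2} \right) \\
  &= \left( a \cdot \frac{a}{a^2 + b^2} - b \cdot \frac{-b}{a^2 + b^2},\; a \cdot \frac{-b}{a^2 + b^2} + b \cdot \frac{a}{a^2 + b^2} \right) \\
  &= \left( \frac{a^2}{a^2 + b^2} + \frac{b^2}{a^2 + b^2},\; 0 \right) = (1, 0) = 1.
\end{align*}
Similarly, $z^{-1} \cdot z = 1$, and (M4) is proven. □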
In particular, every arithmetic property of $\mathbb{R}$ that follows from the field axioms also holds for $\mathbb{C}$.
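The definition of $\mathbb{C}$ as $\mathbb{R}^2$ with a special multiplication can be tried out directly with ordered pairs. The following sketch is our own illustration: the function names `c_add`, `c_mul`, and `c_inv` are hypothetical, and exact rationals are used so that the check of (M4) is not disturbed by rounding. As a cross-check, the product is also compared with Python's built-in complex numbers.

```python
from fractions import Fraction

def c_add(z, w):
    """(a, b) + (c, d) := (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """(a, b) . (c, d) := (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def c_inv(z):
    """Inverse of a nonzero pair (a, b), following formula (2.34)."""
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("0 = (0, 0) has no multiplicative inverse")
    return (Fraction(a) / n, Fraction(-b) / n)

z, w = (3, 4), (1, -2)
print(c_add(z, w))           # (4, 2)
print(c_mul(z, w))           # (11, -2)
print(c_mul(z, c_inv(z)))    # the pair (1, 0), printed with Fraction entries

# The built-in complex type multiplies the same way.
assert complex(*z) * complex(*w) == complex(*c_mul(z, w))
```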
2.9.2. The number i. In high school, the complex numbers are introduced
in a slightly different manner, which we now describe.
First, we consider R as a subset of C by the identification of the real number a
with the ordered pair (a, 0), in other words, for sake of notational convenience, we
do not make a distinction between the complex number (a, 0) and the real number
a. Observe that by definition of addition and multiplication of complex numbers,
(a, 0) + (b, 0) = (a + b, 0)
and
(a, 0) · (b, 0) = (a · b − 0 · 0, a · 0 + 0 · b) = (ab, 0),
which is to say, “a + b = a + b” and “a · b = a · b” under our identification. Thus, our
identification of R preserves the arithmetic operations of R. Moreover, we know
how to multiply real numbers and elements of R2 : a(x, y) = (ax, ay). This also
agrees with our complex number multiplication:
(a, 0) · (x, y) = (a · x − 0 · y, a · y + 0 · x) = (ax, ay).
In summary, our identification of R as first components of ordered pairs in C does
not harm any of the additive or multiplicative structures of C.
The number i, notation introduced in 1777 by Euler [171], is by definition the
complex number
i := (0, 1).
Then using the definition of multiplication of complex numbers, we have
\[ i^2 = i \cdot i = (0, 1) \cdot (0, 1) = (0 \cdot 0 - 1 \cdot 1, 0 \cdot 1 + 1 \cdot 0) = (-1, 0) \implies i^2 = -1, \]
where we used our identification of $(-1, 0)$ with $-1$. Thus, the complex number
$i = (0, 1)$ is the “imaginary unit” that you learned about in high school; however,
our definition of $i$ avoids the mysterious square root of $-1$ you probably
encountered.¹⁴ Moreover, given any complex number $z = (a, b)$, by definition of addition,
multiplication, i, and our identification of R as a subset of C, we see that
a + b i = a + (b, 0) · (0, 1) = a + (b · 0 − 0 · 1, b · 1 + 0 · 0)
= (a, 0) + (0, b) = (a, b) = z.
Thus, z = a + b i, just as you were taught in high school! By commutativity, we
also have z = a + i b. We call a the real part of z and b the imaginary part of
z, and we denote them by a = Re z and b = Im z, so that

z = Re z + i Im z.

From this point on, we shall typically use the notation z = a + b i = a + i b instead
of z = (a, b) for complex numbers.
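The two computations of this subsection, $i^2 = -1$ and $a + b\,i = (a, b)$, can be replayed with ordered pairs. This is a small sketch with helper names of our own (`c_add`, `c_mul`, `embed`); the values $a = 5$ and $b = -7$ are arbitrary.

```python
def c_add(z, w):
    """(a, b) + (c, d) := (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """(a, b) . (c, d) := (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def embed(a):
    """Identify the real number a with the complex number (a, 0)."""
    return (a, 0)

i = (0, 1)

# i^2 = (0, 1) . (0, 1) = (-1, 0), which is identified with -1.
assert c_mul(i, i) == embed(-1)

# a + b i = (a, 0) + (b, 0) . (0, 1) = (a, b).
a, b = 5, -7
z = c_add(embed(a), c_mul(embed(b), i))
assert z == (a, b)
print(c_mul(i, i), z)
```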

¹⁴ That this subject [imaginary numbers] has hitherto been surrounded by mysterious obscurity,
is to be attributed largely to an ill adapted notation. If, for example, +1, -1, and the square
root of -1 had been called direct, inverse and lateral units, instead of positive, negative and
imaginary (or even impossible), such an obscurity would have been out of the question. Carl Friedrich
Gauss (1777–1855).
2.9.3. Absolute values and complex conjugates. We define the absolute
value of a complex number $z = (a, b)$ (or length) as the usual length (or norm) of
$(a, b)$:
\[ |z| := |(a, b)| = \sqrt{a^2 + b^2}. \]
Thus, $|z|^2 = a^2 + b^2$. We define the complex conjugate of $z = (a, b)$ as the
complex number $\overline{z} = (a, -b)$, that is, $\overline{z} = a - bi$. Note that if $|z| \ne 0$, then
according to the definition (2.34) of $z^{-1}$, we have
\[ z^{-1} = \frac{\overline{z}}{|z|^2}, \]
so the inverse of a complex number can be expressed in terms of the complex
conjugate and absolute value. In the next theorem we list other properties of the
complex conjugate.
Theorem 2.43. If $z$ and $w$ are complex numbers, then
(1) $\overline{\overline{z}} = z$;
(2) $\overline{z + w} = \overline{z} + \overline{w}$ and $\overline{zw} = \overline{z} \cdot \overline{w}$;
(3) $z \overline{z} = |z|^2$;
(4) $z + \overline{z} = 2 \operatorname{Re} z$ and $z - \overline{z} = 2i \operatorname{Im} z$.
Proof. The proofs of all these properties are very easy and merely involve
using the definition of complex conjugation, so we leave all the proofs to the reader,
except the last two. We have
\begin{align*}
z \overline{z} = (a + bi)(a - bi) &= a^2 + a(-bi) + (bi)a + (bi)(-bi) \\
  &= a^2 - abi + abi - b^2 \cdot i^2 = a^2 - b^2 \cdot (-1) = a^2 + b^2 = |z|^2.
\end{align*}
To prove (4), observe that $z + \overline{z} = a + bi + (a - bi) = 2a = 2 \operatorname{Re} z$ and $z - \overline{z} =
a + bi - (a - bi) = 2bi = 2i \operatorname{Im} z$. □
In the final theorem of this section we list various properties of absolute value.
Theorem 2.44. For any complex numbers $z, w$, we have
(i) $|z| \ge 0$ and $|z| = 0$ if and only if $z = 0$;
(ii) $|\overline{z}| = |z|$;
(iii) $|\operatorname{Re} z| \le |z|$;
(iv) $|z w| = |z| \, |w|$;
(v) $|z + w| \le |z| + |w|$. (triangle inequality)
Proof. Properties (i) and (v) follow from the properties of the norm on $\mathbb{R}^2$.
Properties (ii) and (iii) are straightforward to check. To prove (iv), note that
\[ |z w|^2 = zw \, \overline{zw} = z w \, \overline{z} \, \overline{w} = z \overline{z} \, w \overline{w} = |z|^2 |w|^2. \]
Taking the square root of both sides shows that $|z w| = |z| \, |w|$. □
An induction argument shows that for any $n$ complex numbers $z_1, \ldots, z_n$,
\[ |z_1 z_2 \cdots z_n| = |z_1| \, |z_2| \cdots |z_n|. \]
In particular, setting $z_1 = z_2 = \cdots = z_n = z$, we see that
\[ |z^n| = |z|^n. \]
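Python's built-in `complex` type provides `conjugate()` and `abs()`, so the identities of Theorems 2.43 and 2.44 can be spot-checked on sample values. The numbers $z = 3 + 4i$ and $w = 1 - 2i$ below are arbitrary choices (Python writes $i$ as `j`), and `math.isclose` is used wherever a square root introduces rounding.

```python
import math

z, w = 3 + 4j, 1 - 2j

# Theorem 2.43
assert z.conjugate().conjugate() == z                           # (1)
assert (z + w).conjugate() == z.conjugate() + w.conjugate()     # (2)
assert (z * w).conjugate() == z.conjugate() * w.conjugate()     # (2)
assert z * z.conjugate() == abs(z) ** 2                         # (3)
assert z + z.conjugate() == 2 * z.real                          # (4)
assert z - z.conjugate() == 2j * z.imag                         # (4)

# Theorem 2.44
assert abs(z.conjugate()) == abs(z)                             # (ii)
assert abs(z.real) <= abs(z)                                    # (iii)
assert math.isclose(abs(z * w), abs(z) * abs(w))                # (iv)
assert abs(z + w) <= abs(z) + abs(w)                            # (v)

# |z^n| = |z|^n for n = 5
assert math.isclose(abs(z ** 5), abs(z) ** 5)

print(abs(z), z * z.conjugate())
```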
Exercises 2.9.
1. Show that $z \in \mathbb{C}$ is a real number if and only if $z = \overline{z}$.
2. If $w$ is a complex root of a polynomial $p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ with real
coefficients (that is, $p(w) = 0$ and each $a_k$ is real), prove that $\overline{w}$ is also a root.
3. If $z \in \mathbb{C}$, prove that there exists a nonnegative real number $r$ and a complex number
$\omega$ with $|\omega| = 1$ such that $z = r \omega$. If $z$ is nonzero, show that $r$ and $\omega$ are uniquely
determined by $z$, that is, if $z = r' \omega'$ where $r' \ge 0$ and $|\omega'| = 1$, then $r' = r$ and $\omega' = \omega$.
The decomposition $z = r \omega$ is called the polar decomposition of $z$. (In Section 4.7
we relate the polar decomposition to the trigonometric functions.)

2.10. Cardinality and “most” real numbers are transcendental


In Section 2.6 we saw that in some sense (in dealing with roots, trig
functions, and logarithms, all objects of practical interest) there are immeasurably more
irrational numbers than there are rational numbers. This raises the question:¹⁵ How
many more? In this section we discuss Cantor’s strange discovery that the rational
numbers have in some sense the same number of elements as the natural numbers
do! The rational numbers are thus said to be countable. It turns out that there
are just as many irrational numbers as there are real numbers; the irrational
and real numbers are said to be uncountable. We shall also discuss algebraic and
transcendental numbers and discuss their countability properties.
2.10.1. Cardinality. Cardinality is simply a mathematical way to say that
two sets have the same number of elements. Two sets A and B are said to have
the same cardinality, if there is a bijection between these two sets. Of course, if
f : A −→ B is a bijection, then g = f −1 : B −→ A is a bijection, so the notion of
cardinality does not depend on “which way the bijection goes”. We think of A and
B as having the same number of elements since the bijection sets up a one-to-one
correspondence between elements of the two sets. A set $A$ is said to be finite if it
is empty or if it has the same cardinality as a set of the form $\mathbb{N}_n := \{1, 2, \ldots, n\}$
for some natural number $n$, in which case we say that $A$ has zero elements or $n$
elements, respectively. If $A$ is not finite, it is said to be infinite.¹⁶ A set is
called countable if it has the same cardinality as a finite set or the set of natural
numbers. To distinguish between finite and infinite countable sets, we call a set
countably infinite if it has the cardinality of the natural numbers. Finally, a set
is uncountable if it is not countable, so the set is not finite and does not have
the cardinality of N. See Figure 2.11 for relationships between finite and countable
sets. If f : N −→ A is a bijection, then A can be listed:
A = {a1 , a2 , a3 , . . .},
where an = f (n) for each n = 1, 2, 3, . . ..
Example 2.29. The integers are countably infinite since the function $f : \mathbb{Z} \to
\mathbb{N}$ defined by
\[ f(n) = \begin{cases} 2n & \text{if } n > 0 \\ 2|n| + 1 & \text{if } n \le 0 \end{cases} \]
is a bijection of $\mathbb{Z}$ onto $\mathbb{N}$.
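The bijection of Example 2.29 is easy to tabulate. The sketch below implements $f$ together with a candidate inverse (the name `f_inv` and its formula are ours, not from the text) and checks on a finite window that the two maps undo each other. A finite check does not prove bijectivity, but it makes the pattern visible: positive integers go to even numbers and nonpositive integers go to odd numbers.

```python
def f(n):
    """The map Z -> N of Example 2.29."""
    return 2 * n if n > 0 else 2 * abs(n) + 1

def f_inv(k):
    """Candidate inverse N -> Z: even k come from k/2, odd k from -(k-1)/2."""
    return k // 2 if k % 2 == 0 else -((k - 1) // 2)

# The integers -5, ..., 5 are sent exactly onto 1, ..., 11.
window = range(-5, 6)
print({n: f(n) for n in window})
assert sorted(f(n) for n in window) == list(range(1, 12))

# f_inv undoes f, and f undoes f_inv, on the ranges tested.
assert all(f_inv(f(n)) == n for n in range(-1000, 1001))
assert all(f(f_inv(k)) == k for k in range(1, 2002))
```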
¹⁵ In mathematics the art of proposing a question must be held of higher value than solving
it. (A thesis defended at Cantor’s doctoral examination.) Georg Cantor (1845–1918).
¹⁶ Even in the realm of things which do not claim actuality, and do not even claim possibility,
there exist beyond dispute sets which are infinite. Bernard Bolzano (1781–1848).
Figure 2.11. Infinite sets are uncountable or countably infinite
and countable sets are countably infinite or finite. Infinite sets and
countable sets intersect in the countably infinite sets.

If two sets A and B have the same cardinality, we sometimes write card(A) =
card(B). One can check that if card(A) = card(B) and card(B) = card(C), then
card(A) = card(C). Thus, cardinality satisfies a “transitive law”.
It is “obvious” that a set cannot have both $n$ elements and $m$ elements where
$n \ne m$, but this still needs proof! The proof is based on the “pigeonhole principle”,
which can be interpreted as saying that if $m > n$ and $m$ pigeons are put into $n$
holes, then at least two pigeons must be put into the same hole.
Theorem 2.45 (Pigeonhole principle). If $m > n$, then there does not exist
an injection from $\mathbb{N}_m$ into $\mathbb{N}_n$.
Proof. We proceed by induction on $n$. Let $m > 1$ and let $f : \mathbb{N}_m \to \{1\}$ be any
function. Then $f(m) = f(1) = 1$ with $m \ne 1$, so $f$ is not an injection.
Assume that our theorem is true for $n$; we shall prove it true for $n + 1$. Let
$m > n + 1$ and let $f : \mathbb{N}_m \to \mathbb{N}_{n+1}$. We shall prove that $f$ is not an injection.
First of all, if the range of $f$ is contained in $\mathbb{N}_n \subseteq \mathbb{N}_{n+1}$, then we can consider $f$
as a function into $\mathbb{N}_n$, and hence by the induction hypothesis, $f$ is not an injection. So
assume that $f(a) = n + 1$ for some $a \in \mathbb{N}_m$. If there is another element of $\mathbb{N}_m$
whose image is $n + 1$, then $f$ is not an injection, so we may assume that $a$ is the only
element of $\mathbb{N}_m$ whose image is $n + 1$. Then $f(k) \in \mathbb{N}_n$ for $k \ne a$, so we can define
a function $g : \mathbb{N}_{m-1} \to \mathbb{N}_n$ by “skipping” $f(a) = n + 1$:
\begin{gather*}
g(1) := f(1),\ g(2) := f(2),\ \ldots,\ g(a - 1) := f(a - 1),\ g(a) := f(a + 1), \\
g(a + 1) := f(a + 2),\ \ldots,\ g(m - 1) := f(m).
\end{gather*}
Since $m > n + 1$, we have $m - 1 > n$, so by the induction hypothesis, $g$ is not an
injection. Two distinct inputs on which $g$ agrees correspond to two distinct inputs on
which $f$ agrees, so the definition of $g$ shows that $f : \mathbb{N}_m \to \mathbb{N}_{n+1}$ cannot be an injection
either, which completes the proof of our theorem. □
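For small values of $m$ and $n$ the pigeonhole principle can be confirmed by brute force: list every function from $\mathbb{N}_m$ into $\mathbb{N}_n$ and keep the injective ones. The sketch below does this with `itertools.product`; the function names are our own, and the comparison with the count $n(n-1)\cdots(n-m+1)$ of injections for $m \le n$ is a standard fact that is not taken from the text.

```python
from itertools import product
from math import perm

def injections(m, n):
    """Yield every injective f : N_m -> N_n as the tuple (f(1), ..., f(m))."""
    for values in product(range(1, n + 1), repeat=m):
        if len(set(values)) == m:     # all m values are distinct
            yield values

def count_injections(m, n):
    return sum(1 for _ in injections(m, n))

for n in range(1, 5):
    for m in range(1, 7):
        # For m > n the pigeonhole principle predicts no injection at all.
        expected = perm(n, m) if m <= n else 0
        assert count_injections(m, n) == expected

print(count_injections(3, 2), count_injections(2, 3))
```

Exhaustive search only covers finitely many cases, which is why the theorem still needs the induction argument given above.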
We now prove that the number of elements of a finite set is unique. We also
prove the “obvious” fact that a countably infinite set is not finite.
Theorem 2.46. The number of elements of a finite set is unique and a countably
infinite set is not finite.
Proof. Suppose that $f : A \to \mathbb{N}_n$ and $g : A \to \mathbb{N}_m$ are bijections where
$m > n$. Then
\[ f \circ g^{-1} : \mathbb{N}_m \to \mathbb{N}_n \]
is a bijection, and hence in particular an injection, an impossibility by the pigeonhole
principle. This proves that the number of elements of a finite set is unique.
Now suppose that $f : A \to \mathbb{N}_n$ and $g : A \to \mathbb{N}$ are bijections. Then,
\[ f \circ g^{-1} : \mathbb{N} \to \mathbb{N}_n \]
is a bijection, so an injection, and so in particular, its restriction to $\mathbb{N}_{n+1} \subseteq \mathbb{N}$ is
an injection. This again is impossible by the pigeonhole principle. □
2.10.2. Basic results on countability. The following is intuitively obvious.
Lemma 2.47. A subset of a countable set is countable.
Proof. Let A be a nonempty subset of a countable set B, where for definite-
ness we assume that B is countably infinite. (The finite case is left to the reader.)
Let f : N −→ B be a bijection. Using the well-ordering principle, we can define
n1 := smallest element of {n ∈ N ; f (n) ∈ A}.
If A ≠ {f (n1 )}, then via well-ordering, we can define
n2 := smallest element of {n ∈ N \ {n1 } ; f (n) ∈ A }.
Note that n1 < n2 (why?). If A ≠ {f (n1 ), f (n2 )}, then we can define
n3 := smallest element of {n ∈ N \ {n1 , n2 } ; f (n) ∈ A }.
Then n1 < n2 < n3 . We can continue this process by induction defining nk+1
as the smallest element in the set {n ∈ N \ {n1 , . . . , nk } ; f (n) ∈ A} as long as
A ≠ {f (n1 ), . . . , f (nk )}.
There are two possibilities: the above process terminates or it continues in-
definitely. If the process terminates, let nm be the last natural number that
can be defined in this process. Then A = {f (n1 ), . . . , f (nm )}, which shows that
g : Nm −→ A defined by g(k) := f (nk ) is a bijection. If the above process can
be continued indefinitely, we can produce an infinite sequence of natural num-
bers n1 < n2 < n3 < n4 < · · · using the above recursive procedure. Since
n1 < n2 < n3 < · · · is an increasing sequence of natural numbers, one can check (for
instance, by induction) that k ≤ nk for all k. We claim that the map h : N −→ A
defined by h(k) := f (nk ) is a bijection. It is certainly injective because f is. To
see that h is surjective, let a ∈ A. Then, because f is surjective, there is an ℓ ∈ N
such that f (ℓ) = a. Since k ≤ nk for every k, by the Archimedean property, there
is a k such that ℓ < nk+1 . We claim that ℓ ∈ {n1 , . . . , nk }. Indeed, if not, then
ℓ ∈ {n ∈ N \ {n1 , n2 , . . . , nk } ; f (n) ∈ A}, so by definition of nk+1 ,
nk+1 := smallest element of {n ∈ N \ {n1 , n2 , . . . , nk } ; f (n) ∈ A } ≤ ℓ,
contradicting that ℓ < nk+1 . Hence, ℓ = nj for some j, so h(j) = f (nj ) = f (ℓ) = a.
This proves that h is surjective and completes our proof. 
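The selection process in this proof can be tried out on a concrete case. The following is only a sketch under stated assumptions: the names `select_indices`, `identity`, and `is_square` are illustrative and not from the text, B is taken to be N with f the identity, and A is taken to be the set of perfect squares. Scanning n = 1, 2, 3, . . . and keeping those n with f(n) ∈ A produces n1 < n2 < n3 < · · · in exactly the order of the recursive definition. If A were finite, the generator would simply never produce more than finitely many pairs.

```python
from itertools import count, islice
from math import isqrt


def select_indices(f, in_A):
    """Yield (k, n_k) for k = 1, 2, ..., where n_k is the k-th smallest
    natural number n with f(n) in A (the recursion used in the proof)."""
    k = 0
    for n in count(1):  # scanning upward always meets the smallest unused index next
        if in_A(f(n)):
            k += 1
            yield k, n


# Hypothetical instance: B = N, f the identity bijection, A = perfect squares.
def identity(n):
    return n


def is_square(x):
    return isqrt(x) ** 2 == x


pairs = list(islice(select_indices(identity, is_square), 6))
print(pairs)                          # → [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25), (6, 36)]
print(all(k <= n for k, n in pairs))  # → True: the inequality k <= n_k from the proof
```

The map h(k) := f(nk) of the proof corresponds to reading off f at the second entry of each pair.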
Theorem 2.48. A finite product of countable sets is countable and a countable
union of countable sets is countable.
Proof. We only consider the product of two countably infinite sets (the other
cases are left to the reader). The countability of the product of more than two
countable sets can be handled by induction. If A and B are countably infinite,
then card(A × B) = card(N × N), so it suffices to show that card(N × N) = card(N).
Let C ⊆ N consist of all natural numbers of the form $2^n 3^m$ where n, m ∈ N. Being
an infinite subset of N, it follows that C is countably infinite (that is, card(C) =
card(N)). Consider the function f : N × N −→ C defined by
\[ f(n, m) := 2^n 3^m. \]
By unique factorization, f is one-to-one, and so N × N has the same cardinality as
the countably infinite set C (that is, card(N × N) = card(C) = card(N)). Thus,
N × N is countable.
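As a sanity check on the unique-factorization step, here is a small sketch that tests injectivity of (n, m) ↦ 2^n 3^m on a finite grid and recovers (n, m) from the value. The helper name `decode` and the grid size 20 are assumptions made for illustration; a finite check is of course not a proof.

```python
def f(n, m):
    """The map N x N -> C from the proof: (n, m) |-> 2^n * 3^m."""
    return 2**n * 3**m


def decode(c):
    """Recover (n, m) from c = 2^n * 3^m by dividing out the 2s and then the 3s."""
    n = 0
    while c % 2 == 0:
        c //= 2
        n += 1
    m = 0
    while c % 3 == 0:
        c //= 3
        m += 1
    if c != 1:
        raise ValueError("not of the form 2^n * 3^m")
    return n, m


grid = [(n, m) for n in range(1, 21) for m in range(1, 21)]
values = {f(n, m) for n, m in grid}
print(len(values) == len(grid))                           # → True: no two pairs collide
print(all(decode(f(n, m)) == (n, m) for n, m in grid))    # → True: (n, m) is recoverable
```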
See Problem 1 for other proofs that N × N is countable.
Let $A = \bigcup_{n=1}^{\infty} A_n$ be a countable union of countable sets $A_n$. Since $A_n$ is
countable, we can list the (distinct) elements of $A_n$:
\[ A_n = \{a_{n1}, a_{n2}, a_{n3}, \ldots\}, \]
and each element a of A is of the form $a = a_{nm}$ for some pair (n, m), which may
not be unique because $a_{nm}$ and $a_{n'm'}$ could be the same for different (n, m) and
(n′, m′). To identify a unique such pair we require that n be the least such number
with $a = a_{nm}$ for some m. This recipe defines a map g : A −→ N × N by
g(a) := (n, m),
which, as the reader can check, is one-to-one. So, A has the same cardinality as
the subset g(A) of the countable set N × N. Since subsets of countable sets are
countable, it follows that A has the same cardinality as a countable set, so A is
countable. 
Example 2.30. (Cf. Example 2.29) As an easy application of this theorem, we
observe that Z = N ∪ {0} ∪ (−N), and since each set on the right is countable, their
union Z is also countable.
2.10.3. Real, rational, and irrational numbers. We now prove that the
rational numbers are countable.
Theorem 2.49. The set of rational numbers is countably infinite.
Proof. Let A := {(m, n) ∈ Z × N ; m and n have no common factors}. Since
a product of countable sets is countable, Z × N is countable and since A is a subset
of a countable set, A is countable. Moreover, A is infinite since, for example, all
numbers of the form (m, 1) belong to A where m ∈ Z. Define f : A −→ Q by
\[ f(m, n) := \frac{m}{n}. \]
This function is a bijection, so card(Q) = card(A) = card(N).
The following is Cantor’s first proof that R is uncountable. (The following proof
is close to, but not exactly, Cantor’s original proof; see [87] for a nice exposition
on his original proof.) His second proof is in Section 3.8.
Theorem 2.50 (Cantor’s first proof ). Any interval of real numbers that is
not empty or consisting of a single point is uncountable.
Proof. Here we are omitting the empty interval and intervals of the form
[a, a] = {a}. Suppose, for sake of contradiction, that there is such a countable
interval: I = {c1 , c2 , . . .}. Let I1 = [a1 , b1 ] ⊆ I, where a1 < b1 , be an interval
in I that does not contain c1 . (To see that such an interval exists, divide the
interval I into three disjoint subintervals. At least one of the three subintervals
does not contain c1 . Choose I1 to be any closed interval in the subinterval that
does not contain c1 .) Now let I2 = [a2 , b2 ] ⊆ I1 be an interval that does not contain
c2 . By induction, we construct a sequence of nested closed and bounded intervals
In = [an , bn ] such that In does not contain cn . By the nested intervals theorem, there is a
point c in every In . By construction, In does not contain cn , so c cannot equal any
cn , which contradicts that {c1 , c2 , . . .} is a list of all the real numbers in I.
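The interval-shrinking step can be sketched on a finite list of candidate points. This is a variant under stated assumptions rather than the exact construction above: the interval is cut into three closed thirds (which may share endpoints, so a point lies in at most two of them), the first third that misses the current point is kept, and exact rational arithmetic is used. The names `avoid` and `nested_intervals` and the sample list are illustrative.

```python
from fractions import Fraction


def avoid(a, b, c):
    """Return a closed third [lo, hi] of [a, b] (with a < b) that does not contain c."""
    third = (b - a) / 3
    for lo in (a, a + third, a + 2 * third):
        hi = lo + third
        if not (lo <= c <= hi):
            return lo, hi
    raise AssertionError("unreachable when a < b: c lies in at most two closed thirds")


def nested_intervals(points, a=Fraction(0), b=Fraction(1)):
    """Build I_1, I_2, ... inside [a, b] with I_k nested and I_k missing points[k-1]."""
    intervals = []
    for c in points:
        a, b = avoid(a, b, c)
        intervals.append((a, b))
    return intervals


points = [Fraction(1, 2), Fraction(1, 3), Fraction(0), Fraction(1), Fraction(2, 9)]
intervals = nested_intervals(points)
a_last, b_last = intervals[-1]
witness = (a_last + b_last) / 2   # lies in every I_k, so it differs from every listed point
print(witness not in points)      # → True
```

For an infinite list the nested intervals theorem plays the role of `witness`: it supplies a point common to all the intervals.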

Example 2.31. With I = R, we see that the set of all real numbers is uncountable.
It follows that $\mathbb{R}^m$ is uncountable for any m ∈ N; in particular, $\mathbb{C} = \mathbb{R}^2$
is uncountable.

Corollary 2.51. The set of irrational numbers in any interval that is not
empty or consisting of a single point is uncountable.
Proof. If the irrationals in such an interval I were countable, then I would be
the union of two countable sets, the irrationals in I and the rationals in I; however,
we know that I is not countable so the irrationals in I cannot be countable. 

In particular, the set of all irrational numbers is uncountable.

2.10.4. Roots of polynomials. We already know that the real numbers are
classified into two disjoint sets, the rational and irrational numbers. There is an-
other important classification into algebraic and transcendental numbers. These
numbers have to do with roots of polynomials, so we begin by discussing some
elementary properties of polynomials. We remark that everything we say in this
subsection and the next is valid for real polynomials (polynomials of a real variable
with real coefficients), but it is convenient to work with complex polynomials.
Let n ≥ 1 and
\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_2 z^2 + a_1 z + a_0, \qquad a_n \neq 0, \tag{2.35} \]
be an n-th degree polynomial with complex coefficients (that is, each $a_k \in \mathbb{C}$).
Lemma 2.52. For any z, a ∈ C, we can write
p(z) − p(a) = (z − a) q(z),
where q(z) is a polynomial of degree n − 1.
Proof. First of all, observe that given any polynomial f (z) and a complex
number b, the “shifted” function f (z + b) is also a polynomial in z of the same
degree as f ; this can be easily proven using the formula (2.35) for a polynomial. In
particular, r(z) = p(z + a) − p(a) is a polynomial of degree n and hence can be written
in the form
\[ r(z) = b_n z^n + b_{n-1} z^{n-1} + \cdots + b_2 z^2 + b_1 z + b_0, \]
where $b_n \neq 0$ (in fact, $b_n = a_n$ but this isn’t needed). Notice that r(0) = 0, which
implies that $b_0 = 0$, so we can write
\[ r(z) = z\, s(z), \quad \text{where } s(z) = b_n z^{n-1} + b_{n-1} z^{n-2} + \cdots + b_2 z + b_1 \]
is a polynomial of degree n − 1. Now replacing z with z − a, we obtain
\[ p(z) - p(a) = r(z - a) = (z - a)\, q(z), \]
where q(z) = s(z − a) is a polynomial of degree n − 1.
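The factorization in Lemma 2.52 can be computed explicitly. The sketch below does not follow the shifting argument of the proof; it uses Horner's scheme (synthetic division) instead, which produces the coefficients of q and the value p(a) in one pass. The function names and the sample polynomial are assumptions made for illustration.

```python
def evaluate(coeffs, z):
    """Evaluate a polynomial given as [a_n, ..., a_1, a_0] at z by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def divide_out(coeffs, a):
    """Return (q, p(a)) with p(z) - p(a) = (z - a) q(z), where q is listed
    highest degree first; the Horner partial values are exactly q's coefficients."""
    partial = []
    value = 0
    for c in coeffs:
        value = value * a + c
        partial.append(value)
    return partial[:-1], partial[-1]


p = [1, 0, -2, 5]                 # p(z) = z^3 - 2z + 5
q, p_at_a = divide_out(p, 2)
print(q, p_at_a)                  # → [1, 2, 2] 9
print(all(evaluate(p, z) - p_at_a == (z - 2) * evaluate(q, z) for z in range(-5, 6)))  # → True
```

Here q has one coefficient fewer than p, in line with the statement that q has degree n − 1.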
Suppose that a ∈ C is a root of p(z), which means p(a) = 0. Then according
to our lemma, we can write p(z) = (z − a)q(z) where q is a polynomial of degree
n − 1 (here we suppress the dependence of q on a). If q(a) = 0, then again by our
lemma, we can write q(z) = (z − a)r(z) where r(z) is a polynomial of degree n − 2.
Thus, $p(z) = (z - a)^2 r(z)$. Continuing this process, which must stop by the
n-th step at the latest (because the degree of a polynomial cannot be negative), we can write
\[ p(z) = (z - a)^k s(z), \]
where s(z) is a polynomial of degree n − k and $s(a) \neq 0$. We say that a is a root of
p(z) of multiplicity k.
Theorem 2.53. Any n-th degree complex polynomial (see the expression (2.35))
has at most n complex roots counting multiplicity.
Proof. The proof is by induction. Certainly this theorem holds for polynomials
of degree 1 (if $p(z) = a_1 z + a_0$ with $a_1 \neq 0$, then p(z) = 0 if and only if
$z = -a_0/a_1$). Suppose that this theorem holds for polynomials of degree n. Let p
be a polynomial of degree n + 1. If p has no roots, then this theorem holds for p,
so suppose that p has a root, call it a. Then by our lemma we can write
p(z) = (z − a)q(z),
where q is a polynomial of degree n. By induction, we know that q has at most n
roots counting multiplicity. The polynomial p has at most one more root (namely
z = a) than q, so p has at most n + 1 roots counting multiplicity.

By the fundamental theorem of algebra (Section 4.8) we’ll see that any poly-
nomial of degree n has exactly n (complex) roots counting multiplicities.

2.10.5. Uncountability of transcendental numbers. We already know
that a rational number is a real number that can be written as a ratio of integers,
and a number is irrational, by definition, if it is not rational. An important class
of numbers that generalizes rational numbers is called the algebraic numbers. To
motivate this generalization, let r = a/b, where a, b ∈ Z with b ≠ 0, be a rational
number. Then r is a root of the polynomial equation
bz − a = 0.
Therefore, any rational number is the root of a (linear or degree 1) polynomial
with integer coefficients. In general, an algebraic number is a complex number
that is a root of a polynomial with integer coefficients. A complex number is called
transcendental if it is not algebraic. (These numbers are transcendental because
as remarked by Euler, they “transcend” the power of algebra to solve for them.)
Since complex numbers contain the real numbers, we can talk about real algebraic
and transcendental numbers also. As demonstrated above, we already know that
every rational number is algebraic. But there are many more algebraic numbers.
Example 2.32. $\sqrt{2}$ and $\sqrt[3]{5}$ are both algebraic, being roots of the polynomials
\[ z^2 - 2 \quad \text{and} \quad z^3 - 5, \]
respectively. On the other hand, the numbers e and π are examples of transcen-
dental numbers; for proofs see [162], [163], [136].
The numbers $\sqrt{2}$ and $\sqrt[3]{5}$ are irrational, so there are irrational numbers that
are algebraic. Thus, the algebraic numbers include all rational numbers and in-
finitely many irrational numbers, namely those irrational numbers that are roots
of polynomials with integer coefficients. Thus, it might seem as if the algebraic
numbers are uncountable, while the transcendental numbers (those numbers that
are not algebraic) are quite small in comparison. This is in fact not the case, as
was discovered by Cantor.
Theorem 2.54 (Uncountability of transcendental numbers). The set
of all algebraic numbers is countable and the set of all transcendental numbers
is uncountable. The same statement holds for real algebraic and transcendental
numbers.
Proof. We only consider the complex case. An algebraic number is by defi-
nition a complex number satisfying a polynomial equation with integer coefficients
\[ a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0, \qquad a_n \neq 0. \]
Here, n ≥ 1 (that is, this polynomial is nonconstant) in order for a solution to exist.
We define the index of this polynomial as the number
n + |an | + |an−1 | + |an−2 | + · · · + |a2 | + |a1 | + |a0 |.
Since |an | ≥ 1, this index is at least 2 for any nonconstant polynomial with integer
coefficients. Given any natural number k there are only a finite number of non-
constant polynomials with index k. For instance, there are only two nonconstant
polynomials of index 2, the polynomials z and −z. There are eight nonconstant
polynomials of index 3:
\[ z^2,\ z + 1,\ z - 1,\ 2z,\ -z^2,\ -z + 1,\ -z - 1,\ -2z, \]
and there are 22 polynomials of index 4:
\[ z^3,\ 2z^2,\ z^2 + z,\ z^2 - z,\ z^2 + 1,\ z^2 - 1,\ 3z,\ 2z + 1,\ 2z - 1,\ z + 2,\ z - 2, \]
together with the negatives of these polynomials, and so forth. Since any polynomial
of a given degree has finitely many roots (Theorem 2.53) and there are only a
finite number of polynomials with a given index, the set Ak , consisting of all roots
(algebraic numbers) of polynomials of index k, is a finite set. Since every polynomial
with integer coefficients has an index, it follows that the set of all algebraic numbers
is the union $\bigcup_{k=2}^{\infty} A_k$. Since a countable union of countable sets is countable, the
set of algebraic numbers is countable!
We know that the set of complex numbers is the disjoint union of the algebraic numbers
and the transcendental numbers. Since the set of complex numbers is uncountable,
the set of transcendental numbers must therefore be uncountable (otherwise C would be
a union of two countable sets, hence countable).
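The counts of polynomials of small index quoted in the proof can be reproduced by brute force. This is a minimal sketch, assuming a polynomial is represented by its coefficient tuple (a_n, . . . , a_0) with the leading coefficient first; the function name `polynomials_of_index` is illustrative.

```python
from itertools import product


def polynomials_of_index(k):
    """All integer coefficient tuples (a_n, ..., a_0) with n >= 1, a_n != 0 and
    n + |a_n| + ... + |a_0| == k."""
    result = []
    for n in range(1, k):          # |a_n| >= 1 forces the degree n to be at most k - 1
        budget = k - n             # required value of |a_n| + ... + |a_0|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == budget:
                result.append(coeffs)
    return result


counts = [len(polynomials_of_index(k)) for k in (2, 3, 4)]
print(counts)                               # → [2, 8, 22]
print(sorted(polynomials_of_index(2)))      # → [(-1, 0), (1, 0)], that is, -z and z
```

Since each such polynomial has finitely many roots, each set A_k is finite, which is all the countability argument needs.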
Exercises 2.10.
1. Here are some countability proofs.
(a) Prove that the set of prime numbers is countably infinite.
(b) Let $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$. Show that $\mathbb{N}_0$ is countably infinite. Define
$f : \mathbb{N}_0 \times \mathbb{N}_0 \to \mathbb{N}_0$ by f (0, 0) = 0 and for (m, n) ≠ (0, 0), define
\[ f(m, n) = 1 + 2 + 3 + \cdots + (m + n) + n = \frac{1}{2}(m + n)(m + n + 1) + n. \]
Can you see (do not prove) that this function counts N0 × N0 as shown in
Figure 2.12? Unfortunately, it is not so easy to show that f is a bijection.
Figure 2.12. Visualization of the map f : N0 × N0 −→ N0 .

(c) Write Q as a countable union of countable sets, so giving another proof that
the rational numbers are countable.
(d) Prove that f : N × N −→ N defined by $f(m, n) = 2^{m-1}(2n - 1)$ is a bijection;
this gives another proof that N × N is countable.
2. Here are some formulas for polynomials in terms of roots.
(a) If c1 , . . . , ck are roots of a polynomial p(z) of degree n (with each root re-
peated according to multiplicity), prove that p(z) = (z − c1 )(z − c2 ) · · · (z −
ck ) q(z), where q(z) is a polynomial of degree n − k.
(b) If k = n, prove that $p(z) = a_n (z - c_1)(z - c_2) \cdots (z - c_n)$ where $a_n$ is the
coefficient of $z^n$ in the formula (2.35) for p(z).
3. Prove that if A is an infinite set, then A has a countably infinite subset.
4. Let X be any set and denote the set of all functions from X into {0, 1} by $\mathbb{Z}_2^X$.
Define a map from the power set of X into $\mathbb{Z}_2^X$ by
\[ f : \mathcal{P}(X) \to \mathbb{Z}_2^X, \qquad X \supseteq A \mapsto f(A) := \chi_A, \]
where χA is the characteristic function of A. Prove that f is a bijection. Conclude
that P(X) has the same cardinality as $\mathbb{Z}_2^X$.
5. Suppose that card(X) = n. Prove that $\operatorname{card}(\mathcal{P}(X)) = 2^n$. Suggestion: There
are many proofs you can come up with; here’s one using the previous problem.
Assuming that X = {0, 1, . . . , n − 1}, which we may (why?), we just have to
prove that $\operatorname{card}(\mathbb{Z}_2^X) = 2^n$. To prove this, define $F : \mathbb{Z}_2^X \to \{0, 1, 2, \ldots, 2^n - 1\}$
as follows: If f : X −→ {0, 1} is a function, then denoting f (k) by $a_k$, define
\[ F(f) := a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + \cdots + a_1 2^1 + a_0. \]
Prove that F is a bijection (Section 2.5 will come in handy).
6. (Cantor’s theorem) This theorem is simple to prove yet profound in nature.
(a) Prove that there can never be a surjection of a set A onto its power set P(A)
(This is called Cantor’s theorem). In particular, card(A) ≠ card(P(A)).
Suggestion: Suppose not and let f be such a surjection. Consider the set
B = {a ∈ A ; a ∉ f (a)} ⊆ A.
Derive a contradiction from the assumption that f is surjective. Cantor’s
theorem shows that by taking power sets one can always get bigger and
bigger sets.
(b) Prove that the set of all subsets of N is uncountable.
(c) From Cantor’s theorem and Problem 4 prove that the set of all sequences of
0’s and 1’s is uncountable. Here, a sequence is just a function from N into
{0, 1}, which can also be thought of as a list (a1 , a2 , a3 , a4 , . . .) where each
ak is either 0 or 1.
7. (Vredenduin’s paradox [234]) Here is another paradox related to Russell’s
paradox. Assume that A = {{a} ; a is a set} is a well-defined set. Let B ⊆ A
be the subset consisting of all sets of the form {a} where a ∈ P(A). Define
g : P(A) −→ B by g(V ) := {V }.
Show that g is a bijection and then derive a contradiction to Cantor’s theorem.
This shows that A is not a set.
8. We define a statement as a finite string of symbols found on the common
computer keyboard (we regard a space as a symbol). E.g. “Binghamton University
is sooo great! Math is fun!” is a statement. Let’s suppose there are 100 symbols
on the common keyboard.
(a) Let A be the set of all statements. What’s the cardinality of A?
(b) Is the set of all possible mathematical proofs countable? Why?
CHAPTER 3

Infinite sequences of real and complex numbers

Notable enough, however, are the controversies over the series 1 - 1 + 1 - 1
+ 1 - ... whose sum was given by Leibniz as 1/2, although others disagree.
... Understanding of this question is to be sought in the word “sum”; this
idea, if thus conceived — namely, the sum of a series is said to be that
quantity to which it is brought closer as more terms of the series are taken
— has relevance only for convergent series, and we should in general give
up the idea of sum for divergent series.
Leonhard Euler (1707–1783).

Analysis is often described as the study of infinite processes, of which the study
of sequences and series forms the backbone. It is dealing with the concept of
“infinite” in infinite processes that makes analysis technically challenging. In fact,
the subject of sequences is where real analysis becomes “really hard”.
Let us consider the following infinite series that Euler mentioned:

s = 1 − 1 + 1 − 1 + 1 − 1 + 1 − 1 + ··· .

Let’s manipulate this infinite series without being too careful. First, we notice that

s = (1 − 1) + (1 − 1) + (1 − 1) + · · · = 0 + 0 + 0 + · · · = 0,

so s = 0. On the other hand,

s = 1 − (1 − 1) − (1 − 1) − (1 − 1) − · · · = 1 − 0 − 0 − 0 − · · · = 1,

so s = 1. Finally, we can get Leibniz’s value of 1/2 as follows:

2s = 2 − 2 + 2 − 2 + · · · = 1 + 1 − 1 − 1 + 1 + 1 − 1 − 1 + · · ·
= 1 + (1 − 1) − (1 − 1) + (1 − 1) − (1 − 1) + · · ·
= 1 + 0 − 0 + 0 − 0 + · · · = 1,

so s = 1/2! This example shows us that we need to be careful in dealing with the
infinite. In the pages that follow we “tame the infinite” with rigorous definitions.
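One way to see why the three manipulations disagree is to look at the finite partial sums, which is the notion of “sum” described in Euler’s quote above. The short sketch below simply tabulates them; the variable names are illustrative.

```python
from itertools import accumulate

terms = [(-1) ** k for k in range(8)]     # 1, -1, 1, -1, ...
partial_sums = list(accumulate(terms))    # s_1, s_2, ..., s_8
print(partial_sums)                       # → [1, 0, 1, 0, 1, 0, 1, 0]
```

The partial sums alternate between 1 and 0 and so are not brought closer to any single quantity as more terms are taken.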
Another highlight of this chapter is our study of the number e (Euler’s number),
which you have seen countless times in calculus and which pops up everywhere
including economics (compound interest), population growth, radioactive decay,
probability, etc. We shall prove two of the most famous formulas for this number:
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n. \]
See [55], [145] for more on this incredible and versatile number. Another number
we’ll look at is the golden ratio $\Phi = \frac{1 + \sqrt{5}}{2}$, which has strikingly pretty formulas
\[ \Phi = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}}. \]
In Section 3.1 we begin our study of infinite processes by learning about sequences
and their limits, then in Section 3.2 we discuss the properties of sequences. Sec-
tions 3.3 and 3.4 are devoted to answering the question of when a given sequence
converges; in these sections we’ll also derive the above formulas for Φ. Next, in
Section 3.5, we study infinite series, which is really a special case of the study of
infinite sequences. The exponential function, called by many “the most important
function in mathematics” [192, p. 1], is our subject of study in Section 3.7. This
function is defined by
\[ \exp(z) := \sum_{n=0}^{\infty} \frac{z^n}{n!}, \qquad z \in \mathbb{C}. \]

We shall derive a few of the exponential function’s many properties including its
relationship to Euler’s number e. As a bonus prize, in Section 3.7 we’ll also prove
that e is irrational and we look at a useful (but little publicized) result called
Tannery’s theorem, which we’ll use in subsequent sections.
Finally, in Section 3.8 we see how real numbers can be represented as decimals
(with respect to arbitrary bases) and we look at Cantor’s famous “constructive”
diagonal argument.
Chapter 3 objectives: The student will be able to . . .
• apply the rigorous ε-N definition of convergence for sequences and series.
• determine when a sequence is monotone, Cauchy, or has a convergent subse-
quence (Bolzano-Weierstrass), and when a series converges (absolutely).
• define the exponential function and the number e.
• explain Cantor’s diagonal argument.

3.1. Convergence and ε-N arguments for limits of sequences


Undeniably, the most important concept in all of undergraduate analysis is the
notion of convergence. Intuitively, to say that a sequence {an } in Rm converges to an
element a in Rm means that an is “as close as we want” to a for n “sufficiently large”. In
this section we make the terms in quotes rigorous, which introduces the first bona
fide technical definition in this book: the ε-N definition of limit.
3.1.1. Definition of convergence. A sequence in $\mathbb{R}^m$ can be thought of as
a list
\[ a_1, a_2, a_3, a_4, \ldots \]
of vectors, or points, $a_n$ in $\mathbb{R}^m$. In the language of functions, a sequence is simply a
function $f : \mathbb{N} \to \mathbb{R}^m$, where we denote f (n) by $a_n$. Usually a sequence is denoted
by $\{a_n\}$ or by $\{a_n\}_{n=1}^{\infty}$. Of course, we are not restricted to n ≥ 1 and we could
start at any integer too, e.g. $\{a_n\}_{n=-5}^{\infty}$. For convenience, in most of our proofs we
shall work with sequences starting at n = 1, although all the results we shall discuss
work for sequences starting with any index.
Example 3.1. Some examples of real sequences (that is, sequences in $\mathbb{R}^1 = \mathbb{R}$)
include¹
3, 3.1, 3.14, 3.141, 3.1415, . . .
and
\[ 1,\ \frac{1}{2},\ \frac{1}{3},\ \frac{1}{4},\ \frac{1}{5},\ \frac{1}{6},\ \ldots,\ a_n = \frac{1}{n},\ \ldots. \]
We are mostly interested in real or complex sequences. Here, by a complex
sequence we simply mean a sequence in $\mathbb{R}^2$ where we are free to use the notation
of i for (0, 1) and the multiplicative structure.
Example 3.2. The following sequence is a complex sequence:
\[ i,\ i^2 = -1,\ i^3 = -i,\ i^4 = 1,\ \ldots,\ a_n = i^n,\ \ldots \]
Although we shall focus on R and C sequences in this book, later on you might
deal with topology and calculus in Rm (as in, for instance, [136]), so for your later
psychological health we might as well get used to working with Rm instead of R1 .
We now try to painstakingly motivate a precise definition of convergence (so
please bear with me). Intuitively, to say that a sequence {an } in Rm converges to an
element L in Rm means that an is “as close as we want” to L for n “sufficiently large”.
We now make the terms in quotes rigorous. First of all, what does “as close as we
want” mean? We take it to mean that given any error, say ε > 0 (e.g. ε = 0.01),
for n “sufficiently large” we can approximate L by an to within an error of ε. In
other words, for n “sufficiently large” the difference between L and an is within ε:
|an − L| < ε.
Now what does for n “sufficiently large” mean? We define it to mean that there is
a real number N such that for all n > N , a specified property holds (e.g. the above
inequality); thus, for all n > N we have |an − L| < ε, or using symbols,
\[ n > N \implies |a_n - L| < \varepsilon. \tag{3.1} \]
In conclusion: For any given error ε > 0 there is an N such that (3.1) holds.²
We now summarize our findings as a precise definition. Let {an } be a sequence
in Rm . We say that {an } converges (or tends) to an element L in Rm if, for
every ε > 0, there is an N ∈ R such that for all n > N , |an − L| < ε. Because this
definition is so important, we display it: {an } converges to L if,
for every ε > 0, there is an N ∈ R such that n > N =⇒ |an − L| < ε.
We call {an } a convergent sequence, L the limit of {an }, and we usually denote
the fact that {an } converges to L in one of four ways:
\[ a_n \to L, \qquad a_n \to L \text{ as } n \to \infty, \qquad \lim a_n = L, \qquad \lim_{n \to \infty} a_n = L. \]
If a sequence does not converge (to any element of Rm ), we say that it diverges.
We can also state the definition of convergence in terms of open balls. Observe
that |an − L| < ε is just saying that an ∈ Bε (L), the open ball of radius ε centered
at L (see formula (2.32) in Section 2.8). Therefore, an → L in Rm if,
for every ε > 0, there is an N ∈ R such that n > N =⇒ an ∈ Bε (L).
See Figure 3.1. We’ll not emphasize this interpretation of limit but we state it
because the open ball idea will occur in other classes, in particular, when studying
metric spaces in topology.

Figure 3.1. an → L if and only if given any ε-ball around L, the an ’s are “eventually” inside of the ε-ball.

¹ We’ll talk about decimal expansions of real numbers in Section 3.8 and π in Chapter 4.
² One magnitude is said to be the limit of another magnitude when the second may approach
the first within any given magnitude, however small, though the second may never exceed the
magnitude it approaches. Jean Le Rond d’Alembert (1717–1783). The article on Limite in the
Encyclopédie, 1754.
3.1.2. Standard examples of ε-N arguments. We now give some standard
examples of using our “ε-N ” definition of limit.
Example 3.3. We shall prove that the sequence {1/2, 1/3, 1/4, . . .} converges
to zero:
\[ \lim \frac{1}{n+1} = 0. \]
In general, any sequence {an } that converges to zero is called a null sequence.
Thus, we claim that {1/(n + 1)} is a null sequence. Let ε > 0 be any given positive
real number. We want to prove there exists a real number N such that
\[ n > N \implies \left| \frac{1}{n+1} - 0 \right| = \frac{1}{n+1} < \varepsilon. \tag{3.2} \]
To find such a number N , we can proceed in many ways. Here are two ways.
(I) For our first way, we observe that
\[ \frac{1}{n+1} < \varepsilon \iff \frac{1}{\varepsilon} < n + 1 \iff \frac{1}{\varepsilon} - 1 < n. \tag{3.3} \]
For this reason, let us choose N to be the real number N = 1/ε − 1. Let
n > N , that is, N < n or using the definition of N , 1/ε − 1 < n. Then
by (3.3), we have 1/(n + 1) < ε. In summary, for n > N , we have proved
that 1/(n + 1) < ε. This proves (3.2). Thus, by definition of convergence,
1/(n + 1) → 0.
(II) Another technique is to try and simplify the right-hand side of (3.2). Since
n < n + 1, we have 1/(n + 1) < 1/n. Therefore,
\[ \text{if } \frac{1}{n} < \varepsilon, \text{ then because } \frac{1}{n+1} < \frac{1}{n}, \text{ we have } \frac{1}{n+1} < \varepsilon \text{ also.} \tag{3.4} \]
Now we can make 1/n < ε easily since
\[ \frac{1}{n} < \varepsilon \iff \frac{1}{\varepsilon} < n. \]
With this scratch work done, let us now choose N = 1/ε. Let n > N , that
is, N < n or using the definition of N , 1/ε < n. Then we certainly have
1/n < ε, and hence by (3.4), we know that 1/(n + 1) < ε too. In summary,
for n > N , we have proved that 1/(n + 1) < ε. This proves (3.2).
Note that in (I) and (II), we found different N ’s (namely N = 1/ε − 1 in (I)
and N = 1/ε in (II)), but this doesn’t matter because to prove (3.2) we just need to
show such an N exists; it doesn’t have to be unique and in general, many different
N ’s will work. We remark that a similar argument shows that the sequence {1/n}
is also a null sequence: $\lim \frac{1}{n} = 0$.
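The two choices of N in Example 3.3 can be spot-checked numerically. The following is only a sketch: it tests finitely many n beyond N (so it illustrates, but does not prove, the implication (3.2)), it uses exact rational arithmetic to avoid rounding at the boundary, and the helper name `holds_beyond` and the sample values of ε are assumptions.

```python
from fractions import Fraction
from math import floor


def holds_beyond(N, eps, trials=1000):
    """Spot-check 1/(n+1) < eps for the first `trials` natural numbers n > N."""
    start = max(floor(N) + 1, 1)      # smallest natural number n with n > N
    return all(Fraction(1, n + 1) < eps for n in range(start, start + trials))


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)):
    N_first = 1 / eps - 1             # the choice made in (I)
    N_second = 1 / eps                # the choice made in (II)
    print(eps, holds_beyond(N_first, eps), holds_beyond(N_second, eps))
```

With ε = 1/10 the choice in (I) gives N = 9, and n = 9 itself fails since 1/(9 + 1) is not less than 1/10, so that N cannot be lowered any further for this ε.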
Example 3.4. Here’s a harder example. Let’s prove that
\[ \lim \frac{2n^2 - n}{n^2 - 9} = 2. \]
For the sequence $a_n = (2n^2 - n)/(n^2 - 9)$, we take the indices to be n = 4, 5, 6, . . .
(since for n = 3 the quotient is undefined). Let ε > 0 be given. We want to prove
there exists a real number N such that the following statement holds:
\[ n > N \implies \left| \frac{2n^2 - n}{n^2 - 9} - 2 \right| < \varepsilon. \]
One technique to prove this is to try and “massage” (simplify) the absolute value
on the right as much as we can. For instance, we first can combine fractions:
\[ \left| \frac{2n^2 - n}{n^2 - 9} - 2 \right| = \left| \frac{2n^2 - n}{n^2 - 9} - \frac{2n^2 - 18}{n^2 - 9} \right| = \left| \frac{18 - n}{n^2 - 9} \right|. \tag{3.5} \]
Second, just so that we don’t have to worry about absolute values, we can get rid
of them by using the triangle inequality: for n = 4, 5, . . ., we have
\[ \left| \frac{18 - n}{n^2 - 9} \right| \le \frac{18 + n}{n^2 - 9}. \tag{3.6} \]
Third, just for topping on the cake, let us make the top of the right-hand fraction
a little simpler by observing that 18 ≤ 18n, so we conclude that
\[ \frac{18 + n}{n^2 - 9} \le \frac{18n + n}{n^2 - 9} = \frac{19n}{n^2 - 9}. \tag{3.7} \]
In conclusion, we have “massaged” our expression to the following inequality:
\[ \left| \frac{2n^2 - n}{n^2 - 9} - 2 \right| \le \frac{19n}{n^2 - 9}. \]
So we just need to prove there is an N such that
\[ n > N \implies \frac{19n}{n^2 - 9} < \varepsilon; \tag{3.8} \]
this will automatically imply that for n > N , we have $\left| \frac{2n^2 - n}{n^2 - 9} - 2 \right| < \varepsilon$, which was
what we originally wanted. There are many “tricks” to find an N satisfying (3.8).
Here are three slightly different ways.
(I) For our first method, we use a technique sometimes found in elementary
calculus: We divide top and bottom by $n^2$:
\[ \frac{19n}{n^2 - 9} = \frac{19/n}{1 - 9/n^2}. \]
To show that this can be made less than ε, we need to show that the denominator
can’t get too small (otherwise the fraction $(19/n)/(1 - 9/n^2)$ might
get large). To this end, observe that $\frac{9}{n^2} \le \frac{9}{4^2}$ for n ≥ 4, so
\[ \text{for } n \ge 4, \quad 1 - \frac{9}{n^2} \ge 1 - \frac{9}{4^2} = 1 - \frac{9}{16} = \frac{7}{16} > \frac{1}{3}. \]
Hence, for n ≥ 4, we have $\frac{1}{3} < 1 - \frac{9}{n^2}$, which is to say, $\frac{1}{1 - 9/n^2} < 3$. Thus,
\[ \text{for } n \ge 4, \quad \frac{19n}{n^2 - 9} = \frac{19/n}{1 - 9/n^2} < (19/n) \cdot 3 = \frac{57}{n}. \tag{3.9} \]
Therefore, we can satisfy (3.8) by making 57/n < ε instead. Now,
\[ \frac{57}{n} < \varepsilon \iff \frac{57}{\varepsilon} < n. \tag{3.10} \]
Because of (3.9) and (3.10), let us pick N to be the larger of 3 and 57/ε.
We’ll prove that this N works for (3.8). Let n > N , which implies that
n > 3 and n > 57/ε. In particular, n ≥ 4 and ε > 57/n. Therefore,
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.9)}}{<} \frac{57}{n} \overset{\text{by (3.10)}}{<} \varepsilon. \]
This proves (3.8).
(II) For our second method, observe that $n^2 - 9 \ge n^2 - 9n$ since 9 ≤ 9n. Hence,
\[ \text{for } n > 9, \quad \frac{1}{n^2 - 9} \le \frac{1}{n^2 - 9n} = \frac{1}{n(n - 9)}, \]
where we chose n > 9 so that n(n − 9) is positive. Thus,
\[ \text{for } n > 9, \quad \frac{19n}{n^2 - 9} \le \frac{19n}{n(n - 9)} = \frac{19}{n - 9}. \tag{3.11} \]
So, we can satisfy (3.8) by making 19/(n − 9) < ε instead. Now, for n > 9,
\[ \frac{19}{n - 9} < \varepsilon \iff \frac{19}{\varepsilon} < n - 9 \iff 9 + \frac{19}{\varepsilon} < n. \tag{3.12} \]
Because of (3.11) and (3.12), let us pick $N = 9 + \frac{19}{\varepsilon}$. We’ll prove that this
N works for (3.8). Let n > N , which implies, in particular, that n > 9.
Therefore,
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.11)}}{\le} \frac{19}{n - 9} \overset{\text{by (3.12)}}{<} \varepsilon. \]
This proves (3.8).
(III) For our last (and my favorite of the three) method, we factor the bottom:
\[ \frac{19n}{n^2 - 9} = \frac{19n}{(n + 3)(n - 3)} = \frac{19n}{n + 3} \cdot \frac{1}{n - 3} < 19 \cdot \frac{1}{n - 3}, \tag{3.13} \]
where we used the fact that $\frac{19n}{n+3} < \frac{19n}{n} = 19$. Now (solving for n),
\[ \frac{19}{n - 3} < \varepsilon \iff 3 + \frac{19}{\varepsilon} < n. \tag{3.14} \]
For this reason, let us pick N = 3 + 19/ε. We’ll show that this N satisfies
(3.8). Indeed, for n > N (that is, 3 + 19/ε < n), we have
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.13)}}{<} \frac{19}{n - 3} \overset{\text{by (3.14)}}{<} \varepsilon. \]
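The three choices of N found in (I), (II), and (III) can be compared side by side. The sketch below checks the original inequality |a_n − 2| < ε for finitely many n beyond each N, using exact rational arithmetic; it illustrates the argument and is not a substitute for it. The names `CHOICES` and `holds_beyond` and the sample values of ε are assumptions.

```python
from fractions import Fraction
from math import floor


def a(n):
    """The sequence a_n = (2n^2 - n)/(n^2 - 9), defined for n >= 4."""
    return Fraction(2 * n * n - n, n * n - 9)


# The three thresholds derived in (I), (II), and (III).
CHOICES = {
    "I": lambda eps: max(Fraction(3), 57 / eps),
    "II": lambda eps: 9 + 19 / eps,
    "III": lambda eps: 3 + 19 / eps,
}


def holds_beyond(N, eps, trials=500):
    """Spot-check |a_n - 2| < eps for the first `trials` integers n > N (N >= 3)."""
    start = floor(N) + 1
    return all(abs(a(n) - 2) < eps for n in range(start, start + trials))


for eps in (Fraction(1), Fraction(1, 10), Fraction(1, 250)):
    print(eps, {name: holds_beyond(N(eps), eps) for name, N in CHOICES.items()})
```

All three thresholds pass the check even though they differ, which matches the earlier remark that N only has to exist and need not be unique.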
3.1.3. Sophisticated examples of ε-N arguments. We now give some very
famous classical examples of ε-N arguments.
Example 3.5. Let a be any complex number with |a| < 1 and consider the
sequence $a, a^2, a^3, \ldots$ (so that $a_n = a^n$ for each n). We shall prove that $\{a_n\}$ is a
null sequence, that is,
\[ \lim a^n = 0, \qquad |a| < 1. \]
Let ε > 0 be any given positive real number. We need to prove that there is a real
number N such that the following statement holds:
\[ n > N \implies |a^n - 0| = |a|^n < \varepsilon. \tag{3.15} \]
If a = 0, then any N would do, so we might as well assume that a ≠ 0. In this
case, since the real number |a| is less than 1, we can write $|a| = \frac{1}{1+b}$, where b > 0;
in fact, we can simply take b = −1 + 1/|a|. (Since |a| < 1, we have 1/|a| > 1, so
b > 0.) Therefore,
\[ |a|^n = \frac{1}{(1 + b)^n}, \quad \text{where } b > 0. \]
By Bernoulli’s inequality (Theorem 2.7),
\[ (1 + b)^n \ge 1 + nb \ge nb \implies \frac{1}{(1 + b)^n} \le \frac{1}{nb}. \]
Hence,
\[ |a|^n \le \frac{1}{nb}. \tag{3.16} \]
Thus, we can satisfy (3.15) by making 1/(nb) < ε instead. Now,
\[ \frac{1}{nb} < \varepsilon \iff \frac{1}{b\varepsilon} < n. \tag{3.17} \]
For this reason, let us pick N = 1/(bε). Let n > N (that is, 1/(bε) < n). Then,
\[ |a|^n \overset{\text{by (3.16)}}{\le} \frac{1}{nb} \overset{\text{by (3.17)}}{<} \varepsilon. \]
This proves (3.15) and thus, by definition of convergence, $a^n \to 0$.
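Both the Bernoulli bound (3.16) and the choice N = 1/(bε) can be tried out for a sample complex number. This is a floating-point sketch over finitely many n, so it is an illustration only; the sample point a = 0.6 + 0.3i and the function names are assumptions.

```python
def bernoulli_bound_holds(a, n_max=200):
    """Spot-check |a|^n <= 1/(n b) with b = 1/|a| - 1, for n = 1, ..., n_max."""
    r = abs(a)
    b = 1 / r - 1
    return all(r**n <= 1 / (n * b) for n in range(1, n_max + 1))


def holds_beyond(a, eps, trials=500):
    """Spot-check |a^n| < eps for the first `trials` integers n > N = 1/(b eps)."""
    b = 1 / abs(a) - 1
    N = 1 / (b * eps)
    start = int(N) + 1                # N > 0, so this is the smallest integer n > N
    return all(abs(a**n) < eps for n in range(start, start + trials))


a = 0.6 + 0.3j                        # hypothetical sample point with 0 < |a| < 1
print(bernoulli_bound_holds(a))
print(all(holds_beyond(a, eps) for eps in (0.5, 0.01, 0.001)))
```

In practice |a|^n drops far below 1/(nb) once n is large, so this N is safe rather than optimal.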
Example 3.6. For our next example, let $a > 0$ be any positive real number
and consider the sequence $a, a^{1/2}, a^{1/3}, \ldots$ (so that $a_n = a^{1/n}$ for each $n$). We shall
prove that $a_n \to 1$, that is,
$$\lim a^{1/n} = 1, \qquad a > 0.$$
If $a = 1$, then the sequence $a^{1/n}$ is just the constant sequence $1, 1, 1, 1, \ldots$, which
certainly converges to 1 (can you prove this?). Suppose that $a > 1$; we shall consider
the case $0 < a < 1$ afterwards. Let $\varepsilon > 0$ be any given positive real number. We
need to prove that there is a real number $N$ such that
$$n > N \implies |a^{1/n} - 1| < \varepsilon. \tag{3.18}$$
By our familiar root rules (Theorem 2.32), we know that $a^{1/n} > 1^{1/n} = 1$ and
therefore $b_n := a^{1/n} - 1 > 0$. By Bernoulli’s inequality (Theorem 2.7), we have
$$a = \big(a^{1/n}\big)^n = (1 + b_n)^n \ge 1 + n b_n \ge n b_n \implies b_n \le \frac{a}{n}.$$
Hence,
$$|a^{1/n} - 1| = |b_n| \le \frac{a}{n}. \tag{3.19}$$
Thus, we can satisfy (3.18) by making $a/n < \varepsilon$ instead. Now,
$$\frac{a}{n} < \varepsilon \iff \frac{a}{\varepsilon} < n. \tag{3.20}$$
For this reason, let us pick $N = a/\varepsilon$. Let $n > N$ (that is, $a/\varepsilon < n$). Then,
$$|a^{1/n} - 1| \overset{\text{by (3.19)}}{\le} \frac{a}{n} \overset{\text{by (3.20)}}{<} \varepsilon.$$
So, by definition of convergence, $a^{1/n} \to 1$. Now consider the case when $0 < a < 1$.
Let $\varepsilon > 0$ be any given positive real number. We need to prove that there is a real
number $N$ such that
$$n > N \implies |a^{1/n} - 1| < \varepsilon.$$
Since $0 < a < 1$, we have $1/a > 1$, so by our argument for real numbers greater
than one we know that $1/a^{1/n} = (1/a)^{1/n} \to 1$. Thus, there is a real number $N$
such that
$$n > N \implies \left| \frac{1}{a^{1/n}} - 1 \right| < \varepsilon.$$
Multiplying both sides of the right-hand inequality by the positive real number
$a^{1/n}$, we get $n > N \implies |1 - a^{1/n}| < a^{1/n} \varepsilon$. Since $0 < a < 1$, by our root rules,
$a^{1/n} < 1^{1/n} = 1$, so $a^{1/n} \varepsilon < 1 \cdot \varepsilon = \varepsilon$. Hence,
$$n > N \implies |a^{1/n} - 1| < \varepsilon,$$
which shows that $a^{1/n} \to 1$ as we wished to show.
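Both cases of this example can be checked numerically with one threshold. For $a \ge 1$ the proof gives $N = a/\varepsilon$; for $0 < a < 1$ the proof applies the first case to $1/a$, which suggests $N = (1/a)/\varepsilon$. The sketch below assumes exactly these choices; the names `threshold` and `check_root_limit` are ours, not the text’s.

```python
def threshold(a, eps):
    """N from Example 3.6: a/eps when a >= 1, and (1/a)/eps when 0 < a < 1."""
    base = a if a >= 1 else 1 / a
    return base / eps


def check_root_limit(a, eps, n_terms=500):
    """Check |a^(1/n) - 1| < eps for n_terms consecutive integers n > N."""
    start = int(threshold(a, eps)) + 1  # smallest integer strictly greater than N
    return all(abs(a ** (1.0 / n) - 1) < eps
               for n in range(start, start + n_terms))


if __name__ == "__main__":
    for a in (10.0, 1.0, 0.1):
        print(a, check_root_limit(a, 0.01))
```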


Example 3.7. We come to our last example, which may seem surprising at
first. Consider the sequence $a_n = n^{1/n}$. We already know that if $a > 0$ is any fixed
real number, then $a^{1/n} \to 1$. In our present case $a_n = n^{1/n}$, so the “$a$” is increasing
with $n$ and it is not at all obvious what $n^{1/n}$ converges to, if anything! However,
we shall prove that
$$\lim n^{1/n} = 1.$$
For $n > 1$ by our root rules we know that $n^{1/n} > 1^{1/n} = 1$, so for $n > 1$, we
conclude that $b_n := n^{1/n} - 1 > 0$. By the binomial theorem (Theorem 2.8), we have
$$n = \big(n^{1/n}\big)^n = (1 + b_n)^n = 1 + \binom{n}{1} b_n + \binom{n}{2} b_n^2 + \cdots + \binom{n}{n} b_n^n.$$
Since $b_n > 0$, all the terms on the right-hand side are positive, so dropping all the
terms except the third term on the right, we see that for $n > 1$,
$$n > \binom{n}{2} b_n^2 = \frac{n!}{2!\,(n-2)!}\, b_n^2 = \frac{n(n-1)}{2}\, b_n^2.$$
Cancelling off the $n$’s from both sides, we obtain for $n > 1$,
$$b_n^2 < \frac{2}{n-1} \implies b_n < \frac{\sqrt{2}}{\sqrt{n-1}}.$$
Hence, for $n > 1$,
$$|n^{1/n} - 1| = |b_n| < \frac{\sqrt{2}}{\sqrt{n-1}}. \tag{3.21}$$
Let $\varepsilon > 0$ be given. Then
$$\frac{\sqrt{2}}{\sqrt{n-1}} < \varepsilon \iff 1 + \frac{2}{\varepsilon^2} < n. \tag{3.22}$$
For this reason, let us pick $N = 1 + 2/\varepsilon^2$. Let $n > N$ (that is, $1 + 2/\varepsilon^2 < n$). Then,
$$|n^{1/n} - 1| \overset{\text{by (3.21)}}{\le} \frac{\sqrt{2}}{\sqrt{n-1}} \overset{\text{by (3.22)}}{<} \varepsilon.$$
Thus, by definition of convergence, $n^{1/n} \to 1$.
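The two ingredients of this argument, the bound (3.21) and the threshold $N = 1 + 2/\varepsilon^2$, can each be tested on finitely many $n$. The following is only an illustrative sketch; the function names are our own.

```python
import math


def bound_3_21_holds(n_max=2000):
    """Check n^(1/n) - 1 < sqrt(2)/sqrt(n - 1) for n = 2, ..., n_max."""
    return all(n ** (1.0 / n) - 1 < math.sqrt(2) / math.sqrt(n - 1)
               for n in range(2, n_max + 1))


def check_nth_root_of_n(eps, n_terms=500):
    """Check |n^(1/n) - 1| < eps for n_terms consecutive integers n > N = 1 + 2/eps^2."""
    N = 1 + 2 / eps ** 2
    start = int(N) + 1  # smallest integer strictly greater than N
    return all(abs(n ** (1.0 / n) - 1) < eps
               for n in range(start, start + n_terms))


if __name__ == "__main__":
    print(bound_3_21_holds())
    print(check_nth_root_of_n(0.1), check_nth_root_of_n(0.01))
```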
Exercises 3.1.
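1. Using the ε-N definition of limit, prove that
$$\text{(a) } \lim \frac{(-1)^n}{n} = 0, \quad \text{(b) } \lim \Big( 2 + \frac{3}{n} \Big) = 2, \quad \text{(c) } \lim \frac{n}{n-1} = 1, \quad \text{(d) } \lim \frac{(-1)^n}{\sqrt{n-1}} = 0.$$
2. Using the ε-N definition of limit, prove that
$$\text{(a) } \lim \frac{5n^2 + 2}{n^3 - 3n + 1} = 0, \quad \text{(b) } \lim \frac{n^2 - n}{3n^2 - 2} = \frac{1}{3}, \quad \text{(c) } \lim \Big[ \sqrt{n^2 + n} - n \Big] = \frac{1}{2}.$$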
3. Here is another method to prove that $a^{1/n} \to 1$.
(i) Recall that for any real number $b$, $b^n - 1 = (b - 1)(b^{n-1} + b^{n-2} + \cdots + b + 1)$.
Using this formula, prove that if $a \ge 1$, then $a - 1 \ge n(a^{1/n} - 1)$. Suggestion:
Let $b = a^{1/n}$.
(ii) Now prove that for any $a > 0$, $a^{1/n} \to 1$. (Do the case $a \ge 1$ first, then consider
the case $0 < a < 1$.)
(iii) In fact, we shall prove that if $\{r_n\}$ is a sequence of rational numbers converging
to zero, then for any $a > 0$, $a^{r_n} \to 1$. First, using (i), prove that if $a \ge 1$ and
$0 < r < 1$ is rational, then $a^r - 1 \le r a (a - 1)$. Second, prove that if $a \ge 1$ and
$0 < r < 1$ is rational, then $|a^{-r} - 1| \le a^r - 1$. Using these facts, prove that for
any $a > 0$, $a^{r_n} \to 1$. (As before, do the case $a \ge 1$ first, then $0 < a < 1$.)
4. Let $a$ be a complex number with $|a| < 1$. We already know that $a^n \to 0$. In this
problem we prove the somewhat surprising fact that $n a^n \to 0$. Although $n$ grows
very large, this limit shows that $a^n$ must go to zero faster than $n$ grows.
(i) As in Example 3.5 we can write $|a| = 1/(1 + b)$ where $b > 0$. Using the binomial
theorem, show that for $n > 1$,
$$|a|^n < \frac{1}{\binom{n}{2} b^2}.$$
(ii) Show that
$$n |a|^n < \frac{2}{(n-1)\, b^2}.$$
(iii) Now prove that $n a^n \to 0$.
5. Here’s an even more surprising fact. Let $a$ be a complex number with $|a| < 1$. Prove
that given any natural number $k > 0$, we have $n^k a^n \to 0$. Suggestion: Let $\alpha :=
|a|^{1/k} < 1$ and use the fact that $n \alpha^n \to 0$ by the previous problem.
6. If $\{a_n\}$ is a sequence of nonnegative real numbers and $a_n \to L$, prove
(i) $L \ge 0$.
(ii) $\sqrt{a_n} \to \sqrt{L}$. (You need to consider two cases, $L = 0$ and $L > 0$.)

7. Let $\{a_n\}$ be a sequence in $\mathbb{R}^m$ and let $L \in \mathbb{R}^m$. Form the negation of the definition
that $a_n \to L$, thus giving a statement that $a_n \not\to L$ (the sequence $\{a_n\}$ does not tend to
$L$). Using your negation, prove that the sequence $\{(-1)^n\}$ diverges, that is, does not
converge to any real number. In the next section we shall find an easy way to verify
that a sequence diverges using the notion of subsequences.
8. (Infinite products; see Chapter 7 for more on this amazing topic!) In this problem
we investigate the infinite product
$$\frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5} \cdot \frac{5^2}{4 \cdot 6} \cdot \frac{6^2}{5 \cdot 7} \cdot \frac{7^2}{6 \cdot 8} \cdots \tag{3.23}$$
We interpret this “infinite product” as the limit of the “partial products”
$$a_1 = \frac{2^2}{1 \cdot 3}, \quad a_2 = \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4}, \quad a_3 = \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5}, \ldots.$$
In other words, for each $n \in \mathbb{N}$, we define $a_n := \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdots \frac{(n+1)^2}{n \cdot (n+2)}$. We prove that the
sequence $\{a_n\}$ converges as follows.
(i) Prove that $a_n = \frac{2(n+1)}{n+2}$.
(ii) Now prove that $a_n \to 2$. We sometimes write the infinite product (3.23) using
$\prod$ notation and we express the limit $\lim a_n = 2$ as
$$\frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5} \cdot \frac{5^2}{4 \cdot 6} \cdots = 2 \quad \text{or} \quad \prod_{n=1}^{\infty} \frac{(n+1)^2}{n(n+2)} = 2.$$

3.2. A potpourri of limit properties for sequences


Now that we have a working knowledge of the ε-N definition of limit, we move
on to studying the properties of limits that will be used throughout the rest of the
book. In particular, we learn the “algebra of limits,” which allows us to combine
convergent sequences to form other convergent sequences. Finally, we discuss the
notion of properly divergent sequences.

3.2.1. Some limit theorems. We begin by proving that limits are unique,
that is, a convergent sequence cannot have two different limits. Before doing so, we
first prove a lemma.
Lemma 3.1 (The ε-principle). If $x \in \mathbb{R}$ and for any $\varepsilon > 0$, we have $x \le \varepsilon$,
then $x \le 0$. In particular, if $a \in \mathbb{R}^m$ and for any $\varepsilon > 0$, we have $|a| < \varepsilon$, then
$a = 0$.
Proof. By way of contradiction, assume that $x > 0$. Then choosing $\varepsilon =
x/2 > 0$, by assumption we have $x \le \varepsilon = x/2$. Subtracting $x/2$ from both sides, we
conclude that $x/2 \le 0$, which implies that $x \le 0$, a contradiction.
The second assertion of the lemma follows by applying the first assertion to
$x = |a|$. In this case, $|a| \le 0$, which implies that $|a| = 0$ and therefore $a = 0$.

The proof of the following theorem uses the renowned “ε/2-trick.”


Theorem 3.2 (Uniqueness of limits). Sequences can have at most one limit.

Proof. Let $\{a_n\}$ be a sequence in $\mathbb{R}^m$ and suppose that $a_n \to L$ and $a_n \to L'$.
We shall prove that $L = L'$. To see this let $\varepsilon > 0$. Since $a_n \to L$, with $\varepsilon/2$ replacing
$\varepsilon$ in the definition of limit, there is an $N$ such that $|a_n - L| < \varepsilon/2$ for all $n > N$.
Since $a_n \to L'$, with $\varepsilon/2$ replacing $\varepsilon$ in the definition of limit, there is an $N'$ such
that $|a_n - L'| < \varepsilon/2$ for all $n > N'$. By the triangle inequality,
$$|L - L'| = |(L - a_n) + (a_n - L')| \le |L - a_n| + |a_n - L'|.$$
In particular, for $n$ greater than the larger of $N$ and $N'$, we see that
$$|L - L'| \le |L - a_n| + |a_n - L'| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon;$$
that is, $|L - L'| < \varepsilon$. Now $\varepsilon > 0$ was completely arbitrary, so by the ε-principle, we
must have $L - L' = 0$, or $L = L'$, which is what we intended to show.

It is important to note that the convergence or divergence of a sequence depends only
on the “tail” of the sequence, that is, on the terms of the sequence for large $n$. This
fact is more-or-less obvious.
Example 3.8. The sequence
$$-100,\ 100,\ 50,\ 1000,\ \frac{1}{2},\ \frac{1}{4},\ \frac{1}{8},\ \frac{1}{16},\ \frac{1}{32},\ \frac{1}{64},\ \frac{1}{128},\ \cdots$$
converges to zero, and the first terms of this sequence don’t affect this fact.
Given a sequence $\{a_n\}$ in $\mathbb{R}^m$ and a nonnegative integer $k = 0, 1, 2, \ldots$, we call
the sequence $\{a_{k+1}, a_{k+2}, a_{k+3}, a_{k+4}, \ldots\}$ a k-tail (of the sequence $\{a_n\}$). We’ll
leave the following proof to the reader.
Theorem 3.3 (Tails theorem for sequences). A sequence converges if and
only if every tail converges, if and only if some tail converges.
We now show that convergence in $\mathbb{R}^m$ can be reduced to convergence in $\mathbb{R}$;
this is why real sequences are so important. Let $\{a_n\}$ be a sequence in $\mathbb{R}^m$. Since
$a_n \in \mathbb{R}^m$, we can express $a_n$ in terms of its $m$-tuple of components:
$$a_n = (a_{1n}, a_{2n}, \ldots, a_{mn}).$$
Notice that each coordinate, say the $k$-th one $a_{kn}$, is a real number and so, $\{a_{kn}\} =
\{a_{k1}, a_{k2}, a_{k3}, \ldots\}$ is a sequence in $\mathbb{R}$. Given $L = (L_1, L_2, \ldots, L_m) \in \mathbb{R}^m$, in the
following theorem we prove that if $a_n \to L$, then for each $k = 1, \ldots, m$, $a_{kn} \to L_k$ as
$n \to \infty$ as well. Conversely, we shall prove that if for each $k = 1, \ldots, m$, $a_{kn} \to L_k$
as $n \to \infty$, then $a_n \to L$ as well.
Theorem 3.4 (Component theorem). A sequence in $\mathbb{R}^m$ converges to $L \in \mathbb{R}^m$
if and only if each component sequence converges in $\mathbb{R}$ to the corresponding
component of $L$.
Proof. Suppose first that $a_n \to L$. Fixing $k$, we shall prove that $a_{kn} \to L_k$.
Let $\varepsilon > 0$. Since $a_n \to L$, there is an $N$ such that for all $n > N$, $|a_n - L| < \varepsilon$.
Hence, by definition of the norm on $\mathbb{R}^m$, for all $n > N$,
$$(a_{kn} - L_k)^2 \le (a_{1n} - L_1)^2 + (a_{2n} - L_2)^2 + \cdots + (a_{mn} - L_m)^2 = |a_n - L|^2 < \varepsilon^2.$$
Taking square roots of both sides shows that for all $n > N$, $|a_{kn} - L_k| < \varepsilon$, which
shows that $a_{kn} \to L_k$.
Suppose now that for each $k = 1, \ldots, m$, $a_{kn} \to L_k$. Let $\varepsilon > 0$. Since $a_{kn} \to L_k$
there is an $N_k$ such that for all $n > N_k$, $|a_{kn} - L_k| < \varepsilon/\sqrt{m}$. Let $N$ be the largest
of the numbers $N_1, N_2, \ldots, N_m$. Then for $n > N$, we have
$$|a_n - L|^2 = (a_{1n} - L_1)^2 + (a_{2n} - L_2)^2 + \cdots + (a_{mn} - L_m)^2$$
$$< \Big( \frac{\varepsilon}{\sqrt{m}} \Big)^2 + \Big( \frac{\varepsilon}{\sqrt{m}} \Big)^2 + \cdots + \Big( \frac{\varepsilon}{\sqrt{m}} \Big)^2 = \frac{\varepsilon^2}{m} + \frac{\varepsilon^2}{m} + \cdots + \frac{\varepsilon^2}{m} = \varepsilon^2.$$
Taking square roots of both sides shows that for all $n > N$, $|a_n - L| < \varepsilon$, which
shows that $a_n \to L$.

Example 3.9. Let us apply this theorem to $\mathbb{C}$ (which, remember, is just $\mathbb{R}^2$
with a special multiplication). Let $c_n = (a_n, b_n) = a_n + i b_n$ be a complex sequence
(here we switch notation from $a_n$ to $c_n$ in the theorem and we let $c_{1n} = a_n$ and
$c_{2n} = b_n$). Then it follows that $c_n \to c = a + ib$ if and only if $a_n \to a$ and $b_n \to b$.
In other words, $c_n \to c$ if and only if the real and imaginary parts of $c_n$ converge
to the real and imaginary parts, respectively, of $c$. For example, from our examples
in the previous section, it follows that for any real $a > 0$, we have
$$\frac{1}{n} + i\, a^{1/n} \to 0 + i \cdot 1 = i.$$
We now prove the fundamental fact that if a sequence converges, then it must
be bounded. In other words, if {an } is a convergent sequence in Rm , then there
is a constant C such that |an | ≤ C for all n.
Example 3.10. The sequence $\{n\} = \{1, 2, 3, 4, 5, \ldots\}$ is not bounded by the
Archimedean property of $\mathbb{R}$. Also if $a > 1$ is a real number, then the sequence
$\{a^n\} = \{a^1, a^2, a^3, \ldots\}$ is not bounded. One way to see this uses Bernoulli’s
inequality: We can write $a = 1 + r$ where $r > 0$, so by Bernoulli’s inequality,
$$a^n = (1 + r)^n \ge 1 + n r > n r,$$
and $n r$ can be made greater than any constant $C$ by the Archimedean property of
$\mathbb{R}$. Thus, $\{a^n\}$ cannot be bounded.
Theorem 3.5. Every convergent sequence is bounded.
Proof. If $a_n \to L$ in $\mathbb{R}^m$, then with $\varepsilon = 1$ in the definition of convergence,
there is an $N$ such that for all $n > N$, we have $|a_n - L| < 1$, which, by the triangle
inequality, implies that
$$n > N \implies |a_n| = |(a_n - L) + L| \le |a_n - L| + |L| < 1 + |L|. \tag{3.24}$$
Let $k$ be any natural number greater than $N$ and let
$$C := \max\big\{ |a_1|, |a_2|, \ldots, |a_{k-1}|, |a_k|, 1 + |L| \big\}.$$
Then $|a_n| \le C$ for $n = 1, 2, \ldots, k$, and by (3.24), $|a_n| < C$ for $n > k$. Thus,
$|a_n| \le C$ for all $n$ and hence $\{a_n\}$ is bounded.

Forming the contrapositive, we know that if a sequence is not bounded, then
the sequence cannot converge. Therefore, this theorem can be used to prove that
certain sequences do not converge.
Example 3.11. Each of the following sequences: {n}, {1 + i n^2 }, {2n + i/n} is
not bounded, and therefore does not converge.

3.2.2. Real sequences and preservation of inequalities. Real sequences
have certain properties that general sequences in $\mathbb{R}^m$ and complex sequences do not
have, namely those corresponding to the order properties of R. The order properties
of sequences are based on the following lemma.
Lemma 3.6. A real sequence {an } converges to L ∈ R if and only if, for every
ε > 0 there is an N such that
n>N =⇒ L − ε < an < L + ε.
Proof. By our interpretation of limits in terms of open balls, we know that
an → L just means that given any ε > 0 there is an N such that n > N =⇒
an ∈ Bε (L) = (L − ε, L + ε). Thus, for n > N , we have L − ε < an < L + ε.
We can also prove this lemma directly from the original definition of convergence:
an → L means that given any ε > 0 there is an N such that
n>N =⇒ |an − L| < ε.
By our properties of absolute value, |an − L| < ε is equivalent to −ε < an − L < ε,
which is equivalent to L − ε < an < L + ε. 
The following theorem is the well-known squeeze theorem. Recall from the
beginning part of Section 3.1 that the phrase “for n sufficiently large” means “there
is an N such that for n > N ”.
Theorem 3.7 (Squeeze theorem). Let {an }, {bn }, and {cn } be sequences in
R with {an } and {cn } convergent, such that lim an = lim cn and for n sufficiently
large, an ≤ bn ≤ cn . Then the sequence {bn } is also convergent, and
lim an = lim bn = lim cn .
Proof. Let L = lim an = lim cn and let ε > 0. By the tails theorem, we may
assume that an ≤ bn ≤ cn for all n. Since an → L there is an N1 such that for
n > N1 , L − ε < an < L + ε, and since cn → L there is an N2 such that for n > N2 ,
L − ε < cn < L + ε. Let N be the larger of N1 and N2 . Then for n > N ,
L − ε < an ≤ bn ≤ cn < L + ε,
which implies that for n > N , L − ε < bn < L + ε. Thus, bn → L. 
Example 3.12. Here’s a neat sequence involving the squeeze theorem. Consider
$\{b_n\}$, where
$$b_n = \frac{1}{(n+1)^2} + \frac{1}{(n+2)^2} + \frac{1}{(n+3)^2} + \cdots + \frac{1}{(2n)^2} = \sum_{k=1}^{n} \frac{1}{(n+k)^2}.$$
Observe that for $k = 1, 2, \ldots, n$, we have
$$0 \le \frac{1}{(n+k)^2} \le \frac{1}{(n+0)^2} = \frac{1}{n^2}.$$
Thus,
$$0 \le \sum_{k=1}^{n} \frac{1}{(n+k)^2} \le \sum_{k=1}^{n} \frac{1}{n^2} = \Big( \sum_{k=1}^{n} 1 \Big) \cdot \frac{1}{n^2} = n \cdot \frac{1}{n^2}.$$
Therefore,
$$0 \le b_n \le \frac{1}{n}.$$
Since $a_n = 0 \to 0$ and $c_n = 1/n \to 0$, by the squeeze theorem, $b_n \to 0$ as well.
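To see the squeeze $0 \le b_n \le 1/n$ in action, one can simply compute $b_n$ for a range of $n$. The sketch below does this in floating point; the function name `b` mirrors the notation of the example, and the chosen values of $n$ are arbitrary.

```python
def b(n):
    """b_n = sum_{k=1}^{n} 1/(n+k)^2, as in Example 3.12."""
    return sum(1.0 / (n + k) ** 2 for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Each line shows the lower bound 0, the term b_n, and the upper bound 1/n.
        print(n, 0.0, b(n), 1.0 / n)
```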

Example 3.13. (Real numbers as limits of (ir)rational numbers) We
claim that given any $c \in \mathbb{R}$ there are sequences of rational numbers $\{r_n\}$ and
irrational numbers $\{q_n\}$, both converging to $c$. Indeed, for each $n \in \mathbb{N}$ we have
$c - \frac{1}{n} < c$, so by Theorem 2.37 there is a rational number $r_n$ and irrational number
$q_n$ such that
$$c - \frac{1}{n} < r_n < c \quad \text{and} \quad c - \frac{1}{n} < q_n < c.$$
Since $c - \frac{1}{n} \to c$ and $c \to c$, by the squeeze theorem, we have $r_n \to c$ and $q_n \to c$.
The following theorem states that real sequences preserve inequalities.
Theorem 3.8 (Preservation of inequalities). Let {an } converge in R.
(1) If {bn } is convergent and an ≤ bn for n sufficiently large, then lim an ≤ lim bn .
(2) If c ≤ an ≤ d for n sufficiently large, then c ≤ lim an ≤ d.
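Proof. Since $a_n \to a := \lim a_n$ and $b_n \to b := \lim b_n$, given any $\varepsilon > 0$ there is
an $N$ such that for all $n > N$,
$$a - \frac{\varepsilon}{2} < a_n < a + \frac{\varepsilon}{2} \quad \text{and} \quad b - \frac{\varepsilon}{2} < b_n < b + \frac{\varepsilon}{2}.$$
By the tails theorem, we may assume that $a_n \le b_n$ for all $n$. Thus, for $n > N$,
$$a - \frac{\varepsilon}{2} < a_n \le b_n < b + \frac{\varepsilon}{2} \implies a - \frac{\varepsilon}{2} < b + \frac{\varepsilon}{2} \implies a - b < \varepsilon.$$
By the ε-principle, we have $a - b \le 0$ or $a \le b$, and our first result is proved.
(2) follows from (1) applied to the constant sequences $\{c, c, c, \ldots\}$, which
converges to $c$, and $\{d, d, d, \ldots\}$, which converges to $d$; writing $c_n := c$ and $d_n := d$ for their terms,
$$c = \lim c_n \le \lim a_n \le \lim d_n = d.$$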

If c < an < d for n sufficiently large, must it be true that c < lim an < d? The
answer is no . . . can you give a counterexample?
3.2.3. Subsequences. For the rest of this section we focus on general se-
quences in Rm and not just R. A subsequence is just a sequence formed by picking
out certain (countably many) terms of a given sequence. More precisely, let {an }
be a sequence in Rm . Let ν1 < ν2 < ν3 < · · · be a sequence of natural numbers
that is increasing. Then the sequence {aνn } given by
aν1 , aν2 , aν3 , aν4 , . . .
is called a subsequence of {an }.
Example 3.14. Consider the sequence
$$\frac{1}{1},\ \frac{1}{2},\ \frac{1}{3},\ \frac{1}{4},\ \frac{1}{5},\ \frac{1}{6},\ \ldots,\ a_n = \frac{1}{n},\ \ldots.$$
Choosing $2, 4, 6, \ldots, \nu_n = 2n, \ldots$, we get the subsequence
$$\frac{1}{2},\ \frac{1}{4},\ \frac{1}{6},\ \ldots,\ a_{\nu_n} = \frac{1}{2n},\ \ldots.$$
Example 3.15. As another example, choosing $1!, 2!, 3!, 4!, \ldots, \nu_n = n!, \ldots$, we
get the subsequence
$$\frac{1}{1!},\ \frac{1}{2!},\ \frac{1}{3!},\ \ldots,\ a_{\nu_n} = \frac{1}{n!},\ \ldots.$$

Notice that both subsequences, {1/(2n)} and {1/n!} also converge to zero, the
same limit as the original sequence {1/n}. This is a general fact: If a sequence
converges, then any subsequence of it must converge to the same limit.
Theorem 3.9. Every subsequence of a convergent sequence converges to the
same limit as the original sequence.
Proof. Let {an } be a sequence in Rm converging to L ∈ Rm . Let {aνn } be
any subsequence and let ε > 0. Since an → L there is an N such that for all n > N ,
|an −L| < ε. Since ν1 < ν2 < ν3 < . . . is an increasing sequence of natural numbers,
one can check (for instance, by induction) that n ≤ νn for all n. Thus, for n > N ,
we have νn > N and hence for n > N , we have |aνn − L| < ε. This proves that
aνn → L and completes the proof. 
This theorem gives perhaps the easiest way to prove that a sequence does not
converge.
Example 3.16. Consider the sequence
$$i,\ i^2 = -1,\ i^3 = -i,\ i^4 = 1,\ i^5 = i,\ i^6 = -1,\ \ldots,\ a_n = i^n,\ \ldots.$$
Choosing $1, 5, 9, 13, \ldots, \nu_n = 4n - 3, \ldots$, we get the subsequence
$$i,\ i,\ i,\ i,\ \ldots,$$
which converges to $i$. On the other hand, choosing $2, 6, 10, 14, \ldots, \nu_n = 4n - 2, \ldots$,
we get the subsequence
$$-1,\ -1,\ -1,\ -1,\ \ldots,$$
which converges to $-1$. Since these two subsequences do not converge to the same
limit, the original sequence $\{i^n\}$ cannot converge. Indeed, if $\{i^n\}$ did converge,
then every subsequence of $\{i^n\}$ would have to converge to the same limit as $\{i^n\}$,
but we found subsequences that converge to different limits.
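The two subsequences of $\{i^n\}$ used above are easy to list explicitly. In the sketch below, `i_power` is a hypothetical helper that returns $i^n$ exactly by using the fact that the powers of $i$ repeat with period 4 (this avoids floating-point error from computing `1j ** n` directly).

```python
def i_power(n):
    """Return i^n exactly, using the period-4 cycle 1, i, -1, -i."""
    return (1, 1j, -1, -1j)[n % 4]


# Subsequence with indices nu_n = 4n - 3: every term equals i.
sub_first = [i_power(4 * n - 3) for n in range(1, 9)]
# Subsequence with indices nu_n = 4n - 2: every term equals -1.
sub_second = [i_power(4 * n - 2) for n in range(1, 9)]

if __name__ == "__main__":
    print(sub_first)
    print(sub_second)
```

Two constant subsequences with different values already rule out convergence of the full sequence, by Theorem 3.9.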
3.2.4. Algebra of limits. Let $\{a_n\}$ and $\{b_n\}$ be sequences in $\mathbb{R}^m$. Given any
real numbers $c, d$, we define the linear combination of these sequences by $c$ and $d$
as the sequence $\{c\, a_n + d\, b_n\}$. As a special case, the sum of these sequences is just the
sequence $\{a_n + b_n\}$ and the difference is just the sequence $\{a_n - b_n\}$, and choosing
$d = 0$, the multiple of $\{a_n\}$ by $c$ is just the sequence $\{c\, a_n\}$. The sequence of
norms of the sequence $\{a_n\}$ is the sequence of real numbers $\{|a_n|\}$.
Theorem 3.10. Linear combinations and norms of convergent sequences con-
verge to the corresponding linear combinations and norms of the limits.
Proof. Consider first the linear combination sequence {c an +d bn }. If an → a
and bn → b, we shall prove that c an + d bn → c a + d b. Let ε > 0. We need to
prove that there is a real number N such that
n>N =⇒ |c an + d bn − (c a + d b)| < ε.
By the triangle inequality,
|c an + d bn − (c a + d b)| = |c (an − a) + d (bn − b)|
≤ |c| |an − a| + |d| |bn − b|.
Now, since an → a, there is an N1 such that for all n > N1 , |c| |an − a| < ε/2. (If
|c| = 0, any N1 will work; if |c| > 0, then choose N1 corresponding to the error
ε/(2|c|) in the definition of convergence for an → a.) Similarly, since bn → b, there

is an N2 such that for all n > N2 , |d| |bn − b| < ε/2. Then setting N as the larger
of N1 and N2 , it follows that for n > N ,
$$|c\, a_n + d\, b_n - (c\, a + d\, b)| \le |c|\, |a_n - a| + |d|\, |b_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This proves that c an + d bn → c a + d b.
Assuming that an → a, we show that |an | → |a|. Let ε > 0. Then there is an
N such that |an − a| < ε for all n > N . Hence, as a consequence of the triangle
inequality (see Property (iv) in Theorem 2.41), for n > N , we have
| |an | − |a| | ≤ |an − a| < ε,
which shows that |an | → |a|. 
Let {an } and {bn } be complex sequences. Given any complex numbers c, d,
the same proof detailed above shows that c an + d bn → c a + d b. However, being
complex sequences we can also multiply these sequences, term by term, defining the
product sequence as the sequence {an bn }. Also, assuming that bn 6= 0 for each n,
we can divide the sequences, term by term, defining the quotient sequence as the
sequence {an /bn }.
Theorem 3.11. Products of convergent complex sequences converge to the cor-
responding products of the limits. Quotients of convergent complex sequences, where
the denominator sequence is a nonzero sequence converging to a nonzero limit, con-
verge to the corresponding quotient of the limits.
Proof. Let an → a and bn → b. We first prove that an bn → a b. Let ε > 0.
We need to prove that there is a real number N such that for all n > N ,
|an bn − a b| < ε.
By the triangle inequality,
|an bn − a b| = |an (bn − b) + b(an − a)| ≤ |an | |bn − b| + |b| |an − a|.
Since an → a, there is an N1 such that for all n > N1 , |b| |an − a| < ε/2. By
Theorem 3.5 there is a constant C such that |an | ≤ C for all n. Since bn → b, there
is an N2 such that for all n > N2 , C |bn − b| < ε/2. Setting N as the larger of N1
and N2 , it follows that for n > N ,
$$|a_n b_n - a\, b| \le |a_n|\, |b_n - b| + |b|\, |a_n - a| \le C\, |b_n - b| + |b|\, |a_n - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This proves that an bn → a b.
We now prove the second statement. If $b_n \ne 0$ for each $n$ and $b \ne 0$, then
we shall prove that $a_n / b_n \to a/b$. We can write this limit statement as a product:
$a_n \cdot b_n^{-1} \to a \cdot b^{-1}$, so all we have to do is show that $b_n^{-1} \to b^{-1}$. Let $\varepsilon > 0$. We need
to prove that there is a real number $N$ such that for all $n > N$,
$$|b_n^{-1} - b^{-1}| = |b_n b|^{-1}\, |b_n - b| < \varepsilon.$$
To do so, let $N_1$ be chosen in accordance with the error $|b|/2$ in the definition of
convergence for $b_n \to b$. Then for $n > N_1$,
$$|b| = |b - b_n + b_n| \le |b - b_n| + |b_n| < \frac{|b|}{2} + |b_n|.$$
Bringing $|b|/2$ to the left, for $n > N_1$ we have $|b|/2 < |b_n|$, or
$$|b_n|^{-1} < 2 |b|^{-1}, \qquad n > N_1.$$
Now let $N_2$ be chosen in accordance with the error $|b|^2 \varepsilon / 2$ in the definition of
convergence for $b_n \to b$. Setting $N$ as the larger of $N_1$ and $N_2$, it follows that for
$n > N$,
$$|b_n^{-1} - b^{-1}| = |b_n b|^{-1}\, |b_n - b| < (2|b|^{-1})\, (|b|^{-1})\, |b_n - b| < (2|b|^{-2}) \Big( |b|^2\, \frac{\varepsilon}{2} \Big) = \varepsilon.$$
Thus, $b_n^{-1} \to b^{-1}$ and our proof is complete.
These two “algebra of limits” theorems can be used to evaluate limits in an
easy manner.
Example 3.17. For example, since $\lim \frac{1}{n} = 0$, by our product theorem (Theorem
3.11), we have
$$\lim \frac{1}{n^2} = \Big( \lim \frac{1}{n} \Big) \cdot \Big( \lim \frac{1}{n} \Big) = 0 \cdot 0 = 0.$$
Example 3.18. In particular, since the constant sequence 1 converges to 1, by
our linear combination theorem (Theorem 3.10), for any number $a$, we have
$$\lim \Big( 1 + \frac{a}{n^2} \Big) = \lim 1 + a \cdot \lim \frac{1}{n^2} = 1 + a \cdot 0 = 1.$$
Example 3.19. Now dividing the top and bottom of $\frac{n^2+3}{n^2+7}$ by $n^2$ and using
our theorem on quotients and the limit we just found, we obtain
$$\lim \frac{n^2 + 3}{n^2 + 7} = \frac{\lim \big( 1 + \frac{3}{n^2} \big)}{\lim \big( 1 + \frac{7}{n^2} \big)} = \frac{1}{1} = 1.$$
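The rewriting step in Example 3.19 can be checked with exact rational arithmetic. In the sketch below (function names `q` and `q_rewritten` are ours), `q(n)` is the original term and `q_rewritten(n)` is the same term after dividing numerator and denominator by $n^2$; the gap $1 - q(n) = 4/(n^2+7)$ shows how quickly the terms approach 1.

```python
from fractions import Fraction


def q(n):
    """Exact value of (n^2 + 3) / (n^2 + 7)."""
    return Fraction(n * n + 3, n * n + 7)


def q_rewritten(n):
    """The same term written as (1 + 3/n^2) / (1 + 7/n^2)."""
    return (1 + Fraction(3, n * n)) / (1 + Fraction(7, n * n))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # Print n, the term as a float, and the exact distance to the limit 1.
        print(n, float(q(n)), 1 - q(n))
```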
3.2.5. Properly divergent sequences. When dealing with sequences of real
numbers, inevitably infinities occur. For instance, we know that the sequence {n2 }
diverges since it is unbounded. However, in elementary calculus, we would usu-
ally write n2 → +∞, which suggests that this sequence converges to the number
“infinity”. We now make this notion precise.
A sequence {an } of real numbers diverges to +∞ if given any real number
M > 0, there is a real number N such that for all n > N , an > M . The sequence
diverges to −∞, if for any real number M < 0, there is a real number N such
that for all n > N , an < M . In the first case we write lim an = +∞ or an → +∞
(sometimes we drop the “+” in front of ∞) and in the second case we write lim an =
$-\infty$ or $a_n \to -\infty$. In either case we say that $\{a_n\}$ is properly divergent. It is
important to understand that the symbols $+\infty$ and $-\infty$ are simply notation and
they do not represent real numbers.³ We now present some examples.
Example 3.20. First we show that for any natural number $k$, $n^k \to +\infty$. To
see this, let $M > 0$. Then we want to prove there is an $N$ such that for all $n > N$,
$n^k > M$. To do so, observe that $n^k > M$ if and only if $n > M^{1/k}$. For this reason,
we choose $N = M^{1/k}$. With this choice of $N$, for all $n > N$, we certainly have
$n^k > M$ and our proof is complete. Using a very similar argument, one can show
that $-n^k \to -\infty$.
³ It turns out that ±∞ form part of a number system called the extended real numbers,
which consists of the real numbers together with the symbols +∞ = ∞ and −∞. One can define
addition, multiplication, and order in this system, with the exception that subtraction of infinities
is not allowed. If you take measure theory, you will study this system.

Example 3.21. In Example 3.10, we showed that given any real number $a > 1$,
the sequence $\{a^n\}$ diverges to $+\infty$.
Because ±∞ are not real numbers, some of the limit theorems we have proved
in this section are not valid when ±∞ are the limits, but many do hold under certain
conditions. For example, if an → +∞ and bn → +∞, then for any nonnegative
real numbers c, d, at least one of which is positive, the reader can check that
c an + d bn → +∞.
If $c, d$ are nonpositive with at least one of them negative, then $c\, a_n + d\, b_n \to -\infty$. If
$c$ and $d$ have opposite signs, then there is no general result. For example, if $a_n = n$,
$b_n = n^2$, and $c_n = n + (-1)^n$, then $a_n, b_n, c_n \to +\infty$, but
$$\lim (a_n - b_n) = -\infty, \quad \lim (b_n - c_n) = +\infty, \quad \text{and} \quad \lim (a_n - c_n) \text{ does not exist!}$$
We encourage the reader to think about which limit theorems extend to the case of
infinite limits. For example, here is a squeeze law: If an ≤ bn for all n sufficiently
large and an → +∞, then bn → +∞ as well. Some more limit theorems for infinite
limits are presented in the exercises (see e.g. Problem 10).
Exercises 3.2.
1. Evaluate the following limits by using limits already proven (in the text or exercises)
and invoking the “algebra of limits”.
$$\text{(a) } \lim \frac{(-1)^n\, n}{n^2 + 5}, \quad \text{(b) } \lim \frac{(-1)^n}{n + 10}, \quad \text{(c) } \lim \frac{2^n}{3^n + 10}, \quad \text{(d) } \lim \Big( 7 + \frac{3}{n} \Big)^2.$$
2. Why do the following sequences diverge?
( n
)
X n n
o
n k
(a) {(−1) } , (b) an = (−1) , (c) an = 2n(−1) , (d) {in + 1/n}.
k=0

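3. Let
$$a_n = \sum_{k=1}^{n} \frac{1}{\sqrt{n^2 + k}}, \quad b_n = \sum_{k=1}^{n} \frac{1}{\sqrt{n + k}}, \quad c_n = \sum_{k=1}^{n} \frac{1}{n^n + n!}.$$
Find $\lim a_n$, $\lim b_n$, and $\lim c_n$.
4. (a) Let $a_1 \in \mathbb{R}$ and for $n \ge 1$, define $a_{n+1} = \frac{\operatorname{sgn}(a_n) + 10(-1)^n}{\sqrt{n}}$. Here, $\operatorname{sgn}(x) := 1$ if
$x > 0$, $\operatorname{sgn}(x) := 0$ if $x = 0$, and $\operatorname{sgn}(x) := -1$ if $x < 0$. Find $\lim a_n$.
(b) Let $a_1 \in [-1, 1]$ and for $n \ge 1$, define $a_{n+1} = \frac{a_n}{|a_n| + 1}$. Find $\lim a_n$. Suggestion:
Can you prove that $-1/n \le a_n \le 1/n$ for all $n \in \mathbb{N}$?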
5. If {an } and {bn } are complex sequences with {an } bounded and bn → 0, prove that
an bn → 0. Why is Theorem 3.11 not applicable in this situation?
6. Let {an } be a sequence in Rm .
(a) Let b ∈ Rm and suppose that there is a sequence {bn } in R with bn → 0 and
|an − b| ≤ C|bn | for some C > 0 and for all n. Prove that an → b.
(b) If lim |an | = 0, show that an → 0. It is important that zero is the limit in the
hypothesis. Indeed, give an example of a sequence in R for which lim |an | exists
and is nonzero, but lim an does not exist.
7. (The root test for sequences) Let $\{a_n\}$ be a sequence of positive real numbers with
$L := \lim a_n^{1/n} < 1$. (That is, $\{a_n^{1/n}\}$ converges with limit less than 1.)
(i) Show that there is a real number $r$ with $0 < r < 1$ such that $0 < a_n < r^n$ for all
$n$ sufficiently large, that is, there is an $N$ such that $0 < a_n < r^n$ for all $n > N$.
(ii) Prove that $\lim a_n = 0$.
(iii) If, however, $L > 1$, prove that $\{a_n\}$ is not a bounded sequence, and hence diverges.
(iv) When $L = 1$ the test is inconclusive: Give an example of a convergent sequence
and of a divergent sequence, both of which satisfy $L = 1$.
8. (The ratio test for sequences) Let $\{a_n\}$ be a sequence of positive real numbers
with $L := \lim (a_{n+1}/a_n) < 1$.
(i) Show that there are real numbers $C, r$ with $C > 0$ and $0 < r < 1$ such that
$0 < a_n < C\, r^n$ for all $n$ sufficiently large.
(ii) Prove that $\lim a_n = 0$.
(iii) If, however, $L > 1$, prove that $\{a_n\}$ is not a bounded sequence, and hence diverges.
(iv) When $L = 1$ the test is inconclusive: Give an example of a convergent sequence
and of a divergent sequence, both of which satisfy $L = 1$.
9. Which of the following sequences are properly divergent? Prove your answers.
p n 3n − 10 o n n o
(a) { n2 + 1}, (b) {n(−1)n }, (c) , (d) √ .
2n n + 10
10. Let $\{a_n\}$ and $\{b_n\}$ be sequences of real numbers with $\lim a_n = +\infty$ and $b_n \ne 0$ for $n$
large and suppose that for some real number $L$, we have
$$\lim \frac{a_n}{b_n} = L.$$
(a) If $L > 0$, prove that $\lim b_n = +\infty$.
(b) If $L < 0$, prove that $\lim b_n = -\infty$.
(c) Can you make any conclusions if L = 0?

3.3. The monotone criteria, the Bolzano-Weierstrass theorem, and e


In many examples, we proved the convergence of a sequence by first exhibiting
an a priori limit of the sequence and then proving that the sequence converged to the
exhibited value. For instance, we showed that the sequence {1/(n+1)} converges by
showing that it converges to 0. Can we still determine if a given sequence converges
without producing an a priori limit value? There are two ways to do this, one is
called the monotone criterion and the other is the Cauchy criterion. We study the
monotone criterion in this section and save Cauchy’s criterion for the next. In this
section we work strictly with real numbers.

3.3.1. Monotone criterion. A monotone sequence $\{a_n\}$ of real numbers
is a sequence that is either nondecreasing, $a_n \le a_{n+1}$ for each $n$:
a1 ≤ a2 ≤ a3 ≤ · · · ,
or nonincreasing, an ≥ an+1 for each n:
a1 ≥ a2 ≥ a3 ≥ · · · .
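Example 3.22. Consider the sequence of real numbers $\{a_n\}$ defined inductively
as follows:
$$a_1 = 0, \qquad a_{n+1} = \sqrt{1 + a_n}, \quad n \in \mathbb{N}. \tag{3.25}$$
Thus,
$$a_1 = 0, \quad a_2 = 1, \quad a_3 = \sqrt{1 + \sqrt{1}}, \quad a_4 = \sqrt{1 + \sqrt{1 + \sqrt{1}}}, \ \ldots. \tag{3.26}$$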

We claim that this sequence is nondecreasing: $0 = a_1 \le a_2 \le a_3 \le \cdots$. To see
this, we use induction to prove that $0 \le a_n \le a_{n+1}$ for each $n$. If $n = 1$, then
$a_1 = 0 \le 1 = a_2$. Assume that $0 \le a_n \le a_{n+1}$. Then $1 + a_n \le 1 + a_{n+1}$, so using
that square roots preserve inequalities (our well-known root rules) we see that
$$a_{n+1} = \sqrt{1 + a_n} \le \sqrt{1 + a_{n+1}} = a_{n+2}.$$
This establishes the induction step, so we conclude that our sequence $\{a_n\}$ is
nondecreasing. We also claim that $\{a_n\}$ is bounded. Indeed, we shall prove that $a_n \le 3$
for each $n$. Again we proceed by induction. First, we have $a_1 = 0 \le 3$. If $a_n \le 3$,
then by definition of $a_{n+1}$, we have
$$a_{n+1} = \sqrt{1 + a_n} \le \sqrt{1 + 3} = \sqrt{4} = 2 \le 3,$$
which proves that $\{a_n\}$ is bounded by 3.

Figure 3.2. Bounded monotone (e.g. nondecreasing) sequences must eventually “bunch up” at a point. (Number line marking $a_1, a_2, a_3, \ldots, a_n$.)

The monotone criterion states that a monotone sequence of real numbers con-
verges to a real number if and only if the sequence is bounded. Intuitively, the
terms in a bounded monotone sequence must accumulate at a certain point, see
Figure 3.2. In particular, this implies that our recursive sequence (3.25) converges.

Theorem 3.12 (Monotone criterion). A monotone sequence of real numbers
converges if and only if the sequence is bounded.
Proof. If a sequence converges, then we know that it must be bounded. So,
we need only prove that a bounded monotone sequence converges. So, let {an } be
a bounded monotone sequence. If {an } is nonincreasing, a1 ≥ a2 ≥ a3 ≥ · · · , then
the sequence {−an } is nondecreasing: −a1 ≤ −a2 ≤ · · · . Thus, if we prove that
bounded nondecreasing sequences converge, then lim −an would exist. This would
imply that lim an = − lim −an exists too. So it remains to prove our theorem
under the assumption that {an } is nondecreasing: a1 ≤ a2 ≤ · · · . Let L :=
sup{a1 , a2 , a3 , . . .}, which exists since the sequence is bounded. Let ε > 0. Then
L − ε is smaller than L. Since L is the least upper bound of the set {a1 , a2 , a3 , . . .}
and L − ε < L, there must exist an N so that L − ε < aN ≤ L. Since a1 ≤ a2 ≤ · · · ,
for all n > N we must have L − ε < an ≤ L and since L < L + ε, we conclude that
n>N =⇒ L − ε < an < L + ε.
Hence, lim an = L. 
Example 3.23. The monotone criterion implies that our sequence (3.25) converges, say $a_n \to L \in \mathbb{R}$. Squaring both sides of $a_{n+1} = \sqrt{1 + a_n}$, we see that
\[ a_{n+1}^2 = 1 + a_n. \]
The subsequence $\{a_{n+1}\}$ also converges to $L$ by Theorem 3.9, therefore by the algebra of limits,
\[ L^2 = \lim a_{n+1}^2 = \lim (1 + a_n) = 1 + L, \]
or solving for $L$, we get (using the quadratic formula),
\[ L = \frac{1 \pm \sqrt{5}}{2}. \]
Since $0 = a_1 \le a_2 \le a_3 \le \cdots \le a_n \to L$, and limits preserve inequalities (Theorem 3.8), the limit $L$ cannot be negative, so we conclude that
\[ L = \frac{1 + \sqrt{5}}{2}, \]
a number called the golden ratio and denoted by Φ. In view of the expressions found in (3.26), we can interpret Φ as the infinite “continued square root”:
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}}}. \tag{3.27} \]
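The convergence behind (3.27) is easy to watch numerically. The following is a minimal Python sketch and not part of the original development: the function name `continued_sqrt`, the use of floating-point arithmetic, and the choice of 40 terms are our own illustrative assumptions. It iterates the recursion (3.25) and compares the terms with $(1 + \sqrt{5})/2$.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio, the limit found in Example 3.23


def continued_sqrt(n):
    """Return a_n for the recursion (3.25): a_1 = 0, a_{n+1} = sqrt(1 + a_n).

    The name and the floating-point setting are illustrative assumptions.
    """
    a = 0.0
    for _ in range(n - 1):
        a = math.sqrt(1.0 + a)
    return a


terms = [continued_sqrt(n) for n in range(1, 41)]

# Print a few terms together with their distance to the golden ratio.
for n in (1, 2, 3, 5, 10, 20, 40):
    print(n, terms[n - 1], PHI - terms[n - 1])
```

The printed gaps shrink quickly, as the monotone criterion leads us to expect. Of course, a finite computation only illustrates the limit; the proof is the argument given in Examples 3.22 and 3.23.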

There are many stories about Φ; unfortunately, many of them are false, see [146].
Our next important theorem is the monotone subsequence theorem. There are
many nice proofs of this theorem, cf. the articles [158], [20], [223]. Given any set
A of real numbers, a number a is said to be the maximum of A if a ∈ A and
a = sup A, in which case we write a = max A.
Theorem 3.13 (Monotone subsequence theorem). Any sequence of real
numbers has a monotone subsequence.
Proof. Let {an } be a sequence of real numbers. Then the statement “for
every n ∈ N, the maximum of the set {an , an+1 , an+2 , . . .} exists” is either a true
statement, or it’s false, which means “there is an m ∈ N such that the maximum
of the set {am , am+1 , am+2 , . . .} does not exist.”
Case 1: Suppose that we are in the first case: for each $n$, the set $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ has a greatest member. In particular, we can choose $a_{\nu_1}$ such that
\[ a_{\nu_1} = \max\{a_1, a_2, \ldots\}. \]
Now $\{a_{\nu_1+1}, a_{\nu_1+2}, \ldots\}$ has a greatest member, so we can choose $a_{\nu_2}$ such that
\[ a_{\nu_2} = \max\{a_{\nu_1+1}, a_{\nu_1+2}, \ldots\}. \]
Since $a_{\nu_2}$ is obtained by taking the maximum of a smaller set of elements, we have $a_{\nu_2} \le a_{\nu_1}$. Let
\[ a_{\nu_3} = \max\{a_{\nu_2+1}, a_{\nu_2+2}, \ldots\}. \]
Since $a_{\nu_3}$ is obtained by taking the maximum of a smaller set of elements than the set defining $a_{\nu_2}$, we have $a_{\nu_3} \le a_{\nu_2}$. Continuing by induction we construct a monotone (nonincreasing) subsequence.
Case 2: Suppose that the maximum of the set $A = \{a_m, a_{m+1}, a_{m+2}, \ldots\}$ does not exist, where $m \ge 1$. Let $a_{\nu_1} = a_m$. Since $A$ has no maximum, there is a $\nu_2 > m$ such that
\[ a_m < a_{\nu_2}, \]
for, if there were no such $a_{\nu_2}$, then $a_m$ would be a maximum element of $A$, which we know is not possible. Since none of the elements $a_m, a_{m+1}, \ldots, a_{\nu_2}$ is a maximum element of $A$, there must exist a $\nu_3 > \nu_2$ such that
\[ a_{\nu_2} < a_{\nu_3}, \]
for, otherwise one of $a_m, \ldots, a_{\nu_2}$ would be a maximum element of $A$. Similarly, since none of $a_m, \ldots, a_{\nu_2}, \ldots, a_{\nu_3}$ is a maximum element of $A$, there must exist a $\nu_4 > \nu_3$ such that
\[ a_{\nu_3} < a_{\nu_4}. \]
Continuing by induction we construct a monotone (nondecreasing) sequence $\{a_{\nu_k}\}$.

3.3.2. The Bolzano-Weierstrass theorem. The following theorem named
after Bernard Bolzano (1781–1848) and Karl Weierstrass (1815–1897) is one of the
most important results in analysis and will be frequently employed in the sequel.
Theorem 3.14 (Bolzano-Weierstrass theorem for R). Every bounded se-
quence in R has a convergent subsequence. In fact, if the sequence is contained in
a closed interval I, then the limit of the convergent subsequence is also in I.
Proof. Let $\{a_n\}$ be a bounded sequence in $\mathbb{R}$. By the monotone subsequence theorem, this sequence has a monotone subsequence $\{a_{\nu_n}\}$, which of course is also bounded. By the monotone criterion (Theorem 3.12), this subsequence converges to some limit value $L$. Suppose that $\{a_n\}$ is contained in a closed interval $I = [a, b]$. Then $a \le a_{\nu_n} \le b$ for each $n$. Since limits preserve inequalities, the limit $L$ of the subsequence $\{a_{\nu_n}\}$ also lies in $[a, b]$.
Using induction on m (we already did the m = 1 case), we leave the proof of
the following generalization to you, if you’re interested.
Theorem 3.15 (Bolzano-Weierstrass theorem for Rm ). Every bounded
sequence in Rm has a convergent subsequence.
3.3.3. The number e. We now define Euler’s constant e by a method that
has been around for ages, cf. [120, p. 82], [243]. Consider the two sequences whose
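terms are given by
\[ a_n = \left(1 + \frac{1}{n}\right)^{n} = \left(\frac{n+1}{n}\right)^{n} \quad\text{and}\quad b_n = \left(1 + \frac{1}{n}\right)^{n+1} = \left(\frac{n+1}{n}\right)^{n+1}, \]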
where n = 1, 2, . . .. We shall prove that the sequence {an } is bounded above and is
strictly increasing, which means that an < an+1 for all n. We’ll also prove that
{bn } is bounded below and strictly decreasing, which means that bn > bn+1 for
all n. In particular, the limits lim an and lim bn exist by the monotone criterion.
Notice that
\[ b_n = a_n \left(1 + \frac{1}{n}\right), \]
and 1 + 1/n → 1, so if sequences {an } and {bn } converge, they must converge to
the same limit. This limit is denoted by the letter e, introduced in 1727 by Euler
perhaps because “e” is the first letter in “exponential” [36, p. 442], not because
“e” is the first letter of his last name!
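The proof that the sequences above are monotone follows from Bernoulli’s inequality. First, to see that $b_{n-1} > b_n$ for $n \ge 2$, observe that
\[ \frac{b_{n-1}}{b_n} = \left(\frac{n}{n-1}\right)^{n} \left(\frac{n}{n+1}\right)^{n+1} = \left(\frac{n^2}{n^2-1}\right)^{n} \left(\frac{n}{n+1}\right) = \left(1 + \frac{1}{n^2-1}\right)^{n} \left(\frac{n}{n+1}\right). \]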
According to Bernoulli’s inequality, we have
\[ \left(1 + \frac{1}{n^2-1}\right)^{n} > 1 + \frac{n}{n^2-1} > 1 + \frac{n}{n^2} = \frac{n+1}{n}, \]
which implies that
\[ \frac{b_{n-1}}{b_n} > \frac{n+1}{n} \cdot \frac{n}{n+1} = 1 \implies b_{n-1} > b_n \]
and proves that {bn } is strictly decreasing. Certainly bn > 0 for each n, so the
sequence bn is bounded below and hence converges.
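To see that $a_{n-1} < a_n$ for $n \ge 2$, we proceed in a similar manner:
\[ \frac{a_n}{a_{n-1}} = \left(\frac{n+1}{n}\right)^{n} \left(\frac{n-1}{n}\right)^{n-1} = \left(\frac{n^2-1}{n^2}\right)^{n} \left(\frac{n}{n-1}\right) = \left(1 - \frac{1}{n^2}\right)^{n} \left(\frac{n}{n-1}\right). \]
Bernoulli’s inequality for $n \ge 2$ implies that
\[ \left(1 - \frac{1}{n^2}\right)^{n} > 1 - \frac{n}{n^2} = 1 - \frac{1}{n} = \frac{n-1}{n}, \]
so
\[ \frac{a_n}{a_{n-1}} > \frac{n-1}{n} \cdot \frac{n}{n-1} = 1 \implies a_{n-1} < a_n. \]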
This shows that {an } is strictly increasing. Finally, since an < bn < b1 = 4, the
sequence {an } is bounded above.
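In conclusion, we have proved that the limit
\[ e := \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^{n} \]
exists; by definition, this limit is the number denoted by e. Moreover, we have also derived the inequality
\[ \left(1 + \frac{1}{n}\right)^{n} < e < \left(1 + \frac{1}{n}\right)^{n+1}, \quad \text{for all } n. \tag{3.28} \]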
We shall need this inequality later when we discuss Euler-Mascheroni’s constant.
This inequality is also useful in studying Stirling’s formula, which we now describe.
Recall that 0! := 1 and given a positive integer $n$, we define $n!$ (which we read “$n$ factorial”) as $n! := 1 \cdot 2 \cdot 3 \cdots n$. For $n \ge 2$, observe that $n!$ is less than $n^n$, or equivalently, $\sqrt[n]{n!}/n < 1$. A natural question to ask is: How much less than one is the ratio $\sqrt[n]{n!}/n$? Using (3.28), in Problem 5 you will prove that
\[ \lim \frac{\sqrt[n]{n!}}{n} = \frac{1}{e} \quad \text{(Stirling’s formula).} \tag{3.29} \]
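Both the two-sided estimate (3.28) and the limit (3.29) can be sanity-checked numerically. The sketch below is only an illustration under our own assumptions: the helper names `lower`, `upper` and `root_ratio` are hypothetical, the arithmetic is floating point, and $\sqrt[n]{n!}$ is computed through the log-gamma function (using $n! = \Gamma(n+1)$) so that large factorials do not overflow.

```python
import math


def lower(n):
    """a_n = (1 + 1/n)^n, the left-hand side of (3.28)."""
    return (1 + 1 / n) ** n


def upper(n):
    """b_n = (1 + 1/n)^(n+1), the right-hand side of (3.28)."""
    return (1 + 1 / n) ** (n + 1)


def root_ratio(n):
    """(n!)^(1/n) / n, computed as exp(log(n!) / n) / n with log(n!) = lgamma(n + 1)."""
    return math.exp(math.lgamma(n + 1) / n) / n


for n in (1, 10, 100, 1000):
    print(n, lower(n), upper(n), root_ratio(n))

print("e   =", math.e)
print("1/e =", 1 / math.e)
```

The first two columns squeeze e from below and above, while the last column drifts (slowly) toward 1/e.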
Exercises 3.3.
1. (a) Show that the sequence defined inductively by $a_{n+1} = \frac{1}{3}(2a_n + 4)$ with $a_1 = 0$ is nondecreasing and bounded above by 6. Suggestion: Use induction, e.g. prove that $a_n \le a_{n+1}$ and $a_n \le 6$ by induction on $n$. Determine the limit.
(b) Let $a_1 = 1$. Show that the sequence defined inductively by $a_{n+1} = \sqrt{2a_n}$ is nondecreasing and bounded above by 2. Determine the limit.
(c) Let $a \ge 2$. Show that the sequence defined inductively by $a_{n+1} = 2 - a_n^{-1}$ with $a_1 = a$ is a bounded monotone sequence. Determine the limit. Suggestion: Pick e.g. $a = 2$ or $a = 10$ and calculate a few values of $a_n$ to conjecture if $\{a_n\}$ is, for general $a > 1$, nondecreasing or nonincreasing. Also conjecture what a bound may be from these examples. Now prove your conjecture using induction.
(d) Let $a > 0$. Show that the sequence defined inductively by $a_{n+1} = a_n/(1 + 2a_n)$ with $a_1 = a$ is a bounded monotone sequence. Determine the limit.
(e) Show that the sequence defined inductively by $a_{n+1} = 5 + \sqrt{a_n - 5}$ with $a_1 = 10$ is a bounded monotone sequence. Determine the limit.
(f) Show that the sequence with $a_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}$ is a bounded monotone sequence. The limit of this sequence is not at all obvious (it turns out to be log 2, which we’ll study in Section 4.6).
2. In this problem we give two different ways to determine square roots using sequences.
(1) Let $a > 0$. Let $a_1$ be any positive number and define
\[ a_{n+1} = \frac{1}{2}\left(a_n + \frac{a}{a_n}\right), \quad n \ge 1. \]
(a) Show that $a_n > 0$ for all $n$ and $a_{n+1}^2 - a \ge 0$ for all $n$.
(b) Show that $\{a_n\}$ is nonincreasing for $n \ge 2$.
(c) Conclude that $\{a_n\}$ converges and determine its limit.
(2) Let $0 \le a \le 1$. Show that the sequence defined inductively by $a_{n+1} = a_n + \frac{1}{2}(a - a_n^2)$ with $a_1 = 0$ is nondecreasing and bounded above by $\sqrt{a}$. Determine the limit. Suggestion: To prove that $a_{n+1}$ is bounded above by $\sqrt{a}$, assume $a_n$ is and write $a_n = \sqrt{a} - \varepsilon$ where $\varepsilon \ge 0$.
3. (Cf. [250]) In this problem we analyze the constant e based on the arithmetic-geometric mean inequality (AGMI); see Problem 7 of Exercises 2.2. Recall that the arithmetic-geometric mean inequality states that given any $n + 1$ nonnegative real numbers $x_1, \ldots, x_{n+1}$,
\[ x_1 \cdot x_2 \cdots x_{n+1} \le \left(\frac{x_1 + x_2 + \cdots + x_{n+1}}{n+1}\right)^{n+1}. \]
(i) Put $x_k = (1 + 1/n)$ for $k = 1, \ldots, n$ and $x_{n+1} = 1$, in the AGMI to prove that the sequence $a_n = (1 + \frac{1}{n})^n$ is nondecreasing.
(ii) If $b_n = (1 + 1/n)^{n+1}$, then show that for $n \ge 2$,
\[ \frac{b_n}{b_{n-1}} = \left(1 - \frac{1}{n^2}\right)^{n} \left(1 + \frac{1}{n}\right) = \underbrace{\left(1 - \frac{1}{n^2}\right) \cdots \left(1 - \frac{1}{n^2}\right)}_{n \text{ times}} \left(1 + \frac{1}{n}\right). \]
Applying the AGMI to the right hand side, show that $b_n/b_{n-1} \le 1$, which shows that the sequence $\{b_n\}$ is nonincreasing.
(iii) Conclude that both sequences $\{a_n\}$ and $\{b_n\}$ converge. Of course, just as in the text we denote their common limit by e.
4. (Continued roots) For more on this subject, see [4], [101], [149, p. 775], [215], [106].
(1) Fix $k \in \mathbb{N}$ with $k \ge 2$ and fix $a > 0$. Show that the sequence defined inductively by $a_{n+1} = \sqrt[k]{a + a_n}$ with $a_1 = \sqrt[k]{a}$ is a bounded monotone sequence. Prove that the limit $L$ is a root of the equation $x^k - x - a = 0$. Can you see why $L$ can be thought of as
\[ L = \sqrt[k]{a + \sqrt[k]{a + \sqrt[k]{a + \sqrt[k]{a + \cdots}}}}. \]
(2) Let $\{a_n\}$ be a sequence of nonnegative real numbers and for each $n$, define
\[ \alpha_n := \sqrt{a_1 + \sqrt{a_2 + \sqrt{a_3 + \cdots + \sqrt{a_n}}}}. \]
Prove that $\{\alpha_n\}$ converges if and only if there is a constant $M \ge 0$ such that $\sqrt[2^n]{a_n} \le M$ for all $n$. Suggestion: To prove the “only if” portion, prove that $\sqrt[2^n]{a_n} \le \alpha_n$. To prove the “if”, prove that
\[ \alpha_n \le \sqrt{M^{2} + \sqrt{M^{2^2} + \cdots + \sqrt{M^{2^n}}}} = M b_n, \quad \text{where } b_n = \sqrt{1 + \sqrt{1 + \cdots + \sqrt{1}}} \]
is defined in (3.25); in particular, we showed that $b_n \le 3$. Now setting $a_n = n$ for all $n$, show that
\[ \sqrt{1 + \sqrt{2 + \sqrt{3 + \sqrt{4 + \sqrt{5 + \cdots}}}}} \]
exists. This number is called Kasner’s number named after Edward Kasner (1878–1955) and is approximately 1.75793 . . ..
5. In this problem we prove (3.29).
(a) Prove that for each natural number $n$, $(n-1)! \le n^n e^{-n} e \le n!$. Suggestion: Can you use induction and (3.28)? (You can also prove these inequalities using integrals as in [128, p. 219], but using (3.28) gives an “elementary” proof that is free of integration theory.)
(b) Using (a), prove that for every natural number $n$,
\[ \frac{e^{1/n}}{e} \le \frac{\sqrt[n]{n!}}{n} \le \frac{e^{1/n}\, n^{1/n}}{e}. \]
(c) Now prove (3.29). Using (3.29), prove that
\[ \lim \left(\frac{(3n)!}{n^{3n}}\right)^{1/n} = \frac{27}{e^3} \quad\text{and}\quad \lim \left(\frac{(3n)!}{n!\, n^{2n}}\right)^{1/n} = \frac{27}{e^2}. \]

3.4. Completeness and the Cauchy criteria for convergence


The monotone criterion gives a criterion for convergence (in R) of a monotone
sequence of real numbers. Now what if the sequence is not monotone? The Cauchy
criterion, originating with Bolzano, but then made into a formulated “criterion”
by Cauchy [120, p. 87], gives a convergence criterion for general sequences of real
numbers and more generally, sequences of complex numbers and vectors.
3.4.1. Cauchy sequences. A sequence $\{a_n\}$ in $\mathbb{R}^m$ is said to be Cauchy if, for every $\varepsilon > 0$, there is an $N \in \mathbb{R}$ such that $k, n > N \implies |a_k - a_n| < \varepsilon$. Intuitively, all the $a_n$’s get closer together as the indices $n$ get larger and larger.
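Example 3.24. The sequence of real numbers with $a_n = \frac{2n-1}{n-3}$ and $n \ge 4$ is Cauchy. To see this, let $\varepsilon > 0$. We need to prove that there is a real number $N$ such that
\[ k, n > N \implies \left|\frac{2k-1}{k-3} - \frac{2n-1}{n-3}\right| < \varepsilon. \]
To see this, we “massage” the right-hand expression:
\begin{align*}
\left|\frac{2k-1}{k-3} - \frac{2n-1}{n-3}\right| &= \left|\frac{(2k-1)(n-3) - (2n-1)(k-3)}{(k-3)(n-3)}\right| \\
&= \left|\frac{5(n-k)}{(k-3)(n-3)}\right| \le \frac{5n}{(k-3)(n-3)} + \frac{5k}{(k-3)(n-3)}.
\end{align*}
Now observe that for $n \ge 4$, we have $\frac{n}{4} \ge 1$, so
\[ n - 3 \ge n - 3 \cdot \frac{n}{4} = n - \frac{3n}{4} = \frac{n}{4} \implies \frac{1}{n-3} \le \frac{4}{n}. \]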
Thus, for $n, k \ge 4$, we have
\[ \frac{5n}{(k-3)(n-3)} + \frac{5k}{(k-3)(n-3)} \le 5n \cdot \frac{4}{k} \cdot \frac{4}{n} + 5k \cdot \frac{4}{k} \cdot \frac{4}{n} = \frac{80}{k} + \frac{80}{n}. \]
Thus, our “massaged” expression has been fully relaxed as an inequality
\[ \text{for } k, n \ge 4, \quad \left|\frac{2k-1}{k-3} - \frac{2n-1}{n-3}\right| \le \frac{80}{k} + \frac{80}{n}. \]
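Thus, we can make the left-hand side less than $\varepsilon$ by making the right-hand side less than $\varepsilon$, and we can do this by noticing that we can make
\[ \frac{80}{k} + \frac{80}{n} < \varepsilon \quad \text{by making} \quad \frac{80}{k},\ \frac{80}{n} < \frac{\varepsilon}{2}, \]
or after solving for $k$ and $n$, we must have $k, n > 160/\varepsilon$. For this reason, let us pick $N$ to be the larger of 3 and $160/\varepsilon$. Let $k, n > N$ (that is, $k, n \ge 4$ and $k, n > 160/\varepsilon$). Then,
\[ \left|\frac{2k-1}{k-3} - \frac{2n-1}{n-3}\right| \le \frac{80}{k} + \frac{80}{n} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This shows that the sequence $\{a_n\}$ is Cauchy. Notice that this sequence, $\left\{\frac{2n-1}{n-3}\right\}$, converges (to the number 2).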
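The choice of $N$ in Example 3.24 can be tested on a computer. The following Python sketch is only an illustration: the function names `a`, `cauchy_index` and `check`, and the decision to test a window of 30 consecutive indices beyond $N$, are our own assumptions. Exact rational arithmetic (`fractions.Fraction`) is used so that rounding plays no role.

```python
from fractions import Fraction


def a(n):
    """The term a_n = (2n - 1)/(n - 3) of Example 3.24 (requires n >= 4)."""
    return Fraction(2 * n - 1, n - 3)


def cauchy_index(eps):
    """The threshold N = max(3, 160/eps) chosen in Example 3.24."""
    return max(Fraction(3), Fraction(160) / Fraction(eps))


def check(eps, span=30):
    """Test |a_k - a_n| < eps for the first `span` integers k, n greater than N.

    This only samples finitely many index pairs, so it illustrates the
    estimate rather than proving it.
    """
    N = cauchy_index(eps)
    start = int(N) + 1  # smallest integer strictly greater than N (N > 0)
    indices = range(start, start + span)
    return all(abs(a(k) - a(n)) < eps for k in indices for n in indices)


for eps in (Fraction(1), Fraction(1, 10), Fraction(1, 100)):
    print(eps, cauchy_index(eps), check(eps))
```

The threshold $160/\varepsilon$ is far from optimal, but that is beside the point: the definition only asks that some $N$ works.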
Example 3.25. Here’s a more sophisticated example of a Cauchy sequence. Let $a_1 = 1$, $a_2 = 1/2$, and for $n > 2$, we let $a_n$ be the arithmetic mean between the previous two terms:
\[ a_n = \frac{a_{n-2} + a_{n-1}}{2}, \quad n > 2. \]
Thus, $a_1 = 1$, $a_2 = 1/2$, $a_3 = 3/4$, $a_4 = 5/8, \ldots$ so this sequence is certainly not monotone. However, we shall prove that $\{a_n\}$ is Cauchy. To do so, we first prove by induction that
\[ a_n - a_{n+1} = \frac{(-1)^{n+1}}{2^n}. \tag{3.30} \]
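Since $a_1 = 1$ and $a_2 = 1/2$, this equation holds for $n = 1$. Assume that the equation holds for $n$. Then
\[ a_{n+1} - a_{n+2} = a_{n+1} - \frac{a_{n+1} + a_n}{2} = -\frac{1}{2}\left(a_n - a_{n+1}\right) = -\frac{1}{2} \cdot \frac{(-1)^{n+1}}{2^n} = \frac{(-1)^{n+2}}{2^{n+1}}, \]
which proves the induction step. With (3.30) at hand, we can now show that the sequence $\{a_n\}$ is Cauchy. Let $k, n$ be any natural numbers. By symmetry we may assume that $k \le n$ (otherwise just switch $k$ and $n$ in what follows), let us say $n = k + j$ where $j \ge 0$. Then according to (3.30) and the sum of a geometric progression (2.3), we can write
\begin{align*}
a_k - a_n &= a_k - a_{k+j} \\
&= \left(a_k - a_{k+1}\right) + \left(a_{k+1} - a_{k+2}\right) + \left(a_{k+2} - a_{k+3}\right) + \cdots + \left(a_{k+j-1} - a_{k+j}\right) \\
&= \frac{(-1)^{k+1}}{2^k} + \frac{(-1)^{k+2}}{2^{k+1}} + \frac{(-1)^{k+3}}{2^{k+2}} + \cdots + \frac{(-1)^{k+j}}{2^{k+j-1}} \\
&= \frac{(-1)^{k+1}}{2^k} \left[1 + \left(\frac{-1}{2}\right) + \left(\frac{-1}{2}\right)^{2} + \cdots + \left(\frac{-1}{2}\right)^{j-1}\right] \\
&= \frac{(-1)^{k+1}}{2^k} \cdot \frac{1 - (-1/2)^j}{1 - (-1/2)} = \frac{(-1)^{k+1}}{2^k} \cdot \frac{2}{3} \cdot \left(1 - (-1/2)^j\right). \tag{3.31}
\end{align*}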
Since $\frac{2}{3} \cdot \left|1 - (-1/2)^j\right| \le \frac{2}{3} \cdot (1 + 1) \le 2$, we conclude that
\[ |a_k - a_n| \le \frac{1}{2^{k-1}}, \quad \text{for all } k, n \text{ with } k \le n. \]
Now let $\varepsilon > 0$. Since $1/2 < 1$, we know that $1/2^{k-1} = 2 \cdot (1/2)^k \to 0$ as $k \to \infty$ (see Example 3.5 in Subsection 3.1.3). Therefore there is an $N$ such that for all $k > N$, $1/2^{k-1} < \varepsilon$. Let $k, n > N$ and again by symmetry, we may assume that $k \le n$. In this case, we have
\[ |a_k - a_n| \le \frac{1}{2^{k-1}} < \varepsilon. \]
This proves that the sequence $\{a_n\}$ is Cauchy. Moreover, we claim that this sequence also converges. Indeed, by (3.31) with $k = 1$, so that $n = 1 + j$ or $j = n - 1$, we see that
\[ 1 - a_n = \frac{1}{2} \cdot \frac{2}{3} \cdot \left(1 - (-1/2)^{n-1}\right) \implies a_n = 1 - \frac{1}{3} \cdot \left(1 - (-1/2)^{n-1}\right). \]
Since $|(-1/2)| = 1/2 < 1$, we know that $(-1/2)^{n-1} \to 0$. Taking $n \to \infty$, we conclude that
\[ \lim a_n = 1 - \frac{1}{3}(1 - 0) = \frac{2}{3}. \]
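The identities (3.30) and (3.31) lend themselves to an exact check. The sketch below, written under our own assumptions (the names `averaged_terms` and `closed_form` are hypothetical, and 30 terms is an arbitrary cutoff), generates the sequence of Example 3.25 with rational arithmetic and compares it with the closed form $a_n = 1 - \frac{1}{3}\left(1 - (-1/2)^{n-1}\right)$ obtained from (3.31) with $k = 1$.

```python
from fractions import Fraction


def averaged_terms(count):
    """First `count` terms of a_1 = 1, a_2 = 1/2, a_n = (a_{n-2} + a_{n-1})/2."""
    terms = [Fraction(1), Fraction(1, 2)]
    while len(terms) < count:
        terms.append((terms[-2] + terms[-1]) / 2)
    return terms[:count]


def closed_form(n):
    """a_n = 1 - (1/3)(1 - (-1/2)^(n-1)), i.e. (3.31) with k = 1."""
    return 1 - Fraction(1, 3) * (1 - Fraction(-1, 2) ** (n - 1))


terms = averaged_terms(30)

# Compare the recursion with the closed form and watch the gap to 2/3 shrink.
for n in (1, 2, 3, 4, 10, 30):
    print(n, terms[n - 1], closed_form(n), float(terms[n - 1] - Fraction(2, 3)))
```

The terms alternate around 2/3, which is why the sequence is not monotone even though it converges.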
We have thus far given examples of two Cauchy sequences and we have observed
that both sequences converge. In Theorem 3.17 we shall prove that any Cauchy
sequence converges, and conversely, every convergent sequence is also Cauchy.
3.4.2. Cauchy criterion. The following two proofs use the “ε/2-trick.”
Lemma 3.16. If a subsequence of a Cauchy sequence in Rm converges, then the
whole sequence converges too, and with the same limit as the subsequence.
Proof. Let {an } be a Cauchy sequence and assume that aνn → L for some
subsequence of {an }. We shall prove that an → L. Let ε > 0. Since {an } is Cauchy,
there is an $N$ such that
\[ k, n > N \implies |a_k - a_n| < \frac{\varepsilon}{2}. \]
Since $a_{\nu_n} \to L$ there is a natural number $k \in \{\nu_1, \nu_2, \nu_3, \nu_4, \ldots\}$ with $k > N$ such that
\[ |a_k - L| < \frac{\varepsilon}{2}. \]
Now let $n > N$ be arbitrary. Then using the triangle inequality and the two inequalities we just wrote down, we see that
\[ |a_n - L| = |a_n - a_k + a_k - L| \le |a_n - a_k| + |a_k - L| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that an → L and our proof is complete. 

Theorem 3.17 (Cauchy criterion). A sequence in $\mathbb{R}^m$ converges if and only if it is Cauchy.
Proof. Let $\{a_n\}$ be a sequence in $\mathbb{R}^m$ converging to $L \in \mathbb{R}^m$. We shall prove that the sequence is Cauchy. Let $\varepsilon > 0$. Since $a_n \to L$ there is an $N$ such that for all $n > N$, we have $|a_n - L| < \frac{\varepsilon}{2}$. Hence, by the triangle inequality
\[ k, n > N \implies |a_k - a_n| \le |a_k - L| + |L - a_n| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that a convergent sequence is also Cauchy.


Now let {an } be Cauchy. We shall prove that this sequence also converges.
First we prove that the sequence is bounded. To see this, let us put $\varepsilon = 1$ in the definition of being a Cauchy sequence; then there is an $N$ such that for all $k, n > N$, we have $|a_k - a_n| < 1$. Fix any $k > N$. Then by the triangle inequality, for any $n > k$, we have
\[ |a_n| = |a_n - a_k + a_k| \le |a_n - a_k| + |a_k| \le 1 + |a_k|. \]
Hence, for any natural number $n$, this inequality implies that
\[ |a_n| \le \max\{|a_1|, |a_2|, |a_3|, \ldots, |a_{k-1}|, 1 + |a_k|\}. \]
This shows that the sequence {an } is bounded. Therefore, the Bolzano-Weierstrass
theorem implies that this sequence has a convergent subsequence. Our lemma now
guarantees that the whole sequence {an } converges. 
Because every Cauchy sequence in Rm converges in Rm , we say that Rm is
complete. This property of Rm is essential to many objects in analysis, e.g. series,
differentiation, integration, . . ., all of which use limit processes. Q is an example of
something that is not complete.
Example 3.26. The sequence
\[ 1,\ 1.4,\ 1.41,\ 1.414,\ 1.4142,\ 1.41421,\ \ldots \]
is a Cauchy sequence of rational numbers, but its limit (which is supposed to be $\sqrt{2}$) does not exist as a rational number! (By the way, we’ll study decimal expansions of real numbers in Section 3.8.)
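To make Example 3.26 concrete, here is a small Python sketch. It rests on our own assumptions: the name `sqrt2_truncation` is hypothetical, and we read the sequence as the decimal truncations of $\sqrt{2}$, which we generate with the integer square root `math.isqrt`. Every term is an exact rational number, consecutive terms get arbitrarily close, yet no term squares to 2.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(d):
    """Largest decimal with d digits after the point whose square is at most 2.

    isqrt(2 * 10**(2d)) is the integer part of sqrt(2) * 10**d.
    """
    return Fraction(isqrt(2 * 10 ** (2 * d)), 10 ** d)


terms = [sqrt2_truncation(d) for d in range(12)]

for d, x in enumerate(terms):
    # x is rational; 2 - x^2 is positive and shrinks, but never reaches 0.
    print(d, x, float(2 - x * x))
```

For $j < k$ the terms satisfy $0 \le x_k - x_j < 10^{-j}$, which is exactly the Cauchy property for this sequence; the would-be limit simply is not available inside Q.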
From this example you can imagine the difficulties the noncompleteness of Q
can cause when trying to do analysis with strictly rational numbers.
3.4.3. Contractive sequences. Cauchy’s criterion is important because it
allows us to determine whether a sequence converges or diverges by instead proving
that the sequence is or is not Cauchy. Thus, we now focus on how to determine
whether or not a sequence is Cauchy. Of course, we could appeal directly to the
definition of a Cauchy sequence, but unfortunately it is sometimes difficult to show
that a given sequence is Cauchy directly from the definition. However, for a wide
variety of applications it is often easier to show that a sequence is “contractive,”
which automatically implies that it is Cauchy (see the contractive sequence theorem
below), and hence by the Cauchy criterion, must converge.
A sequence $\{a_n\}$ in $\mathbb{R}^m$ is said to be contractive if there is a $0 < r < 1$ such that for all $n \ge 2$,
\[ |a_n - a_{n+1}| \le r\, |a_{n-1} - a_n|. \tag{3.32} \]

Theorem 3.18 (Contractive sequence theorem). Every contractive sequence converges.
Proof. Let $\{a_n\}$ be a contractive sequence. Then with $n = 2$ in (3.32), we see that
\[ |a_2 - a_3| \le r\, |a_1 - a_2|, \]
and with $n = 3$,
\[ |a_3 - a_4| \le r\, |a_2 - a_3| \le r \cdot r\, |a_1 - a_2| = r^2 |a_1 - a_2|. \]
By induction, for $n \ge 2$ we get
\[ |a_n - a_{n+1}| \le C\, r^{n-1}, \quad \text{where } C = |a_1 - a_2|. \tag{3.33} \]
To prove that $\{a_n\}$ converges, all we have to do is prove the sequence is Cauchy. Let $k, n \ge 2$. By symmetry we may assume that $n \le k$ (otherwise just switch $k$ and $n$ in what follows), say $k = n + j$ where $j \ge 0$. Then according to (3.33), the triangle inequality, and the geometric progression (2.3), we can write
\begin{align*}
|a_n - a_k| &= \left|\left(a_n - a_{n+1}\right) + \left(a_{n+1} - a_{n+2}\right) + \cdots + \left(a_{n+j-1} - a_{n+j}\right)\right| \\
&\le |a_n - a_{n+1}| + |a_{n+1} - a_{n+2}| + |a_{n+2} - a_{n+3}| + \cdots + |a_{n+j-1} - a_{n+j}| \\
&\le C\, r^{n-1} + C\, r^{n} + C\, r^{n+1} + \cdots + C\, r^{n+j-2} \\
&= C\, r^{n-1} \left(1 + r + r^2 + \cdots + r^{j-1}\right) = C\, r^{n-1}\, \frac{1 - r^j}{1 - r} \le \frac{C}{1-r}\, r^{n-1}.
\end{align*}
We are now ready to prove that the sequence $\{a_n\}$ is Cauchy. Let $\varepsilon > 0$. Since $r < 1$, we know that $\frac{C}{1-r} r^{n-1} \to 0$ as $n \to \infty$. Therefore there is an $N$, which we may assume is greater than 1, such that for all $n > N$, $\frac{C}{1-r} r^{n-1} < \varepsilon$. Let $k, n > N$. Then $k, n \ge 2$ and by symmetry, we may assume that $n \le k$, in which case by the above calculation we find that
\[ |a_n - a_k| \le \frac{C}{1-r}\, r^{n-1} < \varepsilon. \]
This proves that the sequence {an } is Cauchy. 
By the tails theorem (Theorem 3.3), a sequence $\{a_n\}$ will converge as long as
(3.32) holds for $n$ sufficiently large. We now consider an example.
Example 3.27. Define
\[ a_1 = 1 \quad\text{and}\quad a_{n+1} = \sqrt{9 - 2a_n}, \quad n \ge 1. \tag{3.34} \]
It is not a priori obvious that this sequence is well-defined; how do we know that $9 - 2a_n \ge 0$ for all $n$ so that we can take the square root $\sqrt{9 - 2a_n}$ to define $a_{n+1}$? Thus, we need to show that $a_n$ cannot get “too big” so that $9 - 2a_n$ becomes negative. This is accomplished through the following estimate:
(3.35) For all $n \in \mathbb{N}$, $a_n$ is defined and $0 \le a_n \le 3$.
This certainly holds for $n = 1$. If (3.35) holds for $a_n$, then after manipulating the inequality we see that $3 \le 9 - 2a_n \le 9$. In particular, the square root $a_{n+1} = \sqrt{9 - 2a_n}$ is well-defined, and
\[ 0 \le \sqrt{3} \le a_{n+1} = \sqrt{9 - 2a_n} \le \sqrt{9} = 3, \]
so (3.35) holds for $a_{n+1}$. Therefore, (3.35) holds for every $n$. Now multiplying by conjugates, for $n \ge 2$, we obtain
\begin{align*}
a_n - a_{n+1} &= \left(\sqrt{9 - 2a_{n-1}} - \sqrt{9 - 2a_n}\right) \cdot \frac{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \\
&= \frac{-2a_{n-1} + 2a_n}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \\
&= \frac{2}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \cdot \left(-a_{n-1} + a_n\right).
\end{align*}
The smallest the denominator can possibly be is when $a_{n-1}$ and $a_n$ are the largest they can be, which according to (3.35), is at most 3. It follows that for any $n \ge 2$,
\[ \frac{2}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \le \frac{2}{\sqrt{9 - 2 \cdot 3} + \sqrt{9 - 2 \cdot 3}} = \frac{2}{2\sqrt{3}} = r, \]
where $r := \frac{1}{\sqrt{3}} < 1$. Thus, for any $n \ge 2$,
\[ |a_n - a_{n+1}| = \frac{2}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}}\, |a_{n-1} - a_n| \le r\, |a_{n-1} - a_n|. \]
This proves that the sequence $\{a_n\}$ is contractive and therefore $a_n \to L$ for some real number $L$. Because $a_n \ge 0$ for all $n$ and limits preserve inequalities we must have $L \ge 0$ too. Moreover, by (3.34), we have
\[ L^2 = \lim a_{n+1}^2 = \lim (9 - 2a_n) = 9 - 2L, \]
which implies that $L^2 + 2L - 9 = 0$. Solving this quadratic equation and taking the positive root we conclude that $L = \sqrt{10} - 1$.
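The estimates of Example 3.27 can be observed numerically. The following Python sketch is illustrative only: the function name `iterate`, the use of floating point, and the cutoff of 60 terms are our own assumptions. It generates the sequence (3.34), checks the bound (3.35), and compares the ratios of consecutive differences with $r = 1/\sqrt{3}$.

```python
import math


def iterate(n_terms):
    """First n_terms terms of a_1 = 1, a_{n+1} = sqrt(9 - 2 a_n), as in (3.34)."""
    a = [1.0]
    while len(a) < n_terms:
        a.append(math.sqrt(9 - 2 * a[-1]))
    return a


a = iterate(60)
r = 1 / math.sqrt(3)          # the contraction constant found in the example
limit = math.sqrt(10) - 1     # the limit L found in the example

# a[i] holds a_{i+1}; print |a_n - a_{n+1}| / |a_{n-1} - a_n| for small n.
for n in range(2, 10):
    ratio = abs(a[n - 1] - a[n]) / abs(a[n - 2] - a[n - 1])
    print(n, a[n - 1], ratio, ratio <= r)

print("last term:", a[-1], " sqrt(10) - 1:", limit)
```

In this run the observed ratios stay below $r$, as the proof guarantees; we only test the first twenty or so differences, since later ones are too small to compare reliably in floating point.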
Exercises 3.4.
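1. Prove directly, via the definition, that the following sequences are Cauchy.
\[ \text{(a) } \left\{10 + \frac{(-1)^n}{\sqrt{n}}\right\}, \qquad \text{(b) } \left\{\left(7 + \frac{3}{n}\right)^{2}\right\}, \qquad \text{(c) } \left\{\frac{n^2}{n^2 - 5}\right\}. \]

2. Negate the statement that a sequence $\{a_n\}$ is Cauchy. With your negation, prove that the following sequences are not Cauchy (and hence cannot converge).
\[ \text{(a) } \{(-1)^n\}, \qquad \text{(b) } \left\{a_n = \sum_{k=0}^{n} (-1)^n\right\}, \qquad \text{(c) } \{i^n + 1/n\}. \]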

3. Prove that the following sequences are contractive, then determine their limits.
(a) Let $a_1 = 0$ and $a_{n+1} = (2a_n - 3)/4$.
(b) Let $a_1 = 1$ and $a_{n+1} = \frac{1}{5} a_n^2 - 1$.
(c) Let $a_1 = 0$ and $a_{n+1} = \frac{1}{8} a_n^3 + \frac{1}{4} a_n + \frac{1}{2}$.
(d) Let $a_1 = 1$ and $a_{n+1} = \frac{1}{1 + 3a_n}$. Suggestion: Prove that $\frac{1}{4} \le a_n \le 1$ for all $n$.
(e) (Cf. Example 3.27.) Let $a_1 = 1$ and $a_{n+1} = \sqrt{5 - 2a_n}$.
(f) (Cf. Example 3.25.) Let $a_1 = 0$, $a_2 = 1$, and $a_n = \frac{2}{3} a_{n-2} + \frac{1}{3} a_{n-1}$ for $n > 2$.
(g) Let $a_1 = 1$ and $a_{n+1} = \frac{a_n}{2} + \frac{1}{a_n}$.
4. We can use Cauchy sequences to obtain roots of polynomials. E.g. using a graphing calculator, we see that $x^3 - 4x + 2$ has exactly one root, call it $a$, in the interval $[0, 1]$.
(i) Show that the root $a$ satisfies $a = \frac{1}{4}(a^3 + 2)$.
(ii) Define a sequence $\{a_n\}$ recursively by $a_{n+1} = \frac{1}{4}(a_n^3 + 2)$ with $a_1 = 0$.
(iii) Prove that $\{a_n\}$ is contractive and converges to $a$.
5. Here are some Cauchy limit theorems. Let {an } be a sequence in Rm .
(a) Prove that {an } is Cauchy if and only if for every ε > 0 there is a number N such
that for all n > N and k ≥ 1, |an+k − an | < ε.
(b) Given any sequence $\{b_n\}$ of natural numbers, we call the sequence $\{d_n\}$, where
$d_n = a_{n+b_n} - a_n$, a difference sequence. Prove that $\{a_n\}$ is Cauchy if and
only if every difference sequence converges to zero (that is, is a null sequence).
Suggestion: To prove the “if” part, instead prove the contrapositive: If {an } is not
Cauchy, then there is a difference sequence that does not converge to zero.
Figure 3.3. A stick of length one is halved infinitely many times.

6. (Continued fractions; see Chapter 8 for more on this amazing topic!) In this problem we investigate the continued fraction
\[ \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}. \]
We interpret this infinite fraction as the limit of the fractions
\[ a_1 = \frac{1}{2}, \quad a_2 = \cfrac{1}{2 + \cfrac{1}{2}}, \quad a_3 = \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}, \ \ldots. \]
In other words, this sequence is defined by $a_1 = \frac{1}{2}$ and $a_{n+1} = \frac{1}{2 + a_n}$ for $n \ge 1$. Prove that $\{a_n\}$ converges and find its limit. Here’s a related example: Prove that
\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}} \tag{3.36} \]
in the sense that the right-hand continued fraction converges with value Φ. More precisely, prove that $\Phi - 1 = \lim \varphi_n$ where $\{\varphi_n\}$ is the sequence defined by $\varphi_1 := 1$ and $\varphi_{n+1} = 1/(1 + \varphi_n)$ for $n \in \mathbb{N}$.

3.5. Baby infinite series


Imagine taking a stick of length 1 foot and cutting it in half, getting two sticks of length 1/2. We then take one of the halves and cut this piece in half, getting two sticks of length $1/4 = 1/2^2$. We now take one of these fourths and cut it in half, getting two sticks of length $1/8 = 1/2^3$. We continue this process indefinitely. Then, see Figure 3.3, the sum of all the lengths of all the sticks is 1:
\[ 1 = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \frac{1}{2^4} + \frac{1}{2^5} + \cdots. \]
In this section we introduce the theory of infinite series, which rigorously defines
the right-hand sum.4

4If you disregard the very simplest cases, there is in all of mathematics not a single infinite
series whose sum has been rigorously determined. In other words, the most important parts
of mathematics stand without a foundation. Niels H. Abel (1802–1829) [210]. (Of course,
nowadays series are rigorously determined — this is the point of this section!)
3.5.1. Basic results on infinite series. Given a sequence $\{a_n\}_{n=1}^{\infty}$ of complex numbers, we want to attach a meaning to $\sum_{n=1}^{\infty} a_n$, mostly written $\sum a_n$ for simplicity. To do so, we define the n-th partial sum, $s_n$, of the series to be
\[ s_n := \sum_{k=1}^{n} a_k = a_1 + a_2 + \cdots + a_n. \]
Of course, here there are only finitely many numbers being summed, so the right-hand side has a clear definition. If the sequence $\{s_n\}$ of partial sums converges, then we say that the infinite series $\sum a_n$ converges and we define
\[ \sum a_n = \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots := \lim s_n. \]
If the sequence of partial sums does not converge to a complex number, then we say that the series diverges. Since $\mathbb{R} \subseteq \mathbb{C}$, restricting to real sequences $\{a_n\}$, we already have built in to the above definition the convergence of a series of real numbers. Just as a sequence can be indexed so its starting value is $a_0$ or $a_{-7}$ or $a_{1234}$, etc., we can also consider series starting with indices other than 1:
\[ \sum_{n=0}^{\infty} a_n, \quad \sum_{n=-7}^{\infty} a_n, \quad \sum_{n=1234}^{\infty} a_n, \quad \text{etc.} \]
For convenience, in all our proofs we shall most of the time work with series starting at $n = 1$, although all the results we shall discuss work for series starting with any index.
Example 3.28. Consider the series
\[ \sum_{n=0}^{\infty} (-1)^n = 1 - 1 + 1 - 1 + - \cdots. \]
Observe that $s_1 = 1$, $s_2 = 1 - 1 = 0$, $s_3 = 1 - 1 + 1 = 1$, and in general, $s_n = 1$ if $n$ is odd and $s_n = 0$ if $n$ is even. Since $\{s_n\}$ diverges, the series $\sum_{n=0}^{\infty} (-1)^n$ diverges.
There are two very simple tests that will help to determine the convergence or
divergence of a series. The first test might also be called the “fundamental test”
because it is the first thing that one should always test when given a series.
Theorem 3.19 (n-th term test). If $\sum a_n$ converges, then $\lim a_n = 0$. Stated another way, if $a_n \not\to 0$, then $\sum a_n$ diverges.
Proof. Let $s = \sum a_n$ and $s_n$ denote the n-th partial sum of the series. Observe that
\[ s_n - s_{n-1} = a_1 + a_2 + \cdots + a_{n-1} + a_n - (a_1 + a_2 + \cdots + a_{n-1}) = a_n. \]
By definition of convergence of $\sum a_n$, we have $s_n \to s$. Therefore, $s_{n-1} \to s$ as well, hence $a_n = s_n - s_{n-1} \to s - s = 0$.
Example 3.29. So, for example, the series $\sum_{n=0}^{\infty} (-1)^n = 1 - 1 + 1 - 1 + - \cdots$ and $\sum_{n=1}^{\infty} n = 1 + 2 + 3 + \cdots$ cannot converge, since their n-th terms do not tend to zero.
The converse of the n-th term test is false; that is, even though $\lim a_n = 0$, it may not follow that $\sum a_n$ exists.⁵ For example, the . . .
Example 3.30. (Harmonic series diverges, Proof I) Consider
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots. \]
This series is called the harmonic series; see [125] for “what’s harmonic about the harmonic series”. To see that the harmonic series does not converge, observe that
\begin{align*}
s_{2n} &= 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6}\right) + \cdots + \left(\frac{1}{2n-1} + \frac{1}{2n}\right) \\
&> 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{6} + \frac{1}{6}\right) + \cdots + \left(\frac{1}{2n} + \frac{1}{2n}\right) \\
&= 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \frac{1}{2} + s_n.
\end{align*}
Thus, $s_{2n} > 1/2 + s_n$. Now if the harmonic series did converge, say to some real number $s$, that is, $s_n \to s$, then we would also have $s_{2n} \to s$. However, according to the inequality above, this would imply that $s \ge 1/2 + s$, which is an impossibility. Therefore, the harmonic series does not converge. See Problem 5 for more proofs.
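The key inequality of Example 3.30 can be checked exactly for small $n$. The Python sketch below is an illustration under our own assumptions (the function name `harmonic` is hypothetical, and the ranges tested are arbitrary). It uses rational arithmetic and confirms $s_{2n} > 1/2 + s_n$ for $n \ge 2$; for $n = 1$ no bracketed pairs are present and the two sides are equal, which does not affect the argument. It also tabulates the consequence $s_{2^m} \ge 1 + m/2$, obtained by applying $s_{2n} \ge 1/2 + s_n$ repeatedly starting from $s_1 = 1$.

```python
from fractions import Fraction


def harmonic(n):
    """Exact n-th partial sum s_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# s_{2n} versus 1/2 + s_n for a few n.
for n in (1, 2, 5, 10, 50):
    print(n, float(harmonic(2 * n)), float(Fraction(1, 2) + harmonic(n)))

# Doubling the index adds at least 1/2 each time, so the partial sums are unbounded.
for m in range(0, 9):
    print(2 ** m, float(harmonic(2 ** m)), 1 + m / 2)
```

The second table shows how slowly the partial sums grow: they exceed any bound, but only after exponentially many terms.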
Using the inequality s2n > 1/2 + sn , one can show (and we encourage you to
do it!) that the partial sums of the harmonic series are unbounded. Then one can
deduce that the harmonic series must diverge by the following very useful test.
Theorem 3.20 (Nonnegative series test). A series $\sum a_n$ of nonnegative real numbers converges if and only if the sequence $\{s_n\}$ of partial sums is bounded, in which case, $s_n \le s$ for all $n$ where $s = \sum a_n := \lim s_n$.
Proof. Since $a_n \ge 0$ for all $n$, we have
\[ s_n = a_1 + a_2 + \cdots + a_n \le a_1 + a_2 + \cdots + a_n + a_{n+1} = s_{n+1}, \]
so the sequence of partial sums $\{s_n\}$ is nondecreasing: $s_1 \le s_2 \le \cdots \le s_n \le \cdots$. By the monotone criterion for sequences, the sequence of partial sums converges if and only if it is bounded. To see that $s_n \le s := \sum_{m=1}^{\infty} a_m$ for all $n$, fix $n \in \mathbb{N}$ and note that $s_n \le s_k$ for all $k \ge n$ because the partial sums are nondecreasing. Taking $k \to \infty$ and using that limits preserve inequalities gives $s_n \le \lim s_k = s$.

⁵ The sum of an infinite series whose final term vanishes perhaps is infinite, perhaps finite. Jacob Bernoulli (1654–1705) Ars conjectandi.

Example 3.31. Consider the following series:
\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)} + \cdots. \]
To analyze this series, we use the “method of partial fractions” and note that $\frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}$. Thus, the adjacent terms in $s_n$ cancel (except for the first and the last):
\begin{align*}
s_n &= \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)} \tag{3.37} \\
&= \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) = 1 - \frac{1}{n+1} \le 1.
\end{align*}
Hence, the sequence $\{s_n\}$ is bounded above by 1, so our series converges. Moreover, we also see that $s_n = 1 - 1/(n+1) \to 1$, and therefore
\[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1. \]
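The telescoping identity (3.37) is easy to confirm by direct summation. The following Python sketch (the function name `telescoping_partial_sum` is our own, hypothetical choice) adds the terms $1/(k(k+1))$ exactly and compares the result with $1 - 1/(n+1)$.

```python
from fractions import Fraction


def telescoping_partial_sum(n):
    """Exact n-th partial sum of the series sum 1/(k(k+1)), added term by term."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


for n in (1, 2, 3, 10, 100):
    s = telescoping_partial_sum(n)
    # Compare the brute-force sum with the closed form 1 - 1/(n+1) from (3.37).
    print(n, s, 1 - Fraction(1, n + 1))
```

Both columns agree, and the common value visibly approaches 1.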
Example 3.32. Now if the sum of the reciprocals of the natural numbers diverges, what about the sum of the reciprocals of the squares (called the 2-series):
\[ \sum_{n=1}^\infty \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots . \]
To investigate the convergence of this 2-series, using (3.37) we note that
\[ s_n = 1 + \frac{1}{2 \cdot 2} + \frac{1}{3 \cdot 3} + \frac{1}{4 \cdot 4} + \cdots + \frac{1}{n \cdot n} \le 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(n-1) \cdot n} \le 1 + 1 = 2. \]
Since the partial sums of the 2-series are bounded, the 2-series converges. Now what is the value of this series? This question was answered by Leonhard Euler (1707–1783) in 1734. We shall rigorously prove, in 9 different ways in this book, that the value of the 2-series is $\pi^2/6$, starting in Section 6.11! (Now what does $\pi$ have to do with reciprocals of squares of natural numbers???)
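The bound $s_n \le 2$ and the value $\pi^2/6$ quoted above can be illustrated numerically. This is only a floating-point sketch, not a proof; the function name `two_series_partial_sum` is an illustrative choice.

```python
import math

def two_series_partial_sum(n):
    """Return s_n = 1 + 1/2^2 + ... + 1/n^2 in floating point."""
    return sum(1.0 / (k * k) for k in range(1, n + 1))

for n in (10, 1000, 100000):
    s = two_series_partial_sum(n)
    assert s <= 2.0                      # the bound derived from (3.37)
    print(n, s, math.pi ** 2 / 6 - s)    # gap to the value pi^2/6 shrinks
```

The printed gap decreases roughly like $1/n$, which is consistent with how slowly the 2-series converges.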
3.5.2. Some properties of series. It is important to understand that the convergence or divergence of a series only depends on the “tails” of the series.

Theorem 3.21 (Tails theorem for series). A series $\sum a_n$ converges if and only if there is an index $m$ such that $\sum_{n=m}^\infty a_n$ converges.

Proof. Let $s_n$ denote the $n$-th partial sum of $\sum a_n$ and $t_n$ that of any “$m$-tail” $\sum_{n=m}^\infty a_n$. Then
\[ t_n = \sum_{k=m}^n a_k = s_n - a, \]
where $a$ is the number $a = \sum_{k=1}^{m-1} a_k$. It follows that $\{s_n\}$ converges if and only if $\{t_n\}$ converges and our theorem is proved.
An important type of series that we’ll run into often is the geometric series. Given a complex number $a$, the series $\sum a^n$ is called a geometric series. The following theorem characterizes those geometric series that converge.

Theorem 3.22 (Geometric series theorem). For any nonzero complex number $a$ and $k \in \mathbb{Z}$, the geometric series $\sum_{n=k}^\infty a^n$ converges if and only if $|a| < 1$, in which case
\[ \sum_{n=k}^\infty a^n = a^k + a^{k+1} + a^{k+2} + a^{k+3} + \cdots = \frac{a^k}{1-a}. \]

Proof. If $|a| \ge 1$ and $n \ge 0$, then $|a|^n \ge 1$, so the terms of the geometric series do not tend to zero, and therefore the geometric series cannot converge by the $n$-th term test. Thus, we may henceforth assume that $|a| < 1$. By the formula for a geometric progression (see (2.3) in Section 2.2), we have
\[ s_n = a^k + a^{k+1} + \cdots + a^{k+n} = a^k \big( 1 + a + \cdots + a^n \big) = a^k \, \frac{1 - a^{n+1}}{1-a}. \]
Since $|a| < 1$, we know that $\lim a^{n+1} = 0$, so $s_n \to a^k/(1-a)$. This shows that the geometric series converges with sum equal to $a^k/(1-a)$.

Of course, if $a = 0$, then the geometric series $a^k + a^{k+1} + a^{k+2} + \cdots$ is not defined if $k \le 0$ and equals zero if $k \ge 1$.
Example 3.33. If we put $a = 1/2 < 1$ in the geometric series theorem, then we have
\[ \sum_{n=1}^\infty \frac{1}{2^n} = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots = \frac{1/2}{1 - 1/2} = 1, \]
just as hypothesized in the introduction to this section!
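The geometric series theorem also covers complex ratios and negative starting indices. The sketch below compares a partial sum $a^k + \cdots + a^{k+n}$ with the closed form $a^k/(1-a)$; the helper names and the sample values of $a$ and $k$ are hypothetical choices made for illustration.

```python
def geometric_partial_sum(a, k, n):
    """Return a^k + a^(k+1) + ... + a^(k+n)."""
    return sum(a ** j for j in range(k, k + n + 1))

def geometric_limit(a, k):
    """Closed form a^k / (1 - a); only meaningful when |a| < 1."""
    return a ** k / (1 - a)

# Real ratio, starting index k = 1 (the case of Example 3.33).
print(geometric_partial_sum(0.5, 1, 50), geometric_limit(0.5, 1))

# Complex ratio with |a| = 0.5 < 1 and a negative starting index.
a, k = 0.3 + 0.4j, -2
print(abs(geometric_partial_sum(a, k, 60) - geometric_limit(a, k)))
```

The second printed number is the distance between the partial sum and the predicted limit, which should be at the level of rounding error.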
Finally, we state the following theorem on linear combinations of series.

Theorem 3.23 (Arithmetic properties of series). If $\sum a_n$ and $\sum b_n$ converge, then given any complex numbers $c, d$, the series $\sum (c\, a_n + d\, b_n)$ converges, and
\[ \sum (c\, a_n + d\, b_n) = c \sum a_n + d \sum b_n. \]
Moreover, we can group the terms in the series
\[ a_1 + a_2 + a_3 + a_4 + a_5 + \cdots \]
inside parentheses in any way we wish as long as we do not change the ordering of the terms and the resulting series still converges with sum $\sum a_n$; in other words, the associative law holds for convergent infinite series.
Proof. The $n$-th partial sum of $\sum (c\, a_n + d\, b_n)$ is
\[ \sum_{k=1}^n (c\, a_k + d\, b_k) = c \sum_{k=1}^n a_k + d \sum_{k=1}^n b_k = c\, s_n + d\, t_n, \]
where $s_n$ and $t_n$ are the $n$-th partial sums of $\sum a_n$ and $\sum b_n$, respectively. Since $s_n \to \sum a_n$ and $t_n \to \sum b_n$, the first statement of our theorem follows.

Let $1 = \nu_1 < \nu_2 < \nu_3 < \cdots$ be any strictly increasing sequence of integers. We must show that
\[ (a_1 + a_2 + \cdots + a_{\nu_2 - 1}) + (a_{\nu_2} + a_{\nu_2 + 1} + \cdots + a_{\nu_3 - 1}) + (a_{\nu_3} + a_{\nu_3 + 1} + \cdots + a_{\nu_4 - 1}) + \cdots =: \sum_{n=1}^\infty (a_{\nu_n} + \cdots + a_{\nu_{n+1} - 1}) \]
converges with sum $\sum_{n=1}^\infty a_n$. To see this, observe that if $\{s_n\}$ denotes the partial sums of $\sum_{n=1}^\infty a_n$ and if $\{S_n\}$ denotes the partial sums of the right-hand series, then
\[ S_n = (a_1 + a_2 + \cdots + a_{\nu_2 - 1}) + \cdots + (a_{\nu_n} + a_{\nu_n + 1} + \cdots + a_{\nu_{n+1} - 1}) = s_{\nu_{n+1} - 1}, \]
since the associative law holds for finite sums. Therefore, $\{S_n\}$ is just a subsequence of $\{s_n\}$, and hence has the same limit: $\sum_{n=1}^\infty (a_{\nu_n} + \cdots + a_{\nu_{n+1} - 1}) = \lim S_n = \lim s_n = \sum a_n$. This completes our proof.

In Section 6.6, we’ll see that the commutative law doesn’t hold! It is worth
remembering that the associative law does not work in reverse.

Example 3.34. For instance, the series


0 = 0 + 0 + 0 + 0 + · · · = (1 − 1) + (1 − 1) + (1 − 1) + · · ·
certainly converges, but we cannot omit the parentheses and conclude that 1 − 1 +
1 − 1 + 1 − 1 + · · · converges, which we already showed does not.

3.5.3. Telescoping series. As seen in Example 3.31, the value of the series $\sum 1/n(n+1)$ was very easy to find because in writing out its partial sums, we saw that the sum “telescoped” to give a simple expression. In general, it is very difficult to find the value of a convergent series, but for telescoping series, the sums are quite straightforward to find.

Theorem 3.24 (Telescoping series theorem). Let $\{x_n\}$ be a sequence of complex numbers and let $\sum_{n=0}^\infty a_n$ be the series with $n$-th term $a_n := x_n - x_{n+1}$. Then $\lim x_n$ exists if and only if $\sum a_n$ converges, in which case
\[ \sum_{n=0}^\infty a_n = x_0 - \lim x_n. \]

Proof. Observe that adjacent terms of the following partial sum cancel:
\[ s_n = (x_0 - x_1) + (x_1 - x_2) + \cdots + (x_{n-1} - x_n) + (x_n - x_{n+1}) = x_0 - x_{n+1}. \]
If $x := \lim x_n$ exists, we have $x = \lim x_{n+1}$ as well, and therefore $\sum a_n := \lim s_n$ exists with sum $x_0 - x$. Conversely, if $s = \lim s_n$ exists, then $s = \lim s_{n-1}$ as well, and since $x_n = x_0 - s_{n-1}$, it follows that $\lim x_n$ exists.

Example 3.35. Let $a$ be any nonzero complex number not equal to a negative integer. Then we claim that
\[ \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+1)} = \frac{1}{a}. \]
Indeed, in this case, we can use the “method of partial fractions” to write
\[ a_n = \frac{1}{(n+a)(n+a+1)} = \frac{1}{n+a} - \frac{1}{n+a+1} = x_n - x_{n+1}, \quad \text{where } x_n = \frac{1}{n+a}. \]
Since $\lim x_n = 0$, applying the telescoping series theorem gives
\[ \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+1)} = x_0 = \frac{1}{0+a} = \frac{1}{a}, \]
just as we stated.

Example 3.36. More generally, given any natural number $k$ we have
\[ \text{(3.38)} \qquad \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+1) \cdots (n+a+k)} = \frac{1}{k} \cdot \frac{1}{a(a+1) \cdots (a+k-1)}. \]
Indeed, in this general case, we have (again using partial fractions)
\[ \frac{1}{(n+a)(n+a+1) \cdots (n+a+k)} = \underbrace{\frac{1}{k (n+a) \cdots (n+a+k-1)}}_{x_n} - \underbrace{\frac{1}{k (n+a+1) \cdots (n+a+k)}}_{x_{n+1}}. \]
Since $\lim x_n = 0$ (why?), applying the telescoping series theorem gives
\[ \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+1) \cdots (n+a+k)} = x_0 = \frac{1}{k} \cdot \frac{1}{a(a+1) \cdots (a+k-1)}, \]
just as we stated. For example, with $a = 1/2$ and $k = 2$ in (3.38), we obtain (after a little algebra)
\[ \frac{1}{1 \cdot 3 \cdot 5} + \frac{1}{3 \cdot 5 \cdot 7} + \frac{1}{5 \cdot 7 \cdot 9} + \cdots = \frac{1}{12} \]
and with $a = 1/3$ and $k = 2$ in (3.38), we obtain another beautiful sum:
\[ \frac{1}{1 \cdot 4 \cdot 7} + \frac{1}{4 \cdot 7 \cdot 10} + \frac{1}{7 \cdot 10 \cdot 13} + \cdots = \frac{1}{24}. \]
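The “little algebra” behind the two sums above can be checked with exact fractions. In this sketch, `x` and `term` are hypothetical helper names for $x_n$ and $a_n$ from (3.38), and the parameter `a` is assumed to be passed as a `Fraction` so that all arithmetic stays exact.

```python
from fractions import Fraction
from math import prod

def x(n, a, k):
    """x_n = 1 / (k (n+a)(n+a+1)...(n+a+k-1)), with a given as a Fraction."""
    return 1 / (k * prod(n + a + j for j in range(k)))

def term(n, a, k):
    """a_n = 1 / ((n+a)(n+a+1)...(n+a+k)), with a given as a Fraction."""
    return 1 / prod(n + a + j for j in range(k + 1))

a, k, N = Fraction(1, 2), 2, 20
# Telescoping: a_0 + ... + a_N = x_0 - x_{N+1}, exactly.
assert sum(term(n, a, k) for n in range(N + 1)) == x(0, a, k) - x(N + 1, a, k)

# With a = 1/2, each term equals 8 / ((2n+1)(2n+3)(2n+5)), so divide x_0 by 8.
print(x(0, Fraction(1, 2), 2) / 8)
# With a = 1/3, each term equals 27 / ((3n+1)(3n+4)(3n+7)), so divide x_0 by 27.
print(x(0, Fraction(1, 3), 2) / 27)
```

The two printed fractions are the limits $x_0$ rescaled by the constant factors $2^3$ and $3^3$ that appear when the denominators are cleared.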
More examples of telescoping series can be found in the article [184]. In summary, the telescoping series theorem is useful in quickly finding sums to certain series. Moreover, this theorem allows us to construct series with any specified sum.

Corollary 3.25. Let $s$ be any complex number and let $\{x_n\}_{n=0}^\infty$ be a null sequence (that is, $\lim x_n = 0$) such that $x_0 = s$. Then setting $a_n := x_n - x_{n+1}$, the series $\sum_{n=0}^\infty a_n$ converges to $s$.

Proof. By the telescoping series theorem, $\sum a_n = x_0 - \lim x_n = s - 0 = s$.
Example 3.37. For example, let $s = 1$. Then $x_n = 1/2^n$ defines a null sequence with $x_0 = 1/2^0 = 1$, so $\sum_{n=0}^\infty a_n = 1$ with
\[ a_n = x_n - x_{n+1} = \frac{1}{2^n} - \frac{1}{2^{n+1}} = \frac{1}{2^{n+1}}. \]
Hence,
\[ 1 = \sum_{n=0}^\infty \frac{1}{2^{n+1}} = \sum_{n=1}^\infty \frac{1}{2^n}, \]
a fact that we already knew from our work with geometric series.
Example 3.38. Also, $x_n = 1/(n+1)$ defines a null sequence with $x_0 = 1$. In this case,
\[ a_n = x_n - x_{n+1} = \frac{1}{n+1} - \frac{1}{n+2} = \frac{1}{(n+1)(n+2)}, \]
so
\[ 1 = \sum_{n=0}^\infty \frac{1}{(n+1)(n+2)} = \sum_{n=1}^\infty \frac{1}{n(n+1)}, \]
another fact we already knew!

What fancy formulas for 1 do you get when you put $x_n = 1/(n+1)^2$ and $x_n = 1/\sqrt{n+1}$?

Exercises 3.5.
1. Determine the convergence of the following series. If the series converges, find the sum.
\[ \text{(a)} \ \sum_{n=1}^\infty \Big( 1 + \frac{1}{n} \Big)^n, \qquad \text{(b)} \ \sum_{n=1}^\infty \Big( \frac{i}{2} \Big)^n, \qquad \text{(c)} \ \sum_{n=1}^\infty \frac{1}{n^{1/n}}. \]
2. Let $\{a_n\}$ be a sequence of complex numbers.
(a) Assume that $\sum a_n$ converges. Prove that the sum of the even terms $\sum_{n=1}^\infty a_{2n}$ converges if and only if the sum of the odd terms $\sum_{n=1}^\infty a_{2n-1}$ converges, in which case, $\sum a_n = \sum a_{2n} + \sum a_{2n-1}$.
(b) Let $\sum c_n$ be a series obtained from $\sum a_n$ by modifying at most finitely many terms. Show that $\sum a_n$ converges if and only if $\sum c_n$ converges.
(c) Assume that $\lim a_n = 0$. Fix $\alpha, \beta \in \mathbb{C}$ with $\alpha + \beta \ne 0$. Prove that $\sum_{n=1}^\infty a_n$ converges if and only if $\sum_{n=1}^\infty (\alpha a_n + \beta a_{n+1})$ converges.
3. Prove that
\[ \sum_{n=1}^\infty \frac{n}{2^n} = \frac{1}{2} + \frac{2}{2^2} + \frac{3}{2^3} + \cdots = 2. \]
Suggestion: Problem 3d in Exercises 2.2 might help.
4. Let $a$ be a complex number. Using the telescoping series theorem, show that
\[ \frac{a}{1 - a^2} + \frac{a^2}{1 - a^4} + \frac{a^4}{1 - a^8} + \frac{a^8}{1 - a^{16}} + \cdots = \begin{cases} \dfrac{a}{1-a} & |a| < 1 \\[2mm] \dfrac{1}{1-a} & |a| > 1. \end{cases} \]
Suggestion: Using the identity $\frac{x}{1 - x^2} = \frac{1}{1-x} - \frac{1}{1 - x^2}$ for any $x \ne \pm 1$ (you should prove this identity!), write $\frac{a^{2^n}}{1 - a^{2^{n+1}}}$ as the difference $x_n - x_{n+1}$ where $x_n = \frac{1}{1 - a^{2^n}}$.
5. ($\sum_{n=1}^\infty \frac{1}{n}$ diverges, Proofs II–IV) For more proofs, see [115]. Let $s_n = \sum_{k=1}^n \frac{1}{k}$.
(a) Using that $1 + \frac{1}{n} < e^{1/n}$ for all $n \in \mathbb{N}$, which is from (3.28), show that
\[ \Big( 1 + \frac{1}{1} \Big) \Big( 1 + \frac{1}{2} \Big) \Big( 1 + \frac{1}{3} \Big) \cdots \Big( 1 + \frac{1}{n} \Big) \le e^{s_n}, \quad \text{for all } n \in \mathbb{N}. \]
Show that the left-hand side equals $n + 1$ and conclude that $\{s_n\}$ cannot converge.
(b) Show that for any $k \in \mathbb{N}$ with $k \ge 3$, we have
\[ \frac{1}{k-1} + \frac{1}{k} + \frac{1}{k+1} \ge \frac{3}{k}. \]
Using this inequality, prove that for any $n \in \mathbb{N}$, $s_{3n+1} \ge 1 + s_n$ by grouping the terms of $s_{3n+1}$ into threes (starting from $\frac{1}{2}$). Now show that $\{s_n\}$ cannot converge.
(c) For any $k \in \mathbb{N}$ with $k \ge 2$, prove the inequality
\[ \frac{1}{(k-1)! + 1} + \frac{1}{(k-1)! + 2} + \cdots + \frac{1}{k!} \ge 1 - \frac{1}{k}. \]
Writing $s_{n!}$ into groups of the form given on the left-hand side of this inequality, prove that $s_{n!} \ge 1 + n - s_n$. Conclude that $\{s_n\}$ cannot converge.
6. We shall prove that $\sum_{n=1}^\infty n z^{n-1}$ converges if and only if $|z| < 1$, in which case
\[ \text{(3.39)} \qquad \frac{1}{(1-z)^2} = \sum_{n=1}^\infty n z^{n-1}. \]
(i) Prove that if $\sum_{n=1}^\infty n z^{n-1}$ converges, then $|z| < 1$.
(ii) Prove that $(1-z) \sum_{k=1}^n k z^{k-1}$ can be written as
\[ (1-z) \sum_{k=1}^n k z^{k-1} = \frac{1 - z^n}{1 - z} - n z^n. \]
(iii) Now prove that if $|z| < 1$, then $1/(1-z)^2 = \sum_{n=1}^\infty n z^{n-1}$. Solve Problem 3 using (3.39). Suggestion: Problem 4 in Exercises 3.1 might be helpful.
(iv) Can you prove that $\frac{2}{(1-z)^3} = \sum_{n=2}^\infty n(n-1) z^{n-2}$ for $|z| < 1$ using a similar technique? (Do this problem if you are feeling extra confident!)
7. In Problem 9 of Exercises 2.2 we studied the Fibonacci sequence, $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \ge 2$. Using the telescoping theorem prove that
\[ \sum_{n=2}^\infty \frac{1}{F_{n-1} F_{n+1}} = 1, \qquad \sum_{n=2}^\infty \frac{F_n}{F_{n-1} F_{n+1}} = 2. \]
8. Here is a generalization of the telescoping series theorem.
(i) Let $x_n \to x$ and let $k \in \mathbb{N}$. Prove that $\sum_{n=0}^\infty a_n$ with $a_n = x_n - x_{n+k}$ converges and has the sum
\[ \sum_{n=0}^\infty a_n = x_0 + x_1 + \cdots + x_{k-1} - k\, x. \]
Using this formula, find the sums
\[ \sum_{n=0}^\infty \frac{1}{(2n+1)(2n+7)} \quad \text{and} \quad \sum_{n=0}^\infty \frac{1}{(3n+1)(3n+7)}. \]
(ii) Let $a$ be any nonzero complex number not equal to a negative integer and let $k \in \mathbb{N}$. Using (i), prove that
\[ \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+k)} = \frac{1}{k} \Big( \frac{1}{a} + \frac{1}{a+1} + \cdots + \frac{1}{a+k-1} \Big). \]
With $a = 1$ and $k = 2$, derive a beautiful expression for $3/4$.
(iii) Here’s a fascinating result: Given another natural number $m$, prove that
\[ \sum_{n=0}^\infty \frac{1}{(n+a)(n+a+m) \cdots (n+a+km)} = \frac{1}{km} \sum_{n=0}^{m-1} \frac{1}{(n+a)(n+a+m) \cdots (n+a+(k-1)m)}. \]
Find a beautiful series when $a = 1$ and $k = m = 2$.
9. Let $x_n \to x$, and let $c_1, \ldots, c_k$ be $k \ge 2$ numbers such that $c_1 + \cdots + c_k = 0$.
(i) Prove that the series $\sum_{n=0}^\infty a_n$ with $a_n = c_1 x_{n+1} + c_2 x_{n+2} + \cdots + c_k x_{n+k}$ converges and has the sum
\[ \sum_{n=0}^\infty a_n = c_1 x_1 + (c_1 + c_2)\, x_2 + \cdots + (c_1 + c_2 + \cdots + c_{k-1})\, x_{k-1} + (c_2 + 2 c_3 + 3 c_4 + \cdots + (k-1)\, c_k)\, x. \]
(ii) Using (i), find the sum of
\[ \frac{5}{5 \cdot 7 \cdot 9} + \frac{11}{7 \cdot 9 \cdot 11} + \frac{17}{9 \cdot 11 \cdot 13} + \cdots + \frac{6n+5}{(2n+5)(2n+7)(2n+9)} + \cdots . \]
10. Let $a$ be any complex number not equal to $0, -1, -1/2, -1/3, \ldots$. Prove that
\[ \sum_{n=1}^\infty \frac{n}{(a+1)(2a+1) \cdots (na+1)} = \frac{1}{a}. \]

3.6. Absolute convergence and a potpourri of convergence tests


We now give some important tests that guarantee when certain series converge.

3.6.1. Various tests for convergence. The first test is the series version of Cauchy’s criterion for sequences.

Theorem 3.26 (Cauchy’s criterion for series). The series $\sum a_n$ converges if and only if given any $\varepsilon > 0$ there is an $N$ such that for all $n > m > N$, we have
\[ \Big| \sum_{k=m+1}^n a_k \Big| = |a_{m+1} + a_{m+2} + \cdots + a_n| < \varepsilon, \]
in which case, for any $m > N$,
\[ \Big| \sum_{k=m+1}^\infty a_k \Big| < \varepsilon. \]
In particular, for a convergent series $\sum a_n$, we have
\[ \lim_{m \to \infty} \sum_{k=m+1}^\infty a_k = 0. \]

Proof. Let $s_n$ denote the $n$-th partial sum of $\sum a_n$. Then to say that the series $\sum a_n$ converges means that the sequence $\{s_n\}$ converges. Cauchy’s criterion for sequences states that $\{s_n\}$ converges if and only if given any $\varepsilon > 0$ there is an $N$ such that for all $n, m > N$, we have $|s_n - s_m| < \varepsilon$. As $n$ and $m$ are symmetric in this criterion we may assume that $n > m > N$. Since
\[ s_n - s_m = \sum_{k=1}^n a_k - \sum_{k=1}^m a_k = \sum_{k=m+1}^n a_k, \]
this Cauchy criterion is equivalent to $\big| \sum_{k=m+1}^n a_k \big| < \varepsilon$ for all $n > m > N$. Taking $n \to \infty$ shows that $\big| \sum_{k=m+1}^\infty a_k \big| \le \varepsilon$ for all $m > N$; choosing $N$ for $\varepsilon/2$ in place of $\varepsilon$ gives the strict inequality.

Here is another test, which is the most useful of the ones we’ve looked at.
Theorem 3.27 (Comparison test). Let $\{a_n\}$ and $\{b_n\}$ be real sequences and suppose that for $n$ sufficiently large, say for all $n \ge k$ for some $k \in \mathbb{N}$, we have
\[ 0 \le a_n \le b_n. \]
If $\sum b_n$ converges, then $\sum a_n$ converges. Equivalently, if $\sum a_n$ diverges, then $\sum b_n$ diverges. In the case of convergence, $\sum_{n=k}^\infty a_n \le \sum_{n=k}^\infty b_n$.

Proof. By the tails theorem for series (Theorem 3.21), the series $\sum a_n$ and $\sum b_n$ converge if and only if the series $\sum_{n=k}^\infty a_n$ and $\sum_{n=k}^\infty b_n$ converge. By working with these series instead of the original ones, we may assume that $0 \le a_n \le b_n$ holds for every $n$. In this case, if $s_n$ denotes the $n$-th partial sum for $\sum a_n$ and $t_n$ the $n$-th partial sum for $\sum b_n$, then $0 \le a_n \le b_n$ (for every $n$) implies that for every $n$, we have
\[ 0 \le s_n \le t_n. \]
Assume that $\sum b_n$ converges. Then by the nonnegative series test (Theorem 3.20), $t_n \le t := \sum b_n$. Hence, $0 \le s_n \le t$ for all $n$; that is, the partial sums of $\sum a_n$ are bounded. Again by the nonnegative series test, it follows that $\sum a_n$ converges and taking $n \to \infty$ in $0 \le s_n \le t$ shows that $0 \le \sum a_n \le \sum b_n$.

Example 3.39. For example, the $p$-series, where $p$ is a rational number,
\[ \sum_{n=1}^\infty \frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots, \]
converges for $p \ge 2$ and diverges for $p \le 1$. To see this, note that if $p < 1$, then
\[ \frac{1}{n} \le \frac{1}{n^p}, \]
because $1 - p > 0$, so $1 = 1^{1-p} \le n^{1-p}$ by the power rules theorem (Theorem 2.33) and $1 \le n^{1-p}$ is equivalent to the above inequality. Since the harmonic series diverges, by the comparison test, so does the $p$-series for $p < 1$. If $p > 2$, then by a similar argument, we have
\[ \frac{1}{n^p} \le \frac{1}{n^2}. \]
In the last section, we showed that the 2-series $\sum 1/n^2$ converges, so by the comparison test, the $p$-series for $p > 2$ converges. Now what about for $1 < p < 2$? To answer this question we shall appeal to Cauchy’s condensation test below.

3.6.2. Cauchy condensation test. The following test is usually not found in elementary calculus textbooks, but it’s very useful.

Theorem 3.28 (Cauchy condensation test). If $\{a_n\}$ is a nonincreasing sequence of nonnegative real numbers, then the infinite series $\sum a_n$ converges if and only if
\[ \sum_{n=0}^\infty 2^n a_{2^n} = a_1 + 2 a_2 + 4 a_4 + 8 a_8 + \cdots \]
converges.

Proof. The proof of this theorem is just like the proof of the $p$-test! Let the partial sums of $\sum a_n$ be denoted by $s_n$ and those of $\sum_{n=0}^\infty 2^n a_{2^n}$ by $t_n$. Then by the nonnegative series test (Theorem 3.20), $\sum a_n$ converges if and only if $\{s_n\}$ is bounded and $\sum 2^n a_{2^n}$ converges if and only if $\{t_n\}$ is bounded. Therefore, we just have to prove that $\{s_n\}$ is bounded if and only if $\{t_n\}$ is bounded.

Consider the “if” part: Assume that $\{t_n\}$ is bounded; we shall prove that $\{s_n\}$ is bounded. To prove this, we note that $s_n \le s_{2^n - 1}$ and we can write (cf. the above computation with the $p$-series)
\[ s_n \le s_{2^n - 1} = a_1 + (a_2 + a_3) + (a_4 + a_5 + a_6 + a_7) + \cdots + (a_{2^{n-1}} + \cdots + a_{2^n - 1}), \]
where in the $k$-th parenthesis, we group those terms of the series with index running from $2^k$ to $2^{k+1} - 1$. Since the $a_n$’s are nonincreasing (that is, $a_n \ge a_{n+1}$ for all $n$), replacing each number in a parenthesis by the first term in the parenthesis we can only increase the value of the sum, so
\[ s_n \le a_1 + (a_2 + a_2) + (a_4 + a_4 + a_4 + a_4) + \cdots + (a_{2^{n-1}} + \cdots + a_{2^{n-1}}) = a_1 + 2 a_2 + 4 a_4 + \cdots + 2^{n-1} a_{2^{n-1}} = t_{n-1}. \]
Since $\{t_n\}$ is bounded, it follows that $\{s_n\}$ is bounded as well.



Now the “only if” part: Assume that $\{s_n\}$ is bounded; we shall prove that $\{t_n\}$ is bounded. To prove this, we try to estimate $t_n$ using $s_{2^n}$. Observe that
\[ \begin{aligned} s_{2^n} &= a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \cdots + (a_{2^{n-1}+1} + \cdots + a_{2^n}) \\ &\ge a_1 + a_2 + (a_4 + a_4) + (a_8 + a_8 + a_8 + a_8) + \cdots + (a_{2^n} + \cdots + a_{2^n}) \\ &= a_1 + a_2 + 2 a_4 + 4 a_8 + \cdots + 2^{n-1} a_{2^n} \\ &= \frac{1}{2} a_1 + \frac{1}{2} \big( a_1 + 2 a_2 + 4 a_4 + 8 a_8 + \cdots + 2^n a_{2^n} \big) \\ &= \frac{1}{2} a_1 + \frac{1}{2} t_n. \end{aligned} \]
It follows that $t_n \le 2 s_{2^n}$ for all $n$. In particular, since $\{s_n\}$ is bounded, $\{t_n\}$ is bounded as well. This completes our proof.
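The two estimates used in the proof, $s_{2^n - 1} \le t_{n-1}$ and $t_n \le 2 s_{2^n}$, can be spot-checked numerically. The sketch below does this for the sample sequence $a_k = 1/k^{3/2}$, which is a hypothetical choice; any nonincreasing nonnegative sequence could be substituted. A small tolerance absorbs floating-point rounding.

```python
def s(a, n):
    """Partial sum a(1) + a(2) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))

def t(a, n):
    """Condensed partial sum a(1) + 2 a(2) + 4 a(4) + ... + 2^n a(2^n)."""
    return sum(2 ** j * a(2 ** j) for j in range(n + 1))

def a(k):
    """Sample nonincreasing, nonnegative sequence (illustrative choice)."""
    return 1.0 / k ** 1.5

tol = 1e-12
for n in range(1, 12):
    assert s(a, 2 ** n - 1) <= t(a, n - 1) + tol   # the "if" estimate
    assert t(a, n) <= 2 * s(a, 2 ** n) + tol       # the "only if" estimate
    print(n, s(a, 2 ** n), t(a, n))
```

Both columns stay bounded for this sequence, as the theorem predicts; replacing `a` by `lambda k: 1.0 / k` makes both columns grow without bound together.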

Example 3.40. For instance, consider the $p$-series (where $p \ge 0$ is rational):
\[ \sum_{n=1}^\infty \frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots \]
With $a_n = \frac{1}{n^p}$, by Cauchy’s condensation test, this series converges if and only if
\[ \sum_{n=0}^\infty 2^n a_{2^n} = \sum_{n=0}^\infty \frac{2^n}{(2^n)^p} = \sum_{n=0}^\infty \Big( \frac{1}{2^{p-1}} \Big)^n \]
converges. This is a geometric series, so this series converges if and only if
\[ \frac{1}{2^{p-1}} < 1 \quad \Longleftrightarrow \quad p > 1. \]
Summarizing, we get
\[ p\text{-test:} \qquad \sum_{n=1}^\infty \frac{1}{n^p} \ \begin{cases} \text{converges for } p > 1, \\ \text{diverges for } p \le 1. \end{cases} \]
Once we develop the theory of real exponents, the same $p$-test holds for $p$ real. By the way, $\sum_{n=1}^\infty 1/n^p$ is also denoted by $\zeta(p)$, the zeta function at $p$:
\[ \zeta(p) := \sum_{n=1}^\infty \frac{1}{n^p}. \]
We’ll come across this function again in Section 4.6.


Cauchy’s condensation test is especially useful when dealing with series involv-
ing logarithms; see the problems. Although we technically haven’t introduced the
logarithm function, we’ll thoroughly develop this function in Section 4.6, so for now
we’ll assume you know properties of log x for x > 0. Actually, for the particular
example below, we just need to know that $\log x^k = k \log x$ for all $k \in \mathbb{Z}$, $\log x > 0$
for x > 1, and log x is increasing with x.
Example 3.41. Consider the series
\[ \sum_{n=2}^\infty \frac{1}{n \log n}. \]
At first glance, it may seem difficult to determine the convergence of this series, but Cauchy’s condensation test gives the answer quickly:
\[ \sum_{n=1}^\infty 2^n \cdot \frac{1}{2^n \log 2^n} = \frac{1}{\log 2} \sum_{n=1}^\infty \frac{1}{n}, \]
which diverges. (You should check that $1/(n \log n)$ is nonincreasing.) Therefore by Cauchy’s condensation test, $\sum_{n=2}^\infty \frac{1}{n \log n}$ also diverges.$^6$
3.6.3. Absolute convergence. A series $\sum a_n$ is said to be absolutely convergent if $\sum |a_n|$ converges. The following theorem implies that any absolutely convergent series is convergent in the usual sense.

Theorem 3.29 (Absolute convergence). Let $\sum a_n$ be an infinite series.
(1) If $\sum |a_n|$ converges, then $\sum a_n$ also converges, and
\[ \text{(3.40)} \qquad \Big| \sum a_n \Big| \le \sum |a_n| \qquad \text{(triangle inequality for series).} \]
(2) Any linear combination of absolutely convergent series is absolutely convergent.

Proof. Suppose that $\sum |a_n|$ converges. We shall prove that $\sum a_n$ converges and (3.40) holds. To prove convergence, we use Cauchy’s criterion, so let $\varepsilon > 0$. Since $\sum |a_n|$ converges, there is an $N$ such that for all $n > m > N$, we have
\[ \sum_{k=m+1}^n |a_k| < \varepsilon. \]
By the usual triangle inequality, for $n > m > N$, we have
\[ \Big| \sum_{k=m+1}^n a_k \Big| \le \sum_{k=m+1}^n |a_k| < \varepsilon. \]
Thus, by Cauchy’s criterion for series, $\sum a_n$ converges. To prove (3.40), let $s_n$ denote the $n$-th partial sum of $\sum a_n$. Then,
\[ |s_n| = \Big| \sum_{k=1}^n a_k \Big| \le \sum_{k=1}^n |a_k| \le \sum_{k=1}^\infty |a_k|. \]
Since $|s_n| \to |\sum a_k|$ and limits preserve inequalities, it follows that $|\sum a_k| \le \sum |a_k|$.

If $\sum |a_n|$ and $\sum |b_n|$ converge, then given any complex numbers $c, d$, as $|c a_n + d b_n| \le |c| |a_n| + |d| |b_n|$, by the comparison test, $\sum |c a_n + d b_n|$ also converges.

Example 3.42. Since the 2-series $\sum 1/n^2$ converges, each of the following series is absolutely convergent:
\[ \sum_{n=1}^\infty \frac{(-1)^n}{n^2}, \qquad \sum_{n=1}^\infty \frac{i^n}{n^2}, \qquad \sum_{n=1}^\infty \frac{(-i)^n}{n^2}. \]

It is possible to have a convergent series that is not absolutely convergent.

6 This series is usually handled in elementary calculus courses using the technologically advanced “integral test,” but Cauchy’s condensation test gives one way to handle such series without knowing any calculus!
Example 3.43. Although the harmonic series $\sum 1/n$ diverges, the alternating harmonic series
\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots \]
converges. To see this, we use the Cauchy criterion. Given $n > m$, observe that
\[ \Big| \sum_{k=m+1}^n \frac{(-1)^{k-1}}{k} \Big| = \Big| \frac{(-1)^m}{m+1} + \frac{(-1)^{m+1}}{m+2} + \frac{(-1)^{m+2}}{m+3} + \cdots + \frac{(-1)^{n-1}}{n} \Big| \]
\[ \text{(3.41)} \qquad = \Big| \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \cdots + \frac{(-1)^{n-m-1}}{n} \Big|. \]
Suppose that $n - m$ is even. Then the sum in the absolute values in (3.41) equals
\[ \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \frac{1}{m+4} + \cdots + \frac{1}{n-1} - \frac{1}{n} = \Big( \frac{1}{m+1} - \frac{1}{m+2} \Big) + \Big( \frac{1}{m+3} - \frac{1}{m+4} \Big) + \cdots + \Big( \frac{1}{n-1} - \frac{1}{n} \Big) > 0, \]
since all the terms in parentheses are positive. Thus, if $n - m$ is even, then we can drop the absolute values in (3.41) to get
\[ \Big| \sum_{k=m+1}^n \frac{(-1)^{k-1}}{k} \Big| = \frac{1}{m+1} - \Big( \frac{1}{m+2} - \frac{1}{m+3} \Big) - \cdots - \Big( \frac{1}{n-2} - \frac{1}{n-1} \Big) - \frac{1}{n} < \frac{1}{m+1}, \]
since all the terms in parentheses are positive. Now suppose that $n - m$ is odd. Then the sum in the absolute values in (3.41) equals
\[ \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \frac{1}{m+4} + \cdots - \frac{1}{n-1} + \frac{1}{n} = \Big( \frac{1}{m+1} - \frac{1}{m+2} \Big) + \Big( \frac{1}{m+3} - \frac{1}{m+4} \Big) + \cdots + \Big( \frac{1}{n-2} - \frac{1}{n-1} \Big) + \frac{1}{n} > 0, \]
since all the terms in parentheses are positive. So, if $n - m$ is odd, then just as before, we can drop the absolute values in (3.41) to get
\[ \Big| \sum_{k=m+1}^n \frac{(-1)^{k-1}}{k} \Big| = \frac{1}{m+1} - \Big( \frac{1}{m+2} - \frac{1}{m+3} \Big) - \cdots - \Big( \frac{1}{n-1} - \frac{1}{n} \Big) \le \frac{1}{m+1}, \]
since, once again, all the terms in parentheses are positive (equality occurs only when $n = m + 1$, in which case there are no parentheses). In conclusion, regardless if $n - m$ is even or odd, we see that for any $n > m$, we have
\[ \Big| \sum_{k=m+1}^n \frac{(-1)^{k-1}}{k} \Big| \le \frac{1}{m+1}. \]
Since $1/(m+1) \to 0$ as $m \to \infty$, this inequality shows that the alternating harmonic series satisfies the conditions of Cauchy’s criterion, and therefore converges. Another way to prove convergence is to use the “alternating series test,” a subject we will study thoroughly in Section 6.1. Later on, in Section 4.6, we’ll prove that $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n}$ equals $\log 2$.
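The block estimate for the alternating harmonic series is easy to probe numerically. In the sketch below, `block(m, n)` is a hypothetical helper computing $\big| \sum_{k=m+1}^n (-1)^{k-1}/k \big|$, and we check that it never exceeds $1/(m+1)$ (the two agree when $n = m+1$). The comparison with `math.log(2)` only illustrates the value announced for Section 4.6; it proves nothing.

```python
import math

def alt_partial_sum(n):
    """Return 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n in floating point."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

def block(m, n):
    """Return |sum_{k=m+1}^{n} (-1)^(k-1)/k| for n > m."""
    return abs(sum((-1) ** (k - 1) / k for k in range(m + 1, n + 1)))

tol = 1e-15  # slack for floating-point rounding
for m, n in ((10, 11), (10, 25), (100, 1000), (999, 5000)):
    assert block(m, n) <= 1 / (m + 1) + tol
    print(m, n, block(m, n), 1 / (m + 1))

print(alt_partial_sum(10000), math.log(2))
```

The pairs `(m, n)` are arbitrary samples covering both parities of $n - m$.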
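Example 3.44. Using the associative law in Theorem 3.23, we can also write
\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} = \Big( 1 - \frac{1}{2} \Big) + \Big( \frac{1}{3} - \frac{1}{4} \Big) + \Big( \frac{1}{5} - \frac{1}{6} \Big) + \cdots = \frac{1}{1 \cdot 2} + \frac{1}{3 \cdot 4} + \frac{1}{5 \cdot 6} + \cdots . \]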
Exercises 3.6.
1. For this problem, assume you know all the “well-known” high school properties of $\log x$ (e.g. $\log x^k = k \log x$, $\log(xy) = \log x + \log y$, etc.). Using the Cauchy condensation test, determine the convergence of the following series:
\[ \text{(a)} \ \sum_{n=2}^\infty \frac{1}{n (\log n)^2}, \qquad \text{(b)} \ \sum_{n=2}^\infty \frac{1}{n (\log n)^p}, \qquad \text{(c)} \ \sum_{n=2}^\infty \frac{1}{n (\log n)(\log(\log n))}. \]
For (b), state which $p$ give convergent/divergent series.
2. Prove that
\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2 \cdot 3} - \frac{1}{4 \cdot 5} - \frac{1}{6 \cdot 7} - \cdots, \]
\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^2} = \frac{1}{1^2 \cdot 2^2} (1 + 2) + \frac{1}{3^2 \cdot 4^2} (3 + 4) + \frac{1}{5^2 \cdot 6^2} (5 + 6) + \cdots . \]
3. We consider various (unrelated) properties of real series $\sum a_n$ with $a_n \ge 0$ for all $n$.
(a) Here is a nice generalization of the Cauchy condensation test: If the $a_n$’s are nonincreasing, then given a natural number $b > 1$, prove that $\sum a_n$ converges or diverges with the series
\[ \sum_{n=0}^\infty b^n a_{b^n} = a_1 + b\, a_b + b^2 a_{b^2} + b^3 a_{b^3} + \cdots . \]
Thus, the Cauchy condensation test is just this test with $b = 2$.
(b) If $\sum a_n$ converges, prove that for any $k \in \mathbb{N}$, the series $\sum a_n^k$ also converges.
(c) If $\sum a_n$ converges, give an example showing that $\sum \sqrt{a_n}$ may not converge. However, prove that the series $\sum \sqrt{a_n}/n$ does converge. Suggestion: Can you somehow use the AGMI with two terms?
(d) If $a_n > 0$ for all $n$, prove that $\sum a_n$ converges if and only if for any series $\sum b_n$ of nonnegative real numbers, $\sum (a_n^{-1} + b_n)^{-1}$ converges.
(e) If $\sum b_n$ is another series of nonnegative real numbers, prove that $\sum a_n$ and $\sum b_n$ converge if and only if $\sum \sqrt{a_n^2 + b_n^2}$ converges.
(f) Prove that $\sum a_n$ converges if and only if $\sum \frac{a_n}{1 + a_n}$ converges.
(g) For each $n \in \mathbb{N}$, define $b_n = \sqrt{\sum_{k=n}^\infty a_k} = \sqrt{a_n + a_{n+1} + a_{n+2} + \cdots}$. Prove that if $a_n > 0$ for all $n$ and $\sum a_n$ converges, then $\sum \frac{a_n}{b_n}$ converges. Suggestion: Show that $a_n = b_n^2 - b_{n+1}^2$ and using this fact show that $\frac{a_n}{b_n} \le 2 (b_n - b_{n+1})$ for all $n$.
4. We already know that if $\sum a_n$ (of complex numbers) converges, then $\lim a_n = 0$. When the $a_n$’s form a nonincreasing sequence of nonnegative real numbers, then prove the following astonishing fact (called Pringsheim’s theorem): If $\sum a_n$ converges, then $n a_n \to 0$. Use the Cauchy criterion for series somewhere in your proof. Suggestion: Let $\varepsilon > 0$ and choose $N$ such that $n > m > N$ implies
\[ a_{m+1} + a_{m+2} + \cdots + a_n < \frac{\varepsilon}{2}. \]
Take $n = 2m$ and then $n = 2m + 1$.
5. (Limit comparison test) Let $\{a_n\}$ and $\{b_n\}$ be nonzero complex sequences and suppose that the following limit exists: $L := \lim \frac{a_n}{b_n}$. Prove that
(i) If $L \ne 0$, then $\sum a_n$ is absolutely convergent if and only if $\sum b_n$ is absolutely convergent.
(ii) If $L = 0$ and $\sum b_n$ is absolutely convergent, then $\sum a_n$ is absolutely convergent.
6. Here’s an alternative method to prove that the alternating harmonic series converges.
(i) Let $\{b_n\}$ be a sequence in $\mathbb{R}^m$ and suppose that the even and odd subsequences $\{b_{2k}\}$ and $\{b_{2k-1}\}$ both converge and have the same limit $L$. Prove that the original sequence $\{b_n\}$ converges and has limit $L$.
(ii) Show that the subsequences of even and odd partial sums of the alternating harmonic series both converge and have the same limit.
7. (Ratio comparison test) Let $\{a_n\}$ and $\{b_n\}$ be sequences of positive numbers and suppose that $\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}$ for all $n$. If $\sum b_n$ converges, prove that $\sum a_n$ also converges. (Equivalently, if $\sum a_n$ diverges, then $\sum b_n$ also diverges.) Suggestion: Consider the telescoping product
\[ a_n = \frac{a_n}{a_{n-1}} \cdot \frac{a_{n-1}}{a_{n-2}} \cdot \frac{a_{n-2}}{a_{n-3}} \cdots \frac{a_2}{a_1} \cdot a_1. \]
8. (Cf. [104], [113], [235]) We already know that the harmonic series $\sum 1/n$ diverges. It turns out that omitting certain numbers from this sum makes the sum converge. Fix a natural number $b \ge 2$. Recall (see Section 2.5) that we can write any natural number $n$ uniquely as $n = a_k a_{k-1} \cdots a_0$, where $0 \le a_j \le b - 1$, $j = 0, \ldots, k$, are called digits, and where the notation $a_k \cdots a_0$ means that
\[ n = a_k b^k + a_{k-1} b^{k-1} + \cdots + a_1 b + a_0. \]
Prove that the following sum converges:
\[ \sum_{n \text{ has no } 0 \text{ digit}} \frac{1}{n}. \]
Suggestion: Let $c_k$ be the sum over all numbers of the form $\frac{1}{n}$ where $n = a_k a_{k-1} \cdots a_0$ with none of the $a_j$’s zero. Show that there are at most $(b-1)^{k+1}$ such $n$’s and that $n \ge b^k$, and use these facts to show that $c_k \le \frac{(b-1)^{k+1}}{b^k}$. Prove that $\sum_{k=0}^\infty c_k$ converges and use this to prove that the desired sum converges.

3.7. Tannery’s theorem, the exponential function, and the number e


Tannery’s theorem (named after Jules Tannery (1848–1910)) is a little known,
but fantastic theorem, that I learned from [31], [30], [41], [73]. Tannery’s theorem
is really a special case of the Weierstrass M -test [41, p. 124], which is why it
probably doesn’t get much attention. We shall use Tannery’s theorem quite a bit
in the sequel. In particular, we shall use it to derive certain properties of the
complex exponential function, which is undoubtedly the most important function
in analysis and arguably all of mathematics. In this section we derive some of its
many properties including its relationship to the number e defined in Section 3.3.
3.7.1. Tannery’s theorem for series. Tannery has two theorems, one for
series and the other for products; we’ll cover his theorem for products in Section
7.3. Here is the one for series.
Theorem 3.30 (Tannery’s theorem for series). For each natural number $n$, let $\sum_{k=1}^{m_n} a_k(n)$ be a finite sum where $m_n \to \infty$ as $n \to \infty$. If for each $k$, $\lim_{n \to \infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^\infty M_k$ of nonnegative real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, then
\[ \lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^\infty \lim_{n \to \infty} a_k(n); \]
that is, both sides are well-defined (the limits and sums converge) and are equal.
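Proof. First of all, we remark that the series on the right converges. Indeed, if we put $a_k := \lim_{n \to \infty} a_k(n)$, which exists by assumption, then taking $n \to \infty$ in the inequality $|a_k(n)| \le M_k$, we have $|a_k| \le M_k$ as well. Therefore, by the comparison test, $\sum_{k=1}^\infty a_k$ converges (absolutely).

Now to prove our theorem, let $\varepsilon > 0$ be given. By Cauchy’s criterion for series we can fix an $\ell$ so that
\[ M_{\ell+1} + M_{\ell+2} + \cdots < \frac{\varepsilon}{3}. \]
Since $m_n \to \infty$ as $n \to \infty$ we can choose $N_1$ so that for all $n > N_1$, we have $m_n > \ell$. Then using that $|a_k(n) - a_k| \le |a_k(n)| + |a_k| \le M_k + M_k = 2 M_k$, observe that for any $n > N_1$ we have
\[ \begin{aligned} \Big| \sum_{k=1}^{m_n} a_k(n) - \sum_{k=1}^\infty a_k \Big| &= \Big| \sum_{k=1}^{\ell} (a_k(n) - a_k) + \sum_{k=\ell+1}^{m_n} (a_k(n) - a_k) - \sum_{k=m_n+1}^\infty a_k \Big| \\ &\le \sum_{k=1}^{\ell} |a_k(n) - a_k| + \sum_{k=\ell+1}^{m_n} 2 M_k + \sum_{k=m_n+1}^\infty M_k \\ &\le \sum_{k=1}^{\ell} |a_k(n) - a_k| + \sum_{k=\ell+1}^\infty 2 M_k < \sum_{k=1}^{\ell} |a_k(n) - a_k| + 2\, \frac{\varepsilon}{3}. \end{aligned} \]
Since for each $k$, $\lim_{n \to \infty} a_k(n) = a_k$, there is an $N \ge N_1$ such that for each $k = 1, 2, \ldots, \ell$ and for $n > N$, we have $|a_k(n) - a_k| < \varepsilon/(3\ell)$. Thus, if $n > N$, then
\[ \Big| \sum_{k=1}^{m_n} a_k(n) - \sum_{k=1}^\infty a_k \Big| < \sum_{k=1}^{\ell} \frac{\varepsilon}{3\ell} + 2\, \frac{\varepsilon}{3} = \frac{\varepsilon}{3} + 2\, \frac{\varepsilon}{3} = \varepsilon. \]
This completes the proof.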


Tannery’s theorem states that under certain conditions we can “switch” limits and infinite summations: $\lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^\infty \lim_{n \to \infty} a_k(n)$. (Of course, we can always switch limits and finite summations by the algebra of limits, but infinite summations are a whole other matter.) See Problem 8 for another version of Tannery’s theorem and see Problem 9 for an application to double series.
Example 3.45. We shall derive the formula
\[ \frac{1}{2} = \lim_{n \to \infty} \Big( \frac{1 + 2^n}{2^n \cdot 3 + 4} + \frac{1 + 2^n}{2^n 3^2 + 4^2} + \frac{1 + 2^n}{2^n 3^3 + 4^3} + \cdots + \frac{1 + 2^n}{2^n 3^n + 4^n} \Big). \]
To prove this, we write the right-hand side as
\[ \lim_{n \to \infty} \Big( \frac{1 + 2^n}{2^n \cdot 3 + 4} + \frac{1 + 2^n}{2^n 3^2 + 4^2} + \frac{1 + 2^n}{2^n 3^3 + 4^3} + \cdots + \frac{1 + 2^n}{2^n 3^n + 4^n} \Big) = \lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n), \]
where $m_n = n$ and
\[ a_k(n) := \frac{1 + 2^n}{2^n 3^k + 4^k}. \]
Observe that for each $k \in \mathbb{N}$,
\[ \lim_{n \to \infty} a_k(n) = \lim_{n \to \infty} \frac{1 + 2^n}{2^n 3^k + 4^k} = \lim_{n \to \infty} \frac{\frac{1}{2^n} + 1}{3^k + \frac{4^k}{2^n}} = \frac{1}{3^k} \]
exists. Also,
\[ |a_k(n)| = \frac{1 + 2^n}{2^n 3^k + 4^k} \le \frac{2^n + 2^n}{2^n 3^k} = \frac{2 \cdot 2^n}{2^n 3^k} = \frac{2}{3^k} =: M_k. \]
By the geometric series test, we know that $\sum_{k=1}^\infty M_k$ converges. Hence by Tannery’s theorem, we have
\[ \lim_{n \to \infty} \Big( \frac{1 + 2^n}{2^n \cdot 3 + 4} + \frac{1 + 2^n}{2^n 3^2 + 4^2} + \cdots + \frac{1 + 2^n}{2^n 3^n + 4^n} \Big) = \lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^\infty \lim_{n \to \infty} a_k(n) = \sum_{k=1}^\infty \frac{1}{3^k} = \frac{1/3}{1 - 1/3} = \frac{1}{2}. \]
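The limit in Example 3.45 can be watched numerically. The sketch below evaluates the finite sums $\sum_{k=1}^n a_k(n)$ for a few values of $n$ and also spot-checks the domination $a_k(n) \le M_k = 2/3^k$ required by Tannery’s theorem. The function names are illustrative, and the ranges tested are arbitrary samples.

```python
def a(k, n):
    """a_k(n) = (1 + 2^n) / (2^n 3^k + 4^k), using exact integers before dividing."""
    return (1 + 2 ** n) / (2 ** n * 3 ** k + 4 ** k)

def finite_sum(n):
    """Return a_1(n) + a_2(n) + ... + a_n(n), that is, m_n = n."""
    return sum(a(k, n) for k in range(1, n + 1))

# Domination by M_k = 2 / 3^k on a sample of indices.
assert all(a(k, n) <= 2 / 3 ** k for n in range(1, 30) for k in range(1, n + 1))

for n in (1, 5, 10, 20, 40):
    print(n, finite_sum(n))   # values approach 1/2
```

Python integers do not overflow, so the numerators and denominators are formed exactly and only the final division is rounded.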

If the hypotheses of Tannery’s theorem are not met, then the conclusion of
Tannery’s theorem may not hold as the following example illustrates.
Example 3.46. Here’s a non-example of Tannery’s theorem. For each $k, n \in \mathbb{N}$, let $a_k(n) := \frac{1}{n}$ and let $m_n = n$. Then
\[ \lim_{n \to \infty} a_k(n) = \lim_{n \to \infty} \frac{1}{n} = 0 \quad \Longrightarrow \quad \sum_{k=1}^\infty \lim_{n \to \infty} a_k(n) = \sum_{k=1}^\infty 0 = 0. \]
On the other hand,
\[ \sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^n \frac{1}{n} = \frac{1}{n} \cdot \sum_{k=1}^n 1 = 1 \quad \Longrightarrow \quad \lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n) = \lim_{n \to \infty} 1 = 1. \]
Thus, for this example,
\[ \lim_{n \to \infty} \sum_{k=1}^{m_n} a_k(n) \ne \sum_{k=1}^\infty \lim_{n \to \infty} a_k(n). \]
What went wrong here is that there is no constant $M_k$ such that $|a_k(n)| \le M_k$ for all $n$ where the series $\sum_{k=1}^\infty M_k$ converges. Indeed, the inequality $|a_k(n)| \le M_k$ for all $n$ implies (setting $n = 1$) that $1 \le M_k$. It follows that the series $\sum_{k=1}^\infty M_k$ cannot converge. Therefore, Tannery’s theorem cannot be applied.
3.7.2. The exponential function. The exponential function exp : C → C
is the function defined by

X zn
exp(z) := , for z ∈ C.
n=0
n!

Of course, we need to show that the right-hand side converges for each z ∈ C. In
fact, we claim that the series defining exp(z) is absolutely convergent. To prove
this, fix z ∈ C and choose k ∈ N so that |z| ≤ k2 . (Just as a reminder, recall that
such a k exists by the Archimedean property.) Then for any n ≥ k, we have
         
|z|n |z| |z| |z| |z| |z|
= · ··· · ···
n! 1 2 k (k + 1) n
     
k k k
≤ |z|k · ···
2(k + 1) 2(k + 2) 2n
 k      
k 1 1 1 1
≤ · ··· = kk n .
2 2 2 2 2
3.7. TANNERY’S THEOREM, THE EXPONENTIAL FUNCTION, AND THE NUMBER e 141

n
Therefore,
P for n ≥ k, |z| C k
n! ≤ 2n where C is the constant k . Since the geometric
n
series 1/2 converges, by the comparison test, the series defining exp(z) is abso-
lutely convergent for any z ∈ C. In the following theorem, we relate the exponential
function to Euler’s number e introduced in Section 3.3. The proof of Property (1)
in this theorem is a beautiful application of Tannery’s theorem.
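As a quick numerical sanity check of the convergence argument above, here is a minimal Python sketch. The function names `exp_partial_sum` and `tail_bound_holds` are hypothetical helpers introduced for illustration; the check compares partial sums of the series with the standard library's `cmath.exp` and tests the bound $|z|^n/n! \le k^k/2^n$ for $n \ge k$.

```python
import cmath
import math

def exp_partial_sum(z, terms):
    """Sum of z**n / n! for n = 0, ..., terms - 1."""
    total = 0j
    term = 1 + 0j              # z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)    # turn z**n / n! into z**(n+1) / (n+1)!
    return total

def tail_bound_holds(z, n_max):
    """Check |z|**n / n! <= k**k / 2**n for k <= n <= n_max, where |z| <= k/2."""
    k = max(1, math.ceil(2 * abs(z)))   # smallest natural number with |z| <= k/2
    C = float(k) ** k
    return all(abs(z) ** n / math.factorial(n) <= C / 2.0 ** n
               for n in range(k, n_max + 1))

z = 3 + 4j
print(exp_partial_sum(z, 60), cmath.exp(z))
print(tail_bound_holds(z, 60))
```

The bound is far from sharp, but sharpness is not needed: any comparison with a convergent geometric series gives absolute convergence.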
Theorem 3.31 (Properties of the complex exponential). The exponential function has the following properties:
(1) For any $z \in \mathbb{C}$ and sequence $z_n \to z$, we have
\[
\exp(z) = \lim_{n\to\infty} \left(1 + \frac{z_n}{n}\right)^n.
\]
In particular, setting $z_n = z$ for all $n$,
\[
\exp(z) = \lim_{n\to\infty} \left(1 + \frac{z}{n}\right)^n,
\]
and setting $z = 1$, we get
\[
\exp(1) = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e.
\]
(2) For any complex numbers $z$ and $w$,
\[
\exp(z) \cdot \exp(w) = \exp(z + w).
\]
(3) $\exp(z)$ is never zero for any complex number $z$, and
\[
\frac{1}{\exp(z)} = \exp(-z).
\]
Proof. To prove (1), let $z \in \mathbb{C}$ and let $\{z_n\}$ be a complex sequence and suppose that $z_n \to z$; we need to show that $\lim_{n\to\infty} (1 + z_n/n)^n = \exp(z)$. To begin, we expand $(1 + z_n/n)^n$ using the binomial theorem:
\[
\left(1 + \frac{z_n}{n}\right)^n = \sum_{k=0}^{n} \binom{n}{k} \frac{z_n^k}{n^k} \quad\Longrightarrow\quad \left(1 + \frac{z_n}{n}\right)^n = \sum_{k=0}^{n} a_k(n),
\]
where $a_k(n) = \binom{n}{k} \frac{z_n^k}{n^k}$. Hence, we are aiming to prove that
\[
\lim_{n\to\infty} \sum_{k=0}^{n} a_k(n) = \exp(z).
\]
Of course, written in this way, we are in the perfect set-up for Tannery’s theorem! However, before going to Tannery’s theorem, we note that, by definition of $a_k(n)$, we have $a_0(n) = 1$ and $a_1(n) = z_n$. Therefore, since $z_n \to z$,
\[
\lim_{n\to\infty} \sum_{k=0}^{n} a_k(n) = \lim_{n\to\infty} \left(1 + z_n + \sum_{k=2}^{n} a_k(n)\right) = 1 + z + \lim_{n\to\infty} \sum_{k=2}^{n} a_k(n).
\]
Thus, we just have to apply Tannery’s theorem to the sum starting from $k = 2$; for this reason, we henceforth assume that $k, n \ge 2$. Now observe that for $2 \le k \le n$, we have
\[
\begin{aligned}
\binom{n}{k} \frac{1}{n^k} &= \frac{n!}{k!(n-k)!} \frac{1}{n^k} = \frac{1}{k!} \, n(n-1)(n-2) \cdots (n-k+1) \, \frac{1}{n^k} \\
&= \frac{1}{k!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right).
\end{aligned}
\]

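Thus, for $2 \le k \le n$,
\[
a_k(n) = \frac{1}{k!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) z_n^k.
\]
Using this expression for $a_k(n)$ we can easily verify the hypotheses of Tannery’s theorem. First, since $z_n \to z$,
\[
\lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \frac{1}{k!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) z_n^k = \frac{z^k}{k!}.
\]
Second, since $\{z_n\}$ is a convergent sequence, it must be bounded, say by a constant $C$, so that $|z_n| \le C$ for all $n$. Then for $2 \le k \le n$,
\[
\begin{aligned}
|a_k(n)| &= \frac{1}{k!} \left[\left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)\right] |z_n|^k \\
&\le \frac{1}{k!} \left[\left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right)\right] C^k \le \frac{C^k}{k!} =: M_k,
\end{aligned}
\]
where we used that the term in brackets is a product of positive numbers $\le 1$, so the product is also $\le 1$. Note that $\sum_{k=2}^{\infty} M_k = \sum_{k=2}^{\infty} C^k/k!$ converges (its sum equals $\exp(C) - 1 - C$, but this isn’t important). Hence by Tannery’s theorem,
\[
\begin{aligned}
\lim_{n\to\infty} \left(1 + \frac{z_n}{n}\right)^n &= 1 + z + \lim_{n\to\infty} \sum_{k=2}^{n} a_k(n) = 1 + z + \sum_{k=2}^{\infty} \lim_{n\to\infty} a_k(n) \\
&= 1 + z + \sum_{k=2}^{\infty} \frac{z^k}{k!} = \exp(z).
\end{aligned}
\]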

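To prove (2), observe that
\[
\exp(z) \cdot \exp(w) = \lim_{n\to\infty} \left(1 + \frac{z}{n}\right)^n \left(1 + \frac{w}{n}\right)^n = \lim_{n\to\infty} \left(1 + \frac{z_n}{n}\right)^n,
\]
where $z_n = z + w + (z \cdot w)/n$; indeed, $\left(1 + \frac{z}{n}\right)\left(1 + \frac{w}{n}\right) = 1 + \frac{z + w}{n} + \frac{zw}{n^2}$. Since $z_n \to z + w$, we obtain
\[
\exp(z) \cdot \exp(w) = \exp(z + w).
\]
In particular,
\[
\exp(z) \cdot \exp(-z) = \exp(z - z) = \exp(0) = 1,
\]
which implies (3). □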
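Properties (1) and (2) lend themselves to a numerical spot check. The sketch below is illustrative only: the helper `compound`, the sample points `z`, `w`, and the choice of `n` are assumptions made here. It uses $z_n = z + w + zw/n$, the sequence arising in the proof of (2).

```python
import cmath

def compound(z_n, n):
    """Return (1 + z_n / n) ** n."""
    return (1 + z_n / n) ** n

z, w = 0.3 + 1.2j, -0.7 + 0.4j   # arbitrary sample points
n = 10 ** 6

# Property (1) with the constant sequence z_n = z.
print(abs(compound(z, n) - cmath.exp(z)))

# Proof of (2): (1 + z/n)(1 + w/n) = 1 + z_n/n with z_n = z + w + z*w/n.
z_n = z + w + z * w / n
print(abs((1 + z / n) * (1 + w / n) - (1 + z_n / n)))
print(abs(compound(z_n, n) - cmath.exp(z + w)))
```

The convergence is slow (the error behaves roughly like a constant over $n$), which is one reason the series is preferred for actual computation.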

We remark that Tannery’s theorem can also be used to establish formulas for sine and cosine; see Problem 2. Also, in Section 4.6 we’ll see that $\exp(z) = e^z$; however, at this point, we don’t even know what $e^z$ (“e to the power z”) means.

3.7.3. Approximation and irrationality of e. We now turn to the question of approximating $e$. Because $n!$ grows very rapidly as $n \to \infty$, we can use the series for the exponential function to calculate $e$ quite easily. If $s_n = \sum_{k=0}^{n} \frac{1}{k!}$ denotes the $n$-th partial sum for $e = \exp(1)$, then
\[
\begin{aligned}
e &= s_n + \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \frac{1}{(n+3)!} + \cdots \\
&= s_n + \frac{1}{(n+1)!} + \frac{1}{(n+1)!\,(n+2)} + \frac{1}{(n+1)!\,(n+2)(n+3)} + \cdots \\
&< s_n + \frac{1}{(n+1)!} \left\{ 1 + \frac{1}{(n+1)} + \frac{1}{(n+1)^2} + \cdots \right\} \\
&= s_n + \frac{1}{(n+1)!} \cdot \frac{1}{1 - \frac{1}{n+1}} = s_n + \frac{1}{n! \, n}.
\end{aligned}
\]
Thus, we get the following useful estimate for $e$:
\[
(3.42) \qquad s_n < e < s_n + \frac{1}{n! \, n}.
\]
Example 3.47. In particular, with $n = 1$ we have $s_1 = 2$ and $1/(1! \cdot 1) = 1$, therefore $2 < e < 3$. Of course, we can get a much more precise estimate with higher values of $n$: with $n = 10$ we obtain (in common decimal notation; see Section 3.8)
\[
2.718281801 < e < 2.718281829.
\]
Thus, already $n = 10$ gives quite an accurate approximation!
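The estimate (3.42) is easy to evaluate exactly with rational arithmetic. The following Python sketch (the function names `s` and `bounds` are illustrative choices, not from the text) computes both ends of the estimate and compares them with the floating-point constant `math.e`.

```python
from fractions import Fraction
from math import e, factorial

def s(n):
    """Partial sum s_n = 1/0! + 1/1! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def bounds(n):
    """Return (s_n, s_n + 1/(n! n)), the two ends of estimate (3.42), for n >= 1."""
    lower = s(n)
    return lower, lower + Fraction(1, factorial(n) * n)

lo, hi = bounds(10)
print(float(lo), float(hi))   # the two ends bracket math.e
print(lo < Fraction(e) < hi)
```

The width of the bracket is $1/(n!\,n)$, so each extra term shrinks the uncertainty by a factor of roughly $n$.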
The estimate (3.42) also gives an easy proof that the number e is irrational, a
fact first proved by Euler in 1737 [36, p. 463].
Theorem 3.32 (Irrationality of e). e is irrational.
Proof. Indeed, by way of contradiction suppose that $e = p/q$ where $p$ and $q$ are positive integers with $q > 1$. Then (3.42) with $n = q$ implies that
\[
s_q < \frac{p}{q} < s_q + \frac{1}{q! \, q}.
\]
Since $s_q = 2 + \frac{1}{2!} + \cdots + \frac{1}{q!}$, the number $q! \, s_q$ is an integer (this is because $q! = 1 \cdot 2 \cdots k \cdot (k+1) \cdots q$ contains a factor of $k!$ for each $1 \le k \le q$), which we denote by $m$. Then multiplying the inequalities $s_q < \frac{p}{q} < s_q + \frac{1}{q! \, q}$ by $q!$ and using the fact that $q > 1$, we obtain
\[
m < p \, (q-1)! < m + \frac{1}{q} < m + 1.
\]
Hence the integer $p \, (q-1)!$ lies strictly between $m$ and $m + 1$, which of course is absurd, since there is no integer strictly between $m$ and $m + 1$. □

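We end with the following neat “infinite nested product” formula for $e$:
\[
(3.43) \qquad e = 1 + \frac{1}{1} + \frac{1}{2}\left(1 + \frac{1}{3}\left(1 + \frac{1}{4}\left(1 + \frac{1}{5}\Big(\cdots\Big)\right)\right)\right);
\]
see Problem 6 for the meaning of the right-hand side.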
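The finite version of the nested formula (the one stated in Problem 6) can be checked mechanically by evaluating the nesting from the inside out. This Python sketch is only a check on small cases, not a proof; the function name `nested` is introduced here for illustration.

```python
from fractions import Fraction
from math import factorial

def nested(n):
    """Evaluate 1 + 1/1 + (1/2)(1 + (1/3)(... (1 + 1/n) ...)) from the inside out."""
    inner = Fraction(1)
    for j in range(n, 1, -1):            # j = n, n - 1, ..., 2
        inner = 1 + Fraction(1, j) * inner
    return 1 + inner

def partial_sum(n):
    """1 + 1/1! + 1/2! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

print(all(nested(n) == partial_sum(n) for n in range(1, 12)))
```

The inside-out evaluation is the same idea as Horner's rule for polynomials: it needs only one division and one addition per level and never forms a factorial.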
Exercises 3.7.

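1. Determine the following limits.
(a) $\displaystyle \lim_{n\to\infty} \left\{ \frac{1 + n}{(1 + 2n)} + \frac{2^2 + n^2}{(1 + 2n)^2} + \cdots + \frac{n^n + n^n}{(1 + 2n)^n} \right\}$,
(b) $\displaystyle \lim_{n\to\infty} \left\{ \frac{n}{\sqrt{1 + (1 \cdot 2 \cdot n)^2}} + \frac{n}{\sqrt{1 + (2 \cdot 3 \cdot n)^2}} + \cdots + \frac{n}{\sqrt{1 + (n \cdot (n+1) \cdot n)^2}} \right\}$,
(c) $\displaystyle \lim_{n\to\infty} \left\{ \frac{1 + n^2}{1 + (1 \cdot n)^2} + \frac{2^2 + n^2}{1 + (2 \cdot n)^2} + \cdots + \frac{n^2 + n^2}{1 + (n \cdot n)^2} \right\}$,
(d) $\displaystyle \lim_{n\to\infty} \left\{ \left(\frac{n}{1 + 1^{2n}}\right)^{\frac{1}{n}} + \left(\frac{n}{1 + 2^{2n}}\right)^{\frac{1}{n}} + \cdots + \left(\frac{n}{1 + n^{2n}}\right)^{\frac{1}{n}} \right\}$,
where for (c) and (d), prove that the limits are $\sum_{k=1}^{\infty} \frac{1}{k^2}$.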
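2. For each $z \in \mathbb{C}$, define the cosine of $z$ by
\[
\cos z := \lim_{n\to\infty} \frac{1}{2} \left\{ \left(1 + \frac{iz}{n}\right)^n + \left(1 - \frac{iz}{n}\right)^n \right\}
\]
and the sine of $z$ by
\[
\sin z := \lim_{n\to\infty} \frac{1}{2i} \left\{ \left(1 + \frac{iz}{n}\right)^n - \left(1 - \frac{iz}{n}\right)^n \right\}.
\]
(a) Use Tannery’s theorem in a similar way as we did in the proof of Property (1) in Theorem 3.31 to prove that the limits defining $\cos z$ and $\sin z$ exist and moreover,
\[
\cos z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!} \quad\text{and}\quad \sin z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!}.
\]
(b) Following the proof that $e$ is irrational, prove that $\cos 1$ (or $\sin 1$ if you prefer) is irrational.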
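3. Following [139], we prove that for any $m \ge 3$,
\[
\sum_{n=0}^{m} \frac{1}{n!} - \frac{3}{2m} < \left(1 + \frac{1}{m}\right)^m < \sum_{n=0}^{m} \frac{1}{n!}.
\]
Taking $m \to \infty$ gives an alternative proof that $\exp(1) = e$. Fix $m \ge 3$.
(i) Prove that for any $2 \le k \le m$, we have
\[
1 - \frac{k(k-1)}{2m} = 1 - \frac{(1 + 2 + \cdots + k - 1)}{m} \le \left(1 - \frac{1}{m}\right) \cdots \left(1 - \frac{k-1}{m}\right) < 1.
\]
(ii) Using (i), prove that
\[
\sum_{n=0}^{m} \frac{1}{n!} - \frac{1}{2m} \sum_{n=0}^{m-2} \frac{1}{n!} < \left(1 + \frac{1}{m}\right)^m < \sum_{n=0}^{m} \frac{1}{n!}.
\]
Now prove the formula. Suggestion: Use the binomial theorem on $\left(1 + \frac{1}{m}\right)^m$.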
4. Let $\{a_n\}$ be any sequence of rational numbers tending to $+\infty$, that is, given any $M > 0$ there is an $N$ such that for all $n > N$, we have $a_n > M$. In this problem we show that
\[
(3.44) \qquad e = \lim \left(1 + \frac{1}{a_n}\right)^{a_n}.
\]
This formula also holds when the $a_n$’s are real numbers, but as of now, we only know about rational powers (we’ll consider real powers in Section 4.6).
(i) Prove that (3.44) holds in case the $a_n$’s are integers tending to $+\infty$.
(ii) By the tails theorem, we may assume that $1 < a_n$ for all $n$. For each $n$, let $m_n$ be the unique integer such that $m_n - 1 \le a_n < m_n$ (thus, $m_n = \lfloor a_n \rfloor + 1$ where $\lfloor a_n \rfloor$ is the greatest integer function). Prove that if $m_n \ge 1$, then
\[
\left(1 + \frac{1}{m_n}\right)^{m_n - 1} \le \left(1 + \frac{1}{a_n}\right)^{a_n} \le \left(1 + \frac{1}{m_n - 1}\right)^{m_n}.
\]
Now prove (3.44).
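5. Let $\{b_n\}$ be any null sequence of positive rational numbers. Prove that
\[
e = \lim \, (1 + b_n)^{\frac{1}{b_n}}.
\]
6. Prove that for any $n \in \mathbb{N}$,
\[
1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} = 1 + \frac{1}{1} + \frac{1}{2}\left(1 + \frac{1}{3}\left(1 + \frac{1}{4}\left(\cdots \left(1 + \frac{1}{n-1}\left(1 + \frac{1}{n}\right)\right) \cdots \right)\right)\right).
\]
The infinite nested sum in (3.43) denotes the limit as $n \to \infty$ of this expression.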
7. Trying to imitate the proof that $e$ is irrational, prove that for any $m \in \mathbb{N}$, $\exp(1/m)$ is irrational. After doing this, show that $\cos(1/m)$ (or $\sin(1/m)$ if you prefer) is irrational, where cosine and sine are defined in Problem 2. See the article [177] for more on irrationality proofs.
8. (Tannery’s theorem II) For each natural number $n$, let $\sum_{k=1}^{\infty} a_k(n)$ be a convergent series. Prove that if for each $k$, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^{\infty} M_k$ of nonnegative real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, then
\[
\lim_{n\to\infty} \sum_{k=1}^{\infty} a_k(n) = \sum_{k=1}^{\infty} \lim_{n\to\infty} a_k(n).
\]
Suggestion: Try to imitate the proof of the original Tannery’s theorem.


9. (Tannery’s theorem and Cauchy’s double series theorem; see Section 6.5 for more on double series!) In this problem we relate Tannery’s theorem to double series. A double sequence is just a map $a : \mathbb{N} \times \mathbb{N} \to \mathbb{C}$; for $m, n \in \mathbb{N}$ we denote $a(m, n)$ by $a_{mn}$. We say that the iterated series $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn}$ converges if for each $m \in \mathbb{N}$, the series $\sum_{n=1}^{\infty} a_{mn}$ converges (call the sum $\alpha_m$) and the series $\sum_{m=1}^{\infty} \alpha_m$ converges. Similarly, we say that the iterated series $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}$ converges if for each $n \in \mathbb{N}$, the series $\sum_{m=1}^{\infty} a_{mn}$ converges (call the sum $\beta_n$) and the series $\sum_{n=1}^{\infty} \beta_n$ converges.
The object of this problem is to prove that given any double sequence $\{a_{mn}\}$ of complex numbers such that either
\[
(3.45) \qquad \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}| \ \text{converges} \quad\text{or}\quad \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}| \ \text{converges}
\]
then
\[
(3.46) \qquad \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}
\]
in the sense that both iterated sums converge and are equal. The implication (3.45) $\Longrightarrow$ (3.46) is called Cauchy’s double series theorem; see Theorem 6.26 in Section 6.5 for the full story. To prove this, you may proceed as follows.
(i) Assume that $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}|$ converges; we must prove the equality (3.46). To do so, for each $k \in \mathbb{N}$ define $M_k := \sum_{j=1}^{\infty} |a_{kj}|$, which converges by assumption. Then $\sum_{k=1}^{\infty} M_k$ also converges by assumption. Define $a_k(n) := \sum_{j=1}^{n} a_{kj}$. Prove that Tannery’s theorem II can be applied to these $a_k(n)$’s and in doing so, establish the equality (3.46).
(ii) Assume that $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|$ converges; prove the equality (3.46).

(iii) Cauchy’s double series theorem can be used to prove neat and non-obvious identities. For example, prove that for any $k \in \mathbb{N}$ and $z \in \mathbb{C}$ with $|z| < 1$, we have
\[
\sum_{n=1}^{\infty} \frac{z^{n(k+1)}}{1 - z^n} = \sum_{m=1}^{\infty} \frac{z^{m+k}}{1 - z^{m+k}};
\]
that is,
\[
\frac{z^{k+1}}{1 - z} + \frac{z^{2(k+1)}}{1 - z^2} + \frac{z^{3(k+1)}}{1 - z^3} + \cdots = \frac{z^{1+k}}{1 - z^{1+k}} + \frac{z^{2+k}}{1 - z^{2+k}} + \frac{z^{3+k}}{1 - z^{3+k}} + \cdots.
\]
Suggestion: Apply Cauchy’s double series theorem to $\{a_{mn}\}$ where $a_{mn} = z^{n(m+k)}$.
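Before attempting a proof of the identity in Problem 9(iii), it can be reassuring to test it numerically. The Python sketch below truncates both series; the function names `lhs` and `rhs`, the sample point `z`, and the truncation length are assumptions made for illustration, and a numerical agreement of course proves nothing.

```python
def lhs(z, k, terms=200):
    """Truncation of sum_{n>=1} z**(n(k+1)) / (1 - z**n)."""
    return sum(z ** (n * (k + 1)) / (1 - z ** n) for n in range(1, terms + 1))

def rhs(z, k, terms=200):
    """Truncation of sum_{m>=1} z**(m+k) / (1 - z**(m+k))."""
    return sum(z ** (m + k) / (1 - z ** (m + k)) for m in range(1, terms + 1))

z = 0.5 + 0.3j                      # any sample point with |z| < 1
for k in (1, 2, 3):
    print(k, abs(lhs(z, k) - rhs(z, k)))
```

Both sides are the two iterated sums of the same double sequence $a_{mn} = z^{n(m+k)}$, summed first over $m$ and first over $n$ respectively.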

3.8. Decimals and “most” numbers are transcendental á la Cantor


Since grade school we have represented real numbers in “base 10”.7 In this
section we continue our discussion initiated in Section 2.5 (for integers) on the use
of arbitrary bases for real numbers. We also look at “Cantor’s diagonal argument”
that is able to construct transcendental numbers.

3.8.1. Decimal and b-adic representations of real numbers. We are all familiar with the common decimal or base 10 notation, which we used without mention in the last section concerning the estimate $2.718281801 < e < 2.718281829$. Here, we know that the decimal (also called base 10) notation 2.718281801 represents the number
\[
2 + \frac{7}{10} + \frac{1}{10^2} + \frac{8}{10^3} + \frac{2}{10^4} + \frac{8}{10^5} + \frac{1}{10^6} + \frac{8}{10^7} + \frac{0}{10^8} + \frac{1}{10^9},
\]
that is, this real number gives meaning to the symbol 2.718281801. More generally, the symbol $\alpha_k \alpha_{k-1} \cdots \alpha_0 . a_1 a_2 a_3 a_4 a_5 \ldots$, where the $\alpha_n$’s and $a_n$’s are integers in $0, 1, \ldots, 9$, represents the number
\[
\alpha_k \cdot 10^k + \cdots + \alpha_1 \cdot 10 + \alpha_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \cdots = \sum_{n=0}^{k} \alpha_n \cdot 10^n + \sum_{n=1}^{\infty} \frac{a_n}{10^n}.
\]
Notice that the infinite series $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$ converges because $0 \le a_n \le 9$ for all $n$ so we can compare this series with $\sum_{n=1}^{\infty} \frac{9}{10^n} = 1 < \infty$. In particular, the number $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$ lies in $[0, 1]$.
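More generally, instead of restricting to base 10, we can use other bases. Let $b > 1$ be an integer (the base). Then the symbol $\alpha_k \alpha_{k-1} \cdots \alpha_0 ._b\, a_1 a_2 a_3 a_4 a_5 \ldots$, where the $\alpha_n$’s and $a_n$’s are integers in $0, 1, \ldots, b - 1$, represents the real number
\[
a = \alpha_k \cdot b^k + \cdots + \alpha_1 \cdot b + \alpha_0 + \frac{a_1}{b} + \frac{a_2}{b^2} + \frac{a_3}{b^3} + \cdots = \sum_{n=0}^{k} \alpha_n \cdot b^n + \sum_{n=1}^{\infty} \frac{a_n}{b^n}.
\]
The infinite series $\sum_{n=1}^{\infty} \frac{a_n}{b^n}$ converges because $0 \le a_n \le b - 1$ for all $n$ so we can compare this series with
\[
\sum_{n=1}^{\infty} \frac{b - 1}{b^n} = (b - 1) \cdot \sum_{n=1}^{\infty} \left(\frac{1}{b}\right)^n = (b - 1) \, \frac{\frac{1}{b}}{1 - \frac{1}{b}} = 1 < \infty.
\]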

7. To what heights would science now be raised if Archimedes had made that discovery! [= the decimal system of numeration or its equivalent (with some base other than 10)]. Carl Friedrich Gauss (1777–1855).

The symbol $\alpha_k \alpha_{k-1} \cdots \alpha_0 ._b\, a_1 a_2 a_3 a_4 a_5 \ldots$ is called the b-adic representation or b-adic expansion of $a$. The numbers $\alpha_n$ and $a_n$ are called digits. A natural question is: Does every real number have such a representation? The answer is yes. Recall from Section 2.5 that integers have b-adic representations so we can focus on noninteger numbers. Now given any $x \in \mathbb{R} \setminus \mathbb{Z}$, we can write $x = m + a$ where $m = \lfloor x \rfloor$ is the integer part of $x$ and $0 < a < 1$. We already know that $m$ has a b-adic expansion, so we can focus on writing $a$ in a b-adic expansion.
Theorem 3.33. Let $b \in \mathbb{N}$ with $b > 1$. Then for any $a \in [0, 1]$, there exists a sequence of integers $\{a_n\}_{n=1}^{\infty}$ with $0 \le a_n \le b - 1$ for all $n$ such that
\[
a = \sum_{n=1}^{\infty} \frac{a_n}{b^n};
\]
if $a \ne 0$, then infinitely many of the $a_n$’s are nonzero.
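Proof. If $a = 0$ we must (and can) take all the $a_n$’s to be zero, so we may assume that $a \in (0, 1]$; we find the $a_n$’s as follows. First, we divide $(0, 1]$ into $b$ disjoint intervals:
\[
\left(0, \frac{1}{b}\right], \ \left(\frac{1}{b}, \frac{2}{b}\right], \ \left(\frac{2}{b}, \frac{3}{b}\right], \ \ldots, \ \left(\frac{b-1}{b}, 1\right].
\]
Since $a \in (0, 1]$, $a$ must lie in one of these intervals, so there is an integer $a_1$ with $0 \le a_1 \le b - 1$ such that
\[
a \in \left(\frac{a_1}{b}, \frac{a_1 + 1}{b}\right] \quad\Longleftrightarrow\quad \frac{a_1}{b} < a \le \frac{a_1 + 1}{b}.
\]
Second, we divide $\left(\frac{a_1}{b}, \frac{a_1 + 1}{b}\right]$ into $b$ disjoint subintervals. Since the length of $\left(\frac{a_1}{b}, \frac{a_1 + 1}{b}\right]$ is $\frac{a_1 + 1}{b} - \frac{a_1}{b} = \frac{1}{b}$, we divide the interval $\left(\frac{a_1}{b}, \frac{a_1 + 1}{b}\right]$ into $b$ subintervals of length $(1/b)/b = 1/b^2$:
\[
\left(\frac{a_1}{b}, \frac{a_1}{b} + \frac{1}{b^2}\right], \ \left(\frac{a_1}{b} + \frac{1}{b^2}, \frac{a_1}{b} + \frac{2}{b^2}\right], \ \left(\frac{a_1}{b} + \frac{2}{b^2}, \frac{a_1}{b} + \frac{3}{b^2}\right], \ \ldots, \ \left(\frac{a_1}{b} + \frac{b-1}{b^2}, \frac{a_1}{b} + \frac{1}{b}\right].
\]
Now $a \in \left(\frac{a_1}{b}, \frac{a_1 + 1}{b}\right]$, so $a$ must lie in one of these intervals. Thus, there is an integer $a_2$ with $0 \le a_2 \le b - 1$ such that
\[
a \in \left(\frac{a_1}{b} + \frac{a_2}{b^2}, \frac{a_1}{b} + \frac{a_2 + 1}{b^2}\right] \quad\Longleftrightarrow\quad \frac{a_1}{b} + \frac{a_2}{b^2} < a \le \frac{a_1}{b} + \frac{a_2 + 1}{b^2}.
\]
Continuing this process (slang for “by induction”) we can find a sequence of integers $\{a_n\}$ such that $0 \le a_n \le b - 1$ for all $n$ and
\[
(3.47) \qquad \frac{a_1}{b} + \frac{a_2}{b^2} + \cdots + \frac{a_{n-1}}{b^{n-1}} + \frac{a_n}{b^n} < a \le \frac{a_1}{b} + \cdots + \frac{a_{n-1}}{b^{n-1}} + \frac{a_n + 1}{b^n}.
\]
Let $y := \sum_{n=1}^{\infty} \frac{a_n}{b^n}$; this series converges because its partial sums are bounded by $a$ according to the left-hand inequality in (3.47). Since $1/b^n \to 0$ as $n \to \infty$, by taking $n \to \infty$ in (3.47) and using the squeeze rule, we see that $y \le a \le y$. This shows that $a = y = \sum_{n=1}^{\infty} \frac{a_n}{b^n}$. There must be infinitely many nonzero $a_n$’s, for if there were only finitely many nonzero $a_n$’s, say for some $m$ we have $a_n = 0$ for all $n > m$, then we would have $a = \sum_{n=1}^{m} \frac{a_n}{b^n}$. Now setting $n = m$ in (3.47) and looking at the left-hand inequality shows that $a < a$. This is impossible, so there must be infinitely many nonzero $a_n$’s. □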
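The interval-halving construction in the proof of Theorem 3.33 is effectively an algorithm. Here is a minimal Python sketch of it using exact rational arithmetic; the function name `badic_digits` and the rescaling trick (tracking the remainder $x \in (0, 1]$ instead of the nested intervals) are implementation choices made here, not notation from the text.

```python
from fractions import Fraction
from math import ceil

def badic_digits(a, b, count):
    """First `count` digits a_1, a_2, ... of a in (0, 1], chosen as in the proof:
    each digit d satisfies d/b < x <= (d + 1)/b for the rescaled remainder x."""
    if not (0 < a <= 1) or b < 2:
        raise ValueError("need 0 < a <= 1 and an integer base b >= 2")
    x = Fraction(a)
    digits = []
    for _ in range(count):
        d = ceil(b * x) - 1    # the half-open interval (d/b, (d+1)/b] contains x
        digits.append(d)
        x = b * x - d          # rescaled remainder, again in (0, 1]
    return digits

print(badic_digits(Fraction(1, 2), 10, 6))
print(badic_digits(Fraction(1, 3), 10, 6))
```

Because the intervals are open on the left and closed on the right, the remainder never reaches zero, so a number such as 1/2 is assigned the expansion ending in repeated 9's rather than the terminating one; this matches the claim that infinitely many digits are nonzero.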
Here’s another question: If a b-adic representation exists, is it unique? The
answer to this question is no.

Example 3.48. Consider, for example, the number 1/2, which has two decimal expansions:
\[
\frac{1}{2} = 0.50000000\ldots \quad\text{and}\quad \frac{1}{2} = 0.49999999\ldots.
\]
Notice that the first decimal expansion terminates.

You might remember from high school that the only decimals with two different expansions are the ones that terminate. In general, a b-adic expansion $0._b\, a_1 a_2 a_3 a_4 a_5 \ldots$ is said to terminate if all the $a_n$’s equal zero for $n$ large.

Theorem 3.34. Let $b$ be a positive integer greater than 1. Then every real number in $(0, 1]$ has a unique b-adic expansion, except for numbers with a terminating expansion, which also have a second b-adic expansion in which $a_n = b - 1$ for all $n$ sufficiently large.
Proof. For $a \in (0, 1]$, let $a = \sum_{n=1}^{\infty} \frac{a_n}{b^n}$ be its b-adic expansion found in Theorem 3.33, so there are infinitely many nonzero $a_n$’s. Suppose that $\{\alpha_n\}$ is another sequence of integers, not equal to the sequence $\{a_n\}$, such that $0 \le \alpha_n \le b - 1$ for all $n$ and such that $a = \sum_{n=1}^{\infty} \frac{\alpha_n}{b^n}$. Since $\{a_n\}$ and $\{\alpha_n\}$ are not the same sequence there is at least one $n$ such that $a_n \ne \alpha_n$. Let $m$ be the smallest natural number such that $a_m \ne \alpha_m$. Then $a_n = \alpha_n$ for $n = 1, 2, \ldots, m - 1$, so
\[
\sum_{n=1}^{\infty} \frac{a_n}{b^n} = \sum_{n=1}^{\infty} \frac{\alpha_n}{b^n} \quad\Longrightarrow\quad \sum_{n=m}^{\infty} \frac{a_n}{b^n} = \sum_{n=m}^{\infty} \frac{\alpha_n}{b^n}.
\]
Since there are infinitely many nonzero $a_n$’s, we have
\[
\begin{aligned}
\frac{a_m}{b^m} < \sum_{n=m}^{\infty} \frac{a_n}{b^n} = \sum_{n=m}^{\infty} \frac{\alpha_n}{b^n} &= \frac{\alpha_m}{b^m} + \sum_{n=m+1}^{\infty} \frac{\alpha_n}{b^n} \\
&\le \frac{\alpha_m}{b^m} + \sum_{n=m+1}^{\infty} \frac{b - 1}{b^n} = \frac{\alpha_m}{b^m} + \frac{1}{b^m}.
\end{aligned}
\]
Multiplying the extremities of these inequalities by $b^m$, we obtain $a_m < \alpha_m + 1$, so $a_m \le \alpha_m$. Since we know that $a_m \ne \alpha_m$, we must actually have $a_m < \alpha_m$. Now
\[
\begin{aligned}
\frac{\alpha_m}{b^m} \le \sum_{n=m}^{\infty} \frac{\alpha_n}{b^n} = \sum_{n=m}^{\infty} \frac{a_n}{b^n} &= \frac{a_m}{b^m} + \sum_{n=m+1}^{\infty} \frac{a_n}{b^n} \\
&\le \frac{a_m}{b^m} + \sum_{n=m+1}^{\infty} \frac{b - 1}{b^n} = \frac{a_m}{b^m} + \frac{1}{b^m} \le \frac{\alpha_m}{b^m}.
\end{aligned}
\]
Since the ends are equal, all the inequalities in between must be equalities. In particular, making the first inequality into an equality shows that $\alpha_n = 0$ for all $n = m + 1, m + 2, m + 3, \ldots$ and making the middle inequality into an equality shows that $a_n = b - 1$ for all $n = m + 1, m + 2, m + 3, \ldots$. It follows that $a$ has only one b-adic expansion except when we can write $a$ as
\[
a = 0._b\, \alpha_1 \alpha_2 \ldots \alpha_m = 0._b\, a_1 \ldots a_m (b-1)(b-1)(b-1) \ldots,
\]
a terminating one, and one that has repeating $(b-1)$’s. This completes our proof. □

3.8.2. Rational numbers. We now consider periodic decimals, such as
\[
\frac{1}{3} = 0.3333333\ldots, \quad \frac{3526}{495} = 7.1232323\ldots, \quad \frac{611}{495} = 1.2343434\ldots.
\]
As you well know, we usually write these decimals as
\[
\frac{1}{3} = 0.\overline{3}, \quad \frac{3526}{495} = 7.1\overline{23}, \quad \frac{611}{495} = 1.2\overline{34}.
\]
For general b-adic expansions, we say that $\alpha_k \ldots \alpha_0 ._b\, a_1 a_2 a_3 \cdots$ is periodic if there exists an $\ell \in \mathbb{N}$ (called a period) such that $a_n = a_{n+\ell}$ for all $n$ sufficiently large.
Example 3.49. For example, in the base 10 expansion of $\frac{3526}{495} = 7.1232323\ldots = 7.a_1 a_2 a_3 \ldots$, we have $a_n = a_{n+\ell}$ with period $\ell = 2$ for all $n \ge 2$.
We can actually see how the periodic pattern appears by going back to high school long division! Indeed, long dividing 495 into 3526 we get

             7.123
    495 ) 3526.000
          3465
          ----
            61 0
            49 5
            ----
            11 50
             9 90
            -----
             1 600
             1 485
             -----
               115

At this point, we get another remainder of 115, exactly as we did a few lines before. Thus, by continuing this process of long division, we are going to repeat the pattern 2, 3. We shall use this long division technique to prove the following theorem.
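The long division above can be mechanized: keep dividing $b$ times the current remainder by $q$ and stop as soon as a remainder repeats. The following Python sketch is one way to do this; the function name `expand` and its return convention (non-repeating prefix, repeating block) are choices made here for illustration.

```python
def expand(p, q, b=10):
    """b-adic digits of p/q for integers 0 <= p < q, as (prefix, repeating_block)."""
    if not (0 <= p < q) or b < 2:
        raise ValueError("need 0 <= p < q and an integer base b >= 2")
    digits = []
    seen = {}                       # remainder -> number of digits produced before it
    r = p
    while r not in seen:            # at most q distinct remainders, so this terminates
        seen[r] = len(digits)
        a, r = divmod(b * r, q)     # b * r = a * q + r_next with 0 <= r_next < q
        digits.append(a)
    start = seen[r]                 # the digits from this position on repeat forever
    return digits[:start], digits[start:]

# 3526/495 = 7 + 61/495, so expand the fractional part 61/495.
print(expand(61, 495))
print(expand(1, 2))
```

For 61/495 the remainder 115 recurs, giving the prefix 1 and the repeating block 2, 3, in agreement with the hand computation; a terminating expansion shows up as the repeating block 0.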
Theorem 3.35. Let b be a positive integer greater than 1. A real number is
rational if and only if its b-adic expansion is periodic.
Proof. We first prove the “only if”, then the “if” statement.
Step 1: We prove the “only if”: Given integers $p, q$ with $q > 0$, we show that $p/q$ has a periodic b-adic expansion. By the division algorithm (see Theorem 2.15), we can write $p/q = q' + r/q$ where $q' \in \mathbb{Z}$ and $0 \le r < q$. Thus, we just have to prove that $r/q$ has a periodic b-adic expansion. In particular, we might as well assume from the beginning that $0 < p < q$ so that $p/q < 1$. Proceeding via high school long division, we construct the b-adic expansion of $p/q$.
First, using the division algorithm, we divide $bp$ by $q$, obtaining a unique integer $a_1$ such that $bp = a_1 q + r_1$ where $0 \le r_1 < q$. Since
\[
\frac{p}{q} - \frac{a_1}{b} = \frac{bp - a_1 q}{bq} = \frac{r_1}{bq} \ge 0,
\]
we have
\[
\frac{a_1}{b} \le \frac{p}{q} < 1,
\]
which implies that $0 \le a_1 < b$.
Next, using the division algorithm, we divide $b r_1$ by $q$, obtaining a unique integer $a_2$ such that $b r_1 = a_2 q + r_2$ where $0 \le r_2 < q$. Since
\[
\frac{p}{q} - \frac{a_1}{b} - \frac{a_2}{b^2} = \frac{r_1}{bq} - \frac{a_2}{b^2} = \frac{b r_1 - a_2 q}{b^2 q} = \frac{r_2}{b^2 q} \ge 0,
\]
we have
\[
\frac{a_2}{b^2} \le \frac{r_1}{bq} < \frac{q}{bq} = \frac{1}{b},
\]
which implies that $0 \le a_2 < b$.
Once more using the division algorithm, we divide $b r_2$ by $q$, obtaining a unique integer $a_3$ such that $b r_2 = a_3 q + r_3$ where $0 \le r_3 < q$. Since
\[
\frac{p}{q} - \frac{a_1}{b} - \frac{a_2}{b^2} - \frac{a_3}{b^3} = \frac{r_2}{b^2 q} - \frac{a_3}{b^3} = \frac{b r_2 - a_3 q}{b^3 q} = \frac{r_3}{b^3 q} \ge 0,
\]
we have
\[
\frac{a_3}{b^3} \le \frac{r_2}{b^2 q} < \frac{q}{b^2 q} = \frac{1}{b^2},
\]
which implies that $0 \le a_3 < b$. Continuing by induction, we construct integers $a_n, r_n$ with $0 \le a_n < b$ and $0 \le r_n < q$ such that for each $n$, $b r_n = a_{n+1} q + r_{n+1}$ and
\[
\frac{p}{q} - \frac{a_1}{b} - \frac{a_2}{b^2} - \frac{a_3}{b^3} - \cdots - \frac{a_n}{b^n} = \frac{r_n}{b^n q}.
\]
Since $0 \le r_n < q$ it follows that $\frac{r_n}{b^n q} \to 0$ as $n \to \infty$, so we can write
\[
(3.48) \qquad \frac{p}{q} = \sum_{n=1}^{\infty} \frac{a_n}{b^n} \quad\Longleftrightarrow\quad \frac{p}{q} = 0._b\, a_1 a_2 a_3 a_4 a_5 \ldots.
\]
Now one of two things holds: Either some remainder $r_n = 0$ or none of the $r_n$’s are zero. Suppose that we are in the first case, some $r_n = 0$. By construction, we divide $b r_n$ by $q$ using the division algorithm to get $b r_n = a_{n+1} q + r_{n+1}$. Since $r_n = 0$ and quotients and remainders are unique, we must have $a_{n+1} = 0$ and $r_{n+1} = 0$. By construction, we divide $b r_{n+1}$ by $q$ using the division algorithm to get $b r_{n+1} = a_{n+2} q + r_{n+2}$. Since $r_{n+1} = 0$ and quotients and remainders are unique, we must have $a_{n+2} = 0$ and $r_{n+2} = 0$. Continuing this procedure, we see that all $a_k$ with $k > n$ are zero. This, in view of (3.48), shows that the b-adic expansion of $p/q$ has repeating zeros, so in particular is periodic.
Suppose that we are in the second case, no $r_n = 0$. Consider the $q + 1$ remainders $r_1, r_2, \ldots, r_{q+1}$. Since $0 \le r_n < q$, each $r_n$ can only take on the $q$ values $0, 1, 2, \ldots, q - 1$ (“$q$ holes”), so by the pigeonhole principle, two of these remainders must have the same value (“be in the same hole”). Thus, $r_k = r_{k+\ell}$ for some $k$ and $\ell$. We now show that $a_{k+1} = a_{k+\ell+1}$. Indeed, $a_{k+1}$ was defined by dividing $b r_k$ by $q$ so that $b r_k = a_{k+1} q + r_{k+1}$. On the other hand, $a_{k+\ell+1}$ was defined by dividing $b r_{k+\ell}$ by $q$ so that $b r_{k+\ell} = a_{k+\ell+1} q + r_{k+\ell+1}$. Now the division algorithm states that the quotients and remainders are unique. Since $b r_k = b r_{k+\ell}$, it follows that $a_{k+1} = a_{k+\ell+1}$ and $r_{k+1} = r_{k+\ell+1}$. Repeating this same argument shows that $a_{k+n} = a_{k+\ell+n}$ for all $n > 0$; that is, $a_n = a_{n+\ell}$ for all $n > k$. Thus, $p/q$ has a periodic b-adic expansion.
Step 2: We now prove the “if” portion: A number with a periodic b-adic expansion is rational. Let $a$ be a real number and suppose that its b-adic expansion is periodic. Since $a$ is rational if and only if its noninteger part is rational, we may assume that the integer part of $a$ is zero. Let
\[
a = 0._b\, a_1 a_2 \cdots a_k \overline{b_1 \cdots b_\ell}
\]
have a periodic b-adic expansion, where the bar means that the block $b_1 \cdots b_\ell$ repeats. Observe that in an expansion $\alpha_m \alpha_{m-1} \cdots \alpha_0 ._b\, \beta_1 \beta_2 \beta_3 \ldots$, multiplication by $b^n$ for $n \in \mathbb{N}$ moves the decimal point $n$ places to the right. (Try to prove this; think about the familiar base 10 case first.) In particular,
\[
b^{k+\ell} a = a_1 a_2 \cdots a_k b_1 \cdots b_\ell \, ._b\, \overline{b_1 \cdots b_\ell} = a_1 a_2 \cdots a_k b_1 \cdots b_\ell + 0._b\, \overline{b_1 \cdots b_\ell}
\]
and
\[
b^k a = a_1 a_2 \cdots a_k \, ._b\, \overline{b_1 \cdots b_\ell} = a_1 a_2 \cdots a_k + 0._b\, \overline{b_1 \cdots b_\ell}.
\]
Subtracting, we see that the numbers given by $0._b\, \overline{b_1 \cdots b_\ell}$ cancel, so $b^{k+\ell} a - b^k a = p$, where $p$ is an integer. Hence, $a = p/q$, where $q = b^{k+\ell} - b^k$. Thus $a$ is rational. □
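Step 2 of the proof gives an explicit recipe for turning a periodic expansion back into a fraction: subtract the integer formed by the non-repeating digits from the integer formed by the non-repeating digits followed by one block, and divide by $b^{k+\ell} - b^k$. A minimal Python sketch of that recipe follows; the function name `from_periodic` and the digit-list interface are assumptions made here.

```python
from fractions import Fraction

def from_periodic(prefix, block, b=10):
    """Value of 0.(prefix)(block repeated forever) in base b as an exact fraction."""
    if not block:
        raise ValueError("the repeating block must be nonempty")

    def as_int(digs):
        # Read a digit list as a base-b integer, e.g. [1, 2, 3] -> 123 when b = 10.
        n = 0
        for d in digs:
            n = n * b + d
        return n

    k, l = len(prefix), len(block)
    p = as_int(prefix + block) - as_int(prefix)   # b**(k+l) * a - b**k * a
    q = b ** (k + l) - b ** k
    return Fraction(p, q)

print(from_periodic([1], [2, 3]))   # fractional part of 7.1232323...
print(from_periodic([4], [9]))      # 0.4999...
```

`Fraction` reduces to lowest terms automatically, so the first call recovers 61/495 and the second shows that 0.4999... and 0.5 name the same number.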
3.8.3. Cantor’s diagonal argument. Now that we know about decimal ex-
pansions, we can present Cantor’s second proof that the real numbers are uncount-
able. His first proof appeared in Section 2.10.
Theorem 3.36 (Cantor’s second proof ). The interval (0, 1) is uncountable.
Proof. Assume, for sake of deriving a contradiction, that there is a bijection $f : \mathbb{N} \to (0, 1)$. Let us write the images of $f$ as decimals (base 10):
\[
\begin{array}{rcl}
1 & \longleftrightarrow & f(1) = .a_{11} a_{12} a_{13} a_{14} \cdots \\
2 & \longleftrightarrow & f(2) = .a_{21} a_{22} a_{23} a_{24} \cdots \\
3 & \longleftrightarrow & f(3) = .a_{31} a_{32} a_{33} a_{34} \cdots \\
4 & \longleftrightarrow & f(4) = .a_{41} a_{42} a_{43} a_{44} \cdots \\
\vdots & & \vdots
\end{array}
\]
where we may assume that in each of these expansions there is never an infinite run of 9’s. Recall from Theorem 3.34 that every real number in $(0, 1)$ has a unique such representation. Now let us define a real number $a = .a_1 a_2 a_3 \cdots$, where
\[
a_n := \begin{cases} 3 & \text{if } a_{nn} \ne 3 \\ 7 & \text{if } a_{nn} = 3. \end{cases}
\]
(The choice of 3 and 7 is arbitrary; you can choose another pair of unequal integers in $0, \ldots, 8$ if you like, avoiding 9 so that $a$ has no infinite run of 9’s!) Notice that $a_n \ne a_{nn}$ for all $n$. In particular, $a \ne f(1)$ because $a$ and $f(1)$ differ in the first digit. On the other hand, $a \ne f(2)$ because $a$ and $f(2)$ differ in the second digit. Similarly, $a \ne f(n)$ for every $n$ since $a$ and $f(n)$ differ in the $n$-th digit. This contradicts that $f : \mathbb{N} \to (0, 1)$ is onto. □
This argument is not only elegant, it is useful: Cantor’s diagonal argument
gives a good method to generate transcendental numbers (see [87])!
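The diagonal construction is concrete enough to run on any finite portion of a list. In the Python sketch below, the function name `diagonal` and the sample rows are made up for illustration; the rule for choosing digits (3 unless the diagonal digit is 3, in which case 7) is the one from the proof.

```python
def diagonal(rows):
    """Given digit rows (row n must have at least n + 1 digits, counting from 0),
    return digits that differ from row n in position n."""
    return [7 if row[n] == 3 else 3 for n, row in enumerate(rows)]

# Hypothetical start of a list f(1), f(2), f(3), f(4), written as digit rows.
rows = [
    [3, 3, 3, 3],
    [1, 4, 1, 5],
    [7, 1, 3, 2],
    [5, 0, 0, 0],
]
a = diagonal(rows)
print(a)
print(all(a[n] != rows[n][n] for n in range(len(rows))))
```

However the list is extended, the number built this way disagrees with the $n$-th entry in its $n$-th digit, which is the whole content of the argument.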
Exercises 3.8.
1. Find the numbers with the b-adic expansions (here $b = 10, 2, 3$, respectively):
(a) $0.010101\ldots$, (b) $0._2\, 010101\ldots$, (c) $0._3\, 010101\ldots$.
2. Prove that a real number $a \in (0, 1)$ has a terminating decimal expansion if and only if $2^m 5^n a \in \mathbb{Z}$ for some nonnegative integers $m, n$.
3. (s-adic expansions) Let $s = \{b_n\}$ be a sequence of integers with $b_n > 1$ for all $n$ and let $0 < a \le 1$. Prove that there is a sequence of integers $\{a_n\}_{n=1}^{\infty}$ with $0 \le a_n \le b_n - 1$ for all $n$ and with infinitely many nonzero $a_n$’s such that
\[
a = \sum_{n=1}^{\infty} \frac{a_n}{b_1 \cdot b_2 \cdot b_3 \cdots b_n}.
\]
Suggestion: Can you imitate the proof of Theorem 3.33?



4. (Cantor’s original diagonal argument) Let g and c be any two distinct objects
and let G be the set consisting of all functions f : N −→ {g, c}. Let f1 , f2 , f3 , . . . be
any infinite sequence of elements of G. Prove that there is an element f in G that is
not in this list. From this prove that G is uncountable. Conclude that the set of all
sequences of 0’s and 1’s is uncountable.
CHAPTER 4

Limits, continuity, and elementary functions

One merit of mathematics few will deny: it says more in fewer words than any other science. The formula, $e^{i\pi} = -1$ expressed a world of thought, of truth, of poetry, and of the religious spirit “God eternally geometrizes.”
David Eugene Smith (1860–1944) [188].
In this chapter we study, without doubt, the most important types of functions in all of analysis and topology, the continuous functions. In particular, we study the continuity properties of “the most important function in mathematics” [192, p. 1]: $\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$, $z \in \mathbb{C}$. From this single function arises just about every important function and number you can think of: the logarithm function, powers, roots, the trigonometric functions, the hyperbolic functions, the number $e$, the number $\pi$, . . . , and the famous formula displayed in the above quote!
What do the Holy Bible, squaring the circle, House bill No. 246 of the Indiana
state legislature in 1897, square free natural numbers, coprime natural numbers,
the sentence
(4.1) May I have a large container of coffee? Thank you,
the mathematicians Archimedes of Syracuse, William Jones, Leonhard Euler, Jo-
hann Heinrich Lambert, Carl Louis Ferdinand von Lindemann, John Machin, and
Yasumasa Kanada have to do with each other? The answer (drum roll please):
They all have been involved in the life of the remarkable number π! This fascinat-
ing number is defined and some of its amazing and death-defying properties and
formulæ are studied in this chapter! By the way, the sentence (4.1) is a mnemonic
device to remember the digits of π. The number of letters in each word represents
a digit of π; e.g. “May” represents 3, “I” 1, etc. The sentence (4.1) gives ten digits
of π: 3.141592653.1
In Section 4.1 we begin our study of continuity by learning limits of functions,
in Section 4.2 we study some useful limit properties, and then in Section 4.3 we
discuss continuous functions in terms of limits of functions. In Section 4.4, we study
some fundamental properties of continuous functions. A special class of functions,
called monotone functions, have many special properties, which are investigated in
Section 4.5. In Section 4.6 we study “the most important function in mathematics”
and we also study its inverse, the logarithm function, and then we use the logarithm
function to define powers. We also define the Riemann zeta function, the Euler-Mascheroni constant $\gamma$:
\[
\gamma := \lim_{n\to\infty} \left(1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n\right),
\]

1. Using mnemonics to memorize digits of π isn’t a good idea if you want to beat Hiroyuki Goto’s record of reciting 42,195 digits from memory! (See http://www.pi-world-ranking-list.com)


a constant that will come up again and again (see the book [96], which is devoted to this number), and we’ll prove that the alternating harmonic series has sum $\log 2$:
\[
\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots,
\]
another fact that will come up often. In Section 4.7 we use the exponential function
to define the trigonometric functions and we define π, the fundamental constant
of geometry. In Section 4.8 we study roots of complex numbers and we give fairly
elementary proofs of the fundamental theorem of algebra. In Section 4.9 we study
the inverse trigonometric functions. The calculation and (hopeful) imparting of a
sense of great fascination of the incredible number π are the features of Sections
4.10, 5.1 and 5.2. In particular, we’ll derive the first analytical expression for $\pi$:
\[
\frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2} \sqrt{\frac{1}{2}}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2} \sqrt{\frac{1}{2} + \frac{1}{2} \sqrt{\frac{1}{2}}}} \cdots,
\]
given in 1593 by François Viète (1540–1603) [47, p. 69], Gregory-Leibniz-Madhava’s formula for $\pi/4$:
\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + - \cdots,
\]
and Euler’s solution to the famous Basel problem. Here, the Basel problem was the following: Find the sum of the reciprocals of the squares of the natural numbers, $\sum_{n=1}^{\infty} \frac{1}{n^2}$; the answer, first given by Euler in 1734, is $\pi^2/6$:
\[
\frac{\pi^2}{6} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} + \cdots.
\]
Chapter 4 objectives: The student will be able to . . .
• apply the rigorous ε-δ definition of limits for functions and continuity.
• apply and understand the proofs of the fundamental theorem of continuous
functions.
• define the elementary functions (exponential, trigonometric, and their inverses)
and the number π.
• explain three related proofs of the fundamental theorem of algebra.

4.1. Convergence and ε-δ arguments for limits of functions


In elementary calculus you most likely studied limits of functions without rig-
orous proofs, using intuition, graphs, or informal reasoning to determine limits. In
this section we define limits and seek to truly understand them precisely.
4.1.1. Limit points and the ε-δ definition of limit. Before reading on, it might benefit the reader to reread the material on open balls in Section 2.8. If $A \subseteq \mathbb{R}^m$, then a point $c \in \mathbb{R}^m$ is said to be a limit point of $A$ if every open ball centered at $c$ contains a point of $A$ different from $c$. In other words, given any $r > 0$, there is a point $x \in A$ such that $x \in B_r(c)$ and $x \ne c$, which is to say,
\[
c \text{ is a limit point of } A \iff \text{for each } r > 0, \text{ there’s an } x \in A \text{ with } 0 < |x - c| < r.
\]
The inequality $0 < |x - c|$ just means that $x \ne c$ while the inequality $|x - c| < r$ just means that $x \in B_r(c)$. If $m = 1$, then $c$ is a limit point of $A$ if for any $r > 0$, there is a point $x \in A$ such that $x \in (c - r, c + r)$ and $x \ne c$. We remark that the point $c$ may or may not belong to $A$.
Example 4.1. Let A = [0, 1). Then 0 is a limit point of A and 1 is also a limit
point of A; in this example, 0 belongs to A while 1 does not. Moreover, as the
reader can verify, the set of all limit points of A is the closed interval [0, 1].
Example 4.2. If A = {1/n ; n ∈ N}, then the diligent reader will verify that
0 is the only limit point of A. (Note that every open ball centered at 0 contains a
point in A by the 1/n-principle.)
The name “limit point” fits because the following lemma states that limit points
are exactly that, limits of points in A.
Lemma 4.1 (Limit points and sequences). A point $c \in \mathbb{R}^m$ is a limit point of a set $A \subseteq \mathbb{R}^m$ if and only if $c = \lim a_n$ for some sequence $\{a_n\}$ contained in $A$ with $a_n \ne c$ for each $n$.
Proof. Assume that $c$ is a limit point of $A$. For each $n$, by definition of limit point (put $r = 1/n$), there is a point $a_n \in A$ such that $0 < |a_n - c| < 1/n$. We leave the reader to check that $a_n \to c$.
Conversely, suppose that $c = \lim a_n$ for some sequence $\{a_n\}$ contained in $A$ with $a_n \ne c$ for each $n$. We shall prove that $c$ is a limit point of $A$. Let $r > 0$. Then by definition of convergence for $a_n \to c$, there is an $n$ sufficiently large such that $|a_n - c| < r$. Since $a_n \ne c$ by assumption, we have $0 < |x - c| < r$ with $x = a_n \in A$, so $c$ is a limit point of $A$. □
We now define limits of functions. Let m, p ∈ N and let f : D −→ R^m where
D ⊆ R^p. From elementary calculus, we learn that L = lim_{x→c} f(x) indicates that
f(x) is “as close as we want” to L for x ∈ D “sufficiently close”, but not equal,
to c. We now make the terms in quotes rigorous. As with limits of sequences, we
interpret “as close as we want” to mean that given any error ε > 0, for x ∈ D
“sufficiently close”, but not equal, to c we can approximate L by f(x) to within an
error of ε. In other words, for x ∈ D “sufficiently close”, but not equal, to c, we
have
|f(x) − L| < ε.
We interpret “sufficiently close” to mean that there is a real number δ > 0 such
that for all x ∈ D with |x − c| < δ and x ≠ c, the above inequality holds; since
x ≠ c we have |x − c| > 0, so we are in effect saying 0 < |x − c| < δ. In summary,
for all x ∈ D with 0 < |x − c| < δ we have |f(x) − L| < ε.
We now summarize our findings as a precise definition. A function f : D −→ R^m
is said to have a limit L at a limit point c of D if for each ε > 0 there is a δ > 0
such that
(4.2) x ∈ D and 0 < |x − c| < δ =⇒ |f(x) − L| < ε.
(Note that since c is a limit point of D there always exist points x ∈ D with
0 < |x − c| < δ, so this implication is not an empty implication.) If this holds, we
write
L = lim_{x→c} f or L = lim_{x→c} f(x),
or sometimes f → L or f(x) → L as x → c. Of course, we can use
another letter instead of “x” to denote the domain variable.
Figure 4.1. Here are three functions with D = [0, ∞) (we’ll denote
them by the generic letter f). In the first graph, L = f(c),
in the second graph f(c) ≠ L, and in the third graph, f(c) is not
even defined. However, in all three cases, lim_{x→c} f = L.

An alternative definition of limit involves open balls. A function f : D −→ R^m
has limit L at a limit point c of D if for each ε > 0 there is a δ > 0 such that
x ∈ D ∩ B_δ(c) and x ≠ c =⇒ f(x) ∈ B_ε(L).
For p = m = 1, this condition simplifies as follows: f : D −→ R has limit L at a
limit point c of D ⊆ R if for each ε > 0 there is a δ > 0 such that
x ∈ D ∩ (c − δ, c + δ) and x ≠ c =⇒ f(x) ∈ (L − ε, L + ε),
which is to say (see Figure 4.1 for an illustration of this limit concept),
x ∈ D with c − δ < x < c + δ and x ≠ c =⇒ L − ε < f(x) < L + ε.
We take the convention that if not explicitly mentioned, the domain D of a function
is always taken to be the set of all points for which the function makes sense.
4.1.2. Working with the ε-δ definition. Here are some examples to master.
Example 4.3. Let us prove that
\[ \lim_{x \to 2} \left( 3x^2 - 10 \right) = 2. \]
(Here, the domain D of 3x² − 10 is assumed to be all of R.) Let ε > 0 be given.
We need to prove that there is a real number δ > 0 such that
\[ 0 < |x - 2| < \delta \implies \left| 3x^2 - 10 - 2 \right| = \left| 3x^2 - 12 \right| < \varepsilon. \]
How do we find such a δ . . . well . . . we “massage” |3x² − 12|. Observe that
\[ \left| 3x^2 - 12 \right| = 3 \left| x^2 - 4 \right| = 3\, |x + 2| \cdot |x - 2|. \]
Let us tentatively restrict x so that |x − 2| < 1. In this case,
\[ |x + 2| = |x - 2 + 4| \le |x - 2| + 4 < 1 + 4 = 5. \]
Thus,
\[ |x - 2| < 1 \implies \left| 3x^2 - 12 \right| = 3\, |x + 2| \cdot |x - 2| < 15\, |x - 2|. \tag{4.3} \]
Now
\[ 15\, |x - 2| < \varepsilon \iff |x - 2| < \frac{\varepsilon}{15}. \tag{4.4} \]
For this reason, let us pick δ to be the minimum of 1 and ε/15. Then |x − 2| < δ
implies |x − 2| < 1 and |x − 2| < ε/15, therefore according to (4.3) and (4.4), we
have
\[ 0 < |x - 2| < \delta \implies \left| 3x^2 - 12 \right| \overset{\text{by (4.3)}}{<} 15\, |x - 2| \overset{\text{by (4.4)}}{<} \varepsilon. \]
Thus, by definition of limit, lim_{x→2} (3x² − 10) = 2.
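As a sanity check on the choice δ = min(1, ε/15) in Example 4.3, here is a small Python sketch. It samples points with 0 < |x − 2| < δ and tests the inequality |3x² − 12| < ε in floating point. The function names and the sampling grid are our own illustrative assumptions, and a finite sample of course proves nothing; the proof is the argument above.

```python
def delta_for(eps):
    # The choice made in Example 4.3: delta = min(1, eps/15).
    return min(1.0, eps / 15.0)

def check(eps, samples=10_000):
    """Test |(3x^2 - 10) - 2| < eps at sample points with 0 < |x - 2| < delta."""
    delta = delta_for(eps)
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)          # 0 < h < delta
        for x in (2.0 - h, 2.0 + h):
            if not abs((3 * x * x - 10) - 2) < eps:
                return False
    return True

for eps in (100.0, 15.0, 1.0, 1e-3):
    print(eps, delta_for(eps), check(eps))
```

Note that the cap δ ≤ 1 is what keeps |x + 2| < 5; without it, a large ε would allow x far from 2 and the bound (4.3) would no longer apply.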


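Example 4.4. Now let a > 0 be any real number and let us show that
\[ \lim_{x \to 0} \frac{\sqrt{x + a} - \sqrt{a}}{x} = \frac{1}{2\sqrt{a}}. \]
(Here, the domain D = [−a, 0) ∪ (0, ∞).) Let ε > 0 be any given positive real
number. We need to prove that there is a real number δ > 0 such that
\[ 0 < |x| < \delta \implies \left| \frac{\sqrt{x + a} - \sqrt{a}}{x} - \frac{1}{2\sqrt{a}} \right| < \varepsilon. \]
To establish this result we “massage” the absolute value with the “multiply by
conjugate trick”:
\[ \sqrt{x + a} - \sqrt{a} = \frac{\sqrt{x + a} - \sqrt{a}}{1} \cdot \frac{\sqrt{x + a} + \sqrt{a}}{\sqrt{x + a} + \sqrt{a}} = \frac{x}{\sqrt{x + a} + \sqrt{a}}. \tag{4.5} \]
Therefore,
\[ \left| \frac{\sqrt{x + a} - \sqrt{a}}{x} - \frac{1}{2\sqrt{a}} \right| = \left| \frac{1}{\sqrt{x + a} + \sqrt{a}} - \frac{1}{2\sqrt{a}} \right| = \left| \frac{\sqrt{x + a} - \sqrt{a}}{2\sqrt{a}\, (\sqrt{x + a} + \sqrt{a})} \right|. \]
Applying (4.5) to the far right numerator, we get
\[ \left| \frac{\sqrt{x + a} - \sqrt{a}}{x} - \frac{1}{2\sqrt{a}} \right| = \frac{|x|}{2\sqrt{a}\, (\sqrt{x + a} + \sqrt{a})^2}. \]
Observe that (√(x + a) + √a)² ≥ (0 + √a)² = a, so
\[ \frac{1}{2\sqrt{a}\, (\sqrt{x + a} + \sqrt{a})^2} \le \frac{1}{2\sqrt{a} \cdot a} = \frac{1}{2a^{3/2}} \implies \left| \frac{\sqrt{x + a} - \sqrt{a}}{x} - \frac{1}{2\sqrt{a}} \right| \le \frac{|x|}{2a^{3/2}}, \]
such a simple expression! Now
\[ \frac{|x|}{2a^{3/2}} < \varepsilon \iff |x| < 2a^{3/2} \varepsilon. \]
With this in mind, we choose δ = 2a^{3/2} ε and with this choice of δ, we obtain our
desired inequality:
\[ x \in D \text{ and } 0 < |x| < \delta \implies \left| \frac{\sqrt{x + a} - \sqrt{a}}{x} - \frac{1}{2\sqrt{a}} \right| < \varepsilon. \]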
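The choice δ = 2a^{3/2} ε from Example 4.4 can be probed numerically in the same spirit. The sketch below (function names and sample grid are our own assumptions) evaluates the difference quotient at sample points of D with 0 < |x| < δ and compares it with 1/(2√a); points with x < −a are skipped because they lie outside the domain.

```python
import math

def quotient(x, a):
    # The function from Example 4.4, defined for x in [-a, 0) U (0, infinity).
    return (math.sqrt(x + a) - math.sqrt(a)) / x

def check(a, eps, samples=2000):
    """Test |quotient(x) - 1/(2 sqrt(a))| < eps for sampled x in D with 0 < |x| < delta."""
    delta = 2 * a ** 1.5 * eps            # the delta chosen in the text
    limit = 1 / (2 * math.sqrt(a))
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)     # 0 < h < delta
        for x in (-h, h):
            if x < -a:                    # outside the domain D
                continue
            if not abs(quotient(x, a) - limit) < eps:
                return False
    return True

print(check(1.0, 0.1), check(4.0, 1e-3), check(0.25, 0.5))
```

For extremely small ε the subtraction √(x + a) − √a loses accuracy in floating point, so this kind of check is only meaningful for moderate tolerances.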
Example 4.5. Here is an example involving complex numbers. Let c be any
nonzero complex number and let us show that
\[ \lim_{z \to c} \frac{1}{z} = \frac{1}{c}. \]
Here, f : D −→ C is the function f(z) = 1/z with D ⊆ C consisting of all nonzero
complex numbers. (Recall that C = R², so D is a subset of R² and in terms of our
original definition (4.2), D ⊆ R^p and f : D −→ R^m with p = m = 2.) Let ε > 0
be any given positive real number. We need to prove that there is a real number
δ > 0 such that
\[ 0 < |z - c| < \delta \implies \left| \frac{1}{z} - \frac{1}{c} \right| < \varepsilon. \]
Now
\[ \left| \frac{1}{z} - \frac{1}{c} \right| = \left| \frac{c - z}{zc} \right| = \frac{1}{|zc|}\, |z - c|. \]
Figure 4.2. If |z − c| < |c|/2, then this picture shows that |z| > |c|/2.
In order to make this expression less than ε, we need to bound the term in front
of |z − c| (we need to make sure that |z| can’t get too small, otherwise 1/|zc| can
blow up). To do so, we tentatively restrict z so that |z − c| < |c|/2. In this case, as
seen in Figure 4.2, we also have |z| > |c|/2. Here is a proof if you like:
\[ |c| = |c - z + z| \le |c - z| + |z| < \frac{|c|}{2} + |z| \implies \frac{|c|}{2} < |z|. \]
Therefore, if |z − c| < |c|/2, then |zc| > (|c|/2) · |c| = |c|²/2 = b, where b = |c|²/2 is a
positive number. Thus,
\[ |z - c| < \frac{|c|}{2} \implies \left| \frac{1}{z} - \frac{1}{c} \right| < \frac{1}{b}\, |z - c|. \tag{4.6} \]
Now
\[ \frac{1}{b}\, |z - c| < \varepsilon \iff |z - c| < b\, \varepsilon. \tag{4.7} \]
For this reason, let us pick δ to be the minimum of |c|/2 and b ε. Then |z − c| < δ
implies |z − c| < |c|/2 and |z − c| < b ε, therefore according to (4.6) and (4.7), we
have
\[ 0 < |z - c| < \delta \implies \left| \frac{1}{z} - \frac{1}{c} \right| \overset{\text{by (4.6)}}{<} \frac{1}{b}\, |z - c| \overset{\text{by (4.7)}}{<} \varepsilon. \]
Thus, by definition of limit, lim_{z→c} 1/z = 1/c.
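Since Python has built-in complex numbers, the choice δ = min(|c|/2, b ε) with b = |c|²/2 from Example 4.5 is easy to test on a grid of points in the punctured disk 0 < |z − c| < δ. This is a hedged sketch: the names delta_for and check and the polar sampling grid are our own assumptions, not part of the text.

```python
import cmath
import math

def delta_for(c, eps):
    # delta = min(|c|/2, b*eps) with b = |c|^2 / 2, as in Example 4.5.
    b = abs(c) ** 2 / 2
    return min(abs(c) / 2, b * eps)

def check(c, eps, rings=50, angles=72):
    """Test |1/z - 1/c| < eps on a polar grid of points with 0 < |z - c| < delta."""
    delta = delta_for(c, eps)
    for i in range(1, rings + 1):
        r = delta * i / (rings + 1)                      # 0 < r < delta
        for j in range(angles):
            z = c + cmath.rect(r, 2 * math.pi * j / angles)
            if not abs(1 / z - 1 / c) < eps:
                return False
    return True

print(check(1 + 2j, 0.01), check(-0.3j, 1.0), check(complex(5, 0), 10.0))
```

The first term of the minimum keeps z away from 0 (so 1/z never blows up), while the second makes the bound (4.7) smaller than ε.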
Example 4.6. Here is one last example. Define f : R² \ {0} −→ R by
\[ f(x) = \frac{x_1^2\, x_2}{x_1^2 + x_2^2}, \qquad x = (x_1, x_2). \]
We shall prove that lim_{x→0} f = 0. (In the subscript “x → 0”, 0 denotes the zero
vector (0, 0) in R² while on the right of lim_{x→0} f = 0, 0 denotes the real number 0;
it should always be clear from context what “0” means.) Before our actual proof,
we first note that for any real numbers a, b, we have 0 ≤ (a − b)² = a² + b² − 2ab.
Solving for ab, we get
\[ a b \le \frac{1}{2} \left( a^2 + b^2 \right). \]
This inequality is well worth remembering. Applying it with a = |x_1| and b = |x_2| gives
\[ |f(x_1, x_2)| = \frac{|x_1|}{x_1^2 + x_2^2} \cdot |x_1 x_2| \le \frac{|x_1|}{x_1^2 + x_2^2} \cdot \frac{1}{2} \left( x_1^2 + x_2^2 \right) = \frac{|x_1|}{2}. \]
Given ε > 0, choose δ = ε. Then
\[ 0 < |x| < \delta \implies |x_1| < \delta \implies |f(x)| \le \frac{|x_1|}{2} \le \frac{\varepsilon}{2} < \varepsilon, \]
which implies that lim_{x→0} f = 0.
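The choice δ = ε in Example 4.6 can be illustrated by sampling the punctured disk 0 < |x| < δ in polar coordinates. The sketch below is only a numerical illustration under our own assumptions (grid sizes and function names are hypothetical).

```python
import math

def f(x1, x2):
    # The function of Example 4.6, defined away from the origin.
    return x1 * x1 * x2 / (x1 * x1 + x2 * x2)

def check(eps, rings=40, angles=90):
    """Test |f(x)| < eps on a polar grid of points with 0 < |x| < delta = eps."""
    delta = eps
    for i in range(1, rings + 1):
        r = delta * i / (rings + 1)                      # 0 < r < delta
        for j in range(angles):
            t = 2 * math.pi * j / angles
            x1, x2 = r * math.cos(t), r * math.sin(t)
            if not abs(f(x1, x2)) < eps:
                return False
    return True

print(check(1.0), check(1e-3), check(50.0))
```

In polar form f(r cos t, r sin t) = r cos²t sin t, which makes the bound |f(x)| ≤ |x| visible at a glance.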
4.1.3. The sequence definition of limit. It turns out that we can relate
limits of functions to limits of sequences, which were studied in Chapter 3, so we can
use much of the theory developed in that chapter to analyze limits of functions. In
particular, take note of the following important theorem!
Theorem 4.2 (Sequence criterion for limits). Let f : D −→ R^m and let c
be a limit point of D. Then L = lim_{x→c} f if and only if for every sequence {a_n} of
points in D \ {c} with c = lim a_n, we have L = lim_{n→∞} f(a_n).
Proof. Let f : D −→ R^m and let c be a limit point of D. We first prove that
if L = lim_{x→c} f, then for any sequence {a_n} of points in D \ {c} converging to c,
we have L = lim f(a_n). To do so, let {a_n} be such a sequence and let ε > 0. Since
f has limit L at c, there is a δ > 0 such that
x ∈ D and 0 < |x − c| < δ =⇒ |f(x) − L| < ε.
Since a_n → c and a_n ≠ c for every n, it follows that there is an N such that
n > N =⇒ 0 < |a_n − c| < δ.
The limit property of f now implies that
n > N =⇒ |f(a_n) − L| < ε.
Thus, L = lim f(a_n).
We now prove that if for every sequence {a_n} of points in D \ {c} converging
to c, we have L = lim f(a_n), then L = lim_{x→c} f. We prove the logically equivalent
contrapositive; that is, if L ≠ lim_{x→c} f, then there is a sequence {a_n} of points in
D \ {c} converging to c such that L ≠ lim f(a_n). Now L ≠ lim_{x→c} f means (negating
the definition L = lim_{x→c} f) that there is an ε > 0 such that for all δ > 0, there is
an x ∈ D with 0 < |x − c| < δ and |f(x) − L| ≥ ε. Since this statement is true for
all δ > 0, it is in particular true for δ = 1/n for each n ∈ N. Thus, for each n ∈ N,
there is a point a_n ∈ D with 0 < |a_n − c| < 1/n and |f(a_n) − L| ≥ ε. It follows
that {a_n} is a sequence of points in D \ {c} converging to c and {f(a_n)} does not
converge to L. This completes the proof of the contrapositive. □

This theorem can be used to prove that certain functions don’t have limits.
Example 4.7. Recall from Section 1.3 the Dirichlet function, named after
Johann Peter Gustav Lejeune Dirichlet (1805–1859):
\[ D : \mathbb{R} \longrightarrow \mathbb{R} \ \text{ is defined by } \ D(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]
Let c ∈ R. Then as we saw in Example 3.13 of Section 3.2, there is a sequence {a_n}
of rational numbers converging to c with a_n ≠ c for all n. Since a_n is rational we
have D(a_n) = 1 for all n ∈ N, so
lim D(a_n) = lim 1 = 1.
Also, there is a sequence {b_n} of irrational numbers converging to c with b_n ≠ c for
all n, in which case
lim D(b_n) = lim 0 = 0.
Therefore, according to our sequence criterion, lim_{x→c} D cannot exist.
Example 4.8. Consider the function f : R² \ {0} −→ R defined by
\[ f(x) = \frac{x_1 x_2}{x_1^2 + x_2^2}, \qquad x = (x_1, x_2) \ne 0. \]
We claim that lim_{x→0} f does not exist. To see this, observe that f(x_1, 0) = 0, so
f(a_n) → 0 for any sequence {a_n} approaching 0 along the x_1-axis. On the other hand,
since
\[ f(x_1, x_1) = \frac{x_1^2}{x_1^2 + x_1^2} = \frac{1}{2}, \]
it follows that f(a_n) → 1/2 for any sequence {a_n} approaching 0 along the diagonal
x_1 = x_2. Therefore lim_{x→0} f does not exist.
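The two approach paths in Example 4.8 are easy to see concretely. In the following Python sketch (the particular sequences (1/n, 0) and (1/n, 1/n) are our own illustrative choices) the values of f along the x_1-axis and along the diagonal are computed for a few n; by the sequence criterion, two sequences tending to 0 with different limiting values of f rule out a limit.

```python
def f(x1, x2):
    # The function of Example 4.8, defined away from the origin.
    return x1 * x2 / (x1 * x1 + x2 * x2)

# Points a_n = (1/n, 0) on the x1-axis and b_n = (1/n, 1/n) on the diagonal.
axis_values = [f(1 / n, 0.0) for n in range(1, 6)]
diag_values = [f(1 / n, 1 / n) for n in range(1, 6)]

print(axis_values)
print(diag_values)
```

Along the axis every value is 0 and along the diagonal every value is 1/2, matching the computation in the example.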
Exercises 4.1.
1. Using the ε-δ definition of limit, prove that (where z is a complex variable)
\[ \text{(a)}\ \lim_{z \to 1} \left( z^2 + 2z \right) = 3, \quad \text{(b)}\ \lim_{z \to 2} z^3 = 8, \quad \text{(c)}\ \lim_{z \to 2} \frac{1}{z^2} = \frac{1}{4}, \quad \text{(d)}\ \lim_{z \to 2} \frac{3z}{z + 1} = 2. \]
Suggestion: For (b), can you factor z³ − 8?
2. Using the ε-δ definition of limit, prove that (where x, a are real variables and where in
(b) and (c), a > 0)
\[ \text{(a)}\ \lim_{x \to a} \frac{x^2 - x - a^2}{x + a} = -\frac{1}{2}, \quad \text{(b)}\ \lim_{x \to a} \frac{1}{\sqrt{x}} = \frac{1}{\sqrt{a}}, \quad \text{(c)}\ \lim_{x \to 0} \frac{\sqrt{a^2 + 6x^2} - a}{x^2} = \frac{3}{a}. \]
3. Prove that the limits (a) and (b) do not exist while (c) does exist:
\[ \text{(a)}\ \lim_{x \to 0} \frac{x_1^2 + x_2}{\sqrt{x_1^2 + x_2^2}}, \quad \text{(b)}\ \lim_{x \to 0} \frac{x_1^2 + x_2}{x_1^2 + x_2^2}, \quad \text{(c)}\ \lim_{x \to 0} \frac{x_1^2\, x_2^2}{x_1^2 + x_2^2}. \]
4. Here are problems involving functions similar to Dirichlet’s function. Define
\[ f(x) = \begin{cases} x & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}, \qquad g(x) = \begin{cases} x & \text{if } x \text{ is rational} \\ 1 - x & \text{if } x \text{ is irrational.} \end{cases} \]
(a) Prove that lim_{x→0} f = 0, but lim_{x→c} f does not exist for c ≠ 0.
(b) Prove that lim_{x→1/2} g = 1/2, but lim_{x→c} g does not exist for c ≠ 1/2.
5. Let f : D −→ R with D ⊆ R^p and let L = lim_{x→c} f. Assume that f(x) ≥ 0 for all
x ≠ c sufficiently close to c. Prove that L ≥ 0 and √L = lim_{x→c} √(f(x)).

4.2. A potpourri of limit properties for functions


Now that we have a working knowledge of the ε-δ definition of limit for functions,
we move on to studying the properties of limits that will be used throughout
the rest of the book.
4.2.1. Limit theorems. As we already mentioned in Section 4.1.3, combining
the sequence criterion for limits (Theorem 4.2) with the limit theorems in Chapter
3, we can easily prove results concerning limits. Here are some examples beginning
with the following companion to the uniqueness theorem (Theorem 3.2) for
sequences.
Theorem 4.3 (Uniqueness of limits). A function can have at most one limit
at any given limit point of its domain.
Proof. If lim_{x→c} f equals both L and L′, then according to the sequence
criterion, for all sequences {a_n} of points in D \ {c} converging to c, we have
lim f(a_n) = L and lim f(a_n) = L′. Since we know that limits of sequences are
unique (Theorem 3.2), we conclude that L = L′. □
If f : D −→ R^m, then we can write f in terms of its components as
f = (f_1, . . . , f_m),
where for k = 1, 2, . . . , m, f_k : D −→ R are the component functions of f. In
particular, if f : D −→ C, then we can always break up f as
f = (f_1, f_2) ⇐⇒ f = f_1 + i f_2,
where f_1, f_2 : D −→ R.
Example 4.9. For instance, if f : C −→ C is defined by f(z) = z², then we
can write this as f(x + iy) = (x + iy)² = x² − y² + i 2xy. Therefore, f = f_1 + i f_2
where if z = x + iy, then
f_1(z) = x² − y², f_2(z) = 2xy.
The following theorem is a companion to the component theorem (Theorem
3.4) for sequences.
Theorem 4.4 (Component theorem). A function converges to L ∈ R^m (at
a given limit point of the domain) if and only if each component function converges
in R to the corresponding component of L.
Proof. Let f : D −→ R^m. Then lim_{x→c} f = L if and only if for every sequence
{a_n} of points in D \ {c} converging to c, we have lim f(a_n) = L. According
to the component theorem for sequences, lim f(a_n) = L if and only if for each
k = 1, 2, . . . , m, we have lim_{n→∞} f_k(a_n) = L_k. This shows that lim_{x→c} f = L if
and only if for each k = 1, 2, . . . , m, lim_{x→c} f_k = L_k and completes our proof. □
The following theorem is a function analog of the “algebra of limits” studied
in Section 3.2.
Theorem 4.5 (Algebra of limits). If f and g both have limits as x → c, then
(1) lim_{x→c} |f| = |lim_{x→c} f|.
(2) lim_{x→c} (af + bg) = a lim_{x→c} f + b lim_{x→c} g, for any real a, b.
If f and g take values in C, then
(3) lim_{x→c} fg = (lim_{x→c} f)(lim_{x→c} g).
(4) lim_{x→c} f/g = lim_{x→c} f / lim_{x→c} g, provided the denominators are not zero for
x near c.
Proof. All these properties follow from the corresponding statements for sequences
in Theorems 3.10 and 3.11. For example, let us prove (4) and leave the rest
to the reader. If L = lim_{x→c} f and L′ = lim_{x→c} g, then by the sequence criterion
(Theorem 4.2) it suffices to show that for any sequence {a_n} of points in D \ {c}
converging to c, we have lim f(a_n)/g(a_n) = L/L′. However, given any such sequence,
by the sequence criterion we know that lim f(a_n) = L and lim g(a_n) = L′, and by
Theorem 3.11, we thus have lim f(a_n)/g(a_n) = (lim f(a_n))/(lim g(a_n)) = L/L′. □
By induction, we can use the algebra of limits on any finite sum or finite product
of functions.
Example 4.10. It is easy to show that at any point c ∈ C, lim_{z→c} z = c.
Therefore, by our algebra of limits, for any complex number a and natural number
n, we have
\[ \lim_{z \to c} a z^n = a \lim_{z \to c} \underbrace{z \cdot z \cdots z}_{n\ z\text{'s}} = a \left( \lim_{z \to c} z \right) \cdot \left( \lim_{z \to c} z \right) \cdots \left( \lim_{z \to c} z \right) = a\, c \cdot c \cdots c = a\, c^n. \]
Therefore, given any polynomial
\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0, \]
for any c ∈ C, the algebra of limits implies that
\[ \lim_{z \to c} p(z) = \lim_{z \to c} a_n z^n + \lim_{z \to c} a_{n-1} z^{n-1} + \cdots + \lim_{z \to c} a_1 z + \lim_{z \to c} a_0 = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0; \]
that is,
\[ \lim_{z \to c} p(z) = p(c). \]
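To see the conclusion lim_{z→c} p(z) = p(c) of Example 4.10 in action, the sketch below evaluates a polynomial by Horner's rule at points z = c + 10^{−k} approaching c and records the gaps |p(z) − p(c)|. The polynomial p(z) = z³ − 2z + 1, the point c = 1 + i and the approach sequence are arbitrary choices of ours for illustration only.

```python
def horner(coeffs, z):
    """Evaluate a_n z^n + ... + a_1 z + a_0, coefficients listed from a_n down to a_0."""
    value = 0
    for a in coeffs:
        value = value * z + a
    return value

coeffs = [1, 0, -2, 1]          # p(z) = z^3 - 2z + 1 (illustrative choice)
c = 1 + 1j

# Gaps |p(c + h) - p(c)| for h = 10^-1, ..., 10^-6.
gaps = [abs(horner(coeffs, c + 10.0 ** -k) - horner(coeffs, c)) for k in range(1, 7)]

print(horner(coeffs, c))
print(gaps)
```

The gaps shrink roughly in proportion to h, which is consistent with (but of course does not prove) the limit statement.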

Example 4.11. Now let q(z) be another polynomial and suppose that q(c) ≠ 0.
Since q(z) has at most finitely many roots (Proposition 2.53), it follows that q(z) ≠ 0
for z sufficiently close to c. Therefore by our algebra of limits, we have
\[ \lim_{z \to c} \frac{p(z)}{q(z)} = \frac{\lim_{z \to c} p(z)}{\lim_{z \to c} q(z)} = \frac{p(c)}{q(c)}. \]
The following theorem is useful when dealing with compositions of functions.
Theorem 4.6 (Composition of limits). Let f : D −→ R^m and g : C −→ R^p
where D ⊆ R^p and C ⊆ R^q and suppose that g(C) ⊆ D so that f ◦ g : C −→ R^m is
defined. Let d be a limit point of D and c a limit point of C and assume that
(1) d = lim_{x→c} g(x).
(2) L = lim_{y→d} f(y).
(3) Either f(d) = L or d ≠ g(x) for all x ≠ c sufficiently near c.
Then
L = lim_{x→c} f ◦ g.
Proof. Let {a_n} be any sequence in C \ {c} converging to c. Then by (1), the
sequence {g(a_n)} in D converges to d.
We now consider the two cases in (3). First, if g(x) ≠ d for all x ≠ c sufficiently
near c, then a tail of the sequence {g(a_n)} is a sequence in D \ {d} converging to
d, so by (2), lim f(g(a_n)) = L. On the other hand, if it is the case that f(d) = L
then by (2) and the definition of limit it follows that for any sequence {b_n} in
D converging to d, we have lim f(b_n) = L. Therefore, in this case we also have
lim f(g(a_n)) = L. In either case, we get L = lim_{x→c} f ◦ g. □
We now finish our limit theorems by considering limits and inequalities. In the
following two theorems, all functions map a subset D ⊆ Rp into R.
Theorem 4.7 (Squeeze theorem). Let f, g, and h be such that f(x) ≤
g(x) ≤ h(x) for all x sufficiently close to a limit point c in D and such that both
limits lim_{x→c} f and lim_{x→c} h exist and are equal. Then the limit lim_{x→c} g also
exists, and
lim_{x→c} f = lim_{x→c} g = lim_{x→c} h.
As with the previous two theorems, the squeeze theorem for functions is a direct
consequence of the sequence criterion and the corresponding squeeze theorem for
sequences (Theorem 3.7) and therefore we shall omit the proof. The next theorem
follows (as you might have guessed) from the sequence criterion and the corresponding
preservation of inequalities theorem (Theorem 3.8) for sequences.
Theorem 4.8 (Preservation of inequalities). Suppose that lim_{x→c} f exists.
(1) If lim_{x→c} g exists and f(x) ≤ g(x) for x ≠ c sufficiently close to c, then
lim_{x→c} f ≤ lim_{x→c} g.
(2) If for some real numbers a and b, we have a ≤ f(x) ≤ b for x ≠ c sufficiently
close to c, then a ≤ lim_{x→c} f ≤ b.
4.2.2. Limits, limits, limits, and more limits. When the domain is a
subset of R, there are various extensions of the limit idea. We begin with left and
right-hand limits. For the rest of this section we consider functions f : D −→ R^m
where D ⊆ R (later we’ll further restrict to m = 1).
Suppose that c is a limit point of the set D ∩ (−∞, c). Then f : D −→ R^m is
said to have a left-hand limit L at c if for each ε > 0 there is a δ > 0 such that
(4.8) x ∈ D and c − δ < x < c =⇒ |f(x) − L| < ε.
In a similar way we define a right-hand limit: Suppose that c is a limit point of the
set D ∩ (c, ∞). Then f is said to have a right-hand limit L at c if for each ε > 0
there is a δ > 0 such that
(4.9) x ∈ D and c < x < c + δ =⇒ |f(x) − L| < ε.
We express left-hand limits in one of several ways:
L = lim_{x→c−} f, L = lim_{x→c−} f(x), L = f(c−), or f(x) → L as x → c−;
with similar expressions with c+ replacing c− for right-hand limits. An important
fact relating one-sided limits and regular limits is described in the next result, whose
proof we leave to you.
Theorem 4.9. Let f : D −→ R^m with D ⊆ R and suppose that c is a limit
point of the sets D ∩ (−∞, c) and D ∩ (c, ∞). Then
L = lim_{x→c} f ⇐⇒ L = f(c−) and L = f(c+).
If only one of f(c−) or f(c+) makes sense, then L = lim_{x→c} f if and only if
L = f(c−) (when c is only a limit point of D ∩ (−∞, c)) or L = f(c+) (when c is
only a limit point of D ∩ (c, ∞)), whichever makes sense.
We now describe limits at infinity. Suppose that for any real number N there
is a point x ∈ D such that x > N. A function f : D −→ R^m is said to have a limit
L ∈ R^m as x → ∞ if for each ε > 0 there is an N ∈ R such that
(4.10) x ∈ D and x > N =⇒ |f(x) − L| < ε.
Now suppose that for any real number N there is a point x ∈ D such that x < N.
A function f : D −→ R^m is said to have a limit L ∈ R^m as x → −∞ if for each
ε > 0 there is an N ∈ R such that
(4.11) x ∈ D and x < N =⇒ |f(x) − L| < ε.
To express these limits at infinity, we use the notations (sometimes with ∞ replaced
by +∞)
L = lim_{x→∞} f, L = lim_{x→∞} f(x), f → L as x → ∞, or f(x) → L as x → ∞;
with similar expressions when x → −∞.



Finally, we discuss infinite limits, which are also called properly divergent limits
of functions.² We now let m = 1 and consider functions f : D −→ R with D ⊆ R.
Suppose that for any real number N there is a point x ∈ D such that x > N. Then
f is said to diverge to ∞ as x → ∞ if for any real number M > 0 there is an N ∈ R
such that
(4.12) x ∈ D and x > N =⇒ M < f(x).
Also, f is said to diverge to −∞ as x → ∞ if for any real number M < 0 there is
an N ∈ R such that
(4.13) x ∈ D and x > N =⇒ f(x) < M.
In either case we say that f is properly divergent as x → ∞ and when f is
properly divergent to ∞ we write
∞ = lim_{x→∞} f, ∞ = lim_{x→∞} f(x), f → ∞ as x → ∞, or f(x) → ∞ as x → ∞;
with similar expressions when f properly diverges to −∞. In a very similar manner
we can define properly divergent limits of functions as x → −∞, as x → c, as
x → c−, and as x → c+; we leave these other definitions for the reader to formulate.
Let us now consider an example.
Example 4.12. Let a > 1 and let f : Q −→ R be defined by f(x) = a^x
(therefore in this case, D = Q). Here, we recall that a^x is defined for any rational
number x (see Section 2.7). We shall prove that
lim_{x→∞} f = ∞ and lim_{x→−∞} f = 0.
In Section 4.6 we shall define a^x for any x ∈ R (in fact, for any complex power)
and we shall establish these same limits with D = R. Before proving these results,
we claim that
(4.14) for any rational p < q, we have a^p < a^q.
Indeed, 1 < a and q − p > 0, so by our power rules,
1 = 1^{q−p} < a^{q−p},
which, after multiplication by a^p, gives our claim. We now prove that f → ∞ as
x → ∞. To prove this, we note that since a > 1, we can write a = 1 + b for some
b > 0, so by Bernoulli’s inequality, for any n ∈ N,
(4.15) a^n = (1 + b)^n ≥ 1 + n b > n b.
Now fix M > 0. By the Archimedean principle, we can choose N ∈ N such that
N b > M, therefore by (4.15) and (4.14),
x ∈ Q and x > N =⇒ M < N b < a^N < a^x.
This proves that f → ∞ as x → ∞. We now show that f → 0 as x → −∞. Let
ε > 0. Then by the Archimedean principle there is an N ∈ N such that
\[ \frac{1}{b\, \varepsilon} < N \implies \frac{1}{N b} < \varepsilon. \]
By (4.15) and (4.14) it follows that
\[ x \in \mathbb{Q} \text{ and } x < -N \implies 0 < a^x < a^{-N} = \frac{1}{a^N} < \frac{1}{N b} < \varepsilon. \]
This proves that f → 0 as x → −∞.
² I protest against the use of infinite magnitude as something completed, which in mathematics
is never permissible. Infinity is merely a façon de parler, the real meaning being a limit
which certain ratios approach indefinitely near, while others are permitted to increase without
restriction. Carl Friedrich Gauss (1777–1855).
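The two choices of N in Example 4.12 can be made explicit. The Python sketch below works with exact rationals; the helper name threshold and the sample values a = 3/2, M = 1000, ε = 1/1000 are our own assumptions. It picks N with N b > M and then checks the chain M < N b < a^N from (4.15), and likewise the chain a^{−N} < 1/(N b) < ε. Only integer exponents are evaluated; the step from a^N to a^x for rational x > N is the monotonicity claim (4.14) and is not computed here.

```python
import math
from fractions import Fraction

def threshold(a, M):
    """Return an N in N with N*b > M, where a = 1 + b, a > 1 and M > 0."""
    b = a - 1
    return math.floor(M / b) + 1       # N > M/b, hence N*b > M

a = Fraction(3, 2)                     # illustrative base a > 1
b = a - 1

M = 1000
N = threshold(a, M)
print(N, M < N * b < a ** N)           # the chain used for f -> infinity

eps = Fraction(1, 1000)
N = threshold(a, 1 / eps)              # N > 1/(b*eps)
print(N, 0 < a ** (-N) < 1 / (N * b) < eps)   # the chain used for f -> 0
```

The N produced this way is far from optimal (a^N is astronomically larger than M), which is typical of Bernoulli-type bounds: they are crude but sufficient.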
Many of the limit theorems in Section 4.1.3 that we have worked out for “regular
limits” also hold for left and right-hand limits, limits at infinity, and infinite limits.
To avoid repeating these limit theorems in each of our new contexts (which will
take up a few pages at the least!), we shall make the following general comment:
The sequence criterion, uniqueness of limits, component theorem,
algebra of limits, composition of limits, squeeze theorem, and preser-
vation of inequalities have analogous statements for left/right-hand
limits, limits at infinity, and infinite (properly divergent) limits.
Of course, some statements don’t hold when we consider infinite limits, for example,
we cannot subtract infinities or divide them, nor can we multiply zero and infinity.
We encourage the reader to think about these analogous statements and we shall
make use of these extended versions without much comment in the sequel.
Example 4.13. For an example of this general comment, suppose that L =
lim_{y→∞} f(y). We leave the reader to show that lim_{x→0+} 1/x = ∞. Therefore,
according to our extended composition of limits theorem, we have
\[ L = \lim_{x \to 0+} f\!\left( \frac{1}{x} \right). \]
Similarly, since lim_{x→−∞} −x = ∞, again by our extended composition of limits
theorem, we have
L = lim_{x→−∞} f(−x).
More generally, if g is any function with lim_{x→c} g(x) = ∞ where c is either a real
number, ∞, or −∞, then by our composition of limits theorem,
L = lim_{x→c} f ◦ g.
Example 4.14. One last example. Suppose as before that lim_{x→c} g(x) = ∞
where c is either a real number, ∞, or −∞. Since lim_{y→∞} 1/y = 0, by our extended
composition of limits theorem, we have
\[ \lim_{x \to c} \frac{1}{g(x)} = 0. \]

Exercises 4.2.
1. Using the ε-δ definition of (left/right-hand) limit, prove (a) and (b):
\[ \text{(a)}\ \lim_{x \to 0-} \frac{x}{|x|} = -1, \quad \text{(b)}\ \lim_{x \to 0+} \frac{x}{|x|} = 1. \quad \text{Conclude that } \lim_{x \to 0} \frac{x}{|x|} \text{ does not exist.} \]
2. Using the ε-N definition of limits at infinity, prove that
\[ \text{(a)}\ \lim_{x \to \infty} \frac{x^2 + x + 1}{2x^2 - 1} = \frac{1}{2}, \quad \text{(b)}\ \lim_{x \to \infty} \left( \sqrt{x^2 + 1} - x \right) = 0, \quad \text{(c)}\ \lim_{x \to \infty} \left( \sqrt{x^2 + x} - x \right) = \frac{1}{2}. \]
3. Let p(x) = a_n x^n + · · · + a_1 x + a_0 and q(x) = b_m x^m + · · · + b_1 x + b_0 be polynomials
with real coefficients and with a_n ≠ 0 and b_m ≠ 0.
(a) Prove that for any natural number k, lim_{x→∞} 1/x^k = 0.
(b) If n < m, prove that lim_{x→∞} p(x)/q(x) = 0.
(c) If n = m, prove that lim_{x→∞} p(x)/q(x) = a_n/b_n.
(d) If n > m, prove that if a_n > 0, then lim_{x→∞} p(x)/q(x) = ∞, and on the other
hand if a_n < 0, then lim_{x→∞} p(x)/q(x) = −∞.
4. Let f, g : D −→ R with D ⊆ R, lim_{x→∞} f = ∞, and g(x) ≠ 0 for all x ∈ D. Suppose
that for some real number L, we have
\[ \lim_{x \to \infty} \frac{f}{g} = L. \]
(a) If L > 0, prove that lim_{x→∞} g = +∞.
(b) If L < 0, prove that lim_{x→∞} g = −∞.
(c) If L = 0, can you make any conclusions about lim_{x→∞} g?

Figure 4.3. Visualization of continuity.

4.3. Continuity, Thomae’s function, and Volterra’s theorem


In this section we study the most important functions in all of analysis and
topology, continuous functions. We begin by defining what they are and then give
examples. Perhaps one of the most fascinating functions you’ll ever run across is
the modified Dirichlet function or Thomae’s function, which has the perplexing and
pathological property that it is continuous on the irrational numbers and discon-
tinuous on the rational numbers! We’ll see that there is no function opposite to
Thomae’s, that is, continuous on the rationals and discontinuous on the irrationals;
this was proved by Vito Volterra (1860–1940) in 1881. For an interesting account
of Thomae’s function and its relation to Volterra’s theorem, see [60].
4.3.1. Continuous functions. We begin by defining continuity at a point.
Let D ⊆ R^p. A function f : D −→ R^m is continuous at a point c ∈ D if for each
ε > 0 there is a δ > 0 such that
(4.16) x ∈ D and |x − c| < δ =⇒ |f(x) − f(c)| < ε.
See Figure 4.3 for a picture of what continuity means. The best way to think of
a continuous function is that f takes points which are “close” (x and c, which are
within δ) to points which are “close” (f(x) and f(c), which are within ε). We can
relate this definition to the definition of limit. Suppose that c ∈ D is a limit point
of D. Then comparing (4.16) with the definition of limit, we see that for a limit
point c of D such that c ∈ D, we have
f is continuous at c ⇐⇒ f(c) = lim_{x→c} f.
Technically speaking, when we compare (4.16) to the definition of limit, for a limit
we actually require that 0 < |x − c| < δ, but in the case that |x − c| = 0, that is,
x = c, we have |f(x) − f(c)| = |f(c) − f(c)| = 0, which is automatically less than ε,
so the condition that 0 < |x − c| can be dropped. What if c ∈ D is not a limit point
of D? In this case c is called an isolated point in D and by definition of (not
being a) limit point there is an open ball B_δ(c) such that B_δ(c) ∩ D = {c}; that is,
the only point of D inside B_δ(c) is c itself. Hence, with this δ, for any ε > 0, the
condition (4.16) is automatically satisfied:
x ∈ D and |x − c| < δ =⇒ x = c =⇒ |f(x) − f(c)| = 0 < ε.
Therefore, at isolated points of D, the function f is automatically continuous, and
therefore “boring”. For this reason, if we want to prove theorems concerning the
continuity of f : D −→ Rm at a point c ∈ D, we can always assume that c is a limit
point of D; in this case, we have all the limit theorems from the last section at our
disposal. This is exactly why we spent so much time on learning limits during the
last two sections!
If f is continuous at every point in a subset A ⊆ D, we say that f is contin-
uous on A; in particular, if f is continuous at every point of D, we say that f is
continuous, or continuous on D, to emphasize D:
f is continuous ⇐⇒ for all c ∈ D, f is continuous at c.

Example 4.15. Dirichlet’s function is discontinuous at every point in R since
in Example 4.7 we already proved that lim_{x→c} D(x) does not exist at any c ∈ R.
Example 4.16. Define f : R² −→ R by
\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1^2\, x_2}{x_1^2 + x_2^2} & (x_1, x_2) \ne 0 \\[2mm] 0 & (x_1, x_2) = 0. \end{cases} \]
From Example 4.6 we know that lim_{x→0} f = 0, so f is continuous at 0.
Example 4.17. If we define f : R² −→ R by
\[ f(x_1, x_2) = \begin{cases} \dfrac{x_1 x_2}{x_1^2 + x_2^2} & (x_1, x_2) \ne 0 \\[2mm] 0 & (x_1, x_2) = 0, \end{cases} \]
then we already proved that lim_{x→0} f does not exist in Example 4.8. In particular,
f is not continuous at 0.
Example 4.18. From Example 4.10, any polynomial function p : C → C is
continuous (that is, continuous at every point c ∈ C). From Example 4.11, any
rational function p(z)/q(z), where p and q are polynomials, is continuous at any
point c ∈ C such that q(c) ≠ 0.
4.3.2. Continuity theorems. We now state some theorems on continuity.
These theorems follow almost without any work from our limit theorems in the
previous section, so we shall omit all the proofs. First we note that the sequence
criterion for limits of functions (Theorem 4.2) implies the following theorem, which
is worth highlighting!
Theorem 4.10 (Sequence criterion for continuity). A function f : D −→
R^m is continuous at c ∈ D if and only if for every sequence {a_n} in D with c =
lim a_n, we have f(c) = lim f(a_n).
We can write the last equality as f(lim a_n) = lim f(a_n) since c = lim a_n. Thus,
\[ f\!\left( \lim_{n \to \infty} a_n \right) = \lim_{n \to \infty} f(a_n); \]
in other words, limits can be “pulled-out” of continuous functions.
Example 4.19. Question: Suppose that f, g : R → R^m are continuous and
f(r) = g(r) for all rational numbers r; must f(x) = g(x) for all irrational numbers
x? The answer is yes, for let c be an irrational number. Then (see Example 3.13
in Section 3.2) there is a sequence of rational numbers {r_n} converging to c. Since
f and g are both continuous and f(r_n) = g(r_n) for all n, we have
f(c) = lim f(r_n) = lim g(r_n) = g(c).
Note that the answer is no if either f or g is not continuous. For example, with
D denoting Dirichlet’s function, D(r) = 1 for all rational numbers r, but D(x) ≠ 1
for all irrational numbers x. See Problem 2 for a related problem.

The component theorem (Theorem 4.4) implies that a function f = (f_1, . . . , f_m)
is continuous at c if and only if every component f_k is continuous at c.
Theorem 4.11 (Component criterion for continuity). A function is continuous
at c if and only if all of its component functions are continuous at c.
Next, the composition of limits theorem (Theorem 4.6) implies the following
theorem.
Theorem 4.12. Let f : D −→ R^m and g : C −→ R^p where D ⊆ R^p and
C ⊆ R^q and suppose that g(C) ⊆ D so that f ◦ g : C −→ R^m is defined. If g is
continuous at c and f is continuous at g(c), then the composite function f ◦ g is
continuous at c.
In simple language: The composition of continuous functions is continuous.
Finally, our algebra of limits theorem (Theorem 4.5) implies the following.
Theorem 4.13. If f, g : D −→ R^m are both continuous at c, then
(1) |f| and af + bg are continuous at c, for any real a, b.
If f and g take values in C, then
(2) fg and (provided g(c) ≠ 0) f/g are continuous at c.
In simple language: Linear combinations of R^m-valued continuous functions are
continuous. Products, norms, and quotients of real or complex-valued continuous
functions are continuous (provided that the denominator functions are not zero).
Finally, the left and right-hand limit theorem (Theorem 4.9) implies
Theorem 4.14. Let f : D −→ R^m with D ⊆ R and let c ∈ D be a limit point
of the sets D ∩ (−∞, c) and D ∩ (c, ∞). Then f is continuous at c if and only if
f(c) = f(c+) = f(c−).
If only one of f(c−) or f(c+) makes sense, then f is continuous at c if and only
if f(c) = f(c−) or f(c) = f(c+), whichever makes sense.

Figure 4.4. The left-hand side shows plots of T (p/q) for q at most
3 and the right shows plots of T (p/q) for q at most 7.

4.3.3. Thomae’s function and Volterra’s theorem. We now define a fascinating
function, sometimes called Thomae’s function [17, p. 123] after Johannes
Karl Thomae (1840–1921) who published it in 1875, or the (modified) Dirichlet
function [238], which has the perplexing property that it is continuous at every
irrational number and discontinuous at every rational number. (See Problem 7 in
Exercises 4.5 for a generalization.)
We define Thomae’s function, aka (also known as) the modified Dirichlet
function, T : R −→ R by
\[
T(x) =
\begin{cases}
1/q & \text{if } x \in \mathbb{Q} \text{ and } x = p/q \text{ in lowest terms and } q > 0, \\
0 & \text{if } x \text{ is irrational.}
\end{cases}
\]
Here, we interpret 0 as 0/1 in lowest terms, so T (0) = 1/1 = 1. See Figure
4.4 for a graph of this “pathological function.” To see that T is discontinuous on
rational numbers, let c ∈ Q and let {an } be a sequence of irrational numbers
converging to c, e.g. an = c + √2/n works (or see Example 3.13 in Section 3.2).
Then lim T (an ) = lim 0 = 0, while T (c) > 0, hence T is discontinuous at c.
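
As a small illustration of the definition, here is a minimal Python sketch of T restricted to rational inputs (irrational numbers cannot be represented exactly, so the branch T(x) = 0 is left out). The names thomae and large_values are hypothetical helpers introduced only for this sketch. It relies on the fact that Python's Fraction type always stores a rational in lowest terms with a positive denominator, and it also lists the rationals x in (0, m] with T(x) ≥ ε, of which there are only finitely many.

```python
from fractions import Fraction
from math import floor

def thomae(x: Fraction) -> Fraction:
    """T(p/q) = 1/q for a rational p/q in lowest terms with q > 0."""
    # Fraction normalizes to lowest terms with a positive denominator,
    # and represents 0 as 0/1, so thomae(0) is 1.
    return Fraction(1, x.denominator)

def large_values(eps: Fraction, m: int) -> list:
    """All rationals x in (0, m] with T(x) >= eps (a finite list)."""
    n = floor(1 / eps)                      # T(p/q) >= eps exactly when q <= n
    found = set()
    for q in range(1, n + 1):
        for p in range(1, m * q + 1):       # 0 < p <= m*q keeps 0 < p/q <= m
            found.add(Fraction(p, q))       # duplicates collapse in the set
    return sorted(found)

c = Fraction(1, 2)
# Even along the rationals c + 1/k (k odd) the values of T shrink toward 0,
# while T(c) stays equal to 1/2.
values_near_c = [thomae(c + Fraction(1, k)) for k in (3, 11, 101)]
exceptions = large_values(Fraction(3, 10), 2)
```

With ε = 3/10 and m = 2 the list exceptions contains exactly the rationals in (0, 2] with denominator at most 3.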
To see that T is continuous at each irrational number, let c be an irrational number
and let ε > 0. Consider the case when c > 0 (the case when c < 0 is analogous)
and choose m ∈ N with c < m. Let 0 < x ≤ m and let’s consider the inequality
(4.17) |T (x) − T (c)| = |T (x) − 0| = T (x) < ε.
If x is irrational, then T (x) = 0 < ε holds. If x = p/q is in lowest terms and
q > 0, then T (x) = 1/q < ε holds if and only if q > 1/ε. Set n := ⌊1/ε⌋, the
greatest integer ≤ 1/ε (see Section 2.7.3 if you need a reminder of the greatest
integer function); then T (x) = 1/q < ε holds if and only if q > n. Summarizing:
For 0 < x ≤ m, the inequality (4.17) always holds unless x = p/q is rational in
lowest terms with 0 < q ≤ n; that is, when x is in the set
(4.18) {r ∈ Q ; r = p/q is in lowest terms, 0 < p ≤ mn, 0 < q ≤ n}.
The requirement 0 < p ≤ mn is needed so that 0 < p/q ≤ m. Now, there are only
a finite number of rationals in the set (4.18). In particular, we can choose δ > 0
such that the interval (c − δ, c + δ) is contained in (0, m) and contains none of the
rational numbers in (4.18). Therefore,
x ∈ (c − δ, c + δ) =⇒ |T (x) − T (c)| < ε,
which proves that T is continuous at c. Thus, T (x) is discontinuous at every rational
number and continuous at every irrational number. The inquisitive student might
ask if there is a function opposite to Thomae’s function, that is,

Is there a function which is continuous at every rational point
and discontinuous at every irrational point?

The answer is “No.” There are many ways to prove this; one can answer this
question using the Baire category theorem (cf. [1, p. 128]), but we shall answer this
question using “compactness” arguments originating with Vito Volterra’s (1860–
1940) first publication in 1881 (before he was twenty!)[3]. To state his theorem we
need some terminology.
Let D ⊆ Rp . A subset A ⊆ D is said to be dense in D if for each point c ∈ D,
any open ball centered at c intersects A, that is, for all r > 0, Br (c) ∩ A is not
empty, or more explicitly, for all r > 0, there is a point x ∈ A such that |x − c| < r.
This condition is equivalent to the statement that every point in D is either in A
or a limit point of A. For p = 1, A ⊆ D is dense if for all c ∈ D and r > 0,
(c − r, c + r) ∩ A ≠ ∅.
Example 4.20. Q is dense in R, because (by the density of the (ir)rationals
in R — Theorem 2.37) for any c ∈ R and r > 0, (c − r, c + r) ∩ Q is never empty.
Similarly, the set Q^c (the complement of Q in R), that is, the irrational numbers, is also dense in R.
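
The density of Q can be made concrete with a short Python sketch. Given c and r > 0, pick an integer q > 1/r and let p be the smallest integer greater than cq; then c < p/q ≤ c + 1/q < c + r. The function name rational_near is a hypothetical helper, and c is passed as a floating-point number, which is only an approximation of a real number, so this is an illustration under that assumption rather than a proof.

```python
from fractions import Fraction
from math import floor, sqrt

def rational_near(c: float, r: float) -> Fraction:
    """Return a rational p/q with c < p/q <= c + 1/q, where 1/q < r."""
    q = floor(1 / r) + 1      # q > 1/r, hence 1/q < r
    p = floor(c * q) + 1      # smallest integer strictly greater than c*q
    return Fraction(p, q)

x = rational_near(sqrt(2), 1e-3)   # a rational within 0.001 of sqrt(2)
y = rational_near(-0.5, 0.25)      # works for negative centers as well
```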
Let f : D −→ Rm with D ⊆ Rp and let Cf ⊆ D denote the set of points in D
at which f is continuous. Explicitly,
Cf := {c ∈ D ; f is continuous at c} .
The function f is said to be pointwise discontinuous if Cf is dense in D.
Theorem 4.15 (Volterra’s theorem). On any nonempty open interval, any
two pointwise discontinuous functions have a point of continuity in common.
Proof. Let f and g be pointwise discontinuous functions on an open interval
I. We prove our theorem in three steps.
Step 1: A closed interval [α, β] is said to be nontrivial if α < β. We first
prove that given any ε > 0 and nonempty open interval (a, b) ⊆ I, there is a
nontrivial closed interval J ⊆ (a, b) such that for all x, y ∈ J,
|f (x) − f (y)| < ε and |g(x) − g(y)| < ε.
Indeed, since the continuity points of f are dense in I, there is a point c ∈ (a, b)
at which f is continuous, so for some δ > 0, x ∈ I ∩ (c − δ, c + δ) implies that
|f (x) − f (c)| < ε/2. Choosing δ > 0 smaller if necessary, we may assume that
J′ = [c − δ, c + δ] ⊆ (a, b). Then for any x, y ∈ J′ , we have
|f (x) − f (y)| = |(f (x) − f (c)) + (f (c) − f (y))| ≤ |f (x) − f (c)| + |f (c) − f (y)| < ε.
Using the same argument for g, but with (c − δ, c + δ) in place of (a, b) shows
that there is a nontrivial closed interval J ⊆ J′ such that x, y ∈ J implies that
|g(x) − g(y)| < ε. Since J ⊆ J′ , the function f automatically satisfies |f (x) − f (y)| <
ε for x, y ∈ J. This completes the proof of Step 1.
Step 2: With ε = 1 and (a, b) = I in Step 1, there is a nontrivial closed
interval [a1 , b1 ] ⊆ I such that x, y ∈ [a1 , b1 ] implies that
|f (x) − f (y)| < 1 and |g(x) − g(y)| < 1.
Now with ε = 1/2 and (a, b) = (a1 , b1 ) in Step 1, there is a nontrivial closed
interval [a2 , b2 ] ⊆ (a1 , b1 ) such that x, y ∈ [a2 , b2 ] implies that
|f (x) − f (y)| < 1/2 and |g(x) − g(y)| < 1/2.
Continuing by induction, we construct a sequence of nontrivial closed intervals
{[an , bn ]} such that [an+1 , bn+1 ] ⊆ (an , bn ) for each n and x, y ∈ [an , bn ] implies
that
(4.19) |f (x) − f (y)| < 1/n and |g(x) − g(y)| < 1/n.
By the nested intervals theorem there is a point c contained in every [an , bn ].
Step 3: We now complete the proof. We claim that both f and g are con-
tinuous at c. To prove continuity, let ε > 0. Choose n ∈ N with 1/n < ε. Since
[an+1 , bn+1 ] ⊆ (an , bn ), we have c ∈ (an , bn ) so we can choose δ > 0 such that
(c − δ, c + δ) ⊆ (an , bn ). With this choice of δ > 0, in view of (4.19) and the fact
that 1/n < ε, we obtain
|x − c| < δ =⇒ |f (x) − f (c)| < ε and |g(x) − g(c)| < ε.
Thus, f and g are continuous at c and our proof is complete. 
Thus, there cannot be a function f : R −→ R that is continuous at every
rational point and discontinuous at every irrational point. Indeed, if so, then f
would be pointwise discontinuous (because Q is dense in R) and the function f and
Thomae’s function wouldn’t have any continuity points in common, contradicting
Volterra’s theorem.
Exercises 4.3.
1. Recall that ⌊x⌋ denotes the greatest integer less than or equal to x. Determine the set
of continuity points for the following functions:
(a) f (x) = ⌊x⌋, (b) g(x) = x⌊x⌋, (c) h(x) = ⌊1/x⌋,
where the domains are R, R, and (0, ∞), respectively. Are the functions continuous on
the domains (−1, 1), (−1, 1), and (1, ∞), respectively?
2. In this problem we deal with zero sets of functions. Let f : D −→ Rm with D ⊆ Rp .
The zero set of f is the set Z(f ) := {x ∈ D ; f (x) = 0}.
(a) Let f : D −→ Rm be continuous and let c ∈ D be a limit point of Z(f ). Prove
that f (c) = 0.
(b) Let f : D −→ Rm be continuous and suppose that Z(f ) is dense in D. Prove that
f is the zero function, that is, f (x) = 0 for all x ∈ D.
(c) Using (b), prove that if f, g : D −→ Rm are continuous and f (x) = g(x) on a dense
subset of D, then f = g, that is, f (x) = g(x) for all x ∈ D.
3. In this problem we look at additive functions. Let f : Rp −→ Rm be additive in the
sense that f (x + y) = f (x) + f (y) for all x, y ∈ Rp . Prove:
(a) Prove that f (0) = 0 and that f (x − y) = f (x) − f (y) for all x, y ∈ Rp .
(b) If f is continuous at some x0 ∈ Rp , then f is continuous on all of Rp .
(c) Assume now that p = 1 so that f : R −→ Rm is additive (no continuity assumptions
at this point). Prove that f (r) = f (1) r for all r ∈ Q.
(d) (Cf. [251]) If f : R −→ Rm is continuous, prove that f (x) = f (1) x for all x ∈ R.
4. Let f : Rp −→ C be multiplicative in the sense that f (x + y) = f (x) f (y) for all
x, y ∈ Rp . Assume that f is not the zero function.
(a) Prove that f (x) ≠ 0 for all x ∈ Rp .
(b) Prove that f (0) = 1 and prove that f (−x) = 1/f (x).
(c) Prove that if f is continuous at some point x0 , then f is continuous on all of Rp .
When p = 1 and f is real-valued and continuous, in Problem 11 in Exercises 4.6 we
show that f is given by an “exponential function”.
5. Let f : I −→ R be a continuous function on a closed and bounded interval I. Suppose
there is a 0 < r < 1 having the property that for each x ∈ I there is a point y ∈ I with
|f (y)| ≤ r|f (x)|. Prove that f must have a root, that is, there is a point c ∈ I such
that f (c) = 0.
6. Consider the following function related to Thomae’s function:
\[
t(x) :=
\begin{cases}
q & \text{if } x = p/q \text{ in lowest terms and } q > 0, \\
0 & \text{if } x \text{ is irrational.}
\end{cases}
\]
Prove that t : R −→ R is discontinuous at every point in R.
7. Here are some fascinating questions related to Volterra’s theorem.
(a) Are there functions f, g : R −→ R that don’t have any continuity points in common,
one that is pointwise discontinuous and the other one that is not (but is continuous
at least at one point)? Give an example or prove there are no such functions.
(b) Is there a continuous function f : R −→ R that maps rationals to irrationals? Give
an example or prove there is no such function.
(c) Is there a continuous function f : R −→ R that maps rationals to irrationals and
irrationals to rationals? Suggestion: Suppose there is such a function and consider
the function T ◦ f where T is Thomae’s function.

4.4. Compactness, connectedness, and continuous functions


Consider the continuous function f (x) = 1/x with domain D = R \ {0}, the
real line with a “hole.” By drawing a graph of this function, it will be apparent that
f has the following “bad” properties:³ f is not bounded on D, in particular f does
not attain a maximum or minimum value on D, and that although the range of f
contains both positive and negative values, f never takes on the intermediate value
of 0. In this section we prove that the “bad” nonboundedness property is absent
when the domain is a closed and bounded interval and the “bad” nonintermediate
value property is absent when the domain is any interval. We shall prove these
results in two rather distinct viewpoints using:
(I) A somewhat concrete analytical approach that only uses concepts we’ve
covered in previous sections.
(II) A somewhat abstract topological approach based on the topological lemmas
presented in Section 4.4.1.
If you’re only interested in the easier analytical approach, skip Section 4.4.1
and also skip the Proof II’s in Theorems 4.19, 4.20, and 4.22. For an interesting
and different approach using the concept of “tagged partitions,” see Gordon [84].
4.4.1. Some fundamental topological lemmas. Let A ⊆ R. A collection
U of subsets of R is called a cover of A (or covers A) if the union of all the sets
in U contains A. Explicitly, U = {Uα } covers A if A ⊆ ⋃α Uα . We are mostly
interested in coverings by open intervals, that is, where each Uα is an open interval.

Example 4.21. (0, 1) is covered by U = {Un = (1/n, 1)} because (0, 1) ⊆
⋃_{n=1}^∞ (1/n, 1). (The diligent student will supply the details!)
Example 4.22. [0, 1] is covered by V = {Vn = (−1/n, 1 + 1/n)} because
[0, 1] ⊆ ⋃_{n=1}^∞ (−1/n, 1 + 1/n).
Figure 4.5. The finite subcollection {Un1 , . . . , Unk } does not cover (0, 1).

³ “Bad” from one angle; from another angle, these properties can be viewed as interesting.

It’s interesting to notice that U does not have a finite subcover, that is,
there are not finitely many elements of U that will still cover (0, 1). To see this,
let {Un1 , . . . , Unk } be a finite subcollection of elements of U . By relabelling, we
may assume that n1 < n2 < · · · < nk . Since nk is the largest of these k numbers,
we have Un1 ∪ · · · ∪ Unk = (1/nk , 1), which does not cover (0, 1) because there is a “gap”
between 0 and 1/nk as seen in Figure 4.5. On the other hand, V does have a
finite subcover, that is, there are finitely many elements of V that will cover [0, 1].
Indeed, [0, 1] is covered by the single element V1 of V because [0, 1] ⊆ (−1, 1 + 1).
This is in fact a general phenomenon for closed and bounded intervals.
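
The "gap" argument for the cover U of (0, 1) can be checked with exact rational arithmetic. The sketch below is only an illustration; the helper names in_U and missed_point are hypothetical. Given any finite list of indices, it produces a point of (0, 1) that none of the chosen sets Un = (1/n, 1) contains.

```python
from fractions import Fraction

def in_U(x, n):
    """True when x lies in U_n = (1/n, 1)."""
    return Fraction(1, n) < x < 1

def missed_point(indices):
    """A point of (0, 1) lying in none of the U_n with n in `indices`."""
    n_k = max(indices)            # largest index in the finite subcollection
    return Fraction(1, 2 * n_k)   # lies in the gap between 0 and 1/n_k

indices = [2, 5, 17]
x = missed_point(indices)
print(x, any(in_U(x, n) for n in indices))   # prints: 1/34 False
```

By contrast, the single set V1 = (−1, 2) already contains both endpoints 0 and 1, and hence all of [0, 1].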
Lemma 4.16 (Compactness lemma). Every cover of a closed and bounded
interval by open intervals has a finite subcover.
Proof. Let U be a cover of [a, b] by open intervals. We must show that there
are finitely many elements of U that still cover [a, b]. Let A be the set of all numbers
x in [a, b] such that the interval [a, x] is contained in a union of finitely many sets
in U . Since [a, a] is the single point a, this interval is contained in a single set in
U , so A is not empty. Being a nonempty subset of R bounded above by b, A has
a supremum, say ξ ≤ b. Since ξ belongs to the interval [a, b] and U covers [a, b], ξ
belongs to some open interval (c, d) in the collection U . Choose any real number
η with c < η < ξ. Then η is less than the supremum of A, so [a, η] is covered by
finitely many sets in U , say [a, η] ⊆ U1 ∪ · · · ∪ Uk . Adding Uk+1 := (c, d) to this
collection, it follows that for any real number x with c < x < d, the interval [a, x] is
covered by the finitely many sets U1 , . . . , Uk+1 in U . In particular, since c < ξ < d,
for any x with ξ ≤ x < d, the interval [a, x] is covered by finitely many sets in U ,
so unless ξ = b, the set A would contain a number greater than ξ. Hence, b = ξ
and [a, b] can be covered by finitely many sets in U . 
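
For covers of a special form, a finite subcover can be extracted by a left-to-right sweep. The Python sketch below assumes that each point c of [a, b] comes with an open interval (c − radius(c), c + radius(c)) from the cover, and that radius is bounded below by a positive constant. Both the function name greedy_subcover and the sample radius are hypothetical. This sweep is not the supremum argument used in the proof, and without the lower bound on the radius it need not terminate, which is why a step limit is included.

```python
def greedy_subcover(a, b, radius, max_steps=10_000):
    """Centers c_1 < c_2 < ... whose intervals (c - radius(c), c + radius(c)) cover [a, b]."""
    centers = []
    x = a
    while x <= b:
        if len(centers) >= max_steps:
            raise RuntimeError("sweep did not finish; radius may shrink too fast")
        centers.append(x)
        # The interval centered at x covers [x, x + radius(x)); the next
        # center is the right endpoint, which the next interval covers.
        x = x + radius(x)
    return centers

def radius(c):
    """Sample radius function, bounded below by 0.05 (an assumption of this sketch)."""
    return 0.05 + 0.1 * c

centers = greedy_subcover(0.0, 1.0, radius)
grid = [k / 1000 for k in range(1001)]
covered = all(any(abs(t - c) < radius(c) for c in centers) for t in grid)
```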
Because closed and bounded intervals have this finite subcover property, and
therefore behave somewhat like finite sets (which are “compact” — take up little
space), we call such intervals compact. We now move to open sets. An open
set in R is simply a union of open intervals; explicitly, A ⊆ R is open means that
A = ⋃α Uα where each Uα is an open interval.
Example 4.23. R = (−∞, ∞) is an open interval, so R is open.
Example 4.24. Any open interval (a, b) is open because it’s a union consisting
of just itself. In particular, if b ≤ a, we have (a, b) = ∅, so ∅ is an open set.
Example 4.25. Another example is R \ Z because R \ Z = ⋃_{n∈Z} (n, n + 1).
A set A ⊆ R is disconnected if there are open sets U and V such that A ∩ U
and A ∩ V are nonempty, disjoint, and have union A. To have union A, we mean
A = (A ∩ U) ∪ (A ∩ V), which is actually equivalent to saying that
A ⊆ U ∪ V.
A set A ⊆ R is connected if it’s not disconnected.
Example 4.26. A = (−1, 0) ∪ (0, 1) is disconnected because A = U ∪ V where
U = (−1, 0) and V = (0, 1) are open, and A ∩ U = (−1, 0) and A ∩ V = (0, 1) are
nonempty, disjoint, and union to A.

Figure 4.6. Illustrations of the boundedness, max/min value, and
intermediate value theorems for a function f on [0, 1].

Intuitively, intervals should always be connected. This is in fact the case.

Lemma 4.17 (Connectedness lemma). Intervals (open, closed, bounded, unbounded,
etc.) are connected.
Proof. Let I be an interval and suppose, for sake of contradiction, that it is
disconnected. Then there are open sets U and V such that I ∩ U and I ∩ V are
disjoint, nonempty, and have union I. Let a, b ∈ I with a ∈ U and b ∈ V. By
symmetry we may assume that a < b. Then [a, b] ⊆ I, so [a, b] ∩ U and [a, b] ∩ V are
disjoint, nonempty, and have union [a, b]. Thus, [a, b] is disconnected so we may as
well assume that I = [a, b] in the first place and derive a contradiction from this.
Define
c := sup(I ∩ U).
This number exists because I ∩ U contains a and is bounded above by b. In par-
ticular, c ∈ I. Since I ⊆ U ∪ V, the point c must belong to either U or V. We shall
derive a contradiction in either situation. Suppose that c ∈ U. Since b ∈ I ∩ V and
I ∩ U and I ∩ V are disjoint, it follows that c ≠ b, so c < b. Now U is open, so it’s
a union of open intervals, therefore c ∈ (α, β) for some open interval (α, β) making
up U. This implies that I ∩ U contains points between c and β. However, this is
impossible because c is an upper bound for I ∩ U. So, suppose that c ∈ V. Since
a ∈ I ∩ U and I ∩ U and I ∩ V are disjoint, it follows that c ≠ a, so a < c. Now
c ∈ (α′, β′) for some open interval (α′, β′) making up V. Since I ∩ U and I ∩ V are disjoint
and c is an upper bound for I ∩ U, there are points between α′ and c that are also
upper bounds for I ∩ U. This too is impossible since c is the least upper bound.
4.4.2. The boundedness theorem. The geometric content of the bounded-
ness theorem is that the graph of a continuous function f on a closed and bounded
interval lies between two horizontal lines, that is, there is a constant M such that
|f (x)| ≤ M for all x in the interval. Therefore, the graph does not extend infinitely
up or down; see Figure 4.6 (the dots and the point c in the figure have to do with
the max/min value and intermediate value theorems). The function f (x) = 1/x on
(0, 1] or f (x) = x on any unbounded interval shows that the boundedness theorem
does not hold when the interval is not closed and bounded. Before proving the
boundedness theorem, we need the following lemma.
Lemma 4.18 (Inequality lemma). Let f : I −→ R be a continuous map on
an interval I, let c ∈ I, and suppose that |f (c)| < d where d ∈ R. Then there is an
open interval Ic containing c such that for all x ∈ I with x ∈ Ic , we have |f (x)| < d.
Proof. Let ε = d − |f (c)| and, using the definition of continuity, choose δ > 0
such that x ∈ I and |x − c| < δ =⇒ |f (x) − f (c)| < ε. Let Ic = (c − δ, c + δ). Then
given x ∈ I with x ∈ Ic , we have |x − c| < δ, so

|f (x)| = |(f (x) − f (c)) + f (c)| ≤ |f (x) − f (c)| + |f (c)| < ε + |f (c)| = d − |f (c)| + |f (c)| = d,
which proves our claim. 
An analogous proof shows that if a < f (c) < b, then there is an open interval
Ic containing c such that for all x ∈ I with x ∈ Ic , we have a < f (x) < b. Yet
another analogous proof shows that if f : D −→ Rm with D ⊆ Rp is a continuous
map and |f (c)| < d, then there is an open ball B containing c such that for all
x ∈ D with x ∈ B, we have |f (x)| < d. We’ll leave these generalizations to the
interested reader.
Theorem 4.19 (Boundedness theorem). A continuous real-valued function
on a closed and bounded interval is bounded.
Proof. Let f be a continuous function on a closed and bounded interval I.
Proof I: Assume that f is unbounded; we shall prove that f is not continuous.
Since f is unbounded, for each natural number n there is a point xn in I such that
|f (xn )| ≥ n. By the Bolzano-Weierstrass theorem, the sequence {xn } has a convergent
subsequence, say {x′n } that converges to some c in I. By the way the numbers
xn were chosen, it follows that |f (x′n )| → ∞, which shows that f (x′n ) ↛ f (c), for
if f (x′n ) → f (c), then we would have |f (c)| = lim |f (x′n )| = ∞, an impossibility
because f (c) is a real number. Thus, f is not continuous at c.
Proof II: Given any arbitrary point c in I, we have |f (c)| < |f (c)|+1, so by our
inequality lemma there is an open interval Ic containing c such that for each x ∈ Ic ,
|f (x)| < |f (c)| + 1. The collection of all such open intervals U = {Ic ; c ∈ I} covers
I, so by the compactness lemma, there are finitely many open intervals in U that
cover I, say Ic1 , . . . , Icn . Let M be the largest of the values |f (c1 )|+1, . . . , |f (cn )|+1.
We claim that f is bounded by M on all of I. Indeed, given x ∈ I, since Ic1 , . . . , Icn
cover I, there is an interval Ick containing x. Then,
|f (x)| < |f (ck )| + 1 ≤ M.
Thus, f is bounded. 
4.4.3. The max/min value theorem. The geometric content of our second
theorem is that the graph of a continuous function on a closed and bounded interval
must have highest and lowest points. The dots in Figure 4.6 show such extreme
points; note that there are two lowest points in the figure. The simple example
f (x) = x on (0, 1) shows that the max/min theorem does not hold when the interval
is not closed and bounded.
Theorem 4.20 (Max/min value theorem). A continuous real-valued func-
tion on a closed and bounded interval achieves its maximum and minimum values.
That is, if f : I −→ R is a continuous function on a closed and bounded interval I,
then for some values c and d in the interval I, we have
f (c) ≤ f (x) ≤ f (d) for all x in I.
Proof. Define
M := sup{f (x) ; x ∈ I}.
This number is finite by the boundedness theorem. We shall prove that there is a
number d in I such that f (d) = M . This proves that f achieves its maximum;
a related proof shows that f achieves its minimum.
Proof I: By definition of supremum, for each natural number n, there exists
an xn in I such that
(4.20) M − 1/n < f (xn ) ≤ M,
for otherwise, the value M − 1/n would be a smaller upper bound for {f (x) ; x ∈ I}.
By the Bolzano-Weierstrass theorem, the sequence {xn } has a convergent subsequence
{x′n }; let’s say that x′n → d where d is in I (because I is closed). By continuity, we have
f (x′n ) → f (d). On the other hand, by (4.20) and the squeeze theorem, we have
f (xn ) → M , so f (x′n ) → M as well. By uniqueness of limits, f (d) = M .
Proof II: Assume, for sake of contradiction, that f (x) < M for all x in I. Let
c be any point in I. Since f (c) < M by assumption, we can choose εc > 0 such that
f (c) < M − εc , so by the discussion after our inequality lemma there is an open interval Ic containing
c such that for all x ∈ I with x ∈ Ic , f (x) < M − εc . The collection U = {Ic ; c ∈ I}
covers I, so by the compactness lemma, there are finitely many open intervals in
U that cover I, say Ic1 , . . . , Icn . Let m be the largest of the finitely many values
M − εck , k = 1, . . . , n. Then m < M and given any x ∈ I, since Ic1 , . . . , Icn
cover I, there is an interval Ick containing x, which shows that
f (x) < M − εck ≤ m < M.
This implies that M cannot be the supremum of f over I, since m is a smaller
upper bound for f . This gives a contradiction to the definition of M .
4.4.4. The intermediate value theorem. A real-valued function f on an
interval I is said to have the intermediate value property if it attains all its
intermediate values in the sense that if a < b both belong to I, then given any
real number ξ between f (a) and f (b), there is a c in [a, b] such that f (c) = ξ. By
“between” we mean that either f (a) ≤ ξ ≤ f (b) or f (b) ≤ ξ ≤ f (a). Geometrically,
this means that the graph of f can be drawn without “jumps,” that is, without ever
lifting up the pencil. We shall prove the intermediate value theorem, which states
that any continuous function on an interval has the intermediate value property.
See the previous Figure 4.6 for an example where we take, for instance, a = 0 and
b = 1; note for this example that the point c need not be unique (there is another
c′ such that f (c′) = ξ). The function in the introduction to this section shows that
the intermediate value theorem fails when the domain is not an interval.
Before proving the intermediate value theorem we first think a little about
intervals. Note that if I is an interval, bounded or unbounded, open, closed, etc,
then given any points a, b ∈ I with a < b, it follows that every point c between
a and b is also in I. The converse statement: “if A ⊆ R is such that given any
points a, b ∈ A with a < b, every point c between a and b is also in A, then A is an
interval” is “obviously” true. We shall leave its proof to the interested reader.
Lemma 4.21. A set A in R is an interval if and only if given any points a < b
in A, we have [a, b] ⊆ A. Stated another way, A is an interval if and only if given
any two points a, b in A with a < b, all points between a and b also lie in A.
(In fact, some mathematicians might even take this lemma as the definition of
interval.) We are now ready to prove our third important theorem in this section.

Figure 4.7. Proof of the intermediate value property.

Theorem 4.22 (Intermediate value theorem). A real-valued continuous
function on any interval (bounded, unbounded, open, closed, . . .) has the intermediate
value property. Moreover, the range of the function is also an interval.
Proof. Let f be a real-valued continuous function on an interval I and let ξ
be between f (a) and f (b) where a < b and a, b ∈ I. We shall prove that there is a
c in [a, b] such that f (c) = ξ. Assume that f (a) ≤ ξ ≤ f (b); the reverse inequalities
have a related proof. Note that if ξ = f (a), then c = a works or if ξ = f (b), then
c = b works, so we may assume that f (a) < ξ < f (b). We now prove that f has
the intermediate value property.
Proof I: To prove that f has the IVP we don’t care about f outside of [a, b],
so let’s (re)define f outside of the interval [a, b] such that f is equal to the constant
value f (a) on (−∞, a) and f (b) on (b, ∞). This gives us a continuous function,
which we again denote by f , that has domain R as shown in Figure 4.7.
Define
A = {x ∈ R ; f (x) ≤ ξ}.
Since f (a) < ξ, we see that a ∈ A so A is not empty and since ξ < f (b), we see that
A is bounded above by b. In particular, c := sup A exists and a ≤ c ≤ b. We shall
prove that f (c) = ξ, which is “obvious” from Figure 4.7. To prove this rigorously,
observe that by definition of least upper bound, for any n ∈ N, there is a point
xn ∈ A such that c − 1/n < xn ≤ c. As n → ∞, we have xn → c, so by continuity,
f (xn ) → f (c). Since f (xn ) ≤ ξ, because xn ∈ A, and limits preserve inequalities,
we have f (c) ≤ ξ. On the other hand, by definition of upper bound, for any n ∈ N,
we must have f (c + 1/n) > ξ. Taking n → ∞ and using that f is continuous and
that limits preserve inequalities, we see that f (c) ≥ ξ. It follows that f (c) = ξ.
Proof II: To prove that f has the IVP using topology, suppose that f (x) ≠ ξ
for any x in I. Let c be any point in I. If f (c) < ξ, then by the discussion after
our inequality lemma, there is an open interval Ic containing c such that if x ∈ I
with x ∈ Ic , we have f (x) < ξ. Similarly, if ξ < f (c), there is an open interval Ic
containing c such that if x ∈ I with x ∈ Ic , we have ξ < f (x). In summary, we
have assigned to each point c ∈ I, an open interval Ic that contains c such that
either f (x) < ξ or ξ < f (x) for all x ∈ I with x ∈ Ic . Let U be the union of all the
Ic ’s where f (c) < ξ and let V be the union of all the Ic ’s where ξ < f (c). Then U
and V are unions of open intervals so are open sets by definition, and a ∈ U since
f (a) < ξ, and b ∈ V since ξ < f (b). Notice that I ∩ U and I ∩ V are disjoint because
every x ∈ I ∩ U satisfies f (x) < ξ while every x ∈ I ∩ V satisfies ξ < f (x).
Thus, I ∩ U and I ∩ V are disjoint, nonempty, and have union I. This
contradicts the fact that intervals are connected.
We now prove that f (I) is an interval. By our lemma, f (I) is an interval if
and only if given any points α, β in f (I) with α < β, all points between α and β
also lie in f (I). Since α, β ∈ f (I), we can write α = f (x) and β = f (y). Now
let f (x) < ξ < f (y). We need to show that ξ ∈ f (I). However, according to the
intermediate value property, there is a c in I such that f (c) = ξ. Thus, ξ is in f (I)
and our proof is complete. 
A root or zero of a function f : D −→ Rm (with D ⊆ Rp ) is a point c ∈ D
such that f (c) = 0.
Corollary 4.23. Let f be a real-valued continuous function on an interval
and let a < b be points in the interval such that f (a) and f (b) have opposite signs
(that is, f (a) > 0 and f (b) < 0, or f (a) < 0 and f (b) > 0). Then there is a number
c with a < c < b such that f (c) = 0.
Proof. Since 0 is between f (a) and f (b), by the intermediate value theorem
there is a point c in [a, b] such that f (c) = 0; since f (a) and f (b) are nonzero, c
must lie strictly between a and b. 
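
The corollary guarantees that a root exists but does not say how to find one. A standard numerical companion is bisection, sketched below in Python. This is not the method of either proof above, and the function name bisect_root, the tolerance, and the sample polynomial are choices made only for this illustration. Each step halves an interval on which f changes sign, so by the corollary the interval always contains a root.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a root of a continuous f on [a, b], given a sign change."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid                 # landed exactly on a root
        if fa * fm < 0:
            b, fb = mid, fm            # sign change is in [a, mid]
        else:
            a, fa = mid, fm            # sign change is in [mid, b]
    return (a + b) / 2

def sample(x):
    """Sample polynomial with sample(2) = -1 < 0 < 16 = sample(3)."""
    return x**3 - 2*x - 5

root = bisect_root(sample, 2.0, 3.0)
```

The default tolerance assumes endpoints of moderate size; for very large endpoints, floating-point spacing may exceed it and a larger tolerance would be needed.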
4.4.5. The fundamental theorems of continuous functions in action.
Example 4.27. The intermediate value theorem helps us to solve the following
puzzle [219, p. 239]. (For another interesting puzzle, see Problem 6.) At 1 o’clock
in the afternoon a man starts walking up a mountain, arriving at 10 o’clock in the
evening at his hut. At 1 o’clock the next afternoon he walks back down by the
exact same route, arriving at 10 o’clock in the evening at the point he started off
the day before. Prove that at some time he is at the same place on the mountain
on both days. To solve this puzzle, let f (x) and g(x) be the distance of the man
from his hut, measured along his route, at time x on day one and two, respectively.
Then f : [1, 10] −→ R and g : [1, 10] −→ R are continuous. We need to show
that f (x) = g(x) at some time x. To see this, let h(x) = f (x) − g(x). Then
h(1) = f (1) − g(1) = f (1) > 0 and h(10) = f (10) − g(10) = −g(10) < 0. The intermediate value theorem implies there is some point t
where h(t) = 0. This t is a time that solves our puzzle.⁴
Example 4.28. The intermediate value theorem can be used to prove that any
nonnegative real number has a square root. To see this, let a ≥ 0 and consider the
function f (x) = x². Then f is continuous on R, f (0) = 0, and
f (a + 1) = (a + 1)² = a² + 2a + 1 ≥ 2a ≥ a.
Therefore, f (0) ≤ a ≤ f (a + 1), so by the intermediate value theorem, there is a
point 0 ≤ c ≤ a + 1 such that f (c) = a, that is, c² = a. This proves that a has a
square root. (The uniqueness of c follows from the last power rule in Theorem 2.22.)
Of course, considering the function f (x) = x^n , we can prove that any nonnegative
real number has a unique n-th root.
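
The interval [0, a + 1] used in this example also gives a concrete way to trap the square root between rationals. The Python sketch below (the name sqrt_bracket is hypothetical) keeps the invariant lo² ≤ a ≤ hi², starting from lo = 0 and hi = a + 1, and halves the interval at each step using exact rational arithmetic. It is an illustration of the example, not part of the existence proof.

```python
from fractions import Fraction

def sqrt_bracket(a: Fraction, steps: int = 40):
    """Return rationals (lo, hi) with lo**2 <= a <= hi**2 and hi - lo = (a + 1) / 2**steps."""
    lo, hi = Fraction(0), a + 1        # f(0) = 0 <= a <= f(a + 1), as in the example
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= a:
            lo = mid                   # keeps lo**2 <= a
        else:
            hi = mid                   # keeps a < hi**2
    return lo, hi

lo, hi = sqrt_bracket(Fraction(2))
```

After 40 steps with a = 2, the two rationals differ by exactly 3/2^40, so both approximate √2 to about eleven decimal places.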
Example 4.29. Here’s an interesting Question: Is there a continuous function
f : [0, 1] −→ R that takes on each value in its range exactly twice? In other
words, for each y ∈ f ([0, 1]) there are exactly two points x1 , x2 ∈ [0, 1] such that
y = f (x1 ) = f (x2 ) — such a function is said to be “two-to-one”. The answer is no.
(See Problem 8 for generalizations of this example.) To see this, assume, by way
of contradiction, that there is such a two-to-one function. Let y0 be the maximum
value of f , which exists by the max/min value theorem. Then there are exactly two
points a, b ∈ [0, 1], say 0 ≤ a < b ≤ 1, such that y0 = f (a) = f (b). Note that all
other points x ∈ [0, 1] besides a, b must satisfy f (x) < y0 . This is because if x ≠ a, b
yet f (x) = y0 = f (a) = f (b), then there would be three points x, a, b ∈ [0, 1] taking
on the same value contradicting the two-to-one property. We claim that a = 0.
Indeed, suppose that 0 < a and choose any c ∈ (a, b); then, 0 < a < c < b. Since
f (0) < y0 and f (c) < y0 we can choose a ξ ∈ R such that f (0) < ξ < y0 and
f (c) < ξ < y0 . Therefore,
f (0) < ξ < f (a) , f (c) < ξ < f (a) , f (c) < ξ < f (b).
By the intermediate value theorem, there are points
0 < c1 < a , a < c2 < c , c < c3 < b
such that ξ = f (c1 ) = f (c2 ) = f (c3 ). Note that c1 , c2 , c3 are all distinct and ξ is
taken on at least three times by f . This contradicts the two-to-one property, so
a = 0. Thus, f achieves its maximum at 0. Since −f is also two-to-one, it follows
that −f also achieves its maximum at 0, which is the same as saying that f achieves
its minimum at 0. However, if y0 = f (0) is both the maximum and minimum of f ,
then f must be the constant function f (x) = y0 for all x ∈ [0, 1] contradicting the
two-to-one property of f .

⁴ A nonmathematical way to solve the puzzle in Example 4.27 is to have the man’s “twin” walk down the
mountain while the man is walking up. At some moment, the two men will cross.
Exercises 4.4.
1. Is there a nonconstant continuous function f : R −→ R that takes on only rational
values (that is, whose range is contained in Q)? What about only irrational values?
2. In this problem we investigate real roots of real-valued odd degree polynomials.
(a) Let p(x) = x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 be a polynomial with each a_k real and
n ≥ 1 (not necessarily odd). Prove that there is a real number a > 0 such that
(4.21)    1/2 ≤ 1 + a_{n−1}/x + · · · + a_0/x^n ,    for all |x| ≥ a.
(b) Using (4.21), prove that if n is odd, there is a c ∈ [−a, a] with p(c) = 0.
(c) Puzzle: Does there exist a real number that is one more than its cube?
3. In this problem we investigate real roots of real-valued even degree polynomials. Let
p(x) = x^n + a_{n−1} x^{n−1} + · · · + a_1 x + a_0 with each a_k real and n ≥ 2 even.
(a) Let b > 0 with b^n ≥ max{2a_0 , a} where a is given in (4.21). Prove that if |x| ≥ b,
then p(x) ≥ a0 = p(0).
(b) Prove there is a c ∈ R such that for all x ∈ R, p(c) ≤ p(x). That is, p : R −→ R
achieves a minimum value. Is this statement true for odd-degree polynomials?
(c) Show that there exists a d ∈ R such that the equation p(x) = ξ has a solution
x ∈ R if and only if ξ ≥ d. In particular, p has a real root if and only if d ≤ 0.
4. Here are a variety of continuity problems. Let f : [0, 1] −→ R be continuous.
(a) If f is one-to-one, prove that f achieves its maximum and minimum values at 0 or
1; that is, the maximum and minimum values of f cannot occur at points in (0, 1).
(b) If f (0) = f (1), prove there are points a, b ∈ (0, 1) with a ≠ b such that f (a) = f (b).
(c) If f is one-to-one and f (0) < f (1), prove that f is strictly increasing; that is, for
all a, b ∈ [0, 1] with a < b, we have f (a) < f (b).
(d) If g : [0, 1] −→ R is continuous such that f (0) < g(0) and g(1) > f (1), prove that
there is a point c ∈ (0, 1) such that f (c) = g(c).
5. (Brouwer’s fixed point theorem) If f : [a, b] −→ [a, b] is a continuous function,
prove that there is a point c ∈ [a, b] such that f (c) = c. This result is a special case
of a theorem by Luitzen Egbertus Jan Brouwer (1881–1966). Puzzle: You are given
a straight wire lying perpendicular to a wall and you bend it into any shape you can
imagine and put it back next to the wall. Is there a point on the bent wire whose
distance to the wall is exactly the same as it was originally?
6. (Antipodal point puzzle) Prove that there are, at any given moment, antipodal
points on the earth’s equator that have the same temperature. Here are some steps:
(i) Let a > 0 and let f : [0, 2a] −→ R be a continuous function with f (0) = f (2a).
Show that there exists a point ξ ∈ [0, a] such that f (ξ) = f (ξ + a).
(ii) Using (i) solve our puzzle.
7. Let f : I −→ R and g : I −→ R be continuous functions on a closed and bounded
interval I and suppose that f (x) < g(x) for all x in I.
(a) Prove that there is a constant α > 0 such that f (x) + α < g(x) for all x ∈ I.
(b) Prove that there is a constant β > 1 such that βf (x) < g(x) for all x ∈ I.
(c) Do properties (a) and (b) hold in case I is bounded but not closed (e.g. I = (0, 1)
or I = (0, 1]) or unbounded (e.g. I = R or I = [1, ∞))? In each of these two cases
prove (a) and (b) are true, or give counterexamples.
8. (n-to-one functions) This problem is a continuation of Example 4.29.
(a) Define a (necessarily non-continuous) function f : [0, 1] −→ R that takes on each
value in its range exactly two times.
(b) Prove that there does not exist a continuous function f : [0, 1] −→ R that takes on each value
in its range exactly n times, where n ∈ N with n ≥ 2.
(c) Now what about a function with domain R instead of [0, 1]? Prove that there does
not exist a continuous function f : R −→ R that takes on each value in its range
exactly two times.
(d) Prove that there does not exist a continuous function f : R −→ R that takes on
each value in its range exactly n times, where n ∈ N is even. If n is odd, there does
exist such a function! Draw such a function when n = 3 (try to draw a “zig-zag”
type function). If you’re interested in a formula for a continuous n-to-one function
for arbitrary odd n, try to come up with one or see Wenner [242].
9. Show that a function f : R −→ R can have at most a countable number of strict
maxima. Here, a strict maximum is a point c such that f (x) < f (c) for all x
sufficiently close to c. Suggestion: At each point c where f has a strict maximum,
choose an interval (p, q) containing c where p, q ∈ Q.
The remaining exercises give alternative proofs of the boundedness, max/min, and
intermediate value theorems.
10. (Boundedness, Proof III) We shall give another proof of the boundedness theorem
as follows. Let f be a real-valued continuous function on a closed interval [a, b]. Define
A = {c ∈ [a, b] ; f is bounded on [a, c]}.
If we prove that b ∈ A, then f is bounded on [a, b], which proves our theorem.
(i) Show that a ∈ A and d := sup A exists where d ≤ b. We show that d = b.
(ii) Suppose that d < b. Show that there is an open interval I containing d such that
f is bounded on [a, b] ∩ I, and moreover, for all points c ∈ I with d < c < b, f is
bounded on [a, c]. Derive a contradiction.
11. (Max/min, Proof III) We give another proof of the max/min value theorem as
follows. Let M be the supremum of a real-valued continuous function f on a closed
and bounded interval I. Assume that f (x) < M for all x in I and define
g(x) = 1/(M − f (x)).
Show that g is continuous on I. However, show that g is actually not bounded on I.
Now use the boundedness theorem to arrive at a contradiction.
12. (Max/min, Proof IV) Here’s a proof of the max/min value theorem that is similar
to the proof of the boundedness theorem in Problem 10. For each c ∈ [a, b] define
M_c = sup{f (x) ; x ∈ [a, c]}.
This number is finite by the boundedness theorem and M := M_b is the supremum of f
over all of [a, b]. We shall prove that there is a number d in [a, b] such that f (d) = M .
This proves that f achieves its maximum; a related proof shows that f achieves its
minimum. Define
A = {c ∈ [a, b] ; M_c < M }.
(i) If a ∉ A, prove that f (a) = M and we are done.
(ii) So, suppose that a ∈ A. Show that d := sup A exists where d ≤ b. We claim that
f (d) = M . By way of contradiction, suppose that f (d) < M . Let ε > 0 satisfy
f (d) < M − ε and, by the inequality lemma, choose an open interval I containing
d such that for all x ∈ [a, b] with x ∈ I, f (x) < M − ε. Show that there is an
m < M such that for any c ∈ [a, b] with c ∈ I, M_c < m. In the two cases, d < b
or d = b, derive a contradiction.
13. (IVP, Proof III) Here’s another proof of the intermediate value property. Let f be a
real-valued continuous function on an interval [a, b] and suppose that f (a) < ξ < f (b).
(i) Define
A = {x ∈ [a, b] ; f (x) < ξ}.
Show that c := sup A exists. We shall prove that f (c) = ξ. Indeed, either this
holds or f (c) < ξ or f (c) > ξ.
(ii) If f (c) < ξ, derive a contradiction by showing that c is not an upper bound.
(iii) If f (c) > ξ, derive a contradiction by showing that c is not the least upper bound.
14. (IVP, Proof IV) In this problem we prove the intermediate value theorem using
the compactness lemma. Let f be a real-valued continuous function on [a, b] and let
f (a) < ξ < f (b). Suppose that f (x) ≠ ξ for all x in [a, b].
(i) Let U be the collection of all the open intervals Ic constructed in Theorem 4.22.
This collection covers [a, b], so by the compactness lemma, there are finitely many
open intervals in U that cover [a, b], say (a1 , b1 ), . . . , (an , bn ). We may assume that
a1 ≤ a2 ≤ a3 ≤ · · · ≤ an
by reordering the ak ’s if necessary. Prove that f (x) < ξ for all x ∈ (a1 , b1 ).
(ii) Using induction, prove that f (x) < ξ for all x in (ak , bk ), k = 1, . . . , n. Derive a
contradiction, proving the intermediate value theorem.
15. (IVP, Proof V) Finally, we give one last proof of the intermediate value theorem
called the “bisection method”. Let f be a continuous function on an interval and
suppose that a < b and f (a) < ξ < f (b).
(i) Let a1 = a and b1 = b and let c1 be the midpoint of [a1 , b1 ] and define the numbers
a2 and b2 by a2 = a1 and b2 = c1 if ξ ≤ f (c1 ) or a2 = c1 and b2 = b1 if f (c1 ) < ξ.
Prove that in either case, we have f (a2 ) < ξ ≤ f (b2 ).
(ii) Using [a2 , b2 ] instead of [a1 , b1 ] and c2 the midpoint of [a2 , b2 ] instead of c1 , and
so on, construct a nested sequence of closed and bounded intervals [an , bn ] such
that f (an ) < ξ ≤ f (bn ) for each n.
(iii) Using the nested intervals theorem show that the intersection of all [an , bn ] is a
single point, call it c, and show that f (c) = ξ.
16. We prove the connectedness lemma using the notion of “chains”. Let U and V be open
sets and suppose that [a, b] ∩ U and [a, b] ∩ V are disjoint, nonempty, and have union
[a, b]. We define a chain in U as finitely many intervals I1 , . . . , In , where the Ik ’s are
open intervals in the union defining U (recall that U, being open, is by definition a
union of open intervals), such that a ∈ I_1 and I_k ∩ I_{k+1} ≠ ∅ for k = 1, . . . , n − 1. Let
A = {c ∈ [a, b] ; there is a chain I_1 , . . . , I_n in U with c ∈ I_n }.
(i) Show that a ∈ A and that c = sup A exists, where c ∈ [a, b]. Then c ∈ U or c ∈ V.
(ii) However, show that c ∉ U by assuming c ∈ U and deriving a contradiction.
(iii) Similarly, show that c ∉ V by assuming c ∈ V and deriving a contradiction.
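The bisection construction in Problem 15 of Exercises 4.4 lends itself to a numerical sketch. The Python code below is an illustration only and is not part of the original text; the name bisect_ivt, the step count, and the sample function are assumptions made for the example. It follows the rule in part (i): at every step the interval [a_n, b_n] is halved so that f(a_n) < ξ ≤ f(b_n) is preserved.

```python
def bisect_ivt(f, a, b, xi, steps=60):
    """Approximate a point c in [a, b] with f(c) = xi, assuming f is
    continuous on [a, b] and f(a) < xi < f(b). Illustrative helper only."""
    if not (f(a) < xi < f(b)):
        raise ValueError("need f(a) < xi < f(b)")
    for _ in range(steps):
        c = (a + b) / 2          # midpoint c_n of [a_n, b_n]
        if xi <= f(c):
            b = c                # keep the left half: f(a) < xi <= f(c)
        else:
            a = c                # keep the right half: f(c) < xi <= f(b)
    return (a + b) / 2

# Sample use (an assumed test case): solve x^3 + x = 1 on [0, 1].
root = bisect_ivt(lambda x: x**3 + x, 0.0, 1.0, 1.0)
print(root, root**3 + root)
```

After n steps the interval has length (b − a)/2^n, which is the quantitative form of the nested intervals argument in part (iii).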
Figure 4.8. Zeno’s function Z : [0, 1] −→ R.

4.5. Monotone functions and their inverses


In this section we study monotone functions on intervals and their continuity
properties. In particular, we prove the following fascinating fact: Any monotone
function on an interval (no other assumptions besides monotonicity) is continuous
everywhere on the interval except perhaps at countably many points. With the
monotonicity assumption dropped, anything can happen; for instance, recall that
Dirichlet’s function is nowhere continuous.
4.5.1. Continuous and discontinuous monotone functions. Let I ⊆ R
be an interval. A function f : I −→ R is said to be nondecreasing if a ≤ b (where
a, b ∈ I) implies f (a) ≤ f (b), (strictly) increasing if a < b implies f (a) < f (b),
nonincreasing if a ≤ b implies f (a) ≥ f (b), and (strictly) decreasing if a < b
implies f (a) > f (b). The function is monotone if it’s one of these four types.
(Really two types because increasing and decreasing functions are special cases of
nondecreasing and nonincreasing functions, respectively.)
Example 4.30. A neat example of a monotone (nondecreasing) function is
Zeno’s function Z : [0, 1] −→ R, named after Zeno of Elea (490 B.C.–425 B.C.):
Z(x) =
    0                                   if x = 0,
    1/2                                 if 0 < x ≤ 1/2,
    1/2 + 1/2^2 = 3/4                   if 1/2 < x ≤ 3/4,
    1/2 + 1/2^2 + 1/2^3 = 7/8           if 3/4 < x ≤ 7/8,
    · · · etc. · · ·
    1                                   if x = 1.
See Figure 4.8. This function is called Zeno’s function because as described by
Aristotle (384 B.C.–322 B.C.), Zeno argued that “there is no motion because that
which is moved must arrive at the middle of its course before it arrives at the end”
(you can read about this in [100]). Zeno’s function moves from 0 to 1 via half-way
stops. Observe that the left-hand limits of Zeno’s function exist at each point of
[0, 1] except at x = 0 where the left-hand limit is not defined, and the right-hand
limits exist at each point of [0, 1] except at x = 1 where the right-hand limit is not
defined. Also observe that Zeno’s function has discontinuity points exactly at the
(countably many) points x = (2^k − 1)/2^k for k = 0, 1, 2, 3, 4, . . . .
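To make the definition of Zeno’s function concrete, here is a small Python sketch; it is not from the original text, and the function name zeno and the probe points are assumptions chosen for illustration. It uses exact rational arithmetic so that the half-way stops 1 − 1/2^k are represented without rounding, and it compares the value at a stop with the value slightly to its right.

```python
from fractions import Fraction

def zeno(x):
    """Zeno's function on [0, 1], evaluated exactly with Fractions."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    if x == 0 or x == 1:
        return x
    k = 1
    # Find the first k with x <= 1 - 1/2^k; on that piece Z(x) = 1 - 1/2^k.
    while x > 1 - Fraction(1, 2**k):
        k += 1
    return 1 - Fraction(1, 2**k)

# At x_k = 1 - 1/2^k compare Z(x_k) with Z at a point just to the right of x_k.
for k in range(1, 5):
    xk = 1 - Fraction(1, 2**k)
    right = xk + Fraction(1, 2**(k + 5))
    print(xk, zeno(xk), zeno(right), zeno(right) - zeno(xk))
```

The last printed column is the size of the jump at x_k, which halves each time k increases by one.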
It’s an amazing fact that Zeno’s function is typical: Every monotone function
on an interval has left and right-hand limits at every point of the interval (except at
the end points, where a left or right-hand limit is not even defined) and has at most
countably many discontinuities. For simplicity . . .
To avoid worrying about end points, in this section we only consider
monotone functions with domain R. However, every result we
prove has an analogous statement for domains that are intervals.
We repeat, every statement we mention in this section holds for monotone
functions on intervals (open, closed, half-open, etc.) as long as we make suitable
modifications of these statements at end points.
Lemma 4.24. Let f : R −→ R be nondecreasing. Then the left and right-hand
limits, f (c−) = lim_{x→c−} f (x) and f (c+) = lim_{x→c+} f (x), exist at every point
c ∈ R. Moreover, the following relations hold:
f (c−) ≤ f (c) ≤ f (c+),
and if c < d, then
(4.22) f (c+) ≤ f (d−).
Proof. Fix c ∈ R. We first show that f (c−) exists. Since f is nondecreasing,
for all x ≤ c, f (x) ≤ f (c), so the set {f (x) ; x < c} is bounded above by f (c).
Hence, b := sup{f (x) ; x < c} exists and b ≤ f (c). Given any ε > 0, by definition
of supremum there is a y < c such that b − ε < f (y). Let δ = c − y. Then
c − δ < x < c implies that y < x < c, which implies that
|b − f (x)| = b − f (x) (since f (x) ≤ b by definition of supremum)
≤ b − f (y) (since f (y) ≤ f (x))
< ε.
This shows that
lim_{x→c−} f (x) = b = sup{f (x) ; x < c}.
Thus, f (c−) exists and f (c−) = b ≤ f (c). By considering the set {f (x) ; c < x}
one can similarly prove that
f (c+) = inf{f (x) ; c < x} ≥ f (c).
Finally, let c < d. Then given any y with c < y < d, by definition of infimum and supremum,
we have
f (c+) = inf{f (x) ; c < x} ≤ f (y) ≤ sup{f (x) ; x < d} = f (d−),
which is (4.22).
Our proof is now complete. 
Of course, there is a corresponding lemma for nonincreasing functions where
the inequalities in this lemma are reversed. As a corollary of the property f (c−) ≤
f (c) ≤ f (c+) (for a nondecreasing function) and Theorem 4.14, which states that f
is continuous at c if and only if f (c−) = f (c) = f (c+), we see that a nondecreasing
function f : R −→ R is discontinuous at a point c if and only if f (c+) − f (c−)
is positive. In particular, Figure 4.9 shows that there are three basic types of
discontinuities that a (in the picture, nondecreasing) monotone function may have.
These discontinuities are jump discontinuities, where a function f : D −→ R^m with
D ⊆ R is said to have a jump discontinuity at a point c ∈ D if f is discontinuous
at c but both the left and right-hand limits f (c±) exist, provided that c is a limit
point of D ∩ (c, ∞) and D ∩ (−∞, c); the number f (c+) − f (c−) is then called the
jump of f at c. If c is only a limit point of one of the sets D∩(c, ∞) and D∩(−∞, c)
then we require only the corresponding right or left-hand limit to exist.
Figure 4.9. Monotone functions have only jump discontinuities.
Here is a proof that every monotone function has at most countably many discontinuities,
each of which is a jump discontinuity; see Problem 2 for another proof.
Theorem 4.25. A monotone function on R has uncountably many points of
continuity and at most countably many discontinuities, each discontinuity being a
jump discontinuity.
Proof. Assume that f is nondecreasing; the case for a nonincreasing function
is proved in an analogous manner. We know that f is discontinuous at a point
x if and only if f (x+) − f (x−) > 0. Given such a discontinuity point, choose a
rational number rx in the interval (f (x−), f (x+)). Since f is nondecreasing, given
any two such discontinuity points x < y, we have (see (4.22)) f (x+) ≤ f (y−), so
the intervals (f (x−), f (x+)) and (f (y−), f (y+)) are disjoint. Thus, rx ≠ ry and to
each discontinuity, we have assigned a unique rational number. It follows that the
set of all discontinuity points of f is in one-to-one correspondence with a subset of
the rationals, and therefore, since a subset of a countable set is countable, the set
of all discontinuity points of f is countable. Since R, which is uncountable, is the
union of the continuity points of f and the discontinuity points of f , the continuity
points of f must be uncountable. 
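The labeling step in this proof can be seen on Zeno’s function from Example 4.30. The Python sketch below is an illustration under stated assumptions, not part of the original text: the helper names gap and pick_rational are made up, and the gap endpoints Z(x_k−) = 1 − 1/2^k and Z(x_k+) = 1 − 1/2^(k+1) at the stops x_k = 1 − 1/2^k (k ≥ 1) are read off from the definition of Z. Any rational strictly inside a gap would do; the midpoint is just one convenient choice.

```python
from fractions import Fraction

def gap(k):
    """The open interval (Z(x_k-), Z(x_k+)) at x_k = 1 - 1/2^k, for k >= 1."""
    return 1 - Fraction(1, 2**k), 1 - Fraction(1, 2**(k + 1))

def pick_rational(k):
    """Choose a rational label inside the k-th gap (here: its midpoint)."""
    lo, hi = gap(k)
    return (lo + hi) / 2

labels = [pick_rational(k) for k in range(1, 11)]

# Consecutive gaps satisfy f(x+) <= f(y-) for x < y, so they cannot overlap
# and the chosen labels are pairwise different.
print(labels[:4])
print(len(set(labels)) == len(labels))
```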
The following is a very simple and useful characterization of continuous mono-
tone functions on intervals.
Theorem 4.26. A monotone function on R is continuous on R if and only if
its range is an interval.
Proof. By the intermediate value theorem, we already know that the range
of any (in particular, a monotone) continuous function on R is an interval. Let
f : R −→ R be monotone and suppose, for concreteness, that f is nondecreasing,
the case for a nonincreasing function being similar. It remains to prove that if the
range of f is an interval, then f is continuous. We shall prove the contrapositive.
So, assume that f is not continuous on R. Then at some point c, we have
f (c−) < f (c+).
Since f is nondecreasing, this inequality implies that either interval (f (c−), f (c)) or
(f (c), f (c+)), whichever is nonempty, is not contained in the range of f . Therefore,
the range of f cannot be an interval. 
4.5.2. Monotone inverse theorem. Recall from Section 1.3 that a function
has an inverse if and only if the function is injective, that is, one-to-one. Notice
that a strictly monotone function f : R −→ R is one-to-one since, for instance, if
f is strictly increasing, then x ≠ y, say x < y, implies that f (x) < f (y), which
in particular says that f (x) ≠ f (y). Thus a strictly monotone function is one-to-one.
The last result in this section states that a one-to-one continuous function is
automatically strictly monotone. This result makes intuitive sense for if the graph
of the function had a dip in it, the function would not pass the so-called “horizontal
line test” learned in high school.
Theorem 4.27 (Monotone inverse theorem). A one-to-one continuous
function f : R −→ R is strictly monotone, its range is an interval, and it has
a continuous strictly monotone inverse (with the same monotonicity as f ).
Proof. Let f : R −→ R be a one-to-one continuous function. We shall prove
that f is strictly monotone. Fix points x0 < y0 . Then f (x0 ) ≠ f (y0 ) so either
f (x0 ) < f (y0 ) or f (x0 ) > f (y0 ). For concreteness, assume that f (x0 ) < f (y0 ); the
other case f (x0 ) > f (y0 ) can be dealt with analogously. We claim that f is strictly
increasing. Indeed, if this is not the case, then there exist points x1 < y1 such that
f (y1 ) < f (x1 ). Now consider the function g : [0, 1] → R defined by
g(t) = f (ty0 + (1 − t)y1 ) − f (tx0 + (1 − t)x1 ).
Since f is continuous, g is continuous, and
g(0) = f (y1 ) − f (x1 ) < 0 and g(1) = f (y0 ) − f (x0 ) > 0.
Hence by the IVP, there is a c ∈ [0, 1] such that g(c) = 0. This implies that
f (a) = f (b) where a = cx0 + (1 − c)x1 and b = cy0 + (1 − c)y1 . Since f is one-to-
one, we must have a = b; however, this is impossible since x0 < y0 and x1 < y1
imply a < b. This contradiction shows that f must be strictly monotone.
Now let f : R −→ R be a continuous strictly monotone function and let I =
f (R). By Theorem 4.26, we know that I is an interval too. We shall prove that
f^{−1} : I −→ R is also a strictly monotone function; then Theorem 4.26 implies
that f^{−1} is continuous. Now suppose, for instance, that f is strictly increasing; we
shall prove that f^{−1} is also strictly increasing. If x < y in I, then we can write
x = f (ξ) and y = f (η) for some ξ and η in R. Since f is strictly increasing, ξ < η
(otherwise η ≤ ξ would give y = f (η) ≤ f (ξ) = x), and
hence, f^{−1}(x) = ξ < η = f^{−1}(y). Thus, f^{−1} is strictly increasing and our proof is
complete.

Here is a nice application of the monotone inverse theorem.


Example 4.31. Note that f (x) = x^n is continuous and strictly increasing on [0, ∞).
Therefore f^{−1}(x) = x^{1/n} is continuous. In particular, for any m ∈ N,
g(x) = x^{m/n} = (x^{1/n})^m is continuous on [0, ∞) being a composition of the continuous
functions f^{−1} and the m-th power. Similarly, x ↦ x^{m/n} when m ∈ Z with
m < 0 is continuous on (0, ∞). Therefore, for any r ∈ Q, x ↦ x^r is continuous on
[0, ∞) if r ≥ 0 and on (0, ∞) if r < 0.
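Example 4.31 can be explored numerically. The following Python sketch is an illustration and not part of the original text; the names nth_root and rational_power, the tolerance, and the choice of bisection as the inversion method are all assumptions. The point is that inverting f(x) = x^n needs nothing beyond continuity and strict monotonicity of f on [0, ∞).

```python
def nth_root(y, n, tol=1e-12):
    """Approximate y**(1/n) for y >= 0 by inverting f(x) = x**n on [0, inf)
    with bisection. Illustrative helper; tol is a relative tolerance."""
    if y < 0 or n < 1:
        raise ValueError("need y >= 0 and n >= 1")
    lo, hi = 0.0, max(1.0, y)            # f(lo) <= y <= f(hi)
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2
        if mid**n < y:
            lo = mid                     # the root lies to the right of mid
        else:
            hi = mid                     # the root lies at or to the left of mid
    return (lo + hi) / 2

def rational_power(x, m, n):
    """x**(m/n) computed as (x**(1/n))**m, the composition used in the example."""
    return nth_root(x, n) ** m

print(nth_root(2.0, 2), rational_power(8.0, 2, 3))
```

For negative m the example restricts to x > 0, and the same restriction should be observed when calling rational_power.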
Exercises 4.5.
1. Prove the following algebraic properties of nondecreasing functions:
(a) If f and g are nondecreasing, then f + g is nondecreasing.
(b) If f and g are nondecreasing and nonnegative, then f g is nondecreasing.
(c) Does (b) hold for any (not necessarily nonnegative) nondecreasing functions? Ei-
ther prove it or give a counterexample.
2. Here is a different way to prove that a monotone function has at most countably many
discontinuities. Let f : [a, b] −→ R be nondecreasing.
(i) Given any finite number x1 , . . . , xk of points in (a, b), prove that
d(x1 ) + · · · + d(xk ) ≤ f (b) − f (a), where d(x) := f (x+) − f (x−).
(ii) Given any n ∈ N, prove that there are only a finite number of points c ∈ [a, b]
such that f (c+) − f (c−) > 1/n.
(iii) Now prove that f can have at most countably many discontinuities.
3. Let f : R −→ R be a monotone function. Prove that if f happens to also be additive
(see Problem 3 in Exercises 4.3), then f is continuous. Thus, any additive monotone
function is continuous.
4. In this problem we investigate jump functions. Let x_1 , x_2 , . . . be countably many points
on the real line and let c_1 , c_2 , . . . be nonzero complex numbers such that Σ c_n is
absolutely convergent. For x ∈ R, the functions
(4.23)    ϕ_ℓ(x) = Σ_{x_n < x} c_n    and    ϕ_r(x) = Σ_{x_n ≤ x} c_n
are called a (left-continuous) jump function and (right-continuous) jump function,
respectively. More precisely, ϕ_ℓ(x) := lim s_n(x) and ϕ_r(x) := lim t_n(x) where
s_n(x) := Σ_{k ≤ n, x_k < x} c_k    and    t_n(x) := Σ_{k ≤ n, x_k ≤ x} c_k ;
thus, e.g. for s_n(x) we only sum over those c_k ’s such that k ≤ n and also x_k < x.
(a) Prove that ϕ_ℓ , ϕ_r : R −→ C are well-defined for all x ∈ R (that is, the two infinite
series (4.23) make sense for all x ∈ R).
(b) If all the c_n ’s are nonnegative real numbers, prove that ϕ_ℓ and ϕ_r are nondecreasing
functions on R.
(c) If all the c_n ’s are nonpositive real numbers, prove that ϕ_ℓ and ϕ_r are nonincreasing
functions on R.
5. In this problem we prove that ϕ_r in (4.23) is right-continuous having only jump
discontinuities at x_1 , x_2 , . . . with the jump at x_n equal to c_n . To this end, let ε > 0. Since
Σ |c_n| converges, by Cauchy’s criterion for series, we can choose N so that
(4.24)    Σ_{n ≥ N+1} |c_n| < ε.
(i) Prove that for any δ > 0,
ϕ_r(x + δ) − ϕ_r(x) = Σ_{x < x_n ≤ x+δ} c_n .
Using (4.24) prove that for δ > 0 sufficiently small, |ϕ_r(x + δ) − ϕ_r(x)| < ε.
(ii) Prove that for any δ > 0,
ϕ_r(x) − ϕ_r(x − δ) = Σ_{x−δ < x_n ≤ x} c_n .
If x is not one of the points x_1 , . . . , x_N , using (4.24) prove that for δ > 0 sufficiently
small, |ϕ_r(x) − ϕ_r(x − δ)| < ε.
(iii) If x = x_k for some 1 ≤ k ≤ N , prove that for δ > 0 sufficiently small,
|ϕ_r(x) − ϕ_r(x − δ) − c_k| < ε.
(iv) Finally, prove that ϕ_r is right-continuous having only jump discontinuities at
x_1 , x_2 , . . . with the jump at x_n equal to c_n .
6. Prove that ϕ_ℓ is left-continuous having only jump discontinuities at x_1 , x_2 , . . . with
the jump at x_n equal to c_n , with the notation given in (4.23).
7. (Generalized Thomae functions) In this problem we generalize Thomae’s function
to arbitrary countable sets. Let A ⊆ R be a countable set.
(a) Define a nondecreasing function on R that is discontinuous exactly on A.
(b) Suppose that A is dense. (Dense is defined in Subsection 4.3.3.) Prove that there
does not exist a function on R that is discontinuous exactly on A^c .
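The jump functions of Problem 4 in Exercises 4.5 can be tried out on a finite truncation of the series (4.23). The Python sketch below is illustrative and not part of the original text; the factory name jump_functions and the sample data x_n = 1/n, c_n = 1/2^n for n = 1, . . . , 20 are assumptions. The only difference between the two functions is whether the jump point itself is included in the sum.

```python
def jump_functions(points, coeffs):
    """Return (phi_l, phi_r) for finitely many jump points x_n with weights c_n.
    This is a finite truncation of the series in (4.23)."""
    pairs = list(zip(points, coeffs))

    def phi_l(x):
        # Left-continuous version: sum over x_n < x.
        return sum(c for xn, c in pairs if xn < x)

    def phi_r(x):
        # Right-continuous version: sum over x_n <= x.
        return sum(c for xn, c in pairs if xn <= x)

    return phi_l, phi_r

# Sample data (assumed for illustration): x_n = 1/n and c_n = 1/2^n.
pts = [1 / n for n in range(1, 21)]
cs = [1 / 2**n for n in range(1, 21)]
phi_l, phi_r = jump_functions(pts, cs)

# At the jump point x_2 = 1/2 the two functions differ by exactly c_2.
print(phi_r(0.5) - phi_l(0.5))
```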
4.6. Exponentials, logs, Euler and Mascheroni, and the ζ-function


We now come to a very fun part of real analysis: We apply our work done in
the preceding chapters and sections to study the so-called “elementary transcen-
dental functions,” the exponential, logarithmic, and trigonometric functions. In
particular, we develop the properties of undoubtedly the most important function
in all of analysis, the exponential function. We also study logarithms and (complex)
powers and derive some of their main properties. For another approach to defining
logarithms, see the interesting article [12] and for a brief history, [182]. In Section
4.7 we define the trigonometric functions.
4.6.1. The exponential function. Recall that (see Section 3.7) the exponential
function is defined by
exp(z) := Σ_{n=0}^{∞} z^n / n! ,    z ∈ C.
Some properties of the exponential function are found in Theorem 3.31. Here’s
another important property.
Theorem 4.28. The exponential function exp : C −→ C is continuous.
Proof. Given any c ∈ C, using properties (2) and (3) of Theorem 3.31, we
obtain
(4.25)    exp(z) − exp(c) = exp(c) · [exp(−c) exp(z) − 1] = exp(c) · [exp(z − c) − 1].
Observe that
exp(z − c) − 1 = Σ_{n=0}^{∞} (z − c)^n / n! − 1 = Σ_{n=1}^{∞} (z − c)^n / n! .
If |z − c| < 1, then for n = 1, |z − c|^n = |z − c| and for n > 1,
|z − c|^n = |z − c| · |z − c|^{n−1} ≤ |z − c| · 1 = |z − c|,
so by our triangle inequality for series (see Theorem 3.29), we have
(4.26)    |z − c| < 1 =⇒
| exp(z − c) − 1| ≤ Σ_{n=1}^{∞} |z − c|^n / n! ≤ |z − c| Σ_{n=1}^{∞} 1/n! = |z − c| · (e − 1).
Now let ε > 0. Then choosing δ = min{1, ε/(| exp(c)|(e − 1))}, we see that for
|z − c| < δ, we have
| exp(z) − exp(c)| = | exp(c)| · | exp(z − c) − 1|    (by (4.25))
                   ≤ | exp(c)| · |z − c| · (e − 1)    (by (4.26))
                   < ε.
This completes the proof of the theorem. 
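The estimate (4.26) and the choice of δ in this proof can be checked numerically. The Python sketch below is an illustration, not part of the original text; the helper names check_estimate and delta_for and the sample point c = 1 + 2i are assumptions, and the standard library function cmath.exp is used as a stand-in for the series definition of exp.

```python
import cmath
import math

def check_estimate(c, z):
    """Return (lhs, rhs) for |exp(z - c) - 1| <= |z - c| (e - 1),
    the estimate (4.26), which is stated for |z - c| < 1."""
    h = z - c
    if abs(h) >= 1:
        raise ValueError("estimate (4.26) is stated for |z - c| < 1")
    return abs(cmath.exp(h) - 1), abs(h) * (math.e - 1)

def delta_for(c, eps):
    """The delta chosen in the proof: min{1, eps / (|exp(c)| (e - 1))}."""
    return min(1.0, eps / (abs(cmath.exp(c)) * (math.e - 1)))

c = 1 + 2j
eps = 1e-3
d = delta_for(c, eps)
z = c + 0.9 * d * cmath.exp(0.7j)        # a point with |z - c| = 0.9 * delta
lhs, rhs = check_estimate(c, z)
print(lhs <= rhs, abs(cmath.exp(z) - cmath.exp(c)) < eps)
```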
An easy induction argument using (2) shows that for any complex numbers
z1 , . . . , zn , we have
exp(z1 + · · · + zn ) = exp(z1 ) · · · exp(zn ).
We now restrict the exponential function to real variables z = x ∈ R:
exp(x) = Σ_{n=0}^{∞} x^n / n! ,    x ∈ R.
Figure 4.10. The graph of exp : R −→ (0, ∞) looks like the graph you learned in high
school! Since the exponential function is strictly increasing, it has an inverse function
exp^{−1}, which we call the logarithm, log : (0, ∞) −→ R. (The figure shows the curves
y = exp x, through the point (0, 1), and y = log x, through the point (1, 0).)

In particular, the right-hand side, being a sum of real numbers, is a real number, so
exp : R −→ R. Of course, this real exponential function has all of the properties
(1)–(4) that the complex one does. In the following theorem we show that this real-valued
exponential function has the increasing/decreasing properties you learned
about in elementary calculus; see Figure 4.10.
Theorem 4.29 (Properties of the real exponential). The real exponential
function has the following properties:
(1) exp : R −→ (0, ∞) is a strictly increasing continuous bijection. Moreover,
lim_{x→∞} exp(x) = ∞ and lim_{x→−∞} exp(x) = 0.
(2) For any x ∈ R, we have
1 + x ≤ exp(x)
with strict inequality for x ≠ 0, that is, 1 + x < exp(x) for x ≠ 0.
Proof. Observe that
exp(x) = 1 + x + x^2/2! + x^3/3! + · · · ≥ 1 + x,    x ≥ 0,
with strict inequalities for x > 0. In particular, exp(x) > 0 for x ≥ 0 and the
inequality exp(x) ≥ 1 + x shows that lim_{x→∞} exp(x) = ∞. If x < 0, then −x > 0,
so exp(−x) > 0, and therefore by Property (3) of Theorem 3.31,
exp(x) = 1/exp(−x) > 0.
Thus, exp(x) is positive for all x ∈ R and recalling Example 4.14, we see that
lim_{x→−∞} exp(x) = lim_{x→−∞} 1/exp(−x) = lim_{x→∞} 1/exp(x) = 0.

(As a side note, we can also get exp(x) > 0 for all x ∈ R by noting that exp(x) =
exp(x/2) · exp(x/2) = (exp(x/2))^2 .) If x < y, then y − x > 0, so exp(y − x) ≥
1 + (y − x) > 1, and thus,
exp(x) < exp(y − x) · exp(x) = exp(y − x + x) = exp(y).
Thus, exp is strictly increasing on R. The continuity property of exp implies that
exp(R) is an interval and then the limit properties of exp imply that this interval
must be (0, ∞). Thus, exp : R −→ (0, ∞) is onto (since exp(R) = (0, ∞)) and
injective (since exp is strictly increasing) and therefore is a continuous bijection.
Finally, we verify (2). We already know that exp(x) ≥ 1 + x for x ≥ 0. If
x ≤ −1, then 1 + x ≤ 0 so our inequality is automatically satisfied since exp(x) > 0.
If −1 < x < 0, then by the series expansion for exp, we have
exp(x) − (1 + x) = (x^2/2! + x^3/3!) + (x^4/4! + x^5/5!) + · · · ,
where we group the terms in pairs. A typical term in parentheses is of the form
x^{2k}/(2k)! + x^{2k+1}/(2k + 1)! = (x^{2k}/(2k)!) · (1 + x/(2k + 1)),    k = 1, 2, 3, . . . .
For −1 < x < 0, 1 + x/(2k + 1) is positive and so is x^{2k} (being a perfect square). Hence,
being a sum of positive numbers, exp(x) − (1 + x) is positive for −1 < x < 0. 
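The inequality 1 + x ≤ exp(x) is easy to probe numerically with partial sums of the defining series. The Python sketch below is an illustration, not part of the original text; the function name exp_series, the number of terms, and the sample points are assumptions, and a finite partial sum only approximates exp(x).

```python
def exp_series(x, terms=60):
    """Partial sum of sum_{n >= 0} x^n / n!, a numerical stand-in for exp(x)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term            # add x^n / n!
        term *= x / (n + 1)      # next term: x^(n+1) / (n+1)!
    return total

samples = [-3.0, -1.0, -0.5, -0.01, 0.0, 0.01, 0.5, 2.0]
gaps = [exp_series(x) - (1 + x) for x in samples]
for x, g in zip(samples, gaps):
    print(x, g)
```

Each printed gap exp(x) − (1 + x) should be nonnegative, and zero only at x = 0, in line with part (2) of the theorem.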
The inequality 1 + x ≤ exp(x) is quite useful and we will have many opportunities
to use it in the sequel; see Problem 4 for a nice application to the AGMI.
4.6.2. Existence and properties of logarithms. Since exp : R −→ (0, ∞)
is a strictly increasing continuous bijection (so in particular is one-to-one), by the
monotone inverse theorem (Theorem 4.27) this function has a strictly increasing
continuous bijective inverse exp^{−1} : (0, ∞) −→ R. This function is called the
logarithm function⁵ and is denoted by log = exp^{−1},
log = exp^{−1} : (0, ∞) −→ R.
By definition of the inverse function, log satisfies
(4.27) exp(log x) = x, x ∈ (0, ∞) and log(exp x) = x, x ∈ R.
The logarithm is usually introduced as follows. If a > 0, then the unique real
number ξ having the property that
exp(ξ) = a
is called the logarithm of a, where ξ is unique because exp : R −→ (0, ∞) is
bijective. Note that ξ = log a by the second equation in (4.27):
ξ = log(exp(ξ)) = log a.
Theorem 4.30. The logarithm log : (0, ∞) −→ R is a strictly increasing continuous
bijection. Moreover, lim_{x→∞} log x = ∞ and lim_{x→0+} log x = −∞.
Proof. We already know that log is a strictly increasing continuous bijec-
tion. The limit properties of log follow directly from the limit properties of the
exponential function in Part (1) of the previous theorem, as you can check. 
The following theorem lists some of the well-known properties of log.
Theorem 4.31 (Properties of the logarithm). The logarithm has the following
properties:
(1) exp(log x) = x and log(exp x) = x.
(2) log(xy) = log x + log y.
(3) log 1 = 0 and log e = 1.
(4) log(x/y) = log x − log y.
(5) log x < log y if and only if x < y.
(6) log x > 0 if x > 1 and log x < 0 if x < 1.
⁵ In elementary calculus classes, our logarithm function is denoted by ln and is called the
natural logarithm function; the notation log usually refers to the “base 10” logarithm.
However, in more advanced mathematics, log always refers to the natural logarithm function:
Mathematics is the art of giving the same name to different things. Henri Poincaré (1854–1912).
[As opposed to the quotation: Poetry is the art of giving different names to the same thing.]
Proof. We shall leave most of these proofs to the reader. The property (1)
follows from the fact that exp and log are inverse functions. Consider now the proof
of (2). We have
exp(log(xy)) = xy.
On the other hand,
exp(log x + log y) = exp(log x) exp(log y) = xy,
so
exp(log(xy)) = exp(log x + log y).
Since exp is one-to-one, we must have log(xy) = log x + log y. To prove (3), observe
that
exp(0) = 1 = exp(log 1),
so, because exp is one-to-one, log 1 = 0. Also, since
exp(1) = e = exp(log e),
by uniqueness, 1 = log e. We leave the rest of the properties to the reader. 
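Since log is defined as the inverse of a strictly increasing continuous bijection, it can be approximated by solving exp(ξ) = a directly. The Python sketch below is an illustration and not part of the original text; the name log_by_inversion, the tolerance, and the bracket-doubling strategy are assumptions, and math.exp stands in for the series definition of exp. It is intended for moderately sized a, since math.exp overflows for very large arguments.

```python
import math

def log_by_inversion(a, tol=1e-12):
    """Approximate the unique xi with exp(xi) = a, for a > 0, by bisection.
    Uses only that exp is continuous and strictly increasing."""
    if a <= 0:
        raise ValueError("log is defined on (0, inf)")
    lo, hi = -1.0, 1.0
    while math.exp(lo) > a:      # widen until exp(lo) <= a
        lo *= 2
    while math.exp(hi) < a:      # widen until a <= exp(hi)
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if math.exp(mid) < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Property (2): log(xy) = log x + log y, checked on sample values.
print(log_by_inversion(6.0), log_by_inversion(2.0) + log_by_inversion(3.0))
```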
4.6.3. Powers and roots of real numbers. Recall that in Section 2.7, we
defined the meaning of a^r for a > 0 and r ∈ Q; namely if r = m/n with m ∈ Z
and n ∈ N, then a^r = (\sqrt[n]{a})^m . We also proved that these rational powers satisfy
all the “power rules” that we learned in high school (see Theorem 2.33). We now
ask: Can we define a^x for x an arbitrary irrational number? In fact, we shall now
define a^z for z an arbitrary complex number!
Given any positive real number a and complex number z, we define
a^z := exp(z log a).
The number a is called the base and z is called the exponent. The astute student
might ask: What if z = k is an integer; does this definition of a^k agree with our
usual definition of k products of a? What about if z = p/q ∈ Q; is the
definition of a^{p/q} as exp((p/q) log a) in agreement with our previous definition as
\sqrt[q]{a^p}? We answer these questions and more in the following theorem.
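The definition a^z := exp(z log a) translates directly into code. The Python sketch below is an illustration and not part of the original text; the function name power and the sample values are assumptions, and the standard library functions cmath.exp and math.log stand in for the exp and log constructed in this section.

```python
import cmath
import math

def power(a, z):
    """a^z := exp(z log a) for a real base a > 0 and a complex exponent z."""
    if a <= 0:
        raise ValueError("the base a must be a positive real number")
    return cmath.exp(z * math.log(a))

# Integer exponent: compare with repeated multiplication.
print(power(2.0, 5), 2.0 * 2.0 * 2.0 * 2.0 * 2.0)
# Rational exponent p/q: the q-th power of a^(p/q) should recover a^p.
print(power(3.0, 2 / 5) ** 5, 3.0 ** 2)
# A genuinely complex exponent: a^(i) has modulus 1 for every a > 0.
print(power(2.0, 1j), abs(power(2.0, 1j)))
```

The first two comparisons are numerical instances of the two questions raised above, up to floating-point rounding.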
Theorem 4.32 (Generalized power rules). For any real a, b > 0, we have
(1) $a^k = a \cdot a \cdots a$ (k times) for any integer k.
(2) $e^z = \exp z$ for all z ∈ C.
(3) $\log x^y = y \log x$ for all x, y > 0.
(4) For any x ∈ R, z, w ∈ C,
\[ a^z \cdot a^w = a^{z+w}; \qquad a^z \cdot b^z = (ab)^z; \qquad (a^x)^z = a^{xz}. \]
(5) If z = p/q ∈ Q, then $a^{p/q} = \sqrt[q]{a^p}$.
(6) If a > 1, then $x \mapsto a^x$ is a strictly increasing continuous bijection of R onto
(0, ∞) and $\lim_{x\to\infty} a^x = \infty$ and $\lim_{x\to-\infty} a^x = 0$. On the other hand, if
0 < a < 1, then $x \mapsto a^x$ is a strictly decreasing continuous bijection of R onto
(0, ∞) and $\lim_{x\to\infty} a^x = 0$ and $\lim_{x\to-\infty} a^x = \infty$.
(7) If a, b > 0 and x > 0, then a < b if and only if $a^x < b^x$.


Proof. By definition of $a^k$ and the additive property of the exponential,
\[ a^k = \exp(k \log a) = \exp(\underbrace{\log a + \cdots + \log a}_{k\ \text{times}}) = \underbrace{\exp(\log a) \cdots \exp(\log a)}_{k\ \text{times}} = \underbrace{a \cdots a}_{k\ \text{times}}, \]
which proves (1). Since log e = 1, we have
\[ e^z = \exp(z \log e) = \exp(z), \]
which is just (2).
To prove (3), observe that
\[ \exp(\log(x^y)) = x^y = \exp(y \log x). \]
Since the exponential is one-to-one, we have $\log(x^y) = y \log x$.
If x ∈ R and z, w ∈ C, then the following computations prove (4):
\[ a^z \cdot a^w = \exp(z \log a) \exp(w \log a) = \exp(z \log a + w \log a) = \exp\big((z + w) \log a\big) = a^{z+w}; \]
\[ a^z \cdot b^z = \exp(z \log a) \exp(z \log b) = \exp(z \log a + z \log b) = \exp\big(z \log(ab)\big) = (ab)^z, \]
and
\[ (a^x)^z = \exp(z \log a^x) = \exp(xz \log a) = a^{xz}. \]
To prove (5), observe that by the last formula in (4),
\[ \left(a^{p/q}\right)^q = a^{(p/q)q} = a^p. \]
Therefore, since $a^{p/q} > 0$, by uniqueness of roots (Theorem 2.31), $a^{p/q} = \sqrt[q]{a^p}$.
We leave the reader to verify that since exp : R −→ (0, ∞) is a strictly increasing
continuous bijection with the limits $\lim_{x\to\infty} \exp(x) = \infty$ and $\lim_{x\to-\infty} \exp(x) = 0$, then for
any b > 0, exp(bx) is also a strictly increasing continuous bijection of R onto (0, ∞)
and $\lim_{x\to\infty} \exp(bx) = \infty$ and $\lim_{x\to-\infty} \exp(bx) = 0$. On the other hand, if b < 0,
say b = −c where c > 0, then these properties are reversed: exp(−cx) is a strictly
decreasing continuous bijection of R onto (0, ∞) and $\lim_{x\to\infty} \exp(-cx) = 0$ and
$\lim_{x\to-\infty} \exp(-cx) = \infty$. With this discussion in mind, note that if a > 1, then
log a > 0 (Property (6) of Theorem 4.31), so $a^x = \exp(x \log a) = \exp(bx)$, where b = log a > 0, has the
required properties in (6); if 0 < a < 1, then log a < 0, so $a^x = \exp(x \log a) = \exp(-cx)$,
where c = − log a > 0, has the required properties in (6).
Finally, to verify (7), observe that for a, b > 0 and x > 0, using the fact that
log and exp are strictly increasing, we obtain
\[ a < b \iff \log a < \log b \iff x \log a < x \log b \iff a^x = \exp(x \log a) < \exp(x \log b) = b^x. \]

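The definition $a^z := \exp(z \log a)$ and the rules in Theorem 4.32 can be spot-checked numerically. The following is only a sketch in Python (not part of the text's development): the helper name `power` and the sample values of a, b, z, w, x are arbitrary choices made for illustration, and floating-point agreement is evidence, not a proof.

```python
import cmath
import math


def power(a: float, z: complex) -> complex:
    """Return a**z for real a > 0, computed as exp(z * log a)."""
    if a <= 0:
        raise ValueError("the base a must be a positive real number")
    return cmath.exp(z * math.log(a))


a, b = 2.0, 3.5                                  # arbitrary positive bases
z, w = complex(0.3, -1.2), complex(-0.7, 0.4)    # arbitrary complex exponents
x = 1.75                                         # arbitrary real exponent

# Rule (4): each (lhs, rhs) pair should agree up to rounding error.
rule4 = [
    (power(a, z) * power(a, w), power(a, z + w)),
    (power(a, z) * power(b, z), power(a * b, z)),
    (power(power(a, x).real, z), power(a, x * z)),
]
for lhs, rhs in rule4:
    print(abs(lhs - rhs))

# Rule (5) with p/q = 2/3: the cube of a^(2/3) should equal a^2.
print(power(a, 2 / 3).real ** 3, a ** 2)
```

Note that `power(a, x)` is a positive real number stored as a complex value with zero imaginary part, which is why its real part is taken before it is used as a base.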
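Example 4.32. Using Tannery’s theorem, we shall prove the pretty formula
\[ \frac{e}{e-1} = \lim_{n\to\infty} \left\{ \left(\frac{n}{n}\right)^n + \left(\frac{n-1}{n}\right)^n + \left(\frac{n-2}{n}\right)^n + \cdots + \left(\frac{1}{n}\right)^n \right\}. \]
To prove this, we write the right-hand side as
\[ \lim_{n\to\infty} \left\{ \left(\frac{n}{n}\right)^n + \left(\frac{n-1}{n}\right)^n + \cdots + \left(\frac{1}{n}\right)^n \right\} = \lim_{n\to\infty} \sum_{k=0}^{\infty} a_k(n), \]
where $a_k(n) := 0$ for k ≥ n and for 0 ≤ k ≤ n − 1,
\[ a_k(n) := \left(\frac{n-k}{n}\right)^n = \left(1 - \frac{k}{n}\right)^n. \]
Observe that
\[ \lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \left(1 - \frac{k}{n}\right)^n = e^{-k} \]
exists. Also, for k ≤ n − 1,
\[ |a_k(n)| = \left(1 - \frac{k}{n}\right)^n \le \left(e^{-k/n}\right)^n = e^{-k}, \]
where we used that $1 + x \le e^x$ for all x ∈ R from Theorem 4.29. Since $a_k(n) = 0$
for k ≥ n, it follows that $|a_k(n)| \le M_k$ for all k, n where $M_k = e^{-k}$. Since $e^{-1} < 1$,
by the geometric series test, $\sum_{k=0}^{\infty} M_k = \sum_{k=0}^{\infty} (e^{-1})^k < \infty$. Hence by Tannery’s
theorem, we have
\[ \lim_{n\to\infty} \left\{ \left(\frac{n}{n}\right)^n + \left(\frac{n-1}{n}\right)^n + \cdots + \left(\frac{1}{n}\right)^n \right\} = \lim_{n\to\infty} \sum_{k=0}^{\infty} a_k(n) = \sum_{k=0}^{\infty} \lim_{n\to\infty} a_k(n) = \sum_{k=0}^{\infty} e^{-k} = \frac{1}{1 - 1/e} = \frac{e}{e-1}. \]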

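To see the limit of Example 4.32 emerge numerically, one can evaluate the finite sums for growing n. This is a minimal sketch in Python; the function name `tannery_sum` and the chosen values of n are illustrative assumptions, not part of the text.

```python
import math


def tannery_sum(n: int) -> float:
    """Return (n/n)^n + ((n-1)/n)^n + ... + (1/n)^n."""
    # The k-th term is a_k(n) = (1 - k/n)^n for k = 0, 1, ..., n - 1.
    return sum((1 - k / n) ** n for k in range(n))


target = math.e / (math.e - 1)

for n in (10, 100, 1000, 10000):
    s = tannery_sum(n)
    # Each term is bounded by M_k = e^(-k), so s stays below the target.
    print(n, s, target - s)
```

The printed gap shrinks as n grows, consistent with the bound $a_k(n) \le e^{-k}$ used in the example.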
Example 4.33. Here’s a puzzle: Do there exist rational numbers α and β
such that $\alpha^\beta$ is irrational? You should be able to answer this in the affirmative!
Here’s a harder question [108]: Do there exist irrational numbers α and β such that
$\alpha^\beta$ is rational? Here’s a very cool argument to the affirmative. Consider $\alpha = \sqrt{2}$
and $\beta = \sqrt{2}$, both of which are irrational. Then there are two cases: either $\alpha^\beta$ is
rational or irrational. If $\alpha^\beta$ is rational, then we have answered our question in the
affirmative. However, in the case that $\alpha' := \alpha^\beta$ is irrational, then by our rule (4)
of exponents,
\[ (\alpha')^\beta = \left(\alpha^\beta\right)^\beta = \alpha^{\beta^2} = \left(\sqrt{2}\right)^2 = 2 \]
is rational, so we have answered our question in the affirmative in this case as well.
Do there exist irrational numbers α and β such that $\alpha^\beta$ is irrational? For the
answer, see Problem 6.
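The identity at the heart of this argument is easy to check in floating point. The sketch below (Python, with variable names chosen here for illustration) only confirms that $(\sqrt{2}^{\sqrt{2}})^{\sqrt{2}}$ equals 2 up to rounding; a floating-point computation cannot say anything about which of the two cases in the argument actually occurs.

```python
import math

alpha = beta = math.sqrt(2)

alpha_prime = alpha ** beta          # sqrt(2) ** sqrt(2), roughly 1.63
result = alpha_prime ** beta         # rule (4): equals alpha ** (beta * beta)

print(alpha_prime)
print(result)                        # 2 up to rounding error
```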
4.6.4. The Riemann zeta function. The last two subsections are applica-
tions of what we’ve learned about exponentials, logs, and powers. We begin with
the Riemann zeta-function, which is involved in one of the most renowned unsolved
problems in all of mathematics: The Riemann hypothesis. If you want to be fa-
mous and earn one million dollars too, just prove the Riemann hypothesis (see
http://www.claymath.org/millennium/ and [53] for hints on how one may try to
solve this conjecture); for now, our goal is simply to introduce this function. Ac-
tually, this function is simply a “generalized p-series” where instead of using p, a
rational number, we use a complex number:

\[ \zeta(z) := \sum_{n=1}^{\infty} \frac{1}{n^z} = 1 + \frac{1}{2^z} + \frac{1}{3^z} + \frac{1}{4^z} + \cdots. \]
Theorem 4.33 (The Riemann zeta function). The Riemann zeta-function
converges absolutely for all z ∈ C with Re z > 1.
Proof. Let p be an arbitrary rational number with p > 1; then we just have
to prove that ζ(z) converges absolutely for all z ∈ C with Re z ≥ p. To see this, let
z = x + iy with x ≥ p and observe that $n^z = e^{z \log n} = e^{x \log n} \cdot e^{iy \log n}$. In Problem
1d you’ll prove that $|e^{i\theta}| = 1$ for any real θ, so $|e^{iy \log n}| = 1$ and hence,
\[ |n^z| = |e^{x \log n} \cdot e^{iy \log n}| = e^{x \log n} \ge e^{p \log n} = n^p. \]
Therefore, $|1/n^z| \le 1/n^p$, so by comparison with the p-series $\sum 1/n^p$, it follows
that $\sum |1/n^z|$ converges. This completes our proof. □
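The two facts driving this proof, $|n^z| = n^{\operatorname{Re} z}$ and the termwise comparison with a p-series, can be illustrated numerically. The following Python sketch assumes an arbitrary sample point z = 2.5 + 7i and p = 2; the helper name `n_to_the` and the number of terms are likewise illustrative choices.

```python
import cmath
import math


def n_to_the(z: complex, n: int) -> complex:
    """Return n**z computed as exp(z * log n), as in the text."""
    return cmath.exp(z * math.log(n))


z = complex(2.5, 7.0)   # arbitrary point with Re z > 1
p = 2.0                 # a rational p with 1 < p <= Re z

# |n^z| should equal n^x, where x = Re z; the imaginary part only rotates.
for n in (2, 10, 1000):
    print(n, abs(n_to_the(z, n)), n ** z.real)

# Partial sums of sum |1/n^z| are dominated by those of the p-series.
terms = 5000
abs_partial = math.fsum(1 / abs(n_to_the(z, n)) for n in range(1, terms + 1))
p_partial = math.fsum(1 / n ** p for n in range(1, terms + 1))
print(abs_partial, p_partial)
```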
The ζ-function has profound implications to prime numbers; see Section 7.6.
4.6.5. The Euler-Mascheroni constant. The constant
\[ \gamma := \lim_{n\to\infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \right) \]
is called the Euler-Mascheroni constant. This constant was calculated to 16
digits by Euler in 1781, who used the notation C for γ. The symbol γ was first
used by Lorenzo Mascheroni (1750–1800) in 1790 when he computed γ to 32 decimal
places, although only the first 19 places were correct (cf. [96, pp. 90–91]). To prove
that the limit on the right of γ exists, consider the sequence
\[ \gamma_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n, \qquad n = 1, 2, 3, \ldots. \]
We shall prove that $\gamma_n$ is nonincreasing and bounded below and hence the Euler-
Mascheroni constant is defined. In our proof, we shall see that γ is between 0 and 1;
its decimal expansion begins γ = 0.5772156649 . . .. Here’s a mnemonic to remember
the digits of γ [236]:
(4.28) These numbers proceed to a limit Euler’s subtle mind discerned.
The number of letters in each word represents a digit of γ; e.g. “These” represents
5, “numbers” 7, etc. The sentence (4.28) gives ten digits of γ: .5772156649. By the
way, it is not known6 whether γ is rational or irrational, let alone transcendental!
To prove that $\{\gamma_n\}$ is a bounded monotone sequence, we shall need the following
inequality proved in Section 3.3 (see (3.28)):
\[ \left(\frac{n+1}{n}\right)^n < e < \left(\frac{n+1}{n}\right)^{n+1} \quad \text{for all } n \in \mathbb{N}. \]
Taking the logarithm of both sides of the first inequality and using the fact that
log is strictly increasing, we get
\[ n \big( \log(n+1) - \log n \big) = n \log\left(\frac{n+1}{n}\right) < \log e = 1, \]
and doing the same thing to the second inequality gives
\[ 1 = \log e < (n+1) \log\left(\frac{n+1}{n}\right) = (n+1) \big( \log(n+1) - \log n \big). \]
6Unfortunately what is little recognized is that the most worthwhile scientific books are those
in which the author clearly indicates what he does not know; for an author most hurts his readers
by concealing difficulties. Evariste Galois (1811–1832). [188].
Combining these two inequalities, we obtain
\[ \frac{1}{n+1} < \log(n+1) - \log n < \frac{1}{n}. \tag{4.29} \]
Using the definition of $\gamma_n$ and the first inequality in (4.29), we see that
\[ \gamma_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n = \gamma_{n+1} - \frac{1}{n+1} + \log(n+1) - \log n > \gamma_{n+1}, \]
so the sequence $\{\gamma_n\}$ is strictly decreasing. In particular, $\gamma_n < \gamma_1 = 1$ for all n ≥ 2. We
now show that $\gamma_n$ is bounded below by zero. We already know that $\gamma_1 = 1 > 0$.
Using the second inequality in (4.29) with n replaced by 2, 3, . . . , n in turn, we obtain
\[ \gamma_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n > 1 + \big(\log 3 - \log 2\big) + \big(\log 4 - \log 3\big) + \big(\log 5 - \log 4\big) + \cdots + \big(\log n - \log(n-1)\big) + \big(\log(n+1) - \log n\big) - \log n \]
\[ = 1 - \log 2 + \log(n+1) - \log n > 1 - \log 2 > 0. \]
Here, we used that log 2 < 1 because 2 < e. Thus, $\{\gamma_n\}$ is strictly decreasing and
bounded below by 1 − log 2 > 0, so γ is well-defined and 0 < γ < 1.
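The monotonicity and the bounds just proved can be observed directly by computing the first few thousand terms of the sequence. This is a sketch in Python; the function name `gamma_sequence` and the cutoff of 2000 terms are illustrative assumptions, and the reference digits of γ are the ones quoted in the text.

```python
import math


def gamma_sequence(count: int) -> list[float]:
    """Return [gamma_1, ..., gamma_count], gamma_n = 1 + 1/2 + ... + 1/n - log n."""
    values = []
    harmonic = 0.0
    for n in range(1, count + 1):
        harmonic += 1.0 / n            # running harmonic sum H_n
        values.append(harmonic - math.log(n))
    return values


values = gamma_sequence(2000)

print(values[0])                       # gamma_1 = 1
print(values[-1])                      # close to 0.5772...
print(1 - math.log(2))                 # the lower bound from the proof
```

The convergence is slow: the difference between $\gamma_n$ and γ is roughly 1/(2n), so 2000 terms give only about three correct decimal places.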
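We can now show that the value of the alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots \]
is log 2. Indeed, since
\[ \gamma = \lim_{n\to\infty} \left( 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \right), \]
we see that
\[ \gamma = \lim_{n\to\infty} \left( 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{2n} - \log 2n \right) \]
and
\[ \gamma = \lim_{n\to\infty} \left( 2 \left( \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2n} \right) - \log n \right). \]
Subtracting, we obtain
\[ 0 = \lim_{n\to\infty} \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots - \frac{1}{2n} \right) - \log 2, \]
which proves that
\[ \log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots. \]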
Using a similar technique, one can find series representations for log 3; see Problem
7. Using the above formula for log 2, in Problem 7 you are asked to derive the
following striking expression:
\[ 2 = \frac{e^{1}}{e^{1/2}} \cdot \frac{e^{1/3}}{e^{1/4}} \cdot \frac{e^{1/5}}{e^{1/6}} \cdot \frac{e^{1/7}}{e^{1/8}} \cdot \frac{e^{1/9}}{e^{1/10}} \cdots. \tag{4.30} \]
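The subtraction step in the derivation of log 2 says that the alternating sum up to 1/(2n) equals the difference of two harmonic sums, and exponentiating those partial sums gives the partial products of (4.30). A short Python sketch of both facts follows; the helper names `harmonic` and `alternating` and the choice n = 1000 are assumptions made for illustration.

```python
import math


def harmonic(n: int) -> float:
    """Return H_n = 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


def alternating(terms: int) -> float:
    """Return 1 - 1/2 + 1/3 - ... with the given number of terms."""
    return math.fsum((-1) ** (k - 1) / k for k in range(1, terms + 1))


n = 1000

# H_{2n} - H_n equals the alternating sum with 2n terms (the "subtracting" step).
print(harmonic(2 * n) - harmonic(n), alternating(2 * n), math.log(2))

# exp of the alternating partial sum is the partial product in (4.30).
print(math.exp(alternating(2 * n)))
```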
Exercises 4.6.
1. Establish the following properties of exponential functions.
(a) If $z_n \to z$ and $a_n \to a$ (with $z_n$, z complex and $a_n$, a > 0), then $a_n^{z_n} \to a^z$.
(b) If a, b > 0, then for any x < 0, a < b if and only if $a^x > b^x$.
(c) If a, b > 0, then for any complex number z, $a^{-z} = 1/a^z$ and $(a/b)^z = a^z/b^z$.
(d) Prove that for any x ∈ R, $|e^{ix}| = 1$.
2. Let a ∈ R with a ≠ 0 and define $f(x) = x^a$.
(a) If a > 0, prove that f : [0, ∞) −→ R is continuous, strictly increasing, $\lim_{x\to 0^+} f = 0$,
and $\lim_{x\to\infty} f = \infty$.
(b) If a < 0, prove that f : (0, ∞) −→ R is continuous, strictly decreasing, $\lim_{x\to 0^+} f = \infty$,
and $\lim_{x\to\infty} f = 0$.
3. Establish the following limit properties of the exponential function.
(a) Show that for any natural number n and for any x ∈ R with x > 0 we have
\[ e^x > \frac{x^{n+1}}{(n+1)!}. \]
Use this inequality to prove that for any natural number n,
\[ \lim_{x\to\infty} \frac{x^n}{e^x} = 0. \]
(b) Using (a), prove that for any a ∈ R with a > 0, however large, we have
\[ \lim_{x\to\infty} \frac{x^a}{e^x} = 0. \]
It follows that $e^x$ grows faster than any power (no matter how large) of x. This
limit is usually derived in elementary calculus using L’Hospital’s rule.
4. Let $a_1, \ldots, a_n$ be nonnegative real numbers. Recall from Problem 7 in Exercises 2.2
that the arithmetic-geometric mean inequality (AGMI) is the inequality
\[ (a_1 \cdot a_2 \cdots a_n)^{1/n} \le \frac{a_1 + \cdots + a_n}{n}. \]
Prove this inequality by setting $a = (a_1 + \cdots + a_n)/n$, $x_k = -1 + a_k/a$ (so that
$a_k/a = 1 + x_k$) for k = 1, . . . , n, and using the inequality $1 + x \le e^x$.
5. For any x > 0, derive the following remarkable formula:
\[ \log x = \lim_{n\to\infty} n \left( \sqrt[n]{x} - 1 \right) \quad \text{(Halley’s formula)}, \]
named after the famous Edmond Halley (1656–1742) of Halley’s comet. Suggestion:
Write $\sqrt[n]{x} = e^{\log x / n}$ and write $e^{\log x / n}$ as a series in log x/n.
6. (Cf. [108]) Puzzle: Do there exist irrational numbers α and β such that $\alpha^\beta$ is irrational?
Suggestion: Consider $\alpha^\beta$ and $\alpha^{\beta'}$ where $\alpha = \beta = \sqrt{2}$ and $\beta' = \sqrt{2} + 1$.
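7. In this fun problem, we derive some interesting formulas.
(a) Prove that
\[ \gamma = \sum_{n=1}^{\infty} \left( \frac{1}{n} - \log\left(1 + \frac{1}{n}\right) \right) = 1 + \sum_{n=2}^{\infty} \left( \frac{1}{n} + \log\left(1 - \frac{1}{n}\right) \right) = 1 + \sum_{n=1}^{\infty} \left( \frac{1}{n+1} - \log\left(1 + \frac{1}{n}\right) \right), \]
where γ is the Euler-Mascheroni constant. Suggestion: Think telescoping series.
(b) Using a technique similar to the one we used to derive our formula for log 2, prove that
\[ \log 3 = 1 + \frac{1}{2} - \frac{2}{3} + \frac{1}{4} + \frac{1}{5} - \frac{2}{6} + \frac{1}{7} + \frac{1}{8} - \frac{2}{9} + + - \cdots. \]
Can you find a series representation for log 4?
(c) Define $a_n = \dfrac{e^{1}}{e^{1/2}} \cdots \dfrac{e^{1/(2n-1)}}{e^{1/(2n)}}$. Prove that $2 = \lim a_n$.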
8. Following Greenstein [86] (cf. [42]) we establish a “well-known” limit from calculus,
but without using calculus!
(i) Show that log x < x for all x > 0.
(ii) Show that $(\log x)/x < 2/x^{1/2}$ for x > 0. Suggestion: $\log x = 2 \log x^{1/2}$.
(iii) Show that
\[ \lim_{x\to\infty} \frac{\log x}{x} = 0. \]
This limit is usually derived in elementary calculus using L’Hospital’s rule.
(iv) Now let a ∈ R with a > 0. Generalizing the above argument, prove that
\[ \lim_{x\to\infty} \frac{\log x}{x^a} = 0. \]
Thus, log x grows slower than any power (no matter how small) of x.
9. In this problem we get an inequality for log(1 + x) and use it to obtain a nice formula.
(i) Prove that for all x ∈ [0, 1], we have $e^{x/2} \le 1 + x$. Conclude that for all x ∈ [0, 1],
we have log(1 + x) ≥ x/2.
(ii) Using Tannery’s theorem, prove that
\[ \zeta(2) = \lim_{n\to\infty} \left\{ \frac{1}{n^2 \log\left(1 + \frac{1^2}{n^2}\right)} + \frac{1}{n^2 \log\left(1 + \frac{2^2}{n^2}\right)} + \frac{1}{n^2 \log\left(1 + \frac{3^2}{n^2}\right)} + \cdots + \frac{1}{n^2 \log\left(1 + \frac{n^2}{n^2}\right)} \right\}. \]
10. In high school you probably learned logarithms with other “bases” besides e. Let a ∈ R
with a > 0 and a ≠ 1. For any x > 0, we define
\[ \log_a x := \frac{\log x}{\log a}, \]
called the logarithm of x to the base a. Note that if a = e, then $\log_e = \log$, our
usual logarithm. Here are some of the well-known properties of $\log_a$.
(a) Prove that $x \mapsto \log_a x$ is the inverse function of $x \mapsto a^x$.
(b) Prove that for any x, y > 0, $\log_a xy = \log_a x + \log_a y$.
(c) Prove that if b > 0 with b ≠ 1 is another base, then for any x > 0,
\[ \log_a x = \left( \frac{\log b}{\log a} \right) \log_b x \quad \text{(Change of base formula)}. \]
11. Part (a) of this problem states that a “function which looks like an exponential function
is an exponential function,” while (b) says the same for the logarithm function.
(a) Let f : R −→ R satisfy f (x + y) = f (x) f (y) for all x, y ∈ R; see Problem 4 in
Exercises 4.3. Assume that f is not the zero function. Prove that if f is continuous,
then
\[ f(x) = a^x \quad \text{for all } x \in \mathbb{R}, \text{ where } a = f(1). \]
Suggestion: Show that f(x) > 0 for all x. Now there are a couple of ways to proceed.
One way is to first prove that $f(r) = (f(1))^r$ for all rational r (to prove this you
do not require the continuity assumption). A second way is to define h(x) =
log f(x). Prove that h is linear and then apply Problem 3 in Exercises 4.3.
(b) Let g : (0, ∞) −→ R satisfy g(x · y) = g(x) + g(y) for all x, y > 0. Prove that if g
is continuous, then there exists a unique real number c such that
g(x) = c log x for all x ∈ (0, ∞).
12. (Exponentials the “old-fashioned way”) Fix a > 0 and x ∈ R. In this section we
defined $a^x := \exp(x \log a)$. However, in this problem we shall define real powers the
“old-fashioned way” via rational sequences. Henceforth we only assume knowledge of rational
powers and we proceed to define real powers.
(i) Let $\{r_n\}$ be a sequence of rational numbers converging to zero. From Section 3.1
we know that $a^{1/n} \to 1$ and $a^{-1/n} = (a^{-1})^{1/n} \to 1$. Let ε > 0 and fix m ∈ N such
that $1 - \varepsilon < a^{\pm 1/m} < 1 + \varepsilon$. Show that if $|r_n| < 1/m$, then $1 - \varepsilon < a^{r_n} < 1 + \varepsilon$.
Conclude that $a^{r_n} \to 1$. (See Problem 3 in Exercises 3.1 for another proof.)
Suggestion: Recall that for any rationals p < q and real b > 1, we have $b^p < b^q$.
(ii) Let $\{r_n\}$ be a sequence of rational numbers converging to x. Prove that $\{a^{r_n}\}$ is
a Cauchy sequence, hence it converges to a real number, say ξ. We define $a^x = \xi$.
Prove that this definition makes sense; that is, if $\{r_n'\}$ is any other sequence of
rational numbers converging to x, then $\{a^{r_n'}\}$ also converges to ξ.
(iii) Prove that if x = n ∈ N, then $a^x = a \cdot a \cdots a$ where there are n a’s multiplied
together. Also prove that $a^{-x} = 1/a^n$ and if x = n/m ∈ Q, then $a^x = \sqrt[m]{a^n}$.
Thus, our new definition of powers agrees with the old definition. Finally, show
that for x, y ∈ R,
\[ a^x \cdot a^y = a^{x+y}; \qquad a^x \cdot b^x = (ab)^x; \qquad (a^x)^y = a^{xy}. \]
13. (Logarithms the “old-fashioned way”) In this problem we define the logarithm the
“old-fashioned way” using rational sequences. In this problem we assume knowledge of
real powers as presented in the previous problem. Fix a > 0.
(i) Prove that it is possible to define unique integers $a_0, a_1, a_2, \ldots$ inductively with
$0 \le a_k \le 9$ for k ≥ 1 such that if $x_n$ and $y_n$ are the rational numbers
\[ x_n = a_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_{n-1}}{10^{n-1}} + \frac{a_n}{10^n} \]
and
\[ y_n = a_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_{n-1}}{10^{n-1}} + \frac{a_n + 1}{10^n}, \]
then
\[ e^{x_n} \le a < e^{y_n}. \tag{4.31} \]
Suggestion: Since e > 1, we know that for r ∈ Q, we have the limits $e^r \to \infty$,
respectively 0, as r → ∞, respectively r → −∞.
(ii) Prove that both sequences $\{x_n\}$ and $\{y_n\}$ converge to the same value, call it L.
Show that $e^L = a$ where $e^L$ is defined by means of the previous problem. Of
course, L is just the logarithm of a defined in this section.
14. (The Euler-Mascheroni constant II) In this problem we prove that the Euler-
Mascheroni constant exists following [44]. Consider the sequence
\[ a_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n-1} - \log n, \qquad n = 2, 3, \ldots. \]
We shall prove that $a_n$ is nondecreasing and bounded and hence $\lim a_n$ exists.
(i) Assuming that the limit $\lim a_n$ exists, prove that the limit defining the Euler-
Mascheroni constant also exists and equals $\lim a_n$.
(ii) Using the inequalities in (3.28), prove that
\[ 1 < \frac{e^{1/n}}{(n+1)/n} \quad \text{and} \quad \frac{e^{1/n}}{(n+1)/n} < e^{\frac{1}{n(n+1)}}. \tag{4.32} \]
(iii) Prove that for each n ≥ 2,
\[ a_n = \log\left( \frac{e^{1}}{2/1} \cdot \frac{e^{1/2}}{3/2} \cdots \frac{e^{1/(n-1)}}{n/(n-1)} \right). \]
(iv) Using (iii) and the inequalities in (4.32), prove that $\{a_n\}$ is strictly increasing such
that $0 < a_n < 1$ for all n. Conclude that $\lim a_n$ exists.
15. (The Euler-Mascheroni constant III) We prove that the Euler-Mascheroni constant
exists following [16]. For each k ∈ N, define $a_k := e \left(1 + \frac{1}{k}\right)^{-k}$ so that $e = a_k \left(1 + \frac{1}{k}\right)^{k}$.
(i) Prove that
\[ \frac{1}{k} = \log\left(a_k^{1/k}\right) + \log(k+1) - \log k. \]
(ii) Prove that
\[ 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log(n+1) = \log\left( a_1 \, a_2^{1/2} \, a_3^{1/3} \cdots a_n^{1/n} \right). \]
(iii) Prove that the sequence $\left\{ \log\left( a_1 \, a_2^{1/2} \cdots a_n^{1/n} \right) \right\}$ is nondecreasing. Conclude that
if this sequence is bounded, then Euler’s constant exists.
(iv) Prove that
\[ \log\left( a_1 \, a_2^{1/2} \cdots a_n^{1/n} \right) = \log a_1 + \frac{1}{2} \log a_2 + \cdots + \frac{1}{n} \log a_n \]
\[ < \log\left(1 + \frac{1}{1}\right) + \frac{1}{2} \log\left(1 + \frac{1}{2}\right) + \cdots + \frac{1}{n} \log\left(1 + \frac{1}{n}\right) < \frac{1}{1} + \frac{1}{2} \cdot \frac{1}{2} + \cdots + \frac{1}{n} \cdot \frac{1}{n}. \]
(v) Since the sum of the reciprocals of the squares converges, conclude that the sequence
$\left\{ \log\left( a_1 \, a_2^{1/2} \cdots a_n^{1/n} \right) \right\}$ is bounded.

4.7. The trig functions, the number π, and which is larger, $\pi^e$ or $e^\pi$?

In high school we learned about sine and cosine using geometric intuition based
on either triangles or the unit circle. (For this point of view, see the interesting
paper [211].) In this section we introduce these functions from a purely analytic
framework and we prove that these functions have all the properties you learned
in high school. In high school we also learned about the number π,⁷ again using
geometric intuition. In this section we define π rigorously using analysis without
any geometry. However, we do prove that π has all the geometric properties you
think it does.

4.7.1. The trigonometric and hyperbolic functions. We define cosine
and sine as the functions cos : C −→ C and sin : C −→ C defined by the equations
\[ \cos z := \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin z := \frac{e^{iz} - e^{-iz}}{2i}. \]
In particular, both of these functions are continuous functions, being constant
multiples of a sum and difference, respectively, of the continuous functions $e^{iz} = \exp(iz)$
and $e^{-iz} = \exp(-iz)$. From these formulas, we see that cos 0 = 1 and sin 0 = 0;
other “well-known” values of sine and cosine are discussed in the problems.
Multiplying the equation for sin z by i and then adding this equation to cos z, the $e^{-iz}$
terms cancel and we get $\cos z + i \sin z = e^{iz}$. This equation is the famous Euler’s
identity:
\[ e^{iz} = \cos z + i \sin z. \qquad \text{(Euler’s identity)} \]
This formula provides a very easy proof of de Moivre’s formula, named after its
discoverer Abraham de Moivre (1667–1754),
\[ (\cos z + i \sin z)^n = \cos nz + i \sin nz, \quad z \in \mathbb{C}, \qquad \text{(de Moivre’s formula)} \]
which is given much attention in elementary mathematics and is usually only stated
when z = θ, a real variable. Here is the one-line proof:
\[ (\cos z + i \sin z)^n = \left(e^{iz}\right)^n = \underbrace{e^{iz} \cdot e^{iz} \cdots e^{iz}}_{n\ \text{terms}} = e^{inz} = \cos nz + i \sin nz. \]
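Because cosine and sine are defined here purely through the exponential, Euler's identity and de Moivre's formula can be tested at a genuinely complex argument. The following Python sketch uses the text's definitions directly; the function names `cos_def` and `sin_def`, the sample point z, and the exponent n = 5 are illustrative assumptions.

```python
import cmath


def cos_def(z: complex) -> complex:
    """cos z := (e^{iz} + e^{-iz}) / 2, the definition used in this section."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2


def sin_def(z: complex) -> complex:
    """sin z := (e^{iz} - e^{-iz}) / (2i), the definition used in this section."""
    return (cmath.exp(1j * z) - cmath.exp(-1j * z)) / (2j)


z = complex(0.8, -0.5)   # arbitrary complex number, not just a real angle
n = 5

# de Moivre: (cos z + i sin z)^n versus cos nz + i sin nz.
lhs = (cos_def(z) + 1j * sin_def(z)) ** n
rhs = cos_def(n * z) + 1j * sin_def(n * z)
print(abs(lhs - rhs))

# Euler's identity: cos z + i sin z versus e^{iz}.
print(abs(cos_def(z) + 1j * sin_def(z) - cmath.exp(1j * z)))
```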

7“Cosine, secant, tangent, sine, 3.14159; integral, radical, u dv, slipstick, sliderule, MIT!”
MIT cheer.
In the following theorem, we adopt the standard notation of writing $\sin^2 z$ for
$(\sin z)^2$, etc.⁸ Here are some well-known trigonometric identities that you memorized
in high school, now proved from the basic definitions and even for complex
variables.
Theorem 4.34 (Basic properties of cosine and sine). Cosine and sine
are continuous functions on C. In particular, restricting to real values, they define
continuous functions on R. Moreover, for any complex numbers z and w,
(1) cos(−z) = cos z, sin(−z) = − sin z,
(2) $\cos^2 z + \sin^2 z = 1$, (Pythagorean identity)
(3) Addition formulas:
\[ \cos(z + w) = \cos z \cos w - \sin z \sin w, \qquad \sin(z + w) = \sin z \cos w + \cos z \sin w, \]
(4) Double angle formulas:
\[ \cos(2z) = \cos^2 z - \sin^2 z = 2 \cos^2 z - 1 = 1 - 2 \sin^2 z, \qquad \sin(2z) = 2 \cos z \sin z. \]
(5) Trigonometric series:⁹
\[ \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}, \qquad \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \tag{4.33} \]
where the series converge absolutely.

Proof. We shall leave some of this proof to the reader. Note that (1) follows
directly from the definition of cosine and sine. Consider the addition formula:
\[ \cos z \cos w - \sin z \sin w = \left( \frac{e^{iz} + e^{-iz}}{2} \right) \left( \frac{e^{iw} + e^{-iw}}{2} \right) - \left( \frac{e^{iz} - e^{-iz}}{2i} \right) \left( \frac{e^{iw} - e^{-iw}}{2i} \right) \]
\[ = \frac{1}{4} \left\{ e^{i(z+w)} + e^{i(z-w)} + e^{-i(z-w)} + e^{-i(z+w)} + e^{i(z+w)} - e^{i(z-w)} - e^{-i(z-w)} + e^{-i(z+w)} \right\} \]
\[ = \frac{e^{i(z+w)} + e^{-i(z+w)}}{2} = \cos(z + w). \]
Taking w = −z and using (1) we get the Pythagorean identity:
\[ 1 = \cos 0 = \cos(z - z) = \cos z \cos(-z) - \sin z \sin(-z) = \cos^2 z + \sin^2 z. \]

⁸ $\sin^2 \varphi$ is odious to me, even though Laplace made use of it; should it be feared that $\sin^2 \varphi$
might become ambiguous, which would perhaps never occur, or at most very rarely when speaking
of $\sin(\varphi^2)$, well then, let us write $(\sin \varphi)^2$, but not $\sin^2 \varphi$, which by analogy should signify $\sin(\sin \varphi)$.
Carl Friedrich Gauss (1777–1855).
⁹ In elementary calculus, these series are usually derived via Taylor series and are usually
attributed to Sir Isaac Newton (1643–1727) who derived them in his paper “De Methodis Serierum
et Fluxionum” (Method of series and fluxions) written in 1671. However, it is interesting to know
that these series were first discovered hundreds of years earlier by Madhava of Sangamagramma
(1350–1425), a mathematician from the Kerala state in southern India!
We leave the double angle formulas to the reader. To prove (5), we use the power
series for the exponential to compute
\[ e^{iz} + e^{-iz} = \sum_{n=0}^{\infty} \frac{i^n z^n}{n!} + \sum_{n=0}^{\infty} \frac{(-1)^n i^n z^n}{n!}. \]
The terms when n is odd cancel, so
\[ 2 \cos z = e^{iz} + e^{-iz} = 2 \sum_{n=0}^{\infty} \frac{i^{2n} z^{2n}}{(2n)!} = 2 \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}, \]
where we used the fact that $i^{2n} = (i^2)^n = (-1)^n$. This series converges absolutely
since it is the sum of two absolutely convergent series. The series expansion for
sin z is proved in a similar manner. □
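The series in (4.33) converge quickly enough that a few dozen terms reproduce library values of cosine and sine at complex arguments. Below is a small Python sketch; the function names `cos_series` and `sin_series`, the cutoff of 30 terms, and the sample point z are assumptions made for illustration only.

```python
import cmath
import math


def cos_series(z: complex, terms: int = 30) -> complex:
    """Partial sum of sum_{n>=0} (-1)^n z^(2n) / (2n)!."""
    return sum((-1) ** n * z ** (2 * n) / math.factorial(2 * n) for n in range(terms))


def sin_series(z: complex, terms: int = 30) -> complex:
    """Partial sum of sum_{n>=0} (-1)^n z^(2n+1) / (2n+1)!."""
    return sum((-1) ** n * z ** (2 * n + 1) / math.factorial(2 * n + 1) for n in range(terms))


z = complex(1.5, -2.0)   # arbitrary complex test point

print(abs(cos_series(z) - cmath.cos(z)))
print(abs(sin_series(z) - cmath.sin(z)))
```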
From the series expansion for sin it is straightforward to prove the following
limit from elementary calculus (but now for complex numbers):
\[ \lim_{z\to 0} \frac{\sin z}{z} = 1; \]
see Problem 3. Of course, from the identities in Theorem 4.34, one can derive other
identities such as the so-called half-angle formulas:
\[ \cos^2 z = \frac{1 + \cos 2z}{2}, \qquad \sin^2 z = \frac{1 - \cos 2z}{2}. \]
The other trigonometric functions are defined in terms of sin and cos in the
usual manner:
\[ \tan z = \frac{\sin z}{\cos z}, \qquad \cot z = \frac{1}{\tan z} = \frac{\cos z}{\sin z}, \qquad \sec z = \frac{1}{\cos z}, \qquad \csc z = \frac{1}{\sin z}, \]
and are called the tangent, cotangent, secant, and cosecant, respectively. Note
that these functions are only defined for those complex z for which the expressions
make sense, e.g. tan z is defined only for those z such that cos z ≠ 0. The extra trig
functions satisfy the same identities that you learned in high school, for example,
for any complex numbers z, w, we have
\[ \tan(z + w) = \frac{\tan z + \tan w}{1 - \tan z \tan w}, \tag{4.34} \]
for those z, w such that the denominator is not zero. Setting z = w, we see that
\[ \tan 2z = \frac{2 \tan z}{1 - \tan^2 z}. \]
In Problem 4 we ask you to prove (4.34) and other identities.
Before baking our π, we quickly define the hyperbolic functions. For any complex
number z, we define
\[ \cosh z := \frac{e^z + e^{-z}}{2}, \qquad \sinh z := \frac{e^z - e^{-z}}{2}; \]
these are called the hyperbolic cosine and hyperbolic sine, respectively. There
are hyperbolic tangents, secants, etc., defined in the obvious manner. Observe
that, by definition, cosh z = cos iz and sinh z = −i sin iz, so after substituting iz
for z in the series for cos and sin, we obtain
\[ \cosh z = \sum_{n=0}^{\infty} \frac{z^{2n}}{(2n)!}, \qquad \sinh z = \sum_{n=0}^{\infty} \frac{z^{2n+1}}{(2n+1)!}. \]
These functions are intimately related to the trig functions and share many of the
same properties; see Problem 8.
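The relations cosh z = cos iz and sinh z = −i sin iz, and the fact that substituting iz into the trigonometric series removes the alternating signs, can be checked numerically. This Python sketch relies on the standard `cmath` functions; the helper names `cosh_series` and `sinh_series`, the 30-term cutoff, and the sample point are illustrative assumptions.

```python
import cmath
import math


def cosh_series(z: complex, terms: int = 30) -> complex:
    """Partial sum of sum_{n>=0} z^(2n) / (2n)! (no alternating sign)."""
    return sum(z ** (2 * n) / math.factorial(2 * n) for n in range(terms))


def sinh_series(z: complex, terms: int = 30) -> complex:
    """Partial sum of sum_{n>=0} z^(2n+1) / (2n+1)! (no alternating sign)."""
    return sum(z ** (2 * n + 1) / math.factorial(2 * n + 1) for n in range(terms))


z = complex(0.9, -1.4)   # arbitrary complex test point

# cosh z = cos(iz) and sinh z = -i sin(iz).
print(abs(cmath.cosh(z) - cmath.cos(1j * z)))
print(abs(cmath.sinh(z) - (-1j) * cmath.sin(1j * z)))

# The series obtained after substituting iz into the cos and sin series.
print(abs(cosh_series(z) - cmath.cosh(z)))
print(abs(sinh_series(z) - cmath.sinh(z)))
```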
4.7.2. The number π and some trig identities. Setting z = x ∈ R into
the series (4.33), we obtain the formulas learned in elementary calculus:
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \qquad \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \]
In particular, cos, sin : R −→ R. In the following lemma and theorem we shall
consider these real-valued functions instead of the more general complex versions.
The following lemma is the key result needed to define π.
Lemma 4.35. Sine and cosine have the following properties on [0, 2]:
(1) sin is nonnegative on [0, 2] and positive on (0, 2];
(2) cos : [0, 2] −→ R is strictly decreasing with cos 0 = 1 and cos 2 < 0.
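Proof. Since
\[ \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = x \left( 1 - \frac{x^2}{2 \cdot 3} \right) + \frac{x^5}{5!} \left( 1 - \frac{x^2}{6 \cdot 7} \right) + \cdots \]
and each term in the series is positive for 0 < x ≤ 2 (because then $x^2 \le 4 < 2 \cdot 3$), we have sin x > 0 for all
0 < x ≤ 2 and sin 0 = 0.
Since
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \]
we have
\[ \cos 2 = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \left( \frac{2^6}{6!} - \frac{2^8}{8!} \right) - \left( \frac{2^{10}}{10!} - \frac{2^{12}}{12!} \right) - \cdots. \]
All the terms in parentheses are positive because for k ≥ 2, we have
\[ \frac{2^k}{k!} - \frac{2^{k+2}}{(k+2)!} = \frac{2^k}{k!} \left( 1 - \frac{4}{(k+1)(k+2)} \right) > 0. \]
Therefore,
\[ \cos 2 < 1 - \frac{2^2}{2!} + \frac{2^4}{4!} = -\frac{1}{3} < 0. \]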
We now show that cos is strictly decreasing on [0, 2]. Since cos is continuous,
by Theorem 4.27 if we show that cos is one-to-one on [0, 2], then we can conclude
that cos is strictly monotone on [0, 2]; then cos 0 = 1 and cos 2 < 0 tells us that
cos must be strictly decreasing. Suppose that 0 ≤ x ≤ y ≤ 2 and cos x = cos y; we
shall prove that x = y. We already know that sin is nonnegative on [0, 2], so the
identity
\[ \sin^2 x = 1 - \cos^2 x = 1 - \cos^2 y = \sin^2 y \]
implies that sin x = sin y. Therefore,
\[ \sin(y - x) = \sin y \cos x - \cos y \sin x = \sin x \cos x - \cos x \sin x = 0, \]
and using that 0 ≤ y − x ≤ 2 and (1), we get y − x = 0. Hence, x = y, so cos is
one-to-one on [0, 2], and our proof is complete. □
We now define the real number π.
Theorem 4.36 (Definition of π). There exists a unique real number, denoted
by the Greek letter π, having the following two properties:
(1) 3 < π < 4,
(2) cos(π/2) = 0.
Moreover, cos x > 0 for 0 < x < π/2.
Proof. By our lemma, we know that cos : [0, 2] −→ R is strictly decreasing
with cos 0 = 1 and cos 2 < 0, so by the intermediate value theorem and the fact
that cos is strictly decreasing, there is a unique point 0 < c < 2 such that cos c = 0.
Define π := 2c, that is, c = π/2. Then 0 < c < 2 implies that 0 < π < 4 and
this is the only number between 0 and 4 such that cos(π/2) = 0. Since cos is
strictly decreasing on [0, 2], we have cos x > 0 for 0 < x < π/2. To see that in fact,
3 < π < 4, we just need to show that cos(3/2) > 0; this implies that 3/2 < c < 2
and therefore 3 < π < 4. Plugging in x = 3/2 into the formula for cos x, we get
\[ \cos\frac{3}{2} = \left( 1 - \frac{3^2}{2^2 \, 2!} \right) + \left( \frac{3^4}{2^4 \, 4!} - \frac{3^6}{2^6 \, 6!} \right) + \left( \frac{3^8}{2^8 \, 8!} - \frac{3^{10}}{2^{10} \, 10!} \right) + \cdots. \]
The first term is negative and equals 1 − 9/8 = −1/8, while, as the reader can
check, all the rest of the parentheses are positive numbers. In particular (after a
lot of scratch work figuring out the number in the second parentheses), we obtain
\[ \cos\frac{3}{2} > \left( 1 - \frac{3^2}{2^2 \, 2!} \right) + \left( \frac{3^4}{2^4 \, 4!} - \frac{3^6}{2^6 \, 6!} \right) = -\frac{1}{8} + \frac{3^3 \cdot 37}{2^{10} \cdot 5} = \frac{359}{2^{10} \cdot 5} > 0. \]

The number π/180 is called a degree. Thus, π/2 = 90 · π/180 is the same as
90 degrees, which we write as 90◦ , π = 180 · π/180 is the same as 180◦ , etc.

4.7.3. Properties of π. As we already stated, the approach we have taken to
introduce π has been completely analytical without reference to triangles or circles,
but surely the π we have defined and the π you have grown up with must be the
same. We now show that the π we have defined is not an imposter, but indeed does
have all the properties of the π that you have grown to love.
We first state some of the well-known trig identities involving π that you learned
in high school, but now we even prove them for complex variables.
Theorem 4.37. The following identities hold:
cos(π/2) = 0, cos(π) = −1, cos(3π/2) = 0, cos(2π) = 1
sin(π/2) = 1, sin(π) = 0, sin(3π/2) = −1, sin(2π) = 0.
Moreover, for any complex number z, we have the following addition formulas:
\[ \cos\left(z + \frac{\pi}{2}\right) = -\sin z, \qquad \sin\left(z + \frac{\pi}{2}\right) = \cos z, \]
\[ \cos(z + \pi) = -\cos z, \qquad \sin(z + \pi) = -\sin z, \]
\[ \cos(z + 2\pi) = \cos z, \qquad \sin(z + 2\pi) = \sin z. \]
[Figure: graphs of y = sin x and y = cos x for −2π ≤ x ≤ 2π, with the x-axis marked at multiples of π/2.]
Figure 4.11. Our definitions of sine and cosine have the same
properties as the ones you learned in high school!

Proof. We know that cos(π/2) = 0 and, by (1) of Lemma 4.35, sin(π/2) > 0;
therefore, since
\[ \sin^2(\pi/2) = 1 - \cos^2(\pi/2) = 1, \]
we must have sin(π/2) = 1. The double angle formulas now imply that
\[ \cos(\pi) = \cos^2(\pi/2) - \sin^2(\pi/2) = -1, \qquad \sin(\pi) = 2 \cos(\pi/2) \sin(\pi/2) = 0, \]
and by another application of the double angle formulas, we get
\[ \cos(2\pi) = 1, \qquad \sin(2\pi) = 0. \]
The facts just proved plus the addition formulas for cosine and sine in Property (3)
of Theorem 4.34 imply the last six formulas above; for example,
\[ \cos\left(z + \frac{\pi}{2}\right) = \cos z \cos\frac{\pi}{2} - \sin z \sin\frac{\pi}{2} = -\sin z, \]
and the other formulas are proved similarly. Finally, setting z = π in
\[ \cos\left(z + \frac{\pi}{2}\right) = -\sin z, \qquad \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
proves that cos(3π/2) = 0 and sin(3π/2) = −1. □
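The shift formulas of Theorem 4.37 hold for complex z, which is easy to overlook when one only remembers them for real angles. The Python sketch below checks them at one arbitrary complex point. It assumes that the library constant `math.pi` is a floating-point approximation of the π defined in Theorem 4.36; the dictionary name `checks` and the sample point are illustrative.

```python
import cmath
import math

z = complex(0.4, 1.3)    # arbitrary complex test point
half_pi = math.pi / 2    # assumes math.pi approximates the pi of Theorem 4.36

# Each entry pairs the left-hand side with the right-hand side of an identity.
checks = {
    "cos(z + pi/2) = -sin z": (cmath.cos(z + half_pi), -cmath.sin(z)),
    "sin(z + pi/2) = cos z": (cmath.sin(z + half_pi), cmath.cos(z)),
    "cos(z + pi) = -cos z": (cmath.cos(z + math.pi), -cmath.cos(z)),
    "sin(z + pi) = -sin z": (cmath.sin(z + math.pi), -cmath.sin(z)),
    "cos(z + 2pi) = cos z": (cmath.cos(z + 2 * math.pi), cmath.cos(z)),
    "sin(z + 2pi) = sin z": (cmath.sin(z + 2 * math.pi), cmath.sin(z)),
}

for name, (lhs, rhs) in checks.items():
    print(name, abs(lhs - rhs))
```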
The last two formulas in Theorem 4.37 (plus an induction argument) imply
that cos and sin are periodic (with period 2π) in the sense that for any n ∈ Z,
\[ \cos(z + 2\pi n) = \cos z, \qquad \sin(z + 2\pi n) = \sin z. \tag{4.35} \]
Now, substituting z = π into $e^{iz} = \cos z + i \sin z$ and using that cos π = −1 and
sin π = 0, we get $e^{i\pi} = -1$, or by bringing −1 to the left we get perhaps the most
important equation in all of mathematics (at least to some mathematicians!):¹⁰
\[ e^{i\pi} + 1 = 0. \]
In one shot, this single equation contains the five “most important” constants in
mathematics: 0, the additive identity, 1, the multiplicative identity, i, the imag-
inary unit, and the constants e, the base of the exponential function, and π, the
fundamental constant of geometry.
Now consider the following theorem, which essentially states that the graphs
of cosine and sine go “up and down” as you think they should; see Figure 4.11.

¹⁰ [after proving Euler’s formula $e^{i\pi} = -1$ in a lecture] Gentlemen, that is surely true, it
is absolutely paradoxical; we cannot understand it, and we don’t know what it means. But we
have proved it, and therefore we know it is the truth. Benjamin Peirce (1809–1880). Quoted in
E Kasner and J Newman [110].
Theorem 4.38 (Oscillation theorem). On the interval [0, 2π], the following
monotonicity properties of cos and sin hold:
(1) cos decreases from 1 to −1 on [0, π] and increases from −1 to 1 on [π, 2π].
(2) sin increases from 0 to 1 on [0, π/2] and increases from −1 to 0 on [3π/2, 2π],
and decreases from 1 to −1 on [π/2, 3π/2].
Proof. From Lemma 4.35 we know that cos is strictly decreasing from 1 to
0 on [0, π/2] and from this same lemma we know that sin is positive on (0, π/2).
Therefore by the Pythagorean identity,
\[ \sin x = \sqrt{1 - \cos^2 x} \]
on [0, π/2]. Since cos is nonnegative and strictly decreasing on [0, π/2], this formula
implies that sin is strictly increasing on [0, π/2]. Replacing z by x − π/2 in the
formulas
\[ \cos\left(z + \frac{\pi}{2}\right) = -\sin z, \qquad \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
found in Theorem 4.37 gives the new formulas
\[ \cos x = -\sin\left(x - \frac{\pi}{2}\right), \qquad \sin x = \cos\left(x - \frac{\pi}{2}\right). \]
The first of these new formulas plus the fact that sin is increasing on [0, π/2] show
that cos is decreasing on [π/2, π], while the second of these formulas plus the fact
that cos is decreasing on [0, π/2] show that sin is also decreasing on [π/2, π]. Finally,
the formulas
cos x = − cos(x − π), sin x = − sin(x − π),
also obtained as a consequence of Theorem 4.37, and the monotone properties al-
ready established for cos and sin on [0, π], imply the rest of the monotone properties
in (1) and (2) of cos and sin on [π, 2π]. 
In geometric terms, the following theorem states that as θ moves from 0 to
2π, the point f(θ) = (cos θ, sin θ) in $\mathbb{R}^2$ moves around the unit circle. (However,
because we like complex notation, we shall write (cos θ, sin θ) as the complex number
$\cos\theta + i\sin\theta = e^{i\theta}$ in the theorem.)
Theorem 4.39 (π and the unit circle). For a real number θ, define
\[ f(\theta) := e^{i\theta} = \cos\theta + i\sin\theta. \]
Then f : R −→ C is a continuous function and has range equal to the unit circle
\[ S^1 := \{(a, b) \in \mathbb{R}^2 \,;\, a^2 + b^2 = 1\} = \{z \in \mathbb{C} \,;\, |z| = 1\}. \]
Moreover, for each z ∈ $S^1$ there exists a unique θ with 0 ≤ θ < 2π such that
f(θ) = z. Finally, f(θ) = f(φ) if and only if θ − φ is an integer multiple of 2π.
Proof. Since the exponential function is continuous, so is the function f, and
by the Pythagorean identity, $\cos^2\theta + \sin^2\theta = 1$, so we also know that f maps into
the unit circle. Given z in the unit circle, we can write z = a + ib where $a^2 + b^2 = 1$.
We prove that there exists a unique 0 ≤ θ < 2π such that f (θ) = z, that is, such
that cos θ = a and sin θ = b. Now either b ≥ 0 or b < 0. Assume that b ≥ 0; the case
when b < 0 is proved in a similar way. Since, according to Theorem 4.38, sin θ < 0
for all π < θ < 2π, and we are assuming b ≥ 0, there is no θ with π < θ < 2π
such that f (θ) = z. Hence, we just have to show there is a unique θ ∈ [0, π] such
that f(θ) = z. Since $a^2 + b^2 = 1$, we have −1 ≤ a ≤ 1 and 0 ≤ b ≤ 1. Since cos
strictly decreases from 1 to −1 on [0, π], by the intermediate value theorem there
is a unique value θ ∈ [0, π] such that cos θ = a. The identity
\[ \sin^2\theta = 1 - \cos^2\theta = 1 - a^2 = b^2, \]
and the fact that sin θ ≥ 0, because 0 ≤ θ ≤ π, imply that b = sin θ.
We now prove the last assertion of our theorem. Let θ and φ be real numbers
and suppose that f(θ) = f(φ). Let n be the unique integer such that
\[ n \le \frac{\theta - \phi}{2\pi} < n + 1. \]
Multiplying everything by 2π and subtracting 2πn, we obtain
\[ 0 \le \theta - \phi - 2\pi n < 2\pi. \]
By periodicity (see (4.35)),
\[ f(\theta - \phi - 2\pi n) = f(\theta - \phi) = e^{i(\theta - \phi)} = e^{i\theta} e^{-i\phi} = f(\theta)/f(\phi) = 1. \]
Since θ − φ − 2πn is in the interval [0, 2π) and f (0) = 1 also, by the uniqueness
we proved in the previous paragraph, we conclude that θ − φ − 2πn = 0. This
completes the proof of the theorem. 
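The existence part of the proof is constructive enough to sketch numerically. The following is a minimal sketch, not part of the text: the function name `angle_on_unit_circle` is hypothetical, and `math.acos` stands in for the inverse of cos on [0, π] whose existence the proof gets from the intermediate value theorem.

```python
import cmath
import math

def angle_on_unit_circle(z: complex) -> float:
    """Return theta in [0, 2*pi) with e^{i*theta} = z, assuming |z| = 1."""
    a, b = z.real, z.imag
    # Clamp against rounding so that acos stays inside its domain [-1, 1].
    a = max(-1.0, min(1.0, a))
    # cos is strictly decreasing on [0, pi], so acos(a) is the unique
    # theta in [0, pi] with cos(theta) = a (the case b >= 0 of the proof).
    theta = math.acos(a)
    if b < 0:
        # Case b < 0: the angle lies in (pi, 2*pi); reflect across the real axis.
        theta = 2 * math.pi - theta
    return theta % (2 * math.pi)

z = cmath.exp(1j * 4.0)           # a point on the unit circle
theta = angle_on_unit_circle(z)   # recovers 4.0 up to rounding
print(theta)
```

Feeding in $e^{i(\theta + 2\pi n)}$ for any integer n returns the same angle, which is the last assertion of the theorem seen numerically.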

We now solve trigonometric equations. Notice that Property (2) of the following
theorem shows that cos vanishes at exactly π/2 and all its π translates and (3) shows
that sin vanishes at exactly all integer multiples of π, again, well-known facts from
high school! However, we consider complex variables instead of just real variables.
Theorem 4.40. For complex numbers z and w,
(1) $e^z = e^w$ if and only if z = w + 2πin for some integer n.
(2) cos z = 0 if and only if z = nπ + π/2 for some integer n.
(3) sin z = 0 if and only if z = nπ for some integer n.
Proof. The “if” statements follow from Theorem 4.37 so we are left to prove
the “only if” statements. Suppose that $e^z = e^w$. Then $e^{z-w} = 1$. Hence, it suffices
to prove that $e^z = 1$ implies that z is an integer multiple of 2πi. Let z = x + iy for
real numbers x and y. Then,
\[ 1 = |e^{x+iy}| = |e^x e^{iy}| = e^x. \]
Since the exponential function on the real line is one-to-one, it follows that x = 0.
Now the equation $1 = e^z = e^{iy}$ implies, by Theorem 4.39, that y must be an integer
multiple of 2π. Hence, z = x + iy = iy is an integer multiple of 2πi.
Assume that sin z = 0. Then by definition of sin z, we have $e^{iz} = e^{-iz}$. By (1),
we have iz = −iz + 2πin for some integer n. Solving for z, we get z = πn. Finally,
the identity
\[ \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
and the result already proved for sine show that cos z = 0 implies that z = nπ + π/2
for some integer n.

As a corollary of this theorem we see that the domain of tan z = sin z/ cos z
and sec z = 1/ cos z consists of all complex numbers except the numbers nπ + π/2
with n ∈ Z, that is, the odd integer multiples of π/2.

4.7.4. Which is larger, $\pi^e$ or $e^\pi$? Of course, one can simply check using a
calculator that $e^\pi$ is greater. Here’s a mathematical proof following [196]. First
recall that $1 + x < e^x$ for any positive real x. Hence, as powers preserve inequalities,
for any x, y > 0, we obtain
\[ \left(1 + \frac{x}{y}\right)^y < (e^{x/y})^y = e^x. \]
In Section 3.7, we noted that e < 3. Since 3 < π, we have π − e > 0. Now setting
x = π − e > 0 and y = e in the above inequality, we get
\[ \left(1 + \frac{\pi - e}{e}\right)^e = \left(\frac{\pi}{e}\right)^e < e^{\pi - e}, \]
which, after multiplying by $e^e$, gives the inequality $\pi^e < e^\pi$.
By the way, speaking about $e^\pi$, Charles Hermite (1822–1901) made the fascinating
discovery that for many values of n, $e^{\pi\sqrt{n}}$ is an “almost integer” [47, p. 80].
For example, if you go to a calculator, you’ll find that when n = 1, $e^\pi$ is not almost
an integer, but $e^\pi - \pi$ is:
\[ e^\pi - \pi \approx 20. \]
In fact, $e^\pi - \pi = 19.999099979\ldots$. When n = 163, we get the incredible approximation
(4.36) $e^{\pi\sqrt{163}} = 262537412640768743.9999999999992\ldots$
Check out $e^{\pi\sqrt{58}}$. Isn’t it amazing how e and π show up in the strangest places?
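As a sanity check on the argument above, here is a short numerical sketch (the helper name `power_bound` is hypothetical). It tests the inequality $(1 + x/y)^y < e^x$ at x = π − e, y = e and compares the two powers directly. Double precision carries only about 16 significant digits, so it cannot resolve the decimals in (4.36); that check is deliberately left out.

```python
import math

def power_bound(x: float, y: float) -> bool:
    """Check (1 + x/y)^y < e^x for given x, y > 0."""
    return (1 + x / y) ** y < math.exp(x)

x, y = math.pi - math.e, math.e
lhs = (1 + x / y) ** y            # equals (pi/e)^e
rhs = math.exp(x)                 # equals e^(pi - e)
pi_to_e = math.pi ** math.e
e_to_pi = math.exp(math.pi)

print(lhs < rhs)                  # the key inequality of the proof
print(pi_to_e < e_to_pi)          # the conclusion pi^e < e^pi
print(e_to_pi - math.pi)          # close to 20, as noted in the text
```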
4.7.5. Plane geometry and polar representations of complex numbers.
Given a nonzero complex number z, we can write z = r ω where r = |z| and
ω = z/|z|. Notice that |ω| = 1, so from our knowledge of π and the unit circle
(Theorem 4.39) we know that there is a unique 0 ≤ θ < 2π such that
\[ \frac{z}{|z|} = e^{i\theta} = \cos\theta + i\sin\theta. \]
Therefore,
\[ z = r e^{i\theta} = r(\cos\theta + i\sin\theta). \]
This is called the polar representation of z. We can relate this representation
to the familiar “polar coordinates” on $\mathbb{R}^2$ as follows. Recall that C is really just $\mathbb{R}^2$.
Let z = x + iy, which remember is the same as z = (x, y) where i = (0, 1). Then
\[ r = |z| = \sqrt{x^2 + y^2} \]
is just the familiar radial distance of (x, y) to the origin. Equating the real and
imaginary parts of the equation cos θ + i sin θ = z/|z|, we get the two equations
(4.37) $\cos\theta = \frac{x}{r} = \frac{x}{\sqrt{x^2 + y^2}}$ and $\sin\theta = \frac{y}{r} = \frac{y}{\sqrt{x^2 + y^2}}$.

Summarizing: The equation $(x, y) = z = r e^{i\theta} = r(\cos\theta + i\sin\theta)$ is equivalent to
x = r cos θ and y = r sin θ.
We call (r, θ) the polar coordinates of the point z = (x, y). When z is drawn
as a point in $\mathbb{R}^2$, r represents the distance of z to the origin and θ represents (or
rather, is by definition) the angle that z makes with the positive real axis; see
Figure 4.12. In elementary calculus, one usually studies polar coordinates without
introducing complex numbers; however, we prefer the complex number approach
and in particular, the single notation $z = re^{i\theta}$ instead of the pair notation x = r cos θ
and y = r sin θ. We have taken 0 ≤ θ < 2π, but it will be very convenient to allow
θ to represent any real number. In this case, $z = re^{i\theta}$ is not attached to a unique
choice of θ, but by our knowledge of π and the unit circle, we know that any two
such θ’s differ by an integer multiple of 2π. Thus, the polar coordinates (r, θ) and
(r, θ + 2πn) represent the same point for any integer n.

[Figure 4.12. The familiar concept of angle: the point z = (x, y) at distance r from the origin, making angle θ with the positive real axis, with cos θ = x/r and sin θ = y/r.]
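The passage between z = x + iy and its polar coordinates can be sketched in a few lines. This is an illustration only: the names `to_polar` and `from_polar` are hypothetical, and `cmath.phase` is used as a stand-in for the angle whose existence comes from Theorem 4.39, shifted into the range 0 ≤ θ < 2π used in the text.

```python
import cmath
import math

def to_polar(z: complex) -> tuple[float, float]:
    """Return (r, theta) with z = r*e^{i*theta}, r = |z|, 0 <= theta < 2*pi (z nonzero)."""
    r = abs(z)
    theta = cmath.phase(z) % (2 * math.pi)   # phase() returns a value in [-pi, pi]
    return r, theta

def from_polar(r: float, theta: float) -> complex:
    """Return r*(cos(theta) + i*sin(theta))."""
    return complex(r * math.cos(theta), r * math.sin(theta))

r, theta = to_polar(-1 + 1j)
print(r, theta)                               # r = sqrt(2), theta = 3*pi/4
print(from_polar(r, theta + 2 * math.pi * 5)) # same point: theta is only fixed mod 2*pi
```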
Summarizing this section, we have seen that
All that you thought about trigonometry is true!
In particular, from (4.37) and adding the formula tan θ = sin θ/ cos θ = y/x, from
Figure 4.12 we see that
\[ \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}}, \qquad \sin\theta = \frac{\text{opposite}}{\text{hypotenuse}}, \qquad \tan\theta = \frac{\text{opposite}}{\text{adjacent}}, \]
just as you learned from high school!

Exercises 4.7.
1. Here are some values of the trigonometric functions.
(a) Find sin i, cos i, and tan(1 + i) (in terms of e and i).
(b) Using various trig identities (no triangles allowed!), prove the following well-known
values of sine and cosine: $\sin(\pi/4) = \cos(\pi/4) = 1/\sqrt{2}$, $\sin(\pi/6) = \cos(\pi/3) = 1/2$,
and $\sin(\pi/3) = \cos(\pi/6) = \sqrt{3}/2$.
(c) Using trig identities, find sin(π/8) and cos(π/8).
2. In this problem we find a very close estimate of π. Prove that for 0 < x < 2, we have
\[ \cos x < 1 - \frac{x^2}{2} + \frac{x^4}{24}. \]
Use this fact to prove that $3/2 < \pi/2 < \sqrt{6 - 2\sqrt{3}}$, which implies that
$3 < \pi < 2\sqrt{6 - 2\sqrt{3}} \approx 3.185$. We’ll get a much better estimate in Section 4.10.
3. Using the series representations (4.33) for sin z and cos z, find the limits
\[ \lim_{z\to 0} \frac{\sin z}{z}, \qquad \lim_{z\to 0} \frac{\sin z - z}{z^3}, \qquad \lim_{z\to 0} \frac{\cos z - 1 + z^2/2}{z^3}, \qquad \lim_{z\to 0} \frac{\cos z - 1 + z^2/2}{z^4}. \]
4. Prove some of the following identities:

(a) For z, w ∈ C,
2 sin z sin w = cos(z − w) − cos(z + w),
2 cos z cos w = cos(z − w) + cos(z + w),
2 sin z cos w = sin(z + w) + sin(z − w),
\[ \tan(z + w) = \frac{\tan z + \tan w}{1 - \tan z \tan w}, \]
\[ 1 + \tan^2 z = \sec^2 z, \qquad \cot^2 z + 1 = \csc^2 z. \]

(b) If x ∈ R, then for any natural number n,
\[ \cos nx = \sum_{k=0}^{\lfloor n/2 \rfloor} (-1)^k \binom{n}{2k} \cos^{n-2k} x \, \sin^{2k} x, \]
\[ \sin nx = \sum_{k=0}^{\lfloor (n-1)/2 \rfloor} (-1)^k \binom{n}{2k+1} \cos^{n-2k-1} x \, \sin^{2k+1} x, \]
where $\lfloor t \rfloor$ is the greatest integer less than or equal to t ∈ R. Suggestion: Expand
the left-hand side of de Moivre’s formula using the binomial theorem.
(c) Prove that
\[ \sin^2\frac{\pi}{5} = \frac{5 - \sqrt{5}}{8}, \qquad \cos^2\frac{\pi}{5} = \frac{3 + \sqrt{5}}{8}, \qquad \cos\frac{\pi}{5} = \frac{1 + \sqrt{5}}{4}. \]
Suggestion: What if you consider x = π/5 and n = 5 in the equation for sin nx in
Part (b)?
5. Prove that for 0 ≤ r < 1 and θ ∈ R,
\[ \sum_{n=0}^{\infty} r^n \cos(n\theta) = \frac{1 - r\cos\theta}{1 - 2r\cos\theta + r^2}, \qquad \sum_{n=1}^{\infty} r^n \sin(n\theta) = \frac{r\sin\theta}{1 - 2r\cos\theta + r^2}. \]
Suggestion: Let $z = re^{i\theta}$ in the geometric series $\sum_{n=0}^{\infty} z^n$.
6. Prove that if e < β, then $\beta^e < e^\beta$.
7. Here’s a very neat problem posed by D.J. Newman [156].
(i) Prove that
\[ \lim_{n\to\infty} n \sin(2\pi\, e\, n!) = 2\pi, \]
where e is Euler’s number. Suggestion: Start by multiplying
$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \frac{1}{(n+1)!} + \cdots$ by 2πn! and see what happens.
(ii) Prove, using (i), that e is irrational.
8. (Hyperbolic functions) In this problem we study the hyperbolic functions.
(a) Show that
sinh(z + w) = sinh z cosh w + cosh z sinh w,
cosh(z + w) = cosh z cosh w + sinh z sinh w,
\[ \sinh(2z) = 2\cosh z \sinh z, \qquad \cosh^2 z - \sinh^2 z = 1. \]

(b) If z = x + iy, prove that
\[ \sinh z = \sinh x \cos y + i \cosh x \sin y, \qquad \cosh z = \cosh x \cos y + i \sinh x \sin y, \]
\[ |\sinh z|^2 = \sinh^2 x + \sin^2 y, \qquad |\cosh z|^2 = \sinh^2 x + \cos^2 y. \]
Determine all z ∈ C such that sinh z is real. Do the same for cosh z. Determine all
the zeros of sinh z and cosh z.
(c) Prove that if z = x + iy, then


sin z = sin x cosh y + i cos x sinh y, cos z = cos x cosh y − i sin x sinh y.
Determine all z ∈ C such that sin z is real. Do the same for cos z.
9. Here is an interesting geometric problem. Let z ∈ C and let G(n, r) denote a regular
n-gon (n ≥ 3) of radius r centered at the origin of C. In this problem we find a formula
for the sum of the squares of the distances from z to the vertices of G(n, r). Using
complex numbers, this problem is not too difficult to solve. Proceed as follows.
(i) Show that $0 = \sum_{k=1}^{n} e^{2\pi i k/n} = e^{2\pi i/n} + \left(e^{2\pi i/n}\right)^2 + \cdots + \left(e^{2\pi i/n}\right)^n$.
(ii) Show that
\[ \sum_{k=1}^{n} \left| z - r e^{2\pi i k/n} \right|^2 = n(|z|^2 + r^2). \]

Interpret this equation in the context of our problem.


10. In this problem we consider “Thomae-like” functions. Prove that the following functions
are continuous at the irrationals and discontinuous at the rationals.
(a) Define f : R −→ R by
\[ f(x) = \begin{cases} \sin(1/q) & \text{if } x \in \mathbb{Q} \text{ and } x = p/q \text{ in lowest terms and } q > 0, \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]
(b) Define g : (0, ∞) −→ R by
\[ g(x) = \begin{cases} p \sin(1/q) & \text{if } x \in \mathbb{Q} \text{ and } x = p/q \text{ in lowest terms and } q > 0, \\ x & \text{if } x \text{ is irrational.} \end{cases} \]
11. In this problem we define π using only the most elementary properties of cosine and
sine. (See [209, p. 160] for another proof). Assume that you are given continuous
functions cos, sin : R −→ R such that
(a) $\cos^2 x + \sin^2 x = 1$ for all x ∈ R.
(b) cos(0) = 1 and sin x is positive for x > 0 sufficiently small.
(c) sin(x ± y) = sin x cos y ± cos x sin y for all x, y ∈ R.
Based on these three properties of cosine and sine, we shall prove that
π := 2 · inf A , where A = {x > 0 ; cos x = 0},
is well-defined, which amounts to showing that A ≠ ∅. Assume, by way of contradiction,
that A = ∅. Now proceed as follows.
(i) First establish the following identity: For any x, y ∈ R,
\[ \sin x - \sin y = 2 \cos\frac{x + y}{2} \, \sin\frac{x - y}{2}. \]
(ii) Show that cos x > 0 for all x ≥ 0.
(iii) Using (a) show that sin : [0, ∞) −→ R is strictly increasing and use this to show
that cos : [0, ∞) −→ R is strictly decreasing.
(iv) Show that $L := \lim_{x\to\infty} \cos x$ exists and $\lim_{x\to\infty} \sin x = \sqrt{1 - L^2}$.
(v) Prove that sin 2x = 2 cos x sin x for all x ∈ R and then prove that L = 1/2.
(vi) Using the identity in (i), prove that for any y ∈ R we have sin y ≤ 0. This
contradicts that sin x > 0 for x sufficiently small.
(vii) Thus, the assumption that A = ∅ must have been false and hence π is well-
defined. Now that we know π is well-defined we can use this new definition to
re-prove some properties we already verified in the text. For example, prove that
cos(π/2) = 0 and from (i), show that sin x is strictly increasing on [0, π/2]. Prove
that sin(π/2) = 1 and cos x is strictly decreasing on [0, π/2].

4.8. ★ Three proofs of the fundamental theorem of algebra (FTA)


In elementary calculus you were exposed to the “method of partial fractions” to
integrate rational functions, in which you had to factor polynomials. The necessity
to factor polynomials for the method of partial fractions played a large rôle in the
race to prove the fundamental theorem of algebra; see [61] for more on this history,
especially Euler’s part in this theorem. It was Carl Friedrich Gauss (1777–1855)
who first proved the fundamental theorem of algebra, as part of his doctoral thesis
(1799) entitled “A new proof of the theorem that every integral rational algebraic
function11 can be decomposed into real factors of the first or second degree” (see e.g.
[36, p. 499]). We present three independent and different guises of one of the more
elementary and popular “topological” proofs of the theorem, except we shall work
with general complex polynomials, that is, polynomials with complex coefficients.

4.8.1. Our first proof of the FTA. Our first proof is found in the article by
Remmert [186]. This proof could have actually been presented immediately after
Section 4.4, but we have chosen to save the proof till now because it fits so well
with roots of complex numbers that we’ll touch on in Section 4.8.3.
Given n ∈ N and z ∈ C, a complex number ξ is called an n-th root of z
if $\xi^n = z$. A natural question is: Does every z ∈ C have an n-th root? Notice
that if z = 0, then ξ = 0 is an n-th root of z and is the only n-th root (since
a nonzero number cannot be an n-th root of 0 because the product of nonzero
complex numbers is nonzero). Thus, for existence purposes we may assume that z
is nonzero. Now certainly if n = 1, then z has a first root, namely ξ = z. If n = 2
and if z is a real positive number, then we know z has a square root and if z is a
real negative number, then $\xi = i\sqrt{-z}$ is a square root of z. If z = a + ib, where
b ≠ 0, then the numbers
\[ \xi = \pm\left( \sqrt{\frac{|z| + a}{2}} + i\, \frac{b}{|b|} \sqrt{\frac{|z| - a}{2}} \right) \]
are square roots of z, as the reader can easily verify; see Problem 8. What about
higher order roots for nonzero complex numbers? In the following lemma we prove
that any complex number has an n-th root. In Subsection 4.8.3 we’ll give another
proof of this lemma using facts about exponential and trigonometric functions de-
veloped in the previous sections. However, the following proof is interesting because
it is completely elementary in that it avoids any reference to these functions.
Lemma 4.41. Any complex number has an n-th root.
Proof. Let z ∈ C, which we may assume is nonzero. We shall prove that z
has an n-th root using strong induction. We already know that z has n-th roots for
n = 1, 2. Let n > 2 and assume that z has roots for all natural numbers less than
n, we shall prove that z has an n-th root.
Suppose first that n is even, say n = 2m for some natural number m ≥ 2. Then
we are looking for a complex number ξ such that $\xi^{2m} = z$. By our discussion before
this lemma, we know that there is a number η such that $\eta^2 = z$ and since m < n,

11In plain English, a polynomial with real coefficients. You can find a beautiful translation
of Gauss’ thesis by Ernest Fandreyer at http://www.fsc.edu/library/documents/Theorem.pdf.
Gauss’ proof was actually incorrect, but he published a correct version in 1816.

by induction hypothesis, we know there is a number ξ such that $\xi^m = \eta$. Then
\[ \xi^n = \xi^{2m} = (\xi^m)^2 = \eta^2 = z, \]
and we’ve found an n-th root of z.
Suppose now that n is odd. If z is a nonnegative real number, then we know
that z has a real n-th root, so we may assume that z is not a nonnegative real
number. Choose a complex number η such that $\eta^2 = z$. Then for x ∈ R, consider
the polynomial p(x) given by taking the imaginary part of $\eta(x - i)^n$:
\[ p(x) := \operatorname{Im}\left[\eta\,(x - i)^n\right] = \frac{1}{2i}\left[\eta\,(x - i)^n - \overline{\eta}\,(x + i)^n\right], \]
where we used Property (4) of Theorem 2.43 that $\operatorname{Im} w = \frac{1}{2i}(w - \overline{w})$ for any complex
number w. Expanding $(x - i)^n$ using the binomial theorem, we see that
\[ p(x) = \operatorname{Im}(\eta)\, x^n + \text{lower order terms in } x. \]
Since η is not real, the coefficient in front of $x^n$ is nonzero, so p(x) is an n-th degree
polynomial in x with real coefficients. In Problem 2 of Section 4.4 we noted that
all odd degree real-valued polynomials have a real root, so there is some c ∈ R with
p(c) = 0. For this c, we have
\[ \eta\,(c - i)^n - \overline{\eta}\,(c + i)^n = 0. \]
After a little manipulation, and using that $\eta^2 = z$, we get
\[ \frac{(c + i)^n}{(c - i)^n} = \frac{\eta}{\overline{\eta}} = \frac{\eta^2}{|\eta|^2} = \frac{z}{|z|} \quad\Longrightarrow\quad |z|\, \frac{(c + i)^n}{(c - i)^n} = z. \]
It follows that $\xi = \sqrt[n]{|z|}\, \frac{c + i}{c - i}$ satisfies $\xi^n = z$ and our proof is now complete.
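The induction in this proof translates almost line by line into a recursive procedure. The following is a minimal sketch under stated assumptions, not the author's method: the names `complex_sqrt` and `nth_root` are hypothetical, and bisection is an assumption of this sketch for locating the real root c, whose existence the proof takes from Problem 2 of Section 4.4. No exponential or trigonometric function is used.

```python
import math

def complex_sqrt(z: complex) -> complex:
    """One square root of z, via the explicit formula quoted before Lemma 4.41."""
    a, b = z.real, z.imag
    if b == 0:
        return complex(math.sqrt(a), 0) if a >= 0 else complex(0, math.sqrt(-a))
    r = abs(z)
    sign = 1.0 if b > 0 else -1.0            # this is b/|b|
    return complex(math.sqrt((r + a) / 2), sign * math.sqrt((r - a) / 2))

def nth_root(z: complex, n: int) -> complex:
    """One n-th root of z, following the case analysis of the proof."""
    z = complex(z)
    if z == 0:
        return 0j
    if n == 1:
        return z
    if n == 2:
        return complex_sqrt(z)
    if n % 2 == 0:
        # Even case: take a square root eta of z, then an (n/2)-th root of eta.
        return nth_root(complex_sqrt(z), n // 2)
    if z.imag == 0 and z.real >= 0:
        return complex(z.real ** (1.0 / n), 0)   # real n-th root
    # Odd case: p(x) = Im(eta * (x - i)^n) is a real polynomial of odd degree.
    eta = complex_sqrt(z)

    def p(x: float) -> float:
        return (eta * complex(x, -1) ** n).imag

    lo, hi = -1.0, 1.0
    while p(lo) * p(hi) > 0:                     # widen until the sign changes
        lo, hi = 2 * lo, 2 * hi
    for _ in range(200):                         # bisection for a real root c
        mid = (lo + hi) / 2
        if p(lo) * p(mid) <= 0:
            hi = mid
        else:
            lo = mid
    c = (lo + hi) / 2
    return abs(z) ** (1.0 / n) * complex(c, 1) / complex(c, -1)

xi = nth_root(1 + 2j, 5)
print(xi, xi ** 5)    # xi**5 should be close to 1 + 2j
```

The root returned depends on which real zero of p the bisection happens to find, so it need not be the principal root.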

We now present our first proof of the celebrated fundamental theorem of alge-
bra. The following proof is a very elementary proof of Gauss’ famous result in the
sense that looking through the proof, we see that the nontrivial results we use are
kept at a minimum:
(1) The Bolzano-Weierstrass theorem.
(2) Any nonzero complex number has a k-th root.
For other presentations of basically the same proof, see [69], [222], [191], or (one
of my favorites) [185].
Theorem 4.42 (The fundamental theorem of algebra, Proof I). Any
complex polynomial of positive degree has at least one complex root.
Proof. Let $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ be a polynomial with
complex coefficients, n ≥ 1 with $a_n \ne 0$. We prove this theorem in four steps.
Step 1: We begin by proving a simple, but important, inequality. Since
\[ |p(z)| = |a_n z^n + \cdots + a_0| = |z|^n \left| a_n + \frac{a_{n-1}}{z} + \frac{a_{n-2}}{z^2} + \cdots + \frac{a_1}{z^{n-1}} + \frac{a_0}{z^n} \right|, \]
for |z| sufficiently large the absolute value of the sum of all the terms to the right
of $a_n$ can be made less than, say $|a_n|/2$. Therefore,
(4.38) $|p(z)| \ge \frac{|a_n|}{2} \cdot |z|^n$, for |z| sufficiently large.

Step 2: We now prove that there exists a point c ∈ C such that |p(c)| ≤ |p(z)|
for all z ∈ C. The proof of this involves the Bolzano-Weierstrass theorem. Define
m := inf A, A := {|p(z)| ; z ∈ C} .
This infimum certainly exists since A is nonempty and bounded below by zero.
Since m is the greatest lower bound of A, for each k ∈ N, m + 1/k is no longer a
lower bound, so there is a point $z_k$ ∈ C such that $m \le |p(z_k)| < m + 1/k$. By (4.38),
the sequence $\{z_k\}$ must be bounded, so by the Bolzano-Weierstrass theorem, this
sequence has a convergent subsequence $\{w_k\}$. If c is the limit of this subsequence,
then by continuity of polynomials, $|p(w_k)| \to |p(c)|$ and since $m \le |p(z_k)| < m + 1/k$
for all k, by the squeeze theorem we must have |p(c)| = m.
Step 3: The rest of the proof involves showing that the minimum m must be
zero, which shows that p(c) = 0, and so c is a root of p(z). To do so, we introduce
an auxiliary polynomial q(z) as follows. Let us suppose, for sake of contradiction,
that p(c) ≠ 0. Define q(z) := p(z + c)/p(c). Then |q(z)| has a minimum at the
point z = 0, the minimum being |q(0)| = |1| = 1. Since q(0) = 1, we can write
(4.39) $q(z) = b_n z^n + \cdots + 1 = b_n z^n + \cdots + b_k z^k + 1$,
where k is the smallest natural number such that $b_k \ne 0$. In our next step we shall
prove that 1 is in fact not the minimum of |q(z)|, which gives a contradiction.
Step 4: By our lemma, $-1/b_k$ has a k-th root a, so that $a^k = -1/b_k$. Then
|q(az)| also has a minimum at z = 0, and
\[ q(az) = 1 + b_k (az)^k + \cdots = 1 - z^k + \cdots, \]
where · · · represents terms of higher degree than k. Thus, we can write
\[ q(az) = 1 - z^k + z^{k+1} r(z), \]
where r(z) is a polynomial of degree at most n − (k + 1). Let z = x, a real number
with 0 < x < 1, be so small that x |r(x)| < 1. Then,
\[ |q(ax)| = |1 - x^k + x^{k+1} r(x)| \le |1 - x^k| + x^{k+1} |r(x)| < 1 - x^k + x^k \cdot 1 = 1 = |q(0)| \quad\Longrightarrow\quad |q(ax)| < |q(0)|. \]
This shows that |q(z)| does not achieve a minimum at z = 0, contrary to what we
said earlier. Hence our assumption that p(c) ≠ 0 must have been false and our
proof is complete. 
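Step 4 can be watched in action on a concrete polynomial. The sketch below uses a hypothetical example of the form (4.39), chosen only for illustration: q(z) = 1 + (2 + i)z² + z³, so that k = 2 and $b_k = 2 + i$. Python's complex power is used to pick one k-th root a of $-1/b_k$; any other k-th root would serve equally well.

```python
def q(z: complex) -> complex:
    # Hypothetical example of the form (4.39): q(0) = 1, k = 2, b_k = 2 + i.
    return 1 + (2 + 1j) * z**2 + z**3

k, b_k = 2, 2 + 1j
a = (-1 / b_k) ** (1 / k)      # a k-th root of -1/b_k (principal branch)

for x in (0.5, 0.1, 0.01):
    # Along the ray z = a*x the leading correction is -x^k, so |q| drops below 1.
    print(x, abs(q(a * x)))
```

Each printed modulus is below |q(0)| = 1, so z = 0 cannot be a minimum point of |q|, which is exactly the contradiction used in the proof.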
We remark that the other two proofs of the FTA in this section (basically) only
differ from this proof at the first line in Step 4, in how we claim that there is a
complex number a with $a^k = -1/b_k$.
As a consequence of the fundamental theorem of algebra, we can prove the
well-known fact that a polynomial can be factored. Let p(z) be a polynomial of
positive degree n with complex coefficients and let $c_1$ be a root of p, which we know
exists by the FTA. Then from Lemma 2.52, we can write
\[ p(z) = (z - c_1)\, q_1(z), \]
where $q_1(z)$ is a polynomial of degree n − 1 in both z and $c_1$. By the FTA, $q_1$ has a
root, call it $c_2$. Then from Lemma 2.52, we can write $q_1(z) = (z - c_2)\, q_2(z)$ where
$q_2$ has degree n − 2 and substituting $q_1$ into the formula for p, we obtain
\[ p(z) = (z - c_1)(z - c_2)\, q_2(z). \]
Proceeding a total of n − 2 more times in this fashion we eventually arrive at
\[ p(z) = (z - c_1)(z - c_2) \cdots (z - c_n)\, q_n, \]
where $q_n$ is a polynomial of degree zero, that is, a necessarily nonzero constant.
It follows that $c_1, \ldots, c_n$ are roots of p(z). Moreover, these numbers are the only
roots, for if
\[ 0 = p(c) = (c - c_1)(c - c_2) \cdots (c - c_n)\, q_n, \]
then c must equal one of the $c_k$’s since a product of complex numbers is zero if and
only if one of the factors is zero. Summarizing, we have proved the following.
Corollary 4.43. If p(z) is a polynomial of positive degree n, then p has exactly
n complex roots $c_1, \ldots, c_n$ counting multiplicities and we can write
\[ p(z) = a\,(z - c_1)(z - c_2) \cdots (z - c_n) \]
for some nonzero constant a.
4.8.2. Our second proof of the FTA. Our second proof of the FTA is
almost exactly the same as the first, but at the beginning of Step 4 in the above
proof, we use a neat trick by Searcóid [167] that avoids the fact that every complex
number has an n-th root. His trick is the following lemma.

Lemma 4.44. Let ℓ be an odd natural number, $\zeta = (1 + i)/\sqrt{2}$, and let α be a
complex number of length 1. Then there is a natural number ν such that
\[ |1 + \alpha\, \zeta^{2\nu\ell}| < 1. \]
Proof. Observe that
\[ \zeta^2 = \frac{(1 + i)(1 + i)}{2} = \frac{1 + 2i + i^2}{2} = i. \]
Therefore, $1 + \alpha\,\zeta^{2\nu\ell}$ simplifies to $1 + \alpha\, i^{\nu\ell}$, and we shall use this latter expression for
the rest of the proof. Since ℓ is odd we can write ℓ = 2m + 1 for some m = 0, 1, 2, . . .,
thus for any natural number ν,
\[ i^{\nu\ell} = i^{\nu(2m+1)} = i^{2\nu m} \cdot i^{\nu} = (-1)^{\nu m} \cdot i^{\nu} = \begin{cases} i^{\nu} & \text{if } m \text{ is even,} \\ (-i)^{\nu} & \text{if } m \text{ is odd.} \end{cases} \]
Using this formula one can check that $\{i^{\nu\ell} \,;\, \nu \in \mathbb{N}\} = \{1, i, -1, -i\}$. Observe that
\[ |1 + \alpha\, i^{\nu\ell}|^2 = (1 + \alpha\, i^{\nu\ell})\left(1 + \overline{\alpha\, i^{\nu\ell}}\right) = 1 + \alpha\, i^{\nu\ell} + \overline{\alpha\, i^{\nu\ell}} + |\alpha\, i^{\nu\ell}|^2 = 2 + 2 \operatorname{Re}(\alpha\, i^{\nu\ell}), \]
where we used that $|\alpha\, i^{\nu\ell}| = |\alpha| = 1$ and that $2 \operatorname{Re} w = w + \overline{w}$ for any complex
number w from Property (4) of Theorem 2.43. Let α = a + ib. Then considering
the various cases $i^{\nu\ell} = 1, i, -1, -i$, we get
\[ \{\alpha\, i^{\nu\ell} \,;\, \nu \in \mathbb{N}\} = \{a\, i^{\nu\ell} + i b\, i^{\nu\ell} \,;\, \nu \in \mathbb{N}\} = \{a + ib,\; -b + ia,\; -a - ib,\; b - ia\}. \]
Hence, in view of the formula $|1 + \alpha\, i^{\nu\ell}|^2 = 2 + 2 \operatorname{Re}(\alpha\, i^{\nu\ell})$, we obtain
(4.40) $\{ |1 + \alpha\, i^{\nu\ell}|^2 \,;\, \nu \in \mathbb{N} \} = \{2 + 2a,\; 2 - 2b,\; 2 - 2a,\; 2 + 2b\}$.
Since $|\alpha|^2 = a^2 + b^2 = 1$, $|a| \ge 1/\sqrt{2}$ or $|b| \ge 1/\sqrt{2}$ (for otherwise $a^2 + b^2 < 1$). Let
us take the case when $|a| \ge 1/\sqrt{2}$; the other case is handled similarly. If $a \ge 1/\sqrt{2}$,
then $2 - 2a \le 2 - 2/\sqrt{2} = 2 - \sqrt{2} < 1$ and a ν corresponding to 2 − 2a in (4.40)
satisfies the conditions of this lemma. If $a \le -1/\sqrt{2}$, then $2 + 2a \le 2 - \sqrt{2}$, so a ν
corresponding to 2 + 2a in (4.40) satisfies the conditions of this lemma.
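Since $i^{\nu\ell}$ runs through only the four values 1, i, −1, −i, the lemma can be checked by a direct search over ν = 1, 2, 3, 4. The sketch below does exactly that; the names `I_POWERS` and `find_nu` are hypothetical, and the search range 1 to 4 is an assumption justified by the fact that νℓ hits every residue modulo 4 when ℓ is odd.

```python
import cmath

I_POWERS = (1, 1j, -1, -1j)    # exact values of i^0, i^1, i^2, i^3

def find_nu(alpha: complex, ell: int) -> int:
    """Return nu in {1, 2, 3, 4} with |1 + alpha * i^(nu*ell)| < 1.

    Uses zeta^(2*nu*ell) = i^(nu*ell), which holds because zeta^2 = i.
    Assumes |alpha| = 1 and ell odd.
    """
    for nu in range(1, 5):
        if abs(1 + alpha * I_POWERS[(nu * ell) % 4]) < 1:
            return nu
    raise ValueError("no nu found; check that |alpha| = 1 and ell is odd")

alpha = cmath.exp(0.3j)        # a sample point on the unit circle
nu = find_nu(alpha, 3)
print(nu, abs(1 + alpha * I_POWERS[(nu * 3) % 4]))
```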

Theorem 4.45 (The fundamental theorem of algebra, Proof II). Any
complex polynomial of positive degree has at least one complex root.
Proof. We proceed by strong induction. Certainly the FTA holds for all
polynomials of first degree, therefore assume that p(z) is a polynomial of degree
n ≥ 2 and suppose the FTA holds for all polynomials of degree less than n.
Now we proceed, without changing a single word, exactly as in Proof I up to
Step 4, where we use the following argument in place.
Step 4 modified: Recall that the polynomial q(z) in (4.39),
\[ q(z) = b_n z^n + \cdots + b_k z^k + 1, \]
has the property that |q(z)| has the minimum value 1. We claim that the k in this
expression cannot equal n. To see this, for sake of contradiction, let us suppose
that k = n. Then $q(z) = b_n z^n + 1$ and q(z) has the property that
\[ |q(z)| = |1 + b_n z^n| = \left| 1 + \frac{b_n}{|b_n|}\, |b_n|\, z^n \right| = |1 + \alpha\, w^n| \]
has the minimum value 1, where $\alpha = b_n/|b_n|$ has unit length and $w = |b_n|^{1/n} z$. We
derive a contradiction in three cases: n > 2 is even, n = 2, and n is odd. If n > 2
is even, then we can write n = 2m for a natural number m with 2 ≤ m < n. By
our induction hypothesis (the FTA holds for all polynomials of degree less than n),
there is a number η such that $\eta^m + 1/\alpha = 0$, and there is a number ξ such that
$\xi^2 - \eta = 0$. Then
\[ \xi^n = \xi^{2m} = (\xi^2)^m = \eta^m = -1/\alpha. \]
Thus, for w = ξ, we obtain $|1 + \alpha\, w^n| = 0$, which contradicts the fact that $|1 + \alpha\, w^n|$
is never less than 1. Now suppose that n = 2. Then by our lemma with ℓ = 1,
there is a ν such that
\[ |1 + \alpha\, \zeta^{2\nu}| < 1, \]
where $\zeta = (1 + i)/\sqrt{2}$. This shows that $w = \zeta^{\nu}$ satisfies $|1 + \alpha\, w^n| < 1$, again
contradicting the fact that $|1 + \alpha\, w^n|$ is never less than 1. Finally, suppose that
n = ℓ is odd. Then by our lemma, there is a ν such that
\[ |1 + \alpha\, \zeta^{2\nu n}| < 1, \]
where $\zeta = (1 + i)/\sqrt{2}$, which shows that $w = \zeta^{2\nu}$ satisfies $|1 + \alpha\, w^n| < 1$, again
resulting in a contradiction. Therefore, k < n.
Now that we’ve proved k < n, we can use our induction hypothesis to conclude
that there is a complex number a such that $a^k + 1/b_k = 0$, that is, $a^k = -1/b_k$. We
can now proceed exactly as in Step 4 of Proof I to finish the proof. 

4.8.3. Roots of complex numbers. Back in Section 2.7 we learned how to
find n-th roots of nonnegative real numbers; we now generalize this to complex
numbers using the polar representation of complex numbers studied in Section 4.7.
Let n ∈ N and let z be any complex number. We shall find all n-th roots of z
using trigonometry. If z = 0, then the only ξ that works is ξ = 0 since the product
of nonzero complex numbers is nonzero, therefore we henceforth assume that z ≠ 0.
We can write $z = re^{i\theta}$ where r > 0 and θ ∈ R and given any nonzero complex ξ we
can write $\xi = \rho e^{i\phi}$ where ρ > 0 and φ ∈ R. Then $\xi^n = z$ if and only if
\[ \rho^n e^{in\phi} = r\, e^{i\theta}. \]
Taking the absolute value of both sides, and using that $|e^{in\phi}| = 1 = |e^{i\theta}|$, we get
$\rho^n = r$, or $\rho = \sqrt[n]{r}$. Now cancelling off $\rho^n = r$, we see that
\[ e^{in\phi} = e^{i\theta}, \]
which holds if and only if nφ = θ + 2πm for some integer m, or
\[ \phi = \frac{\theta}{n} + \frac{2\pi m}{n}, \qquad m \in \mathbb{Z}. \]
As the reader can easily check, any number of this form differs by an integer multiple
of 2π from one of the following numbers:
\[ \frac{\theta}{n}, \quad \frac{\theta}{n} + \frac{2\pi}{n}, \quad \frac{\theta}{n} + \frac{4\pi}{n}, \quad \ldots, \quad \frac{\theta}{n} + \frac{2\pi}{n}(n - 1). \]
No two of these numbers differ by an integer multiple of 2π, therefore by our knowledge
of π and the unit circle, all the n numbers
\[ e^{i \frac{1}{n}(\theta + 2\pi k)}, \qquad k = 0, 1, 2, \ldots, n - 1 \]
are distinct. Thus, there are a total of n solutions ξ to the equation $\xi^n = z$, all of
them given in the following theorem.
Theorem 4.46 (Existence of complex n-th roots). There are exactly n
n-th roots of any nonzero complex number $z = re^{i\theta}$; the complete set of roots is
given by
\[ \sqrt[n]{r}\, e^{i \frac{1}{n}(\theta + 2\pi k)} = \sqrt[n]{r} \left[ \cos \frac{1}{n}(\theta + 2\pi k) + i \sin \frac{1}{n}(\theta + 2\pi k) \right], \qquad k = 0, 1, 2, \ldots, n - 1. \]
There is a very convenient way to write these n-th roots as we now describe.
First of all, notice that
\[ \sqrt[n]{r}\, e^{i \frac{1}{n}(\theta + 2\pi k)} = \sqrt[n]{r}\, e^{i\frac{\theta}{n}} \cdot e^{i\frac{2\pi k}{n}} = \sqrt[n]{r}\, e^{i\frac{\theta}{n}} \cdot \left( e^{i\frac{2\pi}{n}} \right)^k. \]
Therefore, the n-th roots of z are given by
\[ \sqrt[n]{r}\, e^{i\frac{\theta}{n}} \cdot \omega^k, \quad k = 0, 1, \ldots, n - 1, \qquad \text{where } \omega = e^{i\frac{2\pi}{n}} = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}. \]
Of all the n distinct roots, there is one called the principal n-th root, denoted
by $\sqrt[n]{z}$; it is the n-th root given by choosing θ to satisfy −π < θ ≤ π; thus,
\[ \sqrt[n]{z} := \sqrt[n]{r}\, e^{i\frac{\theta}{n}}, \qquad \text{where } -\pi < \theta \le \pi. \]
Note that if z = x > 0 is a positive real number, then $x = re^{i0}$ with r = x and
−π < 0 ≤ π, so the principal n-th root of x is just $\sqrt[n]{x}\, e^{i0/n} = \sqrt[n]{x}$, the usual
real n-th root of x. Thus, there is no ambiguity in notation between the complex
principal n-th root of a positive real number and its real n-th root.
We now give some examples.
Example 4.34. For our first example, we find the square roots of −1. Since
$-1 = e^{i\pi}$, because cos π + i sin π = −1 + i0, the square roots of −1 are $e^{i(1/2)\pi}$ and
$e^{i(1/2)(\pi + 2\pi)} = e^{i3\pi/2}$. Writing these numbers in terms of sine and cosine, we get i
and −i as the square roots of −1. Note that the principal square root of −1 is i
and so $\sqrt{-1} = i$, just as we learned in high school!

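Example 4.35. Next let us compute the n-th roots of unity, that is, 1. Since
$1 = 1\, e^{i0}$, all the n n-th roots of 1 are given by
\[ 1, \omega, \omega^2, \ldots, \omega^{n-1}, \qquad \text{where } \omega := e^{i\frac{2\pi}{n}} = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}. \]
Consider n = 4. In this case, $\cos\frac{2\pi}{4} + i\sin\frac{2\pi}{4} = i$, $i^2 = -1$, and $i^3 = -i$, therefore
the fourth roots of unity are
\[ 1, \quad i, \quad -1, \quad -i. \]
Since
\[ \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3} = -\frac{1}{2} + i\frac{\sqrt{3}}{2}, \]
the cube roots of unity are
\[ 1, \qquad -\frac{1}{2} + i\frac{\sqrt{3}}{2}, \qquad -\frac{1}{2} - i\frac{\sqrt{3}}{2}. \]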
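The formula of Theorem 4.46 is easy to try out numerically. In the sketch below the function name `nth_roots` is hypothetical; `cmath.polar` supplies r and an angle θ in [−π, π], which agrees with the principal choice −π < θ ≤ π except in floating-point edge cases involving a negative zero imaginary part.

```python
import cmath
import math

def nth_roots(z: complex, n: int) -> list[complex]:
    """All n-th roots of a nonzero z; the first entry is the principal root."""
    r, theta = cmath.polar(z)               # z = r*e^{i*theta}
    rho = r ** (1.0 / n)                    # the real n-th root of r
    omega = cmath.exp(2j * math.pi / n)     # omega = e^{2*pi*i/n}
    principal = rho * cmath.exp(1j * theta / n)
    return [principal * omega**k for k in range(n)]

print(nth_roots(-1, 2))    # approximately [i, -i], as in Example 4.34
print(nth_roots(1, 4))     # approximately [1, i, -1, -i], as in Example 4.35
print(nth_roots(1, 3))     # the cube roots of unity
```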
4.8.4. Our third proof of the FTA. We are now ready to present our third
proof of the FTA.
Theorem 4.47 (The fundamental theorem of algebra, Proof III). Any
complex polynomial of positive degree has at least one complex root.
Proof. We proceed, without changing a single word, exactly as in Proof I up
to Step 4, where we use the following in place.
Step 4 modified: At the beginning of Step 4 in Proof I, we used Lemma
4.41 to conclude that there is a complex a such that $a^k = -1/b_k$. Now we can
simply invoke Theorem 4.46 to verify that there is such a number a. Explicitly, we
can just write $-1/b_k = re^{i\theta}$ and simply define $a = r^{1/k} e^{i\theta/k}$. In any case, now that
we have such an a, we can proceed exactly as in Step 4 of Proof I to finish the
proof. 
Exercises 4.8.
1. Let p(z) and q(z) be polynomials of degree at most n.
(a) If p vanishes at n + 1 distinct complex numbers, prove that p = 0, the zero poly-
nomial.
(b) If p and q agree at n + 1 distinct complex numbers, prove that p = q.
(c) If $c_1, \ldots, c_n$ (with each root repeated according to multiplicity) are roots of p(z),
a polynomial of degree n, prove that $p(z) = a_n (z - c_1)(z - c_2) \cdots (z - c_n)$ where
$a_n$ is the coefficient of $z^n$ in the expression for p(z).
2. Find the following roots and state which of the roots represents the principal root.
(a) Find the cube roots of −1.
(b) Find the square roots of i.
(c) Find the cube roots of i.
(d) Find the square roots of $\sqrt{3} + 3i$.
3. Geometrically (not rigorously) demonstrate that the n-th roots, with n ≥ 3, of a
nonzero complex number z are the vertices of a regular polygon.

4. Let n ∈ N and let $\omega = e^{i \frac{2\pi}{n}}$. If k is any integer that is not a multiple of n, prove that
\[ 1 + \omega^{k} + \omega^{2k} + \omega^{3k} + \cdots + \omega^{(n-1)k} = 0. \]
5. Prove by “completing the square” that any quadratic equation $a z^2 + bz + c = 0$ with
complex coefficients and $a \neq 0$ has two complex roots, counting multiplicities, given by
\[ z = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \]
where $\sqrt{b^2 - 4ac}$ is the principal square root of $b^2 - 4ac$.

6. We show how the ingenious mathematicians of the past solved the general cubic equation
$z^3 + bz^2 + cz + d = 0$ with complex coefficients; for the history, see [88].
(i) First, replacing z with z − b/3, show that our cubic equation transforms into an
equation of the form $z^3 + \alpha z + \beta = 0$ where α and β are complex. Thus, we may
focus our attention on the equation $z^3 + \alpha z + \beta = 0$.
(ii) Second, show that the substitution z = w − α/(3w) gives an equation of the form
\[ 27(w^3)^2 + 27\beta (w^3) - \alpha^3 = 0, \]
a quadratic equation in $w^3$. We can solve this equation for $w^3$ by the previous
problem, therefore we can solve for w, and therefore we can get z = w − α/(3w).
(iii) Using the technique outlined above, solve the equation $z^3 - 12z - 3 = 0$.
7. A nice application of the previous problem is finding sin(π/9) and cos(π/9).
(i) Use de Moivre’s formula to prove that
\[ \cos 3x = \cos^3 x - 3 \cos x \sin^2 x, \qquad \sin 3x = 3 \cos^2 x \sin x - \sin^3 x. \]
(ii) Choose one of these equations and using $\cos^2 x + \sin^2 x = 1$, turn the right-hand
side into a cubic polynomial in cos x or sin x.
(iii) Using the equation you get, determine sin(π/9) and cos(π/9).
8. This problem is for the classic mathematicians at heart: We find square roots without
using the technology of trigonometric functions.
(i) Let z = a + ib be a nonzero complex number with $b \neq 0$. Show that $\xi = x + iy$
satisfies $\xi^2 = z$ if and only if $x^2 - y^2 = a$ and $2xy = b$.
(ii) Prove that $x^2 + y^2 = \sqrt{a^2 + b^2} = |z|$, and then $x^2 = \frac{1}{2}\big(|z| + a\big)$ and $y^2 = \frac{1}{2}\big(|z| - a\big)$.
(iii) Finally, deduce that ξ must equal
\[ \xi = \pm \left( \sqrt{\frac{|z| + a}{2}} + i \, \frac{b}{|b|} \sqrt{\frac{|z| - a}{2}} \right). \]
9. Prove that if r is a root of a polynomial $p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0$, then
\[ |r| \le \max\Big\{ 1, \ \sum_{k=0}^{n-1} |a_k| \Big\}. \]
10. (Continuous dependence of roots) Following Uherka and Sergott [227], we prove
the following useful theorem. Let $z_0$ be a root of multiplicity m of a polynomial
$p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0$. Then given any ε > 0, there is a δ > 0 such that if
$q(z) = z^n + b_{n-1} z^{n-1} + \cdots + b_0$ satisfies $|b_j - a_j| < \delta$ for all j = 0, . . . , n − 1, then q(z)
has at least m roots within ε of $z_0$. You may proceed as follows.
(i) Suppose the theorem is false. Prove there is an ε > 0 and a sequence $\{q_k\}$ of
polynomials $q_k(z) = z^n + b_{k,n-1} z^{n-1} + \cdots + b_{k,0}$ such that $q_k$ has at most m − 1
roots within ε of $z_0$ and for each j = 0, . . . , n − 1, we have $b_{k,j} \to a_j$ as k → ∞.
(ii) Let $r_{k,1}, \ldots, r_{k,n}$ be the n roots of $q_k$. Let $R_k = (r_{k,1}, \ldots, r_{k,n}) \in \mathbb{C}^n = \mathbb{R}^{2n}$.
Prove that the sequence $\{R_k\}$ has a convergent subsequence. Suggestion: Problem
9 is helpful.
(iii) By relabelling the subsequence if necessary, we assume that $\{R_k\}$ itself converges;
say $R_k = (r_{k,1}, \ldots, r_{k,n}) \to (r_1, \ldots, r_n)$. Prove that at most m − 1 of the $r_j$’s can
equal $z_0$.
(iv) From Problem 2 in Exercises 2.10, $q_k(z) = (z - r_{k,1})(z - r_{k,2}) \cdots (z - r_{k,n})$. Prove
that for each z ∈ C, $\lim_{k\to\infty} q_k(z) = (z - r_1)(z - r_2) \cdots (z - r_n)$. On the other hand,
using that $b_{k,j} \to a_j$ as k → ∞, prove that for each z ∈ C, $\lim_{k\to\infty} q_k(z) = p(z)$.
Derive a contradiction.

4.9. The inverse trigonometric functions and the complex logarithm


In this section we study the inverse trigonometric functions you learned in
elementary calculus. We then use these functions to derive properties of the polar
angle, also called the argument of a complex number. In Section 4.6 we developed
the properties of real logarithms and, using the logarithm, we defined complex
powers of positive bases. In our current section we shall extend logarithms to include
complex logarithms, which are then used to define complex powers with complex
bases. Finally, we use the complex logarithm to define complex inverse trigonometric
functions.

4.9.1. The real-valued inverse trigonometric functions. By the oscillation
theorem 4.38, we know that
sin : [−π/2, π/2] −→ [−1, 1] and cos : [0, π] −→ [−1, 1]
are both strictly monotone bijective continuous functions, sin being strictly in-
creasing and cos being strictly decreasing. In particular, by the monotone inverse
theorem, each of these functions has a strictly monotone inverse which we denote
by
arcsin : [−1, 1] −→ [−π/2, π/2] and arccos : [−1, 1] −→ [0, π],
called the inverse, or arc, sine, which is strictly increasing, and inverse, or
arc, cosine, which is strictly decreasing. Being inverse functions, these functions
satisfy
sin(arcsin x) = x, −1 ≤ x ≤ 1 and arcsin(sin x) = x, −π/2 ≤ x ≤ π/2
and
cos(arccos x) = x, −1 ≤ x ≤ 1 and arccos(cos x) = x, 0 ≤ x ≤ π.
If 0 ≤ θ ≤ π, then −π/2 ≤ π/2 − θ ≤ π/2, so letting x denote both sides of the
identity
\[ \cos\theta = \sin\Big( \frac{\pi}{2} - \theta \Big), \]
we get θ = arccos x and π/2 − θ = arcsin x, which further imply that
\[ \arccos x = \frac{\pi}{2} - \arcsin x, \quad \text{for all } -1 \le x \le 1. \tag{4.41} \]
We now introduce the inverse tangent function. We first claim that
tan : (−π/2, π/2) −→ R
is a strictly increasing bijection. Indeed, since
\[ \tan x = \frac{\sin x}{\cos x} \]
and sin is strictly increasing on [0, π/2] from 0 to 1 and cos is strictly decreasing
on [0, π/2] from 1 to 0, we see that tan is strictly increasing on [0, π/2) from 0 to
∞. Using the properties of sin and cos on [−π/2, 0], in a similar manner one can
show that tan is strictly increasing on (−π/2, 0] from −∞ to 0. This proves that
tan : (−π/2, π/2) −→ R is a strictly increasing bijection. Therefore, this function
has a strictly increasing inverse, which we denote by
arctan : R −→ (−π/2, π/2),
and call the inverse, or arc, tangent.
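The identity (4.41) and the monotonicity of the tangent can be spot-checked numerically. The following Python sketch is only an illustration under our own assumptions: the helper name `arccos_via_arcsin`, the sample points, and the grid are arbitrary choices, and a finite check of course proves nothing.

```python
import math

def arccos_via_arcsin(x):
    """Right-hand side of (4.41): pi/2 - arcsin x, for -1 <= x <= 1."""
    return math.pi / 2 - math.asin(x)

# Compare with the library arccosine at a few arbitrary sample points.
samples = [-1.0, -0.5, 0.0, 0.3, 1.0]
max_gap = max(abs(math.acos(x) - arccos_via_arcsin(x)) for x in samples)

# tan should be strictly increasing on a grid inside (-pi/2, pi/2);
# the grid [-1.5, 1.5] with step 0.01 is an arbitrary choice.
grid = [-1.5 + 0.01 * k for k in range(301)]
tan_increasing = all(math.tan(a) < math.tan(b) for a, b in zip(grid, grid[1:]))
```

Here `max_gap` should be at the level of rounding error and `tan_increasing` should be `True`.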

4.9.2. The argument of a complex number. Given a nonzero complex
number z we know that we can write $z = |z| e^{i\theta}$ for some θ ∈ R and all such
θ’s satisfying this equation differ by integer multiples of 2π. Geometrically, θ is
interpreted as the angle z makes with the positive real axis when z is drawn as a
point in $\mathbb{R}^2$. Any such angle θ is called an argument of z and is denoted by arg z.
Thus, we can write
\[ z = |z| \, e^{i \arg z}. \]
We remark that arg z is not a function but is referred to as a “multiple-valued
function” since arg z does not represent a single value of θ; however, any two
choices for arg z differ by an integer multiple of 2π. If w is another nonzero complex
number, written as $w = |w| e^{i\phi}$, so that arg w = φ, then
\[ zw = \big( |z| e^{i\theta} \big) \big( |w| e^{i\phi} \big) = |z| \, |w| \, e^{i(\theta + \phi)}, \]
which implies that
arg(zw) = arg z + arg w.
We interpret this as saying that any choices for these three arguments satisfy this
equation up to an integer multiple of 2π. Thus, the argument of a product is the
sum of the arguments. What other function do you know of that takes products
into sums? The logarithm of course — we shall shortly show how arg is involved
in the definition of complex logarithms. Similarly, properly interpreted we have
\[ \arg \frac{z}{w} = \arg z - \arg w. \]
With all the ambiguity in arg, mathematically it would be nice to turn arg
into a function. To do so, note that given a nonzero complex number z, there is
exactly one argument satisfying −π < arg z ≤ π; this particular angle is called the
principal argument of z and is denoted by Arg z. Thus, Arg : C \ {0} −→ R is
characterized by the following properties:

\[ z = |z| e^{i \operatorname{Arg} z}, \qquad -\pi < \operatorname{Arg} z \le \pi. \]

Then all arguments of z differ from the principal one by multiples of 2π:
arg z = Arg z + 2π n, n ∈ Z.
We can find many different formulas for Arg z using the inverse trig functions as
follows. Writing z in terms of its real and imaginary parts: z = x+iy, and equating
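this with $|z| e^{i \operatorname{Arg} z} = |z| \cos(\operatorname{Arg} z) + i |z| \sin(\operatorname{Arg} z)$, we see that
\[ \cos \operatorname{Arg} z = \frac{x}{\sqrt{x^2 + y^2}} \quad \text{and} \quad \sin \operatorname{Arg} z = \frac{y}{\sqrt{x^2 + y^2}}. \tag{4.42} \]
By the properties of cosine, we see that
\[ -\frac{\pi}{2} < \operatorname{Arg} z < \frac{\pi}{2} \iff x > 0. \]
Since arcsin is the inverse of sin with angles in (−π/2, π/2), it follows that
\[ \operatorname{Arg} z = \arcsin\left( \frac{y}{\sqrt{x^2 + y^2}} \right), \quad x > 0. \]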

Perhaps the most common formula for Arg z when x > 0 is in terms of arctangent,
which is derived by dividing the formulas in (4.42) to get tan Arg z = y/x and then
taking arctan of both sides:
\[ \operatorname{Arg} z = \arctan \frac{y}{x}, \quad x > 0. \]
We now derive a formula for Arg z when y > 0. By the properties of sine, we see
that

0 ≤ Arg z ≤ π ⇐⇒ y ≥ 0 and − π < Arg z < 0 ⇐⇒ y < 0.

Assuming that y ≥ 0, that is, 0 ≤ Arg z ≤ π, we can take the arccos of both sides
of the first equation in (4.42) to get
\[ \operatorname{Arg} z = \arccos\left( \frac{x}{\sqrt{x^2 + y^2}} \right), \quad y \ge 0. \]
Assume that y < 0, that is, −π < Arg z < 0. Then 0 < − Arg z < π and since
cos Arg z = cos(− Arg z), we get $\cos(-\operatorname{Arg} z) = x/\sqrt{x^2 + y^2}$. Taking the arccos of
both sides, we get
\[ \operatorname{Arg} z = -\arccos\left( \frac{x}{\sqrt{x^2 + y^2}} \right), \quad y < 0. \]

Putting together our expressions for Arg z, we obtain the following formulas for the
principal argument:

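\[ \operatorname{Arg} z = \arctan \frac{y}{x} \quad \text{if } x > 0, \tag{4.43} \]
and
\[ \operatorname{Arg} z = \begin{cases} \arccos\left( \dfrac{x}{\sqrt{x^2 + y^2}} \right) & \text{if } y \ge 0, \\[2ex] -\arccos\left( \dfrac{x}{\sqrt{x^2 + y^2}} \right) & \text{if } y < 0. \end{cases} \tag{4.44} \]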

Using these formulas, we can easily prove the following theorem.

Theorem 4.48. Arg : C \ (−∞, 0] −→ (−π, π) is continuous.

Proof. Since

C \ (−∞, 0] = {x + iy ; x > 0} ∪ {x + iy ; y > 0} ∪ {x + iy ; y < 0},

all we have to do is prove that Arg is continuous on each of these three sets. But
this is easy: The formula (4.43) shows that Arg is continuous when x > 0, the first
formula in (4.44) shows that Arg is continuous when y > 0, and the second formula
in (4.44) shows that Arg is continuous when y < 0. 
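The formulas (4.43) and (4.44) translate directly into code. The Python sketch below is an illustration only: the name `principal_arg` is ours, the clamp guarding against rounding is our addition, and we compare against the standard library function `cmath.phase`, which also returns the argument in [−π, π]. The last lines probe two points just above and just below the negative real axis, where the values of Arg are far apart.

```python
import cmath
import math

def principal_arg(z):
    """Principal argument via formula (4.44); hypothetical helper, z must be nonzero."""
    x, y = z.real, z.imag
    if x == 0 and y == 0:
        raise ValueError("Arg is undefined at 0")
    # Clamp to [-1, 1] so rounding in x / r cannot push acos out of its domain.
    c = max(-1.0, min(1.0, x / math.hypot(x, y)))
    return math.acos(c) if y >= 0 else -math.acos(c)

# Agreement with cmath.phase at a few arbitrary sample points.
points = [1 + 1j, -1 - 1j, -1j, 3 + 0j, -2 + 0j, 0.3 - 4j]
max_gap = max(abs(principal_arg(z) - cmath.phase(z)) for z in points)

# Formula (4.43), valid when x > 0.
z = 2 + 1j
arctan_gap = abs(principal_arg(z) - math.atan(z.imag / z.real))

# Just above and just below the negative real axis the values differ by about 2*pi.
jump = principal_arg(complex(-1.0, 1e-9)) - principal_arg(complex(-1.0, -1e-9))
```

The arccos form loses some accuracy very close to the real axis, so for serious numerical work `cmath.phase` (or `math.atan2`) is the safer choice; the sketch is meant only to mirror the formulas in the text.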

4.9.3. The complex logarithm and powers. Recall from Section 4.6.2 that
if a ∈ R and a > 0, then a real number ξ having the property that
\[ e^{\xi} = a \]
is called the logarithm of a; we know that ξ always exists and is unique since
exp : R −→ (0, ∞) is a bijection. Of course, ξ = log a by definition of log. We now
consider complex logarithms. We define such logarithms in an analogous way: If
z ∈ C and $z \neq 0$, then a complex number ξ having the property that
\[ e^{\xi} = z \]
is called a complex logarithm of z. The reason we assume $z \neq 0$ is that there
is no complex ξ such that $e^{\xi} = 0$. We now show that nonzero complex numbers
always have logarithms; however, in contrast to the case of real numbers, complex
numbers have infinitely many distinct logarithms!
Theorem 4.49. The complex logarithms of any given nonzero complex number
z are all of the form
\[ \xi = \log |z| + i \big( \operatorname{Arg} z + 2\pi n \big), \quad n \in \mathbb{Z}. \tag{4.45} \]
Therefore, all complex logarithms of z have exactly the same real part log |z|, but
have imaginary parts that differ by integer multiples of 2π from Arg z.
Proof. The idea behind this proof is very simple: We write
\[ z = |z| \cdot e^{i \arg z} = e^{\log |z|} \cdot e^{i \arg z} = e^{\log |z| + i \arg z}. \]
Since any argument of z is of the form Arg z + 2πn for n ∈ Z, this equation shows
that all the numbers in (4.45) are indeed logarithms. On the other hand, if ξ is any
logarithm of z, then
\[ e^{\xi} = z = e^{\log |z| + i \operatorname{Arg} z}. \]
By Theorem 4.40 we must have ξ = log |z| + i Arg z + 2πi n for some n ∈ Z. This
completes our proof. 
To isolate one of these infinitely many logarithms we define the so-called “prin-
cipal” one. For any nonzero complex number z, we define the principal (branch
of the) logarithm of z by
Log z := log |z| + i Arg z.
By Theorem 4.49, all logarithms of z are of the form
Log z + 2πi n, n ∈ Z.
Note that if x ∈ R and x > 0, then Arg x = 0, therefore
Log x = log x,
our usual logarithm, so Log is an extension of the real log to complex numbers.
Example 4.36. Observe that since Arg(−1) = π and Arg i = π/2 and
log | − 1| = 0 = log |i|, since both equal log 1, we have
\[ \operatorname{Log}(-1) = i\pi, \qquad \operatorname{Log} i = i \frac{\pi}{2}. \]
The principal logarithm satisfies some of the properties of the real logarithm,
but we need to be careful with the addition properties.

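Example 4.37. For instance, observe that
\[ \operatorname{Log}(-1 \cdot i) = \operatorname{Log}(-i) = \log |-i| + i \operatorname{Arg}(-i) = -i \frac{\pi}{2}. \]
On the other hand,
\[ \operatorname{Log}(-1) + \operatorname{Log} i = i\pi + i \frac{\pi}{2} = i \frac{3\pi}{2}, \]
so $\operatorname{Log}(-1 \cdot i) \neq \operatorname{Log}(-1) + \operatorname{Log} i$. Another example of this phenomenon is
\[ \operatorname{Log}(-i \cdot -i) = \operatorname{Log}(-1) = i\pi, \qquad \operatorname{Log}(-i) + \operatorname{Log}(-i) = -i \frac{\pi}{2} - i \frac{\pi}{2} = -i\pi. \]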
However, under certain conditions, Log does satisfy the usual properties.
Theorem 4.50. Let z and w be nonzero complex numbers.
(1) If −π < Arg z + Arg w ≤ π, then
\[ \operatorname{Log} zw = \operatorname{Log} z + \operatorname{Log} w. \]
(2) If −π < Arg z − Arg w ≤ π, then
\[ \operatorname{Log} \frac{z}{w} = \operatorname{Log} z - \operatorname{Log} w. \]
(3) If Re z, Re w ≥ 0 with at least one strictly positive, then both (1) and (2) hold.
Proof. Suppose that −π < Arg z + Arg w ≤ π. By definition,
\[ \operatorname{Log} zw = \log\big( |z| \, |w| \big) + i \operatorname{Arg} zw = \log |z| + \log |w| + i \operatorname{Arg} zw. \]
Since arg(zw) = arg z + arg w, Arg z + Arg w is an argument of zw, and since
Arg(zw) is the unique argument of zw in (−π, π] and −π < Arg z + Arg w ≤ π, it
follows that Arg(zw) = Arg z + Arg w. Thus,
Log zw = log |z| + log |w| + i Arg z + i Arg w = Log z + Log w.
Property (2) is proved in a similar manner. Property (3) follows from (1) and (2)
since in case Re z, Re w ≥ 0 with at least one strictly positive, then as the reader
can verify, the hypotheses of both (1) and (2) are satisfied. 
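Example 4.37 and part (1) of Theorem 4.50 can be reproduced numerically. The sketch below is ours, not the author's: it assumes that Python's `cmath.log` implements the principal logarithm with imaginary part in [−π, π] (this is what the standard library documents), and the wrapper name `Log` and the test values are illustrative choices.

```python
import cmath

def Log(z):
    """Principal logarithm log|z| + i Arg z (thin wrapper around cmath.log)."""
    return cmath.log(z)

# Example 4.37: Log(-1 * i) versus Log(-1) + Log(i); the two differ by 2*pi*i.
lhs = Log(complex(0, -1))                      # Log(-i)
rhs = Log(complex(-1, 0)) + Log(complex(0, 1)) # Log(-1) + Log(i)
gap = rhs - lhs

# Theorem 4.50 (1): here Arg z + Arg w = pi/4 + pi/2 lies in (-pi, pi],
# so the addition rule is expected to hold.
z, w = complex(1, 1), complex(0, 1)
rule_error = abs(Log(z * w) - (Log(z) + Log(w)))
```

In the first case `gap` comes out as approximately 2πi, an integer multiple of 2πi as Theorem 4.49 predicts; in the second, `rule_error` is at the level of rounding error.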
We now use Log to define complex powers of complex numbers. Recall that
given any positive real number a and complex number z, we have $a^z := e^{z \log a}$.
Using Log instead of log, we can now define powers for complex a. Let a be any
nonzero complex number and let z be any complex number. Any number of the
form $e^{zb}$ where b is a complex logarithm of a is called a complex power of a to
the z; the choice of principal logarithm defines
\[ a^z := e^{z \operatorname{Log} a} \]
and we call this the principal value of a to the power z. As before, a is called
the base and z is called the exponent. Note that if a is a positive real number,
then Log a = log a, so
\[ a^z = e^{z \operatorname{Log} a} = e^{z \log a} \]
is the usual complex power of a defined in Section 4.6.3. Theorem 4.49 implies the
following.
Theorem 4.51. The complex powers of any given nonzero complex number a
to the power z are all of the form
\[ e^{z (\operatorname{Log} a + 2\pi i n)}, \quad n \in \mathbb{Z}. \tag{4.46} \]

In general, there are infinitely many complex powers, but in certain cases they
actually reduce to a finite number, see Problem 3. Here are some examples.
Example 4.38. Have you ever thought about what $i^i$ equals? In this case,
Log i = iπ/2, so
\[ i^i = e^{i \operatorname{Log} i} = e^{i(i\pi/2)} = e^{-\pi/2}, \]
a real number! Here is another nice example:
\[ (-1)^{1/2} = e^{(1/2) \operatorname{Log}(-1)} = e^{(1/2) i\pi} = \cos\frac{\pi}{2} + i \sin\frac{\pi}{2} = i, \]
therefore $(-1)^{1/2} = i$, just as we suspected!
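To see formula (4.46) at work, the following Python sketch lists a few complex powers of a given base. The function name `complex_powers` and the range of n are our own choices, and `cmath.log` is assumed to play the role of Log; n = 0 gives the principal value.

```python
import cmath

def complex_powers(a, z, ns=range(-2, 3)):
    """Map n -> e^{z (Log a + 2 pi i n)} as in (4.46); hypothetical helper."""
    log_a = cmath.log(a)
    return {n: cmath.exp(z * (log_a + 2j * cmath.pi * n)) for n in ns}

# Powers of i to the i: every value is (numerically) real and positive.
i_to_i = complex_powers(1j, 1j)
principal_i_to_i = i_to_i[0]          # approximately e^{-pi/2}

# Powers of -1 to the 1/2: only two distinct values occur, i and -i.
sqrt_minus_one = complex_powers(-1, 0.5, ns=range(0, 2))
```

With a rational exponent such as 1/2 the values repeat with period 2 in n, which is the finite case mentioned above; for the exponent i every n gives a different value.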
4.9.4. The complex-valued arctangent function. We now investigate the
complex arctangent function; the other complex inverse functions are found in
Problem 5. Given a complex number z, in the following theorem we shall find all
complex numbers ξ such that
(4.47) tan ξ = z.
Of course, if we can find such a ξ, then we would like to call ξ the “inverse tangent
of z.” However, when this equation does have solutions, it turns out that it has
infinitely many.
Lemma 4.52. If z = ±i, then the equation (4.47) has no solutions. If $z \neq \pm i$,
then
\[ \tan \xi = z \iff e^{2i\xi} = \frac{1 + iz}{1 - iz}, \]
that is, if and only if
\[ \xi = \frac{1}{2i} \times \text{a complex logarithm of } \frac{1 + iz}{1 - iz}. \]
Proof. The following statements are equivalent:
\[ \tan \xi = z \iff \sin \xi = z \cos \xi \iff e^{i\xi} - e^{-i\xi} = iz \big( e^{i\xi} + e^{-i\xi} \big) \]
\[ \iff \big( e^{2i\xi} - 1 \big) = iz \big( e^{2i\xi} + 1 \big) \iff (1 - iz) e^{2i\xi} = 1 + iz. \]
If z = i, then this last equation is just $2 e^{2i\xi} = 0$, which is impossible, and if z = −i,
then the last equation is 0 = 2, again an impossibility. If $z \neq \pm i$, then the last
equation is equivalent to
\[ e^{2i\xi} = \frac{1 + iz}{1 - iz}, \]
which by definition of complex logarithm just means that 2iξ is a complex logarithm
of the number (1 + iz)/(1 − iz). 
We now choose one of the solutions of (4.47), the obvious choice being the one
corresponding to the principal logarithm: Given any z ∈ C with $z \neq \pm i$, we define
the principal inverse, or arc, tangent of z to be the complex number
\[ \operatorname{Arctan} z = \frac{1}{2i} \operatorname{Log} \frac{1 + iz}{1 - iz}. \]
This defines a function Arctan : C \ {±i} −→ C, which does satisfy (4.47):
\[ \tan(\operatorname{Arctan} z) = z, \quad z \in \mathbb{C}, \ z \neq \pm i. \]

Some questions you might ask are whether or not Arctan really is an “inverse” of
tan, in other words, is Arctan a bijection; you might also ask if Arctan x = arctan x
for x real. The answer to the first question is “yes,” if we restrict the domain of
Arctan, and the answer to the second question is “yes.”
Theorem 4.53 (Properties of Arctan). Let
\[ D = \{ z \in \mathbb{C} \, ; \, z \neq iy, \ y \in \mathbb{R}, \ |y| \ge 1 \}, \qquad E = \{ \xi \in \mathbb{C} \, ; \, |\operatorname{Re} \xi| < \pi/2 \}. \]
Then
Arctan : D −→ E
is a continuous bijection from D onto E with inverse tan : E −→ D and when
restricted to real values,
Arctan : R −→ (−π/2, π/2)
and equals the usual arctangent function arctan : R −→ (−π/2, π/2).
Proof. We begin by showing that Arctan(D) ⊆ E. First of all, by definition
of Log, for any z ∈ C with $z \neq \pm i$ (not necessarily in D) we have
\[ \operatorname{Arctan} z = \frac{1}{2i} \operatorname{Log} \frac{1 + iz}{1 - iz} = \frac{1}{2i} \left( \log \left| \frac{1 + iz}{1 - iz} \right| + i \operatorname{Arg} \frac{1 + iz}{1 - iz} \right) \]
\[ = \frac{1}{2} \operatorname{Arg} \frac{1 + iz}{1 - iz} - \frac{i}{2} \log \left| \frac{1 + iz}{1 - iz} \right|. \tag{4.48} \]
Since the principal argument of any complex number lies in (−π, π], it follows that
\[ -\frac{\pi}{2} < \operatorname{Re} \operatorname{Arctan} z \le \frac{\pi}{2}, \quad \text{for all } z \in \mathbb{C}, \ z \neq \pm i. \]
Assume that Arctan z ∉ E, which, by the above inequality, is equivalent to
\[ 2 \operatorname{Re} \operatorname{Arctan} z = \operatorname{Arg} \frac{1 + iz}{1 - iz} = \pi \iff \frac{1 + iz}{1 - iz} \in (-\infty, 0). \]
If z = x + iy, then (by multiplying top and bottom of $\frac{1 + iz}{1 - iz}$ by $1 + i\bar{z}$, the complex
conjugate of $1 - iz$, and making a short computation) we can write
\[ \frac{1 + iz}{1 - iz} = \frac{1 - |z|^2}{|1 - iz|^2} + \frac{2x}{|1 - iz|^2} \, i. \]
This formula shows that (1 + iz)/(1 − iz) ∈ (−∞, 0) if and only if x = 0 and
$1 - |z|^2 < 0$, that is, x = 0 and $1 - y^2 < 0$, or, |y| > 1; hence,
\[ \frac{1 + iz}{1 - iz} \in (-\infty, 0) \iff z = iy, \ |y| > 1. \]
In summary, for any z ∈ C with $z \neq \pm i$, we have Arctan z ∉ E ⇐⇒ z ∉ D, or
\[ \operatorname{Arctan} z \in E \iff z \in D. \tag{4.49} \]
Therefore, Arctan(D) ⊆ E.
We now show that Arctan(D) = E, so let ξ ∈ E. Define z = tan ξ. Then
according to Lemma 4.52, we have $z \neq \pm i$ and $e^{2i\xi} = \frac{1 + iz}{1 - iz}$. By definition of E,
the real part of ξ satisfies −π/2 < Re ξ < π/2. Since Im(2iξ) = 2 Re(ξ), we have
−π < Im(2iξ) < π, and therefore by definition of the principal logarithm,
\[ 2i\xi = \operatorname{Log} \frac{1 + iz}{1 - iz}. \]
Hence, by definition of the arctangent, ξ = Arctan z. The complex number z
must be in D by (4.49) and the fact that ξ = Arctan z ∈ E. This shows that
Arctan(tan ξ) = ξ for all ξ ∈ E and we already know that tan(Arctan z) = z for all
z ∈ D (in fact, for all z ∈ C with $z \neq \pm i$), so Arctan is a continuous bijection from
D onto E with inverse given by tan.
Finally, it remains to show that Arctan equals arctan when restricted to the
real line. To prove this we just need to prove that Arctan x is real when x ∈ R.
This will imply that Arctan : R −→ (−π/2, π/2) and therefore is just arctan. Now
from (4.48) we see that if x ∈ R, then the imaginary part of Arctan x is −1/2 times
\[ \log \left| \frac{1 + ix}{1 - ix} \right| = \log \frac{\sqrt{1 + x^2}}{\sqrt{1 + (-x)^2}} = \log 1 = 0. \]
Thus, Arctan x is real and our proof is complete. 
Setting z = 1, we get Giulio Carlo Fagnano dei Toschi’s (1682–1766) famous
formula (see [14] for more on Giulio Carlo, Count Fagnano, and Marquis de Toschi):
\[ \frac{\pi}{4} = \frac{1}{2i} \operatorname{Log} \frac{1 + i}{1 - i}. \]
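The definition of Arctan, the identity tan(Arctan z) = z, the agreement with the real arctangent, and Fagnano's formula can all be checked numerically. The Python sketch below is an illustration under our own assumptions: the function name `Arctan` and the sample points are ours, and `cmath.log` is assumed to be the principal logarithm.

```python
import cmath
import math

def Arctan(z):
    """Principal arctangent (1/(2i)) Log((1 + iz)/(1 - iz)), for z != +i, -i."""
    if z == 1j or z == -1j:
        raise ValueError("Arctan is undefined at +i and -i")
    return cmath.log((1 + 1j * z) / (1 - 1j * z)) / 2j

# Fagnano's formula: Arctan(1) should equal pi/4.
fagnano = Arctan(1)

# On the real line Arctan should agree with the usual arctangent.
real_gap = max(abs(Arctan(x) - math.atan(x)) for x in [-5.0, -1.0, 0.0, 0.5, 3.0])

# tan(Arctan z) = z at an arbitrary non-real point.
z = 0.3 + 0.7j
inverse_gap = abs(cmath.tan(Arctan(z)) - z)
```

All three quantities `abs(fagnano - math.pi / 4)`, `real_gap`, and `inverse_gap` should be at the level of rounding error.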
Exercises 4.9.
1. Find the following logs:
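\[ \operatorname{Log}(1 + i\sqrt{3}), \quad \operatorname{Log}(\sqrt{3} - i), \quad \operatorname{Log}(1 - i)^4, \]
and find the following powers:
\[ 2^i, \quad (1 + i)^i, \quad e^{\operatorname{Log}(3 + 2i)}, \quad i^{\sqrt{3}}, \quad (-1)^{2i}. \]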
2. Using trig identities, prove the following identities:
 
\[ \arctan x + \arctan y = \arctan\left( \frac{x + y}{1 - xy} \right), \]
\[ \arcsin x + \arcsin y = \arcsin\left( x \sqrt{1 - y^2} + y \sqrt{1 - x^2} \right), \]
and give restrictions under which these identities are valid. For example, the first
identity holds when $xy \neq 1$ and the left-hand side lies strictly between −π/2 and π/2.
When does the second hold?
3. In this problem we study real powers of complex numbers. Let a ∈ C be nonzero.
(a) Let n ∈ N and show that all powers of a to 1/n are given by
\[ e^{\frac{1}{n} ( \operatorname{Log} a + 2\pi i k )}, \quad k = 0, 1, 2, \ldots, n - 1. \]
In addition, show that these values are all the n-th roots of a and that the principal
n-th root of a is the same as the principal value of $a^{1/n}$.
(b) If m/n is a rational number in lowest terms with n > 0, show that all powers of a
to m/n are given by
\[ e^{\frac{m}{n} ( \operatorname{Log} a + 2\pi i k )}, \quad k = 0, 1, 2, \ldots, n - 1. \]
(c) If x is an irrational number, show that there are infinitely many distinct complex
powers of a to the x.
4. Let a, b, z, w ∈ C with $a, b \neq 0$ and prove the following:
(a) $1/a^z = a^{-z}$, $a^z \cdot a^w = a^{z+w}$, and $(a^z)^n = a^{zn}$ for all n ∈ Z.
(b) If −π < Arg a + Arg b ≤ π, then $(ab)^z = a^z b^z$.
(c) If −π < Arg a − Arg b ≤ π, then $(a/b)^z = a^z / b^z$.
(d) If Re a, Re b > 0, then both (b) and (c) hold.

(e) Give examples showing that the conclusions of (b) and (c) are false if the hypotheses
are not satisfied.
5. (Arcsine and Arccosine function) In this problem we define the principal arcsin
and arccos functions. To define the complex arcsine, given z ∈ C we want to solve the
equation sin ξ = z for ξ and call ξ the “inverse sine of z”.
(a) Prove that sin ξ = z if and only if $(e^{i\xi})^2 - 2iz \, (e^{i\xi}) - 1 = 0$.
(b) Solving this quadratic equation for $e^{i\xi}$ (see Problem 5 in Exercises 4.8) prove that
sin ξ = z if and only if
\[ \xi = \frac{1}{i} \times \text{a complex logarithm of } iz \pm \sqrt{1 - z^2}. \]
Because of this formula, we define the principal inverse or arc sine of z to be
the complex number
\[ \operatorname{Arcsin} z := \frac{1}{i} \operatorname{Log}\left( iz + \sqrt{1 - z^2} \right). \]
Based on the formula (4.41), we define the principal inverse or arc cosine of z
to be the complex number
\[ \operatorname{Arccos} z := \frac{\pi}{2} - \operatorname{Arcsin} z. \]
(c) Prove that when restricted to real values, Arcsin : [−1, 1] −→ [−π/2, π/2] and
equals the usual arcsine function.
(d) Similarly, prove that when restricted to real values, Arccos : [−1, 1] −→ [0, π] and
equals the usual arccosine function.
6. (Inverse hyperbolic functions) We look at the inverse hyperbolic functions.
(a) Prove that sinh : R −→ R is a strictly increasing bijection. Thus, $\sinh^{-1}$ : R −→ R
exists. Show that cosh : [0, ∞) −→ [1, ∞) is a strictly increasing bijection. We
define $\cosh^{-1}$ : [1, ∞) −→ [0, ∞) to be the inverse of this function.
(b) Using a similar argument as you did for the arcsine function in Problem 5, prove
that sinh x = y (here, x, y ∈ R) if and only if $e^{2x} - 2y e^{x} - 1 = 0$, which holds if
and only if $e^{x} = y \pm \sqrt{y^2 + 1}$. From this, prove that
\[ \sinh^{-1} x = \log\big( x + \sqrt{x^2 + 1} \big). \]
If x is replaced by z ∈ C and log by Log, the principal complex logarithm, then
this formula is called the principal inverse hyperbolic sine of z.
(c) Prove that
\[ \cosh^{-1} x = \log\big( x + \sqrt{x^2 - 1} \big). \]
If x is replaced by z ∈ C and log by Log, the principal complex logarithm, then
this formula is called the principal inverse hyperbolic cosine of z.

4.10. F The amazing π and its computations from ancient times


In the Measurement of the circle, Archimedes of Syracuse (287–212 B.C.), listed
three famous propositions involving π (see Heath’s translation [99]). In this section
we look at each of these propositions especially his third one, which uses the first
known algorithm to compute π to any desired number of decimal places!12 His basic
idea is to approximate a circle by inscribed and circumscribed regular polygons. We
begin by looking at a brief history of π.
12 [On π] Ten decimal places of π are sufficient to give the circumference of the earth to a
fraction of an inch, and thirty decimal places would give the circumference of the visible universe
to a quantity imperceptible to the most powerful microscope. Simon Newcomb (1835–1909) [140].
4.10.1. A brief (and very incomplete) history of π. We begin by giving
a short snippet of the history of π with, unfortunately, many details left out. Some
of what we choose to put here is based on what will come up later in this book (for
example, in the chapter on continued fractions — see Section 8.5) or what might
be interesting trivia. References include Schepler’s comprehensive chronicles [198,
199, 200], the beautiful books [26, 10, 68, 183], the wonderful websites [168,
170, 207], Rice’s short synopsis [187], and (my favorite π papers by) Castellanos
[47, 48]. Before discussing this history, recall the following formulas for the area
and circumference of a circle in terms of the radius r:
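\[ \text{Area of a circle of radius } r = \pi r^2, \qquad \text{Circumference of a circle of radius } r = 2\pi r. \]
In terms of the diameter d := 2r, we have
\[ \text{Area of a circle} = \pi \frac{d^2}{4}, \qquad \text{Circumference of a circle} = \pi d, \qquad \pi = \frac{\text{circumference}}{\text{diameter}}. \]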
(1) (circa 1650 B.C.) The Rhind (or Ahmes) papyrus is the oldest known mathe-
matical text in existence. It is named after the Egyptologist Alexander Henry
Rhind (1833–1863) who purchased it in Luxor in 1858, but it was written by a
scribe Ahmes (1680 B.C.–1620 B.C.). In this text is the following rule to find
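the area of a circle: Cut 1/9 off the circle’s diameter and construct a square on
the remainder. Thus,
\[ \pi \frac{d^2}{4} = \text{area of circle} \approx \text{square of } \Big( d - \frac{1}{9} d \Big) = \Big( d - \frac{1}{9} d \Big)^2 = \Big( \frac{8}{9} \Big)^2 d^2. \]
Cancelling off $d^2$ from both extremities, we obtain
\[ \pi \approx 4 \Big( \frac{8}{9} \Big)^2 = \Big( \frac{4}{3} \Big)^4 = 3.160493827 \ldots. \]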
(2) (circa 1000 B.C.) The Holy Bible in I Kings, Chapter 7, verse 23, and II
Chronicles, Chapter 4, verse 2, states that
And he made a molten sea, ten cubits from the one brim to the other:
it was round all about, and his height was five cubits: and a line of
thirty cubits did compass it about. I Kings 7:23.
This gives the approximate value (cf. the interesting article [5]):
\[ \pi = \frac{\text{circumference}}{\text{diameter}} \approx \frac{30 \text{ cubits}}{10 \text{ cubits}} = 3. \]
The Israelites were not alone in using 3: other ancient civilizations, such as
the Babylonians, Hindus, and Chinese, also used 3 for rough purposes (more
than good enough for “everyday life”).
(3) (circa 250 B.C.) Archimedes of Syracuse (287–212 B.C.) gave the estimate π ≈
22/7 = 3.14285714 . . . (correct to two decimal places). We’ll thoroughly discuss
“Archimedes’ method” in a moment.
(4) (circa 500 A.D.) Tsu Chung-Chi (also Zu Chongzhi) of China (429–501) gave
the estimate π ≈ 355/113 = 3.14159292 . . . (correct to six decimal places); he
also gave the incredible estimate

3.1415926 < π < 3.1415927.



(5) (circa 1600 A.D.) The Dutch mathematician Adriaan Anthoniszoon
(1527–1607) used Archimedes’ method to get
\[ \frac{333}{106} < \pi < \frac{377}{120}. \]
By taking the average of the numerators and of the denominators, he found Tsu
Chung-Chi’s approximation 355/113.
(6) (1706) The symbol π was first introduced by William Jones (1675–1749) in his
beginner’s calculus book Synopsis palmariorum mathesios where he published
John Machin’s (1680–1751) one hundred digit approximation to π; see Subsec-
tion 4.10.5 for more on Machin. The symbol π was popularized and became
standard through Leonhard Euler’s (1707–1783) famous book Introductio in
Analysin Infinitorum [65]. The letter π was (reportedly) chosen because it’s
the first letter of the Greek words “perimeter” and “periphery”.
(7) (1761) Johann Heinrich Lambert (1728–1777) proved that π is irrational.
(8) (1882) Carl Louis Ferdinand von Lindemann (1852–1939) proved that π is tran-
scendental.
(9) (1897) A sad day in the life of π. House bill No. 246, Indiana state legislature,
1897, written by a physician Edwin Goodwin (1828–1902), tried to legally set
the value of π to a rational number; see [213], [90] for more about this sad tale.
This value would be copyrighted and used in Indiana state math textbooks and
other states would have to pay to use this value! The bill is very convoluted
(try to read Goodwin’s article [83] and you’ll probably get a headache) and
(reportedly) the following values of π can be implied from the bill: π = 9.24,
3.236, 3.232, and 3.2; it’s also implied that $\sqrt{2} = 10/7$. Moreover, Mr. Goodwin
claimed he could trisect an angle, double a cube, and square a circle, which
(quoting from the bill) “had been long since given up by scientific bodies as
insolvable mysteries and above mans ability to comprehend.” These problems
“had been long since given up” because they have been proven unsolvable! (See
[57, 79] for more on these unsolvable problems, first proved by Pierre Laurent
Wantzel (1814–1848), and see [59] for other stories of amateur mathematicians
claiming to have solved the insolvable.) This bill passed the house (!), but for-
tunately, with the help of mathematician C.A. Waldo of the Indiana Academy
of Science, the bill didn’t pass in the senate.
Hold on to your seats because we’ll take up our brief history of π again in
Subsection 4.10.5, after a brief intermission.

4.10.2. Archimedes’ three propositions. The following three propositions
are contained in Archimedes’ book Measurement of the circle [99]:
(1) The area of a circle is equal to that of a right-angled triangle where the sides
including the right angle are respectively equal to the radius and circumference
of the circle.
(2) The ratio of the area of a circle to that of a square with side equal to the circle’s
diameter is close to 11:14.
(3) The ratio of the circumference of any circle to its diameter is less than 3 1/7
but greater than 3 10/71.
Archimedes’ first proposition is seen in Figure 4.13.

[Figure 4.13. Archimedes’ first proposition: a right triangle with legs r and 2πr, for which area of triangle = (1/2) base × height = (1/2) r · (2πr) = πr².]

Archimedes’ second proposition gives the famous estimate π ≈ 22/7:
\[ \frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4} \approx \frac{11}{14} \implies \pi \approx \frac{22}{7}. \]
We now derive Archimedes’ third proposition using the same method Archimedes
pioneered over two thousand years ago, but we shall employ trigonometric functions!
Archimedes’ original method used plane geometry to derive his formulas (the
trigonometric functions we know today were not available back then).
However, before doing so, we need a couple of trig facts.
4.10.3. Some useful trig facts. We first consider some useful trig identities.
Lemma 4.54. We have
\[ \tan z = \frac{\sin(2z) \tan(2z)}{\sin(2z) + \tan(2z)} \quad \text{and} \quad 2 \sin^2 z = \sin(2z) \tan z. \]
Proof. We’ll prove the first one and leave the second one to you. Multiplying
tan z by 2 cos z/(2 cos z) = 1 and using the double angle formulas $2 \cos^2 z = 1 + \cos 2z$
and sin(2z) = 2 cos z sin z (see Theorem 4.34), we obtain
\[ \tan z = \frac{\sin z}{\cos z} = \frac{2 \sin z \cos z}{2 \cos^2 z} = \frac{\sin(2z)}{1 + \cos(2z)}. \]
Multiplying top and bottom by tan 2z, we get
\[ \tan z = \frac{\sin(2z) \tan(2z)}{\tan(2z) + \cos(2z) \tan 2z} = \frac{\sin(2z) \tan(2z)}{\tan(2z) + \sin(2z)}. \]

Next, we consider some useful inequalities.
Lemma 4.55. For 0 < x < π/2, we have
sin x < x < tan x.
Proof. We first prove that sin x < x for 0 < x < π/2. We note that the
inequality sin x < x for 0 < x < π/2 automatically implies that this same inequality
holds for all x > 0, since for x ≥ π/2 we have sin x ≤ 1 < π/2 ≤ x. Substituting the
power series for sin x, the inequality sin x < x, that is, −x < − sin x, is equivalent
to
\[ -x < -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \frac{x^9}{9!} + - \cdots, \]
or after cancelling off the x’s, this inequality is equivalent to
\[ \frac{x^3}{3!} \Big( 1 - \frac{x^2}{4 \cdot 5} \Big) + \frac{x^7}{7!} \Big( 1 - \frac{x^2}{8 \cdot 9} \Big) + \cdots > 0. \]
For 0 < x < 2, each of the terms in parentheses is positive. This shows that in
particular, this expression is positive for 0 < x < π/2.
[Figure 4.14. Archimedes inscribed and circumscribed a circle with diameter 1 (radius 1/2) with regular polygons with $M \cdot 2^n$ sides. The sides of these polygons have lengths $s_n$ and $t_n$, respectively. $2\theta_n$ is the central angle of the inscribed and circumscribed $2^n M$-gons.]


We now prove that x < tan x for 0 < x < π/2. This inequality is equivalent to
x cos x < sin x for 0 < x < π/2. Substituting the power series for cos and sin, the
inequality x cos x < sin x is equivalent to
\[ x - \frac{x^3}{2!} + \frac{x^5}{4!} - \frac{x^7}{6!} + - \cdots < x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots. \]
Bringing everything to the right, we get an inequality of the form
\[ x^3 \Big( \frac{1}{2!} - \frac{1}{3!} \Big) - x^5 \Big( \frac{1}{4!} - \frac{1}{5!} \Big) + x^7 \Big( \frac{1}{6!} - \frac{1}{7!} \Big) - x^9 \Big( \frac{1}{8!} - \frac{1}{9!} \Big) + - \cdots > 0. \]
Combining adjacent terms, the left-hand side is a sum of terms of the form
\[ x^{2k-1} \Big( \frac{1}{(2k-2)!} - \frac{1}{(2k-1)!} \Big) - x^{2k+1} \Big( \frac{1}{(2k)!} - \frac{1}{(2k+1)!} \Big), \quad k = 2, 4, 6, \ldots. \]
We claim that this term is positive for 0 < x < 3. This shows that x cos x < sin x
for 0 < x < 3, and so in particular, for 0 < x < π/2. Now the above expression is
positive if and only if
\[ x^2 < \frac{\dfrac{1}{(2k-2)!} - \dfrac{1}{(2k-1)!}}{\dfrac{1}{(2k)!} - \dfrac{1}{(2k+1)!}} = (2k+1)(2k-2), \quad k = 2, 4, 6, \ldots, \]
where we multiplied the top and bottom by (2k+1)!. The right-hand side is smallest
when k = 2, when it equals 5 · 2 = 10. It follows that these inequalities hold for
0 < x < 3, and our proof is now complete. 
4.10.4. Archimedes' third proposition. We start with a circle with diameter 1 (radius 1/2). Then,
\[ \text{circumference of this circle} = 2\pi r = 2\pi\cdot\frac{1}{2} = \pi. \]
Let us fix a natural number $M \ge 3$. Given any $n = 0, 1, 2, 3, \ldots$, we inscribe and circumscribe the circle with regular polygons having $2^n M$ sides. See Figure 4.14. We denote the perimeter of the inscribed $2^n M$-gon by lowercase $p_n$ and the perimeter of the circumscribed $2^n M$-gon by uppercase $P_n$. Then geometrically we can see that
\[ p_n < \pi < P_n, \quad n = 0, 1, 2, \ldots \]
4.10. F THE AMAZING π AND ITS COMPUTATIONS FROM ANCIENT TIMES 231

Figure 4.15. We cut the central angle in half. The right picture shows a blow-up of the overlapping triangles on the left. (Labels in the figure: half-sides $s_n/2$ and $t_n/2$, radius $1/2$, and half central angle $\theta_n$.)

and $p_n \to \pi$ and $P_n \to \pi$ as $n \to \infty$; we shall prove these facts analytically in Theorem 4.56. Using plane geometry, Archimedes found iterative formulas for the sequences $\{p_n\}$ and $\{P_n\}$ and using these formulas he proved his third proposition. Recall that everything we thought about trigonometry is true, so we shall use these trig facts to derive Archimedes' famous iterative formulas for the sequences $\{p_n\}$ and $\{P_n\}$. To this end, we let $s_n$ and $t_n$ denote the lengths of the sides of the inscribed and circumscribed polygons, so that
\[ P_n = (\text{\# sides}) \times (\text{length of each side}) = 2^n M \cdot t_n \]
and
\[ p_n = (\text{\# sides}) \times (\text{length of each side}) = 2^n M \cdot s_n. \]
Let $2\theta_n$ be the central angle of the inscribed and circumscribed $2^n M$-gons as shown in Figures 4.14 and 4.15 (that is, $\theta_n$ is half the central angle). The right picture in Figure 4.15 gives a blown-up picture of the triangles in the left-hand picture. The outer triangle in that picture shows that
\[ \tan\theta_n = \frac{\text{opposite}}{\text{adjacent}} = \frac{t_n/2}{1/2} \implies t_n = \tan\theta_n \implies P_n = 2^n M\tan\theta_n. \]
The inner triangle shows that
\[ \sin\theta_n = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{s_n/2}{1/2} \implies s_n = \sin\theta_n \implies p_n = 2^n M\sin\theta_n. \]
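Now what's $\theta_n$? Well, since $2\theta_n$ is the central angle of the $2^n M$-gon, we have
\[ \text{central angle} = \frac{\text{total angle of circle}}{\text{\# of sides of regular polygon}} = \frac{2\pi}{2^n M}. \]
Setting this equal to $2\theta_n$, we get
\[ \theta_n = \frac{\pi}{2^n M}. \]
In particular,
\[ \theta_{n+1} = \frac{\pi}{2^{n+1} M} = \frac{1}{2}\cdot\frac{\pi}{2^n M} = \frac{1}{2}\theta_n. \]
Replacing $z$ with $z/2$ in Lemma 4.54, we see that
\[ \tan\Big(\frac{1}{2}z\Big) = \frac{\sin(z)\tan(z)}{\sin(z) + \tan(z)} \quad\text{and}\quad 2\sin^2\Big(\frac{1}{2}z\Big) = \sin(z)\tan\Big(\frac{1}{2}z\Big). \]
Hence,
\[ \tan\theta_{n+1} = \tan\Big(\frac{1}{2}\theta_n\Big) = \frac{\sin(\theta_n)\tan(\theta_n)}{\sin(\theta_n) + \tan(\theta_n)} \]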
and
\[ 2\sin^2(\theta_{n+1}) = 2\sin^2\Big(\frac{1}{2}\theta_n\Big) = \sin(\theta_n)\tan\Big(\frac{1}{2}\theta_n\Big) = \sin(\theta_n)\tan(\theta_{n+1}). \]
In particular, recalling that $P_n = 2^n M\tan\theta_n$ and $p_n = 2^n M\sin\theta_n$, we see that
\[ P_{n+1} = 2^{n+1}M\tan\theta_{n+1} = 2^{n+1}M\,\frac{\sin(\theta_n)\tan(\theta_n)}{\sin(\theta_n) + \tan(\theta_n)} = 2\,\frac{2^nM\sin(\theta_n)\cdot 2^nM\tan(\theta_n)}{2^nM\sin(\theta_n) + 2^nM\tan(\theta_n)} = 2\,\frac{p_nP_n}{p_n + P_n} \]
and
\[ 2p_{n+1}^2 = 2\big(2^{n+1}M\sin\theta_{n+1}\big)^2 = \big(2^{n+1}M\big)^2\sin(\theta_n)\tan(\theta_{n+1}) = 2\cdot 2^nM\sin(\theta_n)\cdot 2^{n+1}M\tan(\theta_{n+1}) = 2p_nP_{n+1}, \]
or $p_{n+1} = \sqrt{p_nP_{n+1}}$. Finally, recall that $\theta_n = \frac{\pi}{2^nM}$. Thus, $P_0 = M\tan(\frac{\pi}{M})$ and $p_0 = M\sin(\frac{\pi}{M})$. Let us summarize our results in the following formulas:
\[ P_{n+1} = \frac{2p_nP_n}{p_n + P_n}, \quad p_{n+1} = \sqrt{p_nP_{n+1}}; \qquad P_0 = M\tan\frac{\pi}{M}, \quad p_0 = M\sin\frac{\pi}{M}. \quad\text{(Archimedes' algorithm)} \tag{4.50} \]
This is the celebrated Archimedes’ algorithm. Starting from the values of P0 and
p0 , we can use the iterative definitions for Pn+1 and pn+1 to generate sequences
{Pn } and {pn } that converge to π, as we now show.
Theorem 4.56 (Archimedes' algorithm). We have
\[ p_n < \pi < P_n, \quad n = 0, 1, 2, \ldots \]
and $p_n \to \pi$ and $P_n \to \pi$ as $n \to \infty$.
Proof. Note that for any $n = 0, 1, 2, \ldots$, we have $0 < \theta_n = \frac{\pi}{2^nM} < \frac{\pi}{2}$ because $M \ge 3$. Thus, by Lemma 4.55,
\[ p_n = 2^nM\sin\theta_n < 2^nM\theta_n < 2^nM\tan\theta_n = P_n. \]
Since $\theta_n = \frac{\pi}{2^nM}$, the middle term is just $\pi$, so $p_n < \pi < P_n$ for every $n = 0, 1, 2, \ldots$. Using the limit $\lim_{z\to 0}\sin z/z = 1$, we obtain
\[ \lim_{n\to\infty} p_n = \lim_{n\to\infty} 2^nM\sin\theta_n = \lim_{n\to\infty} \pi\cdot\frac{\sin\big(\frac{\pi}{2^nM}\big)}{\frac{\pi}{2^nM}} = \pi. \]
Since $\lim_{z\to 0}\cos z = 1$, we have $\lim_{z\to 0}\tan z/z = \lim_{z\to 0}\sin z/(z\cdot\cos z) = 1$, so the same argument we used for $p_n$ shows that $\lim_{n\to\infty}P_n = \pi$.
In Problem 4 you will study how fast $p_n$ and $P_n$ converge to $\pi$. Now let's consider a specific example: Let $M = 6$, which is what Archimedes chose! Then,
\[ P_0 = 6\tan\frac{\pi}{6} = 2\sqrt{3} = 3.464101615\ldots \quad\text{and}\quad p_0 = 6\sin\frac{\pi}{6} = 3. \]
From these values, we can find $P_1$ and $p_1$ from Archimedes' algorithm (4.50):
\[ P_1 = \frac{2p_0P_0}{p_0 + P_0} = \frac{2\cdot 3\cdot 2\sqrt{3}}{3 + 2\sqrt{3}} = 3.215390309\ldots \]
and
\[ p_1 = \sqrt{p_0P_1} = \sqrt{3\cdot 3.215390309\ldots} = 3.105828541\ldots. \]
Continuing this process (I used a spreadsheet) we can find $P_2$, $p_2$, then $P_3$, $p_3$, and so forth, arriving at the table

n    p_n            P_n
0    3              3.464101615
1    3.105828541    3.215390309
2    3.132628613    3.159659942
3    3.139350203    3.146086215
4    3.141031951    3.1427146
5    3.141452472    3.14187305
6    3.141557608    3.141662747
7    3.141583892    3.141610177
Archimedes considered $p_4 = 3.14103195\ldots$ and $P_4 = 3.1427146\ldots$. Notice that
\[ 3\tfrac{10}{71} = 3.140845070\ldots < p_4 \quad\text{and}\quad P_4 < 3.142857142\ldots = 3\tfrac{1}{7}. \]
Hence,
\[ 3\tfrac{10}{71} < p_4 < \pi < P_4 < 3\tfrac{1}{7}, \]
which proves Archimedes' third proposition. It's interesting to note that Archimedes didn't have computers back then (to find square roots, for instance), or trig functions, or coordinate geometry, or decimal notation, etc., so it's remarkable that Archimedes was able to determine $\pi$ to such accuracy!
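The table above can be reproduced in a few lines. The following is a minimal sketch of iteration (4.50), not the spreadsheet the author used; the function name `archimedes` and its parameters are illustrative choices. Note that it uses `math.pi` only to form the starting values $P_0$ and $p_0$ for general $M$; for $M = 6$ one could instead start from $P_0 = 2\sqrt{3}$ and $p_0 = 3$, with no prior knowledge of $\pi$.

```python
import math

def archimedes(M=6, steps=7):
    """Return rows (n, p_n, P_n) produced by iteration (4.50), starting from M-gons."""
    P = M * math.tan(math.pi / M)   # P_0: perimeter of the circumscribed M-gon
    p = M * math.sin(math.pi / M)   # p_0: perimeter of the inscribed M-gon
    rows = [(0, p, P)]
    for n in range(1, steps + 1):
        P = 2 * p * P / (p + P)     # P_{n+1} = 2 p_n P_n / (p_n + P_n)
        p = math.sqrt(p * P)        # p_{n+1} = sqrt(p_n P_{n+1}); uses the new P
        rows.append((n, p, P))
    return rows

for n, p, P in archimedes():
    print(f"{n}  {p:.9f}  {P:.9f}")
```

The order of the two updates matters: $p_{n+1}$ is built from the already updated $P_{n+1}$, exactly as in (4.50).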
4.10.5. Continuation of our brief history of π. Here are (only some!) famous formulas for $\pi$ (along with their earliest known date of publication) that we'll prove in our journey through our book: Archimedes of Syracuse ≈ 250 B.C.: $\pi = \lim P_n = \lim p_n$, where
\[ P_{n+1} = \frac{2p_nP_n}{p_n + P_n}, \quad p_{n+1} = \sqrt{p_nP_{n+1}}; \qquad P_0 = M\tan\frac{\pi}{M}, \quad p_0 = M\sin\frac{\pi}{M}. \]
We remark that Archimedes' algorithm is similar to Borchardt's algorithm (see Problem 1), which is similar to the modern-day AGM method of Eugene Salamin, Richard Brent, and Jonathan and Peter Borwein [32, 33]. This AGM method can generate billions of digits of $\pi$!
François Viète 1593 (§ 5.1):
\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}}\cdots. \]
Lord William Brouncker 1655 (§ 7.7):
\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \cdots}}}}. \]
John Wallis 1656 (§ 6.10):
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots. \]
James Gregory, Gottfried Wilhelm von Leibniz 1670, Madhava of Sangamagramma ≈ 1400 (§ 5.1):
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + - \cdots. \]
John Machin 1706 (§ 6.10):
\[ \pi = 4\Big(4\arctan\frac{1}{5} - \arctan\frac{1}{239}\Big) = 4\sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1}\Big(\frac{4}{5^{2n+1}} - \frac{1}{239^{2n+1}}\Big). \]

Machin calculated 100 digits of π with this formula. William Shanks (1812–1882)
is famed for his calculation of π to 707 places in 1873 using Machin’s formula.
However, only the first 527 places were correct as discovered by D. Ferguson in
1944 [72] using another Machin type formula. Ferguson ended up publishing 620
correct places in 1946, which marks the last hand calculation for π ever to so many
digits. From this point on, computers have been used to find π and the number of
digits of π known today is well into the trillions, led by Yasumasa Kanada and his
coworkers at the University of Tokyo using a Machin type formula; see Kanada’s
website http://www.super-computing.org/. One might ask “why try to find so
many digits of π?” Well (taken from Young’s great book [252, p. 238]),
Perhaps in some far distant century they may say, “Strange that
those ingenious investigators into the secrets of the number sys-
tem had so little conception of the fundamental discoveries that
would later develop from them!” D. N. Lehmer (1867–1938).
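To get a feel for why Machin-type formulas were the tool of choice for hand calculation, here is a small sketch that sums Machin's series $\pi = 4\big(4\arctan\frac{1}{5} - \arctan\frac{1}{239}\big)$ with exact rational arithmetic. The helper names `arctan_inv` and `machin_pi` are illustrative and not from the text, and this is not a reconstruction of how Machin, Shanks, or Ferguson organized their computations.

```python
from fractions import Fraction

def arctan_inv(q, terms):
    """Exact partial sum of arctan(1/q) = sum_{n>=0} (-1)^n / ((2n+1) q^(2n+1))."""
    return sum(Fraction((-1) ** n, (2 * n + 1) * q ** (2 * n + 1))
               for n in range(terms))

def machin_pi(terms):
    """Rational approximation to pi from `terms` terms of each arctan series."""
    return 4 * (4 * arctan_inv(5, terms) - arctan_inv(239, terms))

print(float(machin_pi(10)))
```

Because the terms shrink like $5^{-(2n+1)}$, each extra term of the $\arctan\frac{1}{5}$ series contributes roughly 1.4 more correct decimal digits, so ten terms already exhaust double precision.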
We now go back to our list of formulas. Leonhard Euler 1736 (§ 5.1):
\[ \frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \]
and (§ 5.1):
\[ \frac{\pi^2}{6} = \frac{2^2}{2^2-1}\cdot\frac{3^2}{3^2-1}\cdot\frac{5^2}{5^2-1}\cdot\frac{7^2}{7^2-1}\cdot\frac{11^2}{11^2-1}\cdots. \]
We end our history with a question to ponder: What is the probability that a natural number, chosen at random, is square free (that is, is not divisible by the square of a prime)? What is the probability that two natural numbers, chosen at random, are relatively (or co-) prime (that is, don't have any common prime factors)? The answers, drum roll please, (§ 7.6):
\[ \text{Probability of being square free} = \text{Probability of being coprime} = \frac{6}{\pi^2}. \]
Exercises 4.10.
1. In a letter from Gauss to his teacher Johann Pfaff (1765–1825) around 1800, Gauss asked Pfaff about the following sequences $\{\alpha_n\}$, $\{\beta_n\}$ defined recursively as follows:
\[ \alpha_{n+1} = \frac{1}{2}\big(\alpha_n + \beta_n\big), \quad \beta_{n+1} = \sqrt{\alpha_{n+1}\beta_n}. \quad\text{(Borchardt's algorithm)} \]
Later, Carl Borchardt (1817–1880) rediscovered this algorithm and since then this algorithm is called Borchardt's algorithm [46]. Prove that Borchardt's algorithm is basically the same as Archimedes' algorithm in the following sense: if you set $\alpha_n := 1/P_n$ and $\beta_n := 1/p_n$ in Archimedes' algorithm, you get Borchardt's algorithm.
2. (Pfaff's solution I) Now what if we don't use the starting values $P_0 = M\tan\frac{\pi}{M}$ and $p_0 = M\sin\frac{\pi}{M}$ for Archimedes' algorithm in (4.50), but instead use other starting values? What do the sequences $\{P_n\}$ and $\{p_n\}$ converge to? These questions were answered by Johann Pfaff. Pick starting values $P_0$ and $p_0$ and let's assume that $0 \le p_0 < P_0$; the case that $P_0 < p_0$ is handled in the next problem.
(i) Define
\[ \theta := \arccos\Big(\frac{p_0}{P_0}\Big), \quad r := \frac{p_0P_0}{\sqrt{P_0^2 - p_0^2}}. \]
Prove that $P_0 = r\tan\theta$ and $p_0 = r\sin\theta$.
(ii) Prove by induction that $P_n = 2^n r\tan\big(\frac{\theta}{2^n}\big)$ and $p_n = 2^n r\sin\big(\frac{\theta}{2^n}\big)$.
(iii) Prove that as $n \to \infty$, both $\{P_n\}$ and $\{p_n\}$ converge to
\[ r\theta = \frac{p_0P_0}{\sqrt{P_0^2 - p_0^2}}\arccos\Big(\frac{p_0}{P_0}\Big). \]
3. (Pfaff's solution II) Now assume that $0 < P_0 < p_0$.
(i) Define (see Problem 6 in Exercises 4.9 for the definition of $\cosh^{-1}$)
\[ \theta := \cosh^{-1}\Big(\frac{p_0}{P_0}\Big), \quad r := \frac{p_0P_0}{\sqrt{p_0^2 - P_0^2}}. \]
Prove that $P_0 = r\tanh\theta$ and $p_0 = r\sinh\theta$.
(ii) Prove by induction that $P_n = 2^n r\tanh\big(\frac{\theta}{2^n}\big)$ and $p_n = 2^n r\sinh\big(\frac{\theta}{2^n}\big)$.
(iii) Prove that as $n \to \infty$, both $\{P_n\}$ and $\{p_n\}$ converge to
\[ r\theta = \frac{p_0P_0}{\sqrt{p_0^2 - P_0^2}}\cosh^{-1}\Big(\frac{p_0}{P_0}\Big). \]
4. (Cf. [150], [181]) (Rate of convergence)
(a) Using the formulas $p_n = 2^nM\sin\theta_n$ and $P_n = 2^nM\tan\theta_n$, where $\theta_n = \frac{\pi}{2^nM}$, prove that there are constants $C_1, C_2 > 0$ such that for all $n$,
\[ |p_n - \pi| \le \frac{C_1}{4^n} \quad\text{and}\quad |P_n - \pi| \le \frac{C_2}{4^n}. \]
Suggestion: For the first estimate, use the expansion $\sin z = z - \frac{z^3}{3!} + \cdots$. For the second estimate, notice that $|P_n - \pi| = \frac{1}{\cos\theta_n}|p_n - \pi\cos\theta_n|$.
(b) Part (a) shows that $\{p_n\}$ and $\{P_n\}$ converge to $\pi$ very fast, but we can get even faster convergence by looking at the sequence $\{a_n\}$ where $a_n := \frac{1}{3}(2p_n + P_n)$. Prove that there is a constant $C > 0$ such that for all $n$,
\[ |a_n - \pi| \le \frac{C}{16^n}. \]
CHAPTER 5

Some of the most beautiful formulæ in the world

God used beautiful mathematics in creating the world.


Paul Adrien Maurice Dirac (1902–1984)

In this chapter we present a small sample of some of the most beautiful formulas in the world. We begin in Section 5.1 where we present Viète's formula, Wallis' formula, and Euler's sine expansion. Viète's formula, due to François Viète (1540–1603), is the infinite product
\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}}\cdots, \]
published in 1593. This is not only the first recorded infinite product [120, p. 218]; it is also the first recorded theoretically exact analytical expression for the number $\pi$ [36, p. 321]. Wallis' formula, named after John Wallis (1616–1703), was the second recorded infinite product [120, p. 219]:
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots. \]

To explain Euler's sine expansion, recall that if $p(x)$ is a polynomial with nonzero roots $r_1, \ldots, r_n$ (repeated according to multiplicity), then we can factor $p(x)$ as $p(x) = a(x - r_1)(x - r_2)\cdots(x - r_n)$ where $a$ is a constant. Factoring out $-r_1, \ldots, -r_n$, we can write $p(x)$ as
\[ p(x) = b\Big(1 - \frac{x}{r_1}\Big)\Big(1 - \frac{x}{r_2}\Big)\cdots\Big(1 - \frac{x}{r_n}\Big), \tag{5.1} \]
for another constant $b$. Euler noticed that the function $\frac{\sin x}{x}$ has only nonzero roots, located at
\[ \pi, -\pi, 2\pi, -2\pi, 3\pi, -3\pi, \ldots, \]
so thinking of $\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots$ as an (infinite) polynomial, and assuming that (5.1) holds for such an infinite polynomial, we have (without caring about being rigorous for the moment!)
\[ \frac{\sin x}{x} = b\Big(1 - \frac{x}{\pi}\Big)\Big(1 + \frac{x}{\pi}\Big)\Big(1 - \frac{x}{2\pi}\Big)\Big(1 + \frac{x}{2\pi}\Big)\Big(1 - \frac{x}{3\pi}\Big)\Big(1 + \frac{x}{3\pi}\Big)\cdots = b\Big(1 - \frac{x^2}{\pi^2}\Big)\Big(1 - \frac{x^2}{2^2\pi^2}\Big)\Big(1 - \frac{x^2}{3^2\pi^2}\Big)\cdots, \]
238 5. SOME OF THE MOST BEAUTIFUL FORMULÆ IN THE WORLD

where $b$ is a constant. In Section 5.1, we prove that Euler's guess was correct (with $b = 1$)! Here's Euler's famous formula:
\[ \sin x = x\Big(1 - \frac{x^2}{\pi^2}\Big)\Big(1 - \frac{x^2}{2^2\pi^2}\Big)\Big(1 - \frac{x^2}{3^2\pi^2}\Big)\Big(1 - \frac{x^2}{4^2\pi^2}\Big)\Big(1 - \frac{x^2}{5^2\pi^2}\Big)\cdots, \tag{5.2} \]
which Euler proved in 1735 in his epoch-making paper De summis serierum reciprocarum (On the sums of series of reciprocals), which was read in the St. Petersburg Academy on December 5, 1735, originally published in Commentarii academiae scientiarum Petropolitanae 7, 1740, pp. 123–134, and reprinted in Opera Omnia: Series 1, Volume 14, pp. 73–86.
In Section 5.2 we study the Basel problem, which is the problem of determining the exact value of $\zeta(2) = \sum_{n=1}^{\infty}\frac{1}{n^2}$. Euler, in the same 1735 paper De summis serierum reciprocarum, proved that $\zeta(2) = \frac{\pi^2}{6}$:
\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}. \]

Euler actually gave three proofs of this formula in De summis serierum reciprocarum, but the third one is the easiest to explain. Here it is: First, recall Euler's sine expansion:
\[ \frac{\sin x}{x} = \Big(1 - \frac{x^2}{1^2\pi^2}\Big)\Big(1 - \frac{x^2}{2^2\pi^2}\Big)\Big(1 - \frac{x^2}{3^2\pi^2}\Big)\Big(1 - \frac{x^2}{4^2\pi^2}\Big)\cdots. \]
If you think about multiplying out the right-hand side you will get
\[ 1 - \frac{x^2}{\pi^2}\Big(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\Big) + \cdots \]
where the dots "$\cdots$" involve powers of $x$ of degree four or higher. Thus,
\[ \frac{\sin x}{x} = 1 - \frac{x^2}{\pi^2}\zeta(2) + \cdots. \]
Dividing the power series $\sin x = x - \frac{x^3}{3!} + \cdots$ by $x$ we conclude that
\[ 1 - \frac{x^2}{3!} + \cdots = 1 - \frac{x^2}{\pi^2}\zeta(2) + \cdots \]
where "$\cdots$" involves powers of $x$ of degree four or higher. Finally, equating the coefficients of $x^2$ we conclude that
\[ \frac{1}{3!} = \frac{\zeta(2)}{\pi^2} \implies \zeta(2) = \frac{\pi^2}{3!} = \frac{\pi^2}{6}. \]
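Euler's argument is only formal at this point, so a quick numerical sanity check is reassuring. The sketch below (the function name `zeta2_partial` is an illustrative choice, not from the text) sums the first $N$ terms of $\sum 1/n^2$ and compares the result with $\pi^2/6$; since the tail of the series is roughly $1/N$, the agreement improves only slowly with $N$.

```python
import math

def zeta2_partial(N):
    """Partial sum 1 + 1/2^2 + ... + 1/N^2 of the Basel series."""
    return sum(1.0 / n**2 for n in range(1, N + 1))

target = math.pi**2 / 6
for N in (10, 1000, 100000):
    s = zeta2_partial(N)
    print(N, s, target - s)   # the gap behaves like 1/N
```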
Here is Jordan Bell’s [21] English translation of Euler’s argument from De summis
serierum reciprocarum (which was originally written in Latin):
Indeed, it having been put$^1$ $y = 0$, from which the fundamental equation will turn into this$^2$
\[ 0 = s - \frac{s^3}{1\cdot 2\cdot 3} + \frac{s^5}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \frac{s^7}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7} + \text{etc.}, \]
The roots of this equation give all the arcs of which the sine is equal to 0. Moreover, the single smallest root is $s = 0$, whereby the equation divided by $s$ will exhibit all the remaining arcs of which the sine is equal to 0; these arcs will hence be the roots of this equation
\[ 0 = 1 - \frac{s^2}{1\cdot 2\cdot 3} + \frac{s^4}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \frac{s^6}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7} + \text{etc.} \]
Truly then, those arcs of which the sine is equal to 0 are$^3$
\[ p,\ -p,\ +2p,\ -2p,\ 3p,\ -3p \ \text{etc.}, \]
of which the second of the two of each pair is negative, each of these because the equation indicates for the dimensions of $s$ to be even. Hence the divisors of this equation will be
\[ 1 - \frac{s}{p},\quad 1 + \frac{s}{p},\quad 1 - \frac{s}{2p},\quad 1 + \frac{s}{2p} \ \text{etc.} \]
and by the joining of these divisors two by two it will be
\[ 1 - \frac{s^2}{1\cdot 2\cdot 3} + \frac{s^4}{1\cdot 2\cdot 3\cdot 4\cdot 5} - \frac{s^6}{1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 6\cdot 7} + \text{etc.} = \Big(1 - \frac{s^2}{p^2}\Big)\Big(1 - \frac{s^2}{4p^2}\Big)\Big(1 - \frac{s^2}{9p^2}\Big)\Big(1 - \frac{s^2}{16p^2}\Big) \text{etc.} \]
It is now clear from the nature of equations for the coefficient$^4$ of $ss$ that is $\frac{1}{1\cdot 2\cdot 3}$ to be equal to
\[ \frac{1}{p^2} + \frac{1}{4p^2} + \frac{1}{9p^2} + \frac{1}{16p^2} + \text{etc.} \]
$^1$ Here, Euler set $y = \sin s$.
$^2$ Instead of writing e.g. $1\cdot 2\cdot 3$, today we would write this as $3!$. However, the factorial symbol wasn't invented until 1808 [151], by Christian Kramp (1760–1826), more than 70 years after De summis serierum reciprocarum was read in the St. Petersburg Academy.
In this last step, Euler says that
\[ \frac{1}{1\cdot 2\cdot 3} = \frac{1}{p^2} + \frac{1}{4p^2} + \frac{1}{9p^2} + \frac{1}{16p^2} + \text{etc}, \]
which after some rearrangement is exactly the statement that $\zeta(2) = \frac{\pi^2}{6}$. Euler's proof reminds me of a quote by Charles Hermite (1822–1901):
There exists, if I am not mistaken, an entire world which is the
totality of mathematical truths, to which we have access only
with our mind, just as a world of physical reality exists, the one
like the other independent of ourselves, both of divine creation.
Quoted in The Mathematical Intelligencer, vol. 5, no. 4.
By the way, in this book we give eleven proofs of Euler’s formula for ζ(2)!
In Section 5.2 we also prove the Gregory-Leibniz-Madhava series
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots. \]
This formula is usually called Leibniz's series after Gottfried Leibniz (1646–1716) because he is usually credited as the first to mention this formula in print in 1673, although James Gregory (1638–1675) probably knew about it. However, the great mathematician and astronomer Madhava of Sangamagramma (1350–1425) from India discovered this formula over 200 years before either Gregory or Leibniz! Finally, in Section 5.3 we derive Euler's formula for $\zeta(n)$ for all even $n$.
$^3$ Here, Euler uses $p$ for $\pi$. The notation $\pi$ for the ratio of the length of a circle to its diameter was introduced in 1706 by William Jones (1675–1749), and only around 1736, a year after Euler published De summis serierum reciprocarum, did Euler seem to adopt the notation $\pi$.
$^4$ Here, $ss$ means $s^2$.
Chapter 5 objectives: The student will be able to . . .
• Explain the various formulas of Euler, Wallis, Viète, Gregory, Leibniz, Madhava.
• Formally derive Euler's sine expansion and formula for $\pi^2/6$.
• Describe Euler's formulæ for $\zeta(n)$ for $n$ even.

5.1. F Euler, Wallis, and Viète


Historically, Viète’s formula was the first infinite product written down and
Wallis’ formula was the second [120, p. 218–219]. In this section we prove these
formulas and we also prove Euler’s celebrated sine expansion.

5.1.1. Viète's Formula: The first analytic expression for π. François Viète's (1540–1603) formula has a very elementary proof. For any nonzero $z \in \mathbb{C}$, dividing the identity $\sin z = 2\sin(z/2)\cos(z/2)$ by $z$, we get
\[ \frac{\sin z}{z} = \cos(z/2)\cdot\frac{\sin(z/2)}{z/2}. \]
Replacing $z$ with $z/2$, we get $\sin(z/2)/(z/2) = \cos(z/2^2)\cdot\sin(z/2^2)/(z/2^2)$, therefore
\[ \frac{\sin z}{z} = \cos(z/2)\cdot\cos(z/2^2)\cdot\frac{\sin(z/2^2)}{z/2^2}. \]
Continuing by induction, we obtain
\[ \frac{\sin z}{z} = \cos(z/2)\cdot\cos(z/2^2)\cdots\cos(z/2^n)\cdot\frac{\sin(z/2^n)}{z/2^n} = \frac{\sin(z/2^n)}{z/2^n}\cdot\prod_{k=1}^{n}\cos(z/2^k), \tag{5.3} \]
or
\[ \prod_{k=1}^{n}\cos(z/2^k) = \frac{z/2^n}{\sin(z/2^n)}\cdot\frac{\sin z}{z}. \]
Since $\lim_{n\to\infty}\frac{z/2^n}{\sin(z/2^n)} = 1$ for any nonzero $z \in \mathbb{C}$, we have
\[ \lim_{n\to\infty}\prod_{k=1}^{n}\cos(z/2^k) = \lim_{n\to\infty}\frac{\sin z}{z}\cdot\frac{z/2^n}{\sin(z/2^n)} = \frac{\sin z}{z}. \]
For notation purposes, we can write
\[ \frac{\sin z}{z} = \prod_{n=1}^{\infty}\cos(z/2^n) = \cos(z/2)\cdot\cos(z/2^2)\cdot\cos(z/2^3)\cdots \tag{5.4} \]
and refer to the right-hand side as an infinite product, the subject of which we'll thoroughly study in Chapter 7. For the purposes of this chapter, given a sequence $a_1, a_2, a_3, \ldots$ we shall denote by $\prod_{n=1}^{\infty}a_n$ the limit
\[ \prod_{n=1}^{\infty}a_n := \lim_{n\to\infty}\prod_{k=1}^{n}a_k = \lim_{n\to\infty}\big(a_1a_2\cdots a_n\big), \]
5.1. F EULER, WALLIS, AND VIÈTE 241

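provided that the limit exists. We now put $z = \pi/2$ into (5.4):
\[ \frac{2}{\pi} = \cos\frac{\pi}{2^2}\cdot\cos\frac{\pi}{2^3}\cdot\cos\frac{\pi}{2^4}\cdot\cos\frac{\pi}{2^5}\cdots = \prod_{n=1}^{\infty}\cos\Big(\frac{\pi}{2^{n+1}}\Big). \]
We now just have to obtain formulas for $\cos\frac{\pi}{2^n}$. To do so, note that for any $0 \le \theta \le \pi$, we have
\[ \cos\Big(\frac{\theta}{2}\Big) = \sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}. \]
(This follows from the double angle formula $2\cos^2 z = 1 + \cos 2z$ with $z = \theta/2$.) Thus,
\[ \cos\Big(\frac{\theta}{2^2}\Big) = \sqrt{\frac{1}{2} + \frac{1}{2}\cos\Big(\frac{\theta}{2}\Big)} = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}}. \]
Continuing this process (slang for "it can be shown by induction"), we see that
\[ \cos\Big(\frac{\theta}{2^n}\Big) = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}}}}, \tag{5.5} \]
where there are $n$ square roots here. Therefore, putting $\theta = \pi/2$ we obtain
\[ \cos\Big(\frac{\pi}{2^{n+1}}\Big) = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2}}}}}, \]
where there are $n$ square roots here. In conclusion, we have shown that
\[ \frac{2}{\pi} = \prod_{n=1}^{\infty}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2}}}}}, \]
where there are $n$ square roots in the $n$-th factor of the infinite product; or, writing out the infinite product, we have
\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}\cdot\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}}\cdots. \tag{5.6} \]
This formula was given by Viète in 1593.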
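Viète's product is easy to evaluate numerically because each factor is obtained from the previous one by a single half-angle step $c \mapsto \sqrt{\frac{1}{2} + \frac{1}{2}c}$, starting from $\cos(\pi/2) = 0$. The following is a minimal sketch under that observation; the name `viete_pi` is an illustrative choice and not from the text.

```python
import math

def viete_pi(factors):
    """Approximate pi using the first `factors` factors of Viete's product (5.6)."""
    c = 0.0            # cos(pi/2); each step below halves the angle
    product = 1.0
    for _ in range(factors):
        c = math.sqrt(0.5 + 0.5 * c)   # c becomes cos(pi/2^(n+1)) after n steps
        product *= c
    return 2.0 / product               # since 2/pi is the limit of the product

for n in (1, 5, 10, 20):
    print(n, viete_pi(n))
```

By (5.3) with $z = \pi/2$, the approximation after $n$ factors equals $2^{n+1}\sin(\pi/2^{n+1})$, so the error shrinks by roughly a factor of 4 with each additional factor.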

5.1.2. Expansion of sine I. Our first proof of Euler's infinite product for sine is based on a neat identity involving tangents that we'll present in Lemma 5.1 below. To begin we first write, for $z \in \mathbb{C}$,
\[ \sin z = \frac{1}{2i}\big(e^{iz} - e^{-iz}\big) = \lim_{n\to\infty}\frac{1}{2i}\Big\{\Big(1 + \frac{iz}{n}\Big)^n - \Big(1 - \frac{iz}{n}\Big)^n\Big\} = \lim_{n\to\infty}F_n(z), \tag{5.7} \]
where $F_n$ is the polynomial of degree $n$ in $z$ given by
\[ F_n(z) = \frac{1}{2i}\Big\{\Big(1 + \frac{iz}{n}\Big)^n - \Big(1 - \frac{iz}{n}\Big)^n\Big\}. \tag{5.8} \]
In the following lemma, we write $F_n(z)$ in terms of tangents.
Lemma 5.1. If $n = 2m + 1$ with $m \in \mathbb{N}$, then we can write
\[ F_n(z) = z\prod_{k=1}^{m}\Big(1 - \frac{z^2}{n^2\tan^2(k\pi/n)}\Big). \]
Proof. Observe that setting $z = n\tan\theta$, we have
\[ 1 + \frac{iz}{n} = 1 + i\tan\theta = 1 + i\frac{\sin\theta}{\cos\theta} = \frac{1}{\cos\theta}\big(\cos\theta + i\sin\theta\big) = \sec\theta\, e^{i\theta}, \]
and similarly, $1 - iz/n = \sec\theta\, e^{-i\theta}$. Thus,
\[ F_n(n\tan\theta) = \frac{1}{2i}\Big\{\Big(1 + \frac{iz}{n}\Big)^n - \Big(1 - \frac{iz}{n}\Big)^n\Big\}\Big|_{z = n\tan\theta} = \frac{1}{2i}\sec^n\theta\,\big(e^{in\theta} - e^{-in\theta}\big), \]
or, since $\sin z = \frac{1}{2i}(e^{iz} - e^{-iz})$, we have
\[ F_n(n\tan\theta) = \sec^n\theta\,\sin(n\theta). \]
The sine function vanishes at integer multiples of $\pi$, so it follows that $F_n(n\tan\theta) = 0$ where $n\theta = k\pi$ for integers $k$, that is, for $\theta = k\pi/n$ with $k \in \mathbb{Z}$ (and $|\theta| < \pi/2$, so that $\tan\theta$ is defined). Thus, $F_n(z_k) = 0$ for
\[ z_k = n\tan\Big(\frac{k\pi}{n}\Big) = n\tan\Big(\frac{k\pi}{2m+1}\Big), \quad k = 0, \pm 1, \ldots, \pm m, \]
where we recall that $n = 2m + 1$. Since $\tan\theta$ is strictly increasing on the interval $(-\pi/2, \pi/2)$, it follows that
\[ z_{-m} < z_{-m+1} < \cdots < z_{-1} < z_0 < z_1 < \cdots < z_{m-1} < z_m; \]
moreover, since tangent is an odd function, we have $z_{-k} = -z_k$ for each $k$. In particular, we have found $2m + 1 = n$ distinct roots of $F_n(z)$, so as a consequence of the fundamental theorem of algebra, we can write $F_n(z)$ as a constant times
\[ (z - z_0)\cdot\prod_{k=1}^{m}\big\{(z - z_k)\cdot(z - z_{-k})\big\} = z\cdot\prod_{k=1}^{m}\big\{(z - z_k)\cdot(z + z_k)\big\} \quad (\text{since } z_{-k} = -z_k) \]
\[ = z\prod_{k=1}^{m}\big(z^2 - z_k^2\big) = z\prod_{k=1}^{m}\Big(z^2 - n^2\tan^2\Big(\frac{k\pi}{n}\Big)\Big). \]
Factoring out the $-n^2\tan^2\big(\frac{k\pi}{n}\big)$ terms and gathering them all into one constant we can conclude that
\[ F_n(z) = a\,z\prod_{k=1}^{m}\Big(1 - \frac{z^2}{n^2\tan^2(k\pi/n)}\Big), \]
for some constant $a$. Multiplying out the terms in the formula (5.8), we see that $F_n(z) = z$ plus higher powers of $z$. This implies that $a = 1$ and completes the proof of the lemma.
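Lemma 5.1 can be checked numerically for real $z$ by evaluating $F_n(z)$ both from its definition (5.8) and from the tangent product. This is only a sanity check under floating-point arithmetic, not a proof, and the function names `F_direct` and `F_product` are illustrative choices rather than notation from the text.

```python
import math

def F_direct(n, z):
    """F_n(z) from definition (5.8); for real z the value is real."""
    w = 1j * z / n
    return (((1 + w) ** n - (1 - w) ** n) / 2j).real

def F_product(n, z):
    """F_n(z) from the product in Lemma 5.1 (n must be odd, n = 2m + 1)."""
    m = (n - 1) // 2
    prod = z
    for k in range(1, m + 1):
        prod *= 1 - z**2 / (n * math.tan(k * math.pi / n)) ** 2
    return prod

for n in (3, 7, 101):
    print(n, F_direct(n, 1.3), F_product(n, 1.3), math.sin(1.3))
```

The two columns should agree to rounding error for each odd $n$, and both drift toward $\sin(1.3)$ as $n$ grows, in line with (5.7).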
Using this lemma we can give a formal$^5$ proof of Euler's sine expansion. From (5.7) and Lemma 5.1 we know that for any $x \in \mathbb{R}$,
\[ \frac{\sin x}{x} = \lim_{n\to\infty}\frac{F_n(x)}{x} = \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big), \tag{5.9} \]
where in the limit we restrict $n$ to odd natural numbers. Thus, writing $n = 2m + 1$ the limit in (5.9) really means
\[ \frac{\sin x}{x} = \lim_{m\to\infty}\prod_{k=1}^{m}\Big(1 - \frac{x^2}{(2m+1)^2\tan^2(k\pi/(2m+1))}\Big), \]
but we prefer the simpler form in (5.9) with the understanding that $n$ is odd in (5.9). We now take $n \to \infty$ in this expression. Now,
\[ \lim_{n\to\infty}n^2\tan^2(k\pi/n) = \lim_{n\to\infty}n^2\frac{\sin^2(k\pi/n)}{\cos^2(k\pi/n)} = \lim_{n\to\infty}\Big(\frac{\sin(k\pi/n)}{1/n}\cdot\frac{1}{\cos(k\pi/n)}\Big)^2 = \lim_{n\to\infty}(k\pi)^2\cdot\Big(\frac{\sin(k\pi/n)}{k\pi/n}\cdot\frac{1}{\cos(k\pi/n)}\Big)^2 = k^2\pi^2, \]
where we used that $\lim_{z\to 0}\frac{\sin z}{z} = 1$ and $\cos(0) = 1$. Hence,
\[ \lim_{n\to\infty}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) = \Big(1 - \frac{x^2}{k^2\pi^2}\Big), \tag{5.10} \]
thus, formally evaluating the limit in (5.9), we see that
\[ \frac{\sin x}{x} = \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) = \prod_{k=1}^{\infty}\lim_{n\to\infty}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) = \prod_{k=1}^{\infty}\Big(1 - \frac{x^2}{k^2\pi^2}\Big), \]
which is Euler's result. Unfortunately, there is one issue with this argument; it occurs in switching the limit with the product:
\[ \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) = \prod_{k=1}^{\infty}\lim_{n\to\infty}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \tag{5.11} \]
See Problem 2 for an example where such an interchange leads to a wrong answer. In Section 7.3 of Chapter 7 we'll learn a Tannery's theorem for infinite products, from which we can easily deduce that (5.11) does indeed hold. However, we'll leave Tannery's theorem for products until Chapter 7 because we can easily justify (5.11) in a very elementary (although a little long-winded) way, which we do in the following theorem.
$^5$ "Formal" in mathematics usually refers to "having the form or appearance without the substance or essence," which is the 5-th entry for "formal" in Webster's 1828 dictionary. This is very different to the common use of "formal": "according to form; agreeable to established mode; regular; methodical," which is the first entry in Webster's 1828 dictionary. Elaborating on the mathematician's use of "formal," it means something like "a symbolic manipulation or expression presented without paying attention to correctness".
Theorem 5.2 (Euler's theorem). For any $x \in \mathbb{R}$ we have
\[ \sin x = x\prod_{k=1}^{\infty}\Big(1 - \frac{x^2}{\pi^2k^2}\Big). \]
Proof. We just have to verify the formula (5.11), which in view of (5.10) is equivalent to the equality
\[ \lim_{n\to\infty}p_n = \prod_{k=1}^{\infty}\Big(1 - \frac{x^2}{k^2\pi^2}\Big), \]
where
\[ p_n = \prod_{k=1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \]
The limit in the case $x = 0$ is easily checked to hold so we can (and henceforth do) fix $x \ne 0$.
Step 1: We begin by finding some nice estimates on the quotient $\frac{x^2}{n^2\tan^2(k\pi/n)}$. In Lemma 4.55 back in Section 4.10, we proved that
\[ \theta < \tan\theta, \quad\text{for } 0 < \theta < \pi/2. \]
In particular, if $n \in \mathbb{N}$ is odd and $1 \le k \le \frac{n-1}{2}$, then
\[ \frac{k\pi}{n} \le \frac{n-1}{2}\cdot\frac{\pi}{n} < \frac{\pi}{2}, \]
so
\[ \frac{x^2}{n^2\tan^2(k\pi/n)} < \frac{x^2}{n^2(k\pi)^2/n^2} = \frac{x^2}{k^2\pi^2}. \tag{5.12} \]
Step 2: We now break up $p_n$ in a nice way. Choose $m < \frac{n-1}{2}$ and let us break up the product $p_n$ into the factors from $k = 1$ to $m$ and the factors from $k = m + 1$ to $\frac{n-1}{2}$:
\[ p_n = \prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big)\cdot\prod_{k=m+1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \tag{5.13} \]
We shall use (5.12) to find estimates on the second product in (5.13). Choose $m$ and $n$ large enough such that $\frac{x^2}{m^2\pi^2} < 1$. Then it follows from (5.12) that
\[ \frac{x^2}{n^2\tan^2(k\pi/n)} < \frac{x^2}{k^2\pi^2} < 1 \quad\text{for } k = m+1, m+2, \ldots, \frac{n-1}{2}. \]
In particular,
\[ 0 < \Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) < 1 \quad\text{for } k = m+1, m+2, \ldots, \frac{n-1}{2}. \]
Hence,
\[ 0 < \prod_{k=m+1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) < 1 \]
and therefore, in view of (5.13), we have
\[ p_n \le \prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \]
In Problem 3 you will prove that for any real numbers $a_1, a_2, \ldots, a_p$ with $0 \le a_j \le 1$, we have
\[ (1 - a_1)(1 - a_2)\cdots(1 - a_p) \ge 1 - (a_1 + a_2 + \cdots + a_p). \tag{5.14} \]
Using this inequality it follows that
\[ \prod_{k=m+1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) \ge 1 - \sum_{k=m+1}^{\frac{n-1}{2}}\frac{x^2}{n^2\tan^2(k\pi/n)}. \]
By (5.12) we have
\[ \sum_{k=m+1}^{\frac{n-1}{2}}\frac{x^2}{n^2\tan^2(k\pi/n)} \le \frac{x^2}{\pi^2}\sum_{k=m+1}^{\frac{n-1}{2}}\frac{1}{k^2} \le s_m, \]
where
\[ s_m = \frac{x^2}{\pi^2}\sum_{k=m+1}^{\infty}\frac{1}{k^2}. \]
Thus,
\[ 1 - \sum_{k=m+1}^{\frac{n-1}{2}}\frac{x^2}{n^2\tan^2(k\pi/n)} \ge 1 - s_m, \]
and hence,
\[ \prod_{k=m+1}^{\frac{n-1}{2}}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) \ge 1 - s_m. \]
Therefore, in view of the expression (5.13) for $p_n$, we have
\[ p_n \ge (1 - s_m)\prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \]
To summarize, we have shown that
\[ (1 - s_m)\prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) \le p_n \le \prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big). \tag{5.15} \]
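Step 3: Using (5.15) we can now finish the proof. Indeed, from (5.9) we know that
\[ \lim_{n\to\infty}p_n = \frac{\sin x}{x}, \]
and recalling the limit (5.10), by the algebra of limits we have
\[ \lim_{n\to\infty}\prod_{k=1}^{m}\Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big) = \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big), \]
since the product $\prod_{k=1}^{m}$ is a finite product. Thus, taking $n \to \infty$ in (5.15) we obtain
\[ (1 - s_m)\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big) \le \frac{\sin x}{x} \le \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big), \]
or after rearrangement,
\[ -s_m\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big) \le \frac{\sin x}{x} - \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big) \le 0. \]
Taking absolute values, we get
\[ \Big|\frac{\sin x}{x} - \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big| \le |s_m|\,\Big|\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big|. \]
Our goal is to take $m \to \infty$ here, but before doing so we need the following estimate on the right-hand side:
\[ \Big|\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big| \le \prod_{k=1}^{m}\Big(1 + \frac{x^2}{k^2\pi^2}\Big) \le \prod_{k=1}^{m}e^{\frac{x^2}{k^2\pi^2}} \quad(\text{since } 1 + t \le e^t \text{ for any } t \in \mathbb{R}) \]
\[ = e^{\sum_{k=1}^{m}\frac{x^2}{k^2\pi^2}} \quad(\text{since } e^a\cdot e^b = e^{a+b}) \quad \le e^L, \]
where $L = \sum_{k=1}^{\infty}\frac{x^2}{k^2\pi^2}$, a finite constant the exact value of which is not important. (Note that $\sum_{k=1}^{\infty}\frac{1}{k^2}$ converges by the $p$-test with $p = 2$.) Thus,
\[ \Big|\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big| \le e^L, \]
and so,
\[ \Big|\frac{\sin x}{x} - \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big| \le |s_m|\,e^L. \tag{5.16} \]
Recalling that $s_m = \frac{x^2}{\pi^2}\sum_{k=m+1}^{\infty}\frac{1}{k^2}$ and that $\sum_{k=1}^{\infty}\frac{1}{k^2}$ converges (the $p$-test with $p = 2$), by the Cauchy Criterion for series we know that $\lim_{m\to\infty}s_m = 0$. Thus, it follows from (5.16) that
\[ \Big|\frac{\sin x}{x} - \prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big)\Big| \]
can be made as small as we desire by taking $m$ larger and larger. This, by definition of limit, means that
\[ \frac{\sin x}{x} = \lim_{m\to\infty}\prod_{k=1}^{m}\Big(1 - \frac{x^2}{k^2\pi^2}\Big), \]
which proves our result.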
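The estimate (5.16) says the $m$-th partial product differs from $\frac{\sin x}{x}$ by at most $|s_m|e^L$, and $s_m$ is roughly $\frac{x^2}{\pi^2 m}$, so convergence is slow but steady. The sketch below illustrates this numerically; the function name `sine_product` is an illustrative choice, not notation from the text.

```python
import math

def sine_product(x, m):
    """x * prod_{k=1}^{m} (1 - x^2/(k^2 pi^2)), the m-th partial product of Theorem 5.2."""
    prod = x
    for k in range(1, m + 1):
        prod *= 1 - x**2 / (k * math.pi) ** 2
    return prod

x = 1.0
for m in (10, 1000, 100000):
    approx = sine_product(x, m)
    print(m, approx, abs(approx - math.sin(x)))   # the error decays like 1/m
```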

We remark that Euler’s sine expansion also holds for all complex z ∈ C (and
not just real x ∈ R), but we’ll wait for Section 7.3 of Chapter 7 for the proof of the
complex version.
5.1.3. Wallis' formulas. As an application of Euler's sine expansion, we can derive John Wallis' (1616–1703) formulas for $\pi$.
Theorem 5.3 (Wallis' formulas). We have
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots, \]
\[ \sqrt{\pi} = \lim_{n\to\infty}\frac{1}{\sqrt{n}}\prod_{k=1}^{n}\frac{2k}{2k-1} = \lim_{n\to\infty}\frac{1}{\sqrt{n}}\cdot\frac{2}{1}\cdot\frac{4}{3}\cdot\frac{6}{5}\cdots\frac{2n}{2n-1}. \]
Proof. To obtain the first formula, we set $x = \pi/2$ in Euler's infinite product expansion for sine:
\[ \sin x = x\prod_{n=1}^{\infty}\Big(1 - \frac{x^2}{\pi^2n^2}\Big) \implies 1 = \frac{\pi}{2}\prod_{n=1}^{\infty}\Big(1 - \frac{1}{2^2n^2}\Big). \]
Since $1 - \frac{1}{2^2n^2} = \frac{2^2n^2 - 1}{2^2n^2} = \frac{(2n-1)(2n+1)}{(2n)(2n)}$, we see that
\[ \frac{2}{\pi} = \prod_{n=1}^{\infty}\frac{2n-1}{2n}\cdot\frac{2n+1}{2n}. \]
Now taking reciprocals of both sides (you are encouraged to verify that the reciprocal of an infinite product is the product of the reciprocals) we get Wallis' first formula. To obtain the second formula, we write the first formula as
\[ \frac{\pi}{2} = \lim_{n\to\infty}\Big\{\Big(\frac{2}{1}\Big)^2\cdot\Big(\frac{4}{3}\Big)^2\cdots\Big(\frac{2n}{2n-1}\Big)^2\cdot\frac{1}{2n+1}\Big\}. \]
Then taking square roots we obtain
\[ \sqrt{\pi} = \lim_{n\to\infty}\sqrt{\frac{2}{2n+1}}\prod_{k=1}^{n}\frac{2k}{2k-1} = \lim_{n\to\infty}\frac{1}{\sqrt{n}}\cdot\frac{1}{\sqrt{1 + 1/2n}}\prod_{k=1}^{n}\frac{2k}{2k-1}. \]
Using that $1/\sqrt{1 + 1/2n} \to 1$ as $n \to \infty$ completes our proof.
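Both of Wallis' formulas are simple to test numerically. The sketch below evaluates the partial products directly; the function names `wallis_half_pi` and `wallis_sqrt_pi` are illustrative choices, not from the text. Convergence is slow (the relative error of either approximation is about $\frac{1}{4N}$), which is why Wallis' product is admired more for its form than used for computing $\pi$.

```python
import math

def wallis_half_pi(N):
    """Partial product prod_{n=1}^{N} (2n/(2n-1)) * (2n/(2n+1)), which tends to pi/2."""
    prod = 1.0
    for n in range(1, N + 1):
        prod *= (2 * n) / (2 * n - 1) * (2 * n) / (2 * n + 1)
    return prod

def wallis_sqrt_pi(n):
    """(1/sqrt(n)) * prod_{k=1}^{n} 2k/(2k-1), which tends to sqrt(pi)."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= (2 * k) / (2 * k - 1)
    return prod / math.sqrt(n)

for N in (10, 1000, 100000):
    print(N, 2 * wallis_half_pi(N), wallis_sqrt_pi(N) ** 2)
```

The first column approaches $\pi$ from below and the second from above, consistent with the factor $1/\sqrt{1 + 1/2n}$ that was dropped in the last step of the proof.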
We now prove a beautiful expression for $\pi$ due to Sondow [217] (which I found on Weisstein's website [241]). To present this formula, we first manipulate Wallis' first formula to
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \prod_{n=1}^{\infty}\frac{4n^2}{4n^2-1} = \prod_{n=1}^{\infty}\Big(1 + \frac{1}{4n^2-1}\Big). \]
Second, using partial fractions we observe that
\[ \sum_{n=1}^{\infty}\frac{1}{4n^2-1} = \frac{1}{2}\sum_{n=1}^{\infty}\Big(\frac{1}{2n-1} - \frac{1}{2n+1}\Big) = \frac{1}{2}\cdot 1 = \frac{1}{2}, \]
since the sum telescopes (see e.g. the telescoping series theorem, Theorem 3.24). Dividing these two formulas, we get
\[ \pi = \frac{\displaystyle\prod_{n=1}^{\infty}\Big(1 + \frac{1}{4n^2-1}\Big)}{\displaystyle\sum_{n=1}^{\infty}\frac{1}{4n^2-1}}, \]
quite astonishing!
Exercises 5.1.
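1. Here are some Viète-Wallis products from [175, 178].
(i) From the formulas (5.3), (5.5), and Euler's sine expansion, prove that for any $x \in \mathbb{R}$ and $p \in \mathbb{N}$ we have
\[ \frac{\sin x}{x} = \prod_{k=1}^{p}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos x}}}}\cdot\prod_{n=1}^{\infty}\Big(\frac{2^p\pi n - x}{2^p\pi n}\cdot\frac{2^p\pi n + x}{2^p\pi n}\Big), \]
where there are $k$ square roots in the $k$-th factor of the product $\prod_{k=1}^{p}$.
(ii) Setting $x = \pi/2$ in (i), show that
\[ \frac{2}{\pi} = \prod_{k=1}^{p}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2}}}}}\cdot\prod_{n=1}^{\infty}\Big(\frac{2^{p+1}n - 1}{2^{p+1}n}\cdot\frac{2^{p+1}n + 1}{2^{p+1}n}\Big), \]
where there are $k$ square roots in the $k$-th factor of the product $\prod_{k=1}^{p}$.
(iii) Setting $x = \pi/6$ in (i), show that
\[ \frac{3}{\pi} = \prod_{k=1}^{p}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\Big(\frac{\sqrt{3}}{2}\Big)}}}}\cdot\prod_{n=1}^{\infty}\Big(\frac{3\cdot 2^{p+1}n - 1}{3\cdot 2^{p+1}n}\cdot\frac{3\cdot 2^{p+1}n + 1}{3\cdot 2^{p+1}n}\Big), \]
where there are $k$ square roots in the $k$-th factor of the product $\prod_{k=1}^{p}$.
(iv) Experiment with two other values of $x$ to derive other Viète-Wallis-type formulas.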
(iv) Experiment with two other values of x to derive other Viète-Wallis-type formulas.
2. Suppose for each $n \in \mathbb{N}$ we are given a finite product
\[ \prod_{k=1}^{a_n}f_k(n), \]
where $f_k(n)$ is an expression involving $k$, $n$ and $a_n \in \mathbb{N}$ is such that $\lim_{n\to\infty}a_n = \infty$. For example, in (5.11) we have $a_n = \frac{n-1}{2}$ and $f_k(n) = \Big(1 - \frac{x^2}{n^2\tan^2(k\pi/n)}\Big)$; then (5.11) claims that for this example we have
\[ \lim_{n\to\infty}\prod_{k=1}^{a_n}f_k(n) = \prod_{k=1}^{\infty}\lim_{n\to\infty}f_k(n). \tag{5.17} \]
However, this equality is not always true. Indeed, prove that (5.17) is false for the example $a_n = n$ and $f_k(n) = 1 + \frac{1}{n}$.
3. Prove (5.14) using induction on p.
4. Prove the following splendid formula:
\[
\sqrt{\pi} = \lim_{n\to\infty} \frac{(n!)^2\, 2^{2n}}{(2n)!\,\sqrt{n}}.
\]
Suggestion: Wallis’ formula is hidden here.
5. (cf. [22]) In this problem we give an elementary proof of the following interesting
identity: For any n that is a power of 2 and for any x ∈ R we have
\[
\sin x = n \sin\Big(\frac{x}{n}\Big)\cos\Big(\frac{x}{n}\Big) \prod_{k=1}^{\frac{n}{2}-1}\left(1 - \frac{\sin^2(x/n)}{\sin^2(k\pi/n)}\right). \tag{5.18}
\]
(i) Prove that for any x ∈ R,
\[
\sin x = 2 \sin\Big(\frac{x}{2}\Big)\sin\Big(\frac{\pi + x}{2}\Big).
\]
5.2. F EULER, GREGORY, LEIBNIZ, AND MADHAVA 249

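(ii) Show that for $n$ equal to any power of 2, we have
\[
\sin x = 2^{n-1} \sin\Big(\frac{x}{n}\Big)\sin\Big(\frac{\pi + x}{n}\Big)\sin\Big(\frac{2\pi + x}{n}\Big)\cdots\sin\Big(\frac{(n-2)\pi + x}{n}\Big)\sin\Big(\frac{(n-1)\pi + x}{n}\Big);
\]
note that if $n = 2^1$ we get the formula in (i).
(iii) Show that the formula in (ii) can be written as
\[
\sin x = 2^{n-1} \sin\Big(\frac{x}{n}\Big)\sin\Big(\frac{\frac{n}{2}\pi + x}{n}\Big)\prod_{1\le k<\frac{n}{2}} \sin\Big(\frac{k\pi + x}{n}\Big)\sin\Big(\frac{k\pi - x}{n}\Big).
\]
(iv) Prove the identity $\sin(\theta + \varphi)\sin(\theta - \varphi) = \sin^2\theta - \sin^2\varphi$ and use this to conclude that the formula in (iii) equals
\[
\sin x = 2^{n-1} \sin\Big(\frac{x}{n}\Big)\cos\Big(\frac{x}{n}\Big)\prod_{1\le k<\frac{n}{2}} \left(\sin^2\Big(\frac{k\pi}{n}\Big) - \sin^2\Big(\frac{x}{n}\Big)\right).
\]
(v) By considering what happens as $x \to 0$ in the formula in (iv), prove that for $n$ a power of 2, we have
\[
n = 2^{n-1} \prod_{1\le k<\frac{n}{2}} \sin^2\Big(\frac{k\pi}{n}\Big).
\]
Now prove (5.18).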


6. (Expansion of sine II) We give a second proof of Euler’s sine expansion.
(i) Show that taking n → ∞ on both sides of the identity (5.18) from the previous
problem gives a formal proof of Euler’s sine expansion.
(ii) Now using the identity (5.18) and following the ideas found in the proof of The-
orem 5.2, give another rigorous proof of Euler’s sine expansion.

5.2. ★ Euler, Gregory, Leibniz, and Madhava


In this section we present two beautiful formulas involving π: Euler’s formula
for π 2 /6 and the Gregory-Leibniz-Madhava formula for π/4. The simplest proofs
I know of these formulas are taken from the article by Hofbauer [102] and are
completely “elementary” in the sense that they involve no derivatives
or integrals . . . just a few trigonometric identities and then a dab of
inequalities (or Tannery’s theorem if you prefer) to finish them off. However, before
presenting Hofbauer’s proofs we present (basically) Euler’s original (third) proof of
his solution to the Basel problem.

5.2.1. Proof I of Euler’s formula for $\pi^2/6$. In 1644, the Italian mathematician Pietro Mengoli (1625–1686) posed the question: What’s the value of the sum
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\ ?
\]
This problem was made popular through Jacob (Jacques) Bernoulli (1654–1705)
when he wrote about it in 1689 and was solved by Leonhard Euler (1707–1783) in
1735. Bernoulli was so baffled by the unknown value of the series that he wrote
If somebody should succeed in finding what till now withstood
our efforts and communicate it to us, we shall be much obliged
to him. [47, p. 73], [252, p. 345].

Before Euler’s solution to this request, known as the Basel problem (Bernoulli lived
in Basel, Switzerland), this problem eluded many of the great mathematicians of
that day: In 1742, Euler wrote
Jacob Bernoulli does mention those series, but confesses that,
in spite of all his efforts, he could not get through, so that Joh.
Bernoulli, de Moivre and Stirling, great authorities in such mat-
ters, were highly surprised when I told them that I had found the
sum of ζ(2), and even of ζ(n) for n even. [237, pp. 262-63].
(We shall consider $\zeta(n)$ for $n$ even in the next section.) Needless to say, it shocked
the mathematical community when Euler found the sum to be $\pi^2/6$; in the introduction
to his famous 1735 paper De summis serierum reciprocarum (On the sums
of series of reciprocals) where he first proves that $\zeta(2) = \pi^2/6$, Euler writes:
So much work has been done on the series ζ(n) that it seems
hardly likely that anything new about them may still turn up . . .
I, too, in spite of repeated efforts, could achieve nothing more
than approximate values for their sums . . . Now, however, quite
unexpectedly, I have found an elegant formula for ζ(2), depend-
ing upon the quadrature of the circle [i.e., upon π] [237, p. 261].
For more on various solutions to the Basel problem, see [109], [49], [195], and for
more on Euler, see [11], [119]. On the side is a picture of a Swiss Franc banknote
honoring Euler.
We already saw Euler’s original argument in the introduction
to this chapter; we shall now make his argument rigorous. First,
we claim that for any nonnegative real numbers $a_1, a_2, \ldots, a_n \ge 0$, we have
\[
1 - \sum_{k=1}^{n} a_k \le \prod_{k=1}^{n} (1 - a_k) \le 1 - \sum_{k=1}^{n} a_k + \sum_{1\le i<j\le n} a_i a_j. \tag{5.19}
\]

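You will prove these inequalities in Problem 1. Applying these inequalities to $\prod_{k=1}^{n}\big(1 - \frac{x^2}{k^2\pi^2}\big)$, we obtain
\[
1 - \sum_{k=1}^{n} \frac{x^2}{k^2\pi^2} \le \prod_{k=1}^{n}\left(1 - \frac{x^2}{k^2\pi^2}\right) \le 1 - \sum_{k=1}^{n} \frac{x^2}{k^2\pi^2} + \sum_{1\le i<j\le n} \frac{x^2}{i^2\pi^2}\cdot\frac{x^2}{j^2\pi^2}.
\]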

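After some slight simplifications we can write this as
\[
1 - \frac{x^2}{\pi^2}\sum_{k=1}^{n} \frac{1}{k^2} \le \prod_{k=1}^{n}\left(1 - \frac{x^2}{k^2\pi^2}\right) \le 1 - \frac{x^2}{\pi^2}\sum_{k=1}^{n} \frac{1}{k^2} + \frac{x^4}{\pi^4}\sum_{1\le i<j\le n} \frac{1}{i^2 j^2}. \tag{5.20}
\]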

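Let us put
\[
\zeta_n(2) = \sum_{k=1}^{n} \frac{1}{k^2} \quad\text{and}\quad \zeta_n(4) = \sum_{k=1}^{n} \frac{1}{k^4},
\]
and observe that
\[
\zeta_n(2)^2 = \Big(\sum_{i=1}^{n} \frac{1}{i^2}\Big)\Big(\sum_{j=1}^{n} \frac{1}{j^2}\Big) = \sum_{i,j=1}^{n} \frac{1}{i^2 j^2} = \zeta_n(4) + 2\sum_{1\le i<j\le n} \frac{1}{i^2 j^2}.
\]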

Thus, (5.20) can be written as
\[
1 - \frac{x^2}{\pi^2}\zeta_n(2) \le \prod_{k=1}^{n}\left(1 - \frac{x^2}{k^2\pi^2}\right) \le 1 - \frac{x^2}{\pi^2}\zeta_n(2) + \frac{x^4}{\pi^4}\cdot\frac{\zeta_n(2)^2 - \zeta_n(4)}{2}.
\]

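We remark that the exact coefficient of $x^4$ on the right is not important, but it
might be helpful if you try Problem 7. Taking $n \to \infty$ and using that $\zeta_n(2) \to \zeta(2)$,
$\prod_{k=1}^{n}\big(1 - \frac{x^2}{k^2\pi^2}\big) \to \frac{\sin x}{x}$, and that $\zeta_n(4) \to \zeta(4)$, we obtain
\[
1 - \frac{x^2}{\pi^2}\zeta(2) \le \frac{\sin x}{x} \le 1 - \frac{x^2}{\pi^2}\zeta(2) + \frac{x^4}{\pi^4}\cdot\frac{\zeta(2)^2 - \zeta(4)}{2}.
\]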
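Replacing $\frac{\sin x}{x}$ by its power series expansion we see that
\[
1 - \frac{x^2}{\pi^2}\zeta(2) \le 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} + \cdots \le 1 - \frac{x^2}{\pi^2}\zeta(2) + \frac{x^4}{\pi^4}\cdot\frac{\zeta(2)^2 - \zeta(4)}{2}.
\]
Now subtracting 1 from everything and dividing by $x^2$ (for $x \neq 0$), we get
\[
-\frac{1}{\pi^2}\zeta(2) \le -\frac{1}{3!} + \frac{x^2}{5!} - \frac{x^4}{7!} + \cdots \le -\frac{1}{\pi^2}\zeta(2) + \frac{x^2}{\pi^4}\cdot\frac{\zeta(2)^2 - \zeta(4)}{2}.
\]
Finally, letting $x \to 0$ we conclude that
\[
-\frac{1}{\pi^2}\zeta(2) \le -\frac{1}{3!} \le -\frac{1}{\pi^2}\zeta(2).
\]
This implies that $\zeta(2) = \frac{\pi^2}{6}$, exactly as Euler stated.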
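The two ingredients of Proof I, the identity $\zeta_n(2)^2 = \zeta_n(4) + 2\sum_{i<j} 1/(i^2 j^2)$ and the sandwich around the partial product, can be checked numerically. This is a rough sketch under stated assumptions: the helper names are invented for illustration, and the sample values $n = 200$, $x = 0.5$ are arbitrary choices for which every $x^2/(k^2\pi^2)$ lies in $[0, 1]$.

```python
import math

def zeta_partial(s: int, n: int) -> float:
    """Partial sum zeta_n(s) = sum_{k=1}^n 1/k^s."""
    return sum(1.0 / k**s for k in range(1, n + 1))

def pair_sum(n: int) -> float:
    """Sum of 1/(i^2 j^2) over 1 <= i < j <= n, computed directly."""
    return sum(1.0 / (i * i * j * j)
               for i in range(1, n + 1) for j in range(i + 1, n + 1))

n, x = 200, 0.5
z2, z4 = zeta_partial(2, n), zeta_partial(4, n)

# Identity used to rewrite (5.20): zeta_n(2)^2 = zeta_n(4) + 2 * sum_{i<j}
print(z2 * z2, z4 + 2.0 * pair_sum(n))

# Sandwich around the partial product of Euler's sine expansion
product = math.prod(1.0 - x * x / (k * k * math.pi**2) for k in range(1, n + 1))
lower = 1.0 - x * x / math.pi**2 * z2
upper = lower + x**4 / math.pi**4 * (z2 * z2 - z4) / 2.0
print(lower, product, upper)
```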

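5.2.2. Proof II of Euler’s formula for $\pi^2/6$. Following Hofbauer [102] we
give our second proof of Euler’s formula for $\pi^2/6$. We begin with the identity, valid
for $z \in \mathbb{C}$ not an integer multiple of $\pi$,
\[
\frac{1}{\sin^2 z} = \frac{1}{4\sin^2\frac{z}{2}\cos^2\frac{z}{2}} = \frac14\left(\frac{1}{\sin^2\frac{z}{2}} + \frac{1}{\cos^2\frac{z}{2}}\right) = \frac14\left(\frac{1}{\sin^2\frac{z}{2}} + \frac{1}{\sin^2\big(\frac{\pi - z}{2}\big)}\right),
\]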

where at the last step we used that $\cos(z) = \sin(\frac{\pi}{2} - z)$. Replacing $z$ with $\pi z$, we
get for noninteger $z$,
\[
\frac{1}{\sin^2 \pi z} = \frac14\left(\frac{1}{\sin^2\frac{z\pi}{2}} + \frac{1}{\sin^2\frac{(1-z)\pi}{2}}\right). \tag{5.21}
\]

In particular, setting $z = 1/2$, we obtain
\[
1 = \frac14\left(\frac{1}{\sin^2\frac{\pi}{2^2}} + \frac{1}{\sin^2\frac{\pi}{2^2}}\right) = \frac{2}{4}\cdot\frac{1}{\sin^2\frac{\pi}{2^2}}.
\]
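Applying (5.21) (with $z = 1/2^2$) to the right-hand side of this equation gives
\[
1 = \frac{2}{4^2}\left(\frac{1}{\sin^2\frac{\pi}{2^3}} + \frac{1}{\sin^2\frac{3\pi}{2^3}}\right) = \frac{2}{4^2}\sum_{k=0}^{1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^3}}.
\]
Applying (5.21) to each term $\frac{1}{\sin^2\frac{\pi}{2^3}}$ and $\frac{1}{\sin^2\frac{3\pi}{2^3}}$ gives
\begin{align*}
1 &= \frac{2}{4^2}\left(\frac14\left[\frac{1}{\sin^2\frac{\pi}{2^4}} + \frac{1}{\sin^2\frac{7\pi}{2^4}}\right] + \frac14\left[\frac{1}{\sin^2\frac{3\pi}{2^4}} + \frac{1}{\sin^2\frac{5\pi}{2^4}}\right]\right) \\
&= \frac{2}{4^3}\left(\frac{1}{\sin^2\frac{\pi}{2^4}} + \frac{1}{\sin^2\frac{3\pi}{2^4}} + \frac{1}{\sin^2\frac{5\pi}{2^4}} + \frac{1}{\sin^2\frac{7\pi}{2^4}}\right) \\
&= \frac{2}{4^3}\sum_{k=0}^{2^2-1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^4}}.
\end{align*}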
Repeatedly applying (5.21) (slang for “use induction”), we arrive at the following.
Lemma 5.4. For any $n \in \mathbb{N}$, we have
\[
1 = \frac{2}{4^n}\sum_{k=0}^{2^{n-1}-1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}.
\]
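Lemma 5.4 says that $\frac{2}{4^n}\sum_{k=0}^{2^{n-1}-1} 1/\sin^2\frac{(2k+1)\pi}{2^{n+1}}$ equals 1 for every $n$. It is easy to test in floating point. The sketch below is illustrative only; the function name `lemma_5_4_rhs` is an assumption of this example, not notation from the text.

```python
import math

def lemma_5_4_rhs(n: int) -> float:
    """(2 / 4^n) * sum_{k=0}^{2^(n-1)-1} 1 / sin^2((2k+1) pi / 2^(n+1))."""
    total = sum(1.0 / math.sin((2 * k + 1) * math.pi / 2 ** (n + 1)) ** 2
                for k in range(2 ** (n - 1)))
    return 2.0 * total / 4 ** n

# Each printed value should be 1 up to rounding error.
for n in range(1, 11):
    print(n, lemma_5_4_rhs(n))
```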
To establish Euler’s formula, we need one more lemma.
Lemma 5.5. For $0 < x < \pi/2$, we have
\[
-1 + \frac{1}{\sin^2 x} < \frac{1}{x^2} < \frac{1}{\sin^2 x}.
\]
Proof. Taking reciprocals and squaring in the formula from Lemma 4.55: For $0 < x < \pi/2$,
\[
\sin x < x < \tan x,
\]
we get $\cot^2 x < x^{-2} < \sin^{-2} x$. Since $\cot^2 x = \cos^2 x/\sin^2 x = \sin^{-2} x - 1$, we
conclude that
\[
\frac{1}{\sin^2 x} > \frac{1}{x^2} > -1 + \frac{1}{\sin^2 x}, \qquad 0 < x < \frac{\pi}{2},
\]
which proves the lemma. □
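Now, observe that for $0 \le k \le 2^{n-1} - 1$ we have
\[
\frac{(2k+1)\pi}{2^{n+1}} \le \frac{(2(2^{n-1}-1)+1)\pi}{2^{n+1}} = \frac{(2^n - 1)\pi}{2^{n+1}} < \frac{\pi}{2},
\]
therefore using the inequality of Lemma 5.5,
\[
-1 + \frac{1}{\sin^2 x} < \frac{1}{x^2} < \frac{1}{\sin^2 x}, \qquad 0 < x < \frac{\pi}{2},
\]
we see that
\[
-2^{n-1} + \sum_{k=0}^{2^{n-1}-1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} < \sum_{k=0}^{2^{n-1}-1} \frac{1}{\big(\frac{(2k+1)\pi}{2^{n+1}}\big)^2} < \sum_{k=0}^{2^{n-1}-1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}.
\]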

Multiplying through by $2/4^n = 2/2^{2n}$ and using Lemma 5.4, we get
\[
-\frac{1}{2^n} + 1 < \frac{8}{\pi^2}\sum_{k=0}^{2^{n-1}-1} \frac{1}{(2k+1)^2} < 1.
\]
Taking $n \to \infty$ and using the squeeze theorem, we conclude that
\[
1 \le \frac{8}{\pi^2}\sum_{k=0}^{\infty} \frac{1}{(2k+1)^2} \le 1 \quad\Longrightarrow\quad \sum_{k=0}^{\infty} \frac{1}{(2k+1)^2} = \frac{\pi^2}{8}.
\]

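Finally, summing over the even and odd numbers (see Problem 2a in Exercises 3.5),
we have
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} + \sum_{n=1}^{\infty} \frac{1}{(2n)^2} = \frac{\pi^2}{8} + \frac14\sum_{n=1}^{\infty} \frac{1}{n^2} \quad\Longrightarrow\quad \frac34\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{8}, \tag{5.22}
\]
and solving for $\sum_{n=1}^{\infty} 1/n^2$, we obtain Euler’s formula:
\[
\frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots.
\]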
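The squeeze $1 - 2^{-n} < \frac{8}{\pi^2}\sum_{k=0}^{2^{n-1}-1} \frac{1}{(2k+1)^2} < 1$ and the even-odd step can both be watched numerically. The following is a small sketch, with an illustrative helper name `odd_partial` that does not appear in the text.

```python
import math

def odd_partial(n: int) -> float:
    """Sum of 1/(2k+1)^2 for k = 0, ..., 2^(n-1) - 1."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(2 ** (n - 1)))

# Lower bound, middle term; the middle term stays below 1.
for n in range(1, 16):
    middle = 8.0 / math.pi**2 * odd_partial(n)
    print(n, 1.0 - 1.0 / 2**n, middle)

# Even-odd trick: (3/4) * zeta(2) = sum over odd squares, so zeta(2) = (4/3) * odd sum.
print(4.0 / 3.0 * odd_partial(15), math.pi**2 / 6)
```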

5.2.3. Proof III of Euler’s formula for $\pi^2/6$. In Proof II we established
Euler’s formula from Lemma 5.5. This time we apply Tannery’s theorem. The idea
is to write the identity in Lemma 5.4 in a form found in Tannery’s theorem:
\[
1 = \frac{2}{4^n}\sum_{k=0}^{2^{n-1}-1} \frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} = \sum_{k=0}^{2^{n-1}-1} a_k(n), \tag{5.23}
\]
where
\[
a_k(n) = \frac{2}{4^n}\cdot\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}.
\]
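Let us verify the hypotheses of Tannery’s theorem. First, since $\lim_{z\to 0} \frac{\sin z}{z} = 1$, we
have
\[
\lim_{n\to\infty} 2^{n+1}\sin\frac{(2k+1)\pi}{2^{n+1}} = (2k+1)\pi\cdot\lim_{n\to\infty} \frac{\sin\frac{(2k+1)\pi}{2^{n+1}}}{\frac{(2k+1)\pi}{2^{n+1}}} = (2k+1)\pi.
\]
Therefore,
\[
\lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \frac{2}{4^n}\cdot\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} = \lim_{n\to\infty} 8\cdot\frac{1}{\big(2^{n+1}\sin\frac{(2k+1)\pi}{2^{n+1}}\big)^2} = \frac{8}{\pi^2(2k+1)^2}.
\]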

To verify the other hypothesis of Tannery’s theorem we need the following lemma.
Lemma 5.6. There exists a constant $c > 0$ such that for $0 \le x \le \pi/2$,
\[
c\, x \le \sin x.
\]
Proof. Since $\lim_{z\to 0} \frac{\sin z}{z} = 1$, the function $f(x) = \sin x/x$ is a continuous
function of $x$ in $[0, \pi/2]$ where we define $f(0) := 1$. Observe that $f$ is positive on
$[0, \pi/2]$ because $f(0) = 1 > 0$ and $\sin x > 0$ for $0 < x \le \pi/2$. Therefore, by the
max/min value theorem, $f(x) \ge f(b) > 0$ on $[0, \pi/2]$ for some $b \in [0, \pi/2]$. This
proves that $c\, x \le \sin x$ on $[0, \pi/2]$ where $c = f(b) > 0$. □
Now, observe that for $0 \le k \le 2^{n-1} - 1$ we have
\[
\frac{(2k+1)\pi}{2^{n+1}} \le \frac{(2(2^{n-1}-1)+1)\pi}{2^{n+1}} = \frac{(2^n - 1)\pi}{2^{n+1}} < \frac{\pi}{2},
\]
therefore by Lemma 5.6,
\[
c\cdot\frac{(2k+1)\pi}{2^{n+1}} \le \sin\frac{(2k+1)\pi}{2^{n+1}} \quad\Longrightarrow\quad \frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} \le \frac{4^{n+1}}{c^2\pi^2(2k+1)^2}.
\]
Multiplying both sides by $2/4^n$, we obtain
\[
\frac{2}{4^n}\cdot\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} \le \frac{8}{c^2\pi^2}\cdot\frac{1}{(2k+1)^2} =: M_k.
\]
It follows that $|a_k(n)| \le M_k$ for all $n, k$. Moreover, since the sum $\sum_{k=0}^{\infty} M_k = \sum_{k=0}^{\infty} \frac{8}{c^2\pi^2}\cdot\frac{1}{(2k+1)^2}$ converges, we have verified the hypotheses of Tannery’s theorem.
Hence, taking $n \to \infty$ in (5.23), we get
\[
1 = \lim_{n\to\infty} \sum_{k=0}^{2^{n-1}-1} a_k(n) = \sum_{k=0}^{\infty} \lim_{n\to\infty} a_k(n) = \sum_{k=0}^{\infty} \frac{8}{\pi^2(2k+1)^2} \quad\Longrightarrow\quad \frac{\pi^2}{8} = \sum_{k=0}^{\infty} \frac{1}{(2k+1)^2}.
\]
Doing the even-odd trick as we did in (5.22), we know that this formula implies
Euler’s formula for $\pi^2/6$. See Problem 5 for Proof IV, a classic proof.

5.2.4. Proof I of Gregory-Leibniz-Madhava’s formula for $\pi/4$. As the
proof of Euler’s formula was based on a trigonometric identity for sines (Lemma
5.4), the proof of Gregory-Leibniz-Madhava’s formula:
\[
\frac{\pi}{4} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n - 1} = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \frac{1}{11} + \cdots,
\]

also involves trigonometric identities, but for cotangents. Concerning Leibniz’s


discovery of this formula, Christian Huygens (1629–1695) wrote “that it would be
a discovery always to be remembered among mathematicians” [252, p. 316]. In
1676, Isaac Newton (1642–1727) wrote
Leibniz’s method for obtaining convergent series is certainly very
elegant, and it would have sufficiently revealed the genius of its
author, even if he had written nothing else. [226, p. 130].
To prove the Gregory-Leibniz-Madhava formula we begin with the double angle
formula
\[
2\cot 2z = 2\,\frac{\cos 2z}{\sin 2z} = \frac{\cos^2 z - \sin^2 z}{\cos z \sin z} = \cot z - \tan z,
\]
from which we see that
\[
\cot 2z = \frac12\big(\cot z - \tan z\big).
\]
Since $\tan z = \cot(\pi/2 - z)$, we find that
\[
\cot 2z = \frac12\left(\cot z - \cot\Big(\frac{\pi}{2} - z\Big)\right).
\]
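Replacing $z$ with $\pi z/2$, we get
\[
\cot \pi z = \frac12\left(\cot\frac{z\pi}{2} - \cot\frac{(1-z)\pi}{2}\right), \tag{5.24}
\]
which is our fundamental equation. In particular, setting $z = 1/4$, we obtain
\[
1 = \frac12\left(\cot\frac{\pi}{4\cdot 2} - \cot\frac{3\pi}{4\cdot 2}\right) = \frac12\sum_{k=0}^{0}\left(\cot\frac{(4k+1)\pi}{2^3} - \cot\frac{(4k+3)\pi}{2^3}\right).
\]
Applying (5.24) to each term $\cot\frac{\pi}{2^3}$ and $\cot\frac{3\pi}{2^3}$ gives
\begin{align*}
1 &= \frac12\left(\frac12\left[\cot\frac{\pi}{2^4} - \cot\frac{7\pi}{2^4}\right] - \frac12\left[\cot\frac{3\pi}{2^4} - \cot\frac{5\pi}{2^4}\right]\right) \\
&= \frac{1}{2^2}\left(\cot\frac{\pi}{2^4} - \cot\frac{3\pi}{2^4} + \cot\frac{5\pi}{2^4} - \cot\frac{7\pi}{2^4}\right) \\
&= \frac{1}{2^2}\sum_{k=0}^{1}\left(\cot\frac{(4k+1)\pi}{2^4} - \cot\frac{(4k+3)\pi}{2^4}\right).
\end{align*}
Repeatedly applying (5.24), one can prove that for any $n \in \mathbb{N}$, we have
\[
1 = \frac{1}{2^n}\sum_{k=0}^{2^{n-1}-1}\left(\cot\frac{(4k+1)\pi}{2^{n+2}} - \cot\frac{(4k+3)\pi}{2^{n+2}}\right). \tag{5.25}
\]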

(The diligent reader will supply the details!) Since we know some nice properties
of sine from Lemma 5.6 we write the right-hand side of this identity in terms of
sine. To do so, observe that for any complex numbers $z, w$, not integer multiples of
$\pi$, we have
\[
\cot z - \cot w = \frac{\cos z}{\sin z} - \frac{\cos w}{\sin w} = \frac{\sin w\cos z - \cos w\sin z}{\sin z\sin w} = \frac{\sin(w - z)}{\sin z\sin w}.
\]
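Using this identity in (5.25), we obtain
\[
1 = \frac{1}{2^n}\sum_{k=0}^{2^{n-1}-1} \frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}} = \sum_{k=0}^{2^{n-1}-1} a_k(n), \tag{5.26}
\]
where
\[
a_k(n) = \frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}.
\]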
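Both claims about the terms $a_k(n) = \frac{1}{2^n}\sin\frac{\pi}{2^{n+1}} \big/ \big(\sin\frac{(4k+1)\pi}{2^{n+2}}\sin\frac{(4k+3)\pi}{2^{n+2}}\big)$, namely that they sum to 1 over $0 \le k \le 2^{n-1}-1$ and that for fixed $k$ they tend to $\frac{8}{\pi(4k+1)(4k+3)}$, can be sanity-checked in floating point. A minimal sketch follows; the function name `a` and the sample choices $k = 3$, $n = 30$ are only for illustration.

```python
import math

def a(k: int, n: int) -> float:
    """The term a_k(n) appearing in (5.26)."""
    d = 2 ** (n + 2)
    numerator = math.sin(math.pi / 2 ** (n + 1))
    denominator = math.sin((4 * k + 1) * math.pi / d) * math.sin((4 * k + 3) * math.pi / d)
    return numerator / denominator / 2 ** n

# The finite sums in (5.26) should all equal 1 up to rounding.
for n in (1, 5, 10):
    print(n, sum(a(k, n) for k in range(2 ** (n - 1))))

# For fixed k, a_k(n) approaches 8 / (pi (4k+1)(4k+3)) as n grows.
k = 3
print(a(k, 30), 8.0 / (math.pi * (4 * k + 1) * (4 * k + 3)))
```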
The idea to derive Gregory-Leibniz-Madhava’s formula is to take $n \to \infty$ in (5.26)
and use Tannery’s theorem. Let us verify the hypotheses of Tannery’s theorem.
First, to determine $\lim_{n\to\infty} a_k(n)$ we write
\begin{align*}
\frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}
&= 2^3\cdot\frac{2^{n+1}\sin\frac{\pi}{2^{n+1}}}{2^{n+2}\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot 2^{n+2}\sin\frac{(4k+3)\pi}{2^{n+2}}} \\
&= \frac{8}{\pi(4k+1)(4k+3)}\cdot\frac{\frac{2^{n+1}}{\pi}\sin\frac{\pi}{2^{n+1}}}{\frac{2^{n+2}}{(4k+1)\pi}\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\frac{2^{n+2}}{(4k+3)\pi}\sin\frac{(4k+3)\pi}{2^{n+2}}}.
\end{align*}
Therefore, since $\lim_{z\to 0} \frac{\sin z}{z} = 1$, we have
\[
\lim_{n\to\infty} a_k(n) = \frac{8}{\pi(4k+1)(4k+3)}.
\]
To verify the other hypothesis of Tannery’s theorem we need the following lemma.

Lemma 5.7. If $|z| \le 1$, then
\[
|\sin z| \le \frac65 |z|.
\]
Proof. Observe that for $|z| \le 1$, we have $|z|^k \le |z|$ for any $k \in \mathbb{N}$, and
\[
(2n+1)! = (2\cdot 3)\cdot(4\cdot 5)\cdots(2n\cdot(2n+1)) \ge (2\cdot 3)\cdot(2\cdot 3)\cdots(2\cdot 3) = (2\cdot 3)^n = 6^n.
\]
Thus,
\[
|\sin z| = \left|\sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}\right| \le \sum_{n=0}^{\infty} \frac{|z|^{2n+1}}{(2n+1)!} \le \sum_{n=0}^{\infty} \frac{|z|}{6^n} = |z|\sum_{n=0}^{\infty} \frac{1}{6^n} = |z|\,\frac{1}{1 - (1/6)} = \frac65 |z|.
\]

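Since $0 \le \frac{\pi}{2^{n+1}} \le 1$ for $n \in \mathbb{N}$ (because $\pi < 4$), by this lemma we have
\[
\sin\frac{\pi}{2^{n+1}} \le \frac65\cdot\frac{\pi}{2^{n+1}}. \tag{5.27}
\]
Observe that for $0 \le k \le 2^{n-1} - 1$ and $0 \le \ell \le 4$, we have
\[
\frac{(4k+\ell)\pi}{2^{n+2}} \le \frac{(4(2^{n-1}-1)+\ell)\pi}{2^{n+2}} = \frac{(2^{n+1} - 4 + \ell)\pi}{2^{n+2}} \le \frac{\pi}{2},
\]
therefore by Lemma 5.6,
\[
c\cdot\frac{(4k+\ell)\pi}{2^{n+2}} \le \sin\frac{(4k+\ell)\pi}{2^{n+2}} \quad\Longrightarrow\quad \frac{1}{\sin\frac{(4k+\ell)\pi}{2^{n+2}}} \le \frac1c\cdot\frac{2^{n+2}}{(4k+\ell)\pi}.
\]
Combining this inequality with (5.27), we see that for $0 \le k \le 2^{n-1} - 1$, we have
\begin{align*}
\frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}
&\le \frac{1}{2^n}\cdot\left(\frac65\cdot\frac{\pi}{2^{n+1}}\right)\cdot\left(\frac1c\cdot\frac{2^{n+2}}{(4k+1)\pi}\right)\cdot\left(\frac1c\cdot\frac{2^{n+2}}{(4k+3)\pi}\right) \\
&= \frac{6}{5c^2}\cdot\frac{8}{\pi(4k+1)(4k+3)}.
\end{align*}
It follows that for any $k, n$, we have
\[
|a_k(n)| \le \frac{6}{5c^2}\cdot\frac{8}{\pi(4k+1)(4k+3)} =: M_k.
\]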
Since the sum $\sum_{k=0}^{\infty} M_k$ converges, we have verified the hypotheses of Tannery’s
theorem. Hence, taking $n \to \infty$ in (5.26), we get
\[
1 = \lim_{n\to\infty} \sum_{k=0}^{2^{n-1}-1} a_k(n) = \sum_{k=0}^{\infty} \lim_{n\to\infty} a_k(n) = \sum_{k=0}^{\infty} \frac{8}{\pi(4k+1)(4k+3)} \quad\Longrightarrow\quad \frac{\pi}{4} = \sum_{k=0}^{\infty} \frac{2}{(4k+1)(4k+3)}.
\]

The last series is equivalent to Gregory-Leibniz-Madhava’s formula because if we
use partial fractions, we see that
\[
\frac{2}{(4k+1)(4k+3)} = \frac{1}{4k+1} - \frac{1}{4k+3},
\]
so when we write out the series term by term we obtain
\[
\frac{\pi}{4} = \sum_{k=0}^{\infty}\left(\frac{1}{4k+1} - \frac{1}{4k+3}\right) = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \frac{1}{11} + \cdots,
\]
which is exactly Gregory-Leibniz-Madhava’s formula.
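The equivalence between the paired series $\sum 2/((4k+1)(4k+3))$ and the alternating series can be seen directly on partial sums: $K$ paired terms agree with $2K$ alternating terms. The sketch below uses made-up helper names and is meant only as an illustration.

```python
import math

def pair_partial(terms: int) -> float:
    """Sum of 2/((4k+1)(4k+3)) for k = 0, ..., terms - 1."""
    return sum(2.0 / ((4 * k + 1) * (4 * k + 3)) for k in range(terms))

def alternating_partial(terms: int) -> float:
    """Sum of (-1)^n / (2n+1) for n = 0, ..., terms - 1."""
    return sum((-1) ** n / (2 * n + 1) for n in range(terms))

# 1000 paired terms correspond to 2000 alternating terms.
print(pair_partial(1000), alternating_partial(2000), math.pi / 4)
```

Convergence is slow: the error after $K$ paired terms is roughly $1/(8K)$.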


Exercises 5.2.
1. Prove the formula (5.19) by induction on n.
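2. Determine the following limit:
\[
\lim_{n\to\infty}\left(\frac{1}{n^3\sin\big(\frac{1\cdot 2}{n^3}\big)} + \frac{1}{n^3\sin\big(\frac{2\cdot 3}{n^3}\big)} + \cdots + \frac{1}{n^3\sin\big(\frac{n\cdot(n+1)}{n^3}\big)}\right).
\]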

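3. (Partial fraction expansion of $1/\sin^2 x$, Proof I) Here’s Hofbauer’s [102] derivation
of a partial fraction expansion of $1/\sin^2 x$.
(i) Prove that
\[
\frac{1}{\sin^2 x} = \frac{1}{2^{2n}}\sum_{k=0}^{2^n-1} \frac{1}{\sin^2\frac{x+\pi k}{2^n}}.
\]
(ii) Show that
\[
\frac{1}{\sin^2 x} = \frac{1}{2^{2n}}\sum_{k=-2^{n-1}}^{2^{n-1}-1} \frac{1}{\sin^2\frac{x+\pi k}{2^n}}. \tag{5.28}
\]
(iii) Using Lemma 5.5 prove that $\frac{1}{\sin^2 x} = \lim_{n\to\infty} \sum_{k=-n}^{n} \frac{1}{(x+\pi k)^2}$. We usually write
this as
\[
\frac{1}{\sin^2 x} = \sum_{k\in\mathbb{Z}} \frac{1}{(x+\pi k)^2}. \tag{5.29}
\]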

4. (Partial fraction expansion of 1/ sin2 x, Proof II) Give another proof of (5.29)
using Tannery’s theorem and the formula (5.28).
5. (Euler’s sum for $\pi^2/6$, Proof IV) In this problem we derive Euler’s sum via an old
argument found in Thomas John I’Anson Bromwich’s (1875–1929) book [41, p. 218–19]
(cf. similar ideas found in [6], [179], [123], [249, Problem 145]).
(i) Recall from Problem 4 in Exercises 4.7 that for any $n \in \mathbb{N}$ and $x \in \mathbb{R}$,
\[
\sin nx = \sum_{k=0}^{\lfloor (n-1)/2\rfloor} (-1)^k \binom{n}{2k+1} \cos^{n-2k-1} x\, \sin^{2k+1} x.
\]
Using this formula, prove that if $\sin x \neq 0$, then
\[
\sin(2n+1)x = \sin^{2n+1} x \sum_{k=0}^{n} (-1)^k \binom{2n+1}{2k+1} (\cot^2 x)^{n-k}.
\]
(ii) Prove that if $n \in \mathbb{N}$, then the roots of $\sum_{k=0}^{n} (-1)^k \binom{2n+1}{2k+1} t^{n-k} = 0$ are the $n$
numbers $t = \cot^2\frac{m\pi}{2n+1}$ where $m = 1, 2, \ldots, n$.
(iii) Prove that if $n \in \mathbb{N}$, then
\[
\sum_{k=1}^{n} \cot^2\frac{k\pi}{2n+1} = \frac{n(2n-1)}{3}. \tag{5.30}
\]
Suggestion: Recall that if $p(t)$ is a polynomial of degree $n$ with roots $r_1, \ldots, r_n$,
then $p(t) = a(t - r_1)(t - r_2)\cdots(t - r_n)$ for a constant $a$. What’s the coefficient
of $t^{n-1}$ if you multiply out $a(t - r_1)\cdots(t - r_n)$?
(iv) From the identity (5.30), derive Euler’s sum.
6. (Euler’s sum for $\pi^2/6$, Proof V) Here’s another proof (cf. [102])!
(i) Use (5.29) to prove that for any $n \in \mathbb{N}$,
\[
\frac{1}{\sin^2 x} = \frac{1}{n^2}\sum_{m=0}^{n-1} \frac{1}{\sin^2\frac{x+\pi m}{n}}. \tag{5.31}
\]
Suggestion: Replace $x$ with $\frac{x+\pi m}{n}$ in (5.29) and sum from $m = 0$ to $n - 1$.
(ii) Take the $m = 0$ term in (5.31) to the left, replace $n$ by $2n + 1$, and then take
$x \to 0$ to derive the identity
\[
\sum_{k=1}^{n} \frac{1}{\sin^2\frac{\pi k}{2n+1}} = \frac{2n(n+1)}{3}. \tag{5.32}
\]
(iii) From the identity (5.32), derive Euler’s sum.


7. In this problem we prove that
\[
\zeta(4) = \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}.
\]
(i) Prove that for any nonnegative real numbers $a_1, \ldots, a_n$, we have
\[
1 - \sum_{k=1}^{n} a_k + \sum_{1\le i<j\le n} a_i a_j - \sum_{1\le i<j<k\le n} a_i a_j a_k \le \prod_{k=1}^{n} (1 - a_k) \le 1 - \sum_{k=1}^{n} a_k + \sum_{1\le i<j\le n} a_i a_j.
\]
(ii) Applying the inequalities in (i) to $\prod_{k=1}^{n}\big(1 - \frac{x^2}{k^2\pi^2}\big)$, prove that $\zeta(4) = \pi^4/90$.

5.3. ★ Euler’s formula for ζ(2k)


In Euler’s famous 1735 paper De summis serierum reciprocarum (On the sums
of series of reciprocals), he found not only ζ(2) but he also explicitly determined
ζ(n) for n even up to n = 12, although it is clear from his method that he could get
the value of $\zeta(n)$ for any even $n$. Following G.T. Williams [247], we derive Euler’s
formula for $\zeta(n)$, for all $n \in \mathbb{N}$ even, as a rational multiple of $\pi^n$ (this proof is by
far the most elementary proof I know of).

5.3.1. Williams’ formula. In order to find Euler’s formula, we need the


following theorem, whose proof is admittedly long but it is completely elementary
in the sense that it basically uses only high school arithmetic and the most basic
facts about series!
Theorem 5.8 (Williams’ formula). For any $k \in \mathbb{N}$ with $k \ge 2$, we have
\[
\left(k + \frac12\right)\zeta(2k) = \sum_{\ell=1}^{k-1} \zeta(2\ell)\,\zeta(2k - 2\ell).
\]

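Proof. Fix $k \in \mathbb{N}$ with $k \ge 2$. Then for $N \in \mathbb{N}$, define
\[
a_N := \sum_{\ell=1}^{k-1}\left(\sum_{m=1}^{N} \frac{1}{m^{2\ell}}\right)\left(\sum_{n=1}^{N} \frac{1}{n^{2k-2\ell}}\right) = \sum_{\ell=1}^{k-1}\sum_{m,n=1}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}},
\]
where for simplicity in notation, we write the double summation $\sum_{m=1}^{N}\sum_{n=1}^{N}$ as
a single entity $\sum_{m,n=1}^{N}$. Since $\zeta(z) = \sum_{n=1}^{\infty} 1/n^z = \lim_{N\to\infty} \sum_{n=1}^{N} 1/n^z$, we have
\[
\sum_{\ell=1}^{k-1} \zeta(2\ell)\,\zeta(2k - 2\ell) = \lim_{N\to\infty} a_N,
\]
so we just have to work out a nice formula for $a_N$ and show that $\lim_{N\to\infty} a_N = \big(k + \frac12\big)\zeta(2k)$. To accomplish this we proceed in three steps.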
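Step 1: We first break up the double sum $\sum_{m,n=1}^{N}$ into two sums, one with
$m = n$ and the other with $m \neq n$:
\[
a_N = \sum_{\ell=1}^{k-1}\sum_{m=n=1}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}} + \sum_{\ell=1}^{k-1}\sum_{m\neq n}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}},
\]
where we denote the summation $\sum_{m,n=1}^{N}$ with identical $m, n$ omitted by $\sum_{m\neq n}^{N}$.
When $m = n$, we have
\[
\sum_{\ell=1}^{k-1}\sum_{m=n=1}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}} = \sum_{\ell=1}^{k-1}\sum_{n=1}^{N} \frac{1}{n^{2\ell} n^{2k-2\ell}} = \sum_{\ell=1}^{k-1}\sum_{n=1}^{N} \frac{1}{n^{2k}} = (k-1)\sum_{n=1}^{N} \frac{1}{n^{2k}}, \tag{5.33}
\]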
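and for $m \neq n$, we have
\[
\sum_{\ell=1}^{k-1}\sum_{m\neq n}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}} = \sum_{\ell=1}^{k-1}\sum_{m\neq n}^{N} \frac{1}{n^{2k}}\Big(\frac{n}{m}\Big)^{2\ell} = \sum_{m\neq n}^{N} \frac{1}{n^{2k}}\sum_{\ell=1}^{k-1}\Big(\frac{n}{m}\Big)^{2\ell}.
\]
Recalling the formula for a geometric sum: $\sum_{\ell=1}^{k-1} r^\ell = (r - r^k)/(1 - r)$ where $r \neq 1$,
we can write
\[
\frac{1}{n^{2k}}\sum_{\ell=1}^{k-1}\Big(\frac{n}{m}\Big)^{2\ell} = \frac{1}{n^{2k}}\cdot\frac{(n/m)^2 - (n/m)^{2k}}{1 - (n/m)^2} = \frac{n^{2-2k} - m^{2-2k}}{m^2 - n^2}.
\]
Therefore,
\[
\sum_{\ell=1}^{k-1}\sum_{m\neq n}^{N} \frac{1}{m^{2\ell} n^{2k-2\ell}} = \sum_{m\neq n}^{N} \frac{n^{2-2k} - m^{2-2k}}{m^2 - n^2} = \sum_{m\neq n}^{N} \frac{n^{2-2k}}{m^2 - n^2} - \sum_{m\neq n}^{N} \frac{m^{2-2k}}{m^2 - n^2}. \tag{5.34}
\]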
The second term is actually the same as the first because
\[
-\sum_{m\neq n}^{N} \frac{m^{2-2k}}{m^2 - n^2} = \sum_{m\neq n}^{N} \frac{m^{2-2k}}{n^2 - m^2} = \sum_{n\neq m}^{N} \frac{n^{2-2k}}{m^2 - n^2},
\]
where in the last equality we switched the letters $m$ and $n$. Combining (5.33) with
(5.34) we get
\[
a_N = (k-1)\sum_{n=1}^{N} \frac{1}{n^{2k}} + 2\sum_{m\neq n}^{N} \frac{n^{2-2k}}{m^2 - n^2}. \tag{5.35}
\]

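Step 2: We now find a nice expression for $2\sum_{m\neq n}^{N} \frac{n^{2-2k}}{m^2 - n^2}$. To this end we
write this as
\begin{align*}
2\sum_{m\neq n}^{N} \frac{n^{2-2k}}{m^2 - n^2} &= 2\sum_{n=1}^{N}\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\frac{n^{2-2k}}{m^2 - n^2} \\
&= \sum_{n=1}^{N} \frac{1}{n^{2k-1}}\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\frac{2n}{m^2 - n^2} \\
&= \sum_{n=1}^{N} \frac{1}{n^{2k-1}}\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\left(\frac{1}{m-n} - \frac{1}{m+n}\right), \tag{5.36}
\end{align*}
where at the last step we used partial fractions $\frac{2n}{m^2 - n^2} = \frac{1}{m-n} - \frac{1}{m+n}$. We now
work out the second sums in (5.36). First observe that
\begin{align*}
\sum_{m=1}^{n-1} \frac{1}{m-n} &= \frac{1}{1-n} + \frac{1}{2-n} + \cdots + \frac{1}{-1} \\
&= -\left(\frac{1}{n-1} + \frac{1}{n-2} + \cdots + \frac11\right) = -\sum_{m=1}^{n-1} \frac1m = \frac1n - \sum_{m=1}^{n} \frac1m.
\end{align*}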

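Second, observe that
\[
\sum_{m=n+1}^{N} \frac{1}{m-n} = \frac11 + \frac12 + \cdots + \frac{1}{N-n} = \sum_{m=1}^{N-n} \frac1m.
\]
Third, observe that
\begin{align*}
-\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\frac{1}{m+n} &= \frac{1}{n+n} - \sum_{m=1}^{N} \frac{1}{m+n} \\
&= \frac{1}{n+n} - \left(\frac{1}{1+n} + \frac{1}{2+n} + \cdots + \frac{1}{N+n}\right) \\
&= \frac{1}{2n} - \sum_{m=n+1}^{N+n} \frac1m.
\end{align*}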
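Therefore,
\begin{align*}
\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\left(\frac{1}{m-n} - \frac{1}{m+n}\right)
&= \left(\frac1n - \sum_{m=1}^{n} \frac1m\right) + \sum_{m=1}^{N-n} \frac1m + \left(\frac{1}{2n} - \sum_{m=n+1}^{N+n} \frac1m\right) \\
&= \frac{3}{2n} - \sum_{m=1}^{N+n} \frac1m + \sum_{m=1}^{N-n} \frac1m \tag{5.37} \\
&= \frac{3}{2n} - \sum_{m=N-n+1}^{N+n} \frac1m.
\end{align*}
Thus, by (5.36), we have
\begin{align*}
2\sum_{m\neq n}^{N} \frac{n^{2-2k}}{m^2 - n^2} &= \sum_{n=1}^{N} \frac{1}{n^{2k-1}}\left(\frac{3}{2n} - \sum_{m=N-n+1}^{N+n} \frac1m\right) \\
&= \frac32\sum_{n=1}^{N} \frac{1}{n^{2k}} - \sum_{n=1}^{N}\left(\frac{1}{n^{2k-1}}\sum_{m=N-n+1}^{N+n} \frac1m\right).
\end{align*}
Plugging this into the formula (5.35) for $a_N$, we obtain
\[
a_N = \left(k + \frac12\right)\sum_{n=1}^{N} \frac{1}{n^{2k}} - \sum_{n=1}^{N}\left(\frac{1}{n^{2k-1}}\sum_{m=N-n+1}^{N+n} \frac1m\right).
\]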

Therefore, $\lim a_N = (k + 1/2)\zeta(2k)$ provided we can show that
\[
0 = \lim_{N\to\infty} \sum_{n=1}^{N}\left(\frac{1}{n^{2k-1}}\sum_{m=N-n+1}^{N+n} \frac1m\right). \tag{5.38}
\]

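Step 3: We prove the limit (5.38) (see Problem 1 for a proof using Tannery’s
theorem). To this end, observe that
\begin{align*}
\sum_{m=N-n+1}^{N+n} \frac1m &= \frac{1}{N-n+1} + \frac{1}{N-n+2} + \cdots + \frac{1}{N+n} \\
&\le \frac{1}{N-n+1} + \frac{1}{N-n+1} + \cdots + \frac{1}{N-n+1} = \frac{2n}{N-n+1}.
\end{align*}
Therefore,
\[
\sum_{n=1}^{N}\left(\frac{1}{n^{2k-1}}\sum_{m=N-n+1}^{N+n} \frac1m\right) \le 2\sum_{n=1}^{N}\left(\frac{n}{n^{2k-1}}\cdot\frac{1}{N-n+1}\right) \le 2\sum_{n=1}^{N} \frac{1}{n^2(N-n+1)},
\]
where recall that $k \ge 2$ so that $\frac{n}{n^{2k-1}} \le \frac{1}{n^2}$.
Using partial fractions we see that
\[
\frac{1}{n(N-n+1)} = \frac{1}{N+1}\left(\frac1n + \frac{1}{N-n+1}\right).
\]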
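Thus,
\begin{align*}
\frac{1}{n^2(N-n+1)} &= \frac1n\cdot\frac{1}{n(N-n+1)} = \frac1n\cdot\frac{1}{N+1}\left(\frac1n + \frac{1}{N-n+1}\right) \\
&= \frac{1}{N+1}\left(\frac{1}{n^2} + \frac{1}{n(N-n+1)}\right) \\
&= \frac{1}{N+1}\left(\frac{1}{n^2} + \frac{1}{N+1}\left(\frac1n + \frac{1}{N-n+1}\right)\right) \\
&\le \frac{1}{N+1}\cdot\frac{1}{n^2} + \frac{1}{(N+1)^2}(1 + 1) \\
&= \frac{1}{N+1}\cdot\frac{1}{n^2} + \frac{2}{(N+1)^2}.
\end{align*}
Hence,
\begin{align*}
\sum_{n=1}^{N}\left(\frac{1}{n^{2k-1}}\sum_{m=N-n+1}^{N+n} \frac1m\right) &\le 2\sum_{n=1}^{N} \frac{1}{n^2(N-n+1)} \\
&\le \frac{2}{N+1}\sum_{n=1}^{N} \frac{1}{n^2} + \sum_{n=1}^{N} \frac{4}{(N+1)^2} \le \frac{2\pi^2/6}{N+1} + \frac{4N}{(N+1)^2}.
\end{align*}
Taking $N \to \infty$ proves (5.38) and completes our proof. □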
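The finite-$N$ bookkeeping in Steps 1 and 2 can be checked exactly with rational arithmetic. The sketch below is an illustration, not part of the original proof, and all function names are invented here. It compares $a_N$ computed from its definition with $(k + \frac12)\sum_{n=1}^{N} n^{-2k}$ minus the double sum appearing in (5.38); the double sum enters with a minus sign because $\sum_{m=1}^{N-n} \frac1m - \sum_{m=1}^{N+n} \frac1m = -\sum_{m=N-n+1}^{N+n} \frac1m$.

```python
from fractions import Fraction

def power_sum(s: int, N: int) -> Fraction:
    """Exact partial sum 1/1^s + ... + 1/N^s."""
    return sum(Fraction(1, n ** s) for n in range(1, N + 1))

def a_direct(k: int, N: int) -> Fraction:
    """a_N computed straight from its definition in the proof."""
    return sum(power_sum(2 * l, N) * power_sum(2 * k - 2 * l, N) for l in range(1, k))

def tail(k: int, N: int) -> Fraction:
    """The double sum whose limit (5.38) asserts is zero."""
    return sum(Fraction(1, n ** (2 * k - 1))
               * sum(Fraction(1, m) for m in range(N - n + 1, N + n + 1))
               for n in range(1, N + 1))

def a_closed(k: int, N: int) -> Fraction:
    """(k + 1/2) * sum_{n<=N} 1/n^(2k), minus the tail term."""
    return Fraction(2 * k + 1, 2) * power_sum(2 * k, N) - tail(k, N)

for N in (1, 2, 5, 10):
    print(N, a_direct(2, N) == a_closed(2, N), float(tail(2, N)))
```

Because `Fraction` arithmetic is exact, the comparison is a genuine identity check rather than a floating-point approximation; the last printed column shows the tail term shrinking as $N$ grows.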


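In particular, setting $k = 2$ we see that $\frac52\zeta(4) = \zeta(2)^2$. Thus, $\zeta(4) = \frac25\cdot\frac{\pi^4}{36} = \frac{\pi^4}{90}$.
Taking $k = 3$, we get
\[
\frac72\zeta(6) = \zeta(2)\zeta(4) + \zeta(4)\zeta(2) = 2\zeta(2)\zeta(4) = 2\cdot\frac{\pi^2}{6}\cdot\frac{\pi^4}{90},
\]
which, after doing the algebra, gives $\zeta(6) = \pi^6/945$. Thus,
\[
\frac{\pi^4}{90} = \sum_{n=1}^{\infty} \frac{1}{n^4}, \qquad \frac{\pi^6}{945} = \sum_{n=1}^{\infty} \frac{1}{n^6}.
\]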

We can also derive explicit formulas for ζ(2k) for all k ∈ N.

5.3.2. Euler’s formula for ζ(2k). To this end, we first define a sequence
$C_1, C_2, C_3, \ldots$ by $C_1 = \frac{1}{12}$, and for $k \ge 2$, we define
\[
C_k = -\frac{1}{2k+1}\sum_{\ell=1}^{k-1} C_\ell\, C_{k-\ell}. \tag{5.39}
\]
The first few $C_k$’s are
\[
C_1 = \frac{1}{12}, \quad C_2 = -\frac{1}{720}, \quad C_3 = \frac{1}{30240}, \quad C_4 = -\frac{1}{1209600}, \ \ldots.
\]
The numbers $C_k$ are rational numbers (easily proved by induction) and are related
to the Bernoulli numbers to be covered in Section 6.8, but it’s not necessary to
know this.⁶ We are now ready to prove . . .

⁶ Explicitly, $C_k = B_{2k}/(2k)!$ but this formula is not needed.
Theorem 5.9 (Euler’s formulæ). For any $k \in \mathbb{N}$, we have
\[
\sum_{n=1}^{\infty} \frac{1}{n^{2k}} = (-1)^{k-1}\frac{(2\pi)^{2k} C_k}{2}; \quad\text{that is,}\quad \zeta(2k) = (-1)^{k-1}\frac{(2\pi)^{2k} C_k}{2}. \tag{5.40}
\]
Proof. When $k = 1$, we have
\[
(-1)^{k-1}\frac{(2\pi)^{2k} C_k}{2} = \frac{(2\pi)^2 (1/12)}{2} = \frac{\pi^2}{6} = \zeta(2),
\]
so our theorem holds when $k = 1$. Let $k \ge 2$ and assume our theorem holds for all
natural numbers less than $k$; we shall prove it holds for $k$. Using Williams’ formula,
the induction hypothesis, and then the formula (5.39) for $C_k$, we see that
\begin{align*}
\left(k + \frac12\right)\zeta(2k) &= \sum_{\ell=1}^{k-1} \zeta(2\ell)\,\zeta(2k - 2\ell) \\
&= \sum_{\ell=1}^{k-1}\left((-1)^{\ell-1}\frac{(2\pi)^{2\ell} C_\ell}{2}\right)\left((-1)^{k-\ell-1}\frac{(2\pi)^{2k-2\ell} C_{k-\ell}}{2}\right) \\
&= \sum_{\ell=1}^{k-1} (-1)^{k-2}\frac{(2\pi)^{2k} C_\ell\, C_{k-\ell}}{4} \\
&= (-1)^{k-2}\frac{(2\pi)^{2k}}{4}\sum_{\ell=1}^{k-1} C_\ell\, C_{k-\ell} \\
&= (-1)^{k-1}\frac{(2\pi)^{2k}}{4}(2k+1)\,C_k.
\end{align*}
Dividing everything by $(k + 1/2) = (1/2)(2k + 1)$ proves our result for $k$. □
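The recursion $C_1 = \frac{1}{12}$, $C_k = -\frac{1}{2k+1}\sum_{\ell=1}^{k-1} C_\ell C_{k-\ell}$ and the formula $\zeta(2k) = (-1)^{k-1}(2\pi)^{2k} C_k/2$ are easy to run. The following sketch computes the $C_k$ exactly and compares the resulting values of $\zeta(2k)$ with brute-force partial sums; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
import math

def williams_coefficients(count: int) -> list:
    """Return [C_1, ..., C_count] as exact fractions via the recursion (5.39)."""
    C = [Fraction(1, 12)]
    for k in range(2, count + 1):
        # C[j - 1] stores C_j, so this is sum_{l=1}^{k-1} C_l * C_{k-l}
        convolution = sum(C[l - 1] * C[k - l - 1] for l in range(1, k))
        C.append(-convolution / (2 * k + 1))
    return C

def zeta_even(k: int) -> float:
    """zeta(2k) from Euler's formula (5.40)."""
    C_k = williams_coefficients(k)[k - 1]
    return (-1) ** (k - 1) * (2 * math.pi) ** (2 * k) * float(C_k) / 2

print(williams_coefficients(4))
for k in (1, 2, 3):
    partial = sum(1.0 / n ** (2 * k) for n in range(1, 100001))
    print(k, zeta_even(k), partial)
```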

As a side note, we remark that (5.40) shows that $\zeta(2k)$ is a rational number
times $\pi^{2k}$; in particular, since $\pi$ is transcendental (see, for example, [162, 163,
136]) it follows that $\zeta(n)$ is transcendental for $n$ even. One may ask if there are
similar expressions like (5.40) for sums of the reciprocals of the odd powers (e.g.
$\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3}$). Unfortunately, there are no known formulas! Moreover, it is not
even known if $\zeta(n)$ is transcendental for $n$ odd and in fact, among the values $\zeta(n)$ with $n$ odd, only
$\zeta(3)$ is known without a doubt to be irrational; this was proven by Roger Apéry
(1916–1994) in 1979 (see [29], [230])!
Exercises 5.3.
1. Prove (5.38) using Tannery’s theorem.
2. (Cf. [116]) Let $H_n = \sum_{m=1}^{n} \frac1m$, the $n$-th partial sum of the harmonic series. In this
problem we prove the equalities:
\[
\zeta(3) = \sum_{n=1}^{\infty} \frac{1}{n^3} = \sum_{n=1}^{\infty} \frac{H_n}{(n+1)^2} = \frac12\sum_{n=1}^{\infty} \frac{H_n}{n^2}. \tag{5.41}
\]

(a) Prove that for N ∈ N,


N N N N −1
X 1 X Hm X 1 X Hk
= 2
= 3
+ ,
m,n=1
mn(m + n) m=1
m m=1
m n=1
(k + 1)2

PN
where the notation is as in the proof of Williams’ theorem. Suggestion:
m,n=1  
1
For the first equality, use that mn(m+n) = m12 n1 − m+n1
.
(b) Now prove that for N ∈ N,
N N X N
X 1 X 1
=2 .
m,n=1
mn(m + n) m=1 n=1
m(m + n)2
1 1 1
Suggestion: Use that mn(m+n) = m(m+n) 2 + n(m+n)2 .

(c) In Part (b), instead of using n as the inner summation variable on the right-hand
side, change to k = m + n − 1 and in doing so, prove that
N N XN N m+N
X−1
X 1 X 1 X 1
=2 2
+ bN , where bN = 2 .
m,n=1
mn(m + n) m=1
m(k + 1) m=1
m(k + 1)2
k=m k=N +1
PN PN 1
PN Hk
(d) Show that m=1 =
k=m m(k+1)2 and that bN → 0 as N → ∞.
k=1 (k+1)2
Now prove (5.41).
3. (Euler’s sum for π 2 /6, Proof VI) In this problem we prove Euler’s formula for
π 2 /6 by carefully squaring Gregory-Leibniz-Madhava’s formula for π/4; thus, taking
Gregory-Leibniz-Madhava’s formula as “given,” we derive Euler’s formula.7 The proof
is very much in the same spirit as the proof of Williams’ formula; see Section 6.11 for
another, more systematic, proof.
(i) Given N ∈ N, prove that
N
! N ! N N
X (−1)m X (−1)n X 1 X (−1)m+n
= +
m=0
(2m + 1) n=0
(2n + 1) n=0
(2n + 1)2 (2m + 1)(2n + 1)
m6=n
PN
where the notation m6=n is as in the proof of Williams’ theorem.
(ii) For m 6= n, prove that8
2m+1 2n+1
1 2n+1
− 2m+1
=
(2m + 1)(2n + 1) (2m + 1)2 − (2n + 1)2
   
2m + 1 1 2n + 1 1
= − ,
2n + 1 (2m + 1)2 − (2n + 1)2 2m + 1 (2m + 1)2 − (2n + 1)2
then use this identity to prove that
N N
X (−1)m+n X (−1)m+n 2m + 1
=2 ·
(2m + 1)(2n + 1) 2n + 1 (2m + 1)2 − (2n + 1)2
m6=n m6=n
!
X (−1)n n−1
N X XN
2m + 1
=2 + (−1)m 2 − (2n + 1)2
.
n=0
2n + 1 m=0 m=n+1
(2m + 1)

(iii) Prove that


n−1 N
! NX
+n+1
X X 2m + 1 (−1)n 1
4 + (−1)m = − + (−1)N
.
m=0 m=n+1
(2m + 1)2 − (2n + 1)2 2n + 1 m=N −n+1
m

Suggestion: Note that 4 (2m+1)2m+1


2 −(2n+1)2 =
2m+1
(m+n+1)(m−n)
= 1
m−n
+ 1
m+n+1
.

7
Actually, this works in reverse: We can just as well take Euler’s formula as “given,” and
then derive Gregory-Leibniz-Madhava’s formula!
8 1 1 1
Alternatively, one can prove that (2m+1)(2n+1) = 2(m−n)(2n+1) + 2(n−m)(2m+1) and use
PN m+n
(−1)
this decomposition to simplify m6=n (2m+1)(2n+1) . However, if you do Problem 4 to follow, in
your proof you will run into the decomposition appearing in Part (b) above!

(iv) Prove that


N
! N
! N
X (−1)m X (−1)n 1X 1
= bN + ,
m=0
(2m + 1) n=0
(2n + 1) 2 n=0 (2n + 1)2
N NX
+n+1
!
1 X (−1)N +n 1
where bN = .
2 n=0 (2n + 1) m=N −n+1
m
P∞
(v) Prove that bN → 0 as N → ∞, and conclude that (π/4)2 = 1
2
1
n=0 (2n+1)2 .
Finally, derive Euler’s formula for π 2 /6.
4. (Williams’ other formula) For any $k \in \mathbb{N}$, define
\[
\xi(k) = \sum_{n=0}^{\infty} (-1)^n \frac{1}{(2n+1)^k}.
\]
For example, by Gregory-Leibniz-Madhava’s formula we know that $\xi(1) = \pi/4$. Prove
that for any $k \in \mathbb{N}$ with $k \ge 2$, we have
\[
\left(k - \frac12\right)\sum_{n=0}^{\infty} \frac{1}{(2n+1)^{2k}} = \sum_{\ell=0}^{k-1} \xi(2\ell+1)\,\xi(2k - 2\ell - 1).
\]
This formula also holds when $k = 1$, for it’s simply the identity $\frac12\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \xi(1)^2$,
a fact established in Problem 3. Suggestion: Imitate the proof of Williams’ formula.
You will see that ideas from Problem 3 will also be useful.
5. (Cf. [247], [37], [117]) Let $H_n = \sum_{m=1}^{n} \frac1m$, the $n$-th partial sum of the harmonic series.
In this problem we prove that for any $k \in \mathbb{N}$ with $k \ge 2$, we have
\[
(k+2)\,\zeta(k+1) = \sum_{\ell=1}^{k-2} \zeta(k-\ell)\,\zeta(\ell+1) + 2\sum_{n=1}^{\infty} \frac{H_n}{n^k}, \tag{5.42}
\]
a formula due to Euler (no surprise!). The proof is very similar to the proof of Williams’
formula, with some twists of course. You may proceed as follows.
(i) For $N \in \mathbb{N}$, define
\[
a_N = \sum_{\ell=1}^{k-2}\left(\sum_{m=1}^{N} \frac{1}{m^{k-\ell}}\right)\left(\sum_{n=1}^{N} \frac{1}{n^{\ell+1}}\right) = \sum_{m,n=1}^{N}\sum_{\ell=1}^{k-2} \frac{1}{m^{k-\ell} n^{\ell+1}}.
\]
Summing the geometric sum $\sum_{\ell=1}^{k-2} \frac{1}{m^{k-\ell} n^{\ell+1}} = \frac{1}{m^k n}\sum_{\ell=1}^{k-2} (m/n)^\ell$, prove that
\[
a_N = (k-2)\sum_{n=1}^{N} \frac{1}{n^{k+1}} + 2\sum_{m\neq n}^{N} \frac{1}{n^{k-1}\, m\,(m-n)},
\]
where the notation $\sum_{m\neq n}^{N}$ is as in the proof of Williams’ theorem.
(ii) Prove that
\[
\sum_{m\neq n}^{N} \frac{1}{n^{k-1}\, m\,(m-n)} = \sum_{n=1}^{N} \frac{1}{n^k}\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\left(\frac{1}{m-n} - \frac1m\right).
\]
(iii) Prove that
\[
\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\left(\frac{1}{m-n} - \frac1m\right) = \frac2n - H_n - \sum_{m=N-n+1}^{N} \frac1m.
\]
Suggestion: The computation around (5.37) might be helpful.
(iv) Prove that
\[
a_N = (k+2)\sum_{n=1}^{N} \frac{1}{n^{k+1}} - 2\sum_{n=1}^{N} \frac{H_n}{n^k} - b_N, \quad\text{where}\quad b_N = 2\sum_{n=1}^{N}\left(\frac{1}{n^k}\sum_{m=N-n+1}^{N} \frac1m\right).
\]
(v) Prove that $b_N \to 0$ as $N \to \infty$, and conclude that (5.42) holds.
6. (Cf. [116], [117]) Here are a couple of applications of (5.42). First, use (5.42) to give a
quick proof of (5.41). Second, prove that
\[
\frac{\pi^4}{72} = \sum_{n=1}^{\infty} \frac{1}{n^3}\left(1 + \frac12 + \frac13 + \cdots + \frac1n\right).
\]
Part 2

Extracurricular activities
CHAPTER 6

Advanced theory of infinite series

Ut non-finitam Seriem finita coërcet,


Summula, & in nullo limite limes adest:
Sic modico immensi vestigia Numinis haerent
Corpore, & angusto limite limes abest.
Cernere in immenso parvum, dic, quanta voluptas!
In parvo immensum cernere, quanta, Deum.

Even as the finite encloses an infinite series


And in the unlimited limits appear,
So the soul of immensity dwells in minutia
And in the narrowest limits no limit in here.
What joy to discern the minute in infinity!
The vast to perceive in the small, what divinity!
Jacob Bernoulli (1654-1705) Ars Conjectandi.[216, p. 271]
This chapter is about going in-depth into the theory and application of infinite
series. One infinite series that will come up again and again in this chapter and the
next chapter as well, is the Riemann zeta function

\[
\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z},
\]
n

introduced in Section 4.6. Amongst many other things, in this chapter we’ll see
how to write some well-known constants in terms of the Riemann zeta function;
e.g. we’ll derive the following neat formula for our friend log 2 (§ 6.5):

\[
\log 2 = \sum_{n=2}^{\infty} \frac{1}{2^n}\,\zeta(n),
\]
2

another formula for our friend the Euler-Mascheroni constant (§ 6.9):



\[
\gamma = \sum_{n=2}^{\infty} \frac{(-1)^n}{n}\,\zeta(n),
\]
n

and two more formulas involving our most delicious friend π (see §’s 6.10 and 6.11):
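\[
\pi = \sum_{n=2}^{\infty} \frac{3^n - 1}{4^n}\,\zeta(n+1), \qquad \frac{\pi^2}{6} = \zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots.
\]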

We’ll also re-derive Gregory-Leibniz-Madhava’s formula (§ 6.10)


\[
\frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \frac{1}{11} + - \cdots,
\]

and Machin’s formula which started the “decimal place race” of computing π (§
6.10):
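\[
\pi = 4\left(4\arctan\Big(\frac15\Big) - \arctan\Big(\frac{1}{239}\Big)\right) = 4\sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)}\left(\frac{4}{5^{2n+1}} - \frac{1}{239^{2n+1}}\right).
\]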
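Machin’s formula in the form $\pi = 4\sum_{n\ge 0} \frac{(-1)^n}{2n+1}\big(\frac{4}{5^{2n+1}} - \frac{1}{239^{2n+1}}\big)$ converges fast enough that a dozen terms already exhaust double precision, which hints at why it started the “decimal place race”. A small sketch, with the illustrative function name `machin_pi`:

```python
import math

def machin_pi(terms: int) -> float:
    """Partial sum of Machin's series for pi using the first `terms` terms."""
    return 4.0 * sum((-1) ** n / (2 * n + 1)
                     * (4.0 / 5 ** (2 * n + 1) - 1.0 / 239 ** (2 * n + 1))
                     for n in range(terms))

for terms in (1, 2, 4, 8, 12):
    print(terms, machin_pi(terms), math.pi)

# The same value via the arctangent form of the formula.
print(4.0 * (4.0 * math.atan(1 / 5) - math.atan(1 / 239)))
```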

Leibniz’s formula for π/4 is an example of an “alternating series”. We study


these types of series in Section 6.1. In Section 6.2 and Section 6.3 we look at the
ratio and root tests, which you are probably familiar with from elementary calculus.
In Section 6.4 we look at power series and prove some pretty powerful properties of
power series. The formulas for $\log 2$, $\gamma$, and the formula $\pi = \sum_{n=2}^{\infty} \frac{3^n - 1}{4^n}\,\zeta(n+1)$
displayed above are proved using a famous theorem called the Cauchy double series
theorem. This theorem, and double sequences and series in general, are the subject
of Section 6.5. In Section 6.6 we investigate rearranging (that is, mixing up the
order of the terms in a) series. Here’s an interesting question: Does the series
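\[ \sum_{p \text{ is prime}} \frac{1}{p} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \frac{1}{19} + \frac{1}{23} + \frac{1}{29} + \cdots \]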

converge or diverge? For the answer, see Section 6.7. In elementary calculus, you
have probably never seen the power series representations of tangent and secant. This
is because these series are somewhat sophisticated mathematically speaking. In
Section 6.8 we shall derive the power series representations

\[ \tan z = \sum_{n=1}^{\infty} (-1)^{n-1}\, \frac{2^{2n}(2^{2n} - 1)\, B_{2n}}{(2n)!}\, z^{2n-1}, \]
and
\[ \sec z = \sum_{n=0}^{\infty} (-1)^n\, \frac{E_{2n}}{(2n)!}\, z^{2n}. \]
Here, the B2n ’s are called “Bernoulli numbers” and the E2n ’s are called “Euler
numbers,” which are certain numbers having extraordinary properties. Although
you’ve probably never seen the tangent and secant power series, you might have
seen the logarithmic, binomial, and arctangent series:
\[ \log(1+z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}\, z^n, \qquad (1+z)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n, \qquad \arctan z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{2n+1}, \]

where α ∈ R. You most likely used calculus (derivatives and integrals) to derive
these formulæ. In Section 6.9 we shall derive these formulæ without any calculus.
Finally, in Sections 6.10 and 6.11 we derive many incredible and awe-inspiring
formulæ involving π. In particular, we again look at the Basel problem.
Chapter 6 objectives: The student will be able to . . .
• determine the convergence, and radius and interval of convergence, for an infinite
series and power series, respectively, using various tests, e.g. Dirichlet, Abel,
ratio, root, and others.
• apply Cauchy’s double series theorem and know how it relates to rearrangement,
and multiplication and composition of power series.
• identify series formulæ for the various elementary functions (binomial, arctan-
gent, etc.) and for π.

6.1. Summation by parts, bounded variation, and alternating series


In elementary calculus, you studied “integration by parts,” a formula I’m sure
you used quite often when trying to integrate tricky integrals. In this section we study a
discrete version of the integration by parts formula called “summation by parts,”
which is used to sum tricky summations! Summation by parts has broad applications,
including finding sums of powers of integers and deriving some famous
convergence tests for series, the Dirichlet and Abel tests.
6.1.1. Summation by parts and Abel’s lemma. Here is the famous sum-
mation by parts formula. The formula is complicated, but the proof is simple.

Theorem 6.1 (Summation by parts). For any complex sequences $\{a_n\}$ and
$\{b_n\}$, we have
\[ \sum_{k=m}^{n} b_{k+1}(a_{k+1} - a_k) + \sum_{k=m}^{n} a_k (b_{k+1} - b_k) = a_{n+1} b_{n+1} - a_m b_m . \]

Proof. Combining the two terms on the left, we obtain
\[ \sum_{k=m}^{n} \bigl[ b_{k+1} a_{k+1} - b_{k+1} a_k + a_k b_{k+1} - a_k b_k \bigr] = \sum_{k=m}^{n} \bigl( b_{k+1} a_{k+1} - a_k b_k \bigr). \]
This is a telescoping sum, and it simplifies to $a_{n+1} b_{n+1} - a_m b_m$ after all the
cancellations. □
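As a quick sanity check, the summation by parts formula can be tested numerically. The sketch below is only an illustration: the function name `summation_by_parts_sides` and the sample sequences $a_k = 1/k$, $b_k = k^2 + 1$ are our own choices, not part of the text, and exact rational arithmetic is used so that the two sides can be compared with `==`.

```python
from fractions import Fraction


def summation_by_parts_sides(a, b, m, n):
    """Return (left, right) sides of the summation by parts formula.

    a and b are functions k -> a_k and k -> b_k; m <= n are the limits.
    """
    left = sum(b(k + 1) * (a(k + 1) - a(k)) for k in range(m, n + 1))
    left += sum(a(k) * (b(k + 1) - b(k)) for k in range(m, n + 1))
    right = a(n + 1) * b(n + 1) - a(m) * b(m)
    return left, right


# Illustrative sequences (an assumption of this sketch): a_k = 1/k, b_k = k^2 + 1.
left, right = summation_by_parts_sides(
    lambda k: Fraction(1, k), lambda k: Fraction(k * k + 1), m=3, n=20
)
print(left == right)
```

Because the proof is a pure telescoping argument, any other pair of sequences and any limits $m \le n$ should pass the same check.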
As a corollary, we get Abel’s lemma named after Niels Abel1 (1802–1829).
Corollary 6.2 (Abel’s lemma). Let $\{a_n\}$ and $\{b_n\}$ be any complex sequences
and let $s_n$ denote the $n$-th partial sum of the series corresponding to the
sequence $\{a_n\}$. Then for any $m < n$ we have
\[ \sum_{k=m+1}^{n} a_k b_k = s_n b_n - s_m b_m - \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k). \]

Proof. Applying the summation by parts formula to the sequences $\{s_n\}$ and
$\{b_n\}$, we obtain
\[ \sum_{k=m}^{n-1} b_{k+1}(s_{k+1} - s_k) + \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k) = s_n b_n - s_m b_m . \]
Since $a_{k+1} = s_{k+1} - s_k$, we conclude that
\[ \sum_{k=m}^{n-1} b_{k+1} a_{k+1} + \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k) = s_n b_n - s_m b_m . \]
Replacing $k$ with $k - 1$ in the first sum and bringing the second sum to the right,
we get our result. □
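Abel’s lemma can be checked the same way. In this hedged sketch the sequences are stored as lists with an unused-looking entry `a[0] = 0`, so that `s[k]` is exactly $a_1 + \cdots + a_k$; the helper name `abel_lemma_sides` and the sample sequences are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import accumulate


def abel_lemma_sides(a, b, m, n):
    """Return (left, right) sides of Abel's lemma for 0 <= m < n.

    a and b are lists indexed from 0, with a[0] = 0 so that
    s[k] = a[1] + ... + a[k] is the k-th partial sum.
    """
    s = list(accumulate(a))
    left = sum(a[k] * b[k] for k in range(m + 1, n + 1))
    right = s[n] * b[n] - s[m] * b[m] - sum(
        s[k] * (b[k + 1] - b[k]) for k in range(m, n)
    )
    return left, right


N = 12
# Illustrative data (our choice): a_k = (-1)^k k and b_k = 1/(k + 1).
a = [Fraction(0)] + [Fraction((-1) ** k * k) for k in range(1, N + 1)]
b = [Fraction(1, k + 1) for k in range(N + 1)]
abel_left, abel_right = abel_lemma_sides(a, b, m=2, n=10)
print(abel_left == abel_right)
```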
Summation by parts is a very useful tool. We shall apply it to find sums of
powers of integers (cf. [254], [77]); see the exercises for more applications.
1Abel has left mathematicians enough to keep them busy for 500 years. Charles Hermite
(1822–1901), in “Calculus Gems” [210].

6.1.2. Sums of powers of integers.


Example 6.1. Let $a_k = k$ and $b_k = k$. Then each of the differences $a_{k+1} - a_k$
and $b_{k+1} - b_k$ equals 1, so by summation by parts, we have
\[ \sum_{k=1}^{n} (k+1) + \sum_{k=1}^{n} k = (n+1)(n+1) - 1 \cdot 1 . \]
This sum reduces to
\[ 2 \sum_{k=1}^{n} k = (n+1)^2 - n - 1 = n(n+1), \]
which gives the well-known result:
\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2} . \]
Example 6.2. Now let $a_k = k^2 - k = k(k-1)$ and $b_k = k - 1/2$. In this case,
$a_{k+1} - a_k = (k+1)k - k(k-1) = 2k$ and $b_{k+1} - b_k = 1$, so by the summation by
parts formula,
\[ \sum_{k=1}^{n} \Bigl( k + \frac{1}{2} \Bigr)(2k) + \sum_{k=1}^{n} (k^2 - k)(1) = (n+1)\, n \cdot \Bigl( n + \frac{1}{2} \Bigr). \]
The first sum on the left contains the sum $\sum_{k=1}^{n} k$ and the second one contains the
negative of the same sum. Cancelling, we get
\[ 3 \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{2}, \]
which gives the well-known result:
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6} . \]
Example 6.3. For our final result, let $a_k = k^2$ and $b_k = (k-1)^2$. Then
$a_{k+1} - a_k = (k+1)^2 - k^2 = 2k + 1$, $b_{k+1} = k^2$, and $b_{k+1} - b_k = k^2 - (k-1)^2 = 2k - 1$,
so by the summation by parts formula,
\[ \sum_{k=1}^{n} k^2 (2k+1) + \sum_{k=1}^{n} k^2 (2k-1) = (n+1)^2 \cdot n^2 . \]
The left-hand side simplifies to $4 \sum_{k=1}^{n} k^3$, so we get
\[ 1^3 + 2^3 + \cdots + n^3 = \frac{n^2 (n+1)^2}{4} . \]
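The three closed forms obtained in Examples 6.1 to 6.3 are easy to confirm by brute force. The following is a minimal sketch (the helper names `power_sum` and `closed_forms_hold` are ours); it uses integer arithmetic only, so there is no rounding involved.

```python
def power_sum(p, n):
    """Return 1^p + 2^p + ... + n^p by direct summation."""
    return sum(k ** p for k in range(1, n + 1))


def closed_forms_hold(n):
    """Check the closed forms for sums of first, second and third powers."""
    return (
        2 * power_sum(1, n) == n * (n + 1)
        and 6 * power_sum(2, n) == n * (n + 1) * (2 * n + 1)
        and 4 * power_sum(3, n) == n ** 2 * (n + 1) ** 2
    )


all_ok = all(closed_forms_hold(n) for n in range(1, 201))
print(all_ok)
```

Of course a finite check is not a proof; it only guards against slips in the algebra.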
6.1.3. Sequences of bounded variation and Dirichlet’s test. A sequence
$\{a_n\}$ of complex numbers is said to be of bounded variation if
\[ \sum_{n=1}^{\infty} |a_{n+1} - a_n| < \infty . \]
Typical examples of sequences of bounded variation are bounded monotone
sequences of real numbers. A nice property of general sequences of bounded variation
is that they always converge. We prove these facts in the following proposition.

Proposition 6.3. Any sequence of bounded variation converges. Moreover,
any bounded monotone sequence is of bounded variation.
Proof. Let $\{a_n\}$ be of bounded variation. Given $m < n$, we can write $a_n - a_m$
as a telescoping sum:
\[ a_n - a_m = (a_{m+1} - a_m) + (a_{m+2} - a_{m+1}) + \cdots + (a_{n-1} - a_{n-2}) + (a_n - a_{n-1}) = \sum_{k=m}^{n-1} (a_{k+1} - a_k). \]
Hence,
\[ |a_n - a_m| \le \sum_{k=m}^{n-1} |a_{k+1} - a_k| . \]
By assumption, the sum $\sum_{k=1}^{\infty} |a_{k+1} - a_k|$ converges, so the sum on the
right-hand side of this inequality can be made arbitrarily small as $m, n \to \infty$ (Cauchy’s
criterion for series). Thus, $\{a_n\}$ is Cauchy and hence converges.
Now let $\{a_n\}$ be a nondecreasing and bounded sequence. We shall prove that
this sequence is of bounded variation; the proof for a nonincreasing sequence is
similar. In this case, we have $a_n \le a_{n+1}$ for each $n$, so for each $n$,
\[ \sum_{k=1}^{n} |a_{k+1} - a_k| = \sum_{k=1}^{n} (a_{k+1} - a_k) = (a_2 - a_1) + (a_3 - a_2) + \cdots + (a_n - a_{n-1}) + (a_{n+1} - a_n) = a_{n+1} - a_1, \]
since the sum telescoped. The sequence $\{a_n\}$ is by assumption bounded, so it
follows that the partial sums of the infinite series $\sum_{n=1}^{\infty} |a_{n+1} - a_n|$ are bounded,
hence the series must converge by the nonnegative series test (Theorem 3.20). □
Here’s a useful test named after Johann Dirichlet (1805–1859).

Theorem 6.4 (Dirichlet’s test). Suppose that the partial sums of the series
$\sum a_n$ are uniformly bounded (although the series $\sum a_n$ may not converge). Then
for any sequence $\{b_n\}$ that is of bounded variation and converges to zero, the series
$\sum a_n b_n$ converges. In particular, the series $\sum a_n b_n$ converges if $\{b_n\}$ is a monotone
sequence of real numbers approaching zero.
Proof. The trick is to use Abel’s lemma to rewrite $\sum a_n b_n$ in terms of an
absolutely convergent series. Define $a_0 = 0$ (so that $s_0 = a_0 = 0$) and $b_0 = 0$. Then
setting $m = 0$ in Abel’s lemma, we can write
\[ \sum_{k=1}^{n} a_k b_k = s_n b_n - \sum_{k=1}^{n-1} s_k (b_{k+1} - b_k). \tag{6.1} \]
Now we are given two facts: The first is that the partial sums $\{s_n\}$ are bounded,
say by a constant $C$, and the second is that the sequence $\{b_n\}$ is of bounded
variation and converges to zero. Since $\{s_n\}$ is bounded and $b_n \to 0$ it follows that
$s_n b_n \to 0$. Since $|s_n| \le C$ for all $n$ and $\{b_n\}$ is of bounded variation, the sum
$\sum_{k=1}^{\infty} s_k (b_{k+1} - b_k)$ is absolutely convergent:
\[ \sum_{k=1}^{\infty} |s_k (b_{k+1} - b_k)| \le C \sum_{k=1}^{\infty} |b_{k+1} - b_k| < \infty . \]
Therefore, taking $n \to \infty$ in (6.1) it follows that the sum $\sum a_k b_k$ converges (and
equals $-\sum_{k=1}^{\infty} s_k (b_{k+1} - b_k)$), and our proof is complete. □

Example 6.4. For each $x \in (0, 2\pi)$, determine the convergence of the series
\[ \sum_{n=1}^{\infty} \frac{e^{inx}}{n} . \]
To do so, we let $a_n = e^{inx}$ and $b_n = 1/n$. Since $\{1/n\}$ is a monotone sequence
converging to zero, by Dirichlet’s test, if we can prove that the partial sums of
$\sum e^{inx}$ are bounded, then $\sum_{n=1}^{\infty} \frac{e^{inx}}{n}$ converges. To establish this boundedness, we
observe that
\[ \sum_{n=1}^{m} e^{inx} = e^{ix}\, \frac{1 - e^{imx}}{1 - e^{ix}}, \]
where we summed $\sum_{n=1}^{m} (e^{ix})^n$ via the geometric progression (2.3). Hence,
\[ \Bigl| \sum_{n=1}^{m} e^{inx} \Bigr| \le \Bigl| \frac{1 - e^{imx}}{1 - e^{ix}} \Bigr| \le \frac{1 + |e^{imx}|}{|1 - e^{ix}|} = \frac{2}{|1 - e^{ix}|} . \]
Since $1 - e^{ix} = e^{ix/2}(e^{-ix/2} - e^{ix/2}) = -2i\, e^{ix/2} \sin(x/2)$, we see that
\[ |1 - e^{ix}| = 2 |\sin(x/2)| \quad \Longrightarrow \quad \Bigl| \sum_{n=1}^{m} e^{inx} \Bigr| \le \frac{1}{\sin(x/2)} . \]
Thus, for each $x \in (0, 2\pi)$, by Dirichlet’s test, given any sequence $\{b_n\}$ of bounded
variation that converges to zero, the sum $\sum_{n=1}^{\infty} b_n e^{inx}$ converges. In particular,
$\sum_{n=1}^{\infty} \frac{e^{inx}}{n}$ converges, and more generally, $\sum_{n=1}^{\infty} \frac{e^{inx}}{n^p}$ converges for any $p > 0$.
Taking real and imaginary parts shows that for any $x \in (0, 2\pi)$,
\[ \sum_{n=1}^{\infty} \frac{\cos nx}{n} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{\sin nx}{n} \quad \text{converge.} \]
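The uniform bound on the partial sums of $\sum e^{inx}$ in Example 6.4 can be observed numerically. The sketch below is illustrative only: the function name `geometric_partial_sum_moduli`, the sample point $x = 1$, and the small tolerance that absorbs floating-point rounding are all our own assumptions.

```python
import cmath
import math


def geometric_partial_sum_moduli(x, m_max):
    """Yield |e^{ix} + e^{2ix} + ... + e^{imx}| for m = 1, ..., m_max."""
    total = 0j
    for n in range(1, m_max + 1):
        total += cmath.exp(1j * n * x)
        yield abs(total)


x = 1.0                       # any point of (0, 2*pi) would do
bound = 1 / math.sin(x / 2)   # the bound 1/sin(x/2) from the example
largest = max(geometric_partial_sum_moduli(x, 10_000))
print(largest, bound, largest <= bound + 1e-9)
```

The partial sums never settle down (the series $\sum e^{inx}$ diverges), but their moduli stay below $1/\sin(x/2)$, which is all that Dirichlet’s test needs.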

Before going to other tests, it might be interesting to note that we can determine
the convergence of the series $\sum_{n=1}^{\infty} \frac{\cos nx}{n}$ without using the fancy technology of
Dirichlet’s test. To this end, observe that from the addition formulas for $\sin(n \pm 1/2)x$,
we have
\[ \cos nx = \frac{\sin(n + 1/2)x - \sin(n - 1/2)x}{2 \sin(x/2)}, \]
which implies that, after gathering like terms,
\[
\begin{aligned}
\sum_{n=1}^{m} \frac{\cos nx}{n}
&= \frac{1}{2 \sin(x/2)} \sum_{n=1}^{m} \frac{\sin(n + 1/2)x - \sin(n - 1/2)x}{n} \\
&= \frac{1}{2 \sin(x/2)} \Bigl[ \frac{\sin(3x/2) - \sin(x/2)}{1} + \frac{\sin(5x/2) - \sin(3x/2)}{2} + \cdots + \frac{\sin(m + 1/2)x - \sin(m - 1/2)x}{m} \Bigr] \\
&= \frac{1}{2 \sin(x/2)} \Bigl[ -\sin(x/2) + \frac{\sin(m + 1/2)x}{m} + \sum_{n=1}^{m-1} \sin(n + 1/2)x \Bigl( \frac{1}{n} - \frac{1}{n+1} \Bigr) \Bigr] \\
&= \frac{1}{2 \sin(x/2)} \Bigl[ -\sin(x/2) + \frac{\sin(m + 1/2)x}{m} + \sum_{n=1}^{m-1} \frac{\sin(n + 1/2)x}{n(n+1)} \Bigr].
\end{aligned}
\]
Therefore,
\[ \frac{1}{2} + \sum_{n=1}^{m} \frac{\cos nx}{n} = \frac{\sin(m + 1/2)x}{2m \sin(x/2)} + \sum_{n=1}^{m-1} \frac{\sin(n + 1/2)x}{2 \sin(x/2)} \cdot \frac{1}{n(n+1)} . \tag{6.2} \]
Since the sine is always bounded by 1 and $\sum 1/n(n+1)$ converges, it follows that
as $m \to \infty$, the first term on the right of (6.2) tends to zero while the summation
on the right of (6.2) converges; in particular, the series in question converges, and
we get the following pretty formula:
\[ \frac{1}{2} + \sum_{n=1}^{\infty} \frac{\cos nx}{n} = \frac{1}{2 \sin(x/2)} \sum_{n=1}^{\infty} \frac{\sin(n + 1/2)x}{n(n+1)}, \qquad x \in (0, 2\pi). \]
In Example 6.39 of Section 6.9, we’ll show that $\sum_{n=1}^{\infty} \frac{\cos nx}{n} = -\log(2 \sin(x/2))$.
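Identity (6.2) holds exactly for every finite $m$, so it lends itself to a direct floating-point check. In the sketch below the function name `identity_6_2_sides` and the sample values $x = 2$, $m = 50$ are assumptions chosen for illustration; the two sides should agree up to rounding error.

```python
import math


def identity_6_2_sides(x, m):
    """Return (left, right) sides of (6.2) for x in (0, 2*pi) and m >= 1."""
    half = math.sin(x / 2)
    left = 0.5 + math.fsum(math.cos(n * x) / n for n in range(1, m + 1))
    right = math.sin((m + 0.5) * x) / (2 * m * half) + math.fsum(
        math.sin((n + 0.5) * x) / (2 * half * n * (n + 1)) for n in range(1, m)
    )
    return left, right


lhs, rhs = identity_6_2_sides(x=2.0, m=50)
print(abs(lhs - rhs))
```

The practical point of (6.2) is visible here: the terms on the right decay like $1/n^2$, so the right-hand sum converges much faster than the original cosine series.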

6.1.4. Alternating series tests, log 2, and the irrationality of e. As a
direct consequence of Dirichlet’s test, we immediately get the alternating series test.

Theorem 6.5 (Alternating series test). If $\{a_n\}$ is a sequence of bounded
variation that converges to zero, then the sum $\sum (-1)^{n-1} a_n$ converges. In particular,
if $\{a_n\}$ is a monotone sequence of real numbers approaching zero, then the sum
$\sum (-1)^{n-1} a_n$ converges.

Proof. Since the partial sums of $\sum (-1)^{n-1}$ are bounded and $\{a_n\}$ is of
bounded variation and converges to zero, the sum $\sum (-1)^{n-1} a_n$ converges by
Dirichlet’s test. □

Example 6.5. The alternating harmonic series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots \]

converges. Of course, we already knew this and we also know that the value of the
alternating harmonic series equals log 2 (see Section 4.6).
We now come to a very useful theorem for approximation purposes.

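[Figure 6.1: a number line marked, from left to right, with 0, s2, s4, s, s3, s1; braces indicate the jumps a1, a2, a3, a4 between consecutive partial sums.]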

Figure 6.1. The partial sums {sn } jump forward and backward
by the amounts given by the an ’s. This picture also shows that
|s − s1 | ≤ a2 , |s − s2 | ≤ a3 , |s − s3 | ≤ a4 , . . ..

Corollary 6.6 (Alternating series error estimate). If $\{a_n\}$ is a monotone
sequence of real numbers approaching zero, and if $s$ denotes the sum $\sum (-1)^{n-1} a_n$
and $s_n$ denotes the $n$-th partial sum, then
\[ |s - s_n| \le |a_{n+1}| . \]
Proof. To establish the error estimate, we assume that $a_n \ge 0$ for each $n$, in
which case we have $a_1 \ge a_2 \ge a_3 \ge a_4 \ge \cdots \ge 0$. (The case when $a_n \le 0$ is similar
or can be derived from the present case by multiplying by $-1$.) Let’s consider how
$s = \sum_{n=1}^{\infty} (-1)^{n-1} a_n$ is approximated by the $s_n$’s. Observe that $s_1 = a_1$ increases
from $s_0 = 0$ by the amount $a_1$; $s_2 = a_1 - a_2 = s_1 - a_2$ decreases from $s_1$ by the
amount $a_2$; $s_3 = a_1 - a_2 + a_3 = s_2 + a_3$ increases from $s_2$ by the amount $a_3$, and so
on; see Figure 6.1 for a picture of what’s going on here. Studying this figure also
shows why $|s - s_n| \le a_{n+1}$ holds. For this reason, we shall leave the exact proof
details to the diligent and interested reader! □

Example 6.6. Suppose that we wanted to find log 2 to two decimal places (in
base 10); that is, we want to find $b_0, b_1, b_2$ in the decimal expansion $\log 2 = b_0.b_1 b_2$
where by the usual convention, $b_2$ is “rounded up” if $b_3 \ge 5$. We can determine these
decimals by finding $n$ such that $s_n$, the $n$-th partial sum of $\log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$,
satisfies
\[ |\log 2 - s_n| < 0.005; \]
that is,
\[ \log 2 - 0.005 < s_n < \log 2 + 0.005 . \]
Can you see why these inequalities guarantee that $s_n$ has a decimal expansion
starting with $b_0.b_1 b_2$? In any case, according to the alternating series error estimate,
we can make this inequality hold by choosing $n$ such that
\[ |a_{n+1}| = \frac{1}{n+1} < 0.005 \iff 200 < n + 1, \]
so any $n \ge 200$ works; to be on the safe side we take $n = 500$.
With about five hours of pencil and paper work (and ten coffee breaks), we find
that $s_{500} = \sum_{n=1}^{500} \frac{(-1)^{n-1}}{n} = 0.69$ to two decimal places. Thus, $\log 2 = 0.69$ to two
decimal places. A lot of work just to get two decimal places!
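The pencil-and-paper computation in Example 6.6 can be delegated to a few lines of code. This is a rough sketch under our own naming (`alternating_harmonic_partial_sum`, `results`); it compares the actual error $|\log 2 - s_n|$ with the bound $1/(n+1)$ from the alternating series error estimate for two values of $n$ whose bound is below 0.005.

```python
import math


def alternating_harmonic_partial_sum(n):
    """Return s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n."""
    return math.fsum((-1) ** (k - 1) / k for k in range(1, n + 1))


# Two values of n with 1/(n + 1) < 0.005 (chosen for illustration).
results = {n: alternating_harmonic_partial_sum(n) for n in (200, 500)}
for n, s in results.items():
    error = abs(math.log(2) - s)
    print(n, f"{s:.6f}", f"error={error:.6f}", f"bound={1 / (n + 1):.6f}")
```

In both cases the true error is roughly half of the guaranteed bound, which reflects how slowly this particular series converges.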
Example 6.7. (Irrationality of e, Proof II) Another nice application of the
alternating series error estimate (or rather its proof) is a simple proof that e is
irrational, cf. [180], [7]. Indeed, on the contrary, let us assume that $e = m/n$ where
$m, n \in \mathbb{N}$. Then we can write
\[ \frac{n}{m} = e^{-1} = \sum_{k=0}^{\infty} \frac{(-1)^k}{k!} \quad \Longrightarrow \quad \frac{n}{m} - \sum_{k=0}^{m} \frac{(-1)^k}{k!} = \sum_{k=m+1}^{\infty} \frac{(-1)^k}{k!} . \]
Multiplying both sides by $m!/(-1)^{m+1} = \pm m!$, we obtain
\[ \pm \Bigl( n\,(m-1)! - \sum_{k=0}^{m} (-1)^k \frac{m!}{k!} \Bigr) = \sum_{k=m+1}^{\infty} \frac{(-1)^{k-m-1}\, m!}{k!} = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}\, m!}{(m+k)!} . \tag{6.3} \]
For $0 \le k \le m$, $m!/k!$ is an integer (this is because $m! = 1 \cdot 2 \cdots k \cdot (k+1) \cdots m$
contains a factor of $k!$), therefore the left-hand side of (6.3) is an integer, say $s \in \mathbb{Z}$,
so that $s = \sum_{k=1}^{\infty} (-1)^{k-1} a_k$ where $a_k = \frac{m!}{(m+k)!}$. Thus, as seen in Figure 6.1, we
have
\[ 0 < s < a_1 = \frac{1}{m+1} . \]
Now recall that $m \in \mathbb{N}$, so $1/(m+1) \le 1/2$. Thus, $s$ is an integer strictly between
0 and 1/2; an obvious contradiction!
6.1.5. Abel’s test for series. Now let’s modify the sum $\sum_{n=1}^{\infty} \frac{e^{inx}}{n}$, say to
the slightly more complicated version
\[ \sum_{n=1}^{\infty} \Bigl( 1 + \frac{1}{n} \Bigr)^{n} \frac{e^{inx}}{n} . \]
If we try to determine the convergence of this series using Dirichlet’s test, we’ll have
to do some work, but if we’re feeling a little lazy, we can use the following theorem,
whose proof uses an “ε/3-trick.”
Theorem 6.7 (Abel’s test for series). Suppose that $\sum a_n$ converges. Then
for any sequence $\{b_n\}$ of bounded variation, the series $\sum a_n b_n$ converges.

Proof. We shall apply Abel’s lemma to establish that the sequence of partial
sums for $\sum a_n b_n$ forms a Cauchy sequence, which implies that $\sum a_n b_n$ converges.
For $m < n$, by Abel’s lemma, we have
\[ \sum_{k=m+1}^{n} a_k b_k = s_n b_n - s_m b_m - \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k), \tag{6.4} \]
where $s_n$ is the $n$-th partial sum of the series $\sum a_n$. Adding and subtracting
$s := \sum a_n$ to $s_k$ on the far right of (6.4), we find that
\[
\begin{aligned}
\sum_{k=m}^{n-1} s_k (b_{k+1} - b_k)
&= \sum_{k=m}^{n-1} (s_k - s)(b_{k+1} - b_k) + s \sum_{k=m}^{n-1} (b_{k+1} - b_k) \\
&= \sum_{k=m}^{n-1} (s_k - s)(b_{k+1} - b_k) + s b_n - s b_m,
\end{aligned}
\]
since the sum telescoped. Substituting this into (6.4), we obtain
\[ \sum_{k=m+1}^{n} a_k b_k = (s_n - s) b_n - (s_m - s) b_m - \sum_{k=m}^{n-1} (s_k - s)(b_{k+1} - b_k). \]
Let $\varepsilon > 0$. Since $\{b_n\}$ is of bounded variation, this sequence converges by
Proposition 6.3, so in particular is bounded and therefore, since $s_n \to s$, we have
$(s_n - s) b_n \to 0$ and $(s_m - s) b_m \to 0$. Thus, we can choose $N$ such that for
$n, m > N$, we have $|(s_n - s) b_n| < \varepsilon/3$, $|(s_m - s) b_m| < \varepsilon/3$, and $|s_n - s| < \varepsilon/3$.
Thus, for $N < m < n$, we have
\[
\begin{aligned}
\Bigl| \sum_{k=m+1}^{n} a_k b_k \Bigr|
&\le |(s_n - s) b_n| + |(s_m - s) b_m| + \sum_{k=m}^{n-1} |(s_k - s)(b_{k+1} - b_k)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} \sum_{k=m}^{n-1} |b_{k+1} - b_k| .
\end{aligned}
\]
Finally, since $\sum |b_{k+1} - b_k|$ converges, by the Cauchy criterion for series, the sum
$\sum_{k=m}^{n-1} |b_{k+1} - b_k|$ can be made less than 1 for $N$ chosen larger if necessary. Thus,
for $N < m < n$, we have $\bigl| \sum_{k=m+1}^{n} a_k b_k \bigr| < \varepsilon$. This completes our proof. □
Example 6.8. Back to our discussion above, we can write
\[ \sum_{n=1}^{\infty} \Bigl( 1 + \frac{1}{n} \Bigr)^{n} \frac{e^{inx}}{n} = \sum a_n b_n, \]
where $a_n = \frac{e^{inx}}{n}$ and $b_n = (1 + \frac{1}{n})^n$. Since we already know that $\sum_{n=1}^{\infty} a_n$
converges and that $\{b_n\}$ is nondecreasing and bounded above (by e; see Section 3.3)
and therefore is of bounded variation, Abel’s test shows that the series $\sum a_n b_n$
converges.
Exercises 6.1.
1. Following Fredricks and Nelsen [77], we use summation by parts to derive neat identities
for the Fibonacci numbers. Recall that the Fibonacci sequence $\{F_n\}$ is defined as
$F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \ge 2$.
(a) Let $a_n = F_{n+1}$ and $b_n = 1$ in the summation by parts formula (see Theorem 6.1)
to derive the identity:
\[ F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1 . \]
(b) Let $a_n = b_n = F_n$ in the summation by parts formula to get
\[ F_1^2 + F_2^2 + F_3^2 + \cdots + F_n^2 = F_n F_{n+1} . \]
(c) What $a_n$’s and $b_n$’s would you choose to derive the formulas:
\[ F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n}, \qquad 1 + F_2 + F_4 + F_6 + \cdots + F_{2n} = F_{2n+1}\, ? \]
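2. Following Fort [76], we relate limits of arithmetic means to summation by parts.
(a) Let $\{a_n\}$, $\{b_n\}$ be sequences of complex numbers and assume that $b_n \to 0$ and
$\frac{1}{n} \sum_{k=1}^{n} k\, |b_{k+1} - b_k| \to 0$ as $n \to \infty$, and that for some constant $C$, we have
$\bigl| \frac{1}{n} \sum_{k=1}^{n} a_k \bigr| \le C$ for all $n$. Prove that
\[ \frac{1}{n} \sum_{k=1}^{n} a_k b_k \to 0 \quad \text{as } n \to \infty . \]
(b) Apply this result to $a_n = (-1)^{n-1} n$ and $b_n = 1/\sqrt{n}$ to prove that
\[ \frac{1}{n} \Bigl( \sqrt{1} - \sqrt{2} + \sqrt{3} - \sqrt{4} + \cdots + (-1)^{n-1} \sqrt{n} \Bigr) = \frac{1}{n} \sum_{k=1}^{n} (-1)^{k-1} \sqrt{k} \to 0 \quad \text{as } n \to \infty . \]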

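3. Determine the convergence or divergence of the following series:
(a) $\frac{1}{1} + \frac{1}{2} + \frac{1}{3} - \frac{1}{4} - \frac{1}{5} + \frac{1}{6} + \frac{1}{7} - - + + \cdots$,
(b) $\sum_{n=1}^{\infty} (-1)^n \bigl( \sqrt{n+1} - \sqrt{n} \bigr)$,
(c) $\sum_{n=2}^{\infty} \frac{\cos nx}{\log n}$,
(d) $\frac{1}{2 \cdot 1} - \frac{1}{2 \cdot 2} + \frac{1}{3 \cdot 3} - \frac{1}{3 \cdot 4} + \frac{1}{4 \cdot 5} - \frac{1}{4 \cdot 6} + - \cdots$,
(e) $\sum_{n=2}^{\infty} \frac{(-1)^{n-1}}{n} \log \frac{2n+1}{n}$,
(f) $\sum_{n=2}^{\infty} \cos nx\, \sin \frac{x}{n}$ $(x \in \mathbb{R})$,
(g) $\sum_{n=2}^{\infty} (-1)^{n-1} \frac{\log n}{n}$.

[Figure 6.2: plot of an oscillating sequence an against n; the odd-indexed terms a1, a3, a5, ... decrease toward an upper dashed line and the even-indexed terms a2, a4, a6, ... increase toward a lower dashed line.]
Figure 6.2. For the oscillating sequence {an }, the upper dashed
line represents lim sup an and the lower dashed line represents
lim inf an .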

6.2. Liminfs/sups, ratio/roots, and power series


It is a fact of life that most sequences simply do not converge. In this section
we introduce limit infimums and supremums, which always exist, either as real
numbers or as ±∞. We also study their basic properties. We need these limits to
study the ratio and root tests. You’ve probably seen these tests before in elementary
calculus, but in this section we’ll look at them in a slightly more sophisticated way.

6.2.1. Limit infimums and supremums. For an arbitrary sequence $\{a_n\}$
of real numbers we know that $\lim a_n$ may not exist; an example is the sequence seen in
Figure 6.2. However, being mathematicians we shouldn’t let this stop us, and in
this subsection we define “limits” for an arbitrary sequence. It turns out that there
are two notions of “limit” that show up often: one is the limit supremum of $\{a_n\}$,
which represents the “greatest” limiting value the $a_n$’s could possibly have, and the
second is the limit infimum of $\{a_n\}$, which represents the “least” limiting value that
the $a_n$’s could possibly have. See Figure 6.2 for a picture of these ideas.
We now make “greatest” limiting value and “least” limiting value precise. Let
$a_1, a_2, a_3, \ldots$ be any sequence of real numbers bounded from above. Let us put
\[ s_n := \sup_{k \ge n} a_k = \sup\{a_n, a_{n+1}, a_{n+2}, a_{n+3}, \ldots\} . \]
Note that
\[ s_{n+1} = \sup\{a_{n+1}, a_{n+2}, \ldots\} \le \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} = s_n . \]
Indeed, $s_n$ is an upper bound for $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ and hence an upper bound
for $\{a_{n+1}, a_{n+2}, \ldots\}$, therefore $s_{n+1}$, being the least such upper bound, must satisfy
$s_{n+1} \le s_n$. Thus, $s_1 \ge s_2 \ge \cdots \ge s_n \ge s_{n+1} \ge \cdots$ is a nonincreasing sequence.
In particular, being a monotone sequence, the limit $\lim s_n$ is defined either as a real
number or (properly divergent to) $-\infty$. We define
\[ \limsup a_n := \lim s_n = \lim_{n \to \infty} \Bigl( \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} \Bigr). \]
This limit, which again is either a real number or $-\infty$, is called the limit supremum
or lim sup of the sequence $\{a_n\}$. This name fits since $\limsup a_n$ is exactly
that, a limit of supremums. If $\{a_n\}$ is not bounded from above, then we define
\[ \limsup a_n := \infty . \]
We define an extended real number as a real number or one of the symbols $\infty = +\infty$,
$-\infty$. Then it is worth mentioning that lim sups always exist as extended real
numbers, unlike regular limits which may not exist. For the picture in Figure 6.2
notice that
\[ s_1 = \sup\{a_1, a_2, a_3, \ldots\} = a_1, \]
\[ s_2 = \sup\{a_2, a_3, a_4, \ldots\} = a_3, \]
\[ s_3 = \sup\{a_3, a_4, a_5, \ldots\} = a_3, \]
and so on. Thus, the sequence $s_1, s_2, s_3, \ldots$ picks out the odd-indexed terms of the
sequence $a_1, a_2, \ldots$. Therefore, $\limsup a_n = \lim s_n$ is the value given by the upper
dashed line in Figure 6.2. (Of course, here we are assuming that the $a_n$’s behave
just as you think they should for $n \ge 17$.) Here are some other examples.
Example 6.9. We shall compute $\limsup a_n$ where $a_n = \frac{1}{n}$. According to the
definition of lim sup, we first have to find $s_n$:
\[ s_n := \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \sup\Bigl\{ \frac{1}{n}, \frac{1}{n+1}, \frac{1}{n+2}, \frac{1}{n+3}, \ldots \Bigr\} = \frac{1}{n} . \]
Second, we take the limit of the sequence $\{s_n\}$:
\[ \limsup a_n := \lim_{n \to \infty} s_n = \lim_{n \to \infty} \frac{1}{n} = 0 . \]
Notice that $\lim a_n$ also exists and $\lim a_n = 0$, the same as the lim sup. We’ll come
back to this observation in Example 6.11 below.
Example 6.10. Consider the sequence $\{(-1)^n\}$. In this case, we know that
$\lim (-1)^n$ does not exist. To find $\limsup (-1)^n$, we first compute $s_n$:
\[ s_n = \sup\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\} = \sup\{+1, -1\} = 1, \]
where we used that the set $\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\}$ is just a set consisting
of the numbers $+1$ and $-1$. Hence,
\[ \limsup (-1)^n := \lim s_n = \lim 1 = 1 . \]
We can also define a corresponding $\liminf a_n$, which is a limit of infimums.
To do so, assume for the moment that our generic sequence $\{a_n\}$ is bounded from
below. Consider the sequence $\{\iota_n\}$ where
\[ \iota_n := \inf_{k \ge n} a_k = \inf\{a_n, a_{n+1}, a_{n+2}, a_{n+3}, \ldots\} . \]
Note that
\[ \iota_n = \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} \le \inf\{a_{n+1}, a_{n+2}, \ldots\} = \iota_{n+1}, \]
since the set $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ on the left of $\le$ contains the set $\{a_{n+1}, a_{n+2}, \ldots\}$. Thus,
$\iota_1 \le \iota_2 \le \cdots \le \iota_n \le \iota_{n+1} \le \cdots$ is a nondecreasing sequence. In particular, being
a monotone sequence, the limit $\lim \iota_n$ is defined either as a real number or (properly
divergent to) $\infty$. We define
\[ \liminf a_n := \lim \iota_n = \lim_{n \to \infty} \Bigl( \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} \Bigr). \]
This limit, which exists either as a real number or $+\infty$, is called the limit infimum or lim
inf of $\{a_n\}$. If $\{a_n\}$ is not bounded from below, then we define
\[ \liminf a_n := -\infty . \]
Again, as with lim sups, lim infs always exist as extended real numbers. For the
picture in Figure 6.2 notice that
\[ \iota_1 = \inf\{a_1, a_2, a_3, \ldots\} = a_2, \]
\[ \iota_2 = \inf\{a_2, a_3, a_4, \ldots\} = a_2, \]
\[ \iota_3 = \inf\{a_3, a_4, a_5, \ldots\} = a_4, \]
and so on. Thus, the sequence $\iota_1, \iota_2, \iota_3, \ldots$ picks out the even-indexed terms of the
sequence $a_1, a_2, \ldots$. Therefore, $\liminf a_n = \lim \iota_n$ is the value given by the lower
dashed line in Figure 6.2. Here are some more examples.
Example 6.11. We shall compute $\liminf a_n$ where $a_n = \frac{1}{n}$. According to the
definition of lim inf, we first have to find $\iota_n$:
\[ \iota_n := \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \inf\Bigl\{ \frac{1}{n}, \frac{1}{n+1}, \frac{1}{n+2}, \frac{1}{n+3}, \ldots \Bigr\} = 0 . \]
Second, we take the limit of $\iota_n$:
\[ \liminf a_n := \lim_{n \to \infty} \iota_n = \lim_{n \to \infty} 0 = 0 . \]
Notice that $\lim a_n$ also exists and $\lim a_n = 0$, the same as $\liminf a_n$, which is the
same as $\limsup a_n$ as we saw in Example 6.9. We are thus led to make the following
conjecture: If $\lim a_n$ exists, then $\limsup a_n = \liminf a_n = \lim a_n$; this conjecture is
indeed true as we’ll see in Property (2) of Theorem 6.8.
Example 6.12. If $a_n = (-1)^n$, then
\[ \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \inf\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\} = \inf\{+1, -1\} = -1 . \]
Hence,
\[ \liminf (-1)^n := \lim -1 = -1 . \]
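The definitions of $s_n$ and $\iota_n$ translate directly into a short computation. The sketch below is only a heuristic illustration: a computer can inspect just finitely many terms, so the tail suprema and infima are taken over a truncated list, which is an assumption that happens to be harmless for the sample sequence $a_n = (-1)^n (1 + 1/n)$ (our choice) because its oscillations shrink monotonically. The function name `tail_sups_and_infs` is likewise ours.

```python
def tail_sups_and_infs(terms):
    """For a finite list a_1, ..., a_K return (sups, infs), where
    sups[n-1] = max(a_n, ..., a_K) and infs[n-1] = min(a_n, ..., a_K)."""
    sups, infs = [], []
    running_max, running_min = float("-inf"), float("inf")
    for value in reversed(terms):        # sweep from the end of the list
        running_max = max(running_max, value)
        running_min = min(running_min, value)
        sups.append(running_max)
        infs.append(running_min)
    return sups[::-1], infs[::-1]


# Sample oscillating sequence a_n = (-1)^n (1 + 1/n), n = 1, ..., 2000.
terms = [(-1) ** n * (1 + 1 / n) for n in range(1, 2001)]
sups, infs = tail_sups_and_infs(terms)
print(sups[999], infs[999])   # approximations to s_1000 and iota_1000
```

The tail suprema decrease toward 1 and the tail infima increase toward $-1$, which matches $\limsup a_n = 1$ and $\liminf a_n = -1$ for this sequence even though $\lim a_n$ does not exist.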
The following theorem contains the main properties of limit infimums and
supremums that we shall need in the sequel.
Theorem 6.8 (Properties of lim inf/sup). If {an } and {bn } are sequences
of real numbers, then
(1) lim sup an = − lim inf(−an ) and lim inf an = − lim sup(−an ).
(2) lim an is defined, as a real number or ±∞, if and only if lim sup an = lim inf an ,
in which case,
lim an = lim sup an = lim inf an .
(3) If an ≤ bn for all n sufficiently large, then
lim inf an ≤ lim inf bn and lim sup an ≤ lim sup bn .
(4) The following inequality properties hold:

(a) lim sup an < a =⇒ there is an N such that n > N =⇒ an < a.


(b) lim sup an > a =⇒ there exist infinitely many n’s such that an > a.
(c) lim inf an < a =⇒ there exist infinitely many n’s such that an < a.
(d) lim inf an > a =⇒ there is an N such that n > N =⇒ an > a.
Proof. To prove (1) assume first that {an } is not bounded from above; then
{−an } is not bounded from below. Hence, lim sup an := ∞ and lim inf(−an ) :=
−∞, which implies (1) in this case. Assume now that {an } is bounded above.
Recall from Lemma 2.29 that given any nonempty subset A ⊆ R bounded above,
we have sup A = − inf(−A). Hence,
sup{an , an+1 , an+2 , . . .} = − inf{−an , −an+1 , −an+2 , −an+3 , . . .}.
Taking n → ∞ on both sides, we get lim sup an = − lim inf(−an ).
We now prove (2). Suppose first that $\lim a_n$ converges to a real number $L$.
Then given $\varepsilon > 0$, there exists an $N$ such that
\[ L - \varepsilon \le a_k \le L + \varepsilon \quad \text{for all } k > N, \]
which implies that for any $n > N$,
\[ L - \varepsilon \le \inf_{k \ge n} a_k \le \sup_{k \ge n} a_k \le L + \varepsilon . \]
Taking $n \to \infty$ implies that
\[ L - \varepsilon \le \liminf a_n \le \limsup a_n \le L + \varepsilon . \]
Since $\varepsilon > 0$ was arbitrary, it follows that $\limsup a_n = L = \liminf a_n$. Reversing
these steps, we leave you to show that if $\limsup a_n = L = \liminf a_n$, then $\{a_n\}$
converges to $L$. We now consider (2) in the case that $\lim a_n = +\infty$; the case where
the limit is $-\infty$ is proved similarly. Then given any real number $M > 0$, there
exists an $N$ such that
\[ n > N \quad \Longrightarrow \quad M \le a_n . \]
This implies that for any $n > N$,
\[ M \le \inf_{k \ge n} a_k \le \sup_{k \ge n} a_k . \]
Taking $n \to \infty$ we obtain
\[ M \le \liminf a_n \le \limsup a_n . \]
Since $M > 0$ was arbitrary, it follows that $\limsup a_n = +\infty = \liminf a_n$. Reversing
these steps, we leave you to show that if $\limsup a_n = +\infty = \liminf a_n$, then
$a_n \to +\infty$.
To prove (3) note that if {an } is not bounded from below, then lim inf an := −∞
so lim inf an ≤ lim inf bn automatically; thus, we may assume that {an } is bounded
from below. In this case, observe that an ≤ bn for all n sufficiently large implies
that, for n sufficiently large,
inf{an , an+1 , an+2 , . . .} ≤ inf{bn , bn+1 , bn+2 , bn+3 , . . .}
Taking n → ∞, and using that limits preserve inequalities, now proves (3). The
proof that lim sup an ≤ lim sup bn is similar.
Because this proof is becoming unbearably unbearable, we’ll only prove (a),
(b) of (4), leaving (c), (d) to the reader. Assume that $\limsup a_n < a$, that is,
\[ \lim_{n \to \infty} \Bigl( \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} \Bigr) < a . \]
It follows that for some $N$, we have
\[ n > N \quad \Longrightarrow \quad \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} < a, \]
that is, the least upper bound of $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ is strictly less than $a$, so we
must have $a_n < a$ for all $n > N$. Assume now that $\limsup a_n > a$. If
$\{a_n\}$ is not bounded from above then there must exist infinitely many $n$’s such
that $a_n > a$, for otherwise if there were only finitely many $n$’s such that $a_n > a$,
then $\{a_n\}$ would be bounded from above. Assume now that $\{a_n\}$ is bounded from
above. Then,
\[ \lim_{n \to \infty} \Bigl( \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} \Bigr) > a \]
implies that for some $N$, we have
\[ n > N \quad \Longrightarrow \quad \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} > a . \]
Now if there were only finitely many $n$’s such that $a_n > a$, then we can choose
$n > N$ large enough such that $a_k \le a$ for all $k \ge n$. However, this would imply that
for such $n$, $\sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} \le a$, a contradiction. Hence, there are infinitely
many $n$’s such that $a_n > a$. □
6.2.2. Ratio/root tests, and the exponential and ζ-functions, again.
In elementary calculus you should have studied the ratio test: If the limit
$L_1 := \lim \bigl| \frac{a_{n+1}}{a_n} \bigr|$ exists, then the series $\sum a_n$ converges if $L_1 < 1$ and diverges if $L_1 > 1$ (if
$L_1 = 1$, then the test is inconclusive). You also studied the root test: If the limit
$L_2 := \lim |a_n|^{1/n}$ exists, then the series $\sum a_n$ converges if $L_2 < 1$ and diverges if
$L_2 > 1$ (if $L_2 = 1$, then the test is inconclusive). Now what if the limits $\lim \bigl| \frac{a_{n+1}}{a_n} \bigr|$
or $\lim |a_n|^{1/n}$ don’t exist, are there still ratio and root tests? The answer is “yes,”
but we have to replace lim with lim inf’s and lim sup’s. Before stating these new
ratio/root tests, we first consider the following important lemma.
Lemma 6.9. If $\{a_n\}$ is a sequence of nonzero complex numbers, then
\[ \liminf \Bigl| \frac{a_{n+1}}{a_n} \Bigr| \le \liminf |a_n|^{1/n} \le \limsup |a_n|^{1/n} \le \limsup \Bigl| \frac{a_{n+1}}{a_n} \Bigr| . \]
Proof. The middle inequality is automatic (because inf’s are ≤ sup’s), so we
just need to prove the left and right inequalities. Consider the left one; the right
one is analogous and is left to the reader. If $\liminf |a_{n+1}/a_n| = -\infty$, then there
is nothing to prove, so we may assume that $\liminf |a_{n+1}/a_n| \ne -\infty$. Given any
$b < \liminf |a_{n+1}/a_n|$, we shall prove that $b < \liminf |a_n|^{1/n}$. This proves the left
side in our desired inequalities, for, if on the contrary we have $\liminf |a_n|^{1/n} <
\liminf |a_{n+1}/a_n|$, then choosing $b = \liminf |a_n|^{1/n}$, we would have
\[ \liminf |a_n|^{1/n} < \liminf |a_n|^{1/n}, \]
an obvious contradiction. So, let $b < \liminf |a_{n+1}/a_n|$. Choose $a$ such that $b < a <
\liminf |a_{n+1}/a_n|$. Then by Property 4 (d) in Theorem 6.8, for some $N$, we have
\[ n > N \quad \Longrightarrow \quad \Bigl| \frac{a_{n+1}}{a_n} \Bigr| > a . \]
Fix $m > N$ and let $n > m > N$. Then we can write
\[ |a_n| = \Bigl| \frac{a_n}{a_{n-1}} \Bigr| \cdot \Bigl| \frac{a_{n-1}}{a_{n-2}} \Bigr| \cdots \Bigl| \frac{a_{m+1}}{a_m} \Bigr| \cdot |a_m| . \]
There are $n - m$ quotients in this equality, each of which is greater than $a$, so
\[ |a_n| > a \cdot a \cdots a \cdot |a_m| = a^{n-m} \cdot |a_m|, \]
which implies that
\[ |a_n|^{1/n} > a^{1 - m/n} \cdot |a_m|^{1/n} . \tag{6.5} \]
Since
\[ \lim_{n \to \infty} a^{1 - m/n} \cdot |a_m|^{1/n} = a, \]
and limit infimums preserve inequalities, we have
\[ \liminf |a_n|^{1/n} \ge \liminf\, a^{1 - m/n} \cdot |a_m|^{1/n} = \lim_{n \to \infty} a^{1 - m/n} \cdot |a_m|^{1/n} = a, \]
where we used Property (2) of Theorem 6.8. Since $a > b$, we have $b < \liminf |a_n|^{1/n}$
and our proof is complete. □
Here’s Cauchy’s root test, a far-reaching generalization of the root test you
learned in elementary calculus.

Theorem 6.10 (Cauchy’s root test). A series $\sum a_n$ converges absolutely or
diverges according as
\[ \limsup |a_n|^{1/n} < 1 \quad \text{or} \quad \limsup |a_n|^{1/n} > 1 . \]
Proof. Suppose first that $\limsup |a_n|^{1/n} < 1$. Then we can choose $0 < a < 1$
such that $\limsup |a_n|^{1/n} < a$, which, by Property 4 (a) of Theorem 6.8, implies
that for some $N$,
\[ n > N \quad \Longrightarrow \quad |a_n|^{1/n} < a, \]
that is,
\[ n > N \quad \Longrightarrow \quad |a_n| < a^n . \]
Since $a < 1$, we know that the infinite series $\sum a^n$ converges; thus by the comparison
test, the sum $\sum |a_n|$ also converges, and hence $\sum a_n$ converges as well.
Assume now that $\limsup |a_n|^{1/n} > 1$. Then by Property 4 (b) of Theorem
6.8, there are infinitely many $n$’s such that $|a_n|^{1/n} > 1$. Thus, there are infinitely
many $n$’s such that $|a_n| > 1$. Hence by the $n$-th term test, the series $\sum a_n$ cannot
converge. □
1/n
It is important to remark that in the other case, that is, lim sup an = 1,
this test does not give information as to convergence.
Example 6.13. Consider the series $\sum 1/n$, which diverges, and observe that $\limsup |1/n|^{1/n} = \lim 1/n^{1/n} = 1$ (see Section 3.1 for the proof that $\lim n^{1/n} = 1$). However, $\sum 1/n^2$ converges, and $\limsup |1/n^2|^{1/n} = \lim (1/n^{1/n})^2 = 1$ as well, so when $\limsup |a_n|^{1/n} = 1$ it’s not possible to tell whether or not the series converges.
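As a numerical illustration of this example (a sketch only; the helper name `nth_root_of_term` is made up for the purpose), the following Python snippet evaluates $|a_n|^{1/n}$ for $a_n = 1/n$ and $a_n = 1/n^2$ at a few large $n$. Both columns drift toward 1, which is why the root test cannot separate the divergent series from the convergent one.

```python
def nth_root_of_term(term, n):
    """Return |term(n)|**(1/n), the quantity the root test examines."""
    return abs(term(n)) ** (1.0 / n)

harmonic = lambda n: 1.0 / n        # terms of sum 1/n (divergent)
p_series = lambda n: 1.0 / n**2     # terms of sum 1/n^2 (convergent)

for n in (10, 1_000, 100_000):
    # Both values approach 1 from below as n grows.
    print(n, nth_root_of_term(harmonic, n), nth_root_of_term(p_series, n))
```

A finite table like this proves nothing by itself; it only makes the common limit 1 visible.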
As with the root test, you most likely learned the ratio test in elementary calculus without proof and, accepting it on faith, used it to determine the convergence or divergence of many types of series. Here’s d’Alembert’s ratio test, a far-reaching generalization of the ratio test.²
² Allez en avant, et la foi vous viendra [push on and faith will catch up with you]. Advice to those who questioned the calculus by Jean Le Rond d’Alembert (1717–1783) [141]

Theorem 6.11 (d’Alembert’s ratio test). A series $\sum a_n$, with $a_n$ nonzero for $n$ sufficiently large, converges absolutely or diverges according as
\[ \limsup \left|\frac{a_{n+1}}{a_n}\right| < 1 \quad\text{or}\quad \liminf \left|\frac{a_{n+1}}{a_n}\right| > 1. \]
Proof. If we set $L := \limsup |a_n|^{1/n}$, then by Lemma 6.9, we have
\[ \liminf \left|\frac{a_{n+1}}{a_n}\right| \le L \le \limsup \left|\frac{a_{n+1}}{a_n}\right|. \tag{6.6} \]
Therefore, if $\limsup |a_{n+1}/a_n| < 1$, then $L < 1$ too, so $\sum a_n$ converges absolutely by the root test. On the other hand, if $\liminf |a_{n+1}/a_n| > 1$, then $L > 1$ too, so $\sum a_n$ diverges by the root test.
We remark that in the other case, that is, $\liminf |a_{n+1}/a_n| \le 1 \le \limsup |a_{n+1}/a_n|$, this test does not give information as to convergence. Indeed, the same divergent and convergent examples used for the root test, $\sum 1/n$ and $\sum 1/n^2$, have the property that $\liminf |a_{n+1}/a_n| = 1 = \limsup |a_{n+1}/a_n|$.
Note that if $\limsup |a_n|^{1/n} = 1$, that is, the root test fails (to give a decisive answer), then setting $L = 1$ in (6.6), we see that the ratio test also fails. Thus,
\[ \text{root test fails} \implies \text{ratio test fails.} \tag{6.7} \]
Therefore, if the root test fails one cannot hope to appeal to the ratio test.
Let’s now consider some examples.
Example 6.14. First, our old friend:
\[ \exp(z) := \sum_{n=0}^{\infty} \frac{z^n}{n!}, \]
which we already know converges, but for the fun of it, let’s apply the ratio test. For $z \ne 0$ (the case $z = 0$ being trivial), observe that
\[ \left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{z^{n+1}/(n+1)!}{z^n/n!}\right| = |z| \cdot \frac{n!}{(n+1)!} = \frac{|z|}{n+1}. \]
Hence,
\[ \lim \left|\frac{a_{n+1}}{a_n}\right| = 0 < 1. \]
Thus, the exponential function $\exp(z)$ converges absolutely for all $z \in \mathbb{C}$. This proof was a little easier than the one in Section 3.7, but then again, back then we didn’t have the up-to-date technology of the ratio test that we have now. Here’s an example where the test fails.
Example 6.15. Consider the Riemann zeta function
\[ \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z}, \qquad \operatorname{Re} z > 1. \]
If $z = x + iy$ is separated into its real and imaginary parts, then
\[ |a_n|^{1/n} = \left|\frac{1}{n^z}\right|^{1/n} = \left(\frac{1}{n^x}\right)^{1/n} = \left(\frac{1}{n^{1/n}}\right)^{x}. \]
Since $\lim n^{1/n} = 1$, it follows that
\[ \lim |a_n|^{1/n} = 1, \]
so the root test fails to give information, which also implies that the ratio test fails as well. Of course, using the comparison test as we did in the proof of Theorem 4.33 we already know that $\zeta(z)$ converges for all $z \in \mathbb{C}$ with $\operatorname{Re} z > 1$.
It’s easy to find examples of series for which the ratio test fails but the root
test succeeds.
Example 6.16. A general class of examples that foil the ratio test is (see Problem 4)
\[ a + b + a^2 + b^2 + a^3 + b^3 + a^4 + b^4 + \cdots, \qquad 0 < b < a < 1; \tag{6.8} \]
here, the odd terms are given by $a_{2n-1} = a^n$ and the even terms are given by $a_{2n} = b^n$. For concreteness, let us consider the series
\[ \frac12 + \frac13 + \left(\frac12\right)^2 + \left(\frac13\right)^2 + \left(\frac12\right)^3 + \left(\frac13\right)^3 + \left(\frac12\right)^4 + \left(\frac13\right)^4 + \cdots. \]
Since
\[ \left|\frac{a_{2n}}{a_{2n-1}}\right| = \frac{(1/3)^n}{(1/2)^n} = \left(\frac23\right)^n \]
and
\[ \left|\frac{a_{2n+1}}{a_{2n}}\right| = \frac{(1/2)^{n+1}}{(1/3)^n} = \left(\frac32\right)^n \cdot \frac12, \]
it follows that $\liminf |a_{n+1}/a_n| = 0 < 1 < \infty = \limsup |a_{n+1}/a_n|$, so the ratio test does not give information. On the other hand, since
\[ |a_{2n-1}|^{1/(2n-1)} = \left((1/2)^n\right)^{1/(2n-1)} = \left(\frac12\right)^{n/(2n-1)} \]
and
\[ |a_{2n}|^{1/(2n)} = \left((1/3)^{n}\right)^{1/(2n)} = \left(\frac13\right)^{1/2}, \]
we leave it as an exercise for you to show that $\limsup |a_n|^{1/n} = (1/2)^{1/2}$. Since $(1/2)^{1/2} < 1$, the series converges by the root test.
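The contrast between the two tests in this example can be checked numerically. The sketch below is illustrative only (the function name `term` and the cutoffs 60 and 200 are arbitrary choices); it builds the concrete series with $a = 1/2$, $b = 1/3$, then lists the successive ratios and the $k$-th roots of the terms.

```python
def term(k, a=0.5, b=1/3):
    """k-th term (k >= 1) of a + b + a^2 + b^2 + ...:
    odd positions hold a^n, even positions hold b^n."""
    n = (k + 1) // 2
    return a**n if k % 2 == 1 else b**n

# Ratios |a_{k+1}/a_k| swing between tiny and huge values.
ratios = [term(k + 1) / term(k) for k in range(1, 60)]
# k-th roots |a_k|^(1/k) stay below (1/2)^(1/2), about 0.7071.
roots = [term(k) ** (1.0 / k) for k in range(1, 60)]
# Partial sum of the series (two interleaved geometric series).
partial_sum = sum(term(k) for k in range(1, 200))

print(min(ratios), max(ratios))
print(max(roots))
print(partial_sum)
```

The ratios have no useful lim sup or lim inf, while the roots settle below a bound smaller than 1, matching the conclusion that only the root test decides this series.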
Thus, in contrast to (6.7),
\[ \text{ratio test fails} \;\not\Longrightarrow\; \text{root test fails.} \]
However, in the following lemma we show that if the ratio test fails because the limit $\lim |a_{n+1}/a_n|$ exists and equals 1, then the root test fails as well.

Lemma 6.12. If $|a_{n+1}/a_n| \to L$ with $L$ an extended real number, then $|a_n|^{1/n} \to L$.
Proof. By Lemma 6.9, we know that
\[ \liminf \left|\frac{a_{n+1}}{a_n}\right| \le \liminf |a_n|^{1/n} \le \limsup |a_n|^{1/n} \le \limsup \left|\frac{a_{n+1}}{a_n}\right|. \]
By Theorem 6.8, a limit exists if and only if the lim inf and the lim sup are equal, in which case both equal the limit; so the outside quantities in these inequalities equal $L$. It follows that $\liminf |a_n|^{1/n} = \limsup |a_n|^{1/n} = L$ as well, and hence $\lim |a_n|^{1/n} = L$.
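To see the lemma at work numerically, here is a small sketch using the sequence $a_n = n^n/n!$ (the same sequence suggested for Problem 5 of the exercises below). The helper names are illustrative, and `math.lgamma` is used only to work with $\log a_n$ and avoid overflow.

```python
import math

def log_a(n):
    """log of a_n = n^n / n!, computed via lgamma(n + 1) = log(n!)."""
    return n * math.log(n) - math.lgamma(n + 1)

def ratio(n):
    """a_{n+1} / a_n, which equals (1 + 1/n)^n."""
    return math.exp(log_a(n + 1) - log_a(n))

def root(n):
    """a_n^(1/n), which equals n / (n!)^(1/n)."""
    return math.exp(log_a(n) / n)

for n in (10, 1_000, 1_000_000):
    print(n, ratio(n), root(n))   # both columns approach e
```

The ratio converges noticeably faster than the root, which is one practical reason the ratio form is often the easier one to work with.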
Let’s do one last (important) example:

Example 6.17. Consider the series
\[ 1 + \sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)\,(2n+1)}. \tag{6.9} \]
Applying the ratio test, we have
\[ \frac{a_{n+1}}{a_n} = \frac{(2n+1)(2n+1)}{(2n+2)(2n+3)} = \frac{4n^2 + 4n + 1}{4n^2 + 10n + 6} = \frac{1 + \frac{1}{n} + \frac{1}{4n^2}}{1 + \frac{5}{2n} + \frac{3}{2n^2}}. \tag{6.10} \]
Therefore, $\lim |a_{n+1}/a_n| = 1$, so the ratio and root tests give no information! What can we do? We’ll see in Section 6.3 that Raabe’s test shows that (6.9) converges.
6.2.3. Power series. Our old friend
\[ \exp(z) := \sum_{n=0}^{\infty} \frac{z^n}{n!} \]
is an example of a power series, by which we mean a series of the form
\[ \sum_{n=0}^{\infty} a_n z^n, \ \text{where } z \in \mathbb{C}, \quad\text{or}\quad \sum_{n=0}^{\infty} a_n x^n, \ \text{where } x \in \mathbb{R}, \]
where $a_n \in \mathbb{C}$ for all $n$ (in particular, the $a_n$’s may be real). However, we shall focus on power series of the complex variable $z$ although essentially everything we mention works for real variables $x$.
Example 6.18. Besides the exponential function, other familiar examples of power series include the trigonometric series, $\sin z = \sum_{n=0}^{\infty} (-1)^n z^{2n+1}/(2n+1)!$ and $\cos z = \sum_{n=0}^{\infty} (-1)^n z^{2n}/(2n)!$.
The convergence of power series is quite easy to analyze. First, $\sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \cdots$ certainly converges if $z = 0$. For $|z| > 0$ we can use the root test: Observe that (see Problem 8 for the proof that we can take out $|z|$)
\[ \limsup |a_n z^n|^{1/n} = \limsup \left( |z|\, |a_n|^{1/n} \right) = |z| \limsup |a_n|^{1/n}. \]
Therefore, $\sum a_n z^n$ converges (absolutely) or diverges according as
\[ |z| \cdot \limsup |a_n|^{1/n} < 1 \quad\text{or}\quad |z| \cdot \limsup |a_n|^{1/n} > 1. \]
Therefore, if we define $0 \le R \le \infty$ by
\[ R := \frac{1}{\limsup |a_n|^{1/n}}, \tag{6.11} \]
where by convention, we put $R := +\infty$ when $\limsup |a_n|^{1/n} = 0$ and $R := 0$ when $\limsup |a_n|^{1/n} = +\infty$, then it follows that $\sum a_n z^n$ converges (absolutely) or diverges according to $|z| < R$ or $|z| > R$; when $|z| = R$, anything can happen. According to Figure 6.3, it is quite fitting to call $R$ the radius of convergence. Let us summarize our findings in the following theorem named after Cauchy (whom we’ve already met many times) and Jacques Hadamard (1865–1963).³

³ The shortest path between two truths in the real domain passes through the complex domain. Jacques Hadamard (1865–1963). Quoted in The Mathematical Intelligencer 13 (1991).

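[Figure: a disk of radius $R$ centered at the origin, labeled "converges" inside ($|z| < R$) and "diverges" outside ($|z| > R$).]
Figure 6.3. $\sum a_n z^n$ converges (absolutely) or diverges according as $|z| < R$ or $|z| > R$.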

Theorem 6.13 (Cauchy-Hadamard theorem). If $R$ is the radius of convergence of the power series $\sum a_n z^n$, then the series is absolutely convergent for $|z| < R$ and is divergent for $|z| > R$.
One final remark. Suppose that the $a_n$’s are nonzero for $n$ sufficiently large and $\lim |a_{n+1}/a_n|$ exists. Then by Lemma 6.12, we have
\[ R = \lim \left|\frac{a_n}{a_{n+1}}\right|. \tag{6.12} \]
This formula for the radius of convergence might, in some cases, be easier to work with than the formula involving $|a_n|^{1/n}$.
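The two formulas (6.11) and (6.12) can be compared numerically. The following is a rough sketch under stated assumptions: the coefficients $a_n = n \cdot 3^n$ are a made-up example (so $R = 1/3$), the helper names are hypothetical, and evaluating at one large $n$ is only a heuristic stand-in for the lim sup or limit. Coefficients are passed as $\log |a_n|$ to avoid overflow.

```python
import math

def radius_by_root(log_abs_coeff, n):
    """Approximate 1 / |a_n|^(1/n) at a single large n, as in (6.11)."""
    return math.exp(-log_abs_coeff(n) / n)

def radius_by_ratio(log_abs_coeff, n):
    """Approximate |a_n / a_{n+1}| at a single large n, as in (6.12)."""
    return math.exp(log_abs_coeff(n) - log_abs_coeff(n + 1))

# Hypothetical example coefficients a_n = n * 3^n, given as log|a_n|.
log_coeff = lambda n: math.log(n) + n * math.log(3.0)

for n in (10, 1_000, 100_000):
    print(n, radius_by_root(log_coeff, n), radius_by_ratio(log_coeff, n))
```

Both columns approach $1/3$; for sequences whose ratios oscillate, only the root formula remains meaningful, and a single-$n$ evaluation would no longer be a fair proxy for the lim sup.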
Exercises 6.2.
1. Find the lim inf/sups of the sequence $\{a_n\}$, where $a_n$ is given by
(a) $\dfrac{2 + (-1)^n}{4}$, (b) $(-1)^n \left(1 - \dfrac{1}{n}\right)^n$, (c) $2^{(-1)^n}$, (d) $2^{n(-1)^n}$, (e) $\left(1 + \dfrac{(-1)^n}{2}\right)^n$.
(f) If $\{r_n\}$ is a list of all rationals in $(0, 1)$, prove $\liminf r_n = 0$ and $\limsup r_n = 1$.
2. Investigate the following series for convergence (in (c), $z \in \mathbb{C}$):
(a) $\sum_{n=1}^{\infty} \dfrac{(n+1)(n+2)\cdots(n+n)}{n^n}$, (b) $\sum_{n=1}^{\infty} \dfrac{(n+1)^n}{n!}$, (c) $\sum_{n=1}^{\infty} \dfrac{n^z}{n!}$, (d) $\sum_{n=1}^{\infty} \dfrac{1}{2^{n+(-1)^n}}$.
3. Determine the radius of convergence for the following series:
(a) $\sum_{n=1}^{\infty} \dfrac{(n+1)^n}{n^{n+1}} z^n$, (b) $\sum_{n=1}^{\infty} \left(\dfrac{n}{n+1}\right)^n z^n$, (c) $\sum_{n=1}^{\infty} \dfrac{(2n)!}{(n!)^2} z^n$, (d) $\sum_{n=1}^{\infty} \dfrac{z^n}{n^p}$,
where in the last sum, $p \in \mathbb{R}$. If $z = x \in \mathbb{R}$, state all $x \in \mathbb{R}$ such that the series converge. For (d), your answer should depend on $p$.
4. (a) Investigate the series (6.8) for convergence using both the ratio and the root tests.
(b) Here is another class of examples:
\[ 1 + a + b^2 + a^3 + b^4 + a^5 + b^6 + \cdots, \qquad 0 < a < b < 1. \]
Show that the ratio test fails but the root test works.
5. Lemma 6.12 is very useful to determine certain limits which aren’t obvious at first glance. Using this lemma, derive the following limits:
(a) $\lim \dfrac{n}{(n!)^{1/n}} = e$, (b) $\lim \dfrac{n+1}{(n!)^{1/n}} = e$, (c) $\lim \dfrac{n}{[(n+1)(n+2)\cdots(n+n)]^{1/n}} = \dfrac{e}{4}$,
and for $a, b \in \mathbb{R}$ with $a > 0$ and $a + b > 0$,
(d) $\lim \dfrac{n}{[(a+b)(2a+b)\cdots(na+b)]^{1/n}} = \dfrac{e}{a}$.
Suggestion: For (a), let $a_n = n^n/n!$. Prove that $\lim \dfrac{a_{n+1}}{a_n} = e$ and hence $\lim a_n^{1/n} = e$ as well. As a side remark, recall that (a) is called (the “weak”) Stirling’s formula, which we introduced in (3.29) and proved in Problem 5 of Exercises 3.3.
6. In this problem we investigate the interesting power series $\sum_{n=1}^{\infty} \dfrac{n!}{n^n} z^n$, where $z \in \mathbb{C}$.
(a) Prove that this series has radius of convergence $R = e$.
(b) If $|z| = e$, then the ratio and root test both fail. However, if $|z| = e$, then prove that the infinite series diverges.
(c) Investigate the convergence/divergence of $\sum_{n=1}^{\infty} \dfrac{n^n}{n!} z^n$, where $z \in \mathbb{C}$.
7. In this problem we investigate the interesting power series
\[ F(z) := \sum_{n=0}^{\infty} F_{n+1} z^n = F_1 + F_2 z + F_3 z^2 + \cdots, \]
where $\{F_n\}$ is the Fibonacci sequence defined in Problem 9 of Exercises 2.2: $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \ge 2$. In that problem you proved that $F_n = \frac{1}{\sqrt{5}} \left[ \Phi^n - (-\Phi)^{-n} \right]$ where $\Phi = \frac{1 + \sqrt{5}}{2}$, the Golden ratio.
(i) Prove that $F(z)$ has radius of convergence equal to $\Phi^{-1}$.
(ii) Prove that for all $z$ with $|z| < \Phi^{-1}$, we have $F(z) = \dfrac{1}{1 - z - z^2}$. Suggestion: Show that $(1 - z - z^2) F(z) = 1$. By the way, given any sequence $\{a_n\}_{n=0}^{\infty}$, the power series $\sum_{n=0}^{\infty} a_n z^n$ is called the generating function of the sequence $\{a_n\}$. Thus, the generating function for $\{F_{n+1}\}$ has the closed form $1/(1 - z - z^2)$. For more on generating functions, see the free book [246]. Also, if you’re interested in a magic trick you can do with the formula $F(z) = 1/(1 - z - z^2)$, see [176].
8. Here are some lim inf/sup problems. Let {an }, {bn } be sequences of real numbers.
(a) Prove that if c > 0, then lim inf(can ) = c lim inf an and lim sup(can ) = c lim sup an .
Here, we take the “obvious” conventions: c · ±∞ = ±∞.
(b) Prove that if c < 0, then lim inf(can ) = c lim sup an and lim sup(can ) = c lim inf an .
(c) If {an }, {bn } are bounded, prove that lim inf an + lim inf bn ≤ lim inf(an + bn ).
(d) If {an }, {bn } are bounded, prove that lim sup(an + bn ) ≤ lim sup an + lim sup bn .
9. If an → L where L is a positive real number, prove that lim sup(an · bn ) = L lim sup bn
and lim inf(an · bn ) = L lim inf bn . Here are some steps if you want them:
(i) Show that you can get the lim inf statement from the lim sup statement, hence
we can focus on the lim sup statement. We shall prove that lim sup(an bn ) ≤
L lim sup bn and L lim sup bn ≤ lim sup(an bn ).
(ii) Show that the inequality lim sup(an bn ) ≤ L lim sup bn follows if the following
statement holds: If lim sup bn < b, then lim sup(an bn ) < L b.
(iii) Now prove that if lim sup bn < b, then lim sup(an bn ) < L b. Suggestion: If
lim sup bn < b, then choose a such that lim sup bn < a < b. Using Property 4
(a) of Theorem 6.8 and the definition of L = lim an > 0, prove that there is
an N such that n > N implies bn < a and an > 0. Conclude that for n > N ,
an bn < aan . Finally, take lim sups of both sides of an bn < aan .
(iv) Show that the inequality L lim sup bn ≤ lim sup(an bn ) follows if the following
statement holds: If lim sup(an bn ) < L b, then lim sup bn < b; then prove this
statement.
10. Let $\{a_n\}$ be a sequence of real numbers. We prove that there are monotone subsequences of $\{a_n\}$ that converge to $\liminf a_n$ and $\limsup a_n$. Proceed as follows:
(i) Using Theorem 3.13, show that it suffices to prove that there are subsequences converging to $\liminf a_n$ and $\limsup a_n$.
(ii) Show that it suffices to prove that there is a subsequence converging to $\liminf a_n$.
(iii) If $\liminf a_n = \pm\infty$, prove there is a subsequence converging to $\liminf a_n$.
(iv) Now assume that $a := \liminf a_n = \lim_{n\to\infty} \inf\{a_n, a_{n+1}, \ldots\} \in \mathbb{R}$. By definition of limit, show that there is an $n$ so that $a - 1 < \inf\{a_n, a_{n+1}, \ldots\} < a + 1$. Show that we can choose an $n_1$ so that $a - 1 < a_{n_1} < a + 1$. Then show there is an $n_2 > n_1$ so that $a - \frac12 < a_{n_2} < a + \frac12$. Continue this process.

6.3. A potpourri of ratio-type tests and “big O” notation


In the previous section, we left it in the air whether or not the series
\[ 1 + \sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)\,(2n+1)} \]
converges (both the ratio and root tests failed). In this section we’ll develop some new technologies that are able to detect the convergence of this series and other series for which the ratio and root tests fail to give information.
6.3.1. Kummer’s test. The fundamental enhanced version of the ratio test
is named after Ernst Kummer (1810–1893), from which we’ll derive a potpourri of
other ratio-type tests.
Theorem 6.14 (Kummer’s test). Let $\{a_n\}$ and $\{b_n\}$ be sequences of positive numbers where the sum $\sum b_n$ diverges, and define
\[ \kappa_n = \frac{1}{b_n} \frac{a_n}{a_{n+1}} - \frac{1}{b_{n+1}}. \]
Then $\sum a_n$ converges or diverges according as $\liminf \kappa_n > 0$ or $\limsup \kappa_n < 0$. In particular, if $\kappa_n$ tends to some definite limit, $\kappa$, then $\sum a_n$ converges or diverges according as $\kappa > 0$ or $\kappa < 0$.
Proof. If $\liminf \kappa_n > 0$, then by Property 4 (d) of Theorem 6.8, given any positive number $a$ less than this limit infimum, there is an $N$ such that
\[ n > N \implies \frac{1}{b_n} \frac{a_n}{a_{n+1}} - \frac{1}{b_{n+1}} > a. \]
Thus,
\[ n > N \implies \frac{1}{b_n} a_n - \frac{1}{b_{n+1}} a_{n+1} > a\, a_{n+1}. \tag{6.13} \]
Let $m > N$ and let $n > m > N$. Then (6.13) implies that
\[ \sum_{k=m}^{n} a\, a_{k+1} < \sum_{k=m}^{n} \left( \frac{1}{b_k} a_k - \frac{1}{b_{k+1}} a_{k+1} \right) = \frac{1}{b_m} a_m - \frac{1}{b_{n+1}} a_{n+1}, \]
since the sum telescoped. Therefore, as $\frac{1}{b_{n+1}} a_{n+1} > 0$, we have $\sum_{k=m}^{n} a\, a_{k+1} < \frac{1}{b_m} a_m$, or more succinctly,
\[ \sum_{k=m}^{n} a_{k+1} < C, \]
where $C = \frac{1}{a} \frac{1}{b_m} a_m$ is a constant independent of $n$. Since $n > m$ is completely arbitrary it follows that the partial sums of $\sum a_n$ always remain bounded by a fixed constant, so the sum must converge.
Assume now that $\limsup \kappa_n < 0$. Then by Property 4 (a) of Theorem 6.8, there is an $N$ such that for all $n > N$, $\kappa_n < 0$, that is,
\[ n > N \implies \frac{1}{b_n} \frac{a_n}{a_{n+1}} - \frac{1}{b_{n+1}} < 0, \quad\text{that is,}\quad \frac{a_n}{b_n} < \frac{a_{n+1}}{b_{n+1}}. \]
Thus, for $n > N$, $a_n/b_n$ is increasing with $n$. In particular, fixing $m > N$, for all $n > m$, we have $C < a_n/b_n$, where $C = a_m/b_m$ is a constant independent of $n$. Thus, for all $n > m$, we have $C b_n < a_n$ and since the sum $\sum b_n$ diverges, the comparison test implies that $\sum a_n$ diverges too.

Note that d’Alembert’s ratio test is just Kummer’s test with bn = 1 for each n.
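To make the role of the auxiliary sequence $\{b_n\}$ concrete, here is a small numerical sketch of the quantity $\kappa_n$ from Kummer’s test. The function name `kummer_kappa` and the test series $\sum 1/n^2$ are illustrative choices, not part of the text.

```python
def kummer_kappa(a, b, n):
    """kappa_n = (1/b_n) * (a_n / a_{n+1}) - 1/b_{n+1} for positive sequences a, b."""
    return (1.0 / b(n)) * (a(n) / a(n + 1)) - 1.0 / b(n + 1)

a = lambda n: 1.0 / n**2       # example series: sum 1/n^2
b_const = lambda n: 1.0        # b_n = 1 (the ratio-test choice)
b_harm = lambda n: 1.0 / n     # b_n = 1/n (sum b_n diverges)

for n in (10, 1_000, 100_000):
    # With b_n = 1 the values tend to 0 (no decision);
    # with b_n = 1/n they tend to 1 > 0 (convergence).
    print(n, kummer_kappa(a, b_const, n), kummer_kappa(a, b_harm, n))
```

For this series $\kappa_n = (2n+1)/n^2$ when $b_n = 1$ and $\kappa_n = (n+1)/n$ when $b_n = 1/n$, so a slower-diverging $\sum b_n$ gives a sharper test.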

6.3.2. Raabe’s test and “big O” notation. The following test, attributed
to Joseph Ludwig Raabe (1801–1859), is just Kummer’s test with the bn ’s making
up the harmonic series: bn = 1/n.
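Theorem 6.15 (Raabe’s test). A series $\sum a_n$ of positive terms converges or diverges according as
\[ \liminf\, n \left( \frac{a_n}{a_{n+1}} - 1 \right) > 1 \quad\text{or}\quad \limsup\, n \left( \frac{a_n}{a_{n+1}} - 1 \right) < 1. \]
Indeed, taking $b_n = 1/n$ in Kummer’s test gives $\kappa_n = n \frac{a_n}{a_{n+1}} - (n+1) = n \left( \frac{a_n}{a_{n+1}} - 1 \right) - 1$, so the conditions $\liminf \kappa_n > 0$ and $\limsup \kappa_n < 0$ are exactly the ones stated.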
In order to effectively apply Raabe’s test, it is useful to first introduce some very handy notation. For a nonnegative function $g$, when we write $f = O(g)$ (“big O” of $g$), we simply mean that $|f| \le C g$ for some constant $C$. In words, $O(g)$ just represents “a function that is in absolute value less than or equal to a constant times $g$”. This big O notation was introduced by Paul Bachmann (1837–1920) but became well-known through Edmund Landau (1877–1938) [239].

Example 6.19. For $x \ge 0$, we have
\[ \frac{x^2}{1+x} = O(x^2) \]
because $x^2/(1+x) \le x^2$ for $x \ge 0$. Thus, for $x \ge 0$,
\[ \frac{1}{1+x} = 1 - x + \frac{x^2}{1+x} \implies \frac{1}{1+x} = 1 - x + O(x^2). \tag{6.14} \]
In this section, we are mostly interested in using the big O notation when
dealing with natural numbers.

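Example 6.20. For $n \in \mathbb{N}$,
\[ \frac{2}{n} + \frac{1}{4n^2} = O\!\left(\frac{1}{n}\right), \tag{6.15} \]
because $\frac{2}{n} + \frac{1}{4n^2} \le \frac{2}{n} + \frac{1}{4n} = \frac{C}{n}$ where $C = 2 + 1/4 = 9/4$.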

Three important properties of the big O notation are (1) if $f = O(ag)$ with $a \ge 0$, then $f = O(g)$, and if $f_1 = O(g_1)$ and $f_2 = O(g_2)$, then (2) $f_1 f_2 = O(g_1 g_2)$ and (3) $f_1 + f_2 = O(g_1 + g_2)$. To prove these properties, observe that if $|f| \le C(ag)$, then $|f| \le C' g$, where $C' = aC$, and that $|f_1| \le C_1 g_1$ and $|f_2| \le C_2 g_2$ imply
\[ |f_1 f_2| \le C_1 C_2\, g_1 g_2 \quad\text{and}\quad |f_1 + f_2| \le (C_1 + C_2)(g_1 + g_2); \]
hence, our three properties.



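Example 6.21. Thus, in view of (6.15), we have $O\!\left( \left( \frac{2}{n} + \frac{1}{4n^2} \right)^2 \right) = O\!\left( \frac{1}{n} \cdot \frac{1}{n} \right) = O\!\left( \frac{1}{n^2} \right)$. Therefore, using (the right-hand part of) (6.14), we obtain
\[ \frac{1}{1 + \frac{2}{n} + \frac{1}{4n^2}} = 1 - \frac{2}{n} - \frac{1}{4n^2} + O\!\left( \left( \frac{2}{n} + \frac{1}{4n^2} \right)^2 \right) = 1 - \frac{2}{n} + O\!\left( \frac{1}{n^2} \right) + O\!\left( \frac{1}{n^2} \right) = 1 - \frac{2}{n} + O\!\left( \frac{1}{n^2} \right), \]
since $O(2/n^2) = O(1/n^2)$.

Here we can see the very “big” advantage of using the big O notation: it hides a lot of complicated junk information. For example, the left-hand side of the equation is exactly equal to (see the left-hand part of (6.14))
\[ \frac{1}{1 + \frac{2}{n} + \frac{1}{4n^2}} = 1 - \frac{2}{n} + \left[ -\frac{1}{4n^2} + \frac{\left( \frac{2}{n} + \frac{1}{4n^2} \right)^2}{1 + \frac{2}{n} + \frac{1}{4n^2}} \right], \]
so the big O notation allows us to summarize the complicated material on the right as the very simple $O\!\left( \frac{1}{n^2} \right)$.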
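A quick numerical sanity check of this big O statement: if the remainder really is $O(1/n^2)$, then $n^2$ times the remainder must stay bounded. The sketch below (function name and range of $n$ are arbitrary choices) computes that scaled remainder; a bounded printout is evidence for the claim, not a proof.

```python
def scaled_remainder(n):
    """n^2 * |1/(1 + 2/n + 1/(4n^2)) - (1 - 2/n)|.
    Stays bounded in n when the remainder is O(1/n^2)."""
    x = 2.0 / n + 1.0 / (4.0 * n**2)
    return n**2 * abs(1.0 / (1.0 + x) - (1.0 - 2.0 / n))

values = [scaled_remainder(n) for n in range(1, 10_001)]
print(max(values))   # largest scaled remainder over the sampled range
print(values[-1])    # value at n = 10000
```

In this range the scaled remainder never exceeds 4, so $C = 4$ would serve as the hidden constant here.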

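Example 6.22. Consider our “mystery” series
\[ 1 + \sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)\,(2n+1)} \]
already considered in (6.9). We saw that the ratio and root tests failed for this series; however, it turns out that Raabe’s test works. To see this, let $a_n$ denote the $n$-th term in the “mystery” series. Taking reciprocals in the factored form of the ratio in (6.10), we see that
\[ \frac{a_n}{a_{n+1}} = \frac{(2n+2)(2n+3)}{(2n+1)^2} = \frac{1 + \frac{5}{2n} + \frac{3}{2n^2}}{1 + \frac{1}{n} + \frac{1}{4n^2}}. \]
Applying (6.14) with $x = \frac{1}{n} + \frac{1}{4n^2} = O\!\left(\frac{1}{n}\right)$, just as in Example 6.21, gives $\frac{1}{1 + \frac{1}{n} + \frac{1}{4n^2}} = 1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right)$, so
\[ \frac{a_n}{a_{n+1}} = \left( 1 + \frac{5}{2n} + \frac{3}{2n^2} \right) \left( 1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right) \right) = \left( 1 + \frac{5}{2n} + O\!\left(\frac{1}{n^2}\right) \right) \left( 1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right) \right). \]
Multiplying out the right-hand side, using the properties of big O, we get
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{5}{2n} - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right) = 1 + \frac{3}{2n} + O\!\left(\frac{1}{n^2}\right). \]
Hence,
\[ n \left( \frac{a_n}{a_{n+1}} - 1 \right) = \frac{3}{2} + O\!\left(\frac{1}{n}\right) \implies \lim n \left( \frac{a_n}{a_{n+1}} - 1 \right) = \frac{3}{2} > 1, \]
so by Raabe’s test, the “mystery” sum converges.⁴

⁴ It turns out that the “mystery” sum equals π/2; see [136] for a proof.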
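As a numerical companion to this example, the sketch below evaluates the Raabe quantity $n(a_n/a_{n+1} - 1)$ from the exact ratio $a_n/a_{n+1} = (2n+2)(2n+3)/(2n+1)^2$ (the reciprocal of the factored ratio in (6.10)) and accumulates partial sums of the series. The function names and cutoffs are illustrative assumptions.

```python
import math

def raabe_quantity(n):
    """n * (a_n / a_{n+1} - 1), using a_n/a_{n+1} = (2n+2)(2n+3)/(2n+1)^2."""
    return n * ((2 * n + 2) * (2 * n + 3) / (2 * n + 1) ** 2 - 1)

def partial_sum(N):
    """1 + sum_{n=1}^{N} a_n, with terms updated through the ratio a_{n+1}/a_n."""
    total, term = 1.0, 0.5 / 3.0          # a_1 = 1/(2*3)
    for n in range(1, N + 1):
        total += term
        term *= (2 * n + 1) ** 2 / ((2 * n + 2) * (2 * n + 3))
    return total

for n in (10, 1_000, 100_000):
    print(n, raabe_quantity(n), partial_sum(n), math.pi / 2)
```

The partial sums creep toward $\pi/2$ only slowly (the terms decay like $n^{-3/2}$), which is consistent with the ratio and root tests being too coarse to see the convergence.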

6.3.3. De Morgan and Bertrand’s test. We next study a test due to Augustus De Morgan (1806–1871) and Joseph Bertrand (1822–1900). For this test, we let $b_n = 1/(n \log n)$ in Kummer’s test.
Theorem 6.16 (De Morgan and Bertrand’s test). Let $\{a_n\}$ be a sequence of positive numbers and define $\alpha_n$ by the equation
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + \frac{\alpha_n}{n \log n}. \]
Then $\sum a_n$ converges or diverges according as $\liminf \alpha_n > 1$ or $\limsup \alpha_n < 1$.
Proof. If we let $b_n = 1/(n \log n)$ in Kummer’s test, then
\[ \kappa_n = \frac{1}{b_n} \frac{a_n}{a_{n+1}} - \frac{1}{b_{n+1}} = n \log n \left( 1 + \frac{1}{n} + \frac{\alpha_n}{n \log n} \right) - (n+1) \log(n+1) = \alpha_n + (n+1) \left[ \log n - \log(n+1) \right]. \]
Since
\[ (n+1) \left[ \log n - \log(n+1) \right] = \log \left( 1 - \frac{1}{n+1} \right)^{n+1} \to \log e^{-1} = -1, \]
we have
\[ \liminf \kappa_n = \liminf \alpha_n - 1 \quad\text{and}\quad \limsup \kappa_n = \limsup \alpha_n - 1. \]
Invoking Kummer’s test now completes the proof.
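The only limit used in this proof, $(n+1)[\log n - \log(n+1)] \to -1$, is easy to check numerically. A minimal sketch (the function name is an illustrative choice; `math.log1p` is used because $\log n - \log(n+1) = -\log(1 + 1/n)$ and `log1p` is accurate for small arguments):

```python
import math

def correction(n):
    """(n + 1) * (log n - log(n + 1)), written as -(n + 1) * log(1 + 1/n)."""
    return -(n + 1) * math.log1p(1.0 / n)

for n in (10, 1_000, 1_000_000):
    print(n, correction(n))   # approaches -1 from below
```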
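6.3.4. Gauss’s test. Finally, to end our potpourri of tests, we conclude with Gauss’ test:
Theorem 6.17 (Gauss’ test). Let $\{a_n\}$ be a sequence of positive numbers and suppose that we can write
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{\xi}{n} + O\!\left( \frac{1}{n^p} \right), \]
where $\xi$ is a constant and $p > 1$. Then $\sum a_n$ converges or diverges according as $\xi > 1$ or $\xi \le 1$.
Proof. The hypotheses imply that
\[ n \left( \frac{a_n}{a_{n+1}} - 1 \right) = \xi + n\, O\!\left( \frac{1}{n^p} \right) = \xi + O\!\left( \frac{1}{n^{p-1}} \right) \to \xi \]
as $n \to \infty$, where we used that $p - 1 > 0$. Thus, Raabe’s test shows that the series $\sum a_n$ converges for $\xi > 1$ and diverges for $\xi < 1$. For the case $\xi = 1$, let $\frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + f_n$ where $f_n = O\!\left( \frac{1}{n^p} \right)$. Then we can write
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + f_n = 1 + \frac{1}{n} + \frac{\alpha_n}{n \log n}, \]
where $\alpha_n = f_n\, n \log n$. If we let $p = 1 + \delta$, where $\delta > 0$, then we know that $\frac{\log n}{n^{\delta}} \to 0$ as $n \to \infty$ by Problem 8 in Exercises 4.6, so
\[ \alpha_n = f_n\, n \log n = O\!\left( \frac{1}{n^{1+\delta}} \right) n \log n = O\!\left( \frac{\log n}{n^{\delta}} \right) \implies \lim \alpha_n = 0. \]
Thus, since $\lim \alpha_n = 0 < 1$, De Morgan and Bertrand’s test shows that the series $\sum a_n$ diverges.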

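Example 6.23. Gauss’ test originated with Gauss’ study of the hypergeometric series:
\[ 1 + \frac{\alpha \cdot \beta}{1 \cdot \gamma} + \frac{\alpha(\alpha+1) \cdot \beta(\beta+1)}{2! \cdot \gamma(\gamma+1)} + \frac{\alpha(\alpha+1)(\alpha+2) \cdot \beta(\beta+1)(\beta+2)}{3! \cdot \gamma(\gamma+1)(\gamma+2)} + \cdots, \]
where $\alpha, \beta, \gamma$ are positive real numbers. We can write this as $\sum a_n$ where
\[ a_n = \frac{\alpha(\alpha+1)(\alpha+2) \cdots (\alpha+n-1) \cdot \beta(\beta+1)(\beta+2) \cdots (\beta+n-1)}{n! \cdot \gamma(\gamma+1)(\gamma+2) \cdots (\gamma+n-1)}. \]
Hence, for $n \ge 1$ we have
\[ \frac{a_n}{a_{n+1}} = \frac{(n+1)(\gamma+n)}{(\alpha+n)(\beta+n)} = \frac{n^2 + (\gamma+1)n + \gamma}{n^2 + (\alpha+\beta)n + \alpha\beta} = \frac{1 + \frac{\gamma+1}{n} + \frac{\gamma}{n^2}}{1 + \frac{\alpha+\beta}{n} + \frac{\alpha\beta}{n^2}}. \]
Using the handy formula from (6.14),
\[ \frac{1}{1+x} = 1 - x + \frac{x^2}{1+x}, \]
we see that (after some algebra)
\[ \frac{a_n}{a_{n+1}} = \left( 1 + \frac{\gamma+1}{n} + \frac{\gamma}{n^2} \right) \left( 1 - \frac{\alpha+\beta}{n} - \frac{\alpha\beta}{n^2} + O\!\left( \frac{1}{n^2} \right) \right) = 1 + \frac{\gamma + 1 - \alpha - \beta}{n} + O\!\left( \frac{1}{n^2} \right). \]
Thus, by Gauss’ test with $\xi = \gamma + 1 - \alpha - \beta$ and $p = 2$, the hypergeometric series converges if $\gamma > \alpha + \beta$ and diverges if $\gamma \le \alpha + \beta$.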
Exercises 6.3.
1. Determine whether or not the following series converge.
(a) $\sum_{n=1}^{\infty} \dfrac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2^n (n+1)!}$, (b) $\sum_{n=1}^{\infty} \dfrac{3 \cdot 6 \cdot 9 \cdots (3n)}{7 \cdot 10 \cdot 13 \cdots (3n+4)}$,
(c) $\sum_{n=1}^{\infty} \dfrac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)}$, (d) $\sum_{n=1}^{\infty} \dfrac{2 \cdot 4 \cdot 6 \cdots (2n+2)}{1 \cdot 3 \cdot 5 \cdots (2n-1)(2n)}$.
For $\alpha, \beta \ne 0, -1, -2, \ldots$,
(e) $\sum_{n=1}^{\infty} \dfrac{\alpha(\alpha+1)(\alpha+2) \cdots (\alpha+n-1)}{n!}$, (f) $\sum_{n=1}^{\infty} \dfrac{\alpha(\alpha+1)(\alpha+2) \cdots (\alpha+n-1)}{\beta(\beta+1)(\beta+2) \cdots (\beta+n-1)}$.
If $\alpha, \beta, \gamma, \kappa, \lambda \ne 0, -1, -2, \ldots$, then prove that the following monster
(g) $\sum_{n=1}^{\infty} \dfrac{\alpha(\alpha+1) \cdots (\alpha+n-1)\, \beta(\beta+1) \cdots (\beta+n-1)\, \gamma(\gamma+1) \cdots (\gamma+n-1)}{n!\, \kappa(\kappa+1) \cdots (\kappa+n-1)\, \lambda(\lambda+1) \cdots (\lambda+n-1)}$
converges for $\kappa + \lambda - \alpha - \beta - \gamma > 0$.
2. Using Raabe’s test, prove that $\sum 1/n^p$ converges for $p > 1$ and diverges for $p < 1$.
3. (Logarithmic test) We prove a useful test called the logarithmic test: If $\sum a_n$ is a series of positive terms, then this series converges or diverges according as
\[ \liminf \left( n \log \frac{a_n}{a_{n+1}} \right) > 1 \quad\text{or}\quad \limsup \left( n \log \frac{a_n}{a_{n+1}} \right) < 1. \]
To prove this, proceed as follows.
(i) Suppose first that $\liminf \left( n \log \frac{a_n}{a_{n+1}} \right) > 1$. Show that there is an $a > 1$ and an $N$ such that
\[ n > N \implies a < n \log \frac{a_n}{a_{n+1}} \implies \frac{a_{n+1}}{a_n} < e^{-a/n}. \]
(ii) Using $\left( 1 + \frac{1}{n} \right)^n < e$ from (3.28), the $p$-test, and the limit comparison test (see Problem 7 in Exercises 3.6), prove that $\sum a_n$ converges.
(iii) Similarly, prove that if $\limsup \left( n \log \frac{a_n}{a_{n+1}} \right) < 1$, then $\sum a_n$ diverges.
(iv) Using the logarithmic test, determine the convergence/divergence of
\[ \sum_{n=1}^{\infty} \frac{n!}{n^n} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{n^n}{n!}. \]

6.4. Some pretty powerful properties of power series


The title of this section speaks for itself. As stated already, we focus on power
series of a complex variable z, but all the results stated in this section have corre-
sponding statements for power series of a real variable x.

6.4.1. Continuity and the exponential function (again). We first prove


that power series are always continuous (within their radius of convergence).
Lemma 6.18. If $\sum_{n=0}^{\infty} a_n z^n$ has radius of convergence $R$, then $\sum_{n=1}^{\infty} n\, a_n z^{n-1}$ also has radius of convergence $R$.
Proof. (See Problem 3 for another proof of this lemma using properties of lim sup.) For $z \ne 0$, $\sum_{n=1}^{\infty} n\, a_n z^{n-1}$ converges if and only if $z \cdot \sum_{n=1}^{\infty} n\, a_n z^{n-1} = \sum_{n=1}^{\infty} n\, a_n z^n$ converges, so we just have to show that $\sum_{n=1}^{\infty} n\, a_n z^n$ has radius of convergence $R$. Since $|a_n| \le n |a_n|$, by comparison, if $\sum_{n=1}^{\infty} n |a_n|\, |z|^n$ converges, then $\sum_{n=1}^{\infty} |a_n|\, |z|^n$ also converges, so the radius of convergence of the series $\sum_{n=1}^{\infty} n\, a_n z^n$ can’t be larger than $R$. To prove that the radius of convergence is at least $R$, fix $z$ with $|z| < R$; we need to prove that $\sum_{n=1}^{\infty} n |a_n|\, |z|^n$ converges. To this end, fix $\rho$ with $|z| < \rho < R$ and note that $\sum_{n=1}^{\infty} n\, (|z|/\rho)^n$ converges, by e.g. the root test:
\[ \lim \left| n \left( \frac{|z|}{\rho} \right)^n \right|^{1/n} = \lim n^{1/n} \cdot \frac{|z|}{\rho} = \frac{|z|}{\rho} < 1. \]
Since $\sum_{n=1}^{\infty} |a_n| \rho^n$ converges (because $\rho < R$, the radius of convergence of the series $\sum_{n=0}^{\infty} a_n z^n$), by the $n$-th term test, $|a_n| \rho^n \to 0$ as $n \to \infty$. In particular, $|a_n| \rho^n \le M$ for some constant $M$, hence
\[ n |a_n|\, |z|^n = n |a_n| \rho^n \cdot \left( \frac{|z|}{\rho} \right)^n \le M \cdot n \left( \frac{|z|}{\rho} \right)^n. \]
Since $M \sum n\, (|z|/\rho)^n$ converges, by the comparison test, it follows that $\sum n |a_n|\, |z|^n$ also converges. This completes our proof.

Theorem 6.19 (Continuity theorem for power series). A power series is continuous within its radius of convergence.
Proof. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ have radius of convergence $R$; we need to show that $f(z)$ is continuous at each point $c \in \mathbb{C}$ with $|c| < R$. So, let us fix such a $c$. Since
\[ z^n - c^n = (z - c)\, q_n(z), \quad\text{where}\quad q_n(z) = z^{n-1} + z^{n-2} c + \cdots + z\, c^{n-2} + c^{n-1}, \]
which is proved by multiplying out $(z - c)\, q_n(z)$, we can write
\[ f(z) - f(c) = \sum_{n=1}^{\infty} a_n (z^n - c^n) = (z - c) \sum_{n=1}^{\infty} a_n q_n(z). \]
To bound the sum $\sum_{n=1}^{\infty} a_n q_n(z)$ in absolute value we proceed as follows. Fix $r$ such that $|c| < r < R$. Then for $|z - c| < r - |c|$, we have
\[ |z| \le |z - c| + |c| < r - |c| + |c| = r. \]
Thus, as $|c| < r$, for $|z - c| < r - |c|$ we see that
\[ |q_n(z)| \le \underbrace{r^{n-1} + r^{n-2} r + \cdots + r\, r^{n-2} + r^{n-1}}_{n \text{ terms}} = n r^{n-1}. \]
By our lemma, $\sum_{n=1}^{\infty} n |a_n| r^{n-1}$ converges, so if $C := \sum_{n=1}^{\infty} n |a_n| r^{n-1}$, then
\[ |f(z) - f(c)| \le |z - c| \sum_{n=1}^{\infty} |a_n|\, |q_n(z)| \le |z - c| \sum_{n=1}^{\infty} |a_n|\, n r^{n-1} = C |z - c|, \]
which implies that $\lim_{z \to c} f(z) = f(c)$; that is, $f$ is continuous at $z = c$.

6.4.2. Abel’s limit theorem. Abel’s limit theorem has to do with the following question. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ have radius of convergence $R$; this implies, in particular, that $f(x)$ is defined for all $-R < x < R$ and, by Theorem 6.19, is continuous on the interval $(-R, R)$. Let us suppose that $f(R) = \sum_{n=0}^{\infty} a_n R^n$ converges. In particular, $f(x)$ is defined for all $-R < x \le R$. Question: Is $f$ continuous on the interval $(-R, R]$, that is, is it true that
\[ \lim_{x \to R^-} f(x) = f(R)? \tag{6.16} \]
The answer to this question is “yes” and it follows from the following more general theorem due to Niels Abel; however, Abel’s theorem is mostly used for the real variable case $\lim_{x \to R^-} f(x) = f(R)$ that we just described.
Theorem 6.20 (Abel’s limit theorem). Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ have radius of convergence $R$ and let $z_0 \in \mathbb{C}$ with $|z_0| = R$ where the series $f(z_0) = \sum_{n=0}^{\infty} a_n z_0^n$ converges. Then
\[ \lim_{z \to z_0} f(z) = f(z_0), \]
where the limit on the left is taken in such a way that $|z| < R$ and that the ratio $\frac{|z_0 - z|}{R - |z|}$ remains bounded by a fixed constant.

Proof. By considering the limit of the function $g(z) = f(z_0 z) - f(z_0)$ as $z \to 1$ in such a way that $|z| < 1$ and that the ratio $|1 - z|/(1 - |z|)$ remains bounded by a fixed constant, we may henceforth assume that $z_0 = 1$ and that $f(z_0) = 0$ (the diligent student will check the details of this statement). With these assumptions, if we put $s_n = a_0 + a_1 + \cdots + a_n$, then $0 = f(1) = \sum_{n=0}^{\infty} a_n = \lim s_n$. Now observe that $a_n = s_n - s_{n-1}$, so
\begin{align*}
\sum_{k=0}^{n} a_k z^k &= a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n \\
&= s_0 + (s_1 - s_0) z + (s_2 - s_1) z^2 + \cdots + (s_n - s_{n-1}) z^n \\
&= s_0 (1 - z) + s_1 (z - z^2) + \cdots + s_{n-1} (z^{n-1} - z^n) + s_n z^n \\
&= s_0 (1 - z) + s_1 (1 - z) z + \cdots + s_{n-1} (1 - z) z^{n-1} + s_n z^n \\
&= (1 - z) \left( s_0 + s_1 z + \cdots + s_{n-1} z^{n-1} \right) + s_n z^n.
\end{align*}
Thus, $\sum_{k=0}^{n} a_k z^k = (1 - z) \sum_{k=0}^{n-1} s_k z^k + s_n z^n$. Since $s_n \to 0$ and $|z| < 1$ it follows that $s_n z^n \to 0$. Therefore, taking $n \to \infty$, we obtain
\[ f(z) = \sum_{n=0}^{\infty} a_n z^n = (1 - z) \sum_{n=0}^{\infty} s_n z^n, \]
which implies that
\[ |f(z)| \le |1 - z| \sum_{n=0}^{\infty} |s_n|\, |z|^n. \]
Let us now take $z \to 1$ in such a way that $|z| < 1$ and $|1 - z|/(1 - |z|) < C$ where $C > 0$. Let $\varepsilon > 0$ be given and, since $s_n \to 0$, we can choose an integer $N$ such that $n > N \implies |s_n| < \varepsilon/(2C)$. Define $K := \sum_{n=0}^{N} |s_n|$. Then we can write
\begin{align*}
|f(z)| &\le |1 - z| \sum_{n=0}^{N} |s_n|\, |z|^n + |1 - z| \sum_{n=N+1}^{\infty} |s_n|\, |z|^n \\
&\le |1 - z| \sum_{n=0}^{N} |s_n| \cdot 1 + |1 - z| \sum_{n=N+1}^{\infty} \frac{\varepsilon}{2C} |z|^n \\
&\le K |1 - z| + \frac{\varepsilon}{2C} |1 - z| \sum_{n=0}^{\infty} |z|^n \\
&= K |1 - z| + \frac{\varepsilon}{2C} \frac{|1 - z|}{1 - |z|} < K |1 - z| + \frac{\varepsilon}{2}.
\end{align*}
Thus, with $\delta := \varepsilon/(2K)$ (any $\delta > 0$ works if $K = 0$), we have
\[ |z - 1| < \delta \ \text{ with } |z| < 1 \text{ and } \frac{|1 - z|}{1 - |z|} < C \implies |f(z)| < \varepsilon. \]
This completes our proof.
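The key algebraic step in the proof is the summation-by-parts identity $\sum_{k=0}^{n} a_k z^k = (1 - z) \sum_{k=0}^{n-1} s_k z^k + s_n z^n$. The sketch below checks it numerically for arbitrary coefficients; the function name and the sample data are made up for illustration.

```python
def abel_identity_gap(a, z):
    """Return |sum_{k<=n} a_k z^k - [(1 - z) * sum_{k<n} s_k z^k + s_n z^n]|
    for coefficients a[0..n], where s_k = a_0 + ... + a_k."""
    n = len(a) - 1
    s, running = [], 0
    for coeff in a:
        running += coeff
        s.append(running)                      # partial sums s_0, ..., s_n
    lhs = sum(a[k] * z**k for k in range(n + 1))
    rhs = (1 - z) * sum(s[k] * z**k for k in range(n)) + s[n] * z**n
    return abs(lhs - rhs)

# Arbitrary sample coefficients and a complex point with |z| < 1.
print(abel_identity_gap([1, -2, 3.5, 0.25, -1], 0.3 + 0.4j))  # rounding-level gap
```

Note that the inner sum runs only up to $k = n - 1$; the last partial sum $s_n$ appears once, in the boundary term $s_n z^n$.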
Notice that for $z = x$ with $0 < x < R$, we have
\[ \frac{|R - z|}{R - |z|} = \frac{|R - x|}{R - |x|} = \frac{R - x}{R - x} = 1, \]
which, in particular, is bounded by 1, so (6.16) holds under the assumptions stated. Once we prove this result at $x = R$, we can prove a similar result at $x = -R$: If $f(x) = \sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R$ and $f(-R) = \sum_{n=0}^{\infty} a_n (-R)^n$ converges, then
\[ \lim_{x \to -R^+} f(x) = f(-R). \]
To prove this, consider the function $g(x) = f(-x)$, then apply (6.16) to $g$.

6.4.3. The identity theorem. The identity theorem is perhaps one of the
most useful properties of power series. The identity theorem says, very roughly,
that if two power series are identical at “sufficiently many” points, then in fact, the
power series are identical everywhere!
Theorem 6.21 (Identity theorem). Let $f(z) = \sum a_n z^n$ and $g(z) = \sum b_n z^n$ have positive radii of convergence and suppose that $f(c_k) = g(c_k)$ for some nonzero sequence $c_k \to 0$. Then the power series $f(z)$ and $g(z)$ must be identical; that is, $a_n = b_n$ for every $n = 0, 1, 2, 3, \ldots$.
Proof. We begin by proving that for each $m = 0, 1, 2, \ldots$, the series
\[ f_m(z) := \sum_{n=m}^{\infty} a_n z^{n-m} = a_m + a_{m+1} z + a_{m+2} z^2 + a_{m+3} z^3 + \cdots \]
has the same radius of convergence as $f$. Indeed, since we can write
\[ f_m(z) = z^{-m} \sum_{n=m}^{\infty} a_n z^n \]
for $z \ne 0$, the power series $f_m(z)$ converges if and only if $\sum_{n=m}^{\infty} a_n z^n$ converges, which in turn converges if and only if $f(z)$ converges. It follows that $f_m(z)$ and $f(z)$ have the same radius of convergence; in particular, by the continuity theorem for power series, $f_m(z)$ is continuous at 0. Similarly, for each $m = 0, 1, 2, \ldots$, $g_m(z) := \sum_{n=m}^{\infty} b_n z^{n-m}$ has the same radius of convergence as $g(z)$; in particular, $g_m(z)$ is continuous at 0. These continuity facts concerning $f_m$ and $g_m$ are the important facts that will be used below.
Now to our proof. We are given that
\[ a_0 + a_1 c_k + a_2 c_k^2 + \cdots = b_0 + b_1 c_k + b_2 c_k^2 + \cdots, \quad\text{that is,}\quad f(c_k) = g(c_k) \tag{6.17} \]
for all $k$. In particular, taking $k \to \infty$ in the equality $f(c_k) = g(c_k)$, using that $c_k \to 0$ and that $f$ and $g$ are continuous at 0, we obtain $f(0) = g(0)$, or $a_0 = b_0$. Cancelling $a_0 = b_0$ and dividing by $c_k \ne 0$ in (6.17), we obtain
\[ a_1 + a_2 c_k + a_3 c_k^2 + \cdots = b_1 + b_2 c_k + b_3 c_k^2 + \cdots, \quad\text{that is,}\quad f_1(c_k) = g_1(c_k) \tag{6.18} \]
for all $k$. Taking $k \to \infty$ and using that $c_k \to 0$ and that $f_1$ and $g_1$ are continuous at 0, we obtain $f_1(0) = g_1(0)$, or $a_1 = b_1$. Cancelling $a_1 = b_1$ and dividing by $c_k \ne 0$ in (6.18), we obtain
\[ a_2 + a_3 c_k + a_4 c_k^2 + \cdots = b_2 + b_3 c_k + b_4 c_k^2 + \cdots, \quad\text{that is,}\quad f_2(c_k) = g_2(c_k) \tag{6.19} \]
for all $k$. Taking $k \to \infty$, using that $c_k \to 0$ and that $f_2$ and $g_2$ are continuous at 0, we obtain $f_2(0) = g_2(0)$, or $a_2 = b_2$. Continuing by induction we get $a_n = b_n$ for all $n = 0, 1, 2, \ldots$, which is exactly what we wanted to prove.
Corollary 6.22. If $f(z) = \sum a_n z^n$ and $g(z) = \sum b_n z^n$ have positive radii of convergence and $f(x) = g(x)$ for all $x \in \mathbb{R}$ with $|x| < \varepsilon$ for some $\varepsilon > 0$, then $a_n = b_n$ for every $n$; in other words, $f$ and $g$ are actually the same power series.
Proof. To prove this, observe that since $f(x) = g(x)$ for all $x \in \mathbb{R}$ such that $|x| < \varepsilon$, then $f(c_k) = g(c_k)$ for all $k$ sufficiently large where $c_k = 1/k$; the identity theorem now implies $a_n = b_n$ for every $n$.
Using the identity theorem we can deduce certain properties of series.

Example 6.24. Suppose that $f(z) = \sum a_n z^n$ is an odd function in the sense that $f(-z) = -f(z)$ for all $z$ within its radius of convergence. In terms of power series, the identity $f(-z) = -f(z)$ is
\[ \sum a_n (-1)^n z^n = \sum -a_n z^n. \]
By the identity theorem, we must have $(-1)^n a_n = -a_n$ for each $n$. Thus, for $n$ even we must have $a_n = -a_n$ or $a_n = 0$, and for $n$ odd, we must have $-a_n = -a_n$, a tautology. In conclusion, we see that $f$ is odd if and only if all coefficients of even powers vanish:
\[ f(z) = \sum_{n=0}^{\infty} a_{2n+1} z^{2n+1}; \]
that is, $f$ is odd if and only if $f$ has only odd powers in its series expansion.
Exercises 6.4.
1. Prove that $f(z) = \sum a_n z^n$ is an even function, in the sense that $f(-z) = f(z)$ for all $z$ within its radius of convergence, if and only if $f$ has only even powers in its expansion, that is, $f$ takes the form $f(z) = \sum_{n=0}^{\infty} a_{2n} z^{2n}$.
2. Recall that the binomial coefficient is $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ for $0 \le k \le n$. Prove the highly nonobvious result:
\[ \binom{m+n}{k} = \sum_{j=0}^{k} \binom{m}{j} \binom{n}{k-j}. \]
Suggestion: Apply the binomial formula to $(1+z)^{m+n}$, which equals $(1+z)^m \cdot (1+z)^n$. Prove that
\[ \binom{2n}{n} = \sum_{k=0}^{n} \binom{n}{k}^2. \]
3. Prove that $\sum_{n=1}^{\infty} n |a_n| r^n$ converges, where the notation is as in the proof of Theorem 6.19, using the root test. You will need Problem 9 in Exercises 6.2.
4. (Abel summability) We say that a series $\sum a_n$ is Abel summable to $L$ if the power series $f(x) := \sum a_n x^n$ is defined for all $x \in [0, 1)$ and $\lim_{x \to 1^-} f(x) = L$.
(a) Prove that if $\sum a_n$ converges to $L \in \mathbb{C}$, then $\sum a_n$ is also Abel summable to $L$.
(b) Derive the following amazing formulas (properly interpreted!):
\[ 1 - 1 + 1 - 1 + 1 - 1 + - \cdots =_a \frac{1}{2}, \]
\[ 1 - 2 + 3 - 4 + 5 - 6 + 7 - + \cdots =_a \frac{1}{4}, \]
where $=_a$ means “is Abel summable to”. You will need Problem 6 in Exercises 3.5.
5. In this problem we continue our fascinating study of Abel summability. Let $a_0, a_1, a_2, \ldots$ be a positive nonincreasing sequence tending to zero (in particular, $\sum (-1)^{n-1} a_n$ converges by the alternating series test). Define $b_n := a_0 + a_1 + \cdots + a_n$. We shall prove the neat formula
\[ b_0 - b_1 + b_2 - b_3 + b_4 - b_5 + - \cdots =_a \frac{1}{2} \sum_{n=0}^{\infty} (-1)^n a_n. \]
(i) Let $f(x) = \sum_{n=0}^{\infty} (-1)^n b_n x^n$. Prove that $f$ has radius of convergence 1. Suggestion: Use the ratio test.
(ii) Let
\[ f_n(x) = \sum_{k=0}^{n} (-1)^k b_k x^k = a_0 - (a_0 + a_1)x + (a_0 + a_1 + a_2)x^2 - \cdots + (-1)^n (a_0 + a_1 + \cdots + a_n)x^n \]
be the $n$-th partial sum of $f(x)$. Prove that
\[ f_n(x) = \frac{1}{1+x} \left( a_0 - a_1 x + a_2 x^2 - a_3 x^3 + \cdots + (-1)^n a_n x^n \right) + (-1)^n \frac{x^{n+1}}{1+x} \left( a_0 + a_1 + a_2 + \cdots + a_n \right). \]
(iii) Prove that (see footnote 5)
\[ f(x) = \frac{1}{1+x} \sum_{n=0}^{\infty} (-1)^n a_n x^n. \]
Finally, from this formula prove the desired result.
(iv) Establish the remarkable formula
\[ 1 - \left(1 + \frac{1}{2}\right) + \left(1 + \frac{1}{2} + \frac{1}{3}\right) - \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}\right) + - \cdots =_a \frac{1}{2} \log 2. \]
6. Suppose that $f(z) = \sum a_n z^n$ has radius of convergence 1, where $\sum a_n$ is a divergent series of positive real numbers. Prove that $\lim_{x \to 1^-} f(x) = +\infty$.
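The Abel sums in Problem 4(b) can be explored numerically before they are proved. The following is only an illustrative sketch: the helper name `abel_partial` and the truncation length are assumptions made here, and a finite computation suggests, but does not prove, the limits as $x \to 1^-$.

```python
def abel_partial(coeff, x, terms=50_000):
    """Approximate sum_{n>=0} coeff(n) * x**n by its first `terms` terms."""
    total, power = 0.0, 1.0
    for n in range(terms):
        total += coeff(n) * power
        power *= x          # power == x**(n+1) after this line
    return total

def grandi(n):
    """Coefficients of 1 - 1 + 1 - 1 + ..."""
    return (-1) ** n

def alternating_integers(n):
    """Coefficients of 1 - 2 + 3 - 4 + ..."""
    return (-1) ** n * (n + 1)

if __name__ == "__main__":
    for x in (0.9, 0.99, 0.999):
        print(x, abel_partial(grandi, x), abel_partial(alternating_integers, x))
```

As $x$ approaches 1 from below, the two printed columns drift toward 1/2 and 1/4 respectively.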

6.5. Double sequences, double series, and a ζ-function identity


After studying single integrals in elementary calculus, you probably took a
course where you studied “double integrals”. In a similar way, now that we have a
thorough background in “single infinite series,” we now move to the topic of “double
infinite series”. The main result of this section is Cauchy’s double series theorem
— Theorem 6.26, which we’ll use quite often in the sequel. If you did Problem 9
in Exercises 3.7 you tasted a bit of Cauchy’s theorem in its relation to Tannery’s
theorem (however, we won’t assume Tannery’s theorem for this section). The books
[144, Ch. 3] and [41, Ch. 5] have lots of material on double sequences and series.
6.5.1. Double sequences and series and Pringsheim’s theorem. We
begin by studying double sequences. Recall that a complex sequence is really just
a function s : N → C where we usually denote s(n) by sn . By analogy, we define a
double sequence of complex numbers as a function s : N × N −→ C. We usually
denote s(m, n) by smn and the corresponding double sequence by {smn }.
Example 6.25. For $m, n \in \mathbb{N}$,
\[ s_{mn} = \frac{m \cdot n}{(m+n)^2} \]
defines a double sequence $\{s_{mn}\}$.
Whenever we talk about sequences, the idea of convergence is bound to follow.
Let {smn } be a double sequence of complex numbers. We say that the double
sequence {smn } converges if there is a complex number L having the property
that given any ε > 0 there is a real number N such that
\[ m, n > N \implies |L - s_{mn}| < \varepsilon, \]
in which case we write $L = \lim s_{mn}$.

[Footnote 5, to Problem 5(iii) in Exercises 6.4: In the next section, we’ll learn how to prove this identity in a much quicker way using the technologically advanced Cauchy’s double series theorem.]


Care has to be taken when dealing with double sequences because sometimes
sequences that look convergent are actually not.
Example 6.26. The nice looking double sequence $s_{mn} = mn/(m+n)^2$ does not converge. To see this, observe that if $m = n$, then
\[ s_{mn} = \frac{n \cdot n}{(n+n)^2} = \frac{n^2}{4n^2} = \frac{1}{4}. \]
However, if $m = 2n$, then
\[ s_{mn} = \frac{2n \cdot n}{(2n+n)^2} = \frac{2n^2}{9n^2} = \frac{2}{9}. \]
Therefore it is impossible for $s_{mn}$ to approach any single number no matter how large we take $m, n$.
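The two values in this example can be checked with exact rational arithmetic. This is a small sketch (the function name `s` simply mirrors the notation $s_{mn}$ and is not from the text):

```python
from fractions import Fraction

def s(m, n):
    """The double sequence s_mn = mn / (m + n)^2, as an exact fraction."""
    return Fraction(m * n, (m + n) ** 2)

if __name__ == "__main__":
    for n in (1, 10, 1000):
        # Along m = n the value is always 1/4; along m = 2n it is always 2/9.
        print(n, s(n, n), s(2 * n, n))
```

Since the sequence takes both values for arbitrarily large indices, no single $L$ can satisfy the definition of convergence.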
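Given a double sequence $\{s_{mn}\}$ it is convenient to look at the iterated limits:
\[ \lim_{m \to \infty} \lim_{n \to \infty} s_{mn} \quad \text{and} \quad \lim_{n \to \infty} \lim_{m \to \infty} s_{mn}. \tag{6.20} \]
For $\lim_{m \to \infty} \lim_{n \to \infty} s_{mn}$ on the left, we mean to first take $n \to \infty$ and second to take $m \to \infty$, reversing the order for $\lim_{n \to \infty} \lim_{m \to \infty} s_{mn}$. In general, the iterated limits (6.20) may have no relationship!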
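Example 6.27. Consider the double sequence $s_{mn} = mn/(m + n^2)$. We have
\[ \lim_{n \to \infty} s_{mn} = \lim_{n \to \infty} \frac{mn}{m + n^2} = 0 \implies \lim_{m \to \infty} \lim_{n \to \infty} s_{mn} = \lim_{m \to \infty} 0 = 0. \]
On the other hand,
\[ \lim_{m \to \infty} s_{mn} = \lim_{m \to \infty} \frac{mn}{m + n^2} = n \implies \lim_{n \to \infty} \lim_{m \to \infty} s_{mn} = \lim_{n \to \infty} n = \infty. \]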
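A quick numerical look at the two inner limits of Example 6.27 makes the asymmetry visible. The sketch below uses floating point and a few sample indices chosen here for illustration; it is a sanity check, not a proof.

```python
def s(m, n):
    """The double sequence s_mn = mn / (m + n^2)."""
    return m * n / (m + n * n)

if __name__ == "__main__":
    m_fixed = 5
    # Fixed m, growing n: the values shrink toward 0.
    print([s(m_fixed, n) for n in (10, 10**3, 10**6)])
    n_fixed = 5
    # Fixed n, growing m: the values approach n_fixed.
    print([s(m, n_fixed) for m in (10, 10**3, 10**6, 10**9)])
```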

Here are a couple of questions:
(I) If both iterated limits (6.20) exist and are equal, say to a number $L$, is it true that the regular double limit $\lim s_{mn}$ exists and $\lim s_{mn} = L$?
(II) If $L = \lim s_{mn}$ exists, is it true that both iterated limits (6.20) exist and are equal to $L$:
\[ L = \lim_{m \to \infty} \lim_{n \to \infty} s_{mn} = \lim_{n \to \infty} \lim_{m \to \infty} s_{mn}\,? \tag{6.21} \]
It may shock you, but the answer to both of these questions is “no”.
Example 6.28. For a counterexample to Question I, consider our first example $s_{mn} = mn/(m+n)^2$. We know that $\lim s_{mn}$ does not exist, but observe that
\[ \lim_{n \to \infty} s_{mn} = \lim_{n \to \infty} \frac{mn}{(m+n)^2} = 0 \implies \lim_{m \to \infty} \lim_{n \to \infty} s_{mn} = \lim_{m \to \infty} 0 = 0, \]
and
\[ \lim_{m \to \infty} s_{mn} = \lim_{m \to \infty} \frac{mn}{(m+n)^2} = 0 \implies \lim_{n \to \infty} \lim_{m \to \infty} s_{mn} = \lim_{n \to \infty} 0 = 0, \]
so both iterated limits converge. For a counterexample to Question II, see limit (d) in Problem 1.

However, if a double sequence converges and both iterated limits exist, then they all must equal the same number. This is the content of the following theorem, named after Alfred Pringsheim (1850–1941) (cf. [41, p. 79]).
Theorem 6.23 (Pringsheim’s theorem for sequences). If $\{s_{mn}\}$ converges and for each $m$, $\lim_{n \to \infty} s_{mn}$ exists and for each $n$, $\lim_{m \to \infty} s_{mn}$ exists, then both iterated limits exist and the equality (6.21) holds.
Proof. Let $L = \lim s_{mn}$ and let $\varepsilon > 0$. Then there is an $N$ such that for all $m, n > N$, we have $|L - s_{mn}| < \varepsilon/2$. Taking $n \to \infty$, we get, for $m > N$, $|L - \lim_{n \to \infty} s_{mn}| \le \varepsilon/2$. Hence,
\[ m > N \implies \left| L - \lim_{n \to \infty} s_{mn} \right| < \varepsilon. \]
This means that $\lim_{m \to \infty} (\lim_{n \to \infty} s_{mn}) = L$. A similar argument establishes the equality with the limits of $m$ and $n$ reversed. □
Recall that if $\{a_n\}$ is a sequence of complex numbers, then we say that $\sum a_n$ converges if the sequence $\{s_n\}$ converges, where $s_n := \sum_{k=1}^{n} a_k$. By analogy, we define a double series of complex numbers as follows. Let $\{a_{mn}\}$ be a double sequence of complex numbers and let
\[ s_{mn} := \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}, \]
called the $m,n$-th partial sum of $\sum a_{mn}$. We say that the double series $\sum a_{mn}$ converges if the double sequence $\{s_{mn}\}$ of partial sums converges. If $\sum a_{mn}$ exists, we can ask whether or not
\[ \sum a_{mn} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}\,? \tag{6.22} \]
Here, with $s_{mn} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}$, the iterated series on the right are defined as
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} := \lim_{m \to \infty} \lim_{n \to \infty} s_{mn} \quad \text{and} \quad \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} := \lim_{n \to \infty} \lim_{m \to \infty} s_{mn}. \]
Thus, (6.22) is just the equality (6.21) with $L = \sum a_{mn}$. Hence, Pringsheim’s theorem for sequences immediately implies the following.
Theorem 6.24 (Pringsheim’s theorem for series). If a double series $\sum a_{mn}$ converges and for each $m$, $\sum_{n=1}^{\infty} a_{mn}$ converges and for each $n$, $\sum_{m=1}^{\infty} a_{mn}$ converges, then both iterated series converge and the equality (6.22) holds.
We can “visualize” the iterated sums in (6.22) as follows. First, we arrange the $a_{mn}$’s in an infinite array as shown in Figure 6.4. Then for fixed $m \in \mathbb{N}$, the sum $\sum_{n=1}^{\infty} a_{mn}$ is summing all the numbers in the $m$-th row shown in the left picture in Figure 6.4. For example, if $m = 1$, then $\sum_{n=1}^{\infty} a_{1n}$ is summing all the numbers in the first row shown in the left picture in Figure 6.4. The summation $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn}$ is summing over all the rows (that have already been summed). Similarly, $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn}$ is summing over all the columns. In Subsection 6.5.3 we shall study the most useful theorem on iterated sums, Cauchy’s double series theorem, which states that (6.22) always holds for absolutely convergent series. Here, a double series $\sum a_{mn}$ is said to converge absolutely if the double series of absolute values $\sum |a_{mn}|$ converges. However, before presenting Cauchy’s theorem, we first generalize summing by rows and columns to “summing by curves”.

[Figure 6.4: the infinite array $a_{11}, a_{12}, a_{13}, a_{14}, \ldots$; $a_{21}, a_{22}, \ldots$; $a_{31}, \ldots$; $a_{41}, \ldots$ drawn twice, once with each row marked and once with each column marked.]

Figure 6.4. In the first array we are “summing by rows” and in the second array we are “summing by columns”.
[Figure 6.5: the same infinite array drawn twice, once with nested $k \times k$ squares marked and once with successive diagonals marked.]

Figure 6.5. “Summing by squares” and “summing by triangles”.

6.5.2. “Summing by curves”. Before presenting the “sum by curves theorem” (Theorem 6.25 below) it might be helpful to give a couple of examples of this theorem to help in understanding what it says. Let $\sum a_{mn}$ be a double series.
Example 6.29. Let
\[ S_k = \{(m, n) \,;\, 1 \le m \le k,\ 1 \le n \le k\}, \]
which represents a $k \times k$ square of numbers; see the left-hand picture in Figure 6.5 for $1 \times 1$, $2 \times 2$, $3 \times 3$, and $4 \times 4$ examples. We denote by $\sum_{(m,n) \in S_k} a_{mn}$ the sum of those $a_{mn}$’s within the $k \times k$ square $S_k$. Explicitly,
\[ \sum_{(m,n) \in S_k} a_{mn} = \sum_{m=1}^{k} \sum_{n=1}^{k} a_{mn}. \]
It is natural to refer to the limit (provided it exists)
\[ \lim_{k \to \infty} \sum_{(m,n) \in S_k} a_{mn} = \lim_{k \to \infty} \sum_{m=1}^{k} \sum_{n=1}^{k} a_{mn} \]
as “summing by squares”, since as we already noted, $\sum_{(m,n) \in S_k} a_{mn}$ involves summing the $a_{mn}$’s within a $k \times k$ square.
Example 6.30. Now let
\[ S_k = T_1 \cup \cdots \cup T_k, \quad \text{where } T_\ell = \{(m, n) \,;\, m + n = \ell + 1\}. \]
Notice that $T_\ell = \{(m, n) \,;\, m + n = \ell + 1\} = \{(1, \ell), (2, \ell - 1), \ldots, (\ell, 1)\}$ represents the $\ell$-th diagonal in the right-hand picture in Figure 6.5; for instance, $T_3 = \{(1, 3), (2, 2), (3, 1)\}$ is the third diagonal in Figure 6.5. Then
\[ \sum_{(m,n) \in S_k} a_{mn} = \sum_{\ell=1}^{k} \sum_{(m,n) \in T_\ell} a_{mn} \]
is the sum of the $a_{mn}$’s that are within the triangle consisting of the first $k$ diagonals. It is natural to refer to the limit (provided it exists)
\[ \lim_{k \to \infty} \sum_{(m,n) \in S_k} a_{mn} = \lim_{k \to \infty} \sum_{\ell=1}^{k} \sum_{(m,n) \in T_\ell} a_{mn} \]
as “summing by triangles”. Using that $T_\ell = \{(1, \ell), (2, \ell - 1), \ldots, (\ell, 1)\}$, we can express the summation by triangles as
\[ \sum_{k=1}^{\infty} \left( a_{1,k} + a_{2,k-1} + \cdots + a_{k,1} \right). \]
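To see the two summation schemes side by side, here is a hedged numerical sketch for the particular series $a_{mn} = 1/(m^2 n^2)$, which is chosen here only for illustration because its sum is known to be $\zeta(2)^2 = (\pi^2/6)^2$. The function names are hypothetical.

```python
import math

def a(m, n):
    """Illustrative terms a_mn = 1 / (m^2 n^2)."""
    return 1.0 / (m * m * n * n)

def sum_by_squares(k):
    """Sum of a_mn over the k-by-k square S_k of Example 6.29."""
    return sum(a(m, n) for m in range(1, k + 1) for n in range(1, k + 1))

def sum_by_triangles(k):
    """Sum of a_mn over the first k diagonals T_1, ..., T_k of Example 6.30."""
    return sum(a(m, ell + 1 - m)
               for ell in range(1, k + 1)
               for m in range(1, ell + 1))

if __name__ == "__main__":
    limit = (math.pi ** 2 / 6) ** 2
    for k in (10, 50, 200):
        print(k, sum_by_squares(k), sum_by_triangles(k), limit)
```

Both columns creep up toward the same number, the squares somewhat faster because the $k$-th square contains more terms than the $k$-th triangle.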

More generally, we can “sum by curves” as long as the curves increasingly fill up
the array like the squares or triangles shown in Figure 6.5. More precisely, suppose
that S1 ⊆ S2 ⊆ S3 ⊆ · · · ⊆ N × N is a nondecreasing sequence of finite sets having
the property that for any m, n there is a k such that

(6.23) {1, 2, . . . , m} × {1, 2, . . . , n} ⊆ Sk ⊆ Sk+1 ⊆ Sk+2 ⊆ · · · .

In the following theorem we consider the sequence $\{s_k\}$ where for each $k \in \mathbb{N}$, $s_k$ is the finite sum
\[ s_k := \sum_{(m,n) \in S_k} a_{mn}, \tag{6.24} \]
obtained by summing over all $a_{mn}$ with $(m, n)$ inside $S_k$.


Theorem 6.25 (Sum by curves theorem). If a double series $\sum a_{mn}$ of complex numbers is absolutely convergent, then $\sum a_{mn}$ itself converges; moreover, the sequence $\{s_k\}$ defined in (6.24) converges, and
\[ \sum a_{mn} = \lim s_k. \]
Proof. We first show that $\{s_k\}$ is Cauchy and therefore converges, then we prove that $\sum a_{mn}$ converges and $\sum a_{mn} = \lim s_k$.
Step 1: To prove that $\{s_k\}$ is Cauchy, let $\varepsilon > 0$ be given. By assumption, $\sum |a_{mn}|$ converges, so if $L$ denotes its limit and $t_{mn}$ its $m,n$-th partial sum, we can choose $N$ such that
\[ m, n > N \implies |L - t_{mn}| < \frac{\varepsilon}{2}. \tag{6.25} \]
Fix $n > N$. Then by the property (6.23) there is an $N' \in \mathbb{N}$ such that
\[ \{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\} \subseteq S_{N'} \subseteq S_{N'+1} \subseteq S_{N'+2} \subseteq \cdots. \tag{6.26} \]
Fix $k > \ell > N'$. Then, since $S_\ell \subseteq S_k$, we have
\[ |s_k - s_\ell| = \Big| \sum_{(i,j) \in S_k} a_{ij} - \sum_{(i,j) \in S_\ell} a_{ij} \Big| = \Big| \sum_{(i,j) \in S_k \setminus S_\ell} a_{ij} \Big| \le \sum_{(i,j) \in S_k \setminus S_\ell} |a_{ij}|. \]
Since $\ell > N'$, by (6.26), $\{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\} \subseteq S_\ell$. Now choose $m > n$ such that $S_k \subseteq \{1, 2, \ldots, m\} \times \{1, 2, \ldots, m\}$. Then,
\[ \sum_{(i,j) \in S_k \setminus S_\ell} |a_{ij}| \le \sum_{i=1}^{m} \sum_{j=1}^{m} |a_{ij}| - \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}| = t_{mm} - t_{nn} = (t_{mm} - L) + (L - t_{nn}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
where we used (6.25). Hence, $|s_k - s_\ell| < \varepsilon$ and so $\{s_k\}$ is Cauchy.
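Step 2: We now show that $\sum a_{mn}$ converges with sum equal to $s := \lim s_k$. Let $\varepsilon > 0$ be given and choose $N$ such that (6.25) holds with $\varepsilon/2$ replaced with $\varepsilon/3$. Fix natural numbers $m, n > N$. By the property (6.23) and the fact that $s_k \to s$ we can choose a $k > N$ such that
\[ \{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\} \subseteq S_k \]
and $|s_k - s| < \varepsilon/3$. Observe that
\[ |s_k - s_{mn}| = \Big| \sum_{(i,j) \in S_k} a_{ij} - \sum_{(i,j) \in \{1,\ldots,m\} \times \{1,\ldots,n\}} a_{ij} \Big| = \Big| \sum_{(i,j) \in S_k \setminus (\{1,\ldots,m\} \times \{1,\ldots,n\})} a_{ij} \Big| \le \sum_{(i,j) \in S_k \setminus (\{1,\ldots,m\} \times \{1,\ldots,n\})} |a_{ij}|. \]
Now choose $m' \in \mathbb{N}$ such that $S_k \subseteq \{1, 2, \ldots, m'\} \times \{1, 2, \ldots, m'\}$; note that $m' \ge m > N$. Then,
\[ \sum_{(i,j) \in S_k \setminus (\{1,\ldots,m\} \times \{1,\ldots,n\})} |a_{ij}| \le \sum_{i=1}^{m'} \sum_{j=1}^{m'} |a_{ij}| - \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| = t_{m'm'} - t_{mn} = (t_{m'm'} - L) + (L - t_{mn}) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3}, \]
where we used the property (6.25) (with $\varepsilon/2$ replaced with $\varepsilon/3$). Finally, recalling that $|s_k - s| < \varepsilon/3$, by the triangle inequality, we have
\[ |s_{mn} - s| \le |s_{mn} - s_k| + |s_k - s| < \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]
This proves that $\sum a_{mn} = s$ and completes our proof. □

We recommend that the reader look at Problem 11 for a related result.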

6.5.3. Cauchy’s double series theorem. Instead of summing by curves, in many applications we are interested in summing by rows or by columns.

Theorem 6.26 (Cauchy’s double series theorem). For any double series $\sum a_{mn}$ of complex numbers, the following are equivalent statements:
(a) The series $\sum a_{mn}$ is absolutely convergent;
(b) $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}|$ converges;
(c) $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|$ converges.
In any of these cases,
\[ \sum a_{mn} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} \tag{6.27} \]
in the sense that both iterated sums converge and are equal to the sum of the series.
Proof. We proceed in three steps.
Step 1: Assume first that the sum $\sum a_{mn}$ converges absolutely; we shall prove that both iterated sums $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}|$, $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|$ converge. Since $\sum |a_{mn}|$ converges, setting $s := \sum |a_{mn}|$ and denoting by $s_{mn}$ the $m,n$-th partial sum, by definition of convergence we can choose $N$ such that
\[ m, n > N \implies |s - s_{mn}| < 1 \implies s_{mn} < s + 1. \tag{6.28} \]
Given $p \in \mathbb{N}$, choose $m \ge p$ such that $m > N$ and let $n > N$. Then in view of (6.28) we have
\[ \sum_{j=1}^{n} |a_{pj}| \le \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| = s_{mn} < s + 1. \tag{6.29} \]
Therefore, the partial sums of $\sum_{j=1}^{\infty} |a_{pj}|$ are bounded above by a fixed constant and hence (by the nonnegative series test, see Theorem 3.20), for any $p \in \mathbb{N}$, the sum $\sum_{j=1}^{\infty} |a_{pj}|$ exists. Similarly, for each $q \in \mathbb{N}$, the sum $\sum_{i=1}^{\infty} |a_{iq}|$ exists. Therefore, by Pringsheim’s theorem for series, both iterated series $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}|$, $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|$ converge (and equal $\sum |a_{mn}|$).
Step 2: Assuming that $\sum a_{mn}$ converges absolutely, we now establish the equality (6.27). Indeed, by the sum by curves theorem we know that $\sum a_{mn}$ converges and we showed in Step 1 that for each $p, q \in \mathbb{N}$, the sums $\sum_{n=1}^{\infty} |a_{pn}|$ and $\sum_{m=1}^{\infty} |a_{mq}|$ exist. This implies that for each $p, q \in \mathbb{N}$, $\sum_{n=1}^{\infty} a_{pn}$ and $\sum_{m=1}^{\infty} a_{mq}$ converge. Now (6.27) follows from Pringsheim’s theorem.
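Step 3: Now assume that
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}| = t < \infty. \]
We will show that $\sum a_{mn}$ is absolutely convergent; a similar proof shows that if $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}| < \infty$, then $\sum a_{mn}$ is absolutely convergent. Let $\varepsilon > 0$. Then the fact that $\sum_{i=1}^{\infty} \big( \sum_{j=1}^{\infty} |a_{ij}| \big) < \infty$ implies that there is an $N_1$ such that
\[ \sum_{i=N_1+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}| < \frac{\varepsilon}{2}. \]
Each of the finitely many rows $i = 1, \ldots, N_1$ is a convergent series, so there is an $N \ge N_1$ such that
\[ n > N \implies \sum_{i=1}^{N_1} \sum_{j=n+1}^{\infty} |a_{ij}| < \frac{\varepsilon}{2}. \]
Let $m, n > N$. For the rows $i \le N_1$ the partial sum $\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|$ misses only the tails $j > n$, and the rows $i > N_1$ contribute at most their full sums, so
\[ 0 \le t - \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}| \le \sum_{i=1}^{N_1} \sum_{j=n+1}^{\infty} |a_{ij}| + \sum_{i=N_1+1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon, \]
which proves that $\sum |a_{mn}|$ converges, and completes the proof of our result. □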
Now for some double series examples.
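Example 6.31. For our first example, consider the sum $\sum 1/(m^p n^q)$ where $p, q \in \mathbb{R}$. Since in this case,
\[ \sum_{n=1}^{\infty} \frac{1}{m^p n^q} = \frac{1}{m^p} \cdot \Big( \sum_{n=1}^{\infty} \frac{1}{n^q} \Big), \]
it follows that
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^p n^q} = \Big( \sum_{m=1}^{\infty} \frac{1}{m^p} \Big) \cdot \Big( \sum_{n=1}^{\infty} \frac{1}{n^q} \Big). \]
Therefore, by Cauchy’s double series theorem and the $p$-test, $\sum 1/(m^p n^q)$ converges if and only if both $p, q > 1$.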

Example 6.32. The previous example can help us with other examples such as $\sum 1/(m^4 + n^4)$. Observe that
\[ (m^2 - n^2)^2 \ge 0 \implies m^4 + n^4 - 2m^2 n^2 \ge 0 \implies \frac{1}{m^4 + n^4} \le \frac{1}{2m^2 n^2}. \]
Since $\sum 1/(m^2 n^2)$ converges, by an easy generalization of our good ole comparison test (Theorem 3.27) to double series, we see that $\sum 1/(m^4 + n^4)$ converges too.
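The comparison used in Example 6.32 can be watched numerically. In this sketch (helper names are illustrative assumptions), square partial sums of the smaller series stay below those of the dominating series, which in turn stay below $\frac{1}{2}\zeta(2)^2 = \frac{1}{2}(\pi^2/6)^2$.

```python
import math

def square_partial_sum(term, k):
    """Sum term(m, n) over 1 <= m, n <= k."""
    return sum(term(m, n) for m in range(1, k + 1) for n in range(1, k + 1))

def small(m, n):
    """Terms 1 / (m^4 + n^4) of the series under study."""
    return 1.0 / (m ** 4 + n ** 4)

def big(m, n):
    """Dominating terms 1 / (2 m^2 n^2)."""
    return 1.0 / (2 * m * m * n * n)

if __name__ == "__main__":
    bound = 0.5 * (math.pi ** 2 / 6) ** 2
    for k in (10, 100, 300):
        print(k, square_partial_sum(small, k), square_partial_sum(big, k), bound)
```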
Example 6.33. For an application of Cauchy’s theorem and the sum by curves theorem, we look at the double sum $\sum z^{m+n}$ for $|z| < 1$. For such $z$, this sum converges absolutely because
\[ \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} |z|^{m+n} = \sum_{m=0}^{\infty} |z|^m \cdot \frac{1}{1 - |z|} = \frac{1}{(1 - |z|)^2} < \infty, \]
where we used the geometric series test (twice): If $|r| < 1$, then $\sum_{k=0}^{\infty} r^k = \frac{1}{1-r}$. So $\sum z^{m+n}$ converges absolutely by Cauchy’s double series theorem, and
\[ \sum z^{m+n} = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} z^{m+n} = \sum_{m=0}^{\infty} z^m \cdot \frac{1}{1 - z} = \frac{1}{(1 - z)^2}. \]
On the other hand, by our sum by curves theorem, we can determine $\sum z^{m+n}$ by summing over curves; we shall choose to sum over triangles. Thus, if we set
\[ S_k = T_0 \cup T_1 \cup T_2 \cup \cdots \cup T_k, \quad \text{where } T_\ell = \{(m, n) \,;\, m + n = \ell,\ m, n \ge 0\}, \]
then
\[ \sum z^{m+n} = \lim_{k \to \infty} \sum_{(m,n) \in S_k} z^{m+n} = \lim_{k \to \infty} \sum_{\ell=0}^{k} \sum_{(m,n) \in T_\ell} z^{m+n}. \]
Since $T_\ell = \{(m, n) \,;\, m + n = \ell\} = \{(0, \ell), (1, \ell - 1), \ldots, (\ell, 0)\}$, we have
\[ \sum_{(m,n) \in T_\ell} z^{m+n} = z^{0+\ell} + z^{1+(\ell-1)} + z^{2+(\ell-2)} + \cdots + z^{\ell+0} = (\ell + 1) z^\ell. \]
Thus, $\sum z^{m+n} = \sum_{k=0}^{\infty} (k+1) z^k$. However, we already proved that $\sum z^{m+n} = 1/(1-z)^2$, so
\[ \frac{1}{(1 - z)^2} = \sum_{n=1}^{\infty} n z^{n-1}. \tag{6.30} \]
See Problem 4 for an easier proof of (6.30) using Cauchy’s double series theorem.
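Formula (6.30) is easy to sanity-check for a sample complex number. The sketch below sums the diagonal contributions $(\ell + 1) z^\ell$ directly; the function name and the sample point are assumptions chosen for illustration.

```python
def sum_by_triangles(z, k_max=200):
    """Partial sum over diagonals T_0, ..., T_{k_max} of the double series z^(m+n)."""
    total, power = 0j, 1 + 0j
    for ell in range(k_max + 1):
        total += (ell + 1) * power   # the ell-th diagonal contributes (ell + 1) z^ell
        power *= z                   # power == z^(ell + 1) after this line
    return total

if __name__ == "__main__":
    z = 0.3 + 0.4j                   # |z| = 0.5 < 1
    print(sum_by_triangles(z))
    print(1 / (1 - z) ** 2)          # closed form from summing by rows
```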
Example 6.34. Another very neat application of Cauchy’s double series theorem is to derive nonobvious identities. For example, let $|z| < 1$ and consider the series
\[ \sum_{n=1}^{\infty} \frac{z^n}{1 + z^{2n}} = \frac{z}{1 + z^2} + \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} + \cdots; \]
we’ll see why this converges in a moment. Observe that (since $|z| < 1$)
\[ \frac{1}{1 + z^{2n}} = \sum_{m=0}^{\infty} (-1)^m z^{2mn}, \]
by the familiar geometric series test with $r = -z^{2n}$: Since $|r| < 1$, we have $\sum_{k=0}^{\infty} r^k = \frac{1}{1-r}$. Therefore,
\[ \sum_{n=1}^{\infty} \frac{z^n}{1 + z^{2n}} = \sum_{n=1}^{\infty} z^n \cdot \sum_{m=0}^{\infty} (-1)^m z^{2mn} = \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} (-1)^m z^{(2m+1)n}. \]
We claim that the double sum $\sum (-1)^m z^{(2m+1)n}$ converges absolutely. To prove this, observe that
\[ \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} |z|^{(2m+1)n} = \sum_{n=1}^{\infty} |z|^n \sum_{m=0}^{\infty} |z|^{2nm} = \sum_{n=1}^{\infty} \frac{|z|^n}{1 - |z|^{2n}}. \]
Since $\frac{1}{1 - |z|^{2n}} \le \frac{1}{1 - |z|}$ (this is because $|z|^{2n} \le |z|$ for $|z| < 1$), we have
\[ \frac{|z|^n}{1 - |z|^{2n}} \le \frac{1}{1 - |z|} \cdot |z|^n. \]
Since $\sum |z|^n$ converges, by the comparison theorem, $\sum_{n=1}^{\infty} \frac{|z|^n}{1 - |z|^{2n}}$ converges too. Hence, Cauchy’s double series theorem applies, and
\[ \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} (-1)^m z^{(2m+1)n} = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} (-1)^m z^{(2m+1)n} = \sum_{m=0}^{\infty} (-1)^m \sum_{n=1}^{\infty} z^{(2m+1)n} = \sum_{m=0}^{\infty} (-1)^m \frac{z^{2m+1}}{1 - z^{2m+1}}. \]
Thus,
\[ \sum_{n=1}^{\infty} \frac{z^n}{1 + z^{2n}} = \sum_{m=0}^{\infty} (-1)^m \frac{z^{2m+1}}{1 - z^{2m+1}}; \]
that is, we have derived the striking identity between even and odd powers of $z$:
\[ \frac{z}{1 + z^2} + \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} + \cdots = \frac{z}{1 - z} - \frac{z^3}{1 - z^3} + \frac{z^5}{1 - z^5} - + \cdots. \]
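As a sanity check on the identity just derived, both sides can be truncated and compared at a sample point with $|z| < 1$. This is a sketch under stated assumptions: the names `lhs` and `rhs`, the truncation length, and the test point are all chosen here for illustration.

```python
def lhs(z, terms=80):
    """Partial sum of sum_{n>=1} z^n / (1 + z^(2n))."""
    total, zn = 0j, 1 + 0j
    for _ in range(terms):
        zn *= z                          # zn runs through z, z^2, z^3, ...
        total += zn / (1 + zn * zn)
    return total

def rhs(z, terms=80):
    """Partial sum of sum_{m>=0} (-1)^m z^(2m+1) / (1 - z^(2m+1))."""
    total, odd, z2 = 0j, z + 0j, z * z
    for m in range(terms):
        total += (-1) ** m * odd / (1 - odd)
        odd *= z2                        # odd runs through z, z^3, z^5, ...
    return total

if __name__ == "__main__":
    z = 0.3 + 0.4j                       # |z| = 0.5
    print(lhs(z), rhs(z))
```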
There are more beautiful series like this found in the exercises (see Problem 5 or better yet, Problem 7). We just touch on one more because it’s so nice:

6.5.4. A neat ζ-function identity. Recall that the ζ-function is defined by $\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z}$, which converges absolutely for $z \in \mathbb{C}$ with $\operatorname{Re} z > 1$. Here’s a beautiful theorem from Flajolet and Vardi [75, 232].
Theorem 6.27. If $f(z) = \sum_{n=2}^{\infty} a_n z^n$ and $\sum_{n=2}^{\infty} |a_n|$ converges, then
\[ \sum_{n=1}^{\infty} f\Big(\frac{1}{n}\Big) = \sum_{n=2}^{\infty} a_n \zeta(n). \]
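Proof. We first write
\[ \sum_{n=1}^{\infty} f\Big(\frac{1}{n}\Big) = \sum_{n=1}^{\infty} \sum_{m=2}^{\infty} a_m \frac{1}{n^m}. \]
Now if we set $C := \sum_{m=2}^{\infty} |a_m| < \infty$, then
\[ \sum_{n=1}^{\infty} \sum_{m=2}^{\infty} \Big| a_m \frac{1}{n^m} \Big| \le \sum_{n=1}^{\infty} \sum_{m=2}^{\infty} |a_m| \frac{1}{n^2} \le C \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty. \]
Hence, by Cauchy’s double series theorem, we can switch the order of summation:
\[ \sum_{n=1}^{\infty} f\Big(\frac{1}{n}\Big) = \sum_{n=1}^{\infty} \sum_{m=2}^{\infty} a_m \frac{1}{n^m} = \sum_{m=2}^{\infty} a_m \sum_{n=1}^{\infty} \frac{1}{n^m} = \sum_{m=2}^{\infty} a_m \zeta(m), \]
which completes our proof. □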

Using this theorem we can derive the pretty formula (see Problem 9):
\[ \log 2 = \sum_{n=2}^{\infty} \frac{1}{2^n} \zeta(n). \tag{6.31} \]
Not only is this formula pretty, it converges to $\log 2$ much faster than the usual series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ (from which (6.31) is derived with the help of Theorem 6.27); see [75, 232] for a discussion of such convergence issues.
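The speed difference can be seen in a rough numerical experiment. The sketch below is not the method of [75, 232]; it approximates $\zeta(n)$ by a direct sum plus a standard integral-type tail correction (an assumption introduced here because the Python standard library has no zeta function), and then compares 59 terms of each series.

```python
import math

def zeta(n, cutoff=1000):
    """Approximate zeta(n) for an integer n >= 2.

    Direct sum up to `cutoff`, plus the tail estimate
    cutoff^(1-n)/(n-1) - cutoff^(-n)/2 for the remaining terms.
    """
    head = sum(k ** -n for k in range(1, cutoff + 1))
    tail = cutoff ** (1 - n) / (n - 1) - cutoff ** -n / 2
    return head + tail

def log2_via_zeta(last_n=60):
    """Partial sum of (6.31): sum_{n=2}^{last_n} zeta(n) / 2^n."""
    return sum(zeta(n) / 2 ** n for n in range(2, last_n + 1))

def log2_via_alternating(terms=59):
    """Partial sum of the alternating harmonic series."""
    return sum((-1) ** (n - 1) / n for n in range(1, terms + 1))

if __name__ == "__main__":
    print(math.log(2))
    print(log2_via_zeta())          # 59 terms of (6.31)
    print(log2_via_alternating())   # 59 terms of the usual series
```

The terms of (6.31) decay geometrically, roughly like $2^{-n}$, whereas the alternating series after $N$ terms is still off by about $1/(2N)$.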
Exercises 6.5.
1. Determine the convergence of the limits and the iterated limits for the double sequences
(a) $s_{mn} = \frac{1}{m} + \frac{1}{n}$, (b) $s_{mn} = \frac{m}{m+n}$, (c) $s_{mn} = \Big( \frac{n+1}{n+2} \Big)^m$,
(d) $s_{mn} = (-1)^{m+n} \Big( \frac{1}{m} + \frac{1}{n} \Big)$, (e) $s_{mn} = \frac{1}{1 + (m-n)^2}$.
2. Determine the convergence, iterated convergence, and absolute convergence, for the double series
(a) $\sum_{m,n \ge 1} \frac{(-1)^{mn}}{mn}$, (b) $\sum_{m,n \ge 1} \frac{(-1)^n}{(m + n^p)(m + n^p - 1)}$, $p > 1$, (c) $\sum_{m \ge 2,\, n \ge 1} \frac{1}{m^n}$.
Suggestion: For (b), show that $\sum_{m=1}^{\infty} \frac{1}{(m + n^p)(m + n^p - 1)}$ telescopes.
3. (mn-term test for double series) Show that if $\sum a_{mn}$ converges, then $a_{mn} \to 0$. Suggestion: First verify that $a_{mn} = s_{mn} - s_{m-1,n} - s_{m,n-1} + s_{m-1,n-1}$.
4. Let $z \in \mathbb{C}$ with $|z| < 1$. For $(m, n) \in \mathbb{N} \times \mathbb{N}$, define $a_{mn} = z^n$ if $m \le n$ and define $a_{mn} = 0$ otherwise. Using Cauchy’s double series theorem on $\sum a_{mn}$, prove (6.30). Using (6.30), find $\sum_{n=1}^{\infty} \frac{n}{2^n}$ (cf. Problem 3 in Exercises 3.5).

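5. Let $|z| < 1$. Using Cauchy’s double series theorem, derive the beautiful identities
(a) $\frac{z}{1 + z^2} + \frac{z^3}{1 + z^6} + \frac{z^5}{1 + z^{10}} + \cdots = \frac{z}{1 - z^2} - \frac{z^3}{1 - z^6} + \frac{z^5}{1 - z^{10}} - + \cdots$,
(b) $\frac{z}{1 + z^2} - \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} - + \cdots = \frac{z}{1 + z} - \frac{z^3}{1 + z^3} + \frac{z^5}{1 + z^5} - + \cdots$,
(c) $\frac{z}{1 + z} - \frac{2z^2}{1 + z^2} + \frac{3z^3}{1 + z^3} - + \cdots = \frac{z}{(1 + z)^2} - \frac{z^2}{(1 + z^2)^2} + \frac{z^3}{(1 + z^3)^2} - + \cdots$.
Suggestion: For (c), you need the formula $1/(1 - z)^2 = \sum_{n=1}^{\infty} n z^{n-1}$ found in (6.30).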
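6. Here’s a neat formula for $\zeta(k)$ found in [40]: For any $k \in \mathbb{N}$ with $k \ge 3$, we have
\[ \zeta(k) = \sum_{\ell=1}^{k-2} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^\ell (m+n)^{k-\ell}}. \]
To prove this you may proceed as follows.
(i) Show that
\[ \sum_{\ell=1}^{k-2} \frac{1}{m^\ell (m+n)^{k-\ell}} = \frac{1}{(m+n)^k} \sum_{\ell=1}^{k-2} \Big( \frac{m+n}{m} \Big)^\ell = \frac{1}{m^{k-2}\, n (m+n)} - \frac{1}{n (m+n)^{k-1}}. \]
(ii) Use (i) to show that
\[ \sum_{\ell=1}^{k-2} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^\ell (m+n)^{k-\ell}} = \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^{k-2}\, n (m+n)} - \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{n (m+n)^{k-1}}. \]
Make sure you justify each step; in particular, why does each sum converge?
(iii) Use the partial fractions $\frac{1}{n(m+n)} = \frac{1}{m} \Big( \frac{1}{n} - \frac{1}{m+n} \Big)$ to show that
\[ \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^{k-2}\, n (m+n)} = \sum_{m=1}^{\infty} \sum_{n=1}^{m} \frac{1}{n\, m^{k-1}}. \]
(iv) Replace the summation variable $n$ with $\ell = m + n$ in $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{n (m+n)^{k-1}}$ to get a new sum in terms of $m$ and $\ell$, then use Cauchy’s double series theorem to change the order of summation. Finally, prove the desired result.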
7. (Number theory series) Here are some pretty formulas involving number theory!
(a) For $n \in \mathbb{N}$, let $\tau(n)$ denote the number of positive divisors of $n$ (that is, the number of positive integers that divide $n$). For example, $\tau(1) = 1$ and $\tau(4) = 3$ (because 1, 2, 4 divide 4). Prove that
\[ \sum_{n=1}^{\infty} \frac{z^n}{1 - z^n} = \sum_{n=1}^{\infty} \tau(n) z^n, \quad |z| < 1. \tag{6.32} \]
Suggestion: Write $1/(1 - z^n) = \sum_{m=0}^{\infty} z^{mn} = \sum_{m=1}^{\infty} z^{n(m-1)}$, then prove that the left-hand side of (6.32) equals $\sum z^{mn}$. Finally, use Theorem 6.25 with the set $S_k$ given by $S_k = T_1 \cup \cdots \cup T_k$ where $T_k = \{(m, n) \in \mathbb{N} \times \mathbb{N} \,;\, m \cdot n = k\}$.
(b) For $n \in \mathbb{N}$, let $\sigma(n)$ denote the sum of the positive divisors of $n$. For example, $\sigma(1) = 1$ and $\sigma(4) = 1 + 2 + 4 = 7$. Prove that
\[ \sum_{n=1}^{\infty} \frac{z^n}{(1 - z^n)^2} = \sum_{n=1}^{\infty} \sigma(n) z^n, \quad |z| < 1. \]
8. Here is a neat problem. Let $f(z) = \sum_{n=1}^{\infty} a_n z^n$ and $g(z) = \sum_{n=1}^{\infty} b_n z^n$. Determine a set of points $z \in \mathbb{C}$ for which the following formula is valid:
\[ \sum_{n=1}^{\infty} b_n f(z^n) = \sum_{n=1}^{\infty} a_n g(z^n). \]
From this formula, derive the following pretty formulas:
\[ \sum_{n=1}^{\infty} f(z^n) = \sum_{n=1}^{\infty} \frac{a_n z^n}{1 - z^n}, \qquad \sum_{n=1}^{\infty} (-1)^{n-1} f(z^n) = \sum_{n=1}^{\infty} \frac{a_n z^n}{1 + z^n}, \]
and my favorite (here $g(z) = \sum_{n=1}^{\infty} z^n/n! = e^z - 1$):
\[ \sum_{n=1}^{\infty} \frac{f(z^n)}{n!} = \sum_{n=1}^{\infty} a_n \big( e^{z^n} - 1 \big). \]

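9. In this problem we derive (6.31).
(i) Prove that $\log 2 = \sum_{n=1}^{\infty} \frac{1}{2n(2n-1)} = \sum_{n=1}^{\infty} f\big( \frac{1}{n} \big)$, where $f(z) = \frac{z^2}{2(2 - z)}$.
(ii) Show that $f(z) = \sum_{n=2}^{\infty} \frac{z^n}{2^n}$ and from this and Theorem 6.27 prove (6.31).
10. (Cf. [75, 232]) Prove the following extension of Theorem 6.27: If $f(z) = \sum_{n=2}^{\infty} a_n z^n$ and for some $N \in \mathbb{N}$, $\sum_{n=2}^{\infty} \frac{|a_n|}{N^n}$ converges, then
\[ \sum_{n=N}^{\infty} f\Big( \frac{1}{n} \Big) = \sum_{n=2}^{\infty} a_n \Big( \zeta(n) - \Big( 1 + \frac{1}{2^n} + \cdots + \frac{1}{(N-1)^n} \Big) \Big), \]
where the sum $\big( 1 + \frac{1}{2^n} + \cdots + \frac{1}{(N-1)^n} \big)$ is (by convention) zero if $N = 1$.
11. (Arbitrary rearrangements of double series) Let $f : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$ be a bijective function and denote by $\nu_n = f(n) \in \mathbb{N} \times \mathbb{N}$; therefore $\nu_1, \nu_2, \nu_3, \ldots$ is a list of all elements of $\mathbb{N} \times \mathbb{N}$. For a double series $\sum a_{mn}$ of complex numbers, prove that $\sum_{n=1}^{\infty} a_{\nu_n}$ is absolutely convergent if and only if $\sum a_{mn}$ is absolutely convergent, in which case, $\sum_{n=1}^{\infty} a_{\nu_n} = \sum a_{mn}$.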

6.6. Rearrangements and multiplication of power series


We already know that the associative law holds for infinite series. That is, we
can group the terms of an infinite series in any way we wish and the resulting series
still converges with the same sum (see Theorem 3.23). A natural question that you
may ask is whether or not the commutative law holds for infinite series. That is,
suppose that s = a1 + a2 + a3 + · · · exists. Can we commute the an ’s in any way
we wish and still get the same sum? For instance, is it true that
s = a1 + a2 + a4 + a3 + a6 + a8 + a5 + a10 + a12 + · · ·?
For general series, the answer is, quite shockingly at first, “no!”

6.6.1. Rearrangements. A sequence $\nu_1, \nu_2, \nu_3, \ldots$ of natural numbers such that every natural number occurs exactly once in this list is called a rearrangement of the natural numbers.
Example 6.35. $1, 2, 4, 3, 6, 8, 5, 10, 12, \ldots$, where we follow every odd number by two adjacent even numbers, is a rearrangement.
A rearrangement of a series $\sum_{n=1}^{\infty} a_n$ is a series $\sum_{n=1}^{\infty} a_{\nu_n}$ where $\{\nu_n\}$ is a rearrangement of $\mathbb{N}$.
Example 6.36. Let us rearrange the alternating harmonic series
\[ \log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + - \cdots \]
using the rearrangement $1, 2, 4, 3, 6, 8, 5, 10, 12, \ldots$ we’ve already mentioned:
\[ s = 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \frac{1}{12} + - - \cdots + \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} + \cdots, \]
provided of course that this sum converges. Here, the last three terms displayed represent the general formula for the $k$-th triplet of a positive term followed by two negative ones. To see that this sum converges, let $s_n$ denote its $n$-th partial sum. Then we can write $n = 3k + \ell$ where $\ell$ is either 0, 1, or 2, and so
\[ s_n = 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \cdots + \frac{1}{2k-1} - \frac{1}{4k-2} - \frac{1}{4k} + r_n, \]
where $r_n$ consists of the next $\ell$ ($= 0, 1, 2$) terms of the series for $s_n$. Note that $r_n \to 0$ as $n \to \infty$. In any case, we can write
\[ s_n = \Big( 1 - \frac{1}{2} \Big) - \frac{1}{4} + \Big( \frac{1}{3} - \frac{1}{6} \Big) - \frac{1}{8} + \cdots + \Big( \frac{1}{2k-1} - \frac{1}{4k-2} \Big) - \frac{1}{4k} + r_n \]
\[ = \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + - \cdots + \frac{1}{4k-2} - \frac{1}{4k} + r_n \]
\[ = \frac{1}{2} \Big( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots + \frac{1}{2k-1} - \frac{1}{2k} \Big) + r_n. \]
Taking $n \to \infty$, we see that
\[ s = \frac{1}{2} \log 2. \]
Thus, the rearrangement $s$ has a different sum than the original series!
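The halved sum can be observed directly by adding up whole triplets of the rearranged series. The following is a small floating-point sketch (the function name is illustrative); it sums the first $k$ triplets, that is, the partial sum $s_{3k}$.

```python
import math

def rearranged_partial_sum(k):
    """Sum of the first k triplets 1/(2j-1) - 1/(4j-2) - 1/(4j), j = 1..k."""
    return sum(1 / (2 * j - 1) - 1 / (4 * j - 2) - 1 / (4 * j)
               for j in range(1, k + 1))

if __name__ == "__main__":
    for k in (10, 1000, 100_000):
        print(k, rearranged_partial_sum(k))
    print("log 2     =", math.log(2))
    print("log 2 / 2 =", math.log(2) / 2)
```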
In summary, rearrangements of series can, in general, have different sums than the original series. In fact, it turns out that a convergent series can be rearranged to get a different value if and only if the series is not absolutely convergent. The “only if” portion is proved in Theorem 6.29 and the “if” portion is proved in Riemann’s rearrangement theorem (Theorem 6.28).

6.6.2. Riemann’s rearrangement theorem.


Theorem 6.28 (Riemann’s rearrangement theorem). If a series $\sum a_n$ of real numbers converges, but not absolutely, then there are rearrangements of the series that can be made to converge to $\pm\infty$ or any real number whatsoever.
Proof. We shall prove that there are rearrangements of the series that converge to any real number whatsoever; following the argument for this case, you should be able to handle the $\pm\infty$ cases yourself.
Step 1: We first show that the series corresponding to the positive and negative terms in $\sum a_n$ each diverge. Let $b_1, b_2, b_3, \ldots$ denote the terms in the sequence $\{a_n\}$ that are nonnegative, in the order in which they occur, and let $c_1, c_2, c_3, \ldots$ denote the absolute values of the terms in $\{a_n\}$ that are negative, again, in the order in which they occur. We claim that both series $\sum b_n$ and $\sum c_n$ diverge. To see this, observe that
\[ \sum_{k=1}^{n} a_k = \sum_{i} b_i - \sum_{j} c_j, \tag{6.33} \]
where the right-hand sums are only over those natural numbers $i, j$ such that $b_i$ and $c_j$ occur in the left-hand sum. The left-hand side converges as $n \to \infty$ by assumption, so if either sum $\sum_{n=1}^{\infty} b_n$ or $\sum_{n=1}^{\infty} c_n$ of nonnegative numbers converges, then the equality (6.33) would imply that the other sum converges. But this would then imply that
\[ \sum_{k=1}^{n} |a_k| = \sum_{i} b_i + \sum_{j} c_j \]
converges as $n \to \infty$, which it does not, since $\sum a_n$ is not absolutely convergent. Hence, both sums $\sum b_n$ and $\sum c_n$ diverge.
Step 2: We produce a rearrangement. Let $\xi \in \mathbb{R}$. We shall produce a rearrangement
\[ b_1 + \cdots + b_{m_1} - c_1 - \cdots - c_{n_1} + b_{m_1+1} + \cdots + b_{m_2} - c_{n_1+1} - \cdots - c_{n_2} + b_{m_2+1} + \cdots + b_{m_3} - c_{n_2+1} - \cdots \tag{6.34} \]
such that its partial sums converge to $\xi$. We do so as follows. Let $\{\beta_n\}$ and $\{\gamma_n\}$ denote the partial sums for $\sum b_n$ and $\sum c_n$, respectively. Since $\beta_n \to \infty$, for $n$ sufficiently large, $\beta_n > \xi$. We define $m_1$ as the smallest natural number such that
\[ \beta_{m_1} > \xi. \]
Note that $\beta_{m_1}$ differs from $\xi$ by at most $b_{m_1}$. Since $\gamma_n \to \infty$, for $n$ sufficiently large, $\beta_{m_1} - \gamma_n < \xi$. We define $n_1$ to be the smallest natural number such that
\[ \beta_{m_1} - \gamma_{n_1} < \xi. \]
Note that the left-hand side differs from $\xi$ by at most $c_{n_1}$. Now define $m_2$ as the smallest natural number greater than $m_1$ such that
\[ \beta_{m_2} - \gamma_{n_1} > \xi. \]
As before, such a number exists because $\beta_n \to \infty$, and the left-hand side differs from $\xi$ by at most $b_{m_2}$. We define the number $n_2$ as the smallest natural number greater than $n_1$ such that
\[ \beta_{m_2} - \gamma_{n_2} < \xi, \]
where the left-hand side differs from $\xi$ by at most $c_{n_2}$. Continuing this process, we produce sequences $m_1 < m_2 < m_3 < \cdots$ and $n_1 < n_2 < n_3 < \cdots$ such that for every $k$,
\[ \beta_{m_k} - \gamma_{n_{k-1}} > \xi, \]
where the left-hand side differs from $\xi$ by at most $b_{m_k}$, and
\[ \beta_{m_k} - \gamma_{n_k} < \xi, \]
where the left-hand side differs from $\xi$ by at most $c_{n_k}$.
P Step 3: We now show that the series (6.34), which is just a rearrangement of
an , converges to ξ. Let

βk0 := b1 + · · · + bm1 − c1 − · · · − cn1 + bm1 +1 + · · · + bm2 −


· · · − cnk−2 +1 − · · · − cnk−1 + bmk−1 +1 + · · · + bmk = βmk − γnk−1
and

γk0 := b1 + · · · + bm1 − c1 − · · · − cn1 + bm1 +1 + · · · + bm2 −


· · · + bmk−1 +1 + · · · + bmk − cnk−1 +1 − · · · − cnk = βmk − γnk .
314 6. ADVANCED THEORY OF INFINITE SERIES

Then any given partial sum $t$ of (6.34) is one of the following two sorts:
\[ t = b_1 + \cdots + b_{m_1} - c_1 - \cdots - c_{n_1} + b_{m_1+1} + \cdots + b_{m_2} - \cdots - c_{n_{k-2}+1} - \cdots - c_{n_{k-1}} + b_{m_{k-1}+1} + \cdots + b_\ell, \]
where $\ell \le m_k$, in which case, $\gamma_{k-1}' < t \le \beta_k'$; otherwise,
\[ t = b_1 + \cdots + b_{m_1} - c_1 - \cdots - c_{n_1} + b_{m_1+1} + \cdots + b_{m_2} - \cdots + b_{m_{k-1}+1} + \cdots + b_{m_k} - c_{n_{k-1}+1} - \cdots - c_\ell, \]
where $\ell \le n_k$, in which case, $\gamma_k' \le t < \beta_k'$. Now by construction, $\beta_k'$ differs from $\xi$ by at most $b_{m_k}$ and $\gamma_k'$ differs from $\xi$ by at most $c_{n_k}$. Therefore, the fact that $\gamma_{k-1}' < t \le \beta_k'$ or $\gamma_k' \le t < \beta_k'$ implies that
\[ \xi - c_{n_{k-1}} < t \le \xi + b_{m_k} \quad \text{or} \quad \xi - c_{n_k} \le t < \xi + b_{m_k}. \]
By assumption, $\sum a_n$ converges, so $b_{m_k}, c_{n_k} \to 0$, hence the partial sums of (6.34) must converge to $\xi$. This completes our proof. □
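The greedy construction in Steps 2 and 3 can be tried numerically. The following Python sketch is only an illustration under stated assumptions: it applies the construction to the alternating harmonic series $1 - \frac12 + \frac13 - \cdots$, so that $b_n = 1/(2n-1)$ and $c_n = 1/(2n)$; the function name `rearranged_partial_sums` and the target value $\xi = 2$ are choices made for this example, and when a partial sum equals the target exactly the sketch adds a positive term, a harmless deviation from the strict inequalities used above.

```python
from itertools import count

def rearranged_partial_sums(xi, n_terms):
    """Partial sums of a greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at xi."""
    pos = (1.0 / n for n in count(1, 2))   # b_n: 1, 1/3, 1/5, ...
    neg = (1.0 / n for n in count(2, 2))   # c_n: 1/2, 1/4, 1/6, ...
    total, sums = 0.0, []
    for _ in range(n_terms):
        if total <= xi:
            total += next(pos)   # add positive terms until we exceed xi
        else:
            total -= next(neg)   # subtract negative terms until we drop below xi
        sums.append(total)
    return sums

sums = rearranged_partial_sums(xi=2.0, n_terms=200_000)
print(sums[-1])   # close to the target 2.0
```

After the first crossing, every partial sum stays within the most recently used crossing term of the target, which is the estimate exploited in Step 3.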

We now prove that a convergent series can be rearranged to get a different value only if the series is not absolutely convergent. Actually, we shall prove the contrapositive: If a series is absolutely convergent, then any rearrangement has the same value as the original sum. This is a consequence of the following theorem.
Theorem 6.29 (Dirichlet’s theorem). All rearrangements of an absolutely
convergent series of complex numbers converge with the same sum as the original
series.
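Proof. Let $\sum a_n$ converge absolutely. We shall prove that any rearrangement of this series converges to the same value as the sum itself. To see this, let $\nu_1, \nu_2, \nu_3, \ldots$ be any rearrangement of the natural numbers and define
\[ a_{mn} = \begin{cases} a_m & \text{if } m = \nu_n, \\ 0 & \text{else.} \end{cases} \]
Then by definition of $a_{mn}$, we have
\[ a_m = \sum_{n=1}^\infty a_{mn} \quad \text{and} \quad a_{\nu_n} = \sum_{m=1}^\infty a_{mn}. \]
Moreover,
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty |a_{mn}| = \sum_{m=1}^\infty |a_m| < \infty, \]
so by Cauchy’s double series theorem,
\[ \sum_{m=1}^\infty a_m = \sum_{m=1}^\infty \sum_{n=1}^\infty a_{mn} = \sum_{n=1}^\infty \sum_{m=1}^\infty a_{mn} = \sum_{n=1}^\infty a_{\nu_n}. \]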

We now move to the important topic of multiplication of series.


6.6. REARRANGEMENTS AND MULTIPLICATION OF POWER SERIES 315

6.6.3. Multiplication of power series and infinite series. If we consider two power series $\sum_{n=0}^\infty a_n z^n$ and $\sum_{n=0}^\infty b_n z^n$, then formally multiplying and combining like powers of $z$, we get
\[ \big(a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \cdots\big)\big(b_0 + b_1 z + b_2 z^2 + b_3 z^3 + \cdots\big) = a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) z^3 + \cdots. \]
In particular, taking $z = 1$, we get (again, only formally!)
\[ \big(a_0 + a_1 + a_2 + a_3 + \cdots\big)\big(b_0 + b_1 + b_2 + b_3 + \cdots\big) = a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) + \cdots. \]
These thoughts suggest the following definition. Given two series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$, their Cauchy product is the series $\sum_{n=0}^\infty c_n$, where
\[ c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0 = \sum_{k=0}^n a_k b_{n-k}. \]
A natural question to ask is: if $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$ converge, then is it true that
\[ \Big(\sum_{n=0}^\infty a_n\Big)\Big(\sum_{n=0}^\infty b_n\Big) = \sum_{n=0}^\infty c_n\,? \]
The answer, perhaps surprisingly, is “no”.


Example 6.37. Let us consider the example $\big(\sum_{n=1}^\infty \frac{(-1)^{n-1}}{\sqrt{n}}\big)\big(\sum_{n=1}^\infty \frac{(-1)^{n-1}}{\sqrt{n}}\big)$, which is due to Cauchy. That is, let $a_0 = b_0 = 0$ and
\[ a_n = b_n = (-1)^{n-1} \frac{1}{\sqrt{n}}, \qquad n = 1, 2, 3, \ldots. \]
We know, by the alternating series test, that $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{\sqrt{n}}$ converges. However, we shall see that the Cauchy product does not converge. Indeed,
\[ c_0 = a_0 b_0 = 0, \qquad c_1 = a_0 b_1 + a_1 b_0 = 0, \]
and for $n \ge 2$,
\[ c_n = \sum_{k=0}^n a_k b_{n-k} = \sum_{k=1}^{n-1} \frac{(-1)^{k-1} (-1)^{n-k-1}}{\sqrt{k}\,\sqrt{n-k}} = (-1)^n \sum_{k=1}^{n-1} \frac{1}{\sqrt{k}\,\sqrt{n-k}}. \]
Since for $1 \le k \le n-1$, we have
\[ k(n-k) \le (n-1)(n-1) = (n-1)^2 \implies \frac{1}{n-1} \le \frac{1}{\sqrt{k(n-k)}}, \]
we see that
\[ (-1)^n c_n = \sum_{k=1}^{n-1} \frac{1}{\sqrt{k(n-k)}} \ge \sum_{k=1}^{n-1} \frac{1}{n-1} = \frac{1}{n-1} \sum_{k=1}^{n-1} 1 = 1. \]
Thus, the terms $c_n$ do not tend to zero as $n \to \infty$, so by the $n$-th term test, the series $\sum_{n=0}^\infty c_n$ does not converge.
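The lower bound $|c_n| \ge 1$ is easy to observe numerically. The following Python sketch, with the hypothetical helper name `cauchy_product_term`, computes the Cauchy product terms of Example 6.37 directly from the definition $c_n = \sum_{k=0}^n a_k b_{n-k}$; it is an illustration, not a proof.

```python
import math

def cauchy_product_term(a, b, n):
    """c_n = sum_{k=0}^{n} a(k) * b(n - k)."""
    return sum(a(k) * b(n - k) for k in range(n + 1))

def a(n):
    # a_0 = 0 and a_n = (-1)^(n-1) / sqrt(n) for n >= 1, as in Example 6.37
    return 0.0 if n == 0 else (-1) ** (n - 1) / math.sqrt(n)

# Terms c_2, c_3, ..., c_199 of the Cauchy product of the series with itself.
terms = [cauchy_product_term(a, a, n) for n in range(2, 200)]
print(min(abs(c) for c in terms))   # the terms never get smaller than 1 in absolute value
```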

The problem with this example is that the series $\sum \frac{(-1)^{n-1}}{\sqrt{n}}$ does not converge absolutely. However, if at least one of the two series converges absolutely, there is no problem, as the following theorem, due to Franz Mertens (1840–1927), shows.

Theorem 6.30 (Mertens’ multiplication theorem). If at least one of two convergent series $\sum a_n = A$ and $\sum b_n = B$ converges absolutely, then their Cauchy product converges with sum equal to $AB$.

Proof. Consider the partial sums of the Cauchy product:
\begin{align*}
C_n &= c_0 + c_1 + \cdots + c_n \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) + \cdots + (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) \\
&= a_0 (b_0 + \cdots + b_n) + a_1 (b_0 + \cdots + b_{n-1}) + \cdots + a_n b_0. \tag{6.35}
\end{align*}
We need to show that $C_n$ tends to $AB$ as $n \to \infty$. Because our notation is symmetric in $A$ and $B$, we may assume that the sum $\sum a_n$ is absolutely convergent. If $A_n$ denotes the $n$-th partial sum of $\sum a_n$ and $B_n$ that of $\sum b_n$, then from (6.35), we have
\[ C_n = a_0 B_n + a_1 B_{n-1} + \cdots + a_n B_0. \]
If we set $B_k = B + \beta_k$, then $\beta_k \to 0$, and we can write
\begin{align*}
C_n &= a_0 (B + \beta_n) + a_1 (B + \beta_{n-1}) + \cdots + a_n (B + \beta_0) \\
&= A_n B + (a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0).
\end{align*}
Since $A_n \to A$, the first part of this sum converges to $AB$. Thus, we just need to show that the term in parentheses tends to zero as $n \to \infty$. To see this, let $\varepsilon > 0$ be given. Put $\alpha = \sum |a_n|$; we may assume $\alpha > 0$, since otherwise every $a_n$ is zero and there is nothing to prove. Using that $\beta_n \to 0$, we can choose a natural number $N$ such that for all $n > N$, we have $|\beta_n| < \varepsilon/(2\alpha)$. Also, since $\beta_n \to 0$, we can choose a constant $C > 0$ such that $|\beta_n| \le C$ for every $n$. Then for $n > N$,
\begin{align*}
|a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0| &= |a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_{n-N-1} \beta_{N+1} + a_{n-N} \beta_N + \cdots + a_n \beta_0| \\
&\le |a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_{n-N-1} \beta_{N+1}| + |a_{n-N} \beta_N + \cdots + a_n \beta_0| \\
&\le \big(|a_0| + |a_1| + \cdots + |a_{n-N-1}|\big) \cdot \frac{\varepsilon}{2\alpha} + \big(|a_{n-N}| + \cdots + |a_n|\big) \cdot C \\
&\le \alpha \cdot \frac{\varepsilon}{2\alpha} + C \big(|a_{n-N}| + \cdots + |a_n|\big) \\
&= \frac{\varepsilon}{2} + C \big(|a_{n-N}| + \cdots + |a_n|\big).
\end{align*}
Since $\sum |a_n| < \infty$, by the Cauchy criterion for series, we can choose $N' > N$ such that
\[ n > N' \implies |a_{n-N}| + \cdots + |a_n| < \frac{\varepsilon}{2C}. \]
Then for $n > N'$, we see that
\[ |a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, this completes the proof of the theorem. □
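To see Mertens’ theorem at work when only one factor converges absolutely, here is a small numerical sketch in Python. The two series are choices made for this illustration (they do not come from the text): $a_n = (-1)^n/(n+1)$, which converges only conditionally to $\log 2$, and $b_n = 1/2^n$, which converges absolutely to $2$. The helper name `cauchy_partial_sum` is hypothetical; it evaluates $C_n$ through the regrouping (6.35).

```python
import math

def cauchy_partial_sum(a, b, n):
    """C_n = c_0 + ... + c_n, computed as a_0*B_n + a_1*B_{n-1} + ... + a_n*B_0."""
    partial_b, running = [], 0.0
    for k in range(n + 1):
        running += b(k)
        partial_b.append(running)          # partial_b[k] is the partial sum B_k
    return sum(a(k) * partial_b[n - k] for k in range(n + 1))

def a(n):
    return (-1) ** n / (n + 1)   # conditionally convergent, with sum log 2

def b(n):
    return 0.5 ** n              # absolutely convergent, with sum 2

approx = cauchy_partial_sum(a, b, 2000)
print(approx, 2 * math.log(2))   # the two values should be close
```

The convergence is slow because the alternating factor converges slowly; the theorem only guarantees the limit, not a rate.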

As an easy corollary, we see that if $\sum_{n=0}^\infty a_n z^n$ and $\sum_{n=0}^\infty b_n z^n$ have radii of convergence $R_1, R_2$, respectively, then since power series converge absolutely within their radii of convergence, for all $z \in \mathbb{C}$ with $|z| < \min\{R_1, R_2\}$, we have
\[ \Big(\sum_{n=0}^\infty a_n z^n\Big)\Big(\sum_{n=0}^\infty b_n z^n\Big) = \sum_{n=0}^\infty c_n z^n, \]
where $c_n = \sum_{k=0}^n a_k b_{n-k}$. In words: The product of power series is a power series.
Here’s a question: Suppose that $\sum a_n$ and $\sum b_n$ converge and their Cauchy product $\sum c_n$ also converges; is it true that $\sum c_n = \big(\sum a_n\big)\big(\sum b_n\big)$? The answer may seem to be an “obvious” yes. However, it’s not so “obvious” because the definition of the Cauchy product was based on a formal argument. Here is a proof of this “obvious” fact.
Theorem 6.31 (Abel’s multiplication theorem). If the Cauchy product of two convergent series $\sum a_n = A$ and $\sum b_n = B$ converges, then the Cauchy product has the value $AB$.
Proof. In my opinion, the slickest proof of this theorem is Abel’s original, proved in 1826 [120, p. 321] using his limit theorem, Theorem 6.20. Let
\[ f(z) = \sum a_n z^n, \qquad g(z) = \sum b_n z^n, \qquad h(z) = \sum c_n z^n, \]
where $c_n = a_0 b_n + \cdots + a_n b_0$. These power series converge at $z = 1$, so they must have radii of convergence at least 1. In particular, each series converges absolutely for $|z| < 1$, and for these values of $z$, according to Mertens’ theorem, we have
\[ h(z) = f(z) \cdot g(z). \]
Since each of the sums $\sum a_n$, $\sum b_n$, and $\sum c_n$ converges, by Abel’s limit theorem, the functions $f$, $g$, and $h$ converge to $A$, $B$, and $C = \sum c_n$, respectively, as $z = x \to 1$ from the left. Thus,
\[ C = \lim_{x \to 1^-} h(x) = \lim_{x \to 1^-} f(x) \cdot g(x) = A \cdot B. \]
□


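Example 6.38. For example, let us square $\log 2 = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n}$. It turns out that it will be convenient to write $\log 2$ in two ways: $\log 2 = \sum_{n=0}^\infty a_n$ where $a_0 = 0$ and $a_n = \frac{(-1)^{n-1}}{n}$ for $n = 1, 2, \ldots$, and as $\log 2 = \sum_{n=0}^\infty b_n$ where $b_n = \frac{(-1)^n}{n+1}$. Thus, $c_0 = a_0 b_0 = 0$ and for $n = 1, 2, \ldots$, we see that
\[ c_n = \sum_{k=0}^n a_k b_{n-k} = \sum_{k=1}^n \frac{(-1)^{k-1} (-1)^{n-k}}{k(n+1-k)} = (-1)^{n-1} \alpha_n, \]
where $\alpha_n = \sum_{k=1}^n \frac{1}{k(n+1-k)}$. By Abel’s multiplication theorem, we have $(\log 2)^2 = \sum_{n=0}^\infty c_n = \sum_{n=1}^\infty (-1)^{n-1} \alpha_n$ as long as this latter sum converges. By the alternating series test, this sum converges if we can prove that $\{\alpha_n\}$ is nonincreasing and converges to zero. To prove these statements hold, observe that we can write
\[ \frac{1}{k(n-k+1)} = \frac{1}{n+1} \Big( \frac{1}{k} + \frac{1}{n-k+1} \Big), \]
therefore
\begin{align*}
\alpha_n &= \frac{1}{1} \cdot \frac{1}{n} + \frac{1}{2} \cdot \frac{1}{n-1} + \frac{1}{3} \cdot \frac{1}{n-2} + \cdots + \frac{1}{n} \cdot \frac{1}{1} \\
&= \frac{1}{n+1} \Big[ \Big(1 + \frac{1}{n}\Big) + \Big(\frac{1}{2} + \frac{1}{n-1}\Big) + \Big(\frac{1}{3} + \frac{1}{n-2}\Big) + \cdots + \Big(\frac{1}{n} + \frac{1}{1}\Big) \Big].
\end{align*}
In the brackets there are two copies of $1 + \frac{1}{2} + \cdots + \frac{1}{n}$. Thus,
\[ \alpha_n = \frac{2}{n+1} H_n, \quad \text{where } H_n := 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}. \]
It is common to use the notation $H_n$ for the $n$-th partial sum of the harmonic series. Now, recall from Section 4.6.5 on the Euler-Mascheroni constant that $\gamma_n := H_n - \log n$ is bounded above by 1, so
\begin{align*}
\alpha_n = \frac{2}{n+1} (\gamma_n + \log n) &\le \frac{2}{n+1} + 2\, \frac{\log n}{n+1} = \frac{2}{n+1} + 2 \cdot \frac{n}{n+1} \cdot \frac{1}{n} \log n \\
&= \frac{2}{n+1} + 2 \cdot \frac{n}{n+1} \cdot \log\big(n^{1/n}\big) \to 0 + 2 \cdot 1 \cdot \log 1 = 0
\end{align*}
as $n \to \infty$. Thus, $\alpha_n \to 0$. Moreover,
\begin{align*}
\alpha_n - \alpha_{n+1} &= \frac{2}{n+1} H_n - \frac{2}{n+2} H_{n+1} = \frac{2}{n+1} H_n - \frac{2}{n+2} \Big( H_n + \frac{1}{n+1} \Big) \\
&= \Big( \frac{2}{n+1} - \frac{2}{n+2} \Big) H_n - \frac{2}{(n+1)(n+2)} \\
&= \frac{2}{(n+1)(n+2)} H_n - \frac{2}{(n+1)(n+2)} \\
&= \frac{2}{(n+1)(n+2)} (H_n - 1) \ge 0.
\end{align*}
Thus, $\alpha_n \ge \alpha_{n+1}$, so $\sum c_n = \sum (-1)^{n-1} \alpha_n$ converges. Hence, we have proved the following pretty formula:
\[ \frac{1}{2} (\log 2)^2 = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n+1} H_n = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n+1} \Big( 1 + \frac{1}{2} + \cdots + \frac{1}{n} \Big). \]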
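The formula for $\frac12 (\log 2)^2$ can be sanity-checked numerically. In the Python sketch below (the function name `weighted_alternating_sum` is made up for this illustration), we sum the first 200,000 terms; since the terms $H_n/(n+1)$ are nonincreasing, the alternating series error estimate bounds the error by the first omitted term.

```python
import math

def weighted_alternating_sum(n_terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) * H_n / (n + 1)."""
    total, harmonic, sign = 0.0, 0.0, 1.0
    for n in range(1, n_terms + 1):
        harmonic += 1.0 / n                  # harmonic is now H_n
        total += sign * harmonic / (n + 1)
        sign = -sign
    return total

approx = weighted_alternating_sum(200_000)
target = 0.5 * math.log(2) ** 2
print(approx, target)   # the two values should agree to several decimal places
```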

Our final theorem, Cauchy’s multiplication theorem, basically says that we can multiply absolutely convergent series without worrying about anything. To introduce this theorem, note that if we have finite sums $\sum a_n$ and $\sum b_n$, then
\[ \Big(\sum a_n\Big) \cdot \Big(\sum b_n\Big) = \sum a_m b_n, \]
where the sum on the right means to add over all such products $a_m b_n$ in any order we wish. One can ask if this holds true in the infinite series realm. The answer is “yes” if both series on the left are absolutely convergent.
Theorem 6.32 (Cauchy’s multiplication theorem). If two series $\sum a_n = A$ and $\sum b_n = B$ converge absolutely, then the double series $\sum a_m b_n$ converges absolutely and has the value $AB$.

Proof. Since
\[ \sum_{m=0}^\infty \sum_{n=0}^\infty |a_m b_n| = \sum_{m=0}^\infty |a_m| \sum_{n=0}^\infty |b_n| = \Big(\sum_{m=0}^\infty |a_m|\Big)\Big(\sum_{n=0}^\infty |b_n|\Big) < \infty, \]
by Cauchy’s double series theorem, the double series $\sum a_m b_n$ converges absolutely, and we can iterate the sums:
\[ \sum a_m b_n = \sum_{m=0}^\infty \sum_{n=0}^\infty a_m b_n = \sum_{m=0}^\infty a_m \sum_{n=0}^\infty b_n = \Big(\sum_{m=0}^\infty a_m\Big)\Big(\sum_{n=0}^\infty b_n\Big) = A \cdot B. \]
□

We remark that Cauchy’s multiplication theorem generalizes to a product of more than two absolutely convergent series.

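6.6.4. The exponential function (again). Using Mertens’ or Cauchy’s multiplication theorem, we can give an alternative and quick proof of the formula $\exp(z) \exp(w) = \exp(z + w)$ for $z, w \in \mathbb{C}$, which was originally proved in Theorem 3.31 using a completely different method:
\begin{align*}
\exp(z) \exp(w) &= \Big(\sum_{n=0}^\infty \frac{z^n}{n!}\Big) \cdot \Big(\sum_{n=0}^\infty \frac{w^n}{n!}\Big) \\
&= \sum_{n=0}^\infty \Big( \sum_{k=0}^n \frac{z^k}{k!} \cdot \frac{w^{n-k}}{(n-k)!} \Big) \\
&= \sum_{n=0}^\infty \frac{1}{n!} \Big( \sum_{k=0}^n \frac{n!}{k!(n-k)!} z^k w^{n-k} \Big) \\
&= \sum_{n=0}^\infty \frac{1}{n!} \Big( \sum_{k=0}^n \binom{n}{k} z^k w^{n-k} \Big) = \sum_{n=0}^\infty \frac{1}{n!} (z + w)^n = \exp(z + w),
\end{align*}
where we used the binomial theorem for $(z + w)^n$ in the last line.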
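As a numerical illustration of this computation, the following Python sketch sums the Cauchy product of the two exponential series term by term and compares the result with `cmath.exp(z + w)`. The function name `exp_product_via_cauchy`, the truncation at 40 terms, and the sample values of $z$ and $w$ are assumptions made for the example.

```python
import cmath
from math import factorial

def exp_product_via_cauchy(z, w, n_terms=40):
    """Sum c_0 + ... + c_{n_terms-1}, where c_n = sum_k z^k/k! * w^(n-k)/(n-k)!."""
    total = 0j
    for n in range(n_terms):
        total += sum(z**k / factorial(k) * w**(n - k) / factorial(n - k)
                     for k in range(n + 1))
    return total

z, w = 1 + 2j, -0.5 + 0.3j
product = exp_product_via_cauchy(z, w)
print(product, cmath.exp(z + w))   # the two complex numbers should agree closely
```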
Exercises 6.6.
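1. Here are some alternating series problems:
(a) Prove that
\[ \frac{1}{1} + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots + \frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} + \cdots = \frac{3}{2} \log 2; \]
that is, we rearrange the alternating harmonic series so that two positive terms are followed by one negative one, otherwise keeping the ordering the same. Suggestion: Observe that
\begin{align*}
\frac{1}{2} \log 2 &= \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \frac{1}{10} - \frac{1}{12} + \cdots \\
&= 0 + \frac{1}{2} + 0 - \frac{1}{4} + 0 + \frac{1}{6} + 0 - \frac{1}{8} + \cdots.
\end{align*}
Add this term-by-term to the series for $\log 2$.
(b) Prove that
\[ \frac{1}{1} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} - \frac{1}{2} + \cdots + \frac{1}{8k-7} + \frac{1}{8k-5} + \frac{1}{8k-3} + \frac{1}{8k-1} - \frac{1}{2k} + \cdots = 2 \log 2; \]
that is, we rearrange the alternating harmonic series so that four positive terms are followed by one negative one, otherwise keeping the ordering the same.
(c) What’s wrong with the following argument?
\begin{align*}
1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots &= \Big( 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots \Big) - 2 \Big( 0 + \frac{1}{2} + 0 + \frac{1}{4} + 0 + \frac{1}{6} + \cdots \Big) \\
&= \Big( 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots \Big) - \Big( 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots \Big) = 0.
\end{align*}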
2. Let $f(z) = \sum_{n=0}^\infty a_n z^n$ be absolutely convergent for $|z| < 1$. Prove that for $|z| < 1$, we have
\[ \frac{f(z)}{1-z} = \sum_{n=0}^\infty (a_0 + a_1 + a_2 + \cdots + a_n) z^n. \]
3. Using the previous problem, prove that for $z \in \mathbb{C}$ with $|z| < 1$,
\[ \frac{1}{(1-z)^2} = \sum_{n=0}^\infty (n+1) z^n; \quad \text{that is,} \quad \Big(\sum_{n=0}^\infty z^n\Big) \cdot \Big(\sum_{n=0}^\infty z^n\Big) = \sum_{n=0}^\infty (n+1) z^n. \]
Using this formula, derive the neat looking formula: For $z \in \mathbb{C}$ with $|z| < 1$,
\[ \Big(\sum_{n=0}^\infty \cos n\theta \, z^n\Big) \cdot \Big(\sum_{n=0}^\infty \sin n\theta \, z^n\Big) = \frac{1}{2} \sum_{n=0}^\infty (n+1) \sin n\theta \, z^n. \tag{6.36} \]
Suggestion: Put $z = e^{i\theta} x$ with $x$ real into the formula $\big(\sum_{n=0}^\infty z^n\big) \cdot \big(\sum_{n=0}^\infty z^n\big) = \sum_{n=0}^\infty (n+1) z^n$, then equate imaginary parts of both sides; this proves (6.36) for $z = x$ real and $|x| < 1$. Why does (6.36) hold for $z \in \mathbb{C}$ with $|z| < 1$?
4. Derive the beautiful formula: For |z| < 1,
X ∞ ∞ ∞
cos nθ n   X sin nθ n  1 X Hn sin nθ n
z · z = z .
n=1
n n=1
n 2 n=2 n
5. In this problem we prove the following fact: Let $f(z) = \sum_{n=0}^\infty a_n z^n$ be a power series with radius of convergence $R > 0$ and let $\alpha \in \mathbb{C}$ with $|\alpha| < R$. Then we can write
\[ f(z) = \sum_{n=0}^\infty b_n (z - \alpha)^n, \]
where this series converges absolutely for $|z - \alpha| < R - |\alpha|$.
(i) Show that
\[ f(z) = \sum_{n=0}^\infty \sum_{m=0}^n a_n \binom{n}{m} \alpha^{n-m} (z - \alpha)^m. \tag{6.37} \]
(ii) Prove that
\[ \sum_{n=0}^\infty \sum_{m=0}^n |a_n| \binom{n}{m} |\alpha|^{n-m} |z - \alpha|^m = \sum_{n=0}^\infty |a_n| \big( |z - \alpha| + |\alpha| \big)^n < \infty. \]
(iii) After verifying that you can change the order of summation in (6.37), prove the result.
6.7. ★ Proofs that $\sum 1/p$ diverges

We know that the harmonic series $\sum 1/n$ diverges. However, if we only sum over the squares, then we get the convergent sum $\sum 1/n^2$. Similarly, if we only sum over the cubes, we get the convergent sum $\sum 1/n^3$. One may ask: What if we sum only over all primes:
\[ \sum \frac{1}{p} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \cdots, \]
do we get a convergent sum? We know that there are arbitrarily large gaps between primes (see Problem 1 in Exercises 2.4), so one may conjecture that $\sum 1/p$ converges. However, following [23], [63], [164] (cf. [165]), and [130] we shall prove that $\sum 1/p$ diverges! Other proofs can be found in the exercises. An expository article giving other proofs (cf. [153], [51]) on this fascinating divergent sum can be found in [231].
6.7.1. Proof I: Proof by multiplication and rearrangement. This is Bellman [23] and Dux’s [63] argument. Suppose, for sake of contradiction, that $\sum 1/p$ converges. Then we can fix a prime number $m$ such that $\sum_{p > m} 1/p < 1$. Let $2 < 3 < \cdots < m$ be the list of all prime numbers up to $m$. Given $N > m$, let $P_N$ be the set of natural numbers greater than one and less than or equal to $N$ all of whose prime factors are less than or equal to $m$, and let $Q_N$ be the set of natural numbers greater than one and less than or equal to $N$ all of whose prime factors are greater than $m$. Explicitly,
\begin{align*}
k \in P_N &\iff 1 < k \le N \text{ and } k = 2^i 3^j \cdots m^k, \text{ some } i, j, \ldots, k, \\
\ell \in Q_N &\iff 1 < \ell \le N \text{ and } \ell = p\, q \cdots r, \ p, q, \ldots, r > m \text{ are prime.} \tag{6.38}
\end{align*}
In the product $p\, q \cdots r$, prime numbers may be repeated. Observe that any integer $1 < n \le N$ that is not in $P_N$ or $Q_N$ must have prime factors that are both less than or equal to $m$ and greater than $m$, and hence can be factored in the form $n = k\,\ell$ where $k \in P_N$ and $\ell \in Q_N$. Thus, the finite sum
\[ \sum_{k \in P_N} \frac{1}{k} + \sum_{\ell \in Q_N} \frac{1}{\ell} + \Big( \sum_{k \in P_N} \frac{1}{k} \Big) \Big( \sum_{\ell \in Q_N} \frac{1}{\ell} \Big) = \sum_{k \in P_N} \frac{1}{k} + \sum_{\ell \in Q_N} \frac{1}{\ell} + \sum_{k \in P_N,\, \ell \in Q_N} \frac{1}{k\,\ell} \]
contains every number of the form $1/n$ where $1 < n \le N$. (Of course, the resulting sum contains other numbers too.) In particular,
\[ \sum_{k \in P_N} \frac{1}{k} + \sum_{\ell \in Q_N} \frac{1}{\ell} + \Big( \sum_{k \in P_N} \frac{1}{k} \Big) \Big( \sum_{\ell \in Q_N} \frac{1}{\ell} \Big) \ge \sum_{n=2}^N \frac{1}{n}. \]

We shall prove that the finite sums on the left remain bounded as $N \to \infty$, which contradicts the fact that the harmonic series diverges.
To see that $\sum_{P_N} 1/k$ converges as $N \to \infty$, note that for each prime $p$ the geometric series $\sum_{j=0}^\infty 1/p^j$ converges (absolutely since all the $1/p^j$ are positive) to a finite real number. Hence, by Cauchy’s multiplication theorem (or rather its generalization to a product of more than two absolutely convergent series), we have that
\[ \Big( \sum_{i=0}^\infty \frac{1}{2^i} \Big) \Big( \sum_{j=0}^\infty \frac{1}{3^j} \Big) \cdots \Big( \sum_{k=0}^\infty \frac{1}{m^k} \Big) = \sum \frac{1}{2^i 3^j \cdots m^k} \]
is a finite real number, where the sum on the right is over all $i, j, \ldots, k = 0, 1, 2, \ldots$. (The sums start at exponent zero because an element of $P_N$ need not be divisible by every prime up to $m$.) Using the definition of $P_N$ in (6.38), we see that $\sum_{P_N} 1/k$ is bounded above by this finite real number uniformly in $N$. Thus, $\lim_{N \to \infty} \sum_{P_N} 1/k$ is finite.
P
We now prove that $\lim_{N \to \infty} \sum_{Q_N} 1/\ell$ is finite. To do so observe that since $\alpha := \sum_{p > m} 1/p < 1$ and all the $1/p$’s are positive, the sum $\sum_{p > m} 1/p$, in particular, converges absolutely. Hence, by Cauchy’s multiplication theorem, we have
\[ \alpha^2 = \Big( \sum_{p > m} \frac{1}{p} \Big)^2 = \sum_{p, q > m} \frac{1}{p\,q}, \]
where the sum is over all primes $p, q > m$, and
\[ \alpha^3 = \Big( \sum_{p > m} \frac{1}{p} \Big)^3 = \sum_{p, q, r > m} \frac{1}{p\,q\,r}, \]
where the sum is over all primes $p, q, r > m$. We can continue this procedure showing that $\alpha^j$ is the sum $\sum 1/(p\,q \cdots r)$ where the sum is over all $j$-tuples of primes $p, q, \ldots, r$ all of which are strictly larger than $m$. By definition of $Q_N$ in (6.38), it follows that the sum $\sum_{Q_N} 1/\ell$ is bounded by the number $\sum_{j=1}^\infty \alpha^j$, which is finite because $\alpha < 1$. Hence, the limit $\lim_{N \to \infty} \sum_{Q_N} 1/\ell$ is finite, and we have reached a contradiction.
6.7.2. An elementary number theory fact. Our next proof depends on the idea of square-free integers. A positive integer is said to be square-free if no squared prime divides it, that is, if a prime occurs in its prime factorization, then it occurs with multiplicity one. For instance, 1 is square-free because no squared prime divides it, $10 = 2 \cdot 5$ is square-free, but $24 = 2^3 \cdot 3 = 2^2 \cdot 2 \cdot 3$ is not square-free.
We claim that any positive integer can be written uniquely as the product of a square and a square-free integer. Indeed, let $n \in \mathbb{N}$ and let $k$ be the largest natural number such that $k^2$ divides $n$. Then $n/k^2$ must be square-free, for if $n/k^2$ is divisible by a squared prime $p^2$, then $(pk)^2$ divides $n$ with $pk > k$, which is not possible by definition of $k$. Thus, any positive integer $n$ can be uniquely written as $n = k^2$ if $n$ is a perfect square, or
\[ n = k^2 \cdot p\, q \cdots r, \tag{6.39} \]
where $k \ge 1$ and where $p, q, \ldots, r$ are some primes less than or equal to $n$ that occur with multiplicity one. Using the fact that any positive integer can be uniquely written as the product of a square and a square-free integer, we shall prove that $\sum 1/p$ diverges.
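The decomposition $n = k^2 \cdot (\text{square-free})$ is easy to compute by following the argument above literally: take $k$ to be the largest natural number whose square divides $n$. The Python sketch below does this by brute force; the function names `decompose` and `is_square_free` are invented for this illustration, and no attempt is made at efficiency.

```python
from math import isqrt

def decompose(n):
    """Return (k, m) with n == k*k*m, where k is the largest integer with k^2 | n."""
    k = max(j for j in range(1, isqrt(n) + 1) if n % (j * j) == 0)
    return k, n // (k * k)

def is_square_free(m):
    """True if no square d^2 with d >= 2 divides m."""
    return all(m % (d * d) != 0 for d in range(2, isqrt(m) + 1))

print(decompose(24))   # 24 = 2^2 * 6
print(decompose(10))   # 10 is already square-free, so k = 1
print(decompose(36))   # 36 is a perfect square, so the square-free part is 1
```

Since every square greater than one is divisible by a squared prime, testing all $d \ge 2$ in `is_square_free` is equivalent to testing only primes.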
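6.7.3. Proof II: Proof by comparison. Here is Niven’s [164, 165] proof. We first prove that the product
\[ \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) \]
diverges to $\infty$ as $N \to \infty$, where the product is over all primes less than $N$. Let $2 < 3 < \cdots < m$ be all the primes less than $N$. Consider the product
\[ \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) = \Big( 1 + \frac{1}{2} \Big) \Big( 1 + \frac{1}{3} \Big) \cdots \Big( 1 + \frac{1}{m} \Big). \]
For example, if $N = 5$, then
\[ \prod_{p < 5} \Big( 1 + \frac{1}{p} \Big) = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{2 \cdot 3}. \]
If $N = 6$, then
\begin{align*}
\prod_{p < 6} \Big( 1 + \frac{1}{p} \Big) &= \Big( 1 + \frac{1}{2} \Big) \Big( 1 + \frac{1}{3} \Big) \Big( 1 + \frac{1}{5} \Big) \\
&= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{2 \cdot 3} + \frac{1}{2 \cdot 5} + \frac{1}{3 \cdot 5} + \frac{1}{2 \cdot 3 \cdot 5}.
\end{align*}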

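Using induction on $N$, we can always write
\[ \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) = 1 + \sum_{p < N} \frac{1}{p} + \sum_{p, q < N} \frac{1}{p \cdot q} + \cdots + \sum_{p, q, \ldots, r < N} \frac{1}{p \cdot q \cdots r}, \]
where the $k$-th sum on the right is the sum over all reciprocals of the form $\frac{1}{p_1 \cdot p_2 \cdots p_k}$ with $p_1, \ldots, p_k$ distinct primes less than $N$. Thus,
\begin{align*}
\prod_{p < N} \Big( 1 + \frac{1}{p} \Big) \cdot \sum_{k < N} \frac{1}{k^2} &= \sum_{k < N} \frac{1}{k^2} + \sum_{k < N} \sum_{p < N} \frac{1}{k^2 p} \\
&\quad + \sum_{k < N} \sum_{p, q < N} \frac{1}{k^2 \cdot p \cdot q} + \cdots + \sum_{k < N} \sum_{p, q, \ldots, r < N} \frac{1}{k^2 \cdot p \cdot q \cdots r}.
\end{align*}
By our discussion on square-free numbers around (6.39), the right-hand side contains every number of the form $1/n$ where $n < N$ (and many other numbers too). In particular,
\[ \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) \cdot \sum_{k < N} \frac{1}{k^2} \ge \sum_{n < N} \frac{1}{n}. \tag{6.40} \]
From this inequality, we shall prove that $\sum 1/p$ diverges. To this end, we know that $\sum_{k=1}^\infty 1/k^2$ converges while $\sum_{n=1}^\infty 1/n$ diverges, so it follows that
\[ \lim_{N \to \infty} \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) = \infty. \]
To relate this product to the sum $\sum 1/p$, note that
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \ge 1 + x \]
for $x \ge 0$; in fact, this inequality holds for all $x \in \mathbb{R}$ by Theorem 4.29. Hence,
\[ \prod_{p < N} \Big( 1 + \frac{1}{p} \Big) \le \prod_{p < N} \exp(1/p) = \exp\Big( \sum_{p < N} \frac{1}{p} \Big). \]
Since the left-hand side increases without bound as $N \to \infty$, so must the sum $\sum_{p < N} 1/p$. This ends Proof II; see Problem 2 for a related proof.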
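The two inequalities driving Proof II, namely (6.40) and the bound of the product by $\exp\big(\sum_{p<N} 1/p\big)$, can be checked numerically for small $N$. The Python sketch below uses trial division to list primes; the names `primes_below` and `proof_two_quantities` are hypothetical and exist only for this illustration.

```python
import math

def primes_below(n):
    """All primes p < n, found by trial division."""
    return [p for p in range(2, n)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]

def proof_two_quantities(N):
    """Return (product, left side of (6.40), right side of (6.40), exp of prime sum)."""
    ps = primes_below(N)
    product = math.prod(1 + 1 / p for p in ps)
    lhs = product * sum(1 / k**2 for k in range(1, N))
    rhs = sum(1 / n for n in range(1, N))
    return product, lhs, rhs, math.exp(sum(1 / p for p in ps))

for N in (10, 100, 1000):
    product, lhs, rhs, bound = proof_two_quantities(N)
    print(N, product, lhs >= rhs, product <= bound)
```

The product grows very slowly with $N$, which reflects how slowly $\sum 1/p$ diverges.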

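6.7.4. Proof III: Another proof by comparison. This is Gilfeather and Meister’s argument [130]. The first step is to prove that for any natural number $N > 1$, we have
\[ \prod_{p < N} \frac{p}{p-1} \ge \sum_{n=1}^{N-1} \frac{1}{n}. \]
In particular, this shows that $\prod_{p < N} \big( 1 - \frac{1}{p} \big)^{-1} \to \infty$ as $N \to \infty$. To prove the inequality, observe that
\[ \Big( 1 - \frac{1}{p} \Big)^{-1} = 1 + \frac{1}{p} + \frac{1}{p^2} + \frac{1}{p^3} + \cdots. \]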
Let $2 < 3 < \cdots < m$ be all the primes less than $N$. Then every natural number $n < N$ can be written in the form
\[ n = 2^i 3^j \cdots m^k \]
for some nonnegative integers $i, j, \ldots, k$. It follows that the product
\begin{align*}
\prod_{p < N} \Big( 1 - \frac{1}{p} \Big)^{-1} &= \Big( 1 - \frac{1}{2} \Big)^{-1} \Big( 1 - \frac{1}{3} \Big)^{-1} \cdots \Big( 1 - \frac{1}{m} \Big)^{-1} \\
&= \Big( 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots \Big) \Big( 1 + \frac{1}{3} + \frac{1}{3^2} + \frac{1}{3^3} + \cdots \Big) \cdots \Big( 1 + \frac{1}{m} + \frac{1}{m^2} + \frac{1}{m^3} + \cdots \Big),
\end{align*}
after multiplying out using Cauchy’s multiplication theorem (or rather its generalization to a product of more than two absolutely convergent series), contains all the numbers $\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots, \frac{1}{N-1}$ (and of course, many more numbers too). Thus,
\[ \prod_{p < N} \frac{p}{p-1} = \prod_{p < N} \Big( 1 - \frac{1}{p} \Big)^{-1} \ge \sum_{n=1}^{N-1} \frac{1}{n}, \tag{6.41} \]
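which proves our first step. Now recall from (4.29) that for any natural number $n$, we have
\[ \frac{1}{n+1} < \log(n+1) - \log n < \frac{1}{n}. \tag{6.42} \]
In particular, taking logarithms of both sides of (6.41), we get
\begin{align*}
\log\Big( \sum_{n=1}^{N-1} \frac{1}{n} \Big) &\le \log\Big( \prod_{p < N} \frac{p}{p-1} \Big) \\
&= \sum_{p < N} \big( \log p - \log(p-1) \big) \le \sum_{p < N} \frac{1}{p-1} \le \sum_{p < N} \frac{2}{p},
\end{align*}
where we used that $p \le 2(p-1)$ (this is because $n \le 2(n-1)$ for all natural numbers $n > 1$). Since $\sum_{n=1}^{N-1} 1/n \to \infty$ as $N \to \infty$, $\log\big( \sum_{n=1}^{N-1} 1/n \big) \to \infty$ as $N \to \infty$ as well, so the sum $\sum 1/p$ must diverge.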
Exercises 6.7.
1. Let $s_n = 1/2 + 1/3 + \cdots + 1/p_n$ (where $p_n$ is the $n$-th prime) be the $n$-th partial sum of $\sum 1/p$. We know that $s_n \to \infty$ as $n \to \infty$. However, it turns out that $s_n \to \infty$ avoiding all integers! Prove this. Suggestion: Multiply $s_n$ by $2 \cdot 3 \cdots p_{n-1}$.
2. Niven’s proof can be slightly modified to avoid using the square-free fact. Derive the inequality (6.40) (which, as shown in the main text, implies that $\sum 1/p$ diverges) by proving that for any prime $p$,
\[ \Big( 1 + \frac{1}{p} \Big) \cdot \sum_{k=0}^n \frac{1}{p^{2k}} = \sum_{k=0}^{2n+1} \frac{1}{p^k}. \]

3. Here is another proof that is similar to Gilfeather and Meister’s argument where we replace the inequality (6.42) with the following argument.
(i) Prove that
\[ \frac{1}{1 - x/2} \le e^x \quad \text{for all } 0 \le x \le 1. \tag{6.43} \]
Suggestion: Prove that $e^{-x} \le 1 - x/2$ using the series expansion for $e^{-x}$.
(ii) Taking logarithms of (6.43), prove that for any prime number $p$, we have
\[ -\log\Big( 1 - \frac{1}{p} \Big) = -\log\Big( 1 - \frac{2/p}{2} \Big) \le \frac{2}{p}. \]
6.8. COMPOSITION OF POWER SERIES AND BERNOULLI AND EULER NUMBERS 325

(iii) Prove that
\[ \frac{1}{2} \sum_{p < N} \log\Big( \frac{p}{p-1} \Big) \le \sum_{p < N} \frac{1}{p}. \]
(iv) Finally, use (6.41) as in the main text to prove that $\sum 1/p$ diverges.
4. Here’s Vanden Eynden’s proof [231]. Assume that $\sum 1/p$ converges. Then we can choose an $N$ such that $\alpha := \sum_{p > N} 1/p < 1/2$.
(i) For $x \ge 1$, let $M_x$ be the set of all natural numbers $1 \le n \le x$ such that $n = 1$ or $n = p_1 \cdots p_k$ where the $p_j$’s are prime and $p_j > N$. Prove that
\[ \lim_{x \to \infty} \sum_{n \in M_x} \frac{1}{n} = \infty. \]
(ii) By (i), we can choose $x$ such that $\beta := \sum_{n \in M_x} \frac{1}{n} > 2$. Prove that $\beta - 1 \le \alpha \cdot \beta$.
(iii) Deduce that $1 - \beta^{-1} \le \alpha$ and use this fact, together with the assumptions that $\alpha < 1/2$ and $\beta > 2$, to derive a contradiction.
5. Here is Paul Erdős’ (1913–1996) celebrated proof [64]. Assume that $\sum 1/p$ converges. Then we can choose an $N$ such that $\sum_{p > N} 1/p < 1/2$; derive a contradiction as follows.
(i) For any $x \in \mathbb{N}$, let $A_x$ be the set of all integers $1 \le n \le x$ such that $n = 1$ or all the prime factors of $n$ are $\le N$; that is, $n = p_1 \cdots p_k$ where the $p_j$’s are prime and $p_j \le N$. Given $n \in A_x$, we can write $n = k^2 m$ where $m$ is square free. Prove that $k \le \sqrt{x}$. From this, deduce that
\[ \# A_x \le C \sqrt{x}, \]
where $\# A_x$ denotes the number of elements in the set $A_x$ and $C$ is a constant (you can take $C$ to equal the number of square free integers $m \le N$).
(ii) Given $x \in \mathbb{N}$ and a prime $p$, prove that the number of integers $1 \le n \le x$ divisible by $p$ is no more than $x/p$.
(iii) Given $x \in \mathbb{N}$, prove that $x - \# A_x$ equals the number of integers $1 \le n \le x$ that are divisible by some prime $p > N$. From this fact and Part (ii) together with our assumption that $\sum_{p > N} 1/p < 1/2$, prove that
\[ x - \# A_x < \frac{x}{2}. \]
(iv) Using (iii) and the inequality $\# A_x \le C \sqrt{x}$ you proved in Part (i), conclude that for any $x \in \mathbb{N}$, we have
\[ \sqrt{x} \le 2C. \]
From this derive a contradiction.

6.8. Composition of power series and Bernoulli and Euler numbers


We’ve kept you in suspense long enough concerning the extraordinary Bernoulli and Euler numbers, so in this section we finally get to these fascinating numbers.
6.8.1. Composition and division of power series. The Bernoulli and Euler numbers come up when dividing power series, so before we do anything, we need to understand division of power series, and to understand this we first need to consider the composition of power series. The following theorem basically says that the composition of power series is again a power series.
Theorem 6.33 (Power series composition theorem). If $f(z)$ and $g(z)$ are power series, then the composition $f(g(z))$ can be written as a power series that is valid for all $z \in \mathbb{C}$ such that
\[ \sum_{n=0}^\infty |a_n z^n| < \text{the radius of convergence of } f, \]
where $g(z) = \sum_{n=0}^\infty a_n z^n$.
Proof. Let $f(z) = \sum_{n=0}^\infty b_n z^n$ have radius of convergence $R$ and let $g(z) = \sum_{n=0}^\infty a_n z^n$ have radius of convergence $r$. Then by Cauchy or Mertens’ multiplication theorem, for each $m$, we can write $g(z)^m$ as a power series:
\[ g(z)^m = \Big( \sum_{n=0}^\infty a_n z^n \Big)^m = \sum_{n=0}^\infty a_{mn} z^n, \qquad |z| < r. \]
Thus,
\[ f(g(z)) = \sum_{m=0}^\infty b_m\, g(z)^m = \sum_{m=0}^\infty \sum_{n=0}^\infty b_m a_{mn} z^n. \]
If we are allowed to interchange the order of summation in $f(g(z))$, then our result is proved:
\[ f(g(z)) = \sum_{n=0}^\infty \sum_{m=0}^\infty b_m a_{mn} z^n = \sum_{n=0}^\infty c_n z^n, \quad \text{where } c_n = \sum_{m=0}^\infty b_m a_{mn}. \]
Thus, we can focus on interchanging the order of summation in $f(g(z))$. Assume henceforth that
\[ \xi := \sum_{n=0}^\infty |a_n z^n| = \sum_{n=0}^\infty |a_n|\, |z|^n < R = \text{the radius of convergence of } f; \]
in particular, since $f(\xi) = \sum_{m=0}^\infty b_m \xi^m$ is absolutely convergent,
\[ \sum_{m=0}^\infty |b_m|\, \xi^m < \infty. \tag{6.44} \]
Now according to Cauchy’s double series theorem, we can interchange the order of summation as long as we can show that
\[ \sum_{m=0}^\infty \sum_{n=0}^\infty |b_m a_{mn} z^n| = \sum_{m=0}^\infty \sum_{n=0}^\infty |b_m|\, |a_{mn}|\, |z|^n < \infty. \]
To prove this, we first claim that the inner summation satisfies the inequality
\[ \sum_{n=0}^\infty |a_{mn}|\, |z|^n \le \xi^m. \tag{6.45} \]
To see this, consider the case $m = 2$. Recall that the coefficients $a_{2n}$ are defined via the Cauchy product:
\[ g(z)^2 = \Big( \sum_{n=0}^\infty a_n z^n \Big)^2 = \sum_{n=0}^\infty a_{2n} z^n \quad \text{where } a_{2n} = \sum_{k=0}^n a_k a_{n-k}. \]
Thus, $|a_{2n}| \le \sum_{k=0}^n |a_k|\, |a_{n-k}|$. On the other hand, we can express $\xi^2$ via the Cauchy product:
\[ \xi^2 = \Big( \sum_{n=0}^\infty |a_n|\, |z|^n \Big)^2 = \sum_{n=0}^\infty \alpha_n |z|^n \quad \text{where } \alpha_n = \sum_{k=0}^n |a_k|\, |a_{n-k}|. \]
Hence, $|a_{2n}| \le \sum_{k=0}^n |a_k|\, |a_{n-k}| = \alpha_n$, so
\[ \sum_{n=0}^\infty |a_{2n}|\, |z|^n \le \sum_{n=0}^\infty \alpha_n |z|^n = \xi^2, \]
which proves (6.45) for $m = 2$. An induction argument shows that (6.45) holds for all $m$. Finally, using (6.45) and (6.44) we see that
\[ \sum_{m=0}^\infty \sum_{n=0}^\infty |b_m a_{mn} z^n| = \sum_{m=0}^\infty \sum_{n=0}^\infty |b_m|\, |a_{mn}|\, |z|^n \le \sum_{m=0}^\infty |b_m|\, \xi^m < \infty, \]
which shows that we can interchange the order of summation in $f(g(z))$ and completes our proof. □

We already know (by Mertens’ multiplication theorem for instance) that the product of two power series is again a power series. As a consequence of the following theorem, we get the same statement for division.
Theorem 6.34 (Power series division theorem). If $f(z)$ and $g(z)$ are power series with positive radii of convergence and with $g(0) \ne 0$, then $f(z)/g(z)$ is also a power series with positive radius of convergence.
Proof. Since $f(z)/g(z) = f(z) \cdot (1/g(z))$ and we know that the product of two power series is a power series, all we have to do is show that $1/g(z)$ is a power series. To this end, let $g(z) = \sum_{n=0}^\infty a_n z^n$ and define
\[ \tilde{g}(z) := \frac{1}{a_0}\, g(z) - 1 = \sum_{n=1}^\infty \alpha_n z^n, \]
where $\alpha_n = \frac{a_n}{a_0}$ and where we recall that $a_0 = g(0) \ne 0$. Then $\tilde{g}$ has a positive radius of convergence and $\tilde{g}(0) = 0$. Now let $h(z) := \frac{1}{a_0 (1 + z)}$, which can be written as a geometric series with radius of convergence 1. Note that for $|z|$ small, $\sum_{n=1}^\infty |\alpha_n|\, |z|^n < 1$ (why?), thus by the previous theorem, for such $z$,
\[ \frac{1}{g(z)} = \frac{1}{a_0 (\tilde{g}(z) + 1)} = h(\tilde{g}(z)) \]
has a power series expansion with a positive radius of convergence. □
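Once we know that $1/g(z)$ is a power series $\sum h_n z^n$, its coefficients can be computed recursively. The Python sketch below does not follow the composition argument of the proof; instead it uses the Cauchy product: since $g(z) \cdot (1/g(z)) = 1$, we must have $a_0 h_0 = 1$ and $\sum_{k=0}^n a_k h_{n-k} = 0$ for $n \ge 1$. The function name `reciprocal_coefficients` and the sample polynomial $g(z) = 1 - z - z^2$ are choices made for this example.

```python
from fractions import Fraction

def reciprocal_coefficients(a, count):
    """First `count` coefficients h_n of 1/g, where g(z) = a[0] + a[1] z + ... and a[0] != 0."""
    if a[0] == 0:
        raise ValueError("g(0) must be nonzero")
    h = [Fraction(1) / a[0]]
    for n in range(1, count):
        # The coefficient of z^n in g * (1/g) vanishes: sum_{k=0}^{n} a_k h_{n-k} = 0.
        s = sum(a[k] * h[n - k] for k in range(1, min(n, len(a) - 1) + 1))
        h.append(-s / a[0])
    return h

# 1/(1 - z - z^2) has the Fibonacci numbers 1, 1, 2, 3, 5, ... as coefficients.
fib = reciprocal_coefficients([1, -1, -1], 8)
print([int(c) for c in fib])
```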

6.8.2. Bernoulli numbers. See [120], [54], [206], or [85] for more information on Bernoulli numbers. Since
\[ \frac{e^z - 1}{z} = \frac{1}{z} \cdot \sum_{n=1}^\infty \frac{1}{n!} z^n = \sum_{n=1}^\infty \frac{1}{n!} z^{n-1} = \sum_{n=0}^\infty \frac{1}{(n+1)!} z^n \]
has a power series expansion and equals 1 at $z = 0$, by our division of power series theorem, the quotient $1/((e^z - 1)/z) = z/(e^z - 1)$ also has a power series expansion near $z = 0$. It is customary to denote its coefficients by $B_n/n!$, in which case we can write
\[ \frac{z}{e^z - 1} = \sum_{n=0}^\infty \frac{B_n}{n!} z^n \tag{6.46} \]

where the series has a positive radius of convergence. The numbers $B_n$ are called the Bernoulli numbers after Jacob (Jacques) Bernoulli (1654–1705) who discovered them while searching for formulas involving powers of integers; see Problems 3 and 4. We can find a remarkable symbolic equation for these Bernoulli numbers as follows. First, we multiply both sides of (6.46) by $(e^z - 1)/z$ and use Mertens’ multiplication theorem to get
\[ 1 = \Big( \sum_{n=0}^\infty \frac{B_n}{n!} z^n \Big) \cdot \Big( \sum_{n=0}^\infty \frac{1}{(n+1)!} z^n \Big) = \sum_{n=0}^\infty \Big( \sum_{k=0}^n \frac{B_k}{k!} \cdot \frac{1}{(n-k+1)!} \Big) z^n. \]
By the identity theorem, the $n = 0$ term on the right must equal 1 while all other terms must vanish. The $n = 0$ term on the right is just $B_0$, so $B_0 = 1$, and for $n \ge 1$, we must have $\sum_{k=0}^n \frac{B_k}{k!} \cdot \frac{1}{(n+1-k)!} = 0$. Multiplying this by $(n+1)!$ we get
\[ 0 = \sum_{k=0}^n \frac{B_k}{k!} \cdot \frac{(n+1)!}{(n+1-k)!} = \sum_{k=0}^n \frac{(n+1)!}{k!\,(n+1-k)!} \cdot B_k = \sum_{k=0}^n \binom{n+1}{k} B_k, \]
and adding $B_{n+1} = \binom{n+1}{n+1} B_{n+1}$ to both sides of this equation, we get
\[ B_{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} B_k. \]
The right-hand side might look familiar from the binomial formula. Recall from the binomial formula that for any complex number $a$, we have
\[ (a + 1)^{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} a^k \cdot 1^{n+1-k} = \sum_{k=0}^{n+1} \binom{n+1}{k} a^k. \]
Notice that the right-hand side of this expression is exactly the right-hand side of the previous equation if we put $a = B$ and make the superscript $k$ into a subscript $k$. Thus, if we use the notation $\doteq$ to mean “equals after making superscripts into subscripts”, then we can write
\[ B^{n+1} \doteq (B + 1)^{n+1}, \quad n = 1, 2, 3, \ldots \quad \text{with } B_0 = 1. \tag{6.47} \]
Using the identity (6.47), one can in principle find all the Bernoulli numbers: When $n = 1$, we see that
\[ B^2 \doteq (B + 1)^2 = B^2 + 2B^1 + 1 \implies 0 = 2B_1 + 1 \implies B_1 = -\frac{1}{2}. \]
When $n = 2$, we see that
\[ B^3 \doteq (B + 1)^3 = B^3 + 3B^2 + 3B^1 + 1 \implies 0 = 3B_2 + 3B_1 + 1 \implies B_2 = \frac{1}{6}. \]
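Here is a partial list through $B_{15}$:
\begin{align*}
&B_0 = 1, \quad B_1 = -\frac{1}{2}, \quad B_2 = \frac{1}{6}, \quad B_3 = 0, \\
&B_4 = -\frac{1}{30}, \quad B_5 = B_7 = B_9 = B_{11} = B_{13} = B_{15} = 0, \\
&B_6 = \frac{1}{42}, \quad B_8 = -\frac{1}{30}, \quad B_{10} = \frac{5}{66}, \quad B_{12} = -\frac{691}{2730}, \quad B_{14} = \frac{7}{6}.
\end{align*}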
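The table above can be reproduced mechanically from the recurrence $\sum_{k=0}^n \binom{n+1}{k} B_k = 0$ for $n \ge 1$ together with $B_0 = 1$, solved for $B_n$. The following Python sketch does this with exact rational arithmetic; the function name `bernoulli_numbers` is a name chosen for this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] using sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        # Solve the recurrence for B_n; note that C(n+1, n) = n + 1.
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

B = bernoulli_numbers(16)
for n, value in enumerate(B):
    print(f"B_{n} = {value}")
```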
These numbers are rational, but besides this fact, there is no known regular pattern these numbers conform to. However, we can easily deduce that all Bernoulli numbers with odd index greater than one are zero. Indeed, we can rewrite (6.46) as
\[ \frac{z}{e^z - 1} + \frac{z}{2} = 1 + \sum_{n=2}^\infty \frac{B_n}{n!} z^n. \tag{6.48} \]
The fractions on the left-hand side can be combined into one fraction
\[ \frac{z}{e^z - 1} + \frac{z}{2} = \frac{z (e^z + 1)}{2 (e^z - 1)} = \frac{z (e^{z/2} + e^{-z/2})}{2 (e^{z/2} - e^{-z/2})}, \tag{6.49} \]
which is an even function of $z$. Thus, (see Exercise 1 in Section 6.4)
\[ B_{2n+1} = 0, \qquad n = 1, 2, 3, \ldots. \tag{6.50} \]
Other properties are given in the exercises (see e.g. Problem 3).

6.8.3. Trigonometric functions. We already know the power series expansions for $\sin z$ and $\cos z$. It turns out that the power series expansions of the other trigonometric functions involve Bernoulli numbers! For example, to find the expansion for $\cot z$, we replace $z$ with $2iz$ in (6.48) and (6.49) to get
\[ \frac{iz (e^{iz} + e^{-iz})}{e^{iz} - e^{-iz}} = 1 + \sum_{n=2}^\infty \frac{B_n}{n!} (2iz)^n = 1 + \sum_{n=1}^\infty (-1)^n \frac{B_{2n}}{(2n)!} (2z)^{2n}, \]
where we used that $B_3, B_5, B_7, \ldots$ all vanish in order to sum only over the even-indexed Bernoulli numbers. Since $\cot z = \cos z / \sin z$, using the definition of $\cos z$ and $\sin z$ in terms of $e^{\pm iz}$, we see that the left-hand side is exactly $z \cot z$. Thus, we have derived the formula
\[ z \cot z = \sum_{n=0}^\infty (-1)^n \frac{2^{2n} B_{2n}}{(2n)!} z^{2n}. \]

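From this formula, we can get the expansion for $\tan z$ by using the identity
\[
2 \cot(2z) = 2\, \frac{\cos 2z}{\sin 2z} = 2\, \frac{\cos^2 z - \sin^2 z}{2 \sin z \cos z} = \cot z - \tan z.
\]
Hence, dividing the series for $z \cot z$ by $z$,
\[
\tan z = \cot z - 2 \cot(2z) = \sum_{n=0}^{\infty} (-1)^n \frac{2^{2n} B_{2n}}{(2n)!} z^{2n-1} - 2 \sum_{n=0}^{\infty} (-1)^n \frac{2^{2n} B_{2n}}{(2n)!} (2z)^{2n-1},
\]
which, after combining the terms on the right (the $n = 0$ terms cancel), takes the form
\[
\tan z = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{2^{2n} (2^{2n} - 1) B_{2n}}{(2n)!} z^{2n-1}.
\]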
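As a sanity check on the expansion of $\tan z$, here is a small sketch that builds the coefficients from exact Bernoulli numbers and compares a truncated series with `math.tan`. The helper names, the truncation at 20 terms, and the test point $z = 0.5$ are illustrative choices, not part of the text.

```python
from fractions import Fraction
from math import comb, factorial, tan

def bernoulli_numbers(n_max):
    """Exact B_0..B_{n_max} from sum_{k=0}^{n} C(n+1, k) B_k = 0 (n >= 1)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def tan_coefficient(n, B):
    """Coefficient of z^(2n-1) in tan z: (-1)^(n-1) 2^(2n) (2^(2n) - 1) B_{2n} / (2n)!."""
    return (-1) ** (n - 1) * Fraction(4 ** n * (4 ** n - 1), factorial(2 * n)) * B[2 * n]

B = bernoulli_numbers(40)
coeffs = [tan_coefficient(n, B) for n in range(1, 21)]
print([str(c) for c in coeffs[:4]])      # leading coefficients of tan z

z = 0.5                                   # well inside the radius of convergence pi/2
approx = sum(float(c) * z ** (2 * n - 1) for n, c in enumerate(coeffs, start=1))
print(approx, tan(z))
```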

In Problem 1, we derive the power series expansion of $\csc z$. In conclusion we have power series expansions for $\sin z$, $\cos z$, $\tan z$, $\cot z$, $\csc z$. What about $\sec z$?

6.8.4. The Euler numbers. It turns out that the expansion for $\sec z$ involves the Euler numbers, which are defined in a similar way as the Bernoulli numbers. By the division of power series theorem, the function $2e^z/(e^{2z} + 1)$ has a power series expansion near zero. It is customary to denote its coefficients by $E_n/n!$, so
\[
\frac{2e^z}{e^{2z} + 1} = \sum_{n=0}^{\infty} \frac{E_n}{n!} z^n, \tag{6.51}
\]
where the series has a positive radius of convergence. The numbers $E_n$ are called the Euler numbers. We can get the missing expansion for $\sec z$ as follows. First, observe that
\[
\sum_{n=0}^{\infty} \frac{E_n}{n!} z^n = \frac{2e^z}{e^{2z} + 1} = \frac{2}{e^z + e^{-z}} = \frac{1}{\cosh z} = \operatorname{sech} z,
\]
where $\operatorname{sech} z := 1/\cosh z$ is the hyperbolic secant. Since $\operatorname{sech} z$ is an even function (that is, $\operatorname{sech}(-z) = \operatorname{sech} z$) it follows that all $E_n$ with $n$ odd vanish. Hence,
\[
\operatorname{sech} z = \sum_{n=0}^{\infty} \frac{E_{2n}}{(2n)!} z^{2n}. \tag{6.52}
\]
In particular, putting $iz$ for $z$ in (6.52) and using that $\cosh(iz) = \cos z$, we get the missing expansion for $\sec z$:
\[
\sec z = \sum_{n=0}^{\infty} (-1)^n \frac{E_{2n}}{(2n)!} z^{2n}.
\]

Just as with the Bernoulli numbers, we can derive a symbolic equation for the Euler numbers. To do so, we multiply (6.52) by $\cosh z = \sum_{n=0}^{\infty} \frac{1}{(2n)!} z^{2n}$ and use Mertens' multiplication theorem to get
\[
1 = \left( \sum_{n=0}^{\infty} \frac{E_{2n}}{(2n)!} z^{2n} \right) \cdot \left( \sum_{n=0}^{\infty} \frac{1}{(2n)!} z^{2n} \right) = \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} \frac{E_{2k}}{(2k)!} \cdot \frac{1}{(2n-2k)!} \right) z^{2n}.
\]
By the identity theorem, the $n = 0$ term on the right must equal 1 while all other terms must vanish. The $n = 0$ term on the right is just $E_0$, so $E_0 = 1$, and for $n \ge 1$, we must have $\sum_{k=0}^{n} \frac{E_{2k}}{(2k)!} \cdot \frac{1}{(2n-2k)!} = 0$. Multiplying this by $(2n)!$ we get
\[
0 = \sum_{k=0}^{n} \frac{E_{2k}}{(2k)!} \cdot \frac{(2n)!}{(2n-2k)!} = \sum_{k=0}^{n} \frac{(2n)!}{(2k)!\,(2n-2k)!} \cdot E_{2k}. \tag{6.53}
\]
Now from the binomial formula, for any complex number $a$, we have
\begin{align*}
(a+1)^{2n} + (a-1)^{2n} &= \sum_{k=0}^{2n} \frac{(2n)!}{k!\,(2n-k)!}\, a^k + \sum_{k=0}^{2n} \frac{(2n)!}{k!\,(2n-k)!}\, a^k (-1)^{2n-k} \\
&= \sum_{k=0}^{2n} \frac{(2n)!}{k!\,(2n-k)!}\, a^k + \sum_{k=0}^{2n} \frac{(2n)!}{k!\,(2n-k)!}\, a^k (-1)^{k} \\
&= 2 \sum_{k=0}^{n} \frac{(2n)!}{(2k)!\,(2n-2k)!}\, a^{2k},
\end{align*}
since all the odd terms cancel and the even terms double. Notice that the right-hand side of this expression is exactly twice the right-hand side of (6.53) if we put $a = E$ and make the superscript $2k$ into a subscript $2k$. Thus,
\[
(E+1)^{2n} + (E-1)^{2n} = 0, \quad n = 1, 2, \ldots \quad \text{with } E_0 = 1 \text{ and } E_{\text{odd}} = 0. \tag{6.54}
\]
Using the identity (6.54), one can in principle find all the Euler numbers. When $n = 1$, we see that
\[
(E^2 + 2E^1 + 1) + (E^2 - 2E^1 + 1) = 0 \implies 2E_2 + 2 = 0 \implies E_2 = -1.
\]
Here is a partial list through $E_{12}$:
\[
E_0 = 1, \quad E_1 = E_3 = E_5 = \cdots = 0 \ (E_{\text{odd}} = 0), \quad E_2 = -1, \quad E_4 = 5,
\]
\[
E_6 = -61, \quad E_8 = 1385, \quad E_{10} = -50{,}521, \quad E_{12} = 2{,}702{,}765, \ \ldots.
\]
These numbers are all integers, but beyond this fact there is no known regular pattern that they conform to.
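The recursion hidden in (6.53) is easy to run by machine. The sketch below assumes only $E_0 = 1$ and $\sum_{k=0}^{n} \binom{2n}{2k} E_{2k} = 0$ for $n \ge 1$; the function name `even_euler_numbers` is an illustrative choice.

```python
from math import comb

def even_euler_numbers(n_max):
    """Return [E_0, E_2, ..., E_{2 n_max}] using sum_{k=0}^{n} C(2n, 2k) E_{2k} = 0 for n >= 1."""
    E = [1]                                   # E[k] stores E_{2k}
    for n in range(1, n_max + 1):
        # The k = n term is C(2n, 2n) E_{2n} = E_{2n}; move everything else across.
        E.append(-sum(comb(2 * n, 2 * k) * E[k] for k in range(n)))
    return E

E = even_euler_numbers(6)
for k, value in enumerate(E):
    print(f"E_{2 * k} = {value}")
```

Since every binomial coefficient is an integer, the recursion also makes it visible why all the Euler numbers are integers.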
Exercises 6.8.
1. Recall that $\csc z = 1/\sin z$. Prove that $\csc z = \cot z + \tan(z/2)$, and from this identity deduce that
\[
z \csc z = \sum_{n=0}^{\infty} (-1)^{n-1} \frac{(2^{2n} - 2)\, B_{2n}}{(2n)!} z^{2n}.
\]
2. (a) Let $f(z) = \sum a_n z^n$ and $g(z) = \sum b_n z^n$ with $b_0 \ne 0$ be power series with positive radii of convergence. Show that $f(z)/g(z) = \sum c_n z^n$ where $\{c_n\}$ is the sequence defined recursively as follows:
\[
c_0 = \frac{a_0}{b_0}, \qquad b_0 c_n = a_n - \sum_{k=1}^{n} b_k c_{n-k}.
\]

(b) Use Part (a) to find the first few coefficients of the expansion for tan z = sin z/ cos z.
3. (Cf. [120, p. 526], which is reproduced in [166].) In this and the next problem we give an elegant application of the theory of Bernoulli numbers to determine the sum of the $k$-th powers of the first $n$ integers, Bernoulli's original motivation for his numbers.
(i) For $n \in \mathbb{N}$, derive the formula
\[
1 + e^z + e^{2z} + \cdots + e^{nz} = \frac{z}{e^z - 1} \cdot \frac{e^{(n+1)z} - 1}{z}.
\]
(ii) Writing each side of this identity as a power series (on the right, you need to use the Cauchy product), derive the formula
\[
1^k + 2^k + \cdots + n^k = \sum_{j=0}^{k} \binom{k}{j} B_j\, \frac{(n+1)^{k+1-j}}{k+1-j}, \qquad k = 1, 2, \ldots. \tag{6.55}
\]
Plug in $k = 1, 2, 3$ to derive some pretty formulas!


4. Here's another proof of (6.55) that is aesthetically appealing.
(i) Prove that for a complex number $a$ and natural numbers $k, n$,
\[
(n + 1 + a)^{k+1} - (n + a)^{k+1} = \sum_{j=1}^{k+1} \binom{k+1}{j} n^{k+1-j} \left( (a+1)^j - a^j \right).
\]
(ii) Replacing $a$ with $B$, prove that
\[
1^k + 2^k + \cdots + n^k = \frac{1}{k+1} \left( (n + 1 + B)^{k+1} - B^{k+1} \right).
\]
Suggestion: Look for a telescoping sum and recall that $(B+1)^j = B^j$ for $j \ge 2$.
5. The $n$-th Bernoulli polynomial $B_n(t)$ is, by definition, $n!$ times the coefficient of $z^n$ in the power series expansion in $z$ of the function $f(z, t) := z e^{zt}/(e^z - 1)$; that is,
\[
\frac{z e^{zt}}{e^z - 1} = \sum_{n=0}^{\infty} \frac{B_n(t)}{n!} z^n. \tag{6.56}
\]
(a) Prove that $B_n(t) = \sum_{k=0}^{n} \binom{n}{k} B_k\, t^{n-k}$ where the $B_k$'s are the Bernoulli numbers. Thus, the first few Bernoulli polynomials are
\[
B_0(t) = 1, \quad B_1(t) = t - \frac{1}{2}, \quad B_2(t) = t^2 - t + \frac{1}{6}, \quad B_3(t) = t^3 - \frac{3}{2} t^2 + \frac{1}{2} t.
\]
(b) Prove that $B_n(0) = B_n$ for $n = 0, 1, \ldots$ and that $B_n(0) = B_n(1) = B_n$ for $n \ne 1$. Suggestion: Show that $f(z, 1) = z + f(z, 0)$.
(c) Prove that $B_n(t+1) - B_n(t) = n t^{n-1}$ for $n = 0, 1, 2, \ldots$. Suggestion: Show that $f(z, t+1) - f(z, t) = z e^{zt}$.
(d) Prove that $B_{2n+1}(0) = 0$ for $n = 1, 2, \ldots$ and $B_{2n+1}(1/2) = 0$ for $n = 0, 1, \ldots$.

6.9. The logarithmic, binomial, arctangent series, and γ

From elementary calculus, you might have seen the logarithmic, binomial, and arctangent series (discovered by Nicolaus Mercator (1620–1687), Sir Isaac Newton (1643–1727), and Madhava of Sangamagrama (1350–1425), respectively):
\[
\log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} x^n, \qquad (1 + x)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n, \qquad \arctan x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1},
\]
where $\alpha \in \mathbb{R}$. (Below we'll discuss the meaning of $\binom{\alpha}{n}$.) I can bet that you used calculus (derivatives and integrals) to derive these formulæ. In this section we'll derive even more general complex versions of these formulæ without derivatives!

6.9.1. The binomial coefficients. From our familiar binomial theorem, we know that for any $z \in \mathbb{C}$ and $k \in \mathbb{N}$, we have $(1 + z)^k = \sum_{n=0}^{k} \binom{k}{n} z^n$, where $\binom{k}{0} := 1$ and for $n = 1, 2, \ldots, k$,
\[
\binom{k}{n} := \frac{k!}{n!\,(k-n)!} = \frac{1 \cdot 2 \cdots k}{n! \cdot 1 \cdot 2 \cdots (k-n)} = \frac{k(k-1) \cdots (k-n+1)}{n!}. \tag{6.57}
\]
The formula $(1 + z)^k = \sum_{n=0}^{k} \binom{k}{n} z^n$ trivially holds when $k = 0$ too. Another way to express this formula is
\[
(1 + z)^k = 1 + \sum_{n=1}^{k} \frac{k(k-1) \cdots (k-n+1)}{n!} z^n.
\]
With this motivation, given any complex number $\alpha$, we define the binomial coefficient $\binom{\alpha}{n}$ for any nonnegative integer $n$ as follows: $\binom{\alpha}{0} = 1$ and for $n > 0$,
\[
\binom{\alpha}{n} = \frac{\alpha(\alpha-1) \cdots (\alpha-n+1)}{n!}. \tag{6.58}
\]
Note that if $\alpha = 0, 1, 2, \ldots$, then all $\binom{\alpha}{n}$ vanish for $n \ge \alpha + 1$ and $\binom{\alpha}{n}$ is exactly the usual binomial coefficient (6.57). In the following lemma, we derive an identity that will be useful later.

Lemma 6.35. For any $\alpha, \beta \in \mathbb{C}$, we have
\[
\binom{\alpha + \beta}{n} = \sum_{k=0}^{n} \binom{\alpha}{k} \binom{\beta}{n-k}, \qquad n = 0, 1, 2, \ldots.
\]

Proof. Throughout this proof, we put $\mathbb{N}_0 := \{0, 1, 2, 3, \ldots\}$.

Step 1: First of all, our lemma holds when both $\alpha$ and $\beta$ are in $\mathbb{N}_0$. Indeed, if $\alpha = p$, $\beta = q$ are in $\mathbb{N}_0$, then expressing both sides of the identity $(1 + z)^{p+q} = (1 + z)^p (1 + z)^q$ using the binomial formula, we obtain
\begin{align*}
\sum_{n=0}^{p+q} \binom{p+q}{n} z^n &= \left( \sum_{k=0}^{p} \binom{p}{k} z^k \right) \cdot \left( \sum_{k=0}^{q} \binom{q}{k} z^k \right) \\
&= \sum_{n=0}^{p+q} \left( \sum_{k=0}^{n} \binom{p}{k} \binom{q}{n-k} \right) z^n,
\end{align*}
where at the last step we formed the Cauchy product of $(1 + z)^p (1 + z)^q$. By the identity theorem we must have
\[
\binom{p+q}{n} = \sum_{k=0}^{n} \binom{p}{k} \binom{q}{n-k}, \qquad \text{for all } p, q, n \in \mathbb{N}_0.
\]
Step 2: Assume now that $\beta = q \in \mathbb{N}_0$, $n \in \mathbb{N}_0$, and define $f : \mathbb{C} \to \mathbb{C}$ by
\[
f(z) := \binom{z + q}{n} - \sum_{k=0}^{n} \binom{z}{k} \binom{q}{n-k}.
\]
In view of the definition (6.58) of the binomial coefficient, it follows that $f(z)$ is a polynomial in $z$ of degree at most $n$. Moreover, by Step 1 we know that $f(p) = 0$ for all $p \in \mathbb{N}_0$. In particular, the polynomial $f(z)$ has more than $n$ roots. Therefore, $f(z)$ must be the zero polynomial, so in particular, given any $\alpha \in \mathbb{C}$, we have $f(\alpha) = 0$; that is,
\[
\binom{\alpha + q}{n} = \sum_{k=0}^{n} \binom{\alpha}{k} \binom{q}{n-k}, \qquad \text{for all } \alpha \in \mathbb{C},\ q, n \in \mathbb{N}_0.
\]
Step 3: Let $\alpha \in \mathbb{C}$, $n \in \mathbb{N}_0$, and define $g : \mathbb{C} \to \mathbb{C}$ by
\[
g(z) := \binom{\alpha + z}{n} - \sum_{k=0}^{n} \binom{\alpha}{k} \binom{z}{n-k}.
\]
As with the function $f(z)$ in Step 2, $g(z)$ is a polynomial in $z$ of degree at most $n$. Also, by Step 2 we know that $g(q) = 0$ for all $q \in \mathbb{N}_0$ and consequently, $g(z)$ must be the zero polynomial. In particular, given any $\beta \in \mathbb{C}$, we have $g(\beta) = 0$, which completes our proof.
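The identity of Lemma 6.35 can be spot-checked numerically for non-integer, even complex, parameters. This is only a floating-point sketch, not a proof; the helper `gen_binom` and the sample values of $\alpha$ and $\beta$ are arbitrary illustrative choices.

```python
from math import factorial

def gen_binom(alpha, n):
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-n+1)/n!, as in (6.58)."""
    product = 1
    for j in range(n):
        product *= alpha - j
    return product / factorial(n)

alpha, beta = 0.5 - 1.25j, -2.3 + 0.7j     # arbitrary complex test values
max_gap = 0.0
for n in range(8):
    lhs = gen_binom(alpha + beta, n)
    rhs = sum(gen_binom(alpha, k) * gen_binom(beta, n - k) for k in range(n + 1))
    max_gap = max(max_gap, abs(lhs - rhs))
print("largest discrepancy:", max_gap)
```

Any discrepancy printed is rounding error only, since the lemma asserts exact equality.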
6.9.2. The complex logarithm and binomial series. In Theorem 6.37 we shall derive (along with a power series for Log) the binomial series:
\[
(1 + z)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n = 1 + \alpha z + \frac{\alpha(\alpha-1)}{2!} z^2 + \cdots, \qquad |z| < 1. \tag{6.59}
\]
Let us define $f(\alpha, z) := \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n$. Our goal is to show that $f(\alpha, z) = (1 + z)^\alpha$ for all $\alpha \in \mathbb{C}$ and $|z| < 1$, where
\[
(1 + z)^\alpha := \exp(\alpha \operatorname{Log}(1 + z))
\]
with Log the principal logarithm of the complex number $1 + z$. If $\alpha = k = 0, 1, 2, \ldots$, then we already know that all the $\binom{k}{n}$ vanish for $n \ge k + 1$ and these binomial coefficients are the usual ones, so $f(k, z)$ converges with sum $f(k, z) = (1 + z)^k$. To see that $f(\alpha, z)$ converges for all other $\alpha$, assume that $\alpha \in \mathbb{C}$ is not a nonnegative integer. Then setting $a_n = \binom{\alpha}{n}$, we have
\[
\left| \frac{a_n}{a_{n+1}} \right| = \left| \frac{\alpha(\alpha-1) \cdots (\alpha-n+1)}{n!} \cdot \frac{(n+1)!}{\alpha(\alpha-1) \cdots (\alpha-n)} \right| = \frac{n+1}{|\alpha - n|},
\]
which approaches 1 as $n \to \infty$. Thus, the radius of convergence of $f(\alpha, z)$ is 1 (see (6.12)). In conclusion, $f(\alpha, z)$ is convergent for all $\alpha \in \mathbb{C}$ and $|z| < 1$.
We now prove the real versions of the logarithm series and the binomial series (6.59); see Theorem 6.37 below for the more general complex version. It is worth emphasizing that we do not use the advanced technology of the differential and integral calculus to derive these formulas!
Lemma 6.36. For all $x \in \mathbb{R}$ with $|x| < 1$, we have
\[
\log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} x^n
\]
and for all $\alpha \in \mathbb{C}$ and $x \in \mathbb{R}$ with $|x| < 1$, we have
\[
(1 + x)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!} x^2 + \cdots.
\]
Proof. We prove this lemma in three steps.

Step 1: We first show that $f(r, x) = (1 + x)^r$ for all $r = p/q \in \mathbb{Q}$ where $p, q \in \mathbb{N}$ with $q$ odd and $x \in \mathbb{R}$ with $|x| < 1$. To see this, observe that for any $z \in \mathbb{C}$ with $|z| < 1$, taking the Cauchy product of $f(\alpha, z)$ and $f(\beta, z)$ and using Lemma 6.35, we obtain
\[
f(\alpha, z) \cdot f(\beta, z) = \sum_{n=0}^{\infty} \left( \sum_{j=0}^{n} \binom{\alpha}{j} \binom{\beta}{n-j} \right) z^n = \sum_{n=0}^{\infty} \binom{\alpha + \beta}{n} z^n = f(\alpha + \beta, z).
\]
By induction it easily follows that
\[
f(\alpha_1, z) \cdot f(\alpha_2, z) \cdots f(\alpha_k, z) = f(\alpha_1 + \alpha_2 + \cdots + \alpha_k, z).
\]
Using this identity, we obtain
\[
f(1/q, z)^q = \underbrace{f(1/q, z) \cdots f(1/q, z)}_{q \text{ times}} = f(\underbrace{1/q + \cdots + 1/q}_{q \text{ times}}, z) = f(1, z) = 1 + z.
\]
Now put $z = x \in \mathbb{R}$ with $|x| < 1$ and let $q \in \mathbb{N}$ be odd. Then $f(1/q, x)^q = 1 + x$, so taking $q$-th roots, we get $f(1/q, x) = (1 + x)^{1/q}$. Here we used that every real number has a unique real $q$-th root, which holds because $q$ is odd. For $q$ even we could only conclude that $f(1/q, x) = \pm(1 + x)^{1/q}$ (unless we checked that $f(1/q, x)$ is positive, in which case we would get $f(1/q, x) = (1 + x)^{1/q}$). Therefore,
\begin{align*}
f(r, x) = f(p/q, x) &= f(\underbrace{1/q + \cdots + 1/q}_{p \text{ times}}, x) = \underbrace{f(1/q, x) \cdots f(1/q, x)}_{p \text{ times}} \\
&= f(1/q, x)^p = (1 + x)^{p/q} = (1 + x)^r.
\end{align*}

Step 2: Second, we prove that for any given $z \in \mathbb{C}$ with $|z| < 1$, $f(\alpha, z)$ can be written as a power series in $\alpha$ that converges for all $\alpha \in \mathbb{C}$:
\[
f(\alpha, z) = 1 + \sum_{m=1}^{\infty} a_m(z)\, \alpha^m;
\]
in particular, since we know that power series are continuous, $f(\alpha, z)$ is a continuous function of $\alpha \in \mathbb{C}$. Here, the coefficients $a_m(z)$ depend on $z$ (we'll see that they are power series in $z$) and we'll show that
\[
a_1(z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n. \tag{6.60}
\]
To prove these statements, note that for $n \ge 1$, $\alpha(\alpha-1) \cdots (\alpha-n+1)$ is a polynomial of degree $n$ in $\alpha$ with no constant term, so for $n \ge 1$ we can write
\[
\binom{\alpha}{n} = \frac{\alpha(\alpha-1) \cdots (\alpha-n+1)}{n!} = \sum_{m=1}^{n} a_{mn}\, \alpha^m, \tag{6.61}
\]
for some coefficients $a_{mn}$. Defining $a_{mn} = 0$ for $m = n+1, n+2, n+3, \ldots$, we can write $\binom{\alpha}{n} = \sum_{m=1}^{\infty} a_{mn}\, \alpha^m$. Hence,
\[
f(\alpha, z) = 1 + \sum_{n=1}^{\infty} \binom{\alpha}{n} z^n = 1 + \sum_{n=1}^{\infty} \left( \sum_{m=1}^{\infty} a_{mn}\, \alpha^m \right) z^n. \tag{6.62}
\]
To make this a power series in $\alpha$, we need to switch the order of summation, which we can do by Cauchy's double series theorem if we can demonstrate absolute convergence by showing that
\[
\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \left| a_{mn}\, \alpha^m z^n \right| = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|\, |\alpha|^m |z|^n < \infty.
\]
To verify this, we first observe that for all $\alpha \in \mathbb{C}$, we have
\[
\frac{\alpha(\alpha+1)(\alpha+2) \cdots (\alpha+n-1)}{n!} = \sum_{m=1}^{n} b_{mn}\, \alpha^m, \tag{6.63}
\]
where the $b_{mn}$'s are nonnegative real numbers. (This is certainly plausible because the numbers $1, 2, \ldots, n-1$ on the left each come with positive signs; in any case, this statement can be verified by induction, for instance.) We secondly observe that replacing $\alpha$ with $-\alpha$ in (6.61), we get
\begin{align*}
\sum_{m=1}^{n} a_{mn} (-1)^m \alpha^m &= \frac{-\alpha(-\alpha-1) \cdots (-\alpha-n+1)}{n!} \\
&= (-1)^n \frac{\alpha(\alpha+1) \cdots (\alpha+n-1)}{n!} = (-1)^n \sum_{m=1}^{n} b_{mn}\, \alpha^m.
\end{align*}
By the identity theorem, we have $a_{mn} (-1)^m = (-1)^n b_{mn}$. In particular, $|a_{mn}| = b_{mn}$ since $b_{mn} \ge 0$; therefore, in view of (6.63), we see that
\[
\sum_{m=1}^{\infty} |a_{mn}|\, |\alpha|^m = \sum_{m=1}^{n} |a_{mn}|\, |\alpha|^m = \sum_{m=1}^{n} b_{mn}\, |\alpha|^m = \frac{|\alpha|(|\alpha|+1) \cdots (|\alpha|+n-1)}{n!}.
\]
Therefore,
\[
\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|\, |\alpha|^m |z|^n = \sum_{n=1}^{\infty} \frac{|\alpha|(|\alpha|+1) \cdots (|\alpha|+n-1)}{n!}\, |z|^n.
\]

Using the now very familiar ratio test, it's easily checked that, since $|z| < 1$, the series on the right converges. Thus, we can iterate sums in (6.62) and conclude that
\[
f(\alpha, z) = 1 + \sum_{n=1}^{\infty} \left( \sum_{m=1}^{\infty} a_{mn}\, \alpha^m \right) z^n = 1 + \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} a_{mn}\, z^n \right) \alpha^m.
\]
Thus, $f(\alpha, z)$ is indeed a power series in $\alpha$. To prove (6.60), we just need to determine the coefficient of $\alpha^1$ in (6.61), which we see is just
\begin{align*}
a_{1n} &= \text{coefficient of } \alpha \text{ in } \frac{\alpha(\alpha-1)(\alpha-2) \cdots (\alpha-n+1)}{n!} \\
&= \frac{(-1)(-2)(-3) \cdots (-n+1)}{n!} = (-1)^{n-1} \frac{(n-1)!}{n!} = \frac{(-1)^{n-1}}{n}.
\end{align*}
Therefore,
\[
a_1(z) = \sum_{n=1}^{\infty} a_{1n}\, z^n = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n,
\]
just as we stated in (6.60). This completes Step 2.
Step 3: We are finally ready to prove our lemma. Let $x \in \mathbb{R}$ with $|x| < 1$. By Step 2, we know that for any $\alpha \in \mathbb{C}$,
\[
f(\alpha, x) = 1 + \sum_{m=1}^{\infty} a_m(x)\, \alpha^m
\]
is a power series in $\alpha$. However,
\[
(1 + x)^\alpha = \exp(\alpha \log(1 + x)) = \sum_{n=0}^{\infty} \frac{1}{n!} \bigl( \log(1 + x) \bigr)^n \cdot \alpha^n
\]
is also a power series in $\alpha \in \mathbb{C}$. By Step 1, $f(\alpha, x) = (1 + x)^\alpha$ for all $\alpha \in \mathbb{Q}$ with $\alpha > 0$ having odd denominators. The identity theorem applies to this situation (why?), so we must have $f(\alpha, x) = (1 + x)^\alpha$ for all $\alpha \in \mathbb{C}$. Also by the identity theorem, the coefficients of $\alpha^n$ must be identical; in particular, the coefficients of $\alpha^1$ are identical: $a_1(x) = \log(1 + x)$. Now (6.60) implies the series for $\log(1 + x)$.

Using this lemma and the identity theorem, we are ready to generalize these formulas for real $x$ to formulas for complex $z$.
Theorem 6.37 (The complex logarithm and binomial series). We have
\[
\operatorname{Log}(1 + z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n, \qquad |z| \le 1,\ z \ne -1,
\]
and given any $\alpha \in \mathbb{C}$, we have
\[
(1 + z)^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n = 1 + \alpha z + \frac{\alpha(\alpha-1)}{2!} z^2 + \cdots, \qquad |z| < 1.
\]

Proof. We prove this theorem first for $\operatorname{Log}(1 + z)$, then for $(1 + z)^\alpha$.

Step 1: Let us define $f(z) := \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n$. Then one can check that the radius of convergence of $f(z)$ is 1, so by our power series composition theorem, we know that $\exp(f(z))$ can be written as a power series:
\[
\exp(f(z)) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1.
\]
Restricting to real values of $z$, by Lemma 6.36 we know that $f(x) = \log(1 + x)$, so
\[
\sum_{n=0}^{\infty} a_n x^n = \exp(f(x)) = \exp(\log(1 + x)) = 1 + x.
\]
By the identity theorem for power series, we must have $a_0 = 1$, $a_1 = 1$, and all other $a_n = 0$. Thus, $\exp(f(z)) = 1 + z$. Since $\exp(\operatorname{Log}(1 + z)) = 1 + z$ as well, we have
\[
\exp(f(z)) = \exp(\operatorname{Log}(1 + z)),
\]
which implies that $f(z) = \operatorname{Log}(1 + z) + 2\pi i k$ for some integer $k$. Setting $z = 0$ shows that $k = 0$ and hence proves that $\operatorname{Log}(1 + z) = f(z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n$.
We now prove that $\operatorname{Log}(1 + z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n$ holds for $|z| = 1$ with $z \ne -1$ (note that for $z = -1$, neither side of this equality is defined). If $|z| = 1$ and $z \ne -1$, then we can write $z = -e^{ix}$ with $x \in (0, 2\pi)$. Recall from Example 6.4 in Section 6.1 that for any $x \in (0, 2\pi)$, the series $\sum_{n=1}^{\infty} \frac{e^{inx}}{n}$ converges. Hence, as
\[
-\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n = \sum_{n=1}^{\infty} \frac{(-1)^n (-e^{ix})^n}{n} = \sum_{n=1}^{\infty} \frac{e^{inx}}{n}, \tag{6.64}
\]
it follows that $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n$ converges for $|z| = 1$ with $z \ne -1$. Now fix a point $z_0$ with $|z_0| = 1$ and $z_0 \ne -1$, and let us take $z \to z_0$ through the straight line from $z = 0$ to $z = z_0$ (that is, let $z = t z_0$ where $0 \le t \le 1$ and take $t \to 1^-$). Since the ratio
\[
\frac{|z_0 - z|}{1 - |z|} = \frac{|z_0 - t z_0|}{1 - |t z_0|} = \frac{|z_0 - t z_0|}{1 - t} = \frac{|1 - t|}{1 - t} = 1
\]
is bounded by a fixed constant, by Abel's theorem (Theorem 6.20), it follows that
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z_0^n = \lim_{z \to z_0} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n = \lim_{z \to z_0} \operatorname{Log}(1 + z) = \operatorname{Log}(1 + z_0),
\]
where we used that $\operatorname{Log}(1 + z)$ is continuous.


Step 2: Let $\alpha \in \mathbb{C}$. To prove the binomial series, we note that by the power series composition theorem, $(1 + z)^\alpha = \exp(\alpha \operatorname{Log}(1 + z))$, being the composition of exp and Log, can be written as a power series:
\[
(1 + z)^\alpha = \sum_{n=0}^{\infty} b_n z^n, \qquad |z| < 1.
\]
Restricting to real $z = x \in \mathbb{R}$ with $|x| < 1$, by Lemma 6.36 we know that $(1 + x)^\alpha = f(\alpha, x)$. Hence, by the identity theorem, we must have $(1 + z)^\alpha = f(\alpha, z)$ for all $z \in \mathbb{C}$ with $|z| < 1$. This proves the binomial series.
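Both series of Theorem 6.37 can be compared numerically against the principal logarithm. The sketch below uses Python's `cmath.log`, which returns the principal branch, as a stand-in for Log; the function names, the number of terms, and the sample values of $\alpha$ and $z$ are illustrative assumptions.

```python
import cmath

def log_series(z, terms=400):
    """Partial sum of sum_{n>=1} (-1)^(n-1) z^n / n."""
    return sum((-1) ** (n - 1) * z ** n / n for n in range(1, terms + 1))

def binomial_series(alpha, z, terms=200):
    """Partial sum of sum_{n>=0} binom(alpha, n) z^n."""
    total, coeff = 0j, 1 + 0j              # coeff holds binom(alpha, n)
    for n in range(terms):
        total += coeff * z ** n
        coeff *= (alpha - n) / (n + 1)     # binom(alpha, n+1) = binom(alpha, n) (alpha - n)/(n + 1)
    return total

alpha, z = 0.3 + 2j, 0.4 - 0.5j            # |z| < 1
log_gap = abs(log_series(z) - cmath.log(1 + z))
binom_gap = abs(binomial_series(alpha, z) - cmath.exp(alpha * cmath.log(1 + z)))
print(log_gap, binom_gap)
```

For $|z|$ close to 1 many more terms would be needed, in line with the radius of convergence being exactly 1.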


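For any $z \in \mathbb{C}$ with $|z| < 1$, we have $\operatorname{Log}\bigl( (1 + z)/(1 - z) \bigr) = \operatorname{Log}(1 + z) - \operatorname{Log}(1 - z)$. Therefore, we can use this theorem to prove that (see Problem 1)
\[
\frac{1}{2} \operatorname{Log}\left( \frac{1 + z}{1 - z} \right) = \sum_{n=0}^{\infty} \frac{z^{2n+1}}{2n+1}. \tag{6.65}
\]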

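Here's another consequence of Theorem 6.37.

Example 6.39. In the proof of Theorem 6.37 we used that, for $x \in (0, 2\pi)$, the series $\sum_{n=1}^{\infty} \frac{e^{inx}}{n} = \sum_{n=1}^{\infty} \frac{\cos nx}{n} + i \sum_{n=1}^{\infty} \frac{\sin nx}{n}$ converges. We shall prove that
\[
\sum_{n=1}^{\infty} \frac{\cos nx}{n} = -\log\bigl( 2 \sin(x/2) \bigr) \qquad \text{and} \qquad \sum_{n=1}^{\infty} \frac{\sin nx}{n} = \frac{\pi - x}{2}.
\]
To see this, recall from (6.64) that, with $z = -e^{ix}$, we have
\[
\sum_{n=1}^{\infty} \frac{\cos nx}{n} + i \sum_{n=1}^{\infty} \frac{\sin nx}{n} = -\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n = -\operatorname{Log}(1 + z) = -\operatorname{Log}(1 - e^{ix}).
\]
We can write
\[
1 - e^{ix} = e^{ix/2} (e^{-ix/2} - e^{ix/2}) = -2i\, e^{ix/2} \sin(x/2) = 2 \sin(x/2)\, e^{ix/2 - i\pi/2}.
\]
Hence, by definition of Log (note that $2 \sin(x/2) > 0$ and $-\pi/2 < (x - \pi)/2 < \pi/2$ for $x \in (0, 2\pi)$), we have
\[
\operatorname{Log}(1 - e^{ix}) = \log\bigl( 2 \sin(x/2) \bigr) + i\, \frac{x - \pi}{2}.
\]
Taking the negative and comparing real and imaginary parts proves our result.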
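The identity $\sum_{n \ge 1} e^{inx}/n = -\operatorname{Log}(1 - e^{ix})$ from (6.64) can be watched converging numerically, albeit slowly. This is a rough sketch: the sample point $x = 2$ and the cutoff of 200,000 terms are arbitrary choices, and `cmath.log` (principal branch) stands in for Log.

```python
import cmath
import math

def exp_series_partial_sum(x, terms):
    """Partial sum of sum_{n>=1} e^{inx} / n."""
    return sum(cmath.exp(1j * n * x) / n for n in range(1, terms + 1))

x = 2.0                                           # any point of (0, 2*pi)
approx = exp_series_partial_sum(x, 200_000)
exact = -cmath.log(1 - cmath.exp(1j * x))         # -Log(1 - e^{ix})

print("real parts:", approx.real, -math.log(2 * math.sin(x / 2)))
print("imag parts:", approx.imag, (math.pi - x) / 2)
```

The real part tracks $-\log(2\sin(x/2))$ and the imaginary part tracks $(\pi - x)/2$; the error decays only like $1/N$, as expected for a conditionally convergent series.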
6.9.3. Gregory-Madhava series and formulæ for γ. Recall from Section 4.9 that
\[
\operatorname{Arctan} z = \frac{1}{2i} \operatorname{Log}\left( \frac{1 + iz}{1 - iz} \right).
\]
Using (6.65), we get the famous formula first discovered by Madhava of Sangamagrama (1350–1425) around 1400 and rediscovered over 200 years later in Europe by James Gregory (1638–1675), who found it in 1671! In fact, the mathematicians of Kerala in southern India not only discovered the arctangent series, they also discovered the infinite series for sine and cosine, but their results were written up in Sanskrit and not brought to Europe until the 1800s. For more history on this fascinating topic, see the articles [111], [190], and the website [172].
Theorem 6.38. For any complex number $z$ with $|z| < 1$, we have
\[
\operatorname{Arctan} z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{2n+1}, \qquad \text{Gregory-Madhava's series.}
\]
This series is commonly known as Gregory's arctangent series, but we shall call it the Gregory-Madhava arctangent series because of Madhava's contribution to this series. Setting $z = x$, a real variable, we obtain the usual formula learned in elementary calculus:
\[
\arctan x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1}.
\]

In Problem 6 we prove the following stunning formulæ for the Euler-Mascheroni constant γ in terms of the Riemann ζ-function $\zeta(z)$:
\begin{align*}
\gamma &= \sum_{n=2}^{\infty} \frac{(-1)^n}{n}\, \zeta(n) \\
&= 1 - \sum_{n=2}^{\infty} \frac{1}{n} \bigl( \zeta(n) - 1 \bigr) \tag{6.66} \\
&= \frac{3}{2} - \log 2 - \sum_{n=2}^{\infty} \frac{(-1)^n}{n} (n - 1) \bigl( \zeta(n) - 1 \bigr).
\end{align*}
The first two formulas are due to Euler and the last one to Philippe Flajolet and Ilan Vardi (see [203, pp. 4,5], [75]).
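Exercises 6.9.
1. Fill in the details in the proof of formula (6.65).
2. Derive the remarkably pretty formula
\[
\frac{1}{2} (\operatorname{Arctan} z)^2 = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+2} \left( 1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n+1} \right) z^{2n+2},
\]
and the formula
\[
\frac{1}{2} (\operatorname{Log}(1 + z))^2 = \sum_{n=0}^{\infty} \frac{(-1)^n}{n+2} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n+1} \right) z^{n+2},
\]
both valid for $|z| < 1$.
3. Before looking at the next section, prove that
\[
\arctan x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n+1} \qquad \text{and} \qquad \log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} x^n
\]
are valid for $-1 < x \le 1$. Suggestion: I know you are Abel to do this! From these facts, derive the formulas
\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots \qquad \text{and} \qquad \log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + - \cdots.
\]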
4. For $\alpha \in \mathbb{R}$, prove that $\sum_{n=0}^{\infty} \binom{\alpha}{n}$ converges if and only if $\alpha > -1$, in which case,
\[
2^\alpha = \sum_{n=0}^{\infty} \binom{\alpha}{n}.
\]
Suggestion: To prove convergence use Gauss' test.
5. Prove the exquisite formulas
\[
\text{(a)} \quad \sum_{n=1}^{\infty} \frac{1}{n} \cdot \frac{z^n}{1 - z^n} = \sum_{n=1}^{\infty} \operatorname{Log}\left( \frac{1}{1 - z^n} \right), \qquad |z| < 1,
\]
\[
\text{(b)} \quad \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \cdot \frac{z^n}{1 - z^n} = \sum_{n=1}^{\infty} \operatorname{Log}(1 + z^n), \qquad |z| < 1.
\]
Suggestion: Cauchy's double series theorem.
6. In this problem, we prove the stunning formulæ in (6.66).
(i) Using the first formula for γ in Problem 7a of Exercises 4.6, prove that $\gamma = \sum_{n=1}^{\infty} f\left( \frac{1}{n} \right)$ where $f(z) = \sum_{n=2}^{\infty} \frac{(-1)^n}{n} z^n$.
(ii) Prove that $\gamma = 1 - \log 2 + \sum_{n=2}^{\infty} \frac{(-1)^n}{n} (\zeta(n) - 1)$ using (i) and Problem 10 in Exercises 6.5. Show that this formula is equivalent to the first formula in (6.66).
(iii) Using the second and third formulas in Problem 7a of Exercises 4.6, derive the second and third formulas in (6.66).

6.10. ★ π, Euler, Fibonacci, Leibniz, Madhava, and Machin

In this section, we continue our fascinating study of formulas for π that we initiated in Section 4.10. In particular, we derive (using a very different method from the one presented in Section 5.2) Gregory-Leibniz-Madhava's formula for π/4, formulas for π discovered by Euler involving the arctangent function and even the Fibonacci numbers, and finally, we look at Machin's formula for π, versions of which have been used to compute trillions of digits of π by Yasumasa Kanada and his coworkers at the University of Tokyo.⁶ For other formulas for π/4 in terms of arctangents, see the articles [132, 97]. For more on the history of computations of π, see [13], and for interesting historical facets on π in general, see [10], [47, 48]. The website [207] has tons of information.

6.10.1. Gregory-Leibniz-Madhava's formula for π/4, Proof II. Recall Gregory-Madhava's formula for real values:
\[
\arctan x = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{2n-1}.
\]
By the alternating series theorem, we know that $\sum_{n=1}^{\infty} (-1)^{n-1}/(2n-1)$ converges, therefore by Abel's limit theorem (Theorem 6.20) we know that
\[
\frac{\pi}{4} = \lim_{x \to 1^-} \arctan x = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{1}{2n-1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots.
\]
Therefore, we obtain another derivation of
\[
\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots, \qquad \text{Gregory-Leibniz-Madhava's series.}
\]
Madhava of Sangamagrama (1350–1425) was the first to discover this formula, over 200 years before James Gregory (1638–1675) or Gottfried Leibniz (1646–1716) were even born! Note that Gregory-Leibniz-Madhava's series is really just a special case of Gregory-Madhava's formula for $\arctan x$ (just set $x = 1$), which, recall, was discovered in 1671 by Gregory and, much earlier, by Madhava. Leibniz discovered the formula for π/4 (using geometric arguments) around 1673. Although there is no published record of Gregory noting the formula for π/4 (he published few of his mathematical results and he died at only 37 years old), it would be hard to believe that he didn't know about the formula for π/4. For more history, including Nilakantha Somayaji's (1444–1544) contribution, see [190, 172, 111, 38].
⁶ The value of π has engaged the attention of many mathematicians and calculators from the time of Archimedes to the present day, and has been computed from so many different formulae, that a complete account of its calculation would almost amount to a history of mathematics. James Glaisher (1848–1928) [82].

Example 6.40. Let us say that we want to approximate π/4 by Gregory-Leibniz-Madhava's series to within, say, a reasonable 7 decimal places. Then denoting the $n$-th partial sum of Gregory-Leibniz-Madhava's series by $s_n$, according to the alternating series error estimate, we want
\[
\left| \frac{\pi}{4} - s_n \right| \le \frac{1}{2n+1} < 0.00000005 = 5 \times 10^{-8},
\]
which requires $2n + 1 > 10^8/5 = 2 \times 10^7$, and this holds for $n \ge 10{,}000{,}000$. Thus, we can approximate π/4 to 7 decimal places by the $n$-th partial sum of Gregory-Leibniz-Madhava's series by taking ten million terms! Thus, although Gregory-Leibniz-Madhava's series is beautiful, it is quite useless for computing π.
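To see how slow this convergence is in practice, here is a small sketch that compares the actual error of the partial sums with the alternating series bound $1/(2n+1)$. Here $s_n$ is taken to be the sum of the first $n$ terms $1 - 1/3 + \cdots \pm 1/(2n-1)$; the function name and the sample values of $n$ are illustrative.

```python
import math

def leibniz_partial_sum(n):
    """s_n = sum_{k=1}^{n} (-1)^(k-1) / (2k - 1), the first n terms of the series for pi/4."""
    return math.fsum((-1) ** (k - 1) / (2 * k - 1) for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    error = abs(math.pi / 4 - leibniz_partial_sum(n))
    bound = 1 / (2 * n + 1)                 # size of the first omitted term
    print(f"n = {n:>7}: error = {error:.3e}, bound = {bound:.3e}")
```

Each tenfold increase in $n$ buys roughly one more correct digit, which is what the estimate in the example predicts.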
Example 6.41. From Gregory-Leibniz-Madhava's formula, we can easily derive the breath-taking formula (see Problem 4)
\[
\pi = \sum_{n=1}^{\infty} \frac{3^n - 1}{4^n}\, \zeta(n + 1), \tag{6.67}
\]
due to Philippe Flajolet and Ilan Vardi (see [204, p. 1], [232, 75]).
6.10.2. Euler's arctangent formula and the Fibonacci numbers. In 1738, Euler derived a very pretty two-angle arctangent expression for π:
\[
\frac{\pi}{4} = \arctan \frac{1}{2} + \arctan \frac{1}{3}. \tag{6.68}
\]
This formula is very easy to derive. We start off with the addition formula for tangent (see (4.34), but now considering real variables)
\[
\frac{\tan \theta + \tan \varphi}{1 - \tan \theta \tan \varphi} = \tan(\theta + \varphi), \tag{6.69}
\]
where it is assumed that $1 - \tan \theta \tan \varphi \ne 0$. Let $x = \tan \theta$ and $y = \tan \varphi$ and assume that $-\pi/2 < \theta + \varphi < \pi/2$. Then taking arctangents of both sides of the above equation, we obtain
\[
\arctan\left( \frac{x + y}{1 - xy} \right) = \theta + \varphi,
\]
or after putting the right-hand side in terms of $x, y$, we get
\[
\arctan\left( \frac{x + y}{1 - xy} \right) = \arctan x + \arctan y. \tag{6.70}
\]
Setting $x = 1/2$ and $y = 1/3$ and using that
\[
\frac{x + y}{1 - xy} = \frac{5/6}{1 - 1/6} = 1,
\]
we get
\[
\arctan 1 = \arctan \frac{1}{2} + \arctan \frac{1}{3}.
\]
This expression is just (6.68).
In Problem 9 of Exercises 2.2 we studied the Fibonacci sequence, named after Leonardo Fibonacci (1170–1250): $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \ge 2$, and you proved that for every $n$,
\[
F_n = \frac{1}{\sqrt{5}} \left[ \Phi^n - (-\Phi)^{-n} \right], \qquad \Phi = \frac{1 + \sqrt{5}}{2}. \tag{6.71}
\]
We can use (6.68) and (6.70) to derive the following fascinating formula for π/4 in terms of the (odd-indexed) Fibonacci numbers due to Lehmer [131] (see Problem 2 and [133]):
\[
\frac{\pi}{4} = \sum_{n=1}^{\infty} \arctan\left( \frac{1}{F_{2n+1}} \right). \tag{6.72}
\]
Also, in Problem 3 you will prove the following series for π, due to Castellanos [47]:
\[
\frac{\pi}{\sqrt{5}} = \sum_{n=0}^{\infty} \frac{(-1)^n F_{2n+1}\, 2^{2n+3}}{(2n+1)(3 + \sqrt{5})^{2n+1}}. \tag{6.73}
\]
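The three formulas of this subsection lend themselves to a quick floating-point check. In the sketch below the Lehmer sum is taken over $\arctan(1/F_3) + \arctan(1/F_5) + \cdots$, that is, starting with $\arctan(1/2)$; the helper `fibonacci` and the truncation points are illustrative choices.

```python
import math

def fibonacci(n):
    """F_0 = 0, F_1 = 1, F_n = F_{n-1} + F_{n-2}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Euler's two-angle formula (6.68)
euler_two_angle = math.atan(1 / 2) + math.atan(1 / 3)

# Lehmer's formula: arctan(1/F_3) + arctan(1/F_5) + ... (odd-indexed Fibonacci numbers)
lehmer = math.fsum(math.atan(1 / fibonacci(2 * n + 1)) for n in range(1, 40))

# Castellanos' series (6.73) for pi / sqrt(5)
sqrt5 = math.sqrt(5)
castellanos = math.fsum(
    (-1) ** n * fibonacci(2 * n + 1) * 2 ** (2 * n + 3)
    / ((2 * n + 1) * (3 + sqrt5) ** (2 * n + 1))
    for n in range(60)
)

print(euler_two_angle, lehmer, math.pi / 4)
print(castellanos, math.pi / sqrt5)
```

Both Fibonacci series converge geometrically, so a few dozen terms already reach machine precision.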

6.10.3. Machin's arctangent formula for π. In 1706, John Machin (1680–1752) derived a fairly rapidly convergent series for π. To derive this expansion, consider the smallest positive angle $\alpha$ whose tangent is 1/5:
\[
\tan \alpha = \frac{1}{5} \qquad (\text{that is, } \alpha := \arctan(1/5)).
\]
Now setting $\theta = \varphi = \alpha$ in (6.69), we obtain
\[
\tan 2\alpha = \frac{2 \tan \alpha}{1 - \tan^2 \alpha} = \frac{2/5}{1 - 1/25} = \frac{5}{12},
\]
so
\[
\tan 4\alpha = \frac{2 \tan 2\alpha}{1 - \tan^2 2\alpha} = \frac{5/6}{1 - 25/144} = \frac{120}{119},
\]
which is just slightly above one. Hence, $4\alpha - \pi/4$ is positive, and moreover,
\[
\tan\left( 4\alpha - \frac{\pi}{4} \right) = \frac{\tan 4\alpha - \tan \pi/4}{1 + \tan 4\alpha \tan \pi/4} = \frac{1/119}{1 + 120/119} = \frac{1}{239}.
\]
Taking the inverse tangent of both sides and solving for $\frac{\pi}{4}$, we get
\[
\frac{\pi}{4} = 4 \tan^{-1} \frac{1}{5} - \tan^{-1} \frac{1}{239}.
\]
Substituting 1/5 and 1/239 into the Gregory-Madhava series for the inverse tangent, we arrive at Machin's formula for π:
Theorem 6.39 (Machin's formula). We have
\[
\pi = 16 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)\, 5^{2n+1}} - 4 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)\, 239^{2n+1}}.
\]

Example 6.42. Machin's formula gives many decimal places of π without much effort. Let $s_n$ denote the $n$-th partial sum of $s := 16 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)\, 5^{2n+1}}$ and $t_n$ that of $t := 4 \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)\, 239^{2n+1}}$. Then $\pi = s - t$ and by the alternating series error estimate,
\[
|s - s_3| \le \frac{16}{9 \cdot 5^9} \approx 9.102 \times 10^{-7}
\]
and
\[
|t - t_0| \le \frac{4}{3 \cdot (239)^3} \approx 10^{-7}.
\]
Therefore,
\[
|\pi - (s_3 - t_0)| = |(s - t) - (s_3 - t_0)| \le |s - s_3| + |t - t_0| < 5 \times 10^{-6}.
\]
A manageable computation (even without a calculator!) shows that $s_3 - t_0 = 3.14159\ldots$. Therefore, $\pi = 3.14159$ to five decimal places!
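The "manageable computation" can also be delegated to exact rational arithmetic. In this sketch $s_3$ means the sum of the terms $n = 0, 1, 2, 3$ of the first series and $t_0$ the single term $n = 0$ of the second, matching the error bounds above; the helper name `arctan_inverse_partial` is an illustrative choice.

```python
import math
from fractions import Fraction

def arctan_inverse_partial(m, n_last):
    """Sum_{n=0}^{n_last} (-1)^n / ((2n+1) m^(2n+1)): partial sum of the series for arctan(1/m)."""
    return sum(Fraction((-1) ** n, (2 * n + 1) * m ** (2 * n + 1)) for n in range(n_last + 1))

s3 = 16 * arctan_inverse_partial(5, 3)      # four terms of 16 * arctan(1/5)
t0 = 4 * arctan_inverse_partial(239, 0)     # one term of 4 * arctan(1/239)
approx = s3 - t0

print("s3 - t0 =", float(approx))
print("error   =", abs(float(approx) - math.pi))
```

Five terms in total are enough for five decimals, in sharp contrast with the ten million terms of Gregory-Leibniz-Madhava's series.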
Exercises 6.10.
1. From Gregory-Madhava's series, derive the following pretty series
\[
\frac{\pi}{2\sqrt{3}} = 1 - \frac{1}{3 \cdot 3} + \frac{1}{5 \cdot 3^2} - \frac{1}{7 \cdot 3^3} + \frac{1}{9 \cdot 3^4} - + \cdots.
\]
Suggestion: Consider $\arctan(1/\sqrt{3}) = \pi/6$. How many terms of this series do you need to approximate $\pi/(2\sqrt{3})$ to within seven decimal places? History Bite: Abraham Sharp (1651–1742) used this formula in 1669 to compute π to 72 decimal places, and Thomas Fantet de Lagny (1660–1734) used this formula in 1717 to compute π to 126 decimal places (with a mistake in the 113-th place) [47].
2. In this problem we prove (6.72).
(i) Prove that $\arctan \frac{1}{3} = \arctan \frac{1}{5} + \arctan \frac{1}{8}$, and use this to prove that
\[
\frac{\pi}{4} = \arctan \frac{1}{2} + \arctan \frac{1}{5} + \arctan \frac{1}{8}.
\]
Prove that $\arctan \frac{1}{8} = \arctan \frac{1}{13} + \arctan \frac{1}{21}$, and use this to prove that
\[
\frac{\pi}{4} = \arctan \frac{1}{2} + \arctan \frac{1}{5} + \arctan \frac{1}{13} + \arctan \frac{1}{21}.
\]
From here you can now see the appearance of Fibonacci numbers.
(ii) To continue this by induction, prove that for every natural number $n$,
\[
F_{2n} = \frac{F_{2n+1} F_{2n+2} - 1}{F_{2n+3}}.
\]
Suggestion: Can you use (6.71)?
(iii) Using the formula in (ii), prove that
\[
\arctan\left( \frac{1}{F_{2n}} \right) = \arctan\left( \frac{1}{F_{2n+1}} \right) + \arctan\left( \frac{1}{F_{2n+2}} \right).
\]
Using this formula derive (6.72).
3. In this problem we prove (6.73).
(i) Using (6.70), prove that
\[
\tan^{-1}\left( \frac{\sqrt{5}\, x}{1 - x^2} \right) = \tan^{-1}\left( \frac{1 + \sqrt{5}}{2}\, x \right) - \tan^{-1}\left( \frac{1 - \sqrt{5}}{2}\, x \right).
\]
(ii) Now prove that
\[
\tan^{-1}\left( \frac{\sqrt{5}\, x}{1 - x^2} \right) = \sqrt{5} \sum_{n=0}^{\infty} \frac{(-1)^n F_{2n+1}\, x^{2n+1}}{2n+1}.
\]
(iii) Finally, derive the formula (6.73).
4. In this problem, we prove the breath-taking formula (6.67).
(i) Prove that
\[
\frac{\pi}{4} = \sum_{n=1}^{\infty} \left( \frac{1}{4n - 3} - \frac{1}{4n - 1} \right) = \sum_{n=1}^{\infty} f\left( \frac{1}{n} \right),
\]
where $f(z) = \frac{z}{4 - 3z} - \frac{z}{4 - z}$.
(ii) Use Theorem 6.27 to derive our breath-taking formula.

6.11. ★ Another proof that $\pi^2/6 = \sum_{n=1}^{\infty} 1/n^2$ (The Basel problem)

Assuming only Gregory-Leibniz-Madhava's series, $\frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}$, we give our seventh proof of the fact that⁷
\[
\frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots.
\]
According to Knopp [120, p. 324], the proof we are about to give "may be regarded as the most elementary of all known proofs, since it borrows nothing from the theory of functions except the Leibniz series". Knopp attributes the main ideas of the proof to Nicolaus Bernoulli (1687–1759).
6.11.1. Cauchy's arithmetic mean theorem. Before giving our seventh proof of Euler's sum, we prove the following theorem (attributed to Cauchy by Knopp [120, p. 72]).
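Theorem 6.40 (Cauchy's arithmetic mean theorem). If a sequence $a_1, a_2, a_3, \ldots$ converges to $L$, then the sequence of arithmetic means (or averages)
\[
m_n := \frac{1}{n} \bigl( a_1 + a_2 + \cdots + a_n \bigr)
\]
also converges to $L$. Moreover, if the sequence $\{a_n\}$ is nonincreasing, then so is the sequence of arithmetic means $\{m_n\}$.
Proof. To show that $m_n \to L$, we need to show that
\[
m_n - L = \frac{1}{n} \bigl( (a_1 - L) + (a_2 - L) + \cdots + (a_n - L) \bigr)
\]
tends to zero as $n \to \infty$. Let $\varepsilon > 0$ and choose $N \in \mathbb{N}$ so that for all $n > N$, we have $|a_n - L| < \varepsilon/2$. Then for $n > N$, we can write
\begin{align*}
|m_n - L| &\le \frac{1}{n} \bigl| (a_1 - L) + \cdots + (a_N - L) \bigr| + \frac{1}{n} \bigl| (a_{N+1} - L) + \cdots + (a_n - L) \bigr| \\
&\le \frac{1}{n} \bigl| (a_1 - L) + \cdots + (a_N - L) \bigr| + \frac{1}{n} \Bigl( \frac{\varepsilon}{2} + \cdots + \frac{\varepsilon}{2} \Bigr) \\
&= \frac{1}{n} \bigl| (a_1 - L) + \cdots + (a_N - L) \bigr| + \frac{n - N}{n} \cdot \frac{\varepsilon}{2} \\
&\le \frac{1}{n} \bigl| (a_1 - L) + \cdots + (a_N - L) \bigr| + \frac{\varepsilon}{2}.
\end{align*}
By choosing $n$ larger, we can make $\frac{1}{n} \bigl| (a_1 - L) + \cdots + (a_N - L) \bigr|$ also less than $\varepsilon/2$, which shows that $|m_n - L| < \varepsilon$ for $n$ sufficiently large. This shows that $m_n \to L$.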
Assume now that $\{a_n\}$ is nonincreasing. We shall prove that $\{m_n\}$ is also nonincreasing; that is, for each $n$,
\[
\frac{1}{n+1} \bigl( a_1 + \cdots + a_n + a_{n+1} \bigr) \le \frac{1}{n} \bigl( a_1 + \cdots + a_n \bigr),
\]
or, after multiplying both sides by $n(n+1)$,
\[
n \bigl( a_1 + \cdots + a_n \bigr) + n a_{n+1} \le n \bigl( a_1 + \cdots + a_n \bigr) + \bigl( a_1 + \cdots + a_n \bigr).
\]
Cancelling, we conclude that the sequence $\{m_n\}$ is nonincreasing if and only if
\[
n a_{n+1} = \underbrace{a_{n+1} + a_{n+1} + \cdots + a_{n+1}}_{n \text{ times}} \le a_1 + a_2 + \cdots + a_n.
\]
But this inequality certainly holds since $a_{n+1} \le a_k$ for $k = 1, 2, \ldots, n$. This completes the proof.

⁷ This proof can be thought of as a systematized version of Problem 3 in Exercises 5.2.

There is a related theorem for geometric means found in Problem 2, which can be used to derive the following neat formula:
\[
e = \lim_{n \to \infty} \left[ \left( \frac{2}{1} \right)^1 \left( \frac{3}{2} \right)^2 \left( \frac{4}{3} \right)^3 \cdots \left( \frac{n+1}{n} \right)^n \right]^{1/n}. \tag{6.74}
\]

6.11.2. Proof VII of Euler’s formula for π 2 /6. First we shall apply Abel’s
multiplication theorem to Gregory-Leibniz-Madhava’s series:
 π 2  X ∞  X∞ 
1 1
= (−1)n · (−1)n .
4 n=0
2n + 1 n=0
2n + 1

To do so, we first form the $n$-th term in the Cauchy product:
\[ c_n = \sum_{k=0}^n (-1)^k \frac{1}{2k+1} \cdot (-1)^{n-k} \frac{1}{2n-2k+1} = (-1)^n \sum_{k=0}^n \frac{1}{(2k+1)(2n-2k+1)}. \]
Using partial fractions one can check that
\[ \frac{1}{(2k+1)(2n-2k+1)} = \frac{1}{2(n+1)} \left( \frac{1}{2k+1} + \frac{1}{2n-2k+1} \right), \]
which implies that
\[ c_n = \frac{(-1)^n}{2(n+1)} \left( \sum_{k=0}^n \frac{1}{2k+1} + \sum_{k=0}^n \frac{1}{2n-2k+1} \right) = \frac{(-1)^n}{n+1} \sum_{k=0}^n \frac{1}{2k+1}, \]
since $\sum_{k=0}^n \frac{1}{2n-2k+1} = \sum_{k=0}^n \frac{1}{2k+1}$. Thus, we can write
\[ \Big(\frac{\pi}{4}\Big)^2 = \sum_{n=0}^\infty (-1)^n m_n, \quad \text{where } m_n = \frac{1}{n+1}\Big( 1 + \frac{1}{3} + \cdots + \frac{1}{2n+1} \Big), \]
provided that the series converges! To see that this series converges, note that $m_n$ is exactly the arithmetic mean, or average, of the $n+1$ numbers $1, 1/3, \ldots, 1/(2n+1)$. Since $1/(2n+1) \to 0$ monotonically, Cauchy’s arithmetic mean theorem shows that these averages also tend to zero monotonically. In particular, by the alternating series theorem, $\sum_{n=0}^\infty (-1)^n m_n$ converges, so by Abel’s multiplication theorem, we get (not quite $\pi^2/6$, but pretty nonetheless)
\[ \frac{\pi^2}{16} = \sum_{n=0}^\infty (-1)^n \frac{1}{n+1} \Big( 1 + \frac{1}{3} + \cdots + \frac{1}{2n+1} \Big). \tag{6.75} \]
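Formula (6.75) can be checked numerically. The sketch below (the name `cauchy_product_partial_sum` is an illustrative choice) accumulates the odd harmonic sums $1 + 1/3 + \cdots + 1/(2n+1)$ and the alternating partial sums in one pass; by the alternating series estimate, the error after $N$ terms is at most $m_N$.

```python
import math

def cauchy_product_partial_sum(num_terms):
    """Partial sum of sum_{n>=0} (-1)^n m_n, m_n = (1 + 1/3 + ... + 1/(2n+1)) / (n+1)."""
    odd_harmonic = 0.0
    total = 0.0
    for n in range(num_terms):
        odd_harmonic += 1.0 / (2 * n + 1)
        m_n = odd_harmonic / (n + 1)
        total += m_n if n % 2 == 0 else -m_n
    return total

target = math.pi ** 2 / 16
for N in (10, 1_000, 100_000):
    print(N, cauchy_product_partial_sum(N), target)
```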

We evaluate the right-hand side using the following theorem (whose proof is tech-
nical so you can skip it if you like).
346 6. ADVANCED THEORY OF INFINITE SERIES

Theorem 6.41. Let $\{a_n\}$ be a nonincreasing sequence of positive numbers such that $\sum a_n^2$ converges. Then both series
\[ s := \sum_{n=0}^\infty (-1)^n a_n \quad \text{and} \quad \delta_k := \sum_{n=0}^\infty a_n a_{n+k}, \quad k = 1, 2, 3, \ldots \]
converge. Moreover, $\Delta := \sum_{k=1}^\infty (-1)^{k-1} \delta_k$ also converges, and we have the formula
\[ \sum_{n=0}^\infty a_n^2 = s^2 + 2\Delta. \]
Proof. Since $\sum a_n^2$ converges, we must have $a_n \to 0$, which implies that $\sum (-1)^n a_n$ converges by the alternating series test. By monotonicity, $a_n a_{n+k} \le a_n \cdot a_n = a_n^2$, and since $\sum a_n^2$ converges, by comparison, so does each series $\delta_k = \sum_{n=0}^\infty a_n a_{n+k}$. Also by monotonicity,
\[ \delta_{k+1} = \sum_{n=0}^\infty a_n a_{n+k+1} \le \sum_{n=0}^\infty a_n a_{n+k} = \delta_k, \]

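so by the alternating series test, the sum $\Delta$ converges if $\delta_k \to 0$. To prove that this holds, let $\varepsilon > 0$ and choose $N$ (by invoking the Cauchy criterion for series) such that $a_{N+1}^2 + a_{N+2}^2 + \cdots < \varepsilon/2$. Then, since the sequence $\{a_n\}$ is nonincreasing, we can write
\begin{align*}
\delta_k &= \sum_{n=0}^\infty a_n a_{n+k} \\
&= \big( a_0 a_k + \cdots + a_N a_{N+k} \big) + \big( a_{N+1} a_{N+1+k} + a_{N+2} a_{N+2+k} + \cdots \big) \\
&\le \big( a_0 a_k + \cdots + a_N a_k \big) + \big( a_{N+1}^2 + a_{N+2}^2 + a_{N+3}^2 + \cdots \big) \\
&< a_k \cdot \big( a_0 + \cdots + a_N \big) + \frac{\varepsilon}{2}.
\end{align*}
As $a_k \to 0$, we can make the first term less than $\varepsilon/2$ for all $k$ large enough. Thus, $\delta_k < \varepsilon$ for all $k$ sufficiently large. This proves that $\delta_k \to 0$ and hence $\Delta = \sum_{k=1}^\infty (-1)^{k-1} \delta_k$ converges. Finally, we need to prove the equality
\[ \sum_{n=0}^\infty a_n^2 = s^2 + 2\Delta = s^2 + 2 \sum_{k=1}^\infty (-1)^{k-1} \delta_k. \]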
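To prove this, let $s_n$ denote the $n$-th partial sum of the series $s = \sum_{n=0}^\infty (-1)^n a_n$. We have
\[ s_n^2 = \left( \sum_{k=0}^n (-1)^k a_k \right)^2 = \sum_{k=0}^n \sum_{\ell=0}^n (-1)^{k+\ell} a_k a_\ell. \]
We can write the double sum on the right as a sum over $(k, \ell)$ such that $k = \ell$, $k < \ell$, and $\ell < k$:
\[ \sum_{k=0}^n \sum_{\ell=0}^n (-1)^{k+\ell} a_k a_\ell = \sum_{k=\ell} (-1)^{k+\ell} a_k a_\ell + \sum_{k<\ell} (-1)^{k+\ell} a_k a_\ell + \sum_{\ell<k} (-1)^{k+\ell} a_k a_\ell, \]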

where the smallest $k$ and $\ell$ can be is $0$ and the largest is $n$. The first sum is just $\sum_{k=0}^n a_k^2$ and, by symmetry in $k$ and $\ell$, the last two sums are actually the same, so
\[ s_n^2 = \sum_{k=0}^n a_k^2 + 2 \sum_{0 \le k < \ell \le n} (-1)^{k+\ell} a_k a_\ell. \]
In the second sum, $0 \le k < \ell \le n$, so we can write $\ell = k + j$ where $1 \le j \le n$ and $0 \le k \le n - j$. Hence,
\[ \sum_{0 \le k < \ell \le n} (-1)^{k+\ell} a_k a_\ell = \sum_{j=1}^n \sum_{k=0}^{n-j} (-1)^{k+(k+j)} a_k a_{k+j} = \sum_{j=1}^n \sum_{k=0}^{n-j} (-1)^j a_k a_{k+j}. \]
In summary, we have
\[ s_n^2 = \sum_{k=0}^n a_k^2 + 2 \sum_{j=1}^n (-1)^j \left( \sum_{k=0}^{n-j} a_k a_{k+j} \right). \]
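Let $d_n$ be the $n$-th partial sum of $2\Delta = 2 \sum_{j=1}^\infty (-1)^{j-1} \delta_j$; we need to show that $s_n^2 + d_n \to \sum_{k=0}^\infty a_k^2$ as $n \to \infty$. To this end, we add the expressions for $s_n^2$ and $d_n$:
\begin{align*}
s_n^2 + d_n &= \sum_{k=0}^n a_k^2 + 2 \sum_{j=1}^n (-1)^j \left( \sum_{k=0}^{n-j} a_k a_{k+j} \right) + 2 \sum_{j=1}^n (-1)^{j-1} \delta_j \\
&= \sum_{k=0}^n a_k^2 + 2 \sum_{j=1}^n (-1)^j \left( \sum_{k=0}^{n-j} a_k a_{k+j} - \delta_j \right).
\end{align*}
Recalling that $\delta_j = \sum_{k=0}^\infty a_k a_{k+j}$, the difference in parentheses equals $-\alpha_j$, so we can write $s_n^2 + d_n$ as
\[ s_n^2 + d_n = \sum_{k=0}^n a_k^2 - 2 \sum_{j=1}^n (-1)^j \alpha_j, \]
where
\[ \alpha_j := \sum_{k=n-j+1}^\infty a_k a_{k+j} = a_{n-j+1} a_{n+1} + a_{n-j+2} a_{n+2} + a_{n-j+3} a_{n+3} + \cdots. \]
Since the sequence $\{a_n\}$ is nonincreasing, it follows that the sequence $\{\alpha_j\}$ is nondecreasing:
\[ \alpha_j = a_{n-j+1} a_{n+1} + a_{n-j+2} a_{n+2} + \cdots \le a_{n-j} a_{n+1} + a_{n-j+1} a_{n+2} + \cdots = \alpha_{j+1}. \]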
Now assuming $n$ is even, we have
\begin{align*}
\left| \frac{1}{2} \Big( s_n^2 + d_n - \sum_{k=0}^n a_k^2 \Big) \right| &= \big| (-\alpha_1 + \alpha_2) + (-\alpha_3 + \alpha_4) + \cdots + (-\alpha_{n-1} + \alpha_n) \big| \\
&= (-\alpha_1 + \alpha_2) + (-\alpha_3 + \alpha_4) + \cdots + (-\alpha_{n-1} + \alpha_n) \\
&= -\alpha_1 - (\alpha_3 - \alpha_2) - (\alpha_5 - \alpha_4) - \cdots - (\alpha_{n-1} - \alpha_{n-2}) + \alpha_n \\
&\le \alpha_n = a_1 a_{n+1} + a_2 a_{n+2} + \cdots = \delta_n - a_0 a_n,
\end{align*}
where we used the fact that the terms in the parentheses are all nonnegative because the $\alpha_j$’s are nondecreasing. Using a very similar argument, we get
\[ \left| \frac{1}{2} \Big( s_n^2 + d_n - \sum_{k=0}^n a_k^2 \Big) \right| \le \delta_n - a_0 a_n \tag{6.76} \]
for $n$ odd. Therefore, (6.76) holds for all $n$. We already know that $\delta_n \to 0$ and $a_n \to 0$, so (6.76) shows that the left-hand side tends to zero as $n \to \infty$. This completes the proof of the theorem. □
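Finally, we are ready to prove Euler’s formula for $\pi^2/6$. To do so, we apply the preceding theorem to the sequence $a_n = 1/(2n+1)$. In this case,
\[ \delta_k = \sum_{n=0}^\infty a_n a_{n+k} = \sum_{n=0}^\infty \frac{1}{(2n+1)(2n+2k+1)}. \]
Writing in partial fractions,
\[ \frac{1}{(2n+1)(2n+2k+1)} = \frac{1}{2k} \left( \frac{1}{2n+1} - \frac{1}{2n+2k+1} \right), \]
we get (after some cancellations)
\[ \delta_k = \frac{1}{2k} \sum_{n=0}^\infty \left( \frac{1}{2n+1} - \frac{1}{2n+2k+1} \right) = \frac{1}{2k} \Big( 1 + \frac{1}{3} + \cdots + \frac{1}{2k-1} \Big). \]
Hence, the equality $\sum_{n=0}^\infty a_n^2 = s^2 + 2\Delta$ takes the form
\[ \sum_{n=0}^\infty \frac{1}{(2n+1)^2} = \Big(\frac{\pi}{4}\Big)^2 + \sum_{k=1}^\infty (-1)^{k-1} \frac{1}{k} \Big( 1 + \frac{1}{3} + \cdots + \frac{1}{2k-1} \Big). \]
However, see (6.75), we already proved that the Cauchy product of Gregory-Leibniz-Madhava’s series with itself is given by the sum on the right (replace $k$ by $n + 1$). Thus,
\[ \sum_{n=0}^\infty \frac{1}{(2n+1)^2} = \Big(\frac{\pi}{4}\Big)^2 + \Big(\frac{\pi}{4}\Big)^2 = \frac{\pi^2}{8}. \tag{6.77} \]
Finally, summing over the odd and even numbers, we have
\[ \sum_{n=1}^\infty \frac{1}{n^2} = \sum_{n=0}^\infty \frac{1}{(2n+1)^2} + \sum_{n=1}^\infty \frac{1}{(2n)^2} = \frac{\pi^2}{8} + \frac{1}{4} \sum_{n=1}^\infty \frac{1}{n^2}, \]
and solving for $\sum_{n=1}^\infty 1/n^2$, we obtain Euler’s formula: $\frac{\pi^2}{6} = \sum_{n=1}^\infty \frac{1}{n^2}$.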
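The identity of Theorem 6.41 for $a_n = 1/(2n+1)$ can also be tested numerically. The following is a minimal sketch under stated assumptions: the function names are illustrative, $s$ is taken to be $\pi/4$, $\delta_k$ is computed from the closed form derived above, and the slowly converging alternating series for $\Delta$ is accelerated by averaging its last two partial sums (a standard numerical device, not something used in the proof).

```python
import math

def averaged_alternating_sum(terms):
    """Average of the last two partial sums of t_1 - t_2 + t_3 - ... ."""
    total, previous = 0.0, 0.0
    for i, t in enumerate(terms):
        previous = total
        total += t if i % 2 == 0 else -t
    return 0.5 * (total + previous)

def check_theorem_6_41(num_terms=200_000, num_deltas=2_000):
    """Return (sum of a_n^2, s^2 + 2*Delta) for a_n = 1/(2n+1), both approximate."""
    lhs = sum(1.0 / (2 * n + 1) ** 2 for n in range(num_terms))
    s = math.pi / 4                      # value of Gregory-Leibniz-Madhava's series
    odd_harmonic = 0.0
    deltas = []
    for k in range(1, num_deltas + 1):
        odd_harmonic += 1.0 / (2 * k - 1)
        deltas.append(odd_harmonic / (2 * k))   # delta_k = (1 + 1/3 + ... + 1/(2k-1)) / (2k)
    big_delta = averaged_alternating_sum(deltas)
    return lhs, s * s + 2 * big_delta

lhs, rhs = check_theorem_6_41()
print(lhs, rhs, math.pi ** 2 / 8)
```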
Exercises 6.11.
1. Find the following limits:
(a) $\displaystyle \lim \frac{1 + 2^{1/2} + 3^{1/3} + \cdots + n^{1/n}}{n}$,
(b) $\displaystyle \lim \frac{\big(1 + \frac{1}{1}\big)^1 + \big(1 + \frac{1}{2}\big)^2 + \big(1 + \frac{1}{3}\big)^3 + \cdots + \big(1 + \frac{1}{n}\big)^n}{n}$.
2. If a sequence $a_1, a_2, a_3, \ldots$ of positive numbers converges to $L > 0$, prove that the sequence of geometric means $(a_1 a_2 \cdots a_n)^{1/n}$ also converges to $L$. Suggestion: Take logs of the geometric means. Using this result, prove (6.74). Using (6.74), prove that
\[ e = \lim \frac{n}{(n!)^{1/n}}. \]
3. Here is a generalization of Cauchy’s arithmetic mean theorem: If $a_1, a_2, a_3, \ldots$ converges to $a$ and $b_1, b_2, b_3, \ldots$ converges to $b$, then the sequence
\[ \frac{1}{n} \big( a_1 b_n + a_2 b_{n-1} + \cdots + a_{n-1} b_2 + a_n b_1 \big) \]
converges to $ab$.
CHAPTER 7

More on the infinite: Products and partial fractions

Reason’s last step is the recognition that there are an infinite number of
things which are beyond it.
Blaise Pascal (1623–1662), Pensees. 1670.
We already met François Viète’s infinite product expression for π in Sections
4.10 and 5.1. This chapter is devoted entirely to the theory and application of
infinite products, and as a consolation prize we also talk about partial fractions.
In Sections 7.1 and 7.2 we present the basics of infinite products. Hold on to your
seats, because the rest of the chapter is full of surprises!
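We begin with the following “Viète-type” formula for log 2, which is due to Philipp Ludwig von Seidel (1821–1896):
\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots. \]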

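In Section 7.3, we give another proof of Euler’s famous sine formula:
\[ \sin \pi z = \pi z \Big( 1 - \frac{z^2}{1^2} \Big) \Big( 1 - \frac{z^2}{2^2} \Big) \Big( 1 - \frac{z^2}{3^2} \Big) \Big( 1 - \frac{z^2}{4^2} \Big) \Big( 1 - \frac{z^2}{5^2} \Big) \cdots. \]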
In Section 7.4, we look at partial fraction expansions of the trig functions. Recall that if $p(z)$ is a polynomial with distinct roots $r_1, \ldots, r_n$, then we can factor $p(z)$ as $p(z) = a(z - r_1)(z - r_2) \cdots (z - r_n)$, and from elementary calculus, we can write
\[ \frac{1}{p(z)} = \frac{1}{a(z - r_1)(z - r_2) \cdots (z - r_n)} = \frac{a_1}{z - r_1} + \frac{a_2}{z - r_2} + \cdots + \frac{a_n}{z - r_n} \]
for some constants $a_1, \ldots, a_n$. You probably studied this in the “partial fraction method of integration” section in your elementary calculus course. Writing
\[ \sin \pi z = a z (z - 1)(z + 1)(z - 2)(z + 2)(z - 3)(z + 3) \cdots, \]
Euler thought that we should be able to apply the partial fraction decomposition to $1/\sin \pi z$:
\[ \frac{1}{\sin \pi z} = \frac{a_1}{z} + \frac{a_2}{z - 1} + \frac{a_3}{z + 1} + \frac{a_4}{z - 2} + \frac{a_5}{z + 2} + \cdots. \]
In Section 7.4, we’ll prove that, after multiplying through by $\pi$, this can be done with $a_1 = 1$, $a_2 = a_3 = -1$, $a_4 = a_5 = 1$, $a_6 = a_7 = -1$, and so on, with the signs alternating in pairs. That is, we’ll prove that
\[ \frac{\pi}{\sin \pi z} = \frac{1}{z} - \frac{1}{z - 1} - \frac{1}{z + 1} + \frac{1}{z - 2} + \frac{1}{z + 2} - \frac{1}{z - 3} - \frac{1}{z + 3} + \cdots. \]
Combining the adjacent terms, $\frac{1}{z - n} + \frac{1}{z + n} = -\frac{2z}{n^2 - z^2}$, we get Euler’s celebrated partial fraction expansion for sine:
\[ \frac{\pi}{\sin \pi z} = \frac{1}{z} + \sum_{n=1}^\infty (-1)^{n-1} \frac{2z}{n^2 - z^2}. \tag{7.1} \]

We’ll also derive partial fraction expansions for the other trig functions. In Section 7.5, we give more proofs of Euler’s sum for $\pi^2/6$ using the infinite products and partial fractions we found in Sections 7.3 and 7.4. In Section 7.6, we prove one of the most famous formulas for the Riemann zeta function, namely writing it as an infinite product involving only the prime numbers:
\[ \zeta(z) = \frac{2^z}{2^z - 1} \cdot \frac{3^z}{3^z - 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z - 1} \cdot \frac{11^z}{11^z - 1} \cdots. \]
In particular, setting $z = 2$, we get the following expression for $\pi^2/6$:
\[ \frac{\pi^2}{6} = \prod_{p \text{ prime}} \frac{p^2}{p^2 - 1} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdots. \]
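To get a feel for the prime product for $\pi^2/6$, here is a minimal numerical sketch. The helper names `primes_up_to` and `euler_product_at_2` are illustrative; the primes come from a plain sieve of Eratosthenes, and the product is truncated at a finite bound, so it only approximates the infinite product (from below, since every factor exceeds 1).

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]

def euler_product_at_2(limit):
    """Product of p^2 / (p^2 - 1) over the primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= p * p / (p * p - 1)
    return product

for limit in (10, 1_000, 100_000):
    print(limit, euler_product_at_2(limit), math.pi ** 2 / 6)
```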
As a bonus prize, we see how $\pi$ is related to questions from probability. Finally, in Section 5.3, we derive some awe-inspiring beautiful formulas (too many to list at this moment!). Here are a couple of my favorite formulas of all time:
\[ \frac{\pi}{4} = \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{7}{8} \cdot \frac{11}{12} \cdot \frac{13}{12} \cdot \frac{17}{16} \cdot \frac{19}{20} \cdot \frac{23}{24} \cdots. \]
The numerators of the fractions on the right are the odd prime numbers and the denominators are even numbers divisible by four and differing from the numerators by one. The next one is also a beaut:
\[ \frac{\pi}{2} = \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdot \frac{19}{18} \cdot \frac{23}{22} \cdots. \]
The numerators of the fractions are the odd prime numbers and the denominators are even numbers not divisible by four and differing from the numerators by one.
Chapter 7 objectives: The student will be able to . . .
• determine the (absolute) convergence for an infinite product.
• explain the infinite products and partial fraction expansions of the trig functions.
• describe Euler’s formulæ for powers of π and their relationship to Riemann’s
zeta function.

7.1. Introduction to infinite products


We start our journey through infinite products taking careful steps to define
what these phenomenal products are.

7.1.1. Basic definitions and examples. Let $\{b_n\}$ be a sequence of complex numbers. An infinite product
\[ \prod_{n=1}^\infty b_n = b_1 \cdot b_2 \cdot b_3 \cdots \]
is said to converge if there exists an $m \in \mathbb{N}$ such that the $b_n$’s are nonzero for all $n \ge m$, and the limit of the partial products $\prod_{k=m}^n b_k = b_m \cdot b_{m+1} \cdots b_n$,
\[ \lim_{n\to\infty} \prod_{k=m}^n b_k = \lim_{n\to\infty} \big( b_m \cdot b_{m+1} \cdots b_n \big), \tag{7.2} \]
exists and is a nonzero complex value, say $p$. In this case, we define
\[ \prod_{n=1}^\infty b_n := b_1 \cdot b_2 \cdots b_{m-1} \cdot p. \]
This definition is of course independent of the $m$ chosen such that the $b_n$’s are nonzero for all $n \ge m$. The infinite product $\prod_{n=1}^\infty b_n$ diverges if it doesn’t converge; that is, either there are infinitely many zero $b_n$’s, or the limit (7.2) does not exist, or the limit (7.2) is zero. In this last case, we say that the infinite product diverges to zero. Just as sequences and series can start at any integer, products can also start at any integer: $\prod_{n=k}^\infty b_n$, with straightforward modifications of the definition.
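Example 7.1. The “harmonic product” $\prod_{n=2}^\infty (1 - 1/n)$ diverges to zero because
\[ \prod_{k=2}^n \Big( 1 - \frac{1}{k} \Big) = \Big( 1 - \frac{1}{2} \Big) \cdots \Big( 1 - \frac{1}{n} \Big) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{n-1}{n} = \frac{1}{n} \to 0. \]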
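Example 7.2. On the other hand, the product $\prod_{n=2}^\infty (1 - 1/n^2)$ converges because
\begin{align*}
\prod_{k=2}^n \Big( 1 - \frac{1}{k^2} \Big) &= \prod_{k=2}^n \frac{k^2 - 1}{k^2} = \prod_{k=2}^n \frac{(k-1)(k+1)}{k \cdot k} \\
&= \frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{2 \cdot 4}{3 \cdot 3} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \frac{4 \cdot 6}{5 \cdot 5} \cdots \frac{(n-1)(n+1)}{n \cdot n} = \frac{n+1}{2n} \to \frac{1}{2} \ne 0.
\end{align*}
Therefore,
\[ \prod_{n=2}^\infty \Big( 1 - \frac{1}{n^2} \Big) = \frac{1}{2}. \]
Note that the infinite product $\prod_{n=1}^\infty (1 - 1/n^2)$ also converges, but in this case,
\[ \prod_{n=1}^\infty \Big( 1 - \frac{1}{n^2} \Big) := \Big( 1 - \frac{1}{1^2} \Big) \cdot \lim_{n\to\infty} \prod_{k=2}^n \Big( 1 - \frac{1}{k^2} \Big) = 0 \cdot \frac{1}{2} = 0. \]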
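The telescoping in Examples 7.1 and 7.2 can be confirmed with exact rational arithmetic. This is a small illustrative sketch (the function names are not from the text); it reproduces the closed forms $1/n$ and $(n+1)/(2n)$ for the partial products.

```python
from fractions import Fraction

def partial_product(factor, start, n):
    """Exact partial product factor(start) * factor(start + 1) * ... * factor(n)."""
    result = Fraction(1)
    for k in range(start, n + 1):
        result *= factor(k)
    return result

def harmonic_factor(k):       # factors of Example 7.1
    return 1 - Fraction(1, k)

def square_factor(k):         # factors of Example 7.2
    return 1 - Fraction(1, k * k)

for n in (2, 10, 100):
    print(n, partial_product(harmonic_factor, 2, n), partial_product(square_factor, 2, n))
```

The first column of products tends to 0 (divergence to zero), while the second tends to 1/2.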

Proposition 7.1. If an infinite product converges, then its factors tend to one.
Also, a convergent infinite product has the value 0 if and only if it has a zero factor.
Proof. The second statement is automatic from the definition of convergence. If none of the $b_n$’s vanish for $n \ge m$ and $p_n = b_m \cdot b_{m+1} \cdots b_n$, then $p_n \to p$, a nonzero number, so
\[ b_n = \frac{b_m \cdot b_{m+1} \cdots b_{n-1} \cdot b_n}{b_m \cdot b_{m+1} \cdots b_{n-1}} = \frac{p_n}{p_{n-1}} \to \frac{p}{p} = 1. \]
□

Because the factors of a convergent infinite product always tend to one, we henceforth write $b_n$ as $1 + a_n$, so the infinite product takes the form
\[ \prod (1 + a_n); \]
if this infinite product converges, then $a_n \to 0$.

7.1.2. Infinite products and series: the nonnegative case. The following theorem states that the convergence of an infinite product $\prod (1 + a_n)$ with all the $a_n$’s real and nonnegative is completely determined by the infinite series $\sum a_n$.

Theorem 7.2. An infinite product $\prod (1 + a_n)$ with nonnegative terms $a_n$ converges if and only if the series $\sum a_n$ converges.
Proof. Let the partial products and partial sums be denoted by
\[ p_n = \prod_{k=1}^n (1 + a_k) \quad \text{and} \quad s_n = \sum_{k=1}^n a_k. \]
Since all the $a_k$’s are nonnegative, both sequences $\{p_n\}$ and $\{s_n\}$ are nondecreasing, so each converges if and only if it is bounded. Since $1 \le 1 + x \le e^x$ for any nonnegative real number $x$ (see Theorem 4.29), it follows that
\[ 1 \le p_n = \prod_{k=1}^n (1 + a_k) \le \prod_{k=1}^n e^{a_k} = e^{\sum_{k=1}^n a_k} = e^{s_n}. \]
This inequality shows that if the sequence $\{s_n\}$ is bounded, then the sequence $\{p_n\}$ is also bounded, and hence converges. Its limit must be $\ge 1$, so in particular, is not zero. On the other hand,
\[ p_n = (1 + a_1)(1 + a_2) \cdots (1 + a_n) \ge 1 + a_1 + a_2 + \cdots + a_n = 1 + s_n, \]
since the left-hand side, when multiplied out, contains the sum $1 + a_1 + a_2 + \cdots + a_n$ (and a lot of other nonnegative terms too). This shows that if the sequence $\{p_n\}$ is bounded, then the sequence $\{s_n\}$ is also bounded. □

See Problem 4 for the case when the terms an are negative.
Example 7.3. Thus, as a consequence of this theorem, the product
\[ \prod \Big( 1 + \frac{1}{n^p} \Big) \]
converges for $p > 1$ and diverges for $p \le 1$.
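The two-sided bound $1 + s_n \le p_n \le e^{s_n}$ from the proof of Theorem 7.2 is easy to observe numerically for $a_k = 1/k^p$. The sketch below uses an illustrative helper `product_and_sum`; for $p = 2$ all three quantities stay bounded, while for $p = 1$ the product equals $n + 1$ exactly and grows without bound along with the harmonic sums.

```python
import math

def product_and_sum(p, n):
    """Return (p_n, s_n) for a_k = 1/k^p, k = 1, ..., n."""
    prod, total = 1.0, 0.0
    for k in range(1, n + 1):
        a_k = 1.0 / k ** p
        prod *= 1.0 + a_k
        total += a_k
    return prod, total

for p in (2.0, 1.0):
    for n in (10, 1_000, 100_000):
        prod, total = product_and_sum(p, n)
        # Columns: p, n, lower bound 1 + s_n, partial product p_n, upper bound e^{s_n}
        print(p, n, 1 + total, prod, math.exp(total))
```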
7.1.3. Infinite products for log 2 and e. I found the following gem in [205]. Define a sequence $\{e_n\}$ by $e_1 = 1$ and $e_{n+1} = (n + 1)(e_n + 1)$ for $n = 1, 2, 3, \ldots$; e.g.
\[ e_1 = 1, \quad e_2 = 4, \quad e_3 = 15, \quad e_4 = 64, \quad e_5 = 325, \quad e_6 = 1956, \ \ldots. \]
Then
\[ e = \prod_{n=1}^\infty \frac{e_n + 1}{e_n} = \frac{2}{1} \cdot \frac{5}{4} \cdot \frac{16}{15} \cdot \frac{65}{64} \cdot \frac{326}{325} \cdot \frac{1957}{1956} \cdots. \tag{7.3} \]

You will be asked to prove this in Problem 6.
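Before attempting the proof, formula (7.3) can be explored with exact arithmetic. In this sketch (function names are illustrative) the partial products turn out to equal the partial sums $\sum_{k=0}^n 1/k!$ exactly, which is the mechanism behind Problem 6.

```python
from fractions import Fraction
from math import e, factorial

def e_sequence(n):
    """First n terms of the sequence e_1 = 1, e_{k+1} = (k + 1)(e_k + 1)."""
    terms = []
    e_k = 1
    for k in range(1, n + 1):
        terms.append(e_k)
        e_k = (k + 1) * (e_k + 1)
    return terms

def partial_product(n):
    """Exact value of the product of (e_k + 1) / e_k for k = 1, ..., n."""
    result = Fraction(1)
    for e_k in e_sequence(n):
        result *= Fraction(e_k + 1, e_k)
    return result

print(e_sequence(6))
print(float(partial_product(15)), e)
```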


We now prove Philipp Ludwig von Seidel’s (1821–1896) formula for log 2:
\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots. \]
To prove this, we follow the proof of Viète’s formula in Section 5.1.1 using hyperbolic functions instead of trigonometric functions. Let $x \in \mathbb{R}$ be nonzero. Then dividing the identity $\sinh x = 2 \cosh(x/2) \sinh(x/2)$ (see Problem 8 in Exercises 4.7) by $x$, we get
\[ \frac{\sinh x}{x} = \cosh(x/2) \cdot \frac{\sinh(x/2)}{x/2}. \]
Replacing $x$ with $x/2$, we get $\sinh(x/2)/(x/2) = \cosh(x/2^2) \cdot \sinh(x/2^2)/(x/2^2)$, therefore
\[ \frac{\sinh x}{x} = \cosh(x/2) \cdot \cosh(x/2^2) \cdot \frac{\sinh(x/2^2)}{x/2^2}. \]
Continuing by induction, we obtain for any $n \in \mathbb{N}$,
\[ \frac{\sinh x}{x} = \prod_{k=1}^n \cosh(x/2^k) \cdot \frac{\sinh(x/2^n)}{x/2^n}. \]
Since $\lim_{z \to 0} \frac{\sinh z}{z} = 1$ (why?), we have $\lim_{n\to\infty} \frac{\sinh(x/2^n)}{x/2^n} = 1$, so taking $n \to \infty$, it follows that
\[ \frac{x}{\sinh x} = \lim_{n\to\infty} \prod_{k=1}^n \frac{1}{\cosh(x/2^k)}. \tag{7.4} \]

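Now let us put $x = \log \theta$, that is, $\theta = e^x$, into the equation (7.4). To this end, observe that
\[ \sinh x = \frac{e^x - e^{-x}}{2} = \frac{\theta - \theta^{-1}}{2} = \frac{\theta^2 - 1}{2\theta} \implies \frac{x}{\sinh x} = \frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} \]
and
\[ \cosh(x/2^k) = \frac{e^{x/2^k} + e^{-x/2^k}}{2} = \frac{\theta^{1/2^k} + \theta^{-1/2^k}}{2} = \frac{\theta^{1/2^{k-1}} + 1}{2\,\theta^{1/2^k}} \implies \frac{1}{\cosh(x/2^k)} = \frac{2\,\theta^{1/2^k}}{\theta^{1/2^{k-1}} + 1}. \]
Thus,
\begin{align*}
\frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} &= \lim_{n\to\infty} \prod_{k=1}^n \frac{2\,\theta^{1/2^k}}{\theta^{1/2^{k-1}} + 1} = \lim_{n\to\infty} \left( \prod_{k=1}^n \theta^{1/2^k} \cdot \prod_{k=1}^n \frac{2}{\theta^{1/2^{k-1}} + 1} \right) \\
&= \lim_{n\to\infty} \left( \theta^{\sum_{k=1}^n 1/2^k} \cdot \prod_{k=0}^{n-1} \frac{2}{\theta^{1/2^k} + 1} \right).
\end{align*}
Since $\lim_{n\to\infty} \sum_{k=1}^n \frac{1}{2^k} = 1$ (this is just the geometric series $\sum_{k=1}^\infty \frac{1}{2^k}$), we see that
\[ \frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} = \theta \cdot \lim_{n\to\infty} \prod_{k=0}^{n-1} \frac{2}{\theta^{1/2^k} + 1} = \theta \cdot \frac{2}{\theta + 1} \cdot \lim_{n\to\infty} \prod_{k=1}^{n-1} \frac{2}{\theta^{1/2^k} + 1}. \]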
Cancelling like terms, we have, by definition of infinite products, the following beautiful infinite product expansion for $\frac{\log \theta}{\theta - 1}$:
\[ \frac{\log \theta}{\theta - 1} = \prod_{k=1}^\infty \frac{2}{1 + \theta^{1/2^k}} = \frac{2}{1 + \sqrt{\theta}} \cdot \frac{2}{1 + \sqrt{\sqrt{\theta}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\theta}}}} \cdots \qquad \text{(Seidel’s formula).} \]
In particular, taking $\theta = 2$, we get Seidel’s infinite product formula for log 2.
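Seidel’s formula converges quickly, since each new factor halves the exponent. Here is a minimal numerical sketch (the name `seidel_product` is an illustrative choice) that builds the nested roots $\theta^{1/2^k}$ by repeated square roots and compares the truncated product with $\log\theta/(\theta-1)$.

```python
import math

def seidel_product(theta, num_factors):
    """Product of 2 / (1 + theta^(1/2^k)) for k = 1, ..., num_factors (theta > 0)."""
    root = theta
    product = 1.0
    for _ in range(num_factors):
        root = math.sqrt(root)          # root is now theta^(1/2^k)
        product *= 2.0 / (1.0 + root)
    return product

for n in (1, 5, 10, 40):
    print(n, seidel_product(2.0, n), math.log(2))
```

The same function can be used for other $\theta > 0$, $\theta \ne 1$, for instance `seidel_product(10.0, 40)` against `math.log(10) / 9`.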


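Exercises 7.1.
1. Prove that
(a) $\displaystyle \prod_{n=2}^\infty \Big( 1 + \frac{1}{n^2 - 1} \Big) = 2$, (b) $\displaystyle \prod_{n=3}^\infty \Big( 1 - \frac{2}{n(n-1)} \Big) = \frac{1}{3}$,
(c) $\displaystyle \prod_{n=2}^\infty \Big( 1 + \frac{2}{n^2 + n - 2} \Big) = 3$, (d) $\displaystyle \prod_{n=2}^\infty \Big( 1 + \frac{(-1)^n}{n} \Big) = 1$.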
2. Prove that for any $z \in \mathbb{C}$ with $|z| < 1$,
\[ \prod_{n=0}^\infty \Big( 1 + z^{2^n} \Big) = \frac{1}{1 - z}. \]
Conclude that $\prod_{n=0}^\infty \Big( 1 + \big(\tfrac{1}{2}\big)^{2^n} \Big) = 2$. Suggestion: Derive, e.g. by induction, a formula for $p_n = \prod_{k=0}^n (1 + z^{2^k})$ as a geometric sum as in Problem 3e in Exercises 2.2.
3. Determine convergence for:
(a) $\displaystyle \prod_{n=1}^\infty \Big( 1 + \sin^2 \frac{1}{n} \Big)$, (b) $\displaystyle \prod_{n=1}^\infty \Big( 1 + \Big( \frac{n x^2}{1 + n} \Big)^n \Big)$, (c) $\displaystyle \prod_{n=1}^\infty \Big( \frac{1 + x^2 + x^{2n}}{1 + x^{2n}} \Big)$,
where for (b) and (c), state for which $x \in \mathbb{R}$ the products converge and diverge.
4. In this problem, we prove that an infinite product $\prod (1 - a_n)$ with $0 \le a_n < 1$ converges if and only if the series $\sum a_n$ converges.
(i) Let $p_n = \prod_{k=1}^n (1 - a_k)$ and $s_n = \sum_{k=1}^n a_k$. Show that $p_n \le e^{-s_n}$. Conclude that if $\sum a_n$ diverges, then $\prod (1 - a_n)$ also diverges (in this case, diverges to zero).
(ii) Suppose now that $\sum a_n$ converges. Then we can choose $m$ such that $a_m + a_{m+1} + \cdots < 1/2$. Prove by induction that
\[ (1 - a_m)(1 - a_{m+1}) \cdots (1 - a_n) \ge 1 - (a_m + a_{m+1} + \cdots + a_n) \]
for $n = m, m + 1, m + 2, \ldots$. Conclude that $p_n/p_m \ge 1/2$ for all $n \ge m$, and from this, prove that $\prod (1 - a_n)$ converges.
(iii) For what $p$ is $\prod_{n=2}^\infty \Big( 1 - \frac{1}{n^p} \Big)$ convergent and divergent?
5. In this problem we derive relationships between series and products. Let $\{a_n\}$ be a sequence of complex numbers with $a_n \ne 0$ for all $n$.
(a) Prove that for $n \ge 2$,
\[ \prod_{k=1}^n (1 + a_k) = 1 + a_1 + \sum_{k=2}^n (1 + a_1) \cdots (1 + a_{k-1}) a_k. \]
Thus, $\prod_{n=1}^\infty (1 + a_n)$ converges if and only if $1 + a_1 + \sum_{k=2}^\infty (1 + a_1) \cdots (1 + a_{k-1}) a_k$ converges to a nonzero value, in which case they have the same value.
(b) Assume that $a_1 + \cdots + a_k \ne 0$ for every $k$. Prove that for $n \ge 2$,
\[ \sum_{k=1}^n a_k = a_1 \prod_{k=2}^n \Big( 1 + \frac{a_k}{a_1 + a_2 + \cdots + a_{k-1}} \Big). \]
Thus, $\sum_{n=1}^\infty a_n$ converges if and only if $a_1 \prod_{n=2}^\infty \Big( 1 + \frac{a_n}{a_1 + a_2 + \cdots + a_{n-1}} \Big)$ either converges or diverges to zero, in which case they have the same value.
(c) Using (b) and the sum $\sum_{n=1}^\infty \frac{1}{(n + a - 1)(n + a)} = \frac{1}{a}$ from (3.38), prove that
\[ \prod_{n=2}^\infty \Big( 1 + \frac{a}{(n + a)(n - 1)} \Big) = a + 1. \]

6. In this problem we prove (7.3).
(i) Let $s_n = \sum_{k=0}^n \frac{1}{k!}$. Prove that $e_n = n!\, s_{n-1}$ for $n = 1, 2, \ldots$.
(ii) Show that $s_n / s_{n-1} = (e_n + 1)/e_n$.
(iii) Show that $s_n = \prod_{k=1}^n \frac{e_k + 1}{e_k}$ and then complete the proof. Suggestion: Note that we can write $s_n = (s_1/s_0) \cdot (s_2/s_1) \cdots (s_n/s_{n-1})$.

7.2. Absolute convergence for infinite products


Way back in Section 3.6 we introduced absolute convergence for infinite series
and since then we have experienced how incredibly useful this notion is. In this
section we continue our study of the basic properties of infinite products by intro-
ducing the notion of absolute convergence for infinite products. We also present a
general convergence test that is able to test the convergence of any infinite product
in terms of a corresponding series of logarithms.

7.2.1. Absolute convergence for infinite products. An infinite product $\prod (1 + a_n)$ is said to converge absolutely if $\prod (1 + |a_n|)$ converges. By Theorem 7.2, $\prod (1 + |a_n|)$ converges if and only if $\sum |a_n|$ converges. Therefore, $\prod (1 + a_n)$ converges absolutely if and only if the infinite series $\sum a_n$ converges absolutely. We know that if an infinite series is absolutely convergent, then the series itself converges; does the same hold for infinite products? The answer is yes, but before proving this we first need the following lemma.
Lemma 7.3. Let $\{p_k\}_{k=m}^\infty$, where $m \in \mathbb{N}$, be a sequence of complex numbers.
(a) $\{p_k\}$ converges if and only if the infinite series $\sum_{k=m+1}^\infty (p_k - p_{k-1})$ converges, in which case
\[ \lim_{k\to\infty} p_k = p_m + \sum_{k=m+1}^\infty (p_k - p_{k-1}). \]
(b) If $\{a_j\}_{j=m}^\infty$ is a sequence of complex numbers and $p_k = \prod_{j=m}^k (1 + a_j)$, then
\[ |p_k - p_{k-1}| \le |a_k| \, e^{\sum_{j=m}^{k-1} |a_j|}. \]
Proof. The identity in (a) is reminiscent of the telescoping series theorem, Theorem 3.24, and in fact can be derived from it, but let us prove (a) directly. To do so, we note that for $k \ge m$, we have
\[ p_k = p_m + \sum_{j=m+1}^k (p_j - p_{j-1}), \tag{7.5} \]
since the sum on the right telescopes. It follows that the limit $\lim p_k$ exists if and only if the limit $\lim_{k\to\infty} \sum_{j=m+1}^k (p_j - p_{j-1})$ exists; in other words, if and only if the infinite series $\sum_{j=m+1}^\infty (p_j - p_{j-1})$ converges. In case of convergence, the limit equality in (a) follows from taking $k \to \infty$ in (7.5).
To prove (b), observe that
\begin{align*}
p_k - p_{k-1} &= \prod_{j=m}^k (1 + a_j) - \prod_{j=m}^{k-1} (1 + a_j) \\
&= (1 + a_k) \prod_{j=m}^{k-1} (1 + a_j) - \prod_{j=m}^{k-1} (1 + a_j) \\
&= a_k \prod_{j=m}^{k-1} (1 + a_j).
\end{align*}
Therefore, $|p_k - p_{k-1}| \le |a_k| \prod_{j=m}^{k-1} (1 + |a_j|)$. Since $1 + x \le e^x$ for all real numbers $x$, we have
\[ |p_k - p_{k-1}| \le |a_k| \prod_{j=m}^{k-1} e^{|a_j|} = |a_k| \, e^{\sum_{j=m}^{k-1} |a_j|}, \]
just as we wanted. □

Theorem 7.4. Any absolutely convergent infinite product converges.

Proof. Let $\prod (1 + a_n)$ be absolutely convergent, which is equivalent to the series $\sum |a_n|$ converging; we need to prove that $\prod (1 + a_n)$ converges in the usual sense. Since $\sum |a_n|$ converges, by the Cauchy criterion for series we can choose $m$ such that $\sum_{n=m}^\infty |a_n| < \frac{1}{2}$. In particular, $|a_k| < 1$ for $k \ge m$, so $1 + a_k$ is nonzero for $k \ge m$. For $n \ge m$, let $p_n = \prod_{k=m}^n (1 + a_k)$. From Lemma 7.3, we know that $\lim p_n$ exists if and only if the infinite series $\sum_{k=m+1}^\infty (p_k - p_{k-1})$ converges. To prove that this series converges, note that by (b) in Lemma 7.3, for $k > m$ we have
\[ |p_k - p_{k-1}| \le |a_k| \, e^{\sum_{j=m}^{k-1} |a_j|} \implies |p_k - p_{k-1}| \le C |a_k|, \]
with $C = e^{1/2}$, recalling that $\sum_{j=m}^\infty |a_j| < \frac{1}{2}$. In particular, since $\sum |a_k|$ converges, by the comparison test, the series $\sum_{k=m+1}^\infty |p_k - p_{k-1}|$ converges and hence $\sum_{k=m+1}^\infty (p_k - p_{k-1})$ also converges. This shows that $\lim p_n$ exists.
We now prove that $\lim p_n \ne 0$. To this end, we claim that for each $n \ge m$, we have
\[ |p_n| = \prod_{k=m}^n |1 + a_k| \ge 1 - \sum_{k=m}^n |a_k|. \tag{7.6} \]
We prove (7.6) by induction on $n = m, m + 1, m + 2, \ldots$. To check the base case, $|1 + a_m| \ge 1 - |a_m|$, observe that for any complex number $z$,
\[ 1 = |1 + z - z| \le |1 + z| + |z| \implies |1 + z| \ge 1 - |z|, \tag{7.7} \]
which in particular proves the base case. Assume that our result holds for $n \ge m$; we prove it for $n + 1$. Observe that
\begin{align*}
\prod_{k=m}^{n+1} |1 + a_k| &= \prod_{k=m}^n |1 + a_k| \cdot |1 + a_{n+1}| \\
&\ge \Big( 1 - \sum_{k=m}^n |a_k| \Big) \big( 1 - |a_{n+1}| \big) \quad \text{(induction hypothesis and (7.7))} \\
&= 1 - \sum_{k=m}^n |a_k| - |a_{n+1}| + \sum_{k=m}^n |a_k| \, |a_{n+1}| \\
&\ge 1 - \sum_{k=m}^n |a_k| - |a_{n+1}|,
\end{align*}
which is exactly the $n + 1$ case. Since $\sum_{k=m}^\infty |a_k| < \frac{1}{2}$, the inequality (7.6) gives $|p_n| > \frac{1}{2}$ for all $n \ge m$, and therefore $\lim p_n \ne 0$. □

Just as for infinite series, the converse of this theorem is not true. For example, the infinite product $\prod_{n=2}^\infty \Big( 1 + \frac{(-1)^n}{n} \Big)$ converges (and equals 1; see Problem 1 in Exercises 7.1), but this product is not absolutely convergent.
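This example is easy to examine with exact arithmetic. In the sketch below (function names are illustrative) the partial products of $\prod_{n \ge 2} (1 + (-1)^n/n)$ alternate between $1$ (odd upper limit) and $(N+1)/N$ (even upper limit $N$), so they tend to 1, while the sums of $|a_n| = 1/n$ grow without bound.

```python
from fractions import Fraction

def partial_product(n):
    """Exact value of the product of (1 + (-1)^k / k) for k = 2, ..., n."""
    result = Fraction(1)
    for k in range(2, n + 1):
        result *= 1 + Fraction((-1) ** k, k)
    return result

def abs_partial_sum(n):
    """Sum of |(-1)^k / k| for k = 2, ..., n, a piece of the harmonic series."""
    return sum(Fraction(1, k) for k in range(2, n + 1))

for n in (2, 3, 10, 11, 100, 101):
    print(n, partial_product(n), float(abs_partial_sum(n)))
```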

7.2.2. Infinite products and series: the general case. For nonnegative real numbers $\{a_n\}$, in Theorem 7.2, we showed that the product $\prod (1 + a_n)$ converges if and only if the series $\sum a_n$ converges. In the general case of a complex sequence $\{a_n\}$, in Theorem 7.4 we showed that the infinite product $\prod (1 + a_n)$ still converges if the series $\sum |a_n|$ converges. In the general complex case, is there an “if and only if” theorem relating convergence of an infinite product to the convergence of a corresponding infinite series? We now give one such theorem where the series is a series of logarithms. Moreover, we also get a formula for the product in terms of the sum of the infinite series.

Theorem 7.5. An infinite product $\prod (1 + a_n)$ converges if and only if $a_n \to 0$ and the series
\[ \sum_{n=m+1}^\infty \operatorname{Log}(1 + a_n), \]
starting from a suitable index $m + 1$, converges. Moreover, if $L$ is the sum of the series, then
\[ \prod (1 + a_n) = (1 + a_1) \cdots (1 + a_m) \, e^L. \]

Proof. First of all, we remark that the statement “starting from a suitable index $m + 1$” concerning the sum of logarithms is needed because we need to make sure the sum starts sufficiently high so that none of the terms $1 + a_n$ is zero (otherwise $\operatorname{Log}(1 + a_n)$ is undefined). By Proposition 7.1, in order for the product $\prod (1 + a_n)$ to converge, we at least need $a_n \to 0$. Thus, we may assume that $a_n \to 0$; in particular we can fix $m$ such that $n > m$ implies $|a_n| < 1$.
Let $b_n = 1 + a_n$. We shall prove that the infinite product $\prod b_n$ converges if and only if the series
\[ \sum_{n=m+1}^\infty \operatorname{Log} b_n \]
converges, and if $L$ is the sum of the series, then
\[ \prod b_n = b_1 \cdots b_m \, e^L. \tag{7.8} \]
For $n > m$, let the partial products and partial sums be denoted by
\[ p_n = \prod_{k=m+1}^n b_k \quad \text{and} \quad s_n = \sum_{k=m+1}^n \operatorname{Log} b_k. \]
Since $\exp(\operatorname{Log} z) = z$ for any nonzero complex number $z$, it follows that
\[ \exp(s_n) = p_n. \tag{7.9} \]
Thus, if the sum $s_n$ converges to a value $L$, this equation shows that $p_n$ converges to $e^L$, which is nonzero, and also proves the formula (7.8).
Conversely, suppose that $\{p_n\}$ converges to a nonzero complex number $p$. We shall prove that $\{s_n\}$ also converges; once this is established, the formula (7.8) follows from (7.9). Note that replacing $b_{m+1}$ by $b_{m+1}/p$, we may assume that $p = 1$. For $n > m$, we can write $p_n = \exp(\operatorname{Log} p_n)$, so the formula (7.9) implies that for $n > m$,
\[ s_n = \operatorname{Log} p_n + 2\pi i k_n \]
for some integer $k_n$. Moreover, since
\[ s_n - s_{n-1} = \left( \sum_{k=m+1}^n \operatorname{Log} b_k \right) - \left( \sum_{k=m+1}^{n-1} \operatorname{Log} b_k \right) = \operatorname{Log} b_n, \]
and $b_n \to 1$ (since $a_n \to 0$), it follows that as $n \to \infty$,
\[ \operatorname{Log} p_n - \operatorname{Log} p_{n-1} + 2\pi i (k_n - k_{n-1}) = s_n - s_{n-1} \to \operatorname{Log} 1 = 0. \]
By assumption $p_n \to 1$, so $\operatorname{Log} p_n - \operatorname{Log} p_{n-1} \to 0$ and we must have $k_n - k_{n-1} \to 0$. Now $k_n - k_{n-1}$ is an integer, so it can approach $0$ only if $k_n$ and $k_{n-1}$ are the same integer, say $k$, for all $n$ sufficiently large. It follows that
\[ s_n = \operatorname{Log} p_n + 2\pi i k_n \to \operatorname{Log} 1 + 2\pi i k = 2\pi i k, \]
which shows that $\{s_n\}$ converges. □
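The relation $\exp(s_n) = p_n$ at the heart of this proof can be watched numerically. The sketch below is only an illustration: the sequence $a_k = i/k^2$ is an arbitrary choice for which no factor vanishes (so one may take $m = 0$), and `cmath.log` is Python’s principal branch of the complex logarithm.

```python
import cmath

def product_and_log_sum(a, n):
    """Return (product of (1 + a(k)), sum of Log(1 + a(k))) for k = 1, ..., n."""
    product = 1 + 0j
    log_sum = 0j
    for k in range(1, n + 1):
        factor = 1 + a(k)
        product *= factor
        log_sum += cmath.log(factor)   # principal logarithm
    return product, log_sum

def a(k):
    return 1j / k ** 2   # illustrative sequence with a(k) -> 0

for n in (1, 10, 1_000):
    product, log_sum = product_and_log_sum(a, n)
    print(n, product, cmath.exp(log_sum))
```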
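Exercises 7.2.
1. For what $z \in \mathbb{C}$ are the following products absolutely convergent?
(a) $\displaystyle \prod_{n=1}^\infty \big( 1 + z^n \big)$, (b) $\displaystyle \prod_{n=1}^\infty \Big( 1 + \Big( \frac{n z}{1 + n} \Big)^n \Big)$,
(c) $\displaystyle \prod_{n=1}^\infty \Big( 1 + \sin^2 \frac{z}{n} \Big)$, (d) $\displaystyle \prod_{n=2}^\infty \Big( 1 + \frac{z^n}{n \log n} \Big)$, (e) $\displaystyle \prod_{n=1}^\infty \frac{\sin(z/n)}{z/n}$.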
2. Here is a nice convergence test: Suppose that $\sum a_n^2$ converges. Then $\prod (1 + a_n)$ converges if and only if the series $\sum a_n$ converges. You may proceed as follows.
(i) Since $\sum a_n^2$ converges, we know that $a_n \to 0$, so we may henceforth assume that $|a_n|^2 < \frac{1}{2}$ for all $n$. Prove that
\[ \big| \operatorname{Log}(1 + a_n) - a_n \big| \le |a_n|^2. \]
Suggestion: You will need the power series expansion for $\operatorname{Log}(1 + z)$.
(ii) Prove that $\sum \big( \operatorname{Log}(1 + a_n) - a_n \big)$ is absolutely convergent.
(iii) Prove that $\sum a_n$ converges if and only if $\sum \operatorname{Log}(1 + a_n)$ converges and from this, prove the desired result.
(iv) Does the product $\prod_{n=2}^\infty \Big( 1 + \frac{(-1)^n}{n} \Big)$ converge? What about the product
\[ \Big( 1 + \frac{1}{2} \Big) \Big( 1 + \frac{1}{3} \Big) \Big( 1 - \frac{1}{4} \Big) \Big( 1 + \frac{1}{5} \Big) \Big( 1 + \frac{1}{6} \Big) \Big( 1 - \frac{1}{7} \Big) \Big( 1 + \frac{1}{8} \Big) \Big( 1 + \frac{1}{9} \Big) \cdots ? \]
3. Let $\{a_n\}$ be a sequence of real numbers and assume that $\sum a_n$ converges but $\sum a_n^2$ diverges. In this problem we shall prove that $\prod (1 + a_n)$ diverges.
(i) Prove that there is a constant $C > 0$ such that for all $x \in \mathbb{R}$ with $|x| \le 1/2$, we have
\[ x - \log(1 + x) \ge C x^2. \]
(ii) Since $\sum a_n$ converges, we know that $a_n \to 0$, so we may assume that $|a_n| \le 1/2$ for all $n$. Using (i), prove that $\sum \log(1 + a_n)$ diverges and hence, $\prod (1 + a_n)$ diverges.
(iii) Does $\prod \Big( 1 + \frac{(-1)^{n-1}}{\sqrt{n}} \Big)$ converge or diverge?
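4. Using the formulas from Problem 5 in Exercises 6.9, prove that for $|z| < 1$,
\[ \prod_{n=1}^\infty (1 - z^n) = \exp\left( - \sum_{n=1}^\infty \frac{1}{n} \frac{z^n}{1 - z^n} \right), \qquad \prod_{n=1}^\infty (1 + z^n) = \exp\left( \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} \frac{z^n}{1 - z^n} \right). \]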
5. In this problem we prove that $\prod (1 + a_n)$ is absolutely convergent if and only if the series
\[ \sum_{n=m+1}^\infty \operatorname{Log}(1 + a_n), \]
starting from a suitable index $m + 1$, is absolutely convergent. Proceed as follows.
(i) Prove that for any complex number $z$ with $|z| \le 1/2$, we have
\[ \frac{1}{2} |z| \le |\operatorname{Log}(1 + z)| \le \frac{3}{2} |z|. \tag{7.10} \]
Suggestion: Look at the power series expansion for $\frac{\operatorname{Log}(1+z)}{z} - 1$ and, using this power series, prove that for $|z| \le 1/2$, we have $\Big| \frac{\operatorname{Log}(1+z)}{z} - 1 \Big| \le \frac{1}{2}$. Use this inequality to prove (7.10).
(ii) Now use (7.10) to prove the desired result.

7.3. Euler, Tannery, and Wallis: Product expansions galore


The goal of this section is to learn Tannery’s theorem for products and use it to prove Euler’s celebrated formula (5.2) stated in the introduction of this chapter:

Theorem 7.6 (Euler’s product for sine). For any complex $z$, we have
\[ \sin \pi z = \pi z \prod_{n=1}^\infty \Big( 1 - \frac{z^2}{n^2} \Big). \]
We give two proofs of this astounding result. We also prove Wallis’ infinite product expansion for $\pi$. To begin, we first need
7.3.1. Tannery’s theorem for products.

Theorem 7.7 (Tannery’s theorem for infinite products). For each natural number $n$, let $\prod_{k=1}^{m_n} (1 + a_k(n))$ be a finite product where $m_n \to \infty$ as $n \to \infty$. If for each $k$, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^\infty M_k$ of nonnegative real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, then
\[ \lim_{n\to\infty} \prod_{k=1}^{m_n} (1 + a_k(n)) = \prod_{k=1}^\infty \lim_{n\to\infty} (1 + a_k(n)); \]
that is, both sides are well-defined (the limits and products converge) and are equal.
Proof. First of all, we remark that the infinite product on the right converges. Indeed, if we put $a_k := \lim_{n\to\infty} a_k(n)$, which exists by assumption, then taking $n \to \infty$ in the inequality $|a_k(n)| \le M_k$, we have $|a_k| \le M_k$ as well. Therefore, by the comparison test, the series $\sum_{k=1}^\infty a_k$ converges absolutely and hence, by Theorem 7.4, the infinite product $\prod_{k=1}^\infty (1 + a_k)$ converges.
Now to our proof. Since $\sum M_k$ converges, $M_k \to 0$, so we can choose $m \in \mathbb{N}$ such that for all $k \ge m$, we have $M_k < 1$. This implies that $|a_k| < 1$ for $k \ge m$, so $1 + a_k$ is nonzero for $k \ge m$. Put
\[ p(n) = \prod_{k=1}^{m_n} (1 + a_k(n)) \]
and, for $n$ large enough so that $m_n > m$, write
\[ p(n) = q(n) \cdot \prod_{k=m}^{m_n} (1 + a_k(n)), \quad \text{where } q(n) = \prod_{k=1}^{m-1} (1 + a_k(n)). \]
Since $q(n)$ is a finite product, $q(n) \to \prod_{k=1}^{m-1} (1 + a_k)$ as $n \to \infty$; therefore we just have to prove that
\[ \lim_{n\to\infty} \prod_{k=m}^{m_n} (1 + a_k(n)) = \prod_{k=m}^\infty (1 + a_k). \]
Consider the partial products
\[ p_k(n) = \prod_{j=m}^k (1 + a_j(n)) \quad \text{and} \quad p_k = \prod_{j=m}^k (1 + a_j). \]
Since these are finite products and $a_j = \lim_{n\to\infty} a_j(n)$, by the algebra of limits we have $\lim_{n\to\infty} p_k(n) = p_k$. Now observe that
\[ \prod_{j=m}^{m_n} (1 + a_j(n)) = p_{m_n}(n) = p_m(n) + \sum_{k=m+1}^{m_n} \big( p_k(n) - p_{k-1}(n) \big), \]
since the right-hand side telescopes to $p_{m_n}(n)$, and by the limit identity in (a) of Lemma 7.3, we know that
\[ \prod_{j=m}^\infty (1 + a_j) = p_m + \sum_{k=m+1}^\infty (p_k - p_{k-1}), \]
since $\prod_{j=m}^\infty (1 + a_j) := \lim_{k\to\infty} p_k$. Also, by Part (b) of Lemma 7.3, we have
\[ |p_k(n) - p_{k-1}(n)| \le |a_k(n)| \, e^{\sum_{j=m}^{k-1} |a_j(n)|} \le M_k \, e^{\sum_{j=m}^{k-1} M_j} \le C M_k, \]
where $C = e^{\sum_{j=m}^\infty M_j}$. Since $\sum_{k=m+1}^\infty C M_k$ converges, by Tannery’s theorem for series we have
\begin{align*}
\lim_{n\to\infty} \sum_{k=m+1}^{m_n} \big( p_k(n) - p_{k-1}(n) \big) &= \sum_{k=m+1}^\infty \lim_{n\to\infty} \big( p_k(n) - p_{k-1}(n) \big) \\
&= \sum_{k=m+1}^\infty (p_k - p_{k-1}).
\end{align*}
Therefore,
\begin{align*}
\lim_{n\to\infty} \prod_{j=m}^{m_n} (1 + a_j(n)) &= \lim_{n\to\infty} \left( p_m(n) + \sum_{k=m+1}^{m_n} \big( p_k(n) - p_{k-1}(n) \big) \right) \\
&= p_m + \lim_{n\to\infty} \sum_{k=m+1}^{m_n} \big( p_k(n) - p_{k-1}(n) \big) \\
&= p_m + \sum_{k=m+1}^\infty (p_k - p_{k-1}) = \prod_{j=m}^\infty (1 + a_j),
\end{align*}
which is what we wanted to prove. □

See Problem 6 for another (shorter) proof using complex logarithms.

7.3.2. Expansion of sine III. (Cf. [41, p. 294]). Our third proof of Euler’s infinite product for sine is a Tannery’s theorem version of the proof found in Section 5.1 of Chapter 5. To this end, first recall from Lemma 5.1 of that section and the work done in that section, that for any $z \in \mathbb{C}$, we have
\[ \sin z = \lim_{n\to\infty} F_n(z), \]
where $n = 2m + 1$ is odd and
\[ F_n(z) = z \prod_{k=1}^m \Big( 1 - \frac{z^2}{n^2 \tan^2(k\pi/n)} \Big). \]
Thus,
\[ \sin z = \lim_{m\to\infty} \left\{ z \prod_{k=1}^m \Big( 1 - \frac{z^2}{n^2 \tan^2(k\pi/n)} \Big) \right\} = \lim_{m\to\infty} z \prod_{k=1}^m \big( 1 + a_k(m) \big), \]
where $a_k(m) := -\frac{z^2}{n^2 \tan^2(k\pi/n)}$ with $n = 2m + 1$. Second, since $\lim_{z\to 0} \frac{\tan z}{z} = \lim_{z\to 0} \frac{\sin z}{z} \cdot \frac{1}{\cos z} = 1$, we see that
\begin{align*}
\lim_{m\to\infty} a_k(m) &= \lim_{m\to\infty} - \frac{z^2}{(2m+1)^2 \tan^2\big(k\pi/(2m+1)\big)} \\
&= \lim_{m\to\infty} - \frac{z^2}{k^2 \pi^2 \left( \dfrac{\tan\big(k\pi/(2m+1)\big)}{k\pi/(2m+1)} \right)^2} = - \frac{z^2}{k^2 \pi^2}.
\end{align*}

Third, in Lemma 4.55 back in Section 4.10, we proved that


(7.11) x < tan x, for 0 < x < π/2.
In particular, for any z ∈ C, if n = 2m + 1 and 1 ≤ k ≤ m, then

z2 |z|2 |z|2
n2 tan2 (kπ/n) ≤ n2 (kπ)2 /n2 = k 2 π 2 =: Mk .

362 7. MORE ON THE INFINITE: PRODUCTS AND PARTIAL FRACTIONS

Thus, for all k, m, $|a_k(m)| \le M_k$. Finally, since the sum $\sum_{k=1}^{\infty} M_k$ converges, by Tannery’s theorem for infinite products, we have
$$\sin z = \lim_{m\to\infty} z \prod_{k=1}^{m} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \lim_{m\to\infty} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \left(1 - \frac{z^2}{k^2\pi^2}\right).$$
After replacing z by πz, we get Euler’s infinite product expansion for sin πz. This
completes Proof III of Theorem 7.6. In particular, we see that
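$$\pi i \prod_{k=1}^{\infty} \left(1 + \frac{1}{k^2}\right) = \pi i \prod_{k=1}^{\infty} \left(1 - \frac{i^2}{k^2}\right) = \sin \pi i = \frac{e^{-\pi} - e^{\pi}}{2i}.$$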
Thus, we have derived the very pretty formula
$$\frac{e^{\pi} - e^{-\pi}}{2\pi} = \prod_{n=1}^{\infty} \left(1 + \frac{1}{n^2}\right).$$
Recall from Section 7.1 how easy it was to find that $\prod_{n=2}^{\infty} \left(1 - \frac{1}{n^2}\right) = 1/2$, but
replacing −1/n² with +1/n² is a whole different story!
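As a numerical sanity check of the formula just derived, here is a minimal Python sketch. The function name and the choice of truncation points are illustrative assumptions, not part of the text; the partial products approach the closed form slowly, with relative error roughly 1/N after N factors.

```python
import math

def product_one_plus_inverse_squares(num_factors: int) -> float:
    """Partial product prod_{n=1}^{num_factors} (1 + 1/n^2)."""
    product = 1.0
    for n in range(1, num_factors + 1):
        product *= 1.0 + 1.0 / (n * n)
    return product

# Closed form from the text: (e^pi - e^(-pi)) / (2 pi)
closed_form = (math.exp(math.pi) - math.exp(-math.pi)) / (2.0 * math.pi)

for num_factors in (10, 1_000, 100_000):
    print(num_factors, product_one_plus_inverse_squares(num_factors), closed_form)
```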
7.3.3. Expansion of sine IV. Our fourth proof of Euler’s infinite product
for sine is based on the following neat identity involving sines instead of tangents!
Lemma 7.8. If n = 2m + 1 with m ∈ N, then for any z ∈ C,
$$\sin nz = n \sin z \prod_{k=1}^{m} \left(1 - \frac{\sin^2 z}{\sin^2(k\pi/n)}\right).$$
Proof. Lemma 2.26 shows that for each k ∈ N, 2 cos kz is a polynomial in
2 cos z of degree k (with integer coefficients, although this fact is not important for
this lemma). Technically speaking, Lemma 2.26 was proved under the assumption
that z is real, but the proof only used the angle addition formula for cosine, which
holds for complex variables as well. In any case, since 2 cos kz is a polynomial in
2 cos z of degree k, it follows that cos kz is a polynomial in cos z of degree k, say
cos kz = Qk (cos z) where Qk is a polynomial of degree k. In particular,
cos 2kz = Qk (cos 2z) = Qk (1 − 2 sin2 z),
so cos 2kz is a polynomial of degree k in sin2 z. Now using the addition formulas
for sine, we get, for each k ∈ N,
(7.12) sin(2k + 1)z − sin(2k − 1)z = 2 sin z · cos(2kz) = 2 sin z · Qk (1 − 2 sin2 z).
We claim that for any m = 0, 1, 2, . . ., we have
(7.13) sin(2m + 1)z = sin z · Pm (sin2 z),
where Pm is a polynomial of degree m. For example, if m = 0, then sin z =
sin z · P0 (sin2 z) where P0 (w) = 1 is the constant polynomial 1. If m = 1, then by
(7.12) with k = 1, we have
$$\sin(3z) = \sin z + 2 \sin z \cdot Q_1(1 - 2\sin^2 z) = \sin z \bigl(1 + 2\, Q_1(1 - 2\sin^2 z)\bigr) = \sin z \cdot P_1(\sin^2 z),$$
where P1 (w) = 1 + 2Q1 (1 − 2w). To prove (7.13) for general m just requires an
induction argument based on (7.12), which we leave to the interested reader. Now,
observe that sin(2m + 1)z is zero when z = zk with zk = kπ/(2m + 1) where

k = 1, 2, . . . , m. Also observe that since 0 < z1 < z2 < · · · < zm < π/2, the m
values sin zk are distinct positive values. Hence, according to (7.13), Pm (w) = 0
at the m distinct values w = sin2 zk , k = 1, 2, . . . , m. Thus, as a consequence of
the fundamental theorem of algebra, the polynomial Pm (w) can be factored into a
constant times

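$$(w - \sin^2 z_1)(w - \sin^2 z_2) \cdots (w - \sin^2 z_m) = \prod_{k=1}^{m} \left(w - \sin^2 \frac{k\pi}{2m+1}\right) = \prod_{k=1}^{m} \left(w - \sin^2 \frac{k\pi}{n}\right),$$
(since n = 2m + 1) which is a constant times
$$\prod_{k=1}^{m} \left(1 - \frac{w}{\sin^2(k\pi/n)}\right).$$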

Setting w = sin2 z, we obtain


$$\sin(2m+1)z = \sin z \cdot P_m(\sin^2 z) = a \sin z \cdot \prod_{k=1}^{m} \left(1 - \frac{\sin^2 z}{\sin^2(k\pi/n)}\right),$$

for some constant a. Since sin(2m + 1)z/ sin z has limit equal to 2m + 1 as z → 0,
it follows that a = 2m + 1. This completes the proof of the lemma. 
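Because Lemma 7.8 is a finite identity, it can be checked directly in floating point. The following Python sketch does so for a complex z; the helper name `lemma_7_8_rhs` and the sample point are assumptions made for illustration only.

```python
import cmath
import math

def lemma_7_8_rhs(m: int, z: complex) -> complex:
    """Right-hand side n sin z * prod_{k=1}^m (1 - sin^2 z / sin^2(k pi / n)), n = 2m + 1."""
    n = 2 * m + 1
    value = n * cmath.sin(z)
    for k in range(1, m + 1):
        value *= 1.0 - cmath.sin(z) ** 2 / math.sin(k * math.pi / n) ** 2
    return value

z = 0.3 + 0.2j  # arbitrary sample point
for m in (1, 2, 5):
    n = 2 * m + 1
    # The two printed values should agree up to rounding error.
    print(m, cmath.sin(n * z), lemma_7_8_rhs(m, z))
```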

We are now ready to give our fourth proof of Euler’s infinite product for sine.
To this end, we let n ≥ 3 be odd and we replace z by z/n in Lemma 7.8 to get
$$\sin z = n \sin(z/n) \prod_{k=1}^{m} \left(1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/n)}\right),$$
where n = 2m + 1. Since
$$\lim_{m\to\infty} (2m+1) \sin\bigl(z/(2m+1)\bigr) = \lim_{m\to\infty} z\, \frac{\sin(z/(2m+1))}{z/(2m+1)} = z,$$
we have
$$\sin z = z \lim_{m\to\infty} \prod_{k=1}^{m} \left(1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/n)}\right) = z \lim_{m\to\infty} \prod_{k=1}^{m} (1 + a_k(m)),$$
where $a_k(m) := -\dfrac{\sin^2(z/n)}{\sin^2(k\pi/n)}$ with n = 2m + 1. Since we are taking m → ∞, we can

always make sure that n = 2m + 1 > |z|, which we henceforth assume. Now recall
from Lemmas 5.6 and 5.7 that there is a constant c > 0 such that for any 0 ≤ x ≤ π/2,
we have c x ≤ sin x, and for any w ∈ C with |w| ≤ 1, we have |sin w| ≤ (6/5)|w|. It
follows that for any k = 1, 2, . . . , m,
$$\left| \frac{\sin^2(z/n)}{\sin^2(k\pi/n)} \right| \le \frac{\bigl(\tfrac{6}{5}|z/n|\bigr)^2}{c^2 (k\pi/n)^2} = \frac{36|z|^2}{25 c^2 \pi^2} \cdot \frac{1}{k^2} =: M_k.$$


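Since the sum $\sum_{k=1}^{\infty} M_k$ converges, and
$$\lim_{m\to\infty} a_k(m) = -\lim_{m\to\infty} \frac{\sin^2(z/(2m+1))}{\sin^2(k\pi/(2m+1))} = -\lim_{m\to\infty} \frac{z^2}{k^2\pi^2} \cdot \frac{\left(\dfrac{\sin(z/(2m+1))}{z/(2m+1)}\right)^2}{\left(\dfrac{\sin(k\pi/(2m+1))}{k\pi/(2m+1)}\right)^2} = -\frac{z^2}{k^2\pi^2},$$
Tannery’s theorem for infinite products implies that
$$\sin z = z \lim_{m\to\infty} \prod_{k=1}^{m} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \lim_{m\to\infty} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \left(1 - \frac{z^2}{k^2\pi^2}\right).$$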

Finally, replacing z by πz completes Proof IV of Euler’s product formula.

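7.3.4. Euler’s cosine expansion. We can derive an infinite product expansion for the cosine function easily from the sine expansion. In fact, using the double angle formula for sine, we get
$$\cos \pi z = \frac{\sin 2\pi z}{2 \sin \pi z} = \frac{2\pi z \cdot \prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{n^2}\right)}{2\pi z \cdot \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2}\right)} = \frac{\prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{n^2}\right)}{\prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2}\right)}.$$

The top product can be split as a product of odd and even terms:
$$\prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{n^2}\right) = \prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{(2n-1)^2}\right) \prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{(2n)^2}\right) = \prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{(2n-1)^2}\right) \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2}\right),$$
from which we get (see Problem 3 for three more proofs)
$$\cos \pi z = \prod_{n=1}^{\infty} \left(1 - \frac{4z^2}{(2n-1)^2}\right).$$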
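The cosine product can be tested numerically in the same spirit. The sketch below is illustrative only (the function name and the sample value z = 1/4 are assumptions); at z = 1/4 the left-hand side is cos(π/4) = 1/√2, so the reciprocal of the partial product should approach √2.

```python
import math

def cosine_partial_product(z: float, num_factors: int) -> float:
    """Partial product prod_{n=1}^{num_factors} (1 - 4 z^2 / (2n - 1)^2)."""
    product = 1.0
    for n in range(1, num_factors + 1):
        product *= 1.0 - 4.0 * z * z / (2 * n - 1) ** 2
    return product

z = 0.25
approx = cosine_partial_product(z, 100_000)
print(approx, math.cos(math.pi * z))   # both close to 1/sqrt(2)
print(1.0 / approx, math.sqrt(2.0))    # both close to sqrt(2)
```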

Exercises 7.3.

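1. Put z = 1/4 into the cosine expansion to derive the following elegant product for √2:
$$\sqrt{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdots.$$
Compare this with Wallis’ formula:
$$\frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdot \frac{8}{9} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdots.$$
Thus, the product for √2 is obtained from Wallis’ formula for π/2 by removing the
factors with numerators that are multiples of 4.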
2. Prove that
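$$\sinh \pi z = \pi z \prod_{k=1}^{\infty} \left(1 + \frac{z^2}{k^2}\right) \quad\text{and}\quad \cosh \pi z = \prod_{n=1}^{\infty} \left(1 + \frac{4z^2}{(2n-1)^2}\right).$$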

3. (Euler’s infinite product for cos πz) Here are three more proofs!

(a) Replace z by −z + 1/2 in the sine product to derive the cosine product. Suggestion: Begin by showing that
$$\left(1 - \frac{(-z + \frac{1}{2})^2}{n^2}\right) = \left(1 - \frac{1}{4n^2}\right) \cdot \left(1 + \frac{2z}{2n-1}\right) \left(1 - \frac{2z}{2n+1}\right).$$

(b) For our second proof, show that for n even, we can write
$$\cos z = \prod_{k=1}^{n-1} \left(1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/2n)}\right), \qquad k = 1, 3, 5, \ldots, n-1.$$
Using Tannery’s theorem, deduce the cosine expansion.


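(c) Write $\cos z = \lim_{n\to\infty} G_n(z)$, where
$$G_n(z) = \frac{1}{2} \left\{ \left(1 + \frac{iz}{n}\right)^n + \left(1 - \frac{iz}{n}\right)^n \right\}.$$
Prove that if n = 2m with m ∈ N, then
$$G_n(z) = \prod_{k=0}^{m-1} \left(1 - \frac{z^2}{n^2 \tan^2\bigl((2k+1)\pi/(2n)\bigr)}\right).$$
Using Tannery’s theorem, deduce the cosine expansion.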


4. Prove that
 2  2  2  2
2z 2z 2z 2z
1 − sin z = 1 − 1+ 1− 1+ ···
1 3 5 7

Suggestion: First show that $1 - \sin z = 2 \sin^2\left(\frac{\pi}{4} - \frac{z}{2}\right)$.


5. Determine the following limits.
( ! !
1 1
(a) lim 1−   2  · 1 −   2  ·
n→∞ 2 3
4n2 log 1 + 2n 4n2 log 1 + 2n
! !)
1 1
1−   2  · · · 1 −   2  ,
4 n
4n2 log 1 + 2n 4n2 log 1 + 2n
( ! ! !)
1 1 1
(b) lim 1+   · 1+   · · · 1+   .
n→∞ 4·12 −1 4·22 −1 4·n2 −1
4n2 sin 4n2 −1
4n2 sin 4n2 −1
4n2 sin 4n2 −1

6. In this problem we prove Tannery’s theorem for products using complex logarithms.
Assume the hypotheses and notations of Theorem 7.7. Since $\sum M_k$ converges, $M_k \to 0$,
so we can choose m such that for all k ≥ m, we have $M_k < 1/2$. Then as in the proof
of Theorem 7.7, we just have to show that
$$\lim_{n\to\infty} \prod_{k=m}^{m_n} (1 + a_k(n)) = \prod_{k=m}^{\infty} (1 + a_k). \tag{7.14}$$

(i) Show that Tannery’s theorem for series implies that
$$\lim_{n\to\infty} \sum_{k=m}^{m_n} \operatorname{Log}(1 + a_k(n)) = \sum_{k=m}^{\infty} \operatorname{Log}(1 + a_k).$$
Suggestion: Use the inequality (7.10) in Problem 5 of Exercises 7.2.

(ii) From (i), deduce (7.14).

7. (Tannery’s theorem II) For each natural number n, let $\prod_{k=1}^{\infty} (1 + a_k(n))$ be a convergent infinite product. If for each k, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^{\infty} M_k$ of nonnegative real numbers such that $|a_k(n)| \le M_k$ for all k, n, prove that
$$\lim_{n\to\infty} \prod_{k=1}^{\infty} (1 + a_k(n)) = \prod_{k=1}^{\infty} \lim_{n\to\infty} (1 + a_k(n));$$
that is, both sides are well-defined (the limits and products converge) and are equal.

7.4. Partial fraction expansions of the trigonometric functions


The goal of this section is to prove Euler’s partial fraction expansion (7.1):
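Theorem 7.9 (Euler’s partial fraction (π/ sin πz)). We have
$$\frac{\pi}{\sin \pi z} = \frac{1}{z} + \sum_{n=1}^{\infty} (-1)^{n-1} \frac{2z}{n^2 - z^2} \quad \text{for all } z \in \mathbb{C} \setminus \mathbb{Z}.$$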

We also derive partial fraction expansions for the other trigonometric functions.
We begin with the cotangent.

7.4.1. Partial fraction expansion of the cotangent. We shall prove the


following theorem (from which we’ll derive the sine expansion).
Theorem 7.10 (Euler’s partial fraction (πz cot πz)). We have
$$\pi z \cot \pi z = 1 + 2z^2 \sum_{n=1}^{\infty} \frac{1}{z^2 - n^2} \quad \text{for all } z \in \mathbb{C} \setminus \mathbb{Z}.$$

Our proof of Euler’s expansion of the cotangent is based on the following lemma.
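Lemma 7.11. For any noninteger complex number z and n ∈ N, we have
$$\pi z \cot \pi z = \frac{\pi z}{2^n} \cot \frac{\pi z}{2^n} + \sum_{k=1}^{2^{n-1}-1} \frac{\pi z}{2^n} \left( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \right) - \frac{\pi z}{2^n} \tan \frac{\pi z}{2^n}.$$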

Proof. Using the double angle formula
$$2 \cot 2z = 2\, \frac{\cos 2z}{\sin 2z} = \frac{\cos^2 z - \sin^2 z}{\cos z \sin z} = \cot z - \tan z,$$
we see that
$$\cot 2z = \frac{1}{2} \bigl( \cot z - \tan z \bigr).$$
Replacing z with πz/2, we get
$$\cot \pi z = \frac{1}{2} \left( \cot \frac{\pi z}{2} - \tan \frac{\pi z}{2} \right). \tag{7.15}$$
Multiplying this equality by πz proves our lemma for n = 1. In order to proceed
by induction, we note that since tan z = − cot(z ± π/2), we find that
$$\cot \pi z = \frac{1}{2} \left( \cot \frac{\pi z}{2} + \cot \frac{\pi(z \pm 1)}{2} \right). \tag{7.16}$$
7.4. PARTIAL FRACTION EXPANSIONS OF THE TRIGONOMETRIC FUNCTIONS 367

This is the main formula on which induction may be applied to prove our lemma.
For instance, let’s take the case n = 2. Considering the positive sign in the second
cotangent, we have
$$\cot \pi z = \frac{1}{2} \left( \cot \frac{\pi z}{2} + \cot \frac{\pi(z+1)}{2} \right).$$
Applying (7.16) to each cotangent on the right of this equation, using the plus sign
for the first and the minus sign for the second, we get
$$\begin{aligned}
\cot \pi z &= \frac{1}{2^2} \left( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(\frac{z}{2} + 1)}{2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(\frac{z+1}{2} - 1)}{2} \right) \\
&= \frac{1}{2^2} \left( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(z+2)}{2^2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(z-1)}{2^2} \right),
\end{aligned}$$
which, after bringing the second cotangent to the end, takes the form
$$\cot \pi z = \frac{1}{2^2} \left( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(z-1)}{2^2} + \cot \left( \frac{\pi z}{2^2} + \frac{\pi}{2} \right) \right).$$
However, the last term is exactly − tan(πz/2²), and so our lemma is proved for n = 2.
Continuing by induction proves our lemma for general n.
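Since the proof only sketches the induction, it may be reassuring to confirm the statement of Lemma 7.11 numerically for several n. The Python sketch below does this; the helper names `cot` and `lemma_7_11_rhs` and the sample point are assumptions introduced for illustration.

```python
import cmath
import math

def cot(w: complex) -> complex:
    """Complex cotangent cos(w)/sin(w)."""
    return cmath.cos(w) / cmath.sin(w)

def lemma_7_11_rhs(z: complex, n: int) -> complex:
    """Right-hand side of Lemma 7.11 for a noninteger z and n >= 1."""
    scale = math.pi * z / 2 ** n                      # pi z / 2^n
    total = scale * cot(scale) - scale * cmath.tan(scale)
    for k in range(1, 2 ** (n - 1)):                  # k = 1, ..., 2^(n-1) - 1
        total += scale * (cot(math.pi * (z + k) / 2 ** n)
                          + cot(math.pi * (z - k) / 2 ** n))
    return total

z = 0.3 + 0.1j  # arbitrary noninteger sample point
lhs = math.pi * z * cot(math.pi * z)
for n in (1, 2, 3, 6):
    # Each printed pair should agree up to rounding error.
    print(n, lhs, lemma_7_11_rhs(z, n))
```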
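Fix a noninteger z; we shall prove Euler’s expansion for the cotangent. Note
that $\lim_{n\to\infty} \frac{\pi z}{2^n} \tan\bigl(\frac{\pi z}{2^n}\bigr) = 0 \cdot \tan 0 = 0$, and since
$$\lim_{w\to 0} w \cot w = \lim_{w\to 0} \frac{w}{\sin w} \cdot \cos w = 1 \cdot 1 = 1, \tag{7.17}$$
we have $\lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi z}{2^n} = 1$. Therefore, taking n → ∞ in the formula from the
preceding Lemma 7.11, we conclude that
$$\pi z \cot \pi z = 1 + \lim_{n\to\infty} \sum_{k=1}^{2^{n-1}-1} \frac{\pi z}{2^n} \left( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \right) = 1 + \lim_{n\to\infty} \sum_{k=1}^{2^{n-1}-1} a_k(n),$$
where
$$a_k(n) = \frac{\pi z}{2^n} \left( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \right).$$
We shall apply Tannery’s theorem to this sum. To this end, observe that, from
(7.17),
$$\lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi(z+k)}{2^n} = \frac{z}{z+k} \lim_{n\to\infty} \frac{\pi(z+k)}{2^n} \cot \frac{\pi(z+k)}{2^n} = \frac{z}{z+k},$$
and in a similar way,
$$\lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi(z-k)}{2^n} = \frac{z}{z-k}.$$
Thus,
$$\lim_{n\to\infty} a_k(n) = \frac{z}{z+k} + \frac{z}{z-k} = \frac{2z^2}{z^2 - k^2},$$
so Tannery’s theorem gives Euler’s cotangent expansion:
$$\pi z \cot \pi z = 1 + 2z^2 \sum_{k=1}^{\infty} \frac{1}{z^2 - k^2},$$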

provided of course we can show that $|a_k(n)| \le M_k$ where $\sum M_k < \infty$. Actually,
we shall prove that there are m, N ∈ N such that $|a_k(n)| \le M_k$ for all n > N and k ≥ m
where $\sum_{k=m}^{\infty} M_k < \infty$. The conclusion of Tannery’s theorem will still hold with
these conditions. (Why so?)
To bound each $a_k(n)$, we use the formula
$$\cot(\alpha + \beta) + \cot(\alpha - \beta) = \frac{\sin 2\alpha}{\sin^2 \alpha - \sin^2 \beta}.$$
This formula is obtained by expressing cot(α ± β) in terms of cosine and sine and
using the angle addition formulas (the diligent reader will supply the details!).
Setting $\alpha = \pi z/2^n$ and $\beta = \pi k/2^n$, we obtain
$$\cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} = \frac{\sin 2\alpha}{\sin^2 \alpha - \sin^2 \beta}, \tag{7.18}$$
where we keep the notation $\alpha = \pi z/2^n$ and $\beta = \pi k/2^n$ on the right. Our goal now
is to bound the term on the right of (7.18). Choose N ∈ N such that for all n > N,
we have $|\alpha| = |\pi z/2^n| < 1/2$. Then for n > N, according to Lemma 5.7,
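$$|\sin 2\alpha| \le \frac{6}{5} |2\alpha| \le 3|\alpha| \quad\text{and}\quad |\sin \alpha| \le \frac{6}{5} |\alpha| \le 2|\alpha|,$$
and, since $\beta = \pi k/2^n < \pi/2$ for $k = 1, \ldots, 2^{n-1} - 1$, according to Lemma 5.6, for
some c > 0,
$$c\, \beta \le \sin \beta.$$
Hence, for n > N,
$$c^2 \beta^2 \le \sin^2 \beta \le |\sin^2 \alpha - \sin^2 \beta| + |\sin^2 \alpha| \le |\sin^2 \alpha - \sin^2 \beta| + 4|\alpha|^2 \implies c^2 \beta^2 - 4|\alpha|^2 \le |\sin^2 \alpha - \sin^2 \beta|.$$
Choose m ∈ N such that c m > 2|z|. Then for k ≥ m, we have
$$c^2 \beta^2 = c^2 \left(\frac{\pi k}{2^n}\right)^2 = \left(\frac{\pi c k}{2^n}\right)^2 > 4 \left(\frac{\pi |z|}{2^n}\right)^2 = 4|\alpha|^2 \implies c^2 \beta^2 - 4|\alpha|^2 > 0,$$
and combining this with the preceding line, we obtain
$$0 < c^2 \beta^2 - 4|\alpha|^2 \le |\sin^2 \alpha - \sin^2 \beta|.$$
Hence,
$$\frac{|\sin 2\alpha|}{|\sin^2 \alpha - \sin^2 \beta|} \le \frac{3|\alpha|}{c^2 \beta^2 - 4|\alpha|^2} = \frac{3\pi|z|/2^n}{c^2 (\pi k/2^n)^2 - 4(\pi|z|/2^n)^2} = \frac{2^n}{\pi} \cdot \frac{3|z|}{c^2 k^2 - 4|z|^2}.$$
Thus, for n > N and k ≥ m, in view of (7.18) and the definition of $a_k(n)$, we have
$$|a_k(n)| \le M_k, \quad\text{where}\quad M_k = \frac{3|z|^2}{c^2 k^2 - 4|z|^2}.$$
Since
$$M_k = \frac{3|z|^2}{c^2 - \frac{4|z|^2}{k^2}} \cdot \frac{1}{k^2} \le \frac{3|z|^2}{c^2 - \frac{4|z|^2}{m^2}} \cdot \frac{1}{k^2} = \text{constant} \cdot \frac{1}{k^2},$$
by the comparison test, the sum $\sum_{k=m}^{\infty} M_k$ converges. This completes the proof of
Euler’s cotangent expansion.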
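For readers who want to see the cotangent expansion in action, the following Python sketch compares a truncated series with πz cot πz at a complex point. The function name and truncation level are assumptions; the truncation error behaves like 2|z|²/N after N terms, so agreement is only to a few digits.

```python
import cmath
import math

def cotangent_partial_fraction(z: complex, num_terms: int) -> complex:
    """Truncated series 1 + 2 z^2 * sum_{n=1}^{num_terms} 1/(z^2 - n^2)."""
    total = 1.0 + 0.0j
    for n in range(1, num_terms + 1):
        total += 2.0 * z * z / (z * z - n * n)
    return total

z = 0.3 + 0.4j  # arbitrary noninteger sample point
exact = math.pi * z * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)
print(exact, cotangent_partial_fraction(z, 100_000))
```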

7.4.2. Partial fraction expansions of the other trig functions. We shall leave most of the details to the exercises. Using the formula (see (7.15))
$$\pi \tan \frac{\pi z}{2} = \pi \cot \frac{\pi z}{2} - 2\pi \cot \pi z,$$
and substituting in the partial fraction expansion of the cotangent, gives, as the
diligent reader will do in Problem 1, for z ∈ C not an odd integer,
$$\pi \tan \frac{\pi z}{2} = \sum_{n=0}^{\infty} \frac{4z}{(2n+1)^2 - z^2}. \tag{7.19}$$
To derive a partial fraction expansion for π/ sin πz, we first derive the identity
$$\frac{1}{\sin z} = \cot z + \tan \frac{z}{2}.$$
To see this, observe that
$$\cot z + \tan \frac{z}{2} = \frac{\cos z}{\sin z} + \frac{\sin(z/2)}{\cos(z/2)} = \frac{\cos z \cos(z/2) + \sin z \sin(z/2)}{\sin z \cos(z/2)} = \frac{\cos(z - (z/2))}{\sin z \cos(z/2)} = \frac{\cos(z/2)}{\sin z \cos(z/2)} = \frac{1}{\sin z}.$$
This identity, together with the partial fraction expansions of the tangent and
cotangent and a little algebra, which the extremely diligent reader will supply in
Problem 1, imply that for noninteger z ∈ C,
$$\frac{\pi}{\sin \pi z} = \frac{1}{z} + \sum_{n=1}^{\infty} (-1)^{n-1} \frac{2z}{n^2 - z^2}. \tag{7.20}$$
Here the signs alternate because the odd-n terms of the tangent expansion (7.19) overturn the sign of the corresponding cotangent terms, while the even-n cotangent terms are left unchanged.

Finally, the incredibly awesome diligent reader will supply the details for the
following cosine expansion: For z ∈ C not an odd integer,
$$\frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^{\infty} (-1)^n \frac{2n+1}{(2n+1)^2 - z^2}. \tag{7.21}$$
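The tangent expansion (7.19) and the cosine expansion (7.21) can be spot-checked numerically. The Python sketch below is only an illustration under assumed names and an assumed sample point; both series converge slowly (the error after N terms is of order 1/N), so the comparison is to a few digits.

```python
import cmath
import math

def tangent_partial_fraction(z: complex, num_terms: int) -> complex:
    """Truncation of (7.19): sum_{n=0}^{num_terms-1} 4z / ((2n+1)^2 - z^2)."""
    return sum(4.0 * z / ((2 * n + 1) ** 2 - z * z) for n in range(num_terms))

def cosine_partial_fraction(z: complex, num_terms: int) -> complex:
    """Truncation of (7.21): sum_{n=0}^{num_terms-1} (-1)^n (2n+1) / ((2n+1)^2 - z^2)."""
    return sum((-1) ** n * (2 * n + 1) / ((2 * n + 1) ** 2 - z * z)
               for n in range(num_terms))

z = 0.3 + 0.4j  # arbitrary sample point, not an odd integer
print(math.pi * cmath.tan(math.pi * z / 2), tangent_partial_fraction(z, 100_000))
print(math.pi / (4 * cmath.cos(math.pi * z / 2)), cosine_partial_fraction(z, 100_000))
```

Setting z = 0 in the second function reproduces the partial sums of 1 − 1/3 + 1/5 − ···, which is one way to see the connection with π/4.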

Exercises 7.4.
1. Fill in the details for the proofs of (7.19) and (7.20). For (7.21), first show that
$$\frac{\pi}{\sin \pi z} = \frac{1}{z} + \left( \frac{1}{1-z} - \frac{1}{1+z} \right) - \left( \frac{1}{2-z} - \frac{1}{2+z} \right) + \cdots.$$
Replacing z with (1 − z)/2 and doing some algebra, derive the expansion (7.21).
2. Derive Gregory-Leibniz-Madhava’s series $\frac{\pi}{4} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n-1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$ by
setting z = 1/4 in the partial fraction expansions of πz cot πz and π/ sin πz. How
can you derive Gregory-Leibniz-Madhava’s series from the expansion of $\frac{\pi}{4 \cos \frac{\pi z}{2}}$?
3. Derive the following formulas for π:
$$\pi = z \tan \frac{\pi}{z} \cdot \left[ 1 - \frac{1}{z-1} + \frac{1}{z+1} - \frac{1}{2z-1} + \frac{1}{2z+1} - + \cdots \right]$$
and
$$\pi = z \sin \frac{\pi}{z} \cdot \left[ 1 + \frac{1}{z-1} - \frac{1}{z+1} - \frac{1}{2z-1} + \frac{1}{2z+1} + - - + \cdots \right].$$
In particular, plug in z = 3, 4, 6 to derive some pretty formulas.

7.5. ★ More proofs that $\pi^2/6 = \sum_{n=1}^{\infty} 1/n^2$
In this section, we continue our discussion from Sections 5.2 and 6.11, concern-
ing the Basel problem of determining the sum of the reciprocals of the squares. A
good reference for this material is [109] and for more on Euler, see [11].

7.5.1. Proof VIII of Euler’s formula for π 2 /6. (Cf. [47, p. 74].) One can
consider this proof as a “logarithmic” version of Euler’s original (third) proof of the
formula for π 2 /6, which we explained in the introduction to Chapter 5. As with
Euler, we begin with Euler’s sine expansion restricted to 0 ≤ x < 1:
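$$\frac{\sin \pi x}{\pi x} = \prod_{n=1}^{\infty} \left(1 - \frac{x^2}{n^2}\right).$$

However, in contrast to Euler, we take logarithms of both sides:
$$\log\left(\frac{\sin \pi x}{\pi x}\right) = \log\left(\lim_{m\to\infty} \prod_{n=1}^{m} \left(1 - \frac{x^2}{n^2}\right)\right) = \lim_{m\to\infty} \log\left(\prod_{n=1}^{m} \left(1 - \frac{x^2}{n^2}\right)\right) = \lim_{m\to\infty} \sum_{n=1}^{m} \log\left(1 - \frac{x^2}{n^2}\right),$$
where in the second equality we can pull out the limit because log is continuous,
and at the last step we used that logarithms take products to sums. Thus, we have
shown that
$$\log\left(\frac{\sin \pi x}{\pi x}\right) = \sum_{n=1}^{\infty} \log\left(1 - \frac{x^2}{n^2}\right), \qquad 0 \le x < 1.$$
Recalling that $\log(1 + t) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m} t^m$, we see that
$$\log(1 - t) = -\sum_{m=1}^{\infty} \frac{1}{m} t^m,$$
so replacing t by x²/n² we obtain
$$\log\left(\frac{\sin \pi x}{\pi x}\right) = -\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{1}{m} \frac{x^{2m}}{n^{2m}}, \qquad 0 \le x < 1.$$

Since
$$\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \left| \frac{1}{m} \frac{x^{2m}}{n^{2m}} \right| = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{1}{m} \frac{|x|^{2m}}{n^{2m}} = -\sum_{n=1}^{\infty} \log\left(1 - \frac{|x|^2}{n^2}\right) = -\log\left(\frac{\sin \pi |x|}{\pi |x|}\right) < \infty,$$
by Cauchy’s double series theorem, we can iterate sums:
$$-\log\left(\frac{\sin \pi x}{\pi x}\right) = \sum_{m=1}^{\infty} \left( \sum_{n=1}^{\infty} \frac{1}{n^{2m}} \right) \frac{x^{2m}}{m} = x^2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{x^4}{2} \sum_{n=1}^{\infty} \frac{1}{n^4} + \frac{x^6}{3} \sum_{n=1}^{\infty} \frac{1}{n^6} + \cdots. \tag{7.22}$$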
P∞
7.5. F MORE PROOFS THAT π 2 /6 = n=1 1/n2 371

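On the other hand, by our power series composition theorem, we have (after some
simplification)
$$\begin{aligned}
-\log\left(\frac{\sin \pi x}{\pi x}\right) &= -\log\left(1 - \left(\frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots\right)\right) \\
&= \left(\frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots\right) + \frac{1}{2} \left(\frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots\right)^2 + \cdots \\
&= \frac{\pi^2}{3!} x^2 + \left(-\frac{\pi^4}{5!} + \frac{\pi^4}{2 \cdot (3!)^2}\right) x^4 + \left(\frac{\pi^6}{7!} - \frac{\pi^6}{3! \cdot 5!} + \frac{\pi^6}{3 \cdot (3!)^3}\right) x^6 + \cdots.
\end{aligned} \tag{7.23}$$
Equating this with (7.22), we obtain
$$\frac{\pi^2}{3!} x^2 + \left(-\frac{\pi^4}{5!} + \frac{\pi^4}{2 \cdot (3!)^2}\right) x^4 + \left(\frac{\pi^6}{7!} - \frac{\pi^6}{3! \cdot 5!} + \frac{\pi^6}{3 \cdot (3!)^3}\right) x^6 + \cdots = x^2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{x^4}{2} \sum_{n=1}^{\infty} \frac{1}{n^4} + \frac{x^6}{3} \sum_{n=1}^{\infty} \frac{1}{n^6} + \cdots,$$
or after simplification,
$$\frac{\pi^2}{6} x^2 + \frac{\pi^4}{180} x^4 + \frac{\pi^6}{2835} x^6 + \cdots = x^2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{x^4}{2} \sum_{n=1}^{\infty} \frac{1}{n^4} + \frac{x^6}{3} \sum_{n=1}^{\infty} \frac{1}{n^6} + \cdots. \tag{7.24}$$

By the identity theorem, the coefficients of $x^k$ must be identical. Thus, comparing
the x² terms, we get Euler’s formula:
$$\frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2},$$
comparing the x⁴ terms, we get
$$\frac{\pi^4}{90} = \sum_{n=1}^{\infty} \frac{1}{n^4}, \tag{7.25}$$
and finally, comparing the x⁶ terms, we get
$$\frac{\pi^6}{945} = \sum_{n=1}^{\infty} \frac{1}{n^6}. \tag{7.26}$$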

Now what if we took more terms in (7.22) and (7.23), say to $x^{2k}$; can we then find
a formula for $\sum 1/n^{2k}$? The answer is certainly yes, but the work required to get a
formula is rather intimidating; see Problem 1 for a formula when k = 4. Of course,
in Section 5.2 we found formulas for ζ(2k) for all k.
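The "rather intimidating" algebra of expanding −log(sin πx/(πx)) can be delegated to exact rational arithmetic. The Python sketch below is one possible way to do it, not the method of the text: writing t = (πx)², it expands −log(1 − u) = u + u²/2 + u³/3 + ··· with u = t/3! − t²/5! + ···, truncating at a chosen order. The function name and the truncation strategy are assumptions.

```python
from fractions import Fraction
from math import factorial

def neg_log_sinc_coefficients(order: int) -> list:
    """Return [c_1, ..., c_order] with -log(sin(pi x)/(pi x)) = sum_k c_k (pi x)^(2k)."""
    # u(t) = t/3! - t^2/5! + t^3/7! - ...  so that sin(pi x)/(pi x) = 1 - u(t), t = (pi x)^2
    u = [Fraction(0)] + [Fraction((-1) ** (j - 1), factorial(2 * j + 1))
                         for j in range(1, order + 1)]

    def multiply(a, b):
        """Product of two truncated power series (lists of coefficients)."""
        c = [Fraction(0)] * (order + 1)
        for i, a_i in enumerate(a):
            if a_i == 0:
                continue
            for j, b_j in enumerate(b):
                if i + j <= order:
                    c[i + j] += a_i * b_j
        return c

    total = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order    # u^0
    for m in range(1, order + 1):                    # u^m has no terms below t^m
        power = multiply(power, u)
        for i in range(order + 1):
            total[i] += power[i] / m                 # -log(1 - u) = sum_m u^m / m
    return total[1:]

coefficients = neg_log_sinc_coefficients(3)
print(coefficients)                                   # coefficients of x^2, x^4, x^6 in units of pi^(2k)
# Comparing with (7.22): zeta(2k) = k * c_k * pi^(2k)
print([k * c for k, c in enumerate(coefficients, start=1)])
```

Raising `order` continues the same comparison to higher powers of x.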
7.5.2. Proof IX. (Cf. [123], [49].) For this proof, we start with Lemma 7.8,
which states that if n = 2m + 1 with m ∈ N, then
$$\sin nz = n \sin z \prod_{k=1}^{m} \left(1 - \frac{\sin^2 z}{\sin^2(k\pi/n)}\right). \tag{7.27}$$
We fix an m; later we shall take m → ∞. We now substitute the expansion
$$\sin nz = nz - \frac{n^3 z^3}{3!} + \frac{n^5 z^5}{5!} - + \cdots$$

into the left-hand side of (7.27), and the expansions
$$\sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - + \cdots,$$
and
$$\sin^2 z = \frac{1}{2} (1 - \cos 2z) = z^2 - \frac{1}{3} z^4 + - \cdots,$$
into the right-hand side of (7.27). Then multiplying everything out and simplifying,
we obtain (after a lot of algebra)
$$nz - \frac{n^3 z^3}{3!} + - \cdots = nz + \left( -\frac{n}{6} - n \sum_{k=1}^{m} \frac{1}{\sin^2(k\pi/n)} \right) z^3 + \cdots.$$

Comparing the z³ terms, by the identity theorem we conclude that
$$-\frac{n^3}{6} = -\frac{n}{6} - n \sum_{k=1}^{m} \frac{1}{\sin^2(k\pi/n)},$$
which can be written in the form
$$\frac{1}{6} - \sum_{k=1}^{m} \frac{1}{n^2 \sin^2(k\pi/n)} = \frac{1}{6n^2}. \tag{7.28}$$
To establish Euler’s formula, we apply Tannery’s theorem to this sum. According
to Lemma 5.6, for some positive constant c,
$$c\, x \le \sin x \quad \text{for } 0 \le x \le \pi/2. \tag{7.29}$$
Now for 1 ≤ k ≤ m = (n − 1)/2, we have kπ/n < π/2, so for such k,
$$c \cdot \frac{k\pi}{n} \le \sin \frac{k\pi}{n},$$
which gives
$$\frac{1}{n^2} \cdot \frac{1}{\sin^2(k\pi/n)} \le \frac{1}{n^2} \cdot \frac{n^2}{(c\pi)^2 k^2} = \frac{1}{c^2 \pi^2} \cdot \frac{1}{k^2}.$$
By the p-test, we know that the sum
$$\sum_{k=1}^{\infty} \frac{1}{c^2 \pi^2} \cdot \frac{1}{k^2}$$
converges. Also, n sin(x/n) → x as n → ∞, which implies that
$$\lim_{n\to\infty} \frac{1}{n^2 \sin^2(k\pi/n)} = \frac{1}{k^2 \pi^2}.$$
Hence, taking m → ∞ in (7.28), Tannery’s theorem gives
$$\frac{1}{6} - \sum_{k=1}^{\infty} \frac{1}{k^2 \pi^2} = 0,$$
which is equivalent to Euler’s formula. See Problem 2 for a proof that uses (7.28)
but doesn’t use Tannery’s theorem.
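The finite identity (7.28) is exact for every m, so it lends itself to a direct floating-point check. The following Python sketch evaluates both sides; the function name is an assumption made for illustration.

```python
import math

def identity_7_28_lhs(m: int) -> float:
    """Left-hand side 1/6 - sum_{k=1}^m 1/(n^2 sin^2(k pi/n)) with n = 2m + 1."""
    n = 2 * m + 1
    return 1.0 / 6.0 - sum(1.0 / (n * n * math.sin(k * math.pi / n) ** 2)
                           for k in range(1, m + 1))

for m in (1, 5, 50, 500):
    n = 2 * m + 1
    # The two columns should agree up to rounding; both tend to 0 as m grows.
    print(m, identity_7_28_lhs(m), 1.0 / (6.0 * n * n))
```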
Exercises 7.5.
1. Determine the sum $\sum_{n=1}^{\infty} \frac{1}{n^8}$ using Euler’s method; that is, in the same manner as we
derived (7.25) and (7.26).
2. (Cf. [49]) (Euler’s sum, Proof X) Instead of using Tannery’s theorem to derive
Euler’s formula from (7.28), we can follow Kortram [123] as follows.
7.6. F RIEMANN’S REMARKABLE ζ-FUNCTION, PROBABILITY, AND π 2 /6 373

(i) Fix any M ∈ N and let m > M. Using (7.28), prove that for n = 2m + 1,
$$\frac{1}{6} - \sum_{k=1}^{M} \frac{1}{n^2 \sin^2(k\pi/n)} = \frac{1}{6n^2} + \sum_{k=M+1}^{m} \frac{1}{n^2 \sin^2(k\pi/n)}.$$

(ii) Using that c x ≤ sin x for 0 ≤ x ≤ π/2 with c > 0, prove that
$$0 \le \frac{1}{6} - \sum_{k=1}^{M} \frac{1}{n^2 \sin^2(k\pi/n)} \le \frac{1}{n^2} + \frac{1}{c^2 \pi^2} \sum_{k=M+1}^{\infty} \frac{1}{k^2}.$$

(iii) Finally, letting m → ∞ (so that n = 2m + 1 → ∞ as well) and then letting
M → ∞, establish Euler’s formula.
3. (Cf. [56]) Let S ⊆ N denote the set of square-free natural numbers; see Subsection
6.7.2 for a review of square-free numbers.
(i) Let N ∈ N and prove that
$$\sum_{n<N} \frac{1}{n^2} \le \left( \sum_{k<N} \frac{1}{k^4} \right) \left( \sum_{n\in S,\ n<N} \frac{1}{n^2} \right) \le \sum_{n=1}^{\infty} \frac{1}{n^2}.$$
(ii) If $\sum_{n\in S} \frac{1}{n^2} := \lim_{N\to\infty} \sum_{n\in S,\ n<N} \frac{1}{n^2}$, using (i), prove that
$$\sum_{n\in S} \frac{1}{n^2} = \frac{15}{\pi^2}.$$

4. (Cf. [56]) Let A ⊆ N denote the set of natural numbers that are not perfect squares.
With $\sum_{n\in A} \frac{1}{n^2} := \lim_{N\to\infty} \sum_{n\in A,\ n<N} \frac{1}{n^2}$, prove that
$$\sum_{n\in A} \frac{1}{n^2} = \frac{\pi^2}{90} (15 - \pi^2).$$

7.6. ★ Riemann’s remarkable ζ-function, probability, and π²/6


We have already seen the Riemann zeta function at work in many examples. In
this section we’re going to look at some of its relations with number theory; this will
give just a hint as to the great importance of the zeta function in mathematics. As a
consolation prize to our discussion on Riemann’s ζ-function we’ll find an incredible
connection between probability theory and π 2 /6.

7.6.1. The Riemann-zeta function and number theory. We begin with


the following theorem proved by Euler which connects ζ(z) to prime numbers. See
Problem 1 for a proof of this theorem using the good ole Tannery’s theorem!
Theorem 7.12 (Euler and Riemann). For all z ∈ C with Re z > 1, we have
$$\zeta(z) = \prod_{p} \left(1 - \frac{1}{p^z}\right)^{-1} = \prod_{p} \frac{p^z}{p^z - 1},$$
where the infinite product is over all prime numbers p ∈ N.
Proof. We give two proofs, one using Cauchy’s theorem on the multiplication
of series and the other is Euler’s classic.
Proof I: Let r > 1 be arbitrary and let Re z ≥ r. Let 2 < N ∈ N and let
2 < 3 < · · · < m < N be all the primes less than N. Then for every natural n < N,
by unique factorization,
$$n^z = \left(2^i 3^j \cdots m^k\right)^z = 2^{iz} 3^{jz} \cdots m^{kz}$$

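for some nonnegative integers i, j, . . . , k. Using this fact, it follows that the product
$$\begin{aligned}
\prod_{p<N} \left(1 - \frac{1}{p^z}\right)^{-1} &= \left(1 - \frac{1}{2^z}\right)^{-1} \left(1 - \frac{1}{3^z}\right)^{-1} \cdots \left(1 - \frac{1}{m^z}\right)^{-1} \\
&= \left(1 + \frac{1}{2^z} + \frac{1}{2^{2z}} + \frac{1}{2^{3z}} + \cdots\right) \left(1 + \frac{1}{3^z} + \frac{1}{3^{2z}} + \frac{1}{3^{3z}} + \cdots\right) \cdots \left(1 + \frac{1}{m^z} + \frac{1}{m^{2z}} + \frac{1}{m^{3z}} + \cdots\right),
\end{aligned}$$
after multiplying out and using Cauchy’s multiplication theorem (or rather its generalization to a product of more than two absolutely convergent series), contains the
numbers $1, \frac{1}{2^z}, \frac{1}{3^z}, \frac{1}{4^z}, \frac{1}{5^z}, \ldots, \frac{1}{(N-1)^z}$ (along with all other numbers $\frac{1}{n^z}$ with n ≥ N
having prime factors 2, 3, . . . , m). In particular,
$$\left| \sum_{n=1}^{\infty} \frac{1}{n^z} - \prod_{p<N} \left(1 - \frac{1}{p^z}\right)^{-1} \right| \le \sum_{n=N}^{\infty} \left| \frac{1}{n^z} \right| \le \sum_{n=N}^{\infty} \frac{1}{n^r},$$
since Re z ≥ r. By the p-test (with p = r > 1), $\sum \frac{1}{n^r}$ converges so the right-hand
side tends to zero as N → ∞. This completes Proof I.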
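Proof II: Here’s Euler’s beautiful proof using a “sieving method” made famous
by Eratosthenes of Cyrene (276 B.C.–194 B.C.). First we get rid of all the numbers
in ζ(z) that have factors of 2: Observe that
$$\frac{1}{2^z} \zeta(z) = \frac{1}{2^z} \sum_{n=1}^{\infty} \frac{1}{n^z} = \sum_{n=1}^{\infty} \frac{1}{(2n)^z},$$
therefore,
$$\left(1 - \frac{1}{2^z}\right) \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} - \sum_{n=1}^{\infty} \frac{1}{(2n)^z} = \sum_{n\,;\, 2 \nmid n} \frac{1}{n^z}.$$
Next, we get rid of all the numbers in $\left(1 - \frac{1}{2^z}\right) \zeta(z)$ that have factors of 3: Observe
that
$$\frac{1}{3^z} \left(1 - \frac{1}{2^z}\right) \zeta(z) = \frac{1}{3^z} \sum_{n\,;\, 2 \nmid n} \frac{1}{n^z} = \sum_{n\,;\, 2 \nmid n} \frac{1}{(3n)^z},$$
therefore,
$$\begin{aligned}
\left(1 - \frac{1}{3^z}\right) \left(1 - \frac{1}{2^z}\right) \zeta(z) &= \left(1 - \frac{1}{2^z}\right) \zeta(z) - \frac{1}{3^z} \left(1 - \frac{1}{2^z}\right) \zeta(z) \\
&= \sum_{n\,;\, 2 \nmid n} \frac{1}{n^z} - \sum_{n\,;\, 2 \nmid n} \frac{1}{(3n)^z} \\
&= \sum_{n\,;\, 2,3 \nmid n} \frac{1}{n^z}.
\end{aligned}$$
Repeating this argument, we get, for any prime q:
$$\left\{ \prod_{p \text{ prime} \le q} \left(1 - \frac{1}{p^z}\right) \right\} \zeta(z) = \sum_{n\,;\, 2,3,\ldots,q \nmid n} \frac{1}{n^z},$$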

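where the sum is over all n ∈ N that are not divisible by the primes from 2 to q.
Therefore, choosing r > 1 such that Re z ≥ r, we have
$$\left| \left\{ \prod_{p \text{ prime} \le q} \left(1 - \frac{1}{p^z}\right) \right\} \zeta(z) - 1 \right| = \left| \sum_{n\,;\, n \ne 1 \,\&\, 2,3,\ldots,q \nmid n} \frac{1}{n^z} \right| \le \sum_{n\,;\, n \ne 1 \,\&\, 2,3,\ldots,q \nmid n} \frac{1}{n^r} \le \sum_{n=q}^{\infty} \frac{1}{n^r}.$$
By Cauchy’s criterion for series, $\lim_{q\to\infty} \sum_{n=q}^{\infty} \frac{1}{n^r} = 0$, so we conclude that
$$\left\{ \prod_{p \text{ prime}} \left(1 - \frac{1}{p^z}\right) \right\} \zeta(z) = 1,$$
which is equivalent to Euler’s product formula.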


In particular, since we know that ζ(2) = π²/6, we have
$$\frac{\pi^2}{6} = \prod_{p} \frac{p^2}{p^2 - 1} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdots.$$
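The prime product for π²/6 converges reasonably quickly and is easy to test. The Python sketch below multiplies the factors p²/(p² − 1) over all primes below a bound, using a basic sieve of Eratosthenes; the function names and the bound are illustrative assumptions.

```python
import math

def primes_below(limit: int) -> list:
    """All primes p < limit via the sieve of Eratosthenes (assumes limit >= 2)."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def euler_product_zeta2(prime_bound: int) -> float:
    """Partial Euler product prod_{p < prime_bound} p^2 / (p^2 - 1)."""
    product = 1.0
    for p in primes_below(prime_bound):
        product *= p * p / (p * p - 1)
    return product

for bound in (10, 1_000, 100_000):
    print(bound, euler_product_zeta2(bound), math.pi ** 2 / 6)
```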
Our next connection is with the following strange (but interesting) function:
$$\mu(n) := \begin{cases} 1 & \text{if } n = 1 \\ (-1)^k & \text{if } n = p_1 p_2 \cdots p_k \text{ is a product of } k \text{ distinct prime numbers} \\ 0 & \text{else.} \end{cases}$$
This function is called the Möbius function after August Ferdinand Möbius
(1790–1868) who introduced the function in 1831. Some of its values are
µ(1) = 1 , µ(2) = −1 , µ(3) = −1 , µ(4) = 0 , µ(5) = −1 , µ(6) = 1 , . . . .
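Theorem 7.13. For all z ∈ C with Re z > 1, we have
$$\frac{1}{\zeta(z)} = \prod_{p} \left(1 - \frac{1}{p^z}\right) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^z}.$$

Proof. Let r > 1 be arbitrary and let Re z ≥ r. Let 2 < N ∈ N and let
2 < 3 < · · · < m < N be all the primes less than N. Then observe that the product
$$\prod_{p<N} \left(1 - \frac{1}{p^z}\right) = \left(1 + \frac{-1}{2^z}\right) \left(1 + \frac{-1}{3^z}\right) \left(1 + \frac{-1}{5^z}\right) \cdots \left(1 + \frac{-1}{m^z}\right),$$
when multiplied out contains 1 and all numbers of the form
$$\left(\frac{-1}{p_1^z}\right) \cdot \left(\frac{-1}{p_2^z}\right) \cdot \left(\frac{-1}{p_3^z}\right) \cdots \left(\frac{-1}{p_k^z}\right) = \frac{(-1)^k}{p_1^z p_2^z \cdots p_k^z} = \frac{(-1)^k}{n^z}, \qquad n = p_1 p_2 \cdots p_k,$$
where $p_1 < p_2 < \cdots < p_k < N$ are distinct primes. In particular, $\prod_{p<N} \left(1 - \frac{1}{p^z}\right)$
contains the numbers $\frac{\mu(n)}{n^z}$ for n = 1, 2, . . . , N − 1 (along with all other numbers
$\frac{\mu(n)}{n^z}$ with n ≥ N having prime factors 2, 3, . . . , m), so
$$\left| \sum_{n=1}^{\infty} \frac{\mu(n)}{n^z} - \prod_{p<N} \left(1 - \frac{1}{p^z}\right) \right| \le \sum_{n=N}^{\infty} \left| \frac{\mu(n)}{n^z} \right| \le \sum_{n=N}^{\infty} \frac{1}{n^r},$$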

since Re z ≥ r. By the p-test (with p = r > 1), $\sum \frac{1}{n^r}$ converges so the right-hand
side tends to zero as N → ∞. This completes our proof.

See the exercises for other neat connections of ζ(z) with number theory.
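To make the Möbius function concrete, here is a small Python sketch that tabulates μ(n) with a sieve and compares the partial sums of Σ μ(n)/n² with 1/ζ(2) = 6/π². The function name `mobius_upto` and the cutoff are assumptions chosen for illustration; the tail of the series is bounded by roughly 1/N.

```python
import math

def mobius_upto(limit: int) -> list:
    """Return a list mu with mu[n] = Moebius function of n for 1 <= n <= limit (mu[0] unused)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] *= -1            # one more distinct prime factor
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0              # divisible by a square
    return mu

limit = 100_000
mu = mobius_upto(limit)
print(mu[1:7])                                # values mu(1), ..., mu(6) listed in the text
partial_sum = sum(mu[n] / (n * n) for n in range(1, limit + 1))
print(partial_sum, 6 / math.pi ** 2)
```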

7.6.2. The eta function. A function related to the zeta function is the “alternating zeta function” or Dirichlet eta-function:
$$\eta(z) := \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^z}.$$

We can write the eta function in terms of the zeta function as follows.
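Theorem 7.14. We have
$$\eta(z) = (1 - 2^{1-z}) \zeta(z), \qquad z > 1.$$
Proof. Splitting into sums of even and odd numbers, we get
$$\begin{aligned}
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^z} &= -\sum_{n=1}^{\infty} \frac{1}{(2n)^z} + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^z} \\
&= -\sum_{n=1}^{\infty} \frac{1}{2^z} \frac{1}{n^z} + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^z} \\
&= -2^{-z} \zeta(z) + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^z}.
\end{aligned}$$

On the other hand, breaking the zeta function into sums of even and odd numbers,
we get
$$\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} = \sum_{n=1}^{\infty} \frac{1}{(2n)^z} + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^z} = 2^{-z} \zeta(z) + \sum_{n=1}^{\infty} \frac{1}{(2n-1)^z}.$$

Solving this for the sum over odd numbers and substituting into the previous expression, we see that
$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^z} = -2^{-z} \zeta(z) + \zeta(z) - 2^{-z} \zeta(z),$$
which is equivalent to the expression that we desired to prove.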
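A quick numerical illustration of Theorem 7.14 at z = 2, where ζ(2) = π²/6 gives η(2) = (1 − 2⁻¹)π²/6 = π²/12. The Python sketch below sums the alternating series directly; the function name and cutoff are assumptions.

```python
import math

def eta_partial_sum(z: float, num_terms: int) -> float:
    """Partial sum sum_{n=1}^{num_terms} (-1)^(n-1) / n^z of the Dirichlet eta function."""
    return sum((-1) ** (n - 1) / n ** z for n in range(1, num_terms + 1))

z = 2.0
predicted = (1.0 - 2.0 ** (1.0 - z)) * math.pi ** 2 / 6.0   # (1 - 2^(1-z)) * zeta(2)
print(eta_partial_sum(z, 100_000), predicted)
```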

We now consider a shocking connection between probability theory, prime numbers, divisibility, and π²/6 (cf. [2], [107]).¹ Question: What is the probability that
a natural number, chosen at random, is square free? Answer (drum roll please):
6/π², a result which follows from work of Dirichlet in 1849 [122, p. 324], [95, p.
272]. Here’s another Question: What is the probability that two given numbers,
chosen at random, are relatively prime? Answer (drum roll please): 6/π², first
proved by Leopold Bernhard Gegenbauer (1849–1903) [95, p. 272] who proved
it in 1885.

1Such shocking connections in science perhaps made Albert Einstein (1879–1955) state that
“the scientist’s religious feeling takes the form of a rapturous amazement at the harmony of natu-
ral law, which reveals an intelligence of such superiority that, compared with it, all the systematic
thinking and acting of human beings is an utterly insignificant reflection”. [103]

7.6.3. Elementary probability theory. You will prove these results with
complete rigor in Problems 11 and 10. However, we are going to derive them
intuitively — not rigorously (!) — based on some basic probability ideas that should
be “obvious” (or at least believable) to you; see [229, 70, 71] for standard books
on probability in case you want the hardcore theory. We only need the basics. We
denote the probability, or chance, that an event A happens by P (A). The classic
definition is
$$P(A) = \frac{\text{number of occurrences of } A}{\text{total number of possibilities}}. \tag{7.30}$$
For example, consider a classroom with 10 people, m men and w women (so that
m + w = 10). The probability of randomly “choosing a man” (= M ) is
$$P(M) = \frac{\text{number of men}}{\text{total number of possibilities}} = \frac{m}{10}.$$
Similarly, the probability of randomly choosing a woman is w/10. We next need to
discuss complementary events. If Ac is the event that A does not happen, then
(7.31) P (Ac ) = 1 − P (A).
For instance, according to (7.31) the probability of “not choosing a man”, M c ,
should be P (M c ) = 1 − P (M ) = 1 − m/10. But this is certainly true because
“not choosing a man” is the same as “choosing a woman” W , so recalling that
m + w = 10, we have
$$P(M^c) = P(W) = \frac{w}{10} = \frac{10 - m}{10} = 1 - \frac{m}{10}.$$
Finally, we need to discuss independence. Whenever an event A is unrelated to an
event B (such events are called independent), we have the fundamental relation:
P (A and B) = P (A) · P (B).
For example, let’s say that we have two classrooms of 10 students each, the first one
with m1 men and w1 women, and the second one with m2 men and w2 women. Let
us randomly choose a pair of students, one from the first classroom and the other
from the second. What is the probability of “choosing a man from the first class-
room” = A and “choosing a woman from the second classroom” = B? Certainly
A and B don’t depend on each other, so by our formula above we should have
$$P(A \text{ and } B) = P(A) \cdot P(B) = \frac{m_1}{10} \cdot \frac{w_2}{10} = \frac{m_1 w_2}{100}.$$
To see that this is indeed true, note that the number of ways to pair a man in
classroom 1 with a woman in classroom 2 is $m_1 \cdot w_2$ and the total number of
possible pairs of people is 10² = 100. Thus,
$$P(A \text{ and } B) = \frac{\text{number of men-women pairs}}{\text{total number of possible pairs of people}} = \frac{m_1 \cdot w_2}{100},$$
in agreement with our previous calculation. We remark that for any number of
events A1 , A2 , . . ., which are unrelated to each other, we have the generalized result:
(7.32) P (A1 and A2 and · · · ) = P (A1 ) · P (A2 ) · · · .

7.6.4. Probability and π 2 /6. To begin discussing our two incredible and
shocking problems, we first look at the following question: Given a natural number
k, what is the probability, or chance, that a randomly chosen natural number is
divisible by k? Since the definition (7.30) involves finite quantities, we can’t use
this definition as it stands. We can instead use the following modified version:
$$P(A) = \lim_{n\to\infty} \frac{\text{number of occurrences of } A \text{ amongst } n \text{ possibilities}}{n}. \tag{7.33}$$
Using this formula, in Problem 8, you should be able to prove that the probability
a randomly chosen natural number is divisible by k is 1/k. However, instead of
using (7.33), we shall employ the following heuristic trick (which works to give the
correct answer). Choose an “extremely large” natural number N , and consider the
very large sample of numbers
1, 2, 3, 4, 5, 6, . . . , N k.
There are exactly N numbers in this list that are divisible by k, namely the N
numbers k, 2k, 3k, . . . , N k, and no others, and there are a total of N k numbers in
this list. Thus, the probability that a natural number n, randomly chosen amongst
the large sample, is divisible by k is exactly the probability that n is one of the N
numbers k, 2k, 3k, . . . , N k, so
$$P(k \mid n) = \frac{\text{number of occurrences of divisibility}}{\text{total number of possibilities listed}} = \frac{N}{Nk} = \frac{1}{k}. \tag{7.34}$$
For instance, the probability that a randomly chosen natural number is divisible by
1 is 1, which makes sense. The probability that a randomly chosen natural number
is divisible by 2 is 1/2; in other words, the probability that a randomly chosen
natural number is even is 1/2, which also makes sense.
We are now ready to solve our two problems. Question: What is the prob-
ability that a natural number, chosen at random, is square free? Let n ∈ N be
randomly chosen. Then n is square free just means that p² ∤ n for all primes p.
Thus,
\[ P(n \text{ is square free}) = P\big((2^2 \nmid n) \text{ and } (3^2 \nmid n) \text{ and } (5^2 \nmid n) \text{ and } (7^2 \nmid n) \text{ and } \cdots\big). \]
Since n was randomly chosen, the events 2² ∤ n, 3² ∤ n, 5² ∤ n, etc. are unrelated, so
by (7.32),
\[ P(n \text{ is square free}) = P(2^2 \nmid n) \cdot P(3^2 \nmid n) \cdot P(5^2 \nmid n) \cdot P(7^2 \nmid n) \cdots \]
To see what the right-hand side is, we use (7.31) and (7.34) to write
\[ P(p^2 \nmid n) = 1 - P(p^2 \mid n) = 1 - \frac{1}{p^2}. \]
Thus,
\[ P(n \text{ is square free}) = \prod_{p \text{ prime}} P(p^2 \nmid n) = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}, \]

and our first question is answered!
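The heuristic answer can be compared with an honest count. The following is only a small numerical sketch, not part of the argument: the function name `squarefree_fraction` and the cutoff 10⁶ are our own choices for illustration, and the count is done with a simple sieve that strikes out every multiple of a square d² with d ≥ 2.

```python
from math import isqrt, pi


def squarefree_fraction(n):
    """Return (number of square-free k with 1 <= k <= n) / n."""
    # flags[k] == 1 means "k has not yet been found divisible by a square > 1".
    flags = bytearray([1]) * (n + 1)
    for d in range(2, isqrt(n) + 1):
        sq = d * d
        # Strike out sq, 2*sq, 3*sq, ... up to n.
        flags[sq::sq] = bytearray(len(range(sq, n + 1, sq)))
    return sum(flags[1:]) / n


if __name__ == "__main__":
    print(squarefree_fraction(10**6), 6 / pi**2)
```

The two printed numbers should agree to about three decimal places; the count is exact, while 6/π² is only the limiting value.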


Question: What is the probability that two given numbers, chosen at random,
are relatively, or co, prime? Let m, n ∈ N be randomly chosen. Then m and n are
7.6. F RIEMANN’S REMARKABLE ζ-FUNCTION, PROBABILITY, AND π 2 /6 379

relatively prime, or coprime, just means that m and n have no common factors
(except 1), which means² that p ∤ both m, n for all prime numbers p. Thus,

\[ P(m, n \text{ are relatively prime}) = P\big((2 \nmid \text{both } m, n) \text{ and } (3 \nmid \text{both } m, n) \text{ and } (5 \nmid \text{both } m, n) \text{ and } \cdots\big). \]
Since m and n were randomly chosen, that p ∤ both m, n is unrelated to q ∤ both m, n,
so by (7.32),
\[ P(m, n \text{ are relatively prime}) = \prod_{p \text{ prime}} P(p \nmid \text{both } m, n). \]

To see what the right-hand side is, we use (7.31), (7.32), and (7.34) to write

\[ P(p \nmid \text{both } m, n) = 1 - P(p \mid \text{both } m, n) = 1 - P(p \mid m \text{ and } p \mid n) = 1 - P(p \mid m) \cdot P(p \mid n) = 1 - \frac{1}{p} \cdot \frac{1}{p} = 1 - \frac{1}{p^2}. \]
Thus,
\[ P(m, n \text{ are relatively prime}) = \prod_{p \text{ prime}} P(p \nmid \text{both } m, n) = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^2}\right) = \frac{6}{\pi^2}, \]

and our second question is answered!
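As with square-free numbers, this answer can be tested against a brute-force count. The sketch below is only an illustration under our own naming (`coprime_fraction` is not from the text): it counts the pairs (m, n) with 1 ≤ m, n ≤ N and gcd(m, n) = 1 and divides by N².

```python
from math import gcd, pi


def coprime_fraction(N):
    """Return (number of pairs 1 <= m, n <= N with gcd(m, n) == 1) / N^2."""
    count = sum(1 for m in range(1, N + 1)
                for n in range(1, N + 1) if gcd(m, n) == 1)
    return count / N**2


if __name__ == "__main__":
    print(coprime_fraction(1000), 6 / pi**2)
```

For small N the fraction is visibly off (for N = 2 it is 3/4), but it settles down near 6/π² ≈ 0.6079 as N grows.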


Exercises 7.6.
1. (ζ(z) product formula, Proof III) We prove Theorem 7.12 using the good ole
Tannery’s theorem for products.
(i) Let r > 1 be arbitrary and let Re z ≥ r. Prove that
\[ \left| \prod_{p<N} \frac{p^z - (1/p^z)^N}{p^z - 1} - \sum_{n=1}^\infty \frac{1}{n^z} \right| \le \sum_{n=N}^\infty \frac{1}{n^r}. \]
Suggestion: $\frac{p^z - (1/p^z)^N}{p^z - 1} = \frac{1 - (1/p^z)^{N+1}}{1 - 1/p^z} = 1 + 1/p^z + 1/p^{2z} + \cdots + 1/p^{Nz}$.
(ii) Write $\frac{p^z - (1/p^z)^N}{p^z - 1} = 1 + \frac{1 - (1/p^z)^N}{p^z - 1}$. Show that
\[ \left| \frac{1 - (1/p^z)^N}{p^z - 1} \right| \le \frac{2}{p^r - 1} \le \frac{4}{p^r} \]
and $\sum 4/p^r$ converges. Now prove Theorem 7.12 using Tannery’s theorem for
products.
2. Prove that for z ∈ C with Re z > 1,
\[ \frac{\zeta(z)}{\zeta(2z)} = \sum_{n=1}^\infty \frac{|\mu(n)|}{n^z}. \]
Suggestion: Show that $\frac{\zeta(z)}{\zeta(2z)} = \prod \left(1 + \frac{1}{p^z}\right)$ and copy Proof I of Theorem 7.13.
3. (Möbius inversion formula) In this problem we prove Möbius inversion formula.
(i) Given n ∈ N with n > 1, let $p_1, \ldots, p_k$ be the distinct prime factors of n. For
1 ≤ i ≤ k, let
\[ A_i = \{ m \in \mathbb{N} \,;\, m = \text{a product of exactly } i \text{ distinct prime factors of } n \}. \]
² Explicitly, “p ∤ both m, n” is the negation of “p | m and p | n”; that is, “p ∤ m or p ∤ n”.

Show that
\[ \sum_{d|n} \mu(d) = 1 + \sum_{i=1}^{k} \sum_{m \in A_i} \mu(m), \]
where $\sum_{d|n} \mu(d)$ means to sum over all d ∈ N such that d|n. Next, show that
\[ \sum_{m \in A_i} \mu(m) = (-1)^i \binom{k}{i}. \]

(ii) For any n ∈ N, prove that
\[ \sum_{d|n} \mu(d) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n > 1. \end{cases} \]

(iii) Let f : (0, ∞) → R such that f(x) = 0 for x < 1, and define
\[ g(x) = \sum_{n=1}^\infty f\left(\frac{x}{n}\right). \]
Note that g(x) = 0 for x < 1 and this infinite series is really only a finite sum
since f(x) = 0 for x < 1; specifically, choosing any N ∈ N with N ≥ ⌊x⌋ (the
greatest integer ≤ x), we have $g(x) = \sum_{n=1}^{N} f(x/n)$. Prove that
\[ f(x) = \sum_{n=1}^\infty \mu(n)\, g\left(\frac{x}{n}\right) \qquad \text{(Möbius inversion formula)}; \]
As before, this sum is really only finite. Suggestion: If you’ve not gotten anywhere
after some time, let S = {(k, n) ∈ N × N ; n|k} and consider the sum
\[ \sum_{(k,n) \in S} \mu(n)\, f\left(\frac{x}{k}\right). \]
Write this sum as $\sum_{k=1}^\infty \sum_{n \,;\, n|k} \mu(n) f(x/k)$, then as $\sum_{n=1}^\infty \sum_{k \,;\, n|k} \mu(n) f(x/k)$
and simplify each iterated sum.
4. (Liouville’s function) Define
\[ \lambda(n) := \begin{cases} 1 & \text{if } n = 1 \\ 1 & \text{if the number of prime factors of } n \text{, counted with repetitions, is even} \\ -1 & \text{if the number of prime factors of } n \text{, counted with repetitions, is odd.} \end{cases} \]
This function is called Liouville’s function after Joseph Liouville (1809–1882). Prove
that for z ∈ C with Re z > 1,
\[ \frac{\zeta(2z)}{\zeta(z)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^z}. \]
Suggestion: Show that $\frac{\zeta(2z)}{\zeta(z)} = \prod \left(1 + \frac{1}{p^z}\right)^{-1}$ and copy Proof I of Theorem 7.12.
5. For n ∈ N, let τ (n) denote the number of positive divisors of n (that is, the number of
positive integers that divide n). Prove that for z ∈ C with Re z > 1,

\[ \zeta(z)^2 = \sum_{n=1}^\infty \frac{\tau(n)}{n^z}. \]
Suggestion: By absolute convergence, we can write $\zeta(z)^2 = \sum_{m,n} 1/(m \cdot n)^z$ where this
double series can be summed in any way we wish. Use Theorem 6.25 with the set $S_k$
given by $S_k = T_1 \cup \cdots \cup T_k$ where $T_k = \{(m, n) \in \mathbb{N} \times \mathbb{N} \,;\, m \cdot n = k\}$.

6. Let $\zeta(z, a) := \sum_{n=0}^\infty (n + a)^{-z}$ for z ∈ C with Re z > 1 and a > 0; this function is
called the Hurwitz zeta function after Adolf Hurwitz (1859–1919). Prove that
\[ \sum_{m=1}^{k} \zeta\left(z, \frac{m}{k}\right) = k^z \zeta(z). \]

7. In this problem, we find useful bounds and limits for ζ(x) with x > 1 real.
(a) Prove that $1 - \frac{1}{2^x} < \eta(x) < 1$.
(b) Prove that
\[ \frac{1 - 2^{-x}}{1 - 2^{1-x}} < \zeta(x) < \frac{1}{1 - 2^{1-x}}. \]
(c) Prove the following limits: ζ(x) → 1 as x → ∞, ζ(x) → ∞ as x → 1+ , and
(x − 1)ζ(x) → 1 as x → 1+ .
8. Using the definition (7.33), prove that given a natural number k, the probability that
a randomly chosen natural number is divisible by k is 1/k. Suggestion: Amongst the n
natural numbers 1, 2, 3, . . . , n, show that ⌊n/k⌋ many numbers are divisible by k. Now
take n → ∞ in ⌊n/k⌋/n.
9. (cf. [25, 107]) Let k ∈ N with k ≥ 2. We say that a natural number n is k-th power
free if $p^k \nmid n$ for all primes p. What is the probability that a natural number, chosen
at random, is k-th power free? What is the probability that k natural numbers, chosen
at random, are relatively prime (have no common factors except 1)?
10. (Square-free numbers) Define S : (0, ∞) → R by
\[ S(x) := \#\{k \in \mathbb{N} \,;\, 1 \le k \le x \text{ and } k \text{ is square free}\}; \]
note that S(x) = 0 for x < 1. We shall prove that
\[ \lim_{n\to\infty} \frac{S(n)}{n} = \frac{6}{\pi^2}. \]
Do you see why this formula makes precise the statement “The probability that a
randomly chosen natural number is square free equals 6/π²”?
(i) For any real number x > 0 and n ∈ N, define
\[ A(x, n) := \{k \in \mathbb{N} \,;\, 1 \le k \le x \text{ and } n^2 \text{ is the largest square that divides } k\}. \]
Note that A(x, n) = ∅ for n² > x. Prove that A(x, 1) consists of all square-free
numbers ≤ x, and also prove that
\[ \{k \in \mathbb{N} \,;\, 1 \le k \le x\} = \bigcup_{n=1}^\infty A(x, n). \]
(ii) Show that there is a bijection between A(x, n) and A(x/n², 1).
(iii) Show that for any x > 0, we have
\[ \lfloor x \rfloor = \sum_{n=1}^\infty S\left(\frac{x}{n^2}\right). \]
Using the Möbius inversion formula from Problem 3, conclude that
\[ S(x) = \sum_{n=1}^\infty \mu(n) \left\lfloor \frac{x}{n^2} \right\rfloor. \]
(iv) Finally, prove that $\lim_{x\to\infty} S(x)/x = 6/\pi^2$, which in particular proves our result.
11. (Relatively prime numbers; for different proofs, see [122, p. 337] and [95, p. 268])
Define R : (0, ∞) → R by
\[ R(x) := \#\{(k, \ell) \in \mathbb{N} \times \mathbb{N} \,;\, 1 \le k, \ell \le x \text{ and } k \text{ and } \ell \text{ are relatively prime}\}; \]
note that R(x) = 0 for x < 1. We shall prove that
\[ \lim_{n\to\infty} \frac{R(n)}{n^2} = \frac{6}{\pi^2}. \]
Do you see why this formula makes precise the statement “The probability that two
randomly chosen natural numbers are relatively prime equals 6/π²”?
(i) For any real number x > 0 and n ∈ N, define
\[ A(x, n) := \{(k, \ell) \in \mathbb{N} \times \mathbb{N} \,;\, 1 \le k, \ell \le x \text{ and } n \text{ is the largest divisor of both } k \text{ and } \ell\}. \]
Note that A(x, n) = ∅ for n > x. Prove that A(x, 1) consists of all pairs (k, ℓ) of
relatively prime natural numbers that are ≤ x, and also prove that
\[ \{(k, \ell) \in \mathbb{N} \times \mathbb{N} \,;\, 1 \le k, \ell \le x\} = \bigcup_{n=1}^\infty A(x, n). \]
(ii) Show that there is a bijection between A(x, n) and A(x/n, 1).
(iii) Show that for any x > 0, we have
\[ \lfloor x \rfloor^2 = \sum_{n=1}^\infty R\left(\frac{x}{n}\right). \]
Using the Möbius inversion formula from Problem 3, conclude that
\[ R(x) = \sum_{n=1}^\infty \mu(n) \left\lfloor \frac{x}{n} \right\rfloor^2. \]
(iv) Finally, prove that $\lim_{x\to\infty} R(x)/x^2 = 6/\pi^2$, which in particular proves our result.

7.7. ⋆ Some of the most beautiful formulæ in the world IV


Hold on to your seats, for you’re about to be taken on another journey through
a beautiful world of mathematical formulas! In this section we derive many formulas
found in Euler’s wonderful book Introduction to analysis of the infinite [65]; his
second book [66] is also great. We also give our tenth proof of Euler’s formula for
π²/6 and our third proof of Gregory-Leibniz-Madhava’s formula for π/4.

7.7.1. Bernoulli numbers and evaluating sums/products.


We start our
onslaught of beautiful formulæ with a formula for $\zeta(2k) = \sum_{n=1}^\infty \frac{1}{n^{2k}}$ in terms of
Bernoulli numbers; this complements the formulæ in Section 5.3 when we didn’t
know about Bernoulli numbers. To find such a formula, we begin with the partial
fraction expansion of the cotangent from Section 7.4:
\[ \pi z \cot \pi z = 1 + 2z^2 \sum_{n=1}^\infty \frac{1}{z^2 - n^2} = 1 - 2 \sum_{n=1}^\infty \frac{z^2}{n^2 - z^2}. \]
Next, we apply Cauchy’s double series theorem to this sum. Let z ∈ C be near 0
and observe that
\[ \frac{z^2}{n^2 - z^2} = \frac{z^2/n^2}{1 - z^2/n^2} = \sum_{k=1}^\infty \left(\frac{z^2}{n^2}\right)^k, \]
where we used the geometric series formula $\sum_{k=1}^\infty r^k = \frac{r}{1-r}$ for |r| < 1. Therefore,
\[ \pi z \cot \pi z = 1 - 2 \sum_{n=1}^\infty \sum_{k=1}^\infty \left(\frac{z^2}{n^2}\right)^k. \]

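Since
\[ \sum_{n=1}^\infty \sum_{k=1}^\infty \left| \frac{z^2}{n^2} \right|^k = \sum_{n=1}^\infty \sum_{k=1}^\infty \left( \frac{|z|^2}{n^2} \right)^k = \frac{1}{2}\big(1 - \pi |z| \cot \pi |z|\big) < \infty, \]
by Cauchy’s double series theorem, we have
\[ \tag{7.35} \pi z \cot \pi z = 1 - 2 \sum_{n=1}^\infty \sum_{k=1}^\infty \left(\frac{z^2}{n^2}\right)^k = 1 - 2 \sum_{k=1}^\infty \left( \sum_{n=1}^\infty \frac{1}{n^{2k}} \right) z^{2k}. \]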
On the other hand, we recall from Section 6.8 that $z \cot z = \sum_{k=0}^\infty (-1)^k \frac{2^{2k} B_{2k}}{(2k)!} z^{2k}$
(for |z| small), where the $B_{2k}$’s are the Bernoulli numbers. Replacing z with πz,
we get
\[ \pi z \cot \pi z = 1 + \sum_{k=1}^\infty (-1)^k \frac{2^{2k} B_{2k}}{(2k)!} \pi^{2k} z^{2k}. \]

Comparing this equation with (7.35) and using the identity theorem, we see that
\[ -2 \sum_{n=1}^\infty \frac{1}{n^{2k}} = (-1)^k \frac{2^{2k} B_{2k}}{(2k)!} \pi^{2k}, \qquad k = 1, 2, 3, \ldots. \]
Rewriting this slightly, we obtain Euler’s famous result: For k = 1, 2, 3, . . .,
\[ \tag{7.36} \sum_{n=1}^\infty \frac{1}{n^{2k}} = (-1)^{k-1} \frac{(2\pi)^{2k} B_{2k}}{2(2k)!}; \quad \text{that is,} \quad \zeta(2k) = (-1)^{k-1} \frac{(2\pi)^{2k} B_{2k}}{2(2k)!}. \]

Using the known values of the Bernoulli numbers found in Section 6.8, setting
k = 1, 2, 3, we get, in particular, our eleventh proof of Euler’s formula for π²/6:
\[ \frac{\pi^2}{6} = \sum_{n=1}^\infty \frac{1}{n^2} \ \text{(Euler’s sum, Proof XI)}, \qquad \frac{\pi^4}{90} = \sum_{n=1}^\infty \frac{1}{n^4}, \qquad \frac{\pi^6}{945} = \sum_{n=1}^\infty \frac{1}{n^6}. \]

Using (7.36), we can derive many other pretty formulas. First, in Theorem 7.14
we proved that
\[ \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^z} = (1 - 2^{1-z})\,\zeta(z), \qquad z > 1. \]
In particular, setting z = 2k, we find that for k = 1, 2, 3, . . .,
\[ \tag{7.37} \eta(2k) = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^{2k}} = (-1)^{k-1} \left(1 - 2^{1-2k}\right) \frac{(2\pi)^{2k} B_{2k}}{2(2k)!}; \]

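what formulas do you get when you set k = 1, 2? Second, recall from Theorem 7.12
that
\[ \tag{7.38} \sum_{n=1}^\infty \frac{1}{n^z} = \prod \frac{p^z}{p^z - 1} = \frac{2^z}{2^z - 1} \cdot \frac{3^z}{3^z - 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z - 1} \cdots \]
where the product is over all primes. In particular, setting z = 2, we get
\[ \tag{7.39} \frac{\pi^2}{6} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdot \frac{7^2}{7^2 - 1} \cdot \frac{11^2}{11^2 - 1} \cdots \]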

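and setting z = 4, we get
\[ \frac{\pi^4}{90} = \frac{2^4}{2^4 - 1} \cdot \frac{3^4}{3^4 - 1} \cdot \frac{5^4}{5^4 - 1} \cdot \frac{7^4}{7^4 - 1} \cdot \frac{11^4}{11^4 - 1} \cdots. \]
Dividing these two formulas and using that
\[ \frac{n^4/(n^4 - 1)}{n^2/(n^2 - 1)} = \frac{n^4}{n^4 - 1} \cdot \frac{n^2 - 1}{n^2} = n^2 \cdot \frac{n^2 - 1}{(n^2 - 1)(n^2 + 1)} = \frac{n^2}{n^2 + 1}, \]
we obtain
\[ \tag{7.40} \frac{\pi^2}{15} = \frac{2^2}{2^2 + 1} \cdot \frac{3^2}{3^2 + 1} \cdot \frac{5^2}{5^2 + 1} \cdot \frac{7^2}{7^2 + 1} \cdot \frac{11^2}{11^2 + 1} \cdots. \]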
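Both prime products converge quickly enough to be tested on a computer. The following is only a numerical sketch: the helper names `primes_up_to` and `euler_products` and the cutoff 10⁵ are our own choices, and truncating the products at a finite prime gives approximations, not the exact values in (7.39) and (7.40).

```python
from math import pi


def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p <= n (n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def euler_products(limit):
    """Return the partial products of (7.39) and (7.40) over primes p <= limit."""
    prod_minus, prod_plus = 1.0, 1.0
    for p in primes_up_to(limit):
        prod_minus *= p * p / (p * p - 1)   # factor of (7.39)
        prod_plus *= p * p / (p * p + 1)    # factor of (7.40)
    return prod_minus, prod_plus


if __name__ == "__main__":
    print(euler_products(10**5), (pi**2 / 6, pi**2 / 15))
```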
Third, recall from Theorem 7.13 that
\[ \frac{1}{\zeta(z)} = \sum_{n=1}^\infty \frac{\mu(n)}{n^z}, \]
where µ(n) is the Möbius function. In particular, setting z = 2, we find that
\[ \frac{6}{\pi^2} = 1 - \frac{1}{2^2} - \frac{1}{3^2} - \frac{1}{5^2} + \frac{1}{6^2} - \frac{1}{7^2} + \frac{1}{10^2} - \frac{1}{11^2} + \cdots; \]
what formula do you get when you set z = 4?

7.7.2. Euler numbers and evaluating sums. We now derive a formula for
the alternating sum of the odd natural numbers to odd powers:
\[ 1 - \frac{1}{3^{2k+1}} + \frac{1}{5^{2k+1}} - \frac{1}{7^{2k+1}} + \frac{1}{9^{2k+1}} - + \cdots, \qquad k = 0, 1, 2, 3, \ldots. \]
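First try: To this end, let |z| < 1 and recall from Section 7.4 that
\[ \tag{7.41} \frac{\pi}{4 \cos \frac{\pi z}{2}} = \frac{1}{1^2 - z^2} - \frac{3}{3^2 - z^2} + \frac{5}{5^2 - z^2} + \cdots = \sum_{n=0}^\infty (-1)^n \frac{(2n+1)}{(2n+1)^2 - z^2}. \]
Expanding as a geometric series, observe that
\[ \tag{7.42} \frac{(2n+1)}{(2n+1)^2 - z^2} = \frac{1}{(2n+1)} \cdot \frac{1}{1 - \frac{z^2}{(2n+1)^2}} = \sum_{k=0}^\infty \frac{z^{2k}}{(2n+1)^{2k+1}}. \]
Thus,
\[ \tag{7.43} \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^\infty \sum_{k=0}^\infty (-1)^n \frac{z^{2k}}{(2n+1)^{2k+1}}. \]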

Just as we did in proving (7.35), we shall try to use Cauchy’s double series theorem
on this sum ... however, observe that
\[ \sum_{n=0}^\infty \sum_{k=0}^\infty \left| (-1)^n \frac{z^{2k}}{(2n+1)^{2k+1}} \right| = \sum_{n=0}^\infty \sum_{k=0}^\infty \frac{|z|^{2k}}{(2n+1)^{2k+1}} = \sum_{n=0}^\infty \frac{(2n+1)}{(2n+1)^2 - |z|^2}, \]
which diverges (because this series behaves like $\sum \frac{1}{2n+1} = \infty$)! Therefore, we
cannot apply Cauchy’s double series theorem.

Second try: Let us start fresh from scratch. This time, we break up (7.41)
into sums over n even and n odd (just consider the sums with n replaced by 2n
and also by 2n + 1):
\[ \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^\infty \left( \frac{(4n+1)}{(4n+1)^2 - z^2} - \frac{(4n+3)}{(4n+3)^2 - z^2} \right). \]
Let |z| < 1. Then writing $\frac{(4n+1)}{(4n+1)^2 - z^2}$ and $\frac{(4n+3)}{(4n+3)^2 - z^2}$ as geometric series (just as we
did in (7.42)) we see that
\[ \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^\infty \sum_{k=0}^\infty \left( \frac{z^{2k}}{(4n+1)^{2k+1}} - \frac{z^{2k}}{(4n+3)^{2k+1}} \right) \]
\[ \tag{7.44} = \sum_{n=0}^\infty \sum_{k=0}^\infty \left( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \right) z^{2k}. \]

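We can now use Cauchy’s double series theorem on this sum because
\[ \sum_{n=0}^\infty \sum_{k=0}^\infty \left| \left( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \right) z^{2k} \right| = \sum_{n=0}^\infty \sum_{k=0}^\infty \left( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \right) |z|^{2k} = \frac{\pi}{4 \cos \frac{\pi |z|}{2}} < \infty, \]
where we used (7.44) with z replaced by |z|. Thus, by Cauchy’s double series
theorem, we have
\[ \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{k=0}^\infty \sum_{n=0}^\infty \left( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \right) z^{2k}. \]
Combining the terms in parentheses, we get
\[ \tag{7.45} \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{k=0}^\infty \left( \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^{2k+1}} \right) z^{2k}; \]
thus, we could in fact interchange orders in (7.43), but to justify it with complete
mathematical rigor, we needed a little bit of mathematical gymnastics.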
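Now recall from Section 6.8 that
\[ \frac{1}{\cos z} = \sec z = \sum_{k=0}^\infty (-1)^k \frac{E_{2k}}{(2k)!} z^{2k}, \]
where the $E_{2k}$’s are the Euler numbers. Replacing z with πz/2 and multiplying by
π/4, we get
\[ \frac{\pi}{4 \cos \frac{\pi z}{2}} = \frac{\pi}{4} \sum_{k=0}^\infty (-1)^k \frac{E_{2k}}{(2k)!} \left(\frac{\pi}{2}\right)^{2k} z^{2k}. \]
Comparing this equation with (7.45) and using the identity theorem, we conclude
that for k = 0, 1, 2, 3, . . .,
\[ \tag{7.46} \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^{2k+1}} = (-1)^k \frac{E_{2k}}{2(2k)!} \left(\frac{\pi}{2}\right)^{2k+1}. \]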
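Formula (7.46) is easy to test numerically. The sketch below is only a sanity check with names of our own choosing; it assumes the values E₀ = 1, E₂ = −1, E₄ = 5, which are the Euler numbers under the sign convention of the sec z series above, and it compares partial sums of the alternating series with the closed form.

```python
from math import factorial, pi

# Assumed values of the Euler numbers E_0, E_2, E_4 (sign convention of sec z above).
EULER_NUMBERS = {0: 1, 2: -1, 4: 5}


def alternating_odd_sum(k, terms=100_000):
    """Partial sum of sum_{n>=0} (-1)^n / (2n+1)^(2k+1)."""
    return sum((-1) ** n / (2 * n + 1) ** (2 * k + 1) for n in range(terms))


def closed_form(k):
    """Right-hand side of (7.46) for k = 0, 1, 2."""
    return (-1) ** k * EULER_NUMBERS[2 * k] / (2 * factorial(2 * k)) * (pi / 2) ** (2 * k + 1)


if __name__ == "__main__":
    for k in (0, 1, 2):
        print(k, alternating_odd_sum(k), closed_form(k))
```

For k = 0 the series converges slowly (the error after N terms is at most 1/(2N + 1)), so only the first few digits agree; for k ≥ 1 the agreement is much better.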

In particular, setting k = 0 (and recalling that $E_0 = 1$) we get our third proof of
Gregory-Leibniz-Madhava’s formula:
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \qquad \text{(Gregory-Leibniz-Madhava, Proof III)}. \]
What pretty formulas do you get when you set k = 1, 2? (Here, you need the Euler
numbers calculated in Section 6.8.) We can derive many other pretty formulas from
(7.46). To start this onslaught, we first state an “odd version” of Theorem 7.12:
Theorem 7.15. For any z ∈ C with Re z > 1, we have
\[ \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^z} = \frac{3^z}{3^z + 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z + 1} \cdot \frac{11^z}{11^z + 1} \cdot \frac{13^z}{13^z - 1} \cdots, \]
where the product is over odd primes (all primes except 2) and where the ± signs
in the denominators depend on whether the prime is of the form 4k + 3 (+ sign)
or 4k + 1 (− sign), where k = 0, 1, 2, . . ..
Since the proof of this theorem is similar to that of Theorem 7.12, we shall leave
the proof of this theorem to the interested reader; see Problem 5. In particular,
setting z = 1, we get
\[ \tag{7.47} \frac{\pi}{4} = \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{7}{8} \cdot \frac{11}{12} \cdot \frac{13}{12} \cdot \frac{17}{16} \cdot \frac{19}{20} \cdot \frac{23}{24} \cdots. \]
The numerators of the fractions on the right are the odd prime numbers and the
denominators are even numbers divisible by four and differing from the numerators
by one. In (7.39), we found that
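\[ \frac{\pi^2}{6} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdots = \frac{4}{3} \cdot \frac{3 \cdot 3}{2 \cdot 4} \cdot \frac{5 \cdot 5}{4 \cdot 6} \cdot \frac{7 \cdot 7}{6 \cdot 8} \cdot \frac{11 \cdot 11}{10 \cdot 12} \cdot \frac{13 \cdot 13}{12 \cdot 14} \cdots. \]
Dividing this expression by (7.47), and cancelling like terms, we obtain
\[ \frac{4\pi}{6} = \frac{\pi^2/6}{\pi/4} = \frac{4}{3} \cdot \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdots. \]
Multiplying both sides by 3/4, we get another one of Euler’s famous formulas:
\[ \tag{7.48} \frac{\pi}{2} = \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdot \frac{19}{18} \cdot \frac{23}{22} \cdots. \]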
The numerators of the fractions are the odd prime numbers and the denominators
are even numbers not divisible by four and differing from the numerators by one.
(7.47) and (7.48) are two of my favorite infinite product expansions for π.

7.7.3. Benoit Cloitre’s e and π in a mirror. In this section we prove an
unbelievable fact connecting e and π that is due to Benoit Cloitre [52], [74], [205].
Define sequences $\{a_n\}$ and $\{b_n\}$ by $a_1 = b_1 = 0$, $a_2 = b_2 = 1$, and the rest as the
following “mirror images”:
\[ a_{n+2} = a_{n+1} + \frac{1}{n}\, a_n, \qquad b_{n+2} = \frac{1}{n}\, b_{n+1} + b_n. \]

We shall prove that
\[ \tag{7.49} e = \lim_{n\to\infty} \frac{n}{a_n}, \qquad \frac{\pi}{2} = \lim_{n\to\infty} \frac{n}{b_n^2}. \]
The sequences $\{a_n\}$ and $\{b_n\}$ look so similar and so do $\{n/a_n\}$ and $\{n/b_n^2\}$, yet they
generate very different numbers. Seeing such a connection between e and π, which
a priori are very different, makes you wonder if there isn’t someone behind this
“coincidence.”
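To prove the formula for e, let us define a sequence $\{s_n\}$ by $s_n = a_n/n$. Then
$s_1 = a_1/1 = 0$ and $s_2 = a_2/2 = 1/2$. Observe that for n ≥ 2, we have
\[ s_{n+1} - s_n = \frac{a_{n+1}}{n+1} - \frac{a_n}{n} = \frac{1}{n+1}\left( a_{n+1} - \frac{n+1}{n}\, a_n \right) \]
\[ = \frac{1}{n+1}\left( a_n + \frac{1}{n-1}\, a_{n-1} - \left(1 + \frac{1}{n}\right) a_n \right) \]
\[ = \frac{1}{n+1}\left( \frac{1}{n-1}\, a_{n-1} - \frac{a_n}{n} \right) \]
\[ = \frac{-1}{n+1}\,(s_n - s_{n-1}). \]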
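Using induction we see that
\[ s_{n+1} - s_n = \frac{-1}{n+1}\,(s_n - s_{n-1}) = \frac{-1}{n+1} \cdot \frac{-1}{n}\,(s_{n-1} - s_{n-2}) \]
\[ = \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1}\,(s_{n-2} - s_{n-3}) = \cdots \text{etc.} \]
\[ = \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1} \cdots \frac{-1}{3}\,(s_2 - s_1) \]
\[ = \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1} \cdots \frac{-1}{3} \cdot \frac{1}{2} = \frac{(-1)^{n-3}}{(n+1)!} = \frac{(-1)^{n+1}}{(n+1)!}. \]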
Thus, writing as a telescoping sum, we obtain
\[ s_n = s_1 + \sum_{k=2}^{n} (s_k - s_{k-1}) = 0 + \sum_{k=2}^{n} \frac{(-1)^k}{k!} = \sum_{k=0}^{n} \frac{(-1)^k}{k!}, \]
which is exactly the n-th partial sum for the series expansion of $e^{-1}$. It follows that
$s_n \to e^{-1}$ and so,
\[ \lim_{n\to\infty} \frac{n}{a_n} = \lim_{n\to\infty} \frac{1}{s_n} = \frac{1}{e^{-1}} = e, \]
as we claimed. The limit for π in (7.49) will be left to you (see Problem 2).
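Both limits in (7.49) can be watched numerically by iterating the two recurrences side by side. The following is only an illustrative sketch in floating point; the function name `cloitre_limits` is ours.

```python
from math import e, pi


def cloitre_limits(n_max):
    """Iterate the mirror recurrences up to index n_max (n_max >= 2).

    Returns (n_max / a_{n_max}, n_max / b_{n_max}^2).
    """
    a_prev, a_cur = 0.0, 1.0  # a_1, a_2
    b_prev, b_cur = 0.0, 1.0  # b_1, b_2
    for n in range(1, n_max - 1):
        # a_{n+2} = a_{n+1} + a_n / n   and   b_{n+2} = b_{n+1} / n + b_n
        a_prev, a_cur = a_cur, a_cur + a_prev / n
        b_prev, b_cur = b_cur, b_cur / n + b_prev
    return n_max / a_cur, n_max / b_cur**2


if __name__ == "__main__":
    print(cloitre_limits(100_000), (e, pi / 2))
```

The first quotient converges to e extremely fast, as the factorials in the proof above suggest, while the second approaches π/2 only at the slow rate one expects from Wallis’ formula.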
Exercises 7.7.
1. In this problem we derive other neat formulas:
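(1) Dividing (7.40) by π²/6, prove that
\[ \frac{5}{2} = \frac{2^2 + 1}{2^2 - 1} \cdot \frac{3^2 + 1}{3^2 - 1} \cdot \frac{5^2 + 1}{5^2 - 1} \cdot \frac{7^2 + 1}{7^2 - 1} \cdot \frac{11^2 + 1}{11^2 - 1} \cdots, \]
quite a neat expression for 2.5.
(2) Dividing (7.48) by (7.47), prove that
\[ 2 = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{9} \cdot \frac{10}{9} \cdot \frac{12}{11} \cdots, \]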

quite a neat expression for 2. The fractions on the right are formed as follows:
Given an odd prime 3, 5, 7, . . ., we take the pair of even numbers immediately above
and below the prime, divide them by two, then put the resulting even number as
the numerator and the odd number as the denominator.
2. In this problem, we prove the limit for π in (7.49).
(i) Define $t_n = b_{n+1}/b_n$ for n = 2, 3, 4, . . .. Prove that (for n = 2, 3, 4, . . .), $t_{n+1} = 1/n + 1/t_n$ and then,
\[ t_n = \begin{cases} 1 & n \text{ even} \\ \frac{n}{n-1} & n \text{ odd.} \end{cases} \]
(ii) Prove that $b_n^2 = t_2^2 \cdot t_3^2 \cdot t_4^2 \cdots t_{n-1}^2$, then using Wallis’ formula, derive the limit for
π in (7.49).
3. From Problem 7 in Exercises 7.6, prove that
\[ \frac{2(2n)!\,(1 - 2^{-2n})}{(2\pi)^{2n}(1 - 2^{1-2n})} < |B_{2n}| < \frac{2(2n)!}{(2\pi)^{2n}(1 - 2^{1-2n})}. \]
This estimate shows that the Bernoulli numbers grow incredibly fast as n → ∞.
4. (Radius of convergence) In this problem we (finally) determine the radii of convergence of
\[ z \cot z = \sum_{n=0}^\infty (-1)^n \frac{2^{2n} B_{2n}}{(2n)!} z^{2n}, \qquad \tan z = \sum_{n=1}^\infty (-1)^{n-1} \frac{2^{2n}(2^{2n} - 1) B_{2n}}{(2n)!} z^{2n-1}. \]
(a) Let $a_{2n} = (-1)^n \frac{2^{2n} B_{2n}}{(2n)!}$. Prove that
\[ \lim_{n\to\infty} |a_{2n}|^{1/2n} = \lim_{n\to\infty} \frac{1}{\pi} \cdot 2^{1/2n} \cdot \zeta(2n)^{1/2n} = \frac{1}{\pi}. \]
Conclude that the radius of convergence of z cot z is π.
(b) Using a similar argument, show that the radius of convergence of tan z is π/2.
5. In this problem, we prove Theorem 7.15
(i) Let us call an odd number “type I” if it is of the form 4k + 1 for some k = 0, 1, . . .
and “type II” if it is of the form 4k + 3 for some k = 0, 1, . . .. Prove that every
odd number is either of type I or type II.
(ii) Prove that type I × type I = type I, type I × type II = type II, and type II ×
type II = type I.
(iii) Let a, b, . . . , c ∈ N be odd. Prove that if there is an odd number of type II integers
amongst a, b, . . . , c, then a · b · · · c is of type II, otherwise, a · b · · · c is type I.
(iv) Show that
\[ \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^z} = \sum_{n=0}^\infty \frac{1}{(4n+1)^z} - \sum_{n=0}^\infty \frac{1}{(4n+3)^z}, \]
a sum of type I and type II natural numbers!
(v) Let z ∈ C with Re z ≥ r > 1, let 1 < N ∈ N, and let 3 < 5 < · · · < m < 2N + 1
be the odd prime numbers less than 2N + 1. In a similar manner as in the proof
of Theorem 7.12, show that
\[ \left| \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^z} - \frac{3^z}{3^z + 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z + 1} \cdot \frac{11^z}{11^z + 1} \cdots \frac{m^z}{m^z \pm 1} \right| \le \sum_{n=N}^\infty \left| \frac{1}{(2n+1)^z} \right| \le \sum_{n=N}^\infty \frac{1}{(2n+1)^r}, \]
where the + signs in the product are for type II odd primes and the − signs for
type I odd primes. Now finish the proof of Theorem 7.15.
CHAPTER 8

Infinite continued fractions

From time immemorial, the infinite has stirred men’s emotions more than
any other question. Hardly any other idea has stimulated the mind so
fruitfully . . . In a certain sense, mathematical analysis is a symphony of
the infinite.
David Hilbert (1862-1943) “On the infinite” [24].
We dabbled a little in the theory of continued fractions (that is, fractions
that continue on and on and on . . .) way back in the exercises of Section 3.4. In
this chapter we concentrate on this fascinating subject. We start in Section 8.1 by
showing that such fractions occur very naturally in long division and we give their
basic definitions. In Section 8.2, we prove some pretty dramatic formulas (this is
one reason continued fractions are so fascinating, at least to me). For example,
we’ll show that 4/π and π can be written as the continued fractions:

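\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}, \qquad \pi = 3 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cfrac{7^2}{6 + \ddots}}}}. \]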
The continued fraction on the left is due to Lord Brouncker (and is the first contin-
ued fraction ever recorded) and the one on the right is due to Euler. If you think
these π formulas are cool, we’ll derive the following formulas for e as well:
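\[ e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}} = 1 + \cfrac{1}{0 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \ddots}}}}}}}. \]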
We’ll prove the formula on the left in Section 7.7, but you’ll have to wait for
the formula on the right until Section 8.7. In Section 8.3, we discuss elementary
properties of continued fractions. In this section we also discuss how a Greek
mathematician, Diophantus of Alexandria (≈ 200–284 A.D.), can help you if you’re
stranded on an island with guys you can’t trust and a monkey with a healthy
appetite! In Section 8.4 we study the convergence properties of continued fractions.

Recall from our discussion on the amazing number π and its computations from
ancient times (see Section 4.10) that throughout the years, the following approximations
to π came up: 3, 22/7, 333/106, and 355/113. Did you ever wonder why
these particular numbers occur? Also, did you ever wonder why our calendar is
constructed the way it is (e.g. why leap years occur)? Finally, did you ever wonder
why a piano has twelve keys (within an octave)? In Sections 8.5 and 8.6 you’ll find
out that these mysteries have to do with continued fractions! In Section 8.8 we
study special types of continued fractions having to do with square roots and in
Section 8.9 we learn why Archimedes needed around $8 \times 10^{206544}$ cattle in order to
“have abundant of knowledge in this science [mathematics]”!
In the very last section, Section 8.10, we look at continued fractions and tran-
scendental numbers.
Chapter 8 objectives: The student will be able to . . .
• define a continued fraction, state the Wallis-Euler and fundamental recurrence
relations, and apply the continued fraction convergence theorem (Theorem 8.14).
• compute the canonical continued fraction of a given real number.
• understand the relationship between convergents of a simple continued fraction
and best approximations, and the relationship between periodic simple contin-
ued fractions and quadratic irrationals.
• solve simple diophantine equations (of linear and Pell type).

8.1. Introduction to continued fractions


In this section we introduce the basics of continued fractions and see how they
arise out of high school division and also from solving equations.

8.1.1. Continued fractions arise when dividing. A common way contin-


ued fractions arise is through “repeated divisions”.
Example 8.1. Take for instance, high school division of 68 into 157: $\frac{157}{68} = 2 + \frac{21}{68}$.
Inverting the fraction $\frac{21}{68}$, we can write $\frac{157}{68}$ as
\[ \frac{157}{68} = 2 + \cfrac{1}{\frac{68}{21}}. \]
Since we can further divide $\frac{68}{21} = 3 + \frac{5}{21} = 3 + \frac{1}{21/5}$, we can write $\frac{157}{68}$ in the somewhat
fancy way
\[ \frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{\frac{21}{5}}}. \]
Since $\frac{21}{5} = 4 + \frac{1}{5}$, we can write
\[ \tag{8.1} \frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}}. \]
Since 5 is now a whole number, our repeated division process stops.

The expression on the right in (8.1) is called a finite simple continued fraction.
There are many ways to denote the right-hand side, but we shall stick with
the following two:
\[ \langle 2; 3, 4, 5 \rangle \quad \text{or} \quad 2 + \frac{1}{3\,+}\,\frac{1}{4\,+}\,\frac{1}{5} \quad \text{represent} \quad 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}}. \]
Thus, continued fractions (that is, fractions that “continue on”) arise naturally out
of writing rational numbers in a somewhat fancy way by repeated divisions. Of
course, 157 and 68 were not special, by repeated divisions one can take any two
integers a and b with b ≠ 0 and write a/b as a finite simple continued fraction; see
Problem 2. In Section 8.4, we shall prove that any real number, not necessarily
rational, can be expressed as a simple (possibly infinite) continued fraction.
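The repeated divisions of Example 8.1 are just the division algorithm applied over and over, so they are easy to mechanize. Here is a small sketch (the function names are ours, and we assume b > 0 so that Python's `divmod` returns a remainder with 0 ≤ r < b); the second function rebuilds the fraction from the bottom up as a check.

```python
from fractions import Fraction


def simple_continued_fraction(a, b):
    """Return [a0, a1, ..., an] with a/b = <a0; a1, ..., an>, assuming b > 0."""
    if b <= 0:
        raise ValueError("b must be a positive integer")
    terms = []
    while b != 0:
        q, r = divmod(a, b)  # a = q*b + r with 0 <= r < b
        terms.append(q)
        a, b = b, r          # invert the leftover fraction r/b
    return terms


def evaluate(terms):
    """Collapse <a0; a1, ..., an> into a single fraction, working from the bottom up."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value


if __name__ == "__main__":
    print(simple_continued_fraction(157, 68))  # the expansion in (8.1)
    print(evaluate([2, 3, 4, 5]))
```

Running this on 157 and 68 returns the list 2, 3, 4, 5 from (8.1), and evaluating that list gives back 157/68.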

8.1.2. Continued fractions arise when solving equations. Continued


fractions also arise naturally when trying to solve equations.
Example 8.2. Suppose we want to find the positive solution x to the equation
x² − x − 2 = 0. (Notice that x = 2 is the only positive solution.) On the other
hand, writing x² − x − 2 = 0 as x² = x + 2 and dividing by x, we get
\[ x = 1 + \frac{2}{x}. \]
We can replace x in the denominator with x = 1 + 2/x to get
\[ x = 1 + \cfrac{2}{1 + \cfrac{2}{x}}. \]
Repeating this many times, we can write
\[ x = 1 + \cfrac{2}{1 + \cfrac{2}{1 + \ddots + \cfrac{2}{1 + \cfrac{2}{x}}}}. \]
Repeating this “to infinity” and using that x = 2, we write
\[ \text{“}\ 2 = 1 + \cfrac{2}{1 + \cfrac{2}{1 + \cfrac{2}{1 + \cfrac{2}{1 + \ddots}}}}.\ \text{”} \]
Quite a remarkable formula for 2. Later, (see Problem 4 in Section 8.4) we
shall see that any whole number can be written in such a way. The reason for the
quotation marks is that we have not yet defined what the right-hand object means.

However, we shall define what this means in a moment, but before doing so, here’s
another neat example:
Example 8.3. Consider the slightly modified formula x² − x − 1 = 0. Then
$\Phi = \frac{1 + \sqrt{5}}{2}$, called the golden ratio, is the only positive solution. We can rewrite
Φ² − Φ − 1 = 0 as $\Phi = 1 + \frac{1}{\Phi}$. Replacing Φ in the denominator with $\Phi = 1 + \frac{1}{\Phi}$, we
get
\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{\Phi}}. \]
Repeating this substitution process “to infinity”, we can write
\[ \tag{8.2} \text{“}\ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}}},\ \text{”} \]
quite a beautiful expression (cf. Problem 6 in Exercises 3.4)! As a side remark, there
are many false rumors concerning the golden ratio; see [146] for the rundown.
8.1.3. Basic definitions for continued fractions. A general finite continued
fraction can be written as
\[ \tag{8.3} a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \ddots + \cfrac{b_{n-1}}{a_{n-1} + \cfrac{b_n}{a_n}}}}} \]
where the $a_k$’s and $b_k$’s are real numbers. (Of course, we are implicitly assuming
that these fractions are all well-defined, e.g. no divisions by zero are allowed. Also,
when you simplify this big fraction by combining fractions, you need to go from the
bottom up.) Notice that if $b_m = 0$ for some m, then
\[ \tag{8.4} a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots + \cfrac{b_n}{a_n}}} = a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots + \cfrac{b_{m-1}}{a_{m-1}}}}, \]
since the $b_m = 0$ will zero out everything below it. The continued fraction is called
simple if all the $b_k$’s are 1 and the $a_k$’s are integers with $a_k$ positive for k ≥ 1.
Instead of writing the continued fraction as we did above, which takes up a lot of
space, we shall shorten it to:
\[ a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n}. \]

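In the case all $b_n = 1$, we shorten the notation to
\[ a_0 + \frac{1}{a_1\,+}\,\frac{1}{a_2\,+}\,\frac{1}{a_3\,+} \cdots \frac{1}{+\,a_n} = \langle a_0; a_1, a_2, a_3, \ldots, a_n \rangle. \]
If $a_0 = 0$, some authors write $\langle a_1, a_2, \ldots, a_n \rangle$ instead of $\langle 0; a_1, \ldots, a_n \rangle$.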
We now discuss infinite continued fractions. Let $\{a_n\}$, n = 0, 1, 2, . . ., and $\{b_n\}$,
n = 1, 2, . . ., be sequences of real numbers and suppose that
\[ c_n := a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n} \]
is defined for all n. We call $c_n$ the n-th convergent of the continued fraction. If
the limit, $\lim c_n$, exists, then we say that the infinite continued fraction
\[ \tag{8.5} a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \ddots}}} \quad \text{or} \quad a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \]
converges and we use either of these notations to denote the limiting value $\lim c_n$.
In the case all $b_n = 1$, in place of (8.5) we use the notation
\[ \langle a_0; a_1, a_2, a_3, \ldots \rangle := \lim_{n\to\infty} \langle a_0; a_1, a_2, a_3, \ldots, a_n \rangle, \]
provided that the right-hand side exists. In Section 8.4 we shall prove that any
simple continued fraction converges; in particular, we’ll prove that (8.2) does hold
true:
\[ \Phi = 1 + \frac{1}{1\,+}\,\frac{1}{1\,+}\,\frac{1}{1\,+} \cdots. \]
In the case when there is some $b_m$ term that vanishes, then convergence of (8.5) is
easy because (see (8.4)) for n ≥ m, we have $c_n = c_{m-1}$. Hence, in this case
\[ a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots = a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_{m-1}}{+\,a_{m-1}} \]
converges automatically (such a continued fraction is said to terminate or be
finite). However, general convergence issues are not so straightforward. We shall
deal with the subtleties of convergence in Section 8.4.
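To get a feel for convergents before the theory arrives, one can simply compute them. The sketch below uses our own (hypothetical) function name and follows the advice given after (8.3): it evaluates the finite continued fraction from the bottom up, in exact rational arithmetic, and assumes no division by zero occurs along the way.

```python
from fractions import Fraction


def convergent(a, b):
    """Return c_n = a0 + b1/(a1 + b2/(a2 + ... + bn/an)).

    a = [a0, a1, ..., an] and b = [b1, ..., bn]; evaluation goes bottom up.
    """
    value = Fraction(a[-1])
    for a_k, b_next in zip(reversed(a[:-1]), reversed(b)):
        value = a_k + Fraction(b_next) / value
    return value


if __name__ == "__main__":
    n = 40
    print(float(convergent([1] * (n + 1), [2] * n)))  # Example 8.2: close to 2
    print(float(convergent([1] * (n + 1), [1] * n)))  # Example 8.3: close to the golden ratio
```

With all $a_k = 1$ and all $b_k = 2$ the convergents hop back and forth around 2 (1, 3, 5/3, 11/5, . . .), and with all $b_k = 1$ they are ratios of consecutive Fibonacci numbers approaching Φ.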
Exercises 8.1.
1. Expand the following fractions into finite simple continued fractions:
(a) $\frac{7}{11}$, (b) $-\frac{11}{7}$, (c) $\frac{3}{13}$, (d) $\frac{13}{3}$, (e) $-\frac{42}{31}$.
2. Prove that a real number can be written as a finite simple continued fraction if and
only if it is rational. Suggestion: for the “if” statement, use the division algorithm
(see Theorem 2.15): For a, b ∈ Z with b > 0, we have a = qb + r where q, r ∈ Z with
0 ≤ r < b; if a, b are both nonnegative, then so is q.
3. Let $\xi = a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n} \ne 0$. Prove that
\[ \frac{1}{\xi} = \frac{1}{a_0\,+}\,\frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n}. \]
In particular, if $\xi = \langle a_0; a_1, \ldots, a_n \rangle \ne 0$, show that $\frac{1}{\xi} = \langle 0; a_0, a_1, a_2, \ldots, a_n \rangle$.

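4. A useful technique to study continued fractions is the following artifice of writing a
continued fraction within a continued fraction. For a continued fraction
$\xi = a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n}$, if m < n, prove that
\[ \xi = a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_m}{+\,\eta}, \quad \text{where} \quad \eta = a_m + \frac{b_{m+1}}{a_{m+1}\,+} \cdots \frac{b_n}{+\,a_n}. \]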

8.2. ⋆ Some of the most beautiful formulæ in the world V


Hold on to your seats, for you’re about to be taken on another journey through
the beautiful world of mathematical formulas!

8.2.1. Transformation of continued fractions. It will often be convenient


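to transform one continued fraction to another one. For example, let $\rho_1, \rho_2, \rho_3$ be
nonzero real numbers and suppose that the finite fraction
\[ \xi = a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3}}}, \]
where the $a_k$’s and $b_k$’s are real numbers, is defined. Then multiplying the top and
bottom of the fraction by $\rho_1$, we get
\[ \xi = a_0 + \cfrac{\rho_1 b_1}{\rho_1 a_1 + \cfrac{\rho_1 b_2}{a_2 + \cfrac{b_3}{a_3}}}. \]
Multiplying the top and bottom of the fraction with $\rho_1 b_2$ as numerator by $\rho_2$ gives
\[ \xi = a_0 + \cfrac{\rho_1 b_1}{\rho_1 a_1 + \cfrac{\rho_1 \rho_2 b_2}{\rho_2 a_2 + \cfrac{\rho_2 b_3}{a_3}}}. \]
Finally, multiplying the top and bottom of the fraction with $\rho_2 b_3$ as numerator by
$\rho_3$ gives
\[ \xi = a_0 + \cfrac{\rho_1 b_1}{\rho_1 a_1 + \cfrac{\rho_1 \rho_2 b_2}{\rho_2 a_2 + \cfrac{\rho_2 \rho_3 b_3}{\rho_3 a_3}}}. \]
In summary,
\[ a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3} = a_0 + \frac{\rho_1 b_1}{\rho_1 a_1\,+}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2\,+}\,\frac{\rho_2 \rho_3 b_3}{\rho_3 a_3}. \]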
A simple induction argument proves the following.
Theorem 8.1 (Transformation rules). For any real numbers $a_1, a_2, a_3, \ldots$,
$b_1, b_2, b_3, \ldots$, and nonzero constants $\rho_1, \rho_2, \rho_3, \ldots$, we have
\[ a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+}\,\frac{b_3}{a_3\,+} \cdots \frac{b_n}{+\,a_n} = a_0 + \frac{\rho_1 b_1}{\rho_1 a_1\,+}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2\,+}\,\frac{\rho_2 \rho_3 b_3}{\rho_3 a_3\,+} \cdots \frac{\rho_{n-1} \rho_n b_n}{+\,\rho_n a_n}, \]
in the sense when the left-hand side is defined, so is the right-hand side and this
equality holds. In particular, if the limit as n → ∞ of the left-hand side exists, then
the limit of the right-hand side also exists, and
\[ a_0 + \frac{b_1}{a_1\,+}\,\frac{b_2}{a_2\,+} \cdots \frac{b_n}{+\,a_n\,+} \cdots = a_0 + \frac{\rho_1 b_1}{\rho_1 a_1\,+}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2\,+} \cdots \frac{\rho_{n-1} \rho_n b_n}{+\,\rho_n a_n\,+} \cdots. \]
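8.2.2. Two stupendous series and continued fractions identities. Let
$\alpha_1, \alpha_2, \alpha_3, \ldots$ be any real numbers with $\alpha_k \ne 0$ and $\alpha_k \ne \alpha_{k-1}$ for all k. Observe
that
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_2} = \frac{\alpha_2 - \alpha_1}{\alpha_1 \alpha_2} = \cfrac{1}{\frac{\alpha_1 \alpha_2}{\alpha_2 - \alpha_1}}. \]
Since
\[ \frac{\alpha_1 \alpha_2}{\alpha_2 - \alpha_1} = \frac{\alpha_1(\alpha_2 - \alpha_1) + \alpha_1^2}{\alpha_2 - \alpha_1} = \alpha_1 + \frac{\alpha_1^2}{\alpha_2 - \alpha_1}, \]
we get
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_2} = \cfrac{1}{\alpha_1 + \cfrac{\alpha_1^2}{\alpha_2 - \alpha_1}}. \]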
This little exercise suggests the following theorem.


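Theorem 8.2. If $\alpha_1, \alpha_2, \alpha_3, \ldots$ are nonzero real numbers with $\alpha_k \ne \alpha_{k-1}$ for
all k, then for any n ∈ N,
\[ \sum_{k=1}^{n} \frac{(-1)^{k-1}}{\alpha_k} = \cfrac{1}{\alpha_1 + \cfrac{\alpha_1^2}{\alpha_2 - \alpha_1 + \cfrac{\alpha_2^2}{\alpha_3 - \alpha_2 + \ddots + \cfrac{\alpha_{n-1}^2}{\alpha_n - \alpha_{n-1}}}}}. \]
In particular, taking n → ∞, we conclude that
\[ \tag{8.6} \sum_{k=1}^\infty \frac{(-1)^{k-1}}{\alpha_k} = \frac{1}{\alpha_1\,+}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1\,+}\,\frac{\alpha_2^2}{\alpha_3 - \alpha_2\,+}\,\frac{\alpha_3^2}{\alpha_4 - \alpha_3\,+} \cdots \]
as long as either (and hence both) side makes sense.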


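Proof. This theorem is certainly true for alternating sums with $n = 1$ terms. Assume it is true for sums with $n$ terms; we shall prove it holds for sums with $n + 1$ terms. Observe that we can write
\[ \begin{aligned}
\sum_{k=1}^{n+1} \frac{(-1)^{k-1}}{\alpha_k} &= \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \cdots + \frac{(-1)^{n-1}}{\alpha_n} + \frac{(-1)^{n}}{\alpha_{n+1}} \\
&= \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \cdots + (-1)^{n-1}\left(\frac{1}{\alpha_n} - \frac{1}{\alpha_{n+1}}\right) \\
&= \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \cdots + (-1)^{n-1}\left(\frac{\alpha_{n+1} - \alpha_n}{\alpha_n\alpha_{n+1}}\right) \\
&= \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \cdots + (-1)^{n-1}\cfrac{1}{\cfrac{\alpha_n\alpha_{n+1}}{\alpha_{n+1} - \alpha_n}}.
\end{aligned} \]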
396 8. INFINITE CONTINUED FRACTIONS

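This is now a sum of $n$ terms. Thus, we can apply the induction hypothesis to conclude that
\[ \sum_{k=1}^{n+1} \frac{(-1)^{k-1}}{\alpha_k} = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\cdots\frac{\alpha_{n-1}^2}{\frac{\alpha_n\alpha_{n+1}}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1}}. \tag{8.7} \]
Since
\[ \begin{aligned}
\frac{\alpha_n\alpha_{n+1}}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1} &= \frac{\alpha_n(\alpha_{n+1} - \alpha_n) + \alpha_n^2}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1} \\
&= \alpha_n - \alpha_{n-1} + \frac{\alpha_n^2}{\alpha_{n+1} - \alpha_n},
\end{aligned} \]
putting this into (8.7) gives
\[ \sum_{k=1}^{n+1} \frac{(-1)^{k-1}}{\alpha_k} = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\cdots\frac{\alpha_{n-1}^2}{\alpha_n - \alpha_{n-1} + \frac{\alpha_n^2}{\alpha_{n+1} - \alpha_n}}. \]
This proves our induction step and completes our proof.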
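The finite identity in Theorem 8.2 can also be checked numerically. Below is a hedged sketch using exact fractions; the function names `alternating_sum` and `theorem_8_2_fraction` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def alternating_sum(alpha):
    """1/alpha_1 - 1/alpha_2 + ... + (-1)^(n-1)/alpha_n."""
    return sum(Fraction((-1) ** k) / a for k, a in enumerate(alpha))

def theorem_8_2_fraction(alpha):
    """Right-hand side of Theorem 8.2, evaluated from the innermost level outward."""
    tail = Fraction(0)
    # alpha[k] is alpha_{k+1} in the text's 1-based numbering.
    for k in range(len(alpha) - 1, 0, -1):
        tail = alpha[k - 1] ** 2 / (alpha[k] - alpha[k - 1] + tail)
    return 1 / (alpha[0] + tail)

# alpha_k = k gives partial sums of the alternating harmonic series.
alpha = [Fraction(k) for k in range(1, 10)]
print(alternating_sum(alpha) == theorem_8_2_fraction(alpha))
```

The comparison is exact for every $n$, which is what the induction proof asserts; only the choice $\alpha_k = k$ and the length 9 are arbitrary here.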


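Example 8.4. Since we know that
\[ \log 2 = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \frac11 - \frac12 + \frac13 - \frac14 + \cdots, \]
setting $\alpha_k = k$ in (8.6), we can write
\[ \log 2 = \frac{1}{1 +}\,\frac{1^2}{1 +}\,\frac{2^2}{1 +}\,\frac{3^2}{1 +}\cdots, \]
which we can also write as the equally beautiful expression
\[ \log 2 = \cfrac{1}{1 + \cfrac{1^2}{1 + \cfrac{2^2}{1 + \cfrac{3^2}{1 + \cfrac{4^2}{1 + \ddots}}}}}. \]
See Problem 1 for a general formula for $\log(1 + x)$.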
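Here is another interesting identity. Let $\alpha_1, \alpha_2, \alpha_3, \ldots$ be real, nonzero, and never equal to 1. Then observe that
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_1\alpha_2} = \frac{\alpha_2 - 1}{\alpha_1\alpha_2} = \cfrac{1}{\cfrac{\alpha_1\alpha_2}{\alpha_2 - 1}}. \]
Since
\[ \frac{\alpha_1\alpha_2}{\alpha_2 - 1} = \frac{\alpha_1(\alpha_2 - 1) + \alpha_1}{\alpha_2 - 1} = \alpha_1 + \frac{\alpha_1}{\alpha_2 - 1}, \]
we get
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_1\alpha_2} = \cfrac{1}{\alpha_1 + \cfrac{\alpha_1}{\alpha_2 - 1}}. \]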

We can continue by induction in much the same manner as we did in the proof of
Theorem 8.2 to obtain the following result.

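Theorem 8.3. For any real sequence $\alpha_1, \alpha_2, \alpha_3, \ldots$ with $\alpha_k \neq 0, 1$, we have
\[ \sum_{k=1}^{n} \frac{(-1)^{k-1}}{\alpha_1 \cdots \alpha_k} = \cfrac{1}{\alpha_1 + \cfrac{\alpha_1}{\alpha_2 - 1 + \cfrac{\alpha_2}{\alpha_3 - 1 + \ddots + \cfrac{\alpha_{n-1}}{\alpha_n - 1}}}}. \]
In particular, taking $n \to \infty$, we conclude that
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{\alpha_1 \cdots \alpha_k} = \frac{1}{\alpha_1 +}\,\frac{\alpha_1}{\alpha_2 - 1 +}\,\frac{\alpha_2}{\alpha_3 - 1 +}\cdots\frac{\alpha_{n-1}}{\alpha_n - 1 +}\cdots, \tag{8.8} \]
as long as either (and hence both) side makes sense.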


Theorems 8.2 and 8.3 turn series to continued fractions; in Problem 9 we do
the same for infinite products.
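As with Theorem 8.2, the finite form of Theorem 8.3 lends itself to an exact check. This is a minimal sketch; `product_series` and `theorem_8_3_fraction` are names we made up for the illustration, and the sample sequence is arbitrary.

```python
from fractions import Fraction

def product_series(alpha):
    """1/alpha_1 - 1/(alpha_1 alpha_2) + ... + (-1)^(n-1)/(alpha_1 ... alpha_n)."""
    total, running_product = Fraction(0), Fraction(1)
    for k, a in enumerate(alpha):
        running_product *= a
        total += Fraction((-1) ** k) / running_product
    return total

def theorem_8_3_fraction(alpha):
    """Right-hand side of Theorem 8.3, evaluated from the innermost level outward."""
    tail = Fraction(0)
    for k in range(len(alpha) - 1, 0, -1):
        tail = alpha[k - 1] / (alpha[k] - 1 + tail)
    return 1 / (alpha[0] + tail)

alpha = [Fraction(k) for k in range(2, 9)]  # illustrative choice: alpha_k = k + 1
print(product_series(alpha) == theorem_8_3_fraction(alpha))
```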

8.2.3. Continued fractions for arctan and π. We now use the identities
just learned to derive some remarkable continued fractions.
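Example 8.5. First, since
\[ \frac{\pi}{4} = \frac11 - \frac13 + \frac15 - \frac17 + \cdots, \]
using the limit expression (8.6) in Theorem 8.2,
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \frac{1}{\alpha_3} - \frac{1}{\alpha_4} + \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\,\frac{\alpha_2^2}{\alpha_3 - \alpha_2 +}\,\frac{\alpha_3^2}{\alpha_4 - \alpha_3 +}\cdots, \]
with $\alpha_k = 2k - 1$ (so that $\alpha_{k+1} - \alpha_k = 2$), we can write
\[ \frac{\pi}{4} = \cfrac{1}{1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}}. \]
Inverting both sides (see Problem 3 in Exercises 8.1), we obtain the incredible expansion:
\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}. \tag{8.9} \]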
This continued fraction was the very first continued fraction ever recorded, and was
written down without proof by Lord Brouncker (1620 – 1686), the first president
of the Royal Society of London.

Actually, we can derive (8.9) from a related expansion of the arctangent function, which is so neat that we shall derive it in two ways, using Theorem 8.2 then using Theorem 8.3.
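Example 8.6. Recall that
\[ \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{2n-1} + \cdots. \]
Setting $\alpha_1 = \frac{1}{x}$, $\alpha_2 = \frac{3}{x^3}$, $\alpha_3 = \frac{5}{x^5}$, and in general, $\alpha_n = \frac{2n-1}{x^{2n-1}}$ into the formula (8.6) from Theorem 8.2, we get the somewhat complicated formula
\[ \arctan x = \frac{1}{\frac{1}{x} +}\,\frac{\left(\frac{1}{x}\right)^2}{\frac{3}{x^3} - \frac{1}{x} +}\,\frac{\left(\frac{3}{x^3}\right)^2}{\frac{5}{x^5} - \frac{3}{x^3} +}\cdots\frac{\left(\frac{2n-3}{x^{2n-3}}\right)^2}{\frac{2n-1}{x^{2n-1}} - \frac{2n-3}{x^{2n-3}} +}\cdots. \]
However, we can simplify this using the transformation rule from Theorem 8.1:
\[ \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1\rho_2 b_2}{\rho_2 a_2 +}\cdots\frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n +}\cdots. \]
(Here we drop the $a_0$ term from that theorem.) Let $\rho_1 = x$, $\rho_2 = x^3$, $\ldots$, and in general, $\rho_n = x^{2n-1}$. Then,
\[ \frac{1}{\frac{1}{x} +}\,\frac{\left(\frac{1}{x}\right)^2}{\frac{3}{x^3} - \frac{1}{x} +}\,\frac{\left(\frac{3}{x^3}\right)^2}{\frac{5}{x^5} - \frac{3}{x^3} +}\,\frac{\left(\frac{5}{x^5}\right)^2}{\frac{7}{x^7} - \frac{5}{x^5} +}\cdots = \frac{x}{1 +}\,\frac{x^2}{3 - x^2 +}\,\frac{3^2 x^2}{5 - 3x^2 +}\,\frac{5^2 x^2}{7 - 5x^2 +}\cdots. \]
Thus,
\[ \arctan x = \frac{x}{1 +}\,\frac{x^2}{3 - x^2 +}\,\frac{3^2 x^2}{5 - 3x^2 +}\,\frac{5^2 x^2}{7 - 5x^2 +}\cdots, \]
or in a fancier way:
\[ \arctan x = \cfrac{x}{1 + \cfrac{x^2}{(3 - x^2) + \cfrac{3^2 x^2}{(5 - 3x^2) + \cfrac{5^2 x^2}{(7 - 5x^2) + \ddots}}}}. \tag{8.10} \]
In particular, setting $x = 1$ and inverting, we get Lord Brouncker's formula:
\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}. \]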
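To get a feeling for (8.10), one can truncate the continued fraction and compare it with a library arctangent. The sketch below is illustrative only: the function name `arctan_cf`, the truncation depth, and the test point $x = 0.5$ are our own assumptions, and floating-point arithmetic is used, so agreement is only up to rounding.

```python
import math

def arctan_cf(x, depth=60):
    """Truncate (8.10) after `depth` levels and evaluate from the bottom up."""
    tail = 0.0
    for k in range(depth, 0, -1):
        # Level k has partial numerator (2k-1)^2 x^2
        # and partial denominator (2k+1) - (2k-1) x^2.
        tail = (2 * k - 1) ** 2 * x * x / ((2 * k + 1) - (2 * k - 1) * x * x + tail)
    return x / (1 + tail)

print(arctan_cf(0.5), math.atan(0.5))
```

For $|x| < 1$ every denominator above is positive, so the truncation is always defined; at $x = 1$ the convergence is as slow as that of the series for $\pi/4$.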
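Example 8.7. We can also derive (8.10) using Theorem 8.3: Using once again that
\[ \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + (-1)^{n-1}\frac{x^{2n-1}}{2n-1} + \cdots \]
and setting $\alpha_1 = \frac{1}{x}$, $\alpha_2 = \frac{3}{x^2}$, $\alpha_3 = \frac{5}{3x^2}$, $\alpha_4 = \frac{7}{5x^2}$, $\ldots$, $\alpha_n = \frac{2n-1}{(2n-3)x^2}$ for $n \geq 2$, into the limiting formula (8.8) from Theorem 8.3:
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_1\alpha_2} + \frac{1}{\alpha_1\alpha_2\alpha_3} - \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1}{\alpha_2 - 1 +}\,\frac{\alpha_2}{\alpha_3 - 1 +}\cdots\frac{\alpha_n}{\alpha_{n+1} - 1 +}\cdots \]
we obtain
\[ \arctan x = \frac{1}{\frac{1}{x} +}\,\frac{\frac{1}{x}}{\frac{3}{x^2} - 1 +}\,\frac{\frac{3}{x^2}}{\frac{5}{3x^2} - 1 +}\cdots\frac{\frac{2n-1}{(2n-3)x^2}}{\frac{2n+1}{(2n-1)x^2} - 1 +}\cdots. \]
From Theorem 8.1, we know that
\[ \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1\rho_2 b_2}{\rho_2 a_2 +}\cdots\frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n +}\cdots. \]
In particular, setting $\rho_1 = x$, $\rho_2 = x^2$, $\rho_3 = 3x^2$, $\rho_4 = 5x^2$, and in general, $\rho_n = (2n - 3)x^2$ for $n \geq 2$ into the formula for $\arctan x$, we obtain (as you are invited to verify) the exact same expression (8.10)!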

Example 8.8. We leave the next two beauts to you! Applying Theorem 8.2 and/or Theorem 8.3 to Euler's sum $\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$, in Problem 2 we ask you to derive the formula
\[ \frac{6}{\pi^2} = 0^2 + 1^2 - \cfrac{1^4}{1^2 + 2^2 - \cfrac{2^4}{2^2 + 3^2 - \cfrac{3^4}{3^2 + 4^2 - \cfrac{4^4}{4^2 + 5^2 - \ddots}}}}, \tag{8.11} \]
which is, after inversion, the last formula on the front cover of this book.

Example 8.9. In Problem 9 we derive Euler's splendid formula [47, p. 89]:
\[ \frac{\pi}{2} = 1 + \cfrac{1}{1 + \cfrac{1 \cdot 2}{1 + \cfrac{2 \cdot 3}{1 + \cfrac{3 \cdot 4}{1 + \ddots}}}}. \tag{8.12} \]

8.2.4. Another continued fraction for π. We now derive another remarkable formula for $\pi$, which is due to Euler (according to Castellanos [47, p. 89]; the proof we give is found in Lange's article [129]). Consider first the telescoping sum
\[ \sum_{n=1}^{\infty} (-1)^{n-1}\left(\frac{1}{n} + \frac{1}{n+1}\right) = \left(\frac11 + \frac12\right) - \left(\frac12 + \frac13\right) + \left(\frac13 + \frac14\right) - + \cdots = 1. \]
Since
\[ \frac{\pi}{4} = \frac11 - \frac13 + \frac15 - \frac17 + \cdots = 1 - \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n+1}, \]
multiplying this expression by 4 and using the previous expression, we can write
\[ \begin{aligned}
\pi &= 4 - 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n+1} = 3 + 1 - 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n+1} \\
&= 3 + \sum_{n=1}^{\infty} (-1)^{n-1}\left(\frac{1}{n} + \frac{1}{n+1}\right) - 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n+1} \\
&= 3 + \sum_{n=1}^{\infty} (-1)^{n-1}\left(\frac{1}{n} + \frac{1}{n+1} - \frac{4}{2n+1}\right) \\
&= 3 + 4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n(2n+1)(2n+2)},
\end{aligned} \]
where we combined fractions in going from the third to fourth lines. We now apply the limiting formula (8.6) from Theorem 8.2 with $\alpha_n = 2n(2n+1)(2n+2)$. Observe that
\[ \begin{aligned}
\alpha_n - \alpha_{n-1} &= 2n(2n+1)(2n+2) - 2(n-1)(2n-1)(2n) \\
&= 4n\bigl[(2n+1)(n+1) - (n-1)(2n-1)\bigr] \\
&= 4n\bigl[2n^2 + 2n + n + 1 - (2n^2 - n - 2n + 1)\bigr] = 4n(6n) = 24n^2.
\end{aligned} \]
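Now putting the $\alpha_n$'s into the formula
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \frac{1}{\alpha_3} - \frac{1}{\alpha_4} + \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\,\frac{\alpha_2^2}{\alpha_3 - \alpha_2 +}\,\frac{\alpha_3^2}{\alpha_4 - \alpha_3 +}\cdots, \]
we get
\[ \begin{aligned}
4\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n(2n+1)(2n+2)} &= 4 \cdot \left(\frac{1}{2 \cdot 3 \cdot 4 +}\,\frac{(2 \cdot 3 \cdot 4)^2}{24 \cdot 2^2 +}\,\frac{(4 \cdot 5 \cdot 6)^2}{24 \cdot 3^2 +}\cdots\right) \\
&= \frac{1}{2 \cdot 3 +}\,\frac{(2 \cdot 3)^2 \cdot 4}{24 \cdot 2^2 +}\,\frac{(4 \cdot 5 \cdot 6)^2}{24 \cdot 3^2 +}\cdots.
\end{aligned} \]
Hence,
\[ \pi = 3 + \frac{1}{6 +}\,\frac{(2 \cdot 3)^2 \cdot 4}{24 \cdot 2^2 +}\,\frac{(4 \cdot 5 \cdot 6)^2}{24 \cdot 3^2 +}\cdots\frac{\bigl(2(n-1)(2n-1)(2n)\bigr)^2}{24 \cdot n^2 +}\cdots, \]
which is beautiful, but we can make this even more beautiful using the transformation rule from Theorem 8.1:
\[ \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1\rho_2 b_2}{\rho_2 a_2 +}\cdots\frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n +}\cdots. \]
Setting $\rho_1 = 1$ and $\rho_n = \frac{1}{4n^2}$ for $n \geq 2$, we see that for $n \geq 3$ we have
\[ \frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n} = \frac{\frac{1}{4(n-1)^2} \cdot \frac{1}{4n^2} \cdot \bigl(2(n-1)(2n-1)(2n)\bigr)^2}{\frac{1}{4n^2} \cdot 24 \cdot n^2} = \frac{(2n-1)^2}{6}; \]
the same formula holds for $n = 2$ as you can check. Thus,
\[ \pi = 3 + \frac{1^2}{6 +}\,\frac{3^2}{6 +}\,\frac{5^2}{6 +}\,\frac{7^2}{6 +}\cdots\frac{(2n-1)^2}{6 +}\cdots \]
or in more elegant notation:
\[ \pi = 3 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cfrac{7^2}{6 + \ddots}}}}. \tag{8.13} \]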
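Formula (8.13) is easy to explore numerically. In the sketch below, `pi_cf` truncates the continued fraction after a chosen number of partial fractions and `pi_series` sums the same number of terms of the series $3 + 4\sum (-1)^{n-1}/(2n(2n+1)(2n+2))$ from which it was derived; both names and the depths used are our own illustrative choices.

```python
import math

def pi_cf(depth):
    """3 + 1^2/(6 + 3^2/(6 + ... + (2*depth-1)^2/6)), evaluated bottom-up."""
    tail = 6.0
    for k in range(depth, 1, -1):
        tail = 6.0 + (2 * k - 1) ** 2 / tail
    return 3.0 + 1.0 / tail

def pi_series(terms):
    """Partial sum 3 + 4 * sum_{n=1}^{terms} (-1)^(n-1) / (2n(2n+1)(2n+2))."""
    return 3.0 + 4.0 * sum(
        (-1) ** (n - 1) / (2 * n * (2 * n + 1) * (2 * n + 2))
        for n in range(1, terms + 1)
    )

print(pi_cf(200), pi_series(200), math.pi)
```

Since the truncated fraction reproduces the partial sums of an alternating series with terms of size about $1/(2n^3)$, a depth of a few hundred already gives six or seven correct decimals.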

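8.2.5. Continued fractions and e. For our final beautiful example, we shall compute a continued fraction expansion for $e$. We begin with
\[ \frac{1}{e} = e^{-1} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!} = 1 - \frac{1}{1} + \frac{1}{1 \cdot 2} - \frac{1}{1 \cdot 2 \cdot 3} + \cdots, \]
so
\[ \frac{e - 1}{e} = 1 - \frac{1}{e} = \frac{1}{1} - \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 2 \cdot 3} - \frac{1}{1 \cdot 2 \cdot 3 \cdot 4} + \cdots. \]
Thus, setting $\alpha_k = k$ into the formula (8.8):
\[ \frac{1}{\alpha_1} - \frac{1}{\alpha_1\alpha_2} + \frac{1}{\alpha_1\alpha_2\alpha_3} - \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1}{\alpha_2 - 1 +}\,\frac{\alpha_2}{\alpha_3 - 1 +}\cdots\frac{\alpha_{n-1}}{\alpha_n - 1 +}\cdots, \]
we obtain
\[ \frac{e - 1}{e} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}}. \]
We can make this into an expression for $e$ as follows: First, invert the expression and then subtract 1 from both sides to get
\[ \frac{e}{e - 1} = 1 + \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}, \quad\text{then}\quad \frac{1}{e - 1} = \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}. \]
Second, invert again to obtain
\[ e - 1 = 1 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}}. \]
Finally, adding 1 to both sides we get the incredibly beautiful expression
\[ e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}}, \tag{8.14} \]
or in shorthand:
\[ e = 2 + \frac{2}{2 +}\,\frac{3}{3 +}\,\frac{4}{4 +}\,\frac{5}{5 +}\cdots. \]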
In the exercises you will derive other amazing formulæ.
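Before turning to the exercises, the expansion (8.14) can be tested with a few lines of code. This is a rough sketch under our own naming (`e_cf`) and with an arbitrary truncation depth.

```python
import math

def e_cf(depth):
    """2 + 2/(2 + 3/(3 + ... + (depth+1)/(depth+1))), evaluated bottom-up."""
    tail = 0.0
    for k in range(depth + 1, 1, -1):
        tail = k / (k + tail)
    return 2.0 + tail

for depth in (1, 2, 3, 20):
    print(depth, e_cf(depth))
print(math.e)
```

The first truncations are 3, 8/3 and 30/11, and the values settle down very quickly, as one would expect from a fraction built out of the factorial series for $1/e$.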
Exercises 8.2.
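1. Recall that $\log(1 + x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1}$. Using this formula, the formula (8.6) derived from Theorem 8.2, and also the transformation rule, prove the fabulous formula
\[ \log(1 + x) = \cfrac{x}{1 + \cfrac{1^2 x}{(2 - 1x) + \cfrac{2^2 x}{(3 - 2x) + \cfrac{3^2 x}{(4 - 3x) + \ddots}}}}. \]
Plug in $x = 1$ to derive our beautiful formula for $\log 2$.
2. Using Euler's sum $\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$, give two proofs of the formula (8.11), one using Theorem 8.2 and the other using Theorem 8.3. The transformation rules will come in handy.
3. (i) For any real numbers $\{\alpha_k\}$, prove that for any $n$,
\[ \sum_{k=0}^{n} \alpha_k x^k = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\frac{\alpha_2}{\alpha_1} x}{1 + \frac{\alpha_2}{\alpha_1} x +}\,\frac{-\frac{\alpha_3}{\alpha_2} x}{1 + \frac{\alpha_3}{\alpha_2} x +}\cdots\frac{-\frac{\alpha_n}{\alpha_{n-1}} x}{1 + \frac{\alpha_n}{\alpha_{n-1}} x} \]
provided, of course, that the right-hand side is defined, which we assume holds for every $n$.
(ii) Deduce that if the infinite series $\sum_{n=0}^{\infty} \alpha_n x^n$ converges, then
\[ \sum_{n=0}^{\infty} \alpha_n x^n = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\frac{\alpha_2}{\alpha_1} x}{1 + \frac{\alpha_2}{\alpha_1} x +}\,\frac{-\frac{\alpha_3}{\alpha_2} x}{1 + \frac{\alpha_3}{\alpha_2} x +}\cdots\frac{-\frac{\alpha_n}{\alpha_{n-1}} x}{1 + \frac{\alpha_n}{\alpha_{n-1}} x +}\cdots. \]
Transforming the continued fraction on the right, prove that
\[ \sum_{n=0}^{\infty} \alpha_n x^n = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\alpha_2 x}{\alpha_1 + \alpha_2 x +}\,\frac{-\alpha_1\alpha_3 x}{\alpha_2 + \alpha_3 x +}\cdots\frac{-\alpha_{n-2}\alpha_n x}{\alpha_{n-1} + \alpha_n x +}\cdots. \]
4. Writing $\arctan x = x\left(1 - \frac{y}{3} + \frac{y^2}{5} - \frac{y^3}{7} + \cdots\right)$ where $y = x^2$, and using the previous problem on $\left(1 - \frac{y}{3} + \frac{y^2}{5} - \frac{y^3}{7} + \cdots\right)$, derive the formula (8.10).
5. Let $x, y > 0$. Prove that
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{x + ny} = \frac{1}{x +}\,\frac{x^2}{y +}\,\frac{(x + y)^2}{y +}\,\frac{(x + 2y)^2}{y +}\,\frac{(x + 3y)^2}{y +}\cdots. \]
Suggestion: The formula (8.6) might help.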


6. Recall the partial fraction expansion $\pi x \cot \pi x = 1 + 2x^2 \sum_{n=1}^{\infty} \frac{1}{x^2 - n^2}$.
8.3. RECURRENCE RELATIONS, DIOPHANTUS’ TOMB, AND SHIPWRECKED SAILORS403

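(a) By breaking up $\frac{2x}{x^2 - n^2}$ using partial fractions, prove that
\[ \pi \cot \pi x = \frac{1}{x} - \frac{1}{1 - x} + \frac{1}{1 + x} - \frac{1}{2 - x} + \frac{1}{2 + x} - + \cdots. \]
(b) Derive the remarkable formula
\[ \pi \cot \pi x = \frac{1}{x +}\,\frac{x^2}{1 - 2x +}\,\frac{(1 - x)^2}{2x +}\,\frac{(1 + x)^2}{1 - 2x +}\,\frac{(2 - x)^2}{2x +}\,\frac{(2 + x)^2}{1 - 2x +}\cdots. \]
Putting $x = 1/4$, give a new proof of Lord Brouncker's formula.
(c) Derive
\[ \frac{\tan \pi x}{\pi x} = 1 + \frac{x}{1 - 2x +}\,\frac{(1 - x)^2}{2x +}\,\frac{(1 + x)^2}{1 - 2x +}\,\frac{(2 - x)^2}{2x +}\,\frac{(2 + x)^2}{1 - 2x +}\cdots. \]
7. Recall that $\frac{\pi}{\sin \pi x} = \frac{1}{x} + \sum_{n=1}^{\infty} \frac{2x}{n^2 - x^2}$. From this, derive the beautiful expression
\[ \frac{\sin \pi x}{\pi x} = 1 - \frac{x}{1 +}\,\frac{(1 - x)^2}{2x +}\,\frac{(1 + x)^2}{1 - 2x +}\,\frac{(2 - x)^2}{2x +}\,\frac{(2 + x)^2}{1 - 2x +}\cdots. \]
Suggestion: First break up $\frac{2x}{n^2 - x^2}$ and use an argument as you did for $\pi \cot \pi x$ to get a continued fraction expansion for $\frac{\pi}{\sin \pi x}$. From this, deduce the continued fraction expansion for $\sin \pi x / \pi x$.
8. From the expansion $\frac{\pi}{4 \cos \frac{\pi x}{2}} = \sum_{n=0}^{\infty} (-1)^n \frac{(2n+1)}{(2n+1)^2 - x^2}$ derive the beautiful expression
\[ \frac{\cos \frac{\pi x}{2}}{\frac{\pi}{2}} = x + 1 + \frac{(x + 1)^2}{-2 \cdot 1 +}\,\frac{(x - 1)^2}{-2 +}\,\frac{(x - 3)^2}{2 \cdot 3 +}\,\frac{(x + 3)^2}{2 +}\,\frac{(x + 5)^2}{-2 \cdot 5 +}\,\frac{(x - 5)^2}{-2 +}\cdots. \]
9. (Cf. [114]) In this problem we turn infinite products to continued fractions.
(a) Let $\alpha_1, \alpha_2, \alpha_3, \ldots$ be a sequence of real numbers with $\alpha_k \neq 0, -1$. Define sequences $b_1, b_2, b_3, \ldots$ and $a_0, a_1, a_2, \ldots$ by $b_1 = (1 + \alpha_0)\alpha_1$, $a_0 = 1 + \alpha_0$, $a_1 = 1$, and
\[ b_n = -(1 + \alpha_{n-1})\frac{\alpha_n}{\alpha_{n-1}}, \qquad a_n = 1 - b_n \qquad\text{for } n = 2, 3, 4, \ldots. \]
Prove (say by induction) that for any $n \in \mathbb{N}$,
\[ \prod_{k=0}^{n} (1 + \alpha_k) = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n}. \]
Taking $n \to \infty$, get a formula between infinite products and fractions.
(b) Using that $\frac{\sin \pi x}{\pi x} = \prod_{n=1}^{\infty} \left(1 - \frac{x^2}{n^2}\right) = (1 - x)(1 + x)\left(1 - \frac{x}{2}\right)\left(1 + \frac{x}{2}\right)\left(1 - \frac{x}{3}\right)\left(1 + \frac{x}{3}\right) \cdots$ and (a), derive the continued fraction expansion
\[ \frac{\sin \pi x}{\pi x} = 1 - \frac{x}{1 +}\,\frac{1 \cdot (1 - x)}{x +}\,\frac{1 \cdot (1 + x)}{1 - x +}\,\frac{2 \cdot (2 - x)}{x +}\,\frac{2 \cdot (2 + x)}{1 - x +}\cdots. \]
(c) Putting $x = 1/2$, prove (8.12). Putting $x = -1/2$, derive another continued fraction for $\pi/2$.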

8.3. Recurrence relations, Diophantus’ tomb, and shipwrecked sailors


In this section we define the Wallis-Euler recurrence relations, which generate
sequences of numerators and denominators for convergents of continued fractions.
The subject of Diophantine equations concerns finding integer solutions to polynomial
equations. Continued fractions (through the special properties of the Wallis-Euler
recurrence relations) turn out to play a very important role in this subject.

8.3.1. Convergents and recurrence relations. We shall call a continued fraction
\[ a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_n}{a_n +}\cdots \tag{8.15} \]
nonnegative if the $a_n$, $b_n$'s are real numbers with $a_n > 0$, $b_n \geq 0$ for all $n \geq 1$ ($a_0$ can be any sign). We shall not spend a lot of time on continued fractions when the $a_n$'s and $b_n$'s in (8.15), for $n \geq 1$, are arbitrary real numbers; it is only for nonnegative infinite continued fractions that we develop their convergence properties in Section 8.4. However, we shall come across continued fractions where some of the $a_n$, $b_n$ are negative; see for instance the beautiful expression (8.44) for $\cot x$ (and the following one for $\tan x$) in Section 8.7! We focus on continued fractions with $a_n, b_n > 0$ for $n \geq 1$ in order to avoid some possible contradictory statements. For instance, the convergents of the elementary example $\frac{1}{1 +}\,\frac{-1}{1 +}\,\frac{1}{1}$ have some weird properties. Let us form its convergents: $c_1 = 1$, which is OK, but
\[ c_2 = \frac{1}{1 +}\,\frac{-1}{1} = \cfrac{1}{1 + \cfrac{-1}{1}} = \frac{1}{1 - 1} = \frac{1}{0} = ???, \]
which is not OK.¹ However,
\[ c_3 = \frac{1}{1 +}\,\frac{-1}{1 +}\,\frac{1}{1} = \cfrac{1}{1 + \cfrac{-1}{1 + \cfrac{1}{1}}} = \cfrac{1}{1 + \cfrac{-1}{2}} = \cfrac{1}{\frac{1}{2}} = 2, \]
which is OK again! To avoid such craziness, we shall focus on continued fractions with $a_n > 0$ for $n \geq 1$ and $b_n \geq 0$, but we emphasize that much of what we do in this section and the next works in greater generality.
Let $\{a_n\}_{n=0}^{\infty}$, $\{b_n\}_{n=1}^{\infty}$ be sequences of real numbers with $a_n > 0$, $b_n \geq 0$ for all $n \geq 1$ (there is no restriction on $a_0$). The following sequences $\{p_n\}$, $\{q_n\}$ are central in the theory of continued fractions:
\[ \begin{aligned}
p_n &= a_n p_{n-1} + b_n p_{n-2}, & q_n &= a_n q_{n-1} + b_n q_{n-2}, \\
p_{-1} &= 1, \quad p_0 = a_0, & q_{-1} &= 0, \quad q_0 = 1.
\end{aligned} \tag{8.16} \]
We shall call these recurrence relations the Wallis-Euler recurrence relations ... you'll see why they're so central in a moment. In particular,
\[ \begin{aligned}
p_1 &= a_1 p_0 + b_1 p_{-1} = a_1 a_0 + b_1, \\
q_1 &= a_1 q_0 + b_1 q_{-1} = a_1.
\end{aligned} \tag{8.17} \]
We remark that $q_n > 0$ for $n = 0, 1, 2, 3, \ldots$. This is easily proved by induction: Certainly, $q_0 = 1$, $q_1 = a_1 > 0$ (recall that $a_n > 0$ for $n \geq 1$); thus assuming that $q_k > 0$ for $k = 0, \ldots, n - 1$, we have (recall that $b_n \geq 0$, so that $a_n q_{n-1} > 0$ and $b_n q_{n-2} \geq 0$),
\[ q_n = a_n q_{n-1} + b_n q_{n-2} > 0, \]

¹ Actually, in the continued fraction community, we always define $a/0 = \infty$ for $a \neq 0$ so we can get around this division by zero predicament by simply defining it away.

and our induction is complete. Observe that the zero-th convergent of the continued fraction (8.15) is $c_0 = a_0 = p_0/q_0$ and the first convergent is
\[ c_1 = a_0 + \frac{b_1}{a_1} = \frac{a_1 a_0 + b_1}{a_1} = \frac{p_1}{q_1}. \]
The central property of the $p_n$, $q_n$'s is the fact that $c_n = p_n/q_n$ for all $n$.
Theorem 8.4. For any positive real number $x$, we have
\[ a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_n}{x} = \frac{x p_{n-1} + b_n p_{n-2}}{x q_{n-1} + b_n q_{n-2}}, \qquad n = 1, 2, 3, \ldots. \tag{8.18} \]
(Note that the denominator is $> 0$ because $q_n > 0$ for $n \geq 0$.) In particular, setting $x = a_n$ and using the definition of $p_n$, $q_n$, we have
\[ c_n = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_n}{a_n} = \frac{p_n}{q_n}, \qquad n = 0, 1, 2, 3, \ldots. \]
Proof. We prove (8.18) by induction on the number of terms after $a_0$. The proof for one term after $a_0$ is easy: $a_0 + \frac{b_1}{x} = \frac{a_0 x + b_1}{x} = \frac{x p_0 + b_1 p_{-1}}{x q_0 + b_1 q_{-1}}$, since $p_0 = a_0$, $p_{-1} = 1$, $q_0 = 1$, and $q_{-1} = 0$. Assume that (8.18) holds when there are $n$ terms after $a_0$; we shall prove it holds for fractions with $n + 1$ terms after $a_0$. To do so, we write (see Problem 4 in Exercises 8.1 for the general technique)
\[ a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\,\frac{b_{n+1}}{x} = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{y}, \]
where
\[ y := a_n + \frac{b_{n+1}}{x} = \frac{x a_n + b_{n+1}}{x}. \]
Therefore, by our induction hypothesis, we have
\[ \begin{aligned}
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_{n+1}}{x} &= \frac{y p_{n-1} + b_n p_{n-2}}{y q_{n-1} + b_n q_{n-2}} = \frac{\left(\frac{x a_n + b_{n+1}}{x}\right) p_{n-1} + b_n p_{n-2}}{\left(\frac{x a_n + b_{n+1}}{x}\right) q_{n-1} + b_n q_{n-2}} \\
&= \frac{x a_n p_{n-1} + b_{n+1} p_{n-1} + x b_n p_{n-2}}{x a_n q_{n-1} + b_{n+1} q_{n-1} + x b_n q_{n-2}} \\
&= \frac{x(a_n p_{n-1} + b_n p_{n-2}) + b_{n+1} p_{n-1}}{x(a_n q_{n-1} + b_n q_{n-2}) + b_{n+1} q_{n-1}} \\
&= \frac{x p_n + b_{n+1} p_{n-1}}{x q_n + b_{n+1} q_{n-1}},
\end{aligned} \]
which completes our induction step and finishes our proof.
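The Wallis-Euler recurrences translate directly into a short program for computing convergents. The following is a sketch under our own conventions: the function name `convergents` and the list layout (`a` holds $a_0, \ldots, a_n$ and `b` holds $b_1, \ldots, b_n$) are assumptions made for the illustration.

```python
from fractions import Fraction

def convergents(a, b):
    """Return [(p_0, q_0), ..., (p_n, q_n)] from the Wallis-Euler recurrences (8.16)."""
    p_prev, q_prev = Fraction(1), Fraction(0)   # p_{-1}, q_{-1}
    p, q = Fraction(a[0]), Fraction(1)          # p_0, q_0
    result = [(p, q)]
    for ak, bk in zip(a[1:], b):
        p, p_prev = ak * p + bk * p_prev, p
        q, q_prev = ak * q + bk * q_prev, q
        result.append((p, q))
    return result

# Simple continued fraction with a = 2, 3, 4, 5 and all b_k = 1.
for p, q in convergents([2, 3, 4, 5], [1, 1, 1]):
    print(p, q, p / q)
```

For this input the last pair is $(157, 68)$, and $157/68 = 2 + 1/(3 + 1/(4 + 1/5))$, in line with Theorem 8.4.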

In the next theorem, we give various useful identities that the pn , qn satisfy.
Theorem 8.5 (Fundamental recurrence relations). For all $n \geq 1$, the following identities hold:
\[ \begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (-1)^{n-1} b_1 b_2 \cdots b_n, \\
p_n q_{n-2} - p_{n-2} q_n &= (-1)^{n} a_n b_1 b_2 \cdots b_{n-1},
\end{aligned} \]
and (where the formula for $c_n - c_{n-2}$ is only valid for $n \geq 2$)
\[ c_n - c_{n-1} = \frac{(-1)^{n-1} b_1 b_2 \cdots b_n}{q_n q_{n-1}}, \qquad c_n - c_{n-2} = \frac{(-1)^{n} a_n b_1 b_2 \cdots b_{n-1}}{q_n q_{n-2}}. \]
Proof. To prove that $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} b_1 b_2 \cdots b_n$ for $n = 1, 2, \ldots$, we proceed by induction. For $n = 1$, the left-hand side is (see (8.17))
\[ p_1 q_0 - p_0 q_1 = (a_1 a_0 + b_1) \cdot 1 - a_0 \cdot a_1 = b_1, \]
which is the right-hand side when $n = 1$. Assume our equality holds for $n$; we prove it holds for $n + 1$. By the Wallis-Euler recurrence relations, we have
\[ \begin{aligned}
p_{n+1} q_n - p_n q_{n+1} &= \bigl(a_{n+1} p_n + b_{n+1} p_{n-1}\bigr) q_n - p_n \bigl(a_{n+1} q_n + b_{n+1} q_{n-1}\bigr) \\
&= b_{n+1} p_{n-1} q_n - p_n b_{n+1} q_{n-1} \\
&= -b_{n+1}\bigl(p_n q_{n-1} - p_{n-1} q_n\bigr) \\
&= -b_{n+1} \cdot (-1)^{n-1} b_1 b_2 \cdots b_n = (-1)^{n} b_1 b_2 \cdots b_n b_{n+1},
\end{aligned} \]
which completes our induction step. To prove the second equality, we use the Wallis-Euler recurrence relations and the equality just proved:
\[ \begin{aligned}
p_n q_{n-2} - p_{n-2} q_n &= \bigl(a_n p_{n-1} + b_n p_{n-2}\bigr) q_{n-2} - p_{n-2} \bigl(a_n q_{n-1} + b_n q_{n-2}\bigr) \\
&= a_n p_{n-1} q_{n-2} - p_{n-2} a_n q_{n-1} \\
&= a_n \bigl(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}\bigr) \\
&= a_n \cdot (-1)^{n-2} b_1 b_2 \cdots b_{n-1} = (-1)^{n} a_n b_1 b_2 \cdots b_{n-1}.
\end{aligned} \]
Finally, the equations for $c_n - c_{n-1}$ and $c_n - c_{n-2}$ follow from dividing
\[ \begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (-1)^{n-1} b_1 b_2 \cdots b_n, \\
p_n q_{n-2} - p_{n-2} q_n &= (-1)^{n} a_n b_1 b_2 \cdots b_{n-1}
\end{aligned} \]
by $q_n q_{n-1}$ and $q_n q_{n-2}$, respectively.
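The two identities of Theorem 8.5 can be confirmed on concrete integer data. The sketch below is only illustrative; the names `wallis_euler` and `fundamental_identities_hold` and the sample sequences are ours. The lists `p` and `q` are stored with an offset of one, so that `p[k + 1]` holds $p_k$ and `p[0]` holds $p_{-1}$.

```python
from math import prod

def wallis_euler(a, b):
    """Return lists p, q with p[k + 1] = p_k and q[k + 1] = q_k for k = -1, ..., n."""
    p = [1, a[0]]
    q = [0, 1]
    for n in range(1, len(a)):
        p.append(a[n] * p[-1] + b[n - 1] * p[-2])   # b[n - 1] is b_n
        q.append(a[n] * q[-1] + b[n - 1] * q[-2])
    return p, q

def fundamental_identities_hold(a, b):
    """Check both identities of Theorem 8.5 for n = 1, ..., len(a) - 1."""
    p, q = wallis_euler(a, b)
    for n in range(1, len(a)):
        first = p[n + 1] * q[n] - p[n] * q[n + 1]            # p_n q_{n-1} - p_{n-1} q_n
        second = p[n + 1] * q[n - 1] - p[n - 1] * q[n + 1]   # p_n q_{n-2} - p_{n-2} q_n
        if first != (-1) ** (n - 1) * prod(b[:n]):
            return False
        if second != (-1) ** n * a[n] * prod(b[:n - 1]):
            return False
    return True

print(fundamental_identities_hold([1, 2, 3, 4, 5], [7, 1, 2, 3]))
```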
For simple continued fractions, the Wallis-Euler relations (8.16) and (8.17) and the fundamental recurrence relations take the following particularly simple forms:
Corollary 8.6 (Simple fundamental recurrence relations). For simple continued fractions, for all $n \geq 1$, if
\[ \begin{aligned}
p_n &= a_n p_{n-1} + p_{n-2}, & q_n &= a_n q_{n-1} + q_{n-2}, \\
p_0 &= a_0, \quad p_1 = a_0 a_1 + 1, & q_0 &= 1, \quad q_1 = a_1,
\end{aligned} \]
then $c_n = p_n/q_n$ for all $n \geq 0$, and for any $x > 0$,
\[ \langle a_0; a_1, a_2, a_3, \ldots, a_{n-1}, x \rangle = \frac{x p_{n-1} + p_{n-2}}{x q_{n-1} + q_{n-2}}, \qquad n = 1, 2, 3, \ldots. \tag{8.19} \]
Moreover, for all $n \geq 1$, the following identities hold:
\[ \begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (-1)^{n-1}, \\
p_n q_{n-2} - p_{n-2} q_n &= (-1)^{n} a_n,
\end{aligned} \]
and
\[ c_n - c_{n-1} = \frac{(-1)^{n-1}}{q_n q_{n-1}}, \qquad c_n - c_{n-2} = \frac{(-1)^{n} a_n}{q_n q_{n-2}}, \]
where the formula for $c_n - c_{n-2}$ is only valid for $n \geq 2$.


We also have the following interesting result.
Corollary 8.7. All the $p_n$, $q_n$ for a simple continued fraction are relatively prime; that is, $c_n = p_n/q_n$ is automatically in lowest terms.
Proof. The fact that $p_n$, $q_n$ are in lowest terms follows from the fact that
\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}, \]
so if an integer happens to divide both $p_n$ and $q_n$, then it divides $p_n q_{n-1} - p_{n-1} q_n$ also, so it must divide $(-1)^{n-1} = \pm 1$, which is impossible unless the number was $\pm 1$.
8.3.2. F Diophantine equations and sailors, coconuts, and monkeys. From Section 8.1, we know that any rational number can be written as a finite simple continued fraction. Also, any finite simple continued fraction is certainly a rational number because it is made up of additions and divisions of rational numbers and the rational numbers are closed under such operations. (For proofs of these statements see Problem 2 in Exercises 8.1.) Now as we showed at the beginning of Section 8.1, we can write
\[ \frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}} = \langle 2; 3, 4, 5 \rangle, \]
which has an odd number of terms (three to be exact) after the integer part 2. Observe that we can write this in another way:
\[ \frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{4 + \cfrac{1}{1}}}} = \langle 2; 3, 4, 4, 1 \rangle, \]
which has an even number of terms after the integer part. This example is typical: Any finite simple continued fraction can be written with an even or odd number of terms (by modifying the last term by 1). We summarize these remarks in the following theorem, which we shall use in Theorem 8.9.
Theorem 8.8. A real number can be expressed as a finite simple continued fraction if and only if it is rational, in which case, the rational number can be expressed as a continued fraction with either an even or an odd number of terms.
The proof of this theorem shall be left to you. We now come to the subject of diophantine equations, which are polynomial equations for which we wish to find integer solutions. We shall study very elementary diophantine equations in this section, the linear ones. Before doing so, it might be of interest to know that diophantine equations are named after the Greek mathematician Diophantus of Alexandria (200 A.D. – 284 A.D.). Diophantus is famous for at least two things: His book Arithmetica, which studies equations that we now call diophantine equations in his honor, and for the following riddle, which was supposedly written on his tombstone:

This tomb holds Diophantus. Ah, what a marvel! And the tomb
tells scientifically the measure of his life. God vouchsafed that he
should be a boy for the sixth part of his life; when a twelfth was
added, his cheeks acquired a beard; He kindled for him the light of
marriage after a seventh, and in the fifth year after his marriage
He granted him a son. Alas! late-begotten and miserable child,
when he had reached the measure of half his father's life, the
chill grave took him. After consoling his grief by this science of
numbers for four years, he reached the end of his life. [160].
Try to find how old Diophantus was when he died using elementary algebra. (Let $x$ = his age when he died; then you should end up with trying to solve the equation $x = \frac{1}{6}x + \frac{1}{12}x + \frac{1}{7}x + 5 + \frac{1}{2}x + 4$, obtaining $x = 84$.) Here is an easy way to find his age: Unravelling the above fancy language, and picking out two facts, we know that 1/12-th of his life was in youth and 1/7-th was as a bachelor. In particular, his age must be divisible by 7 and 12. The only integer that does this, and which is within the human lifespan, is $7 \cdot 12 = 84$. In particular, he spent $84/6 = 14$ years as a child, $84/12 = 7$ as a youth, $84/7 = 12$ years as a bachelor. He married at $14 + 7 + 12 = 33$; at $33 + 5 = 38$, his son was born, who later died at the age of $84/2 = 42$ years old, when Diophantus was 80. Finally, after 4 years doing the “science of numbers”, Diophantus died at the ripe old age of 84.
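For readers who prefer to let a machine do the algebra, the tombstone equation $x = \frac{1}{6}x + \frac{1}{12}x + \frac{1}{7}x + 5 + \frac{1}{2}x + 4$ is linear and can be solved exactly. The variable names below are our own; this is just one possible way to arrange the computation.

```python
from fractions import Fraction

# Fractions of his life: boyhood, youth, bachelorhood, and his son's lifetime.
fractions_of_life = [Fraction(1, 6), Fraction(1, 12), Fraction(1, 7), Fraction(1, 2)]
fixed_years = 5 + 4  # years before his son's birth plus years after his son's death

# x = (sum of fractions) * x + fixed_years  =>  x = fixed_years / (1 - sum of fractions)
age = fixed_years / (1 - sum(fractions_of_life))
print(age)
```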
After taking a moment to wipe away our tears, let us consider the following.
Theorem 8.9. If $a, b \in \mathbb{N}$ are relatively prime, then for any $c \in \mathbb{Z}$, the equation
\[ ax - by = c \]
has an infinite number of integer solutions $(x, y)$. Moreover, if $(x_0, y_0)$ is any one integral solution of the equation with $c = 1$, then for general $c \in \mathbb{Z}$, all solutions are of the form
\[ x = c x_0 + bt, \qquad y = c y_0 + at, \qquad\text{for all } t \in \mathbb{Z}. \]
Proof. In Problem 7 we ask you to prove this theorem using Problem 5 in Exercises 2.4; but we shall use continued fractions just for fun. We first solve the equation $ax - by = 1$. To do so, we write $a/b$ as a finite simple continued fraction: $a/b = \langle a_0; a_1, a_2, \ldots, a_n \rangle$ and by Theorem 8.8 we can choose $n$ to be odd. Then $a/b$ is equal to the $n$-th convergent $p_n/q_n$. Also, by our relations in Corollary 8.6, we know that
\[ p_n q_{n-1} - q_n p_{n-1} = (-1)^{n-1} = 1, \]
where we used that $n$ is odd. Since $p_n$ and $q_n$ are relatively prime (Corollary 8.7), as are $a$ and $b$, and $a/b = p_n/q_n$ with $b, q_n > 0$, we must have $p_n = a$ and $q_n = b$. Therefore, $a q_{n-1} - b p_{n-1} = 1$, so
\[ (x_0, y_0) = (q_{n-1}, p_{n-1}) \tag{8.20} \]
solves $a x_0 - b y_0 = 1$. Multiplying $a x_0 - b y_0 = 1$ by $c$ we get
\[ a(c x_0) - b(c y_0) = c. \]
Then $ax - by = c$ holds if and only if
\[ ax - by = a(c x_0) - b(c y_0) \iff a(x - c x_0) = b(y - c y_0). \]
This shows that $a$ divides $b(y - c y_0)$, which can be possible if and only if $a$ divides $y - c y_0$ since $a$ and $b$ are relatively prime. Thus, $y - c y_0 = at$ for some $t \in \mathbb{Z}$. Plugging $y - c y_0 = at$ into the equation $a(x - c x_0) = b(y - c y_0)$, we get
\[ a(x - c x_0) = b \cdot (at) = abt. \]
Cancelling $a$, we get $x - c x_0 = bt$ and our proof is now complete.

We remark that we need $a$ and $b$ to be relatively prime; for example, the equation $2x - 4y = 1$ has no integer solutions (because the left hand side is always even, so can never equal 1). We also remark that the proof of Theorem 8.9, in particular, the formula (8.20), also shows us how to find $(x_0, y_0)$: Just write $a/b$ as a simple continued fraction with an odd number $n$ of terms after the integer part of $a/b$ and compute the $(n - 1)$-st convergent to get $(x_0, y_0) = (q_{n-1}, p_{n-1})$.
Example 8.10. Consider the diophantine equation
\[ 157x - 68y = 12. \]
We already know that the continued fraction expansion of $a/b = \frac{157}{68}$ with an odd $n = 3$ is $\frac{157}{68} = \langle 2; 3, 4, 5 \rangle = \langle a_0; a_1, a_2, a_3 \rangle$. Thus,
\[ c_2 = 2 + \cfrac{1}{3 + \cfrac{1}{4}} = 2 + \frac{4}{13} = \frac{30}{13}. \]
Therefore, $(13, 30)$ is one solution of $157x - 68y = 1$, which we should check just to be sure:
\[ 157 \cdot 13 - 68 \cdot 30 = 2041 - 2040 = 1. \]
Since $c x_0 = 12 \cdot 13 = 156$ and $c y_0 = 12 \cdot 30 = 360$, the general solution of $157x - 68y = 12$ is
\[ x = 156 + 68t, \qquad y = 360 + 157t, \qquad t \in \mathbb{Z}. \]
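The recipe behind Theorem 8.9 and Example 8.10 can be turned into a small program. What follows is a sketch, not a polished library: the function names `simple_cf`, `solve_unit`, and `general_solution` are hypothetical, and we assume $a$ and $b$ are relatively prime positive integers as in the theorem.

```python
def simple_cf(a, b):
    """Partial quotients of a/b, computed with the Euclidean algorithm."""
    terms = []
    while b:
        quotient, remainder = divmod(a, b)
        terms.append(quotient)
        a, b = b, remainder
    return terms

def solve_unit(a, b):
    """Return (x0, y0) with a*x0 - b*y0 == 1, following formula (8.20)."""
    terms = simple_cf(a, b)
    n = len(terms) - 1                 # number of terms after the integer part
    if n % 2 == 0:                     # force n to be odd: <..., m> = <..., m - 1, 1>
        terms[-1] -= 1
        terms.append(1)
    # Wallis-Euler recurrences, stopping at the (n-1)-st convergent.
    p_prev, q_prev, p, q = 1, 0, terms[0], 1
    for t in terms[1:-1]:
        p_prev, q_prev, p, q = p, q, t * p + p_prev, t * q + q_prev
    return q, p

def general_solution(a, b, c, t):
    """Solution of a*x - b*y == c for the integer parameter t."""
    x0, y0 = solve_unit(a, b)
    return c * x0 + b * t, c * y0 + a * t

print(solve_unit(157, 68))
print([general_solution(157, 68, 12, t) for t in (-1, 0, 1)])
```

For $157x - 68y = 1$ this reproduces the solution $(13, 30)$ found above, and `general_solution` then runs through the family $x = 156 + 68t$, $y = 360 + 157t$.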
Example 8.11. We now come to a fun puzzle that involves diophantine equa-
tions; for more cool coconut puzzles, see [80, 81], [228], [212], and Problem 5. See
also [214] for the long history of such problems. Five sailors get shipwrecked on
an island where there is only a coconut tree and a very slim monkey. The sailors
gathered all the coconuts into a gigantic pile and went to sleep. At midnight, one
sailor woke up, and because he didn’t trust his mates, he divided the coconuts into
five equal piles, but with one coconut left over. He throws the extra one to the
monkey, hides his pile, puts the remaining coconuts back into a pile, and goes to
sleep. At one o’clock, the second sailor woke up, and because he was untrusting of
his mates, he divided the coconuts into five equal piles, but again with one coconut
left over. He throws the extra one to the monkey, hides his pile, puts the remaining
coconuts back into a pile, and goes to sleep. This exact same scenario continues
throughout the night with the other three sailors. In the morning, all the sailors
woke up, pretending as if nothing happened and divided the now minuscule pile of
coconuts into five equal piles, and they find yet again one coconut left over, which
they throw to the now very overweight monkey. Question: What is the smallest
possible number of coconuts in the original pile?
Let $x$ = the original number of coconuts. Remember that sailor #1 divided $x$ into five parts, but with one left over. This means that if $y_1$ is the number that he took, then $x = 5y_1 + 1$ and he left $4 \cdot y_1$ coconuts. In other words, he took
\[ \frac{1}{5}(x - 1) \text{ coconuts, leaving } 4 \cdot \frac{1}{5}(x - 1) = \frac{4}{5}(x - 1) \text{ coconuts.} \]
Similarly, if $y_2$ is the number of coconuts that sailor #2 took, then $\frac{4}{5}(x - 1) = 5y_2 + 1$ and he left $4 \cdot y_2$ coconuts. That is, the second sailor took
\[ \frac{1}{5} \cdot \left(\frac{4}{5}(x - 1) - 1\right) = \frac{4x - 9}{25} \text{ coconuts, leaving } 4 \cdot \frac{4x - 9}{25} = \frac{16x - 36}{25} \text{ coconuts.} \]
Repeating this argument, we find that sailors #3, #4, and #5 left
\[ \frac{64x - 244}{125}, \qquad \frac{256x - 1476}{625}, \qquad \frac{1024x - 8404}{3125} \]
coconuts, respectively. At the end, the sailors divided this last amount of coconuts into five piles, with one coconut left over. Thus, if $y$ is the number of coconuts in each pile, then we must have
\[ \frac{1024x - 8404}{3125} = 5y + 1 \implies 1024x - 15625y = 11529. \]
The equation $1024x - 15625y = 11529$ is just a diophantine equation since we want integers $x, y$ solving this equation. Moreover, $1024 = 2^{10}$ and $15625 = 5^6$ are relatively prime, so we can solve this equation by Theorem 8.9. First, we solve $1024x - 15625y = 1$ by writing $1024/15625$ as a continued fraction (this takes some algebra) and forcing $n$ to be odd (in this case $n = 9$):²
\[ \frac{1024}{15625} = \langle 0; 15, 3, 1, 6, 2, 1, 3, 2, 1 \rangle. \]
Second, we take the $(n - 1)$-st convergent:
\[ c_8 = \langle 0; 15, 3, 1, 6, 2, 1, 3, 2 \rangle = \frac{711}{10849}. \]
Thus, $(x_0, y_0) = (10849, 711)$. Since $c x_0 = 11529 \cdot 10849 = 125078121$ and $c y_0 = 11529 \cdot 711 = 8197119$, the integer solutions to $1024x - 15625y = 11529$ are
\[ x = 125078121 + 15625t, \qquad y = 8197119 + 1024t, \qquad t \in \mathbb{Z}. \tag{8.21} \]
This of course gives us infinitely many solutions. However, we want the smallest nonnegative solutions since $x$ and $y$ represent numbers of coconuts; thus, we need
\[ x = 125078121 + 15625t \geq 0 \implies t \geq -\frac{125078121}{15625} = -8004.999744\ldots, \]
and
\[ y = 8197119 + 1024t \geq 0 \implies t \geq -\frac{8197119}{1024} = -8004.9990234\ldots. \]
Thus, taking $t = -8004$ in (8.21), we finally arrive at $x = 15621$ and $y = 1023$. In conclusion, the smallest number of coconuts in the original pile is 15621 coconuts.
By the way, you can solve this coconut problem without continued fractions using
nothing more than basic high school algebra; try it!
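If you would rather check the answer by brute force, the night's events can simply be simulated. The sketch below uses our own helper names (`is_valid_pile`, `smallest_pile`) and tries pile sizes one by one, which is crude but perfectly adequate for five sailors.

```python
def is_valid_pile(x, sailors=5):
    """Replay the night: each sailor needs a remainder of one, as does the morning split."""
    for _ in range(sailors):
        if x % sailors != 1:
            return False
        share = (x - 1) // sailors          # the sailor hides one share
        x = share * (sailors - 1)           # the rest goes back into the pile
    return x % sailors == 1                 # morning division, one for the monkey

def smallest_pile(sailors=5):
    """Smallest positive pile size that survives the whole procedure."""
    x = 1
    while not is_valid_pile(x, sailors):
        x += 1
    return x

print(smallest_pile())
```

The search stops at 15621, in agreement with the value obtained from the diophantine equation.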
Exercises 8.3.
1. Find the general integral solutions of
(a) 7x − 11y = 1 , (b) 13x − 3y = 5 , (c) 13x − 15y = 100.

² See http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/cfCALC.html for a continued fraction calculator.
8.4. CONVERGENCE THEOREMS FOR INFINITE CONTINUED FRACTIONS 411

2. If all the $a_0, a_1, a_2, \ldots, a_n > 0$ (which guarantees that $p_0 = a_0 > 0$), prove that
\[ \frac{p_n}{p_{n-1}} = \langle a_n; a_{n-1}, a_{n-2}, \ldots, a_2, a_1, a_0 \rangle \quad\text{and}\quad \frac{q_n}{q_{n-1}} = \langle a_n; a_{n-1}, a_{n-2}, \ldots, a_2, a_1 \rangle \]
for $n = 1, 2, \ldots$. Suggestion: Observe that $\frac{p_k}{p_{k-1}} = \frac{a_k p_{k-1} + p_{k-2}}{p_{k-1}} = a_k + \cfrac{1}{\frac{p_{k-1}}{p_{k-2}}}$.
3. In this problem, we relate the Fibonacci numbers to continued fractions. Recall that the Fibonacci sequence $\{F_n\}$ is defined as $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \geq 2$. Let $p_n/q_n = \langle a_0; a_1, \ldots, a_n \rangle$ where all the $a_k$'s are equal to 1.
(a) Prove that $p_n = F_{n+2}$ and $q_n = F_{n+1}$ for all $n = -1, 0, 1, 2, \ldots$. Suggestion: Use the Wallis-Euler recurrence relations.
(b) From facts known about convergents, prove that $F_n$ and $F_{n+1}$ are relatively prime and derive the following famous identity, named after Giovanni Domenico Cassini (1625–1712) (also called Jean-Dominique Cassini)
\[ F_{n-1} F_{n+1} - F_n^2 = (-1)^n \qquad\text{(Cassini's identity).} \]
4. Imitating the proof of Theorem 8.9, show that a solution of $ax - by = -1$ can be found by writing $a/b$ as a simple continued fraction with an even number $n$ of terms after the integer part of $a/b$ and finding the $(n - 1)$-th convergent. Apply this method to find a solution of $157x - 68y = -1$ and $7x - 12y = -1$.
5. (Coconut problems) Here are some more coconut problems:
(a) Solve the coconut problem assuming the same antics as in the text, except for one
thing: there are no coconuts left over for the monkey at the end. That is, what is
the smallest possible number of coconuts in the original pile given that after the
sailors divided the coconuts in the morning, there are no coconuts left over?
(b) Solve the coconut problem assuming the same antics as in the text except that
during the night each sailor divided the pile into five equal piles with none left
over; however, after he puts the remaining coconuts back into a pile, the monkey
(being a thief himself) steals one coconut from the pile (before the next sailor wakes
up). In the morning, there is still one coconut left over for the monkey.
(c) Solve the coconut problem when there are only three sailors to begin with, otherwise
everything is the same as in the text (e.g. one coconut left over at the end). Solve
this same coconut problem when there are no coconuts left over at the end.
(d) Solve the coconut problem when there are seven sailors, otherwise everything is
the same as in the text. (Warning: Set aside an evening for long computations!)
6. Let $\alpha = \langle a_0; a_1, a_2, \ldots, a_m \rangle$, $\beta = \langle b_0; b_1, \ldots, b_n \rangle$ with $m, n \ge 0$ and the $a_k$, $b_k$'s integers
with $a_m, b_n > 1$ (such finite continued fractions are called regular). Prove that if
$\alpha = \beta$, then $a_k = b_k$ for all $k = 0, 1, 2, \ldots$. In other words, distinct regular finite simple
continued fractions define different rational numbers.
7. Prove Theorem 8.9 using Problem 5 in Exercises 2.4.

8.4. Convergence theorems for infinite continued fractions


Certainly the continued fraction $\langle 1; 1, 1, 1, 1, \ldots \rangle$ (if it converges) should be a
very special number, and it is: it turns out to be the golden ratio! In this section
we shall investigate the delicate issues surrounding convergence of infinite contin-
ued fractions (see Theorem 8.14, the continued fraction convergence theorem); in
particular, we prove that any simple continued fraction converges. We also show
how to write any real number as a simple continued fraction via the canonical
continued fraction algorithm. Finally, we prove that a real number is irrational
if and only if its simple continued fraction expansion is infinite.

8.4.1. Monotonicity properties of convergents. Let $\{c_n\}$ denote the convergents of a nonnegative infinite continued fraction
\[ a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_n}{a_n +} \cdots, \]
where, recall, nonnegative means that the $a_n$, $b_n$'s are real numbers with $a_n > 0$, $b_n \ge 0$ for all $n \ge 1$, and there is no restriction on $a_0$. The Wallis-Euler recurrence relations (8.16) are
\[ p_n = a_n p_{n-1} + b_n p_{n-2}, \qquad q_n = a_n q_{n-1} + b_n q_{n-2}, \]
\[ p_{-1} = 1, \quad p_0 = a_0, \quad q_{-1} = 0, \quad q_0 = 1. \]
Then (cf. (8.17))
\[ p_1 = a_1 p_0 + b_1 p_{-1} = a_1 a_0 + b_1, \qquad q_1 = a_1 q_0 + b_1 q_{-1} = a_1, \]
and all the $q_n$'s are positive (see discussion below (8.17)). By Theorem 8.4 we have $c_n = p_n/q_n$ for all $n$ and by Theorem 8.5, for all $n \ge 1$ the fundamental recurrence relations are
\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} b_1 b_2 \cdots b_n, \]
\[ p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n b_1 b_2 \cdots b_{n-1}, \]
and
\[ c_n - c_{n-1} = \frac{(-1)^{n-1} b_1 b_2 \cdots b_n}{q_n q_{n-1}}, \qquad c_n - c_{n-2} = \frac{(-1)^n a_n b_1 b_2 \cdots b_{n-1}}{q_n q_{n-2}}, \]
where the formula for $c_n - c_{n-2}$ is only valid for $n \ge 2$. Using these fundamental recurrence
relations, we shall prove the following monotonicity properties of the $c_n$'s, which are
important in the study of the convergence properties of the $c_n$'s.
Theorem 8.10. Assume that $b_n > 0$ for each $n$. Then the convergents $\{c_n\}$
satisfy the inequalities: For all $n \in \mathbb{N}$,
\[ c_0 < c_2 < c_4 < \cdots < c_{2n} < c_{2n-1} < \cdots < c_5 < c_3 < c_1. \]
That is, the even convergents form a strictly increasing sequence while the odd
convergents form a strictly decreasing sequence.
Proof. Replacing $n$ with $2n$ in the fundamental recurrence relation for $c_n - c_{n-2}$, we see that
\[ c_{2n} - c_{2n-2} = \frac{(-1)^{2n} a_{2n} b_1 b_2 \cdots b_{2n-1}}{q_{2n} q_{2n-2}} = \frac{a_{2n} b_1 b_2 \cdots b_{2n-1}}{q_{2n} q_{2n-2}} > 0. \]
This shows that $c_{2n-2} < c_{2n}$ for all $n \ge 1$ and hence, $c_0 < c_2 < c_4 < \cdots$. Replacing
$n$ with $2n - 1$ in the fundamental relation for $c_n - c_{n-2}$, one can prove in the same way that the
odd convergents form a strictly decreasing sequence. Replacing $n$ with $2n$ in the
fundamental recurrence relation for $c_n - c_{n-1}$, we see that
\[ c_{2n} - c_{2n-1} = \frac{(-1)^{2n-1} b_1 b_2 \cdots b_{2n}}{q_{2n} q_{2n-1}} = -\frac{b_1 b_2 \cdots b_{2n}}{q_{2n} q_{2n-1}} < 0 \implies c_{2n} < c_{2n-1}. \tag{8.22} \]

If the continued fraction is actually finite, that is, if $b_{\ell+1} = 0$ for some $\ell$, then
this theorem still holds, but we need to make sure that $2n \le \ell$. By the monotone
criterion for sequences, we have

Corollary 8.11. The limits of the even and odd convergents exist, and
\[ c_0 < c_2 < c_4 < \cdots < \lim c_{2n} \le \lim c_{2n-1} < \cdots < c_5 < c_3 < c_1. \]
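The interlacing of even and odd convergents is easy to observe numerically. The following is a minimal sketch, not part of the text's development: the function name `convergents` and the test fraction with $a_n = 1$, $b_n = 2$ are illustrative assumptions. It computes convergents exactly with the Wallis-Euler recurrences and checks the ordering.

```python
from fractions import Fraction

def convergents(a, b):
    """Return [c_0, ..., c_n] for a_0 + b_1/(a_1 + b_2/(a_2 + ...)).

    a = [a_0, ..., a_n] and b = [b_1, ..., b_n]; exact rational arithmetic.
    """
    p_prev, q_prev = Fraction(1), Fraction(0)   # p_{-1}, q_{-1}
    p, q = Fraction(a[0]), Fraction(1)          # p_0, q_0
    cs = [p / q]
    for a_n, b_n in zip(a[1:], b):
        # Wallis-Euler: p_n = a_n p_{n-1} + b_n p_{n-2}, and likewise for q_n
        p, p_prev = a_n * p + b_n * p_prev, p
        q, q_prev = a_n * q + b_n * q_prev, q
        cs.append(p / q)
    return cs

# Illustrative choice (an assumption, not from the text): a_n = 1, b_n = 2.
cs = convergents([1] * 10, [2] * 9)
evens, odds = cs[0::2], cs[1::2]
print([str(c) for c in cs[:5]])
print(all(x < y for x, y in zip(evens, evens[1:])))   # even convergents increase
print(all(x > y for x, y in zip(odds, odds[1:])))     # odd convergents decrease
print(max(evens) < min(odds))                         # evens stay below odds
```

Exact fractions are used so that the strict inequalities are not blurred by rounding.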
8.4.2. Convergence results for continued fractions. As a consequence of
the previous corollary, it follows that $\lim c_n$ exists if and only if $\lim c_{2n} = \lim c_{2n-1}$,
which holds if and only if
\[ c_{2n} - c_{2n-1} = \frac{-b_1 b_2 \cdots b_{2n}}{q_{2n} q_{2n-1}} \to 0 \quad \text{as } n \to \infty. \tag{8.23} \]
In the following theorem, we give one condition under which this is satisfied.
Theorem 8.12. Let $\{a_n\}_{n=0}^\infty$, $\{b_n\}_{n=1}^\infty$ be sequences such that $a_n, b_n > 0$ for
$n \ge 1$ and
\[ \sum_{n=1}^\infty \frac{a_n a_{n+1}}{b_{n+1}} = \infty. \]
Then (8.23) holds, so the continued fraction $\xi := a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +} \cdots$ converges.
Moreover, for any even $j$ and odd $k$, we have
\[ c_0 < c_2 < c_4 < \cdots < c_j < \cdots < \xi < \cdots < c_k < \cdots < c_5 < c_3 < c_1. \]
Proof. Observe that for any $n \ge 2$, we have $q_{n-1} = a_{n-1} q_{n-2} + b_{n-1} q_{n-3} \ge a_{n-1} q_{n-2}$ since $b_{n-1}, q_{n-3} \ge 0$. Thus, for $n \ge 2$ we have
\[ q_n = a_n q_{n-1} + b_n q_{n-2} \ge a_n \cdot (a_{n-1} q_{n-2}) + b_n q_{n-2} = q_{n-2} (a_n a_{n-1} + b_n), \]
so
\[ q_n \ge q_{n-2} (a_n a_{n-1} + b_n). \]
Applying this formula over and over again, we find that for any $n \ge 1$,
\begin{align*}
q_{2n} &\ge q_{2n-2} (a_{2n} a_{2n-1} + b_{2n}) \\
&\ge q_{2n-4} (a_{2n-2} a_{2n-3} + b_{2n-2}) \cdot (a_{2n} a_{2n-1} + b_{2n}) \\
&\ \ \vdots \\
&\ge q_0 (a_2 a_1 + b_2)(a_4 a_3 + b_4) \cdots (a_{2n} a_{2n-1} + b_{2n}).
\end{align*}
A similar argument shows that for any $n \ge 2$,
\[ q_{2n-1} \ge q_1 (a_3 a_2 + b_3)(a_5 a_4 + b_5) \cdots (a_{2n-1} a_{2n-2} + b_{2n-1}). \]
Thus, for any $n \ge 2$, we have
\[ q_{2n} q_{2n-1} \ge q_0 q_1 (a_2 a_1 + b_2)(a_3 a_2 + b_3) \cdots (a_{2n-1} a_{2n-2} + b_{2n-1})(a_{2n} a_{2n-1} + b_{2n}). \]
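Factoring out all the $b_k$'s, we conclude that
\[ q_{2n} q_{2n-1} \ge q_0 q_1 b_2 \cdots b_{2n} \Big(1 + \frac{a_2 a_1}{b_2}\Big)\Big(1 + \frac{a_3 a_2}{b_3}\Big) \cdots \Big(1 + \frac{a_{2n} a_{2n-1}}{b_{2n}}\Big), \]
which shows that
\[ \frac{b_1 b_2 \cdots b_{2n}}{q_{2n} q_{2n-1}} \le \frac{b_1}{q_0 q_1} \cdot \frac{1}{\prod_{k=1}^{2n-1} \Big(1 + \frac{a_k a_{k+1}}{b_{k+1}}\Big)}. \tag{8.24} \]
Now recall (see Theorem 7.2) that a series $\sum_{k=1}^\infty \alpha_k$ of positive real numbers converges
if and only if the infinite product $\prod_{k=1}^\infty (1 + \alpha_k)$ converges. Thus, since we
are given that $\sum_{k=1}^\infty \frac{a_k a_{k+1}}{b_{k+1}} = \infty$, we have $\prod_{k=1}^\infty \Big(1 + \frac{a_k a_{k+1}}{b_{k+1}}\Big) = \infty$ as well, so the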

right-hand side of (8.24) vanishes as n → ∞. The fact that for even j and odd k,
we have c0 < c2 < c4 < · · · < cj < · · · < ξ < · · · < ck < · · · < c5 < c3 < c1 follows
from Corollary 8.11. This completes our proof. 

For another convergence theorem, see Problems 6 and 9. An important case to which
this theorem applies is that of simple continued fractions: For a simple
continued fraction $\langle a_0; a_1, a_2, a_3, \ldots \rangle$, all the $b_n$'s equal 1, so
\[ \sum_{n=1}^\infty \frac{a_n a_{n+1}}{b_{n+1}} = \sum_{n=1}^\infty a_n a_{n+1} = \infty, \]
since all the $a_n$'s with $n \ge 1$ are positive integers (so every term of the sum is at least 1). Thus,


Corollary 8.13. Infinite simple continued fractions always converge and if
ξ is the limit of such a fraction, then for any even j and odd k, the convergents
satisfy
c0 < c2 < c4 < · · · < cj < · · · < ξ < · · · < ck < · · · < c5 < c3 < c1 .
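Example 8.12. In particular, the very special fraction $\Phi := \langle 1; 1, 1, 1, \ldots \rangle$
converges. To what, you ask? Observe that
\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{\ddots}}} = 1 + \frac{1}{\Phi} \implies \Phi = 1 + \frac{1}{\Phi}. \]
We can also get this formula from convergents: The $n$-th convergent of $\Phi$ is
\[ c_n = 1 + \cfrac{1}{1 + \cfrac{1}{\ddots + \cfrac{1}{1 + \cfrac{1}{1}}}} = 1 + \frac{1}{c_{n-1}}. \]
Thus, if we set $\Phi = \lim c_n$, which we know exists, then taking $n \to \infty$ on both sides
of $c_n = 1 + \frac{1}{c_{n-1}}$, we get $\Phi = 1 + 1/\Phi$ just as before. Thus, $\Phi^2 - \Phi - 1 = 0$;
solving for $\Phi$ and taking the positive root (since $\Phi > 1$), we get
\[ \Phi = \frac{1 + \sqrt{5}}{2}, \]
the golden ratio.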
As an unrelated side note, we remark that $\Phi$ can be used to get a fairly accurate
(and well-known) approximation to $\pi$:
\[ \pi \approx \frac{6}{5} \Phi^2 = 3.1416\ldots. \]
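To watch the convergents of $\langle 1; 1, 1, \ldots \rangle$ close in on the golden ratio, here is a small sketch using the Wallis-Euler recurrences with all $b_n = 1$. The helper name `simple_convergents` and the cutoff of 30 partial quotients are our own illustrative choices.

```python
from fractions import Fraction
import math

def simple_convergents(quotients):
    """Convergents p_n/q_n of the simple continued fraction <a_0; a_1, ...>."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0, q_0
    out = [Fraction(p, q)]
    for a_n in quotients[1:]:
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        out.append(Fraction(p, q))
    return out

cs = simple_convergents([1] * 30)
phi = (1 + math.sqrt(5)) / 2
print([str(c) for c in cs[:6]])          # ratios of consecutive Fibonacci numbers
print(float(cs[-1]), phi)                # the last convergent is very close to phi
```

The numerators and denominators that appear are consecutive Fibonacci numbers, which is the content of Problem 7 in the exercises.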

Example 8.13. The continued fraction $\xi := 3 + \frac{4}{6 +}\,\frac{4}{6 +}\,\frac{4}{6 +}\,\frac{4}{6 +} \cdots$ was studied
by Rafael Bombelli (1526–1572) and was one of the first continued fractions ever to
be studied. Since $\sum_{n=1}^\infty \frac{a_n a_{n+1}}{b_{n+1}} = \sum_{n=1}^\infty \frac{6^2}{4} = \infty$, this continued fraction converges.
By the same reasoning, the continued fraction $\eta := 6 + \frac{4}{6 +}\,\frac{4}{6 +}\,\frac{4}{6 +} \cdots$ also converges.
Moreover, $\xi = \eta - 3$ and
\[ \eta = 6 + \cfrac{4}{6 + \cfrac{4}{6 + \cfrac{4}{\ddots}}} = 6 + \frac{4}{\eta} \implies \eta = 6 + \frac{4}{\eta} \implies \eta^2 - 6\eta - 4 = 0. \]
Solving this quadratic equation for $\eta$ (and using $\eta > 0$), we find that $\eta = 3 + \sqrt{13}$. Hence, $\xi = \eta - 3 = \sqrt{13}$. Isn't this fun!
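As a quick sanity check on Bombelli's fraction, one can evaluate truncations from the innermost term outward. This is only a floating-point sketch; the function name `bombelli` and the chosen depths are illustrative assumptions.

```python
import math

def bombelli(depth):
    """Evaluate 3 + 4/(6 + 4/(6 + ... + 4/6)) with `depth` partial numerators."""
    tail = 0.0
    for _ in range(depth):
        tail = 4 / (6 + tail)        # build the fraction from the bottom up
    return 3 + tail

for depth in (1, 2, 5, 20):
    print(depth, bombelli(depth))
print(math.sqrt(13))
```

The printed values alternate around $\sqrt{13}$ and settle down within a handful of steps.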
8.4.3. The canonical continued fraction algorithm and the continued
fraction convergence theorem. What if we want to construct the continued
fraction expansion of a real number? We know how to construct such an expansion
for rational numbers, so let us review this; the same method will work for irrational
numbers. Consider again our friend $\frac{157}{68} = \langle 2; 3, 4, 5 \rangle = \langle a_0; a_1, a_2, a_3 \rangle$, and let us
recall how we found its continued fraction expansion. First, we wrote $\xi_0 := \frac{157}{68}$ as
\[ \xi_0 = 2 + \frac{1}{\xi_1}, \quad \text{where } \xi_1 = \frac{68}{21} > 1. \]
In particular, notice that
\[ a_0 = 2 = \lfloor \xi_0 \rfloor, \]
where, recall, $\lfloor x \rfloor$ for a real number $x$ denotes the largest integer $\le x$.
Second, we wrote $\xi_1 = \frac{68}{21}$ as
\[ \xi_1 = 3 + \frac{1}{\xi_2}, \quad \text{where } \xi_2 = \frac{21}{5} > 1. \]
In particular, notice that
\[ a_1 = 3 = \lfloor \xi_1 \rfloor. \]
Third, we wrote
\[ \xi_2 = \frac{21}{5} = 4 + \frac{1}{\xi_3}, \quad \text{where } \xi_3 = 5 > 1. \]
In particular, notice that
\[ a_2 = 4 = \lfloor \xi_2 \rfloor. \]
Finally, $a_3 = \lfloor \xi_3 \rfloor = \xi_3$ cannot be broken up any further so we stop here. Hence,
\[ \frac{157}{68} = \xi_0 = 2 + \cfrac{1}{\xi_1} = 2 + \cfrac{1}{3 + \cfrac{1}{\xi_2}} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{\xi_3}}} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}}. \]
We've just found the canonical (simple) continued fraction of 157/68.
Notice that we end with the number 5, which is greater than 1; this will always
happen whenever we do the above procedure for a noninteger rational number (such
continued fractions were called regular in Problem 6 of Exercises 8.3). We can do
the same exact procedure for irrational numbers! Let $\xi$ be an irrational number.
First, we set $\xi_0 = \xi$ and define $a_0 := \lfloor \xi_0 \rfloor \in \mathbb{Z}$. Then, $0 < \xi_0 - a_0 < 1$ (note that
$\xi_0 \ne a_0$ since $\xi_0$ is irrational), so we can write
\[ \xi_0 = a_0 + \frac{1}{\xi_1}, \quad \text{where } \xi_1 := \frac{1}{\xi_0 - a_0} > 1, \]
where we used that $0 < \xi_0 - a_0 < 1$. Note that $\xi_1$ is irrational because if not, then $\xi_0$
would be rational contrary to assumption. Second, we define $a_1 := \lfloor \xi_1 \rfloor \in \mathbb{N}$. Then,
$0 < \xi_1 - a_1 < 1$, so we can write
\[ \xi_1 = a_1 + \frac{1}{\xi_2}, \quad \text{where } \xi_2 := \frac{1}{\xi_1 - a_1} > 1. \]
Note that $\xi_2$ is irrational. Third, we define $a_2 := \lfloor \xi_2 \rfloor \in \mathbb{N}$. Then, $0 < \xi_2 - a_2 < 1$,
so we can write
\[ \xi_2 = a_2 + \frac{1}{\xi_3}, \quad \text{where } \xi_3 := \frac{1}{\xi_2 - a_2} > 1. \]
Note that $\xi_3$ is irrational. We can continue this procedure to “infinity” creating
a sequence $\{\xi_n\}_{n=0}^\infty$ of real numbers with $\xi_n > 0$ for $n \ge 1$ called the complete
quotients of $\xi$, and a sequence $\{a_n\}_{n=0}^\infty$ of integers with $a_n > 0$ for $n \ge 1$ called
the partial quotients of $\xi$, such that
\[ \xi_n = a_n + \frac{1}{\xi_{n+1}}, \quad n = 0, 1, 2, 3, \ldots. \]
Thus,
\[ \xi = \xi_0 = a_0 + \cfrac{1}{\xi_1} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\xi_2}} = \cdots \text{ “=” } a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{a_4 + \ddots}}}}. \tag{8.25} \]
We emphasize that we have actually not proved that ξ is equal to the infinite con-
tinued fraction on the far right (hence, the quotation marks)! But, as a consequence
of the following theorem, this equality follows; then the continued fraction in (8.25)
is called the canonical (simple) continued fraction expansion of ξ.
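The floor-and-reciprocate procedure just described can be sketched in a few lines. The version below is an assumption-laden illustration, not the book's own code: it works in exact rational arithmetic, so it only handles rational inputs (where the process terminates), and the name `canonical_cf` and the `max_terms` safeguard are ours.

```python
from fractions import Fraction
import math

def canonical_cf(xi, max_terms=50):
    """Partial quotients a_0, a_1, ... of a rational number xi."""
    xi = Fraction(xi)
    quotients = []
    for _ in range(max_terms):
        a = math.floor(xi)           # a_n = floor(xi_n)
        quotients.append(a)
        frac = xi - a                # 0 <= xi_n - a_n < 1
        if frac == 0:                # xi_n is an integer, so we stop
            break
        xi = 1 / frac                # xi_{n+1} = 1 / (xi_n - a_n) > 1
    return quotients

print(canonical_cf(Fraction(157, 68)))   # [2, 3, 4, 5]
```

For an irrational $\xi$ the same loop never terminates, and in floating point the later partial quotients become unreliable, so any numerical experiment should trust only the first several terms.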
Theorem 8.14 (Continued fraction convergence theorem). Let $\xi_0, \xi_1, \xi_2, \ldots$
be any sequence of real numbers with $\xi_n > 0$ for $n \ge 1$ and suppose that
these numbers are related by
\[ \xi_n = a_n + \frac{b_{n+1}}{\xi_{n+1}}, \quad n = 0, 1, 2, \ldots, \]
for sequences of real numbers $\{a_n\}_{n=0}^\infty$, $\{b_n\}_{n=1}^\infty$ with $a_n, b_n > 0$ for $n \ge 1$ and which
satisfy $\sum_{n=1}^\infty \frac{a_n a_{n+1}}{b_{n+1}} = \infty$. Then $\xi_0$ is equal to the continued fraction
\[ \xi_0 = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\,\frac{b_5}{a_5 +} \cdots. \]
In particular, for any real number $\xi$, the canonical continued fraction expansion
(8.25) converges to $\xi$.
Proof. By Theorem 8.12, the continued fraction $a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots$ converges.
Let $\{c_k = p_k/q_k\}$ denote the convergents of this infinite continued fraction
and let $\varepsilon > 0$. Then by Theorem 8.12, there is an $N$ such that
\[ n > N \implies |c_n - c_{n-1}| = \frac{b_1 b_2 \cdots b_n}{q_n q_{n-1}} < \varepsilon. \]
Fix $n > N$ and consider the finite continued fraction obtained as in (8.25) by
writing out $\xi_0$ to the $n$-th term:
\[ \xi_0 = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_{n-1}}{a_{n-1} +}\,\frac{b_n}{\xi_n}. \]
Let $\{c_k' = p_k'/q_k'\}$ denote the convergents of this finite continued fraction. Then
observe that $p_k = p_k'$ and $q_k = q_k'$ for $k \le n - 1$ and $c_n' = \xi_0$. Therefore, by our
fundamental recurrence relations, we have
\[ |\xi_0 - c_{n-1}| = |c_n' - c_{n-1}'| \le \frac{b_1 b_2 \cdots b_n}{q_n' q_{n-1}'} = \frac{b_1 b_2 \cdots b_n}{q_n' q_{n-1}}. \]
By the Wallis-Euler relations, we have
\[ q_n' = \xi_n q_{n-1}' + b_n q_{n-2}' = \Big(a_n + \frac{b_{n+1}}{\xi_{n+1}}\Big) q_{n-1} + b_n q_{n-2} > a_n q_{n-1} + b_n q_{n-2} = q_n. \]
Hence,
\[ |\xi_0 - c_{n-1}| \le \frac{b_1 b_2 \cdots b_n}{q_n' q_{n-1}} < \frac{b_1 b_2 \cdots b_n}{q_n q_{n-1}} < \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, it follows that $\xi_0 = \lim c_{n-1} = \xi$. $\square$

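Example 8.14. Consider $\xi_0 = \sqrt{3} = 1.73205\ldots$. In this case, $a_0 := \lfloor \xi_0 \rfloor = 1$.
Thus,
\[ \xi_1 := \frac{1}{\xi_0 - a_0} = \frac{1}{\sqrt{3} - 1} = \frac{1 + \sqrt{3}}{2} = 1.36602\ldots \implies a_1 := \lfloor \xi_1 \rfloor = 1. \]
Therefore,
\[ \xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{\frac{1 + \sqrt{3}}{2} - 1} = 1 + \sqrt{3} = 2.73205\ldots \implies a_2 := \lfloor \xi_2 \rfloor = 2. \]
Hence,
\[ \xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{\sqrt{3} - 1} = \frac{1 + \sqrt{3}}{2} = 1.36602\ldots \implies a_3 := \lfloor \xi_3 \rfloor = 1. \]
Here we notice that $\xi_3 = \xi_1$ and $a_3 = a_1$. Therefore,
\[ \xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{\xi_1 - a_1} = \xi_2 = 1 + \sqrt{3} \implies a_4 := \lfloor \xi_4 \rfloor = \lfloor \xi_2 \rfloor = 2. \]
At this point, we see that we will get the repeating pattern 1, 2, 1, 2, ..., so we
conclude that
\[ \sqrt{3} = \langle 1; 1, 2, 1, 2, 1, 2, \ldots \rangle = \langle 1; \overline{1, 2} \rangle, \]
where we indicate that the 1, 2 pattern repeats by putting a bar over them.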
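For square roots of integers the rationalizing steps above can be carried out entirely in integer arithmetic. The sketch below uses the standard recurrence for quadratic surds, which is a swapped-in technique rather than anything stated in the text: it tracks each complete quotient in the form $\xi_k = (\sqrt{n} + m)/d$. The function name `sqrt_cf` is our own.

```python
import math

def sqrt_cf(n, terms):
    """First `terms` partial quotients of sqrt(n), n a positive non-square integer."""
    a0 = math.isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0           # xi_k = (sqrt(n) + m) / d, starting from xi_0 = sqrt(n)
    quotients = [a0]
    while len(quotients) < terms:
        m = d * a - m            # subtract a_k, then rationalize 1 / (xi_k - a_k)
        d = (n - m * m) // d     # this integer division is exact
        a = (a0 + m) // d        # a_{k+1} = floor(xi_{k+1})
        quotients.append(a)
    return quotients

print(sqrt_cf(3, 7))   # [1, 1, 2, 1, 2, 1, 2]
```

Because no floating-point numbers are involved, the periodic pattern can be followed for as many terms as desired.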
Example 8.15. Here is a neat example concerning the Fibonacci and Lucas
numbers; for other fascinating topics on these numbers, see Knott's fun website
[121]. Let us find the continued fraction expansion of the irrational number $\xi_0 = \Phi/\sqrt{5}$,
where $\Phi$ is the golden ratio $\Phi = \frac{1 + \sqrt{5}}{2}$:
\[ \xi_0 = \frac{\Phi}{\sqrt{5}} = \frac{1 + \sqrt{5}}{2\sqrt{5}} = 0.72360679\ldots \implies a_0 := \lfloor \xi_0 \rfloor = 0. \]
Thus,
\[ \xi_1 := \frac{1}{\xi_0 - a_0} = \frac{1}{\xi_0} = \frac{2\sqrt{5}}{1 + \sqrt{5}} = 1.3819660\ldots \implies a_1 := \lfloor \xi_1 \rfloor = 1. \]
Therefore,
\[ \xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{\frac{2\sqrt{5}}{1 + \sqrt{5}} - 1} = \frac{1 + \sqrt{5}}{\sqrt{5} - 1} = 2.6180339\ldots \implies a_2 := \lfloor \xi_2 \rfloor = 2. \]
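Hence,
\[ \xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{\frac{1 + \sqrt{5}}{\sqrt{5} - 1} - 2} = \frac{\sqrt{5} - 1}{3 - \sqrt{5}} = \frac{1 + \sqrt{5}}{2} = 1.6180339\ldots \implies a_3 := \lfloor \xi_3 \rfloor = 1; \]
that is, $\xi_3 = \Phi$. Thus,
\[ \xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{\frac{\sqrt{5} - 1}{3 - \sqrt{5}} - 1} = \frac{3 - \sqrt{5}}{2\sqrt{5} - 4} = \frac{1 + \sqrt{5}}{2} = 1.6180339\ldots; \]
that is, $\xi_4 = \Phi$ as well, and so, $a_4 := \lfloor \xi_4 \rfloor = 1$. Let us do this one more time:
\[ \xi_5 := \frac{1}{\xi_4 - a_4} = \frac{1}{\frac{1 + \sqrt{5}}{2} - 1} = \frac{2}{\sqrt{5} - 1} = \frac{1 + \sqrt{5}}{2} = \Phi, \]
and so, $a_5 = a_4 = 1$. Continuing this process, we will get $\xi_n = \Phi$ and $a_n = 1$ for
the rest of the $n$'s. In conclusion, we have
\[ \frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, 1, 1, 1, 1, \ldots \rangle = \langle 0; 1, 2, \overline{1} \rangle. \]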
The convergents of this continued fraction are fascinating. Recall that the Fibonacci
sequence {Fn }, named after Leonardo Pisano Fibonacci (1170–1250), is defined as
F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2 for all n ≥ 2, which gives the sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . .
The Lucas numbers {Ln }, named after François Lucas (1842–1891), are defined
by
L0 := 2 , L1 = 1 , Ln = Ln−1 + Ln−2 , n = 2, 3, 4, . . . ,
and which give the sequence
2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, . . .
If you work out the convergents of $\frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, 1, 1, 1, 1, \ldots \rangle$, what you get is the
fascinating result:
\[ \frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, \overline{1} \rangle \ \text{has convergents}\quad \frac{0}{2}, \frac{1}{1}, \frac{2}{3}, \frac{3}{4}, \frac{5}{7}, \frac{8}{11}, \frac{13}{18}, \frac{21}{29}, \frac{34}{47}, \frac{55}{76}, \frac{89}{123}, \ldots = \frac{\text{Fibonacci numbers}}{\text{Lucas numbers}}; \tag{8.26} \]
of course, we do miss the other 1 in the Fibonacci sequence. For more fascinating
facts on Fibonacci numbers see Problem 7. Finally, we remark that the canonical
simple continued fraction expansion of a real number is unique; see Problem 8.
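The Fibonacci-over-Lucas pattern in the convergents can be checked mechanically. In this sketch the helper `simple_convergents` and the index bookkeeping are our own assumptions; it compares the $n$-th convergent of $\langle 0; 1, 2, 1, 1, \ldots \rangle$ with $F_{n+1}/L_n$ for $n \ge 1$, which is how the list in (8.26) lines up once the first term is set aside.

```python
from fractions import Fraction

def simple_convergents(quotients):
    """Convergents p_n/q_n of the simple continued fraction <a_0; a_1, ...>."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    out = [Fraction(p, q)]
    for a_n in quotients[1:]:
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        out.append(Fraction(p, q))
    return out

fib, lucas = [0, 1], [2, 1]
for _ in range(12):
    fib.append(fib[-1] + fib[-2])
    lucas.append(lucas[-1] + lucas[-2])

cs = simple_convergents([0, 1, 2] + [1] * 9)
for n, c in enumerate(cs[:6]):
    print(n, c)
# For n >= 1 the n-th convergent equals F_{n+1} / L_n.
print(all(cs[n] == Fraction(fib[n + 1], lucas[n]) for n in range(1, len(cs))))
```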

8.4.4. The numbers π and e. We now discuss the continued fraction expan-
sions for the famous numbers π and e. Consider π first:
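\[ \xi_0 = \pi = 3.141592653\ldots \implies a_0 := \lfloor \xi_0 \rfloor = 3. \]
Thus,
\[ \xi_1 := \frac{1}{\pi - 3} = \frac{1}{0.141592653\ldots} = 7.062513305\ldots \implies a_1 := \lfloor \xi_1 \rfloor = 7. \]
Therefore,
\[ \xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{0.062513305\ldots} = 15.99659440\ldots \implies a_2 := \lfloor \xi_2 \rfloor = 15. \]
Hence,
\[ \xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{0.996594407\ldots} = 1.00341723\ldots \implies a_3 := \lfloor \xi_3 \rfloor = 1. \]
Let us do this one more time:
\[ \xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{0.003417231\ldots} = 292.6345908\ldots \implies a_4 := \lfloor \xi_4 \rfloor = 292. \]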
Continuing this process (at Davis’ Broadway cafe and after 314 free refills), we get
(8.27) π = h3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, . . .i.
Unfortunately (or perhaps fortunately) there is no known pattern that the partial
quotients follow! The first few convergents for π = 3.141592653 . . . are
\[ c_0 = 3, \quad c_1 = \frac{22}{7} = 3.142857142\ldots, \quad c_2 = \frac{333}{106} = 3.141509433\ldots, \]
\[ c_3 = \frac{355}{113} = 3.141592920\ldots, \quad c_4 = \frac{103993}{33102} = 3.141592653\ldots. \]
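These convergents follow directly from the partial quotients 3, 7, 15, 1, 292 via the Wallis-Euler recurrences with $b_n = 1$. The following sketch (the helper name `simple_convergents` is ours) prints each convergent together with its distance from the floating-point value of $\pi$.

```python
from fractions import Fraction
import math

def simple_convergents(quotients):
    """Convergents p_n/q_n of the simple continued fraction <a_0; a_1, ...>."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    out = [Fraction(p, q)]
    for a_n in quotients[1:]:
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        out.append(Fraction(p, q))
    return out

pi_quotients = [3, 7, 15, 1, 292]            # first partial quotients of pi
pi_convergents = simple_convergents(pi_quotients)
for n, c in enumerate(pi_convergents):
    print(n, c, float(c), abs(math.pi - float(c)))
```

The error column shrinks at every step, with a particularly large drop just before the partial quotient 292 is used.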
In stark contrast to π, Euler’s number e has a shockingly simple pattern, which
we ask you to work out in Problem 2:
\[ e = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \ldots \rangle. \]
We will prove that this pattern continues in Section 8.7!
8.4.5. Irrationality. We now discuss when continued fractions represent ir-
rational numbers (cf. [154]).
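Theorem 8.15. Let $\{a_n\}_{n=0}^\infty$, $\{b_n\}_{n=1}^\infty$ be sequences of rational numbers such that
$a_n, b_n > 0$ for $n \ge 1$, $0 < b_n \le a_n$ for all $n$ sufficiently large, and $\sum_{n=1}^\infty \frac{a_n a_{n+1}}{b_{n+1}} = \infty$.
Then the real number
\[ \xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\,\frac{b_5}{a_5 +} \cdots \quad \text{is irrational.} \]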
Proof. First of all, the continued fraction defining $\xi$ converges by Theorem
8.12. Suppose that $0 < b_n \le a_n$ for all $n \ge m + 1$ with $m > 0$. Observe that if we
define
\[ \eta = a_m + \frac{b_{m+1}}{a_{m+1} +}\,\frac{b_{m+2}}{a_{m+2} +}\,\frac{b_{m+3}}{a_{m+3} +} \cdots, \]
which also converges by Theorem 8.12, then $\eta > a_m > 0$ and we can write
\[ \xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_m}{\eta}. \]
By Theorem 8.4, we know that
\[ \xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_m}{\eta} = \frac{\eta p_m + b_m p_{m-1}}{\eta q_m + b_m q_{m-1}}. \]
Solving the last equation for $\eta$, we get
\[ \xi = \frac{\eta p_m + b_m p_{m-1}}{\eta q_m + b_m q_{m-1}} \iff \eta = \frac{\xi b_m q_{m-1} - b_m p_{m-1}}{p_m - \xi q_m}. \]
Note that since $\eta > a_m$, we have $\xi \ne p_m/q_m$. Since all the $a_n$, $b_n$'s are rational,
it follows that $\xi$ is irrational if and only if $\eta$ is irrational. Thus, all we have to do
is prove that $\eta$ is irrational. Since $a_m$ is rational, all we have to do is prove that
$\frac{b_{m+1}}{a_{m+1} +}\,\frac{b_{m+2}}{a_{m+2} +}\,\frac{b_{m+3}}{a_{m+3} +} \cdots$ is irrational, where $0 < b_n \le a_n$ for all $n \ge m + 1$. In
conclusion, we might as well assume from the start that
\[ \xi = \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\,\frac{b_5}{a_5 +} \cdots \]
where $0 < b_n \le a_n$ for all $n$. We shall do this for the rest of the proof. Assume, by
way of contradiction, that $\xi$ is rational. Define $\xi_n := \frac{b_n}{a_n +}\,\frac{b_{n+1}}{a_{n+1} +}\,\frac{b_{n+2}}{a_{n+2} +} \cdots$. Then
for each $n = 1, 2, \ldots$, we have
\[ \xi_n = \frac{b_n}{a_n + \xi_{n+1}} \implies \xi_{n+1} = \frac{b_n}{\xi_n} - a_n. \tag{8.28} \]
By assumption, we have $0 < b_n \le a_n$ for all $n$. It follows that $\xi_n > 0$ for all $n$ and
therefore
\[ \xi_n = \frac{b_n}{a_n + \xi_{n+1}} < \frac{b_n}{a_n} \le 1, \]
so $0 < \xi_n < 1$ for all $n$. Since $\xi_1 = \xi$, which is rational by assumption, by
the second equality in (8.28) and induction it follows that $\xi_n$ is rational for all $n$.
Since $0 < \xi_n < 1$ for all $n$, we can therefore write $\xi_n = s_n/t_n$ where $0 < s_n < t_n$
for all $n$ with $s_n$ and $t_n$ relatively prime integers. Now from the second equality in
(8.28) we see that
\[ \frac{s_{n+1}}{t_{n+1}} = \xi_{n+1} = \frac{b_n}{\xi_n} - a_n = \frac{b_n t_n}{s_n} - a_n = \frac{b_n t_n - a_n s_n}{s_n}. \]
Hence,
\[ s_n s_{n+1} = (b_n t_n - a_n s_n) t_{n+1}. \]
Thus, $t_{n+1} \mid s_n s_{n+1}$. By assumption, $s_{n+1}$ and $t_{n+1}$ are relatively prime, so $t_{n+1}$
must divide $s_n$. In particular, $t_{n+1} \le s_n$. However, $s_n < t_n$ by assumption, so
$t_{n+1} < t_n$. In summary, $\{t_n\}$ is a sequence of positive integers satisfying
\[ t_1 > t_2 > t_3 > \cdots > t_n > t_{n+1} > \cdots > 0, \]
which of course is an absurdity because we would eventually reach zero! $\square$
Example 8.16. (Irrationality of e, Proof III) Since we already know that
(see (8.14))
\[ e = 2 + \frac{2}{2 +}\,\frac{3}{3 +}\,\frac{4}{4 +}\,\frac{5}{5 +} \cdots, \]
we certainly have $b_n \le a_n$ for all $n$, hence $e$ is irrational!
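It is reassuring to confirm numerically that this nonsimple continued fraction really does produce $e$. The sketch below evaluates a truncation from the bottom up in floating point; the function name `e_fraction` and the truncation depth are illustrative assumptions.

```python
import math

def e_fraction(depth):
    """Evaluate 2 + 2/(2 + 3/(3 + ... + depth/depth)) from the bottom up."""
    tail = 0.0
    for n in range(depth, 1, -1):    # n = depth, depth - 1, ..., 2
        tail = n / (n + tail)        # here a_n = b_n = n, so b_n <= a_n holds
    return 2 + tail

for depth in (2, 4, 8, 30):
    print(depth, e_fraction(depth))
print(math.e)
```

The convergence is very fast, so a modest depth already agrees with `math.e` to machine precision.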
As another application of this theorem, we get

Corollary 8.16. Any infinite simple continued fraction represents an irrational
number. In particular, a real number is irrational if and only if it can be
represented by an infinite simple continued fraction.
Indeed, for a simple continued fraction we have bn = 1 for all n, so 0 < bn ≤ an
for all n ≥ 1 holds.
Exercises 8.4.
1. (a) Use the simple continued fraction algorithm to find the expansions of
\[ \text{(a) } \sqrt{2}, \quad \text{(b) } \frac{1 - \sqrt{8}}{2}, \quad \text{(c) } \sqrt{19}, \quad \text{(d) } 3.14159, \quad \text{(e) } \sqrt{7}. \]
(b) Find the values of the continued fraction expansions
\[ \text{(a) } 4 + \frac{2}{8 +}\,\frac{2}{8 +}\,\frac{2}{8 +} \cdots, \quad \text{(b) } \langle \overline{3} \rangle = \langle 3; 3, 3, 3, 3, 3, \ldots \rangle. \]
The continued fraction in (a) was studied by Pietro Antonio Cataldi (1548–1626)
and is one of the earliest infinite continued fractions on record.
2. In Section 8.7, we will prove the conjectures you make in (a) and (b) below.
(a) Using a calculator, we find that $e \approx 2.718281828$. Verify that $2.718281828 = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, \ldots \rangle$. From this, conjecture a formula for $a_n$, $n = 0, 1, 2, 3, \ldots$,
in the canonical continued fraction expansion for $e$.
(b) Using a calculator, we find that $\frac{e+1}{e-1} \approx 2.1639534137$. Find $a_0, a_1, a_2, a_3$ in the
canonical continued fraction expansion for 2.1639534137 and conjecture a formula
for $a_n$, $n = 0, 1, 2, 3, \ldots$, in the canonical continued fraction expansion for $\frac{e+1}{e-1}$.

3. Let $n \in \mathbb{N}$. Prove that $\sqrt{n^2 + 1} = \langle n; \overline{2n} \rangle$ by using the simple continued fraction
algorithm on $\sqrt{n^2 + 1}$. Using the same technique, find the canonical expansion of
$\sqrt{n^2 + 2}$. (See Problem 5 below for other proofs.)
4. In this problem we show that any positive real number can be written as two different
infinite continued fractions. Let $a$ be a positive real number. Prove that
\[ a = 1 + \cfrac{k}{1 + \cfrac{k}{1 + \cfrac{k}{1 + \ddots}}} = \cfrac{\ell}{1 + \cfrac{\ell}{1 + \cfrac{\ell}{1 + \ddots}}}, \]
where $k = a^2 - a$ and $\ell = a^2 + a$. Suggestion: Link the limits of continued fractions on
the right to the quadratic equations $x^2 - x - k = 0$ and $x^2 + x - \ell = 0$, respectively.
Find neat infinite continued fractions for 1, 2, and 3.
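5. Let $x$ be any positive real number and suppose that $x^2 - ax - b = 0$ where $a, b$ are
positive. Prove that
\[ x = a + \frac{b}{a +}\,\frac{b}{a +}\,\frac{b}{a +}\,\frac{b}{a +}\,\frac{b}{a +} \cdots. \]
Using this, prove that for any $\alpha, \beta > 0$,
\[ \sqrt{\alpha^2 + \beta} = \alpha + \frac{\beta}{2\alpha +}\,\frac{\beta}{2\alpha +}\,\frac{\beta}{2\alpha +}\,\frac{\beta}{2\alpha +} \cdots. \]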
6. (a) Prove that a continued fraction $a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots$ converges if and only if
\[ c_0 + \sum_{n=1}^\infty \frac{(-1)^{n-1} b_1 b_2 \cdots b_n}{q_n q_{n-1}} \]
converges, in which case this sum is exactly $a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots$. Suggestion:
Consider the telescoping sum $c_n = c_0 + (c_1 - c_0) + (c_2 - c_1) + \cdots + (c_n - c_{n-1})$. In
particular, for a simple continued fraction $\xi = \langle a_0; a_1, a_2, a_3, \ldots \rangle$, we have
\[ \xi = a_0 + \sum_{n=1}^\infty \frac{(-1)^{n-1}}{q_n q_{n-1}}. \]
b1 b2 b3
(b) Assume that ξ = a0 + a1 + a2 + a3 +
. . . converges. Prove that

X (−1)n an b1 b2 · · · bn−1
ξ = c0 + .
n=2
qn qn−2
In particular, for a simple continued fraction ξ = ha0 ; a1 , a2 , a3 , . . .i, we have

X (−1)n an
ξ =1+ .
n=2
qn qn−2
7. Let $\{c_n\}$ be the convergents of $\Phi = \langle 1; 1, 1, 1, 1, 1, 1, \ldots \rangle$.
(1) Prove that for $n \ge 1$, we have $\frac{F_{n+1}}{F_n} = c_{n-1}$. (That is, $p_n = F_{n+2}$ and $q_n = F_{n+1}$.)
Conclude that
\[ \Phi = \lim_{n \to \infty} \frac{F_{n+1}}{F_n}, \]

a beautiful (but nontrivial) fact!


(2) Using the previous problem, prove the incredibly beautiful formulas
∞ ∞
X (−1)n−1 X (−1)n
Φ= and Φ−1 = .
n=1
Fn Fn+1 n=2
Fn Fn+2

8. Let $\alpha = \langle a_0; a_1, a_2, \ldots \rangle$, $\beta = \langle b_0; b_1, b_2, \ldots \rangle$ be infinite simple continued fractions. Prove
that if $\alpha = \beta$, then $a_k = b_k$ for all $k = 0, 1, 2, \ldots$, which shows that the canonical simple
continued fraction expansion of an irrational real number is unique. See Problem 6 in Exercises
8.3 for the rational case.
9. A continued fraction $a_0 + \frac{1}{a_1 +}\,\frac{1}{a_2 +}\,\frac{1}{a_3 +}\,\frac{1}{a_4 +} \cdots$ where the $a_n$ are real numbers with
$a_n > 0$ for $n \ge 1$ is said to be unary. In this problem we prove that a unary continued
fraction converges if and only if $\sum a_n = \infty$. Henceforth, let $a_0 + \frac{1}{a_1 +}\,\frac{1}{a_2 +}\,\frac{1}{a_3 +} \cdots$ be
unary.
(i) Prove that $q_n \le \prod_{k=1}^n (1 + a_k)$.
(ii) Using the inequality derived in (i), prove that if the unary continued fraction
converges, then $\sum a_n = \infty$.
(iii) Prove that
\[ q_{2n} \ge 1 + a_1 (a_2 + a_4 + \cdots + a_{2n}), \qquad q_{2n-1} \ge a_1 + a_3 + \cdots + a_{2n-1}, \]
where the first inequality holds for $n \ge 1$ and the second for $n \ge 2$.
(iv) Using the inequalities derived in (iii), prove that if $\sum a_n = \infty$, then the unary
continued fraction converges.

8.5. Diophantine approximations and the mystery of π solved!


For practical purposes, it is necessary to approximate irrational numbers by
rational numbers. Also, if a rational number has a very large denominator, e.g.
$\frac{1234567}{121110987654321}$, then it is hard to work with, so for practical purposes it would
be nice to have a “good” approximation to such a rational number by a rational
number with a more manageable denominator. Diophantine approximation is the
subject of finding “good” or even “best” rational approximations to real numbers.
Continued fractions turn out to play a very important role in this subject, to which
this section is devoted. We start with a journey concerning the mysterious fraction
representations of π.

8.5.1. The mystery of π and good and best approximations. Here


we review some approximations to π = 3.14159265 . . . that have been discovered
throughout the centuries (see Section 4.10 for a thorough study):
(1) 3 in the Holy Bible circa 1000 B.C. by the Hebrews; See Book of I Kings,
Chapter 7, verse 23, and Book of II Chronicles, Chapter 4, verse 2:
And he made a molten sea, ten cubits from the one brim to the other:
it was round all about, and his height was five cubits: and a line of
thirty cubits did compass it about. I Kings 7:23.
(2) 22/7 = 3.14285714 . . . (correct to two decimal places) by Archimedes of Syra-
cuse (287 B.C. –212 B.C.) circa 250 B.C.
(3) 333/106 = 3.14150943 . . . (correct to four decimal places), a lower bound found
by Adriaan Anthoniszoon (1527–1607) circa 1600 A.D.
(4) 355/113 = 3.14159292 . . . (correct to six decimal places) by Tsu Chung-Chi
(429–501) circa 500 A.D.
Hmmm. . . these numbers certainly seem familiar! These numbers are exactly
the first four convergents of the continued fraction expansion of π that we worked
out in Subsection 8.4.4! From this example, it seems like approximating real num-
bers by rational numbers is intimately related to continued fractions; this is indeed
the case as we shall see. To start our adventure in approximations, we start with
the concepts of “good” and “best” approximations.
A rational number $p/q$ is called a good approximation to a real number $\xi$ if$^3$
\[ \text{for all rational } \frac{a}{b} \ne \frac{p}{q} \text{ with } 1 \le b \le q, \text{ we have } \Big|\xi - \frac{p}{q}\Big| < \Big|\xi - \frac{a}{b}\Big|; \]
in other words, we cannot get closer to the real number $\xi$ with a different rational
number having a denominator $\le q$.
Example 8.17. 4/1 is not a good approximation to $\pi$ because 3/1, which has
an equal denominator, is closer to $\pi$:
\[ \Big|\pi - \frac{3}{1}\Big| = 0.141592\ldots < \Big|\pi - \frac{4}{1}\Big| = 0.858407\ldots. \]
Example 8.18. As another example, 7/2 is not a good approximation to $\pi$
because 3/1, which has a smaller denominator than 7/2, is closer to $\pi$:
\[ \Big|\pi - \frac{3}{1}\Big| = 0.141592\ldots < \Big|\pi - \frac{7}{2}\Big| = 0.358407\ldots. \]
This example shows that you wouldn’t want to approximate π with 7/2 be-
cause you can approximate it with the “simpler” number 3/1 that has a smaller
denominator.

$^3$Warning: Some authors define good approximation as: $\frac{p}{q}$ is a good approximation to $\xi$
if for all rational $\frac{a}{b}$ with $1 \le b < q$, we have $\big|\xi - \frac{p}{q}\big| < \big|\xi - \frac{a}{b}\big|$. This definition, although only
slightly different from ours, makes some proofs considerably easier. Moreover, with this definition,
1,000,000/1 is a good approximation to $\pi$ (why?)! (In fact, any integer, no matter how big, is a
good approximation to $\pi$.) On the other hand, with the definition we used, the only integer that
is a good approximation to $\pi$ is 3. Also, some authors define best approximation as: $\frac{p}{q}$ is a best
approximation to $\xi$ if for all rational $\frac{a}{b}$ with $1 \le b < q$, we have $|q\xi - p| < |b\xi - a|$; with this
definition of “best,” one can shorten the proof of Theorem 8.20, but then one must live with
the fact that 1,000,000/1 is a best approximation to $\pi$.

Example 8.19. On the other hand, 13/4 is a good approximation to $\pi$. This
is because
\[ \Big|\pi - \frac{13}{4}\Big| = 0.108407\ldots, \]
and there are no fractions closer to $\pi$ with denominator 4, and the closest distinct
fractions with the smaller denominators 1, 2, and 3 are 3/1, 7/2, and 10/3, which
satisfy
\[ \Big|\pi - \frac{3}{1}\Big| = 0.141592\ldots, \quad \Big|\pi - \frac{7}{2}\Big| = 0.358407\ldots, \quad \Big|\pi - \frac{10}{3}\Big| = 0.191740\ldots. \]
Thus,
\[ \text{for all rational } \frac{a}{b} \ne \frac{13}{4} \text{ with } 1 \le b \le 4, \text{ we have } \Big|\pi - \frac{13}{4}\Big| < \Big|\pi - \frac{a}{b}\Big|. \]
Now one can argue: Is 13/4 really that great of an approximation to π? For
although 3/1 is not as close to π, it is certainly much easier to work with than
13/4 because of the larger denominator 4 — moreover, we have 13/4 = 3.25, so
we didn’t even gain a single decimal place of accuracy in going from 3.00 to 3.25.
These are definitely valid arguments. One can also see the validity of this argument
by combining fractions in the inequality in the definition of good approximation:
$p/q$ is a good approximation to $\xi$ if
\[ \text{for all rational } \frac{a}{b} \ne \frac{p}{q} \text{ with } 1 \le b \le q, \text{ we have } \frac{|q\xi - p|}{q} < \frac{|b\xi - a|}{b}, \]
where we used that $q, b > 0$. Here, we can see that $\frac{|q\xi - p|}{q} < \frac{|b\xi - a|}{b}$ may hold not
because $p/q$ is dramatically much closer to $\xi$ than is $a/b$ but simply because $q$ is
a lot larger than $b$ (like in the case 13/4 and 3/1 where 4 is much larger than 1).
To try and correct this somewhat misleading notion of “good” we introduce the
concept of a “best” approximation by clearing the denominators.
A rational number $p/q$ is called a best approximation to a real number $\xi$ if
\[ \text{for all rational } \frac{a}{b} \ne \frac{p}{q} \text{ with } 1 \le b \le q, \text{ we have } |q\xi - p| < |b\xi - a|. \]

Example 8.20. We can see that $p/q = 13/4$ is not a best approximation to $\pi$
because with $a/b = 3/1$, we have $1 \le 1 \le 4$ yet
\[ |4 \cdot \pi - 13| = 0.433629\ldots \not< |1 \cdot \pi - 3| = 0.141592\ldots. \]
Thus, 13/4 is a good approximation to $\pi$ but is far from a best approximation.
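Both definitions can be tested by brute force for small denominators. The sketch below is only an illustration under our own naming (`is_good`, `is_best`) and uses floating-point $\pi$; for each denominator $b \le q$ it checks the two integers nearest $b\xi$, since any other numerator is farther away.

```python
import math
from fractions import Fraction

def _competitors(p, q, xi):
    """Yield (a, b) with 1 <= b <= q, a/b != p/q, a one of the integers nearest b*xi."""
    for b in range(1, q + 1):
        for a in (math.floor(b * xi), math.floor(b * xi) + 1):
            if Fraction(a, b) != Fraction(p, q):
                yield a, b

def is_good(p, q, xi):
    """p/q is strictly closer to xi than every other a/b with 1 <= b <= q."""
    return all(abs(xi - p / q) < abs(xi - a / b) for a, b in _competitors(p, q, xi))

def is_best(p, q, xi):
    """|q*xi - p| is strictly smaller than |b*xi - a| for every other a/b with 1 <= b <= q."""
    return all(abs(q * xi - p) < abs(b * xi - a) for a, b in _competitors(p, q, xi))

for p, q in [(3, 1), (4, 1), (7, 2), (13, 4), (22, 7)]:
    print(f"{p}/{q}: good={is_good(p, q, math.pi)}, best={is_best(p, q, math.pi)}")
```

With these inputs, 13/4 comes out good but not best, while 22/7 passes both tests, in line with the examples above.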
In the following proposition, we show that any best approximation is a good
one.
Proposition 8.17. A best approximation is a good one, but not vice versa.
Proof. We already gave an example showing that a good approximation may
not be a best one, so let $p/q$ be a best approximation to $\xi$; we shall prove that $p/q$ is
a good one too. Let $a/b \ne p/q$ be rational with $1 \le b \le q$. Then $|q\xi - p| < |b\xi - a|$
since $p/q$ is a best approximation, and also, $\frac{1}{q} \le \frac{1}{b}$ since $b \le q$, hence
\[ \Big|\xi - \frac{p}{q}\Big| = \frac{|q\xi - p|}{q} < \frac{|b\xi - a|}{q} \le \frac{|b\xi - a|}{b} = \Big|\xi - \frac{a}{b}\Big| \implies \Big|\xi - \frac{p}{q}\Big| < \Big|\xi - \frac{a}{b}\Big|. \]
This shows that $p/q$ is a good approximation. $\square$

In the following subsection, we shall prove the best approximation theorem,


Theorem 8.20, which states that
(Best approximation theorem) Every best approximation of a
real number (rational or irrational) is a convergent of its canoni-
cal continued fraction expansion and conversely, each of the con-
vergents c1 , c2 , c3 , . . . is a best approximation.

8.5.2. Approximations, convergents, and the “most irrational” of all


irrational numbers. The objective of this subsection is to understand how con-
vergents approximate real numbers. In the following theorem, we show that the
convergents of the simple continued fraction of a real number ξ get increasingly
closer to ξ. (See Problem 4 for the general case of nonsimple continued fractions.)

Theorem 8.18 (Fundamental approximation theorem). Let $\xi$ be an irrational
number and let $\{c_n = p_n/q_n\}$ be the convergents of its canonical continued
fraction. Then the following inequalities hold:
\[ |\xi - c_n| < \frac{1}{q_n q_{n+1}}, \qquad |\xi - c_{n+1}| < |\xi - c_n|, \qquad |q_{n+1}\xi - p_{n+1}| < |q_n \xi - p_n|. \]
If $\xi$ is a rational number and the convergent $c_{n+1}$ is defined (that is, if $\xi \ne c_n$), then
these inequalities still hold, with the exception that $|\xi - c_n| = \frac{1}{q_n q_{n+1}}$ if $\xi = c_{n+1}$.

Proof. We prove this theorem for $\xi$ irrational; the rational case is proved
using a similar argument, which we leave to you if you're interested. The proof of
this theorem is very simple. We just need the inequalities (see Corollary 8.13)
\[ c_n < c_{n+2} < \xi < c_{n+1} \quad \text{or} \quad c_{n+1} < \xi < c_{n+2} < c_n, \tag{8.29} \]
depending on whether $n$ is even or odd, respectively, and the fundamental recurrence
relations (see Corollary 8.6):
\[ c_{n+1} - c_n = \frac{(-1)^n}{q_n q_{n+1}}, \qquad c_{n+2} - c_n = \frac{(-1)^n a_{n+2}}{q_n q_{n+2}}. \tag{8.30} \]
Now the first inequality of our theorem follows easily:
\[ |\xi - c_n| \overset{\text{by (8.29)}}{<} |c_{n+1} - c_n| \overset{\text{by (8.30)}}{=} \Big|\frac{(-1)^n}{q_n q_{n+1}}\Big| = \frac{1}{q_n q_{n+1}}. \]

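We now prove that $|q_{n+1}\xi - p_{n+1}| < |q_n \xi - p_n|$. To prove this, we work on the left
and right-hand sides separately. For the left-hand side, we have
\begin{align*}
|q_{n+1}\xi - p_{n+1}| = q_{n+1}\Big|\xi - \frac{p_{n+1}}{q_{n+1}}\Big| = q_{n+1}|\xi - c_{n+1}| &< q_{n+1}|c_{n+2} - c_{n+1}| && \text{by (8.29)} \\
&= q_{n+1} \cdot \frac{1}{q_{n+1} q_{n+2}} && \text{by (8.30)} \\
&= \frac{1}{q_{n+2}}.
\end{align*}
Hence, $\frac{1}{q_{n+2}} > |q_{n+1}\xi - p_{n+1}|$. Now,
\begin{align*}
|q_n \xi - p_n| = q_n\Big|\xi - \frac{p_n}{q_n}\Big| = q_n|\xi - c_n| &> q_n|c_{n+2} - c_n| && \text{by (8.29)} \\
&= q_n \cdot \frac{a_{n+2}}{q_n q_{n+2}} && \text{by (8.30)} \\
&= \frac{a_{n+2}}{q_{n+2}} \ge \frac{1}{q_{n+2}} > |q_{n+1}\xi - p_{n+1}|.
\end{align*}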
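This proves our third inequality. Finally, using what we just proved, and that
\[ q_{n+1} = a_{n+1} q_n + q_{n-1} \ge q_n + q_{n-1} > q_n \implies \frac{1}{q_{n+1}} < \frac{1}{q_n}, \]
we see that
\begin{align*}
|\xi - c_{n+1}| = \Big|\xi - \frac{p_{n+1}}{q_{n+1}}\Big| &= \frac{1}{q_{n+1}} |q_{n+1}\xi - p_{n+1}| \\
&< \frac{1}{q_{n+1}} |q_n \xi - p_n| \\
&< \frac{1}{q_n} |q_n \xi - p_n| = \Big|\xi - \frac{p_n}{q_n}\Big| = |\xi - c_n|.
\end{align*}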


It is important to use only the canonical expansion when $\xi$ is rational. This is because the statement that $|q_{n+1}\xi - p_{n+1}| < |q_n\xi - p_n|$ may not be true if we don’t use the canonical expansion.
Example 8.21. Consider 5/3, which has the canonical expansion:
\[ \frac{5}{3} = \langle 1; 1, 2 \rangle = 1 + \cfrac{1}{1 + \cfrac{1}{2}}. \]
We can write this as the noncanonical expansion by breaking up the 2:
\[ \xi = \langle 1; 1, 1, 1 \rangle = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1}}} = \frac{5}{3}. \]
The convergents for this noncanonical expansion of $\xi$ are $c_0 = 1/1$, $c_1 = 2/1$, $c_2 = 3/2$, and $\xi = c_3 = 5/3$. In this case,
\[ |q_2\xi - p_2| = \left|2 \cdot \frac{5}{3} - 3\right| = \frac{1}{3} = \left|1 \cdot \frac{5}{3} - 2\right| = |q_1\xi - p_1|, \]
so for this example, $|q_2\xi - p_2| \not< |q_1\xi - p_1|$.
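The comparison in Example 8.21 can be checked mechanically. The following is a minimal sketch, assuming the Wallis-Euler recurrences $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$ with $p_{-1} = 1$, $q_{-1} = 0$; the function names `convergents` and `strictly_decreasing_errors` are illustrative only and do not come from the text.

```python
from fractions import Fraction

def convergents(quotients):
    """Return the pairs (p_n, q_n) for <a0; a1, ..., ak>."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0, q_0
    out = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def strictly_decreasing_errors(xi, quotients):
    """Check |q_{n+1} xi - p_{n+1}| < |q_n xi - p_n| for every n."""
    errs = [abs(q * xi - p) for p, q in convergents(quotients)]
    return all(later < earlier for earlier, later in zip(errs, errs[1:]))

xi = Fraction(5, 3)
print(convergents([1, 1, 2]))                       # canonical expansion
print(convergents([1, 1, 1, 1]))                    # noncanonical expansion
print(strictly_decreasing_errors(xi, [1, 1, 2]))
print(strictly_decreasing_errors(xi, [1, 1, 1, 1]))
```

Exact rational arithmetic (`Fraction`) is used so that the tie $1/3 = 1/3$ in the noncanonical case is detected without rounding issues.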

We now discuss the “most irrational” of all irrational numbers. From the best approximation theorem (Theorem 8.20, which we’ll prove in a moment) we know that the best approximations of a real number $\xi$ are convergents, and from the fundamental approximation theorem (Theorem 8.18) we have the error estimate
\[ |\xi - c_n| < \frac{1}{q_n q_{n+1}} \implies |q_n\xi - p_n| < \frac{1}{q_{n+1}}. \tag{8.31} \]
This shows you that the larger the $q_n$’s are, the better the best approximations are. Since the $q_n$’s are determined by the recurrence relation $q_n = a_n q_{n-1} + q_{n-2}$, we see that the larger the $a_n$’s are, the larger the $q_n$’s are. In summary, $\xi$ can be approximated very “well” by rational numbers when it has large $a_n$’s and very “badly” by rational numbers when it has small $a_n$’s.
Example 8.22. Here is a “good” example: Recall from (8.27) the continued fraction for $\pi$: $\pi = \langle 3; 7, 15, 1, 292, 1, 1, 1, 2, 1, \ldots \rangle$, which has convergents $c_0 = 3$, $c_1 = \frac{22}{7}$, $c_2 = \frac{333}{106}$, $c_3 = \frac{355}{113}$, $c_4 = \frac{103993}{33102}$, $\ldots$. Because of the large number $a_4 = 292$, we see from (8.31) that we can approximate $\pi$ very nicely with $c_3$: Using the left-hand inequality in (8.31), we see that
\[ |\pi - c_3| < \frac{1}{q_3 q_4} = \frac{1}{113 \cdot 33102} = 0.000000267\ldots, \]
which implies that $c_3 = \frac{355}{113}$ approximates $\pi$ to within six decimal places! (Just to check, note that $\pi = 3.14159265\ldots$ and $\frac{355}{113} = 3.14159292\ldots$.) It’s amazing how many decimal places of accuracy we can get with just taking the $c_3$ convergent!

Example 8.23. (The “most irrational” number) Here is a “bad” example: From our discussion after (8.31), we saw that the smaller the $a_n$’s are, the worse $\xi$ can be approximated by rationals. Of course, since 1 is the smallest natural number, we can consider the golden ratio
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \langle 1; 1, 1, 1, 1, 1, 1, 1, \ldots \rangle = 1.6180339887\ldots \]
as being the “worst” of all irrational numbers that can be approximated by rational numbers. Indeed, we saw that we could get six decimal places of $\pi$ by just taking $c_3$; for $\Phi$ we need $c_{18}$! (Just to check, we find that $c_{17} = \frac{4181}{2584} = 1.6180340557\ldots$, which is not quite six decimals, and $c_{18} = \frac{6765}{4181} = 1.618033963\ldots$, which gets the sixth one. Also notice the large denominator 4181 just to get six decimals.) Therefore, $\Phi$ wins the prize for the “most irrational” number in that it’s the “farthest” from the rationals! We continue our discussion on “most irrational” in Subsection 8.10.3.
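The numbers quoted in Examples 8.22 and 8.23 are easy to reproduce. The sketch below is only an illustration in floating-point arithmetic; the helper name `golden_convergent` is hypothetical, and it assumes the convergents of $\langle 1; 1, 1, \ldots \rangle$ are generated by the same recurrences as above with every $a_n = 1$.

```python
import math
from fractions import Fraction

# Example 8.22: the error of c3 = 355/113 against the bound 1/(q3*q4) from (8.31).
err_pi = abs(math.pi - 355 / 113)
bound_pi = 1 / (113 * 33102)
print(err_pi, bound_pi)

def golden_convergent(n):
    """n-th convergent of <1; 1, 1, 1, ...> (every partial quotient is 1)."""
    p_prev, q_prev, p, q = 1, 0, 1, 1
    for _ in range(n):
        p, p_prev = p + p_prev, p
        q, q_prev = q + q_prev, q
    return Fraction(p, q)

# Example 8.23: convergents of the golden ratio and their errors.
phi = (1 + math.sqrt(5)) / 2
for n in (3, 17, 18):
    c = golden_convergent(n)
    print(n, c, abs(phi - float(c)))
```

Comparing the printed errors for $n = 3$ shows how much slower the golden ratio’s convergents improve than those of $\pi$.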
We now show that best approximations are exactly convergents; this is one of
the most important properties of continued fractions. We first need the following
lemma, whose ingenious proof we learned from Beskin’s beautiful book [28].
Lemma 8.19. If $p_n/q_n$, $n \ge 0$, is a convergent of the canonical continued fraction expansion of a real number $\xi$ and $p/q \ne p_n/q_n$ is a rational number with $q > 0$ and $1 \le q < q_{n+1}$, then
\[ |q_n\xi - p_n| \le |q\xi - p|. \]
Moreover, this inequality is an equality if and only if
\[ \xi = \frac{p_{n+1}}{q_{n+1}}, \qquad p = p_{n+1} - p_n, \qquad \text{and} \qquad q = q_{n+1} - q_n. \]
If this is the case, then we have $q > q_n$ if $n \ge 1$.
Proof. Let $p_n/q_n$, $n \ge 0$, be a convergent of the canonical continued fraction expansion of a real number $\xi$ and let $p/q \ne p_n/q_n$ be a rational number with $q > 0$ and $1 \le q < q_{n+1}$. Note that if $\xi$ happens to be rational, we are implicitly assuming that $\xi \ne p_n/q_n$ so that $q_{n+1}$ is defined.

Step 1: The trick. To prove that $|q_n\xi - p_n| \le |q\xi - p|$, the trick is to write $p$ and $q$ as linear combinations of $p_n, p_{n+1}, q_n, q_{n+1}$:
\[ \begin{aligned} p &= p_n x + p_{n+1} y \\ q &= q_n x + q_{n+1} y. \end{aligned} \tag{8.32} \]
Using basic linear algebra, together with the fact that $p_{n+1} q_n - p_n q_{n+1} = (-1)^n$, we can solve these simultaneous linear equations for $x$ and $y$ obtaining
\[ x = (-1)^n \big( p_{n+1} q - p\, q_{n+1} \big), \qquad y = (-1)^n \big( p\, q_n - p_n q \big). \]
These formulas are not needed below except for the important fact that they show that $x$ and $y$ are integers. Now, using the formulas in (8.32), we see that
\begin{align*}
q\xi - p &= \big( q_n x + q_{n+1} y \big)\xi - p_n x - p_{n+1} y \\
&= \big( q_n\xi - p_n \big) x + \big( q_{n+1}\xi - p_{n+1} \big) y.
\end{align*}
Therefore,
\[ |q\xi - p| = \big| \big( q_n\xi - p_n \big) x + \big( q_{n+1}\xi - p_{n+1} \big) y \big|. \tag{8.33} \]
Step 2: Our goal is to simplify the right-hand side of (8.33) by understanding the signs of the terms in the absolute values. First of all, since $q, q_n, q_{n+1} > 0$, from the second formula in (8.32), we see that $x \le 0$ and $y \le 0$ is not possible (for then $q \le 0$, contradicting that $q > 0$). If $x > 0$ and $y > 0$, then we would have
\[ q = q_n x + q_{n+1} y > q_{n+1}, \]
contradicting that $q < q_{n+1}$. (Note that $y > 0$ is the same thing as saying $y \ge 1$ because $y$ is an integer.) If $x = 0$, then the formulas (8.32) show that $p = p_{n+1} y$ and $q = q_{n+1} y$. Since $q$ and $q_{n+1}$ are positive, we must have $y > 0$ and we have $q \ge q_{n+1}$, contradicting that $q < q_{n+1}$. If $y = 0$, then the formulas (8.32) show that $p = p_n x$ and $q = q_n x$, so $p/q = p_n/q_n$ and this contradicts the assumption that $p/q \ne p_n/q_n$. Summarizing our findings: We may assume that $x$ and $y$ are both nonzero and have opposite signs. By Corollary 8.11 we know that
\[ \xi - \frac{p_n}{q_n} \qquad \text{and} \qquad \xi - \frac{p_{n+1}}{q_{n+1}} \]
have opposite signs. Therefore, $q_n\xi - p_n$ and $q_{n+1}\xi - p_{n+1}$ have opposite signs and hence, since $x$ and $y$ also have opposite signs,
\[ \big( q_n\xi - p_n \big) x \qquad \text{and} \qquad \big( q_{n+1}\xi - p_{n+1} \big) y \]
have the same sign. Therefore, in (8.33), we have
\[ |q\xi - p| = |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y|. \]
Step 3: We now prove our result. Since $x \ne 0$, we have $|x| \ge 1$ (because $x$ is an integer), so
\[ |q_n\xi - p_n| \le |q_n\xi - p_n|\,|x| \le |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y| = |q\xi - p|, \]
and we have proved that $|q_n\xi - p_n| \le |q\xi - p|$ just as we set out to do.
Now assume that we have an equality: $|q_n\xi - p_n| = |q\xi - p|$. Then we have
\begin{align*}
& |q_n\xi - p_n| = |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y| \\
\implies\ & |q_n\xi - p_n|\,(|x| - 1) + |q_{n+1}\xi - p_{n+1}|\,|y| = 0.
\end{align*}
Since $x$ and $y$ are both nonzero integers, we have in particular, $|x| - 1 \ge 0$ and $|y| > 0$. Therefore,
\[ |q_n\xi - p_n|\,(|x| - 1) + |q_{n+1}\xi - p_{n+1}|\,|y| = 0 \iff \xi = \frac{p_{n+1}}{q_{n+1}} \ \text{ and } \ |x| = 1. \]
If $x = +1$, then $y < 0$ (because $x$ and $y$ have opposite signs) so $y \le -1$ since $y$ is an integer and hence by the second equation in (8.32), we have
\[ q = q_n x + q_{n+1} y = q_n + q_{n+1} y \le q_n - q_{n+1} \le 0, \]
because $q_n \le q_{n+1}$. This is impossible since $q > 0$ by assumption. Hence, $x = -1$. In this case $y > 0$ and hence $y \ge 1$. If $y \ge 2$, then
\[ q = q_n x + q_{n+1} y = -q_n + q_{n+1} y \ge -q_n + 2q_{n+1} = q_{n+1} + (q_{n+1} - q_n) \ge q_{n+1}, \]
which contradicts the fact that $q < q_{n+1}$. Therefore, $y = 1$. In conclusion, we have seen that $|q_n\xi - p_n| = |q\xi - p|$ if and only if $\xi = p_{n+1}/q_{n+1}$, $x = -1$ and $y = 1$, which by the formulas in (8.32), imply that $p = p_{n+1} - p_n$ and $q = q_{n+1} - q_n$. Finally, by the Wallis-Euler recurrence relations, we have
\[ q = q_{n+1} - q_n = a_{n+1} q_n + q_{n-1} - q_n = (a_{n+1} - 1) q_n + q_{n-1}. \]
If $n \ge 1$, then $q_{n-1} \ge 1$ and $a_{n+1} \ge 2$ (because $\xi = p_{n+1}/q_{n+1}$, so $a_{n+1}$ is the last partial quotient of the canonical expansion). Therefore, if $n \ge 1$, then $q > q_n$ and our proof is complete.
As an easy consequence of this lemma, it follows that every convergent $p_n/q_n$ with $n \ge 1$ of the canonical continued fraction expansion of a real number $\xi$ must be a best approximation. Indeed, if $\xi = p_n/q_n$, then automatically $p_n/q_n$ is a best approximation of $\xi$. So assume that $\xi \ne p_n/q_n$, where $n \ge 1$, and let $p/q \ne p_n/q_n$ with $1 \le q \le q_n$. Then, since $n \ge 1$, we have $q_n < q_{n+1}$, so $1 \le q < q_{n+1}$. Therefore by Lemma 8.19,
\[ |q_n\xi - p_n| < |q\xi - p|, \]
since the exceptional case is ruled out ($q \not> q_n$ because $q \le q_n$ by assumption). Note that we left out $p_0/q_0$ because it may not be a best approximation!

Example 8.24. Consider $\sqrt{3} = 1.73205080\ldots$. The best integer approximation to $\sqrt{3}$ is 2. In Subsection 8.4.3 we found that $\sqrt{3} = \langle 1; \overline{1, 2} \rangle$. Thus, $p_0/q_0 = 1$, which is not a best approximation. However, $p_1/q_1 = 1 + \frac{1}{1} = 2$ is a best approximation.

Theorem 8.20 (Best approximation theorem). Every best approximation of a real number (rational or irrational) is a convergent of its canonical continued fraction expansion and conversely, each of the convergents $c_1, c_2, c_3, \ldots$ is a best approximation.
Proof. We just have to prove that if $p/q$ with $q > 0$ is a best approximation to $\xi \in \mathbb{R}$, then $p/q$ is a convergent. Assume first that $\xi$ is irrational. Then $1 = q_0 \le q_1 < q_2 < \cdots < q_n \to \infty$, so we can choose a $k$ such that
\[ q_k \le q < q_{k+1}. \]
By Lemma 8.19, if $p/q \ne p_k/q_k$, we have
\[ |b\xi - a| \le |q\xi - p|, \]
where $a = p_k$ and $b = q_k \le q$, contradicting that $p/q$ is a best approximation to $\xi$. Therefore, $p/q = p_k/q_k$, so $p/q$ is a convergent of $\xi$.

Assume now that $\xi$ is rational. Then $\xi = p_{n+1}/q_{n+1}$ for some $n = -1, 0, 1, \ldots$. We consider three cases:
Case 1: $q = q_{n+1}$: Then the assumption that $p/q$ is a best approximation to $\xi$ implies that $p/q = p_{n+1}/q_{n+1}$ (why?) so $p/q$ is a convergent.
Case 2: $q > q_{n+1}$: In fact, this case cannot occur because
\[ |b\xi - a| = 0 \le |q\xi - p| \]
would hold for $a = p_{n+1}$ and $b = q_{n+1} < q$, contradicting that $p/q$ is a best approximation to $\xi$.
Case 3: $1 \le q < q_{n+1}$: Since $1 = q_0 \le q_1 < q_2 < \cdots < q_{n+1}$ it follows that there is a $k$ such that
\[ q_k \le q < q_{k+1}. \]
Then by Lemma 8.19, if $p/q \ne p_k/q_k$, we have
\[ |b\xi - a| \le |q\xi - p|, \]
where $a = p_k$ and $b = q_k \le q$, contradicting that $p/q$ is a best approximation to $\xi$. Therefore, $p/q = p_k/q_k$, so $p/q$ is a convergent of $\xi$.
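To see the best approximation theorem in action, one can search for best approximations by brute force and compare them with the convergents. The sketch below assumes the definition used in the argument above: $p/q$ (in lowest terms, $q > 0$) is a best approximation to $\xi$ if $|q\xi - p| < |b\xi - a|$ for every $a/b \ne p/q$ with $1 \le b \le q$. The sample number $\xi = 415/93 = \langle 4; 2, 6, 7 \rangle$ and the function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from math import floor, gcd

def is_best(xi, p, q):
    """True if |q*xi - p| < |b*xi - a| for every a/b != p/q with 1 <= b <= q."""
    target = abs(q * xi - p)
    for b in range(1, q + 1):
        lo = floor(b * xi)
        for a in (lo, lo + 1):           # the two integers nearest to b*xi
            if Fraction(a, b) != Fraction(p, q) and abs(b * xi - a) <= target:
                return False
    return True

def best_approximations(xi, max_q):
    """All best approximations p/q (lowest terms) with 1 <= q <= max_q."""
    found = []
    for q in range(1, max_q + 1):
        lo = floor(q * xi)
        for p in (lo, lo + 1):
            if gcd(p, q) == 1 and is_best(xi, p, q):
                found.append(Fraction(p, q))
    return found

xi = Fraction(415, 93)                   # = <4; 2, 6, 7>
print(best_approximations(xi, 93))       # compare with the convergents of xi
```

For this $\xi$ the convergents are $4$, $9/2$, $58/13$, and $415/93$. The fraction $357/80$ ties with $58/13$ (both give $|q\xi - p| = 1/93$), which is exactly the equality case of Lemma 8.19, and the strict inequality in the definition excludes it.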
8.5.3. Dirichlet’s approximation theorem. Using Theorem 8.20, we prove the following famous fact.
Theorem 8.21 (Dirichlet’s approximation theorem). Amongst two consecutive convergents $p_n/q_n$, $p_{n+1}/q_{n+1}$ with $n \ge 0$ of the canonical continued fraction expansion to a real number (rational or irrational) $\xi$, one of them satisfies
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{2q^2}. \tag{8.34} \]
Conversely, if a rational number $p/q$ satisfies (8.34), then it is a convergent.
Proof. We begin by proving that a rational number satisfying (8.34) must be a convergent, then we show that convergents satisfy (8.34).
Step 1: Assume that $p/q$ satisfies (8.34). To prove that it must be a convergent, we just need to show that it is a best approximation. To this end, assume that $a/b \ne p/q$ with $b > 0$ and that
\[ |b\xi - a| \le |q\xi - p|; \]
we must show that $q < b$. To prove this, we note that (8.34) implies that
\[ \left| \xi - \frac{a}{b} \right| = \frac{1}{b}|b\xi - a| \le \frac{1}{b}|q\xi - p| < \frac{1}{b} \cdot \frac{1}{2q} = \frac{1}{2bq}. \]
This inequality plus (8.34) give
\[ \frac{|aq - bp|}{bq} = \left| \frac{a}{b} - \frac{p}{q} \right| = \left| \frac{a}{b} - \xi + \xi - \frac{p}{q} \right| \le \left| \frac{a}{b} - \xi \right| + \left| \xi - \frac{p}{q} \right| < \frac{1}{2bq} + \frac{1}{2q^2}. \]
Since $a/b \ne p/q$, $|aq - bp|$ is a positive integer, that is, $1 \le |aq - bp|$, therefore
\[ \frac{1}{bq} < \frac{1}{2bq} + \frac{1}{2q^2} \implies \frac{1}{2bq} < \frac{1}{2q^2} \implies \frac{1}{b} < \frac{1}{q} \implies q < b, \]
just as we wanted to show. We now show that one of two consecutive convergents satisfies (8.34). Let $p_n/q_n$ and $p_{n+1}/q_{n+1}$, $n \ge 0$, be two consecutive convergents.
Step 2: Assume first that $q_n = q_{n+1}$. Since $q_{n+1} = a_{n+1} q_n + q_{n-1}$ we see that $q_n = q_{n+1}$ if and only if $n = 0$ (because $q_{n-1} = 0$ if and only if $n = 0$) and $a_1 = 1$, in which case, $q_1 = q_0 = 1$, $p_0 = a_0$, and $p_1 = a_0 a_1 + 1 = a_0 + 1$. Therefore, $p_0/q_0 = a_0/1$ and $p_1/q_1 = (a_0 + 1)/1$, so we just have to show that
\[ |\xi - a_0| < \frac{1}{2} \qquad \text{or} \qquad |\xi - (a_0 + 1)| < \frac{1}{2}. \]
But one of these must hold because $a_0 = \lfloor \xi \rfloor$, so
\[ a_0 \le \xi < a_0 + 1. \]
Note that the special situation where $\xi$ is exactly half-way between $a_0$ and $a_0 + 1$, that is, $\xi = a_0 + 1/2 = \langle a_0; 2 \rangle$, is not possible under our current assumptions because in this special situation, $q_1 = 2 \ne 1 = q_0$.
Step 3: Assume now that $q_n \ne q_{n+1}$. Consider two consecutive convergents $c_n$ and $c_{n+1}$. We know that either
\[ c_n < \xi < c_{n+1} \qquad \text{or} \qquad c_{n+1} < \xi < c_n, \]
depending on whether $n$ is even or odd. For concreteness, assume that $n$ is even; the odd case is entirely similar. Then from $c_n < \xi < c_{n+1}$ and the fundamental recurrence relation $c_{n+1} - c_n = 1/(q_n q_{n+1})$, we see that
\[ |\xi - c_n| + |c_{n+1} - \xi| = (\xi - c_n) + (c_{n+1} - \xi) = c_{n+1} - c_n = \frac{1}{q_n q_{n+1}}. \]
Now observe that since $q_n \ne q_{n+1}$, we have
\[ 0 < \frac{1}{2}\left( \frac{1}{q_n} - \frac{1}{q_{n+1}} \right)^2 = \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2} - \frac{1}{q_n q_{n+1}} \implies \frac{1}{q_n q_{n+1}} < \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2}, \]
so
\[ |\xi - c_n| + |\xi - c_{n+1}| < \frac{1}{2q_n^2} + \frac{1}{2q_{n+1}^2}. \tag{8.35} \]
It follows that $|\xi - c_n| < 1/(2q_n^2)$ or $|\xi - c_{n+1}| < 1/(2q_{n+1}^2)$, otherwise (8.35) would fail to hold. This completes our proof.
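As a quick numerical illustration of Theorem 8.21, the following sketch tests inequality (8.34) on the first few convergents of $\pi$ listed in Example 8.22. It works in floating point, which is accurate enough at these sizes; the function name `satisfies_8_34` is a hypothetical label.

```python
import math

# Convergents of pi from Example 8.22, as (p, q) pairs.
convergents_pi = [(3, 1), (22, 7), (333, 106), (355, 113), (103993, 33102)]

def satisfies_8_34(xi, p, q):
    """Test |xi - p/q| < 1/(2 q^2), i.e. inequality (8.34)."""
    return abs(xi - p / q) < 1 / (2 * q * q)

flags = [satisfies_8_34(math.pi, p, q) for p, q in convergents_pi]
print(flags)

# Theorem 8.21: of any two consecutive convergents, at least one passes.
print(all(a or b for a, b in zip(flags, flags[1:])))
```

Not every convergent passes on its own (for instance $333/106$ does not), which is why the theorem is phrased in terms of consecutive pairs.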
Exercises 8.5.
1. In this problem we find all the good approximations to 2/7. First, to see things better, let’s write down some fractions with denominators less than 7:
\[ \frac{0}{1} < \frac{1}{6} < \frac{1}{5} < \frac{1}{4} < \frac{2}{7} < \frac{1}{3} < \frac{2}{5} < \frac{1}{2}. \]
By examining the absolute values $\left| \xi - \frac{a}{b} \right|$ for the fractions listed, show that the good approximations to 2/7 are 0/1, 1/2, 1/3, 1/4, and of course, 2/7. Now let’s find which of the good approximations are best without using the best approximation theorem. To do so, compute the absolute values
\[ \left| 1 \cdot \frac{2}{7} - 0 \right|, \quad \left| 2 \cdot \frac{2}{7} - 1 \right|, \quad \left| 3 \cdot \frac{2}{7} - 1 \right|, \quad \left| 4 \cdot \frac{2}{7} - 1 \right| \]
and from these numbers, determine which of the good approximations are best. Using a similar method, find the good and best approximations to 3/7, 3/5, 8/5, and 2/9.
2. Prove that a real number $\xi$ is irrational if and only if there are infinitely many rational numbers $p/q$ satisfying
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{q^2}. \]
3. In this problem we find very beautiful approximations to $\pi$.
(a) Using the canonical continued fraction algorithm, prove that
\[ \pi^4 = 97.40909103400242\ldots = \langle 97; 2, 2, 3, 1, 16539, 1, \ldots \rangle. \]
(Warning: If your calculator doesn’t have enough decimal places of accuracy, you’ll probably get a different value for 16539.)
(b) Compute $c_4 = \frac{2143}{22}$ and therefore, $\pi \approx \left( \frac{2143}{22} \right)^{1/4}$. Note that $\pi = 3.141592653\ldots$ while $(2143/22)^{1/4} = 3.141592652\ldots$, quite accurate! This approximation is due to Srinivasa Aiyangar Ramanujan (1887–1920) [27, p. 160].⁴ As explained on Weinstein’s website [240], we can write this approximation in pandigital form, that is, using all digits 0, 1, . . . , 9 exactly once:
\[ \pi \approx \left( \frac{2143}{22} \right)^{1/4} = \sqrt{\sqrt{0 + 3^4 + \frac{19^2}{78 - 56}}}. \]
(c) By determining certain convergents of the continued fraction expansions of $\pi^2$, $\pi^3$, and $\pi^5$, derive the equally fascinating results:
\[ \pi \approx \sqrt{10}, \quad \left( \frac{227}{23} \right)^{1/2}, \quad 31^{1/3}, \quad \left( \frac{4930}{159} \right)^{1/3}, \quad 306^{1/5}, \quad \left( \frac{77729}{254} \right)^{1/5}. \]
The approximation $\pi \approx \sqrt{10} = 3.162\ldots$ was known in Mesopotamia thousands of years before Christ [170]!
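4. If
\[ c_n = a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots + \cfrac{b_n}{a_n}}} \qquad \text{and} \qquad \xi = a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \ddots}}, \]
where $a_n \ge 1$ for $n \ge 1$, $b_n > 0$, and $\sum_{n=1}^{\infty} \frac{a_n a_{n+1}}{b_{n+1}} = \infty$, prove that for any $n = 0, 1, 2, \ldots$, we have $|\xi - c_{n+1}| < |\xi - c_n|$ and $|q_{n+1}\xi - p_{n+1}| < |q_n\xi - p_n|$ (cf. Theorem 8.18).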
5. (Pythagorean triples) Please review Problems 8 and 9 in Exercises 2.4 concerning primitive Pythagorean triples. Following Schillo [201] we ask the following question: Given a right triangle, is there a primitive right triangle similar to it? The answer is “not always” since e.g. the triangle with sides $(1, 1, \sqrt{2})$ is not similar to any triangle with integer sides (why?). So we ask: Given a right triangle, is there a primitive right triangle “nearly” similar to it? The answer is “yes” and here’s one way to do it.
(i) Given a right triangle $\triangle$, let $\theta$ be one of its acute angles. Prove that if $\tan(\theta/2) = p/q$ where $p, q \in \mathbb{Z}$ have no common factors with $q > 0$, then $\tan\theta = 2pq/(q^2 - p^2)$. Furthermore, prove that if $(x, y, z)$ is a Pythagorean triple where $x/y = 2pq/(q^2 - p^2)$, then $(x, y, z)$ is similar to $\triangle$. Of course, in general $\tan(\theta/2)$ is not rational so $\tan(\theta/2) = p/q$ cannot hold. However, if $\tan(\theta/2) \approx p/q$, we have $\tan\theta \approx 2pq/(q^2 - p^2)$, so if there exists a Pythagorean triple $(x, y, z)$ where $x/y = 2pq/(q^2 - p^2)$, then $(x, y, z)$ is nearly similar to $\triangle$. Now by Problem 9 in Exercises 2.4, there does exist such a triple:
\[ (x, y, z) \text{ is primitive, where } \begin{cases} x = 2pq, \quad y = q^2 - p^2, \quad z = p^2 + q^2, & \text{or,} \\[4pt] x = pq, \quad y = \dfrac{q^2 - p^2}{2}, \quad z = \dfrac{p^2 + q^2}{2}, \end{cases} \tag{8.36} \]
according as $p$ and $q$ have opposite or the same parity. Notice that in either case, we have $x/y = 2pq/(q^2 - p^2)$. In summary, to find a primitive right triangle “nearly” similar to our given triangle, we just have to approximate $\tan(\theta/2)$ by rational numbers and let $(x, y, z)$ be given by (8.36) . . . approximating $\tan(\theta/2)$ by rational numbers is where continued fractions come in!
(ii) Let’s apply Part (i) to the triangle $(1, 1, \sqrt{2})$. In this case, $\theta = 45^\circ$. Prove that $\tan(\theta/2) = \sqrt{2} - 1$. Prove that the convergents of the continued fraction expansion of $\sqrt{2} - 1$ are of the form $c_n = u_n/u_{n+1}$ where $u_n = 2u_{n-1} + u_{n-2}$ ($n \ge 2$) with $u_0 = 0$, $u_1 = 1$. Prove that $(x_n, y_n, z_n)$, where $x_n = 2u_n u_{n+1}$, $y_n = u_{n+1}^2 - u_n^2$, and $z_n = u_{n+1}^2 + u_n^2$, forms a sequence of primitive Pythagorean triples. (This sequence gives triangles that are more and more similar to $(1, 1, \sqrt{2})$ as $n \to \infty$.)

⁴ An equation means nothing to me unless it expresses a thought of God. Srinivasa Ramanujan (1887–1920).

8.6. ⋆ Continued fractions and calendars, and math and music

We now do some fun stuff with continued fractions and their applications to calendars and pianos! In the exercises, you’ll see how Christiaan Huygens (1629–1695), a Dutch physicist, made his model of the solar system (cf. [147]).

8.6.1. Calendars. Calendar making is an amazing subject; see Tøndering’s (free!) book [224] for a fascinating look at calendars. A year, technically a tropical year, is the time it takes from one vernal equinox to the next. Recall that there are two equinoxes, each of which is basically (there is a more technical definition) a time when night and day have the same length. The vernal equinox occurs around March 21, the first day of spring, and the autumnal equinox occurs around September 23, the first day of fall. A year is approximately 365.24219 days. As you might guess, not being a whole number of days makes it quite difficult to make accurate calendars, and for this reason, the art of calendar making has been around since the beginning. Here are some approximations to a year that you might know about:
(1) 365 days, the ancient Egyptians and others.
(2) $365\frac{1}{4}$ days, Julius Caesar (100 B.C.–44 B.C.), 46 B.C., giving rise to the Julian calendar.
(3) $365\frac{97}{400}$ days, Pope Gregory XIII (1502–1585), 1582, giving rise to the Gregorian calendar, the calendar that is now the most widely-used calendar.

See Problem 1 for Persian calendars and their link to continued fractions. Let us analyze these calendars more thoroughly. First, the ancient calendar consisting of 365 days. Since a true year is approximately 365.24219 days, an ancient year has 0.24219 fewer days than a true year.
Thus, after 4 years, with an ancient calendar you’ll lose approximately
    4 × 0.24219 = 0.96876 days ≈ 1 day.
After 125 years, with an ancient calendar you’ll lose approximately
    125 × 0.24219 = 30.27375 days ≈ 1 month.
So, instead of having spring around March 21, you’ll have it in February! After 500 years, with an ancient calendar you’ll lose approximately
    500 × 0.24219 = 121.095 days ≈ 4 months.
So, instead of having spring around March 21, you’ll have it in November! As you can see, this is getting quite ridiculous.
In the Julian calendar, there are an average of $365\frac{1}{4}$ days in a Julian year. The fraction $\frac{1}{4}$ is played out as we all know: We add one day to the ancient calendar every four years giving us a “leap year”, that is, a year with 366 days. Thus, just as we said, a Julian calendar year gives the estimate
\[ \frac{4 \times 365 + 1 \ \text{days}}{4 \ \text{years}} = 365\tfrac{1}{4} \ \frac{\text{days}}{\text{year}}. \]
The Julian year has
    365.25 − 365.24219 = 0.00781 more days than a true year.
So, for instance, after 125 years, with a Julian calendar you’ll gain
    125 × 0.00781 = 0.97625 days ≈ 1 day.
Not bad. After 500 years, with a Julian calendar you’ll gain
    500 × 0.00781 = 3.905 days ≈ 4 days.
Again, not bad! But, still, four days gained is still four days gained.
In the Gregorian calendar, there are an average of $365\frac{97}{400}$ days per year, that is, we add ninety-seven days to the ancient calendar every four hundred years. These extra days are added as follows: Every four years we add one extra day, a “leap year” just like in the Julian calendar; however, this gives us 100 extra days in 400 years, so to offset this, we do not have a leap year for the century marks except 400, 800, 1200, 1600, 2000, 2400, . . . , the multiples of 400. For example, consider the years
    1604, 1608, . . . , 1696, 1700, 1704, . . . , 1796, 1800, 1804, . . . , 1896, 1900, 1904, . . . , 1996, 2000.
Each of these years is a leap year except the three years 1700, 1800, and 1900 (but note that the year 2000 was a leap year since it is a multiple of 400, as you can verify on your old calendar). Hence, in the four hundred years from the end of 1600 to the end of 2000, we added only 97 total days since we didn’t add extra days in 1700, 1800, and 1900. So, just as we said, a Gregorian calendar gives the estimate
\[ \frac{400 \times 365 + 97}{400} = 365\tfrac{97}{400} \ \frac{\text{days}}{\text{year}}. \]
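The leap-year rule just described is short enough to state as code. The sketch below counts the leap years from 1601 through 2000 and recovers the average year length; the function name `is_gregorian_leap` is an illustrative choice.

```python
def is_gregorian_leap(year):
    """Leap year: divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Count the extra days added from the end of 1600 to the end of 2000.
leap_days = sum(is_gregorian_leap(y) for y in range(1601, 2001))
average_year = (400 * 365 + leap_days) / 400

print(leap_days)       # number of leap years in the 400-year cycle
print(average_year)    # average number of days per Gregorian year
```

Any 400 consecutive years give the same count, since the rule repeats with period 400.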
Since $365\frac{97}{400} = 365.2425$, the Gregorian year has
    365.2425 − 365.24219 = 0.00031 more days than a true year.
For instance, after 500 years, with a Gregorian calendar you’ll gain
    500 × 0.00031 = 0.155 days ≈ 0 days!
Now let’s link calendars with continued fractions. Here is the continued fraction expansion of the tropical year:
\[ 365.24219 = \langle 365; 4, 7, 1, 3, 24, 6, 2, 2 \rangle. \]
This has convergents:
\[ c_0 = 365, \quad c_1 = 365\tfrac{1}{4}, \quad c_2 = 365\tfrac{7}{29}, \quad c_3 = 365\tfrac{8}{33}, \quad c_4 = 365\tfrac{31}{128}, \ \ldots. \]
Here, we see that $c_0$ is the ancient calendar and $c_1$ is the Julian calendar, but where is the Gregorian calendar? It’s not on this list, but it’s almost $c_3$ since
\[ \frac{8}{33} = \frac{8}{33} \cdot \frac{12}{12} = \frac{96}{396} \approx \frac{97}{400}. \]
However, it turns out that $c_3 = 365\frac{8}{33}$ is exactly the average number of days in the Persian calendar introduced by the mathematician, astronomer, and poet Omar Khayyam (1048–1131)! See Problem 1 for the modern Persian calendar!
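The expansion of 365.24219 and its convergents can be reproduced exactly with rational arithmetic. This is a minimal sketch assuming the canonical algorithm (repeatedly take the integer part, then invert the fractional part), applied to the rational number 36524219/100000; the function names are illustrative.

```python
from fractions import Fraction

def continued_fraction(x):
    """Partial quotients of a rational x: take the floor, invert the remainder."""
    quotients = []
    while True:
        a = x.numerator // x.denominator
        quotients.append(a)
        x -= a
        if x == 0:
            return quotients
        x = 1 / x

def convergents(quotients):
    """Convergents p_n/q_n from the Wallis-Euler recurrences."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    out = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

year = Fraction(36524219, 100000)       # 365.24219 as an exact rational
cf = continued_fraction(year)
print(cf)
for c in convergents(cf)[:5]:
    print(c, "= 365 +", c - 365)
```

Printing `c - 365` shows the fractional parts 1/4, 7/29, 8/33, 31/128 directly.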
[Figure 8.1: piano keyboard. Black keys, left to right: f1, f3, f6, f8, f10, f13, f15, . . . ; white keys, left to right: f0, f2, f4, f5, f7, f9, f11, f12, f14, f16, . . .]

Figure 8.1. The k-th key, starting from k = 0, is labeled by its frequency fk.

8.6.2. Pianos. We now move from calendars to pianos. For more on the
interaction between continued fractions and pianos, see [62], [134], [15], [89], [93],
[9], [197]. Let’s start by giving a short lesson on music based on Euler’s letter to
a German princess [39] (see also [105]). When, say a piano wire or guitar string
vibrates, it causes the air molecules around it to vibrate and these air molecules
cause neighboring molecules to vibrate and finally, these molecules bounce against
our ears, and we have the sensation of “sound”. The rapidness of the vibrations,
in number of vibrations per second, is called frequency. Let’s say that we hear
two notes with two different frequencies. In general, these frequencies mix together
and don’t produce a pleasing sound, but according to Euler, when the ratio of their
frequencies happens to equal certain ratios of integers, then we hear a pleasant
sound!⁵ Fascinating isn’t it? We’ll call the ratio of the frequencies an interval
between the notes or the frequencies. For example, consider two notes, one with
frequency f1 and the other with frequency f2 such that
\[ \frac{f_2}{f_1} = \frac{2}{1} \iff f_2 = 2 f_1 \quad \text{(octave)}; \]
in other words, the interval between the first and second note is 2, which is to say, $f_2$ is just twice $f_1$. This special interval is called an octave. It turns out that when two notes an octave apart are played at the same time, they sound beautiful together! Another interval that corresponds to a beautiful sound is called the fifth, which is when the ratio is 3/2:
\[ \frac{f_2}{f_1} = \frac{3}{2} \iff f_2 = \frac{3}{2} f_1 \quad \text{(fifth)}. \]
Other intervals (which remember just refer to ratios) that have names are
    4/3 (fourth)         9/8 (major tone)          25/24 (chromatic semitone),
    5/4 (major third)    10/9 (lesser tone)        81/80 (comma of Didymus),
    6/5 (minor third)    16/15 (diatonic semitone).
However, it is probably of universal agreement that the octave and the fifth make
the prettiest sounds. Ratios such as 7/6, 8/7, 11/10, 12/11, . . . don’t seem to agree
with our ears.
⁵ Musica est exercitium arithmeticae occultum nescientis se numerare animi. The pleasure we obtain from music comes from counting, but counting unconsciously. Music is nothing but unconscious arithmetic. From a letter to Goldbach, 27 April 1712, quoted in [193].

Now let’s take a quick look at two facts concerning the piano. We all know what a piano keyboard looks like; see Figure 8.1. Let us label the (fundamental) frequencies of the piano keys, counting both white and black, by $f_0, f_1, f_2, f_3, \ldots$ starting from the far left key on the keyboard.⁶ The first fact is that keys which
are twelve keys apart are exactly an octave apart! For instance, f0 and, jumping
twelve keys to the right, f12 are an octave apart, f7 and f19 are an octave apart,
etc. For this reason, a piano scale really has just twelve basic frequencies, say
f0 , . . . , f11 , since by doubling these frequencies we get the twelve frequencies above,
f12 , . . . , f23 , and by doubling these we get f24 , . . . , f35 , etc. The second fact is that
a piano is evenly tempered, which means that the interval between adjacent keys is constant. Let this constant be $c$. Then,
\[ \frac{f_{n+1}}{f_n} = c \implies f_{n+1} = c f_n \]
for all $n$. In particular,
\[ f_{n+k} = c f_{n+k-1} = c(c f_{n+k-2}) = c^2 f_{n+k-2} = \cdots = c^k f_n. \tag{8.37} \]
Since $f_{n+12} = 2 f_n$ (because $f_n$ and $f_{n+12}$ are an octave apart), it follows that with $k = 12$, we get
\[ 2 f_n = c^{12} f_n \implies 2 = c^{12} \implies c = 2^{1/12}. \]
Thus, the interval between adjacent keys is $2^{1/12}$.
A question that might come to mind is: What is so special about the number
twelve for a piano scale? Why not eleven or fifteen? Answer: It has to do with
continued fractions! To see why, let us imagine that we have an evenly tempered
piano with q basic frequencies, that is, keys that are q apart have frequencies
differing by an octave. Question: Which q’s make the best pianos? (Note: We
better come up with q = 12 as one of the “best” ones!) By a very similar argument
as we did above, we can see that the interval between adjacent keys is $2^{1/q}$. Now
we have to ask: What makes a good piano? Well, our piano by design has octaves,
but we would also like our piano to have fifths, the other beautiful interval. Let us
label the keys of our piano as in Figure 8.1. Then we would like to have a p such
that the interval between any frequency fn and fn+p is a fifth, that is,
\[ \frac{f_{n+p}}{f_n} = \frac{3}{2}. \]
By the formula (8.37), which we can use in the present set-up as long as we put $c = 2^{1/q}$, we have $f_{n+p} = (2^{1/q})^p f_n = 2^{p/q} f_n$. Thus, we want
\[ 2^{p/q} = \frac{3}{2} \implies \frac{p}{q} = \frac{\log(3/2)}{\log 2}. \]
This is, unfortunately, impossible because $p/q$ is rational yet $\frac{\log(3/2)}{\log 2}$ is irrational (cf. Subsection 2.6.5)! Thus, it is impossible for our piano (even if $q = 12$ like our everyday piano) to have a fifth. However, hope is not lost because although our piano can never have a perfect fifth, it can certainly have an approximate fifth: We just need to find rational approximations to the irrational number $\frac{\log(3/2)}{\log 2}$. This we know how to do using continued fractions. One can show that
\[ \frac{\log(3/2)}{\log 2} = \langle 0; 1, 1, 2, 2, 3, 1, \ldots \rangle, \]
which has convergents
\[ 0, \ \frac{1}{1}, \ \frac{1}{2}, \ \frac{3}{5}, \ \frac{7}{12}, \ \frac{24}{41}, \ \frac{31}{53}, \ \frac{179}{306}, \ \ldots. \]

⁶ A piano wire also gives off overtones but we focus here just on the fundamental frequency. Also, some of what we say here is not quite true for the keys near the ends of the keyboard because they don’t vibrate well due to their stiffness leading to the phenomenon called inharmonicity.

Lo and behold, we see a twelve! In particular, by the best approximation theorem (Theorem 8.20), we know that 7/12 approximates $\frac{\log(3/2)}{\log 2}$ better than any rational number with a smaller denominator than twelve, which is to say, we cannot find a piano scale with fewer than twelve basic keys that will give a better approximation to a fifth. This is why our everyday piano has twelve keys! In summary, 1, 2, 5, 12, 41, 53, 306, . . . are the q’s that make the “best” pianos. What about the other numbers in this list? Supposedly [134], in 40 B.C. King-Fang, a scholar of the Han dynasty, found the fraction 24/41, although to my knowledge, there has never been an instrument built with a scale of q = 41; however, King-Fang also found the fraction 31/53, and in this case, the q = 53 scale was advocated by Nicholas Mercator (c. 1620–1687) circa 1650 and was actually implemented by Robert Halford Macdowall Bosanquet (1841–1912) in his instrument Enharmonic Harmonium [34]!
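The claim about the “best” scale sizes can be explored numerically. The sketch below computes the first few partial quotients of $\log(3/2)/\log 2$ in floating point (reliable only for a handful of terms), forms the convergents $p/q$, and reports how close the interval $2^{p/q}$ of $p$ keys on a $q$-key scale comes to a true fifth. The function names are illustrative and not from the text.

```python
import math
from fractions import Fraction

def float_continued_fraction(x, terms):
    """First `terms` partial quotients of x, computed in floating point."""
    quotients = []
    for _ in range(terms):
        a = math.floor(x)
        quotients.append(a)
        frac = x - a
        if frac == 0:
            break
        x = 1 / frac
    return quotients

def convergents(quotients):
    """Convergents p_n/q_n from the Wallis-Euler recurrences."""
    p_prev, q_prev, p, q = 1, 0, quotients[0], 1
    out = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

target = math.log(3 / 2) / math.log(2)
cf = float_continued_fraction(target, 8)
print(cf)
for c in convergents(cf):
    p, q = c.numerator, c.denominator
    print(f"q = {q:3d}: {p:3d} keys give the ratio {2 ** (p / q):.5f} (a fifth is 1.5)")
```

On the everyday piano ($q = 12$) the approximate fifth is seven keys wide, matching the convergent 7/12.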

We have focused on the interval of a fifth. What about other intervals? ... see
Problem 2.

Exercises 8.6.
1. (Persian calendar) As of 2000, the modern calendar in Iran and Afghanistan has an average of $365\frac{683}{2820}$ days per year. The Persian calendar introduced by Omar Khayyam (1048–1131) had an average of $365\frac{8}{33}$ days per year. Khayyam amazingly calculated the year to be 365.24219858156 days. Find the continued fraction expansion of 365.24219858156 and if $\{c_n\}$ are its convergents, show that $c_0$ is the ancient calendar, $c_1$ is the Julian calendar, $c_3$ is the calendar introduced by Khayyam, and $c_7$ is the modern Persian calendar!
2. Find the q’s that will make a piano with the “best” approximations to a minor third.
(Just as we found the q’s that will make a piano with the “best” approximations to a fifth.) Do you see why many musicians, e.g. Aristoxenus, Kornerup, Ariel, Yasser, who
enjoyed minor thirds, liked q = 19 musical scales?
3. (A solar system model) Christiaan Huygens (1629–1695) made a model scale of the
solar system. In his day, it was thought that it took Saturn 29.43 years to make it once around the sun; that is,
\[ \frac{\text{period of Saturn}}{\text{period of Earth}} = 29.43. \]
To make a realistic model of the solar system, Huygens needed to make gears for the
model Saturn and the model Earth whose number of teeth had a ratio close to 29.43.
Find the continued fraction expansion of 29.43 and see why Huygens chose the number
of teeth to be 206 and 7, respectively. For more on the use of continued fractions to
solve gear problems, see [147].

8.7. The elementary functions and the irrationality of $e^{p/q}$

In this section we derive some beautiful and classical continued fraction expansions for $\coth x$, $\tanh x$, and $e^x$. The book [126, Sec. 11.7] has a very nice presentation of this material.

8.7.1. The hypergeometric function. For complex $a \ne 0, -1, -2, \ldots$, the function
\[ F(a, z) := 1 + \frac{1}{a}\, z + \frac{1}{a(a+1)} \frac{z^2}{2!} + \frac{1}{a(a+1)(a+2)} \frac{z^3}{3!} + \cdots, \qquad z \in \mathbb{C}, \]
is called a (simplified) hypergeometric function or more precisely, the confluent hypergeometric limit function. Using the ratio test, it is straightforward to check that $F(a, z)$ converges for all $z \in \mathbb{C}$. If for any $a \in \mathbb{C}$, we define the Pochhammer symbol, introduced by Leo August Pochhammer (1841–1920),
\[ (a)_n := \begin{cases} 1 & n = 0 \\ a(a+1)(a+2) \cdots (a+n-1) & n = 1, 2, 3, \ldots, \end{cases} \]
then we can write the hypergeometric function in shorthand notation:
\[ F(a, z) = \sum_{n=0}^{\infty} \frac{1}{(a)_n} \frac{z^n}{n!}. \]
Actually, the true hypergeometric function is defined by (cf. Subsection 6.3.4)
\[ F(a, b, c, z) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!}, \]
but we won’t need this function. Many familiar functions can be written in terms of these hypergeometric functions. For instance, consider
Proposition 8.22. We have
\[ F\!\left( \frac{1}{2}, \frac{z^2}{4} \right) = \cosh z, \qquad z\, F\!\left( \frac{3}{2}, \frac{z^2}{4} \right) = \sinh z. \]
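Proof. The proofs of these identities are the same: We simply check that both
sides have the same series expansions. For example, let us check the second identity;
the identity for cosh is proved similarly. The function $z\, F\left(\frac{3}{2}, \frac{z^2}{4}\right)$ is just
\[ z \cdot \sum_{n=0}^{\infty} \frac{1}{(3/2)_n} \frac{(z^2/2^2)^n}{n!} = \sum_{n=0}^{\infty} \frac{1}{(3/2)_n} \frac{z^{2n+1}}{2^{2n}\, n!}, \]
and recall that
\[ \sinh z = \sum_{n=0}^{\infty} \frac{z^{2n+1}}{(2n+1)!}. \]
Thus, we just have to show that $(3/2)_n\, 2^{2n}\, n! = (2n+1)!$ for each $n$. Certainly this
holds for $n = 0$. For $n \ge 1$, we have
\begin{align*}
(3/2)_n\, 2^{2n}\, n! &= \frac{3}{2} \left(\frac{3}{2} + 1\right) \left(\frac{3}{2} + 2\right) \cdots \left(\frac{3}{2} + n - 1\right) \cdot 2^{2n}\, n! \\
&= \frac{3}{2} \cdot \frac{5}{2} \cdot \frac{7}{2} \cdots \frac{2n+1}{2} \cdot 2^{2n}\, n! \\
&= 3 \cdot 5 \cdot 7 \cdots (2n+1) \cdot 2^n\, n!.
\end{align*}
Since $2^n\, n! = 2^n \cdot 1 \cdot 2 \cdot 3 \cdots n = 2 \cdot 4 \cdot 6 \cdots 2n$, we have
\begin{align*}
3 \cdot 5 \cdot 7 \cdots (2n+1) \cdot 2^n\, n! &= 3 \cdot 5 \cdot 7 \cdots (2n+1) \cdot 2 \cdot 4 \cdot 6 \cdots 2n \\
&= 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdots 2n \cdot (2n+1) = (2n+1)!
\end{align*}
and our proof is complete.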
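As a numerical sanity check of Proposition 8.22, the following Python sketch sums the series for $F(a, z)$ and compares it with cosh and sinh. The helper name `hyp_F`, the number of terms, and the sample point z = 1.3 are our own illustrative choices, not part of the text.

```python
import math

def hyp_F(a, z, terms=60):
    """Partial sum of F(a, z) = sum_{n>=0} z^n / ((a)_n n!) (hypothetical helper)."""
    total = 0.0
    term = 1.0  # n = 0 term: 1 / ((a)_0 * 0!) = 1
    for n in range(terms):
        total += term
        # Ratio of consecutive terms is z / ((a + n)(n + 1)).
        term *= z / ((a + n) * (n + 1))
    return total

z = 1.3  # arbitrary sample point
cosh_err = abs(hyp_F(0.5, z * z / 4) - math.cosh(z))
sinh_err = abs(z * hyp_F(1.5, z * z / 4) - math.sinh(z))
print(cosh_err, sinh_err)  # both should be at the level of rounding error
```

A floating-point check like this only illustrates the identities at one point; the proof above is what establishes them for all z.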

The hypergeometric function also satisfies an interesting, and useful as we’ll


see in a moment, recurrence relation.
Proposition 8.23. The hypergeometric function satisfies the following recurrence
relation:
\[ F(a, z) = F(a+1, z) + \frac{z}{a(a+1)} F(a+2, z). \]
Proof. The proof of this identity proceeds in the same way as in the previous
proposition: We simply check that both sides have the same series expansions. We
can write
\[ F(a+1, z) + \frac{z}{a(a+1)} F(a+2, z) = \sum_{n=0}^{\infty} \frac{1}{(a+1)_n} \frac{z^n}{n!} + \sum_{n=0}^{\infty} \frac{1}{a(a+1)(a+2)_n} \frac{z^{n+1}}{n!}. \]
The constant term on the right is 1, which is the constant term on the left. For
$n \ge 1$, the coefficient of $z^n$ on the right is
\begin{align*}
\frac{1}{(a+1)_n\, n!} &+ \frac{1}{a(a+1)(a+2)_{n-1}\, (n-1)!} \\
&= \frac{1}{(a+1) \cdots (a+1+n-1)\, n!} + \frac{1}{a(a+1) \cdots (a+2+(n-1)-1)\, (n-1)!} \\
&= \frac{1}{(a+1) \cdots (a+n)\, n!} + \frac{1}{a(a+1) \cdots (a+n)\, (n-1)!} \\
&= \frac{1}{(a+1) \cdots (a+n)\, (n-1)!} \cdot \left( \frac{1}{n} + \frac{1}{a} \right) \\
&= \frac{1}{(a+1) \cdots (a+n)\, (n-1)!} \left( \frac{a+n}{a \cdot n} \right) \\
&= \frac{1}{a(a+1) \cdots (a+n-1)\, n(n-1)!} = \frac{1}{(a)_n\, n!},
\end{align*}
which is exactly the coefficient of $z^n$ for $F(a, z)$.
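The coefficient comparison in this proof can also be checked exactly with rational arithmetic. The sketch below is only an illustration for the sample value a = 1/2; the helper names `pochhammer` and `coeff` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def pochhammer(a, n):
    """(a)_n = a (a+1) ... (a+n-1), with (a)_0 = 1."""
    result = Fraction(1)
    for k in range(n):
        result *= a + k
    return result

def coeff(a, n):
    """Coefficient of z^n in F(a, z), namely 1 / ((a)_n n!)."""
    return 1 / (pochhammer(a, n) * factorial(n))

a = Fraction(1, 2)  # sample parameter, chosen for illustration
# Coefficient of z^n in F(a+1, z) + z/(a(a+1)) F(a+2, z) for n >= 1.
recurrence_ok = all(
    coeff(a, n) == coeff(a + 1, n) + coeff(a + 2, n - 1) / (a * (a + 1))
    for n in range(1, 10)
)
print(recurrence_ok)
```

Because `Fraction` arithmetic is exact, equality here is a true identity check for the tested coefficients rather than a floating-point approximation.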

8.7.2. Continued fraction expansion of the hyperbolic cotangent. It


turns out that Propositions 8.22 and 8.23 can be combined to give a fairly simple
proof of the continued fraction expansion of the hyperbolic cotangent.
Theorem 8.24. For any nonzero real x, we have
\[ \coth x = \frac{1}{x} + \cfrac{x}{3 + \cfrac{x^2}{5 + \cfrac{x^2}{7 + \cfrac{x^2}{9 + \ddots}}}}. \]
Proof. With $z = x > 0$, we have $F(a, x) > 0$ for any $a > 0$ by definition of
the hypergeometric function. In particular, for $a > 0$, $F(a+1, x) > 0$, so we can
divide by this in Proposition 8.23, obtaining the recurrence relation
\[ \frac{F(a, x)}{F(a+1, x)} = 1 + \frac{x}{a(a+1)} \frac{F(a+2, x)}{F(a+1, x)}, \]
which we can write as
\[ \frac{a F(a, x)}{F(a+1, x)} = a + \cfrac{x}{\cfrac{(a+1) F(a+1, x)}{F(a+2, x)}}. \]
Replacing $a$ with $a + n$ with $n = 0, 1, 2, 3, \ldots$, we get
\[ \frac{(a+n) F(a+n, x)}{F(a+n+1, x)} = a + n + \cfrac{x}{\cfrac{(a+n+1) F(a+n+1, x)}{F(a+n+2, x)}}; \]
that is, if we define
\[ \xi_n(a, x) := \frac{(a+n) F(a+n, x)}{F(a+n+1, x)}, \qquad a_n := a + n, \qquad b_n := x, \]
then
\[ \xi_n(a, x) = a_n + \frac{b_{n+1}}{\xi_{n+1}(a, x)}, \qquad n = 0, 1, 2, 3, \ldots. \tag{8.38} \]
Since
\[ \sum_{n=1}^{\infty} \frac{a_n a_{n+1}}{b_n} = \sum_{n=1}^{\infty} \frac{(a+n)(a+n+1)}{x} = \infty, \]
by the continued fraction convergence theorem (Theorem 8.14), we know that
\[ \frac{a F(a, x)}{F(a+1, x)} = \xi_0(a, x) = a + \frac{x}{a+1 +}\, \frac{x}{a+2 +}\, \frac{x}{a+3 +}\, \frac{x}{a+4 +}\, \frac{x}{a+5 +} \cdots. \]
Since $F(1/2, x^2/4) = \cosh x$ and $x\, F(3/2, x^2/4) = \sinh x$ by Proposition 8.22,
when we set $a = 1/2$ and replace $x$ with $x^2/4$ in the previous continued fraction,
we find
\[ \frac{x \cosh x}{2 \sinh x} = \frac{x}{2} \coth x = \frac{1}{2} + \frac{x^2/4}{3/2 +}\, \frac{x^2/4}{5/2 +}\, \frac{x^2/4}{7/2 +}\, \frac{x^2/4}{9/2 +} \cdots, \]
or after multiplication by 2 and dividing by $x$, we get
\[ \coth x = \frac{1}{x} + \frac{x/2}{3/2 +}\, \frac{x^2/4}{5/2 +}\, \frac{x^2/4}{7/2 +}\, \frac{x^2/4}{9/2 +} \cdots. \]
Finally, using the transformation rule (Theorem 8.1)
\[ a_0 + \frac{b_1}{a_1 +}\, \frac{b_2}{a_2 +} \cdots \frac{b_n}{a_n +} \cdots = a_0 + \frac{\rho_1 b_1}{\rho_1 a_1 +}\, \frac{\rho_1 \rho_2 b_2}{\rho_2 a_2 +} \cdots \frac{\rho_{n-1} \rho_n b_n}{\rho_n a_n +} \cdots \]
with $\rho_n = 2$ for all $n$, we get
\[ \coth x = \frac{1}{x} + \frac{x}{3 +}\, \frac{x^2}{5 +}\, \frac{x^2}{7 +}\, \frac{x^2}{9 +} \cdots, \]
exactly what we set out to prove.
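To see how quickly the expansion of Theorem 8.24 converges, one can truncate it and compare with the usual formula for coth. The following Python sketch does this; the function name `coth_cf` and the truncation depth are assumptions made for illustration.

```python
import math

def coth_cf(x, depth=20):
    """Truncate 1/x + x/(3 + x^2/(5 + x^2/(7 + ...))) after `depth` levels."""
    tail = 2 * depth + 1  # innermost partial denominator
    for k in range(depth - 1, 0, -1):
        tail = (2 * k + 1) + x * x / tail
    # tail now approximates 3 + x^2/(5 + x^2/(7 + ...))
    return 1 / x + x / tail

x = 0.7  # arbitrary nonzero sample point
coth_err = abs(coth_cf(x) - math.cosh(x) / math.sinh(x))
print(coth_err)
```

The fraction is evaluated from the innermost level outward, which avoids computing the convergents $p_n/q_n$ separately.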

Given any $x \ne 0$, we certainly have $0 < b_n = x^2 < 2n + 1 = a_n$ for all $n$ sufficiently
large, so by Theorem 8.15, it follows that when $x \ne 0$ is rational, $\coth x$ is irrational, or
writing it out, for $x \ne 0$ rational,
\[ \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{e^{2x} + 1}{e^{2x} - 1} \]
is irrational. It follows that for nonzero rational $x$, $e^{2x}$ must be irrational too, for otherwise
$\coth x$ would be rational, a contradiction. Replacing $x$ with $x/2$ and calling
this $r$, we get the following neat corollary.
Theorem 8.25. $e^r$ is irrational for any nonzero rational $r$.
By the way, as Johann Heinrich Lambert (1728–1777) originally did back in
1761 [36, p. 463], you can use continued fractions to prove that π is irrational; see
[127], [154]. As another easy corollary, we can get the continued fraction expansion
for tanh x. To do so, multiply the continued fraction for coth x by x:
\[ x \coth x = b, \quad \text{where} \quad b = 1 + \frac{x^2}{3 +}\, \frac{x^2}{5 +}\, \frac{x^2}{7 +}\, \frac{x^2}{9 +} \cdots. \]
Thus, $\tanh x = \frac{x}{b}$, or replacing $b$ with its continued fraction, we get
\[ \tanh x = \cfrac{x}{1 + \cfrac{x^2}{3 + \cfrac{x^2}{5 + \cfrac{x^2}{7 + \ddots}}}}. \]
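We derive one more beautiful expression that we’ll need later. As before, we have
\[ \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{e^{2x} + 1}{e^{2x} - 1} = \frac{1}{x} + \frac{x}{3 +}\, \frac{x^2}{5 +}\, \frac{x^2}{7 +}\, \frac{x^2}{9 +} \cdots. \]
Replacing $x$ with $1/x$, we obtain
\[ \frac{e^{2/x} + 1}{e^{2/x} - 1} = x + \frac{1/x}{3 +}\, \frac{1/x^2}{5 +}\, \frac{1/x^2}{7 +}\, \frac{1/x^2}{9 +} \cdots. \]
Finally, using the now familiar transformation rule, after a little algebra we get
\[ \frac{e^{2/x} + 1}{e^{2/x} - 1} = x + \cfrac{1}{3x + \cfrac{1}{5x + \cfrac{1}{7x + \ddots}}}. \tag{8.39} \]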
8.7.3. Continued fraction expansion of the exponential. We can now
get the famous continued fraction expansion for $e^x$, which was first discovered by
(as you might have guessed) Euler. To start, we observe that
\[ \coth(x/2) = \frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}} = \frac{1 + e^{-x}}{1 - e^{-x}} \implies e^{-x} = \frac{\coth(x/2) - 1}{1 + \coth(x/2)}, \]
where we solved the equation on the left for $e^{-x}$. Thus,
\[ e^{-x} = \frac{\coth(x/2) - 1}{1 + \coth(x/2)} = \frac{1 + \coth(x/2) - 2}{1 + \coth(x/2)} = 1 - \frac{2}{1 + \coth(x/2)}, \]
so taking reciprocals, we get
\[ e^x = \cfrac{1}{1 - \cfrac{2}{1 + \coth(x/2)}}. \]

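By Theorem 8.24, we have
\[ 1 + \coth(x/2) = 1 + \frac{2}{x} + \frac{x/2}{3 +}\, \frac{x^2/4}{5 +} \cdots = \frac{x+2}{x} + \frac{x/2}{3 +}\, \frac{x^2/4}{5 +}\, \frac{x^2/4}{7 +} \cdots, \]
so
\[ e^x = \frac{1}{1 +}\, \frac{-2}{\frac{x+2}{x} +}\, \frac{x/2}{3 +}\, \frac{x^2/4}{5 +}\, \frac{x^2/4}{7 +} \cdots \]
or using the transformation rule (Theorem 8.1)
\[ \frac{b_1}{a_1 +}\, \frac{b_2}{a_2 +} \cdots \frac{b_n}{a_n +} \cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\, \frac{\rho_1 \rho_2 b_2}{\rho_2 a_2 +} \cdots \frac{\rho_{n-1} \rho_n b_n}{\rho_n a_n +} \cdots \]
with $\rho_1 = 1$, $\rho_2 = x$, and $\rho_n = 2$ for all $n \ge 3$, we get
\[ e^x = \frac{1}{1 +}\, \frac{-2x}{x + 2 +}\, \frac{x^2}{6 +}\, \frac{x^2}{10 +}\, \frac{x^2}{14 +} \cdots. \]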
Thus, we have derived Euler’s celebrated continued fraction expansion for $e^x$:
Theorem 8.26. For any real x, we have
\[ e^x = \cfrac{1}{1 - \cfrac{2x}{x + 2 + \cfrac{x^2}{6 + \cfrac{x^2}{10 + \cfrac{x^2}{14 + \ddots}}}}}. \]
In particular, if we let $x = 1$, we obtain
\[ e = \cfrac{1}{1 - \cfrac{2}{3 + \cfrac{1}{6 + \cfrac{1}{10 + \cfrac{1}{14 + \ddots}}}}}. \]
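For readers who like to experiment, here is a small Python sketch that truncates the continued fraction of Theorem 8.26 and compares the result with the library exponential. The name `exp_cf` and the depth of 20 levels are illustrative assumptions, and the check is only made at a couple of sample points.

```python
import math

def exp_cf(x, depth=20):
    """Truncate 1/(1 - 2x/(x + 2 + x^2/(6 + x^2/(10 + ...))))."""
    tail = 4 * depth + 2  # innermost partial denominator, of the form 4k + 2
    for k in range(depth - 1, 0, -1):
        tail = (4 * k + 2) + x * x / tail
    # tail now approximates 6 + x^2/(10 + x^2/(14 + ...))
    return 1 / (1 - 2 * x / (x + 2 + x * x / tail))

e_err = abs(exp_cf(1.0) - math.e)
print(exp_cf(1.0), e_err)
```

Only a handful of levels are needed in practice, since the partial denominators 6, 10, 14, ... grow while the numerators stay fixed at $x^2$.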
Although this expansion is beautiful, we can get an even more beautiful continued
fraction expansion for e, one that is a simple continued fraction.

8.7.4. The simple continued fraction expansion of e. If we expand the
decimal number 2.718281828 into a simple continued fraction, we get (see Problem
2 in Exercises 8.4)
\[ 2.718281828 = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1 \rangle. \]
For this reason, we should be able to conjecture that e is the continued fraction
\[ e = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, \ldots \rangle. \tag{8.40} \]

This is true, and it was first proved by (as you might have guessed) Euler. Here,
\[ a_0 = 2, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 1, \quad a_4 = 1, \quad a_5 = 4, \quad a_6 = 1, \quad a_7 = 1, \]
and in general, for all $n \in \mathbb{N}$, $a_{3n-1} = 2n$ and $a_{3n} = a_{3n+1} = 1$. Since
\[ 2 = 1 + \cfrac{1}{0 + \cfrac{1}{1}}, \]
we can write (8.40) in a prettier way that shows the full pattern:
\[ e = \langle 1; 0, 1, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, \ldots \rangle, \tag{8.41} \]
or in the expanded form
\[ e = 1 + \cfrac{1}{0 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \ddots}}}}}}}. \tag{8.42} \]

To prove this incredible formula, denote the convergents of the right-hand continued
fraction in (8.40) by $r_k/s_k$. Since we have such simple relations $a_{3n-1} = 2n$
and $a_{3n} = a_{3n+1} = 1$ for all $n \in \mathbb{N}$, one might think that it is quite easy to compute
formulas for $r_{3n+1}$ and $s_{3n+1}$, and this is indeed the case.

Lemma 8.27. For all $n \ge 2$, we have
\begin{align*}
r_{3n+1} &= 2(2n+1)\, r_{3(n-1)+1} + r_{3(n-2)+1}, \\
s_{3n+1} &= 2(2n+1)\, s_{3(n-1)+1} + s_{3(n-2)+1}.
\end{align*}

Proof. Both formulas are proved in similar ways, so we shall focus on the
formula for $r_{3n+1}$. First, we apply our Wallis-Euler recursive formulas:
\[ r_{3n+1} = r_{3n} + r_{3n-1} = \left( r_{3n-1} + r_{3n-2} \right) + r_{3n-1} = 2 r_{3n-1} + r_{3n-2}. \]
We again apply the Wallis-Euler recursive formula on $r_{3n-1}$:
\begin{align*}
r_{3n+1} &= 2 \left( 2n\, r_{3n-2} + r_{3n-3} \right) + r_{3n-2} \\
&= \left( 2(2n) + 1 \right) r_{3n-2} + 2 r_{3n-3} \\
&= \left( 2(2n) + 1 \right) r_{3n-2} + r_{3n-3} + r_{3n-3}. \tag{8.43}
\end{align*}
Again applying the Wallis-Euler recursive formula on the last term, we get
\begin{align*}
r_{3n+1} &= \left( 2(2n) + 1 \right) r_{3n-2} + r_{3n-3} + \left( r_{3n-4} + r_{3n-5} \right) \\
&= \left( 2(2n) + 1 \right) r_{3n-2} + \left( r_{3n-3} + r_{3n-4} \right) + r_{3n-5}.
\end{align*}
Since $r_{3n-2} = r_{3n-3} + r_{3n-4}$ by our Wallis-Euler recursive formulas, we finally get
\begin{align*}
r_{3n+1} &= \left( 2(2n) + 1 \right) r_{3n-2} + r_{3n-2} + r_{3n-5} \\
&= \left( 2(2n) + 2 \right) r_{3n-2} + r_{3n-5} \\
&= 2 \left( (2n) + 1 \right) r_{3(n-1)+1} + r_{3(n-2)+1}.
\end{align*}

Now putting $x = 2$ in (8.39), let us look at
\[ \frac{e+1}{e-1} = \langle 2; 6, 10, 14, 18, \ldots \rangle; \]
that is, if the right-hand side is $\langle \alpha_0; \alpha_1, \ldots \rangle$, then $\alpha_n = 2(2n+1)$ for all
$n = 0, 1, 2, \ldots$. If $p_n/q_n$ are the convergents of this continued fraction, then we see that
\[ p_n = 2(2n+1)\, p_{n-1} + p_{n-2} \quad \text{and} \quad q_n = 2(2n+1)\, q_{n-1} + q_{n-2}, \]
which are similar to the relations in our lemma! Thus, it is not one bit surprising
that the $r_{3n+1}$’s and $s_{3n+1}$’s are related to the $p_n$’s and $q_n$’s. The exact relation
is given in the following lemma.

Lemma 8.28. For all $n = 0, 1, 2, \ldots$, we have
\[ r_{3n+1} = p_n + q_n \quad \text{and} \quad s_{3n+1} = p_n - q_n. \]

Proof. As with the previous lemma, we shall only prove the formula for $r_{3n+1}$.
We proceed by induction: First, for $n = 0$, we have
\[ r_1 := a_0 a_1 + 1 = 2 \cdot 1 + 1 = 3, \]
while $p_0 := 2$ and $q_0 := 1$, so $r_1 = p_0 + q_0$. If $n = 1$, then by the formula (8.43),
which holds for $n \ge 1$, we see that
\[ r_{3 \cdot 1 + 1} = (2(2) + 1)\, r_1 + 2 r_0 = 5 \cdot 3 + 2 \cdot 2 = 19. \]
On the other hand,
\[ p_1 := \alpha_0 \alpha_1 + 1 = 2 \cdot 6 + 1 = 13, \qquad q_1 := \alpha_1 = 6, \]
so $r_{3 \cdot 1 + 1} = p_1 + q_1$.
Assume now that $r_{3k+1} = p_k + q_k$ for all $0 \le k \le n - 1$ where $n \ge 2$; we shall
prove that it holds for $k = n$ (this is an example of “strong induction”; see Section
2.2). But, by Lemma 8.27 and the induction hypothesis, we have
\begin{align*}
r_{3n+1} &= 2(2n+1)\, r_{3(n-1)+1} + r_{3(n-2)+1} \\
&= 2(2n+1)(p_{n-1} + q_{n-1}) + (p_{n-2} + q_{n-2}) \\
&= \left( 2(2n+1)\, p_{n-1} + p_{n-2} \right) + \left( 2(2n+1)\, q_{n-1} + q_{n-2} \right) \\
&= p_n + q_n,
\end{align*}
where at the last step we used the Wallis-Euler recursive formulas.



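Finally, we can now prove the continued fraction expansion for e:
\begin{align*}
\langle 2; 1, 2, 1, 1, 4, 1, 1, \ldots \rangle &= \lim \frac{r_n}{s_n} = \lim \frac{r_{3n+1}}{s_{3n+1}} = \lim \frac{p_n + q_n}{p_n - q_n} \\
&= \lim \frac{p_n/q_n + 1}{p_n/q_n - 1} = \frac{\frac{e+1}{e-1} + 1}{\frac{e+1}{e-1} - 1} = \frac{\frac{2e}{e-1}}{\frac{2}{e-1}} = e.
\end{align*}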
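The relation of Lemma 8.28 is easy to test by machine. The Python sketch below computes convergents with the Wallis-Euler recurrences for both $\langle 2; 1, 2, 1, 1, 4, \ldots \rangle$ and $\langle 2; 6, 10, 14, \ldots \rangle$ and compares them. The function names `convergents` and `e_quotients`, and the cutoff N = 6, are our own choices for illustration.

```python
from fractions import Fraction

def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] via the Wallis-Euler recurrences."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = partial_quotients[0], 1   # p_0, q_0
    out = [(p, q)]
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append((p, q))
    return out

def e_quotients(count):
    """First `count` partial quotients of <2; 1, 2, 1, 1, 4, 1, 1, 6, ...>."""
    quotients = [2]
    k = 1
    while len(quotients) < count:
        quotients.extend([1, 2 * k, 1])
        k += 1
    return quotients[:count]

N = 6  # check the lemma for n = 0, ..., N
rs = convergents(e_quotients(3 * N + 2))                    # r_k / s_k
pq = convergents([2 * (2 * n + 1) for n in range(N + 1)])   # p_n / q_n

lemma_holds = all(
    rs[3 * n + 1] == (pq[n][0] + pq[n][1], pq[n][0] - pq[n][1])
    for n in range(N + 1)
)
approx_e = Fraction(*rs[-1])  # the convergent r_{3N+1} / s_{3N+1}
print(lemma_holds, float(approx_e))
```

A finite check like this is of course no substitute for the induction argument; it merely confirms the pattern for the first few indices.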

See [173] for another proof of this formula based on a proof by Charles Hermite
(1822–1901). In the problems, we derive, along with other things, the following
beautiful continued fraction for cot x:
\[ \cot x = \frac{1}{x} - \cfrac{x}{3 - \cfrac{x^2}{5 - \cfrac{x^2}{7 - \cfrac{x^2}{9 - \ddots}}}}. \tag{8.44} \]
From this continued fraction, we can derive the beautiful companion result for tan x:
\[ \tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{5 - \cfrac{x^2}{7 - \ddots}}}}. \]
Exercises 8.7.
1. For all $n = 1, 2, \ldots$, let $a_n > 0$, $b_n \ge 0$, with $a_n \ge b_n + 1$. We shall prove that the
following continued fraction converges:
\[ \frac{b_1}{a_1 +}\, \frac{-b_2}{a_2 +}\, \frac{-b_3}{a_3 +}\, \frac{-b_4}{a_4 +} \cdots. \tag{8.45} \]
Note that for the continued fraction we are studying, $a_0 = 0$. Replacing $b_n$ with $-b_n$
for $n \ge 2$ in the Wallis-Euler recurrence relations (8.16) and (8.17) we get
\[ p_n = a_n p_{n-1} - b_n p_{n-2}, \qquad q_n = a_n q_{n-1} - b_n q_{n-2}, \qquad n = 2, 3, 4, \ldots \]
\[ p_0 = 0, \quad p_1 = b_1, \quad q_0 = 1, \quad q_1 = a_1. \]
(i) Prove (via induction for instance) that $q_n \ge q_{n-1}$ for all $n = 1, 2, \ldots$. In particular,
since $q_0 = 1$, we have $q_n \ge 1$ for all $n$, so the convergents $c_n = p_n/q_n$ of
(8.45) are defined.
(ii) Verify that $q_1 - p_1 \ge 1 = q_0 - p_0$. Now prove by induction that
$q_n - p_n \ge q_{n-1} - p_{n-1}$ for all $n = 1, 2, \ldots$. In particular, since $q_0 - p_0 = 1$, we have
$q_n - p_n \ge 1$ for all $n$. Dividing by $q_n$, conclude that $0 \le c_n \le 1$ for all $n = 1, 2, \ldots$.
(iii) Using the fundamental recurrence relations for $c_n - c_{n-1}$, prove that $c_n - c_{n-1} \ge 0$
for all $n = 1, 2, \ldots$. Combining this with (ii) shows that $0 \le c_1 \le c_2 \le c_3 \le \cdots \le 1$;
that is, $\{c_n\}$ is a bounded monotone sequence and hence converges. Thus, the
continued fraction (8.45) converges.
2. For all $n = 1, 2, \ldots$, let $a_n > 0$, $b_n \ge 0$, with $a_n \ge b_n + 1$. From the previous problem,
it follows that given any $a_0 \in \mathbb{R}$, the continued fraction
\[ a_0 - \frac{b_1}{a_1 +}\, \frac{-b_2}{a_2 +}\, \frac{-b_3}{a_3 +}\, \frac{-b_4}{a_4 +} \cdots \]
converges. We now prove a variant of the continued fraction convergence theorem
(Theorem 8.14): Let $\xi_0, \xi_1, \xi_2, \ldots$ be any sequence of real numbers with $\xi_n > 0$ for
$n \ge 1$ and suppose that these numbers are related by
\[ \xi_n = a_n + \frac{-b_{n+1}}{\xi_{n+1}}, \qquad n = 0, 1, 2, \ldots. \]
Then $\xi_0$ is equal to the continued fraction
\[ \xi_0 = a_0 - \frac{b_1}{a_1 +}\, \frac{-b_2}{a_2 +}\, \frac{-b_3}{a_3 +}\, \frac{-b_4}{a_4 +}\, \frac{-b_5}{a_5 +} \cdots. \]
Prove this statement following (almost verbatim!) the proof of Theorem 8.14.
3. We are now ready to derive the beautiful cotangent continued fraction (8.44).
(i) Let $a > 0$. Then as we derived the identity (8.38) found in Theorem 8.24, prove
that if we define
\[ \eta_n(a, x) := \frac{(a+n) F(a+n, -x)}{F(a+n+1, -x)}, \qquad a_n = a + n, \qquad b_n = x, \qquad n = 0, 1, 2, \ldots, \]
then
\[ \eta_n(a, x) = a_n + \frac{-b_{n+1}}{\eta_{n+1}(a, x)}, \qquad n = 0, 1, 2, 3, \ldots. \]
(ii) Using Problem 2, prove that for $x \ge 0$ sufficiently small, we have
\[ \frac{a F(a, -x)}{F(a+1, -x)} = \eta_0(a, x) = a - \frac{x}{a+1 +}\, \frac{-x}{a+2 +}\, \frac{-x}{a+3 +}\, \frac{-x}{a+4 +}\, \frac{-x}{a+5 +} \cdots. \tag{8.46} \]
(iii) Prove that (cf. the proof of Proposition 8.22)
\[ F\left(\frac{1}{2}, -\frac{x^2}{4}\right) = \cos x, \qquad x\, F\left(\frac{3}{2}, -\frac{x^2}{4}\right) = \sin x. \]
(iv) Now put $a = 1/2$ and replace $x$ with $x^2/4$ in (8.46) to derive the beautiful
cotangent expansion (8.44). Finally, relax and contemplate this fine formula!
4. (Irrationality of log r) Using Theorem 8.25, prove that if $r > 0$ is rational with $r \ne 1$,
then log r is irrational. In particular, one of our favorite constants, log 2, is irrational.

8.8. Quadratic irrationals and periodic continued fractions


We already know (Section 3.8) that a real number has a periodic decimal
expansion if and only if the number is rational. One can ask the same thing about
continued fractions: What types of real numbers have periodic simple continued
fractions? The answer, as you will see in this section, is the class of real numbers called
quadratic irrationals.

8.8.1. Periodic continued fractions. The object of this section is to char-


acterize continued fractions that “repeat”.
Example 8.25. We have already encountered the beautiful continued fraction
\[ \frac{1 + \sqrt{5}}{2} = \langle 1; 1, 1, 1, 1, 1, 1, 1, 1, \ldots \rangle. \]
We usually write the right-hand side as $\langle \overline{1} \rangle$ to emphasize that the 1 repeats.
Example 8.26. Another continued fraction that repeats is
\[ \sqrt{8} = \langle 2; 1, 4, 1, 4, 1, 4, 1, 4, \ldots \rangle, \]
where we have an infinite repeating block of 1, 4. We usually write the right-hand
side as $\sqrt{8} = \langle 2; \overline{1, 4} \rangle$.

Example 8.27. Yet one more continued fraction that repeats is
\[ \sqrt{19} = \langle 4; 2, 1, 3, 1, 2, 8, 2, 1, 3, 1, 2, 8, \ldots \rangle, \]
where we have an infinite repeating block of 2, 1, 3, 1, 2, 8. We usually write the
right-hand side as $\sqrt{19} = \langle 4; \overline{2, 1, 3, 1, 2, 8} \rangle$.

Notice that the above repeating continued fractions are continued fractions for
expressions with square roots.

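Example 8.28. Consider now the expression:
\[ \xi = \langle 3; 2, 1, 2, 1, 2, 1, 2, 1, \ldots \rangle = \langle 3; \overline{2, 1} \rangle. \]
If $\eta = \langle 2; 1, 2, 1, 2, 1, 2, \ldots \rangle$, then $\xi = 3 + \frac{1}{\eta}$, and
\[ \eta = 2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cdots}}} \implies \eta = 2 + \cfrac{1}{1 + \cfrac{1}{\eta}}. \]
Solving for $\eta$ we get the quadratic equation $\eta^2 - 2\eta - 2 = 0$, and solving it, we find that
$\eta = 1 + \sqrt{3}$. Hence,
\[ \xi = 3 + \frac{1}{\eta} = 3 + \frac{1}{1 + \sqrt{3}} = 3 + \frac{\sqrt{3} - 1}{2} = \frac{5 + \sqrt{3}}{2}, \]
yet another square root expression.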
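The value found in Example 8.28 can be checked numerically by truncating the periodic expansion after many copies of the repeating block. The sketch below is a floating-point illustration only; the helper `eval_periodic` and the choice of 30 repetitions are assumptions of ours.

```python
import math

def eval_periodic(prefix, block, repeats=30):
    """Evaluate <prefix..., block, block, ...> truncated after `repeats` copies of block."""
    quotients = list(prefix) + list(block) * repeats
    value = float(quotients[-1])
    # Work from the innermost partial quotient outward.
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value

xi = eval_periodic([3], [2, 1])
print(xi, (5 + math.sqrt(3)) / 2)
```

The same helper applied to the prefix 4 and block 2, 1, 3, 1, 2, 8 gives a floating-point approximation of $\sqrt{19}$, in line with Example 8.27.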

Consider the infinite repeating simple continued fraction
\begin{align*}
\xi &= \langle a_0; a_1, \ldots, a_{\ell-1}, b_0, b_1, \ldots, b_{m-1}, b_0, b_1, \ldots, b_{m-1}, b_0, b_1, \ldots, b_{m-1}, \ldots \rangle \tag{8.47} \\
&= \langle a_0; a_1, \ldots, a_{\ell-1}, \overline{b_0, b_1, \ldots, b_{m-1}} \rangle,
\end{align*}
where the bar denotes that the block of numbers $b_0, b_1, \ldots, b_{m-1}$ repeats forever.
Such a continued fraction is said to be periodic. When writing a continued fraction
in this way we assume that there is no shorter repeating block and that the repeating
block cannot start at an earlier position. For example, we would never write
\[ \langle 2; 1, 2, 4, 3, 4, 3, 4, 3, 4, \ldots \rangle \quad \text{as} \quad \langle 2; 1, 2, 4, \overline{3, 4, 3, 4} \rangle; \]
we simply write it as $\langle 2; 1, 2, \overline{4, 3} \rangle$. The integer $m$ is called the period of the simple
continued fraction. An equivalent way to define a periodic continued fraction is as
an infinite simple continued fraction $\xi = \langle a_0; a_1, a_2, \ldots \rangle$ such that for some $m$ and
$\ell$, we have
\[ a_n = a_{m+n} \quad \text{for all } n = \ell, \ell + 1, \ell + 2, \ldots. \tag{8.48} \]

The examples above suggest that infinite periodic simple continued fractions are
intimately related to expressions with square roots; in fact, these expressions are
called quadratic irrationals as we shall see in a moment.

8.8.2. Quadratic irrationals. A quadratic irrational is, exactly as its
name suggests, an irrational real number that is a solution of a quadratic equation
with integer coefficients. Using the quadratic formula, we leave you to show
that a quadratic irrational $\xi$ can be written in the form
\[ \xi = r + s \sqrt{b} \tag{8.49} \]
where $r, s$ are rational numbers and $b > 0$ is an integer that is not a perfect square
(for if $b$ were a perfect square, then $\sqrt{b}$ would be an integer so the right-hand side
of (8.49) would be rational, contradicting that $\xi$ is irrational). Conversely, given any
irrational real number of the form (8.49), one can check that $\xi$ is a root of the equation
\[ x^2 - 2r\, x + (r^2 - s^2 b) = 0. \]
Multiplying both sides of this equation by the common denominator of the rational
numbers $2r$ and $r^2 - s^2 b$, we can make the polynomial on the left have integer
coefficients. Thus, a real number is a quadratic irrational if and only if it is an irrational
number of the form (8.49). As we shall see in Theorem 8.29 below, it would be helpful to write
quadratic irrationals in a certain way. Let $\xi$ take the form in (8.49) with $r = m/n$
and $s = p/q$ where we may assume that $n, q > 0$. Then with the help of some
mathematical gymnastics, we see that
\[ \xi = \frac{m}{n} + \frac{p \sqrt{b}}{q} = \frac{mq + np \sqrt{b}}{nq} = \frac{mq + \sqrt{b n^2 p^2}}{nq} = \frac{mnq^2 + \sqrt{b n^4 p^2 q^2}}{n^2 q^2}. \]
Notice that if we set $\alpha = mnq^2$, $\beta = n^2 q^2$ and $d = b n^4 p^2 q^2$, then
$d - \alpha^2 = b n^4 p^2 q^2 - m^2 n^2 q^4 = (b n^2 p^2 - m^2 q^2)(n^2 q^2)$ is divisible by $\beta = n^2 q^2$. Therefore, we
can write any quadratic irrational in the form
\[ \xi = \frac{\alpha + \sqrt{d}}{\beta}, \qquad \alpha, \beta, d \in \mathbb{Z}, \ d > 0 \text{ is not a perfect square, and } \beta \mid (d - \alpha^2). \]
Using this expression as the starting point, we prove the following nice theorem
that gives formulas for the complete quotients of the continued fraction expansion of $\xi$.

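Theorem 8.29. Let $\xi = \frac{\alpha + \sqrt{d}}{\beta}$ be a quadratic irrational with complete quotients
$\{\xi_n\}$ (with $\xi_0 = \xi$) and partial quotients $\{a_n\}$ where $a_n = \lfloor \xi_n \rfloor$. Then,
\[ \xi_n = \frac{\alpha_n + \sqrt{d}}{\beta_n}, \]
where $\alpha_n$ and $\beta_n$ are integers with $\beta_n \ne 0$ defined by the recursive sequences
\[ \alpha_0 = \alpha, \quad \beta_0 = \beta, \quad \alpha_{n+1} = a_n \beta_n - \alpha_n, \quad \beta_{n+1} = \frac{d - \alpha_{n+1}^2}{\beta_n}; \]
moreover, $\beta_n \mid (d - \alpha_n^2)$ for all $n$.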
Proof. We first show that all the $\alpha_n$’s and $\beta_n$’s defined above are integers
with $\beta_n$ never zero and $\beta_n \mid (d - \alpha_n^2)$. This is automatic with $n = 0$. Assume this
is true for $n$. Then $\alpha_{n+1} = a_n \beta_n - \alpha_n$ is an integer. To see that $\beta_{n+1}$ is also an
integer, observe that
\begin{align*}
\beta_{n+1} = \frac{d - \alpha_{n+1}^2}{\beta_n} &= \frac{d - (a_n \beta_n - \alpha_n)^2}{\beta_n} = \frac{d - a_n^2 \beta_n^2 + 2 a_n \beta_n \alpha_n - \alpha_n^2}{\beta_n} \\
&= \frac{d - \alpha_n^2}{\beta_n} + 2 a_n \alpha_n - a_n^2 \beta_n.
\end{align*}
By induction hypothesis, $(d - \alpha_n^2)/\beta_n$ is an integer and so is $2 a_n \alpha_n - a_n^2 \beta_n$. Thus,
$\beta_{n+1}$ is an integer too. Moreover, $\beta_{n+1} \ne 0$, because if $\beta_{n+1} = 0$, then we must
have $d - \alpha_{n+1}^2 = 0$, which shows that $d$ is a perfect square contrary to our condition
on $d$. Finally, since $\beta_n$ is an integer, we have
\[ \beta_{n+1} = \frac{d - \alpha_{n+1}^2}{\beta_n} \implies \beta_n = \frac{d - \alpha_{n+1}^2}{\beta_{n+1}} \implies \beta_{n+1} \mid (d - \alpha_{n+1}^2). \]
Lastly, it remains to prove that the $\xi_n$’s are the complete quotients of $\xi$. To
avoid confusion, for each $n$ let’s put $\eta_n = (\alpha_n + \sqrt{d})/\beta_n$; we must show that $\eta_n = \xi_n$
for each $n$. Note that $\eta_0 = \xi = \xi_0$. Now to prove that $\eta_n = \xi_n$ for $n \ge 1$, we simply
use the formula for $\eta_n$:
\[ \eta_n - a_n = \frac{\alpha_n + \sqrt{d}}{\beta_n} - \frac{\alpha_{n+1} + \alpha_n}{\beta_n} = \frac{\sqrt{d} - \alpha_{n+1}}{\beta_n}, \]
where in the first equality we solved $\alpha_{n+1} = a_n \beta_n - \alpha_n$ for $a_n$. Rationalizing
and using the definition of $\beta_{n+1}$ and $\eta_{n+1}$, we obtain
\[ \eta_n - a_n = \frac{d - \alpha_{n+1}^2}{\beta_n (\sqrt{d} + \alpha_{n+1})} = \frac{\beta_{n+1}}{\sqrt{d} + \alpha_{n+1}} = \frac{1}{\eta_{n+1}} \implies \eta_n = a_n + \frac{1}{\eta_{n+1}}. \]
Using this formula plus induction on $n = 0, 1, 2, \ldots$ (recalling that $\eta_0 = \xi_0$) shows
that $\eta_n = \xi_n$ for all $n$.
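The recursion of Theorem 8.29 uses only integer arithmetic, so it is well suited to computation. The Python sketch below specializes to $\xi = \sqrt{d}$ (so $\alpha_0 = 0$, $\beta_0 = 1$), where every $\beta_n$ is positive and the floor can be computed from the integer square root. The function name `sqrt_cf_terms` is hypothetical, and the restriction to $\sqrt{d}$ is a simplifying assumption of ours rather than part of the theorem.

```python
from math import isqrt

def sqrt_cf_terms(d, count):
    """First `count` partial quotients of sqrt(d), d a positive non-square integer.

    Uses xi_n = (alpha_n + sqrt(d)) / beta_n with alpha_0 = 0, beta_0 = 1.
    """
    root = isqrt(d)
    if root * root == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta = 0, 1
    terms = []
    for _ in range(count):
        # For beta > 0: floor((alpha + sqrt(d)) / beta) == (alpha + isqrt(d)) // beta.
        a = (alpha + root) // beta
        terms.append(a)
        alpha = a * beta - alpha
        beta = (d - alpha * alpha) // beta  # exact division, by Theorem 8.29
    return terms

print(sqrt_cf_terms(19, 13))
```

For d = 19 the successive pairs $(\alpha_n, \beta_n)$ are (0, 1), (4, 3), (2, 5), (3, 2), (3, 5), (2, 3), (4, 1), after which (4, 3) reappears and the partial quotients repeat.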
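8.8.3. Quadratic irrationals and periodic continued fractions. After
one preliminary result, we shall prove that an infinite simple continued fraction is
a quadratic irrational if and only if it is periodic. Define
\[ \mathbb{Z}[\sqrt{d}] := \{ a + b \sqrt{d} \; ; \; a, b \in \mathbb{Z} \} \]
and
\[ \mathbb{Q}[\sqrt{d}] := \{ a + b \sqrt{d} \; ; \; a, b \in \mathbb{Q} \}. \]
Given $\xi = a + b \sqrt{d}$ in either $\mathbb{Z}[\sqrt{d}]$ or $\mathbb{Q}[\sqrt{d}]$, we define its conjugate by
\[ \overline{\xi} := a - b \sqrt{d}. \]
Lemma 8.30. $\mathbb{Z}[\sqrt{d}]$ is a commutative ring and $\mathbb{Q}[\sqrt{d}]$ is a field, and conjugation
preserves the algebraic properties; for example, if $\alpha, \beta \in \mathbb{Q}[\sqrt{d}]$, then
\[ \overline{\alpha \pm \beta} = \overline{\alpha} \pm \overline{\beta}, \qquad \overline{\alpha \cdot \beta} = \overline{\alpha} \cdot \overline{\beta}, \qquad \text{and} \qquad \overline{\alpha / \beta} = \overline{\alpha} / \overline{\beta}. \]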

Proof. To prove that $\mathbb{Z}[\sqrt{d}]$ is a commutative ring we just need to prove
that it has the same algebraic properties as the integers in that $\mathbb{Z}[\sqrt{d}]$ is closed
under addition, subtraction, and multiplication (for more on this definition see
our discussion in Subsection 2.3.1). For example, to see that $\mathbb{Z}[\sqrt{d}]$ is closed under
multiplication, let $\alpha = a + b \sqrt{d}$ and $\beta = a' + b' \sqrt{d}$ be elements of $\mathbb{Z}[\sqrt{d}]$; then,
\[ \alpha \beta = (a + b \sqrt{d})(a' + b' \sqrt{d}) = aa' + bb' d + (ab' + a'b) \sqrt{d}, \tag{8.50} \]
which is also in $\mathbb{Z}[\sqrt{d}]$. Similarly, one can show that $\mathbb{Z}[\sqrt{d}]$ satisfies all the other
properties of a commutative ring.
To prove that $\mathbb{Q}[\sqrt{d}]$ is a field we need to prove that it has the same algebraic
properties as the rational numbers in that $\mathbb{Q}[\sqrt{d}]$ is closed under addition,
multiplication, subtraction, and division by nonzero elements (for more on this
definition see our discussion in Subsection 2.6.1). For example, to see that $\mathbb{Q}[\sqrt{d}]$ is
closed under taking reciprocals, observe that if $\alpha = a + b \sqrt{d} \in \mathbb{Q}[\sqrt{d}]$ is not zero,
then
\[ \frac{1}{\alpha} = \frac{1}{a + b \sqrt{d}} \cdot \frac{a - b \sqrt{d}}{a - b \sqrt{d}} = \frac{a - b \sqrt{d}}{a^2 - b^2 d} = \frac{a}{a^2 - b^2 d} - \frac{b}{a^2 - b^2 d} \sqrt{d}. \]
Note that $a^2 - b^2 d \ne 0$ since being zero would imply that $\sqrt{d} = \pm a/b$, a rational
number, which by assumption is false. Similarly, one can show that $\mathbb{Q}[\sqrt{d}]$ satisfies
all the other properties of a field.
Finally, we need to prove that conjugation preserves the algebraic properties.
For example, let’s prove the equality $\overline{\alpha \cdot \beta} = \overline{\alpha} \cdot \overline{\beta}$, leaving the other properties to
you. If $\alpha = a + b \sqrt{d}$ and $\beta = a' + b' \sqrt{d}$, then according to (8.50), we have
\[ \overline{\alpha \beta} = aa' + bb' d - (ab' + a'b) \sqrt{d}. \]
On the other hand,
\[ \overline{\alpha}\, \overline{\beta} = (a - b \sqrt{d})(a' - b' \sqrt{d}) = aa' + bb' d - (ab' + a'b) \sqrt{d}, \]
which equals $\overline{\alpha \beta}$.
The following theorem was first proved by Joseph-Louis Lagrange (1736–1813).

Theorem 8.31. An infinite simple continued fraction is a quadratic irrational


if and only if it is periodic.
Proof. We first prove the “if” part then the “only if” part.
Step 1: Let $\xi = \langle a_0; a_1, \ldots, a_{\ell-1}, \overline{b_0, \ldots, b_{m-1}} \rangle$ be periodic and define
\[ \eta := \langle b_0; b_1, \ldots, b_{m-1}, b_0, b_1, \ldots, b_{m-1}, b_0, b_1, \ldots, b_{m-1}, \ldots \rangle = \langle b_0; b_1, \ldots, b_{m-1}, \eta \rangle, \]
so that $\xi = \langle a_0; a_1, \ldots, a_{\ell-1}, \eta \rangle$. Since $\eta = \langle b_0; b_1, \ldots, b_{m-1}, \eta \rangle$, by Theorem 8.4, we
have
\[ \eta = \frac{\eta s_{m-1} + s_{m-2}}{\eta t_{m-1} + t_{m-2}}, \]
where $s_n/t_n$ are the convergents for $\eta$. Multiplying both sides by $\eta t_{m-1} + t_{m-2}$, we
see that
\[ \eta^2 t_{m-1} + \eta t_{m-2} = \eta s_{m-1} + s_{m-2} \implies a \eta^2 + b \eta + c = 0, \]
where $a = t_{m-1}$, $b = t_{m-2} - s_{m-1}$, and $c = -s_{m-2}$. Hence, $\eta$ is a quadratic
irrational. Now using that $\xi = \langle a_0; a_1, \ldots, a_{\ell-1}, \eta \rangle$ and Theorem 8.4, we obtain
\[ \xi = \frac{\eta p_{\ell-1} + p_{\ell-2}}{\eta q_{\ell-1} + q_{\ell-2}}, \]
where $p_n/q_n$ are the convergents for $\xi$. Since $\eta$ is a quadratic irrational, it follows
that $\xi$ is a quadratic irrational since $\mathbb{Q}[\sqrt{d}]$ is a field by Lemma 8.30. Thus, we
have proved that periodic simple continued fractions are quadratic irrationals.
Step 2: Now let $\xi = \langle a_0; a_1, a_2, \ldots \rangle$ be a quadratic irrational; we shall prove
that its continued fraction expansion is periodic. The trick to prove Step 2 is
to first show that the integers $\alpha_n$ and $\beta_n$ of the complete quotients of $\xi$ found in
Theorem 8.29 are bounded. To implement this idea, let $\xi_n$ be the $n$-th complete
quotient of $\xi$. Then we can write $\xi = \langle a_0; a_1, a_2, \ldots, a_{n-1}, \xi_n \rangle$, so by Theorem 8.4
we have
\[ \xi = \frac{\xi_n p_{n-1} + p_{n-2}}{\xi_n q_{n-1} + q_{n-2}}. \]
Solving for $\xi_n$, after a little algebra, we find that
\[ -\xi_n = \frac{q_{n-2}}{q_{n-1}} \left( \frac{\xi - c_{n-2}}{\xi - c_{n-1}} \right). \]
Since conjugation preserves the algebraic operations by our lemma, we see that
\[ -\overline{\xi}_n = \frac{q_{n-2}}{q_{n-1}} \left( \frac{\overline{\xi} - c_{n-2}}{\overline{\xi} - c_{n-1}} \right). \tag{8.51} \]
If $\xi = (\alpha + \sqrt{d})/\beta$, then $\xi - \overline{\xi} = 2 \sqrt{d}/\beta \ne 0$. Therefore, since $c_k \to \xi$ as $k \to \infty$, it
follows that as $n \to \infty$,
\[ \frac{\overline{\xi} - c_{n-2}}{\overline{\xi} - c_{n-1}} \to \frac{\overline{\xi} - \xi}{\overline{\xi} - \xi} = 1. \]
In particular, there is an $N \in \mathbb{N}$ such that for $n > N$, $(\overline{\xi} - c_{n-2})/(\overline{\xi} - c_{n-1}) > 0$.
Thus, as $q_k > 0$ for $k \ge 0$, according to (8.51), for $n > N$, we have $-\overline{\xi}_n > 0$. Hence,
writing $\xi_n$, which is positive for $n \ge 1$, as $\xi_n = (\alpha_n + \sqrt{d})/\beta_n$ as shown in Theorem
8.29, it follows that for $n > N$,
\[ 0 = 0 + 0 < \xi_n + (-\overline{\xi}_n) = \frac{2 \sqrt{d}}{\beta_n}. \]
So, for $n > N$, we have $\beta_n > 0$. Now solving the identity $\beta_{n+1} = \frac{d - \alpha_{n+1}^2}{\beta_n}$ in
Theorem 8.29 for $d$ we see that
\[ \beta_n \beta_{n+1} + \alpha_{n+1}^2 = d. \]
For $n > N$, both $\beta_n$ and $\beta_{n+1}$ are positive, which implies that $\beta_n$ and $|\alpha_n|$ cannot
be too large; for instance, for $n > N$, we must have $0 < \beta_n \le d$ and $0 \le |\alpha_n| \le d$.
(For if either $\beta_n$ or $|\alpha_n|$ were greater than $d$, then $\beta_n \beta_{n+1} + \alpha_{n+1}^2$ would be strictly
larger than $d$, an impossibility since the sum is supposed to equal $d$.) In particular,
if $A$ is the finite set
\[ A = \{ (j, k) \in \mathbb{Z} \times \mathbb{Z} \; ; \; -d \le j \le d, \ 1 \le k \le d \}, \]
then for the infinitely many $n > N$, the pair $(\alpha_n, \beta_n)$ is in the finite set $A$. By the
pigeonhole principle, there must be distinct $j, k > N$ such that $(\alpha_j, \beta_j) = (\alpha_k, \beta_k)$.
Assume that $j > k$ and let $m := j - k$. Then $j = m + k$, so
\[ \alpha_k = \alpha_{m+k} \quad \text{and} \quad \beta_k = \beta_{m+k}. \]
Since $a_k = \lfloor \xi_k \rfloor$ and $a_{m+k} = \lfloor \xi_{m+k} \rfloor$, by Theorem 8.29 we have
\[ \xi_k = \frac{\alpha_k + \sqrt{d}}{\beta_k} = \frac{\alpha_{m+k} + \sqrt{d}}{\beta_{m+k}} = \xi_{m+k} \implies a_k = \lfloor \xi_k \rfloor = \lfloor \xi_{m+k} \rfloor = a_{m+k}. \]
Thus, using our formulas for $\alpha_{k+1}$ and $\beta_{k+1}$ from Theorem 8.29, we see that
\[ \alpha_{k+1} = a_k \beta_k - \alpha_k = a_{m+k} \beta_{m+k} - \alpha_{m+k} = \alpha_{m+k+1}, \]
and
\[ \beta_{k+1} = \frac{d - \alpha_{k+1}^2}{\beta_k} = \frac{d - \alpha_{m+k+1}^2}{\beta_{m+k}} = \beta_{m+k+1}. \]
Thus,
\[ \xi_{k+1} = \frac{\alpha_{k+1} + \sqrt{d}}{\beta_{k+1}} = \frac{\alpha_{m+k+1} + \sqrt{d}}{\beta_{m+k+1}} = \xi_{m+k+1} \implies a_{k+1} = \lfloor \xi_{k+1} \rfloor = \lfloor \xi_{m+k+1} \rfloor = a_{m+k+1}. \]
Continuing this process by induction shows that $a_n = a_{m+n}$ for all $n = k, k+1, k+2, k+3, \ldots$.
Thus, by the definition of periodicity in (8.48), we see that $\xi$ has a
periodic simple continued fraction.
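The pigeonhole argument in Step 2 doubles as an algorithm: run the recursion of Theorem 8.29 and stop as soon as a pair $(\alpha_n, \beta_n)$ repeats. The Python sketch below does this for $\xi = \sqrt{d}$; the name `sqrt_cf_period` and the restriction to square roots (where all $\beta_n > 0$) are assumptions made to keep the illustration short.

```python
from math import isqrt

def sqrt_cf_period(d):
    """Return (pre-period, repeating block) of the expansion of sqrt(d).

    d must be a positive integer that is not a perfect square.
    """
    root = isqrt(d)
    if root * root == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta = 0, 1
    seen = {}    # (alpha_n, beta_n) -> index n at which the pair first occurred
    terms = []
    while (alpha, beta) not in seen:
        seen[(alpha, beta)] = len(terms)
        a = (alpha + root) // beta          # a_n = floor(xi_n), valid since beta > 0
        terms.append(a)
        alpha = a * beta - alpha
        beta = (d - alpha * alpha) // beta
    start = seen[(alpha, beta)]             # the block repeats from this index on
    return terms[:start], terms[start:]

print(sqrt_cf_period(19))
```

The loop must terminate because, as the proof shows, the pairs $(\alpha_n, \beta_n)$ eventually lie in a finite set; the dictionary `seen` simply records the first index at which each pair appears.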
A periodic simple continued fraction is called purely periodic if it is of the
form $\xi = \langle \overline{a_0; a_1, \ldots, a_{m-1}} \rangle$.
Example 8.29. The simplest example of such a fraction is the golden ratio:
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \langle \overline{1} \rangle = \langle 1; 1, 1, 1, 1, 1, \ldots \rangle. \]
Observe that $\Phi$ has the following properties:
\[ \Phi > 1 \quad \text{and} \quad \overline{\Phi} = \frac{1 - \sqrt{5}}{2} = -0.618\ldots \implies \Phi > 1 \ \text{and} \ -1 < \overline{\Phi} < 0. \]
In the following theorem, Evariste Galois’$^7$ (1811–1832) first publication (at the
age of 17), we characterize purely periodic expansions as those quadratic irrationals
having these same properties. (Don’t believe everything you read about the legendary
Galois; see [189]. See [220] for an introduction to Galois’ famous theory.)
Theorem 8.32. A quadratic irrational $\xi$ is purely periodic if and only if
\[ \xi > 1 \quad \text{and} \quad -1 < \overline{\xi} < 0. \]
Proof. Assume that $\xi = \langle a_0; \ldots, a_{m-1}, a_0, a_1, \ldots, a_{m-1}, \ldots \rangle$ is purely periodic;
we shall prove that $\xi > 1$ and $-1 < \overline{\xi} < 0$. Recall that in general, for any
simple continued fraction $\langle b_0; b_1, b_2, \ldots \rangle$, all the $b_n$’s are positive after $b_0$. Thus, as
$a_0$ appears again (and again, and again, $\ldots$) after the first $a_0$ in $\xi$, it follows that
$a_0 \ge 1$. Hence, $\xi = a_0 + \frac{1}{\xi_1} > 1$. Now applying Theorem 8.4 to $\langle a_0; \ldots, a_{m-1}, \xi \rangle$,
we get
\[ \xi = \frac{\xi p_{m-1} + p_{m-2}}{\xi q_{m-1} + q_{m-2}}, \]
where $p_n/q_n$ are the convergents for $\xi$. Multiplying both sides by $\xi q_{m-1} + q_{m-2}$,
we obtain
\[ \xi^2 q_{m-1} + \xi q_{m-2} = \xi p_{m-1} + p_{m-2} \implies f(\xi) = 0, \]
where $f(x) = q_{m-1} x^2 + (q_{m-2} - p_{m-1}) x - p_{m-2}$ is a quadratic polynomial. In
particular, $\xi$ is a root of $f$. Taking conjugates, we see that
\[ q_{m-1} \xi^2 + (q_{m-2} - p_{m-1}) \xi - p_{m-2} = 0 \implies q_{m-1} \overline{\xi}^2 + (q_{m-2} - p_{m-1}) \overline{\xi} - p_{m-2} = 0, \]
7[From the preface to his final manuscript (Evariste died from a pistol duel at the age of
20)] Since the beginning of the century, computational procedures have become so complicated that
any progress by those means has become impossible, without the elegance which modern mathe-
maticians have brought to bear on their research, and by means of which the spirit comprehends
quickly and in one step a great many computations. It is clear that elegance, so vaunted and so
aptly named, can have no other purpose. ... Go to the roots, of these calculations! Group the
operations. Classify them according to their complexities rather than their appearances! This, I
believe, is the mission of future mathematicians. This is the road on which I am embarking in
this work. Evariste Galois (1811–1832).

therefore $\overline{\xi}$ is the other root of $f$. Now $\xi > 1$, so by the Wallis-Euler recurrence
relations, $p_n > 0$, $p_n < p_{n+1}$, and $q_n < q_{n+1}$ for all $n$. Hence,
\[ f(-1) = (q_{m-1} - q_{m-2}) + (p_{m-1} - p_{m-2}) > 0 \quad \text{and} \quad f(0) = -p_{m-2} < 0. \]
By the intermediate value theorem $f(x) = 0$ for some $-1 < x < 0$. Since $\overline{\xi}$ is the
other root of $f$ we have $-1 < \overline{\xi} < 0$.
Assume now that $\xi$ is a quadratic irrational with $\xi > 1$ and $-1 < \overline{\xi} < 0$; we
shall prove that $\xi$ is purely periodic. To do so, we first prove that if $\{\xi_n\}$ are the
complete quotients of $\xi$, then $-1 < \overline{\xi}_n < 0$ for all $n$. Since $\xi_0 = \xi$, this is already
true for $n = 0$ by assumption. Assume this holds for $n$; then,
\[ \xi_n = a_n + \frac{1}{\xi_{n+1}} \implies \frac{1}{\overline{\xi}_{n+1}} = \overline{\xi}_n - a_n < -a_n \le -1 \implies \frac{1}{\overline{\xi}_{n+1}} < -1. \]
The inequality $\frac{1}{\overline{\xi}_{n+1}} < -1$ shows that $-1 < \overline{\xi}_{n+1} < 0$ and completes the induction.
Now we already know that $\xi$ is periodic, so let us assume for the sake of contradiction that
$\xi$ is not purely periodic, that is, $\xi = \langle a_0; a_1, \ldots, a_{\ell-1}, \overline{a_\ell, \ldots, a_{\ell+m-1}} \rangle$ where $\ell \ge 1$.
Then $a_{\ell-1} \ne a_{\ell+m-1}$ for otherwise we could start the repeating block at $a_{\ell-1}$, so
\[ \xi_{\ell-1} = a_{\ell-1} + \frac{1}{\langle \overline{a_\ell; \ldots, a_{\ell+m-1}} \rangle} \ne a_{\ell+m-1} + \frac{1}{\langle \overline{a_\ell; \ldots, a_{\ell+m-1}} \rangle} = \xi_{\ell+m-1}. \tag{8.52} \]
Observe that this expression shows that $\xi_{\ell-1} - \xi_{\ell+m-1} = a_{\ell-1} - a_{\ell+m-1}$ is an
integer. In particular, taking conjugates, we see that
\[ \overline{\xi}_{\ell-1} - \overline{\xi}_{\ell+m-1} = a_{\ell-1} - a_{\ell+m-1} = \xi_{\ell-1} - \xi_{\ell+m-1}. \]
Now we already proved that $-1 < \overline{\xi}_{\ell-1} < 0$ and $-1 < \overline{\xi}_{\ell+m-1} < 0$, which we write
as $0 < -\overline{\xi}_{\ell+m-1} < 1$. Thus,
\[ 0 - 1 < \overline{\xi}_{\ell-1} + (-\overline{\xi}_{\ell+m-1}) < 0 + 1 \implies -1 < \xi_{\ell-1} - \xi_{\ell+m-1} < 1, \]
since $\overline{\xi}_{\ell-1} - \overline{\xi}_{\ell+m-1} = \xi_{\ell-1} - \xi_{\ell+m-1}$. However, we noted that $\xi_{\ell-1} - \xi_{\ell+m-1}$ is an
integer, and since the only integer strictly between $-1$ and $1$ is $0$, it must be that
$\xi_{\ell-1} = \xi_{\ell+m-1}$. However, this contradicts (8.52), and our proof is complete.
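8.8.4. Square roots and periodic continued fractions. Recall that
\[ \sqrt{19} = \langle 4; \overline{2, 1, 3, 1, 2, 8} \rangle; \]
if you didn’t notice the beautiful symmetry before, observe that we can write this
as $\sqrt{19} = \langle a_0; \overline{a_1, a_2, a_3, a_2, a_1, 2a_0} \rangle$ where the repeating block has a symmetric part
and an ending part twice $a_0$. It turns out that any square root has this nice symmetry
property. To prove this fact, we first prove the following.
Lemma 8.33. If $\xi = \langle \overline{a_0; a_1, \ldots, a_{m-1}} \rangle$ is purely periodic, then $-1/\overline{\xi}$ is also
purely periodic of the reversed form: $-1/\overline{\xi} = \langle \overline{a_{m-1}; a_{m-2}, \ldots, a_0} \rangle$.
Proof. Writing out the complete quotients $\xi, \xi_1, \xi_2, \ldots, \xi_{m-1}$ of
\[ \xi = \langle \overline{a_0; a_1, \ldots, a_{m-1}} \rangle = \langle a_0; a_1, \ldots, a_{m-1}, \xi \rangle \]
we obtain
\[ \xi = a_0 + \frac{1}{\xi_1}, \quad \xi_1 = a_1 + \frac{1}{\xi_2}, \quad \ldots, \quad \xi_{m-2} = a_{m-2} + \frac{1}{\xi_{m-1}}, \quad \xi_{m-1} = a_{m-1} + \frac{1}{\xi}. \]
Taking conjugates of all of these and listing them in reverse order, we find that
\[ \frac{-1}{\overline{\xi}} = a_{m-1} - \overline{\xi}_{m-1}, \quad \frac{-1}{\overline{\xi}_{m-1}} = a_{m-2} - \overline{\xi}_{m-2}, \quad \ldots, \quad \frac{-1}{\overline{\xi}_2} = a_1 - \overline{\xi}_1, \quad \frac{-1}{\overline{\xi}_1} = a_0 - \overline{\xi}. \]
Let us define $\eta_0 := -1/\overline{\xi}$, $\eta_1 = -1/\overline{\xi}_{m-1}$, $\eta_2 = -1/\overline{\xi}_{m-2}$, $\ldots$, $\eta_{m-1} = -1/\overline{\xi}_1$. Then
we can write the previous displayed equalities as
\[ \eta_0 = a_{m-1} + \frac{1}{\eta_1}, \quad \eta_1 = a_{m-2} + \frac{1}{\eta_2}, \quad \ldots, \quad \eta_{m-2} = a_1 + \frac{1}{\eta_{m-1}}, \quad \eta_{m-1} = a_0 + \frac{1}{\eta_0}; \]
in other words, $\eta_0$ is just the continued fraction:
\[ \eta_0 = \langle a_{m-1}; a_{m-2}, \ldots, a_1, a_0, \eta_0 \rangle = \langle \overline{a_{m-1}; a_{m-2}, \ldots, a_1, a_0} \rangle. \]
Since $\eta_0 = -1/\overline{\xi}$, our proof is complete.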

Recall that the continued fraction expansion for $\sqrt{d}$ has the complete quotients $\xi_n$ and partial quotients $a_n$ determined by
\[ \xi_n = \frac{\alpha_n + \sqrt{d}}{\beta_n}, \qquad a_n = \lfloor \xi_n \rfloor, \]
where the $\alpha_n$, $\beta_n$’s are integers given in Theorem 8.29. We are now ready to prove Adrien-Marie Legendre’s (1752–1833) famous result.

Theorem 8.34. The simple continued fraction of $\sqrt{d}$ has the form
\[ \sqrt{d} = \langle a_0; \overline{a_1, a_2, a_3, \ldots, a_3, a_2, a_1, 2a_0} \rangle. \]
Moreover, $\beta_n \neq -1$ for all $n$, and $\beta_n = +1$ if and only if $n$ is a multiple of the period of $\sqrt{d}$.
Proof. Starting the continued fraction algorithm for $\sqrt{d}$, we obtain $\sqrt{d} = a_0 + \frac{1}{\xi_1}$, where $\xi_1 > 1$. Since $\frac{1}{\xi_1} = -a_0 + \sqrt{d}$, we have
\[ -\frac{1}{\overline{\xi}_1} = -\bigl(-a_0 - \sqrt{d}\bigr) = a_0 + \sqrt{d} > 1, \tag{8.53} \]
so we must have $-1 < \overline{\xi}_1 < 0$. Since both $\xi_1 > 1$ and $-1 < \overline{\xi}_1 < 0$, by Galois’ Theorem 8.32, we know that $\xi_1$ is purely periodic: $\xi_1 = \langle \overline{a_1; a_2, \ldots, a_m} \rangle$. Thus,
\[ \sqrt{d} = a_0 + \frac{1}{\xi_1} = \langle a_0; \xi_1 \rangle = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle. \]
On the other hand, from (8.53) and from Lemma 8.33, we see that
\[ \langle 2a_0; a_1, a_2, \ldots, a_m, a_1, a_2, \ldots, a_m, \ldots \rangle = a_0 + \sqrt{d} = -\frac{1}{\overline{\xi}_1} = \langle \overline{a_m; \ldots, a_1} \rangle \]
\[ = \langle a_m; a_{m-1}, a_{m-2}, \ldots, a_1, a_m, a_{m-1}, a_{m-2}, \ldots, a_1, \ldots \rangle. \]
Comparing the left and right-hand sides, we see that $a_m = 2a_0$, $a_{m-1} = a_1$, $a_{m-2} = a_2$, $a_{m-3} = a_3$, and so forth; therefore,
\[ \sqrt{d} = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle = \langle a_0; \overline{a_1, a_2, a_3, \ldots, a_3, a_2, a_1, 2a_0} \rangle. \]
We now prove that $\beta_n$ never equals $-1$, and $\beta_n = +1$ if and only if $n$ is a multiple of the period $m$. By the form of the continued fraction expansion of $\sqrt{d}$ we just derived, observe that for any $n > 0$, the $n$-th complete quotient $\xi_n$ for $\sqrt{d}$ is purely periodic. In particular, by Galois’ Theorem 8.32 we know that
\[ n \geq 1 \implies \xi_n > 1 \ \text{ and } \ -1 < \overline{\xi}_n < 0. \tag{8.54} \]
Now for the sake of contradiction, assume that $\beta_n = -1$. Since $\beta_0 = +1$ by definition (see Theorem 8.29), we must have $n > 0$. Then the formula $\xi_n = (\alpha_n + \sqrt{d})/\beta_n$ with $\beta_n = -1$ and (8.54) imply that
\[ 1 < \xi_n = -\alpha_n - \sqrt{d} \implies \alpha_n < -1 - \sqrt{d} \implies \alpha_n < 0. \]
On the other hand, (8.54) also implies that
\[ -1 < \overline{\xi}_n = -\alpha_n + \sqrt{d} < 0 \implies \sqrt{d} < \alpha_n \implies 0 < \alpha_n. \]
Since $\alpha_n < 0$ and $\alpha_n > 0$ cannot both hold, it follows that $\beta_n = -1$ is impossible.
We now prove that $\beta_n = +1$ if and only if $n$ is a multiple of the period $m$. Assume first that $\beta_n = 1$. Then $\xi_n = \alpha_n + \sqrt{d}$. By (8.54) we see that
\[ -1 < \overline{\xi}_n = \alpha_n - \sqrt{d} < 0 \implies \sqrt{d} - 1 < \alpha_n < \sqrt{d}. \]
Since $\alpha_n$ is an integer, and the only integer strictly between $\sqrt{d} - 1$ and $\sqrt{d}$ is $a_0 = \lfloor \sqrt{d} \rfloor$, it follows that $\alpha_n = a_0$, so $\xi_n = a_0 + \sqrt{d}$. Now recalling the expansion $\sqrt{d} = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle$ and the fact that $2a_0 = a_m$, it follows that
\[ a_0 + \sqrt{d} = \langle 2a_0; a_1, a_2, \ldots, a_{m-1}, a_m, a_1, a_2, \ldots, a_{m-1}, a_m, \ldots \rangle = \langle \overline{a_m; a_1, a_2, \ldots, a_{m-1}} \rangle; \tag{8.55} \]
thus $\xi_n = \langle \overline{a_m; a_1, a_2, \ldots, a_{m-1}} \rangle$. On the other hand, $\xi_n$ is by definition the $n$-th complete quotient of
\[ \sqrt{d} = \langle a_0; a_1, a_2, \ldots, a_m, a_1, a_2, \ldots, a_m, \ldots \rangle, \]
so writing $n = mj + \ell$ where $j = 0, 1, 2, \ldots$ and $1 \leq \ell \leq m$, going out $n$ slots after $a_0$, we see that
\[ \xi_n = \langle \overline{a_\ell; a_{\ell+1}, a_{\ell+2}, \ldots, a_m, a_1, \ldots, a_{\ell-1}} \rangle. \]
Comparing this with $\xi_n = \langle \overline{a_m; a_1, a_2, \ldots, a_{m-1}} \rangle$, we must have $\ell = m$, so $n = mj + m = m(j + 1)$ is a multiple of $m$.
Assume now that $n$ is a multiple of $m$; say $n = mk$. Then going out $n = mk$ slots to the right of $a_0$ in the continued fraction expansion of $\sqrt{d}$ we get $\xi_n = \langle \overline{a_m; a_1, a_2, \ldots, a_{m-1}} \rangle$. Thus, $\xi_n = a_0 + \sqrt{d}$ by (8.55). Since $\xi_n = (\alpha_n + \sqrt{d})/\beta_n$ also, it follows that $\beta_n = 1$ and our proof is complete. □
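
The symmetry in Theorem 8.34 is easy to observe numerically. The following is a minimal Python sketch, not taken from the text. It assumes the standard integer recurrence $\alpha_{n+1} = a_n\beta_n - \alpha_n$, $\beta_{n+1} = (d - \alpha_{n+1}^2)/\beta_n$, $a_{n+1} = \lfloor (a_0 + \alpha_{n+1})/\beta_{n+1} \rfloor$ with $\alpha_0 = 0$, $\beta_0 = 1$, which we take to be the one behind Theorem 8.29 (that theorem is not reproduced in this section). The function name `sqrt_cf` is hypothetical.

```python
from math import isqrt

def sqrt_cf(d):
    """Return (a0, block, betas) for sqrt(d), d a positive non-square integer.

    block is one period [a1, ..., am]; betas is [beta_0, ..., beta_m].
    Assumes the standard recurrence for alpha_n, beta_n (see lead-in).
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta, a = 0, 1, a0          # alpha_0, beta_0, a_0
    block, betas = [], [1]
    while True:
        alpha = a * beta - alpha       # alpha_{n+1}
        beta = (d - alpha * alpha) // beta   # beta_{n+1}, an exact division
        a = (a0 + alpha) // beta       # a_{n+1} = floor(xi_{n+1})
        block.append(a)
        betas.append(beta)
        if beta == 1:                  # first n > 0 with beta_n = 1: n is the period
            return a0, block, betas

a0, block, betas = sqrt_cf(19)
print(a0, block)   # 4 [2, 1, 3, 1, 2, 8]
print(betas)       # [1, 3, 5, 2, 5, 3, 1]
```

In this run the block without its last entry reads the same in both directions, the last entry is $2a_0 = 8$, and $\beta_n = 1$ occurs only at $n = 0$ and $n = 6$, as the theorem predicts.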

Exercises 8.8.
1. Find the canonical continued fraction expansions for
\[ \text{(a) } \sqrt{29}, \qquad \text{(b) } \frac{1 + \sqrt{13}}{2}, \qquad \text{(c) } \frac{2 + \sqrt{5}}{3}. \]
2. Find the values of the following continued fractions:

(a) h3; 2, 6i , (b) h1; 2, 3i , (c) h1; 2, 3i , (d) h2; 5, 1, 3, 5i.

3. Let m, n ∈ N. Find the quadratic irrational numbers represented by

(a) hni = hn; n, n, n, . . .i , (b) hn; 1i , (c) hn; n + 1i , (d) hm; ni.
8.9. Archimedes’ crazy cattle conundrum and diophantine equations


Archimedes of Syracuse (287–212 B.C.) was known to think in preposterous proportions. In The Sand Reckoner [159, p. 420], a fun story written by Archimedes, he concluded that if he could fill the universe with grains of sand, there would be approximately $8 \times 10^{63}$ grains! According to Pappus of Alexandria (290–350), at one time Archimedes said (see [58, p. 15]) “Give me a place to stand on, and I will move the earth!” In the following we shall look at a cattle problem proposed by Archimedes, whose solution involves approximately $8 \times 10^{206544}$ cattle! If you feel moooooooved to read more on Archimedes’ cattle, see [155], [233], [19], [248], and [135].
8.9.1. Archimedes’ crazy cattle conundrum. Here is a poem written by
Archimedes to students at Alexandria in a letter to Eratosthenes of Cyrene (276
B.C.–194 B.C.). (The following is adapted from [98], as written in [19].)
Compute, O stranger! the number of cattle of Helios, which once
grazed on the plains of Sicily, divided according to their color, to
wit:
(1) White bulls = 1/2 black bulls + 1/3 black bulls + yellow bulls
(2) Black bulls = 1/4 spotted bulls + 1/5 spotted bulls + yellow bulls
(3) Spotted bulls = 1/6 white bulls + 1/7 white bulls + yellow bulls
(4) White cows = 1/3 black herd + 1/4 black herd (here, “herd” = bulls + cows)
(5) Black cows = 1/4 spotted herd + 1/5 spotted herd
(6) Dappled cows = 1/5 yellow herd + 1/6 yellow herd
(7) Yellow cows = 1/6 white herd + 1/7 white herd
He who can answer the above is no novice in numbers. Never-
theless he is not yet skilled in wise calculations! But come consider
also all the following numerical relations between the Oxen of the
Sun:
(8) If the white bulls were combined with the black bulls they
would be in a figure equal in depth and breadth and the
far stretching plains of Sicily would be covered by the square
formed by them.
(9) Should the yellow and spotted bulls be collected in one
place, they would stand, if they ranged themselves one after
another, completing the form of an equilateral triangle.
If thou discover the solution of this at the same time; if thou
grasp it with thy brain; and give correctly all the numbers; O
Stranger! go and exult as conqueror; be assured that thou art by
all means proved to have an abundance of knowledge in this science.
To solve this puzzle, we need to turn it into mathematics! Let W, X, Y, Z denote
the number of white, black, yellow, and spotted bulls, respectively, and w, x, y, z
for the number of white, black, yellow, and spotted cows, respectively.
The conditions (1)–(7) can be written as
\[ \text{(1) } W = \Bigl(\frac12 + \frac13\Bigr) X + Y, \qquad \text{(2) } X = \Bigl(\frac14 + \frac15\Bigr) Z + Y, \]
\[ \text{(3) } Z = \Bigl(\frac16 + \frac17\Bigr) W + Y, \qquad \text{(4) } w = \Bigl(\frac13 + \frac14\Bigr) (X + x), \]
\[ \text{(5) } x = \Bigl(\frac14 + \frac15\Bigr) (Z + z), \qquad \text{(6) } z = \Bigl(\frac15 + \frac16\Bigr) (Y + y), \]
\[ \text{(7) } y = \Bigl(\frac16 + \frac17\Bigr) (W + w). \]

[Figure: left panel, cattle form a square; right panel, cattle form a triangle with rows of 1, 2, 3, 4.]
Figure 8.2. With the dots as bulls, on the left, the number of bulls is a square number ($4^2$ in this case) and the number of bulls on the right is a triangular number ($1 + 2 + 3 + 4$ in this case).

Now how do we interpret (8) and (9)? We will interpret (8) as meaning that the number of white and black bulls should be a square number (a perfect square); see the left picture in Figure 8.2. A triangular number is a number of the form
\[ 1 + 2 + 3 + 4 + \cdots + n = \frac{n(n + 1)}{2}, \]
for some $n$. Then we will interpret (9) as meaning that the number of yellow and spotted bulls should be a triangular number; see the right picture in Figure 8.2. Thus, (8) and (9) become
\[ \text{(8) } W + X = \text{a square number}, \qquad \text{(9) } Y + Z = \text{a triangular number}. \]
In summary: We want to find integers W, X, Y, Z, w, x, y, z (here we assume
there is no such thing as “fractional cattle”) solving equations (1)–(9). Now to the
solution of Archimedes’ cattle problem. First of all, equations (1)–(7) are just linear
equations so these equations can be solved using simple linear algebra. Instead of
solving these equations by hand, which will probably take a few hours, it might be
best to use a computer. Doing so you will find that in order for W, X, Y, Z, w, x, y, z
to solve (1)–(7), they must be of the form
\[ \begin{aligned} W &= 10366482\,k, & X &= 7460514\,k, & Y &= 4149387\,k, & Z &= 7358060\,k, \\ w &= 7206360\,k, & x &= 4893246\,k, & y &= 5439213\,k, & z &= 3515820\,k, \end{aligned} \tag{8.56} \]
where $k$ can equal 1, 2, 3, .... Thus, in order for us to fulfill conditions (1)–(7), we would have at the very least, setting $k = 1$,
\[ 10366482 + 7460514 + 4149387 + 7358060 + 7206360 + 4893246 + 5439213 + 3515820 = 50389082 \approx 50 \text{ million cattle!} \]
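
The “use a computer” step can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the computation used in the text: we normalise $Y = 1$, solve the three bull equations and then the four cow equations exactly with rational arithmetic, and finally clear denominators. The helper `solve` and all variable names are hypothetical.

```python
from fractions import Fraction as F
from math import lcm

def solve(A, b):
    """Solve the square system A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [[F(v) for v in row] + [F(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Fractions appearing in conditions (1)-(7).
c23 = F(1, 2) + F(1, 3)   # (1)
c45 = F(1, 4) + F(1, 5)   # (2) and (5)
c67 = F(1, 6) + F(1, 7)   # (3) and (7)
c34 = F(1, 3) + F(1, 4)   # (4)
c56 = F(1, 5) + F(1, 6)   # (6)

Y = F(1)  # normalisation (our assumption); rescaled to integers below

# Bulls, unknowns (W, X, Z): equations (1), (2), (3).
W, X, Z = solve([[1, -c23, 0], [0, 1, -c45], [-c67, 0, 1]], [Y, Y, Y])

# Cows, unknowns (w, x, z, y): equations (4), (5), (6), (7).
w, x, z, y = solve(
    [[1, -c34, 0, 0],
     [0, 1, -c45, 0],
     [0, 0, 1, -c56],
     [-c67, 0, 0, 1]],
    [c34 * X, c45 * Z, c56 * Y, c67 * W],
)

rational = [W, X, Y, Z, w, x, y, z]
scale = lcm(*(v.denominator for v in rational))
herd = [int(v * scale) for v in rational]   # smallest positive integer solution
print(herd, sum(herd))
```

The list `herd` is ordered as W, X, Y, Z, w, x, y, z, and it should reproduce the $k = 1$ values in (8.56).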
Now we are “no novice in numbers!” Nevertheless we are not yet skilled in wise
calculations! To be skilled, we still have to satisfy conditions (8) and (9). For (8),
this means
\[ W + X = 10366482\,k + 7460514\,k = 17826996\,k = \text{a square number}. \]
Factoring $17826996 = 2^2 \cdot 3 \cdot 11 \cdot 29 \cdot 4657$ into its prime factors, we see that we must have
\[ 2^2 \cdot 3 \cdot 11 \cdot 29 \cdot 4657\, k = (\cdots)^2, \]
a square of an integer. Thus, we need $3 \cdot 11 \cdot 29 \cdot 4657\, k$ to be a square, which holds if and only if
\[ k = 3 \cdot 11 \cdot 29 \cdot 4657\, m^2 = 4456749\, m^2 \]
for some integer $m$. Plugging this value into (8.56), we get
\[ \begin{aligned} W &= 46200808287018\, m^2, & X &= 33249638308986\, m^2, \\ Y &= 18492776362863\, m^2, & Z &= 32793026546940\, m^2, \\ w &= 32116937723640\, m^2, & x &= 21807969217254\, m^2, \\ y &= 24241207098537\, m^2, & z &= 15669127269180\, m^2, \end{aligned} \tag{8.57} \]
where $m$ can equal 1, 2, 3, .... Thus, in order for us to fulfill conditions (1)–(8), we would have at the very least, setting $m = 1$,
\[ 46200808287018 + 33249638308986 + 18492776362863 + 32793026546940 + 32116937723640 + 21807969217254 + 24241207098537 + 15669127269180 = 2.2457\ldots \times 10^{14} \approx 225 \text{ trillion cattle!} \]
It now remains to satisfy condition (9):

\[ Y + Z = 18492776362863\, m^2 + 32793026546940\, m^2 = 51285802909803\, m^2 = \frac{\ell(\ell + 1)}{2}, \]
for some integer $\ell$. Multiplying both sides by 8 and adding 1, we obtain
\[ 8 \cdot 51285802909803\, m^2 + 1 = 4\ell^2 + 4\ell + 1 = (2\ell + 1)^2 = n^2, \]
where $n = 2\ell + 1$. Since $8 \cdot 51285802909803 = 410286423278424$, we finally conclude that conditions (1)–(9) are all fulfilled if we can find integers $m$, $n$ satisfying the equation
\[ n^2 - 410286423278424\, m^2 = 1. \tag{8.58} \]
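
The arithmetic leading from (8.56) to the coefficient in (8.58) can be double-checked mechanically. The sketch below is our own illustration: it finds the smallest multiplier making $17826996\,k$ a square by trial division (the function name `squarefree_part` is hypothetical) and then rebuilds the coefficient $8\,(Y + Z)$.

```python
from math import isqrt

def squarefree_part(n):
    """Product of the primes dividing n to an odd power (trial division)."""
    s, p = 1, 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e % 2:
            s *= p
        p += 1
    return s * n   # leftover n is 1 or a single prime

WX = 10366482 + 7460514      # coefficient of k in W + X, from (8.56)
YZ = 4149387 + 7358060       # coefficient of k in Y + Z, from (8.56)

k = squarefree_part(WX)      # smallest k making WX * k a perfect square
D = 8 * YZ * k               # coefficient of m^2 in the Pell equation

print(k, isqrt(WX * k) ** 2 == WX * k, D)
# 4456749 True 410286423278424
```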
This is commonly called a Pell equation and is an example of a diophantine equation. As we’ll see in the next subsection, we can solve this equation by simply (!) finding the simple continued fraction expansion of $\sqrt{410286423278424}$. The calculations involved are just sheer madness, but they can be done and have been done [19], [248]. In the end, we find that the smallest total number of cattle which satisfy (1)–(9) is a number with 206545 digits (!) and is equal to
\[ 7760271406 \ldots (\text{206525 other digits go here}) \ldots 9455081800 \approx 8 \times 10^{206544}. \]
We are now skilled in wise calculations! A copy of this number is printed on 42
computer sheets and has been deposited in the Mathematical Tables of the journal
Mathematics of Computation if you are interested.

8.9.2. Pell’s equation. Generalizing the cattle equation (8.58), we call a diophantine equation of the form
\[ x^2 - d\, y^2 = 1 \tag{8.59} \]
a Pell equation. Note that $(x, y) = (1, 0)$ solves this equation. This solution is called the trivial solution; the other solutions are not so easily attained. We remark that Pell’s equation was named by Euler after John Pell (1611–1685), although Brahmagupta⁸ (598–670) studied this equation a thousand years earlier [36, p. 221]. In any case, we shall see that the continued fraction expansion of $\sqrt{d}$ plays an important role in solving this equation. We note that if $(x, y)$ solves (8.59), then trivially so do $(\pm x, \pm y)$ because of the squares in (8.59); thus, we usually restrict ourselves to the positive solutions.
Recall that the continued fraction expansion for $\sqrt{d}$ has the complete quotients $\xi_n$ and partial quotients $a_n$ determined by
\[ \xi_n = \frac{\alpha_n + \sqrt{d}}{\beta_n}, \qquad a_n = \lfloor \xi_n \rfloor, \]
where $\alpha_n$ and $\beta_n$ are integers defined in Theorem 8.29. The exact forms of these integers are not important; what is important is that $\beta_n$ never equals $-1$ and $\beta_n = +1$ if and only if $n$ is a multiple of the period of $\sqrt{d}$, as we saw in Theorem 8.34. The following lemma shows how the convergents of $\sqrt{d}$ enter Pell’s equation.

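Lemma 8.35. If $p_n/q_n$ denotes the $n$-th convergent of $\sqrt{d}$, then for all $n = 0, 1, 2, \ldots$, we have
\[ p_n^2 - d\, q_n^2 = (-1)^{n+1} \beta_{n+1}. \]

Proof. Since we can write $\sqrt{d} = \langle a_0; a_1, a_2, a_3, \ldots, a_n, \xi_{n+1} \rangle$ and $\xi_{n+1} = (\alpha_{n+1} + \sqrt{d})/\beta_{n+1}$, by (8.19) of Corollary 8.6, we have
\[ \sqrt{d} = \frac{\xi_{n+1} p_n + p_{n-1}}{\xi_{n+1} q_n + q_{n-1}} = \frac{(\alpha_{n+1} + \sqrt{d})\, p_n + \beta_{n+1} p_{n-1}}{(\alpha_{n+1} + \sqrt{d})\, q_n + \beta_{n+1} q_{n-1}}. \]
Multiplying both sides by the denominator of the right-hand side, we get
\[ \sqrt{d}\,(\alpha_{n+1} + \sqrt{d})\, q_n + \sqrt{d}\, \beta_{n+1} q_{n-1} = (\alpha_{n+1} + \sqrt{d})\, p_n + \beta_{n+1} p_{n-1} \]
\[ \implies d q_n + (\alpha_{n+1} q_n + \beta_{n+1} q_{n-1}) \sqrt{d} = (\alpha_{n+1} p_n + \beta_{n+1} p_{n-1}) + p_n \sqrt{d}. \]
Equating coefficients, we obtain
\[ d q_n = \alpha_{n+1} p_n + \beta_{n+1} p_{n-1} \quad \text{and} \quad \alpha_{n+1} q_n + \beta_{n+1} q_{n-1} = p_n. \]
Multiplying the first equation by $q_n$ and the second equation by $p_n$ and equating the $\alpha_{n+1} p_n q_n$ terms in each resulting equation, we obtain
\[ d q_n^2 - \beta_{n+1} p_{n-1} q_n = p_n^2 - \beta_{n+1} p_n q_{n-1} \]
\[ \implies p_n^2 - d\, q_n^2 = (p_n q_{n-1} - p_{n-1} q_n) \cdot \beta_{n+1} = (-1)^{n+1} \cdot \beta_{n+1}, \]
where we used that $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} = (-1)^{n+1}$ from Corollary 8.6. □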
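
Lemma 8.35 can be spot-checked numerically. The sketch below is our own and rests on two assumptions not restated in this section: the usual convergent recurrence $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$ with $p_{-1} = 1$, $q_{-1} = 0$, and the standard integer recurrence for $\alpha_n$, $\beta_n$ (with $\alpha_0 = 0$, $\beta_0 = 1$). The function name `check_lemma` is hypothetical.

```python
from math import isqrt

def check_lemma(d, terms=20):
    """Return triples (n, p_n^2 - d q_n^2, (-1)^(n+1) beta_{n+1}) for n < terms."""
    a0 = isqrt(d)
    alpha, beta, a = 0, 1, a0        # alpha_0, beta_0, a_0
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = a0, 1                     # p_0, q_0
    rows = []
    for n in range(terms):
        alpha = a * beta - alpha               # alpha_{n+1}
        beta = (d - alpha * alpha) // beta     # beta_{n+1}
        rows.append((n, p * p - d * q * q, (-1) ** (n + 1) * beta))
        a = (a0 + alpha) // beta               # a_{n+1}
        p_prev, p = p, a * p + p_prev          # p_{n+1}
        q_prev, q = q, a * q + q_prev          # q_{n+1}
    return rows

for n, lhs, rhs in check_lemma(19, terms=6):
    print(n, lhs, rhs)
```

For $d = 19$ the first row is $(0, -3, -3)$, matching $4^2 - 19 \cdot 1^2 = -3 = (-1)^1 \beta_1$ with $\beta_1 = 3$.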

Next, we show that all solutions of Pell’s equation can be found via the convergents of $\sqrt{d}$.

Theorem 8.36. Let $p_n/q_n$ denote the $n$-th convergent of $\sqrt{d}$ and let $m$ be the period of $\sqrt{d}$. Then the positive integer solutions to
\[ x^2 - d\, y^2 = 1 \]
are precisely numerators and denominators of the odd convergents of $\sqrt{d}$ of the form $x = p_{nm-1}$ and $y = q_{nm-1}$, where $n > 0$ is any positive integer for $m$ even and $n > 0$ is even for $m$ odd.

⁸ A person who can, within a year, solve $x^2 - 92y^2 = 1$ is a mathematician. Brahmagupta (598–670).
Proof. We prove our theorem in two steps.
Step 1: We first prove that if $x^2 - d\, y^2 = 1$ with $y > 0$, then $x/y$ is a convergent of $\sqrt{d}$. To see this, observe that since $1 = x^2 - d\, y^2 = (x - \sqrt{d}\, y)(x + \sqrt{d}\, y)$, we have $x - \sqrt{d}\, y = 1/(x + \sqrt{d}\, y)$, so
\[ \left| \frac{x}{y} - \sqrt{d} \right| = \frac{|x - \sqrt{d}\, y|}{y} = \frac{1}{y\, |x + \sqrt{d}\, y|}. \]
Also, $x^2 = d\, y^2 + 1 > d\, y^2$ implies (for a positive solution, where $x > 0$) that $x > \sqrt{d}\, y$, which implies
\[ x + \sqrt{d}\, y > \sqrt{d}\, y + \sqrt{d}\, y = 2 \sqrt{d}\, y. \]
Hence,
\[ \left| \frac{x}{y} - \sqrt{d} \right| = \frac{1}{y\, |x + \sqrt{d}\, y|} < \frac{1}{y \cdot 2 \sqrt{d}\, y} = \frac{1}{2 y^2 \sqrt{d}} \implies \left| \frac{x}{y} - \sqrt{d} \right| < \frac{1}{2 y^2}. \]
By Dirichlet’s theorem 8.21, it follows that $x/y$ must be a convergent of $\sqrt{d}$.
Step 2: We now finish the proof. By Step 1 we already know that every solution must be a convergent, so we only need to look for convergents $(p_k, q_k)$ that make $p_k^2 - d\, q_k^2 = 1$. To this end, recall from Lemma 8.35 that
\[ p_{k-1}^2 - d\, q_{k-1}^2 = (-1)^k \beta_k, \]
where $\beta_k$ never equals $-1$ and $\beta_k = 1$ if and only if $k$ is a multiple of $m$, the period of $\sqrt{d}$. In particular, $p_{k-1}^2 - d\, q_{k-1}^2 = 1$ if and only if $(-1)^k \beta_k = 1$, if and only if $\beta_k = 1$ and $k$ is even, if and only if $k$ is a multiple of $m$ and $k$ is even. This holds if and only if $k = mn$ where $n > 0$ is any positive integer for $m$ even and $n > 0$ is even for $m$ odd. This completes our proof. □
The fundamental solution of Pell’s equation is the “smallest” positive solution of Pell’s equation; here, a solution $(x, y)$ is called positive if $x, y > 0$. Explicitly, the fundamental solution is $(p_{m-1}, q_{m-1})$ for an even period $m$ of $\sqrt{d}$ or $(p_{2m-1}, q_{2m-1})$ for an odd period $m$.

Example 8.30. Consider the equation $x^2 - 3y^2 = 1$. Since $\sqrt{3} = \langle 1; \overline{1, 2} \rangle$ has period $m = 2$, our theorem says that the positive solutions of $x^2 - 3y^2 = 1$ are precisely $x = p_{2n-1}$ and $y = q_{2n-1}$ for all $n > 0$; that is, $(p_1, q_1), (p_3, q_3), (p_5, q_5), \ldots$. Now the convergents of $\sqrt{3}$ are

n     0   1   2   3   4    5    6    7
p_n   1   2   5   7   19   26   71   97
q_n   1   1   3   4   11   15   41   56

In particular, the fundamental solution is (2, 1) and the rest of the positive solutions are (7, 4), (26, 15), (97, 56), .... Just to verify a couple of entries:
\[ 2^2 - 3 \cdot 1^2 = 4 - 3 = 1 \]
and
\[ 7^2 - 3 \cdot 4^2 = 49 - 3 \cdot 16 = 49 - 48 = 1, \]
and one can continue verifying that the odd convergents give solutions.
Example 8.31. For another example, consider the equation $x^2 - 13\, y^2 = 1$. In this case, we find that $\sqrt{13} = \langle 3; \overline{1, 1, 1, 1, 6} \rangle$ has period $m = 5$. Thus, our theorem says that the positive solutions of $x^2 - 13y^2 = 1$ are precisely $x = p_{5n-1}$ and $y = q_{5n-1}$ for all $n > 0$ even; that is, $(p_9, q_9), (p_{19}, q_{19}), (p_{29}, q_{29}), \ldots$. The convergents of $\sqrt{13}$ are

n     0   1   2   3    4    5     6     7     8     9
p_n   3   4   7   11   18   119   137   256   393   649
q_n   1   1   2   3    5    33    38    71    109   180

In particular, the fundamental solution is (649, 180).
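
The search for the fundamental solution described by Theorem 8.36 is short to automate. The following Python sketch is ours, not the text’s: it walks through the convergents $p_n/q_n$ of $\sqrt{d}$ and stops at the first one with $p_n^2 - d\, q_n^2 = 1$. It assumes the standard recurrences for the partial quotients and convergents (with $p_{-1} = 1$, $q_{-1} = 0$), and the name `pell_fundamental` is hypothetical.

```python
from math import isqrt

def pell_fundamental(d):
    """Smallest positive (x, y) with x^2 - d y^2 = 1, d a positive non-square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta, a = 0, 1, a0       # alpha_0, beta_0, a_0
    p_prev, q_prev = 1, 0           # p_{-1}, q_{-1}
    p, q = a0, 1                    # p_0, q_0
    while p * p - d * q * q != 1:
        alpha = a * beta - alpha
        beta = (d - alpha * alpha) // beta
        a = (a0 + alpha) // beta    # next partial quotient
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q

print(pell_fundamental(3))    # (2, 1)
print(pell_fundamental(13))   # (649, 180)
```

Because the numerators and denominators of the convergents increase with $n$, the first convergent that solves the equation is the fundamental solution.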
8.9.3. Brahmagupta’s algorithm. Thus, to find solutions of Pell’s equation we just have to find certain convergents of $\sqrt{d}$. Finding all convergents is quite a daunting task (try finding the solution $(p_{19}, q_{19})$ for $\sqrt{13}$), but it turns out that all the positive solutions can be found from the fundamental solution.
Example 8.32. We know that the fundamental solution of $x^2 - 3y^2 = 1$ is (2, 1) and the rest of the positive solutions are (7, 4), (26, 15), (97, 56), .... Observe that
\[ (2 + 1 \cdot \sqrt{3})^2 = 4 + 4\sqrt{3} + 3 = 7 + 4\sqrt{3}. \]
Note that the second positive solution (7, 4) to $x^2 - 3y^2 = 1$ appears on the right. Now observe that
\[ (2 + 1 \cdot \sqrt{3})^3 = (2 + \sqrt{3})^2 (2 + \sqrt{3}) = (7 + 4\sqrt{3})(2 + \sqrt{3}) = 26 + 15\sqrt{3}. \]
Note that the third positive solution (26, 15) to $x^2 - 3y^2 = 1$ appears on the right. One may conjecture that the $n$-th positive solution $(x_n, y_n)$ to $x^2 - 3\, y^2 = 1$ is found by multiplying out
\[ x_n + y_n \sqrt{3} = (2 + 1 \cdot \sqrt{3})^n. \]
This is in fact correct, as the following theorem shows.
Theorem 8.37 (Brahmagupta’s algorithm). If $(x_1, y_1)$ is the fundamental solution of Pell’s equation
\[ x^2 - d\, y^2 = 1, \]
then all the other positive solutions $(x_n, y_n)$ can be obtained from the equation
\[ x_n + y_n \sqrt{d} = (x_1 + y_1 \sqrt{d})^n, \qquad n = 0, 1, 2, 3, \ldots. \]
Proof. To simplify this proof a little, we shall say that $\zeta = x + y\sqrt{d} \in \mathbb{Z}[\sqrt{d}]$ solves Pell’s equation to mean that $(x, y)$ solves Pell’s equation; similarly, we say $\zeta$ is a positive solution to mean that $x, y > 0$. Throughout this proof we shall use the following fact:
\[ \zeta \text{ solves Pell’s equation} \iff \zeta\, \overline{\zeta} = 1 \quad (\text{that is, } 1/\zeta = \overline{\zeta}). \tag{8.60} \]
This holds for the simple reason that
\[ \zeta\, \overline{\zeta} = (x + y\sqrt{d})(x - y\sqrt{d}) = x^2 - d\, y^2. \]
In particular, if we set $\alpha := x_1 + y_1\sqrt{d}$, then $\alpha\, \overline{\alpha} = 1$ because $(x_1, y_1)$ solves Pell’s equation. We now prove our theorem. We first note that
\[ (x_n + y_n\sqrt{d})\, \overline{(x_n + y_n\sqrt{d})} = \alpha^n \cdot \overline{\alpha^n} = \alpha^n \cdot (\overline{\alpha})^n = (\alpha \cdot \overline{\alpha})^n = 1^n = 1, \]
which, in view of (8.60), shows that $(x_n, y_n)$ solves Pell’s equation. Now suppose that $\xi \in \mathbb{Z}[\sqrt{d}]$ is a positive solution to Pell’s equation; we must show that $\xi$ is some power of $\alpha$. To this end, note that $\alpha \leq \xi$ because $\alpha = x_1 + y_1\sqrt{d}$ and $(x_1, y_1)$ is the smallest positive solution of Pell’s equation. Since $1 < \alpha$, it follows that $\alpha^k \to \infty$ as $k \to \infty$, so we can choose $n \in \mathbb{N}$ to be the smallest natural number such that $\xi < \alpha^{n+1}$. Then, $\alpha^n \leq \xi < \alpha^{n+1}$, so dividing by $\alpha^n$, we obtain
\[ 1 \leq \eta < \alpha \quad \text{where} \quad \eta := \frac{\xi}{\alpha^n} = \xi \cdot (\overline{\alpha})^n, \]
where we used that $1/\alpha = \overline{\alpha}$ from (8.60). Since $\mathbb{Z}[\sqrt{d}]$ is a ring (Lemma 8.30), we know that $\eta = \xi \cdot (\overline{\alpha})^n \in \mathbb{Z}[\sqrt{d}]$ as well. Moreover, $\eta$ solves Pell’s equation because
\[ \eta\, \overline{\eta} = \xi \cdot (\overline{\alpha})^n \cdot \overline{\xi} \cdot \alpha^n = (\xi\, \overline{\xi}) \cdot (\alpha\, \overline{\alpha})^n = 1 \cdot 1 = 1. \]
We shall prove that $\eta = 1$, which shows that $\xi = \alpha^n$. To prove this, observe that from $1 \leq \eta < \alpha$ and the fact that $1/\eta = \overline{\eta}$ (since $\eta\, \overline{\eta} = 1$), we have
\[ 0 < \alpha^{-1} < \eta^{-1} \leq 1 \implies 0 < \alpha^{-1} < \overline{\eta} \leq 1. \]
Let $\eta = p + q\sqrt{d}$ where $p, q \in \mathbb{Z}$. Then the inequalities $1 \leq \eta < \alpha$ and $0 < \alpha^{-1} < \overline{\eta} \leq 1$ imply that
\[ 2p = (p + q\sqrt{d}) + (p - q\sqrt{d}) = \eta + \overline{\eta} \geq 1 + \alpha^{-1} > 0 \]
and
\[ 2q\sqrt{d} = (p + q\sqrt{d}) - (p - q\sqrt{d}) = \eta - \overline{\eta} \geq 1 - 1 = 0. \]
In particular, $p > 0$, $q \geq 0$, and $p^2 - dq^2 = 1$ (since $\eta$ solves Pell’s equation). Therefore, $(p, q) = (1, 0)$ or $(p, q)$ is a positive (numerator, denominator) of a convergent of $\sqrt{d}$. However, we know that $(x_1, y_1)$ is the smallest such positive (numerator, denominator), and that $p + q\sqrt{d} = \eta < \alpha = x_1 + y_1\sqrt{d}$. Therefore, we must have $(p, q) = (1, 0)$. This implies that $\eta = 1$ and hence $\xi = \alpha^n$. □
Example 8.33. Since (649, 180) is the fundamental solution to $x^2 - 13\, y^2 = 1$, all the positive solutions are given by
\[ x_n + y_n \sqrt{13} = (649 + 180\sqrt{13})^n. \]
For instance, for $n = 2$, we find that
\[ (649 + 180\sqrt{13})^2 = 842401 + 233640\sqrt{13} \implies (x_2, y_2) = (842401, 233640), \]
much easier than finding $(p_{19}, q_{19})$.
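
Brahmagupta’s algorithm amounts to repeated multiplication in $\mathbb{Z}[\sqrt{d}]$: since $(x + y\sqrt{d})(x_1 + y_1\sqrt{d}) = (x x_1 + d\, y y_1) + (x y_1 + y x_1)\sqrt{d}$, each new solution follows from the previous one by integer arithmetic only. A minimal Python sketch of this (the function name `pell_solutions` is our own, hypothetical choice):

```python
def pell_solutions(d, x1, y1, count):
    """First `count` positive solutions of x^2 - d y^2 = 1, generated as the
    powers (x1 + y1*sqrt(d))^n, n = 1, 2, ..., of the fundamental solution."""
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # Multiply x + y*sqrt(d) by x1 + y1*sqrt(d).
        x, y = x * x1 + d * y * y1, x * y1 + y * x1
    return solutions

print(pell_solutions(3, 2, 1, 4))       # [(2, 1), (7, 4), (26, 15), (97, 56)]
print(pell_solutions(13, 649, 180, 2))  # [(649, 180), (842401, 233640)]
```

The function takes the fundamental solution as input; it does not compute it.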
There are many cool applications of Pell’s equation explored in the exercises. Here’s one of my favorites (see Problem 8): Any prime of the form $p = 4k + 1$ is a sum of two squares. This was conjectured by Pierre de Fermat⁹ (1601–1665) in 1640 and proved by Euler in 1754. For example, 5, 13, 17 are such primes, and $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, and $17 = 1^2 + 4^2$.

⁹ [In the margin of his copy of Diophantus’ Arithmetica, Fermat wrote] To divide a cube into two other cubes, a fourth power or in general any power whatever into two powers of the same denomination above the second is impossible, and I have assuredly found an admirable proof of this, but the margin is too narrow to contain it. Pierre de Fermat (1601–1665). Fermat’s claim in this marginal note, later to be called “Fermat’s last theorem,” remained an unsolved problem in mathematics until 1995 when Andrew Wiles (1953 – ) finally proved it.

Exercises 8.9.
1. Find the fundamental solutions to the equations
\[ \text{(a) } x^2 - 8\, y^2 = 1, \qquad \text{(b) } x^2 - 5\, y^2 = 1, \qquad \text{(c) } x^2 - 7\, y^2 = 1. \]
Using the fundamental solution, find the next two solutions.
2. (Pythagorean triples) (Cf. [174]) Here is a nice problem solvable using continued fractions. A Pythagorean triple consists of three natural numbers $(x, y, z)$ such that $x^2 + y^2 = z^2$. For example, (3, 4, 5), (5, 12, 13), and (8, 15, 17) are Pythagorean triples. (Can you find more?) The first example (3, 4, 5) has the property that the first two are consecutive integers; here are some steps to find more Pythagorean triples of this sort.
(i) Show that $(x, y, z)$ is a Pythagorean triple with $y = x + 1$ if and only if
\[ (2x + 1)^2 - 2z^2 = -1. \]
(For instance, (3, 4, 5) gives $7^2 - 2 \cdot 5^2 = -1$.)
(ii) By solving the equation $u^2 - 2\, v^2 = -1$, find the next three Pythagorean triples $(x, y, z)$ (after (3, 4, 5)) where $x$ and $y$ are consecutive integers.
3. (Triangular numbers) Here is another very nice problem that can be solved using continued fractions. Find all triangular numbers that are squares, where recall that a triangular number is of the form $1 + 2 + \cdots + n = n(n + 1)/2$. Here are some steps.
(i) Show that $n(n + 1)/2 = m^2$ if and only if
\[ (2n + 1)^2 - 8m^2 = 1. \]
(ii) By solving the Pell equation $x^2 - 8\, y^2 = 1$, find the first three triangular numbers that are squares.
4. In this problem we answer the question: For which $n \in \mathbb{N}$ is the standard deviation of the $2n + 1$ numbers $0, \pm 1, \ldots, \pm n$ an integer? Here, the standard deviation of any real numbers $x_1, \ldots, x_N$ is by definition the number $\sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \overline{x})^2}$ where $\overline{x}$ is the average of $x_1, \ldots, x_N$.
(i) Show that the standard deviation of $0, \pm 1, \ldots, \pm n$ equals $\sqrt{\frac{1}{3} n(n + 1)}$. Suggestion: The formula $1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ from Problem 3b of Exercises 2.2 might be helpful.
(ii) Therefore, we want $\frac{1}{3} n(n + 1) = y^2$ where $y \in \mathbb{N}$. If we put $x = 2n + 1$, prove that $\frac{1}{3} n(n + 1) = y^2$ if and only if $x^2 - 12y^2 = 1$.
(iii) Now solve the equation $x^2 - 12y^2 = 1$ to answer our question.
5. The diophantine equation $x^2 - d\, y^2 = -1$ (where $d > 0$ is not a perfect square) is also of interest. In this problem we determine when this equation has solutions. Following the proof of Theorem 8.36, prove the following statements.
(i) Show that if $(x, y)$ solves $x^2 - d\, y^2 = -1$ with $y > 0$, then $x/y$ is a convergent of $\sqrt{d}$.
(ii) Prove that $x^2 - d\, y^2 = -1$ has a solution if and only if the period of $\sqrt{d}$ is odd, in which case the nonnegative solutions are exactly $x = p_{nm-1}$ and $y = q_{nm-1}$ for all $n > 0$ odd.
6. Which of the following equations have solutions? If an equation has solutions, find the fundamental solution.
\[ \text{(a) } x^2 - 2\, y^2 = -1, \qquad \text{(b) } x^2 - 3\, y^2 = -1, \qquad \text{(c) } x^2 - 17\, y^2 = -1. \]
7. In this problem we prove that the diophantine equation $x^2 - p\, y^2 = -1$ always has a solution if $p$ is a prime number of the form $p = 4k + 1$ for an integer $k$. For instance, since $13 = 4 \cdot 3 + 1$ and $17 = 4 \cdot 4 + 1$, $x^2 - 13y^2 = -1$ and $x^2 - 17y^2 = -1$ have solutions (as you already saw in the previous problem). Let $p = 4k + 1$ be prime.
(i) Let $(x_1, y_1)$ be the fundamental solution of $x^2 - p\, y^2 = 1$. Prove that $x_1$ and $y_1$ cannot both be even and cannot both be odd.
(ii) Show that the case $x_1$ is even and $y_1$ is odd cannot happen. Suggestion: Write $x_1 = 2a$ and $y_1 = 2b + 1$ and plug this into $x_1^2 - p\, y_1^2 = 1$.
(iii) Thus, we may write $x_1 = 2a + 1$ and $y_1 = 2b$. Show that $p\, b^2 = a(a + 1)$. Conclude that $p$ must divide $a$ or $a + 1$.
(iv) Suppose that $p$ divides $a$; that is, $a = mp$ for an integer $m$. Show that $b^2 = m(mp + 1)$ and that $m$ and $mp + 1$ are relatively prime. Using this equality, prove that $m = s^2$ and $mp + 1 = t^2$ for integers $s, t$. Conclude that $t^2 - p\, s^2 = 1$ and derive a contradiction.
(v) Thus, it must be the case that $p$ divides $a + 1$. Using this fact and an argument similar to the one in the previous step, find a solution to $x^2 - p\, y^2 = -1$.
8. (Sum of squares) In this problem we prove the following incredible result of Euler: Every prime of the form $p = 4k + 1$ can be expressed as the sum of two squares.
(i) Let $p = 4k + 1$ be prime. Using the previous problem and Problem 5, prove that the period of $\sqrt{p}$ is odd and deduce that $\sqrt{p}$ has an expansion of the form
\[ \sqrt{p} = \langle a_0; \overline{a_1, a_2, \ldots, a_{\ell-1}, a_\ell, a_\ell, a_{\ell-1}, \ldots, a_1, 2a_0} \rangle. \]
(ii) Let $\eta$ be the complete quotient $\xi_{\ell+1}$:
\[ \eta := \xi_{\ell+1} = \langle \overline{a_\ell; a_{\ell-1}, \ldots, a_1, 2a_0, a_1, \ldots, a_{\ell-1}, a_\ell} \rangle. \]
Prove that $-1 = \eta \cdot \overline{\eta}$. Suggestion: Use Lemma 8.33.
(iii) Finally, writing $\eta = (a + \sqrt{p})/b$ (why does $\eta$ have this form?) show that $p = a^2 + b^2$.

8.10. Epilogue: Transcendental numbers, π, e, and where’s calculus?


It’s time to get a tissue box, because, unfortunately, our adventures through
this book have come to an end. In this section we wrap up this book with a
discussion on transcendental numbers and continued fractions.
8.10.1. Approximable numbers. A real number $\xi$ is said to be approximable (by rationals) to order $n \geq 1$ if there exists a constant $C$ and infinitely many rational numbers $p/q$ in lowest terms with $q > 0$ such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{C}{q^n}. \tag{8.61} \]
Observe that if $\xi$ is approximable to order $n > 1$, then it is automatically approximable to order $n - 1$; this is because
\[ \left| \xi - \frac{p}{q} \right| < \frac{C}{q^n} \leq \frac{C}{q^{n-1}}. \]
Similarly, $\xi$ is approximable to any order $k$ with $1 \leq k \leq n$. Intuitively, the approximability order $n$ measures how closely we can surround $\xi$ with “good” rational numbers, that is, rational numbers having small denominators. To see what this means, suppose that $\xi$ is only approximable to order 1. Thus, there is a $C$ and infinitely many rational numbers $p/q$ in lowest terms with $q > 0$ such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{C}{q}. \]
This inequality suggests that in order to find rational numbers very close to $\xi$, these rational numbers need to have large denominators to make $C/q$ small. However, if $\xi$ were approximable to order 1000, then there is a $C$ and infinitely many rational numbers $p/q$ in lowest terms with $q > 0$ such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{C}{q^{1000}}. \]
This inequality suggests that in order to find rational numbers very close to $\xi$, these rational numbers don’t need to have large denominators, because even for small $q$, the large power of 1000 will make $C/q^{1000}$ small. The following lemma shows that there is a limit to how closely we can surround algebraic numbers by “good” rational numbers.
Lemma 8.38. If $\xi$ is real algebraic of degree $n \geq 1$ (so $\xi$ is rational if $n = 1$), then there exists a constant $c > 0$ such that for all rational numbers $p/q \neq \xi$ with $q > 0$, we have
\[ \left| \xi - \frac{p}{q} \right| \geq \frac{c}{q^n}. \]
Proof. Assume that $f(\xi) = 0$ where
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \qquad a_k \in \mathbb{Z}, \]
and that no such polynomial function of lower degree has this property. First, we claim that $f(r) \neq 0$ for any rational number $r \neq \xi$. Indeed, if $f(r) = 0$ for some rational number $r \neq \xi$, then we can write $f(x) = (x - r) g(x)$ where $g$ is a polynomial of degree $n - 1$. Then $0 = f(\xi) = (\xi - r) g(\xi)$ implies, since $\xi \neq r$, that $g(\xi) = 0$. This implies that the degree of $\xi$ is $n - 1$, contradicting the fact that the degree of $\xi$ is $n$. Now for any rational $p/q \neq \xi$ with $q > 0$, we see that
\[ 0 \neq |f(p/q)| = \left| a_n \Bigl(\frac{p}{q}\Bigr)^n + a_{n-1} \Bigl(\frac{p}{q}\Bigr)^{n-1} + \cdots + a_1 \frac{p}{q} + a_0 \right| = \frac{|a_n p^n + a_{n-1} p^{n-1} q + \cdots + a_1 p q^{n-1} + a_0 q^n|}{q^n}. \]
The numerator is a nonnegative integer, which cannot be zero, so the numerator must be $\geq 1$. Therefore,
\[ |f(p/q)| \geq 1/q^n \quad \text{for all rational numbers } p/q \neq \xi \text{ with } q > 0. \tag{8.62} \]
Second, we claim that there is an $M > 0$ such that
\[ |x - \xi| \leq 1 \implies |f(x)| \leq M |x - \xi|. \tag{8.63} \]
Indeed, note that since $f(\xi) = 0$, we have
\[ f(x) = f(x) - f(\xi) = a_n (x^n - \xi^n) + a_{n-1} (x^{n-1} - \xi^{n-1}) + \cdots + a_1 (x - \xi). \]
Since
\[ x^k - \xi^k = (x - \xi)\, q_k(x), \qquad q_k(x) = x^{k-1} + x^{k-2} \xi + \cdots + x\, \xi^{k-2} + \xi^{k-1}, \]
plugging each of these, for $k = 1, 2, 3, \ldots, n$, into the previous equation for $f(x)$, we see that $f(x) = (x - \xi) h(x)$ where $h$ is a continuous function. In particular, since $[\xi - 1, \xi + 1]$ is a closed and bounded interval, there is an $M$ such that $|h(x)| \leq M$ for all $x \in [\xi - 1, \xi + 1]$. This proves our claim.
Finally, let $p/q \neq \xi$ be a rational number with $q > 0$. If $|\xi - p/q| > 1$, then
\[ \left| \xi - \frac{p}{q} \right| > 1 \geq \frac{1}{q^n}. \]
If $|\xi - p/q| \leq 1$, then by (8.62) and (8.63), we have
\[ \left| \xi - \frac{p}{q} \right| \geq \frac{1}{M} |f(p/q)| \geq \frac{1}{M} \frac{1}{q^n}. \]
Hence, $|\xi - p/q| \geq c/q^n$ for all rational $p/q \neq \xi$ with $q > 0$, where $c$ is the smaller of 1 and $1/M$. □
Let us form the contrapositive of the statement of this lemma: If $n \in \mathbb{N}$ and for all constants $c > 0$, there exists a rational number $p/q \neq \xi$ with $q > 0$ such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{c}{q^n}, \tag{8.64} \]
then $\xi$ is not algebraic of degree $n$. Since a transcendental number is a number that is not algebraic of any degree $n$, we can think of a transcendental number as a number that can be surrounded arbitrarily closely by “good” rational numbers. This leads us to Liouville numbers, to be discussed shortly, but before talking about these special transcendental numbers, we use our lemma to prove the following important result.
Theorem 8.39. A real algebraic number of degree $n$ is not approximable to order $n + 1$ (and hence not to any higher order). Moreover, a rational number is approximable to order 1 and a real number is irrational if and only if it is approximable to order 2.
Proof. Let $\xi$ be algebraic of degree $n \geq 1$ (so $\xi$ is rational if $n = 1$). Then by Lemma 8.38, there exists a constant $c$ such that for all rational numbers $p/q \neq \xi$ with $q > 0$, we have
\[ \left| \xi - \frac{p}{q} \right| \geq \frac{c}{q^n}. \]
It follows that $\xi$ is not approximable by rationals to order $n + 1$ because
\[ \left| \xi - \frac{p}{q} \right| < \frac{C}{q^{n+1}} \implies \frac{c}{q^n} < \frac{C}{q^{n+1}} \implies q < C/c. \]
Since there are only finitely many positive integers $q$ such that $q < C/c$, and for each such $q$ only finitely many integers $p$ with $|\xi - p/q| < C/q^{n+1}$, it follows that there are only finitely many fractions $p/q$ such that $|\xi - p/q| < C/q^{n+1}$.
Let $a/b$ be a rational number in lowest terms with $b \geq 1$; we shall prove that $a/b$ is approximable to order 1. (Note that we already know from our first statement that $a/b$ is not approximable to order 2.) From Theorem 8.9, we know that the equation $ax - by = 1$ has an infinite number of integer solutions $(x, y)$. The solutions $(x, y)$ are automatically relatively prime. Moreover, if $(x_0, y_0)$ is any one integral solution, then all solutions are of the form
\[ x = x_0 + bt, \qquad y = y_0 + at, \qquad t \in \mathbb{Z}. \]
Since $b \geq 1$ we can choose $t$ large so as to get infinitely many solutions with $x > 0$. With $x > 0$, we see that
\[ \left| \frac{a}{b} - \frac{y}{x} \right| = \left| \frac{ax - by}{bx} \right| = \frac{1}{bx} < \frac{2}{x}, \]
which shows that $a/b$ is approximable to order 1.
Finally, if a number is irrational, then it is approximable to order 2 from Dirichlet’s approximation theorem 8.21; conversely, if a number is approximable to order 2, then it must be irrational by the first statement of this theorem. □
Using this theorem we can prove that certain numbers must be irrational. For
instance, let {an } be any sequence of 0, 1’s where there are infinitely many 1’s.
Consider

X an
ξ= .
n=0
22n
8.10. EPILOGUE: TRANSCENDENTAL NUMBERS, π, e, AND WHERE’S CALCULUS? 467

Note that ξ is the real number with binary expansion a0 .0a1 0a2 0 · · · , with an in
the 2n -th decimal place and with zeros
Pn everywhere else. Any case, fix a natural
number n with an 6= 0 and let sn = k=0 a2kk be the n-th partial sum of this series.
2n
Then we can write sn as p/q where q = 22 . Observe that
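\begin{align*}
|\xi - s_n| &\le \frac{1}{2^{2^{n+1}}} + \frac{1}{2^{2^{n+2}}} + \frac{1}{2^{2^{n+3}}} + \frac{1}{2^{2^{n+4}}} + \cdots \\
&< \frac{1}{2^{2^{n+1}}} + \frac{1}{2^{2^{n+1}+1}} + \frac{1}{2^{2^{n+1}+2}} + \frac{1}{2^{2^{n+1}+3}} + \cdots \\
&= \frac{1}{2^{2^{n+1}}} \Bigl( 1 + \frac{1}{2^1} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots \Bigr) = \frac{2}{2^{2^{n+1}}} = \frac{2}{(2^{2^n})^2}.
\end{align*}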
In conclusion,
\[ |\xi - s_n| < \frac{2}{(2^{2^n})^2} = \frac{C}{q^2}, \]
where C = 2. Thus, ξ is approximable to order 2, and hence must be irrational.
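The estimate above can be sanity-checked with exact arithmetic. The sketch below is only an illustration under stated assumptions: the helper name `partial_sum`, the sample sequence with every $a_n = 1$, and the cut-off N are ours, and a long partial sum stands in for ξ itself. Since the stand-in is slightly smaller than ξ, this is a consistency check rather than a proof.

```python
from fractions import Fraction


def partial_sum(a, n):
    """s_n = sum_{k=0}^{n} a[k] / 2**(2**k), computed exactly."""
    return sum(Fraction(a[k], 2 ** (2 ** k)) for k in range(n + 1))


# Assumed sample sequence: a_n = 1 for every n. Any 0/1 sequence with
# infinitely many 1's would do. Truncating at N terms stands in for xi.
N = 9
a = [1] * (N + 1)
xi_approx = partial_sum(a, N)

for n in range(N):
    q = 2 ** (2 ** n)                      # denominator of s_n
    gap = xi_approx - partial_sum(a, n)    # approximates xi - s_n from below
    assert 0 < gap < Fraction(2, q * q)    # order 2 with C = 2
```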

8.10.2. Liouville numbers. Numbers that satisfy (8.64) with c = 1 are special:
A real number ξ is called a Liouville number, after Joseph Liouville (1809–1882),
if for every natural number n there is a rational number p/q ≠ ξ with q > 1
such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{q^n}. \]
These numbers are transcendental by our discussion around (8.64). Because this
fact is so important, we state this as a theorem.
Theorem 8.40 (Liouville’s theorem). Any Liouville number is transcendental.
Using Liouville’s theorem we can give many (in fact uncountably many; see
Problem 3) examples of transcendental numbers. Let $\{a_n\}$ be any sequence of
integers in 0, 1, . . . , 9 where there are infinitely many nonzero integers. Let
\[ \xi = \sum_{n=0}^{\infty} \frac{a_n}{10^{n!}}. \]
Since 0! = 1! = 1, the digits $a_0$ and $a_1$ share the first decimal place; setting the
$a_0/10$ term aside, ξ is the real number with decimal expansion
\[ 0.a_1 a_2 000 a_3 00000000000000000 a_4 \cdots, \]
with $a_n$ in the n!-th decimal place and with zeros everywhere else. Using Liouville’s
theorem we’ll show that ξ is transcendental. Fix a natural number n with $a_n \ne 0$
and let $s_n$ be the n-th partial sum of this series. Then $s_n$ can be written as p/q
where $q = 10^{n!} > 1$. Observe that

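\begin{align*}
|\xi - s_n| &\le \frac{9}{10^{(n+1)!}} + \frac{9}{10^{(n+2)!}} + \frac{9}{10^{(n+3)!}} + \frac{9}{10^{(n+4)!}} + \cdots \\
&< \frac{9}{10^{(n+1)!}} + \frac{9}{10^{(n+1)!+1}} + \frac{9}{10^{(n+1)!+2}} + \frac{9}{10^{(n+1)!+3}} + \cdots \\
&= \frac{9}{10^{(n+1)!}} \Bigl( 1 + \frac{1}{10^1} + \frac{1}{10^2} + \frac{1}{10^3} + \cdots \Bigr) \\
&= \frac{10}{10^{(n+1)!}} = \frac{10}{10^{n \cdot n!} \cdot 10^{n!}} \le \frac{1}{10^{n \cdot n!}}.
\end{align*}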
In conclusion,
\[ |\xi - s_n| < \frac{1}{(10^{n!})^n} = \frac{1}{q^n}, \]
so ξ is a Liouville number and therefore is transcendental.
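As a hedged numerical illustration of the Liouville inequality just derived, the following Python sketch uses exact fractions. The helper name `liouville_partial_sum`, the choice $a_n = 1$ for all n, and the cut-off N = 7 are assumptions made for this example; the truncated sum stands in for ξ, so the check is a consistency test and not a proof.

```python
from fractions import Fraction
from math import factorial


def liouville_partial_sum(a, n):
    """s_n = sum_{k=0}^{n} a[k] / 10**(k!), computed exactly."""
    return sum(Fraction(a[k], 10 ** factorial(k)) for k in range(n + 1))


# Assumed sample digits: a_n = 1 for every n; N-term truncation stands in for xi.
N = 7
a = [1] * (N + 1)
xi_approx = liouville_partial_sum(a, N)

for n in range(1, N):
    q = 10 ** factorial(n)                          # denominator of s_n
    gap = xi_approx - liouville_partial_sum(a, n)   # approximates xi - s_n
    assert 0 < gap < Fraction(1, q ** n)            # |xi - p/q| < 1/q^n
```

Python integers have arbitrary precision, so denominators such as $10^{5040}$ are handled exactly, which is why `Fraction` is used here instead of floating point.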
8.10.3. Continued fractions and the “most extreme” irrational of all
irrational numbers. We now show how continued fractions can be used to construct
transcendental numbers! This is achieved by the following simple observation.
Let $\xi = \langle a_0; a_1, \ldots \rangle$ be an irrational real number written as a simple continued
fraction and let $\{p_n/q_n\}$ be its convergents. Then by our fundamental approximation
theorem 8.18, we know that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}}. \]
Since
\[ q_n q_{n+1} = q_n (a_{n+1} q_n + q_{n-1}) \ge a_{n+1} q_n^2, \]
we see that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{a_{n+1} q_n^2}. \tag{8.65} \]
Thus, we can make the rational number $p_n/q_n$ approximate ξ as closely as we wish
by simply taking the next partial quotient $a_{n+1}$ larger. We use this observation in
the following theorem.
Theorem 8.41. Let ϕ : N → (0, ∞) be a function. Then there is an irrational
number ξ and infinitely many rational numbers p/q such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{\phi(q)}. \]
Proof. We define $\xi = \langle a_0; a_1, a_2, \ldots \rangle$ by choosing the $a_n$’s inductively as follows.
Let $a_0 \in \mathbb{N}$ be arbitrary. Assume that $a_0, \ldots, a_n$ have been chosen. With $q_n$
the denominator of $\langle a_0; a_1, \ldots, a_n \rangle$, choose (via the Archimedean property) $a_{n+1} \in \mathbb{N}$
such that
\[ a_{n+1} q_n^2 > \phi(q_n). \]
This defines $\{a_n\}$. Now defining $\xi := \langle a_0; a_1, a_2, \ldots \rangle$, by (8.65), for any natural
number n we have
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{a_{n+1} q_n^2} < \frac{1}{\phi(q_n)}. \]
This completes our proof. □
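The inductive choice of partial quotients in this proof is easy to carry out by machine. The sketch below is one possible rendering, not the author's: the function name `build_partial_quotients`, the sample choice ϕ(q) = q^5, and the restriction to integer-valued ϕ (so that integer division can be used) are all assumptions. It uses the standard convergent recurrence and checks that $q_n q_{n+1} > \phi(q_n)$, which is enough because $|\xi - p_n/q_n| < 1/(q_n q_{n+1})$.

```python
def build_partial_quotients(phi, steps, a0=1):
    """Choose a_1, ..., a_steps so that a_{n+1} * q_n**2 > phi(q_n).

    `phi` is assumed to return positive integers (hypothetical helper).
    """
    a = [a0]
    p_prev, q_prev = 1, 0      # p_{-1}, q_{-1}
    p, q = a0, 1               # p_0, q_0
    convergents = [(p, q)]
    for _ in range(steps):
        # Smallest positive integer a_next with a_next * q**2 > phi(q).
        a_next = phi(q) // (q * q) + 1
        a.append(a_next)
        # Recurrence: p_{n+1} = a_{n+1} p_n + p_{n-1}, and likewise for q.
        p, p_prev = a_next * p + p_prev, p
        q, q_prev = a_next * q + q_prev, q
        convergents.append((p, q))
    return a, convergents


def phi(q):
    """Assumed sample choice of phi for the demonstration."""
    return q ** 5


a, conv = build_partial_quotients(phi, 5)
for (p, q), (p_next, q_next) in zip(conv, conv[1:]):
    # |xi - p/q| < 1/(q * q_next), so q * q_next > phi(q) suffices.
    assert q * q_next > phi(q)
```

The partial quotients grow very quickly even for this modest ϕ, which reflects the observation after (8.65) that large partial quotients force very good rational approximations.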
Using this theorem we can easily find transcendental numbers. For example,
with $\phi(q) = e^q$, we can find an irrational ξ such that for infinitely many rational
numbers p/q, we have
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{e^q}. \]
Since for any n ∈ N, we have $e^q = \sum_{k=0}^{\infty} q^k/k! > q^n/n!$, it follows that for infinitely
many rational numbers p/q, we have
\[ \left| \xi - \frac{p}{q} \right| < \frac{\text{constant}}{q^n}. \]
In particular, ξ is transcendental.
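For readers who want the last step spelled out, here is a short worked version in LaTeX. It only unpacks the argument above together with the first statement of the theorem proved earlier in this section; the symbol m for the hypothetical degree of ξ is introduced here for illustration.

```latex
% Fix n. Since e^q > q^n / n! for every q > 0, we get
\[
  \left| \xi - \frac{p}{q} \right| < \frac{1}{e^q} < \frac{n!}{q^n}
\]
% for infinitely many p/q, so the "constant" may be taken to be C = n!,
% which does not depend on p/q. Hence \xi is approximable to order n
% for every natural number n.
%
% Suppose \xi were algebraic of degree m \ge 1. Then \xi would not be
% approximable to order m + 1, contradicting the above with n = m + 1.
% Therefore \xi is transcendental.
```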
As we have just seen, we can form transcendental numbers by choosing the
partial quotients in an infinite simple continued fraction to be very large; in this
sense, transcendental numbers are the irrational numbers which are “closest” to good
rational numbers. With this in mind, we can think of infinite continued fractions with
small partial quotients as far from being transcendental or far from rational. Since 1 is
the smallest natural number, we can consider the golden ratio
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \langle 1; 1, 1, 1, 1, 1, 1, 1, \ldots \rangle \]
as being the “most extreme” or “most irrational” of all irrational numbers in the
sense that it is the “farthest” irrational number from being transcendental or the
“farthest” irrational number from being rational.
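To get a feel for this heuristic, one can tabulate how well the convergents of the golden ratio approximate it. The Python sketch below is illustrative only; the name `golden_convergents` and the use of floating point for Φ are our assumptions. It prints $q_n^2 |\Phi - p_n/q_n|$, which settles near $1/\sqrt{5} \approx 0.447$ instead of shrinking toward zero, in contrast with (8.65), where a large $a_{n+1}$ makes the corresponding quantity small.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # floating-point stand-in for the golden ratio


def golden_convergents(count):
    """Convergents p_n/q_n of <1; 1, 1, ...> for n = 0, ..., count - 1."""
    p_prev, q_prev = 1, 0
    p, q = 1, 1
    out = [(p, q)]
    for _ in range(count - 1):
        # Every partial quotient equals 1, so the recurrence is Fibonacci-like.
        p, p_prev = p + p_prev, p
        q, q_prev = q + q_prev, q
        out.append((p, q))
    return out


for p, q in golden_convergents(16):
    scaled_error = q * q * abs(PHI - p / q)
    print(p, q, round(scaled_error, 6))
```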
8.10.4. What about π and e and what about calculus? Above we have
already seen examples (in fact, uncountably many; see Problem 3) of transcendental
numbers and we even know how to construct them using continued fractions.
However, these numbers seem in some sense to be “artificially” made. What about
numbers that are more “natural” such as π and e? Are these numbers transcendental?
In fact, these numbers do turn out to be transcendental, but the “easiest”
proofs of these facts need the technology of calculus (derivatives) [162, 163]!
Hopefully this might give one reason (amongst many others) to take more courses
in analysis where the calculus is taught. Advertisement: The book [136] is a
sequel to the book you’re holding, and in it is the next adventure through topology
and calculus; during our journey we’ll prove that π and e are transcendental.
However, if you choose to go on this adventure, we ask you to look back at all the
amazing things that we’ve encountered during these past chapters, everything
without using one single derivative or integral!
Exercises 8.10.
1. Given any integer b ≥ 2, prove that $\xi = \sum_{n=0}^{\infty} b^{-2^n}$ is irrational.
2. Let b ≥ 2 be an integer and let $\{a_n\}$ be any sequence of integers 0, 1, . . . , b − 1 where
there are infinitely many nonzero $a_n$’s. Prove that $\xi = \sum_{n=1}^{\infty} a_n b^{-n!}$ is transcendental.
3. Using a Cantor diagonal argument as in the proof of Theorem 3.36, prove that the set
of all numbers of the form $\xi = \sum_{n=0}^{\infty} a_n / 10^{n!}$, where $a_n \in \{0, 1, 2, \ldots, 9\}$, is uncountable.
That is, assume that the set of all such numbers is countable and construct a number
of the same sort not in the set. Since we already showed that all these numbers are
Liouville numbers, they are transcendental, so this argument provides another proof
that the set of all transcendental numbers is uncountable.
4. Going through the construction of Theorem 8.41, define ξ ∈ R such that if $\{p_n/q_n\}$ are
the convergents of its canonical continued fraction expansion, then for all n,
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n^n}. \]
Show that ξ is a Liouville number, and hence is transcendental.
Bibliography

1. Stephen Abbott, Understanding analysis, Undergraduate Texts in Mathematics, Springer-Verlag,
New York, 2001.
2. Aaron D. Abrams and Matteo J. Paris, The probability that (a, b) = 1, The College Math. J.
23 (1992), no. 1, 47.
3. E. S. Allen, The scientific work of Vito Volterra, Amer. Math. Monthly 48 (1941), 516–519.
4. Nathan Altshiller and J.J. Ginsburg, Solution to problem 460, Amer. Math. Monthly 24
(1917), no. 1, 32–33.
5. Robert N. Andersen, Justin Stumpf, and Julie Tiller, Let π be 3, Math. Mag. 76 (2003),
no. 3, 225–231.
6. Tom Apostol, Another elementary proof of euler’s formula for ζ(2k), Amer. Math. Monthly
80 (1973), no. 4, 425–431.
7. Tom M. Apostol, Mathematical analysis: a modern approach to advanced calculus, Addison-
Wesley Publishing Company, Inc., Reading, Mass., 1957.
8. Tom M. Apostol, Irrationality of the square root of two – a geometric proof, Amer. Math. Monthly
107 (2000), no. 9, 841–842.
9. R.C. Archibald, Mathematicians and music, Amer. Math. Monthly 31 (1924), no. 1, 1–25.
10. Jörg Arndt and Christoph Haenel, Pi—unleashed, second ed., Springer-Verlag, Berlin, 2001,
Translated from the 1998 German original by Catriona Lischka and David Lischka, With 1
CD-ROM (Windows).
11. Raymond Ayoub, Euler and the zeta function, Amer. Math. Monthly 81 (1974), no. 10,
1067–1086.
12. Bruce S. Babcock and John W. Dawson, Jr., A neglected approach to the logarithm, Two
Year College Math. J. 9 (1978), no. 3, 136–140.
13. D. H. Bailey, J. M. Borwein, P. B. Borwein, and S. Plouffe, The quest for pi, Math. Intelligencer
19 (1997), no. 1, 50–57.
14. W.W. Ball, Short account of the history of mathematics, fourth ed., Dover Publications Inc.,
New York, 1960.
15. J.M. Barbour, Music and ternary continued fractions, Amer. Math. Monthly 55 (1948),
no. 9, 545–555.
16. C.W. Barnes, Euler’s constant and e, Amer. Math. Monthly 91 (1984), no. 7, 428–430.
17. Robert G. Bartle and Donald R. Sherbert, Introduction to real analysis, second ed., John
Wiley & Sons Inc., New York, 1992.
18. A.F. Beardon, Sums of powers of integers, Amer. Math. Monthly 103 (1996), no. 3, 201–213.
19. A.H. Bell, The “cattle problem.” by Archimedies 251 b. c., Amer. Math. Monthly 2 (1885),
no. 5, 140–141.
20. Howard E. Bell, Proof of a fundamental theorem on sequences, Amer. Math. Monthly 71
(1964), no. 6, 665–666.
21. Jordan Bell, On the sums of series of reciprocals, Available at
http://arxiv.org/abs/math/0506415. Originally published as De summis serierum
reciprocarum, Commentarii academiae scientiarum Petropolitanae 7 (1740) 123–134 and
reprinted in Leonhard Euler, Opera Omnia, Series 1: Opera mathematica, Volume
14, Birkhäuser, 1992. Original text, numbered E41, is available at the Euler Archive,
http://www.eulerarchive.org.
22. W. W. Bell, Special functions for scientists and engineers, Dover Publications Inc., Mineola,
NY, 2004, Reprint of the 1968 original.
23. Richard Bellman, A note on the divergence of a series, Amer. Math. Monthly 50 (1943),
no. 5, 318–319.


24. Paul Benacerraf and Hilary Putnam (eds.), Philosophy of mathematics: selected readings,
Cambridge University Press, Cambridge, 1964.
25. Stanley J. Benkoski, The probability that k positive integers are relatively r-prime, J. Number
Theory 8 (1976), no. 2, 218–223.
26. Lennart Berggren, Jonathan Borwein, and Peter Borwein, Pi: a source book, third ed.,
Springer-Verlag, New York, 2004.
27. Bruce C. Berndt, Ramanujan’s notebooks, Math. Mag. 51 (1978), no. 3, 147–164.
28. N.M. Beskin, Fascinating fractions, Mir Publishers, Moscow, 1980, Translated by V.I. Kisln,
1986.
29. F. Beukers, A note on the irrationality of ζ(2) and ζ(3), Bull. London Math. Soc. 11 (1979),
no. 3, 268–272.
30. Ralph P. Boas, A primer of real functions, fourth ed., Carus Mathematical Monographs,
vol. 13, Mathematical Association of America, Washington, DC, 1996, Revised and with a
preface by Harold P. Boas.
31. R.P. Boas, Tannery’s theorem, Math. Mag. 38 (1965), no. 2, 64–66.
32. J.M. Borwein and P.B. Borwein, Ramanujan, modular equations, and approximations to pi
or how to compute one billion digits of pi, Amer. Math. Monthly 96 (1989), no. 3, 201–219.
33. Jonathan M. Borwein and Peter B. Borwein, Pi and the AGM, Canadian Mathematical
Society Series of Monographs and Advanced Texts, 4, John Wiley & Sons Inc., New York,
1998, A study in analytic number theory and computational complexity, Reprint of the 1987
original, A Wiley-Interscience Publication.
34. R.H.M. Bosanquet, An elementary treatise on musical intervals and temperament (london,
1876), Diapason press, Utrecht, 1987.
35. Carl B. Boyer, Fermat’s integration of $X^n$, Nat. Math. Mag. 20 (1945), 29–32.
36. Carl B. Boyer, A history of mathematics, second ed., John Wiley & Sons Inc., New York, 1991,
With a foreword by Isaac Asimov, Revised and with a preface by Uta C. Merzbach.
37. Paul Bracken and Bruce S. Burdick, Euler’s formula for zeta function convolutions: 10754,
Amer. Math. Monthly 108 (2001), no. 8, 771–773.
38. David Bressoud, Was calculus invented in India?, College Math. J. 33 (2002), no. 1, 2–13.
39. David Brewster, Letters of Euler to a german princess on different subjects in physics and
philosophy, Harper and Brothers, New York, 1834, In two volumes.
40. W.E. Briggs and Nick Franceschine, Problem 1302, Math. Mag. 62 (1989), no. 4, 275–276.
41. T.J. I’A. Bromwich, An introduction to the theory of infinite series, second ed., Macmillan,
London, 1926.
42. Richard A. Brualdi, Mathematical notes, Amer. Math. Monthly 84 (1977), no. 10, 803–807.
43. Robert Bumcrot, Irrationality made easy, The College Math. J. 17 (1986), no. 3, 243–244.
44. Frank Burk, Euler’s constant, The College Math. J. 16 (1985), no. 4, 279.
45. Florian Cajori, A history of mathematical notations, Dover Publications Inc., New York,
1993, 2 Vol in 1 edition.
46. B.C. Carlson, Algorithms involving arithmetic and geometric means, Amer. Math. Monthly
78 (1971), 496–505.
47. Dario Castellanos, The ubiquitous π, Math. Mag. 61 (1988), no. 2, 67–98.
48. Dario Castellanos, The ubiquitous π, Math. Mag. 61 (1988), no. 3, 148–163.
49. R. Chapman, Evaluating ζ(2), preprint, 1999.
50. Robert R. Christian, Another completeness property, Amer. Math. Monthly 71 (1964), no. 1,
78.
51. James A. Clarkson, On the series of prime reciprocals, Proc. Amer. Math. Soc. 17 (1966),
no. 2, 541.
52. Benoit Cloitre, private communication.
53. J. Brian Conrey, The Riemann hypothesis, Notices Amer. Math. Soc. 50 (2003), no. 3,
341–353.
54. F. Lee Cook, A simple explicit formula for the Bernoulli numbers, Two Year College Math.
J. 13 (1982), no. 4, 273–274.
55. J. L. Coolidge, The number e, Amer. Math. Monthly 57 (1950), 591–602.
56. Fr. Gabe Costa, Solution 277, The College Math. J. 17 (1986), no. 1, 98–99.
57. Richard Courant and Herbert Robbins, What is mathematics?, Oxford University Press,
New York, 1979, An elementary approach to ideas and methods.

58. E. J. Dijksterhuis, Archimedes, Princeton University Press, Princeton, NJ, 1987, Translated
from the Dutch by C. Dikshoorn, Reprint of the 1956 edition, With a contribution by Wilbur
R. Knorr.
59. Underwood Dudley, A budget of trisections, Springer-Verlag, New York, 1987.
60. William Dunham, A historical gem from Vito Volterra, Math. Mag. 63 (1990), no. 4, 234–
237.
61. William Dunham, Euler and the fundamental theorem of algebra, The College Math. J. 22 (1991),
no. 4, 282–293.
62. E. Dunne and M. McConnell, Pianos and continued fractions, Math. Mag. 72 (1999), no. 2,
104–115.
63. Erich Dux, Ein kurzer Beweis der Divergenz der unendlichen Reihe $\sum_{r=1}^{\infty} 1/p_r$, Elem. Math.
11 (1956), 50–51.
64. Erdös, Über die Reihe $\sum 1/p$, Mathematica Zutphen. B. 7 (1938), 1–2.
65. Leonhard Euler, Introduction to analysis of the infinite. Book I, Springer-Verlag, New York,
1988, Translated from the Latin and with an introduction by John D. Blanton.
66. Leonhard Euler, Introduction to analysis of the infinite. Book II, Springer-Verlag, New York, 1990,
Translated from the Latin and with an introduction by John D. Blanton.
67. H Eves, Mathematical circles squared, Prindle Weber & Schmidt, Boston, 1972.
68. Pierre Eymard and Jean-Pierre Lafon, The number π, American Mathematical Society, Providence,
RI, 2004, Translated from the 1999 French original by Stephen S. Wilson.
69. Charles Fefferman, An easy proof of the fundamental theorem of algebra, Amer. Math.
Monthly 74 (1967), no. 7, 854–855.
70. William Feller, An introduction to probability theory and its applications. Vol. I, Third
edition, John Wiley & Sons Inc., New York, 1968.
71. William Feller, An introduction to probability theory and its applications. Vol. II., Second edition,
John Wiley & Sons Inc., New York, 1971.
72. D. Ferguson, Evaluation of π. are shanks’ figures correct?, Mathematical Gazette 30 (1946),
89–90.
73. William Leonard Ferrar, A textbook of convergence, The Clarendon Press Oxford University
Press, New York, 1980.
74. Steven R. Finch, Mathematical constants, Encyclopedia of Mathematics and its Applications,
vol. 94, Cambridge University Press, Cambridge, 2003.
75. Philippe Flajolet and Ilan Vardi, Zeta function expansions of classical constants, preprint,
1996.
76. Tomlinson Fort, Application of the summation by parts formula to summability of series,
Math. Mag. 26 (1953), no. 26, 199–204.
77. Gregory Fredricks and Roger B. Nelsen, Summation by parts, The College Math. J. 23
(1992), no. 1, 39–42.
78. Richard J. Friedlander, Factoring factorials, Two Year College Math. J. 12 (1981), no. 1,
12–20.
79. Joseph A. Gallian, contemporary abstract algebra, sixth ed., Houghton Mifflin Company,
Boston, 2005.
80. Martin Gardner, Mathematical games, Scientific American April (1958).
81. Martin Gardner, Second scientific american book of mathematical puzzles and diversions, University
of Chicago press, Chicago, 1987, Reprint edition.
82. J. Glaisher, History of Euler’s constant, Messenger of Math. 1 (1872), 25–30.
83. Edward J. Goodwin, Quadrature of the circle, Amer. Math. Monthly 1 (1894), no. 1, 246–
247.
84. Russell A. Gordon, The use of tagged partitions in elementary real analysis, Amer. Math.
Monthly 105 (1998), no. 2, 107–117.
85. H.W. Gould, Explicit formulas for Bernoulli numbers, Amer. Math. Monthly 79 (1972),
no. 1, 44–51.
86. D.S. Greenstein, A property of the logarithm, Amer. Math. Monthly 72 (1965), no. 7, 767.
87. Robert Grey, Georg Cantor and transcendental numbers, Amer. Math. Monthly 101 (1994),
no. 9, 819–832.
88. Lucye Guilbeau, The history of the solution of the cubic equation, Mathematics News Letter
5 (1930), no. 4, 8–12.

89. Rachel W. Hall and Krešimir Josić, The mathematics of musical instruments, Amer. Math.
Monthly 108 (2001), no. 4, 347–357.
90. Hallerberg, Indiana’s squared circle, Math. Mag. 50 (1977), no. 3, 136–140.
91. Paul R. Halmos, Naive set theory, Springer-Verlag, New York-Heidelberg, 1974, Reprint of
the 1960 edition. Undergraduate Texts in Mathematics.
92. Paul R. Halmos, I want to be a mathematician, Springer-Verlag, 1985, An automathography.
93. G.D. Halsey and Edwin Hewitt, More on the superparticular ratios in music, Amer. Math.
Monthly 79 (1972), no. 10, 1096–1100.
94. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, Cambridge Mathematical Library,
Cambridge University Press, Cambridge, 1988, Reprint of the 1952 edition.
95. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, fifth ed., The
Clarendon Press Oxford University Press, New York, 1979.
96. Julian Havil, Gamma, Princeton University Press, Princeton, NJ, 2003, Exploring Euler’s
constant, With a foreword by Freeman Dyson.
97. Ko Hayashi, Fibonacci numbers and the arctangent function, Math. Mag. 76 (2003), no. 3,
214–215.
98. T. L. Heath, Diophantus of alexandria: a study in the history of greek algebra, Cambridge
University Press, England, 1889.
99. T. L. Heath, The works of Archimedes, Cambridge University Press, England, 1897.
100. Thomas Heath, A history of Greek mathematics. Vol. I, Dover Publications Inc., New York,
1981, From Thales to Euclid, Corrected reprint of the 1921 original.
101. Aaron Herschfeld, On Infinite Radicals, Amer. Math. Monthly 42 (1935), no. 7, 419–429.
102. Josef Hofbauer, A simple proof of $1 + 1/2^2 + 1/3^2 + \cdots = \pi^2/6$ and related identities, Amer.
Math. Monthly 109 (2002), no. 2, 196–200.
103. P. Iain, Science, theology and einstein, Oxford University, Oxford, 1982.
104. Frank Irwin, A curious convergent series, Amer. Math. Monthly 23 (1916), no. 5, 149–152.
105. Sir James H. Jeans, Science and music, Dover Publications Inc., New York, 1968, Reprint
of the 1937 edition.
106. Dixon J. Jones, Continued powers and a sufficient condition for their convergence, Math.
Mag. 68 (1995), no. 5, 387–392.
107. Gareth A. Jones, $6/\pi^2$, Math. Mag. 66 (1993), no. 5, 290–298.
108. J.P. Jones and S. Toporowski, Irrational numbers, Amer. Math. Monthly 80 (1973), no. 4,
423–424.
109. Dan Kalman, Six ways to sum a series, The College Math. J. 24 (1993), no. 5, 402–421.
110. Edward Kasner and James Newman, Mathematics and the imagination, Dover Publications
Inc., New York, 2001.
111. Victor J. Katz, Ideas of calculus in islam and india, Math. Mag. 68 (1995), no. 3, 163–174.
112. Gerard W. Kelly, Short-cut math, Dover Publications Inc., New York, 1984.
113. A. J. Kempner, A curious convergent series, Amer. Math. Monthly 21 (1914), no. 2, 48–50.
114. Alexey Nikolaevitch Khovanskii, The application of continued fractions and their generalizations
to problems in approximation theory, Translated by Peter Wynn, P. Noordhoff N.
V., Groningen, 1963.
115. Steven J. Kifowit and Terra A. Stamps, The harmonic series diverges again and again, The
AMATYC Review 27 (2006), no. 2, 31–43.
116. M.S. Klamkin and Robert Steinberg, Problem 4431, Amer. Math. Monthly 59 (1952), no. 7,
471–472.
117. M.S. Klamkin and J.V. Whittaker, Problem 4564, Amer. Math. Monthly 62 (1955), no. 2,
129–130.
118. Israel Kleiner, Evolution of the function concept: A brief survey, Two Year College Math.
J. 20 (1989), no. 4, 282–300.
119. Morris Kline, Euler and infinite series, Math. Mag. 56 (1983), no. 5, 307–314.
120. Konrad Knopp, Infinite sequences and series, Dover Publications Inc., New York, 1956,
Translated by Frederick Bagemihl.
121. R. Knott, Fibonacci numbers and the golden section,
http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/ .
122. Donald E. Knuth, The art of computer programming. Vol. 2, second ed., Addison-Wesley
Publishing Co., Reading, Mass., 1981, Seminumerical algorithms, Addison-Wesley Series in
Computer Science and Information Processing.

123. R.A. Kortram, Simple proofs for $\sum_{k=1}^{\infty} 1/k^2 = \pi^2/6$ and $\sin x = x \prod_{k=1}^{\infty} (1 - x^2/k^2\pi^2)$,
Math. Mag. 69 (1996), no. 2, 122–125.
124. Myren Krom, On sums of powers of natural numbers, Two Year College Math. J. 14 (1983),
no. 4, 349–351.
125. David E. Kullman, What’s harmonic about the harmonic series, The College Math. J. 32
(2001), no. 3, 201–203.
126. R. Kumanduri and C. Romero, Number theory with computer applications, Prentice-Hall,
Simon and Schuster, New Jersey, 1998.
127. M. Laczkovich, On Lambert’s proof of the irrationality of π, Amer. Math. Monthly 104
(1997), no. 5, 439–443.
128. Serge Lang, A first course in calculus, fifth ed., Addison-Wesley Pub. Co., Reading, Mass.,
1964.
129. L. J. Lange, An elegant continued fraction for π, Amer. Math. Monthly 106 (1999), no. 5,
456–458.
130. W. G. Leavitt, The sum of the reciprocals of the primes, Two Year College Math. J. 10
(1979), no. 3, 198–199.
131. D.H. Lehmer, Problem 3801, Amer. Math. Monthly 43 (1936), no. 9, 580.
132. D.H. Lehmer, On arccotangent relations for π, Amer. Math. Monthly 45 (1938), no. 10, 657–664.
133. D.H. Lehmer and M.A. Heaslet, Solution 3801, Amer. Math. Monthly 45 (1938), no. 9,
636–637.
134. A.L. Leigh Silver, Musimatics or the nun’s fiddle, Amer. Math. Monthly 78 (1971), no. 4,
351–357.
135. H.W. Lenstra, Solving the pell equation, Notices Amer. Math. Soc. 49 (2002), no. 2, 182–192.
136. P. Loya, Amazing and aesthetic aspects of analysis: The celebrated calculus, in preparation.
137. N. Luzin, Function: Part I, Amer. Math. Monthly 105 (1998), no. 1, 59–67.
138. N. Luzin, Function: Part II, Amer. Math. Monthly 105 (1998), no. 3, 263–270.
139. Richard Lyon and Morgan Ward, The limit for e, Amer. Math. Monthly 59 (1952), no. 2,
102–103.
140. Desmond MacHales, Comic sections: The book of mathematical jokes, humour, wit, and
wisdom, Boole Press, Dublin, 1993.
141. Alan L. Mackay, Dictionary of scientific quotations, Institute of Physics Publishing, Bristol,
1994.
142. E.A. Maier, On the irrationality of certain trigonometric numbers, Amer. Math. Monthly
72 (1965), no. 9, 1012–1013.
143. E.A. Maier and Ivan Niven, A method of establishing certain irrationalities, Math. Mag. 37
(1964), no. 4, 208–210.
144. S. C. Malik, Introduction to convergence, Halsted Press, a division of John Wiley and sons,
New Delhi, 1984.
145. Eli Maor, e: the story of a number, Princeton University Press, Princeton, NJ, 1994.
146. George Markowsky, Misconceptions about the golden ratio, Two Year College Math. J. 23
(1992), no. 1, 2–19.
147. Jerold Mathews, Gear trains and continued fractions, Amer. Math. Monthly 97 (1990), no. 6,
505–510.
148. Marcin Mazur, Irrationality of $\sqrt{2}$, private communication, 2004.
149. J. H. McKay, The william lowell putnam mathematical competition, Amer. Math. Monthly
74 (1967), no. 7, 771–777.
150. George Miel, Of calculations past and present: The Archimedean algorithm, Amer. Math.
Monthly 90 (1983), no. 1, 17–35.
151. Jeff Miller, Earliest uses of symbols in probability and statistics,
http://members.aol.com/jeff570/stat.html.
152. John E. Morrill, Set theory and the indicator function, Amer. Math. Monthly 89 (1982),
no. 9, 694–695.
153. Leo Moser, On the series, $\sum 1/p$, Amer. Math. Monthly 65 (1958), 104–105.
154. Joseph Amal Nathan, The irrationality of $e^x$ for nonzero rational x, Amer. Math. Monthly
105 (1998), no. 8, 762–763.
155. Harry L. Nelson, A solution to Archimedes’ cattle problem, J. Recreational Math. 13 (1980-
81), 162–176.
156. D.J. Newman, Solution to problem e924, Amer. Math. Monthly 58 (1951), no. 3, 190–191.

157. D.J. Newman, Arithmetic, geometric inequality, Amer. Math. Monthly 67 (1960), no. 9, 886.
158. Donald J. Newman and T.D. Parsons, On monotone subsequences, Amer. Math. Monthly
95 (1988), no. 1, 44–45.
159. James R. Newman (ed.), The world of mathematics. Vol. 1, Dover Publications Inc., Mineola,
NY, 2000, Reprint of the 1956 original.
160. J.R. Newman (ed.), The world of mathematics, Simon and Schuster, New York, 1956.
161. James Nickel, Mathematics: Is God silent?, Ross House Books, Vallecito, California, 2001.
162. Ivan Niven, The transcendence of π, Amer. Math. Monthly 46 (1939), no. 8, 469–471.
163. Ivan Niven, Irrational numbers, The Carus Mathematical Monographs, No. 11, The Mathematical
Association of America. Distributed by John Wiley and Sons, Inc., New York, N.Y.,
1956.
164. Ivan Niven, A proof of the divergence of $\sum 1/p$, Amer. Math. Monthly 78 (1971), no. 3, 272–273.
165. Ivan Niven and Herbert S. Zuckerman, An introduction to the theory of numbers, third ed.,
John Wiley & Sons, Inc., New York-London-Sydney, 1972.
166. Jeffrey Nunemacher and Robert M. Young, On the sum of consecutive kth powers, Math.
Mag. 60 (1987), no. 4, 237–238.
167. Mícheál Ó Searcóid, Elements of abstract analysis, Springer Undergraduate Mathematics
Series, Springer-Verlag London Ltd., London, 2002.
168. University of St. Andrews, A chronology of pi,
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Pi chronology.html.
169. University of St. Andrews, Eudoxus of cnidus,
http://www-groups.dcs.st-and.ac.uk/~ history/Biographies/Eudoxus.html.
170. University of St. Andrews, A history of pi,
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Pi through the ages.html.
171. University of St. Andrews, Leonhard Euler,
http://www-groups.dcs.st-and.ac.uk/ history/Mathematicians/Euler.html.
172. University of St. Andrews, Madhava of sangamagramma,
http://www-gap.dcs.st-and.ac.uk/ history/Mathematicians/Madhava.html.
173. C. D. Olds, The simple continued fraction expansion of e, Amer. Math. Monthly 77 (1970),
no. 9, 968–974.
174. Geo. A. Osborne, A problem in number theory, Amer. Math. Monthly 21 (1914), no. 5,
148–150.
175. Thomas J. Osler, The union of Vieta’s and Wallis’s products for pi, Amer. Math. Monthly
106 (1999), no. 8, 774–776.
176. Thomas J. Osler and James Smoak, A magic trick from fibonacci, The College Math. J. 34
(2003), 58–60.
177. Thomas J. Osler and Nicholas Stugard, A collection of numbers whose proof of irrationality
is like that of the number e, Math. Comput. Ed. 40 (2006), 103–107.
178. Thomas J. Osler and Michael Wilhelm, Variations on Vieta’s and Wallis’s products for pi,
Math. Comput. Ed. 35 (2001), 225–232.
179. Ioannis Papadimitriou, A simple proof of the formula $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$, Amer. Math.
Monthly 80 (1973), no. 4, 424–425.
180. L. L. Pennisi, Elementary proof that e is irrational, Amer. Math. Monthly 60 (1953), 474.
181. G.M. Phillips, Archimedes the numerical analyst, Amer. Math. Monthly 88 (1981), no. 3,
165–169.
182. R.C. Pierce, Jr., A brief history of logarithms, Two Year College Math. J. 8 (1977), no. 1,
22–26.
183. Alfred S. Posamentier and Ingmar Lehmann, π: A biography of the world’s most mysterious
number, Prometheus Books, Amherst, NY, 2004, With an afterword by Herbert A.
Hauptman.
184. G. Baley Price, Telescoping sums and the summation of sequences, Two Year College Math.
J. 4 (1973), no. 4, 16–29.
185. Raymond Redheffer, What! another note just on the fundamental theorem of algebra, Amer.
Math. Monthly 71 (1964), no. 2, 180–185.
186. Reinhold Remmert, Vom Fundamentalsatz der Algebra zum Satz von Gelfand-Mazur, Math.
Semesterber. 40 (1993), no. 1, 63–71.
187. Dorothy Rice, History of π (or pi), Mathematics News Letter 2 (1928), 6–8.
188. N. Rose, Mathematical maxims and minims, Rome Press Inc., Raleigh, NC, 1988.

189. Tony Rothman, Genius and biographers: The fictionalization of Evariste Galois, Amer.
Math. Monthly 89 (1982), no. 2, 84–106.
190. Ranjan Roy, The discovery of the series formula for π by Leibniz, Gregory and Nilakantha,
Math. Mag. 63 (1990), no. 5, 291–306.
191. Walter Rudin, Principles of mathematical analysis, third ed., McGraw-Hill Book Co., New
York, 1976, International Series in Pure and Applied Mathematics.
192. Walter Rudin, Real and complex analysis, third ed., McGraw-Hill Book Co., New York, 1987.
193. Oliver Sacks, The man who mistook his wife for a hat : And other clinical tales, Touchstone,
New York, 1985.
194. Yoram Sagher, Notes: What Pythagoras Could Have Done, Amer. Math. Monthly 95 (1988),
no. 2, 117.
195. E. Sandifer, How euler did it,
http://www.maa.org/news/howeulerdidit.html.
196. Norman Schaumberger, An instant proof of $e^{\pi} > \pi^e$, The College Math. J. 16 (1985), no. 4,
280.
197. Murray Schechter, Tempered scales and continued fractions, Amer. Math. Monthly 87 (1980),
no. 1, 40–42.
198. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 3, 165–170.
199. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 4, 216–228.
200. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 5, 279–283.
201. P.J. Schillo, On primitive pythagorean triangles, Amer. Math. Monthly 58 (1951), no. 1,
30–32.
202. Fred Schuh, The master book of mathematical recreations, Dover Publications Inc., New
York, 1968, Translated by F. Göbel.
203. P. Sebah and X. Gourdon, A collection of formulae for the Euler constant,
http://numbers.computation.free.fr/Constants/Gamma/gammaFormulas.pdf.
204. P. Sebah and X. Gourdon, A collection of series for π,
http://numbers.computation.free.fr/Constants/Pi/piSeries.html.
205. P. Sebah and X. Gourdon, The constant e and its computation,
http://numbers.computation.free.fr/Constants/constants.html.
206. P. Sebah and X. Gourdon, Introduction on Bernoulli’s numbers,
http://numbers.computation.free.fr/Constants/constants.html.
207. P. Sebah and X. Gourdon, π and its computation through the ages,
http://numbers.computation.free.fr/Constants/constants.html.
208. Allen A. Shaw, Note on roman numerals, Nat. Math. Mag. 13 (1938), no. 3, 127–128.
209. Georgi E. Shilov, Elementary real and complex analysis, english ed., Dover Publications
Inc., Mineola, NY, 1996, Revised English edition translated from the Russian and edited by
Richard A. Silverman.
210. G. F. Simmons, Calculus gems, Mcgraw Hill, Inc., New York, 1992.
211. J.G. Simmons, A new look at an old function, $e^{i\theta}$, The College Math. J. 26 (1995), no. 1,
6–10.
212. Sahib Singh, On dividing coconuts: A linear diophantine problem, The College Math. J. 28
(1997), no. 3, 203–204.
213. David Singmaster, The legal values of pi, Math. Intelligencer 7 (1985), no. 2, 69–72.
214. David Singmaster, Coconuts: the history and solutions of a classic Diophantine problem,
Gaṇita-Bhāratī 19 (1997), no. 1-4, 35–51.
215. Walter S. Sizer, Continued roots, Math. Mag. 59 (1986), no. 1, 23–27.
216. David Eugene Smith, A source book in mathematics. vol. 1, 2., Dover Publications, Inc, New
York, 1959, Unabridged and unaltered republ. of the first ed. 1929.
217. J. Sondow, Problem 88, Math Horizons (1997), 32, 34.
218. H. Steinhaus, Mathematical snapshots, english ed., Dover Publications Inc., Mineola, NY,
1999, Translated from the Polish, With a preface by Morris Kline.
219. Ian Stewart, Concepts of modern mathematics, Dover Publications Inc., New York, 1995.
220. John Stillwell, Galois theory for beginners, Amer. Math. Monthly 101 (1994), no. 1, 22–27.
221. D. J. Struik (ed.), A source book in mathematics, 1200–1800, Princeton Paperbacks, Princeton
University Press, Princeton, NJ, 1986, Reprint of the 1969 edition.
222. Frode Terkelsen, The fundamental theorem of algebra, Amer. Math. Monthly 83 (1976),
no. 8, 647.

223. Hugh Thurston, A simple proof that every sequence has a monotone subsequence, Math.
Mag. 67 (1994), no. 5, 344.
224. C. Tøndering, Frequently asked questions about calendars,
http://www.tondering.dk/claus/, 2003.
225. Herbert Turnbull, The great mathematicians, Barnes & Noble, New York, 1993.
226. Herbert (ed.) Turnbull, The correspondence of Isaac Newton, Vol. II: 1676–1687, Published
for the Royal Society, Cambridge University Press, New York, 1960.
227. D. J. Uherka and Ann M. Sergott, On the continuous dependence of the roots of a polynomial
on its coefficients, Amer. Math. Monthly 84 (1977), no. 5, 368–370.
228. R.S. Underwood and Robert E. Moritz, Solution to problem 3242, Amer. Math. Monthly 35
(1928), no. 1, 47–48.
229. James Victor Uspensky, Introduction to mathematical probability, McGraw-Hill Book Co,
New York, London, 1937.
230. Alfred van der Poorten, A proof that Euler missed... Apéry’s proof of the irrationality of
ζ(3), Math. Intelligencer 1 (1978/79), no. 4, 195–203, An informal report.
231. Charles Vanden Eynden, Proofs that $\sum 1/p$ diverges, Amer. Math. Monthly 87 (1980), no. 5,
394–397.
232. Ilan Vardi, Computational recreations in Mathematica, Addison-Wesley Publishing Company
Advanced Book Program, Redwood City, CA, 1991.
233. Ilan Vardi, Archimedes’ cattle problem, Amer. Math. Monthly 105 (1998), no. 4, 305–319.
234. P.G.J. Vredenduin, A paradox of set theory, Amer. Math. Monthly 76 (1969), no. 1, 59–60.
235. A.D. Wadhwa, An interesting subseries of the harmonic series, Amer. Math. Monthly 82
(1975), no. 9, 931–933.
236. Morgan Ward, A mnemonic for Euler’s constant, Amer. Math. Monthly 38 (1931), no. 9, 6.
237. André Weil, Number theory, Birkhäuser Boston Inc., Boston, MA, 1984, An approach
through history, From Hammurapi to Legendre.
238. E. Weisstein, Dirichlet function. From MathWorld--A Wolfram Web Resource,
http://mathworld.wolfram.com/DirichletFunction.html.
239. E. Weisstein, Landau symbols. From MathWorld--A Wolfram Web Resource,
http://mathworld.wolfram.com/LandauSymbols.html.
240. E. Weisstein, Pi approximations. From MathWorld--A Wolfram Web Resource,
http://mathworld.wolfram.com/PiApproximations.html.
241. E. Weisstein, Pi formulas. From MathWorld--A Wolfram Web Resource,
http://mathworld.wolfram.com/PiFormulas.html.
242. B. R. Wenner, Continuous, exactly k-to-one functions on R, Math. Mag. 45 (1972), 224–225.
243. Joseph Wiener, Bernoulli’s inequality and the number e, The College Math. J. 16 (1985),
no. 5, 399–400.
244. E. Wigner, The unreasonable effectiveness of mathematics in the natural sciences, Comm.
Pure Appl. Math. 13 (1960), 1–14.
245. Eugene Wigner, Symmetries and reflections: Scientific essays, The MIT press, Cambridge
and London, 1970.
246. Herbert S. Wilf, generatingfunctionology, third ed., A K Peters Ltd., Wellesley, MA, 2006,
Freely downloadable at http://www.cis.upenn.edu/~wilf/.
247. G.T. Williams, A new method of evaluating ζ(2n), Amer. Math. Monthly 60 (1953), no. 1,
12–25.
248. H.C. Williams, R.A. German, and C.R. Zarnke, Solution of the cattle problem of Archimedes,
Math. Comp. 19 (1965), no. 92, 671–674.
249. A. M. Yaglom and I. M. Yaglom, Challenging mathematical problems with elementary so-
lutions. Vol. II, Dover Publications Inc., New York, 1987, Problems from various branches
of mathematics, Translated from the Russian by James McCawley, Jr., Reprint of the 1967
edition.
250. Hansheng Yang and Yang Heng, The arithmetic-geometric mean inequality and the constant
e, Math. Mag. 74 (2001), no. 4, 321–323.
251. G.S. Young, The linear functional equation, Amer. Math. Monthly 65 (1958), no. 1, 37–38.
252. Robert M. Young, Excursions in calculus, The Dolciani Mathematical Expositions, vol. 13,
Mathematical Association of America, Washington, DC, 1992, An interplay of the continuous
and the discrete.
253. Don Zagier, The first 50 million prime numbers, Math. Intelligencer 0 (1977/78), 7–19.
254. Lee Zia, Using the finite difference calculus to sum powers of integers, The College Math. J.
22 (1991), no. 4, 294–300.
Index

Abel summable, 299 series error estimate, 275


Abel’s lemma, 271 series test, 275
Abel’s limit theorem, 296 Angle, 78, 206
Abel’s multiplication theorem, 317 Anthoniszoon, Adriaan, 228, 423
Abel’s test for series, 277 Apéry, Roger, 263
Abel, Niels, 296 Approximable numbers, 464
Abel, Niels, 123, 271 Approximation(s)
Absolute convergence best, 424
double series, 302 good, 423
infinite products, 355 $\pi \approx \sqrt{10},\ (227/23)^{1/2},\ 31^{1/3},\ (4930/159)^{1/3},\ 306^{1/5},\ (77729/254)^{1/5}$, 432
series, 135 $\pi \approx \frac{6}{5}\Phi^2$, 414
Absolute convergence theorem, 135 $\pi \approx (2143/22)^{1/4}$, 432
Absolute value, 17, 39 Approximations
of complex numbers, 82 $e^{\pi} - \pi \approx 20$, 206
Absolute value rules, 39 $e^{\pi\sqrt{163}} = 262537412640768744$, 206
Additive function, 171
Additive identity
existence for complex numbers, 80
Archimedean
existence for integers, 35
ordering of the natural numbers, 25
existence for reals, 54
existence for vectors, 73 property of reals, 68
Additive inverse Archimedes of Syracuse, 153, 226, 227, 423,
existence for complex numbers, 80 456
existence for integers, 35 Archimedes’
existence for reals, 54 algorithm, 232
existence for vectors, 73 Archimedes’ cattle problem, 456
AGM method (to compute π), 233 Archimedes’ three propositions, 228
Ahmes, 227 Argument
Algebra of limits of a complex number, 219
functions, 161 principal, 219
sequences, 107 Ariel, 437
Algebraic number, 88 Aristotle, 182
Algorithm Aristoxenus, 437
Archimedes’, 232 Arithmetic mean, 34
Borchardt’s, 235 Arithmetic properties of series theorem, 127
Brahmagupta’s, 461 Arithmetic-geometric mean inequality, 34, 195
canonical continued fraction, 415 application to e, 116
division, 42 Associative law
Euclidean, 46 addition for complex numbers, 80
Almost integer, 206 addition for integers, 35
Alternating addition for natural numbers, 22
harmonic series, 136, 154, 275 addition for reals, 53
harmonic series rearrangement, 311 addition for vectors, 73
log 2 formula, 194 multiplication for complex numbers, 80

multiplication for integers, 36 Brahmagupta, 459
multiplication for natural numbers, 22 quote, 459
multiplication for reals, 54 zero and negative numbers, 21
multiplication for vectors, 73 Brahmagupta’s algorithm, 461
Axiom(s) Brent, Richard, 233
completeness, 63 Bromwich, Thomas, 257
integers, 35 Brouncker, Lord William, 233, 389
natural numbers, 22 Brouwer’s fixed point theorem, 179
Brouwer, L E J, 179
b-adic expansion, see also b-adic representation Bunyakovskiĭ, Viktor, 74
b-adic representation
of integers, 51 Caesar, Julius, 433
of real numbers, 147 Calendar
Bachmann, Paul, 291 Gregorian, 433
Ball Julian, 433
closed, 77 Persian, 437
open, 77 Canonical (simple) continued fraction, 415
Ball norm, 76 Canonical continued fraction algorithm, 415
Barber puzzle, 3 Cantor’s diagonal argument, 151
Base original version, 152
of a power, 28, 190, 222 Cantor’s first proof (of the uncountability of
to represent integers, 51 R), 86
to represent real numbers, 146 Cantor’s second proof (of the uncountability
Basel problem, 250 of R), 151
Bell, Jordan, 238 Cantor’s theorem, 90
Bellman, Richard, 321 Cantor, Georg, 83
Bernoulli numbers, 262, 270, 327 uncountability of transcendental numbers,
Bernoulli’s inequality, 30 89
Bernoulli, Jacob, 30, 125, 249, 269 Cardinality, 83
numbers, 270, 327 Cartesian product, 15
Bernoulli, Nicolaus, 344 Cassini’s identity, 411
Bertrand, Joseph, 293 Castellanos, Dario, 342
Best approximation, 424 Cataldi, Pietro, 421
Best approximation theorem, 425, 429 Cattle problem, 456
big O notation, 291 Cauchy condensation test, 133
Bijection, 18 Cauchy criterion theorem, 119
Binary system, 50 Cauchy product, 315
Binomial Cauchy sequence, 117
coefficient, 31, 332 Cauchy’s arithmetic mean theorem, 344
series, 333, 336 Cauchy’s criterion theorem for series, 132
theorem, 31 Cauchy’s double series theorem, 145, 305
Bisection method, 181 Cauchy’s multiplication theorem, 318
Bolzano, Bernard, 114 Cauchy’s root test, 284
Bolzano-Weierstrass theorem for R, 114 Cauchy, Augustin, 74, 315
Bolzano-Weierstrass theorem for R^m, 114 Cauchy-Bunyakovskiĭ-Schwarz inequality, 74
Bombelli, Rafael, 414 Cauchy-Hadamard theorem, 287
Borchardt’s algorithm, 235 Cauchy-Schwarz inequality, 74
Borchardt, Carl, 235 Chain of intervals, 181
Borwein, Jonathan, 233 Change of base formula, 196
Borwein, Peter, 233 Characteristic function, 20
Bosanquet, Robert, 437 Chongzhi, Zu, see also Chung-Chi, Tsu
Bounded Chung-Chi, Tsu, 227, 423
above, 63 Cloitre, Benoit, 386
below, 64 Closed
sequence, 104 ball, 77
variation, 272 interval, 4
Boundedness theorem, 175 natural number operations, 22
Box norm, 79 Coconut puzzles, 409
Codomain of function, 16 Continuity theorem for power series, 295
Coin game, 34 Continuous function(s)
Commutative law algebra of, 168
addition for complex numbers, 80 composition of, 168
addition for integers, 35 Continuous functions, 166
addition for natural numbers, 22 Contractive sequence, 120
addition for reals, 53 Contractive sequence theorem, 120
addition for vectors, 73 Contrapositive, 6, 11, 24
multiplication for complex numbers, 80 Convergence
multiplication for integers, 36 double sequences, 300
multiplication for natural numbers, 22 double series, 302
multiplication for reals, 54 functions, 154
Commutative ring, 36 infinite products, 351
Compact, 173 infinite series, 124
Compactness lemma, 173 sequences, 95
Comparison test, 132 Convergent of a continued fraction, 393
Complement of sets, 8 Converse, 13
Complete quotients, 416 Convex set, 72
Completeness axiom, 63 Coprime numbers, 47, 153, 234, 378
Completeness property of R, 64 Cosecant function, 200
Completeness property of R^m, 120 power series, 331
Complex Cosine function, 144, 198
logarithm, 221 Cotangent
power, 190, 222 continued fraction, 445
Complex conjugate, 82 Cotangent function, 200
Complex numbers (C), 79 power series of, 329
Component functions, 161 Countability
Component theorem of algebraic numbers, 89
for continuity, 168 of rational numbers, 86
for functions, 161 Countable
for sequences, 103 set, 83
Composite, 44 Countably infinite, 83
Composition of functions, 17 Cover of sets, 172
Composition of limits theorem, 162 Cube root, 57
confluent hypergeometric limit function, 438
Congruent modulo n, 47 d’Alembert’s ratio test, 284
Conjugate in Z[√d] or Q[√d], 449 d’Alembert, Jean Le Rond, 284
Connected set, 173 quote, 95
Connectedness lemma, 174 Davis’ Broadway cafe, 419
Constant function, 20 de Lagny, Thomas, 343
Constant(s) de Moivre’s formula, 198
Euler-Mascheroni γ, 153, 193 de Moivre, Abraham, 198
Continued fraction, 123 De Morgan and Bertrand’s test, 293
canonical, 415 De Morgan, Augustus, 293
unary, 422 laws, 9
Continued fraction convergence theorem, 416 quote, 35
Continued fraction(s) Decimal system, 49
finite, 391, 393 Decimal(s)
infinite, 393 integers, 49
nonnegative, 404 real numbers, 146
regular, 411 Degree, 202
simple, 392 Dense, 170
terminating, 393 Density of the (ir)rationals, 69
Continued fractions Difference of sets, 7
transformation rules of, 394 Difference sequence, 122
Continuity Digit(s), 49, 147
at a point, 166 Diophantine equations, 407
on a set, 167 Diophantus of Alexandria, 389, 407
Dirac, Paul, 237 ε-principle, 102
Direct proof, 24 Equality
Dirichlet eta function, 376 of functions, 20
Dirichlet function, 17, 159 of sets, 5
modified, 169 Eratosthenes of Cyrene, 456
Dirichlet’s approximation theorem, 430 Erdős, Paul, 325
Dirichlet’s test, 273 Euclid’s theorem, 44
Dirichlet’s theorem for rearrangements, 314 Euclidean algorithm, 46
Dirichlet, Johann, 17, 159, 273 Euclidean space, 72
Disconnected set, 173 Eudoxus of Cnidus, 68
Discontinuity Euler numbers, 270, 330
jump, 183 Euler’s identity, 198
Disjoint sets, 7 Euler’s sum for $\pi^2/6$, 126
Distance between vectors or points, 76 Euler, Leonhard, 126, 153, 234, 249, 339,
Distributive law 341, 389, 399, 441, 462, 464
complex numbers, 80 e, 114
integers, 36 f (x) notation, 16
natural numbers, 22 identity for $e^{iz}$, 198
reals, 54 letter to a princess, 435
vectors, 73 on the series 1 − 1 + 1 · · · , 93
Divergence on transcendental numbers, 88
infinite products, 351 popularization of π, 228
infinite series, 124 quote, 41, 93
proper (for functions), 164 role played in FTA, 210
proper (for sequences), 109 summation notation Σ, 28
sequences, 95 Euler-Mascheroni constant, 153
to ±∞ for functions, 164 Euler-Mascheroni constant γ, 193
to ±∞ for sequences, 109 Even
to zero for infinite products, 351 function, 299
Divide, divisible, 41 number, 44
Divisibility rules, 42 Existence
Divisibility tricks, 53 of complex n-th roots, 215
Division algorithm, 42 of real n-th roots, 65
Divisor, 41 Exponent, 28, 190, 222
Domain of function, 16 Exponential function, 94, 140, 187, 285
dot product, 73 application of Cauchy multiplication, 319
Double a cube, 228 the most important function, 153
Double sequence, 145, 300 Exponential function
Duodecimal system, 52 continued fraction, 442
Dux, Erich, 321 Extended real numbers, 109, 280
Dyadic system, 50
Factor, 41
e, 93, 114 Factorial(s), 31, 46
approximation, 143 how to factor, 48
infinite nested product formula, 143 Family of sets, 8
irrationality, 143, 276, 420 Fandreyer, Ernest, 210
nonsimple continued fraction, 389, 402 Ferguson, D., 234
simple continued fraction, 389, 419, 443 Fermat’s last theorem, 462
e and π in a mirror, 386 Fermat’s theorem, 47
Eddington, Sir Arthur, 21 Fermat, Pierre, 462
Einstein, Albert, 376 Fermat, Pierre de, 47
Empty set, 4 Fibonacci sequence, 34, 131, 278, 289, 341,
Enharmonic Harmonium, 437 411
ε/2-trick, 102, 119 Fibonacci, Leonardo, 341, 418
ε-δ arguments, 156 Field, 54
ε-δ definition of limit, 154 ordered, 54
ε-N arguments, 96 Finite, 83
ε-N definition of limit, 94 subcover, 172
    subcover property, 173
Flajolet, Philippe, 339, 341
FOIL law, 24
Formula(s)
    $2 = \frac{e^{1}}{e^{1/2}} \cdot \frac{e^{1/3}}{e^{1/4}} \cdots$, 194
    $\frac{\pi}{4} = \sum_{n=0}^{\infty} \arctan \frac{1}{F_{2n+1}}$, 342
    $\sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)^z} = \frac{3^z}{3^z+1} \cdot \frac{5^z}{5^z-1} \cdots$, 386
    $\sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)^{2k+1}} = (-1)^k \frac{E_{2k}}{2(2k)!} \left(\frac{\pi}{2}\right)^{2k+1}$, 385
    $e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cdots}}}$, 402
    addition for (co)sine, 199
    $\arctan x = \cfrac{x}{1 + \cfrac{x^2}{3 - x^2 + \cfrac{3^2 x^2}{5 - 3x^2 + \cdots}}}$, 398
    Castellanos’, 342
    change of base, 196
    continued fraction for $e^x$, 442
    cosecant power series, 331
    cotangent continued fraction, 445
    cotangent power series, 329
    de Moivre’s, 198
    double angle for (co)sine, 199
    e nonsimple continued fraction, 389
    e simple continued fraction, 389, 443
    $\frac{e}{e-1} = \lim_{n\to\infty} \left\{ \left(\frac{1}{n}\right)^n + \cdots + \left(\frac{n}{n}\right)^n \right\}$, 191
    $e = \lim_{n\to\infty} \left\{ \left(\frac{2}{1}\right)^1 \left(\frac{3}{2}\right)^2 \cdots \left(\frac{n+1}{n}\right)^n \right\}^{1/n}$, 345
    $e = \frac{2}{1} \cdot \frac{5}{4} \cdot \frac{16}{15} \cdots$, 352
    $\frac{e^{\pi} - e^{-\pi}}{2\pi} = \prod_{n=1}^{\infty} \left(1 + \frac{1}{n^2}\right)$, 362
    $\frac{e^{2/x}+1}{e^{2/x}-1} = x + \cfrac{1}{3x + \cfrac{1}{5x + \cdots}}$, 441
    Euler’s ($\pi^2/6$), 154, 234, 269
    Euler’s ($\pi^2/6$), Proof I, 249
    Euler’s ($\pi^2/6$), Proof II, 251
    Euler’s ($\pi^2/6$), Proof III, 253
    Euler’s ($\pi^2/6$), Proof IV, 257
    Euler’s ($\pi^2/6$), Proof V, 258
    Euler’s ($\pi^2/6$), Proof VI, 264
    Euler’s ($\pi^2/6$), Proof VII, 345
    Euler’s ($\pi^2/6$), Proof VIII, 370
    Euler’s ($\pi^2/6$), Proof VIIII, 371
    Euler’s ($\pi^2/6$), Proof X, 372
    Euler’s ($\pi^2/6$), Proof VIIIIII, 383
    Euler’s ($\pi^4/90$), 262, 371, 383
    Euler’s ($\pi^6/945$), 262, 371, 383
    Euler’s formula ($\eta(2k)$), 383
    Euler’s formula ($\zeta(2k)$), 263, 383
    Euler’s infinite product ($\cos \pi z$), 364
    Euler’s infinite product ($\cos \pi z$), Proof I, 364
    Euler’s infinite product ($\cos \pi z$), Proofs II, III, IV, 364
    Euler’s infinite product ($\pi^2/15$), 384
    Euler’s infinite product ($\pi^2/6$), 350, 375, 383
    Euler’s infinite product ($\pi^4/90$), 384
    Euler’s infinite product ($\sin \pi x$), 238
    Euler’s infinite product ($\sin \pi z$), 349, 359
    Euler’s infinite product ($\sin \pi z$), Proof I, 241
    Euler’s infinite product ($\sin \pi z$), Proof II, 249
    Euler’s infinite product ($\sin \pi z$), Proof III, 361
    Euler’s infinite product ($\sin \pi z$), Proof IV, 362
    Euler’s partial fraction ($\frac{\pi}{4 \cos \frac{\pi z}{2}}$), 369
    Euler’s partial fraction ($\frac{\pi}{\sin \pi z}$), 366, 369
    Euler’s partial fraction ($\pi \tan \frac{\pi z}{2}$), 369
    Euler’s partial fraction ($\pi z \cot \pi z$), 366
    Euler’s partial fraction ($\pi / \sin \pi z$), 350
    Euler’s product ($\frac{\pi}{2} = \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdots$), 350, 386
    Euler’s product ($\frac{\pi}{4} = \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{7}{8} \cdots$), 350, 386
    $\gamma = 1 - \sum_{n=2}^{\infty} \frac{1}{n} \left(\zeta(n) - 1\right)$, 339
    $\gamma = \frac{3}{2} - \log 2 - \sum_{n=2}^{\infty} \frac{(-1)^n}{n} (n-1) \left(\zeta(n) - 1\right)$, 339
    $\gamma = \sum_{n=2}^{\infty} \frac{(-1)^n}{n} \zeta(n)$, 269, 339
    $4/\pi$ continued fraction, 389
    Gregory-Leibniz-Madhava, 239
    Gregory-Leibniz-Madhava’s, 154, 234, 269
    Gregory-Leibniz-Madhava, Proof I, 254
    Gregory-Leibniz-Madhava, Proof II, 340
    Gregory-Leibniz-Madhava, Proof III, 386
    Gregory-Madhava’s arctangent, 338
    half-angle for (co)sine, 200
    hyperbolic cotangent continued fraction, 439
    hyperbolic secant power series, 330
    hyperbolic tangent continued fraction, 441
    $\log(1 + x) = \cfrac{x}{1 + \cfrac{1^2 x}{(2 - 1x) + \cfrac{2^2 x}{(3 - 2x) + \cdots}}}$, 402
    $\log 2 = \cfrac{1}{1 + \cfrac{1^2}{1 + \cfrac{2^2}{1 + \cdots}}}$, 396
    $\log 2 = \sum_{n=2}^{\infty} \frac{1}{2^n} \zeta(n)$, 309
    Lord Brouncker’s, 233, 389, 397
    Machin’s, 234, 270, 342
    partial fraction of $1/\sin^2 x$, 257
    partial fraction of $1/\sin^2 x$, Proof I, 257
    partial fraction of $1/\sin^2 x$, Proof II, 257
    $\Phi = \lim_{n\to\infty} \frac{F_{n+1}}{F_n}$, 422
    $\Phi = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{F_n F_{n+1}}$, 422
    $\Phi^{-1} = \sum_{n=2}^{\infty} \frac{(-1)^n}{F_n F_{n+2}}$, 422
    $\pi = 3 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cdots}}}$, 400
    $\pi \cot \pi x = \cfrac{1}{x + \cfrac{x^2}{1 - 2x + \cfrac{(1-x)^2}{2x + \cdots}}}$, 403
    $\frac{\pi}{2} = 1 + \cfrac{1}{1 + \cfrac{1 \cdot 2}{1 + \cfrac{2 \cdot 3}{1 + \cdots}}}$, 399
    $\frac{\cos \frac{\pi x}{2}}{\frac{\pi}{2}} = x + 1 + \cfrac{(x+1)^2}{-2 \cdot 1 + \cfrac{(x-1)^2}{-2 + \cdots}}$, 403
    $\frac{\sin \pi x}{\pi x} = 1 - \cfrac{x}{1 + \cfrac{(1-x)^2}{2x + \cfrac{(1+x)^2}{1 - 2x + \cdots}}}$, 403
    $\frac{\sin \pi x}{\pi x} = 1 - \cfrac{x}{1 + \cfrac{1 \cdot (1-x)}{x + \cfrac{1 \cdot (1+x)}{1 - x + \cdots}}}$, 403
    $\frac{\tan \pi x}{\pi x} = 1 + \cfrac{x}{1 - 2x + \cfrac{(1-x)^2}{2x + \cdots}}$, 403
    secant power series, 330
    Seidel’s, 349, 353
    Seidel’s for $\frac{\log \theta}{\theta - 1}$, 354
    $\frac{6}{\pi^2} = 0^2 + 1^2 - \cfrac{1^4}{1^2 + 2^2 - \cfrac{2^4}{2^2 + 3^2 - \cfrac{3^4}{3^2 + 4^2 - \cdots}}}$, 399
    $\frac{\pi^2}{6} = \cfrac{1}{0^2 + 1^2 + \cfrac{-1^4}{1^2 + 2^2 + \cfrac{-2^4}{2^2 + 3^2 + \cfrac{-3^4}{3^2 + 4^2 + \cdots}}}}$, 399
    $\frac{6}{\pi^2} = 1 - \frac{1}{2^2} - \frac{1}{3^2} \cdots + \frac{\mu(n)}{n^2} + \cdots$, 384
    Sondow’s, 247
    $\sqrt{\pi} = \lim_{n\to\infty} \frac{(n!)^2\, 2^{2n}}{(2n)!\, \sqrt{n}}$, 248
    tangent continued fraction, 445
    tangent power series, 329
    Viète’s, 154, 233, 237, 240
    Wallis’, 234, 237, 247
    $\frac{1}{\zeta(z)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^z}$, 375
    $\zeta(z) = \prod \left(1 - \frac{1}{p^z}\right)^{-1}$, Proof I, II, 373
    $\zeta(z) = \prod \left(1 - \frac{1}{p^z}\right)^{-1}$, Proof III, 379
    $\zeta(z)^2 = \sum_{n=1}^{\infty} \frac{\tau(n)}{n^z}$, 380
    $\frac{\zeta(2z)}{\zeta(z)} = \sum_{n=1}^{\infty} \frac{\lambda(n)}{n^z}$, 380
    $(k+2)\,\zeta(k+1) = \sum_{\ell=1}^{k-2} \zeta(k-\ell)\,\zeta(\ell+1) + 2 \sum_{n=1}^{\infty} \frac{H_n}{n^k}$, 265
    $\zeta(k) = \sum_{\ell=1}^{k-2} \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \frac{1}{m^{\ell} (m+n)^{k-\ell}}$, 310
Fraction rules, 56
Function(s)
    absolute value, 17
    additive, 171
    bijective, 18
    characteristic, 20
    codomain of, 16
    component, 161
    composition of, 17
    confluent hypergeometric limit function, 438
    constant, 20
    continuous, 166
    cosecant, 200
    cosine, 144, 198
    cotangent, 200
    definition, 16
    Dirichlet, 17, 159
    Dirichlet eta, 376
    domain of, 16
    equal, 20
    even, 299
    exponential, 94, 140, 187, 285
    exponential, continued fraction, 442
    graph of, 16
    greatest integer, 69
    Hurwitz zeta, 381
    hyperbolic, 208
    hyperbolic cosine, 200
    hyperbolic sine, 200
    hypergeometric, 438
    identity, 20
    image of, 16
    injective, 18
    inverse, 18
    inverse or arc cosine (complex), 226
    inverse or arc cosine (real), 218
    inverse or arc sine (complex), 226
    inverse or arc sine (real), 218
    inverse or arc tangent (complex), 223
    inverse or arc tangent (real), 218
    inverse or arc tangent continued fraction, 398
    jump, 186
    Liouville’s, 380
    logarithm, 189
    Möbius, 375, 379
    monotone, 182
    multiple-valued, 219
    multiplicative, 171
    nondecreasing, 182
    odd, 299
    one-to-one, 18
    onto, 18
    range of, 16
    Riemann zeta, 192, 269, 285
    Riemann zeta (in terms of Möbius function), 375
    Riemann zeta (in terms of primes), 373
    secant, 200
    sine, 144, 198
    strictly decreasing, 182
    strictly increasing, 182
    surjective, 18
    tangent, 200
    target of, 16
    the most important, 153, 187
    value of, 16
    Zeno, 182
Fundamental recurrence relations, 405
Fundamental theorem
    of algebra, 210
    of algebra, proof I, 211
    of algebra, proof II, 213
    of algebra, proof III, 216
    of arithmetic, 45
Galois, Evariste, 193, 452
Game
    coin, 34
    of Nim, 40, 51
    Towers of Hanoi, 33
Game of Nim, 40, 51
Gauss’ test, 293
Gauss, Carl, 10, 164
    fundamental theorem of algebra, 210
on Borchardt’s algorithm, 235 continued fraction, 439
quote, 81, 146, 199 Hyperbolic tangent
Generalized power rules theorem, 190 continued fraction, 441
Geometric mean, 34 Hypergeometric function, 438
Geometric series, 126
Geometric series theorem, 126 i, 81
Gilfeather, Frank, 323 I Kings, 227, 423
Glaisher, James Identity
quote, 340 Cassini, 411
Golden Ratio, 34 Euler’s, 198
Golden ratio, 94, 113, 392, 414 function, 20
continued fraction, 94, 123, 392 Lagrange, 78
false rumors, 113, 392 Pythagorean, 199
infinite continued square root, 94, 113 Identity theorem, 298
the “most irrational” number, 427
Good approximation, 423 If and only if statements, 13
Goodwin, Edwin, 228 II Chronicles, 227, 423
Goto, Hiroyuki, 153 Image of a set, 18
Graph, 16 Image of function, 16
Greatest common divisor, 46 Imaginary unit, 81
Greatest integer function, 69 Independent events, 377
Greatest lower bound, 64 Index of a polynomial, 89
Gregorian calendar, 433 Induction, 23
Gregory, James, 234, 338, 340 Inductive definitions, 28
Gregory-Madhava’s arctangent series, 338 Inequalities
rules, 38
Hadamard, Jacques, 287 strict, 23
Halmos, Paul, 3 Inequality
Harmonic Arithmetic-geometric mean, 34, 195
product, 351 Bernoulli, 30
series, 125, 130 Cauchy-Bunyakovskiĭ-Schwarz, 74
Hermite, Charles, 271, 445 Schwarz, 74
almost integer, 206 Inequality lemma, 174
Heron of Alexandria, 79 Infimum, 64
Heron’s formula, 79 Infinite
Hidden assumptions, 12 countably, 83
High school interval, 4
graph of the exponential function, 188 limits, 164
horizontal line test, 185 product, 240
i = √−1, 215 series, 124
logarithms, 196 set, 83
long division, 149 uncountably, 83
plane trigonometry facts, 207 Infinite product, 102
trig identities, 199 Injective, 18
zeros of (co)sine, 205 inner product, 73
Hilbert, David, 389 Inner product space, 74
Hobbes, Thomas, 4 Integer(s), 35
Hofbauer, Josef, 249, 251, 257 almost, 206
Holy bible, 153, 227, 423 as a set, 4
House bill No. 246, 153, 228 Intermediate value property, 176
Hurwitz zeta function, 381 Intermediate value theorem, 176
Hurwitz, Adolf, 381 Intersection
Huygens, Christian, 433, 437 of sets, 7
Hyperbolic family of sets, 9
cosine, 200 Interval(s)
secant, power series of, 330 end points, 4
sine, 200 chains of, 181
Hyperbolic cotangent closed, 4
infinite, 4 Leap year, 433
left-half open, 4 Least upper bound, 63
nested, 69 Left-hand limit, 163
nontrivial, 170 Lehmer, D.H., 342
of music, 435 Leibniz, Gottfried, 79, 234, 239, 340
open, 4 function word, 15
right-half open, 4 on the series 1 − 1 + 1 · · · , 93
Inverse hyperbolic functions, 226 quote, 435
Inverse image of a set, 18 Lemma
Inverse of a function, 18 Abel’s, 271
Inverse or arc cosine (complex), 226 compactness, 173
Inverse or arc cosine (real), 218 connectedness, 174
Inverse or arc sine (complex), 226 inequality, 174
Inverse or arc sine (real), 218 Length
Inverse or arc tangent (complex), 223 Euclidean, 74
Inverse or arc tangent (real), 218 of complex numbers, 82
Irrational number, 55 Limit
Irrationality infimum, 281
of $e^r$ for r ∈ Q, 441 Limit comparison test, 137
of e, proof I, 143 Limit comparison test, 137
of e, proof II, 276 lim inf, 281
of e, proof III, 420 Limit points and sequences lemma, 155
of log r for r ∈ Q, 446 lim sup, 280
of logarithmic numbers, 62 Limit(s)
of √2, 57, 60, 63 iterated, 301
of trigonometric numbers, 60, 63 left-hand, 163
Isolated point, 167 of a function, 155
Iterated of a sequence, 95
limits, 301 open ball definition for functions, 156
series, 145, 302 open ball definition for sequences, 96
Jones, William, 153, 228 point, 154
Julian calendar, 433 right-hand, 163
Jump, 183 Limits
discontinuity, 183 at infinity, 163
function, 186 Lindemann, Ferdinand, 153, 228
linear polynomial, 88
Kanada, Yasumasa, 153, 234, 259, 340 Liouville number, 467
Kasner’s number, 117 Liouville’s function, 380
Kasner, Edward, 117 Liouville’s theorem, 467
Khayyam, Omar, 434 Liouville, Joseph, 380, 467
King-Fang, 437 log 2, 154, 194
Knopp, Konrad, 344 $\log 2 = \cfrac{1}{1 + \cfrac{1^2}{1 + \cfrac{2^2}{1 + \cdots}}}$, 396
Knott, Ron, 417 $\log 2 = \sum_{n=2}^{\infty} \frac{1}{2^n} \zeta(n)$, 309
Kornerup, 437 as alternating harmonic series, 154, 194
Kortram, R.A., 372
k-th power free, 381 rearrangement, 311
Kummer’s test, 290 Logarithm
Kummer, Ernest, 290 common, 62
complex, 221
L’Hospital’s rule, 195 power series, 336
Lagrange identity, 78 function, 189
Lagrange, Joseph-Louis, 450 general bases, 196
Lambert, Johann, 153, 228, 441 natural, 189
Landau, Edmund, 291 Logarithmic test, 294
Lange, Jerry, 399 Logical quantifiers, 14
Law of cosines, 78 Lower bound, 64
Law of sines, 79 greatest, 64
Leading coefficient, 59 Lucas numbers, 418
Lucas, François, 418 vectors, 73
Multiplicative inverse
Möbius function, 375 existence for complex numbers, 80
Möbius inversion formula, 379 existence for reals, 54
Möbius, Ferdinand, 375 Multiplicity of a root, 88
Machin’s formula, 342 Multiply by conjugate trick, 157
Machin, John, 153, 228, 234, 342 Music, 435
Madhava of Sangamagrama, 234, 332, 338,
340 $N_n$, 83
Massaging (an expression), 97, 117, 156, 157 n-th term test, 124
Mathematical induction, 27 Natural numbers (N), 22
Max/min value theorem, 175 as a set, 4
Maximum Negation of a statement, 14
of a set, 70, 113 Nested intervals, 69
strict, 180 Nested intervals theorem, 70
Mazur, Marcin, 63 Newcomb, Simon, 226
Measurement of the circle, 226, 228 Newton, Sir Isaac, 332
Meister, Gary, 323 Niven, Ivan, 322
Mengoli, Pietro, 249 Nondecreasing
Mercator, Gerhardus, 437 function, 182
Mercator, Nicolaus, 332 sequence, 111
Mersenne number, 48 Nonincreasing
Mersenne prime, 48 function, 182
Mersenne, Marin, 48 sequence, 111
Nonnegative series test, 125
Mertens’ multiplication theorem, 316
Nontrivial interval, 170
Mertens, Franz, 316
Norm, 76
Method
ball, 76
AGM (to compute π), 233
box, 79
bisection, 181
Euclidean, 74
of partial fractions, 125, 128, 210
sup (or supremum), 79
Minimum
Normed space, 76
of a set, 70
Null sequence, 96
MIT cheer, 198
Number theorem series, 310
Mnemonic
Number(s)
π, 153
algebraic, 88
Modular arithmetic, 47
approximable, 464
Monotone
Bernoulli, 262, 270, 327
criterion, 111
composite, 44
function, 182
coprime, 47, 153, 234, 378
sequence, 111
Euler, 270, 330
Monotone criterion theorem, 112
extended real, 109, 280
Monotone inverse theorem, 185
Liouville, 467
Monotone subsequence theorem, 113
perfect, 48
“Most”
prime, 44
extreme irrational, 468
relatively prime, 47, 234, 378
important equation $e^{i\pi} + 1 = 0$, 203
square-free, 153, 322
important function, 153, 187 transcendental, 88
irrational number, 427
real numbers are transcendental, 83 Odd
Multinomial theorem, 34 number, 44
Multiple, 41 function, 299
Multiple-valued function, 219 1/n-principle, 67
Multiplicative function, 171 One-to-one, 18
Multiplicative identity Onto, 18
existence for complex numbers, 80 Open
existence for integers, 36 ball, 77
existence for natural numbers, 22 interval, 4
existence for reals, 54 set, 173
Order laws index of, 89
integers, 38 leading coefficient of, 59
natural numbers, 22 linear, 88
reals, 55 Pope Gregory XIII, 433
Ordered field, 54 Positivity property
Orthogonal, 77 integers, 36
Oscillation theorem, 203 reals, 54
Power
p-series, 133, 134 complex, 190, 222
p-test, 134 rules (generalized), 190
Pandigital, 432 rules (integer powers), 57
Pappus of Alexandria, 456 rules (rational powers), 67
Paradox Power series, 287, 295
Russell, Bertrand, 11 composition of, 325
Vredenduin’s, 90 Power series composition theorem, 325
Parallelogram law, 77 Power series division theorem, 327
Partial Preimage of a set, 18
products, 351 Preservation of inequalities theorem
sum, 124 limits of functions, 162
Partial quotients, 416 limits of sequences, 106
Pascal’s method, 34 Prime(s)
Pascal’s rule, 33 definition, 44
Pascal, Blaise, 349 infinite series of, 270, 320
Peirce, Benjamin, 203 infinitude, 44
Pell equation, 458 sparseness, 46
Pell, John, 459 Primitive Pythagorean triple, 47, 432
Perfect number, 48 Principal
Period argument, 219
of a b-adic expansion, 149 inverse hyperbolic cosine, 226
of a continued fraction, 447 inverse hyperbolic sine, 226
Periodic inverse or arc cosine, 226
b-adic expansions, 149 inverse or arc sine, 226
continued fraction, 447 inverse or arc tangent, 223
purely periodic continued fraction, 452 logarithm, 221
Persian calendar, 437 n-th root, 215
Pfaff, Johann, 235 value of $a^z$, 222
Φ, see also Golden ratio Principle
π, 198, 226 ε, 102
and the unit circle, 204 of mathematical induction, 27
continued fraction, 400, 419 1/n, 67
definition of, 202 pigeonhole, 84
formulas, 233 Principle of mathematical induction, 27
origin of letter, 228 Pringsheim’s theorem
Viète’s formula, 154, 233, 237 for double sequences, 302
Pianos, 435 for double series, 302
Pigeonhole principle, 84 for sequences, 137
Pochhammer symbol, 438 Pringsheim, Alfred, 302
Pochhammer, Leo August, 438 Probability, 377
Poincaré, Henri, 26, 189 number is square-free, 234, 378
Point numbers being coprime, 234, 378
isolated, 167 Product
limit, 154 Cartesian, 15
Pointwise discontinuous, 170 Cauchy, 315
Polar decomposition, 83 of sequences, 108
Polar representation, polar coordinates, 206 Proof
Polynomial by cases, 39
complex, 87 contradiction, 25
degree of, 59 contrapositive, 24, 39
direct, 24 Wigner, Eugene, 79
Properly divergent Zagier, Don Bernard, 44
functions, 164 Quotient, 41, 42
sequences, 109 of sequences, 108
Property(ies) Quotients
intermediate value, 176 complete, 416
of Arctan, 224 partial, 416
of lim inf/sup, 281
of lim inf/sup theorem, 281 Raabe’s test, 291
of power series, 295 Raabe, Joseph, 291
of sine and cosine, 199 Radius of convergence, 287
of the complex exponential, 141 of (co)tangent, 388
of the logarithm, 189 Ramanujan, Srinivasa
of the real exponential, 188 approximation to π, 432
of zero and one theorem, 39 quote, 432
Purely periodic continued fraction, 452 Range of function, 16
Puzzle Ratio comparison test, 138
antipodal points on earth, 180 Ratio test
barber, 3 for sequences, 111
bent wire puzzle, 179 for series, 284
coconut, 409 Rational numbers
irrational-irrational, 195 as a set, 4
mountain, 178 Rational numbers (Q), 54
numbers one more than their cubes, 179 Rational zeros theorem, 59
rational-irrational, 192 Real numbers (R), 53
square root, 57 Rearrangement, 311
Which is larger, $\pi^e$ or $e^{\pi}$?, 206
Pythagorean identity, 199 fundamental, 405
Pythagorean theorem, 77 Wallis-Euler, 404
Pythagorean triple, 47, 432, 463 Reductio ad absurdum, 25
relatively prime numbers, 47, 234, 378
Quadratic irrational, 448 Remainder, 42
Quote Remmert, Reinhold, 210
Abel, Niels, 123 Rhind (or Ahmes) papyrus, 227
Bernoulli, Jacob, 125, 269 Rhind, Henry, 227
Brahmagupta, 459 Richardson, Lewis, 11
Cantor, Georg, 83 Riemann zeta function, 192, 269, 285, 373,
d’Alembert, Jean Le Rond, 95, 284 375
Dirac, Paul, 237 infinite product, 350
Eddington, Sir Arthur, 21 Riemann’s rearrangement theorem, 312
Euler, Leonhard, 41, 93 Right-hand limit, 163
Galois, Evariste, 193 Ring, 36
Gauss, Carl, 10, 81, 146, 164, 199 Root
Glaisher, James, 340 of a complex number, 210
Halmos, Paul, 3 of a real number, 57
Hermite, Charles, 271 or zero, 178
Hilbert, David, 389 principal n-th, 215
Hobbes, Thomas, 4 rules, 66
Leibniz, Gottfried, 79, 435 Root test
MIT cheer, 198 for sequences, 110
Newcomb, Simon, 226 for series, 284
Peirce, Benjamin, 203 Roots of polynomials, 87
Poincaré, Henri, 26, 189 Rule
Ramanujan, Srinivasa, 432 L’Hospital’s, 195
Richardson, Lewis, 11 Rules
Russell, Bertrand, 11 absolute value, 39
Schrödinger, Erwin, 55 divisibility, 42
Smith, David, 153 fraction, 56
inequality, 38 compact, 173
of sign, 37 complement of, 8
power (general), 190 connected, 173
power (integers), 57 countable, 83
power (rational), 67 countably infinite, 83
root, 66 cover of, 172
Russell’s paradox, 11 definition, 4
Russell, Bertrand dense, 170
barber puzzle, 3 difference of, 7
paradox, 11 disconnected, 173
quote, 11 disjoint, 7
empty, 4
s-adic expansions, 151 family of, 8
Salamin, Eugene, 233 finite, 83
Schrödinger, Erwin, 55 image of, 18
Schwartz inequality, 77 infimum of, 64
Schwarz inequality, 74 infinite, 83
Schwarz, Hermann, 74 intersection of, 7
Searcóid, Mícheál, 213 inverse image of, 18
Secant function, 200 least upper bound of, 63, 64
power series, 330 limit point of, 154
Seidel, Ludwig, 349, 353 lower bound of, 64
Semiperimeter, 79 maximum of, 113
Septimal system, 50 open, 173
Sequence, 17 preimage of, 18
Cauchy, 117 supremum of, 63
contractive, 120 uncountable, 83
definition, 94 union of, 7
difference, 122 upper bound of, 63
double, 145, 300 Venn diagram of, 8
Fibonacci, 34, 131, 289, 341, 411 zero, 171
Lucas, 418 Shanks, William, 234
monotone, 111 Sharp, Abraham, 343
nondecreasing, 111 Sine function, 144, 198
nonincreasing, 111 Smith, David, 153
null, 96 Somayaji, Nilakantha, 340
of bounded variation, 272 Sondow, Jonathan, 247
of sets, 8 Square root, 57
strictly decreasing, 114 Square-free numbers, 153, 234, 322, 378
strictly increasing, 114 Squaring the circle, 153, 228
subsequence of, 106 Squeeze theorem
tail, 103 functions, 162
Sequence criterion sequences, 105
for continuity, 167 Standard deviation, 463
for limits of functions, 159 Stirling’s formula, weak form, 115
Series Strict maxima, 180
absolute convergence, 135 Strictly decreasing
alternating harmonic, 136 sequence, 114
binomial, 333 Strictly increasing
geometric, 126 function, 182
Gregory-Leibniz-Madhava, 239, 340 sequence, 114
harmonic, 125, 130 Subcover, finite, 172
involving number and sum of divisors, 310 Subsequence, 106
iterated, 145, 302 Subset, 5
Leibniz’s, 239, 340 Sum by curves theorem, 304
power, 287, 295 ∑ 1/p, 320
telescoping, 128 Summation
Set(s) arithmetic progression, 29
by curves, 304 Bolzano-Weierstrass, 114
by squares, 303 boundedness, 175
by triangles, 304 Cauchy’s arithmetic mean, 344
definition, 28 Cauchy’s double series, 145, 305
geometric progression, 30 Cauchy’s multiplication, 318
of powers of integers, 272, 331 Cauchy-Hadamard, 287
Pascal’s method, 34 Continued fraction convergence, 416
Summation by parts, 271 continuity of power series, 295
Sup (or supremum) norm, 79 Dirichlet for rearrangements, 314
Supremum, 63 Dirichlet’s approximation, 430
Surjective, 18 existence of complex n-th roots, 215
FTA, proof I, 211
Tøndering, Claus, 433 FTA, proof II, 213
Tail of a sequence, 103 FTA, proof III, 216
Tails theorem for sequences, 103 generalized power rules, 190
Tails theorem for series, 126 identity, 298
tangent intermediate value, 176
continued fraction, 445 Liouville’s, 467
Tangent function, 200 max/min value, 175
power series of, 329 Mertens’ multiplication, 316
Tannery’s theorem for products, 359 monotone criterion, 112
Tannery’s theorem for series, 138 monotone inverse, 185
Tannery, Jules, 138 monotone subsequence, 113
Target of function, 16 oscillation, 203
Telescoping series, 128 π and the unit circle, 204
Telescoping series theorem, 128 power series composition, 325
generalization, 131 power series division, 327
Temperament, 436 Pringsheim’s, for double sequences, 302
Terminating Pringsheim’s, for double series, 302
decimal, b-adic expansion, 148 Pringsheim’s, for sequences, 137
Tertiary system, 51 properties of the complex exponential, 141
Test(s) properties of the logarithm, 189
Abel’s, 277 properties of the real exponential, 188
alternating series, 275 Riemann’s rearrangement, 312
Cauchy condensation, 133 sum by curves, 304
comparison, 132 Tannery’s for products, 359
De Morgan and Bertrand’s, 293 Tannery’s for series, 138
Dirichlet, 273 Thomae’s function, 169, 186
Gauss’, 293 Thomae, Johannes, 169
Kummer’s, 290 Towers of Hanoi, 33
limit comparison, 137 Transcendental number, 88
logarithmic, 294 Transformation rules, 394
n-th term test, 124 Transitive law
nonnegative series test, 125 cardinality, 84
p-test, 134 for inequalities, 23
Raabe’s, 291 for sets, 6
ratio comparison, 138 Triangle inequality
ratio for sequences, 111 for R^m, 75
ratio for series, 284 for integers, 40
root for sequences, 110 for series, 135
root for series, 284 Trick(s)
The Sand Reckoner, 456 divisibility, 53
Theorem ε/3, 277
Abel’s limit, 296 ε/2, 102, 119
Abel’s multiplication, 317 multiply by conjugate, 157
Basic properties of sine and cosine, 199 probability that a number is divisible by
best approximation, 425, 429 k, 378
binomial, 31 to find N, 97
Trisect an angle, 228
Tropical year, 433
2-series, see also Euler’s sum for π²/6

Unary continued fraction, 422
Uncountability
of transcendental numbers, 89
of irrational numbers, 87
of real numbers, 87
Uncountable, 83
Union
of sets, 7
family of sets, 9
Uniqueness
additive identities and inverses for Z, 37
multiplicative inverse for R, 56
of limits for functions, 160
of limits for sequences, 102
Upper bound, 63
least, 63

Value of function, 16
Vanden Eynden, Charles, 325
Vardi, Ilan, 339, 341
Vector space, 73
Vectors, 72
Venn diagram, 8
Venn, John, 8
Viète, François, 154, 233, 349
Volterra’s theorem, 170
Volterra, Vito, 170
Vredenduin’s paradox, 90

Waldo, C.A., 228
Wallis’ formulas, 247
Wallis, John, 234, 237, 247
infinity symbol ∞, 5
Wallis-Euler recurrence relations, 404
Wantzel, Pierre, 228
Weierstrass, Karl, 114
Weinstein, Eric, 432
Weisstein, Eric, 247
Well-ordering (principle) of N, 25
Wigner, Eugene, 79
Wiles, Andrew, 462
Williams’ formula, 259
Williams’ other formula, 265
Williams, G.T., 258

Yasser, 437

Zagier, Don Bernard, 44
Zeno of Elea, 182
Zeno’s function, 182
Zero
of a function, 178
set of a function, 171
Zeta function, 134, see also Riemann zeta function
