EleAna PDF
Aspects of Analysis:
On the incredible infinite
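\[
\begin{aligned}
\frac{\pi^2}{6} &= \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \\
&= \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdot \frac{7^2}{7^2 - 1} \cdot \frac{11^2}{11^2 - 1} \cdots \\
&= \cfrac{1}{0^2 + 1^2 - \cfrac{1^4}{1^2 + 2^2 - \cfrac{2^4}{2^2 + 3^2 - \cfrac{3^4}{3^2 + 4^2 - \cfrac{4^4}{4^2 + 5^2 - \cdots}}}}}
\end{aligned}
\]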
Paul Loya
Contents
Preface
Acknowledgement
Bibliography
Index
Preface
I have truly enjoyed writing this book. Admittedly, some of the writing is
too overdone (e.g. overdoing alliteration at times), but what can I say, I was
having fun. The “starred” sections of the book are meant to be “just for fun”
and don’t interfere with other sections (besides perhaps other starred sections).
Most of the quotes that you’ll find in these pages are taken from the website
http://www-gap.dcs.st-and.ac.uk/~history/Quotations/
This is a first draft, so please email me any errors, suggestions, comments etc.
about the book to
paul@math.binghamton.edu.
The overarching goals of this textbook are similar to those of any advanced math
textbook, regardless of the subject:
Goals of this textbook. The student will be able to . . .
• comprehend and write mathematical reasonings and proofs.
• wield the language of mathematics in a precise and effective manner.
• state the fundamental ideas, axioms, definitions, and theorems upon which real
analysis is built and flourishes.
• articulate the need for abstraction and the development of mathematical tools
and techniques in a general setting.
The objectives of this book make up the framework of how these goals will be
accomplished, and more or less follow the chapter headings:
Objectives of this textbook. The student will be able to . . .
• identify the interconnections between set theory and mathematical statements
and proofs.
• state the fundamental axioms of the natural, integer, and real number systems
and how the completeness axiom of the real number system distinguishes this
system from the rational system in a powerful way.
• apply the rigorous ε-N definition of convergence for sequences and series and
recognize monotone and Cauchy sequences.
• apply the rigorous ε-δ definition of limits for functions and continuity and the
fundamental theorems of continuous functions.
• determine the convergence and properties of an infinite series, product, or con-
tinued fraction using various tests.
• identify series, product, and continued fraction formulæ for the various elemen-
tary functions and constants.
I’d like to thank Brett Bernstein for looking over the notes and giving many
valuable suggestions.
Finally, some last words about my book. This is not a history book (but we
try to talk history throughout this book) and this is not a “little” book like Herbert
Westren Turnbull’s book The Great Mathematicians, but like Turnbull, I do hope
Paul Loya
Binghamton University, Vestal Parkway, Binghamton, NY 13902
paul@math.binghamton.edu
Some of the most beautiful formulæ in the world
Here is a very small sample of the many beautiful formulas we’ll prove in this
book involving some of the main characters we’ll meet in our journey.
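\[ e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}} \qquad \text{(Euler; §5.2)} \]
\[ \gamma = \sum_{n=2}^{\infty} \frac{(-1)^n}{n}\, \zeta(n) \qquad \text{(Euler; §6.9)} \]
\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots \qquad \text{(Seidel; §7.1)} \]
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}}}} \qquad \text{(§3.3)} \]
\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}} \qquad \text{(§3.4)} \]
\[ \frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots \qquad \text{(Euler; §6.11)} \]
\[ \frac{\pi^4}{90} = \frac{1}{1^4} + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots \qquad \text{(Euler; §7.5)} \]
\[ \frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \qquad \text{(Gregory-Leibniz-Madhava; §6.10)} \]
\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots \qquad \text{(Viète; §4.10)} \]
\[ \frac{\pi}{2} = \frac{1}{1} \cdot \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdots \qquad \text{(Wallis; §6.10)} \]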
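\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}} \qquad \text{(Lord Brouncker; §5.2)} \]
\[ e^x = \cfrac{1}{1 - \cfrac{2x}{x + 2 + \cfrac{x^2}{6 + \cfrac{x^2}{10 + \cfrac{x^2}{14 + \ddots}}}}} \qquad \text{(Euler; §8.7)} \]
\[ \sin \pi z = \pi z \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2} \right) \qquad \text{(Euler; §7.3)} \]
\[ \zeta(z) = \prod_{p} \left( 1 - \frac{1}{p^z} \right)^{-1} = \prod_{p} \frac{p^z}{p^z - 1} \qquad \text{(Euler–Riemann; §7.6)} \]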
A word to the student
One can imagine mathematics as a movie with exciting scenes, action, plots,
etc. There are a couple of things you can do. First, you can simply sit back and
watch the movie playing out. Second, you can take an active role in shaping the
movie. A mathematician does both at times, but is more the actor rather than the
observer. I recommend you be the actor in the great mathematics movie. To do
so, I recommend you read this book with a pencil and paper at hand writing down
definitions, working through examples filling in any missing details, and of course
doing exercises (even the ones that are not assigned).1 Of course, please feel free
to mark up the book as much as you wish with remarks and highlighting and even
corrections if you find a typo or error. (Just let me know if you find one!)
1There are many footnotes in this book. Most are quotes from famous mathematicians and
others are remarks that I might say to you if I were reading the book with you. All footnotes may
be ignored if you wish!
Part 1
Try to answer this question. (Does the barber shave himself or does someone else
shave him?) In any case, the idea of a set is perhaps the most fundamental idea in
all of mathematics. Sets can be combined to form other sets and the study of such
operations is called the algebra of sets, which we cover in Section 1.1. In Section 1.2
we look at the relationship between set theory and the language of mathematics.
Second to sets in fundamental importance is the idea of a function, which we cover
in Section 1.3. In order to illustrate relevant examples of sets, we shall presume
elementary knowledge of the real numbers. A thorough discussion of real numbers
is left for the next chapter.
This chapter is short on purpose since we do not want to spend too much time
on set theory so as to start real analysis ASAP. In the words of Paul Halmos [91,
p. vi], “... general set theory is pretty trivial stuff really, but, if you want to be a
mathematician, you need some, and here it is; read it, absorb it, and forget it.”
Chapter 1 objectives: The student will be able to . . .
• manipulate and create new sets from old ones using the algebra of sets.
• identify the interconnections between set theory and math statements/proofs.
• define functions and the operations of functions on sets.
4 1. SETS, FUNCTIONS, AND PROOFS
are open intervals, denoted by (−∞, a) and (a, ∞), respectively, and
{x ∈ R ; x ≤ a}, {x ∈ R ; a ≤ x}
are closed intervals, denoted by (−∞, a] and [a, ∞), respectively. Note that the
sideways eight symbol ∞ for “infinity,” introduced in 1655 by John Wallis (1616–
1703) [45, p. 44], is just that, a symbol, and is not to be taken to be a real number.
The real line is itself an interval, namely R = (−∞, ∞).
To say that two sets A and B are the same just means that they contain exactly
the same elements; in other words, every element in A is also in B (that is, A ⊆ B)
and also every element in B is also in A (that is, B ⊆ A). Thus, we define
A = B to mean that A ⊆ B and B ⊆ A.
A set that is a subset of every set is the empty set ∅. To see that ∅ is a subset
of every set, let A be a set. We must show that the statement if x ∈ ∅, then x ∈ A
is true. However, the part “x ∈ ∅” of this statement sounds strange because the
empty set has nothing in it, so x ∈ ∅ is an untrue statement to begin with. Before
evaluating the statement “If x ∈ ∅, then x ∈ A,” let us first discuss general “If ...
then” statements. Consider the following statement made by Joe:
If Professor Loya cancels class on Friday, then I’m driving to New York City.
Obviously, Joe told the truth if indeed I cancelled class and he headed off
to NYC. What if I did not cancel class but Joe still went to NYC? Did Joe tell
the truth, lie, or neither (that is, his statement simply does not apply)? Mathematicians
would say that Joe told the truth (regardless of the outcome, Joe went to NYC or
stayed in Binghamton). He only said if the professor cancels class, then he would
drive to NYC. All bets are off if the professor does not cancel class. This is the
standing convention mathematicians take for any “If ... then” statement. Thus,
given statements P and Q, we consider the statement “If P, then Q” to be true if,
whenever the statement P is true, the statement Q is also true, and we also regard it
as being true if the statement P is false, whether the statement Q is true or
false. There is no such thing as a “neither statement” in this book.2
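The convention can be tabulated mechanically. Here is a minimal sketch in Python; the helper name `implies` is our own choice for illustration and is not notation used elsewhere in this book.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of "If P, then Q" under the convention described above."""
    # False only when P is true and Q is false; true in every other case.
    return (not p) or q

# Print all four cases of the truth table.
for p, q in product([True, False], repeat=2):
    print(f"P={p!s:5}  Q={q!s:5}  (If P, then Q)={implies(p, q)}")
```

In Joe's terms: the only row where he lied is the one in which class is cancelled (P true) and he does not drive to New York City (Q false).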
Now back to our problem. We want to prove that if x ∈ ∅, then x ∈ A. Since
x ∈ ∅ is untrue, by our convention, the statement “if x ∈ ∅, then x ∈ A” is true
by default. Thus, ∅ ⊆ A. We can also see that there is only one empty set, for
suppose that ∅′ is another empty set. Then the same (silly) argument that we just
did for ∅ shows that ∅′ is also a subset of every set. Now to say that ∅ = ∅′, we
must show that ∅ ⊆ ∅′ and ∅′ ⊆ ∅. But ∅ ⊆ ∅′ holds because ∅ is a subset of
every set and ∅′ ⊆ ∅ holds because ∅′ is a subset of every set. Therefore, ∅ = ∅′.
There is another, perhaps easier, way to see that ∅ is a subset of any set by
invoking the “contrapositive”. Consider again the statement that A ⊆ B:
(1) If x ∈ A, then x ∈ B.
This is equivalent to the contrapositive statement
(2) If x ∉ B, then x ∉ A.
Indeed, suppose that statement (1) holds, that is, A ⊆ B. We shall prove that
statement (2) holds. So, let us assume that x ∉ B is true; is it true that x ∉ A?3
Well, the object x is either in A or it’s not. If x ∈ A, then, since A ⊆ B, we must
have x ∈ B. However, we know that x ∉ B, and so x ∈ A is not the valid option,
and therefore x ∉ A. Assume now that statement (2) holds: If x ∉ B, then x ∉ A.
We shall prove that statement (1) holds, that is, A ⊆ B. So, let x ∈ A. We must
prove that x ∈ B. Well, either x ∈ B or it’s not. If x ∉ B, then we know that
x ∉ A. However, we are given that x ∈ A, so x ∉ B is not the correct option;
therefore, the other option x ∈ B must be true. Therefore, (1) and (2) really say
the same thing. We now prove that ∅ ⊆ A for any given set A. Assume that
x ∉ A. According to (2), we must prove that x ∉ ∅. But this last statement is true
because ∅ does not contain anything, so x ∉ ∅ is certainly true. Thus, ∅ ⊆ A.
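For readers who like to experiment, the equivalence of (1) and (2) can be checked case by case. The following is only a sketch in Python: the helper `implies` and the sample sets are our own illustrative choices, and a finite check is of course no substitute for the proof just given.

```python
from itertools import product

def implies(p, q):
    # "If P, then Q" is false only when P is true and Q is false.
    return (not p) or q

# (1) "If P, then Q" and (2) "If not Q, then not P" agree in all four cases.
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p)
    for p, q in product([True, False], repeat=2)
)
print(contrapositive_agrees)

# The empty set is a subset of each sample set; <= on Python sets tests inclusion.
samples = [set(), {1}, {"e", "pi"}, set(range(10))]
empty_is_subset = all(set() <= A for A in samples)
print(empty_is_subset)
```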
The following theorem states an important law of sets.
Theorem 1.1 (Transitive law). If A ⊆ B and B ⊆ C, then A ⊆ C.
Proof. Suppose that A ⊆ B and B ⊆ C. We need to prove that A ⊆ C,
which by definition means that if x ∈ A, then x ∈ C. So, let x be in A; we need
to show that x is also in C. Since x is in A and A ⊆ B, we know that x is also in
B. Now B ⊆ C, and therefore x is also in C. In conclusion, we have proved that if
x ∈ A, then x ∈ C, which is exactly what we wanted to prove.
Finally, we remark that the power set of a given set A is the collection con-
sisting of all subsets of A, which we usually denote by P(A).
Example 1.6.
P({e, π}) = {∅, {e}, {π}, {e, π}}.
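Example 1.6 can be reproduced on a computer. The sketch below is in Python; the function name `power_set` and the strings "e" and "pi" standing in for the numbers e and π are our own choices for illustration.

```python
from itertools import chain, combinations

def power_set(A):
    """Return the set of all subsets of A, each as a frozenset."""
    items = list(A)
    # Collect the subsets of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

P = power_set({"e", "pi"})
print(len(P))
```

Frozensets are used because Python does not allow ordinary (mutable) sets to be elements of a set.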
2Later in your math career you will find some “neither statements” such as the continuum
hypothesis . . . but this is another story!
3Recall our convention that for a false statement P, we always consider a statement “If P,
then Q” to be true regardless of the validity of the statement Q. Therefore, if “x ∉ B” is false,
then statement (2) is automatically true regardless of the validity of the statement “x ∉ A”,
so in order to prove that statement (2) is true, we might as well assume that the statement “x ∉ B” is
true and try to show that the statement “x ∉ A” is also true.
1.1. THE ALGEBRA OF SETS AND THE LANGUAGE OF MATHEMATICS 7
Example 1.9.
{0, 1, e, i} \ {e, i, π, √2} = {0, 1}.
[Figure 1.3: the intervals An = [0, 1/n] on the number line, with common left endpoint 0 and right endpoints 1, 1/2, 1/3, 1/4, 1/5, 1/6, . . .]
How do we define the intersection of all the sets Aα in a family {Aα ; α ∈ I}?
Consider the case of two sets A and B. We can write
A ∩ B = {x ; x ∈ A and x ∈ B}
= {x ; x is in every set on the left-hand side}.
With this as motivation, we define the intersection of all the sets Aα to be
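\[ \bigcap_{\alpha \in I} A_\alpha := \{x \,;\, x \in A_\alpha \text{ for every } \alpha \in I\}. \]
Example 1.13. For the sequence An = [0, 1/n] in Figure 1.3, we have
\[ \bigcap_{n \in \mathbb{N}} A_n := \{x \,;\, x \in [0, 1/n] \text{ for every } n \in \mathbb{N}\} = \{0\}. \]
To simplify notation, we sometimes just write $\bigcap A_\alpha$ or $\bigcap_\alpha A_\alpha$ for the left-hand
side. If I = {1, 2, . . . , N} is a finite set of natural numbers, then we usually denote
$\bigcup_\alpha A_\alpha$ and $\bigcap_\alpha A_\alpha$ by $\bigcup_{n=1}^{N} A_n$ and $\bigcap_{n=1}^{N} A_n$, respectively. If I = N, then we
usually denote $\bigcup_\alpha A_\alpha$ and $\bigcap_\alpha A_\alpha$ by $\bigcup_{n=1}^{\infty} A_n$ and $\bigcap_{n=1}^{\infty} A_n$, respectively.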
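To get a feel for why the intersection in Example 1.13 is {0}, one can experiment with exact rational arithmetic. This is a rough sketch in Python; the function names are hypothetical, and since a computer can only test finitely many n, the code illustrates the example rather than proving it.

```python
from fractions import Fraction

def in_A(x: Fraction, n: int) -> bool:
    """Membership test for the closed interval A_n = [0, 1/n]."""
    return 0 <= x <= Fraction(1, n)

def in_all_up_to(x: Fraction, N: int) -> bool:
    """Is x in every A_n for n = 1, ..., N? (A finite stand-in for the full intersection.)"""
    return all(in_A(x, n) for n in range(1, N + 1))

def first_excluding_index(x: Fraction) -> int:
    """For x > 0, return the smallest n with x not in A_n, that is, with 1/n < x."""
    n = 1
    while x <= Fraction(1, n):
        n += 1
    return n

print(in_all_up_to(Fraction(0), 1000))          # 0 lies in every A_n tested
print(first_excluding_index(Fraction(1, 100)))  # 1/100 drops out at n = 101
```

The point 0 survives every test, while any positive x eventually falls outside some A_n, which is exactly what the equality with {0} asserts.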
Theorem 1.3. Let A be a set and {Aα } be a family of sets. Then union and
intersections distribute in the sense that
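\[ A \cap \bigcup_\alpha A_\alpha = \bigcup_\alpha (A \cap A_\alpha), \qquad A \cup \bigcap_\alpha A_\alpha = \bigcap_\alpha (A \cup A_\alpha) \]
that $x \in A$ and $x \in \bigcup_\alpha A_\alpha$, which is to say, $x \in A \cap \bigcup_\alpha A_\alpha$. In summary, we have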
established both inclusions in (1.1), which proves the equality of the sets.
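We shall prove the first De Morgan law and leave the second to the reader. We
need to show that $A \setminus \bigcup_\alpha A_\alpha = \bigcap_\alpha (A \setminus A_\alpha)$, which means that
\[ \text{(1.2)} \qquad A \setminus \bigcup_\alpha A_\alpha \subseteq \bigcap_\alpha (A \setminus A_\alpha) \quad \text{and} \quad \bigcap_\alpha (A \setminus A_\alpha) \subseteq A \setminus \bigcup_\alpha A_\alpha. \]
To prove the first inclusion, let $x \in A \setminus \bigcup_\alpha A_\alpha$. This means $x \in A$ and $x \notin \bigcup_\alpha A_\alpha$.
For x not to be in the union, it must be that $x \notin A_\alpha$ for any α whatsoever (because
if x happened to be in some $A_\alpha$, then x would be in the union $\bigcup_\alpha A_\alpha$, which we
know x is not). Hence, $x \in A$ and $x \notin A_\alpha$ for all α, in other words, $x \in A \setminus A_\alpha$
for all α, which means that $x \in \bigcap_\alpha (A \setminus A_\alpha)$. We now prove the second inclusion
in (1.2). So, let $x \in \bigcap_\alpha (A \setminus A_\alpha)$. This means that $x \in A \setminus A_\alpha$ for all α. Therefore,
$x \in A$ and $x \notin A_\alpha$ for all α. Since x is not in any $A_\alpha$, it follows that $x \notin \bigcup_\alpha A_\alpha$.
Therefore, $x \in A$ and $x \notin \bigcup_\alpha A_\alpha$ and hence, $x \in A \setminus \bigcup_\alpha A_\alpha$. In summary, we have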
established both inclusions in (1.2), which proves the equality of the sets.
The best way to remember De Morgan’s laws is the English versions: The
complement of a union is the intersection of the complements and the complement
of an intersection is the union of the complements. For a family {Aα } consisting of
just two sets B and C, the distributive and De Morgan laws are just
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
and
A \ (B ∪ C) = (A \ B) ∩ (A \ C), A \ (B ∩ C) = (A \ B) ∪ (A \ C).
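The distributive and De Morgan laws are easy to test on small finite sets. The following Python sketch uses a sample set A and a sample three-member family that are our own arbitrary choices; checking one example does not prove the laws, but a failed check would expose a misremembered formula.

```python
A = set(range(0, 10))
family = [{1, 2, 3}, {2, 3, 4, 11}, {3, 5, 8, 13}]   # a sample family {A_alpha}

union_all = set().union(*family)          # union of all the A_alpha
inter_all = set.intersection(*family)     # intersection of all the A_alpha

# Distributive laws.
dist_1 = (A & union_all) == set().union(*(A & S for S in family))
dist_2 = (A | inter_all) == set.intersection(*(A | S for S in family))

# De Morgan laws: complement of a union, complement of an intersection.
dm_1 = (A - union_all) == set.intersection(*(A - S for S in family))
dm_2 = (A - inter_all) == set().union(*(A - S for S in family))

print(dist_1, dist_2, dm_1, dm_2)
```

Here `&`, `|`, and `-` are Python's built-in set intersection, union, and difference.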
Here are some exercises where we ask you to prove statements concerning sets.
In Problem 3 it is very helpful to draw Venn diagrams to “see” why the statement
should be true. Some advice that is useful throughout this whole book: If you can’t
see how to prove something after some effort, take a break and come back to the
problem later.4
Exercises 1.1.
1. Prove that ∅ = {x ; x ≠ x}. True, false, or neither: If x ∈ ∅, then real analysis is
everyone’s favorite class.
2. Prove that for any set A, we have A ∪ ∅ = A and A ∩ ∅ = ∅.
3. Prove the following statements:
(a) A \ B = A ∩ Bᶜ.
(b) A ∩ B = A \ (A \ B).
(c) B ∩ (A \ B) = ∅.
(d) If A ⊆ B, then B = A ∪ (B \ A).
(e) A ∪ B = A ∪ (B \ A).
(f) A ⊆ A ∪ B and A ∩ B ⊆ A.
(g) If A ∩ B = A ∩ C and A ∪ C = A ∪ B, then B = C.
(h) (A \ B) \ C = (A \ C) \ (B \ C).
4Finally, two days ago, I succeeded - not on account of my hard efforts, but by the grace of
the Lord. Like a sudden flash of lightning, the riddle was solved. I am unable to say what was
the conducting thread that connected what I previously knew with what made my success possible.
Carl Friedrich Gauss (1777–1855) [67].
1.2. SET THEORY AND MATHEMATICAL STATEMENTS 11
that any given object is either in or not in a given set, but never both. In a day when
“there are no absolutes” is commonly taught in high school, it may take a while
to fully grasp the language of mathematics. A mathematical statement always has
hypotheses or assumptions, and a conclusion. Almost always there are hidden
assumptions, that is, assumptions that are not stated, but taken for granted,
because the context makes it clear what these assumptions are. Whenever you read
a mathematical statement, make sure that you fully understand the hypotheses or
assumptions (including hidden ones) and the conclusion. For the statement “If
x ∈ A, then x ∈ B”, the assumption is x ∈ A and the conclusion is x ∈ B. The
“if-then” wording means: If the assumptions (x ∈ A) are true, then the conclusion
(x ∈ B) is also true, or stated another way, given that the assumptions are true, the
conclusion follows. Let P denote the statement that x ∈ A and Q the statement
that x ∈ B. Then the following statements are all equivalent, that is, the truth
of any one statement implies the truth of any of the other statements:7
(1.4)   If P, then Q;   P implies Q;   Given P, Q holds;
        P only if Q;   Q if P;   If not Q, then not P.
Each of these statements is for P being x ∈ A and Q being x ∈ B, but as you
can probably guess, they work for any mathematical statements P and Q. Let us
consider statements concerning real numbers.
Example 1.14. Let P be the statement that x > 5. Let Q be the statement
that x² > 100. Then the following statements are all equivalent:
If x > 5, then x² > 100; x > 5 implies x² > 100; Given x > 5, x² > 100;
x > 5 only if x² > 100; x² > 100 if x > 5; If x² ≤ 100, then x ≤ 5.
The hidden assumptions are that x represents a real number and that the real
numbers satisfy all the axioms you think they do. Of course, any one (and hence
every one) of these six statements is false. For instance, x = 6 > 5 is true, but
x² = 36, which is not greater than 100.
Example 1.15. Let P be the statement that x² = 2. Let Q be the statement
that x is irrational. Then the following statements are all equivalent:
If x² = 2, then x is irrational; x² = 2 implies x is irrational;
Given x² = 2, x is irrational; x² = 2 only if x is irrational;
x is irrational if x² = 2; If x is rational, then x² ≠ 2.
Again, the hidden assumptions are that x represents a real number and that the
real numbers satisfy all their usual properties. Any one (and hence every one) of
these six statements is of course true (since we have been told since high school that ±√2
are irrational; we shall prove this fact in Section 2.6).
As these two examples show, it is very important to remember that none of the
statements in (1.4) assert that P or Q is true; they simply state if P is true, then
Q is also true.
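The false statement of Example 1.14 and its contrapositive fail at exactly the same values of x, as equivalent statements must. Here is a small Python sketch; restricting to the integers from −20 to 20 is our own simplification, since the statements themselves concern all real numbers.

```python
xs = range(-20, 21)  # integer sample only; the statements are about all real x

# Counterexamples to "If x > 5, then x^2 > 100": P holds but Q fails.
direct = [x for x in xs if x > 5 and not (x * x > 100)]

# Counterexamples to the contrapositive "If x^2 <= 100, then x <= 5".
contra = [x for x in xs if x * x <= 100 and not (x <= 5)]

print(direct)   # [6, 7, 8, 9, 10]
print(contra)   # the same list
```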
7P implies Q is sometimes translated as “P is sufficient for Q” in the sense that the truth of
P is sufficient or enough or ample to imply that Q is also true. P implies Q is also translated “Q
is necessary for P ” because Q is necessarily true given that P is true. However, we shall not use
this language in this book.
1.2.2. Converse statements and “if and only if” statements. Given a
statement P implies Q, the reverse statement Q implies P is called the converse
statement. For example, back to set theory, the converse of the statement
If x ∈ A, then x ∈ B; that is, A ⊆ B,
is just the statement that
If x ∈ B, then x ∈ A; that is, B ⊆ A.
These set theory statements make it clear that the converse of a true statement may
not be true, for {e, π} ⊆ {e, π, i}, but {e, π, i} ⊈ {e, π}. Let us consider examples
with real numbers.
Example 1.16. The statement “If x² = 2, then x is irrational” is true, but its
converse statement, “If x is irrational, then x² = 2,” is false.
Statements for which the converse is equivalent to the original statement are
called “if and only if” statements.
Example 1.17. Consider the statement “If x = −5, then 2x + 10 = 0.” This
statement is true. Its converse statement is “If 2x + 10 = 0, then x = −5.” By
solving the equation 2x + 10 = 0, we see that the converse statement is also true.
The implication x = −5 =⇒ 2x + 10 = 0 can be written
(1.5) 2x + 10 = 0 if x = −5,
while the implication 2x + 10 = 0 =⇒ x = −5 can be written
(1.6) 2x + 10 = 0 only if x = −5.
Combining the two statements (1.5) and (1.6) into one statement, we get
2x + 10 = 0 if and only if x = −5,
which is often denoted by a double arrow
2x + 10 = 0 ⇐⇒ x = −5,
or in more common terms, 2x + 10 = 0 is equivalent to x = −5. We regard the
statements 2x + 10 = 0 and x = −5 as equivalent because if one statement is true,
then so is the other one; hence the wording “is equivalent to”. In summary, if both
statements
Q if P (that is, P =⇒ Q) and Q only if P (that is, Q =⇒ P )
hold, then we write
Q if and only if P or Q ⇐⇒ P.
Also, if you are asked to prove a statement “Q if and only if P ”, then you have
to prove both the “if” statement “Q if P ” (that is, P =⇒ Q) and the “only if”
statement “Q only if P ” (that is, Q =⇒ P ).
The if and only if notation ⇐⇒ comes in quite handy in proofs whenever we
want to move from one statement to an equivalent one.
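Example 1.18. Recall that in the proof of Theorem 1.3, we wanted to show
that $A \cap \bigcup_\alpha A_\alpha = \bigcup_\alpha (A \cap A_\alpha)$, which means that $A \cap \bigcup_\alpha A_\alpha \subseteq \bigcup_\alpha (A \cap A_\alpha)$ and
$\bigcup_\alpha (A \cap A_\alpha) \subseteq A \cap \bigcup_\alpha A_\alpha$; that is,
\[ x \in A \cap \bigcup_\alpha A_\alpha \implies x \in \bigcup_\alpha (A \cap A_\alpha) \quad \text{and} \quad x \in \bigcup_\alpha (A \cap A_\alpha) \implies x \in A \cap \bigcup_\alpha A_\alpha, \]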
Just make sure that if you use ⇐⇒, the expressions to the immediate left and
right of ⇐⇒ are indeed equivalent.
Similarly, the negation of a “there is” statement becomes a “for every” statement.
Explicitly, the negation of
“For at least one x, Q” is the statement “For every x, not Q.”
For instance, with the understanding that x represents a real number, the negation
of “There is an x such that x² = 2” is “For every x, x² ≠ 2”.
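On a finite universe, the rules for negating “there is” and “for every” statements correspond to Python's built-in `any` and `all`. The sketch below is illustrative only: the universe of integers from −50 to 50 and the predicates `Q` and `P` are our own choices, not part of the text.

```python
xs = range(-50, 51)  # a finite universe of integers standing in for "every x"

def Q(x):
    return x * x == 2

# not (there is an x with Q)  agrees with  (for every x, not Q)
lhs = not any(Q(x) for x in xs)
rhs = all(not Q(x) for x in xs)

def P(x):
    return (2 * x + 1) % 2 == 1

# not (for every x, P)  agrees with  (there is an x with not P)
lhs2 = not all(P(x) for x in xs)
rhs2 = any(not P(x) for x in xs)

print(lhs == rhs, lhs2 == rhs2)
```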
Exercises 1.2.
1. In this problem all numbers are understood to be real. Write down the contrapositive
and converse of the following statement:
If x² − 2x + 10 = 25, then x = 5,
and determine which (if any) of the three statements are true.
2. Write the negation of the following statements, where x represents an integer.
(a) For every x, 2x + 1 is odd.
(b) There is an x such that 2x + 1 is prime.8
3. Here are some more set theory proofs to brush up on.
(a) Prove that (Aᶜ)ᶜ = A.
(b) Prove that A = A ∪ B if and only if B ⊆ A.
(c) Prove that A = A ∩ B if and only if A ⊆ B.
[Figure: the graph of τ, consisting of the points (x, y) = (x, τ(x)) = (x, sin x), with the x-axis as the domain, the y-axis as the codomain, and the range R(τ) = [−1, 1].]
We shall also deal quite a bit with complex-valued functions, that is, functions
whose codomain is C. Then if we say, “let f be a complex-valued function on
[0, 1]”, we mean that f : [0, 1] −→ C is a function. Here are some more examples.
Example 1.22. Consider the function s : N −→ R defined by
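\[ s = \left\{ \left( n, \frac{(-1)^n}{n} \right) ;\ n \in \mathbb{N} \right\} \subseteq \mathbb{N} \times \mathbb{R}. \]
We usually denote $s(n) = \frac{(-1)^n}{n}$ by $s_n$ and write $\{s_n\}$ for the function s, and we
call $\{s_n\}$ a sequence of real numbers. We shall study sequences in great depth in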
Chapter 3.
Example 1.23. Here is a “piecewise” defined function: a : R −→ R,
\[ a(x) = \begin{cases} x & \text{if } x \ge 0; \\ -x & \text{if } x < 0. \end{cases} \]
Of course, a(x) is usually denoted by |x| and is called the absolute value function.
[Figure 1.6: the graph of f(x) = x², showing the image f([−3, −2]) = [4, 9] and the inverse image f⁻¹([4, 9]) = [−3, −2] ∪ [2, 3].]
Example 1.27. Let f(x) = x² with domain and range in R. Then as we can
see in Figure 1.6,
f([−3, −2]) = [4, 9] and f⁻¹([4, 9]) = [−3, −2] ∪ [2, 3].
Here are more examples: You are invited to check that
f((1, 2]) = (1, 4], f⁻¹([−4, −1)) = ∅, f⁻¹((1, 4]) = [−2, −1) ∪ (1, 2].
The following theorem gives the main properties of images and inverse images.
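Theorem 1.4. Let f : X −→ Y, let B, C ⊆ Y, let {Bα} be a family of subsets of
Y, and let {Aα} be a family of subsets of X. Then
\[ f^{-1}(C \setminus B) = f^{-1}(C) \setminus f^{-1}(B), \qquad f^{-1}\Big( \bigcup_\alpha B_\alpha \Big) = \bigcup_\alpha f^{-1}(B_\alpha), \]
\[ f^{-1}\Big( \bigcap_\alpha B_\alpha \Big) = \bigcap_\alpha f^{-1}(B_\alpha), \qquad f\Big( \bigcup_\alpha A_\alpha \Big) = \bigcup_\alpha f(A_\alpha). \]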
Proof. Using the definition of inverse image and set difference, we have
x ∈ f⁻¹(C \ B) ⇐⇒ f(x) ∈ C \ B ⇐⇒ f(x) ∈ C and f(x) ∉ B
⇐⇒ x ∈ f⁻¹(C) and x ∉ f⁻¹(B)
⇐⇒ x ∈ f⁻¹(C) \ f⁻¹(B).
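The four identities of Theorem 1.4 can be tried out on a finite example. The following Python sketch assumes a finite domain X of integers and the squaring function; the helper names `image` and `preimage` and all of the sample sets are our own hypothetical choices for illustration.

```python
def image(f, A):
    """f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

def preimage(f, X, B):
    """The inverse image {x in X : f(x) in B}, for a function f with finite domain X."""
    return {x for x in X if f(x) in B}

X = set(range(-5, 6))
f = lambda x: x * x

B, C = {0, 1, 4}, {1, 4, 9, 16}
family_B = [{0, 1}, {1, 4}, {1, 9, 25}]      # sample subsets of the codomain
family_A = [{-3, -2}, {2, 3}, {0}]           # sample subsets of the domain

ok_diff = preimage(f, X, C - B) == preimage(f, X, C) - preimage(f, X, B)
ok_union = preimage(f, X, set().union(*family_B)) == set().union(
    *(preimage(f, X, S) for S in family_B))
ok_inter = preimage(f, X, set.intersection(*family_B)) == set.intersection(
    *(preimage(f, X, S) for S in family_B))
ok_image = image(f, set().union(*family_A)) == set().union(
    *(image(f, S) for S in family_A))

print(ok_diff, ok_union, ok_inter, ok_image)
print(sorted(preimage(f, X, {4, 9})))  # integer analogue of the inverse image of [4, 9]
```

The last line is the integer counterpart of Example 1.27: the points whose squares are 4 or 9 come in a negative group and a positive group.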
We end this section with some definitions needed for the exercises. Let X be
a set and let A be any subset of X. The characteristic function of A is the
function χA : X −→ R defined by
\[ \chi_A(x) = \begin{cases} 1 & \text{if } x \in A; \\ 0 & \text{if } x \notin A. \end{cases} \]
The sum and product of two characteristic functions χA and χB are the functions
χA + χB : X −→ R and χA · χB : X −→ R defined by
(χA + χB)(x) = χA(x) + χB(x) and (χA · χB)(x) = χA(x) · χB(x), for all x ∈ X.
Of course, the sum and product of any functions f : X −→ R and g : X −→ R
are defined in the same way. We can also replace R by, say C, or by any set Y as
long as “+” and “·” are defined on Y . Given any constant c ∈ R, we denote by
the same letter the function c : X −→ R defined by c(x) = c for all x ∈ X. This
is the constant function c. For instance, 0 is the function defined by 0(x) = 0
for all x ∈ X. The identity map on X is the map defined by I(x) = x for all
x ∈ X. Finally, we say that two functions f : X −→ Y and g : X −→ Y are equal
if f = g as subsets of X × Y , which holds if and only if f (x) = g(x) for all x ∈ X.
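These definitions translate almost word for word into code. Below is a minimal Python sketch; the names `chi`, `total`, `prod`, and `equal_on`, and the sample sets, are our own choices for illustration.

```python
def chi(A):
    """Characteristic function of A: returns 1 on A and 0 off A."""
    return lambda x: 1 if x in A else 0

X = set(range(10))
A, B = {1, 2, 3, 4}, {3, 4, 5}

chi_A, chi_B = chi(A), chi(B)
total = lambda x: chi_A(x) + chi_B(x)      # the sum, evaluated pointwise
prod = lambda x: chi_A(x) * chi_B(x)       # the product, evaluated pointwise

print([total(x) for x in sorted(X)])
print([prod(x) for x in sorted(X)])

def equal_on(f, g, X):
    """Two functions on X are equal when f(x) = g(x) for all x in X."""
    return all(f(x) == g(x) for x in X)
```

The helper `equal_on` mirrors the definition of equality of functions given above and may be handy when experimenting with the exercises.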
Exercises 1.3.
1. Which of the following subsets of R × R define functions from R to R?
(a) A₁ = {(x, y) ∈ R × R ; x² = y}, (b) A₂ = {(x, y) ∈ R × R ; x = sin y},
(c) A₃ = {(x, y) ∈ R × R ; y = sin x}, (d) A₄ = {(x, y) ∈ R × R ; x = 4y − 1}.
(Assume well-known properties of trig functions.) Of those sets which do define
functions, find the range of the function. Is the function one-to-one? Is it onto?
2. Let f(x) = 1 − x². Find
f([1, 4]), f([−1, 0] ∪ (2, 10)), f⁻¹([−1, 1]), f⁻¹([5, 10]), f(R), f⁻¹(R).
3. If f : X −→ Y and g : Z −→ X are bijective, prove that f ◦ g is a bijection and
(f ◦ g)⁻¹ = g⁻¹ ◦ f⁻¹.
4. Let f : X −→ Y be a function.
(a) Given any subset B ⊆ Y, prove that f(f⁻¹(B)) ⊆ B.
(b) Prove that f(f⁻¹(B)) = B for all subsets B of Y if and only if f is surjective.
(c) Given any subset A ⊆ X, prove that A ⊆ f⁻¹(f(A)).
(d) Prove that A = f⁻¹(f(A)) for all subsets A of X if and only if f is injective.
5. Let f : X −→ Y be a function. Show that f is one-to-one if and only if there is a
function g : Y −→ X such that g ◦ f is the identity map on X. Show that f is onto if
and only if there is a function h : Y −→ X such that f ◦ h is the identity map on Y .
6. (Cf. [152]) In this problem we give various applications of characteristic functions to
prove statements about sets. First, prove at least two of (a) – (e) of the following. (a)
χX = 1, χ∅ = 0; (b) χA · χB = χB · χA = χA∩B and χA · χA = χA; (c) χA∪B =
χA + χB − χA · χB; (d) χAᶜ = 1 − χA; (e) χA = χB if and only if A = B. Here are
some applications of these properties. Prove the distributive law:
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
by showing that the characteristic functions of each side are equal as functions. Then
invoke (e) to demonstrate equality of sets. Prove the nonobvious equality
(A ∩ Bᶜ) ∩ (Cᶜ ∩ A) = A ∩ (B ∪ C)ᶜ.
Here’s a harder question: Consider the sets (A ∪ B) ∩ C and A ∪ (B ∩ C). When, if
ever, are they equal? When is one set a subset of the other?
CHAPTER 2
You’ll have to wait for this mouth-watering subject until Section 2.6! In Section 2.7
we study the all-important property of the real numbers called the completeness
property, which in some sense says that real numbers can describe any length
whatsoever. In Sections 2.8 and 2.9, we leave the one-dimensional real line and
discuss m-dimensional space and the complex numbers (which is really just two-
dimensional space). Finally, in Section 2.10 we define “most” using cardinality and
show that “most” real numbers are not only irrational, they are transcendental.
Chapter 2 objectives: The student will be able to . . .
• state the fundamental axioms of the natural, integer, and real number systems.
22 2. NUMBERS, NUMBERS, AND MORE NUMBERS
• explain how the completeness axiom of the real number system distinguishes
this system from the rational number system in a powerful way.
• prove statements about numbers from basic axioms including induction.
• define Rm and C and the norms on these spaces.
• explain cardinality and how “most” real numbers are irrational or even
transcendental.
(O3) b < a, which by definition means that a = b + c for some natural number c.
Thus, 2 < 5 because 5 = 2 + c where c = 3. Of course, we write a ≤ b if a < b
or a = b. There are similar meanings for the opposite inequalities “>” and “≥”.
The inequality signs < and > are called strict. There is one more property of the
natural numbers called induction. Let M be a subset of N.
(I) Suppose that M contains 1 and that M has the following property: If n belongs
to M , then n + 1 also belongs to M . Then M contains all natural numbers.
The statement that M = N is “obvious” with a little thought. M contains 1.
Because 1 belongs to M , by (I), we know that 1+1 = 2 also belongs to M . Because
2 belongs to M , by (I) we know that 2 + 1 = 3 also belongs to M . Assuming we
can continue this process indefinitely makes it clear that M = N.
Everyday experience convinces us that the counting numbers satisfy properties
(A), (M), (D), (O), and (I). However, mathematically we will assume, or take by
faith, the existence of a set N with operations + and · that satisfy properties (A),
(M), (D), (O), and (I).2 From these properties alone we shall prove many well-
known properties of these numbers that we have accepted since grade school. It is
quite satisfying to see that many of the well-known properties about numbers that
are memorized (or even those that are not so well-known) can in fact be proven
from a basic set of axioms! The “rule of the game” for proving such properties is
that we are allowed to prove statements using only facts that we already know are
true, either because these facts were given to us in a set of axioms, or because these
facts have already been proven by us in this book, by your teacher in class, or by
you in an exercise.
2.1.2. Proofs of well-known high school rules. Again, you are going to
learn the language of proofs in the same way that a child learns to talk: by observing
others prove things and imitating them, and eventually you will get the hang of it.
We begin by proving the familiar transitive law.
Theorem 2.1 (Transitive law). If a < b and b < c, then a < c.
Proof. Suppose that a < b and b < c. Then by definition of less than (recall
the inequality law (O2) in Section 2.1.1), there are natural numbers d and e such
that b = a + d and c = b + e. Hence, by the associative law,
c = b + e = (a + d) + e = a + (d + e).
Thus, c = a + f where f = d + e ∈ N, so a < c by definition of less than.
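The proof is constructive: it produces the number f = d + e explicitly. The following Python sketch traces that construction. The function names are hypothetical, and the code uses subtraction of integers purely as a computational shortcut for finding d and e; subtraction is not among the axioms for N listed above.

```python
def less_than_witness(a: int, b: int):
    """Return the natural number d with b = a + d if a < b, else None."""
    d = b - a  # shortcut: integer subtraction, not one of the axioms for N
    return d if d >= 1 else None

def transitive_witness(a, b, c):
    """Given a < b and b < c, build f = d + e with c = a + f, as in the proof."""
    d = less_than_witness(a, b)
    e = less_than_witness(b, c)
    assert d is not None and e is not None, "need a < b and b < c"
    f = d + e
    assert c == a + f  # c = (a + d) + e = a + (d + e)
    return f

print(transitive_witness(2, 5, 9))  # d = 3, e = 4, so f = 7
```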
Before moving on, we briefly analyze this theorem in view of what we learned
in Section 1.2. The hypotheses or assumptions of this theorem are that a, b,
and c are natural numbers with a < b and b < c and the conclusion is that a < c.
Note that the fact that a, b, and c are natural numbers and that natural numbers
are assumed to satisfy all their arithmetic and order properties were left unwritten
in the statement of the proposition since these assumptions were understood within
the context of this section. The “if-then” wording means: If the assumptions are
true, then the conclusion is also true or given that the assumptions are true, the
conclusion follows. We can also reword Theorem 2.1 as follows:
a < b and b < c implies (also written =⇒) a < c;
2Taking the axioms of set theory by faith, which we are doing in this book even though we
haven’t listed many of them(!), we can define the natural numbers as sets, see [91, Sec. 11].
that is, the truth of the assumptions implies the truth of the conclusion. We can
also state this theorem as follows:
a < b and b < c only if a < c;
that is, the hypotheses a < b and b < c hold only if the conclusion a < c also holds,
or
a < c if a < b and b < c;
that is, the conclusion a < c is true if, or given that, the hypotheses a < b and b < c
are true. The kind of proof used in Theorem 2.1 is called a direct proof , where
we take the hypotheses a < b and b < c as true and prove that the conclusion a < c
is true. We shall see other types of proofs later. We next give another easy and
direct proof of the so-called “FOIL law” of multiplication. However, before proving
this result, we note that the distributive law (D) holds from the right:
(a + b) · c = ac + bc.
Indeed,
(a + b) · c = c · (a + b) commutative law
= (c · a) + (c · b) distributive law
= (a · c) + (b · c) commutative law.
Theorem 2.2 (FOIL law). For any natural numbers a, b, c, d, we have
(a + b) · (c + d) = ac + ad + bc + bd, (first + outside + inside + last).
Proof. We simply compute:
(a + b) · (c + d) = (a + b) · c + (a + b) · d distributive law
= (ac + bc) + (ad + bd) distributive law (from right)
= ac + (bc + (ad + bd)) associative law
= ac + ((bc + ad) + bd) associative law
= ac + ((ad + bc) + bd) commutative law
= ac + ad + bc + bd,
where at the last step we dropped parentheses as we know we can in sums of more
than two numbers (consequence of the associative law).
We now prove the familiar cancellation properties of high school algebra.
Theorem 2.3. Given any natural numbers a, b, c, we have
a+c=b+c if and only if a = b.
In particular, given a + c = b + c, we can “cancel” c, obtaining a = b.
Proof. Suppose that a = b. Then, because a and b are just different letters for
the same natural number, we have a + c = b + c.
We now have to prove that if a + c = b + c, then a = b. To prove this, we use
a proof by contraposition. This is how it works. We need to prove that if the
assumption “P : a + c = b + c” is true, then the conclusion “Q : a = b” is also true.
Instead, we shall prove the logically equivalent contrapositive statement: If the
conclusion Q is false, then the assumption P must also be false. The statement that Q
is false is just that a ≠ b and the statement that P is false is just that a + c ≠ b + c.
Thus, we must prove
if a ≠ b, then a + c ≠ b + c.
To this end, assume that a ≠ b; then either a < b or b < a. Because the notation
is entirely symmetric between a and b, we may presume that a < b. Then by
definition of less than, we have b = a + d for some natural number d. Hence, by
the associative and commutative laws,
b + c = (a + d) + c = a + (d + c) = a + (c + d) = (a + c) + d.
Thus, by definition of less than, a + c < b + c, so a + c ≠ b + c.
2.1.3. Induction. We all know that every natural number is greater than or
equal to one. Here is a proof!
Theorem 2.4. Every natural number is greater than or equal to one.
Proof. Rewording this as an “if-then” statement, we need to prove that if n
is a natural number, then n ≥ 1. To prove this, let M = {n ∈ N ; n ≥ 1}, the
collection of all natural numbers greater than or equal to one. Then M contains 1.
If a natural number n belongs to M , then by definition of M , n ≥ 1. This means
that n = 1 or n > 1. In the first case, n + 1 = 1 + 1, so by definition of less than,
1 < n + 1. In the second case, n > 1 means that n = 1 + m for some m ∈ N, so
n + 1 = (1 + m) + 1 = 1 + (m + 1). Again by definition of less than, 1 < n + 1. In
either case, n + 1 also belongs to M . Thus by induction, M = N.
2. Are there natural numbers a and b such that a = a + b? What logical inconsistency
happens if such an equation holds?
3. Prove the following statements.
(a) If $n^2 = 1$ (that is, n · n = 1), then n = 1.
(b) There does not exist a natural number n such that 2n = 1.
(c) There does not exist a natural number n such that 2n = 3.
4. Prove the following statements.
(a) If n ∈ N, then there is no m ∈ N such that n < m < n + 1.
(b) If n ∈ N, then there is a unique m ∈ N satisfying n < m < n + 2; in fact, prove
that the only such natural number is m = n + 1. (That is, prove that n + 1 satisfies
the inequality and if m also satisfies the inequality, then m = n + 1.)
5. Prove the following statements.
(a) $(a + b)^2 = a^2 + 2ab + b^2$, where $a^2$ means a · a and $b^2$ means b · b.
(b) For any fixed natural number c,
a = b if and only if a · c = b · c .
Conclude that 1 is the only multiplicative identity (that is, if a · c = c for some
a, c ∈ N, then a = 1).
(c) For any fixed natural number c,
a < b if and only if a + c < b + c .
Also prove that
a < b if and only if a · c < b · c .
(d) If a < b and c < d, then a · c < b · d.
6. Let A be a finite collection of natural numbers. Prove that A has a largest element,
that is, A contains a number n such that n ≥ m for every element m in A.
7. Many books take the well-ordering property as a fundamental axiom instead of the
induction axiom. Replace the induction axiom by the well-ordering property.
(a) Prove Theorem 2.4 using well-ordering. Suggestion: By well-ordering, N has a
least element, call it n. We need to prove that n ≥ 1. Assume that n < 1 and find
another natural number less than n to derive a contradiction.
(b) Prove the induction property.
(that is, all the statements P1 , P2 , P3 , . . . are true). See Figure 2.1 for a visual of
this concept.
We now illustrate this principle through some famous examples. In order to
present examples that have applicability in the sequel, we have to go outside the
realm of natural numbers and assume basic familiarity with integers, real, and
complex numbers. Integers will be discussed in the next section, real numbers in
Sections 2.6 and 2.7, and complex numbers in Section 2.9.
2.2.2. Inductive definitions: Powers and sums. We of course know what
$7^3$ is, namely 7 · 7 · 7. In general, we define $a^n$, where a is any complex number called
the base and n is a positive integer called the exponent, as follows:
\[ a^n := \underbrace{a \cdot a \cdots a}_{n \text{ times}}. \]
(Recall that “:=” means “equals by definition”.)
Example 2.1. We can also define $a^n$ using induction. Let $P_n$ denote the
statement “the power $a^n$ is defined”. We define $a^1 := a$. Assume that $a^n$ has been
defined. Then we define $a^{n+1} := a^n \cdot a$. Thus, the statement $P_{n+1}$ holds, so by
induction $a^n$ is defined for any natural number n.
Example 2.2. Using induction, we prove that for any natural numbers m and
n, we have
\[ a^{m+n} = a^m \cdot a^n. \tag{2.1} \]
Indeed, let us fix the natural number m and let $P_n$ be the statement “Equation
(2.1) holds for the natural number n”. Certainly
\[ a^{m+1} = a^m \cdot a = a^m \cdot a^1 \]
holds by definition of $a^{m+1}$. Assume that (2.1) holds for a natural number n. Then
by definition of the power and our induction hypothesis,
\[ a^{m+(n+1)} = a^{(m+n)+1} = a^{m+n} \cdot a = a^m \cdot a^n \cdot a = a^m \cdot a^{n+1}, \]
which is exactly the statement $P_{n+1}$. If a ≠ 0 and we also define $a^0 := 1$, then as
the reader can readily check, (2.1) continues to hold even if m or n is zero.
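To see the inductive definition of Example 2.1 and the law (2.1) at work, here is a minimal Python sketch. The names power and check_power_law are our own illustrative choices, not part of the text, and the check only samples finitely many exponents, so it illustrates (2.1) rather than proves it.

```python
def power(a, n):
    """Return a^n for a natural number n, using a^1 := a and a^(n+1) := a^n * a."""
    if n < 1:
        raise ValueError("n must be a natural number (n >= 1)")
    if n == 1:
        return a                  # base case: a^1 := a
    return power(a, n - 1) * a    # inductive step: a^n := a^(n-1) * a


def check_power_law(a, max_exp):
    """Check a^(m+n) == a^m * a^n for all 1 <= m, n <= max_exp (hypothetical helper)."""
    return all(
        power(a, m + n) == power(a, m) * power(a, n)
        for m in range(1, max_exp + 1)
        for n in range(1, max_exp + 1)
    )
```

For instance, power(7, 3) multiplies 7 · 7 · 7, and check_power_law(3, 6) tests (2.1) for the base 3 and all exponents up to 6.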
In elementary calculus, we were introduced to the summation notation. Let
$a_0, a_1, a_2, a_3, \ldots$ be any list of complex numbers. For any natural number n, we
define $\sum_{k=0}^{n} a_k$ as the sum of the numbers $a_0, \ldots, a_n$:
\[ \sum_{k=0}^{n} a_k := a_0 + a_1 + \cdots + a_n. \]
By the way, in 1755 Euler introduced the sigma notation $\Sigma$ for summation [171].
2.2. THE PRINCIPLE OF MATHEMATICAL INDUCTION 29
Example 2.4. First, we shall prove that for every natural number n, the sum
of the first n integers is n(n + 1)/2; that is,
\[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2}. \tag{2.2} \]
Here, $P_n$ represents the statement “Equation (2.2) holds”. Certainly,
\[ 1 = \frac{1(1+1)}{2}. \]
Thus, our statement is true for n = 1. Suppose our statement holds for a number
n. Then adding n + 1 to both sides of (2.2), we obtain
\[ 1 + 2 + \cdots + n + (n+1) = \frac{n(n+1)}{2} + (n+1) = \frac{n(n+1) + 2(n+1)}{2} = \frac{(n+1)(n+1+1)}{2}, \]
which is exactly the statement $P_{n+1}$. Hence, by the principle of mathematical
induction, every single statement $P_n$ is true.
We remark that the high school way to prove $P_n$ is to write the sum of the first
n integers forward and backwards:
\[ S_n = 1 + 2 + \cdots + (n-1) + n \]
and
\[ S_n = n + (n-1) + \cdots + 2 + 1. \]
Notice that the sum of each column is just n + 1. Since there are n columns, adding
these two expressions, we obtain $2S_n = n(n + 1)$, which implies our result.
What if we only sum the odd integers? We get (proof left to you!)
\[ 1 + 3 + 5 + \cdots + (2n-1) = n^2. \]
Do you see why Figure 2.2 makes this formula “obvious”?
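Both summation formulas are easy to test numerically before (or after) proving them. The following Python sketch uses function names of our own choosing and checks only finitely many n, so it is a sanity check and not a substitute for the induction argument.

```python
def sum_first(n):
    """Return 1 + 2 + ... + n by direct addition."""
    return sum(range(1, n + 1))


def sum_first_odd(n):
    """Return 1 + 3 + 5 + ... + (2n - 1) by direct addition."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def formulas_hold(limit):
    """Compare both sums with their closed forms for n = 1, ..., limit."""
    return all(
        sum_first(n) == n * (n + 1) // 2 and sum_first_odd(n) == n * n
        for n in range(1, limit + 1)
    )
```

Calling formulas_hold(200) compares the direct sums with n(n + 1)/2 and with the square of n for every n up to 200.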
\[ \frac{1}{1-a} = 1 + a + a^2 + \cdots + a^n + \frac{a^{n+1}}{1-a}. \tag{2.4} \]
2. Using induction prove that for any complex numbers a and b and for any natural
numbers m and n, we have
\[ (ab)^n = a^n \cdot b^n, \qquad (a^m)^n = a^{mn}. \]
If a and b are nonzero, prove that these equations hold even if m = 0 or n = 0.
3. Prove the following (some of them quite pretty) formulas/statements via induction:
(a)
\[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1}. \]
(b)
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \]
(c)
\[ 1^3 + 2^3 + \cdots + n^3 = (1 + 2 + \cdots + n)^2 = \left( \frac{n(n+1)}{2} \right)^2. \]
(d)
\[ \frac{1}{2} + \frac{2}{2^2} + \frac{3}{2^3} + \cdots + \frac{n}{2^n} = 2 - \frac{n+2}{2^n}. \]
(e) For a ≠ 1,
\[ (1+a)(1+a^2)(1+a^4) \cdots (1+a^{2^n}) = \frac{1 - a^{2^{n+1}}}{1-a}. \]
(f) $n^3 - n$ is always divisible by 3.
(g) Every natural number n is either even or odd. Here, n is even means that n = 2m
for some m ∈ N and n odd means that n = 1 or n = 2m + 1 for some m ∈ N.
(h) $n < 2^n$ for all n ∈ N. (Can you also prove this using Bernoulli’s
inequality?)
(i) Using the identity (2.7), called Pascal’s rule, prove that $\binom{n}{k}$ is a natural number
for all n, k ∈ N with 1 ≤ k ≤ n. ($P_n$ is the statement “$\binom{n}{k} \in \mathbb{N}$ for all 1 ≤ k ≤ n.”)
4. In this problem we prove some nifty binomial formulas. Prove that
\[ \text{(a)}\ \sum_{k=0}^{n} \binom{n}{k} = 2^n, \qquad \text{(b)}\ \sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0, \]
\[ \text{(c)}\ \sum_{k \text{ odd}} \binom{n}{k} = 2^{n-1}, \qquad \text{(d)}\ \sum_{k \text{ even}} \binom{n}{k} = 2^{n-1}, \]
where the sums in (c) and (d) are over k = 1, 3, . . . and k = 0, 2, . . ., respectively.
5. (Towers of Hanoi) Induction can be used to analyze games! (See Problem 6 in the
next section for the game of Nim.) For instance, the towers of Hanoi starts with three
pegs and n disks of different sizes placed on one peg, with the biggest disk on the
bottom and with the sizes decreasing to the top as shown in Figure 2.3. A move is
made by taking the top disk off a stack and putting it on another peg so that there is
no smaller disk below it. The object of the game is to transfer all the disks to another
peg. Prove that the puzzle can be solved in $2^n - 1$ moves. (In fact, you cannot solve
the puzzle in fewer than $2^n - 1$ moves, but the proof of this is another story.)
6. (The coin game) Two people have n coins each and they put them on a table, in
separate piles, then they take turns removing their own coins; they may take as many
as they wish, but they must take at least one. The person removing the last coin(s)
wins. Using strong induction, prove that the second person has a “foolproof winning
strategy.” More explicitly, prove that for each n ∈ N, there is a strategy such that the
second person will win the game with n coins if he follows the strategy.
7. We now prove the arithmetic-geometric mean inequality (AGMI): For any
nonnegative (that is, ≥ 0) real numbers $a_1, \ldots, a_n$, we have
\[ (a_1 \cdots a_n)^{1/n} \le \frac{a_1 + \cdots + a_n}{n} \quad \text{or equivalently,} \quad a_1 \cdots a_n \le \left( \frac{a_1 + \cdots + a_n}{n} \right)^n. \]
The product $(a_1 \cdots a_n)^{1/n}$ is the geometric mean and the sum $\frac{a_1 + \cdots + a_n}{n}$ is the
arithmetic mean, respectively, of the numbers $a_1, \ldots, a_n$.
(i) Show that $\sqrt{a_1 a_2} \le \frac{a_1 + a_2}{2}$. Suggestion: Expand $(\sqrt{a_1} - \sqrt{a_2})^2 \ge 0$.
(ii) By induction show the AGMI holds for $2^n$ terms for every natural number n.
(iii) Now prove the AGMI for n terms where n is not necessarily a power of 2. (Do
not use induction.) Suggestion: Let $a = (a_1 + \cdots + a_n)/n$. By Problem 3h,
we know that $2^n - n$ is a natural number. Apply the AGMI to the $2^n$ terms
$a_1, \ldots, a_n, a, a, \ldots, a$, where there are $2^n - n$ occurrences of a in this list, to derive
the AGMI in general.
8. Here’s Newman’s proof [157] of the AGMI. The AGMI holds for one term, so assume
it holds for n terms; we shall prove that the AGMI holds for n + 1 terms.
(a) Prove that if the AGMI holds for all n + 1 nonnegative numbers $a_1, \ldots, a_{n+1}$ such
that $a_1 \cdots a_{n+1} = 1$, then the AGMI holds for any n + 1 nonnegative numbers.
(b) By (a), we just have to verify that the AGMI holds when $a_1 \cdots a_{n+1} = 1$. Using
the induction hypothesis, prove that $a_1 + \cdots + a_n + a_{n+1} \ge n (a_{n+1})^{-1/n} + a_{n+1}$.
(c) Prove that for any x > 0, we have $n x^{-1/n} + x \ge n + 1$. Suggestion: Replace
n by n + 1 and set $a = x^{1/n} - 1 > -1$ in Bernoulli’s inequality. Now prove that
$a_1 + \cdots + a_{n+1} \ge n + 1$, which is the AGMI for n + 1 terms.
9. (Fibonacci sequence) Define $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all n ≥ 2.
Using strong induction, prove that for every natural number n,
\[ F_n = \frac{1}{\sqrt{5}} \left[ \Phi^n - (-\Phi)^{-n} \right], \quad \text{where } \Phi = \frac{1 + \sqrt{5}}{2} \text{ is called the golden ratio.} \]
Suggestion: Note that $\Phi^2 = \Phi + 1$ and hence $-\Phi^{-1} = 1 - \Phi = (1 - \sqrt{5})/2$.
10. (Pascal’s method) Using a method due to Pascal, we generalize our formula (2.2)
for the sum of the first n integers to sums of powers. See [18] for more on Pascal’s
method. For any natural number k, put $\sigma_k(n) := 1^k + 2^k + \cdots + n^k$ and set $\sigma_0(n) := n$.
(i) Prove that
\[ (n+1)^{k+1} - 1 = \sum_{\ell=0}^{k} \binom{k+1}{\ell} \sigma_\ell(n). \]
Suggestion: The left-hand side can be written as $\sum_{m=1}^{n} \left[ (m+1)^{k+1} - m^{k+1} \right]$.
Use the binomial theorem on $(m+1)^{k+1}$.
(ii) Using the strong form of induction, prove that for each natural number k,
\[ \sigma_k(n) = \frac{1}{k+1} n^{k+1} + a_{kk} n^k + \cdots + a_{k2} n^2 + a_{k1} n \quad \text{(Pascal’s formula)}, \]
for some coefficients $a_{k1}, \ldots, a_{kk} \in \mathbb{Q}$.
(iii) (Cf. [124]) Using the fact that $\sigma_3(n) = \frac{1}{4} n^4 + a_{33} n^3 + a_{32} n^2 + a_{31} n$, find the
coefficients $a_{31}, a_{32}, a_{33}$. Suggestion: Consider the difference $\sigma_3(n) - \sigma_3(n-1)$.
Can you find the coefficients in the sum for $\sigma_4(n)$?
11. (The multinomial theorem) A multi-index is an n-tuple of nonnegative integers
and is usually denoted by a Greek letter, for instance $\alpha = (\alpha_1, \ldots, \alpha_n)$ where each
2.3. THE INTEGERS 35
Multiplication satisfies
(M1) a · b = b · a; (commutative law)
(M2) (a · b) · c = a · (b · c); (associative law)
(M3) there is an integer denoted by 1 “one” such that
1 · a = a = a · 1. (existence of multiplicative identity)
As with the natural numbers, the · is sometimes dropped and the associative laws
imply that expressions involving integers such as a+b+c or abc make sense without
using parentheses. Addition and multiplication are related by
(D) a · (b + c) = (a · b) + (a · c). (distributive law)
Of these arithmetic properties, the only additional properties listed that were not
listed for natural numbers are (A3) and (A4). As usual, we denote
a + (−b) = (−b) + a by a − b,
so that subtraction is, by definition, really just “adding negatives”. A set together
with operations of addition and multiplication that satisfy properties (A1) – (A4),
(M2), (M3), and (D) is called a ring; essentially a ring is just a set of objects closed
under addition, multiplication, and subtraction. If in addition, this multiplication
satisfies (M1), then the set is called a commutative ring.
The set of natural numbers or positive integers, which we denote by N or by Z+ , is
closed under addition and multiplication and has the following property: Given any
integer a, exactly one of the following “positivity” properties holds:
(P) a is a positive integer, a = 0, or −a is a positive integer.
Stated another way, property (P) means that Z is a union of disjoint sets,
Z = Z+ ∪ {0} ∪ −Z+ ,
where −Z+ consists of all integers a such that −a ∈ Z+ .
Everyday experience convinces us that the integers satisfy properties (A), (M),
(D), and (P); however, as with the natural numbers, we will assume the existence
of a set Z satisfying properties (A), (M), (D), and (P) such that Z+ = N, the
natural numbers. From the properties listed above, we shall derive some well-
known properties of the integers memorized since grade school.
2.3.2. Proofs of well-known high school rules. Since the integers satisfy
the same arithmetic properties as the natural numbers, the same proofs as in Section
2.1 prove that the distributive law (D) holds from the right and the FOIL law holds.
Also, the cancellation theorem 2.3 holds: Given any integers a, b, c,
a = b if and only if a + c = b + c .
However, now this statement is easily proved using the fact that the integers have
additive inverses. We only prove the “if” part: If a + c = b + c, then adding −c to
both sides of this equation we obtain
(a+c)+(−c) = (b+c)+(−c) =⇒ a+(c+(−c)) = b+(c+(−c)) =⇒ a+0 = b+0,
or a = b. Comparing this proof with that of Theorem 2.3 shows the usefulness of
having additive inverses.
We now show that we can always solve equations such as the one given at
the beginning of this section. Moreover, we prove that there is only one additive
identity.
Theorem 2.12 (Properties of zero and one). Zero and one satisfy
(1) If a · b = 0, then a = 0 or b = 0.
(2) If a · b = a where a ≠ 0, then b = 1; that is, 1 is the only multiplicative identity.
Proof. We give two proofs of (1). Although Proof I is acceptable, Proof
II is much preferred because Proof I boils down to a contrapositive statement
anyway, which Proof II goes to directly.
Proof I: Assume that ab = 0. We shall prove that a = 0 or b = 0. Now either
a = 0 or a ≠ 0. If a = 0, then we are done, so assume that a ≠ 0. We need to prove
that b = 0. Well, either b = 0 or b ≠ 0. However, it cannot be true that b ≠ 0, for
according to the properties (4), (5), and (6) of our rules for inequalities,
(2.8) if a ≠ 0 and b ≠ 0, then a · b ≠ 0.
But ab = 0, so b ≠ 0 cannot be true. This contradiction shows that b = 0.
Proof II: Our second proof of (1) is a proof by contraposition, which is
essentially what we did in Proof I without stating it! Recall, as already explained
in the proof of Theorem 2.3, that the technique of a proof by contraposition is that in
order to prove the statement “if a · b = 0, then a = 0 or b = 0,” we can instead try
to prove the contrapositive statement:
if a ≠ 0 and b ≠ 0, then a · b ≠ 0.
However, as explained above (2.8), the truth of this statement follows from our
inequality rules. This gives another (better) proof of (1).
To prove (2), assume that a · b = a where a ≠ 0. Then,
0 = a − a = a · b − a · 1 = a · (b − 1).
By (1), either a = 0 or b − 1 = 0. We are given that a ≠ 0, so we must have
b − 1 = 0, or adding 1 to both sides, b = 1.
Property (1) of this theorem is the basis for solving quadratic equations in high
school. For example, let us solve $x^2 - x - 6 = 0$. We first “factor”; that is, observe
that
\[ (x - 3)(x + 2) = x^2 - x - 6 = 0. \]
By property (1), we know that x − 3 = 0 or x + 2 = 0. Thus, x = 3 or x = −2.
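As a quick numerical companion to this example, the Python sketch below confirms on a sample range that the factored and expanded forms agree and that the only integer roots in that range are the two found above. The function names and the search window from −10 to 10 are our own illustrative choices.

```python
def quadratic(x):
    """The expanded form x^2 - x - 6."""
    return x * x - x - 6


def factored(x):
    """The factored form (x - 3)(x + 2)."""
    return (x - 3) * (x + 2)


# Integer roots found by brute force in a small window (an arbitrary choice).
roots = [x for x in range(-10, 11) if quadratic(x) == 0]

# The two forms agree at every sampled integer.
forms_agree = all(quadratic(x) == factored(x) for x in range(-50, 51))
```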
2.3.3. Absolute value. Given any integer a, we know that either a = 0, a is
a positive integer, or −a is a positive integer. The absolute value of the integer
a is denoted by |a| and is defined to be the “nonnegative part of a”:
\[ |a| := \begin{cases} a & \text{if } a \ge 0, \\ -a & \text{if } a < 0. \end{cases} \]
three stones each. The last one to remove a stone loses. Let Pn be the statement that
the player starting first has a foolproof winning strategy if n is of the form n = 4k,
4k + 2, or 4k + 3 for some k = 0, 1, 2, . . . and the player starting second has a foolproof
winning strategy if n = 4k + 1 for some k = 0, 1, 2, . . .. In this problem we prove that
Pn is true for all n ∈ N.6
(i) Prove that P1 is true. Assume that P1 , . . . , Pn hold. To prove that Pn+1 holds, we
prove by cases. The integer n + 1 can be of four types: n + 1 = 4k, n + 1 = 4k + 1,
n + 1 = 4k + 2, or n + 1 = 4k + 3.
(ii) Case 1: n + 1 = 4k. The first player can remove one, two, or three stones; in
particular, he can remove three stones (leaving 4k − 3 = 4(k − 1) + 1 stones).
Prove that the first person wins.
(iii) Case 2: n + 1 = 4k + 1. Prove that the second player will win regardless of whether the first
person takes one, two, or three stones (leaving 4k, 4(k − 1) + 3, and 4(k − 1) + 2
stones, respectively).
(iv) Case 3, Case 4: n + 1 = 4k + 2 or n + 1 = 4k + 3. Prove that the first player has
a winning strategy in the cases that n + 1 = 4k + 2 or n + 1 = 4k + 3. Suggestion:
Make the first player remove one and two stones, respectively.
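The statement Pn can be checked by brute force for small n before attempting the proof. The Python sketch below assumes the rules suggested by this problem: the two players alternately remove one, two, or three stones from a single pile of n stones, and whoever removes the last stone loses. The function name and the bound 20 are our own illustrative choices.

```python
def first_player_wins(max_n):
    """Return a list wins where wins[n] is True when the player about to
    move from a pile of n stones can force a win (assumed rules: remove
    1, 2, or 3 stones per turn; removing the last stone loses)."""
    # With 0 stones left, the opponent has just removed the last stone
    # and lost, so the player "to move" has already won.
    wins = [True]
    for n in range(1, max_n + 1):
        # A position is winning if some legal move leaves the opponent
        # in a losing position.
        wins.append(any(not wins[n - k] for k in (1, 2, 3) if k <= n))
    return wins


table = first_player_wins(20)
# Pile sizes at which the player starting second wins.
second_player_wins_at = [n for n in range(1, 21) if not table[n]]
```

Under these assumed rules, the pile sizes collected in second_player_wins_at are exactly those of the form 4k + 1, in agreement with Pn.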
2.4.1. Divisibility. If a and b are integers, and there is a third integer q such
that
b = a q,
then we say that a divides b or b is divisible by a or b is a multiple of a, in which
case we write a|b and call a a divisor or factor of b and q the quotient (of b
divided by a).
Example 2.6. For example, 4|(−16) with quotient −4 because −16 =
4 · (−4) and (−2)|(−6) with quotient 3 because −6 = (−2) · 3.
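The definition of divisibility translates directly into a small test. In the Python sketch below, the names divides and quotient are our own, and a zero divisor is rejected as an assumption of the sketch; Python's remainder and floor-division operators are used to decide whether an integer q with b = a q exists and to find it.

```python
def divides(a, b):
    """Return True when a | b, that is, b = a * q for some integer q."""
    if a == 0:
        raise ValueError("a divisor is assumed to be nonzero")
    return b % a == 0


def quotient(a, b):
    """Return the integer q with b = a * q, assuming a | b."""
    if not divides(a, b):
        raise ValueError("a does not divide b")
    return b // a
```

For the two cases in Example 2.6, quotient(4, -16) gives the q with −16 = 4 · q, and quotient(-2, -6) gives the q with −6 = (−2) · q.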
We also take the convention that
divisors are by definition nonzero.
To see why, note that
0 = 0 · 0 = 0 · 1 = 0 · (−1) = 0 · 2 = 0 · (−2) = · · · ,
6Here, we are implicitly assuming that any natural number can be written in the form 4k,
4k + 1, 4k + 2, or 4k + 3; this follows from Theorem 2.15 on the division algorithm, which we
assume just for the sake of presenting a cool exercise!
7Mathematicians have tried in vain to this day to discover some order in the sequence of
prime numbers, and we have reason to believe that it is a mystery into which the human mind
will never penetrate. Leonhard Euler (1707–1783) [210].
Theorem 2.15 (The division algorithm). Given any integers a and b with
a ≠ 0, there are unique integers q and r so that
b = qa + r with 0 ≤ r < |a|.
Moreover, if a and b are both positive, then q is nonnegative. Furthermore, a divides
b if and only if r = 0.
Proof. Assume for the moment that a > 0. Consider the list of integers
(2.9) . . . , 1 + b − 3a, 1 + b − 2a, 1 + b − a, 1 + b, 1 + b + a, 1 + b + 2a, 1 + b + 3a, . . .
extending indefinitely in both ways. Notice that since a > 0, for any integer n,
1 + b + na < 1 + b + (n + 1)a,
so the integers in the list (2.9) are increasing. Moreover, by the Archimedean
ordering of the integers, there is a natural number n so that −1 − b < an or
1 + b + an > 0. In particular, 1 + b + ak > 0 for k ≥ n. Thus, far enough to the right
in the list (2.9), all the integers are positive. Let A be the set of all natural numbers
appearing in the list (2.9). By the well-ordering principle (Theorem 2.6), this set of
natural numbers has a least element, let us call it 1 + b + ma where m is an integer.
This integer satisfies
(2.10) 1 + b + (m − 1)a < 1 ≤ 1 + b + ma,
for if 1 + b + (m − 1)a ≥ 1, then 1 + b + (m − 1)a would be an element of A smaller
than 1 + b + ma. Put q = −m and r = b + ma = b − qa. Then b = qa + r by
construction, and substituting in q and r into (2.10), we obtain
1 + r − a < 1 ≤ 1 + r.
Subtracting 1 from everything, we see that
r − a < 0 ≤ r.
Thus, 0 ≤ r and r − a < 0 (that is, r < a). Thus, we have found integers q and r
so that b = qa + r with 0 ≤ r < a. Observe from (2.10) that if b is positive, then
m can’t be positive (for otherwise the left-hand inequality in (2.10) wouldn’t hold).
Thus, q is nonnegative if both a and b are positive. Assume now that a < 0. Then
−a > 0, so by what we just did, there are integers s and r with b = s(−a) + r with
0 ≤ r < −a; that is, b = qa + r, where q = −s and 0 ≤ r < |a|.
We now prove uniqueness. Assume that we also have b = q′a + r′ with 0 ≤
r′ < |a|. We first prove that r = r′. Indeed, suppose that r ≠ r′; then by
symmetry in the primed and unprimed letters, we may presume that r < r′. Then
0 < r′ − r ≤ r′ < |a|. Moreover,
q′a + r′ = qa + r =⇒ (q − q′)a = r′ − r.
This shows that a|(r′ − r), which is impossible since r′ − r is smaller than |a| (see
property (1) of Theorem 2.14). Thus, we must have r = r′. Then the equation
(q − q′)a = r′ − r reads (q − q′)a = 0. Since a ≠ 0, we must have q − q′ = 0, or
q = q′. Our proof of uniqueness is thus complete.
Finally, we prove that a|b if and only if r = 0. If a|b, then b = ac = ac + 0
for some integer c. By uniqueness already established, we have q = c and r = 0.
Conversely, if r = 0, then b = aq, so a|b by definition of divisibility.
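Theorem 2.15 can also be read as a recipe for computing q and r. The following Python sketch is one possible implementation, not the construction used in the proof: it starts from Python's built-in divmod, whose remainder takes the sign of the divisor, and adjusts the result when a < 0 so that 0 ≤ r < |a|. The function name is our own.

```python
def division_algorithm(b, a):
    """Return (q, r) with b == q * a + r and 0 <= r < |a|, for a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    q, r = divmod(b, a)    # here r has the same sign as a (or is zero)
    if r < 0:              # can only happen when a < 0
        q, r = q + 1, r - a    # b = (q + 1) * a + (r - a), and 0 < r - a < -a
    return q, r
```

For example, division_algorithm(7, -3) returns the pair (q, r) with 7 = q · (−3) + r and 0 ≤ r < 3, and a divides b exactly when the returned r is 0.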
Proof. We start with the tentative assumption that the theorem is false.
Thus, we assume that there are only finitely many primes. There being only finitely
many, we can list them:
p1 , p2 , . . . , pn .
Consider the number
p1 p2 p3 · · · pn + 1.
This number is either prime or composite. It is greater than all the primes p1 , . . . , pn ,
so this number can’t equal any of p1 , . . . , pn . We conclude that this number must be composite,
so
(2.11) p1 p2 p3 · · · pn + 1 = ab,
for some natural numbers a and b. By our lemma, both a and b can be expressed as
a product involving p1 , . . . , pn , which implies that ab also has such an expression.
In particular, being a product of some of the p1 , . . . , pn , the right-hand side of
(2.11) is divisible by at least one of the prime numbers p1 , . . . , pn . However, the
left-hand side is certainly not divisible by any such prime because if we divide the
left-hand side by any one of the primes p1 , . . . , pn , we always get the remainder
1! This contradiction shows that our original assumption that the theorem is false
must have been incorrect; hence there must be infinitely many primes.
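The key step of this proof, that dividing p1 p2 · · · pn + 1 by any of the listed primes leaves remainder 1, can be watched on a concrete list. The Python sketch below uses the first six primes as a sample list (in the proof the list is assumed to contain all primes); the function names are our own, and smallest_prime_factor is a plain trial-division helper.

```python
from math import prod


def euclid_number(primes):
    """Return p1 * p2 * ... * pn + 1 for the given list of primes."""
    return prod(primes) + 1


def smallest_prime_factor(n):
    """Return the smallest prime factor of n >= 2 by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n    # no divisor up to sqrt(n), so n itself is prime


sample = [2, 3, 5, 7, 11, 13]
N = euclid_number(sample)                 # 2*3*5*7*11*13 + 1
remainders = [N % p for p in sample]      # every remainder equals 1
new_prime = smallest_prime_factor(N)      # a prime not in the sample list
```

Here N = 30031 = 59 · 509 is composite, yet its prime factors lie outside the sample list, which is exactly the kind of conflict the proof exploits.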
2.4.4. Fundamental theorem of arithmetic. Consider the integer 120,
which can be factored as follows:
120 = 2 × 2 × 2 × 3 × 5.
A little verification shows that it is impossible to factor 120 into any primes other
than the ones displayed. Of course, the order can be different; e.g.
120 = 3 × 2 × 2 × 5 × 2.
It is of fundamental importance in mathematics that any natural number can be
factored into a product of primes in only one way, apart from the order.
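A factorization such as the one for 120 can be produced by repeated trial division. The Python sketch below is one straightforward way to do this and is not taken from the text; the name prime_factors is our own. Sorting the factors makes "the same apart from the order" easy to test.

```python
def prime_factors(n):
    """Return the prime factors of n >= 2 in nondecreasing order, with repetition."""
    if n < 2:
        raise ValueError("n must be a natural number other than 1")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:     # divide out each prime as often as possible
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                 # whatever remains is itself prime
        factors.append(n)
    return factors
```

For instance, prime_factors(120) lists the primes 2, 2, 2, 3, 5, and sorting the reordered factorization 3 × 2 × 2 × 5 × 2 gives the same list.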
Theorem 2.18 (Fundamental theorem of arithmetic). Every natural num-
ber other than 1 can be factored into primes in only one way, except for the order
of the factors.
Proof. For sake of contradiction, let us suppose that there are natural numbers that
can be factored into primes in more than one way. By the well-ordering principle, there is a
smallest such natural number a. Thus, we can write a as a product of primes in
two ways:
\[ a = p_1 p_2 \cdots p_m = q_1 q_2 \cdots q_n. \]
Note that both m and n are greater than 1, for a single prime number has one prime
factorization. We shall obtain a contradiction by showing there is a smaller natural
number that has two factorizations. First, we observe that none of the primes pj
on the left equals any of the primes qk on the right. Indeed, if for example p1 = q1 ,
then by cancellation, we could divide them out obtaining the natural number
\[ p_2 p_3 \cdots p_m = q_2 q_3 \cdots q_n. \]
This number is smaller than a and the two sides must represent two distinct prime
factorizations, for if these prime factorizations were the same apart from the order-
ings, then (since p1 = q1 ) the factorizations for a would also be the same apart from
orderings. Since a is the smallest such number with more than one factorization, we
5. Working backwards through the equations (2.14) show that for any two integers a, b,
we have
\[ (a, b) = r_n = k a + \ell b, \]
for some integers k and ℓ. Using this fact concerning the GCD, we shall give an easy
proof of the fundamental theorem of arithmetic.
(i) Prove that if a prime p divides a product ab, then p divides a or p divides b.
(Problem 2 does not apply here because in that problem we used the fundamental
theorem of arithmetic, but now we are going to prove this fundamental theorem.)
Suggestion: Either p divides a or it doesn’t; if it does, we’re done, if not, then the
GCD of p and a is 1. Thus, 1 = (p, a) = k p + ℓ a, for some integers k, ℓ. Multiply
this equation by b and show that p must divide b.
(ii) Using induction prove that if a prime p divides a product $a_1 \cdots a_n$, then p divides
some $a_i$.
(iii) Using (ii), prove the fundamental theorem of arithmetic.
6. (Modular arithmetic) Given n ∈ N, we say that x, y ∈ Z are congruent modulo
n, written x ≡ y (mod n), if x − y is divisible by n. For a, b, x, y, u, v ∈ Z, prove
(a) x ≡ y (mod n), y ≡ x (mod n), x − y ≡ 0 (mod n) are equivalent statements.
(b) If x ≡ y (mod n) and y ≡ z (mod n), then x ≡ z (mod n).
(c) If x ≡ y (mod n) and u ≡ v (mod n), then ax + bu ≡ ay + bv (mod n).
(d) If x ≡ y (mod n) and u ≡ v (mod n), then xu ≡ yv (mod n).
(e) Finally, prove that if x ≡ y (mod n) and m|n where m ∈ N, then x ≡ y (mod m).
7. (Fermat’s theorem) We assume the basics of modular arithmetic from Problem 6.
In this problem we prove that for any prime p and x ∈ Z, we have $x^p \equiv x \pmod{p}$.
This theorem is due to Pierre de Fermat (1601–1665).
(i) Prove that for any k ∈ N with 1 < k < p, the binomial coefficient (which is an
integer, see e.g. Problem 3i in Exercises 2.2) $\binom{p}{k} = \frac{p!}{k!(p-k)!}$ is divisible by p.
(ii) Using (i), prove that for any x, y ∈ Z, $(x + y)^p \equiv x^p + y^p \pmod{p}$.
(iii) Using (ii) and induction, prove that $x^p \equiv x \pmod{p}$ for all x ∈ N. Conclude that
$x^p \equiv x \pmod{p}$ for all x ∈ Z.
8. (Pythagorean triples) A Pythagorean triple consists of three natural numbers
(x, y, z) such that $x^2 + y^2 = z^2$. For example, (3, 4, 5) and (6, 8, 10) are such triples.
The triple is called primitive if x, y, z are relatively prime, or coprime, which
means that x, y, z have no common prime factors. For instance, (3, 4, 5) is primitive
while (6, 8, 10) is not. In this problem we prove
\[ (x, y, z) \text{ is primitive} \iff \begin{cases} x = 2mn, \ y = m^2 - n^2, \ z = m^2 + n^2, & \text{or,} \\ x = m^2 - n^2, \ y = 2mn, \ z = m^2 + n^2, \end{cases} \]
where m, n are coprime, m > n, and m, n are of opposite parity; that is, one of m, n is
even and the other is odd.
(i) Prove the “⇐=” implication. Henceforth, let (x, y, z) be a primitive triple.
(ii) Prove that x and y cannot both be even.
(iii) Show that x and y cannot both be odd.
(iv) Therefore, one of x, y is even and the other is odd; let us choose x as even and
y as odd. (The other way around is handled similarly.) Show that z is odd and
conclude that $u = \frac{1}{2}(z + y)$ and $v = \frac{1}{2}(z - y)$ are both natural numbers.
(v) Show that y = u − v and z = u + v and then $x^2 = 4uv$. Conclude that uv is a
perfect square (that is, $uv = k^2$ for some k ∈ N).
(vi) Prove that u and v must be coprime and from this fact and the fact that uv is a
perfect square, conclude that u and v each must be a perfect square; say $u = m^2$
and $v = n^2$ for some m, n ∈ N. Finally, prove the desired result.
9. (Pythagorean triples, again) If you like primitive Pythagorean triples, here’s
another problem: Prove that if m, n are coprime, m > n, and m, n are of the same parity,
then
\[ (x, y, z) \text{ is primitive, where } x = mn, \quad y = \frac{m^2 - n^2}{2}, \quad z = \frac{m^2 + n^2}{2}. \]
Combined with the previous problem, we see that given coprime natural numbers
m > n,
\[ (x, y, z) \text{ is primitive, where } \begin{cases} x = 2mn, \ y = m^2 - n^2, \ z = m^2 + n^2, & \text{or,} \\ x = mn, \ y = \frac{m^2 - n^2}{2}, \ z = \frac{m^2 + n^2}{2}, \end{cases} \]
according as m and n have opposite or the same parity.
10. (Mersenne primes) A number of the form $M_n = 2^n - 1$ is called a Mersenne
number, named after Marin Mersenne (1588–1648). If $M_n$ is prime, it’s called a
Mersenne prime. For instance, $M_2 = 2^2 - 1 = 3$ is prime and $M_3 = 2^3 - 1 = 7$
is prime, but $M_4 = 2^4 - 1 = 15$ is not prime. However, $M_5 = 2^5 - 1 = 31$
is prime again. It is not known whether there exist infinitely many Mersenne primes.
Prove that if $M_n$ is prime, then n is prime. (The converse is false; for instance, $M_{23}$
is composite.) Suggestion: Prove the contrapositive. Also, the polynomial identity
$x^k - 1 = (x - 1)(x^{k-1} + x^{k-2} + \cdots + x + 1)$ might be helpful.
11. (Perfect numbers) A number n ∈ N is said to be perfect if it is the sum of its proper
divisors (divisors excluding itself). For example, 6 = 1+2+3 and 28 = 1+2+4+7+14
are perfect. It’s not known if there exists any odd perfect numbers! In this problem
we prove that perfect numbers are related to Mersenne primes as follows:
n is even and perfect ⇐⇒ n = 2m (2m+1 − 1) where m ∈ N , 2m+1 − 1 is prime.
For instance, when m = 1, 21+1 − 1 = 3 is prime, so 21 (21+1 − 1) = 6 is perfect.
Similarly, we get 28 when m = 2 and the next perfect number is 496 when m = 4.
(Note that when m = 3, 2m+1 − 1 = 15 is not prime.)
(i) Prove that if n = 2m (2m+1 − 1) where m ∈ N and 2m+1 − 1 is prime, then n
is perfect. Suggestion: The proper divisors of n are 1, 2, . . . , 2m , q, 2q, . . . , 2m−1 q
where q = 2m+1 − 1.
(ii) To prove the converse, we proceed systematically as follows. First prove that if
m, n ∈ N, then d is a divisor of m · n if and only if d = d1 · d2 where d1 and d2
mk
are divisors of m and n, respectively. Suggestion: Write m = pm 1
1 · · · pk and
n1 n`
n = q1 · · · q` into prime factors. Observe that a divisor of m · n is just a number
i j
of the form pi11 · · · pkk q1j1 · · · q` ` where 0 ≤ ir ≤ mr and 0 ≤ jr ≤ nr .
(iii) For n ∈ N, define σ(n) as the sum of all the divisors of n (including n itself).
Using (ii), prove that if m, n ∈ N, then σ(m · n) = σ(m) · σ(n).
(iv) Let n be even and perfect and write $n = 2^m q$ where m ∈ N and q is odd. By (iii),
$\sigma(n) = \sigma(2^m)\sigma(q)$. Working out both sides of $\sigma(n) = \sigma(2^m)\sigma(q)$, prove that
(2.15) $\sigma(q) = q + \dfrac{q}{2^{m+1} - 1}$.
Suggestion: Since n is perfect, prove that σ(n) = 2n and by definition of σ, prove
that $\sigma(2^m) = 2^{m+1} - 1$.
(v) From (2.15) and the fact that σ(q) ∈ N, show that $q = k(2^{m+1} - 1)$ for some
k ∈ N. From (2.15) (that σ(q) = q + k), prove that k = 1. Finally, conclude that
$n = 2^m (2^{m+1} - 1)$ where $q = 2^{m+1} - 1$ is prime.
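The statements in this problem are easy to test on small numbers. The sketch below is a numerical check, not a proof; the function names `sigma`, `is_perfect` and `is_prime` are our own. It compares the numbers produced by the formula $2^m (2^{m+1} - 1)$ with a brute-force search for even perfect numbers, and it tries the product rule for σ on a pair with no common prime factor.

```python
from math import isqrt


def sigma(n: int) -> int:
    """Sum of all positive divisors of n, including n itself."""
    total = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:          # avoid counting a square root twice
                total += n // d
    return total


def is_perfect(n: int) -> bool:
    """n is perfect when its proper divisors sum to n, i.e. sigma(n) = 2n."""
    return sigma(n) == 2 * n


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))


# Numbers 2^m (2^(m+1) - 1) with 2^(m+1) - 1 prime, for m = 1, ..., 6.
from_formula = [2**m * (2**(m + 1) - 1)
                for m in range(1, 7) if is_prime(2**(m + 1) - 1)]
# Brute-force search among even numbers below 10000.
by_search = [n for n in range(2, 10000, 2) if is_perfect(n)]

print(from_formula)                      # [6, 28, 496, 8128]
print(by_search)                         # [6, 28, 496, 8128]
print(sigma(15), sigma(3) * sigma(5))    # 24 24
```

The two lists agree, as the equivalence predicts for this range.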
12. In this exercise we show how to factor factorials (cf. [78]). Let n > 1. Show that the
prime factors of n! are exactly those primes less than or equal to n. Explain that to
factor n!, for each prime p less than or equal to n, we need to know the greatest power of p that
divides n!. We shall prove that the greatest power of p that divides n! is
(2.16) $e_p(n) := \sum_{k=1}^{\infty} \left\lfloor \dfrac{n}{p^k} \right\rfloor$,
2.5. DECIMAL REPRESENTATIONS OF INTEGERS 49
Note that the digits 1, 2, 3, 4 in the symbol 4321 are exactly the remainders produced
after successive divisions of a and its quotients by 10. For example,
a = 432 · 10 + 1 (remainder 1)
Now divide the quotient 432 by 10:
432 = 43 · 10 + 2 (remainder 2).
Continuing dividing the quotients by 10, we get
43 = 4 · 10 + 3, (remainder 3), and finally, 4 = 0 · 10 + 4, (remainder 4).
We shall use this technique of successive divisions in the proof of Theorem 2.19
below. In general, the symbol $a = a_n a_{n-1} \cdots a_1 a_0$ represents the number
$a = a_n \cdot 10^n + a_{n-1} \cdot 10^{n-1} + \cdots + a_1 \cdot 10 + a_0$ (in base 10).
As with our previous example, the digits $a_0, a_1, \ldots, a_n$ are exactly the remainders
produced after successive divisions of a and the resulting quotients by 10.
2.5.2. Other common bases. We now consider other bases; for instance, the
base 7 or septimal system. Here, we use the symbols 0, 1, 2, 3, 4, 5, 6, 7 to represent
zero and the first seven natural numbers and the numbers 0, 1, . . . , 6 are the digits
in base 7. Then we write an integer a as $a_n a_{n-1} \cdots a_1 a_0$ in base 7 if
$a = a_n \cdot 7^n + a_{n-1} \cdot 7^{n-1} + \cdots + a_1 \cdot 7 + a_0$.
Example 2.14. For instance, the number with symbol 10 in base 7 is really
the number 7 itself, since
10 (base 7) = 1 · 7 + 0.
Example 2.15. The number one hundred one has the symbol 203 in the septimal
system because
203 (base 7) = $2 \cdot 7^2 + 0 \cdot 7 + 3$,
and in our familiar base 10 or decimal notation, the number on the right is just
2 · 49 + 3 = 98 + 3 = 101.
The base of choice for computers is base 2 or the binary or dyadic system.
In this case, we write numbers using only the digits 0 and 1. Thus, an integer a is
written as $a_n a_{n-1} \cdots a_1 a_0$ in base 2 if
$a = a_n \cdot 2^n + a_{n-1} \cdot 2^{n-1} + \cdots + a_1 \cdot 2 + a_0$.
Example 2.16. For instance, the symbol 10101 in the binary system represents
the number
10101 (base 2) = $1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1$.
In familiar base 10 or decimal notation, the number on the right is 16 + 4 + 1 = 21.
Example 2.17. The symbol 10 in base 2 is really the number 2 itself, since
10 (base 2) = 1 · 2 + 0.
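The examples above can be reproduced mechanically. The following Python sketch (the function name `from_digits` is ours, and it handles only bases up to 10, where each digit is a single decimal character) evaluates the sum $a_n b^n + \cdots + a_1 b + a_0$ one digit at a time.

```python
def from_digits(symbol: str, base: int) -> int:
    """Evaluate a_n*base^n + ... + a_1*base + a_0 for the digit string a_n...a_0."""
    value = 0
    for ch in symbol:
        digit = int(ch)
        if not 0 <= digit < base:
            raise ValueError(f"{ch!r} is not a digit in base {base}")
        value = value * base + digit   # Horner's rule: shift left, add next digit
    return value


print(from_digits("10", 7))      # 7
print(from_digits("203", 7))     # 101
print(from_digits("10101", 2))   # 21
print(from_digits("10", 2))      # 2
```

Python's built-in `int("203", 7)` performs the same conversion; the explicit loop is shown only to mirror the defining sum.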
Not only are binary numbers useful for computing, they can also help you be a
champion in the Game of Nim; see [202]. (See also Problem 6 in Exercises 2.3.)
Another common base is base 3, which is known as the tertiary system.
We remark that one can develop addition and multiplication tables in the sep-
timal and binary systems (indeed, with respect to any base); see for instance [57, p.
7]. Once a base is fixed, we shall not make a distinction between a number and its
representation in the chosen base. In particular, throughout this book we always
use base 10 and write numbers with respect to this base unless stated otherwise.
2.5.3. Arbitrary base expansions of integers. We now show that a num-
ber can be written with respect to any base. Fix a natural number b > 1, called a
base. Let a be a natural number and suppose that it can be written as a sum of
the form
$a = a_n \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$,
where $0 \le a_k < b$ and $a_n \ne 0$. Then the symbol $a_n a_{n-1} \cdots a_1 a_0$ is called the b-adic
representation of a. A couple of questions arise: First, does every natural number
have such a representation and second, if a representation exists, is it unique? The
answer to both questions is yes.
In the following proof, we shall use the following useful “telescoping” sum
several times:
$(b - 1) \sum_{k=0}^{n} b^k = \sum_{k=0}^{n} (b^{k+1} - b^k) = b^1 + \cdots + b^n + b^{n+1} - (1 + b^1 + \cdots + b^n) = b^{n+1} - 1.$
and we stop successive divisions once we get $a_n$. Combining the first and second
equations in (2.17), we get
$a = q_0 \cdot b + a_0 = (q_1 \cdot b + a_1) b + a_0 = q_1 \cdot b^2 + a_1 b + a_0$.
Combining this equation with the third equation in (2.17) we get
$a = (q_2 \cdot b + a_2) \cdot b^2 + a_1 b + a_0 = q_2 \cdot b^3 + a_2 \cdot b^2 + a_1 b + a_0$.
Continuing this process (slang for “by use of induction”) we eventually arrive at
$a = (0 \cdot b + a_n) \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$
$= a_n \cdot b^n + a_{n-1} \cdot b^{n-1} + \cdots + a_1 \cdot b + a_0$.
This shows the existence of a b-adic representation of a.
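The existence argument is really an algorithm: divide by b, record the remainder, and repeat with the quotient until the quotient is zero. Here is a minimal Python sketch of that procedure (the name `b_adic_digits` is ours).

```python
def b_adic_digits(a: int, b: int) -> list[int]:
    """Return [a_n, ..., a_1, a_0], the base-b digits of the natural number a."""
    if a < 1 or b < 2:
        raise ValueError("need a >= 1 and b >= 2")
    remainders = []
    q = a
    while q > 0:
        q, r = divmod(q, b)     # old q = new q * b + r, with 0 <= r < b
        remainders.append(r)    # r is the next digit a_0, a_1, a_2, ...
    return remainders[::-1]     # highest digit first


print(b_adic_digits(4321, 10))  # [4, 3, 2, 1]
print(b_adic_digits(101, 7))    # [2, 0, 3]
print(b_adic_digits(21, 2))     # [1, 0, 1, 0, 1]
```

The loop stops exactly when the quotient becomes 0, which corresponds to the last division producing $a_n$.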
Step 2: We now show that this representation is unique. Suppose that a has
another such representation:
(2.18) $a = \sum_{k=0}^{n} a_k b^k = \sum_{k=0}^{m} c_k b^k$,
where $0 \le c_k < b$ and $c_m \ne 0$. We first prove that n = m. Indeed, let’s suppose
that n ≠ m, say n < m. Then,
$a = \sum_{k=0}^{n} a_k b^k \le (b - 1) \sum_{k=0}^{n} b^k = b^{n+1} - 1.$
2. In the following exercises, we shall establish the validity of grade school divisibility
“tricks”, cf. [112]. Let $a = a_n a_{n-1} \ldots a_0$ be the decimal (= base 10) representation
of a natural number a. Let us first consider divisibility by 2, 5, 10.
(a) Prove that a is divisible by 10 if and only if $a_0 = 0$.
(b) Prove that a is divisible by 2 if and only if $a_0$ is even.
(c) Prove that a is divisible by 5 if and only if $a_0 = 0$ or $a_0 = 5$.
3. We now consider 4 and 8.
(a) Prove that a is divisible by 4 if and only if the number $a_1 a_0$ (written in decimal
notation) is divisible by 4.
(b) Prove that a is divisible by 8 if and only if $a_2 a_1 a_0$ is divisible by 8.
4. We consider divisibility by 3, 6, 9. (Unfortunately, there is no slick test for divisibility
by 7.) Suggestion: Before considering these tests, prove that $10^k - 1$ is divisible by 9
for any nonnegative integer k.
(a) Prove that a is divisible by 3 if and only if the sum of the digits (that is, $a_n + \cdots + a_1 + a_0$)
is divisible by 3.
(b) Prove that a is divisible by 6 if and only if a is even and the sum of the digits is
divisible by 3.
(c) Prove that a is divisible by 9 if and only if the sum of the digits is divisible by 9.
5. Prove that a is divisible by 11 if and only if the difference between the sums of the
even and odd digits:
$(a_0 + a_2 + a_4 + \cdots) - (a_1 + a_3 + a_5 + \cdots) = \sum_{k=0}^{n} (-1)^k a_k$
is divisible by 11. Suggestion: First prove that $10^{2k} - 1$ and $10^{2k+1} + 1$ are each divisible
by 11.
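Before proving the divisibility tests in Problems 2 through 5, one can check them by brute force. The Python sketch below is only a sanity check; the names `decimal_digits` and `passes_test` are ours. It applies each digit rule and compares the outcome with ordinary division.

```python
def decimal_digits(a: int) -> list[int]:
    """Digits [a_0, a_1, ..., a_n] of a natural number a, lowest digit first."""
    return [int(ch) for ch in reversed(str(a))]


def passes_test(a: int, k: int) -> bool:
    """Apply the digit rule for divisibility by k (k in 2,3,4,5,6,8,9,10,11)."""
    d = decimal_digits(a)
    digit_sum = sum(d)
    if k == 10:
        return d[0] == 0
    if k == 2:
        return d[0] % 2 == 0
    if k == 5:
        return d[0] in (0, 5)
    if k == 4:
        return (a % 100) % 4 == 0      # the number a_1 a_0
    if k == 8:
        return (a % 1000) % 8 == 0     # the number a_2 a_1 a_0
    if k == 3:
        return digit_sum % 3 == 0
    if k == 6:
        return d[0] % 2 == 0 and digit_sum % 3 == 0
    if k == 9:
        return digit_sum % 9 == 0
    if k == 11:
        alternating = sum((-1) ** i * x for i, x in enumerate(d))
        return alternating % 11 == 0
    raise ValueError("no digit rule implemented for this k")


# Compare every rule with ordinary division for a = 1, ..., 19999.
rules_agree = all(
    passes_test(a, k) == (a % k == 0)
    for a in range(1, 20000)
    for k in (2, 3, 4, 5, 6, 8, 9, 10, 11)
)
print(rules_agree)  # True
```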
2.6.1. The real and rational numbers. The set of real numbers is denoted
by R. The reader is certainly familiar with the following arithmetic properties of
real numbers (in what follows, a, b, c denote real numbers):
Addition satisfies
(A1) a + b = b + a; (commutative law)
(A2) (a + b) + c = a + (b + c); (associative law)
54 2. NUMBERS, NUMBERS, AND MORE NUMBERS
A rational number is a number that can be written in the form a/b where
a and b are integers with b ≠ 0, and the set of all such numbers is denoted by Q.
2.6. REAL NUMBERS: RATIONAL AND “MOSTLY” IRRATIONAL 55
We leave the reader to check that the rational numbers also form an ordered field.
Thus, both the real numbers and the rational numbers are ordered fields. Now
what is the difference between the real and rational numbers? The difference was
discovered more than 2500 years ago by the Greeks, who found out that the length
of the diagonal of a unit square, which according to the Pythagorean theorem
is $\sqrt{2}$, is not a rational number (see Theorem 2.23). Because this length is not
a rational number, the Greeks called a number such as $\sqrt{2}$ irrational.9 Thus,
there are “gaps” in the rational numbers. Now it turns out that every length is a
real number. This fact is known as the completeness axiom of the real numbers.
Thus, the real numbers have no “gaps”. To finish up the list of axioms for the real
numbers, we state this completeness axiom now but we leave the terms in the axiom
undefined until Section 2.7 (so don’t worry if some of these words seem foreign).
(C) (Completeness axiom of the real numbers) Every nonempty set of real
numbers that is bounded above has a supremum, that is, a least upper bound.
We shall assume the existence of a set R such that N ⊆ R+ , Z ⊆ R, and R
satisfies all the arithmetic, positivity, and completeness properties listed above.10
All theorems that we prove in this textbook are based on this assumption.
2.6.2. Proofs of well-known high school rules. Since the real numbers
satisfy the same arithmetic properties as the natural and integer numbers, the
same proofs as in Sections 2.1 and 2.3 prove the uniqueness of additive identities
and inverses, rules of sign, properties of zero and one (in particular, the uniqueness
of the multiplicative identity), etc.
Also, the real numbers are ordered in the same way as the integers. Given any
real numbers a and b exactly one of the following holds:
(O1) a = b;
(O2) a < b, which means that b − a is a positive real number;
(O3) b < a, which means that −(b − a) is a positive real number.
Just as for integers, we can define ≤, >, and ≥ and (O3) is just that a − b is a
positive real number. One can define the absolute value of a real number in the
exact same way as it is defined for integers. Since the real numbers satisfy the
same order properties as the integers, the same proofs as in Section 2.3 prove the
inequality rules, absolute value rules, etc., for real numbers. Using the inequality
rules we can prove the following well-known fact from high school: if a > 0, then
$a^{-1} > 0$. Indeed, by definition of $a^{-1}$, we have $a \cdot a^{-1} = 1$. Since 1 > 0 (recall that
1 ∈ N ⊆ R+ ) and a > 0, we have positive × $a^{-1}$ = positive; the only way this is
possible is if $a^{-1} > 0$ by the inequality rules. Here are other high school facts that
can be proved using the inequality rules: If 0 < a < 1, then $a^{-1} > 1$ and if a > 1,
then $a^{-1} < 1$. Indeed, if a < 1 with a positive, then multiplying by $a^{-1} > 0$, we
obtain
$a \cdot (a^{-1}) < 1 \cdot (a^{-1}) \implies 1 < a^{-1}$.
9The idea of the continuum seems simple to us. We have somehow lost sight of the difficulties
it implies ... We are told such a number as the square root of 2 worried Pythagoras and his school
almost to exhaustion. Being used to such queer numbers from early childhood, we must be careful
not to form a low idea of the mathematical intuition of these ancient sages; their worry was
highly credible. Erwin Schrödinger (1887–1961).
10For simplicity we assumed that N ⊆ R+ (in particular, all natural numbers are positive by
assumption) and Z ⊆ R, but it is possible to define N and Z within R. Actually, one only needs
to define N for then we can put Z := N ∪ {0} ∪ (−N).
Similarly, if 1 < a, then multiplying through by $a^{-1} > 0$, we get $a^{-1} < 1$.
Here are some more high school facts.
Theorem 2.20 (Uniqueness of multiplicative inverse). If a and b are real
numbers with a ≠ 0, then x · a = b if and only if $x = b a^{-1} = b/a$. In particular,
setting b = 1, the only x that satisfies the equation x · a = 1 is $x = a^{-1}$. Thus, each
nonzero real number has only one multiplicative inverse.
Proof. If $x = b \cdot a^{-1}$, then
$(b a^{-1}) \cdot a = b (a^{-1} a) = b \cdot 1 = b$,
so the real number x = b/a solves the equation x · a = b. Conversely, if x satisfies
x · a = b, then
$b a^{-1} = (x \cdot a) a^{-1} = x \cdot (a\, a^{-1}) = x \cdot 1 = x$.
Recall that x · 0 = 0 for any real number x. (This is Theorem 2.12 in the real
number case.) In particular, 0 has no multiplicative inverse (there is no real number
“$0^{-1}$” such that $0 \cdot 0^{-1} = 1$); thus, the high school saying: “You can’t divide by
zero.”
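Theorem 2.21 (Fraction rules). For a, b, c, d ∈ R, the following fraction rules
hold (all denominators are assumed to be nonzero):
(1) $\dfrac{a}{a} = 1$, $\dfrac{a}{1} = a$,
(2) $\dfrac{a}{-b} = -\dfrac{a}{b}$,
(3) $\dfrac{a}{b} \cdot \dfrac{c}{d} = \dfrac{ac}{bd}$,
(4) $\dfrac{a}{b} = \dfrac{ac}{bc}$,
(5) $\dfrac{1}{a/b} = \dfrac{b}{a}$,
(6) $\dfrac{a/b}{c/d} = \dfrac{a}{b} \cdot \dfrac{d}{c} = \dfrac{ad}{bc}$,
(7) $\dfrac{a}{b} \pm \dfrac{c}{d} = \dfrac{ad \pm bc}{bd}$.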
Proof. The proofs of these rules are really very elementary, so we only prove
(1)–(3) and leave (4)–(7) to you in Problem 1.
We have $a/a = a \cdot a^{-1} = 1$ and since 1 · 1 = 1, by uniqueness of the multiplicative
inverses, we have $1^{-1} = 1$ and therefore $a/1 = a \cdot 1^{-1} = a \cdot 1 = a$.
To prove (2), note that by our rules of sign,
$(-b) \cdot (-b^{-1}) = b \cdot b^{-1} = 1$
and therefore by uniqueness of multiplicative inverses, we must have $(-b)^{-1} = -b^{-1}$.
Thus, $a/(-b) := a \cdot (-b)^{-1} = a \cdot (-b^{-1}) = -a \cdot b^{-1} = -a/b$.
To prove (3), observe that $b \cdot d \cdot b^{-1} \cdot d^{-1} = (b b^{-1}) \cdot (d d^{-1}) = 1 \cdot 1 = 1$, so by
uniqueness of multiplicative inverses, $(bd)^{-1} = b^{-1} d^{-1}$. Thus,
$\dfrac{a}{b} \cdot \dfrac{c}{d} = a \cdot b^{-1} \cdot c \cdot d^{-1} = a \cdot c \cdot b^{-1} \cdot d^{-1} = ac \cdot (bd)^{-1} = \dfrac{ac}{bd}$.
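The fraction rules can be spot-checked with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` on a small hand-picked sample of nonzero values (the sample and the helper name `check_fraction_rules` are our choices); passing it is evidence for the rules on that sample, not a proof.

```python
from fractions import Fraction
from itertools import product


def check_fraction_rules(values) -> int:
    """Check rules (1)-(7) for every (a, b, c, d) drawn from nonzero values."""
    checked = 0
    for a, b, c, d in product(values, repeat=4):
        assert a / a == 1 and a / 1 == a                      # (1)
        assert a / (-b) == -(a / b)                           # (2)
        assert (a / b) * (c / d) == (a * c) / (b * d)         # (3)
        assert a / b == (a * c) / (b * c)                     # (4)
        assert 1 / (a / b) == b / a                           # (5)
        assert (a / b) / (c / d) == (a * d) / (b * c)         # (6)
        assert a / b + c / d == (a * d + b * c) / (b * d)     # (7) with +
        assert a / b - c / d == (a * d - b * c) / (b * d)     # (7) with -
        checked += 1
    return checked


sample = [Fraction(n, m) for n in (-3, -1, 2, 5) for m in (1, 2, 7)]
print(check_fraction_rules(sample))  # 20736 quadruples, all passing
```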
We already know what $a^n$ means for n = 0, 1, 2, . . .. For negative integers,
we define powers by
$a^{-n} := \dfrac{1}{a^n}$, a ≠ 0, n = 1, 2, 3, . . . .
Here are the familiar power rules.
[Figure 2.5. Side labels: a, b, b and d, c, c.]
Proof. We provide three proofs. The first one is essentially a version (the
version?) of the original geometric Pythagorean proof, while the second one is a real
analysis version of the same proof! The third proof is the “standard” proof in this
business. (See also Problems 6 and 7.)
Proof I: (Cf. [8] for another version.) This proof is not to be considered
rigorous! We only put this proof here for historical purposes and because we shall
make this proof completely rigorous in Proof II below. We assume common facts
from high school geometry, in particular, similar triangles.
Suppose, by way of contradiction, that $\sqrt{2} = a/b$ where a, b ∈ N. Then
$a^2 = 2b^2 = b^2 + b^2$, so by the Pythagorean theorem, the isosceles triangle with sides
a, b, b is a right triangle (see Figure 2.5). Hence, there is an isosceles right triangle
whose side lengths are (of course, positive) integers. By taking a smaller triangle if
necessary, we may assume that a, b, b are the lengths of the smallest such triangle.
We shall derive a contradiction by producing another isosceles right triangle with
integer lengths and a smaller hypotenuse. In fact, consider the triangle d, c, c drawn
in Figure 2.5. Note that a = b + c so c = a − b ∈ Z. To see that d ∈ Z, observe
that since corresponding sides of similar triangles are in proportion, we
have
(2.20) $\dfrac{d}{c} = \dfrac{a}{b} \implies d = \dfrac{a}{b} \cdot c = \dfrac{a}{b}(a - b) = \dfrac{a^2}{b} - a = 2b - a$,
where we used that $a^2 = b^2 + b^2 = 2b^2$. Therefore, d = 2b − a ∈ Z as well. Thus,
we have indeed produced a smaller isosceles right triangle with integer lengths.
Proof II: (Cf. [218, p. 39], [143], [194].) We now make Proof I rigorous.
Suppose that $\sqrt{2} = a/b$ (a, b ∈ N). By well-ordering, we may assume that a is the
smallest positive numerator that $\sqrt{2}$ can have as a fraction; explicitly,
$a = \text{least element of } \{\, n \in \mathbb{N} \,;\, \sqrt{2} = n/m \text{ for some } m \in \mathbb{Z} \,\}$.
Motivated by (2.20), we claim that
(2.21) $\sqrt{2} = \dfrac{d}{c}$ where d = 2b − a, c = a − b are integers with d ∈ N and d < a.
Once we prove this claim, we contradict the minimality of a. Of course, the facts in
(2.21) were derived from Figure 2.5 geometrically, but now we actually prove these
facts! First, to prove that $\sqrt{2} = d/c$, we simply compute:
$\dfrac{d}{c} = \dfrac{2b - a}{a - b} = \dfrac{2 - a/b}{a/b - 1} = \dfrac{2 - \sqrt{2}}{\sqrt{2} - 1} = \dfrac{2 - \sqrt{2}}{\sqrt{2} - 1} \cdot \dfrac{\sqrt{2} + 1}{\sqrt{2} + 1}$
$= \dfrac{2\sqrt{2} + 2 - (\sqrt{2})^2 - \sqrt{2}}{(\sqrt{2})^2 - 1} = \dfrac{\sqrt{2}}{1} = \sqrt{2}$.
To prove that 0 < d < a, note that since 1 < 2 < 4, that is, $1^2 < (\sqrt{2})^2 < 2^2$, by
the (last statement of the) power rules in Theorem 2.22, we have $1 < \sqrt{2} < 2$, or
1 < a/b < 2. Multiplying by b, we get b < a < 2b, which implies that
(2.22) d = 2b − a > 0 and d = 2b − a < 2a − a = a.
Therefore, d ∈ N and d < a and we get our contradiction.
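The descent step (a, b) ↦ (d, c) = (2b − a, a − b) can be explored with integers. A direct expansion gives $d^2 - 2c^2 = -(a^2 - 2b^2)$, so an exact solution of $a^2 = 2b^2$ would produce a smaller one. The Python sketch below (the function names and the search bound 100000 are our choices) checks this identity on a sample and confirms that no exact solution exists in the searched range; it illustrates the argument rather than replacing it.

```python
from math import isqrt


def descend(a: int, b: int) -> tuple[int, int]:
    """Map (a, b) to (d, c) = (2b - a, a - b), as in (2.21)."""
    return 2 * b - a, a - b


def defect(a: int, b: int) -> int:
    """a^2 - 2 b^2, which is zero exactly when (a/b)^2 = 2."""
    return a * a - 2 * b * b


# 99/70 is close to sqrt(2) but not equal to it: the defect is 1, not 0.
pair = (99, 70)
while pair[1] > 0:
    print(pair, defect(*pair))
    pair = descend(*pair)

# Search for b with 2 b^2 a perfect square, i.e. an exact solution a^2 = 2 b^2.
exact = [b for b in range(1, 100001) if isqrt(2 * b * b) ** 2 == 2 * b * b]
print(exact)  # []
```

In the loop the defect alternates between 1 and −1 and never reaches 0, while the pairs shrink: (99, 70), (41, 29), (17, 12), (7, 5), (3, 2), (1, 1).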
Proof III: The following proof is the classic proof. We first establish the fact
that the square of an integer has the factor 2 if and only if the integer itself has
the factor 2. A quick way to prove this fact is using the fundamental theorem of
arithmetic: The prime factorization of $m^2$ is obtained by squaring the prime factorization of m. Therefore,
$m^2$ has a prime factor p if and only if m itself has the prime factor p. In particular,
$m^2$ has the prime factor 2 if and only if m has the factor 2, which establishes our
fact. A proof without using the fundamental theorem goes as follows. An integer
is either even or odd, that is, is of the form 2n or 2n + 1 where n is the quotient of
the integer when divided by 2. The equations
$(2n)^2 = 4n^2 = 2(2n^2)$
$(2n + 1)^2 = 4n^2 + 4n + 1 = 2(2n^2 + 2n) + 1$
confirm the asserted fact. Now suppose that $\sqrt{2}$ were a rational number, say
$\sqrt{2} = \dfrac{a}{b}$,
where a/b is in lowest terms. Squaring this equation we get
$2 = \dfrac{a^2}{b^2} \implies a^2 = 2b^2$.
The number $2b^2 = a^2$ has the factor 2, so a must have the factor 2. Therefore,
a = 2c for some integer c. Thus,
$(2c)^2 = 2b^2 \implies 4c^2 = 2b^2 \implies 2c^2 = b^2$.
The number $2c^2 = b^2$ has the factor 2, so b must also have the factor 2. Thus, we
have shown that a and b both have the factor 2. This contradicts the assumption
that a and b have no common factors.
The following theorem gives another method to prove the irrationality of $\sqrt{2}$
and also many other numbers. Recall that a (real-valued) n-th degree polynomial
is a function $p(x) = a_n x^n + \cdots + a_1 x + a_0$, where $a_k \in \mathbb{R}$ for each k and
with the leading coefficient $a_n \ne 0$.
Theorem 2.24 (Rational zeros theorem). If a polynomial equation with
integral coefficients,
$c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0 = 0$, $c_n \ne 0$,
where the $c_k$’s are integers, has a nonzero rational solution a/b where a/b is in
lowest terms, then a divides $c_0$ and b divides $c_n$.
Proof. Suppose that a/b is a rational solution of our equation with a/b in
lowest terms. Being a solution, we have
$c_n \left(\dfrac{a}{b}\right)^n + c_{n-1} \left(\dfrac{a}{b}\right)^{n-1} + \cdots + c_1 \left(\dfrac{a}{b}\right) + c_0 = 0$.
Multiplying both sides by $b^n$, we obtain
(2.23) $c_n a^n + c_{n-1} a^{n-1} b + \cdots + c_1 a\, b^{n-1} + c_0 b^n = 0$.
Bringing everything to the right except for $c_n a^n$ and factoring out a b, we find
$c_n a^n = -c_{n-1} a^{n-1} b - \cdots - c_1 a\, b^{n-1} - c_0 b^n$
$= b(-c_{n-1} a^{n-1} - \cdots - c_1 a\, b^{n-2} - c_0 b^{n-1})$.
This formula shows that every prime factor of b occurs in the product $c_n a^n$. By
assumption, a and b have no common prime factors and hence every prime factor
of b must occur in $c_n$. This shows that b divides $c_n$.
We now rewrite (2.23) as
$c_0 b^n = -c_n a^n - c_{n-1} a^{n-1} b - \cdots - c_1 a\, b^{n-1}$
$= a(-c_n a^{n-1} - c_{n-1} a^{n-2} b - \cdots - c_1 b^{n-1})$.
This formula shows that every prime factor of a occurs in the product $c_0 b^n$. However,
since a and b have no common prime factors, we conclude that every prime factor
of a occurs in $c_0$, which implies that a divides $c_0$. This completes our proof.
Example 2.19. (Irrationality of $\sqrt{2}$, Proof IV) Observe that $\sqrt{2}$ is a solution
of the polynomial equation $x^2 - 2 = 0$. The rational zeros theorem implies
that if the equation $x^2 - 2 = 0$ has a rational solution, say a/b in lowest terms, then
a must divide $c_0 = -2$ and b must divide $c_2 = 1$. It follows that a can equal ±1 or
±2 and b can only be ±1. Therefore, the only rational solutions of $x^2 - 2 = 0$, if
any, are x = ±1 or x = ±2. However,
$(\pm 1)^2 - 2 = -1 \ne 0$ and $(\pm 2)^2 - 2 = 2 \ne 0$,
so $x^2 - 2 = 0$ has no rational solutions. Therefore $\sqrt{2}$ is not rational.
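The rational zeros theorem turns the search for rational solutions into a finite check: list the candidates ±a/b with a dividing $c_0$ and b dividing $c_n$, and test each one. The following Python sketch does exactly that; the function names are ours, and it assumes $c_0 \ne 0$ so that every rational root is nonzero.

```python
from fractions import Fraction


def divisors(n: int) -> list[int]:
    """Positive divisors of the nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs: list[int]) -> list[Fraction]:
    """Rational roots of c_n x^n + ... + c_0, given coeffs = [c_n, ..., c_0].

    Assumes c_n != 0 and c_0 != 0.
    """
    c_n, c_0 = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * a, b)
                  for a in divisors(c_0)
                  for b in divisors(c_n)
                  for sign in (1, -1)}

    def p(x: Fraction) -> Fraction:
        value = Fraction(0)
        for c in coeffs:          # Horner evaluation, exact in Fractions
            value = value * x + c
        return value

    return sorted(x for x in candidates if p(x) == 0)


print(rational_roots([1, 0, -2]))   # []  (x^2 - 2 has no rational root)
print(rational_roots([2, -3, 1]))   # [Fraction(1, 2), Fraction(1, 1)]
```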
A similar argument using the equation $x^n - a = 0$ proves the following
corollary.
Corollary 2.25. The n-th root $\sqrt[n]{a}$, where a and n are positive integers, is
either irrational or an integer; if it is an integer, then a is the n-th power of an
integer.
2.6.4. Irrationality of trigonometric numbers. Let 0 < θ < 90° be an
angle whose measurement in degrees is rational. Following [142], we shall prove
that cos θ is irrational except when θ = 60°, in which case
cos 60° = 1/2.
The proof of this result is based on the rational zeros theorem and Lemma 2.26 below.
See Problem 5 for corresponding statements for sine and tangent. Of course, at this
point, and only for purposes of illustration, we have to assume basic knowledge of
the trigonometric functions. In Section 4.7 we shall define these functions rigorously
and establish their usual properties.
Lemma 2.26. For any natural number n, we can write 2 cos nθ as an n-th degree
polynomial in 2 cos θ with integer coefficients and with leading coefficient one.
Theorem 2.27. Let 0 < θ < 90◦ be an angle whose measurement in degrees is
rational. Then cos θ is rational if and only if θ = 60◦ .
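To see what the polynomials in Lemma 2.26 look like, one can generate them from the identity 2 cos nθ = (2 cos θ)(2 cos (n − 1)θ) − 2 cos (n − 2)θ, which follows from the product-to-sum formula for cosines. Using this recurrence is our assumption about one standard route, not necessarily the argument intended for the lemma. The Python sketch below (the name `two_cos_poly` is ours) builds the coefficient lists and compares them numerically with the cosine.

```python
import math


def two_cos_poly(n: int) -> list[int]:
    """Coefficients [c_0, ..., c_n] of P_n, where 2 cos(n t) = P_n(2 cos t), n >= 1."""
    prev, cur = [2], [0, 1]                 # P_0(x) = 2, P_1(x) = x
    for _ in range(n - 1):
        shifted = [0] + cur                 # coefficients of x * P_k(x)
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - p for s, p in zip(shifted, padded)]
    return cur


def evaluate(coeffs: list[int], x: float) -> float:
    return sum(c * x**k for k, c in enumerate(coeffs))


print(two_cos_poly(2))  # [-2, 0, 1]     i.e. x^2 - 2
print(two_cos_poly(3))  # [0, -3, 0, 1]  i.e. x^3 - 3x

t = 0.7
for n in range(1, 8):
    assert math.isclose(evaluate(two_cos_poly(n), 2 * math.cos(t)),
                        2 * math.cos(n * t), abs_tol=1e-9)
```

Each list has integer entries and ends in 1, matching the statement that the leading coefficient is one.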
Example 2.20. Consider the interval I = [0, 1). This interval is bounded above
by, for instance, 1, 3/2, 22/7, 10, 1000, etc. In fact, any upper bound for I is just
a real number greater than or equal to 1. The least upper bound is 1 since 1 is the
smallest upper bound. Note that 1 ∉ I.
Example 2.21. Now let J = (0, 1]. This set is also bounded above, and any
upper bound for J is as before, just a real number greater than or equal to 1. The
least upper bound is 1. In this case, 1 ∈ J.
These examples show that the supremum of a set, if it exists, may or may not
belong to the set.
Example 2.22. Z is not bounded above (see Lemma 2.34) nor is the set (0, ∞).
We summarize: Let A ⊆ R be bounded above. Then a number b is the least
upper bound or supremum for A means two things concerning b:
(L1) for all a in A, a ≤ b — this just means that b is an upper bound for A;
(L2) if c is an upper bound for A, then b ≤ c — this just means that b is the least,
or smallest, upper bound for A.
Instead of (L2) it is sometimes convenient to substitute the following.
(L2′) if c < b, then for some a in A we have c < a; this just means that any
number c smaller than b cannot be an upper bound for A, which is to say,
there is no upper bound for A that is smaller than b.
(L2′) is just the contrapositive of (L2); do you see why? We can also talk
about lower bounds. A set A ⊆ R is said to be bounded below if there is a real
number b smaller than any number in A in the sense that for each a in A we have
b ≤ a. Any such number b, if such exists, is called a lower bound for A. If b is a
lower bound for A, then b is called the greatest lower bound or infimum for A
if b is just that, the greatest lower bound for A, in the sense that it is greater than
any other lower bound for A. This infimum, if it exists, is denoted by inf A. We
shall use both terminologies “greatest lower bound” and “infimum” interchangeably
although we shall use greatest lower bound more often.
Example 2.23. The sets I = [0, 1) and J = (0, 1] are both bounded below (by
e.g. 0, −1/2, −1, −1000, etc.) and in both cases the greatest lower bound is 0.
Thus, the infimum of a set, if it exists, may or may not belong to the set.
Example 2.24. Z (see Lemma 2.34) and (−∞, 0) are not bounded below.
We summarize: Let A ⊆ R be bounded below. Then a number b is the greatest
lower bound or infimum for A means two things concerning b:
(G1) for all a in A, b ≤ a — this just means that b is a lower bound for A,
(G2) if c is a lower bound for A, then c ≤ b — this just means that b is the greatest
lower bound for A.
Instead of (G2) it is sometimes convenient to substitute its contrapositive.
(G2′) if b < c, then for some a in A we have a < c; this just means that any
number c greater than b cannot be a lower bound for A, which is to say,
there is no lower bound for A that is greater than b.
In the examples given so far (e.g. the intervals I and J), we have shown that
if a set has an upper bound, then it has a least upper bound. This is a general
phenomenon, called the completeness axiom of the real numbers:
2.7. THE COMPLETENESS AXIOM OF R AND ITS CONSEQUENCES 65
(C) (Completeness axiom of the real numbers) Every nonempty set of real
numbers that is bounded above has a supremum, that is, a least upper bound.
As stated in the last section, we assume that R has this property. Using the
following lemma, we can prove the corresponding statement for infimums.
Lemma 2.29. If A is nonempty and bounded below, then −A := {−a ; a ∈ A}
is nonempty and bounded above, and inf A = − sup(−A) in the sense that inf A
exists and this formula for inf A holds.
Proof. Since A is nonempty and bounded below, the set −A is nonempty and there is
a real number c such that c ≤ a for all a in A. Therefore, −a ≤ −c for all a in A, and
hence the set −A is bounded above by −c. By the completeness axiom, −A has a least upper bound,
which we denote by b. Our lemma is finished once we show that −b is the greatest
lower bound for A. To see this, we know that −a ≤ b for all a in A and so, −b ≤ a
for all a in A. Thus, −b is a lower bound for A. Suppose that b′ ≤ a for all a in A.
Then −a ≤ −b′ for all a in A and so, b ≤ −b′ since b is the least upper bound for
−A. Thus, b′ ≤ −b and hence, −b is indeed the greatest lower bound for A.
This lemma immediately gives the following theorem.
Theorem 2.30. Every nonempty set of real numbers that is bounded below has
an infimum, that is, a greatest lower bound.
The consequences of the completeness property of the real numbers are quite
profound as we now intend to demonstrate!
Step 2: Suppose that $b^n < a$. Let 0 < ε < 1. Then $\varepsilon^m \le \varepsilon$ for any natural
number m, so by the binomial theorem,
$(b + \varepsilon)^n = \sum_{k=0}^{n} \binom{n}{k} b^k \varepsilon^{n-k} = b^n + \sum_{k=0}^{n-1} \binom{n}{k} b^k \varepsilon^{n-k}$
$\le b^n + \sum_{k=0}^{n-1} \binom{n}{k} b^k \varepsilon = b^n + \varepsilon c$,
where c is the positive number $c = \sum_{k=0}^{n-1} \binom{n}{k} b^k$. Since $b^n < a$, we have
$(a - b^n)/c > 0$. Let ε equal $(a - b^n)/c$ or 1/2, whichever is smaller (or equal to 1/2 if
$(a - b^n)/c = 1/2$). Then 0 < ε < 1 and $\varepsilon \le (a - b^n)/c$, so
$(b + \varepsilon)^n \le b^n + \varepsilon c \le b^n + \dfrac{a - b^n}{c} \cdot c = a$.
This shows that b + ε also belongs to A, which contradicts the fact that b is an
upper bound for A.
Step 3: Now suppose that $b^n > a$. Then b > 0 (for if b = 0, then $b^n = 0 \not> a$).
Given any 0 < ε < b, we have $\varepsilon b^{-1} < 1$, which implies $-\varepsilon b^{-1} > -1$, so by
Bernoulli’s inequality (Theorem 2.7),
$(b - \varepsilon)^n = b^n \left(1 - \varepsilon b^{-1}\right)^n \ge b^n \left(1 - n \varepsilon b^{-1}\right) = b^n - \varepsilon c$,
where $c = n b^{n-1} > 0$. Since $a < b^n$, we have $(b^n - a)/c > 0$. Let ε equal $(b^n - a)/c$
or b/2, whichever is smaller (or equal to b/2 if $(b^n - a)/c = b/2$). Then 0 < ε < b
and $\varepsilon \le (b^n - a)/c$, which implies that $-\varepsilon c \ge -(b^n - a)$. Therefore,
$(b - \varepsilon)^n \ge b^n - \varepsilon c \ge b^n - (b^n - a) = a$.
This shows that b − ε is an upper bound for A, which contradicts the fact that b is
the least upper bound for A.
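The choices of ε in Steps 2 and 3 are completely explicit, so they can be computed. The Python sketch below uses exact rational arithmetic; the function names are ours, and the inputs are assumed to satisfy the hypotheses of the respective step. It reproduces both choices and checks the resulting inequalities for a = 2, n = 2.

```python
from fractions import Fraction
from math import comb


def step2_epsilon(a: Fraction, b: Fraction, n: int) -> Fraction:
    """Assuming b >= 0 and b**n < a: return eps in (0, 1) with (b + eps)**n <= a."""
    c = sum(comb(n, k) * b**k for k in range(n))   # c = sum_{k=0}^{n-1} C(n,k) b^k
    return min((a - b**n) / c, Fraction(1, 2))


def step3_epsilon(a: Fraction, b: Fraction, n: int) -> Fraction:
    """Assuming b > 0 and b**n > a: return eps in (0, b) with (b - eps)**n >= a."""
    c = n * b**(n - 1)
    return min((b**n - a) / c, b / 2)


a, n = Fraction(2), 2

eps = step2_epsilon(a, Fraction(1), n)        # b = 1 has b^2 < 2
print(eps, (1 + eps) ** n)                    # 1/3 16/9

eps = step3_epsilon(a, Fraction(2), n)        # b = 2 has b^2 > 2
print(eps, (2 - eps) ** n)                    # 1/2 9/4
```

In the first case 1 + ε = 4/3 still has square at most 2, and in the second case 2 − ε = 3/2 still has square at least 2, which is what the two steps need.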
In particular, $\sqrt{2}$ exists and, as we already know, is an irrational number. Here
are proofs of the familiar root rules memorized from high school.
Theorem 2.32 (Root rules). For any nonnegative real numbers a and b and
natural numbers m and n, we have
$\sqrt[n]{ab} = \sqrt[n]{a}\, \sqrt[n]{b}$, $\sqrt[m]{\sqrt[n]{a}} = \sqrt[mn]{a}$.
Moreover,
$a < b \iff \sqrt[n]{a} < \sqrt[n]{b}$.
Proof. Let $x = \sqrt[n]{a}$ and $y = \sqrt[n]{b}$. Then, $x^n = a$ and $y^n = b$, so
$(xy)^n = x^n y^n = ab$.
By uniqueness of n-th roots, we must have $xy = \sqrt[n]{ab}$. This proves the first identity.
The second identity is proved similarly. Finally, by our power rules theorem
(Theorem 2.22), we have $\sqrt[n]{a} < \sqrt[n]{b} \iff (\sqrt[n]{a})^n < (\sqrt[n]{b})^n \iff a < b$, which proves
the last statement of our theorem.
[Figure 2.6: graph of the greatest integer function.]
We remark that some authors replace the integer n by the integer n − 1 in the
Archimedean property so it reads: Given a real number x > 0 and a real number y,
there is a unique integer n such that (n − 1)x ≤ y < nx. We’ll use this formulation
of the Archimedean property in the proof of Theorem 2.37 below.
Using the Archimedean property, we can define the greatest integer function
as follows (see Figure 2.6): Given any a ∈ R, we define ⌊a⌋ as the greatest
integer less than or equal to a, that is, ⌊a⌋ is the unique integer n satisfying the
inequalities n ≤ a < n + 1. This function will come up various times in the sequel.
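In Python the greatest integer function is available as `math.floor`, which also accepts exact `Fraction` values. The sketch below (the helper name `archimedean_index` is ours) uses it to find the integer n in the formulation (n − 1)x ≤ y < nx of the Archimedean property.

```python
import math
from fractions import Fraction


def archimedean_index(x: Fraction, y: Fraction) -> int:
    """For x > 0, return the unique integer n with (n - 1) * x <= y < n * x."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.floor(y / x) + 1


print(math.floor(Fraction(7, 2)))     # 3
print(math.floor(Fraction(-7, 2)))    # -4

x, y = Fraction(1, 3), Fraction(5, 2)
n = archimedean_index(x, y)
print(n, (n - 1) * x <= y < n * x)    # 8 True
```

Note that the floor of a negative number rounds down, not toward zero, in agreement with the definition n ≤ a < n + 1.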
We now prove an important fact concerning the rational and irrational numbers.
Theorem 2.37 (Density of the (ir)rationals). Between any two distinct real numbers
there is a rational number and an irrational number.
Proof. Let x < y. We first prove that there is a rational number between x
and y. Indeed, y − x > 0, so by our corollary there is a natural number m such
that 1/m < y − x. By the Archimedean principle, there is an integer n such that
$n - 1 \le mx < n \implies \dfrac{n}{m} - \dfrac{1}{m} \le x < \dfrac{n}{m}$.
In particular, x < n/m, and
$\dfrac{n}{m} \le \dfrac{1}{m} + x < (y - x) + x = y \implies \dfrac{n}{m} < y$.
Thus, the rational number n/m is between x and y.
To prove that between x and y is an irrational number, note that $x - \sqrt{2} < y - \sqrt{2}$,
so by what we just proved above, there is a rational number r such that
$x - \sqrt{2} < r < y - \sqrt{2}$. Adding $\sqrt{2}$, we obtain
$x < \xi < y$, where $\xi = r + \sqrt{2}$.
Note that ξ is irrational, for if it were rational, then $\sqrt{2} = \xi - r$ would also be
rational, which we know is false. This completes our proof.
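The first half of the proof is constructive, and it translates directly into a short procedure: pick m with 1/m < y − x, then take the integer n with n − 1 ≤ mx < n. The Python sketch below follows those two steps; the function name and the specific choice m = ⌊1/(y − x)⌋ + 1 are ours, and exact `Fraction` inputs are used to avoid rounding issues.

```python
import math
from fractions import Fraction


def rational_between(x: Fraction, y: Fraction) -> Fraction:
    """For x < y, return a rational n/m with x < n/m < y, as in the proof."""
    if not x < y:
        raise ValueError("need x < y")
    m = math.floor(1 / (y - x)) + 1   # a natural number with 1/m < y - x
    n = math.floor(m * x) + 1         # the integer with n - 1 <= m*x < n
    return Fraction(n, m)


q = rational_between(Fraction(1, 3), Fraction(1, 2))
print(q)  # 3/7

q = rational_between(Fraction(-5, 2), Fraction(-9, 4))
print(q)  # -12/5
```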
[Figure: nested closed intervals $[a_1, b_1] \supseteq [a_2, b_2] \supseteq [a_3, b_3] \supseteq \cdots \supseteq [a_n, b_n]$.]
We end this section with a discussion of maximums and minimums. Given any
set A of real numbers, a number a is called the maximum of A if a ∈ A and
a = sup A, in which case we write a = max A. Similarly, a is called the minimum
of A if a ∈ A and a = inf A, in which case we write a = min A. For instance,
1 = max(0, 1], but (0, 1) has no maximum, only a supremum, which is also 1. In
Problem 4, we prove that any finite set has a maximum.
Exercises 2.7.
1. What are the supremums and infimums of the following sets? Give careful proofs of
your answers. The “1/n-principle” might be helpful in some of your proofs.
(a) $A = \{1 + \frac{5}{n} \,;\, n = 1, 2, 3, \ldots\}$
(b) $B = \{3 - \frac{8}{n^3} \,;\, n = 1, 2, 3, \ldots\}$
(c) $C = \{1 + (-1)^n \frac{1}{n} \,;\, n = 1, 2, 3, \ldots\}$
(d) $D = \{(-1)^n + \frac{1}{n} \,;\, n = 1, 2, 3, \ldots\}$
(e) $E = \{\sum_{k=1}^{n} \frac{1}{2^k} \,;\, n = 1, 2, \ldots\}$
(f) $F = \{(-1)^n + \frac{(-1)^{n+1}}{n} \,;\, n = 1, 2, 3, \ldots\}$.
2. Are the following sets bounded above? Are they bounded below? If the supremum or
infimum exists, find it and prove your answer.
(a) $A = \{1 + n^{(-1)^n} \,;\, n = 1, 2, 3, \ldots\}$, (b) $B = \{2^{n^{(-1)^n}} \,;\, n = 1, 2, 3, \ldots\}$.
(c) If 0 < a < 1 and x ≥ 0, define $a^x := 1/(1/a)^x$; note that 1/a > 1 so $(1/a)^x$ is
defined. Finally, if a > 0 and x < 0, define $a^x := (1/a)^{-x}$; note that −x > 0 so
$(1/a)^{-x}$ is defined. Prove (2.29) for any a, b, x, y ∈ R with a, b > 0.
10. Let $p(x) = ax^2 + bx + c$ be a quadratic polynomial with real coefficients and with
a ≠ 0. Prove that p(x) has a real root (that is, an x ∈ R with p(x) = 0) if and only if
$b^2 - 4ac \ge 0$, in which case, the root(s) are given by the quadratic formula:
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$.
11. Let $\{I_n = [a_n, b_n]\}$ be a nested sequence of nonempty closed and bounded intervals
and put $A = \{a_n \,;\, n \in \mathbb{N}\}$ and $B = \{b_n \,;\, n \in \mathbb{N}\}$. Show that sup A and inf B exist and
$\bigcap I_n = [\sup A, \inf B]$.
12. In this problem we give a characterization of the completeness axiom (C) of R in terms
of intervals as explained by Christian [50]. A subset A of R is convex if given any x
and y in A and t ∈ R with x < t < y, we have t ∈ A.
(a) Assume axiom (C). Prove that all convex subsets of R are intervals.
(b) Assume that all convex subsets of R are intervals. Prove the completeness property
(C) of R. Suggestion: Let I be the set of all upper bounds of a nonempty set A
that is bounded above. Show that I is convex.
This problem shows that the completeness axiom is equivalent to the statement that
all convex sets are intervals.
and similarly, 0+x = x. These computations prove properties (A1) and (A3) below,
and you can check that the following further properties of addition are satisfied:
Addition satisfies
(A1) x + y = y + x; (commutative law)
(A2) (x + y) + z = x + (y + z); (associative law)
(A3) there is an element 0 such that x + 0 = x = 0 + x; (additive identity)
(A4) for each x there is a −x such that
x + (−x) = 0 and (−x) + x = 0. (additive inverse)
Of course, we usually write x + (−y) as x − y.
Multiplication by real numbers satisfies
(M1) 1 · x = x; (multiplicative identity)
(M2) (a b) x = a (bx); (associative law)
and finally, addition and multiplication are related by
(D) a(x + y) = ax + ay and (a + b)x = ax + bx. (distributive law)
We remark that any set, say with elements denoted by x, y, z, . . ., called vec-
tors, with an operation of “+” and an operation of multiplication by real numbers
that satisfy properties (A1) – (A4), (M1) – (M2), and (D), is called a real vector
space. If the scalars a, b, 1 in (M1) – (M2) and (D) are elements of a field F, then
we say that the vector space is an F vector space or a vector space over F. In
particular, R^m is a real vector space.
2.8.2. Inner products. We now review inner products, also called dot prod-
ucts in elementary calculus. We all probably know that given any two vectors
x = (x_1, x_2, x_3) and y = (y_1, y_2, y_3) in R^3, the dot product x · y is the number
\[ x \cdot y = x_1 y_1 + x_2 y_2 + x_3 y_3. \]
We generalize this to R^m as follows: If x = (x_1, . . . , x_m) and y = (y_1, . . . , y_m), then
we define the inner product (also called the dot product or scalar product)
⟨x, y⟩ as the real number
\[ \langle x, y \rangle := x_1 y_1 + x_2 y_2 + \cdots + x_m y_m = \sum_{j=1}^{m} x_j y_j. \]
It is also common to denote ⟨x, y⟩ by x · y or (x, y), but we prefer the angle bracket
notation ⟨x, y⟩, which is popular in physics, because the dot “·” can be confused
with multiplication and the parentheses “( , )” can be confused with ordered pair.
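As a small illustration of the definition, here is a sketch of the inner product on R^m in Python. The function name `inner` and the sample vectors are our own choices for illustration, not notation from the text.

```python
from typing import Sequence


def inner(x: Sequence[float], y: Sequence[float]) -> float:
    """Return <x, y> = x_1 y_1 + ... + x_m y_m for two vectors in R^m."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of components")
    return sum(xj * yj for xj, yj in zip(x, y))


# Hypothetical sample vectors in R^3.
x = (1, 2, 3)
y = (4, -5, 6)
print(inner(x, y))  # 1*4 + 2*(-5) + 3*6
```

Representing a vector as a tuple of its components mirrors the notation x = (x_1, . . . , x_m).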
In the following theorem we summarize some of the main properties of ⟨·, ·⟩.
Theorem 2.39. For any vectors x, y, z in R^m and real number a,
(i) ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 if and only if x = 0.
(ii) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ and ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩.
(iii) ⟨a x, y⟩ = a⟨x, y⟩ and ⟨x, a y⟩ = a⟨x, y⟩.
(iv) ⟨x, y⟩ = ⟨y, x⟩.
Proof. To prove (i), just note that
\[ \langle x, x \rangle = x_1^2 + x_2^2 + \cdots + x_m^2 \]
and x_j^2 ≥ 0 for each j. If ⟨x, x⟩ = 0, then as the only way a sum of nonnegative
numbers is zero is that each number is zero, we must have x_j^2 = 0 for each j. Hence,
74 2. NUMBERS, NUMBERS, AND MORE NUMBERS
The other identity ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩ is proved similarly. The proofs of (iii)
and (iv) are also simple computations, so we leave their proofs to the reader.
We remark that any real vector space V with an operation that assigns to
every two vectors x and y in V a real number ⟨x, y⟩ satisfying properties (i) – (iv)
of Theorem 2.39 is called a real inner product space and the operation ⟨·, ·⟩ is
called an inner product on V . In particular, R^m is a real inner product space.
We interpret the norm |x| as the length of the vector x, or the distance of x from
the origin 0. In particular, the squared length |x|^2 of the vector x is given by
Warning: For m > 1, |x| does not mean absolute value of a real number x, it
means norm of a vector x. However, if m = 1, then “norm” and “absolute value”
are the same, because for x = x_1 ∈ R^1 = R, the above definition of norm is √(x_1^2),
which is exactly the absolute value of x_1 according to Problem 7 in Exercises 2.7.
The following inequality relates the norm and the inner product. It is commonly
called the Schwarz inequality or Cauchy-Schwarz inequality after Hermann
Schwarz (1843–1921) who stated it for integrals in 1885 and Augustin Cauchy
(1789–1857) who stated it for sums in 1821. However, (see [94] for the history)
it perhaps should be called the Cauchy-Bunyakovskiı̆-Schwarz inequality be-
cause Viktor Bunyakovskiı̆ (1804–1889), a student of Cauchy, published a related
inequality 25 years before Schwarz. (Note: There is no “t” before the “z.”)
Figure 2.8. The projection of x onto y is (⟨x, y⟩/|y|^2) y and the projection
of x onto the orthogonal complement of y is x − (⟨x, y⟩/|y|^2) y.
\[ 0 \le |x|^2 - \frac{|\langle x, y \rangle|^2}{|y|^2} \implies |\langle x, y \rangle|^2 \le |x|^2 |y|^2. \]
Taking square roots proves the Schwarz inequality. As a side remark (skip this if
you’re not interested), the vector x − (⟨x, y⟩/|y|^2) y that we took the squared length of
didn’t come out of a hat. Recall from your “multi-variable calculus” or “vector
calculus” course, that the projection of x onto y and the projection of x onto the
orthogonal complement of y are given by (⟨x, y⟩/|y|^2) y and x − (⟨x, y⟩/|y|^2) y, respectively; see
Figure 2.8. Thus, all we did above was take the squared length of the projection of
x onto the orthogonal complement of y.
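The projection remark can be checked on a concrete pair of vectors. The following is a minimal sketch using exact rational arithmetic; the helper names `inner` and `project` and the sample vectors are assumptions made for illustration only.

```python
from fractions import Fraction


def inner(x, y):
    """Inner product <x, y> of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def project(x, y):
    """Projection of x onto a nonzero vector y, namely (<x, y>/|y|^2) y."""
    c = Fraction(inner(x, y), inner(y, y))
    return tuple(c * yj for yj in y)


# Hypothetical sample vectors in R^3 with integer components.
x = (3, 1, 2)
y = (1, 1, 0)

p = project(x, y)                        # projection of x onto y
q = tuple(a - b for a, b in zip(x, p))   # projection onto the orthogonal complement

print(inner(q, y))                                # q is orthogonal to y
print(inner(q, q))                                # squared length of q
print(inner(x, y) ** 2 <= inner(x, x) * inner(y, y))  # Schwarz inequality, squared form
```

The squared length of q equals |x|^2 − |⟨x, y⟩|^2/|y|^2, which is exactly the nonnegative quantity used in the proof.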
In the following theorem, we list some of the main properties of the norm | · |.
Proof. (i) follows from Property (i) of Theorem 2.39. To prove (ii), observe
that
\[ |ax| = \sqrt{\langle ax, ax \rangle} = \sqrt{a^2 \langle x, x \rangle} = |a| \sqrt{\langle x, x \rangle} = |a|\,|x|, \]
and therefore |ax| = |a| |x|. To prove the triangle inequality, we use the Schwarz
inequality to get
\[ \begin{aligned}
|x + y|^2 = \langle x + y, x + y \rangle &= |x|^2 + \langle x, y \rangle + \langle y, x \rangle + |y|^2 \\
&= |x|^2 + 2\langle x, y \rangle + |y|^2 \\
&\le |x|^2 + 2|\langle x, y \rangle| + |y|^2 \\
&\le |x|^2 + 2|x|\,|y| + |y|^2,
\end{aligned} \]
where we used the Schwarz inequality at the last step. Thus,
\[ |x + y|^2 \le (|x| + |y|)^2. \]
Taking the square root of both sides proves the triangle inequality.
The second half of (iv) follows from the triangle inequality:
|x ± y| = |x + (±1)y| ≤ |x| + |(±1)y| = |x| + |y|.
To prove the first half | |x| − |y| | ≤ |x ± y| we use the triangle inequality to get
|x| − |y| = |(x − y) + y| − |y| ≤ |x − y|+|y| − |y| = |x − y|
(2.30) =⇒ |x| − |y| ≤ |x − y|.
Switching the letters x and y in (2.30), we get |y| − |x| ≤ |y − x| or −(|x| − |y|) ≤
|x − y|. Combining this with (2.30), we see that
|x| − |y| ≤ |x − y| and − (|x| − |y|) ≤ |x − y| =⇒ | |x| − |y| | ≤ |x − y|,
where we used the definition of absolute value of the real number |x|−|y|. Replacing
y with −y and using that | − y| = |y|, we get | |x| − |y| | ≤ |x + y|. This finishes the
proof of (iv).
We remark that any real vector space V with an operation that assigns to every
vector x in V a nonnegative real number |x|, such that | · | satisfies properties (i) –
(iii) of Theorem 2.41 is called a real normed space and the operation | · | is called
a norm on V . In particular, R^m is a real normed space. The exercises explore
different norms on R^m.
In analogy with the distance between two real numbers, we define the distance
between two vectors x and y in R^m to be the number
|x − y|.
In particular, the triangle inequality implies that given any other vector z, we have
|x − y| = |(x − z) + (z − y)| ≤ |x − z| + |z − y|,
that is,
(2.31) |x − y| ≤ |x − z| + |z − y|.
This inequality is the “genuine” triangle inequality since it represents the geometrically
intuitive fact that the distance between two points x and y is no longer than
the distance traversed by going from x to z and then from z to y; see Figure 2.9.
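The inequality (2.31) can be spot-checked numerically. The sketch below tests it on a small grid of integer points in R^2; the names `norm` and `dist`, the grid, and the floating-point tolerance are our own assumptions, and a finite check of this kind illustrates the inequality but does not prove it.

```python
import math
from itertools import product


def norm(x):
    """Euclidean norm |x| of a vector x."""
    return math.sqrt(sum(xj * xj for xj in x))


def dist(x, y):
    """Distance |x - y| between two vectors."""
    return norm(tuple(a - b for a, b in zip(x, y)))


# All points of R^2 with integer coordinates between -2 and 2.
points = list(product(range(-2, 3), repeat=2))

# Collect triples that would violate |x - y| <= |x - z| + |z - y|.
# The small tolerance absorbs rounding error in the square roots.
violations = [
    (x, y, z)
    for x in points
    for y in points
    for z in points
    if dist(x, y) > dist(x, z) + dist(z, y) + 1e-12
]
print(len(violations))
```

Equality does occur, for instance when z lies on the segment from x to y, so the inequality cannot be made strict in general.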
Finally, we remark that the norm | · | on R^m is sometimes called the ball norm
on R^m for the following reason. Let r > 0 and take m = 3. Then |x| < r means
that √(x_1^2 + x_2^2 + x_3^2) < r, or squaring both sides, we get
\[ x_1^2 + x_2^2 + x_3^2 < r^2, \]
2.8. m-DIMENSIONAL EUCLIDEAN SPACE 77
[Figure 2.9. Three points x, y, z with the distances |x − y|, |x − z|, and |z − y| labeled.]
which simply says that x is inside the ball of radius r. So, if c ∈ R^3, then |x − c| < r
just means that
\[ (x_1 - c_1)^2 + (x_2 - c_2)^2 + (x_3 - c_3)^2 < r^2, \]
which is to say, x is inside the ball of radius r that is centered at the point c =
(c_1, c_2, c_3). Generalizing this notion to m-dimensional space, given c in R^m, we call
the set of all x such that |x − c| < r, or after squaring both sides,
\[ (x_1 - c_1)^2 + (x_2 - c_2)^2 + \cdots + (x_m - c_m)^2 < r^2, \]
the open ball of radius r centered at c. We denote this set by B_r, or B_r(c) to
emphasize that the center of the ball is c. Therefore,
\[ B_r(c) := \{x \in \mathbb{R}^m \,;\, |x - c| < r\}. \tag{2.32} \]
The set of x with < replaced by ≤ is called the closed ball of radius r centered
at c and is denoted by B̄_r or B̄_r(c),
\[ \overline{B}_r(c) := \{x \in \mathbb{R}^m \,;\, |x - c| \le r\}. \]
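Membership in an open or closed ball is easy to test using the squared form of the inequality. This is a sketch only; the function names and the sample center and radius are hypothetical, and r > 0 is assumed so that squaring both sides is legitimate.

```python
def squared_distance(x, c):
    """Return (x_1 - c_1)^2 + ... + (x_m - c_m)^2."""
    return sum((xj - cj) ** 2 for xj, cj in zip(x, c))


def in_open_ball(x, c, r):
    """True when |x - c| < r, tested as squared distance < r^2 (assumes r > 0)."""
    return squared_distance(x, c) < r ** 2


def in_closed_ball(x, c, r):
    """True when |x - c| <= r, tested as squared distance <= r^2 (assumes r > 0)."""
    return squared_distance(x, c) <= r ** 2


# Hypothetical center and radius in R^3.
c = (1, 2, 3)
r = 3

boundary_point = (3, 4, 4)   # squared distance to c is 4 + 4 + 1 = 9 = r^2
print(in_open_ball(boundary_point, c, r))
print(in_closed_ball(boundary_point, c, r))
```

Working with squared distances avoids square roots, so integer inputs are compared exactly; a point at distance exactly r belongs to the closed ball but not to the open one.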
[Figure: triangle with vertices A, B, C, angles α, β, γ at these vertices, and side lengths a = |C − B|, b = |A − C|, c = |A − B|.]
(c) Now let x and y be arbitrary nonzero vectors of R^m. Applying (b) to the vectors
x/|x| and y/|y|, derive Schwarz’s inequality.
3. (Schwarz’s inequality, Proof III) Here’s an “algebraic” proof. Let x, y ∈ R^m with
y ≠ 0 and let p(t) = |x + ty|^2 for t ∈ R. Note that p(t) ≥ 0 for all t.
(a) Show that p(t) can be written in the form p(t) = a t^2 + 2b t + c where a, b, c are
real numbers with a ≠ 0.
(b) Using the fact that p(t) ≥ 0 for all t, prove the Schwarz inequality. Suggestion:
Write p(t) = a(t + b/a)^2 + (c − b^2/a).
4. Prove that for any vectors x and y in R^m, we have
\[ 2|x|^2 |y|^2 - 2\Big( \sum_{n=1}^{m} x_n y_n \Big)^2 = \sum_{k=1}^{m} \sum_{\ell=1}^{m} (x_k y_\ell - x_\ell y_k)^2 \qquad \text{(Lagrange identity)}. \]
Suggestion: To prove the last equality, observe that c^2 = |A − B|^2 = |x − y|^2 where
x = A − C, y = B − C. Compute the dot product |x − y|^2 = ⟨x − y, x − y⟩.
2.9. THE COMPLEX NUMBER SYSTEM 79
(b) Using that sin^2 α = 1 − cos^2 α and that a^2 = b^2 + c^2 − 2bc cos α, prove that
\[ \frac{\sin^2 \alpha}{a^2} = 4\, \frac{s(s-a)(s-b)(s-c)}{a^2 b^2 c^2}, \tag{2.33} \]
where s := (a + b + c)/2 is called the semiperimeter. From (2.33), conclude that
B̄_1 ⊆ Box_1 ⊆ B̄_{√m}.
When m = 2, give a “proof by picture” of these set inequalities by drawing the
three sets B̄_1, Box_1, and B̄_{√2}.
¹² The imaginary number is a fine and wonderful resource of the human spirit, almost an
amphibian between being and not being. Gottfried Leibniz (1646–1716) [141].
¹³ Furthermore, the use of complex numbers is in this case not a calculational trick of applied
mathematics but comes close to being a necessity in the formulation of the laws of quantum
mechanics ... It is difficult to avoid the impression that a miracle confronts us here. Nobel
prize winner Eugene Wigner (1902–1995) responding to the “miraculous” appearance of complex
numbers in the formulation of quantum mechanics [161, p. 208], [244], [245].
In summary, C as a set is just R^2, with the usual addition structure but with a
special multiplication. Of course, we also define −z = (−a, −b) and we write 0 for
(0, 0). Finally, if z = (a, b) ≠ 0 (that is, a ≠ 0 or b ≠ 0), then we define
\[ z^{-1} := \Big( \frac{a}{a^2 + b^2},\ \frac{-b}{a^2 + b^2} \Big). \tag{2.34} \]
Theorem 2.42. The set of complex numbers is a field with (0, 0) (denoted henceforth
by 0) and (1, 0) (denoted henceforth by 1) being the additive and multiplicative
identities, respectively.
Proof. If z, w, u ∈ C, then we need to show that addition satisfies
(A1) z + w = w + z; (commutative law)
(A2) (z + w) + u = z + (w + u); (associative law)
(A3) z + 0 = z = 0 + z; (additive identity)
(A4) for each complex number z,
z + (−z) = 0 and (−z) + z = 0; (additive inverse)
multiplication satisfies
(M1) z · w = w · z; (commutative law)
(M2) (z · w) · u = z · (w · u); (associative law)
(M3) 1 · z = z = z · 1; (multiplicative identity)
(M4) for z ≠ 0, we have
z · z^{-1} = 1 and z^{-1} · z = 1; (multiplicative inverse);
and finally, multiplication and addition are related by
(D) z · (w + u) = (z · w) + (z · u). (distributive law)
The proofs of all these properties are very easy and merely involve using the defini-
tion of addition and multiplication, so we leave all the proofs to the reader, except
for (M4). Here, by definition of multiplication,
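\[ \begin{aligned}
z \cdot z^{-1} &= (a, b) \cdot \Big( \frac{a}{a^2 + b^2},\ \frac{-b}{a^2 + b^2} \Big) \\
&= \Big( a \cdot \frac{a}{a^2 + b^2} - b \cdot \frac{-b}{a^2 + b^2},\ \ a \cdot \frac{-b}{a^2 + b^2} + b \cdot \frac{a}{a^2 + b^2} \Big) \\
&= \Big( \frac{a^2}{a^2 + b^2} + \frac{b^2}{a^2 + b^2},\ 0 \Big) = (1, 0) = 1.
\end{aligned} \]
Similarly, z^{-1} · z = 1, and (M4) is proven.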
In particular, all the arithmetic properties of R hold for C.
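The ordered-pair arithmetic can be tried out directly. The sketch below assumes the multiplication rule (a, b) · (c, d) = (ac − bd, ad + bc), which is the rule used in the computation of z · z^{-1} above, and the inverse formula (2.34); the function names `add`, `mul`, and `inv` are our own.

```python
from fractions import Fraction


def add(z, w):
    """Componentwise addition of ordered pairs."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """Complex multiplication of pairs: (a, b) * (c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def inv(z):
    """Inverse of z = (a, b) != (0, 0) according to (2.34), with integer a and b."""
    a, b = z
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(a, n), Fraction(-b, n))


z = (3, 4)
print(mul(z, inv(z)))            # equals (1, 0)
print(mul((0, 5), inv((0, 5))))  # a = 0 is allowed as long as z != (0, 0)
```

Exact fractions are used so that the product with the inverse comes out as exactly (1, 0) rather than a floating-point approximation.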
2.9.2. The number i. In high school, the complex numbers are introduced
in a slightly different manner, which we now describe.
First, we consider R as a subset of C by the identification of the real number a
with the ordered pair (a, 0); in other words, for the sake of notational convenience, we
do not make a distinction between the complex number (a, 0) and the real number
a. Observe that by definition of addition and multiplication of complex numbers,
(a, 0) + (b, 0) = (a + b, 0)
and
(a, 0) · (b, 0) = (a · b − 0 · 0, a · 0 + 0 · b) = (ab, 0),
which is to say, “a + b = a + b” and “a · b = a · b” under our identification. Thus, our
identification of R preserves the arithmetic operations of R. Moreover, we know
how to multiply real numbers and elements of R^2: a(x, y) = (ax, ay). This also
agrees with our complex number multiplication:
(a, 0) · (x, y) = (a · x − 0 · y, a · y + 0 · x) = (ax, ay).
In summary, our identification of R as first components of ordered pairs in C does
not harm any of the additive or multiplicative structures of C.
The number i, notation introduced in 1777 by Euler [171], is by definition the
complex number
i := (0, 1).
Then using the definition of multiplication of complex numbers, we have
i^2 = i · i = (0, 1) · (0, 1) = (0 · 0 − 1 · 1, 0 · 1 + 1 · 0) = (−1, 0) =⇒ i^2 = −1,
where we used our identification of (−1, 0) with −1. Thus, the complex number
i = (0, 1) is the “imaginary unit” that you learned about in high school; however,
our definition of i avoids the mysterious square root of −1 you probably encountered.¹⁴
Moreover, given any complex number z = (a, b), by definition of addition,
multiplication, i, and our identification of R as a subset of C, we see that
a + b i = a + (b, 0) · (0, 1) = a + (b · 0 − 0 · 1, b · 1 + 0 · 0)
= (a, 0) + (0, b) = (a, b) = z.
Thus, z = a + b i, just as you were taught in high school! By commutativity, we
also have z = a + i b. We call a the real part of z and b the imaginary part of
z, and we denote them by a = Re z and b = Im z, so that
z = Re z + i Im z.
From this point on, we shall typically use the notation z = a + b i = a + i b instead
of z = (a, b) for complex numbers.
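The identities i^2 = −1 and z = a + b i can be replayed with ordered pairs. This is an illustrative sketch: `mul`, `add`, and `real` are hypothetical helper names, and the last line compares with Python's built-in complex type, whose literal `1j` plays the role of i.

```python
def add(z, w):
    """Componentwise addition of ordered pairs."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """Complex multiplication of pairs: (a, b) * (c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def real(a):
    """Identify the real number a with the pair (a, 0)."""
    return (a, 0)


i = (0, 1)

print(mul(i, i))                           # the pair identified with -1
a, b = 2, -7
print(add(real(a), mul(real(b), i)))       # a + b i written as a pair
print(mul(real(3), (4, 5)))                # (a, 0) * (x, y) = (ax, ay)
print(1j * 1j == -1)                       # same identity with built-in complex numbers
```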
¹⁴ That this subject [imaginary numbers] has hitherto been surrounded by mysterious obscurity,
is to be attributed largely to an ill adapted notation. If, for example, +1, -1, and the square
root of -1 had been called direct, inverse and lateral units, instead of positive, negative and
imaginary (or even impossible), such an obscurity would have been out of the question. Carl Friedrich
Gauss (1777–1855).
If two sets A and B have the same cardinality, we sometimes write card(A) =
card(B). One can check that if card(A) = card(B) and card(B) = card(C), then
card(A) = card(C). Thus, cardinality satisfies a “transitive law”.
It is “obvious” that a set cannot have both n elements and m elements where
n ≠ m, but this still needs proof! The proof is based on the “pigeonhole principle”,
which can be interpreted as saying that if m > n and m pigeons are put into n
holes, then at least two pigeons must be put into the same hole.
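For small values of m and n the pigeonhole principle can be confirmed by brute force. The sketch below assumes that N_m denotes the set {1, . . . , m} and represents a function f : N_m −→ N_n as the tuple (f(1), . . . , f(m)); the name `injections` is ours. Such a search is only an illustration for small cases and is no substitute for the induction proof.

```python
from itertools import product


def injections(m, n):
    """All injective functions from {1, ..., m} into {1, ..., n}.

    Each function f is encoded as the tuple (f(1), ..., f(m)); it is
    injective exactly when its m values are pairwise distinct.
    """
    return [f for f in product(range(1, n + 1), repeat=m) if len(set(f)) == m]


# No injections exist when there are more pigeons than holes.
for n in range(1, 5):
    for m in range(n + 1, 6):
        assert injections(m, n) == []

print(len(injections(3, 3)))  # when m = n the injections are the permutations
```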
Theorem 2.45 (Pigeonhole principle). If m > n, then there does not exist
an injection from N_m into N_n.
Proof. We proceed by induction on n. Let m > 1 and f : N_m −→ {1} be any
function. Then f(m) = f(1) = 1, so f is not an injection.
Assume that our theorem is true for n; we shall prove it true for n + 1. Let
m > n + 1 and let f : N_m −→ N_{n+1}. We shall prove that f is not an injection.
First of all, if the range of f is contained in N_n ⊆ N_{n+1}, then we can consider f
as a function into N_n, and hence by the induction hypothesis, f is not an injection. So
assume that f(a) = n + 1 for some a ∈ N_m. If there is another element of N_m
whose image is n + 1, then f is not an injection, so we may assume that a is the only
element of N_m whose image is n + 1. Then f(k) ∈ N_n for k ≠ a, so we can define
a function g : N_{m−1} −→ N_n by “skipping” f(a) = n + 1:
Example 2.31. With I = R, we see that the set of all real numbers is uncountable.
It follows that R^m is uncountable for any m ∈ N; in particular, C = R^2
is uncountable.
Corollary 2.51. The set of irrational numbers in any interval that is not
empty or consisting of a single point is uncountable.
Proof. If the irrationals in such an interval I were countable, then I would be
the union of two countable sets, the irrationals in I and the rationals in I; however,
we know that I is not countable so the irrationals in I cannot be countable.
2.10.4. Roots of polynomials. We already know that the real numbers are
classified into two disjoint sets, the rational and irrational numbers. There is an-
other important classification into algebraic and transcendental numbers. These
numbers have to do with roots of polynomials, so we begin by discussing some
elementary properties of polynomials. We remark that everything we say in this
subsection and the next is valid for real polynomials (polynomials of a real variable
with real coefficients), but it is convenient to work with complex polynomials.
Let n ≥ 1 and
\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_2 z^2 + a_1 z + a_0, \qquad a_n \ne 0, \tag{2.35} \]
be an n-th degree polynomial with complex coefficients (that is, each a_k ∈ C).
Lemma 2.52. For any z, a ∈ C, we can write
\[ p(z) - p(a) = (z - a)\, q(z), \]
where q(z) is a polynomial of degree n − 1.
Proof. First of all, observe that given any polynomial f(z) and a complex
number b, the “shifted” function f(z + b) is also a polynomial in z of the same
degree as f; this can be easily proven using the formula (2.35) for a polynomial. In
particular, r(z) = p(z + a) − p(a) is a polynomial of degree n and hence can be written
in the form
\[ r(z) = b_n z^n + b_{n-1} z^{n-1} + \cdots + b_2 z^2 + b_1 z + b_0, \]
where b_n ≠ 0 (in fact, b_n = a_n but this isn’t needed). Notice that r(0) = 0, which
implies that b_0 = 0, so we can write
\[ r(z) = z\, s(z), \qquad \text{where } s(z) = b_n z^{n-1} + b_{n-1} z^{n-2} + \cdots + b_2 z + b_1 \]
is a polynomial of degree n − 1. Now replacing z with z − a, we obtain
\[ p(z) - p(a) = r(z - a) = (z - a)\, q(z), \]
where q(z) = s(z − a) is a polynomial of degree n − 1.
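The factorization in Lemma 2.52 can also be computed explicitly. The sketch below uses synthetic division (Horner's scheme), which is a different route from the shifting argument in the proof but produces the same q(z); the function names and the sample polynomial are assumptions made for illustration.

```python
def evaluate(coeffs, z):
    """Evaluate a polynomial given by coefficients a_n, ..., a_1, a_0 at z."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value


def divide_by_linear(coeffs, a):
    """Synthetic division of p(z) by (z - a).

    coeffs lists a_n, ..., a_1, a_0 (highest degree first).
    Returns (q, r) where q lists the coefficients of q(z), of degree n - 1,
    and r = p(a), so that p(z) - p(a) = (z - a) q(z).
    """
    running = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        running.append(carry)
    return running[:-1], running[-1]


# Hypothetical example: p(z) = z^3 - 2z + 5 and a = 2.
p = [1, 0, -2, 5]
q, r = divide_by_linear(p, 2)
print(q, r)  # coefficients of q(z) and the value p(2)
```

Here q(z) = z^2 + 2z + 2 and p(2) = 9, and one can expand (z − 2)(z^2 + 2z + 2) = z^3 − 2z − 4 = p(z) − 9 by hand to confirm the identity.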
By the fundamental theorem of algebra (Section 4.8) we’ll see that any poly-
nomial of degree n has exactly n (complex) roots counting multiplicities.
The numbers √2 and ∛5 are irrational, so there are irrational numbers that
are algebraic. Thus, the algebraic numbers include all rational numbers and in-
finitely many irrational numbers, namely those irrational numbers that are roots
of polynomials with integer coefficients. Thus, it might seem as if the algebraic
numbers are uncountable, while the transcendental numbers (those numbers that
are not algebraic) are quite small in comparison. This is in fact not the case, as
was discovered by Cantor.
Theorem 2.54 (Uncountability of transcendental numbers). The set
of all algebraic numbers is countable and the set of all transcendental numbers
is uncountable. The same statement holds for real algebraic and transcendental
numbers.
Proof. We only consider the complex case. An algebraic number is by defi-
nition a complex number satisfying a polynomial equation with integer coefficients
\[ a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0, \qquad a_n \ne 0. \]
Here, n ≥ 1 (that is, this polynomial is nonconstant) in order for a solution to exist.
We define the index of this polynomial as the number
\[ n + |a_n| + |a_{n-1}| + |a_{n-2}| + \cdots + |a_2| + |a_1| + |a_0|. \]
Since |a_n| ≥ 1, this index is at least 2 for any nonconstant polynomial with integer
coefficients. Given any natural number k there are only a finite number of non-
constant polynomials with index k. For instance, there are only two nonconstant
polynomials of index 2, the polynomials z and −z. There are eight nonconstant
polynomials of index 3:
\[ z^2,\quad z + 1,\quad z - 1,\quad 2z,\quad -z^2,\quad -z + 1,\quad -z - 1,\quad -2z, \]
and there are 22 polynomials of index 4:
\[ z^3,\quad 2z^2,\quad z^2 + z,\quad z^2 - z,\quad z^2 + 1,\quad z^2 - 1,\quad 3z,\quad 2z + 1,\quad 2z - 1,\quad z + 2,\quad z - 2, \]
together with the negatives of these polynomials, and so forth. Since any polynomial
of a given degree has finitely many roots (Theorem 2.53) and there are only a
finite number of polynomials with a given index, the set A_k, consisting of all roots
(algebraic numbers) of polynomials of index k, is a finite set. Since every polynomial
with integer coefficients has an index, it follows that the set of all algebraic numbers
is the union ⋃_{k=2}^{∞} A_k. Since a countable union of countable sets is countable, the
set of algebraic numbers is countable!
We know that the set of complex numbers is the disjoint union of algebraic numbers
and of transcendental numbers. Since the set of complex numbers is uncountable,
the set of transcendental numbers must therefore be uncountable.
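The counts of polynomials of small index quoted in the proof can be reproduced by direct enumeration. In this sketch a polynomial is encoded as its tuple of integer coefficients (a_n, . . . , a_0); the function name `polynomials_of_index` is our own.

```python
from itertools import product


def polynomials_of_index(k):
    """All nonconstant integer polynomials of index k.

    A polynomial of degree n >= 1 is encoded as (a_n, ..., a_1, a_0) with
    a_n != 0, and its index is n + |a_n| + ... + |a_1| + |a_0|.
    """
    found = []
    for n in range(1, k):          # the degree n uses up n of the index
        budget = k - n             # what remains for |a_n| + ... + |a_0|
        for coeffs in product(range(-budget, budget + 1), repeat=n + 1):
            if coeffs[0] != 0 and sum(abs(c) for c in coeffs) == budget:
                found.append(coeffs)
    return found


for k in (2, 3, 4):
    print(k, len(polynomials_of_index(k)))
```

For each k the search space is finite, which is the key point of the proof: only finitely many polynomials, and hence finitely many roots, belong to each index.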
Exercises 2.10.
1. Here are some countability proofs.
(a) Prove that the set of prime numbers is countably infinite.
(b) Let N_0 = {0, 1, 2, . . .}. Show that N_0 is countably infinite. Define f :
N_0 × N_0 −→ N_0 by f(0, 0) = 0 and for (m, n) ≠ (0, 0), define
\[ f(m, n) = 1 + 2 + 3 + \cdots + (m + n) + n = \frac{1}{2}(m + n)(m + n + 1) + n. \]
Can you see (do not prove) that this function counts N_0 × N_0 as shown in
Figure 2.12? Unfortunately, it is not so easy to show that f is a bijection.
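[Figure 2.12. The points (0, 0), (1, 0), (2, 0), . . . , (0, 1), (1, 1), (2, 1), . . . , (0, 2), (1, 2), (2, 2), . . . of N_0 × N_0 arranged in a grid, with arrows indicating the order in which they are counted.]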
(c) Write Q as a countable union of countable sets, so giving another proof that
the rational numbers are countable.
(d) Prove that f : N × N −→ N defined by f(m, n) = 2^{m−1}(2n − 1) is a bijection;
this gives another proof that N × N is countable.
2. Here are some formulas for polynomials in terms of roots.
(a) If c1 , . . . , ck are roots of a polynomial p(z) of degree n (with each root re-
peated according to multiplicity), prove that p(z) = (z − c1 )(z − c2 ) · · · (z −
ck ) q(z), where q(z) is a polynomial of degree n − k.
(b) If k = n, prove that p(z) = an (z − c1 )(z − c2 ) · · · (z − cn ) where an is the
coefficient of z n in the formula (2.35) for p(z).
3. Prove that if A is an infinite set, then A has a countably infinite subset.
4. Let X be any set and denote the set of all functions from X into {0, 1} by Z_2^X.
Define a map from the power set of X into Z_2^X by
\[ f : \mathcal{P}(X) \longrightarrow \mathbb{Z}_2^X, \qquad X \supseteq A \longmapsto f(A) := \chi_A, \]
where χ_A is the characteristic function of A. Prove that f is a bijection. Conclude
that P(X) has the same cardinality as Z_2^X.
5. Suppose that card(X) = n. Prove that card(P(X)) = 2^n. Suggestion: There
are many proofs you can come up with; here’s one using the previous problem.
Assuming that X = {0, 1, . . . , n − 1}, which we may (why?), we just have to
prove that card(Z_2^X) = 2^n. To prove this, define F : Z_2^X −→ {0, 1, 2, . . . , 2^n − 1}
as follows: If f : X −→ {0, 1} is a function, then denoting f(k) by a_k, define
\[ F(f) := a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + \cdots + a_1 2^1 + a_0. \]
Prove that F is a bijection (Section 2.5 will come in handy).
6. (Cantor’s theorem) This theorem is simple to prove yet profound in nature.
(a) Prove that there can never be a surjection of a set A onto its power set P(A)
(This is called Cantor’s theorem). In particular, card(A) 6= card(P(A)).
Suggestion: Suppose not and let f be such a surjection. Consider the set
\[ B = \{a \in A \,;\, a \notin f(a)\} \subseteq A. \]
Derive a contradiction from the assumption that f is surjective. Cantor’s
theorem shows that by taking power sets one can always get bigger and
bigger sets.
(b) Prove that the set of all subsets of N is uncountable.
(c) From Cantor’s theorem and Problem 4 prove that the set of all sequences of
0’s and 1’s is uncountable. Here, a sequence is just a function from N into
{0, 1}, which can also be thought of as a list (a_1, a_2, a_3, a_4, . . .) where each
a_k is either 0 or 1.
7. (Vredenduin’s paradox [234]) Here is another paradox related to Russell’s
paradox. Assume that A = {{a} ; a is a set} is a well-defined set. Let B ⊆ A
2.10. CARDINALITY AND “MOST” REAL NUMBERS ARE TRANSCENDENTAL 91
be the subset consisting of all sets of the form {a} where a ∈ P(A). Define
g : P(A) −→ B by g(V ) := {V }.
Show that g is a bijection and then derive a contradiction to Cantor’s theorem.
This shows that A is not a set.
8. We define a statement as a finite string of symbols found on the common
computer keyboard (we regard a space as a symbol). E.g. Binghamton University
is sooo great! Math is fun! is a statement. Let’s suppose there are 100 symbols
on the common keyboard.
(a) Let A be the set of all statements. What’s the cardinality of A?
(b) Is the set of all possible mathematical proofs countable? Why?
CHAPTER 3
Analysis is often described as the study of infinite processes, of which the study
of sequences and series forms the backbone. It is dealing with the concept of the
“infinite” in infinite processes that makes analysis technically challenging. In fact,
the subject of sequences is when real analysis becomes “really hard”.
Let us consider the following infinite series that Euler mentioned:
s = 1 − 1 + 1 − 1 + 1 − 1 + 1 − 1 + ··· .
Let’s manipulate this infinite series without being too careful. First, we notice that
s = (1 − 1) + (1 − 1) + (1 − 1) + · · · = 0 + 0 + 0 + · · · = 0,
s = 1 − (1 − 1) − (1 − 1) − (1 − 1) − · · · = 1 − 0 − 0 − 0 − · · · = 1,
2s = 2 − 2 + 2 − 2 + · · · = 1 + 1 − 1 − 1 + 1 + 1 − 1 − 1 + · · ·
= 1 + (1 − 1) − (1 − 1) + (1 − 1) − (1 − 1) + · · ·
= 1 + 0 − 0 + 0 − 0 + · · · = 1,
so s = 1/2! This example shows us that we need to be careful in dealing with the
infinite. In the pages that follow we “tame the infinite” with rigorous definitions.
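One way to see why the manipulations above cannot all be trusted is to look at the partial sums of 1 − 1 + 1 − 1 + · · ·. The sketch below assumes the usual approach of studying a series through its partial sums; the variable names are our own.

```python
from itertools import accumulate

# First eight terms of the series: 1, -1, 1, -1, ...
terms = [(-1) ** k for k in range(8)]

# Running totals s_1, s_2, ..., s_8 of those terms.
partial_sums = list(accumulate(terms))

print(terms)
print(partial_sums)
```

The partial sums alternate between 1 and 0 forever and never settle on a single value, so none of 0, 1, or 1/2 can be defended as "the" sum without a precise definition.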
Another highlight of this chapter is our study of the number e (Euler’s number),
which you have seen countless times in calculus and which pops up everywhere
including economics (compound interest), population growth, radioactive decay,
probability, etc. We shall prove two of the most famous formulas for this number:
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots = \lim_{n \to \infty} \Big( 1 + \frac{1}{n} \Big)^n. \]
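Both formulas can be explored numerically before they are proved. The following is a rough floating-point sketch, with hypothetical function names `e_series` and `e_limit`; it suggests the two expressions approach the same number but proves nothing.

```python
import math


def e_series(n):
    """Partial sum 1 + 1/1! + 1/2! + ... + 1/n!."""
    total = 1.0
    term = 1.0
    for k in range(1, n + 1):
        term /= k          # term is now 1/k!
        total += term
    return total


def e_limit(n):
    """The n-th term (1 + 1/n)^n of the limit formula."""
    return (1 + 1 / n) ** n


for n in (1, 5, 10, 15):
    print(n, e_series(n))

for n in (10, 1000, 10 ** 6):
    print(n, e_limit(n))

print(math.e)
```

The series settles down after only a handful of terms, while (1 + 1/n)^n approaches the same value far more slowly.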
94 3. INFINITE SEQUENCES OF REAL AND COMPLEX NUMBERS
We shall derive a few of the exponential function’s many properties including its
relationship to Euler’s number e. As a bonus prize, in Section 3.7 we’ll also prove
that e is irrational and we look at a useful (but little publicized) theorem called
Tannery’s theorem, which is a very handy result we’ll use in subsequent sections.
Finally, in Section 3.8 we see how real numbers can be represented as decimals
(with respect to arbitrary bases) and we look at Cantor’s famous “constructive”
diagonal argument.
Chapter 3 objectives: The student will be able to . . .
• apply the rigorous ε-N definition of convergence for sequences and series.
• determine when a sequence is monotone, Cauchy, or has a convergent subse-
quence (Bolzano-Weierstrass), and when a series converges (absolutely).
• define the exponential function and the number e.
• explain Cantor’s diagonal argument.
shall work with sequences starting at n = 1, although all the results we shall discuss
work for sequences starting with any index.
Example 3.1. Some examples of real sequences (that is, sequences in R^1 = R)
include¹
\[ 3,\ 3.1,\ 3.14,\ 3.141,\ 3.1415,\ \ldots \]
and
\[ 1,\ \frac{1}{2},\ \frac{1}{3},\ \frac{1}{4},\ \frac{1}{5},\ \frac{1}{6},\ \ldots,\ a_n = \frac{1}{n},\ \ldots. \]
We are mostly interested in real or complex sequences. Here, by a complex
sequence we simply mean a sequence in R^2 where we are free to use the notation
of i for (0, 1) and the multiplicative structure.
Example 3.2. The following sequence is a complex sequence:
\[ i,\ i^2 = -1,\ i^3 = -i,\ i^4 = 1,\ \ldots,\ a_n = i^n,\ \ldots \]
Although we shall focus on R and C sequences in this book, later on you might
deal with topology and calculus in R^m (as in, for instance, [136]), so for your later
psychological health we might as well get used to working with R^m instead of R^1.
We now try to painstakingly motivate a precise definition of convergence (so
please bear with me). Intuitively, saying that a sequence {a_n} in R^m converges to an
element L in R^m means that a_n is “as close as we want” to L for n “sufficiently large”.
We now make the terms in quotes rigorous. First of all, what does “as close as we
want” mean? We take it to mean that given any error, say ε > 0 (e.g. ε = 0.01),
for n “sufficiently large” we can approximate L by a_n to within an error of ε. In
other words, for n “sufficiently large” the difference between L and a_n is within ε:
\[ |a_n - L| < \varepsilon. \]
Now what does for n “sufficiently large” mean? We define it to mean that there is
a real number N such that for all n > N , a specified property holds (e.g. the above
inequality); thus, for all n > N we have |a_n − L| < ε, or using symbols,
\[ n > N \implies |a_n - L| < \varepsilon. \tag{3.1} \]
In conclusion: For any given error ε > 0 there is an N such that (3.1) holds.²
We now summarize our findings as a precise definition. Let {a_n} be a sequence
in R^m. We say that {a_n} converges (or tends) to an element L in R^m if, for
every ε > 0, there is an N ∈ R such that for all n > N , |a_n − L| < ε. Because this
definition is so important, we display it: {a_n} converges to L if,
for every ε > 0, there is an N ∈ R such that n > N =⇒ |a_n − L| < ε.
We call {a_n} a convergent sequence, L the limit of {a_n}, and we usually denote
the fact that {a_n} converges to L in one of four ways:
\[ a_n \to L, \qquad a_n \to L \ \text{as } n \to \infty, \qquad \lim a_n = L, \qquad \lim_{n \to \infty} a_n = L. \]
If a sequence does not converge (to any element of R^m), we say that it diverges.
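The quantifiers in the definition can be made concrete with a small experiment. The sketch below tests, for a given ε and a candidate N, whether |a_n − L| < ε holds for all indices N < n ≤ n_max; the function name `tail_within` and the cutoff `n_max` are assumptions of this illustration. A finite check can expose a bad choice of N, but it can never prove convergence, since the definition concerns all n > N.

```python
import math


def tail_within(a, L, eps, N, n_max=10_000):
    """Check |a(n) - L| < eps for every integer n with N < n <= n_max.

    `a` is a function giving the n-th term (real or complex valued).
    Indices are assumed to start at n = 1.
    """
    start = max(1, math.floor(N) + 1)   # smallest admissible index exceeding N
    return all(abs(a(n) - L) < eps for n in range(start, n_max + 1))


def reciprocal(n):
    """The sequence a_n = 1/n."""
    return 1 / n


print(tail_within(reciprocal, 0, 0.01, N=100))  # N = 1/eps is large enough
print(tail_within(reciprocal, 0, 0.01, N=50))   # N = 50 is too small for eps = 0.01
```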
We can also state the definition of convergence in terms of open balls. Observe
¹ We’ll talk about decimal expansions of real numbers in Section 3.8 and π in Chapter 4.
² One magnitude is said to be the limit of another magnitude when the second may approach
the first within any given magnitude, however small, though the second may never exceed the
magnitude it approaches. Jean Le Rond d’Alembert (1717–1783). The article on Limite in the
Encyclopédie 1754.
[Figure 3.1. The points a_1, a_2, a_3, . . . , a_n and the limit L, with the open ball B_ε(L) around L.]
that |a_n − L| < ε is just saying that a_n ∈ B_ε(L), the open ball of radius ε centered
at L (see formula (2.32) in Section 2.8). Therefore, a_n → L in R^m if,
for every ε > 0, there is an N ∈ R such that n > N =⇒ a_n ∈ B_ε(L).
See Figure 3.1. We’ll not emphasize this interpretation of limit but we state it
because the open ball idea will occur in other classes, in particular, when studying
metric spaces in topology.
3.1.2. Standard examples of ε-N arguments. We now give some standard
examples of using our “ε-N ” definition of limit.
Example 3.3. We shall prove that the sequence {1/2, 1/3, 1/4, . . .} converges
to zero:
\[ \lim \frac{1}{n+1} = 0. \]
In general, any sequence {a_n} that converges to zero is called a null sequence.
Thus, we claim that {1/(n + 1)} is a null sequence. Let ε > 0 be any given positive
real number. We want to prove there exists a real number N such that
\[ n > N \implies \Big| \frac{1}{n+1} - 0 \Big| = \frac{1}{n+1} < \varepsilon. \tag{3.2} \]
To find such a number N , we can proceed in many ways. Here are two ways.
(I) For our first way, we observe that
\[ \frac{1}{n+1} < \varepsilon \iff \frac{1}{\varepsilon} < n + 1 \iff \frac{1}{\varepsilon} - 1 < n. \tag{3.3} \]
For this reason, let us choose N to be the real number N = 1/ε − 1. Let
n > N , that is, N < n or using the definition of N , 1/ε − 1 < n. Then
by (3.3), we have 1/(n + 1) < ε. In summary, for n > N , we have proved
that 1/(n + 1) < ε. This proves (3.2). Thus, by definition of convergence,
1/(n + 1) → 0.
(II) Another technique is to try and simplify the right-hand side of (3.2). Since
n < n + 1, we have 1/(n + 1) < 1/n. Therefore,
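\[ \text{if } \frac{1}{n} < \varepsilon, \text{ then because } \frac{1}{n+1} < \frac{1}{n}, \text{ we have } \frac{1}{n+1} < \varepsilon \text{ also.} \tag{3.4} \]
Now we can make 1/n < ε easily since
\[ \frac{1}{n} < \varepsilon \iff \frac{1}{\varepsilon} < n. \]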
3.1. CONVERGENCE AND ε-N ARGUMENTS FOR LIMITS OF SEQUENCES 97
With this scratch work done, let us now choose N = 1/ε. Let n > N , that
is, N < n or using the definition of N , 1/ε < n. Then we certainly have
1/n < ε, and hence by (3.4), we know that 1/(n + 1) < ε too. In summary,
for n > N , we have proved that 1/(n + 1) < ε. This proves (3.2).
Note that in (I) and (II), we found different N ’s (namely N = 1/ε − 1 in (I)
and N = 1/ε in (II)), but this doesn’t matter because to prove (3.2) we just need to
show such an N exists; it doesn’t have to be unique and in general, many different
N ’s will work. We remark that a similar argument shows that the sequence {1/n}
is also a null sequence: lim 1/n = 0.
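The two choices N = 1/ε − 1 and N = 1/ε from Example 3.3 can be compared on a few values of ε. This is a sketch using exact fractions; the function name `works`, the sample values of ε, and the finite window of 1000 indices are our own choices, so the check illustrates (3.2) rather than proving it.

```python
import math
from fractions import Fraction


def works(eps, N, count=1000):
    """Check 1/(n + 1) < eps for the first `count` integers n > N (with n >= 1)."""
    start = max(1, math.floor(N) + 1)
    return all(Fraction(1, n + 1) < eps for n in range(start, start + count))


for eps in (Fraction(1, 10), Fraction(3, 100), Fraction(1, 1000)):
    N_first = 1 / eps - 1    # the choice made in (I)
    N_second = 1 / eps       # the choice made in (II)
    print(eps, works(eps, N_first), works(eps, N_second))

# The choice in (I) is sharp: with eps = 1/10 and N = 9, the index n = 9
# itself gives 1/(n + 1) = 1/10, which is not strictly less than eps.
print(Fraction(1, 9 + 1) < Fraction(1, 10))
```

Both choices pass, which matches the remark that N need not be unique; the second is simply a little more generous than necessary.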
Example 3.4. Here’s a harder example. Let’s prove that
\[ \lim \frac{2n^2 - n}{n^2 - 9} = 2. \]
For the sequence a_n = (2n^2 − n)/(n^2 − 9), we take the indices to be n = 4, 5, 6, . . .
(since for n = 3 the quotient is undefined). Let ε > 0 be given. We want to prove
there exists a real number N such that the following statement holds:
\[ n > N \implies \Big| \frac{2n^2 - n}{n^2 - 9} - 2 \Big| < \varepsilon. \]
One technique to prove this is to try and “massage” (simplify) the absolute value
on the right as much as we can. For instance, we first can combine fractions:
\[ \Big| \frac{2n^2 - n}{n^2 - 9} - 2 \Big| = \Big| \frac{2n^2 - n}{n^2 - 9} - \frac{2n^2 - 18}{n^2 - 9} \Big| = \Big| \frac{18 - n}{n^2 - 9} \Big|. \tag{3.5} \]
Second, just so that we don’t have to worry about absolute values, we can get rid
of them by using the triangle inequality: for n = 4, 5, . . ., we have
\[ \Big| \frac{18 - n}{n^2 - 9} \Big| \le \frac{18 + n}{n^2 - 9}. \tag{3.6} \]
Third, just for topping on the cake, let us make the top of the right-hand fraction
a little simpler by observing that 18 ≤ 18n, so we conclude that
\[ \frac{18 + n}{n^2 - 9} \le \frac{18n + n}{n^2 - 9} = \frac{19n}{n^2 - 9}. \tag{3.7} \]
In conclusion, we have “massaged” our expression to the following inequality:
\[ \Big| \frac{2n^2 - n}{n^2 - 9} - 2 \Big| \le \frac{19n}{n^2 - 9}. \]
To show that this can be made less than ε, we need to show that the denominator
can’t get too small (otherwise the fraction $(19/n)/(1 - 9/n^2)$ might
get large). To this end, observe that $\frac{9}{n^2} \le \frac{9}{4^2}$ for n ≥ 4, so
\[ \text{for } n \ge 4, \quad 1 - \frac{9}{n^2} \ge 1 - \frac{9}{4^2} = 1 - \frac{9}{16} = \frac{7}{16} > \frac{1}{3}. \]
Hence, for n ≥ 4, we have $\frac{1}{3} < 1 - \frac{9}{n^2}$, which is to say, $\frac{1}{1 - 9/n^2} < 3$. Thus,
\[ \text{for } n \ge 4, \quad \frac{19n}{n^2 - 9} = \frac{19/n}{1 - 9/n^2} < (19/n) \cdot 3 = \frac{57}{n}. \tag{3.9} \]
Therefore, we can satisfy (3.8) by making 57/n < ε instead. Now,
\[ \frac{57}{n} < \varepsilon \iff \frac{57}{\varepsilon} < n. \tag{3.10} \]
Because of (3.9) and (3.10), let us pick N to be the larger of 3 and 57/ε.
We’ll prove that this N works for (3.8). Let n > N , which implies that
n > 3 and n > 57/ε. In particular, n ≥ 4 and ε > 57/n. Therefore,
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.9)}}{<} \frac{57}{n} \overset{\text{by (3.10)}}{<} \varepsilon. \]
This proves (3.8).
(II) For our second method, observe that $n^2 - 9 \ge n^2 - 9n$ since 9 ≤ 9n. Hence,
\[ \text{for } n > 9, \quad \frac{1}{n^2 - 9} \le \frac{1}{n^2 - 9n} = \frac{1}{n(n - 9)}, \]
where we chose n > 9 so that n(n − 9) is positive. Thus,
\[ \text{for } n > 9, \quad \frac{19n}{n^2 - 9} \le \frac{19n}{n(n - 9)} = \frac{19}{n - 9}. \tag{3.11} \]
So, we can satisfy (3.8) by making 19/(n − 9) < ε instead. Now, for n > 9,
\[ \frac{19}{n - 9} < \varepsilon \iff \frac{19}{\varepsilon} < n - 9 \iff 9 + \frac{19}{\varepsilon} < n. \tag{3.12} \]
Because of (3.11) and (3.12), let us pick N = 9 + 19/ε. We’ll prove that this
N works for (3.8). Let n > N , which implies, in particular, that n > 9.
Therefore,
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.11)}}{\le} \frac{19}{n - 9} \overset{\text{by (3.12)}}{<} \varepsilon. \]
This proves (3.8).
(III) For our last (and my favorite of the three) method, we factor the bottom:
\[ \frac{19n}{n^2 - 9} = \frac{19n}{(n + 3)(n - 3)} = \frac{19n}{n + 3} \cdot \frac{1}{n - 3} < 19 \cdot \frac{1}{n - 3}, \tag{3.13} \]
where we used the fact that $\frac{19n}{n+3} < \frac{19n}{n} = 19$. Now (solving for n),
\[ \frac{19}{n - 3} < \varepsilon \iff 3 + \frac{19}{\varepsilon} < n. \tag{3.14} \]
For this reason, let us pick N = 3 + 19/ε. We’ll show that this N satisfies
(3.8). Indeed, for n > N (that is, 3 + 19/ε < n), we have
\[ \frac{19n}{n^2 - 9} \overset{\text{by (3.13)}}{<} \frac{19}{n - 3} \overset{\text{by (3.14)}}{<} \varepsilon. \]
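The three choices of N found in methods (I), (II), and (III) can be compared with a short Python sketch. The function names `error` and `works`, the sample values of ε, and the search window are assumptions made for illustration; the check covers only finitely many n, so it complements the proofs above rather than replacing them.

```python
from fractions import Fraction
import math

def error(n):
    """|a_n - 2| for a_n = (2n^2 - n)/(n^2 - 9), computed exactly."""
    n = Fraction(n)
    return abs((2 * n * n - n) / (n * n - 9) - 2)

def works(N, eps, span=2000):
    """Spot check: is |a_n - 2| < eps for the first `span` integers n > N (with n >= 4)?"""
    start = max(math.floor(N) + 1, 4)   # the indices of the sequence start at n = 4
    return all(error(n) < eps for n in range(start, start + span))

for eps in (Fraction(1), Fraction(1, 10), Fraction(1, 250)):
    N_I = max(Fraction(3), 57 / eps)    # method (I): larger of 3 and 57/eps
    N_II = 9 + 19 / eps                 # method (II)
    N_III = 3 + 19 / eps                # method (III)
    print(eps, works(N_I, eps), works(N_II, eps), works(N_III, eps))
```

All three values of N pass the check, although they differ considerably in size; method (III) gives the smallest of the three.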
By our familiar root rules (Theorem 2.32), we know that $a^{1/n} > 1^{1/n} = 1$ and
therefore $b_n := a^{1/n} - 1 > 0$. By Bernoulli’s inequality (Theorem 2.7), we have
\[ a = \left(a^{1/n}\right)^n = (1 + b_n)^n \ge 1 + n b_n \ge n b_n \implies b_n \le \frac{a}{n}. \]
Hence,
\[ \left| a^{1/n} - 1 \right| = |b_n| \le \frac{a}{n}. \tag{3.19} \]
Thus, we can satisfy (3.18) by making a/n < ε instead. Now,
\[ \frac{a}{n} < \varepsilon \iff \frac{a}{\varepsilon} < n. \tag{3.20} \]
For this reason, let us pick N = a/ε. Let n > N (that is, a/ε < n). Then,
\[ \left| a^{1/n} - 1 \right| \overset{\text{by (3.19)}}{\le} \frac{a}{n} \overset{\text{by (3.20)}}{<} \varepsilon. \]
So, by definition of convergence, $a^{1/n} \to 1$. Now consider the case when 0 < a < 1.
Let ε > 0 be any given positive real number. We need to prove that there is a real
number N such that
\[ n > N \implies \left| a^{1/n} - 1 \right| < \varepsilon. \]
Since 0 < a < 1, we have 1/a > 1, so by our argument for real numbers greater
than one we know that $1/a^{1/n} = (1/a)^{1/n} \to 1$. Thus, there is a real number N
such that
\[ n > N \implies \left| \frac{1}{a^{1/n}} - 1 \right| < \varepsilon. \]
Multiplying both sides of the right-hand inequality by the positive real number
$a^{1/n}$, we get $n > N \implies \left| 1 - a^{1/n} \right| < a^{1/n} \varepsilon$. Since 0 < a < 1, by our root rules,
$a^{1/n} < 1^{1/n} = 1$, so $a^{1/n} \varepsilon < 1 \cdot \varepsilon = \varepsilon$. Hence,
\[ n > N \implies \left| a^{1/n} - 1 \right| < \varepsilon, \]
7. Let {an } be a sequence in Rm and let L ∈ Rm . Form the negation of the definition
that an → L, thus giving a statement that $a_n \not\to L$ (the sequence {an } does not tend to
L). Using your negation, prove that the sequence $\{(-1)^n\}$ diverges, that is, does not
converge to any real number. In the next section we shall find an easy way to verify
that a sequence diverges using the notion of subsequences.
8. (Infinite products — see Chapter 7 for more on this amazing topic!) In this problem
we investigate the infinite product
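\[ \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5} \cdot \frac{5^2}{4 \cdot 6} \cdot \frac{6^2}{5 \cdot 7} \cdot \frac{7^2}{6 \cdot 8} \cdots \tag{3.23} \]
We interpret this “infinite product” as the limit of the “partial products”
\[ a_1 = \frac{2^2}{1 \cdot 3}, \quad a_2 = \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4}, \quad a_3 = \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5}, \ \ldots. \]
In other words, for each n ∈ N, we define $a_n := \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdots \frac{(n+1)^2}{n \cdot (n+2)}$. We prove that the
sequence {an } converges as follows.
(i) Prove that $a_n = \frac{2(n+1)}{n+2}$.
(ii) Now prove that an → 2. We sometimes write the infinite product (3.23) using $\prod$
notation and we express the limit lim an = 2 as
\[ \frac{2^2}{1 \cdot 3} \cdot \frac{3^2}{2 \cdot 4} \cdot \frac{4^2}{3 \cdot 5} \cdot \frac{5^2}{4 \cdot 6} \cdots = 2 \quad \text{or} \quad \prod_{n=1}^{\infty} \frac{(n+1)^2}{n(n+2)} = 2. \]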
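Before proving the formula in Problem 8(i), it can help to watch the partial products numerically. The Python sketch below is an illustration only (the function name `partial_product` is an assumption introduced here); it computes the partial products exactly and compares them with 2(n + 1)/(n + 2).

```python
from fractions import Fraction

def partial_product(n):
    """a_n = product over k = 1..n of (k+1)^2 / (k (k+2)), computed exactly."""
    p = Fraction(1)
    for k in range(1, n + 1):
        p *= Fraction((k + 1) ** 2, k * (k + 2))
    return p

for n in (1, 2, 3, 10, 100):
    a_n = partial_product(n)
    # columns: n, a_n, whether a_n equals 2(n+1)/(n+2), and the gap 2 - a_n
    print(n, a_n, a_n == Fraction(2 * (n + 1), n + 2), float(2 - a_n))
```

The last column shrinks toward 0, which is consistent with the claim an → 2 in part (ii); the computation is of course no substitute for the induction argument the problem asks for.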
3.2.1. Some limit theorems. We begin by proving that limits are unique,
that is, a convergent sequence cannot have two different limits. Before doing so, we
first prove a lemma.
Lemma 3.1 (The ε-principle). If x ∈ R and for any ε > 0, we have x ≤ ε,
then x ≤ 0. In particular, if a ∈ Rm and for any ε > 0, we have |a| < ε, then
a = 0.
Proof. By way of contradiction, assume that x > 0. Then choosing ε =
x/2 > 0, by assumption we have x ≤ ε = x/2. Subtracting x/2 from both sides, we
conclude that x/2 ≤ 0, which implies that x ≤ 0, a contradiction.
The second assertion of the lemma follows by applying the first assertion to
x = |a|. In this case, |a| ≤ 0, which implies that |a| = 0 and therefore a = 0.
Notice that both subsequences, {1/(2n)} and {1/n!} also converge to zero, the
same limit as the original sequence {1/n}. This is a general fact: If a sequence
converges, then any subsequence of it must converge to the same limit.
Theorem 3.9. Every subsequence of a convergent sequence converges to the
same limit as the original sequence.
Proof. Let {an } be a sequence in $\mathbb{R}^m$ converging to $L \in \mathbb{R}^m$. Let $\{a_{\nu_n}\}$ be
any subsequence and let ε > 0. Since an → L there is an N such that for all n > N ,
|an −L| < ε. Since $\nu_1 < \nu_2 < \nu_3 < \cdots$ is an increasing sequence of natural numbers,
one can check (for instance, by induction) that $n \le \nu_n$ for all n. Thus, for n > N ,
we have $\nu_n > N$ and hence for n > N , we have $|a_{\nu_n} - L| < \varepsilon$. This proves that
$a_{\nu_n} \to L$ and completes the proof.
This theorem gives perhaps the easiest way to prove that a sequence does not
converge.
Example 3.16. Consider the sequence
\[ i, \ i^2 = -1, \ i^3 = -i, \ i^4 = 1, \ i^5 = i, \ i^6 = -1, \ \ldots, \ a_n = i^n, \ \ldots. \]
Choosing 1, 5, 9, 13, . . . , $\nu_n = 4n - 3$, . . ., we get the subsequence
i, i, i, i, . . . ,
which converges to i. On the other hand, choosing 2, 6, 10, 14, . . . , $\nu_n = 4n - 2$, . . .,
we get the subsequence
−1, −1, −1, −1, . . . ,
which converges to −1. Since these two subsequences do not converge to the same
limit, the original sequence $\{i^n\}$ cannot converge. Indeed, if $\{i^n\}$ did converge,
then every subsequence of $\{i^n\}$ would have to converge to the same limit as $\{i^n\}$,
but we found subsequences that converge to different limits.
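The two subsequences in Example 3.16 are easy to list explicitly. In the Python sketch below, the function name `a` and the number of sampled terms are illustrative assumptions; the powers of i are read off from n mod 4 so that no rounding occurs.

```python
def a(n):
    """n-th term of the sequence a_n = i^n, computed exactly from n mod 4."""
    return (1, 1j, -1, -1j)[n % 4]

sub1 = [a(4 * n - 3) for n in range(1, 6)]   # indices 1, 5, 9, 13, 17
sub2 = [a(4 * n - 2) for n in range(1, 6)]   # indices 2, 6, 10, 14, 18
print(sub1)   # every entry is i
print(sub2)   # every entry is -1
```

Two constant subsequences with different values are exactly what Theorem 3.9 rules out for a convergent sequence.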
3.2.4. Algebra of limits. Let {an } and {bn } be sequences in Rm . Given any
real numbers c, d, we define the linear combination of these sequences by c and d
as the sequence {c an + d bn }. As a special case, the sum of these sequences is just the
sequence {an + bn } and the difference is just the sequence {an − bn }, and choosing
d = 0, the multiple of {an } by c is just the sequence {c an }. The sequence of
norms of the sequence {an } is the sequence of real numbers {|an |}.
Theorem 3.10. Linear combinations and norms of convergent sequences con-
verge to the corresponding linear combinations and norms of the limits.
Proof. Consider first the linear combination sequence {c an +d bn }. If an → a
and bn → b, we shall prove that c an + d bn → c a + d b. Let ε > 0. We need to
prove that there is a real number N such that
n>N =⇒ |c an + d bn − (c a + d b)| < ε.
By the triangle inequality,
|c an + d bn − (c a + d b)| = |c (an − a) + d (bn − b)|
≤ |c| |an − a| + |d| |bn − b|.
Now, since an → a, there is an N1 such that for all n > N1 , |c| |an − a| < ε/2. (If
|c| = 0, any N1 will work; if |c| > 0, then choose N1 corresponding to the error
ε/(2|c|) in the definition of convergence for an → a.) Similarly, since bn → b, there
is an N2 such that for all n > N2 , |d| |bn − b| < ε/2. Then setting N as the larger
of N1 and N2 , it follows that for n > N ,
\[ |c\,a_n + d\,b_n - (c\,a + d\,b)| \le |c|\,|a_n - a| + |d|\,|b_n - b| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that c an + d bn → c a + d b.
Assuming that an → a, we show that |an | → |a|. Let ε > 0. Then there is an
N such that |an − a| < ε for all n > N . Hence, as a consequence of the triangle
inequality (see Property (iv) in Theorem 2.41), for n > N , we have
| |an | − |a| | ≤ |an − a| < ε,
which shows that |an | → |a|.
Let {an } and {bn } be complex sequences. Given any complex numbers c, d,
the same proof detailed above shows that c an + d bn → c a + d b. However, being
complex sequences we can also multiply these sequences, term by term, defining the
product sequence as the sequence {an bn }. Also, assuming that bn ≠ 0 for each n,
we can divide the sequences, term by term, defining the quotient sequence as the
sequence {an /bn }.
Theorem 3.11. Products of convergent complex sequences converge to the cor-
responding products of the limits. Quotients of convergent complex sequences, where
the denominator sequence is a nonzero sequence converging to a nonzero limit, con-
verge to the corresponding quotient of the limits.
Proof. Let an → a and bn → b. We first prove that an bn → a b. Let ε > 0.
We need to prove that there is a real number N such that for all n > N ,
|an bn − a b| < ε.
By the triangle inequality,
|an bn − a b| = |an (bn − b) + b(an − a)| ≤ |an | |bn − b| + |b| |an − a|.
Since an → a, there is an N1 such that for all n > N1 , |b| |an − a| < ε/2. By
Theorem 3.5 there is a constant C such that |an | ≤ C for all n. Since bn → b, there
is an N2 such that for all n > N2 , C |bn − b| < ε/2. Setting N as the larger of N1
and N2 , it follows that for n > N ,
\[ |a_n b_n - a\,b| \le |a_n|\,|b_n - b| + |b|\,|a_n - a| \le C\,|b_n - b| + |b|\,|a_n - a| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that an bn → a b.
We now prove the second statement. If bn ≠ 0 for each n and b ≠ 0, then
we shall prove that an /bn → a/b. We can write this limit statement as a product:
$a_n \cdot b_n^{-1} \to a \cdot b^{-1}$, so all we have to do is show that $b_n^{-1} \to b^{-1}$. Let ε > 0. We need
to prove that there is a real number N such that for all n > N ,
\[ |b_n^{-1} - b^{-1}| = |b_n b|^{-1}\,|b_n - b| < \varepsilon. \]
To do so, let N1 be chosen in accordance with the error |b|/2 in the definition of
convergence for bn → b. Then for n > N1 ,
\[ |b| = |b - b_n + b_n| \le |b - b_n| + |b_n| < \frac{|b|}{2} + |b_n|. \]
Bringing |b|/2 to the left, for n > N1 we have |b|/2 < |bn |, or
\[ |b_n|^{-1} < 2|b|^{-1}, \quad n > N_1. \]
Now let N2 be chosen in accordance with the error $|b|^2 \varepsilon / 2$ in the definition of
convergence for bn → b. Setting N as the larger of N1 and N2 , it follows that for
n > N,
\[ |b_n^{-1} - b^{-1}| = |b_n b|^{-1}\,|b_n - b| < (2|b|^{-1})\,(|b|^{-1})\,|b_n - b| < (2|b|^{-2}) \cdot \frac{|b|^2 \varepsilon}{2} = \varepsilon. \]
Thus, $b_n^{-1} \to b^{-1}$ and our proof is complete.
These two “algebra of limits” theorems can be used to evaluate limits in an
easy manner.
Example 3.17. For example, since lim 1/n = 0, by our product theorem (Theorem 3.11),
we have
\[ \lim \frac{1}{n^2} = \lim \frac{1}{n} \cdot \lim \frac{1}{n} = 0 \cdot 0 = 0. \]
Example 3.18. In particular, since the constant sequence 1 converges to 1, by
our linear combination theorem (Theorem 3.10), for any number a, we have
\[ \lim \left( 1 + \frac{a}{n^2} \right) = \lim 1 + a \cdot \lim \frac{1}{n^2} = 1 + a \cdot 0 = 1. \]
Example 3.19. Now dividing the top and bottom of $\frac{n^2 + 3}{n^2 + 7}$ by $n^2$ and using
our theorem on quotients and the limit we just found, we obtain
\[ \lim \frac{n^2 + 3}{n^2 + 7} = \frac{\lim \left( 1 + \frac{3}{n^2} \right)}{\lim \left( 1 + \frac{7}{n^2} \right)} = \frac{1}{1} = 1. \]
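As a quick sanity check of Example 3.19, the Python sketch below (the function name `a` and the sample indices are assumptions made for illustration) tabulates the terms exactly. Since 1 − (n² + 3)/(n² + 7) = 4/(n² + 7), the printed gap should shrink roughly like 4/n².

```python
from fractions import Fraction

def a(n):
    """n-th term (n^2 + 3)/(n^2 + 7), as an exact fraction."""
    return Fraction(n * n + 3, n * n + 7)

for n in (1, 10, 100, 1000):
    # columns: n, the term a_n, and its distance from the limit 1
    print(n, a(n), float(1 - a(n)))
```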
3.2.5. Properly divergent sequences. When dealing with sequences of real
numbers, inevitably infinities occur. For instance, we know that the sequence {n2 }
diverges since it is unbounded. However, in elementary calculus, we would usu-
ally write n2 → +∞, which suggests that this sequence converges to the number
“infinity”. We now make this notion precise.
A sequence {an } of real numbers diverges to +∞ if given any real number
M > 0, there is a real number N such that for all n > N , an > M . The sequence
diverges to −∞, if for any real number M < 0, there is a real number N such
that for all n > N , an < M . In the first case we write lim an = +∞ or an → +∞
(sometimes we drop the “+” in front of ∞) and in the second case we write lim an =
−∞ or an → −∞. In either case we say that {an } is properly divergent. It is
important to understand that the symbols +∞ and −∞ are simply notation and
they do not represent real numbers.³ We now present some examples.
Example 3.20. First we show that for any natural number k, $n^k \to +\infty$. To
see this, let M > 0. Then we want to prove there is an N such that for all n > N ,
$n^k > M$. To do so, observe that $n^k > M$ if and only if $n > M^{1/k}$. For this reason,
we choose $N = M^{1/k}$. With this choice of N , for all n > N , we certainly have
$n^k > M$ and our proof is complete. Using a very similar argument, one can show
that $-n^k \to -\infty$.
³ It turns out that ±∞ form part of a number system called the extended real numbers,
which consists of the real numbers together with the symbols +∞ = ∞ and −∞. One can define
addition, multiplication, and order in this system, with the exception that subtraction of infinities
is not allowed. If you take measure theory, you will study this system.
Example 3.21. In Example 3.10, we showed that given any real number a > 1,
the sequence $\{a^n\}$ diverges to +∞.
Because ±∞ are not real numbers, some of the limit theorems we have proved
in this section are not valid when ±∞ are the limits, but many do hold under certain
conditions. For example, if an → +∞ and bn → +∞, then for any nonnegative
real numbers c, d, at least one of which is positive, the reader can check that
c an + d bn → +∞.
If c, d are nonpositive with at least one of them negative, then c an + d bn → −∞. If
c and d have opposite signs, then there is no general result. For example, if an = n,
$b_n = n^2$, and $c_n = n + (-1)^n$, then an , bn , cn → +∞, but
lim(an − bn ) = −∞, lim(bn − cn ) = +∞, and lim(an − cn ) does not exist!
We encourage the reader to think about which limit theorems extend to the case of
infinite limits. For example, here is a squeeze law: If an ≤ bn for all n sufficiently
large and an → +∞, then bn → +∞ as well. Some more limit theorems for infinite
limits are presented in the exercises (see e.g. Problem 10).
Exercises 3.2.
1. Evaluate the following limits by using limits already proven (in the text or exercises)
and invoking the “algebra of limits”.
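\[ \text{(a) } \lim \frac{(-1)^n n}{n^2 + 5}, \quad \text{(b) } \lim \frac{(-1)^n}{n + 10}, \quad \text{(c) } \lim \frac{2^n}{3^n + 10}, \quad \text{(d) } \lim \left( 7 + \frac{3}{n} \right)^2. \]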
2. Why do the following sequences diverge?
\[ \text{(a) } \{(-1)^n\}, \quad \text{(b) } \left\{ a_n = \sum_{k=0}^{n} (-1)^k \right\}, \quad \text{(c) } \left\{ a_n = 2^{n(-1)^n} \right\}, \quad \text{(d) } \{ i^n + 1/n \}. \]
3. Let
\[ a_n = \sum_{k=1}^{n} \frac{1}{\sqrt{n^2 + k}}, \quad b_n = \sum_{k=1}^{n} \frac{1}{\sqrt{n + k}}, \quad c_n = \sum_{k=1}^{n} \frac{1}{n^n + n!}. \]
Find lim an , lim bn , and lim cn .
4. (a) Let a1 ∈ R and for n ≥ 1, define $a_{n+1} = \frac{\operatorname{sgn}(a_n) + 10(-1)^n}{\sqrt{n}}$. Here, sgn(x) := 1 if
x > 0, sgn(x) := 0 if x = 0, and sgn(x) := −1 if x < 0. Find lim an .
(b) Let a1 ∈ [−1, 1] and for n ≥ 1, define $a_{n+1} = \frac{a_n}{|a_n| + 1}$. Find lim an . Suggestion:
Can you prove that −1/n ≤ an ≤ 1/n for all n ∈ N?
5. If {an } and {bn } are complex sequences with {an } bounded and bn → 0, prove that
an bn → 0. Why is Theorem 3.11 not applicable in this situation?
6. Let {an } be a sequence in Rm .
(a) Let b ∈ Rm and suppose that there is a sequence {bn } in R with bn → 0 and
|an − b| ≤ C|bn | for some C > 0 and for all n. Prove that an → b.
(b) If lim |an | = 0, show that an → 0. It is important that zero is the limit in the
hypothesis. Indeed, give an example of a sequence in R for which lim |an | exists
and is nonzero, but lim an does not exist.
7. (The root test for sequences) Let {an } be a sequence of positive real numbers with
$L := \lim a_n^{1/n} < 1$. (That is, $a_n^{1/n}$ converges with limit less than 1.)
(i) Show that there is a real number r with 0 < r < 1 such that $0 < a_n < r^n$ for all
n sufficiently large, that is, there is an N such that $0 < a_n < r^n$ for all n > N .
(ii) Prove that lim an = 0.
(iii) If, however, L > 1, prove that an is not a bounded sequence, and hence diverges.
[Figure: the terms a1 , a2 , a3 , . . . , an marked along the real line.]
There are many stories about Φ; unfortunately, many of them are false, see [146].
Our next important theorem is the monotone subsequence theorem. There are
many nice proofs of this theorem, cf. the articles [158], [20], [223]. Given any set
A of real numbers, a number a is said to be the maximum of A if a ∈ A and
a = sup A, in which case we write a = max A.
Theorem 3.13 (Monotone subsequence theorem). Any sequence of real
numbers has a monotone subsequence.
Proof. Let {an } be a sequence of real numbers. Then the statement “for
every n ∈ N, the maximum of the set {an , an+1 , an+2 , . . .} exists” is either a true
statement, or it’s false, which means “there is an m ∈ N such that the maximum
of the set {am , am+1 , am+2 , . . .} does not exist.”
Case 1: Suppose that we are in the first case: for each n, $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$
has a greatest member. In particular, we can choose $a_{\nu_1}$ such that
\[ a_{\nu_1} = \max\{a_1, a_2, \ldots\}. \]
Now $\{a_{\nu_1 + 1}, a_{\nu_1 + 2}, \ldots\}$ has a greatest member, so we can choose $a_{\nu_2}$ such that
\[ a_{\nu_2} = \max\{a_{\nu_1 + 1}, a_{\nu_1 + 2}, \ldots\}. \]
Since $a_{\nu_2}$ is obtained by taking the maximum of a smaller set of elements, we have
$a_{\nu_2} \le a_{\nu_1}$. Let
\[ a_{\nu_3} = \max\{a_{\nu_2 + 1}, a_{\nu_2 + 2}, \ldots\}. \]
Since $a_{\nu_3}$ is obtained by taking the maximum of a smaller set of elements than
the set defining $a_{\nu_2}$, we have $a_{\nu_3} \le a_{\nu_2}$. Continuing by induction we construct a
monotone (nonincreasing) subsequence.
Case 2: Suppose that the maximum of the set $A = \{a_m, a_{m+1}, a_{m+2}, \ldots\}$ does
not exist, where m ≥ 1. Let $a_{\nu_1} = a_m$. Since A has no maximum, there is a $\nu_2 > m$
such that
\[ a_m < a_{\nu_2}, \]
for, if there were no such $a_{\nu_2}$, then $a_m$ would be a maximum element of A, which we
know is not possible. Since none of the elements $a_m, a_{m+1}, \ldots, a_{\nu_2}$ is a maximum
element of A, there must exist a $\nu_3 > \nu_2$ such that
\[ a_{\nu_2} < a_{\nu_3}, \]
(c) Let a ≥ 2. Show that the sequence defined inductively by $a_{n+1} = 2 - a_n^{-1}$ with
a1 = a is a bounded monotone sequence. Determine the limit. Suggestion: Pick
e.g. a = 2 or a = 10 and calculate a few values of an to conjecture if {an } is, for
general a > 1, nondecreasing or nonincreasing. Also conjecture what a bound may
be from these examples. Now prove your conjecture using induction.
(d) Let a > 0. Show that the sequence defined inductively by an+1 = an /(1 + 2an )
with a1 = a is a bounded monotone sequence. Determine the limit.
(e) Show that the sequence defined inductively by $a_{n+1} = 5 + \sqrt{a_n - 5}$ with a1 = 10
is a bounded monotone sequence. Determine the limit.
(f) Show that the sequence with $a_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}$ is a bounded monotone
sequence. The limit of this sequence is not at all obvious (it turns out to be log 2,
which we’ll study in Section 4.6).
2. In this problem we give two different ways to determine square roots using sequences.
(1) Let a > 0. Let a1 be any positive number and define
\[ a_{n+1} = \frac{1}{2} \left( a_n + \frac{a}{a_n} \right), \quad n \ge 1. \]
(a) Show that an > 0 for all n and $a_{n+1}^2 - a \ge 0$ for all n.
(b) Show that {an } is nonincreasing for n ≥ 2.
(c) Conclude that {an } converges and determine its limit.
(2) Let 0 ≤ a ≤ 1. Show that the sequence defined inductively by $a_{n+1} = a_n + \frac{1}{2}(a - a_n^2)$
with a1 = 0 is nondecreasing and bounded above by $\sqrt{a}$. Determine the limit.
Suggestion: To prove that $a_{n+1}$ is bounded above by $\sqrt{a}$, assume $a_n$ is and write
$a_n = \sqrt{a} - \varepsilon$ where ε ≥ 0.
3. (Cf. [250]) In this problem we analyze the constant e based on the arithmetic-geometric
mean inequality (AGMI); see Problem 7 of Exercises 2.2. Recall that the arithmetic-geometric
mean inequality states that given any n + 1 nonnegative real numbers $x_1, \ldots, x_{n+1}$,
\[ x_1 \cdot x_2 \cdots x_{n+1} \le \left( \frac{x_1 + x_2 + \cdots + x_{n+1}}{n+1} \right)^{n+1}. \]
(i) Put $x_k = (1 + 1/n)$ for k = 1, . . . , n and $x_{n+1} = 1$, in the AGMI to prove that
the sequence $a_n = (1 + \frac{1}{n})^n$ is nondecreasing.
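(ii) If $b_n = (1 + 1/n)^{n+1}$, then show that for n ≥ 2,
\[ \frac{b_n}{b_{n-1}} = \left( 1 - \frac{1}{n^2} \right)^n \left( 1 + \frac{1}{n} \right) = \underbrace{\left( 1 - \frac{1}{n^2} \right) \cdots \left( 1 - \frac{1}{n^2} \right)}_{n \text{ times}} \left( 1 + \frac{1}{n} \right). \]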
Applying the AGMI to the right hand side, show that bn /bn−1 ≤ 1, which shows
that the sequence {bn } is nonincreasing.
(iii) Conclude that both sequences {an } and {bn } converge. Of course, just as in the
text we denote their common limit by e.
4. (Continued roots) For more on this subject, see [4], [101], [149, p. 775], [215], [106].
(1) Fix k ∈ N with k ≥ 2 and fix a > 0. Show that the sequence defined inductively
by $a_{n+1} = \sqrt[k]{a + a_n}$ with $a_1 = \sqrt[k]{a}$ is a bounded monotone sequence. Prove that
the limit L is a root of the equation $x^k - x - a = 0$. Can you see why L can be
thought of as
\[ L = \sqrt[k]{a + \sqrt[k]{a + \sqrt[k]{a + \sqrt[k]{a + \cdots}}}}. \]
(2) Let {an } be a sequence of nonnegative real numbers and for each n, define
\[ \alpha_n := \sqrt{a_1 + \sqrt{a_2 + \sqrt{a_3 + \cdots + \sqrt{a_n}}}}. \]
Prove that {αn } converges if and only if there is a constant M ≥ 0 such that
$\sqrt[2^n]{a_n} \le M$ for all n. Suggestion: To prove the “only if” portion, prove that
$\sqrt[2^n]{a_n} \le \alpha_n$. To prove the “if”, prove that
\[ \alpha_n \le \sqrt{M^2 + \sqrt{M^{2^2} + \cdots + \sqrt{M^{2^n}}}} = M b_n, \]
where $b_n = \sqrt{1 + \sqrt{1 + \cdots + \sqrt{1}}}$ is defined in (3.25); in particular, we
showed that bn ≤ 3. Now setting an = n for all n, show that
\[ \sqrt{1 + \sqrt{2 + \sqrt{3 + \sqrt{4 + \sqrt{5 + \cdots}}}}} \]
exists. This number is called Kasner’s number named after Edward Kasner
(1878–1955) and is approximately 1.75793 . . ..
5. In this problem we prove (3.29).
(a) Prove that for each natural number n, $(n - 1)! \le n^n e^{-n} e \le n!$. Suggestion:
Can you use induction and (3.28)? (You can also prove these inequalities using
integrals as in [128, p. 219], but using (3.28) gives an “elementary” proof that is
free of integration theory.)
(b) Using (a), prove that for every natural number n,
\[ \frac{e^{1/n}}{e} \le \frac{\sqrt[n]{n!}}{n} \le \frac{e^{1/n}\, n^{1/n}}{e}. \]
(c) Now prove (3.29). Using (3.29), prove that
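\[ \lim \left( \frac{(3n)!}{n^{3n}} \right)^{1/n} = \frac{27}{e^3} \quad \text{and} \quad \lim \left( \frac{(3n)!}{n!\, n^{2n}} \right)^{1/n} = \frac{27}{e^2}. \]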
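The inequalities in Problem 5(a) and 5(b) can be explored numerically before attempting the proofs. The Python sketch below is only a floating-point spot check; the function names `bounds_a` and `bounds_b` and the range of n are assumptions chosen for illustration. It starts at n = 2 because for n = 1 the inequalities in (a) hold with equality, which is delicate to test in floating point.

```python
import math

def bounds_a(n):
    """Return ((n-1)!, n^n e^{-n} e, n!) from Problem 5(a)."""
    return math.factorial(n - 1), n ** n * math.exp(1 - n), math.factorial(n)

def bounds_b(n):
    """Return the three quantities of Problem 5(b), left to right."""
    root = math.exp(math.lgamma(n + 1) / n)   # (n!)^(1/n), computed via log-gamma
    scale = math.exp(1 / n) / math.e          # e^(1/n) / e
    return scale, root / n, scale * n ** (1 / n)

for n in (2, 5, 10, 50):
    lo_a, mid_a, hi_a = bounds_a(n)
    lo_b, mid_b, hi_b = bounds_b(n)
    # columns: n, whether (a) holds, whether (b) holds, and the middle term of (b)
    print(n, lo_a <= mid_a <= hi_a, lo_b <= mid_b <= hi_b, round(mid_b, 6))
```

The outer terms in (b) squeeze the middle one ever more tightly as n grows, which is the mechanism part (c) relies on.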
This shows that the sequence {an } is Cauchy. Notice that this sequence, $\left\{ \frac{2n-1}{n-3} \right\}$,
converges (to the number 2).
Example 3.25. Here’s a more sophisticated example of a Cauchy sequence.
Let a1 = 1, a2 = 1/2, and for n > 2, we let an be the arithmetic mean between the
previous two terms:
\[ a_n = \frac{a_{n-2} + a_{n-1}}{2}, \quad n > 2. \]
Thus, a1 = 1, a2 = 1/2, a3 = 3/4, a4 = 5/8, . . . so this sequence is certainly not
monotone. However, we shall prove that {an } is Cauchy. To do so, we first prove
by induction that
\[ a_n - a_{n+1} = \frac{(-1)^{n+1}}{2^n}. \tag{3.30} \]
Since a1 = 1 and a2 = 1/2, this equation holds for n = 1. Assume that the equation
holds for n. Then
\[ a_{n+1} - a_{n+2} = a_{n+1} - \frac{a_{n+1} + a_n}{2} = -\frac{1}{2} \left( a_n - a_{n+1} \right) = -\frac{1}{2} \cdot \frac{(-1)^{n+1}}{2^n} = \frac{(-1)^{n+2}}{2^{n+1}}, \]
which proves the induction step. With (3.30) at hand, we can now show that the
sequence {an } is Cauchy. Let k, n be any natural numbers. By symmetry we may
assume that k ≤ n (otherwise just switch k and n in what follows), let us say
n = k + j where j ≥ 0. Then according to (3.30) and the sum of a geometric
progression (2.3), we can write
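\[ \begin{aligned}
a_k - a_n &= a_k - a_{k+j} \\
&= (a_k - a_{k+1}) + (a_{k+1} - a_{k+2}) + (a_{k+2} - a_{k+3}) + \cdots + (a_{k+j-1} - a_{k+j}) \\
&= \frac{(-1)^{k+1}}{2^k} + \frac{(-1)^{k+2}}{2^{k+1}} + \frac{(-1)^{k+3}}{2^{k+2}} + \cdots + \frac{(-1)^{k+j}}{2^{k+j-1}} \\
&= \frac{(-1)^{k+1}}{2^k} \left[ 1 + \frac{-1}{2} + \left( \frac{-1}{2} \right)^2 + \cdots + \left( \frac{-1}{2} \right)^{j-1} \right] \\
&= \frac{(-1)^{k+1}}{2^k} \cdot \frac{1 - (-1/2)^j}{1 - (-1/2)} = \frac{(-1)^{k+1}}{2^k} \cdot \frac{2}{3} \cdot \left( 1 - (-1/2)^j \right).
\end{aligned} \tag{3.31} \]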
Since $\frac{2}{3} \cdot \left| 1 - (-1/2)^j \right| \le \frac{2}{3} \cdot (1 + 1) \le 2$, we conclude that
\[ |a_k - a_n| \le \frac{1}{2^{k-1}}, \quad \text{for all } k, n \text{ with } k \le n. \]
Now let ε > 0. Since 1/2 < 1, we know that $1/2^{k-1} = 2 \cdot (1/2)^k \to 0$ as k → ∞ (see
Example 3.5 in Subsection 3.1.3). Therefore there is an N such that for all k > N ,
$1/2^{k-1} < \varepsilon$. Let k, n > N and again by symmetry, we may assume that k ≤ n. In
this case, we have
\[ |a_k - a_n| \le \frac{1}{2^{k-1}} < \varepsilon. \]
This proves that the sequence {an } is Cauchy. Moreover, we claim that this se-
quence also converges. Indeed, by (3.31) with k = 1, so that n = 1 + j or j = n − 1,
we see that
\[ 1 - a_n = \frac{1}{2} \cdot \frac{2}{3} \cdot \left( 1 - (-1/2)^{n-1} \right) \implies a_n = 1 - \frac{1}{3} \cdot \left( 1 - (-1/2)^{n-1} \right). \]
2 3 3
Since |(−1/2)| = 1/2 < 1, we know that $(-1/2)^{n-1} \to 0$. Taking n → ∞, we
conclude that
\[ \lim a_n = 1 - \frac{1}{3} (1 - 0) = \frac{2}{3}. \]
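The formulas in Example 3.25 lend themselves to an exact check. The Python sketch below is an illustration (the helper name `averaged_terms` and the number of terms are assumptions made here): it builds the sequence from its recursion with rational arithmetic and compares it with (3.30), with the closed form derived from (3.31), and with the limit 2/3.

```python
from fractions import Fraction

def averaged_terms(count):
    """Return [None, a_1, ..., a_count] with a_1 = 1, a_2 = 1/2, a_n = (a_{n-2} + a_{n-1})/2."""
    a = [None, Fraction(1), Fraction(1, 2)]   # a[0] is unused so that a[n] is a_n
    for n in range(3, count + 1):
        a.append((a[n - 2] + a[n - 1]) / 2)
    return a

a = averaged_terms(30)
for n in (1, 2, 3, 4, 10, 29):
    step_ok = a[n] - a[n + 1] == Fraction((-1) ** (n + 1), 2 ** n)              # (3.30)
    closed_ok = a[n] == 1 - Fraction(1, 3) * (1 - Fraction(-1, 2) ** (n - 1))   # closed form
    print(n, a[n], step_ok, closed_ok, float(abs(a[n] - Fraction(2, 3))))
```

The terms alternate above and below 2/3, so the sequence is not monotone, yet the distance to 2/3 halves at each step.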
We have thus far given examples of two Cauchy sequences and we have observed
that both sequences converge. In Theorem 3.17 we shall prove that any Cauchy
sequence converges, and conversely, every convergent sequence is also Cauchy.
3.4.2. Cauchy criterion. The following two proofs use the “ε/2-trick.”
Lemma 3.16. If a subsequence of a Cauchy sequence in Rm converges, then the
whole sequence converges too, and with the same limit as the subsequence.
Proof. Let {an } be a Cauchy sequence and assume that $a_{\nu_n} \to L$ for some
subsequence of {an }. We shall prove that an → L. Let ε > 0. Since {an } is Cauchy,
there is an N such that
\[ k, n > N \implies |a_k - a_n| < \frac{\varepsilon}{2}. \]
Since $a_{\nu_n} \to L$ there is a natural number $k \in \{\nu_1, \nu_2, \nu_3, \nu_4, \ldots\}$ with k > N such
that
\[ |a_k - L| < \frac{\varepsilon}{2}. \]
Now let n > N be arbitrary. Then using the triangle inequality and the two
inequalities we just wrote down, we see that
\[ |a_n - L| = |a_n - a_k + a_k - L| \le |a_n - a_k| + |a_k - L| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that an → L and our proof is complete.
The smallest the denominator can possibly be is when $a_{n-1}$ and $a_n$ are the largest
they can be, which according to (3.35), is at most 3. It follows that for any n ≥ 2,
\[ \frac{2}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \le \frac{2}{\sqrt{9 - 2 \cdot 3} + \sqrt{9 - 2 \cdot 3}} = \frac{2}{2\sqrt{3}} = r, \]
where $r := \frac{1}{\sqrt{3}} < 1$. Thus, for any n ≥ 2,
\[ |a_n - a_{n+1}| = \frac{2}{\sqrt{9 - 2a_{n-1}} + \sqrt{9 - 2a_n}} \, |a_{n-1} - a_n| \le r \, |a_{n-1} - a_n|. \]
This proves that the sequence {an } is contractive and therefore an → L for some
real number L. Because an ≥ 0 for all n and limits preserve inequalities we must
have L ≥ 0 too. Moreover, by (3.34), we have
2. Negate the statement that a sequence {an } is Cauchy. With your negation, prove that
the following sequences are not Cauchy (and hence cannot converge).
\[ \text{(a) } \{(-1)^n\}, \quad \text{(b) } \left\{ a_n = \sum_{k=0}^{n} (-1)^n \right\}, \quad \text{(c) } \{ i^n + 1/n \}. \]
3. Prove that the following sequences are contractive, then determine their limits.
(a) Let a1 = 0 and an+1 = (2an − 3)/4.
(b) Let a1 = 1 and $a_{n+1} = \frac{1}{5} a_n^2 - 1$.
(c) Let a1 = 0 and $a_{n+1} = \frac{1}{8} a_n^3 + \frac{1}{4} a_n + \frac{1}{2}$.
(d) Let a1 = 1 and $a_{n+1} = \frac{1}{1 + 3a_n}$. Suggestion: Prove that $\frac{1}{4} \le a_n \le 1$ for all n.
(e) (Cf. Example 3.27.) Let a1 = 1 and $a_{n+1} = \sqrt{5 - 2a_n}$.
(f) (Cf. Example 3.25.) Let a1 = 0, a2 = 1, and $a_n = \frac{2}{3} a_{n-2} + \frac{1}{3} a_{n-1}$ for n > 2.
(g) Let a1 = 1 and $a_{n+1} = \frac{a_n}{2} + \frac{1}{a_n}$.
4. We can use Cauchy sequences to obtain roots of polynomials. E.g. using a graphing
calculator, we see that $x^3 - 4x + 2$ has exactly one root, call it a, in the interval [0, 1].
(i) Show that the root a satisfies $a = \frac{1}{4}(a^3 + 2)$.
(ii) Define a sequence {an } recursively by $a_{n+1} = \frac{1}{4}(a_n^3 + 2)$ with a1 = 0.
(iii) Prove that {an } is contractive and converges to a.
5. Here are some Cauchy limit theorems. Let {an } be a sequence in Rm .
(a) Prove that {an } is Cauchy if and only if for every ε > 0 there is a number N such
that for all n > N and k ≥ 1, |an+k − an | < ε.
(b) Given any sequence {bn } of natural numbers, we call the sequence {dn }, where
$d_n = a_{n + b_n} - a_n$, a difference sequence. Prove that {an } is Cauchy if and
only if every difference sequence converges to zero (that is, is a null sequence).
Suggestion: To prove the “if” part, instead prove the contrapositive: If {an } is not
Cauchy, then there is a difference sequence that does not converge to zero.
[Figure: the interval from 0 to 1 on a number line, with pieces labeled 1/2, 1/2^2, 1/2^3, 1/2^4.]
6. (Continued fractions — see Chapter 8 for more on this amazing topic!) In this
problem we investigate the continued fraction
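\[ \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}}. \]
We interpret this infinite fraction as the limit of the fractions
\[ a_1 = \frac{1}{2}, \quad a_2 = \cfrac{1}{2 + \cfrac{1}{2}}, \quad a_3 = \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2}}}, \ \ldots. \]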
\[ \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}} \tag{3.36} \]
in the sense that the right-hand continued fraction converges with value Φ. More
precisely, prove that $\Phi - 1 = \lim \varphi_n$ where $\{\varphi_n\}$ is the sequence defined by $\varphi_1 := 1$ and
$\varphi_{n+1} = 1/(1 + \varphi_n)$ for n ∈ N.
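The recursion for the continued fraction (3.36) is easy to iterate. The Python sketch below assumes that Φ denotes the golden ratio (1 + √5)/2, a value not stated in this passage but consistent with the relation Φ = 1 + 1/Φ suggested by (3.36); the helper name `phi_terms` and the number of iterations are likewise illustrative choices.

```python
import math

def phi_terms(count):
    """First `count` terms of phi_1 = 1, phi_{n+1} = 1/(1 + phi_n)."""
    terms = [1.0]
    while len(terms) < count:
        terms.append(1.0 / (1.0 + terms[-1]))
    return terms

golden = (1 + math.sqrt(5)) / 2   # assumed value of Phi (the golden ratio)
terms = phi_terms(40)
print(terms[:5])                   # the first few terms oscillate around the limit
print(terms[-1], golden - 1)       # the 40th term next to Phi - 1
```

The early terms alternate above and below the limit, so this sequence is not monotone; that is one reason the contractive-sequence machinery of this section is a natural tool for it.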
⁴ If you disregard the very simplest cases, there is in all of mathematics not a single infinite
series whose sum has been rigorously determined. In other words, the most important parts
of mathematics stand without a foundation. Niels H. Abel (1802–1829) [210]. (Of course,
nowadays series are rigorously determined — this is the point of this section!)
3.5.1. Basic results on infinite series. Given a sequence $\{a_n\}_{n=1}^{\infty}$ of complex
numbers, we want to attach a meaning to $\sum_{n=1}^{\infty} a_n$, mostly written $\sum a_n$ for
simplicity. To do so, we define the n-th partial sum, sn , of the series to be
\[ s_n := \sum_{k=1}^{n} a_k = a_1 + a_2 + \cdots + a_n. \]
Of course, here there are only finitely many numbers being summed, so the right-hand
side has a clear definition. If the sequence {sn } of partial sums converges,
then we say that the infinite series $\sum a_n$ converges and we define
\[ \sum a_n = \sum_{n=1}^{\infty} a_n = a_1 + a_2 + a_3 + \cdots := \lim s_n. \]
If the sequence of partial sums does not converge to a complex number, then we
say that the series diverges. Since R ⊆ C, restricting to real sequences {an },
we already have built in to the above definition the convergence of a series of real
numbers. Just as a sequence can be indexed so its starting value is $a_0$ or $a_{-7}$ or
$a_{1234}$, etc., we can also consider series starting with indices other than 1:
\[ \sum_{n=0}^{\infty} a_n, \quad \sum_{n=-7}^{\infty} a_n, \quad \sum_{n=1234}^{\infty} a_n, \quad \text{etc.} \]
For convenience, in all our proofs we shall most of the time work with series starting
at n = 1, although all the results we shall discuss work for series starting with any
index.
Example 3.28. Consider the series
\[ \sum_{n=0}^{\infty} (-1)^n = 1 - 1 + 1 - 1 + - \cdots. \]
The converse of the n-th term test is false; that is, even though lim an = 0, it
may not follow that $\sum a_n$ exists.⁵ For example, the . . .
Example 3.30. (Harmonic series diverges, Proof I) Consider
\[ \sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots. \]
This series is called the harmonic series; see [125] for “what’s harmonic about
the harmonic series”. To see that the harmonic series does not converge, observe
that
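\[ \begin{aligned}
s_{2n} &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots + \frac{1}{2n - 1} + \frac{1}{2n} \\
&> 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{6} + \frac{1}{6} + \cdots + \frac{1}{2n} + \frac{1}{2n} \\
&= 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \frac{1}{2} + s_n.
\end{aligned} \]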
Thus, $s_{2n} > 1/2 + s_n$. Now if the harmonic series did converge, say to some real
number s, that is, $s_n \to s$, then we would also have $s_{2n} \to s$. However, according to
the inequality above, this would imply that s ≥ 1/2 + s, which is an impossibility.
Therefore, the harmonic series does not converge. See Problem 5 for more proofs.
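The key inequality in this proof can be checked exactly for small n. The Python sketch below is an illustration (the function name `s` and the tested ranges are assumptions made here); it verifies the inequality for n ≥ 2 with rational arithmetic (for n = 1 the two sides are equal) and then shows how repeated doubling forces the partial sums past any bound.

```python
from fractions import Fraction

def s(n):
    """n-th partial sum of the harmonic series, as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in (2, 4, 8, 16, 32):
    # columns: n, s_n, and the gain s_{2n} - s_n, which stays above 1/2
    print(n, float(s(n)), float(s(2 * n) - s(n)))

# Doubling k times adds at least 1/2 each time, so s_{2^k} >= 1 + k/2.
print([s(2 ** k) >= 1 + Fraction(k, 2) for k in range(8)])
```

Since 1 + k/2 is unbounded in k, the partial sums are unbounded, which is the observation the text invites the reader to prove.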
Using the inequality $s_{2n} > 1/2 + s_n$, one can show (and we encourage you to
do it!) that the partial sums of the harmonic series are unbounded. Then one can
deduce that the harmonic series must diverge by the following very useful test.
Theorem 3.20 (Nonnegative series test). A series $\sum a_n$ of nonnegative
real numbers converges if and only if the sequence {sn } of partial sums is bounded,
in which case, sn ≤ s for all n where $s = \sum a_n := \lim s_n$.
Proof. Since an ≥ 0 for all n, we have
sn = a1 + a2 + · · · + an ≤ a1 + a2 + · · · + an + an+1 = sn+1 ,
so the sequence of partial sums {sn } is nondecreasing: s1 ≤ s2 ≤ · · · ≤ sn ≤ · · · .
By the monotone criterion for sequences, the sequence of partial sums converges if
and only if it is bounded. To see that $s_n \le s := \sum_{m=1}^{\infty} a_m$ for all n, fix n ∈ N and
note that sn ≤ sk for all k ≥ n because the partial sums are nondecreasing. Taking
k → ∞ and using that limits preserve inequalities gives sn ≤ lim sk = s.
⁵ The sum of an infinite series whose final term vanishes perhaps is infinite, perhaps finite.
Jacob Bernoulli (1654–1705) Ars conjectandi.
To analyze this series, we use the “method of partial fractions” and note that
$\frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}$. Thus, the adjacent terms in sn cancel (except for the first and
the last):
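\[ \begin{aligned}
s_n &= \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n + 1)} \\
&= \left( \frac{1}{1} - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \cdots + \left( \frac{1}{n} - \frac{1}{n + 1} \right) = 1 - \frac{1}{n + 1} \le 1.
\end{aligned} \tag{3.37} \]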
Hence, the sequence {sn } is bounded above by 1, so our series converges. Moreover,
we also see that sn = 1 − 1/(n + 1) → 1, and therefore
\[
\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1.
\]
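As a small illustration of the telescoping computation, the sketch below adds the terms 1/(k(k+1)) exactly and compares the result with the closed form 1 − 1/(n+1). The function name `telescoping_partial_sum` is our own choice.

```python
from fractions import Fraction


def telescoping_partial_sum(n):
    """Exact partial sum 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        # The two printed values agree: the sum equals 1 - 1/(n+1).
        print(n, telescoping_partial_sum(n), 1 - Fraction(1, n + 1))
```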
Example 3.32. Now if the sum of the reciprocals of the natural numbers
diverges, what about the sum of the reciprocals of the squares (called the 2-series):
\[
\sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots .
\]
To investigate the convergence of this 2-series, using (3.37) we note that
\[
\begin{aligned}
s_n &= 1 + \frac{1}{2\cdot 2} + \frac{1}{3\cdot 3} + \frac{1}{4\cdot 4} + \cdots + \frac{1}{n\cdot n} \\
&\le 1 + \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{(n-1)\cdot n} \le 1 + 1 = 2.
\end{aligned}
\]
Since the partial sums of the 2-series are bounded, the 2-series converges. Now
what is the value of this series? This question was answered by Leonhard Euler
(1707–1783) in 1734. We shall rigorously prove, in 9 different ways in this book,
that the value of the 2-series is $\pi^2/6$, starting in Section 6.11! (Now what does $\pi$
have to do with reciprocals of squares of natural numbers???)
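The comparison used for the 2-series can be watched numerically. In the sketch below (function names are ours), `comparison_bound(n)` is the telescoping sum 1 + 1/(1·2) + ... + 1/((n−1)n), which equals 2 − 1/n, and the floating-point comparison with π²/6 is only a plausibility check of the value quoted above, not a proof.

```python
from fractions import Fraction
import math


def two_series_partial_sum(n):
    """Exact partial sum 1 + 1/2^2 + ... + 1/n^2."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))


def comparison_bound(n):
    """The dominating sum 1 + 1/(1*2) + ... + 1/((n-1)n), which telescopes to 2 - 1/n."""
    return 1 + sum(Fraction(1, (k - 1) * k) for k in range(2, n + 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        s = two_series_partial_sum(n)
        print(n, float(s), float(comparison_bound(n)), math.pi ** 2 / 6 - float(s))
```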
3.5.2. Some properties of series. It is important to understand that the
convergence or divergence of a series only depends on the “tails” of the series.
Theorem 3.21 (Tails theorem for series). A series $\sum a_n$ converges if and
only if there is an index $m$ such that $\sum_{n=m}^{\infty} a_n$ converges.
Proof. Let $s_n$ denote the $n$-th partial sum of $\sum a_n$ and $t_n$ that of any “$m$-tail”
$\sum_{n=m}^{\infty} a_n$. Then
\[
t_n = \sum_{k=m}^{n} a_k = s_n - a,
\]
where $a$ is the number $a = \sum_{k=1}^{m-1} a_k$. It follows that $\{s_n\}$ converges if and only if
$\{t_n\}$ converges and our theorem is proved.
An important type of series that we’ll run into often is the geometric series. Given a
complex number $a$, the series $\sum a^n$ is called a geometric series. The following
theorem characterizes those geometric series that converge.
Theorem 3.22 (Geometric series theorem). For any nonzero complex number
$a$ and $k \in \mathbb{Z}$, the geometric series $\sum_{n=k}^{\infty} a^n$ converges if and only if $|a| < 1$, in
which case
\[
\sum_{n=k}^{\infty} a^n = a^k + a^{k+1} + a^{k+2} + a^{k+3} + \cdots = \frac{a^k}{1-a}.
\]
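To get a feel for the formula, one can compare a long partial sum with a^k/(1 − a) for a complex ratio and a negative starting index. This is a floating-point sketch under our own naming (`geometric_partial_sum`, `geometric_limit`); the particular values of a and k are arbitrary choices with |a| < 1.

```python
def geometric_partial_sum(a, k, n_terms):
    """Sum a^k + a^(k+1) + ... over n_terms consecutive powers of a nonzero complex a."""
    return sum(a ** n for n in range(k, k + n_terms))


def geometric_limit(a, k):
    """Value a^k / (1 - a) given by the geometric series theorem when |a| < 1."""
    return a ** k / (1 - a)


if __name__ == "__main__":
    a, k = 0.5 + 0.5j, -2
    print(geometric_partial_sum(a, k, 200))
    print(geometric_limit(a, k))
```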
3.5. BABY INFINITE SERIES 127
In Section 6.6, we’ll see that the commutative law doesn’t hold! It is worth
remembering that the associative law does not work in reverse.
3.5.3. Telescoping series. As seen in Example 3.31, the value of the series
$\sum \frac{1}{n(n+1)}$ was very easy to find because in writing out its partial sums, we
saw that the sum “telescoped” to give a simple expression. In general, it is very
difficult to find the value of a convergent series, but for telescoping series, the sums
are quite straightforward to find.
Proof. Observe that adjacent terms of the following partial sum cancel:
sn = (x0 − x1 ) + (x1 − x2 ) + · · · + (xn−1 − xn ) + (xn − xn+1 ) = x0 − xn+1 .
If $x := \lim x_n$ exists, we have $x = \lim x_{n+1}$ as well, and therefore $\sum a_n := \lim s_n$
exists with sum $x_0 - x$. Conversely, if $s = \lim s_n$ exists, then $s = \lim s_{n-1}$ as well,
and since $x_n = x_0 - s_{n-1}$, it follows that $\lim x_n$ exists.
Example 3.35. Let a be any nonzero complex number not equal to a negative
integer. Then we claim that
\[
\sum_{n=0}^{\infty} \frac{1}{(n+a)(n+a+1)} = \frac{1}{a}.
\]
Indeed, in this case, we can use the “method of partial fractions” to write
\[
a_n = \frac{1}{(n+a)(n+a+1)} = \frac{1}{n+a} - \frac{1}{n+a+1} = x_n - x_{n+1}, \quad \text{where } x_n = \frac{1}{n+a}.
\]
Since $\lim x_n = 0$, applying the telescoping series theorem gives
\[
\sum_{n=0}^{\infty} \frac{1}{(n+a)(n+a+1)} = x_0 = \frac{1}{0+a} = \frac{1}{a},
\]
just as we stated.
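Example 3.35 can be illustrated for a complex value of a. By the telescoping identity the sum of the first N terms equals x_0 − x_N = 1/a − 1/(N + a); the sketch below (floating point, with our own function name and an arbitrary sample value of a) compares the two.

```python
def partial_sum(a, n_terms):
    """Sum of 1/((n+a)(n+a+1)) for n = 0, ..., n_terms - 1."""
    return sum(1 / ((n + a) * (n + a + 1)) for n in range(n_terms))


if __name__ == "__main__":
    a = 0.5 + 2j  # any nonzero complex number that is not a negative integer
    for n_terms in (10, 100, 1000):
        # Telescoping predicts partial_sum = 1/a - 1/(n_terms + a), which tends to 1/a.
        print(n_terms, partial_sum(a, n_terms), 1 / a - 1 / (n_terms + a))
    print("1/a =", 1 / a)
```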
Exercises 3.5.
1. Determine the convergence of the following series. If the series converges, find the sum.
∞ n ∞ ∞
X 1 X i n X 1
(a) 1+ , (b) , (c) 1/n
.
n=1
n n=1
2 n=1
n
(iii) Now prove that if $|z| < 1$, then $1/(1-z)^2 = \sum_{n=1}^{\infty} n z^{n-1}$. Solve Problem 3 using
(3.39). Suggestion: Problem 4 in Exercises 3.1 might be helpful.
(iv) Can you prove that $\frac{2}{(1-z)^3} = \sum_{n=2}^{\infty} n(n-1) z^{n-2}$ for $|z| < 1$ using a similar
technique? (Do this problem if you are feeling extra confident!)
7. In Problem 9 of Exercises 2.2 we studied the Fibonacci sequence, F0 = 0, F1 = 1,
and Fn = Fn−1 + Fn−2 for all n ≥ 2. Using the telescoping theorem prove that
\[
\sum_{n=2}^{\infty} \frac{1}{F_{n-1} F_{n+1}} = 1, \qquad \sum_{n=2}^{\infty} \frac{F_n}{F_{n-1} F_{n+1}} = 2.
\]
3.6.1. Various tests for convergence. The first test is the series version of
Cauchy’s criterion for sequences.
Theorem 3.26 (Cauchy’s criterion for series). The series $\sum a_n$ converges
if and only if given any $\varepsilon > 0$ there is an $N$ such that for all $n > m > N$, we have
\[
\Big| \sum_{k=m+1}^{n} a_k \Big| = |a_{m+1} + a_{m+2} + \cdots + a_n| < \varepsilon,
\]
Here is another test, which is the most useful of the ones we’ve looked at.
Theorem 3.27 (Comparison test). Let {an } and {bn } be real sequences and
suppose that for n sufficiently large, say for all n ≥ k for some k ∈ N, we have
0 ≤ an ≤ bn .
If $\sum b_n$ converges, then $\sum a_n$ converges. Equivalently, if $\sum a_n$ diverges, then $\sum b_n$
diverges. In the case of convergence, $\sum_{n=k}^{\infty} a_n \le \sum_{n=k}^{\infty} b_n$.
Proof. By the tails theorem for series (Theorem 3.21), the series $\sum a_n$ and
$\sum b_n$ converge if and only if the series $\sum_{n=k}^{\infty} a_n$ and $\sum_{n=k}^{\infty} b_n$ converge. By working
with these series instead of the original ones, we may assume that $0 \le a_n \le b_n$
holds for every $n$. In this case, if $s_n$ denotes the $n$-th partial sum for $\sum a_n$ and $t_n$ the
$n$-th partial sum for $\sum b_n$, then $0 \le a_n \le b_n$ (for every $n$) implies that for every $n$,
we have
\[
0 \le s_n \le t_n .
\]
Assume that $\sum b_n$ converges. Then by the nonnegative series test (Theorem 3.20),
$t_n \le t := \sum b_n$. Hence, $0 \le s_n \le t$ for all $n$; that is, the partial sums of $\sum a_n$ are
bounded. Again by the nonnegative series test, it follows that $\sum a_n$ converges and
taking $n \to \infty$ in $0 \le s_n \le t$ shows that $0 \le \sum a_n \le \sum b_n$.
3.6. ABSOLUTE CONVERGENCE AND A POTPOURRI OF CONVERGENCE TESTS 133
converges for $p \ge 2$ and diverges for $p \le 1$. To see this, note that if $p < 1$, then
\[
\frac{1}{n} \le \frac{1}{n^p},
\]
because $1 - p > 0$, so $1 = 1^{1-p} \le n^{1-p}$ by the power rules theorem (Theorem
2.33) and $1 \le n^{1-p}$ is equivalent to the above inequality. Since the harmonic series
diverges, by the comparison test, so does the p-series for $p < 1$. If $p > 2$, then by a
similar argument, we have
\[
\frac{1}{n^p} \le \frac{1}{n^2}.
\]
In the last section, we showed that the 2-series $\sum 1/n^2$ converges, so by the
comparison test, the p-series for $p > 2$ converges. Now what about for $1 < p < 2$? To
answer this question we shall appeal to Cauchy’s condensation test below.
3.6.2. Cauchy condensation test. The following test is usually not found
in elementary calculus textbooks, but it’s very useful.
Proof. The proof of this theorem is just like the proof of the p-test! Let the
partial sums of $\sum a_n$ be denoted by $s_n$ and those of $\sum_{n=0}^{\infty} 2^n a_{2^n}$ by $t_n$. Then by
the nonnegative series test (Theorem 3.20), $\sum a_n$ converges if and only if $\{s_n\}$ is
bounded and $\sum 2^n a_{2^n}$ converges if and only if $\{t_n\}$ is bounded. Therefore, we just
have to prove that $\{s_n\}$ is bounded if and only if $\{t_n\}$ is bounded.
Consider the “if” part: Assume that $\{t_n\}$ is bounded; we shall prove that $\{s_n\}$
is bounded. To prove this, we note that $s_n \le s_{2^n - 1}$ and we can write (cf. the above
computation with the p-series)
where in the $k$-th parenthesis, we group those terms of the series with index running
from $2^k$ to $2^{k+1} - 1$. Since the $a_n$’s are nonincreasing (that is, $a_n \ge a_{n+1}$ for all $n$),
replacing each number in a parenthesis by the first term in the parenthesis we can
only increase the value of the sum, so
Now the “only if” part: Assume that {sn } is bounded; we shall prove that {tn }
is bounded. To prove this, we try to estimate $t_n$ using $s_{2^n}$. Observe that
\[
\begin{aligned}
s_{2^n} &= a_1 + a_2 + (a_3 + a_4) + (a_5 + a_6 + a_7 + a_8) + \cdots + (a_{2^{n-1}+1} + \cdots + a_{2^n}) \\
&\ge a_1 + a_2 + (a_4 + a_4) + (a_8 + a_8 + a_8 + a_8) + \cdots + (a_{2^n} + \cdots + a_{2^n}) \\
&= a_1 + a_2 + 2a_4 + 4a_8 + \cdots + 2^{n-1} a_{2^n} \\
&= \frac{1}{2} a_1 + \frac{1}{2}\big(a_1 + 2a_2 + 4a_4 + 8a_8 + \cdots + 2^n a_{2^n}\big) \\
&= \frac{1}{2} a_1 + \frac{1}{2} t_n .
\end{aligned}
\]
It follows that $t_n \le 2 s_{2^n}$ for all $n$. In particular, since $\{s_n\}$ is bounded, $\{t_n\}$ is
bounded as well. This completes our proof.
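The key estimate of the “only if” part, s_{2^n} ≥ a_1/2 + t_n/2 with t_n = a_1 + 2a_2 + 4a_4 + ... + 2^n a_{2^n}, can be tested on a concrete nonincreasing sequence. The sketch below uses a_n = 1/n² as an assumed example; the helper names `partial_sum` and `condensed_sum` are ours.

```python
from fractions import Fraction


def partial_sum(a, n):
    """s_n = a(1) + a(2) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))


def condensed_sum(a, n):
    """t_n = a(1) + 2 a(2) + 4 a(4) + ... + 2^n a(2^n)."""
    return sum(2 ** k * a(2 ** k) for k in range(n + 1))


def inverse_square(n):
    """Sample nonnegative, nonincreasing sequence a_n = 1/n^2."""
    return Fraction(1, n * n)


if __name__ == "__main__":
    for n in range(0, 8):
        s = partial_sum(inverse_square, 2 ** n)
        t = condensed_sum(inverse_square, n)
        lower = Fraction(1, 2) * inverse_square(1) + Fraction(1, 2) * t
        print(n, float(s), float(lower), s >= lower)
```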
Once we develop the theory of real exponents, the same p-test holds for $p$ real. By
the way, $\sum_{n=1}^{\infty} 1/n^p$ is also denoted by $\zeta(p)$, the zeta function at $p$:
\[
\zeta(p) := \sum_{n=1}^{\infty} \frac{1}{n^p}.
\]
At first glance, it may seem difficult to determine the convergence of this series,
but Cauchy’s condensation test gives the answer quickly:
\[
\sum_{n=1}^{\infty} 2^n \cdot \frac{1}{2^n \log 2^n} = \frac{1}{\log 2} \sum_{n=1}^{\infty} \frac{1}{n},
\]
which diverges. (You should check that $1/(n \log n)$ is nonincreasing.) Therefore by
Cauchy’s condensation test, $\sum_{n=2}^{\infty} \frac{1}{n \log n}$ also diverges.$^{6}$
3.6.3. Absolute convergence. A series $\sum a_n$ is said to be absolutely
convergent if $\sum |a_n|$ converges. The following theorem implies that any absolutely
convergent series is convergent in the usual sense.
Theorem 3.29 (Absolute convergence). Let $\sum a_n$ be an infinite series.
(1) If $\sum |a_n|$ converges, then $\sum a_n$ also converges, and
\[
\Big| \sum a_n \Big| \le \sum |a_n| \quad \text{(triangle inequality for series).} \tag{3.40}
\]
6This series is usually handled in elementary calculus courses using the technologically ad-
vanced “integral test,” but Cauchy’s condensation test gives one way to handle such series without
knowing any calculus!
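Example 3.43. Although the harmonic series $\sum 1/n$ diverges, the alternating
harmonic series
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots
\]
converges. To see this, we use the Cauchy criterion. Given $n > m$, observe that
\[
\begin{aligned}
\Big| \sum_{k=m+1}^{n} \frac{(-1)^{k-1}}{k} \Big|
&= \Big| \frac{(-1)^m}{m+1} + \frac{(-1)^{m+1}}{m+2} + \frac{(-1)^{m+2}}{m+3} + \cdots + \frac{(-1)^{n-1}}{n} \Big| \\
&= \Big| \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} + \cdots + \frac{(-1)^{n-m-1}}{n} \Big| .
\end{aligned}
\tag{3.41}
\]
Suppose that $n - m$ is even. Then the sum in the absolute values in (3.41) equals
\[
\begin{aligned}
&\frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \frac{1}{m+4} + \cdots + \frac{1}{n-1} - \frac{1}{n} \\
&\quad = \Big(\frac{1}{m+1} - \frac{1}{m+2}\Big) + \Big(\frac{1}{m+3} - \frac{1}{m+4}\Big) + \cdots + \Big(\frac{1}{n-1} - \frac{1}{n}\Big) > 0,
\end{aligned}
\]
since all the terms in parentheses are positive. Thus, if $n - m$ is even, then we can
drop the absolute values in (3.41) to get
\[
\begin{aligned}
\Big| \sum_{k=m+1}^{n} \frac{(-1)^{k-1}}{k} \Big|
&= \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \cdots + \frac{1}{n-1} - \frac{1}{n} \\
&= \frac{1}{m+1} - \Big(\frac{1}{m+2} - \frac{1}{m+3}\Big) - \cdots - \Big(\frac{1}{n-2} - \frac{1}{n-1}\Big) - \frac{1}{n} < \frac{1}{m+1},
\end{aligned}
\]
since all the terms in parentheses are positive. Now suppose that $n - m$ is odd.
Then the sum in the absolute values in (3.41) equals
\[
\begin{aligned}
&\frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \frac{1}{m+4} + \cdots - \frac{1}{n-1} + \frac{1}{n} \\
&\quad = \Big(\frac{1}{m+1} - \frac{1}{m+2}\Big) + \Big(\frac{1}{m+3} - \frac{1}{m+4}\Big) + \cdots + \Big(\frac{1}{n-2} - \frac{1}{n-1}\Big) + \frac{1}{n} > 0,
\end{aligned}
\]
since all the terms in parentheses are positive. So, if $n - m$ is odd, then just as
before, we can drop the absolute values in (3.41) to get
\[
\begin{aligned}
\Big| \sum_{k=m+1}^{n} \frac{(-1)^{k-1}}{k} \Big|
&= \frac{1}{m+1} - \frac{1}{m+2} + \frac{1}{m+3} - \cdots - \frac{1}{n-1} + \frac{1}{n} \\
&= \frac{1}{m+1} - \Big(\frac{1}{m+2} - \frac{1}{m+3}\Big) - \cdots - \Big(\frac{1}{n-1} - \frac{1}{n}\Big) \le \frac{1}{m+1},
\end{aligned}
\]
since, once again, all the terms in parentheses are positive (equality occurs only when
$n = m + 1$, in which case there are no parentheses at all). In conclusion, regardless
of whether $n - m$ is even or odd, we see that for any $n > m$, we have
\[
\Big| \sum_{k=m+1}^{n} \frac{(-1)^{k-1}}{k} \Big| \le \frac{1}{m+1} .
\]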
we will study thoroughly in Section 6.1. Later on, in Section 4.6, we’ll prove that
$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$ equals $\log 2$.
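The Cauchy-type bound for blocks of the alternating harmonic series, and the value log 2 announced for Section 4.6, can both be probed numerically. The sketch below is an illustration only; the function name `block` is ours, the blocks are summed exactly, and the comparison with `math.log(2)` is a floating-point plausibility check. Note that a block consisting of a single term (n = m + 1) has absolute value exactly 1/(m + 1), so the check uses "at most".

```python
from fractions import Fraction
import math


def block(m, n):
    """Exact value of the block sum of (-1)^(k-1)/k for k = m+1, ..., n."""
    return sum(Fraction((-1) ** (k - 1), k) for k in range(m + 1, n + 1))


if __name__ == "__main__":
    for m, n in ((0, 1), (3, 10), (10, 25), (100, 131)):
        print(m, n, float(abs(block(m, n))), 1 / (m + 1))
    # Partial sums s_N = block(0, N) approach log 2; the error is at most 1/(N + 1).
    for N in (10, 100, 500):
        print(N, float(block(0, N)), math.log(2))
```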
Example 3.44. Using the associative law in Theorem 3.23, we can also write
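\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = \Big(1 - \frac{1}{2}\Big) + \Big(\frac{1}{3} - \frac{1}{4}\Big) + \Big(\frac{1}{5} - \frac{1}{6}\Big) + \cdots = \frac{1}{1\cdot 2} + \frac{1}{3\cdot 4} + \frac{1}{5\cdot 6} + \cdots .
\]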
Exercises 3.6.
1. For this problem, assume you know all the “well-known” high school properties of $\log x$
(e.g. $\log x^k = k \log x$, $\log(xy) = \log x + \log y$, etc.). Using the Cauchy condensation
test, determine the convergence of the following series:
\[
\text{(a)}\ \sum_{n=2}^{\infty} \frac{1}{n (\log n)^2}, \qquad
\text{(b)}\ \sum_{n=2}^{\infty} \frac{1}{n (\log n)^p}, \qquad
\text{(c)}\ \sum_{n=2}^{\infty} \frac{1}{n (\log n)(\log(\log n))} .
\]
For (b), state which $p$ give convergent/divergent series.
2. Prove that
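\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} = 1 - \frac{1}{2\cdot 3} - \frac{1}{4\cdot 5} - \frac{1}{6\cdot 7} - \cdots ,
\]
\[
\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2} = \frac{1+2}{1^2 \cdot 2^2} + \frac{3+4}{3^2 \cdot 4^2} + \frac{5+6}{5^2 \cdot 6^2} + \cdots .
\]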
3. We consider various (unrelated) properties of real series $\sum a_n$ with $a_n \ge 0$ for all $n$.
(a) Here is a nice generalization of the Cauchy condensation test: If the $a_n$’s are
nonincreasing, then given a natural number $b > 1$, prove that $\sum a_n$ converges or
diverges with the series
\[
\sum_{n=0}^{\infty} b^n a_{b^n} = a_1 + b\, a_b + b^2 a_{b^2} + b^3 a_{b^3} + \cdots .
\]
Thus, the Cauchy condensation test is just this test with $b = 2$.
(b) If $\sum a_n$ converges, prove that for any $k \in \mathbb{N}$, the series $\sum a_n^k$ also converges.
(c) If $\sum a_n$ converges, give an example showing that $\sum \sqrt{a_n}$ may not converge.
However, prove that the series $\sum \sqrt{a_n}/n$ does converge. Suggestion: Can you somehow
use the AGMI with two terms?
(d) If $a_n > 0$ for all $n$, prove that $\sum a_n$ converges if and only if for any series $\sum b_n$ of
nonnegative real numbers, $\sum (a_n^{-1} + b_n)^{-1}$ converges.
(e) If $\sum b_n$ is another series of nonnegative real numbers, prove that $\sum a_n$ and $\sum b_n$
converge if and only if $\sum \sqrt{a_n^2 + b_n^2}$ converges.
(f) Prove that $\sum a_n$ converges if and only if $\sum \frac{a_n}{1 + a_n}$ converges.
(g) For each $n \in \mathbb{N}$, define $b_n = \sqrt{\sum_{k=n}^{\infty} a_k} = \sqrt{a_n + a_{n+1} + a_{n+2} + \cdots}$. Prove that
if $a_n > 0$ for all $n$ and $\sum a_n$ converges, then $\sum \frac{a_n}{b_n}$ converges. Suggestion: Show
that $a_n = b_n^2 - b_{n+1}^2$ and using this fact show that $\frac{a_n}{b_n} \le 2(b_n - b_{n+1})$ for all $n$.
4. We already know that if $\sum a_n$ (of complex numbers) converges, then $\lim a_n = 0$. When
the $a_n$’s form a nonincreasing sequence of nonnegative real numbers, then prove the
following astonishing fact (called Pringsheim’s theorem): If $\sum a_n$ converges, then
$n a_n \to 0$. Use the Cauchy criterion for series somewhere in your proof. Suggestion:
Let $\varepsilon > 0$ and choose $N$ such that $n > m > N$ implies
\[
a_{m+1} + a_{m+2} + \cdots + a_n < \frac{\varepsilon}{2} .
\]
Take $n = 2m$ and then $n = 2m + 1$.
5. (Limit comparison test) Let $\{a_n\}$ and $\{b_n\}$ be nonzero complex sequences and
suppose that the following limit exists: $L := \lim \frac{a_n}{b_n}$. Prove that
(i) If $L \ne 0$, then $\sum a_n$ is absolutely convergent if and only if $\sum b_n$ is absolutely
convergent.
(ii) If $L = 0$ and $\sum b_n$ is absolutely convergent, then $\sum a_n$ is absolutely convergent.
6. Here’s an alternative method to prove that the alternating harmonic series converges.
(i) Let $\{b_n\}$ be a sequence in $\mathbb{R}^m$ and suppose that the even and odd subsequences
{b2k } and {b2k−1 } both converge and have the same limit L. Prove that the
original sequence {bn } converges and has limit L.
(ii) Show that the subsequences of even and odd partial sums of the alternating
harmonic series both converge and have the same limit.
7. (Ratio comparison test) Let $\{a_n\}$ and $\{b_n\}$ be sequences of positive numbers and
suppose that $\frac{a_{n+1}}{a_n} \le \frac{b_{n+1}}{b_n}$ for all $n$. If $\sum b_n$ converges, prove that $\sum a_n$ also converges.
(Equivalently, if $\sum a_n$ diverges, then $\sum b_n$ also diverges.) Suggestion: Consider the
telescoping product
\[
a_n = \frac{a_n}{a_{n-1}} \cdot \frac{a_{n-1}}{a_{n-2}} \cdot \frac{a_{n-2}}{a_{n-3}} \cdots \frac{a_2}{a_1} \cdot a_1 .
\]
8. (Cf. [104], [113], [235]) We already know that the harmonic series $\sum 1/n$ diverges. It
turns out that omitting certain numbers from this sum makes the sum converge. Fix a
natural number $b \ge 2$. Recall (see Section 2.5) that we can write any natural number
$n$ uniquely as $n = a_k a_{k-1} \cdots a_0$, where $0 \le a_j \le b - 1$, $j = 0, \ldots, k$, are called digits,
and where the notation $a_k \cdots a_0$ means that
\[
n = a_k b^k + a_{k-1} b^{k-1} + \cdots + a_1 b + a_0 .
\]
Prove that the following sum converges:
\[
\sum_{n \text{ has no 0 digit}} \frac{1}{n} .
\]
Suggestion: Let $c_k$ be the sum over all numbers of the form $\frac{1}{n}$ where $n = a_k a_{k-1} \cdots a_0$
with none of the $a_j$’s zero. Show that there are at most $(b-1)^{k+1}$ such $n$’s and that $n \ge b^k$,
and use these facts to show that $c_k \le \frac{(b-1)^{k+1}}{b^k}$. Prove that $\sum_{k=0}^{\infty} c_k$ converges and use
this to prove that the desired sum converges.
that is, both sides are well-defined (the limits and sums converge) and are equal.
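Proof. First of all, we remark that the series on the right converges. Indeed,
if we put $a_k := \lim_{n\to\infty} a_k(n)$, which exists by assumption, then taking $n \to \infty$
in the inequality $|a_k(n)| \le M_k$, we have $|a_k| \le M_k$ as well. Therefore, by the
comparison test, $\sum_{k=1}^{\infty} a_k$ converges (absolutely).
Now to prove our theorem, let $\varepsilon > 0$ be given. By Cauchy’s criterion for series
we can fix an $\ell$ so that
\[
M_{\ell+1} + M_{\ell+2} + \cdots < \frac{\varepsilon}{3} .
\]
Since $m_n \to \infty$ as $n \to \infty$ we can choose $N_1$ so that for all $n > N_1$, we have $m_n > \ell$.
Then using that $|a_k(n) - a_k| \le |a_k(n)| + |a_k| \le M_k + M_k = 2M_k$, observe that for
any $n > N_1$ we have
\[
\begin{aligned}
\Big| \sum_{k=1}^{m_n} a_k(n) - \sum_{k=1}^{\infty} a_k \Big|
&= \Big| \sum_{k=1}^{\ell} (a_k(n) - a_k) + \sum_{k=\ell+1}^{m_n} (a_k(n) - a_k) - \sum_{k=m_n+1}^{\infty} a_k \Big| \\
&\le \sum_{k=1}^{\ell} |a_k(n) - a_k| + \sum_{k=\ell+1}^{m_n} 2M_k + \sum_{k=m_n+1}^{\infty} M_k \\
&\le \sum_{k=1}^{\ell} |a_k(n) - a_k| + \sum_{k=\ell+1}^{\infty} 2M_k < \sum_{k=1}^{\ell} |a_k(n) - a_k| + \frac{2\varepsilon}{3} .
\end{aligned}
\]
Since for each $k$, $\lim_{n\to\infty} a_k(n) = a_k$, there is an $N \ge N_1$ such that for each $k = 1, 2, \ldots, \ell$
and for $n > N$, we have $|a_k(n) - a_k| < \varepsilon/(3\ell)$. Thus, if $n > N$, then
\[
\Big| \sum_{k=1}^{m_n} a_k(n) - \sum_{k=1}^{\infty} a_k \Big| < \sum_{k=1}^{\ell} \frac{\varepsilon}{3\ell} + \frac{2\varepsilon}{3} = \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} = \varepsilon .
\]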
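where $m_n = n$ and
\[
a_k(n) := \frac{1 + 2^n}{2^n 3^k + 4^k} .
\]
Observe that for each $k \in \mathbb{N}$,
\[
\lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \frac{1 + 2^n}{2^n 3^k + 4^k} = \lim_{n\to\infty} \frac{\frac{1}{2^n} + 1}{3^k + \frac{4^k}{2^n}} = \frac{1}{3^k}
\]
exists. Also,
\[
|a_k(n)| = \frac{1 + 2^n}{2^n 3^k + 4^k} \le \frac{2^n + 2^n}{2^n 3^k} = \frac{2 \cdot 2^n}{2^n 3^k} = \frac{2}{3^k} =: M_k .
\]
By the geometric series test, we know that $\sum_{k=1}^{\infty} M_k$ converges. Hence by Tannery’s
theorem, we have
\[
\begin{aligned}
&\lim_{n\to\infty} \Big\{ \frac{1 + 2^n}{2^n 3 + 4} + \frac{1 + 2^n}{2^n 3^2 + 4^2} + \frac{1 + 2^n}{2^n 3^3 + 4^3} + \cdots + \frac{1 + 2^n}{2^n 3^n + 4^n} \Big\} \\
&\quad = \lim_{n\to\infty} \sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^{\infty} \lim_{n\to\infty} a_k(n) = \sum_{k=1}^{\infty} \frac{1}{3^k} = \frac{1/3}{1 - 1/3} = \frac{1}{2} .
\end{aligned}
\]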
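The limit 1/2 obtained from Tannery’s theorem can be cross-checked by evaluating the finite sums directly. The sketch below uses exact fractions and our own function names (`a`, `finite_sum`); it is a numerical illustration of the example, not a substitute for the theorem.

```python
from fractions import Fraction


def a(k, n):
    """The term a_k(n) = (1 + 2^n) / (2^n 3^k + 4^k)."""
    return Fraction(1 + 2 ** n, 2 ** n * 3 ** k + 4 ** k)


def finite_sum(n):
    """The sum of a_k(n) over k = 1, ..., m_n, where m_n = n."""
    return sum(a(k, n) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 40):
        print(n, float(finite_sum(n)))
```

The dominating bound |a_k(n)| ≤ 2/3^k used in the example can be checked the same way for any particular k and n.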
If the hypotheses of Tannery’s theorem are not met, then the conclusion of
Tannery’s theorem may not hold as the following example illustrates.
Example 3.46. Here’s a non-example of Tannery’s theorem. For each $k, n \in \mathbb{N}$,
let $a_k(n) := \frac{1}{n}$ and let $m_n = n$. Then
\[
\lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \frac{1}{n} = 0 \quad \Longrightarrow \quad \sum_{k=1}^{\infty} \lim_{n\to\infty} a_k(n) = \sum_{k=1}^{\infty} 0 = 0 .
\]
On the other hand,
\[
\sum_{k=1}^{m_n} a_k(n) = \sum_{k=1}^{n} \frac{1}{n} = \frac{1}{n} \cdot \sum_{k=1}^{n} 1 = 1 \quad \Longrightarrow \quad \lim_{n\to\infty} \sum_{k=1}^{m_n} a_k(n) = \lim_{n\to\infty} 1 = 1 .
\]
Thus, for this example,
\[
\lim_{n\to\infty} \sum_{k=1}^{m_n} a_k(n) \ne \sum_{k=1}^{\infty} \lim_{n\to\infty} a_k(n) .
\]
What went wrong here is that there is no constant $M_k$ such that $|a_k(n)| \le M_k$ for
all $n$ where the series $\sum_{k=1}^{\infty} M_k$ converges. Indeed, the inequality $|a_k(n)| \le M_k$
for all $n$ implies (setting $n = 1$) that $1 \le M_k$. It follows that the series $\sum_{k=1}^{\infty} M_k$
cannot converge. Therefore, Tannery’s theorem cannot be applied.
3.7.2. The exponential function. The exponential function exp : C → C
is the function defined by
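\[
\exp(z) := \sum_{n=0}^{\infty} \frac{z^n}{n!}, \quad \text{for } z \in \mathbb{C} .
\]
Of course, we need to show that the right-hand side converges for each $z \in \mathbb{C}$. In
fact, we claim that the series defining $\exp(z)$ is absolutely convergent. To prove
this, fix $z \in \mathbb{C}$ and choose $k \in \mathbb{N}$ so that $|z| \le \frac{k}{2}$. (Just as a reminder, recall that
such a $k$ exists by the Archimedean property.) Then for any $n \ge k$, we have
\[
\begin{aligned}
\frac{|z|^n}{n!} &= \Big( \frac{|z|}{1} \cdot \frac{|z|}{2} \cdots \frac{|z|}{k} \Big) \cdot \Big( \frac{|z|}{k+1} \cdots \frac{|z|}{n} \Big) \\
&\le |z|^k \cdot \frac{k}{2(k+1)} \cdot \frac{k}{2(k+2)} \cdots \frac{k}{2n} \\
&\le \Big( \frac{k}{2} \Big)^k \cdot \frac{1}{2} \cdot \frac{1}{2} \cdots \frac{1}{2} = k^k \frac{1}{2^n} .
\end{aligned}
\]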
3.7. TANNERY’S THEOREM, THE EXPONENTIAL FUNCTION, AND THE NUMBER e 141
Therefore, for $n \ge k$, $\frac{|z|^n}{n!} \le \frac{C}{2^n}$ where $C$ is the constant $k^k$. Since the geometric
series $\sum 1/2^n$ converges, by the comparison test, the series defining $\exp(z)$ is
absolutely convergent for any $z \in \mathbb{C}$. In the following theorem, we relate the exponential
function to Euler’s number $e$ introduced in Section 3.3. The proof of Property (1)
in this theorem is a beautiful application of Tannery’s theorem.
Theorem 3.31 (Properties of the complex exponential). The exponential
function has the following properties:
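(1) For any $z \in \mathbb{C}$ and sequence $z_n \to z$, we have
\[
\exp(z) = \lim_{n\to\infty} \Big( 1 + \frac{z_n}{n} \Big)^n .
\]
In particular, setting $z_n = z$ for all $n$,
\[
\exp(z) = \lim_{n\to\infty} \Big( 1 + \frac{z}{n} \Big)^n ,
\]
and setting $z = 1$, we get
\[
\exp(1) = \lim_{n\to\infty} \Big( 1 + \frac{1}{n} \Big)^n = e .
\]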
(2) For any complex numbers z and w,
exp(z) · exp(w) = exp(z + w).
(3) $\exp(z)$ is never zero for any complex number $z$, and
\[
\frac{1}{\exp(z)} = \exp(-z) .
\]
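The three properties can be explored numerically before reading the proof. The sketch below sums the defining series in floating point and compares it with the limit expression from Property (1) and with the product rule from Property (2). The function names and the truncation at 60 terms are our own choices, and `cmath.exp` serves only as an independent reference value (the text identifies exp(z) with e^z later, in Section 4.6).

```python
import cmath


def exp_series(z, terms=60):
    """Partial sum of the series z^n / n! for n = 0, ..., terms - 1."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z^n/n! into z^(n+1)/(n+1)!
    return total


def exp_limit(z, n):
    """The approximation (1 + z/n)^n appearing in Property (1)."""
    return (1 + z / n) ** n


if __name__ == "__main__":
    z, w = 1 + 2j, -0.5 + 1j
    print(exp_series(z), cmath.exp(z))
    for n in (10, 1000, 10 ** 6):
        print(n, exp_limit(z, n))
    print(exp_series(z) * exp_series(w), exp_series(z + w))
```

The convergence of (1 + z/n)^n is visibly slow compared with the series, which is one practical reason for defining exp through the series.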
Proof. To prove (1), let $z \in \mathbb{C}$ and let $\{z_n\}$ be a complex sequence and
suppose that $z_n \to z$; we need to show that $\lim_{n\to\infty} (1 + z_n/n)^n = \exp(z)$. To
begin, we expand $(1 + z_n/n)^n$ using the binomial theorem:
\[
\Big( 1 + \frac{z_n}{n} \Big)^n = \sum_{k=0}^{n} \binom{n}{k} \frac{z_n^k}{n^k} \quad \Longrightarrow \quad \Big( 1 + \frac{z_n}{n} \Big)^n = \sum_{k=0}^{n} a_k(n),
\]
where $a_k(n) = \binom{n}{k} \frac{z_n^k}{n^k}$. Hence, we are aiming to prove that
\[
\lim_{n\to\infty} \sum_{k=0}^{n} a_k(n) = \exp(z) .
\]
Of course, written in this way, we are in the perfect set-up for Tannery’s theorem!
However, before going to Tannery’s theorem, we note that, by definition of $a_k(n)$,
we have $a_0(n) = 1$ and $a_1(n) = z_n$. Therefore, since $z_n \to z$,
\[
\lim_{n\to\infty} \sum_{k=0}^{n} a_k(n) = \lim_{n\to\infty} \Big( 1 + z_n + \sum_{k=2}^{n} a_k(n) \Big) = 1 + z + \lim_{n\to\infty} \sum_{k=2}^{n} a_k(n) .
\]
Thus, we just have to apply Tannery’s theorem to the sum starting from $k = 2$; for
this reason, we henceforth assume that $k, n \ge 2$. Now observe that for $2 \le k \le n$,
we have
\[
\begin{aligned}
\binom{n}{k} \frac{1}{n^k} &= \frac{n!}{k!(n-k)!} \cdot \frac{1}{n^k} = \frac{1}{k!}\, n(n-1)(n-2) \cdots (n-k+1) \cdot \frac{1}{n^k} \\
&= \frac{1}{k!} \Big( 1 - \frac{1}{n} \Big) \Big( 1 - \frac{2}{n} \Big) \cdots \Big( 1 - \frac{k-1}{n} \Big) .
\end{aligned}
\]
Thus, for $2 \le k \le n$,
\[
a_k(n) = \frac{1}{k!} \Big( 1 - \frac{1}{n} \Big) \Big( 1 - \frac{2}{n} \Big) \cdots \Big( 1 - \frac{k-1}{n} \Big) z_n^k .
\]
Using this expression for $a_k(n)$ we can easily verify the hypotheses of Tannery’s
theorem. First, since $z_n \to z$,
\[
\lim_{n\to\infty} a_k(n) = \lim_{n\to\infty} \frac{1}{k!} \Big( 1 - \frac{1}{n} \Big) \Big( 1 - \frac{2}{n} \Big) \cdots \Big( 1 - \frac{k-1}{n} \Big) z_n^k = \frac{z^k}{k!} .
\]
In particular,
exp(z) · exp(−z) = exp(z − z) = exp(0) = 1,
which implies (3).
We remark that Tannery’s theorem can also be used to establish formulas for
sine and cosine, see Problem 2. Also, in Section 4.6 we’ll see that $\exp(z) = e^z$;
however, at this point, we don’t even know what $e^z$ (“e to the power z”) means.
We end with the following neat “infinite nested product” formula for $e$:
\[
e = 1 + \frac{1}{1} \Big( 1 + \frac{1}{2} \Big( 1 + \frac{1}{3} \Big( 1 + \frac{1}{4} \Big( 1 + \frac{1}{5} \big( \cdots \big) \Big) \Big) \Big) \Big) ; \tag{3.43}
\]
see Problem 6 for the meaning of the right-hand side.
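The finite version of the nested formula, stated in Problem 6, is easy to test: evaluating the nesting from the inside out with n levels should reproduce 1 + 1/1! + ... + 1/n! exactly. The following sketch (the names `nested` and `factorial_sum` are ours) does this with exact fractions.

```python
from fractions import Fraction
from math import e, factorial


def nested(n):
    """Evaluate 1 + (1/1)(1 + (1/2)(1 + ... (1 + 1/n))) from the inside out."""
    value = Fraction(1)
    for k in range(n, 0, -1):
        value = 1 + Fraction(1, k) * value
    return value


def factorial_sum(n):
    """The partial sum 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        print(n, nested(n) == factorial_sum(n), float(nested(n)), e)
```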
Exercises 3.7.
(b) Following the proof that e is irrational, prove that cos 1 (or sin 1 if you prefer) is
irrational.
3. Following [139], we prove that for any $m \ge 3$,
\[
\sum_{n=0}^{m} \frac{1}{n!} - \frac{3}{2m} < \Big( 1 + \frac{1}{m} \Big)^m < \sum_{n=0}^{m} \frac{1}{n!} .
\]
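(ii) By the tails theorem, we may assume that $1 < a_n$ for all $n$. For each $n$, let $m_n$
be the unique integer such that $m_n - 1 \le a_n < m_n$ (thus, $m_n = \lfloor a_n \rfloor + 1$ where
$\lfloor a_n \rfloor$ is the greatest integer function). Prove that if $m_n \ge 1$, then
\[
\Big( 1 + \frac{1}{m_n} \Big)^{m_n - 1} \le \Big( 1 + \frac{1}{a_n} \Big)^{a_n} \le \Big( 1 + \frac{1}{m_n - 1} \Big)^{m_n} .
\]
Now prove (3.44).
5. Let $\{b_n\}$ be any null sequence of positive rational numbers. Prove that
\[
e = \lim (1 + b_n)^{\frac{1}{b_n}} .
\]
6. Prove that for any $n \in \mathbb{N}$,
\[
1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} = 1 + \frac{1}{1} \Big( 1 + \frac{1}{2} \Big( 1 + \frac{1}{3} \Big( 1 + \frac{1}{4} \Big( \cdots \Big( 1 + \frac{1}{n-1} \Big( 1 + \frac{1}{n} \Big) \Big) \cdots \Big) \Big) \Big) \Big) .
\]
The infinite nested sum in (3.43) denotes the limit as $n \to \infty$ of this expression.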
7. Trying to imitate the proof that e is irrational, prove that for any m ∈ N, exp(1/m) is
irrational. After doing this, show that cos(1/m) (or sin(1/m) if you prefer) is irrational,
where cosine and sine are defined in Problem 2. See the article [177] for more on
irrationality proofs.
8. (Tannery’s theorem II) For each natural number $n$, let $\sum_{k=1}^{\infty} a_k(n)$ be a convergent
series. Prove that if for each $k$, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^{\infty} M_k$ of
nonnegative real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, then
\[
\lim_{n\to\infty} \sum_{k=1}^{\infty} a_k(n) = \sum_{k=1}^{\infty} \lim_{n\to\infty} a_k(n) .
\]
then
\[
\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} a_{mn} = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} a_{mn} \tag{3.46}
\]
in the sense that both iterated sums converge and are equal. The implication (3.45)
$\Longrightarrow$ (3.46) is called Cauchy’s double series theorem; see Theorem 6.26 in Section
6.5 for the full story. To prove this, you may proceed as follows.
(i) Assume that $\sum_{m=1}^{\infty} \sum_{n=1}^{\infty} |a_{mn}|$ converges; we must prove the equality (3.46).
To do so, for each $k \in \mathbb{N}$ define $M_k := \sum_{j=1}^{\infty} |a_{kj}|$, which converges by assumption.
Then $\sum_{k=1}^{\infty} M_k$ also converges by assumption. Define $a_k(n) := \sum_{j=1}^{n} a_{kj}$.
Prove that Tannery’s theorem II can be applied to these $a_k(n)$’s and in doing so,
establish the equality (3.46).
(ii) Assume that $\sum_{n=1}^{\infty} \sum_{m=1}^{\infty} |a_{mn}|$ converges; prove the equality (3.46).
(iii) Cauchy’s double series theorem can be used to prove neat and non-obvious
identities. For example, prove that for any $k \in \mathbb{N}$ and $z \in \mathbb{C}$ with $|z| < 1$, we have
\[
\sum_{n=1}^{\infty} \frac{z^{n(k+1)}}{1 - z^n} = \sum_{m=1}^{\infty} \frac{z^{m+k}}{1 - z^{m+k}} ;
\]
that is,
\[
\frac{z^{k+1}}{1 - z} + \frac{z^{2(k+1)}}{1 - z^2} + \frac{z^{3(k+1)}}{1 - z^3} + \cdots = \frac{z^{1+k}}{1 - z^{1+k}} + \frac{z^{2+k}}{1 - z^{2+k}} + \frac{z^{3+k}}{1 - z^{3+k}} + \cdots .
\]
Suggestion: Apply Cauchy’s double series to $\{a_{mn}\}$ where $a_{mn} = z^{n(m+k)}$.
7To what heights would science now be raised if Archimedes had made that discovery ! [= the
decimal system of numeration or its equivalent (with some base other than 10)]. Carl Friedrich
Gauss (1777–1855).
3.8. DECIMALS AND “MOST” NUMBERS ARE TRANSCENDENTAL Á LA CANTOR 147
Example 3.48. Consider, for example, the number 1/2, which has two decimal
expansions:
\[
\frac{1}{2} = 0.50000000\ldots \quad \text{and} \quad \frac{1}{2} = 0.49999999\ldots .
\]
Notice that the first decimal expansion terminates.
You might remember from high school that the only decimals with two different
expansions are the ones that terminate. In general, a b-adic expansion
$0._b\, a_1 a_2 a_3 a_4 a_5 \ldots$ is said to terminate if all the $a_n$’s equal zero for $n$ large.
Theorem 3.34. Let $b$ be a positive integer greater than 1. Then every real
number in $(0, 1]$ has a unique b-adic expansion, except a terminating expansion,
which also can have a b-adic expansion where $a_n = b - 1$ for all $n$ sufficiently large.
Proof. For $a \in (0, 1]$, let $a = \sum_{n=1}^{\infty} \frac{a_n}{b^n}$ be its b-adic expansion found in
Theorem 3.33, so there are infinitely many nonzero $a_n$’s. Suppose that $\{\alpha_n\}$ is
another sequence of integers, not equal to the sequence $\{a_n\}$, such that $0 \le \alpha_n \le b - 1$
for all $n$ and such that $a = \sum_{n=1}^{\infty} \frac{\alpha_n}{b^n}$. Since $\{a_n\}$ and $\{\alpha_n\}$ are not the same
sequence there is at least one $n$ such that $a_n \ne \alpha_n$. Let $m$ be the smallest natural
number such that $a_m \ne \alpha_m$. Then $a_n = \alpha_n$ for $n = 1, 2, \ldots, m - 1$, so
\[
\sum_{n=1}^{\infty} \frac{a_n}{b^n} = \sum_{n=1}^{\infty} \frac{\alpha_n}{b^n} \quad \Longrightarrow \quad \sum_{n=m}^{\infty} \frac{a_n}{b^n} = \sum_{n=m}^{\infty} \frac{\alpha_n}{b^n} .
\]
Since the ends are equal, all the inequalities in between must be equalities. In
particular, making the first inequality into an equality shows that αn = 0 for all
n = m + 1, m + 2, m + 3, . . . and making the middle inequality into an equality
shows that an = b − 1 for all n = m + 1, m + 2, m + 3, . . .. It follows that a has only
one b-adic expansion except when we can write a as
a terminating one, and one that has repeating b−1’s. This completes our proof.
we have
\[
\frac{a_2}{b^2} \le \frac{r_1}{bq} < \frac{q}{bq} = \frac{1}{b},
\]
which implies that $0 \le a_2 < b$.
Once more using the division algorithm, we divide $b r_2$ by $q$, obtaining a unique
integer $a_3$ such that $b r_2 = a_3 q + r_3$ where $0 \le r_3 < q$. Since
\[
\frac{p}{q} - \frac{a_1}{b} - \frac{a_2}{b^2} - \frac{a_3}{b^3} = \frac{r_2}{b^2 q} - \frac{a_3}{b^3} = \frac{b r_2 - a_3 q}{b^3 q} = \frac{r_3}{b^3 q} \ge 0,
\]
we have
\[
\frac{a_3}{b^3} \le \frac{r_2}{b^2 q} < \frac{q}{b^2 q} = \frac{1}{b^2},
\]
which implies that $0 \le a_3 < b$. Continuing by induction, we construct integers
$a_n$ and $r_n$ with $0 \le a_n < b$ and $0 \le r_n < q$ such that for each $n$, $b r_n = a_{n+1} q + r_{n+1}$ and
\[
\frac{p}{q} - \frac{a_1}{b} - \frac{a_2}{b^2} - \frac{a_3}{b^3} - \cdots - \frac{a_n}{b^n} = \frac{r_n}{b^n q} .
\]
Since $0 \le r_n < q$ it follows that $\frac{r_n}{b^n q} \to 0$ as $n \to \infty$, so we can write
\[
\frac{p}{q} = \sum_{n=1}^{\infty} \frac{a_n}{b^n} \quad \Longleftrightarrow \quad \frac{p}{q} = 0._b\, a_1 a_2 a_3 a_4 a_5 \ldots . \tag{3.48}
\]
Now one of two things holds: Either some remainder $r_n = 0$ or none of the $r_n$’s
are zero. Suppose that we are in the first case, some $r_n = 0$. By construction,
we divide $b r_n$ by $q$ using the division algorithm to get $b r_n = a_{n+1} q + r_{n+1}$. Since
$r_n = 0$ and quotients and remainders are unique, we must have $a_{n+1} = 0$ and
$r_{n+1} = 0$. By construction, we divide $b r_{n+1}$ by $q$ using the division algorithm to
get $b r_{n+1} = a_{n+2} q + r_{n+2}$. Since $r_{n+1} = 0$ and quotients and remainders are unique,
we must have $a_{n+2} = 0$ and $r_{n+2} = 0$. Continuing this procedure, we see that all
$a_k$ with $k > n$ are zero. This, in view of (3.48), shows that the b-adic expansion of
$p/q$ has repeating zeros, so in particular is periodic.
Suppose that we are in the second case, no $r_n = 0$. Consider the $q + 1$
remainders $r_1, r_2, \ldots, r_{q+1}$. Since $0 \le r_n < q$, each $r_n$ can only take on the $q$ values
$0, 1, 2, \ldots, q - 1$ (“$q$ holes”), so by the pigeonhole principle, two of these remainders
must have the same value (“be in the same hole”). Thus, $r_k = r_{k+\ell}$ for some $k$
and some $\ell \ge 1$. We now show that $a_{k+1} = a_{k+\ell+1}$. Indeed, $a_{k+1}$ was defined by dividing
$b r_k$ by $q$ so that $b r_k = a_{k+1} q + r_{k+1}$. On the other hand, $a_{k+\ell+1}$ was defined by
dividing $b r_{k+\ell}$ by $q$ so that $b r_{k+\ell} = a_{k+\ell+1} q + r_{k+\ell+1}$. Now the division algorithm
states that the quotients and remainders are unique. Since $b r_k = b r_{k+\ell}$, it follows
that $a_{k+1} = a_{k+\ell+1}$ and $r_{k+1} = r_{k+\ell+1}$. Repeating this same argument shows that
$a_{k+n} = a_{k+\ell+n}$ for all $n \ge 1$; that is, $a_n = a_{n+\ell}$ for all $n > k$. Thus, $p/q$ has a
periodic b-adic expansion.
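The long-division construction of Step 1 translates almost line by line into a short program. The sketch below assumes integers 0 < p < q and a base b ≥ 2; it keeps track of the remainders and stops as soon as one repeats, which the pigeonhole argument guarantees will happen. The function name and the return format (non-repeating digits, repeating block) are our own choices.

```python
def b_adic_digits(p, q, b=10):
    """Digits of p/q (0 < p < q) in base b as (pre_period, period).

    Repeats the division step b*r = a*q + r' starting from r = p, and stops
    when a remainder is seen for the second time.
    """
    digits = []
    first_seen = {}  # remainder -> number of digits produced before it appeared
    r = p
    while r not in first_seen:
        first_seen[r] = len(digits)
        a, r = divmod(b * r, q)  # quotient is the next digit, r the new remainder
        digits.append(a)
    start = first_seen[r]
    return digits[:start], digits[start:]


if __name__ == "__main__":
    print(b_adic_digits(1, 6))     # digit 1, then the block 6 repeats
    print(b_adic_digits(1, 7))     # the block 1 4 2 8 5 7 repeats
    print(b_adic_digits(1, 2))     # digit 5, then repeating zeros
    print(b_adic_digits(1, 3, 2))  # base 2: the block 0 1 repeats
```

A zero remainder produces the repeating block [0], matching the first case of the proof.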
Step 2: We now prove the “if” portion: A number with a periodic b-adic
expansion is rational. Let $a$ be a real number and suppose that its b-adic decimal
expansion is periodic. Since $a$ is rational if and only if its noninteger part is rational,
we may assume that the integer part of $a$ is zero. Let
\[
a = 0._b\, a_1 a_2 \cdots a_k \overline{b_1 \cdots b_\ell}
\]
have a periodic b-adic expansion, where the bar means that the block $b_1 \cdots b_\ell$
repeats. Observe that in an expansion $\alpha_m \alpha_{m-1} \cdots \alpha_0 ._b\, \beta_1 \beta_2 \beta_3 \ldots$, multiplication by
$b^n$ for $n \in \mathbb{N}$ moves the decimal point $n$ places to the right. (Try to prove this;
think about the familiar base 10 case first.) In particular,
\[
b^{k+\ell} a = a_1 a_2 \cdots a_k b_1 \cdots b_\ell ._b\, \overline{b_1 \cdots b_\ell} = a_1 a_2 \cdots a_k b_1 \cdots b_\ell + 0._b\, \overline{b_1 \cdots b_\ell}
\]
and
\[
b^{k} a = a_1 a_2 \cdots a_k ._b\, \overline{b_1 \cdots b_\ell} = a_1 a_2 \cdots a_k + 0._b\, \overline{b_1 \cdots b_\ell} .
\]
Subtracting, we see that the numbers given by $0._b\, \overline{b_1 \cdots b_\ell}$ cancel, so $b^{k+\ell} a - b^k a = p$,
where $p$ is an integer. Hence, $a = p/q$, where $q = b^{k+\ell} - b^k$. Thus $a$ is rational.
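Step 2 is also constructive: given the non-repeating digits and the repeating block, the subtraction trick yields p and q explicitly. A minimal sketch, with our own function name and with digits passed as lists of integers between 0 and b − 1 (the repeating block is assumed nonempty):

```python
from fractions import Fraction


def from_periodic(pre_period, period, b=10):
    """Rational number 0.(pre_period)(period repeated) in base b.

    Uses a = p/q with q = b^(k+l) - b^k, where p is the integer written by all
    k + l digits minus the integer written by the first k digits.
    """
    def as_integer(digits):
        value = 0
        for d in digits:
            value = value * b + d
        return value

    k, l = len(pre_period), len(period)
    p = as_integer(pre_period + period) - as_integer(pre_period)
    q = b ** (k + l) - b ** k
    return Fraction(p, q)


if __name__ == "__main__":
    print(from_periodic([1], [6]))              # 0.1666... in base 10
    print(from_periodic([], [1, 4, 2, 8, 5, 7]))
    print(from_periodic([], [0, 1], b=2))       # 0.010101... in base 2
```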
3.8.3. Cantor’s diagonal argument. Now that we know about decimal ex-
pansions, we can present Cantor’s second proof that the real numbers are uncount-
able. His first proof appeared in Section 2.10.
Theorem 3.36 (Cantor’s second proof ). The interval (0, 1) is uncountable.
Proof. Assume, for the sake of deriving a contradiction, that there is a bijection
$f : \mathbb{N} \to (0, 1)$. Let us write the images of $f$ as decimals (base 10):
\[
\begin{array}{l}
1 \longleftrightarrow f(1) = .a_{11} a_{12} a_{13} a_{14} \cdots \\
2 \longleftrightarrow f(2) = .a_{21} a_{22} a_{23} a_{24} \cdots \\
3 \longleftrightarrow f(3) = .a_{31} a_{32} a_{33} a_{34} \cdots \\
4 \longleftrightarrow f(4) = .a_{41} a_{42} a_{43} a_{44} \cdots \\
\qquad \vdots
\end{array}
\]
where we may assume that in each of these expansions there is never an infinite
run of 9’s. Recall from Theorem 3.33 that every real number in $(0, 1)$ has a unique
such representation. Now let us define a real number $a = .a_1 a_2 a_3 \cdots$, where
\[
a_n := \begin{cases} 3 & \text{if } a_{nn} \ne 3, \\ 7 & \text{if } a_{nn} = 3. \end{cases}
\]
(The choice of 3 and 7 is arbitrary; you can choose another pair of unequal
integers in $0, \ldots, 9$ if you like!) Notice that $a_n \ne a_{nn}$ for all $n$. In particular,
$a \ne f(1)$ because $a$ and $f(1)$ differ in the first digit. On the other hand, $a \ne f(2)$
because $a$ and $f(2)$ differ in the second digit. Similarly, $a \ne f(n)$ for every $n$ since
$a$ and $f(n)$ differ in the $n$-th digit. This contradicts that $f : \mathbb{N} \to (0, 1)$ is onto.
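The diagonal construction is completely explicit, and on a finite list it can be carried out mechanically. The sketch below takes the first few digit strings of a purported list (the sample rows are made up for illustration) and produces digits that differ from the n-th row in the n-th place, using the same choice of 3 and 7 as in the proof.

```python
def diagonal_digits(rows):
    """Given digit lists rows[0], rows[1], ..., return digits a_1, a_2, ... with
    a_n = 3 if the n-th digit of the n-th row is not 3, and a_n = 7 otherwise."""
    return [3 if row[i] != 3 else 7 for i, row in enumerate(rows)]


if __name__ == "__main__":
    rows = [
        [1, 4, 1, 5],  # stands for f(1) = .1415...
        [3, 3, 3, 3],  # stands for f(2) = .3333...
        [7, 1, 3, 2],  # stands for f(3) = .7132...
        [0, 0, 0, 3],  # stands for f(4) = .0003...
    ]
    a = diagonal_digits(rows)
    print(a)
    print(all(a[i] != rows[i][i] for i in range(len(rows))))
```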
This argument is not only elegant, it is useful: Cantor’s diagonal argument
gives a good method to generate transcendental numbers (see [87])!
Exercises 3.8.
1. Find the numbers with the b-adic expansions (here $b = 10, 2, 3$, respectively):
(a) $0.010101\ldots$, (b) $0._2\, 010101\ldots$, (c) $0._3\, 010101\ldots$.
2. Prove that a real number $a \in (0, 1)$ has a terminating decimal expansion if and only if
$2^m 5^n a \in \mathbb{Z}$ for some nonnegative integers $m, n$.
3. (s-adic expansions) Let $s = \{b_n\}$ be a sequence of integers with $b_n > 1$ for all $n$ and
let $0 < a \le 1$. Prove that there is a sequence of integers $\{a_n\}_{n=1}^{\infty}$ with $0 \le a_n \le b_n - 1$
for all $n$ and with infinitely many nonzero $a_n$’s such that
\[
a = \sum_{n=1}^{\infty} \frac{a_n}{b_1 \cdot b_2 \cdot b_3 \cdots b_n},
\]
4. (Cantor’s original diagonal argument) Let g and c be any two distinct objects
and let G be the set consisting of all functions f : N −→ {g, c}. Let f1 , f2 , f3 , . . . be
any infinite sequence of elements of G. Prove that there is an element f in G that is
not in this list. From this prove that G is uncountable. Conclude that the set of all
sequences of 0’s and 1’s is uncountable.
CHAPTER 4
One merit of mathematics few will deny: it says more in fewer words than
any other science. The formula, $e^{i\pi} = -1$ expressed a world of thought, of
truth, of poetry, and of the religious spirit “God eternally geometrizes.”
David Eugene Smith (1860–1944) [188].
In this chapter we study, without doubt, the most important types of functions
in all of analysis and topology, the continuous functions. In particular, we study
the continuity properties of “the most important function in mathematics”
[192, p. 1]: $\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}$, $z \in \mathbb{C}$. From this single function arise just about
every important function and number you can think of: the logarithm function,
powers, roots, the trigonometric functions, the hyperbolic functions, the number $e$,
the number $\pi$, . . . , and the famous formula displayed in the above quote!
What do the Holy Bible, squaring the circle, House bill No. 246 of the Indiana
state legislature in 1897, square free natural numbers, coprime natural numbers,
the sentence
(4.1) May I have a large container of coffee? Thank you,
the mathematicians Archimedes of Syracuse, William Jones, Leonhard Euler, Jo-
hann Heinrich Lambert, Carl Louis Ferdinand von Lindemann, John Machin, and
Yasumasa Kanada have to do with each other? The answer (drum roll please):
They all have been involved in the life of the remarkable number π! This fascinat-
ing number is defined and some of its amazing and death-defying properties and
formulæ are studied in this chapter! By the way, the sentence (4.1) is a mnemonic
device to remember the digits of π. The number of letters in each word represents
a digit of π; e.g. “May” represents 3, “I” 1, etc. The sentence (4.1) gives ten digits
of π: 3.141592653.¹
In Section 4.1 we begin our study of continuity by learning limits of functions,
in Section 4.2 we study some useful limit properties, and then in Section 4.3 we
discuss continuous functions in terms of limits of functions. In Section 4.4, we study
some fundamental properties of continuous functions. A special class of functions,
called monotone functions, have many special properties, which are investigated in
Section 4.5. In Section 4.6 we study “the most important function in mathematics”
and we also study its inverse, the logarithm function, and then we use the logarithm
function to define powers. We also define the Riemann zeta function, the Euler-
Mascheroni constant γ:
\[ \gamma := \lim_{n\to\infty} \Bigl( 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n \Bigr), \]
¹ Using mnemonics to memorize digits of π isn’t a good idea if you want to beat Hiroyuki
Goto’s record of reciting 42,195 digits from memory! (see http://www.pi-world-ranking-list.com)
154 4. LIMITS, CONTINUITY, AND ELEMENTARY FUNCTIONS
a constant that will come up again and again (see the book [96], which is devoted to
this number), and we’ll prove that the alternating harmonic series has sum log 2:
\[ \log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + - \cdots, \]
another fact that will come up often. In Section 4.7 we use the exponential function
to define the trigonometric functions and we define π, the fundamental constant
of geometry. In Section 4.8 we study roots of complex numbers and we give fairly
elementary proofs of the fundamental theorem of algebra. In Section 4.9 we study
the inverse trigonometric functions. The calculation and (hopeful) imparting of a
sense of great fascination of the incredible number π are the features of Sections
4.10, 5.1 and 5.2. In particular, we’ll derive the first analytical expression for π:
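\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \cdot \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots, \]
\[ \frac{\pi^2}{6} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} + \cdots. \]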
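The formulas announced in this introduction can be sampled numerically before they are proved. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the cutoffs are arbitrary, and floating-point partial sums suggest the limits without proving anything.

```python
import math


def harmonic_minus_log(n):
    """1 + 1/2 + ... + 1/n - log n, which tends to the Euler-Mascheroni constant."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)


def alternating_harmonic(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms, which tends to log 2."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


def viete_product(n):
    """Product of the first n nested-radical factors in the formula for 2/pi."""
    term, product = 0.0, 1.0
    for _ in range(n):
        term = math.sqrt(0.5 + 0.5 * term)  # next nested square root
        product *= term
    return product


def basel_partial_sum(n):
    """Partial sum 1 + 1/2^2 + ... + 1/n^2, which tends to pi^2 / 6."""
    return sum(1.0 / k ** 2 for k in range(1, n + 1))


print(harmonic_minus_log(10 ** 5))
print(alternating_harmonic(10 ** 5), math.log(2))
print(viete_product(30), 2 / math.pi)
print(basel_partial_sum(10 ** 5), math.pi ** 2 / 6)
```

The nested-radical product settles down after a few dozen factors, while the three sums converge slowly, which is visible in the printed digits.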
Chapter 4 objectives: The student will be able to . . .
• apply the rigorous ε-δ definition of limits for functions and continuity.
• apply and understand the proofs of the fundamental theorems of continuous
functions.
• define the elementary functions (exponential, trigonometric, and their inverses)
and the number π.
• explain three related proofs of the fundamental theorem of algebra.
Figure 4.1. Here are three functions with D = [0, ∞) (we’ll denote
them by the generic letter f). In the first graph, L = f(c),
in the second graph f(c) ≠ L, and in the third graph, f(c) is not
even defined. However, in all three cases, $\lim_{x\to c} f = L$.
(Here, the domain D of $3x^2 - 10$ is assumed to be all of R.) Let ε > 0 be given.
We need to prove that there is a real number δ > 0 such that
\[ 0 < |x - 2| < \delta \implies \bigl|3x^2 - 10 - 2\bigr| = \bigl|3x^2 - 12\bigr| < \varepsilon. \]
How do we find such a δ . . . well . . . we “massage” $|3x^2 - 12|$. Observe that
\[ \bigl|3x^2 - 12\bigr| = 3\,\bigl|x^2 - 4\bigr| = 3\,|x + 2| \cdot |x - 2|. \]
Let us tentatively restrict x so that |x − 2| < 1. In this case,
\[ |x + 2| = |x - 2 + 4| \le |x - 2| + 4 < 1 + 4 = 5. \]
Thus,
\[ |x - 2| < 1 \implies \bigl|3x^2 - 12\bigr| = 3\,|x + 2| \cdot |x - 2| < 15\,|x - 2|. \tag{4.3} \]
Now
\[ 15\,|x - 2| < \varepsilon \iff |x - 2| < \frac{\varepsilon}{15}. \tag{4.4} \]
For this reason, let us pick δ to be the minimum of 1 and ε/15. Then |x − 2| < δ
implies |x − 2| < 1 and |x − 2| < ε/15, therefore according to (4.3) and (4.4), we
have
\[ 0 < |x - 2| < \delta \implies \bigl|3x^2 - 12\bigr| \overset{\text{by (4.3)}}{<} 15\,|x - 2| \overset{\text{by (4.4)}}{<} \varepsilon. \]
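The choice δ = min(1, ε/15) made above can be spot-checked numerically. The following Python sketch is only a sanity check on sample points, not a substitute for the proof; the names `delta_quadratic` and `check_quadratic` and the sampling scheme are hypothetical.

```python
def delta_quadratic(eps):
    """delta = min(1, eps/15), the choice made in the text."""
    return min(1.0, eps / 15.0)


def check_quadratic(eps, samples=1000):
    """Test |(3x^2 - 10) - 2| < eps at sample points with 0 < |x - 2| < delta."""
    delta = delta_quadratic(eps)
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)  # 0 < h < delta
        for x in (2.0 - h, 2.0 + h):
            if not abs((3 * x * x - 10) - 2) < eps:
                return False
    return True


print([check_quadratic(eps) for eps in (10.0, 1.0, 0.1, 1e-3)])
```

Every sampled x within δ of 2 gives a value within ε of the limit 2, as the estimate (4.3) and (4.4) predicts.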
4.1. CONVERGENCE AND ε-δ ARGUMENTS FOR LIMITS OF FUNCTIONS 157
Figure 4.2. If |z − c| < |c|/2, then this picture shows that |z| > |c|/2.
In order to make this expression less than ε, we need to bound the term in front
of |z − c| (we need to make sure that |z| can’t get too small, otherwise 1/|zc| can
blow up). To do so, we tentatively restrict z so that |z − c| < |c|/2. In this case, as
seen in Figure 4.2, we also have |z| > |c|/2. Here is a proof if you like:
\[ |c| = |c - z + z| \le |c - z| + |z| < \frac{|c|}{2} + |z| \implies \frac{|c|}{2} < |z|. \]
Therefore, if |z − c| < |c|/2, then $|zc| > \frac{|c|}{2} \cdot |c| = \frac{1}{2}|c|^2 = b$, where $b = |c|^2/2$ is a
positive number. Thus,
\[ |z - c| < \frac{|c|}{2} \implies \Bigl| \frac{1}{z} - \frac{1}{c} \Bigr| < \frac{1}{b}\,|z - c|. \tag{4.6} \]
Now
\[ \frac{1}{b}\,|z - c| < \varepsilon \iff |z - c| < b\,\varepsilon. \tag{4.7} \]
For this reason, let us pick δ to be the minimum of |c|/2 and b ε. Then |z − c| < δ
implies |z − c| < |c|/2 and |z − c| < b ε, therefore according to (4.6) and (4.7), we
have
\[ 0 < |z - c| < \delta \implies \Bigl| \frac{1}{z} - \frac{1}{c} \Bigr| \overset{\text{by (4.6)}}{<} \frac{1}{b}\,|z - c| \overset{\text{by (4.7)}}{<} \varepsilon. \]
Thus, by definition of limit, $\lim_{z\to c} 1/z = 1/c$.
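The same kind of spot check works for the reciprocal with complex arguments. In the Python sketch below, δ = min(|c|/2, bε) with b = |c|²/2 is the choice made in the text, while the function names and the sample points on circles around c are hypothetical choices for illustration.

```python
import cmath
import math


def delta_reciprocal(c, eps):
    """delta = min(|c|/2, b*eps) with b = |c|^2 / 2, as chosen in the text."""
    b = abs(c) ** 2 / 2
    return min(abs(c) / 2, b * eps)


def check_reciprocal(c, eps, directions=360):
    """Sample points z with 0 < |z - c| < delta and test |1/z - 1/c| < eps."""
    delta = delta_reciprocal(c, eps)
    for k in range(directions):
        direction = cmath.exp(2j * math.pi * k / directions)  # unit complex number
        for fraction in (0.01, 0.5, 0.99):
            z = c + fraction * delta * direction
            if not abs(1 / z - 1 / c) < eps:
                return False
    return True


print(check_reciprocal(1 + 2j, 0.1), check_reciprocal(-0.5j, 1e-3))
```

The restriction |z − c| < |c|/2 keeps every sample point away from 0, so the division 1/z never fails.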
Example 4.6. Here is one last example. Define $f : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R}$ by
\[ f(x) = \frac{x_1^2\, x_2}{x_1^2 + x_2^2}, \qquad x = (x_1, x_2). \]
We shall prove that $\lim_{x\to 0} f = 0$. (In the subscript “x → 0”, 0 denotes the zero
vector (0, 0) in $\mathbb{R}^2$ while on the right of $\lim_{x\to 0} f = 0$, 0 denotes the real number 0;
it should always be clear from context what “0” means.) Before our actual proof,
we first note that for any real numbers a, b, we have $0 \le (a - b)^2 = a^2 + b^2 - 2ab$.
Solving for ab, we get
\[ a\,b \le \frac{1}{2}\bigl( a^2 + b^2 \bigr). \]
This inequality is well worth remembering. Hence, applying it with $a = |x_1|$ and $b = |x_2|$,
\[ |f(x_1, x_2)| = \frac{|x_1|}{x_1^2 + x_2^2} \cdot |x_1 x_2| \le \frac{|x_1|}{x_1^2 + x_2^2} \cdot \frac{1}{2}\bigl( x_1^2 + x_2^2 \bigr) = \frac{|x_1|}{2}. \]
Given ε > 0, choose δ = ε. Then
\[ 0 < |x| < \delta \implies |x_1| < \delta \implies |f(x)| \le \frac{|x_1|}{2} \le \frac{\varepsilon}{2} < \varepsilon, \]
which implies that $\lim_{x\to 0} f = 0$.
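The two estimates of Example 4.6, namely |f(x)| ≤ |x₁|/2 and |f(x)| < ε whenever 0 < |x| < δ = ε, can be sampled on small circles around the origin. This Python sketch is a numerical illustration only; the function name `check_limit_at_origin`, the sampled radii, and the small rounding slack are assumptions made for the example.

```python
import math


def f(x1, x2):
    """f(x) = x1^2 * x2 / (x1^2 + x2^2), defined away from the origin."""
    return x1 * x1 * x2 / (x1 * x1 + x2 * x2)


def check_limit_at_origin(eps, directions=360):
    """With delta = eps, sample points with 0 < |x| < delta and test both estimates."""
    delta = eps
    for fraction in (0.001, 0.5, 0.999):
        r = fraction * delta
        for k in range(directions):
            t = 2 * math.pi * k / directions
            x1, x2 = r * math.cos(t), r * math.sin(t)
            value = abs(f(x1, x2))
            # |f(x)| <= |x1|/2 (with a tiny slack for rounding) and |f(x)| < eps
            if value > abs(x1) / 2 * (1 + 1e-12) or not value < eps:
                return False
    return True


print([check_limit_at_origin(eps) for eps in (1.0, 0.1, 1e-6)])
```

The slack factor is needed only because the bound |f(x)| ≤ |x₁|/2 is attained with equality when |x₁| = |x₂|.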
4.1.3. The sequence definition of limit. It turns out that we can relate
limits of functions to limits of sequences, which were studied in Chapter 3, so we can
use much of the theory developed in that chapter to analyze limits of functions. In
particular, take note of the following important theorem!
Theorem 4.2 (Sequence criterion for limits). Let $f : D \to \mathbb{R}^m$ and let c
be a limit point of D. Then $L = \lim_{x\to c} f$ if and only if for every sequence $\{a_n\}$ of
points in D \ {c} with $c = \lim a_n$, we have $L = \lim_{n\to\infty} f(a_n)$.
Proof. Let $f : D \to \mathbb{R}^m$ and let c be a limit point of D. We first prove that
if $L = \lim_{x\to c} f$, then for any sequence $\{a_n\}$ of points in D \ {c} converging to c,
we have $L = \lim f(a_n)$. To do so, let $\{a_n\}$ be such a sequence and let ε > 0. Since
f has limit L at c, there is a δ > 0 such that
\[ x \in D \ \text{ and } \ 0 < |x - c| < \delta \implies |f(x) - L| < \varepsilon. \]
Since $a_n \to c$ and $a_n \ne c$ for any n, it follows that there is an N such that
\[ n > N \implies 0 < |a_n - c| < \delta. \]
The limit property of f now implies that
\[ n > N \implies |f(a_n) - L| < \varepsilon. \]
Thus, $L = \lim f(a_n)$.
We now prove that if for every sequence $\{a_n\}$ of points in D \ {c} converging
to c, we have $L = \lim f(a_n)$, then $L = \lim_{x\to c} f$. We prove the logically equivalent
contrapositive; that is, if $L \ne \lim_{x\to c} f$, then there is a sequence $\{a_n\}$ of points in
D \ {c} converging to c such that $L \ne \lim f(a_n)$. Now $L \ne \lim_{x\to c} f$ means (negating
the definition $L = \lim_{x\to c} f$) that there is an ε > 0 such that for all δ > 0, there is
an x ∈ D with 0 < |x − c| < δ and |f(x) − L| ≥ ε. Since this statement is true for
all δ > 0, it is in particular true for δ = 1/n for each n ∈ N. Thus, for each n ∈ N,
there is a point $a_n \in D$ with $0 < |a_n - c| < 1/n$ and $|f(a_n) - L| \ge \varepsilon$. It follows
that $\{a_n\}$ is a sequence of points in D \ {c} converging to c and $\{f(a_n)\}$ does not
converge to L. This completes the proof of the contrapositive.
This theorem can be used to prove that certain functions don’t have limits.
Example 4.7. Recall from Section 1.3 the Dirichlet function, named after
Johann Peter Gustav Lejeune Dirichlet (1805–1859): $D : \mathbb{R} \to \mathbb{R}$ is defined by
\[ D(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]
Let c ∈ R. Then as we saw in Example 3.13 of Section 3.2, there is a sequence $\{a_n\}$
of rational numbers converging to c with $a_n \ne c$ for all n. Since $a_n$ is rational we
have $D(a_n) = 1$ for all n ∈ N, so
\[ \lim D(a_n) = \lim 1 = 1. \]
Also, there is a sequence $\{b_n\}$ of irrational numbers converging to c with $b_n \ne c$ for
all n, in which case
\[ \lim D(b_n) = \lim 0 = 0. \]
Therefore, according to our sequence criterion, $\lim_{x\to c} D$ cannot exist.
Example 4.11. Now let q(z) be another polynomial and suppose that q(c) ≠ 0.
Since q(z) has at most finitely many roots (Proposition 2.53), it follows that q(z) ≠ 0
for z sufficiently close to c. Therefore by our algebra of limits, we have
\[ \lim_{z\to c} \frac{p(z)}{q(z)} = \frac{\lim_{z\to c} p(z)}{\lim_{z\to c} q(z)} = \frac{p(c)}{q(c)}. \]
The following theorem is useful when dealing with compositions of functions.
Theorem 4.6 (Composition of limits). Let $f : D \to \mathbb{R}^m$ and $g : C \to \mathbb{R}^p$
where $D \subseteq \mathbb{R}^p$ and $C \subseteq \mathbb{R}^q$ and suppose that g(C) ⊆ D so that $f \circ g : C \to \mathbb{R}^m$ is
defined. Let d be a limit point of D and c a limit point of C and assume that
(1) $d = \lim_{x\to c} g(x)$.
(2) $L = \lim_{y\to d} f(y)$.
(3) Either f(d) = L or d ≠ g(x) for all x ≠ c sufficiently near c.
Then
\[ L = \lim_{x\to c} f \circ g. \]
Proof. Let $\{a_n\}$ be any sequence in C \ {c} converging to c. Then by (1), the
sequence $\{g(a_n)\}$ in D converges to d.
We now consider the two cases in (3). First, if g(x) ≠ d for all x ≠ c sufficiently
near c, then a tail of the sequence $\{g(a_n)\}$ is a sequence in D \ {d} converging to
d, so by (2), $\lim f(g(a_n)) = L$. On the other hand, if it is the case that f(d) = L
then by (2) and the definition of limit it follows that for any sequence $\{b_n\}$ in
D converging to d, we have $\lim f(b_n) = L$. Therefore, in this case we also have
$\lim f(g(a_n)) = L$. In either case, we get $L = \lim_{x\to c} f \circ g$.
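Hypothesis (3) of the composition theorem cannot simply be dropped. The following Python sketch uses a standard counterexample that is not taken from the text: f vanishes except at d = 0, where it equals 1, and g is the constant function 0 with c = 0. Then (1) and (2) hold with L = 0, both alternatives in (3) fail, and f ∘ g is constantly 1.

```python
def f(y):
    """f(y) = 0 for y != 0 and f(0) = 1, so lim_{y->0} f = 0 while f(0) = 1."""
    return 1.0 if y == 0 else 0.0


def g(x):
    """The constant function 0, so lim_{x->0} g = 0 and g(x) = d = 0 for every x."""
    return 0.0


# Sample points approaching 0 without ever being equal to 0.
points = [10.0 ** (-k) for k in range(1, 8)]

outer_values = [f(y) for y in points]         # values of f near d = 0
composite_values = [f(g(x)) for x in points]  # values of f o g near c = 0

print(outer_values)
print(composite_values)
```

The first list is identically 0 (so L = 0), while the second is identically 1, so the limit of f ∘ g at c is 1 rather than L.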
We now finish our limit theorems by considering limits and inequalities. In the
following two theorems, all functions map a subset D ⊆ Rp into R.
Theorem 4.7 (Squeeze theorem). Let f, g, and h be such that f(x) ≤
g(x) ≤ h(x) for all x sufficiently close to a limit point c in D and such that both
limits $\lim_{x\to c} f$ and $\lim_{x\to c} h$ exist and are equal. Then the limit $\lim_{x\to c} g$ also
exists, and
\[ \lim_{x\to c} f = \lim_{x\to c} g = \lim_{x\to c} h. \]
As with the previous two theorems, the squeeze theorem for functions is a direct
consequence of the sequence criterion and the corresponding squeeze theorem for
sequences (Theorem 3.7) and therefore we shall omit the proof. The next theorem
follows (as you might have guessed) from the sequence criterion and the corresponding
preservation of inequalities theorem (Theorem 3.8) for sequences.
4.2. A POTPOURRI OF LIMIT PROPERTIES FOR FUNCTIONS 163
If only one of f (c−) or f (c+) makes sense, then L = limx→c f if and only if
L = f (c−) (when c is only a limit point of D ∩ (−∞, c)) or L = f (c+) (when c is
only a limit point of D ∩ (c, ∞)), whichever makes sense.
We now describe limits at infinity. Suppose that for any real number N there
is a point x ∈ D such that x > N. A function $f : D \to \mathbb{R}^m$ is said to have a limit
$L \in \mathbb{R}^m$ as x → ∞ if for each ε > 0 there is an N ∈ R such that
\[ x \in D \ \text{ and } \ x > N \implies |f(x) - L| < \varepsilon. \tag{4.10} \]
Now suppose that for any real number N there is a point x ∈ D such that x < N.
A function $f : D \to \mathbb{R}^m$ is said to have a limit $L \in \mathbb{R}^m$ as x → −∞ if for each
ε > 0 there is an N ∈ R such that
\[ x \in D \ \text{ and } \ x < N \implies |f(x) - L| < \varepsilon. \tag{4.11} \]
To express these limits at infinity, we use the notations (sometimes with ∞ replaced
by +∞)
\[ L = \lim_{x\to\infty} f, \quad L = \lim_{x\to\infty} f(x), \quad f \to L \text{ as } x \to \infty, \quad \text{or} \quad f(x) \to L \text{ as } x \to \infty; \]
Finally, we discuss infinite limits, which are also called properly divergent limits
of functions.² We now let m = 1 and consider functions f : D → R with D ⊆ R.
Suppose that for any real number N there is a point x ∈ D such that x > N. Then
f is said to diverge to ∞ as x → ∞ if for any real number M > 0 there is an N ∈ R
such that
\[ x \in D \ \text{ and } \ x > N \implies M < f(x). \tag{4.12} \]
Also, f is said to diverge to −∞ as x → ∞ if for any real number M < 0 there is
an N ∈ R such that
\[ x \in D \ \text{ and } \ x > N \implies f(x) < M. \tag{4.13} \]
In either case we say that f is properly divergent as x → ∞ and when f is
properly divergent to ∞ we write
\[ \infty = \lim_{x\to\infty} f, \quad \infty = \lim_{x\to\infty} f(x), \quad f \to \infty \text{ as } x \to \infty, \quad \text{or} \quad f(x) \to \infty \text{ as } x \to \infty; \]
with similar expressions when f properly diverges to −∞. In a very similar manner
we can define properly divergent limits of functions as x → −∞, as x → c, as
x → c−, and x → c+; we leave these other definitions for the reader to formulate.
Let us now consider an example.
Example 4.12. Let a > 1 and let f : Q → R be defined by $f(x) = a^x$
(therefore in this case, D = Q). Here, we recall that $a^x$ is defined for any rational
number x (see Section 2.7). We shall prove that
\[ \lim_{x\to\infty} f = \infty \quad \text{and} \quad \lim_{x\to-\infty} f = 0. \]
In Section 4.6 we shall define $a^x$ for any x ∈ R (in fact, for any complex power)
and we shall establish these same limits with D = R. Before proving these results,
we claim that
\[ \text{for any rational } p < q, \text{ we have } a^p < a^q. \tag{4.14} \]
Indeed, 1 < a and q − p > 0, so by our power rules,
\[ 1 = 1^{q-p} < a^{q-p}, \]
which, after multiplication by $a^p$, gives our claim. We now prove that f → ∞ as
x → ∞. To prove this, we note that since a > 1, we can write a = 1 + b for some
b > 0, so by Bernoulli’s inequality, for any n ∈ N,
\[ a^n = (1 + b)^n \ge 1 + n\,b > n\,b. \tag{4.15} \]
Now fix M > 0. By the Archimedean principle, we can choose N ∈ N such that
N b > M, therefore by (4.15) and (4.14),
\[ x \in \mathbb{Q} \ \text{ and } \ x > N \implies M < N b < a^N < a^x. \]
This proves that f → ∞ as x → ∞. We now show that f → 0 as x → −∞. Let
ε > 0. Then by the Archimedean principle there is an N ∈ N such that
\[ \frac{1}{b\,\varepsilon} < N \implies \frac{1}{N b} < \varepsilon. \]
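The first half of the argument in Example 4.12, choosing N with N b > M and then using (4.15) and (4.14), can be traced on a concrete base. In the Python sketch below the values a = 3/2 and M = 100 and the helper name `threshold` are illustrative assumptions; Bernoulli's inequality is checked exactly with rational arithmetic, while the rational powers are evaluated only approximately in floating point.

```python
from fractions import Fraction
import math


def threshold(a, M):
    """Return a natural number N with N*b > M, where b = a - 1 > 0."""
    b = Fraction(a) - 1
    return math.floor(Fraction(M) / b) + 1


a = Fraction(3, 2)  # sample base a > 1 (illustrative choice)
M = 100             # sample bound (illustrative choice)
b = a - 1
N = threshold(a, M)

# Bernoulli's inequality (4.15) and N*b > M, checked exactly.
bernoulli_ok = a ** N >= 1 + N * b > N * b > M

# A few rational exponents x > N; a**x is evaluated in floating point here.
samples = [N + Fraction(1, 3), N + Fraction(7, 2), N + 10]
all_large = all(float(a) ** float(x) > M for x in samples)

print(N, bernoulli_ok, all_large)
```

The threshold produced this way is far from optimal; the argument only needs some N that works, not the smallest one.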
² I protest against the use of infinite magnitude as something completed, which in mathematics
is never permissible. Infinity is merely a façon de parler, the real meaning being a limit
which certain ratios approach indefinitely near, while others are permitted to increase without
restriction. Carl Friedrich Gauss (1777–1855).
Example 4.14. One last example. Suppose as before that $\lim_{x\to c} g(x) = \infty$
where c is either a real number, ∞, or −∞. Since $\lim_{y\to\infty} 1/y = 0$, by our extended
composition of limits theorem, we have
\[ \lim_{x\to c} \frac{1}{g(x)} = 0. \]
Exercises 4.2.
1. Using the ε-δ definition of (left/right-hand) limit, prove (a) and (b):
(a) $\lim_{x\to 0-} \frac{x}{|x|} = -1$, (b) $\lim_{x\to 0+} \frac{x}{|x|} = 1$. Conclude that $\lim_{x\to 0} \frac{x}{|x|}$ does not exist.
(d) If n > m, prove that if $a_n > 0$, then $\lim_{x\to\infty} p(x)/q(x) = \infty$, and on the other
hand if $a_n < 0$, then $\lim_{x\to\infty} p(x)/q(x) = -\infty$.
4. Let f, g : D → R with D ⊆ R, $\lim_{x\to\infty} f = \infty$, and g(x) ≠ 0 for all x ∈ D. Suppose
that for some real number L, we have
\[ \lim_{x\to\infty} \frac{f}{g} = L. \]
Technically speaking, when we compare (4.16) to the definition of limit, for a limit
we actually require that 0 < |x − c| < δ, but in the case that |x − c| = 0, that is,
x = c, we have |f (x) − f (c)| = |f (c) − f (c)| = 0, which is automatically less than ε,
4.3. CONTINUITY, THOMAE’S FUNCTION, AND VOLTERRA’S THEOREM 167
so the condition that 0 < |x − c| can be dropped. What if c ∈ D is not a limit point
of D? In this case c is called an isolated point in D and by definition of (not
being a) limit point there is an open ball $B_\delta(c)$ such that $B_\delta(c) \cap D = \{c\}$; that is,
the only point of D inside $B_\delta(c)$ is c itself. Hence, with this δ, for any ε > 0, the
condition (4.16) is automatically satisfied:
\[ x \in D \ \text{ and } \ |x - c| < \delta \implies x = c \implies |f(x) - f(c)| = 0 < \varepsilon. \]
Therefore, at isolated points of D, the function f is automatically continuous, and
therefore “boring”. For this reason, if we want to prove theorems concerning the
continuity of f : D −→ Rm at a point c ∈ D, we can always assume that c is a limit
point of D; in this case, we have all the limit theorems from the last section at our
disposal. This is exactly why we spent so much time on learning limits during the
last two sections!
If f is continuous at every point in a subset A ⊆ D, we say that f is contin-
uous on A; in particular, if f is continuous at every point of D, we say that f is
continuous, or continuous on D, to emphasize D:
f is continuous ⇐⇒ for all c ∈ D, f is continuous at c.
We can write the last equality as $f(\lim a_n) = \lim f(a_n)$ since $c = \lim a_n$. Thus,
\[ f\Bigl( \lim_{n\to\infty} a_n \Bigr) = \lim_{n\to\infty} f(a_n); \]
Note that the answer is false if either f or g were not continuous. For example, with
D denoting Dirichlet’s function, D(r) = 1 for all rational numbers r, but D(x) ≠ 1
for all irrational numbers x. See Problem 2 for a related problem.
Next, the composition of limits theorem (Theorem 4.6) implies the following
theorem.
If only one of f (c−) or f (c+) makes sense, then f is continuous at c if and only
if f (c) = f (c−) or f (c) = f (c+), whichever makes sense.
Figure 4.4. The left-hand side shows plots of T (p/q) for q at most
3 and the right shows plots of T (p/q) for q at most 7.
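The points plotted in Figure 4.4 can be generated directly. The Python sketch below assumes the standard definition of Thomae's function on rationals, T(p/q) = 1/q for p/q in lowest terms with q > 0, which is consistent with the heights 1/2 and 1/3 in the figure; the helper names and the plotting window [−1, 2] are assumptions made for illustration.

```python
from fractions import Fraction


def thomae(x):
    """Thomae's function at a rational x (Fraction stores p/q in lowest terms, q > 0)."""
    return Fraction(1, Fraction(x).denominator)


def plot_points(max_q, lo=-1, hi=2):
    """Points (p/q, T(p/q)) with lo <= p/q <= hi and denominator at most max_q."""
    points = set()
    for q in range(1, max_q + 1):
        for p in range(lo * q, hi * q + 1):
            x = Fraction(p, q)
            points.add((x, thomae(x)))
    return sorted(points)


for x, height in plot_points(3):
    print(x, height)
```

Raising `max_q` adds more and more points at smaller and smaller heights, which is the behaviour contrasted by the two panels of the figure.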
|f (y)| ≤ r|f (x)|. Prove that f must have a root, that is, there is a point c ∈ I such
that f (c) = 0.
6. Consider the following function related to Thomae’s function:
\[ t(x) := \begin{cases} q & \text{if } x = p/q \text{ in lowest terms and } q > 0, \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]
Prove that t : R −→ R is discontinuous at every point in R.
7. Here are some fascinating questions related to Volterra’s theorem.
(a) Are there functions f, g : R −→ R that don’t have any continuity points in common,
one that is pointwise discontinuous and the other one that is not (but is continuous
at least at one point)? Give an example or prove there are no such functions.
(b) Is there a continuous function f : R −→ R that maps rationals to irrationals? Give
an example or prove there is no such function.
(c) Is there a continuous function f : R −→ R that maps rationals to irrationals and
irrationals to rationals? Suggestion: Suppose there is such a function and consider
the function T ◦ f where T is Thomae’s function.
Figure 4.5. The finite subcover $\{U_{n_1}, \dots, U_{n_k}\}$ does not cover [0, 1).
we have $\bigcup_{j=1}^{k} U_{n_j} = \bigl( \frac{1}{n_k}, 1 \bigr)$, which does not cover (0, 1) because there is a “gap”
between 0 and $1/n_k$ as seen in Figure 4.5. On the other hand, V does have a
finite subcover, that is, there are finitely many elements of V that will cover [0, 1].
Indeed, [0, 1] is covered by the single element V1 of V because [0, 1] ⊆ (−1, 1 + 1).
This is in fact a general phenomenon for closed and bounded intervals.
Lemma 4.16 (Compactness lemma). Every cover of a closed and bounded
interval by open intervals has a finite subcover.
Proof. Let U be a cover of [a, b] by open intervals. We must show that there
are finitely many elements of U that still cover [a, b]. Let A be the set of all numbers
x in [a, b] such that the interval [a, x] is contained in a union of finitely many sets
in U . Since [a, a] is the single point a, this interval is contained in a single set in
U , so A is not empty. Being a nonempty subset of R bounded above by b, A has
a supremum, say ξ ≤ b. Since ξ belongs to the interval [a, b] and U covers [a, b], ξ
belongs to some open interval (c, d) in the collection U . Choose any real number
η with c < η < ξ. Then η is less than the supremum of A, so [a, η] is covered by
finitely many sets in U , say [a, η] ⊆ U1 ∪ · · · ∪ Uk . Adding Uk+1 := (c, d) to this
collection, it follows that for any real number x with c < x < d, the interval [a, x] is
covered by the finitely many sets U1 , . . . , Uk+1 in U . In particular, since c < ξ < d,
for any x with ξ ≤ x < d, the interval [a, x] is covered by finitely many sets in U ,
so unless ξ = b, the set A would contain a number greater than ξ. Hence, b = ξ
and [a, b] can be covered by finitely many sets in U .
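For a cover that is already given as a finite list of open intervals, the idea of the proof (push the covered stretch [a, x] to the right until it reaches b) becomes a simple sweep. The Python sketch below is an illustration under that assumption only; it is not the supremum argument itself, which handles arbitrary covers, and the name `finite_subcover` is hypothetical.

```python
def finite_subcover(intervals, a, b):
    """Pick open intervals (c, d) from the list `intervals` that together cover [a, b].

    Returns the list of chosen intervals, or None if the family does not cover [a, b].
    """
    chosen = []
    x = a  # leftmost point not yet known to be covered
    while True:
        candidates = [(c, d) for (c, d) in intervals if c < x < d]
        if not candidates:
            return None  # the point x lies in none of the intervals
        best = max(candidates, key=lambda interval: interval[1])
        chosen.append(best)
        if best[1] > b:
            return chosen  # everything in [a, b] is now covered
        x = best[1]  # the right endpoint is not in the open interval, so continue there


cover = [(-1, 0.4), (0.3, 0.7), (0.2, 0.5), (0.6, 1.5), (0.45, 0.65)]
print(finite_subcover(cover, 0, 1))

# Intervals (1/n, 1) never contain the point 0, so they cannot cover [0, 1].
print(finite_subcover([(1 / n, 1) for n in range(2, 50)], 0, 1))
```

Each pass selects an interval reaching strictly further right than the previous one, so on a finite list the loop must stop.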
Because closed and bounded intervals have this finite subcover property, and
therefore behave somewhat like finite sets (which are “compact” — take up little
space), we call such intervals compact. We now move to open sets. An open
set in R is simply a union of open intervals; explicitly, A ⊆ R is open means that
$A = \bigcup_{\alpha} U_{\alpha}$ where each $U_{\alpha}$ is an open interval.
Example 4.23. R = (−∞, ∞) is an open interval, so R is open.
Example 4.24. Any open interval (a, b) is open because it’s a union consisting
of just itself. In particular, if b ≤ a, we have (a, b) = ∅, so ∅ is an open set.
Example 4.25. Another example is R \ Z because $\mathbb{R} \setminus \mathbb{Z} = \bigcup_{n\in\mathbb{Z}} (n, n + 1)$.
A set A ⊆ R is disconnected if there are open sets U and V such that A ∩ U
and A ∩ V are nonempty, disjoint, and have union A. To have union A, we mean
A = (A ∩ U) ∪ (A ∩ V), which is actually equivalent to saying that
A ⊆ U ∪ V.
A set A ⊆ R is connected if it’s not disconnected.
Example 4.26. A = (−1, 0) ∪ (0, 1) is disconnected because A = U ∪ V where
U = (−1, 0) and V = (0, 1) are open, and A ∩ U = (−1, 0) and A ∩ V = (0, 1) are
nonempty, disjoint, and union to A.
Proof. Let ε = d − |f (c)| and, using the definition of continuity, choose δ > 0
such that x ∈ I and |x − c| < δ =⇒ |f (x) − f (c)| < ε. Let Ic = (c − δ, c + δ). Then
given x ∈ I with x ∈ Ic , we have |x − c| < δ, so
|f (x)| ≤ |f (x) − f (c)| + |f (c)| < ε + |f (c)| = d − |f (c)| + |f (c)| = d,
which proves our claim.
An analogous proof shows that if a < f (c) < b, then there is an open interval
Ic containing c such that for all x ∈ I with x ∈ Ic , we have a < f (x) < b. Yet
another analogous proof shows that if f : D −→ Rm with D ⊆ Rp is a continuous
map and |f (c)| < d, then there is an open ball B containing c such that for all
x ∈ D with x ∈ B, we have |f (x)| < d. We’ll leave these generalizations to the
interested reader.
Theorem 4.19 (Boundedness theorem). A continuous real-valued function
on a closed and bounded interval is bounded.
Proof. Let f be a continuous function on a closed and bounded interval I.
Proof I: Assume that f is unbounded; we shall prove that f is not continuous.
Since f is unbounded, for each natural number n there is a point xn in I such that
$|f(x_n)| \ge n$. By the Bolzano-Weierstrass theorem, the sequence $\{x_n\}$ has a convergent
subsequence, say $\{x'_n\}$, that converges to some c in I. By the way the numbers
$x_n$ were chosen, it follows that $|f(x'_n)| \to \infty$, which shows that $f(x'_n) \not\to f(c)$, for
if $f(x'_n) \to f(c)$, then we would have $|f(c)| = \lim |f(x'_n)| = \infty$, an impossibility
because f(c) is a real number. Thus, f is not continuous at c.
Proof II: Given any arbitrary point c in I, we have |f (c)| < |f (c)|+1, so by our
inequality lemma there is an open interval Ic containing c such that for each x ∈ Ic ,
|f (x)| < |f (c)| + 1. The collection of all such open intervals U = {Ic ; c ∈ I} covers
I, so by the compactness lemma, there are finitely many open intervals in U that
cover I, say Ic1 , . . . , Icn . Let M be the largest of the values |f (c1 )|+1, . . . , |f (cn )|+1.
We claim that f is bounded by M on all of I. Indeed, given x ∈ I, since Ic1 , . . . , Icn
cover I, there is an interval Ick containing x. Then,
|f (x)| < |f (ck )| + 1 ≤ M.
Thus, f is bounded.
4.4.3. The max/min value theorem. The geometric content of our second
theorem is that the graph of a continuous function on a closed and bounded interval
must have highest and lowest points. The dots in Figure 4.6 show such extreme
points; note that there are two lowest points in the figure. The simple example
f (x) = x on (0, 1) shows that the max/min theorem does not hold when the interval
is not closed and bounded.
Theorem 4.20 (Max/min value theorem). A continuous real-valued func-
tion on a closed and bounded interval achieves its maximum and minimum values.
That is, if f : I −→ R is a continuous function on a closed and bounded interval I,
then for some values c and d in the interval I, we have
f (c) ≤ f (x) ≤ f (d) for all x in I.
Proof. Define
M := sup{f (x) ; x ∈ I}.
This number is finite by the boundedness theorem. We shall prove that there is a
number d in [a, b] such that f (d) = M . This proves that f achieves its maximum;
a related proof shows that f achieves its minimum.
Proof I: By definition of supremum, for each natural number n, there exists
an $x_n$ in I such that
\[ M - \frac{1}{n} < f(x_n) \le M, \tag{4.20} \]
for otherwise, the value M − 1/n would be a smaller upper bound for {f(x) ; x ∈ I}.
By the Bolzano-Weierstrass theorem, the sequence $\{x_n\}$ has a convergent subsequence
$\{x'_n\}$; let’s say that $x'_n \to d$ where d is in [a, b]. By continuity, we have
$f(x'_n) \to f(d)$. On the other hand, by (4.20) and the squeeze theorem, we have
$f(x_n) \to M$, so $f(x'_n) \to M$ as well. By uniqueness of limits, f(d) = M.
Proof II: Assume, for sake of contradiction, that f(x) < M for all x in I. Let
c be any point in I. Since f(c) < M by assumption, we can choose $\varepsilon_c > 0$ such that
$f(c) + \varepsilon_c < M$, so by our inequality lemma there is an open interval $I_c$ containing
c such that for all x ∈ I with $x \in I_c$, $f(x) < f(c) + \varepsilon_c$. The collection $U = \{I_c \,;\, c \in I\}$
covers I, so by the compactness lemma, there are finitely many open intervals in
U that cover I, say $I_{c_1}, \dots, I_{c_n}$. Let m be the largest of the finitely many values
$f(c_k) + \varepsilon_{c_k}$, k = 1, . . . , n. Then m < M and given any x ∈ I, since $I_{c_1}, \dots, I_{c_n}$
cover I, there is an interval $I_{c_k}$ containing x, which shows that
\[ f(x) < f(c_k) + \varepsilon_{c_k} \le m < M. \]
This implies that M cannot be the supremum of f over I, since m is a smaller
upper bound for f. This gives a contradiction to the definition of M.
4.4.4. The intermediate value theorem. A real-valued function f on an
interval I is said to have the intermediate value property if it attains all its
intermediate values in the sense that if a < b both belong to I, then given any
real number ξ between f (a) and f (b), there is a c in [a, b] such that f (c) = ξ. By
“between” we mean that either f (a) ≤ ξ ≤ f (b) or f (b) ≤ ξ ≤ f (a). Geometrically,
this means that the graph of f can be drawn without “jumps,” that is, without ever
lifting up the pencil. We shall prove the intermediate value theorem, which states
that any continuous function on an interval has the intermediate value property.
See the previous Figure 4.6 for an example where we take, for instance, a = 0 and
b = 1; note for this example that the point c need not be unique (there is another
c′ such that f(c′) = ξ). The function in the introduction to this section shows that
the intermediate value theorem fails when the domain is not an interval.
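The intermediate value property has a well-known numerical counterpart: repeatedly halving [a, b] while keeping ξ between the values at the endpoints locates a point c with f(c) ≈ ξ. The Python sketch below shows this bisection procedure as an illustration; it is not presented as the proof used in the text, and the function name, the tolerance, and the sample function are assumptions.

```python
def bisect_value(f, a, b, xi, tol=1e-12):
    """Approximate some c in [a, b] with f(c) = xi.

    Assumes f is continuous on [a, b] and xi lies between f(a) and f(b).
    """
    fa, fb = f(a) - xi, f(b) - xi
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("xi is not between f(a) and f(b)")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m) - xi
        if fm == 0:
            return m
        if fa * fm < 0:
            b = m           # xi is attained somewhere in [a, m]
        else:
            a, fa = m, fm   # xi is attained somewhere in [m, b]
    return (a + b) / 2


c = bisect_value(lambda x: x ** 3 - x, 0.0, 2.0, 3.0)
print(c, c ** 3 - c)
```

The procedure returns one point where the value ξ is attained; as noted above, such a point need not be unique.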
Before proving the intermediate value theorem we first think a little about
intervals. Note that if I is an interval, bounded or unbounded, open, closed, etc,
then given any points a, b ∈ I with a < b, it follows that every point c between
a and b is also in I. The converse statement: “if A ⊆ R is such that given any
points a, b ∈ A with a < b, every point c between a and b is also in A, then A is an
interval” is “obviously” true. We shall leave its proof to the interested reader.
Lemma 4.21. A set A in R is an interval if and only if given any points a < b
in A, we have [a, b] ⊆ A. Stated another way, A is an interval if and only if given
any two points a, b in A with a < b, all points between a and b also lie in A.
(In fact, some mathematicians might even take this lemma as the definition of
interval.) We are now ready to prove our third important theorem in this section.
4.4. COMPACTNESS, CONNECTEDNESS, AND CONTINUOUS FUNCTIONS 177
imagine and put it back next to the wall. Is there a point on the bent wire whose
distance to the wall is exactly the same as it was originally?
6. (Antipodal point puzzle) Prove that there are, at any given moment, antipodal
points on the earth’s equator that have the same temperature. Here are some steps:
(i) Let a > 0 and let f : [0, 2a] −→ R be a continuous function with f (0) = f (2a).
Show that there exists a point ξ ∈ [0, a] such that f (ξ) = f (ξ + a).
(ii) Using (i) solve our puzzle.
7. Let f : I −→ R and g : I −→ R be continuous functions on a closed and bounded
interval I and suppose that f (x) < g(x) for all x in I.
(a) Prove that there is a constant α > 0 such that f (x) + α < g(x) for all x ∈ I.
(b) Prove that there is a constant β > 1 such that βf (x) < g(x) for all x ∈ I.
(c) Do properties (a) and (b) hold in case I is bounded but not closed (e.g. I = (0, 1)
or I = (0, 1]) or unbounded (e.g. I = R or I = [1, ∞))? In each of these two cases
prove (a) and (b) are true, or give counterexamples.
8. (n-to-one functions) This problem is a continuation of Example 4.29.
(a) Define a (necessarily non-continuous) function f : [0, 1] −→ R that takes on each
value in its range exactly two times.
(b) Prove that there does not exist a function f : [0, 1] −→ R that takes on each value
in its range exactly n times, where n ∈ N with n ≥ 2.
(c) Now what about a function with domain R instead of [0, 1]? Prove that there does
not exist a continuous function f : R −→ R that takes on each value in its range
exactly two times.
(d) Prove that there does not exist a continuous function f : R −→ R that takes on
each value in its range exactly n times, where n ∈ N is even. If n is odd, there does
exist such a function! Draw such a function when n = 3 (try to draw a “zig-zag”
type function). If you’re interested in a formula for a continuous n-to-one function
for arbitrary odd n, try to come up with one or see Wenner [242].
9. Show that a function f : R −→ R can have at most a countable number of strict
maxima. Here, a strict maximum is a point c such that f (x) < f (c) for all x
sufficiently close to c. Suggestion: At each point c where f has a strict maximum,
choose an interval (p, q) containing c where p, q ∈ Q.
The remaining exercises give alternative proofs of the boundedness, max/min, and
intermediate value theorems.
10. (Boundedness, Proof III) We shall give another proof of the boundedness theorem
as follows. Let f be a real-valued continuous function on a closed interval [a, b]. Define
A = {c ∈ [a, b] ; f is bounded on [a, c]}.
If we prove that b ∈ A, then f is bounded on [a, b], which proves our theorem.
(i) Show that a ∈ A and d := sup A exists where d ≤ b. We show that d = b.
(ii) Suppose that d < b. Show that there is an open interval I containing d such that
f is bounded on [a, b] ∩ I, and moreover, for all points c ∈ I with d < c < b, f is
bounded on [a, c]. Derive a contradiction.
11. (Max/min, Proof III) We give another proof of the max/min value theorem as
follows. Let M be the supremum of a real-valued continuous function f on a closed
and bounded interval I. Assume that f (x) < M for all x in I and define
\[ g(x) = \frac{1}{M - f(x)}. \]
Show that g is continuous on I. However, show that g is actually not bounded on I.
Now use the boundedness theorem to arrive at a contradiction.
12. (Max/min, Proof IV) Here’s a proof of the max/min value theorem that is similar
to the proof of the boundedness theorem in Problem 10. For each c ∈ [a, b] define
Mc = sup{f (x) ; x ∈ [a, c]}.
proof that every monotone function has at most countably many discontinuities,
each of which is a jump discontinuity; see Problem 2 for another proof.
Theorem 4.25. A monotone function on R has uncountably many points of
continuity and at most countably many discontinuities, each discontinuity being a
jump discontinuity.
Proof. Assume that f is nondecreasing, the case for a nonincreasing function
is proved in an analogous manner. We know that f is discontinuous at a point
x if and only if f (x+) − f (x−) > 0. Given such a discontinuity point, choose a
rational number $r_x$ in the interval (f (x−), f (x+)). Since f is nondecreasing, given
any two such discontinuity points x < y, we have (see (4.22)) f (x+) ≤ f (y−), so
the intervals (f (x−), f (x+)) and (f (y−), f (y+)) are disjoint. Thus, $r_x \ne r_y$ and to
each discontinuity, we have assigned a unique rational number. It follows that the
set of all discontinuity points of f is in one-to-one correspondence with a subset of
the rationals, and therefore, since a subset of a countable set is countable, the set
of all discontinuity points of f is countable. Since R, which is uncountable, is the
union of the continuity points of f and the discontinuity points of f , the continuity
points of f must be uncountable.
The following is a very simple and useful characterization of continuous mono-
tone functions on intervals.
Theorem 4.26. A monotone function on R is continuous on R if and only if
its range is an interval.
Proof. By the intermediate value theorem, we already know that the range
of any (in particular, a monotone) continuous function on R is an interval. Let
f : R −→ R be monotone and suppose, for concreteness, that f is nondecreasing,
the case for a nonincreasing function being similar. It remains to prove that if the
range of f is an interval, then f is continuous. We shall prove the contrapositive.
So, assume that f is not continuous on R. Then at some point c, we have
f (c−) < f (c+).
Since f is nondecreasing, this inequality implies that either interval (f (c−), f (c)) or
(f (c), f (c+)), whichever is nonempty, is not contained in the range of f . Therefore,
the range of f cannot be an interval.
4.5.2. Monotone inverse theorem. Recall from Section 1.3 that a function
has an inverse if and only if the function is injective, that is, one-to-one. Notice
that a strictly monotone function f : R −→ R is one-to-one since, for instance, if
f is strictly increasing, then x ≠ y, say x < y, implies that f(x) < f(y), which
4.5. MONOTONE FUNCTIONS AND THEIR INVERSES 185
in particular says that f(x) ≠ f(y). Thus a strictly monotone function is one-to-one.
The last result in this section states that a one-to-one continuous function is
automatically strictly monotone. This result makes intuitive sense for if the graph
of the function had a dip in it, the function would not pass the so-called “horizontal
line test” learned in high school.
Theorem 4.27 (Monotone inverse theorem). A one-to-one continuous
function f : R −→ R is strictly monotone, its range is an interval, and it has
a continuous strictly monotone inverse (with the same monotonicity as f ).
Proof. Let f : R −→ R be a one-to-one continuous function. We shall prove
that f is strictly monotone. Fix points x0 < y0. Then f(x0) ≠ f(y0), so either
f(x0) < f(y0) or f(x0) > f(y0). For concreteness, assume that f(x0) < f(y0); the
other case f(x0) > f(y0) can be dealt with analogously. We claim that f is strictly
increasing. Indeed, if this is not the case, then there exist points x1 < y1 such that
f (y1 ) < f (x1 ). Now consider the function g : [0, 1] → R defined by
g(t) = f (ty0 + (1 − t)y1 ) − f (tx0 + (1 − t)x1 ).
Since f is continuous, g is continuous, and
g(0) = f (y1 ) − f (x1 ) < 0 and g(1) = f (y0 ) − f (x0 ) > 0.
Hence by the IVP, there is a c ∈ [0, 1] such that g(c) = 0. This implies that
f (a) = f (b) where a = cx0 + (1 − c)x1 and b = cy0 + (1 − c)y1 . Since f is one-to-
one, we must have a = b; however, this is impossible since x0 < y0 and x1 < y1
implies a < b. This contradiction shows that f must be strictly monotone.
Now let f : R −→ R be a continuous strictly monotone function and let I =
f(R). By Theorem 4.26, we know that I is an interval. We shall prove that
f^{−1} : I −→ R is also a strictly monotone function; then Theorem 4.26 implies
that f^{−1} is continuous. Now suppose, for instance, that f is strictly increasing; we
shall prove that f^{−1} is also strictly increasing. If x < y in I, then we can write
x = f(ξ) and y = f(η) for some ξ and η in R. Since f is strictly increasing, we must
have ξ < η (otherwise ξ ≥ η would give x = f(ξ) ≥ f(η) = y), and
hence, f^{−1}(x) = ξ < η = f^{−1}(y). Thus, f^{−1} is strictly increasing and our proof is
complete.
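To see the monotone inverse theorem at work numerically, here is a minimal Python sketch. It approximates the inverse of a continuous strictly increasing function by bisection. The function names `inverse_by_bisection` and `cube_plus_x`, the sample function x³ + x, and the bracketing interval [0, 3] are illustrative assumptions, not part of the text.

```python
def inverse_by_bisection(f, y, lo, hi, tol=1e-12):
    """Approximate the x in [lo, hi] with f(x) = y.

    Assumes f is continuous and strictly increasing on [lo, hi],
    so the intermediate value theorem gives exactly one such x.
    """
    if not (f(lo) <= y <= f(hi)):
        raise ValueError("y is not in f([lo, hi])")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at mid or to its left
    return (lo + hi) / 2


def cube_plus_x(x):
    # Hypothetical example: strictly increasing and continuous on R.
    return x**3 + x


root = inverse_by_bisection(cube_plus_x, 10.0, 0.0, 3.0)
print(root)  # close to 2, since 2**3 + 2 = 10
```

Evaluating the sketch at two values y1 < y2 returns approximations in the same order, which is the "same monotonicity" statement of the theorem.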
(i) Given any finite number x1 , . . . , xk of points in (a, b), prove that
d(x1 ) + · · · + d(xk ) ≤ f (b) − f (a), where d(x) := f (x+) − f (x−).
(ii) Given any n ∈ N, prove that there are only a finite number of points c ∈ [a, b]
such that f (c+) − f (c−) > 1/n.
(iii) Now prove that f can have at most countably many discontinuities.
3. Let f : R −→ R be a monotone function. Prove that if f happens to also be additive
(see Problem 3 in Exercises 4.3), then f is continuous. Thus, any additive monotone
function is continuous.
4. In this problem we investigate jump functions. Let x_1, x_2, . . . be countably many points
on the real line and let c_1, c_2, . . . be nonzero complex numbers such that $\sum c_n$ is
absolutely convergent. For x ∈ R, the functions
\[ \varphi_\ell(x) = \sum_{x_n < x} c_n \quad \text{and} \quad \varphi_r(x) = \sum_{x_n \le x} c_n \tag{4.23} \]
thus, e.g. for s_n(x) we only sum over those c_k's such that k ≤ n and also x_k < x.
(a) Prove that ϕ_ℓ, ϕ_r : R −→ C are well-defined for all x ∈ R (that is, the two infinite
series (4.23) make sense for all x ∈ R).
(b) If all the c_n's are nonnegative real numbers, prove that ϕ_ℓ and ϕ_r are nondecreasing
functions on R.
(c) If all the c_n's are nonpositive real numbers, prove that ϕ_ℓ and ϕ_r are nonincreasing
functions on R.
5. In this problem we prove that ϕ_r in (4.23) is right-continuous having only jump
discontinuities at x_1, x_2, . . . with the jump at x_n equal to c_n. To this end, let ε > 0. Since
$\sum |c_n|$ converges, by Cauchy's criterion for series, we can choose N so that
\[ \sum_{n \ge N+1} |c_n| < \varepsilon. \tag{4.24} \]
(i) Using (4.24) prove that for δ > 0 sufficiently small, |ϕ_r(x + δ) − ϕ_r(x)| < ε.
(ii) Prove that for any δ > 0,
\[ \varphi_r(x) - \varphi_r(x - \delta) = \sum_{x - \delta < x_n \le x} c_n. \]
If x is not one of the points x_1, . . . , x_N, using (4.24) prove that for δ > 0 sufficiently
small, |ϕ_r(x) − ϕ_r(x − δ)| < ε.
(iii) If x = x_k for some 1 ≤ k ≤ N, prove that |ϕ_r(x) − ϕ_r(x − δ) − c_k| < ε.
(iv) Finally, prove that ϕ_r is right-continuous having only jump discontinuities at
x_1, x_2, . . . with the jump at x_n equal to c_n.
6. Prove that ϕ_ℓ is left-continuous having only jump discontinuities at x_1, x_2, . . . with
the jump at x_n equal to c_n, with the notation given in (4.23).
7. (Generalized Thomae functions) In this problem we generalize Thomae’s function
to arbitrary countable sets. Let A ⊆ R be a countable set.
(a) Define a nondecreasing function on R that is discontinuous exactly on A.
(b) Suppose that A is dense. (Dense is defined in Subsection 4.3.3.) Prove that there
does not exist a function on R that is discontinuous exactly on A^c.
4.6. EXPONENTIALS, LOGS, EULER AND MASCHERONI, AND THE ζ-FUNCTION 187
[Figure: graphs of y = exp x, passing through (0, 1), and y = log x, passing through (1, 0).]
In particular, the right-hand side, being a sum of real numbers, is a real number, so
exp : R −→ R. Of course, this real exponential function shares all of the properties
(1) – (4) as the complex one does. In the following theorem we show that this real-
valued exponential function has the increasing/decreasing properties you learned
about in elementary calculus; see Figure 4.10.
Theorem 4.29 (Properties of the real exponential). The real exponential
function has the following properties:
(1) exp : R −→ (0, ∞) is a strictly increasing continuous bijection. Moreover,
limx→∞ exp(x) = ∞ and limx→−∞ exp(x) = 0.
(2) For any x ∈ R, we have
1 + x ≤ exp(x)
with strict inequality for x ≠ 0, that is, 1 + x < exp(x) for x ≠ 0.
Proof. Observe that
\[ \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \ge 1 + x, \qquad x \ge 0, \]
with strict inequalities for x > 0. In particular, exp(x) > 0 for x ≥ 0 and the
inequality exp(x) ≥ 1 + x shows that limx→∞ exp(x) = ∞. If x < 0, then −x > 0,
so exp(−x) > 0, and therefore by Property (3) of Theorem 3.31,
\[ \exp(x) = \frac{1}{\exp(-x)} > 0. \]
Thus, exp(x) is positive for all x ∈ R and recalling Example 4.14, we see that
\[ \lim_{x \to -\infty} \exp(x) = \lim_{x \to -\infty} \frac{1}{\exp(-x)} = \lim_{x \to \infty} \frac{1}{\exp(x)} = 0. \]
(As a side note, we can also get exp(x) > 0 for all x ∈ R by noting that exp(x) =
exp(x/2) · exp(x/2) = (exp(x/2))².) If x < y, then y − x > 0, so exp(y − x) ≥
1 + (y − x) > 1, and thus,
exp(x) < exp(y − x) · exp(x) = exp(y − x + x) = exp(y).
Thus, exp is strictly increasing on R. The continuity property of exp implies that
exp(R) is an interval and then the limit properties of exp imply that this interval
must be (0, ∞). Thus, exp : R −→ (0, ∞) is onto (since exp(R) = (0, ∞)) and
injective (since exp is strictly increasing) and therefore is a continuous bijection.
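The inequality 1 + x ≤ exp(x) and the strict monotonicity of exp can be spot-checked numerically. The following Python sketch sums the exponential series directly; the helper name `exp_series`, the number of terms, and the sample points are illustrative assumptions, and floating-point sums only approximate the infinite series.

```python
import math


def exp_series(x, terms=60):
    """Partial sum 1 + x + x^2/2! + ... of the exponential series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n/n! into x^(n+1)/(n+1)!
    return total


samples = [-5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0]
values = [exp_series(x) for x in samples]

# exp(x) - (1 + x): zero at x = 0, positive at the other sample points.
gaps = [v - (1 + x) for v, x in zip(values, samples)]
print(gaps)

# The sample values increase with x, as strict monotonicity predicts.
print(values == sorted(values))
```

A finite check like this proves nothing, of course; it only illustrates the statement of Theorem 4.29 at a few points.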
Using a similar technique, one can find series representations for log 3; see Problem
7. Using the above formula for log 2, in Problem 7 you are asked to derive the
following striking expression:
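\[ 2 = \frac{e^{1}}{e^{1/2}} \cdot \frac{e^{1/3}}{e^{1/4}} \cdot \frac{e^{1/5}}{e^{1/6}} \cdot \frac{e^{1/7}}{e^{1/8}} \cdot \frac{e^{1/9}}{e^{1/10}} \cdots . \tag{4.30} \]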
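A quick numerical look at (4.30) may be reassuring. The Python sketch below computes the product of the first n factors e^{1/(2k−1)}/e^{1/(2k)} by adding the exponents first. It assumes that the formula for log 2 referred to above is the alternating harmonic series; the helper name `partial_product` is hypothetical.

```python
import math


def partial_product(n):
    """Product of the first n factors e^(1/(2k-1)) / e^(1/(2k)), k = 1..n."""
    # Adding exponents gives a partial sum of 1 - 1/2 + 1/3 - 1/4 + ...
    exponent = sum(1.0 / (2 * k - 1) - 1.0 / (2 * k) for k in range(1, n + 1))
    return math.exp(exponent)


for n in (1, 10, 1000, 100000):
    print(n, partial_product(n))
```

Each factor exceeds 1, so the partial products increase; the convergence to 2 is slow because the alternating series converges slowly.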
Exercises 4.6.
1. Establish the following properties of exponential functions.
(a) If z_n → z and a_n → a (with z_n, z complex and a_n, a > 0), then $a_n^{z_n} \to a^z$.
(b) If a, b > 0, then for any x < 0, a < b if and only if $a^x > b^x$.
(c) If a, b > 0, then for any complex number z, $a^{-z} = 1/a^z$ and $(a/b)^z = a^z/b^z$.
(d) Prove that for any x ∈ R, $|e^{ix}| = 1$.
2. Let a ∈ R with a ≠ 0 and define $f(x) = x^a$.
(a) If a > 0, prove that f : [0, ∞) −→ R is continuous, strictly increasing, $\lim_{x \to 0+} f = 0$,
and $\lim_{x \to \infty} f = \infty$.
(b) If a < 0, prove that f : (0, ∞) −→ R is continuous, strictly decreasing, $\lim_{x \to 0+} f = \infty$,
and $\lim_{x \to \infty} f = 0$.
3. Establish the following limit properties of the exponential function.
(a) Show that for any natural number n and for any x ∈ R with x > 0 we have
\[ e^x > \frac{x^{n+1}}{(n+1)!}. \]
Use this inequality to prove that for any natural number n,
\[ \lim_{x \to \infty} \frac{x^n}{e^x} = 0. \]
(b) Using (a), prove that for any a ∈ R with a > 0, however large, we have
\[ \lim_{x \to \infty} \frac{x^a}{e^x} = 0. \]
It follows that $e^x$ grows faster than any power (no matter how large) of x. This
limit is usually derived in elementary calculus using L’Hospital’s rule.
4. Let a_1, . . . , a_n be nonnegative real numbers. Recall from Problem 7 in Exercises 2.2
that the arithmetic-geometric mean inequality (AGMI) is the inequality
\[ (a_1 \cdot a_2 \cdots a_n)^{1/n} \le \frac{a_1 + \cdots + a_n}{n}. \]
Prove this inequality by setting a = (a_1 + · · · + a_n)/n, x_k = −1 + a_k/a (so that
a_k/a = 1 + x_k) for k = 1, . . . , n, and using the inequality $1 + x \le e^x$.
5. For any x > 0, derive the following remarkable formula:
\[ \log x = \lim_{n \to \infty} n \left( \sqrt[n]{x} - 1 \right) \qquad \text{(Halley's formula)}, \]
named after the famous Edmond Halley (1656–1742) of Halley’s comet. Suggestion:
Write $\sqrt[n]{x} = e^{(\log x)/n}$ and write $e^{(\log x)/n}$ as a series in (log x)/n.
6. (Cf. [108]) Puzzle: Do there exist irrational numbers α and β such that $\alpha^\beta$ is
irrational? Suggestion: Consider $\alpha^\beta$ and $\alpha^{\beta'}$ where $\alpha = \beta = \sqrt{2}$ and $\beta' = \sqrt{2} + 1$.
7. In this fun problem, we derive some interesting formulas.
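(a) Prove that
\[ \gamma = \sum_{n=1}^{\infty} \left( \frac{1}{n} - \log\left(1 + \frac{1}{n}\right) \right) = 1 + \sum_{n=2}^{\infty} \left( \frac{1}{n} + \log\left(1 - \frac{1}{n}\right) \right) = 1 + \sum_{n=1}^{\infty} \left( \frac{1}{n+1} - \log\left(1 + \frac{1}{n}\right) \right), \]
where γ is the Euler-Mascheroni constant. Suggestion: Think telescoping series.
(b) Using a technique similar to the one we used to derive our formula for log 2, prove that
\[ \log 3 = 1 + \frac{1}{2} - \frac{2}{3} + \frac{1}{4} + \frac{1}{5} - \frac{2}{6} + \frac{1}{7} + \frac{1}{8} - \frac{2}{9} + + - \cdots \]
Can you find a series representation for log 4?
(c) Define $a_n = \frac{e^{1}}{e^{1/2}} \cdots \frac{e^{1/(2n-1)}}{e^{1/(2n)}}$. Prove that 2 = lim a_n.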
8. Following Greenstein [86] (cf. [42]) we establish a “well-known” limit from calculus,
but without using calculus!
(i) Show that log x < x for all x > 0.
(ii) Show that $(\log x)/x < 2/x^{1/2}$ for x > 0. Suggestion: $\log x = 2 \log x^{1/2}$.
196 4. LIMITS, CONTINUITY, AND ELEMENTARY FUNCTIONS
10. In high school you probably learned logarithms with other “bases” besides e. Let a ∈ R
with a > 0 and a ≠ 1. For any x > 0, we define
\[ \log_a x := \frac{\log x}{\log a}, \]
called the logarithm of x to the base a. Note that if a = e, then log_e = log, our
usual logarithm. Here are some of the well-known properties of log_a.
(a) Prove that $x \mapsto \log_a x$ is the inverse function of $x \mapsto a^x$.
(b) Prove that for any x, y > 0, log_a xy = log_a x + log_a y.
(c) Prove that if b > 0 with b ≠ 1 is another base, then for any x > 0,
\[ \log_a x = \frac{\log b}{\log a} \, \log_b x \qquad \text{(Change of base formula).} \]
11. Part (a) of this problem states that a “function which looks like an exponential function
is an exponential function,” while (b) says the same for the logarithm function.
(a) Let f : R −→ R satisfy f (x + y) = f (x) f (y) for all x, y ∈ R; see Problem 4 in
Exercises 4.3. Assume that f is not the zero function. Prove that if f is continuous,
then
$f(x) = a^x$ for all x ∈ R, where a = f(1).
Suggestion: Show that f(x) > 0 for all x. Now there are a couple of ways to proceed.
One way is to first prove that $f(r) = (f(1))^r$ for all rational r (to prove this you
do not require the continuity assumption). A second way is to define h(x) =
log f(x). Prove that h is linear and then apply Problem 3 in Exercises 4.3.
(b) Let g : (0, ∞) −→ R satisfy g(x · y) = g(x) + g(y) for all x, y > 0. Prove that if g
is continuous, then there exists a unique real number c such that
g(x) = c log x for all x ∈ (0, ∞).
12. (Exponentials the “old fashion way”) Fix a > 0 and x ∈ R. In this section we
defined $a^x := \exp(x \log a)$. However, in this problem we shall define real powers the “old
fashion way” via rational sequences. Henceforth we only assume knowledge of rational
powers and we proceed to define them for real powers.
(i) Let {r_n} be a sequence of rational numbers converging to zero. From Section 3.1
we know that $a^{1/n} \to 1$ and $a^{-1/n} = (a^{-1})^{1/n} \to 1$. Let ε > 0 and fix m ∈ N such
that $1 - \varepsilon < a^{\pm 1/m} < 1 + \varepsilon$. Show that if |r_n| < 1/m, then $1 - \varepsilon < a^{r_n} < 1 + \varepsilon$.
Conclude that $a^{r_n} \to 1$. (See Problem 3 in Exercises 3.1 for another proof.)
Suggestion: Recall that for any rationals p < q and real b > 1, we have $b^p < b^q$.
(ii) Let {r_n} be a sequence of rational numbers converging to x. Prove that $\{a^{r_n}\}$ is
a Cauchy sequence, hence it converges to a real number, say ξ. We define $a^x = \xi$.
Prove that this definition makes sense; that is, if $\{r_n'\}$ is any other sequence of
rational numbers converging to x, then $\{a^{r_n'}\}$ also converges to ξ.
(iii) Prove that if x = n ∈ N, then $a^x = a \cdot a \cdots a$ where there are n a’s multiplied
together. Also prove that $a^{-x} = 1/a^n$ and if x = n/m ∈ Q, then $a^x = \sqrt[m]{a^n}$.
Thus, our new definition of powers agrees with the old definition. Finally, show
that for x, y ∈ R,
\[ a^x \cdot a^y = a^{x+y}; \qquad a^x \cdot b^x = (ab)^x; \qquad (a^x)^y = a^{xy}. \]
13. (Logarithms the “old fashion way”) In this problem we define the logarithm the
“old fashion way” using rational sequences. In this problem we assume knowledge of
real powers as presented in the previous problem. Fix a > 0.
(i) Prove that it is possible to define unique integers a_0, a_1, a_2, . . . inductively with
0 ≤ a_k ≤ 9 for k ≥ 1 such that if x_n and y_n are the rational numbers
\[ x_n = a_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_{n-1}}{10^{n-1}} + \frac{a_n}{10^n} \]
and
\[ y_n = a_0 + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_{n-1}}{10^{n-1}} + \frac{a_n + 1}{10^n}, \]
then
\[ e^{x_n} \le a < e^{y_n}. \tag{4.31} \]
Suggestion: Since e > 1, we know that for r ∈ Q, we have the limits $e^r \to \infty$,
respectively 0, as r → ∞, respectively r → −∞.
(ii) Prove that both sequences {x_n} and {y_n} converge to the same value, call it L.
Show that $e^L = a$ where $e^L$ is defined by means of the previous problem. Of
course, L is just the logarithm of a defined in this section.
14. (The Euler-Mascheroni constant II) In this problem we prove that the Euler-
Mascheroni constant exists following [44]. Consider the sequence
\[ a_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n-1} - \log n, \qquad n = 2, 3, \ldots . \]
We shall prove that a_n is nondecreasing and bounded and hence lim a_n exists.
(i) Assuming that the limit lim a_n exists, prove that the limit defining the Euler-
Mascheroni constant also exists and equals lim a_n.
(ii) Using the inequalities in (3.28), prove that
\[ 1 < \frac{e^{1/n}}{(n+1)/n} \quad \text{and} \quad \frac{e^{1/n}}{(n+1)/n} < e^{\frac{1}{n(n+1)}}. \tag{4.32} \]
(iii) Prove that for each n ≥ 2,
\[ a_n = \log\left( \frac{e^{1}}{2/1} \cdot \frac{e^{1/2}}{3/2} \cdots \frac{e^{1/(n-1)}}{n/(n-1)} \right). \]
(iv) Using (iii) and the inequalities in (4.32), prove that {a_n} is strictly increasing such
that 0 < a_n < 1 for all n. Conclude that lim a_n exists.
15. (The Euler-Mascheroni constant III) We prove that the Euler-Mascheroni constant
exists following [16]. For each k ∈ N, define $a_k := e \left(1 + \frac{1}{k}\right)^{-k}$ so that $e = a_k \left(1 + \frac{1}{k}\right)^{k}$.
(i) Prove that
\[ \frac{1}{k} = \log\left(a_k^{1/k}\right) + \log(k+1) - \log k. \]
(ii) Prove that
\[ 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log(n+1) = \log\left( a_1 a_2^{1/2} a_3^{1/3} \cdots a_n^{1/n} \right). \]
(iii) Prove that the sequence $\left\{ \log\left( a_1 a_2^{1/2} \cdots a_n^{1/n} \right) \right\}$ is nondecreasing. Conclude that
if this sequence is bounded, then Euler’s constant exists.
(iv) Prove that
\[ \log\left( a_1 a_2^{1/2} \cdots a_n^{1/n} \right) = \log a_1 + \frac{1}{2} \log a_2 + \cdots + \frac{1}{n} \log a_n \]
\[ < \log\left(1 + \frac{1}{1}\right) + \frac{1}{2} \log\left(1 + \frac{1}{2}\right) + \cdots + \frac{1}{n} \log\left(1 + \frac{1}{n}\right) \]
\[ < \frac{1}{1} + \frac{1}{2} \cdot \frac{1}{2} + \cdots + \frac{1}{n} \cdot \frac{1}{n}. \]
(v) Since the sum of the reciprocals of the squares converges, conclude that the sequence
$\left\{ \log\left( a_1 a_2^{1/2} \cdots a_n^{1/n} \right) \right\}$ is bounded.
7“Cosine, secant, tangent, sine, 3.14159; integral, radical, u dv, slipstick, sliderule, MIT!”
MIT cheer.
4.7. THE TRIG FUNCTIONS, THE NUMBER π, AND WHICH IS LARGER, π e OR eπ ? 199
In the following theorem, we adopt the standard notation of writing sin² z for
(sin z)², etc.8 Here are some well-known trigonometric identities that you memorized
in high school, now proved from the basic definitions and even for complex
variables.
Theorem 4.34 (Basic properties of cosine and sine). Cosine and sine
are continuous functions on C. In particular, restricting to real values, they define
continuous functions on R. Moreover, for any complex numbers z and w,
(1) cos(−z) = cos z, sin(−z) = − sin z,
(2) cos² z + sin² z = 1, (Pythagorean identity)
(3) Addition formulas:
cos(z + w) = cos z cos w − sin z sin w, sin(z + w) = sin z cos w + cos z sin w,
(4) Double angle formulas:
cos(2z) = cos² z − sin² z = 2 cos² z − 1 = 1 − 2 sin² z,
sin(2z) = 2 cos z sin z.
(5) Trigonometric series:9
\[ \cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}, \qquad \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!}, \tag{4.33} \]
\[ = \frac{e^{i(z+w)} + e^{-i(z+w)}}{2} = \cos(z + w). \]
Taking w = −z and using (1) we get the Pythagorean identity:
\[ 1 = \cos 0 = \cos(z - z) = \cos z \cos(-z) - \sin z \sin(-z) = \cos^2 z + \sin^2 z. \]
8 Sin² φ is odious to me, even though Laplace made use of it; should it be feared that sin² φ
might become ambiguous, which would perhaps never occur, or at most very rarely when speaking
of sin(φ²), well then, let us write (sin φ)², but not sin² φ, which by analogy should signify sin(sin φ).
Carl Friedrich Gauss (1777–1855).
9 In elementary calculus, these series are usually derived via Taylor series and are usually
attributed to Sir Isaac Newton (1643–1727) who derived them in his paper “De Methodis Serierum
et Fluxionum” (Method of series and fluxions) written in 1671. However, it is interesting to know
that these series were first discovered hundreds of years earlier by Madhava of Sangamagramma
(1350–1425), a mathematician from the Kerala state in southern India!
We leave the double angle formulas to the reader. To prove (5), we use the power
series for the exponential to compute
\[ e^{iz} + e^{-iz} = \sum_{n=0}^{\infty} \frac{i^n z^n}{n!} + \sum_{n=0}^{\infty} \frac{(-1)^n i^n z^n}{n!}. \]
The terms when n is odd cancel, so
\[ 2 \cos z = e^{iz} + e^{-iz} = 2 \sum_{n=0}^{\infty} \frac{i^{2n} z^{2n}}{(2n)!} = 2 \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!}, \]
where we used the fact that $i^{2n} = (i^2)^n = (-1)^n$. This series converges absolutely
since it is the sum of two absolutely convergent series. The series expansion for
sin z is proved in a similar manner.
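The agreement between the exponential definition of cosine and its power series can be checked numerically at a complex point. The Python sketch below is only an illustration: the helper names `cos_series` and `cos_from_exp`, the number of terms, and the sample point z are assumptions made here.

```python
import cmath


def cos_series(z, terms=40):
    """Partial sum of sum_{n>=0} (-1)^n z^(2n) / (2n)!."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        # Multiply by -z^2 / ((2n+1)(2n+2)) to reach the next term.
        term *= -z * z / ((2 * n + 1) * (2 * n + 2))
    return total


def cos_from_exp(z):
    """Cosine computed as (e^{iz} + e^{-iz}) / 2."""
    return (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2


z = 1.5 - 0.75j
print(cos_series(z))
print(cos_from_exp(z))
print(abs(cos_series(z) - cos_from_exp(z)))  # tiny rounding-level difference
```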
From the series expansion for sin it is straightforward to prove the following
limit from elementary calculus (but now for complex numbers):
\[ \lim_{z \to 0} \frac{\sin z}{z} = 1; \]
see Problem 3. Of course, from the identities in Theorem 4.34, one can derive other
identities such as the so-called half-angle formulas:
\[ \cos^2 z = \frac{1 + \cos 2z}{2}, \qquad \sin^2 z = \frac{1 - \cos 2z}{2}. \]
The other trigonometric functions are defined in terms of sin and cos in the
usual manner:
\[ \tan z = \frac{\sin z}{\cos z}, \qquad \cot z = \frac{1}{\tan z} = \frac{\cos z}{\sin z}, \qquad \sec z = \frac{1}{\cos z}, \qquad \csc z = \frac{1}{\sin z}, \]
and are called the tangent, cotangent, secant, and cosecant, respectively. Note
that these functions are only defined for those complex z for which the expressions
make sense, e.g. tan z is defined only for those z such that cos z ≠ 0. The extra trig
functions satisfy the same identities that you learned in high school, for example,
for any complex numbers z, w, we have
\[ \tan(z + w) = \frac{\tan z + \tan w}{1 - \tan z \tan w}, \tag{4.34} \]
for those z, w such that the denominator is not zero. Setting z = w, we see that
\[ \tan 2z = \frac{2 \tan z}{1 - \tan^2 z}. \]
In Problem 4 we ask you to prove (4.34) and other identities.
Before baking our π, we quickly define the hyperbolic functions. For any complex
number z, we define
\[ \cosh z := \frac{e^{z} + e^{-z}}{2}, \qquad \sinh z := \frac{e^{z} - e^{-z}}{2}; \]
these are called the hyperbolic cosine and hyperbolic sine, respectively. There
are hyperbolic tangents, secants, etc. defined in the obvious manner. Observe
that, by definition, cosh z = cos iz and sinh z = −i sin iz, so after substituting iz
for z in the series for cos and sin, we obtain
\[ \cosh z = \sum_{n=0}^{\infty} \frac{z^{2n}}{(2n)!}, \qquad \sinh z = \sum_{n=0}^{\infty} \frac{z^{2n+1}}{(2n+1)!}. \]
These functions are intimately related to the trig functions and share many of the
same properties; see Problem 8.
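The relations cosh z = cos iz and sinh z = −i sin iz are easy to test numerically. The Python sketch below also sums the odd-power series Σ z^{2n+1}/(2n+1)!, in which every term carries a plus sign (substituting iz into the sine series removes the alternating signs), and compares it with (e^z − e^{−z})/2. The helper name `sinh_series` and the sample point are illustrative assumptions.

```python
import cmath


def sinh_series(z, terms=40):
    """Partial sum of sum_{n>=0} z^(2n+1) / (2n+1)!  (no alternating signs)."""
    total, term = 0j, complex(z)
    for n in range(terms):
        total += term
        # Multiply by z^2 / ((2n+2)(2n+3)) to reach the next term.
        term *= z * z / ((2 * n + 2) * (2 * n + 3))
    return total


z = 0.8 + 1.1j
cosh_gap = abs(cmath.cosh(z) - cmath.cos(1j * z))
sinh_gap = abs(cmath.sinh(z) - (-1j) * cmath.sin(1j * z))
series_gap = abs(sinh_series(z) - (cmath.exp(z) - cmath.exp(-z)) / 2)
print(cosh_gap, sinh_gap, series_gap)  # all at rounding level
```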
4.7.2. The number π and some trig identities. Setting z = x ∈ R into
the series (4.33), we obtain the formulas learned in elementary calculus:
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}, \qquad \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}. \]
In particular, cos, sin : R −→ R. In the following lemma and theorem we shall
consider these real-valued functions instead of the more general complex versions.
The following lemma is the key result needed to define π.
Lemma 4.35. Sine and cosine have the following properties on [0, 2]:
(1) sin is nonnegative on [0, 2] and positive on (0, 2];
(2) cos : [0, 2] −→ R is strictly decreasing with cos 0 = 1 and cos 2 < 0.
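Proof. Since
\[ \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = x \left( 1 - \frac{x^2}{2 \cdot 3} \right) + \frac{x^5}{5!} \left( 1 - \frac{x^2}{6 \cdot 7} \right) + \cdots \]
and each term in the series is positive for 0 < x ≤ 2, we have sin x > 0 for all
0 < x ≤ 2 and sin 0 = 0.
Since
\[ \cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots , \]
we have
\[ \cos 2 = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \left( \frac{2^6}{6!} - \frac{2^8}{8!} \right) - \left( \frac{2^{10}}{10!} - \frac{2^{12}}{12!} \right) - \cdots . \]
All the terms in parentheses are positive because for k ≥ 2, we have
\[ \frac{2^k}{k!} - \frac{2^{k+2}}{(k+2)!} = \frac{2^k}{k!} \left( 1 - \frac{4}{(k+1)(k+2)} \right) > 0. \]
Therefore,
\[ \cos 2 < 1 - \frac{2^2}{2!} + \frac{2^4}{4!} = -\frac{1}{3} < 0. \]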
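The bound cos 2 < −1/3 and the resulting sign change of cosine on [0, 2] can be explored numerically. The Python sketch below sums the cosine series and then locates the sign change by bisection. It assumes, as the later use of cos(π/2) = 0 suggests, that π/2 is the zero of cos in [0, 2]; the helper names `cos_series` and `zero_of_cos` are hypothetical, and `math.pi` is used only for comparison.

```python
import math


def cos_series(x, terms=30):
    """Partial sum of sum_{n>=0} (-1)^n x^(2n) / (2n)! for real x."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total


def zero_of_cos(lo=0.0, hi=2.0, tol=1e-12):
    """Bisection on [lo, hi], using cos(lo) > 0 > cos(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid   # the zero lies to the right of mid
        else:
            hi = mid   # the zero lies at mid or to its left
    return (lo + hi) / 2


print(cos_series(2.0))          # below -1/3, as in the estimate above
half_pi = zero_of_cos()
print(2 * half_pi, math.pi)     # the two values agree to many digits
```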
We now show that cos is strictly decreasing on [0, 2]. Since cos is continuous,
by Theorem 4.27 if we show that cos is one-to-one on [0, 2], then we can conclude
that cos is strictly monotone on [0, 2]; then cos 0 = 1 and cos 2 < 0 tells us that
cos must be strictly decreasing. Suppose that 0 ≤ x ≤ y ≤ 2 and cos x = cos y; we
shall prove that x = y. We already know that sin is nonnegative on [0, 2], so the
identity
\[ \sin^2 x = 1 - \cos^2 x = 1 - \cos^2 y = \sin^2 y \]
implies that sin x = sin y. Therefore,
sin(y − x) = sin y cos x − cos y sin x = sin x cos x − cos x sin x = 0,
[Figure: graphs of y = sin x and y = cos x for −2π ≤ x ≤ 2π, with the x-axis marked at multiples of π/2.]
Figure 4.11. Our definitions of sine and cosine have the same
properties as the ones you learned in high school!
Proof. We know that cos(π/2) = 0 and, by (1) of Lemma 4.35, sin(π/2) > 0,
therefore since
\[ \sin^2(\pi/2) = 1 - \cos^2(\pi/2) = 1, \]
we must have sin(π/2) = 1. The double angle formulas now imply that
\[ \cos(\pi) = \cos^2(\pi/2) - \sin^2(\pi/2) = -1, \qquad \sin(\pi) = 2 \cos(\pi/2) \sin(\pi/2) = 0, \]
and by another application of the double angle formulas, we get
\[ \cos(2\pi) = 1, \qquad \sin(2\pi) = 0. \]
The facts just proved plus the addition formulas for cosine and sine in Property (3)
of Theorem 4.34 imply the last six formulas above; for example,
\[ \cos\left(z + \frac{\pi}{2}\right) = \cos z \cos\frac{\pi}{2} - \sin z \sin\frac{\pi}{2} = -\sin z, \]
and the other formulas are proved similarly. Finally, setting z = π into
\[ \cos\left(z + \frac{\pi}{2}\right) = -\sin z, \qquad \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
proves that cos(3π/2) = 0 and sin(3π/2) = −1.
The last two formulas in Theorem 4.37 (plus an induction argument) imply
that cos and sin are periodic (with period 2π) in the sense that for any n ∈ Z,
(4.35) cos(z + 2πn) = cos z, sin(z + 2πn) = sin z.
Now, substituting z = π into $e^{iz} = \cos z + i \sin z$ and using that cos π = −1 and
sin π = 0, we get $e^{i\pi} = -1$, or by bringing −1 to the left we get perhaps the most
important equation in all of mathematics (at least to some mathematicians!):10
\[ e^{i\pi} + 1 = 0. \]
In one shot, this single equation contains the five “most important” constants in
mathematics: 0, the additive identity, 1, the multiplicative identity, i, the imag-
inary unit, and the constants e, the base of the exponential function, and π, the
fundamental constant of geometry.
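Both Euler's identity and the periodicity formulas (4.35) can be observed in floating-point arithmetic, up to rounding. The Python sketch below uses the standard `cmath` module; the sample point z and the range of n are arbitrary choices made for illustration.

```python
import cmath
import math

# e^{i*pi} + 1 is zero up to rounding error in the floating-point value of pi.
euler = cmath.exp(1j * math.pi) + 1
print(abs(euler))

# Periodicity (4.35) at a complex sample point, for n = -3, ..., 3.
z = 0.3 + 0.4j
max_drift = max(
    abs(cmath.cos(z + 2 * math.pi * n) - cmath.cos(z))
    + abs(cmath.sin(z + 2 * math.pi * n) - cmath.sin(z))
    for n in range(-3, 4)
)
print(max_drift)
```

The printed numbers are tiny rather than exactly zero because `math.pi` is only a rational approximation of π.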
Now consider the following theorem, which essentially states that the graphs
of cosine and sine go “up and down” as you think they should; see Figure 4.11.
10 [after proving Euler’s formula $e^{i\pi} = -1$ in a lecture] Gentlemen, that is surely true, it
is absolutely paradoxical; we cannot understand it, and we don’t know what it means. But we
have proved it, and therefore we know it is the truth. Benjamin Peirce (1809–1880). Quoted in
E Kasner and J Newman [110].
Theorem 4.38 (Oscillation theorem). On the interval [0, 2π], the following
monotonicity properties of cos and sin hold:
(1) cos decreases from 1 to −1 on [0, π] and increases from −1 to 1 on [π, 2π].
(2) sin increases from 0 to 1 on [0, π/2] and increases from −1 to 0 on [3π/2, 2π],
and decreases from 1 to −1 on [π/2, 3π/2].
Proof. From Lemma 4.35 we know that cos is strictly decreasing from 1 to
0 on [0, π/2] and from this same lemma we know that sin is positive on (0, π/2).
Therefore by the Pythagorean identity,
\[ \sin x = \sqrt{1 - \cos^2 x} \]
on [0, π/2]. Since cos is nonnegative and strictly decreasing on [0, π/2], this formula
implies that sin is strictly increasing on [0, π/2]. Replacing z by x − π/2 in the
formulas
\[ \cos\left(z + \frac{\pi}{2}\right) = -\sin z, \qquad \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
found in Theorem 4.37 gives the new formulas
\[ \cos x = -\sin\left(x - \frac{\pi}{2}\right), \qquad \sin x = \cos\left(x - \frac{\pi}{2}\right). \]
The first of these new formulas plus the fact that sin is increasing on [0, π/2] show
that cos is decreasing on [π/2, π], while the second of these formulas plus the fact
that cos is decreasing on [0, π/2] show that sin is also decreasing on [π/2, π]. Finally,
the formulas
cos x = − cos(x − π), sin x = − sin(x − π),
also obtained as a consequence of Theorem 4.37, and the monotone properties al-
ready established for cos and sin on [0, π], imply the rest of the monotone properties
in (1) and (2) of cos and sin on [π, 2π].
In geometric terms, the following theorem states that as θ moves from 0 to
2π, the point f(θ) = (cos θ, sin θ) in R² moves around the unit circle. (However,
because we like complex notation, we shall write (cos θ, sin θ) as the complex number
cos θ + i sin θ = eiθ in the theorem.)
Theorem 4.39 (π and the unit circle). For a real number θ, define
f (θ) := eiθ = cos θ + i sin θ.
Then f : R −→ C is a continuous function and has range equal to the unit circle
S¹ := {(a, b) ∈ R² ; a² + b² = 1} = {z ∈ C ; |z| = 1}.
Moreover, for each z ∈ S¹ there exists a unique θ with 0 ≤ θ < 2π such that
f (θ) = z. Finally, f (θ) = f (φ) if and only if θ − φ is an integer multiple of 2π.
Proof. Since the exponential function is continuous, so is the function f, and
by the Pythagorean identity, cos² θ + sin² θ = 1, so we also know that f maps into
the unit circle. Given z in the unit circle, we can write z = a + ib where a² + b² = 1.
We prove that there exists a unique 0 ≤ θ < 2π such that f(θ) = z, that is, such
that cos θ = a and sin θ = b. Now either b ≥ 0 or b < 0. Assume that b ≥ 0; the case
when b < 0 is proved in a similar way. Since, according to Theorem 4.38, sin θ < 0
for all π < θ < 2π, and we are assuming b ≥ 0, there is no θ with π < θ < 2π
such that f(θ) = z. Hence, we just have to show there is a unique θ ∈ [0, π] such
that f(θ) = z. Since a² + b² = 1, we have −1 ≤ a ≤ 1 and 0 ≤ b ≤ 1. Since cos
strictly decreases from 1 to −1 on [0, π], by the intermediate value theorem there
is a unique value θ ∈ [0, π] such that cos θ = a. The identity
\[ \sin^2 \theta = 1 - \cos^2 \theta = 1 - a^2 = b^2, \]
and the fact that sin θ ≥ 0, because 0 ≤ θ ≤ π, imply that b = sin θ.
We now prove the last assertion of our theorem. Let θ and φ be real numbers
and suppose that f (θ) = f (φ). Let n be the unique integer such that
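\[ n \le \frac{\theta - \phi}{2\pi} < n + 1. \]
Multiplying everything by 2π and subtracting 2πn, we obtain
\[ 0 \le \theta - \phi - 2\pi n < 2\pi. \]
By periodicity (see (4.35)),
\[ f(\theta - \phi - 2\pi n) = f(\theta - \phi) = e^{i(\theta - \phi)} = e^{i\theta} e^{-i\phi} = f(\theta)/f(\phi) = 1. \]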
Since θ − φ − 2πn is in the interval [0, 2π) and f (0) = 1 also, by the uniqueness
we proved in the previous paragraph, we conclude that θ − φ − 2πn = 0. This
completes the proof of the theorem.
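The case analysis in the proof translates into a short computation. The Python sketch below finds the angle θ in [0, 2π) for a point z = a + ib on the unit circle by following the two cases b ≥ 0 and b < 0. It uses the library function `math.acos` as a stand-in for the intermediate value argument on [0, π]; the helper name `angle_in_0_2pi` is hypothetical.

```python
import cmath
import math


def angle_in_0_2pi(z):
    """Return the theta in [0, 2*pi) with cos(theta) = Re z, sin(theta) = Im z.

    Assumes |z| = 1 (up to rounding).
    """
    a, b = z.real, z.imag
    a = max(-1.0, min(1.0, a))       # guard against rounding just outside [-1, 1]
    theta = math.acos(a)             # the unique theta in [0, pi] with cos(theta) = a
    if b >= 0:
        return theta                 # sin(theta) >= 0 on [0, pi]
    return 2 * math.pi - theta       # same cosine, negative sine


for t in (0.0, 1.0, 2.5, 4.0, 6.0):
    z = cmath.exp(1j * t)
    print(t, angle_in_0_2pi(z))      # the recovered angle is close to t
```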
We now solve trigonometric equations. Notice that Property (2) of the following
theorem shows that cos vanishes at exactly π/2 and all its π translates and (3) shows
that sin vanishes at exactly all integer multiples of π, again, well-known facts from
high school! However, we consider complex variables instead of just real variables.
Theorem 4.40. For complex numbers z and w,
(1) $e^z = e^w$ if and only if z = w + 2πin for some integer n.
(2) cos z = 0 if and only if z = nπ + π/2 for some integer n.
(3) sin z = 0 if and only if z = nπ for some integer n.
Proof. The “if” statements follow from Theorem 4.37 so we are left to prove
the “only if” statements. Suppose that $e^z = e^w$. Then $e^{z-w} = 1$. Hence, it suffices
to prove that $e^z = 1$ implies that z is an integer multiple of 2πi. Let z = x + iy for
real numbers x and y. Then,
\[ 1 = |e^{x+iy}| = |e^x e^{iy}| = e^x. \]
Since the exponential function on the real line is one-to-one, it follows that x = 0.
Now the equation $1 = e^z = e^{iy}$ implies, by Theorem 4.39, that y must be an integer
multiple of 2π. Hence, z = x + iy = iy is an integer multiple of 2πi.
Assume that sin z = 0. Then by definition of sin z, we have $e^{iz} = e^{-iz}$. By (1),
we have iz = −iz + 2πin for some integer n. Solving for z, we get z = πn. Finally,
the identity
\[ \sin\left(z + \frac{\pi}{2}\right) = \cos z \]
and the result already proved for sine show that cos z = 0 implies that z = nπ + π/2
for some integer n.
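The three statements of Theorem 4.40 can be illustrated numerically with the standard `cmath` module. In the Python sketch below the sample point z, the ranges of n, and the off-axis test point π/2 + i are arbitrary choices; the computed values at the zeros are only small, not exactly zero, because π is represented approximately.

```python
import cmath
import math

z = 0.7 - 1.3j

# (1): shifting z by 2*pi*i*n does not change e^z.
exp_gap = max(abs(cmath.exp(z + 2j * math.pi * n) - cmath.exp(z)) for n in range(-2, 3))

# (2) and (3): cos vanishes at n*pi + pi/2 and sin vanishes at n*pi.
cos_at_zeros = max(abs(cmath.cos(n * math.pi + math.pi / 2)) for n in range(-3, 4))
sin_at_zeros = max(abs(cmath.sin(n * math.pi)) for n in range(-3, 4))

# Moving off the real axis destroys the zero: |cos(pi/2 + i)| = sinh(1).
off_axis = abs(cmath.cos(math.pi / 2 + 1j))

print(exp_gap, cos_at_zeros, sin_at_zeros, off_axis)
```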
As a corollary of this theorem we see that the domain of tan z = sin z/ cos z
and sec z = 1/ cos z consists of all complex numbers except odd integer multiples of π/2.
[Figure: a point z = (x, y) at distance r from the origin and at angle θ, labeled with cos θ = x/r and sin θ = y/r.]
Exercises 4.7.
1. Here are some values of the trigonometric functions.
(a) Find sin i, cos i, and tan(1 + i) (in terms of e and i).
(b) Using various trig identities (no triangles allowed!), prove the following well-known
values of sine and cosine: sin(π/4) = cos(π/4) = 1/√2, sin(π/6) = cos(π/3) = 1/2,
and sin(π/3) = cos(π/6) = √3/2.
(c) Using trig identities, find sin(π/8) and cos(π/8).
2. In this problem we find a very close estimate of π. Prove that for 0 < x < 2, we have
\[ \cos x < 1 - \frac{x^2}{2} + \frac{x^4}{24}. \]
Use this fact to prove that $3/2 < \pi/2 < \sqrt{6 - 2\sqrt{3}}$, which implies that
$3 < \pi < 2\sqrt{6 - 2\sqrt{3}} \approx 3.185$. We’ll get a much better estimate in Section 4.10.
3. Using the series representations (4.33) for sin z and cos z, find the limits
(a) For z, w ∈ C,
2 sin z sin w = cos(z − w) − cos(z + w),
2 cos z cos w = cos(z − w) + cos(z + w),
2 sin z cos w = sin(z + w) + sin(z − w),
tan z + tan w
tan(z + w) = ,
1 − tan z tan w
1 + tan² z = sec² z, cot² z + 1 = csc² z.
where ⌊t⌋ is the greatest integer less than or equal to t ∈ R. Suggestion: Expand
the left-hand side of de Moivre’s formula using the binomial theorem.
(c) Prove that
\[ \sin^2 \frac{\pi}{5} = \frac{5 - \sqrt{5}}{8}, \qquad \cos^2 \frac{\pi}{5} = \frac{3 + \sqrt{5}}{8}, \qquad \cos \frac{\pi}{5} = \frac{1 + \sqrt{5}}{4}. \]
Suggestion: What if you consider x = π/5 and n = 5 in the equation for sin nx in
Part (b)?
5. Prove that for 0 ≤ r < 1 and θ ∈ R,
\[ \sum_{n=0}^{\infty} r^n \cos(n\theta) = \frac{1 - r \cos\theta}{1 - 2r \cos\theta + r^2}, \qquad \sum_{n=1}^{\infty} r^n \sin(n\theta) = \frac{r \sin\theta}{1 - 2r \cos\theta + r^2}. \]
Suggestion: Let $z = re^{i\theta}$ in the geometric series $\sum_{n=0}^{\infty} z^n$.
6. Prove that if e < β, then $\beta^e < e^\beta$.
7. Here’s a very neat problem posed by D.J. Newman [156].
(i) Prove that
\[ \lim_{n \to \infty} n \sin(2\pi e \, n!) = 2\pi, \]
where e is Euler’s number. Suggestion: Start by multiplying
$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \frac{1}{(n+1)!} + \cdots$
by 2πn! and see what happens.
(ii) Prove, using (i), that e is irrational.
8. (Hyperbolic functions) In this problem we study the hyperbolic functions.
(a) Show that
\begin{align*}
\sinh(z + w) &= \sinh z \cosh w + \cosh z \sinh w, \\
\cosh(z + w) &= \cosh z \cosh w + \sinh z \sinh w, \\
\sinh(2z) &= 2 \cosh z \sinh z, \qquad \cosh^2 z - \sinh^2 z = 1.
\end{align*}
4.8.1. Our first proof of the FTA. Our first proof is found in the article by
Remmert [186]. This proof could have actually been presented immediately after
Section 4.4, but we have chosen to save the proof till now because it fits so well
with roots of complex numbers that we’ll touch on in Section 4.8.3.
Given n ∈ N and z ∈ C, a complex number ξ is called an n-th root of z
if $\xi^n = z$. A natural question is: Does every z ∈ C have an n-th root? Notice
that if z = 0, then ξ = 0 is an n-th root of z and is the only n-th root (since
a nonzero number cannot be an n-th root of 0 because the product of nonzero
complex numbers is nonzero). Thus, for existence purposes we may assume that z
is nonzero. Now certainly if n = 1, then z has a first root, namely ξ = z. If n = 2
and if z is a real positive number, then we know z has a square root, and if z is a
real negative number, then $\xi = i\sqrt{-z}$ is a square root of z. If z = a + ib, where
b ≠ 0, then the numbers
\[ \xi = \pm\left( \sqrt{\frac{|z| + a}{2}} + i\,\frac{b}{|b|}\sqrt{\frac{|z| - a}{2}} \right) \]
are square roots of z, as the reader can easily verify; see Problem 8. What about
higher order roots for nonzero complex numbers? In the following lemma we prove
that any complex number has an n-th root. In Subsection 4.8.3 we’ll give another
proof of this lemma using facts about exponential and trigonometric functions de-
veloped in the previous sections. However, the following proof is interesting because
it is completely elementary in that it avoids any reference to these functions.
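The square root formula stated above for z = a + ib with b ≠ 0 is easy to check numerically. The following is a minimal sketch in Python, not part of the text; the function name `sqrt_pair` is a hypothetical helper chosen for illustration, and the check simply squares the result.

```python
import math


def sqrt_pair(z: complex) -> tuple[complex, complex]:
    """Return both square roots of z = a + ib (b != 0) using the
    algebraic formula xi = +/-( sqrt((|z|+a)/2) + i (b/|b|) sqrt((|z|-a)/2) ).

    Hypothetical helper for illustration; no trigonometric functions are used.
    """
    a, b = z.real, z.imag
    if b == 0:
        raise ValueError("the formula is stated for b != 0")
    r = abs(z)  # |z| = sqrt(a^2 + b^2)
    xi = complex(math.sqrt((r + a) / 2), (b / abs(b)) * math.sqrt((r - a) / 2))
    return xi, -xi


if __name__ == "__main__":
    for z in (3 + 4j, -3 - 4j, 1j, 2 - 5j):
        xi, minus_xi = sqrt_pair(z)
        # Both candidates should square back to z up to rounding error.
        print(z, xi, abs(xi * xi - z), abs(minus_xi * minus_xi - z))
```

For z = 3 + 4i we have |z| = 5, so the formula gives ξ = ±(2 + i), and indeed (2 + i)² = 3 + 4i.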
Lemma 4.41. Any complex number has an n-th root.
Proof. Let z ∈ C, which we may assume is nonzero. We shall prove that z
has an n-th root using strong induction. We already know that z has n-th roots for
n = 1, 2. Let n > 2 and assume that z has roots for all natural numbers less than
n; we shall prove that z has an n-th root.
Suppose first that n is even, say n = 2m for some natural number m ≥ 2. Then
we are looking for a complex number ξ such that $\xi^{2m} = z$. By our discussion before
this lemma, we know that there is a number η such that $\eta^2 = z$ and since m < n,
11In plain English, a polynomial with real coefficients. You can find a beautiful translation
of Gauss’ thesis by Ernest Fandreyer at http://www.fsc.edu/library/documents/Theorem.pdf.
Gauss’ proof was actually incorrect, but he published a correct version in 1816.
4.8. F THREE PROOFS OF THE FUNDAMENTAL THEOREM OF ALGEBRA (FTA) 211
We now present our first proof of the celebrated fundamental theorem of alge-
bra. The following proof is a very elementary proof of Gauss’ famous result in the
sense that looking through the proof, we see that the nontrivial results we use are
kept at a minimum:
(1) The Bolzano-Weierstrass theorem.
(2) Any nonzero complex number has a k-th root.
For other presentations of basically the same proof, see [69], [222], [191], or (one
of my favorites) [185].
Theorem 4.42 (The fundamental theorem of algebra, Proof I). Any
complex polynomial of positive degree has at least one complex root.
Proof. Let $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$ be a polynomial with
complex coefficients, n ≥ 1 with $a_n \ne 0$. We prove this theorem in four steps.
Step 1: We begin by proving a simple, but important, inequality. Since
\[ |p(z)| = |a_n z^n + \cdots + a_0| = |z|^n \left| a_n + \frac{a_{n-1}}{z} + \frac{a_{n-2}}{z^2} + \cdots + \frac{a_1}{z^{n-1}} + \frac{a_0}{z^n} \right|, \]
for |z| sufficiently large the absolute value of the sum of all the terms to the right
of $a_n$ can be made less than, say, $|a_n|/2$. Therefore,
\[ |p(z)| \ge \frac{|a_n|}{2} \cdot |z|^n, \quad \text{for } |z| \text{ sufficiently large.} \tag{4.38} \]
Step 2: We now prove that there exists a point c ∈ C such that |p(c)| ≤ |p(z)|
for all z ∈ C. The proof of this involves the Bolzano-Weierstrass theorem. Define
\[ m := \inf A, \qquad A := \{ |p(z)| \,;\, z \in \mathbb{C} \}. \]
This infimum certainly exists since A is nonempty and bounded below by zero.
Since m is the greatest lower bound of A, for each k ∈ N, m + 1/k is no longer a
lower bound, so there is a point $z_k$ ∈ C such that $m \le |p(z_k)| < m + 1/k$. By (4.38),
the sequence $\{z_k\}$ must be bounded, so by the Bolzano-Weierstrass theorem, this
sequence has a convergent subsequence $\{w_k\}$. If c is the limit of this subsequence,
then by continuity of polynomials, $|p(w_k)| \to |p(c)|$. Since each $w_k$ is one of the $z_j$ with j ≥ k,
we have $m \le |p(w_k)| < m + 1/k$ for all k, so by the squeeze theorem we must have |p(c)| = m.
Step 3: The rest of the proof involves showing that the minimum m must be
zero, which shows that p(c) = 0, and so c is a root of p(z). To do so, we introduce
an auxiliary polynomial q(z) as follows. Let us suppose, for sake of contradiction,
that p(c) ≠ 0. Define q(z) := p(z + c)/p(c). Then |q(z)| has a minimum at the
point z = 0, the minimum being |q(0)| = |1| = 1. Since q(0) = 1, we can write
\[ q(z) = b_n z^n + \cdots + 1 = b_n z^n + \cdots + b_k z^k + 1, \tag{4.39} \]
where k is the smallest natural number such that $b_k \ne 0$. In our next step we shall
prove that 1 is in fact not the minimum of |q(z)|, which gives a contradiction.
Step 4: By our lemma, $-1/b_k$ has a k-th root a, so that $a^k = -1/b_k$. Then
|q(az)| also has a minimum at z = 0, and
\[ q(az) = 1 + b_k (az)^k + \cdots = 1 - z^k + \cdots, \]
where · · · represents terms of higher degree than k. Thus, we can write
\[ q(az) = 1 - z^k + z^{k+1} r(z), \]
where r(z) is a polynomial of degree at most n − (k + 1). Let z = x, a real number
with 0 < x < 1, be so small that x |r(x)| < 1. Then,
\[ e^{in\varphi} = e^{i\theta}, \]
which holds if and only if nφ = θ + 2πm for some integer m, or
\[ \varphi = \frac{\theta}{n} + \frac{2\pi m}{n}, \quad m \in \mathbb{Z}. \]
As the reader can easily check, any number of this form differs by an integer multiple
of 2π from one of the following numbers:
\[ \frac{\theta}{n},\quad \frac{\theta}{n} + \frac{2\pi}{n},\quad \frac{\theta}{n} + \frac{4\pi}{n},\quad \ldots,\quad \frac{\theta}{n} + \frac{2\pi}{n}(n - 1). \]
No two of these numbers differ by an integer multiple of 2π; therefore, by our knowl-
edge of π and the unit circle, all the n numbers
\[ e^{i\frac{1}{n}(\theta + 2\pi k)}, \quad k = 0, 1, 2, \ldots, n - 1, \]
are distinct. Thus, there are a total of n solutions ξ to the equation $\xi^n = z$, all of
them given in the following theorem.
Theorem 4.46 (Existence of complex n-th roots). There are exactly n
n-th roots of any nonzero complex number $z = re^{i\theta}$; the complete set of roots is
given by
\[ \sqrt[n]{r}\, e^{i\frac{1}{n}(\theta + 2\pi k)} = \sqrt[n]{r} \left[ \cos\frac{1}{n}(\theta + 2\pi k) + i \sin\frac{1}{n}(\theta + 2\pi k) \right], \quad k = 0, 1, 2, \ldots, n - 1. \]
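The formula in Theorem 4.46 translates directly into a short computation. The sketch below is an illustration only, assuming Python's standard `cmath` module; the function name `nth_roots` is hypothetical. It writes z in polar form and lists the n roots for k = 0, 1, . . . , n − 1.

```python
import cmath
import math


def nth_roots(z: complex, n: int) -> list[complex]:
    """All n-th roots of a nonzero complex number z, following Theorem 4.46:
    r^(1/n) * exp(i (theta + 2 pi k) / n) for k = 0, ..., n-1.

    Hypothetical helper for illustration.
    """
    if z == 0 or n < 1:
        raise ValueError("z must be nonzero and n must be a natural number")
    r, theta = cmath.polar(z)      # z = r e^{i theta}, theta in [-pi, pi]
    root_r = r ** (1.0 / n)        # the real n-th root of r > 0
    return [cmath.rect(root_r, (theta + 2 * math.pi * k) / n) for k in range(n)]


if __name__ == "__main__":
    print(nth_roots(-1, 2))   # approximately i and -i
    print(nth_roots(1, 3))    # the cube roots of unity
```

The root with k = 0 uses the angle θ/n with θ the angle returned by `cmath.polar`, so it is the first entry of the returned list.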
There is a very convenient way to write these n-th roots as we now describe.
First of all, notice that
\[ \sqrt[n]{r}\, e^{i\frac{1}{n}(\theta + 2\pi k)} = \sqrt[n]{r}\, e^{i\frac{\theta}{n}} \cdot e^{i\frac{2\pi k}{n}} = \sqrt[n]{r}\, e^{i\frac{\theta}{n}} \cdot \left( e^{i\frac{2\pi}{n}} \right)^k. \]
Note that if z = x > 0 is a positive real number, then $x = re^{i0}$ with r = x and
−π < 0 ≤ π, so the principal n-th root of x is just $\sqrt[n]{x}\, e^{i0/n} = \sqrt[n]{x}$, the usual
real n-th root of x. Thus, there is no ambiguity in notation between the complex
principal n-th root of a positive real number and its real n-th root.
We now give some examples.
Example 4.34. For our first example, we find the square roots of −1. Since
$-1 = e^{i\pi}$, because cos π + i sin π = −1 + i0, the square roots of −1 are $e^{i(1/2)\pi}$ and
$e^{i(1/2)(\pi + 2\pi)} = e^{i3\pi/2}$. Writing these numbers in terms of sine and cosine, we get i
and −i as the square roots of −1. Note that the principal square root of −1 is i
and so $\sqrt{-1} = i$, just as we learned in high school!
Example 4.35. Next let us compute the n-th roots of unity, that is, 1. Since
$1 = 1\, e^{i0}$, all the n n-th roots of 1 are given by
\[ 1, \omega, \omega^2, \ldots, \omega^{n-1}, \quad \text{where } \omega := e^{i\frac{2\pi}{n}} = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}. \]
Consider n = 4. In this case, $\cos\frac{2\pi}{4} + i\sin\frac{2\pi}{4} = i$, $i^2 = -1$, and $i^3 = -i$; therefore
the fourth roots of unity are
\[ 1, \ i, \ -1, \ -i. \]
Since
\[ \cos\frac{2\pi}{3} + i \sin\frac{2\pi}{3} = -\frac{1}{2} + i\frac{\sqrt{3}}{2}, \]
the cube roots of unity are
\[ 1, \quad -\frac{1}{2} + i\frac{\sqrt{3}}{2}, \quad -\frac{1}{2} - i\frac{\sqrt{3}}{2}. \]
4.8.4. Our third proof of the FTA. We are now ready to give our third
proof of the FTA.
Theorem 4.47 (The fundamental theorem of algebra, Proof III). Any
complex polynomial of positive degree has at least one complex root.
Proof. We proceed, without changing a single word, exactly as in Proof I up
to Step 4, where we use the following instead.
Step 4 modified: At the beginning of Step 4 in Proof I, we used Lemma
4.41 to conclude that there is a complex a such that $a^k = -1/b_k$. Now we can
simply invoke Theorem 4.46 to verify that there is such a number a. Explicitly, we
can just write $-1/b_k = re^{i\theta}$ and simply define $a = r^{1/k} e^{i\theta/k}$. In any case, now that
we have such an a, we can proceed exactly as in Step 4 of Proof I to finish the
proof.
Exercises 4.8.
1. Let p(z) and q(z) be polynomials of degree at most n.
(a) If p vanishes at n + 1 distinct complex numbers, prove that p = 0, the zero poly-
nomial.
(b) If p and q agree at n + 1 distinct complex numbers, prove that p = q.
(c) If c1 , . . . , cn (with each root repeated according to multiplicity) are roots of p(z),
a polynomial of degree n, prove that p(z) = an (z − c1 )(z − c2 ) · · · (z − cn ) where
an is the coefficient of z n in the expression for p(z).
2. Find the following roots and state which of the roots represents the principal root.
(a) Find the cube roots of −1.
(b) Find the square roots of i.
(c) Find the cube roots of i.
(d) Find the square roots of $\sqrt{3} + 3i$.
3. Geometrically (not rigorously) demonstrate that the n-th roots, with n ≥ 3, of a
nonzero complex number z are the vertices of a regular polygon.
4. Let n ∈ N and let $\omega = e^{i\frac{2\pi}{n}}$. If k is any integer that is not a multiple of n, prove that
\[ 1 + \omega^k + \omega^{2k} + \omega^{3k} + \cdots + \omega^{(n-1)k} = 0. \]
5. Prove by “completing the square” that any quadratic equation $az^2 + bz + c = 0$ with
complex coefficients, $a \ne 0$, has two complex roots, counting multiplicities, given by
\[ z = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \]
where $\sqrt{b^2 - 4ac}$ is the principal square root of $b^2 - 4ac$.
4.9. THE INVERSE TRIGONOMETRIC FUNCTIONS AND THE COMPLEX LOGARITHM217
6. We show how the ingenious mathematicians of the past solved the general cubic equa-
tion $z^3 + bz^2 + cz + d = 0$ with complex coefficients; for the history, see [88].
(i) First, replacing z with z − b/3, show that our cubic equation transforms into an
equation of the form $z^3 + \alpha z + \beta = 0$ where α and β are complex. Thus, we may
focus our attention on the equation $z^3 + \alpha z + \beta = 0$.
(ii) Second, show that the substitution z = w − α/(3w) gives an equation of the form
\[ 27(w^3)^2 + 27\beta(w^3) - \alpha^3 = 0, \]
a quadratic equation in $w^3$. We can solve this equation for $w^3$ by the previous
problem, therefore we can solve for w, and therefore we can get z = w − α/(3w).
(iii) Using the technique outlined above, solve the equation $z^3 - 12z - 3 = 0$.
7. A nice application of the previous problem is finding sin(π/9) and cos(π/9).
(i) Use de Moivre’s formula to prove that
\[ \cos 3x = \cos^3 x - 3\cos x \sin^2 x, \qquad \sin 3x = 3\cos^2 x \sin x - \sin^3 x. \]
(ii) Choose one of these equations and using $\cos^2 x + \sin^2 x = 1$, turn the right-hand
side into a cubic polynomial in cos x or sin x.
(iii) Using the equation you get, determine sin(π/9) and cos(π/9).
8. This problem is for the classic mathematicians at heart: We find square roots without
using the technology of trigonometric functions.
(i) Let z = a + ib be a nonzero complex number with b ≠ 0. Show that ξ = x + iy
satisfies $\xi^2 = z$ if and only if $x^2 - y^2 = a$ and 2xy = b.
(ii) Prove that $x^2 + y^2 = \sqrt{a^2 + b^2} = |z|$, and then $x^2 = \frac{1}{2}(|z| + a)$ and $y^2 = \frac{1}{2}(|z| - a)$.
(iii) Finally, deduce that ξ must equal
\[ \xi = \pm\left( \sqrt{\frac{|z| + a}{2}} + i\,\frac{b}{|b|}\sqrt{\frac{|z| - a}{2}} \right). \]
9. Prove that if r is a root of a polynomial $p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0$, then
\[ |r| \le \max\left\{ 1, \ \sum_{k=0}^{n-1} |a_k| \right\}. \]
10. (Continuous dependence of roots) Following Uherka and Sergott [227], we prove
the following useful theorem. Let $z_0$ be a root of multiplicity m of a polynomial
$p(z) = z^n + a_{n-1} z^{n-1} + \cdots + a_0$. Then given any ε > 0, there is a δ > 0 such that if
$q(z) = z^n + b_{n-1} z^{n-1} + \cdots + b_0$ satisfies $|b_j - a_j| < \delta$ for all j = 0, . . . , n − 1, then q(z)
has at least m roots within ε of $z_0$. You may proceed as follows.
(i) Suppose the theorem is false. Prove there is an ε > 0 and a sequence $\{q_k\}$ of
polynomials $q_k(z) = z^n + b_{k,n-1} z^{n-1} + \cdots + b_{k,0}$ such that $q_k$ has at most m − 1
roots within ε of $z_0$ and for each j = 0, . . . , n − 1, we have $b_{k,j} \to a_j$ as k → ∞.
(ii) Let $r_{k,1}, \ldots, r_{k,n}$ be the n roots of $q_k$. Let $R_k = (r_{k,1}, \ldots, r_{k,n}) \in \mathbb{C}^n = \mathbb{R}^{2n}$.
Prove that the sequence $\{R_k\}$ has a convergent subsequence. Suggestion: Problem
9 is helpful.
(iii) By relabelling the subsequence if necessary, we assume that $\{R_k\}$ itself converges;
say $R_k = (r_{k,1}, \ldots, r_{k,n}) \to (r_1, \ldots, r_n)$. Prove that at most m − 1 of the $r_j$’s can
equal $z_0$.
(iv) From Problem 2 in Exercises 2.10, $q_k(z) = (z - r_{k,1})(z - r_{k,2}) \cdots (z - r_{k,n})$. Prove
that for each z ∈ C, $\lim_{k\to\infty} q_k(z) = (z - r_1)(z - r_2) \cdots (z - r_n)$. On the other hand,
using that $b_{k,j} \to a_j$ as k → ∞, prove that for each z ∈ C, $\lim_{k\to\infty} q_k(z) = p(z)$.
Derive a contradiction.
the properties of real logarithms and using the logarithm we defined complex pow-
ers of positive bases. In our current section we shall extend logarithms to include
complex logarithms, which are then used to define complex powers with complex
bases. Finally, we use the complex logarithm to define complex inverse trigonomet-
ric functions.
Then all arguments of z differ from the principal one by multiples of 2π:
\[ \arg z = \operatorname{Arg} z + 2\pi n, \quad n \in \mathbb{Z}. \]
We can find many different formulas for Arg z using the inverse trig functions as
follows. Writing z in terms of its real and imaginary parts: z = x + iy, and equating
this with $|z| e^{i \operatorname{Arg} z} = |z| \cos(\operatorname{Arg} z) + i|z| \sin(\operatorname{Arg} z)$, we see that
\[ \cos \operatorname{Arg} z = \frac{x}{\sqrt{x^2 + y^2}} \quad \text{and} \quad \sin \operatorname{Arg} z = \frac{y}{\sqrt{x^2 + y^2}}. \tag{4.42} \]
By the properties of cosine, we see that
\[ -\frac{\pi}{2} < \operatorname{Arg} z < \frac{\pi}{2} \iff x > 0. \]
Since arcsin is the inverse of sin with angles in (−π/2, π/2), it follows that
\[ \operatorname{Arg} z = \arcsin\left( \frac{y}{\sqrt{x^2 + y^2}} \right), \quad x > 0. \]
Perhaps the most common formula for Arg z when x > 0 is in terms of arctangent,
which is derived by dividing the formulas in (4.42) to get tan Arg z = y/x and then
taking arctan of both sides:
\[ \operatorname{Arg} z = \arctan\frac{y}{x}, \quad x > 0. \]
We now derive a formula for Arg z when y > 0. By the properties of sine, we see
that
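Assuming that y ≥ 0, that is, 0 ≤ Arg z ≤ π, we can take the arccos of both sides
of the first equation in (4.42) to get
\[ \operatorname{Arg} z = \arccos\left( \frac{x}{\sqrt{x^2 + y^2}} \right), \quad y \ge 0. \]
Assume that y < 0, that is, −π < Arg z < 0. Then 0 < −Arg z < π and since
cos Arg z = cos(−Arg z), we get $\cos(-\operatorname{Arg} z) = x/\sqrt{x^2 + y^2}$. Taking the arccos of
both sides, we get
\[ \operatorname{Arg} z = -\arccos\left( \frac{x}{\sqrt{x^2 + y^2}} \right), \quad y < 0. \]
Putting together our expressions for Arg z, we obtain the following formulas for the
principal argument:
\[ \operatorname{Arg} z = \arctan\frac{y}{x} \quad \text{if } x > 0, \tag{4.43} \]
and
\[ \operatorname{Arg} z = \begin{cases} \arccos\left( \dfrac{x}{\sqrt{x^2 + y^2}} \right) & \text{if } y \ge 0, \\[2ex] -\arccos\left( \dfrac{x}{\sqrt{x^2 + y^2}} \right) & \text{if } y < 0. \end{cases} \tag{4.44} \]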
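Formulas (4.43) and (4.44) can be compared numerically against a library implementation of the principal argument. The sketch below is an illustration only; the function names are hypothetical, and it assumes that Python's `cmath.phase` returns the principal argument in (−π, π] for the inputs used here.

```python
import cmath
import math


def arg_via_arccos(z: complex) -> float:
    """Principal argument of a nonzero z = x + iy from formula (4.44)."""
    x, y = z.real, z.imag
    if x == 0 and y == 0:
        raise ValueError("Arg z is not defined for z = 0")
    c = x / math.hypot(x, y)
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.acos(c) if y >= 0 else -math.acos(c)


def arg_via_arctan(z: complex) -> float:
    """Principal argument from formula (4.43); only valid when x > 0."""
    if z.real <= 0:
        raise ValueError("formula (4.43) requires x > 0")
    return math.atan(z.imag / z.real)


if __name__ == "__main__":
    for z in (1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j, -2 + 0j, 3j):
        print(z, arg_via_arccos(z), cmath.phase(z))
```

Note that (4.44) covers every nonzero z, while (4.43) is restricted to the right half-plane.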
Proof. Since
all we have to do is prove that Arg is continuous on each of these three sets. But
this is easy: The formula (4.43) shows that Arg is continuous when x > 0, the first
formula in (4.44) shows that Arg is continuous when y > 0, and the second formula
in (4.44) shows that Arg is continuous when y < 0.
4.9.3. The complex logarithm and powers. Recall from Section 4.6.2 that
if a ∈ R and a > 0, then a real number ξ having the property that
\[ e^{\xi} = a \]
is called the logarithm of a; we know that ξ always exists and is unique since
exp : R −→ (0, ∞) is a bijection. Of course, ξ = log a by definition of log. We now
consider complex logarithms. We define such logarithms in an analogous way: If
z ∈ C and z ≠ 0, then a complex number ξ having the property that
\[ e^{\xi} = z \]
is called a complex logarithm of z. The reason we assume z ≠ 0 is that there
is no complex ξ such that $e^{\xi} = 0$. We now show that nonzero complex numbers
always have logarithms; however, in contrast to the case of real numbers, complex
numbers have infinitely many distinct logarithms!
Theorem 4.49. The complex logarithms of any given nonzero complex number
z are all of the form
\[ \xi = \log|z| + i\left( \operatorname{Arg} z + 2\pi n \right), \quad n \in \mathbb{Z}. \tag{4.45} \]
Therefore, all complex logarithms of z have exactly the same real part log |z|, but
have imaginary parts that differ by integer multiples of 2π from Arg z.
Proof. The idea behind this proof is very simple: We write
\[ z = |z| \cdot e^{i \arg z} = e^{\log|z|} \cdot e^{i \arg z} = e^{\log|z| + i \arg z}. \]
Since any argument of z is of the form Arg z + 2πn for n ∈ Z, this equation shows
that all the numbers in (4.45) are indeed logarithms. On the other hand, if ξ is any
logarithm of z, then
\[ e^{\xi} = z = e^{\log|z| + i \operatorname{Arg} z}. \]
By Theorem 4.40 we must have ξ = log |z| + i Arg z + 2πi n for some n ∈ Z. This
completes our proof.
To isolate one of these infinitely many logarithms we define the so-called “prin-
cipal” one. For any nonzero complex number z, we define the principal (branch
of the) logarithm of z by
Log z := log |z| + i Arg z.
By Theorem 4.49, all logarithms of z are of the form
Log z + 2πi n, n ∈ Z.
Note that if x ∈ R and x > 0, then Arg x = 0, therefore
\[ \operatorname{Log} x = \log x, \]
our usual logarithm, so Log is an extension of the real log to complex numbers.
Example 4.36. Observe that since Arg(−1) = π and Arg i = π/2 and
log |−1| = 0 = log |i|, since both equal log 1, we have
\[ \operatorname{Log}(-1) = i\pi, \qquad \operatorname{Log} i = i\frac{\pi}{2}. \]
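Theorem 4.49 and the definition of Log can be illustrated with a few lines of Python. This is a sketch only; the function names `principal_log` and `logarithms` are hypothetical, and `cmath.phase` is assumed to play the role of Arg.

```python
import cmath
import math


def principal_log(z: complex) -> complex:
    """Log z := log|z| + i Arg z for nonzero z (hypothetical helper)."""
    if z == 0:
        raise ValueError("0 has no complex logarithm")
    return complex(math.log(abs(z)), cmath.phase(z))


def logarithms(z: complex, ns: range) -> list[complex]:
    """The logarithms Log z + 2 pi i n of z for the integers n in ns."""
    base = principal_log(z)
    return [base + 2j * math.pi * n for n in ns]


if __name__ == "__main__":
    print(principal_log(-1 + 0j))   # approximately i*pi
    print(principal_log(1j))        # approximately i*pi/2
    for xi in logarithms(1 + 1j, range(-2, 3)):
        # Every listed logarithm exponentiates back to z = 1 + i.
        print(xi, cmath.exp(xi))
```

All the values returned by `logarithms` share the real part log |z| and differ only in the imaginary part, as the theorem states.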
The principal logarithm satisfies some of the properties of the real logarithm,
but we need to be careful with the addition properties.
In general, there are infinitely many complex powers, but in certain cases they
actually reduce to a finite number, see Problem 3. Here are some examples.
Example 4.38. Have you ever thought about what $i^i$ equals? In this case,
Log i = iπ/2, so
\[ i^i = e^{i \operatorname{Log} i} = e^{i(i\pi/2)} = e^{-\pi/2}, \]
a real number! Here is another nice example:
\[ (-1)^{1/2} = e^{(1/2)\operatorname{Log}(-1)} = e^{(1/2) i\pi} = \cos\frac{\pi}{2} + i \sin\frac{\pi}{2} = i, \]
therefore $(-1)^{1/2} = i$, just as we suspected!
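The two computations in Example 4.38 can be reproduced numerically. The sketch below assumes, as the example does, that the principal value of a complex power is e^{w Log z}, and that the other values come from replacing Log z by the other logarithms Log z + 2πin; the function names are hypothetical.

```python
import cmath
import math


def principal_power(z: complex, w: complex) -> complex:
    """Principal value of z^w, computed as exp(w * Log z) for nonzero z."""
    return cmath.exp(w * cmath.log(z))


def powers(z: complex, w: complex, ns: range) -> list[complex]:
    """Values exp(w * (Log z + 2 pi i n)) for the integers n in ns."""
    return [cmath.exp(w * (cmath.log(z) + 2j * math.pi * n)) for n in ns]


if __name__ == "__main__":
    print(principal_power(1j, 1j), math.exp(-math.pi / 2))  # i^i is real
    print(principal_power(-1, 0.5))                         # approximately i
    # For w = 1/2 the values repeat with period 2 in n: only i and -i occur.
    print(powers(-1, 0.5, range(4)))
```

With w = 1/2 the list produced by `powers` alternates between i and −i, which illustrates how the infinitely many powers can collapse to finitely many values.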
4.9.4. The complex-valued arctangent function. We now investigate the
complex arctangent function; the other complex inverse functions are found in
Problem 5. Given a complex number z, in the following theorem we shall find all
complex numbers ξ such that
(4.47) tan ξ = z.
Of course, if we can find such a ξ, then we would like to call ξ the “inverse tangent
of z.” However, when this equation does have solutions, it turns out that it has
infinitely many.
Lemma 4.52. If z = ±i, then the equation (4.47) has no solutions. If z ≠ ±i,
then
\[ \tan\xi = z \iff e^{2i\xi} = \frac{1 + iz}{1 - iz}, \]
that is, if and only if
\[ \xi = \frac{1}{2i} \times \text{a complex logarithm of } \frac{1 + iz}{1 - iz}. \]
Proof. The following statements are equivalent:
Some questions you might ask are whether or not Arctan really is an “inverse” of
tan, in other words, is Arctan a bijection; you might also ask if Arctan x = arctan x
for x real. The answer to the first question is “yes,” if we restrict the domain of
Arctan, and the answer to the second question is “yes.”
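Theorem 4.53 (Properties of Arctan). Let
\[ D = \{ z \in \mathbb{C} \,;\, z \ne iy,\ y \in \mathbb{R},\ |y| \ge 1 \}, \qquad E = \{ \xi \in \mathbb{C} \,;\, |\operatorname{Re} \xi| < \pi/2 \}. \]
Then
\[ \operatorname{Arctan} : D \longrightarrow E \]
is a continuous bijection from D onto E with inverse tan : E −→ D and when
restricted to real values,
\[ \operatorname{Arctan} : \mathbb{R} \longrightarrow (-\pi/2, \pi/2) \]
and equals the usual arctangent function arctan : R −→ (−π/2, π/2).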
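The claims of Theorem 4.53 can be spot-checked numerically using the expression of Arctan through the principal logarithm, Arctan z = (1/(2i)) Log((1 + iz)/(1 − iz)). The sketch below is an illustration under that assumption; the function name `arctan_principal` is hypothetical and `cmath.log` is taken as the principal logarithm.

```python
import cmath
import math


def arctan_principal(z: complex) -> complex:
    """Arctan z = (1/(2i)) Log((1 + iz)/(1 - iz)) for z != +/- i."""
    z = complex(z)
    if z == 1j or z == -1j:
        raise ValueError("Arctan is not defined at z = i or z = -i")
    return cmath.log((1 + 1j * z) / (1 - 1j * z)) / 2j


if __name__ == "__main__":
    # On the real line the complex formula should reproduce arctan.
    for x in (-10.0, -1.0, 0.0, 0.5, 3.0):
        print(x, arctan_principal(x), math.atan(x))
    # For a point of D, tan(Arctan z) should give back z,
    # and the real part should lie strictly between -pi/2 and pi/2.
    z = 0.3 + 0.4j
    xi = arctan_principal(z)
    print(xi, cmath.tan(xi))
```

Points z = iy with |y| ≥ 1 are excluded from D, which is why they are not used as test inputs here.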
Proof. We begin by showing that Arctan(D) ⊆ E. First of all, by definition
of Log, for any z ∈ C with z ≠ ±i (not necessarily in D) we have
\begin{align*}
\operatorname{Arctan} z = \frac{1}{2i} \operatorname{Log}\frac{1 + iz}{1 - iz} &= \frac{1}{2i}\left( \log\left| \frac{1 + iz}{1 - iz} \right| + i \operatorname{Arg}\frac{1 + iz}{1 - iz} \right) \\
&= \frac{1}{2} \operatorname{Arg}\frac{1 + iz}{1 - iz} - \frac{i}{2} \log\left| \frac{1 + iz}{1 - iz} \right|. \tag{4.48}
\end{align*}
Since the principal argument of any complex number lies in (−π, π], it follows that
\[ -\frac{\pi}{2} < \operatorname{Re} \operatorname{Arctan} z \le \frac{\pi}{2}, \quad \text{for all } z \in \mathbb{C},\ z \ne \pm i. \]
Assume that Arctan z ∉ E, which, by the above inequality, is equivalent to
\[ 2 \operatorname{Re} \operatorname{Arctan} z = \operatorname{Arg}\frac{1 + iz}{1 - iz} = \pi \iff \frac{1 + iz}{1 - iz} \in (-\infty, 0). \]
If z = x + iy, then (by multiplying top and bottom of $\frac{1 + iz}{1 - iz}$ by $1 + i\bar{z}$, the complex
conjugate of 1 − iz, and making a short computation) we can write
\[ \frac{1 + iz}{1 - iz} = \frac{1 - |z|^2}{|1 - iz|^2} + \frac{2x}{|1 - iz|^2}\, i. \]
This formula shows that (1 + iz)/(1 − iz) ∈ (−∞, 0) if and only if x = 0 and
$1 - |z|^2 < 0$, that is, x = 0 and $1 - y^2 < 0$, or, |y| > 1; hence,
\[ \frac{1 + iz}{1 - iz} \in (-\infty, 0) \iff z = iy,\ |y| > 1. \]
In summary, for any z ∈ C with z ≠ ±i, we have Arctan z ∉ E ⇐⇒ z ∉ D, or
\[ \operatorname{Arctan} z \in E \iff z \in D. \tag{4.49} \]
Therefore, Arctan(D) ⊆ E.
We now show that Arctan(D) = E, so let ξ ∈ E. Define z = tan ξ. Then
according to Lemma 4.52, we have z ≠ ±i and $e^{2i\xi} = \frac{1 + iz}{1 - iz}$. By definition of E,
the real part of ξ satisfies −π/2 < Re ξ < π/2. Since Im(2iξ) = 2 Re(ξ), we have
−π < Im(2iξ) < π, and therefore by definition of the principal logarithm,
\[ 2i\xi = \operatorname{Log}\frac{1 + iz}{1 - iz}. \]
(e) Give examples showing that the conclusions of (b) and (c) are false if the hypotheses
are not satisfied.
5. (Arcsine and Arccosine function) In this problem we define the principal arcsin
and arccos functions. To define the complex arcsine, given z ∈ C we want to solve the
equation sin ξ = z for ξ and call ξ the “inverse sine of z”.
(a) Prove that sin ξ = z if and only if $(e^{i\xi})^2 - 2iz\,(e^{i\xi}) - 1 = 0$.
(b) Solving this quadratic equation for $e^{i\xi}$ (see Problem 5 in Exercises 4.8) prove that
sin ξ = z if and only if
\[ \xi = \frac{1}{i} \times \text{a complex logarithm of } iz \pm \sqrt{1 - z^2}. \]
Because of this formula, we define the principal inverse or arc sine of z to be
the complex number
\[ \operatorname{Arcsin} z := \frac{1}{i} \operatorname{Log}\left( iz + \sqrt{1 - z^2} \right). \]
Based on the formula (4.41), we define the principal inverse or arc cosine of z
to be the complex number
\[ \operatorname{Arccos} z := \frac{\pi}{2} - \operatorname{Arcsin} z. \]
(c) Prove that when restricted to real values, Arcsin : [−1, 1] −→ [−π/2, π/2] and
equals the usual arcsine function.
(d) Similarly, prove that when restricted to real values, Arccos : [−1, 1] −→ [0, π] and
equals the usual arccosine function.
6. (Inverse hyperbolic functions) We look at the inverse hyperbolic functions.
(a) Prove that sinh : R −→ R is a strictly increasing bijection. Thus, sinh−1 : R −→ R
exists. Show that cosh : [0, ∞) −→ [1, ∞) is a strictly increasing bijection. We
define cosh−1 : [1, ∞) −→ [0, ∞) to be the inverse of this function.
(b) Using a similar argument as you did for the arcsine function in Problem 5, prove
that sinh x = y (here, x, y ∈ R) if and only if $e^{2x} - 2y e^x - 1 = 0$, which holds if
and only if $e^x = y \pm \sqrt{y^2 + 1}$. From this, prove that
\[ \sinh^{-1} x = \log\left( x + \sqrt{x^2 + 1} \right). \]
If x is replaced by z ∈ C and log by Log, the principal complex logarithm, then
this formula is called the principal inverse hyperbolic sine of z.
(c) Prove that
\[ \cosh^{-1} x = \log\left( x + \sqrt{x^2 - 1} \right). \]
If x is replaced by z ∈ C and log by Log, the principal complex logarithm, then
this formula is called the principal inverse hyperbolic cosine of z.
(5) (circa 1600 A.D.) The Dutch mathematician Adriaan Anthoniszoon of Holland
(1527–1607) used Archimedes’ method to get
\[ \frac{333}{106} < \pi < \frac{377}{120}. \]
By taking the average of the numerators and the denominators, he found Tsu
Chung-Chi’s approximation 355/113.
(6) (1706) The symbol π was first introduced by William Jones (1675–1749) in his
beginner’s calculus book Synopsis palmariorum mathesios where he published
John Machin’s (1680–1751) one hundred digit approximation to π; see Subsec-
tion 4.10.5 for more on Machin. The symbol π was popularized and became
standard through Leonhard Euler’s (1707–1783) famous book Introductio in
Analysin Infinitorum [65]. The letter π was (reportedly) chosen because it’s
the first letter of the Greek words “perimeter” and “periphery”.
(7) (1761) Johann Heinrich Lambert (1728–1777) proved that π is irrational.
(8) (1882) Carl Louis Ferdinand von Lindemann (1852–1939) proved that π is tran-
scendental.
(9) (1897) A sad day in the life of π. House bill No. 246, Indiana state legislature,
1897, written by a physician Edwin Goodwin (1828–1902), tried to legally set
the value of π to a rational number; see [213], [90] for more about this sad tale.
This value would be copyrighted and used in Indiana state math textbooks and
other states would have to pay to use this value! The bill is very convoluted
(try to read Goodwin’s article [83] and you’ll probably get a headache) and
(reportedly) the following values of π can be implied from the bill: π = 9.24,
3.236, 3.232, and 3.2; it’s also implied that $\sqrt{2} = 10/7$. Moreover, Mr. Good-
win claimed he could trisect an angle, double a cube, and square a circle, which
(quoting from the bill) “had been long since given up by scientific bodies as
insolvable mysteries and above mans ability to comprehend.” These problems
“had been long since given up” because they have been proven unsolvable! (See
[57, 79] for more on these unsolvable problems, first proved by Pierre Laurent
Wantzel (1814–1848), and see [59] for other stories of amateur mathematicians
claiming to have solved the insolvable.) This bill passed the house (!), but for-
tunately, with the help of mathematician C.A. Waldo of the Indiana Academy
of Science, the bill didn’t pass in the senate.
Hold on to your seats because we’ll take up our brief history of π again in
Subsection 4.10.5, after a brief intermission.
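[Figure: a triangle with base 2πr and height r, whose area is $\frac{1}{2}\,\text{base} \times \text{height} = \frac{1}{2}\, r \cdot (2\pi r) = \pi r^2$.]
sition gives the famous estimate $\pi \approx \frac{22}{7}$:
\[ \frac{\text{area of circle}}{\text{area of square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4} \approx \frac{11}{14} \implies \pi \approx \frac{22}{7}. \]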
We now derive Archimedes’ third proposition using the same method Archimedes
pioneered over two thousand years ago, but we shall employ trigonometric functions!
Archimedes’ original method used plane geometry to derive his formulas (they
didn’t have the knowledge of trigonometric functions back then as we do now).
However, before doing so, we need a couple of trig facts.
4.10.3. Some useful trig facts. We first consider some useful trig identities.
Lemma 4.54. We have
\[ \tan z = \frac{\sin(2z)\tan(2z)}{\sin(2z) + \tan(2z)} \quad \text{and} \quad 2\sin^2 z = \sin(2z)\tan z. \]
Proof. We’ll prove the first one and leave the second one to you. Multiplying
tan z by (2 cos z)/(2 cos z) = 1 and using the double angle formulas $2\cos^2 z = 1 + \cos 2z$
and sin(2z) = 2 cos z sin z (see Theorem 4.34), we obtain
\[ \tan z = \frac{\sin z}{\cos z} = \frac{2 \sin z \cos z}{2\cos^2 z} = \frac{\sin(2z)}{1 + \cos(2z)}. \]
Multiplying top and bottom by tan 2z, we get
\[ \tan z = \frac{\sin(2z)\tan(2z)}{\tan(2z) + \cos(2z)\tan(2z)} = \frac{\sin(2z)\tan(2z)}{\tan(2z) + \sin(2z)}. \]
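For readers who want to compare their work on the second identity, one possible one-line argument uses only the double angle formula sin(2z) = 2 cos z sin z. This is a sketch, assuming cos z ≠ 0 so that tan z is defined:

```latex
\[
  \sin(2z)\tan z
  = 2\cos z \sin z \cdot \frac{\sin z}{\cos z}
  = 2\sin^2 z .
\]
```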
Next, we consider some useful inequalities.
Lemma 4.55. For 0 < x < π/2, we have
sin x < x < tan x.
Proof. We first prove that sin x < x for 0 < x < π/2. We note that the
inequality sin x < x for 0 < x < π/2 automatically implies that this same inequality
holds for all x > 0, since sin x ≤ 1 < π/2 ≤ x whenever x ≥ π/2. Substituting the
power series for sin x, the inequality sin x < x, that is, −x < − sin x, is equivalent
to
\[ -x < -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \frac{x^7}{7!} - \frac{x^9}{9!} + - \cdots, \]
or after cancelling off the x’s, this inequality is equivalent to
\[ \frac{x^3}{3!}\left( 1 - \frac{x^2}{4 \cdot 5} \right) + \frac{x^7}{7!}\left( 1 - \frac{x^2}{8 \cdot 9} \right) + \cdots > 0. \]
For 0 < x < 2, each of the terms in parentheses is positive. This shows that in
particular, this expression is positive for 0 < x < π/2.
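[Figure: regular polygons with $M \cdot 2^n$ sides inscribed in and circumscribed about the circle; labels $s_n$, $t_n$, and central angle $2\theta_n$.]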
We now prove that x < tan x for 0 < x < π/2. This inequality is equivalent to
x cos x < sin x for 0 < x < π/2. Substituting the power series for cos and sin, the
inequality x cos x < sin x is equivalent to
\[ x - \frac{x^3}{2!} + \frac{x^5}{4!} - \frac{x^7}{6!} + - \cdots < x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots. \]
Bringing everything to the right, we get an inequality of the form
\[ x^3\left( \frac{1}{2!} - \frac{1}{3!} \right) - x^5\left( \frac{1}{4!} - \frac{1}{5!} \right) + x^7\left( \frac{1}{6!} - \frac{1}{7!} \right) - x^9\left( \frac{1}{8!} - \frac{1}{9!} \right) + - \cdots > 0. \]
Combining adjacent terms, the left-hand side is a sum of terms of the form
\[ x^{2k-1}\left( \frac{1}{(2k-2)!} - \frac{1}{(2k-1)!} \right) - x^{2k+1}\left( \frac{1}{(2k)!} - \frac{1}{(2k+1)!} \right), \quad k = 2, 4, 6, \ldots. \]
We claim that this term is positive for 0 < x < 3. This shows that x cos x < sin x
for 0 < x < 3, and so in particular, for 0 < x < π/2. Now the above expression is
positive if and only if
\[ x^2 < \frac{\dfrac{1}{(2k-2)!} - \dfrac{1}{(2k-1)!}}{\dfrac{1}{(2k)!} - \dfrac{1}{(2k+1)!}} = (2k+1)(2k-2), \quad k = 2, 4, 6, \ldots, \]
where we multiplied the top and bottom by (2k+1)!. The right-hand side is smallest
when k = 2, when it equals 5 · 2 = 10. It follows that these inequalities hold for
0 < x < 3, and our proof is now complete.
4.10.4. Archimedes’ third proposition. We start with a circle with diam-
eter 1 (radius 1/2). Then,
\[ \text{circumference of this circle} = 2\pi r = 2\pi \cdot \frac{1}{2} = \pi. \]
Let us fix a natural number M ≥ 3. Given any n = 0, 1, 2, 3, . . ., we inscribe
and circumscribe the circle with regular polygons having $2^n M$ sides. See Figure
4.14. We denote the perimeter of the inscribed $2^n M$-gon by lowercase $p_n$ and the
perimeter of the circumscribed $2^n M$-gon by uppercase $P_n$. Then geometrically we
can see that
\[ p_n < \pi < P_n, \quad n = 0, 1, 2, \ldots \]
4.10. F THE AMAZING π AND ITS COMPUTATIONS FROM ANCIENT TIMES 231
[Figure 4.15. We cut the central angle in half. The right picture shows a blow-up of the overlapping triangles on the left. Labels: $s_n/2$, $t_n/2$, radius 1/2, angle $\theta_n$.]
and
\[ 2\sin^2(\theta_{n+1}) = 2\sin^2\left( \tfrac{1}{2}\theta_n \right) = \sin(\theta_n)\tan\left( \tfrac{1}{2}\theta_n \right) = \sin(\theta_n)\tan(\theta_{n+1}). \]
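In particular, recalling that $P_n = 2^n M \tan\theta_n$ and $p_n = 2^n M \sin\theta_n$, we see that
\begin{align*}
P_{n+1} = 2^{n+1} M \tan\theta_{n+1} &= 2^{n+1} M\, \frac{\sin(\theta_n)\tan(\theta_n)}{\sin(\theta_n) + \tan(\theta_n)} \\
&= 2\, \frac{2^n M \sin(\theta_n) \cdot 2^n M \tan(\theta_n)}{2^n M \sin(\theta_n) + 2^n M \tan(\theta_n)} = 2\, \frac{p_n P_n}{p_n + P_n},
\end{align*}
and
\begin{align*}
2p_{n+1}^2 = 2\left( 2^{n+1} M \sin\theta_{n+1} \right)^2 &= \left( 2^{n+1} M \right)^2 \sin(\theta_n)\tan(\theta_{n+1}) \\
&= 2 \cdot \left( 2^n M \sin(\theta_n) \right)\left( 2^{n+1} M \tan(\theta_{n+1}) \right) = 2 p_n P_{n+1},
\end{align*}
or $p_{n+1} = \sqrt{p_n P_{n+1}}$. Finally, recall that $\theta_n = \frac{\pi}{2^n M}$. Thus, $P_0 = M\tan(\frac{\pi}{M})$ and
$p_0 = M\sin(\frac{\pi}{M})$. Let us summarize our results in the following formulas:
\[ \begin{aligned} P_{n+1} &= \frac{2 p_n P_n}{p_n + P_n}, & p_{n+1} &= \sqrt{p_n P_{n+1}}; \quad \text{(Archimedes’ algorithm)} \\ P_0 &= M\tan\frac{\pi}{M}, & p_0 &= M\sin\frac{\pi}{M}. \end{aligned} \tag{4.50} \]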
This is the celebrated Archimedes’ algorithm. Starting from the values of P0 and
p0 , we can use the iterative definitions for Pn+1 and pn+1 to generate sequences
{Pn } and {pn } that converge to π, as we now show.
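The iteration (4.50) is simple enough to run by machine. The sketch below is an illustration only: the function name `archimedes` and its parameters are hypothetical, and the starting values $p_0 = 3$ and $P_0 = 2\sqrt{3}$ for M = 6 are supplied directly, so that no value of π is needed to begin.

```python
import math


def archimedes(p0: float, P0: float, steps: int) -> list[tuple[int, float, float]]:
    """Iterate Archimedes' algorithm (4.50):
        P_{n+1} = 2 p_n P_n / (p_n + P_n),   p_{n+1} = sqrt(p_n P_{n+1}).
    Returns the rows (n, p_n, P_n) for n = 0, ..., steps.
    """
    rows = [(0, p0, P0)]
    p, P = p0, P0
    for n in range(1, steps + 1):
        P = 2 * p * P / (p + P)   # harmonic-mean step uses the old p_n and P_n
        p = math.sqrt(p * P)      # geometric-mean step uses the new P_{n+1}
        rows.append((n, p, P))
    return rows


if __name__ == "__main__":
    # M = 6: inscribed and circumscribed hexagons for a circle of diameter 1.
    for n, p, P in archimedes(3.0, 2 * math.sqrt(3.0), 7):
        print(f"{n}  {p:.9f}  {P:.9f}")
```

The order of the two updates matters: the new $P_{n+1}$ must be computed first, because the formula for $p_{n+1}$ uses it.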
Theorem 4.56 (Archimedes’ algorithm). We have
\[ p_n < \pi < P_n, \quad n = 0, 1, 2, \ldots \]
and $p_n \to \pi$ and $P_n \to \pi$ as n → ∞.
Proof. Note that for any n = 0, 1, 2, . . ., we have $0 < \theta_n = \frac{\pi}{2^n M} < \frac{\pi}{2}$ because
M ≥ 3. Thus, by Lemma 4.55,
\[ p_n = 2^n M \sin\theta_n < 2^n M \theta_n < 2^n M \tan\theta_n = P_n. \]
Since $\theta_n = \frac{\pi}{2^n M}$, the middle term is just π, so $p_n < \pi < P_n$ for every n = 0, 1, 2, . . ..
Using the limit $\lim_{z\to 0} \sin z / z = 1$, we obtain
\[ \lim_{n\to\infty} p_n = \lim_{n\to\infty} 2^n M \sin\theta_n = \lim_{n\to\infty} \pi\, \frac{\sin\left( \frac{\pi}{2^n M} \right)}{\frac{\pi}{2^n M}} = \pi. \]
Since $\lim_{z\to 0} \cos z = 1$, we have $\lim_{z\to 0} \tan z / z = \lim_{z\to 0} \sin z / (z \cdot \cos z) = 1$, so the
same argument we used for $p_n$ shows that $\lim_{n\to\infty} P_n = \pi$.
In Problem 4 you will study how fast $p_n$ and $P_n$ converge to π. Now let’s
consider a specific example: Let M = 6, which is what Archimedes chose! Then,
\[ P_0 = 6\tan\frac{\pi}{6} = 2\sqrt{3} = 3.464101615\ldots \quad \text{and} \quad p_0 = 6\sin\frac{\pi}{6} = 3. \]
From these values, we can find $P_1$ and $p_1$ from Archimedes’ algorithm (4.50):
\[ P_1 = \frac{2 p_0 P_0}{p_0 + P_0} = \frac{2 \cdot 3 \cdot 2\sqrt{3}}{3 + 2\sqrt{3}} = 3.215390309\ldots \]
and
\[ p_1 = \sqrt{p_0 P_1} = \sqrt{3 \cdot 3.215390309\ldots} = 3.105828541\ldots. \]
Continuing this process (I used a spreadsheet) we can find $P_2$, $p_2$, then $P_3$, $p_3$, and
so forth, arriving at the table

n    p_n            P_n
0    3              3.464101615
1    3.105828541    3.215390309
2    3.132628613    3.159659942
3    3.139350203    3.146086215
4    3.141031951    3.1427146
5    3.141452472    3.14187305
6    3.141557608    3.141662747
7    3.141583892    3.141610177

Archimedes considered $p_4 = 3.14103195\ldots$ and $P_4 = 3.1427146\ldots$. Notice that
\[ 3\tfrac{10}{71} = 3.140845070\ldots < p_4 \quad \text{and} \quad P_4 < 3.142857142\ldots = 3\tfrac{1}{7}. \]
Hence,
\[ 3\tfrac{10}{71} < p_4 < \pi < P_4 < 3\tfrac{1}{7}, \]
which proves Archimedes’ third proposition. It’s interesting to note that Archimedes
didn’t have computers back then (to find square roots for instance), or trig func-
tions, or coordinate geometry, or decimal notation, etc., so it’s remarkable that
Archimedes was able to determine π to such accuracy!
4.10.5. Continuation of our brief history of π. Here are (only some!)
famous formulas for π (along with their earliest known date of publication) that
we’ll prove in our journey through our book: Archimedes of Syracuse ≈ 250 B.C.:
π = lim Pn = lim pn , where
2pn Pn p π π
Pn+1 = , pn+1 = pn Pn+1 ; P0 = M tan , p0 = M sin .
pn + P n M M
We remark that Archimedes’ algorithm is similar to Borchardt’s algorithm (see
Problem 1), which is similar to the modern-day AGM method of Eugene Salamin,
Richard Brent, and Jonathan and Peter Borwein [32, 33]. This AGM method can
generate billions of digits of π!
François Viète 1593 (§ 5.1):
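\[ \frac{2}{\pi} = \sqrt{\frac12}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12}}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}\cdots. \]
\[ \frac{4}{\pi} = 1 + \cfrac{1^2}{2+\cfrac{3^2}{2+\cfrac{5^2}{2+\cfrac{7^2}{2+\cdots}}}}. \]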
234 4. LIMITS, CONTINUITY, AND ELEMENTARY FUNCTIONS
Machin calculated 100 digits of π with this formula. William Shanks (1812–1882)
is famed for his calculation of π to 707 places in 1873 using Machin’s formula.
However, only the first 527 places were correct as discovered by D. Ferguson in
1944 [72] using another Machin type formula. Ferguson ended up publishing 620
correct places in 1946, which marks the last hand calculation for π ever to so many
digits. From this point on, computers have been used to find π and the number of
digits of π known today is well into the trillions, led by Yasumasa Kanada and his
coworkers at the University of Tokyo using a Machin type formula; see Kanada’s
website http://www.super-computing.org/. One might ask “why try to find so
many digits of π?” Well (taken from Young’s great book [252, p. 238]),
Perhaps in some far distant century they may say, “Strange that
those ingenious investigators into the secrets of the number sys-
tem had so little conception of the fundamental discoveries that
would later develop from them!” D. N. Lehmer (1867–1938).
We now go back to our list of formulas. Leonhard Euler 1736 (§ 5.1):
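\[ \frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots, \]
and (§ 5.1):
\[ \frac{\pi^2}{6} = \frac{2^2}{2^2-1}\cdot\frac{3^2}{3^2-1}\cdot\frac{5^2}{5^2-1}\cdot\frac{7^2}{7^2-1}\cdot\frac{11^2}{11^2-1}\cdots. \]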
We end our history with a question to ponder: What is the probability that a
natural number, chosen at random, is square free (that is, is not divisible by the
square of a prime)? What is the probability that two natural numbers, chosen
at random, are relatively (or co-) prime (that is, don’t have any common prime
factors)? The answers, drum roll please, (§ 7.6):
\[ \text{Probability of being square free} = \text{Probability of being coprime} = \frac{6}{\pi^2}. \]
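A quick experiment makes these two answers plausible. The sketch below is an illustration only: it reads "chosen at random" as the limiting fraction among 1, ..., N (an assumption about how the probability is meant), and the function names are hypothetical.

```python
import math

def squarefree_fraction(limit):
    """Fraction of the integers 1..limit not divisible by any square d*d with d >= 2."""
    is_squarefree = [True] * (limit + 1)
    d = 2
    while d * d <= limit:
        for multiple in range(d * d, limit + 1, d * d):
            is_squarefree[multiple] = False
        d += 1
    return sum(is_squarefree[1:]) / limit

def coprime_fraction(limit):
    """Fraction of pairs (a, b) with 1 <= a, b <= limit and gcd(a, b) = 1."""
    pairs = sum(1 for a in range(1, limit + 1)
                for b in range(1, limit + 1) if math.gcd(a, b) == 1)
    return pairs / limit ** 2

print(squarefree_fraction(10_000), coprime_fraction(100), 6 / math.pi ** 2)
```

Both fractions should sit close to 6/π² ≈ 0.6079 already for these modest limits.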
Exercises 4.10.
1. In a letter from Gauss to his teacher Johann Pfaff (1765–1825) around 1800, Gauss
asked Pfaff about the following sequences {αn }, {βn } defined recursively as follows:
\[ \alpha_{n+1} = \frac12(\alpha_n + \beta_n),\qquad \beta_{n+1} = \sqrt{\alpha_{n+1}\beta_n}. \qquad\text{(Borchardt’s algorithm)} \]
Later, Carl Borchardt (1817–1880) rediscovered this algorithm and since then this
algorithm is called Borchardt’s algorithm [46]. Prove that Borchardt’s algorithm
is basically the same as Archimedes’ algorithm in the following sense: if you set αn :=
1/Pn and βn := 1/pn in Archimedes’ algorithm, you get Borchardt’s algorithm.
2. (Pfaff’s solution I) Now what if we don’t use the starting values $P_0 = M\tan\frac{\pi}{M}$ and
$p_0 = M\sin\frac{\pi}{M}$ for Archimedes’ algorithm in (4.50), but instead use other starting
values? What do the sequences $\{P_n\}$ and $\{p_n\}$ converge to? These questions were
answered by Johann Pfaff. Pick starting values $P_0$ and $p_0$ and let’s assume that $0 \le p_0 < P_0$;
the case that $P_0 < p_0$ is handled in the next problem.
(i) Define
\[ \theta := \arccos\frac{p_0}{P_0},\qquad r := \frac{p_0P_0}{\sqrt{P_0^2-p_0^2}}. \]
Prove that $P_0 = r\tan\theta$ and $p_0 = r\sin\theta$.
(ii) Prove by induction that $P_n = 2^n r\tan\frac{\theta}{2^n}$ and $p_n = 2^n r\sin\frac{\theta}{2^n}$.
(iii) Prove that as n → ∞, both {Pn } and {pn } converge to
\[ r\theta = \frac{p_0P_0}{\sqrt{P_0^2-p_0^2}}\,\arccos\frac{p_0}{P_0}. \]
3. (Pfaff ’s solution II) Now assume that 0 < P0 < p0 .
(i) Define (see Problem 6 in Exercises 4.9 for the definition of cosh−1 )
\[ \theta := \cosh^{-1}\frac{p_0}{P_0},\qquad r := \frac{p_0P_0}{\sqrt{p_0^2-P_0^2}}. \]
Prove that $P_0 = r\tanh\theta$ and $p_0 = r\sinh\theta$.
(ii) Prove by induction that $P_n = 2^n r\tanh\frac{\theta}{2^n}$ and $p_n = 2^n r\sinh\frac{\theta}{2^n}$.
(iii) Prove that as n → ∞, both {Pn } and {pn } converge to
\[ r\theta = \frac{p_0P_0}{\sqrt{p_0^2-P_0^2}}\,\cosh^{-1}\frac{p_0}{P_0}. \]
4. (Cf. [150], [181]) (Rate of convergence)
(a) Using the formulas $p_n = 2^n M\sin\theta_n$ and $P_n = 2^n M\tan\theta_n$, where $\theta_n = \frac{\pi}{2^n M}$, prove
that there are constants $C_1, C_2 > 0$ such that for all n,
\[ |p_n - \pi| \le \frac{C_1}{4^n} \quad\text{and}\quad |P_n - \pi| \le \frac{C_2}{4^n}. \]
Suggestion: For the first estimate, use the expansion $\sin z = z - \frac{z^3}{3!} + \cdots$. For the
second estimate, notice that $|P_n - \pi| = \frac{1}{\cos\theta_n}\,|p_n - \pi\cos\theta_n|$.
(b) Part (a) shows that $\{p_n\}$ and $\{P_n\}$ converge to π very fast, but we can get even
faster convergence by looking at the sequence $\{a_n\}$ where $a_n := \frac13(2p_n + P_n)$. Prove
that there is a constant C > 0 such that for all n,
\[ |a_n - \pi| \le \frac{C}{16^n}. \]
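Before proving the estimates in Problem 4, it can help to see them numerically. The sketch below is only an illustration: it evaluates the closed forms $p_n = 2^nM\sin\theta_n$ and $P_n = 2^nM\tan\theta_n$ in floating point and prints ratios of consecutive errors; the function names are hypothetical, and the run of course proves nothing.

```python
import math

def errors(M, n):
    """Return (|p_n - pi|, |P_n - pi|, |a_n - pi|) with a_n = (2 p_n + P_n) / 3."""
    theta = math.pi / (2 ** n * M)
    p = 2 ** n * M * math.sin(theta)
    P = 2 ** n * M * math.tan(theta)
    a = (2 * p + P) / 3
    return abs(p - math.pi), abs(P - math.pi), abs(a - math.pi)

def error_ratios(M, n):
    """Ratios error(n) / error(n + 1) for the three sequences."""
    return tuple(x / y for x, y in zip(errors(M, n), errors(M, n + 1)))

for n in range(1, 6):
    print(n, [round(r, 3) for r in error_ratios(6, n)])
```

The first two ratios should approach 4 and the third 16, matching the bounds $C/4^n$ and $C/16^n$.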
CHAPTER 5
In this chapter we present a small sample of some of the most beautiful formulas
in the world. We begin in Section 5.1 where we present Viète’s formula, Wallis’
formula, and Euler’s sine expansion. Viète’s formula, due to François Viète (1540–1603),
is the infinite product
\[ \frac{2}{\pi} = \sqrt{\frac12}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12}}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}\cdots, \]
published in 1593. This is not only the first recorded infinite product [120, p. 218], it
is also the first recorded theoretically exact analytical expression for the number π
[36, p. 321]. Wallis’ formula, named after John Wallis (1616–1703), was the second
recorded infinite product [120, p. 219]:
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \frac21\cdot\frac23\cdot\frac43\cdot\frac45\cdot\frac65\cdot\frac67\cdots. \]
To explain Euler’s sine expansion, recall that if p(x) is a polynomial with nonzero
roots r1 , . . . , rn (repeated according to multiplicity), then we can factor p(x) as
p(x) = a(x−r1 )(x−r2 ) · · · (x−rn ) where a is a constant. Factoring out −r1 , . . . , −rn ,
we can write p(x) as
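\[ p(x) = b\Bigl(1-\frac{x}{r_1}\Bigr)\Bigl(1-\frac{x}{r_2}\Bigr)\cdots\Bigl(1-\frac{x}{r_n}\Bigr), \tag{5.1} \]
for another constant b. Euler noticed that the function $\frac{\sin x}{x}$ has only nonzero roots,
located at
\[ \pi,\ -\pi,\ 2\pi,\ -2\pi,\ 3\pi,\ -3\pi,\ \ldots, \]
so thinking of $\frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots$ as an (infinite) polynomial, and assuming that
(5.1) holds for such an infinite polynomial, we have (without caring about being
rigorous for the moment!)
\begin{align*}
\frac{\sin x}{x} &= b\Bigl(1-\frac{x}{\pi}\Bigr)\Bigl(1+\frac{x}{\pi}\Bigr)\Bigl(1-\frac{x}{2\pi}\Bigr)\Bigl(1+\frac{x}{2\pi}\Bigr)\Bigl(1-\frac{x}{3\pi}\Bigr)\Bigl(1+\frac{x}{3\pi}\Bigr)\cdots \\
&= b\Bigl(1-\frac{x^2}{\pi^2}\Bigr)\Bigl(1-\frac{x^2}{2^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{3^2\pi^2}\Bigr)\cdots,
\end{align*}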
238 5. SOME OF THE MOST BEAUTIFUL FORMULÆ IN THE WORLD
where b is a constant. In Section 5.1, we prove that Euler’s guess was correct (with
b = 1)! Here’s Euler’s famous formula:
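\[ \sin x = x\Bigl(1-\frac{x^2}{\pi^2}\Bigr)\Bigl(1-\frac{x^2}{2^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{3^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{4^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{5^2\pi^2}\Bigr)\cdots, \tag{5.2} \]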
which Euler proved in 1735 in his epoch-making paper De summis serierum recipro-
carum (On the sums of series of reciprocals), which was read in the St. Petersburg
Academy on December 5, 1735 and originally published in Commentarii academiae
scientiarum Petropolitanae 7, 1740, and reprinted on pp. 123–134 of Opera Omnia:
Series 1, Volume 14, pp. 73–86.
In Section 5.2 we study the Basel problem, which is the problem to determine
the exact value of $\zeta(2) = \sum_{n=1}^{\infty}\frac{1}{n^2}$. Euler, in the same 1735 paper De summis
serierum reciprocarum, proved that $\zeta(2) = \frac{\pi^2}{6}$:
\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}. \]
Euler actually gave three proofs of this formula in De summis serierum recipro-
carum, but the third one is the easiest to explain. Here it is: First, recall Euler’s
sine expansion:
\[ \frac{\sin x}{x} = \Bigl(1-\frac{x^2}{1^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{2^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{3^2\pi^2}\Bigr)\Bigl(1-\frac{x^2}{4^2\pi^2}\Bigr)\cdots. \]
If you multiply out the right-hand side you will get
\[ 1 - \frac{x^2}{\pi^2}\Bigl(\frac{1}{1^2}+\frac{1}{2^2}+\frac{1}{3^2}+\cdots\Bigr) + \cdots \]
where the dots “···” involve powers of x of degree at least four. Thus,
\[ \frac{\sin x}{x} = 1 - \frac{x^2}{\pi^2}\,\zeta(2) + \cdots. \]
Dividing the power series $\sin x = x - \frac{x^3}{3!} + \cdots$ by x we conclude that
\[ 1 - \frac{x^2}{3!} + \cdots = 1 - \frac{x^2}{\pi^2}\,\zeta(2) + \cdots \]
where “···” involves powers of x of degree at least four. Finally, equating
coefficients of $x^2$ we conclude that
\[ -\frac{1}{3!} = -\frac{\zeta(2)}{\pi^2} \implies \zeta(2) = \frac{\pi^2}{3!} = \frac{\pi^2}{6}. \]
Here is Jordan Bell’s [21] English translation of Euler’s argument from De summis
serierum reciprocarum (which was originally written in Latin):
Indeed, it having been put$^1$ y = 0, from which the fundamental
equation will turn into this$^2$
\[ 0 = s - \frac{s^3}{1\cdot2\cdot3} + \frac{s^5}{1\cdot2\cdot3\cdot4\cdot5} - \frac{s^7}{1\cdot2\cdot3\cdot4\cdot5\cdot6\cdot7} + \text{etc.}, \]
$^1$ Here, Euler set y = sin s.
$^2$ Instead of writing e.g. 1 · 2 · 3, today we would write this as 3!. However, the factorial symbol
wasn’t invented until 1808 [151], by Christian Kramp (1760–1826), more than 70 years after De
summis serierum reciprocarum was read in the St. Petersburg Academy.
The roots of this equation give all the arcs of which the sine is
equal to 0. Moreover, the single smallest root is s = 0, whereby
the equation divided by s will exhibit all the remaining arcs of
which the sine is equal to 0; these arcs will hence be the roots of
this equation
\[ 0 = 1 - \frac{s^2}{1\cdot2\cdot3} + \frac{s^4}{1\cdot2\cdot3\cdot4\cdot5} - \frac{s^6}{1\cdot2\cdot3\cdot4\cdot5\cdot6\cdot7} + \text{etc.} \]
Truly then, those arcs of which the sine is equal to 0 are$^3$
\[ p,\ -p,\ +2p,\ -2p,\ 3p,\ -3p\ \text{etc.}, \]
of which the second of the two of each pair is negative, each
of these because the equation indicates for the dimensions of s to
be even. Hence the divisors of this equation will be
\[ 1-\frac{s}{p},\quad 1+\frac{s}{p},\quad 1-\frac{s}{2p},\quad 1+\frac{s}{2p}\quad\text{etc.} \]
and by the joining of these divisors two by two it will be
\begin{align*}
1 &- \frac{s^2}{1\cdot2\cdot3} + \frac{s^4}{1\cdot2\cdot3\cdot4\cdot5} - \frac{s^6}{1\cdot2\cdot3\cdot4\cdot5\cdot6\cdot7} + \text{etc.} \\
&= \Bigl(1-\frac{s^2}{p^2}\Bigr)\Bigl(1-\frac{s^2}{4p^2}\Bigr)\Bigl(1-\frac{s^2}{9p^2}\Bigr)\Bigl(1-\frac{s^2}{16p^2}\Bigr)\ \text{etc.}
\end{align*}
It is now clear from the nature of equations for the coefficient$^4$
of ss that is $\frac{1}{1\cdot2\cdot3}$ to be equal to
\[ \frac{1}{p^2} + \frac{1}{4p^2} + \frac{1}{9p^2} + \frac{1}{16p^2} + \text{etc.} \]
In this last step, Euler says that
\[ \frac{1}{1\cdot2\cdot3} = \frac{1}{p^2} + \frac{1}{4p^2} + \frac{1}{9p^2} + \frac{1}{16p^2} + \text{etc.}, \]
which after some rearrangement is exactly the statement that $\zeta(2) = \frac{\pi^2}{6}$. Euler’s
proof reminds me of a quote by Charles Hermite (1822–1901):
There exists, if I am not mistaken, an entire world which is the
totality of mathematical truths, to which we have access only
with our mind, just as a world of physical reality exists, the one
like the other independent of ourselves, both of divine creation.
Quoted in The Mathematical Intelligencer, vol. 5, no. 4.
By the way, in this book we give eleven proofs of Euler’s formula for ζ(2)!
In Section 5.2 we also prove the Gregory-Leibniz-Madhava series
\[ \frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + - \cdots. \]
This formula is usually called Leibniz’s series after Gottfried Leibniz (1646–1716)
because he is usually accredited to be the first to mention this formula in print in
1673, although James Gregory (1638–1675) probably knew about it. However, the
$^3$ Here, Euler uses p for π. The notation π for the ratio of the length of a circle to its diameter
was introduced in 1706 by William Jones (1675–1749), and only around 1736, a year after Euler
published De summis serierum reciprocarum, did Euler seem to adopt the notation π.
$^4$ Here, ss means $s^2$.
and refer to the right-hand side as an infinite product, the subject of which we’ll
thoroughly study in Chapter 7. For the purposes of this chapter, given a sequence
$a_1, a_2, a_3, \ldots$ we shall denote by $\prod_{n=1}^{\infty} a_n$ the limit
\[ \prod_{n=1}^{\infty} a_n := \lim_{n\to\infty}\prod_{k=1}^{n} a_k = \lim_{n\to\infty} a_1a_2\cdots a_n, \]
5.1. F EULER, WALLIS, AND VIÈTE 241
provided that the limit exists. We now put z = π/2 into (5.4):
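\[ \frac{2}{\pi} = \cos\frac{\pi}{2^2}\cdot\cos\frac{\pi}{2^3}\cdot\cos\frac{\pi}{2^4}\cdot\cos\frac{\pi}{2^5}\cdots = \prod_{n=1}^{\infty}\cos\frac{\pi}{2^{n+1}}. \]
We now just have to obtain formulas for $\cos\frac{\pi}{2^n}$. To do so, note that for any
0 ≤ θ ≤ π, we have
\[ \cos\frac{\theta}{2} = \sqrt{\frac12+\frac12\cos\theta}. \]
(This follows from the double angle formula: $2\cos^2 z = 1 + \cos 2z$.) Thus,
\[ \cos\frac{\theta}{2^2} = \sqrt{\frac12+\frac12\cos\frac{\theta}{2}} = \sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\cos\theta}}. \]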
Continuing this process (slang for “it can be shown by induction”), we see that
\[ \cos\frac{\theta}{2^n} = \sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\cos\theta}}}}, \tag{5.5} \]
where there are n square roots here. Therefore, putting θ = π/2 we obtain
\[ \cos\frac{\pi}{2^{n+1}} = \sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}}}, \]
where there are n square roots here. In conclusion, we have shown that
\[ \frac{2}{\pi} = \prod_{n=1}^{\infty}\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}}}, \]
where there are n square roots in the n-th factor of the infinite product; or, writing
out the infinite product, we have
\[ \frac{2}{\pi} = \sqrt{\frac12}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12}}\cdot\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}\cdots. \tag{5.6} \]
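Formula (5.6) is easy to try out, because each factor is obtained from the previous one by the half-angle step $t \mapsto \sqrt{\frac12 + \frac12 t}$. The following Python sketch is a numerical illustration under floating-point arithmetic; the name `viete_pi` is hypothetical.

```python
import math

def viete_pi(factors):
    """Approximate pi from the first `factors` factors of Viete's product (5.6)."""
    term = 0.0       # cos(pi/2); the first update below gives sqrt(1/2) = cos(pi/4)
    product = 1.0
    for _ in range(factors):
        term = math.sqrt(0.5 + 0.5 * term)   # half-angle formula: next cosine factor
        product *= term
    return 2 / product                       # (5.6): 2/pi equals the infinite product

for n in (1, 5, 10, 25):
    print(n, viete_pi(n))
```

Each extra factor roughly quarters the error, so a couple of dozen factors already reach the limits of double precision.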
5.1.2. Expansion of sine I. Our first proof of Euler’s infinite product for
sine is based on a neat identity involving tangents that we’ll present in Lemma 5.1
below. To begin we first write, for z ∈ C,
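\[ \sin z = \frac{1}{2i}\bigl(e^{iz} - e^{-iz}\bigr) = \lim_{n\to\infty}\frac{1}{2i}\Bigl\{\Bigl(1+\frac{iz}{n}\Bigr)^{n} - \Bigl(1-\frac{iz}{n}\Bigr)^{n}\Bigr\} = \lim_{n\to\infty} F_n(z), \tag{5.7} \]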
Using this lemma we can give a formal$^5$ proof of Euler’s sine expansion. From
(5.7) and Lemma 5.1 we know that for any x ∈ R,
\[ \frac{\sin x}{x} = \lim_{n\to\infty}\frac{F_n(x)}{x} = \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr), \tag{5.9} \]
where in the limit we restrict n to odd natural numbers. Thus, writing n = 2m + 1
the limit in (5.9) really means
\[ \frac{\sin x}{x} = \lim_{m\to\infty}\prod_{k=1}^{m}\Bigl(1-\frac{x^2}{(2m+1)^2\tan^2(k\pi/(2m+1))}\Bigr), \]
but we prefer the simpler form in (5.9) with the understanding that n is odd in
(5.9). We now take n → ∞ in this expression. Now,
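\begin{align*}
\lim_{n\to\infty} n^2\tan^2(k\pi/n) &= \lim_{n\to\infty} n^2\,\frac{\sin^2(k\pi/n)}{\cos^2(k\pi/n)} \\
&= \lim_{n\to\infty}\Bigl(\frac{\sin(k\pi/n)}{1/n}\cdot\frac{1}{\cos(k\pi/n)}\Bigr)^{2} \\
&= \lim_{n\to\infty}(k\pi)^2\cdot\Bigl(\frac{\sin(k\pi/n)}{k\pi/n}\cdot\frac{1}{\cos(k\pi/n)}\Bigr)^{2} \\
&= k^2\pi^2,
\end{align*}
where we used that $\lim_{z\to0}\frac{\sin z}{z} = 1$ and cos(0) = 1. Hence,
\[ \lim_{n\to\infty}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr) = 1-\frac{x^2}{k^2\pi^2}, \tag{5.10} \]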
thus, formally evaluating the limit in (5.9), we see that
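\begin{align*}
\frac{\sin x}{x} &= \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr) \\
&= \prod_{k=1}^{\infty}\lim_{n\to\infty}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr) \\
&= \prod_{k=1}^{\infty}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr),
\end{align*}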
which is Euler’s result. Unfortunately, there is one issue with this argument; it
occurs in switching the limit with the product:
\[ \lim_{n\to\infty}\prod_{k=1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr) = \prod_{k=1}^{\infty}\lim_{n\to\infty}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr). \tag{5.11} \]
See Problem 2 for an example where such an interchange leads to a wrong answer.
In Section 7.3 of Chapter 7 we’ll learn a Tannery’s theorem for infinite products,
$^5$ “Formal” in mathematics usually refers to “having the form or appearance without the
substance or essence,” which is the 5-th entry for “formal” in Webster’s 1828 dictionary. This is
very different from the common use of “formal”: “according to form; agreeable to established mode;
regular; methodical,” which is the first entry in Webster’s 1828 dictionary. Elaborating on the
mathematician’s use of “formal,” it means something like “a symbolic manipulation or expression
presented without paying attention to correctness”.
from which we can easily deduce that (5.11) does indeed hold. However, we’ll
leave Tannery’s theorem for products until Chapter 7 because we can easily justify
(5.11) in a very elementary (although a little long-winded) way, which we do in the
following theorem.
Theorem 5.2 (Euler’s theorem). For any x ∈ R we have
\[ \sin x = x\prod_{k=1}^{\infty}\Bigl(1-\frac{x^2}{\pi^2k^2}\Bigr). \]
Proof. We just have to verify the formula (5.11), which in view of (5.10) is
equivalent to the equality
\[ \lim_{n\to\infty} p_n = \prod_{k=1}^{\infty}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr), \]
where
\[ p_n = \prod_{k=1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr). \]
The limit in the case x = 0 is easily checked to hold so we can (and henceforth do)
fix x ≠ 0.
Step 1: We begin by finding some nice estimates on the quotient $\frac{x^2}{n^2\tan^2(k\pi/n)}$.
In Lemma 4.55 back in Section 4.10, we proved that
\[ \theta < \tan\theta, \quad\text{for } 0 < \theta < \pi/2. \]
In particular, if n ∈ N is odd and $1 \le k \le \frac{n-1}{2}$, then
\[ \frac{k\pi}{n} \le \frac{n-1}{2}\cdot\frac{\pi}{n} < \frac{\pi}{2}, \]
so
\[ \frac{x^2}{n^2\tan^2(k\pi/n)} < \frac{x^2}{n^2(k\pi)^2/n^2} = \frac{x^2}{k^2\pi^2}. \tag{5.12} \]
Step 2: We now break up $p_n$ in a nice way. Choose $m < \frac{n-1}{2}$ and let us break
up the product $p_n$ from k = 1 to m and then from m + 1 to $\frac{n-1}{2}$:
\[ p_n = \prod_{k=1}^{m}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr)\cdot\prod_{k=m+1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr). \tag{5.13} \]
We shall use (5.12) to find estimates on the second product in (5.13). Choose m
and n large enough such that $\frac{x^2}{m^2\pi^2} < 1$. Then it follows from (5.12) that
\[ \frac{x^2}{n^2\tan^2(k\pi/n)} < \frac{x^2}{k^2\pi^2} < 1 \quad\text{for } k = m+1, m+2, \ldots, \frac{n-1}{2}. \]
In particular,
\[ 0 < 1-\frac{x^2}{n^2\tan^2(k\pi/n)} < 1 \quad\text{for } k = m+1, m+2, \ldots, \frac{n-1}{2}. \]
Hence,
\[ 0 < \prod_{k=m+1}^{\frac{n-1}{2}}\Bigl(1-\frac{x^2}{n^2\tan^2(k\pi/n)}\Bigr) < 1 \]
or after rearrangement,
\[ -s_m\prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr) \le \frac{\sin x}{x} - \prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr) \le 0. \]
Our goal is to take m → ∞ here, but before doing so we need the following estimate
on the right-hand side:
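\begin{align*}
\Bigl|\prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr)\Bigr| &\le \prod_{k=1}^{m}\Bigl(1+\frac{x^2}{k^2\pi^2}\Bigr) \\
&\le \prod_{k=1}^{m} e^{\frac{x^2}{k^2\pi^2}} \quad (\text{since } 1+t \le e^t \text{ for any } t \in \mathbb{R}) \\
&= e^{\sum_{k=1}^{m}\frac{x^2}{k^2\pi^2}} \quad (\text{since } e^a\cdot e^b = e^{a+b}) \\
&\le e^{L},
\end{align*}
where $L = \sum_{k=1}^{\infty}\frac{x^2}{k^2\pi^2}$, a finite constant the exact value of which is not important.
(Note that $\sum_{k=1}^{\infty}\frac{1}{k^2}$ converges by the p-test with p = 2.) Thus,
\[ \Bigl|\prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr)\Bigr| \le e^{L}, \]
and so,
\[ \Bigl|\frac{\sin x}{x} - \prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr)\Bigr| \le |s_m|\,e^{L}. \tag{5.16} \]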
Recalling that $s_m = \frac{x^2}{\pi^2}\sum_{k=m+1}^{\infty}\frac{1}{k^2}$ and $\sum_{k=1}^{\infty}\frac{1}{k^2}$ converges (the p-test with p = 2),
by the Cauchy Criterion for series we know that $\lim_{m\to\infty} s_m = 0$. Thus, it follows
from (5.16) that
\[ \Bigl|\frac{\sin x}{x} - \prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr)\Bigr| \]
can be made as small as we desire by taking m larger and larger. This, by definition
of limit, means that
\[ \frac{\sin x}{x} = \lim_{m\to\infty}\prod_{k=1}^{m}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr), \]
We remark that Euler’s sine expansion also holds for all complex z ∈ C (and
not just real x ∈ R), but we’ll wait for Section 7.3 of Chapter 7 for the proof of the
complex version.
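To get a feel for Theorem 5.2, one can compare partial products with sin x directly. The sketch below is only a floating-point illustration; `sine_product` is a hypothetical name, and the slow decay of the error (roughly like 1/m) reflects the tail $\sum_{k>m} 1/k^2$ that appears in the proof.

```python
import math

def sine_product(x, m):
    """Partial product x * prod_{k=1}^{m} (1 - x^2 / (k^2 pi^2)) from Euler's expansion."""
    product = x
    for k in range(1, m + 1):
        product *= 1 - x * x / (k * k * math.pi ** 2)
    return product

for m in (10, 1_000, 100_000):
    print(m, sine_product(1.0, m), math.sin(1.0))
```

Compared with Viète's product, convergence here is slow: gaining one more decimal digit costs about ten times as many factors.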
Proof. To obtain the first formula, we set x = π/2 in Euler’s infinite product
expansion for sine:
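\[ \sin x = x\prod_{n=1}^{\infty}\Bigl(1-\frac{x^2}{\pi^2n^2}\Bigr) \implies 1 = \frac{\pi}{2}\prod_{n=1}^{\infty}\Bigl(1-\frac{1}{2^2n^2}\Bigr). \]
Since $1-\frac{1}{2^2n^2} = \frac{2^2n^2-1}{2^2n^2} = \frac{(2n-1)(2n+1)}{(2n)(2n)}$, we see that
\[ \frac{2}{\pi} = \prod_{n=1}^{\infty}\frac{2n-1}{2n}\cdot\frac{2n+1}{2n}. \]
Now taking reciprocals of both sides (you are encouraged to verify that the reciprocal
of an infinite product is the product of the reciprocals) we get Wallis’ first
formula. To obtain the second formula, we write the first formula as
\[ \frac{\pi}{2} = \lim_{n\to\infty}\Bigl(\frac21\Bigr)^{2}\Bigl(\frac43\Bigr)^{2}\cdots\Bigl(\frac{2n}{2n-1}\Bigr)^{2}\cdot\frac{1}{2n+1}. \]
Then taking square roots we obtain
\[ \sqrt{\pi} = \lim_{n\to\infty}\sqrt{\frac{2}{2n+1}}\,\prod_{k=1}^{n}\frac{2k}{2k-1} = \lim_{n\to\infty}\frac{1}{\sqrt{n}}\cdot\frac{1}{\sqrt{1+1/2n}}\,\prod_{k=1}^{n}\frac{2k}{2k-1}. \]
Using that $1/\sqrt{1+1/2n} \to 1$ as n → ∞ completes our proof.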
We now prove a beautiful expression for π due to Sondow [217] (which I found
on Weisstein’s website [241]). To present this formula, we first manipulate Wallis’
first formula to
\[ \frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{2n}{2n-1}\cdot\frac{2n}{2n+1} = \prod_{n=1}^{\infty}\frac{4n^2}{4n^2-1} = \prod_{n=1}^{\infty}\Bigl(1+\frac{1}{4n^2-1}\Bigr). \]
Second, using partial fractions we observe that
\[ \sum_{n=1}^{\infty}\frac{1}{4n^2-1} = \frac12\sum_{n=1}^{\infty}\Bigl(\frac{1}{2n-1}-\frac{1}{2n+1}\Bigr) = \frac12\cdot 1 = \frac12, \]
since the sum telescopes (see e.g. the telescoping series theorem, Theorem 3.24).
Dividing these two formulas, we get
\[ \pi = \frac{\displaystyle\prod_{n=1}^{\infty}\Bigl(1+\frac{1}{4n^2-1}\Bigr)}{\displaystyle\sum_{n=1}^{\infty}\frac{1}{4n^2-1}}, \]
quite astonishing!
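Sondow's quotient can be checked numerically by truncating the product and the sum at the same index. This is a minimal Python sketch for illustration; the name `sondow_ratio` and the choice to truncate both at the same point are assumptions of the example, not part of the formula.

```python
import math

def sondow_ratio(terms):
    """Quotient of the truncated product and truncated sum over n = 1..terms."""
    product, total = 1.0, 0.0
    for n in range(1, terms + 1):
        t = 1 / (4 * n * n - 1)
        product *= 1 + t      # partial product, tends to pi/2
        total += t            # partial sum, tends to 1/2
    return product / total

for terms in (10, 1_000, 100_000):
    print(terms, sondow_ratio(terms), math.pi)
```

The quotient approaches π slowly, with an error of the order of 1/terms, as one would expect from the Wallis product.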
Exercises 5.1.
1. Here are some Viète-Wallis products from [175, 178].
(i) From the formulas (5.3), (5.5), and Euler’s sine expansion, prove that for any
x ∈ R and p ∈ N we have
\[ \frac{\sin x}{x} = \prod_{k=1}^{p}\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\cos x}}}}\;\cdot\;\prod_{n=1}^{\infty}\frac{2^p\pi n - x}{2^p\pi n}\cdot\frac{2^p\pi n + x}{2^p\pi n}, \]
where there are k square roots in the k-th factor of the product $\prod_{k=1}^{p}$.
(ii) Setting x = π/2 in (i), show that
\[ \frac{2}{\pi} = \prod_{k=1}^{p}\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12}}}}}\;\cdot\;\prod_{n=1}^{\infty}\frac{2^{p+1}n-1}{2^{p+1}n}\cdot\frac{2^{p+1}n+1}{2^{p+1}n}, \]
where there are k square roots in the k-th factor of the product $\prod_{k=1}^{p}$.
(iii) Setting x = π/6 in (i), show that
\[ \frac{3}{\pi} = \prod_{k=1}^{p}\sqrt{\frac12+\frac12\sqrt{\frac12+\frac12\sqrt{\frac12+\cdots+\frac12\sqrt{\frac12+\frac12\cdot\frac{\sqrt{3}}{2}}}}}\;\cdot\;\prod_{n=1}^{\infty}\frac{3\cdot2^{p+1}n-1}{3\cdot2^{p+1}n}\cdot\frac{3\cdot2^{p+1}n+1}{3\cdot2^{p+1}n}, \]
where there are k square roots in the k-th factor of the product $\prod_{k=1}^{p}$.
(iv) Experiment with two other values of x to derive other Viète-Wallis-type formulas.
2. Suppose for each n ∈ N we are given a finite product
\[ \prod_{k=1}^{a_n} f_k(n), \]
However, this equality is not always true. Indeed, prove that (5.17) is false for the
example $a_n = n$ and $f_k(n) = 1 + \frac{1}{n}$.
3. Prove (5.14) using induction on p.
4. Prove the following splendid formula:
\[ \sqrt{\pi} = \lim_{n\to\infty}\frac{(n!)^2\,2^{2n}}{(2n)!\,\sqrt{n}}. \]
Suggestion: Wallis’ formula is hidden here.
5. (cf. [22]) In this problem we give an elementary proof of the following interesting
identity: For any n that is a power of 2 and for any x ∈ R we have
\[ \sin x = n\,\sin\frac{x}{n}\,\cos\frac{x}{n}\prod_{k=1}^{\frac{n}{2}-1}\Bigl(1-\frac{\sin^2(x/n)}{\sin^2(k\pi/n)}\Bigr). \tag{5.18} \]
(i) Prove that for any x ∈ R,
\[ \sin x = 2\,\sin\frac{x}{2}\,\sin\frac{\pi+x}{2}. \]
5.2. F EULER, GREGORY, LEIBNIZ, AND MADHAVA 249
(iv) Prove the identity $\sin(\theta+\varphi)\sin(\theta-\varphi) = \sin^2\theta - \sin^2\varphi$ and use this to conclude
that the formula in (iii) equals
\[ \sin x = 2^{n-1}\,\sin\frac{x}{n}\,\cos\frac{x}{n}\prod_{1\le k<\frac{n}{2}}\Bigl(\sin^2\frac{k\pi}{n} - \sin^2\frac{x}{n}\Bigr). \]
(v) By considering what happens as x → 0 in the formula in (iv), prove that for n a
power of 2, we have
\[ n = 2^{n-1}\prod_{1\le k<\frac{n}{2}}\sin^2\frac{k\pi}{n}. \]
5.2.1. Proof I of Euler’s formula for $\pi^2/6$. In 1644, the Italian mathematician
Pietro Mengoli (1625–1686) posed the question: What’s the value of the
sum
\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots\,? \]
This problem was made popular through Jacob (Jacques) Bernoulli (1654–1705)
when he wrote about it in 1689 and was solved by Leonhard Euler (1707–1783) in
1735. Bernoulli was so baffled by the unknown value of the series that he wrote
If somebody should succeed in finding what till now withstood
our efforts and communicate it to us, we shall be much obliged
to him. [47, p. 73], [252, p. 345].
Before Euler’s solution to this request, known as the Basel problem (Bernoulli lived
in Basel, Switzerland), this problem eluded many of the great mathematicians of
that day: In 1742, Euler wrote
Jacob Bernoulli does mention those series, but confesses that,
in spite of all his efforts, he could not get through, so that Joh.
Bernoulli, de Moivre and Stirling, great authorities in such mat-
ters, were highly surprised when I told them that I had found the
sum of ζ(2), and even of ζ(n) for n even. [237, pp. 262-63].
(We shall consider ζ(n) for n even in the next section.) Needless to say, it shocked
the mathematical community when Euler found the sum to be $\pi^2/6$; in the introduction
to his famous 1735 paper De summis serierum reciprocarum (On the sums
of series of reciprocals) where he first proves that $\zeta(2) = \pi^2/6$, Euler writes:
So much work has been done on the series ζ(n) that it seems
hardly likely that anything new about them may still turn up . . .
I, too, in spite of repeated efforts, could achieve nothing more
than approximate values for their sums . . . Now, however, quite
unexpectedly, I have found an elegant formula for ζ(2), depend-
ing upon the quadrature of the circle [i.e., upon π] [237, p. 261].
For more on various solutions to the Basel problem, see [109], [49], [195], and for
more on Euler, see [11], [119]. On the side is a picture of a Swiss Franc banknote
honoring Euler.
We already saw Euler’s original argument in the introduction
to this chapter; we shall now make his argument rigorous. First,
we claim that for any nonnegative real numbers a1 , a2 , . . . , an ≥
0, we have
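\[ 1 - \sum_{k=1}^{n} a_k \le \prod_{k=1}^{n}(1-a_k) \le 1 - \sum_{k=1}^{n} a_k + \sum_{1\le i<j\le n} a_ia_j. \tag{5.19} \]
\[ 1 - \sum_{k=1}^{n}\frac{x^2}{k^2\pi^2} \le \prod_{k=1}^{n}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr) \le 1 - \sum_{k=1}^{n}\frac{x^2}{k^2\pi^2} + \sum_{1\le i<j\le n}\frac{x^2}{i^2\pi^2}\cdot\frac{x^2}{j^2\pi^2}. \]
\[ 1 - \frac{x^2}{\pi^2}\sum_{k=1}^{n}\frac{1}{k^2} \le \prod_{k=1}^{n}\Bigl(1-\frac{x^2}{k^2\pi^2}\Bigr) \le 1 - \frac{x^2}{\pi^2}\sum_{k=1}^{n}\frac{1}{k^2} + \frac{x^4}{\pi^4}\sum_{1\le i<j\le n}\frac{1}{i^2j^2}. \tag{5.20} \]
Let us put
\[ \zeta_n(2) = \sum_{k=1}^{n}\frac{1}{k^2} \quad\text{and}\quad \zeta_n(4) = \sum_{k=1}^{n}\frac{1}{k^4}, \]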
We remark that the exact coefficient of $x^4$ on the right is not important, but it
might be helpful if you try Problem 7. Taking n → ∞ and using that $\zeta_n(2) \to \zeta(2)$,
$\prod_{k=1}^{n}\bigl(1-\frac{x^2}{k^2\pi^2}\bigr) \to \frac{\sin x}{x}$, and that $\zeta_n(4) \to \zeta(4)$, we obtain
where at the last step we used that $\cos(z) = \sin(\frac{\pi}{2} - z)$. Replacing z with πz, we
get for noninteger z,
\[ \frac{1}{\sin^2\pi z} = \frac14\Bigl[\frac{1}{\sin^2\frac{z\pi}{2}} + \frac{1}{\sin^2\frac{(1-z)\pi}{2}}\Bigr]. \tag{5.21} \]
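Applying (5.21) (with $z = 1/2^2$) to the right-hand side of this equation gives
\[ 1 = \frac{2}{4^2}\Bigl(\frac{1}{\sin^2\frac{\pi}{2^3}} + \frac{1}{\sin^2\frac{3\pi}{2^3}}\Bigr) = \frac{2}{4^2}\sum_{k=0}^{1}\frac{1}{\sin^2\frac{(2k+1)\pi}{2^3}}. \]
Applying (5.21) to each term $\frac{1}{\sin^2\frac{\pi}{2^3}}$ and $\frac{1}{\sin^2\frac{3\pi}{2^3}}$ gives
\begin{align*}
1 &= \frac{2}{4^2}\Bigl(\frac14\Bigl[\frac{1}{\sin^2\frac{\pi}{2^4}} + \frac{1}{\sin^2\frac{7\pi}{2^4}}\Bigr] + \frac14\Bigl[\frac{1}{\sin^2\frac{3\pi}{2^4}} + \frac{1}{\sin^2\frac{5\pi}{2^4}}\Bigr]\Bigr) \\
&= \frac{2}{4^3}\Bigl(\frac{1}{\sin^2\frac{\pi}{2^4}} + \frac{1}{\sin^2\frac{3\pi}{2^4}} + \frac{1}{\sin^2\frac{5\pi}{2^4}} + \frac{1}{\sin^2\frac{7\pi}{2^4}}\Bigr) \\
&= \frac{2}{4^3}\sum_{k=0}^{2^2-1}\frac{1}{\sin^2\frac{(2k+1)\pi}{2^4}}.
\end{align*}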
Repeatedly applying (5.21) (slang for “use induction”), we arrive at the following.
Lemma 5.4. For any n ∈ N, we have
\[ 1 = \frac{2}{4^n}\sum_{k=0}^{2^{n-1}-1}\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}. \]
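Lemma 5.4 is an exact finite identity, so it can be sanity-checked numerically for small n. The following Python sketch evaluates the right-hand side in floating point; the name `lemma_5_4_rhs` is hypothetical, and agreement with 1 up to rounding is evidence, not a proof.

```python
import math

def lemma_5_4_rhs(n):
    """Right-hand side of Lemma 5.4: (2 / 4^n) * sum of 1 / sin^2((2k+1) pi / 2^(n+1))."""
    total = sum(1 / math.sin((2 * k + 1) * math.pi / 2 ** (n + 1)) ** 2
                for k in range(2 ** (n - 1)))   # k = 0, ..., 2^(n-1) - 1
    return 2 / 4 ** n * total

for n in range(1, 8):
    print(n, lemma_5_4_rhs(n))
```

Every printed value should equal 1 up to rounding error, including the base case n = 1, where the sum has the single term $1/\sin^2(\pi/4) = 2$.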
To establish Euler’s formula, we need one more lemma.
Lemma 5.5. For 0 < x < π/2, we have
\[ -1 + \frac{1}{\sin^2 x} < \frac{1}{x^2} < \frac{1}{\sin^2 x}. \]
Proof. Taking reciprocals and squaring in the formula from Lemma 4.55: For 0 < x < π/2,
\[ \sin x < x < \tan x, \]
we get $\cot^2 x < x^{-2} < \sin^{-2} x$. Since $\cot^2 x = \cos^2 x/\sin^2 x = \sin^{-2} x - 1$, we
conclude that
\[ \frac{1}{\sin^2 x} > \frac{1}{x^2} > -1 + \frac{1}{\sin^2 x}, \qquad 0 < x < \frac{\pi}{2}, \]
which proves the lemma.
Now, observe that for $0 \le k \le 2^{n-1}-1$ we have
\[ \frac{(2k+1)\pi}{2^{n+1}} \le \frac{(2(2^{n-1}-1)+1)\pi}{2^{n+1}} = \frac{(2^n-1)\pi}{2^{n+1}} < \frac{\pi}{2}, \]
therefore using the inequality
\[ -1 + \frac{1}{\sin^2 x} < \frac{1}{x^2} < \frac{1}{\sin^2 x}, \qquad 0 < x < \frac{\pi}{2}, \]
we see that
\[ -2^{n-1} + \sum_{k=0}^{2^{n-1}-1}\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} < \sum_{k=0}^{2^{n-1}-1}\frac{1}{\bigl(\frac{(2k+1)\pi}{2^{n+1}}\bigr)^2} < \sum_{k=0}^{2^{n-1}-1}\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}. \]
Multiplying both sides by $2/4^n = 2/2^{2n}$ and using Lemma 5.4, we get
\[ -\frac{1}{2^n} + 1 < \frac{8}{\pi^2}\sum_{k=0}^{2^{n-1}-1}\frac{1}{(2k+1)^2} < 1. \]
Finally, summing over the even and odd numbers (see Problem 2a in Exercises 3.5),
we have
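\[ \sum_{n=1}^{\infty}\frac{1}{n^2} = \sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} + \sum_{n=1}^{\infty}\frac{1}{(2n)^2} = \frac{\pi^2}{8} + \frac14\sum_{n=1}^{\infty}\frac{1}{n^2} \implies \frac34\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{8}, \tag{5.22} \]
and solving for $\sum_{n=1}^{\infty} 1/n^2$, we obtain Euler’s formula:
\[ \frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots. \]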
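The even-odd splitting in (5.22) can be illustrated numerically: sum the reciprocals of the odd squares, then multiply by 4/3. The Python sketch below is a floating-point illustration with hypothetical function names.

```python
import math

def odd_reciprocal_squares(terms):
    """Partial sum of 1/(2n+1)^2 for n = 0, ..., terms - 1 (tends to pi^2/8)."""
    return sum(1 / (2 * n + 1) ** 2 for n in range(terms))

def zeta2_from_odd_terms(terms):
    """Estimate zeta(2) via (5.22): (3/4) zeta(2) equals the sum over odd squares."""
    return 4 / 3 * odd_reciprocal_squares(terms)

print(odd_reciprocal_squares(200_000), math.pi ** 2 / 8)
print(zeta2_from_odd_terms(200_000), math.pi ** 2 / 6)
```

The truncation error is roughly 1/(4·terms) for the odd sum, so this is a check of the identity rather than an efficient way to compute π².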
where
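\[ a_k(n) = \frac{2}{4^n}\cdot\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}}. \]
Let us verify the hypotheses of Tannery’s theorem. First, since $\lim_{z\to0}\frac{\sin z}{z} = 1$, we
have
\[ \lim_{n\to\infty} 2^{n+1}\sin\frac{(2k+1)\pi}{2^{n+1}} = (2k+1)\pi\cdot\lim_{n\to\infty}\frac{\sin\frac{(2k+1)\pi}{2^{n+1}}}{\frac{(2k+1)\pi}{2^{n+1}}} = (2k+1)\pi. \]
Therefore,
\begin{align*}
\lim_{n\to\infty} a_k(n) &= \lim_{n\to\infty}\frac{2}{4^n}\cdot\frac{1}{\sin^2\frac{(2k+1)\pi}{2^{n+1}}} \\
&= \lim_{n\to\infty} 8\cdot\frac{1}{\bigl(2^{n+1}\sin\frac{(2k+1)\pi}{2^{n+1}}\bigr)^2} = \frac{8}{\pi^2(2k+1)^2}.
\end{align*}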
To verify the other hypothesis of Tannery’s theorem we need the following lemma.
Lemma 5.6. There exists a constant c > 0 such that for 0 ≤ x ≤ π/2,
c x ≤ sin x.
Proof. Since $\lim_{z\to0}\frac{\sin z}{z} = 1$, the function f (x) = sin x/x is a continuous
function of x in [0, π/2] where we define f (0) := 1. Observe that f is positive on
[0, π/2] because f (0) = 1 > 0 and sin x > 0 for 0 < x ≤ π/2. Therefore, by the
max/min value theorem, f (x) ≥ f (b) > 0 on [0, π/2] for some b ∈ [0, π/2]. This
proves that c x ≤ sin x on [0, π/2] where c = f (b) > 0.
Doing the even-odd trick as we did in (5.22), we know that this formula implies
Euler’s formula for $\pi^2/6$. See Problem 5 for Proof IV, a classic proof.
Repeatedly applying (5.24), one can prove that for any n ∈ N, we have
\[ 1 = \frac{1}{2^n}\sum_{k=0}^{2^{n-1}-1}\Bigl(\cot\frac{(4k+1)\pi}{2^{n+2}} - \cot\frac{(4k+3)\pi}{2^{n+2}}\Bigr). \tag{5.25} \]
(The diligent reader will supply the details!) Since we know some nice properties
of sine from Lemma 5.6 we write the right-hand side of this identity in terms of
sine. To do so, observe that for any complex numbers z, w, not integer multiples of
π, we have
\[ \cot z - \cot w = \frac{\cos z}{\sin z} - \frac{\cos w}{\sin w} = \frac{\sin w\cos z - \cos w\sin z}{\sin z\,\sin w} = \frac{\sin(w-z)}{\sin z\,\sin w}. \]
Using this identity in (5.25), we obtain
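\[ 1 = \frac{1}{2^n}\sum_{k=0}^{2^{n-1}-1}\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}} = \sum_{k=0}^{2^{n-1}-1} a_k(n), \tag{5.26} \]
where
\[ a_k(n) = \frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}. \]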
The idea to derive Gregory-Leibniz-Madhava’s formula is to take n → ∞ in (5.26)
and use Tannery’s theorem. Let us verify the hypotheses of Tannery’s theorem.
First, to determine limn→∞ ak (n) we write
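\begin{align*}
\frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}
&= 2^3\cdot\frac{2^{n+1}\sin\frac{\pi}{2^{n+1}}}{2^{n+2}\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot 2^{n+2}\sin\frac{(4k+3)\pi}{2^{n+2}}} \\
&= \frac{8}{\pi(4k+1)(4k+3)}\cdot\frac{\dfrac{\sin(\pi/2^{n+1})}{\pi/2^{n+1}}}{\dfrac{\sin\frac{(4k+1)\pi}{2^{n+2}}}{\frac{(4k+1)\pi}{2^{n+2}}}\cdot\dfrac{\sin\frac{(4k+3)\pi}{2^{n+2}}}{\frac{(4k+3)\pi}{2^{n+2}}}}.
\end{align*}
Therefore, since $\lim_{z\to0}\frac{\sin z}{z} = 1$, we have
\[ \lim_{n\to\infty} a_k(n) = \frac{8}{\pi(4k+1)(4k+3)}. \]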
To verify the other hypothesis of Tannery’s theorem we need the following lemma.
Thus,
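\begin{align*}
|\sin z| &= \Bigl|\sum_{n=0}^{\infty}(-1)^n\frac{z^{2n+1}}{(2n+1)!}\Bigr| \le \sum_{n=0}^{\infty}\frac{|z|^{2n+1}}{(2n+1)!} \\
&\le \sum_{n=0}^{\infty}\frac{|z|}{6^n} = |z|\sum_{n=0}^{\infty}\frac{1}{6^n} = |z|\,\frac{1}{1-(1/6)} = \frac65\,|z|.
\end{align*}
Since $0 \le \frac{\pi}{2^{n+1}} \le 1$ for n ∈ N (because π < 4), by this lemma we have
\[ \sin\frac{\pi}{2^{n+1}} \le \frac65\cdot\frac{\pi}{2^{n+1}}. \tag{5.27} \]
Observe that for $0 \le k \le 2^{n-1}-1$ and $0 \le \ell \le 4$, we have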
Combining this inequality with (5.27), we see that for 0 ≤ k ≤ 2n−1 − 1, we have
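\begin{align*}
\frac{1}{2^n}\cdot\frac{\sin\frac{\pi}{2^{n+1}}}{\sin\frac{(4k+1)\pi}{2^{n+2}}\cdot\sin\frac{(4k+3)\pi}{2^{n+2}}}
&\le \frac{1}{2^n}\cdot\frac65\cdot\frac{\pi}{2^{n+1}}\cdot\frac1c\cdot\frac{2^{n+2}}{(4k+1)\pi}\cdot\frac1c\cdot\frac{2^{n+2}}{(4k+3)\pi} \\
&= \frac{6}{5c^2}\cdot\frac{8}{\pi(4k+1)(4k+3)}.
\end{align*}
It follows that for any k, n, we have
\[ |a_k(n)| \le \frac{6}{5c^2}\cdot\frac{8}{\pi(4k+1)(4k+3)} =: M_k. \]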
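Since the sum $\sum_{k=0}^{\infty} M_k$ converges, we have verified the hypotheses of Tannery’s
theorem. Hence, taking n → ∞ in (5.26), we get
\begin{align*}
1 &= \lim_{n\to\infty}\sum_{k=0}^{2^{n-1}-1} a_k(n) = \sum_{k=0}^{\infty}\lim_{n\to\infty} a_k(n) \\
&= \sum_{k=0}^{\infty}\frac{8}{\pi(4k+1)(4k+3)} \implies \frac{\pi}{4} = \sum_{k=0}^{\infty}\frac{2}{(4k+1)(4k+3)}.
\end{align*}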
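Both ingredients of this argument can be checked numerically: the finite identity (5.26), which should give exactly 1 for every n, and the limiting series for π/4. The Python sketch below is a floating-point illustration; the names `a`, `finite_sum` and `leibniz_pairs` are hypothetical.

```python
import math

def a(k, n):
    """The term a_k(n) from (5.26)."""
    d = 2 ** (n + 2)
    return (math.sin(math.pi / 2 ** (n + 1))
            / (2 ** n * math.sin((4 * k + 1) * math.pi / d)
                      * math.sin((4 * k + 3) * math.pi / d)))

def finite_sum(n):
    """Sum of a_k(n) over k = 0, ..., 2^(n-1) - 1; (5.26) says this equals 1."""
    return sum(a(k, n) for k in range(2 ** (n - 1)))

def leibniz_pairs(terms):
    """Partial sum of 2 / ((4k+1)(4k+3)), which tends to pi/4."""
    return sum(2 / ((4 * k + 1) * (4 * k + 3)) for k in range(terms))

print([round(finite_sum(n), 12) for n in range(1, 8)])
print(leibniz_pairs(100_000), math.pi / 4)
```

Since $\frac{2}{(4k+1)(4k+3)} = \frac{1}{4k+1} - \frac{1}{4k+3}$, the limiting series is the Gregory-Leibniz-Madhava series with consecutive terms paired.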
4. (Partial fraction expansion of 1/ sin2 x, Proof II) Give another proof of (5.29)
using Tannery’s theorem and the formula (5.28).
5. (Euler’s sum for $\pi^2/6$, Proof IV) In this problem we derive Euler’s sum via an old
argument found in Thomas John I’Anson Bromwich’s (1875–1929) book [41, p. 218–19]
(cf. similar ideas found in [6], [179], [123], [249, Problem 145]).
(i) Recall from Problem 4 in Exercises 4.7 that for any n ∈ N and x ∈ R,
\[ \sin nx = \sum_{k=0}^{\lfloor (n-1)/2\rfloor}(-1)^k\binom{n}{2k+1}\cos^{n-2k-1}x\,\sin^{2k+1}x. \]
where in the last equality we switched the letters m and n. Combining (5.33) with
(5.34) we get
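\[ a_N = (k-1)\sum_{n=1}^{N}\frac{1}{n^{2k}} + 2\sum_{m\ne n}^{N}\frac{n^{2-2k}}{m^2-n^2}. \tag{5.35} \]
Step 2: We now find a nice expression for $2\sum_{m\ne n}^{N}\frac{n^{2-2k}}{m^2-n^2}$. To this end we
write this as
\begin{align*}
2\sum_{m\ne n}^{N}\frac{n^{2-2k}}{m^2-n^2} &= 2\sum_{n=1}^{N}\Bigl(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\Bigr)\frac{n^{2-2k}}{m^2-n^2} \\
&= \sum_{n=1}^{N}\frac{1}{n^{2k-1}}\Bigl(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\Bigr)\frac{2n}{m^2-n^2} \\
&= \sum_{n=1}^{N}\frac{1}{n^{2k-1}}\Bigl(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\Bigr)\Bigl(\frac{1}{m-n} - \frac{1}{m+n}\Bigr). \tag{5.36}
\end{align*}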
Therefore,
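\begin{align*}
\Bigl(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\Bigr)\Bigl(\frac{1}{m-n} - \frac{1}{m+n}\Bigr)
&= \Bigl(\frac1n - \sum_{m=1}^{n}\frac1m\Bigr) + \sum_{m=1}^{N-n}\frac1m + \Bigl(\frac{1}{2n} - \sum_{m=n+1}^{N+n}\frac1m\Bigr) \\
&= \frac{3}{2n} - \sum_{m=1}^{N+n}\frac1m + \sum_{m=1}^{N-n}\frac1m \tag{5.37} \\
&= \frac{3}{2n} - \sum_{m=N-n+1}^{N+n}\frac1m.
\end{align*}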
Step 3: We prove the limit (5.38) (see Problem 1 for a proof using Tannery’s
theorem). To this end, observe that
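\begin{align*}
\sum_{m=N-n+1}^{N+n}\frac1m &= \frac{1}{N-n+1} + \frac{1}{N-n+2} + \cdots + \frac{1}{N+n} \\
&\le \frac{1}{N-n+1} + \frac{1}{N-n+1} + \cdots + \frac{1}{N-n+1} = \frac{2n}{N-n+1}.
\end{align*}
Therefore,
\[ \sum_{n=1}^{N}\frac{1}{n^{2k-1}}\Bigl(\sum_{m=N-n+1}^{N+n}\frac1m\Bigr) \le 2\sum_{n=1}^{N}\frac{n}{n^{2k-1}}\cdot\frac{1}{N-n+1} \le 2\sum_{n=1}^{N}\frac{1}{n^2(N-n+1)}, \]
where recall that k ≥ 2 so that $\frac{n}{n^{2k-1}} \le \frac{1}{n^2}$.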
Using partial fractions we see that
\[ \frac{1}{n(N-n+1)} = \frac{1}{N+1}\Bigl(\frac1n + \frac{1}{N-n+1}\Bigr). \]
Thus,
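\begin{align*}
\frac{1}{n^2(N-n+1)} &= \frac1n\cdot\frac{1}{n(N-n+1)} = \frac1n\cdot\frac{1}{N+1}\Bigl(\frac1n + \frac{1}{N-n+1}\Bigr) \\
&= \frac{1}{N+1}\Bigl(\frac{1}{n^2} + \frac{1}{n(N-n+1)}\Bigr) \\
&= \frac{1}{N+1}\cdot\frac{1}{n^2} + \frac{1}{(N+1)^2}\Bigl(\frac1n + \frac{1}{N-n+1}\Bigr) \\
&\le \frac{1}{N+1}\cdot\frac{1}{n^2} + \frac{1}{(N+1)^2}(1+1) \\
&= \frac{1}{N+1}\cdot\frac{1}{n^2} + \frac{2}{(N+1)^2}.
\end{align*}
Hence,
\begin{align*}
\sum_{n=1}^{N}\frac{1}{n^{2k-1}}\Bigl(\sum_{m=N-n+1}^{N+n}\frac1m\Bigr) &\le 2\sum_{n=1}^{N}\frac{1}{n^2(N-n+1)} \\
&\le \frac{2}{N+1}\sum_{n=1}^{N}\frac{1}{n^2} + \sum_{n=1}^{N}\frac{4}{(N+1)^2} \le \frac{2\pi^2/6}{N+1} + \frac{4N}{(N+1)^2}.
\end{align*}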
5.3.2. Euler’s formula for ζ(2k). To this end, we first define a sequence
$C_1, C_2, C_3, \ldots$ by $C_1 = \frac{1}{12}$, and for k ≥ 2, we define
\[ C_k = -\frac{1}{2k+1}\sum_{\ell=1}^{k-1} C_\ell\,C_{k-\ell}. \tag{5.39} \]
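The recursion (5.39) is easy to run with exact rational arithmetic. In the Python sketch below, the function names are hypothetical, and the conversion to ζ(2k) assumes that Euler's formula takes the standard form $\zeta(2k) = (-1)^{k-1}\frac{(2\pi)^{2k}}{2}\,C_k$; that form is an assumption of this example (the formula itself is not reproduced in this passage), although it does give the known values $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$.

```python
from fractions import Fraction
import math

def c_sequence(count):
    """Return [None, C_1, ..., C_count] computed exactly from the recursion (5.39)."""
    C = [None, Fraction(1, 12)]            # index 0 unused so that C[k] is C_k
    for k in range(2, count + 1):
        s = sum(C[l] * C[k - l] for l in range(1, k))
        C.append(-s / (2 * k + 1))
    return C

def zeta_even(k, C):
    """zeta(2k) under the ASSUMED form (-1)^(k-1) (2 pi)^(2k) C_k / 2."""
    return (-1) ** (k - 1) * (2 * math.pi) ** (2 * k) * float(C[k]) / 2

C = c_sequence(5)
print(C[1:])
print(zeta_even(1, C), math.pi ** 2 / 6)
print(zeta_even(2, C), math.pi ** 4 / 90)
```

Because every $C_k$ is rational, the printout also illustrates the remark that ζ(2k) is a rational multiple of $\pi^{2k}$.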
As a side note, we remark that (5.40) shows that ζ(2k) is a rational number
times π 2k ; in particular, since π is transcendental (see, for example, [162, 163,
136]) it follows that ζ(n) is transcendental for n even. One may ask if there are
similar expressions like (5.40) for sums of the reciprocals of the odd powers (e.g.
$\zeta(3) = \sum_{n=1}^{\infty}\frac{1}{n^3}$). Unfortunately, there are no known formulas! Moreover, it is not
even known if ζ(n) is transcendental for n odd and in fact, of all odd numbers only
ζ(3) is known without a doubt to be irrational; this was proven by Roger Apéry
(1916–1994) in 1979 (see [29], [230])!
Exercises 5.3.
1. Prove (5.38) using Tannery’s theorem.
2. (Cf. [116]) Let $H_n = \sum_{m=1}^{n}\frac{1}{m}$, the n-th partial sum of the harmonic series. In this
problem we prove the equalities:
\[ \zeta(3) = \sum_{n=1}^{\infty}\frac{1}{n^3} = \sum_{n=1}^{\infty}\frac{H_n}{(n+1)^2} = \frac12\sum_{n=1}^{\infty}\frac{H_n}{n^2}. \tag{5.41} \]
where the notation $\sum_{m,n=1}^{N}$ is as in the proof of Williams’ theorem. Suggestion:
For the first equality, use that $\frac{1}{mn(m+n)} = \frac{1}{m^2}\bigl(\frac{1}{n} - \frac{1}{m+n}\bigr)$.
(b) Now prove that for N ∈ N,
\[ \sum_{m,n=1}^{N}\frac{1}{mn(m+n)} = 2\sum_{m=1}^{N}\sum_{n=1}^{N}\frac{1}{m(m+n)^2}. \]
Suggestion: Use that $\frac{1}{mn(m+n)} = \frac{1}{m(m+n)^2} + \frac{1}{n(m+n)^2}$.
(c) In Part (b), instead of using n as the inner summation variable on the right-hand
side, change to k = m + n − 1 and in doing so, prove that
N N XN N m+N
X−1
X 1 X 1 X 1
=2 2
+ bN , where bN = 2 .
m,n=1
mn(m + n) m=1
m(k + 1) m=1
m(k + 1)2
k=m k=N +1
PN PN 1
PN Hk
(d) Show that m=1 =
k=m m(k+1)2 and that bN → 0 as N → ∞.
k=1 (k+1)2
Now prove (5.41).
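Before proving (5.41) it can be reassuring to see the three sums agree numerically. The sketch below simply accumulates partial sums in floating point; the function name and the number of terms are arbitrary choices, and the harmonic-number sums converge slowly (the tail is roughly $(\log N)/N$), so only a few digits should be expected to match.

```python
def check_541(terms=200_000):
    """Partial sums of the three series in (5.41), in the order written."""
    h = 0.0          # running harmonic number H_n
    zeta3 = 0.0      # sum 1/n^3
    shifted = 0.0    # sum H_n / (n+1)^2
    halved = 0.0     # (1/2) sum H_n / n^2
    for n in range(1, terms + 1):
        h += 1.0 / n
        zeta3 += 1.0 / n ** 3
        shifted += h / (n + 1) ** 2
        halved += 0.5 * h / n ** 2
    return zeta3, shifted, halved


zeta3, shifted, halved = check_541()
print(zeta3, shifted, halved)
```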
3. (Euler's sum for $\pi^2/6$, Proof VI) In this problem we prove Euler's formula for $\pi^2/6$ by carefully squaring Gregory-Leibniz-Madhava's formula for $\pi/4$; thus, taking Gregory-Leibniz-Madhava's formula as "given," we derive Euler's formula.$^7$ The proof is very much in the same spirit as the proof of Williams' formula; see Section 6.11 for another, more systematic, proof.
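(i) Given $N \in \mathbb{N}$, prove that
\[ \left(\sum_{m=0}^{N} \frac{(-1)^m}{2m+1}\right)\left(\sum_{n=0}^{N} \frac{(-1)^n}{2n+1}\right) = \sum_{n=0}^{N} \frac{1}{(2n+1)^2} + \sum_{m\ne n}^{N} \frac{(-1)^{m+n}}{(2m+1)(2n+1)}, \]
where the notation $\sum_{m\ne n}^{N}$ is as in the proof of Williams' theorem.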
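(ii) For $m \ne n$, prove that$^8$
\begin{align*}
\frac{1}{(2m+1)(2n+1)} &= \frac{\dfrac{2m+1}{2n+1} - \dfrac{2n+1}{2m+1}}{(2m+1)^2 - (2n+1)^2}\\
&= \frac{2m+1}{2n+1}\cdot\frac{1}{(2m+1)^2 - (2n+1)^2} - \frac{2n+1}{2m+1}\cdot\frac{1}{(2m+1)^2 - (2n+1)^2},
\end{align*}
then use this identity to prove that
\begin{align*}
\sum_{m\ne n}^{N} \frac{(-1)^{m+n}}{(2m+1)(2n+1)} &= 2\sum_{m\ne n}^{N} \frac{(-1)^{m+n}}{2n+1}\cdot\frac{2m+1}{(2m+1)^2 - (2n+1)^2}\\
&= 2\sum_{n=0}^{N} \frac{(-1)^n}{2n+1}\left(\sum_{m=0}^{n-1} + \sum_{m=n+1}^{N}\right) (-1)^m\,\frac{2m+1}{(2m+1)^2 - (2n+1)^2}.
\end{align*}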
$^7$ Actually, this works in reverse: We can just as well take Euler's formula as "given," and then derive Gregory-Leibniz-Madhava's formula!
$^8$ Alternatively, one can prove that $\frac{1}{(2m+1)(2n+1)} = \frac{1}{2(m-n)(2n+1)} + \frac{1}{2(n-m)(2m+1)}$ and use this decomposition to simplify $\sum_{m\ne n}^{N} \frac{(-1)^{m+n}}{(2m+1)(2n+1)}$. However, if you do Problem 4 to follow, in your proof you will run into the decomposition appearing in Part (ii) above!
5.3. F EULER’S FORMULA FOR ζ(2k) 265
a formula due to Euler (no surprise!). The proof is very similar to the proof of Williams’
formula, with some twists of course. You may proceed as follows.
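(i) For $N \in \mathbb{N}$, define
\[ a_N = \sum_{\ell=1}^{k-2}\left(\sum_{m=1}^{N} \frac{1}{m^{k-\ell}}\right)\left(\sum_{n=1}^{N} \frac{1}{n^{\ell+1}}\right) = \sum_{m,n=1}^{N}\,\sum_{\ell=1}^{k-2} \frac{1}{m^{k-\ell}\, n^{\ell+1}}. \]
Summing the geometric sum $\sum_{\ell=1}^{k-2} \frac{1}{m^{k-\ell} n^{\ell+1}} = \frac{1}{m^k n}\sum_{\ell=1}^{k-2} (m/n)^{\ell}$, prove that
\[ a_N = (k-2)\sum_{n=1}^{N} \frac{1}{n^{k+1}} + 2\sum_{m\ne n}^{N} \frac{1}{n^{k-1}\, m\,(m-n)}, \]
where the notation $\sum_{m\ne n}^{N}$ is as in the proof of Williams' theorem.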
(ii) Prove that
\[ \sum_{m\ne n}^{N} \frac{1}{n^{k-1}\, m\,(m-n)} = \sum_{n=1}^{N} \frac{1}{n^{k}}\left(\sum_{m=1}^{n-1} + \sum_{m=n+1}^{N}\right)\left(\frac{1}{m-n} - \frac{1}{m}\right). \]
Extracurricular activities
CHAPTER 6
introduced in Section 4.6. Amongst many other things, in this chapter we'll see how to write some well-known constants in terms of the Riemann zeta function; e.g. we'll derive the following neat formula for our friend log 2 (§ 6.5):
\[ \log 2 = \sum_{n=2}^{\infty} \frac{1}{2^n}\,\zeta(n), \]
and two more formulas involving our most delicious friend π (see §'s 6.10 and 6.11):
\[ \pi = \sum_{n=1}^{\infty} \frac{3^n - 1}{4^n}\,\zeta(n+1), \qquad \frac{\pi^2}{6} = \zeta(2) = \sum_{n=1}^{\infty} \frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots, \]
and Machin's formula which started the "decimal place race" of computing π (§ 6.10):
\[ \pi = 4\left(4\arctan\frac{1}{5} - \arctan\frac{1}{239}\right) = 4\sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}\left(\frac{4}{5^{2n+1}} - \frac{1}{239^{2n+1}}\right). \]
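Two of these formulas can be spot-checked in a few lines. The sketch below is only a numerical sanity check: the helper `zeta` is a hypothetical approximation (a partial sum plus a standard integral estimate of the tail), not a library routine, and the truncation points are arbitrary.

```python
from math import atan, log, pi


def zeta(s, cutoff=10_000):
    """Approximate zeta(s) for integer s >= 2 by a partial sum plus the
    tail estimate N^(1-s)/(s-1) - N^(-s)/2 (first Euler-Maclaurin terms)."""
    head = sum(float(n) ** -s for n in range(1, cutoff + 1))
    tail = float(cutoff) ** (1 - s) / (s - 1) - 0.5 * float(cutoff) ** -s
    return head + tail


# log 2 = sum_{n >= 2} zeta(n) / 2^n, truncated after n = 60
log2_from_zeta = sum(zeta(n) / 2 ** n for n in range(2, 61))

# Machin's formula, in closed form and as a truncated series
machin_closed = 4 * (4 * atan(1 / 5) - atan(1 / 239))
machin_series = 4 * sum(
    (-1) ** n / (2 * n + 1) * (4 / 5 ** (2 * n + 1) - 1 / 239 ** (2 * n + 1))
    for n in range(15)
)

print(log2_from_zeta, log(2))
print(machin_closed, machin_series, pi)
```

Fifteen terms of Machin's series already exhaust double precision, which hints at why it was the formula of choice for hand computation.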
converge or diverge? For the answer, see Section 6.7. In elementary calculus, you have probably never seen the power series representations of tangent and secant. This is because these series are somewhat sophisticated mathematically speaking. In Section 6.8 we shall derive the power series representations
\[ \tan z = \sum_{n=1}^{\infty} (-1)^{n-1}\,\frac{2^{2n}(2^{2n}-1)\,B_{2n}}{(2n)!}\, z^{2n-1}, \]
and
\[ \sec z = \sum_{n=0}^{\infty} (-1)^{n}\,\frac{E_{2n}}{(2n)!}\, z^{2n}. \]
Here, the B2n ’s are called “Bernoulli numbers” and the E2n ’s are called “Euler
numbers,” which are certain numbers having extraordinary properties. Although
you’ve probably never seen the tangent and secant power series, you might have
seen the logarithmic, binomial, and arctangent series:
\[ \log(1+z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}\, z^n, \qquad (1+z)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n, \qquad \arctan z = \sum_{n=0}^{\infty} (-1)^n\,\frac{z^{2n+1}}{2n+1}, \]
where α ∈ R. You most likely used calculus (derivatives and integrals) to derive
these formulæ. In Section 6.9 we shall derive these formulæ without any calculus.
Finally, in Sections 6.10 and 6.11 we derive many incredible and awe-inspiring
formulæ involving π. In particular, we again look at the Basel problem.
Chapter 6 objectives: The student will be able to . . .
• determine the convergence, and radius and interval of convergence, for an infinite
series and power series, respectively, using various tests, e.g. Dirichlet, Abel,
ratio, root, and others.
• apply Cauchy’s double series theorem and know how it relates to rearrangement,
and multiplication and composition of power series.
• identify series formulæ for the various elementary functions (binomial, arctan-
gent, etc.) and for π.
6.1. SUMMATION BY PARTS, BOUNDED VARIATION, AND ALTERNATING SERIES 271
Theorem 6.1 (Summation by parts). For any complex sequences $\{a_n\}$ and $\{b_n\}$, we have
\[ \sum_{k=m}^{n} b_{k+1}(a_{k+1} - a_k) + \sum_{k=m}^{n} a_k (b_{k+1} - b_k) = a_{n+1}b_{n+1} - a_m b_m. \]
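Since the formula is a finite identity (each pair of summands telescopes to $a_{k+1}b_{k+1} - a_k b_k$), it can be tested on arbitrary data. The sketch below does this for random complex sequences; the function name, the seed and the index range are illustrative assumptions.

```python
import random


def summation_by_parts_gap(a, b, m, n):
    """Absolute difference between the two sides of the summation by parts
    formula for sequences a, b (lists indexed from 0) and limits m <= n."""
    lhs = sum(b[k + 1] * (a[k + 1] - a[k]) for k in range(m, n + 1))
    lhs += sum(a[k] * (b[k + 1] - b[k]) for k in range(m, n + 1))
    rhs = a[n + 1] * b[n + 1] - a[m] * b[m]
    return abs(lhs - rhs)


rng = random.Random(0)
a = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
b = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(40)]
gap = summation_by_parts_gap(a, b, 3, 30)
print(gap)  # should be zero up to rounding error
```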
Proof. Applying the summation by parts formula to the sequences $\{s_n\}$ and $\{b_n\}$, we obtain
\[ \sum_{k=m}^{n-1} b_{k+1}(s_{k+1} - s_k) + \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k) = s_n b_n - s_m b_m. \]
Since $a_{k+1} = s_{k+1} - s_k$, we conclude that
\[ \sum_{k=m}^{n-1} b_{k+1}\, a_{k+1} + \sum_{k=m}^{n-1} s_k (b_{k+1} - b_k) = s_n b_n - s_m b_m. \]
Replacing $k$ with $k - 1$ in the first sum and bringing the second sum to the right, we get our result.
Summation by parts is a very useful tool. We shall apply it to find sums of
powers of integers (cf. [254], [77]); see the exercises for more applications.
1Abel has left mathematicians enough to keep them busy for 500 years. Charles Hermite
(1822–1901), in “Calculus Gems” [210].
272 6. ADVANCED THEORY OF INFINITE SERIES
Theorem 6.4 (Dirichlet's test). Suppose that the partial sums of the series $\sum a_n$ are uniformly bounded (although the series $\sum a_n$ may not converge). Then for any sequence $\{b_n\}$ that is of bounded variation and converges to zero, the series $\sum a_n b_n$ converges. In particular, the series $\sum a_n b_n$ converges if $\{b_n\}$ is a monotone sequence of real numbers approaching zero.
Proof. The trick is to use Abel's lemma to rewrite $\sum a_n b_n$ in terms of an absolutely convergent series. Define $a_0 = 0$ (so that $s_0 = a_0 = 0$) and $b_0 = 0$. Then setting $m = 0$ in Abel's lemma, we can write
\[ \sum_{k=1}^{n} a_k b_k = s_n b_n - \sum_{k=1}^{n-1} s_k (b_{k+1} - b_k). \tag{6.1} \]
Now we are given two facts: The first is that the partial sums $\{s_n\}$ are bounded, say by a constant $C$, and the second is that the sequence $\{b_n\}$ is of bounded variation and converges to zero. Since $\{s_n\}$ is bounded and $b_n \to 0$ it follows that $s_n b_n \to 0$. Since $|s_n| \le C$ for all $n$ and $\{b_n\}$ is of bounded variation, the sum $\sum_{k=1}^{\infty} s_k (b_{k+1} - b_k)$ is absolutely convergent:
\[ \sum_{k=1}^{\infty} |s_k (b_{k+1} - b_k)| \le C \sum_{k=1}^{\infty} |b_{k+1} - b_k| < \infty. \]
Therefore, taking $n \to \infty$ in (6.1) it follows that the sum $\sum a_k b_k$ converges (and equals $-\sum_{k=1}^{\infty} s_k (b_{k+1} - b_k)$), and our proof is complete.
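Example 6.4. For each $x \in (0, 2\pi)$, determine the convergence of the series
\[ \sum_{n=1}^{\infty} \frac{e^{inx}}{n}. \]
where we summed $\sum_{n=1}^{m} (e^{ix})^n$ via the geometric progression (2.3). Hence,
\[ \left|\sum_{n=1}^{m} e^{inx}\right| \le \frac{|1 - e^{imx}|}{|1 - e^{ix}|} \le \frac{1 + |e^{imx}|}{|1 - e^{ix}|} = \frac{2}{|1 - e^{ix}|}. \]
\[ |1 - e^{ix}| = 2|\sin(x/2)| \quad\Longrightarrow\quad \left|\sum_{n=1}^{m} e^{inx}\right| \le \frac{1}{\sin(x/2)}. \]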
\[ \sum_{n=1}^{\infty} \frac{\cos nx}{n} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{\sin nx}{n} \quad\text{converge.} \]
Therefore,
\[ \frac{1}{2} + \sum_{n=1}^{m} \frac{\cos nx}{n} = \frac{\sin(m+1/2)x}{2m\sin(x/2)} + \sum_{n=1}^{m-1} \frac{\sin(n+1/2)x}{2\sin(x/2)}\cdot\frac{1}{n(n+1)}. \tag{6.2} \]
Since the sine is always bounded by 1 and $\sum 1/n(n+1)$ converges, it follows that as $m \to \infty$, the first term on the right of (6.2) tends to zero while the summation on the right of (6.2) converges; in particular, the series in question converges, and we get the following pretty formula:
\[ \frac{1}{2} + \sum_{n=1}^{\infty} \frac{\cos nx}{n} = \frac{1}{2\sin(x/2)} \sum_{n=1}^{\infty} \frac{\sin(n+1/2)x}{n(n+1)}, \qquad x \in (0, 2\pi). \]
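The finite identity (6.2) holds exactly for every $m$, so it can be confirmed numerically before passing to the limit. The sketch below evaluates both sides for a few sample values of $x$ in $(0, 2\pi)$; the function names and sample points are illustrative choices only.

```python
from math import cos, sin


def lhs_62(x, m):
    """Left-hand side of (6.2): 1/2 + sum_{n=1}^m cos(nx)/n."""
    return 0.5 + sum(cos(n * x) / n for n in range(1, m + 1))


def rhs_62(x, m):
    """Right-hand side of (6.2): boundary term plus the summed-by-parts series."""
    s = 2.0 * sin(x / 2.0)
    boundary = sin((m + 0.5) * x) / (m * s)
    series = sum(sin((n + 0.5) * x) / (s * n * (n + 1)) for n in range(1, m))
    return boundary + series


for x in (0.3, 2.0, 5.5):
    print(x, lhs_62(x, 50), rhs_62(x, 50))
```

The right-hand side converges much faster than the left as $m$ grows, because its terms decay like $1/n^2$ instead of $1/n$.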
In Example 6.39 of Section 6.9, we'll show that $\sum_{n=1}^{\infty} \frac{\cos nx}{n} = -\log(2\sin(x/2))$.
converges. Of course, we already knew this and we also know that the value of the
alternating harmonic series equals log 2 (see Section 4.6).
We now come to a very useful theorem for approximation purposes.
Figure 6.1. The partial sums $\{s_n\}$ jump forward and backward by the amounts given by the $a_n$'s; on the number line they appear in the order $0 < s_2 < s_4 < s < s_3 < s_1$. This picture also shows that $|s - s_1| \le a_2$, $|s - s_2| \le a_3$, $|s - s_3| \le a_4, \ldots$.
Example 6.6. Suppose that we wanted to find log 2 to two decimal places (in base 10); that is, we want to find $b_0, b_1, b_2$ in the decimal expansion $\log 2 = b_0.b_1 b_2$ where by the usual convention, $b_2$ is "rounded up" if $b_3 \ge 5$. We can determine these decimals by finding $n$ such that $s_n$, the n-th partial sum of $\log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}$, satisfies
\[ |\log 2 - s_n| < 0.005; \]
that is,
\[ \log 2 - 0.005 < s_n < \log 2 + 0.005. \]
Can you see why these inequalities guarantee that $s_n$ has a decimal expansion starting with $b_0.b_1 b_2$? In any case, according to the alternating series error estimate, we can make this inequality hold by choosing $n$ such that
\[ |a_{n+1}| = \frac{1}{n+1} < 0.005 \iff 200 < n + 1, \quad\text{so in particular } n = 500 \text{ works.} \]
With about five hours of pencil and paper work (and ten coffee breaks) we find that $s_{500} = \sum_{n=1}^{500} \frac{(-1)^{n-1}}{n} = 0.69$ to two decimal places. Thus, log 2 = 0.69 to two decimal places. A lot of work just to get two decimal places!
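The five hours of pencil work can be replaced by a short loop. The sketch below computes $s_{500}$ and compares the actual error with the alternating series bound $|a_{501}| = 1/501$; the function name is an illustrative choice.

```python
from math import log


def alt_harmonic_partial(n):
    """n-th partial sum of the alternating harmonic series 1 - 1/2 + 1/3 - ..."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


s500 = alt_harmonic_partial(500)
error = abs(log(2) - s500)
print(round(s500, 2), error, 1 / 501)
```

The actual error is about half of the bound, which is typical for alternating series whose terms decrease slowly.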
Example 6.7. (Irrationality of e, Proof II) Another nice application of the
alternating series error estimate (or rather its proof) is a simple proof that e is
irrational, cf. [180], [7]. Indeed, on the contrary, let us assume that e = m/n where
Let ε > 0. Since {bn } is of bounded variation, this sequence converges by Propo-
sition 6.3, so in particular is bounded and therefore, since sn → s, we have
$(s_n - s)b_n \to 0$ and $(s_m - s)b_m \to 0$. Thus, we can choose $N$ such that for $n, m > N$, we have $|(s_n - s)b_n| < \varepsilon/3$, $|(s_m - s)b_m| < \varepsilon/3$, and $|s_n - s| < \varepsilon/3$. Thus, for $N < m < n$, we have
\begin{align*}
\left|\sum_{k=m+1}^{n} a_k b_k\right| &\le |(s_n - s)b_n| + |(s_m - s)b_m| + \sum_{k=m}^{n-1} |(s_k - s)(b_{k+1} - b_k)|\\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3}\sum_{k=m}^{n-1} |b_{k+1} - b_k|.
\end{align*}
Finally, since $\sum |b_{k+1} - b_k|$ converges, by the Cauchy criterion for series, the sum $\sum_{k=m}^{n-1} |b_{k+1} - b_k|$ can be made less than 1 for $N$ chosen larger if necessary. Thus, for $N < m < n$, we have $\left|\sum_{k=m+1}^{n} a_k b_k\right| < \varepsilon$. This completes our proof.
Example 6.8. Back to our discussion above, we can write
\[ \sum_{n=1}^{\infty} \left(1 + \frac{1}{n}\right)^{n} \frac{e^{inx}}{n} = \sum a_n b_n, \]
where $a_n = \frac{e^{inx}}{n}$ and $b_n = (1 + \frac{1}{n})^n$. Since we already know that $\sum_{n=1}^{\infty} a_n$ converges and that $\{b_n\}$ is nondecreasing and bounded above (by $e$; see Section 3.3) and therefore is of bounded variation, Abel's test shows that the series $\sum a_n b_n$ converges.
Exercises 6.1.
1. Following Fredricks and Nelsen [77], we use summation by parts to derive neat identities for the Fibonacci numbers. Recall that the Fibonacci sequence $\{F_n\}$ is defined as $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for all $n \ge 2$.
(a) Let $a_n = F_{n+1}$ and $b_n = 1$ in the summation by parts formula (see Theorem 6.1) to derive the identity:
\[ F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1. \]
(b) Let $a_n = b_n = F_n$ in the summation by parts formula to get
\[ F_1^2 + F_2^2 + F_3^2 + \cdots + F_n^2 = F_n F_{n+1}. \]
(c) What $a_n$'s and $b_n$'s would you choose to derive the formulas:
\[ F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n}, \qquad 1 + F_2 + F_4 + F_6 + \cdots + F_{2n} = F_{2n+1}\,? \]
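The four identities in this problem are easy to confirm by brute force before deriving them. The sketch below checks them in exact integer arithmetic for small $n$; it is only a check and says nothing about which $a_n$, $b_n$ to choose in part (c). The function name is hypothetical.

```python
def check_fibonacci_identities(n_max=19):
    """Verify the four Fibonacci sum identities for n = 1..n_max."""
    F = [0, 1]
    while len(F) < 2 * n_max + 3:
        F.append(F[-1] + F[-2])
    for n in range(1, n_max + 1):
        if sum(F[1:n + 1]) != F[n + 2] - 1:
            return False
        if sum(f * f for f in F[1:n + 1]) != F[n] * F[n + 1]:
            return False
        if sum(F[2 * k - 1] for k in range(1, n + 1)) != F[2 * n]:
            return False
        if 1 + sum(F[2 * k] for k in range(1, n + 1)) != F[2 * n + 1]:
            return False
    return True


print(check_fibonacci_identities())
```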
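2. Following Fort [76], we relate limits of arithmetic means to summation by parts.
(a) Let $\{a_n\}$, $\{b_n\}$ be sequences of complex numbers and assume that $b_n \to 0$ and $\frac{1}{n}\sum_{k=1}^{n} k\,|b_{k+1} - b_k| \to 0$ as $n \to \infty$, and that for some constant $C$, we have $\left|\frac{1}{n}\sum_{k=1}^{n} a_k\right| \le C$ for all $n$. Prove that
\[ \frac{1}{n}\sum_{k=1}^{n} a_k b_k \to 0 \quad\text{as } n \to \infty. \]
(b) Apply this result to $a_n = (-1)^{n-1} n$ and $b_n = 1/\sqrt{n}$ to prove that
\[ \frac{1}{n}\left(\sqrt{1} - \sqrt{2} + \sqrt{3} - \sqrt{4} + \cdots + (-1)^{n-1}\sqrt{n}\right) = \frac{1}{n}\sum_{k=1}^{n} (-1)^{k-1}\sqrt{k} \to 0 \quad\text{as } n \to \infty. \]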
Figure 6.2. For the oscillating sequence $\{a_n\}$ (odd-indexed terms decreasing from above, even-indexed terms increasing from below), the upper dashed line represents $\limsup a_n$ and the lower dashed line represents $\liminf a_n$.
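(c) $\displaystyle\sum_{n=2}^{\infty} \frac{\cos nx}{\log n}$, (d) $\displaystyle\frac{1}{2\cdot 1} - \frac{1}{2\cdot 2} + \frac{1}{3\cdot 3} - \frac{1}{3\cdot 4} + \frac{1}{4\cdot 5} - \frac{1}{4\cdot 6} + - \cdots$
(e) $\displaystyle\sum_{n=2}^{\infty} \frac{(-1)^{n-1}}{n}\log\frac{2n+1}{n}$, (f) $\displaystyle\sum_{n=2}^{\infty} \cos nx\,\sin\frac{x}{n}$ $(x \in \mathbb{R})$, (g) $\displaystyle\sum_{n=2}^{\infty} (-1)^{n-1}\frac{\log n}{n}$.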
Note that
\[ s_{n+1} = \sup\{a_{n+1}, a_{n+2}, \ldots\} \le \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} = s_n. \]
Indeed, $s_n$ is an upper bound for $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ and hence an upper bound for $\{a_{n+1}, a_{n+2}, \ldots\}$, therefore $s_{n+1}$, being the least such upper bound, must satisfy $s_{n+1} \le s_n$. Thus, $s_1 \ge s_2 \ge \cdots \ge s_n \ge s_{n+1} \ge \cdots$ is a nonincreasing sequence. In particular, being a monotone sequence, the limit $\lim s_n$ is defined either as a real
This limit, which again is either a real number or −∞, is called the limit supre-
mum or lim sup of the sequence {an }. This name fits since lim sup an is exactly
that, a limit of supremums. If {an } is not bounded from above, then we define
lim sup an := ∞ if {an } is not bounded from above.
We define an extended real number as a real number or the symbols ∞ = +∞,
−∞. Then it is worth mentioning that lim sups always exist as an extended real
number, unlike regular limits which may not exist. For the picture in Figure 6.2
notice that
s1 = sup{a1 , a2 , a3 , . . .} = a1 ,
s2 = sup{a2 , a3 , a4 , . . .} = a3 ,
s3 = sup{a3 , a4 , a5 , . . .} = a3 ,
and so on. Thus, the sequence s1 , s2 , s3 , . . . picks out the odd-indexed terms of the
sequence a1 , a2 , . . .. Therefore, lim sup an = lim sn is the value given by the upper
dashed line in Figure 6.2. (Of course, here we are assuming that the an ’s behave
just as you think they should for $n \ge 17$.) Here are some other examples.
Example 6.9. We shall compute $\limsup a_n$ where $a_n = \frac{1}{n}$. According to the definition of lim sup, we first have to find $s_n$:
\[ s_n := \sup\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \sup\left\{\frac{1}{n}, \frac{1}{n+1}, \frac{1}{n+2}, \frac{1}{n+3}, \ldots\right\} = \frac{1}{n}. \]
Second, we take the limit of the sequence $\{s_n\}$:
\[ \limsup a_n := \lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{1}{n} = 0. \]
Notice that lim an also exists and lim an = 0, the same as the lim sup. We’ll come
back to this observation in Example 6.11 below.
Example 6.10. Consider the sequence $\{(-1)^n\}$. In this case, we know that $\lim (-1)^n$ does not exist. To find $\limsup (-1)^n$, we first compute $s_n$:
\[ s_n = \sup\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\} = \sup\{+1, -1\} = 1, \]
where we used that the set $\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\}$ is just a set consisting of the numbers $+1$ and $-1$. Hence,
\[ \limsup (-1)^n := \lim s_n = \lim 1 = 1. \]
We can also define a corresponding lim inf an , which is a limit of infimums.
To do so, assume for the moment that our generic sequence {an } is bounded from
below. Consider the sequence {ιn } where
\[ \iota_n := \inf_{k \ge n} a_k = \inf\{a_n, a_{n+1}, a_{n+2}, a_{n+3}, \ldots\}. \]
Note that
\[ \iota_n = \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} \le \inf\{a_{n+1}, a_{n+2}, \ldots\} = \iota_{n+1}, \]
since the set $\{a_n, a_{n+1}, a_{n+2}, \ldots\}$ on the left of $\le$ contains the set $\{a_{n+1}, a_{n+2}, \ldots\}$. Thus, $\iota_1 \le \iota_2 \le \cdots \le \iota_n \le \iota_{n+1} \le \cdots$ is a nondecreasing sequence. In particular, being
6.2. LIMINFS/SUPS, RATIO/ROOTS, AND POWER SERIES 281
a monotone sequence, the limit $\lim \iota_n$ is defined either as a real number or (properly divergent to) $\infty$. We define
\[ \liminf a_n := \lim_{n\to\infty} \iota_n = \lim_{n\to\infty} \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\}. \]
This limit, which exists either as a real number or $+\infty$, is called the limit infimum or lim inf of $\{a_n\}$. If $\{a_n\}$ is not bounded from below, then we define
\[ \liminf a_n := -\infty \quad\text{if $\{a_n\}$ is not bounded from below.} \]
Again, as with lim sups, lim infs always exist as extended real numbers. For the
picture in Figure 6.2 notice that
\begin{align*}
\iota_1 &= \inf\{a_1, a_2, a_3, \ldots\} = a_2,\\
\iota_2 &= \inf\{a_2, a_3, a_4, \ldots\} = a_2,\\
\iota_3 &= \inf\{a_3, a_4, a_5, \ldots\} = a_4,
\end{align*}
and so on. Thus, the sequence $\iota_1, \iota_2, \iota_3, \ldots$ picks out the even-indexed terms of the sequence $a_1, a_2, \ldots$. Therefore, $\liminf a_n = \lim \iota_n$ is the value given by the lower
dashed line in Figure 6.2. Here are some more examples.
Example 6.11. We shall compute $\liminf a_n$ where $a_n = \frac{1}{n}$. According to the definition of lim inf, we first have to find $\iota_n$:
\[ \iota_n := \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \inf\left\{\frac{1}{n}, \frac{1}{n+1}, \frac{1}{n+2}, \frac{1}{n+3}, \ldots\right\} = 0. \]
Second, we take the limit of $\iota_n$:
\[ \liminf a_n := \lim_{n\to\infty} \iota_n = \lim_{n\to\infty} 0 = 0. \]
Notice that lim an also exists and lim an = 0, the same as lim inf an , which is the
same as lim sup an as we saw in Example 6.9. We are thus led to make the following
conjecture: If lim an exists, then lim sup an = lim inf an = lim an ; this conjecture is
indeed true as we’ll see in Property (2) of Theorem 6.8.
Example 6.12. If $a_n = (-1)^n$, then
\[ \inf\{a_n, a_{n+1}, a_{n+2}, \ldots\} = \inf\{(-1)^n, (-1)^{n+1}, (-1)^{n+2}, \ldots\} = \inf\{+1, -1\} = -1. \]
Hence,
\[ \liminf (-1)^n := \lim -1 = -1. \]
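The tail suprema $s_n$ and tail infima $\iota_n$ can be tabulated for a finite stretch of a sequence, which gives a feel for the definitions. The sketch below is only a heuristic, under the loud assumption that the sequence is cut off after finitely many terms: a computer sees a finite list, so the values near the end of the list are unreliable (for $a_n = 1/n$ the computed tail infimum is the last listed term, not the true value 0). The function name is hypothetical.

```python
def tail_sup_inf(a):
    """For a finite list a, return lists (s, i) with
    s[n] = max(a[n:]) and i[n] = min(a[n:]), computed right to left."""
    s = list(a)
    i = list(a)
    for n in range(len(a) - 2, -1, -1):
        s[n] = max(a[n], s[n + 1])
        i[n] = min(a[n], i[n + 1])
    return s, i


# a_n = (-1)^n for n = 1..1000: tail sups are 1, tail infs are -1
signs = [(-1) ** n for n in range(1, 1001)]
sup_signs, inf_signs = tail_sup_inf(signs)

# a_n = 1/n for n = 1..1000: tail sup starting at a_n is 1/n
recip = [1 / n for n in range(1, 1001)]
sup_recip, inf_recip = tail_sup_inf(recip)

print(sup_signs[0], inf_signs[0], sup_recip[9], inf_recip[0])
```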
The following theorem contains the main properties of limit infimums and
supremums that we shall need in the sequel.
Theorem 6.8 (Properties of lim inf/sup). If {an } and {bn } are sequences
of real numbers, then
(1) lim sup an = − lim inf(−an ) and lim inf an = − lim sup(−an ).
(2) lim an is defined, as a real number or ±∞, if and only if lim sup an = lim inf an ,
in which case,
lim an = lim sup an = lim inf an .
(3) If an ≤ bn for all n sufficiently large, then
lim inf an ≤ lim inf bn and lim sup an ≤ lim sup bn .
(4) The following inequality properties hold:
where we used Property (2) of Theorem 6.8. Since $a > b$, we have $b < \liminf |a_n|^{1/n}$ and our proof is complete.
Here’s Cauchy’s root test, a far-reaching generalization of the root test you
learned in elementary calculus.
Theorem 6.10 (Cauchy's root test). A series $\sum a_n$ converges absolutely or diverges according as
\[ \limsup |a_n|^{1/n} < 1 \quad\text{or}\quad \limsup |a_n|^{1/n} > 1. \]
Proof. Suppose first that $\limsup |a_n|^{1/n} < 1$. Then we can choose $0 < a < 1$ such that $\limsup |a_n|^{1/n} < a$, which, by Property 4 (a) of Theorem 6.8, implies that for some $N$,
\[ n > N \implies |a_n|^{1/n} < a, \]
that is,
\[ n > N \implies |a_n| < a^n. \]
Since $a < 1$, we know that the infinite series $\sum a^n$ converges; thus by the comparison test, the sum $\sum |a_n|$ also converges, and hence $\sum a_n$ converges as well.
Assume now that $\limsup |a_n|^{1/n} > 1$. Then by Property 4 (b) of Theorem 6.8, there are infinitely many $n$'s such that $|a_n|^{1/n} > 1$. Thus, there are infinitely many $n$'s such that $|a_n| > 1$. Hence by the n-th term test, the series $\sum a_n$ cannot converge.
It is important to remark that in the other case, that is, $\limsup |a_n|^{1/n} = 1$, this test does not give information as to convergence.
Example 6.13. Consider the series $\sum 1/n$, which diverges, and observe that $\limsup |1/n|^{1/n} = \lim 1/n^{1/n} = 1$ (see Section 3.1 for the proof that $\lim n^{1/n} = 1$). However, $\sum 1/n^2$ converges, and $\limsup |1/n^2|^{1/n} = \lim (1/n^{1/n})^2 = 1$ as well, so when $\limsup |a_n|^{1/n} = 1$ it's not possible to tell whether or not the series converges.
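A quick computation makes the inconclusive case tangible: for both $a_n = 1/n$ and $a_n = 1/n^2$ the quantity $|a_n|^{1/n}$ creeps up to 1. The sketch below just evaluates it at a few indices; the function name and sample indices are illustrative.

```python
def root_test_quantity(term, n):
    """Return |a_n|^(1/n) for a given term a_n and index n."""
    return abs(term) ** (1.0 / n)


for n in (10, 1_000, 1_000_000):
    print(n, root_test_quantity(1 / n, n), root_test_quantity(1 / n ** 2, n))
```

Both columns approach 1 from below, so the root test cannot distinguish the divergent harmonic series from the convergent series $\sum 1/n^2$.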
As with the root test, in elementary calculus you learned the ratio test most
likely without proof, and, accepting by faith this test as correct you probably
used it to determine the convergence/divergence of many types of series. Here’s
d’Alembert’s ratio test, a far-reaching generalization of the ratio test2.
2Allez en avant, et la foi vous viendra [push on and faith will catch up with you]. Advice to
those who questioned the calculus by Jean Le Rond d’Alembert (1717–1783) [141]
Theorem 6.11 (d'Alembert's ratio test). A series $\sum a_n$, with $a_n$ nonzero for $n$ sufficiently large, converges absolutely or diverges according as
\[ \limsup \left|\frac{a_{n+1}}{a_n}\right| < 1 \quad\text{or}\quad \liminf \left|\frac{a_{n+1}}{a_n}\right| > 1. \]
Proof. If we set $L := \limsup |a_n|^{1/n}$, then by Lemma 6.9, we have
\[ \liminf \left|\frac{a_{n+1}}{a_n}\right| \le L \le \limsup \left|\frac{a_{n+1}}{a_n}\right|. \tag{6.6} \]
Therefore, if $\limsup \left|\frac{a_{n+1}}{a_n}\right| < 1$, then $L < 1$ too, so $\sum a_n$ converges absolutely by the root test. On the other hand, if $\liminf \left|\frac{a_{n+1}}{a_n}\right| > 1$, then $L > 1$ too, so $\sum a_n$ diverges by the root test.
We remark that in the other case, that is, $\liminf \left|\frac{a_{n+1}}{a_n}\right| \le 1 \le \limsup \left|\frac{a_{n+1}}{a_n}\right|$, this test does not give information as to convergence. Indeed, the same divergent and convergent examples used for the root test, $\sum 1/n$ and $\sum 1/n^2$, have the property that $\liminf \left|\frac{a_{n+1}}{a_n}\right| = 1 = \limsup \left|\frac{a_{n+1}}{a_n}\right|$.
Note that if $\limsup |a_n|^{1/n} = 1$, that is, the root test fails (to give a decisive answer), then setting $L = 1$ in (6.6), we see that the ratio test also fails. Thus,
\[ \text{root test fails} \implies \text{ratio test fails.} \tag{6.7} \]
Therefore, if the root test fails one cannot hope to appeal to the ratio test.
Let’s now consider some examples.
Example 6.14. First, our old friend:
\[ \exp(z) := \sum_{n=0}^{\infty} \frac{z^n}{n!}, \]
which we already know converges, but for the fun of it, let's apply the ratio test. Observe that
\[ \left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{z^{n+1}/(n+1)!}{z^n/n!}\right| = |z| \cdot \frac{n!}{(n+1)!} = \frac{|z|}{n+1}. \]
Hence,
\[ \lim \left|\frac{a_{n+1}}{a_n}\right| = 0 < 1. \]
Thus, the series for $\exp(z)$ converges absolutely for all $z \in \mathbb{C}$. This proof was a little easier than the one in Section 3.7, but then again, back then we didn't have the up-to-date technology of the ratio test that we have now. Here's an example where the test fails.
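Example 6.15. Consider the Riemann zeta function
\[ \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z}, \qquad \operatorname{Re} z > 1. \]
If $z = x + iy$ is separated into its real and imaginary parts, then
\[ |a_n|^{1/n} = \left|\frac{1}{n^z}\right|^{1/n} = \left(\frac{1}{n^x}\right)^{1/n} = \left(\frac{1}{n^{1/n}}\right)^{x}. \]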
where an ∈ C for all n (in particular, the an ’s may be real). However, we shall
focus on power series of the complex variable z although essentially everything we
mention works for real variables x.
Example 6.18. Besides the exponential function, other familiar examples of power series include the trigonometric series, $\sin z = \sum_{n=0}^{\infty} (-1)^n z^{2n+1}/(2n+1)!$, $\cos z = \sum_{n=0}^{\infty} (-1)^n z^{2n}/(2n)!$.
The convergence of power series is quite easy to analyze. First, $\sum_{n=0}^{\infty} a_n z^n = a_0 + a_1 z + a_2 z^2 + \cdots$ certainly converges if $z = 0$. For $|z| > 0$ we can use the root test: Observe that (see Problem 8 for the proof that we can take out $|z|$)
\[ \limsup |a_n z^n|^{1/n} = \limsup |z|\,|a_n|^{1/n} = |z| \limsup |a_n|^{1/n}. \]
Therefore, $\sum a_n z^n$ converges (absolutely) or diverges according as
\[ |z| \cdot \limsup |a_n|^{1/n} < 1 \quad\text{or}\quad |z| \cdot \limsup |a_n|^{1/n} > 1. \]
Therefore, if we define $0 \le R \le \infty$ by
\[ R := \frac{1}{\limsup |a_n|^{1/n}} \tag{6.11} \]
3The shortest path between two truths in the real domain passes through the complex domain.
Jacques Hadamard (1865–1963). Quoted in The Mathematical Intelligencer 13 (1991).
Figure 6.3. $\sum a_n z^n$ converges (absolutely) or diverges according as $|z| < R$ or $|z| > R$.
Suggestion: For (a), let $a_n = n^n/n!$. Prove that $\lim \frac{a_{n+1}}{a_n} = e$ and hence $\lim a_n^{1/n} = e$ as well. As a side remark, recall that (a) is called (the "weak") Stirling's formula, which we introduced in (3.29) and proved in Problem 5 of Exercises 3.3.
6. In this problem we investigate the interesting power series $\sum_{n=1}^{\infty} \frac{n!}{n^n} z^n$, where $z \in \mathbb{C}$.
(a) Prove that this series has radius of convergence $R = e$.
(b) If $|z| = e$, then the ratio and root test both fail. However, if $|z| = e$, then prove that the infinite series diverges.
(c) Investigate the convergence/divergence of $\sum_{n=1}^{\infty} \frac{n^n}{n!} z^n$, where $z \in \mathbb{C}$.
7. In this problem we investigate the interesting power series
\[ F(z) := \sum_{n=0}^{\infty} F_{n+1} z^n = F_1 + F_2 z + F_3 z^2 + \cdots, \]
that $(1 - z - z^2)F(z) = 1$. By the way, given any sequence $\{a_n\}_{n=0}^{\infty}$, the power series $\sum_{n=0}^{\infty} a_n z^n$ is called the generating function of the sequence $\{a_n\}$. Thus,
the generating function for {Fn+1 } has the closed form 1/(1 − z − z 2 ). For more
on generating functions, see the free book [246]. Also, if you’re interested in a
magic trick you can do with the formula F (z) = 1/(1 − z − z 2 ), see [176].
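The identity $(1 - z - z^2)F(z) = 1$ can be checked coefficient by coefficient: multiplying the series by $1 - z - z^2$ produces the coefficient $F_{n+1} - F_n - F_{n-1}$ on $z^n$, which should vanish for $n \ge 1$. The sketch below computes these coefficients for a truncated series and also compares a numerical value of $F(z)$ with $1/(1 - z - z^2)$ at one sample point; the function names and the sample point $z = 0.3$ (assumed to lie inside the disc of convergence) are illustrative choices.

```python
def fib_gf_product(n_terms):
    """Coefficients of z^0..z^(n_terms-1) in (1 - z - z^2) * sum F_{n+1} z^n."""
    f = [1, 1]  # f[n] = F_{n+1}
    while len(f) < n_terms:
        f.append(f[-1] + f[-2])
    coeffs = []
    for n in range(n_terms):
        c = f[n]
        if n >= 1:
            c -= f[n - 1]  # contribution of -z * F(z)
        if n >= 2:
            c -= f[n - 2]  # contribution of -z^2 * F(z)
        coeffs.append(c)
    return coeffs


def fib_gf_value(z, n_terms=200):
    """Truncated numerical value of F(z) = sum F_{n+1} z^n."""
    total, a, b, power = 0.0, 1, 1, 1.0
    for _ in range(n_terms):
        total += a * power
        a, b = b, a + b
        power *= z
    return total


print(fib_gf_product(10))
print(fib_gf_value(0.3), 1 / (1 - 0.3 - 0.3 ** 2))
```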
8. Here are some lim inf/sup problems. Let {an }, {bn } be sequences of real numbers.
(a) Prove that if c > 0, then lim inf(can ) = c lim inf an and lim sup(can ) = c lim sup an .
Here, we take the “obvious” conventions: c · ±∞ = ±∞.
(b) Prove that if c < 0, then lim inf(can ) = c lim sup an and lim sup(can ) = c lim inf an .
(c) If {an }, {bn } are bounded, prove that lim inf an + lim inf bn ≤ lim inf(an + bn ).
(d) If {an }, {bn } are bounded, prove that lim sup(an + bn ) ≤ lim sup an + lim sup bn .
9. If an → L where L is a positive real number, prove that lim sup(an · bn ) = L lim sup bn
and lim inf(an · bn ) = L lim inf bn . Here are some steps if you want them:
(i) Show that you can get the lim inf statement from the lim sup statement, hence
we can focus on the lim sup statement. We shall prove that lim sup(an bn ) ≤
L lim sup bn and L lim sup bn ≤ lim sup(an bn ).
(ii) Show that the inequality lim sup(an bn ) ≤ L lim sup bn follows if the following
statement holds: If lim sup bn < b, then lim sup(an bn ) < L b.
(iii) Now prove that if lim sup bn < b, then lim sup(an bn ) < L b. Suggestion: If
lim sup bn < b, then choose a such that lim sup bn < a < b. Using Property 4
(a) of Theorem 6.8 and the definition of L = lim an > 0, prove that there is
an N such that n > N implies bn < a and an > 0. Conclude that for n > N ,
an bn < aan . Finally, take lim sups of both sides of an bn < aan .
(iv) Show that the inequality L lim sup bn ≤ lim sup(an bn ) follows if the following
statement holds: If lim sup(an bn ) < L b, then lim sup bn < b; then prove this
statement.
10. Let {an } be a sequence of real numbers. We prove that there are monotone subse-
quences of {an } that converge to lim inf an and lim sup an . Proceed as follows:
(i) Using Theorem 3.13, show that it suffices to prove that there are subsequences
converging to lim inf an and lim sup an
(ii) Show that it suffices to show that there is a subsequence converging to lim inf an .
(iii) If lim inf an = ±∞, prove there is a subsequence converging to lim inf an .
(iv) Now assume that $a := \liminf a_n = \lim_{n\to\infty} \inf\{a_n, a_{n+1}, \ldots\} \in \mathbb{R}$. By definition of limit, show that there is an $n$ so that $a - 1 < \inf\{a_n, a_{n+1}, \ldots\} < a + 1$. Show
that we can choose an $n_1$ so that $a - 1 < a_{n_1} < a + 1$. Then show there is an $n_2 > n_1$ so that $a - \frac{1}{2} < a_{n_2} < a + \frac{1}{2}$. Continue this process.
Thus, for $n > N$, $\frac{a_n}{b_n}$ is increasing with $n$. In particular, fixing $m > N$, for all $n > m$, we have $C < a_n/b_n$, where $C = a_m/b_m$ is a constant independent of $n$. Thus, for all $n > m$, we have $C b_n < a_n$ and since the sum $\sum b_n$ diverges, the comparison test implies that $\sum a_n$ diverges too.
Note that d’Alembert’s ratio test is just Kummer’s test with bn = 1 for each n.
6.3.2. Raabe's test and "big O" notation. The following test, attributed to Joseph Ludwig Raabe (1801–1859), is just Kummer's test with the $b_n$'s making up the harmonic series: $b_n = 1/n$.
Theorem 6.15 (Raabe's test). A series $\sum a_n$ of positive terms converges or diverges according as
\[ \liminf\, n\left(\frac{a_n}{a_{n+1}} - 1\right) > 1 \quad\text{or}\quad \limsup\, n\left(\frac{a_n}{a_{n+1}} - 1\right) < 1. \]
In order to effectively apply Raabe's test, it is useful to first introduce some very handy notation. For a nonnegative function $g$, when we write $f = O(g)$ ("big O" of $g$), we simply mean that $|f| \le Cg$ for some constant $C$. In words, the big O notation just represents "a function that is in absolute value less than or equal to a constant times" $g$. This big O notation was introduced by Paul Bachmann (1837–1920) but became well-known through Edmund Landau (1877–1938) [239].
\[ \frac{x^2}{1+x} = O(x^2) \]
because $x^2/(1+x) \le x^2$ for $x \ge 0$. Thus, for $x \ge 0$,
\[ \frac{1}{1+x} = 1 - x + \frac{x^2}{1+x} \quad\Longrightarrow\quad \frac{1}{1+x} = 1 - x + O(x^2). \tag{6.14} \]
In this section, we are mostly interested in using the big O notation when dealing with natural numbers.
Three important properties of the big O notation are (1) if $f = O(ag)$ with $a \ge 0$, then $f = O(g)$, and if $f_1 = O(g_1)$ and $f_2 = O(g_2)$, then (2) $f_1 f_2 = O(g_1 g_2)$ and (3) $f_1 + f_2 = O(g_1 + g_2)$. To prove these properties, observe that if $|f| \le C(ag)$, then $|f| \le C' g$, where $C' = aC$, and that $|f_1| \le C_1 g_1$ and $|f_2| \le C_2 g_2$ imply
\[ |f_1 f_2| \le C_1 C_2\, g_1 g_2 \quad\text{and}\quad |f_1 + f_2| \le (C_1 + C_2)(g_1 + g_2); \]
Example 6.21. Thus, in view of (6.15), we have $O\!\left(\left(\frac{2}{n} + \frac{1}{4n^2}\right)^2\right) = O\!\left(\frac{1}{n}\cdot\frac{1}{n}\right) = O\!\left(\frac{1}{n^2}\right)$. Therefore, using (the right-hand part of) (6.14), we obtain
\begin{align*}
\frac{1}{1 + \frac{2}{n} + \frac{1}{4n^2}} &= 1 - \frac{2}{n} - \frac{1}{4n^2} + O\!\left(\left(\frac{2}{n} + \frac{1}{4n^2}\right)^2\right) = 1 - \frac{2}{n} + O\!\left(\frac{1}{n^2}\right) + O\!\left(\frac{1}{n^2}\right)\\
&= 1 - \frac{2}{n} + O\!\left(\frac{1}{n^2}\right),
\end{align*}
Here we can see the very "big" advantage of using the big O notation: it hides a lot of complicated junk information. For example, the left-hand side of the equation is exactly equal to (see the left-hand part of (6.14))
\[ \frac{1}{1 + \frac{2}{n} + \frac{1}{4n^2}} = 1 - \frac{2}{n} + \left[ -\frac{1}{4n^2} + \frac{\left(\frac{2}{n} + \frac{1}{4n^2}\right)^2}{1 + \frac{2}{n} + \frac{1}{4n^2}} \right], \]
so the big O notation allows us to summarize the complicated material on the right as the very simple $O\!\left(\frac{1}{n^2}\right)$.
already considered in (6.9). We saw that the ratio and root tests failed for this series; however, it turns out that Raabe's test works. To see this, let $a_n$ denote the n-th term in the "mystery" series. Then from (6.10), and expanding $1/(1 + \frac{1}{n} + \frac{1}{4n^2}) = 1 - \frac{1}{n} + O(\frac{1}{n^2})$ exactly as in Example 6.21, we see that
\begin{align*}
\frac{a_n}{a_{n+1}} = \frac{1 + \frac{5}{2n} + \frac{3}{2n^2}}{1 + \frac{1}{n} + \frac{1}{4n^2}} &= \left(1 + \frac{5}{2n} + \frac{3}{2n^2}\right)\left(1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right)\right)\\
&= \left(1 + \frac{5}{2n} + O\!\left(\frac{1}{n^2}\right)\right)\left(1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right)\right).
\end{align*}
Multiplying out the right-hand side, using the properties of big O, we get
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{5}{2n} - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right) = 1 + \frac{3}{2n} + O\!\left(\frac{1}{n^2}\right). \]
Hence,
\[ n\left(\frac{a_n}{a_{n+1}} - 1\right) = \frac{3}{2} + O\!\left(\frac{1}{n}\right) \quad\Longrightarrow\quad \lim\, n\left(\frac{a_n}{a_{n+1}} - 1\right) = \frac{3}{2} > 1, \]
4It turns out that the “mystery” sum equals π/2; see [136] for a proof.
6.3. A POTPOURRI OF RATIO-TYPE TESTS AND “BIG O” NOTATION 293
6.3.3. De Morgan and Bertrand's test. We next study a test due to Augustus De Morgan (1806–1871) and Joseph Bertrand (1822–1900). For this test, we let $b_n = 1/(n \log n)$ in Kummer's test.
Theorem 6.16 (De Morgan and Bertrand's test). Let $\{a_n\}$ be a sequence of positive numbers and define $\alpha_n$ by the equation
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + \frac{\alpha_n}{n \log n}. \]
Then $\sum a_n$ converges or diverges according as $\liminf \alpha_n > 1$ or $\limsup \alpha_n < 1$.
Proof. If we let $b_n = 1/(n \log n)$ in Kummer's test, then
\begin{align*}
\kappa_n = \frac{1}{b_n}\frac{a_n}{a_{n+1}} - \frac{1}{b_{n+1}} &= n \log n \left(1 + \frac{1}{n} + \frac{\alpha_n}{n \log n}\right) - (n+1)\log(n+1)\\
&= \alpha_n + (n+1)\left[\log n - \log(n+1)\right].
\end{align*}
Since
\[ (n+1)\left[\log n - \log(n+1)\right] = \log\left(1 - \frac{1}{n+1}\right)^{n+1} \to \log e^{-1} = -1, \]
we have
\[ \liminf \kappa_n = \liminf \alpha_n - 1 \quad\text{and}\quad \limsup \kappa_n = \limsup \alpha_n - 1. \]
Invoking Kummer's test now completes the proof.
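6.3.4. Gauss's test. Finally, to end our potpourri of tests, we conclude with Gauss' test:
Theorem 6.17 (Gauss' test). Let $\{a_n\}$ be a sequence of positive numbers and suppose that we can write
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{\xi}{n} + O\!\left(\frac{1}{n^p}\right), \]
where $\xi$ is a constant and $p > 1$. Then $\sum a_n$ converges or diverges according as $\xi > 1$ or $\xi \le 1$.
Proof. The hypotheses imply that
\[ n\left(\frac{a_n}{a_{n+1}} - 1\right) = \xi + n\,O\!\left(\frac{1}{n^p}\right) = \xi + O\!\left(\frac{1}{n^{p-1}}\right) \to \xi \]
as $n \to \infty$, where we used that $p - 1 > 0$. Thus, Raabe's test shows that the series $\sum a_n$ converges for $\xi > 1$ and diverges for $\xi < 1$. For the case $\xi = 1$, let $\frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + f_n$ where $f_n = O\!\left(\frac{1}{n^p}\right)$. Then we can write
\[ \frac{a_n}{a_{n+1}} = 1 + \frac{1}{n} + f_n = 1 + \frac{1}{n} + \frac{\alpha_n}{n \log n}, \]
where $\alpha_n = f_n\, n \log n$. If we let $p = 1 + \delta$, where $\delta > 0$, then we know that $\frac{\log n}{n^{\delta}} \to 0$ as $n \to \infty$ by Problem 8 in Exercises 4.6, so
\[ \alpha_n = f_n\, n \log n = O\!\left(\frac{1}{n^{1+\delta}}\right) n \log n = O\!\left(\frac{\log n}{n^{\delta}}\right) \quad\Longrightarrow\quad \lim \alpha_n = 0. \]
Thus, De Morgan and Bertrand's test shows that the series $\sum a_n$ diverges.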
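Example 6.23. Gauss' test originated with Gauss' study of the hypergeometric series:
\[ 1 + \frac{\alpha \cdot \beta}{1 \cdot \gamma} + \frac{\alpha(\alpha+1) \cdot \beta(\beta+1)}{2! \cdot \gamma(\gamma+1)} + \frac{\alpha(\alpha+1)(\alpha+2) \cdot \beta(\beta+1)(\beta+2)}{3! \cdot \gamma(\gamma+1)(\gamma+2)} + \cdots, \]
where $\alpha, \beta, \gamma$ are positive real numbers. We can write this as $\sum a_n$ where
\[ a_n = \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1) \cdot \beta(\beta+1)(\beta+2)\cdots(\beta+n-1)}{n! \cdot \gamma(\gamma+1)(\gamma+2)\cdots(\gamma+n-1)}. \]
Hence, for $n \ge 1$ we have
\[ \frac{a_n}{a_{n+1}} = \frac{(n+1)(\gamma+n)}{(\alpha+n)(\beta+n)} = \frac{n^2 + (\gamma+1)n + \gamma}{n^2 + (\alpha+\beta)n + \alpha\beta} = \frac{1 + \frac{\gamma+1}{n} + \frac{\gamma}{n^2}}{1 + \frac{\alpha+\beta}{n} + \frac{\alpha\beta}{n^2}}. \]
Using the handy formula from (6.14),
\[ \frac{1}{1+x} = 1 - x + \frac{x^2}{1+x}, \]
we see that (after some algebra)
\begin{align*}
\frac{a_n}{a_{n+1}} &= \left(1 + \frac{\gamma+1}{n} + \frac{\gamma}{n^2}\right)\left(1 - \frac{\alpha+\beta}{n} - \frac{\alpha\beta}{n^2} + O\!\left(\frac{1}{n^2}\right)\right)\\
&= 1 + \frac{\gamma + 1 - \alpha - \beta}{n} + O\!\left(\frac{1}{n^2}\right).
\end{align*}
Thus, the hypergeometric series converges if $\gamma > \alpha + \beta$ and diverges if $\gamma \le \alpha + \beta$.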
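The constant $\xi = \gamma + 1 - \alpha - \beta$ in Gauss' test can be observed numerically from the exact ratio $a_n/a_{n+1} = \frac{(n+1)(\gamma+n)}{(\alpha+n)(\beta+n)}$. The sketch below evaluates $n(a_n/a_{n+1} - 1)$ for growing $n$; the parameter values are arbitrary sample choices and the function name is hypothetical.

```python
def gauss_quantity(alpha, beta, gamma, n):
    """Return n * (a_n / a_{n+1} - 1) for the hypergeometric series,
    using the exact ratio (n+1)(gamma+n) / ((alpha+n)(beta+n))."""
    ratio = (n + 1) * (gamma + n) / ((alpha + n) * (beta + n))
    return n * (ratio - 1.0)


alpha, beta, gamma = 0.5, 1.5, 3.0   # sample values with gamma > alpha + beta
xi = gamma + 1 - alpha - beta        # expected limit, here 2.0
for n in (10, 1_000, 1_000_000):
    print(n, gauss_quantity(alpha, beta, gamma, n), xi)
```

With these sample values $\xi = 2 > 1$, matching the convergent case $\gamma > \alpha + \beta$.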
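Exercises 6.3.
1. Determine whether or not the following series converge.
(a) $\displaystyle\sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2^n (n+1)!}$, (b) $\displaystyle\sum_{n=1}^{\infty} \frac{3 \cdot 6 \cdot 9 \cdots (3n)}{7 \cdot 10 \cdot 13 \cdots (3n+4)}$,
(c) $\displaystyle\sum_{n=1}^{\infty} \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)}$, (d) $\displaystyle\sum_{n=1}^{\infty} \frac{2 \cdot 4 \cdot 6 \cdots (2n+2)}{1 \cdot 3 \cdot 5 \cdots (2n-1)(2n)}$.
For $\alpha, \beta \ne 0, -1, -2, \ldots$,
(e) $\displaystyle\sum_{n=1}^{\infty} \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1)}{n!}$, (f) $\displaystyle\sum_{n=1}^{\infty} \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+n-1)}{\beta(\beta+1)(\beta+2)\cdots(\beta+n-1)}$.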
(i) Suppose first that $\liminf\, n \log\frac{a_n}{a_{n+1}} > 1$. Show that there is an $a > 1$ and an $N$ such that
\[ n > N \implies a < n \log\frac{a_n}{a_{n+1}} \implies \frac{a_{n+1}}{a_n} < e^{-a/n}. \]
(ii) Using $\left(1 + \frac{1}{n}\right)^n < e$ from (3.28), the p-test, and the limit comparison test (see Problem 7 in Exercises 3.6), prove that $\sum a_n$ converges.
(iii) Similarly, prove that if $\limsup\, n \log\frac{a_n}{a_{n+1}} < 1$, then $\sum a_n$ diverges.
(iv) Using the logarithmic test, determine the convergence/divergence of
\[ \sum_{n=1}^{\infty} \frac{n!}{n^n} \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{n^n}{n!}. \]
6.4.2. Abel's limit theorem. Abel's limit theorem has to do with the following question. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ have radius of convergence $R$; this implies, in particular, that $f(x)$ is defined for all $-R < x < R$ and, by Theorem 6.19, is continuous on the interval $(-R, R)$. Let us suppose that $f(R) = \sum_{n=0}^{\infty} a_n R^n$ converges. In particular, $f(x)$ is defined for all $-R < x \le R$. Question: Is $f$ continuous on the interval $(-R, R]$, that is, is it true that
\[ \lim_{x \to R^-} f(x) = f(R)? \]
The answer to this question is "yes" and it follows from the following more general theorem due to Niels Abel; however, Abel's theorem is mostly used for the real variable case $\lim_{x \to R^-} f(x) = f(R)$ that we just described.
Theorem 6.20 (Abel’s limit theorem). Let $f(z) = \sum_{n=0}^\infty a_n z^n$ have radius of convergence R and let $z_0 \in \mathbb{C}$ with $|z_0| = R$ where the series $f(z_0) = \sum_{n=0}^\infty a_n z_0^n$ converges. Then
\[
\lim_{z \to z_0} f(z) = f(z_0)
\]
where the limit on the left is taken in such a way that |z| < R and that the ratio $\frac{|z_0 - z|}{R - |z|}$ remains bounded by a fixed constant.
Proof. By considering the limit of the function $g(z) = f(z_0 z) - f(z_0)$ as z → 1 in such a way that |z| < 1 and that the ratio |1 − z|/(1 − |z|) remains bounded by a fixed constant, we may henceforth assume that $z_0 = 1$ and that $f(z_0) = 0$ (the diligent student will check the details of this statement). With these assumptions, if we put $s_n = a_0 + a_1 + \cdots + a_n$, then $0 = f(1) = \sum_{n=0}^\infty a_n = \lim s_n$. Now observe that $a_n = s_n - s_{n-1}$, so
\[
\begin{aligned}
\sum_{k=0}^n a_k z^k &= a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n \\
&= s_0 + (s_1 - s_0) z + (s_2 - s_1) z^2 + \cdots + (s_n - s_{n-1}) z^n \\
&= s_0 (1 - z) + s_1 (z - z^2) + \cdots + s_{n-1} (z^{n-1} - z^n) + s_n z^n \\
&= s_0 (1 - z) + s_1 (1 - z) z + \cdots + s_{n-1} (1 - z) z^{n-1} + s_n z^n \\
&= (1 - z)\big( s_0 + s_1 z + \cdots + s_{n-1} z^{n-1} \big) + s_n z^n.
\end{aligned}
\]
Thus, $\sum_{k=0}^n a_k z^k = (1 - z) \sum_{k=0}^{n-1} s_k z^k + s_n z^n$. Since $s_n \to 0$ and |z| < 1 it follows that $s_n z^n \to 0$. Therefore, taking n → ∞, we obtain
\[
f(z) = \sum_{n=0}^\infty a_n z^n = (1 - z) \sum_{n=0}^\infty s_n z^n,
\]
which implies that
\[
|f(z)| \le |1 - z| \sum_{n=0}^\infty |s_n|\, |z|^n.
\]
Let us now take z → 1 in such a way that |z| < 1 and |1 − z|/(1 − |z|) < C where C > 0. Let ε > 0 be given and, since $s_n \to 0$, we can choose an integer N such that $n > N \implies |s_n| < \varepsilon/(2C)$. Define $K := \sum_{n=0}^N |s_n|$. Then we can write
\[
\begin{aligned}
|f(z)| &\le |1 - z| \sum_{n=0}^N |s_n|\, |z|^n + |1 - z| \sum_{n=N+1}^\infty |s_n|\, |z|^n \\
&\le |1 - z| \sum_{n=0}^N |s_n| \cdot 1 + |1 - z| \sum_{n=N+1}^\infty \frac{\varepsilon}{2C}\, |z|^n \\
&\le K |1 - z| + \frac{\varepsilon}{2C}\, |1 - z| \sum_{n=0}^\infty |z|^n \\
&= K |1 - z| + \frac{\varepsilon}{2C} \cdot \frac{|1 - z|}{1 - |z|} < K |1 - z| + \frac{\varepsilon}{2}.
\end{aligned}
\]
Thus, with δ := ε/(2K), we have
\[
|z - 1| < \delta \ \text{ with } |z| < 1 \text{ and } \frac{|1 - z|}{1 - |z|} < C \implies |f(z)| < \varepsilon.
\]
This completes our proof.
Notice that for z = x with 0 < x < R, we have
\[
\frac{|R - z|}{R - |z|} = \frac{|R - x|}{R - |x|} = \frac{R - x}{R - x} = 1
\]
which, in particular, is bounded by 1, so (6.16) holds under the assumptions stated. Once we prove this result at x = R, we can prove a similar result at x = −R: If $f(x) = \sum_{n=0}^\infty a_n x^n$ has radius of convergence R and $f(-R) = \sum_{n=0}^\infty a_n (-R)^n$ converges, then
\[
\lim_{x \to -R^+} f(x) = f(-R).
\]
To prove this, consider the function g(x) = f(−x), then apply (6.16) to g.
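To see Abel’s limit theorem in action, here is a minimal numerical sketch. We pick, as an assumed test case, the series $\sum_{n\ge 1} (-1)^{n-1} x^n/n$, which has radius of convergence 1, equals log(1 + x) for |x| < 1, and converges to log 2 at x = 1. The function name `f` and the truncation length are our own choices.

```python
import math


def f(x, terms=50000):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n, truncated after `terms` terms."""
    total = 0.0
    power = 1.0   # holds x^n
    sign = 1.0    # holds (-1)^(n-1)
    for n in range(1, terms + 1):
        power *= x
        total += sign * power / n
        sign = -sign
    return total


# As x increases to 1, f(x) should approach f(1) = log 2.
for x in (0.9, 0.99, 0.999):
    print(x, f(x), math.log(2) - f(x))
```

The printed differences shrink as x approaches 1 from the left, which is exactly what continuity on (−R, R] promises.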
6.4.3. The identity theorem. The identity theorem is perhaps one of the
most useful properties of power series. The identity theorem says, very roughly,
that if two power series are identical at “sufficiently many” points, then in fact, the
power series are identical everywhere!
Theorem 6.21 (Identity theorem). Let $f(z) = \sum a_n z^n$ and $g(z) = \sum b_n z^n$ have positive radii of convergence and suppose that $f(c_k) = g(c_k)$ for some nonzero sequence $c_k \to 0$. Then the power series f(z) and g(z) must be identical; that is, $a_n = b_n$ for every n = 0, 1, 2, 3, . . ..
Proof. We begin by proving that for each m = 0, 1, 2, . . ., the series
\[
f_m(z) := \sum_{n=m}^\infty a_n z^{n-m} = a_m + a_{m+1} z + a_{m+2} z^2 + a_{m+3} z^3 + \cdots
\]
Example 6.24. Suppose that $f(z) = \sum a_n z^n$ is an odd function in the sense that f(−z) = −f(z) for all z within its radius of convergence. In terms of power series, the identity f(−z) = −f(z) is
\[
\sum a_n (-1)^n z^n = \sum -a_n z^n.
\]
By the identity theorem, we must have $(-1)^n a_n = -a_n$ for each n. Thus, for n even we must have $a_n = -a_n$, that is, $a_n = 0$, and for n odd, we must have $-a_n = -a_n$, a tautology. In conclusion, we see that f is odd if and only if all coefficients of even powers vanish:
\[
f(z) = \sum_{n=0}^\infty a_{2n+1} z^{2n+1};
\]
that is, f is odd if and only if f has only odd powers in its series expansion.
Exercises 6.4.
1. Prove that $f(z) = \sum a_n z^n$ is an even function in the sense that f(−z) = f(z) for all z within its radius of convergence if and only if f has only even powers in its expansion, that is, f takes the form $f(z) = \sum_{n=0}^\infty a_{2n} z^{2n}$.
2. Recall that the binomial coefficient is $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ for 0 ≤ k ≤ n. Prove the highly nonobvious result:
\[
\binom{m+n}{k} = \sum_{j=0}^k \binom{m}{j} \binom{n}{k-j}.
\]
Suggestion: Apply the binomial formula to $(1 + z)^{m+n}$, which equals $(1 + z)^m \cdot (1 + z)^n$. Prove that
\[
\binom{2n}{n} = \sum_{k=0}^n \binom{n}{k}^2.
\]
3. Prove that $\sum_{n=1}^\infty n\, |a_n|\, r^n$ converges, where the notation is as in the proof of Theorem 6.19, using the root test. You will need Problem 9 in Exercises 6.2.
4. (Abel summability) We say that a series $\sum a_n$ is Abel summable to L if the power series $f(x) := \sum a_n x^n$ is defined for all x ∈ [0, 1) and $\lim_{x\to 1^-} f(x) = L$.
(a) Prove that if $\sum a_n$ converges to L ∈ C, then $\sum a_n$ is also Abel summable to L.
(b) Derive the following amazing formulas (properly interpreted!):
\[
1 - 1 + 1 - 1 + 1 - 1 + - \cdots =_a \frac{1}{2},
\]
\[
1 - 2 + 3 - 4 + 5 - 6 + 7 - + \cdots =_a \frac{1}{4},
\]
where $=_a$ means “is Abel summable to”. You will need Problem 6 in Exercises 3.5.
5. In this problem we continue our fascinating study of Abel summability. Let $a_0, a_1, a_2, \ldots$ be a positive nonincreasing sequence tending to zero (in particular, $\sum (-1)^{n-1} a_n$ converges by the alternating series test). Define $b_n := a_0 + a_1 + \cdots + a_n$. We shall prove the neat formula
\[
b_0 - b_1 + b_2 - b_3 + b_4 - b_5 + - \cdots =_a \frac{1}{2} \sum_{n=0}^\infty (-1)^n a_n.
\]
(i) Let $f(x) = \sum_{n=0}^\infty (-1)^n b_n x^n$. Prove that f has radius of convergence 1. Suggestion: Use the ratio test.
(ii) Let
\[
f_n(x) = \sum_{k=0}^n (-1)^k b_k x^k
\]
For $\lim_{m\to\infty} \lim_{n\to\infty} s_{mn}$ on the left, we mean to first take n → ∞ and second to take m → ∞, reversing the order for $\lim_{n\to\infty} \lim_{m\to\infty} s_{mn}$. In general, the iterated limits (6.20) may have no relationship!
Example 6.27. Consider the double sequence $s_{mn} = mn/(m + n^2)$. We have
\[
\lim_{n\to\infty} s_{mn} = \lim_{n\to\infty} \frac{mn}{m + n^2} = 0 \implies \lim_{m\to\infty} \lim_{n\to\infty} s_{mn} = \lim_{m\to\infty} 0 = 0.
\]
It may shock you, but the answer to both of these questions is “no”.
Example 6.28. For a counterexample to Question I, consider our first example $s_{mn} = mn/(m + n)^2$. We know that $\lim s_{mn}$ does not exist, but observe that
\[
\lim_{n\to\infty} s_{mn} = \lim_{n\to\infty} \frac{mn}{(m + n)^2} = 0 \implies \lim_{m\to\infty} \lim_{n\to\infty} s_{mn} = \lim_{m\to\infty} 0 = 0
\]
and
\[
\lim_{m\to\infty} s_{mn} = \lim_{m\to\infty} \frac{mn}{(m + n)^2} = 0 \implies \lim_{n\to\infty} \lim_{m\to\infty} s_{mn} = \lim_{n\to\infty} 0 = 0,
\]
so both iterated limits converge. For a counterexample to Question II, see limit (d) in Problem 1.
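A quick numerical look at Example 6.28 makes the phenomenon concrete. The sketch below (the helper name `s` is ours) evaluates $s_{mn} = mn/(m+n)^2$ with one index held fixed and the other very large, which mimics the inner limit of each iterated limit, and then along the diagonal m = n, where the value is always 1/4. Since the diagonal values do not approach 0, the double limit cannot exist.

```python
def s(m, n):
    """The double sequence s_mn = mn / (m + n)^2 from Example 6.28."""
    return m * n / (m + n) ** 2


huge = 10**12  # stands in for "n -> infinity" with m held fixed

# Inner limits: for each fixed m, s(m, n) is tiny once n is huge (and symmetrically in m).
print([s(m, huge) for m in (1, 10, 100)])
print([s(huge, n) for n in (1, 10, 100)])

# Along the diagonal the sequence is constantly 1/4.
print([s(k, k) for k in (1, 10, 1000)])
```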
However, if a double sequence converges and both iterated limits exist, then they all must equal the same number. This is the content of the following theorem, named after Alfred Pringsheim (1850–1941) (cf. [41, p. 79]).
Theorem 6.23 (Pringsheim’s theorem for sequences). If $\{s_{mn}\}$ converges and for each m, $\lim_{n\to\infty} s_{mn}$ exists and for each n, $\lim_{m\to\infty} s_{mn}$ exists, then both iterated limits exist and the equality (6.21) holds.
Proof. Let $L = \lim s_{mn}$ and let ε > 0. Then there is an N such that for all m, n > N, we have $|L - s_{mn}| < \varepsilon/2$. Taking n → ∞, we get, for m > N, $|L - \lim_{n\to\infty} s_{mn}| \le \varepsilon/2$. Hence,
\[
m > N \implies \Big| L - \lim_{n\to\infty} s_{mn} \Big| < \varepsilon.
\]
This means that $\lim_{m\to\infty} (\lim_{n\to\infty} s_{mn}) = L$. A similar argument establishes the equality with the limits of m and n reversed.
Recall that if $\{a_n\}$ is a sequence of complex numbers, then we say that $\sum a_n$ converges if the sequence $\{s_n\}$ converges, where $s_n := \sum_{k=1}^n a_k$. By analogy, we define a double series of complex numbers as follows. Let $\{a_{mn}\}$ be a double sequence of complex numbers and let
\[
s_{mn} := \sum_{i=1}^m \sum_{j=1}^n a_{ij},
\]
called the m, n-th partial sum of $\sum a_{mn}$. We say that the double series $\sum a_{mn}$ converges if the double sequence $\{s_{mn}\}$ of partial sums converges. If $\sum a_{mn}$ exists, we can ask whether or not
\[
\sum a_{mn} = \sum_{m=1}^\infty \sum_{n=1}^\infty a_{mn} = \sum_{n=1}^\infty \sum_{m=1}^\infty a_{mn}\,? \tag{6.22}
\]
Here, with $s_{mn} = \sum_{i=1}^m \sum_{j=1}^n a_{ij}$, the iterated series on the right are defined as
\[
\sum_{m=1}^\infty \sum_{n=1}^\infty a_{mn} := \lim_{m\to\infty} \lim_{n\to\infty} s_{mn} \quad\text{and}\quad \sum_{n=1}^\infty \sum_{m=1}^\infty a_{mn} := \lim_{n\to\infty} \lim_{m\to\infty} s_{mn}.
\]
Thus, (6.22) is just the equality (6.21) with $s = \sum a_{mn}$. Hence, Pringsheim’s theorem for sequences immediately implies the following.
Theorem 6.24 (Pringsheim’s theorem for series). If a double series $\sum a_{mn}$ converges and for each m, $\sum_{n=1}^\infty a_{mn}$ converges and for each n, $\sum_{m=1}^\infty a_{mn}$ converges, then both iterated series converge and the equality (6.22) holds.
We can “visualize” the iterated sums in (6.22) as follows. First, we arrange the $a_{mn}$’s in an infinite array as shown in Figure 6.4. Then for fixed m ∈ N, the sum $\sum_{n=1}^\infty a_{mn}$ is summing all the numbers in the m-th row shown in the left picture in Figure 6.4. For example, if m = 1, then $\sum_{n=1}^\infty a_{1n}$ is summing all the numbers in the first row shown in the left picture in Figure 6.4. The summation $\sum_{m=1}^\infty \sum_{n=1}^\infty a_{mn}$ is summing over all the rows (that have already been summed). Similarly, $\sum_{n=1}^\infty \sum_{m=1}^\infty a_{mn}$ is summing over all the columns. In Subsection 6.5.3 we shall study the most useful theorem on iterated sums, Cauchy’s double series theorem, which states that (6.22) always holds for absolutely convergent series. Here, a double series $\sum a_{mn}$ is said to converge absolutely if the double series of absolute values $\sum |a_{mn}|$ converges. However, before presenting Cauchy’s theorem, we first generalize summing by rows and columns to “summing by curves”.
[Figure 6.4: the infinite array of the $a_{mn}$’s, drawn twice; the left picture indicates summing along rows and the right picture summing along columns.]
is the sum of the amn ’s that are within the triangle consisting of the first k diagonals.
It is natural to refer to the limit (provided it exists)
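\[
\lim_{k\to\infty} \sum_{(m,n)\in S_k} a_{mn} = \lim_{k\to\infty} \sum_{\ell=1}^{k} \sum_{(m,n)\in T_\ell} a_{mn},
\]
as “summing by triangles”. Using that $T_\ell = \{(1, \ell), (2, \ell - 1), \ldots, (\ell, 1)\}$, we can express the summation by triangles as
\[
\sum_{k=1}^\infty \big( a_{1,k} + a_{2,k-1} + \cdots + a_{k,1} \big).
\]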
More generally, we can “sum by curves” as long as the curves increasingly fill up
the array like the squares or triangles shown in Figure 6.5. More precisely, suppose
that S1 ⊆ S2 ⊆ S3 ⊆ · · · ⊆ N × N is a nondecreasing sequence of finite sets having
the property that for any m, n there is a k such that
In the following theorem we consider the sequence $\{s_k\}$ where for each k ∈ N, $s_k$ is the finite sum
\[
s_k := \sum_{(m,n)\in S_k} a_{mn}, \tag{6.24}
\]
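\[
\begin{aligned}
&= t_{mm} - t_{nn} \\
&= (t_{mm} - L) + (L - t_{nn}) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\end{aligned}
\]
where we used (6.25). Hence, $|s_k - s_\ell| < \varepsilon$ and so $\{s_k\}$ is Cauchy.
Step 2: We now show that $\sum a_{mn}$ converges with sum equal to $s := \lim s_k$. Let ε > 0 be given and choose N such that (6.25) holds with ε/2 replaced with ε/3. Fix natural numbers m, n > N. By the property (6.23) and the fact that $s_k \to s$ we can choose a k > N such that
\[
\{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\} \subseteq S_k
\]
and $|s_k - s| < \varepsilon/3$. Observe that
\[
\begin{aligned}
|s_k - s_{mn}| &= \Big| \sum_{(i,j)\in S_k} a_{ij} - \sum_{(i,j)\in\{1,\ldots,m\}\times\{1,\ldots,n\}} a_{ij} \Big| \\
&= \Big| \sum_{(i,j)\in S_k \setminus (\{1,\ldots,m\}\times\{1,\ldots,n\})} a_{ij} \Big| \le \sum_{(i,j)\in S_k \setminus (\{1,\ldots,m\}\times\{1,\ldots,n\})} |a_{ij}|.
\end{aligned}
\]
\[
\begin{aligned}
&= t_{m'm'} - t_{mn} \\
&= (t_{m'm'} - L) + (L - t_{mn}) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \frac{2\varepsilon}{3},
\end{aligned}
\]
where we used the property (6.25) (with ε/2 replaced with ε/3). Finally, recalling that $|s_k - s| < \varepsilon/3$, by the triangle inequality, we have
\[
|s_{mn} - s| \le |s_{mn} - s_k| + |s_k - s| < \frac{2\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\]
This proves that $\sum a_{mn} = s$ and completes our proof.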
Theorem 6.26 (Cauchy’s double series theorem). For any double series $\sum a_{mn}$ of complex numbers, the following are equivalent statements:
(a) The series $\sum a_{mn}$ is absolutely convergent;
(b) $\sum_{m=1}^\infty \sum_{n=1}^\infty |a_{mn}|$ converges;
(c) $\sum_{n=1}^\infty \sum_{m=1}^\infty |a_{mn}|$ converges.
which proves that $\sum |a_{mn}|$ converges, and completes the proof of our result.
Now for some double series examples.
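Example 6.31. For our first example, consider the sum $\sum 1/(m^p n^q)$ where p, q ∈ R. Since in this case,
\[
\sum_{n=1}^\infty \frac{1}{m^p n^q} = \frac{1}{m^p} \cdot \sum_{n=1}^\infty \frac{1}{n^q},
\]
it follows that
\[
\sum_{m=1}^\infty \sum_{n=1}^\infty \frac{1}{m^p n^q} = \Big( \sum_{m=1}^\infty \frac{1}{m^p} \Big) \cdot \Big( \sum_{n=1}^\infty \frac{1}{n^q} \Big).
\]
Therefore, by Cauchy’s double series theorem and the p-test, $\sum 1/(m^p n^q)$ converges if and only if both p, q > 1.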
Example 6.32. The previous example can help us with other examples such as $\sum 1/(m^4 + n^4)$. Observe that
\[
(m^2 - n^2)^2 \ge 0 \implies m^4 + n^4 - 2m^2 n^2 \ge 0 \implies \frac{1}{m^4 + n^4} \le \frac{1}{2 m^2 n^2}.
\]
Since $\sum 1/(m^2 n^2)$ converges, by an easy generalization of our good ole comparison test (Theorem 3.27) to double series, we see that $\sum 1/(m^4 + n^4)$ converges too.
Example 6.33. For an application of Cauchy’s theorem and the sum by curves theorem, we look at the double sum $\sum z^{m+n}$ for |z| < 1. For such z, this sum converges absolutely because
\[
\sum_{m=0}^\infty \sum_{n=0}^\infty |z|^{m+n} = \sum_{m=0}^\infty |z|^m \cdot \frac{1}{1 - |z|} = \frac{1}{(1 - |z|)^2} < \infty,
\]
where we used the geometric series test (twice): If |r| < 1, then $\sum_{k=0}^\infty r^k = \frac{1}{1-r}$. So $\sum z^{m+n}$ converges absolutely by Cauchy’s double series theorem, and
\[
\sum z^{m+n} = \sum_{m=0}^\infty \sum_{n=0}^\infty z^{m+n} = \sum_{m=0}^\infty z^m \cdot \frac{1}{1 - z} = \frac{1}{(1 - z)^2}.
\]
On the other hand, by our sum by curves theorem, we can determine $\sum z^{m+n}$ by summing over curves; we shall choose to sum over triangles. Thus, if we set
\[
S_k = T_0 \cup T_1 \cup T_2 \cup \cdots \cup T_k, \quad\text{where } T_\ell = \{(m, n) \,;\, m + n = \ell,\ m, n \ge 0\},
\]
then
\[
\sum z^{m+n} = \lim_{k\to\infty} \sum_{(m,n)\in S_k} z^{m+n} = \lim_{k\to\infty} \sum_{\ell=0}^{k} \sum_{(m,n)\in T_\ell} z^{m+n}.
\]
Since $T_\ell = \{(m, n) \,;\, m + n = \ell\} = \{(0, \ell), (1, \ell - 1), \ldots, (\ell, 0)\}$, we have
\[
\sum_{(m,n)\in T_\ell} z^{m+n} = z^{0+\ell} + z^{1+(\ell-1)} + z^{2+(\ell-2)} + \cdots + z^{\ell+0} = (\ell + 1) z^\ell.
\]
Thus, $\sum z^{m+n} = \sum_{k=0}^\infty (k + 1) z^k$. However, we already proved that $\sum z^{m+n} = 1/(1 - z)^2$, so
\[
\frac{1}{(1 - z)^2} = \sum_{n=1}^\infty n z^{n-1}. \tag{6.30}
\]
See Problem 4 for an easier proof of (6.30) using Cauchy’s double series theorem.
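The two ways of summing in Example 6.33 are easy to compare numerically. The following sketch (function names and the sample point z are our own choices) sums $z^{m+n}$ over the first triangles $T_0, \ldots, T_k$ and over the square 0 ≤ m, n ≤ k, and compares both with $1/(1-z)^2$.

```python
def triangle_sum(z, k):
    """Sum of z^(m+n) over T_0, ..., T_k, using that T_l contributes (l + 1) z^l."""
    return sum((l + 1) * z**l for l in range(k + 1))


def square_sum(z, k):
    """Sum of z^(m+n) over the square 0 <= m, n <= k."""
    return sum(z ** (m + n) for m in range(k + 1) for n in range(k + 1))


z = 0.3 + 0.4j            # sample point with |z| = 0.5 < 1
target = 1 / (1 - z) ** 2

print(abs(triangle_sum(z, 200) - target))
print(abs(square_sum(z, 100) - target))
```

Both differences are at the level of rounding error, as the sum by curves theorem says they should be for an absolutely convergent double series.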
Example 6.34. Another very neat application of Cauchy’s double series theorem is to derive nonobvious identities. For example, let |z| < 1 and consider the series
\[
\sum_{n=1}^\infty \frac{z^n}{1 + z^{2n}} = \frac{z}{1 + z^2} + \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} + \cdots ;
\]
we’ll see why this converges in a moment. Observe that (since |z| < 1)
\[
\frac{1}{1 + z^{2n}} = \sum_{m=0}^\infty (-1)^m z^{2mn},
\]
by the familiar geometric series test with $r = -z^{2n}$: Since |r| < 1, then $\sum_{k=0}^\infty r^k = \frac{1}{1-r}$. Therefore,
\[
\sum_{n=1}^\infty \frac{z^n}{1 + z^{2n}} = \sum_{n=1}^\infty z^n \cdot \sum_{m=0}^\infty (-1)^m z^{2mn} = \sum_{n=1}^\infty \sum_{m=0}^\infty (-1)^m z^{(2m+1)n}.
\]
We claim that the double sum $\sum (-1)^m z^{(2m+1)n}$ converges absolutely. To prove this, observe that
\[
\sum_{n=1}^\infty \sum_{m=0}^\infty |z|^{(2m+1)n} = \sum_{n=1}^\infty |z|^n \sum_{m=0}^\infty |z|^{2nm} = \sum_{n=1}^\infty \frac{|z|^n}{1 - |z|^{2n}}.
\]
Since $\frac{1}{1 - |z|^{2n}} \le \frac{1}{1 - |z|}$ (this is because $|z|^{2n} \le |z|$ for |z| < 1), we have
\[
\frac{|z|^n}{1 - |z|^{2n}} \le \frac{1}{1 - |z|} \cdot |z|^n.
\]
Since $\sum |z|^n$ converges, by the comparison theorem, $\sum_{n=1}^\infty \frac{|z|^n}{1 - |z|^{2n}}$ converges too. Hence, Cauchy’s double series theorem applies, and
\[
\begin{aligned}
\sum_{n=1}^\infty \sum_{m=0}^\infty (-1)^m z^{(2m+1)n} &= \sum_{m=0}^\infty \sum_{n=1}^\infty (-1)^m z^{(2m+1)n} \\
&= \sum_{m=0}^\infty (-1)^m \sum_{n=1}^\infty z^{(2m+1)n} \\
&= \sum_{m=0}^\infty (-1)^m \frac{z^{2m+1}}{1 - z^{2m+1}}.
\end{aligned}
\]
Thus,
\[
\sum_{n=1}^\infty \frac{z^n}{1 + z^{2n}} = \sum_{m=0}^\infty (-1)^m \frac{z^{2m+1}}{1 - z^{2m+1}};
\]
that is, we have derived the striking identity between even and odd powers of z:
\[
\frac{z}{1 + z^2} + \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} + \cdots = \frac{z}{1 - z} - \frac{z^3}{1 - z^3} + \frac{z^5}{1 - z^5} - + \cdots .
\]
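An identity this striking deserves a numerical spot check. The sketch below (the names `lhs` and `rhs`, the truncation length, and the sample points are our own) evaluates truncations of both sides of the identity in Example 6.34 at a real and a complex point with |z| < 1.

```python
def lhs(z, terms=200):
    """Truncation of sum_{n>=1} z^n / (1 + z^(2n))."""
    return sum(z**n / (1 + z ** (2 * n)) for n in range(1, terms + 1))


def rhs(z, terms=200):
    """Truncation of sum_{m>=0} (-1)^m z^(2m+1) / (1 - z^(2m+1))."""
    return sum(
        (-1) ** m * z ** (2 * m + 1) / (1 - z ** (2 * m + 1)) for m in range(terms)
    )


for z in (0.5, -0.3 + 0.2j):
    print(z, abs(lhs(z) - rhs(z)))
```

For |z| well inside the unit disk the neglected tails are astronomically small, so the two truncations agree to rounding error.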
There are more beautiful series like this found in the exercises (see Problem 5
or better yet, Problem 7). We just touch on one more because it’s so nice:
6.5.4. A neat ζ-function identity. Recall that the ζ-function is defined by $\zeta(z) = \sum_{n=1}^\infty \frac{1}{n^z}$, which converges absolutely for z ∈ C with Re z > 1. Here’s a beautiful theorem from Flajolet and Vardi [75, 232].
Theorem 6.27. If $f(z) = \sum_{n=2}^\infty a_n z^n$ and $\sum_{n=2}^\infty |a_n|$ converges, then
\[
\sum_{n=1}^\infty f\Big( \frac{1}{n} \Big) = \sum_{n=2}^\infty a_n \zeta(n).
\]
Using this theorem we can derive the pretty formula (see Problem 9):
\[
\log 2 = \sum_{n=2}^\infty \frac{1}{2^n} \zeta(n). \tag{6.31}
\]
Not only is this formula pretty, it converges to log 2 much faster than the usual series $\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n}$ (from which (6.31) is derived by the help of Theorem 6.27); see [75, 232] for a discussion of such convergence issues.
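To get a feeling for how much faster (6.31) converges, here is a rough numerical comparison. Everything named in it is our own scaffolding: `zeta` approximates ζ(s) for real s > 1 by a direct sum plus a standard Euler-Maclaurin tail estimate (an assumption of this sketch, not a method taken from [75, 232]), `zeta_series` truncates (6.31), and `alt_series` truncates the alternating harmonic series.

```python
import math


def zeta(s, cutoff=1000):
    """Approximate zeta(s), s > 1: sum below `cutoff` plus an Euler-Maclaurin tail estimate."""
    head = sum(k ** -s for k in range(1, cutoff))
    tail = (
        cutoff ** (1 - s) / (s - 1)      # integral of x^(-s) from cutoff to infinity
        + 0.5 * cutoff ** -s             # half of the first tail term
        + s * cutoff ** (-s - 1) / 12    # first derivative correction
    )
    return head + tail


def zeta_series(terms):
    """First `terms` terms of sum_{n>=2} zeta(n) / 2^n."""
    return sum(zeta(n) / 2**n for n in range(2, terms + 2))


def alt_series(terms):
    """First `terms` terms of sum_{n>=1} (-1)^(n-1) / n."""
    return sum((-1) ** (n - 1) / n for n in range(1, terms + 1))


for terms in (5, 10, 20):
    print(
        terms,
        abs(zeta_series(terms) - math.log(2)),
        abs(alt_series(terms) - math.log(2)),
    )
```

The error of the ζ-series roughly halves with each extra term, while the alternating harmonic series only improves like 1/n.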
Exercises 6.5.
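1. Determine the convergence of the limits and the iterated limits for the double sequences
\[
\text{(a)}\ s_{mn} = \frac{1}{m} + \frac{1}{n}, \qquad \text{(b)}\ s_{mn} = \frac{m}{m + n}, \qquad \text{(c)}\ s_{mn} = \Big( \frac{n + 1}{n + 2} \Big)^m,
\]
\[
\text{(d)}\ s_{mn} = (-1)^{m+n} \Big( \frac{1}{m} + \frac{1}{n} \Big), \qquad \text{(e)}\ s_{mn} = \frac{1}{1 + (m - n)^2}.
\]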
2. Determine the convergence, iterated convergence, and absolute convergence, for the double series
\[
\text{(a)}\ \sum_{m,n\ge 1} \frac{(-1)^{mn}}{mn}, \qquad \text{(b)}\ \sum_{m,n\ge 1} \frac{(-1)^n}{(m + n^p)(m + n^p - 1)},\ p > 1, \qquad \text{(c)}\ \sum_{m\ge 2,\, n\ge 1} \frac{1}{m^n}.
\]
Suggestion: For (b), show that $\sum_{m=1}^\infty \frac{1}{(m + n^p)(m + n^p - 1)}$ telescopes.
3. (mn-term test for double series) Show that if $\sum a_{mn}$ converges, then $a_{mn} \to 0$. Suggestion: First verify that $a_{mn} = s_{mn} - s_{m-1,n} - s_{m,n-1} + s_{m-1,n-1}$.
4. Let z ∈ C with |z| < 1. For (m, n) ∈ N × N, define $a_{mn} = z^n$ if m ≤ n and define $a_{mn} = 0$ otherwise. Using Cauchy’s double series theorem on $\sum a_{mn}$, prove (6.30). Using (6.30), find $\sum_{n=1}^\infty \frac{n}{2^n}$ (cf. Problem 3 in Exercises 3.5).
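5. Let |z| < 1. Using Cauchy’s double series theorem, derive the beautiful identities
\[
\text{(a)}\ \frac{z}{1 + z^2} + \frac{z^3}{1 + z^6} + \frac{z^5}{1 + z^{10}} + \cdots = \frac{z}{1 - z^2} - \frac{z^3}{1 - z^6} + \frac{z^5}{1 - z^{10}} - + \cdots,
\]
\[
\text{(b)}\ \frac{z}{1 + z^2} - \frac{z^2}{1 + z^4} + \frac{z^3}{1 + z^6} - + \cdots = \frac{z}{1 + z} - \frac{z^3}{1 + z^3} + \frac{z^5}{1 + z^5} - + \cdots,
\]
\[
\text{(c)}\ \frac{z}{1 + z} - \frac{2z^2}{1 + z^2} + \frac{3z^3}{1 + z^3} - + \cdots = \frac{z}{(1 + z)^2} - \frac{z^2}{(1 + z^2)^2} + \frac{z^3}{(1 + z^3)^2} - + \cdots.
\]
Suggestion: For (c), you need the formula $1/(1 - z)^2 = \sum_{n=1}^\infty n z^{n-1}$ found in (6.30).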
6. Here’s a neat formula for ζ(k) found in [40]: For any k ∈ N with k ≥ 3, we have
\[
\zeta(k) = \sum_{\ell=1}^{k-2} \sum_{m=1}^\infty \sum_{n=1}^\infty \frac{1}{m^\ell (m + n)^{k-\ell}}.
\]
Make sure you justify each step; in particular, why does each sum converge?
(iii) Use the partial fractions $\frac{1}{n(m+n)} = \frac{1}{m} \big( \frac{1}{n} - \frac{1}{m+n} \big)$ to show that
\[
\sum_{m=1}^\infty \sum_{n=1}^\infty \frac{1}{m^{k-2}\, n (m + n)} = \sum_{m=1}^\infty \sum_{n=1}^{m} \frac{1}{n\, m^{k-1}}.
\]
(iv) Replace the summation variable n with ℓ = m + n in $\sum_{m=1}^\infty \sum_{n=1}^\infty \frac{1}{n (m+n)^{k-1}}$ to get a new sum in terms of m and ℓ, then use Cauchy’s double series theorem to change the order of summation. Finally, prove the desired result.
7. (Number theory series) Here are some pretty formulas involving number theory!
(a) For n ∈ N, let τ(n) denote the number of positive divisors of n (that is, the number of positive integers that divide n). For example, τ(1) = 1 and τ(4) = 3 (because 1, 2, 4 divide 4). Prove that
\[
\sum_{n=1}^\infty \frac{z^n}{1 - z^n} = \sum_{n=1}^\infty \tau(n) z^n, \qquad |z| < 1. \tag{6.32}
\]
Suggestion: Write $1/(1 - z^n) = \sum_{m=0}^\infty z^{mn} = \sum_{m=1}^\infty z^{n(m-1)}$, then prove that the left-hand side of (6.32) equals $\sum z^{mn}$. Finally, use Theorem 6.25 with the set $S_k$ given by $S_k = T_1 \cup \cdots \cup T_k$ where $T_k = \{(m, n) \in \mathbb{N} \times \mathbb{N} \,;\, m \cdot n = k\}$.
(b) For n ∈ N, let σ(n) denote the sum of the positive divisors of n. For example, σ(1) = 1 and σ(4) = 1 + 2 + 4 = 7. Prove that
\[
\sum_{n=1}^\infty \frac{z^n}{(1 - z^n)^2} = \sum_{n=1}^\infty \sigma(n) z^n, \qquad |z| < 1.
\]
8. Here is a neat problem. Let $f(z) = \sum_{n=1}^\infty a_n z^n$ and $g(z) = \sum_{n=1}^\infty b_n z^n$. Determine a set of points z ∈ C for which the following formula is valid:
\[
\sum_{n=1}^\infty b_n f(z^n) = \sum_{n=1}^\infty a_n g(z^n).
\]
and my favorite:
\[
\sum_{n=1}^\infty \frac{f(z^n)}{n!} = \sum_{n=1}^\infty a_n e^{z^n}.
\]
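Before proving an identity like (6.32) in Problem 7(a), it can be reassuring to test it on a computer. The sketch below is only a numerical check, not a proof; the helper names `tau`, `lambert_lhs`, and `divisor_rhs`, the truncation length, and the sample point are our own choices.

```python
def tau(n):
    """Number of positive divisors of n, by brute force."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def lambert_lhs(z, terms=200):
    """Truncation of sum_{n>=1} z^n / (1 - z^n)."""
    return sum(z**n / (1 - z**n) for n in range(1, terms + 1))


def divisor_rhs(z, terms=200):
    """Truncation of sum_{n>=1} tau(n) z^n."""
    return sum(tau(n) * z**n for n in range(1, terms + 1))


z = 0.5
print(lambert_lhs(z), divisor_rhs(z))
```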
where the right-hand sums are only over those natural numbers i, j such that $b_i$ and $c_j$ occur in the left-hand sum. The left-hand side converges as n → ∞ by assumption, so if either sum $\sum_{n=1}^\infty b_n$ or $\sum_{n=1}^\infty c_n$ of nonnegative numbers converges, then the equality (6.33) would imply that the other sum converges. But this would then imply that
\[
\sum_{k=1}^n |a_k| = \sum_i b_i + \sum_j c_j
\]
converges as n → ∞, which it does not. Hence, both sums $\sum b_n$ and $\sum c_n$ diverge.
Step 2: We produce a rearrangement. Let ξ ∈ R. We shall produce a rearrangement
Then any given partial sum t of (6.34) is one of the following two sorts:
where $\ell \le n_k$, in which case, $\gamma'_k \le t < \beta'_k$. Now by construction, $\beta'_k$ differs from ξ by at most $b_{m_k}$ and $\gamma'_k$ differs from ξ by at most $c_{n_k}$. Therefore, the fact that $\gamma'_{k-1} < t \le \beta'_k$ or $\gamma'_k \le t < \beta'_k$ implies that
\[
\xi - c_{n_{k-1}} < t < \xi + b_{m_k} \quad\text{or}\quad \xi - c_{n_k} < t < \xi + b_{m_k}.
\]
By assumption, $\sum a_n$ converges, so $b_{m_k}, c_{n_k} \to 0$, hence the partial sums of (6.34) must converge to ξ. This completes our proof.
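The rearrangement idea can be watched at work on the alternating harmonic series. The sketch below uses the standard greedy recipe (add unused positive terms while the running total is at or below ξ, otherwise add unused negative terms), which we take to be the spirit of the construction in the proof; the function name and the sample targets are our own.

```python
def rearranged_partial_sum(xi, steps):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ... aimed at the target xi."""
    next_odd, next_even = 1, 2   # denominators of the next unused positive / negative term
    total = 0.0
    for _ in range(steps):
        if total <= xi:
            total += 1.0 / next_odd     # use the next positive term 1/(2j - 1)
            next_odd += 2
        else:
            total -= 1.0 / next_even    # use the next negative term -1/(2j)
            next_even += 2
    return total


for xi in (1.5, -0.7, 3.0):
    print(xi, rearranged_partial_sum(xi, 100000))
```

Every term of the original series is eventually used exactly once, and once the running total has crossed ξ it stays within the size of the last term used, which tends to zero.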
Moreover,
\[
\sum_{m=1}^\infty \sum_{n=1}^\infty |a_{mn}| = \sum_{m=1}^\infty |a_m| < \infty,
\]
6.6.3. Multiplication of power series and infinite series. If we consider two power series $\sum_{n=0}^\infty a_n z^n$ and $\sum_{n=0}^\infty b_n z^n$, then formally multiplying and combining like powers of z, we get
\[
\begin{aligned}
\big( a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \cdots \big) \big( b_0 + b_1 z + b_2 z^2 + b_3 z^3 + \cdots \big) = {} & a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 \\
& + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) z^3 + \cdots .
\end{aligned}
\]
In particular, taking z = 1, we get (again, only formally!)
\[
\begin{aligned}
\big( a_0 + a_1 + a_2 + a_3 + \cdots \big) \big( b_0 + b_1 + b_2 + b_3 + \cdots \big) = {} & a_0 b_0 + (a_0 b_1 + a_1 b_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0) \\
& + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0) + \cdots .
\end{aligned}
\]
These thoughts suggest the following definition. Given two series $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$, their Cauchy product is the series $\sum_{n=0}^\infty c_n$, where
\[
c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0 = \sum_{k=0}^n a_k b_{n-k}.
\]
A natural question to ask is if $\sum_{n=0}^\infty a_n$ and $\sum_{n=0}^\infty b_n$ converge, then is it true that
\[
\Big( \sum_{n=0}^\infty a_n \Big) \Big( \sum_{n=0}^\infty b_n \Big) = \sum_{n=0}^\infty c_n \,?
\]
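The definition of the Cauchy product translates directly into a few lines of code. In the sketch below, the function name `cauchy_product` and the convention of working with finite lists of leading coefficients are our own; the example squares the geometric series $1 + z + z^2 + \cdots$, whose Cauchy product with itself has coefficients n + 1, in agreement with (6.30).

```python
def cauchy_product(a, b):
    """Return [c_0, ..., c_{N-1}] with c_n = sum_{k=0}^{n} a_k * b_{n-k}.

    Here N = min(len(a), len(b)), since c_n needs a_0..a_n and b_0..b_n.
    """
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]


ones = [1] * 6                       # leading coefficients of 1 + z + z^2 + ...
print(cauchy_product(ones, ones))    # [1, 2, 3, 4, 5, 6]
```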
Thus, the terms $c_n$ do not tend to zero as n → ∞, so by the n-th term test, the series $\sum_{n=0}^\infty c_n$ does not converge.
The problem with this example is that the series $\sum \frac{(-1)^{n-1}}{\sqrt{n}}$ does not converge absolutely. However, for absolutely convergent series, there is no problem as the following theorem, due to Franz Mertens (1840–1927), shows.
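The failure for $\sum (-1)^{n-1}/\sqrt{n}$ is easy to observe numerically. In this sketch we index the series from k = 1 and take the n-th product term to be $\sum_{k=1}^{n} a_k a_{n+1-k}$; this indexing is our assumption and may differ by a shift from the one used in the example. Each summand has absolute value $1/\sqrt{k(n+1-k)} \ge 2/(n+1)$, so the terms cannot tend to zero.

```python
import math


def c(n):
    """n-th Cauchy product term of sum (-1)^(k-1)/sqrt(k) with itself (indexing from k = 1)."""
    return sum(
        (-1) ** (k - 1) / math.sqrt(k) * (-1) ** (n - k) / math.sqrt(n + 1 - k)
        for k in range(1, n + 1)
    )


for n in (1, 10, 100, 1000):
    print(n, c(n))
```

The printed values alternate in sign but never shrink below 1 in absolute value.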
\[
\begin{aligned}
C_n &= c_0 + c_1 + \cdots + c_n \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0) + \cdots + (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) \\
&= a_0 (b_0 + \cdots + b_n) + a_1 (b_0 + \cdots + b_{n-1}) + \cdots + a_n b_0.
\end{aligned} \tag{6.35}
\]
\[
\begin{aligned}
C_n &= a_0 (B + \beta_n) + a_1 (B + \beta_{n-1}) + \cdots + a_n (B + \beta_0) \\
&= A_n B + (a_0 \beta_n + a_1 \beta_{n-1} + \cdots + a_n \beta_0).
\end{aligned}
\]
Since $A_n \to A$, the first part of this sum converges to AB. Thus, we just need to show that the term in parentheses tends to zero as n → ∞. To see this, let ε > 0 be given. Putting $\alpha = \sum |a_n|$ and using that $\beta_n \to 0$, we can choose a natural number N such that for all n > N, we have $|\beta_n| < \varepsilon/(2\alpha)$. Also, since $\beta_n \to 0$, we can choose a constant C such that $|\beta_n| \le C$ for every n. Then for n > N,
As an easy corollary, we see that if $\sum_{n=0}^\infty a_n z^n$ and $\sum_{n=0}^\infty b_n z^n$ have radii of convergence $R_1, R_2$, respectively, then since power series converge absolutely within their radii of convergence, for all z ∈ C with $|z| < R_1, R_2$, we have
\[
\Big( \sum_{n=0}^\infty a_n z^n \Big) \Big( \sum_{n=0}^\infty b_n z^n \Big) = \sum_{n=0}^\infty c_n z^n
\]
where $c_n = \sum_{k=0}^n a_k b_{n-k}$. In words: The product of power series is a power series.
Here’s a question: Suppose that $\sum a_n$ and $\sum b_n$ converge and their Cauchy product $\sum c_n$ also converges; is it true that $\sum c_n = \big( \sum a_n \big) \big( \sum b_n \big)$? The answer may seem to be an “obvious” yes. However, it’s not so “obvious” because the definition of the Cauchy product was based on a formal argument. Here is a proof of this “obvious” fact.
Theorem 6.31 (Abel’s multiplication theorem). If the Cauchy product of two convergent series $\sum a_n = A$ and $\sum b_n = B$ converges, then the Cauchy product has the value AB.
Proof. In my opinion, the slickest proof of this theorem is Abel’s original, proved in 1826 [120, p. 321] using his limit theorem, Theorem 6.20. Let
\[
f(z) = \sum a_n z^n, \qquad g(z) = \sum b_n z^n, \qquad h(z) = \sum c_n z^n,
\]
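Example 6.38. For example, let us square $\log 2 = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n}$. It turns out that it will be convenient to write log 2 in two ways: $\log 2 = \sum_{n=0}^\infty a_n$ where $a_0 = 0$ and $a_n = \frac{(-1)^{n-1}}{n}$ for n = 1, 2, . . ., and as $\log 2 = \sum_{n=0}^\infty b_n$ where $b_n = \frac{(-1)^n}{n+1}$. Thus, $c_0 = a_0 b_0 = 0$ and for n = 1, 2, . . ., we see that
\[
c_n = \sum_{k=0}^n a_k b_{n-k} = \sum_{k=1}^n \frac{(-1)^{k-1} (-1)^{n-k}}{k (n + 1 - k)} = (-1)^{n-1} \alpha_n,
\]
where $\alpha_n = \sum_{k=1}^n \frac{1}{k(n+1-k)}$. By Abel’s multiplication theorem, we have $(\log 2)^2 = \sum_{n=0}^\infty c_n = \sum_{n=1}^\infty (-1)^{n-1} \alpha_n$ as long as this latter sum converges. By the alternating series test, this sum converges if we can prove that $\{\alpha_n\}$ is nonincreasing and converges to zero. To prove these statements hold, observe that we can write
\[
\frac{1}{k(n - k + 1)} = \frac{1}{n + 1} \Big( \frac{1}{k} + \frac{1}{n - k + 1} \Big),
\]
therefore
\[
\begin{aligned}
\alpha_n &= \frac{1}{1} \cdot \frac{1}{n} + \frac{1}{2} \cdot \frac{1}{n-1} + \frac{1}{3} \cdot \frac{1}{n-2} + \cdots + \frac{1}{n} \cdot \frac{1}{1} \\
&= \frac{1}{n+1} \Big[ \Big( 1 + \frac{1}{n} \Big) + \Big( \frac{1}{2} + \frac{1}{n-1} \Big) + \Big( \frac{1}{3} + \frac{1}{n-2} \Big) + \cdots + \Big( \frac{1}{n} + \frac{1}{1} \Big) \Big].
\end{aligned}
\]
In the brackets there are two copies of $1 + \frac{1}{2} + \cdots + \frac{1}{n}$. Thus,
\[
\alpha_n = \frac{2}{n+1} H_n, \quad\text{where } H_n := 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.
\]
It is common to use the notation $H_n$ for the n-th partial sum of the harmonic series. Now, recall from Section 4.6.5 on the Euler-Mascheroni constant that $\gamma_n := H_n - \log n$ is bounded above by 1, so
\[
\begin{aligned}
\alpha_n = \frac{2}{n+1} (\gamma_n + \log n) &\le \frac{2}{n+1} + 2\, \frac{\log n}{n+1} = \frac{2}{n+1} + 2 \cdot \frac{n}{n+1} \cdot \frac{1}{n} \log n \\
&= \frac{2}{n+1} + 2 \cdot \frac{n}{n+1} \cdot \log(n^{1/n}) \to 0 + 2 \cdot 1 \cdot \log 1 = 0
\end{aligned}
\]
as n → ∞. Thus, $\alpha_n \to 0$. Moreover,
\[
\begin{aligned}
\alpha_n - \alpha_{n+1} &= \frac{2}{n+1} H_n - \frac{2}{n+2} H_{n+1} = \frac{2}{n+1} H_n - \frac{2}{n+2} \Big( H_n + \frac{1}{n+1} \Big) \\
&= \Big( \frac{2}{n+1} - \frac{2}{n+2} \Big) H_n - \frac{2}{(n+1)(n+2)} \\
&= \frac{2}{(n+1)(n+2)} H_n - \frac{2}{(n+1)(n+2)} \\
&= \frac{2}{(n+1)(n+2)} (H_n - 1) \ge 0.
\end{aligned}
\]
Thus, $\alpha_n \ge \alpha_{n+1}$, so $\sum c_n = \sum (-1)^{n-1} \alpha_n$ converges. Hence, we have proved the following pretty formula:
\[
\begin{aligned}
\frac{1}{2} \big( \log 2 \big)^2 &= \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n+1} H_n \\
&= \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n+1} \Big( 1 + \frac{1}{2} + \cdots + \frac{1}{n} \Big).
\end{aligned}
\]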
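As a sanity check on the pretty formula of Example 6.38, we can sum the series numerically. The sketch below (the name `partial_sum` and the number of terms are our own choices) accumulates $H_n$ on the fly; since the series is alternating with terms decreasing to zero, the error of a partial sum is at most the size of the first omitted term, which is roughly (log N)/N.

```python
import math


def partial_sum(N):
    """sum_{n=1}^{N} (-1)^(n-1) * H_n / (n + 1), with H_n accumulated as we go."""
    total, harmonic, sign = 0.0, 0.0, 1.0
    for n in range(1, N + 1):
        harmonic += 1.0 / n                  # harmonic is now H_n
        total += sign * harmonic / (n + 1)
        sign = -sign
    return total


target = 0.5 * math.log(2) ** 2
for N in (10, 1000, 200000):
    print(N, partial_sum(N), abs(partial_sum(N) - target))
```

The convergence is slow, as one expects from an alternating series whose terms decay only like (log n)/n.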
where the sum on the right means to add over all such products am bn in any order
we wish. One can ask if this holds true in the infinite series realm. The answer is
“yes” if both series on the left are absolutely convergent.
Theorem 6.32 (Cauchy’s multiplication theorem). If two series $\sum a_n = A$ and $\sum b_n = B$ converge absolutely, then the double series $\sum a_m b_n$ converges absolutely and has the value AB.
Proof. Since
\[
\sum_{m=0}^\infty \sum_{n=0}^\infty |a_m b_n| = \sum_{m=0}^\infty |a_m| \sum_{n=0}^\infty |b_n| = \Big( \sum_{m=0}^\infty |a_m| \Big) \Big( \sum_{n=0}^\infty |b_n| \Big) < \infty,
\]
by Cauchy’s double series theorem, the double series $\sum a_m b_n$ converges absolutely, and we can iterate the sums:
\[
\sum a_m b_n = \sum_{m=0}^\infty \sum_{n=0}^\infty a_m b_n = \sum_{m=0}^\infty a_m \sum_{n=0}^\infty b_n = \Big( \sum_{m=0}^\infty a_m \Big) \Big( \sum_{n=0}^\infty b_n \Big) = A \cdot B.
\]
We remark that Cauchy’s multiplication theorem generalizes to a product of
more than two absolutely convergent series.
where we used the binomial theorem for $(z + w)^n$ in the last line.
Exercises 6.6.
1. Here are some alternating series problems:
(a) Prove that
\[
\frac{1}{1} + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \cdots + \frac{1}{4k-3} + \frac{1}{4k-1} - \frac{1}{2k} + \cdots = \frac{3}{2} \log 2;
\]
that is, we rearrange the alternating harmonic series so that two positive terms are followed by one negative one, otherwise keeping the ordering the same. Suggestion: Observe that
\[
\begin{aligned}
\frac{1}{2} \log 2 &= \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \frac{1}{10} - \frac{1}{12} + \cdots \\
&= 0 + \frac{1}{2} + 0 - \frac{1}{4} + 0 + \frac{1}{6} + 0 - \frac{1}{8} + \cdots .
\end{aligned}
\]
Add this term-by-term to the series for log 2.
(b) Prove that
\[
\frac{1}{1} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} - \frac{1}{2} + \cdots + \frac{1}{8k-7} + \frac{1}{8k-5} + \frac{1}{8k-3} + \frac{1}{8k-1} - \frac{1}{2k} + \cdots = 2 \log 2;
\]
that is, we rearrange the alternating harmonic series so that four positive terms are followed by one negative one, otherwise keeping the ordering the same.
Using this formula, derive the neat looking formula: For z ∈ C with |z| < 1,
\[
\Big( \sum_{n=0}^\infty \cos n\theta \; z^n \Big) \cdot \Big( \sum_{n=0}^\infty \sin n\theta \; z^n \Big) = \frac{1}{2} \sum_{n=0}^\infty (n + 1) \sin n\theta \; z^n. \tag{6.36}
\]
Suggestion: Put $z = e^{i\theta} x$ with x real into the formula $\big( \sum_{n=0}^\infty z^n \big) \cdot \big( \sum_{n=0}^\infty z^n \big) = \sum_{n=0}^\infty (n + 1) z^n$, then equate imaginary parts of both sides; this proves (6.36) for z = x real and |x| < 1. Why does (6.36) hold for z ∈ C with |z| < 1?
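4. Derive the beautiful formula: For |z| < 1,
\[
\Big( \sum_{n=1}^\infty \frac{\cos n\theta}{n} z^n \Big) \cdot \Big( \sum_{n=1}^\infty \frac{\sin n\theta}{n} z^n \Big) = \sum_{n=2}^\infty \frac{H_{n-1} \sin n\theta}{n} z^n,
\]
where $H_{n-1} = 1 + \frac{1}{2} + \cdots + \frac{1}{n-1}$ as in Example 6.38.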
5. In this problem we prove the following fact: Let $f(z) = \sum_{n=0}^\infty a_n z^n$ be a power series with radius of convergence R > 0 and let α ∈ C with |α| < R. Then we can write
\[
f(z) = \sum_{n=0}^\infty b_n (z - \alpha)^n,
\]
(iii) Verifying that you can change the order of summation in (6.37), prove the result.
6.7. ★ Proofs that $\sum 1/p$ diverges
We know that the harmonic series $\sum 1/n$ diverges. However, if we only sum over the squares, then we get the convergent sum $\sum 1/n^2$. Similarly, if we only sum over the cubes, we get the convergent sum $\sum 1/n^3$. One may ask: What if we sum only over all primes:
\[
\sum \frac{1}{p} = \frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \cdots,
\]
do we get a convergent sum? We know that there are arbitrarily large gaps between primes (see Problem 1 in Exercises 2.4), so one may conjecture that $\sum 1/p$ converges. However, following [23], [63], [164] (cf. [165]), and [130] we shall prove that $\sum 1/p$ diverges! Other proofs can be found in the exercises. An expository article giving other proofs (cf. [153], [51]) on this fascinating divergent sum can be found in [231].
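A computer experiment shows why one might be fooled into thinking $\sum 1/p$ converges: the partial sums grow, but at a glacial pace (they are known to grow roughly like log log N, a fact we do not prove here). The sketch below uses a basic sieve of Eratosthenes; the function names are our own.

```python
def primes_up_to(limit):
    """All primes <= limit (limit >= 1), by the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit**0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]


def prime_reciprocal_sum(limit):
    """Sum of 1/p over all primes p <= limit."""
    return sum(1.0 / p for p in primes_up_to(limit))


for limit in (10, 10**3, 10**6):
    print(limit, prime_reciprocal_sum(limit))
```

Even with all primes up to a million, the sum has not yet reached 3.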
6.7.1. Proof I: Proof by multiplication and rearrangement. This is Bellman [23] and Dux’s [63] argument. Suppose, for sake of contradiction, that $\sum 1/p$ converges. Then we can fix a prime number m such that $\sum_{p>m} 1/p < 1$. Let 2 < 3 < · · · < m be the list of all prime numbers up to m. Given N > m, let $P_N$ be the set of natural numbers greater than one and less than or equal to N all of whose prime factors are less than or equal to m, and let $Q_N$ be the set of natural numbers greater than one and less than or equal to N all of whose prime factors are greater than m. Explicitly,
\[
\begin{aligned}
k \in P_N &\iff 1 < k \le N \text{ and } k = 2^i 3^j \cdots m^k, \text{ some } i, j, \ldots, k, \\
\ell \in Q_N &\iff 1 < \ell \le N \text{ and } \ell = p\, q \cdots r, \ p, q, \ldots, r > m \text{ are prime}.
\end{aligned} \tag{6.38}
\]
In the product p q · · · r, prime numbers may be repeated. Observe that any integer 1 < n ≤ N that is not in $P_N$ or $Q_N$ must have prime factors that are both less than or equal to m and greater than m, and hence can be factored in the form n = k ℓ where $k \in P_N$ and $\ell \in Q_N$. Thus, the finite sum
\[
\sum_{k\in P_N} \frac{1}{k} + \sum_{\ell\in Q_N} \frac{1}{\ell} + \Big( \sum_{k\in P_N} \frac{1}{k} \Big) \Big( \sum_{\ell\in Q_N} \frac{1}{\ell} \Big) = \sum_{k\in P_N} \frac{1}{k} + \sum_{\ell\in Q_N} \frac{1}{\ell} + \sum_{k\in P_N,\, \ell\in Q_N} \frac{1}{k\ell},
\]
contains every number of the form 1/n where 1 < n ≤ N. (Of course, the resulting sum contains other numbers too.) In particular,
\[
\sum_{k\in P_N} \frac{1}{k} + \sum_{\ell\in Q_N} \frac{1}{\ell} + \Big( \sum_{k\in P_N} \frac{1}{k} \Big) \Big( \sum_{\ell\in Q_N} \frac{1}{\ell} \Big) \ge \sum_{n=2}^N \frac{1}{n}.
\]
We shall prove that the finite sums on the left remain bounded as N → ∞, which contradicts the fact that the harmonic series diverges.
To see that $\sum_{P_N} 1/k$ converges, note that each geometric series $\sum_{j=0}^\infty 1/p^j$ converges (absolutely since all the $1/p^j$ are positive) to a finite real number. Hence, by Cauchy’s multiplication theorem (or rather its generalization to a product of more than two absolutely convergent series), we have
\[
\Big( \sum_{i=0}^\infty \frac{1}{2^i} \Big) \Big( \sum_{j=0}^\infty \frac{1}{3^j} \Big) \cdots \Big( \sum_{k=0}^\infty \frac{1}{m^k} \Big) = \sum \frac{1}{2^i 3^j \cdots m^k}
\]
is a finite real number, where the sum on the right is over all i, j, . . . , k = 0, 1, 2, . . . (the exponents start at zero because an element of $P_N$ need not be divisible by every prime up to m). Using the definition of $P_N$ in (6.38), we see that $\sum_{P_N} 1/k$ is bounded above by this finite real number uniformly in N. Thus, $\lim_{N\to\infty} \sum_{P_N} 1/k$ is finite.
We now prove that $\lim_{N\to\infty} \sum_{Q_N} 1/\ell$ is finite. To do so observe that since $\alpha := \sum_{p>m} 1/p < 1$ and all the 1/p’s are positive, the sum $\sum_{p>m} 1/p$, in particular, converges absolutely. Hence, by Cauchy’s multiplication theorem, we have
\[
\alpha^2 = \Big( \sum_{p>m} \frac{1}{p} \Big)^2 = \sum_{p,q>m} \frac{1}{pq},
\]
where the sum is over all primes p, q, r > m. We can continue this procedure showing that $\alpha^j$ is the sum $\sum 1/(p\, q \cdots r)$ where the sum is over all j-tuples of primes p, q, . . . , r all of which are strictly larger than m. By definition of $Q_N$ in (6.38), it follows that the sum $\sum_{Q_N} 1/\ell$ is bounded by the number $\sum_{j=1}^\infty \alpha^j$, which is finite because α < 1. Hence, the limit $\lim_{N\to\infty} \sum_{Q_N} 1/\ell$ is finite, and we have reached a contradiction.
6.7.2. An elementary number theory fact. Our next proof depends on
the idea of square-free integers. A positive integer is said to be square-free if no
squared prime divides it, that is, if a prime occurs in its prime factorization, then
it occurs with multiplicity one. For instance, 1 is square-free because no squared
prime divides it, $10 = 2\cdot 5$ is square-free, but $24 = 2^3\cdot 3 = 2^2\cdot 2\cdot 3$ is not square-free.
We claim that any positive integer can be written uniquely as the product of
a square and a square-free integer. Indeed, let $n \in \mathbb{N}$ and let $k$ be the largest
natural number such that $k^2$ divides $n$. Then $n/k^2$ must be square-free, for if $n/k^2$
is divisible by a squared prime $p^2$, then $(pk)^2$ divides $n$ with $pk > k$, which is not possible by
definition of $k$. Thus, any positive integer $n$ can be uniquely written as $n = k^2$ if $n$
is a perfect square, or
(6.39) $n = k^2\cdot p\,q\cdots r$,
where $k \ge 1$ and where $p, q, \ldots, r$ are some primes less than or equal to $n$ that occur
with multiplicity one. Using the fact that any positive integer can be uniquely
written as the product of a square and a square-free integer, we shall prove that
$\sum 1/p$ diverges.
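The decomposition $n = k^2 \cdot s$ with $s$ square-free can be illustrated computationally. The following is a minimal sketch, not part of the text's argument; the function name `square_free_decomposition` is our own, and it finds $k$ by repeatedly stripping squared factors rather than by searching for the largest $k$ directly.

```python
def square_free_decomposition(n: int) -> tuple[int, int]:
    """Return (k, s) with n == k*k*s and s square-free (illustrative helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k, s, p = 1, n, 2
    while p * p <= s:
        # Strip every factor p^2 from s and move one factor p into k.
        while s % (p * p) == 0:
            s //= p * p
            k *= p
        p += 1
    return k, s


print(square_free_decomposition(24))   # 24 = 2^2 * 6
print(square_free_decomposition(72))   # 72 = 6^2 * 2
```

Composite trial values of `p` never succeed, because their prime factors have already been stripped by the time they are reached.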
6.7.3. Proof II: Proof by comparison. Here is Niven’s [164, 165] proof.
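We first prove that the product
$$\prod_{p<N}\Big(1+\frac{1}{p}\Big)$$
diverges to $\infty$ as $N \to \infty$, where the product is over all primes less than $N$. Let
$2 < 3 < \cdots < m$ be all the primes less than $N$. Consider the product
$$\prod_{p<N}\Big(1+\frac{1}{p}\Big) = \Big(1+\frac{1}{2}\Big)\Big(1+\frac{1}{3}\Big)\cdots\Big(1+\frac{1}{m}\Big).$$
If $N = 6$, then
$$\prod_{p<6}\Big(1+\frac{1}{p}\Big) = \Big(1+\frac{1}{2}\Big)\Big(1+\frac{1}{3}\Big)\Big(1+\frac{1}{5}\Big)
= 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{5}+\frac{1}{2\cdot 3}+\frac{1}{2\cdot 5}+\frac{1}{3\cdot 5}+\frac{1}{2\cdot 3\cdot 5}.$$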
P
6.7. F PROOFS THAT 1/p DIVERGES 323
where the $k$-th sum on the right is the sum over all reciprocals of the form
$1/(p_1\cdot p_2\cdots p_k)$ with $p_1, \ldots, p_k$ distinct primes less than $N$. Thus,
$$\prod_{p<N}\Big(1+\frac{1}{p}\Big)\cdot\sum_{k<N}\frac{1}{k^2} = \sum_{k<N}\frac{1}{k^2} + \sum_{k<N}\sum_{p<N}\frac{1}{k^2\,p} + \sum_{k<N}\sum_{p,q<N}\frac{1}{k^2\cdot p\cdot q} + \cdots + \sum_{k<N}\sum_{p,q,\ldots,r<N}\frac{1}{k^2\cdot p\cdot q\cdots r}.$$
By our discussion on square-free numbers around (6.39), the right-hand side contains
every number of the form $1/n$ where $n < N$ (and many other numbers too).
In particular,
$$\prod_{p<N}\Big(1+\frac{1}{p}\Big)\cdot\sum_{k<N}\frac{1}{k^2} \;\ge\; \sum_{n<N}\frac{1}{n}. \tag{6.40}$$
From this inequality, we shall prove that $\sum 1/p$ diverges. To this end, we know
that $\sum_{k=1}^{\infty} 1/k^2$ converges while $\sum_{n=1}^{\infty} 1/n$ diverges, so it follows that
$$\lim_{N\to\infty}\prod_{p<N}\Big(1+\frac{1}{p}\Big) = \infty.$$
To relate this product to the sum $\sum 1/p$, note that
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \ge 1 + x$$
for $x \ge 0$; in fact, this inequality holds for all $x \in \mathbb{R}$ by Theorem 4.29. Hence,
$$\prod_{p<N}\Big(1+\frac{1}{p}\Big) \le \prod_{p<N}\exp(1/p) = \exp\Big(\sum_{p<N}\frac{1}{p}\Big).$$
Since the left-hand side increases without bound as $N \to \infty$, so must the sum
$\sum_{p<N} 1/p$. This ends Proof II; see Problem 2 for a related proof.
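The two inequalities driving Proof II, namely (6.40) and $\prod_{p<N}(1+1/p) \le \exp\big(\sum_{p<N} 1/p\big)$, can be checked numerically for small $N$. The sketch below is only a sanity check in floating point, not a proof; the helper names `primes_below` and `niven_quantities` are our own.

```python
import math


def primes_below(n):
    """Primes strictly less than n, by a simple sieve of Eratosthenes."""
    sieve = [True] * max(n, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n, i):
                sieve[j] = False
    return [i for i in range(2, n) if sieve[i]]


def niven_quantities(N):
    """Return (product, sum 1/k^2, sum 1/n, sum 1/p), all over indices below N."""
    ps = primes_below(N)
    product = math.prod(1 + 1 / p for p in ps)
    squares = sum(1 / k**2 for k in range(1, N))
    harmonic = sum(1 / n for n in range(1, N))
    prime_sum = sum(1 / p for p in ps)
    return product, squares, harmonic, prime_sum


for N in (6, 50, 1000):
    product, squares, harmonic, prime_sum = niven_quantities(N)
    print(N, product * squares >= harmonic, product <= math.exp(prime_sum))
```

Note how slowly `prime_sum` grows compared with `harmonic`, which is consistent with the logarithm appearing in the comparison.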
which proves our first step. Now recall from (4.29) that for any natural number $n$,
we have
$$\frac{1}{n+1} < \log(n+1) - \log n < \frac{1}{n}. \tag{6.42}$$
In particular, taking logarithms of both sides of (6.41), we get
$$\log\Big(\sum_{n=1}^{N-1}\frac{1}{n}\Big) \le \log\Big(\prod_{p<N}\frac{p}{p-1}\Big) = \sum_{p<N}\big(\log p - \log(p-1)\big) \le \sum_{p<N}\frac{1}{p-1} \le \sum_{p<N}\frac{2}{p},$$
where we used that $p \le 2(p-1)$ (this is because $n \le 2(n-1)$ for all natural numbers
$n > 1$). Since $\sum_{n=1}^{N-1} 1/n \to \infty$ as $N \to \infty$, $\log\big(\sum_{n=1}^{N-1} 1/n\big) \to \infty$ as $N \to \infty$ as
well, so the sum $\sum 1/p$ must diverge.
Exercises 6.7.
1. Let $s_n = 1/2 + 1/3 + \cdots + 1/p_n$ (where $p_n$ is the $n$-th prime) be the $n$-th partial sum
of $\sum 1/p$. We know that $s_n \to \infty$ as $n \to \infty$. However, it turns out that $s_n \to \infty$
avoiding all integers! Prove this. Suggestion: Multiply $s_n$ by $2\cdot 3\cdots p_{n-1}$.
2. Niven's proof can be slightly modified to avoid using the square-free fact. Derive the
inequality (6.40) (which, as shown in the main text, implies that $\sum 1/p$ diverges) by
proving that for any prime $p$,
$$\Big(1+\frac{1}{p}\Big)\cdot\sum_{k=0}^{n}\frac{1}{p^{2k}} = \sum_{k=0}^{2n+1}\frac{1}{p^{k}}.$$
3. Here is another proof that is similar to Gilfeather and Meister’s argument where we
replace the inequality (6.42) with the following argument.
(i) Prove that
$$\frac{1}{1-x/2} \le e^x \quad\text{for all } 0 \le x \le 1. \tag{6.43}$$
Suggestion: Prove that $e^{-x} \le 1 - x/2$ using the series expansion for $e^{-x}$.
(ii) Taking logarithms of (6.43), prove that for any prime number $p$, we have
$$-\log\Big(1-\frac{1}{p}\Big) = -\log\Big(1-\frac{2/p}{2}\Big) \le \frac{2}{p}.$$
6.8. COMPOSITION OF POWER SERIES AND BERNOULLI AND EULER NUMBERS 325
where $g(z) = \sum_{n=0}^{\infty} a_n z^n$.
Proof. Let $f(z) = \sum_{n=0}^{\infty} b_n z^n$ have radius of convergence $R$ and let $g(z) = \sum_{n=0}^{\infty} a_n z^n$
have radius of convergence $r$. Then by Cauchy or Mertens' multiplication
theorem, for each $m$, we can write $g(z)^m$ as a power series:
$$g(z)^m = \Big(\sum_{n=0}^{\infty} a_n z^n\Big)^m = \sum_{n=0}^{\infty} a_{mn} z^n, \quad |z| < r.$$
Thus,
$$f(g(z)) = \sum_{m=0}^{\infty} b_m\, g(z)^m = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} b_m a_{mn} z^n.$$
If we are allowed to interchange the order of summation in f (g(z)), then our result
is proved:
$$f(g(z)) = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} b_m a_{mn} z^n = \sum_{n=0}^{\infty} c_n z^n, \quad\text{where } c_n = \sum_{m=0}^{\infty} b_m a_{mn}.$$
Thus, we can focus on interchanging the order of summation in f (g(z)). Assume
henceforth that
$$\xi := \sum_{n=0}^{\infty} |a_n z^n| = \sum_{n=0}^{\infty} |a_n|\,|z|^n < R = \text{the radius of convergence of } f;$$
in particular, since $f(\xi) = \sum_{m=0}^{\infty} b_m \xi^m$ is absolutely convergent,
$$\sum_{m=0}^{\infty} |b_m|\,\xi^m < \infty. \tag{6.44}$$
Now according to Cauchy's double series theorem, we can interchange the order of
summation as long as we can show that
$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} \big|b_m a_{mn} z^n\big| = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} |b_m|\,|a_{mn}|\,|z|^n < \infty.$$
To prove this, we first claim that the inner summation satisfies the inequality
$$\sum_{n=0}^{\infty} |a_{mn}|\,|z|^n \le \xi^m. \tag{6.45}$$
To see this, consider the case m = 2. Recall that the coefficients a2n are defined
via the Cauchy product:
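$$g(z)^2 = \Big(\sum_{n=0}^{\infty} a_n z^n\Big)^2 = \sum_{n=0}^{\infty} a_{2n} z^n \quad\text{where } a_{2n} = \sum_{k=0}^{n} a_k a_{n-k}.$$
Thus, $|a_{2n}| \le \sum_{k=0}^{n} |a_k|\,|a_{n-k}|$. On the other hand, we can express $\xi^2$ via the
Cauchy product:
$$\xi^2 = \Big(\sum_{n=0}^{\infty} |a_n|\,|z|^n\Big)^2 = \sum_{n=0}^{\infty} \alpha_n |z|^n \quad\text{where } \alpha_n = \sum_{k=0}^{n} |a_k|\,|a_{n-k}|.$$
Hence, $|a_{2n}| \le \sum_{k=0}^{n} |a_k|\,|a_{n-k}| = \alpha_n$, so
$$\sum_{n=0}^{\infty} |a_{2n}|\,|z|^n \le \sum_{n=0}^{\infty} \alpha_n |z|^n = \xi^2,$$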
which proves (6.45) for m = 2. An induction argument shows that (6.45) holds for
all m. Finally, using (6.45) and (6.44) we see that
$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} \big|b_m a_{mn} z^n\big| = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} |b_m|\,|a_{mn}|\,|z|^n \le \sum_{m=0}^{\infty} |b_m|\,\xi^m < \infty,$$
which shows that we can interchange the order of summation in f (g(z)) and com-
pletes our proof.
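The formula $c_n = \sum_m b_m a_{mn}$ can be tried out on truncated coefficient lists. The sketch below is illustrative only: it assumes $a_0 = 0$ (a simplifying assumption of ours, so that each $c_n$ is a finite sum and the truncation is exact), uses exact rational arithmetic, and the function names `multiply` and `compose` are hypothetical. As a check it composes $f = \exp$ with $g(z) = \log(1+z)$, whose composite should be $1 + z$.

```python
from fractions import Fraction
from math import factorial


def multiply(a, b, order):
    """Cauchy product of two coefficient lists, truncated after z**order."""
    c = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        for j, bj in enumerate(b[: order + 1 - i]):
            c[i + j] += ai * bj
    return c


def compose(b, a, order):
    """Coefficients c_0..c_order of f(g(z)); b are f's coefficients, a are g's.

    Assumes a[0] == 0, so g**m contributes nothing below z**m.
    """
    if a[0] != 0:
        raise ValueError("this sketch assumes g(0) = 0")
    c = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order  # coefficients of g**0
    for m in range(order + 1):
        for n in range(order + 1):
            c[n] += b[m] * power[n]                # adds b_m * a_{mn}
        power = multiply(power, a, order)          # coefficients of g**(m+1)
    return c


ORDER = 8
exp_coeffs = [Fraction(1, factorial(n)) for n in range(ORDER + 1)]
log_coeffs = [Fraction(0)] + [Fraction((-1) ** (n - 1), n) for n in range(1, ORDER + 1)]
result = compose(exp_coeffs, log_coeffs, ORDER)
print(result)
```

The list `power` plays the role of the row $a_{m0}, a_{m1}, \ldots$ in the proof, rebuilt for each $m$ by one more Cauchy product.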
We already know (by Mertens’ multiplication theorem for instance) that the
product of two power series is again a power series. As a consequence of the following
theorem, we get the same statement for division.
Theorem 6.34 (Power series division theorem). If $f(z)$ and $g(z)$ are power
series with positive radii of convergence and with $g(0) \ne 0$, then $f(z)/g(z)$ is also
a power series with positive radius of convergence.
Proof. Since $f(z)/g(z) = f(z)\cdot(1/g(z))$ and we know that the product of
two power series is a power series, all we have to do is show that $1/g(z)$ is a power
series. To this end, let $g(z) = \sum_{n=0}^{\infty} a_n z^n$ and define
$$\tilde g(z) := \frac{1}{a_0}\,g(z) - 1 = \sum_{n=1}^{\infty} \alpha_n z^n,$$
where $\alpha_n = a_n/a_0$ and where we recall that $a_0 = g(0) \ne 0$. Then $\tilde g$ has a positive
radius of convergence and $\tilde g(0) = 0$. Now let $h(z) := \frac{1}{a_0(1+z)}$, which can be written
as a geometric series with radius of convergence 1. Note that for $|z|$ small,
$\sum_{n=1}^{\infty} |\alpha_n|\,|z|^n < 1$ (why?), thus by the previous theorem, for such $z$,
$$\frac{1}{g(z)} = \frac{1}{a_0(\tilde g(z)+1)} = h(\tilde g(z))$$
has a power series expansion with a positive radius of convergence.
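In practice the coefficients of $1/g(z)$ are usually computed by a different route than the composition used in the proof: one compares coefficients in $g(z)\cdot c(z) = 1$, which gives $c_0 = 1/a_0$ and $c_n = -\frac{1}{a_0}\sum_{k=1}^{n} a_k c_{n-k}$. The sketch below uses that recurrence (our substitution, not the text's method); the name `reciprocal` is hypothetical. It is checked on $g(z) = e^z$, whose reciprocal $e^{-z}$ has coefficients $(-1)^n/n!$.

```python
from fractions import Fraction
from math import factorial


def reciprocal(a, order):
    """Coefficients c_0..c_order of 1/g(z), given g's coefficients a with a[0] != 0."""
    if a[0] == 0:
        raise ValueError("g(0) must be nonzero")
    c = [Fraction(1) / a[0]]
    for n in range(1, order + 1):
        # Coefficient of z**n in g(z)*c(z) must vanish.
        s = sum(a[k] * c[n - k] for k in range(1, n + 1))
        c.append(-s / a[0])
    return c


ORDER = 6
exp_coeffs = [Fraction(1, factorial(n)) for n in range(ORDER + 1)]
inv = reciprocal(exp_coeffs, ORDER)
print(inv)
```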
6.8.2. Bernoulli numbers. See [120], [54], [206], or [85] for more information
on Bernoulli numbers. Since
$$\frac{e^z - 1}{z} = \frac{1}{z}\cdot\sum_{n=1}^{\infty}\frac{1}{n!}z^n = \sum_{n=1}^{\infty}\frac{1}{n!}z^{n-1} = \sum_{n=0}^{\infty}\frac{1}{(n+1)!}z^n$$
has a power series expansion and equals 1 at $z = 0$, by our division of power series
theorem, the quotient $1/((e^z - 1)/z) = z/(e^z - 1)$ also has a power series expansion
near $z = 0$. It is customary to denote its coefficients by $B_n/n!$, in which case we
can write
$$\frac{z}{e^z - 1} = \sum_{n=0}^{\infty}\frac{B_n}{n!}z^n \tag{6.46}$$
where the series has a positive radius of convergence. The numbers Bn are called the
Bernoulli numbers after Jacob (Jacques) Bernoulli (1654–1705) who discovered
them while searching for formulas involving powers of integers; see Problems 3 and
4. We can find a remarkable symbolic equation for these Bernoulli numbers as
follows. First, we multiply both sides of (6.46) by $(e^z - 1)/z$ and use Mertens'
multiplication theorem to get
$$1 = \Big(\sum_{n=0}^{\infty}\frac{B_n}{n!}z^n\Big)\cdot\Big(\sum_{n=0}^{\infty}\frac{1}{(n+1)!}z^n\Big) = \sum_{n=0}^{\infty}\Big(\sum_{k=0}^{n}\frac{B_k}{k!}\cdot\frac{1}{(n-k+1)!}\Big)z^n.$$
By the identity theorem, the $n = 0$ term on the right must equal 1 while all other
terms must vanish. The $n = 0$ term on the right is just $B_0$, so $B_0 = 1$, and for
$n \ge 1$, we must have $\sum_{k=0}^{n}\frac{B_k}{k!}\cdot\frac{1}{(n+1-k)!} = 0$. Multiplying this by $(n+1)!$ we get
$$0 = \sum_{k=0}^{n}\frac{B_k}{k!}\cdot\frac{(n+1)!}{(n+1-k)!} = \sum_{k=0}^{n}\frac{(n+1)!}{k!\,(n+1-k)!}\cdot B_k = \sum_{k=0}^{n}\binom{n+1}{k}B_k,$$
and adding $B_{n+1} = \binom{n+1}{n+1}B_{n+1}$ to both sides of this equation, we get
$$B_{n+1} = \sum_{k=0}^{n+1}\binom{n+1}{k}B_k.$$
The right-hand side might look familiar from the binomial formula. Recall from
the binomial formula that for any complex number $a$, we have
$$(a+1)^{n+1} = \sum_{k=0}^{n+1}\binom{n+1}{k}a^k\cdot 1^{n+1-k} = \sum_{k=0}^{n+1}\binom{n+1}{k}a^k.$$
Notice that the right-hand side of this expression is exactly the right-hand side of
the previous equation if we put $a = B$ and make the superscript $k$ into a subscript
$k$. Thus, if we use the notation $\doteq$ to mean "equals after making superscripts into
subscripts", then we can write
$$B^{n+1} \doteq (B+1)^{n+1}, \quad n = 1, 2, 3, \ldots \quad\text{with } B_0 = 1. \tag{6.47}$$
Using the identity (6.47), one can in principle find all the Bernoulli numbers: When
$n = 1$, we see that
$$B^2 \doteq (B+1)^2 = B^2 + 2B^1 + 1 \implies 0 = 2B_1 + 1 \implies B_1 = -\frac{1}{2}.$$
When $n = 2$, we see that
$$B^3 \doteq (B+1)^3 = B^3 + 3B^2 + 3B^1 + 1 \implies 0 = 3B_2 + 3B_1 + 1 \implies B_2 = \frac{1}{6}.$$
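Here is a partial list through $B_{15}$:
$$B_0 = 1,\quad B_1 = -\frac{1}{2},\quad B_2 = \frac{1}{6},\quad B_3 = 0,$$
$$B_4 = -\frac{1}{30},\quad B_5 = B_7 = B_9 = B_{11} = B_{13} = B_{15} = 0,$$
$$B_6 = \frac{1}{42},\quad B_8 = -\frac{1}{30},\quad B_{10} = \frac{5}{66},\quad B_{12} = -\frac{691}{2730},\quad B_{14} = \frac{7}{6}.$$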
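The recurrence $\sum_{k=0}^{n}\binom{n+1}{k}B_k = 0$ for $n \ge 1$ (equivalently, the symbolic identity (6.47)) determines each $B_n$ from its predecessors, since the $k = n$ term is $(n+1)B_n$. Here is a minimal sketch in exact rational arithmetic; the function name `bernoulli_numbers` is our own.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] using sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        # Solve the recurrence for its last term, (n + 1) * B_n.
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B


B = bernoulli_numbers(16)
for n, value in enumerate(B):
    print(n, value)
```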
These numbers are rational, but besides this fact, there is no known regular pattern
these numbers conform to. However, we can easily deduce that all Bernoulli
numbers with odd index greater than one are zero. Indeed, we can rewrite (6.46) as
$$\frac{z}{e^z - 1} + \frac{z}{2} = 1 + \sum_{n=2}^{\infty}\frac{B_n}{n!}z^n. \tag{6.48}$$
The fractions on the left-hand side can be combined into one fraction
$$\frac{z}{e^z - 1} + \frac{z}{2} = \frac{z(e^z + 1)}{2(e^z - 1)} = \frac{z(e^{z/2} + e^{-z/2})}{2(e^{z/2} - e^{-z/2})}, \tag{6.49}$$
which is an even function of $z$. Thus (see Exercise 1 in Section 6.4),
$$B_{2n+1} = 0, \quad n = 1, 2, 3, \ldots. \tag{6.50}$$
Other properties are given in the exercises (see e.g. Problem 3).
where we used that $B_3, B_5, B_7, \ldots$ all vanish in order to sum only over all even Bernoulli
numbers. Since $\cot z = \cos z/\sin z$, using the definition of $\cos z$ and $\sin z$ in terms
of $e^{\pm iz}$, we see that the left-hand side is exactly $z\cot z$. Thus, we have derived the
formula
$$z\cot z = \sum_{n=0}^{\infty}(-1)^n\frac{2^{2n}B_{2n}}{(2n)!}z^{2n}.$$
From this formula, we can get the expansion for $\tan z$ by using the identity
$$2\cot(2z) = 2\,\frac{\cos 2z}{\sin 2z} = 2\,\frac{\cos^2 z - \sin^2 z}{2\sin z\cos z} = \cot z - \tan z.$$
Hence,
$$\tan z = \cot z - 2\cot(2z) = \sum_{n=0}^{\infty}(-1)^n\frac{2^{2n}B_{2n}}{(2n)!}z^{2n-1} - 2\sum_{n=0}^{\infty}(-1)^n\frac{2^{2n}B_{2n}}{(2n)!}2^{2n-1}z^{2n-1},$$
which, after combining the terms on the right, takes the form
$$\tan z = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{2^{2n}(2^{2n}-1)B_{2n}}{(2n)!}z^{2n-1}.$$
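As a quick plausibility check, the tangent formula should reproduce the familiar expansion $\tan z = z + \frac{1}{3}z^3 + \frac{2}{15}z^5 + \frac{17}{315}z^7 + \cdots$. The sketch below recomputes the Bernoulli numbers from the recurrence $\sum_{k=0}^{n}\binom{n+1}{k}B_k = 0$ so that it is self-contained; the helper names are our own.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def tan_coefficient(n, B):
    """Coefficient of z**(2n-1) in tan z for n >= 1, per the formula above."""
    return (-1) ** (n - 1) * 2 ** (2 * n) * (2 ** (2 * n) - 1) * B[2 * n] / factorial(2 * n)


B = bernoulli_numbers(10)
tan_coeffs = [tan_coefficient(n, B) for n in range(1, 5)]
print(tan_coeffs)
```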
6.8.4. The Euler numbers. It turns out that the expansion for $\sec z$ involves
the Euler numbers, which are defined in a similar way as the Bernoulli numbers.
By the division of power series theorem, the function $2e^z/(e^{2z} + 1)$ has a power
series expansion near zero. It is customary to denote its coefficients by $E_n/n!$, so
$$\frac{2e^z}{e^{2z} + 1} = \sum_{n=0}^{\infty}\frac{E_n}{n!}z^n \tag{6.51}$$
where the series has a positive radius of convergence. The numbers $E_n$ are called
the Euler numbers. We can get the missing expansion for $\sec z$ as follows. First,
observe that
$$\sum_{n=0}^{\infty}\frac{E_n}{n!}z^n = \frac{2e^z}{e^{2z} + 1} = \frac{2}{e^z + e^{-z}} = \frac{1}{\cosh z} = \operatorname{sech} z,$$
where $\operatorname{sech} z := 1/\cosh z$ is the hyperbolic secant. Since $\operatorname{sech} z$ is an even function
(that is, $\operatorname{sech}(-z) = \operatorname{sech} z$) it follows that all $E_n$ with $n$ odd vanish. Hence,
$$\operatorname{sech} z = \sum_{n=0}^{\infty}\frac{E_{2n}}{(2n)!}z^{2n}. \tag{6.52}$$
In particular, putting $iz$ for $z$ in (6.52) and using that $\cosh(iz) = \cos z$, we get the
missing expansion for $\sec z$:
$$\sec z = \sum_{n=0}^{\infty}(-1)^n\frac{E_{2n}}{(2n)!}z^{2n}.$$
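The even-indexed Euler numbers can be computed much like the Bernoulli numbers. Multiplying (6.52) by $\cosh z = \sum z^{2j}/(2j)!$ and comparing coefficients of $z^{2n}$ gives $\sum_{k=0}^{n}\binom{2n}{2k}E_{2k} = 0$ for $n \ge 1$; this recurrence is a step we supply here, not one spelled out in the text. A minimal sketch, with the hypothetical name `euler_numbers_even`:

```python
from fractions import Fraction
from math import comb, factorial


def euler_numbers_even(count):
    """Return [E_0, E_2, ..., E_{2(count-1)}] from sum_{k=0}^{n} C(2n, 2k) E_{2k} = 0."""
    E = [1]
    for n in range(1, count):
        # The k = n term of the recurrence is E_{2n} itself.
        E.append(-sum(comb(2 * n, 2 * k) * E[k] for k in range(n)))
    return E


E = euler_numbers_even(5)
# Coefficients of z**(2n) in sec z, namely (-1)**n * E_{2n} / (2n)!.
sec_coeffs = [Fraction((-1) ** n * E[n], factorial(2 * n)) for n in range(5)]
print(E)
print(sec_coeffs)
```

The resulting secant coefficients begin $1, \frac{1}{2}, \frac{5}{24}, \frac{61}{720}$, matching the usual expansion of $\sec z$.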
(b) Use Part (a) to find the first few coefficients of the expansion for tan z = sin z/ cos z.
3. (Cf. [120, p. 526] which is reproduced in [166]) In this and the next problem we give
an elegant application of the theory of Bernoulli numbers to determine the sum of the
first k-th powers of integers, Bernoulli’s original motivation for his numbers.
(i) For $n \in \mathbb{N}$, derive the formula
$$1 + e^z + e^{2z} + \cdots + e^{nz} = \frac{z}{e^z - 1}\cdot\frac{e^{(n+1)z} - 1}{z}.$$
(ii) Writing each side of this identity as a power series (on the right, you need to use
the Cauchy product), derive the formula
$$1^k + 2^k + \cdots + n^k = \sum_{j=0}^{k}\binom{k}{j}B_j\,\frac{(n+1)^{k+1-j}}{k+1-j}, \quad k = 1, 2, \ldots. \tag{6.55}$$
$$1^k + 2^k + \cdots + n^k \doteq \frac{1}{k+1}\Big[(n+1+B)^{k+1} - B^{k+1}\Big].$$
Suggestion: Look for a telescoping sum and recall that $(B+1)^j \doteq B^j$ for $j \ge 2$.
5. The $n$-th Bernoulli polynomial $B_n(t)$ is, by definition, $n!$ times the coefficient of $z^n$ in
the power series expansion in $z$ of the function $f(z,t) := z e^{zt}/(e^z - 1)$; that is,
$$\frac{z\,e^{zt}}{e^z - 1} = \sum_{n=0}^{\infty}\frac{B_n(t)}{n!}z^n. \tag{6.56}$$
(a) Prove that $B_n(t) = \sum_{k=0}^{n}\binom{n}{k}B_k\,t^{n-k}$ where the $B_k$'s are the Bernoulli numbers.
Thus, the first few Bernoulli polynomials are
$$B_0(t) = 1,\quad B_1(t) = t - \frac{1}{2},\quad B_2(t) = t^2 - t + \frac{1}{6},\quad B_3(t) = t^3 - \frac{3}{2}t^2 + \frac{1}{2}t.$$
(b) Prove that $B_n(0) = B_n$ for $n = 0, 1, \ldots$ and that $B_n(0) = B_n(1) = B_n$ for $n \ne 1$.
Suggestion: Show that $f(z,1) = z + f(z,0)$.
(c) Prove that $B_n(t+1) - B_n(t) = n t^{n-1}$ for $n = 0, 1, 2, \ldots$. Suggestion: Show that
$f(z,t+1) - f(z,t) = z e^{zt}$.
(d) Prove that $B_{2n+1}(0) = 0$ for $n = 1, 2, \ldots$ and $B_{2n+1}(1/2) = 0$ for $n = 0, 1, \ldots$.
With this motivation, given any complex number $\alpha$, we define the binomial
coefficient $\binom{\alpha}{n}$ for any nonnegative integer $n$ as follows: $\binom{\alpha}{0} = 1$ and for $n > 0$,
$$\binom{\alpha}{n} = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}. \tag{6.58}$$
Note that if $\alpha = 0, 1, 2, \ldots$, then we see that all $\binom{\alpha}{n}$ vanish for $n \ge \alpha + 1$ and $\binom{\alpha}{n}$
is exactly the usual binomial coefficient (6.57). In the following lemma, we derive
an identity that will be useful later.
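Definition (6.58) translates directly into code. The sketch below, with the hypothetical name `gen_binom`, evaluates $\binom{\alpha}{n}$ exactly for rational $\alpha$ and then spot-checks the convolution identity $\binom{\alpha+\beta}{n} = \sum_{k=0}^{n}\binom{\alpha}{k}\binom{\beta}{n-k}$ that the lemma's Steps 1 to 3 establish. A numerical check at a few rational points is of course no substitute for the proof.

```python
from fractions import Fraction


def gen_binom(alpha, n):
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-n+1)/n!, as in (6.58)."""
    result = Fraction(1)
    for i in range(n):
        result = result * (alpha - i) / (i + 1)
    return result


alpha, beta = Fraction(1, 2), Fraction(-3, 2)
for n in range(6):
    lhs = gen_binom(alpha + beta, n)
    rhs = sum(gen_binom(alpha, k) * gen_binom(beta, n - k) for k in range(n + 1))
    print(n, lhs, rhs)
```

For nonnegative integer `alpha` the loop hits a zero factor as soon as `n` exceeds `alpha`, which is the vanishing noted after (6.58).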
where at the last step we formed the Cauchy product of $(1+z)^p(1+z)^q$. By the
identity theorem we must have
$$\binom{p+q}{n} = \sum_{k=0}^{n}\binom{p}{k}\binom{q}{n-k}, \quad\text{for all } p, q, n \in \mathbb{N}_0.$$
Step 2: Assume now that $\beta = q \in \mathbb{N}_0$, $n \in \mathbb{N}_0$, and define $f : \mathbb{C} \to \mathbb{C}$ by
$$f(z) := \binom{z+q}{n} - \sum_{k=0}^{n}\binom{z}{k}\binom{q}{n-k}.$$
In view of the definition (6.58) of the binomial coefficient, it follows that $f(z)$
is a polynomial in $z$ of degree at most $n$. Moreover, by Step 1 we know that
$f(p) = 0$ for all $p \in \mathbb{N}_0$. In particular, the polynomial $f(z)$ has more than $n$ roots.
Therefore, $f(z)$ must be the zero polynomial, so in particular, given any $\alpha \in \mathbb{C}$, we
have $f(\alpha) = 0$; that is,
$$\binom{\alpha+q}{n} = \sum_{k=0}^{n}\binom{\alpha}{k}\binom{q}{n-k}, \quad\text{for all } \alpha \in \mathbb{C},\ q, n \in \mathbb{N}_0.$$
Step 3: Let $\alpha \in \mathbb{C}$, $n \in \mathbb{N}_0$, and define $g : \mathbb{C} \to \mathbb{C}$ by
$$g(z) := \binom{\alpha+z}{n} - \sum_{k=0}^{n}\binom{\alpha}{k}\binom{z}{n-k}.$$
see that $f(\alpha,z)$ converges for all other $\alpha$, assume that $\alpha \in \mathbb{C}$ is not a nonnegative
integer. Then setting $a_n = \binom{\alpha}{n}$, we have
$$\left|\frac{a_n}{a_{n+1}}\right| = \left|\frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}\cdot\frac{(n+1)!}{\alpha(\alpha-1)\cdots(\alpha-n)}\right| = \frac{n+1}{|\alpha-n|},$$
which approaches 1 as $n \to \infty$. Thus, the radius of convergence of $f(\alpha,z)$ is 1 (see
(6.12)). In conclusion, $f(\alpha,z)$ is convergent for all $\alpha \in \mathbb{C}$ and $|z| < 1$.
We now prove the real versions of the logarithm series and the binomial series
(6.59); see Theorem 6.37 below for the more general complex version. It is worth
emphasizing that we do not use the advanced technology of the differential and
integral calculus to derive these formulas!
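Lemma 6.36. For all $x \in \mathbb{R}$ with $|x| < 1$, we have
$$\log(1+x) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}x^n$$
and for all $\alpha \in \mathbb{C}$ and $x \in \mathbb{R}$ with $|x| < 1$, we have
$$(1+x)^{\alpha} = \sum_{n=0}^{\infty}\binom{\alpha}{n}x^n = 1 + \alpha x + \frac{\alpha(\alpha-1)}{2!}x^2 + \cdots.$$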
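Both series of the lemma are easy to test numerically against the standard library. The sketch below is a floating-point sanity check for real exponents only; the function names are ours, and the binomial coefficients are built with the ratio $\binom{\alpha}{n+1} = \binom{\alpha}{n}\cdot\frac{\alpha-n}{n+1}$.

```python
import math


def log_series(x, terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) x^n / n."""
    return sum((-1) ** (n - 1) * x**n / n for n in range(1, terms + 1))


def binomial_series(alpha, x, terms):
    """Partial sum of sum_{n>=0} binom(alpha, n) x^n for real alpha."""
    total, coeff = 0.0, 1.0            # coeff holds binom(alpha, n)
    for n in range(terms):
        total += coeff * x**n
        coeff *= (alpha - n) / (n + 1)
    return total


print(log_series(0.5, 60), math.log(1.5))
print(binomial_series(1 / 3, 0.5, 80), 1.5 ** (1 / 3))
print(binomial_series(-2.0, 0.3, 200), 1.3 ** -2.0)
```

Convergence slows markedly as $|x|$ approaches 1, in line with the radius of convergence being 1.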
Now put $z = x \in \mathbb{R}$ with $|x| < 1$ and let $q \in \mathbb{N}$ be odd. Then $f(1/q,x)^q = 1 + x$,
so taking $q$-th roots, we get $f(1/q,x) = (1+x)^{1/q}$. Here we used that every real
number has a unique $q$-th root, which holds because $q$ is odd; for $q$ even we could
only conclude that $f(1/q,x) = \pm(1+x)^{1/q}$ (unless we checked that $f(1/q,x)$ is
positive, then we would get $f(1/q,x) = (1+x)^{1/q}$). Therefore,
$$f(r,x) = f(p/q,x) = f(\underbrace{1/q + \cdots + 1/q}_{p \text{ times}},\,x) = \underbrace{f(1/q,x)\cdots f(1/q,x)}_{p \text{ times}}$$
in particular, since we know that power series are continuous, $f(\alpha,z)$ is a continuous
function of $\alpha \in \mathbb{C}$. Here, the coefficients $a_m(z)$ depend on $z$ (which we'll see are
power series in $z$) and we'll show that
$$a_1(z) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n. \tag{6.60}$$
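for some coefficients $a_{mn}$. Defining $a_{mn} = 0$ for $m = n+1, n+2, n+3, \ldots$, we can
write $\binom{\alpha}{n} = \sum_{m=0}^{\infty} a_{mn}\alpha^m$. Hence,
$$f(\alpha,z) = 1 + \sum_{n=1}^{\infty}\binom{\alpha}{n}z^n = 1 + \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{mn}\alpha^m z^n. \tag{6.62}$$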
where the $b_{mn}$'s are nonnegative real numbers. (This is certainly plausible because
the numbers $1, 2, \ldots, n-1$ on the left each come with positive signs; in any case, this
statement can be verified by induction for instance.) We secondly observe that
replacing $\alpha$ with $-\alpha$ in (6.61), we get
$$\sum_{m=1}^{n} a_{mn}(-1)^m\alpha^m = \frac{-\alpha(-\alpha-1)\cdots(-\alpha-n+1)}{n!} = (-1)^n\frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!} = (-1)^n\sum_{m=1}^{n} b_{mn}\alpha^m.$$
By the identity theorem, we have $a_{mn}(-1)^m = (-1)^n b_{mn}$. In particular, $|a_{mn}| = b_{mn}$
since $b_{mn} \ge 0$, therefore in view of (6.63), we see that
$$\sum_{m=0}^{\infty} |a_{mn}|\,|\alpha|^m = \sum_{m=0}^{n} |a_{mn}|\,|\alpha|^m = \sum_{m=0}^{n} b_{mn}|\alpha|^m = \frac{|\alpha|(|\alpha|+1)\cdots(|\alpha|+n-1)}{n!}.$$
Therefore,
$$\sum_{n=1}^{\infty}\sum_{m=1}^{\infty} |a_{mn}|\,|\alpha|^m|z|^n = \sum_{n=1}^{\infty}\frac{|\alpha|(|\alpha|+1)\cdots(|\alpha|+n-1)}{n!}|z|^n.$$
Using the now very familiar ratio test it's easily checked that, since $|z| < 1$, the
series on the right converges. Thus, we can iterate sums in (6.62) and conclude
that
$$f(\alpha,z) = 1 + \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{mn}\alpha^m z^n = 1 + \sum_{m=1}^{\infty}\Big(\sum_{n=1}^{\infty} a_{mn}z^n\Big)\alpha^m.$$
$$a_{1n} = \text{coefficient of } \alpha \text{ in } \frac{\alpha(\alpha-1)(\alpha-2)\cdots(\alpha-n+1)}{n!} = \frac{(-1)(-2)(-3)\cdots(-n+1)}{n!} = (-1)^{n-1}\frac{(n-1)!}{n!} = \frac{(-1)^{n-1}}{n}.$$
Therefore,
$$a_1(z) = \sum_{n=1}^{\infty} a_{1n}z^n = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n,$$
just as we stated in (6.60). This completes Step 2.
Step 3: We are finally ready to prove our theorem. Let $x \in \mathbb{R}$ with $|x| < 1$.
By Step 2, we know that for any $\alpha \in \mathbb{C}$,
$$f(\alpha,x) = 1 + \sum_{m=1}^{\infty} a_m(x)\,\alpha^m$$
Using this lemma and the identity theorem, we are ready to generalize these
formulas for real x to formulas for complex z.
Theorem 6.37 (The complex logarithm and binomial series). We have
$$\operatorname{Log}(1+z) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n, \quad |z| \le 1,\ z \ne -1,$$
Proof. We prove this theorem first for $\operatorname{Log}(1+z)$, then for $(1+z)^{\alpha}$.
6.9. THE LOGARITHMIC, BINOMIAL, ARCTANGENT SERIES, AND γ 337
Step 1: Let us define $f(z) := \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n$. Then one can check that the
radius of convergence of $f(z)$ is 1, so by our power series composition theorem, we
know that $\exp(f(z))$ can be written as a power series:
$$\exp(f(z)) = \sum_{n=0}^{\infty} a_n z^n, \quad |z| < 1.$$
Restricting to real values of $z$, by our lemma we know that $f(x) = \log(1+x)$, so
$$\sum_{n=0}^{\infty} a_n x^n = \exp(f(x)) = \exp(\log(1+x)) = 1 + x.$$
By the identity theorem for power series, we must have $a_0 = 1$, $a_1 = 1$, and all
other $a_n = 0$. Thus, $\exp(f(z)) = 1 + z$. Since $\exp(\operatorname{Log}(1+z)) = 1 + z$ as well, we
have
$$\exp(f(z)) = \exp(\operatorname{Log}(1+z)),$$
which implies that $f(z) = \operatorname{Log}(1+z) + 2\pi i k$ for some integer $k$. Setting $z = 0$
shows that $k = 0$ and hence proves that $\operatorname{Log}(1+z) = f(z) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n$.
We now prove that $\operatorname{Log}(1+z) = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n$ holds for $|z| = 1$ with $z \ne -1$
(note that for $z = -1$, both sides of this equality are not defined). If $|z| = 1$, then
we can write $z = -e^{ix}$ with $x \in (0, 2\pi)$. Recall from Example 6.4 in Section 6.1
that for any $x \in (0, 2\pi)$, the series $\sum_{n=1}^{\infty}\frac{e^{inx}}{n}$ converges. Hence, as
$$-\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n = \sum_{n=1}^{\infty}\frac{(-1)^{n}(-e^{ix})^n}{n} = \sum_{n=1}^{\infty}\frac{e^{inx}}{n}, \tag{6.64}$$
it follows that $\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n$ converges for $|z| = 1$ with $z \ne -1$. Now fix a point
$z_0$ with $|z_0| = 1$ and $z_0 \ne -1$, and let us take $z \to z_0$ through the straight line from
$z = 0$ to $z = z_0$ (that is, let $z = t z_0$ where $0 \le t \le 1$ and take $t \to 1^-$). Since the
ratio
$$\frac{|z_0 - z|}{1 - |z|} = \frac{|z_0 - t z_0|}{1 - |t z_0|} = \frac{|z_0 - t z_0|}{1 - t} = \frac{|1 - t|}{1 - t} = 1$$
is bounded by a fixed constant, by Abel's theorem (Theorem 6.20), it follows
that
$$\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z_0^n = \lim_{z\to z_0}\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n}z^n = \lim_{z\to z_0}\operatorname{Log}(1+z) = \operatorname{Log}(1+z_0),$$
Restricting to real $z = x \in \mathbb{R}$ with $|x| < 1$, by our lemma we know that $(1+x)^{\alpha} = f(\alpha,x)$.
Hence, by the identity theorem, we must have $(1+z)^{\alpha} = f(\alpha,z)$ for all
$z \in \mathbb{C}$ with $|z| < 1$. This proves the binomial series.
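For any $z \in \mathbb{C}$ with $|z| < 1$, we have $\operatorname{Log}\big((1+z)/(1-z)\big) = \operatorname{Log}(1+z) - \operatorname{Log}(1-z)$.
Therefore, we can use this theorem to prove that (see Problem 1)
$$\frac{1}{2}\operatorname{Log}\Big(\frac{1+z}{1-z}\Big) = \sum_{n=0}^{\infty}\frac{z^{2n+1}}{2n+1}. \tag{6.65}$$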
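The complex logarithm series and formula (6.65) can be spot-checked with Python's `cmath`, whose `log` is the principal logarithm. This is a numerical illustration at one sample point of our choosing, $z = 0.5 + 0.5i$, not a verification of the theorem; the function names are hypothetical.

```python
import cmath


def log_series(z, terms):
    """Partial sum of sum_{n>=1} (-1)^(n-1) z^n / n."""
    return sum((-1) ** (n - 1) * z**n / n for n in range(1, terms + 1))


def half_log_ratio_series(z, terms):
    """Partial sum of sum_{n>=0} z^(2n+1) / (2n+1), the right side of (6.65)."""
    return sum(z ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


z = 0.5 + 0.5j                      # sample point with |z| < 1
print(log_series(z, 200), cmath.log(1 + z))
print(half_log_ratio_series(z, 200), 0.5 * cmath.log((1 + z) / (1 - z)))
```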
The first two formulas are due to Euler and the last one to Philippe Flajolet and
Ilan Vardi (see [203, pp. 4,5], [75]).
Exercises 6.9.
1. Fill in the details in the proof of formula (6.65).
2. Derive the remarkably pretty formulas:
$$\frac{1}{2}(\operatorname{Arctan} z)^2 = \sum_{n=0}^{\infty}\frac{(-1)^n}{2n+2}\Big(1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n+1}\Big)z^{2n+2},$$
(ii) Prove that $\gamma = 1 - \log 2 + \sum_{n=2}^{\infty}\frac{(-1)^n}{n}\big(\zeta(n) - 1\big)$ using (i) and Problem 10 in
Exercises 6.5. Show that this formula is equivalent to the first formula in (6.66).
(iii) Using the second and third formulas in Problem 7a of Exercises 4.6, derive the
second and third formulas in (6.66).
6The value of π has engaged the attention of many mathematicians and calculators from the
time of Archimedes to the present day, and has been computed from so many different formulae,
that a complete account of its calculation would almost amount to a history of mathematics.
James Glaisher (1848–1928) [82].
6.10. F π, EULER, FIBONACCI, LEIBNIZ, MADHAVA, AND MACHIN 341
due to Philippe Flajolet and Ilan Vardi (see [204, p. 1], [232, 75]).
6.10.2. Euler's arctangent formula and the Fibonacci numbers. In
1738, Euler derived a very pretty two-angle arctangent expression for $\pi$:
$$\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{3}. \tag{6.68}$$
This formula is very easy to derive. We start off with the addition formula for
tangent (see (4.34), but now considering real variables)
$$\frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi} = \tan(\theta + \phi), \tag{6.69}$$
where it is assumed that $1 - \tan\theta\tan\phi \ne 0$. Let $x = \tan\theta$ and $y = \tan\phi$ and
assume that $-\pi/2 < \theta + \phi < \pi/2$. Then taking arctangents of both sides of the
above equation, we obtain
$$\arctan\Big(\frac{x+y}{1-xy}\Big) = \theta + \phi,$$
or after putting the right-hand side in terms of $x$, $y$, we get
$$\arctan\Big(\frac{x+y}{1-xy}\Big) = \arctan x + \arctan y. \tag{6.70}$$
Setting $x = 1/2$ and $y = 1/3$ and using that
$$\frac{x+y}{1-xy} = \frac{5/6}{1 - 1/6} = 1,$$
we get
$$\arctan 1 = \arctan\frac{1}{2} + \arctan\frac{1}{3}.$$
This expression is just (6.68).
In Problem 9 of Exercises 2.2 we studied the Fibonacci sequence, named
after Leonardo Fibonacci (1170–1250): $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for
all $n \ge 2$, and you proved that for every $n$,
$$F_n = \frac{1}{\sqrt{5}}\Big[\Phi^n - (-\Phi)^{-n}\Big], \quad \Phi = \frac{1+\sqrt{5}}{2}. \tag{6.71}$$
We can use (6.68) and (6.70) to derive the following fascinating formula for $\pi/4$ in
terms of the (odd-indexed) Fibonacci numbers due to Lehmer [131] (see Problem
2 and [133]):
$$\frac{\pi}{4} = \sum_{n=1}^{\infty}\arctan\Big(\frac{1}{F_{2n+1}}\Big). \tag{6.72}$$
Also, in Problem 3 you will prove the following series for $\pi$, due to Castellanos [47]:
$$\frac{\pi}{\sqrt{5}} = \sum_{n=0}^{\infty}\frac{(-1)^n F_{2n+1}\,2^{2n+3}}{(2n+1)(3+\sqrt{5})^{2n+1}}. \tag{6.73}$$
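Formulas (6.68), (6.72), and (6.73) can all be checked numerically in a few lines. The sketch below is a floating-point illustration; the helper `fibonacci` and the variable names are ours, and the Lehmer sum is taken to start at $n = 1$, so that its first term is $\arctan(1/F_3) = \arctan(1/2)$.

```python
import math


def fibonacci(count):
    """Return [F_0, F_1, ..., F_{count-1}]."""
    F = [0, 1]
    while len(F) < count:
        F.append(F[-1] + F[-2])
    return F


F = fibonacci(100)

# Euler's two-angle formula (6.68).
euler = math.atan(1 / 2) + math.atan(1 / 3)

# Lehmer's series (6.72), truncated after 39 terms.
lehmer = sum(math.atan(1 / F[2 * n + 1]) for n in range(1, 40))

# Castellanos' series (6.73), truncated after 40 terms.
castellanos = sum(
    (-1) ** n * F[2 * n + 1] * 2 ** (2 * n + 3)
    / ((2 * n + 1) * (3 + math.sqrt(5)) ** (2 * n + 1))
    for n in range(40)
)

print(euler, lehmer, math.pi / 4)
print(castellanos, math.pi / math.sqrt(5))
```

Both series converge geometrically, with ratios governed by powers of $1/\Phi$, so a few dozen terms already reach double precision.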
Example 6.42. Machin's formula gives many decimal places of $\pi$ without much
effort. Let $s_n$ denote the $n$-th partial sum of $s := 16\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)5^{2n+1}}$ and $t_n$ that
of $t := 4\sum_{n=0}^{\infty}\frac{(-1)^n}{(2n+1)239^{2n+1}}$. Then $\pi = s - t$ and by the alternating series error
estimate,
$$|s - s_3| \le \frac{16}{9\cdot 5^9} \approx 9.102\times 10^{-7}$$
and
$$|t - t_0| \le \frac{4}{3\cdot(239)^3} \approx 10^{-7}.$$
Therefore,
$$|\pi - (s_3 - t_0)| = |(s - t) - (s_3 - t_0)| \le |s - s_3| + |t - t_0| < 5\times 10^{-6}.$$
A manageable computation (even without a calculator!) shows that $s_3 - t_0 = 3.14159\ldots$.
Therefore, $\pi = 3.14159$ to five decimal places!
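The "manageable computation" can be reproduced exactly with rational arithmetic. In the sketch below, `arctan_partial` is a hypothetical helper returning the partial sum $\sum_{n=0}^{N}\frac{(-1)^n}{(2n+1)x^{2n+1}}$ of the arctangent series at $1/x$, so that $s_3$ and $t_0$ are as defined in the example.

```python
from fractions import Fraction
import math


def arctan_partial(inv_x, last_index):
    """Partial sum, n = 0..last_index, of (-1)^n / ((2n+1) * inv_x^(2n+1))."""
    return sum(
        Fraction((-1) ** n, (2 * n + 1) * inv_x ** (2 * n + 1))
        for n in range(last_index + 1)
    )


s3 = 16 * arctan_partial(5, 3)       # four terms of the series for s
t0 = 4 * arctan_partial(239, 0)      # one term of the series for t
approx = float(s3 - t0)

print(approx)
print(abs(approx - math.pi))
```

Five fractions in total suffice, which is what made Machin-type formulas practical for hand computation.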
Exercises 6.10.
1. From Gregory-Madhava's series, derive the following pretty series
$$\frac{\pi}{2\sqrt{3}} = 1 - \frac{1}{3\cdot 3} + \frac{1}{5\cdot 3^2} - \frac{1}{7\cdot 3^3} + \frac{1}{9\cdot 3^4} - + \cdots.$$
Suggestion: Consider $\arctan(1/\sqrt{3}) = \pi/6$. How many terms of this series do you need
to approximate $\pi/(2\sqrt{3})$ to within seven decimal places? History Bite: Abraham Sharp
(1651–1742) used this formula in 1669 to compute $\pi$ to 72 decimal places, and Thomas
Fantet de Lagny (1660–1734) used this formula in 1717 to compute $\pi$ to 126 decimal
places (with a mistake in the 113-th place) [47].
2. In this problem we prove (6.72).
(i) Prove that $\arctan\frac{1}{3} = \arctan\frac{1}{5} + \arctan\frac{1}{8}$, and use this to prove that
$$\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{5} + \arctan\frac{1}{8}.$$
Prove that $\arctan\frac{1}{8} = \arctan\frac{1}{13} + \arctan\frac{1}{21}$, and use this to prove that
$$\frac{\pi}{4} = \arctan\frac{1}{2} + \arctan\frac{1}{5} + \arctan\frac{1}{13} + \arctan\frac{1}{21}.$$
From here you can now see the appearance of Fibonacci numbers.
(ii) To continue this by induction, prove that for every natural number $n$,
$$F_{2n} = \frac{F_{2n+1}F_{2n+2} - 1}{F_{2n+3}}.$$
Suggestion: Can you use (6.71)?
(iii) Using the formula in (ii), prove that
$$\arctan\frac{1}{F_{2n}} = \arctan\frac{1}{F_{2n+1}} + \arctan\frac{1}{F_{2n+2}}.$$
Using this formula derive (6.72).
3. In this problem we prove (6.73).
(i) Using (6.70), prove that
$$\tan^{-1}\Big(\frac{\sqrt{5}\,x}{1 - x^2}\Big) = \tan^{-1}\Big(\frac{1+\sqrt{5}}{2}\,x\Big) - \tan^{-1}\Big(\frac{1-\sqrt{5}}{2}\,x\Big).$$
(ii) Now prove that
$$\tan^{-1}\Big(\frac{\sqrt{5}\,x}{1 - x^2}\Big) = \sum_{n=0}^{\infty}\frac{(-1)^n F_{2n+1}\,x^{2n+1}}{5^n(2n+1)}.$$
6.11. ★ Another proof that $\pi^2/6 = \sum_{n=1}^{\infty} 1/n^2$ (The Basel problem)
Assuming only Gregory-Leibniz-Madhava's series: $\frac{\pi}{4} = \sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1}$, we give
our seventh proof of the fact that$^7$
$$\frac{\pi^2}{6} = \sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots.$$
According to Knopp [120, p. 324], the proof we are about to give “may be regarded
as the most elementary of all known proofs, since it borrows nothing from the theory
of functions except the Leibniz series”. Knopp attributes the main ideas of the proof
to Nicolaus Bernoulli (1687–1759).
6.11.1. Cauchy's arithmetic mean theorem. Before giving our seventh proof
of Euler's sum, we prove the following theorem (attributed to Cauchy by Knopp
[120, p. 72]).
Theorem 6.40 (Cauchy's arithmetic mean theorem). If a sequence $a_1$,
$a_2, a_3, \ldots$ converges to $L$, then the sequence of arithmetic means (or averages)
$$m_n := \frac{1}{n}\big(a_1 + a_2 + \cdots + a_n\big)$$
also converges to $L$. Moreover, if the sequence $\{a_n\}$ is nonincreasing, then so is the
sequence of arithmetic means $\{m_n\}$.
Proof. To show that $m_n \to L$, we need to show that
$$m_n - L = \frac{1}{n}\big((a_1 - L) + (a_2 - L) + \cdots + (a_n - L)\big)$$
tends to zero as $n \to \infty$. Let $\varepsilon > 0$ and choose $N \in \mathbb{N}$ so that for all $n > N$, we
have $|a_n - L| < \varepsilon/2$. Then for $n > N$, we can write
$$\begin{aligned}
|m_n - L| &\le \frac{1}{n}\big|(a_1 - L) + \cdots + (a_N - L)\big| + \frac{1}{n}\big|(a_{N+1} - L) + \cdots + (a_n - L)\big| \\
&\le \frac{1}{n}\big|(a_1 - L) + \cdots + (a_N - L)\big| + \frac{1}{n}\Big(\frac{\varepsilon}{2} + \cdots + \frac{\varepsilon}{2}\Big) \\
&= \frac{1}{n}\big|(a_1 - L) + \cdots + (a_N - L)\big| + \frac{n - N}{n}\cdot\frac{\varepsilon}{2} \\
&\le \frac{1}{n}\big|(a_1 - L) + \cdots + (a_N - L)\big| + \frac{\varepsilon}{2}.
\end{aligned}$$
By choosing $n$ larger, we can make $\frac{1}{n}\big|(a_1 - L) + \cdots + (a_N - L)\big|$ also less than $\varepsilon/2$,
which shows that $|m_n - L| < \varepsilon$ for $n$ sufficiently large. This shows that $m_n \to L$.
Assume now that $\{a_n\}$ is nonincreasing. We shall prove that $\{m_n\}$ is also
nonincreasing; that is, for each $n$,
$$\frac{1}{n+1}\big(a_1 + \cdots + a_n + a_{n+1}\big) \le \frac{1}{n}\big(a_1 + \cdots + a_n\big),$$
or, after multiplying both sides by $n(n+1)$,
$$n\big(a_1 + \cdots + a_n\big) + n\,a_{n+1} \le n\big(a_1 + \cdots + a_n\big) + \big(a_1 + \cdots + a_n\big).$$
But this inequality certainly holds since $a_{n+1} \le a_k$ for $k = 1, 2, \ldots, n$. This
completes the proof.
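Both conclusions of the theorem can be observed on a concrete sequence. The sketch below takes $a_n = 1 + 1/n$ (our choice of example, nonincreasing with limit $L = 1$) and computes the running averages exactly; `arithmetic_means` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import accumulate


def arithmetic_means(seq):
    """Return [m_1, m_2, ...] where m_n is the average of the first n terms."""
    return [total / (n + 1) for n, total in enumerate(accumulate(seq))]


a = [1 + Fraction(1, n) for n in range(1, 2001)]
m = arithmetic_means(a)

print(float(a[-1]), float(m[-1]))                              # both close to L = 1
print(all(m[i] >= m[i + 1] for i in range(len(m) - 1)))        # means are nonincreasing
```

The averages lag well behind the sequence itself (here $m_n - 1$ is of order $(\log n)/n$ while $a_n - 1 = 1/n$), which reflects the role of the fixed initial block $a_1, \ldots, a_N$ in the proof.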
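There is a related theorem for geometric means found in Problem 2, which can
be used to derive the following neat formula:
$$e = \lim_{n\to\infty}\bigg[\Big(\frac{2}{1}\Big)^1\Big(\frac{3}{2}\Big)^2\Big(\frac{4}{3}\Big)^3\cdots\Big(\frac{n+1}{n}\Big)^n\bigg]^{1/n}. \tag{6.74}$$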
6.11.2. Proof VII of Euler's formula for $\pi^2/6$. First we shall apply Abel's
multiplication theorem to Gregory-Leibniz-Madhava's series:
$$\Big(\frac{\pi}{4}\Big)^2 = \Big(\sum_{n=0}^{\infty}(-1)^n\frac{1}{2n+1}\Big)\cdot\Big(\sum_{n=0}^{\infty}(-1)^n\frac{1}{2n+1}\Big).$$
provided that the series converges! To see that this series converges, note that $m_n$
is exactly the arithmetic mean, or average, of the numbers $1, 1/3, \ldots, 1/(2n+1)$.
Since $1/(2n+1) \to 0$ monotonically, Cauchy's arithmetic mean theorem shows that
these averages also tend to zero monotonically. In particular, by the alternating
series theorem, $\sum_{n=0}^{\infty}(-1)^n m_n$ converges, so by Abel's multiplication theorem, we
get (not quite $\pi^2/6$, but pretty nonetheless)
$$\frac{\pi^2}{16} = \sum_{n=0}^{\infty}(-1)^n\frac{1}{n+1}\Big(1 + \frac{1}{3} + \cdots + \frac{1}{2n+1}\Big). \tag{6.75}$$
We evaluate the right-hand side using the following theorem (whose proof is tech-
nical so you can skip it if you like).
Theorem 6.41. Let $\{a_n\}$ be a nonincreasing sequence of positive numbers such
that $\sum a_n^2$ converges. Then both series
$$s := \sum_{n=0}^{\infty}(-1)^n a_n \quad\text{and}\quad \delta_k := \sum_{n=0}^{\infty} a_n a_{n+k}, \quad k = 1, 2, 3, \ldots$$
converge. Moreover, $\Delta := \sum_{k=1}^{\infty}(-1)^{k-1}\delta_k$ also converges, and we have the formula
$$\sum_{n=0}^{\infty} a_n^2 = s^2 + 2\Delta.$$
Proof. Since $\sum a_n^2$ converges, we must have $a_n \to 0$, which implies that
$\sum(-1)^n a_n$ converges by the alternating series test. By monotonicity, $a_n a_{n+k} \le a_n\cdot a_n = a_n^2$
and since $\sum a_n^2$ converges, by comparison, so does each series $\delta_k = \sum_{n=0}^{\infty} a_n a_{n+k}$.
Also by monotonicity,
$$\delta_{k+1} = \sum_{n=0}^{\infty} a_n a_{n+k+1} \le \sum_{n=0}^{\infty} a_n a_{n+k} = \delta_k,$$
so by the alternating series test, the sum $\Delta$ converges if $\delta_k \to 0$. To prove that this
holds, let $\varepsilon > 0$ and choose $N$ (by invoking the Cauchy criterion for series) such
that $a_{N+1}^2 + a_{N+2}^2 + \cdots < \varepsilon/2$. Then, since the sequence $\{a_n\}$ is nonincreasing, we
can write
$$\begin{aligned}
\delta_k &= \sum_{n=0}^{\infty} a_n a_{n+k} \\
&= a_0 a_k + \cdots + a_N a_{N+k} + a_{N+1}a_{N+1+k} + a_{N+2}a_{N+2+k} + \cdots \\
&\le a_0 a_k + \cdots + a_N a_k + a_{N+1}^2 + a_{N+2}^2 + a_{N+3}^2 + \cdots \\
&< a_k\cdot\big(a_0 + \cdots + a_N\big) + \frac{\varepsilon}{2}.
\end{aligned}$$
As $a_k \to 0$ we can make the first term less than $\varepsilon/2$ for all $k$ large enough.
Thus, $\delta_k < \varepsilon$ for all $k$ sufficiently large. This proves that $\delta_k \to 0$ and hence
$\Delta = \sum_{k=1}^{\infty}(-1)^{k-1}\delta_k$ converges. Finally, we need to prove the equality
$$\sum_{n=0}^{\infty} a_n^2 = s^2 + 2\Delta = s^2 + 2\sum_{k=1}^{\infty}(-1)^{k-1}\delta_k.$$
To prove this, let $s_n$ denote the $n$-th partial sum of the series $s = \sum_{n=0}^{\infty}(-1)^n a_n$.
We have
$$s_n^2 = \Big(\sum_{k=0}^{n}(-1)^k a_k\Big)^2 = \sum_{k=0}^{n}\sum_{\ell=0}^{n}(-1)^{k+\ell}a_k a_\ell.$$
We can write the double sum on the right as a sum over $(k,\ell)$ such that $k = \ell$,
$k < \ell$, and $\ell < k$:
$$\sum_{k=0}^{n}\sum_{\ell=0}^{n}(-1)^{k+\ell}a_k a_\ell = \sum_{k=\ell}(-1)^{k+\ell}a_k a_\ell + \sum_{k<\ell}(-1)^{k+\ell}a_k a_\ell + \sum_{\ell<k}(-1)^{k+\ell}a_k a_\ell,$$
where the smallest $k$ and $\ell$ can be is 0 and the largest is $n$. The first sum is just
$\sum_{k=0}^{n} a_k^2$ and by symmetry in $k$ and $\ell$, the last two sums are actually the same, so
$$s_n^2 = \sum_{k=0}^{n} a_k^2 + 2\sum_{0\le k<\ell\le n}(-1)^{k+\ell}a_k a_\ell.$$
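In summary, we have
$$s_n^2 = \sum_{k=0}^{n} a_k^2 + 2\sum_{j=1}^{n}(-1)^j\Big(\sum_{k=0}^{n-j} a_k a_{k+j}\Big).$$
Let $d_n$ be the $n$-th partial sum of $2\Delta = 2\sum_{j=1}^{\infty}(-1)^{j-1}\delta_j$; we need to show that
$s_n^2 + d_n \to \sum_{k=0}^{\infty} a_k^2$ as $n \to \infty$. To this end, we add the expressions for $s_n^2$ and $d_n$:
$$\begin{aligned}
s_n^2 + d_n &= \sum_{k=0}^{n} a_k^2 + 2\sum_{j=1}^{n}(-1)^j\Big(\sum_{k=0}^{n-j} a_k a_{k+j}\Big) + 2\sum_{j=1}^{n}(-1)^{j-1}\delta_j \\
&= \sum_{k=0}^{n} a_k^2 + 2\sum_{j=1}^{n}(-1)^j\Big(\sum_{k=0}^{n-j} a_k a_{k+j} - \delta_j\Big).
\end{aligned}$$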
Recalling that $\delta_j = \sum_{k=0}^{\infty} a_k a_{k+j}$, we can write $s_n^2 + d_n$ as
$$s_n^2 + d_n = \sum_{k=0}^{n} a_k^2 - 2\sum_{j=1}^{n}(-1)^j\alpha_j,$$
where
$$\alpha_j := \sum_{k=n-j+1}^{\infty} a_k a_{k+j} = a_{n-j+1}a_{n+1} + a_{n-j+2}a_{n+2} + a_{n-j+3}a_{n+3} + \cdots.$$
Since the sequence $\{a_n\}$ is nonincreasing, it follows that the sequence $\{\alpha_j\}$ is nondecreasing:
$$\alpha_j = a_{n-j+1}a_{n+1} + a_{n-j+2}a_{n+2} + \cdots \le a_{n-j}a_{n+1} + a_{n-j+1}a_{n+2} + \cdots = \alpha_{j+1}.$$
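Now assuming $n$ is even, we have
$$\begin{aligned}
\bigg|\frac{1}{2}\Big(s_n^2 + d_n - \sum_{k=0}^{n} a_k^2\Big)\bigg| &= \big|(-\alpha_1 + \alpha_2) + (-\alpha_3 + \alpha_4) + \cdots + (-\alpha_{n-1} + \alpha_n)\big| \\
&= (-\alpha_1 + \alpha_2) + (-\alpha_3 + \alpha_4) + \cdots + (-\alpha_{n-1} + \alpha_n) \\
&= -\alpha_1 - (\alpha_3 - \alpha_2) - (\alpha_5 - \alpha_4) - \cdots - (\alpha_{n-1} - \alpha_{n-2}) + \alpha_n \\
&\le \alpha_n = a_1 a_{n+1} + a_2 a_{n+2} + \cdots = \delta_n - a_0 a_n,
\end{aligned}$$
where we used the fact that the terms in the parentheses are all nonnegative because
the $\alpha_j$'s are nondecreasing. Using a very similar argument, we get
$$\bigg|\frac{1}{2}\Big(s_n^2 + d_n - \sum_{k=0}^{n} a_k^2\Big)\bigg| \le \delta_n - a_0 a_n \tag{6.76}$$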
for n odd. Therefore, (6.76) holds for all n. We already know that δn → 0 and
an → 0, so (6.76) shows that the left-hand side tends to zero as n → ∞. This
completes the proof of the theorem.
Finally, we are ready to prove Euler’s formula for π 2 /6. To do so, we apply the
preceding theorem to the sequence an = 1/(2n + 1). In this case,
∞ ∞
X X 1
δk = an an+k = .
n=0 n=0
(2n + 1)(2n + 2k + 1)
Writing in partial fractions,
1 1 1 1
= − ,
(2n + 1)(2n + 2k + 1) 2k 2n + 1 2n + 2k + 1
we get (after some cancellations)
∞
1 X 1 1 1 1 1
δk = − = 1 + + ··· + .
2k n=0 2n + 1 2n + 2k + 1 2k 3 2k − 1
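Hence, the equality $\sum_{n=0}^{\infty} a_n^2 = s^2 + 2\Delta$ takes the form
\[ \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \Big( \frac{\pi}{4} \Big)^2 + \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \Big( 1 + \frac{1}{3} + \cdots + \frac{1}{2k-1} \Big) . \]
However, see (6.75), we already proved that the Cauchy product of Gregory-Leibniz-Madhava’s
series with itself is given by the sum on the right. Thus,
\[ \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} = \Big( \frac{\pi}{4} \Big)^2 + \Big( \frac{\pi}{4} \Big)^2 = \frac{\pi^2}{8} . \tag{6.77} \]
Finally, summing over the even and odd numbers, we have
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \sum_{n=0}^{\infty} \frac{1}{(2n+1)^2} + \sum_{n=1}^{\infty} \frac{1}{(2n)^2} = \frac{\pi^2}{8} + \frac{1}{4} \sum_{n=1}^{\infty} \frac{1}{n^2} , \]
and solving for $\sum_{n=1}^{\infty} 1/n^2$, we obtain Euler’s formula: $\frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2}$.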
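The two ingredients of this argument, the closed form for $\delta_k$ and the value $\pi^2/8$ in (6.77), can be sanity-checked numerically. The following is only an illustrative sketch: the function names and the truncation length are assumptions made here, not part of the text.

```python
from fractions import Fraction
import math

def delta_truncated(k, terms=200000):
    """Truncated series for delta_k = sum_{n>=0} 1/((2n+1)(2n+2k+1))."""
    return sum(1.0 / ((2 * n + 1) * (2 * n + 2 * k + 1)) for n in range(terms))

def delta_closed(k):
    """Closed form (1/(2k)) * (1 + 1/3 + ... + 1/(2k-1)) as an exact fraction."""
    return sum(Fraction(1, 2 * j - 1) for j in range(1, k + 1)) / (2 * k)

# Truncated sum of the reciprocals of the odd squares, to compare with pi^2/8.
odd_squares = sum(1.0 / (2 * n + 1) ** 2 for n in range(200000))

print(delta_truncated(3), float(delta_closed(3)))
print(odd_squares, math.pi ** 2 / 8)
```

Both truncated sums differ from their limits by roughly $1/(4 \cdot \text{terms})$, so agreement to five decimal places is what one should expect here.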
Exercises 6.11.
1. Find the following limits:
(a) $\displaystyle \lim_{n\to\infty} \frac{1 + 2^{1/2} + 3^{1/3} + \cdots + n^{1/n}}{n}$,
(b) $\displaystyle \lim_{n\to\infty} \frac{\big(1 + \frac{1}{1}\big)^1 + \big(1 + \frac{1}{2}\big)^2 + \big(1 + \frac{1}{3}\big)^3 + \cdots + \big(1 + \frac{1}{n}\big)^n}{n}$.
2. If a sequence $a_1, a_2, a_3, \ldots$ of positive numbers converges to $L > 0$, prove that the
sequence of geometric means $(a_1 a_2 \cdots a_n)^{1/n}$ also converges to $L$. Suggestion: Take
logs of the geometric means. Using this result, prove (6.74). Using (6.74), prove that
\[ e = \lim_{n\to\infty} \frac{n}{(n!)^{1/n}} . \]
3. Here is a generalization of Cauchy’s arithmetic mean theorem: If $a_1, a_2, a_3, \ldots$ converges
to $a$ and $b_1, b_2, b_3, \ldots$ converges to $b$, then the sequence
\[ \frac{1}{n} \big( a_1 b_n + a_2 b_{n-1} + \cdots + a_{n-1} b_2 + a_n b_1 \big) \]
converges to $ab$.
CHAPTER 7
Reason’s last step is the recognition that there are an infinite number of
things which are beyond it.
Blaise Pascal (1623–1662), Pensees. 1670.
We already met François Viète’s infinite product expression for π in Sections
4.10 and 5.1. This chapter is devoted entirely to the theory and application of
infinite products, and as a consolation prize we also talk about partial fractions.
In Sections 7.1 and 7.2 we present the basics of infinite products. Hold on to your
seats, because the rest of the chapter is full of surprises!
We begin with the following “Viète-type” formula for log 2, which is due to
Philipp Ludwig von Seidel (1821–1896):
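\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots . \]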
Combining the adjacent factors, $-\frac{1}{z-n} - \frac{1}{z+n} = \frac{2z}{n^2 - z^2}$, we get Euler’s celebrated
partial fraction expansion for sine:
\[ \frac{\pi}{\sin \pi z} = \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{n^2 - z^2} . \tag{7.1} \]
We’ll also derive partial fraction expansions for the other trig functions. In Section
7.5, we give more proofs of Euler’s sum for π 2 /6 using the infinite products and
partial fractions we found in Sections 7.3 and 7.4. In Section 7.6, we prove one of
the most famous formulas for the Riemann zeta function, namely writing it as an
infinite product involving only the prime numbers:
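\[ \zeta(z) = \frac{2^z}{2^z - 1} \cdot \frac{3^z}{3^z - 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z - 1} \cdot \frac{11^z}{11^z - 1} \cdots . \]
In particular, setting $z = 2$, we get the following expression for $\pi^2/6$:
\[ \frac{\pi^2}{6} = \prod_{p} \frac{p^2}{p^2 - 1} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdots . \]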
As a bonus prize, we see how π is related to questions from probability. Finally, in
Section 5.3, we derive some awe-inspiring beautiful formulas (too many to list at
this moment!). Here are a couple of my favorite formulas of all time:
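\[ \frac{\pi}{4} = \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{7}{8} \cdot \frac{11}{12} \cdot \frac{13}{12} \cdot \frac{17}{16} \cdot \frac{19}{20} \cdot \frac{23}{24} \cdots . \]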
The numerators of the fractions on the right are the odd prime numbers and the
denominators are even numbers divisible by four and differing from the numerators
by one. The next one is also a beaut:
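\[ \frac{\pi}{2} = \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdot \frac{19}{18} \cdot \frac{23}{22} \cdots . \]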
The numerators of the fractions are the odd prime numbers and the denominators
are even numbers not divisible by four and differing from the numerators by one.
Chapter 7 objectives: The student will be able to . . .
• determine the (absolute) convergence for an infinite product.
• explain the infinite products and partial fraction expansions of the trig functions.
• describe Euler’s formulæ for powers of π and their relationship to Riemann’s
zeta function.
This definition is of course independent of the $m$ chosen such that the $b_n$’s are
nonzero for all $n \ge m$. The infinite product $\prod_{n=1}^{\infty} b_n$ diverges if it doesn’t converge;
that is, either there are infinitely many zero $b_n$’s or the limit (7.2) diverges or the
limit (7.2) converges to zero. In this latter case, we say that the infinite product
diverges to zero. Just as sequences and series can start at any integer, products
can also start at any integer: $\prod_{n=k}^{\infty} b_n$, with straightforward modifications of the
definition.
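Example 7.1. The “harmonic product” $\prod_{n=2}^{\infty} (1 - 1/n)$ diverges to zero because
\[ \prod_{k=2}^{n} \Big( 1 - \frac{1}{k} \Big) = \Big( 1 - \frac{1}{2} \Big) \cdots \Big( 1 - \frac{1}{n} \Big) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} \cdots \frac{n-1}{n} = \frac{1}{n} \to 0 . \]
Example 7.2. On the other hand, the product $\prod_{n=2}^{\infty} (1 - 1/n^2)$ converges
because
\[
\begin{aligned}
\prod_{k=2}^{n} \Big( 1 - \frac{1}{k^2} \Big) &= \prod_{k=2}^{n} \frac{k^2 - 1}{k^2} = \prod_{k=2}^{n} \frac{(k-1)(k+1)}{k \cdot k} \\
&= \frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{2 \cdot 4}{3 \cdot 3} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \frac{4 \cdot 6}{5 \cdot 5} \cdots \frac{(n-1)(n+1)}{n \cdot n} = \frac{n+1}{2n} \to \frac{1}{2} \ne 0 .
\end{aligned}
\]
Therefore,
\[ \prod_{n=2}^{\infty} \Big( 1 - \frac{1}{n^2} \Big) = \frac{1}{2} . \]
Note that the infinite product $\prod_{n=1}^{\infty} (1 - 1/n^2)$ also converges, but in this case,
\[ \prod_{n=1}^{\infty} \Big( 1 - \frac{1}{n^2} \Big) := \Big( 1 - \frac{1}{1^2} \Big) \cdot \lim_{n\to\infty} \prod_{k=2}^{n} \Big( 1 - \frac{1}{k^2} \Big) = 0 \cdot \frac{1}{2} = 0 . \]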
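The telescoping in Examples 7.1 and 7.2 can be confirmed with exact rational arithmetic. This is a small sketch; the helper name `partial_product` is an assumption made for illustration.

```python
from fractions import Fraction

def partial_product(factor, n):
    """Exact partial product factor(2) * factor(3) * ... * factor(n)."""
    p = Fraction(1)
    for k in range(2, n + 1):
        p *= factor(k)
    return p

# Example 7.1: the partial product equals 1/n, which tends to zero.
harmonic = partial_product(lambda k: 1 - Fraction(1, k), 50)
# Example 7.2: the partial product equals (n+1)/(2n), which tends to 1/2.
squares = partial_product(lambda k: 1 - Fraction(1, k * k), 50)

print(harmonic, squares)
```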
Proposition 7.1. If an infinite product converges, then its factors tend to one.
Also, a convergent infinite product has the value 0 if and only if it has a zero factor.
Proof. The second statement is automatic from the definition of convergence.
If none of the $b_n$’s vanish for $n \ge m$ and $p_n = b_m \cdot b_{m+1} \cdots b_n$, then $p_n \to p$, a
nonzero number, so
\[ b_n = \frac{b_m \cdot b_{m+1} \cdots b_{n-1} \cdot b_n}{b_m \cdot b_{m+1} \cdots b_{n-1}} = \frac{p_n}{p_{n-1}} \to \frac{p}{p} = 1 . \]
352 7. MORE ON THE INFINITE: PRODUCTS AND PARTIAL FRACTIONS
7.1.2. Infinite products and series: the nonnegative case. The following
theorem states that the analysis of an infinite product $\prod (1 + a_n)$ with all the
$a_n$’s real and nonnegative is completely determined by the infinite series $\sum a_n$.
Theorem 7.2. An infinite product $\prod (1 + a_n)$ with nonnegative terms $a_n$ converges
if and only if the series $\sum a_n$ converges.
Proof. Let the partial products and partial sums be denoted by
\[ p_n = \prod_{k=1}^{n} (1 + a_k) \quad \text{and} \quad s_n = \sum_{k=1}^{n} a_k . \]
Since all the $a_k$’s are nonnegative, both sequences $\{p_n\}$ and $\{s_n\}$ are nondecreasing,
so converge if and only if they are bounded. Since $1 \le 1 + x \le e^x$ for any nonnegative real
number $x$ (see Theorem 4.29), it follows that
\[ 1 \le p_n = \prod_{k=1}^{n} (1 + a_k) \le \prod_{k=1}^{n} e^{a_k} = e^{\sum_{k=1}^{n} a_k} = e^{s_n} . \]
This equation shows that if the sequence {sn } is bounded, then the sequence {pn }
is also bounded, and hence converges. Its limit must be ≥ 1, so in particular, is not
zero. On the other hand,
pn = (1 + a1 )(1 + a2 ) · · · (1 + an ) ≥ 1 + a1 + a2 + · · · + an = 1 + sn ,
since the left-hand side, when multiplied out, contains the sum 1 + a1 + a2 + · · · + an
(and a lot of other nonnegative terms too). This shows that if the sequence {pn }
is bounded, then the sequence {sn } is also bounded.
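The two inequalities used in this proof, $1 + s_n \le p_n \le e^{s_n}$, can be watched numerically. The sketch below uses $a_k = 1/k^2$ as a sample sequence; both the choice of sequence and the function name `sandwich` are assumptions made for illustration.

```python
import math

def sandwich(a, n):
    """Return (1 + s_n, p_n, exp(s_n)) for a nonnegative sequence a(k), k = 1..n."""
    s, p = 0.0, 1.0
    for k in range(1, n + 1):
        s += a(k)
        p *= 1.0 + a(k)
    return 1.0 + s, p, math.exp(s)

lower, p, upper = sandwich(lambda k: 1.0 / k ** 2, 1000)
print(lower, p, upper)  # the middle value is trapped between the outer two
```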
See Problem 4 for the case when the terms an are negative.
Example 7.3. Thus, as a consequence of this theorem, the product
\[ \prod \Big( 1 + \frac{1}{n^p} \Big) \]
converges for $p > 1$ and diverges for $p \le 1$.
7.1.3. Infinite products for log 2 and e. I found the following gem in [205].
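Define a sequence $\{e_n\}$ by $e_1 = 1$ and $e_{n+1} = (n+1)(e_n + 1)$ for $n = 1, 2, 3, \ldots$; e.g.
\[ e_1 = 1 , \quad e_2 = 4 , \quad e_3 = 15 , \quad e_4 = 64 , \quad e_5 = 325 , \quad e_6 = 1956 , \ \ldots . \]
Then
\[ e = \prod_{n=1}^{\infty} \frac{e_n + 1}{e_n} = \frac{2}{1} \cdot \frac{5}{4} \cdot \frac{16}{15} \cdot \frac{65}{64} \cdot \frac{326}{325} \cdot \frac{1957}{1956} \cdots . \tag{7.3} \]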
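Formula (7.3) is easy to test by machine. Since $(e_n + 1)/e_n = e_{n+1}/((n+1) e_n)$, the partial products telescope to $e_{N+1}/(N+1)!$, which one can check equals $\sum_{k=0}^{N} 1/k!$. The sketch below (function name assumed for illustration) verifies this exactly with rational arithmetic.

```python
from fractions import Fraction
import math

def e_sequence(count):
    """First `count` terms of e_1 = 1, e_{n+1} = (n+1)(e_n + 1)."""
    es = [1]
    while len(es) < count:
        n = len(es)                      # es[-1] is e_n
        es.append((n + 1) * (es[-1] + 1))
    return es

es = e_sequence(15)
product = Fraction(1)
for e_n in es:
    product *= Fraction(e_n + 1, e_n)    # partial product of (7.3)

print(es[:6], float(product), math.e)
```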
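We now prove Philipp Ludwig von Seidel’s (1821–1896) formula for log 2:
\[ \log 2 = \frac{2}{1 + \sqrt{2}} \cdot \frac{2}{1 + \sqrt{\sqrt{2}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{2}}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\sqrt{2}}}}} \cdots . \]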
To prove this, we follow the proof of Viète’s formula in Section 5.1.1 using hyperbolic
functions instead of trigonometric functions. Let $x \in \mathbb{R}$ be nonzero. Then dividing
the identity $\sinh x = 2 \cosh(x/2) \sinh(x/2)$ (see Problem 8 in Exercises 4.7) by $x$,
we get
\[ \frac{\sinh x}{x} = \cosh(x/2) \cdot \frac{\sinh(x/2)}{x/2} . \]
Replacing $x$ with $x/2$, we get $\sinh(x/2)/(x/2) = \cosh(x/2^2) \cdot \sinh(x/2^2)/(x/2^2)$,
therefore
\[ \frac{\sinh x}{x} = \cosh(x/2) \cdot \cosh(x/2^2) \cdot \frac{\sinh(x/2^2)}{x/2^2} . \]
Continuing by induction, we obtain for any $n \in \mathbb{N}$,
\[ \frac{\sinh x}{x} = \prod_{k=1}^{n} \cosh(x/2^k) \cdot \frac{\sinh(x/2^n)}{x/2^n} . \]
Since $\lim_{z\to 0} \frac{\sinh z}{z} = 1$ (why?), we have $\lim_{n\to\infty} \frac{\sinh(x/2^n)}{x/2^n} = 1$, so taking $n \to \infty$,
it follows that
\[ \frac{x}{\sinh x} = \lim_{n\to\infty} \prod_{k=1}^{n} \frac{1}{\cosh(x/2^k)} . \tag{7.4} \]
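Now let us put $x = \log \theta$, that is, $\theta = e^x$, into the equation (7.4). To this end,
observe that
\[ \sinh x = \frac{e^x - e^{-x}}{2} = \frac{\theta - \theta^{-1}}{2} = \frac{\theta^2 - 1}{2\theta} \quad \Longrightarrow \quad \frac{x}{\sinh x} = \frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} \]
and
\[ \cosh(x/2^k) = \frac{e^{x/2^k} + e^{-x/2^k}}{2} = \frac{\theta^{1/2^k} + \theta^{-1/2^k}}{2} = \frac{\theta^{1/2^{k-1}} + 1}{2\,\theta^{1/2^k}} \quad \Longrightarrow \quad \frac{1}{\cosh(x/2^k)} = \frac{2\,\theta^{1/2^k}}{\theta^{1/2^{k-1}} + 1} . \]
Thus,
\[
\begin{aligned}
\frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} &= \lim_{n\to\infty} \prod_{k=1}^{n} \frac{2\,\theta^{1/2^k}}{\theta^{1/2^{k-1}} + 1} = \lim_{n\to\infty} \left( \prod_{k=1}^{n} \theta^{1/2^k} \cdot \prod_{k=1}^{n} \frac{2}{\theta^{1/2^{k-1}} + 1} \right) \\
&= \lim_{n\to\infty} \left( \theta^{\sum_{k=1}^{n} 1/2^k} \cdot \prod_{k=0}^{n-1} \frac{2}{\theta^{1/2^k} + 1} \right) .
\end{aligned}
\]
Since $\lim_{n\to\infty} \sum_{k=1}^{n} \frac{1}{2^k} = 1$ (this is just the geometric series $\sum_{k=1}^{\infty} \frac{1}{2^k}$), we see that
\[ \frac{2\theta \log \theta}{(\theta - 1)(\theta + 1)} = \theta \cdot \lim_{n\to\infty} \prod_{k=0}^{n-1} \frac{2}{\theta^{1/2^k} + 1} = \theta \cdot \frac{2}{\theta + 1} \cdot \lim_{n\to\infty} \prod_{k=1}^{n-1} \frac{2}{\theta^{1/2^k} + 1} . \]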
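Cancelling the common factor $2\theta/(\theta + 1)$ from both sides, we obtain
\[ \frac{\log \theta}{\theta - 1} = \prod_{k=1}^{\infty} \frac{2}{1 + \theta^{1/2^k}} = \frac{2}{1 + \sqrt{\theta}} \cdot \frac{2}{1 + \sqrt{\sqrt{\theta}}} \cdot \frac{2}{1 + \sqrt{\sqrt{\sqrt{\theta}}}} \cdots \qquad \text{Seidel’s formula.} \]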
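Seidel’s formula converges quickly, because the $k$-th factor differs from 1 by roughly $(\log\theta)/2^{k+1}$. A short numerical sketch follows; the function name `seidel_product` and the choice of 40 factors are assumptions made for illustration.

```python
import math

def seidel_product(theta, n):
    """Product of 2 / (1 + theta^(1/2^k)) for k = 1, ..., n."""
    p = 1.0
    root = theta
    for _ in range(n):
        root = math.sqrt(root)       # root is now theta^(1/2^k)
        p *= 2.0 / (1.0 + root)
    return p

# With theta = 2 the limit is log(2)/(2 - 1) = log 2.
print(seidel_product(2.0, 40), math.log(2))
# With theta = 5 the limit is log(5)/4.
print(seidel_product(5.0, 40), math.log(5) / 4)
```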
where for (b) and (c), state for which $x \in \mathbb{R}$, the products converge and diverge.
4. In this problem, we prove that an infinite product $\prod (1 - a_n)$ with $0 \le a_n < 1$ converges
if and only if the series $\sum a_n$ converges.
(i) Let $p_n = \prod_{k=1}^{n} (1 - a_k)$ and $s_n = \sum_{k=1}^{n} a_k$. Show that $p_n \le e^{-s_n}$. Conclude that
if $\sum a_n$ diverges, then $\prod (1 - a_n)$ also diverges (in this case, diverges to zero).
(ii) Suppose now that $\sum a_n$ converges. Then we can choose $m$ such that $a_m + a_{m+1} + \cdots < 1/2$.
Prove by induction that
\[ (1 - a_m)(1 - a_{m+1}) \cdots (1 - a_n) \ge 1 - (a_m + a_{m+1} + \cdots + a_n) \]
for $n = m, m+1, m+2, \ldots$. Conclude that $p_n/p_m \ge 1/2$ for all $n \ge m$, and from
this, prove that $\prod (1 - a_n)$ converges.
(iii) For what $p$ is $\prod_{n=2}^{\infty} \big( 1 - \frac{1}{n^p} \big)$ convergent and divergent?
5. In this problem we derive relationships between series and products. Let $\{a_n\}$ be a
sequence of complex numbers with $a_n \ne 0$ for all $n$.
(a) Prove that for $n \ge 2$,
\[ \prod_{k=1}^{n} (1 + a_k) = 1 + a_1 + \sum_{k=2}^{n} (1 + a_1) \cdots (1 + a_{k-1}) a_k . \]
Thus, $\prod_{n=1}^{\infty} (1 + a_n)$ converges if and only if $1 + a_1 + \sum_{k=2}^{\infty} (1 + a_1) \cdots (1 + a_{k-1}) a_k$
converges to a nonzero value, in which case they have the same value.
(b) Assume that $a_1 + \cdots + a_k \ne 0$ for every $k$. Prove that for $n \ge 2$,
\[ \sum_{k=1}^{n} a_k = a_1 \prod_{k=2}^{n} \Big( 1 + \frac{a_k}{a_1 + a_2 + \cdots + a_{k-1}} \Big) . \]
7.2. ABSOLUTE CONVERGENCE FOR INFINITE PRODUCTS 355
Thus, $\sum_{n=1}^{\infty} a_n$ converges if and only if $a_1 \prod_{n=2}^{\infty} \big( 1 + \frac{a_n}{a_1 + a_2 + \cdots + a_{n-1}} \big)$ either converges
or diverges to zero, in which case they have the same value.
(c) Using (b) and the sum $\sum_{n=1}^{\infty} \frac{1}{(n+a-1)(n+a)} = \frac{1}{a}$ from (3.38), prove that
\[ \prod_{n=2}^{\infty} \Big( 1 + \frac{a}{(n+a)(n-1)} \Big) = a + 1 . \]
since the sum on the right telescopes. It follows that the limit $\lim p_k$ exists if and
only if the limit $\lim_{k\to\infty} \sum_{j=m+1}^{k} (p_j - p_{j-1})$ exists; in other words, if and only if
the infinite series $\sum_{j=m+1}^{\infty} (p_j - p_{j-1})$ converges. In case of convergence, the limit
equality in (a) follows from taking $k \to \infty$ in (7.5).
\[
\begin{aligned}
p_k - p_{k-1} &= \prod_{j=m}^{k} (1 + a_j) - \prod_{j=m}^{k-1} (1 + a_j) \\
&= (1 + a_k) \prod_{j=m}^{k-1} (1 + a_j) - \prod_{j=m}^{k-1} (1 + a_j) \\
&= a_k \prod_{j=m}^{k-1} (1 + a_j) .
\end{aligned}
\]
Therefore, $|p_k - p_{k-1}| \le |a_k| \prod_{j=m}^{k-1} (1 + |a_j|)$. Since $1 + x \le e^x$ for all real numbers
$x$, we have
\[ |p_k - p_{k-1}| \le |a_k| \prod_{j=m}^{k-1} e^{|a_j|} = |a_k| \, e^{\sum_{j=m}^{k-1} |a_j|} , \]
just as we wanted.
which in particular proves the base case. Assume that our result holds for $n \ge m$;
we prove it for $n + 1$. Observe that
\[
\begin{aligned}
\prod_{k=m}^{n+1} |1 + a_k| &= \prod_{k=m}^{n} |1 + a_k| \cdot |1 + a_{n+1}| \\
&\ge \Big( 1 - \sum_{k=m}^{n} |a_k| \Big) (1 - |a_{n+1}|) \qquad \text{(induction hypothesis and (7.7))} \\
&= 1 - \sum_{k=m}^{n} |a_k| - |a_{n+1}| + \sum_{k=m}^{n} |a_k| \, |a_{n+1}| \\
&\ge 1 - \sum_{k=m}^{n} |a_k| - |a_{n+1}| ,
\end{aligned}
\]
Just as for infinite series, the converse of this theorem is not true. For example,
the infinite product $\prod_{n=2}^{\infty} \big( 1 + \frac{(-1)^n}{n} \big)$ converges (and equals 1; see Problem 1 in
Exercises 7.1), but this product is not absolutely convergent.
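The behaviour of this example is visible already in the exact partial products: consecutive factors pair off as $(1 + \frac{1}{2k})(1 - \frac{1}{2k+1}) = 1$, while $\sum |(-1)^n/n|$ is the divergent harmonic series. A minimal sketch, with the function name `partial` assumed for illustration:

```python
from fractions import Fraction

def partial(n):
    """Exact partial product of (1 + (-1)^k / k) for k = 2, ..., n."""
    p = Fraction(1)
    for k in range(2, n + 1):
        p *= 1 + Fraction((-1) ** k, k)
    return p

# Ending on an odd index gives exactly 1; ending on an even index N gives (N+1)/N.
print(partial(101), partial(100))
```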
7.2.2. Infinite products and series: the general case. For nonnegative
real numbers $\{a_n\}$, in Theorem 7.2, we showed that the product $\prod (1 + a_n)$ converges
if and only if the series $\sum a_n$ converges. In the general case of a complex sequence
$\{a_n\}$, in Theorem 7.4 we showed that the infinite product $\prod (1 + a_n)$ still converges
if the series $\sum |a_n|$ converges. In the general complex case, is there an “if and
only if” theorem relating convergence of an infinite product to the convergence of
a corresponding infinite series? We now give one such theorem where the series is
a series of logarithms. Moreover, we also get a formula for the product in terms of
the sum of the infinite series.
Theorem 7.5. An infinite product $\prod (1 + a_n)$ converges if and only if $a_n \to 0$
and the series
\[ \sum_{n=m+1}^{\infty} \operatorname{Log}(1 + a_n) , \]
starting from a suitable index $m + 1$, converges. Moreover, if $L$ is the sum of the
series, then
\[ \prod (1 + a_n) = (1 + a_1) \cdots (1 + a_m) \, e^{L} . \]
Proof. First of all, we remark that the statement “starting from a suitable
index $m + 1$” concerning the sum of logarithms is needed because we need to
make sure the sum starts sufficiently high so that none of the terms $1 + a_n$ is zero
(otherwise $\operatorname{Log}(1 + a_n)$ is undefined). By Proposition 7.1, in order for the product
$\prod (1 + a_n)$ to converge, we at least need $a_n \to 0$. Thus, we may assume that $a_n \to 0$;
in particular we can fix $m$ such that $n > m$ implies $|a_n| < 1$.
Let $b_n = 1 + a_n$. We shall prove that the infinite product $\prod b_n$ converges if and
only if the series
\[ \sum_{n=m+1}^{\infty} \operatorname{Log} b_n , \]
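(iv) Does the product $\prod_{n=2}^{\infty} \big( 1 + \frac{(-1)^n}{n} \big)$ converge? What about the product
\[ \Big( 1 + \frac{1}{2} \Big) \Big( 1 + \frac{1}{3} \Big) \Big( 1 - \frac{1}{4} \Big) \Big( 1 + \frac{1}{5} \Big) \Big( 1 + \frac{1}{6} \Big) \Big( 1 - \frac{1}{7} \Big) \Big( 1 + \frac{1}{8} \Big) \Big( 1 + \frac{1}{9} \Big) \cdots ? \]
3. Let $\{a_n\}$ be a sequence of real numbers and assume that $\sum a_n$ converges but $\sum a_n^2$
diverges. In this problem we shall prove that $\prod (1 + a_n)$ diverges.
(i) Prove that there is a constant $C > 0$ such that for all $x \in \mathbb{R}$ with $|x| \le 1/2$, we
have
\[ x - \log(1 + x) \ge C x^2 . \]
(ii) Since $\sum a_n$ converges, we know that $a_n \to 0$, so we may assume that $|a_n| \le 1/2$
for all $n$. Using (i), prove that $\sum \log(1 + a_n)$ diverges and hence, $\prod (1 + a_n)$
diverges.
(iii) Does $\prod \big( 1 + \frac{(-1)^{n-1}}{\sqrt{n}} \big)$ converge or diverge?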
4. Using the formulas from Problem 5 in Exercises 6.9, prove that for $|z| < 1$,
\[ \prod_{n=1}^{\infty} (1 - z^n) = \exp\left( - \sum_{n=1}^{\infty} \frac{1}{n} \frac{z^n}{1 - z^n} \right) , \qquad \prod_{n=1}^{\infty} (1 + z^n) = \exp\left( \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} \frac{z^n}{1 - z^n} \right) . \]
5. In this problem we prove that $\prod (1 + a_n)$ is absolutely convergent if and only if the
series
\[ \sum_{n=m+1}^{\infty} \operatorname{Log}(1 + a_n) , \]
starting from a suitable index $m + 1$, is absolutely convergent. Proceed as follows.
(i) Prove that for any complex number $z$ with $|z| \le 1/2$, we have
\[ \frac{1}{2} |z| \le | \operatorname{Log}(1 + z) | \le \frac{3}{2} |z| . \tag{7.10} \]
Suggestion: Look at the power series expansion for $\frac{\operatorname{Log}(1+z)}{z} - 1$ and using this
power series, prove that for $|z| \le 1/2$, we have $\big| \frac{\operatorname{Log}(1+z)}{z} - 1 \big| \le \frac{1}{2}$. Use this
inequality to prove (7.10).
(ii) Now use (7.10) to prove the desired result.
We give two proofs of this astounding result. We also prove Wallis’ infinite
product expansion for π. To begin, we first need
7.3.1. Tannery’s theorem for products.
Theorem 7.7 (Tannery’s theorem for infinite products). For each natural
number $n$, let $\prod_{k=1}^{m_n} (1 + a_k(n))$ be a finite product where $m_n \to \infty$ as $n \to \infty$.
If for each $k$, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^{\infty} M_k$ of nonnegative
real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, then
\[ \lim_{n\to\infty} \prod_{k=1}^{m_n} (1 + a_k(n)) = \prod_{k=1}^{\infty} \lim_{n\to\infty} (1 + a_k(n)) ; \]
that is, both sides are well-defined (the limits and products converge) and are equal.
Proof. First of all, we remark that the infinite product on the right converges.
Indeed, if we put $a_k := \lim_{n\to\infty} a_k(n)$, which exists by assumption, then taking
$n \to \infty$ in the inequality $|a_k(n)| \le M_k$, we have $|a_k| \le M_k$ as well. Therefore,
by the comparison test, the series $\sum_{k=1}^{\infty} a_k$ converges absolutely and hence, by
Theorem 7.4, the infinite product $\prod_{k=1}^{\infty} (1 + a_k)$ converges.
Now to our proof. Since $\sum M_k$ converges, $M_k \to 0$, so we can choose $m \in \mathbb{N}$
such that for all $k \ge m$, we have $M_k < 1$. This implies that $|a_k| < 1$ for $k \ge m$, so
$1 + a_k$ is nonzero for $k \ge m$. Put
\[ p(n) = \prod_{k=1}^{m_n} (1 + a_k(n)) \]
and, for $k \ge m$, write $p_k(n) := \prod_{j=m}^{k} (1 + a_j(n))$ and $p_k := \prod_{j=m}^{k} (1 + a_j)$ for the partial
products starting at the index $m$.
Since these are finite products and $a_j = \lim_{n\to\infty} a_j(n)$, by the algebra of limits we
have $\lim_{n\to\infty} p_k(n) = p_k$. Now observe that
\[ \prod_{j=m}^{m_n} (1 + a_j(n)) = p_{m_n}(n) = p_m(n) + \sum_{k=m+1}^{m_n} (p_k(n) - p_{k-1}(n)) , \]
since the right-hand side telescopes to $p_{m_n}(n)$, and by the limit identity in (a) of
Lemma 7.3, we know that
\[ \prod_{j=m}^{\infty} (1 + a_j) = p_m + \sum_{k=m+1}^{\infty} (p_k - p_{k-1}) , \]
since $\prod_{j=m}^{\infty} (1 + a_j) := \lim_{k\to\infty} p_k$. Also, by Part (b) of Lemma 7.3, we have
\[ |p_k(n) - p_{k-1}(n)| \le |a_k(n)| \, e^{\sum_{j=m}^{k-1} |a_j(n)|} \le M_k \, e^{\sum_{j=m}^{k-1} M_j} \le C M_k , \]
where $C = e^{\sum_{j=m}^{\infty} M_j}$. Since $\sum_{k=m+1}^{\infty} C M_k$ converges, by Tannery’s theorem for
series we have
\[ \lim_{n\to\infty} \sum_{k=m+1}^{m_n} (p_k(n) - p_{k-1}(n)) = \sum_{k=m+1}^{\infty} \lim_{n\to\infty} (p_k(n) - p_{k-1}(n)) = \sum_{k=m+1}^{\infty} (p_k - p_{k-1}) . \]
7.3. EULER, TANNERY, AND WALLIS: PRODUCT EXPANSIONS GALORE 361
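Therefore,
\[
\begin{aligned}
\lim_{n\to\infty} \prod_{j=m}^{m_n} (1 + a_j(n)) &= \lim_{n\to\infty} \left( p_m(n) + \sum_{k=m+1}^{m_n} (p_k(n) - p_{k-1}(n)) \right) \\
&= p_m + \lim_{n\to\infty} \sum_{k=m+1}^{m_n} (p_k(n) - p_{k-1}(n)) \\
&= p_m + \sum_{k=m+1}^{\infty} (p_k - p_{k-1}) = \prod_{j=m}^{\infty} (1 + a_j) ,
\end{aligned}
\]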
7.3.2. Expansion of sine III. (Cf. [41, p. 294]). Our third proof of Euler’s
infinite product for sine is a Tannery’s theorem version of the proof found in Section
5.1 of Chapter 5. To this end, first recall from Lemma 5.1 of that section and the
work done in that section, that for any z ∈ C, we have
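\[ \sin z = \lim_{n\to\infty} F_n(z) , \]
Thus,
\[ \sin z = \lim_{m\to\infty} \left\{ z \prod_{k=1}^{m} \Big( 1 - \frac{z^2}{n^2 \tan^2(k\pi/n)} \Big) \right\} = \lim_{m\to\infty} z \prod_{k=1}^{m} (1 + a_k(m)) , \]
where $a_k(m) := - \frac{z^2}{n^2 \tan^2(k\pi/n)}$ with $n = 2m + 1$. Second, since $\lim_{z\to 0} \frac{\tan z}{z} =
\lim_{z\to 0} \frac{\sin z}{z} \cdot \frac{1}{\cos z} = 1$, we see that
\[ \lim_{m\to\infty} a_k(m) = \lim_{m\to\infty} - \frac{z^2}{(2m+1)^2 \tan^2(k\pi/(2m+1))} = \lim_{m\to\infty} - \frac{z^2}{k^2 \pi^2 \Big( \frac{\tan(k\pi/(2m+1))}{k\pi/(2m+1)} \Big)^2} = - \frac{z^2}{k^2 \pi^2} . \]
Thus, for all $k, m$, $|a_k(m)| \le M_k$. Finally, since the sum $\sum_{k=1}^{\infty} M_k$ converges, by
Tannery’s theorem for infinite products, we have
\[ \sin z = \lim_{m\to\infty} z \prod_{k=1}^{m} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \lim_{m\to\infty} (1 + a_k(m)) = z \prod_{k=1}^{\infty} \Big( 1 - \frac{z^2}{k^2 \pi^2} \Big) . \]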
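After replacing $z$ by $\pi z$, we get Euler’s infinite product expansion for $\sin \pi z$. This
completes Proof III of Theorem 7.6. In particular, we see that
\[ \pi i \prod_{k=1}^{\infty} \Big( 1 + \frac{1}{k^2} \Big) = \pi i \prod_{k=1}^{\infty} \Big( 1 - \frac{i^2}{k^2} \Big) = \sin \pi i = \frac{e^{-\pi} - e^{\pi}}{2i} . \]
Thus, we have derived the very pretty formula
\[ \frac{e^{\pi} - e^{-\pi}}{2\pi} = \prod_{n=1}^{\infty} \Big( 1 + \frac{1}{n^2} \Big) . \]
Recall from Section 7.1 how easy it was to find that $\prod_{n=2}^{\infty} \big( 1 - \frac{1}{n^2} \big) = 1/2$, but
replacing $-1/n^2$ with $+1/n^2$ is a whole different story!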
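The formula for $\prod (1 + 1/n^2)$ can be checked numerically, although the convergence is slow: the omitted tail after $N$ factors is about $1 + 1/N$. The following sketch uses a function name and truncation length chosen here for illustration.

```python
import math

def product_one_plus_inverse_squares(n_terms):
    """Partial product of (1 + 1/n^2) for n = 1, ..., n_terms."""
    p = 1.0
    for n in range(1, n_terms + 1):
        p *= 1.0 + 1.0 / (n * n)
    return p

target = (math.exp(math.pi) - math.exp(-math.pi)) / (2 * math.pi)
approx = product_one_plus_inverse_squares(100000)
print(approx, target)  # the partial products increase toward the target
```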
7.3.3. Expansion of sine IV. Our fourth proof of Euler’s infinite product
for sine is based on the following neat identity involving sines instead of tangents!
Lemma 7.8. If $n = 2m + 1$ with $m \in \mathbb{N}$, then for any $z \in \mathbb{C}$,
\[ \sin nz = n \sin z \prod_{k=1}^{m} \Big( 1 - \frac{\sin^2 z}{\sin^2(k\pi/n)} \Big) . \]
Proof. Lemma 2.26 shows that for each $k \in \mathbb{N}$, $2 \cos kz$ is a polynomial in
$2 \cos z$ of degree $k$ (with integer coefficients, although this fact is not important for
this lemma). Technically speaking, Lemma 2.26 was proved under the assumption
that $z$ is real, but the proof only used the angle addition formula for cosine, which
holds for complex variables as well. In any case, since $2 \cos kz$ is a polynomial in
$2 \cos z$ of degree $k$, it follows that $\cos kz$ is a polynomial in $\cos z$ of degree $k$, say
$\cos kz = Q_k(\cos z)$ where $Q_k$ is a polynomial of degree $k$. In particular,
\[ \cos 2kz = Q_k(\cos 2z) = Q_k(1 - 2 \sin^2 z) , \]
so $\cos 2kz$ is a polynomial of degree $k$ in $\sin^2 z$. Now using the addition formulas
for sine, we get, for each $k \in \mathbb{N}$,
\[ \sin(2k+1)z - \sin(2k-1)z = 2 \sin z \cdot \cos(2kz) = 2 \sin z \cdot Q_k(1 - 2 \sin^2 z) . \tag{7.12} \]
We claim that for any $m = 0, 1, 2, \ldots$, we have
\[ \sin(2m+1)z = \sin z \cdot P_m(\sin^2 z) , \tag{7.13} \]
where $P_m$ is a polynomial of degree $m$. For example, if $m = 0$, then $\sin z =
\sin z \cdot P_0(\sin^2 z)$ where $P_0(w) = 1$ is the constant polynomial 1. If $m = 1$, then by
(7.12) with $k = 1$, we have
\[ \sin(3z) = \sin z + 2 \sin z \cdot Q_1(1 - 2 \sin^2 z) = \sin z \big( 1 + 2 \, Q_1(1 - 2 \sin^2 z) \big) = \sin z \cdot P_1(\sin^2 z) , \]
where $P_1(w) = 1 + 2 Q_1(1 - 2w)$. To prove (7.13) for general $m$ just requires an
induction argument based on (7.12), which we leave to the interested reader. Now,
observe that $\sin(2m+1)z$ is zero when $z = z_k$ with $z_k = k\pi/(2m+1)$ where
$k = 1, 2, \ldots, m$. Also observe that since $0 < z_1 < z_2 < \cdots < z_m < \pi/2$, the $m$
values $\sin z_k$ are distinct positive values. Hence, according to (7.13), $P_m(w) = 0$
at the $m$ distinct values $w = \sin^2 z_k$, $k = 1, 2, \ldots, m$. Thus, as a consequence of
the fundamental theorem of algebra, the polynomial $P_m(w)$ can be factored into a
constant times
\[ (w - \sin^2 z_1)(w - \sin^2 z_2) \cdots (w - \sin^2 z_m) = \prod_{k=1}^{m} \Big( w - \sin^2 \frac{k\pi}{2m+1} \Big) = \prod_{k=1}^{m} \Big( w - \sin^2 \frac{k\pi}{n} \Big) . \]
Dividing each factor by $-\sin^2(k\pi/n)$, we can therefore write
$P_m(w) = a \prod_{k=1}^{m} \big( 1 - w/\sin^2(k\pi/n) \big)$
for some constant $a$. Since $\sin(2m+1)z / \sin z$ has limit equal to $2m + 1$ as $z \to 0$,
it follows that $a = 2m + 1$. This completes the proof of the lemma.
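Lemma 7.8 is an exact identity for every odd $n$, so it can be tested directly at a sample complex point. The sketch below is only a numerical spot check; the function name and the sample values are assumptions made for illustration.

```python
import cmath
import math

def lemma_7_8_rhs(z, m):
    """n sin z * prod_{k=1}^m (1 - sin^2 z / sin^2(k pi / n)) with n = 2m + 1."""
    n = 2 * m + 1
    s2 = cmath.sin(z) ** 2
    prod = 1
    for k in range(1, m + 1):
        prod *= 1 - s2 / math.sin(k * math.pi / n) ** 2
    return n * cmath.sin(z) * prod

z, m = 0.7 + 0.2j, 4
print(lemma_7_8_rhs(z, m), cmath.sin((2 * m + 1) * z))
```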
We are now ready to give our fourth proof of Euler’s infinite product for sine.
To this end, we let $n \ge 3$ be odd and we replace $z$ by $z/n$ in Lemma 7.8 to get
\[ \sin z = n \sin(z/n) \prod_{k=1}^{m} \Big( 1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/n)} \Big) , \]
where $n = 2m + 1$. Since
\[ \lim_{m\to\infty} (2m+1) \sin(z/(2m+1)) = \lim_{m\to\infty} z \, \frac{\sin(z/(2m+1))}{z/(2m+1)} = z , \]
we have
\[ \sin z = z \lim_{m\to\infty} \prod_{k=1}^{m} \Big( 1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/n)} \Big) = z \lim_{m\to\infty} \prod_{k=1}^{m} (1 + a_k(m)) , \]
where $a_k(m) := - \frac{\sin^2(z/n)}{\sin^2(k\pi/n)}$ with $n = 2m + 1$. Since we are taking $m \to \infty$, we can
always make sure that $n = 2m + 1 > |z|$, which we henceforth assume. Now recall
from Lemmas 5.6 and 5.7 that there is a constant $c > 0$ such that for any $0 \le x \le \frac{\pi}{2}$,
we have $c\,x \le \sin x$, and for any $w \in \mathbb{C}$ with $|w| \le 1$, we have $|\sin w| \le \frac{6}{5} |w|$. It
follows that for any $k = 1, 2, \ldots, m$,
\[ \left| \frac{\sin^2(z/n)}{\sin^2(k\pi/n)} \right| \le \frac{\big( \frac{6}{5} |z/n| \big)^2}{c^2 (k\pi/n)^2} = \frac{36 |z|^2}{25 c^2 \pi^2} \cdot \frac{1}{k^2} =: M_k , \]
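Since the sum $\sum_{k=1}^{\infty} M_k$ converges, and
\[ \lim_{m\to\infty} a_k(m) = - \lim_{m\to\infty} \frac{\sin^2(z/(2m+1))}{\sin^2(k\pi/(2m+1))} = - \lim_{m\to\infty} \frac{z^2}{k^2 \pi^2} \cdot \frac{\Big( \frac{\sin(z/(2m+1))}{z/(2m+1)} \Big)^2}{\Big( \frac{\sin(k\pi/(2m+1))}{k\pi/(2m+1)} \Big)^2} = - \frac{z^2}{k^2 \pi^2} , \]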
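The top product can be split as a product of even and odd terms:
\[ \prod_{n=1}^{\infty} \Big( 1 - \frac{4z^2}{(2n-1)^2} \Big) \prod_{n=1}^{\infty} \Big( 1 - \frac{4z^2}{(2n)^2} \Big) = \prod_{n=1}^{\infty} \Big( 1 - \frac{4z^2}{(2n-1)^2} \Big) \prod_{n=1}^{\infty} \Big( 1 - \frac{z^2}{n^2} \Big) , \]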
Exercises 7.3.
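1. Put $z = \pi/4$ into the cosine expansion to derive the following elegant product for $\sqrt{2}$:
\[ \sqrt{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdots . \]
Compare this with Wallis’ formula:
\[ \frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdot \frac{8}{9} \cdot \frac{10}{9} \cdot \frac{10}{11} \cdots . \]
Thus, the product for $\sqrt{2}$ is obtained from Wallis’ formula for $\pi/2$ by removing the
factors with numerators that are multiples of 4.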
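2. Prove that
\[ \sinh \pi z = \pi z \prod_{k=1}^{\infty} \Big( 1 + \frac{z^2}{k^2} \Big) \quad \text{and} \quad \cosh \pi z = \prod_{n=1}^{\infty} \Big( 1 + \frac{4z^2}{(2n-1)^2} \Big) . \]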
3. (Euler’s infinite product for cos πz) Here are three more proofs!
(a) Replace $z$ by $-z + 1/2$ in the sine product to derive the cosine product. Suggestion:
Begin by showing that
\[ 1 - \frac{(-z + \frac{1}{2})^2}{n^2} = \Big( 1 - \frac{1}{4n^2} \Big) \cdot \Big( 1 + \frac{2z}{2n-1} \Big) \Big( 1 - \frac{2z}{2n+1} \Big) . \]
(b) For our second proof, show that for $n$ even, we can write
\[ \cos z = \prod_{k=1}^{n-1} \Big( 1 - \frac{\sin^2(z/n)}{\sin^2(k\pi/2n)} \Big) , \qquad k = 1, 3, 5, \ldots, n-1 . \]
7. (Tannery’s theorem II) For each natural number $n$, let $\prod_{k=1}^{\infty} (1 + a_k(n))$ be a convergent
infinite product. If for each $k$, $\lim_{n\to\infty} a_k(n)$ exists, and there is a convergent series $\sum_{k=1}^{\infty} M_k$
of nonnegative real numbers such that $|a_k(n)| \le M_k$ for all $k, n$, prove that
\[ \lim_{n\to\infty} \prod_{k=1}^{\infty} (1 + a_k(n)) = \prod_{k=1}^{\infty} \lim_{n\to\infty} (1 + a_k(n)) ; \]
that is, both sides are well-defined (the limits and products converge) and are equal.
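The two products compared in Problem 1 of these exercises, Wallis’ product for $\pi/2$ and the product for $\sqrt{2}$, can be examined numerically by grouping the factors in pairs. This is a sketch only; the function names are assumptions, and both products converge slowly (the error after $N$ pairs is of order $1/N$).

```python
import math

def wallis(n_pairs):
    """Partial Wallis product: (2n)(2n) / ((2n-1)(2n+1)) for n = 1, ..., n_pairs."""
    p = 1.0
    for n in range(1, n_pairs + 1):
        p *= (2 * n) * (2 * n) / ((2 * n - 1) * (2 * n + 1))
    return p

def sqrt2_product(n_pairs):
    """Same pairs, keeping only numerators 2, 6, 10, ... (not multiples of 4)."""
    p = 1.0
    for n in range(1, n_pairs + 1):
        e = 4 * n - 2
        p *= e * e / ((e - 1) * (e + 1))
    return p

print(wallis(100000), math.pi / 2)
print(sqrt2_product(100000), math.sqrt(2))
```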
We also derive partial fraction expansions for the other trigonometric functions.
We begin with the cotangent.
Our proof of Euler’s expansion of the cotangent is based on the following lemma.
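Lemma 7.11. For any noninteger complex number $z$ and $n \in \mathbb{N}$, we have
\[ \pi z \cot \pi z = \frac{\pi z}{2^n} \cot \frac{\pi z}{2^n} + \sum_{k=1}^{2^{n-1}-1} \frac{\pi z}{2^n} \Big( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \Big) - \frac{\pi z}{2^n} \tan \frac{\pi z}{2^n} . \]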
This is the main formula on which induction may be applied to prove our lemma.
For instance, let’s take the case $n = 2$. Considering the positive sign in the second
cotangent, we have
\[ \cot \pi z = \frac{1}{2} \Big( \cot \frac{\pi z}{2} + \cot \frac{\pi(z+1)}{2} \Big) . \]
Applying (7.16) to each cotangent on the right of this equation, using the plus sign
for the first and the minus sign for the second, we get
\[
\begin{aligned}
\cot \pi z &= \frac{1}{2^2} \Big( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(\frac{z}{2} + 1)}{2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(\frac{z+1}{2} - 1)}{2} \Big) \\
&= \frac{1}{2^2} \Big( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(z+2)}{2^2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(z-1)}{2^2} \Big) ,
\end{aligned}
\]
which, after bringing the second cotangent to the end, takes the form
\[ \cot \pi z = \frac{1}{2^2} \Big( \cot \frac{\pi z}{2^2} + \cot \frac{\pi(z+1)}{2^2} + \cot \frac{\pi(z-1)}{2^2} + \cot \Big( \frac{\pi z}{2^2} + \frac{\pi}{2} \Big) \Big) . \]
However, the last term is exactly $-\tan(\pi z/2^2)$, and so our lemma is proved for $n = 2$.
Continuing by induction proves our lemma for general $n$.
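Since Lemma 7.11 is an exact finite identity, it can be spot-checked at a sample noninteger point for several values of $n$. The code below is an illustrative sketch; the helper names `cot` and `lemma_7_11_rhs` are assumptions made here.

```python
import cmath
import math

def cot(w):
    return cmath.cos(w) / cmath.sin(w)

def lemma_7_11_rhs(z, n):
    """Right-hand side of Lemma 7.11 for a noninteger z and natural number n."""
    h = math.pi / 2 ** n
    total = h * z * cot(h * z) - h * z * cmath.tan(h * z)
    for k in range(1, 2 ** (n - 1)):          # k = 1, ..., 2^(n-1) - 1
        total += h * z * (cot(h * (z + k)) + cot(h * (z - k)))
    return total

z = 0.3 + 0.1j
exact = math.pi * z * cot(math.pi * z)
for n in (1, 2, 5):
    print(n, lemma_7_11_rhs(z, n), exact)
```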
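Fix a noninteger $z$; we shall prove Euler’s expansion for the cotangent. Note
that $\lim_{n\to\infty} \frac{\pi z}{2^n} \tan\big( \frac{\pi z}{2^n} \big) = 0 \cdot \tan 0 = 0$, and since
\[ \lim_{w\to 0} w \cot w = \lim_{w\to 0} \frac{w}{\sin w} \cdot \cos w = 1 \cdot 1 = 1 , \tag{7.17} \]
we have $\lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi z}{2^n} = 1$. Therefore, taking $n \to \infty$ in the formula from the
preceding Lemma 7.11, we conclude that
\[ \pi z \cot \pi z = 1 + \lim_{n\to\infty} \sum_{k=1}^{2^{n-1}-1} \frac{\pi z}{2^n} \Big( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \Big) = 1 + \lim_{n\to\infty} \sum_{k=1}^{2^{n-1}-1} a_k(n) , \]
where
\[ a_k(n) = \frac{\pi z}{2^n} \Big( \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} \Big) . \]
We shall apply Tannery’s theorem to this sum. To this end, observe that, from
(7.17),
\[ \lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi(z+k)}{2^n} = \lim_{n\to\infty} \frac{z}{z+k} \cdot \frac{\pi(z+k)}{2^n} \cot \frac{\pi(z+k)}{2^n} = \frac{z}{z+k} , \]
and in a similar way,
\[ \lim_{n\to\infty} \frac{\pi z}{2^n} \cot \frac{\pi(z-k)}{2^n} = \frac{z}{z-k} . \]
Thus,
\[ \lim_{n\to\infty} a_k(n) = \frac{z}{z+k} + \frac{z}{z-k} = \frac{2z^2}{z^2 - k^2} , \]
so Tannery’s theorem gives Euler’s cotangent expansion:
\[ \pi z \cot \pi z = 1 + 2z^2 \sum_{k=1}^{\infty} \frac{1}{z^2 - k^2} , \]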
provided of course we can show that $|a_k(n)| \le M_k$ where $\sum M_k < \infty$. Actually,
we shall prove that there are $m, N \in \mathbb{N}$ such that $|a_k(n)| \le M_k$ for all $n > N$ and $k \ge m$
where $\sum_{k=m}^{\infty} M_k < \infty$. The conclusion of Tannery’s theorem will still hold with
these conditions. (Why so?)
To bound each $a_k(n)$, we use the formula
\[ \cot(\alpha + \beta) + \cot(\alpha - \beta) = \frac{\sin 2\alpha}{\sin^2 \alpha - \sin^2 \beta} . \]
This formula is obtained by expressing $\cot(\alpha \pm \beta)$ in terms of cosine and sine and
using the angle addition formulas (the diligent reader will supply the details!).
Setting $\alpha = \pi z/2^n$ and $\beta = \pi k/2^n$, we obtain
\[ \cot \frac{\pi(z+k)}{2^n} + \cot \frac{\pi(z-k)}{2^n} = \frac{\sin 2\alpha}{\sin^2 \alpha - \sin^2 \beta} , \tag{7.18} \]
where we keep the notation $\alpha = \pi z/2^n$ and $\beta = \pi k/2^n$ on the right. Our goal now
is to bound the term on the right of (7.18). Choose $N \in \mathbb{N}$ such that for all $n > N$,
we have $|\alpha| = |\pi z/2^n| < 1/2$. Then for $n > N$, according to Lemma 5.7,
\[ |\sin 2\alpha| \le \frac{6}{5} |2\alpha| \le 3|\alpha| \quad \text{and} \quad |\sin \alpha| \le \frac{6}{5} |\alpha| \le 2|\alpha| , \]
and, since $\beta = \pi k/2^n < \pi/2$ for $k = 1, \ldots, 2^{n-1} - 1$, according to Lemma 5.6, for
some $c > 0$,
\[ c\,\beta \le \sin \beta . \]
Hence, for $n > N$,
Thus, for $n > N$ and $k \ge m$, in view of (7.18) and the definition of $a_k(n)$, we have
\[ |a_k(n)| \le M_k , \quad \text{where} \quad M_k = \frac{3|z|^2}{c^2 k^2 - 4|z|^2} . \]
Since
\[ M_k = \frac{3|z|^2}{c^2 - \frac{4|z|^2}{k^2}} \cdot \frac{1}{k^2} \le \frac{3|z|^2}{c^2 - \frac{4|z|^2}{m^2}} \cdot \frac{1}{k^2} = \text{constant} \cdot \frac{1}{k^2} , \]
by the comparison test, the sum $\sum_{k=m}^{\infty} M_k$ converges. This completes the proof of
Euler’s cotangent expansion.
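Euler’s cotangent expansion can be compared against a direct evaluation of $\pi z \cot \pi z$. In the sketch below the truncation point and the function name `cot_series` are assumptions; the neglected tail is roughly $2|z|^2/K$ after $K$ terms, so only a few correct digits should be expected.

```python
import cmath
import math

def cot_series(z, terms):
    """Truncation of 1 + 2 z^2 * sum_{k>=1} 1/(z^2 - k^2)."""
    return 1 + 2 * z * z * sum(1 / (z * z - k * k) for k in range(1, terms + 1))

z = 0.3 + 0.2j
exact = math.pi * z * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)
print(cot_series(z, 200000), exact)
```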
7.4. PARTIAL FRACTION EXPANSIONS OF THE TRIGONOMETRIC FUNCTIONS 369
Finally, the incredibly awesome diligent reader will supply the details for the
following cosine expansion: For $z \in \mathbb{C}$ not an odd integer,
\[ \frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^{\infty} (-1)^n \frac{(2n+1)}{(2n+1)^2 - z^2} . \tag{7.21} \]
Exercises 7.4.
1. Fill in the details for the proofs of (7.19) and (7.20). For (7.21), first show that
\[ \frac{\pi}{\sin \pi z} = \frac{1}{z} + \frac{1}{1-z} - \frac{1}{1+z} - \frac{1}{2-z} + \frac{1}{2+z} + \cdots . \]
Replacing $z$ with $\frac{1-z}{2}$ and doing some algebra, derive the expansion (7.21).
2. Derive Gregory-Leibniz-Madhava’s series $\frac{\pi}{4} = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{2n-1} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$ by
setting $z = 1/4$ in the partial fraction expansions of $\pi z \cot \pi z$ and $\pi / \sin \pi z$. How
can you derive Gregory-Leibniz-Madhava’s series from the expansion of $\frac{\pi}{4 \cos \frac{\pi z}{2}}$?
3. Derive the following formulas for $\pi$:
\[ \pi = z \tan\Big( \frac{\pi}{z} \Big) \cdot \Big[ 1 - \frac{1}{z-1} + \frac{1}{z+1} - \frac{1}{2z-1} + \frac{1}{2z+1} - + \cdots \Big] \]
and
\[ \pi = z \sin\Big( \frac{\pi}{z} \Big) \cdot \Big[ 1 + \frac{1}{z-1} - \frac{1}{z+1} - \frac{1}{2z-1} + \frac{1}{2z+1} + - - + + \cdots \Big] . \]
In particular, plug in $z = 3, 4, 6$ to derive some pretty formulas.
7.5. ★ More proofs that $\pi^2/6 = \sum_{n=1}^{\infty} 1/n^2$
In this section, we continue our discussion from Sections 5.2 and 6.11, concern-
ing the Basel problem of determining the sum of the reciprocals of the squares. A
good reference for this material is [109] and for more on Euler, see [11].
7.5.1. Proof VIII of Euler’s formula for π 2 /6. (Cf. [47, p. 74].) One can
consider this proof as a “logarithmic” version of Euler’s original (third) proof of the
formula for π 2 /6, which we explained in the introduction to Chapter 5. As with
Euler, we begin with Euler’s sine expansion restricted to 0 ≤ x < 1:
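\[ \frac{\sin \pi x}{\pi x} = \prod_{n=1}^{\infty} \Big( 1 - \frac{x^2}{n^2} \Big) . \]
where in the second equality we can pull out the limit because log is continuous,
and at the last step we used that logarithms take products to sums. Thus, we have
shown that
\[ \log\Big( \frac{\sin \pi x}{\pi x} \Big) = \sum_{n=1}^{\infty} \log\Big( 1 - \frac{x^2}{n^2} \Big) , \qquad 0 \le x < 1 . \]
Recalling that $\log(1 + t) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m} t^m$, we see that
\[ \log(1 - t) = - \sum_{m=1}^{\infty} \frac{1}{m} t^m , \]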
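Since
\[ \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \left| \frac{1}{m} \frac{x^{2m}}{n^{2m}} \right| = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{1}{m} \frac{|x|^{2m}}{n^{2m}} = - \sum_{n=1}^{\infty} \log\Big( 1 - \frac{|x|^2}{n^2} \Big) = - \log\Big( \frac{\sin \pi |x|}{\pi |x|} \Big) < \infty , \]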
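On the other hand, by our power series composition theorem, we have (after some
simplification)
\[
\begin{aligned}
- \log\Big( \frac{\sin \pi x}{\pi x} \Big) &= - \log\Big( 1 - \Big( \frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots \Big) \Big) \\
&= \Big( \frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots \Big) + \frac{1}{2} \Big( \frac{\pi^2 x^2}{3!} - \frac{\pi^4 x^4}{5!} + - \cdots \Big)^2 + \cdots \\
&= \frac{\pi^2}{3!} x^2 + \Big( - \frac{\pi^4}{5!} + \frac{\pi^4}{2 \cdot (3!)^2} \Big) x^4 + \Big( \frac{\pi^6}{7!} - \frac{\pi^6}{3! \cdot 5!} + \frac{\pi^6}{3 \cdot (3!)^3} \Big) x^6 + \cdots . \qquad (7.23)
\end{aligned}
\]
Equating this with (7.22), we obtain
\[
\begin{aligned}
\frac{\pi^2}{3!} x^2 + \Big( - \frac{\pi^4}{5!} + \frac{\pi^4}{2 \cdot (3!)^2} \Big) x^4 &+ \Big( \frac{\pi^6}{7!} - \frac{\pi^6}{3! \cdot 5!} + \frac{\pi^6}{3 \cdot (3!)^3} \Big) x^6 + \cdots \\
&= x^2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{x^4}{2} \sum_{n=1}^{\infty} \frac{1}{n^4} + \frac{x^6}{3} \sum_{n=1}^{\infty} \frac{1}{n^6} + \cdots ,
\end{aligned}
\]
or after simplification,
\[ \frac{\pi^2}{6} x^2 + \frac{\pi^4}{180} x^4 + \frac{\pi^6}{2835} x^6 + \cdots = x^2 \sum_{n=1}^{\infty} \frac{1}{n^2} + \frac{x^4}{2} \sum_{n=1}^{\infty} \frac{1}{n^4} + \frac{x^6}{3} \sum_{n=1}^{\infty} \frac{1}{n^6} + \cdots . \tag{7.24} \]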
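The simplification leading to (7.24) is pure rational arithmetic in the coefficients $1/3!$, $1/5!$, $1/7!$, so it can be verified exactly. Comparing coefficients of $x^4$ and $x^6$ in (7.24) then gives $\sum 1/n^4 = \pi^4/90$ and $\sum 1/n^6 = \pi^6/945$, which the sketch below compares with truncated sums. The variable names are chosen here for illustration.

```python
from fractions import Fraction
import math

a, b, c = Fraction(1, 6), Fraction(1, 120), Fraction(1, 5040)  # 1/3!, 1/5!, 1/7!

coeff_x2 = a                          # coefficient of pi^2 x^2
coeff_x4 = -b + a ** 2 / 2            # coefficient of pi^4 x^4
coeff_x6 = c - a * b + a ** 3 / 3     # coefficient of pi^6 x^6

zeta4 = 2 * float(coeff_x4) * math.pi ** 4   # from (x^4/2) * sum 1/n^4
zeta6 = 3 * float(coeff_x6) * math.pi ** 6   # from (x^6/3) * sum 1/n^6

print(coeff_x2, coeff_x4, coeff_x6)
print(zeta4, sum(1.0 / n ** 4 for n in range(1, 2000)))
print(zeta6, sum(1.0 / n ** 6 for n in range(1, 2000)))
```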
Now what if we took more terms in (7.22) and (7.23), say to $x^{2k}$, can we then find
a formula for $\sum 1/n^{2k}$? The answer is certainly yes, but the work required to get a
formula is rather intimidating; see Problem 1 for a formula when $k = 4$. Of course,
in Section 5.2 we found formulas for $\zeta(2k)$ for all $k$.
7.5.2. Proof IX. (Cf. [123], [49].) For this proof, we start with Lemma 7.8,
which states that if $n = 2m + 1$ with $m \in \mathbb{N}$, then
\[ \sin nz = n \sin z \prod_{k=1}^{m} \Big( 1 - \frac{\sin^2 z}{\sin^2(k\pi/n)} \Big) . \tag{7.27} \]
We fix an $m$; later we shall take $m \to \infty$. We now substitute the expansion
\[ \sin nz = nz - \frac{n^3 z^3}{3!} + \frac{n^5 z^5}{5!} - + \cdots \]
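(i) Fix any $M \in \mathbb{N}$ and let $m > M$. Using (7.28), prove that for $n = 2m + 1$,
\[ \frac{1}{6} - \sum_{k=1}^{M} \frac{1}{n^2 \sin^2(k\pi/n)} = \frac{1}{6n^2} + \sum_{k=M+1}^{m} \frac{1}{n^2 \sin^2(k\pi/n)} . \]
(ii) Using that $c\,x \le \sin x$ for $0 \le x \le \pi/2$ with $c > 0$, prove that
\[ 0 \le \frac{1}{6} - \sum_{k=1}^{M} \frac{1}{n^2 \sin^2(k\pi/n)} \le \frac{1}{n^2} + \frac{1}{c^2 \pi^2} \sum_{k=M+1}^{\infty} \frac{1}{k^2} . \]
4. (Cf. [56]) Let $A \subseteq \mathbb{N}$ denote the set of natural numbers that are not perfect squares.
With $\sum_{n \in A} \frac{1}{n^2} := \lim_{N\to\infty} \sum_{n \in A ,\, n < N} \frac{1}{n^2}$, prove that
\[ \sum_{n \in A} \frac{1}{n^2} = \frac{\pi^2}{90} (15 - \pi^2) . \]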
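for some nonnegative integers $i, j, \ldots, k$. Using this fact, it follows that the product
\[
\begin{aligned}
\prod_{p<N} \Big( 1 - \frac{1}{p^z} \Big)^{-1} &= \Big( 1 - \frac{1}{2^z} \Big)^{-1} \Big( 1 - \frac{1}{3^z} \Big)^{-1} \cdots \Big( 1 - \frac{1}{m^z} \Big)^{-1} \\
&= \Big( 1 + \frac{1}{2^z} + \frac{1}{2^{2z}} + \frac{1}{2^{3z}} + \cdots \Big) \Big( 1 + \frac{1}{3^z} + \frac{1}{3^{2z}} + \frac{1}{3^{3z}} + \cdots \Big) \cdots \\
&\qquad \cdots \Big( 1 + \frac{1}{m^z} + \frac{1}{m^{2z}} + \frac{1}{m^{3z}} + \cdots \Big) ,
\end{aligned}
\]
after multiplying out and using Cauchy’s multiplication theorem (or rather its generalization
to a product of more than two absolutely convergent series), contains the
numbers $1, \frac{1}{2^z}, \frac{1}{3^z}, \frac{1}{4^z}, \frac{1}{5^z}, \ldots, \frac{1}{(N-1)^z}$ (along with all other numbers $\frac{1}{n^z}$ with $n \ge N$
since $\operatorname{Re} z \ge r$. By the $p$-test (with $p = r > 1$), $\sum \frac{1}{n^r}$ converges so the right-hand
side tends to zero as $N \to \infty$. This completes Proof I.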
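Proof II: Here’s Euler’s beautiful proof using a “sieving method” made famous
by Eratosthenes of Cyrene (276 B.C.–194 B.C.). First we get rid of all the numbers
in $\zeta(z)$ that have factors of 2: Observe that
\[ \frac{1}{2^z} \zeta(z) = \frac{1}{2^z} \sum_{n=1}^{\infty} \frac{1}{n^z} = \sum_{n=1}^{\infty} \frac{1}{(2n)^z} , \]
therefore,
\[ \Big( 1 - \frac{1}{2^z} \Big) \zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} - \sum_{n=1}^{\infty} \frac{1}{(2n)^z} = \sum_{n \,;\, 2 \nmid n} \frac{1}{n^z} . \]
Next, we get rid of all the numbers in $\big( 1 - \frac{1}{2^z} \big) \zeta(z)$ that have factors of 3: Observe
that
\[ \frac{1}{3^z} \Big( 1 - \frac{1}{2^z} \Big) \zeta(z) = \frac{1}{3^z} \sum_{n \,;\, 2 \nmid n} \frac{1}{n^z} = \sum_{n \,;\, 2 \nmid n} \frac{1}{(3n)^z} , \]
therefore,
\[
\begin{aligned}
\Big( 1 - \frac{1}{3^z} \Big) \Big( 1 - \frac{1}{2^z} \Big) \zeta(z) &= \Big( 1 - \frac{1}{2^z} \Big) \zeta(z) - \frac{1}{3^z} \Big( 1 - \frac{1}{2^z} \Big) \zeta(z) \\
&= \sum_{n \,;\, 2 \nmid n} \frac{1}{n^z} - \sum_{n \,;\, 2 \nmid n} \frac{1}{(3n)^z} \\
&= \sum_{n \,;\, 2,3 \nmid n} \frac{1}{n^z} .
\end{aligned}
\]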
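where the sum is over all $n \in \mathbb{N}$ that are not divisible by the primes from 2 to $q$.
Therefore, choosing $r > 1$ such that $|z| > r$, we have
\[ \left| \prod_{p \text{ prime} \le q} \Big( 1 - \frac{1}{p^z} \Big) \zeta(z) - 1 \right| = \left| \sum_{n \,;\, n \ne 1 \,\&\, 2,3,\ldots,q \nmid n} \frac{1}{n^z} \right| \le \sum_{n \,;\, n \ne 1 \,\&\, 2,3,\ldots,q \nmid n} \frac{1}{n^r} \le \sum_{n=q}^{\infty} \frac{1}{n^r} . \]
By Cauchy’s criterion for series, $\lim_{q\to\infty} \sum_{n=q}^{\infty} \frac{1}{n^r} = 0$, so we conclude that
\[ \prod_{p \text{ prime}} \Big( 1 - \frac{1}{p^z} \Big) \zeta(z) = 1 , \]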
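Euler’s product can be tested at $z = 2$, where $\zeta(2) = \pi^2/6$. The sketch below builds the primes with a simple sieve of Eratosthenes and multiplies the factors $(1 - p^{-z})^{-1}$; the function names and the cutoff $10^5$ are assumptions made for illustration.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting from p*p.
            sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def euler_product(z, limit):
    """Product of 1 / (1 - p^(-z)) over the primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 / (1.0 - p ** (-z))
    return prod

print(euler_product(2, 100000), math.pi ** 2 / 6)
```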
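Proof. Let $r > 1$ be arbitrary and let $\operatorname{Re} z \ge r$. Let $2 < N \in \mathbb{N}$ and let
$2 < 3 < \cdots < m < N$ be all the primes less than $N$. Then observe that the product
\[ \prod_{p<N} \Big( 1 - \frac{1}{p^z} \Big) = \Big( 1 + \frac{-1}{2^z} \Big) \Big( 1 + \frac{-1}{3^z} \Big) \Big( 1 + \frac{-1}{5^z} \Big) \cdots \Big( 1 + \frac{-1}{m^z} \Big) , \]
when multiplied out contains 1 and all numbers of the form
\[ \frac{-1}{p_1^z} \cdot \frac{-1}{p_2^z} \cdot \frac{-1}{p_3^z} \cdots \frac{-1}{p_k^z} = \frac{(-1)^k}{p_1^z p_2^z \cdots p_k^z} = \frac{(-1)^k}{n^z} , \qquad n = p_1 p_2 \cdots p_k , \]
where $p_1 < p_2 < \cdots < p_k < N$ are distinct primes. In particular, $\prod_{p<N} \big( 1 - \frac{1}{p^z} \big)$
contains the numbers $\frac{\mu(n)}{n^z}$ for $n = 1, 2, \ldots, N - 1$ (along with all other numbers
$\frac{\mu(n)}{n^z}$ with $n \ge N$ having prime factors $2, 3, \ldots, m$), so
\[ \left| \sum_{n=1}^{\infty} \frac{\mu(n)}{n^z} - \prod_{p<N} \Big( 1 - \frac{1}{p^z} \Big) \right| \le \sum_{n=N}^{\infty} \left| \frac{\mu(n)}{n^z} \right| \le \sum_{n=N}^{\infty} \frac{1}{n^r} , \]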
376 7. MORE ON THE INFINITE: PRODUCTS AND PARTIAL FRACTIONS
since Re z ≥ r. By the p-test (with p = r > 1), $\sum \frac{1}{n^r}$ converges, so the right-hand side tends to zero as N → ∞. This completes our proof.
See the exercises for other neat connections of ζ(z) with number theory.
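As a numerical sanity check (not a proof), here is a minimal Python sketch comparing the Möbius series and the truncated prime product at z = 2 with the common value 1/ζ(2) = 6/π². The helper name `mobius_sieve` and the cutoff 10000 are our own illustrative choices.

```python
import math

def mobius_sieve(limit):
    """Return (mu, primes): mu[n] is the Mobius function for 0 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    primes = []
    for p in range(2, limit + 1):
        if is_prime[p]:
            primes.append(p)
            # Each prime factor p flips the sign of mu on multiples of p ...
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # ... and mu vanishes on multiples of p^2 (not square free).
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    return mu, primes

limit = 10_000
mu, primes = mobius_sieve(limit)

# Partial sum of sum mu(n)/n^2 and partial product of (1 - 1/p^2).
mobius_sum = sum(mu[n] / n**2 for n in range(1, limit + 1))
euler_product = math.prod(1 - 1 / p**2 for p in primes)
target = 6 / math.pi**2

print(mobius_sum, euler_product, target)
```

All three printed numbers should agree to about three decimal places; the tails are bounded by $\sum_{n \ge N} 1/n^2$, exactly as in the proofs above.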
7.6.2. The eta function. A function related to the zeta function is the “alternating zeta function” or Dirichlet eta-function:
\[
\eta(z) := \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^z}.
\]
We can write the eta function in terms of the zeta function as follows.
Theorem 7.14. We have
\[
\eta(z) = (1 - 2^{1-z})\,\zeta(z), \qquad z > 1.
\]
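Proof. Splitting into sums of even and odd numbers, we get
\begin{align*}
\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^z}
&= -\sum_{n=1}^\infty \frac{1}{(2n)^z} + \sum_{n=1}^\infty \frac{1}{(2n-1)^z} \\
&= -\sum_{n=1}^\infty \frac{1}{2^z}\,\frac{1}{n^z} + \sum_{n=1}^\infty \frac{1}{(2n-1)^z} \\
&= -2^{-z}\zeta(z) + \sum_{n=1}^\infty \frac{1}{(2n-1)^z}.
\end{align*}
On the other hand, breaking the zeta function into sums of even and odd numbers, we get
\[
\zeta(z) = \sum_{n=1}^\infty \frac{1}{n^z} = \sum_{n=1}^\infty \frac{1}{(2n)^z} + \sum_{n=1}^\infty \frac{1}{(2n-1)^z} = 2^{-z}\zeta(z) + \sum_{n=1}^\infty \frac{1}{(2n-1)^z}.
\]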
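The relation in Theorem 7.14 is easy to test numerically. The following is a small Python sketch, with function names and the number of terms chosen by us purely for illustration, that compares partial sums of η(2) with $(1 - 2^{1-2})\zeta(2)$.

```python
import math

def eta_partial(z, terms):
    """Partial sum of the alternating series sum (-1)^(n-1) / n^z."""
    return sum((-1) ** (n - 1) / n**z for n in range(1, terms + 1))

def zeta_partial(z, terms):
    """Partial sum of the series sum 1 / n^z (real z > 1)."""
    return sum(1.0 / n**z for n in range(1, terms + 1))

z = 2.0
terms = 100_000
eta_value = eta_partial(z, terms)
scaled_zeta = (1 - 2 ** (1 - z)) * zeta_partial(z, terms)

# For z = 2 both numbers should be close to pi^2 / 12.
print(eta_value, scaled_zeta, math.pi**2 / 12)
```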
1Such shocking connections in science perhaps made Albert Einstein (1879–1955) state that
“the scientist’s religious feeling takes the form of a rapturous amazement at the harmony of natu-
ral law, which reveals an intelligence of such superiority that, compared with it, all the systematic
thinking and acting of human beings is an utterly insignificant reflection”. [103]
7.6. F RIEMANN’S REMARKABLE ζ-FUNCTION, PROBABILITY, AND π 2 /6 377
7.6.3. Elementary probability theory. You will prove these results with
complete rigor in Problems 11 and 10. However, we are going to derive them
intuitively — not rigorously (!) — based on some basic probability ideas that should
be “obvious” (or at least believable) to you; see [229, 70, 71] for standard books
on probability in case you want the hardcore theory. We only need the basics. We
denote the probability, or chance, that an event A happens by P (A). The classic
definition is
\[
P(A) = \frac{\text{number of occurrences of } A}{\text{total number of possibilities}}. \tag{7.30}
\]
For example, consider a classroom with 10 people, m men and w women (so that
m + w = 10). The probability of randomly “choosing a man” (= M ) is
\[
P(M) = \frac{\text{number of men}}{\text{total number of possibilities}} = \frac{m}{10}.
\]
Similarly, the probability of randomly choosing a woman is w/10. We next need to
discuss complementary events. If Ac is the event that A does not happen, then
(7.31) P (Ac ) = 1 − P (A).
For instance, according to (7.31) the probability of “not choosing a man”, M c ,
should be P (M c ) = 1 − P (M ) = 1 − m/10. But this is certainly true because
“not choosing a man” is the same as “choosing a woman” W , so recalling that
m + w = 10, we have
\[
P(M^c) = P(W) = \frac{w}{10} = \frac{10 - m}{10} = 1 - \frac{m}{10}.
\]
Finally, we need to discuss independence. Whenever an event A is unrelated to an
event B (such events are called independent), we have the fundamental relation:
P (A and B) = P (A) · P (B).
For example, let’s say that we have two classrooms of 10 students each, the first one
with m1 men and w1 women, and the second one with m2 men and w2 women. Let
us randomly choose a pair of students, one from the first classroom and the other
from the second. What is the probability of “choosing a man from the first class-
room” = A and “choosing a woman from the second classroom” = B? Certainly
A and B don’t depend on each other, so by our formula above we should have
\[
P(A \text{ and } B) = P(A) \cdot P(B) = \frac{m_1}{10} \cdot \frac{w_2}{10} = \frac{m_1 w_2}{100}.
\]
To see that this is indeed true, note that the number of ways to pair a man in
classroom 1 with a woman in classroom 2 is m1 · w2 and the total number of
possible pairs of people is $10^2 = 100$. Thus,
\[
P(A \text{ and } B) = \frac{\text{number of men-women pairs}}{\text{total number of possible pairs of people}} = \frac{m_1 \cdot w_2}{100},
\]
in agreement with our previous calculation. We remark that for any number of
events A1 , A2 , . . ., which are unrelated to each other, we have the generalized result:
(7.32) P (A1 and A2 and · · · ) = P (A1 ) · P (A2 ) · · · .
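The two-classroom computation can be checked by brute force. Below is a minimal Python sketch that lists all 100 pairs and counts the favorable ones; the function name `pair_probability` and the sample values m1 = 4, m2 = 7 are hypothetical choices of ours.

```python
from fractions import Fraction
from itertools import product

def pair_probability(m1, m2, size=10):
    """P(man from room 1 and woman from room 2), by listing every pair."""
    room1 = ["M"] * m1 + ["W"] * (size - m1)
    room2 = ["M"] * m2 + ["W"] * (size - m2)
    pairs = list(product(room1, room2))
    favorable = sum(1 for a, b in pairs if a == "M" and b == "W")
    return Fraction(favorable, len(pairs))

m1, m2 = 4, 7          # assumed sample sizes, for illustration only
w2 = 10 - m2
by_counting = pair_probability(m1, m2)
by_formula = Fraction(m1, 10) * Fraction(w2, 10)
print(by_counting, by_formula)
```

Both values come out as the same fraction, in line with the product rule for independent events.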
7.6.4. Probability and π 2 /6. To begin discussing our two incredible and
shocking problems, we first look at the following question: Given a natural number
k, what is the probability, or chance, that a randomly chosen natural number is
divisible by k? Since the definition (7.30) involves finite quantities, we can’t use
this definition as it stands. We can instead use the following modified version:
\[
P(A) = \lim_{n\to\infty} \frac{\text{number of occurrences of } A \text{ amongst } n \text{ possibilities}}{n}. \tag{7.33}
\]
Using this formula, in Problem 8, you should be able to prove that the probability
a randomly chosen natural number is divisible by k is 1/k. However, instead of
using (7.33), we shall employ the following heuristic trick (which works to give the
correct answer). Choose an “extremely large” natural number N , and consider the
very large sample of numbers
1, 2, 3, 4, 5, 6, . . . , N k.
There are exactly N numbers in this list that are divisible by k, namely the N
numbers k, 2k, 3k, . . . , N k, and no others, and there are a total of N k numbers in
this list. Thus, the probability that a natural number n, randomly chosen amongst
the large sample, is divisible by k is exactly the probability that n is one of the N
numbers k, 2k, 3k, . . . , N k, so
\[
P(k|n) = \frac{\text{number of occurrences of divisibility}}{\text{total number of possibilities listed}} = \frac{N}{Nk} = \frac{1}{k}. \tag{7.34}
\]
For instance, the probability that a randomly chosen natural number is divisible by
1 is 1, which makes sense. The probability that a randomly chosen natural number
is divisible by 2 is 1/2; in other words, the probability that a randomly chosen
natural number is even is 1/2, which also makes sense.
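To make the heuristic concrete, here is a short Python sketch that counts multiples of k in the sample 1, 2, . . . , n directly. The function name `divisible_fraction` is our own, and the sample sizes are arbitrary.

```python
def divisible_fraction(k, n):
    """Fraction of the numbers 1, 2, ..., n that are divisible by k."""
    count = sum(1 for j in range(1, n + 1) if j % k == 0)
    return count / n

# With n = N*k the fraction is exactly 1/k, as in the heuristic argument.
print(divisible_fraction(7, 7 * 1000), 1 / 7)

# For a general n the fraction is floor(n/k)/n, which tends to 1/k.
print(divisible_fraction(3, 100_000), 1 / 3)
```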
We are now ready to solve our two problems. Question: What is the probability that a natural number, chosen at random, is square free? Let n ∈ N be randomly chosen. Then n is square free just means that $p^2 \nmid n$ for all primes p. Thus,
\[
P(n \text{ is square free}) = P\big((2^2 \nmid n) \text{ and } (3^2 \nmid n) \text{ and } (5^2 \nmid n) \text{ and } (7^2 \nmid n) \text{ and } \cdots\big).
\]
Since n was randomly chosen, the events $2^2 \nmid n$, $3^2 \nmid n$, $5^2 \nmid n$, etc. are unrelated, so by (7.32),
\[
P(n \text{ is square free}) = P(2^2 \nmid n) \cdot P(3^2 \nmid n) \cdot P(5^2 \nmid n) \cdot P(7^2 \nmid n) \cdots
\]
To see what the right-hand side is, we use (7.31) and (7.34) to write
\[
P(p^2 \nmid n) = 1 - P(p^2 | n) = 1 - \frac{1}{p^2}.
\]
Thus,
\[
P(n \text{ is square free}) = \prod_{p \text{ prime}} P(p^2 \nmid n) = \prod_{p \text{ prime}} \Big(1 - \frac{1}{p^2}\Big) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2},
\]
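The value 6/π² ≈ 0.6079 can be compared with an actual count. The sketch below (Python; `count_square_free` is a name we made up, and 100000 is an arbitrary cutoff) sieves out every multiple of a square d² with d ≥ 2 and reports the surviving fraction. It is an experiment, not a substitute for the rigorous argument in the exercises.

```python
import math

def count_square_free(limit):
    """Number of square-free integers k with 1 <= k <= limit."""
    square_free = [True] * (limit + 1)
    d = 2
    while d * d <= limit:
        # Every multiple of d^2 is divisible by a square greater than 1.
        for m in range(d * d, limit + 1, d * d):
            square_free[m] = False
        d += 1
    return sum(square_free[1:])

limit = 100_000
density = count_square_free(limit) / limit
print(density, 6 / math.pi**2)
```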
relatively prime, or coprime, just means that m and n have no common factors (except 1), which means$^2$ that $p \nmid$ both m, n for all prime numbers p. Thus,
To see what the right-hand side is, we use (7.31), (7.32), and (7.34) to write
Suggestion: $\dfrac{p^z - (1/p^z)^N}{p^z - 1} = \dfrac{1 - (1/p^z)^{N+1}}{1 - 1/p^z} = 1 + 1/p^z + 1/p^{2z} + \cdots + 1/p^{Nz}$.
(ii) Write $\dfrac{p^z - (1/p^z)^N}{p^z - 1} = 1 + \dfrac{1 - (1/p^z)^N}{p^z - 1}$. Show that
\[
\bigg| \frac{1 - (1/p^z)^N}{p^z - 1} \bigg| \le \frac{2}{p^r - 1} \le \frac{4}{p^r}
\]
and $\sum 4/p^r$ converges. Now prove Theorem 7.12 using Tannery’s theorem for products.
2. Prove that for z ∈ C with Re z > 1,
\[
\frac{\zeta(z)}{\zeta(2z)} = \sum_{n=1}^\infty \frac{|\mu(n)|}{n^z}.
\]
Suggestion: Show that $\frac{\zeta(z)}{\zeta(2z)} = \prod \big(1 + \frac{1}{p^z}\big)$ and copy Proof I of Theorem 7.13.
3. (Möbius inversion formula) In this problem we prove Möbius inversion formula.
(i) Given n ∈ N with n > 1, let p1 , . . . , pk be the distinct prime factors of n. For
1 ≤ i ≤ k, let
\[
A_i = \{ m \in \mathbb{N} \,;\, m = \text{a product of exactly } i \text{ distinct prime factors of } n \}.
\]
$^2$Explicitly, “$p \nmid$ both m, n” is the negation of “p|m and p|n”; that is, “$p \nmid m$ or $p \nmid n$”.
Show that
\[
\sum_{d|n} \mu(d) = 1 + \sum_{i=1}^k \sum_{m\in A_i} \mu(m),
\]
where $\sum_{d|n} \mu(d)$ means to sum over all d ∈ N such that d|n. Next, show that
\[
\sum_{m\in A_i} \mu(m) = (-1)^i \binom{k}{i}.
\]
(iii) Let f : (0, ∞) → R such that f(x) = 0 for x < 1, and define
\[
g(x) = \sum_{n=1}^\infty f\Big(\frac{x}{n}\Big).
\]
Note that g(x) = 0 for x < 1 and this infinite series is really only a finite sum since f(x) = 0 for x < 1; specifically, choosing any N ∈ N with $N \ge \lfloor x \rfloor$ (the greatest integer ≤ x), we have $g(x) = \sum_{n=1}^N f(x/n)$. Prove that
\[
f(x) = \sum_{n=1}^\infty \mu(n)\, g\Big(\frac{x}{n}\Big) \qquad \text{(Möbius inversion formula)}.
\]
As before, this sum is really only finite. Suggestion: If you’ve not gotten anywhere after some time, let S = {(k, n) ∈ N × N ; n|k} and consider the sum
\[
\sum_{(k,n)\in S} \mu(n)\, f\Big(\frac{x}{k}\Big).
\]
Write this sum as $\sum_{k=1}^\infty \sum_{n\,;\,n|k} \mu(n) f(x/k)$, then as $\sum_{n=1}^\infty \sum_{k\,;\,n|k} \mu(n) f(x/k)$ and simplify each iterated sum.
4. (Liouville’s function) Define
\[
\lambda(n) := \begin{cases}
1 & \text{if } n = 1 \\
1 & \text{if the number of prime factors of } n \text{, counted with repetitions, is even} \\
-1 & \text{if the number of prime factors of } n \text{, counted with repetitions, is odd.}
\end{cases}
\]
This function is called Liouville’s function after Joseph Liouville (1809–1882). Prove that for z ∈ C with Re z > 1,
\[
\frac{\zeta(2z)}{\zeta(z)} = \sum_{n=1}^\infty \frac{\lambda(n)}{n^z}.
\]
Suggestion: Show that $\frac{\zeta(2z)}{\zeta(z)} = \prod \big(1 + \frac{1}{p^z}\big)^{-1}$ and copy Proof I of Theorem 7.12.
5. For n ∈ N, let τ (n) denote the number of positive divisors of n (that is, the number of
positive integers that divide n). Prove that for z ∈ C with Re z > 1,
\[
\zeta(z)^2 = \sum_{n=1}^\infty \frac{\tau(n)}{n^z}.
\]
Suggestion: By absolute convergence, we can write $\zeta(z)^2 = \sum_{m,n} 1/(m \cdot n)^z$ where this double series can be summed in any way we wish. Use Theorem 6.25 with the set $S_k$ given by $S_k = T_1 \cup \cdots \cup T_k$ where $T_k = \{(m, n) \in \mathbb{N} \times \mathbb{N} \,;\, m \cdot n = k\}$.
6. Let $\zeta(z, a) := \sum_{n=0}^\infty (n + a)^{-z}$ for z ∈ C with Re z > 1 and a > 0; this function is called the Hurwitz zeta function after Adolf Hurwitz (1859–1919). Prove that
\[
\sum_{m=1}^k \zeta\Big(z, \frac{m}{k}\Big) = k^z\, \zeta(z).
\]
7. In this problem, we find useful bounds and limits for ζ(x) with x > 1 real.
(a) Prove that $1 - \frac{1}{2^x} < \eta(x) < 1$.
(b) Prove that
\[
\frac{1 - 2^{-x}}{1 - 2^{1-x}} < \zeta(x) < \frac{1}{1 - 2^{1-x}}.
\]
(c) Prove the following limits: ζ(x) → 1 as x → ∞, ζ(x) → ∞ as $x \to 1^+$, and (x − 1)ζ(x) → 1 as $x \to 1^+$.
8. Using the definition (7.33), prove that given a natural number k, the probability that a randomly chosen natural number is divisible by k is 1/k. Suggestion: Amongst the n natural numbers 1, 2, 3, . . . , n, show that $\lfloor n/k \rfloor$ many numbers are divisible by k. Now take n → ∞ in $\lfloor n/k \rfloor / n$.
9. (cf. [25, 107]) Let k ∈ N with k ≥ 2. We say that a natural number n is k-th power free if $p^k \nmid n$ for all primes p. What is the probability that a natural number, chosen at random, is k-th power free? What is the probability that k natural numbers, chosen at random, are relatively prime (have no common factors except 1)?
10. (Square-free numbers) Define S : (0, ∞) → R by
\[
S(x) := \#\{k \in \mathbb{N} \,;\, 1 \le k \le x \text{ and } k \text{ is square free}\};
\]
note that S(x) = 0 for x < 1. We shall prove that
\[
\lim_{n\to\infty} \frac{S(n)}{n} = \frac{6}{\pi^2}.
\]
Do you see why this formula makes precise the statement “The probability that a randomly chosen natural number is square free equals $6/\pi^2$”?
(i) For any real number x > 0 and n ∈ N, define
\[
A(x, n) := \{ k \in \mathbb{N} \,;\, 1 \le k \le x \text{ and } n^2 \text{ is the largest square that divides } k \}.
\]
Note that A(x, n) = ∅ for $n^2 > x$. Prove that A(x, 1) consists of all square-free numbers ≤ x, and also prove that
\[
\{k \in \mathbb{N} \,;\, 1 \le k \le x\} = \bigcup_{n=1}^\infty A(x, n).
\]
(ii) Show that there is a bijection between A(x, n) and $A(x/n^2, 1)$.
(iii) Show that for any x > 0, we have
\[
\lfloor x \rfloor = \sum_{n=1}^\infty S\Big(\frac{x}{n^2}\Big).
\]
(iv) Finally, prove that $\lim_{x\to\infty} S(x)/x = 6/\pi^2$, which in particular proves our result.
11. (Relatively prime numbers; for different proofs, see [122, p. 337] and [95, p. 268]) Define R : (0, ∞) → R by
\[
R(x) := \#\{(k, \ell) \in \mathbb{N} \times \mathbb{N} \,;\, 1 \le k, \ell \le x \text{ and } k \text{ and } \ell \text{ are relatively prime}\};
\]
(ii) Show that there is a bijection between A(x, n) and A(x/n, 1).
(iii) Show that for any x > 0, we have
\[
\lfloor x \rfloor^2 = \sum_{n=1}^\infty R\Big(\frac{x}{n}\Big).
\]
(iv) Finally, prove that $\lim_{x\to\infty} R(x)/x^2 = 6/\pi^2$, which in particular proves our result.
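Since
\[
\sum_{n=1}^\infty \sum_{k=1}^\infty \Big| \frac{z^2}{n^2} \Big|^k = \sum_{n=1}^\infty \sum_{k=1}^\infty \Big( \frac{|z|^2}{n^2} \Big)^k = \frac{1}{2}\big(1 - \pi|z| \cot \pi|z|\big) < \infty,
\]
by Cauchy’s double series theorem, we have
\[
\pi z \cot \pi z = 1 - 2 \sum_{n=1}^\infty \sum_{k=1}^\infty \Big( \frac{z^2}{n^2} \Big)^k = 1 - 2 \sum_{k=1}^\infty \Big( \sum_{n=1}^\infty \frac{1}{n^{2k}} \Big) z^{2k}. \tag{7.35}
\]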
On the other hand, we recall from Section 6.8 that $z \cot z = \sum_{k=0}^\infty (-1)^k \frac{2^{2k} B_{2k}}{(2k)!} z^{2k}$ (for |z| small), where the $B_{2k}$’s are the Bernoulli numbers. Replacing z with πz, we get
\[
\pi z \cot \pi z = 1 + \sum_{k=1}^\infty (-1)^k \frac{2^{2k} B_{2k}}{(2k)!}\, \pi^{2k} z^{2k}.
\]
Comparing this equation with (7.35) and using the identity theorem, we see that
\[
-2 \sum_{n=1}^\infty \frac{1}{n^{2k}} = (-1)^k \frac{2^{2k} B_{2k}}{(2k)!}\, \pi^{2k}, \qquad k = 1, 2, 3, \ldots.
\]
Using the known values of the Bernoulli numbers found in Section 6.8, setting k = 1, 2, 3, we get, in particular, our eleventh proof of Euler’s formula for $\pi^2/6$:
\[
\frac{\pi^2}{6} = \sum_{n=1}^\infty \frac{1}{n^2} \ \text{(Euler’s sum, Proof XI)}, \qquad \frac{\pi^4}{90} = \sum_{n=1}^\infty \frac{1}{n^4}, \qquad \frac{\pi^6}{945} = \sum_{n=1}^\infty \frac{1}{n^6}.
\]
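The Bernoulli-number formula can be tried out on a computer. The Python sketch below hard-codes $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$ and rearranges the identity above into $\zeta(2k) = (-1)^{k-1} (2\pi)^{2k} B_{2k} / (2\,(2k)!)$; the names `BERNOULLI`, `zeta_even` and `zeta_partial` are ours.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, B_6 (standard values).
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}

def zeta_even(k):
    """zeta(2k) computed from the Bernoulli number B_{2k}."""
    b = float(BERNOULLI[2 * k])
    return (-1) ** (k - 1) * (2 * math.pi) ** (2 * k) * b / (2 * math.factorial(2 * k))

def zeta_partial(s, terms=100_000):
    """Partial sum of sum 1/n^s, for comparison."""
    return sum(1.0 / n**s for n in range(1, terms + 1))

for k in (1, 2, 3):
    print(2 * k, zeta_even(k), zeta_partial(2 * k))
```

For k = 1, 2, 3 the first column of values should reproduce $\pi^2/6$, $\pi^4/90$ and $\pi^6/945$.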
Using (7.36), we can derive many other pretty formulas. First, in Theorem 7.14 we proved that
\[
\sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^z} = (1 - 2^{1-z})\,\zeta(z), \qquad z > 1.
\]
In particular, setting z = 2k, we find that for k = 1, 2, 3, . . .,
\[
\eta(2k) = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n^{2k}} = (-1)^{k-1} \big(1 - 2^{1-2k}\big) \frac{(2\pi)^{2k} B_{2k}}{2\,(2k)!}; \tag{7.37}
\]
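what formulas do you get when you set k = 1, 2? Second, recall from Theorem 7.12 that
\[
\sum_{n=1}^\infty \frac{1}{n^z} = \prod \frac{p^z}{p^z - 1} = \frac{2^z}{2^z - 1} \cdot \frac{3^z}{3^z - 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z - 1} \cdots \tag{7.38}
\]
\[
\frac{\pi^2}{6} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdot \frac{7^2}{7^2 - 1} \cdot \frac{11^2}{11^2 - 1} \cdots \tag{7.39}
\]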
7.7.2. Euler numbers and evaluating sums. We now derive a formula for the alternating sum of the odd natural numbers to odd powers:
\[
1 - \frac{1}{3^{2k+1}} + \frac{1}{5^{2k+1}} - \frac{1}{7^{2k+1}} + \frac{1}{9^{2k+1}} - + \cdots, \qquad k = 0, 1, 2, 3, \ldots.
\]
First try: To this end, let |z| < 1 and recall from Section 7.4 that
\[
\frac{\pi}{4 \cos \frac{\pi z}{2}} = \frac{1}{1^2 - z^2} - \frac{3}{3^2 - z^2} + \frac{5}{5^2 - z^2} - + \cdots = \sum_{n=0}^\infty (-1)^n \frac{(2n+1)}{(2n+1)^2 - z^2}. \tag{7.41}
\]
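Expanding as a geometric series, observe that
\[
\frac{(2n+1)}{(2n+1)^2 - z^2} = \frac{1}{(2n+1)} \cdot \frac{1}{1 - \frac{z^2}{(2n+1)^2}} = \sum_{k=0}^\infty \frac{z^{2k}}{(2n+1)^{2k+1}}. \tag{7.42}
\]
Thus,
\[
\frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^\infty \sum_{k=0}^\infty (-1)^n \frac{z^{2k}}{(2n+1)^{2k+1}}. \tag{7.43}
\]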
Just as we did in proving (7.35), we shall try to use Cauchy’s double series theorem on this sum ... however, observe that
\[
\sum_{n=0}^\infty \sum_{k=0}^\infty \bigg| (-1)^n \frac{z^{2k}}{(2n+1)^{2k+1}} \bigg| = \sum_{n=0}^\infty \sum_{k=0}^\infty \frac{|z|^{2k}}{(2n+1)^{2k+1}} = \sum_{n=0}^\infty \frac{(2n+1)}{(2n+1)^2 - |z|^2},
\]
which diverges (because this series behaves like $\sum \frac{1}{2n+1} = \infty$)! Therefore, we cannot apply Cauchy’s double series theorem.
7.7. F SOME OF THE MOST BEAUTIFUL FORMULÆ IN THE WORLD IV 385
Second try: Let us start fresh from scratch. This time, we break up (7.41) into sums over n even and n odd (just consider the sums with n replaced by 2n and also by 2n + 1):
\[
\frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{n=0}^\infty \bigg( \frac{(4n+1)}{(4n+1)^2 - z^2} - \frac{(4n+3)}{(4n+3)^2 - z^2} \bigg).
\]
Let |z| < 1. Then writing $\frac{(4n+1)}{(4n+1)^2 - z^2}$ and $\frac{(4n+3)}{(4n+3)^2 - z^2}$ as geometric series (just as we did in (7.42)) we see that
\begin{align*}
\frac{\pi}{4 \cos \frac{\pi z}{2}}
&= \sum_{n=0}^\infty \sum_{k=0}^\infty \bigg( \frac{z^{2k}}{(4n+1)^{2k+1}} - \frac{z^{2k}}{(4n+3)^{2k+1}} \bigg) \\
&= \sum_{n=0}^\infty \sum_{k=0}^\infty \bigg( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \bigg) z^{2k}. \tag{7.44}
\end{align*}
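We can now use Cauchy’s double series theorem on this sum because
\begin{align*}
\sum_{n=0}^\infty \sum_{k=0}^\infty \bigg| \bigg( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \bigg) z^{2k} \bigg|
&= \sum_{n=0}^\infty \sum_{k=0}^\infty \bigg( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \bigg) |z|^{2k} \\
&= \frac{\pi}{4 \cos \frac{\pi |z|}{2}} < \infty,
\end{align*}
where we used (7.44) with z replaced by |z|. Thus, by Cauchy’s double series theorem, we have
\[
\frac{\pi}{4 \cos \frac{\pi z}{2}} = \sum_{k=0}^\infty \sum_{n=0}^\infty \bigg( \frac{1}{(4n+1)^{2k+1}} - \frac{1}{(4n+3)^{2k+1}} \bigg) z^{2k}.
\]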
thus, we could in fact interchange orders in (7.43), but to justify it with complete
mathematical rigor, we needed a little bit of mathematical gymnastics.
Now recall from Section 6.8 that
\[
\frac{1}{\cos z} = \sec z = \sum_{k=0}^\infty (-1)^k \frac{E_{2k}}{(2k)!}\, z^{2k},
\]
where the $E_{2k}$’s are the Euler numbers. Replacing z with πz/2 and multiplying by π/4, we get
\[
\frac{\pi}{4 \cos \frac{\pi z}{2}} = \frac{\pi}{4} \sum_{k=0}^\infty (-1)^k \frac{E_{2k}}{(2k)!} \Big(\frac{\pi}{2}\Big)^{2k} z^{2k}.
\]
Comparing this equation with (7.45) and using the identity theorem, we conclude that for k = 0, 1, 2, 3, . . .,
\[
\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^{2k+1}} = (-1)^k \frac{E_{2k}}{2\,(2k)!} \Big(\frac{\pi}{2}\Big)^{2k+1}. \tag{7.46}
\]
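Formula (7.46) can be checked numerically for small k. The Python sketch below assumes the sign convention implied by the secant series above, namely $E_0 = 1$, $E_2 = -1$, $E_4 = 5$; the names `EULER`, `closed_form` and `partial_sum` are our own.

```python
import math

# Euler numbers in the convention sec z = sum (-1)^k E_{2k} z^{2k} / (2k)!
EULER = {0: 1, 2: -1, 4: 5}

def closed_form(k):
    """Right-hand side of (7.46)."""
    e = EULER[2 * k]
    return (-1) ** k * e / (2 * math.factorial(2 * k)) * (math.pi / 2) ** (2 * k + 1)

def partial_sum(k, terms=100_000):
    """Partial sum of sum (-1)^n / (2n+1)^(2k+1)."""
    return sum((-1) ** n / (2 * n + 1) ** (2 * k + 1) for n in range(terms))

for k in (0, 1, 2):
    print(k, closed_form(k), partial_sum(k))
```

With these values, k = 0 gives π/4, k = 1 gives π³/32, and k = 2 gives 5π⁵/1536.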
where the product is over odd primes (all primes except 2) and where the ± signs in the denominators depend on whether the prime is of the form 4k + 3 (+ sign) or 4k + 1 (− sign), where k = 0, 1, 2, . . ..
Since the proof of this theorem is similar to that of Theorem 7.12, we shall leave
the proof of this theorem to the interested reader; see Problem 5. In particular,
setting z = 1, we get
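\[
\frac{\pi}{4} = \frac{3}{4} \cdot \frac{5}{4} \cdot \frac{7}{8} \cdot \frac{11}{12} \cdot \frac{13}{12} \cdot \frac{17}{16} \cdot \frac{19}{20} \cdot \frac{23}{24} \cdots. \tag{7.47}
\]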
The numerators of the fractions on the right are the odd prime numbers and the
denominators are even numbers divisible by four and differing from the numerators
by one. In (7.39), we found that
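\[
\frac{\pi^2}{6} = \frac{2^2}{2^2 - 1} \cdot \frac{3^2}{3^2 - 1} \cdot \frac{5^2}{5^2 - 1} \cdots = \frac{4}{3} \cdot \frac{3 \cdot 3}{2 \cdot 4} \cdot \frac{5 \cdot 5}{4 \cdot 6} \cdot \frac{7 \cdot 7}{6 \cdot 8} \cdot \frac{11 \cdot 11}{10 \cdot 12} \cdot \frac{13 \cdot 13}{12 \cdot 14} \cdots.
\]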
Dividing this expression by (7.47), and cancelling like terms, we obtain
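\[
\frac{4\pi}{6} = \frac{\pi^2/6}{\pi/4} = \frac{4}{3} \cdot \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdots.
\]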
Multiplying both sides by 3/4, we get another one of Euler’s famous formulas:
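\[
\frac{\pi}{2} = \frac{3}{2} \cdot \frac{5}{6} \cdot \frac{7}{6} \cdot \frac{11}{10} \cdot \frac{13}{14} \cdot \frac{17}{18} \cdot \frac{19}{18} \cdot \frac{23}{22} \cdots. \tag{7.48}
\]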
The numerators of the fractions are the odd prime numbers and the denominators
are even numbers not divisible by four and differing from the numerators by one.
(7.47) and (7.48) are two of my favorite infinite product expansions for π.
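Both products can be explored numerically. The Python sketch below multiplies the factors of (7.47) and (7.48) over the odd primes below a cutoff; the helper names and the cutoff 20000 are our own choices. These products converge slowly, so expect agreement with π/4 and π/2 to only a few decimal places.

```python
import math

def odd_primes_below(limit):
    """Odd primes p < limit, by the sieve of Eratosthenes."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for d in range(2, int(limit**0.5) + 1):
        if sieve[d]:
            for m in range(d * d, limit, d):
                sieve[m] = False
    return [p for p in range(3, limit) if sieve[p]]

def euler_products(limit):
    """Partial products of (7.47) and (7.48) over odd primes below limit."""
    quarter, half = 1.0, 1.0
    for p in odd_primes_below(limit):
        if p % 4 == 1:
            quarter *= p / (p - 1)   # denominator divisible by four
            half *= p / (p + 1)      # denominator not divisible by four
        else:
            quarter *= p / (p + 1)
            half *= p / (p - 1)
    return quarter, half

quarter, half = euler_products(20_000)
print(quarter, math.pi / 4)
print(half, math.pi / 2)
```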
The sequences $\{a_n\}$ and $\{b_n\}$ look so similar and so do $\{\frac{n}{a_n}\}$ and $\{\frac{b_n^2}{n}\}$, yet they
generate very different numbers. Seeing such a connection between e and π, which
a priori are very different, makes you wonder if there isn’t someone behind this
“coincidence.”
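To prove the formula for e, let us define a sequence $\{s_n\}$ by $s_n = a_n/n$. Then $s_1 = a_1/1 = 0$ and $s_2 = a_2/2 = 1/2$. Observe that for n ≥ 2, we have
\begin{align*}
s_{n+1} - s_n &= \frac{a_{n+1}}{n+1} - \frac{a_n}{n} = \frac{1}{n+1} \Big( a_{n+1} - \frac{n+1}{n}\, a_n \Big) \\
&= \frac{1}{n+1} \Big( a_n + \frac{1}{n-1}\, a_{n-1} - \Big(1 + \frac{1}{n}\Big) a_n \Big) \\
&= \frac{1}{n+1} \Big( \frac{a_{n-1}}{n-1} - \frac{a_n}{n} \Big) \\
&= \frac{-1}{n+1}\, (s_n - s_{n-1}).
\end{align*}
Using induction we see that
\begin{align*}
s_{n+1} - s_n &= \frac{-1}{n+1}\, (s_n - s_{n-1}) = \frac{-1}{n+1} \cdot \frac{-1}{n}\, (s_{n-1} - s_{n-2}) \\
&= \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1}\, (s_{n-2} - s_{n-3}) = \cdots \text{etc.} \\
&= \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1} \cdots \frac{-1}{3}\, (s_2 - s_1) \\
&= \frac{-1}{n+1} \cdot \frac{-1}{n} \cdot \frac{-1}{n-1} \cdots \frac{-1}{3} \cdot \frac{1}{2} = \frac{(-1)^{n-1}}{(n+1)!} = \frac{(-1)^{n+1}}{(n+1)!}.
\end{align*}
Thus, writing as a telescoping sum, we obtain
\[
s_n = s_1 + \sum_{k=2}^n (s_k - s_{k-1}) = 0 + \sum_{k=2}^n \frac{(-1)^k}{k!} = \sum_{k=0}^n \frac{(-1)^k}{k!},
\]
where the last equality holds because the k = 0 and k = 1 terms cancel. This is exactly the n-th partial sum for the series expansion of $e^{-1}$. It follows that $s_n \to e^{-1}$ and so,
\[
\lim_{n\to\infty} \frac{n}{a_n} = \lim_{n\to\infty} \frac{1}{s_n} = \frac{1}{e^{-1}} = e,
\]
as we claimed. The limit for π in (7.49) will be left to you (see Problem 2).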
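The argument can be replayed with exact arithmetic. The Python sketch below assumes the recurrence $a_{n+1} = a_n + a_{n-1}/(n-1)$ with $a_1 = 0$, $a_2 = 1$, which is what the computation of $s_{n+1} - s_n$ above uses; the function name `a_sequence` is ours.

```python
import math
from fractions import Fraction

def a_sequence(count):
    """Return a list a with a[1..count], assuming a_{n+1} = a_n + a_{n-1}/(n-1)."""
    a = [None, Fraction(0), Fraction(1)]      # a[1] = 0, a[2] = 1
    for n in range(2, count):
        a.append(a[n] + a[n - 1] / (n - 1))   # this is a[n+1]
    return a

a = a_sequence(20)

def s(n):
    return a[n] / n

def exp_partial(n):
    """n-th partial sum of the series for 1/e, as an exact fraction."""
    return sum(Fraction((-1) ** k, math.factorial(k)) for k in range(n + 1))

print(float(20 / a[20]), math.e)
```

In exact arithmetic $s_n$ matches the partial sums of $e^{-1}$, and already $20/a_{20}$ agrees with e to machine precision.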
Exercises 7.7.
1. In this problem we derive other neat formulas:
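(1) Dividing (7.40) by $\pi^2/6$, prove that
\[
\frac{5}{2} = \frac{2^2 + 1}{2^2 - 1} \cdot \frac{3^2 + 1}{3^2 - 1} \cdot \frac{5^2 + 1}{5^2 - 1} \cdot \frac{7^2 + 1}{7^2 - 1} \cdot \frac{11^2 + 1}{11^2 - 1} \cdots,
\]
quite a neat expression for 2.5.
(2) Dividing (7.48) by (7.47), prove that
\[
2 = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{9} \cdot \frac{10}{9} \cdot \frac{12}{11} \cdots,
\]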
quite a neat expression for 2. The fractions on the right are formed as follows:
Given an odd prime 3, 5, 7, . . ., we take the pair of even numbers immediately above
and below the prime, divide them by two, then put the resulting even number as
the numerator and the odd number as the denominator.
2. In this problem, we prove the limit for π in (7.49).
(i) Define $t_n = b_{n+1}/b_n$ for n = 2, 3, 4, . . .. Prove that (for n = 2, 3, 4, . . .), $t_{n+1} = 1/n + 1/t_n$ and then,
\[
t_n = \begin{cases} 1 & n \text{ even} \\ \dfrac{n}{n-1} & n \text{ odd.} \end{cases}
\]
(ii) Prove that $b_n^2 = t_2^2 \cdot t_3^2 \cdot t_4^2 \cdots t_{n-1}^2$, then using Wallis’ formula, derive the limit for π in (7.49).
3. From Problem 7 in Exercises 7.6, prove that
\[
\frac{2\,(2n)!\,(1 - 2^{-2n})}{(2\pi)^{2n} (1 - 2^{1-2n})} < |B_{2n}| < \frac{2\,(2n)!}{(2\pi)^{2n} (1 - 2^{1-2n})}.
\]
This estimate shows that the Bernoulli numbers grow incredibly fast as n → ∞.
4. (Radius of convergence) In this problem we (finally) determine the radii of convergence of
\[
z \cot z = \sum_{n=0}^\infty (-1)^n \frac{2^{2n} B_{2n}}{(2n)!}\, z^{2n}, \qquad \tan z = \sum_{n=1}^\infty (-1)^{n-1} \frac{2^{2n} (2^{2n} - 1) B_{2n}}{(2n)!}\, z^{2n-1}.
\]
(a) Let $a_{2n} = (-1)^n \frac{2^{2n} B_{2n}}{(2n)!}$. Prove that
\[
\lim_{n\to\infty} |a_{2n}|^{1/2n} = \lim_{n\to\infty} \frac{1}{\pi} \cdot 2^{1/2n} \cdot \zeta(2n)^{1/2n} = \frac{1}{\pi}.
\]
Conclude that the radius of convergence of z cot z is π.
(b) Using a similar argument, show that the radius of convergence of tan z is π/2.
5. In this problem, we prove Theorem 7.15
(i) Let us call an odd number “type I” if it is of the form 4k + 1 for some k = 0, 1, . . .
and “type II” if it is of the form 4k + 3 for some k = 0, 1, . . .. Prove that every
odd number is either of type I or type II.
(ii) Prove that type I × type I = type I, type I × type II = type II, and type II ×
type II = type I.
(iii) Let a, b, . . . , c ∈ N be odd. Prove that if there is an odd number of type II integers
amongst a, b, . . . , c, then a · b · · · c is of type II, otherwise, a · b · · · c is type I.
(iv) Show that
\[
\sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^z} = \sum_{n=0}^\infty \frac{1}{(4n+1)^z} - \sum_{n=0}^\infty \frac{1}{(4n+3)^z},
\]
a sum of type I and type II natural numbers!
(v) Let z ∈ C with Re z ≥ r > 1, let 1 < N ∈ N, and let 3 < 5 < · · · < m < 2N + 1 be the odd prime numbers less than 2N + 1. In a similar manner as in the proof of Theorem 7.12, show that
\begin{align*}
\bigg| \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)^z} - \frac{3^z}{3^z + 1} \cdot \frac{5^z}{5^z - 1} \cdot \frac{7^z}{7^z + 1} \cdot \frac{11^z}{11^z + 1} \cdots \frac{m^z}{m^z \pm 1} \bigg|
&\le \sum_{n=N}^\infty \bigg| \frac{1}{(2n+1)^z} \bigg| \\
&\le \sum_{n=N}^\infty \frac{1}{(2n+1)^r},
\end{align*}
where the + signs in the product are for type II odd primes and the − signs for type I odd primes. Now finish the proof of Theorem 7.15.
CHAPTER 8
From time immemorial, the infinite has stirred men’s emotions more than
any other question. Hardly any other idea has stimulated the mind so
fruitfully . . . In a certain sense, mathematical analysis is a symphony of
the infinite.
David Hilbert (1862-1943) “On the infinite” [24].
We dabbled a little into the theory of continued fractions (that is, fractions that continue on and on and on . . .) way back in the exercises of Section 3.4. In this chapter we concentrate on this fascinating subject. We start in Section 8.1 by showing that such fractions occur very naturally in long division and we give their basic definitions. In Section 8.2, we prove some pretty dramatic formulas (this is one reason continued fractions are so fascinating, at least to me). For example, we’ll show that 4/π and π can be written as the continued fractions:
\[
\frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}, \qquad
\pi = 3 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cfrac{7^2}{6 + \ddots}}}}.
\]
The continued fraction on the left is due to Lord Brouncker (and is the first contin-
ued fraction ever recorded) and the one on the right is due to Euler. If you think
these π formulas are cool, we’ll derive the following formulas for e as well:
\[
e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}}
= 1 + \cfrac{1}{0 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \ddots}}}}}}}.
\]
We’ll prove the formula on the left in Section 8.2, but you’ll have to wait for the formula on the right until Section 8.7. In Section 8.3, we discuss elementary properties of continued fractions. In this section we also discuss how a Greek mathematician, Diophantus of Alexandria (≈ 200–284 A.D.), can help you if you’re stranded on an island with guys you can’t trust and a monkey with a healthy appetite! In Section 8.4 we study the convergence properties of continued fractions.
390 8. INFINITE CONTINUED FRACTIONS
Recall from our discussion on the amazing number π and its computations from ancient times (see Section 4.10) that throughout the years, the following approximations to π came up: 3, 22/7, 333/106, and 355/113. Did you ever wonder why these particular numbers occur? Also, did you ever wonder why our calendar is constructed the way it is (e.g. why leap years occur)? Finally, did you ever wonder why a piano has twelve keys (within an octave)? In Sections 8.5 and 8.6 you’ll find out that these mysteries have to do with continued fractions! In Section 8.8 we study special types of continued fractions having to do with square roots and in Section 8.9 we learn why Archimedes needed around $8 \times 10^{206544}$ cattle in order to “have abundant of knowledge in this science [mathematics]”!
In the very last section, Section 8.10, we look at continued fractions and tran-
scendental numbers.
Chapter 8 objectives: The student will be able to . . .
• define a continued fraction, state the Wallis-Euler and fundamental recurrence
relations, and apply the continued fraction convergence theorem (Theorem 8.14).
• compute the canonical continued fraction of a given real number.
• understand the relationship between convergents of a simple continued fraction
and best approximations, and the relationship between periodic simple contin-
ued fractions and quadratic irrationals.
• solve simple diophantine equations (of linear and Pell type).
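\[
\frac{157}{68} = 2 + \cfrac{1}{\frac{68}{21}}.
\]
Since we can further divide $\frac{68}{21} = 3 + \frac{5}{21} = 3 + \frac{1}{21/5}$, we can write $\frac{157}{68}$ in the somewhat fancy way
\[
\frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{\frac{21}{5}}}.
\]
Since $\frac{21}{5} = 4 + \frac{1}{5}$, we can write
\[
\frac{157}{68} = 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}}. \tag{8.1}
\]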
Since 5 is now a whole number, our repeated division process stops.
8.1. INTRODUCTION TO CONTINUED FRACTIONS 391
The expression on the right in (8.1) is called a finite simple continued fraction. There are many ways to denote the right-hand side, but we shall stick with the following two:
\[
\langle 2; 3, 4, 5 \rangle \quad \text{or} \quad 2 + \frac{1}{3 +}\,\frac{1}{4 +}\,\frac{1}{5} \quad \text{represent} \quad 2 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{5}}}.
\]
Thus, continued fractions (that is, fractions that “continue on”) arise naturally out of writing rational numbers in a somewhat fancy way by repeated divisions. Of course, 157 and 68 were not special; by repeated divisions one can take any two integers a and b with b ≠ 0 and write a/b as a finite simple continued fraction; see Problem 2. In Section 8.4, we shall prove that any real number, not necessarily rational, can be expressed as a simple (possibly infinite) continued fraction.
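The repeated-division process is just the Euclidean algorithm, and it is short to code. Here is a minimal Python sketch; the function names `simple_continued_fraction` and `evaluate` are our own, and exact fractions are used so that nothing is lost to rounding.

```python
from fractions import Fraction

def simple_continued_fraction(a, b):
    """Terms [a0, a1, ..., an] of the finite simple continued fraction of a/b."""
    if b == 0:
        raise ValueError("denominator must be nonzero")
    if b < 0:
        a, b = -a, -b
    terms = []
    while b != 0:
        q, r = divmod(a, b)   # a = q*b + r with 0 <= r < b
        terms.append(q)
        a, b = b, r
    return terms

def evaluate(terms):
    """Collapse <a0; a1, ..., an> from the bottom up into a single fraction."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value

terms = simple_continued_fraction(157, 68)
print(terms, evaluate(terms))   # [2, 3, 4, 5] 157/68
```

Because `divmod` always returns a remainder with 0 ≤ r < b, the same routine handles negative numerators: only the first term can be negative.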
However, we shall define what this means in a moment, but before doing so, here’s
another neat example:
Example 8.3. Consider the slightly modified formula $x^2 - x - 1 = 0$. Then $\Phi = \frac{1 + \sqrt{5}}{2}$, called the golden ratio, is the only positive solution. We can rewrite $\Phi^2 - \Phi - 1 = 0$ as $\Phi = 1 + \frac{1}{\Phi}$. Replacing Φ in the denominator with $\Phi = 1 + \frac{1}{\Phi}$, we get
\[
\Phi = 1 + \cfrac{1}{1 + \cfrac{1}{\Phi}}.
\]
Repeating this substitution process “to infinity”, we can write
\[
\text{“ } \Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ddots}}}} \text{ ,”} \tag{8.2}
\]
quite a beautiful expression (cf. Problem 6 in Exercises 3.4)! As a side remark, there are many false rumors concerning the golden ratio; see [146] for the rundown.
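You can watch the substitution process settle down numerically. The Python sketch below starts from 1 and applies x ↦ 1 + 1/x a fixed number of times, which produces the truncations of (8.2); the name `golden_convergent` and the depth 40 are our choices. The computation only illustrates convergence and does not prove it.

```python
import math

def golden_convergent(depth):
    """Truncation of 1 + 1/(1 + 1/(1 + ...)) with `depth` nested fractions."""
    value = 1.0
    for _ in range(depth):
        value = 1.0 + 1.0 / value
    return value

phi = (1 + math.sqrt(5)) / 2
for depth in (1, 2, 5, 10, 40):
    print(depth, golden_convergent(depth), phi)
```

The first few truncations are 2, 3/2, 5/3, 8/5, ratios of consecutive Fibonacci numbers.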
8.1.3. Basic definitions for continued fractions. A general finite continued fraction can be written as
\[
a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cfrac{\ddots}{a_{n-1} + \cfrac{b_n}{a_n}}}}} \tag{8.3}
\]
where the ak ’s and bk ’s are real numbers. (Of course, we are implicitly assuming
that these fractions are all well-defined, e.g. no divisions by zero are allowed. Also,
when you simplify this big fraction by combining fractions, you need to go from the
bottom up.) Notice that if bm = 0 for some m, then
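\[
a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{b_3}{a_3 + \cfrac{\ddots}{a_{n-1} + \cfrac{b_n}{a_n}}}}}
= a_0 + \cfrac{b_1}{a_1 + \cfrac{b_2}{a_2 + \cfrac{\ddots}{a_{m-2} + \cfrac{b_{m-1}}{a_{m-1}}}}}, \tag{8.4}
\]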
since the bm = 0 will zero out everything below it. The continued fraction is called
simple if all the bk ’s are 1 and the ak ’s are integers with ak positive for k ≥ 1.
Instead of writing the continued fraction as we did above, which takes up a lot of
space, we shall shorten it to:
\[
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_n}{a_n}.
\]
provided that the right-hand side exists. In Section 8.4 we shall prove that any
simple continued fraction converges; in particular, we’ll prove that (8.2) does hold
true:
\[
\Phi = 1 + \frac{1}{1 +}\,\frac{1}{1 +}\,\frac{1}{1 +} \cdots.
\]
In the case when there is some bm term that vanishes, then convergence of (8.5) is
easy because (see (8.4)) for n ≥ m, we have cn = cm−1 . Hence, in this case
\[
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_{m-1}}{a_{m-1}}
\]
converges automatically (such a continued fraction is said to terminate or be
finite). However, general convergence issues are not so straightforward. We shall
deal with the subtleties of convergence in Section 8.4.
Exercises 8.1.
1. Expand the following fractions into finite simple continued fractions:
(a) $\frac{7}{11}$, (b) $-\frac{11}{7}$, (c) $\frac{3}{13}$, (d) $\frac{13}{3}$, (e) $-\frac{42}{31}$.
2. Prove that a real number can be written as a finite simple continued fraction if and
only if it is rational. Suggestion: for the “if” statement, use the division algorithm
(see Theorem 2.15): For a, b ∈ Z with b > 0, we have a = qb + r where q, r ∈ Z with
0 ≤ r < b; if a, b are both nonnegative, then so is q.
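3. Let $\xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_n}{a_n} \ne 0$. Prove that
\[
\frac{1}{\xi} = \frac{1}{a_0 +}\,\frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +} \cdots \frac{b_n}{a_n}.
\]
In particular, if $\xi = \langle a_0; a_1, \ldots, a_n \rangle \ne 0$, show that $\frac{1}{\xi} = \langle 0; a_0, a_1, a_2, \ldots, a_n \rangle$.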
in the sense when the left-hand side is defined, so is the right-hand side and this
equality holds. In particular, if the limit as n → ∞ of the left-hand side exists, then
the limit of the right-hand side also exists, and
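\[
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +} \cdots \frac{b_n}{a_n +} \cdots = a_0 + \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2 +} \cdots \frac{\rho_{n-1} \rho_n b_n}{\rho_n a_n +} \cdots.
\]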
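8.2.2. Two stupendous series and continued fractions identities. Let $\alpha_1, \alpha_2, \alpha_3, \ldots$ be any real numbers with $\alpha_k \ne 0$ and $\alpha_k \ne \alpha_{k-1}$ for all k. Observe that
\[
\frac{1}{\alpha_1} - \frac{1}{\alpha_2} = \frac{\alpha_2 - \alpha_1}{\alpha_1 \alpha_2} = \cfrac{1}{\frac{\alpha_1 \alpha_2}{\alpha_2 - \alpha_1}}.
\]
Since
\[
\frac{\alpha_1 \alpha_2}{\alpha_2 - \alpha_1} = \frac{\alpha_1 (\alpha_2 - \alpha_1) + \alpha_1^2}{\alpha_2 - \alpha_1} = \alpha_1 + \frac{\alpha_1^2}{\alpha_2 - \alpha_1},
\]
we get
\[
\frac{1}{\alpha_1} - \frac{1}{\alpha_2} = \cfrac{1}{\alpha_1 + \cfrac{\alpha_1^2}{\alpha_2 - \alpha_1}}.
\]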
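This is now a sum of n terms. Thus, we can apply the induction hypothesis to conclude that
\[
\sum_{k=1}^{n+1} \frac{(-1)^{k-1}}{\alpha_k} = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +} \cdots \frac{\alpha_{n-1}^2}{\frac{\alpha_n \alpha_{n+1}}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1}}. \tag{8.7}
\]
Since
\begin{align*}
\frac{\alpha_n \alpha_{n+1}}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1}
&= \frac{\alpha_n (\alpha_{n+1} - \alpha_n) + \alpha_n^2}{\alpha_{n+1} - \alpha_n} - \alpha_{n-1} \\
&= \alpha_n - \alpha_{n-1} + \frac{\alpha_n^2}{\alpha_{n+1} - \alpha_n},
\end{align*}
putting this into (8.7) gives
\[
\sum_{k=1}^{n+1} \frac{(-1)^{k-1}}{\alpha_k} = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +} \cdots \frac{\alpha_{n-1}^2}{\alpha_n - \alpha_{n-1} + \frac{\alpha_n^2}{\alpha_{n+1} - \alpha_n}}.
\]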
We can continue by induction in much the same manner as we did in the proof of
Theorem 8.2 to obtain the following result.
8.2. F SOME OF THE MOST BEAUTIFUL FORMULÆ IN THE WORLD V 397
8.2.3. Continued fractions for arctan and π. We now use the identities
just learned to derive some remarkable continued fractions.
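Example 8.5. First, since
\[
\frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots,
\]
using the limit expression (8.6) in Theorem 8.2:
\[
\frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \frac{1}{\alpha_3} - \frac{1}{\alpha_4} + \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\,\frac{\alpha_2^2}{\alpha_3 - \alpha_2 +}\,\frac{\alpha_3^2}{\alpha_4 - \alpha_3 +} \cdots,
\]
we can write
\[
\frac{\pi}{4} = \cfrac{1}{1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}}.
\]
Inverting both sides (see Problem 3 in Exercises 8.1), we obtain the incredible expansion:
\[
\frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}. \tag{8.9}
\]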
This continued fraction was the very first continued fraction ever recorded, and was written down without proof by Lord Brouncker (1620 – 1684), the first president of the Royal Society of London.
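Brouncker's fraction can be evaluated from the bottom up. The Python sketch below (the function name `brouncker_reciprocal` is ours) cuts the fraction for π/4 off after a given number of levels and, using exact fractions, compares the result with a partial sum of the series $1 - \frac13 + \frac15 - \cdots$. The match is what the series-to-fraction identity leads us to expect, and it also explains why the convergence is slow.

```python
import math
from fractions import Fraction

def brouncker_reciprocal(depth):
    """Value of 1/(1 + 1^2/(2 + 3^2/(2 + ... + (2*depth-1)^2/2)))."""
    tail = Fraction(2)                       # innermost denominator
    for k in range(depth, 1, -1):
        tail = 2 + Fraction((2 * k - 1) ** 2) / tail
    return 1 / (1 + Fraction(1) / tail)

def leibniz_partial(terms):
    """Partial sum 1 - 1/3 + 1/5 - ... with the given number of terms."""
    return sum(Fraction((-1) ** k, 2 * k + 1) for k in range(terms))

# The truncation at depth d equals the partial sum with d + 1 terms.
print(brouncker_reciprocal(3), leibniz_partial(4))      # both 76/105
print(float(1 / brouncker_reciprocal(200)), 4 / math.pi)
```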
Actually, we can derive (8.9) from a related expansion of the arctangent function, which is so neat that we shall derive it in two ways, using Theorem 8.2 then using Theorem 8.3.
Example 8.6. Recall that
\[
\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + (-1)^{n-1} \frac{x^{2n-1}}{2n-1} + \cdots.
\]
Setting $\alpha_1 = \frac{1}{x}$, $\alpha_2 = \frac{3}{x^3}$, $\alpha_3 = \frac{5}{x^5}$, and in general, $\alpha_n = \frac{2n-1}{x^{2n-1}}$ into the formula
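However, we can simplify this using the transformation rule from Theorem 8.1:
\[
\frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +} \cdots \frac{b_n}{a_n +} \cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2 +} \cdots \frac{\rho_{n-1} \rho_n b_n}{\rho_n a_n +} \cdots.
\]
(Here we drop the $a_0$ term from that theorem.) Let $\rho_1 = x$, $\rho_2 = x^3$, . . ., and in general, $\rho_n = x^{2n-1}$. Then,
\[
\frac{1}{\frac{1}{x} +}\,\frac{\frac{1}{x^2}}{\frac{3}{x^3} - \frac{1}{x} +}\,\frac{\frac{3^2}{x^6}}{\frac{5}{x^5} - \frac{3}{x^3} +}\,\frac{\frac{5^2}{x^{10}}}{\frac{7}{x^7} - \frac{5}{x^5} +} \cdots
= \frac{x}{1 +}\,\frac{x^2}{3 - x^2 +}\,\frac{3^2 x^2}{5 - 3x^2 +}\,\frac{5^2 x^2}{7 - 5x^2 +} \cdots.
\]
Thus,
\[
\arctan x = \frac{x}{1 +}\,\frac{x^2}{3 - x^2 +}\,\frac{3^2 x^2}{5 - 3x^2 +}\,\frac{5^2 x^2}{7 - 5x^2 +} \cdots,
\]
or in a fancier way:
\[
\arctan x = \cfrac{x}{1 + \cfrac{x^2}{(3 - x^2) + \cfrac{3^2 x^2}{(5 - 3x^2) + \cfrac{5^2 x^2}{(7 - 5x^2) + \ddots}}}}. \tag{8.10}
\]
In particular, setting x = 1 and inverting, we get Lord Brouncker’s formula:
\[
\frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \ddots}}}}.
\]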
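Example 8.7. We can also derive (8.10) using Theorem 8.3: Using once again that
\[
\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots + (-1)^{n-1} \frac{x^{2n-1}}{2n-1} + \cdots
\]
and setting $\alpha_1 = \frac{1}{x}$, $\alpha_2 = \frac{3}{x^2}$, $\alpha_3 = \frac{5}{3x^2}$, $\alpha_4 = \frac{7}{5x^2}$, · · · , $\alpha_n = \frac{2n-1}{(2n-3)x^2}$ for n ≥ 2, we obtain
\[
\arctan x = \frac{1}{\frac{1}{x} +}\,\frac{\frac{1}{x}}{\frac{3}{x^2} - 1 +}\,\frac{\frac{3}{x^2}}{\frac{5}{3x^2} - 1 +} \cdots \frac{\frac{2n-1}{(2n-3)x^2}}{\frac{2n+1}{(2n-1)x^2} - 1 +} \cdots.
\]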
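\[
\frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +} \cdots \frac{b_n}{a_n +} \cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1 \rho_2 b_2}{\rho_2 a_2 +} \cdots \frac{\rho_{n-1} \rho_n b_n}{\rho_n a_n +} \cdots.
\]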
Example 8.8. We leave the next two beauts to you! Applying Theorem 8.2
and/or Theorem 8.3 to Euler's sum $\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$, in Problem 2 we ask
you to derive the formula
\[
\frac{6}{\pi^2} = 0^2 + 1^2 - \cfrac{1^4}{1^2 + 2^2 - \cfrac{2^4}{2^2 + 3^2 - \cfrac{3^4}{3^2 + 4^2 - \cfrac{4^4}{4^2 + 5^2 - \ddots}}}}, \tag{8.11}
\]
which is, after inversion, the last formula on the front cover of this book.
\[
\frac{\pi}{2} = 1 + \cfrac{1}{1 + \cfrac{1\cdot 2}{1 + \cfrac{2\cdot 3}{1 + \cfrac{3\cdot 4}{1 + \ddots}}}}. \tag{8.12}
\]
Since
\[
\frac{\pi}{4} = \frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = 1 - \sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n+1},
\]
multiplying this expression by 4 and using the previous expression, we can write
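\[
\begin{aligned}
\pi &= 4 - 4\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n+1} = 3 + 1 - 4\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n+1} \\
&= 3 + \sum_{n=1}^{\infty}(-1)^{n-1}\left(\frac{1}{n} + \frac{1}{n+1}\right) - 4\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n+1} \\
&= 3 + \sum_{n=1}^{\infty}(-1)^{n-1}\left(\frac{1}{n} + \frac{1}{n+1} - \frac{4}{2n+1}\right) \\
&= 3 + 4\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n(2n+1)(2n+2)},
\end{aligned}
\]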
where we combined fractions in going from the third to the fourth line. We now apply
the limiting formula (8.6) from Theorem 8.2 with $\alpha_n = 2n(2n+1)(2n+2)$. Observe
that
\[
\begin{aligned}
\alpha_n - \alpha_{n-1} &= 2n(2n+1)(2n+2) - 2(n-1)(2n-1)(2n) \\
&= 4n\bigl[(2n+1)(n+1) - (n-1)(2n-1)\bigr] \\
&= 4n\bigl[2n^2 + 2n + n + 1 - (2n^2 - n - 2n + 1)\bigr] = 4n(6n) = 24n^2 .
\end{aligned}
\]
Now putting the αn ’s into the formula
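\[
\frac{1}{\alpha_1} - \frac{1}{\alpha_2} + \frac{1}{\alpha_3} - \frac{1}{\alpha_4} + \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1^2}{\alpha_2 - \alpha_1 +}\,\frac{\alpha_2^2}{\alpha_3 - \alpha_2 +}\,\frac{\alpha_3^2}{\alpha_4 - \alpha_3 +}\cdots,
\]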
we get
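\[
\begin{aligned}
4\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{2n(2n+1)(2n+2)} &= 4\cdot\frac{1}{2\cdot 3\cdot 4 +}\,\frac{(2\cdot 3\cdot 4)^2}{24\cdot 2^2 +}\,\frac{(4\cdot 5\cdot 6)^2}{24\cdot 3^2 +}\cdots \\
&= \frac{1}{2\cdot 3 +}\,\frac{(2\cdot 3)^2\cdot 4}{24\cdot 2^2 +}\,\frac{(4\cdot 5\cdot 6)^2}{24\cdot 3^2 +}\cdots .
\end{aligned}
\]
Hence,
\[
\pi = 3 + \frac{1}{6 +}\,\frac{(2\cdot 3)^2\cdot 4}{24\cdot 2^2 +}\,\frac{(4\cdot 5\cdot 6)^2}{24\cdot 3^2 +}\cdots\frac{\bigl(2(n-1)(2n-1)(2n)\bigr)^2}{24\cdot n^2 +}\cdots,
\]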
which is beautiful, but we can make this even more beautiful using the transforma-
tion rule from Theorem 8.1:
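\[
\frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\cdots = \frac{\rho_1 b_1}{\rho_1 a_1 +}\,\frac{\rho_1\rho_2 b_2}{\rho_2 a_2 +}\cdots\frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n +}\cdots .
\]
Setting $\rho_1 = 1$ and $\rho_n = \frac{1}{4n^2}$ for $n \ge 2$, we see that for $n \ge 3$ we have
\[
\frac{\rho_{n-1}\rho_n b_n}{\rho_n a_n} = \frac{\frac{1}{4(n-1)^2}\cdot\frac{1}{4n^2}\cdot\bigl(2(n-1)(2n-1)(2n)\bigr)^2}{\frac{1}{4n^2}\cdot 24\cdot n^2} = \frac{(2n-1)^2}{6};
\]
the same formula holds for $n = 2$ as you can check. Thus,
\[
\pi = 3 + \frac{1^2}{6 +}\,\frac{3^2}{6 +}\,\frac{5^2}{6 +}\,\frac{7^2}{6 +}\cdots\frac{(2n-1)^2}{6 +}\cdots
\]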
8.2. F SOME OF THE MOST BEAUTIFUL FORMULÆ IN THE WORLD V 401
\[
\pi = 3 + \cfrac{1^2}{6 + \cfrac{3^2}{6 + \cfrac{5^2}{6 + \cfrac{7^2}{6 + \ddots}}}}. \tag{8.13}
\]
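The derivation of (8.13) can be checked numerically. The sketch below (with illustrative names `pi_cf` and `pi_series` that are not from the text) evaluates the truncations of (8.13) exactly and compares them with the partial sums of $3 + 4\sum (-1)^{n-1}/\bigl(2n(2n+1)(2n+2)\bigr)$, from which the continued fraction was built.

```python
from fractions import Fraction
import math

def pi_cf(n):
    """Truncation 3 + 1^2/(6 + 3^2/(6 + ... + (2n-1)^2/6)), evaluated bottom-up."""
    tail = Fraction(6)
    for k in range(n, 1, -1):
        tail = 6 + Fraction((2 * k - 1) ** 2) / tail
    return 3 + 1 / tail

def pi_series(n):
    """Partial sum 3 + 4 * sum_{k=1}^{n} (-1)^(k-1) / (2k(2k+1)(2k+2))."""
    return 3 + 4 * sum(
        Fraction((-1) ** (k - 1), 2 * k * (2 * k + 1) * (2 * k + 2))
        for k in range(1, n + 1)
    )

for n in (1, 2, 5, 50):
    print(n, float(pi_cf(n)), pi_cf(n) == pi_series(n))
print(math.pi)
```

Because the series terms decay like $1/n^3$, a few dozen levels already give several correct decimal places, in contrast to Lord Brouncker's formula.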
8.2.5. Continued fractions and e. For our final beautiful example, we shall
compute a continued fraction expansion for e. We begin with
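\[
\frac{1}{e} = e^{-1} = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!} = 1 - \frac{1}{1} + \frac{1}{1\cdot 2} - \frac{1}{1\cdot 2\cdot 3} + \cdots,
\]
so
\[
\frac{e-1}{e} = 1 - \frac{1}{e} = \frac{1}{1} - \frac{1}{1\cdot 2} + \frac{1}{1\cdot 2\cdot 3} - \frac{1}{1\cdot 2\cdot 3\cdot 4} + \cdots .
\]
Thus, setting $\alpha_k = k$ into the formula (8.8):
\[
\frac{1}{\alpha_1} - \frac{1}{\alpha_1\alpha_2} + \frac{1}{\alpha_1\alpha_2\alpha_3} - \cdots = \frac{1}{\alpha_1 +}\,\frac{\alpha_1}{\alpha_2 - 1 +}\,\frac{\alpha_2}{\alpha_3 - 1 +}\cdots\frac{\alpha_{n-1}}{\alpha_n - 1 +}\cdots,
\]
we obtain
\[
\frac{e-1}{e} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}}.
\]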
We can make this into an expression for e as follows: First, invert the expression
and then subtract 1 from both sides to get
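\[
\frac{e}{e-1} = 1 + \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}, \quad\text{then}\quad \frac{1}{e-1} = \cfrac{1}{1 + \cfrac{2}{2 + \cfrac{3}{3 + \ddots}}}.
\]
Second, invert again to obtain
\[
e - 1 = 1 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cfrac{5}{5 + \ddots}}}}.
\]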
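Here is a small floating-point sketch of the last formula, assuming we truncate the continued fraction for $e - 1$ at the level whose numerator and denominator both equal `depth` (the name `e_minus_one` is ours). Since the underlying series has factorial denominators, the truncations approach $e - 1$ very quickly.

```python
import math

def e_minus_one(depth):
    """Truncation 1 + 2/(2 + 3/(3 + ... + depth/depth)), evaluated bottom-up."""
    tail = float(depth)                  # innermost denominator
    for k in range(depth, 1, -1):
        tail = (k - 1) + k / tail        # wrap one more level: (k-1) + k/(...)
    return tail

for depth in (4, 6, 10, 25):
    print(depth, e_minus_one(depth), math.e - 1)
```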
\[
\log(1 + x) = \cfrac{x}{1 + \cfrac{1^2 x}{(2 - 1x) + \cfrac{2^2 x}{(3 - 2x) + \cfrac{3^2 x}{(4 - 3x) + \ddots}}}}.
\]
Plug in x = 1 to derive our beautiful formula for log 2.
2. Using Euler's sum $\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$, give two proofs of the formula (8.11), one
using Theorem 8.2 and the other using Theorem 8.3. The transformation rules will
come in handy.
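3. (i) For any real numbers $\{\alpha_k\}$, prove that for any $n$,
\[
\sum_{k=0}^{n}\alpha_k x^k = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\frac{\alpha_2}{\alpha_1}x}{1 + \frac{\alpha_2}{\alpha_1}x +}\,\frac{-\frac{\alpha_3}{\alpha_2}x}{1 + \frac{\alpha_3}{\alpha_2}x +}\cdots\frac{-\frac{\alpha_n}{\alpha_{n-1}}x}{1 + \frac{\alpha_n}{\alpha_{n-1}}x}
\]
provided, of course, that the right-hand side is defined, which we assume holds
for every $n$.
(ii) Deduce that if the infinite series $\sum_{n=0}^{\infty}\alpha_n x^n$ converges, then
\[
\sum_{n=0}^{\infty}\alpha_n x^n = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\frac{\alpha_2}{\alpha_1}x}{1 + \frac{\alpha_2}{\alpha_1}x +}\,\frac{-\frac{\alpha_3}{\alpha_2}x}{1 + \frac{\alpha_3}{\alpha_2}x +}\cdots\frac{-\frac{\alpha_n}{\alpha_{n-1}}x}{1 + \frac{\alpha_n}{\alpha_{n-1}}x +}\cdots .
\]
Transforming the continued fraction on the right, prove that
\[
\sum_{n=0}^{\infty}\alpha_n x^n = \alpha_0 + \frac{\alpha_1 x}{1 +}\,\frac{-\alpha_2 x}{\alpha_1 + \alpha_2 x +}\,\frac{-\alpha_1\alpha_3 x}{\alpha_2 + \alpha_3 x +}\cdots\frac{-\alpha_{n-2}\alpha_n x}{\alpha_{n-1} + \alpha_n x +}\cdots .
\]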
4. Writing $\arctan x = x\bigl(1 - \frac{y}{3} + \frac{y^2}{5} - \frac{y^3}{7} + \cdots\bigr)$ where $y = x^2$, and using the previous
problem on $\bigl(1 - \frac{y}{3} + \frac{y^2}{5} - \frac{y^3}{7} + \cdots\bigr)$, derive the formula (8.10).
5. Let x, y > 0. Prove that
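\[
\sum_{n=0}^{\infty}\frac{(-1)^n}{x + ny} = \frac{1}{x +}\,\frac{x^2}{y +}\,\frac{(x+y)^2}{y +}\,\frac{(x+2y)^2}{y +}\,\frac{(x+3y)^2}{y +}\cdots .
\]
(a) By breaking up $\frac{2x}{x^2 - n^2}$ using partial fractions, prove that
\[
\pi\cot\pi x = \frac{1}{x} - \frac{1}{1-x} + \frac{1}{1+x} - \frac{1}{2-x} + \frac{1}{2+x} - + \cdots .
\]
(b) Derive the remarkable formula
\[
\frac{\cos\frac{\pi x}{2}}{\frac{\pi}{2}} = x + 1 + \frac{(x+1)^2}{-2\cdot 1 +}\,\frac{(x-1)^2}{-2 +}\,\frac{(x-3)^2}{2\cdot 3 +}\,\frac{(x+3)^2}{2 +}\,\frac{(x+5)^2}{-2\cdot 5 +}\,\frac{(x-5)^2}{-2 +}\cdots .
\]
\[
\frac{\sin\pi x}{\pi x} = 1 - \frac{x}{1 +}\,\frac{1\cdot(1-x)}{x +}\,\frac{1\cdot(1+x)}{1-x +}\,\frac{2\cdot(2-x)}{x +}\,\frac{2\cdot(2+x)}{1-x +}\cdots .
\]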
(c) Putting x = 1/2, prove (8.12). Putting x = −1/2, derive another continued
fraction for π/2.
have some weird properties. Let us form its convergents: c1 = 1, which is OK, but
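\[
c_2 = \frac{1}{1 +}\,\frac{-1}{1} = \cfrac{1}{1 + \cfrac{-1}{1}} = \frac{1}{1 - 1} = \frac{1}{0} = \text{???},
\]
which is not OK. However,
\[
c_3 = \frac{1}{1 +}\,\frac{-1}{1 +}\,\frac{1}{1} = \cfrac{1}{1 + \cfrac{-1}{1 + \cfrac{1}{1}}} = \cfrac{1}{1 + \cfrac{-1}{2}} = \frac{1}{\frac{1}{2}} = 2,
\]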
which is OK again! To avoid such craziness, we shall focus on continued fractions
with an > 0 for n ≥ 1 and bn ≥ 0, but we emphasize that much of what we do in
this section and the next works in greater generality.
Let $\{a_n\}_{n=0}^{\infty}$, $\{b_n\}_{n=1}^{\infty}$ be sequences of real numbers with $a_n > 0$, $b_n \ge 0$ for
all n ≥ 1 (there is no restriction on a0 ). The following sequences {pn }, {qn } are
central in the theory of continued fractions:
and our induction is complete. Observe that the zero-th convergent of the continued
fraction (8.15) is c0 = a0 = p0 /q0 and the first convergent is
\[
c_1 = a_0 + \frac{b_1}{a_1} = \frac{a_1 a_0 + b_1}{a_1} = \frac{p_1}{q_1}.
\]
The central property of the pn , qn ’s is the fact that cn = pn /qn for all n.
Theorem 8.4. For any positive real number $x$, we have
\[
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_n}{x} = \frac{x p_{n-1} + b_n p_{n-2}}{x q_{n-1} + b_n q_{n-2}}, \qquad n = 1, 2, 3, \ldots . \tag{8.18}
\]
(Note that the denominator is $> 0$ because $q_n > 0$ for $n \ge 0$.) In particular, setting
$x = a_n$ and using the definition of $p_n$, $q_n$, we have
\[
c_n = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_n}{a_n} = \frac{p_n}{q_n}, \qquad n = 0, 1, 2, 3, \ldots .
\]
Proof. We prove (8.18) by induction on the number of terms after $a_0$. The
proof for one term after $a_0$ is easy: $a_0 + \frac{b_1}{x} = \frac{a_0 x + b_1}{x} = \frac{x p_0 + b_1 p_{-1}}{x q_0 + b_1 q_{-1}}$, since $p_0 = a_0$,
$p_{-1} = 1$, $q_0 = 1$, and $q_{-1} = 0$. Assume that (8.18) holds when there are $n$ terms
after a0 ; we shall prove it holds for fractions with n + 1 terms after a0 . To do so,
we write (see Problem 4 in Exercises 8.1 for the general technique)
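\[
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{a_n +}\,\frac{b_{n+1}}{x} = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_n}{y},
\]
where
\[
y := a_n + \frac{b_{n+1}}{x} = \frac{x a_n + b_{n+1}}{x}.
\]
Therefore, by our induction hypothesis, we have
\[
\begin{aligned}
a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\cdots\frac{b_{n+1}}{x} &= \frac{y p_{n-1} + b_n p_{n-2}}{y q_{n-1} + b_n q_{n-2}} = \frac{\frac{x a_n + b_{n+1}}{x}\, p_{n-1} + b_n p_{n-2}}{\frac{x a_n + b_{n+1}}{x}\, q_{n-1} + b_n q_{n-2}} \\
&= \frac{x a_n p_{n-1} + b_{n+1} p_{n-1} + x b_n p_{n-2}}{x a_n q_{n-1} + b_{n+1} q_{n-1} + x b_n q_{n-2}} \\
&= \frac{x(a_n p_{n-1} + b_n p_{n-2}) + b_{n+1} p_{n-1}}{x(a_n q_{n-1} + b_n q_{n-2}) + b_{n+1} q_{n-1}} \\
&= \frac{x p_n + b_{n+1} p_{n-1}}{x q_n + b_{n+1} q_{n-1}},
\end{aligned}
\]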
which completes our induction step and finishes our proof.
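To see Theorem 8.4 in action, the following sketch computes $p_n$, $q_n$ from the recurrences $p_n = a_n p_{n-1} + b_n p_{n-2}$, $q_n = a_n q_{n-1} + b_n q_{n-2}$ (with $p_{-1} = 1$, $p_0 = a_0$, $q_{-1} = 0$, $q_0 = 1$, as used in the proof) and compares $p_n/q_n$ with a direct bottom-up evaluation of the finite continued fraction. The helper names `convergents` and `evaluate` are illustrative, not notation from the text.

```python
from fractions import Fraction

def convergents(a, b):
    """Return [c_0, ..., c_N] with c_n = p_n/q_n for a_0 + b_1/(a_1 + b_2/(a_2 + ...)).

    a = [a_0, a_1, ..., a_N] and b = [b_1, ..., b_N].
    """
    p_prev, p = 1, a[0]                  # p_{-1}, p_0
    q_prev, q = 0, 1                     # q_{-1}, q_0
    out = [Fraction(p, q)]
    for a_n, b_n in zip(a[1:], b):
        p_prev, p = p, a_n * p + b_n * p_prev
        q_prev, q = q, a_n * q + b_n * q_prev
        out.append(Fraction(p, q))
    return out

def evaluate(a, b):
    """Evaluate the same finite continued fraction directly, from the bottom up."""
    value = Fraction(a[-1])
    for a_n, b_n in zip(reversed(a[:-1]), reversed(b)):
        value = a_n + Fraction(b_n) / value
    return value

a, b = [3, 6, 6, 6, 6], [1, 9, 25, 49]   # first levels of (8.13)
cs = convergents(a, b)
print(cs, all(cs[n] == evaluate(a[:n + 1], b[:n]) for n in range(len(a))))
```

The recurrence produces every convergent in a single forward pass, whereas the direct evaluation has to restart from the bottom for each $n$.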
In the next theorem, we give various useful identities that the pn , qn satisfy.
Theorem 8.5 (Fundamental recurrence relations). For all n ≥ 1, the
following identities hold:
This tomb holds Diophantus. Ah, what a marvel! And the tomb
tells scientifically the measure of his life. God vouchsafed that he
should be a boy for the sixth part of his life; when a twelfth was
added, his cheeks acquired a beard; He kindled for him the light of
marriage after a seventh, and in the fifth year after his marriage
He granted him a son. Alas! late-begotten and miserable child,
when he had reached the measure of half his father’s life, the
chill grave took him. After consoling his grief by this science of
numbers for four years, he reached the end of his life. [160].
Try to find how old Diophantus was when he died using elementary algebra.
(Let $x$ = his age when he died; then you should end up trying to solve the
equation $x = \frac{1}{6}x + \frac{1}{12}x + \frac{1}{7}x + 5 + \frac{1}{2}x + 4$, obtaining $x = 84$.) Here is an easy
way to find his age: Unravelling the above fancy language, and picking out two
facts, we know that 1/12-th of his life was in youth and 1/7-th was as a bachelor.
In particular, his age must be divisible by both 7 and 12. The only positive integer with
this property that is within the human lifespan is 7 · 12 = 84. In particular, he spent 84/6 = 14
years as a child, 84/12 = 7 as a youth, and 84/7 = 12 years as a bachelor. He married
at 14 + 7 + 12 = 33; at 33 + 5 = 38, his son was born, who later died at the age
of 84/2 = 42 years old, when Diophantus was 80. Finally, after 4 years doing the
“science of numbers”, Diophantus died at the ripe old age of 84.
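For readers who prefer to let a machine do the elementary algebra, here is a tiny sketch that solves the equation $x = \frac16 x + \frac1{12}x + \frac17 x + 5 + \frac12 x + 4$ exactly (the variable names are ours).

```python
from fractions import Fraction

# Move all multiples of x to the left: x * (1 - 1/6 - 1/12 - 1/7 - 1/2) = 5 + 4.
coefficient = 1 - Fraction(1, 6) - Fraction(1, 12) - Fraction(1, 7) - Fraction(1, 2)
age = Fraction(5 + 4) / coefficient
print(coefficient, age)
```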
After taking a moment to wipe away our tears, let us consider the following.
Theorem 8.9. If a, b ∈ N are relatively prime, then for any c ∈ Z, the equation
ax − by = c
has an infinite number of integer solutions (x, y). Moreover, if (x0 , y0 ) is any one
integral solution of the equation with c = 1, then for general c ∈ Z, all solutions are
of the form
x = cx0 + bt , y = cy0 + at , for all t ∈ Z.
Proof. In Problem 7 we ask you to prove this theorem using Problem 5 in
Exercises 2.4; but we shall use continued fractions just for fun. We first solve the
equation $ax - by = 1$. To do so, we write $a/b$ as a finite simple continued fraction:
$a/b = \langle a_0; a_1, a_2, \ldots, a_n\rangle$ and by Theorem 8.8 we can choose $n$ to be odd. Then $a/b$
is equal to the $n$-th convergent $p_n/q_n$, which implies that $p_n = a$ and $q_n = b$. Also,
by our relations in Corollary 8.6, we know that
\[
p_n q_{n-1} - q_n p_{n-1} = (-1)^{n-1} = 1,
\]
where we used that $n$ is odd. Since $p_n$ and $q_n$ are relatively prime and $a/b = p_n/q_n$,
we must have $p_n = a$ and $q_n = b$. Therefore, $a q_{n-1} - b p_{n-1} = 1$, so
This shows that a divides b(y − cy0 ), which can be possible if and only if a divides
y − cy0 since a and b are relatively prime. Thus, y − cy0 = at for some t ∈ Z.
Plugging y − cy0 = at into the equation a(x − cx0 ) = b(y − cy0 ), we get
a(x − cx0 ) = b · (at) = abt.
Cancelling a, we get x − cx0 = bt and our proof is now complete.
\[
c_2 = 2 + \cfrac{1}{3 + \cfrac{1}{4}} = 2 + \frac{4}{13} = \frac{30}{13}.
\]
Therefore, (13, 30) is one solution of 157x − 68y = 1, which we should check just to
be sure:
157 · 13 − 68 · 30 = 2041 − 2040 = 1.
Since cx0 = 12·13 = 156 and cy0 = 12·30 = 360, the general solution of 157x−68y =
12 is
x = 156 + 68t , y = 360 + 157t, t ∈ Z.
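The method of Theorem 8.9 is easy to mechanize. The sketch below, with hypothetical helper names `simple_cf` and `solve_unit`, expands $a/b$ by the Euclidean algorithm, forces the number $n$ of terms after $a_0$ to be odd by splitting the last partial quotient $a_n$ into $a_n - 1$ and $1$ when necessary, and returns $(q_{n-1}, p_{n-1})$ as a solution of $ax - by = 1$.

```python
def simple_cf(a, b):
    """Partial quotients of a/b via the Euclidean algorithm."""
    terms = []
    while b:
        terms.append(a // b)
        a, b = b, a % b
    return terms

def solve_unit(a, b):
    """One integer solution (x0, y0) of a*x - b*y = 1 for coprime a, b >= 1."""
    terms = simple_cf(a, b)
    if len(terms) % 2 == 1:              # n = len(terms) - 1 is even: make it odd
        terms[-1:] = [terms[-1] - 1, 1]
    p_prev, p, q_prev, q = 1, terms[0], 0, 1
    for t in terms[1:]:
        p_prev, p = p, t * p + p_prev
        q_prev, q = q, t * q + q_prev
    return q_prev, p_prev                # (q_{n-1}, p_{n-1})

x0, y0 = solve_unit(157, 68)
print(x0, y0, 157 * x0 - 68 * y0)
# General solution of 157x - 68y = 12: x = 12*x0 + 68*t, y = 12*y0 + 157*t.
print([(12 * x0 + 68 * t, 12 * y0 + 157 * t) for t in (-1, 0, 1)])
```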
Example 8.11. We now come to a fun puzzle that involves diophantine equa-
tions; for more cool coconut puzzles, see [80, 81], [228], [212], and Problem 5. See
also [214] for the long history of such problems. Five sailors get shipwrecked on
an island where there is only a coconut tree and a very slim monkey. The sailors
gathered all the coconuts into a gigantic pile and went to sleep. At midnight, one
sailor woke up, and because he didn’t trust his mates, he divided the coconuts into
five equal piles, but with one coconut left over. He throws the extra one to the
monkey, hides his pile, puts the remaining coconuts back into a pile, and goes to
sleep. At one o’clock, the second sailor woke up, and because he was untrusting of
his mates, he divided the coconuts into five equal piles, but again with one coconut
left over. He throws the extra one to the monkey, hides his pile, puts the remaining
coconuts back into a pile, and goes to sleep. This exact same scenario continues
throughout the night with the other three sailors. In the morning, all the sailors
woke up, pretending as if nothing happened and divided the now minuscule pile of
coconuts into five equal piles, and they find yet again one coconut left over, which
they throw to the now very overweight monkey. Question: What is the smallest
possible number of coconuts in the original pile?
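Before doing any mathematics, one can simply simulate the night. The brute-force sketch below (the function name `survives` is ours) replays the five midnight divisions and the morning division for a candidate pile and searches for the smallest pile that works; it is meant only as a check on whatever answer the diophantine analysis produces.

```python
def survives(x, sailors=5):
    """True if a pile of x coconuts behaves as in the puzzle."""
    for _ in range(sailors):             # each sailor's turn during the night
        if x % sailors != 1:             # must split evenly with one left over
            return False
        x = (x - 1) // sailors * (sailors - 1)   # monkey gets 1, sailor hides one share
    return x % sailors == 1              # morning: again one coconut for the monkey

smallest = next(x for x in range(1, 10**6) if survives(x))
print(smallest)
```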
Let x = the original number of coconuts. Remember that sailor #1 divided x
into five parts, but with one left over. This means that if y1 is the number that he
2. If all the $a_0, a_1, a_2, \ldots, a_n > 0$ (which guarantees that $p_0 = a_0 > 0$), prove that
\[
\frac{p_n}{p_{n-1}} = \langle a_n; a_{n-1}, a_{n-2}, \ldots, a_2, a_1, a_0\rangle \quad\text{and}\quad \frac{q_n}{q_{n-1}} = \langle a_n; a_{n-1}, a_{n-2}, \ldots, a_2, a_1\rangle
\]
for $n = 1, 2, \ldots$. Suggestion: Observe that $\frac{p_k}{p_{k-1}} = \frac{a_k p_{k-1} + p_{k-2}}{p_{k-1}} = a_k + \frac{1}{p_{k-1}/p_{k-2}}$.
3. In this problem, we relate the Fibonacci numbers to continued fractions. Recall that
the Fibonacci sequence {Fn } is defined as F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2 for
all $n \ge 2$. Let $p_n/q_n = \langle a_0; a_1, \ldots, a_n\rangle$ where all the $a_k$'s are equal to 1.
(a) Prove that pn = Fn+2 and qn = Fn+1 for all n = −1, 0, 1, 2, . . .. Suggestion: Use
the Wallis-Euler recurrence relations.
(b) From facts known about convergents, prove that Fn and Fn+1 are relatively prime
and derive the following famous identity, named after Giovanni Domenico Cassini
(1625–1712) (also called Jean-Dominique Cassini)
4. Imitating the proof of Theorem 8.9, show that a solution of ax − by = −1 can be found
by writing a/b as a simple continued fraction with an even number n terms after the
integer part of a/b and finding the (n − 1)-th convergent. Apply this method to find a
solution of 157x − 68y = −1 and 7x − 12y = −1.
5. (Coconut problems) Here are some more coconut problems:
(a) Solve the coconut problem assuming the same antics as in the text, except for one
thing: there are no coconuts left over for the monkey at the end. That is, what is
the smallest possible number of coconuts in the original pile given that after the
sailors divided the coconuts in the morning, there are no coconuts left over?
(b) Solve the coconut problem assuming the same antics as in the text except that
during the night each sailor divided the pile into five equal piles with none left
over; however, after he puts the remaining coconuts back into a pile, the monkey
(being a thief himself) steals one coconut from the pile (before the next sailor wakes
up). In the morning, there is still one coconut left over for the monkey.
(c) Solve the coconut problem when there are only three sailors to begin with, otherwise
everything is the same as in the text (e.g. one coconut left over at the end). Solve
this same coconut problem when there are no coconuts left over at the end.
(d) Solve the coconut problem when there are seven sailors, otherwise everything is
the same as in the text. (Warning: Set aside an evening for long computations!)
6. Let $\alpha = \langle a_0; a_1, a_2, \ldots, a_m\rangle$, $\beta = \langle b_0; b_1, \ldots, b_n\rangle$ with $m, n \ge 0$ and the $a_k$, $b_k$'s integers
with $a_m, b_n > 1$ (such finite continued fractions are called regular). Prove that if
$\alpha = \beta$, then $a_k = b_k$ for all $k = 0, 1, 2, \ldots$. In other words, distinct regular finite simple
continued fractions define different rational numbers.
7. Prove Theorem 8.9 using Problem 5 in Exercises 2.4.
Corollary 8.11. The limits of the even and odd convergents exist, and
c0 < c2 < c4 < · · · < lim c2n ≤ lim c2n−1 < · · · < c5 < c3 < c1 .
8.4.2. Convergence results for continued fractions. As a consequence of
the previous corollary, it follows that lim cn exists if and only if lim c2n = lim c2n−1 ,
which holds if and only if
\[
c_{2n} - c_{2n-1} = \frac{-b_1 b_2\cdots b_{2n}}{q_{2n}\, q_{2n-1}} \to 0 \quad\text{as } n \to \infty. \tag{8.23}
\]
In the following theorem, we give one condition under which this is satisfied.
Theorem 8.12. Let $\{a_n\}_{n=0}^{\infty}$, $\{b_n\}_{n=1}^{\infty}$ be sequences such that $a_n, b_n > 0$ for
$n \ge 1$ and
\[
\sum_{n=1}^{\infty}\frac{a_n a_{n+1}}{b_{n+1}} = \infty.
\]
Then (8.23) holds, so the continued fraction $\xi := a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\cdots$ converges.
Moreover, for any even $j$ and odd $k$, we have
c0 < c2 < c4 < · · · < cj < · · · < ξ < · · · < ck < · · · < c5 < c3 < c1 .
Proof. Observe that for any n ≥ 2, we have qn−1 = an−1 qn−2 + bn−1 qn−3 ≥
an−1 qn−2 since bn−1 , qn−3 ≥ 0. Thus, for n ≥ 2 we have
qn = an qn−1 + bn qn−2 ≥ an · (an−1 qn−2 ) + bn qn−2 = qn−2 (an an−1 + bn ),
so
qn ≥ qn−2 (an an−1 + bn ).
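Applying this formula over and over again, we find that for any $n \ge 1$,
\[
\begin{aligned}
q_{2n} &\ge q_{2n-2}\,(a_{2n} a_{2n-1} + b_{2n}) \\
&\ge q_{2n-4}\,(a_{2n-2} a_{2n-3} + b_{2n-2})\cdot(a_{2n} a_{2n-1} + b_{2n}) \\
&\;\;\vdots \\
&\ge q_0\,(a_2 a_1 + b_2)(a_4 a_3 + b_4)\cdots(a_{2n} a_{2n-1} + b_{2n}).
\end{aligned}
\]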
A similar argument shows that for any n ≥ 2,
q2n−1 ≥ q1 (a3 a2 + b3 )(a5 a4 + b5 ) · · · (a2n−1 a2n−2 + b2n−1 ).
Thus, for any n ≥ 2, we have
q2n q2n−1 ≥ q0 q1 (a2 a1 + b2 )(a3 a2 + b3 ) · · · (a2n−1 a2n−2 + b2n−1 )(a2n a2n−1 + b2n ).
Factoring out all the bk ’s we conclude that
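\[
q_{2n}\, q_{2n-1} \ge q_0 q_1 b_2\cdots b_{2n}\cdot\left(1 + \frac{a_2 a_1}{b_2}\right)\left(1 + \frac{a_3 a_2}{b_3}\right)\cdots\left(1 + \frac{a_{2n} a_{2n-1}}{b_{2n}}\right),
\]
which shows that
\[
\frac{b_1 b_2\cdots b_{2n}}{q_{2n}\, q_{2n-1}} \le \frac{b_1}{q_0 q_1}\cdot\frac{1}{\prod_{k=1}^{2n-1}\left(1 + \frac{a_k a_{k+1}}{b_{k+1}}\right)}. \tag{8.24}
\]
Now recall that (see Theorem 7.2) a series $\sum_{k=1}^{\infty}\alpha_k$ of positive real numbers converges
if and only if the infinite product $\prod_{k=1}^{\infty}(1 + \alpha_k)$ converges. Thus, since we
are given that $\sum_{k=1}^{\infty}\frac{a_k a_{k+1}}{b_{k+1}} = \infty$, we have $\prod_{k=1}^{\infty}\left(1 + \frac{a_k a_{k+1}}{b_{k+1}}\right) = \infty$ as well, so the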
right-hand side of (8.24) vanishes as n → ∞. The fact that for even j and odd k,
we have c0 < c2 < c4 < · · · < cj < · · · < ξ < · · · < ck < · · · < c5 < c3 < c1 follows
from Corollary 8.11. This completes our proof.
where we used that $0 < \xi_0 - a_0$. Note that $\xi_1$ is irrational because if not, then $\xi_0$
would be rational contrary to assumption. Second, we define $a_1 := \lfloor\xi_1\rfloor \in \mathbb{N}$. Then,
$0 < \xi_1 - a_1 < 1$, so we can write
\[
\xi_1 = a_1 + \frac{1}{\xi_2}, \quad\text{where } \xi_2 := \frac{1}{\xi_1 - a_1} > 1.
\]
Note that $\xi_2$ is irrational. Third, we define $a_2 := \lfloor\xi_2\rfloor \in \mathbb{N}$. Then, $0 < \xi_2 - a_2 < 1$,
so we can write
\[
\xi_2 = a_2 + \frac{1}{\xi_3}, \quad\text{where } \xi_3 := \frac{1}{\xi_2 - a_2} > 1.
\]
Note that $\xi_3$ is irrational. We can continue this procedure to “infinity” creating
a sequence $\{\xi_n\}_{n=0}^{\infty}$ of real numbers with $\xi_n > 0$ for $n \ge 1$ called the complete
quotients of $\xi$, and a sequence $\{a_n\}_{n=0}^{\infty}$ of integers with $a_n > 0$ for $n \ge 1$ called
the partial quotients of $\xi$, such that
\[
\xi_n = a_n + \frac{1}{\xi_{n+1}}, \qquad n = 0, 1, 2, 3, \ldots .
\]
Thus,
\[
\xi = \xi_0 = a_0 + \frac{1}{\xi_1} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\xi_2}} = \cdots \text{ “=” } a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{a_4 + \ddots}}}}. \tag{8.25}
\]
We emphasize that we have actually not proved that ξ is equal to the infinite con-
tinued fraction on the far right (hence, the quotation marks)! But, as a consequence
of the following theorem, this equality follows; then the continued fraction in (8.25)
is called the canonical (simple) continued fraction expansion of ξ.
Theorem 8.14 (Continued fraction convergence theorem). Let $\xi_0, \xi_1,
\xi_2, \ldots$ be any sequence of real numbers with $\xi_n > 0$ for $n \ge 1$ and suppose that
these numbers are related by
\[
\xi_n = a_n + \frac{b_{n+1}}{\xi_{n+1}}, \qquad n = 0, 1, 2, \ldots,
\]
for sequences of real numbers $\{a_n\}_{n=0}^{\infty}$, $\{b_n\}_{n=1}^{\infty}$ with $a_n, b_n > 0$ for $n \ge 1$ and which
satisfy $\sum_{n=1}^{\infty}\frac{a_n a_{n+1}}{b_{n+1}} = \infty$. Then $\xi_0$ is equal to the continued fraction
\[
\xi_0 = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\,\frac{b_5}{a_5 +}\cdots .
\]
In particular, for any real number ξ, the canonical continued fraction expansion
(8.25) converges to ξ.
Proof. By Theorem 8.12, the continued fraction $a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots$ converges.
Let $\{c_k = p_k/q_k\}$ denote the convergents of this infinite continued fraction
and let $\varepsilon > 0$. Then by Theorem 8.12, there is an $N$ such that
\[
n > N \implies |c_n - c_{n-1}| = \frac{b_1 b_2\cdots b_n}{q_n\, q_{n-1}} < \varepsilon.
\]
8.4. CONVERGENCE THEOREMS FOR INFINITE CONTINUED FRACTIONS 417
Fix $n > N$ and consider the finite continued fraction obtained as in (8.25) by
writing out $\xi_0$ to the $n$-th term:
\[
\xi_0 = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_{n-1}}{a_{n-1} +}\,\frac{b_n}{\xi_n}.
\]
Let $\{c'_k = p'_k/q'_k\}$ denote the convergents of this finite continued fraction. Then
observe that $p_k = p'_k$ and $q_k = q'_k$ for $k \le n - 1$ and $c'_n = \xi_0$. Therefore, by our
fundamental recurrence relations, we have
\[
|\xi_0 - c_{n-1}| = |c'_n - c'_{n-1}| \le \frac{b_1 b_2\cdots b_n}{q'_n\, q'_{n-1}} = \frac{b_1 b_2\cdots b_n}{q'_n\, q_{n-1}}.
\]
By the Wallis-Euler relations, we have
\[
q'_n = \xi_n q'_{n-1} + b_n q'_{n-2} = \left(a_n + \frac{b_{n+1}}{\xi_{n+1}}\right) q_{n-1} + b_n q_{n-2} > a_n q_{n-1} + b_n q_{n-2} = q_n.
\]
Hence,
\[
|\xi_0 - c_{n-1}| \le \frac{b_1 b_2\cdots b_n}{q'_n\, q_{n-1}} < \frac{b_1 b_2\cdots b_n}{q_n\, q_{n-1}} < \varepsilon.
\]
Since $\varepsilon > 0$ was arbitrary, it follows that $\xi_0 = \lim c_{n-1} = \xi$.
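Example 8.14. Consider $\xi_0 = \sqrt{3} = 1.73205\ldots$. In this case, $a_0 := \lfloor\xi_0\rfloor = 1$.
Thus,
\[
\xi_1 := \frac{1}{\xi_0 - a_0} = \frac{1}{\sqrt{3} - 1} = \frac{1 + \sqrt{3}}{2} = 1.36602\ldots \implies a_1 := \lfloor\xi_1\rfloor = 1.
\]
Therefore,
\[
\xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{\frac{1 + \sqrt{3}}{2} - 1} = 1 + \sqrt{3} = 2.73205\ldots \implies a_2 := \lfloor\xi_2\rfloor = 2.
\]
Hence,
\[
\xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{\sqrt{3} - 1} = \frac{1 + \sqrt{3}}{2} = 1.36602\ldots \implies a_3 := \lfloor\xi_3\rfloor = 1.
\]
Here we notice that $\xi_3 = \xi_1$ and $a_3 = a_1$. Therefore,
\[
\xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{\xi_1 - a_1} = \xi_2 = 1 + \sqrt{3} \implies a_4 := \lfloor\xi_4\rfloor = \lfloor\xi_2\rfloor = 2.
\]
At this point, we see that we will get the repeating pattern 1, 2, 1, 2, \ldots, so we
conclude that
\[
\sqrt{3} = \langle 1; 1, 2, 1, 2, 1, 2, \ldots\rangle = \langle 1; \overline{1, 2}\rangle,
\]
where we indicate that the 1, 2 pattern repeats by putting a bar over them.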
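The floor-and-invert procedure above is straightforward to try on a computer. The following floating-point sketch (with the illustrative name `canonical_cf`) produces the first few partial quotients; since each inversion amplifies rounding error, only the leading handful of terms should be trusted.

```python
import math

def canonical_cf(xi, terms):
    """First `terms` partial quotients a_0, a_1, ... using a_n = floor(xi_n)."""
    quotients = []
    for _ in range(terms):
        a = math.floor(xi)
        quotients.append(a)
        frac = xi - a
        if frac == 0:                    # xi_n is an integer in floating point: stop
            break
        xi = 1 / frac                    # next complete quotient xi_{n+1}
    return quotients

print(canonical_cf(math.sqrt(3), 9))
print(canonical_cf(math.pi, 5))
```

For quadratic irrationals such as $\sqrt{3}$ one could avoid rounding altogether by carrying the complete quotients symbolically, as in the hand computation; the float version is used here only because it is short.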
Example 8.15. Here is a neat example concerning the Fibonacci and Lucas
numbers; for other fascinating topics on these numbers, see Knott’s fun website
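[121]. Let us find the continued fraction expansion of the irrational number $\xi_0 = \Phi/\sqrt{5}$
where $\Phi$ is the golden ratio $\Phi = \frac{1 + \sqrt{5}}{2}$:
\[
\xi_0 = \frac{\Phi}{\sqrt{5}} = \frac{1 + \sqrt{5}}{2\sqrt{5}} = 0.72360679\ldots \implies a_0 := \lfloor\xi_0\rfloor = 0.
\]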
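Thus,
\[
\xi_1 := \frac{1}{\xi_0 - a_0} = \frac{1}{\xi_0} = \frac{2\sqrt{5}}{1 + \sqrt{5}} = 1.3819660\ldots \implies a_1 := \lfloor\xi_1\rfloor = 1.
\]
Therefore,
\[
\xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{\frac{2\sqrt{5}}{1 + \sqrt{5}} - 1} = \frac{1 + \sqrt{5}}{\sqrt{5} - 1} = 2.6180339\ldots \implies a_2 := \lfloor\xi_2\rfloor = 2.
\]
Hence,
\[
\xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{\frac{1 + \sqrt{5}}{\sqrt{5} - 1} - 2} = \frac{\sqrt{5} - 1}{3 - \sqrt{5}} = 1.6180339\ldots \implies a_3 := \lfloor\xi_3\rfloor = 1.
\]
Thus,
\[
\xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{\frac{\sqrt{5} - 1}{3 - \sqrt{5}} - 1} = \frac{3 - \sqrt{5}}{2\sqrt{5} - 4} = \frac{1 + \sqrt{5}}{2} = 1.6180339\ldots;
\]
that is, $\xi_4 = \Phi$, and so, $a_4 := \lfloor\xi_4\rfloor = 1$. Let us do this one more time:
\[
\xi_5 := \frac{1}{\xi_4 - a_4} = \frac{1}{\frac{1 + \sqrt{5}}{2} - 1} = \frac{2}{\sqrt{5} - 1} = \frac{1 + \sqrt{5}}{2} = \Phi,
\]
and so, $a_5 = a_4 = 1$. Continuing this process, we will get $\xi_n = \Phi$ and $a_n = 1$ for
the rest of the $n$'s. In conclusion, we have
\[
\frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, 1, 1, 1, 1, \ldots\rangle = \langle 0; 1, 2, \overline{1}\rangle.
\]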
The convergents of this continued fraction are fascinating. Recall that the Fibonacci
sequence {Fn }, named after Leonardo Pisano Fibonacci (1170–1250), is defined as
F0 = 0, F1 = 1, and Fn = Fn−1 + Fn−2 for all n ≥ 2, which gives the sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . . .
The Lucas numbers {Ln }, named after François Lucas (1842–1891), are defined
by
L0 := 2 , L1 = 1 , Ln = Ln−1 + Ln−2 , n = 2, 3, 4, . . . ,
and which give the sequence
2, 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, . . .
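If you work out the convergents of $\frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, 1, 1, 1, 1, \ldots\rangle$ what you get is the
fascinating result:
\[
\frac{\Phi}{\sqrt{5}} = \langle 0; 1, 2, \overline{1}\rangle \text{ has convergents }
\frac{0}{2}, \frac{1}{1}, \frac{2}{3}, \frac{3}{4}, \frac{5}{7}, \frac{8}{11}, \frac{13}{18}, \frac{21}{29}, \frac{34}{47}, \frac{55}{76}, \frac{89}{123}, \ldots = \frac{\text{Fibonacci numbers}}{\text{Lucas numbers}}; \tag{8.26}
\]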
of course, we do miss the other 1 in the Fibonacci sequence. For more fascinating
facts on Fibonacci numbers see Problem 7. Finally, we remark that the canonical
simple continued fraction expansion of a real number is unique; see Problem 8.
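The pattern in (8.26) can be confirmed for the first several convergents with a short sketch. It assumes the indexing read off from the list above, namely $c_0 = F_0/L_0$ and $c_n = F_{n+1}/L_n$ for $n \ge 1$; the helper name `simple_convergents` is ours.

```python
from fractions import Fraction

def simple_convergents(quotients):
    """Convergents p_n/q_n of <a_0; a_1, a_2, ...> via p_n = a_n p_{n-1} + p_{n-2}."""
    p_prev, p, q_prev, q = 1, quotients[0], 0, 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append(Fraction(p, q))
    return result

fib, lucas = [0, 1], [2, 1]
for _ in range(12):
    fib.append(fib[-1] + fib[-2])
    lucas.append(lucas[-1] + lucas[-2])

cs = simple_convergents([0, 1, 2] + [1] * 8)     # c_0, ..., c_10
print(cs)
print(all(cs[n] == Fraction(fib[n + 1], lucas[n]) for n in range(1, 11)))
```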
8.4.4. The numbers π and e. We now discuss the continued fraction expan-
sions for the famous numbers π and e. Consider π first:
\[
\xi_0 = \pi = 3.141592653\ldots \implies a_0 := \lfloor\xi_0\rfloor = 3.
\]
Thus,
\[
\xi_1 := \frac{1}{\pi - 3} = \frac{1}{0.141592653\ldots} = 7.062513305\ldots \implies a_1 := \lfloor\xi_1\rfloor = 7.
\]
Therefore,
\[
\xi_2 := \frac{1}{\xi_1 - a_1} = \frac{1}{0.062513305\ldots} = 15.99659440\ldots \implies a_2 := \lfloor\xi_2\rfloor = 15.
\]
Hence,
\[
\xi_3 := \frac{1}{\xi_2 - a_2} = \frac{1}{0.996594407\ldots} = 1.00341723\ldots \implies a_3 := \lfloor\xi_3\rfloor = 1.
\]
Let us do this one more time:
\[
\xi_4 := \frac{1}{\xi_3 - a_3} = \frac{1}{0.003417231\ldots} = 292.6345908\ldots \implies a_4 := \lfloor\xi_4\rfloor = 292.
\]
Continuing this process (at Davis' Broadway cafe and after 314 free refills), we get
\[
\pi = \langle 3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1, 84, 2, 1, \ldots\rangle. \tag{8.27}
\]
Unfortunately (or perhaps fortunately) there is no known pattern that the partial
quotients follow! The first few convergents for $\pi = 3.141592653\ldots$ are
\[
c_0 = 3, \quad c_1 = \frac{22}{7} = 3.142857142\ldots, \quad c_2 = \frac{333}{106} = 3.141509433\ldots,
\]
\[
c_3 = \frac{355}{113} = 3.141592920\ldots, \quad c_4 = \frac{103993}{33102} = 3.141592653\ldots .
\]
In stark contrast to $\pi$, Euler's number $e$ has a shockingly simple pattern, which
we ask you to work out in Problem 2:
\[
e = \langle 2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \ldots\rangle
\]
We will prove that this pattern continues in Section 8.7!
8.4.5. Irrationality. We now discuss when continued fractions represent ir-
rational numbers (cf. [154]).
Theorem 8.15. Let $\{a_n\}_{n=0}^{\infty}$, $\{b_n\}_{n=1}^{\infty}$ be sequences of rational numbers such that
$a_n, b_n > 0$ for $n \ge 1$, $0 < b_n \le a_n$ for all $n$ sufficiently large, and $\sum_{n=1}^{\infty}\frac{a_n a_{n+1}}{b_{n+1}} = \infty$.
Then the real number
\[
\xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\,\frac{b_4}{a_4 +}\,\frac{b_5}{a_5 +}\cdots \quad\text{is irrational.}
\]
Proof. First of all, the continued fraction defining $\xi$ converges by Theorem
8.12. Suppose that $0 < b_n \le a_n$ for all $n \ge m + 1$ with $m > 0$. Observe that if we
define
\[
\eta = a_m + \frac{b_{m+1}}{a_{m+1} +}\,\frac{b_{m+2}}{a_{m+2} +}\,\frac{b_{m+3}}{a_{m+3} +}\cdots,
\]
which also converges by Theorem 8.12, then $\eta > a_m > 0$ and we can write
\[
\xi = a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots\frac{b_m}{\eta}.
\]
converges, in which case, this sum is exactly $a_0 + \frac{b_1}{a_1 +}\,\frac{b_2}{a_2 +}\,\frac{b_3}{a_3 +}\cdots$. Suggestion:
Consider the telescoping sum cn = c0 + (c1 − c0 ) + (c2 − c1 ) + · · · + (cn − cn−1 ). In
Example 8.20. We can see that $p/q = 13/4$ is not a best approximation to $\pi$
because with $a/b = 3/1$, we have $1 \le 1 \le 4$ yet
\[
|4\cdot\pi - 13| = 0.433629\ldots \not< |1\cdot\pi - 3| = 0.141592\ldots .
\]
Thus, 13/4 is a good approximation to $\pi$ but is far from a best approximation.
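To get a feeling for how rare best approximations are, here is a brute-force sketch. It assumes fractions are taken in lowest terms and uses the criterion from the example: $p/q$ is kept when $|q\xi - p|$ is strictly smaller than $|b\xi - a|$ for every other fraction $a/b$ with $1 \le b \le q$. For irrational $\xi$ this amounts to $|q\xi - p|$, with $p$ the nearest integer to $q\xi$, setting a new record. The function name `best_approximations` is ours.

```python
import math

def best_approximations(xi, max_q):
    """Fractions (p, q), q <= max_q, whose |q*xi - p| beats all smaller denominators."""
    best, record = [], math.inf
    for q in range(1, max_q + 1):
        p = round(q * xi)                # the only numerator that can compete for this q
        dist = abs(q * xi - p)
        if dist < record:                # strictly better than every b < q
            best.append((p, q))
            record = dist
    return best

print(best_approximations(math.pi, 120))
print((13, 4) in best_approximations(math.pi, 120))
```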
In the following proposition, we show that any best approximation is a good
one.
Proposition 8.17. A best approximation is a good one, but not vice versa.
Proof. We already gave an example showing that a good approximation may
not be a best one, so let $p/q$ be a best approximation to $\xi$; we shall prove that $p/q$ is
a good one too. Let $a/b \ne p/q$ be rational with $1 \le b \le q$. Then $|q\xi - p| < |b\xi - a|$
since $p/q$ is a best approximation, and also, $\frac{1}{q} \le \frac{1}{b}$ since $b \le q$, hence
\[
\left|\xi - \frac{p}{q}\right| = \frac{|q\xi - p|}{q} < \frac{|b\xi - a|}{q} \le \frac{|b\xi - a|}{b} = \left|\xi - \frac{a}{b}\right| \implies \left|\xi - \frac{p}{q}\right| < \left|\xi - \frac{a}{b}\right|.
\]
This shows that p/q is a good approximation.
8.5. DIOPHANTINE APPROXIMATIONS AND THE MYSTERY OF π SOLVED! 425
If $\xi$ is a rational number and the convergent $c_{n+1}$ is defined (that is, if $\xi \ne c_n$), then
these inequalities still hold, with the exception that $|\xi - c_n| = \frac{1}{q_n q_{n+1}}$ if $\xi = c_{n+1}$.
Proof. We prove this theorem for ξ irrational; the rational case is proved
using a similar argument, which we leave to you if you’re interested. The proof of
this theorem is very simple. We just need the inequalities (see Corollary 8.13)
(8.29) cn < cn+2 < ξ < cn+1 or cn+1 < ξ < cn+2 < cn ,
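Hence, $\frac{1}{q_{n+2}} > |q_{n+1}\xi - p_{n+1}|$. Now,
\[
\begin{aligned}
|q_n\xi - p_n| = q_n\left|\xi - \frac{p_n}{q_n}\right| = q_n|\xi - c_n| &> q_n|c_{n+2} - c_n| && \text{by (8.29)} \\
&= q_n\,\frac{a_{n+2}}{q_n q_{n+2}} && \text{by (8.30)} \\
&= \frac{a_{n+2}}{q_{n+2}} \ge \frac{1}{q_{n+2}} > |q_{n+1}\xi - p_{n+1}|.
\end{aligned}
\]
This proves our third inequality. Finally, using what we just proved, and that
\[
q_{n+1} = a_{n+1} q_n + q_{n-1} \ge q_n + q_{n-1} > q_n \implies \frac{1}{q_{n+1}} < \frac{1}{q_n},
\]
we see that
\[
\begin{aligned}
|\xi - c_{n+1}| = \left|\xi - \frac{p_{n+1}}{q_{n+1}}\right| &= \frac{1}{q_{n+1}}\,|q_{n+1}\xi - p_{n+1}| \\
&< \frac{1}{q_{n+1}}\,|q_n\xi - p_n| \\
&< \frac{1}{q_n}\,|q_n\xi - p_n| = \left|\xi - \frac{p_n}{q_n}\right| = |\xi - c_n|.
\end{aligned}
\]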
We now discuss the “most irrational” of all irrational numbers. From the best
approximation theorem (Theorem 8.20 we’ll prove in a moment) we know that the
best approximations of a real number ξ are convergents and from the fundamental
approximation theorem 8.18, we have the error estimate
\[
|\xi - c_n| < \frac{1}{q_n q_{n+1}} \implies |q_n\xi - p_n| < \frac{1}{q_{n+1}}. \tag{8.31}
\]
This shows you that the larger the qn ’s are, the better the best approximations
are. Since the qn ’s are determined by the recurrence relation qn = an qn−1 + qn−2 ,
we see that the larger the an ’s are, the larger the qn ’s are. In summary, ξ can be
approximated very “good” by rational numbers when it has large an ’s and very
“bad” by rational numbers when it has small an ’s.
Example 8.22. Here is a “good” example: Recall from (8.27) the continued
fraction for $\pi$: $\pi = \langle 3; 7, 15, 1, 292, 1, 1, 1, 2, 1, \ldots\rangle$, which has convergents $c_0 = 3$,
$c_1 = \frac{22}{7}$, $c_2 = \frac{333}{106}$, $c_3 = \frac{355}{113}$, $c_4 = \frac{103993}{33102}$, \ldots. Because of the large number
$a_4 = 292$, we see from (8.31) that we can approximate $\pi$ very nicely with $c_3$: Using
the left-hand equation in (8.31), we see that
\[
|\pi - c_3| < \frac{1}{q_3 q_4} = \frac{1}{113\cdot 33102} = 0.000000267\ldots,
\]
which implies that $c_3 = \frac{355}{113}$ approximates $\pi$ to within six decimal places! (Just to
check, note that $\pi = 3.14159265\ldots$ and $\frac{355}{113} = 3.14159292\ldots$.) It's amazing how
many decimal places of accuracy we can get with just taking the c3 convergent!
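The numbers in this example can be reproduced with a few lines. The sketch below rebuilds the convergents of $\pi$ from the partial quotients 3, 7, 15, 1, 292 using $p_n = a_n p_{n-1} + p_{n-2}$, $q_n = a_n q_{n-1} + q_{n-2}$, and compares the actual error of $c_3$ with the bound $1/(q_3 q_4)$ from (8.31); the variable names are ours.

```python
import math

quotients = [3, 7, 15, 1, 292]
p_prev, p, q_prev, q = 1, quotients[0], 0, 1
convergents = [(p, q)]
for a in quotients[1:]:
    p_prev, p = p, a * p + p_prev
    q_prev, q = q, a * q + q_prev
    convergents.append((p, q))

(p3, q3), (p4, q4) = convergents[3], convergents[4]
error = abs(math.pi - p3 / q3)           # actual distance from pi to c_3
bound = 1 / (q3 * q4)                    # right-hand side of (8.31) for n = 3
print(convergents)
print(error, bound, error < bound)
```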
Step 1: The trick. To prove that |qn ξ − pn | ≤ |qξ − p|, the trick is to write p
and q as linear combinations of pn , pn+1 , qn , qn+1 :
\[
\begin{aligned}
p &= p_n x + p_{n+1} y \\
q &= q_n x + q_{n+1} y.
\end{aligned} \tag{8.32}
\]
Using basic linear algebra, together with the fact that $p_{n+1} q_n - p_n q_{n+1} = (-1)^n$,
we can solve these simultaneous linear equations for $x$ and $y$ obtaining
\[
x = (-1)^n\bigl(p_{n+1} q - p\, q_{n+1}\bigr), \qquad y = (-1)^n\bigl(p\, q_n - p_n q\bigr).
\]
These formulas are not needed below except for the important fact that these
formulas show that $x$ and $y$ are integers. Now, using the formulas in (8.32), we see
that
\[
\begin{aligned}
q\xi - p &= (q_n x + q_{n+1} y)\,\xi - p_n x - p_{n+1} y \\
&= (q_n\xi - p_n)\,x + (q_{n+1}\xi - p_{n+1})\,y.
\end{aligned}
\]
Therefore,
\[
|q\xi - p| = \bigl|(q_n\xi - p_n)\,x + (q_{n+1}\xi - p_{n+1})\,y\bigr|. \tag{8.33}
\]
Step 2: Our goal is to simplify the right-hand side of (8.33) by understanding
the signs of the terms in the absolute values. First of all, since q, qn , qn+1 > 0, from
the second formula in (8.32), we see that x ≤ 0 and y ≤ 0 is not possible (for then
q ≤ 0, contradicting that q > 0). If x > 0 and y > 0, then we would have
q = qn x + qn+1 y > qn+1 ,
contradicting that q < qn+1 . (Note that y > 0 is the same thing as saying y ≥ 1
because y is an integer.) If x = 0, then the formulas (8.32) show that p = pn+1 y
and q = qn+1 y. Since q and qn+1 are positive, we must have y > 0 and we have
q ≥ qn+1 , contradicting that q < qn+1 . If y = 0, then the formulas (8.32) show that
p = pn x and q = qn x, so p/q = pn /qn and this contradicts the assumption that
$p/q \ne p_n/q_n$. Summarizing our findings: We may assume that $x$ and $y$ are both
nonzero and have opposite signs. By Corollary 8.11 we know that
\[
\xi - \frac{p_n}{q_n} \quad\text{and}\quad \xi - \frac{p_{n+1}}{q_{n+1}}
\]
have opposite signs. Therefore, $q_n\xi - p_n$ and $q_{n+1}\xi - p_{n+1}$ have opposite signs and
hence, since $x$ and $y$ also have opposite signs,
\[
(q_n\xi - p_n)\,x \quad\text{and}\quad (q_{n+1}\xi - p_{n+1})\,y
\]
have the same sign. Therefore, in (8.33), we have
\[
|q\xi - p| = |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y|.
\]
Step 3: We now prove our result. Since $x \ne 0$, we have $|x| \ge 1$ (because $x$ is
an integer), so
\[
|q_n\xi - p_n| \le |q_n\xi - p_n|\,|x| \le |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y| = |q\xi - p|,
\]
and we have proved that $|q_n\xi - p_n| \le |q\xi - p|$ just as we set out to do.
Now assume that we have an equality: $|q_n\xi - p_n| = |q\xi - p|$. Then we have
\[
\begin{aligned}
&|q_n\xi - p_n| = |q_n\xi - p_n|\,|x| + |q_{n+1}\xi - p_{n+1}|\,|y| \\
&\implies |q_n\xi - p_n|\,(|x| - 1) + |q_{n+1}\xi - p_{n+1}|\,|y| = 0.
\end{aligned}
\]
Since x and y are both nonzero integers, we have in particular, |x| − 1 ≥ 0 and
|y| > 0. Therefore,
\[
|q_n\xi - p_n|\,(|x| - 1) + |q_{n+1}\xi - p_{n+1}|\,|y| = 0 \iff \xi = \frac{p_{n+1}}{q_{n+1}} \text{ and } |x| = 1.
\]
If x = +1, then y < 0 (because x and y have opposite signs) so y ≤ −1 since y is
an integer and hence by the second equation in (8.32), we have
q = qn x + qn+1 y = qn + qn+1 y ≤ qn − qn+1 ≤ 0,
because qn ≤ qn+1 . This is impossible since q > 0 by assumption. Hence, x = −1.
In this case y > 0 and hence y ≥ 1. If y ≥ 2, then
q = qn x + qn+1 y = −qn + qn+1 y ≥ −qn + 2qn+1 = qn+1 + (qn+1 − qn ) ≥ qn+1 ,
which contradicts the fact that q < qn+1 . Therefore, y = 1. In conclusion, we have
seen that |qn ξ − pn | = |qξ − p| if and only if ξ = pn+1 /qn+1 , x = −1 and y = 1,
which by the formulas in (8.32), imply that p = pn+1 − pn and q = qn+1 − qn .
Finally, by the Wallis-Euler recurrence relations, we have
q = qn+1 − qn = an+1 qn + qn−1 − qn = (an+1 − 1)qn + qn−1 .
If n ≥ 1, then qn−1 ≥ 1 and an+1 ≥ 2. Therefore, if n ≥ 1, then q > qn and our
proof is complete.
As an easy consequence of this lemma, it follows that every convergent pn /qn
with n ≥ 1 of the canonical continued fraction expansion of a real number ξ must
be a best approximation. Indeed, if ξ = pn /qn , then automatically pn /qn is a best
approximation of $\xi$. So assume that $\xi \ne p_n/q_n$, where $n \ge 1$, and let $p/q \ne p_n/q_n$
with $1 \le q \le q_n$. Then, since $n \ge 1$, we have $q_n < q_{n+1}$, so $1 \le q < q_{n+1}$. Therefore
by Lemma 8.19,
\[
|q_n\xi - p_n| < |q\xi - p|,
\]
since the exceptional case is ruled out ($q \not> q_n$ because $q \le q_n$ by assumption).
Note that we left out $p_0/q_0$: it may not be a best approximation!
Example 8.24. Consider $\sqrt{3} = 1.73205080\ldots$. The best integer approximation
to $\sqrt{3}$ is 2. In Subsection 8.4.3 we found that $\sqrt{3} = \langle 1; \overline{1, 2}\rangle$. Thus, $p_0/q_0 = 1$, which
is not a best approximation. However, $p_1/q_1 = 1 + \frac{1}{1} = 2$ is a best approximation.
Assume now that ξ is rational. Then ξ = pn+1 /qn+1 for some n = −1, 0, 1, . . ..
We consider three cases:
Case 1: q = qn+1 : Then the assumption that p/q is a best approximation to
ξ implies that p/q = pn+1 /qn+1 (why?) so p/q is a convergent.
Case 2: $q > q_{n+1}$: In fact, this case cannot occur because
\[ |b\xi - a| = 0 \le |q\xi - p| \]
would hold for $a = p_{n+1}$ and $b = q_{n+1} < q$, contradicting that p/q is a best
approximation to ξ.
Case 3: 1 ≤ q < qn+1 : Since 1 = q0 ≤ q1 < q2 < · · · < qn+1 it follows that
there is a k such that
qk ≤ q < qk+1 .
Then by Lemma 8.19, if $p/q \ne p_k/q_k$, we have
\[ |b\xi - a| \le |q\xi - p|, \]
where $a = p_k$ and $b = q_k \le q$, contradicting that p/q is a best approximation to ξ.
Therefore, $p/q = p_k/q_k$, so p/q is a convergent of ξ.
8.5.3. Dirichlet’s approximation theorem. Using Theorem 8.20, we prove
the following famous fact.
Theorem 8.21 (Dirichlet’s approximation theorem). Amongst two consecutive
convergents $p_n/q_n$, $p_{n+1}/q_{n+1}$ with n ≥ 0 of the canonical continued fraction
expansion to a real number (rational or irrational) ξ, one of them satisfies
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{2q^2}. \tag{8.34} \]
Conversely, if a rational number p/q satisfies (8.34), then it is a convergent.
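As a numerical illustration of the first half of the theorem (not a proof), the sketch below tests inequality (8.34) on the first six convergents of √3. The function name `satisfies_8_34` is hypothetical, the list of convergents is typed in by hand, and floating-point arithmetic is assumed to be accurate enough at these small denominators.

```python
from math import sqrt

def satisfies_8_34(p, q, xi):
    """Test |xi - p/q| < 1/(2 q^2) in floating point."""
    return abs(xi - p / q) < 1 / (2 * q * q)

xi = sqrt(3)
# First convergents of sqrt(3) = <1; 1, 2, 1, 2, ...>
convs = [(1, 1), (2, 1), (5, 3), (7, 4), (19, 11), (26, 15)]
flags = [satisfies_8_34(p, q, xi) for p, q in convs]
print(flags)

# In every consecutive pair at least one convergent passes the test.
pairs_ok = all(flags[i] or flags[i + 1] for i in range(len(flags) - 1))
print(pairs_ok)
```

For √3 the convergents alternate between failing and passing, so the statement "one of them" cannot be strengthened to "both of them".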
Proof. We begin by proving that a rational number satisfying (8.34) must be
a convergent, then we show that convergents satisfy (8.34).
Step 1: Assume that p/q satisfies (8.34). To prove that it must be a convergent,
we just need to show that it is a best approximation. To this end, assume that
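$a/b \ne p/q$ with b > 0 and that
\[ |b\xi - a| \le |q\xi - p|; \]
we must show that q < b. To prove this, we note that (8.34) implies that
\[ \left| \xi - \frac{a}{b} \right| = \frac{1}{b}\,|b\xi - a| \le \frac{1}{b}\,|q\xi - p| < \frac{1}{b} \cdot \frac{1}{2q} = \frac{1}{2bq}. \]
This inequality plus (8.34) give
\[ \left| \frac{aq - bp}{bq} \right| = \left| \frac{a}{b} - \frac{p}{q} \right| = \left| \frac{a}{b} - \xi + \xi - \frac{p}{q} \right| \le \left| \frac{a}{b} - \xi \right| + \left| \xi - \frac{p}{q} \right| < \frac{1}{2bq} + \frac{1}{2q^2}. \]
Since $a/b \ne p/q$, |aq − bp| is a positive integer, that is, 1 ≤ |aq − bp|, therefore
\[ \frac{1}{bq} < \frac{1}{2bq} + \frac{1}{2q^2} \implies \frac{1}{2bq} < \frac{1}{2q^2} \implies \frac{1}{b} < \frac{1}{q} \implies q < b, \]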
just as we wanted to show. We now show that one of two consecutive convergents
satisfies (8.34). Let pn /qn and pn+1 /qn+1 , n ≥ 0, be two consecutive convergents.
Step 2: Assume first that qn = qn+1 . Since qn+1 = an+1 qn + qn−1 we see
that qn = qn+1 if and only if n = 0 (because qn−1 = 0 if and only if n = 0) and
4 An equation means nothing to me unless it expresses a thought of God. Srinivasa Ramanujan (1887–1920).
8.6. F CONTINUED FRACTIONS AND CALENDARS, AND MATH AND MUSIC 433
See Problem 1 for Persian calendars and their link to continued fractions. Let us
analyze these calendars more thoroughly. First, the ancient calendar consisting of
365 days. Since a true year is approximately 365.24219 days, an ancient year has
0.24219 fewer days than a true year.
Thus, after 4 years, with an ancient calendar you’ll lose approximately
4 × .24219 = 0.9687 days ≈ 1 day.
After 125 years, with an ancient calendar you’ll lose approximately
125 × .24219 = 30.27375 days ≈ 1 month.
So, instead of having spring around March 21, you’ll have it in February! After 500
years, with an ancient calendar you’ll lose approximately
500 × .24219 = 121.095 days ≈ 4 months.
So, instead of having spring around March 21, you’ll have it in November! As you
can see, this is getting quite ridiculous.
In the Julian calendar, there are an average of $365\tfrac{1}{4}$ days in a Julian year. The
fraction $\tfrac{1}{4}$ is played out as we all know: We add one day to the ancient calendar
every four years giving us a “leap year”, that is, a year with 366 days. Thus, just
as we said, a Julian calendar year gives the estimate
\[ \frac{4 \times 365 + 1 \ \text{days}}{4 \ \text{years}} = 365\tfrac{1}{4}\ \frac{\text{days}}{\text{year}}. \]
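The Julian fraction 1/4 is the first nontrivial convergent of the year length. A minimal Python sketch, assuming the value 365.24219 quoted above is taken as an exact decimal (the function names are illustrative and not from the text):

```python
from fractions import Fraction

def continued_fraction(x):
    """Partial quotients of a rational number x (the expansion is finite)."""
    quotients = []
    while True:
        a = x.numerator // x.denominator   # floor of x
        quotients.append(a)
        x -= a
        if x == 0:
            return quotients
        x = 1 / x

def convergents(quotients):
    """Convergents as Fractions, via the Wallis-Euler recurrences."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

year = Fraction("365.24219")               # days in a true year, as quoted in the text
quotients = continued_fraction(year)
leap_fractions = [c - 365 for c in convergents(quotients)[:5]]
print(quotients)
print(leap_fractions)
```

The fractional parts of the first convergents come out as 0 (the ancient calendar), 1/4 (the Julian calendar), 7/29, 8/33 and 31/128; the value 8/33 is the one attributed to Khayyam in Exercise 1 of this section.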
434 8. INFINITE CONTINUED FRACTIONS
[Figure 8.1. A piano keyboard; the white keys shown are labeled f0, f2, f4, f5, f7, f9, f11, f12, f14, f16, . . . etc.]
8.6.2. Pianos. We now move from calendars to pianos. For more on the
interaction between continued fractions and pianos, see [62], [134], [15], [89], [93],
[9], [197]. Let’s start by giving a short lesson on music based on Euler’s letter to
a German princess [39] (see also [105]). When, say a piano wire or guitar string
vibrates, it causes the air molecules around it to vibrate and these air molecules
cause neighboring molecules to vibrate and finally, these molecules bounce against
our ears, and we have the sensation of “sound”. The rapidness of the vibrations,
in number of vibrations per second, is called frequency. Let’s say that we hear
two notes with two different frequencies. In general, these frequencies mix together
and don’t produce a pleasing sound, but according to Euler, when the ratio of their
frequencies happens to equal certain ratios of integers, then we hear a pleasant
sound!5 Fascinating isn’t it? We’ll call the ratio of the frequencies an interval
between the notes or the frequencies. For example, consider two notes, one with
frequency f1 and the other with frequency f2 such that
\[ \frac{f_2}{f_1} = \frac{2}{1} \iff f_2 = 2 f_1 \quad \text{(octave)}; \]
in other words, the interval between the first and second note is 2, which is to say,
f2 is just twice f1 . This special interval is called an octave. It turns out that
when two notes an octave apart are played at the same time, they sound beautiful
together! Another interval that corresponds to a beautiful sound is called the
fifth, which is when the ratio is 3/2:
\[ \frac{f_2}{f_1} = \frac{3}{2} \iff f_2 = \frac{3}{2} f_1 \quad \text{(fifth)}. \]
Other intervals (which remember just refer to ratios) that have names are
4/3 (fourth) 9/8 (major tone) 25/24 (chromatic semitone),
5/4 (major third) 10/9 (lesser tone) 81/80 (comma of Didymus),
6/5 (minor third) 16/15 (diatonic semitone).
However, it is probably of universal agreement that the octave and the fifth make
the prettiest sounds. Ratios such as 7/6, 8/7, 11/10, 12/11, . . . don’t seem to agree
with our ears.
Now let’s take a quick look at two facts concerning the piano. We all know
what a piano keyboard looks like; see Figure 8.1. Let us label the (fundamental)
frequencies of the piano keys, counting both white and black, by f0 , f1 , f2 , f3 , . . .
5Musica est exercitium arithmeticae occultum nescientis se numerare animi The pleasure
we obtain from music comes from counting, but counting unconsciously. Music is nothing but
unconscious arithmetic. From a letter to Goldbach, 27 April 1712, quoted in [193].
starting from the far left key on the keyboard.6 The first fact is that keys which
are twelve keys apart are exactly an octave apart! For instance, f0 and, jumping
twelve keys to the right, f12 are an octave apart, f7 and f19 are an octave apart,
etc. For this reason, a piano scale really has just twelve basic frequencies, say
f0 , . . . , f11 , since by doubling these frequencies we get the twelve frequencies above,
f12 , . . . , f23 , and by doubling these we get f24 , . . . , f35 , etc. The second fact is that
a piano is evenly tempered, which means that the interval between adjacent
keys is constant. Let this constant be c. Then,
\[ \frac{f_{n+1}}{f_n} = c \implies f_{n+1} = c f_n \]
for all n. In particular,
\[ f_{n+k} = c f_{n+k-1} = c (c f_{n+k-2}) = c^2 f_{n+k-2} = \cdots = c^k f_n. \tag{8.37} \]
Since $f_{n+12} = 2 f_n$ (because $f_n$ and $f_{n+12}$ are an octave apart), it follows that with
k = 12, we get
\[ 2 f_n = c^{12} f_n \implies 2 = c^{12} \implies c = 2^{1/12}. \]
Thus, the interval between adjacent keys is $2^{1/12}$.
A question that might come to mind is: What is so special about the number
twelve for a piano scale? Why not eleven or fifteen? Answer: It has to do with
continued fractions! To see why, let us imagine that we have an evenly tempered
piano with q basic frequencies, that is, keys that are q apart have frequencies
differing by an octave. Question: Which q’s make the best pianos? (Note: We
better come up with q = 12 as one of the “best” ones!) By a very similar argument
as we did above, we can see that the interval between adjacent keys is $2^{1/q}$. Now
we have to ask: What makes a good piano? Well, our piano by design has octaves,
but we would also like our piano to have fifths, the other beautiful interval. Let us
label the keys of our piano as in Figure 8.1. Then we would like to have a p such
that the interval between any frequency $f_n$ and $f_{n+p}$ is a fifth, that is,
\[ \frac{f_{n+p}}{f_n} = \frac{3}{2}. \]
By the formula (8.37), which we can use in the present set-up as long as we put
$c = 2^{1/q}$, we have $f_{n+p} = (2^{1/q})^p f_n = 2^{p/q} f_n$. Thus, we want
\[ 2^{p/q} = \frac{3}{2} \implies \frac{p}{q} = \frac{\log(3/2)}{\log 2}. \]
This is, unfortunately, impossible because p/q is rational yet $\frac{\log(3/2)}{\log 2}$ is irrational
(cf. Subsection 2.6.5)! Thus, it is impossible for our piano (even if q = 12 like our
everyday piano) to have a fifth. However, hope is not lost because although our
piano can never have a perfect fifth, it can certainly have an approximate fifth: We
just need to find rational approximations to the irrational number $\frac{\log(3/2)}{\log 2}$. This
we know how to do using continued fractions. One can show that
\[ \frac{\log(3/2)}{\log 2} = \langle 0; 1, 1, 2, 2, 3, 1, \ldots \rangle, \]
6A piano wire also gives off overtones but we focus here just on the fundamental frequency.
Also, some of what we say here is not quite true for the keys near the ends of the keyboard because
they don’t vibrate well due to their stiffness leading to the phenomenon called inharmonicity.
8.7. THE ELEMENTARY FUNCTIONS AND THE IRRATIONALITY OF ep/q 437
\[ 0,\ \frac{1}{1},\ \frac{1}{2},\ \frac{3}{5},\ \frac{7}{12},\ \frac{24}{41},\ \frac{31}{53},\ \frac{179}{306},\ \ldots. \]
Lo and behold, we see a twelve! In particular, by the best approximation theorem
(Theorem 8.20), we know that 7/12 approximates $\frac{\log(3/2)}{\log 2}$ better than any rational
number with a smaller denominator than twelve, which is to say, we cannot
find a piano scale with fewer than twelve basic keys that will give a better
approximation to a fifth. This is why our everyday piano has twelve keys per octave! In summary,
1, 2, 5, 12, 41, 53, 306, . . . are the q’s that make the “best” pianos. What about the
other numbers in this list? Supposedly [134], in 40 B.C. King-Fang, a scholar of the
Han dynasty, found the fraction 24/41, although to my knowledge, there has never
been an instrument built with a scale of q = 41; however, King-Fang also found
the fraction 31/53, and in this case, the q = 53 scale was advocated by Nicholas
Mercator (c. 1620–1687) circa 1650 and was actually implemented by Robert Halford
Macdowall Bosanquet (1841–1912) in his instrument Enharmonic Harmonium [34]!
We have focused on the interval of a fifth. What about other intervals? ... see
Problem 2.
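The appearance of the denominator twelve can be reproduced numerically. The following Python sketch is an illustration only: it expands log(3/2)/log 2 in floating point, so only the first several partial quotients can be trusted, and the helper names are not from the text.

```python
from math import floor, log

def partial_quotients(x, count):
    """First `count` partial quotients of x, computed in floating point."""
    quotients = []
    for _ in range(count):
        a = floor(x)
        quotients.append(a)
        x = 1 / (x - a)
    return quotients

def convergents(quotients):
    """Convergents (p, q) via the Wallis-Euler recurrences."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

target = log(3 / 2) / log(2)
quotients = partial_quotients(target, 8)
convs = convergents(quotients)
print(quotients)
print(convs)

# On a q = 12 piano, seven keys up is the approximate fifth 2^(7/12).
approx_fifth = 2 ** (7 / 12)
print(approx_fifth)
```

The convergent 7/12 says that seven keys on a twelve-key scale give the interval 2^(7/12), which is about 1.4983 and so falls just short of a perfect fifth 3/2.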
Exercises 8.6.
1. (Persian calendar) As of 2000, the modern calendar in Iran and Afghanistan has an
average of $365\tfrac{683}{2820}$ days per year. The Persian calendar introduced by Omar Khayyam
(1048–1131) had an average of $365\tfrac{8}{33}$ days per year. Khayyam amazingly
calculated the year to be 365.24219858156 days. Find the continued fraction expansion
of 365.24219858156 and if $\{c_n\}$ are its convergents, show that $c_0$ is the ancient
calendar, $c_1$ is the Julian calendar, $c_3$ is the calendar introduced by Khayyam, and $c_7$ is the
modern Persian calendar!
2. Find the q’s that will make a piano with the “best” approximations to a minor third.
(Just as we found the q’s that will make a piano with the “best” approximations to
a fifth.) Do you see why many musicians, e.g. Aristoxenus, Kornerup, Ariel, Yasser, who
enjoyed minor thirds, liked q = 19 musical scales?
3. (A solar system model) Christiaan Huygens (1629–1695) made a model scale of the
solar system. In his day, it was thought that it took Saturn 29.43 years to make it once
around the sun; that is,
\[ \frac{\text{period of Saturn}}{\text{period of Earth}} = 29.43. \]
To make a realistic model of the solar system, Huygens needed to make gears for the
model Saturn and the model Earth whose number of teeth had a ratio close to 29.43.
Find the continued fraction expansion of 29.43 and see why Huygens chose the number
of teeth to be 206 and 7, respectively. For more on the use of continued fractions to
solve gear problems, see [147].
\[ \coth x = \frac{1}{x} + \cfrac{x}{3 + \cfrac{x^2}{5 + \cfrac{x^2}{7 + \cfrac{x^2}{9 + \ddots}}}}. \]
Proof. With z = x > 0, we have F (a, x) > 0 for any a > 0 by definition of
the hypergeometric function. In particular, for a > 0, F (a + 1, x) > 0, so we can
divide by this in Proposition 8.23, obtaining the recurrence relation
\[ \frac{F(a, x)}{F(a + 1, x)} = 1 + \frac{x}{a(a + 1)}\,\frac{F(a + 2, x)}{F(a + 1, x)}, \]
is irrational. It follows that for x rational, e2x must be irrational too, for otherwise
coth x would be rational contrary to assumption. Replacing x with x/2 and calling
this r, we get the following neat corollary.
Theorem 8.25. $e^r$ is irrational for any nonzero rational r.
By the way, as Johann Heinrich Lambert (1728–1777) originally did back in
1761 [36, p. 463], you can use continued fractions to prove that π is irrational; see
[127], [154]. As another easy corollary, we can get the continued fraction expansion
for tanh x. To do so, multiply the continued fraction for coth x by x:
\[ x \coth x = b, \quad \text{where} \quad b = 1 + \frac{x^2}{3 +}\,\frac{x^2}{5 +}\,\frac{x^2}{7 +}\,\frac{x^2}{9 +} \cdots. \]
Thus, $\tanh x = \frac{x}{b}$, or replacing b with its continued fraction, we get
\[ \tanh x = \cfrac{x}{1 + \cfrac{x^2}{3 + \cfrac{x^2}{5 + \cfrac{x^2}{7 + \ddots}}}}. \]
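The tanh expansion converges quickly, which is easy to see numerically. This is a hedged sketch rather than anything from the text: the function `tanh_cf` and its `depth` parameter are hypothetical names, and the fraction is simply truncated after `depth` odd denominators and evaluated from the bottom up.

```python
from math import tanh

def tanh_cf(x, depth=20):
    """Evaluate x / (1 + x^2/(3 + x^2/(5 + ...))) truncated at the odd number 2*depth - 1."""
    tail = 2 * depth - 1                  # innermost denominator that is kept
    for k in range(depth - 1, 0, -1):     # work outward: ..., 5, 3, 1
        tail = (2 * k - 1) + x * x / tail
    return x / tail

for x in (0.5, 1.0, 2.0):
    print(x, tanh_cf(x), tanh(x))
```

Evaluating from the innermost term outward avoids recomputing convergents and mirrors how the nested fraction is written.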
We derive one more beautiful expression that we’ll need later. As before, we have
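\[ \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{e^{2x} + 1}{e^{2x} - 1} = \frac{1}{x} + \frac{x}{3 +}\,\frac{x^2}{5 +}\,\frac{x^2}{7 +}\,\frac{x^2}{9 +} \cdots. \]
Replacing x with 1/x, we obtain
\[ \frac{e^{2/x} + 1}{e^{2/x} - 1} = x + \frac{1/x}{3 +}\,\frac{1/x^2}{5 +}\,\frac{1/x^2}{7 +}\,\frac{1/x^2}{9 +} \cdots. \]
Finally, using the now familiar transformation rule, after a little algebra we get
\[ \frac{e^{2/x} + 1}{e^{2/x} - 1} = x + \cfrac{1}{3x + \cfrac{1}{5x + \cfrac{1}{7x + \ddots}}}. \tag{8.39} \]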
8.7.3. Continued fraction expansion of the exponential. We can now
get the famous continued fraction expansion for $e^x$, which was first discovered by
(as you might have guessed) Euler. To start, we observe that
\[ \coth(x/2) = \frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}} = \frac{1 + e^{-x}}{1 - e^{-x}} \implies e^{-x} = \frac{\coth(x/2) - 1}{1 + \coth(x/2)}, \]
where we solved the equation on the left for $e^{-x}$. Thus,
\[ e^{-x} = \frac{\coth(x/2) - 1}{1 + \coth(x/2)} = \frac{1 + \coth(x/2) - 2}{1 + \coth(x/2)} = 1 - \frac{2}{1 + \coth(x/2)}, \]
so taking reciprocals, we get
\[ e^x = \cfrac{1}{1 - \cfrac{2}{1 + \coth(x/2)}}, \]
\[ e^x = \cfrac{1}{1 - \cfrac{2x}{x + 2 + \cfrac{x^2}{6 + \cfrac{x^2}{10 + \cfrac{x^2}{14 + \ddots}}}}}. \]
In particular, if we let x = 1, we obtain
\[ e = \cfrac{1}{1 - \cfrac{2}{3 + \cfrac{1}{6 + \cfrac{1}{10 + \cfrac{1}{14 + \ddots}}}}}. \]
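A quick numerical check of the expansion for e^x, offered as a sketch and not as part of the derivation: the function name `exp_cf` and the truncation parameter `depth` are assumptions, and the inner denominators 6, 10, 14, . . . are generated as 4k + 2.

```python
from math import exp

def exp_cf(x, depth=20):
    """Evaluate 1 / (1 - 2x/(x + 2 + x^2/(6 + x^2/(10 + ...)))) with `depth` inner terms."""
    tail = 0.0
    for k in range(depth, 0, -1):         # denominators 4k + 2 = ..., 14, 10, 6
        tail = x * x / (4 * k + 2 + tail)
    return 1 / (1 - 2 * x / (x + 2 + tail))

print(exp_cf(1.0), exp(1.0))
print(exp_cf(0.5), exp(0.5))
```

Setting x = 1 reproduces the displayed fraction for e with denominators 3, 6, 10, 14, and so on.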
Although beautiful, we can get an even more beautiful continued fraction expansion
for e, which is a simple continued fraction.
This is true, and it was first proved by (as you might have guessed) Euler. Here,
a0 = 2 , a1 = 1 , a2 = 2 , a3 = 1 , a4 = 1 , a5 = 4 , a6 = 1 , a7 = 1,
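\[ 2 = 1 + \cfrac{1}{0 + \cfrac{1}{1}}, \]
we can write (8.40) in a prettier way that shows the full pattern:
\[ e = 1 + \cfrac{1}{0 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4 + \ddots}}}}}}}. \tag{8.42} \]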
To prove this incredible formula, denote the convergents of the right-hand con-
tinued fraction in (8.40) by rk /sk . Since we have such simple relations a3n−1 = 2n
and a3n = a3n+1 = 1 for all n ∈ N, one might think that it is quite easy to compute
formulas for r3n+1 and s3n+1 , and this thought is indeed the case.
Proof. Both formulas are proved in similar ways, so we shall focus on the
formula for r3n+1 . First, we apply our Wallis-Euler recursive formulas:
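\[ r_{3n+1} = r_{3n} + r_{3n-1} = (r_{3n-1} + r_{3n-2}) + r_{3n-1} = 2 r_{3n-1} + r_{3n-2}. \]
Again applying the Wallis-Euler recursive formula on the last term, we get
\[ \begin{aligned} r_{3n+1} &= \big(2(2n) + 1\big) r_{3n-2} + r_{3n-3} + (r_{3n-4} + r_{3n-5}) \\ &= \big(2(2n) + 1\big) r_{3n-2} + (r_{3n-3} + r_{3n-4}) + r_{3n-5}. \end{aligned} \]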
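Since $r_{3n-2} = r_{3n-3} + r_{3n-4}$ by our Wallis-Euler recursive formulas, we finally get
\[ \begin{aligned} r_{3n+1} &= \big(2(2n) + 1\big) r_{3n-2} + r_{3n-2} + r_{3n-5} \\ &= \big(2(2n) + 2\big) r_{3n-2} + r_{3n-5} \\ &= 2\big((2n) + 1\big) r_{3(n-1)+1} + r_{3(n-2)+1}. \end{aligned} \]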
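The three-step recurrence just derived can be spot-checked on the convergents of e. The sketch below assumes the partial quotients a_0 = 2, a_{3n−1} = 2n and a_{3n} = a_{3n+1} = 1 stated earlier; the helper names are illustrative, and the check only covers the indices n = 2, 3, 4.

```python
from math import e

def e_partial_quotients(count):
    """First `count` partial quotients of e = <2; 1, 2, 1, 1, 4, 1, 1, 6, ...>."""
    # a_k = 2n when k = 3n - 1, and a_k = 1 otherwise (for k >= 1).
    return [2] + [2 * (k + 1) // 3 if k % 3 == 2 else 1 for k in range(1, count)]

def convergents(quotients):
    """Convergents (r_k, s_k) via the Wallis-Euler recurrences."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result

quotients = e_partial_quotients(14)
conv = convergents(quotients)
r = [p for p, _ in conv]
s = [q for _, q in conv]

# r_{3n+1} = 2(2n + 1) r_{3(n-1)+1} + r_{3(n-2)+1}, checked for n = 2, 3, 4
recurrence_holds = all(
    r[3 * n + 1] == 2 * (2 * n + 1) * r[3 * n - 2] + r[3 * n - 5]
    for n in range(2, 5)
)
print(r)
print(recurrence_holds, r[-1] / s[-1], e)
```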
which are similar to the relations in our lemma! Thus, it is not surprising in one
bit that the r3n+1 ’s and s3n+1 ’s are related to the pn ’s and qn ’s. The exact relation
is given in the following lemma.
Proof. As with the previous lemma, we shall only prove the formula for r3n+1 .
We proceed by induction: First, for n = 0, we have
r1 := a0 a1 + 1 = 2 · 1 + 1 = 3,
p1 := α0 α1 + 1 = 2 · 6 + 1 = 13 , q1 := α1 = 6,
so r3·1+1 = p1 + q1 .
Assume now that r3k+1 = pk + qk for all 0 ≤ k ≤ n − 1 where n ≥ 2; we shall
prove that it holds for k = n (this is an example of “strong induction”; see Section
2.2). But, by Lemma 8.27 and the induction hypothesis, we have
See [173] for another proof of this formula based on a proof by Charles Hermite
(1822–1901). In the problems, we derive, along with other things, the following
beautiful continued fraction for cot x:
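\[ \cot x = \frac{1}{x} - \cfrac{x}{3 - \cfrac{x^2}{5 - \cfrac{x^2}{7 - \cfrac{x^2}{9 - \ddots}}}}. \tag{8.44} \]
From this continued fraction, we can derive the beautiful companion result for tan x:
\[ \tan x = \cfrac{x}{1 - \cfrac{x^2}{3 - \cfrac{x^2}{5 - \cfrac{x^2}{7 - \ddots}}}}. \]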
Exercises 8.7.
1. For all n = 1, 2, . . ., let an > 0, bn ≥ 0, with an ≥ bn + 1. We shall prove that the
following continued fraction converges:
\[ \frac{b_1}{a_1 +}\,\frac{-b_2}{a_2 +}\,\frac{-b_3}{a_3 +}\,\frac{-b_4}{a_4 +} \cdots. \tag{8.45} \]
Note that for the continued fraction we are studying, a0 = 0. Replacing bn with −bn
with n ≥ 2 in the Wallis-Euler recurrence relations (8.16) and (8.17) we get
pn = an pn−1 − bn pn−2 , qn = an qn−1 − bn qn−2 , n = 2, 3, 4, . . .
p0 = 0 , p1 = b1 , q 0 = 1 , q 1 = a1 .
(i) Prove (via induction for instance) that qn ≥ qn−1 for all n = 1, 2, . . .. In partic-
ular, since q0 = 1, we have qn ≥ 1 for all n, so the convergents cn = pn /qn of
(8.45) are defined.
(ii) Verify that q1 − p1 ≥ 1 = q0 − p0 . Now prove by induction that qn − pn ≥
qn−1 − pn−1 for all n = 1, 2, . . .. In particular, since q0 − p0 = 1, we have
qn − pn ≥ 1 for all n. Dividing by qn , conclude that 0 ≤ cn ≤ 1 for all n = 1, 2, . . ..
(iii) Using the fundamental recurrence relations for cn −cn−1 , prove that cn −cn−1 ≥ 0
for all n = 1, 2, . . .. Combining this with (ii) shows that 0 ≤ c1 ≤ c2 ≤ c3 ≤ · · · ≤
1; that is, {cn } is a bounded monotone sequence and hence converges. Thus, the
continued fraction (8.45) converges.
2. For all n = 1, 2, . . ., let an > 0, bn ≥ 0, with an ≥ bn + 1. From the previous problem,
it follows that given any a0 ∈ R, the continued fraction
\[ a_0 - \frac{b_1}{a_1 +}\,\frac{-b_2}{a_2 +}\,\frac{-b_3}{a_3 +}\,\frac{-b_4}{a_4 +} \cdots \]
converges. We now prove a variant of the continued fraction convergence theorem
(Theorem 8.14): Let ξ0 , ξ1 , ξ2 , . . . be any sequence of real numbers with ξn > 0 for
n ≥ 1 and suppose that these numbers are related by
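\[ \xi_n = a_n + \frac{-b_{n+1}}{\xi_{n+1}}, \qquad n = 0, 1, 2, \ldots. \]
Then $\xi_0$ is equal to the continued fraction
\[ \xi_0 = a_0 - \frac{b_1}{a_1 +}\,\frac{-b_2}{a_2 +}\,\frac{-b_3}{a_3 +}\,\frac{-b_4}{a_4 +}\,\frac{-b_5}{a_5 +} \cdots. \]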
Prove this statement following (almost verbatim!) the proof of Theorem 8.14.
3. We are now ready to derive the beautiful cotangent continued fraction (8.44).
(i) Let a > 0. Then as we derived the identity (8.38) found in Theorem 8.24, prove
that if we define
\[ \eta_n(a, x) := \frac{(a + n) F(a + n, -x)}{F(a + n + 1, -x)}, \quad a_n = a + n, \quad b_n = x, \quad n = 0, 1, 2, \ldots, \]
then
\[ \eta_n(a, x) = a_n + \frac{-b_{n+1}}{\eta_{n+1}(a, x)}, \qquad n = 0, 1, 2, 3, \ldots. \]
(ii) Using Problem 2, prove that for x ≥ 0 sufficiently small, we have
\[ \frac{a F(a, -x)}{F(a + 1, -x)} = \eta_0(a, x) = a - \frac{x}{a + 1 +}\,\frac{-x}{a + 2 +}\,\frac{-x}{a + 3 +}\,\frac{-x}{a + 4 +}\,\frac{-x}{a + 5 +} \cdots. \tag{8.46} \]
(iii) Prove that (cf. the proof of Proposition 8.22)
\[ F\!\left(\frac{1}{2}, -\frac{x^2}{4}\right) = \cos x, \qquad x\,F\!\left(\frac{3}{2}, -\frac{x^2}{4}\right) = \sin x. \]
(iv) Now put a = 1/2 and replace x with $x^2/4$ in (8.46) to derive the beautiful
cotangent expansion (8.44). Finally, relax and contemplate this fine formula!
4. (Irrationality of log r) Using Theorem 8.25, prove that if r > 0 is rational with r ≠ 1,
then log r is irrational. In particular, one of our favorite constants, log 2, is irrational.
Notice that the above repeating continued fractions are continued fractions for
expressions with square roots.
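\[ \eta = 2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cdots}}} \implies \eta = 2 + \cfrac{1}{1 + \cfrac{1}{\eta}}. \]
Solving for η we get a quadratic equation, and solving it, we find that $\eta = 1 + \sqrt{3}$.
Hence,
\[ \xi = 3 + \frac{1}{\eta} = 3 + \frac{1}{1 + \sqrt{3}} = 3 + \frac{\sqrt{3} - 1}{2} = \frac{5 + \sqrt{3}}{2}, \]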
yet another square root expression.
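As a sanity check on this computation, the sketch below expands η = 1 + √3 and ξ = 3 + 1/η numerically. It is only an illustration: `partial_quotients` is a hypothetical helper, and because it works in floating point only the first handful of partial quotients are reliable.

```python
from math import floor, sqrt

def partial_quotients(x, count):
    """First `count` partial quotients of x, computed in floating point."""
    quotients = []
    for _ in range(count):
        a = floor(x)
        quotients.append(a)
        x = 1 / (x - a)
    return quotients

eta = 1 + sqrt(3)
xi = 3 + 1 / eta
eta_quotients = partial_quotients(eta, 8)
xi_quotients = partial_quotients(xi, 8)
print(eta_quotients)
print(xi_quotients)
print(xi, (5 + sqrt(3)) / 2)
```

The output shows the repeating blocks 2, 1 for η and 3 followed by 2, 1 repeating for ξ, and confirms numerically that ξ agrees with (5 + √3)/2.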
where the bar denotes that the block of numbers b0 , b1 , . . . , bm−1 repeats forever.
Such a continued fraction is said to be periodic. When writing a continued fraction
in this way we assume that there is no shorter repeating block and that the repeating
block cannot start at an earlier position. For example, we would never write
we simply write it as ⟨2; 1, 2, 4, 3⟩. The integer m is called the period of the simple
continued fraction. An equivalent way to define a periodic continued fraction is as
an infinite simple continued fraction ξ = ⟨a0 ; a1 , a2 , . . .⟩ such that for some m and
ℓ, we have
The examples above suggest that infinite periodic simple continued fractions are
intimately related to expressions with square roots; in fact, these expressions are
called quadratic irrationals as we shall see in a moment.
closed under taking reciprocals, observe that if $\alpha = a + b\sqrt{d} \in \mathbb{Q}[\sqrt{d}]$ is not zero,
then
\[ \frac{1}{\alpha} = \frac{1}{a + b\sqrt{d}} \cdot \frac{a - b\sqrt{d}}{a - b\sqrt{d}} = \frac{a - b\sqrt{d}}{a^2 - b^2 d} = \frac{a}{a^2 - b^2 d} - \frac{b}{a^2 - b^2 d}\,\sqrt{d}. \]
Note that $a^2 - b^2 d \ne 0$ since being zero would imply that $\sqrt{d} = |a/b|$, a rational
number, which by assumption is false. Similarly, one can show that $\mathbb{Q}[\sqrt{d}]$ satisfies
all the other properties of a field.
Finally, we need to prove that conjugation preserves the algebraic properties.
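For example, let’s prove the equality $\overline{\alpha \cdot \beta} = \overline{\alpha} \cdot \overline{\beta}$, leaving the other properties to
you. If $\alpha = a + b\sqrt{d}$ and $\beta = a' + b'\sqrt{d}$, then according to (8.50), we have
\[ \overline{\alpha\beta} = aa' + bb'd - (ab' + a'b)\sqrt{d}. \]
On the other hand,
\[ \overline{\alpha}\,\overline{\beta} = (a - b\sqrt{d})(a' - b'\sqrt{d}) = aa' + bb'd - (ab' + a'b)\sqrt{d}, \]
which equals $\overline{\alpha\beta}$.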
The following theorem was first proved by Joseph-Louis Lagrange (1736–1813).
Thus,
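\[ \xi_{k+1} = \frac{\alpha_{k+1} + \sqrt{d}}{\beta_{k+1}} = \frac{\alpha_{m+k+1} + \sqrt{d}}{\beta_{m+k+1}} = \xi_{m+k+1} \implies a_{k+1} = \lfloor \xi_{k+1} \rfloor = \lfloor \xi_{m+k+1} \rfloor = a_{m+k+1}. \]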
Continuing this process by induction shows that an = am+n for all n = k, k + 1, k +
2, k + 3, . . .. Thus, by the definition of periodicity in (8.48), we see that ξ has a
periodic simple continued fraction.
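A periodic simple continued fraction is called purely periodic if it is of the
form $\xi = \langle \overline{a_0; a_1, \ldots, a_{m-1}} \rangle$.
Example 8.29. The simplest example of such a fraction is the golden ratio:
\[ \Phi = \frac{1 + \sqrt{5}}{2} = \langle \overline{1} \rangle = \langle 1; 1, 1, 1, 1, 1, \ldots \rangle. \]
Observe that Φ has the following properties:
\[ \Phi > 1 \ \text{and}\ \overline{\Phi} = \frac{1 - \sqrt{5}}{2} = -0.618\ldots \implies \Phi > 1 \ \text{and}\ -1 < \overline{\Phi} < 0. \]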
In the following theorem, Evariste Galois’7 (1811–1832) first publication (at the
age of 17), we characterize purely periodic expansions as those quadratic irrationals
having these same properties. (Don’t believe everything you read about the legendary
Galois; see [189]. See [220] for an introduction to Galois’ famous theory.)
Theorem 8.32. A quadratic irrational ξ is purely periodic if and only if
\[ \xi > 1 \quad \text{and} \quad -1 < \overline{\xi} < 0. \]
Proof. Assume that $\xi = \langle a_0; \ldots, a_{m-1}, a_0, a_1, \ldots, a_{m-1}, \ldots \rangle$ is purely periodic;
we shall prove that $\xi > 1$ and $-1 < \overline{\xi} < 0$. Recall that in general, for any
simple continued fraction $\langle b_0; b_1, b_2, \ldots \rangle$, all the $b_n$’s are positive after $b_0$. Thus, as
$a_0$ appears again (and again, and again, . . .) after the first $a_0$ in ξ, it follows that
$a_0 \ge 1$. Hence, $\xi = a_0 + \frac{1}{\xi_1} > 1$. Now applying Theorem 8.4 to $\langle a_0; \ldots, a_{m-1}, \xi \rangle$,
we get
\[ \xi = \frac{\xi p_{m-1} + p_{m-2}}{\xi q_{m-1} + q_{m-2}}, \]
where $p_n/q_n$ are the convergents for ξ. Multiplying both sides by $\xi q_{m-1} + q_{m-2}$,
we obtain
\[ \xi^2 q_{m-1} + \xi q_{m-2} = \xi p_{m-1} + p_{m-2} \implies f(\xi) = 0, \]
where $f(x) = q_{m-1} x^2 + (q_{m-2} - p_{m-1}) x - p_{m-2}$ is a quadratic polynomial. In
particular, ξ is a root of f. Taking conjugates, we see that
\[ q_{m-1} \xi^2 + (q_{m-2} - p_{m-1}) \xi - p_{m-2} = 0 \implies q_{m-1} \overline{\xi}^{\,2} + (q_{m-2} - p_{m-1}) \overline{\xi} - p_{m-2} = 0, \]
7[From the preface to his final manuscript (Evariste died from a pistol duel at the age of
20)] Since the beginning of the century, computational procedures have become so complicated that
any progress by those means has become impossible, without the elegance which modern mathe-
maticians have brought to bear on their research, and by means of which the spirit comprehends
quickly and in one step a great many computations. It is clear that elegance, so vaunted and so
aptly named, can have no other purpose. ... Go to the roots, of these calculations! Group the
operations. Classify them according to their complexities rather than their appearances! This, I
believe, is the mission of future mathematicians. This is the road on which I am embarking in
this work. Evariste Galois (1811–1832).
8.8. QUADRATIC IRRATIONALS AND PERIODIC CONTINUED FRACTIONS 453
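Let us define $\eta_0 := -1/\overline{\xi}$, $\eta_1 = -1/\overline{\xi}_{m-1}$, $\eta_2 = -1/\overline{\xi}_{m-2}$, . . . , $\eta_{m-1} = -1/\overline{\xi}_1$. Then
we can write the previous displayed equalities as
\[ \eta_0 = a_{m-1} + \frac{1}{\eta_1}, \quad \eta_1 = a_{m-2} + \frac{1}{\eta_2}, \quad \ldots, \quad \eta_{m-2} = a_1 + \frac{1}{\eta_{m-1}}, \quad \eta_{m-1} = a_0 + \frac{1}{\eta_0}; \]
in other words, $\eta_0$ is just the continued fraction:
\[ \eta_0 = \langle a_{m-1}; a_{m-2}, \ldots, a_1, a_0, \eta_0 \rangle = \langle \overline{a_{m-1}; a_{m-2}, \ldots, a_1, a_0} \rangle. \]
Since $\eta_0 = -1/\overline{\xi}$, our proof is complete.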
Recall that the continued fraction expansion for $\sqrt{d}$ has the complete quotients
$\xi_n$ and partial quotients $a_n$ determined by
\[ \xi_n = \frac{\alpha_n + \sqrt{d}}{\beta_n}, \qquad a_n = \lfloor \xi_n \rfloor, \]
where the αn , βn ’s are integers given in Theorem 8.29. We are now ready to prove
Adrien-Marie Legendre’s (1752–1833) famous result.
Theorem 8.34. The simple continued fraction of $\sqrt{d}$ has the form
\[ \sqrt{d} = \langle a_0; \overline{a_1, a_2, a_3, \ldots, a_3, a_2, a_1, 2a_0} \rangle. \]
Moreover, $\beta_n \ne -1$ for all n, and $\beta_n = +1$ if and only if n is a multiple of the
period of $\sqrt{d}$.
Proof. Starting the continued fraction algorithm for $\sqrt{d}$, we obtain $\sqrt{d} = a_0 + \frac{1}{\xi_1}$,
where $\xi_1 > 1$. Since $\frac{1}{\xi_1} = -a_0 + \sqrt{d}$, we have
\[ -\frac{1}{\overline{\xi}_1} = -\big({-a_0} - \sqrt{d}\big) = a_0 + \sqrt{d} > 1, \tag{8.53} \]
so we must have $-1 < \overline{\xi}_1 < 0$. Since both $\xi_1 > 1$ and $-1 < \overline{\xi}_1 < 0$, by Galois’
Theorem 8.32, we know that $\xi_1$ is purely periodic: $\xi_1 = \langle \overline{a_1; a_2, \ldots, a_m} \rangle$. Thus,
\[ \sqrt{d} = a_0 + \frac{1}{\xi_1} = \langle a_0; \xi_1 \rangle = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle. \]
On the other hand, from (8.53) and from Lemma 8.33, we see that
\[ \begin{aligned} \langle 2a_0; a_1, a_2, \ldots, a_m, a_1, a_2, \ldots, a_m, \ldots \rangle &= a_0 + \sqrt{d} = -\frac{1}{\overline{\xi}_1} = \langle \overline{a_m; \ldots, a_1} \rangle \\ &= \langle a_m, a_{m-1}, a_{m-2}, \ldots, a_1, a_m, a_{m-1}, a_{m-2}, \ldots, a_1, \ldots \rangle. \end{aligned} \]
Comparing the left and right-hand sides, we see that $a_m = 2a_0$, $a_{m-1} = a_1$, $a_{m-2} = a_2$,
$a_{m-3} = a_3$, and so forth, therefore,
\[ \sqrt{d} = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle = \langle a_0; \overline{a_1, a_2, a_3, \ldots, a_3, a_2, a_1, 2a_0} \rangle. \]
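The palindromic shape of the period can be observed with exact integer arithmetic. The following Python sketch assumes that the integers α_n, β_n of Theorem 8.29 (not reproduced here) satisfy the standard recurrences α_{n+1} = a_n β_n − α_n and β_{n+1} = (d − α_{n+1}²)/β_n with α_0 = 0, β_0 = 1; the function name `sqrt_cf_period` is illustrative. It stops the first time β_n = 1, which by the theorem marks the end of one period.

```python
from math import isqrt

def sqrt_cf_period(d):
    """Return (a_0, [a_1, ..., a_m]) for sqrt(d), where d is a positive non-square integer."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta, a = 0, 1, a0            # assumed starting values alpha_0, beta_0, a_0
    period = []
    while True:
        alpha = a * beta - alpha         # alpha_{n+1}
        beta = (d - alpha * alpha) // beta   # beta_{n+1}, an exact division
        a = (a0 + alpha) // beta         # a_{n+1} = floor((alpha + sqrt(d)) / beta)
        period.append(a)
        if beta == 1:                    # end of the period: a_m = 2 a_0
            return a0, period

for d in (3, 7, 29):
    print(d, sqrt_cf_period(d))
```

In each case the last entry of the period is 2a_0 and the entries before it read the same forwards and backwards, as the theorem predicts.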
We now prove that $\beta_n$ never equals −1, and $\beta_n = +1$ if and only if n is a
multiple of the period m. By the form of the continued fraction expansion of $\sqrt{d}$
we just derived, observe that for any n > 0, the n-th complete quotient $\xi_n$ for $\sqrt{d}$
is purely periodic. In particular, by Galois’ Theorem 8.32 we know that
\[ n \ge 1 \implies \xi_n > 1 \ \text{and}\ -1 < \overline{\xi}_n < 0. \tag{8.54} \]
Since αn < 0 and αn > 0 cannot possibly hold, it follows that βn = −1 is impossible.
We now prove that $\beta_n = +1$ if and only if n is a multiple of the period m.
Assume first that $\beta_n = 1$. Then $\xi_n = \alpha_n + \sqrt{d}$. By (8.54) we see that
\[ -1 < \overline{\xi}_n = \alpha_n - \sqrt{d} < 0 \implies \sqrt{d} - 1 < \alpha_n < \sqrt{d}. \]
Since $\alpha_n$ is an integer, and the only integer strictly between $\sqrt{d} - 1$ and $\sqrt{d}$ is
$a_0 = \lfloor \sqrt{d} \rfloor$, it follows that $\alpha_n = a_0$, so $\xi_n = a_0 + \sqrt{d}$. Now recalling the expansion
$\sqrt{d} = \langle a_0; \overline{a_1, a_2, \ldots, a_m} \rangle$ and the fact that $2a_0 = a_m$, it follows that
\[ \begin{aligned} a_0 + \sqrt{d} &= \langle 2a_0; a_1, a_2, \ldots, a_{m-1}, a_m, a_1, a_2, \ldots, a_{m-1}, a_m, \ldots \rangle \\ &= \langle \overline{a_m; a_1, a_2, \ldots, a_{m-1}} \rangle; \end{aligned} \tag{8.55} \]
Exercises 8.8.
1. Find the canonical continued fraction expansions for
\[ \text{(a)}\ \sqrt{29}, \qquad \text{(b)}\ \frac{1 + \sqrt{13}}{2}, \qquad \text{(c)}\ \frac{2 + \sqrt{5}}{3}. \]
2. Find the values of the following continued fractions:
(a) hni = hn; n, n, n, . . .i , (b) hn; 1i , (c) hn; n + 1i , (d) hm; ni.
[Figure 8.2: two dot diagrams, “cattle form a square” (left) and “cattle form a triangle” with rows of 1, 2, 3, 4 dots (right).]
Figure 8.2. With the dots as bulls, on the left, the number of
bulls is a square number ($4^2$ in this case) and the number of bulls
on the right is a triangular number (1 + 2 + 3 + 4 in this case).
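\[ (5)\ \ x = \left( \frac{1}{4} + \frac{1}{5} \right)(Z + z), \qquad (6)\ \ z = \left( \frac{1}{5} + \frac{1}{6} \right)(Y + y), \]
\[ (7)\ \ y = \left( \frac{1}{6} + \frac{1}{7} \right)(W + w). \]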
Now how do we interpret (8) and (9)? We will interpret (8) as meaning that
the number of white and black bulls should be a square number (a perfect square);
see the left picture in Figure 8.2. A triangular number is a number of the form
\[ 1 + 2 + 3 + 4 + \cdots + n = \frac{n(n + 1)}{2}, \]
for some n. Then we will interpret (9) as meaning that the number of yellow and
spotted bulls should be a triangular number; see the right picture in Figure 8.2.
Thus, (8) and (9) become
(8) W + X = a square number , (9) Y + Z = a triangular number.
In summary: We want to find integers W, X, Y, Z, w, x, y, z (here we assume
there is no such thing as “fractional cattle”) solving equations (1)–(9). Now to the
solution of Archimedes’ cattle problem. First of all, equations (1)–(7) are just linear
equations so these equations can be solved using simple linear algebra. Instead of
solving these equations by hand, which will probably take a few hours, it might be
best to use a computer. Doing so you will find that in order for W, X, Y, Z, w, x, y, z
to solve (1)–(7), they must be of the form
\[ \begin{aligned} W &= 10366482\,k, & X &= 7460514\,k, & Y &= 4149387\,k, & Z &= 7358060\,k, \\ w &= 7206360\,k, & x &= 4893246\,k, & y &= 5439213\,k, & z &= 3515820\,k, \end{aligned} \tag{8.56} \]
where k can equal 1, 2, 3, . . .. Thus, in order for us to fulfill conditions (1)–(7), we
would have at the very least, setting k = 1,
\[ Y + Z = 18492776362863\,m^2 + 32793026546940\,m^2 = 51285802909803\,m^2 = \frac{\ell(\ell + 1)}{2}, \]
for some integer ℓ. Multiplying both sides by 8 and adding 1, we obtain
\[ 8 \cdot 51285802909803\,m^2 + 1 = 4\ell^2 + 4\ell + 1 = (2\ell + 1)^2 = n^2, \]
where n = 2ℓ + 1. Since 8 · 51285802909803 = 410286423278424, we finally conclude
that conditions (1)–(9) are all fulfilled if we can find integers m, n satisfying the
equation
\[ n^2 - 410286423278424\,m^2 = 1. \tag{8.58} \]
This is commonly called a Pell equation and is an example of a diophantine
equation. As we’ll see in the next subsection, we can solve this equation by simply
(!) finding the simple continued fraction expansion of $\sqrt{410286423278424}$. The
calculations involved are just sheer madness, but they can be done and have been
done [19], [248]. In the end, we find that the smallest total number of cattle which
satisfy (1)–(9) is a number with 206545 digits (!) and is equal to
\[ 7760271406 \ldots (\text{206525 other digits go here}) \ldots 9455081800 \approx 8 \times 10^{206544}. \]
We are now skilled in wise calculations! A copy of this number is printed on 42
computer sheets and has been deposited in the Mathematical Tables of the journal
Mathematics of Computation if you are interested.
remark that Pell’s equation was named by Euler after John Pell (1611–1685), although
Brahmagupta8 (598–670) studied this equation a thousand years earlier [36,
p. 221]. In any case, we shall see that the continued fraction expansion of $\sqrt{d}$ plays
an important role in solving this equation. We note that if (x, y) solves (8.59), then
trivially so do (±x, ±y) because of the squares in (8.59); thus, we usually restrict
ourselves to the positive solutions.
Recall that the continued fraction expansion for $\sqrt{d}$ has the complete quotients
$\xi_n$ and partial quotients $a_n$ determined by
\[ \xi_n = \frac{\alpha_n + \sqrt{d}}{\beta_n}, \qquad a_n = \lfloor \xi_n \rfloor, \]
where $\alpha_n$ and $\beta_n$ are integers defined in Theorem 8.29. The exact forms of these
integers are not important; what is important is that $\beta_n$ never equals −1 and
$\beta_n = +1$ if and only if n is a multiple of the period of $\sqrt{d}$ as we saw in Theorem
8.34. The following lemma shows how the convergents of $\sqrt{d}$ enter Pell’s equation.
Lemma 8.35. If $p_n/q_n$ denotes the n-th convergent of $\sqrt{d}$, then for all n =
0, 1, 2, . . ., we have
\[ p_n^2 - d\,q_n^2 = (-1)^{n+1} \beta_{n+1}. \]
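The identity in Lemma 8.35 can be tested numerically before reading the proof. This sketch rests on an assumption: Theorem 8.29 is not reproduced here, so the code takes α_0 = 0, β_0 = 1 and the standard recurrences α_{n+1} = a_n β_n − α_n, β_{n+1} = (d − α_{n+1}²)/β_n as the definition of α_n and β_n. The function name `check_lemma` is illustrative.

```python
from math import isqrt

def check_lemma(d, terms=12):
    """Check p_n^2 - d q_n^2 == (-1)^(n+1) * beta_{n+1} for n = 0, ..., terms - 1."""
    a0 = isqrt(d)
    alpha, beta, a = 0, 1, a0            # assumed alpha_0, beta_0 and a_0
    p_prev, q_prev = 1, 0                # p_{-1}, q_{-1}
    p, q = a0, 1                         # p_0, q_0
    for n in range(terms):
        alpha = a * beta - alpha         # alpha_{n+1}
        beta = (d - alpha * alpha) // beta   # beta_{n+1}
        if p * p - d * q * q != (-1) ** (n + 1) * beta:
            return False
        a = (a0 + alpha) // beta         # a_{n+1}
        p, p_prev = a * p + p_prev, p    # p_{n+1}
        q, q_prev = a * q + q_prev, q    # q_{n+1}
    return True

results = {d: check_lemma(d) for d in (2, 3, 7, 29, 61)}
print(results)
```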
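Proof. Since we can write $\sqrt{d} = \langle a_0; a_1, a_2, a_3, \ldots, a_n, \xi_{n+1} \rangle$ and
$\xi_{n+1} = (\alpha_{n+1} + \sqrt{d})/\beta_{n+1}$, by (8.19) of Corollary 8.6, we have
\[ \sqrt{d} = \frac{\xi_{n+1} p_n + p_{n-1}}{\xi_{n+1} q_n + q_{n-1}} = \frac{(\alpha_{n+1} + \sqrt{d})\,p_n + \beta_{n+1} p_{n-1}}{(\alpha_{n+1} + \sqrt{d})\,q_n + \beta_{n+1} q_{n-1}}. \]
Multiplying both sides by the denominator of the right-hand side, we get
\[ \sqrt{d}\,(\alpha_{n+1} + \sqrt{d})\,q_n + \sqrt{d}\,\beta_{n+1} q_{n-1} = (\alpha_{n+1} + \sqrt{d})\,p_n + \beta_{n+1} p_{n-1} \]
\[ \implies d q_n + (\alpha_{n+1} q_n + \beta_{n+1} q_{n-1})\sqrt{d} = (\alpha_{n+1} p_n + \beta_{n+1} p_{n-1}) + p_n \sqrt{d}. \]
Equating coefficients, we obtain
\[ d q_n = \alpha_{n+1} p_n + \beta_{n+1} p_{n-1} \quad \text{and} \quad \alpha_{n+1} q_n + \beta_{n+1} q_{n-1} = p_n. \]
Multiplying the first equation by $q_n$ and the second equation by $p_n$ and equating
the $\alpha_{n+1} p_n q_n$ terms in each resulting equation, we obtain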
Next, we show that all solutions of Pell’s equation can be found via the
convergents of $\sqrt{d}$.
Theorem 8.36. Let $p_n/q_n$ denote the n-th convergent of $\sqrt{d}$ and let m be the
period of $\sqrt{d}$. Then the positive integer solutions to
\[ x^2 - d\,y^2 = 1 \]
are precisely the numerators and denominators of the odd convergents of $\sqrt{d}$ of the
form $x = p_{nm-1}$ and $y = q_{nm-1}$, where n > 0 is any positive integer for m even
and n > 0 is even for m odd.
Proof. We prove our theorem in two steps.
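Step 1: We first prove that if $x^2 - d\,y^2 = 1$ with $y > 0$, then $x/y$ is a convergent
of $\sqrt{d}$. To see this, observe that since $1 = x^2 - d\,y^2 = (x - \sqrt{d}\,y)(x + \sqrt{d}\,y)$, we
have $x - \sqrt{d}\,y = 1/(x + \sqrt{d}\,y)$, so
\[ \left| \frac{x}{y} - \sqrt{d} \right| = \frac{|x - \sqrt{d}\,y|}{y} = \frac{1}{y\,|x + \sqrt{d}\,y|}. \]
Also, $x^2 = d\,y^2 + 1 > d\,y^2$ implies $x > \sqrt{d}\,y$ (recall that $x > 0$), which implies
\[ x + \sqrt{d}\,y > \sqrt{d}\,y + \sqrt{d}\,y = 2\sqrt{d}\,y. \]
Hence,
\[ \left| \frac{x}{y} - \sqrt{d} \right| = \frac{1}{y\,|x + \sqrt{d}\,y|} < \frac{1}{y \cdot 2\sqrt{d}\,y} = \frac{1}{2y^2\sqrt{d}}
   \quad \Longrightarrow \quad \left| \frac{x}{y} - \sqrt{d} \right| < \frac{1}{2y^2}. \]
By Dirichlet’s theorem 8.21, it follows that $x/y$ must be a convergent of $\sqrt{d}$.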
Step 2: We now finish the proof. By Step 1 we already know that every
solution must be a convergent, so we only need to look for convergents $(p_k, q_k)$ that
make $p_k^2 - d\,q_k^2 = 1$. To this end, recall from Lemma 8.35 that
\[ p_{k-1}^2 - d\,q_{k-1}^2 = (-1)^k \beta_k, \]
where $\beta_k$ never equals $-1$ and $\beta_k = 1$ if and only if $k$ is a multiple of $m$, the period
of $\sqrt{d}$. In particular, $p_{k-1}^2 - d\,q_{k-1}^2 = 1$ if and only if $(-1)^k \beta_k = 1$, if and only if
$\beta_k = 1$ and $k$ is even, if and only if $k$ is a multiple of $m$ and $k$ is even. This holds
if and only if $k = mn$ where $n > 0$ is any positive integer for $m$ even and $n > 0$ is
even for $m$ odd. This completes our proof.
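As a numerical sanity check on Lemma 8.35 and on the role of $\beta_k$ in Step 2, the following Python sketch may help. It assumes the standard recurrences $\alpha_{n+1} = a_n \beta_n - \alpha_n$ and $\beta_{n+1} = (d - \alpha_{n+1}^2)/\beta_n$ with $\alpha_0 = 0$, $\beta_0 = 1$; these are taken here as an assumption about Theorem 8.29, which is not restated in this section. The function names `sqrt_cf`, `convergents`, `lemma_holds`, and `period` are illustrative only.

```python
from math import isqrt


def sqrt_cf(d, terms):
    """Return (quotients, betas): the first `terms` values of a_n and beta_n for sqrt(d).

    Assumed recurrences (not quoted from the text):
        alpha_{n+1} = a_n * beta_n - alpha_n
        beta_{n+1}  = (d - alpha_{n+1}**2) / beta_n
    starting from alpha_0 = 0, beta_0 = 1.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    alpha, beta = 0, 1
    quotients, betas = [], []
    for _ in range(terms):
        # floor((alpha + sqrt(d)) / beta) equals (alpha + a0) // beta because beta > 0
        a = (alpha + a0) // beta
        quotients.append(a)
        betas.append(beta)
        alpha = a * beta - alpha
        beta = (d - alpha * alpha) // beta
    return quotients, betas


def convergents(quotients):
    """Return the list of convergents (p_n, q_n) for the given partial quotients."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = quotients[0], 1         # p_0, q_0
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result


def lemma_holds(d, terms=30):
    """Check p_n^2 - d q_n^2 == (-1)^(n+1) * beta_{n+1} for n = 0, ..., terms - 2."""
    quotients, betas = sqrt_cf(d, terms)
    pq = convergents(quotients)
    return all(
        p * p - d * q * q == (-1) ** (n + 1) * betas[n + 1]
        for n, (p, q) in enumerate(pq[:-1])
    )


def period(d, terms=200):
    """Smallest n >= 1 with beta_n == 1, which the text identifies with the period."""
    _, betas = sqrt_cf(d, terms)
    return next(n for n in range(1, terms) if betas[n] == 1)
```

For $d = 3$ the sketch produces partial quotients 1, 1, 2, 1, 2 and $\beta$-values 1, 2, 1, 2, 1, so `period(3)` is 2, while `period(13)` is 5, an odd period.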
The fundamental solution of Pell’s equation is the “smallest” positive solution
of Pell’s equation; here, a solution $(x, y)$ is called positive if $x, y > 0$. Explicitly,
the fundamental solution is $(p_{m-1}, q_{m-1})$ for an even period $m$ of $\sqrt{d}$ or
$(p_{2m-1}, q_{2m-1})$ for an odd period $m$.
Example 8.30. Consider the equation $x^2 - 3y^2 = 1$. Since $\sqrt{3} = \langle 1; \overline{1, 2} \rangle$
has period $m = 2$, our theorem says that the positive solutions of $x^2 - 3y^2 = 1$ are
precisely $x = p_{2n-1}$ and $y = q_{2n-1}$ for all $n > 0$; that is, $(p_1, q_1), (p_3, q_3), (p_5, q_5), \ldots$.
Now the convergents of $\sqrt{3}$ are
\[
\begin{array}{c|cccccccc}
n   & 0 & 1 & 2 & 3 & 4  & 5  & 6  & 7  \\ \hline
p_n & 1 & 2 & 5 & 7 & 19 & 26 & 71 & 97 \\
q_n & 1 & 1 & 3 & 4 & 11 & 15 & 41 & 56
\end{array}
\]
In particular, the fundamental solution is $(2, 1)$ and the rest of the positive solutions
are $(7, 4), (26, 15), (97, 56), \ldots$. Just to verify a couple of entries:
\[ 2^2 - 3 \cdot 1^2 = 4 - 3 = 1 \]
and
\[ 7^2 - 3 \cdot 4^2 = 49 - 3 \cdot 16 = 49 - 48 = 1, \]
and one can continue verifying that the odd convergents give solutions.
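The search described in Theorem 8.36 can be sketched in Python as follows: walk through the convergents of $\sqrt{d}$ and keep those with $p^2 - d\,q^2 = 1$. The recurrences used for $\alpha_n$ and $\beta_n$ are the standard ones and are an assumption here (Theorem 8.29 is not restated in this section); the function name `pell_solutions` is hypothetical.

```python
from math import isqrt


def pell_solutions(d, count):
    """Return the first `count` positive solutions (x, y) of x^2 - d*y^2 = 1.

    Scans the convergents p_n/q_n of sqrt(d) in order and keeps those that
    satisfy the equation, so the first entry is the fundamental solution.
    """
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    solutions = []
    alpha, beta = 0, 1             # assumed start: xi_0 = sqrt(d)
    p2, q2 = 0, 1                  # p_{n-2}, q_{n-2}
    p1, q1 = 1, 0                  # p_{n-1}, q_{n-1}
    while len(solutions) < count:
        a = (alpha + a0) // beta   # partial quotient a_n
        p, q = a * p1 + p2, a * q1 + q2
        if p * p - d * q * q == 1:
            solutions.append((p, q))
        p2, q2, p1, q1 = p1, q1, p, q
        # assumed standard recurrences for the complete quotients
        alpha = a * beta - alpha
        beta = (d - alpha * alpha) // beta
    return solutions


if __name__ == "__main__":
    print(pell_solutions(3, 4))    # the solutions listed in Example 8.30
    print(pell_solutions(13, 1))   # odd period: the fundamental solution is (p_9, q_9)
```

With $d = 3$ this reproduces the pairs $(2, 1), (7, 4), (26, 15), (97, 56)$ from the table above; with $d = 13$, whose period is odd, the first solution found is $(649, 180)$.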
8.9. ARCHIMEDES’ CRAZY CATTLE CONUNDRUM AND DIOPHANTINE EQUATIONS 461
(iii) Thus, we may write $x_1 = 2a + 1$ and $y_1 = 2b$. Show that $p\,b^2 = a(a + 1)$. Conclude
that $p$ must divide $a$ or $a + 1$.
(iv) Suppose that $p$ divides $a$; that is, $a = mp$ for an integer $m$. Show that $b^2 =
m(mp + 1)$ and that $m$ and $mp + 1$ are relatively prime. Using this equality,
prove that $m = s^2$ and $mp + 1 = t^2$ for integers $s, t$. Conclude that $t^2 - p\,s^2 = 1$
and derive a contradiction.
(v) Thus, it must be the case that $p$ divides $a + 1$. Using this fact and an argument
similar to the one in the previous step, find a solution to $x^2 - d\,y^2 = -1$.
8. (Sum of squares) In this problem we prove the following incredible result of Euler:
Every prime of the form p = 4k + 1 can be expressed as the sum of two squares.
(i) Let $p = 4k + 1$ be prime. Using the previous problem and Problem 5, prove that
the period of $\sqrt{p}$ is odd and deduce that $\sqrt{p}$ has an expansion of the form
\[ \sqrt{p} = \langle a_0; \overline{a_1, a_2, \ldots, a_{\ell-1}, a_\ell, a_\ell, a_{\ell-1}, \ldots, a_1, 2a_0} \rangle. \]
(ii) Let $\eta$ be the complete quotient $\xi_{\ell+1}$:
\[ \eta := \xi_{\ell+1} = \langle \overline{a_\ell; a_{\ell-1}, \ldots, a_1, 2a_0, a_1, \ldots, a_{\ell-1}, a_\ell} \rangle. \]
Prove that $-1 = \eta \cdot \bar{\eta}$, where $\bar{\eta}$ is the conjugate of $\eta$. Suggestion: Use Lemma 8.33.
(iii) Finally, writing $\eta = (a + \sqrt{p})/b$ (why does $\eta$ have this form?) show that $p = a^2 + b^2$.
the large power of 1000 will make $C/q^{1000}$ small. The following lemma shows that
there is a limit to how close we can surround algebraic numbers by “good” rational
numbers.
Lemma 8.38. If $\xi$ is real algebraic of degree $n \ge 1$ (so $\xi$ is rational if $n = 1$),
then there exists a constant $c > 0$ such that for all rational numbers $p/q \ne \xi$ with
$q > 0$, we have
\[ \left| \xi - \frac{p}{q} \right| \ge \frac{c}{q^n}. \]
Proof. Assume that $f(\xi) = 0$ where
\[ f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \qquad a_k \in \mathbb{Z}, \]
and that no such polynomial function of lower degree has this property. First, we
claim that $f(r) \ne 0$ for any rational number $r \ne \xi$. Indeed, if $f(r) = 0$ for some
rational number $r \ne \xi$, then we can write $f(x) = (x - r)g(x)$ where $g$ is a polynomial
of degree $n - 1$. Then $0 = f(\xi) = (\xi - r)g(\xi)$ implies, since $\xi \ne r$, that $g(\xi) = 0$.
This implies that the degree of $\xi$ is at most $n - 1$, contradicting the fact that the degree of
$\xi$ is $n$. Now for any rational $p/q \ne \xi$ with $q > 0$, we see that
\[ 0 \ne |f(p/q)| = \left| a_n \left(\frac{p}{q}\right)^n + a_{n-1} \left(\frac{p}{q}\right)^{n-1} + \cdots + a_1 \frac{p}{q} + a_0 \right|
   = \frac{|a_n p^n + a_{n-1} p^{n-1} q + \cdots + a_1 p q^{n-1} + a_0 q^n|}{q^n}. \]
The numerator is a nonnegative integer, which cannot be zero, so the numerator
must be $\ge 1$. Therefore,
\[ |f(p/q)| \ge 1/q^n \quad \text{for all rational numbers } p/q \ne \xi \text{ with } q > 0. \tag{8.62} \]
Second, we claim that there is an $M > 0$ such that
\[ |x - \xi| \le 1 \quad \Longrightarrow \quad |f(x)| \le M\,|x - \xi|. \tag{8.63} \]
Indeed, note that since $f(\xi) = 0$, we have
\[ f(x) = f(x) - f(\xi) = a_n (x^n - \xi^n) + a_{n-1} (x^{n-1} - \xi^{n-1}) + \cdots + a_1 (x - \xi). \]
Since
\[ x^k - \xi^k = (x - \xi)\, q_k(x), \qquad q_k(x) = x^{k-1} + x^{k-2} \xi + \cdots + x\, \xi^{k-2} + \xi^{k-1}, \]
plugging each of these, for $k = 1, 2, 3, \ldots, n$, into the previous equation for $f(x)$, we
see that $f(x) = (x - \xi)h(x)$ where $h$ is a continuous function. In particular, since
$[\xi - 1, \xi + 1]$ is a closed and bounded interval, there is an $M$ such that $|h(x)| \le M$
for all $x \in [\xi - 1, \xi + 1]$. This proves our claim.
Finally, let $p/q \ne \xi$ be a rational number with $q > 0$. If $|\xi - p/q| > 1$, then
\[ \left| \xi - \frac{p}{q} \right| > 1 \ge \frac{1}{q^n}. \]
If $|\xi - p/q| \le 1$, then by (8.62) and (8.63), we have
\[ \left| \xi - \frac{p}{q} \right| \ge \frac{1}{M}\, |f(p/q)| \ge \frac{1}{M} \cdot \frac{1}{q^n}. \]
Hence, $|\xi - p/q| \ge c/q^n$ for all rational $p/q \ne \xi$ with $q > 0$, where $c$ is the smaller
of 1 and $1/M$.
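To see the constant in Lemma 8.38 in a concrete case, here is a small Python sketch for $\xi = \sqrt{2}$, which has degree $n = 2$ with $f(x) = x^2 - 2$. In this case $f(x) = (x - \xi)(x + \xi)$, so $h(x) = x + \xi$ and one may take $M = 2\sqrt{2} + 1$ on $[\xi - 1, \xi + 1]$. The choice of example, the floating-point arithmetic, and the names `liouville_constant_sqrt2` and `min_scaled_error` are assumptions made for illustration, not part of the text.

```python
from math import sqrt


def liouville_constant_sqrt2():
    """Return c = min(1, 1/M) for xi = sqrt(2), using M = 2*sqrt(2) + 1.

    Here f(x) = x^2 - 2 = (x - xi)(x + xi), so h(x) = x + xi and
    |h(x)| <= 2*xi + 1 on the interval [xi - 1, xi + 1].
    """
    m_bound = 2 * sqrt(2.0) + 1
    return min(1.0, 1.0 / m_bound)


def min_scaled_error(q_max):
    """Return the minimum of q^2 * |sqrt(2) - p/q| over 1 <= q <= q_max.

    For each q only the nearest numerator p is tried, since any other p
    gives a larger error. Floating point is adequate for moderate q_max.
    """
    xi = sqrt(2.0)
    best = float("inf")
    for q in range(1, q_max + 1):
        p = round(q * xi)
        best = min(best, q * q * abs(xi - p / q))
    return best


if __name__ == "__main__":
    c = liouville_constant_sqrt2()
    worst = min_scaled_error(2000)
    # The lemma predicts q^2 * |sqrt(2) - p/q| >= c for every rational p/q.
    print(c, worst, worst >= c)
```

In this range the smallest scaled error occurs at $p/q = 3/2$ and stays above $c = 1/(2\sqrt{2} + 1)$, as the lemma predicts; the constant produced by the proof is not claimed to be optimal.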
466 8. INFINITE CONTINUED FRACTIONS
Note that $\xi$ is the real number with binary expansion $a_0.0a_10a_20\cdots$, with $a_n$ in
the $2^n$-th decimal place and with zeros everywhere else. In any case, fix a natural
number $n$ with $a_n \ne 0$ and let $s_n = \sum_{k=0}^{n} \frac{a_k}{2^{2^k}}$ be the $n$-th partial sum of this series.
Then we can write $s_n$ as $p/q$ where $q = 2^{2^n}$. Observe that
\[ |\xi - s_n| \le \frac{1}{2^{2^{n+1}}} + \frac{1}{2^{2^{n+2}}} + \frac{1}{2^{2^{n+3}}} + \frac{1}{2^{2^{n+4}}} + \cdots \]
\[ < \frac{1}{2^{2^{n+1}}} + \frac{1}{2^{2^{n+1}+1}} + \frac{1}{2^{2^{n+1}+2}} + \frac{1}{2^{2^{n+1}+3}} + \cdots \]
\[ = \frac{1}{2^{2^{n+1}}} \left( 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots \right) = \frac{2}{2^{2^{n+1}}} = \frac{2}{(2^{2^n})^2}. \]
In conclusion,
\[ |\xi - s_n| < \frac{2}{(2^{2^n})^2} = \frac{C}{q^2}, \]
where $C = 2$. Thus, $\xi$ is approximable to order 2, and hence must be irrational.
8.10.2. Liouville numbers. Numbers that satisfy (8.64) with $c = 1$ are special:
A real number $\xi$ is called a Liouville number, after Joseph Liouville (1809–1882),
if for every natural number $n$ there is a rational number $p/q \ne \xi$ with $q > 1$
such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{q^n}. \]
These numbers are transcendental by our discussion around (8.64). Because this
fact is so important, we state this as a theorem.
Theorem 8.40 (Liouville’s theorem). Any Liouville number is transcenden-
tal.
Using Liouville’s theorem we can give many (in fact uncountably many — see
Problem 3) examples of transcendental numbers. Let $\{a_n\}$ be any sequence of
integers in $0, 1, \ldots, 9$ where there are infinitely many nonzero integers. Let
\[ \xi = \sum_{n=0}^{\infty} \frac{a_n}{10^{n!}}. \]
Note that ξ is the real number with decimal expansion
a0 .a1 a2 000a3 00000000000000000a4 · · · ,
with an in the n!-th decimal place and with zeros everywhere else. Using Liouville’s
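theorem we’ll show that $\xi$ is transcendental. Fix a natural number $n$ with $a_n \ne 0$
and let $s_n$ be the $n$-th partial sum of this series. Then $s_n$ can be written as $p/q$
where $q = 10^{n!} > 1$. Observe that
\[ |\xi - s_n| \le \frac{9}{10^{(n+1)!}} + \frac{9}{10^{(n+2)!}} + \frac{9}{10^{(n+3)!}} + \frac{9}{10^{(n+4)!}} + \cdots \]
\[ < \frac{9}{10^{(n+1)!}} + \frac{9}{10^{(n+1)!+1}} + \frac{9}{10^{(n+1)!+2}} + \frac{9}{10^{(n+1)!+3}} + \cdots \]
\[ = \frac{9}{10^{(n+1)!}} \left( 1 + \frac{1}{10} + \frac{1}{10^2} + \frac{1}{10^3} + \cdots \right) \]
\[ = \frac{10}{10^{(n+1)!}} = \frac{10}{10^{n \cdot n!} \cdot 10^{n!}} \le \frac{1}{10^{n \cdot n!}}. \]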
In conclusion,
\[ |\xi - s_n| < \frac{1}{(10^{n!})^n} = \frac{1}{q^n}, \]
so $\xi$ is a Liouville number and therefore is transcendental.
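The estimate above can be checked with exact rational arithmetic. The Python sketch below is only an illustration: the digit choices, the cut-off for the tail, and the helper names `partial_sum`, `tail_bound`, and `bound_implies_liouville` are assumptions introduced here. It compares a truncated tail of the series with the geometric-series bound $10/10^{(n+1)!}$, and then checks the last step of the chain, $10/10^{(n+1)!} \le 1/q^n$ with $q = 10^{n!}$.

```python
from fractions import Fraction
from math import factorial


def partial_sum(digit, n):
    """Return s_n = sum_{k=0}^{n} digit(k) / 10^(k!) as an exact Fraction."""
    return sum(Fraction(digit(k), 10 ** factorial(k)) for k in range(n + 1))


def tail_bound(n):
    """Return the geometric-series bound 10 / 10^((n+1)!) for |xi - s_n|."""
    return Fraction(10, 10 ** factorial(n + 1))


def bound_implies_liouville(n):
    """Check the final step: 10 / 10^((n+1)!) <= 1 / q^n with q = 10^(n!)."""
    q = 10 ** factorial(n)
    return tail_bound(n) <= Fraction(1, q ** n)


if __name__ == "__main__":
    all_nines = lambda k: 9        # worst case: every digit a_k equals 9
    for n in range(1, 5):
        # s_{n+3} - s_n is a truncated tail of the series (the cut-off is arbitrary)
        gap = partial_sum(all_nines, n + 3) - partial_sum(all_nines, n)
        print(n, gap < tail_bound(n), bound_implies_liouville(n))
```

Because only finitely many terms of the tail are summed, this is a consistency check on the inequalities rather than a proof; the strict inequality for the full series comes from the argument in the text.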
8.10.3. Continued fractions and the “most extreme” irrational of all
irrational numbers. We now show how continued fractions can be used to construct
transcendental numbers! This is achieved by the following simple observation.
Let $\xi = \langle a_0; a_1, \ldots \rangle$ be an irrational real number written as a simple continued
fraction and let $\{p_n/q_n\}$ be its convergents. Then by our fundamental approximation
theorem 8.18, we know that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}}. \]
Since
\[ q_n q_{n+1} = q_n (a_{n+1} q_n + q_{n-1}) \ge a_{n+1} q_n^2, \]
we see that
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{a_{n+1} q_n^2}. \tag{8.65} \]
Thus, we can make the rational number $p_n/q_n$ approximate $\xi$ as closely as we wish
by simply taking the next partial quotient $a_{n+1}$ larger. We use this observation in
the following theorem.
Theorem 8.41. Let $\varphi : \mathbb{N} \to (0, \infty)$ be a function. Then there is an irrational
number $\xi$ and infinitely many rational numbers $p/q$ such that
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{\varphi(q)}. \]
Proof. We define $\xi = \langle a_0; a_1, a_2, \ldots \rangle$ by choosing the $a_n$’s inductively as follows.
Let $a_0 \in \mathbb{N}$ be arbitrary. Assume that $a_0, \ldots, a_n$ have been chosen. With $q_n$
the denominator of $\langle a_0; a_1, \ldots, a_n \rangle$, choose (via the Archimedean property) $a_{n+1} \in \mathbb{N}$ such that
\[ a_{n+1} q_n^2 > \varphi(q_n). \]
This defines $\{a_n\}$. Now defining $\xi := \langle a_0; a_1, a_2, \ldots \rangle$, by (8.65), for any natural
number $n$ we have
\[ \left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{a_{n+1} q_n^2} < \frac{1}{\varphi(q_n)}. \]
This completes our proof.
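The inductive choice in this proof is easy to carry out by machine. The Python sketch below assumes, for the sake of exact integer arithmetic, that $\varphi$ takes positive integer values, and it picks $a_{n+1} = \lfloor \varphi(q_n)/q_n^2 \rfloor + 1$, which is one admissible choice among many. The function name `build_partial_quotients` and the sample $\varphi(q) = q^5$ are hypothetical.

```python
def build_partial_quotients(phi, a0, steps):
    """Return (quotients, denominators) for the construction in the proof.

    phi   : function from positive integers to positive integers (an assumption
            made here so that all arithmetic is exact)
    a0    : the arbitrary first partial quotient
    steps : how many further partial quotients a_1, ..., a_steps to choose
    """
    quotients = [a0]
    q_prev, q = 0, 1               # q_{-1} = 0, q_0 = 1
    denominators = [q]
    for _ in range(steps):
        # floor(phi(q)/q^2) + 1 is a natural number with a_next * q^2 > phi(q)
        a_next = phi(q) // (q * q) + 1
        quotients.append(a_next)
        q, q_prev = a_next * q + q_prev, q
        denominators.append(q)
    return quotients, denominators


if __name__ == "__main__":
    quotients, dens = build_partial_quotients(lambda q: q ** 5, 1, 3)
    print(quotients, dens)
    # q_n * q_{n+1} > phi(q_n) gives |xi - p_n/q_n| < 1/(q_n q_{n+1}) < 1/phi(q_n)
    print(all(dens[i] * dens[i + 1] > dens[i] ** 5 for i in range(3)))
```

With $\varphi(q) = q^5$ and $a_0 = 1$, the first partial quotients come out as 1, 2, 9, 6860 with denominators 1, 2, 19, 130342. A faster-growing $\varphi$ such as $e^q$ makes the quotients grow far too quickly to list more than a few of them.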
Using this theorem we can easily find transcendental numbers. For example,
with $\varphi(q) = e^q$, we can find an irrational $\xi$ such that for infinitely many rational
numbers $p/q$, we have
\[ \left| \xi - \frac{p}{q} \right| < \frac{1}{e^q}. \]
Since for any $n \in \mathbb{N}$, we have $e^q = \sum_{k=0}^{\infty} q^k/k! > q^n/n!$, it follows that for infinitely
many rational numbers $p/q$, we have
\[ \left| \xi - \frac{p}{q} \right| < \frac{\text{constant}}{q^n}. \]
In particular, $\xi$ is transcendental.
8.10. EPILOGUE: TRANSCENDENTAL NUMBERS, π, e, AND WHERE’S CALCULUS? 469
472 BIBLIOGRAPHY
24. Paul Benacerraf and Hilary Putnam (eds.), Philosophy of mathematics: selected readings,
Cambridge University Press, Cambridge, 1964.
25. Stanley J. Benkoski, The probability that k positive integers are relatively r-prime, J. Number
Theory 8 (1976), no. 2, 218–223.
26. Lennart Berggren, Jonathan Borwein, and Peter Borwein, Pi: a source book, third ed.,
Springer-Verlag, New York, 2004.
27. Bruce C. Berndt, Ramanujan’s notebooks, Math. Mag. 51 (1978), no. 3, 147–164.
28. N.M. Beskin, Fascinating fractions, Mir Publishers, Moscow, 1980, Translated by V.I. Kisin,
1986.
29. F. Beukers, A note on the irrationality of ζ(2) and ζ(3), Bull. London Math. Soc. 11 (1979),
no. 3, 268–272.
30. Ralph P. Boas, A primer of real functions, fourth ed., Carus Mathematical Monographs,
vol. 13, Mathematical Association of America, Washington, DC, 1996, Revised and with a
preface by Harold P. Boas.
31. R.P. Boas, Tannery’s theorem, Math. Mag. 38 (1965), no. 2, 64–66.
32. J.M. Borwein and P.B. Borwein, Ramanujan, modular equations, and approximations to pi
or how to compute one billion digits of pi, Amer. Math. Monthly 96 (1989), no. 3, 201–219.
33. Jonathan M. Borwein and Peter B. Borwein, Pi and the AGM, Canadian Mathematical
Society Series of Monographs and Advanced Texts, 4, John Wiley & Sons Inc., New York,
1998, A study in analytic number theory and computational complexity, Reprint of the 1987
original, A Wiley-Interscience Publication.
34. R.H.M. Bosanquet, An elementary treatise on musical intervals and temperament (london,
1876), Diapason press, Utrecht, 1987.
35. Carl B. Boyer, Fermat’s integration of $X^n$, Nat. Math. Mag. 20 (1945), 29–32.
36. Carl B. Boyer, A history of mathematics, second ed., John Wiley & Sons Inc., New York, 1991,
With a foreword by Isaac Asimov, Revised and with a preface by Uta C. Merzbach.
37. Paul Bracken and Bruce S. Burdick, Euler’s formula for zeta function convolutions: 10754,
Amer. Math. Monthly 108 (2001), no. 8, 771–773.
38. David Bressoud, Was calculus invented in India?, College Math. J. 33 (2002), no. 1, 2–13.
39. David Brewster, Letters of Euler to a german princess on different subjects in physics and
philosophy, Harper and Brothers, New York, 1834, In two volumes.
40. W.E. Briggs and Nick Franceschine, Problem 1302, Math. Mag. 62 (1989), no. 4, 275–276.
41. T.J. I’A. Bromwich, An introduction to the theory of infinite series, second ed., Macmillan,
London, 1926.
42. Richard A. Brualdi, Mathematical notes, Amer. Math. Monthly 84 (1977), no. 10, 803–807.
43. Robert Bumcrot, Irrationality made easy, The College Math. J. 17 (1986), no. 3, 243–244.
44. Frank Burk, Euler’s constant, The College Math. J. 16 (1985), no. 4, 279.
45. Florian Cajori, A history of mathematical notations, Dover Publications Inc., New York,
1993, 2 Vol in 1 edition.
46. B.C. Carlson, Algorithms involving arithmetic and geometric means, Amer. Math. Monthly
78 (1971), 496–505.
47. Dario Castellanos, The ubiquitous π, Math. Mag. 61 (1988), no. 2, 67–98.
48. Dario Castellanos, The ubiquitous π, Math. Mag. 61 (1988), no. 3, 148–163.
49. R. Chapman, Evaluating ζ(2), preprint, 1999.
50. Robert R. Christian, Another completeness property, Amer. Math. Monthly 71 (1964), no. 1,
78.
51. James A. Clarkson, On the series of prime reciprocals, Proc. Amer. Math. Soc. 17 (1966),
no. 2, 541.
52. Benoit Cloitre, private communication.
53. J. Brian Conrey, The Riemann hypothesis, Notices Amer. Math. Soc. 50 (2003), no. 3,
341–353.
54. F. Lee Cook, A simple explicit formula for the Bernoulli numbers, Two Year College Math.
J. 13 (1982), no. 4, 273–274.
55. J. L. Coolidge, The number e, Amer. Math. Monthly 57 (1950), 591–602.
56. Fr. Gabe Costa, Solution 277, The College Math. J. 17 (1986), no. 1, 98–99.
57. Richard Courant and Herbert Robbins, What is mathematics?, Oxford University Press,
New York, 1979, An elementary approach to ideas and methods.
58. E. J. Dijksterhuis, Archimedes, Princeton University Press, Princeton, NJ, 1987, Translated
from the Dutch by C. Dikshoorn, Reprint of the 1956 edition, With a contribution by Wilbur
R. Knorr.
59. Underwood Dudley, A budget of trisections, Springer-Verlag, New York, 1987.
60. William Dunham, A historical gem from Vito Volterra, Math. Mag. 63 (1990), no. 4, 234–
237.
61. William Dunham, Euler and the fundamental theorem of algebra, The College Math. J. 22 (1991),
no. 4, 282–293.
62. E. Dunne and M. Mcconnell, Pianos and continued fractions, Math. Mag. 72 (1999), no. 2,
104–115.
63. Erich Dux, Ein kurzer Beweis der Divergenz der unendlichen Reihe $\sum_{r=1}^{\infty} 1/p_r$, Elem. Math.
11 (1956), 50–51.
64. Erdös, Über die Reihe $\sum 1/p$, Mathematica Zutphen. B. 7 (1938), 1–2.
65. Leonhard Euler, Introduction to analysis of the infinite. Book I, Springer-Verlag, New York,
1988, Translated from the Latin and with an introduction by John D. Blanton.
66. Leonhard Euler, Introduction to analysis of the infinite. Book II, Springer-Verlag, New York, 1990,
Translated from the Latin and with an introduction by John D. Blanton.
67. H Eves, Mathematical circles squared, Prindle Weber & Schmidt, Boston, 1972.
68. Pierre Eymard and Jean-Pierre Lafon, The number π, American Mathematical Society, Prov-
idence, RI, 2004, Translated from the 1999 French original by Stephen S. Wilson.
69. Charles Fefferman, An easy proof of the fundamental theorem of algebra, Amer. Math.
Monthly 74 (1967), no. 7, 854–855.
70. William Feller, An introduction to probability theory and its applications. Vol. I, Third
edition, John Wiley & Sons Inc., New York, 1968.
71. William Feller, An introduction to probability theory and its applications. Vol. II., Second edition,
John Wiley & Sons Inc., New York, 1971.
72. D. Ferguson, Evaluation of π. are shanks’ figures correct?, Mathematical Gazette 30 (1946),
89–90.
73. William Leonard Ferrar, A textbook of convergence, The Clarendon Press Oxford University
Press, New York, 1980.
74. Steven R. Finch, Mathematical constants, Encyclopedia of Mathematics and its Applications,
vol. 94, Cambridge University Press, Cambridge, 2003.
75. Philippe Flajolet and Ilan Vardi, Zeta function expansions of classical constants, preprint,
1996.
76. Tomlinson Fort, Application of the summation by parts formula to summability of series,
Math. Mag. 26 (1953), no. 26, 199–204.
77. Gregory Fredricks and Roger B. Nelsen, Summation by parts, The College Math. J. 23
(1992), no. 1, 39–42.
78. Richard J. Friedlander, Factoring factorials, Two Year College Math. J. 12 (1981), no. 1,
12–20.
79. Joseph A. Gallian, contemporary abstract algebra, sixth ed., Houghton Mifflin Company,
Boston, 2005.
80. Martin Gardner, Mathematical games, Scientific American April (1958).
81. Martin Gardner, Second scientific american book of mathematical puzzles and diversions, University
of Chicago press, Chicago, 1987, Reprint edition.
82. J. Glaisher, History of Euler’s constant, Messenger of Math. 1 (1872), 25–30.
83. Edward J. Goodwin, Quadrature of the circle, Amer. Math. Monthly 1 (1894), no. 1, 246–
247.
84. Russell A. Gordon, The use of tagged partitions in elementary real analysis, Amer. Math.
Monthly 105 (1998), no. 2, 107–117.
85. H.W. Gould, Explicit formulas for Bernoulli numbers, Amer. Math. Monthly 79 (1972),
no. 1, 44–51.
86. D.S. Greenstein, A property of the logarithm, Amer. Math. Monthly 72 (1965), no. 7, 767.
87. Robert Grey, Georg Cantor and transcendental numbers, Amer. Math. Monthly 101 (1994),
no. 9, 819–832.
88. Lucye Guilbeau, The history of the solution of the cubic equation, Mathematics News Letter
5 (1930), no. 4, 8–12.
89. Rachel W. Hall and Krešimir Josić, The mathematics of musical instruments, Amer. Math.
Monthly 108 (2001), no. 4, 347–357.
90. Hallerberg, Indiana’s squared circle, Math. Mag. 50 (1977), no. 3, 136–140.
91. Paul R. Halmos, Naive set theory, Springer-Verlag, New York-Heidelberg, 1974, Reprint of
the 1960 edition. Undergraduate Texts in Mathematics.
92. Paul R. Halmos, I want to be a mathematician, Springer-Verlag, 1985, An automathography.
93. G.D. Halsey and Edwin Hewitt, More on the superparticular ratios in music, Amer. Math.
Monthly 79 (1972), no. 10, 1096–1100.
94. G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, Cambridge Mathematical Library,
Cambridge University Press, Cambridge, 1988, Reprint of the 1952 edition.
95. G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, fifth ed., The
Clarendon Press Oxford University Press, New York, 1979.
96. Julian Havil, Gamma, Princeton University Press, Princeton, NJ, 2003, Exploring Euler’s
constant, With a foreword by Freeman Dyson.
97. Ko Hayashi, Fibonacci numbers and the arctangent function, Math. Mag. 76 (2003), no. 3,
214–215.
98. T. L. Heath, Diophantus of alexandria: a study in the history of greek algebra, Cambridge
University Press, England, 1889.
99. T. L. Heath, The works of Archimedes, Cambridge University Press, England, 1897.
100. Thomas Heath, A history of Greek mathematics. Vol. I, Dover Publications Inc., New York,
1981, From Thales to Euclid, Corrected reprint of the 1921 original.
101. Aaron Herschfeld, On Infinite Radicals, Amer. Math. Monthly 42 (1935), no. 7, 419–429.
102. Josef Hofbauer, A simple proof of $1 + 1/2^2 + 1/3^2 + \cdots = \pi^2/6$ and related identities, Amer.
Math. Monthly 109 (2002), no. 2, 196–200.
103. P. Iain, Science, theology and einstein, Oxford University, Oxford, 1982.
104. Frank Irwin, A curious convergent series, Amer. Math. Monthly 23 (1916), no. 5, 149–152.
105. Sir James H. Jeans, Science and music, Dover Publications Inc., New York, 1968, Reprint
of the 1937 edition.
106. Dixon J. Jones, Continued powers and a sufficient condition for their convergence, Math.
Mag. 68 (1995), no. 5, 387–392.
107. Gareth A. Jones, $6/\pi^2$, Math. Mag. 66 (1993), no. 5, 290–298.
108. J.P. Jones and S. Toporowski, Irrational numbers, Amer. Math. Monthly 80 (1973), no. 4,
423–424.
109. Dan Kalman, Six ways to sum a series, The College Math. J. 24 (1993), no. 5, 402–421.
110. Edward Kasner and James Newman, Mathematics and the imagination, Dover Publications
Inc., New York, 2001.
111. Victor J. Katz, Ideas of calculus in islam and india, Math. Mag. 68 (1995), no. 3, 163–174.
112. Gerard W. Kelly, Short-cut math, Dover Publications Inc., New York, 1984.
113. A. J. Kempner, A curious convergent series, Amer. Math. Monthly 21 (1914), no. 2, 48–50.
114. Alexey Nikolaevitch Khovanskii, The application of continued fractions and their general-
izations to problems in approximation theory, Translated by Peter Wynn, P. Noordhoff N.
V., Groningen, 1963.
115. Steven J. Kifowit and Terra A. Stamps, The harmonic series diverges again and again, The
AMATYC Review 27 (2006), no. 2, 31–43.
116. M.S. Klamkin and Robert Steinberg, Problem 4431, Amer. Math. Monthly 59 (1952), no. 7,
471–472.
117. M.S. Klamkin and J.V. Whittaker, Problem 4564, Amer. Math. Monthly 62 (1955), no. 2,
129–130.
118. Israel Kleiner, Evolution of the function concept: A brief survey, Two Year College Math.
J. 20 (1989), no. 4, 282–300.
119. Morris Kline, Euler and infinite series, Math. Mag. 56 (1983), no. 5, 307–314.
120. Konrad Knopp, Infinite sequences and series, Dover Publications Inc., New York, 1956,
Translated by Frederick Bagemihl.
121. R. Knott, Fibonacci numbers and the golden section,
http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/ .
122. Donald E. Knuth, The art of computer programming. Vol. 2, second ed., Addison-Wesley
Publishing Co., Reading, Mass., 1981, Seminumerical algorithms, Addison-Wesley Series in
Computer Science and Information Processing.
123. R.A. Kortram, Simple proofs for $\sum_{k=1}^{\infty} 1/k^2 = \pi^2/6$ and $\sin x = x \prod_{k=1}^{\infty} (1 - x^2/k^2\pi^2)$,
Math. Mag. 69 (1996), no. 2, 122–125.
124. Myren Krom, On sums of powers of natural numbers, Two Year College Math. J. 14 (1983),
no. 4, 349–351.
125. David E. Kullman, What’s harmonic about the harmonic series, The College Math. J. 32
(2001), no. 3, 201–203.
126. R. Kumanduri and C. Romero, Number theory with computer applications, Prentice-Hall,
Simon and Schuster, New Jersey, 1998.
127. M. Laczkovich, On Lambert’s proof of the irrationality of π, Amer. Math. Monthly 104
(1997), no. 5, 439–443.
128. Serge Lang, A first course in calculus, fifth ed., Addison-Wesley Pub. Co., Reading, Mass.,
1964.
129. L. J. Lange, An elegant continued fraction for π, Amer. Math. Monthly 106 (1999), no. 5,
456–458.
130. W. G. Leavitt, The sum of the reciprocals of the primes, Two Year College Math. J. 10
(1979), no. 3, 198–199.
131. D.H. Lehmer, Problem 3801, Amer. Math. Monthly 43 (1936), no. 9, 580.
132. D.H. Lehmer, On arccotangent relations for π, Amer. Math. Monthly 45 (1938), no. 10, 657–664.
133. D.H. Lehmer and M.A. Heaslet, Solution 3801, Amer. Math. Monthly 45 (1938), no. 9,
636–637.
134. A.L. Leigh Silver, Musimatics or the nun’s fiddle, Amer. Math. Monthly 78 (1971), no. 4,
351–357.
135. H.W. Lenstra, Solving the pell equation, Notices Amer. Math. Soc. 49 (2002), no. 2, 182–192.
136. P. Loya, Amazing and aesthetic aspects of analysis: The celebrated calculus, in preparation.
137. N. Luzin, Function: Part I, Amer. Math. Monthly 105 (1998), no. 1, 59–67.
138. N. Luzin, Function: Part II, Amer. Math. Monthly 105 (1998), no. 3, 263–270.
139. Richard Lyon and Morgan Ward, The limit for e, Amer. Math. Monthly 59 (1952), no. 2,
102–103.
140. Desmond MacHales, Comic sections: The book of mathematical jokes, humour, wit, and
wisdom, Boole Press, Dublin, 1993.
141. Alan L. Mackay, Dictionary of scientific quotations, Institute of Physics Publishing, Bristol,
1994.
142. E.A. Maier, On the irrationality of certain trigonometric numbers, Amer. Math. Monthly
72 (1965), no. 9, 1012–1013.
143. E.A. Maier and Ivan Niven, A method of establishing certain irrationalities, Math. Mag. 37
(1964), no. 4, 208–210.
144. S. C. Malik, Introduction to convergence, Halsted Press, a division of John Wiley and sons,
New Delhi, 1984.
145. Eli Maor, e: the story of a number, Princeton University Press, Princeton, NJ, 1994.
146. George Markowsky, Misconceptions about the golden ratio, Two Year College Math. J. 23
(1992), no. 1, 2–19.
147. Jerold Mathews, Gear trains and continued fractions, Amer. Math. Monthly 97 (1990), no. 6,
505–510.
148. Marcin Mazur, Irrationality of $\sqrt{2}$, private communication, 2004.
149. J. H. McKay, The william lowell putnam mathematical competition, Amer. Math. Monthly
74 (1967), no. 7, 771–777.
150. George Miel, Of calculations past and present: The Archimedean algorithm, Amer. Math.
Monthly 90 (1983), no. 1, 17–35.
151. Jeff Miller, Earliest uses of symbols in probability and statistics,
http://members.aol.com/jeff570/stat.html.
152. John E. Morrill, Set theory and the indicator function, Amer. Math. Monthly 89 (1982),
no. 9, 694–695.
153. Leo Moser, On the series, $\sum 1/p$, Amer. Math. Monthly 65 (1958), 104–105.
154. Joseph Amal Nathan, The irrationality of $e^x$ for nonzero rational $x$, Amer. Math. Monthly
105 (1998), no. 8, 762–763.
155. Harry L. Nelson, A solution to Archimedes’ cattle problem, J. Recreational Math. 13 (1980-
81), 162–176.
156. D.J. Newman, Solution to problem e924, Amer. Math. Monthly 58 (1951), no. 3, 190–191.
157. D.J. Newman, Arithmetic, geometric inequality, Amer. Math. Monthly 67 (1960), no. 9, 886.
158. Donald J. Newman and T.D. Parsons, On monotone subsequences, Amer. Math. Monthly
95 (1988), no. 1, 44–45.
159. James R. Newman (ed.), The world of mathematics. Vol. 1, Dover Publications Inc., Mineola,
NY, 2000, Reprint of the 1956 original.
160. J.R. Newman (ed.), The world of mathematics, Simon and Schuster, New York, 1956.
161. James Nickel, Mathematics: Is God silent?, Ross House Books, Vallecito, California, 2001.
162. Ivan Niven, The transcendence of π, Amer. Math. Monthly 46 (1939), no. 8, 469–471.
163. Ivan Niven, Irrational numbers, The Carus Mathematical Monographs, No. 11, The Mathematical
Association of America. Distributed by John Wiley and Sons, Inc., New York, N.Y.,
1956.
164. Ivan Niven, A proof of the divergence of $\sum 1/p$, Amer. Math. Monthly 78 (1971), no. 3, 272–273.
165. Ivan Niven and Herbert S. Zuckerman, An introduction to the theory of numbers, third ed.,
John Wiley & Sons, Inc., New York-London-Sydney, 1972.
166. Jeffrey Nunemacher and Robert M. Young, On the sum of consecutive kth powers, Math.
Mag. 60 (1987), no. 4, 237–238.
167. Mícheál Ó Searcóid, Elements of abstract analysis, Springer Undergraduate Mathematics
Series, Springer-Verlag London Ltd., London, 2002.
168. University of St. Andrews, A chronology of pi,
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Pi_chronology.html.
169. University of St. Andrews, Eudoxus of cnidus,
http://www-groups.dcs.st-and.ac.uk/~history/Biographies/Eudoxus.html.
170. University of St. Andrews, A history of pi,
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Pi_through_the_ages.html.
171. University of St. Andrews, Leonhard Euler,
http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Euler.html.
172. University of St. Andrews, Madhava of sangamagramma,
http://www-gap.dcs.st-and.ac.uk/~history/Mathematicians/Madhava.html.
173. C. D. Olds, The simple continued fraction expansion of e, Amer. Math. Monthly 77 (1970),
no. 9, 968–974.
174. Geo. A. Osborne, A problem in number theory, Amer. Math. Monthly 21 (1914), no. 5,
148–150.
175. Thomas J. Osler, The union of Vieta’s and Wallis’s products for pi, Amer. Math. Monthly
106 (1999), no. 8, 774–776.
176. Thomas J. Osler and James Smoak, A magic trick from fibonacci, The College Math. J. 34
(2003), 58–60.
177. Thomas J. Osler and Nicholas Stugard, A collection of numbers whose proof of irrationality
is like that of the number e, Math. Comput. Ed. 40 (2006), 103–107.
178. Thomas J. Osler and Michael Wilhelm, Variations on Vieta’s and Wallis’s products for pi,
Math. Comput. Ed. 35 (2001), 225–232.
179. Ioannis Papadimitriou, A simple proof of the formula $\sum_{k=1}^{\infty} k^{-2} = \pi^2/6$, Amer. Math.
Monthly 80 (1973), no. 4, 424–425.
180. L. L. Pennisi, Elementary proof that e is irrational, Amer. Math. Monthly 60 (1953), 474.
181. G.M. Phillips, Archimedes the numerical analyst, Amer. Math. Monthly 88 (1981), no. 3,
165–169.
182. R.C. Pierce, Jr., A brief history of logarithms, Two Year College Math. J. 8 (1977), no. 1,
22–26.
183. Alfred S. Posamentier and Ingmar Lehmann, π: A biography of the world’s most myste-
rious number, Prometheus Books, Amherst, NY, 2004, With an afterword by Herbert A.
Hauptman.
184. G. Baley Price, Telescoping sums and the summation of sequences, Two Year College Math.
J. 4 (1973), no. 4, 16–29.
185. Raymond Redheffer, What! another note just on the fundamental theorem of algebra, Amer.
Math. Monthly 71 (1964), no. 2, 180–185.
186. Reinhold Remmert, Vom Fundamentalsatz der Algebra zum Satz von Gelfand-Mazur, Math.
Semesterber. 40 (1993), no. 1, 63–71.
187. Dorothy Rice, History of π (or pi), Mathematics News Letter 2 (1928), 6–8.
188. N. Rose, Mathematical maxims and minims, Rome Press Inc., Raleigh, NC, 1988.
189. Tony Rothman, Genius and biographers: The fictionalization of Evariste Galois, Amer.
Math. Monthly 89 (1982), no. 2, 84–106.
190. Ranjan Roy, The discovery of the series formula for π by Leibniz, Gregory and Nilakantha,
Math. Mag. 63 (1990), no. 5, 291–306.
191. Walter Rudin, Principles of mathematical analysis, third ed., McGraw-Hill Book Co., New
York, 1976, International Series in Pure and Applied Mathematics.
192. Walter Rudin, Real and complex analysis, third ed., McGraw-Hill Book Co., New York, 1987.
193. Oliver Sacks, The man who mistook his wife for a hat : And other clinical tales, Touchstone,
New York, 1985.
194. Yoram Sagher, Notes: What Pythagoras Could Have Done, Amer. Math. Monthly 95 (1988),
no. 2, 117.
195. E. Sandifer, How euler did it,
http://www.maa.org/news/howeulerdidit.html.
196. Norman Schaumberger, An instant proof of $e^\pi > \pi^e$, The College Math. J. 16 (1985), no. 4,
280.
197. Murray Schechter, Tempered scales and continued fractions, Amer. Math. Monthly 87 (1980),
no. 1, 40–42.
198. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 3, 165–170.
199. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 4, 216–228.
200. Herman C. Schepler, A chronology of pi, Math. Mag. 23 (1950), no. 5, 279–283.
201. P.J. Schillo, On primitive pythagorean triangles, Amer. Math. Monthly 58 (1951), no. 1,
30–32.
202. Fred Schuh, The master book of mathematical recreations, Dover Publications Inc., New
York, 1968, Translated by F. Göbel.
203. P. Sebah and X. Gourdon, A collection of formulae for the Euler constant,
http://numbers.computation.free.fr/Constants/Gamma/gammaFormulas.pdf.
204. P. Sebah and X. Gourdon, A collection of series for π,
http://numbers.computation.free.fr/Constants/Pi/piSeries.html.
205. P. Sebah and X. Gourdon, The constant e and its computation,
http://numbers.computation.free.fr/Constants/constants.html.
206. P. Sebah and X. Gourdon, Introduction on Bernoulli’s numbers,
http://numbers.computation.free.fr/Constants/constants.html.
207. P. Sebah and X. Gourdon, π and its computation through the ages,
http://numbers.computation.free.fr/Constants/constants.html.
208. Allen A. Shaw, Note on roman numerals, Nat. Math. Mag. 13 (1938), no. 3, 127–128.
209. Georgi E. Shilov, Elementary real and complex analysis, english ed., Dover Publications
Inc., Mineola, NY, 1996, Revised English edition translated from the Russian and edited by
Richard A. Silverman.
210. G. F. Simmons, Calculus gems, Mcgraw Hill, Inc., New York, 1992.
211. J.G. Simmons, A new look at an old function, $e^{i\theta}$, The College Math. J. 26 (1995), no. 1,
6–10.
212. Sahib Singh, On dividing coconuts: A linear diophantine problem, The College Math. J. 28
(1997), no. 3, 203–204.
213. David Singmaster, The legal values of pi, Math. Intelligencer 7 (1985), no. 2, 69–72.
214. David Singmaster, Coconuts: the history and solutions of a classic Diophantine problem,
Gaṇita-Bhāratī 19 (1997), no. 1-4, 35–51.
215. Walter S. Sizer, Continued roots, Math. Mag. 59 (1986), no. 1, 23–27.
216. David Eugene Smith, A source book in mathematics. vol. 1, 2., Dover Publications, Inc, New
York, 1959, Unabridged and unaltered republ. of the first ed. 1929.
217. J. Sondow, Problem 88, Math Horizons (1997), 32, 34.
218. H. Steinhaus, Mathematical snapshots, english ed., Dover Publications Inc., Mineola, NY,
1999, Translated from the Polish, With a preface by Morris Kline.
219. Ian Stewart, Concepts of modern mathematics, Dover Publications Inc., New York, 1995.
220. John Stillwell, Galois theory for beginners, Amer. Math. Monthly 101 (1994), no. 1, 22–27.
221. D. J. Struik (ed.), A source book in mathematics, 1200–1800, Princeton Paperbacks, Prince-
ton University Press, Princeton, NJ, 1986, Reprint of the 1969 edition.
222. Frode Terkelsen, The fundamental theorem of algebra, Amer. Math. Monthly 83 (1976),
no. 8, 647.
223. Hugh Thurston, A simple proof that every sequence has a monotone subsequence, Math.
Mag. 67 (1994), no. 5, 344.
224. C. Tøndering, Frequently asked questions about calendars,
http://www.tondering.dk/claus/, 2003.
225. Herbert Turnbull, The great mathematicians, Barnes & Noble, New York, 1993.
226. Herbert Turnbull (ed.), The correspondence of Isaac Newton, Vol. II: 1676–1687, Published
for the Royal Society, Cambridge University Press, New York, 1960.
227. D. J. Uherka and Ann M. Sergott, On the continuous dependence of the roots of a polynomial
on its coefficients, Amer. Math. Monthly 84 (1977), no. 5, 368–370.
228. R.S. Underwood and Robert E. Moritz, Solution to problem 3242, Amer. Math. Monthly 35
(1928), no. 1, 47–48.
229. James Victor Uspensky, Introduction to mathematical probability, McGraw-Hill Book Co,
New York, London, 1937.
230. Alfred van der Poorten, A proof that Euler missed. . .Apéry’s proof of the irrationality of
ζ(3), Math. Intelligencer 1 (1978/79), no. 4, 195–203, An informal report.
231. Charles Vanden Eynden, Proofs that $\sum 1/p$ diverges, Amer. Math. Monthly 87 (1980), no. 5,
394–397.
232. Ilan Vardi, Computational recreations in Mathematica, Addison-Wesley Publishing Company
Advanced Book Program, Redwood City, CA, 1991.
233. Ilan Vardi, Archimedes’ cattle problem, Amer. Math. Monthly 105 (1998), no. 4, 305–319.
234. P.G.J. Vredenduin, A paradox of set theory, Amer. Math. Monthly 76 (1969), no. 1, 59–60.
235. A.D. Wadhwa, An interesting subseries of the harmonic series, Amer. Math. Monthly 82
(1975), no. 9, 931–933.
236. Morgan Ward, A mnemonic for Euler’s constant, Amer. Math. Monthly 38 (1931), no. 9, 6.
237. André Weil, Number theory, Birkhäuser Boston Inc., Boston, MA, 1984, An approach
through history, From Hammurapi to Legendre.
238. E. Weisstein, Dirichlet function. from MathWorld—a wolfram web resource,
http://mathworld.wolfram.com/DirichletFunction.html.
239. E. Weisstein, Landau symbols. from MathWorld—a wolfram web resource,
http://mathworld.wolfram.com/LandauSymbols.html.
240. E. Weisstein, Pi approximations. from MathWorld—a wolfram web resource,
http://mathworld.wolfram.com/PiApproximations.html.
241. E. Weisstein, Pi formulas. from MathWorld—a wolfram web resource,
http://mathworld.wolfram.com/PiFormulas.html.
242. B. R. Wenner, Continuous, exactly k-to-one functions on R, Math. Mag. 45 (1972), 224–225.
243. Joseph Wiener, Bernoulli’s inequality and the number e, The College Math. J. 16 (1985),
no. 5, 399–400.
244. E. Wigner, The unreasonable effectiveness of mathematics in the natural sciences, Comm.
Pure Appl. Math. 13 (1960), 1–14.
245. Eugene Wigner, Symmetries and reflections: Scientific essays, The MIT press, Cambridge
and London, 1970.
246. Herbert S. Wilf, generatingfunctionology, third ed., A K Peters Ltd., Wellesley, MA, 2006,
Freely downloadable at http://www.cis.upenn.edu/~wilf/.
247. G.T. Williams, A new method of evaluating ζ(2n), Amer. Math. Monthly 60 (1953), no. 1,
12–25.
248. H.C. Williams, R.A. German, and C.R. Zarnke, Solution of the cattle problem of Archimedes,
Math. Comp. 19 (1965), no. 92, 671–674.
249. A. M. Yaglom and I. M. Yaglom, Challenging mathematical problems with elementary
solutions. Vol. II, Dover Publications Inc., New York, 1987, Problems from various branches
of mathematics, Translated from the Russian by James McCawley, Jr., Reprint of the 1967
edition.
250. Hansheng Yang and Yang Heng, The arithmetic-geometric mean inequality and the constant
e, Math. Mag. 74 (2001), no. 4, 321–323.
251. G.S. Young, The linear functional equation, Amer. Math. Monthly 65 (1958), no. 1, 37–38.
252. Robert M. Young, Excursions in calculus, The Dolciani Mathematical Expositions, vol. 13,
Mathematical Association of America, Washington, DC, 1992, An interplay of the continuous
and the discrete.
253. Don Zagier, The first 50 million prime numbers, Math. Intelligencer 0 (1977/78), 7–19.
254. Lee Zia, Using the finite difference calculus to sum powers of integers, The College Math. J.
22 (1991), no. 4, 294–300.
Index
Value of function, 16
Vanden Eynden, Charles, 325
Vardi, Ilan, 339, 341
Vector space, 73
Vectors, 72
Venn diagram, 8
Venn, John, 8
Viète, François, 154, 233, 349
Volterra’s theorem, 170
Volterra, Vito, 170
Vredenduin’s paradox, 90
Yasser, 437