Discrete Structures
Michiel Smid
School of Computer Science
Carleton University
Ottawa, Ontario
Canada
michiel@scs.carleton.ca
Contents

Preface vii

1 Introduction 1
  1.1 Ramsey Theory 1
  1.2 Sperner’s Theorem 4
  1.3 The Quick-Sort Algorithm 5

2 Mathematical Preliminaries 9
  2.1 Basic Concepts 9
  2.2 Proof Techniques 11
    2.2.1 Direct proofs 13
    2.2.2 Constructive proofs 14
    2.2.3 Nonconstructive proofs 14
    2.2.4 Proofs by contradiction 15
    2.2.5 Proofs by induction 16
    2.2.6 More examples of proofs 18
  2.3 Asymptotic Notation 20
  2.4 Logarithms 22
  2.5 Exercises 23

3 Counting 25
  3.1 The Product Rule 25
    3.1.1 Counting Bitstrings of Length n 26
    3.1.2 Counting Functions 26
    3.1.3 Placing Books on Shelves 29
  3.2 The Bijection Rule 31
  3.3 The Complement Rule 33
  3.4 The Sum Rule 34

4 Recursion 83
  4.1 Recursive Functions 83
  4.2 Fibonacci Numbers 85
    4.2.1 Counting 00-Free Bitstrings 87
  4.3 A Recursively Defined Set 88
  4.4 A Gossip Problem 91
  4.5 Euclid’s Algorithm 94
    4.5.1 The Modulo Operation 95
    4.5.2 The Algorithm 95
    4.5.3 The Running Time 97
  4.6 The Merge-Sort Algorithm 99
    4.6.1 Correctness of Algorithm MergeSort 100
    4.6.2 Running Time of Algorithm MergeSort 101
  4.7 Computing the Closest Pair 104
    4.7.1 The Basic Approach 105
    4.7.2 The Recursive Algorithm 111
  4.8 Counting Regions when Cutting a Circle 115
    4.8.1 A Polynomial Upper Bound on Rn 115
    4.8.2 A Recurrence Relation for Rn 118
    4.8.3 Simplifying the Recurrence Relation 123
    4.8.4 Solving the Recurrence Relation 124
  4.9 Exercises 125
Chapter 1
Introduction
In this chapter, we introduce some problems that will be solved later in this
book. Along the way, we recall some notions from discrete mathematics that
you are assumed to be familiar with. These notions are reviewed in more
detail in Chapter 2.
In the example below, P3 and P5 are friends, whereas P1 and P3 are strangers.
[Figure: the six people P1, . . . , P6, drawn as the vertices of a complete graph; solid edges join friends and dashed edges join strangers.]
We may assume, without loss of generality, that the first claim holds. (Do
you see why?) Consider three edges incident on P1 that are solid and denote
them by P1 A, P1 B, and P1 C.
If at least one of the edges AB, AC, and BC is solid, then there is a solid
triangle. In the left figure below, AB is solid and we obtain the solid triangle
P1 AB.
[Figure: two copies of the vertex P1 joined by solid edges to A, B, and C; in the left copy the edge AB is also solid, giving the solid triangle P1AB, whereas in the right copy all of AB, AC, and BC are dashed.]
Otherwise, all edges AB, AC, and BC are dashed, in which case we
obtain the dashed triangle ABC; see the right figure above.
You should convince yourself that Theorem 1.1.2 also holds for complete
graphs with more than six vertices. The figure below shows a complete
graph with five vertices that has no solid triangle and no dashed triangle.
Thus, Theorem 1.1.2 does not hold for complete graphs with five vertices.
Equivalently, Theorem 1.1.1 does not hold for groups of five people.
[Figure: a complete graph on the five people P1, . . . , P5 with no solid triangle and no dashed triangle.]
Si ⊈ Sj and Sj ⊈ Si,
S1 = {1, 2}, S2 = {1, 3}, S3 = {1, 4}, S4 = {1, 5}, S5 = {2, 3},
S6 = {2, 4}, S7 = {2, 5}, S8 = {3, 4}, S9 = {3, 5}, S10 = {4, 5}.
Observe that these are all subsets of S having size two. Can there be such
a sequence of more than 10 subsets? The following theorem states that the
answer is “no”.
The right-hand side of the last line is a binomial coefficient, which we will
define in Section 3.6. Its value is equal to the number of subsets of S having
size ⌊n/2⌋. Observe that these subsets satisfy the property in Theorem 1.2.1.
We will prove Theorem 1.2.1 in Section 7.3, using elementary counting
techniques and probability theory. Again, this probably sounds surprising to
you, because Theorem 1.2.1 does not have anything to do with probability.
[Figure: the pivot p splits the input into the subsequence S1 of elements smaller than p and the subsequence S2 of elements larger than p.]
time of the algorithm heavily depends on the pivots that are chosen in the
recursive calls.
For example, assume that in each (recursive) call to the algorithm, the
pivot happens to be the largest element in the sequence. Then, in each call,
the subsequence of elements that are larger than the pivot is empty. Let us
see what happens in this case:
[Figure: the worst-case recursion, in which every pivot is the largest element; the subsequences that remain have n − 1, n − 2, n − 3, . . . elements.]
You probably see the pattern. The total running time of the algorithm, i.e.,
the total number of “steps”, is proportional to
n + (n − 1) + (n − 2) + (n − 3) + · · · + 3 + 2 + 1,
[Figure: a good pivot splits the sequence into two subsequences, each containing about (n − 1)/2 elements.]
In Section 4.6, we will prove that, if this happens in each recursive call,
the running time of the QuickSort algorithm is only O(n log n). Obviously,
it is not clear at all how we can guarantee that we always choose a good pivot.
It turns out that there is a simple strategy: In each call, choose the pivot
randomly! That is, among all elements involved in the recursive call, pick one
uniformly at random; thus, each element has the same probability of being
chosen. In Section 6.10, we will prove that this leads to an expected running
time of O(n log n).
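To make the random-pivot strategy concrete, here is a small Python sketch of quick-sort with a randomly chosen pivot. It is only an illustration: the function name and the list-based partitioning are ours, not the book's pseudocode.

import random

def quick_sort(a):
    # A sequence of length 0 or 1 is already sorted.
    if len(a) <= 1:
        return a
    # Choose the pivot uniformly at random among the elements of a.
    pivot = random.choice(a)
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    # Recursively sort the two subsequences and concatenate.
    return quick_sort(smaller) + equal + quick_sort(larger)

print(quick_sort([31, 4, 15, 9, 26, 5, 3]))  # [3, 4, 5, 9, 15, 26, 31]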
Chapter 2
Mathematical Preliminaries
6. The empty set is the set that does not contain any element. This set
is denoted by ∅.
A ∪ B = {x : x ∈ A or x ∈ B},
A ∩ B = {x : x ∈ A and x ∈ B},
A \ B = {x : x ∈ A and x ∉ B},
A̅ = {x : x ∉ A}.
16. The Boolean values are 1 and 0, which represent true and false,
respectively. The basic Boolean operations include
Since 2(2k² + 2k) is even, and “even plus one is odd”, we can conclude that
n² is odd.
Theorem 2.2.2 Let G = (V, E) be a graph. Then the sum of the degrees of
all vertices is an even integer, i.e.,
    ∑_{v∈V} deg(v)
is even.
Proof. If you do not see the meaning of this statement, then first try it out
for a few graphs. The reason why the statement holds is very simple: Each
edge contributes 2 to the summation (because an edge is incident on exactly
two distinct vertices).
Theorem 2.2.3 Let G = (V, E) be a graph. Then the sum of the degrees of
all vertices is equal to twice the number of edges, i.e.,
    ∑_{v∈V} deg(v) = 2|E|.
A graph is called 3-regular, if each vertex has degree three. We prove the
following theorem using a constructive proof.
Theorem 2.2.5 For every even integer n ≥ 4, there exists a 3-regular graph
with n vertices.
Proof. Let
V = {0, 1, 2, . . . , n − 1},
and
Theorem 2.2.6 There exist irrational numbers x and y such that xy is ra-
tional.
Proof. Consider the number √2^√2. We distinguish two cases.
Case 1: √2^√2 ∈ Q. In this case, we take x = y = √2.
Case 2: √2^√2 ∉ Q. In this case, we take x = √2^√2 and y = √2. Since
    x^y = (√2^√2)^√2 = √2^(√2·√2) = (√2)² = 2,
x^y is rational.
Observe that this proof indeed proves the theorem, but it does not give
an example of a pair of irrational numbers x and y such that xy is rational.
Theorem 2.2.9 √2 is irrational, i.e., √2 cannot be written as a fraction of
two integers.
Proof. We will prove the theorem by contradiction. Thus, we assume that
√2 is rational. Then √2 can be written as a fraction of two integers m ≥ 1
and n ≥ 1, i.e., √2 = m/n. We may assume that m and n do not share
any common factors, i.e., the greatest common divisor of m and n is equal
to one; if this is not the case, then we can get rid of the common factors. By
squaring √2 = m/n, we get 2n² = m². This implies that m² is even. Then,
by Theorem 2.2.8, m is even, which means that we can write m as m = 2k,
for some positive integer k. It follows that 2n² = m² = 4k², which implies
that n² = 2k². Hence, n² is even. Again by Theorem 2.2.8, it follows that n
is even.
We have shown that m and n are both even. But we know that m and
n are not both even. Hence, we have a contradiction. Our assumption that
√2 is rational is wrong. Thus, we can conclude that √2 is irrational.
Induction Step: Prove that for all n ≥ 1, the following holds: If P (n) is
true, then P (n + 1) is also true.
    1 + 2 + 3 + · · · + n = n(n + 1)/2.
Proof. We start with the base case of the induction. If n = 1, then both the
left-hand side and the right-hand side are equal to 1. Therefore, the theorem
is true for n = 1.
For the induction step, let n ≥ 1 and assume that the theorem is true
for n, i.e., assume that
    1 + 2 + 3 + · · · + n = n(n + 1)/2.
We have to prove that the theorem is true for n + 1, i.e., we have to prove
that
    1 + 2 + 3 + · · · + (n + 1) = (n + 1)(n + 2)/2.
Here is the proof:
    1 + 2 + 3 + · · · + (n + 1) = (1 + 2 + 3 + · · · + n) + (n + 1)
                                = n(n + 1)/2 + (n + 1)
                                = (n + 1)(n + 2)/2.
Since there are n terms on the right-hand side, we have 2S = n(n + 1). This
implies that S = n(n + 1)/2.
We now prove the theorem by induction. For the base case, let n = 1. The
claim in the theorem is “a − b is a factor of a − b”, which is obviously true.
Let n ≥ 1 and assume that a − b is a factor of a^n − b^n. We have to prove
that a − b is a factor of a^(n+1) − b^(n+1). We have
Theorem 2.2.12 Let G = (V, E) be a graph with m edges. Then the sum
of the degrees of all vertices is equal to twice the number of edges, i.e.,
    ∑_{v∈V} deg(v) = 2m.
Proof. The proof is by induction on the number m of edges. For the base
case of the induction, assume that m = 0. Then the graph G does not
contain any edges and, therefore, ∑_{v∈V} deg(v) = 0. Thus, the theorem is
true if m = 0.
Let m ≥ 0 and assume that the theorem is true for every graph with m
edges. Let G be an arbitrary graph with m + 1 edges. We have to prove that
∑_{v∈V} deg(v) = 2(m + 1).
Let {a, b} be an arbitrary edge in G, and let G′ be the graph obtained
from G by removing the edge {a, b}. Since G′ has m edges, we know from
the induction hypothesis that the sum of the degrees of all vertices in G′ is
equal to 2m. Using this, we obtain
    ∑_{v∈G} deg(v) = ∑_{v∈G′} deg(v) + 2 = 2m + 2 = 2(m + 1).
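The identity in Theorem 2.2.12 is also easy to check experimentally. The following Python sketch (ours, not part of the text) computes the degree of every vertex from an edge list and compares the degree sum with twice the number of edges.

def degree_sum_equals_twice_edges(vertices, edges):
    # deg(v) = number of edges that contain v as an endpoint
    deg = {v: 0 for v in vertices}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return sum(deg.values()) == 2 * len(edges)

# Example: a path on four vertices plus one extra edge.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (1, 3)]
print(degree_sum_equals_twice_edges(V, E))  # True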
It follows that
3n = 2m.
Since n is an odd integer, the left-hand side in this equation is an odd integer
as well. The right-hand side, however, is an even integer. This is a contra-
diction.
Let Kn be the complete graph on n vertices. This graph has a vertex set
of size n, and every pair of distinct vertices is joined by an edge.
If G = (V, E) is a graph with n vertices, then the complement G̅ of G is
the graph with vertex set V that consists of those edges of Kn that are not
present in G.
• We say that f (n) = O(g(n)) if the following is true: There exist con-
stants c > 0 and k > 0 such that for all n ≥ k,
f (n) ≤ c · g(n).
• We say that f (n) = Ω(g(n)) if the following is true: There exist con-
stants c > 0 and k > 0 such that for all n ≥ k,
f (n) ≥ c · g(n).
We also have
    13 + 7n − 5n² + 8n³ ≥ −5n² + 8n³.
Since n³ ≥ 5n² for all n ≥ 5, it follows that, again for all n ≥ 5,
2.4 Logarithms
If b and x are real numbers with b > 1 and x > 0, then log_b x denotes the
logarithm of x with base b. Note that
    log_b x = y if and only if b^y = x.
If b = 2, then we write log x instead of log_2 x. We write ln x to refer to the
natural logarithm of x with base e.
Lemma 2.4.1 If b > 1 and x > 0, then
    b^(log_b x) = x.
Proof. We have seen above that y = log_b x if and only if b^y = x. Thus, if
we write y = log_b x, then b^(log_b x) = b^y = x.
and
    2^(2 log log x) = (2^(log log x))² = log² x.
2
2.5 Exercises
2.1 Prove that √p is irrational for every prime number p.
2.2 Let n be a positive integer that is not a perfect square. Prove that √n
is irrational.
F0 F1 F2 · · · Fn−1 = Fn − 2
• Prove that for any two distinct integers n ≥ 0 and m ≥ 0, the greatest
common divisor of Fn and Fm is equal to 1.
Chapter 3
Counting
There are three types of people, those who can count and those
who cannot count.
Given a set of 23 elements, how many subsets of size 17 are there? How
many solutions are there to the equation
x1 + x2 + · · · + x12 = 873,
second one being writing the second character. Obviously, there are 26 ways
to do the first task. Next, observe that, regardless of how we do the first
task, there are 10 ways to do the second task. The Product Rule states that
the total number of ways to perform the entire procedure is 26 · 10 = 260.
Product Rule: Assume a procedure consists of performing a se-
quence of m tasks in order. Furthermore, assume that for each
i = 1, 2, . . . , m, there are Ni ways to do the i-th task, regardless
of how the first i − 1 tasks were done. Then, there are N1 N2 · · · Nm
ways to do the entire procedure.
[Figure: the i-th task specifies the value f(ai): an arrow from the element ai of A to its image f(ai) in B.]
For each i, f (ai ) can be any of the n elements of B. As a result, there are
Ni = n ways to do the i-th task, regardless of how we did the first i − 1 tasks.
By the Product Rule, there are N1 N2 · · · Nm = n^m ways to do the entire
procedure and, hence, this many functions f : A → B. We have proved the
following result:
• In the second task, we have to specify the value f (a2 ). Since the func-
tion f is one-to-one and since we have already specified f (a1 ), we can
choose f (a2 ) to be any of the n − 1 elements in the set B \ {f (a1 )}. As
a result, there are N2 = n − 1 ways to do the second task. Note that
this is true, regardless of how we did the first task.
• In general, in the i-th task, we have to specify the value f (ai ). Since
we have already specified f (a1 ), f (a2 ), . . . , f (ai−1 ), we can choose f (ai )
to be any of the n − i + 1 elements in the set
B \ {f (a1 ), f (a2 ), . . . , f (ai−1 )}.
As a result, there are Ni = n − i + 1 ways to do the i-th task. Note
that this is true, regardless of how we did the first i − 1 tasks.
By the Product Rule, there are
N1 N2 · · · Nm = n(n − 1)(n − 2) · · · (n − m + 1)
ways to do the entire procedure, which is also the number of one-to-one
functions f : A → B.
Recall the factorial function:
    k! = 1 if k = 0, and k! = 1 · 2 · 3 · · · k if k ≥ 1.
We can simplify the product
n(n − 1)(n − 2) · · · (n − m + 1)
by observing that it is “almost” a factorial:
n(n − 1)(n − 2) · · · (n − m + 1)
    = n(n − 1)(n − 2) · · · (n − m + 1) · [(n − m)(n − m − 1) · · · 1] / [(n − m)(n − m − 1) · · · 1]
    = [n(n − 1)(n − 2) · · · 1] / [(n − m)(n − m − 1) · · · 1]
    = n! / (n − m)!.
We have proved the following result:
Theorem 3.1.3 Let m ≥ 1 and n ≥ 1 be integers, let A be a set of size m,
and let B be a set of size n.
1. If m > n, then there is no one-to-one function f : A → B.
2. If m ≤ n, then the number of one-to-one functions f : A → B is equal
to n!/(n − m)!.
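For small values of m and n, both counts can be confirmed by brute force. The sketch below (ours; it represents a function f : A → B as the tuple of its values) enumerates all functions and all one-to-one functions and compares with n^m and n!/(n − m)!.

from itertools import product
from math import factorial

def count_functions(m, n):
    # a function f : A -> B is the tuple (f(a1), ..., f(am)) with entries in B
    all_f = list(product(range(n), repeat=m))
    one_to_one = [f for f in all_f if len(set(f)) == m]
    return len(all_f), len(one_to_one)

m, n = 3, 5
total, injective = count_functions(m, n)
print(total, n**m)                                  # 125 125
print(injective, factorial(n) // factorial(n - m))  # 60 60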
• we specify for each book the shelf at which this book is placed, and
• we specify for each shelf the left-to-right order of the books that are
placed on that shelf.
Some bookshelves may be empty. We assume that each shelf is large enough
to fit all books. In the figure below, you see two different placements.
[Figure: two different placements of the books B1, . . . , B5 on the three shelves S1, S2, S3.]
We are again going to use the Product Rule to determine the number of
placements.
• Just before we place book B1 , all shelves are empty. Therefore, there
are N1 = n ways to do the first task.
• In general, in the i-th task, we have to place book Bi . Since the books
B1 , B2 , . . . , Bi−1 have already been placed, we have the following pos-
sibilities for placing Bi :
    N1 N2 · · · Nm = n(n + 1)(n + 2) · · · (n + m − 1)
ways to do the entire procedure. We can simplify this product:
n(n + 1)(n + 2) · · · (n + m − 1)
    = [1 · 2 · 3 · · · (n − 1)] / [1 · 2 · 3 · · · (n − 1)] · n(n + 1)(n + 2) · · · (n + m − 1)
    = (n + m − 1)! / (n − 1)!.
Thus, the number of placements of the m books on the n shelves is equal to
    (n + m − 1)! / (n − 1)!.
It should be clear that this means that A and B contain the same number
of elements.
Bijection Rule: Let A and B be finite sets. If there exists a bijection
f : A → B, then |A| = |B|, i.e., A and B have the same size.
Let us see how we can apply this rule to the subset problem. We define
the following two sets A and B:
• A = P(S), i.e., the power set of S, which is the set of all subsets of S:
P(S) = {T : T ⊆ S}.
We have seen in Theorem 3.1.1 that the set B has size 2^n. Therefore, if we
can show that there exists a bijection f : A → B, then, according to the
Bijection Rule, we have |A| = |B| and, thus, the number of subsets of S is
equal to 2^n.
Write the set S as S = {s1 , s2 , . . . , sn }. We define the function f : A → B
in the following way:
    bi = 1 if si ∈ T, and bi = 0 if si ∉ T.
• If T = ∅, then f (T ) = 00000.
You will probably have noticed that we could have proved this result
directly using the Product Rule: The procedure “specify a subset of S =
{s1 , s2 , . . . , sn }” can be carried out by specifying, for i = 1, 2, . . . , n, whether
or not si is contained in the subset. For each i, there are two choices. As a
result, there are 2^n ways to do the procedure.
To conclude this section, we remark that we have already been using the
Bijection Rule in Section 3.1!
U \ A = {x : x ∈ U and x ∉ A}.
This rule follows easily from the fact that |U | = |A| + |U \ A|, which holds
because each element in U is either in A or in U \ A.
To apply the Complement Rule to the password problem, let U be the
set of all strings consisting of 8 characters, each character being a lowercase
letter or a digit, and let A be the set of all valid passwords, i.e., all strings
in U that contain at least one digit. Note that U \ A is the set of all strings
of 8 characters, each character being a lowercase letter or a digit, that do
not contain any digit. In other words, U \ A is the set of all strings of 8
characters, where each character is a lowercase letter.
By the Product Rule, the set U has size 36^8, because each string in U
has 8 characters, and there are 26 + 10 = 36 choices for each character.
Similarly, the set U \ A has size 26^8, because there are 26 choices for each
of the 8 characters. Then, by the Complement Rule, the number of valid
passwords is equal to
    36^8 − 26^8 = 2,612,282,842,880.
Note that we already used this rule in Section 3.3 when we argued why
the Complement Rule is correct!
To give an example, consider strings consisting of 6, 7, or 8 characters,
each character being a lowercase letter or a digit. Such a string is called a
valid password, if it contains at least one digit. Let A be the set of all valid
passwords. What is the size of A?
For i = 6, 7, 8, let Ai be the set of all valid passwords of length i. It is
obvious that A = A6 ∪ A7 ∪ A8 . Since the three sets A6 , A7 , and A8 are
pairwise disjoint, we have, by the Sum Rule,
We have seen in Section 3.3 that |A8| = 36^8 − 26^8. By the same arguments,
we have |A6| = 36^6 − 26^6 and |A7| = 36^7 − 26^7. Thus, the number of valid
passwords is equal to
    |A| = (36^6 − 26^6) + (36^7 − 26^7) + (36^8 − 26^8) = 2,684,483,063,360.
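These numbers are easy to reproduce; a short Python check (ours, not part of the text):

def valid_passwords(length):
    # Complement Rule: all strings over 36 characters minus those without a digit
    return 36**length - 26**length

print(valid_passwords(8))                          # 2612282842880
print(sum(valid_passwords(i) for i in (6, 7, 8)))  # 2684483063360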
[Figure: the sets A and B, with x elements belonging only to A, y elements belonging to both A and B, and z elements belonging only to B.]
• Each string in A ∩ B starts with 010 and ends with 11. Thus, five bits
are fixed for every string in A ∩ B. It follows that the size of A ∩ B
is equal to the number of bitstrings of length 12. Therefore, by the
Product Rule, we have |A ∩ B| = 2^12.
By applying the Inclusion-Exclusion formula, it follows that
To give an example, how many bitstrings of length 17 are there that start
with 010, or end with 11, or have 10 at positions 7 and 8? (The positions
are numbered 1, 2, . . . , 17.) Let S be the set
of all such bitstrings. Define A to be the set of all bitstrings of length 17 that
start with 010, define B to be the set of all bitstrings of length 17 that end
with 11, and define C to be the set of all bitstrings of length 17 that have 10
at positions 7 and 8. Then S = A ∪ B ∪ C and, thus, we have to determine
the size of A ∪ B ∪ C.
• We have seen before that |A| = 2^14, |B| = 2^15, and |A ∩ B| = 2^12.
• We have |C| = 2^15, because the bits at positions 7 and 8 are fixed for
every string in C.
• We have |A ∩ C| = 2^12, because 5 bits are fixed for every string in A ∩ C.
|S| = |A ∪ B ∪ C|
= |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|
= 2^14 + 2^15 + 2^15 − 2^12 − 2^12 − 2^13 + 2^10
= 66,560.
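Since there are only 2^17 = 131,072 bitstrings of length 17, the value 66,560 can also be verified by brute force; a sketch (ours):

from itertools import product

count = 0
for bits in product("01", repeat=17):
    s = "".join(bits)
    # the positions are numbered 1, 2, ..., 17
    if s.startswith("010") or s.endswith("11") or s[6:8] == "10":
        count += 1
print(count)  # 66560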
Note that we could also have used Theorem 3.1.3 to prove Theorem 3.6.1:
A permutation of S can be regarded as a one-to-one function f : S → S.
Therefore, by applying Theorem 3.1.3 with A = S, B = S and, thus, m = n,
we obtain Theorem 3.6.1.
Consider the set S = {a, b, c, d, e}. How many 3-element subsets does S
have? Recall that in a set, the order of the elements does not matter. Here
is a list of all 10 subsets of S having size 3:
{a, b, c}, {a, b, d}, {a, b, e}, {a, c, d}, {a, c, e},
{a, d, e}, {b, c, d}, {b, c, e}, {b, d, e}, {c, d, e}
Definition 3.6.2 Let n ≥ 0 and k ≥ 0 be integers. The binomial coefficient
(n choose k) denotes the number of k-element subsets of an n-element set.
The symbol (n choose k) is pronounced as “n choose k”.
The example above shows that (5 choose 3) = 10. Since the empty set has exactly
one subset of size zero (the empty set itself), we have (0 choose 0) = 1. Note that
(n choose k) = 0 if k > n. Below, we derive a formula for the value of (n choose k) if 0 ≤ k ≤ n.
Let S be a set with n elements and let A be the set of all ordered sequences
consisting of exactly k pairwise distinct elements of S. We are going to count
the elements of A in two different ways.
The first way is by using the Product Rule. This gives
    |A| = n(n − 1)(n − 2) · · · (n − k + 1) = n!/(n − k)!. (3.1)
Observe that (3.1) also follows from Theorem 3.1.3. (Do you see why?)
In the second way, we do the following:
• Write down all (n choose k) subsets of S having size k.
• For each such subset, write down the list of all k! ordered sequences
that can be formed using its k elements.
If we put all these lists together, then we obtain a big list in which each
ordered sequence of k pairwise distinct elements of S appears exactly once.
In other words, the big list contains each element of A exactly once. Since
the big list has size (n choose k) · k!, it follows that
    |A| = (n choose k) · k!. (3.2)
Since the right-hand sides of (3.1) and (3.2) are equal (because they are both
equal to |A|), we obtain the following result:
For example,
    (5 choose 3) = 5!/(3!(5 − 3)!) = 5!/(3! 2!) = (1·2·3·4·5)/(1·2·3·1·2) = 10
and
    (0 choose 0) = 0!/(0! 0!) = 1/(1·1) = 1;
recall that we defined 0! to be equal to 1. As another example,
    (52 choose 5) = 52!/(5! 47!) = (52 · 51 · 50 · 49 · 48)/(5 · 4 · 3 · 2 · 1) = 2,598,960.
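In Python, these values are available directly through math.comb; a quick check (illustration only):

from math import comb, factorial

print(comb(5, 3))   # 10
print(comb(0, 0))   # 1
print(comb(52, 5))  # 2598960
# the formula n!/(k!(n-k)!) gives the same value:
print(factorial(52) // (factorial(5) * factorial(47)))  # 2598960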
and there is one way to do the third task. Thus, by the Product Rule, the
number of ways to do the procedure and, therefore, the number of bitstrings
of length n having exactly k many 1s, is equal to
    (n choose k) · 1 · 1 = (n choose k).
We can also use the Bijection Rule, by observing, in the same way as we did
in Section 3.2, that there is a bijection between
• the set of all bitstrings of length n having exactly k many 1s, and
• the set of all k-element subsets of an n-element set.
Since the latter set has size (n choose k), the former set has size (n choose k) as well.
    (x + y)² = x² + 2xy + y².
    x⁵, x⁴y, x³y², x²y³, xy⁴, y⁵,
    (2x − 5y)^25 = ∑_{k=0}^{25} (25 choose k) (2x)^(25−k) (−5y)^k.
In Section 3.7, we will see a proof of Theorem 3.6.6 that does not use
Newton’s Binomial Theorem.
Proof. The claim can be proved using Theorem 3.6.3. To obtain a combi-
natorial proof, let S be a set with n elements. Recall that
• (n choose k) is the number of ways to choose k elements from the set S,
Proof. As in the previous theorem, the claim can be proved using Theo-
rem 3.6.3. To obtain a combinatorial proof, let S be a set with n+1 elements.
We are going to count the k-element subsets of S in two different ways.
First, by definition, the number of k-element subsets of S is equal to
    (n + 1 choose k). (3.3)
For the second way, we choose an element x in S and consider the set
T = S \ {x}, i.e., the set obtained by removing x from S. Any k-element
subset of S is of exactly one of the following two types:
Thus, the second way of counting shows that the number of k-element subsets
of S is equal to
    (n choose k) + (n choose k − 1). (3.4)
Since the expressions in (3.3) and (3.4) count the same objects, they must
be equal. Therefore, the proof is complete.
Proof. We have seen in Theorem 3.6.6 that this identity follows from New-
ton’s Binomial Theorem. Below, we give a combinatorial proof.
Consider a set S with n elements. According to Theorem 3.2.1, this set
has 2^n subsets. A different way to count the subsets of S is by dividing
them into (pairwise disjoint) groups according to their sizes. For each k with
0 ≤ k ≤ n, consider all k-element subsets of S. The number of such subsets
is equal to (n choose k). If we take the sum of all these binomial coefficients, then we
have counted each subset of S exactly once. Thus,
    ∑_{k=0}^{n} (n choose k) = 2^n.
• (n choose 0) = 1 for all integers n ≥ 0,
• (n choose n) = 1 for all integers n ≥ 0,
• (n choose k) = (n − 1 choose k − 1) + (n − 1 choose k) for all integers n ≥ 2 and k with 1 ≤ k ≤ n − 1;
see Theorem 3.7.2.
Algorithm GenerateBinomCoeff:
BCoeff (0, 0) = 1;
for n = 1, 2, 3, . . .
do BCoeff (n, 0) = 1;
for k = 1 to n − 1
do BCoeff (n, k) = BCoeff (n − 1, k − 1) + BCoeff (n − 1, k)
endfor;
BCoeff (n, n) = 1
endfor
    BCoeff(n, k) = (n choose k) for 0 ≤ k ≤ n.
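A direct Python transcription of algorithm GenerateBinomCoeff could look as follows. Since the pseudocode runs forever, this sketch (ours) only builds the rows up to a given bound, and it checks the result against math.comb.

from math import comb

def binomial_table(max_n):
    # table[n][k] = "n choose k", built row by row:
    # BCoeff(n, 0) = BCoeff(n, n) = 1, and Pascal's Identity for the interior
    table = [[1]]
    for n in range(1, max_n + 1):
        row = [1]
        for k in range(1, n):
            row.append(table[n - 1][k - 1] + table[n - 1][k])
        row.append(1)
        table.append(row)
    return table

table = binomial_table(6)
print(table[6])  # [1, 6, 15, 20, 15, 6, 1]
print(all(table[n][k] == comb(n, k)
          for n in range(7) for k in range(n + 1)))  # True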
(0 choose 0)
(1 choose 0) (1 choose 1)
(2 choose 0) (2 choose 1) (2 choose 2)
(3 choose 0) (3 choose 1) (3 choose 2) (3 choose 3)
(4 choose 0) (4 choose 1) (4 choose 2) (4 choose 3) (4 choose 4)
(5 choose 0) (5 choose 1) (5 choose 2) (5 choose 3) (5 choose 4) (5 choose 5)
(6 choose 0) (6 choose 1) (6 choose 2) (6 choose 3) (6 choose 4) (6 choose 5) (6 choose 6)
We obtain the values for the binomial coefficients by using the following
rules:
• Each value in the interior is equal to the sum of the two values above
it.
• The values in the n-th row are equal to the coefficients in Newton’s
Binomial Theorem (i.e., Theorem 3.6.5). For example, the coefficients
in the expansion of (x + y)5 are given in the 5-th row:
• Theorem 3.6.6 states that the sum of all values in the n-th row is equal
to 2^n.
• Theorem 3.7.1 states that reading the n-th row from left to right gives
the same sequence as reading this row from right to left.
• Corollary 3.7.5 states that the sum of the squares of all values in the
n-th row is equal to the middle element in the 2n-th row.
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
In the four tasks above, we first chose the positions for the letter S, then
the positions for the letter C, then the position for the letter U, and finally
the position for the letter E. If we change the order, then we obtain the same
answer. For example, if we choose the positions for the letters in the order
C, E, U, S, then we obtain
    (7 choose 2) (5 choose 1) (4 choose 1) (3 choose 3),
x1 many 0s, followed by one 1, followed by x2 many 0s, followed by one 1,
followed by x3 many 0s. For example,
f (2, 3, 6) = 0010001000000,
f (3, 2, 6) = 0001001000000,
f (0, 11, 0) = 1000000000001,
and
f (2, 0, 9) = 0011000000000.
To show that this function f maps elements of A to elements of B, we
have to verify that the string f (x1 , x2 , x3 ) belongs to the set B. This follows
from the following observations:
    (n + k − 1 choose k − 1).
and B to be the set of all bitstrings of length 14 that contain exactly 3 many
1s (and, thus, exactly 11 many 0s).
The function f : A → B is defined as follows: If (x1 , x2 , x3 ) is an element
of A, then f (x1 , x2 , x3 ) is the bitstring
consisting of x1 many 0s, followed by one 1, followed by x2 many 0s, followed
by one 1, followed by x3 many 0s, followed by one 1, followed by the remaining
11 − x1 − x2 − x3 many 0s. For example,
f (2, 3, 6) = 00100010000001,
f (2, 3, 5) = 00100010000010,
f (0, 1, 0) = 10110000000000,
and
f (0, 0, 0) = 11100000000000.
As before, it can be verified that the string f (x1 , x2 , x3 ) belongs to the set
B and the function f is a bijection. It then follows from the Bijection Rule
that
    |A| = |B| = (14 choose 3) = 364.
The next theorem gives the answer for the general case. As before, you are
encouraged to give a proof.
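For this particular instance, the count can be checked by brute force. The sketch below (ours) assumes, as the examples for f suggest, that A consists of all triples (x1, x2, x3) of non-negative integers with x1 + x2 + x3 ≤ 11, and compares the count with (14 choose 3).

from math import comb

count = sum(1 for x1 in range(12)
              for x2 in range(12)
              for x3 in range(12)
              if x1 + x2 + x3 <= 11)
print(count, comb(14, 3))  # 364 364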
Simon Pratt loves to drink India Pale Ale (IPA). During each day of the
month of April (which has 30 days), Simon drinks at least one bottle of IPA.
During this entire month, he drinks exactly 45 bottles of IPA. The claim is
that there must be a sequence of consecutive days in April, during which
Simon drinks exactly 14 bottles of IPA.
To prove this, let bi be the number of bottles that Simon drinks on April i,
for i = 1, 2, . . . , 30. We are given that each bi is a positive integer (i.e., bi ≥ 1)
and
b1 + b2 + · · · + b30 = 45.
Define, for i = 1, 2, . . . , 30,
ai = b 1 + b 2 + · · · + b i ,
i.e., ai is the total number of bottles of IPA that Simon drinks during the
first i days of April. Consider the sequence of 60 numbers
a1 , a2 , . . . , a30 , a1 + 14, a2 + 14, . . . , a30 + 14.
{1, 2, . . . , 59}.
14 = ai − aj = bj+1 + bj+2 + · · · + bi .
Thus, in the period from April j + 1 until April i, Simon drinks exactly 14
bottles of IPA.
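The argument is built on the prefix sums a1, . . . , a30; the same idea can be turned into a tiny program that finds such a stretch of days for any drinking schedule. This sketch (ours) uses a dictionary of prefix sums instead of the pigeonhole argument itself.

def find_window(bottles, target=14):
    # prefix[i] = a_i = number of bottles drunk during the first i days
    prefix = [0]
    for b in bottles:
        prefix.append(prefix[-1] + b)
    seen = {}  # maps a prefix sum to the day at which it occurs
    for i, a in enumerate(prefix):
        if a - target in seen:
            j = seen[a - target]
            return j + 1, i  # days j+1, ..., i contain exactly `target` bottles
        seen[a] = i
    return None

# 30 days, at least one bottle per day, 45 bottles in total
bottles = [1] * 15 + [2] * 15
print(find_window(bottles))  # (1, 14): days 1, ..., 14 give 14 bottles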
    ai = 2^ki · qi,
(inci, deci), which are placed in the n² boxes of B. By the Pigeonhole Prin-
ciple, there must be a box that contains two (or more) elements. In other
words, there exist two integers i and j such that i < j and
Proof. The proof is by contradiction. Thus, we assume that there are, say,
k prime numbers, and denote them by
    2^n > (n + 1)^k.
We define
f (x) = (m1 , m2 , . . . , mk ).
Since
    mi ≤ m1 + m2 + · · · + mk
       ≤ m1 log p1 + m2 log p2 + · · · + mk log pk
       = log (p1^m1 · p2^m2 · · · pk^mk)
       = log x
       ≤ n,
it follows that
    f(x) ∈ {0, 1, 2, . . . , n}^k.
Thus, f is a function
It is easy to see that this function is one-to-one. The set on the left-hand
side has size 2^n, whereas the set on the right-hand side has size (n + 1)^k. It
then follows from the Pigeonhole Principle that
    (n + 1)^k ≥ 2^n,
3.11 Exercises
3.1 A licence plate number consists of a sequence of four uppercase letters
followed by three digits. How many licence plate numbers are there?
3.3 For each of the following seven cases, determine how many strings of
eight uppercase letters there are.
• The strings start with PQ (in this order) and letters can be repeated.
• The strings start with PQ (in this order) and no letter can be repeated.
• The strings start and end with PQ (in this order) and letters can be
repeated.
• The strings start with XYZ (in this order), end with QP (in this order),
and letters can be repeated.
• The strings start with XYZ (in this order) or end with QP (in this
order), and letters can be repeated.
3.6 The Carleton Computer Science Society has a Board of Directors con-
sisting of one president, one vice-president, one secretary, one treasurer, and
a three-person party committee (whose main responsibility is to buy beer
for the other four board members). The entire board consists of seven dis-
tinct students. If there are n ≥ 7 students in Carleton’s Computer Science
program, how many ways are there to choose a Board of Directors?
3.7 The Carleton Computer Science Society has an Academic Events Com-
mittee (AEC) consisting of five students and a Beer Committee (BC) con-
sisting of six students (whose responsibility is to buy beer for the AEC).
O O X
X X O
X X O
3.12 In how many ways can you paint 200 chairs, if 33 of them must be
painted red, 66 of them must be painted blue, and 101 of them must be
painted green?
3.13 Let A be the set of all integers x > 6543 such that the decimal repre-
sentation of x has distinct digits, none of which is equal to 7, 8, or 9. (The
decimal representation does not have leading zeros.) Determine the size of
the set A.
3.14 Let A be the set of all integers x ∈ {1, 2, . . . , 100} such that the decimal
representation of x does not contain the digit 4. (The decimal representation
does not have leading zeros.)
• Determine the size of the set A without using the Complement Rule.
• Use the Complement Rule to determine the size of the set A.
3.15 Let A be a set of size m, let B be a set of size n, and assume that
n ≥ m ≥ 1. How many functions f : A → B are there that are not one-to-
one?
3.17 How many bitstrings of length 8 are there that contain at least 4 con-
secutive 0s or at least 4 consecutive 1s?
3.18 How many bitstrings of length 77 are there that start with 010 (i.e.,
have 010 at positions 1, 2, and 3), or have 101 at positions 2, 3, and 4, or
have 010 at positions 3, 4, and 5?
3.23 Let m and n be integers with m ≥ n ≥ 1. How many ways are there
to place m books on n shelves, if there must be at least one book on each
shelf? As in Section 3.1.3, the order on each shelf matters.
W B3 B1 W B5 B4 W B2 ,
W B1 B3 W B5 B4 W B2 ,
and
B5 W B3 B1 W W B2 B4 .
3.25 Let n ≥ 1 be an integer and consider n boys and n girls. For each
of the following three cases, determine how many ways there are to arrange
these 2n people on a straight line (the order on the line matters):
• All boys stand next to each other and all girls stand next to each other.
3.26 Elisa Kazan has a set {C1 , C2 , . . . , C50 } consisting of 50 cider bottles.
She divides these bottles among 5 friends, so that each friend receives a
subset consisting of 10 bottles. Determine the number of ways in which Elisa
can divide the bottles.
3.28 The Ottawa Senators and the Toronto Maple Leafs play a best-of-7
series: These two hockey teams play games against each other, and the first
team to win 4 games wins the series. Each game has a winner (thus, no game
ends in a tie).
A sequence of games can be described by a string consisting of the char-
acters S (indicating that the Senators win the game) and L (indicating that
the Leafs win the game). Two possible ways for the Senators to win the
series are (L, S, S, S, S) and (S, L, S, L, S, S).
Determine the number of ways in which the Senators can win the series.
3.29 The Beer Committee of the Carleton Computer Science Society has
bought large quantities of 10 different types of beer. In order to test which
beer students prefer, the committee does the following experiment:
3.31 Let m ≥ 2 and n ≥ 2 be even integers. You are given m beer bottles
B1 , B2 , . . . , Bm and n cider bottles C1 , C2 , . . . , Cn . Assume you arrange these
m + n bottles on a horizontal line such that
female students first and then the male students, or placing the male
students first and then the female students?
A̅ ∩ B̅ ∩ C̅ is equal to the complement of A ∪ B ∪ C.
There are 8 ways to do the first task, 10 ways to do the second task,
and 36^7 ways to do the third task. Therefore, by the Product Rule,
the number of valid passwords is equal to
• How many such permutations do not contain any of the strings wine,
vodka, or coke?
3.43 Determine the number of integers in the set {1, 2, . . . , 1000} that are
not divisible by any of 5, 7, and 11.
• this sequence contains only even numbers (and duplicate elements are
allowed).
• 6 are blond,
How many people in this group are blond and have green eyes?
is an even integer.
3.51 Use Pascal’s Identity (Theorem 3.7.2) to prove Newton’s Binomial The-
orem (i.e., Theorem 3.6.5) by induction.
In the rest of this exercise, you will give a combinatorial proof of this identity.
Consider passwords consisting of n characters, each character being a
digit or a lowercase letter. A password must contain at least one digit.
• Use the Complement Rule of Section 3.3 to show that the number of
passwords is equal to 36^n − 26^n.
• Let k be an integer with 1 ≤ k ≤ n. Prove that the number of pass-
words with exactly k digits is equal to (n choose k) · 10^k · 26^(n−k).
• Explain why the above two parts imply the identity in (3.5).
3.59 Use Newton’s Binomial Theorem (i.e., Theorem 3.6.5) to prove that
for every integer n ≥ 1,
    ∑_{k=0}^{n} (n choose k) 2^k = 3^n. (3.6)
In the rest of this exercise, you will give a combinatorial proof of this identity.
Let A = {1, 2, 3, . . . , n} and B = {a, b, c}. According to Theorem 3.1.2,
the number of functions f : A → B is equal to 3^n.
3.60 Use Newton’s Binomial Theorem (i.e., Theorem 3.6.5) to prove that
for every integer n ≥ 2,
    ∑_{k=0}^{n} (n choose k) (n − 1)^(n−k) = n^n. (3.7)
In the rest of this exercise, you will give a combinatorial proof of this identity.
Consider the set A = {1, 2, . . . , n}. According to Theorem 3.1.2, the
number of functions f : A → A is equal to n^n.
3.70 Let n ≥ 1 be an integer, and let X and Y be two disjoint sets, each
consisting of n elements. An ordered triple (A, B, C) of sets is called cool, if
0 1 2 3 4 5
0 0 1 0 0 1 1
1 1 1 0 1 0 1
3.75 How many different strings can be obtained by reordering the letters of
the word MississippiMills? (This is a town close to Ottawa. James Naismith,
the inventor of basketball, was born there.)
• Determine the number of strings in which the two letters E are next to
each other.
• Determine the number of strings in which the two letters E are not next
to each other and the two letters N are not next to each other.
3.77 Determine the number of elements x in the set {1, 2, 3, . . . , 99999} for
which the sum of the digits in the decimal representation of x is equal to 8.
An example of such an element x is 3041.
3.78 In Theorems 3.9.1 and 3.9.2, we have seen how many solutions (in
non-negative integers) there are for equations of the type
x1 + x2 + · · · + xk = n
x1 + x2 + · · · + xk ≤ n.
3.79 Let n and k be integers with n ≥ k ≥ 1. How many solutions are there
to the equation
x1 + x2 + · · · + xk = n,
where x1 ≥ 1, x2 ≥ 1, . . . , xk ≥ 1 are integers?
Hint: In Theorem 3.9.1, we have seen the answer if x1 ≥ 0, x2 ≥ 0, . . . ,
xk ≥ 0.
3.81 The square in the left figure below is divided into nine cells. In each
cell, we write one of the numbers −1, 0, and 1.
0 1 0
1 1 −1
−1 0 −1
Use the Pigeonhole Principle to prove that, among the rows, columns,
and main diagonals, there exist two that have the same sum. For example,
in the right figure above, both main diagonals have sum 0. (Also, the two
topmost rows both have sum 1, whereas the bottom row and the right column
both have sum −2.)
• Assume that there are two people in S having the same age. Prove that
there exist two distinct subsets A and B of S such that (i) both A and
B are non-empty, (ii) A ∩ B = ∅, and (iii) ∑_{x∈A} age(x) = ∑_{x∈B} age(x).
• Assume that all people in S have different ages. Use the Pigeonhole
Principle to prove that there exist two distinct subsets A and B of S
such that (i) both A and B are non-empty, and (ii) ∑_{x∈A} age(x) =
∑_{x∈B} age(x).
• Assume that all people in S have different ages. Prove that there
exist two distinct subsets A and B of S such that (i) both A and B are
non-empty, (ii) A ∩ B = ∅, and (iii) ∑_{x∈A} age(x) = ∑_{x∈B} age(x).
3.87 Consider five points in a square with sides of length one. Use the
Pigeonhole Principle to prove that there are two of these points having
distance at most 1/√2.
3.89 Let S be a set of 90 positive integers, each one having at most 25 digits
in decimal notation. Use the Pigeonhole Principle to prove that there are
two distinct subsets A and B of S that have the same sum, i.e.,
    ∑_{x∈A} x = ∑_{x∈B} x.
    3^0, 3^1, 3^2, . . . , 3^1000
of integers.
• Prove that this sequence contains two distinct elements whose difference
is divisible by 1000. That is, prove that there exist two integers m and
n with 0 ≤ m < n ≤ 1000, such that 3^n − 3^m is divisible by 1000.
Hint: Consider each element in the sequence modulo 1000 and use the
Pigeonhole Principle.
    3^1, 3^2, . . . , 3^1000
3.92 Let n ≥ 2 be an integer and let G = (V, E) be a graph whose vertex set
V has size n and whose edge set E is non-empty. The degree of any vertex u
is defined to be the number of edges in E that contain u as a vertex. Prove
that there exist at least two vertices in G that have the same degree.
Hint: Consider the cases when G is connected and G is not connected sepa-
rately. In each case, apply the Pigeonhole Principle. Alternatively, consider
a vertex of maximum degree together with its adjacent vertices and, again,
apply the Pigeonhole Principle.
Chapter 4
Recursion
f (1) = 2 · f (0) + 3 = 2 · 3 + 3 = 9.
Can we “solve” this recurrence relation? That is, can we express f (n) in
terms of n only? By looking at these values, you may see a pattern, i.e., you
may guess that for each n ≥ 0,
    f(n) = 3 · 2^(n+1) − 3. (4.1)
We prove this by induction on n. It holds for n = 0, because f(0) = 3 =
3 · 2^1 − 3. Let n ≥ 1 and assume that
    f(n − 1) = 3 · 2^n − 3.
Then
    f(n) = 2 · f(n − 1) + 3
         = 2 (3 · 2^n − 3) + 3
         = 3 · 2^(n+1) − 3.
Thus, we have proved by induction that (4.1) holds for all integers n ≥ 0.
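A few lines of Python (ours, purely as a sanity check) confirm that the recursive definition and the closed form (4.1) agree for small n.

def f(n):
    # the recurrence f(0) = 3 and f(n) = 2 f(n-1) + 3
    return 3 if n == 0 else 2 * f(n - 1) + 3

print(all(f(n) == 3 * 2**(n + 1) - 3 for n in range(20)))  # True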
In words, there are two base cases (i.e., 0 and 1) and each next element in the
sequence is the sum of the previous two elements. This gives the sequence
The following theorem states that we can “solve” this recurrence relation.
That is, we can express the n-th Fibonacci number fn in a non-recursive
way, i.e., without using any other Fibonacci numbers.
Theorem 4.2.1 Let ϕ = (1 + √5)/2 and ψ = (1 − √5)/2 be the two solutions
of the quadratic equation x² = x + 1. Then, for all n ≥ 0, we have
    fn = (ϕ^n − ψ^n) / √5.
Proof. We prove the claim by induction on n. There are two base cases¹:
• Both f0 and (ϕ^0 − ψ^0)/√5 are equal to 0.
• Both f1 and (ϕ^1 − ψ^1)/√5 are equal to 1.
Let n ≥ 2 and assume that the claim is true for n − 2 and n − 1. In other
words, assume that
    fn−2 = (ϕ^(n−2) − ψ^(n−2)) / √5
and
    fn−1 = (ϕ^(n−1) − ψ^(n−1)) / √5.
We have to prove that the claim is true for n as well. Using the definition
of fn, the two assumptions, and the identities ϕ² = ϕ + 1 and ψ² = ψ + 1,
we get
    fn = fn−1 + fn−2
       = (ϕ^(n−1) − ψ^(n−1)) / √5 + (ϕ^(n−2) − ψ^(n−2)) / √5
       = ϕ^(n−2) (ϕ + 1) / √5 − ψ^(n−2) (ψ + 1) / √5
       = (ϕ^(n−2) · ϕ²) / √5 − (ψ^(n−2) · ψ²) / √5
       = (ϕ^n − ψ^n) / √5.
¹ Do you see why there are two base cases?
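A quick numerical check of Theorem 4.2.1 (ours; it uses floating-point arithmetic, so the right-hand side is rounded to the nearest integer):

from math import sqrt

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + sqrt(5)) / 2
psi = (1 - sqrt(5)) / 2
print(all(fib(n) == round((phi**n - psi**n) / sqrt(5))
          for n in range(40)))  # True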
Let us start by determining Bn for some small values of n. There are two
bitstrings of length 1:
0, 1.
Since neither of them contains 00, we have B1 = 2. There are four bitstrings
of length 2:
00, 10, 01, 11.
Since three of them do not contain 00, we have B2 = 3. Similarly, there are
eight bitstrings of length 3:
• How many rows are there in the top part? Any string in the top part
starts with 1 and is followed by a bitstring of length n − 1 that does not
contain 00. Thus, if we take the rows in the top part and delete the first
bit from each row, then we obtain all 00-free bitstrings of length n − 1.
Since the number of 00-free bitstrings of length n − 1 is equal to Bn−1 ,
it follows that the top part of the matrix consists of Bn−1 rows.
• How many rows are there in the bottom part? Any string in the bottom
part starts with 0. Since the string does not contain 00, the second bit
must be 1. After these first two bits, we have a bitstring of length n − 2
that does not contain 00. Thus, if we take the rows in the bottom part
and delete the first two bits from each row, then we obtain all 00-free
bitstrings of length n − 2. Since the number of 00-free bitstrings of
length n − 2 is equal to Bn−2 , it follows that the bottom part of the
matrix consists of Bn−2 rows.
Thus, on the one hand, the matrix has Bn rows. On the other hand, this
matrix has Bn−1 + Bn−2 rows. Therefore, we have Bn = Bn−1 + Bn−2 .
To summarize, we have proved that the values Bn , for n ≥ 1, satisfy the
following recurrence relation:
B1 = 2,
B2 = 3,
Bn = Bn−1 + Bn−2 , if n ≥ 3.
This recurrence relation is the same as the one for the Fibonacci numbers,
except that the two base cases are different. The sequence Bn , n ≥ 1, consists
of the integers
2, 3, 5, 8, 13, 21, 34, 55, 89, 144, . . .
We obtain this sequence by removing the first three elements (i.e., f0 , f1 , and
f2 ) from the Fibonacci sequence. We leave it to the reader to verify (using
induction) that for all n ≥ 1,
Bn = fn+2 .
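Both the recurrence and the relation Bn = fn+2 are easy to confirm by brute force for small n; a sketch (ours):

from itertools import product

def count_00_free(n):
    # the number of bitstrings of length n that do not contain 00
    return sum(1 for bits in product("01", repeat=n)
                 if "00" not in "".join(bits))

fib = [0, 1]
while len(fib) < 20:
    fib.append(fib[-1] + fib[-2])

print([count_00_free(n) for n in range(1, 8)])  # [2, 3, 5, 8, 13, 21, 34]
print(all(count_00_free(n) == fib[n + 2] for n in range(1, 15)))  # True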
Thus, if we already know that x and y belong to the set S, then the second
rule gives us a new element, i.e., x − y, that also belongs to S.
Can we give a simple description of the set S? We are going to use the
rules to obtain some elements of S. From these examples, we then hope to
see a pattern from which we guess the simple description of S. The final step
consists of proving that our guess is correct.
S ⊆ {5n : n ∈ Z},
How do we prove this? The set S is defined using a base case and a recursive
rule. The only way to obtain an element of S is by starting with the base
case and then applying the recursive rule a finite number of times. Therefore,
the following will prove that (4.3) holds:
• The element in the base case, i.e., 5, is a multiple of 5.
• Let x and y be two elements of S and assume that they are both
multiples of 5. Then x − y (which is the “next” element of S) is also a
multiple of 5.
Next we prove that
{5n : n ∈ Z} ⊆ S.
We will do this by proving that for all n ≥ 0,
5n ∈ S and − 5n ∈ S. (4.4)
5n ∈ S and − 5n ∈ S.
[Figure: the knowledge of the four people P1, P2, P3, P4 after each of the four phone calls; initially person Pi knows only scandal Si, and after the fourth call every person knows all four scandals S1, S2, S3, S4.]
We see that after four phone calls, each person knows all four scandals.
Observe that the number of phone calls would have been (4 choose 2) = 6 if we had used the
• We assume that we know how to schedule the phone calls for groups of
n − 1 people.
• Consider Sn−1 and Sn to be one scandal S′n−1.
Algorithm gossip(n):
We are now going to determine the number of phone calls made when
running algorithm gossip(n). Since we do not know the answer yet, we
introduce a variable C(n) to denote this number. It follows from the pseu-
docode that
C(4) = 4.
Let n ≥ 5. Algorithm gossip(n) starts and ends with the same phone call:
Pn−1 calls Pn . In between, it runs algorithm gossip(n − 1), during which,
by definition, C(n − 1) phone calls are made. It follows that
C(n) = 2 + C(n − 1) for n ≥ 5.
Thus, we have obtained a recurrence relation for the numbers C(n). The
first few numbers in the sequence are
C(4) = 4,
C(5) = 2 + C(4) = 2 + 4 = 6,
C(6) = 2 + C(5) = 2 + 6 = 8,
C(7) = 2 + C(6) = 2 + 8 = 10.
From this, we guess that
C(n) = 2n − 4 for n ≥ 4.
We can easily prove by induction that our guess is correct. Indeed, since
both C(4) and 2 · 4 − 4 are equal to 4, the claim is true for n = 4. If n ≥ 5
and C(n − 1) = 2(n − 1) − 4, then
C(n) = 2 + C(n − 1) = 2 + (2(n − 1) − 4) = 2n − 4.
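The recursive schedule can also be simulated. The sketch below (ours) uses one possible base schedule for four people that matches the figure above, and the recursive rule "Pn−1 calls Pn, run gossip(n − 1), then Pn−1 calls Pn again"; it checks that 2n − 4 calls suffice.

def gossip_schedule(n):
    # returns a list of phone calls (i, j); people are numbered 1, ..., n
    if n == 4:
        return [(1, 2), (3, 4), (1, 3), (2, 4)]  # one valid 4-call base schedule
    return [(n - 1, n)] + gossip_schedule(n - 1) + [(n - 1, n)]

def everyone_knows_everything(n):
    knows = {p: {p} for p in range(1, n + 1)}  # person p starts with scandal p
    for i, j in gossip_schedule(n):
        union = knows[i] | knows[j]  # a call exchanges all known scandals
        knows[i] = knows[j] = union
    return all(knows[p] == set(range(1, n + 1)) for p in knows)

for n in range(4, 10):
    calls = gossip_schedule(n)
    print(n, len(calls), 2 * n - 4, everyone_knows_everything(n))
# each line prints n, the number of calls, 2n - 4, and True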
and
b = 137,916,675 = 3^4 · 5^2 · 13^3 · 31.
From this, we see that
a = qb + r, q ≥ 0, and 0 ≤ r ≤ b − 1.
The modulo operation, denoted by a mod b, is the function that maps the
pair (a, b) to the remainder r. Thus, we will write
a mod b = r.
For example,
• 17 mod 5 = 2, because 17 = 3 · 5 + 2,
• 17 mod 17 = 0, because 17 = 1 · 17 + 0,
• 17 mod 1 = 0, because 17 = 17 · 1 + 0,
M (a, b) ≤ b.
This gives an upper bound that is linear in b. Below, we will prove a much
better upper bound: The value of M (a, b) is at most logarithmic in b. We
will use the Fibonacci numbers of Section 4.2 to obtain this result. Recall
that these numbers are defined by
f0 = 0,
f1 = 1,
fn = fn−1 + fn−2 , if n ≥ 2.
Lemma 4.5.3 Let a and b be integers with a > b ≥ 1, and let m = M (a, b).
Then a ≥ fm+2 and b ≥ fm+1 .
a = qb + r ≥ b + r ≥ fm+1 + fm = fm+2 .
In Theorem 4.2.1, we have seen that the Fibonacci numbers can be ex-
pressed in terms of the numbers ϕ = (1 + √5)/2 and ψ = (1 − √5)/2. You are
encouraged to prove, by induction and using the fact that ϕ² = ϕ + 1, that
for any integer n ≥ 2,
    fn ≥ ϕ^(n−2). (4.5)
    M(a, b) ≤ 1 + log_ϕ b,
    b ≥ fm+1 ≥ ϕ^(m−1).
    m − 1 ≤ log_ϕ b,
i.e.,
    M(a, b) = m ≤ 1 + log_ϕ b = O(log b).
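The logarithmic behaviour is easy to observe experimentally. The sketch below (ours, not the book's pseudocode) runs Euclid's algorithm iteratively, counts the modulo operations, and prints log_ϕ b for comparison; the exact constant in the bound depends on how M(a, b) is defined.

from math import log, sqrt

def euclid(a, b):
    # iterative Euclid; returns gcd(a, b) and the number of modulo operations
    count = 0
    while b > 0:
        a, b = b, a % b
        count += 1
    return a, count

phi = (1 + sqrt(5)) / 2
for a, b in [(25, 10), (610, 377), (987, 610)]:
    g, m = euclid(a, b)
    print(a, b, g, m, round(log(b, phi), 2))
# consecutive Fibonacci numbers, such as (610, 377), are the slowest inputs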
• it recursively sorts the sequence am+1 , am+2 , . . . , an and stores the sorted
sequence in a list L2 ,
• it merges the two sorted lists L1 and L2 into one sorted list.
// L is a list of n ≥ 0 numbers
if n ≥ 2
then m = bn/2c;
L1 = list consisting of the first m elements of L;
L2 = list consisting of the last n − m elements of L;
L1 = MergeSort(L1 , m);
L2 = MergeSort(L2 , n − m);
L = Merge(L1 , L2 )
endif;
return L
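A direct Python version of this pseudocode, with the Merge step written out, could look as follows (a sketch, ours).

def merge(l1, l2):
    # merge two sorted lists into one sorted list
    result = []
    i = j = 0
    while i < len(l1) and j < len(l2):
        if l1[i] <= l2[j]:
            result.append(l1[i])
            i += 1
        else:
            result.append(l2[j])
            j += 1
    return result + l1[i:] + l2[j:]

def merge_sort(l):
    n = len(l)
    if n >= 2:
        m = n // 2
        l = merge(merge_sort(l[:m]), merge_sort(l[m:]))
    return l

print(merge_sort([5, 1, 4, 2, 8, 3]))  # [1, 2, 3, 4, 5, 8]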
T (1) = 0.
Assume that n ≥ 2 and consider again the pseudocode for MergeSort(L, n).
Which parts of the algorithm make comparisons between input elements?
• The call MergeSort(L1 , m) is a recursive call on a list of m = n/2
numbers. By definition, the total number of comparisons made in this
call (together with all its recursive subcalls) is at most T (n/2).
T (1) = 0,
T (n) ≤ 2 · T (n/2) + n, if n ≥ 2 and n is a power of 2. (4.6)
Our goal was to determine T (n), but at this moment, we only have a recur-
rence relation for this function. We will solve this recurrence relation using
a technique called unfolding:
Recall that we assume that n = 2k for some integer k ≥ 0. We further-
more assume that n is a large integer. We know from (4.6) that
T (n) ≤ 2 · T (n/2) + n.
T(n) ≤ 2 · T(n/2) + n
      ≤ 2 (2 · T(n/2²) + n/2) + n
      = 2² · T(n/2²) + 2n.
T(n) ≤ 2² · T(n/2²) + 2n
      ≤ 2² (2 · T(n/2³) + n/2²) + 2n
      = 2³ · T(n/2³) + 3n.
T(n) ≤ 2³ · T(n/2³) + 3n
      ≤ 2³ (2 · T(n/2⁴) + n/2³) + 3n
      = 2⁴ · T(n/2⁴) + 4n.
At this moment, you will see the pattern and, at the end, we get the inequality
    T(n) ≤ 2^k · T(n/2^k) + k · n.
Since n = 2^k, we have T(n/2^k) = T(1), which is 0 from the base case of the
recurrence relation. Also, n = 2^k implies that k = log n. We conclude that
    T(n) ≤ n log n.
We thus have solved the recurrence relation. In case you have doubts about
the validity of the unfolding method, we verify by induction that indeed
    T(n) ≤ n log n for every integer n ≥ 1 that is a power of 2.
The base case is when n = 1. In this case, we have T(1) = 0 and 1 log 1 =
1 · 0 = 0. Let n ≥ 2 be a power of 2 and assume that
    T(n/2) ≤ (n/2) log(n/2).
Then, by (4.6),
T (n) ≤ 2 · T (n/2) + n.
[Figure: two points p = (p1, p2) and q = (q1, q2), their Euclidean distance d(p, q), and the horizontal and vertical differences |p1 − q1| and |p2 − q2|.]
δ(S)
• n is a power of two,
• no two points of S have the same x-coordinate,
• no two points of S have the same y-coordinate.
We remark that neither of these assumptions is necessary. We only make
them to simplify the presentation.
As mentioned above, our algorithm will be recursive. The base case is
when n = 2, i.e., the set S consists of exactly two points, say p and q. In
this case, the algorithm simply returns the distance d(p, q).
From now on, we assume that n ≥ 4. The algorithm performs the follow-
ing four steps:
Step 1: Let ℓ be a vertical line that splits the set S into two subsets of equal
size. The algorithm computes the set S1 consisting of all points of S that are
to the left of ℓ, and the set S2 consisting of all points of S that are to the
right of ℓ. Observe that |S1| = |S2| = n/2.
Step 2: The algorithm recursively computes the closest-pair distance δ1 in
the set S1 .
Step 3: The algorithm recursively computes the closest-pair distance δ2 in
the set S2 .
[Figures: the vertical line ℓ splits S into S1 and S2, with closest-pair distances δ1 and δ2; the vertical lines ℓ1 and ℓ2 at distance δ = min(δ1, δ2) to the left and to the right of ℓ bound the strip containing the sets S1′ and S2′; for a point p in S1′ and a point q in S2′, the rectangles Rp and Rq of width 2δ and height δ are drawn with p, respectively q, on their bottom sides.]
where
    B1 = {(p, q) : p ∈ S1′, q ∈ S2′, q ∈ Rp}
and
    B2 = {(q, p) : p ∈ S1′, q ∈ S2′, p ∈ Rq}.
Proof. We have to show that every element of the set A belongs (as an
ordered pair) to the set B. To prove this, consider an arbitrary element
{p, q} of A. We will show that one of the ordered pairs (p, q) and (q, p) is an
element of the set B.
It follows from (4.7) that p ∈ S1′ and q ∈ S2′. Thus, to prove that one of
(p, q) and (q, p) is an element of B, it remains to be shown that
q ∈ Rp or p ∈ Rq . (4.8)
Since {p, q} ∈ A, we have d(p, q) < δ. This implies that the vertical
distance between p and q is less than δ. That is, if p = (p1 , p2 ) and q = (q1 , q2 ),
then |p2 − q2 | < δ.
If p2 < q2 , then the point q is contained in the rectangle Rp and, therefore,
(4.8) holds. Otherwise, p2 > q2 , in which case the point p is contained in the
rectangle Rq and, thus, (4.8) also holds.
Is there a non-trivial upper bound on the size of the set B? Since each
of the two sets S1′ and S2′ can have n/2 elements, it is clear that |B| ≤
n/2 · n/2 = n²/4. In words, the size of B is at most quadratic in n. The
following lemma states that the size of B is, in fact, at most linear in n:
Proof. Let p be an arbitrary point in S1′. We claim that there are at most
four points q such that (p, q) ∈ B1. We will prove this claim by contradiction.
Thus, assume that there are at least five such points q. Observe that for any
such point q, we have q ∈ S2′ and q ∈ Rp. Therefore, all these points q are
contained in the part of Rp that is to the right of the line ℓ. This part is a
square with sides of length δ. By Exercise 3.87, there are two of these points
that have distance at most
    δ/√2 < δ.
Thus, the set S2′ contains two points having distance less than δ. That is,
the closest-pair distance in the set S2 is less than δ.
On the other hand, recall that δ = min(δ1, δ2) and δ2 is the closest-pair
distance of the set S2. It follows that all distances in the set S2 are at least
equal to δ. This is a contradiction.
Thus, we have shown that, for this fixed point p in S1′, there are at most
four points q such that (p, q) ∈ B1. Therefore,
By a symmetric argument, for any fixed point q in S2′, there are at most
four points p such that (q, p) ∈ B2. This implies that the set B2 contains at
most 2n elements. We conclude that
We are now ready to define the set C that we are looking for. Let
    S′1,2 = S1′ ∪ S2′.
Imagine that we have the points of this set S′1,2 in increasing order of their
y-coordinates. Consider an arbitrary point r of S′1,2. The seven y-successors
of r are the seven points of S′1,2 that immediately follow r in this increasing
order. In the figure below, these are the points a, b, . . . , g.
[Figure: a point r in the strip between ℓ1 and ℓ2, together with its seven y-successors a, b, c, d, e, f, g.]
Observe that the number of points that follow r may be less than seven.
In this case, we abuse our terminology a bit and still talk about the seven
y-successors of r, even though there are fewer of them.
Our final set C is defined as follows:
Proof. We will prove that B ⊆ C. It will then follow from Lemma 4.7.1
that A ⊆ C.
Let (p, q) be an arbitrary element in the set B1 . It follows from the
definition of B1 that p ∈ S1′, q ∈ S2′, and q ∈ Rp. To prove that (p, q) is an
element of the set C, we have to argue that q is one of the seven y-successors
of p.
As in the proof of Lemma 4.7.2, (i) the part of Rp that is to the left of
the line ℓ contains at most four points of S1′ and (ii) the part of Rp that is
to the right of ℓ contains at most four points of S2′. Thus, the rectangle Rp
contains at most eight points of S1′ ∪ S2′. Since p is one of them and p is on
the bottom side of Rp , the point q must be one of the seven y-successors of p.
Thus, we have shown that B1 ⊆ C. By a symmetric argument, B2 ⊆ C.
Consider the elements (r, s) of the set C. There are at most n choices
for the point r. For each choice of r, there are at most seven choices for the
point s. This proves the following lemma:
Step 1: Determine a vertical line ℓ that splits the point set into two subsets,
each having size n/2. This step is easy to perform if we have the points in
sorted order of their x-coordinates.
Steps 2 and 3: Run the algorithm recursively, once for all points to the left
of ℓ, and once for all points to the right of ℓ.
Step 4: Compute and traverse the set C that is defined in (4.9). This step is
easy to perform if we have the points in sorted order of their y-coordinates.
We assume that the set of input points is stored in a list L. The entire
algorithm, which we denote by ClosestPair(L, n), is given in Figure 4.1. In
the pseudocode, Merge(·, ·, y) refers to the merge algorithm of Section 4.6
that merges two lists, based on the y-coordinates of the points.
• At termination, the list L stores the same points, but in sorted order
of their y-coordinates.
if n = 2
then δ = the distance between the two points in L;
sort the points in L by their y-coordinates;
return δ
else L1 = list consisting of the first n/2 points in L;
L2 = list consisting of the last n/2 points in L;
z = any value between the x-coordinates of the last point
of L1 and the first point of L2 ;
// both L1 and L2 are sorted by x-coordinate
δ1 = ClosestPair(L1 , n/2);
δ2 = ClosestPair(L2 , n/2);
// both L1 and L2 are sorted by y-coordinate
δ = min(δ1 , δ2 );
L′1 = list consisting of all points p of L1 with p1 > z − δ;
L′2 = list consisting of all points q of L2 with q1 < z + δ;
// both L′1 and L′2 are sorted by y-coordinate
L′1,2 = Merge (L′1, L′2, y);
L = Merge (L1, L2, y);
// both L′1,2 and L are sorted by y-coordinate
if L′1,2 is empty
then return δ
else δ′1,2 = min{d(r, s) : r, s ∈ L′1,2, s is one of the seven
y-successors of r};
return min(δ, δ′1,2)
endif
endif;
Note that the input list L must contain the points in sorted order of
their x-coordinates. Therefore, before the first call to ClosestPair, we run
algorithm MergeSort(L, n) of Section 4.6 to sort the input points by their
x-coordinates. By Theorem 4.6.1, this takes O(n log n) time.
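For concreteness, here is a minimal Python sketch of the divide-and-conquer strategy described above; it is not the book's algorithm verbatim. It assumes the input is pre-sorted by x-coordinate and, for brevity, it re-sorts the strip around the splitting line with Python's built-in sort instead of merging, so its running time is O(n log^2 n) rather than O(n log n).

import math

def closest_pair(points):
    # points: a list of at least two (x, y) tuples, sorted by x-coordinate
    n = len(points)
    if n <= 3:
        return min(math.dist(p, q)
                   for i, p in enumerate(points) for q in points[i + 1:])
    left, right = points[:n // 2], points[n // 2:]
    z = (left[-1][0] + right[0][0]) / 2          # vertical splitting line
    delta = min(closest_pair(left), closest_pair(right))
    # keep only the points within distance delta of the splitting line,
    # in increasing order of their y-coordinates
    strip = sorted((p for p in points if abs(p[0] - z) < delta),
                   key=lambda p: p[1])
    # compare each point of the strip with its (at most) seven y-successors
    for i, r in enumerate(strip):
        for s in strip[i + 1:i + 8]:
            delta = min(delta, math.dist(r, s))
    return delta

pts = sorted([(2.0, 3.0), (12.0, 30.0), (40.0, 50.0), (5.0, 1.0),
              (14.0, 10.0), (3.0, 4.0), (0.0, 7.0), (9.0, 9.0)])
print(closest_pair(pts))    # 1.414..., attained by the pair (2, 3), (3, 4)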
We now analyze the running time of algorithm ClosestPair(L, n). Let
T (n) denote the worst-case running time of this algorithm, when given as
input a list of size n whose points are in sorted x-order. If n = 2, then
the running time is bounded by some constant, say c. If n ≥ 4, then the
algorithm spends 2 · T (n/2) time for the two recursive calls, whereas the rest
of the algorithm takes at most c′n time, where c′ is some constant. Thus,
the function T (n) satisfies the following recurrence:

T (2) ≤ c,
T (n) ≤ 2 · T (n/2) + c′n, if n ≥ 4 and n is a power of 2.
As in Section 4.6.2, this recurrence solves to T (n) = O(n log n). Thus, we
have proved the following result:
R1 = 1, R2 = 2, R3 = 4, R4 = 8, R5 = 16.
• These vertices divide the segments into subsegments and the circle into
arcs in a natural way. Each such subsegment and arc is an edge of the
graph.
The figure below illustrates this for the case when n = 5. The graph on
the right has 10 = 5 + 5 vertices: Each of the 5 points on the circle leads to
one vertex and each of the 5 intersection points leads to one vertex. These
10 vertices divide the \binom{5}{2} = 10 segments into 20 straight-line edges and the
circle into 5 circular edges. Therefore, the graph has 20 + 5 = 25 edges.
Note that, strictly speaking, this process does not define a proper graph,
because any two consecutive vertices on the circle are connected by two edges
(one straight-line edge and one circular edge), whereas in a proper graph,
there can be only one edge between any pair of vertices. For simplicity,
however, we will refer to the resulting structure as a graph.
Let Vn and En be the number of vertices and edges of the graph, respec-
tively. We claim that
V_n ≤ n + \binom{\binom{n}{2}}{2}.          (4.10)
This claim follows from the following observations:
• There are exactly n vertices on the circle.
• The n points on the circle are connected by \binom{n}{2} segments, and any two of these segments intersect in at most one point. Therefore, the number of vertices in the interior of the circle is at most \binom{\binom{n}{2}}{2}.
(Figure: a region r, a point p_r in its interior, and the edge f (r) that is assigned to the region r.)
This defines a one-to-one function f from the set of regions to the set of
edges. Therefore, the number of regions, which is Rn , is at most the number
of edges, which is En .
By combining (4.10), (4.11), and (4.12), we get

R_n ≤ E_n ≤ n + \binom{V_n}{2} ≤ n + \binom{n + \binom{\binom{n}{2}}{2}}{2}.
In order to estimate the last quantity, we are going to use asymptotic notation; see Section 2.3. First observe that

\binom{n}{2} = \frac{n(n-1)}{2} = O(n^2).

It follows that \binom{\binom{n}{2}}{2} = O(n^4) and, therefore,

\binom{n + \binom{\binom{n}{2}}{2}}{2} = \binom{O(n^4)}{2} = O(n^8),

which implies that

R_n ≤ n + \binom{n + \binom{\binom{n}{2}}{2}}{2} = n + O(n^8) = O(n^8).
Thus, we have proved our claim that R_n grows polynomially in n and, therefore, for large values of n, R_n is not equal to 2^{n−1}. (Using results on planar graphs that we will see in Section 7.5.1, it can be shown that, in fact, R_n = O(n^4).)
We remark that there is a shorter way to prove that R_n is not equal to
2^{n−1} for all n ≥ 1: you can verify by hand that R_6 = 31. Still, this single
example does not rule out the possibility that Rn grows exponentially. The
analysis that we gave above does rule this out.
We have proved above that R_n = O(n^8). We also mentioned that this
upper bound can be improved to O(n^4). In the following subsections, we
will prove that the latter upper bound cannot be improved. That is, we will
prove that R_n = Θ(n^4). In fact, we will determine an exact formula, in terms
of n, for the value of Rn .
We start by illustrating this process for the case when n = 6. The figure
below shows the situation after we have removed all segments that have p6
as an endpoint. The number of regions is equal to R5 = 16.
(Figure: the points p1, . . . , p6 on the circle, with all segments that do not have p6 as an endpoint.)
We are going to add, one by one, the five segments that have p6 as an
endpoint. When we add p1 p6 , one region gets cut into two. Thus, the number
of regions increases by one. Using the notation introduced above, we have
I1 = 1.
When we add p2 p6 , four regions get cut into two. Thus, the number of
regions increases by four, and we have I2 = 4.
When we add p3 p6 , five regions get cut into two. Thus, the number of
regions increases by five, and we have I3 = 5.
When we add p4 p6 , four regions get cut into two. Thus, the number of
regions increases by four, and we have I4 = 4.
Finally, when we add p5 p6 , one region gets cut into two. Thus, the number
of regions increases by one, and we have I5 = 1.
After having added the five segments with endpoint p6 , we have accounted
for all regions determined by the six points. In other words, the number of
regions we have at the end is equal to R6 . Since the number of regions at
the end is also equal to the sum of (i) the number of regions we started with,
which is R5 , and (ii) the total increase, we have
R6 = R5 + I1 + I2 + I3 + I4 + I5 = 31.
Let us look at this more carefully. We have seen that I3 = 5. That is,
when adding the segment p3 p6 , the number of regions increases by 5. Where
does this number 5 come from? The segment p3 p6 intersects 4 segments,
namely p1 p4 , p1 p5 , p2 p4 , and p2 p5 . The increase in the number of regions is
one more than the number of intersections. Thus, when adding a segment,
if we determine the number X of intersections between this new segment
and existing segments, then the increase in the number of regions is equal to
1 + X.
When we add p3 p6 , we have X = 4. Where does this number 4 come
from? We make the following observations:
• Any segment that intersects p3 p6 has one endpoint above p3 p6 and one
endpoint below p3 p6 .
• Any pair (a, b) of points on the circle, with a above p3 p6 and b below p3 p6, defines a segment ab that intersects p3 p6.

Since there are two points (p1 and p2) above p3 p6 and two points (p4 and p5) below it, the number of such pairs, and thus the number of intersections, is X = 2 · 2 = 4.
Now that we have seen the basic approach, we are going to derive the
recurrence relation for Rn for an arbitrary integer n ≥ 2. After having
removed all segments that have pn as an endpoint, we have Rn−1 regions.
For each integer k with 1 ≤ k ≤ n − 1, we add the segment pk pn . What is
the number of existing segments that are intersected by this new segment?
(Figure: the segment p_k p_n; the points p_1, . . . , p_{k−1} lie on one side of it and the points p_{k+1}, . . . , p_{n−1} lie on the other side; a segment p_i p_j intersects p_k p_n exactly when its endpoints lie on opposite sides.)

Since a segment intersects p_k p_n if and only if it has one endpoint on each side of p_k p_n, the new segment p_k p_n intersects exactly (k − 1)(n − 1 − k) existing segments, and adding it increases the number of regions by 1 + (k − 1)(n − 1 − k). This gives the recurrence

R_1 = 1,                                                           (4.13)
R_n = R_{n−1} + \sum_{k=1}^{n−1} (1 + (k − 1)(n − 1 − k)), if n ≥ 2.   (4.14)

To simplify this recurrence, we use a combinatorial proof to show that

\sum_{k=1}^{n−1} (k − 1)(n − 1 − k) = \binom{n−1}{3}               (4.15)

for any integer n ≥ 2. (In fact, (4.15) is a special case of the result in
Exercise 3.62.)
If n ∈ {2, 3}, then both sides of (4.15) are equal to zero. Assume that
n ≥ 4 and consider the set S = {1, 2, . . . , n−1}. We know that the number of
3-element subsets of S is equal to \binom{n−1}{3}. As we will see below, the summation
on the left-hand side of (4.15) counts exactly the same subsets.
We divide the 3-element subsets of S into groups based on their mid-
dle element. Observe that the middle element can be any of the values
2, 3, . . . , n − 2. Thus, for any k with 2 ≤ k ≤ n − 2, the k-th group Gk
consists of all 3-element subsets of S whose middle element is equal to k.
Since the groups are pairwise disjoint, we have
\binom{n−1}{3} = \sum_{k=2}^{n−2} |G_k|.
What is the size of the k-th group Gk ? Any 3-element subset in Gk consists
of
• one element from {1, 2, . . . , k − 1},
• the element k, and
• one element from {k + 1, k + 2, . . . , n − 1}.
It then follows from the Product Rule that |G_k| = (k − 1) · 1 · (n − 1 − k) = (k − 1)(n − 1 − k).
Thus, we have proved the identity in (4.15), and the recurrence relation in
(4.13) and (4.14) becomes
R_1 = 1,
R_n = R_{n−1} + (n − 1) + \binom{n−1}{3}, if n ≥ 2.          (4.16)

Unfolding this recurrence, we get

R_n = (n − 1) + (n − 2) + (n − 3) + · · · + 3 + 2 + 1
      + \binom{n−1}{3} + \binom{n−2}{3} + \binom{n−3}{3} + · · · + \binom{3}{3}
      + 1.
Since, by Theorem 2.2.10, the first summation is equal to
1 + 2 + 3 + · · · + (n − 1) = n(n − 1)/2 = \binom{n}{2},

we get

R_n = 1 + \binom{n}{2} + \sum_{k=3}^{n−1} \binom{k}{3}.
The final step is to simplify the summation on the right-hand side. We
will use a combinatorial proof to show that
\sum_{k=3}^{n−1} \binom{k}{3} = \binom{n}{4},          (4.17)

for any integer n ≥ 2. (As was the case for (4.15), the identity in (4.17) is a
special case of the result in Exercise 3.62.)
If n ∈ {2, 3}, then both sides of (4.17) are equal to zero. Assume that
n ≥ 4 and consider all 4-element subsets of the set S = {1, 2, . . . , n}. We
know that there are \binom{n}{4} many such subsets. We divide these subsets into
groups based on their largest element. For any k with 3 ≤ k ≤ n − 1, the
k-th group Gk consists of all 4-element subsets of S whose largest element is
equal to k + 1. It should be clear that
\binom{n}{4} = \sum_{k=3}^{n−1} |G_k|.
To determine the size of the group Gk , we observe that any 4-element subset
in Gk consists of
• three elements from {1, 2, . . . , k} and
• the element k + 1.
It then follows from the Product Rule that
|G_k| = \binom{k}{3} · 1 = \binom{k}{3},
completing the proof of (4.17).
After (finally!) having solved and simplified our recurrence relation, we
conclude that for any integer n ≥ 1,
R_n = 1 + \binom{n}{2} + \binom{n}{4}.
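As a quick numerical sanity check (not part of the derivation), the following Python snippet evaluates this formula and reproduces the values R1, . . . , R6 = 1, 2, 4, 8, 16, 31 seen earlier, as well as the fact that Rn falls far short of 2^{n−1} for larger n.

from math import comb

def R(n):
    # the exact number of regions, as derived above
    return 1 + comb(n, 2) + comb(n, 4)

print([R(n) for n in range(1, 7)])    # [1, 2, 4, 8, 16, 31]
print(R(10), 2 ** 9)                  # 256 regions, versus 2^9 = 512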
In Exercise 4.76, you will see a shorter way to determine the exact value
of Rn . We went for the long derivation, because it allowed us to illustrate,
along the way, several techniques from previous sections.
4.9 Exercises
4.1 The function f : N → Z is recursively defined as follows:
f (0) = 7,
f (n) = f (n − 1) + 6n − 3 if n ≥ 1.
Prove that f (n) = 3n^2 + 7 for all integers n ≥ 0.
f (1) = 2,
f (n) = \frac{1}{2} f (n − 1) + \frac{1}{f (n − 1)} if n ≥ 2.
4.7 You are asked to come up with an exam question about recursive func-
tions. You write down some recurrence, which you then solve. Afterwards,
you give the recurrence to the students, together with the solution. The
students must then prove that the given solution is indeed correct.
This is a painful process, because you must solve the recurrence yourself.
Since you are lazy, you start with the following:
Exam Question:
f (0) = XXX,
f (n) = f (n − 1) + Y Y Y if n ≥ 1.
f (n) = 7n^2 − 2n + 9.
• Complete the question, i.e., fill in XXX and Y Y Y , so that you obtain
a complete recurrence that has the given solution.
f (n) = 2n(n − 6)
a0 = 5,
a1 = 3,
an = 6 · an−1 − 9 · an−2 if n ≥ 2.
• Determine an for n = 0, 1, 2, 3, 4, 5.
a_n = (5 − 4n) · 3^n.
4.16 Let ϕ = \frac{1+\sqrt{5}}{2} and ψ = \frac{1−\sqrt{5}}{2}, and let n ≥ 0 be an integer. We have
seen in Theorem 4.2.1 that

\frac{ϕ^n − ψ^n}{\sqrt{5}}          (4.18)
is equal to the n-th Fibonacci number fn . Since the Fibonacci numbers are
obviously integers, the number in (4.18) is an integer as well.
Prove that the number in (4.18) is a rational number using only Newton’s
Binomial Theorem (i.e., Theorem 3.6.5).
and

f_1^2 + f_2^2 + f_3^2 + · · · + f_n^2 = f_n f_{n+1}.
• f3n is even,
• f3n+1 is odd,
• f3n+2 is odd,
• f4n is a multiple of 3.
i.e.,
n · f1 + (n − 1) · f2 + (n − 2) · f3 + · · · + 2 · fn−1 + 1 · fn = fn+4 − n − 3.
R B W Y G
• There are red (R) and blue (B) bricks, both of which are 1 × 1 cells.
• There are white (W ), yellow (Y ), and green (G) bricks, all of which
are 1 × 2 cells.
In a tiling, a color can be used more than once and some colors may not be
used at all. The figure below shows a tiling of B9 , in which each color is used
and the color red is used twice.
B W R G R Y
• Determine T1 and T2 .
Tn = 2 · Tn−1 + 3 · Tn−2 .
T_n = \frac{3^{n+1} + (−1)^n}{4}.
a0 = 0,
a1 = 1,
an = 2 · an−1 + an−2 if n ≥ 2.
• Determine an for n = 0, 1, 2, 3, 4, 5.
• Prove that

a_n = \frac{(1+\sqrt{2})^n − (1−\sqrt{2})^n}{2\sqrt{2}}          (4.19)
for all integers n ≥ 0.
Hint: What are the solutions of the equation x2 = 2x + 1?
R B G
You have an unlimited supply of bricks, which are of the following types
(see the bottom part of the figure above):
• There are red (R) and blue (B) bricks, both of which are 1 × 1 cells.
We refer to these bricks as squares.
• There are green (G) bricks, which are 1 × 2 cells. We refer to these as
dominoes.
In a tiling, a color can be used more than once and some colors may not be
used at all. The figure below shows an example of a tiling of B9 .
G B B R B G R
• Determine T1 , T2 , and T3 .
T_n = \sum_{k=0}^{\lfloor n/2 \rfloor} \binom{n−k}{k} · 2^{n−2k}.
• Determine E1 , O1 , E2 , and O2 .
E_n + O_n = 3^n.
En = 2 · En−1 + On−1 .
E_n = \frac{1 + 3^n}{2}.
En+1 = 2 · En + 4n .
4.32 Let An be the number of bitstrings of length n that contain 000. Prove
that for n ≥ 4,
A_n = A_{n−1} + A_{n−2} + A_{n−3} + 2^{n−3}.
• Determine A1 , A2 , A3 , and A4 .
Hint: Divide the strings into groups depending on the number of leading 1s.
• Determine A1 , A2 , A3 , and A4 .
An = An−1 + (n − 1) · An−2 .
• Determine F1 , F2 , and F3 .
• Determine S1 and S2 .
• Let n ≥ 1 be an integer. Express Sn in terms of An , Bn , and Cn .
• Let n ≥ 2 be an integer. Express Cn in terms of Sn−1 .
• Let n ≥ 2 be an integer. Prove that
Sn = (Sn−1 − Bn−1 ) + (Sn−1 − An−1 ) + Sn−1 .
3 = 1 + 1 + 1 = 1 + 2 = 2 + 1.
• Determine S1 , S2 , and S4 .
• Determine the value of Sn , i.e., express Sn in terms of numbers that we
have seen in this chapter.
4.39 Ever since he was a child, Nick has been dreaming to be like Spiderman.
As you all know, Spiderman can climb up the outside of a building; if he is
at a particular floor, then, in one step, he can move up several floors. Nick
is not that advanced yet. In one step, Nick can move up either one floor or
two floors.
Let n ≥ 1 be an integer and consider a building with n floors, numbered
1, 2, . . . , n. (The first floor has number 1; this is not the ground floor.) Nick
is standing in front of this building, at the ground level. There are different
ways in which Nick can climb to the n-th floor. For example, here are three
different ways for the case when n = 5:
1. move up 2 floors, move up 1 floor, move up 2 floors.
2. move up 1 floor, move up 2 floors, move up 2 floors.
3. move up 1 floor, move up 2 floors, move up 1 floor, move up 1 floor.
Let Sn be the number of different ways, in which Nick can climb to the
n-th floor.
• Determine S1, S2, S3, and S4.
4.40 Let n ≥ 1 be an integer and consider the set Sn = {1, 2, . . . , n}. A non-
neighbor subset of Sn is any subset T of Sn having the following property: If
k is any element of T , then k + 1 is not an element of T . (Observe that the
empty set is a non-neighbor subset of Sn .)
For example, if n = 3, then {1, 3} is a non-neighbor subset, whereas {2, 3}
is not a non-neighbor subset.
Let Nn denote the number of non-neighbor subsets of the set Sn .
• Determine N1 , N2 , and N3 .
• Determine P1 , P2 , and P3 .
B3 = 0 + 1 + 1 + 1 + 1 + 2 + 1 + 1 = 8.
0 0 0 0
0 0 1 1
0 1 0 1
1 0 0 1
0 1 1 1
1 0 1 2
1 1 0 1
1 1 1 1
• Determine B1 and B2 .
• Let n ≥ 3 be an integer.
1 1 · · · 1   (n ones).
1 1 · · · 1 0   (n − 1 ones, followed by a 0).
Hint: Write (4.20) on one line. Below this line, write (4.20) with n
replaced by n − 1.
f (f (f (x))) = x.
4.45 Let S be the set of ordered pairs of integers that is recursively defined
in the following way:
• (0, 0) ∈ S.
• If (a, b) ∈ S then (a + 2, b + 3) ∈ S.
• If (a, b) ∈ S then (a + 3, b + 2) ∈ S.
4.46 Let S be the set of integers that is recursively defined in the following
way:
• 4 is an element of S.
4.47 Let S be the set of ordered triples of integers that is recursively defined
in the following way:
a^2 − b^2 = c.
4.48 Let S be the set of integers that is recursively defined in the following
way:
• 1 is an element of S.
• If x is an element of S, then x + 2\sqrt{x} + 1 is also an element of S.
Give a simple description of the set S and prove that your answer is correct.
• If the string s is an element of the set S, then the string 0s (i.e., the
string obtained by adding the bit 0 at the front of s) is also an element
of the set S.
• If the string s is an element of the set S, then the string 10s (i.e.,
the string obtained by adding the bits 10 at the front of s) is also an
element of the set S.
Let s be an arbitrary string in the set S. Prove that s does not contain the
substring 11.
(Figure: the recursive definition of a binary tree — it is either a single node, or a root whose two subtrees are binary trees.)
Prove that any binary tree with n leaves has exactly 2n − 1 nodes.
4.51 In this exercise, we will denote Boolean variables by lowercase letters,
such as p and q. A proposition is any Boolean formula that can be obtained
by applying the following recursive rules:
1. For every Boolean variable p, p is a proposition.
2. If f is a proposition, then ¬f is also a proposition.
3. If f and g are propositions, then (f ∨ g) is also a proposition.
4. If f and g are propositions, then (f ∧ g) is also a proposition.
• Let p and q be Boolean variables. Prove that
¬ ((p ∧ ¬q) ∨ (¬p ∨ q))
is a proposition.
• Let ↑ denote the not-and operator. In other words, if f and g are
Boolean formulas, then (f ↑ g) is the Boolean formula that has the
following truth table (0 stands for false, and 1 stands for true):
f g (f ↑ g)
0 0 1
0 1 1
1 0 1
1 1 0
4.52 In Section 4.4, we have seen the recursive algorithm gossip(n), which
computes a sequence of phone calls for the persons P1 , P2 , . . . , Pn . The base
case for this algorithm was when n = 4. Assume we change the base case to
n = 2: In this new base case, there are only two people P1 and P2 , and only
one phone call is needed. The rest of the algorithm remains unchanged.
Prove that the modified algorithm gossip(n) results in a sequence of
2n − 3 phone calls for any integer n ≥ 2. (Thus, for n ≥ 4, it makes one
more phone call than the algorithm in Section 4.4.)
4.53 In Section 4.4, we have seen the recursive algorithm gossip(n), which
computes a sequence of phone calls for the persons P1 , P2 , . . . , Pn , for any
integer n ≥ 4.
Give an iterative (i.e., non-recursive) version of this algorithm in pseu-
docode. Your algorithm must produce exactly the same sequence of phone
calls as algorithm gossip(n).
4.54 In Section 4.5, we have seen algorithm Euclid(a, b), which takes as
input two integers a and b with a ≥ b ≥ 1, and returns their greatest common
divisor.
Assume we run algorithm Euclid(a, b) with two input integers a and b
that satisfy b > a ≥ 1. What is the output of this algorithm?
Algorithm Fib(n):
if n = 0 or n = 1
then f = n
else f = Fib(n − 1) + Fib(n − 2)
endif;
return f
Algorithm Beer(n):
if n = 1
then eat some peanuts
else choose an arbitrary integer m with 1 ≤ m ≤ n − 1;
Beer(m);
drink one pint of beer;
Beer(n − m)
endif
• Let B(n) be the number of pints of beer you drink when running algo-
rithm Beer(n). Determine the value of B(n).
4.57 Consider the following recursive algorithm Silly, which takes as input
an integer n ≥ 1 which is a power of 2:
Algorithm Silly(n):
if n = 1
then drink one pint of beer
else if n = 2
then fart once
else fart once;
Silly(n/2);
fart once
endif
endif
For n a power of 2, let F (n) be the number of times you fart when running
algorithm Silly(n). Determine the value of F (n).
4.58 In the fall term of 2015, Nick took the course COMP 2804 at Carleton
University. Nick was always sitting in the back of the classroom and spent
most of his time eating bananas. Nick uses the following scheme to buy
bananas:
For any integer n ≥ 0, let B(n) be the number of bananas in Nick’s fridge at
the start of week n. Determine the value of B(n).
4.59 Jennifer loves to drink India Pale Ale (IPA). After a week of hard work,
Jennifer goes to the pub and runs the following recursive algorithm, which
takes as input an integer n ≥ 1, which is a power of 4:
Algorithm JenniferDrinksIPA(n):
if n = 1
then place one order of chicken wings
else for k = 1 to 4
do JenniferDrinksIPA(n/4);
drink n pints of IPA
endfor
endif
• P (n) be the number of pints of IPA that Jennifer drinks when running
algorithm JenniferDrinksIPA(n),
4.60 Elisa Kazan loves to drink cider. During the weekend, Elisa goes to
the pub and runs the following recursive algorithm, which takes as input an
integer n ≥ 0:
Algorithm ElisaDrinksCider(n):
if n = 0
then order Fibonachos
else if n is even
then ElisaDrinksCider(n/2);
drink n^2/2 pints of cider;
ElisaDrinksCider(n/2)
else for i = 1 to 4
do ElisaDrinksCider((n − 1)/2);
drink (n − 1)/2 pints of cider
endfor;
drink 1 pint of cider
endif
endif
For n ≥ 0, let C(n) be the number of pints of cider that Elisa drinks when
running algorithm ElisaDrinksCider(n). Determine the value of C(n).
4.61 Elisa Kazan loves to drink cider. After a week of bossing the Vice-
Presidents around, Elisa goes to the pub and runs the following recursive
algorithm, which takes as input an integer n ≥ 0:
Algorithm ElisaGoesToThePub(n):
if n = 0
then drink one bottle of cider
else for k = 0 to n − 1
do ElisaGoesToThePub(k);
drink one bottle of cider
endfor
endif
For n ≥ 0, let C(n) be the number of bottles of cider that Elisa drinks
when running algorithm ElisaGoesToThePub(n).
Prove that for every integer n ≥ 1,
C(n) = 3 · 2^{n−1} − 1.
4.62 Elisa Kazan loves to drink cider. On Saturday night, Elisa goes to her
neighborhood pub and runs the following recursive algorithm, which takes
as input an integer n ≥ 1:
Algorithm ElisaDrinksCider(n):
if n = 1
then drink one pint of cider
else if n is even
then ElisaDrinksCider(n/2);
drink one pint of cider;
ElisaDrinksCider(n/2)
else drink one pint of cider;
ElisaDrinksCider(n − 1);
drink one pint of cider
endif
endif
For any integer n ≥ 1, let P (n) be the number of pints of cider that
Elisa drinks when running algorithm ElisaDrinksCider(n). Determine
the value of P (n).
min = s1 ;
max = s1 ;
for i = 2 to n
do if si < min (1)
then min = si
endif;
if si > max (2)
then max = si
endif
endfor;
return (min, max )
if n = 2
then let x and y be the two elements in S;
if x < y (1)
then min = x;
max = y
else min = y;
max = x
endif
else divide S into two subsequences S1 and S2 , both of size n/2;
(min 1 , max 1 ) = FastMinMax(S1 , n/2);
(min 2 , max 2 ) = FastMinMax(S2 , n/2);
if min 1 < min 2 (2)
then min = min 1
else min = min 2
endif;
if max 1 < max 2 (3)
then max = max 2
else max = max 1
endif
endif;
return (min, max )
Algorithm Mystery(a1 , a2 , . . . , an ):
if n = 1
then return the sequence (a1 )
else (b1 , b2 , . . . , bn−1 ) = Mystery(a1 , a2 , . . . , an−1 );
return the sequence (an , b1 , b2 , . . . , bn−1 )
endif
4.66 Consider the following recursive algorithm, which takes as input a se-
quence (a1, a2, . . . , an) of n numbers, where n is a power of two, i.e., n = 2^k
for some integer k ≥ 0:
Algorithm Mystery(a1 , a2 , . . . , an ):
if n = 1
then return a1
else for i = 1 to n/2
do bi = min(a2i−1 , a2i ) (∗)
endfor;
Mystery(b1 , b2 , . . . , bn/2 )
endif
T (n) = n − 1.
4.67 Let k be a positive integer and let n = 2^k. You are given an n × n board
Bn , all of whose (square) cells are white, except for one, which is black. (The
left part of the figure below gives an example where k = 3 and n = 8.)
• the trominoes cover exactly all white cells (thus, the black cell is not
covered by any tromino) and
• takes as input a board Bn having exactly one black cell (which can be
anywhere on the board) and
The figure below shows an example, in which the •-points are maximal and
the ×-points are not maximal. Observe that, in general, there is more than
one maximal element in S.
(Figure: a set of points in the plane; the maximal points are drawn as • and the non-maximal points as ×.)
(Displayed: the recursive definition of the matrices H_k; the entries shown are +1 and −1.)
Observe that H_k has 2^k rows and 2^k columns.
If x is a column vector of length 2^k, then H_k x is the column vector of
length 2^k obtained by multiplying the matrix H_k with the vector x.
Describe a recursive algorithm Mult that has the following specification:
Algorithm Mult(k, x):
Input: An integer k ≥ 0 and a column vector x of length n = 2^k.
Output: The column vector Hk x (having length n).
The running time T (n) of your algorithm must be O(n log n).
Hint: The input only consists of k and x. The matrix H_k, which has n^2 entries, is not given as part of the input. Since you are aiming for an O(n log n)–
time algorithm, you cannot compute all entries of the matrix Hk .
4.70 Let m ≥ 1 and n ≥ 1 be integers and consider an m × n matrix A. The
rows of this matrix are numbered 1, 2, . . . , m, and its columns are numbered
1, 2, . . . , n. Each entry of A stores one number and, for each row, all numbers
in this row are pairwise distinct. For each i = 1, 2, . . . , m, define
g(i) = the position (i.e., column number) of the smallest number in row i.
We say that the matrix A is awesome, if
g(1) ≤ g(2) ≤ g(3) ≤ . . . ≤ g(m).
In the matrix below, the smallest number in each row is in boldface. For this
example, we have m = 4, n = 10, g(1) = 3, g(2) = 3, g(3) = 5, and g(4) = 8.
Thus, this matrix is awesome.
A =
13 12  5  8  6  9 15 20 19  7
 3  4  1 17  6 13  7 10  2  5
19  5 12  7  2  4 11 13  6  3
 7  4 17 10  5 14 12  3 20  6
In the rest of this exercise, you will show that all values g(1), g(2), . . . , g(m)
can be computed in O(m + n log m) total time.
• Assume that m is even and assume that you are given the values
For each i with 0 ≤ i ≤ k, let T (i) denote the running time of algorithm
FindRowMinima(A, i). The running time of your algorithm must
satisfy the recurrence
T (0) = O(n),
T (i) = T (i − 1) + O(2^i + n), if 1 ≤ i ≤ k.
• Assume again that m = 2^k. Prove that all values g(1), g(2), . . . , g(m)
can be computed in O(m + n log m) total time.
Hint: 1 + 2 + 2^2 + 2^3 + · · · + 2^k ≤ 2m.
1^2 + 2^2 + 3^2 + · · · + n^2

1^2 + 2^2 + 3^2 + · · · + n^2 = An^3 + Bn^2 + Cn + D,
but you have forgotten the values of A, B, C, and D. How can you determine
these four values?
which follows from Theorem 2.2.10. Give an alternative proof that uses the
approach that we used to prove the identity in (4.17).
Use induction and Pascal’s Identity (see Theorem 3.7.2) to give an alternative
proof.
4.76 Consider the numbers R_n that we defined in Section 4.8. The n points
on the circle define \binom{n}{2} line segments, one segment for each pair of points.
Let X be the total number of intersections among these \binom{n}{2} line segments.
• Prove that

R_n = 1 + \binom{n}{2} + X.

Hint: Start with only the circle and the n points. Then add the \binom{n}{2}
line segments one by one.
• Prove that

X = \binom{n}{4}.
4.77 For an integer n ≥ 1, draw n straight lines, such that no two of them
are parallel and no three of them intersect in one single point. These lines
divide the plane into regions (some of which are bounded and some of which
are unbounded). Denote the number of these regions by Cn .
Derive a recurrence relation for the numbers Cn and use it to prove that
for n ≥ 1,

C_n = 1 + \frac{n(n+1)}{2}.
These lines divide the plane into regions (some of which are bounded and
some of which are unbounded). Denote the number of these regions by Rn .
From the figure below, you can see that R1 = 3, R2 = 9, and R3 = 19.
(Figure: the line configurations for n = 1, 2, 3, consisting of the pairs ℓ_i, ℓ′_i; they determine R_1 = 3, R_2 = 9, and R_3 = 19 regions.)
These lines divide the plane into regions (some of which are bounded and
some of which are unbounded). Denote the number of these regions by Rm,n .
From the figure below, you can see that R4,3 = 23.
• Derive a recurrence relation for the numbers Rm,n and use it to prove
that
R_{m,n} = 1 + m(n + 1) + \binom{n+1}{2}.
These lines divide the plane into regions (some of which are bounded and
some of which are unbounded). Denote the number of these regions by Rk,m,n .
From the figure below, you can see that R4,2,2 = 30.
• Prove that
Rk,m,0 = (k + 1)(m + 1).
• Derive a recurrence relation for the numbers Rk,m,n and use it to prove
that
R_{k,m,n} = (k + 1)(m + 1) + (k + m)n + \binom{n+1}{2}.
• For any integer n ≥ 1, let An be the total area of all triangles that are
added when constructing SF n from SF n−1 . Prove that
A_n = \frac{3}{4} · \left(\frac{4}{9}\right)^n · a_0.
Discrete Probability
In other words, when Pi (with i 6= k) receives the message, she only knows
that it was broadcast by one of P1 , . . . , Pi−1 , Pi+1 , . . . , Pn ; she cannot deter-
mine who broadcast the message.
At first sight, it seems to be impossible to do this. In 1988, however,
David Chaum published, in the Journal of Cryptology, a surprisingly simple
protocol that does achieve this. Chaum referred to the problem as the Dining
Cryptographers Problem.
We will present and analyze the protocol for the case when n = 3. Thus,
there are three people P1 , P2 , and P3 . We assume that exactly one of them
broadcasts a message and refer to this person as the broadcaster. We also
assume that the message is a bitstring. The broadcaster will announce the
message one bit at a time.
The three people P1 , P2 , and P3 sit at a table, in clockwise order of their
indices. Let b be the current bit that the broadcaster wants to announce.
The protocol for broadcasting this bit is as follows:
Step 1: Each person Pi generates a random bit bi , for example, by flipping
a coin. Thus, with 50% probability, bi = 0 and with 50% probability, bi = 1.
Step 2: Each person Pi shows the bit bi to her clockwise neighbor.
(Figure: P1, P2, and P3 seated at a table, each holding her random bit b1, b2, b3.)

Step 3: Each person Pi computes the sum, modulo 2, of her own bit and the bit that was shown to her; denote this value by si. Thus, s1 = (b1 + b3) mod 2, s2 = (b1 + b2) mod 2, and s3 = (b2 + b3) mod 2.
Step 4: The broadcaster replaces her value by ti = (si + b) mod 2, where b is the bit to be announced; each of the other two people sets ti = si.
Step 5: Each person Pi shows her bit ti to the other two people.
Step 6: Each person Pi computes the sum (modulo 2) of the three bits t1 ,
t2 , and t3 , i.e., the value (t1 + t2 + t3 ) mod 2.
This concludes the description of the protocol for broadcasting one bit b.
Observe that for any bit x, we have (x + x) mod 2 = 0. Therefore, the bit
computed in the last step is equal to
t1 + t2 + t3 = s1 + s2 + s3 + b
= (b1 + b3 ) + (b1 + b2 ) + (b2 + b3 ) + b
= (b1 + b1 ) + (b2 + b2 ) + (b3 + b3 ) + b
= b,
where all arithmetic is done modulo 2. In other words, the bit computed in
the last step is equal to the bit that the broadcaster wants to announce. This
shows that each person in the group receives this bit.
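The following small Python simulation of the three-person protocol is a sketch based on the description above (the function name and the orientation in which bits are shown to neighbours are our own choices); it confirms that the announced bit always equals the broadcast bit.

import random

def broadcast_bit(broadcaster, b):
    # Step 1: each Pi generates a random bit bi
    bits = [random.randint(0, 1) for _ in range(3)]
    # Steps 2 and 3: Pi sees one neighbour's bit and computes si
    s = [(bits[i] + bits[(i - 1) % 3]) % 2 for i in range(3)]
    # Step 4: only the broadcaster adds the message bit b
    t = [(s[i] + b) % 2 if i == broadcaster else s[i] for i in range(3)]
    # Steps 5 and 6: everybody announces ti and takes the sum modulo 2
    return sum(t) % 2

assert all(broadcast_bit(p, b) == b
           for _ in range(1000) for p in range(3) for b in (0, 1))
print("every announced bit equals the broadcast bit")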
It remains to show that a non-broadcaster cannot determine who broadcast the bit. In the analysis below, we assume that P2 is not the broadcaster; thus, the broadcaster is P1 or P3. Person P2 knows the bits

b1, b2, t1, t2, t3,

but does not know the bit b3. We consider the cases when b1 = b2 and b1 ≠ b2
separately.
Case 1: b1 = b2 . This case has two subcases depending on the value of b3 .
Case 1.1: b3 = b1 ; thus, all three b-bits are equal.
t1 = s1 + 1 = b1 + b3 + 1 = 1
and
t3 = s3 = b2 + b3 = 0.
t1 = s1 = b1 + b3 = 0
and
t3 = s3 + 1 = b2 + b3 + 1 = 1.
t1 = s1 + 1 = b1 + b3 + 1 = 0
and
t3 = s3 = b2 + b3 = 1.
t1 = s1 = b1 + b3 = 1
and
t3 = s3 + 1 = b2 + b3 + 1 = 0.
t1 = s1 + 1 = b1 + b3 + 1 = 1
and
t3 = s3 = b2 + b3 = 1.
t1 = s1 = b1 + b3 = 0
and
t3 = s3 + 1 = b2 + b3 + 1 = 0.
t1 = s1 + 1 = b1 + b3 + 1 = 0
and
t3 = s3 = b2 + b3 = 0.
t1 = s1 = b1 + b3 = 1
and
t3 = s3 + 1 = b2 + b3 + 1 = 1.
For any outcome ω in the sample space S, we will refer to Pr(ω) as the
probability that the outcome is equal to ω.
It turns out to be useful to extend this function so that it maps any event to a real
number in [0, 1]. If A is an event (i.e., A ⊆ S), then we define
Pr(A) = \sum_{ω∈A} Pr(ω).          (5.1)
where the last equality follows from the second condition in Definition 5.2.2.
5.2.1 Examples
Flipping a coin: Assume we flip a coin. Since there are two possible
outcomes (the coin comes up either heads (H) or tails (T )), the sample
space is the set S = {H, T }. If the coin is fair, i.e., the probabilities of H
and T are equal, then the probability function Pr : S → R is given by
Pr(H) = 1/2,
Pr(T ) = 1/2.
Observe that this function Pr satisfies the two conditions in Definition 5.2.2.
Since this sample space has two elements, there are four events, one event
for each subset. These events are
∅, {H}, {T }, {H, T },
Pr(∅) = 0,
Pr({H}) = Pr(H) = 1/2,
Pr({T }) = Pr(T ) = 1/2,
Pr({H, T }) = Pr(H) + Pr(T ) = 1/2 + 1/2 = 1.
Flipping a coin twice: If we flip a fair coin twice, then there are four
possible outcomes, and the sample space becomes S = {HH, HT, T H, T T }.
For example, HT indicates that the first flip resulted in heads, whereas the second flip resulted in tails. Since the coin is fair, each of these four outcomes has probability 1/4.
Observe again that this function Pr satisfies the two conditions in Defini-
tion 5.2.2. Since the sample space consists of 4 elements, the number of
events is equal to 2^4 = 16. For example, A = {HT, TH} is an event and it
follows from (5.1) that

Pr(A) = Pr(HT) + Pr(TH) = 1/4 + 1/4 = 1/2.
In words, when flipping a fair coin twice, the probability that we see one
heads and one tails (without specifying the order) is equal to 1/2.
Rolling a die twice: If we roll a fair die, then there are six possible
outcomes (1, 2, 3, 4, 5, and 6), each one occurring with probability 1/6. If
we roll this die twice, we obtain the sample space
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6},
where i is the result of the first roll and j is the result of the second roll. Note
that |S| = 6 × 6 = 36. Since the die is fair, each outcome has the same prob-
ability. Therefore, in order to satisfy the two conditions in Definition 5.2.2,
we must have
Pr(i, j) = 1/36
for each outcome (i, j) in S.
If we are interested in the sum of the results of the two rolls, then we
define the event
Ak = {(i, j) ∈ S : i + j = k}.
Consider, for example, the case when k = 4. There are three possible out-
comes of two rolls that result in a sum of 4. These outcomes are (1, 3), (2, 2),
and (3, 1). Thus, the event A4 is equal to

A4 = {(1, 3), (2, 2), (3, 1)}.          (5.2)
In the matrix below, the leftmost column indicates the result of the first
roll, the top row indicates the result of the second roll, and each entry is the
sum of the results of the two corresponding rolls.
1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
For example, the number 4 occurs three times in the matrix and, therefore,
the event A4 has size three. Observe that we have already seen this in (5.2).
It follows that
Pr (A4 ) = |A4 |/36 = 3/36 = 1/12.
In a similar way, we see that
Pr (A2 ) = 1/36,
Pr (A3 ) = 2/36 = 1/18,
Pr (A4 ) = 3/36 = 1/12,
Pr (A5 ) = 4/36 = 1/9,
Pr (A6 ) = 5/36,
Pr (A7 ) = 6/36 = 1/6,
Pr (A8 ) = 5/36,
Pr (A9 ) = 4/36 = 1/9,
Pr (A10 ) = 3/36 = 1/12,
Pr (A11 ) = 2/36 = 1/18,
Pr (A12 ) = 1/36.
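These values can be checked by brute force. The following Python snippet, a simple enumeration of the 36 equally likely outcomes, reproduces the probabilities above.

from fractions import Fraction
from collections import Counter

# all 36 equally likely outcomes (i, j) of rolling a fair die twice
counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
for k in range(2, 13):
    print(k, Fraction(counts[k], 36))    # e.g. Pr(A4) = 1/12, Pr(A7) = 1/6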
Since there are zero terms in this summation, its value is equal to zero.
To give an example, assume we roll a fair die twice. What is the proba-
bility that the sum of the two results is even? If you look at the matrix in
Section 5.2, then you see that there are 18 entries, out of 36, that are even.
Therefore, the probability of having an even sum is equal to 18/36 = 1/2.
Below we will give a different way to determine this probability.
The sample space is the set
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6},
where i is the result of the first roll and j is the result of the second roll. Each
element of S has the same probability 1/36 of being an outcome of rolling
the die twice.
The event we are interested in is
A = {(i, j) ∈ S : i + j is even}.
Observe that i + j is even if and only if both i and j are even or both i and
j are odd. Therefore, we split the event A into two disjoint events

A1 = {(i, j) ∈ S : both i and j are even}

and

A2 = {(i, j) ∈ S : both i and j are odd}.
The set A1 has 3 · 3 = 9 elements, because there are 3 choices for i and 3
choices for j. Similarly, the set A2 has 9 elements. It follows that

Pr(A) = Pr(A1) + Pr(A2) = 9/36 + 9/36 = 1/2.
In the next lemma, we relate the probability that an event occurs to the
probability that the event does not occur. If A is an event, then A denotes
its complement, i.e., A = S \ A. Intuitively, the sum of Pr(A) and Pr A
must be equal to one, because the event A either occurs or does not occur.
The following lemma states that this is indeed the case. Observe that this is
similar to the Complement Rule of Section 3.3.
Proof. Since A and A are disjoint and S = A∪A, it follows from Lemma 5.3.2
that
Pr(S) = Pr A ∪ A = Pr(A) + Pr A .
We have seen in Section 5.2 that Pr(S) = 1.
Consider again the sample space that we saw after Lemma 5.3.2. We
showed that, when rolling a fair die twice, we get an even sum with probabil-
ity 1/2. It follows from Lemma 5.3.3 that we get an odd sum with probability
1 − 1/2 = 1/2.
The next lemma is similar to the Principle of Inclusion and Exclusion
that we have seen in Section 3.5.
Pr(A) = 500/1000.
Since there are \lfloor 1000/3 \rfloor = 333 elements in S that are divisible by 3, we have
Pr(B) = 333/1000.
A ∩ B = {i ∈ S : i is divisible by 6}.
Since there are \lfloor 1000/6 \rfloor = 166 elements in S that are divisible by 6, we have
Pr(A ∩ B) = 166/1000.
We conclude that

Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B) = 500/1000 + 333/1000 − 166/1000 = 667/1000.
it follows that
Pr(A1 ∪ A2 ∪ · · · ∪ An) = Pr(B ∪ An)
    ≤ Pr(B) + Pr(An)
    ≤ \sum_{i=1}^{n−1} Pr(Ai) + Pr(An)
    = \sum_{i=1}^{n} Pr(Ai).
Pr(A) ≤ Pr(B).
Proof. Using (5.1) and the fact that Pr(ω) ≥ 0 for each ω in S, we have
Pr(A) = \sum_{ω∈A} Pr(ω) ≤ \sum_{ω∈B} Pr(ω) = Pr(B).
In particular, both {1, 2, 3, 4, 5, 6} and {2, 5, 16, 36, 41, 43} have the same
probability of being the winning numbers. (Still, the latter subset was drawn
by OLG on February 8, 2014.)
The lemma below states that in a uniform probability space (S, Pr), the
probability of an event A is the ratio of the size of A and the size of S.
Pr(A) = \frac{|A|}{|S|}.
Thus, to determine Pr(A), it remains to determine the size of the set A, i.e.,
the total number of full houses. For this, we will use the Product Rule of
Section 3.1:
• First task: Choose the rank of the three cards in the full house. There
are 13 ways to do this.
• Second task: Choose the suits of these three cards. There are \binom{4}{3} ways to do this.
• Third task: Choose the rank of the other two cards in the full house.
There are 12 ways to do this.
• Fourth task: Choose the suits of these two cards. There are \binom{4}{2} ways to do this.
It follows from the Product Rule that |A| = 13 · \binom{4}{3} · 12 · \binom{4}{2} = 3,744 and, thus,

Pr(A) = \frac{|A|}{|S|} = \frac{3{,}744}{2{,}598{,}960} ≈ 0.00144.
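The following Python lines, using math.comb for the binomial coefficients, carry out exactly the counting above.

from math import comb

full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)    # the four tasks above
all_hands = comb(52, 5)
print(full_houses, all_hands, full_houses / all_hands)   # 3744 2598960 0.00144...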
Below, we will show that p2 = 1/365. If n ≥ 366, then it follows from the
Pigeonhole Principle (see Section 3.10) that there must be at least two people
with the same birthday and, therefore, pn = 1. Intuitively, if n increases from
2 to 365, the value of pn increases as well. What is the value of n such that
pn is larger than 1/2 for the first time? That is, what is the value of n for
which pn−1 ≤ 1/2 < pn ? In this section, we will see that this question can be
answered using simple counting techniques that we have seen in Chapter 3.
We denote the people by P1 , P2 , . . . , Pn , we denote the number of days in
one year by d, and we number the days in one year as 1, 2, . . . , d. The sample
space is the set
|S_n| = d^n.
pn = Pr (An ) .
p_2 = Pr(A_2) = \frac{|A_2|}{|S_2|}.

Since A_2 = {(i, i) : 1 ≤ i ≤ d}, we have |A_2| = d and, therefore,

p_2 = \frac{|A_2|}{|S_2|} = \frac{d}{d^2} = \frac{1}{d}.
|\overline{A_n}| = \frac{d!}{(d − n)!}.

p_n = Pr(A_n)
    = 1 − Pr(\overline{A_n})
    = 1 − \frac{|\overline{A_n}|}{|S_n|}
    = 1 − \frac{d!}{(d − n)! \, d^n}.
By taking d = 365, we get p22 = 0.476 and p23 = 0.507. Thus, in a random
group of 23 people1 , the probability that at least two of them have the same
birthday is more than 50%. Most people are very surprised when they see
this for the first time, because our intuition says that a much larger group is
needed to have a probability of more than 50%. The values pn approach 1
pretty fast. For example, p40 = 0.891 and p100 = 0.9999997.
1
two soccer teams plus the referee
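A short Python computation reproduces these numbers; the function below evaluates p_n directly from the product form of d!/((d − n)! d^n).

def p(n, d=365):
    # probability that at least two of n people share a birthday
    q = 1.0                        # probability that all n birthdays are distinct
    for i in range(n):
        q *= (d - i) / d
    return 1 - q

print(round(p(22), 3), round(p(23), 3))               # 0.476 0.507
print(next(n for n in range(2, 367) if p(n) > 0.5))   # 23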
and therefore,

p_n = 1 − q_n ≥ 1 − e^{−n(n−1)/(2d)}.

If n is large, then n(n − 1)/(2d) is very close to n^2/(2d) and, thus,

p_n ≳ 1 − e^{−n^2/(2d)}.

If we take n = \sqrt{2d}, then we get p_n ≳ 1 − e^{−1} ≈ 0.63.
(Figure: two closed boxes; one contains x dollars and the other contains y dollars.)
We will refer to the box containing x dollars as the small box and to the
box containing y dollars as the big box. Our goal is to find the big box. We
are allowed to do the following:
1. We can choose one of the two boxes, open it, and determine how much
money is inside it.
2. Now we have to make our final decision: Either we keep the box we
just opened or we take the other box.
For example, assume that the box we pick in the first step contains $33.
Then we know that the other box contains either less than $33 or more
than $33. It seems that the only reasonable thing to do is to flip a fair coin
when making our final decision. If we do that, then we find the big box with
probability 0.5.
In the rest of this section, we will show the surprising result that we can
find the big box with probability at least 0.505.
The idea is as follows. Assume that we know a number z such that
x < z < y. (Keep in mind that we do not know x and we do not know y.
Thus, we assume that we know a number z that is between the two unknown
numbers x and y.)
• If the box we choose in the first step contains more than z dollars, then
we know that this is the big box and, therefore, we keep it.
• If the box we choose in the first step contains less than z dollars, then
we know that this is the small box and, therefore, we take the other
box.
Thus, if we know this number z with x < z < y, then we are guaranteed to
find the big box.
Of course, it is not realistic to assume that we know this magic number z.
The trick is to choose a random z and hope that it is between x and y. If z
is between x and y, then we find the big box with probability 1; otherwise,
we find the big box with probability 1/2. As we will see later, the overall
probability of finding the big box will be at least 0.505.
In order to avoid the case when z = x or z = y, we will choose z from the set

B = {1/2, 3/2, 5/2, . . . , 100 − 1/2}.

Note that |B| = 100. Our algorithm that attempts to find the big box does
the following:
Algorithm FindBigBox:
Step 1: Choose one of the two boxes uniformly at random, open it,
and determine the amount of money inside it; let this amount be a.
Step 2: Choose z uniformly at random from the set B.
Step 3: Do the following: if a > z, then keep the box that was opened in Step 1; if a < z, then take the other box.

The sample space S consists of all pairs (a, z), where a ∈ {x, y} and z ∈ B; each of these 200 pairs is equally likely. Let W be the event that the algorithm is successful, i.e., ends up with the big box. The first case to consider is when a = x, i.e., the box chosen in Step 1 is the small box. There are two possibilities for z:
• If x = a > z, then the algorithm keeps the small box and, thus, is not
successful.
• If x = a < z, then the algorithm takes the other box (which is the big
box) and, thus, is successful.
Thus, the event W contains the set
Wx = {(x, z) : z ∈ {x + 1/2, x + 3/2, . . . , 100 − 1/2}}.
You can verify that
|Wx | = 100 − x.
The second case to consider is when a = y. In this case, the box we
choose in Step 1 is the big box. Again, there are two possibilities for z:
• If y = a > z, then the algorithm keeps the big box and, thus, is
successful.
• If y = a < z, then the algorithm takes the other box (which is the small
box) and, thus, is not successful.
Thus, the event W contains the set
Wy = {(y, z) : z ∈ {1/2, 3/2, . . . , y − 1/2}}.
You can verify that
|Wy | = y.
Since W = Wx ∪ Wy and the events Wx and Wy are disjoint, we have, by
Lemma 5.3.2,
Pr(W ) = Pr (Wx ∪ Wy )
= Pr (Wx ) + Pr (Wy ) .
Since the element (a, z) is chosen uniformly at random from the sample
space S, we can use Lemma 5.4.2 to determine the probability that algorithm
FindBigBox is successful:
Pr(W) = Pr(W_x) + Pr(W_y)
      = \frac{|W_x|}{|S|} + \frac{|W_y|}{|S|}
      = \frac{100 − x}{200} + \frac{y}{200}
      = \frac{1}{2} + \frac{y − x}{200}.
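The following Python simulation of algorithm FindBigBox is a sketch under the assumptions above (x and y are integers between 1 and 100; the concrete amounts 33 and 40 are arbitrary examples); its estimate matches 1/2 + (y − x)/200.

import random

def find_big_box(x, y, trials=200_000):
    # x < y, both integers between 1 and 100
    wins = 0
    for _ in range(trials):
        a = random.choice((x, y))              # Step 1: open one of the two boxes
        z = random.randrange(100) + 0.5        # Step 2: z is uniform in B
        keep = a > z                           # Step 3
        if (keep and a == y) or (not keep and a == x):
            wins += 1                          # we ended up with the big box
    return wins / trials

print(find_big_box(33, 40))     # about 1/2 + (40 - 33)/200 = 0.535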
Note that the host can always open a door that has a goat behind it.
After the host has opened No. 3, we know that the car is either behind No. 1
or No. 2, and it seems that both these doors have the same probability (i.e.,
50%) of having the car behind them. We will prove below, however, that this
is not true: It is indeed to our advantage to switch our choice.
We assume that the car is equally likely to be behind any of the three
doors. Moreover, the host knows what is behind each door.
• The host opens one of the other two doors that has a goat behind it.
• Our final choice is to switch to the other door that is still closed.
Let A be the event that we win the car and let B be the event that the initial
door has a goat behind it. Then it is not difficult to see that event A occurs
if and only if event B occurs. Therefore, the probability that we win the car
is equal to
Pr(A) = Pr(B) = 2/3.
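A short Monte Carlo check of this argument in Python: since we always switch, we win exactly when the initially chosen door differs from the door hiding the car.

import random

def switch_always_wins_fraction(trials=100_000):
    # we always switch, so we win the car exactly when our initial door
    # (chosen uniformly at random) does not hide the car
    wins = sum(random.randrange(3) != random.randrange(3) for _ in range(trials))
    return wins / trials

print(switch_always_wins_fraction())    # approximately 2/3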
where, for example, (b, g) indicates that the youngest child is a boy and the
oldest child is a girl. We assume a uniform probability function, so that each
outcome has a probability of 1/4.
We are given the additional information that one of the two children is
a boy, or, to be more precise, that at least one of the two children is a boy.
This means that the actual sample space is not S, but the set {(b, b), (b, g), (g, b)}.
When asking for the probability that the other child is also a boy, we are
really asking for the probability that both children are boys. Since there is
only one possibility (out of three) for both children to be boys, it follows that
this probability is equal to 1/3.
This is an example of a conditional probability: We are asking for the
probability of an event (both children are boys), given that another event (at
least one of the two children is a boy) occurs.
Definition 5.8.1 Let (S, Pr) be a probability space and let A and B be two
events with Pr(B) > 0. The conditional probability Pr(A | B), pronounced
as “the probability of A given B”, is defined as
Pr(A | B) = \frac{Pr(A ∩ B)}{Pr(B)}.
Let us try to understand where this definition comes from. Initially, the
sample space is equal to S. When we are given the additional information
that event B occurs, the sample space “shrinks” to B, and event A occurs if
and only if event A ∩ B occurs.
(Figure: the sample space S with the event B drawn inside it; given that B occurs, the sample space shrinks to B.)
Let us now consider the conditional probability Pr(B | A). Thus, we are
given that event A occurs, i.e., the roll of the die resulted in 3. Since 3 is an
odd integer, event B is guaranteed to occur. Therefore, Pr(B | A) should be
equal to 1. Again, we are going to verify that this is indeed the answer we
get when using Definition 5.8.1:
Pr(B | A) = \frac{Pr(B ∩ A)}{Pr(A)} = \frac{Pr(A)}{Pr(A)} = 1.
This shows that, in general, Pr(A | B) is not equal to Pr(B | A). Observe
that this is not surprising. (Do you see why?)
Consider the event
C = {2, 3, 5}.
and
Pr(C | A) = \frac{Pr(C ∩ A)}{Pr(A)} = \frac{Pr(A)}{Pr(A)} = 1.
Recall that B denotes the complement of the event B. Thus, this is the
event
B = “the result is an even integer”,
which, when written as a subset of the sample space, is
B = {2, 4, 6}.
Then Pr(C | \overline{B}) should be equal to 1/3. Indeed, we have

Pr(C | \overline{B}) = \frac{Pr(C ∩ \overline{B})}{Pr(\overline{B})} = \frac{|C ∩ \overline{B}|/|S|}{|\overline{B}|/|S|} = \frac{1/6}{3/6} = 1/3.
Observe that
Pr(C | B) + Pr(C | \overline{B}) = 2/3 + 1/3 = 1.
You may think that this is true for any two events B and C. This is, however,
not the case: Since

\overline{A} = {1, 2, 4, 5, 6},

we have

Pr(C | \overline{A}) = \frac{Pr(C ∩ \overline{A})}{Pr(\overline{A})} = \frac{|C ∩ \overline{A}|/|S|}{|\overline{A}|/|S|} = \frac{2/6}{5/6} = 2/5

and, thus,

Pr(C | A) + Pr(C | \overline{A}) = 1 + 2/5 ≠ 1.
Intuitively, the identity Pr(A | C) + Pr(\overline{A} | C) = 1 should be true for any two events A and C: When we are given that event C occurs, then either A occurs or A does not occur (in which case \overline{A} occurs). The following lemma states that this intuition is indeed correct.
Lemma 5.8.2 Let (S, Pr) be a probability space and let A and B be two
events with Pr(B) > 0. Then
Pr(A | B) + Pr(\overline{A} | B) = 1.
implying that

Pr(A ∩ B) + Pr(\overline{A} ∩ B) = Pr(B).

We conclude that

Pr(A | B) + Pr(\overline{A} | B) = \frac{Pr(B)}{Pr(B)} = 1.
(Figure: the tree of outcomes — the red coin comes up H or T; in the first case the blue coin gives the outcomes HH and HT, and in the second case the die gives the outcomes T1, T2, T3, T4, T5, T6.)
The sample space is the set S of all possible values that can be returned
by algorithm FlipAndFlipOrRoll. Thus, we have
S = {(H, H), (H, T ), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}.
We are interested in the probability that the algorithm returns the value
(T, 5), i.e., the probability of the event
A = {(T, 5)}.
Since the event A obviously depends on the result of flipping the red coin,
we consider the event
R = {(T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}.
We have seen already that Pr(R) = 1/2. To determine Pr(A | R), we assume
that event R occurs. Under this assumption, event A occurs if and only if
the result of rolling the die is 5, which happens with probability 1/6. Thus,
Pr(A | R) = 1/6
Ai = {(T, i)}
because the die is fair. Let p denote the common value of the pi ’s. Next
observe that
R = A1 ∪ A2 ∪ A3 ∪ A4 ∪ A5 ∪ A6 ,
where the six events on the right-hand side are pairwise disjoint. We have
seen already that Pr(R) = 1/2. It follows that
1/2 = Pr(R)
    = Pr\left(\bigcup_{i=1}^{6} A_i\right)
    = \sum_{i=1}^{6} Pr(A_i)
    = \sum_{i=1}^{6} p
    = 6p,

implying that p = 1/12.
Thus, we have obtained a formal proof of the fact that the probability of the event A is equal to 1/12.
Alternatively, using Definition 5.8.1, we get

Pr(A | R) = \frac{Pr(A ∩ R)}{Pr(R)} = \frac{Pr(A)}{Pr(R)} = \frac{1/12}{1/2} = 1/6.
3. \bigcup_{i=1}^{n} B_i = S.

Then

Pr(A) = \sum_{i=1}^{n} Pr(A | B_i) · Pr(B_i).
Proof. We have

A = A ∩ S = A ∩ \left(\bigcup_{i=1}^{n} B_i\right) = \bigcup_{i=1}^{n} (A ∩ B_i).

Since the events A ∩ B_1, A ∩ B_2, . . . , A ∩ B_n are pairwise disjoint, Lemma 5.3.2 implies that Pr(A) = \sum_{i=1}^{n} Pr(A ∩ B_i). Finally, by the definition of conditional probability,

Pr(A ∩ B_i) = Pr(A | B_i) · Pr(B_i).
Let us consider the three conditions in this theorem. The first condition
is that Pr (Bi ) > 0, i.e., there is a positive probability that event Bi occurs.
The second and third conditions, i.e.,
• the events B1 , B2 , . . . , Bn are pairwise disjoint, and
• \bigcup_{i=1}^{n} B_i = S,
are equivalent to
• exactly one of the events B1 , B2 , . . . , Bn is guaranteed to occur.
In the example in the beginning of this section, we wanted to know Pr(A),
where A is the event
to determine. For this example, we define the event Bi , for each i with
1 ≤ i ≤ 365, to be
It is clear that (i) Pr (Bi ) = 1/365 > 0 and (ii) exactly one of the events
B1 , B2 , . . . , B365 is guaranteed to occur. It follows that
Pr(A) = \sum_{i=1}^{365} Pr(A | B_i) · Pr(B_i).

Since Pr(A | B_i) = 1/365 for each i, we get

Pr(A) = \sum_{i=1}^{365} (1/365) · Pr(B_i)
      = (1/365) \sum_{i=1}^{365} Pr(B_i)
      = (1/365) · 1
      = 1/365,
– If the coin comes up heads, then we roll a fair die. Let R denote
the result of this die.
– If the coin comes up tails, then we roll two fair dice. Let R denote
the sum of the results of these dice.
If A is the event “R = 2”, then we want to know Pr(A). Since the value of R depends on whether the coin comes up heads or tails, we define the event

B = “the coin comes up heads”.
Since (i) both B and its complement \overline{B} occur with a positive probability and (ii) exactly one of B and \overline{B} is guaranteed to occur, we can apply Theorem 5.9.1 and get

Pr(A) = Pr(A | B) · Pr(B) + Pr(A | \overline{B}) · Pr(\overline{B}).
• To determine Pr(A | B), we assume that the event B occurs, i.e., the
coin comes up heads. Because of this assumption, we roll one die, and
the event A occurs if and only if the result of this roll is 2. It follows
that
Pr(A | B) = 1/6.
• To determine Pr(A | \overline{B}), we assume that the event \overline{B} occurs, i.e., the
coin comes up tails. Because of this assumption, we roll two dice, and
the event A occurs if and only if both rolls result in 1. Since there are
36 possible outcomes when rolling two dice, it follows that
Pr(A | \overline{B}) = 1/36.
We conclude that
Pr(A) = Pr(A | B) · Pr(B) + Pr(A | \overline{B}) · Pr(\overline{B})
      = 1/6 · 1/2 + 1/36 · 1/2
      = 7/72.
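As a sanity check, the following Python snippet enumerates all outcomes of this experiment with exact fractions and reproduces Pr(A) = 7/72.

from fractions import Fraction

# flip a fair coin; on heads roll one die, on tails roll two dice;
# R is the result (or the sum of the two results)
pr_R_is_2 = Fraction(0)
for r in range(1, 7):                              # heads, one die
    if r == 2:
        pr_R_is_2 += Fraction(1, 2) * Fraction(1, 6)
for i in range(1, 7):                              # tails, two dice
    for j in range(1, 7):
        if i + j == 2:
            pr_R_is_2 += Fraction(1, 2) * Fraction(1, 36)
print(pr_R_is_2)                                   # 7/72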
Algorithm TakeASeat(n, k):
// n ≥ 2 and k ≥ 0;
// the input consists of n people P1 , P2 , . . . , Pn and
// n + k chairs C1 , C2 , . . . , Cn+k
j = uniformly random element in {1, 2, . . . , n + k};
person P1 sits down in chair Cj ;
for i = 2 to n
do if chair Ci is available
then person Pi sits down in chair Ci
else j = index of a uniformly random available chair;
person Pi sits down in chair Cj
endif
endfor
pn,k = Pr (An,k ) .
Event An,k occurs if and only if chair Cn is available (i.e., has not been taken)
at the start of iteration n.
Is it true that the chair among C1 , Cn , Cn+1 , . . . , Cn+k that has been taken
at the start of iteration n is a uniformly random chair from these k +2 chairs?
If this is the case, then chair Cn has been taken with probability 1/(k + 2)
and, thus, Cn is available with probability 1 − 1/(k + 2) = (k + 1)/(k + 2).
In other words, if the question above has a positive answer, then
p_{n,k} = Pr(A_{n,k}) = \frac{k+1}{k+2}.
In the rest of this section, we will present two ways to prove that this is
indeed the correct value of pn,k . In both proofs, we will use the Law of Total
Probability of Section 5.9.
Note that pn,k does not depend on n. In particular, if k = 0, then the
probability that person Pn sits in chair Cn is equal to 1/2.
p_{2,k} = Pr(A_{2,k}) = \frac{k+1}{k+2}.
Thus, we can determine the probability that event An,k occurs, if we are given
the value of j; note that this is a conditional probability. Since j is a random
element in the set {1, 2, . . . , n + k}, we are going to use the Law of Total
Probability (Theorem 5.9.1): For each j ∈ {1, 2, . . . , n + k}, we consider the
event
Since exactly one of these events is guaranteed to occur, we can apply The-
orem 5.9.1 and obtain
Pr(A_{n,k}) = \sum_{j=1}^{n+k} Pr(A_{n,k} | B_{n,k,j}) · Pr(B_{n,k,j}).
It follows from the first line in algorithm TakeASeat(n, k) that, for each j
with 1 ≤ j ≤ n + k,
Pr(B_{n,k,j}) = \frac{1}{n+k}.
• Assume that j ∈ {1, n + 1, n + 2, . . . , n + k}. We have seen above that
event An,k occurs. Thus,
Pr (An,k | Bn,k,j ) = 1.
• Assume that j = n. We have seen above that event An,k does not
occur. Thus,
Pr (An,k | Bn,k,n ) = 0.
• Assume that j ∈ {2, 3, . . . , n − 1}. We have seen above that event An,k
occurs if and only if event A_{n−j+1,k} occurs. Thus,

Pr(A_{n,k} | B_{n,k,j}) = p_{n−j+1,k}.
We conclude that
p_{n,k} = Pr(A_{n,k})
        = \sum_{j=1}^{n+k} Pr(A_{n,k} | B_{n,k,j}) · Pr(B_{n,k,j})
        = \sum_{j=1}^{n+k} Pr(A_{n,k} | B_{n,k,j}) · \frac{1}{n+k}
        = \frac{1}{n+k} \sum_{j=1}^{n+k} Pr(A_{n,k} | B_{n,k,j})
        = \frac{1}{n+k} \left( (k + 1) + \sum_{j=2}^{n−1} p_{n−j+1,k} \right).
If we write out the terms in this summation, then we get, for each n ≥ 3,

p_{n,k} = \frac{k+1}{n+k} + \frac{1}{n+k} (p_{2,k} + p_{3,k} + · · · + p_{n−1,k}).
Recall that p_{2,k} = \frac{k+1}{k+2}. We claim that

p_{n,k} = \frac{k+1}{k+2}
for all integers n ≥ 2. (Recall that we already suspected this.) Using induc-
tion on n, it can easily be proved that this is indeed the case.
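The following Python simulation of algorithm TakeASeat (a sketch written from the pseudocode above, with an arbitrary choice of n and k) agrees with the value (k + 1)/(k + 2).

import random

def take_a_seat(n, k):
    # returns True if person Pn ends up in chair Cn
    taken = [False] * (n + k + 1)             # chairs C1, ..., C(n+k); index 0 unused
    taken[random.randint(1, n + k)] = True    # P1 sits in a uniformly random chair
    seat_of_last = None
    for i in range(2, n + 1):
        if not taken[i]:
            taken[i] = True                   # Pi sits in her own chair Ci
            seat_of_last = i
        else:
            c = random.choice([j for j in range(1, n + k + 1) if not taken[j]])
            taken[c] = True                   # Pi sits in a random available chair
            seat_of_last = c
    return seat_of_last == n

n, k, trials = 10, 3, 50_000
estimate = sum(take_a_seat(n, k) for _ in range(trials)) / trials
print(estimate, (k + 1) / (k + 2))            # both are approximately 0.8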
Algorithm TakeASeat′(n, k):
// n ≥ 2 and k ≥ 0;
// the input consists of n people P1 , P2 , . . . , Pn and
// n + k chairs C1 , C2 , . . . , Cn+k
j = uniformly random element in {1, 2, . . . , n + k};
person P1 sits down in chair Cj ;
for i = 2 to n − 1
do // P2 sits in C2 , P3 sits in C3 , . . . , Pi−1 sits in Ci−1
if chair Ci has been taken
then // P1 sits in Ci
j = uniformly random element
in {1, i + 1, i + 2, . . . , n + k};
person P1 sits down in chair Cj
endif;
person Pi sits down in chair Ci
endfor;
// P2 sits in C2 , P3 sits in C3 , . . . , Pn−1 sits in Cn−1
if chair Cn is available
then person Pn sits down in chair Cn
else j = uniformly random element
in {1, n + 1, n + 2, . . . , n + k};
person Pn sits down in chair Cj
endif
Recall that An,k is the event that person Pn sits in chair Cn , after the
original algorithm TakeASeat(n, k) has terminated. It follows from the
modified algorithm TakeASeat′(n, k) that event A_{n,k} occurs if and only if,
Definition 5.11.1 Let (S, Pr) be a probability space and let A and B be
two events. We say that A and B are independent if

Pr(A ∩ B) = Pr(A) · Pr(B).

In this definition, it is not assumed that Pr(A) > 0 and Pr(B) > 0. If
Pr(B) > 0, then

Pr(A | B) = \frac{Pr(A ∩ B)}{Pr(B)},

and A and B are independent if and only if

Pr(A | B) = Pr(A).
Similarly, if Pr(A) > 0, then A and B are independent if and only if

Pr(B | A) = Pr(B).
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6},
where i is the result of the red die and j is the result of the blue die. We
assume a uniform probability function. Thus, each outcome has a probability
of 1/36.
Let D1 denote the result of the red die and let D2 denote the result of
the blue die. Consider the events
A = “D1 + D2 = 7”
and
B = “D1 = 4”.
Are these events independent?
• Since
A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)},
we have Pr(A) = 6/36 = 1/6.
• Since
B = {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)},
we have Pr(B) = 6/36 = 1/6.
• Since
A ∩ B = {(4, 3)},
we have Pr(A ∩ B) = 1/36.

Since Pr(A) · Pr(B) = 1/6 · 1/6 = 1/36 = Pr(A ∩ B), the events A and B are independent. On the other hand, one can verify in the same way that the events
A′ = “D1 + D2 = 11”
and
B′ = “D1 = 5”
are not independent.
Now consider the two events
A00 = “D1 + D2 = 4”
and
B 00 = “D1 = 4”.
Definition 5.11.3 Let (S, Pr) be a probability space, let n ≥ 2, and let
A1 , A2 , . . . , An be a sequence of events.
1. We say that this sequence is pairwise independent if for any two distinct
indices i and j, the events Ai and Aj are independent, i.e.,
Pr (Ai ∩ Aj ) = Pr (Ai ) · Pr (Aj ) .
A = “f1 = f2”,
B = “f2 = f3”,
and
C = “f1 = f3 ”.
If we write these events as subsets of the sample space, then we get
A = {HHH, HHT, T T H, T T T },
B = {HHH, T HH, HT T, T T T },
and
C = {HHH, HT H, T HT, T T T }.
It follows that
Pr(A) = |A|/|S| = 4/8 = 1/2,
Pr(B) = |B|/|S| = 4/8 = 1/2,
Pr(C) = |C|/|S| = 4/8 = 1/2,
Pr(A ∩ B) = |A ∩ B|/|S| = 2/8 = 1/4,
Pr(A ∩ C) = |A ∩ C|/|S| = 2/8 = 1/4,
Pr(B ∩ C) = |B ∩ C|/|S| = 2/8 = 1/4.
Thus, any two of the events A, B, and C are independent, i.e., the sequence A, B, C is pairwise independent. On the other hand, since

A ∩ B ∩ C = {HHH, TTT},
we have
Pr(A ∩ B ∩ C) = |A ∩ B ∩ C|/|S| = 2/8 = 1/4.
Thus,
Pr(A ∩ B ∩ C) 6= Pr(A) · Pr(B) · Pr(C)
and, therefore, the sequence A, B, C is not mutually independent. Of course,
this is not surprising: If both events A and B occur, then event C also occurs.
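The following Python snippet verifies both facts by enumerating the eight outcomes: every pair of the events is independent, but the triple is not.

from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=3))        # the eight outcomes of three coin flips

def pr(event):
    return Fraction(sum(1 for w in S if event(w)), len(S))

A = lambda w: w[0] == w[1]               # "f1 = f2"
B = lambda w: w[1] == w[2]               # "f2 = f3"
C = lambda w: w[0] == w[2]               # "f1 = f3"

for X, Y in ((A, B), (A, C), (B, C)):
    assert pr(lambda w: X(w) and Y(w)) == pr(X) * pr(Y)   # pairwise independent

print(pr(lambda w: A(w) and B(w) and C(w)),               # 1/4 ...
      pr(A) * pr(B) * pr(C))                              # ... but 1/8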
• Similarly, since the union (∪) of sets corresponds to the disjunction (∨)
of propositions, we often write A ∨ B for the event “A or B occurs”.
The events
A = “the coin comes up heads”
and
B = “the result of the die is even”
correspond to the subsets

A = {H1, H2, H3, H4, H5, H6}

and

B = {H2, H4, H6, T2, T4, T6}
of the sample space S, respectively. The event that both A and B occur is written as A ∧ B and corresponds to the subset

{H2, H4, H6}

of S.
Assume that both the coin and the die are fair, and the results of rolling
the die and flipping the coin are independent. The probability that both A
and B occur, i.e., Pr(A ∧ B), is equal to |A ∩ B|/|S| = 3/12 = 1/4. We can
also use independence to determine this probability:

Pr(A ∧ B) = Pr(A) · Pr(B) = 1/2 · 1/2 = 1/4.
Observe that when we determine Pr(A), we do not consider the entire sample
space S. Instead, we consider the coin’s sample space, which is {H, T }.
Similarly, when we determine Pr(B), we consider the die’s sample space,
which is {1, 2, 3, 4, 5, 6}.
The probability that A or B occurs, i.e., Pr(A ∨ B), is equal to

Pr(A) + Pr(B) − Pr(A ∧ B) = 1/2 + 1/2 − 1/4 = 3/4.
We assume that the coin flips are independent of each other, by which we
mean that the sequence A1 , A2 , . . . , An of events is mutually independent.
Consider the event
A = A1 ∧ A2 ∧ · · · ∧ An .
What is Pr(A), i.e., the probability that all n coins come up heads? Since
there are 2^n many possible outcomes for n coin flips and only one of them
satisfies event A, this probability is equal to 1/2^n. Alternatively, we can use
independence to determine Pr(A):
Pr(A) = Pr (A1 ∧ A2 ∧ · · · ∧ An )
= Pr (A1 ) · Pr (A2 ) · · · Pr (An ) .
Since each coin is fair, we have Pr(A_i) = 1/2 and, thus, we get Pr(A) = (1/2)^n = 1/2^n.
Ai = “component Ci fails”.
u = head (L);
i = 1;
while u ≠ nil
do r = Random(i);
if r = 1
then x = u
endif;
u = succ(u);
i=i+1
endwhile;
return x
• in the last iteration, x is set to the last node of L with probability 1/|L|,
whereas the value of x does not change with probability (|L| − 1)/|L|.
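In Python, with an ordinary list standing in for the linked list L and random.randint(1, i) playing the role of Random(i), the same one-pass selection could be sketched as follows; every element indeed comes out with frequency close to 1/|L|:

import random
from collections import Counter

def choose_random_element(items):
    # one pass over the items; the i-th item becomes the current choice with probability 1/i
    x = None
    for i, u in enumerate(items, start=1):
        if random.randint(1, i) == 1:
            x = u
    return x

counts = Counter(choose_random_element("abcde") for _ in range(100_000))
print(counts)   # each of the five letters appears roughly 20,000 times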
Then
A = Ak ∧ Ak+1 ∧ Ak+2 ∧ Ak+3 ∧ · · · ∧ An .
Recall that we assume that the output of the function Random is inde-
pendent of all other calls to this function. This implies that the events
R = r1 r2 . . . rn .
A run of length k is a substring of R, all of whose bits are the same. For
example, the bitstring
00111100101000011000
contains two runs of length 4: the substring 1111 and the substring 0000.
Ā = Ā1 ∧ Ā2 ∧ · · · ∧ Ān−k+1.
It follows that
Pr(Ā) = Pr(Ā1 ∧ Ā2 ∧ · · · ∧ Ān−k+1). (5.5)
We determine Pr(Āi) by first determining Pr (Ai ). The event Ai occurs
if and only if
ri = ri+1 = · · · = ri+k−1 = 0
or
ri = ri+1 = · · · = ri+k−1 = 1.
Since the coin flips are mutually independent, it follows that
Pr (Ai ) = (1/2)^k + (1/2)^k = 1/2^(k−1)
and, therefore,
Pr(Āi) = 1 − Pr (Ai ) = 1 − 1/2^(k−1).
Pr(B̄i) = 1 − 1/2^(k−1).
Observe that
• the events B̄1, B̄2, . . . , B̄n/k are mutually independent, because the blocks do not overlap, and
• if the event Ā occurs, then the event B̄1 ∧ B̄2 ∧ · · · ∧ B̄n/k also occurs (but, in general, the converse is not true!).
implying that
Pr(Ā) ≤ e^(−2n/(k·2^k)) ≤ e^(−2 ln n) = 1/n^2
and, thus,
Pr(A) = 1 − Pr(Ā) ≥ 1 − 1/n^2.
2⌊n/k⌋/2^k > 2(n/k − 1)/2^k
= (2 log^2 n)(n/k − 1)/n
= (2 log^2 n)/k − (2 log^2 n)/n
≥ 2 ln n − (2 log^2 n)/n
and, thus,
Pr(Ā) ≤ e^(−2⌊n/k⌋/2^k)
≤ e^(−2 ln n + (2 log^2 n)/n)
= e^(−2 ln n) · e^((2 log^2 n)/n)
= (1/n^2) · (1 + O((log^2 n)/n)).
This upper bound is larger than the upper bound we had before by only a small additive term of O((log^2 n)/n^3).
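As a rough empirical check of this analysis (a simulation sketch, not a proof), the longest run in a random bitstring of length n can be estimated as follows; its length indeed concentrates around log2(n):

import random
from math import log2
from itertools import groupby

def longest_run(bits):
    # length of the longest block of consecutive equal bits
    return max(len(list(g)) for _, g in groupby(bits))

n, trials = 2**14, 100
runs = [longest_run([random.randint(0, 1) for _ in range(n)]) for _ in range(trials)]
print(min(runs), max(runs), round(log2(n), 2))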
S = {H, T H, T T H, T T T H, T T T T H, . . .}
= {T^n H : n ≥ 0},
where T^n H denotes n tails followed by one heads. We define
Pr (T^n H) = (1/2)^(n+1).
For this to define a valid probability function, the sum of all these probabilities must be equal to 1. Since you may have forgotten about infinite series, we recall the definition in the following subsection.
(1 − x)(1 + x + x^2 + · · · + x^N) = 1 − x^(N+1).
Now we can return to the coin flipping example that we saw in the be-
ginning of Section 5.15. If we take x = 1/2 in Lemma 5.15.2, then we get
Σ_{n=0}^{∞} Pr (T^n H) = Σ_{n=0}^{∞} (1/2)^(n+1)
= (1/2) Σ_{n=0}^{∞} (1/2)^n
= (1/2) · 1/(1 − 1/2)
= 1.
Thus, we indeed have a valid probability function on the infinite sample space
S = {T^n H : n ≥ 0}.
is about lim_{N→∞} ln N,
S = {T^n H : n ≥ 0}.
A = {T^n H : n ≥ 0 and n is even},
which we rewrite as
A = {T^(2m) H : m ≥ 0}.
The probability that P1 wins the game is equal to Pr(A). How do we deter-
mine this probability? According to (5.1) in Section 5.2,
X
Pr(A) = Pr(ω).
ω∈A
Thus, we have
Pr(A) = Σ_{m=0}^{∞} Pr (T^(2m) H)
= Σ_{m=0}^{∞} (1/2)^(2m+1)
= (1/2) Σ_{m=0}^{∞} (1/2)^(2m)
= (1/2) Σ_{m=0}^{∞} (1/4)^m
= (1/2) · 1/(1 − 1/4)
= 2/3.
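A short simulation (ad hoc code, not part of the analysis) confirms this value: if two players alternately flip a fair coin and the first heads wins, the starting player wins about 2/3 of the time.

import random

def p1_wins():
    # players alternate flips, P1 first; the player who flips the first heads wins
    p1_to_flip = True
    while True:
        if random.random() < 0.5:   # heads
            return p1_to_flip
        p1_to_flip = not p1_to_flip

trials = 200_000
print(sum(p1_wins() for _ in range(trials)) / trials)   # close to 2/3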
Let us verify, using an infinite series, that Pr(B) is indeed equal to 1/3. The
event B corresponds to the subset
B = {T^n H : n ≥ 0 and n is odd},
which we rewrite as
B = {T^(2m+1) H : m ≥ 0}.
S = {T^m H T^n H : m ≥ 0, n ≥ 0}.
The event
A = “P1 wins the game”
corresponds to the subset
A = {T^m H T^n H : m ≥ 0, n ≥ 0, m + n is odd}.
Below, we will determine Pr(A), i.e., the probability that P1 wins the game.
We split the event A into two events
A1 = “P1 flips the first heads and P1 flips the second heads”
and
A2 = “P2 flips the first heads and P1 flips the second heads”.
Using the geometric series, one can show that Pr (A1 ) = 2/9. Similarly, the event A2 corresponds to the subset
A2 = {T^(2k+1) H T^(2ℓ) H : k ≥ 0, ℓ ≥ 0}
and, thus,
Pr (A2 ) = Σ_{k=0}^{∞} Σ_{ℓ=0}^{∞} Pr (T^(2k+1) H T^(2ℓ) H)
= Σ_{k=0}^{∞} Σ_{ℓ=0}^{∞} (1/2)^(2k+2ℓ+3)
= 2/9.
It follows that Pr(A) = Pr (A1 ) + Pr (A2 ) = 2/9 + 2/9 = 4/9 and, thus, the probability that P2 wins the game is 1 − Pr(A) = 5/9.
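Again, a small simulation sketch agrees with this calculation: the starting player flips the second heads in roughly 4/9 of the games.

import random

def p1_flips_second_heads():
    # players alternate flips, P1 first; the player who flips the second heads wins
    heads, flip = 0, 0
    while True:
        flip += 1
        if random.random() < 0.5:
            heads += 1
            if heads == 2:
                return flip % 2 == 1   # odd-numbered flips are made by P1

trials = 200_000
print(sum(p1_flips_second_heads() for _ in range(trials)) / trials)   # close to 4/9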
5.16 Exercises
5.1 Consider a coin that has 0 on one side and 1 on the other side. We flip
this coin once and roll a die twice, and are interested in the product of the
three numbers.
• If both the coin and the die are fair, how would you define the proba-
bility function Pr for this sample space?
5.3 Let n be a positive integer. We flip a fair coin 2n times and consider the
possible outcomes, which are strings of length 2n with each character being
H (= heads) or T (= tails). Thus, we take the sample space S to be the set
of all such strings. Since our coin is fair, each string of S should have the
same probability. Thus, we define Pr(s) = 1/|S| for each string s in S. In
other words, we have a uniform probability space.
You are asked to determine the probability that in the sequence of 2n
flips, the coin comes up heads exactly n times:
• What is the event A that describes this?
• Determine Pr(A).
5.4 A cup contains two pennies (P), one nickel (N), and one dime (D). You
choose one coin uniformly at random, and then you choose a second coin
from the remaining coins, again uniformly at random.
• Let S be the sample space consisting of all ordered pairs of letters P,
N, and D that represent the possible outcomes. Write out all elements
of S.
5.5 You are given a box that contains the 8 lowercase letters a, b, c, d, e, f, g, h
and the 5 uppercase letters V, W, X, Y, Z.
In this exercise, we will consider two ways to choose 4 random letters
from the box. In the first way, we do uniform sampling without replacement,
whereas in the second way, we do uniform sampling with replacement. For
each case, you are asked to determine the probability that the 4-th letter
chosen is an uppercase letter. Before starting this exercise, spend a few
minutes and guess for which case this probability is smaller.
• You choose 4 letters from the box: These letters are chosen in 4 steps,
and in each step, you choose a uniformly random letter from the box;
this letter is removed from the box.
• Assume both coins are fair. Determine Pr(A), Pr(B), and Pr(C).
• Let p and q be real numbers with 0 < p < 1 and 0 < q < 1. Assume
the red coin comes up “1” with probability p and the blue coin comes
up “1” with probability q. Is it possible to choose p and q such that
• D3 : fn−2 of its faces show the number 5 and the other fn−1 faces show
the number 2.
Pr(R1 > R2 )
and
Pr(R2 > R3 ),
and show that
Pr(R3 > R1 ) = (fn−2 · fn+1 )/fn^2.
5.11 You are given a fair die. If you roll this die repeatedly, then the results
of the rolls are independent of each other.
Determine Pr(A).
Determine Pr(B).
Determine Pr(C).
Before starting this exercise, spend a few minutes and guess which of these
three probabilities is the smallest.
5.12 When Tri is a big boy, he wants to have four children. Assuming that
the genders of these children are uniformly random, which of the following
three events has the highest probability?
1. All four kids are of the same gender.
2. Exactly three kids are of the same gender.
3. Two kids are boys and two kids are girls.
5.13 A group of ten people sits down, uniformly at random, around a table.
Lindsay and Simon are part of this group. Determine the probability that
Lindsay and Simon sit next to each other.
5.14 Consider five people, each of which has a uniformly random and inde-
pendent birthday. (We ignore leap years.) Consider the event
A = “at least three people have the same birthday”.
Determine Pr(A).
5.15 Donald Trump wants to hire two secretaries. There are n applicants
a1 , a2 , . . . , an , where n ≥ 2 is an integer. Each of these applicants has a
uniformly random birthday, and all birthdays are mutually independent. (We
ignore leap years.)
Since Donald is too busy making America great again, he does not have
time to interview the applicants. Instead, he uses the following strategy: If
there is an index i such that ai and ai+1 have the same birthday, then he
chooses the smallest such index i and hires ai and ai+1 . In this case, the
hiring process is a tremendous success. If such an index i does not exist,
then nobody is hired and the hiring process is a total disaster.
Determine the probability that the hiring process is a tremendous success.
5.17 In Section 5.4.1, we have seen the different cards that are part of a
standard deck of cards.
• You choose 2 cards uniformly at random from the 13 spades in a deck
of 52 cards. Determine the probability that you choose an Ace and a
King.
• You choose 2 cards uniformly at random from a deck of 52 cards. De-
termine the probability that you choose an Ace and a King.
• You choose 2 cards uniformly at random from a deck of 52 cards. De-
termine the probability that you choose an Ace and a King of the same
suit.
5.18 In Section 5.4.1, we have seen the different cards that are part of a
standard deck of cards.
A hand of cards is a subset consisting of five cards. A hand of cards is
called a straight, if the ranks of these five cards are consecutive and the cards
are not all of the same suit.
An Ace and a 2 are considered to be consecutive, whereas a King and an
Ace are also considered to be consecutive. For example, each of the three
hands below is a straight:
8♠, 9♥, 10♦, J♠, Q♣
5.20 Let A be an event in some probability space (S, Pr). You are given
that the events A and A are independent². Determine Pr(A).
5.21 You are given three events A, B, and C in some probability space
(S, Pr). Is the following true or false?
Pr A ∩ B ∩ C = Pr(A ∪ B ∪ C) − Pr(B) − Pr(C) + Pr (B ∩ C) .
1 − x ≤ e^(−x)
5.24 Let (S, Pr) be a probability space and let B be an event with Pr(B) > 0.
Consider the function Pr′ : S → R defined by
Pr′(ω) = Pr(ω)/Pr(B) if ω ∈ B, and Pr′(ω) = 0 if ω ∉ B.
Prove that, for any event A,
Pr′(A) = Pr(A ∩ B)/Pr(B).
5.25 Consider two events A and B in some probability space (S, Pr).
• Assume that Pr(A) = 1/2 and Pr B | A = 3/5. Determine Pr(A∪B).
• Assume that Pr(A∪B) = 5/6 and Pr A | B = 1/3. Determine Pr(B).
² This is not a typo.
• Pr(A | B) = Pr(A),
Hint: The sequence of six events may contain duplicates. Try to make the
sample space S as small as you can.
5.28 You flip a fair coin three times. Consider the four events (recall that
zero is even)
• Determine Pr(A), Pr(B), Pr(C), Pr(D), Pr(A | C), and Pr(A | D).
• Are there any two events in the sequence A, B, C, and D that are
independent?
5.29 Consider a box that contains four beer bottles b1 , b2 , b3 , b4 and two cider
bottles c1 , c2 . You choose a uniformly random bottle from the box (and do
not put it back), after which you again choose a uniformly random bottle
from the box.
Consider the events
A = “5 is an element of X”,
B = “6 is an element of X”,
C = “6 is an element of X or 7 is an element of X”.
5.40 In this exercise, we assume that, when a child is born, its gender is
uniformly random, its day of birth is uniformly random, the gender and day
of birth are independent of each other and independent of other children.
Anil Maheshwari has two children. You are given that at least one of
Anil’s kids is a boy who was born on a Sunday. Determine the probability
that Anil has two boys.
5.41 Elisa and Nick go to Tan Tran’s Darts Bar. When Elisa throws a dart,
she hits the dartboard with probability p. When Nick throws a dart, he
hits the dartboard with probability q. Here, p and q are real numbers with
0 < p < 1 and 0 < q < 1. Elisa and Nick throw one dart each, independently
of each other. Consider the events
5.42 As everyone knows, Elisa Kazan loves to drink cider. You may not be
aware that Elisa is not a big fan of beer.
(Figure: a round tray with six bottles placed at positions numbered 1 through 6; five of the bottles are beer (B) and one is cider (C).)
Elisa spins the tray uniformly at random in clockwise order. After the
tray has come to a rest, there is a bottle of beer in front of her. Since Elisa is
obviously not happy, she gets a second chance, i.e., Elisa can choose between
one of the following two options:
1. Spin the tray again uniformly at random and independently of the first
spin. After the tray has come to a rest, Elisa must drink the bottle
that is in front of her.
2. Rotate the tray one position (i.e., 60 degrees) in clockwise order, after
which Elisa must drink the bottle that is in front of her.
• Elisa decides to go for the first option. Determine the probability that
she drinks the bottle of cider.
• Die D1 has 0 on two of its faces and 1 on the other four faces.
5.45 Nick is taking the course SPID 2804 (The Effect of Spiderman on the
Banana Industry). The final exam for this course consists of one true/false
question. To answer this question, Nick uses the following approach:
You are given that Nick knows the answer to the question with probabil-
ity 0.8. Consider the event
Determine Pr(A).
5.46 Let A and B be events in some probability space (S, Pr), such that
Pr(A) 6= 0 and Pr(B) 6= 0. Use the definition of conditional probability to
prove Bayes’ Theorem:
Pr(A | B) = Pr(B | A) · Pr(A)/Pr(B).
• The test gives a false reading for 3% of the population without the
disease: If a person does not have X, then with probability 0.03, the
test says that the person does have X.
• Assume the test says that the person has X. Use Exercise 5.46 to
determine the probability that the person indeed has X.
5.49 Let n ≥ 2 and m ≥ 1 be integers and consider two sets A and B, where
A has size n and B has size m. We choose a uniformly random function
f : A → B. For any two integers i and k with 1 ≤ i ≤ n and 1 ≤ k ≤ m,
consider the event
Aik = “f (i) = k”.
• For two distinct integers i and j, and for an integer k, are the two
events Aik and Ajk independent?
5.50 Consider three events A, B, and C in some probability space (S, Pr),
and assume that Pr(B ∩ C) 6= 0 and Pr(C) 6= 0. Prove that
• You get a uniformly random hand of three cards. Consider the event
Determine Pr(A).
• You get three cards, which are chosen one after another. Each of these
three cards is chosen uniformly at random from the current deck of
cards. (When a card has been chosen, it is removed from the current
deck.) Consider the events
and, for i = 1, 2, 3,
5.53 Let p be a real number with 0 < p < 1. You are given two coins C1
and C2 . The coin C1 is fair, i.e., if you flip this coin, it comes up heads with
probability 1/2 and tails with probability 1/2. If you flip the coin C2 , it
comes up heads with probability p and tails with probability 1 − p. You pick
one of these two coins uniformly at random, and flip it twice. These two coin
flips are independent of each other. Consider the events
• Determine Pr(A).
• Determine all values of p for which the events A and B are independent.
5.55 Donald Trump wants to hire a secretary and receives n applications for
this job, where n ≥ 1 is an integer. Since he is too busy in making important
announcements on Twitter, he appoints a three-person hiring committee.
After having interviewed the n applicants, each committee member ranks
the applicants from 1 to n. An applicant is hired for the job if he/she is
ranked first by at least two committee members.
Since the committee members do not have the ability to rank the appli-
cants, each member chooses a uniformly random ranking (i.e., permutation)
of the applicants, independently of each other.
John is one of the applicants. Determine the probability that John is
hired.
5.56 Edward, Francois-Xavier, Omar, and Yaser are sitting at a round table,
as in the figure below.
(Figure: Edward (E), Francois-Xavier (FX), Omar (O), and Yaser (Y) seated around a round table.)
At 11:59am, they all lower their heads. At noon, each of the boys chooses
a uniformly random element from the set {CW , CCW , O}; these choices are
independent of each other. If a boy chooses CW , then he looks at his clock-
wise neighbor, if he chooses CCW , then he looks at his counter-clockwise
neighbor, and if he chooses O, then he looks at the boy at the other side of
the table. When two boys make eye contact, they both shout Vive le Québec
libre.
Determine Pr(A).
Determine Pr(B).
Determine Σ_{i=0}^{4} Pr (Ci ).
5.57 You are given a fair die. For any integer n ≥ 1, you roll this die n times
(the rolls are independent). Consider the events
and
• Determine p1 .
• For any integer n ≥ 2, express the event An in terms of the events An−1
and Bn .
5.58 You are asked to design a random bit generator. You find a coin in
your pocket, but, unfortunately, you are not sure if it is a fair coin. After
some thought, you come up with the following algorithm GenerateBit(n),
which takes as input an integer n ≥ 1:
Algorithm GenerateBit(n):
In this exercise, you will show that, when n → ∞, the output of algorithm
GenerateBit(n) is a uniformly random bit.
Let p be the real number with 0 < p < 1, such that, if the coin is flipped
once, it comes up heads with probability p and tails with probability 1 − p.
(Note that algorithm GenerateBit does not need to know the value of p.)
For any integer n ≥ 1, consider the two events
and
and define
Pn = Pr (An )
and
Qn = Pn − 1/2.
• Determine P1 and Q1 .
• Prove that, for any integer n ≥ 2, Pn = p + (1 − 2p) · Pn−1 .
• Prove that, for any integer n ≥ 2, Qn = (1 − 2p) · Qn−1 .
• Prove that, for any integer n ≥ 1, Qn = (1 − 2p)^(n−1) · (p − 1/2).
• Prove that
lim Qn = 0
n→∞
and
lim Pn = 1/2.
n→∞
5.59 In this exercise, we will use the product notation. In case you are not
familiar with this notation:
• For k ≤ m, Π_{i=k}^{m} xi denotes the product
xk · xk+1 · xk+2 · · · xm .
• If k > m, then Π_{i=k}^{m} xi is an “empty” product, which we define to be equal to 1.
For example,
p1 = 1 − (1 − p1 ),
p1 (1 − p2 ) + p2 = 1 − (1 − p1 )(1 − p2 ),
The Ottawa Senators and the Toronto Maple Leafs play a best-of-(2n+1)
series: These two hockey teams play games against each other, and the first
team to win n + 1 games wins the series. Assume that
• each game has a winner (thus, no game ends in a tie),
• in any game, the Sens have a probability of 1/2 of defeating the Leafs,
and
• the results of the games are mutually independent.
Consider the events
A = “the Sens win the series”
and
B = “the Leafs win the series”.
Ak = “the Sens win the series after winning the (n + k + 1)-st game”.
• Prove that (5.8) holds by combining the results of the previous parts.
Determine Pr (A0 ).
• Prove that
Pr (A1 ) = (1 − 1/2^(n+1))/(n + 1).
• For each k with 0 ≤ k ≤ n, consider the event
Bk = “X has size k + 1 and P1 wins the six-pack”.
Prove that
Pr (Bk ) = ((n choose k)/2^(n+1)) · 1/(k + 1).
• Express the event A1 in terms of the events B0 , B1 , . . . , Bn .
• Prove that (5.9) holds by combining the results of the previous parts.
5.62 Let n and k be integers with 1 ≤ n ≤ k ≤ 2n. In this exercise, you will
prove that
Σ_{i=k−n}^{n} (k choose i) · (2n − k choose n − i) = (2n choose n). (5.10)
Jim is working on his assignment for the course COMP 4999 (Computa-
tional Aspects of Growing Cannabis). There are 2n questions on this assign-
ment and each of them is worth 1 mark. Two minutes before the deadline,
Jim has completed the first k questions. Jim is very smart and all answers
to these k questions are correct. Jim knows that the instructor, Professor
Mary Juana, does not accept late submissions. Because of this, Jim leaves
the last 2n − k questions blank and hands in his assignment.
Tri is a teaching assistant for this course. Since Tri is lazy, he does not
want to mark all questions. Instead, he chooses a uniformly random subset of
n questions out of the 2n questions, and only marks the n chosen questions.
For each correct answer, Tri gives 2 marks, whereas he gives 0 marks for each
wrong (or blank) answer.
For each integer i ≥ 0, consider the event
Ai = “Jim receives exactly 2i marks for his assignment”.
• Determine the value of the summation Σ_i Pr (Ai ). Explain your answer in plain English.
• Determine all values of i for which the event Ai is non-empty. For each
such value i, determine Pr (Ai ).
• Prove that (5.10) holds by combining the results of the previous parts.
5.63 Let a and z be integers with a > z ≥ 1, and let p be a real number
with 0 < p < 1. Alexa and Zoltan play a game consisting of several rounds.
In one round,
1. Alexa receives a points with probability p and 0 points with probability
1 − p,
2. Zoltan receives z points (with probability 1).
We assume that the results of different rounds are independent.
We say that Alexa is a better player than Zoltan, if Pr(A) > 1/2.
For which values of p is Alexa a better player than Zoltan?
• Assume that a = 3, z = 2, and p is chosen such that p > 1/2 and p^2 < 1/2. (For example, p = (√5 − 1)/2.)
– Is Alexa a better player than Zoltan?
– Alexa and Zoltan play a game consisting of two rounds. We con-
sider the total number of points that each player wins during these
two rounds. Consider the event
Prove that Pr(B) < 1/2. (This seems to suggest that Zoltan is a
better player than Alexa.)
• Let n be a large integer, and assume that a = n + 1, z = n, and p
is chosen very close to (but less than) 1. (For example, n = 500 and
p = 0.99.)
This exercise will lead you through a proof of the claim that
Prove that
Pr (Bi ) ≤ (i − 1)/d.
• Express the event A in terms of the events B1 , B2 , . . . , B2k .
B = “at least two of Pk+1 , Pk+2 , . . . , P2k have the same birthday”
Ci = “Pi has the same birthday as at least one of Pk+1 , Pk+2 , . . . , P2k ”.
Prove that
Pr(Ci | B̄) = 1/(4k).
also occurs.
• Prove that
Pr(Ā) ≤ (1 − 1/(4k))^k.
You may use the fact that the events C̄1 ∩ B̄, C̄2 ∩ B̄, . . . , C̄k ∩ B̄ are mutually independent.
Pr(A) ≤ (n − k + 1)/2^(k−1).
Pr(A) ≤ 2/n.
Pr (A | Bx ) = 1,
5.67 You are doing two projects P and Q. The probability that project P
is successful is equal to 2/3 and the probability that project Q is successful
is equal to 4/5. The two projects succeed or fail independently of each other. What is the probability that neither P nor Q is successful?
5.69 You flip three fair coins independently of each other. Let A be the event
“at least two flips in a row are heads” and let B be the event “the number
of heads is even”. (Note that zero is even.) Are A and B independent?
5.70 You flip three fair coins independently of each other. Consider the
events
A = “there is at most one tails”
and
B = “not all flips are identical”.
Are A and B independent?
5.71 Let n ≥ 2 be an integer and consider two fixed integers a and b with
1 ≤ a < b ≤ n.
• Use the Product Rule to determine the number of permutations of
{1, 2, . . . , n} in which a is to the left of b.
Use your answer to the first part of this exercise to determine Pr(A).
A = “in this permutation, both 3 and 4 are to the left of both 1 and 2”.
Determine Pr(A).
and
and
B = “1 and 2 are next to each other, with 1 to the left of 2, or
2 and 3 are next to each other, with 2 to the left of 3”.
5.75 You flip two fair coins independently of each other. Consider the events
5.78 You are given a tetrahedron, which is a die with four faces. Each of
these faces has one of the bitstrings 110, 101, 011, and 000 written on it.
Different faces have different bitstrings.
We roll the tetrahedron so that each face is at the bottom with equal
probability 1/4. For k = 1, 2, 3, consider the event Ak = “the k-th bit of the bitstring on the bottom face is 0”.
For example, if the bitstring at the bottom face is 101, then A1 is false, A2
is true, and A3 is false.
5.79 In a group of 100 children, 34 are boys and 66 are girls. You are given
the following information about the girls:
• 4 of the girls have green eyes, are blond, and are left-handed.
(Figure: a Venn diagram of three sets A, B, and C inside the sample space S, with the sizes of the regions indicated: 20, 20, 20, 5, 5, 5, 10, and 15.)
5.81 Annie, Boris, and Charlie write an exam that consists of only one
question: What is 26 times 26? Calculators are not allowed during the
exam. Both Annie and Boris are pretty clever and each of them gives the
correct answer with probability 9/10. Charlie has trouble with two-digit
numbers and gives the correct answer with probability 6/10.
• Assume that the three students do not cheat, i.e., each student answers
the question independently of the other two students. Determine the
probability that at least two of them give the correct answer.
• Assume that Annie and Boris do not cheat, but Charlie copies Annie’s
answer. Determine the probability that at least two of them give the
correct answer.
Hint: The answer to the second part is smaller than the answer to the first
part.
5.83 You are given a box that contains one red ball and one blue ball.
Consider the following algorithm RandomRedBlue(n) that takes as input an
integer n ≥ 3:
Algorithm RandomRedBlue(n):
// n ≥ 3
// initially, the box contains one red ball and one blue ball
// all random choices are mutually independent
for k = 1 to n − 2
do choose a uniformly random ball in the box;
if the chosen ball is red
then put the chosen ball back in the box;
add one red ball to the box
else put the chosen ball back in the box;
add one blue ball to the box
endif
endfor
In this exercise, you will prove that for any integers n ≥ 3 and i with
1 ≤ i ≤ n − 1,
Pr (Ani ) = 1/(n − 1). (5.11)
• Let n ≥ 3 and k be integers with 1 ≤ k ≤ n − 2. When running
algorithm RandomRedBlue(n),
– how many balls does the box contain at the start of the k-th
iteration,
– how many balls does the box contain at the end of the k-th iter-
ation?
• Let n = 3. Prove that (5.11) holds for all values of i in the indicated
range.
• Let n ≥ 4. Prove that (5.11) holds for all values of i in the indicated
range.
5.84 Prove that for any real number x ≠ 1 and any integer N ≥ 0,
Σ_{n=0}^{N} x^n = (1 − x^(N+1))/(1 − x).
Take the interval I = [0, 2) of length 2 on the real line and, for each n ≥ 0, an
interval In of length 1/2^n . It is possible to place all intervals In with n ≥ 0
in I such that
5.86 Alexa, Tri, and Zoltan play the OddPlayer game: In one round, each
player flips a fair coin.
1. Assume that not all flips are equal. Then the coin flips of exactly two
players are equal. The player whose coin flip is different is called the
odd player. In this case, the odd player wins the game. For example, if
Alexa flips tails, Tri flips heads, and Zoltan flips tails, then Tri is the
odd player and wins the game.
2. If all three coin flips are equal, then the game is repeated.
Algorithm OddPlayer:
5.87 Two players P1 and P2 take turns rolling two fair and independent dice,
where P1 starts the game. The first player who gets a sum of seven wins the
game. Determine the probability that player P1 wins the game.
5.89 Two players P1 and P2 play a game in which they take turns flipping,
independently, a fair coin: First P1 flips the coin, then P2 flips the coin, then
P1 flips the coin, then P2 flips the coin, etc. The game ends as soon as the
sequence of coin flips contains either HH or T T . The player who flips the
coin for the last time is the winner of the game. For example, if the sequence
of coin flips is HT HT HH, then P2 wins the game.
Determine the probability that player P1 wins the game.
5.90 We flip a fair coin repeatedly and independently, and stop as soon as
we see one of the two sequences HT T and HHT . Let A be the event that
the process stops because HT T is seen.
• Prove that the event A is given by the set
{T^m (HT)^n HTT : m ≥ 0, n ≥ 0}.
In other words, event A holds if and only if the sequence of coin flips is equal to T^m (HT)^n HTT for some m ≥ 0 and n ≥ 0.
• Prove that Pr(A) = 1/3.
5.91 For i ∈ {1, 2}, consider the game Gi , in which two players P1 and P2
take turns flipping, independently, a fair coin, where Pi starts. The game
ends as soon as heads comes up. The player who flips heads first is the
winner of the game Gi . For j ∈ {1, 2}, consider the event
Bij = “Pj wins the game Gi ”.
In Section 5.15.2, we have seen that
Pr (B11 ) = Pr (B22 ) = 2/3 (5.12)
and
Pr (B12 ) = Pr (B21 ) = 1/3. (5.13)
Consider the game G, in which P1 and P2 take turns flipping, indepen-
dently, a fair coin, where P1 starts. The game ends as soon as a second heads
comes up. The player who flips the second heads wins the game. Consider
the event
A = “P1 wins the game G”.
In Section 5.15.3, we used an infinite series to show that
Pr(A) = 4/9. (5.14)
Use the Law of Total Probability (Theorem 5.9.1) to give an alternative proof
of (5.14). You are allowed to use (5.12) and (5.13).
• P2 has two coins. One of them is fair, whereas the other one is 2-headed
(Her Majesty is on both sides of this coin).
The two players P1 and P2 play a game in which they alternate making turns:
P1 starts, after which it is P2 ’s turn, after which it is P1 ’s turn, after which
it is P2 ’s turn, etc.
The player who flips heads first is the winner of the game.
• Determine the probability that P2 wins this game, assuming that all
random choices and coin flips made are mutually independent.
5.93 Jennifer loves to drink India Pale Ale (IPA), whereas Connor Hillen
prefers Black IPA. Jennifer and Connor decide to go to their favorite pub
Chez Lindsay et Simon. The beer menu shows that this pub has ten beers
on tap:
• Caboose IPA,
Each of the first five beers is an IPA, whereas each of the first two beers is a
Black IPA.
Jennifer and Connor play a game, in which they alternate ordering beer:
Connor starts, after which it is Jennifer’s turn, after which it is Connor’s
turn, after which it is Jennifer’s turn, etc.
• When it is Connor’s turn, he orders two beers; each of these is chosen
uniformly at random from the ten beers (thus, these two beers may be
equal).
• When it is Jennifer’s turn, she orders one of the ten beers, uniformly
at random.
The game ends as soon as (i) Connor has ordered at least one Black IPA, in
which case he pays the bill, or (ii) Jennifer has ordered at least one IPA, in
which case she pays the bill.
• Determine the probability that Connor pays the bill, assuming that all
random choices made are mutually independent.
5.94 You would like to generate a uniformly random bit, i.e., with proba-
bility 1/2, this bit is 0, and with probability 1/2, it is 1. You find a coin in
your pocket, but you are not sure if it is a fair coin: It comes up heads (H)
with probability p and tails (T ) with probability 1 − p, for some real number
p that is unknown to you. In particular, you do not know if p = 1/2. In this
exercise, you will show that this coin can be used to generate a uniformly
random bit.
Consider the following recursive algorithm GetRandomBit, which does
not take any input:
Algorithm GetRandomBit:
• The sample space S is the set of all sequences of coin flips that can oc-
cur when running algorithm GetRandomBit. Determine this sample
space S.
5.95 You would like to generate a biased random bit: With probability 2/3,
this bit is 0, and with probability 1/3, it is 1. You find a fair coin in your
pocket: This coin comes up heads (H) with probability 1/2 and tails (T )
with probability 1/2. In this exercise, you will show that this coin can be
used to generate a biased random bit.
Consider the following recursive algorithm GetBiasedBit, which does
not take any input:
Algorithm GetBiasedBit:
• The sample space S is the set of all sequences of coin flips that can occur
when running algorithm GetBiasedBit. Determine this sample space
S.
5.96 Both Alexa and Shelly have an infinite bitstring. Alexa’s bitstring is
denoted by a1 a2 a3 . . ., whereas Shelly’s bitstring is denoted by s1 s2 s3 . . ..
Alexa can see her bitstring, but she cannot see Shelly’s bitstring. Similarly,
Shelly can see her bitstring, but she cannot see Alexa’s bitstring. The bits
in both bitstrings are uniformly random and independent.
The ladies play the following game: Alexa chooses a positive integer k
and Shelly chooses a positive integer `. The game is a success if sk = 1 and
a` = 1. In words, the game is a success if Alexa chooses a position in Shelly’s
5.97 Alexa and Shelly take turns flipping, independently, a coin, where Alexa
starts. The game ends as soon as heads comes up. The lady who flips heads
first is the winner of the game.
Alexa proposes that they both use a fair coin. Of course, Shelly does
not agree, because she knows from Section 5.15.2 that this gives Alexa a
probability of 2/3 of winning the game.
The ladies agree on the following: Let p and q be real numbers with
0 < p < 1 and 0 ≤ q ≤ 1. Alexa uses a coin that comes up heads with
probability p, and Shelly uses a coin that comes up heads with probability q.
• Assume that p = 1/2. Determine the value of q for which Alexa and
Shelly have the same probability of winning the game.
• From now on, assume that 0 < p < 1 and 0 < q < 1.
• Let k and ` be two integers with 1 ≤ k < ` ≤ n. Prove that the events
Ak and A` are independent.
Bi = “ri = 1”
and
• Determine Pr (Ri ).
for i = 1 to m
do if ri = 1
then k = i
endif
endfor;
// k is the position of the rightmost 1 in the substring
// r1 r2 · · · rm .
// the next while-loop finds the position of the leftmost 1
// in the substring rm+1 rm+2 · · · rn , if this position exists.
` = m + 1;
while ` ≤ n and r` = 0
do ` = ` + 1
endwhile;
// if ` ≤ n, then ` is the position of the leftmost 1 in the
// substring rm+1 rm+2 · · · rn .
if ` ≤ n
then return `
else return k
endif
• Prove that
Pr (Em ) = (m/n) · (1/m + 1/(m + 1) + · · · + 1/(n − 1)).
• Prove that
Pr (A) = (m/n) · (1 + 1/m + 1/(m + 1) + · · · + 1/(n − 1)).
5.100 You realize that it is time to buy a pair of shoes. You look up all n
shoe stores in Ottawa and visit them in random order. While shopping, you
create a bitstring r1 r2 · · · rn of length n: For each i with 1 ≤ i ≤ n, you set
ri to 1 if and only if the i-th store has the best pair of shoes, among the first
i stores that you have visited.
• Use Exercise 5.98 to prove that this bitstring satisfies the condition in
Exercise 5.99.
After you have visited the first m shoe stores, you are bored of shopping.
You keep on visiting shoe stores, but as soon as you visit a store that has
a pair of shoes that you like more than the previously best pair you have
found, you buy the former pair of shoes.
• Use Exercise 5.99 to determine the probability that you buy the best
pair of shoes that is available in Ottawa.
Chapter 6
Random Variables and Expectation
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6}
X(i, j) = i + j
As another example, assume we flip three coins. The sample space is
S = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T },
where, e.g., T T H indicates that the first two coins come up tails and the third coin comes up heads.
Let X : S → R be the random variable that maps any outcome (i.e., any
element of S) to the number of heads in the outcome. Thus,
X(HHH) = 3,
X(HHT ) = 2,
X(HT H) = 2,
X(HT T ) = 1,
X(T HH) = 2,
X(T HT ) = 1,
X(T T H) = 1,
X(T T T ) = 0.
If Y : S → R is the random variable that
• maps an outcome to 1 if all three coins come up heads or all three coins come up tails, and
• maps every other outcome to 0,
then we have
Y (HHH) = 1,
Y (HHT ) = 0,
Y (HT H) = 0,
Y (HT T ) = 0,
Y (T HH) = 0,
Y (T HT ) = 0,
Y (T T H) = 0,
Y (T T T ) = 1.
Since a random variable is a function X : S → R, it maps any outcome
ω to a real number X(ω). Usually, we just write X instead of X(ω). Thus,
for any outcome in the sample space S, we denote the value of the random
variable, for this outcome, by X. In the example above, we flip three coins
and write
X = the number of heads
and
Y = 1 if all three coins come up heads or all three coins come up tails, and Y = 0 otherwise.
value event
X=0 {T T T }
X=1 {HT T, T HT, T T H}
X=2 {HHT, HT H, T HH}
X=3 {HHH}
X=4 ∅
Y =0 {HHT, HT H, HT T, T HH, T HT, T T H}
Y =1 {HHH, T T T }
Y =2 ∅
Thus, the event “X = x” corresponds to the set of all outcomes that are
mapped, by the function X, to the value x:
{ω ∈ S : X(ω) = x}.
Let us return to the example in which we flip three coins. Assume that
the coins are fair and the three flips are mutually independent. Consider
again the corresponding random variables X and Y . It should be clear how
we determine, for example, the probability that X is equal to 0, which we
will write as Pr(X = 0). Using our interpretation of “X = 0” as being the
event {T T T }, we get
Pr(X = 0) = Pr(T T T )
= 1/8.
Similarly, we get
{ω ∈ S : X(ω) ≥ x}.
For our three-coin example, the random variable X can take each of the
values 0, 1, 2, and 3 with a positive probability. As a result, “X ≥ 2”
denotes the event “X = 2 or X = 3”, and we have
Pr(X ≥ 2) = Pr(X = 2 ∨ X = 3)
= Pr(X = 2) + Pr(X = 3)
= 3/8 + 1/8
= 1/2.
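These probabilities are easy to check by enumeration; the following short Python sketch (helper names are ours) recomputes Pr(X = 0) and Pr(X ≥ 2) over the 8 equally likely outcomes:

from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))
X = lambda o: o.count("H")   # the number of heads

pr_X0  = Fraction(sum(1 for o in outcomes if X(o) == 0), len(outcomes))
pr_Xge2 = Fraction(sum(1 for o in outcomes if X(o) >= 2), len(outcomes))
print(pr_X0, pr_Xge2)   # 1/8 and 1/2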
Definition 6.2.1 Let (S, Pr) be a probability space and let X and Y be two
random variables on S. We say that X and Y are independent if for all real
numbers x and y, the events “X = x” and “Y = y” are independent, i.e.,
Pr(X = x ∧ Y = y) = Pr(X = x) · Pr(Y = y).
Consider again the example in which we flip three fair coins, with
X = the number of heads
and
Y = 1 if all three coins come up heads or all three coins come up tails, and Y = 0 otherwise.
Are these two random variables independent? Observe the following: If
Y = 1, then X = 0 or X = 3. In other words, if we are given some
information about the random variable Y (in this case, Y = 1), then the
random variable X cannot take, for example, the value 2. Based on this, we
take x = 2 and y = 1 in Definition 6.2.1. Since the event “X = 2 ∧ Y = 1”
is equal to ∅, we have
Pr(X = 2 ∧ Y = 1) = Pr(∅) = 0.
On the other hand, we have seen in Section 6.1.2 that Pr(X = 2) = 3/8 and
Pr(Y = 1) = 1/4. It follows that
Pr(X = 2 ∧ Y = 1) 6= Pr(X = 2) · Pr(Y = 1)
and, therefore, the random variables X and Y are not independent.
Now consider the random variable
Z = 1 if the first coin comes up heads, and Z = 0 if the first coin comes up tails.
We claim that the random variables Y and Z are independent. To verify
this, we have to show that for all real numbers y and z,
Pr(Y = y ∧ Z = z) = Pr(Y = y) · Pr(Z = z). (6.1)
Recall from Section 6.1.2 that Pr(Y = 1) = 1/4 and Pr(Y = 0) = 3/4.
Since the coin flips are independent, we have Pr(Z = 1) = 1/2 and Pr(Z =
0) = 1/2. Furthermore,
Pr(Y = 1 ∧ Z = 1) = Pr(HHH)
= 1/8,
Pr(Y = 1 ∧ Z = 0) = Pr(T T T )
= 1/8,
Pr(Y = 0 ∧ Z = 1) = Pr(HHT, HT H, HT T )
= 3/8,
Pr(Y = 0 ∧ Z = 0) = Pr(T HH, T HT, T T H)
= 3/8.
It follows that
Pr(Y = 1 ∧ Z = 1) = Pr(Y = 1) · Pr(Z = 1),
Pr(Y = 1 ∧ Z = 0) = Pr(Y = 1) · Pr(Z = 0),
Pr(Y = 0 ∧ Z = 1) = Pr(Y = 0) · Pr(Z = 1),
and
Pr(Y = 0 ∧ Z = 0) = Pr(Y = 0) · Pr(Z = 0).
Thus, (6.1) holds if (y, z) ∈ {(1, 1), (1, 0), (0, 1), (0, 0)}. For any other pair
(y, z), such as (y, z) = (3, 5) or (y, z) = (1, 2), at least one of the events
“Y = y” and “Z = z” is the empty set, i.e., cannot occur. Therefore, for
such pairs, we have
Pr(Y = y ∧ Z = z) = 0 = Pr(Y = y) · Pr(Z = z).
Thus, we have indeed verified that (6.1) holds for all real numbers y and z. As
a result, we have shown that the random variables Y and Z are independent.
Are the random variables X and Z independent? If X = 0, then all three
coins come up tails and, therefore, Z = 0. Thus,
Pr(X = 0 ∧ Z = 1) = Pr(∅) = 0,
whereas
Pr(X = 0) · Pr(Z = 1) = 1/8 · 1/2 ≠ 0.
As a result, the random variables X and Z are not independent.
We have defined the notion of two random variables being independent.
As in Definition 5.11.3, there are two ways to generalize this to sequences of
random variables:
Definition 6.2.2 Let (S, Pr) be a probability space, let n ≥ 2, and let
X1 , X2 , . . . , Xn be a sequence of random variables on S.
1. We say that this sequence is pairwise independent if for all real numbers
x1 , x2 , . . . , xn , the sequence “X1 = x1 ”, “X2 = x2 ”, . . . , “Xn = xn ” of
events is pairwise independent.
2. We say that this sequence is mutually independent if for all real numbers
x1 , x2 , . . . , xn , the sequence “X1 = x1 ”, “X2 = x2 ”, . . . , “Xn = xn ” of
events is mutually independent.
This defines a function that maps any real number x to the real number
Pr(X = x). This function is called the distribution function of the random
variable X: it is the function D : R → R defined by
D(x) = Pr(X = x) for all x ∈ R.
For example, consider a fair red die and a fair blue die, and assume we
roll them independently. The sample space is
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6},
where i is the result of the red die and j is the result of the blue die. Each
outcome (i, j) in S has the same probability of 1/36.
Let X be the random variable whose value is equal to the sum of the
results of the two dice. The matrix below gives all possible values of X. The
leftmost column gives the result of the red die, the top row gives the result
of the blue die, and each other entry is the corresponding value of X.
1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
As can be seen from this matrix, the random variable X can take any
value in {2, 3, 4, . . . , 12}. The distribution function D of X is given by
D(2) = 1/36, D(3) = 2/36, D(4) = 3/36, D(5) = 4/36, D(6) = 5/36, D(7) = 6/36,
D(8) = 5/36, D(9) = 4/36, D(10) = 3/36, D(11) = 2/36, D(12) = 1/36,
and D(x) = Pr(X = x) = 0 for every other real number x.
In Sections 6.6 and 6.7, we will see other examples of distribution func-
tions.
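The distribution function above is easy to tabulate by enumeration; the following sketch (using Python's Fraction type for exact values) reproduces the values of D:

from fractions import Fraction
from itertools import product
from collections import Counter

counts = Counter(i + j for i, j in product(range(1, 7), repeat=2))
D = {x: Fraction(c, 36) for x, c in sorted(counts.items())}
print(D)                 # D(2) = 1/36, D(3) = 2/36, ..., D(7) = 6/36, ..., D(12) = 1/36
print(sum(D.values()))   # the values add up to 1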
1 · Pr(1) + 2 · Pr(2) + 3 · Pr(3) = 1 · (4/5) + 2 · (1/10) + 3 · (1/10) = 13/10.
Rolling a die: Assume we roll a fair die. Define the random variable X to
be the value of the result. Then, X takes each of the values in {1, 2, 3, 4, 5, 6}
with equal probability 1/6, and we get
E(X) = 1 · (1/6) + 2 · (1/6) + 3 · (1/6) + 4 · (1/6) + 5 · (1/6) + 6 · (1/6)
= 7/2.
¹ The series Σ_{n=0}^{∞} an converges absolutely if the series Σ_{n=0}^{∞} |an | converges. If a series converges absolutely, then we can change the order of summation without changing the value of the series.
Now define the random variable Y to be equal to one divided by the result
of the die. In other words, Y = 1/X. This random variable takes each of the
values in {1, 1/2, 1/3, 1/4, 1/5, 1/6} with equal probability 1/6, and we get
E(Y ) = 1 · (1/6) + (1/2) · (1/6) + (1/3) · (1/6) + (1/4) · (1/6) + (1/5) · (1/6) + (1/6) · (1/6)
= 49/120.
Note that E(Y ) ≠ 1/E(X). Thus, this example shows that, in general,
E(1/X) ≠ 1/E(X).
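The two expected values can be recomputed exactly in a few lines; the point of this sketch is only that E(1/X) and 1/E(X) come out different:

from fractions import Fraction

faces = range(1, 7)
E_X = sum(Fraction(1, 6) * x for x in faces)                 # 7/2
E_inv = sum(Fraction(1, 6) * Fraction(1, x) for x in faces)  # 49/120
print(E_X, E_inv, 1 / E_X)   # 7/2, 49/120, 2/7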
Rolling two dice: Consider a fair red die and a fair blue die, and assume
we roll them independently. The sample space is
S = {(i, j) : 1 ≤ i ≤ 6, 1 ≤ j ≤ 6},
where i is the result of the red die and j is the result of the blue die. Each
outcome (i, j) in S has the same probability of 1/36.
Let X be the random variable whose value is equal to the sum of the
results of the two rolls. As a function X : S → R, we have X(i, j) = i+j. The
matrix below gives all possible values of X. The leftmost column indicates
the result of the red die, the top row indicates the result of the blue die, and
each other entry is the corresponding value of X.
1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
Lemma 6.4.2 Let (S, Pr) be a probability space and let X and Y be two
random variables on S. If X ≤ Y , then E(X) ≤ E(Y ).
we rearranged the terms in the summation. That is, instead of taking the
sum over all elements (i, j) in S,
• Determine all values x that X can take, i.e., determine the range of
the function X.
Theorem 6.5.1 Let (S, Pr) be a probability space. For any two random
variables X and Y on S, and for any two real numbers a and b,
E(aX + bY ) = a · E(X) + b · E(Y ).
Let us return to the example in which we roll two fair and independent
dice, one being red and the other being blue. Define the random variable
X to be the sum of the results of the two rolls. We have seen two ways to
compute the expected value E(X) of X. We now present a third way, which
is the easiest one: We define two random variables
Y = the result of the red die
and
Z = the result of the blue die.
In Section 6.4.1, we have seen that
E(Y ) = 1 · (1/6) + 2 · (1/6) + 3 · (1/6) + 4 · (1/6) + 5 · (1/6) + 6 · (1/6) = 7/2.
By the same computation, we have
E(Z) = 7/2.
Observe that
X = Y + Z and, thus, by the Linearity of Expectation, E(X) = E(Y ) + E(Z) = 7/2 + 7/2 = 7.
The following theorem states that the Linearity of Expectation also holds
for infinite sequences of random variables:
Theorem 6.5.3 Let (S, Pr) be a probability space and let X1 , X2 , . . . be an
infinite sequence of random variables on S such that the infinite series
Σ_{i=1}^{∞} E (|Xi |)
converges. Then,
E(Σ_{i=1}^{∞} Xi ) = Σ_{i=1}^{∞} E (Xi ).
Let us first verify that all probabilities add up to 1: Using Lemma 5.15.2, we
have
Σ_{k=1}^{∞} Pr (T^(k−1) H) = Σ_{k=1}^{∞} p(1 − p)^(k−1)
= p Σ_{k=1}^{∞} (1 − p)^(k−1)
= p Σ_{ℓ=0}^{∞} (1 − p)^ℓ
= p · 1/(1 − (1 − p))
= 1.
for any real number x with −1 < x < 1. Both sides of this equation are func-
tions of x and these two functions are equal to each other. If we differentiate
both sides, we get two derivatives that are also equal to each other:
Σ_{k=0}^{∞} k x^(k−1) = 1/(1 − x)^2.
If we take x = 1 − p, we get
E(X) = p Σ_{k=1}^{∞} k(1 − p)^(k−1)
= p · 1/(1 − (1 − p))^2
= p/p^2
= 1/p.
Definition 6.6.1 Let p be a real number with 0 < p < 1. A random variable
X has a geometric distribution with parameter p, if its distribution function
satisfies
Pr(X = k) = p(1 − p)^(k−1)
for any integer k ≥ 1.
Our calculation that led to the value of E(X) proves the following theo-
rem:
Theorem 6.6.2 Let p be a real number with 0 < p < 1 and let X be a
random variable that has a geometric distribution with parameter p. Then
E(X) = 1/p.
For example, if we flip a fair coin (in which case p = 1/2) repeatedly and
independently until it comes up heads for the first time, then the expected
number of coin flips is equal to 2.
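A quick simulation sketch of the geometric distribution shows the same behaviour for other values of p as well: the average number of flips until the first heads is close to 1/p.

import random

def flips_until_heads(p):
    # number of independent p-biased flips until the first heads
    k = 1
    while random.random() >= p:
        k += 1
    return k

trials = 200_000
for p in (0.5, 0.25):
    avg = sum(flips_until_heads(p) for _ in range(trials)) / trials
    print(p, avg, 1 / p)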
Thus, we have to determine Pr(X = k), i.e., the probability that in a sequence
of n independent coin flips, the coin comes up heads exactly k times.
To give an example, assume that n = 4 and k = 2. The table below gives
all (4 choose 2) = 6 sequences of 4 coin flips that contain exactly 2 H's, together with
their probabilities:
sequence probability
HHTT : p · p · (1 − p) · (1 − p) = p^2 (1 − p)^2
HTHT : p · (1 − p) · p · (1 − p) = p^2 (1 − p)^2
HTTH : p · (1 − p) · (1 − p) · p = p^2 (1 − p)^2
THHT : (1 − p) · p · p · (1 − p) = p^2 (1 − p)^2
THTH : (1 − p) · p · (1 − p) · p = p^2 (1 − p)^2
TTHH : (1 − p) · (1 − p) · p · p = p^2 (1 − p)^2
As can be seen from this table, each of the (4 choose 2) sequences has the same probability p^2 (1 − p)^2. It follows that, if n = 4,
Pr(X = 2) = (4 choose 2) p^2 (1 − p)^2.
As a sanity check, let us use Newton’s Binomial Theorem (i.e., Theorem 3.6.5)
to verify that all probabilities add up to 1:
Σ_{k=0}^{n} Pr(X = k) = Σ_{k=0}^{n} (n choose k) p^k (1 − p)^(n−k)
= ((1 − p) + p)^n
= 1.
We are now ready to compute the expected value of the random vari-
able X:
E(X) = Σ_{k=0}^{n} k · Pr(X = k)
= Σ_{k=0}^{n} k (n choose k) p^k (1 − p)^(n−k)
= Σ_{k=1}^{n} k (n choose k) p^k (1 − p)^(n−k).
Since
k (n choose k) = k · n!/(k!(n − k)!)
= n · (n − 1)!/((k − 1)!(n − k)!)
= n (n − 1 choose k − 1),
we get
E(X) = n Σ_{k=1}^{n} (n − 1 choose k − 1) p^k (1 − p)^(n−k).
By changing the summation variable from k to ` + 1, we get
E(X) = n Σ_{ℓ=0}^{n−1} (n − 1 choose ℓ) p^(ℓ+1) (1 − p)^(n−1−ℓ)
= pn Σ_{ℓ=0}^{n−1} (n − 1 choose ℓ) p^ℓ (1 − p)^(n−1−ℓ).
By Newton's Binomial Theorem, the summation on the right-hand side is equal to ((1 − p) + p)^(n−1) = 1 and, thus,
E(X) = pn · 1
= pn.
We have done the following: Our intuition told us that E(X) = pn. Then,
we went through a painful calculation to show that our intuition was correct.
There must be an easier way to show that E(X) = pn. We will show below
that there is indeed a much easier way.
Observe that
X = X1 + X2 + · · · + X n ,
because X counts the number of indices i for which Xi = 1, i.e., the number of flips that result in heads. Since E (Xi ) = Pr (Xi = 1) = p for each i, we conclude that
E(X) = Σ_{i=1}^{n} E (Xi )
= Σ_{i=1}^{n} p
= pn.
I hope you agree that this is much easier than what we did before.
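Both derivations can be mirrored in a few lines of Python (with an arbitrary choice of n and p for illustration); the painful sum and the indicator argument give the same exact value pn:

from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 3)

# the long way: E(X) = sum over k of k * C(n, k) * p^k * (1 - p)^(n - k)
direct = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

# the short way: E(X) = E(X1) + ... + E(Xn) = n * p
via_indicators = n * p

print(direct, via_indicators)   # both are 10/3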
The distribution function of the random variable X is given by (6.3).
This function is called a binomial distribution:
Our calculation that led to the value of E(X) proves the following theo-
rem:
R = r1 r2 . . . rn
0 0 1 1 1 1 1 0 0 0 1 1 0 0 0 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
E (Xi ) = Pr (Xi = 1) .
Since Xi = 1 if and only if all bits in the subsequence ri ri+1 . . . ri+k−1 are 0
or all bits in this subsequence are 1, we have
E (Xi ) = Pr (Xi = 1)
= (1/2)^k + (1/2)^k
= 1/2^(k−1).
Since
X = Σ_{i=1}^{n−k+1} Xi ,
the Linearity of Expectation gives
E(X) = Σ_{i=1}^{n−k+1} E (Xi ) = (n − k + 1)/2^(k−1).
max = −∞;
for i = 1 to n
do if si > max
then max = si (*)
endif
endfor;
return max
We would like to know the number of times that line (*) is executed,
i.e., the number of times that the value of the variable max changes. For
example, if the input sequence is
3, 2, 5, 4, 6, 1,
then the value of max changes 3 times; if the input sequence is
6, 5, 4, 3, 2, 1,
it changes only once, whereas if the input sequence is
1, 2, 3, 4, 5, 6,
it changes 6 times.
Assume that the input sequence s1 , s2 , . . . , sn is a uniformly random per-
mutation of the set {1, 2, . . . , n}. Thus, each permutation has probability
1/n! of being the input. We define a random variable X whose value is equal
to the number of times that line (*) is executed when running algorithm
FindMax(s1 , s2 , . . . , sn ). We are interested in the expected value E(X) of
this random variable.
The algorithm makes n iterations. In each iteration, line (*) is either
executed or not executed. We define, for each iteration, an indicator random
variable that tells us whether or not line (*) is executed during that iteration.
That is, for any i with 1 ≤ i ≤ n, we define
1 if line (*) is executed in the i-th iteration,
Xi =
0 otherwise.
Since
X = Σ_{i=1}^{n} Xi ,
the Linearity of Expectation gives E(X) = Σ_{i=1}^{n} E (Xi ) = Σ_{i=1}^{n} Pr (Xi = 1). Line (*) is executed in the i-th iteration if and only if si is the largest element among s1 , s2 , . . . , si . Since the input is a uniformly random permutation, each of these i elements is equally likely to be the largest and, thus,
Pr (Xi = 1) = 1/i.
This can be proved in a more formal way as follows: By the Product Rule,
the number of permutations s1 , s2 , . . . , sn of {1, 2, . . . , n} for which si is the
largest element among s1 , s2 , . . . , si is equal to
(n choose i) · (i − 1)! · (n − i)! = n!/i
and, thus,
Pr (Xi = 1) = (n!/i)/n! = 1/i.
Thus,
E(X) = Σ_{i=1}^{n} Pr (Xi = 1)
= Σ_{i=1}^{n} 1/i
= 1 + 1/2 + 1/3 + · · · + 1/n.
The number on the right-hand side is called the harmonic number and de-
noted by Hn . In the following subsection, we will show that Hn is approx-
imately equal to ln n. Thus, the expected number of times that line (*) of
algorithm FindMax is executed, when given as input a uniformly random
permutation of {1, 2, . . . , n}, is about ln n.
As a final remark, the indicator random variables X1 , X2 , . . . , Xn that we
have introduced above are mutually independent; see Exercise 5.98. Keep in
mind, however, that we do not need this, because the Linearity of Expectation
does not require these random variables to be mutually independent.
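A simulation sketch of algorithm FindMax on random permutations shows the expected number of executions of line (*) tracking Hn (and hence ln n) quite closely:

import random
from math import log

def max_updates(perm):
    # number of times line (*) would be executed on this input
    count, current_max = 0, float("-inf")
    for s in perm:
        if s > current_max:
            current_max, count = s, count + 1
    return count

n, trials = 1000, 2000
avg = sum(max_updates(random.sample(range(1, n + 1), n)) for _ in range(trials)) / trials
H_n = sum(1 / i for i in range(1, n + 1))
print(avg, H_n, log(n))   # all three values are close together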
For example, if we take f (x) = 1/x, then the summation is the harmonic
number Hn of the previous subsection.
For each i with 2 ≤ i ≤ n, draw the rectangle with bottom-left corner at
the point (i − 1, 0) and top-right corner at the point (i, f (i)), as in the figure
below.
(Figure: the graph y = f (x) together with the rectangles with corners (i − 1, 0) and (i, f (i)), for i = 2, . . . , n.)
The area of the i-th rectangle is equal to f (i) and, thus,
Σ_{i=1}^{n} f (i) ≤ f (1) + ∫_1^n f (x) dx. (6.4)
(Figure: the graph y = f (x) together with rectangles whose top sides lie above the graph, for x between 1 and n + 1.)
In this case, the graph y = f (x) is below the top sides of the rectangles
and, therefore,
Σ_{i=1}^{n} f (i) ≥ ∫_1^{n+1} f (x) dx. (6.5)
If we apply (6.4) and (6.5) to the function f (x) = 1/x, then we get
Hn = Σ_{i=1}^{n} 1/i ≤ 1 + ∫_1^{n} dx/x = 1 + ln n
and
Hn = Σ_{i=1}^{n} 1/i ≥ ∫_1^{n+1} dx/x = ln(n + 1) ≥ ln n.
We have proved the following result:
Lemma 6.8.3 For any integer n ≥ 1, the harmonic number Hn = Σ_{i=1}^{n} 1/i satisfies
ln n ≤ Hn ≤ 1 + ln n.
• the algorithm has not yet seen any of the elements in the subarray
A[i . . . n].
In the i-th iteration, the algorithm takes the element A[i] and repeatedly
swaps it with its left neighbor until the subarray A[1 . . . i] is sorted. The
pseudocode of this algorithm is given below.
Algorithm InsertionSort(A[1 . . . n]):
for i = 2 to n
do j = i;
while j > 1 and A[j] < A[j − 1]
do swap A[j] and A[j − 1];
j =j−1
endwhile
endfor
We are interested in the total number of swaps that are made by this
algorithm. The worst-case happens when the input array is sorted in reverse
order, in which case the total number of swaps is equal to
1 + 2 + 3 + · · · + (n − 1) = (n choose 2).
Thus, in the worst case, each of the (n choose 2) pairs of input elements is swapped.
Assume that the input array A[1 . . . n] contains a uniformly random per-
mutation of the set {1, 2, . . . , n}. Thus, each permutation has probability
1/n! of being the input. We define the random variable X to be the total
number of swaps made when running algorithm InsertionSort(A[1 . . . n]).
We will determine the expected value E(X) of X.
Since we want to count the number of pairs of input elements that are
swapped, we will use, for each pair of input elements, an indicator random
variable that indicates whether or not this pair gets swapped by the algo-
rithm. That is, for each a and b with 1 ≤ a < b ≤ n, we define
Xab = 1 if a and b get swapped by the algorithm, and Xab = 0 otherwise.
We observe that, since a < b, these two elements get swapped if and only
if in the input array, b is to the left of a. Since the input array contains a
uniformly random permutation, the events “b is to the left of a” and “a is to
the left of b” are symmetric. Therefore, we have
E (Xab ) = Pr (Xab = 1) = 1/2.
A formal proof of this is obtained by showing that there are n!/2 permutations
of {1, 2, . . . , n} in which b appears to the left of a and, thus, n!/2 permutations
in which a appears to the left of b. (See also Exercise 5.71.)
Since each pair of input elements is swapped at most once, we have
X = Σ_{a=1}^{n−1} Σ_{b=a+1}^{n} Xab .
By the Linearity of Expectation, it follows that
E(X) = Σ_{a=1}^{n−1} Σ_{b=a+1}^{n} E (Xab ) = (n choose 2) · 1/2 = n(n − 1)/4.
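The expected value n(n − 1)/4 is easy to confirm empirically; the sketch below counts the swaps made by insertion sort on random permutations (the function name is ours):

import random

def insertion_sort_swaps(a):
    # counts the swaps made by algorithm InsertionSort on a copy of the input
    a, count = list(a), 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
            count += 1
    return count

n, trials = 100, 2000
avg = sum(insertion_sort_swaps(random.sample(range(n), n)) for _ in range(trials)) / trials
print(avg, n * (n - 1) / 4)   # the average is close to n(n-1)/4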
if i < j
then p = uniformly random element in A[i . . . j];
compare p with all other elements in A[i . . . j];
rearrange A[i . . . j] such that it has the following
form (this rearranging defines the value of k):
(elements < p) | p | (elements > p), where the pivot p ends up at position k
QuickSort(A, i, k − 1);
QuickSort(A, k + 1, j)
endif
The element p is called the pivot. We have seen in Section 1.3 that the
worst-case running time of algorithm QuickSort(A, 1, n) is Θ(n2 ). In this
section, we will prove that the expected running time is only O(n log n).
We assume for simplicity that the input array is a permutation of the set
{1, 2, . . . , n}. We do not make any other assumption about the input. In
particular, we do not assume that the input is a random permutation. The
only place where randomization is used is when the pivot is chosen: It is
chosen uniformly at random in the subarray on which QuickSort is called.
The quantity that we will analyze is the total number of comparisons
(between pairs of input elements) that are made during the entire execution
of algorithm QuickSort(A, 1, n). In such a comparison, the algorithm takes
two distinct input elements, say a and b, and decides whether a < b or a > b.
Observe from the pseudocode that the only comparisons being made are
between the pivot and all other elements in the subarray that is the input
to the current call to QuickSort. Since the operation “compare a to b” is
the same as the operation “compare b to a” (even though the outcomes are
opposite), we will assume below that in such a comparison, a < b.
We define the random variable X to be the total number of comparisons
that are made by algorithm QuickSort(A, 1, n). We will prove that the
expected value of X satisfies E(X) = O(n log n).
For each a and b with 1 ≤ a < b ≤ n, we consider the indicator random
variable
Xab = 1 if a and b are compared to each other when running QuickSort(A, 1, n), and Xab = 0 otherwise.
– If p < a, then after the algorithm has rearranged the input ar-
ray, all elements of the set Sab are to the right of p and, thus,
all these elements are part of the input for the recursive call
QuickSort(A, k+1, n). During the rearranging, a and b have not
been compared to each other. However, they may be compared
to each other during later recursive calls.
– If p > b, then after the algorithm has rearranged the input ar-
ray, all elements of the set Sab are to the left of p and, thus,
all these elements are part of the input for the recursive call
QuickSort(A, 1, k−1). During the rearranging, a and b have not
been compared to each other. However, they may be compared
to each other during later recursive calls.
• in any recursive call, the pivot is chosen uniformly at random from the
subarray that is the input for this call, and
• at the start of the first recursive call in which the pivot belongs to the
set Sab , all elements of this set are part of the input for this call,
each of the b − a + 1 elements of Sab has the same probability of being the
where Hn is the harmonic number that we have seen in Sections 6.8.2 and 6.8.3.
Using Lemma 6.8.3, it follows that
E(X) ≤ 2n ln n.
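The bound 2n ln n can be checked by instrumenting a simple randomized quicksort; the sketch below counts, as in the analysis, one comparison of the pivot with each other element of the current subarray:

import random
from math import log

def quicksort_comparisons(a):
    # number of comparisons made by randomized quicksort on the list a
    if len(a) <= 1:
        return 0
    p = random.choice(a)                       # pivot chosen uniformly at random
    smaller = [x for x in a if x < p]
    larger = [x for x in a if x > p]
    return (len(a) - 1) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)

n, trials = 500, 200
avg = sum(quicksort_comparisons(list(range(n))) for _ in range(trials)) / trials
print(avg, 2 * n * log(n))   # the average stays below 2 n ln n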
The value of h is called the height of the skip list. An example of a skip list
of height h = 3 for the set S = {1, 2, 3, 4, 6, 7, 9} is shown in the figure below.
L3 7
L2 3 7
L1 1 3 4 7 9
L0 1 2 3 4 6 7 9
The algorithm that searches for a number x keeps track of the current node u
and the index i of the list Li that contains u. Initially, u is the root of the
skip list and i = h. At any moment, if i ≥ 1, the algorithm tests if the key
of right(u) is less than x. If this is the case, then u moves one node to the
right in the list Li ; otherwise, u moves to the node down(u) in the list Li−1 .
Once i = 0, node u moves to the right in the list L0 and stops at the last
node whose key is at most equal to x. The pseudocode of this algorithm
Search(x) is given below.
Algorithm Search(x):
The dashed arrows in the figure below show the path that is followed when
running algorithm Search(7). Note that if we replace “key(right(u)) < x”
in the first while-loop by “key(right(u)) ≤ x”, we obtain a different path that
ends in the same node: This path moves from the root to the node in L3
whose key is 7, and then it moves down to the list L0 . As we will see later,
using the condition “key(right(u)) < x” simplifies the algorithm for deleting
an element from the skip list.
L3 7
L2 3 7
L1 1 3 4 7 9
L0 1 2 3 4 6 7 9
• Flip a fair and independent coin repeatedly until it comes up tails for
the first time. Let k be the number of flips.
The figure below shows the skip list that results when inserting the number 5
into our example skip list. In this case, k = 3 and the new number is added
to the lists L0 , L1 , and L2 . The dashed arrows indicate the pointers that are
changed during this insertion.
[Figure: the skip list after inserting 5; the number 5 appears in L0, L1, and L2, and the pointers changed by the insertion are shown dashed.]
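The random number k of lists that the new number joins can be generated with a few lines of Python; this sketch (my own illustration) shows only that coin-flipping step, not the pointer updates.

import random

def insertion_height():
    # Flip a fair coin until it comes up tails; return the number k of flips.
    # The new number is then added to the lists L_0, L_1, ..., L_{k-1}.
    k = 1
    while random.random() < 0.5:    # heads with probability 1/2
        k += 1
    return k

# k is geometric with parameter 1/2, so the observed average is close to 2
print(sum(insertion_height() for _ in range(10**5)) / 10**5)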
• At this moment, it may happen that some of the lists Lh , Lh−1 , . . . only
consist of dummy nodes. If this is the case, delete these lists, and
update the height h and the root of the new skip list.
Lemma 6.11.1 For any number x that is stored in the list L0,
E(h(x)) = 1.
Lemma 6.11.2 For any number x that is stored in the list L0 and for any
i ≥ 0,
Pr(x ∈ Li) = 1/2^i.
Proof. The claim follows from the fact that x is contained in the list Li if
and only if the first i coin flips for x all result in heads.
Lemma 6.11.3 Let i ≥ 0 and let |Li | denote the number of nodes in the
list Li , ignoring the dummy node. Then,
E(|Li|) = n/2^i.
Lemma 6.11.4 Let X be the random variable whose value is equal to the
total number of nodes in all lists L0 , L1 , L2 , . . ., ignoring the dummy nodes.
Then,
E(X) = 2n.
Proof. We will give two proofs. In the first proof, we observe that
X = \sum_{i=0}^{h} |Li|
and, thus,
E(X) = E( \sum_{i=0}^{h} |Li| ).
Observe that the number of terms in the summation on the right-hand side
is equal to h + 1, which is a random variable. In general, the Linearity of
Expectation does not apply to summations consisting of a random number
of terms; see Exercise 6.64 for an example. Therefore, we proceed as follows.
Recall that, for the purpose of analysis, we have defined, for each integer
i > h, Li to be an empty list. It follows that
X = \sum_{i=0}^{∞} |Li|.
Using the Linearity of Expectation (i.e., Theorem 6.5.3) and Lemmas 6.11.3
and 5.15.2, we get
E(X) = E( \sum_{i=0}^{∞} |Li| )
     = \sum_{i=0}^{∞} E(|Li|)
     = \sum_{i=0}^{∞} n/2^i
     = n \sum_{i=0}^{∞} (1/2)^i
     = 2n.
In the second proof, we use the fact that each number x in L0 occurs in
exactly 1 + h(x) lists, namely L0 , L1 , . . . , Lh(x) . Thus, we have
X = \sum_{x} (1 + h(x)).
Using the Linearity of Expectation (i.e., Theorem 6.5.2) and Lemma 6.11.1,
we get
E(X) = E( \sum_{x} (1 + h(x)) )
     = \sum_{x} E(1 + h(x))
     = \sum_{x} (1 + E(h(x)))
     = \sum_{x} 2
     = 2n.
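A quick Monte Carlo check of Lemma 6.11.4 along the lines of the second proof (my own sketch): generate each height h(x) by coin flips and average the total number of non-dummy nodes, \sum_x (1 + h(x)).

import random

def total_nodes(n):
    # Each of the n numbers occupies 1 + h(x) nodes, where h(x) is the
    # number of heads flipped before the first tails.
    total = 0
    for _ in range(n):
        h = 0
        while random.random() < 0.5:
            h += 1
        total += 1 + h
    return total

n, trials = 1000, 200
print(sum(total_nodes(n) for _ in range(trials)) / trials)   # close to 2n = 2000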
Lemma 6.11.5 Recall that h is the random variable whose value is equal to
the height of the skip list. We have
E(h) ≤ log n + 1.
Proof. Since
h = max_x h(x),
we have
E(h) = E( max_x h(x) ).
It is tempting to claim that this is equal to
max_x E(h(x)),
which is equal to 1 by Lemma 6.11.1. (In Exercise 6.63, you will find a simple
example showing that, in general, the expected value of a maximum is not
equal to the maximum of the expected values.)
To prove a correct upper bound on E(h), we introduce, for each integer
i ≥ 1, an indicator random variable
Xi = 1 if the list Li stores at least one number, and Xi = 0 otherwise.
We observe that
h = \sum_{i=1}^{∞} Xi.
E (Xi ) ≤ 1. (6.6)
at least one number, then (6.7) becomes 1 ≤ |Li |, which is again a true
statement. Combining (6.7) with Lemmas 6.4.2 and 6.11.3, we obtain
If we apply (6.6) to the first summation and (6.8) to the second summation,
we get
E(h) ≤ \sum_{i=1}^{log n} 1 + \sum_{i=log n + 1}^{∞} n/2^i
     = log n + \sum_{j=0}^{∞} n/2^{log n + 1 + j}
     = log n + \sum_{j=0}^{∞} n/(n · 2^{1+j})
     = log n + \sum_{j=0}^{∞} 1/2^{1+j}
     = log n + (1/2) \sum_{j=0}^{∞} 1/2^j
     = log n + (1/2) · 2
     = log n + 1.
Lemma 6.11.6 Let Y be the random variable whose value is equal to the
total number of nodes in all lists L0 , L1 , L2 , . . ., including the dummy nodes.
Then
E(Y ) ≤ 2n + log n + 2.
[Figure: the example skip list, shown again (L2: 3, 7; L1: 1, 3, 4, 7, 9; L0: 1, 2, 3, 4, 6, 7, 9).]
Lemma 6.11.7 For any number x, let N be the random variable whose value
is equal to the number of nodes on the search path of algorithm Search(x).
Then,
E(N ) ≤ 2 log n + 5.
E(M ) ≤ 2 log n + 4.
Thus,
E(M) = E( h + 1 + \sum_{i=0}^{h} Mi )
     = E(h) + 1 + E( \sum_{i=0}^{h} Mi ).
As in the proof of Lemma 6.11.4, the number of terms in the latter summation
is equal to h + 1, which is a random variable. Therefore, we cannot apply
the Linearity of Expectation to this sum. As in the proof of Lemma 6.11.4,
we proceed as follows. We first observe that
M = h + 1 + \sum_{i=0}^{∞} Mi.
[Figure: the coin-flip labels (H, T, T, T, T, T) associated with the nodes of Li that are visited before the search path moves down to Li−1.]
E (Mi ) ≤ 1. (6.9)
Also, since Mi is less than or equal to the size |Li | of the list Li (ignoring the
dummy node), we have, using Lemmas 6.4.2 and 6.11.3,
We know from Lemma 6.11.5 that E(h) ≤ log n + 1. If we apply (6.9) to the
first summation and (6.10) to the second summation, we get
E(M) ≤ (log n + 1) + 1 + \sum_{i=0}^{log n} 1 + \sum_{i=log n + 1}^{∞} n/2^i
     = 2 log n + 3 + \sum_{i=log n + 1}^{∞} n/2^i.
We saw this infinite series in the proof of Lemma 6.11.5, where we showed that it is equal to 1. Thus, we conclude that
E(M ) ≤ 2 log n + 4.
6.12 Exercises
6.1 Consider a fair coin that has 0 on one side and 1 on the other side. We
flip this coin once and roll a fair die twice. Consider the following random
variables:
6.2 Consider the set S = {2, 3, 5, 30}. We choose a uniformly random ele-
ment x from this set. Consider the random variables
X = 1 if x is divisible by 2, and X = 0 otherwise,
Y = 1 if x is divisible by 3, and Y = 0 otherwise,
Z = 1 if x is divisible by 5, and Z = 0 otherwise.
6.3 Let a and b be real numbers. You flip a fair and independent coin three
times. For i = 1, 2, 3, let
f_i = a if the i-th coin flip results in heads, and f_i = b if the i-th coin flip results in tails.
X = f1 · f2 ,
Y = f2 · f3 .
6.4 Lindsay and Simon want to play a game in which the expected amount
of money that each of them wins is equal to zero. After having chosen a num-
ber x, the game is played as follows: Lindsay rolls a fair die, independently,
three times.
• If none of the three rolls results in 6, then Lindsay pays one dollar to
Simon.
• If exactly one of the rolls results in 6, then Simon pays one dollar to
Lindsay.
• If exactly two rolls result in 6, then Simon pays two dollars to Lindsay.
• You flip this coin twice; the two flips are independent. For each heads,
you win 3 dollars, whereas for each tails, you lose 2 dollars. Consider
the random variable
• You flip this coin 99 times; these flips are mutually independent. For
each heads, you win 3 dollars, whereas for each tails, you lose 2 dollars.
Consider the random variable
– If the chosen ball is red, then put it back, together with an addi-
tional red ball.
– If the chosen ball is blue, then put it back, together with an ad-
ditional blue ball.
Define the random variable X to be the fraction of the balls that are red,
after this experiment. Prove that E(X) = α.
6.7 The Ontario Lottery and Gaming Corporation (OLG) offers the follow-
ing lottery game:
• OLG chooses a winning number w in the set S = {0, 1, 2, . . . , 999}.
Assume that
• John plays this game once per day for one year (i.e., for 365 days),
• each day, John chooses x uniformly at random from the set S, inde-
pendently from previous choices.
Define the random variable X to be the total amount of dollars that John
wins during one year. Determine the expected value E(X).
Hint: Use the Linearity of Expectation.
6.8 Assume we flip a fair coin twice, independently of each other. Consider
the following random variables:
6.9 As of this writing, Ma Long is the number 1 ranked ping pong player in
the world. Simon Bose also plays ping pong, but he is not at Ma's level yet.
If you play a game of ping pong against Ma, then you win with probability p.
If you play a game against Simon, you win with probability q. Here, p and
q are real numbers such that 0 < p < q < 1. (Of course, p is much smaller
than q.) If you play several games against Ma and Simon, then the results
are mutually independent.
You have the choice between the following two series of games:
1. MSM : First, play against Ma, then against Simon, then against Ma.
2. SMS : First, play against Simon, then against Ma, then against Simon.
6.10 In order to attract more customers, the Hyacintho Cactus Bar and Grill
in downtown Ottawa organizes a game night, hosted by their star employee
Tan Tran.
After paying $26, a player gets two questions P and Q. If the player
gives the correct answer to question P , this player wins $30; if the player
gives the correct answer to question Q, this player wins $60. A player can
choose between the following two options:
Elisa decides to play this game. The probability that Elisa correctly
answers question P is equal to 1/2, whereas she correctly answers question
Q with probability 1/3. The events of correctly answering are independent.
• Assume Elisa chooses the first option. Define the random variable X
to be the amount of money that Elisa wins (this includes the $26 that
she has to pay in order to play the game). Determine the expected
value E(X).
• Assume Elisa chooses the second option. Define the random variable
Y to be the amount of money that Elisa wins (this includes the $26
that she has to pay in order to play the game). Determine the expected
value E(Y ).
6.11 Assume we roll two fair and independent dice, where one die is red and
the other die is blue. Let (i, j) be the outcome, where i is the result of the
red die and j is the result of the blue die. Consider the random variables
X =i+j
and
Y = i − j.
Are X and Y independent random variables?
6.12 Assume we roll two fair and independent dice, where one die is red and
the other die is blue. Let (i, j) be the outcome, where i is the result of the
red die and j is the result of the blue die. Consider the random variables
X = |i − j|
and
Y = max(i, j).
Are X and Y independent random variables?
and
Y = 0 if x is even, and Y = 1 if x is odd.
Are X and Y independent random variables?
6.14 Consider the 8-element set A = {a, b, c, d, e, f, g, h}. We choose a uni-
formly random 5-element subset B of A. Consider the following random
variables:
X = |B ∩ {a, b, c, d}|,
Y = |B ∩ {e, f, g, h}|.
• Determine the expected value E(X) of the random variable X.
• Are X and Y independent random variables?
6.15 You roll a fair die repeatedly and independently until the result is an
even number. Consider the random variables
X = the number of times you roll the die
and
Y = the result of the last roll.
For example, if the results of the rolls are 5, 1, 3, 3, 5, 2, then X = 6 and
Y = 2.
Prove that the random variables X and Y are independent.
6.16 Consider two random variables X and Y . If X and Y are independent,
then it can be shown that
E(XY ) = E(X) · E(Y ).
In this exercise, you will show that the converse of this statement is, in
general, not true.
Let X be the random variable that takes each of the values −1, 0, and 1
with probability 1/3. Let Y be the random variable with value Y = X^2.
6.20 Consider the following algorithm, which takes as input a large integer
n and returns a random subset A of the set {1, 2, . . . , n}:
Algorithm RandomSubset(n):
Define
max(A) = the largest element in A if A ≠ ∅, and max(A) = 0 if A = ∅,
min(A) = the smallest element in A if A ≠ ∅, and min(A) = 0 if A = ∅,
and the random variable
X = max(A) − min(A).
• Prove that the expected value E(X) of the random variable X satisfies
E(X) = n − 3 + f (n),
If the array A is not sorted and A[k] = i, where i ≠ k, then |A[k] − k| is equal
to the “distance” between the position of the value i in A and the position of
i in case the array were sorted. Thus, the summation in (6.11) is a measure
for the “sortedness” of the array A: If the summation is small, then A is
“close” to being sorted. On the other hand, if the summation is large, then
A is “far away” from being sorted. In this exercise, you will determine the
expected value of the summation in (6.11).
Assume that the array stores a uniformly random permutation of the set
{1, 2, . . . , n}. For each k = 1, 2, . . . , n, consider the random variable
Xk = |A[k] − k|,
and let
X = \sum_{k=1}^{n} X_k.
E(X_k) = (n + 1)/2 + (k^2 − k − kn)/n.
1 + 2 + 3 + · · · + m = m(m + 1)/2.
E(X) = (n^2 − 1)/3.
Hint: 1^2 + 2^2 + 3^2 + · · · + n^2 = n(n + 1)(2n + 1)/6.
B2 , C5 , C2 , C4 , B1 , C3 , C1 ,
then X = 2 and Y = 5.
• Prove that
E(Z) = 2 · E(X) − k.
• Prove that
E(Z) = k · (m − n)/(m + n).
6.26 You are given four fair and independent dice, each one having six faces:
1. One die is red and has the numbers 7, 7, 7, 7, 1, 1 on its faces.
2. One die is blue and has the numbers 5, 5, 5, 5, 5, 5 on its faces.
3. One die is green and has the numbers 9, 9, 3, 3, 3, 3 on its faces.
4. One die is yellow and has the numbers 8, 8, 8, 2, 2, 2 on its faces.
Let c be a color in the set {red, blue, green, yellow}. You roll the die of
color c. Define the random variable Xc to be the result of this roll.
• For each c ∈ {red, blue, green, yellow}, determine the expected value
E (Xc ) of the random variable Xc .
• Let c and c′ be two distinct colors in the set {red, blue, green, yellow}.
Determine
Pr(Xc < Xc′) + Pr(Xc > Xc′).
• Let c and c′ be two distinct colors in the set {red, blue, green, yellow}.
We say that the die of color c is better than the die of color c′, if
Pr(Xc > Xc′) > 1/2.
– Is the red die better than the blue die?
– Is the blue die better than the green die?
– Is the green die better than the yellow die?
– Is the yellow die better than the red die?
– Explain why these dice are called non-transitive dice.
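For the four dice of Exercise 6.26, the pairwise probabilities can be tabulated exactly by brute force. The following Python sketch only tabulates Pr(Xc > Xc′); it does not answer the questions above.

from itertools import product

dice = {
    "red":    [7, 7, 7, 7, 1, 1],
    "blue":   [5, 5, 5, 5, 5, 5],
    "green":  [9, 9, 3, 3, 3, 3],
    "yellow": [8, 8, 8, 2, 2, 2],
}

def prob_beats(c, c2):
    # Exact probability that the die of color c shows a larger number than
    # the die of color c2, for fair and independent dice.
    wins = sum(1 for x, y in product(dice[c], dice[c2]) if x > y)
    return wins / 36

for c, c2 in [("red", "blue"), ("blue", "green"), ("green", "yellow"), ("yellow", "red")]:
    print(c, "versus", c2, ":", prob_beats(c, c2))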
6.27 In this exercise, you are given a fair and independent coin. Let n ≥ 1
be an integer. Farah flips the coin n times, whereas May flips the coin n + 1
times. Consider the following two random variables:
X = the number of heads in Farah’s sequence of coin flips,
Y = the number of heads in May’s sequence of coin flips.
Let A be the event
A = “X < Y ”.
• Prove that
Pr(A) = (1/2^{2n+1}) \sum_{k=0}^{n} \sum_{ℓ=k+1}^{n+1} \binom{n}{k} \binom{n+1}{ℓ}.
– What is X + X′?
– What is Y + Y′?
– Let B be the event
B = “X′ < Y′”.
Prove that Pr(A) = Pr(B).
6.28 Elisa Kazan’s neighborhood pub serves three types of drinks: cider,
wine, and beer. Elisa likes cider and wine, but does not like beer.
After a week of hard work, Elisa goes to this pub and repeatedly orders
a random drink (the results of the orders are mutually independent). If she
gets a glass of cider or a glass of wine, then she drinks it and places another
order. As soon as she gets a pint of beer, she drinks it and takes a taxi home.
When Elisa orders one drink, she gets a glass of cider with probability 2/5,
a glass of wine with probability 2/5, and a pint of beer with probability 1/5.
Consider the random variables
• Use the results of the previous five parts to determine the expected
value E(Y ).
• Use the results of the previous three parts to determine the expected
value E(Y ).
6.29 You repeatedly flip a fair coin and stop as soon as you get tails followed
by heads. (All coin flips are mutually independent.) Consider the random
variable
X = the total number of coin flips.
For example, if the sequence of coin flips is HHHT T T T H, then X = 8.
We are going to add all elements in this matrix in two different ways. A
row-sum is the sum of all elements in one row, whereas a column-sum is the
sum of all elements in one column.
Note that the sum of all row-sums is equal to
x + 2x^2 + 3x^3 + 4x^4 + 5x^5 + · · · = \sum_{k=1}^{∞} k x^k.
6.32 Let 0 < p < 1 and consider a coin that comes up heads with probability
p and tails with probability 1 − p. We flip the coin independently until it
comes up heads for the first time. Define the random variable X to be the
number of times that we flip the coin. In Section 6.6, we have shown that
E(X) = 1/p. Below, you will prove this in a different way.
6.35 When Lindsay and Simon have a child, this child is a boy with prob-
ability 1/2 and a girl with probability 1/2, independently of the gender of
previous children. Lindsay and Simon stop having children as soon as they
have a girl. Consider the random variables
B = the number of boys that Lindsay and Simon have
and
G = the number of girls that Lindsay and Simon have.
Determine the expected values E(B) and E(G).
6.36 Let p be a real number with 0 < p < 1. When Lindsay and Simon
have a child, this child is a boy with probability p and a girl with probability
1 − p, independently of the gender of previous children. Lindsay and Simon
stop having children as soon as they have a child that has the same gender as
their first child. Define the random variable X to be the number of children
that Lindsay and Simon have. Determine the expected value E(X).
Hint: Recall that \sum_{k=1}^{∞} k x^{k−1} = 1/(1 − x)^2.
• E(Xi ) = 1.
Determine
Pr(X1 + X2 + · · · + Xn ≤ n).
6.38 The Ottawa Senators and the Toronto Maple Leafs play a best-of-seven
series: These two hockey teams play games against each other, and the first
team to win four games wins the series. Assume that
• in any game, the Sens have a probability of 3/4 of defeating the Leafs,
Determine the probability that seven games are played in this series.
6.39 Let n ≥ 1 be an integer, let p be a real number with 0 < p < 1, and
let X be a random variable that has a binomial distribution with parameters
n and p. In Section 6.7.1, we have seen that the expected value E(X) of X
satisfies
E(X) = \sum_{k=1}^{n} k \binom{n}{k} p^k (1 − p)^{n−k}. (6.12)
• Use (6.12) to prove that E(X) = pn, by taking the derivative, with
respect to y, in Newton’s Binomial Theorem.
– Determine E (X1 ).
– Let i be an integer with 2 ≤ i ≤ n. Use the Product Rule to
determine the number of permutations of {1, 2, . . . , n} for which
Xi = 1.
– Use these indicator random variables to determine E(X).
– Determine E (Y1 ).
– Let i be an integer with 2 ≤ i ≤ n. Use the Product Rule to
determine the number of permutations of {1, 2, . . . , n} for which
Yi = 1.
– Use these indicator random variables to determine E(X).
6.44 Lindsay Bangs and Simon Pratt visit their favorite pub that has 10
different beers on tap. Both Lindsay and Simon order, independently of each
other, a uniformly random subset of 5 beers.
• One of the beers available is Leo’s Early Breakfast IPA. Determine the
probability that this is one of the beers that Lindsay orders.
• Let X be the random variable whose value is the number of beers that
are ordered by both Lindsay and Simon. Determine the expected value
E(X) of X.
Hint: Use indicator random variables.
6.45 Lindsay and Simon have discovered a new pub that has n different
beers B1 , B2 , . . . , Bn on tap, where n ≥ 1 is an integer. They want to try
all different beers in this pub and agree on the following approach: During
a period of n days, they visit the pub every day. On each day, they drink
one of the beers. Lindsay drinks the beers in order, i.e., on the i-th day, she
drinks beer Bi . Simon takes a uniformly random permutation a1 , a2 , . . . , an
of the set {1, 2, . . . , n} and drinks beer B_{a_i} on the i-th day.
Let X be the random variable whose value is the number of days during
which Lindsay and Simon drink the same beer. Determine the expected value
E(X) of X.
Hint: Use indicator random variables.
6.47 Let A[1 . . . n] be an array of n numbers. Consider the following two al-
gorithms, which take as input the array A and a number x. If x is not present
in A, then these algorithms return the message “not present”. Otherwise,
they return an index i such that A[i] = x. The first algorithm runs linear
search from left to right, whereas the second algorithm runs linear search
from right to left.
i := 1;
while i ≤ n and A[i] ≠ x do i := i + 1 endwhile;
if i = n + 1 then return “not present” else return i endif

i := n;
while i ≥ 1 and A[i] ≠ x do i := i − 1 endwhile;
if i = 0 then return “not present” else return i endif
Consider the following algorithm, which again takes as input the array A
and a number x. If x is not present in A, then it returns the message “not
present”. Otherwise, it returns an index i such that A[i] = x.
Algorithm RandomLinearSearch(A, x):
Assume that the number x occurs exactly once in the array A and let k
be the index such that A[k] = x. Let X be the random variable whose
value is the number of times the test “A[i] ≠ x” is made in algorithm
RandomLinearSearch(A, x). (In words, X is the number of compar-
isons made by algorithm RandomLinearSearch(A, x).) Determine the
expected value E(X) of X.
6.48 Let n ≥ 3 be an integer and let p be a real number with 0 < p < 1.
Consider the set V = {1, 2, . . . , n}. We construct a graph G = (V, E) with
vertex set V , whose edge set E is determined by the following random process:
Each unordered pair {i, j} of vertices, where i ≠ j, occurs as an edge in E
with probability p, independently of the other unordered pairs.
A triangle in G is an unordered triple {i, j, k} of distinct vertices, such
that {i, j}, {j, k}, and {k, i} are edges in G.
for i = 2 to n
do j = i;
   while j > 1 and A[j] < A[j − 1]
   do swap A[j] and A[j − 1];
      j = j − 1
   endwhile
endfor
Consider an input array A[1 . . . n], where each element A[i] is chosen inde-
pendently and uniformly at random from the set {1, 2, . . . , m}.
• Let i and j be two indices with 1 ≤ i < j ≤ n, and consider the values
A[i] and A[j] (just before the algorithm starts). Prove that
Pr(A[i] > A[j]) = 1/2 − 1/(2m).
• Let X be the random variable that is equal to the number of times the
swap-operation is performed when running InsertionSort(A[1 . . . n]).
Determine the expected value E(X) of X.
• Determine E(X).
Hint: 1 + x + x^2 + x^3 + · · · + x^k = (1 − x^{k+1})/(1 − x).
6.51 Assume we have n balls and m boxes. We throw the balls independently
and uniformly at random in the boxes. Thus, for each k and i with 1 ≤ k ≤ n
and 1 ≤ i ≤ m,
1. lim_{n→∞} E(X)/n,
6.52 Let 0 < p < 1 and consider a coin that comes up heads with probability
p and tails with probability 1 − p. For each integer n, let bn be the outcome
when flipping this coin; thus, bn ∈ {H, T }. The values bn partition the set of
integers into intervals, where each interval is a maximal consecutive sequence
of zero or more T ’s followed by one H:
... H T T T H T H T T H H T H ...
. . . −2 −1 0 1 2 3 4 5 6 7 8 9 10 . . .
• Consider the interval that contains the integer 0, and let X be its
length. (In the example above, X = 4.) Determine the expected value
E(X) of X.
Hint: Use the Linearity of Expectation. The answer is not 1/p, which
is the expected number of coin flips until the first H.
• Assume that, when you guess the number in box Bi , you do not remem-
ber the numbers stored in B1 , B2 , . . . , Bi−1 . Then, the only reasonable
thing you can do is to take a random element in {1, 2, . . . , n} and guess
that this random element is stored in Bi .
Assume that you do this for each i with 1 ≤ i ≤ n. Let X be the
random variable whose value is equal to the number of times that your
guess is correct. Compute the expected value E(X) of X.
• Now assume that your memory is perfect, so that, when you guess the
number in box Bi , you know the numbers stored in B1 , B2 , . . . , Bi−1 .
How would you make the n guesses such that the following is true: If
Y is the random variable whose value is equal to the number of times
that your guess is correct, then the expected value E(Y ) of Y satisfies
E(Y ) = Ω(log n).
X = the number of indices i such that Pi and Pi+1 have the same birthday.
6.58 Nick wants to know how many students cheat on the assignments. One
approach is to ask every student “Did you cheat?”. This obviously does not
work, because every student will answer “I did not cheat”. Instead, Nick
uses the following ingenious scheme, which gives a reasonable estimate of the
number of cheaters, without identifying them.
We denote the students by S1 , S2 , . . . , Sn . Let k denote the number of
cheaters. Nick knows the value of n, but he does not know the value of k.
For each i with 1 ≤ i ≤ n, Nick does the following:
1. Nick meets student Si and asks “Did you cheat?”.
2. Student Si flips a fair coin twice, independently of each other; Si does
not show the results of the coin flips to Nick.
(a) If the coin flips are HH or HT , then Si is honest in answering the
question: If Si is a cheater, then he answers “I cheated”; otherwise,
he answers “I did not cheat”.
(b) If the coin flips are T H, then Si answers “I cheated”.
(c) If the coin flips are T T , then Si answers “I did not cheat”.
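A simulation sketch of this scheme (my own illustration): a cheater answers “I cheated” with probability 3/4 and a non-cheater with probability 1/4, so the count of “I cheated” answers can be turned into an estimate of k; the estimator 2·(answers − n/4) used below is an assumption of this sketch, not necessarily the one the exercise has in mind.

import random

def simulate_answers(n, k):
    # Students 0, ..., k-1 are cheaters; every student follows the two-coin protocol.
    count = 0
    for i in range(n):
        first, second = random.random() < 0.5, random.random() < 0.5  # True = heads
        if first:                      # HH or HT: answer honestly
            count += 1 if i < k else 0
        elif second:                   # TH: answer "I cheated"
            count += 1
        # TT: answer "I did not cheat"
    return count

n, k = 1000, 300
answers = simulate_answers(n, k)
print(answers, 2 * (answers - n / 4))   # the second value is an estimate of k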
6.59 You roll a fair die repeatedly, and independently, until you have seen
all of the numbers 1, 2, 3, 4, 5, 6 at least once. Consider the random variable
X = the number of times you roll the die.
For example, if you roll the sequence
5, 5, 3, 5, 1, 3, 4, 2, 5, 2, 1, 3, 6,
then X = 13.
Determine the expected value E(X) of the random variable X.
Hint: Use the Linearity of Expectation. If you have seen exactly i different
elements from the set {1, 2, 3, 4, 5, 6}, how many times do you expect to roll
the die until you see a new element from this set?
6.60 Michiel’s Craft Beer Company (MCBC) sells n different brands of India
Pale Ale (IPA). When you place an order, MCBC sends you one bottle of
IPA, chosen uniformly at random from the n different brands, independently
of previous orders.
Simon Pratt wants to try all different brands of IPA. He repeatedly places
orders at MCBC (one bottle per order) until he has received at least one
bottle of each brand.
Define the random variable X to be the total number of orders that Simon
places. Determine the expected value E(X) of the random variable X.
Hint: Use the Linearity of Expectation. If Simon has received exactly i
different brands of IPA, how many orders does he expect to place until he
receives a new brand?
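A Monte Carlo sketch for this exercise (an illustration only, not a solution): simulate Simon's orders and compare the observed average with n times the n-th harmonic number.

import random

def orders_until_complete(n):
    # Order uniformly random brands until all n brands have been received.
    seen, orders = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        orders += 1
    return orders

n, trials = 50, 2000
avg = sum(orders_until_complete(n) for _ in range(trials)) / trials
print(avg, n * sum(1 / i for i in range(1, n + 1)))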
6.61 MCBC still sells n different brands of IPA. As in Exercise 6.60, when
you place an order, MCBC sends you one bottle of IPA, chosen uniformly at
random from the n different brands, independently of previous orders.
Simon Pratt places m orders at MCBC. Define the random variable X to
be the total number of distinct brands that Simon receives. Determine the
expected value E(X) of X.
Hint: Use indicator random variables.
6.62 You are given an array A[0 . . . n−1] of n numbers. Let D be the number
of distinct numbers that occur in this array. For each i with 0 ≤ i ≤ n − 1,
let Ni be the number of elements in the array that are equal to A[i].
• Show that D = \sum_{i=0}^{n−1} 1/N_i.
6.63 One of Jennifer and Thomas is chosen uniformly at random. The person
who is chosen wins $100. Consider the random variables
Prove that
E(max(J, T)) ≠ max(E(J), E(T)).
6.65 Let k ≥ 0 be an integer and let T be a full binary tree, whose levels
are numbered 0, 1, 2, . . . , k. (The root is at level 0, whereas the leaves are
at level k.) Assume that each edge of T is removed with probability 1/2,
independently of other edges. Denote the resulting graph by T′.
Define the random variable X to be the number of nodes that are connected to the root by a path in T′; the root itself is included in X.
In the left figure below, the tree T is shown for the case when k = 3. The right figure shows the tree T′: The dotted edges are those that have been removed from T, the black nodes are connected to the root by a path in T′, whereas the white nodes are not connected to the root by a path in T′. For this case, X = 6.
[Figure: left, the tree T for k = 3; right, the graph T′ with the removed edges drawn dotted.]
• Prove that the expected value E(X) of the random variable X is equal
to
E(X) = log(n + 1).
6.66 Let n ≥ 2 be a power of two and consider a full binary tree with n leaves.
Let a1 , a2 , . . . , an be a random permutation of the numbers 1, 2, . . . , n. Store
this permutation at the leaves of the tree, in the order a1 , a2 , . . . , an , from
left to right. For example, if n = 8 and the permutation is 2, 8, 1, 4, 6, 3, 5, 7,
then we obtain the following tree:
[Figure: a full binary tree whose eight leaves store, from left to right, 2, 8, 1, 4, 6, 3, 5, 7.]
• At each level, take all pairs of consecutive nodes that have the same
parent. For each such pair, compare the numbers stored at the two
nodes, and store the smaller of these two numbers at the common
parent.
For our example tree, we obtain the following tree:
[Figure: the tree after the comparisons; the children of the root store 1 and 3, the next level stores 2, 1, 3, 5, and the leaves store 2, 8, 1, 4, 6, 3, 5, 7.]
It is clear that at the end of this process, the root stores the number 1.
Define the random variable X to be the number that is not equal to 1 and
that is stored at a child of the root; think of X being the “loser of the final
game”. For our example tree, X = 3.
In this exercise, you will determine the expected value E(X) of the random
variable X.
Prove that
E(X) = Pr(X ≥ 1) + \sum_{k=1}^{n/2} Pr(X ≥ k + 1).
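A simulation sketch of the random variable X in this exercise (it estimates E(X), it proves nothing): play the tournament on a random permutation stored at the leaves and record the number at the child of the root that is not 1.

import random

def loser_of_final(n):
    # n must be a power of two; run the min-tournament on a random permutation.
    level = list(range(1, n + 1))
    random.shuffle(level)
    while len(level) > 2:
        level = [min(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return max(level)     # one child of the root stores 1; the other stores the "loser"

n, trials = 8, 100000
print(sum(loser_of_final(n) for _ in range(trials)) / trials)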
6.67 If X is a random variable that can take any value in {1, 2, 3, . . .}, and
A is an event, then the conditional expected value E(X | A) is given by
E(X | A) = \sum_{k=1}^{∞} k · Pr(X = k | A).
In words, E(X | A) is the expected value of X, when you are given that the
event A occurs.
You roll a fair die repeatedly, and independently, until you see the num-
ber 6. Define the random variable X to be the number of times you roll the
die (this includes the last roll, in which you see the number 6). It follows
from Theorem 6.6.2 that E(X) = 6. Let A be the event
A = “the results of all rolls are even numbers”.
Determine the conditional expected value E(X | A).
Hint: E(X | A) ≠ 3. Recall that \sum_{k=1}^{∞} k · x^{k−1} = 1/(1 − x)^2.
6.68 For any integer n ≥ 0 and any real number x with 0 < x < 1, define
the function
F_n(x) = \sum_{k=n}^{∞} \binom{k}{n} x^k.
(Using the ratio test from calculus, it can be shown that this infinite series
converges for any fixed integer n.)
• Prove that for any integer n ≥ 0 and any real number x with 0 < x < 1,
F_n(x) = x^n / (1 − x)^{n+1}
and
F_n′(x) = (x^n + n · x^{n−1}) / (1 − x)^{n+2}.
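A numeric sanity check of the claimed closed form, using a truncated partial sum of the series (a sketch, not a proof):

from math import comb

def F_partial(n, x, terms=200):
    # Partial sum of F_n(x) = sum over k >= n of C(k, n) x^k.
    return sum(comb(k, n) * x**k for k in range(n, n + terms))

n, x = 3, 0.25
print(F_partial(n, x), x**n / (1 - x)**(n + 1))   # the two values agree closely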
6.69 Consider a fair red coin and a fair blue coin. We repeatedly flip both
coins, and keep track of the number of times that the red coin comes up
heads. As soon as the blue coin comes up tails, the process terminates.
A formal description of this process is given in the pseudocode below.
The value of the variable i is equal to the number of iterations performed
so far, the value of the variable h is equal to the number of times that the
red coin came up heads so far, whereas the Boolean variable stop is used to
decide when the while-loop terminates.
Algorithm RandomCoinFlips:
// both the red coin and the blue coin are fair
// all coin flips are mutually independent
i = 0;
h = 0;
stop = false;
while stop = false
do i = i + 1;
flip the red coin;
if the result of the red coin is heads
then h = h + 1
endif;
flip the blue coin;
if the result of the blue coin is tails
then stop = true
endif
endwhile;
return i and h
Assume that the value of the random variable Y is equal to some integer
n ≥ 0. In this exercise, you will determine the expected value of the random
variable X.
Thus, we are interested in the conditional expected value E(X | Y = n),
which is the expected value of X (i.e., the number of iterations of the while-
loop), when you are given that the event “Y = n” (i.e., during the while-loop,
the red coin comes up heads n times) occurs. Formally, we have
E(X | Y = n) = \sum_k k · Pr(X = k | Y = n),
where the summation ranges over all values of k that X can take.
The functions F_n and F_n′ that are used below are the same as those in Exercise 6.68.
• Prove that
Pr(Y = 0) = \sum_{k=1}^{∞} Pr(Y = 0 | X = k) · Pr(X = k).
Pr(Y = n) = Fn (1/4).
• Prove that
Pr(Y = 0) = 1/3.
• Let n ≥ 1 be an integer. Prove that
E(X | Y = n) = F_n′(1/4) / (4 · F_n(1/4)).
• Prove that
E(X | Y = 0) = 4/3.
6.70 Let (S, Pr) be a probability space, and let X and Y be two identical
non-negative random variables on S. Thus, for all ω in S, X(ω) = Y (ω) ≥ 0.
Consider the new probability space (S^2, Pr), where S^2 is the Cartesian product S × S and
Pr((ω1, ω2)) = Pr(ω1) · Pr(ω2)
for all elements (ω1, ω2) in S^2. (In words, we choose two elements ω1 and ω2 in S, independently of each other.)
Consider the random variable Z on S^2 defined by
min(a^2, b^2) ≤ ab.
• Prove that
E(Z) ≤ (E(X))^2.
6.71 Carleton University has implemented a new policy for students who
cheat on assignments:
1. When a student is caught cheating, the student meets with the Dean.
2. The Dean has a box that contains n coins. One of these coins has the
number n written on it, whereas each of the other n − 1 coins has the
number 1 written on it. Here, n is a very large integer.
4. If x is the number written on the chosen coin, then the student gives
x2 bottles of cider to Elisa Kazan.
(Note that Z = X^2.)
• Prove that
E(X) = 2 − 1/n ≤ 2.
• Prove that
E(Z) = n + 1 − 1/n ≥ n.
• Prove that
E(X^2) ≠ O((E(X))^2).
1. The student chooses a uniformly random coin from the box (and
puts it back in the box).
2. Again, the student chooses a uniformly random coin from the box
(and puts it back in the box).
3. If x is the number written on the first chosen coin, and y is the
number written on the second chosen coin, then the student gives
min(x2 , y 2 ) bottles of cider to Elisa.
E(W ) ≤ 4.
Chapter 7

The Probabilistic Method
The Probabilistic Method is a very powerful and surprising tool that uses
probability theory to prove results in discrete mathematics. In this chapter,
we will illustrate this method using several examples.
[Figure: a graph with vertices a, b, c, d, e and eight edges.]
For example, in the graph above, let A = {a, d} and B = {b, c, e}. Then
four of the eight edges are between A and B, namely {a, b}, {a, e}, {d, c},
and {d, e}. Thus, the vertex set of this graph can be partitioned into two
subsets A and B, such that at least half of G’s edges are between A and B.
The following theorem states that this is true for any graph.
Theorem 7.1.1 Let G = (V, E) be a graph with m edges. The vertex set V
of G can be partitioned into two subsets A and B such that the number of
edges between A and B is at least m/2.
Then
X = \sum_{i=1}^{m} X_i
and
E(X) = E( \sum_{i=1}^{m} X_i )
     = \sum_{i=1}^{m} E(X_i)
     = \sum_{i=1}^{m} Pr(X_i = 1).
To determine Pr (Xi = 1), let ei have vertices a and b. The following ta-
ble shows the four possibilities for a and b; each one of them occurs with
probability 1/4.
a ∈ A, b ∈ A: X_i = 0
a ∈ A, b ∈ B: X_i = 1
a ∈ B, b ∈ A: X_i = 1
a ∈ B, b ∈ B: X_i = 0
Assume the claim in the theorem does not hold. Then, no matter how
we partition the vertex set V into A and B, the number of edges between
A and B will be less than m/2. In particular, the random variable X will
always be less than m/2. But then, E(X) < m/2 as well, contradicting that
E(X) = m/2.
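The random partition used in this proof is easy to experiment with. The following Python sketch assigns each vertex to A or B with a fair coin and averages the number of edges between A and B over many trials; the small example graph (eight edges on the vertices a, ..., e) is my own choice, not the graph of the figure above.

import random

def random_cut_size(vertices, edges):
    in_A = {v: random.random() < 0.5 for v in vertices}    # fair coin per vertex
    return sum(1 for (u, v) in edges if in_A[u] != in_A[v])

vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "c"), ("a", "e"), ("b", "c"), ("b", "e"),
         ("c", "d"), ("c", "e"), ("d", "e")]
trials = 100000
print(sum(random_cut_size(vertices, edges) for _ in range(trials)) / trials, len(edges) / 2)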
has a positive probability. This, in turn, will imply that the statement in the theorem holds: if the statement did not hold, then this probability would be zero.
Thus, it remains to prove that Pr(A) < 1. The vertex set of Kn has exactly \binom{n}{k} subsets of size k. We denote these subsets by V_i, i = 1, 2, . . . , \binom{n}{k}. For each i with 1 ≤ i ≤ \binom{n}{k}, consider the event
A_i = “the vertices of V_i form a group of k mutual friends or a group of k mutual strangers”.
Since the event A_i occurs if and only if the edges joining the \binom{k}{2} pairs of vertices of V_i are either all solid or all dashed, we have
Pr(A_i) = 2 / 2^{\binom{k}{2}}.
By the Union Bound, it follows that
Pr(A) ≤ \sum_{i=1}^{\binom{n}{k}} Pr(A_i)
      = \sum_{i=1}^{\binom{n}{k}} 2 / 2^{\binom{k}{2}}
      = 2\binom{n}{k} / 2^{\binom{k}{2}}.
If we can show that the quantity in the last line is less than one, then the
proof is complete. We have
2\binom{n}{k} / 2^{\binom{k}{2}}
  = ( n(n − 1)(n − 2) · · · (n − k + 1) / k! ) · ( 2 / 2^{(k^2 − k)/2} )
  ≤ ( n^k / k! ) · ( 2^{1 + k/2} / 2^{k^2/2} ).
Since n ≤ ⌊2^{k/2}⌋ ≤ 2^{k/2}, we get
2\binom{n}{k} / 2^{\binom{k}{2}}
  ≤ ( (2^{k/2})^k / k! ) · ( 2^{1 + k/2} / 2^{k^2/2} )
  = 2^{1 + k/2} / k!.
By Exercise 2.8, we have k! > 2^{1 + k/2} for k ≥ 3. Thus, we conclude that
2\binom{n}{k} / 2^{\binom{k}{2}} < 1.
Take, for example, k = 20 and n = 1024. Theorem 7.2.1 states that there
exists a group of 1024 people that does not contain a subgroup of 20 mutual
friends and does not contain a subgroup of 20 mutual strangers. In fact, the
proof shows more: Consider a group of 1024 people such that any two are
friends with probability 1/2, and strangers with probability 1/2. The above
proof shows that Pr(A), i.e., the probability that there is a subgroup of 20
mutual friends or there is a subgroup of 20 mutual strangers, satisfies
Pr(A) ≤ 2^{1 + k/2}/k! = 2^{11}/20!.
Therefore, with probability at least
1 − 2^{11}/20! = 0.999999999999999158,
(there are 15 nines) this group does not contain a subgroup of 20 mutual
friends and does not contain a subgroup of 20 mutual strangers.
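The arithmetic in this example can be verified with a few lines of Python (a sketch of the calculation only):

from math import comb, factorial

k, n = 20, 1024
union_bound = 2 * comb(n, k) / 2 ** comb(k, 2)   # 2 C(n, k) / 2^C(k, 2)
simplified = 2 ** (1 + k // 2) / factorial(k)    # 2^(1 + k/2) / k! = 2^11 / 20!
print(union_bound, simplified, 1 - simplified)   # 1 - 2^11/20! ≈ 0.999999999999999158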
Aj = {a1 , a2 , . . . , aj }.
A1 = {3},
A2 = {1, 3},
A3 = {1, 3, 4},
A4 = {1, 2, 3, 4}.
permutation of Si
The Product Rule of Section 3.1 shows that there are k!(n − k)! many per-
mutations of S that have this property. Therefore, since we chose a random
permutation of S, we have
E(X_i) = Pr(X_i = 1)
       = k!(n − k)! / n!
       = 1 / \binom{n}{k}
       = 1 / \binom{n}{|S_i|}.
Thus, since
X = \sum_{i=1}^{m} X_i,
we get
E(X) = E( \sum_{i=1}^{m} X_i )
     = \sum_{i=1}^{m} E(X_i)
     = \sum_{i=1}^{m} 1 / \binom{n}{|S_i|}.
The binomial coefficient \binom{n}{k} is maximized when k = ⌊n/2⌋; i.e., the largest value in the n-th row of Pascal's Triangle (see Section 3.8) is in the middle. Thus,
\binom{n}{|S_i|} ≤ \binom{n}{⌊n/2⌋},
implying that
1 ≥ \sum_{i=1}^{m} 1 / \binom{n}{|S_i|}
  ≥ \sum_{i=1}^{m} 1 / \binom{n}{⌊n/2⌋}
  = m / \binom{n}{⌊n/2⌋}.
We conclude that
m ≤ \binom{n}{⌊n/2⌋}.
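For a tiny value of n, the bound m ≤ \binom{n}{⌊n/2⌋} can also be confirmed by brute force; the following Python sketch enumerates families of subsets of {1, . . . , n} and finds the largest one in which no set contains another.

from itertools import combinations
from math import comb

n = 4
subsets = list(range(1 << n))   # the 2^n subsets of {1, ..., n}, encoded as bitmasks

def is_antichain(family):
    # True if no member of the family is contained in another member.
    return all((a & b) != a and (a & b) != b for a, b in combinations(family, 2))

largest = 0
for r in range(1, len(subsets) + 1):
    if any(is_antichain(f) for f in combinations(subsets, r)):
        largest = r
    else:
        break                    # no antichain of size r, hence none of any larger size

print(largest, comb(n, n // 2))  # both values are 6 when n = 4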
7.4 The Jaccard Distance between Finite Sets
X △ Y = (X \ Y) ∪ (Y \ X),
i.e., the set consisting of all elements in X that are not in Y and all elements
in Y that are not in X.
[Figure: Venn diagram of X and Y, indicating the parts X \ Y and Y \ X.]
• 0 ≤ dJ (X, Y ) ≤ 1.
• dJ (X, X) = 0.
• If X ∩ Y = ∅, then dJ (X, Y ) = 1.
In the rest of this section, we will prove that the Jaccard distance satisfies
the triangle inequality:
We will present two proofs of this result. The first proof uses “brute
force”: We consider the Venn diagram for the sets X, Y , and Z. Based on
this diagram, we transform the inequality in Theorem 7.4.1 into an equiv-
alent algebraic inequality. We then argue that the algebraic inequality is
valid. In the second proof, we show that the inequality in Theorem 7.4.1
can be rephrased as an inequality involving probabilities. The result then
follows by straightforward applications of Lemma 5.3.6 and the Union Bound
(Lemma 5.3.5).
[Figure: Venn diagram of the sets X, Y, and Z; the seven regions are labeled a, b, c, d, e, f, g.]
x1, x2, x3, . . . , xn
i = min{ℓ : x_ℓ ∈ X},
j = min{ℓ : x_ℓ ∈ Y},
k = min{ℓ : x_ℓ ∈ Z}.
Pr(A_XY) = 1 − |X ∩ Y| / |X ∪ Y| = dJ(X, Y).
If we also define the events
A_XZ = “i ≠ k”
and
A_YZ = “j ≠ k”,
then we have, by the same arguments,
Pr(A_XZ) = dJ(X, Z)
and
Pr(A_YZ) = dJ(Y, Z).
Thus, the inequality in Theorem 7.4.1 is equivalent to
Pr(A_XZ) ≤ Pr(A_XY) + Pr(A_YZ).
Since
i ≠ k ⇒ i ≠ j ∨ j ≠ k,
Lemma 5.3.6 implies that
Pr(A_XZ) ≤ Pr(A_XY ∨ A_YZ).
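A randomized sanity check of Theorem 7.4.1 (a sketch; dJ is computed directly as 1 − |X ∩ Y|/|X ∪ Y|, with the distance of two empty sets taken to be 0):

import random

def jaccard_distance(X, Y):
    union = X | Y
    return 0.0 if not union else 1 - len(X & Y) / len(union)

random.seed(1)
for _ in range(10000):
    X, Y, Z = (set(random.sample(range(10), random.randint(1, 10))) for _ in range(3))
    assert jaccard_distance(X, Z) <= jaccard_distance(X, Y) + jaccard_distance(Y, Z) + 1e-12
print("the triangle inequality held in all trials")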
1. For any two edges {a, b} and {a′, b′} of E, the intersection of the line segments f(a)f(b) and f(a′)f(b′) is empty or consists of exactly one point.
2. For any edge {a, b} in E and any vertex c in V, the point f(c) is not in the interior of the line segment f(a)f(b).
3. For any three edges {a, b}, {a′, b′}, and {a″, b″} of E, the line segments f(a)f(b), f(a′)f(b′), and f(a″)f(b″) do not have a point in common that is in the interior of any of these line segments.
For simplicity, we do not distinguish any more between a graph and its
embedding. That is, a vertex a refers to both an element of V and the point
in the plane that represents a. Similarly, an edge refers to both an element
of E and the line segment that represents it.
How many edges can G have? Since G has v vertices, we obviously have e ≤ \binom{v}{2} = Θ(v^2), an upper bound which holds for any graph with v vertices. Since our graph G is planar, we expect a much smaller upper bound on e: If G has Θ(v^2) edges, then it seems to be impossible to draw G without edge crossings. Below, we will prove that e is, in fact, at most linear in v. The proof will use Euler's Theorem for planar graphs:
Proof. The idea of the proof is as follows. We start by removing all edges
from G (but keep all vertices), and show that (7.4) holds. Then we add back
the edges of G, one by one, and show that (7.4) remains valid throughout
this process.
After having removed all edges, we have e = 0 and the embedding consists
of a collection of v points. Since f = 1 and c = v, the relation v −e+f = c+1
holds.
Assume the relation v − e + f = c + 1 holds and consider what happens
when we add an edge ab. There are two possible cases.
Case 1: Before adding the edge ab, the vertices a and b belong to the same
connected component.
• the number f of faces increases by one (because the edge ab splits one
face into two),
Usually, Euler’s Theorem is stated for connected planar graphs, i.e., pla-
nar graphs for which c = 1:
v − e + f = 2.
On the other hand, since G is connected and v ≥ 4, each face has at least three edges on its boundary, i.e., m_i ≥ 3. It follows that
\sum_{i=1}^{f} m_i ≥ 3f.
Since each edge is on the boundary of at most two faces, the m_i sum to at most 2e; combining this with the inequality above gives 3f ≤ 2e, i.e., f ≤ (2/3)e.
Since H is planar, we know from Theorem 7.5.4 that the number of its edges
is bounded from above by three times the number of its vertices minus six,
i.e.,
e + 2 · cr (G) ≤ 3(v + cr (G)) − 6.
By rewriting this inequality, we obtain the following result:
Theorem 7.5.6 For any graph G with v ≥ 3 vertices and e edges, we have
cr (G) ≥ e − 3v + 6.
Since Kn has \binom{n}{2} edges and any two of them cross at most once, we have
cr(G_p) − e_p + 3v_p ≥ 6,
cr(G_p) − e_p + 3v_p ≥ 0,
E(e_p) = E( \sum_{i=1}^{e} X_i ) = \sum_{i=1}^{e} E(X_i) = \sum_{i=1}^{e} p^2 = p^2 e.
Let ab and cd be the edges of G that cross in the i-th crossing.² This crossing appears as a crossing in G_p if and only if both ab and cd are edges in G_p. Since the points a, b, c, and d are pairwise distinct, it follows that the i-th crossing of G appears as a crossing in G_p with probability p^4. Thus,
E(Y_i) = Pr(Y_i = 1) = p^4.
Since x_p = \sum_{i=1}^{cr(G)} Y_i, it follows that
E(x_p) = E( \sum_{i=1}^{cr(G)} Y_i ) = \sum_{i=1}^{cr(G)} E(Y_i) = \sum_{i=1}^{cr(G)} p^4 = p^4 · cr(G).
p^4 · cr(G) − p^2 e + 3pv ≥ 0,
which we rewrite as
cr(G) ≥ (p^2 e − 3pv) / p^4. (7.8)
Observe that this inequality holds for any real number p with 0 < p ≤ 1.
If we assume that e ≥ 4v, and take p = 4v/e (so that 0 < p ≤ 1), then we obtain a new lower bound on the crossing number:
cr(G) ≥ e^3 / (64 v^2).
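To get a feeling for how much stronger this is than Theorem 7.5.6, the two lower bounds can be compared numerically (a small sketch; the bound e^3/(64 v^2) is only claimed under the assumption e ≥ 4v):

def old_bound(v, e):
    return e - 3 * v + 6          # Theorem 7.5.6

def new_bound(v, e):
    return e**3 / (64 * v**2)     # the bound derived above, assuming e >= 4v

for v, e in [(100, 400), (100, 1000), (100, 4950)]:   # the last pair corresponds to K_100
    print(v, e, old_bound(v, e), new_bound(v, e))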
Remark 7.5.8 Let n be a very large integer and consider the complete graph Kn with v = n vertices and e = \binom{n}{2} edges. Let us see what happens if we repeat the proof for this graph. We choose a random subgraph G_p of
² By our definition of embedding, see Section 7.5.1, there are exactly two edges that determine the i-th crossing.
7.6 Exercises
7.1 Prove that, for any graph G with m edges, the sequence X1 , X2 , . . . , Xm
of random variables in the proof of Theorem 7.1.1 is pairwise independent.
Give an example of a graph for which this sequence is not mutually inde-
pendent.
7.2 Prove that Theorem 7.5.4 also holds if G is not connected.
7.3 Let K5 be the complete graph on 5 vertices. In this graph, each pair of
vertices is connected by an edge. Prove that K5 is not planar.
7.4 Let G be any embedding of a connected planar graph with v ≥ 4 vertices.
Assume that this embedding has no triangles, i.e., there are no three vertices
a, b, and c, such that ab, bc, and ac are edges of G.
• Prove that G has at most 2v − 4 edges.
• Let K3,3 be the complete bipartite graph on 6 vertices. The vertex set
of this graph consists of two sets A and B, both of size three, and each
vertex of A is connected by an edge to each vertex of B. Prove that
K3,3 is not planar.
7.5 Consider the numbers Rn that were defined in Section 4.8. In Sec-
tion 4.8.1, we proved that Rn = O(n8 ). Prove that Rn = O(n4 ).
7.6 Let n be a sufficiently large positive integer and consider the complete
graph Kn . This graph has vertex set V = {1, 2, . . . , n}, and each pair of
distinct vertices is connected by an undirected edge. (Thus, Kn has \binom{n}{2} edges.)
Let \vec{K}_n be the directed graph obtained by making each edge {i, j} of Kn a directed edge; thus, in \vec{K}_n, this edge either occurs as the directed edge (i, j) from i to j or as the directed edge (j, i) from j to i.
We say that three pairwise distinct vertices i, j, and k define a directed triangle in \vec{K}_n, if
• (i, j), (j, k), and (k, i) are edges in \vec{K}_n, or
• (i, k), (k, j), and (j, i) are edges in \vec{K}_n.
Prove that there exists a way to direct the edges of Kn, such that the number of directed triangles in \vec{K}_n is at least (1/4)\binom{n}{3}.
7.7 Let G = (V, E) be a graph with vertex set V and edge set E. A subset I
of V is called an independent set if for any two distinct vertices u and v in I,
(u, v) is not an edge in E. For example, in the following graph, I = {a, e, i}
is an independent set.
[Figure: a graph on the vertices a, b, c, d, e, f, g, h, i in which I = {a, e, i} is an independent set.]
Step 1: Set H = G.
• Prove that
E(Z) ≥ n^2/(4m).
• Argue that this implies that the graph G contains an independent set
of size at least n^2/(4m).
7.8 Elisa Kazan is having a party at her home. Elisa has a round table
that has 52 seats numbered 0, 1, 2, . . . , 51 in clockwise order. Elisa invites 51
friends, so that the total number of people at the party is 52. Of these 52
people, 15 drink cider, whereas the other 37 drink beer.
In this exercise, you will prove the following claim: No matter how the 52
people sit at the table, there is always a consecutive group of 7 people such
that at least 3 of them drink cider.
From now on, we consider an arbitrary (which is not random) arrange-
ment of the 52 people sitting at the table.
• Let k be a uniformly random element of the set {0, 1, 2, . . . , 51}. Con-
sider the consecutive group of 7 people that sit in seats k, k + 1, k +
2, . . . , k + 6; these seat numbers are to be read modulo 52. Define the
random variable X to be the number of people in this group that drink
cider. Prove that E(X) > 2.
Hint: Number the 15 cider drinkers arbitrarily as P1 , P2 , . . . , P15 . For
each i with 1 ≤ i ≤ 15, consider the indicator random variable
X_i = 1 if P_i sits in one of the seats k, k + 1, k + 2, . . . , k + 6, and X_i = 0 otherwise.
• For the given arrangement of the 52 people sitting at the table, prove
that there is a consecutive group of 7 people such that at least 3 of
them drink cider.
Hint: Assume the claim is false. What is an upper bound on E(X)?
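A Monte Carlo sketch of the first part of this exercise (an illustration, not a proof): fix an arbitrary arrangement, pick a uniformly random seat k, and average the number of cider drinkers in the window of seven seats.

import random

cider_seats = set(range(15))      # one arbitrary arrangement: cider drinkers in seats 0..14

def cider_in_window(k):
    # Number of cider drinkers among the seats k, k+1, ..., k+6 (mod 52).
    return sum(1 for j in range(7) if (k + j) % 52 in cider_seats)

trials = 200000
print(sum(cider_in_window(random.randrange(52)) for _ in range(trials)) / trials)  # exceeds 2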