LectureNotes VT23 Part0
Part 0
Jan-Fredrik Olsen
Contents
8 The derivative
8.1 Computational rules for the derivative
8.2 Proof of the computational rules for the derivative
8.3 Differentiation formulas for elementary functions
8.4 Exam exercises
8.5 Answers to selected exercises
Chapter 1
CHAPTER 1. A CRASH COURSE ON MATHEMATICAL PROOFS
In this section, we briefly discuss the point of proofs in mathematics, and formulate the
first part of the rulebook for the real numbers.
In these lecture notes, one of the main points of mathematical proofs is to understand
why what we learned in high school mathematics is true. A second, and perhaps more
important, goal is to systematically develop mathematical theorems that go far beyond
what you learned in high school.
When building a mathematical theory, mathematicians can be guided by various
motivations. For instance, here are some criteria that mathematicians use when trying
to figure out whether a mathematical statement is interesting or not:
While the two latter criteria are subjective, and are judged completely differently
by mathematicians from different fields, they are considered to be important. As the
influential mathematician G. H. Hardy famously wrote in his short, and quite readable,
book "A Mathematician's Apology":
But what does it mean for a statement to be true? Well, here are some examples of
true statements:
While the above statements seem to be true, the real question here is how do we know
this? Ideally, we would want to find a mathematical proof for each of these statements.
However, this poses a problem: to prove that a statement is true, we need to show that it
is the logical consequence of some other true statement. But what is our starting point?
If we make no initial assumption that some statements are to be considered true without
proof, then we have nothing to work with.
This might be shocking, but mathematics has the same fundamental problem as
physics and religion: how can something be created from nothing?
Let there be light! The first part of the rulebook for the real numbers
Where physicists have their big-bang, and priests have their moment of creation, math-
ematicians have axioms. These are an initial selection of mathematical statements that
we choose to believe are true.
In order to keep things friendly, we introduce the axioms in four batches that we can
think of as a "rulebook" for what we are allowed to do with real numbers. The first
batch is the longest, but also contains the most "obvious" axioms. Here goes:
The rulebook for R (part 1 of 4) There exists a unique set of numbers having the
properties listed in the four parts of this rulebook. We call this set of numbers the real
numbers and denote them by R.
We begin with two axioms that essentially tell us what we mean by equalities.
Note that since we do not want to go too deeply into mathematical logic, we only give a
rather informal formulation of rule E2 which sometimes is called the replacement rule.
Next, we give five axioms that tell us how addition works:
The number a from axiom A4 is usually denoted −x and is called the additive inverse
of x. Moreover, subtraction is defined as the addition of additive inverses:
In particular, this means that all rules for subtraction follow from the rules of addition.
Next, here are five axioms that tell us how multiplication works:
The number b from M4 is usually denoted by 1/x or x^{−1} and called the multiplicative
inverse or the reciprocal of x. Moreover, we define the quotient x/y to mean
x/y = x · (1/y).
In particular, this means that all rules for division follow from the rules of multiplication.
Finally, we include one axiom that tells us how addition and multiplication interact:
(AM) For all x,y,z we have z · (x + y) = z · x + z · y.
Fig. 5. Don’t worry too much if your attention drifts while trying to read these rules.
Just like Carcassonne, to learn "the game of maths" you really need to start playing
to get anywhere.
Exercise 1.2 The rulebook only seems to guarantee the existence of the numbers 0
and 1. Here we ask you to explore some immediate ways to extend the rulebook:
(a) Use the rulebook to suggest a definition for the positive integers 2, 3, 4, . . .
(b) Similarly, suggest a definition for the negative integers −2, −3, −4, . . .
(c) What does the rulebook say about the fractions c/c and c/1?
(d) Suggest a meaning for the symbols a^2 and a^{−2}.
(e) Suggest a meaning for a^0. (Hint: what should a^2 · a^{−2} be equal to?)
Remark 1.3 (Some additional notations) If you look in other textbooks, the axioms
may be formulated slightly differently. In particular, you may see specialised notations
being used. As an example, here is a “compact” way to formulate A4 :
∀x ∈ R, ∃a ∈ R : x + a = 0.
Here, ∀ is an upside-down ‘A’ meaning “for all”, ∃ is a backwards ‘E’ meaning “there
exists” and ∈ is a variant of the Greek letter epsilon meaning “element of” or “belongs to”.
The colon “:” has several uses in mathematical notation. Here, it is supposed to mean
“such that”. We will occasionally use these notations throughout these lecture notes.
As you will see below, while calling the axioms by their labels such as M1 is efficient
in computations, it is also confusing, as we quickly forget which axiom a given label refers
to. Some names we have already given above (such as the name for the replacement
rule). Here are some others that are often used, and that you should know:
Remark 1.5 (Are mathematical truths absolute?) By what we write above, you
should suspect that there is something fishy when we claim that mathematicians are able
to establish "absolute" truths. The problem is that all proofs have to start from some
collection of axioms that we have to take on faith. That is, any mathematical statement
can only be shown to be true relative to other mathematical statements. And some of
these – the axioms – can never themselves be shown to be true.
1.2. PROOFS BY CHAINS OF EQUALITIES AND COUNTER-EXAMPLES
In the above example, we used not only axioms from the rulebook, but also the
assumption, or hypothesis, on the number 0. This is completely valid, and in fact, most
mathematical statements are of the form: If such and such, then some conclusion holds.
Exercise 1.8 (a) Prove that the number 1, whose existence is given by axiom
(M3 ), is the only number satisfying x · 1 = x for all real x.
(b) Prove that all numbers x have exactly one additive inverse.
(c) Prove that all numbers x ≠ 0 have exactly one multiplicative inverse.
Example 1.10 Here is a chain of equalities to prove that (−1) · a = (−a) holds for all
real numbers a (as in Example 1.9, a crucial step is to add 0):
(−1) · a = (−1) · a + 0 = (−1) · a + (a + (−a))
= ((−1) · a + a) + (−a)
= ((−1) · a + 1 · a) + (−a)
= ((−1) + 1) · a + (−a)
= 0 · a + (−a) = 0 + (−a) = −a.
Exercise 1.11 (a) Justify the steps in the above example by referring to the rule-
book, an assumption or previously proved statement.
(b) Use a chain of equalities, similar to the one used in the previous example, to
prove that (−1) · (−1) = 1.
(c) Explain why it now follows that −(−1) = 1.
(a + b)(a − b) = a^2 − b^2.
= (a + b)a + (a + b)(−b)
= a(a + b) + (−b)(a + b)
= aa + ab + (−b)a + (−b)b = a^2 + ab − ab − b^2.
Exercise 1.13 Justify the steps in the above example by referring to the rulebook,
an assumption or previously proved statement.
Exercise 1.14 Formulate and prove formulas for how to "multiply out" the following
expressions.
(a) (a + b) · (c + d) (b) (a + b) · (a + b).
Remark 1.15 (How detailed do my proofs need to be?) Notice that while the
proof in Example 1.12 is perhaps not so complicated, we still write it out in a level of
detail that we would not use in a normal computation. For instance, we are so used
to real numbers being commutative, that we would never waste ink writing something
like (a + b)a = a(a + b). Similarly, we rarely find it necessary to write something like
(a − b) = (a + (−b)).
Indeed, to make proofs more readable, we often omit steps that we consider to be
"routine". This is something every mathematician does, and so will we. The problem is
to understand what we can safely assume to be "routine". There is no simple answer to
this, and it is ultimately also a question of style. Some mathematicians simply write out
more details than others. However, context matters. For instance, in this part of the
chapter we are explicitly discussing the axioms from the first part of the rulebook. It
therefore makes sense to be explicit about how we use these rules. But in the same way
that it is natural, and necessary, for a two-year-old to inform the world every time they go
to the toilet, there is a point when such information just stops having any function.
Exercise 1.17 Use the result of the above example to prove that for all real numbers
a, b, C, D such that C, D ≠ 0, we have
(a/C) · (b/D) = (ab)/(CD).
Hint: By definition, a/C = a · (1/C).
Before dealing with the division rule, let us mention that the result of the above
example is useful when adding fractions. Notice how multiplication by 1 is a crucial
step:
1/2 + 1/3 = (1/2) · 1 + (1/3) · 1
= (1/2) · (3/3) + (1/3) · (2/2)
= 3/6 + 2/6 = 5/6.
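The computation above can be replayed with exact rational arithmetic. The following is a minimal sketch using Python's standard `fractions` module; the helper name `add_fractions` is our own invention, and the code only tests the "multiply by 1" idea on examples, which is no substitute for a proof.

```python
from fractions import Fraction

def add_fractions(a, C, b, D):
    """Add a/C and b/D over the common denominator C*D (hypothetical helper)."""
    if C == 0 or D == 0:
        raise ValueError("denominators must be non-zero")
    # Multiply a/C by 1 = D/D and b/D by 1 = C/C, then add the numerators.
    return Fraction(a * D + b * C, C * D)

# The worked example: 1/2 + 1/3 = 3/6 + 2/6 = 5/6.
print(add_fractions(1, 2, 1, 3))                                      # 5/6
print(add_fractions(1, 2, 1, 3) == Fraction(1, 2) + Fraction(1, 3))   # True
```

Note that `Fraction` reduces its result to lowest terms, so the intermediate form 3/6 + 2/6 is not visible in the output.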
Exercise 1.18 (a) Justify how to do the last step of the above computation.
(b) Use the above as inspiration to prove that for all real numbers a, b, C, D with
C, D ≠ 0, we have
a/C + b/D = (aD + bC)/(CD).
What remains is to try to figure out what happens when we divide a fraction by a
fraction. Again, multiplying by 1 is the key idea. For instance, we have
(1/2)/(5/3) = (1/2)/(5/3) · 1 = (1/2)/(5/3) · (6/6) = ((1/2) · 6)/((5/3) · 6) = 3/10.
Exercise 1.19 Use the procedure from the above computation to prove the division
rule from Example 1.1:
(a/C)/(b/D) = (aD)/(bC).
In particular, point out what we need to assume about the real numbers a, b, C, D for
this rule to apply.
We now give an example where we show how "complex" fractions are simplified in
practice. Notice that the main idea is repeated use of the trick "multiply by one":
Finally,
(1/(1/x + 1/y)) / (1/(x + y)) = (xy/(x + y)) / (1/(x + y))
= (xy/(x + y)) / (1/(x + y)) · (x + y)/(x + y)
= ((xy/(x + y)) · (x + y)) / ((1/(x + y)) · (x + y))
= xy/1 = xy.
Exercise 1.21 Simplify the following expressions as much as you can by using the
trick of multiplication by one.
(a) 1/(1 + 1/x) (b) (a − 1/a)/(1 − 1/a) (c) (1 − 1/(u + 1))/(1/(u − 1) + 1)
(d) a/(a − 1) − 1/(1 − a) (e) 1 + 4x/(x − 1)^2 (f) a/(x − a) + (x − a)/(x + a) − (x + a)/a
Example 1.22 Suppose that someone claims that the following formula is true:
∀a, b ∈ R : (a + b)^2 = a^2 + b^2. (∗)
To prove that this statement is false, all we need to do is to find a counter-example.
That is, a pair of numbers a, b so that the left and right-hand sides are not equal. In
fact, if we plug, say, a = 1, b = 1, into the formula, we get
4 = 2.
Since this is clearly false, we conclude that the formula cannot hold for all a, b.
Done!
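Hunting for a counter-example is a task a computer can help with. The following is a minimal sketch in Python; the function name `find_counterexample` and the choice of small integers as candidates are our own assumptions. Finding one pair refutes a "for all" formula, whereas finding none proves nothing.

```python
from itertools import product

def find_counterexample(lhs, rhs, candidates):
    """Return the first pair (a, b) with lhs(a, b) != rhs(a, b), or None."""
    for a, b in product(candidates, repeat=2):
        if lhs(a, b) != rhs(a, b):
            return (a, b)
    return None

def claimed_lhs(a, b):
    return (a + b) ** 2

def claimed_rhs(a, b):
    return a ** 2 + b ** 2

# Search small integers; a single hit is enough to refute formula (*).
print(find_counterexample(claimed_lhs, claimed_rhs, range(-2, 3)))  # (-2, -2)
```

The search happens to stop at a = b = −2 rather than at a = b = 1, which illustrates that counter-examples are rarely unique.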
Remark 1.24 Note that by multiplying out, we get (a + b)^2 = a^2 + 2ab + b^2. Now
this (correct) formula certainly looks different than (∗). However, this is not enough to
conclude that (∗) is false. Indeed, formulas may look different, but still be the same (of
course, this is not the case here).
Exercise 1.25 Are the following formulas true? For each, first (i) try to find a
counter-example to the formula. If you cannot do this, then (ii) try to find a chain of
equalities that proves the formula.
(a) 1/(a(a + 1)) = 1/a − 1/(a + 1), ∀a ∈ R such that a ≠ −1 and a ≠ 0.
(b) 1/(a + b) = 1/a + 1/b, ∀a, b ∈ R such that a ≠ 0, b ≠ 0, a + b ≠ 0.
1) As a formula/identity. This was essentially the topic of the previous section.
Interpreted as a formula, we would typically claim that the expression in Figure 8 holds
“for all” numbers x. To prove a formula, we often use a chain of equalities.
2) As an equation. This is the topic of this section. From this point of view, we ask:
for which x is the left-hand side equal to the right-hand side? Does it hold for all
x, some x or no x? To investigate this, we study how to solve equations. To this
end, chains of equivalences and implications will come in handy.
Example 1.26 (Curly bracket notation for sets) Anything written inside of curly
brackets {. . .} should be read as “the set of”. For instance, the set of complex numbers
is expressed as C = {a + ib : a, b ∈ R and i^2 = −1}. Here, the “:” means “such that”.
This notation is often used in combination with the symbols ∀, ∃, ∈ (see Remark 1.3).
Exercise 1.28 The existence of solutions to an equation may depend on the context.
Indeed, do you recognise the following sets?
(a) {x ∈ R : x^2 + 1 = 0} (b) {x ∈ C : x^2 + 1 = 0}
Hint: Here you may need the symbol ∅ which denotes the set with no elements.
1
A set is a collection of elements. For instance, in these lecture notes, we usually consider sets of real
numbers such as the interval [1,3].
1.3. PROOF BY CHAINS OF IMPLICATIONS AND CASES
Example 1.29 For all x ∈ R, the following statements are all true:
(i) x = 1 is equivalent to x − 1 = 0.
(ii) x = 1 is equivalent to 2x = 2.
(iii) x = 1 is not equivalent to x − 2 = 0.
(iv) x = 1 is not equivalent to x^2 = 1.
If two statements are equivalent, we can denote this by using the symbol ⇐⇒ .
Example 1.30 The two first statements in the previous example can be written as:
(i) x = 1 ⇐⇒ x − 1 = 0
(ii) x = 1 ⇐⇒ 2x = 2
Note that while the statements x = 1 and x^2 = 1 from part (iv) of Example 1.29
are not equivalent, they are still related. We use the implication arrows =⇒ and ⇐=
to indicate that the truth of a statement implies the truth of another statement. This
allows us to express relations between statements which are weaker than them being
equivalent.
(i) x = 1 =⇒ x^2 = 1
(ii) x = 1 ⇍ x^2 = 1.
Here, (i) expresses the fact that if x = 1 holds, then so does x^2 = 1, and (ii) expresses
the fact that if x^2 = 1 holds for x, then it is not necessarily true that x = 1 (it could be
that x = −1).
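To see the difference between the two directions concretely, one can test them on a handful of numbers. The following is a small Python sketch; the helper `implies` and the list `samples` are our own illustrative choices. Passing the test on a few samples is only evidence for (i), but a single failing sample does settle (ii).

```python
def implies(p, q):
    """Truth value of the statement 'p implies q'."""
    return (not p) or q

samples = [-2, -1, 0, 1, 2]

# x = 1  ==>  x^2 = 1 holds for every sample ...
forward = all(implies(x == 1, x ** 2 == 1) for x in samples)
# ... but x^2 = 1  ==>  x = 1 fails, and x = -1 is the witness.
backward_failures = [x for x in samples if not implies(x ** 2 == 1, x == 1)]

print(forward)            # True
print(backward_failures)  # [-1]
```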
Remark 1.32 Let A and B be two statements. Then A ⇐⇒ B means exactly that
we have both A =⇒ B and A ⇐= B.
Exercise 1.34 Before reading on, we ask you to observe that one half of the equiva-
lence in the above proposition has already been proved. Which one?
• Case 1: When a = 0
• Case 2: When a ≠ 0.
Note that together these two cases cover all possible situations. This means that if we
are able to prove the desired conclusion in each case, then the proposition holds.
Here are the details for each case:
• Case 1: We suppose that ab = 0 and that a = 0. In this case, there is nothing to
prove, as this means that the statement "a = 0 or b = 0" is true. Done!
• Case 2: We suppose that ab = 0 and that a ≠ 0. In this case, we know by axiom
(M4) that a^{−1} exists. But this means that we have the following chain of equalities:
Proof. We prove (i) (leaving (ii) as an exercise). First, we note that by Remark 1.32,
we can split the equivalence into the statements
a = b =⇒ a + c = b + c and a = b ⇐= a + c = b + c.
We can now consider how to prove these two statements one at a time:
Proof of " =⇒ " direction: We are given that a = b is true. But then the replacement
rule (axiom (E2)) lets us establish the (rather short) chain of equalities
a + c = b + c, and we are done.
Proof of " ⇐= " direction: We are now given that a + c = b + c. But this means we
can establish the following chain of equalities:
a = a + (c − c) = (a + c) − c
= (b + c) − c = b + (c − c) = b,
Since the first line is equivalent to the last, this chain of equivalences tells us exactly
which x solves the original equation, and we are done!
For emphasis, we repeat that the computation in the above example qualifies as a
proof that 9x + 3 = 5x + 2 holds if and only if x = −1/4. Indeed, it is a logically correct,
and justified, argument connecting these two statements.
Exercise 1.41 Solve the following equations using chains of equivalences.
(a) 3x + 1 = 5x − 2 (b) 1 − 6(x − 2) = 3(x + 1) (c) 2x + (1/2)(x + 1) = 15/2 + 6x.
{x ∈ R : 9x + 3 = 5x + 2} = {x ∈ R : 9x = 5x − 1}
= {x ∈ R : 4x = −1}
= {−1/4}.
In this way, we see that the solution strategies of chains of equivalences and chains of
equalities are closely connected.
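The steps of such a chain of equivalences can be mirrored in a few lines of Python. This is only a sketch under our own assumptions: the helper name `solve_linear`, its return conventions, and the restriction to integer coefficients are not part of the lecture notes.

```python
from fractions import Fraction

def solve_linear(a, b, c, d):
    """Solve a*x + b = c*x + d over the reals, for integer a, b, c, d.

    Returns a one-element list, the string "all reals", or an empty list.
    """
    # a*x + b = c*x + d  <=>  (a - c)*x = d - b
    lead, rhs = a - c, d - b
    if lead != 0:
        # Dividing by a non-zero number is a reversible step.
        return [Fraction(rhs, lead)]
    return "all reals" if rhs == 0 else []

print(solve_linear(9, 3, 5, 2))  # [Fraction(-1, 4)]
```

The case distinction on `lead` is there because division is only an equivalence when we divide by a non-zero number.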
The following example contains all the secrets of how to
prove the pq-formula. Notice that the point is to combine Proposition 1.33 with a trick
we have seen before: adding zero! Also, notice that you will be asked to finish the job
in the exercise following the example.
x^2 + 5x + 4 = 0 ⇐⇒ x^2 + 5x + 25/4 + 4 = 25/4 (add 25/4)
⇐⇒ x^2 + 5x + 25/4 = 9/4 (add −4)
⇐⇒ (x + 5/2)^2 = 9/4.
How could we know that this would work? Well, (x + c)^2 = x^2 + 2cx + c^2. So, for x^2 + 5x
to match the two first terms of this expression, we need 2c = 5. Moreover, with c = 5/2,
we get that (x + 5/2)^2 = x^2 + 5x + 25/4. This is why we smuggled an extra 25/4 into
the left-hand side of the above expression.
Well, how does this help? The big deal is that now we have managed to get x on
its own, and can use the result of exercise 1.39 to solve the equation:
(x + 5/2)^2 = 9/4 ⇐⇒ x + 5/2 = ±3/2
⇐⇒ x = −5/2 ± 3/2 ⇐⇒ x = −1 or x = −4.
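As a sanity check on the computation above, one can compare the original polynomial with its completed-square form and test the two claimed solutions. The following Python sketch uses exact fractions; the function names `poly` and `completed_square` are our own, and agreement on sample points is a check, not a proof.

```python
from fractions import Fraction

def poly(x):
    return x ** 2 + 5 * x + 4

def completed_square(x):
    # (x + 5/2)^2 - 9/4, the form reached by adding 25/4 and then -4
    return (x + Fraction(5, 2)) ** 2 - Fraction(9, 4)

# The two forms agree on a range of sample values ...
sample_points = [Fraction(n, 3) for n in range(-30, 31)]
print(all(poly(x) == completed_square(x) for x in sample_points))  # True
# ... and both claimed solutions make the polynomial vanish.
print(poly(-1), poly(-4))  # 0 0
```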
2
In order to keep the discussion simple, we assume for the moment that all real numbers have square
roots (real or imaginary). We will address this fact later in the course.
Exercise 1.45 Repeat the steps from the above example to solve x^2 + px + q = 0.
We can push the insights from the above example a bit further by looking at basically
the same computations in a slightly different way. This gives us a way to factorise second
degree expressions. The only additional ingredient that we will need is the formula
(a + b)(a − b) = a^2 − b^2.
And we are done! (You should verify that this is correct by multiplying out the final
expression.)
Observe: It may be that a = b, or that both a and b are complex numbers. Indeed,
x^2 + 2x + 1 = (x + 1)^2 and x^2 + 1 = (x − i)(x + i) are examples of such situations. In
the former case, we say that the root x = −1 is repeated, or that it has multiplicity 2.
But why stop at quadratic equations (also called second degree equations) when one
can also study n-th degree equations
x^n + c_{n−1} x^{n−1} + c_{n−2} x^{n−2} + · · · + c_2 x^2 + c_1 x + c_0 = 0?
(Here, we use the subscripted letters c_0, c_1, c_2, . . . to avoid running out of symbols. Note
that the expression in the left-hand side is called an n’th degree polynomial.)
One of the major achievements of early algebra, and which is beyond the scope of
this course, is the following extremely powerful result. It tells us that Proposition 1.49
has an extension to polynomials of all degrees.
x^n + c_{n−1} x^{n−1} + · · · + c_1 x + c_0 = 0
A problem with the Fundamental Theorem of Algebra is that it does not tell us
what the zeroes of the equation are. It just tells us that they always exist (at least as
complex numbers), and that we can use them to factorise the polynomial. This should be
compared to the pq-formula which tells us what the zeroes are! During the Renaissance,
finding pq-formulas for equations of degree higher than 2 was one of the main research
questions in mathematics. These efforts were successful for degrees 3 and 4 (although
the formulas are too complicated to be of any practical use). But, finally, in the 19th
century, it was proven that such formulas do not exist for degrees 5 and higher.
Example 1.52 To solve the inequality 3x + 2 < 5, the procedure is essentially identical
to the one for equations:
3x + 2 < 5 ⇐⇒ 3x < 3
⇐⇒ x < 1
So, do the solution methods for equalities and inequalities always behave in the same
way? Well, no. Consider the following exercise.
Exercise 1.53 Here are three suggested solutions of the inequality x − 2 > −8 − 2x.
Are any of them correct? If so, which one(s)?
Exercise 1.54 In Figure ?? of Appendix ??, we illustrate visually why it makes sense
for a < b to be equivalent to −b < −a when both a, b are positive. Verify that the
implication a < b =⇒ −a > −b also holds visually if:
To figure out what we are allowed to do with inequalities, we need to expand our
rulebook! Here are the additional axioms that we need to deal with inequalities.
The rulebook for R (part 2 of 4) Inequalities are governed by the following axioms:
Apparently, we only need four additional axioms to deal with inequalities. While it
is nice that we do not need more, this also means that there is much for us to figure out
on our own. For instance:
Exercise 1.58 (a) Identify which parts of Proposition 1.55 we have not proved
yet. (b) Use what we have seen so far to prove the remaining parts.
"x = 0 and y = 1".
Notice that this statement is false exactly if either x ≠ 0 or y ≠ 1. For this reason, we
say that "x ≠ 0 or y ≠ 1" is the logical negation of the statement "x = 0 and y = 1". If
we denote a statement by the symbol A, we can denote its negation by "not A" or ¬A.
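Since a statement is either true or false, the negation rule used above can be checked by running through all four combinations of truth values. The following Python sketch does this; the names `P`, `Q` and `negation_matches` are our own, with P standing for "x = 0" and Q for "y = 1".

```python
from itertools import product

def negation_matches(P, Q):
    """Does 'not (P and Q)' agree with '(not P) or (not Q)'?"""
    return (not (P and Q)) == ((not P) or (not Q))

# Run through all four truth assignments for P and Q.
print(all(negation_matches(P, Q) for P, Q in product([True, False], repeat=2)))  # True
```

Unlike a numerical spot check, this covers every case, because there are only four of them.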
Exercise 1.61 Let A be a subset of R. Determine the logical negations of the follow-
ing statements:
(a) ∀x ∈ A, we have x ≥ 0 (b) ∃x ∈ A such that x ≥ 0.
Example 1.62 (A proof by contradiction) Could it be that there are real numbers
a so that 1/a = 0? Well, the answer is no, and our goal is to prove this. Since 1/a is not
defined for a = 0, the statement we seek to prove can be expressed as follows:
a ≠ 0 =⇒ 1/a ≠ 0.
To prove this by contradiction, we assume that a ≠ 0 holds but that, nevertheless, 1/a = 0.
Now, on the one hand, since a ≠ 0, we know by the axioms that the number 1/a exists,
and we have that a · (1/a) = 1. When combined with the assumption that 1/a = 0, we
get the following chain of equalities:
1 = a · (1/a) = a · 0 = 0,
When reading the above example, you might be annoyed and shout "Wait a minute!
How can establishing that 1 = 0 prove anything?" Well, the point is that one of the two
following possibilities must be true:
1. It is true that both a ≠ 0 and 1/a = 0 hold at the same time. But this implies that
we must also accept that 1 = 0. In particular, from this, it is possible to show that
all numbers are equal to 0, and so we conclude that all of mathematics is trivial.
2. It is not true that a ≠ 0 and 1/a = 0 hold at the same time. In particular, this
means that if a ≠ 0, then we must have 1/a ≠ 0, which is exactly the implication
we wanted to prove!
Now, if we believe that the rulebook for the real numbers contains only true statements,
we must reject the first possibility described above and accept the second. In other
words, proving that 1 = 0 means that we are done!
More generally, we can describe a proof by contradiction as follows. Suppose that A
and B are two mathematical statements, and we want to prove that
A =⇒ B.
To do this by a proof by contradiction, we would need to prove
A and "not B" =⇒ something clearly absurd.
Now, if we succeed in establishing a chain of implications proving that "A and not B"
implies something clearly false, then one of the two following possibilities has to hold:
1. A and "not B" are true at the same time. But then we have to accept that a valid
argument based on true assumptions can lead to false conclusions. In other words,
we have to accept that mathematics cannot be trusted.
2. A and "not B" cannot be true at the same time. In particular, this means that if A
holds, then "not B" is false. That is, it must be false that B is false, and so we
must conclude that B holds.
Since we believe that mathematics is to be trusted, we reject the first possibility, and
conclude that the second possibility must hold. That is, we conclude that A =⇒ B,
and the proof by contradiction is done!
are logically equivalent. Since we do not want to spend too much time on logic, here is
an example to convince you that this makes sense.
Example 1.68 Consider the statement “If Elias is in Lund, then he is in Sweden”. This
statement is in the form A =⇒ B. Now, taking negations, the statement corresponding
to "not B" =⇒ "not A" is “If Elias is not in Sweden, then he is not in Lund”. Notice
how these two statements say exactly the same thing!
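For readers who prefer a mechanical check to an example about Lund, the two implications can be compared on every combination of truth values. The following Python sketch builds the truth table; the helper `implies` and the variable `rows` are our own names.

```python
from itertools import product

def implies(p, q):
    """Truth value of the statement 'p implies q'."""
    return (not p) or q

# Each row holds: A, B, the value of (A => B), the value of (not B => not A).
rows = [
    (A, B, implies(A, B), implies(not B, not A))
    for A, B in product([True, False], repeat=2)
]

for row in rows:
    print(row)
print(all(row[2] == row[3] for row in rows))  # True
```

The last two columns agree in all four rows, which is what it means for an implication and its contrapositive to be logically equivalent.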
We end the discussions on the proof strategies "proof by contradiction" and "proof
by contraposition" with the following historical remark.
Remark 1.71 The technique "proof by contradiction" is rather extreme. Indeed, here
is what G. H. Hardy (who we already met on page 2) had to say about it:
Based on the information from (a) and (b), do a proof by cases to determine when
(c) (9x − 3)/(4 − 2x) > 0.
Hint: To solve (c), all you need to know is that “negative times negative is positive”,
that “positive times negative is negative” and that “positive times positive is positive”.
Exercise 1.73 In fact, why do we know that “negative times negative is positive” and
so forth? Formulate this as a proposition, and prove the statement.
So, what is the point of the above exercises? Well, what they tell us is that if we have
an inequality in some factorised form, and we know exactly when each factor is positive
and negative, then we should be in good shape.
Let us consider an example.
Notice that in the last step, we used the formula a2 − b2 = (a − b)(a + b) to factorise the
numerator.
Combining what we did above, we get
x − 2 < 5/(x + 2) ⇐⇒ 0 < ((3 − x)(3 + x))/(x + 2).
This means that we have rewritten the inequality in factorised form, and that we can
solve the inequality using the same approach as in part (c) of exercise 1.72. Note that
to keep track of all the cases, it is practical to use a table of signs:
Fig. 9. In the first three lines of this table of signs, we keep track of where the indi-
vidual factors are positive and negative, respectively. This allows us to understand
the different cases we have to consider for the full product appearing in the last line.
Note that we use a skull to remind ourselves that a 0 in the denominator means that
the expression is not defined.
From the table of signs, we see that the expression is positive when x < −3 or −2 < x < 3,
which is also our final answer.
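The bookkeeping of a table of signs can also be imitated in code. The following Python sketch evaluates the sign of each factor at one sample point per interval and at the break points; the helper names `sign` and `expression_sign` and the choice of sample points are our own assumptions.

```python
from fractions import Fraction

def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)

def expression_sign(x):
    """Sign of (3 - x)(3 + x)/(x + 2), or None where it is undefined."""
    if x == -2:
        return None  # zero denominator (the "skull" in the table)
    # The sign of a product or quotient is the product of the signs.
    return sign(3 - x) * sign(3 + x) * sign(x + 2)

# One sample point in each interval, plus the break points -3, -2 and 3.
for x in [-4, -3, Fraction(-5, 2), -2, 0, 3, 4]:
    print(x, expression_sign(x))
```

A sample point only represents its whole interval because no factor changes sign between two consecutive break points.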
Exercise 1.75 Use the strategy from Example 1.74 to solve the following inequalities.
(a) 9/(x + 3) ≥ 1 (b) (x − 2)(x + 3) ≤ x − 2 (c) (x + 2)/(x − 1) ≤ x − 2
Remark 1.76 Some students react to the use of tables of signs. Indeed, we are discussing
axioms at a fairly serious level, and suddenly these childish drawings with skulls appear.
Well, the tables of signs are just symbols to keep track of the cases involved in solving
the inequality. (It would essentially be the same thing if some alien civilisation would
think that the symbols of our alphabet are too “childish” to express anything serious.)
1.5. PROOF BY INDUCTION
The rulebook for R (part 3 of 4) The natural numbers satisfy the induction principle:
If V is a subset of N such that
(i) 0 ∈ V,
(ii) for all k ∈ V it holds that k + 1 ∈ V,
then V = N.
As an axiom, the induction principle is rather bad since it is far from obvious. Keep
in mind that axioms are the only mathematical "truths" that we allow ourselves to
accept without any proof, and that they form the starting point for all of mathematics.
Therefore, it is critical that they are as self-evident as possible. Since we only take the
induction principle as an axiom for convenience (to avoid a technical discussion), this is
not really that big of a deal philosophically speaking (as we will see in the next section,
the opposite will be true for the Completeness Axiom).
Exercise 1.77 (Challenging) Try to figure out how to define the natural numbers
N in a way that reveals the induction principle as a natural consequence.
Remark: Although the required definition of N is actually not that complicated, finding
this on your own is rather hard – so you may want to search through some literature.
Before we look more closely at how to use the induction principle, let us briefly
consider whether or not it makes intuitive sense. To this end, we point out that it can
be interpreted as follows: “If you could count the natural numbers, one by one, for all
eternity, then you would be able to count all of them.” So, if the induction principle was
false, then there would be integers that you could never “reach” by counting in this way.
Does this sound reasonable?
3
A discussion that will be made in the course on the foundations of algebra.
Note the following peculiar thing: it does not matter which letter or symbol we use for
the index since the index itself never appears in the actual sum (notice that n does not
appear on the right-hand side). In particular, it is true that
Σ_{n=0}^{3} 1/2^n = Σ_{k=0}^{3} 1/2^k.
For this reason, the index variable is often called a "dummy variable".
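Programmers meet the same phenomenon with loop variables. In the following Python sketch (the variable names `sum_over_n` and `sum_over_k` are ours), renaming the index from n to k changes nothing about the value of the sum.

```python
from fractions import Fraction

# The same four terms 1, 1/2, 1/4, 1/8, summed with two different index names.
sum_over_n = sum(Fraction(1, 2 ** n) for n in range(0, 4))
sum_over_k = sum(Fraction(1, 2 ** k) for k in range(0, 4))

print(sum_over_n, sum_over_k, sum_over_n == sum_over_k)  # 15/8 15/8 True
```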
We remark that if it is clear from the context what the indices are, we usually just
denote the above sequence by its formula 1/2^n. Moreover, we express an infinite sequence
of numbers, such as 1, 1/2, 1/4, . . ., by writing
(1/2^n)_{n=0}^{∞} = (1, 1/2, 1/4, 1/8, . . .).
Proof that 1 ∈ V: This is the so-called base case. Here, it consists in verifying
that the summation formula is true when n = 1. This is clearly ok here.
Proof that k ∈ V =⇒ k + 1 ∈ V: This is the so-called induction step. It con-
sists of assuming that k ∈ V (the induction hypothesis), and then proving that
k + 1 ∈ V must also be true. To prove that k + 1 ∈ V , our plan is to establish a chain
of equalities showing that
1 + 2 + · · · + k + (k + 1) = · · · = (k + 1)(k + 2)/2.
Indeed, this is formula (1.1) with n replaced by k + 1. Our most important tool is the
induction hypothesis. Namely, formula (1.1) holds with the upper summation bound k.
We use this to get started:
1 + 2 + · · · + k + (k + 1) = (1 + 2 + · · · + k) + (k + 1) = k(k + 1)/2 + (k + 1),
where the bracketed sum equals k(k + 1)/2 by the induction hypothesis.
(k + 1)(k + 2)/2
exactly like we needed.
Conclusion: By the induction principle, V = N∗ . That is, formula (1.1) holds for
all positive integers.
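The two ingredients of the induction proof can be spot-checked numerically. The following Python sketch assumes that formula (1.1) reads 1 + 2 + · · · + n = n(n + 1)/2, as the induction step suggests; the function names `sum_first` and `closed_form` are our own. Checking finitely many values of k is only a plausibility test, and it is the induction principle that covers all of them at once.

```python
def sum_first(n):
    """1 + 2 + ... + n computed term by term."""
    return sum(range(1, n + 1))

def closed_form(n):
    """The assumed right-hand side n(n + 1)/2 of formula (1.1)."""
    return n * (n + 1) // 2

# Base case n = 1.
print(sum_first(1) == closed_form(1))  # True
# Induction step for k = 1, ..., 999: adding (k + 1) to the closed form
# for k gives the closed form for k + 1.
print(all(closed_form(k) + (k + 1) == closed_form(k + 1) for k in range(1, 1000)))  # True
```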
Exercise 1.82 What happens when you use induction to prove the following, false,
formula?
Σ_{j=0}^{n} 2^j = 2^{n+1} + 13, ∀n ∈ N.
Exercise 1.84 One of the axioms says that c(a_1 + a_2) = ca_1 + ca_2. Use induction to
write out a careful proof of the more general formula
c · Σ_{j=1}^{n} a_j = Σ_{j=1}^{n} ca_j.
Remark: The meaning of a "careful proof" is subjective and depends on context. Here,
you are asked to prove an "obvious" result which is not in the rulebook. This indicates
that you should point out steps that you otherwise would not (for instance, the use of
the distributive axiom, which we usually omit to mention explicitly in order to make
proofs readable).
(i) a^0 = 1,
(ii) for all k ∈ N we let a^{k+1} = a^k · a.
Notice how each power of a is defined in terms of the previous power. Such a definition
is called inductive, and is well suited for proofs by induction, as we illustrate in the
following example.
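An inductive definition translates almost word for word into a recursive function. The following Python sketch mirrors rules (i) and (ii); the function name `power` is our own, and the sketch deliberately ignores negative exponents.

```python
def power(a, n):
    """a^n for a natural number n, following the inductive definition."""
    if n < 0:
        raise ValueError("this sketch only covers n in N")
    if n == 0:
        return 1                    # rule (i): a^0 = 1
    return power(a, n - 1) * a      # rule (ii): a^(k+1) = a^k * a

print(power(2, 10))                              # 1024
print(power(3, 2) * power(3, 3) == power(3, 5))  # True
```

Each call reduces the exponent by one until it reaches the base case, just as an induction proof reduces the claim for k + 1 to the claim for k.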
Example 1.86 Let us prove that a^m · a^n = a^{m+n} for all m, n ∈ N. To this end, suppose
that m is fixed. We now prove by induction that for all n ∈ N, the formula holds.
Base case: For n = 0, the left-hand side is a^m · a^0 = a^m · 1 = a^m. The right-hand
side is a^{m+0} = a^m. This is exactly what we needed to show.
Induction step: We begin by assuming that the formula holds for n = k. That is,
we assume that a^m · a^k = a^{m+k} holds. We want to use this to prove that this formula
also holds when k is replaced by k + 1. We do this by the following chain of equalities:
(Here, we used the inductive definition of integer powers in the first and last equalities.
In the middle equality, we used the induction hypothesis.)
Conclusion: For all fixed m ∈ N, we have used the induction principle to prove that
am · an = am+n holds for all n ∈ N. That is, the formula holds for all m,n ∈ N.
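The inductive definition of powers translates directly into a loop. The following is a minimal sketch (the function names `power` and `check_power_law` are our own, chosen for illustration) that builds powers using only rules (i) and (ii) and then tests the formula from the example for small exponents. Such a test is of course no substitute for the induction proof; it only checks finitely many cases.

```python
def power(a, n):
    """Compute a^n for a natural number n using only the inductive definition."""
    if n < 0:
        raise ValueError("n must be a natural number")
    result = 1              # rule (i): a^0 = 1
    for _ in range(n):      # rule (ii), applied n times: a^(k+1) = a^k * a
        result = result * a
    return result


def check_power_law(a, max_exp=10):
    """Check a^m * a^n == a^(m+n) for all 0 <= m, n <= max_exp."""
    return all(
        power(a, m) * power(a, n) == power(a, m + n)
        for m in range(max_exp + 1)
        for n in range(max_exp + 1)
    )


if __name__ == "__main__":
    print(power(2, 10))
    print(check_power_law(3))
```

We use integers here so that all comparisons are exact.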
Exercise 1.87 Strictly speaking, to apply the induction principle in the above example,
we need to define some suitable set V . Do this.
Exercise 1.88 (a) Use induction to prove $a^m / a^n = a^{m-n}$, for m, n ∈ N.
(b) Prove that the formulas from Example 1.86 and part (a) of this problem hold for all m, n ∈ Z.
Exercise 1.89 Prove the power rule from Example 1.1 for integer powers. That is,
prove that for n ∈ Z and a, b ∈ R, we have
\[ (ab)^n = a^n b^n. \]
Hint: Use induction to prove this statement for n ∈ N, and then show that the
statement for negative n follows as a consequence.
Fig. 13. This is Pascal's triangle. If you let the 1 on top be the zeroth line, then
the n'th line tells you the coefficients you get when you multiply out $(a + b)^n$. To
the right, we indicate how you, from one line, can deduce the next.
Exercise 1.90 (a) Use the instruction in the above figure to compute the next line.
(b) Multiply out $(a + b)^5$ and compare (feel free to use the expression for $(a + b)^4$).
Even if the above observation is nice, how do we compute the coefficients of, say,
$(a + b)^{500}$? Pascal figured out how one can do this directly without having to go through
the triangle, line by line. To formulate his result, he introduced the symbol $\binom{n}{m}$ to denote
the m'th coefficient on the n'th line of the triangle (he called them binomial coefficients):
Exercise 1.92 Verify that with the above definition, then figures 13 and 14 are the
same.
Pascal found a nice expression for the binomial coefficients. This expression is in terms
of the "factorial function", which we define for n ∈ N to be
n! = n · (n − 1) · · · 2 · 1.
(n! reads aloud as "n factorial".) Here are some examples:
1! = 1, 2! = 2 · 1, 3! = 3 · 2 · 1, 4! = 4 · 3 · 2 · 1.
Note that it is usual to define 0! = 1 since it makes formulas nicer.
With the above preparations out of the way, we formulate Pascal's famous
binomial theorem.
Proposition 1.95 (Pascal’s binomial theorem) For all natural numbers n and real
numbers a, b, we have
\[ (a + b)^n = \sum_{m=0}^{n} \binom{n}{m} a^{n-m} b^m. \]
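To get a feeling for the binomial theorem, one can compare both sides numerically. The sketch below assumes the standard factorial expression $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$ for the binomial coefficients, which is what Python's `math.comb` computes. The helper names `binomial_expansion` and `next_pascal_line` are our own, for illustration only.

```python
from fractions import Fraction
from math import comb, factorial


def binomial_expansion(a, b, n):
    """Right-hand side of the binomial theorem: sum of C(n, m) a^(n-m) b^m."""
    return sum(comb(n, m) * a ** (n - m) * b ** m for m in range(n + 1))


def next_pascal_line(line):
    """Deduce the next line of Pascal's triangle by adding neighbouring entries."""
    return [1] + [line[i] + line[i + 1] for i in range(len(line) - 1)] + [1]


if __name__ == "__main__":
    a, b, n = Fraction(1, 2), Fraction(3), 5
    print(binomial_expansion(a, b, n) == (a + b) ** n)   # exact comparison
    print(next_pascal_line([1, 4, 6, 4, 1]))
    print(comb(500, 2))   # a coefficient of (a + b)^500, without the triangle
```

Exact rational arithmetic (`Fraction`) is used so that the two sides can be compared with `==` without rounding issues.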
Every axiom we have formulated so far is satisfied not only by the real numbers R, but
also by the rational numbers Q. That is, none of the rules we have introduced so far
allow us to distinguish between the sets R and Q. In other words, if we did not introduce
any further axioms, we would be tempted to conclude that R = Q. However, we know
that this is not the case: it has been known since the ancient Greeks that $\sqrt{2} \notin \mathbb{Q}$.
We begin by formulating the following definition.
x≤C ∀x ∈ M.
The smallest number C with the above property is denoted by sup M and called the least
upper bound, or supremum, of M . If M has no upper bound, we write sup M = +∞.
We will momentarily explain what the above definition means, but let us first formu-
late the completeness axiom, which, well, completes our rulebook:
The rulebook for R (part 4 of 4) The real numbers satisfy the completeness principle:
(CP) If the subset M ⊂ R is non-empty and has an upper bound, then it has a least
upper bound.
Our immediate goal is now to figure out what all of this means. The first thing is to
realise that the concept of an upper bound is not really that complicated (please note
that we do not need to use the completeness axiom in these examples!).
Example 1.99 (A set with upper bounds) Let M be the half-open interval [1,3).
In this case, the numbers 3, 19 and 624.7 are all upper bounds of M . In fact, every
number larger than, or equal to, 3 is an upper bound for M . Out of these upper bounds,
the smallest one is 3. That is, sup M = 3.
1.6. THE COMPLETENESS AXIOM 37
Example 1.100 (A set with no upper bound) Let $M = \{n^2\}_{n \in \mathbb{N}}$. Then M has no
upper bound, and therefore we write sup M = +∞.
Exercise 1.101 Suppose that M = [1,3]. What are the upper bounds for M , and
what is the least upper bound of M ?
Remark: The point of this exercise is to encourage you to compare what Definition
1.98 says about the intervals [1,3] and [1,3) (recall Example 1.99).
We will not be using the following definitions much in these lecture notes, but we
include them since they pop up from time to time.
Definition 1.102 If M ⊂ R is such that sup M ∈ M , then we say that the supremum
is the maximum element of M and denote it by max M instead.
Example 1.103 The sets M1 = [1,3) and M2 = [1,3] both have 3 as their supremum.
Of these, only M2 admits a maximum element, and so we can write max M2 = 3.
Exercise 1.104 (Optional) Prove that every finite set admits a maximum element.
Hint: The case when the set contains one element should be clear...
Finally, we note that in the same way that a set M can have least upper bound
sup M , it can also have a greatest lower bound inf M (also called the infimum of M ). In
the following exercise, we ask you to explore what this ought to mean.
Exercise 1.105 (a) Define what we ought to mean by a lower bound for M .
(b) Define what we ought to mean by inf M .
(c) Find an example of a set M for which inf M does not exist. What would be a
reasonable notation for inf M in this case?
(d) Define what we ought to mean by min M , and give examples of a set that admits
a minimal element and one that does not.
Fig. 15. The completeness axiom is a black sheep of mathematics. There are actually
mathematicians who refuse to accept it!
Notice that the Archimedean property can be understood as the statement that the
set of integers is infinitely "long". That is, the above example shows one sense in which
the completeness axiom allows us to make sense of a notion of the infinite.
Exercise 1.109 Prove the following, more original, formulation of the Archimedean
property of R: For all β > 0 and x > 0 there exists an integer n so that βn > x.
Hint: Here you can either modify the proof of the Archimedean property, or even just
apply it in a suitable way.
Exercise 1.111 In this exercise, we consider the set M from Example 1.110.
(a) Determine whether M admits a maximum element.
(b) Determine whether M admits a minimum element.
Exercise 1.112 Consider the set
\[ M = \Bigl\{ \frac{1}{n} : n \in \mathbb{N}^* \Bigr\}. \]
(a) Determine inf M and sup M .
(b) Determine if M admits a maximum or minimum element.
Exercise 1.113 Consider the set
\[ M = \Bigl\{ \frac{n+2}{n+1} : n \in \mathbb{N} \Bigr\}. \]
(a) Determine inf M and sup M .
(b) Determine if M admits a maximum or minimum element.
Exercise 1.114 Suppose that A,B are two subsets of R so that a ≤ b for all a ∈ A
and b ∈ B. Show that sup A ≤ inf B.
Example 1.115 (All decimal expansions represent some real number) Suppose
someone throws the endless string of decimal digits for the number π in our face:
3.1415926535... (1.2)
How do we know that this represents a real number? One reason to be skeptical is that
nowhere in the axioms for the real numbers are decimal numbers explicitly mentioned.
A key insight is that (1.2) is supposed to be understood to be the supremum of the set
\[
M = \Bigl\{ 3, \; 3 + \frac{1}{10}, \; 3 + \frac{1}{10} + \frac{4}{10^2}, \; 3 + \frac{1}{10} + \frac{4}{10^2} + \frac{1}{10^3}, \; \ldots \Bigr\}
  = \{3, \; 3.1, \; 3.14, \; 3.141, \; \ldots\}.
\]
Notice that, here, we are expressing M in the form $\{a_n : n \in \mathbb{N}\}$, where each $a_n$ contains
exactly n decimal digits of π after the decimal point. In particular, the sequence $a_n$ is increasing,
but none of the entries are larger than the number 4.
By the above observations, the completeness axiom implies that a smallest upper
bound for M exists. That is, π exists. (Note that this argument is incomplete – what
we lack is a recipe for computing all decimal digits of π.)
Exercise 1.117 (Discussion) Can a real number have two different decimal expan-
sions?
Exercise 1.118 The above definition only says what we mean by a positive decimal
number. Extend the definition to negative decimal numbers.
Note that while the above definition says what we mean by a decimal number, it
does not guarantee that every real number is a decimal number. That is, could there
be super fancy real numbers that cannot be expressed as a decimal number? To put
this question into perspective, consider the following: could it be that there are fancy
numbers that cannot be written as a rational number? Well, yes! It was famously proved
by the ancient Greeks that $\sqrt{2}$ is such a number. These fancy numbers are called the
irrational numbers (see Appendix ??).
So, could it be that there exists some proof that not all real numbers have a decimal
expansion? Well, no! Let us settle this matter as proper mathematicians. That is, in
terms of a theorem, a lemma and a proof.
To prove this result, we need the well-ordering principle, which is a consequence of the
induction principle. As the Well-ordering principle will be discussed in other courses, we
omit the proof here (however, for the interested students, we provide a proof in Appendix
??).
Exercise 1.122 (Challenging) Use the well-ordering principle to prove that for every
x ≥ 0 there exists an integer n so that x ∈ [n,n + 1).
Remark: The statement also holds for x < 0. However, since we only need it for positive x, we
choose to restrict the statement, as this simplifies the proof somewhat.
Proof of Theorem 1.120. Let a ∈ R be some fixed real number. We now show that there
exists some sequence of numbers $a_n$ so that a satisfies Definition 1.116. For convenience,
we assume that a > 0.
First, we observe that by exercise 1.122, there exists an integer $a_0 \in \mathbb{N}$ such that
\[ a \in [a_0, a_0 + 1), \]
which can be expressed in the form
\[ a - a_0 \in [0,1). \]
Next, we split the interval [0,1) into 10 subintervals of length 1/10 that are half-open in the
same sense:
\[ \Bigl[0, \frac{1}{10}\Bigr), \; \Bigl[\frac{1}{10}, \frac{2}{10}\Bigr), \; \ldots, \; \Bigl[\frac{9}{10}, 1\Bigr). \]
The number $a - a_0$ has to be in one of these subintervals. That is, for some
$a_1 \in \{0,1,2,\ldots,9\}$, we have
\[ a - a_0 \in \Bigl[\frac{a_1}{10}, \frac{a_1 + 1}{10}\Bigr). \]
Again, we notice that this can be expressed in the form
\[ a - a_0 - \frac{a_1}{10} \in \Bigl[0, \frac{1}{10}\Bigr). \]
Continuing this process, where we at each step split an interval of the form $[0, 1/10^{n-1})$
into ten subintervals of length $1/10^n$, we find an infinite sequence $(a_n)_{n=1}^{\infty}$ of integers
$a_n \in \{0,1,2,\ldots,9\}$ so that for every fixed n, we have
\[ a - a_0 - \frac{a_1}{10} - \frac{a_2}{100} - \cdots - \frac{a_n}{10^n} \in \Bigl[0, \frac{1}{10^n}\Bigr). \tag{1.3} \]
Notice that it follows immediately from (1.3), and the fact that all $a_k$ are non-negative, that
\[ \sum_{k=0}^{n} \frac{a_k}{10^k} \le a \quad \forall n \in \mathbb{N}. \]
In combination with (1.3), this means that $\Delta \in [0, 1/10^n)$ for all n ∈ N. But this
contradicts the fact that ∆ is a fixed and strictly positive number (see exercise 1.67)!
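The interval-splitting procedure in the proof above is really an algorithm for producing decimal digits. Here is a minimal sketch of it, assuming the input is a non-negative rational number so that the arithmetic is exact (the function names `decimal_digits` and `satisfies_bound` are our own). Multiplying the remainder by 10 at each step is the same as splitting the current interval into ten pieces and recording which piece the number falls into.

```python
from fractions import Fraction
import math


def decimal_digits(a, n_digits):
    """Return (a0, [a1, ..., an]) produced by repeatedly splitting intervals in ten."""
    a = Fraction(a)
    if a < 0:
        raise ValueError("this sketch only treats a >= 0")
    a0 = math.floor(a)               # a lies in [a0, a0 + 1)
    remainder = a - a0               # lies in [0, 1)
    digits = []
    for _ in range(n_digits):
        remainder *= 10              # rescale so the ten subintervals have length 1
        digit = math.floor(remainder)
        digits.append(digit)
        remainder -= digit           # back in [0, 1)
    return a0, digits


def satisfies_bound(a, n_digits):
    """Check the analogue of (1.3): a minus the partial sum lies in [0, 1/10^n)."""
    a = Fraction(a)
    a0, digits = decimal_digits(a, n_digits)
    partial = a0 + sum(Fraction(d, 10 ** k) for k, d in enumerate(digits, start=1))
    return 0 <= a - partial < Fraction(1, 10 ** n_digits)


if __name__ == "__main__":
    print(decimal_digits(Fraction(22, 7), 6))
    print(satisfies_bound(Fraction(22, 7), 6))
```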
Theorem 1.123 The real number $\sqrt{2}$ is irrational.
The following consequence states that there are no gaps between the real and rational
numbers on the real line. It will be useful in certain key examples later in the lecture
notes. We include a proof that shows how it follows from the above theorem.
Corollary 1.124 Every interval [a,b] with a < b contains (at least) one rational and one
irrational number.
Proof of Corollary. We prove that the interval must contain at least one rational number,
and leave the (very similar) proof that it contains at least one irrational number as an
exercise, below.
First, we note that if either a or b is rational, then we are done (why?). Therefore,
we may assume that both a, b are irrational. Now, since b − a > 0 it follows by the
Archimedean property that we can find some number N ∈ N so that, say, N > 1/(b − a).
Our claim is that we can now find a number n ∈ Z so that the rational number n/N
belongs to [a,b]. Intuitively, this makes sense since the gap between any two consecutive
numbers from the sequence
\[ \ldots, \; -\frac{3}{N}, \; -\frac{2}{N}, \; -\frac{1}{N}, \; 0, \; \frac{1}{N}, \; \frac{2}{N}, \; \frac{3}{N}, \; \ldots \]
is equal to 1/N , which by our choice of N , is smaller than (b − a), and so some number
in the above sequence must hit the interval (a,b) sooner or later. To turn this intuition
into a proof, we need only apply the Well-ordering principle in the form of exercise 1.122.
Indeed, applying the exercise to x = N b, it follows that there exists an integer n so that
N b ∈ [n,n + 1). Now, this can be rewritten as follows:
\begin{align*}
Nb \in [n, n+1) &\iff n \le Nb < n + 1 \\
                &\iff Nb - 1 < n \le Nb \\
                &\iff b - \frac{1}{N} < \frac{n}{N} \le b.
\end{align*}
What remains is to prove that b − 1/N > a. But this follows since the inequality
1/N < (b − a) can be rewritten in the form
\[ b - \frac{1}{N} > b - (b - a) = a. \]
We conclude that the rational number n/N belongs to [a,b].
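The proof above is constructive, so we can follow it step by step on a computer. The sketch below (with the hypothetical name `rational_in_interval`) picks N as the smallest integer strictly larger than 1/(b − a), which is one convenient choice satisfying the requirement N > 1/(b − a), and then takes n to be the integer with Nb ∈ [n, n + 1).

```python
from fractions import Fraction
import math


def rational_in_interval(a, b):
    """Return a rational number n/N with a < n/N <= b, following the proof."""
    if not a < b:
        raise ValueError("need a < b")
    N = math.floor(1 / (b - a)) + 1   # an integer with N > 1/(b - a)
    n = math.floor(N * b)             # the integer with N*b in [n, n + 1)
    return Fraction(n, N)


if __name__ == "__main__":
    print(rational_in_interval(Fraction(1, 3), Fraction(1, 2)))
    print(rational_in_interval(math.sqrt(2), math.sqrt(3)))
```

Note that when the endpoints are given as floating point numbers, they are only approximations of the real numbers we have in mind, so the second call is an illustration rather than a proof.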
Exercise 1.125 Complete the proof of the above corollary by proving that every
interval [a,b] with a < b contains an irrational number.
Hint: Modify the above argument, using the fact that $\sqrt{2}$ is irrational.
Exercise 1.126 It is actually more common to express the above corollary in the
following form: "Every interval (a,b) with a < b contains (at least) one rational and one
irrational number." Prove that the truth of this statement follows (almost) immediately
from the above corollary.
Remark: To really convince yourself that this is an inessential detail, note that the
statement in the exercise trivially implies the statement of the corollary.
1.81 Base case: for n = 0, the left-hand side is 1 and the right-hand side is $2^1 - 1 = 1$.
Induction step: suppose the formula holds for n = k. To investigate n = k + 1,
we compute, using the induction hypothesis, that
$1 + 2 + 2^2 + \cdots + 2^k + 2^{k+1} = 2^{k+1} - 1 + 2^{k+1} = 2 \cdot 2^{k+1} - 1 = 2^{k+2} - 1$.
This is exactly what we wanted to get, and we are done.
1.82 The induction step works, but the base case fails. Hence, the induction proof fails.
1.83 Base case: putting n = 1, we see that the left- and right-hand sides are both equal
to 1 and so the formula holds in this case. For the induction step, assume that the
formula is ok for n = k. When considering the formula for n = k + 1 notice that
you can apply the induction hypothesis to simplify the expression $1 + 4 + 9 + \cdots + k^2 + (k+1)^2$.
Using this, you can find a chain of equalities showing that the left-
and right-hand sides for the formula in the case n = k + 1 are equal, and you are
done.
1.87 $V = \{n \in \mathbb{N} : a^m \cdot a^n = a^{m+n}\}$.
1.90 (a) 1, 5, 10, 10, 5, 1, (b) $(a + b)(a + b)^4 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5$.
1.96 500 · 499/2 = 124750.
1.97 The base case is n = 0. Then $(a + b)^0 = 1$ and $\sum_{k=0}^{0} \binom{0}{k} a^{0-k} b^k = 1 \cdot a^0 b^0 = 1$.
For the induction step, suppose that the formula for $(a + b)^n$ holds for n = k. We
are to use this to prove this formula for n = k + 1. This allows us to make the
following chain of equalities:
\begin{align*}
(a + b)^{k+1} &= (a + b) \cdot (a + b)^k = (a + b) \cdot \sum_{m=0}^{k} \binom{k}{m} a^{k-m} b^m \\
  &= \sum_{m=0}^{k} \binom{k}{m} a^{k-m+1} b^m + \sum_{m=0}^{k} \binom{k}{m} a^{k-m} b^{m+1} \\
  &= \sum_{m=0}^{k} \binom{k}{m} a^{k-m+1} b^m + \sum_{m=1}^{k+1} \binom{k}{m-1} a^{k-m+1} b^m \\
  &= a^{k+1} + \sum_{m=1}^{k} \left( \binom{k}{m} + \binom{k}{m-1} \right) a^{k-m+1} b^m + b^{k+1} \\
  &= a^{k+1} + \sum_{m=1}^{k} \binom{k+1}{m} a^{k-m+1} b^m + b^{k+1} \\
  &= \sum_{m=0}^{k+1} \binom{k+1}{m} a^{k-m+1} b^m.
\end{align*}
1.7. ANSWERS TO SELECTED EXERCISES 47
1.118 For a negative number a, find the decimal expansion $a_0.a_1 a_2 a_3 \ldots$ of the positive
number −a. Then put $a = -a_0.a_1 a_2 a_3 \ldots$.
Chapter 8
The derivative
In this chapter we figure out the computational rules for the derivative and prove the
differentiation formulas for the functions most commonly appearing in these lecture
notes.
Remark 8.1 (Selected problems from previous exams based on this chapter)
1. The following elementary functions are closely related to the trigonometric func-
tions, and are called hyperbolic functions:
\[ \sinh x = \frac{e^x - e^{-x}}{2} \quad \text{and} \quad \cosh x = \frac{e^x + e^{-x}}{2}. \]
(a) Prove the differentiation formulas
\[ f'(x) \overset{\text{def}}{=} \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \]
at all points where this limit exists. Moreover, if the limit exists, we say that f is
differentiable at x. If f is differentiable at all points in its domain, we simply say that f
is differentiable. Note that we sometimes write $\frac{d}{dx} f(x)$ or $\frac{df}{dx}(x)$ instead of $f'(x)$.
\[ \lim_{u \to x} \frac{f(x) - f(u)}{x - u}. \]
(b) Is it always true that the limit in (a) is equal to the derivative of f (x) for all
differentiable functions f ? If yes, prove this, or, if not, then find a counterexample.
Remark 8.4 (Leibniz notation) The letter h in the definition of the derivative denotes
how far we move away from the point x on the x-axis. Similarly, the quantity f (x +
h) − f (x) denotes how far this pushes the function away from the value f (x) along the
y-axis. Denoting these changes to the values x and f (x) by ∆x and ∆f , respectively,
Leibniz came up with the notation $\frac{df}{dx}$
for the derivative. (Here, we should point out that the Greek letter ∆ corresponds to
the Latin letter "d" and stands for "difference".) Explicitly, we have
\[ \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}. \]
In order to remember this notation, the following is worth keeping in mind:
(In fact, this also holds true in the case of the notation for the definite integral.)
8.1. COMPUTATIONAL RULES FOR THE DERIVATIVE 51
\[ \lim_{h \to 0} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0} \frac{f(h)}{h}. \]
Here, we used that f (0) = 0, according to the definition of f . Now, to plug in a formula
for f (h), we need to know if h is positive or negative (since f has different formulas
depending on the sign of h). This forces us to consider the one-sided limits separately:
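\begin{align*}
\lim_{h \to 0^+} \frac{f(h)}{h} &= \lim_{h \to 0^+} \frac{h^2 + h}{h} = \lim_{h \to 0^+} (h + 1) = 1, \\
\lim_{h \to 0^-} \frac{f(h)}{h} &= \lim_{h \to 0^-} \frac{Ch}{h} = \lim_{h \to 0^-} C = C.
\end{align*}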
Since these two limits are equal if and only if the two-sided limit exists (Proposition
??.??), it follows that f is differentiable at x = 0 exactly if C = 1.
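The two one-sided limits can also be explored numerically. In the sketch below we assume, as the computation above suggests, that f is given by h² + h to the right of the origin and by Ch to the left (the precise definition of f is taken from the example, and the names `make_f` and `one_sided_quotients` are our own). A small step size only approximates the limits, so this is a sanity check, not a proof.

```python
def make_f(C):
    """Piecewise function assumed from the example: x^2 + x for x >= 0, C*x for x < 0."""
    def f(x):
        return x * x + x if x >= 0 else C * x
    return f


def one_sided_quotients(f, h=1e-6):
    """Difference quotients of f at 0 from the left and from the right."""
    right = (f(h) - f(0)) / h
    left = (f(-h) - f(0)) / (-h)
    return left, right


if __name__ == "__main__":
    for C in (1.0, 3.0):
        left, right = one_sided_quotients(make_f(C))
        print(f"C = {C}: left quotient {left:.6f}, right quotient {right:.6f}")
```

For C = 1 the two quotients agree up to the step size, while for any other C they stay apart, in line with the conclusion of the example.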
Exercise 8.6 Consider the function in the above example. Suppose that x > 0 is
some fixed number. Do we have to take both formulas of f into consideration when
computing $f'(x)$? Explain why.
Exercise 8.7 Determine values for C and D so that the following function is both
continuous and differentiable at x = 0:
\[
f(x) = \begin{cases} \sqrt{x + 1} + D, & x \ge 0, \\ C(x + 1), & x < 0. \end{cases}
\]
f(x)            f'(x)
$x^2$           $2x$
$x$             $1$
$1/x$
$\sqrt{x}$
$x^\alpha$
$e^x$
$\ln x$
$\sin x$
$\cos x$
$\tan x$
$\arcsin x$
$\arccos x$
$\arctan x$
(v) $\frac{d}{dx} f\bigl(g(x)\bigr) = f'\bigl(g(x)\bigr) \cdot g'(x)$ (chain rule)
We prove these rules in Section 8.2, below. As we shall see, these rules are all con-
sequences of the computational rules for the limit. For this reason, it may be surprising
that the sum rule looks very similar to the one for the limit, while others do not.
While the above computational rules should be more or less familiar from high school,
the following rule is probably not.
\[ \frac{d}{dx} f^{-1}(x) = \frac{1}{f'\bigl(f^{-1}(x)\bigr)}. \]
We immediately note that while this result may be hard to read and apply, it is
actually not that hard to prove. Indeed, it follows almost immediately from the chain
rule. We shall return to this when we discuss implicit differentiation later in the chapter.
Example 8.11 Here are some examples to illustrate rules (i) to (iv).
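(i) $\frac{d}{dx}\bigl(x^3 + \sin x\bigr) = 3x^2 + \cos x$
(ii) $\frac{d}{dx}\bigl(x^3 \sin x\bigr) = 3x^2 \sin x + x^3 \cos x = x^2 \bigl(3 \sin x + x \cos x\bigr)$
(iii)
\begin{align*}
\frac{d}{dx} \frac{1}{x^3 \sin x}
  &= \frac{-1}{\bigl(x^3 \sin x\bigr)^2} \cdot \frac{d}{dx}\bigl(x^3 \sin x\bigr) \\
  &= -\frac{x^2 \bigl(3 \sin x + x \cos x\bigr)}{x^6 \sin^2 x} \\
  &= -\frac{3 \sin x + x \cos x}{x^4 \sin^2 x}
\end{align*}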
Notice how, in (iii), we do not try to solve everything in one line. Instead, the first
step was essentially to recall the reciprocal rule. Indeed, having a bit of patience when
computing derivatives often helps us avoid mistakes.
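A cheap way to catch mistakes in computations like (ii) and (iii) is to compare the formula you found with a numerical approximation of the derivative. The sketch below uses a symmetric difference quotient with a small, arbitrarily chosen step h (the function names are our own). Agreement at a few points does not prove a formula, but disagreement reliably reveals an error.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def product(x):
    return x ** 3 * math.sin(x)


def product_derivative(x):
    # Result of (ii): x^2 (3 sin x + x cos x)
    return x ** 2 * (3 * math.sin(x) + x * math.cos(x))


def reciprocal(x):
    return 1 / (x ** 3 * math.sin(x))


def reciprocal_derivative(x):
    # Result of (iii): -(3 sin x + x cos x) / (x^4 sin^2 x)
    return -(3 * math.sin(x) + x * math.cos(x)) / (x ** 4 * math.sin(x) ** 2)


if __name__ == "__main__":
    x = 1.3
    print(numerical_derivative(product, x), product_derivative(x))
    print(numerical_derivative(reciprocal, x), reciprocal_derivative(x))
```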
We include one more example on the product rule to illustrate the importance of
patience:
Example 8.12 Applying the product rule twice, we can differentiate the product of
three functions:
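\begin{align*}
\frac{d}{dx}\bigl(\ln x \cdot \sin x \cdot \arctan x\bigr)
  &= \frac{d}{dx}\bigl(\underbrace{\ln x}_{f} \cdot \underbrace{\sin x \cdot \arctan x}_{g}\bigr) \\
  &= \Bigl(\frac{d}{dx} \ln x\Bigr) \cdot \sin x \cdot \arctan x + \ln x \cdot \frac{d}{dx}\bigl(\sin x \cdot \arctan x\bigr)
\end{align*}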
Remark: You can avoid the use of the chain rule in this exercise.
Exercise 8.14 Show that for constants a, b, c, d, with c and d not both zero, we have
\[ \frac{d}{dx} \frac{ax + b}{cx + d} = \frac{ad - bc}{(cx + d)^2}. \]
Exercise 8.15 (a) Use the product rule to prove by induction that
\[ \frac{d}{dx} x^n = n x^{n-1}, \quad \forall n \in \{1, 2, 3, \ldots\}. \]
(b) Combine the formula from (a) with the reciprocal rule to prove that
\[ \frac{d}{dx} x^n = n x^{n-1}, \quad \forall n \in \{-1, -2, -3, \ldots\}. \]
We now move on to rule (v), namely, the chain rule.
Here, we use the chain rule as formulated in Proposition 8.9 with f (x) = sin x and
$g(x) = x^3$. In particular, since $f'(x) = \cos x$, this means that $f'\bigl(g(x)\bigr) = \cos(x^3)$.
Remark: At first, the chain rule can be confusing. If you are struggling with this
exercise, continue reading, and then try again after taking a look at Example 8.20.
The chain rule is usually the computational rule for the derivative that requires the
most effort to master. The main reason is probably that the notation is sort of bad.
Indeed, notice that in our formulation of the chain rule,
\[ \frac{d}{dx} f\bigl(g(x)\bigr) \ne f'\bigl(g(x)\bigr). \tag{8.1} \]
So what is going on? Well, in the expression to the left, we are trying to say that one
should first compose f and g, to get $f(g(x)) = \sin(x^3)$, and then take the derivative of
this composition. In the expression to the right, on the other hand, we mean to say that
you should first take the derivative of f (x) = sin(x), and afterwards compose the result
with g(x).
This difference is really not at all clear from how we write these expressions. So,
to make the chain rule easier to understand, it is common to introduce different letters
for the variables and write f (u) for the outer function and g(x) for the inner function.
With the Leibniz notation for the derivative (recall Remark 8.4), we can now write the
right-most expression in (8.1) as follows:
\[ \frac{d}{du} f(u) \quad \text{or} \quad \frac{d}{du} f(u) \Big|_{u = g(x)}. \]
These two expressions mean the same thing. However, in the right-most variant, we
make the extra effort of reminding the reader that only after taking the derivative, do
we put u = g(x). The chain rule can now be expressed as
\[
\frac{d}{dx} f\bigl(g(x)\bigr) = \frac{d}{dx} f(u)
  = \underbrace{\frac{df}{du}}_{\text{outer der.}} \cdot \underbrace{\frac{du}{dx}}_{\text{inner der.}}.
\]
Example 8.16 (continued) In the case of $\sin(x^3)$ then f (u) = sin u is the outer function
and $g(x) = x^3$ is the inner function. This means that $f'(u) = \cos u$ is the outer derivative
and $g'(x) = 3x^2$ is the inner derivative. By the chain rule, we get
\begin{align*}
\frac{d}{dx} \sin(x^3) &\overset{u = x^3}{=} \frac{d}{dx} \sin(u) = \cos u \cdot \frac{du}{dx} \\
  &= \cos(x^3) \cdot \frac{d}{dx} x^3 \\
  &= \cos(x^3) \cdot 3x^2.
\end{align*}
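The recipe "outer derivative, evaluated at the inner function, times inner derivative" can be written as a small program. The following sketch (with names of our own choosing) builds the chain rule derivative of sin(x³) from its pieces and compares it to a numerical approximation. It also shows that the outer derivative alone, cos(x³), does not give the right value, which is exactly the point of (8.1).

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def chain_rule_derivative(outer_derivative, inner, inner_derivative):
    """Return the function x -> f'(g(x)) * g'(x)."""
    def derivative(x):
        return outer_derivative(inner(x)) * inner_derivative(x)
    return derivative


def inner(x):
    return x ** 3


def inner_derivative(x):
    return 3 * x ** 2


def composite(x):
    return math.sin(inner(x))


composite_derivative = chain_rule_derivative(math.cos, inner, inner_derivative)

if __name__ == "__main__":
    x = 0.7
    print("numerical:          ", numerical_derivative(composite, x))
    print("chain rule:         ", composite_derivative(x))
    print("outer derivative only:", math.cos(inner(x)))
```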
Exercise 8.18 Compute the derivatives in exercise 8.17 using this notation.
Exercise 8.19 Check the definition of the indefinite integral in Appendix ??, and
use it to pair the following integrals with the suitable expression. Note that to solve
this exercise, you only need to be able to compute derivatives. (Why? Also, note that
you do not even need to know what the derivative of arctan x is.)
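(a) $\int \frac{dx}{4 + x^2}$          (i) $\frac{1}{2} \ln(x^2 + 4) + C$
(b) $\int \frac{dx}{4 + x}$            (ii) $\frac{1}{4} \bigl( \ln(2 + x) - \ln(2 - x) \bigr) + C$
(c) $\int \frac{dx}{(4 + x)^2}$        (iii) $\ln(4 + x) + C$
(d) $\int \frac{x \, dx}{4 + x^2}$     (iv) $-\frac{1}{4 + x} + C$
(e) $\int \frac{dx}{4 - x^2}$          (v) $\frac{1}{2} \arctan \frac{x}{2} + C$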
Let us consider one more example where we illustrate the use of the chain rule.
Exercise 8.21 Compute the derivative of $y = \ln\bigl(\ln(\ln x)\bigr)$.
Exercise 8.22 Compute the derivatives of the following functions. Note that they
all have something in common. In particular, after having done this exercise, think
about what this means for their graphs, and plot them to see if you are correct.
(a) $f(x) = \arctan \frac{1}{x} + \arctan x$
(b) $f(x) = \arcsin \frac{x}{\sqrt{1 + x^2}} - \arctan x$
(c) $f(x) = 2 \arctan\bigl(x - \sqrt{x^2 - 1}\bigr) + \arctan \sqrt{x^2 - 1}$
Hint: These functions – and how to compute their derivatives – have all appeared on
recent exams. You can find these exams, with full solutions, on the course website.
8.2. PROOF OF THE COMPUTATIONAL RULES FOR THE DERIVATIVE 59
Example 8.23 (Proof of the sum rule) We use the computational rules for the limit
to show that
\[ \bigl( f(x) + g(x) \bigr)' = f'(x) + g'(x). \]
Notice that we could use the summation rule for the limit since we knew that both the
limits $f'(x)$ and $g'(x)$ exist.
Exercise 8.24 Show that if f has a derivative at x, then it is also continuous there.
What part of the diagram in Figure 1 does this justify?
Hint: Recall the formula from exercise 8.3, and find a chain of equalities showing that
\[ \lim_{u \to x} \bigl( f(u) - f(x) \bigr) = \cdots = 0. \]
\[
\frac{d}{dx} \bigl( f(x) \cdot g(x) \bigr) = \lim_{h \to 0} \frac{f(x + h) g(x + h) - f(x) g(x)}{h} = \cdots = f'(x) g(x) + f(x) g'(x).
\]
Exercise 8.27 (a) Use the definition of the derivative to figure out a formula for
\[ \frac{d}{dx} \frac{1}{g(x)}. \]
(b) Use the formula found in (a) and the product rule for derivatives to derive a
formula for
\[ \frac{d}{dx} \frac{f(x)}{g(x)}. \]
Finally, we turn to the chain rule. It is, by far, the most difficult of the computational
rules to prove. To prepare us for the proof, we give a simple, but unfortunately false,
argument that helps us understand what is going on (curiously, this "proof" may be
found in numerous high school textbooks).
Fig. 4. Fake proof ahead.
Example 8.28 (Fake "proof" of chain rule) We wish to compute the limit
To end the proof, we need to compute the limit labeled by (∗). To do this, we make a
change of variables. That is, we set k = g(x + h) − g(x). Note that as h → 0, then k → 0
(this is true since differentiable functions are automatically continuous). This allows us
to write
\[
\lim_{h \to 0} \frac{f\bigl(g(x + h)\bigr) - f\bigl(g(x)\bigr)}{g(x + h) - g(x)}
  = \lim_{k \to 0} \frac{f\bigl(g(x) + k\bigr) - f\bigl(g(x)\bigr)}{k} = f'\bigl(g(x)\bigr).
\]
Notice that the last expression here means the derivative of the function f evaluated at
the point g(x).
Exercise 8.29 (a) What is the problem with the above proof? (A correct proof is
supplied on the following page.)
(b) This "fake proof" can be used to come up with a correct proof for the differenti-
ation formula for inverse functions from Proposition 8.10.
Hint: In (b), use the "fake proof" to differentiate $f^{-1}(f(x))$.
Now, the problem with the fake proof of the chain rule occurs when we multiply by one.
Indeed, the expression
\[ \frac{g(x + h) - g(x)}{g(x + h) - g(x)} \]
may be of the form 0/0 an infinite number of times as h → 0. That is, we need to find
an alternative approach that avoids division by g(x + h) − g(x).
For this reason, let us now consider the definition of $f'$, which we write up as follows:
\[ f'(u) = \lim_{k \to 0} \frac{f(u + k) - f(u)}{k}. \tag{8.2} \]
As in the fake proof, we want to put k = g(x + h) − g(x). However, the problem we
mentioned above then becomes precisely that k may be zero for various values of h, and
we may therefore not divide by it. But here is the crucial step: before we make the
connection between k and g(x + h) − g(x), we define
\[
E(k) = \begin{cases} f'(u) - \dfrac{f(u + k) - f(u)}{k}, & k \ne 0, \\ 0, & k = 0, \end{cases} \tag{8.3}
\]
where we keep in mind that we consider u as being fixed and k as the variable. Here, we
also notice that since (8.2) holds, it follows that
\[ \lim_{k \to 0} E(k) = 0. \]
That is, when defined in this way, the function E(k) is continuous at the origin. Why
did we do all of this? Well, by multiplying up k, and rearranging (8.3), we can now write
\[ f(u + k) - f(u) = \bigl( f'(u) - E(k) \bigr) k. \]
Since this expression is fine for k = 0, it is now safe to put u = g(x) and make the
connection k = g(x + h) − g(x), which, in particular, means that k → 0 as h → 0 and
that we can write
\[ f\bigl(g(x) + k\bigr) - f\bigl(g(x)\bigr) = \bigl( f'(g(x)) - E(k) \bigr) \bigl( g(x + h) - g(x) \bigr). \]
Here, we did not write out the first k since this will make the computation that follows
below slightly easier to read. Also, notice that we can rewrite k = g(x + h) − g(x) as
g(x + h) = g(x) + k.
Finally, we have gathered all the necessary pieces needed to make the following
computation:
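A sketch of how the pieces fit together, using only (8.2), (8.3) and the relation g(x + h) = g(x) + k established above (the layout is our own reconstruction of the argument, so treat it as a sketch rather than the definitive write-up):

```latex
\begin{align*}
\frac{d}{dx} f\bigl(g(x)\bigr)
  &= \lim_{h \to 0} \frac{f\bigl(g(x + h)\bigr) - f\bigl(g(x)\bigr)}{h} \\
  &= \lim_{h \to 0} \frac{f\bigl(g(x) + k\bigr) - f\bigl(g(x)\bigr)}{h} \\
  &= \lim_{h \to 0} \bigl( f'(g(x)) - E(k) \bigr) \cdot \frac{g(x + h) - g(x)}{h} \\
  &= \bigl( f'(g(x)) - 0 \bigr) \cdot g'(x) \\
  &= f'\bigl(g(x)\bigr) \cdot g'(x).
\end{align*}
```

In the second to last step, we used that k → 0 as h → 0, so that E(k) → 0 by the continuity of E at the origin, and that the remaining quotient tends to g'(x) by the definition of the derivative. At no point did we divide by k.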
Remark 8.30 While the above proof seems to be more complicated than the "fake"
proof, the general idea is basically the same. However, here, things get more complicated
as we need to do some extra bookkeeping to make sure that nothing bad happens if k
happens to be zero for h arbitrarily close to 0 (but not equal to 0).
Exercise 8.31 (Discussion) Is the "fake" proof really that bad? Can you think of
any conditions under which it will actually work, and do the functions we normally
consider in these lecture notes satisfy such conditions?
Exercise 8.32 Use the chain rule to prove the differentiation formula for invertible
functions from Proposition 8.10. Here, you may assume that both f and $f^{-1}$ are
differentiable.
Remark: You should compare this exercise to exercise 8.29.
Remark 8.33 In the YouTube video linked in the margin, another proof of the
formula for the derivative of an inverse function is given.
Example 8.34
Let f (x) = arcsin(x) and g(x) = sin(x). Surely, these functions are differentiable.
However, this is not the case for the composition f ◦ g(x) = arcsin(sin x). As we see in
the figure to the right, there are plenty of pointy edges! The point is that arcsin(x) is
not differentiable at the endpoints of its domain, and this causes trouble when x is such
that sin x = ±1.
Fig. 5. Look! Pointy edges!
But all is not lost. The following proposition is analogous to Proposition ??.?? (notice
that it contains a little bit of "fine print").
Proposition 8.35 Suppose that f and g are differentiable. Then the same is true for
the functions f ± g, f · g and f /g. Moreover, if g is differentiable at a point x and f is
differentiable at the point g(x), then f ◦ g is also differentiable at x.
Proposition 8.38
(i) $\frac{d}{dx} \log x = \frac{1}{x}$, x > 0
(ii) $\frac{d}{dx} \sin x = \cos x$
(iii) $\frac{d}{dx} \cos x = -\sin x$
As it happens, we have already done the hard work in proving this proposition in
the previous chapter (see page ??). In the following exercises, the point is to help you
realise this.
Exercise 8.39 We now ask you to prove part (i) of the above proposition.
(a) Write out the definition of the derivative of the logarithm at x = 1 and verify
that we have already proved that formula (i) holds at this point.
(b) Use the logarithmic laws, and the change of variables rule for the limit, to extend
the differentiation formula to all other x.
Hint: The solution to (a) should reveal where to look for inspiration for (b).
Exercise 8.40 In this exercise, we ask you to prove parts (ii) and (iii) of the above
proposition.
(a) Write out the definition of the derivative of the sine and cosine at x = 0 and
verify that we have already proved that formulas (ii) and (iii) hold there.
(b) Use suitable trigonometric formulas, and the change of variables rule for the limit,
to extend the differentiation formulas to all other x.
Hint: The solution to (a) should reveal where to look for inspiration for (b).
Now, an interesting observation is that while the logarithm has domain x > 0, its
derivative y = 1/x has the much larger domain x ≠ 0. This leads us to ponder if we can
somehow extend the logarithm to negative x in such a way that the derivative of this
extension is equal to 1/x there. This is the point of the following exercise:
Exercise 8.41 (a) Use the chain rule for the derivative to prove that for x < 0 we
have
\[ \frac{d}{dx} \log(-x) = \frac{1}{x}. \]
(b) Use what you learned in (a) to write a formula for a function f (x) with domain
R\{0} such that $f'(x) = 1/x$ there.
\[
y' = \frac{d}{dx} \sqrt{1 - x^2} = \frac{-x}{\sqrt{1 - x^2}} \implies y'(1/2) = \frac{-1/2}{\sqrt{1 - 1/4}} = -\frac{1}{\sqrt{3}}.
\]
Fig. 7. The function $y = +\sqrt{1 - x^2}$ describes the upper semi-circle and
$y = -\sqrt{1 - x^2}$ describes the lower semi-circle.
Determining the slope using implicit differentiation: We now show how to find
this slope without first solving for y explicitly as a function of x. To do this, we assume
that our curve is defined by some function y = f (x) close to the point we care about.
8.3. DIFFERENTIATION FORMULAS FOR ELEMENTARY FUNCTIONS 67
The point is that we can solve our problem without ever needing to know the formula
for f (x).
The first step is to put y = f (x) into the equation for the circle:
x2 + f (x)2 = 1.
Since the left-hand side is identical to the right-hand side for all x, the derivative of the
left-hand side has to be equal to the derivative of the right-hand side. We obtain:
\[ \frac{d}{dx} \bigl( x^2 + f(x)^2 \bigr) = \frac{d}{dx} 1 \implies 2x + 2 f(x) f'(x) = 0. \]
Here, $f'(x)$ appears since it is the inner derivative when we use the chain rule to get
$\bigl( f(x)^2 \bigr)' = 2 f(x) f'(x)$. Rewriting the above expression, we get
\[ f'(x) = -\frac{x}{f(x)} \iff y' = -\frac{x}{y}. \]
0 0
√ used that y =0 f (x) and
In the last step, we just √ y = f (x). But this means that when
(x,y) is equal to (1/2, 3/2), we get y = −1/ 3.
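As a quick numerical sanity check of this computation, the following sketch (written in Python, the language introduced in Chapter 10) approximates the slope of the upper semi-circle at x = 1/2 by a symmetric difference quotient and compares it with −x/y. The step size h and the names upper, slope_numeric and slope_implicit are our own choices, not part of the argument above.

```python
import math

def upper(x):
    # Upper semi-circle y = sqrt(1 - x^2)
    return math.sqrt(1 - x**2)

x = 1/2
y = upper(x)
h = 1e-6  # small step for the difference quotient (an arbitrary choice)

slope_numeric = (upper(x + h) - upper(x - h)) / (2*h)
slope_implicit = -x / y  # the formula y' = -x/y from implicit differentiation

print(slope_numeric, slope_implicit, -1/math.sqrt(3))
```

The three printed numbers should agree to several decimals, which is consistent with the value −1/√3 found above.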
\[ x^4 - y^4 - x^2 + y^2 = 0. \]
Example 8.45 We now use implicit differentiation to compute the derivative of the
function y = arcsin(x) by using the fact that it is the inverse function of the sine. That
is, for x ∈ D_arcsin we have
y = arcsin x ⇐⇒ sin y = x,
Remark 8.46 Observe that when using implicit differentiation to study the derivatives
of inverse functions as we do above, then we have no need for the implicit function
theorem. Indeed, since the function is differentiable, we know that we can consider both
y as a function of x and x as a function of y (if needed).
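Proposition 8.52
(i) \( \frac{d}{dx} \exp(x) = \exp(x) \)
(ii) \( \frac{d}{dx} \exp(ix) = i \exp(ix) \)
(iii) \( \frac{d}{dx} x^{\alpha} = \alpha x^{\alpha - 1} \), α ∈ R, x > 0,
(iv) \( \frac{d}{dx} \arcsin x = \frac{1}{\sqrt{1 - x^2}} \), x ∈ (−1,1),
(v) \( \frac{d}{dx} \arccos x = -\frac{1}{\sqrt{1 - x^2}} \), x ∈ (−1,1),
(vi) \( \frac{d}{dx} \arctan x = \frac{1}{1 + x^2} \), x ∈ R.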
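Formulas (iv), (v) and (vi) can be checked numerically. The sketch below is only an illustration, not a proof: it compares a symmetric difference quotient with each formula at one sample point. The helper diff_quotient, the step size h and the sample point x = 0.3 are our own assumptions.

```python
import math

def diff_quotient(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2*h)

x = 0.3  # any point in (-1, 1) works for all three formulas

checks = [
    ("arcsin", diff_quotient(math.asin, x),  1/math.sqrt(1 - x**2)),
    ("arccos", diff_quotient(math.acos, x), -1/math.sqrt(1 - x**2)),
    ("arctan", diff_quotient(math.atan, x),  1/(1 + x**2)),
]

for name, numeric, formula in checks:
    print(name, numeric, formula)
```

In each row, the numerical value and the value of the formula should agree to several decimals.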
Exercise 8.53 (Exam 2015-05-27, part of 5) Make a table of signs for the derivative
of the function
\[ f(x) = \ln x - \frac{x - 1}{x + 1}, \quad x \ge 1. \]
Exercise 8.54 (Exam 2014-08-18, part of 2) Make a table of signs for the derivative
of the function
\[ f(x) = 2 \arctan x + \frac{1}{x}, \quad x \ne 0. \]
Exercise 8.55 (Exam 2014-05-26, part of 3) Make a table of signs for the derivative,
and the second derivative, of the function
\[ f(x) = |x| e^{-1/x}, \quad x \ne 0. \]
Exercise 8.56 (Exam 2012-12-19, part of 4) Make a table of signs for the derivative
of the function
Exercise 8.57 (Exam 2012-05-28, part of 1) Make a table of signs for the derivative
of the function
\[ f(x) = e^{-x^2/2} \sqrt{x^2 + 1}, \quad x \in \mathbb{R}. \]
Exercise 8.58 (Exam 2012-05-28, part of 3) Make a table of signs for the derivative
of the function
\[ f(x) = \ln(1 + e^{-x}) - \frac{1}{e^x + 1}, \quad x \in \mathbb{R}. \]
8.5 Answers to selected exercises
8.6 No.
8.13 (a) (2x arctan(x) − 1)/arctan(x)², (b) 1/(1 + cos(x)), (c) 2(ln(x) − 1)/(ln x)².
8.17 (a) cos(x) e^{sin x}, (b) 2x/(1 + x²), (c) −1/(1 + x²), (d) (1 + 2x²) e^{x²}.
8.19 (a) - (v), (b) - (iii), (c) - (iv), (d) - (i), (e) - (ii).
8.24 Here is an additional hint: multiply the expression in the original hint by one in
such a way that you can take advantage of the fact that the limit
\[ \lim_{u \to x} \frac{f(u) - f(x)}{u - x} \]
exists and is equal to some finite number.
8.25 It says that there are functions that are continuous but not differentiable. That is,
that the two areas in the Venn diagram do not coincide.
8.27 (a)
\[ \lim_{h \to 0} \frac{\frac{1}{g(x+h)} - \frac{1}{g(x)}}{h} = \lim_{h \to 0} \frac{g(x) - g(x+h)}{h\, g(x)\, g(x+h)} = -\frac{1}{g(x)} \lim_{h \to 0} \frac{1}{g(x+h)} \cdot \frac{g(x+h) - g(x)}{h} = -\frac{g'(x)}{g(x)^2}. \]
8.48 Let y = e^x, then log y = x. Now differentiate implicitly on both sides, and then
substitute back.
8.51 Let y = f⁻¹(x), then f(y) = x. Now differentiate implicitly on both sides, and
then substitute back.
Chapter 9
How to compare infinities
Indeed, consider the following: how can we tell that the sets B = {Alfred, Elias, Nils}
and G = {Anna, Emma, Nora} are of the same size? Well, a reasonable answer is that
if we can partner up every element of B with an element of G in such a way that no
element of G is left out or used twice, then B and G are of the same size. Here is one
such pairing:
Alfred ↔ Anna
Elias ↔ Emma
Nils ↔ Nora
Definition I.1 We say that two sets A and B have the same size (Cantor used the term
cardinality) if there exists a bijective correspondence between the elements of A and B.
Please take a moment to notice that the only complication at this point is our use of
the words “bijective” and “cardinality”. In fact, this is just about finding dance partners!
To take another example, let A = {1,2,3,4} and B = {101,102,103,104}. These have the
same cardinality since we can partner up elements from A and B, just as we did with B
and G, above (that is, we can find a bijective correspondence).
Example I.2 The sets N and Z are, according to Cantor, of the same size. To prove
this, we need to find a way to partner up the elements of N and Z with a bijective
correspondence. This may seem hopeless since Z contains all of N, and then some – but
there is a trick that works! We illustrate it in the following figure:
Fig. 3. Here is how to give all numbers in Z a partner in N. First, 0 gets 0. Then 1
gets 1, then −1 gets 2, then 2 gets 3, then −2 gets 4, and so forth.
That is, we partner the elements as follows:
   Z         N
   0   ←→   0
   1   ←→   1
  −1   ←→   2
   2   ←→   3
  −2   ←→   4
   ⋮
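The pairing in the table can also be written as an explicit rule. The following Python sketch is one possible way to do this; the function names partner_in_N and partner_in_Z are our own. Positive integers are sent to the odd natural numbers, and zero and the negative integers to the even ones.

```python
def partner_in_N(z):
    # Positive integers go to odd numbers, the rest to even numbers
    if z > 0:
        return 2*z - 1
    else:
        return -2*z

def partner_in_Z(n):
    # The inverse pairing: odd numbers come from positive integers
    if n % 2 == 1:
        return (n + 1)//2
    else:
        return -(n//2)

for z in [0, 1, -1, 2, -2]:
    print(z, "<->", partner_in_N(z))
```

Since partner_in_Z undoes partner_in_N, no natural number is left out or used twice, which is exactly what a bijective correspondence requires.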
While this example may be surprising, it was not really what got grumpy mathematicians
like Kronecker upset. Let us push this one step further.
Example I.3 The sets N and Q are of the same size. That is, there are exactly as
many natural numbers as there are rational numbers (that is, fractions). This time, we
indicate the partnering through a sequence of illustrations. First, let us represent the
rational numbers in a coordinate system:
Fig. 4. In this way, all rational numbers can be represented by a coordinate in the
upper half-plane.
Notice that some rational numbers appear more than once here. For instance, 1/1 = 2/2,
and 3/2 = 6/4 and so forth. So, we delete the superfluous ones:
Remark I.4 Note that the previous two examples can be adjusted to show that if A and
B both have the same size as N, then both the union A ∪ B and the Cartesian product
A × B = {(a,b) : a ∈ A, b ∈ B} also have the same size as N.
Although this is perhaps even more surprising than Example I.2, this was still not the
reason why grumpy old Kronecker got upset. If you did not like Cantor’s arguments,
you could simply choose to ignore his definition and move on with your life.
However, the following example made sparks fly.
Example I.5 The set R is not of the same size as N. That is, the infinity describing
the size of the real numbers is strictly larger than the infinity describing the size of the
natural numbers. Hence, there are different infinities!
Let us now see if we can understand Cantor’s famous diagonal argument which proves
that there are different types of infinities. The argument is quite short, but as you can
understand from the controversy, many found it hard to swallow.
What we need to prove is that there is no way to write a list
R N
0,23482968762... ←→ 0
5,34243438421... ←→ 1
0,73923212253... ←→ 2
7,23529523532... ←→ 3
3,04360158943... ←→ 4
..
.
that partners all elements of R with elements of N bijectively. The intuitive idea is that
there simply are not enough numbers in N, so that most elements of R have to be missing
from such a list (even if it is infinitely long – because this infinity is the infinity of N,
and it is simply too small to count all of R!).
The proof is by contradiction. So, suppose that there actually does exist such a list
pairing all elements of R bijectively with an element of N. For the sake of concreteness,
let us suppose that it starts exactly like the list above (it does not really matter). Another
detail is that we need to suppose that numbers ending with an infinite sequence of 9’s
are instead written so that they end with an infinite sequence of 0’s. The point is that
we can now prove that there is at least one x ∈ R which cannot belong to this list.
To do this, let us choose decimal digits from the list in a diagonal fashion:
R N
0,23482968762... ←→ 0
5,34243438421... ←→ 1
0,73923212253... ←→ 2
7,23529523532... ←→ 3
3,04360158943... ←→ 4
..
.
We now claim that we can write down a number x which is not on this list, based on
the diagonal digits 2, 4, 9, 2, 0, . . . (the n’th decimal of the n’th number on the list).
How do we do this? Well, it is quite straightforward: we write down a number x
according to the following rules:
• The first decimal is different from the first decimal of the first number.
• The second decimal is different from the second decimal of the second number.
• The third decimal is different from the third decimal of the third number.
..
.
This is achieved, by, for instance, replacing every 0 by 1, every 1 by 2, and so on, until
the digit 9, which we replace by 0 (or 8, if we want to be extra careful). That is,
x = 0,35031...
When constructed in this way, this number x must be different from all numbers on the
original list since it differs from each and every one by at least one decimal digit. Since
we began by supposing that every x ∈ R was on the list, we have reached a contradiction!
Therefore, no such list exists, and so, no bijective pairing of N and R exists.
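The construction of x can be mimicked for the first five numbers of the list. The Python sketch below is only an illustration of the rule "replace every digit d on the diagonal by d + 1, and 9 by 0"; storing the decimals as strings and the names decimals and new_digits are our own choices.

```python
# Decimals of the first five numbers on the (hypothetical) list
decimals = [
    "23482968762",
    "34243438421",
    "73923212253",
    "23529523532",
    "04360158943",
]

new_digits = ""
for n in range(0, len(decimals)):
    d = int(decimals[n][n])                      # n'th decimal of the n'th number
    new_digits = new_digits + str((d + 1) % 10)  # 0->1, 1->2, ..., 9->0

x = "0," + new_digits
print(x)  # prints 0,35031
```

By construction, the n'th decimal of x differs from the n'th decimal of the n'th number, so x cannot equal any of the five numbers used here.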
Definition I.6 Sets of the same cardinality as N are called countably infinite, and sets
which have a strictly larger cardinality than N are called uncountably infinite.
The sets N, Z and Q are all countably infinite, while the sets R and R\Q (the
irrational numbers √2, π, e and so forth) are uncountably infinite.
Chapter 10
A crash course in Python
10.1 Introduction
How to install Python 3 and write your first program
Go to the webpage https://www.spyder-ide.org and download and install the latest
version of Spyder. Once installed, launch this program. On a Mac (a few versions ago,
at least) this would open the following window.
Fig. 1. The graphical user interface of the Spyder editor for Python.
Exercise J.1 Type print("Hello world") in the editor window and press the green
play button. What output do you get? Also try print("2+2 = ", 2+2).
Remark: Notice how Python treats "2+2" as text to be printed, and 2+2 as something
to be computed, and that a comma is used to separate different types of input.
Example J.2 Enter the following code into the Spyder editor.
    a = 2
    b = 4
    c = a + b
    print("a + b = ", c)
When you press the green play button you get the output a + b = 6.
This seems reasonable. Here, a, b and c are what is called variables. Technically
speaking, a variable is a space in the memory of the computer which can store one piece
of information, such as a number. In the above example, they work pretty much like
what we would expect of a mathematical variable. But the next example shows that this
is not always the case:
Example J.3 Enter the following code into the Spyder editor.
    a = 2
    b = 4
    c = a + b
    a = 10
    print("a + b = ", c)
When you press the green play button you still get the output a + b = 6.
Wait, what? Does Python really mean that 10 + 4 = 6? Well, no. The program is
doing exactly what it is told. The thing is that we, ourselves, do not really understand
what we just asked Python to do. So let us try to figure it out. As an example, here is
a line by line explanation of what happens in Example J.2:
The above example illustrates several peculiarities of arithmetic in Python and how
code is run:
1. The code is executed line by line.
2. Expressions such as a = b should be read from the right to the left. That is, a
is assigned whatever value b has, but not vice versa (if b does not have a value,
the program will (probably) crash). For this reason, an arrow makes more sense
than an equal sign.
3. If a variable gets assigned some value, it has no memory of how that happened.
That is, in the second example, c gets assigned the value a+b. But when a is
changed in the next line, this does not affect the value stored in c.
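To see point 3 in action, here is a small sketch in which the value of c is recomputed by hand after a has changed. The extra variable old_c is our own addition, used only to keep the first value for comparison.

```python
a = 2
b = 4
c = a + b      # c stores the number 6, not the formula "a + b"
a = 10         # changing a does not touch the value stored in c
old_c = c

c = a + b      # only now is c recomputed with the new value of a
print("old c:", old_c)
print("new c:", c)
```

If we want c to follow a change in a, we have to run the assignment c = a + b again.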
We now turn to a discussion of the basic syntax in Python. That is, what are the
basic rules for how we are allowed to write a code? First, let us discuss the types of
names you are allowed to give a variable:
Example J.5 You can give variables much more interesting names than just, say, a, b,
x or y. Here is an example of a perfectly well-functioning program:
    ponies = 2
    cookies = 4
    pony_Cookie32 = ponies + cookies
    print(pony_Cookie32)
4. Python is case sensitive. This means that n and N are as different as n and m.
5. By "tradition", the name of a variable should never start with an upper case letter.
6. A variable cannot be given a numerical name such as 2 or 34 (in particular, this
means that the code 2 = a will crash, since Python is trying to create a variable
called 2 and assign it whatever value is stored in the variable a). However, numbers
can be part of the name of a variable, as long as the name starts with a letter or
an underscore ’ _’. In particular, pony_Cookie32 is perfectly fine. Note that certain
special symbols, such as $, # and % can never be used in the name of a variable.
7. The name of a variable cannot contain a "space". That is, you cannot give a
variable the name pony Cookie32. Instead, you will have to use something like
pony_Cookie32 or just ponyCookie32.
8. You should not give a variable a name that already means something different. For
instance, do not give a variable the name print (however, Print is fine).
Example J.6 In Python, we need to be careful with how we place indentations. For
instance both of the following programs will crash.
    a = 2
    b = 4
        c = a + b
    print(c)

    a = 2
    b = 4
    c = a + b
        print(c)
Example J.7 In contrast to the situation with indentations, Python is much less
sensitive with respect to whether or not we skip a line. In particular, the following code
will run just fine:
    a = 2; b = 4      # Two commands on the same line.
    c = (a + b) * 3   # Here we use soft parentheses.

    print(c)
15. It is a good thing for a program to crash. It is a way for Python of letting you
know that whatever result the program would have given you would probably be
false (since it was written in a bad way). What is much more dangerous is if a
program is wrong because of, say, some mathematical mistake in some formula
that still makes sense to Python (say, if a plus is mistakenly replaced by a minus).
Then the code will run, and you will not be warned that something is wrong :-(
Exercise J.8 Will the following codes run? If not, why? If yes, what is the output?
(a)
    a = 2
    b = 4
    6 = a + b
    print(a+b)
(b)
    a = 2
    b = 4
    a = a + b
    print(a)
(c)
    a = 2
    b = 4
    a = b
    print(b)
(d)
    a = 2
    b = 4
    c = a + b
    print(c)
(e)
    a = 2; b=4; a = a + b


    print(b)
(f)
    Ponies = 2
    Cookies = 4
    Rainbows = ponies + cookies
    print(rainbows)
Exercise J.9 Consider the following codes. Can you express mathematically what
they compute?
(a)
    a = 1
    a = a + 1/2
    a = a + 1/4
    a = a + 1/8
    a = a + 1/16

    print(a)
(b)
    a = 1
    a = 1/(1+a)
    a = 1/(1+a)
    a = 1/(1+a)
    a = 1/(1+a)

    print(a)
Example J.10 (For-loop) The following codes do exactly the same thing when run:
    a=0
    print(a)
    a=1
    print(a)
    a=2
    print(a)
    a=3
    print(a)
    print("the end!")

    for n in range(0,4):
        a = n
        print(a)
    print("the end!")
Exercise J.11 Will these codes run? If so, what are their outputs?
(a) (b)
1 for n in range(0,4): 1 for n in range(0,4):
2 print (n) 2 print (n)
3 print (" mississippi ") 3 print (" mississippi ")
4 print (" hello ") 4 print (" hello ")
Example J.13
    a = 2              # Here, we store the value 2 as an integer
    print(type(a))     # Here, we check the type and tell Python to print
                       # the result on screen (it will print <class 'int'>).
Exercise J.14 (a) Check the type of the variable defined by b = 3/2.
(b) Check the type of the variable defined by c = 4/2.
In addition to integers and floating point numbers, we will encounter the following
datatypes in this chapter:
• String: A variable that contains a string of text is called a string. For instance,
a = "Hej" will create a variable containing the string of text "Hej".
• List: A variable that contains a list of other variables is called... well, a list.
For instance, a = [2, 3/2, "Hej"] creates a variable that contains an integer, a
floating point number and a string. A list can even be made up of other lists (more
on this on the following pages).
• Numpy array: A numpy array is a special type of list that can only contain either
integers or floating point numbers (but not both). Since it is more specialised than
the datatype list, this also means it can have more "advanced" features (more on
this on page D-25).
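The built-in datatypes mentioned so far can be inspected with the type command. The following sketch uses variable names and values of our own choosing (numpy arrays are left out, since they require an extra package).

```python
a = 2                 # integer
b = 1.5               # floating point number
c = "Hej"             # string
d = [2, 1.5, "Hej"]   # list containing the three values above

kinds = []
for value in [a, b, c, d]:
    kinds.append(type(value).__name__)   # the name of the datatype as text
    print(value, type(value))
```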
Example J.16
    a = [0,1,2,3,4,5]      # list of integers
    b = ["cheese", 88]     # list of a string and an icecream

    print(a)               # prints the entire list
    print(a[0])            # prints 1st entry
    print(a[3])            # prints 4th entry
    print(a[-1])           # prints last entry
    print(a[-2])           # prints second to last entry

    len(a)                 # computes the length of the list
    sum(a)                 # sums the terms of the list
    max(a)                 # returns the largest entry of the list
    min(a)                 # returns the smallest entry of the list

    c = a+b                # creates the list [0,1,2,3,4,5,"cheese",88]
    d = b*2                # creates the list ["cheese",88,"cheese",88]

    e = a[2:5]             # creates the list [2,3,4] (called a "slice")
    f = a[2:]              # creates the list [2,3,4,5]
    g = a[:5]              # creates the list [0,1,2,3,4]
    h = a[1:4:2]           # creates the list [1,3] (every second term)

    a.append("oi")         # adds entry "oi" to the end of the list
    a.pop(3)               # deletes the 4th entry in the list
The code is more or less explained by the comments, but we note the following:
Example J.17 (Computing sequences using a for-loop) The following two codes
do exactly the same thing when run:
    print(1/2**0)
    print(1/2**1)
    print(1/2**2)

    for k in range(0,3):
        print(1/2**k)
Another way of computing lists is to use a list command called a list comprehension.
It has the benefit of both computing and storing a sequence of numbers in just a single
line of code.
Example J.18 (Computing sequences using a list) The following two codes do
exactly the same thing when run:
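A minimal sketch of such a pair of codes, assuming the sequence 1/2^k from Example J.17 (the variable names a and b are our own choice): the first builds the list by hand, the second uses a list comprehension.

```python
a = [1/2**0, 1/2**1, 1/2**2]         # list written out by hand
b = [1/2**k for k in range(0, 3)]    # list comprehension

print(a)
print(b)
```

Both lines create the list [1.0, 0.5, 0.25], and in both cases the values are stored so that they can be used later in the program.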
Exercise J.19 Run and compare the output of the codes in the above examples.
Exercise J.20 A difference between the right-most codes in examples J.17 and J.18
is that in the first, we do not store the values of the sequence anywhere. Here are two
suggestions for how to fix this:
    a = 0
    for k in range(0,3):
        a = 1/2**k

    a = []
    for k in range(0,3):
        a.append(1/2**k)
Explain what is going on in each code. What information is stored at the end of each
program? Also, why do these not give any output? Can you fix this?
Example J.21 (Computing sums using a for-loop) The following two codes both
compute the sum
\[ 1 + \frac{1}{2} + \frac{1}{4}. \]
    a = 0
    a = a + 1/2**0
    a = a + 1/2**1
    a = a + 1/2**2

    a = 0
    for k in range(0,3):
        a = a + 1/2**k
To understand the code in the above example, recall Figure 2 on page D-2, and read
the explanation following Example J.10.
Here is how to compute the same sum using lists and list comprehensions.
Example J.22 (Computing a sum using lists) The following two codes do exactly
the same thing when run:
Note that using the technique of this example, we could actually compute the sum
in just one line (!):
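A sketch of what such a computation might look like, assuming the same range(0,3) as in the for-loop of Example J.21 (the names terms, total and one_line are our own): first the terms are stored in a list and summed, then the same thing is done in a single line.

```python
terms = [1/2**k for k in range(0, 3)]   # the list [1.0, 0.5, 0.25]
total = sum(terms)                      # sum of the stored terms

one_line = sum([1/2**k for k in range(0, 3)])   # the same sum in one line

print(total, one_line)
```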
Exercise J.23 (a) Run the codes from the examples on this page. Why do they not
give any output? Fix this.
(b) Modify (some of) the code on this page so that you can compute the sum of the
first million terms or so. What result do you get?
10.2. HOW TO COMPUTE AND VISUALIZE SEQUENCES AND SUMS D-11
Example J.24 The following code is perhaps the simplest way to visualise a sequence
in Python. Here, we visualise the sequence
\[ \left( \frac{1}{2^k} \right)_{k=0}^{9}. \]
Let us now immediately modify the above code, so that we can plot partial sums of
the infinite series
\[ \sum_{k=0}^{\infty} \frac{1}{2^k}. \]
That is, we want to plot the first few values we get when we compute
\[ \sum_{k=0}^{0} \frac{1}{2^k} = 1, \quad \sum_{k=0}^{1} \frac{1}{2^k} = 1 + \frac{1}{2}, \quad \sum_{k=0}^{2} \frac{1}{2^k} = 1 + \frac{1}{2} + \frac{1}{4}, \]
and so on. The nice thing is that we can achieve this by a relatively minor modification
of the above code. Notice how this code combines what we did in examples J.21 and
J.24.
Footnote 2: We show how to visualise data stored as lists on page D-13.
Example J.25 The following code computes and visualises exactly the partial sums
expressed above.
    import matplotlib.pyplot as plt

    a = 0

    for k in range(0,3):
        a = a + 1/2**k
        plt.plot(k,a,"bo")

    plt.show()
Exercise J.26 Run this code, and verify by computing the first three partial sums
by hand that the plot is correct. What happens when you increase range(0,3) to, say,
range(0,100)?
Exercise J.27 Several commands can be used to change how the above figure looks.
Try inserting the following into the code. This can be done anywhere after the import
and before plt.plot(). What happens?
(a) Replace "bo" by "rx".
(b) plt.xlim(-3,15)
(c) plt.ylim(-5,2)
(d) plt.xlabel("There was")
(e) plt.ylabel("a graph")
(f) plt.title("that had a title")
(g) plt.xticks([-2,0,3,4,6.5,10])
(h) plt.yticks([0,1,1.5])
(i) plt.grid(True)
(j) plt.figure(figsize=(4,3))
(k) plt.savefig("myfigure.jpg")
Remark: Note that by choosing the extension ".jpg" in (k), you actually tell Python
to save your figure as a jpg-file. Note that only a limited number of file formats are
supported. For more on how to plot in Python, check out the official tutorial
https://matplotlib.org/users/pyplot_tutorial.html which has a ton of information
and examples.
Example J.28 We now use lists to create and plot the sequence
\[ \left( \frac{1}{2^n} \right)_{n=0}^{9}. \]
While this example is quite similar to Example J.24, let us explain what happens in
lines 3, 4 and 6 a bit more carefully:
Lines 3, 4: Here we create two lists. The first is nValues = [0, 1, 2, ..., 9]
which will give us the n-values to be used in the plot (these will be placed on the
x-axis). The second list, a, contains the sequence we are trying to plot. These
will play the part of the y-values in our plot. Note that in the for command, we
can replace the range command by any other list. This is a particular, and rather
elegant, feature of Python.
Line 6: Here, we are creating the plot itself. In the command plt.plot(nValues,a,"bo"),
we use the two lists created in lines 3 and 4. The first will be interpreted as
the x-coordinates to be used, and the second as their corresponding y-coordinates.
Note that it is therefore crucial that these two lists are of the same length (if not,
Python will crash). This explains why it is a good idea to let the for-loop in line 4
be defined in terms of n running through nValues, as this will guarantee that the
lists nValues and a have the same length.
The following code should be compared to that of Example J.28, above. It will compute
the 19 (!) first partial sums of the series. Here, we use the name "indices" instead of
"nValues", and vary the letter used for the index from line to line to emphasise that the
choice of letter really does not matter.
To understand this code, you should first read the explanation for Example J.28.
The difference is what happens in line 6:
Line 6: Here, we use the sum command to sum slices (see Example J.16) of the
list. Here, you should keep in mind that the slice, say, a[0:19] gives you the entries
a[0], a[1], . . . , a[18]. In particular, there is no point in computing a[0:0] as
this slice contains no terms. More explicitly:
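\[ n = 0 \implies \texttt{sum(a[0:0+1])} = \texttt{sum(a[0:1])} = \sum_{k=0}^{0} \frac{1}{2^k}, \]
\[ n = 1 \implies \texttt{sum(a[0:1+1])} = \texttt{sum(a[0:2])} = \sum_{k=0}^{1} \frac{1}{2^k}, \]
\[ \vdots \]
\[ n = 19 \implies \texttt{sum(a[0:19+1])} = \texttt{sum(a[0:20])} = \sum_{k=0}^{19} \frac{1}{2^k}. \]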
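The way slices are summed above can be tried out on a shorter list. The following sketch leaves out the plotting and only computes the first five partial sums; the names indices, a and S are our own choices.

```python
indices = range(0, 5)                   # n = 0, 1, ..., 4
a = [1/2**k for k in indices]           # the terms 1/2^k
S = [sum(a[0:n+1]) for n in indices]    # S[n] = a[0] + a[1] + ... + a[n]

print(S)  # prints [1.0, 1.5, 1.75, 1.875, 1.9375]
```

Note that the slice a[0:n+1] ends at n+1 precisely so that the entry a[n] is included in the sum.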
Exercise J.30 Implement the code in Example J.29. Verify that it gives the same
output as Example J.25 when the range in the latter example is suitably adjusted.
(1,1,2,3,5,8,13, . . .).
These are the Fibonacci numbers. In general, the n’th Fibonacci number is given by
continuing this list using the rule
\[ a_n = a_{n-1} + a_{n-2}. \]
Note that if we intend for the sequence to start at a_0, then this rule cannot be used to
compute a_0 or a_1, since they would then depend on a_{−2} and a_{−1}. Therefore, it is more
correct to say that the Fibonacci numbers are created from the set of rules:
\[ a_0 = 1, \quad a_1 = 1, \quad a_n = a_{n-1} + a_{n-2} \ \text{for } n \ge 2. \]
So, how do we create a list in Python containing, say, the first 20 Fibonacci numbers?
Well, this cannot be done with a single list comprehension as in Example J.18,
since there is no way to let the n’th number depend on the previous numbers in the
sequence (see footnote 3). What to do? Well, here we use a "pure" for-loop, where we
store the Fibonacci numbers, as they are computed, in a list so that we can use previous
Fibonacci numbers to compute the next numbers.
Example J.31 The following code can be used to compute Fibonacci numbers.
    a = [1,1]

    for n in range(2,20):
        newterm = a[n-1] + a[n-2]
        a.append(newterm)

    for n in range(0,20):
        print("The", n+1, "'th Fibonacci number is", a[n])
Footnote 3: Well, strictly speaking, there is, but there is no "natural" way to do this in Python.
Exercise J.32 Implement the code from the above example. Use, say, the wiki-page
for the Fibonacci numbers to verify that it gives the correct output.
Exercise J.33 Use the method of the above examples to create a list containing a
few partial sums of the infinite series \( \sum_{k=0}^{\infty} 1/2^k \).
Another feature of Python is that you can measure how long it takes for a program
to run. In this way, you can time how long it takes Python to run a limited number of
iterations of some for-loop, and then you can make an informed guesstimate of how long
it will take to run, say, a million iterations. Here is how this is done:
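A minimal sketch of how such a measurement might look, assuming the standard-library module time and its function perf_counter (the loop being timed and the names start and elapsed are our own choices for illustration):

```python
import time

start = time.perf_counter()     # read the clock before the loop starts

S = 0
for k in range(0, 10000):
    S = S + 1/2**k

elapsed = time.perf_counter() - start   # seconds spent in the loop

print("Sum:", S)
print("Seconds for 10000 iterations:", elapsed)
```

Multiplying the measured time by 100 then gives a rough guess for a million iterations, although the actual time will depend on the computer and on how the cost of each iteration grows with k.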
Remark J.37 Python is a useful, but fairly slow, programming language. What you
are supposed to observe in the previous exercise is that the built-in commands in Python
actually are written using much faster languages, such as C++. The moral is: if speed
is important, use the built-in functions as much as possible.
10.3. SOME ADDITIONAL CONTROL STATEMENTS IN PYTHON D-17
Remark J.39 (logical operators) In the above example, we see the symbol <=. This
is a logical operator that checks if something is less than or equal to something else (note
that it is important to remember the order of the symbols since =< means nothing to
Python). When doing while-loops, the symbols >=, == and != are also useful, where the
latter two check if two variables are equal or not equal, respectively (note that round-off
errors usually make it impossible for Python to check if two numbers that are not
integers are equal – more on this in Section 10.5).
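The remark about round-off errors can be illustrated with a standard example. In the sketch below, the tolerance 1e-9 is an arbitrary choice of ours, and math.isclose is a standard-library function that performs a comparison of the same kind.

```python
import math

x = 0.1 + 0.2

print(x == 0.3)               # False, because of round-off errors
print(abs(x - 0.3) < 1e-9)    # True: compare with a small tolerance instead
print(math.isclose(x, 0.3))   # True: the same idea using the standard library
```

In a while-loop it is therefore usually safer to test floating point numbers with inequalities such as <= than with ==.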
Exercise J.40 Use a while-loop to check how large n has to be for the partial sum
\( S_n = \sum_{k=0}^{n} 1/2^k \) to be closer to 2 than 1/10000.
If-else statements
Alongside for- and while-loops, the if-else statement is the most important tool in
programming.
Here is a basic example:
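A minimal sketch of an if-else statement; the variable names and the messages are our own choices for illustration:

```python
a = 10

if a >= 0:
    message = "a is non-negative"   # run only when the condition is true
else:
    message = "a is negative"       # run only when the condition is false

print(message)
```

Exactly one of the two indented blocks is run, depending on whether the condition after if is true or false.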
If you want to add more conditions, you can do this by adding as many elif com-
mands as you want (notice how we are allowed to use the word and to combine two
inequalities in our condition – it is also possible to use the keyword or):
Example J.42
    a = 10

    if a >= 3:
        print("a is a BIG number")
    elif a > -3 and a < 3:
        print("a is a tiny number")
    elif a <= -3:
        print("a is a BIG but negative number")
Exercise J.43 Use an if-type statement to modify the code from Example J.31 so
that 1’th, 2’th and 3’th are replaced by 1’st, 2’nd and 3’rd, respectively.
Example J.44 The following code does exactly the same as the one in Example J.38.
1  S = 0
2  for n in range(1,1000):
3      if n > 10:
4          break
5      S = S + 1/n
6  print(S)
Line 1: We initialise an integer S (that we will use to keep track of partial sums).
Line 2: Here we start out the for-loop. We choose range(1,1000) large to ensure
that the break command, further down, will have time to kick in.
Line 3: We are now inside of the for-loop. Here, we ask the if-command to check
if n > 10. If this is not the case, the code will skip the indented lines and continue
on line 5. If n > 10 is true, then the indented code on line 4 will be run.
Line 4: Here, the break command is activated. It means that the for-loop is
stopped. The code will continue on line 6.
Line 5: Here, the value of S is updated.
Remark J.45 Sometimes we need to put for-loops inside of for-loops (when computing
matrices, this may be the case). When this happens, the break command will only stop
the "inner-most" loop.
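The following sketch illustrates this behaviour with two nested for-loops; the list pairs and the ranges are our own choices. The break command stops the inner loop each time n exceeds 1, but the outer loop keeps running.

```python
pairs = []

for m in range(0, 3):          # outer loop
    for n in range(0, 10):     # inner loop
        if n > 1:
            break              # stops only the inner loop
        pairs.append((m, n))

print(pairs)  # prints [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
```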
(a) Write a code that combines a for-loop with the break command to check when
the first entry in the list is smaller than 10^{-4}.
(b) Do the same as in (a), but with a while-loop and no break command.
Example J.50 The following code makes no sense and will crash.
1  x = 3
2  def f(x):
3      y = x**2 - 5*x + 4
4      return y
5  print(y)
While the above explanation may seem to make perfect sense, it is actually sort of
misleading. To understand why, let us take a look at a second example.
Example J.51 The following code makes no sense to Python and will crash.
1  x = 3
2  def f(x):
3      y = x**2 - 5*x + 4
4      return y
5  f(x)
6  print(y)
Take a moment and think about what happens when this code runs. Hopefully, it
will confuse you since the error is rather subtle. Here is the explanation:
Lines 1 – 4: This is the same as the code in the previous example. For clarity,
we reiterate that the code in lines 2, 3, 4 is not run at this stage.
Lines 5, 6: In line 5, finally, the code in lines 2, 3, 4 is run. Since we put x = 3 in
line 1, it is run with the value x = 3. In line 3, y gets the value −2, and in line 4,
this value is returned. The program moves on to line 6, where it tries to print the
value of the variable y. But this variable has never been created, and so Python
crashes.
Wait, what? Surely, this makes no sense since the variable y was created in line 3
and given the value −2. Well, no: the point is that all variables created in lines 2, 3, 4
are local and only exist while these lines are run. After the code is done running line 4,
all of these local variables are deleted.
Maybe things become clearer if we rewrite the explanation of Lines 5, 6 as follows:
In technical terms, it is said that the code inside of the definition of a function
has its own local namespace which cannot be called upon in the code outside of the
function. This is done to "protect" the program from the code inside of the function.
To understand why this is necessary, let us look at the following example.
Example J.52 Thanks to functions having their own namespace, we can be sure that
the following code works:
    a = [1, 2, 3]; b = [4, 5, 6]
    c = sum(a)
    print(b)
The point is that we have no idea how the function sum(a) is coded. If the local
namespace was not kept separate from the (global) namespace, we could be so unlucky
that a variable called b is used inside of the code for the function. This would then
overwrite the contents of the variable b that we created before running sum(a). The
point of having a local namespace is to avoid this and to keep our variables safe from
harm. Yay!
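To make this concrete, here is a sketch with a hypothetical function my_sum, written by us as a stand-in for a function whose code we do not know. It happens to use a local variable called b, and thanks to the local namespace this does not disturb the variable b in the main program.

```python
b = [4, 5, 6]

def my_sum(numbers):
    # Hypothetical stand-in for a built-in function. The variable b
    # below is local and exists only while the function is running.
    b = 0
    for x in numbers:
        b = b + x
    return b

c = my_sum([1, 2, 3])

print(c)   # the returned value, 6
print(b)   # the list [4, 5, 6] is unchanged
```

The only information that leaves the function is the value handed back by return, which is why we store it in c.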
10.4. FUNCTIONS IN PYTHON D-23
Example J.53 (plotting with datatype list) The following code plots the graph of
f (x) = x2 + 2x + 3 over the interval [0,1].
 1  import matplotlib.pyplot as plt
 2
 3  def f(x):
 4      y = x**2 + 2*x + 3
 5      return y
 6
 7  X = [n/10 for n in range(0,11)]
 8  Y = [f(x) for x in X]
 9
10  plt.plot(X,Y)
Fig. 3. To the left, we see the output of the above code. To the right, we see the
output if we replace the line plt.plot(X,Y) with plt.plot(X,Y,"bo").
Line 10: The command plt.plot(X,Y) is very similar to the one appearing when
we plotted the sequence in Example J.28. It takes two lists, in this case X =
[x0 , x1 , . . .] and Y = [y0 , y1 , . . .] as input. It then draws a straight (blue) line from
the point (x0 , y0 ) to (x1 ,y1 ). And then from (x1 ,y1 ) to (x2 ,y2 ) and so on until it
runs out of points. If the lists X and Y are not of equal length, the program gets
confused and crashes. If we add the option "bo" to the command plt.plot(X,Y),
we tell Python not to draw lines, and instead just put blue dots, as we have done
before (and as is shown above)!
Fig. 4. We are never really plotting the graph of a function f based on all its values
on an interval [a,b]. In reality, we only check the y-values for certain x-values and
then ask Python to connect the dots.
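To get a feeling for what "connecting the dots" costs us, here is a small sketch (without any plotting) that reuses the function f and the lists X and Y from Example J.53. It compares the height of the straight line segment halfway between the first two dots with the height of the true graph at the same x-value.

```python
def f(x):
    return x ** 2 + 2 * x + 3

X = [n / 10 for n in range(0, 11)]   # the x-values that are actually checked
Y = [f(x) for x in X]

# Height of the straight line segment halfway between the first two dots.
line_value = (Y[0] + Y[1]) / 2
# Height of the true graph at the same x-value.
true_value = f((X[0] + X[1]) / 2)

gap = line_value - true_value
print(gap)   # small, but not zero
```

With 11 points the difference is too small to be seen in the plot, but it grows if we use fewer points.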
Exercise J.54 Plot the functions you implemented in exercise J.48 on [−2,2].
Exercise J.55 Consider the following code.
We now give an alternative, and rather elegant, way to plot functions. The price to
pay is that we need to introduce a new datatype commonly referred to as numpy arrays
(strictly speaking, Python calls them numpy.ndarray’s, but we will ignore this).
In a sense, numpy arrays are just like lists, but with the restriction that they can
only consist of numbers (lists can also consist of, say, strings of text). But this means
that it makes sense for numpy arrays to have additional features related to numbers.
Example J.56 (features of the datatype numpy array) The following code is
meant to illustrate some of the things we can do with numpy arrays.
1 import numpy as np
2
3 def f(x):
4     return x ** 2
5
6 a = [1,2,3,4]; b = [1,2,5,3]
7
8 A = np.array(a) # Here we convert the lists a and b into numpy
9 B = np.array(b) # arrays. Usually, arrays are expressed like lists
10 # but without commas. E.g., now A = [1 2 3 4].
11
12 C = A*B # This results in C = [1 4 15 12].
13 D = A+B # This results in D = [2 4 8 7].
14 E = A/B # This results in ... well, it divides the two arrays,
15 # entry by entry :-)
16
17 F = np.zeros(3) # Creates the array [0 0 0]
18 G = np.ones(4) # Creates the array [1 1 1 1]
19 H = np.arange(4) # Creates the array [0 1 2 3]
20 I = np.linspace(0,1,5) # Creates the array [0 0.25 0.5 0.75 1]
21
22 J = I.tolist() # Converts the numpy array I to a list J.
23 K = f(A) # This results in K = [1 4 9 16]
Let us make the following comments with respect to the above example:
Line 1: Before we can use numpy arrays we have to import the package numpy.
Lines 8, 9, 22: We can always convert a list (with only numerical entries) to a
numpy array, and vice versa. When creating a numpy array, normally we would
just write, say, A = np.array([1,2,3,4]).
D-26 CHAPTER 10. A CRASH COURSE IN PYTHON
Example J.57 The following code gives exactly the same output as the code in Example
J.53.
import matplotlib.pyplot as plt
import numpy as np

def f(x):
    y = x ** 2 + 2 * x + 3
    return y

X = np.linspace(0,1,11)
Y = f(X)

plt.plot(X,Y)
Exercise J.58 Try to use numpy arrays to plot the functions from exercise J.48. For
one of them, the code will not work. Can you imagine why?
Remark J.59 (Built-in functions in Python) Here is a list of some of the functions
that are built into Python.
• abs(x) – computes the absolute value of x.
• complex(a,b) – returns the complex number a + ib.
• float(x) – converts an integer to a float.
• int(x) – converts a float to an integer by cutting off the decimals (this rounds towards zero, so int(-2.7) gives -2).
• round(a,n) – rounds the floating point number a to n digits after the decimal point.
• type(x) – returns the datatype of the variable x.
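Here is a short sketch that tries out each of the built-in functions from the list. The input values are just made up for illustration.

```python
print(abs(-3.5))          # 3.5
print(complex(1, 2))      # (1+2j)
print(float(3))           # 3.0
print(int(2.7))           # 2
print(int(-2.7))          # -2, the decimals are simply cut off
print(round(3.14159, 2))  # 3.14
print(type(3.0))          # <class 'float'>
```

Notice in particular what int does with negative numbers: it cuts off the decimals, which is not the same as rounding down.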
Additional functions can be imported from packages. For instance, here is a list of
some functions from the numpy (numerical Python) package:
Remark J.60 (Functions in the numpy package) To use these functions, you need
to start your code with import numpy as np. You now have access to the following
functions:
• np.exp(x) – the exponential function
• np.log(x) – the natural logarithm
• np.log2(x) – the logarithm with base 2
• np.log10(x) – the logarithm with base 10
• np.sin(x) – the sine function (radians)
• np.cos(x) – the cosine function (radians)
• np.tan(x) – the tangent function (radians)
• np.arcsin(x) – the arcsine
• np.arccos(x) – the arccosine
• np.arctan(x) – the arctangent
• np.pi – gives π
• np.e – gives e
For more, see https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html
Example J.61
import numpy as np
y = np.sin(np.pi) # Computes sin(pi).
print(y) # Prints the result on screen.
There are also other packages with even more functions. We can mention the math
and scipy (scientific python) packages. However, since we are quite happy with what we
have mentioned above, we will skip these. (In particular, the numpy package essentially
makes the math package obsolete, especially when working with numpy arrays.)
10.5. HOW NUMBERS ARE REPRESENTED IN PYTHON D-29
Example J.62 At the start of Chapter ??, we considered the infinite series
\[ \sum_{k=0}^{\infty} \frac{1}{2^k} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots. \]
Letting $S_n$ denote the n’th partial sum of this infinite series, the point of exercise J.??
was for you to notice that
\[ 2 - S_n = \frac{1}{2^n}. \]
In particular, for all $n \in \mathbb{N}$, we have $S_n \neq 2$.
But here is the thing: if we ask Python to compute, say, $S_{100}$, then Python will
give the output $S_{100} = 2$. But this is wrong! While $S_{100}$ is close to 2, it is not
equal to 2. Yikes!
Exercise J.63 Does Python really mean that $S_{100} = 2$, or is there something else
going on? One way to double check this is to ask Python to compute $1/(2 - S_{100})$.
What happens?
So, what is going on here? Well, the point is that since there is a limit to how much
memory Python is willing to use to represent a number, there is also a limit to how
precisely Python will represent that number. This inevitably leads to so-called round-off
errors when working with computers, and it is vital that we have some understanding of
why they occur.
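The following sketch makes the round-off error visible. It computes the partial sum with index 100 twice: once with ordinary floating point numbers, and once with exact fractions. The use of Fraction from Python's fractions package is our own choice for this illustration, as are the variable names.

```python
from fractions import Fraction

S_float = 0.0          # partial sum computed with floating point numbers
S_exact = Fraction(0)  # partial sum computed with exact fractions

for k in range(0, 101):
    S_float = S_float + 1 / 2 ** k
    S_exact = S_exact + Fraction(1, 2 ** k)

print(S_float == 2)                        # True: the float has been rounded to 2
print(2 - S_exact == Fraction(1, 2 ** 100))  # True: the exact sum is not 2
```

The exact computation agrees with the formula for the partial sums, while the floating point computation has lost the tiny difference along the way.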
Fig. 5. On the most fundamental level, the memory of the computer is described in
terms of bits and bytes. A bit can be either 0 or 1, while a byte is a string of 8 bits.
Modern computers normally set aside 64 bits for each integer.
That is, Python sets aside a certain amount of memory (depending on the type of
computer you run it on), and uses it to store your integer. By giving it a name such as
myNumber, we know how to access this part of the memory, and by giving it a datatype,
Python knows how to interpret the string of 0’s and 1’s located there. Integers are usually
stored using variables of the datatype int (short for integer).
Remark J.64 Inside the circuits of a computer, the 1’s are represented by a short pulse
of electricity, while the 0’s are represented by the lack of such a pulse. Now, it would
make sense to design the circuitry of a computer so that every integer between 0 and
9 was represented. Indeed, one could model the integers by using varying intensities of
the pulse. However, a reason for just working with 0’s and 1’s is that this reduces the
chance that some disturbance will make the computer mistake one value for another.
This leads to the following mathematical question: how to represent all integers
using only 0’s and 1’s? The thing to realise is that this is not so different from the
following question: how to represent all integers using only strings of digits from the list
{0,1,2,3, . . . , 9}?
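Example J.65 (Decimal and binary notation) So what do we really mean by the
integer 4132? Well, this:
\begin{align*}
4132 &= 4 \cdot 1000 + 1 \cdot 100 + 3 \cdot 10 + 2 \cdot 1 \\
&= 4 \cdot 10^3 + 1 \cdot 10^2 + 3 \cdot 10^1 + 2 \cdot 10^0.
\end{align*}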
(Here, we included the second line to emphasise how the powers of 10 occur in this
expression.) In fact, we are so used to thinking about integers as decimal numbers that
it is completely obvious for us that we can represent all numbers in this way.
But how to represent integers just using a string of 0’s and 1’s? For instance, what
number should the string, say, 1101 represent? Well, here is the basic idea:
\begin{align*}
1101 &= 1 \cdot 8 + 1 \cdot 4 + 0 \cdot 2 + 1 \cdot 1 \\
&= 1 \cdot 2^3 + 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0.
\end{align*}
When 1101 is interpreted in this way, we call it a binary number. (Again, we include
the second line to emphasise how the powers of 2 occur in this expression.)
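The basic idea is easy to turn into code. In the following sketch, the function binary_to_integer is our own hypothetical helper that follows the computation above digit by digit. Python's built-in int can do the same conversion if we tell it to use base 2.

```python
def binary_to_integer(bits):
    # bits is a string of 0's and 1's, for instance "1101".
    value = 0
    power = len(bits) - 1   # the left-most digit belongs to the highest power of 2
    for digit in bits:
        value = value + int(digit) * 2 ** power
        power = power - 1
    return value

print(binary_to_integer("1101"))   # 13
print(int("1101", 2))              # 13, the built-in conversion agrees
```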
Let us now consider two questions: 1) why is the basic idea shown in the above
example actually quite reasonable, and 2) how to know if the number 1101 is supposed
to be interpreted in the above sense (i.e., as a binary number) and not as the decimal
number one-thousand-one-hundred-and-one?
To answer the second question first: when it is not clear if we are talking about binary
or decimal representations of numbers, we can use the notations $(1101)_2$ and $(1101)_{10}$
to indicate that we mean binary or decimal notation, respectively.
But what about the first question? Well, using the idea shown above, here is how
counting in binary works:
Notice that this is exactly how counting with two digits should work! Every time we run
out of digits, we start over by including an extra zero and carrying over a one. Indeed,
this is what happens when you count using ten digits and want to count past 9 or, for
that matter, past 19. The thing with counting in binary is that this happens a lot!
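If you want to watch Python count in binary, the following sketch prints the integers from 0 to 8 together with their binary representations. It uses the built-in function format with the option "b" (for binary). The list name counting is just our choice.

```python
for n in range(0, 9):
    print(n, format(n, "b"))

# The same binary strings collected in a list.
counting = [format(n, "b") for n in range(0, 9)]
```

Notice how an extra digit appears at 2, 4 and 8, that is, every time we run out of digits.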
Exercise J.66 Check that the above list is correct, and continue the list to 20.
Fig. 6. The first 7 bits (from the right) combine to form the binary representation of
the integer. The left-most bit tells us if the binary number is positive or negative.
Exercise J.68 (a) What is the largest integer you can represent as an 8 bit integer?
(b) Modern computers use 64 bit integers, where 1 bit is used for the sign and 63 for
the number itself. What is the largest integer you can represent using a 64 bit integer?
Remark J.69 (Two’s complement) Strictly speaking, our explanation for how integers
are represented is only correct for positive integers. For negative integers, it would
be kind of stupid to do exactly as we describe since then we would have two different
ways of representing the integer 0 (indeed, both the bytes 0000 0000 and 1000 0000 would
represent 0). Instead, an alternative strategy called two’s complement is used. We will
not explain it here (Wikipedia has a nice page on this), but it allows the computer to
represent one extra negative number, meaning that with 8 bit integers we can represent
every integer between −128 and 127. (And, perhaps more importantly, using two’s
complement allows the hardware to speed up integer arithmetic.)
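Without going into the details, here is a small sketch of what 8 bit two's complement looks like. It relies on the fact that, in this scheme, a negative integer n is stored using the same bits as the positive integer n + 256. The function name twos_complement_8bit is hypothetical.

```python
def twos_complement_8bit(n):
    # Returns the 8 bit pattern for an integer n with -128 <= n <= 127.
    return format(n % 256, "08b")

print(twos_complement_8bit(127))    # 01111111
print(twos_complement_8bit(0))      # 00000000
print(twos_complement_8bit(-1))     # 11111111
print(twos_complement_8bit(-128))   # 10000000
```

In particular, there is only one way to write 0, and the left-most bit is 1 exactly for the negative integers.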
\[ \alpha \cdot 10^{\beta}, \]
where the fraction $\alpha$ is (roughly) a 16 digit integer (positive or negative) and the
exponent $\beta$ is (roughly) an integer between −340 and +292.
Here, we use the words essentially and roughly since we lose a little bit in the trans-
lation from binary to decimal numbers. However, before formulating a more correct
description of floating point numbers in binary language, let us try to get some intuition
from the sloppy definition. To this end, we consider the following example.
Example J.71 According to the sloppy description, how is the number $\sqrt{2}$ =
1.414213562373095048801688724209698078569... stored? Well, roughly as
\[ 1414213562373095 \cdot 10^{-15}. \]
That is, the computer stores up to 16 digits in $\alpha$ (starting with the first non-zero
digit from the left), and the position of the decimal point in $\beta$. In particular, since a
lot of information is thrown away, this means that we get a round-off error!
Exercise J.72 According to the sloppy definition, (a) how far is it between the floating
point representation of $\sqrt{2}$ and its closest floating point "neighbour"? (b) How
far is it between x = 0 and the next floating point number?
Fig. 7. As indicated by the above exercise, the floating point line is not a continuous
line, instead it consists of many points with some short distance between them.
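We can also ask Python directly about the gaps on the floating point line. The following sketch assumes Python 3.9 or newer, where the math package offers the function nextafter, which returns the neighbouring floating point number in a given direction. (Here we make an exception and use the math package, since it works with the actual 64 bit floats rather than with the sloppy description.)

```python
import math

x = math.sqrt(2)
neighbour = math.nextafter(x, 2.0)   # the next float after x in the direction of 2
print(x)
print(neighbour)
print(neighbour - x)                 # the gap between the two neighbours

smallest = math.nextafter(0.0, 1.0)  # the next float after 0 in the direction of 1
print(smallest)
```

Compare the two gaps that are printed: the floating point numbers are packed much more densely close to the origin.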
Exercise J.74 Use the sloppy description of float numbers to do the following:
(a) Explain how large N has to be for the computer to think that $1 + 2^{-N} = 1$.
(b) How large does N have to be for the computer to think that $2^{-N} = 0$?
Now, notice the following. According to our sloppy description, above, it makes no
sense that the number 1/10 = 0.1 cannot be represented accurately as a floating point
number. Indeed, the number 1/10 ought to have the simple representation
\[ 1 \cdot 10^{-1}. \]
So what is going on? Well, to explain this, we need a more accurate description of
floating point numbers. As a first step, we need to understand how binary notation
works for non-integers.
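Before we do so, here is a quick sketch confirming that Python really does not store 1/10 exactly. The packages decimal and fractions are our own choice of tools for this illustration. They let us look at the exact value hiding behind the float 0.1.

```python
from decimal import Decimal
from fractions import Fraction

print(Decimal(0.1))                       # the exact value stored for 0.1
print(Fraction(0.1) == Fraction(1, 10))   # False
print(0.1 + 0.2 == 0.3)                   # False
```

The first line prints a number that starts with 0.1000000000000000055..., so the stored value is close to, but not equal to, 1/10.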
Example J.75 (binary notation for non-integers) The way to represent non-integers
in binary notation is essentially completely analogous to how we do this for
decimal numbers. Indeed, compare
\[ (643.57)_{10} = 6 \cdot 10^2 + 4 \cdot 10^1 + 3 \cdot 10^0 + 5 \cdot \frac{1}{10^1} + 7 \cdot \frac{1}{10^2} \]
and
\[ (101.01)_2 = 1 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0 + 0 \cdot \frac{1}{2^1} + 1 \cdot \frac{1}{2^2}. \]
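The binary expansion above can be evaluated by a few lines of code. In the following sketch, binary_to_number is a hypothetical helper of our own that treats the digits before and after the "binary comma" separately, exactly as in the formula.

```python
def binary_to_number(bits):
    # bits is a string such as "101.01" (digits before and after the point).
    whole, _, fractional = bits.partition(".")
    value = 0
    for position, digit in enumerate(reversed(whole)):
        value = value + int(digit) * 2 ** position    # 2^0, 2^1, 2^2, ...
    for position, digit in enumerate(fractional, start=1):
        value = value + int(digit) / 2 ** position    # 1/2^1, 1/2^2, ...
    return value

print(binary_to_number("101.01"))   # 5.25
```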
Exercise J.76 As a small taste of binary arithmetic, figure out both the decimal and
binary representations of the numbers
\[ (1.\alpha)_2 \cdot 2^{\beta}, \]
where the fraction $\alpha$ is a string of 52 bits, and the exponent $\beta$ is an 11 bit integer.
The remaining bit is used to store the sign of the floating point number.
When the exponent $\beta$ is the smallest possible, then $(1.\alpha)_2$ is replaced by
$(0.\alpha)_2$. (This is done to offer additional accuracy close to the origin.)
Here is a visual representation of the memory used for a 64 bit floating point number:
Fig. 8. Keep in mind that out of the 11 bits used for the exponent, one of them is
used to denote the sign. In addition, it is not completely accurate to think of the
52 bits used for the fraction as a binary integer (see example below).
Since the notation used in the above description may be a bit confusing, let us consider
an example.
Example J.78 How to store the number 1/10 as a floating point number? In order not to
have to write strings of 53 bits, let us pretend that we are working with 16 bit floating
point numbers (so-called half-precision floats).
Fig. 9. 16 bit floats are exactly like 64 bit floats, except that fewer bits are available
for the fraction and exponent.
First, expressing 1/10 in binary form (see exercises below for how to do this), we see
that
\[ \left(\frac{1}{10}\right)_{10} = (0.00011001100110011\ldots)_2, \]
where the pattern keeps repeating. That is, the binary expansion of 1/10 is not finite!
This means that to store it as a 16 bit (or 64 bit) floating point number, we are forced
into making a round-off error! Indeed, here is the representation of 1/10 as a floating
point number:
\[ 1.\underbrace{100110011}_{=\alpha} \cdot 2^{-(100)_2}, \]
and here is exactly how this would look in the memory of the 16 bit computer:
Fig. 10. Note that since the fraction appears on the right-hand side of a "binary
comma", its right-most zeroes can be ignored.
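The fraction found in the above example can be checked in Python. The following sketch makes two assumptions that are not part of the text above: it uses the method hex of a float, which prints a 64 bit float in the form 1.α · 2^β with α written in base 16, and it uses the package struct (with the format "e", available since Python 3.6) to store 1/10 as a standard half-precision float, which has 10 bits for the fraction.

```python
import struct

# 64 bit float: the digits after "0x1." are alpha in base 16 (each 9 is 1001),
# and the number after the letter p is the exponent beta.
print((0.1).hex())                      # 0x1.999999999999ap-4

# 16 bit float: store 0.1 in half precision and read back its 16 bits.
bits = struct.unpack("<H", struct.pack("<e", 0.1))[0]
fraction = format(bits & 0b1111111111, "010b")   # the last 10 bits hold alpha
print(fraction)                         # 1001100110
```

Ignoring the right-most zero, the 10 fraction bits agree with the α found above, and in both cases the exponent is −4, which is $-(100)_2$.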
Exercise J.79 Translate the "accurate description" of 64 bit floating point numbers
to decimal notation to obtain the "sloppy description" at the start of this section. In
particular, you need to take into account the added accuracy close to the origin.
Exercise J.80 (Challenging) Explain what are the only fractions that can be rep-
resented without round-off error as 64 bit floating point numbers.
J.9 (a) 1 + 1/2 + 1/4 + 1/8 + 1/16, (b)
\[ \cfrac{1}{1+\cfrac{1}{1+\cfrac{1}{1+\cfrac{1}{1+1}}}} \]
J.11 The code in (a) will run.
J.36 The code computes the partial sum S99999 of the harmonic series. The speed of
the computations will depend on your processor.
J.40 There are many ways to write this code. Here is one:
partial_sum = 0   # We avoid the name sum, since it is taken by Python's own function.
k = 0
while abs(partial_sum - 2) >= 1/10000:
    partial_sum = partial_sum + 1/2 ** k
    k = k + 1   # Keep in mind that in a while-loop,
                # we must update k manually.
print(k-1)
J.46 (a)
k = 0
for x in myList:
    if x < 10**(-4):
        break
    k = k+1
print(k)
(b)
k = 0
while myList[k] > 10**(-4):
    k = k+1
print(k)
10.6. ANSWERS TO SELECTED EXERCISES D-37
J.48 Below, notice that we should not call the absolute value function abs, since this name
is already taken (for Python’s own version of the absolute value function). (a)
def absolute1(x):
    if x >= 0:
        return x
    else:
        return -x
(b)
def absolute2(x):
    return (x ** 2) ** (1/2)
J.49
def product(x):
    temp_prod = 1
    for a in x:
        temp_prod = temp_prod * a   # multiply by the entry a, not by the whole list x
    return temp_prod
J.58 This will not work for absolute1 since the if-statement in the definition of the
function does not make sense if x is a list or a numpy array. For absolute2, the following
code will work:
import matplotlib.pyplot as plt
import numpy as np
# insert code for the definition of absolute2
X = np.linspace(-2,2,40)
Y = absolute2(X)
plt.plot(X,Y)
J.63 Python crashes and returns the error message: "ZeroDivisionError: division by
zero". In other words, Python really believes that $S_{100} = 2$.
J.66 Here are the first twenty numbers in both binary and decimal notation:
J.68 (a) $\sum_{n=0}^{6} 2^n = 2^7 - 1 = 127$, (b) $\sum_{n=0}^{62} 2^n$.