
F17XA Main


F17XA
Mathematics for Engineers and Scientists 1
Course Notes

Name:
For use of students of the Heriot-Watt University enrolled in the subject F17XA/XE.
• Version 1: Compiled by Jack Carr and Des Johnston, July 2013.
• Version 2: Major updates by Thomas Wong, April, 2020.
F17XA Contents

Contents
1 Formulae
    1.1 Working with Formulae
    1.2 Units
    1.3 Typesetting Mathematics
    1.4 Algebraic Fractions and Partial Fractions
2 Logarithms and Exponentials
    2.1 Indices and Exponents
    2.2 Exponential Functions
    2.3 Logarithms
    2.4 Hyperbolic Functions
3 Application of Linear and Log Functions
    3.1 Equation of a Straight Line
    3.2 Experimental Data
    3.3 Log-Log and Log-Linear Laws
    3.4 Interpolation and Extrapolation
4 Introduction to Differentiation
    4.1 Introduction
    4.2 Derivatives
    4.3 Product Rule
    4.4 Quotient Rule
    4.5 Chain Rule
    4.6 Higher Order Derivatives
    4.7 Implicit Differentiation
    4.8 Parametric Differentiation
    4.9 Curve Sketching
    4.10 Related Rates
5 Introduction to Integration
    5.1 Antidifferentiation (Indefinite Integration)
    5.2 Area Under Curves
    5.3 Improper Integrals
6 Statistics and Probability
    6.1 Statistics
    6.2 Measures of Average
    6.3 Standard Deviation
    6.4 Frequency Tables
    6.5 Probability Theory
    6.6 Probability Rules
    6.7 Tree Diagrams
    6.8 Binomial Distribution
    6.9 Continuous Distribution
    6.10 Normal Distribution
7 Vectors
    7.1 Basic Operations
    7.2 Components
    7.3 Scalar Product
    7.4 Vector Product
    7.5 Scalar Triple Product
    7.6 Equation of a Line
    7.7 Equation of a Plane
A Formula Sheet
B Standard Normal Distribution Table

F17XA 1 Formulae

1 Formulae
1.1 Working with Formulae
When we want to describe or calculate various phenomena in science and engineering, we often
employ mathematical formulae. We need to be able to extract the information we want from
a given formula by re-arranging it if necessary. Let us look at some examples of formulae from
various subjects:
• In physics, the period T of a pendulum of length L (with g the acceleration due to gravity):
\[ T = 2\pi \sqrt{\frac{L}{g}} \]

• In physics and engineering, the distance s travelled in a time t with a constant acceleration
a and initial velocity u:
\[ s = ut + \tfrac{1}{2} a t^2 \]
• To convey our lovely Scottish weather to our North American friends, we have to relate
temperature in Centigrade (C) to Fahrenheit (F ) by:

\[ F = 32 + \frac{9C}{5} \]

We take the formulae above as examples and attempt to change the subject of each formula.

Exercise 1.1
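Given that $T = 2\pi\sqrt{L/g}$, express L in terms of T and g.

Solution: Eliminate the square root by squaring both sides of the equation:
\[ T^2 = \left( 2\pi \sqrt{\frac{L}{g}} \right)^2 = 4\pi^2 \frac{L}{g} \]
Multiply both sides by $\frac{g}{4\pi^2}$:
\[ \frac{g}{4\pi^2} \times T^2 = \frac{g}{4\pi^2} \times 4\pi^2 \frac{L}{g} \]
and cancel to get
\[ L = \frac{g T^2}{4\pi^2} \]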


Exercise 1.2
A particle with initial velocity u and constant acceleration a travels a distance s in time t
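where $s = ut + \tfrac{1}{2}at^2$. Express u in terms of s, a and t.

Solution: Starting with the original equation, we rearrange (the operation applied to both sides is shown in brackets):
\begin{align*}
s &= ut + \tfrac{1}{2}at^2 \\
s - \tfrac{1}{2}at^2 &= ut && \left(-\tfrac{1}{2}at^2\right) \\
\frac{s - \tfrac{1}{2}at^2}{t} &= u && (\div t) \\
\frac{s}{t} - \frac{at}{2} &= u && \text{(simplify)}
\end{align*}
So we get $u = \frac{s}{t} - \frac{at}{2}$.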

Exercise 1.3
The formula for converting Centigrade (C) to Fahrenheit (F) is
\[ F = 32 + \frac{9C}{5} \]
Express C in terms of F.

Solution: Starting with the equation:
\begin{align*}
F &= 32 + \frac{9C}{5} \\
F - 32 &= \frac{9C}{5} && (-32) \\
\frac{5(F - 32)}{9} &= C && \left(\times \tfrac{5}{9}\right)
\end{align*}
So we get $C = \frac{5(F - 32)}{9}$.

Remark 1.4
When manipulating equations, we must always do the same thing to both sides of the
equation.
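A rearranged formula can be sanity-checked by a round trip: apply the original formula, then the rearranged one, and confirm that the starting value comes back. The sketch below does this for Exercises 1.1 and 1.3. The function names are illustrative, and the default value g = 9.81 is an assumption that does not appear in these notes.

```python
import math

def celsius_to_fahrenheit(c):
    """Original formula: F = 32 + 9C/5."""
    return 32 + 9 * c / 5

def fahrenheit_to_celsius(f):
    """Rearranged formula from Exercise 1.3: C = 5(F - 32)/9."""
    return 5 * (f - 32) / 9

def pendulum_period(length, g=9.81):
    """Original formula: T = 2*pi*sqrt(L/g). The value of g is an assumed default."""
    return 2 * math.pi * math.sqrt(length / g)

def pendulum_length(period, g=9.81):
    """Rearranged formula from Exercise 1.1: L = g*T^2 / (4*pi^2)."""
    return g * period**2 / (4 * math.pi**2)

# Round trips: each pair should return (approximately) the starting value.
print(fahrenheit_to_celsius(celsius_to_fahrenheit(20.0)))
print(pendulum_length(pendulum_period(1.5)))
```

Small differences in the last decimal places are floating-point rounding, not an error in the rearrangement.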


1.2 Units
We won’t emphasise it heavily in this subject, but most applications of mathematics in science
and engineering involve units. We also have to contend with prefixes such as milli, micro, and
nano. In addition, there is often a mixture of measurement systems such as feet/metres or
pounds/kilograms. Errors in converting units have had serious consequences:
• In 2001, the Los Angeles Zoo lent a tortoise to an Exotic Animal Training Centre. The L.A.
Zoo said that the tortoise was big, weighing 250. The centre thought the zoo meant
250 pounds and built an enclosure to match this. In fact the animal was a 75-year-old
Galapagos tortoise weighing 250 kilograms and it wrecked the enclosure on the first night
in its new home.
• An Air Canada flight was fuelled in Toronto using pounds instead of litres of fuel. The pilot
calculated how much he needed thinking he was getting his fuel in litres and ran out of fuel
about an hour into the flight. He landed the plane safely at a semi-abandoned
airstrip.
• In 2003, the Space Mountain roller coaster at Tokyo Disneyland derailed when an axle broke
just before the end of the ride. The axle fractured because it was smaller than the design’s
requirement; the gap between the bearing and the axle was over 1 mm when it should have
been 0.2 mm. The error was in converting measurements to the metric system.
• Many medication errors are due to conversion and calculation errors. In 1999, the Institute
for Safe Medication Practices reported an instance where a patient had received 0.5 grams
of Phenobarbital (a sedative) instead of 0.5 grains when the recommendation was misread.
A grain is a unit of measure equal to about 0.065 grams.
• In 1999 NASA lost a Mars orbiter because one team used metric units for a calculation and
the other team used imperial units. This loss cost over 125 million dollars.

1.3 Typesetting Mathematics


With the increased use of computer systems, it has become essential to understand how to interact
with these tools. To that end, we will provide a short guide on how to input mathematics.
Different computational tools have different input conventions. Pay close attention
to the specifications of each software package.

Definition 1.5 Basic operations


For most computer packages, use:
• Equal sign (=) to denote equality.
• Plus (+) for addition and dash (-) for subtraction.
• Asterisk (*) for multiplication and forward slash (/) for division.
• Caret (^) to denote exponents.
• The phrase sqrt to denote square roots. For example, $\sqrt{13}$ is written as sqrt(13).


Example 1.6
We can enter the expression
\[ x = \frac{4(3^2 - 1)}{1 + \sqrt{2 + \sqrt{2}}} \]
using x=(4*(3^2-1))/(1+sqrt(2+sqrt(2)))

Be aware of order of operations, use brackets appropriately, and pair off brackets appropriately.
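As an illustration of how conventions differ between tools, here is the expression from Example 1.6 entered in Python, which is only one possible package and is not prescribed by these notes. Python writes exponents as ** (its caret is a bitwise operator, not a power), and sqrt lives in the standard math module.

```python
import math

# Same bracket structure as x=(4*(3^2-1))/(1+sqrt(2+sqrt(2))),
# but with ** for the exponent and math.sqrt for the square roots.
x = (4 * (3**2 - 1)) / (1 + math.sqrt(2 + math.sqrt(2)))
print(x)
```

Typing 3^2 in Python does not raise an error; it silently computes something else, so the order-of-operations and bracket checks above are worth repeating whenever the package changes.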

Definition 1.7 Common functions


For most computer packages, we have the following common functions:
• $e^x$ is entered as exp(x); some packages also accept e^x.
• log(x) is entered as log(x) or ln(x).
• sin(x) is entered as sin(x)
• cos(x) is entered as cos(x)
• tan(x) is entered as tan(x)
• π is entered as pi or Pi

Be careful with log(x): some packages use log(x) = $\log_{10}(x)$, while others use the natural
logarithm. When using a new package, it is good practice to evaluate log(10) and log(e) to see
which logarithm is used.
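The check suggested above can be sketched in Python, chosen here only as an example package. In Python's standard math module, log with one argument is the natural logarithm and base 10 has its own function.

```python
import math

# If log were base 10, log(10) would be 1. Here it is about 2.303,
# so math.log is the natural logarithm.
print(math.log(10))
print(math.log(math.e))

# Base-10 logarithms are available under a separate name.
print(math.log10(10))
```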

Example 1.8
We can enter the expression
\[ x \log\left( \frac{3 + x}{\sqrt{x}} \right) \]
using x*log((3+x)/sqrt(x))

Example 1.9
We can enter the expression
\[ e^{1 - \sqrt{1 - 2x^2}} \]
using exp(1-sqrt(1-2*x^2))

Example 1.10
We can enter the expression
\[ \sin^2\left( \frac{\pi x - 4}{2 + x} \right) \]
using (sin((pi*x-4)/(2+x)))^2


1.4 Algebraic Fractions and Partial Fractions


In this section we continue with the theme of equations and look at how to manipulate algebraic
fractions. These are sometimes referred to as rational functions. First, we introduce a bit of
terminology. Polynomials are sums of constant multiples of non-negative integer powers of a
variable x. For example:

\[ x^3 - 3x^2 - 2x + 1 \quad \text{or} \quad -x^2 + 2x + 3 \]

An algebraic fraction is the ratio of two polynomials. If the highest power of x in the numerator
(top) is strictly lower than the highest power of x in the denominator, then the fraction is proper;
otherwise it is improper.

Example 1.11
• Examples of proper algebraic fractions include:
\[ \frac{1}{x}, \quad \frac{x}{1 - x^2}, \quad \frac{3 - x}{x^4 + 3x^2 + 2x + 1} \]

• Examples of improper algebraic fractions include:
\[ \frac{x^7}{1 - x}, \quad \frac{x^2}{1 - x^2}, \quad \frac{x^3 - 3x^2 - 2x + 1}{-x^2 + 2x + 3} \]

Often we are interested in either splitting up fractions or combining them. In particular, splitting
up algebraic fractions is useful for simplifying integrals and this method is usually called the
method of partial fractions.

Example 1.12
The expression $\frac{2}{x-1} + \frac{5}{x+4}$ can be combined into a single fraction by introducing the common
denominator (x − 1)(x + 4):

\[ \frac{2}{x - 1} + \frac{5}{x + 4} = \frac{2(x + 4) + 5(x - 1)}{(x - 1)(x + 4)} = \frac{7x + 3}{(x - 1)(x + 4)} \]

The method of partial fractions performs the reverse operation. First,
consider the case when the denominator of the fraction can be expressed as a product of two
linear factors and the numerator is linear.
We illustrate two common methods for finding partial fractions via the example below.

Exercise 1.13
Express $\frac{7x+3}{(x-1)(x+4)}$ in terms of partial fractions.


Solution: We start by splitting up the fraction based on the factorisation of the denominator,
with unknowns A and B. Let
\[ \frac{7x + 3}{(x - 1)(x + 4)} = \frac{A}{x - 1} + \frac{B}{x + 4} \]

We have to find the values of A and B so the above is true for all x.
Writing the right-hand side over a common denominator:

\[ \frac{7x + 3}{(x - 1)(x + 4)} = \frac{A(x + 4)}{(x - 1)(x + 4)} + \frac{B(x - 1)}{(x - 1)(x + 4)} = \frac{A(x + 4) + B(x - 1)}{(x - 1)(x + 4)} \]

Equate numerators to get

7x + 3 = A(x + 4) + B(x − 1)

Since this equation is true for all x, we choose values of x to make the term involving A
or the term involving B disappear.
• Set x = 1. Then 7 × 1 + 3 = A(1 + 4) + 0. Thus 10 = 5A and A = 2.
• Set x = −4 then 7 × (−4) + 3 = 0 + B(−4 − 1). Thus −25 = −5B and B = 5.
So we get
\[ \frac{7x + 3}{(x - 1)(x + 4)} = \frac{2}{x - 1} + \frac{5}{x + 4} \]

Below is an alternative method for finding partial fractions.

Solution: We start by splitting up the fraction as before.

\[ \frac{7x + 3}{(x - 1)(x + 4)} = \frac{A}{x - 1} + \frac{B}{x + 4} \]

After clearing denominators and equating numerators, we get:

7x + 3 = A(x + 4) + B(x − 1)

From here, we expand both sides and factor by powers of x.

7x + 3 = Ax + 4A + Bx − B
= (A + B)x + (4A − B)

Since this is true for all values of x, the coefficients of each power of x on both
sides must be identical. This gives a set of equations:

\begin{align*}
[x^1]: &\quad 7 = A + B \\
[x^0]: &\quad 3 = 4A - B
\end{align*}

Solving this set of simultaneous equations gives A = 2 and B = 5. So we recover the
same result as before:
\[ \frac{7x + 3}{(x - 1)(x + 4)} = \frac{2}{x - 1} + \frac{5}{x + 4} \]
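The first method of Exercise 1.13 (substituting values of x that make one term vanish) can be sketched in code for a denominator with two distinct linear factors (x − r1)(x − r2). The helper name cover_up and its argument order are illustrative choices, not something defined in these notes. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def cover_up(m, n, r1, r2):
    """Return (A, B) with (m*x + n)/((x - r1)(x - r2)) = A/(x - r1) + B/(x - r2).

    Requires r1 != r2. Equating numerators gives m*x + n = A(x - r2) + B(x - r1);
    setting x = r1 removes the B term and setting x = r2 removes the A term.
    """
    r1, r2 = Fraction(r1), Fraction(r2)
    a = (m * r1 + n) / (r1 - r2)
    b = (m * r2 + n) / (r2 - r1)
    return a, b

# Exercise 1.13: (7x + 3)/((x - 1)(x + 4)), so r1 = 1 and r2 = -4.
A, B = cover_up(7, 3, 1, -4)
print(A, B)

# Spot check: both sides agree at a value of x away from the roots.
x = Fraction(5, 2)
print((7 * x + 3) / ((x - 1) * (x + 4)) == A / (x - 1) + B / (x + 4))
```

A denominator factor such as 2x + 1 does not fit this helper directly; it would first need rewriting as 2(x + 1/2).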

Exercise 1.14
Express $\frac{2x-1}{(2x+1)(x-3)}$ in terms of partial fractions.

Solution: Let
\[ \frac{2x - 1}{(2x + 1)(x - 3)} = \frac{A}{2x + 1} + \frac{B}{x - 3} \]
Combine the fractions on the right and equate numerators:

\[ 2x - 1 = A(x - 3) + B(2x + 1) \]

To obtain A and B:
• Set x = 3 to get 6 − 1 = 0 + B(6 + 1). Hence 5 = 7B, and so $B = \frac{5}{7}$.
• Set $x = -\frac{1}{2}$ to get $-1 - 1 = A\left(-\frac{1}{2} - 3\right)$. Hence $-2 = -\frac{7A}{2}$, and so $A = \frac{4}{7}$.
So we get
\[ \frac{2x - 1}{(2x + 1)(x - 3)} = \frac{4}{7(2x + 1)} + \frac{5}{7(x - 3)} \]

In the case where there is a repeated factor, such as

\[ \frac{10x + 18}{(2x + 3)^2} \]

we use a different form but follow the same procedure.

Exercise 1.15
Express $\frac{10x+18}{(2x+3)^2}$ in terms of partial fractions.

Solution: Let
\[ \frac{10x + 18}{(2x + 3)^2} = \frac{A}{2x + 3} + \frac{B}{(2x + 3)^2} \]
Combine fractions and equate numerators to obtain

\[ 10x + 18 = A(2x + 3) + B \]

Equating the coefficients of x on both sides gives 10 = 2A and hence A = 5. Equating
the constants gives 18 = 3A + B, so B = 3. So we get
\[ \frac{10x + 18}{(2x + 3)^2} = \frac{5}{2x + 3} + \frac{3}{(2x + 3)^2} \]

Some common partial fraction expressions are listed below.

Example 1.16
For numbers l, m, n, p, q, r, s, t and unknowns A, B, C, we have the following partial fraction
expressions:
\begin{align*}
\frac{mx + n}{(px + q)(rx + s)} &= \frac{A}{px + q} + \frac{B}{rx + s} \\
\frac{mx + n}{(px + q)^2} &= \frac{A}{px + q} + \frac{B}{(px + q)^2} \\
\frac{lx^2 + mx + n}{(px + q)(rx^2 + sx + t)} &= \frac{A}{px + q} + \frac{Bx + C}{rx^2 + sx + t} \\
\frac{lx^2 + mx + n}{(px + q)^2 (rx + s)} &= \frac{A}{px + q} + \frac{B}{(px + q)^2} + \frac{C}{rx + s}
\end{align*}

Exercise 1.17
Express $\frac{9x+3}{(x-1)(x^2+x+10)}$ in terms of partial fractions.

Solution: Let
\[ \frac{9x + 3}{(x - 1)(x^2 + x + 10)} = \frac{A}{x - 1} + \frac{Bx + C}{x^2 + x + 10} \]
Combining the fractions on the right and then equating numerators gives:

\[ 9x + 3 = (Bx + C)(x - 1) + A(x^2 + x + 10) \]

One way to solve for the unknowns is:

• Set x = 1 to get 12 = 12A, so A = 1.
• Expand and equate the coefficients of $x^2$ to get 0 = A + B, so B = −1.
• Set x = 0 to get 3 = −C + 10A, so C = 7.

Hence:
\[ \frac{9x + 3}{(x - 1)(x^2 + x + 10)} = \frac{1}{x - 1} + \frac{7 - x}{x^2 + x + 10} \]


In general, we can express a proper [1] algebraic fraction in terms of partial fractions. We illustrate
the general method via an example.

Example 1.18 (Advanced)

Suppose we want to express the following expression in terms of partial fractions:

\[ \frac{4(x^2 + x + 1)}{(x + 1)^3 (x^2 + 1)^2} \]

1. First, factor the denominator if it isn’t already factored.

2. Write the algebraic fraction as a sum with one term for each factor in the denominator, including its power:

\[ \frac{4(x^2 + x + 1)}{(x + 1)^3 (x^2 + 1)^2} = \frac{A(x)}{(x + 1)^3} + \frac{B(x)}{(x^2 + 1)^2} \]

for some currently unknown functions A(x), B(x).

3. Expand each factor by its power to get the final form of the decomposition:

\[ \frac{4(x^2 + x + 1)}{(x + 1)^3 (x^2 + 1)^2} = \frac{A_1}{x + 1} + \frac{A_2}{(x + 1)^2} + \frac{A_3}{(x + 1)^3} + \frac{B_1 x + C_1}{x^2 + 1} + \frac{B_2 x + C_2}{(x^2 + 1)^2} \]

for constants $A_1, A_2, A_3, B_1, B_2, C_1, C_2$. The numerator of each term contains all powers
of x lower than the degree of the factor in the denominator (ignoring its power).

4. Use the methods described previously to solve for the unknown constants and substitute the
results to get:

\[ \frac{4(x^2 + x + 1)}{(x + 1)^3 (x^2 + 1)^2} = \frac{1}{x + 1} + \frac{1}{(x + 1)^2} + \frac{1}{(x + 1)^3} + \frac{-x}{x^2 + 1} + \frac{1 - x}{(x^2 + 1)^2} \]

[1] If the given algebraic fraction is improper, an additional step called polynomial long division is required before
we can apply partial fractions.
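A decomposition such as the one in Example 1.18 can be checked by evaluating both sides at several values of x. The sketch below uses exact fractions so that agreement means exact equality. The function names and the choice of sample points are illustrative, and a check at finitely many points supports the identity rather than proving it.

```python
from fractions import Fraction

def original(x):
    """Left-hand side: 4(x^2 + x + 1) / ((x + 1)^3 (x^2 + 1)^2)."""
    return 4 * (x**2 + x + 1) / ((x + 1)**3 * (x**2 + 1)**2)

def decomposed(x):
    """Right-hand side: the five partial fractions from step 4."""
    return (1 / (x + 1) + 1 / (x + 1)**2 + 1 / (x + 1)**3
            - x / (x**2 + 1) + (1 - x) / (x**2 + 1)**2)

# Sample points k/3 for k = -10..10, skipping x = -1 where both sides are undefined.
samples = [Fraction(k, 3) for k in range(-10, 11) if k != -3]
print(all(original(x) == decomposed(x) for x in samples))
```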

F17XA 2 Logarithms and Exponentials

2 Logarithms and Exponentials


2.1 Indices and Exponents
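Indices are a notation used to represent repeated multiplication of the same quantity:

\[ a^2 = a \times a \qquad b^3 = b \times b \times b \qquad c^1 = c \]

In an expression such as $2^6$, the base is 2 and the index or power is 6.

A negative power is the reciprocal of the corresponding positive power:
\[ 3^{-2} = \frac{1}{3^2} = \frac{1}{9} \qquad a^{-1} = \frac{1}{a} \qquad b^{-4} = \frac{1}{b \times b \times b \times b} = \frac{1}{b^4} \]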

Theorem 2.1 Rules of indices


For real numbers m, n and a > 0, we have the following rules:
\[ a^m \times a^n = a^{m+n} \qquad \frac{a^m}{a^n} = a^{m-n} \qquad (a^m)^n = a^{mn} \qquad a^0 = 1 \]

In the case where a < 0, there are values of m, n for which $a^m$ and $a^n$ are not well-defined as real numbers.
For example, $(-1)^{3/2}$ is not well-defined.
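The rules of indices can be spot-checked numerically. The sketch below uses arbitrary sample values with a > 0 (the particular numbers are assumptions chosen for illustration) and then shows how Python's math.pow refuses a negative base with a fractional power.

```python
import math

a, m, n = 2.5, 1.5, -0.75   # arbitrary sample values with a > 0

print(math.isclose(a**m * a**n, a**(m + n)))
print(math.isclose(a**m / a**n, a**(m - n)))
print(math.isclose((a**m)**n, a**(m * n)))
print(a**0 == 1)

# A negative base with a fractional power has no real value,
# so math.pow raises an error instead of returning a number.
try:
    math.pow(-1, 1.5)
except ValueError as err:
    print("math.pow(-1, 1.5) is not defined over the reals:", err)
```

Comparisons use math.isclose because floating-point arithmetic can differ in the last few digits even when the identity is exact.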

2.2 Exponential Functions


For a positive number a > 0, an exponential function is a function of the form $y = a^x$.

Example 2.2
In the case with a = 2, we have:

x       -2     -1     0     1     2     3     4
2^x     1/4    1/2    1     2     4     8     16

Figure 2.1: Graph of $y = 2^x$.

For reasons that will become obvious later, the base e ≈ 2.7182818284 is of special interest. The
number e is called Euler’s number. The exponential function $y = e^x$ is sometimes written as
exp(x).

Example 2.3
Values of the exponential function in base e:

x       -2      -1      0     1       2       3       4
e^x     0.14    0.37    1     2.72    7.39    20.1    54.6

Figure 2.2: Graph of $y = e^x$ or y = exp(x).

The function $y = e^x$ is called the exponential function. On a calculator, the exponential function
is usually written as $e^x$. The reader should use the $e^x$ button on a calculator to check the above
table. We can also write $e^x$ as exp(x). The alternative notation exp(x) is useful for complicated
expressions like $\exp(-5x^2 + 3x)$. One feature of $e^x$ is that it increases very rapidly with x; such
an increase has passed into common usage as “exponential growth”.
The usual laws of indices apply to $e^x$, so we have
\[ e^0 = 1 \qquad (e^x)^2 = e^{2x} \qquad e^x \cdot e^y = e^{x+y} \]

Now let us consider some examples of evaluating exponentials.

Exercise 2.4
Let $f(t) = 3\exp(-5t^2)$. Find f(0) and f(0.4).

Solution:
• For the first case, $f(0) = 3e^0 = 3$.
• When t = 0.4, $5t^2 = 0.8$. Using a calculator, $f(0.4) = 3e^{-0.8} \approx 1.35$.
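The calculator step in Exercise 2.4 can be reproduced in a few lines. This is a minimal sketch using Python's standard math module; the function name f mirrors the exercise.

```python
import math

def f(t):
    """f(t) = 3 exp(-5 t^2), as in Exercise 2.4."""
    return 3 * math.exp(-5 * t**2)

print(f(0))
print(round(f(0.4), 2))
```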


Exercise 2.5
Expand and simplify $(e^{4x} - e^{-4x})^2$.

Solution: Start by expanding:

\begin{align*}
(e^{4x} - e^{-4x})^2 &= (e^{4x})^2 - 2e^{4x}e^{-4x} + (e^{-4x})^2 \\
&= e^{8x} - 2 + e^{-8x}
\end{align*}

As we saw in Example 2.3, exp(x) is rapidly increasing. Similarly, exp(−x) is rapidly decreasing.
This remains true for expressions such as A exp(kx) (increasing for positive constants A and k)
and A exp(−kx) (decreasing for positive constants A and k). It is useful to be able to sketch
the behaviour of various exponential curves since they occur in so many physical problems. Let
us consider a couple of examples.

Exercise 2.6
Sketch the graph of $y = 6e^{-2t}$ for t ≥ 0.

Solution: The main points are:

• When t = 0, we have y = 6.
• For large values of t, we have y ≈ 0.
• y is rapidly decreasing (A = 6 and k = 2).
The graph is shown below.

Figure 2.3: Graph of $y = 6e^{-2t}$.

Exercise 2.7
Sketch the graph of $y = 7 - 4e^{-2t}$ for t ≥ 0.

Solution: The main points are:

• When t = 0, we have y = 7 − 4 = 3.
• For large values of t, we have y ≈ 7.
• y is increasing (A = −4 and k = 2).
The graph is shown below.

Figure 2.4: Graph of $y = 7 - 4e^{-2t}$.

Exponential functions often appear in physical problems where the rate of increase or decrease of
a quantity depends on the value of the quantity at that time. An archetypal example of this is
radioactive decay, where the quantity decreases exponentially with time.
A very similar calculation can be used to determine how air pressure falls off with height.

Exercise 2.8
Atmospheric pressure, when measured in atmospheres, varies according to the height (h)
above the Earth’s surface. The formula is

P (h) = exp(−Ah)

where A = 0.000016 (i.e. $1.6 \times 10^{-5}$), and h is measured in metres. At sea level h = 0
and P = 1 atmosphere.
Determine the air pressure at:
• The summit of Ben Nevis (h = 1344m)
• The summit of Everest (h = 8848m)
• Heriot-Watt Riccarton campus (h = 90m)


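Solution:
• The summit of Ben Nevis (h = 1344 m):

\begin{align*}
P(1344) &= \exp(-0.000016 \times 1344) \\
&= \exp(-0.021504) \\
&= 0.979
\end{align*}

• The summit of Everest (h = 8848 m):

\begin{align*}
P(8848) &= \exp(-0.000016 \times 8848) \\
&= 0.868
\end{align*}

• Riccarton campus (h = 90 m):

\begin{align*}
P(90) &= \exp(-0.000016 \times 90) \\
&= 0.999
\end{align*}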
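The three evaluations in Exercise 2.8 follow the same pattern, so they can be sketched as a single function applied to each height. The constant A and the heights come from the exercise; the function name pressure is an illustrative choice.

```python
import math

A = 0.000016  # constant from Exercise 2.8, per metre

def pressure(h):
    """Pressure in atmospheres at height h metres, P(h) = exp(-A h)."""
    return math.exp(-A * h)

for place, h in [("Ben Nevis", 1344), ("Everest", 8848), ("Riccarton campus", 90)]:
    print(f"{place}: {pressure(h):.3f} atm")
```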

Example 2.9
An electrical circuit is set up with a resistor R and capacitor C in series with a direct current
source V .
Figure 2.5: Electrical circuit


Initially the capacitor is uncharged. When we fully connect the circuit, the charge on the
capacitor varies with time. For our circuit, we have RC = 2 and the final charge is 3. The
charge on the capacitor is given by the formula:

Q(t) = 3[1 − exp(−t/2)]

Plotting Q(t) gives:

[Figure: graph of Q against t, showing the curve Q(t) = 3[1 − exp(−t/2)]]

2.3 Logarithms
The logarithm of a particular exponent can be thought of as “undoing” the exponential.

Definition 2.10 Logarithm


For positive numbers a and b, with a ≠ 1,

\[ a^x = b \quad \text{if and only if} \quad \log_a(b) = x. \]

The number x is called the logarithm of b to base a.

Example 2.11
We have
• $\log_{10}(100) = 2$ since $10^2 = 100$.
• $\log_2(128) = 7$ since $2^7 = 128$.
• $\log_e(e) = 1$ since $e^1 = e$.

Most calculators have two different logarithm functions: logarithm to base 10 and logarithm to
base e.
• If the base of the logarithm is a = 10, then we can omit the base and write $\log_{10}(b) = \log(b)$.
On a calculator, logarithms to base 10 are usually denoted by log. The reader should use a
calculator to check that

\[ \log(10) = 1 \qquad \log(2) = 0.301 \qquad \log(45) = 1.653 \qquad \log(0.4) = -0.398 \]

which is equivalent to saying:

\[ 10^1 = 10 \qquad 10^{0.301} = 2 \qquad 10^{1.653} = 45 \qquad 10^{-0.398} = 0.4 \]

• If the base of the logarithm is a = e, then we write $\log_e(b) = \ln(b)$. This is known as the
natural log. The reader should use the ln button on a calculator to check that

\[ \ln(2) = 0.693 \qquad \ln(10) = 2.303 \qquad \ln(0.3) = -1.204 \]

which is equivalent to saying:

\[ e^{0.693} = 2 \qquad e^{2.303} = 10 \qquad e^{-1.204} = 0.3 \]

Figure 2.6: Graph of y = log(x) and y = ln(x)

For any a > 0, we have $a^x > 0$ for all x. That means ln A (and log A) only make sense if A > 0.
Hence, an expression such as ln(3x − 15) will only make sense if 3x − 15 > 0, that is, if x > 5.
Further, since we have $a^0 = 1$ for any a > 0, we must have $\log_a 1 = 0$.

What does the graph of a logarithm look like? We can see that both ln(x) and log(x) look very
similar, crossing the horizontal axis at x = 1 since $10^0 = e^0 = 1$.
Below we give the laws of logarithms and demonstrate how to apply them.
Theorem 2.12 Laws of logarithms
For a base a > 0 (with a ≠ 1) and positive numbers B and C:

\begin{align*}
\log_a(BC) &= \log_a(B) + \log_a(C) \\
\log_a\left(\frac{B}{C}\right) &= \log_a(B) - \log_a(C) \\
\log_a(B^n) &= n \log_a(B) \quad \text{for any exponent } n
\end{align*}

Further, we have:

\[ \log_a(1) = 0 \qquad \log_a(a) = 1 \]

Recall from the definition:

\[ a^n = B \quad \text{if and only if} \quad \log_a(B) = n. \]


Exercise 2.13
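Reduce the following to a single logarithmic term:
1. log(x) + log(x − 3)
2. 4 log(2) + 2 log(3) − log(12)
3. $\ln(x^2 y) - \ln(y) + 3\ln(x)$

Solution:
1.
\begin{align*}
\log(x) + \log(x - 3) &= \log(x(x - 3)) \\
&= \log\left(x^2 - 3x\right)
\end{align*}

2.
\begin{align*}
4\log(2) + 2\log(3) - \log(12) &= \log\left(2^4\right) + \log\left(3^2\right) - \log(12) \\
&= \log(16) + \log(9) - \log(12) \\
&= \log\left(\frac{16 \times 9}{12}\right) \\
&= \log(12)
\end{align*}

3.
\begin{align*}
\ln\left(x^2 y\right) - \ln(y) + 3\ln(x) &= \ln\left(x^2 y\right) - \ln(y) + \ln\left(x^3\right) \\
&= \ln\left(\frac{x^2 y \times x^3}{y}\right) \\
&= \ln\left(x^5\right)
\end{align*}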
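The simplifications in parts 2 and 3 of Exercise 2.13 can be checked numerically. In the sketch below the values of x and y are arbitrary positive numbers chosen for illustration; any positive values should give the same agreement.

```python
import math

# Part 2: 4 log(2) + 2 log(3) - log(12) should equal log(12), using base 10.
lhs = 4 * math.log10(2) + 2 * math.log10(3) - math.log10(12)
print(math.isclose(lhs, math.log10(12)))

# Part 3: ln(x^2 y) - ln(y) + 3 ln(x) should equal ln(x^5).
x, y = 2.7, 0.4   # arbitrary positive sample values
lhs3 = math.log(x**2 * y) - math.log(y) + 3 * math.log(x)
print(math.isclose(lhs3, math.log(x**5)))
```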

Exercise 2.14
Find the value of x which satisfies $(1.76)^x = 2002$.

Solution: Taking logarithms to base 10 of each side of $(1.76)^x = 2002$ gives

\begin{align*}
\log\left((1.76)^x\right) &= \log(2002) \\
x \log(1.76) &= \log(2002) \\
x &= \frac{\log(2002)}{\log(1.76)} \\
x &= \frac{3.301}{0.2455} = 13.45
\end{align*}

Alternatively, we could take logarithms to base e to obtain

\begin{align*}
\ln\left((1.76)^x\right) &= \ln(2002) \\
x \ln(1.76) &= \ln(2002) \\
x &= \frac{\ln(2002)}{\ln(1.76)} \\
x &= \frac{7.602}{0.5653} = 13.45
\end{align*}

Example 2.15
The last example shows that the result we obtain is independent of the base chosen. In general,
if we want to solve $y = a^x$ for x, we can choose any base c > 0 (with c ≠ 1) to get

\begin{align*}
y &= a^x \\
\log_c(y) &= \log_c(a^x) \\
\log_c(y) &= x \log_c(a) \\
x &= \frac{\log_c(y)}{\log_c(a)}
\end{align*}

From the definition of the logarithm, we also have $x = \log_a(y)$. Hence we have the relation

\[ \log_a(y) = \frac{\log_c(y)}{\log_c(a)} \]

This is known as a change of base.
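The independence of the chosen base can be seen numerically by solving the equation of Exercise 2.14 with three different logarithms. This is a minimal sketch using Python's standard math module; the variable names are illustrative.

```python
import math

y, a = 2002, 1.76

x10 = math.log10(y) / math.log10(a)   # base 10
xe = math.log(y) / math.log(a)        # base e
x2 = math.log2(y) / math.log2(a)      # base 2

print(round(x10, 2), round(xe, 2), round(x2, 2))

# Substituting back: a**x should recover y.
print(math.isclose(a**x10, y))
```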

Exercise 2.16
Find the x value that satisfies $100e^{-2x} = 26.2$.

Solution: Simplifying and taking logs gives:

\begin{align*}
100e^{-2x} &= 26.2 \\
e^{-2x} &= 0.262 \\
-2x &= \ln(0.262) \\
-2x &= -1.34 \\
x &= 0.67
\end{align*}

Following on from Exercise 2.7:


Exercise 2.17
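Given the equation $y = 7 - 4e^{-2t}$, express t in terms of y.

Solution: Rearrange the equation as follows:

\begin{align*}
y &= 7 - 4e^{-2t} \\
y - 7 &= -4e^{-2t} \\
\frac{y - 7}{-4} &= e^{-2t} \\
\frac{7 - y}{4} &= e^{-2t} \\
\ln\left(\frac{7 - y}{4}\right) &= -2t \\
-\frac{1}{2}\ln\left(\frac{7 - y}{4}\right) &= t
\end{align*}
Hence
\[ t = -\frac{1}{2}\ln\left(\frac{7 - y}{4}\right) \]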
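The rearrangement in Exercise 2.17 can be checked with a round trip: compute y from a chosen t, then recover t from that y. The function names below are illustrative. Note that the logarithm requires (7 − y)/4 > 0, so the inverse only applies for y < 7.

```python
import math

def y_of_t(t):
    """y = 7 - 4 exp(-2t)."""
    return 7 - 4 * math.exp(-2 * t)

def t_of_y(y):
    """t = -(1/2) ln((7 - y)/4); only valid for y < 7."""
    return -0.5 * math.log((7 - y) / 4)

# Round trip: this should return (approximately) the starting value 0.8.
print(t_of_y(y_of_t(0.8)))
```

From Exercise 2.7, y = 3 when t = 0, which gives one more check on the inverse.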

2.4 Hyperbolic Functions


In Section 2.2, we saw how the exponential function $e^x$ behaves. In this section we look at two
functions closely related to the exponential function, sinh(x) and cosh(x). Note the h’s! [2] We write
sinh(x) and cosh(x) as shorthand for the combinations of exponentials defined below:

Definition 2.18 Hyperbolic functions

The hyperbolic trigonometric functions are defined as:
\[ \sinh(x) = \frac{1}{2}\left(e^x - e^{-x}\right) \qquad \cosh(x) = \frac{1}{2}\left(e^x + e^{-x}\right) \]

What do these look like?

Figure 2.7: Graph of y = sinh(x)
Figure 2.8: Graph of y = cosh(x)

[2] Depending on your background, sinh can be pronounced as shine, sinch, or hyperbolic sine. Similarly, cosh
can be pronounced as cosch, co-sinh, or hyperbolic cosine.

Just as for the trigonometric functions, we can define $\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$, which looks like a
smooth step:

Figure 2.9: Graph of y = tanh(x)


Hyperbolic functions appear, like sin and cos, as the solutions to many physical problems. For
instance, although it was long thought that a heavy chain or rope hanging from its ends took the form of a
parabola, it was eventually shown that it is in fact a section of the cosh curve, known as a catenary.
The Gateway Arch in St Louis, Missouri, is one of the largest and most elegant catenary arches
in the world. If you conduct an image web-search on the keywords Gateway Arch St Louis you’ll
see many images of this structure. It is 630 feet high and 630 feet wide at ground level. The
formula for its shape is displayed at the base of the arch and is
\[ y = 757 - 127.7 \cosh\left( \frac{x}{127.7} \right) \]
with x and y both measured in feet.

Figure 2.10: Gateway Arch. (Wikipedia)

Figure 2.11: Equation of Gateway Arch
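The quoted dimensions of the arch can be compared against the formula. In the sketch below, the height at the centre (x = 0) should be close to 630 feet, and the height at x = ±315 (half of the 630-foot width) should be close to zero. The function name arch_height is an illustrative choice.

```python
import math

def arch_height(x):
    """Height y in feet at horizontal position x in feet, using the formula in the text."""
    return 757 - 127.7 * math.cosh(x / 127.7)

print(arch_height(0))      # height at the centre of the arch
print(arch_height(315))    # height near one foot of the arch
print(arch_height(-315))   # cosh is even, so this matches the previous line
```

The values are only approximately 630 and 0 because the constants in the formula are rounded.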

Similar to trigonometric functions, the hyperbolic functions satisfy various identities. We can
derive these by considering their exponential definition.

Exercise 2.19
Express sinh(8x) in terms of $e^{8x}$ and $e^{-8x}$.

Solution: From the definition of sinh(8x),

\[ \sinh(8x) = \frac{1}{2}\left(e^{8x} - e^{-8x}\right) \]

Exercise 2.20
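Express $5e^{4x} - 7e^{-4x}$ in terms of cosh(4x) and sinh(4x).

Solution: From the definition of cosh(4x) and sinh(4x),

\begin{align*}
\cosh(4x) &= \frac{1}{2}\left(e^{4x} + e^{-4x}\right) \\
\sinh(4x) &= \frac{1}{2}\left(e^{4x} - e^{-4x}\right)
\end{align*}

Hence,
\begin{align*}
\cosh(4x) + \sinh(4x) &= \frac{1}{2}\left(e^{4x} + e^{-4x}\right) + \frac{1}{2}\left(e^{4x} - e^{-4x}\right) = e^{4x} \\
\cosh(4x) - \sinh(4x) &= \frac{1}{2}\left(e^{4x} + e^{-4x}\right) - \frac{1}{2}\left(e^{4x} - e^{-4x}\right) = e^{-4x}
\end{align*}
Then

\begin{align*}
5e^{4x} - 7e^{-4x} &= 5(\cosh(4x) + \sinh(4x)) - 7(\cosh(4x) - \sinh(4x)) \\
&= -2\cosh(4x) + 12\sinh(4x)
\end{align*}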

Exercise 2.21
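Show that $\cosh^2(x) - \sinh^2(x) = 1$.

Solution: Since
\begin{align*}
\cosh(x) &= \frac{1}{2}\left(e^x + e^{-x}\right) \\
\sinh(x) &= \frac{1}{2}\left(e^x - e^{-x}\right)
\end{align*}
we can multiply these out (remembering that $(e^x)^2 = e^{2x}$ and $e^x e^{-x} = 1$):
\begin{align*}
\cosh^2(x) &= \frac{1}{4}\left(e^x + e^{-x}\right)^2 = \frac{1}{4}\left(e^{2x} + 2 + e^{-2x}\right) \\
\sinh^2(x) &= \frac{1}{4}\left(e^x - e^{-x}\right)^2 = \frac{1}{4}\left(e^{2x} - 2 + e^{-2x}\right)
\end{align*}
Subtracting the second from the first gives $\cosh^2(x) - \sinh^2(x) = \frac{1}{4}(2 + 2) = 1$, as required.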

Exercise 2.22
Show that $1 + 2\sinh^2(x) = \cosh(2x)$.

Solution: We go back to the exponential definition:

\[ \sinh^2(x) = \frac{1}{4}\left(e^{2x} - 2 + e^{-2x}\right) \]

So
\begin{align*}
1 + 2\sinh^2(x) &= 1 + \frac{1}{2}\left(e^{2x} - 2 + e^{-2x}\right) \\
&= \frac{1}{2}\left(e^{2x} + e^{-2x}\right) \\
&= \cosh(2x)
\end{align*}

The last line is the definition of cosh(2x), which is what we had to show.
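The results of Exercises 2.20 to 2.22 can be spot-checked numerically. The sketch below evaluates both sides of each identity at a handful of arbitrary sample points (an assumption made for illustration); agreement at sample points supports the identities but is not a proof.

```python
import math

for x in [-2.0, -0.5, 0.0, 0.3, 1.7]:
    c, s = math.cosh(x), math.sinh(x)
    # Exercise 2.21: cosh^2(x) - sinh^2(x) = 1
    assert math.isclose(c**2 - s**2, 1.0)
    # Exercise 2.22: 1 + 2 sinh^2(x) = cosh(2x)
    assert math.isclose(1 + 2 * s**2, math.cosh(2 * x))
    # Exercise 2.20: 5 e^{4x} - 7 e^{-4x} = -2 cosh(4x) + 12 sinh(4x)
    assert math.isclose(5 * math.exp(4 * x) - 7 * math.exp(-4 * x),
                        -2 * math.cosh(4 * x) + 12 * math.sinh(4 * x))

print("all identities hold at the sample points")
```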

We finish this section with one final physical example.

Exercise 2.23
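The formula for the velocity V of waves in a shallow water channel is given by:
\[ V^2 = 1.8 L \tanh\left( \frac{6.3 d}{L} \right) \]

where d is the depth of the water and L is the wavelength of the wave. Find V if d = 0.5 m
and L = 3.0 m.

Solution: Substituting values gives:

\begin{align*}
V &= \sqrt{1.8 \times 3 \times \tanh\left( \frac{6.3 \times 0.5}{3.0} \right)} \\
&\approx \sqrt{4.2} \\
&\approx 2.05 \ \text{m s}^{-1}
\end{align*}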
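The substitution in Exercise 2.23 can be reproduced as follows. This is a minimal sketch; the function name wave_speed is illustrative, and only the positive square root is taken since V is a speed.

```python
import math

def wave_speed(d, L):
    """V from V^2 = 1.8 L tanh(6.3 d / L), with d and L in metres."""
    return math.sqrt(1.8 * L * math.tanh(6.3 * d / L))

print(round(wave_speed(0.5, 3.0), 2))
```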

F17XA 3 Application of Linear and Log Functions

3 Application of Linear and Log Functions


3.1 Equation of a Straight Line
Straight lines, or linear functions, provide a useful way to extract information from data sets. In
this section, we will look at ways to manipulate data so that it resembles a straight line [3]. First,
a reminder of what the equation of a straight line is:

Figure 3.1: A straight line (the line y = 2x + 1)

The graph shows the line with slope 2 that hits the vertical axis at 1. In general, a line with slope
m that intersects the vertical axis at c is given by y = mx + c. An equivalent but (arguably)
more robust approach is given by the point-slope formula.

Definition 3.1 Point-slope formula

A line with slope m passing through a point (a, b) satisfies the equation

y − b = m(x − a)

This can also be written as y = mx + c with c = b − ma.

One advantage of this approach is that it avoids the need to solve for the intercept c.

Example 3.2
In the above example, the line has slope 2 and passes through the point (1, 3). Using the
point-slope formula, we get:

y − b = m(x − a)
y − 3 = 2(x − 1)
y − 3 = 2x − 2
y = 2x + 1

which agrees with what we expect.

[3] It is easier to tell whether or not data forms a straight line. For example, see Exercise 3.5.


Given two points (a, b) and (x, y) on the line, we can obtain the slope m by rearranging the point-slope formula:
\[ \text{slope} = m = \frac{y - b}{x - a} = \frac{\Delta y}{\Delta x} \]
The notation Δy = y − b (and Δx = x − a) is sometimes used to denote the change in the y
coordinate (respectively, the change in the x coordinate). The ratio of the two tells us that if x changes
by Δx, then y changes by Δy. In this case, moving along 3 in x results in the line going up 6 in
y. The slope is $\frac{6}{3} = 2$. If the line slopes downwards, it has a negative slope.

Example 3.3
If we know the points (1, 3) and (4, 9) are on the line, then we have
\[ \text{slope} = m = \frac{9 - 3}{4 - 1} = \frac{6}{3} = 2. \]
We can then use one of the points, say (1, 3), to recover the equation of the line.

y − b = m(x − a)
y − 3 = 2(x − 1)
y − 3 = 2x − 2
y = 2x + 1

For completeness, it is possible to use any point that we know is on the line. For example, if
we used (4, 9),

y − b = m(x − a)
y − 9 = 2(x − 4)
y − 9 = 2x − 8
y = 2x + 1

which is equivalent to our previous answer.


Alternatively, if we know the line has slope 2, we can solve y = 2x + c for c by using a point
(1, 3). That is 3 = 2(1) + c to get c = 1.
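The procedure of Example 3.3 (slope from two points, then the intercept from the point-slope formula) can be sketched as a small function. The name line_through is illustrative and not part of the notes.

```python
def line_through(p, q):
    """Return (m, c) for the line y = m*x + c through the points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: the slope is undefined")
    m = (y2 - y1) / (x2 - x1)   # slope = change in y over change in x
    c = y1 - m * x1             # from y - y1 = m(x - x1), expanded to y = m*x + c
    return m, c

# The points from Example 3.3; either order gives the same line.
print(line_through((1, 3), (4, 9)))
print(line_through((4, 9), (1, 3)))
```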

3.2 Experimental Data


One way to test whether two variables are connected is by testing collected experimental data.
As an example, we can investigate how the resistance R of a wire is related to the temperature
T of the wire. If we suppose that R and T are related by

R = mT + c (3.1)

we could test the validity of Equation 3.1 by attempting to find m and c using experimental data.


Exercise 3.4
The following data were collected as a result of an experiment.

T (°C)    10     30     50     70     90
R (Ω)     4.9    5.2    5.8    6.2    6.8
Determine a linear relationship between temperature and resistance.

Solution: We start by plotting the points:

Figure 3.2: A straight line (plot of R against T with the line R = mT + c)


We see that the points lie approximately on a straight line. There are systematic methods
for determining the ‘best’ line to fit the data, but that is not a part of this subject. Instead, we
use points taken from the line (i.e. not from the data) and apply the point-slope formula
to two points on the line. For example, we take (30, 5.25) and (70, 6.25), which
lie on the line drawn through the data.
To get the slope:
\[ m = \frac{6.25 - 5.25}{70 - 30} = \frac{1}{40} = 0.025 \]
We can then use one of the points to get the equation of the line:

R − b = m(T − a)
R − 5.25 = 0.025(T − 30)
R = 0.025T + 4.5

So our best line to fit the data is given by R = 0.025T + 4.5.
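
For readers curious about the systematic methods mentioned above, here is a minimal sketch of one of them, the method of least squares. It is outside the scope of this subject, and the helper name `least_squares_line` is our own. Applied to the data of Exercise 3.4 it gives a line close to the one read off the graph.

```python
def least_squares_line(xs, ys):
    """Return (m, c) for the line y = m*x + c minimising the squared vertical errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_bar - m * x_bar
    return m, c


T = [10, 30, 50, 70, 90]           # temperature data from Exercise 3.4
R = [4.9, 5.2, 5.8, 6.2, 6.8]      # resistance data from Exercise 3.4
m, c = least_squares_line(T, R)
print(f"R = {m:.3f} T + {c:.2f}")  # close to the graphical answer R = 0.025T + 4.5
```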

In cases where the variables are not linearly related, we may have to manipulate the data by
plotting some other combination of the variables.


Exercise 3.5
The following data is collected in an experiment
t 0 1 2 3 4
y 3.1 4.9 11.2 20.8 34.9
Determine whether the variables t and y can be related by
\[ y = mt^2 + c \tag{3.2} \]

Solution:
It is difficult to tell from a direct plot of y against t whether Equation 3.2
holds. Further, it is not clear how to determine the values of m and c from these points.

Figure 3.3: Plot of data points (y against t)

Instead, we can manipulate the data by defining a new variable $x = t^2$.

t          0     1     2      3      4
x = t^2    0     1     4      9      16
y          3.1   4.9   11.2   20.8   34.9

With this substitution, Equation 3.2 simplifies to:
\[ y = mx + c \tag{3.3} \]
which we can plot to obtain a straight line.

Figure 3.4: Plot of y against x (with the line y = 2x + 3)


We can obtain the equation using two points on the line. For example, we use points
(0, 3) and (10, 23) to get
\[ m = \frac{23 - 3}{10 - 0} = 2 \]
Using the point-slope formula, we get:

y − b = m(x − a)
y − 3 = 2(x − 0)
y = 2x + 3

Substituting back for t gives $y = 2t^2 + 3$.
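
The substitution x = t² can also be carried out numerically. The sketch below transforms the data of Exercise 3.5 and fits a straight line to (x, y) by least squares, which is our own choice of fitting method rather than the graphical one used above. The helper `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y against x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    return m, y_bar - m * x_bar


t = [0, 1, 2, 3, 4]
y = [3.1, 4.9, 11.2, 20.8, 34.9]
x = [ti ** 2 for ti in t]       # linearising substitution x = t^2

m, c = fit_line(x, y)
print(f"m = {m:.2f}, c = {c:.2f}")   # both close to the graphical values m = 2, c = 3
```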

The process of manipulating variables so that they are related by a straight line is sometimes
called linearisation.

Exercise 3.6
For the following cases, reduce the proposed law connecting t and y to straight line form.
1. $y = mt^4 + c$
2. $y = mt^2 + ct$
3. $y = mt + \frac{c}{t^2}$

Solution:
1. Let $x = t^4$. This gives y = mx + c. So plotting y against $t^4$ gives a straight line.
2. For $t \neq 0$, let $w = \frac{y}{t}$. Then w = mt + c. So plotting $w = \frac{y}{t}$ against t gives a
straight line.
3. Multiplying $y = mt + \frac{c}{t^2}$ by $t^2$ gives $t^2 y = mt^3 + c$. So substituting $w = t^2 y$ and $x = t^3$
we get w = mx + c. So plotting $w = t^2 y$ against $t^3$ gives a straight line.


3.3 Log-Log and Log-Linear Laws


In the real world, we often encounter variables that are related by a power law⁴ or an
exponential law⁵. In these cases, we use logarithms to help linearise the data. We will
illustrate this via two examples.
Suppose two variables t and y are related via the relation $y = at^b$. We can linearise the relation by
taking logarithms on both sides and applying the laws of logarithms.

\[
\begin{aligned}
y &= at^b \\
\ln(y) &= \ln(at^b) \\
\ln(y) &= \ln(a) + \ln(t^b) \\
\ln(y) &= \ln(a) + b\ln(t)
\end{aligned}
\]

Let w = ln(y) and x = ln(t). Then w = ln(a) + bx would be a straight line. So we should
plot ln(y) against ln(t). The resulting graph is known as a log-log graph since we are plotting
logarithms on both horizontal and vertical axes.

Example 3.7
Suppose that the variables t and y are related by $y = at^b$. We can linearise by setting w = ln(y)
and x = ln(t) to obtain the straight line w = ln(1.7) + (−2.3)x from the graph. Back
substituting gives a = 1.7 and b = −2.3.

Figure 3.5: The log-log graph (w = ln(y) against x = ln(t), showing the line w = ln(1.7) − 2.3x).
Figure 3.6: Example of power law (y against t, showing the curve $y = 1.7t^{-2.3}$).
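
As a rough numerical check of the log-log method, the sketch below generates noise-free data from the power law of Example 3.7 (the sample values of t are our own assumption), takes logarithms of both variables, and fits a straight line. The slope recovers b and the intercept recovers ln(a).

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept of y against x (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


t = [0.5, 1.0, 2.0, 3.0, 4.0]                # assumed sample points
y = [1.7 * ti ** -2.3 for ti in t]           # synthetic data from y = 1.7 t^(-2.3)

x = [math.log(ti) for ti in t]               # x = ln(t)
w = [math.log(yi) for yi in y]               # w = ln(y)

b, ln_a = fit_line(x, w)                     # w = ln(a) + b x
a = math.exp(ln_a)
print(f"a = {a:.2f}, b = {b:.2f}")           # a = 1.70, b = -2.30
```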

⁴ Examples include Kepler’s third law, inverse square laws for gravity and electric forces, or even the number of
friends a person has.
⁵ Examples include radioactive decay or simple population growth.

The second case we will deal with is the exponential law. Suppose two variables t and y are
related via the relation $y = Ab^t$. We can linearise the relation by taking logarithms on both sides
and applying the laws of logarithms.

\[
\begin{aligned}
y &= Ab^t \\
\ln(y) &= \ln(Ab^t) \\
\ln(y) &= \ln(A) + \ln(b^t) \\
\ln(y) &= \ln(A) + t\ln(b)
\end{aligned}
\]
Let w = ln(y). Then w = ln(A) + ln(b)t is a straight line. So we should plot ln(y) against
t. The resulting graph is known as a log-linear graph since we are plotting a logarithm against a
linear variable.

Example 3.8
Suppose that the variables t and y are related by $y = Ab^t$. We can linearise by setting w = ln(y)
to obtain the straight line w = ln(3) + ln(0.5)t from the graph. Back substituting gives A = 3
and b = 0.5.

Figure 3.7: The log-linear graph (w = ln(y) against t, showing the line w = ln(3) + ln(0.5)t).
Figure 3.8: Example of exponential law (y against t, showing the curve $y = 3 \cdot (0.5)^t$).
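
The back substitution in Example 3.8 can be sketched as follows. We take two points on the curve y = 3 · (0.5)^t (our own choice of points), find the slope and intercept of the line w = ln(y) against t, and exponentiate to recover b and A. The function name is hypothetical.

```python
import math


def exponential_law_from_points(p, q):
    """Recover (A, b) in y = A * b**t from two (t, y) points with y > 0."""
    (t1, y1), (t2, y2) = p, q
    slope = (math.log(y2) - math.log(y1)) / (t2 - t1)   # slope of w against t is ln(b)
    intercept = math.log(y1) - slope * t1               # intercept is ln(A)
    return math.exp(intercept), math.exp(slope)


# Two points on y = 3 * 0.5**t: y(0) = 3 and y(2) = 0.75
A, b = exponential_law_from_points((0, 3.0), (2, 0.75))
print(f"A = {A:.2f}, b = {b:.2f}")   # A = 3.00, b = 0.50
```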

3.4 Interpolation and Extrapolation


In the real world, it is often difficult to find a function f(x) that exactly describes a
data set. Instead, we often have to extract information from a table of collected data.
• Interpolation is the process of extracting information between two known data points.
• Extrapolation is the process of extracting information beyond the set of known data points.

Exercise 3.9
The velocity of sound in air varies with temperature. The values at different temperatures
are measured to be:

Air temperature (°C)   0       10      20      30
Velocity (ms⁻¹)        331.3   337.3   343.2   349.0

1. Interpolate the velocity of sound in air at 16 °C.
2. Extrapolate the velocity of sound in air at −30 °C.


Solution:
Based on the following plot, we can see that a suitable linear relationship exists between
velocity and temperature. So we will use a linear fit to estimate values.
Figure 3.9: Temperature vs velocity (velocity plotted against temperature)


1. Interpolation: Since 16 °C lies between two known data points, we interpolate
the data using the linear fit to get a velocity of approximately 341 ms⁻¹.
2. Extrapolation: Since −30 °C lies outside the set of known data points, we
extrapolate the data by extending the line to get a velocity of approximately 314 ms⁻¹.
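
The estimates can be reproduced with a short sketch. Here the linear fit is obtained by least squares, which is an assumption on our part since the notes read the line off a graph, and the helper names are our own.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y against x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


temps = [0, 10, 20, 30]                   # degrees Celsius
speeds = [331.3, 337.3, 343.2, 349.0]     # metres per second

m, c = fit_line(temps, speeds)


def predict(temp):
    """Velocity predicted by the fitted line at the given temperature."""
    return m * temp + c


print(round(predict(16)))    # interpolation: inside the measured range
print(round(predict(-30)))   # extrapolation: well outside the measured range
```

The extrapolated value relies on the linear trend continuing far beyond the data, so it should be treated with more caution than the interpolated one.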

Let us finish this section with an example of using graphical methods to determine a relationship
between two variables.

Example 3.10
In this experiment, we consider how changing the length L of a simple pendulum affects the
period of oscillation T.

Figure 3.10: Diagram of simple pendulum

The data below shows the period T for different lengths L:

L   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0   1.1   1.2   1.3   1.4   1.5
T   0.63  0.89  1.10  1.27  1.42  1.55  1.68  1.79  1.90  2.01  2.10  2.20  2.29  2.37  2.46


Plotting T against L does not appear to give a linear relationship:


Figure 3.11: Plot of T against L


This shows that the variables are not linearly related. In these cases, we will have to manipulate
the data in some meaningful mannerᵃ to obtain a relationship. In this case, the data resembles
a square root function. So we consider the new variable $T^2$ to validate our guess.

L     0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0   1.1   1.2   1.3   1.4   1.5
T     0.63  0.89  1.10  1.27  1.42  1.55  1.68  1.79  1.90  2.01  2.10  2.20  2.29  2.37  2.46
T^2   0.40  0.80  1.20  1.61  2.01  2.41  2.81  3.22  3.62  4.02  4.42  4.82  5.23  5.63  6.04

Figure 3.12: Plot of $T^2$ against L


The plot of $T^2$ against L provides evidence that, for a simple pendulum, we have:
\[ T^2 = kL \]
for k ≈ 4. While this is only an experimental result for k, it provides some initial insight into
the problem. If we pursue this further, we will see that the formula for a simple pendulum is given
by:
\[ T^2 = \left(\frac{4\pi^2}{g}\right) L \]
where g = 9.81 ms⁻² is the acceleration due to gravity near the Earth’s surface.
ᵃ Here, meaningful could range from using other contextual knowledge or theory about the problem, to
guessing the function based on the shape of the plot.
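
To see how close the experimental k is to the theoretical value, the sketch below fits a line through the origin to the tabulated values of T² against L and compares the slope with 4π²/g. The through-origin least-squares formula is our own choice of method.

```python
import math

L = [0.1 * i for i in range(1, 16)]      # lengths 0.1, 0.2, ..., 1.5
T_squared = [0.40, 0.80, 1.20, 1.61, 2.01, 2.41, 2.81, 3.22,
             3.62, 4.02, 4.42, 4.82, 5.23, 5.63, 6.04]

# Least-squares slope for a line through the origin: k = sum(L * T^2) / sum(L^2)
k = sum(l * t2 for l, t2 in zip(L, T_squared)) / sum(l * l for l in L)

k_theory = 4 * math.pi ** 2 / 9.81       # 4 pi^2 / g with g = 9.81

print(f"k from data = {k:.2f}, 4*pi^2/g = {k_theory:.2f}")
```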

Context is important when attempting to interpolate or extrapolate. For example, if we are
given data about the number of people with n siblings for different values of n, then
interpolating to a non-integer value such as n = 1.5, or extrapolating to n < 0, won’t give any useful information.

F17XA 4 Introduction to Differentiation

4 Introduction to Differentiation
4.1 Introduction
Recall we can obtain the gradient of the straight line joining points (a, b) and (x, y) by rearranging
the point-slope formula (Definition 3.1):
\[ \text{gradient} = m = \frac{y - b}{x - a} \]
For instance, the straight line connecting the points (1, 5) and (3, 11) has slope
\[ \text{gradient} = m = \frac{11 - 5}{3 - 1} = \frac{6}{2} = 3 \]

Using the point-slope formula with one of the points gives y = 3x + 2.


Figure 4.1: Graph of y = 3x + 2


Indeed, one of the key properties of a straight line is that the slope is always the same. That is, we
obtain the same slope regardless of which two points we take on the line. But this is no
longer the case for a general curve. The gradient we obtain for a general curve will depend on
the location on the curve we are interested in.
Let us first consider the example of $f(x) = x^2$. We want to find the slope of the function at the
point P = (1, 1). That is, we want to find the slope of the line that just touches the function at
point P. This line is called the tangent line of f(x) at x = 1.

Figure 4.2: Tangent line at x = 1 (graph of $f(x) = x^2$ with the point P)
Figure 4.3: Secant line PQ (graph of $f(x) = x^2$ with the points P and Q)


Since we only know one point on the tangent line, we estimate the slope of the tangent
line by using a second point Q on the curve. The line through P and Q is known as the secant line.
In this example, the slope of the secant line PQ is given by
\[ \text{slope} = \frac{f(2) - f(1)}{2 - 1} = \frac{4 - 1}{2 - 1} = 3 \]
Since our choice of Q at x = 2 is completely arbitrary, we can choose Q closer to P to obtain a
better estimate. For Q at x = 1.5, we have
\[ \text{slope} = \frac{f(1.5) - f(1)}{1.5 - 1} = \frac{1.5^2 - 1^2}{1.5 - 1} = \frac{2.25 - 1}{0.5} = 2.5 \]

Moving Q even closer gives:

• When x = 1.2:
\[ \text{slope}_{PQ} = \frac{f(1.2) - f(1)}{1.2 - 1} = \frac{1.2^2 - 1^2}{1.2 - 1} = \frac{1.44 - 1}{0.2} = 2.2 \]

• When x = 1.1:
\[ \text{slope}_{PQ} = \frac{f(1.1) - f(1)}{1.1 - 1} = \frac{1.1^2 - 1^2}{1.1 - 1} = \frac{1.21 - 1}{0.1} = 2.1 \]

• When x = 1.01:
\[ \text{slope}_{PQ} = \frac{f(1.01) - f(1)}{1.01 - 1} = \frac{1.01^2 - 1^2}{1.01 - 1} = \frac{1.0201 - 1}{0.01} = 2.01 \]

Based on this, it appears that the estimates get closer to 2 as Q gets close to P.
Point Q cannot be at x = 1 since this would result in a zero denominator! Instead, we rely on
a mathematical concept known as limits to help us avoid this issue.
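
The table of secant slopes can be generated with a short loop. This is only a sketch and the function names are our own.

```python
def f(x):
    return x ** 2


def secant_slope(func, a, x):
    """Slope of the secant line through (a, func(a)) and (x, func(x))."""
    return (func(x) - func(a)) / (x - a)


# Move Q towards P = (1, 1) and watch the secant slope settle towards 2.
for x in [2, 1.5, 1.2, 1.1, 1.01, 1.001]:
    print(f"Q at x = {x}: slope = {secant_slope(f, 1, x):.4f}")
```

Calling `secant_slope(f, 1, 1)` raises a `ZeroDivisionError`, which is the numerical counterpart of the zero denominator mentioned above.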

Definition 4.1 Limit definition of derivatives

The slope of the function f(x) at x = a is given by
\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \]
if it exists.

In general, we use the notation $\lim_{h \to 0} g(h) = L$ to mean the function g(h) is really close to L when
h is really close to 0⁶.

Exercise 4.2
Use the limit definition of the derivative to find the derivative of $f(x) = x^2$ for:
1. x = 1
2. A general x = a

Solution:
1. For x = 1
\[
\begin{aligned}
f'(1) &= \lim_{h \to 0} \frac{f(1 + h) - f(1)}{h} \\
&= \lim_{h \to 0} \frac{(1 + h)^2 - 1^2}{h} \\
&= \lim_{h \to 0} \frac{(1 + 2h + h^2) - 1}{h} \\
&= \lim_{h \to 0} \frac{2h + h^2}{h} \\
&= \lim_{h \to 0} \frac{h(2 + h)}{h} \\
&= \lim_{h \to 0} (2 + h) \\
&= 2
\end{aligned}
\]
2. For x = a
\[
\begin{aligned}
f'(a) &= \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \\
&= \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h} \\
&= \lim_{h \to 0} \frac{(a^2 + 2ah + h^2) - a^2}{h} \\
&= \lim_{h \to 0} \frac{2ah + h^2}{h} \\
&= \lim_{h \to 0} \frac{h(2a + h)}{h} \\
&= \lim_{h \to 0} (2a + h) \\
&= 2a
\end{aligned}
\]

⁶ There are rigorous mathematical definitions of what really close means. But our intuitive notion is sufficient
for this subject.

The last case shows that the process can be carried out for an arbitrary a. That is, the derivative
process takes the function $f(x) = x^2$ and an input a, and returns 2a. This is exactly
what a function does. Hence, we can extend the notion of a derivative from a point to the whole
function.
In this case, we write $f'(x)$ (pronounced f dash x) to denote the derivative function. The process
of finding $f'(x)$ from f(x) is called differentiation or finding the derivative.

Exercise 4.3

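Use the limit definition of the derivative to find the derivative of $f(x) = \sqrt{x}$.

Solution: We start with the limit definition of the derivative:
\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \\
&= \lim_{h \to 0} \frac{\sqrt{x + h} - \sqrt{x}}{h} \cdot \frac{\sqrt{x + h} + \sqrt{x}}{\sqrt{x + h} + \sqrt{x}} && \text{We choose this factor to simplify nicely} \\
&= \lim_{h \to 0} \frac{\left(\sqrt{x + h} - \sqrt{x}\right)\left(\sqrt{x + h} + \sqrt{x}\right)}{h\left(\sqrt{x + h} + \sqrt{x}\right)} \\
&= \lim_{h \to 0} \frac{x + h - x}{h\left(\sqrt{x + h} + \sqrt{x}\right)} \\
&= \lim_{h \to 0} \frac{h}{h\left(\sqrt{x + h} + \sqrt{x}\right)} \\
&= \lim_{h \to 0} \frac{1}{\sqrt{x + h} + \sqrt{x}} \\
&= \frac{1}{\sqrt{x} + \sqrt{x}} = \frac{1}{2\sqrt{x}}
\end{aligned}
\]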
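
As a sanity check on this result, the sketch below evaluates the difference quotient for the square root function at x = 4 (an arbitrary point of our choosing) with shrinking values of h. The quotients approach 1/(2√4) = 0.25.

```python
import math


def difference_quotient(func, x, h):
    """The quantity (func(x + h) - func(x)) / h from the limit definition."""
    return (func(x + h) - func(x)) / h


x = 4.0
for h in [0.1, 0.01, 0.001, 1e-6]:
    print(f"h = {h}: {difference_quotient(math.sqrt, x, h):.8f}")

print("1 / (2 sqrt(x)) =", 1 / (2 * math.sqrt(x)))
```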

As this example shows, we want to avoid differentiating from first principles every time. Instead,
we will develop a set of rules to make our task easier.
Before we proceed, we will introduce some notation to make our lives easier.

Definition 4.4
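Given a function y = f(x), the process of differentiation (with respect to x) is denoted
\[ (f(x))' \quad \text{or} \quad \frac{d}{dx} f(x) \quad \text{or} \quad \frac{d}{dx} y \]
The resulting function is called the derivative and can be denoted
\[ f'(x) \quad \text{or} \quad \frac{df}{dx} \quad \text{or} \quad \frac{dy}{dx} \]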

Example 4.5
Given the function $f(x) = x^2$, we could write
\[ \frac{d}{dx}(x^2) = 2x \]

Example 4.6
Other variables can be used in place of x. For example:
\[ \frac{d}{du}\sqrt{u} = \frac{1}{2\sqrt{u}} \qquad \frac{d}{du} u^2 = 2u \]


4.2 Derivatives
4.2.1 Powers
The most common function we will encounter is the following⁷.
Theorem 4.7 Power Rule
The derivative of the function $f(x) = x^n$ for some real number n is given by:
\[ f'(x) = nx^{n-1} \]

Example 4.8
As we’ve seen before:
1. If $f(x) = x^2$, then $f'(x) = 2x^{2-1} = 2x$.
2. If $f(x) = \sqrt{x} = x^{\frac{1}{2}}$, then $f'(x) = \frac{1}{2}x^{\frac{1}{2}-1} = \frac{1}{2}x^{-\frac{1}{2}} = \frac{1}{2\sqrt{x}}$.
And a few that are new:
1. If $f(x) = x^3$, then $f'(x) = 3x^{3-1} = 3x^2$.
2. If $f(x) = x^{\frac{3}{2}}$, then $f'(x) = \frac{3}{2}x^{\frac{3}{2}-1} = \frac{3}{2}x^{\frac{1}{2}} = \frac{3\sqrt{x}}{2}$.
3. If $f(x) = x^{\pi}$, then $f'(x) = \pi x^{\pi-1}$.

Combined with the following rules for addition and subtraction of derivatives, we can differentiate
all polynomials.
Theorem 4.9 Linearity rules
Given differentiable functions f(x), g(x) and a constant c, we have
\[
\begin{aligned}
\frac{d}{dx}\left(f(x) + g(x)\right) &= f'(x) + g'(x) && (4.1) \\
\frac{d}{dx}\left(f(x) - g(x)\right) &= f'(x) - g'(x) && (4.2) \\
\frac{d}{dx}\left(c \cdot f(x)\right) &= c \cdot f'(x) && (4.3)
\end{aligned}
\]

In other words, derivatives work nicely with addition, subtraction, and constant multiplication.
Let’s see some examples.

Exercise 4.10
Find the derivative of $f(x) = x^3 + x$.

Solution: We can apply the power rule with the addition rule
\[
\begin{aligned}
\frac{d}{dx}\left(x^3 + x\right) &= \frac{d}{dx} x^3 + \frac{d}{dx} x && \text{Equation 4.1} \\
&= 3x^2 + 1 && \text{Theorem 4.7}
\end{aligned}
\]

⁷ Calculations such as Exercise 4.2 only prove the result for specific values of n. For a general value of n, a more
complicated calculation is required.

Exercise 4.11
Differentiate the following functions:
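1. $f(x) = -2x^5$
2. $g(x) = 4x(x - 7)$

Solution:
1. We can apply a similar method:
\[
\begin{aligned}
f'(x) &= \frac{d}{dx}\left(-2x^5\right) \\
&= -2\,\frac{d}{dx} x^5 && \text{Equation 4.3} \\
&= -2 \cdot \left(5x^4\right) && \text{Theorem 4.7} \\
&= -10x^4
\end{aligned}
\]

2. For g(x) we have:
\[
\begin{aligned}
g'(x) &= \frac{d}{dx}\left(4x(x - 7)\right) \\
&= \frac{d}{dx}\left(4x^2 - 28x\right) && \text{Expand} \\
&= \frac{d}{dx} 4x^2 - \frac{d}{dx} 28x && \text{Equation 4.2} \\
&= 4\,\frac{d}{dx} x^2 - 28\,\frac{d}{dx} x && \text{Equation 4.3} \\
&= 4 \cdot (2x) - 28 \cdot (1) && \text{Theorem 4.7} \\
&= 8x - 28
\end{aligned}
\]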

These solutions are written with all details included. Once you become familiar with the process,
you may decide to combine multiple steps.
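
Since the power rule and the linearity rules are purely mechanical for polynomials, they are easy to sketch in code. Below, a polynomial is stored as a list of coefficients in increasing powers of x, which is a representation we have chosen for illustration, and each term c·xⁿ is sent to n·c·xⁿ⁻¹.

```python
def differentiate(coeffs):
    """Differentiate c0 + c1*x + c2*x^2 + ... given as [c0, c1, c2, ...].

    The power rule sends c_n * x^n to n * c_n * x^(n-1), and linearity
    lets us apply it term by term. The constant term disappears.
    """
    return [n * c for n, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** n for n, c in enumerate(coeffs))


g = [0, -28, 4]                 # g(x) = 4x^2 - 28x from Exercise 4.11
print(differentiate(g))         # [-28, 8], that is, g'(x) = 8x - 28
```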

Exercise 4.12
Find the slope of the function $f(u) = u^3 - 7u^2 + 4u - 9$ at the point u = 2.

Solution: Using the above rules, we can check that $f'(u) = 3u^2 - 14u + 4$. The slope
of the function at u = 2 is $f'(2) = 3(2)^2 - 14(2) + 4 = -12$.

Exercise 4.13
Find the equation of the tangent to the function $f(x) = x^4 - 4x^2 + x$ at the point
x = 1.

Solution: We can find the derivative $f'(x) = 4x^3 - 8x + 1$. Hence the slope of the
function at x = 1 is $f'(1) = 4(1)^3 - 8(1) + 1 = -3$. We can then use the point-slope
formula with the point (1, f(1)) = (1, −2) to get the tangent line.
\[
\begin{aligned}
y - b &= m(x - a) \\
y - f(1) &= f'(1)(x - 1) \\
y - (-2) &= (-3)(x - 1) \\
y &= -3x + 1
\end{aligned}
\]

4.2.2 Other common functions


We can apply the limit definition to find the derivative of functions such as sin(x), cos(x), $e^x$,
or ln(x). However, the details are beyond the scope of the subject. Instead, the results are
summarised in the table below.
Theorem 4.14 Table of derivatives
For constants a and b, we have

f(x)             f′(x)
sin(ax + b)      $a\cos(ax + b)$
cos(ax + b)      $-a\sin(ax + b)$
$e^{ax}$         $ae^{ax}$
ln(ax + b)       $\frac{a}{ax + b}$
sinh(ax + b)     $a\cosh(ax + b)$
cosh(ax + b)     $a\sinh(ax + b)$

Note: All angles in trigonometric functions must be measured in radians.


Exercise 4.15
Find the derivative of the following functions:
1. f (x) = 5 sin(x)
2. g(x) = 3 cos(2x − 1)

Solution:
1. For f(x): Using the table with a = 1, b = 0, and Equation 4.3 gives $f'(x) = 5\cos(x)$.
2. For g(x): Using the table with a = 2 and b = −1, together with Equation 4.3,
\[ g'(x) = (-2) \times 3\sin(2x - 1) = -6\sin(2x - 1) \]

Exercise 4.16
 
Given $f(x) = 5\sin(4x) - 8\cos(4x)$, evaluate $f'\left(\frac{\pi}{2}\right)$.

Solution: Using the table and Equation 4.2, we have
\[
\begin{aligned}
f'(x) &= 5(4\cos(4x)) - 8(-4\sin(4x)) \\
&= 20\cos(4x) + 32\sin(4x)
\end{aligned}
\]
Then $f'\left(\frac{\pi}{2}\right) = 20\cos(2\pi) + 32\sin(2\pi) = 20$.

Exercise 4.17
An alternating voltage v at time t (in seconds) is given by v = v(t) = 80 sin(10t) volts.
Find the rate of change of voltage when t = 0.2.

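Solution: The rate of change of voltage at time t is $v'(t) = 800\cos(10t)$.

When t = 0.2 (with the angle measured in radians),
\[
\begin{aligned}
v'(0.2) &= 800\cos(2) \\
&= 800(-0.4161) \\
&\approx -333
\end{aligned}
\]
so the voltage is changing at a rate of approximately −333 volts per second.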
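
A quick numerical check of this answer is sketched below. It compares the formula 800 cos(10t) with a central difference estimate of the rate of change of v(t). The step size is our own choice, and Python's `math` functions work in radians, as the table of derivatives requires.

```python
import math


def v(t):
    """Voltage from Exercise 4.17, with t in seconds."""
    return 80 * math.sin(10 * t)


def central_difference(func, t, h=1e-6):
    """Estimate func'(t) by (func(t + h) - func(t - h)) / (2h)."""
    return (func(t + h) - func(t - h)) / (2 * h)


exact = 800 * math.cos(10 * 0.2)
estimate = central_difference(v, 0.2)
print(round(exact), round(estimate))   # both round to -333
```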

Exercise 4.18
Find $f'(x)$ given $f(x) = 2\ln(x) - x^2$.

Solution: Since $f(x) = 2\ln(x) - x^2$, we have $f'(x) = \frac{2}{x} - 2x$.

Exercise 4.19
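The following circuit contains a resistor R, a capacitor C and a constant voltage source V.
The current I(t) which flows at time t is given by:
\[ I(t) = \frac{V}{R}\exp\left(\frac{-t}{RC}\right) \]
Show that
\[ \frac{dI}{dt} = -\frac{I}{RC} \]

Solution: Since V, R and C are constants, we have
\[
\begin{aligned}
\text{LHS} &= \frac{d}{dt} I(t) \\
&= \frac{d}{dt}\left[\frac{V}{R}\exp\left(\frac{-t}{RC}\right)\right] \\
&= \frac{V}{R}\,\frac{d}{dt}\left[\exp\left(\frac{-t}{RC}\right)\right] \\
&= \frac{V}{R}\left(\frac{-1}{RC}\right)\exp\left(\frac{-t}{RC}\right) \\
&= \frac{-1}{RC}\left[\frac{V}{R}\exp\left(\frac{-t}{RC}\right)\right] \\
&= \frac{-1}{RC} I(t) = -\frac{I}{RC} = \text{RHS}
\end{aligned}
\]
As required.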


4.3 Product Rule


So far, we’ve covered addition and subtraction of derivatives. Unfortunately, multiplication and
division of derivatives do not work as nicely. Instead, we have the following rules to help us tackle
them.
Theorem 4.20 Product Rule
Given differentiable functions f(x) and g(x), the derivative of their product is
\[ \frac{d}{dx}\left[f(x) \cdot g(x)\right] = f(x) \cdot g'(x) + f'(x) \cdot g(x) \]
This is sometimes written as y = u · v with u = f(x) and v = g(x) to give
\[ y' = u \cdot v' + u' \cdot v \]

Exercise 4.21
Find the derivative of $h(x) = x^2 e^{-4x}$.

Solution: Since this is a product of two functions, we can apply the product rule. Let
\[
\begin{aligned}
f(x) &= x^2 & f'(x) &= 2x \\
g(x) &= e^{-4x} & g'(x) &= -4e^{-4x}
\end{aligned}
\]
By the product rule
\[
\begin{aligned}
h'(x) &= f(x) \cdot g'(x) + f'(x) \cdot g(x) \\
&= (x^2) \cdot \left(-4e^{-4x}\right) + (2x) \cdot \left(e^{-4x}\right) \\
&= -4x^2 e^{-4x} + 2xe^{-4x}
\end{aligned}
\]

Exercise 4.22
Find the derivative of $h(x) = x^2\cos(x)$.

Solution: Let
\[
\begin{aligned}
f(x) &= x^2 & f'(x) &= 2x \\
g(x) &= \cos(x) & g'(x) &= -\sin(x)
\end{aligned}
\]
By the product rule,
\[
\begin{aligned}
h'(x) &= f(x) \cdot g'(x) + f'(x) \cdot g(x) \\
&= (x^2) \cdot (-\sin(x)) + (2x) \cdot (\cos(x)) \\
&= -x^2\sin(x) + 2x\cos(x)
\end{aligned}
\]

Exercise 4.23
Find the derivative of $y = x^3\ln(x)$.

Solution: Let
\[
\begin{aligned}
u &= x^3 & u' &= 3x^2 \\
v &= \ln(x) & v' &= \frac{1}{x}
\end{aligned}
\]
By the product rule,
\[
\begin{aligned}
y'(x) &= u \cdot v' + u' \cdot v \\
&= (x^3) \cdot \left(\frac{1}{x}\right) + (3x^2) \cdot (\ln(x)) \\
&= x^2 + 3x^2\ln(x)
\end{aligned}
\]
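
Answers obtained with the product rule can be spot-checked numerically. The sketch below takes the result of Exercise 4.21 and compares it with a central difference estimate at a few sample points. The sample points and step size are our own choices.

```python
import math


def h(x):
    return x ** 2 * math.exp(-4 * x)


def h_prime(x):
    """Product rule result from Exercise 4.21."""
    return -4 * x ** 2 * math.exp(-4 * x) + 2 * x * math.exp(-4 * x)


def central_difference(func, x, step=1e-6):
    """Estimate func'(x) by (func(x + step) - func(x - step)) / (2 * step)."""
    return (func(x + step) - func(x - step)) / (2 * step)


for x in [0.0, 0.5, 1.0]:
    print(x, h_prime(x), central_difference(h, x))
```

Agreement at a handful of points does not prove the formula, but a mismatch is a reliable sign of an algebra slip.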

4.4 Quotient Rule


A similar approach is required when we wish to find the derivative of quotients.
Theorem 4.24 Quotient Rule

Given differentiable functions f(x) and g(x), the derivative of $h = \frac{f}{g}$ is
\[ \frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g(x) \cdot f'(x) - g'(x) \cdot f(x)}{[g(x)]^2} \]

This is sometimes written as $y = \frac{u}{v}$ with u = f(x) and v = g(x) to give
\[ y' = \frac{v \cdot u' - v' \cdot u}{v^2} \]

Beware of the minus sign! Unlike the product rule, the order matters here.


Exercise 4.25
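Find the derivative of $h(x) = \frac{5x + 1}{3x + 2}$.

Solution: Since this is a quotient of two functions, we can apply the quotient rule. Let
\[
\begin{aligned}
f(x) &= 5x + 1 & f'(x) &= 5 \\
g(x) &= 3x + 2 & g'(x) &= 3
\end{aligned}
\]
By the quotient rule
\[
\begin{aligned}
h'(x) &= \frac{g(x) \cdot f'(x) - g'(x) \cdot f(x)}{[g(x)]^2} \\
&= \frac{(3x + 2) \cdot (5) - (3) \cdot (5x + 1)}{[3x + 2]^2} \\
&= \frac{15x + 10 - (15x + 3)}{[3x + 2]^2} \\
&= \frac{7}{(3x + 2)^2} = 7(3x + 2)^{-2}
\end{aligned}
\]

Exercise 4.26
Find the derivative of $y = \frac{x}{x^2 + 4}$.

Solution: Let
\[
\begin{aligned}
u &= x & u' &= 1 \\
v &= x^2 + 4 & v' &= 2x
\end{aligned}
\]
By the quotient rule
\[
\begin{aligned}
y' &= \frac{v \cdot u' - v' \cdot u}{v^2} \\
&= \frac{(x^2 + 4) \cdot (1) - (2x) \cdot x}{(x^2 + 4)^2} \\
&= \frac{x^2 + 4 - 2x^2}{(x^2 + 4)^2} \\
&= \frac{4 - x^2}{(x^2 + 4)^2}
\end{aligned}
\]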


Exercise 4.27
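Find the derivative of y = tan(x).

Solution: Since we can express $\tan(x) = \frac{\sin(x)}{\cos(x)}$, the quotient rule applies. Let
\[
\begin{aligned}
u &= \sin(x) & u' &= \cos(x) \\
v &= \cos(x) & v' &= -\sin(x)
\end{aligned}
\]
By the quotient rule
\[
\begin{aligned}
y' &= \frac{v \cdot u' - v' \cdot u}{v^2} \\
&= \frac{\cos(x) \cdot \cos(x) - (-\sin(x)) \cdot \sin(x)}{(\cos(x))^2} \\
&= \frac{\cos^2(x) + \sin^2(x)}{(\cos(x))^2} \\
&= \frac{1}{\cos^2(x)} && \text{Trig identity}
\end{aligned}
\]

Since $\sec(x) = \frac{1}{\cos(x)}$, this is sometimes written as $\sec^2(x)$.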

4.5 Chain Rule


In this section, we look at a different function operation called composition.

Example 4.28
Suppose we have functions $f(x) = x^2$, g(x) = sin(x), and $h(x) = e^x$. Some compositions of
these functions are:
\[
\begin{aligned}
f(g(x)) &= g(x)^2 = \sin(x)^2 \\
g(h(x)) &= \sin(h(x)) = \sin(e^x) \\
h(g(x)) &= e^{g(x)} = e^{\sin(x)} \\
f(f(x)) &= f(x)^2 = \left(x^2\right)^2 = x^4 \\
f(g(h(x))) &= g(h(x))^2 = (\sin(h(x)))^2 = (\sin(e^x))^2 = \sin^2(e^x)
\end{aligned}
\]

Example 4.29
The following functions can be decomposed into simpler functions:
• The function $y = (x^2 - 3x + 8)^5$ can be decomposed as $y = u^5$ with $u = x^2 - 3x + 8$.
• The function $f(x) = e^{-3x^2}$ can be decomposed as $f(u) = e^u$ with $u(x) = -3x^2$.
• The function $y = \cos^2(\ln(x))$ can be decomposed as $y = u^2$, with u = cos(v), and
v = ln(x).

To differentiate such a function, we have to use the chain rule.


Theorem 4.30 Chain Rule
If a function f(x) can be decomposed as f(u(x)) with f = f(u) and u = u(x), then the
derivative is:
\[ f'(x) = f'(u(x)) \cdot u'(x) \]
Alternatively, we have
\[ \frac{df}{dx} = \frac{df}{du} \cdot \frac{du}{dx} \]

Exercise 4.31
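Find the derivative of $y = (x^2 - 3x + 8)^5$.

Solution: The function $y = (x^2 - 3x + 8)^5$ can be decomposed with $y = u^5$ and
$u = x^2 - 3x + 8$. Hence,
\[
\begin{aligned}
y &= u^5 & \frac{dy}{du} &= 5u^4 \\
u &= x^2 - 3x + 8 & \frac{du}{dx} &= 2x - 3
\end{aligned}
\]
By the chain rule:
\[
\begin{aligned}
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx} \\
&= 5u^4 \cdot (2x - 3) \\
&= 5(x^2 - 3x + 8)^4 \cdot (2x - 3)
\end{aligned}
\]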

Since the function is originally given in terms of x, we have to always back-substitute to give
the final answer in terms of x.
As with the product and quotient rule, you will be able to use the chain rule without introducing
much notation once you become familiar with it.

Exercise 4.32
Find the derivative of $f(x) = e^{-3x^2}$.

Solution: The function $f(x) = e^{-3x^2}$ can be decomposed with $f(u) = e^u$ and
$u(x) = -3x^2$. Hence,
\[
\begin{aligned}
f(u) &= e^u & f'(u) &= e^u \\
u(x) &= -3x^2 & u'(x) &= -6x
\end{aligned}
\]
By the chain rule:
\[
\begin{aligned}
f'(x) &= f'(u) \cdot u'(x) \\
&= e^u \cdot (-6x) \\
&= -6xe^{-3x^2}
\end{aligned}
\]

Exercise 4.33
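Find the derivative of $y = e^{\cos(\ln(x))}$.

Solution: The function $y = e^{\cos(\ln(x))}$ can be decomposed with $y = e^u$, u = cos(v), and
v = ln(x). Hence,
\[
\begin{aligned}
y &= e^u & \frac{dy}{du} &= e^u \\
u &= \cos(v) & \frac{du}{dv} &= -\sin(v) \\
v &= \ln(x) & \frac{dv}{dx} &= \frac{1}{x}
\end{aligned}
\]
By the chain rule (twice):
\[
\begin{aligned}
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx} && \text{Chain rule} \\
&= \frac{dy}{du} \cdot \left(\frac{du}{dv} \cdot \frac{dv}{dx}\right) && \text{Chain rule again} \\
&= \frac{dy}{du} \cdot \frac{du}{dv} \cdot \frac{dv}{dx} && \text{(4.4)} \\
&= e^u \cdot (-\sin(v)) \cdot \frac{1}{x} \\
&= e^{\cos(v)} \cdot (-\sin(v)) \cdot \frac{1}{x} && \text{substitute } u = \cos(v) \\
&= e^{\cos(\ln(x))} \cdot (-\sin(\ln(x))) \cdot \frac{1}{x} && \text{substitute } v = \ln(x) \\
&= -\frac{\sin(\ln(x))\,e^{\cos(\ln(x))}}{x}
\end{aligned}
\]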


Equation 4.4 shows what happens when a function is created by embedded compositions, and
also how the chain rule got its name.
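
The chain of three factors in Equation 4.4 maps directly onto code. In the sketch below each factor is computed separately and then multiplied, and the product is compared with a central difference estimate at x = 2 (a point of our own choosing).

```python
import math


def y(x):
    return math.exp(math.cos(math.log(x)))


def dy_dx(x):
    v = math.log(x)            # innermost function v = ln(x)
    u = math.cos(v)            # middle function u = cos(v)
    dy_du = math.exp(u)        # derivative of e^u with respect to u
    du_dv = -math.sin(v)       # derivative of cos(v) with respect to v
    dv_dx = 1 / x              # derivative of ln(x) with respect to x
    return dy_du * du_dv * dv_dx   # the three links of Equation 4.4


def central_difference(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)


print(dy_dx(2.0), central_difference(y, 2.0))
```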

4.6 Higher Order Derivatives


Recall that the derivative $f'(x)$ is a function that measures the slope of the graph of y = f(x).
Since the derivative is also a function, we can consider the derivative of that function. The result
is called the second derivative. The second derivative measures how the slope changes⁸. We can
then iterate to get the n-th derivative if we repeat this process n times.
We can denote the second derivative as follows:
• $f''(x)$ (read as ‘f double-dash’ or ‘f dash dash’) is given by:
\[ f''(x) = (f'(x))' \]

• $\frac{d^2 y}{dx^2}$ (read as ‘dee two y by dee x squared’) is given by:
\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = \frac{d}{dx}\left(\frac{d}{dx} y\right) \]

• $y''$ (read as ‘y double-dash’ or ‘y dash dash’) is given by:
\[ y''(x) = \frac{d}{dx}(y') \]

Exercise 4.34
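Find the second derivative of $y = x^4 - 5x$.

Solution: Apply differentiation twice:
\[
\begin{aligned}
\frac{d^2 y}{dx^2} &= \frac{d}{dx}\left(\frac{d}{dx} y\right) \\
&= \frac{d}{dx}\left(\frac{d}{dx}\left(x^4 - 5x\right)\right) \\
&= \frac{d}{dx}\left(4x^3 - 5\right) \\
&= 12x^2
\end{aligned}
\]

Exercise 4.35
Find $y''$ if y = 6 ln(t) − 4t.

Solution: We can take the derivative twice.
\[ y'(t) = \frac{6}{t} - 4 \]
So
\[
\begin{aligned}
y''(t) &= \frac{d}{dt}(y') \\
&= \frac{d}{dt}\left(\frac{6}{t} - 4\right) \\
&= -\frac{6}{t^2} = -6t^{-2}
\end{aligned}
\]

⁸ Some texts will refer to this as the concavity.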
t

Exercise 4.36
Find the second derivative of the following functions:
1. f(x) = 9 sin(3x)
2. $f(x) = 3e^{-5x}$

Solution:
1. We have:
\[
\begin{aligned}
f(x) &= 9\sin(3x) \\
f'(x) &= 27\cos(3x) \\
f''(x) &= -81\sin(3x)
\end{aligned}
\]

2. We have:
\[
\begin{aligned}
f(x) &= 3e^{-5x} \\
f'(x) &= -15e^{-5x} \\
f''(x) &= 75e^{-5x}
\end{aligned}
\]

Exercise 4.37
Given y = 5 cos(3x), find a constant A such that y satisfies the equation $\frac{d^2 y}{dx^2} + Ay = 0$.

Solution: We first find the derivatives:
\[
\begin{aligned}
y &= 5\cos(3x) \\
\frac{dy}{dx} &= -15\sin(3x) \\
\frac{d^2 y}{dx^2} &= -45\cos(3x)
\end{aligned}
\]
Then we have:
\[
\begin{aligned}
0 &= \frac{d^2 y}{dx^2} + Ay \\
0 &= -45\cos(3x) + A \cdot (5\cos(3x)) \\
0 &= (5A - 45)\cos(3x)
\end{aligned}
\]

The right-hand side is zero for every x exactly when 5A − 45 = 0, that is, A = 9.

Finding higher order derivatives can involve the rules we’ve covered previously.

Exercise 4.38
Find the second derivative of $f(x) = xe^{-3x}$.

Solution: Since this is a product of functions, we should use the product rule:
\[
\begin{aligned}
u &= x & u' &= 1 \\
v &= e^{-3x} & v' &= -3e^{-3x}
\end{aligned}
\]
So the first derivative:
\[
\begin{aligned}
f'(x) &= e^{-3x} - 3xe^{-3x} \\
&= (1 - 3x)e^{-3x}
\end{aligned}
\]
To get the second derivative, we need the product rule again:
\[
\begin{aligned}
u &= (1 - 3x) & u' &= -3 \\
v &= e^{-3x} & v' &= -3e^{-3x}
\end{aligned}
\]
So the second derivative:
\[
\begin{aligned}
f''(x) &= (1 - 3x)\left(-3e^{-3x}\right) - 3e^{-3x} \\
&= -6e^{-3x} + 9xe^{-3x} \\
&= (9x - 6)e^{-3x}
\end{aligned}
\]

Simplifying after each derivative will sometimes make taking higher order derivatives easier. Alternatively,
we can apply the linearity rules to the unsimplified first derivative and use the product rule on each product that remains.
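
A second derivative can be estimated directly from function values using the central second difference (f(x + h) − 2f(x) + f(x − h))/h². This is a standard numerical formula rather than something derived in these notes. The sketch below uses it to check the answer to Exercise 4.38 at x = 1, with a step size of our own choosing.

```python
import math


def f(x):
    return x * math.exp(-3 * x)


def f_second(x):
    """Second derivative found in Exercise 4.38."""
    return (9 * x - 6) * math.exp(-3 * x)


def second_difference(func, x, h=1e-4):
    """Estimate func''(x) by (func(x + h) - 2 func(x) + func(x - h)) / h^2."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2


print(f_second(1.0), second_difference(f, 1.0))
```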

4.7 Implicit Differentiation


Up until now, we’ve only considered explicit functions. That is, functions where we can explicitly
express y in terms of x. For example, $y = x^2\sin(x)$. However, it is not always easy (or feasible)
to relate the variables in such a manner. For example, $y^3 + 2xy - 5x^2 = 0$ relates x and y, but it
is difficult to isolate y in terms of x only. These equations are called implicit equations. The key
assumption we make is that one variable can be written as a function of the other. For example,
we can write y in terms of x, or y(x).
One advantage of implicit functions is that we can find derivatives without explicitly solving for a
variable. The key technique used here is the chain rule.

Example 4.39
Suppose y is a function of x and evaluate
\[ \frac{d}{dx}\, y(x)^2 \]

We consider y = y(x) and let $w = y^2$. By the chain rule
\[ \frac{dw}{dx} = \frac{dw}{dy} \times \frac{dy}{dx} = 2y\frac{dy}{dx} \]
Similarly, we have
\[ \frac{d}{dx}\, y(x)^2 = \frac{d}{dy}\, y^2 \times \frac{dy}{dx} = 2y\frac{dy}{dx} \]

The key point is remembering that to differentiate a function of y with respect to x, we first
differentiate with respect to y, then multiply by $\frac{dy}{dx}$. We write this mathematically as
\[ \frac{d}{dx}\left(f(y)\right) = \frac{d}{dy}\left(f(y)\right) \times \frac{dy}{dx} \]

Let’s see some examples:

Exercise 4.40
Consider the equation $y^3 + 2xy - 5x^2 = 0$ and find $\frac{dy}{dx}$.

Solution: In theory, we could try to solve for y before differentiating. But in this case,
we will use implicit differentiation and differentiate both sides with respect to x.
\[
\begin{aligned}
\frac{d}{dx}\left(y^3 + 2xy - 5x^2\right) &= \frac{d}{dx}\, 0 \\
\frac{d}{dx}\, y^3 + \frac{d}{dx}\, 2xy - \frac{d}{dx}\, 5x^2 &= \frac{d}{dx}\, 0 && \text{Linearity}
\end{aligned}
\]
Differentiating term by term:
• By the chain rule, $\frac{d}{dx}\, y^3 = 3y^2\frac{dy}{dx}$
• By the chain and product rule, $\frac{d}{dx}\, 2xy = 2y + 2x\frac{dy}{dx}$
• We also have $\frac{d}{dx}\, 5x^2 = 10x$ and $\frac{d}{dx}\, 0 = 0$
Combining, we get:
\[ 3y^2\frac{dy}{dx} + \left(2y + 2x\frac{dy}{dx}\right) - 10x = 0 \]

Solving for $\frac{dy}{dx}$ gives:
\[ \frac{dy}{dx} = \frac{10x - 2y}{3y^2 + 2x} \]
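
The formula can be checked numerically even though we never solved for y by hand. The sketch below, under our own assumptions, solves the equation for y at a given x > 0 by bisection (for x > 0 the left-hand side is increasing in y, so there is exactly one root), then compares the slope from the formula with a central difference of the solved values. The function names, bracket and step size are hypothetical choices.

```python
def solve_y(x, lo=-10.0, hi=10.0):
    """Find y with y^3 + 2xy - 5x^2 = 0 by bisection, assuming x > 0."""
    def F(y):
        return y ** 3 + 2 * x * y - 5 * x ** 2

    for _ in range(200):
        mid = (lo + hi) / 2
        if F(mid) > 0:
            hi = mid      # root lies below mid
        else:
            lo = mid      # root lies at or above mid
    return (lo + hi) / 2


def dy_dx(x, y):
    """Result of implicit differentiation in Exercise 4.40."""
    return (10 * x - 2 * y) / (3 * y ** 2 + 2 * x)


x, h = 1.0, 1e-5
y = solve_y(x)
numeric = (solve_y(x + h) - solve_y(x - h)) / (2 * h)
print(dy_dx(x, y), numeric)
```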

Exercise 4.41
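Consider the equation $\ln(y) - \sin(y) + x^2 y = 0$ and find $\frac{dy}{dx}$.

Solution: Differentiating both sides with respect to x.
\[
\begin{aligned}
\frac{d}{dx}\left(\ln(y) - \sin(y) + x^2 y\right) &= \frac{d}{dx}\, 0 \\
\frac{d}{dx}\ln(y) - \frac{d}{dx}\sin(y) + \frac{d}{dx}\, x^2 y &= \frac{d}{dx}\, 0
\end{aligned}
\]
Differentiating term by term:
• By the chain rule, $\frac{d}{dx}\ln(y) = \frac{1}{y}\frac{dy}{dx}$
• By the chain rule, $\frac{d}{dx}\sin(y) = \cos(y)\frac{dy}{dx}$
• By the chain and product rule, $\frac{d}{dx}\, x^2 y = 2xy + x^2\frac{dy}{dx}$
Combining, we get:
\[ \frac{1}{y}\frac{dy}{dx} - \cos(y)\frac{dy}{dx} + \left(2xy + x^2\frac{dy}{dx}\right) = 0 \]

Solving for $\frac{dy}{dx}$ gives:
\[ \frac{dy}{dx} = \frac{-2xy}{\frac{1}{y} - \cos(y) + x^2} \]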


Exercise 4.42
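Consider the equation $y = \frac{e^x x^2}{\sqrt{x^2 + 1}}$ and find $\frac{dy}{dx}$.

This is given as an explicit equation. To differentiate this directly, one would need the product rule,
quotient rule, and chain rule. However, we can simplify the computation using implicit differentiation.

Solution: Recall that for a differentiable function f(x), we have the following via the
chain rule:
\[ \frac{d}{dx}\ln(f(x)) = \frac{f'(x)}{f(x)} \]
So we can manipulate the original equation to get:
\[
\begin{aligned}
y &= \frac{e^x x^2}{\sqrt{x^2 + 1}} \\
\ln(y) &= \ln\left(\frac{e^x x^2}{\sqrt{x^2 + 1}}\right) \\
\ln(y) &= \ln\left(e^x x^2\right) - \ln\left(\sqrt{x^2 + 1}\right) \\
\ln(y) &= \ln\left(e^x\right) + \ln\left(x^2\right) - \ln\left((x^2 + 1)^{\frac{1}{2}}\right) \\
\ln(y) &= x\ln(e) + 2\ln(x) - \frac{1}{2}\ln\left(x^2 + 1\right) \\
\ln(y) &= x + 2\ln(x) - \frac{1}{2}\ln\left(x^2 + 1\right)
\end{aligned}
\]
We can now differentiate term by term to get:
\[
\begin{aligned}
\frac{d}{dx}\ln(y) &= \frac{d}{dx}\left(x + 2\ln(x) - \frac{1}{2}\ln\left(x^2 + 1\right)\right) \\
\frac{1}{y}\frac{dy}{dx} &= 1 + \frac{2}{x} - \frac{x}{x^2 + 1}
\end{aligned}
\]

Solving for $\frac{dy}{dx}$ gives:
\[
\begin{aligned}
\frac{dy}{dx} &= y\left(1 + \frac{2}{x} - \frac{x}{x^2 + 1}\right) \\
&= \frac{e^x x^2}{\sqrt{x^2 + 1}}\left(1 + \frac{2}{x} - \frac{x}{x^2 + 1}\right)
\end{aligned}
\]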

By manipulating the equation first, we could make our lives slightly less difficult.


Exercise 4.43 Inverse trigonometric functions


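Use implicit differentiation to show that
\[ \frac{d}{dx}\arcsin(x) = \frac{1}{\sqrt{1 - x^2}} \]

Solution: Start by rearranging y = arcsin(x) to get x = sin(y). Differentiate both sides
to get:
\[
\begin{aligned}
\frac{d}{dx}\, x &= \frac{d}{dx}\sin(y) \\
1 &= \cos(y)\frac{dy}{dx} \\
\frac{dy}{dx} &= \frac{1}{\cos(y)}
\end{aligned}
\]

To write the equation back in terms of x, we make use of the trigonometric identity
\[ \cos^2(y) + \sin^2(y) = 1 \]

So we get (taking the positive root, since y = arcsin(x) lies between −π/2 and π/2, where cos(y) ≥ 0)
\[
\begin{aligned}
\cos(y) &= \sqrt{1 - \sin^2(y)} \\
&= \sqrt{1 - x^2}
\end{aligned}
\]

Hence we have
\[ \frac{d}{dx}\arcsin(x) = \frac{1}{\sqrt{1 - x^2}} \]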

Exercise 4.44 Energy Conservation

Suppose that y(t) satisfies the equation $y''(t) + ky(t) = 0$ for a constant k. Define g(t) to
be
\[ g(t) = \left(y'(t)\right)^2 + k\left(y(t)\right)^2 \]
and find $g'(t)$.

Solution: We start by taking the derivative term by term:

• To find the derivative of $w = (y'(t))^2$, set $u = y'(t)$ to get $w = u^2$. Then by the
chain rule, we have:
\[
\begin{aligned}
\frac{dw}{dt} &= \frac{dw}{du}\frac{du}{dt} \\
&= 2u\,y''(t) \\
&= 2y'(t)y''(t)
\end{aligned}
\]

• The derivative of $k(y(t))^2$ is $2ky'(t)y(t)$.

Combining, we get:
\[
\begin{aligned}
g'(t) &= 2y'(t)y''(t) + 2ky'(t)y(t) \\
&= 2y'(t)\left(y''(t) + ky(t)\right)
\end{aligned}
\]

From the original assumption on y(t), the bracketed term is exactly zero. Hence we have
$g'(t) = 0$.

In the above example, g(t) is related to the energy of the system modelled by the differential
equation $y''(t) + ky(t) = 0$. Since $g'(t) = 0$, we have that g(t) is constant, which means that
energy is conserved.
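
To illustrate the conservation numerically, the sketch below takes one particular solution, y(t) = cos(√k t) with k = 4. Both the solution and the value of k are our own choices for illustration. Evaluating g(t) at several times shows that it stays equal to k.

```python
import math

k = 4.0                      # hypothetical constant, chosen for illustration
omega = math.sqrt(k)


def y(t):
    """One solution of y'' + k*y = 0."""
    return math.cos(omega * t)


def y_prime(t):
    return -omega * math.sin(omega * t)


def g(t):
    """The quantity g(t) = (y'(t))^2 + k (y(t))^2 from Exercise 4.44."""
    return y_prime(t) ** 2 + k * y(t) ** 2


print([round(g(t), 12) for t in [0.0, 0.3, 1.0, 2.5]])   # [4.0, 4.0, 4.0, 4.0]
```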

4.8 Parametric Differentiation


For more complicated curves, we could consider a parametric approach as opposed to relating x
and y explicitly via a single equation.
A parametric curve is defined by two (or more) functions x(t) and y(t) connected via the parameter
t. We can think of the parametric curve as the collection of points (x(t), y(t)) that result from
the specified range of t.

Example 4.45
The equation of a circle of radius r is given by
\[ x^2 + y^2 = r^2 \]

In parametric form, this could be written as the pair of equations:
\[ x(t) = r\cos(t) \qquad y(t) = r\sin(t) \]

We can also parametrise the same equation as
\[ x(t) = r\sin(t) \qquad y(t) = r\cos(t) \]

It is important to realise that there could be multiple parametrisations of the same equation.

Suppose that a bicycle is moving in a straight line at constant speed, and consider the path traced out by a specific point on a wheel. After $t$ seconds, the position of the point is

$$x(t) = t - \sin(t), \qquad y(t) = 1 - \cos(t)$$

The plot is shown below.


Figure 4.4: Cycloid

In this section we want to find $\frac{dy}{dx}$ for a curve defined by parametric equations. Recall that $\frac{dy}{dx}$ is the slope of the curve at a given point. In the example shown in Figure 4.4, at any time $t$, the gradient $\frac{dy}{dx}$ is the slope of the curve at the point $(x(t), y(t))$.
Theorem 4.46 Parametric differentiation
Suppose that a curve is given by the parametric equations $(x(t), y(t))$. Wherever $x'(t) \neq 0$, the derivative is given by:
$$\frac{dy}{dx} = \frac{dy}{dt} \times \frac{1}{\frac{dx}{dt}} = \frac{y'(t)}{x'(t)}$$

Exercise 4.47
Consider the parametric curve defined by $x(t) = t^2 + 3$, $y(t) = 4t^3$. Find $\frac{dy}{dx}$.

Solution: Since $\frac{dx}{dt} = 2t$ and $\frac{dy}{dt} = 12t^2$, we have:

$$\frac{dy}{dx} = \frac{dy}{dt} \times \frac{1}{\frac{dx}{dt}} = \frac{12t^2}{2t} = 6t$$

Exercise 4.48
Find $\frac{dy}{dx}$ for the cycloid defined by $x(t) = t - \sin(t)$ and $y(t) = 1 - \cos(t)$.

Solution: In this case, we have $x'(t) = 1 - \cos(t)$, $y'(t) = \sin(t)$. Hence

$$\frac{dy}{dx} = \frac{y'(t)}{x'(t)} = \frac{\sin(t)}{1 - \cos(t)}$$
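To see that this really is the slope of the cycloid, the sketch below estimates the slope directly from two nearby points on the curve and compares it with the formula. The function names and the step `h` are illustrative.

```python
import math

def x(t):
    return t - math.sin(t)

def y(t):
    return 1 - math.cos(t)

def slope_formula(t):
    """dy/dx = y'(t) / x'(t) for the cycloid; undefined where cos(t) = 1."""
    return math.sin(t) / (1 - math.cos(t))

def slope_numeric(t, h=1e-6):
    """Rise over run between two nearby points on the curve."""
    rise = y(t + h) - y(t - h)
    run = x(t + h) - x(t - h)
    return rise / run

for t in (0.5, math.pi / 2, 2.0, 4.0):
    print(t, slope_formula(t), slope_numeric(t))
```

At $t = \pi/2$ both estimates should be close to 1. At $t = 0, 2\pi, \ldots$ the denominator vanishes, which corresponds to the cusps where the point touches the ground.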

The second derivative of a parametric curve defined by the equations $(x(t), y(t))$ is given by
$$\frac{d^2y}{dx^2} = \frac{d}{dt}\left(\frac{dy}{dx}\right) \times \frac{1}{\frac{dx}{dt}}$$

Exercise 4.49
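Find $\frac{d^2y}{dx^2}$ for the parametric equation defined by $x(t) = 2t^2$ and $y(t) = t^4 - 8t^3$.

Solution: Since $\frac{dx}{dt} = 4t$ and $\frac{dy}{dt} = 4t^3 - 24t^2$,

$$\frac{dy}{dx} = \frac{dy}{dt} \times \frac{1}{\frac{dx}{dt}} = \frac{4t^3 - 24t^2}{4t} = t^2 - 6t$$

Hence,

$$\frac{d^2y}{dx^2} = \frac{d}{dt}\left(\frac{dy}{dx}\right) \times \frac{1}{\frac{dx}{dt}} = (2t - 6) \times \frac{1}{4t} = \frac{t - 3}{2t}$$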


4.9 Curve Sketching


One common application of differentiation is to locate the maxima and minima of a function.
Recall the following definition of extrema:

Definition 4.50 Extrema


Given a function $f$ and a point $c$ in its domain, consider the point $(c, f(c))$ on the graph:
• The point is a global maximum of the function if f (c) ≥ f (x) for all x in the domain.
• The point is a global minimum of the function if f (c) ≤ f (x) for all x in the domain.
• The point is a local maximum of the function if f (c) ≥ f (x) for all x near c.
• The point is a local minimum of the function if f (c) ≤ f (x) for all x near c.

Example 4.51
Consider the following function:
[Figure: graph of the function, with the points a, b and c marked.]

On the shown domain, the point a is both a local maximum and a global maximum. The point
c is both a local minimum and global minimum. However, if the function was extended beyond
the given domain, then the points may no longer be global maximum and minimum.

To find local and global extrema, we make use of stationary points.

Definition 4.52 Stationary point


Let $f$ be a differentiable function. A point $c$ in the domain is called a stationary point if $f'(c) = 0$.

Example 4.53
The function in Example 4.51 has three stationary points in the given domain located at x = a,
x = b, and x = c.

For a differentiable function, local extrema occur at stationary points: if a point $(c, f(c))$ is a local extremum, then it must also be a stationary point.
The order of the previous statement is critical. Whilst local extrema happen at stationary points, not all stationary points are local extrema.


The problem of classifying stationary points is to determine which category each stationary point
is in. To answer this question, we will need to define when a function is increasing or decreasing.

Definition 4.54 Intervals of increase/decrease

A function $f$ is increasing on an interval $a \le x \le b$ if for every $u, v$ in the interval, $u < v$ implies $f(u) < f(v)$.
A function $f$ is decreasing on an interval $a \le x \le b$ if for every $u, v$ in the interval, $u < v$ implies $f(u) > f(v)$.

In other words, a function is increasing if the output increases as the input increases. Conversely, a function is decreasing if the output decreases as the input increases. The following diagram shows examples of functions that are increasing, decreasing, and neither increasing nor decreasing over the interval $0 < x < 2$.
[Figure: three plots over 0 < x < 2, titled "Increasing function", "Decreasing function" and "Neither".]

Example 4.55
Looking at Example 4.51, we see that the intervals of increase and decrease are exactly partitioned by the stationary points.
• The function is increasing when x < a and x > c
• The function is decreasing when a < x < b and b < x < c.

Since the stationary points can be found by using derivatives, we expect a similar connection between derivatives and intervals of increase or decrease.
Theorem 4.56
Consider a differentiable function f on the interval a < x < b.
• If $f'(x) > 0$ on the interval, then $f(x)$ is an increasing function on that interval.
• If $f'(x) < 0$ on the interval, then $f(x)$ is a decreasing function on that interval.

The definition of increasing/decreasing functions does not depend on the existence of the
derivative. But this theorem tells us that knowing the derivative will make our lives easier.


Example 4.57
• The function $f(x) = e^x$ has derivative $f'(x) = e^x > 0$ for all $x$. Hence $f(x)$ is increasing over the entire number line.
• The function $f(x) = -x^3 - x$ has derivative $f'(x) = -3x^2 - 1 < 0$ for all $x$. Hence $f(x)$ is decreasing over the entire number line.
• The function $f(x) = -x^2$ has derivative $f'(x) = -2x$.
– The derivative is positive for $x < 0$ and hence $f(x)$ is increasing for $x < 0$
– The derivative is negative for $x > 0$ and hence $f(x)$ is decreasing for $x > 0$

Exercise 4.58
Consider the function $f(x) = 5 - x^2$. Determine the intervals where $f(x)$ is increasing or decreasing.

Solution: First find the derivative: $f'(x) = -2x$. There is a stationary point at $x = 0$. So we look at the derivative on both sides.
• When $x < 0$, $f'(x) > 0$. So $f$ is increasing when $x < 0$
• When $x > 0$, $f'(x) < 0$. So $f$ is decreasing when $x > 0$

Exercise 4.59
Consider the function $f(x) = 2x^3 - 3x^2$. Determine the intervals where $f(x)$ is increasing or decreasing.

Solution: First find the derivative: $f'(x) = 6x^2 - 6x = 6x(x - 1)$. There are stationary points at $x = 0$ and $x = 1$. So we look at the derivative after partitioning the domain
• When $x < 0$, $f'(x) > 0$. So $f$ is increasing when $x < 0$
• When $0 < x < 1$, $f'(x) < 0$. So $f$ is decreasing when $0 < x < 1$
• When $1 < x$, $f'(x) > 0$. So $f$ is increasing when $1 < x$
Hence, $f$ is decreasing when $0 < x < 1$ and $f$ is increasing when $x < 0$ or $1 < x$.
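The partitioning procedure used in this solution can be sketched in a few lines. Because $f'$ is continuous and only changes sign at a stationary point, testing one sample value inside each interval is enough; the sample points and helper names below are illustrative choices.

```python
def f_prime(x):
    """Derivative of f(x) = 2x^3 - 3x^2 from Exercise 4.59."""
    return 6 * x * (x - 1)

def behaviour(sample_x):
    """Classify f near sample_x from the sign of f'(sample_x)."""
    slope = f_prime(sample_x)
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "stationary"

# One test point inside each interval cut out by the stationary points 0 and 1.
samples = {"x < 0": -1.0, "0 < x < 1": 0.5, "x > 1": 2.0}
result = {interval: behaviour(x) for interval, x in samples.items()}
print(result)
```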

Exercise 4.60
A resistor of 3 ohms connected in parallel with a variable resistor of x ohms has a combined
resistance R(x) given by
$$R(x) = \frac{3x}{x + 3}$$
Show that $R(x)$ is an increasing function for $x > 0$.


Solution: One way to show that $R$ is increasing is to show that the derivative $R'(x) > 0$ for $x > 0$. Using the quotient rule

$$R'(x) = \frac{3(x + 3) - 3x}{(x + 3)^2} = \frac{9}{(x + 3)^2}$$

Since numerator and denominator are both positive for $x > 0$, we have $R'(x) > 0$ and hence $R$ is an increasing function.
We can validate this with a plot of $R(x)$.
[Figure: plot of $R(x) = \frac{3x}{x+3}$ for $0 \le x \le 3$.]

Exercise 4.61
Consider the function $f(x) = x + \frac{1}{x}$ on the domain $x > 0$. Determine the intervals where $f(x)$ is increasing or decreasing.

Solution: First find the derivative: $f'(x) = 1 - \frac{1}{x^2} = \frac{x^2 - 1}{x^2}$. There is a stationary point at $x = 1$. So we look at the derivative after partitioning the domain
• When $0 < x < 1$, $f'(x) < 0$. So $f$ is decreasing when $0 < x < 1$
• When $1 < x$, $f'(x) > 0$. So $f$ is increasing when $1 < x$
Hence, $f$ is decreasing when $0 < x < 1$ and $f$ is increasing when $1 < x$. We can plot the function to validate our answer.
[Figure: plot of $f(x) = x + \frac{1}{x}$ for $0 < x \le 3$.]


At this point, we have the machinery to start classifying stationary points.


Theorem 4.62 First derivative test
Let $f$ be a differentiable function and $(c, f(c))$ be a stationary point of $f$. For $x$ near $c$:
• If $f'(x) > 0$ for $x < c$ and $f'(x) < 0$ for $x > c$, then $x = c$ is a local maximum.
• If $f'(x) < 0$ for $x < c$ and $f'(x) > 0$ for $x > c$, then $x = c$ is a local minimum.
• If neither condition holds, this test is inconclusive.

This result tells us we should first find the stationary points and then use the sign of the derivative
on either side of the stationary point to determine the nature of the stationary point.

Exercise 4.63
Find and classify the stationary point of $f(x) = x^2 - 6x + 3$.

Solution: To find stationary points, we solve $f'(x) = 0$. Since $f'(x) = 2x - 6$, we have $x = 3$ as the only stationary point.
To classify the stationary point, we consider the derivative on either side:
• When $x < 3$, $f'(x) < 0$. So $f$ is decreasing when $x < 3$
• When $x > 3$, $f'(x) > 0$. So $f$ is increasing when $x > 3$
By the first derivative test, we have a local minimum at $x = 3$.

Exercise 4.64
Find and classify the stationary point of $f(x) = x + \frac{4}{x}$ on the domain $x > 0$.

Solution: To find stationary points, we solve $f'(x) = 0$. Since $f'(x) = 1 - \frac{4}{x^2} = \frac{x^2 - 4}{x^2}$, we have $x = 2$ as the only stationary point in the domain.
To classify the stationary point, we consider the derivative on either side:
• When $0 < x < 2$, $f'(x) < 0$. So $f$ is decreasing when $0 < x < 2$
• When $x > 2$, $f'(x) > 0$. So $f$ is increasing when $x > 2$
By the first derivative test, we have a local minimum at $x = 2$.

Exercise 4.65
Find and classify the stationary points of $f(x) = 2x^3 - 6x$. Determine the intervals where $f$ is increasing or decreasing.

Solution: To find stationary points, we solve $f'(x) = 0$. Since $f'(x) = 6x^2 - 6 = 6(x^2 - 1)$, we have $x = -1, 1$ as stationary points.
To classify the stationary points, we partition the domain by the stationary points and determine the sign of the derivative in each interval.

x         x < −1     −1 < x < 1     x > 1
f'(x)     > 0        < 0            > 0

By the first derivative test, we have a local maximum at $x = -1$ (the derivative changes from positive to negative) and a local minimum at $x = 1$ (the derivative changes from negative to positive).
To obtain the intervals of increase and decrease, we have:
• When $x < -1$, $f'(x) > 0$. So $f$ is increasing when $x < -1$
• When $-1 < x < 1$, $f'(x) < 0$. So $f$ is decreasing when $-1 < x < 1$
• When $1 < x$, $f'(x) > 0$. So $f$ is increasing when $1 < x$
Hence $f(x)$ is decreasing when $-1 < x < 1$ and increasing when $x < -1$ or $x > 1$.

If you are familiar with the effect of the second derivative on the shape of the function, the following test is an alternative method for classifying stationary points.
Theorem 4.66 Second derivative test
Let $f$ be a twice differentiable function and $x = c$ be a stationary point of $f$.
• If $f''(c) > 0$ then $x = c$ is a local minimum.
• If $f''(c) < 0$ then $x = c$ is a local maximum.
• If $f''(c) = 0$ then this test is inconclusive.

Example 4.67
Consider the function $f(x) = e^{3x} - 3e^x$. We have
$$f'(x) = 3e^{3x} - 3e^x = 3e^x\left((e^x)^2 - 1\right)$$
$$f''(x) = 9e^{3x} - 3e^x$$

From $f'$, we solve $(e^x)^2 - 1 = 0$ to give $x = 0$ as a stationary point. The second derivative is $f''(0) = 9 - 3 = 6 > 0$, so that $x = 0$ is a local minimum by the second derivative test.
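The second derivative test is mechanical enough to write as a small function. The sketch below hard-codes the derivatives from Example 4.67; the `classify` helper is a hypothetical name used only for illustration.

```python
import math

def f_prime(x):
    """First derivative of f(x) = e^(3x) - 3e^x."""
    return 3 * math.exp(3 * x) - 3 * math.exp(x)

def f_second(x):
    """Second derivative of f(x) = e^(3x) - 3e^x."""
    return 9 * math.exp(3 * x) - 3 * math.exp(x)

def classify(c):
    """Apply the second derivative test at a stationary point c."""
    curvature = f_second(c)
    if curvature > 0:
        return "local minimum"
    if curvature < 0:
        return "local maximum"
    return "inconclusive"

c = 0.0
print(f_prime(c), f_second(c), classify(c))
```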

Exercise 4.68
Sketch $f(x) = x^3 - x^2 - x + 1$

Solution: We will try to extract all the information we can from $f$ and then $f'$.
• Since $f(x) = (x + 1)(x - 1)^2$, we have:
– y-intercept at $(0, 1)$
– Roots at $x = -1, 1$. Partition the domain by roots.
– $f(x) > 0$ when $-1 < x < 1$ and $1 < x$
– $f(x) < 0$ when $x < -1$
• Looking at the first derivative $f'(x) = 3x^2 - 2x - 1 = (3x + 1)(x - 1)$
– Partition the domain by the stationary points at $x = -\frac{1}{3}$ and $x = 1$.
– $f'(x) > 0$ when $x < -\frac{1}{3}$, so $f$ is increasing on the interval
– $f'(x) < 0$ when $-\frac{1}{3} < x < 1$, so $f$ is decreasing on the interval
– $f'(x) > 0$ when $1 < x$, so $f$ is increasing on the interval
Combining this information, we can sketch a graph of the function.
[Figure: sketch of $f(x) = x^3 - x^2 - x + 1$ for $-2 \le x \le 2$.]

4.10 Related Rates


Related rates are a practical application of differentiation. The chain rule can be used to connect
different rates of change. We will illustrate this through a series of examples.

Example 4.69
Suppose two quantities $x, y$ are related via a function $y(x)$. If we are given how fast $x$ changes with time $t$, i.e. $\frac{dx}{dt}$, we can determine $\frac{dy}{dt}$ using the chain rule:
$$\frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt}$$

Consider the following examples.

Exercise 4.70
Let $y = x^2$. If $\frac{dx}{dt} = 5$, find $\frac{dy}{dt}$ when $x = 3$.

Solution: Using the chain rule:

$$\frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt}$$
Since we have $\frac{dy}{dx} = 2x$, we have that $\frac{dy}{dt} = 10x$. When $x = 3$, we have $\frac{dy}{dt} = 30$.

Exercise 4.71
The lengths of the sides of a square are changing. The area of the square is increasing at a rate of $16\,\text{cm}^2$ per hour. At what rate is the length of each side increasing when the length is 4 cm?


Solution: Define $x$ to be the length of a side of the square and $y$ to be the area. Both $x$ and $y$ are changing as time $t$ changes. The area of the square is given by $y = x^2$.
We are given that $\frac{dy}{dt} = 16$ and want to find $\frac{dx}{dt}$ when $x = 4$.
Using $\frac{dy}{dx} = 2x$ and the chain rule

$$\frac{dy}{dt} = \frac{dy}{dx} \times \frac{dx}{dt}$$
$$16 = 2x \times \frac{dx}{dt}$$
$$16 = 8\,\frac{dx}{dt} \quad \text{when } x = 4$$
Hence $\frac{dx}{dt} = 2$. That is, the length is increasing at 2 cm per hour when $x = 4$ cm.
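One way to convince yourself of this answer is to let the area grow for a very short time and measure how much the side length changes. The sketch below does this; the time step `dt` and the function names are illustrative, and it assumes the area rate stays at 16 over that short step.

```python
import math

AREA_RATE = 16.0  # dy/dt in cm^2 per hour, from the exercise

def side_rate(x):
    """dx/dt obtained from the chain rule dy/dt = 2x * dx/dt."""
    return AREA_RATE / (2 * x)

def side_rate_numeric(x, dt=1e-6):
    """Grow the area for a short time dt and measure the change in side length."""
    area = x * x
    new_side = math.sqrt(area + AREA_RATE * dt)
    return (new_side - x) / dt

print(side_rate(4.0), side_rate_numeric(4.0))
```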

Exercise 4.72
Water is dropping into a circular puddle at a uniform rate of $1\,\text{cm}^3$ per second. The depth of the puddle remains constant at 0.1 cm. When the radius of the puddle is 5 cm, find
1. the rate of increase of the radius,
2. the rate of increase of the area,
3. the rate of increase of the circumference.

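Solution: Define $r$ to be the radius of the puddle, $V$ the volume, $A$ the area, and $C$ the circumference. We have the following connections between the variables:
$$V = \frac{\pi r^2}{10}, \qquad A = \pi r^2, \qquad C = 2\pi r$$
1. We are given that $\frac{dV}{dt} = 1$. We have to find $\frac{dr}{dt}$ when $r = 5$.
Using the chain rule
$$\frac{dV}{dt} = \frac{dV}{dr} \times \frac{dr}{dt}$$
$$1 = \frac{2\pi r}{10} \times \frac{dr}{dt}$$
$$\frac{dr}{dt} = \frac{10}{2\pi r}$$
$$\frac{dr}{dt} = \frac{10}{2\pi(5)} \quad \text{when } r = 5$$

Hence we have $\frac{dr}{dt} = \frac{1}{\pi}$ when $r = 5$.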
2. To find $\frac{dA}{dt}$, we use $A = \pi r^2$ and the chain rule to obtain

$$\frac{dA}{dt} = \frac{dA}{dr} \times \frac{dr}{dt} = 2\pi r \times \frac{dr}{dt}$$
$$\frac{dA}{dt} = 2\pi(5) \times \frac{1}{\pi} = 10 \quad \text{when } r = 5$$
So $\frac{dA}{dt} = 10$ when $r = 5$.
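3. To find $\frac{dC}{dt}$, we use $C = 2\pi r$ and the chain rule to obtain

$$\frac{dC}{dt} = \frac{dC}{dr} \times \frac{dr}{dt} = 2\pi \times \frac{dr}{dt}$$
$$\frac{dC}{dt} = 2\pi \times \frac{1}{\pi} = 2 \quad \text{when } r = 5$$
So $\frac{dC}{dt} = 2$ when $r = 5$.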


5 Introduction to Integration
5.1 Antidifferentiation (Indefinite Integration)
Now that we have differentiation, it is natural to ask whether we can reverse the operation. Starting with a function $f(x)$, an antiderivative $F(x)$ is a function such that $\frac{d}{dx}F(x) = f(x)$.

Example 5.1
• Since the derivative $\frac{d}{dx}x^2 = 2x$, the function $x^2$ is an antiderivative of the function $2x$.
• Since the derivative $\frac{d}{dx}x^3 = 3x^2$, the function $x^3$ is an antiderivative of the function $3x^2$.
• Since the derivative $\frac{d}{dx}\sin(x) = \cos(x)$, the function $\sin(x)$ is an antiderivative of the function $\cos(x)$.
• Since the derivative $\frac{d}{dx}e^x = e^x$, the function $e^x$ is an antiderivative of the function $e^x$.

Unfortunately, the story is more complicated.

Example 5.2
Since the derivatives of the functions $x^2$, $x^2 + 1$, $x^2 - 3$, $x^2 + e^{\pi}$ are all $2x$, they are all antiderivatives of the function $2x$.

The above example shows that every function of the form "$x^2$ + some constant" is an antiderivative of $2x$. In other words, the antiderivative of $2x$ is the entire class of functions of the form $x^2 + C$ for any constant $C$.

Definition 5.3 Antiderivative


Given a function $f(x)$, an antiderivative $F(x)$ is a function such that $\frac{d}{dx}F(x) = f(x)$. The class of antiderivatives is denoted
$$\int f(x)\,dx = F(x) + C$$

To further complicate issues, there are functions for which there are no easy antiderivatives. For example, $\int e^{-x^2}\,dx$ cannot be written in terms of elementary functions.

For reasons that will become apparent in the next section, the term antiderivative is also commonly
known as the indefinite integral.

Example 5.4
• Since the derivative $\frac{d}{dx}x^2 = 2x$, we have
$$\int 2x\,dx = x^2 + C$$

• Since the derivative $\frac{d}{dx}x^3 = 3x^2$, we have
$$\int 3x^2\,dx = x^3 + C$$

• Since the derivative $\frac{d}{dx}\sin(x) = \cos(x)$, we have
$$\int \cos(x)\,dx = \sin(x) + C$$

• Since the derivative $\frac{d}{dx}e^x = e^x$, we have
$$\int e^x\,dx = e^x + C$$

Because addition and constant multiplication work nicely with derivatives, we expect these operations to work nicely with antiderivatives too.
Theorem 5.5 Linearity Rules
Given two functions $f(x)$ and $g(x)$ for which antiderivatives exist and a constant $k$, we have
$$\int \bigl(f(x) + g(x)\bigr)\,dx = \int f(x)\,dx + \int g(x)\,dx$$
$$\int \bigl(f(x) - g(x)\bigr)\,dx = \int f(x)\,dx - \int g(x)\,dx$$
$$\int k f(x)\,dx = k \int f(x)\,dx$$

Further, we have the property that
$$\frac{d}{dx}\left(\int f(x)\,dx\right) = f(x)$$

Example 5.6
Since we know
$$\frac{d}{dx}x^{n+1} = (n + 1)x^n$$
we can apply the above result to get
$$\int x^n\,dx = \frac{x^{n+1}}{n + 1} + C \quad \text{for } n \neq -1$$

Since we have a table of standard derivatives:

$f(x)$              $f'(x)$
$x^n$               $n x^{n-1}$
$\sin(ax + b)$      $a\cos(ax + b)$
$\cos(ax + b)$      $-a\sin(ax + b)$
$e^{ax}$            $a e^{ax}$
$\ln(ax + b)$       $\frac{a}{ax + b}$
we can read the table backwards and account for constants to get the following.
Theorem 5.7 Standard Antiderivatives
We have the following standard antiderivatives
$f(x)$                $\int f(x)\,dx$
$(ax + b)^n$          $\frac{(ax + b)^{n+1}}{a(n + 1)} + C$ for $n \neq -1$
$\sin(ax + b)$        $-\frac{1}{a}\cos(ax + b) + C$
$\cos(ax + b)$        $\frac{1}{a}\sin(ax + b) + C$
$e^{ax}$              $\frac{1}{a}e^{ax} + C$
$\frac{1}{ax + b}$    $\frac{1}{a}\ln(ax + b) + C$ for $ax + b > 0$

The additional conditions on some of the antiderivatives are essential.

Exercise 5.8
Find the antiderivative of
$$\frac{1}{x} + 2x$$
with respect to $x$ for $x > 0$.


Solution: Using the rules of the table, we have:

$$\int \left(\frac{1}{x} + 2x\right)dx = \int \frac{1}{x}\,dx + \int 2x\,dx$$
$$= \int \frac{1}{x}\,dx + 2\int x\,dx$$
$$= \ln(x) + 2 \cdot \frac{x^2}{2} + C$$
$$= \ln(x) + x^2 + C.$$

Example 5.9
Some further examples of antidifferentiation along these lines are:

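$$\int (3 + 8x)\,dx = 3x + \frac{8x^2}{2} + C = 3x + 4x^2 + C$$

$$\int \left(5 - x^2 + \frac{18}{x^4}\right)dx = \int \left(5 - x^2 + 18x^{-4}\right)dx = 5x - \frac{x^3}{3} - \frac{18x^{-3}}{3} + C = 5x - \frac{x^3}{3} - 6x^{-3} + C$$

$$\int \left(\cos(7x) + 2e^{-5x}\right)dx = \frac{\sin(7x)}{7} - \frac{2e^{-5x}}{5} + C$$

$$\int \left(\frac{6}{3t - 5} - 4\sin 2t\right)dt = 2\ln(3t - 5) + 2\cos 2t + C$$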

One application of antidifferentiation in a physical context is to calculate the work done by a variable force. For a constant force, we have Work = Force × Distance. But to compute the work done by a variable force, we need something more complicated.


Exercise 5.10
Two electric charges with charge $q_1$ and $q_2$ respectively are separated by a distance $r$ in a vacuum and $f(r)$ is the force exerted. Coulomb's inverse square law states
$$f(r) = \frac{q_1 q_2}{4\pi\epsilon_0 r^2}$$
where $\epsilon_0$ is a constant. Find the work done $w = \int f(r)\,dr$.

Solution:
$$w = \int f(r)\,dr = \int \frac{q_1 q_2}{4\pi\epsilon_0 r^2}\,dr = \frac{q_1 q_2}{4\pi\epsilon_0}\int \frac{1}{r^2}\,dr = -\frac{q_1 q_2}{4\pi\epsilon_0}\,\frac{1}{r} + C$$

5.2 Area Under Curves


Integration is the branch of mathematics that is related to finding areas under a given curve.

Exercise 5.11
Find the area under the function f (x) = 2x between x = 0 and x = 5.

Solution: Let's start with a picture:

[Figure: the line $y = 2x$ between the points $(0, 0)$ and $(5, 10)$.]
Since the shape of the area is a triangle, we can obtain the area using the geometric formula. We know $f(5) = 10$. Hence the area under this function is $A = \frac{5 \times 10}{2} = 25$.

For the purposes of this subject, we will work with the following definition of integration.

Definition 5.12 Definite Integral

The area under the graph of the function $f(x)$ between $a \le x \le b$ is given by the definite integral
$$\text{Area} = \int_a^b f(x)\,dx$$

The right-hand side of the equation is read as “The definite integral between a and b of f (x) dx”.
Before we proceed, we note a few things about the definite integral:
• a is the lower limit of integration, or lower terminal,
• b is the upper limit of integration, or upper terminal,
• The function f (x) is the integrand.
If the curve is below the x-axis, then the definite integral counts the area between the curve and the axis as negative.

Example 5.13
We will illustrate the rigorous definition of integration in this example. In cases where the region under the function is not a simple geometric shape, we approximate the area using rectangles with one corner touching the function. The more rectangles we use, the better the approximation. This approximation becomes exact as the number of rectangles goes to infinity.
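The rectangle approximation can be sketched directly. The example below uses the function $f(x) = 2x$ on $0 \le x \le 5$ from Exercise 5.11, whose exact area is 25, and rectangles whose top-left corner touches the curve; the function name `left_riemann_sum` is an illustrative choice.

```python
def left_riemann_sum(f, a, b, n):
    """Approximate the area under f on [a, b] using n rectangles
    whose top-left corner touches the curve."""
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))

def f(x):
    return 2 * x  # the function from Exercise 5.11

# The approximation approaches the exact area of 25 as n grows.
for n in (5, 50, 500, 5000):
    print(n, left_riemann_sum(f, 0.0, 5.0, n))
```

For this particular function the sum with n rectangles works out to 25(n − 1)/n, so the error shrinks in proportion to 1/n.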

The process of finding area under the curve is a technically challenging problem. However, the
following result gives us a more convenient approach to finding areas.
Theorem 5.14 Fundamental Theorem of Calculus
If $f$ is a continuous function on the interval $[a, b]$, then
$$\int_a^b f(x)\,dx = F(b) - F(a) = [F(x)]_a^b$$

where F is any antiderivative of f .

Since F can be any antiderivative, we can pick the one with C = 0 so that it effectively
disappears from our computations.


This result connects two fundamental ideas in calculus. Finding areas and undoing the slope are seemingly different ideas. Yet this result tells us that finding an area amounts to evaluating an antiderivative at two different points.
As a result of this connection, the antiderivative is also known as the indefinite integral. In particular, the results we have for antiderivatives apply to indefinite integrals.
For example, Theorem 5.7 contains the indefinite integrals for a range of common functions.

Exercise 5.15
Evaluate the definite integral
$$\int_3^6 x^2\,dx$$

Solution: Using the Fundamental Theorem of Calculus, we have

$$\int_a^b f(x)\,dx = [F(x)]_a^b$$

Using the table of standard indefinite integrals, we have

$$\int x^2\,dx = \frac{1}{3}x^3$$
Hence,
$$\int_3^6 x^2\,dx = \left[\frac{1}{3}x^3\right]_3^6 = \frac{6^3}{3} - \frac{3^3}{3} = 72 - 9 = 63$$

Since integration is the process of finding areas, we have the following results to help partition
complicated functions.
Theorem 5.16
Let $a < b < c$ be real numbers, then we have the following
$$\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$$
$$\int_a^a f(x)\,dx = 0$$
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$


Exercise 5.17
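Consider the function given by
$$f(x) = \begin{cases} e^x & -2 \le x < 0 \\ x^3 + 1 & 0 \le x \le 2 \end{cases}$$
and evaluate
$$\int_{-2}^{2} f(x)\,dx$$

Solution: Using the previous result, we can partition the interval at $x = 0$ to get
$$\int_{-2}^{2} f(x)\,dx = \int_{-2}^{0} f(x)\,dx + \int_{0}^{2} f(x)\,dx$$
$$= \int_{-2}^{0} e^x\,dx + \int_{0}^{2} \left(x^3 + 1\right)dx$$
$$= [e^x]_{-2}^{0} + \left[\frac{x^4}{4} + x\right]_0^2$$
$$= \left[e^0 - e^{-2}\right] + \left[\left(\frac{2^4}{4} + 2\right) - \left(\frac{0^4}{4} + 0\right)\right]$$
$$= 1 - e^{-2} + 6$$
$$= 7 - e^{-2}$$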
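The value $7 - e^{-2} \approx 6.8647$ can be cross-checked with a numerical approximation. The sketch below uses a midpoint rule (rectangles whose height is taken at the middle of each strip) and splits the interval at $x = 0$ just as the solution does; the function names and the number of strips are illustrative.

```python
import math

def f(x):
    """Piecewise function from Exercise 5.17."""
    return math.exp(x) if x < 0 else x ** 3 + 1

def midpoint_sum(func, a, b, n=10_000):
    """Approximate the integral of func on [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

# Split at x = 0 so that each piece is a single smooth formula.
numeric = midpoint_sum(f, -2.0, 0.0) + midpoint_sum(f, 0.0, 2.0)
exact = 7 - math.exp(-2)
print(numeric, exact)
```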

Exercise 5.18
Let $F(x) = e^{-3x^2}$. Find $F'(x)$ and use the result to evaluate
$$\int_0^1 x e^{-3x^2}\,dx$$

Solution: By the Chain Rule (Exercise 4.32) we have
$$F'(x) = -6x e^{-3x^2}$$

It follows that $F(x)$ is an antiderivative of $-6x e^{-3x^2}$, that is, $\int -6x e^{-3x^2}\,dx = F(x) + C$, and
$$\int_0^1 x e^{-3x^2}\,dx = -\frac{1}{6}\int_0^1 -6x e^{-3x^2}\,dx$$
$$= -\frac{1}{6}\left[e^{-3x^2}\right]_0^1$$
$$= -\frac{1}{6}\left(e^{-3} - e^0\right)$$
$$= \frac{1}{6} - \frac{e^{-3}}{6}$$

5.3 Improper Integrals


Definition 5.19 Improper integrals

In cases where one (or both) of the terminals are infinite, we define
$$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx$$
and
$$\int_{-\infty}^{b} f(x)\,dx = \lim_{a \to -\infty} \int_a^b f(x)\,dx$$
provided that the limit exists.

Integrals of this type can be calculated by applying the following steps:

1. Calculate $\int_a^b f(x)\,dx$ as normal.
2. Evaluate the limit as one (or both) terminals approach infinity.

Exercise 5.20
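Evaluate
$$\int_1^{\infty} \frac{1}{x^2}\,dx$$

Solution: For a positive number $b$ we have

$$\int_1^b \frac{1}{x^2}\,dx = \left[-\frac{1}{x}\right]_1^b = -\frac{1}{b} - (-1) = 1 - \frac{1}{b}$$

As $b$ grows, $\frac{1}{b}$ gets smaller and smaller, so we have $\lim_{b \to \infty} \frac{1}{b} = 0$. Thus as $b$ tends to infinity, $1 - \frac{1}{b}$ tends to 1. Thus
$$\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{b \to \infty} \int_1^b \frac{1}{x^2}\,dx = \lim_{b \to \infty}\left(1 - \frac{1}{b}\right) = 1.$$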
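The two-step recipe (integrate up to a finite $b$, then let $b$ grow) is easy to tabulate. The sketch below evaluates the finite integral from the antiderivative $-1/x$ for increasingly large $b$; the function name is an illustrative choice.

```python
def integral_to_b(b):
    """Integral of 1/x^2 from 1 to b, using the antiderivative -1/x."""
    return (-1 / b) - (-1 / 1)

# As b grows, the value creeps up towards the limit 1.
for b in (10, 100, 1_000, 1_000_000):
    print(b, integral_to_b(b))
```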

An infinite limit may seem counter-intuitive when computing physical areas. However, it can be
extremely useful when applied to mechanics.

Example 5.21
The force of attraction $F$ due to gravity between a space shuttle and Earth is given by
$$F = \frac{GMm}{r^2}$$
where $G$ is the gravitation constant, $M$ is the mass of the earth, $m$ is the mass of the shuttle, and $r$ is the distance between the two objects.
The work (energy) required to elevate the shuttle from height $a$ to height $b$ is given by
$$W = \int_a^b \frac{GMm}{r^2}\,dr$$

Since the gravitation force has infinite range, the total work required by the shuttle to escape the Earth's orbit can be computed when the upper terminal is infinite. That is
$$W = \int_a^{\infty} \frac{GMm}{r^2}\,dr = \lim_{b \to \infty} \int_a^b \frac{GMm}{r^2}\,dr$$
Based on the previous exercise, we have
$$W = \lim_{b \to \infty} \left[-\frac{GMm}{r}\right]_a^b = \frac{GMm}{a}$$
So the energy required by the shuttle to escape the Earth's orbit starting at distance $a$ from the Earth's core is $\frac{GMm}{a}$.

Exercise 5.22
Evaluate the integral $\int_0^{\infty} 2x\,dx$, if it exists.

Solution: We have
$$\int_0^b 2x\,dx = [x^2]_0^b = b^2$$

As $b$ tends to infinity, the term $b^2$ does not approach a finite number. Hence the integral does not exist.


6 Statistics and Probability


Statistics and probability are two sides of the same coin. Statistics is the study of collecting and analysing data. On the other hand, probability aims to develop models to predict what we expect[9] to happen. The similarities and differences between the data collected and the theoretical model can tell us something about the behaviour of the system.

6.1 Statistics
Statistics is concerned with the collection and analysis of data. Data consists of observations of one or more variables. The first consideration of data collection is the nature of the observations made. Observations generally fall into several categories: Categorical, Ordinal, and Cardinal. In this subject, we will be focused on cardinal data. This type of data can be divided into discrete or continuous. Discrete data are exact and require no rounding, for example the number of people living in a house or a person's shoe size. Continuous data require rounding of some sort and can take on any value within a given range, for example the temperature of an electronic component or the time required to complete a race.

6.2 Measures of Average


With advances in technology, modern day data sets can contain thousands, or even millions of
observations. Extracting useful information from large data sets is one of the most important
goals of data analysis. In this subject, we will look at several ways to summarise large data sets
using a single measurement or statistic.
Two common statistics used to measure the average value of cardinal data are the mean and the median.

Definition 6.1 Mean and median


Let $n$ be the number of observations made and label the observations $x_1, x_2, \ldots, x_n$.
The mean, denoted $\mu$ or $\bar{x}$, is defined by:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n}(x_1 + x_2 + \ldots + x_n)$$

The median is found by listing the observations in ascending order ($x_1 \le x_2 \le \ldots \le x_n$) and then taking the middle observation:
$$\text{median} = \begin{cases} \frac{1}{2}\left(x_{n/2} + x_{n/2 + 1}\right) & n \text{ even} \\ x_{(n+1)/2} & n \text{ odd} \end{cases}$$

The mean is the sum of all observations divided by the number of observations. The median is the "middle" observation if $n$ is odd and the midpoint of the middle two observations when $n$ is even.

[9] We use this word loosely, as there are different interpretations of expectation, even amongst experts.

Exercise 6.2
Find the mean and median of the following data set.

5 7 −4 1 9 12

Solution: First, we count the number of observations to get $n = 6$. Then we can apply the formula to get:
$$\bar{x} = \frac{1}{6}(5 + 7 - 4 + 1 + 9 + 12) = \frac{30}{6} = 5$$
To obtain the median, we first arrange the observations in ascending order: $-4, 1, 5, 7, 9, 12$. Since $n$ is even, we take the midpoint of the middle two observations. So the median is $\frac{5 + 7}{2} = 6$.
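For larger data sets these calculations are usually delegated to software. As a small illustration, the sketch below reproduces the two answers using the `mean` and `median` functions from Python's standard `statistics` module.

```python
import statistics

data = [5, 7, -4, 1, 9, 12]

mean = statistics.mean(data)      # sum of the observations divided by n
median = statistics.median(data)  # midpoint of the two middle values, since n is even

print(sorted(data))
print("mean:", mean, "median:", median)
```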

There is no “right” answer for the correct choice of average. The sensible choice will depend on
the nature of the data sets. The mean is more sensitive to changes in individual observations,
and outliers in particular. On the other hand, the median is more robust to outliers and changes
to individual observations.

Example 6.3
Consider the following scenarios:
• Suppose we are computing the average weight of a component off a mechanical production line. The mean will be a sensible measure of average since almost all components will be of similar weight with very few outliers. Large changes in the mean over a short period of time might signal a malfunction. On the other hand, the median may not be sensitive enough to detect such anomalies.
• Suppose we want to compute the average household income for a population. The mean
will not be descriptive of a typical household since it will be heavily skewed towards the
small number of extremely wealthy households leaving the majority of the population with
“below average” income. In this case, the median is typically quoted as being a better
representation of the population.

Some notes about manipulating data points:


• If a constant a is added to all data, then both the mean and the median increase by a.
• If all data are multiplied by b, then both the mean and the median are multiplied by b.
Another common notion of average of a data set is the Mode. This notion is commonly applied
for categorical or ordinal data and is the value which occurs with highest frequency. That is, the
most common observation in the data set. If there exists a unique observation with the highest
frequency, then that is the mode of the data set. If two observations share the highest frequency, then there are two modes. In general, the data has N modes if there are N observations that
share the highest frequency.

6.3 Standard Deviation


In addition to the average value, a second statistic called standard deviation can be used to
describe the average distance the observations are from the mean.

Definition 6.4 Standard deviation

Let $n$ be the number of observations with mean $\bar{x}$ and label the observations $x_1, x_2, \ldots, x_n$. The standard deviation, denoted $\sigma$, is defined as the square root of the sample variance, denoted $\sigma^2$:
$$\sigma^2 = \frac{1}{n - 1}\sum_{i=1}^{n}\left[(x_i - \bar{x})^2\right].$$

The following is a continuation of Exercise 6.2.

Exercise 6.5
Find the standard deviation of the following dataset:

5 7 −4 1 9 12

Solution: From Exercise 6.2, we have $\bar{x} = 5$. So

$$\sigma^2 = \frac{1}{6 - 1}\left[(5 - 5)^2 + (7 - 5)^2 + (-4 - 5)^2 + (1 - 5)^2 + (9 - 5)^2 + (12 - 5)^2\right] = 33.2$$

Thus the standard deviation is $\sigma = \sqrt{33.2} = 5.76$.

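An equivalent formula for computing variance is given by

$$\sigma^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n}\left(x_i^2\right) - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right]$$

Using the alternative formula for $\sigma^2$ on the previous exercise, we have

$$\sum_{i=1}^{n}\left(x_i^2\right) = 5^2 + 7^2 + (-4)^2 + 1^2 + 9^2 + 12^2 = 316, \qquad \sum_{i=1}^{n} x_i = 30$$

Thus
$$\sigma^2 = \frac{1}{5}\left[316 - \frac{30^2}{6}\right] = \frac{166}{5} = 33.2$$
as before, so we have $\sigma = 5.76$.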
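The agreement between the two variance formulas can also be confirmed in code. The sketch below computes both by hand and compares them with `statistics.variance` from the Python standard library, which likewise divides by $n - 1$; the variable names are illustrative.

```python
import math
import statistics

data = [5, 7, -4, 1, 9, 12]
n = len(data)
mean = sum(data) / n

# Definition: squared deviations from the mean, divided by n - 1.
variance_definition = sum((x - mean) ** 2 for x in data) / (n - 1)

# Equivalent form: sum of squares minus (sum)^2 / n, divided by n - 1.
variance_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

# Library routine for the sample variance (n - 1 divisor).
variance_library = statistics.variance(data)

sigma = math.sqrt(variance_definition)
print(variance_definition, variance_shortcut, variance_library, sigma)
```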


6.4 Frequency Tables


When dealing with large datasets, we can make use of frequency tables to help us organise the
data. We will illustrate this via an example.

Exercise 6.6
Tyre tread depth was measured during roadside checks by the transport police. The results
for 100 cars are tabulated below:
Tyre tread depth (in mm) $x_i$    Frequency $f_i$
1.6                               12
1.8                               22
2.0                               26
2.2                               40
1. Calculate the mean tyre tread depth
2. Calculate the standard deviation of the tyre tread depth
3. State the mode of tyre tread depth measurements

Solution:
1. We calculate the mean as:
$$\bar{x} = \frac{\sum_i f_i x_i}{\sum_i f_i}$$
which is given by
$$\bar{x} = \frac{12 \times 1.6 + 22 \times 1.8 + 26 \times 2.0 + 40 \times 2.2}{12 + 22 + 26 + 40} = \frac{198.8}{100} = 1.988\ \text{mm}$$
2. We calculate the standard deviation from the variance:
$$\sigma^2 = \frac{\sum_i f_i (x_i - \bar{x})^2}{\left(\sum_i f_i\right) - 1}$$

We can partition the calculation into smaller computations by including additional columns

$x_i$    $f_i$    $f_i x_i$    $(x_i - \bar{x})^2$    $f_i (x_i - \bar{x})^2$
1.6      12       19.2         0.1505                 1.8065
1.8      22       39.6         0.0353                 0.7776
2.0      26       52           0.0001                 0.0037
2.2      40       88           0.0449                 1.7978
Total    100      198.8                               4.3856
so we compute
$$\sigma^2 = \frac{4.3856}{99} = 0.0443$$
and hence $\sigma = 0.21$.


3. The mode is the most common measurement. Looking at the table gives 2.2 mm as the one with the highest frequency.
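The frequency-table calculations translate directly into weighted sums. The sketch below recomputes the mean, variance, standard deviation and mode from the two columns of the table; the variable names are illustrative.

```python
depths = [1.6, 1.8, 2.0, 2.2]   # x_i, tyre tread depth in mm
counts = [12, 22, 26, 40]        # f_i, number of cars with that depth

total = sum(counts)

# Each depth is weighted by how many times it was observed.
mean = sum(f * x for f, x in zip(counts, depths)) / total
variance = sum(f * (x - mean) ** 2 for f, x in zip(counts, depths)) / (total - 1)
sigma = variance ** 0.5

# The mode is the depth with the largest frequency.
mode = depths[counts.index(max(counts))]

print(total, mean, variance, sigma, mode)
```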

6.5 Probability Theory


Probability is the study of theoretical models that deal with uncertain outcomes. These models
are centred around a process or experiment for which the outcome is uncertain. The set of
potential outcomes for an experiment is called the sample space. We can then focus on the set
of desirable outcomes as the event E. The likelihood that the experiment returns one of the
desirable outcomes is called the probability P (E).

Definition 6.7 Probability


Consider a random experiment with sample space Ω and two events A, B. The probability of
event A happening, denoted P (A), is a function that assigns each event a number between
0 and 1 such that the following properties are satisfied.
1. 0 ≤ P (A) ≤ 1
2. P (Ω) = 1.
3. If A and B share no common outcomes, then P (A or B) = P (A) + P (B).

The first property tells us that the probability of any event is between 0 (impossible) and 1
(certain). The second property tells us that it is certain that some outcome in the sample space occurs.
The third property tells us that we can simply add probabilities provided that the two events don’t
overlap.
One assumption we commonly make is the equiprobable assumption. That is, every outcome
in the sample space is equally likely to occur with probability $\frac{1}{|\Omega|}$, where |Ω| is the number of
outcomes in the sample space.

Example 6.8
• Rolling a fair six-sided die:
outcome        1     2     3     4     5     6
probability    1/6   1/6   1/6   1/6   1/6   1/6

In this case, the equiprobable assumption is valid, so each probability must be $\frac{1}{|\Omega|} = \frac{1}{6}$.
• Tossing a fair coin twice:
outcome        HH    HT    TH    TT
probability    1/4   1/4   1/4   1/4

As before, the equiprobable assumption is valid, so each probability must be $\frac{1}{|\Omega|} = \frac{1}{4}$.
• Testing the reliability of a component:
outcome        working    not working
probability    0.9        0.1

In this case, the two outcomes do not happen with equal probability, and hence the
equiprobable assumption is not valid.

An event E is a set of desirable outcomes. The probability P (E) of the event E is the sum of
the probabilities of these outcomes.

Exercise 6.9
Let A be the event that a die shows at most 4 and let B be the event that the die shows
an even number. Find P (A) and P (B).

Solution: First, we have the events A = {1, 2, 3, 4} and B = {2, 4, 6}. Hence
$$P(A) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{2}{3}$$
$$P(B) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$$

Exercise 6.10
In a series of repeated measurements, the number of radioactive particles emitted over a
fixed time interval is recorded:
outcome: 0 1 2 3 4 5 ≥6
probability: 0.368 0.368 0.184 0.061 0.015 0.003 0.001

Let D be the event that at most 2 particles are emitted. What is P (D)?

Solution: Since D = {0, 1, 2}, we have

P (D) = P (0) + P (1) + P (2)


= 0.368 + 0.368 + 0.184
= 0.920
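
The rule that P (E) is the sum of the probabilities of the outcomes in E can be sketched in Python. The dictionary below copies the table from Exercise 6.10; the key 6 standing for "6 or more" and the helper name `event_probability` are assumptions made for this illustration.

```python
# Probabilities from Exercise 6.10; the key 6 stands for "6 or more particles".
particle_probs = {0: 0.368, 1: 0.368, 2: 0.184, 3: 0.061, 4: 0.015, 5: 0.003, 6: 0.001}

def event_probability(probs, event):
    """Sum the probabilities of the outcomes that make up the event."""
    return sum(probs[outcome] for outcome in event)

# D is the event that at most 2 particles are emitted.
p_d = event_probability(particle_probs, {0, 1, 2})
print(f"P(D) = {p_d:.3f}")
```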

6.6 Probability Rules


For more complicated probabilities, we may have to combine events with known probabilities.


Definition 6.11 Creating new events


For events A and B, we have the following operations:
• The complement of A, denoted $A^c$, is the set of outcomes not in A. This is also called
not A.
• The intersection of A and B, denoted A ∩ B, is the set of outcomes in both A and
B. This is also called A and B.
• The union of A and B, denoted A ∪ B, is the set of outcomes in either A or B (or
both). This is also called A or B.

In particular, the complement of the entire sample space, $\Omega^c = \emptyset = \{\}$, is the event with no desirable
outcomes. Hence P (∅) = 0.

Example 6.12
A fair six-sided die is rolled, so Ω = {1, 2, 3, 4, 5, 6}, and two events are defined:
A = {1, 2, 3, 4}                $P(A) = \frac{2}{3}$
B = {2, 4, 6}                   $P(B) = \frac{1}{2}$
Then we can compute the following events and probabilities:
$A^c$ = {5, 6}                  $P(A^c) = \frac{1}{3}$
$B^c$ = {1, 3, 5}               $P(B^c) = \frac{1}{2}$
A ∩ B = {2, 4}                  $P(A \cap B) = \frac{1}{3}$
A ∪ B = {1, 2, 3, 4, 6}         $P(A \cup B) = \frac{5}{6}$
$(A \cup B)^c$ = {5}            $P((A \cup B)^c) = \frac{1}{6}$
Ω = {1, 2, 3, 4, 5, 6}          $P(\Omega) = 1$
∅ = $\Omega^c$ = {}             $P(\emptyset) = 0$

Based on the example, we can see that for any event A, we have

$$P(A^c) = 1 - P(A)$$

However, probabilities involving intersections and unions are more complicated and require more
machinery.
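
Python's built-in `set` type supports the same three operations, which gives a quick way to check the die example. This is a sketch under the equiprobable assumption; the helper `prob` is a name introduced here for illustration.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
A = {1, 2, 3, 4}
B = {2, 4, 6}

def prob(event, sample_space=omega):
    """Probability of an event under the equiprobable assumption."""
    return Fraction(len(event & sample_space), len(sample_space))

A_complement = omega - A     # outcomes not in A
A_and_B = A & B              # intersection
A_or_B = A | B               # union

print(prob(A_complement), prob(A_and_B), prob(A_or_B))
```

Using `Fraction` keeps the results exact, so they can be compared directly with the fractions in the example.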


Definition 6.13 Mutually exclusive events

Two events A and B are said to be mutually exclusive (or disjoint) if

A∩B =∅

Theorem 6.14
Given two events A and B, we have

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

In cases where A, B are mutually exclusive, we have

P (A ∪ B) = P (A) + P (B)

The subtraction is to ensure that we do not count the same outcomes twice.

Example 6.15
A fair six-sided die is rolled and the following events are defined:

A = {1, 2, 3, 4}, B = {2, 4, 6}, C = {6}

The events A and C are mutually exclusive. Hence we have:
$$P(A \cup C) = P(A) + P(C) = \frac{4}{6} + \frac{1}{6} = \frac{5}{6}$$
The events A and B are not mutually exclusive. We have A ∩ B = {2, 4} and hence
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) = \frac{4}{6} + \frac{3}{6} - \frac{2}{6} = \frac{5}{6}$$
We can also compute P (A ∪ B) directly since A ∪ B = {1, 2, 3, 4, 6}.

Exercise 6.16
Outside work on an oil platform cannot proceed if it is either too wet or too windy. The
probability of the event A that work is cancelled because it is too wet is 0.3, and the
probability of the event B that work is cancelled because it is too windy is 0.2. The
probability of the event A ∩ B that it is simultaneously both too wet and too windy is 0.1.
Find the probability that work can proceed.


Solution: The event that it is either too wet or too windy is A ∪ B and we have

P (A ∪ B) = P (A) + P (B) − P (A ∩ B) = 0.3 + 0.2 − 0.1 = 0.4.

The event C that work can proceed is the same as the event A ∪ B not occurring. Hence,
the probability that work can proceed is given by

P (C) = 1 − P (A ∪ B) = 1 − 0.4 = 0.6

Definition 6.17 Independent events


Two events A and B are said to be independent if the following property holds

P (A ∩ B) = P (A) × P (B)

Events that are not independent are said to be dependent events.

In other words, independent events do not affect each other. Knowing whether event A happened
does not affect the probability of B happening.

Example 6.18
• The outcomes of flipping two coins are independent events since knowing the outcome of
one does not impact on the outcome of the other.
• The outcomes of drawing two cards from a deck (without replacement) are dependent
events since knowledge of the first card affects the probabilities for the second card.

Exercise 6.19
A computer is defective with probability $\frac{1}{4}$ and a second computer is defective with
probability $\frac{1}{10}$. Assuming that these events are independent, find the probability that at least
one computer is working.

Solution: Let A be the event that the first computer is defective, B the event that the
second computer is defective. So we have
$$P(\text{neither computer working}) = P(A \cap B) = P(A)P(B) = \frac{1}{4} \times \frac{1}{10} = \frac{1}{40}$$
Hence
$$P(\text{at least one working}) = 1 - P(\text{neither working}) = \frac{39}{40}$$
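
As a cross-check on this solution, the same answer can be reached by adding the three mutually exclusive cases in which at least one computer works. The sketch below uses exact fractions; the variable names are illustrative, and it relies on the standard fact that if A and B are independent then so are their complements.

```python
from fractions import Fraction

p_a = Fraction(1, 4)    # first computer defective
p_b = Fraction(1, 10)   # second computer defective

# Independence lets us multiply the probabilities.
p_both_defective = p_a * p_b
p_at_least_one_working = 1 - p_both_defective

# Cross-check: both work, only the first works, only the second works.
check = (1 - p_a) * (1 - p_b) + (1 - p_a) * p_b + p_a * (1 - p_b)

print(p_both_defective, p_at_least_one_working, check)
```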


6.7 Tree Diagrams


A useful visual aid for probability problems is the use of tree diagrams to list all possible outcomes.
This approach is extremely useful (and sometimes the only feasible method) in cases involving a
series of experiments where the outcome of one dictates subsequent probabilities.
To construct the tree diagram:
1. Start with origin O.
2. Create a branch (arrow) for each outcome of the first step and write the probability of that
outcome on the arrow.
3. For each outcome, repeat the previous step.
4. Compute the probability of an outcome by multiplying the probabilities along each arrow
leading to the outcome.
We will illustrate this via an example.

Exercise 6.20
A plant has three synthesis reactors: A, B, and C. All three produce urea in discrete
batches of 100 kg each. Reactor A produces 60% of the total volume, and Reactors
B and C produce 25% and 15% respectively. The failure rates of the three reactors also vary:
5% of the batches produced by Reactor A fail because of their level of contamination, while the
failure rates for Reactors B and C are 2% and 6%. A batch passes if it does not fail. Calculate the
probability that a batch picked at random is
1. from Reactor B
2. from Reactor A and passes
3. from Reactor C and fails
4. passes

Solution: First, convert the percentages to probabilities and construct the tree diagram:
O
  0.60 -> A
      0.05 -> Fails
      0.95 -> Passes
  0.25 -> B
      0.02 -> Fails
      0.98 -> Passes
  0.15 -> C
      0.06 -> Fails
      0.94 -> Passes


We can then answer the questions
1. The probability that the batch is from Reactor B is 0.25 or 25%.
2. The probability that the batch is from Reactor A and passes is 0.6 × 0.95 = 0.57.
3. The probability that the batch is from Reactor C and fails is 0.15 × 0.06 = 0.009.
4. The probability of passing is the sum of the passes from all three reactors. So we have
0.6 × 0.95 + 0.25 × 0.98 + 0.15 × 0.94 = 0.956.
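
The tree-diagram rule (multiply along a path, add across paths) can be sketched in Python as follows. The dictionary layout and the function name `path_probability` are assumptions made for this illustration; the numbers are those of Exercise 6.20.

```python
# Each reactor maps to (share of production, failure rate), from Exercise 6.20.
reactors = {"A": (0.60, 0.05), "B": (0.25, 0.02), "C": (0.15, 0.06)}

def path_probability(reactor, passes):
    """Multiply the probabilities along one path of the tree."""
    share, fail_rate = reactors[reactor]
    second_branch = 1 - fail_rate if passes else fail_rate
    return share * second_branch

p_a_and_pass = path_probability("A", passes=True)
p_c_and_fail = path_probability("C", passes=False)
# Paths ending in "Passes" are mutually exclusive, so their probabilities add.
p_pass = sum(path_probability(r, passes=True) for r in reactors)

print(round(p_a_and_pass, 3), round(p_c_and_fail, 3), round(p_pass, 3))
```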

6.8 Binomial Distribution


The tree diagram approach works for any series of random experiments. However, we can simplify
the process if the same experiment is repeated and we only care about whether it succeeds or
fails.

Definition 6.21 Bernoulli trials


A Bernoulli trial is a random experiment with two outcomes: success S, and fail F . The
probability of success P (S) = p for some parameter p and hence P (F ) = 1 − p.

Example 6.22
Examples of Bernoulli trials include:
• Flipping a fair coin, with success being heads, has $p = \frac{1}{2}$.
• Rolling a fair six-sided die, with success being rolling a 6, has $p = \frac{1}{6}$.
• Inspecting an item off a manufacturing line, with success being passing the quality assessment. Here,
p might be obtained experimentally.

A single Bernoulli trial is not very complicated. We are mostly interested in repeating Bernoulli
trials. We will illustrate some important ideas via an example.

Exercise 6.23
A reactor produces batches of urea, of which 85% pass quality assessment. Three batches
are selected. Calculate the probability that the number of batches that pass is:
1. zero
2. one
3. two
4. three

Solution: Create a Bernoulli trial by defining S to be the event that a given batch passes
and F to be the event that it fails. Then P (S) = 0.85 and P (F ) = 0.15. We can construct
the tree diagram, in which every branch to S has probability 0.85 and every branch to F has
probability 0.15:

O
  0.85 -> S
      0.85 -> SS
          0.85 -> SSS
          0.15 -> SSF
      0.15 -> SF
          0.85 -> SFS
          0.15 -> SFF
  0.15 -> F
      0.85 -> FS
          0.85 -> FSS
          0.15 -> FSF
      0.15 -> FF
          0.85 -> FFS
          0.15 -> FFF

We can then compute the probability of each case.
1. If no batch passes, we can use the multiplication law:

P (F F F ) = P (F )P (F )P (F ) = 0.15 × 0.15 × 0.15 = 0.003375

2. If one batch passes then the outcomes are SF F or F SF or F F S. These events
are mutually exclusive:

P (one batch passes) = P (SF F ) + P (F SF ) + P (F F S)
= (0.85)(0.15)(0.15) + (0.15)(0.85)(0.15) + (0.15)(0.15)(0.85)
= 0.057375

3. If two batches pass then the outcomes are SSF or SF S or F SS. As with the
previous part, these events are mutually exclusive:

P (two batches pass) = P (SSF ) + P (SF S) + P (F SS)
= (0.85)(0.85)(0.15) + (0.85)(0.15)(0.85) + (0.15)(0.85)(0.85)
= 0.325125

4. If all three batches pass, then

$P(SSS) = (0.85)^3 = 0.614125$

While this approach is suitable for a small number of repeated trials, we may struggle to create
the full tree diagram as the number of trials increases. To simplify the process, we observe that
in each of the computations, the probabilities of the mutually exclusive outcomes were identical.
Hence we only have to count the number of ways we can get the right number of successes
and fails.
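
To see how quickly the tree grows, the brute-force approach can be written out in Python: list every sequence of S and F, multiply along each path, and group the results by the number of successes. This is only a sketch of the enumeration idea; the variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

p_success = 0.85
n_trials = 3

# distribution[k] accumulates the probability of exactly k successes.
distribution = defaultdict(float)
for sequence in product("SF", repeat=n_trials):   # SSS, SSF, ..., FFF
    path_probability = 1.0
    for outcome in sequence:
        path_probability *= p_success if outcome == "S" else 1 - p_success
    distribution[sequence.count("S")] += path_probability

for k in range(n_trials + 1):
    print(k, round(distribution[k], 6))
```

The loop visits 2^n sequences, which is 8 for three trials but over a thousand for ten, so counting the sequences instead of listing them soon becomes the better option.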


Definition 6.24 Binomial coefficients


The number of ways of selecting r objects from a total set of n objects is given by
$$\binom{n}{r} = \frac{n!}{(n-r)!\,r!}$$

where n! = n × (n − 1) × (n − 2) × . . . × 2 × 1 is the factorial function.

For reasons that are not obvious in this definition, we have 0! = 1.

Example 6.25
For small values of n, we have
• 3! = 3 × 2 × 1 = 6
• 4! = 4 × 3 × 2 × 1 = 24
• 5! = 5 × 4 × 3 × 2 × 1 = 120
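When computing binomial coefficients, we should avoid expanding the full factorial.
$$\binom{2}{0} = \frac{2!}{(2-0)!\,0!} = \frac{2!}{2!\,0!} = \frac{2!}{2!} = 1$$
$$\binom{4}{2} = \frac{4!}{(4-2)!\,2!} = \frac{4!}{2!\,2!} = \frac{4 \times 3 \times 2!}{2!\,2!} = \frac{4 \times 3}{2!} = 6$$
$$\binom{8}{3} = \frac{8!}{(8-3)!\,3!} = \frac{8!}{5!\,3!} = \frac{8 \times 7 \times 6 \times 5!}{5!\,3!} = \frac{8 \times 7 \times 6}{3!} = 56$$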
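
The cancellation trick used in these examples translates directly into code. The sketch below multiplies only the r largest factors of n! and divides by r!; the function name `binomial_coefficient` is illustrative, and the result is compared with Python's built-in `math.comb` (available from Python 3.8).

```python
import math

def binomial_coefficient(n, r):
    """Compute n! / ((n - r)! r!) without expanding the full factorials."""
    numerator = 1
    # Only the factors n, n - 1, ..., n - r + 1 survive after cancelling (n - r)!.
    for k in range(n - r + 1, n + 1):
        numerator *= k
    return numerator // math.factorial(r)

for n, r in [(2, 0), (4, 2), (8, 3)]:
    print(n, r, binomial_coefficient(n, r), math.comb(n, r))
```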

Theorem 6.26 Binomial distribution


The probability of obtaining r successes from n Bernoulli trials with success probability p
is given by:
$$P(r \text{ successes}) = \binom{n}{r} p^r (1-p)^{n-r}$$

We can think of this equation as follows: there are $\binom{n}{r}$ ways of picking which r trials are successes. Each of those
trials succeeds with probability p, and each of the remaining n − r trials fails with probability 1 − p. Since the trials
are independent, we can multiply all these to obtain the final result.


Exercise 6.27
A reactor produces batches of urea, a proportion 0.87 of which pass quality assessment. 10 batches
are selected. Calculate the probability that the number of acceptable batches from these
10 is:
1. zero
2. nine
3. ten

Solution: Let S be the event “the batch passes” and F the event “the batch fails”, with
P (S) = 0.87 and P (F ) = 0.13. We then have
1.
$$P(0 \text{ passes}) = \binom{10}{0} (0.87)^0 (0.13)^{10} = 1 \times 1 \times 1.38 \times 10^{-9} = 1.38 \times 10^{-9}$$

2.
$$P(9 \text{ passes}) = \binom{10}{9} (0.87)^9 (0.13)^1 = 10 \times 0.2855 \times 0.13 = 0.371$$

3.
$$P(10 \text{ passes}) = \binom{10}{10} (0.87)^{10} (0.13)^0 = 1 \times 0.248 \times 1 = 0.248$$
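
The binomial formula of Theorem 6.26 is short enough to code directly. The following is a minimal sketch (the function name `binomial_pmf` is an assumption of this illustration) applied to the numbers of Exercise 6.27.

```python
import math

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n Bernoulli trials with success probability p."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.87
for r in (0, 9, 10):
    print(r, binomial_pmf(r, n, p))

# The probabilities over all possible r should add up to 1.
total = sum(binomial_pmf(r, n, p) for r in range(n + 1))
```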

6.9 Continuous Distribution


At this point, we turn our attention to continuous data. That is, data that can take on any
value within an interval. The key difference between continuous and discrete data that we had
previously is that for any point a

P (outcome is exactly a) = 0

That is, there is zero probability the outcome is exactly a value a.

Example 6.28
There is zero probability that:
• A race is completed in exactly 10 minutes
• The height of a person is exactly 1.80 metres


• The radius of a balloon is exactly 20 centimetres

Instead, continuous probabilities are defined on an interval. So while there is zero probability of
a competitor finishing a 100m race in exactly 15 seconds, there is a good chance that they finish
the race sometime between 14 and 16 seconds.

Definition 6.29 Probability density function

The probability density function P (X) is the function that describes the probability of an
event with continuous outcome over an interval. Further the probability density function
has the following properties.
• the function is never negative. That is, P (X) ≥ 0
• the total area under the graph of the function is equal to 1
The probability that the variable X lies between two given values, say a and b, is given by
the area under the curve between X = a and X = b. That is,
$$P(a \le X \le b) = \int_a^b P(X)\,dX.$$

In the event that one of the inequalities involves ±∞, we can omit that inequality. Hence
P (−∞ ≤ X ≤ b) = P (X ≤ b) and P (a ≤ X ≤ ∞) = P (a ≤ X).
It is equivalent to denote a probability over an interval as P (a < X < b) instead of P (a ≤
X ≤ b) since

P (a ≤ X ≤ b) = P (X = a) + P (a < X < b) + P (X = b) = P (a < X < b)

as the probability at the single points are zero.

Exercise 6.30
Consider the function
$$P(X) = \begin{cases} e^{-X} & X \ge 0 \\ 0 & X < 0 \end{cases}$$
1. Show that P (X) is a probability density function
2. Find the probability that the outcome is between 0 and 1.

Solution:
1. We check the two conditions:
• Since $e^{-X} > 0$, we have P (X) ≥ 0 for all X.
• We have to evaluate
$$\begin{aligned}
\int_{-\infty}^{\infty} P(X)\,dX &= \int_{-\infty}^{0} P(X)\,dX + \int_{0}^{\infty} P(X)\,dX \\
&= \int_{-\infty}^{0} 0\,dX + \int_{0}^{\infty} e^{-X}\,dX \\
&= 0 + \left[-e^{-X}\right]_{0}^{\infty} \\
&= 0 + \lim_{b \to \infty}\left[1 - e^{-b}\right] \\
&= 1
\end{aligned}$$

So the total area is 1 and we have a probability density function.


2. We have to compute
$$\begin{aligned}
P(0 \le X \le 1) &= \int_{0}^{1} P(X)\,dX \\
&= \int_{0}^{1} e^{-X}\,dX \\
&= \left[-e^{-X}\right]_{0}^{1} \\
&= 1 - e^{-1} \approx 0.632
\end{aligned}$$
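
When an antiderivative is not available, the area under a density can be approximated numerically. The sketch below uses the midpoint rule, which is a technique chosen for this illustration and not one from the notes, to check both parts of the exercise. Stopping the upper limit at 50 is an assumption; it is safe here because the area beyond 50 is e^(−50), which is negligible.

```python
import math

def density(x):
    """The density from Exercise 6.30: e^(-x) for x >= 0 and 0 otherwise."""
    return math.exp(-x) if x >= 0 else 0.0

def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(f(a + (i + 0.5) * width) for i in range(steps))

p_0_to_1 = midpoint_integral(density, 0.0, 1.0)
total_area = midpoint_integral(density, 0.0, 50.0)

print(round(p_0_to_1, 3), round(total_area, 3))
```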

6.10 Normal Distribution


The most common probability density function that will be of interest in this subject is the Normal
distribution or Gaussian distribution.

Definition 6.31 The standard Normal distribution


The standard Normal distribution is defined by the probability density function
$$P(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{Z^2}{2}}$$

The standard normal distribution Z has mean 0 and standard deviation 1. This is denoted $\bar{Z} = 0$
and σ = 1.

Figure 6.1: The standard normal distribution


For the normal distribution, we can show that
• 68% of the area lies within one standard deviation either side of the mean.
• 95% of the area lies within two standard deviations either side of the mean.
• 99.7% of the area lies within three standard deviations either side of the mean.
This is sometimes known as the “68-95-99.7” rule.
Further, we have the following relations between probabilities (and their geometric interpretations).
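1. P (a ≤ Z ≤ b) = P (Z ≤ b) − P (Z ≤ a): the area between a and b is the area to the left of b minus the area to the left of a.
2. P (Z ≤ b) = 1 − P (Z ≥ b): the area to the left of b is the total area 1 minus the area to the right of b.
3. P (Z ≥ b) = P (Z ≤ −b): by symmetry about 0, the area to the right of b equals the area to the left of −b.
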
Based on these relations, we can reduce the computation of probabilities down to computing
P (Z ≤ b) for different values of b. Unfortunately, this is where things get challenging. There is
no easy method of finding an antiderivative of P (Z). So instead of computing these probabilities
analytically, we use the table for the Standard Normal distribution (Appendix B). The table is
indexed by the value of b (also known as the z-score) across its rows and columns, and the corresponding entry in
the table is P (Z ≤ b).

Exercise 6.32
Let Z be the standard normal distribution. Compute the following probabilities.
1. P (Z ≤ 1.83), P (Z ≤ 0.57)
2. P (Z ≥ 0.57)
3. P (Z ≤ −0.57)
4. P (−0.57 ≤ Z ≤ 1.83)

Solution:
1. We look up the z-scores in the table by selecting the row corresponding to the first
decimal place and selecting the right column based on the second decimal place.
Doing this, we get:

P (Z ≤ 1.83) = 0.9664 P (Z ≤ 0.57) = 0.7157

2. We make use of property 2.

P (Z ≥ 0.57) = 1 − P (Z ≤ 0.57)
= 1 − 0.7157
= 0.2843

3. We make use of property 3.

P (Z ≤ −0.57) = P (Z ≥ 0.57)
= 0.2843

4. We make use of property 1.

P (−0.57 ≤ Z ≤ 1.83) = P (Z ≤ 1.83) − P (Z ≤ −0.57)


= 0.9664 − 0.2843
= 0.6821
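
If the table is not to hand, P (Z ≤ b) can be evaluated with the error function from Python's standard library, using the standard identity P (Z ≤ b) = (1 + erf(b/√2))/2. This identity and the function name `phi` are brought in for this sketch and are not part of the notes. Small differences in the last decimal place compared with table values are expected, because the table entries are rounded.

```python
import math

def phi(b):
    """P(Z <= b) for the standard normal distribution, via the error function."""
    return 0.5 * (1 + math.erf(b / math.sqrt(2)))

p1 = phi(1.83)                 # direct look-up
p2 = 1 - phi(0.57)             # property 2: P(Z >= 0.57)
p3 = phi(-0.57)                # property 3 says this equals p2
p4 = phi(1.83) - phi(-0.57)    # property 1

print(round(p1, 4), round(p2, 4), round(p3, 4), round(p4, 4))
```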

Natural variability in continuous outcomes typically results in normal distributions (see footnote 10). However,
it is unrealistic to assume all resulting distributions will have mean 0 and standard deviation 1.
Hence we generalise the notion of normal distribution.

Definition 6.33 Normal distribution


The normal distribution with mean µ and standard deviation σ is denoted N (µ, σ) with
probability density function given by
$$P(X) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X-\mu)^2}{2\sigma^2}\right) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$

The standard normal distribution Z ∼ N (0, 1).

For mathematical reasons beyond the scope of this subject, we use the symbol ∼ to denote
that two outcomes have the same probability distribution.
Comparing Definition 6.31 with Definition 6.33 shows that any normal distribution can be
converted back to the standard normal distribution via a series of graph transformations.
Theorem 6.34
We can convert a normal distribution X ∼ N (µ, σ) to the standard normal distribution
Z ∼ N (0, 1) via the transformation
$$Z = \frac{X - \mu}{\sigma}$$
In particular, if two numbers x, z satisfy the relation
$$z = \frac{x - \mu}{\sigma}$$
then P (X ≤ x) = P (Z ≤ z).

The number x is usually given in the context of the question and we have to compute z (hence,
z-score). From there, we can find the probability using the standard normal table (Appendix B).
Let’s see this in an example:

Exercise 6.35
The thickness of an aluminium sheet is normally distributed with mean 52µm and standard
deviation 5µm. What proportion of the aluminium sheets have thickness:
1. less than 60.3µm;
2. less than 40µm;
Footnote 10: In fact, the Central Limit Theorem says that the mean of a large number of independent
measurements tends towards a normal distribution as the sample size increases.


Solution: We denote the distribution of the aluminium sheet thickness (in µm) by X ∼
N (52, 5).
1. To find P (X ≤ 60.3), we first compute $z = \frac{60.3 - 52}{5} = 1.66$. Hence

P (X ≤ 60.3) = P (Z ≤ 1.66)
= 0.9515

Hence, 95.15% of aluminium sheets are less than 60.3µm thick.


2. To find P (X ≤ 40), we compute $z = \frac{40 - 52}{5} = -2.40$. Using the properties of the
normal distribution, we have:

P (X ≤ 40) = P (Z ≤ −2.4)
= P (Z ≥ 2.4)
= 1 − P (Z ≤ 2.4)
= 1 − 0.9918
= 0.0082

Hence, 0.82% of aluminium sheets are less than 40µm thick.
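
Python's standard library offers `statistics.NormalDist` (Python 3.8 and later), which can be used to check both the z-scores and the final proportions. This is a sketch for verification only; the variable names are illustrative, and the results may differ from table-based answers in the last decimal place.

```python
from statistics import NormalDist

mu, sigma = 52, 5                      # thickness in micrometres
thickness = NormalDist(mu=mu, sigma=sigma)
standard = NormalDist()                # mean 0, standard deviation 1

# Part 1: convert to a z-score, then use the standard normal distribution.
z1 = (60.3 - mu) / sigma
p_less_than_60_3 = standard.cdf(z1)

# Part 2: the same probability can be read off the N(52, 5) distribution directly.
z2 = (40 - mu) / sigma
p_less_than_40 = thickness.cdf(40)

print(round(z1, 2), round(p_less_than_60_3, 4))
print(round(z2, 2), round(p_less_than_40, 4))
```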

F17XA 7 Vectors

7 Vectors
7.1 Basic Operations
A scalar is a quantity which has magnitude only, e.g., temperature, mass, pressure.
Many other quantities are not sufficiently defined by their magnitude alone, e.g., displacement,
velocity, magnetic field. These quantities involve both a magnitude and direction. Such quantities
are called vectors.

Definition 7.1 Vector (in 2 or 3 dimensions)

Let P and Q be two points in space. The vector $\vec{PQ}$ is the directed line segment that starts
at P and finishes at Q.
• The vector can be labelled using an underlined lower case letter, a = $\vec{PQ}$.
• The point P is the tail of the vector.
• The point Q is the head of the vector.
• The magnitude of the vector is the length of the line segment, denoted |a| or $|\vec{PQ}|$.
• The direction of the vector is from P towards Q.

[Figure: the vector a = $\vec{PQ}$ drawn as an arrow from P to Q]

In this subject, we will restrict our discussion of vectors to 2 or 3 dimensions. However, the same
logic can be extended to higher dimensions or more abstract spaces.
Before we proceed, some notation. In this section, we shall
• Use upper case letters P, Q, R, S, . . . to denote points in space. The upper case O denotes
the origin.
• Use an overhead arrow $\vec{PQ}$, an underlined lower case a, or a bold font a to denote vectors.
Two vectors a and b are equal, denoted a = b, if they have the same magnitude and the same
direction. An immediate consequence of this definition is that the absolute position of the tail
and head of a vector is not essential in defining a vector, only their position relative to each other.

Example 7.2
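Consider the parallelogram with corners at P, Q, S, R.

[Figure: parallelogram with the vectors $\vec{PQ}$ and $\vec{RS}$ drawn along opposite sides]

The vectors $\vec{PQ}$ and $\vec{RS}$ are equal since they have the same magnitude and direction.

[Figure: the same parallelogram with a third vector $\vec{TU}$ of the same magnitude and direction]

Similarly, the vectors $\vec{PQ} = \vec{RS} = \vec{TU}$.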

Given two vectors a and b, we can add the two vectors a + b by joining the tail of one to the head
of the other. This is called the triangle rule for vector addition.

Example 7.3
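Define the vectors a and b as follows:

[Figure: the vectors a and b]

We add the two vectors to obtain a + b as follows.

[Figure: b placed with its tail at the head of a; a + b runs from the tail of a to the head of b]

One consequence of the triangle rule is that a + b = b + a.

[Figure: the same sum a + b reached by following b first and then a]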

The null vector or zero vector, denoted 0, is the vector of magnitude 0. The direction is undefined
for this vector. From this, we can define the negative vector of a as the vector with the same
magnitude but in the opposite direction. This vector is denoted −a. Hence we have the immediate
result
a + (−a) = a − a = 0


Example 7.4
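Define the vectors a = $\vec{PQ}$ and b = $\vec{QR}$. Then we have −b = $\vec{QS}$ and a − b = $\vec{PS}$.

[Figure: points S, Q, R on a line, with b drawn from Q to R and −b from Q to S; a ends at Q and a − b ends at S]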

We can extend the notion of addition to scalar multiplication.

Example 7.5
Define the vector a. We can add a to itself n times and define na = a + a + . . . + a. For example

2a = a + a
3a = a + a + a
4a = a + a + a + a
na = a + a + . . . + a

[Figure: copies of a joined head to tail, giving 2a, 3a, 4a and na]

For n ≥ 0, we can generalise na as a vector in the direction of a with length scaled by a factor of
n. In the case where the scalar is negative, we have −na = n(−a). That is, the negative vector
scaled by n.

[Figure: the vectors a, 2a, −2a and $\frac{1}{2}$a]

Since the length of vectors can be scaled, we can introduce the notion of a unit vector. For any
vector a ≠ 0, we can create a unit vector $u = \frac{1}{|a|}\,a$, which has magnitude 1 in the direction of a.


7.2 Components
Let P be a point in 2 dimensions (the xy-plane) with Cartesian coordinates (x, y), and let O
denote the origin with coordinates (0, 0). We say that P has position vector p = $\vec{OP}$. Using
the Pythagorean theorem, we get that
$$|p| = \sqrt{x^2 + y^2}$$

Figure 7.1: Position vector $\vec{OP}$

In 3 dimensions, the position vector of a point P at (x, y, z) has length $\sqrt{x^2 + y^2 + z^2}$.
Theorem 7.6
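Given position vectors p = $\vec{OP}$ and q = $\vec{OQ}$, we have
$$\vec{PQ} = \vec{OQ} - \vec{OP} = q - p$$

Based on the picture, we have $\vec{OP} + \vec{PQ} = \vec{OQ}$. Rearranging gives the desired result.

[Figure: triangle O, P, Q with $\vec{OP}$ = p, $\vec{OQ}$ = q and $\vec{PQ}$ joining P to Q]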

At this point, we define three key unit vectors that form the standard basis.

Definition 7.7 Standard basis

In 3 dimensions, the unit vectors that form the standard basis are:
• i is the position vector (1, 0, 0) and points in the positive x direction.
• j is the position vector (0, 1, 0) and points in the positive y direction.
• k is the position vector (0, 0, 1) and points in the positive z direction.
In the case of 2-dimensions, we have the analogous i = (1, 0) and j = (0, 1).

It follows from the triangle rule that we can write the position vector of the point P with Cartesian
coordinates (x, y) as p = xi + yj.

[Figure: the position vector p drawn from O as the sum of xi and yj]

Exercise 7.8
Find the position vector p of the point P with Cartesian coordinates (−1, 3)

Solution: This is simply p = −i + 3j.

In the physical world, we can think of the Cartesian coordinates (x, y, z) of P as identifying a physical
point in space, whereas the vectors i, j, k indicate directions from the point P . The
two need not be related.
We can express any vector $\vec{PQ}$ in terms of the unit vectors i and j in this way. Consider the
following example.

Example 7.9
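Consider the points P and Q defined by (2, 1) and (6, 4) respectively.

[Figure: the vector $\vec{PQ}$ from P = (2, 1) to Q = (6, 4), with horizontal component ai and vertical component bj]

We can compute $\vec{PQ}$ = ai + bj and obtain a = 6 − 2 = 4 and b = 4 − 1 = 3. So $\vec{PQ}$ = 4i + 3j.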

We can write any vector v as v = ai + bj, for some suitable choice of a, b. In this case, we say
the vector v is expressed or resolved in terms of i, j. The numbers a and b are called the i and
j components of v.
For students who have encountered matrices, we can express the vector v = ai + bj as a column
matrix
$$v = \begin{pmatrix} a \\ b \end{pmatrix}$$
or a row matrix v = (a, b).
We adopt the convention of the comma separated row matrix notation (a, b) for vectors.
Be careful not to confuse this notation with the notation for Cartesian coordinates.

Exercise 7.10
If a = (a1 , a2 ), find |a|.

Solution:

$$a = (a_1, a_2) = a_1 i + a_2 j$$
$$|a| = \sqrt{a_1^2 + a_2^2}$$

The same ideas will hold in 3 dimensions. The position vector of a point P with Cartesian
coordinates (x, y, z) is
$$p = \vec{OP} = xi + yj + zk$$
More generally, we can resolve any 3 dimensional vector v as

v = ai + bj + ck

in which case we may write v = (a, b, c).


We shall generally be interested in vectors in 3D in this course, for the rather obvious reason that
our world has 3 space dimensions.

Exercise 7.11
Express v = (2, 3, −1) in the form v = ai + bj + ck and find |v|.

Solution: We have v = 2i + 3j − k and $|v| = \sqrt{2^2 + 3^2 + (-1)^2} = \sqrt{14}$.

Theorem 7.12 Vector operations


Vector addition and scalar multiplication can be done in terms of components. Define the
vectors a = (a1 , a2 , a3 ), b = (b1 , b2 , b3 ) and constant k. Then we have

a + b = (a1 , a2 , a3 ) + (b1 , b2 , b3 ) = (a1 + b1 , a2 + b2 , a3 + b3 )


ka = k(a1 , a2 , a3 ) = (ka1 , ka2 , ka3 )

Exercise 7.13
If a = (−3, 9, 1) and b = (−6, 9, −12), find $a + \frac{1}{3}b$.

Solution:
$$\begin{aligned}
a + \frac{1}{3}b &= (-3, 9, 1) + \frac{1}{3}(-6, 9, -12) \\
&= (-3, 9, 1) + (-2, 3, -4) \\
&= (-5, 12, -3)
\end{aligned}$$
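
The component-wise rules of Theorem 7.12 map naturally onto tuples in Python. The sketch below is one possible way to write them; the function names `add`, `scale`, `magnitude` and `unit` are illustrative, and `Fraction` is used so that the scalar 1/3 is handled exactly.

```python
import math
from fractions import Fraction

def add(a, b):
    """Component-wise vector addition."""
    return tuple(x + y for x, y in zip(a, b))

def scale(k, a):
    """Multiply every component of a by the scalar k."""
    return tuple(k * x for x in a)

def magnitude(a):
    """Length of a vector, from the Pythagorean theorem."""
    return math.sqrt(sum(x * x for x in a))

def unit(a):
    """Unit vector in the direction of a (a must be non-zero)."""
    length = magnitude(a)
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return scale(1 / length, a)

a = (-3, 9, 1)
b = (-6, 9, -12)
result = add(a, scale(Fraction(1, 3), b))   # Exercise 7.13
print(tuple(int(c) for c in result))
print(magnitude((2, 3, -1)))                # Exercise 7.11, equals sqrt(14)
```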

Exercise 7.14
Points P and Q have Cartesian coordinates (−1, 3, 4) and (−3, 2, 5). Find the distance
between P and Q, then determine the Cartesian coordinates of the midpoint M of the line segment P Q.

Solution: For this type of geometrical problem, the first step is always to draw the
relevant vectors (you can draw it in 2D and you don’t need to worry about the detailed
position of points).

[Figure: origin O, points P and Q, and the midpoint M, with the vectors $\vec{OP}$, $\vec{PM}$ and $\vec{OM}$]

First, we need $\vec{PQ}$. We have $\vec{OP} + \vec{PQ} = \vec{OQ}$ by the triangle rule. Thus $\vec{PQ} = \vec{OQ} - \vec{OP}$.
But the vectors $\vec{OQ}$ and $\vec{OP}$ are just the position vectors corresponding to the points Q
and P . Thus we have $\vec{OQ}$ = (−3, 2, 5), $\vec{OP}$ = (−1, 3, 4).

Hence, $\vec{PQ}$ = (−3, 2, 5) − (−1, 3, 4) = (−2, −1, 1) and $PQ = |\vec{PQ}| = \sqrt{4 + 1 + 1} = \sqrt{6}$.
Next, we need to find the vector $\vec{OM}$. This is given by
$$\begin{aligned}
\vec{OM} &= \vec{OP} + \vec{PM} \\
&= \vec{OP} + \frac{1}{2}\vec{PQ} \\
&= (-1, 3, 4) + \frac{1}{2}(-2, -1, 1) \\
&= (-1, 3, 4) + \left(-1, -\frac{1}{2}, \frac{1}{2}\right) \\
&= \left(-2, \frac{5}{2}, \frac{9}{2}\right)
\end{aligned}$$
Thus the Cartesian coordinates of M are $\left(-2, \frac{5}{2}, \frac{9}{2}\right)$.
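
The two steps of this solution (form the vector from P to Q, then move half way along it) can be sketched in Python. The helper names `displacement`, `distance` and `midpoint` are introduced for this illustration, and `Fraction` keeps the halves exact.

```python
import math
from fractions import Fraction

def displacement(p, q):
    """The vector from P to Q, that is q - p (Theorem 7.6)."""
    return tuple(qi - pi for pi, qi in zip(p, q))

def distance(p, q):
    """Distance between P and Q: the magnitude of the vector from P to Q."""
    return math.sqrt(sum(c * c for c in displacement(p, q)))

def midpoint(p, q):
    """Position vector of the midpoint: OP plus half of the vector from P to Q."""
    half = Fraction(1, 2)
    return tuple(pi + half * d for pi, d in zip(p, displacement(p, q)))

P = (-1, 3, 4)
Q = (-3, 2, 5)
print(displacement(P, Q), distance(P, Q))
print([str(c) for c in midpoint(P, Q)])
```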


7.3 Scalar Product


Depending on the application, there may be several ways to “multiply” vectors. We will cover a
few common methods here.

Definition 7.15 Scalar Product


The scalar product of 2 vectors a and b is defined by

a · b = |a||b| cos(θ)

where 0 ≤ θ ≤ π is the angle between the positive directions of a and b.

Figure 7.2: Angle in scalar product


Note that a · b is a scalar quantity. It is also known as the dot product. One of the uses of the
scalar product is to find the angle θ between 2 vectors a and b. Namely, if we know a · b, then we
can find θ by rearranging the formula
$$\cos(\theta) = \frac{a \cdot b}{|a||b|}$$

Since |a| ≥ 0 and |b| ≥ 0, the sign of the dot product is entirely determined by the angle between the vectors.
Given non-zero vectors a and b,
• if they are in the same direction, then θ = 0, cos(θ) = 1 and

a · b = |a||b|

• if they are in opposite directions, then θ = π, cos(θ) = −1 and

a · b = −|a||b|

• if they are perpendicular, then $\theta = \frac{\pi}{2}$, cos(θ) = 0 and

a · b = 0

Perpendicular vectors are also said to be orthogonal.

Exercise 7.16
Find the scalar product of the vectors (1, 1, 0) and (2, 2, 0).

Solution: These vectors are in the same direction since (2, 2, 0) = 2(1, 1, 0). Hence,
$$(1, 1, 0) \cdot (2, 2, 0) = \sqrt{2}\,\sqrt{8} = \sqrt{16} = 4$$

Scalar products of the standard basis vectors (i, j, k) are useful to know.
Theorem 7.17
Let i, j and k be the unit vectors of the standard basis. Since they are all mutually
perpendicular, we have:

i·i=1 j·j =1 k·k =1


i·j =j·i=0 i·k =k·i=0 k·j =j·k =0

This means we can compute the scalar product of vectors expressed in component form.

Example 7.18
Consider the vectors a = (a1 , a2 , a3 ) = a1 i + a2 j + a3 k and b = (b1 , b2 , b3 ) = b1 i + b2 j + b3 k,
then we have:

a · b = (a1 i + a2 j + a3 k) · (b1 i + b2 j + b3 k) Expand brackets


= (a1 i) · (b1 i) + (a1 i) · (b2 j) + (a1 i) · (b3 k)
+ (a2 j) · (b1 i) + (a2 j) · (b2 j) + (a2 j) · (b3 k)
+ (a3 k) · (b1 i) + (a3 k) · (b2 j) + (a3 k) · (b3 k)
= a1 b1 (i · i) + a1 b2 (i · j) + a1 b3 (i · k)
+ a2 b1 (j · i) + a2 b2 (j · j) + a2 b3 (j · k)
+ a3 b1 (k · i) + a3 b2 (k · j) + a3 b3 (k · k) Resolve products
= a1 b1 (1) + a1 b2 (0) + a1 b3 (0)
+ a2 b1 (0) + a2 b2 (1) + a2 b3 (0)
+ a3 b1 (0) + a3 b2 (0) + a3 b3 (1)
= a1 b1 + a2 b2 + a3 b3

Hence, we have:

a · b = a1 b1 + a2 b2 + a3 b3

Definition 7.19 Scalar product

The scalar product of the vectors a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) is given by

a · b = a1 b1 + a2 b2 + a3 b3

In particular:
• For any vector a: $a \cdot a = a_1^2 + a_2^2 + a_3^2 = |a|^2$
• For 2 dimensional vectors a = (a1 , a2 ) and b = (b1 , b2 ) we have a · b = a1 b1 + a2 b2 .

Exercise 7.20
Find a · b when a = (1, −4, 3) and b = (6, −2, −1).

Solution: a · b = 1 × 6 + (−4) × (−2) + 3 × (−1) = 11.

Exercise 7.21
Find the angle between the vectors e = i − 4j and f = 3i + 2j.

Solution: We work out each portion of the equation:

e · f = |e||f | cos(θ)

e · f = 1 × 3 + (−4) × 2 = −5
$$|e| = \sqrt{1^2 + (-4)^2} = \sqrt{17}$$
$$|f| = \sqrt{3^2 + 2^2} = \sqrt{13}$$

Hence,
$$\cos(\theta) = \frac{e \cdot f}{|e||f|} = \frac{-5}{\sqrt{17}\,\sqrt{13}}$$
θ = 1.914 radians (3DP)

Exercise 7.22
Find the angle at A in the triangle with vertices A, B, C with Cartesian coordinates
(1, 0, −1), (2, 1, 3), (3, 2, 1) respectively.

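Solution: First draw the picture.

[Figure: triangle with vertices A, B, C and origin O, with the vectors $\vec{AB}$ and $\vec{AC}$ drawn from A]

We then use Theorem 7.6 to get
$$\vec{AB} = \vec{OB} - \vec{OA} = (2, 1, 3) - (1, 0, -1) = (1, 1, 4)$$
$$\vec{AC} = \vec{OC} - \vec{OA} = (3, 2, 1) - (1, 0, -1) = (2, 2, 2).$$

Thus we have
$$\vec{AB} \cdot \vec{AC} = 1 \times 2 + 1 \times 2 + 4 \times 2 = 12$$
$$|\vec{AB}| = \sqrt{1 + 1 + 16} = \sqrt{18},$$
$$|\vec{AC}| = \sqrt{4 + 4 + 4} = \sqrt{12}.$$

Hence
$$\cos(\theta) = \frac{\vec{AB} \cdot \vec{AC}}{|\vec{AB}||\vec{AC}|} = \frac{12}{\sqrt{18}\,\sqrt{12}} = \sqrt{\frac{2}{3}}$$

This gives θ = 0.615 radians (3DP).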
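
The component formula for the scalar product and the rearranged angle formula can be checked numerically. The following is a minimal sketch with illustrative function names; the clamping of the cosine to the interval from −1 to 1 is a precaution added here against floating-point rounding and is not part of the mathematics.

```python
import math

def dot(a, b):
    """Scalar product from components: a1*b1 + a2*b2 (+ a3*b3)."""
    return sum(x * y for x, y in zip(a, b))

def magnitude(a):
    """|a| = sqrt(a . a)."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in radians between two non-zero vectors."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding
    return math.acos(cos_theta)

print(dot((1, -4, 3), (6, -2, -1)))          # Exercise 7.20
print(angle_between((1, -4), (3, 2)))        # Exercise 7.21
print(angle_between((1, 1, 4), (2, 2, 2)))   # Exercise 7.22, vectors AB and AC
```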

7.4 Vector Product


The vector product is defined for 3 dimensional vectors.

Definition 7.23 Vector product

Given non-zero vectors a and b in 3 dimensions, the vector product (or cross product) is
defined by
a × b = |a||b| sin(θ)n
where 0 ≤ θ ≤ π is the angle between the positive directions of a and b and n is the unit
normal vector described by the right-hand rule.


There are some concepts that require more explanation.


• the quantity a × b is a vector quantity
• An alternative notation for the vector product is a ∧ b.
• The unit normal vector n is a vector of length 1 that is simultaneously perpendicular to
both a and b.
• Since there are two options for the unit normal vector n, the one we choose is determined
by the right hand rule.
Theorem 7.24 The right-hand rule
Follow these steps to find the direction of n:
• Point the index finger of the right hand in the direction of a.
• Point the middle finger in the direction of b.
• The thumb then points in the direction of the unit normal vector n of a × b.

Example 7.25
Consider the following examples:

[Figure: two pairs of vectors drawn in the plane of the page, a with b and c with d.]

The right-hand rule has the unit normal of a × b pointing out of the page towards the reader,
whereas the unit normal of c × d points into the page away from the reader.

As for the scalar product, we can simplify these calculations by understanding
how vector products work on the standard basis.
• The angle between each vector and itself is θ = 0, so sin(θ) = 0 and a × a = 0. Hence

i × i = j × j = k × k = 0

• Since i and j are perpendicular, we have sin(θ) = 1. With the right-hand rule, we get:

i × j = k

• On the other hand (no pun intended), the right-hand rule gives:

j × i = −k


Theorem 7.26
Let i, j and k be the unit vectors of the standard basis. We have the cross products:

i × i = 0       j × j = 0       k × k = 0

i × j = k       j × k = i       k × i = j
j × i = −k      k × j = −i      i × k = −j

We can think of the standard basis as a cyclic permutation

i → j → k → i

The product of two different basis vectors is the third basis vector, with a positive sign if we
follow the cycle and a negative sign if we go in reverse.

We also have the following properties.
Theorem 7.27
For vectors a, b and c, we have:

b × a = −a × b
a × a = 0
a × (b + c) = a × b + a × c
(a + b) × c = a × c + b × c

We can then determine a general formula as follows:

Example 7.28
Consider the vectors a = (a1 , a2 , a3 ) = a1 i + a2 j + a3 k and b = (b1 , b2 , b3 ) = b1 i + b2 j + b3 k,
then we have:

a × b = (a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)

Expanding the brackets:

= (a1 i) × (b1 i) + (a1 i) × (b2 j) + (a1 i) × (b3 k)
+ (a2 j) × (b1 i) + (a2 j) × (b2 j) + (a2 j) × (b3 k)
+ (a3 k) × (b1 i) + (a3 k) × (b2 j) + (a3 k) × (b3 k)
= a1 b1 (i × i) + a1 b2 (i × j) + a1 b3 (i × k)
+ a2 b1 (j × i) + a2 b2 (j × j) + a2 b3 (j × k)
+ a3 b1 (k × i) + a3 b2 (k × j) + a3 b3 (k × k)

Resolving the products of the basis vectors (Theorem 7.26):

= a1 b1 (0) + a1 b2 (k) + a1 b3 (−j)
+ a2 b1 (−k) + a2 b2 (0) + a2 b3 (i)
+ a3 b1 (j) + a3 b2 (−i) + a3 b3 (0)
= (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k


Hence, we have:

a × b = (a2 b3 − a3 b2 )i + (a3 b1 − a1 b3 )j + (a1 b2 − a2 b1 )k

Exercise 7.29
If a = (2, −1, 3) and b = (1, 2, 1), find a × b.

Solution:

a × b = (2i − j + 3k) × (i + 2j + k)
= ((−1)(1) − (3)(2))i + ((3)(1) − (2)(1))j + ((2)(2) − (−1)(1))k
= −7i + j + 5k
= (−7, 1, 5)
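
The component formula from Example 7.28 translates directly into code. The following is a minimal Python sketch, assuming 3-dimensional vectors are stored as tuples; the function name `cross` is illustrative and not part of the course notes.

```python
def cross(a, b):
    """Vector product of two 3-dimensional vectors, using
    a x b = (a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

# Standard basis vectors
i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(i, j))                    # (0, 0, 1), which is k
print(cross(j, i))                    # (0, 0, -1), which is -k
print(cross((2, -1, 3), (1, 2, 1)))   # (-7, 1, 5)
```

The first two lines of output agree with Theorem 7.26, and the last agrees with Exercise 7.29.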

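The magnitude of the vector product has a geometrical interpretation. Consider the
parallelogram below, with sides a and b, angle θ between them and height h = |b| sin(θ).

[Figure: parallelogram with base a, side b, angle θ between a and b, and height h.]

The area of this parallelogram is

Area = base × height = |a|h = |a||b| sin(θ) = |a × b|

which gives one application of the vector product.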
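
As a numerical illustration of the area formula, the following Python sketch computes the area of the parallelogram spanned by the vectors of Exercise 7.29 in two ways: from |a × b|, and from |a||b| sin(θ) with θ found from the scalar product. The helper names are illustrative and not part of the course notes.

```python
import math

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Vector product of two 3-dimensional vectors."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def magnitude(u):
    """Length |u| of a vector."""
    return math.sqrt(dot(u, u))

a = (2, -1, 3)
b = (1, 2, 1)

# Area as the magnitude of the vector product
area_from_cross = magnitude(cross(a, b))

# Area as |a||b| sin(theta); sin(theta) >= 0 because 0 <= theta <= pi
cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
sin_theta = math.sqrt(1 - cos_theta ** 2)
area_from_angle = magnitude(a) * magnitude(b) * sin_theta

print(round(area_from_cross, 3), round(area_from_angle, 3))   # 8.66 8.66
```

Both routes give √75, since a × b = (−7, 1, 5).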

7.5 Scalar Triple Product


We can combine the scalar product and the vector product to obtain the scalar triple product.

Definition 7.30 Scalar Triple Product


The scalar triple product of vectors a, b and c is defined as

a · (b × c)

Example 7.31
Combining the results of Example 7.18 and Example 7.28, we can show that for:

a = a1 i + a2 j + a3 k
b = b1 i + b2 j + b3 k
c = c1 i + c2 j + c3 k


we have:

b × c = (b2 c3 − b3 c2 )i + (b3 c1 − b1 c3 )j + (b1 c2 − b2 c1 )k

and hence

a · (b × c) = a1 (b2 c3 − b3 c2 ) + a2 (b3 c1 − b1 c3 ) + a3 (b1 c2 − b2 c1 )

Exercise 7.32
Find the scalar triple product a · (b × c) when a = (1, 2, 3), b = (2, 1, 3), c = (4, 0, 1)

Solution: We have

a · (b × c) = a1 (b2 c3 − b3 c2 ) + a2 (b3 c1 − b1 c3 ) + a3 (b1 c2 − b2 c1 )
= 1(1 × 1 − 3 × 0) + 2(3 × 4 − 2 × 1) + 3(2 × 0 − 1 × 4)
= 1 + 20 − 12
= 9

A geometrical interpretation may also be given to the scalar triple product. Consider the following
parallelepiped:

[Figure: parallelepiped with edges a, b and c, where b and c span the base and φ is the angle between a and b × c.]

The volume V of this object is the base area B times the height h = |a| cos(φ), where φ is the
angle between a and b × c. We also know B = |b × c| from above, and that b × c points up by the
right-hand rule. Hence, we have

V = Bh
= |b × c||a| cos(φ)
= |a||b × c| cos(φ)
= a · (b × c)

Thus we have that the volume of such a parallelepiped is given by the scalar triple product a·(b×c).
This application of the formula gives the identity

a · (b × c) = b · (c × a) = c · (a × b)

since the volume does not depend upon the cyclic order we choose for a, b and c.


If any two of the vectors are the same, then the resulting parallelepiped is flat and has zero
volume. For example,

a · (a × b) = a · (b × a) = 0

In arriving at V = a · (b × c), we have been rather careful about choosing the orientation of a, b
and c. More generally, we have V = |a · (b × c)|.
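
The properties of the scalar triple product can be checked with the vectors of Exercise 7.32. The following is a minimal Python sketch; the function names `dot`, `cross` and `triple` are illustrative and not part of the course notes.

```python
def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Vector product of two 3-dimensional vectors."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

a, b, c = (1, 2, 3), (2, 1, 3), (4, 0, 1)

print(triple(a, b, c))                     # 9
print(triple(b, c, a), triple(c, a, b))    # 9 9   (cyclic order preserved)
print(triple(a, c, b))                     # -9    (order reversed)
print(abs(triple(a, c, b)))                # 9     (volume of the parallelepiped)
print(triple(a, a, b))                     # 0     (repeated vector)
```

Reversing the order changes the sign but not the size of the result, which is why the volume is taken as the absolute value.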

7.6 Equation of a Line


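The equation of a line in general can be obtained if we have a point P on the line and a vector
d parallel to the line (compare Definition 3.1). From the picture

[Figure: line L through the point P with direction vector d, showing the position vectors $\overrightarrow{OP} + \mathbf{d}$ and $\overrightarrow{OP} + t\mathbf{d}$.]

we see that the position vector of one point on the line L is given by $\overrightarrow{OP} + \mathbf{d}$. In fact, any point
R on the line will have a position vector of the form $\mathbf{r} = \overrightarrow{OP} + t\mathbf{d}$ for some value of the parameter
t.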

Definition 7.33 Equation of a line

Let P = (px , py , pz ) be a point on the line and d = (dx , dy , dz ) be a vector parallel to the
line. The set of points R = (x, y, z) on the line is given by:
• The parametric equation of the line L

r = p + td

for a parameter t.
• The set of parametric equations

x = px + t dx
y = py + t dy
z = pz + t dz

• The Cartesian equation of the line L, valid when dx, dy and dz are all non-zero,

$$\frac{x - p_x}{d_x} = \frac{y - p_y}{d_y} = \frac{z - p_z}{d_z}$$


Here p is the position vector of the point P. Rewriting the parametric equation r = p + td as

(x, y, z) = (px, py, pz) + t(dx, dy, dz)

and separating the components gives the set of parametric equations. We can then obtain the
Cartesian equation by solving each of them for t and equating the results. The same process can
be reversed to go from the Cartesian form to the parametric form.
In 2 dimensions, the Cartesian equation reduces to the point-slope formula (Definition 3.1).

Exercise 7.34
Find the parametric equation of the line through the point (1, 2, 1) in the direction of the
vector (−1, 1, 3). Hence, find the Cartesian form of the line.

Solution: The parametric equation is

r = (1, 2, 1) + t(−1, 1, 3) = (1, 2, 1) + (−t, t, 3t) = (1 − t, 2 + t, 1 + 3t)

Writing this in components, we have

x=1−t
y =2+t
z = 1 + 3t

Eliminating t, we get the Cartesian equation of the line

$$1 - x = y - 2 = \frac{z - 1}{3}$$
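
To see that the parametric and Cartesian forms in Exercise 7.34 describe the same line, we can generate points from r = p + td and substitute them into the Cartesian equation. The following is a minimal Python sketch; the function name `point_on_line` is illustrative and not part of the course notes.

```python
def point_on_line(p, d, t):
    """Point with position vector r = p + t d."""
    return tuple(pi + t * di for pi, di in zip(p, d))

p = (1, 2, 1)     # point on the line
d = (-1, 1, 3)    # vector parallel to the line

for t in (-1, 0, 2):
    x, y, z = point_on_line(p, d, t)
    # Each of the three Cartesian expressions should equal t
    print(t, (x, y, z), 1 - x, y - 2, (z - 1) / 3)
```

For every value of t, the three expressions 1 − x, y − 2 and (z − 1)/3 take the same value, namely t itself.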

If we are given two points P and Q on the line, then we can produce a vector parallel to the line
using Theorem 7.6: $\mathbf{d} = \overrightarrow{OQ} - \overrightarrow{OP}$. Then we recover the parametric equation r = p + td.

Exercise 7.35
Find the parametric equation of the line passing through the two points P = (1, 1, 0) and
Q = (2, 1, 3).

Solution: We choose the vector parallel to the line to be

$$\mathbf{d} = \overrightarrow{OQ} - \overrightarrow{OP} = (2, 1, 3) - (1, 1, 0) = (1, 0, 3)$$

and obtain the parametric equation

r = (1, 1, 0) + t(1, 0, 3) = (1 + t, 1, 3t)
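
The construction in Exercise 7.35 can be sketched in Python as follows. The function names `line_through` and `point_on_line` are illustrative and not part of the course notes.

```python
def line_through(P, Q):
    """Return (p, d) for the line r = p + t d through the points P and Q,
    taking d as the position vector of Q minus the position vector of P."""
    d = tuple(q - p for p, q in zip(P, Q))
    return P, d

def point_on_line(p, d, t):
    """Point with position vector r = p + t d."""
    return tuple(pi + t * di for pi, di in zip(p, d))

p, d = line_through((1, 1, 0), (2, 1, 3))
print(d)                         # (1, 0, 3)
print(point_on_line(p, d, 0))    # (1, 1, 0), the point P
print(point_on_line(p, d, 1))    # (2, 1, 3), the point Q
```

With this choice of d, the parameter values t = 0 and t = 1 recover the two given points.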


Exercise 7.36
The Cartesian equation of a line is given by

$$\frac{x - 1}{3} = \frac{y + 2}{2} = z - 2.$$

Write down the parametric version of the equation and hence find a vector parallel to the
line, and a point on the line.

Solution: We can write the Cartesian equation in the form

$$\frac{x - 1}{3} = \frac{y + 2}{2} = \frac{z - 2}{1}$$

and use Definition 7.33 to read off the parametric form

r = (x, y, z) = (1, −2, 2) + t(3, 2, 1)

Hence, (3, 2, 1) is a vector parallel to the line, and (1, −2, 2) is a point on the line (set
t = 0).

Some direction vectors d make the Cartesian form more complicated.

Exercise 7.37
Find the Cartesian equation of the line through the point (1, −3, 2) in the direction of the
vector (−1, 0, 3).

Solution: Writing this in components, we have

x=1−t
y = −3 + 0t = −3
z = 2 + 3t

We can eliminate t in the equations for x and z as normal. But y = −3 is independent of
t and hence constant. In this case, we obtain two separate conditions for the line:

$$1 - x = \frac{z - 2}{3}, \qquad y = -3$$

The challenge comes from 0 entries in d. Each 0 entry in d fixes the corresponding coordinate at
a constant value, which must be stated as a separate condition.
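
One way to handle zero entries in d systematically is to treat the two kinds of condition separately when testing whether a point lies on a line. The following is a minimal Python sketch under stated assumptions: coordinates are integers (so that exact fractions can be used) and d is not the zero vector. The function name `is_on_line` is illustrative and not part of the course notes.

```python
from fractions import Fraction

def is_on_line(p, d, r):
    """True if the point r satisfies the Cartesian conditions of r = p + t d.

    A component where d is zero gives a fixed-coordinate condition.
    The remaining components must all give the same value of t.
    Assumes integer coordinates and a non-zero vector d.
    """
    t_values = set()
    for pi, di, ri in zip(p, d, r):
        if di == 0:
            if ri != pi:              # e.g. the condition y = -3
                return False
        else:
            t_values.add(Fraction(ri - pi, di))
    return len(t_values) == 1

p = (1, -3, 2)
d = (-1, 0, 3)

print(is_on_line(p, d, (-1, -3, 8)))   # True:  t = 2 in both x and z, and y = -3
print(is_on_line(p, d, (-1, 0, 8)))    # False: y is not -3
print(is_on_line(p, d, (0, -3, 8)))    # False: x gives t = 1 but z gives t = 2
```

The example uses the line of Exercise 7.37, where the zero in d produces the separate condition y = −3.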


7.7 Equation of a Plane


A plane is a flat surface in 3 dimensional space.

Definition 7.38 Plane in 3 dimensional space


Given a single point P in Cartesian space and a normal vector n, the plane defined by P
and n is the set of points R such that

$$\overrightarrow{PR} \cdot \mathbf{n} = 0$$

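To understand the equation, consider the following picture

[Figure: plane containing the points P and R, with the normal vector n, the vector $\overrightarrow{PR}$ in the plane, and the position vectors p and r.]

Any vector from P perpendicular to n gives a point R in the plane. Another way to interpret this
definition is that the plane has the same normal vector n at every point R. Using Theorem 7.6, we
have

$$\overrightarrow{PR} \cdot \mathbf{n} = 0$$
$$(\mathbf{r} - \mathbf{p}) \cdot \mathbf{n} = 0$$
$$\mathbf{r} \cdot \mathbf{n} - \mathbf{p} \cdot \mathbf{n} = 0$$
$$\mathbf{r} \cdot \mathbf{n} = \mathbf{p} \cdot \mathbf{n}$$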

Exercise 7.39
Find the equation of the plane that contains the point P with position vector p = (1, 2, 4),
and has the normal vector n = (1, 1, −1).

Solution: A point r = (x, y, z) lies in the plane if it satisfies the equation r · n = p · n.


Then we get:

(x, y, z) · (1, 1, −1) = (1, 2, 4) · (1, 1, −1)


x + y − z = −1
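
The equation r · n = p · n can be sketched in Python as follows, assuming vectors are stored as tuples. The function names `plane_from_point_normal` and `in_plane` are illustrative and not part of the course notes.

```python
def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def plane_from_point_normal(p, n):
    """Return (n, k) describing the plane r . n = k, where k = p . n."""
    return n, dot(p, n)

def in_plane(r, n, k):
    """True if the point with position vector r satisfies r . n = k."""
    return dot(r, n) == k

n, k = plane_from_point_normal((1, 2, 4), (1, 1, -1))
print(n, k)                        # (1, 1, -1) -1
print(in_plane((1, 2, 4), n, k))   # True: the point P itself
print(in_plane((0, 0, 1), n, k))   # True: 0 + 0 - 1 = -1
print(in_plane((0, 0, 0), n, k))   # False: the origin is not in this plane
```

The components of n are the coefficients of x, y and z, and k is the right-hand side, so the output corresponds to the equation x + y − z = −1 of Exercise 7.39.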

In cases where we are not given the normal directly, we will have to compute it. For example,
suppose we are given the position vectors a, b and c of 3 points that lie in the plane. In this
case, we can construct a normal n by taking the vector product of any two non-parallel vectors
that are parallel to the plane. For example, we can use Theorem 7.6 to get $\overrightarrow{AB} = \mathbf{b} - \mathbf{a}$ and
$\overrightarrow{AC} = \mathbf{c} - \mathbf{a}$, and thus we can take

n = (b − a) × (c − a)

before applying the equation of the plane.

Exercise 7.40
Find a plane through the points A = (1, 1, −1), B = (2, 0, 2), and C = (0, −2, 1).

Solution: Two vectors in the plane are

$$\overrightarrow{AB} = (2, 0, 2) - (1, 1, -1) = (1, -1, 3)$$
$$\overrightarrow{AC} = (0, -2, 1) - (1, 1, -1) = (-1, -3, 2)$$

Their vector product is perpendicular to both vectors and therefore to the plane. This will
give a normal n to the plane.

$$\overrightarrow{AB} \times \overrightarrow{AC} = (\mathbf{i} - \mathbf{j} + 3\mathbf{k}) \times (-\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}) = 7\mathbf{i} - 5\mathbf{j} - 4\mathbf{k}$$

Thus a normal to the plane is n = (7, −5, −4). Any scalar multiple of this vector would also do
for the normal. The point A with position vector a = (1, 1, −1) is on the plane, so the equation
of the plane is given by

r · n = a · n

Substituting gives:

(x, y, z) · (7, −5, −4) = (1, 1, −1) · (7, −5, −4)
7x − 5y − 4z = 1 × 7 + 1 × (−5) + (−1) × (−4)
7x − 5y − 4z = 6
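
The method of Exercise 7.40 can be sketched in Python as follows. The function name `plane_through` and the helpers are illustrative and not part of the course notes; the sketch assumes the three points do not lie on a single line, since otherwise the computed normal is the zero vector.

```python
def dot(u, v):
    """Scalar product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Vector product of two 3-dimensional vectors."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def subtract(u, v):
    """Component-wise difference u - v."""
    return tuple(ui - vi for ui, vi in zip(u, v))

def plane_through(a, b, c):
    """Return (n, k) for the plane r . n = k through three points,
    with n = (b - a) x (c - a) and k = a . n."""
    n = cross(subtract(b, a), subtract(c, a))
    return n, dot(a, n)

A, B, C = (1, 1, -1), (2, 0, 2), (0, -2, 1)
n, k = plane_through(A, B, C)
print(n, k)                                   # (7, -5, -4) 6
print([dot(point, n) for point in (A, B, C)])   # [6, 6, 6]
```

All three points give the same value of r · n, which confirms that each of them satisfies 7x − 5y − 4z = 6.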

A Formula Sheet

Hyperbolic Functions

$$\sinh(x) = \frac{e^{x} - e^{-x}}{2}$$
$$\cosh(x) = \frac{e^{x} + e^{-x}}{2}$$
$$1 = \cosh^2(x) - \sinh^2(x)$$
$$\sinh(2x) = 2\sinh(x)\cosh(x)$$
$$\cosh(2x) = 2\sinh^2(x) + 1 = 2\cosh^2(x) - 1$$

Standard Derivatives

f(x)                  f′(x)
$x^n$                 $n x^{n-1}$
$\sin(ax + b)$        $a\cos(ax + b)$
$\cos(ax + b)$        $-a\sin(ax + b)$
$e^{ax}$              $a e^{ax}$
$\ln(ax + b)$         $\frac{a}{ax + b}$
$\sinh(ax + b)$       $a\cosh(ax + b)$
$\cosh(ax + b)$       $a\sinh(ax + b)$
$uv$                  $u'v + uv'$
$\frac{u}{v}$         $\frac{u'v - uv'}{v^2}$

Standard Integrals

f(x)                  ∫ f(x) dx
$(ax + b)^n$          $\frac{(ax + b)^{n+1}}{a(n + 1)} + C$ for $n \neq -1$
$\sin(ax + b)$        $-\frac{1}{a}\cos(ax + b) + C$
$\cos(ax + b)$        $\frac{1}{a}\sin(ax + b) + C$
$e^{ax}$              $\frac{1}{a}e^{ax} + C$
$\frac{1}{ax + b}$    $\frac{1}{a}\ln(ax + b) + C$ for $ax + b > 0$

Vectors

Scalar product: a · b = |a||b| cos(θ)

For a = (a1, a2, a3), b = (b1, b2, b3):

a · b = a1 b1 + a2 b2 + a3 b3

For standard basis:

i · i = 1       j · j = 1       k · k = 1
i · j = 0       j · k = 0       k · i = 0

Vector product: a × b = |a||b| sin(θ)n

For a = (a1, a2, a3), b = (b1, b2, b3):

a × b = (a2 b3 − a3 b2)i + (a3 b1 − a1 b3)j + (a1 b2 − a2 b1)k

For standard basis:

i × j = k       j × k = i       k × i = j
j × i = −k      k × j = −i      i × k = −j

B Standard Normal Distribution Table
Cumulative probability P(Z ≤ z):

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
3.5 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998 0.9998
3.6 0.9998 0.9998 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.7 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.8 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999 0.9999
3.9 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
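
Entries of this table can be reproduced numerically. The following is a minimal Python sketch, assuming the standard identity P(Z ≤ z) = (1 + erf(z/√2))/2 between the standard normal cumulative probability and the error function; the function name `standard_normal_cdf` is illustrative and not part of the course notes.

```python
import math

def standard_normal_cdf(z):
    """Cumulative probability P(Z <= z) for a standard normal variable Z,
    computed from the error function in the Python standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (0.0, 1.0, 1.96):
    print(f"{z:.2f}  {standard_normal_cdf(z):.4f}")
# 0.00  0.5000
# 1.00  0.8413
# 1.96  0.9750
```

The printed values agree with the table entries in row 0.0 column 0.00, row 1.0 column 0.00 and row 1.9 column 0.06.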
