MAT1841 Revision 35
Remark. Reading and understanding the theory is very important; however, being able to do questions efficiently is the key to doing well in the exam! Thus, you should attempt the Applied class problem sets on your own. Refer to the lectures/lecture notes/this document if you get stuck.
1. Vectors
An $n$-dimensional vector is written as
$$v = (v_1, \ldots, v_n),$$
where $v_1, \ldots, v_n$ are all real numbers. We say that $v$ belongs to $\mathbb{R}^n$ (i.e. $v \in \mathbb{R}^n$). Its length¹ is given by
$$|v| = (v_1^2 + \cdots + v_n^2)^{1/2} = \left( \sum_{i=1}^{n} v_i^2 \right)^{1/2}.$$
A vector possesses only a length and a direction; it does not correspond to a particular position in $\mathbb{R}^n$.
1.1. Dot product. For two vectors $v, w \in \mathbb{R}^n$, their dot product is defined as
$$v \cdot w := v_1 w_1 + \cdots + v_n w_n = \sum_{i=1}^{n} v_i w_i.$$
We know the following things about the dot product (and probably more!):
(i) $v \cdot w = w \cdot v$.
(ii) $v \cdot w \in \mathbb{R}$. That is, the dot product returns a scalar (not a vector!). This is one reason why it is often called a scalar product.²
(iii) $v \cdot w = |v||w|\cos(\theta)$, where $\theta \in [0, \pi]$ is the angle between the vectors $v$ and $w$. Thus the dot product gives a measure of 'collinearity' of two vectors.
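For instance, properties (ii) and (iii) can be checked numerically (a quick sketch using NumPy; the particular vectors are arbitrary choices):

```python
import numpy as np

v = np.array([1.0, 2.0, 2.0])
w = np.array([3.0, 0.0, 4.0])

dot = np.dot(v, w)                                   # scalar: v1*w1 + ... + vn*wn
cos_theta = dot / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.arccos(cos_theta)                         # angle in [0, pi]

print(dot)                 # 11.0
print(np.degrees(theta))   # approx 42.8 degrees
```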
1.2. Scalar and vector projection. The vector projection of a vector $v$ onto $w$ is a vector $v_w$ which points in the direction of $w$, its length being the 'shadow' of $v$ cast on $w$. Mathematically,
$$v_w = \left( \frac{v \cdot w}{|w|^2} \right) w,$$
and its (signed) length, $\frac{v \cdot w}{|w|} = |v|\cos(\theta)$, is called the scalar projection of $v$ onto $w$.
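A minimal numerical sketch of these formulas (assuming the standard projection formulas stated above; the vectors are arbitrary):

```python
import numpy as np

v = np.array([2.0, 3.0])
w = np.array([4.0, 0.0])

scalar_proj = np.dot(v, w) / np.linalg.norm(w)       # |v| cos(theta), the 'shadow' length
vector_proj = (np.dot(v, w) / np.dot(w, w)) * w      # a vector pointing along w

print(scalar_proj)    # 2.0
print(vector_proj)    # [2. 0.]
```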
¹ Also known as magnitude or norm, although the latter is usually used in more abstract mathematics.
² Actually, the dot product is an operator which belongs to a broader class of operators called inner/scalar products.
1.3. Cross product. For two vectors $v, w \in \mathbb{R}^3$, their cross product is defined as
$$v \times w = (v_2 w_3 - v_3 w_2,\; v_3 w_1 - v_1 w_3,\; v_1 w_2 - v_2 w_1).$$
We know the following things about the cross product (and probably more!):
(i) Cross products are only defined for vectors that belong to $\mathbb{R}^3$.
(ii) Let $u = v \times w$. Then $u$ is perpendicular to both $v$ and $w$, and points in the direction given by the right-hand rule. In other words, the cross product $v \times w$ returns a vector in $\mathbb{R}^3$ that is perpendicular to both $v$ and $w$.
(iii) $v \times w = -(w \times v)$; this is a consequence of the right-hand rule.
(iv) $v \times w = |v||w|\sin(\theta)\,\hat{n}$, where $\theta \in [0, \pi]$ is the angle between $v$ and $w$ and $\hat{n}$ is a unit vector in $\mathbb{R}^3$ perpendicular to both $v$ and $w$ and pointing in the direction given by the right-hand rule.
(v) $|v \times w| = |v||w|\sin(\theta)$; this is the area of the parallelogram spanned by $v$ and $w$.
(vi) $v \times w = (0, 0, 0)$ if and only if the angle between the vectors $v$ and $w$ is either $0$ or $\pi$ (i.e., if $v$ and $w$ are collinear).
(vii) $v \times w$ can be written as an informal determinant of a $3 \times 3$ matrix:
$$v \times w = \begin{vmatrix} i & j & k \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix},$$
where one implements Laplace expansion along the top row. Note that $i = (1, 0, 0)$, $j = (0, 1, 0)$, $k = (0, 0, 1)$. Often people prefer remembering this form of the cross product, as computing determinants via Laplace expansion is easy to remember!
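As a quick sanity check of properties (ii) and (v) (a sketch using NumPy; the vectors are arbitrary):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

u = np.cross(v, w)   # (v2*w3 - v3*w2, v3*w1 - v1*w3, v1*w2 - v2*w1)

print(u)                              # [-3.  6. -3.]
print(np.dot(u, v), np.dot(u, w))     # 0.0 0.0, so u is perpendicular to both v and w
print(np.linalg.norm(u))              # area of the parallelogram spanned by v and w
```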
Lines and planes can be utilised for various purposes in linear algebra. In particular, they can
provide a geometric picture concerning the solution(s) of linear systems.
(i) Geometric interpretation in terms of lines for systems of linear equations in two variables.
(ii) Geometric interpretation in terms of planes for systems of linear equations in three vari-
ables.
(iii) Critical points (i.e., stationary points and points of singularity), local extrema and absolute extrema. We're talking about univariate functions here!
(iv) Derivatives of inverse functions, in particular derivatives of inverse circular functions (e.g., $\sin^{-1}$, $\cos^{-1}$, $\tan^{-1}$).
4. Parametric curves
A curve can be expressed in cartesian form if we can find an algebraic relationship between the $x$ and $y$ coordinates on the curve. There are two ways to do this, either explicitly or implicitly. For a curve in explicit (cartesian) form, the set of all points on the curve is
$$C_{\mathrm{Ex}} = \{(x, y) \in \mathbb{R}^2 : y = f(x),\; x \in I\},$$
where $f : I \to \mathbb{R}$ is a function. This means the $y$ coordinate for each point on the curve can be expressed as the output of a function of $x$. Most curves cannot be expressed in explicit (cartesian) form. However, we still may be able to find an algebraic relationship between the $x$ and $y$ coordinates. A curve in implicit (cartesian) form can be described as
$$C_{\mathrm{Im}} = \{(x, y) \in \mathbb{R}^2 : g(x, y) = 0\}.$$
For example, if $g(x, y) = x^2 + y^2 - 1$, then $C_{\mathrm{Im}}$ describes a circle centred at $(0, 0)$ with radius 1.³
For a curve in $\mathbb{R}^3$ considered in parametric form, it is given by the mapping $t \mapsto (x(t), y(t), z(t))$, in which case the collection of points on this curve is⁴
$$C_{\mathrm{Par}} = \{(x(t), y(t), z(t)) \in \mathbb{R}^3 : t \in I\}.$$
The parametric form has a number of advantages and disadvantages over the cartesian form.
(i) The parametric form of a curve always exists. That is, a curve is a curve if and only if it can be expressed in parametric form. A cartesian form for a curve may not exist.
(ii) Finding a typical point on a curve in parametric form is easy: just substitute in a particular value for the parameter. This is not true in implicit (cartesian) form; one needs to solve an algebraic equation to find a typical point on the curve, which may not be easy.
³ In the lectures we did not write the implicit form of a curve this way. This is because we had not seen multivariable functions yet! But now you can see that a curve in implicit form is a level curve of a surface in explicit form!
⁴ Technically the curve is the mapping $t \mapsto (x(t), y(t), z(t))$ rather than the set of points $C_{\mathrm{Par}}$.
(iv) The parametric form of a curve is non-unique; thus it is often difficult to tell what the curve looks like when presented in parametric form. Often we will need to convert it to cartesian form, as the cartesian form is more or less unique.
Exercise 4.0.1. Consider the parametric curve given by $x(t) = e^{t^2}$ and $y(t) = e^{2t} - \cos(\pi t)$. Find the tangent line to the curve at $t = 1$.
Solution.
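One way to check your working with SymPy (a sketch; this assumes the curve as reconstructed above, with $x(t) = e^{t^2}$):

```python
import sympy as sp

t, X = sp.symbols('t X')
x = sp.exp(t**2)
y = sp.exp(2*t) - sp.cos(sp.pi*t)

# Slope of the tangent: dy/dx = (dy/dt)/(dx/dt), evaluated at t = 1
slope = sp.simplify((sp.diff(y, t) / sp.diff(x, t)).subs(t, 1))
x1, y1 = x.subs(t, 1), y.subs(t, 1)          # the point (e, e^2 + 1)

tangent = sp.simplify(y1 + slope*(X - x1))   # tangent line as a function of X
print(slope)     # expected: E (the slope is e)
print(tangent)   # expected: E*X + 1, i.e. y = e*x + 1
```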
4.1. Power series and Taylor series. A power series⁵ is the series
$$f(x) = \sum_{n=0}^{\infty} a_n x^n,$$
where the $a_n$ are constants (the coefficients of the series). We know a power series is a function. We now ask the reverse question: 'If we have an arbitrary function, can it be expressed as a power series?'. The answer is often yes, and this object is called the function's Taylor series. The Taylor series of $f : I \to \mathbb{R}$ centred at/expanded around $a \in \mathbb{R}$ is given by
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n,$$
where $f^{(n)}$ means the $n$-th derivative of $f$. A Taylor series centred at/expanded around $a = 0$ is called a Maclaurin series. The Maclaurin series of your favourite functions (e.g., $e^x$, $\sin(x)$) can be found in your lecture notes.
The truncation of the Taylor series of a function $f$ up to $n + 1$ terms is called the $n$-th degree Taylor polynomial. Specifically,
$$T_n(x; a) := \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k.$$
It provides an approximation to the function $f$, and this approximation is usually good if one evaluates the Taylor polynomial near the centring point $a$, and/or a sufficiently high number of terms (i.e., sufficiently large $n$) is used in the Taylor polynomial.
Exercise 4.1.1. Find the first three non-zero terms in the Taylor series for $f(x) = (x - \frac{\pi}{2})\,e^{\cos(x)}$ centred at $a = \pi/2$.
Solution.
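One possible way to check the answer with SymPy (a sketch; n=6 is simply chosen large enough to expose the first three non-zero terms):

```python
import sympy as sp

x = sp.symbols('x')
f = (x - sp.pi/2) * sp.exp(sp.cos(x))

# Taylor expansion about a = pi/2; removeO() drops the order term
series = sp.series(f, x, sp.pi/2, n=6).removeO()
print(series)
# Expected leading terms: (x - pi/2) - (x - pi/2)**2 + (x - pi/2)**3/2 - ...
```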
⁵ There exist more general notions of power series, but we do not cover them.
4.2. Cubic splines. Consider $n + 1$ data points $(x_0, y_0), \ldots, (x_n, y_n)$. A cubic spline is a piecewise function $\tilde{y}(x)$ defined by
$$\tilde{y}(x) = \begin{cases} \tilde{y}_0(x), & x_0 \le x < x_1, \\ \tilde{y}_1(x), & x_1 \le x < x_2, \\ \vdots & \\ \tilde{y}_{n-1}(x), & x_{n-1} \le x \le x_n, \end{cases}$$
where each piece is a cubic polynomial of the form
$$\tilde{y}_i(x) = d_i + a_i(x - x_i) + b_i(x - x_i)^2 + c_i(x - x_i)^3.$$
The pieces are required to satisfy:
(i) $\tilde{y}_i(x_i) = y_i$ for $i = 0, \ldots, n-1$ (each piece starts at its left data point);
(ii) $\tilde{y}_i(x_{i+1}) = y_{i+1}$ for $i = 0, \ldots, n-1$ (each piece ends at its right data point);
(iii) $\tilde{y}_i'(x_{i+1}) = \tilde{y}_{i+1}'(x_{i+1})$ for $i = 0, \ldots, n-2$ (first derivatives agree at the interior points);
(iv) $\tilde{y}_i''(x_{i+1}) = \tilde{y}_{i+1}''(x_{i+1})$ for $i = 0, \ldots, n-2$ (second derivatives agree at the interior points);
(v) $\tilde{y}_0''(x_0) = \tilde{y}_{n-1}''(x_n) = 0$ (the 'natural' boundary conditions).
In total there are $4n$ equations, and also $4n$ unknowns. Hence the coefficients $a_i, b_i, c_i, d_i$ can be found uniquely (see lectures for the formulas!).
The purpose of a cubic spline is to provide a sufficiently smooth curve that interpolates the given data points, and thus provide an estimate of the data at unknown sample points.
Exercise 4.2.1. Consider the data
x      −4   −3   1   3
f(x)    2    2   2   4
Let
$$\tilde{y}(x) = \begin{cases} \tilde{y}_0(x), & -4 \le x < -3, \\ \tilde{y}_1(x), & -3 \le x < 1, \\ \tilde{y}_2(x), & 1 \le x \le 3, \end{cases}$$
where the pieces $\tilde{y}_0(x), \tilde{y}_1(x), \tilde{y}_2(x)$ are given by
$$\tilde{y}_0(x) = 2 + \tfrac{1}{26}(x + 4) - \tfrac{1}{26}(x + 4)^3,$$
$$\tilde{y}_1(x) = 2 - \tfrac{1}{13}(x + 3) - \tfrac{3}{26}(x + 3)^2 + \tfrac{7}{208}(x + 3)^3,$$
$$\tilde{y}_2(x) = 2 + \tfrac{8}{13}(x - 1) + \tfrac{15}{52}(x - 1)^2 - \tfrac{5}{104}(x - 1)^3.$$
Show that $\tilde{y}(x)$ is the cubic spline that interpolates the data above.
Solution. No need to derive the cubic spline yourself. One needs to just verify that the pieces $\tilde{y}_0, \tilde{y}_1, \tilde{y}_2$ satisfy the properties (i) to (v) of a cubic spline.
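A sketch of how one might carry out this verification exactly in SymPy (assuming the natural-spline conditions (i)–(v) as listed above):

```python
import sympy as sp

x = sp.symbols('x')
knots = [-4, -3, 1, 3]
data  = [ 2,  2, 2, 4]

y0 = 2 + sp.Rational(1, 26)*(x + 4) - sp.Rational(1, 26)*(x + 4)**3
y1 = 2 - sp.Rational(1, 13)*(x + 3) - sp.Rational(3, 26)*(x + 3)**2 + sp.Rational(7, 208)*(x + 3)**3
y2 = 2 + sp.Rational(8, 13)*(x - 1) + sp.Rational(15, 52)*(x - 1)**2 - sp.Rational(5, 104)*(x - 1)**3
pieces = [y0, y1, y2]

# (i)/(ii): each piece passes through the data points at both ends of its subinterval
for i, p in enumerate(pieces):
    assert p.subs(x, knots[i]) == data[i] and p.subs(x, knots[i + 1]) == data[i + 1]

# (iii)/(iv): first and second derivatives agree at the interior knots x = -3 and x = 1
for i in (0, 1):
    k = knots[i + 1]
    assert sp.diff(pieces[i], x).subs(x, k) == sp.diff(pieces[i + 1], x).subs(x, k)
    assert sp.diff(pieces[i], x, 2).subs(x, k) == sp.diff(pieces[i + 1], x, 2).subs(x, k)

# (v): natural boundary conditions at the outer knots
assert sp.diff(y0, x, 2).subs(x, -4) == 0 and sp.diff(y2, x, 2).subs(x, 3) == 0
print("All spline conditions hold.")
```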
5. Integration
5.1. Definite integral. The Riemann integral or definite integral of a piecewise continuous and bounded function $f : I \to \mathbb{R}$ is defined as
$$\int_a^b f(x)\,dx := \lim_{n \to \infty} \sum_{i=0}^{n-1} f(x_i)(x_{i+1} - x_i).$$
This limit is a bit strange and we did not properly study it in this course. However, the
interpretation is that the definite integral yields the signed area bounded by the curve of the
function f , the vertical lines x = a, x = b and the x-axis. It achieves this by essentially fitting
rectangles of infinitesimal width between the curve of f and the x-axis, and ‘summing’ up each
rectangle’s area.
In general, it is not easy to calculate definite integrals from the definition. Instead, one usually utilises the fundamental theorem of calculus (FTOC), which states that
$$\int_a^b f(x)\,dx = F(b) - F(a),$$
where $F'(x) = f(x)$. Here, $F$ is called an antiderivative of $f$. Antiderivatives are non-unique; if $F$ is an antiderivative for $f$, then so is $\tilde{F}(x) = F(x) + c$, where $c$ is a constant. Thus, the fundamental theorem of calculus yields the notion that 'differentiation and integration are essentially inverse operations'.⁶
5.2. Indefinite integrals and properties. The indefinite integral of a function $f$ is given by
$$\int f(x)\,dx = F(x) + c,$$
where $F$ is an antiderivative of $f$ and $c$ is an arbitrary constant. Some useful properties:
(i) $\int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx$; one can split the interval of integration up.
(ii) $\int [d_1 f(x) + d_2 g(x)]\,dx = d_1 \int f(x)\,dx + d_2 \int g(x)\,dx$. This means indefinite integrals are linear.
(iii) $\int_a^b [d_1 f(x) + d_2 g(x)]\,dx = d_1 \int_a^b f(x)\,dx + d_2 \int_a^b g(x)\,dx$. This means definite integrals are linear.
One can easily find antiderivatives (or indefinite integrals) for various elementary functions by simply reverse engineering differentiation. E.g., $\int \sin(x)\,dx = -\cos(x) + c$. See lecture notes for a full list! However, we are often interested in integrating functions which are more complicated than elementary functions. In order to do so, one can reverse engineer the chain rule and the product rule to obtain integration by substitution and integration by parts respectively.
⁶ Although, I believe one should appreciate that the ideas of differentiation and integration are quite separate — one can talk about integration without needing to appeal to differentiation.
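For instance, a by-parts or substitution computation can be checked symbolically (a sketch using SymPy; the integrands are arbitrary examples):

```python
import sympy as sp

x = sp.symbols('x')

# Integration by parts example: integral of x*e^x dx = (x - 1)*e^x + c
F1 = sp.integrate(x * sp.exp(x), x)
print(F1)                                         # (x - 1)*exp(x)
print(sp.simplify(sp.diff(F1, x) - x*sp.exp(x)))  # 0, so F1 is indeed an antiderivative

# Substitution example: integral of 2x*cos(x^2) dx = sin(x^2) + c
F2 = sp.integrate(2*x * sp.cos(x**2), x)
print(F2)                                         # sin(x**2)
```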
5.3. Area between curves. Suppose $f(x) \ge g(x)$ for all $x \in [a, b]$. The area bounded by the two curves determined by the functions $f$ and $g$, and the vertical lines $x = a$ and $x = b$, is given by the formula
$$\int_a^b [f(x) - g(x)]\,dx.$$
This only works if $f(x) \ge g(x)$ for all $x \in [a, b]$. If the inequality between $f$ and $g$ changes over the interval $[a, b]$, you must take this into account; namely, determine the subintervals where $f$ is bigger/smaller than $g$, and consider them case by case.
Finding which function f or g is bigger over each potential subinterval can be difficult. In
fact, you sort of don’t need to determine which function is bigger if you consider the following
strategy. Suppose you need to find the area between f and g over the interval [a, b]. Let
h(x) := f (x) − g(x). Then:
(i) Solve h(x) = 0 to get x1 , x2 , . . . . Order them so that xi < xi+1 . These values xi are the
points where f and g intersect.
(ii) Discard any xi that are not in (a, b). Suppose you have x1 , . . . , xn left over.
(iii) If there are no $x_i$ left in $(a, b)$, then the area between $f$ and $g$ over the interval $[a, b]$ is
$$\int_a^b (\pm?)\,h(x)\,dx = \left| \int_a^b h(x)\,dx \right|.$$
Otherwise, with intersection points $x_1 < \cdots < x_n$ in $(a, b)$, the desired area is
$$\int_a^{x_1} (\pm?)\,h(x)\,dx + \int_{x_1}^{x_2} (\pm?)\,h(x)\,dx + \cdots + \int_{x_n}^{b} (\pm?)\,h(x)\,dx = \left| \int_a^{x_1} h(x)\,dx \right| + \left| \int_{x_1}^{x_2} h(x)\,dx \right| + \cdots + \left| \int_{x_n}^{b} h(x)\,dx \right|.$$
In other words, the total desired area is the sum of all these integrals. Note I have written a $(\pm?)$ in front of each $h(x)$. This is because you may not know which function $f$ or $g$ is bigger over each subinterval. However, what you do know is that each of these integrals corresponds to an (unsigned) area, and thus must be positive! So if you compute any of these integrals and it ends up negative, you chose the wrong sign. But do not fret: you simply need to remove the minus sign (why?), hence why we put the absolute value. Note that this ONLY works once you write the desired area as a sum of integrals like this. And note that in application, there are probably either 0, 1, or 2 intersection points.
In order to find the area of the region enclosed between two curves (without this vertical
line constraint), you must find all points where they intersect (if they do!). Note that there
could be multiple regions of interest. A slight modification of the previous strategy will then
yield you the desired area.
Exercise 5.3.1. Find the area of the region bounded by the curves determined by the functions $f(x) = x^2 e^{-x}$ and $g(x) = x e^{-x}$ and the vertical lines $x = 0$, $x = 1$. What about if it were instead the vertical lines $x = 0$, $x = 2$?
Solution.
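A sketch of the strategy above in SymPy, as a check on the hand computation:

```python
import sympy as sp

x = sp.symbols('x')
h = x**2*sp.exp(-x) - x*sp.exp(-x)         # h = f - g

print(sp.solve(sp.Eq(h, 0), x))             # [0, 1]: the curves intersect at x = 0 and x = 1

# Over [0, 1] there are no intersection points strictly inside the interval
area_01 = sp.Abs(sp.integrate(h, (x, 0, 1)))
print(float(area_01))                       # approx 0.104, i.e. 3/e - 1

# Over [0, 2] the curves cross at x = 1, so split the integral there
area_02 = sp.Abs(sp.integrate(h, (x, 0, 1))) + sp.Abs(sp.integrate(h, (x, 1, 2)))
print(float(area_02))                       # approx 0.260
```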
5.4. Trapezoidal rule. The trapezoidal rule is utilised for numerically approximating definite integrals by fitting trapeziums between the curve of the function being integrated and the x-axis. It is superior to fitting rectangles, since rectangles assume the function is constant on intervals, whereas the trapeziums assume it is linear on intervals. Neither is correct, but linear is better than constant! The more trapeziums utilised, the better the approximation. One would use the trapezoidal rule if the function being integrated has no known antiderivative, or the antiderivative is hard to find. The trapezoidal rule with $n$ trapeziums is
$$\int_a^b f(x)\,dx \approx \frac{1}{2}\,\frac{b-a}{n} \left( f(a) + f(b) + 2\sum_{i=1}^{n-1} f(x_i) \right).$$
Note that the distance between $x_i$ and $x_{i+1}$ is assumed to be uniform, thus $\Delta x = \frac{b-a}{n}$. This yields that $x_i = a + i\,\Delta x$.
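A minimal implementation of this formula (a sketch; the test integrand $e^{-x^2}$, which has no elementary antiderivative, is just an illustrative choice):

```python
import math

def trapezoidal(f, a, b, n):
    """Approximate the integral of f over [a, b] using n trapeziums."""
    dx = (b - a) / n
    interior = sum(f(a + i*dx) for i in range(1, n))
    return 0.5 * dx * (f(a) + f(b) + 2*interior)

# Example: integral of exp(-x^2) from 0 to 1 (true value approx 0.746824)
for n in (4, 16, 64):
    print(n, trapezoidal(lambda x: math.exp(-x**2), 0.0, 1.0, n))
```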
6. Multivariable calculus
6.1. Surfaces. The plot of a function of two variables $f : D \to \mathbb{R}$ is the set of points
$$S = \{(x, y, z) \in \mathbb{R}^3 : z = f(x, y),\; (x, y) \in D\}.^7$$
In order to plot the surface of a function of two variables, one considers cross sections of the surface, and then constructs the surface from them. For example, if $z = f(x, y)$ is a surface, one can plot the curves $f(x, y) = z_0$ for various $z_0$ all on the same set of axes ($x$-$y$ plane). This plot is called a contour plot, and the curves are called contours or level curves/sets. The level curves can be thought of as curves obtained via intersection ('slicing') of the surface $z = f(x, y)$ with the horizontal planes $z = z_0$.
One can also look at cross sections corresponding to vertical slicing, i.e., freezing one of the input variables $x$ or $y$. For example, freezing $y = y_0$ gives the curve
$$C = \{(x, z) \in \mathbb{R}^2 : z = f(x, y_0)\},$$
plotted in the $x$-$z$ plane. These are called traces of the surface. One can think of these as being obtained as the intersection ('slicing') of the surface with the vertical planes $x = x_0$ or $y = y_0$.
⁷ Usually when we refer to a surface we just say something like 'Consider the surface $z = f(x, y)$' rather than the whole set of points thing.
Thus, one can construct a surface by simply investigating its level curves and traces.
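For example, a contour plot and a trace can be produced as follows (a sketch using Matplotlib; the surface $z = x^2 + y^2$ is just an illustrative choice):

```python
import numpy as np
import matplotlib.pyplot as plt

f = lambda x, y: x**2 + y**2                  # the surface z = f(x, y)

x = np.linspace(-2, 2, 200)
y = np.linspace(-2, 2, 200)
X, Y = np.meshgrid(x, y)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.contour(X, Y, f(X, Y), levels=[0.5, 1, 2, 3])   # level curves f(x, y) = z0
ax1.set_title('contour plot (level curves)')

ax2.plot(x, f(x, 0.0))                        # trace obtained by freezing y = 0
ax2.set_title('trace in the x-z plane (y = 0)')
plt.show()
```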
Similar to the case of curves, one can write the implicit (cartesian) form of a surface as
$$S_{\mathrm{Im}} = \{(x, y, z) \in \mathbb{R}^3 : g(x, y, z) = 0\}.$$
For example, if we let $g(x, y, z) = x^2 + y^2 - z^2$, then $S_{\mathrm{Im}}$ is the surface of a double cone.
Lastly, we also have the parametric form of a surface. The parametric form of a surface is given by the map $(u, v) \mapsto (x(u, v), y(u, v), z(u, v))$. The set of all points on the surface is given by
$$S_{\mathrm{Par}} = \{(x(u, v), y(u, v), z(u, v)) \in \mathbb{R}^3 : (u, v) \in D\}.$$
6.2. Partial derivatives. Let $f : D \to \mathbb{R}$ be a two variable function. The partial derivative of $f$ w.r.t. $x$, denoted by $\frac{\partial f}{\partial x}$, is the instantaneous rate of change of $f$ in the $x$ direction, with $y$ fixed. Similarly, the partial derivative of $f$ w.r.t. $y$, denoted by $\frac{\partial f}{\partial y}$, is the instantaneous rate of change of $f$ in the $y$ direction, with $x$ fixed. Namely,
$$\frac{\partial f}{\partial x} := \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}, \qquad \frac{\partial f}{\partial y} := \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}.$$
Calculating partial derivatives is easy! For example, if $f(x, y) = x e^{x+y}$, then in order to compute $f_x$, one 'treats' $y$ as a constant, and then implements the usual differentiation rules from the univariate setting in $x$. Using the product rule, one obtains $f_x = e^{x+y} + x e^{x+y}$.
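A quick symbolic check of this example (a sketch using SymPy):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x * sp.exp(x + y)

print(sp.diff(f, x))   # x*exp(x + y) + exp(x + y), i.e. f_x = e^(x+y) + x e^(x+y)
print(sp.diff(f, y))   # x*exp(x + y)
```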
6.3. Tangent planes. Tangent planes generalise the notion of a tangent line to two variable functions. Tangent planes can be placed at a point $(x_0, y_0, f(x_0, y_0))$ on the surface $z = f(x, y)$ if the surface is sufficiently locally linear. That is, zoom in at $(x_0, y_0, f(x_0, y_0))$: if the surface is locally flat, a tangent plane can be fitted. The tangent plane is
$$z = z_0 + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0),$$
where $z_0 = f(x_0, y_0)$. Interestingly, it contains all the possible tangent lines to the point $(x_0, y_0)$, and thus the instantaneous rate of change of $f$ at $(x_0, y_0)$ in any direction. However, in order to construct the tangent plane, the only information we require from $f$ are its partial derivatives. This is because the partial derivatives are sufficient in order to obtain any directional derivative (think about adding non-collinear vectors!). Also, if you look at the RHS of the preceding equation, it coincides with the linear approximation / first-order Taylor polynomial of $f$ centred at $(x_0, y_0)$.
Exercise 6.3.1. Let $f(x, y) = x^3 e^{\cos(x+y)}$. Find the equation of the tangent plane to the surface $z = f(x, y)$ at the point $(\pi/2, 0)$.
Solution.
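One way to check the computation with SymPy (a sketch, using the tangent-plane formula above):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 * sp.exp(sp.cos(x + y))

x0, y0 = sp.pi/2, 0
z0 = f.subs({x: x0, y: y0})
fx = sp.diff(f, x).subs({x: x0, y: y0})
fy = sp.diff(f, y).subs({x: x0, y: y0})

tangent_plane = z0 + fx*(x - x0) + fy*(y - y0)
print(sp.simplify(tangent_plane))
# Expected (up to rearrangement):
# z = (pi/2)^3 + (3*(pi/2)^2 - (pi/2)^3)*(x - pi/2) - (pi/2)^3 * y
```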
6.4. Chain rule. Let $f : D \to \mathbb{R}$ be a two variable function. Suppose we have a curve in $\mathbb{R}^2$ specified in parametric form $s \mapsto (x(s), y(s))$. If we apply $f$ to this curve, what we get is a curve in $\mathbb{R}^3$, namely $s \mapsto (x(s), y(s), f(x(s), y(s)))$. It makes sense to talk about the derivative of the (univariate!) function $s \mapsto f(x(s), y(s))$, since the curve $s \mapsto (x(s), y(s))$ determines the direction in which you approach any point $(x_0, y_0)$. Perhaps confusing with all the maps here...but there is a nice geometric picture in the lectures! Anywho, the chain rule states
$$\frac{d}{ds} f(x(s), y(s)) = \frac{\partial f}{\partial x}\frac{dx}{ds} + \frac{\partial f}{\partial y}\frac{dy}{ds},$$
which should be thought of as a function of $s$.
Suppose now we have a region in $\mathbb{R}^2$ specified in parametric form $(u, v) \mapsto (x(u, v), y(u, v))$. If we apply $f$ to this region, what we get is a surface in $\mathbb{R}^3$. Now it doesn't make sense to talk about the derivative of $(u, v) \mapsto f(x(u, v), y(u, v))$, since you can approach any point $(x_0, y_0)$ in this region from infinitely many directions. However, we can talk about partial derivatives. In this case the chain rule states
$$\frac{\partial}{\partial u} f(x(u, v), y(u, v)) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u},$$
and
$$\frac{\partial}{\partial v} f(x(u, v), y(u, v)) = \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}.$$
Both of these should be thought of as functions of $(u, v)$.
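A symbolic check of the first chain rule formula (a sketch; the choices $f(x, y) = x^2 y$, $x(s) = \cos(s)$, $y(s) = s^2$ are arbitrary):

```python
import sympy as sp

s, x, y = sp.symbols('s x y')

f = x**2 * y
xs, ys = sp.cos(s), s**2

# Left-hand side: differentiate the composition directly
lhs = sp.diff(f.subs({x: xs, y: ys}), s)

# Right-hand side: f_x * dx/ds + f_y * dy/ds, evaluated along the curve
rhs = (sp.diff(f, x)*sp.diff(xs, s) + sp.diff(f, y)*sp.diff(ys, s)).subs({x: xs, y: ys})

print(sp.simplify(lhs - rhs))   # 0
```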
6.5. Directional derivatives. For a unit vector $t = (t_1, t_2)$, the directional derivative of $f$ in the direction $t$ is defined as
$$\nabla_t f := \lim_{h \to 0} \frac{f(x + t_1 h, y + t_2 h) - f(x, y)}{h},$$
and thus gives the instantaneous rate of change of $f$ in the direction $t$. Notice if one takes $t = (1, 0)$ and $t = (0, 1)$ one recovers the definitions of the partial derivatives in $x$ and $y$ respectively.
Rather than compute directional derivatives from the definition, one can show (using the chain rule!) that
$$\nabla_t f = \nabla f \cdot t,$$
where $\nabla f := (f_x, f_y)$ is a vector of the partial derivatives of $f$, called the gradient of $f$ (often pronounced 'grad $f$').⁸ Be careful: when calculating directional derivatives, you must ensure that your direction vector $t$ is a unit vector. If it is not, you must instead consider a unit vector in the same direction (just normalise $t$ by dividing it by its length, i.e., consider $\hat{t} = t/|t|$).
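A short numerical illustration of this formula (a sketch; the function and the direction vector are arbitrary choices):

```python
import numpy as np

f = lambda x, y: x**2 * y + y**3
grad_f = lambda x, y: np.array([2*x*y, x**2 + 3*y**2])   # (f_x, f_y)

t = np.array([3.0, 4.0])
t_hat = t / np.linalg.norm(t)        # normalise: the direction must be a unit vector

x0, y0 = 1.0, 2.0
D_t = np.dot(grad_f(x0, y0), t_hat)  # directional derivative of f at (1, 2) in the direction of t
print(D_t)                           # 12.8
```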
6.6. Higher order Taylor polynomials. Let $f : D \to \mathbb{R}$ be a two variable function. The first-order Taylor polynomial (i.e., linear approximation) of $f$ centred at $(x_0, y_0)$ is given by
$$T_1(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).$$
This is essentially the equation of the tangent plane. The second-order Taylor polynomial centred at $(x_0, y_0)$ is similar; one just needs to include second-order terms:⁹
$$T_2(x, y) = T_1(x, y) + \tfrac{1}{2}\left[ f_{xx}(x_0, y_0)(x - x_0)^2 + 2 f_{xy}(x_0, y_0)(x - x_0)(y - y_0) + f_{yy}(x_0, y_0)(y - y_0)^2 \right].$$
A point $(x_0, y_0)$ is called a:
(i) Local minimum: if $f(x_0, y_0) \le f(x, y)$ for all $(x, y)$ 'near' $(x_0, y_0)$.
(ii) Local maximum: if $f(x_0, y_0) \ge f(x, y)$ for all $(x, y)$ 'near' $(x_0, y_0)$.
Local extrema refer to either local minima or local maxima. Saddle points are technically not extrema.
One can classify the nature of a stationary point $(x_0, y_0)$ via the following test.¹⁰ Let $D = f_{xx} f_{yy} - (f_{xy})^2$, evaluated at $(x_0, y_0)$.¹¹ Then:
(i) if $D > 0$ and $f_{xx}(x_0, y_0) > 0$, then $(x_0, y_0)$ is a local minimum;
(ii) if $D > 0$ and $f_{xx}(x_0, y_0) < 0$, then $(x_0, y_0)$ is a local maximum;
(iii) if $D < 0$, then $(x_0, y_0)$ is a saddle point;
(iv) if $D = 0$, the test is inconclusive.
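A sketch of this test in SymPy (the function $f(x, y) = x^3 - 3x + y^2$ is an arbitrary example with one local minimum and one saddle point; the $D = 0$ case is not handled):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**3 - 3*x + y**2

# Stationary points: solve f_x = f_y = 0
points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)

D = sp.diff(f, x, 2)*sp.diff(f, y, 2) - sp.diff(f, x, y)**2
for p in points:
    Dval, fxx = D.subs(p), sp.diff(f, x, 2).subs(p)
    kind = 'saddle' if Dval < 0 else ('local min' if fxx > 0 else 'local max')
    print(p, Dval, kind)
# Expected: (1, 0) is a local min (D = 12 > 0, f_xx > 0); (-1, 0) is a saddle (D = -12 < 0)
```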
⁸ We didn't talk much about $\nabla f$, but the interpretation is that it is a vector that points in the direction of steepest ascent along the surface $z = f(x, y)$ (and $-\nabla f$ points in the direction of steepest descent).
⁹ A function whose second-order partial derivatives exist and are themselves continuous is called a $C^2$ function.
¹⁰ This test doesn't work for points of singularity. Why?
¹¹ This is the determinant of the Hessian matrix, i.e., the $2 \times 2$ matrix of second-order partial derivatives.