Week 3
If $f(x, y)$ is a function of two variables, then $\frac{\partial}{\partial x} f(x, y)$ is defined as the
derivative of the function $g(x) = f(x, y)$, where $y$ is considered a constant. It is called the
partial derivative of $f$ with respect to $x$. The partial derivative with respect to $y$ is defined
similarly.
We also use the shorthand notation $f_x(x, y) = \frac{\partial}{\partial x} f(x, y)$. For iterated
derivatives, the notation is similar: for example $f_{xy} = \frac{\partial}{\partial x} \frac{\partial}{\partial y} f$.
The notation $\partial_x f$, $\partial_y f$ for partial derivatives was introduced by Carl Gustav Jacobi.
Joseph-Louis Lagrange had used the term "partial differences". The partial derivatives $f_x$ and $f_y$
measure the rate of change of the function in the x or y directions. For functions of more variables,
the partial derivatives are defined in a similar way.
1 For $f(x, y) = x^4 - 6x^2 y^2 + y^4$, we have $f_x(x, y) = 4x^3 - 12xy^2$, $f_{xx} = 12x^2 - 12y^2$,
$f_y(x, y) = -12x^2 y + 4y^3$, $f_{yy} = -12x^2 + 12y^2$ and see that $f_{xx} + f_{yy} = 0$. A function
which satisfies this equation is also called harmonic. The equation $f_{xx} + f_{yy} = 0$ is an example
of a partial differential equation: it is an equation for an unknown function $f(x, y)$ which
involves partial derivatives with respect to more than one variable.
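The harmonic property in example 1 can also be checked numerically. The following is a minimal sketch, assuming central finite differences are an acceptable stand-in for the exact derivatives; the helper names `partial_x`, `partial_y` and `laplacian` are illustrative and not part of the notes.

```python
def partial_x(f, x, y, h=1e-5):
    """Central-difference estimate of f_x: y is held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def partial_y(f, x, y, h=1e-5):
    """Central-difference estimate of f_y: x is held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def laplacian(f, x, y, h=1e-3):
    """Five-point estimate of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2


def f(x, y):
    # the function from example 1
    return x**4 - 6 * x**2 * y**2 + y**4


if __name__ == "__main__":
    print(partial_x(f, 1.0, 2.0))  # close to 4x^3 - 12xy^2 = -44 at (1, 2)
    print(partial_y(f, 1.0, 2.0))  # close to -12x^2 y + 4y^3 = 8 at (1, 2)
    print(laplacian(f, 1.0, 2.0))  # close to 0, since f is harmonic
```

The estimates only agree with the exact values up to a small discretization error, so this is a sanity check and not a proof.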
Clairaut's theorem: If $f_{xy}$ and $f_{yx}$ are both continuous, then $f_{xy} = f_{yx}$.
Proof: we look at the equations without taking limits first. We extend the definition and say that
if a background Planck constant $h$ is positive, then $f_x(x, y) = [f(x + h, y) - f(x, y)]/h$ and
$f_y(x, y) = [f(x, y + h) - f(x, y)]/h$. For $h = 0$ we define $f_x$ as before. Compare the two sides
for fixed $h > 0$:
$h^2 f_{xy}(x, y) = f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y) = h^2 f_{yx}(x, y)$.
We have not taken any limits in this proof but established an identity which holds for all $h > 0$:
the discrete derivatives $f_x$, $f_y$ satisfy the relation $f_{xy} = f_{yx}$. We could fancy the identity
obtained in the proof as a "quantum Clairaut" theorem. If the classical derivatives $f_{xy}$, $f_{yx}$ are
both continuous, we can take the limit $h \to 0$ to get the classical Clairaut theorem as a
"classical limit". Note that the quantum Clairaut theorem shown first in this proof holds for any
function $f(x, y)$ of two variables. We do not even need continuity.
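The discrete identity from the proof can be tried out directly. This is a small sketch, assuming the forward-difference definition with a fixed step h > 0; the helpers `dx`, `dy` and the test function `rough` are made up for illustration. The function is deliberately neither continuous nor differentiable, to show that no regularity is used.

```python
import math


def dx(f, h):
    """Discrete partial derivative in x with step h > 0, returned as a function."""
    return lambda x, y: (f(x + h, y) - f(x, y)) / h


def dy(f, h):
    """Discrete partial derivative in y with step h > 0, returned as a function."""
    return lambda x, y: (f(x, y + h) - f(x, y)) / h


def rough(x, y):
    # hypothetical example: has kinks and jumps
    return abs(x - y) + math.floor(3 * x * y)


h = 0.25
d_xy = dy(dx(rough, h), h)  # difference in x first, then in y
d_yx = dx(dy(rough, h), h)  # difference in y first, then in x

if __name__ == "__main__":
    for point in [(0.5, 1.0), (-1.25, 2.0), (0.0, 0.0)]:
        print(point, d_xy(*point), d_yx(*point))  # the two values agree
```

Both orders combine the same four function values f(x+h, y+h), f(x, y+h), f(x+h, y) and f(x, y), which is why they agree for every h > 0.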
2 Find $f_{xxxxxyxxxxx}$ for $f(x, y) = \sin(x) + x^6 y^{10} \cos(y)$. Answer: Do not compute, but think.
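The function
$f(x, y) = \frac{x^3 y - x y^3}{x^2 + y^2}$, with $f(0, 0) = 0$,
appears to contradict Clairaut's theorem:
$f_x(x, y) = (3x^2 y - y^3)/(x^2 + y^2) - 2x(x^3 y - x y^3)/(x^2 + y^2)^2$, $f_x(0, y) = -y$, $f_{xy}(0, 0) = -1$,
$f_y(x, y) = (x^3 - 3x y^2)/(x^2 + y^2) - 2y(x^3 y - x y^3)/(x^2 + y^2)^2$, $f_y(x, 0) = x$, $f_{yx}(0, 0) = 1$.
There is no actual contradiction: the mixed second derivatives are not continuous at $(0, 0)$, so
the assumption of the theorem is not satisfied.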
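The two different mixed derivatives at the origin can be reproduced numerically. This is a rough sketch, assuming nested central differences with an inner step much smaller than the outer step; the step sizes and the names `fx`, `fy`, `mixed_xy_at_origin` and `mixed_yx_at_origin` are illustrative choices.

```python
def f(x, y):
    # the function from the example, extended by f(0, 0) = 0
    if x == 0.0 and y == 0.0:
        return 0.0
    return (x**3 * y - x * y**3) / (x**2 + y**2)


def fx(x, y, h=1e-6):
    """Central-difference estimate of f_x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def fy(x, y, h=1e-6):
    """Central-difference estimate of f_y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


K = 1e-3  # outer step, chosen much larger than the inner step h


def mixed_xy_at_origin():
    """Derivative in y of f_x(0, y) at y = 0."""
    return (fx(0.0, K) - fx(0.0, -K)) / (2 * K)


def mixed_yx_at_origin():
    """Derivative in x of f_y(x, 0) at x = 0."""
    return (fy(K, 0.0) - fy(-K, 0.0)) / (2 * K)


if __name__ == "__main__":
    print(mixed_xy_at_origin())  # close to -1
    print(mixed_yx_at_origin())  # close to +1
```

The order of the two limits matters here, which is the numerical shadow of the missing continuity at the origin.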
An equation for an unknown function f (x, y) which involves partial derivatives with
respect to at least two different variables is called a partial differential equation.
If only the derivative with respect to one variable appears, it is called an ordinary
differential equation.
Here are some examples of partial differential equations. You should know the first 4 well.
4 The wave equation $f_{tt}(t, x) = f_{xx}(t, x)$ governs the motion of light or sound. The function
$f(t, x) = \sin(x - t) + \sin(x + t)$ satisfies the wave equation.
5 The heat equation $f_t(t, x) = f_{xx}(t, x)$ describes diffusion of heat or spread of an epidemic.
The function $f(t, x) = \frac{1}{\sqrt{t}} e^{-x^2/(4t)}$ satisfies the heat equation.
6 The Laplace equation $f_{xx} + f_{yy} = 0$ determines the shape of a membrane. The function
$f(x, y) = x^3 - 3xy^2$ is an example satisfying the Laplace equation.
8 The eiconal equation $f_x^2 + f_y^2 = 1$ is used to see the evolution of wave fronts in optics.
The function $f(x, y) = x \cos(a) + y \sin(a)$ with a constant $a$ satisfies the eiconal equation,
because $f_x^2 + f_y^2 = \cos^2(a) + \sin^2(a) = 1$.
9 The Burgers equation $f_t + f f_x = f_{xx}$ describes waves at the beach which break. The
function $f(t, x) = \frac{x}{t} \cdot \frac{\sqrt{1/t}\, e^{-x^2/(4t)}}{1 + \sqrt{1/t}\, e^{-x^2/(4t)}}$ satisfies the Burgers equation.
Paul Dirac once said: ”A great deal of my work is just playing with equations and seeing
what they give. I don’t suppose that applies so much to other physicists; I think it’s a peculiarity
of myself that I like to play about with equations, just looking for beautiful mathematical
relations which maybe don’t have any physical meaning at all. Sometimes they do.” Dirac
discovered a PDE describing the electron which is consistent both with quantum theory and special
relativity. This won him the Nobel Prize in 1933. Dirac’s equation could have two solutions, one
for an electron with positive energy, and one for an electron with negative energy. Dirac interpreted
the latter as an antiparticle: the existence of antiparticles was later confirmed. We will not learn
here to find solutions to partial differential equations. But you should be able to verify that a
given function is a solution of the equation.
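Verifying a solution means plugging the function into the equation and checking that both sides agree. As a rough numerical sketch of that check, the residuals of three of the equations above can be estimated with finite differences. The helpers `d1`, `d2` and the `*_residual` functions are hypothetical names, and the Burgers function is read here as (x/t) g/(1 + g) with g = e^{-x^2/(4t)}/sqrt(t), which is an assumption about the garbled formula.

```python
import math


def d1(g, s, h=1e-4):
    """Central-difference estimate of the first derivative g'(s)."""
    return (g(s + h) - g(s - h)) / (2 * h)


def d2(g, s, h=1e-4):
    """Central-difference estimate of the second derivative g''(s)."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / h**2


def heat(t, x):
    return math.exp(-x * x / (4 * t)) / math.sqrt(t)


def wave(t, x):
    return math.sin(x - t) + math.sin(x + t)


def burgers(t, x):
    g = math.exp(-x * x / (4 * t)) / math.sqrt(t)
    return (x / t) * g / (1 + g)


def heat_residual(f, t, x):
    """f_t - f_xx, which vanishes for solutions of the heat equation."""
    return d1(lambda s: f(s, x), t) - d2(lambda s: f(t, s), x)


def wave_residual(f, t, x):
    """f_tt - f_xx, which vanishes for solutions of the wave equation."""
    return d2(lambda s: f(s, x), t) - d2(lambda s: f(t, s), x)


def burgers_residual(f, t, x):
    """f_t + f f_x - f_xx, which vanishes for solutions of the Burgers equation."""
    f_t = d1(lambda s: f(s, x), t)
    f_x = d1(lambda s: f(t, s), x)
    f_xx = d2(lambda s: f(t, s), x)
    return f_t + f(t, x) * f_x - f_xx


if __name__ == "__main__":
    print(heat_residual(heat, 1.0, 0.5))        # close to 0
    print(wave_residual(wave, 0.3, 0.7))        # close to 0
    print(burgers_residual(burgers, 1.0, 0.5))  # close to 0
```

A small residual at a few sample points does not replace the symbolic verification, but a large residual immediately exposes a function that is not a solution.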
Homework
f[t_,x_]:=(1/Sqrt[t])*Exp[-x^2/(4t)];
Simplify[ D[f[t,x],t] == D[f[t,x],{x,2}]]
[Figure: the graph y = f(x) together with the graph y = L(x) of its linearization.]
The graph of the function L is close to the graph of f at a. We generalize this now to higher
dimensions:
How do we justify the linearization? If the second variable $y = y_0$ is fixed, we have a
one-dimensional situation, where the only variable is $x$. Now $f(x, y_0) \approx f(x_0, y_0) + f_x(x_0, y_0)(x - x_0)$
is the linear approximation. Similarly, if $x = x_0$ is fixed and $y$ is the single variable, then
$f(x_0, y) \approx f(x_0, y_0) + f_y(x_0, y_0)(y - y_0)$. Knowing the linear approximations in both the x and y
variables, we can get the general linear approximation
$L(x, y) = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$.
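The formula for the linear approximation translates directly into a few lines of code. The following sketch assumes the partial derivatives are supplied by hand; the helper `linearize` is a hypothetical name, and the function sin(pi x y^2) is borrowed from the example that follows.

```python
import math


def f(x, y):
    return math.sin(math.pi * x * y**2)


def grad_f(x, y):
    """Partial derivatives (f_x, f_y) of f, computed by hand."""
    c = math.cos(math.pi * x * y**2)
    return (math.pi * y**2 * c, 2 * math.pi * x * y * c)


def linearize(func, grad, x0, y0):
    """Return L(x, y) = f(x0, y0) + f_x(x0, y0)(x - x0) + f_y(x0, y0)(y - y0)."""
    f0 = func(x0, y0)
    a, b = grad(x0, y0)
    return lambda x, y: f0 + a * (x - x0) + b * (y - y0)


L = linearize(f, grad_f, 1.0, 1.0)

if __name__ == "__main__":
    # near (1, 1) the two values are close; far away they are not
    print(f(1.01, 1.01), L(1.01, 1.01))
    print(f(1.5, 1.5), L(1.5, 1.5))
```

The returned function L is linear in x and y, so evaluating it needs no trigonometric function at all, which is the practical point of linearization.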
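1 What is the linear approximation of the function $f(x, y) = \sin(\pi x y^2)$ at the point $(1, 1)$? We
have $(f_x(x, y), f_y(x, y)) = (\pi y^2 \cos(\pi x y^2), 2\pi x y \cos(\pi x y^2))$, which is at the point $(1, 1)$ equal to
$\nabla f(1, 1) = \langle \pi \cos(\pi), 2\pi \cos(\pi) \rangle = \langle -\pi, -2\pi \rangle$. Since $f(1, 1) = \sin(\pi) = 0$, the linear
approximation is $L(x, y) = -\pi(x - 1) - 2\pi(y - 1)$.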
2 Linearization can be used to estimate functions near a point. In the previous example,
5 Find the tangent line to the graph of the function $g(x) = x^2$ at the point $(2, 4)$.
Solution: the level curve $f(x, y) = y - x^2 = 0$ is the graph of the function $g(x) = x^2$, and
the tangent at the point $(2, g(2)) = (2, 4)$ is obtained by computing the gradient $\langle a, b \rangle =
\nabla f(2, 4) = \langle -g'(2), 1 \rangle = \langle -4, 1 \rangle$ and forming $-4x + y = d$, where $d = -4 \cdot 2 + 1 \cdot 4 = -4$.
The answer is $-4x + y = -4$, which is the line $y = 4x - 4$ of slope 4.
Remark: some books use differentials etc. to describe linearizations. This is 19th century notation
and terminology and should be avoided by all means. For us, the linearization of a function at
a point is a linear function in the same number of variables. 20th century mathematics has
invented the notion of differential forms, which is a valuable mathematical notion, but it is a
concept which only becomes useful in follow-up courses which build on multivariable calculus, like
Riemannian geometry. The notion of "differentials" comes from a time when calculus was still
foggy in some areas. Unfortunately it has survived and appears even in some calculus books.
Homework
$f(x, y) = x^2 + 9y^2$
For example, to find $\arccos'(x)$, we write
$1 = \frac{d}{dx} \cos(\arccos(x)) = -\sin(\arccos(x)) \arccos'(x) = -\sqrt{1 - \cos^2(\arccos(x))}\, \arccos'(x) = -\sqrt{1 - x^2}\, \arccos'(x)$,
so that $\arccos'(x) = -1/\sqrt{1 - x^2}$.
Define the gradient $\nabla f(x, y) = \langle f_x(x, y), f_y(x, y) \rangle$ or $\nabla f(x, y, z) =
\langle f_x(x, y, z), f_y(x, y, z), f_z(x, y, z) \rangle$.
If $\vec{r}(t)$ is a curve and $f$ is a function of several variables, we can build a function
$t \mapsto f(\vec{r}(t))$ of one variable. For example, if $\vec{r}(t)$ is a parametrization of a curve in the plane
and $f$ is a function of two variables, then $t \mapsto f(\vec{r}(t))$ is a function of one variable.
The multivariable chain rule is $\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}'(t)$.
holds for every $h > 0$. The left hand side converges to $\frac{d}{dt} f(x(t), y(t))$ in the limit $h \to 0$ and
the right hand side to $f_x(x(t), y(t)) x'(t) + f_y(x(t), y(t)) y'(t)$ using the single variable chain rule
twice. Here is the proof of the latter, when we differentiate $f$ with respect to $t$ and $y$ is treated as
a constant:
Write $H(t) = x(t + h) - x(t)$ in the first part on the right hand side.
1 We move on a circle $\vec{r}(t) = \langle \cos(t), \sin(t) \rangle$ on a table with temperature distribution
$f(x, y) = x^2 - y^3$. Find the rate of change of the temperature. Solution: we have
$\nabla f(x, y) = \langle 2x, -3y^2 \rangle$ and $\vec{r}'(t) = \langle -\sin(t), \cos(t) \rangle$, so that
$\frac{d}{dt} f(\vec{r}(t)) = \nabla f(\vec{r}(t)) \cdot \vec{r}'(t) = \langle 2\cos(t), -3\sin^2(t) \rangle \cdot \langle -\sin(t), \cos(t) \rangle =
-2\cos(t)\sin(t) - 3\sin^2(t)\cos(t)$.
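The chain rule computation for the circle can be cross-checked against a direct numerical derivative of t -> f(r(t)). This is a minimal sketch; the function names `chain_rule_rate`, `numerical_rate` and `closed_form` are illustrative only.

```python
import math


def f(x, y):
    return x**2 - y**3


def grad_f(x, y):
    return (2 * x, -3 * y**2)


def r(t):
    return (math.cos(t), math.sin(t))


def r_prime(t):
    return (-math.sin(t), math.cos(t))


def chain_rule_rate(t):
    """Dot product of the gradient at r(t) with the velocity r'(t)."""
    gx, gy = grad_f(*r(t))
    vx, vy = r_prime(t)
    return gx * vx + gy * vy


def numerical_rate(t, h=1e-6):
    """Central-difference derivative of the one-variable function t -> f(r(t))."""
    return (f(*r(t + h)) - f(*r(t - h))) / (2 * h)


def closed_form(t):
    """The result derived by hand in the example."""
    return -2 * math.cos(t) * math.sin(t) - 3 * math.sin(t)**2 * math.cos(t)


if __name__ == "__main__":
    t = 0.7
    print(chain_rule_rate(t), numerical_rate(t), closed_form(t))
```

All three numbers agree up to the finite-difference error of the second one.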
From $f(x, y) = 0$ one can express $y$ as a function of $x$. From $\frac{d}{dx} f(x, y(x)) = \nabla f \cdot \langle 1, y'(x) \rangle =
f_x + f_y y' = 0$, we obtain $y' = -f_x/f_y$ wherever $f_y \neq 0$. Even though we do not know $y(x)$, we can
compute its derivative!
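To see implicit differentiation at work, here is a small sketch with a hypothetical curve y^5 + y - x = 0, chosen because y cannot be written down as an explicit formula in x. The slope from the formula y' = -f_x/f_y is compared with a finite-difference slope, where y(x) is found numerically by Newton's method. The curve and the helper names are assumptions made for illustration.

```python
def f(x, y):
    # hypothetical curve f(x, y) = 0; passes through (2, 1)
    return y**5 + y - x


def f_x(x, y):
    return -1.0


def f_y(x, y):
    return 5 * y**4 + 1  # never zero, so y(x) is defined everywhere


def solve_y(x, y_guess, steps=50):
    """Newton's method in y for f(x, y) = 0 at fixed x."""
    y = y_guess
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y


def implicit_slope(x, y):
    """y'(x) = -f_x / f_y at a point (x, y) on the curve."""
    return -f_x(x, y) / f_y(x, y)


def numerical_slope(x0, y0, h=1e-6):
    """Central-difference slope of the numerically solved y(x)."""
    return (solve_y(x0 + h, y0) - solve_y(x0 - h, y0)) / (2 * h)


if __name__ == "__main__":
    print(implicit_slope(2.0, 1.0))   # 1/6 from the formula
    print(numerical_slope(2.0, 1.0))  # close to 1/6
```

The formula needs only the point (x, y) on the curve, while the numerical slope needs the curve to be solved twice.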
Implicit differentiation also works in three variables. The equation $f(x, y, z) = c$ defines a surface.
Near a point where $f_z$ is not zero, the surface can be described as a graph $z = z(x, y)$. We can
compute the derivative $z_x$ without actually knowing the function $z(x, y)$. To do so, we consider $y$
a fixed parameter and compute, using the chain rule, $f_x(x, y, z(x, y)) + f_z(x, y, z(x, y)) z_x(x, y) = 0$,
so that $z_x(x, y) = -f_x/f_z$.
The chain rule is powerful because it implies other differentiation rules like the addition, product
and quotient rule in one dimension:
$f(x, y) = x + y$, $x = u(t)$, $y = v(t)$: $\frac{d}{dt}(x + y) = f_x u' + f_y v' = u' + v'$.
$f(x, y) = xy$, $x = u(t)$, $y = v(t)$: $\frac{d}{dt}(xy) = f_x u' + f_y v' = v u' + u v'$.
$f(x, y) = x/y$, $x = u(t)$, $y = v(t)$: $\frac{d}{dt}(x/y) = f_x u' + f_y v' = u'/v - v' u/v^2$.
As in one dimension, the chain rule follows from linearization. If $f$ is a linear function $f(x, y) =
ax + by - c$ and the curve $\vec{r}(t) = \langle x_0 + tu, y_0 + tv \rangle$ parametrizes a line, then
$\frac{d}{dt} f(\vec{r}(t)) = \frac{d}{dt} (a(x_0 + tu) + b(y_0 + tv) - c) = au + bv$, and this is the dot product of
$\nabla f = \langle a, b \rangle$ with $\vec{r}'(t) = \langle u, v \rangle$. Since the chain rule only refers to the first derivatives
of the functions, and these agree at the point with the derivatives of the linearizations, the
chain rule is also true for general functions.
Homework
1 You know that $\frac{d}{dt} f(\vec{r}(t)) = 2$ if $\vec{r}(t) = \langle t, t \rangle$ and $\frac{d}{dt} f(\vec{r}(t)) = 3$
if $\vec{r}(t) = \langle t, -t \rangle$. Find the gradient of $f$ at $(0, 0)$.
2 The pressure in space at the position $(x, y, z)$ is $p(x, y, z) =
x^2 + y^2 - z^3$ and the trajectory of an observer is the curve $\vec{r}(t) =
\langle t, t, 1/t \rangle$. Using the chain rule, compute the rate of change of the
pressure the observer measures at time $t = 2$.
3 Mechanical systems can be described by the energy $H(x, y)$, a
function of position $x$ and momentum $y$. The curve $\vec{r}(t) =
\langle x(t), y(t) \rangle$ is described by the Hamilton equations
$x'(t) = H_y(x, y)$
$y'(t) = -H_x(x, y)$
a) Use the chain rule to verify that the energy of a Hamiltonian
system is preserved: for every solution $\vec{r}(t) = \langle x(t), y(t) \rangle$ we have
$H(x(t), y(t)) = \mathrm{const}$.
b) Check the case of the pendulum, where $H(x, y) = y^2/2 - \sin(x)$.
4 Derive using implicit differentiation the derivative $\frac{d}{dx} \mathrm{arctanh}(x)$,
where
$\tanh(x) = \sinh(x)/\cosh(x)$.
The hyperbolic sine and hyperbolic cosine are defined as
$\sinh(x) = (e^x - e^{-x})/2$ and $\cosh(x) = (e^x + e^{-x})/2$. We have
$\sinh' = \cosh$ and $\cosh' = \sinh$ and $\cosh^2(x) - \sinh^2(x) = 1$.
5 The equation $f(x, y, z) = e^{xyz} + z = 1 + e$ implicitly defines $z$
as a function $z = g(x, y)$ of $x$ and $y$. Find formulas (in terms of
$x$, $y$ and $z$) for $g_x(x, y)$ and $g_y(x, y)$. Estimate $g(1.01, 0.99)$ using
linear approximation.
Math S21a: Multivariable calculus Oliver Knill, Summer 2012
The symbol $\nabla$ is spelled "Nabla" and named after an Egyptian harp. Here is a very important
fact: gradients are orthogonal to level curves and level surfaces.
Proof. Every curve $\vec{r}(t)$ on the level curve or level surface satisfies $\frac{d}{dt} f(\vec{r}(t)) = 0$. By the chain
rule, $\nabla f(\vec{r}(t))$ is perpendicular to the tangent vector $\vec{r}'(t)$.
Because $\vec{n} = \nabla f(p, q) = \langle a, b \rangle$ is perpendicular to the level curve $f(x, y) = c$ through $(p, q)$, the
equation for the tangent line is $ax + by = d$, $a = f_x(p, q)$, $b = f_y(p, q)$, $d = ap + bq$. Compactly
written, this is
$\nabla f(\vec{x}_0) \cdot (\vec{x} - \vec{x}_0) = 0$
and means that the gradient of $f$ is perpendicular to any vector $(\vec{x} - \vec{x}_0)$ in the tangent line
or tangent plane. It is one of the most important statements in multivariable calculus, since it
provides a crucial link between calculus and geometry. The just mentioned gradient theorem is also
useful. We can immediately compute tangent planes and tangent lines:
1 Compute the tangent plane to the surface $3x^2 y + z^2 - 4 = 0$ at the point $(1, 1, 1)$. Solution:
$\nabla f(x, y, z) = \langle 6xy, 3x^2, 2z \rangle$ and $\nabla f(1, 1, 1) = \langle 6, 3, 2 \rangle$. The plane is $6x + 3y + 2z = d$, where
$d$ is a constant. We can find the constant $d$ by plugging in the point $(1, 1, 1)$ and get $6x + 3y + 2z = 11$.
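The recipe of this example (gradient as normal vector, then fix the constant with the given point) can be sketched in code. The helper name `tangent_plane` is made up for illustration, and the gradient is entered by hand.

```python
def grad_f(x, y, z):
    """Gradient of f(x, y, z) = 3x^2 y + z^2 - 4, computed by hand."""
    return (6 * x * y, 3 * x**2, 2 * z)


def tangent_plane(grad, point):
    """Return (a, b, c, d) describing the plane ax + by + cz = d.

    The normal vector (a, b, c) is the gradient at the point, and d is
    obtained by plugging the point itself into ax + by + cz.
    """
    a, b, c = grad(*point)
    x0, y0, z0 = point
    d = a * x0 + b * y0 + c * z0
    return (a, b, c, d)


if __name__ == "__main__":
    print(tangent_plane(grad_f, (1, 1, 1)))  # (6, 3, 2, 11)
```

The same function works for any level surface once its gradient is supplied.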
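$x^4 + y^2 + z^6 = 6$.
Solution: $\vec{r}(t)$ hits the surface at the time $t = 2$ in the point $(-1, -2, 1)$. The velocity
vector in that ray is $\vec{v} = \langle -1, -1, 0 \rangle$. The normal vector at this point is $\nabla f(-1, -2, 1) =
\langle -4, -4, 6 \rangle = \vec{n}$. The reflected vector is $\vec{w} = 2\,\mathrm{Proj}_{\vec{n}}(\vec{v}) - \vec{v}$.
We have $\mathrm{Proj}_{\vec{n}}(\vec{v}) = (8/68) \langle -4, -4, 6 \rangle$. Therefore, the reflected ray is
$\vec{w} = (4/17) \langle -4, -4, 6 \rangle - \langle -1, -1, 0 \rangle$.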
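The arithmetic of the reflection can be reproduced with exact fractions. This sketch assumes the convention w = 2 Proj_n(v) - v, which is the one matching the final formula of the solution; the helper names `dot`, `proj` and `reflect` are illustrative.

```python
from fractions import Fraction


def grad_f(x, y, z):
    """Gradient of f(x, y, z) = x^4 + y^2 + z^6, computed by hand."""
    return (4 * x**3, 2 * y, 6 * z**5)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def proj(n, v):
    """Vector projection of v onto n, with exact rational coefficients."""
    c = Fraction(dot(v, n), dot(n, n))
    return tuple(c * ni for ni in n)


def reflect(n, v):
    """2 Proj_n(v) - v, the convention used in the solution above."""
    p = proj(n, v)
    return tuple(2 * pi - vi for pi, vi in zip(p, v))


n = grad_f(-1, -2, 1)  # normal vector at the point of impact
v = (-1, -1, 0)        # velocity of the incoming ray
w = reflect(n, v)

if __name__ == "__main__":
    print(n)                     # (-4, -4, 6)
    print(w)                     # components 1/17, 1/17, 24/17
    print(dot(w, w), dot(v, v))  # both equal 2: the length is preserved
```

Simplifying the final formula of the solution gives the same vector, with components 1/17, 1/17 and 24/17.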
The name directional derivative is related to the fact that every unit vector gives a direction. If
$\vec{v}$ is a unit vector, then the chain rule tells us $D_{\vec{v}} f = \frac{d}{dt} f(\vec{x} + t\vec{v})$ at $t = 0$.
The directional derivative tells us how the function changes when we move in a given direction.
Assume for example that $T(x, y, z)$ is the temperature at position $(x, y, z)$. If we move with
velocity $\vec{v}$ through space, then $D_{\vec{v}} T$ tells us at which rate the temperature changes for us. If we
move with velocity $\vec{v}$ on a hilly surface of height $h(x, y)$, then $D_{\vec{v}} h(x, y)$ gives us the slope we
drive on.
3 If $\vec{r}(t)$ is a curve with velocity $\vec{r}'(t)$ and the speed is 1, then $D_{\vec{r}'(t)} f = \nabla f(\vec{r}(t)) \cdot \vec{r}'(t)$ is the
temperature change one measures at $\vec{r}(t)$. The chain rule told us that this is $\frac{d}{dt} f(\vec{r}(t))$.
5 You are on a trip in an airship over Cambridge at $(1, 2)$ and you want to avoid a thunderstorm,
a region of low pressure. The pressure is given by a function $p(x, y) = x^2 + 2y^2$. In which
direction do you have to fly so that the pressure change is largest?
Solution: The gradient $\nabla p(x, y) = \langle 2x, 4y \rangle$ at the point $(1, 2)$ is $\langle 2, 8 \rangle$. Normalize to get
the direction $\langle 1, 4 \rangle/\sqrt{17}$.
The directional derivative has the same properties as any derivative: $D_v(\lambda f) =
\lambda D_v(f)$, $D_v(f + g) = D_v(f) + D_v(g)$ and $D_v(fg) = D_v(f) g + f D_v(g)$.
We will see later that points with $\nabla f = \vec{0}$ are candidates for local maxima or minima of $f$.
Points $(x, y)$ where $\nabla f(x, y) = \langle 0, 0 \rangle$ are called critical points and help to understand the
function $f$.
6 The Matterhorn is a 4478 meter high mountain in Switzerland. It is quite easy to climb
with a guide because there are ropes and ladders at difficult places. Even though there are
quite a few climbing accidents at the Matterhorn, this does not stop you from trying an
ascent. In suitable units on the ground, the height $f(x, y)$ of the Matterhorn is approximated
by the function $f(x, y) = 4000 - x^2 - y^2$. At height $f(-10, 10) = 3800$, at the point
$(-10, 10, 3800)$, you rest. The climbing route continues into the south-east direction
$\vec{v} = \langle 1, -1 \rangle/\sqrt{2}$. Calculate the rate of change in that direction. We have
$\nabla f(x, y) = \langle -2x, -2y \rangle$, so that $\nabla f(-10, 10) \cdot \vec{v} = \langle 20, -20 \rangle \cdot \langle 1, -1 \rangle/\sqrt{2} = 40/\sqrt{2}$.
This is a place, with a ladder, where you climb $40/\sqrt{2}$ meters up when advancing 1m forward.
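The rate of change in the Matterhorn example is a dot product of the gradient with a unit vector, which can be sketched as follows. The helper `directional_derivative` is a hypothetical name; it normalizes the given direction so that any nonzero vector may be passed in.

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = 4000 - x^2 - y^2, computed by hand."""
    return (-2 * x, -2 * y)


def directional_derivative(grad, point, direction):
    """Dot product of the gradient at the point with the normalized direction."""
    norm = math.hypot(direction[0], direction[1])
    ux, uy = direction[0] / norm, direction[1] / norm
    gx, gy = grad(*point)
    return gx * ux + gy * uy


rate = directional_derivative(grad_f, (-10, 10), (1, -1))

if __name__ == "__main__":
    print(rate)                             # 40/sqrt(2), about 28.28
    print(math.hypot(*grad_f(-10, 10)))     # length of the gradient, the same value
```

Here the route direction happens to be parallel to the gradient, so the rate equals the length of the gradient, the steepest possible slope at that point.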
The rate of change in all directions is zero if and only if $\nabla f(x, y) = \vec{0}$: if $\nabla f \neq \vec{0}$, we can
choose $\vec{v} = \nabla f/|\nabla f|$ and get $D_{\vec{v}} f = |\nabla f| \neq 0$.
7 Assume we know $D_v f(1, 1) = 3/\sqrt{5}$ and $D_w f(1, 1) = 5/\sqrt{5}$, where $v = \langle 1, 2 \rangle/\sqrt{5}$ and
$w = \langle 2, 1 \rangle/\sqrt{5}$. Find the gradient of $f$ at $(1, 1)$. Note that we do not know anything else
about the function $f$.
Solution: Let $\nabla f(1, 1) = \langle a, b \rangle$. We know $a + 2b = 3$ and $2a + b = 5$. This allows us to get
$a = 7/3$, $b = 1/3$.
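Recovering the gradient from two directional derivatives amounts to solving a 2 x 2 linear system. A minimal sketch using Cramer's rule with exact fractions is given below. The function name `gradient_from_directional` is hypothetical, and the common factor 1/sqrt(5) has been cleared from both equations beforehand, so the inputs are integers.

```python
from fractions import Fraction


def gradient_from_directional(v, dv, w, dw):
    """Solve v . g = dv and w . g = dw for g = (a, b) by Cramer's rule.

    v and w must be linearly independent, otherwise the two directional
    derivatives do not determine the gradient.
    """
    det = v[0] * w[1] - v[1] * w[0]
    if det == 0:
        raise ValueError("directions are parallel")
    a = Fraction(dv * w[1] - v[1] * dw, det)
    b = Fraction(v[0] * dw - dv * w[0], det)
    return (a, b)


if __name__ == "__main__":
    # a + 2b = 3 and 2a + b = 5
    print(gradient_from_directional((1, 2), 3, (2, 1), 5))  # a = 7/3, b = 1/3
```

Two directional derivatives in independent directions are therefore enough information to know the directional derivative in every direction.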
Homework
The exam starts on Thursday at 8:30 AM sharp in Science Center E. The material is slightly tilted
towards the last two weeks but is comprehensive. Look at the previous exams to get an idea.
Geometry of Space
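coordinates in the plane $A = (1, 1)$, $B = (2, 4)$ and in space $C = (1, 2, 3)$, $D = (3, 1, 3)$
vectors in the plane $\vec{AB} = \langle 1, 3 \rangle$ and vectors in space $\vec{CD} = \langle 2, -1, 0 \rangle$
$\vec{v} = \langle v_1, v_2, v_3 \rangle$, $\vec{w} = \langle w_1, w_2, w_3 \rangle$, $\vec{v} + \vec{w} = \langle v_1 + w_1, v_2 + w_2, v_3 + w_3 \rangle$
dot product $\vec{v} \cdot \vec{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 = |\vec{v}| |\vec{w}| \cos(\alpha)$, angle computation
cross product, $\vec{v} \cdot (\vec{v} \times \vec{w}) = 0$, $\vec{w} \cdot (\vec{v} \times \vec{w}) = 0$, $|\vec{v} \times \vec{w}| = |\vec{v}| |\vec{w}| \sin(\alpha)$, area of parallelogram
triple scalar product $\vec{u} \cdot (\vec{v} \times \vec{w})$, volume of parallelepiped
parallel vectors $\vec{v} \times \vec{w} = \vec{0}$, orthogonal vectors $\vec{v} \cdot \vec{w} = 0$
scalar projection $\mathrm{comp}_{\vec{w}}(\vec{v}) = \vec{v} \cdot \vec{w}/|\vec{w}|$
vector projection $\mathrm{proj}_{\vec{w}}(\vec{v}) = (\vec{v} \cdot \vec{w}) \vec{w}/|\vec{w}|^2$
completion of square: example $x^2 - 4x + y^2 = 1$ is equivalent to $(x - 2)^2 + y^2 = 5$
distance $d(P, Q) = |\vec{PQ}| = \sqrt{(P_1 - Q_1)^2 + (P_2 - Q_2)^2 + (P_3 - Q_3)^2}$
orthogonal $\vec{v} \cdot \vec{w} = 0$, parallel $\vec{v} \times \vec{w} = \vec{0}$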
Curves
$\vec{r}(t) = \langle f(t) \cos(t), f(t) \sin(t) \rangle$ polar curve to polar graph $r = f(\theta) \geq 0$
$\int_a^b |\vec{r}'(t)| \, dt$ arc length of parameterized curve
$\vec{N}(t) = \vec{T}'(t)/|\vec{T}'(t)|$ normal vector, is perpendicular to $\vec{T}(t)$
$\vec{B}(t) = \vec{T}(t) \times \vec{N}(t)$ bi-normal vector, is perpendicular to $\vec{T}$ and $\vec{N}$
$\kappa(t) = |\vec{T}'(t)|/|\vec{r}'(t)| = |\vec{r}' \times \vec{r}''|/|\vec{r}'|^3$ curvature
Surfaces