Multivariable Calculus
James V. Lambers
September 15, 2014
Contents
1 Partial Derivatives
1.1 Introduction
1.1.1 Partial Differentiation
1.1.2 Multiple Integration
1.1.3 Vector Calculus
1.2 Functions of Several Variables
1.2.1 Terminology and Notation
1.2.2 Visualization Techniques
1.3 Limits and Continuity
1.3.1 Terminology and Notation
1.3.2 Defining Limits Using Neighborhoods
1.3.3 Results
1.3.4 Techniques for Establishing Limits and Continuity
1.4 Partial Derivatives
1.4.1 Terminology and Notation
1.4.2 Clairaut's Theorem
1.4.3 Techniques
1.5 Tangent Planes, Linear Approximations and Differentiability
1.5.1 Tangent Planes and Linear Approximations
1.5.2 Functions of More than Two Variables
1.5.3 The Gradient Vector
1.5.4 The Jacobian Matrix
1.5.5 Differentiability
1.6 The Chain Rule
1.6.1 The Implicit Function Theorem
1.7 Directional Derivatives and the Gradient Vector
1.7.1 The Gradient Vector
1.7.2 Directional Derivatives
1.7.3 Tangent Planes to Level Surfaces
1.7.4 Tangent Lines to Level Curves
1.8 Maximum and Minimum Values
1.8.1 Absolute Extrema
1.8.2 Relation to Taylor Series
1.8.3 Principal Minors
1.9 Constrained Optimization
1.10 Appendix: Linear Algebra Concepts
1.10.1 Matrix Multiplication
1.10.2 Eigenvalues
1.10.3 The Transpose, Inner Product and Null Space
2 Multiple Integrals
2.1 Double Integrals over Rectangles
2.2 Double Integrals over More General Regions
2.2.1 Changing the Order of Integration
2.2.2 The Mean Value Theorem for Integrals
2.3 Double Integrals in Polar Coordinates
2.4 Triple Integrals
2.5 Applications of Double and Triple Integrals
2.6 Triple Integrals in Cylindrical Coordinates
2.7 Triple Integrals in Spherical Coordinates
2.8 Change of Variables in Multiple Integrals
3 Vector Calculus
3.1 Vector Fields
3.2 Line Integrals
3.3 The Fundamental Theorem for Line Integrals
3.4 Green's Theorem
3.5 Curl and Divergence
3.6 Parametric Surfaces and Their Areas
3.7 Surface Integrals
3.7.1 Surface Integrals of Scalar-Valued Functions
3.7.2 Surface Integrals of Vector Fields
3.8 Stokes' Theorem
3.8.1 A Note About Orientation
3.9 The Divergence Theorem
3.10 Differential Forms
Chapter 1
Partial Derivatives
1.1
Introduction
This course is the fourth course in the calculus sequence, following MAT
167, MAT 168 and MAT 169. Its purpose is to prepare students for more
advanced mathematics courses, particularly courses in mathematical programming (MAT 419), advanced engineering mathematics (MAT 430), real
analysis (MAT 441), complex analysis (MAT 436), and numerical analysis
(MAT 460 and 461). The course will focus on three main areas, which we
briefly discuss here.
1.1.1
Partial Differentiation
when y and z are kept constant. Partial derivatives can be computed using
the same differentiation techniques as in single-variable calculus, but one
must be careful, when differentiating with respect to one variable, to treat
all other variables as if they are constant. For example, if $f(x, y) = x^2 y + y^3$,
then
\[ \frac{\partial f}{\partial x} = 2xy, \qquad \frac{\partial f}{\partial y} = x^2 + 3y^2, \]
because the $y^3$ term does not depend on $x$, and therefore its partial derivative
with respect to $x$ is zero.
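The two partial derivatives above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `partial_derivative`, the step size `h`, and the sample point are all assumptions chosen for illustration. It holds every variable but one fixed and applies a central difference.

```python
def f(x, y):
    return x**2 * y + y**3

def partial_derivative(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of func with
    respect to the variable in position `index`, evaluated at `point`.
    All other variables are held constant."""
    forward = list(point)
    backward = list(point)
    forward[index] += h
    backward[index] -= h
    return (func(*forward) - func(*backward)) / (2 * h)

x0, y0 = 1.5, -0.5                     # arbitrary sample point (assumption)
fx_numeric = partial_derivative(f, (x0, y0), 0)
fy_numeric = partial_derivative(f, (x0, y0), 1)
fx_exact = 2 * x0 * y0                 # 2xy
fy_exact = x0**2 + 3 * y0**2           # x^2 + 3y^2
print(fx_numeric, fx_exact)
print(fy_numeric, fy_exact)
```

The numerical and exact values should agree to several digits, which illustrates that treating the other variable as a constant is exactly what the difference quotient does.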
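If
\[ \mathbf{F}(x, y, z) = \begin{bmatrix} F_1(x, y, z) \\ F_2(x, y, z) \\ F_3(x, y, z) \end{bmatrix} \]
is a vector-valued function of three variables, then each of its component
functions $F_1$, $F_2$, and $F_3$ has a gradient vector, and the rate of change of
$\mathbf{F}$ with respect to $x$, $y$ and $z$ is described by a matrix, called the Jacobian
matrix
\[ J_{\mathbf{F}}(x, y, z) = \begin{bmatrix}
\dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm]
\dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \\[2mm]
\dfrac{\partial F_3}{\partial x} & \dfrac{\partial F_3}{\partial y} & \dfrac{\partial F_3}{\partial z}
\end{bmatrix}. \]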
1.1.2
Multiple Integration
1.1.3
Vector Calculus
In the last part of the course, we will study vector fields, which are functions
that assign a vector to each point in their domain, like the vector-valued function $\mathbf{F}$ described above. We will first learn how to compute line integrals,
which are integrals of functions along curves. A line integral can be viewed
as a generalization of the integral of a function on an interval, in that $dx$
is replaced by $ds$, an infinitesimal distance between points on the curve. It
can also be viewed as a generalization of an integral that computes the arc
length of a curve, as the line integral of a function that is equal to one yields
the arc length. A line integral of a vector field is useful for computing the
work done by a force applied to an object to move it along a curved path. To
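\[ \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}, \]
\[ \left\langle \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\; \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\; \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right\rangle. \]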
However, using the language of differential forms, we can condense the Fundamental Theorem of Calculus and all four of its variations into one theorem,
known as the General Stokes' Theorem. We now state all six results; their
discussion is deferred to Chapter 3.
Fundamental Theorem of Calculus:
\[ \int_a^b f'(x)\,dx = f(b) - f(a) \]
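Fundamental Theorem for Line Integrals:
\[ \int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t)\,dt = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) \]
where $\mathbf{r}(t) = \langle x(t), y(t), z(t) \rangle$, $a \le t \le b$, is the position function for a curve
$C$ and $f(x, y, z)$ is a continuously differentiable function defined on $C$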
Green's Theorem:
\[ \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \int_{\partial D} P\,dx + Q\,dy \]
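Stokes' Theorem:
\[ \iint_S \operatorname{curl} \mathbf{F} \cdot \mathbf{n}\,dS = \int_{\partial S} \mathbf{F} \cdot \mathbf{T}\,ds \]
where $S$ is a surface in 3-D with unit normal vector $\mathbf{n}$, and piecewise smooth
boundary $\partial S$ with unit tangent vector $\mathbf{T}$, and $\mathbf{F}$ is a continuously differentiable vector field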
Divergence Theorem:
\[ \iiint_E \operatorname{div} \mathbf{F}\,dV = \iint_{\partial E} \mathbf{F} \cdot \mathbf{n}\,dS \]
General Stokes' Theorem:
\[ \int_M d\omega = \int_{\partial M} \omega \]
1.2
Functions of Several Variables
1.2.1
Terminology and Notation
The following standard notation and terminology are used to define and discuss functions of several variables and their visual representations. As they
will be used throughout the course, it is important to become acquainted
with them immediately.
The set R is the set of all real numbers.
If $S$ is a set and $x$ is an element of $S$, we write $x \in S$.
1.2.2
Visualization Techniques
While it is always possible to obtain the graph of a function f (x, y), for
example, by substituting various values for its independent variables and
plotting the corresponding points from the graph, this approach is not necessarily helpful for understanding the graph as a whole. Knowing the extent
of the possible values of a function's independent and dependent variables
(the domain and range, respectively), along with the behavior of a few select
curves that are contained within the function's graph, can be more helpful.
To that end, we mention the following useful techniques for acquiring this
information.
1.3
Limits and Continuity
1.3.1
Terminology and Notation
\[ \lim_{x \to a} f(x) = L \]
if, for any $\epsilon > 0$, there exists a $\delta > 0$ such that if $0 < |x - a| < \delta$, then
$|f(x) - L| < \epsilon$.
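If $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ is a point in $\mathbb{R}^n$, or, equivalently, if $\mathbf{x} =
\langle x_1, x_2, \ldots, x_n \rangle$ is a position vector in $\mathbb{R}^n$, then the magnitude, or
length, of $\mathbf{x}$, denoted by $\|\mathbf{x}\|$, is defined by
\[ \|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = \left( \sum_{i=1}^n x_i^2 \right)^{1/2}. \]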
\[ \sqrt{3^2 + (-1)^2 + 4^2} = \sqrt{26}. \]
$\Box$
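As a small illustration of the magnitude formula, the sketch below evaluates it for the vector $\langle 3, -1, 4 \rangle$. That vector is inferred from the surviving arithmetic above and is therefore an assumption, as is the helper name `magnitude`.

```python
import math

def magnitude(x):
    """Euclidean length of a vector given as a sequence of components."""
    return math.sqrt(sum(component**2 for component in x))

length = magnitude((3, -1, 4))   # vector assumed from the arithmetic above
print(length, math.sqrt(26))
```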
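\[ \lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b} \]
if, for any $\epsilon > 0$, no matter how small, there exists a $\delta > 0$ such that
for any $\mathbf{x}$ such that $0 < \|\mathbf{x} - \mathbf{a}\| < \delta$, $\|\mathbf{f}(\mathbf{x}) - \mathbf{b}\| < \epsilon$. This definition
is illustrated in Figure 1.3. Note that the condition $\|\mathbf{x} - \mathbf{a}\| > 0$
specifically excludes consideration of $\mathbf{x} = \mathbf{a}$, because limits are used to
understand the behavior of a function near a point, not at a point.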
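$\sqrt{x^2 + y^2} < \delta$, then
\[ |f(x, y) - 0| = \left| \frac{xy}{\sqrt{x^2 + y^2}} \right| < \epsilon. \]
\[ \frac{|y|}{\sqrt{x^2 + y^2}} \le \frac{|y|}{\sqrt{y^2}} = \frac{|y|}{|y|} = 1. \]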
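Therefore, if we set $\delta = \epsilon$, we obtain
\[ \left| \frac{xy}{\sqrt{x^2 + y^2}} \right| = |x| \frac{|y|}{\sqrt{x^2 + y^2}} \le |x| = \sqrt{x^2} \le \sqrt{x^2 + y^2} < \delta = \epsilon, \]
from which it follows that the limit exists and is equal to zero. $\Box$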
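The bound used in this argument, $|f(x, y)| \le \sqrt{x^2 + y^2}$ for $f(x, y) = xy/\sqrt{x^2 + y^2}$, can be spot-checked numerically. This is only a sketch: the sample directions and radii are arbitrary assumptions, and a finite number of sample points illustrates the inequality rather than proving it.

```python
import math

def f(x, y):
    return x * y / math.sqrt(x**2 + y**2)

def bound_holds(x, y):
    """True if |f(x, y) - 0| is no larger than the distance from (x, y) to the origin."""
    return abs(f(x, y)) <= math.hypot(x, y)

results = []
for k in range(1, 8):
    r = 10.0 ** (-k)                       # distance from the origin
    for angle in (0.3, 1.0, 2.5, 4.0):     # arbitrary directions of approach
        x, y = r * math.cos(angle), r * math.sin(angle)
        results.append(bound_holds(x, y))
    print(r, abs(f(r * math.cos(1.0), r * math.sin(1.0))))
```

The printed values shrink in proportion to the distance `r`, consistent with choosing $\delta = \epsilon$.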
Let $\mathbf{f} : D \subseteq \mathbb{R}^n \to \mathbb{R}^m$, and let $\mathbf{a} \in D$. We say that $\mathbf{f}$ is continuous at
$\mathbf{a}$ if $\lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{a})$.
Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. We say that $f$ is a polynomial if, for each
$\mathbf{x} = (x_1, x_2, \ldots, x_n) \in D$, $f(\mathbf{x})$ is equal to a sum of constant multiples of terms of the form
$x_1^{p_1} x_2^{p_2} \cdots x_n^{p_n}$, where $p_1, p_2, \ldots, p_n$ are nonnegative integers.
Example The functions $f(x) = x^3 + 3x^2 + 2x + 1$, $g(x, y) = x^2 y^3 + 3xy +
x^2 + 1$, and $h(x, y, z) = 4xy^2 z^3 + 8yz^2$ are all examples of polynomials.
$\Box$
Let $f, p, q : D \subseteq \mathbb{R}^n \to \mathbb{R}$, and let $q(\mathbf{x}) \neq 0$ on $D$. We say that
$f$ is a rational function if $p$ and $q$ are both polynomials and $f(\mathbf{x}) =
p(\mathbf{x})/q(\mathbf{x})$.
Example The functions $f(x) = 1/(x + 1)$, $g(x, y) = xy^2/(x^2 + y^3)$, and
$h(x, y, z) = (xy^2 + z^3)/(x^2 z + xyz^2 + yz^3)$ are all examples of rational
functions. $\Box$
An algebraic function is a function that satisfies a polynomial equation
whose coefficients are themselves polynomials.
1.3.2
Defining Limits Using Neighborhoods
An alternative approach to defining limits involves the concept of a neighborhood, which generalizes open intervals on the real number line.
1.3.3
Results
Figure 1.4: Boundary point $(x_0, y_0)$ of the set $D = \{(x, y) \mid x^2 + y^2 < 1\}$. The
neighborhood of $(x_0, y_0)$ shown, $D_r((x_0, y_0)) = \{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 <
0.1\}$, contains points that are in $D$ and points that are not in $D$.
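The limit of a function $\mathbf{f}(\mathbf{x})$ as $\mathbf{x}$ approaches $\mathbf{a}$, if it exists, is unique.
That is, if
\[ \lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b}_1 \quad \text{and} \quad \lim_{\mathbf{x} \to \mathbf{a}} \mathbf{f}(\mathbf{x}) = \mathbf{b}_2, \]
then $\mathbf{b}_1 = \mathbf{b}_2$.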
Furthermore, if $m = 1$, then
\[ \lim_{\mathbf{x} \to \mathbf{a}} (fg)(\mathbf{x}) = b_1 b_2. \]
1.3.4
Techniques for Establishing Limits and Continuity
\[ f(x, y) = \frac{(x + y)^2 - (x - y)^2}{xy}, \quad x, y \neq 0. \]
\[ \frac{(x^2 + 2xy + y^2) - (x^2 - 2xy + y^2)}{xy} = \frac{4xy}{xy} = 4. \]
Therefore, even though $f(x, y)$ is not defined at $(0, 0)$, its limit as $(x, y) \to
(0, 0)$ exists, and is equal to 4. This example demonstrates that a limit
depends only on the behavior of a function near a particular point; what
happens at that point is irrelevant. $\Box$
In many cases, determining whether a function $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is
continuous can be accomplished by applying the various properties of continuous functions stated above, and using the fact that various types of
functions, such as polynomial and rational functions, are known to be continuous wherever they are defined.
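Example Let $\mathbf{c} = \langle 2, -1, 3 \rangle$, and let $\mathbf{f} : \mathbb{R}^3 \to \mathbb{R}^3$ be defined by $\mathbf{f}(\mathbf{x}) = \mathbf{c} \times \mathbf{x}$,
the cross product of the vector $\mathbf{c}$ and the vector $\mathbf{x} = \langle x_1, x_2, x_3 \rangle$. This
function is continuous on all of $\mathbb{R}^3$, because
\[ \mathbf{f}(\mathbf{x}) = \mathbf{c} \times \mathbf{x} = \langle -x_3 - 3x_2,\; 3x_1 - 2x_3,\; 2x_2 + x_1 \rangle, \]
and each component function of $\mathbf{f}$ can be seen to be not only a polynomial,
but a linear function. $\Box$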
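The component formula for $\mathbf{c} \times \mathbf{x}$ can be compared against a direct computation of the cross product. The sketch below assumes $\mathbf{c} = \langle 2, -1, 3 \rangle$; the helper names and the sample vector are illustrative only.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

c = (2, -1, 3)

def f(x):
    return cross(c, x)

def f_components(x):
    """The three linear component functions written out explicitly."""
    x1, x2, x3 = x
    return (-x3 - 3 * x2, 3 * x1 - 2 * x3, 2 * x2 + x1)

sample = (1, 2, 3)              # arbitrary test vector
print(f(sample), f_components(sample))
```

Since each component is a linear polynomial in $x_1$, $x_2$, $x_3$, continuity on all of $\mathbb{R}^3$ follows from the continuity of polynomials.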
xy 2
2x2 +y 2
(x, y) 6= (0, 0)
.
(x, y) = (0, 0)
1.4
Partial Derivatives
1.4.1
Terminology and Notation
Note that only values of $f(x, y)$ for which $y = y_0$ influence the value of
the partial derivative with respect to $x$. Similarly, the partial derivative of
$f(x, y)$ with respect to $y$ at $(x_0, y_0)$ is defined to be
\[ \frac{\partial f}{\partial y}(x_0, y_0) = f_y(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0, y_0 + h) - f(x_0, y_0)}{h}. \]
Note the two methods of denoting partial derivatives used above: $\partial f/\partial x$ or
$f_x$ for the partial derivative with respect to $x$. There are other notations,
but these are the ones that we will use.
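Example Let $f(x, y) = x^2 y$, and let $(x_0, y_0) = (2, 1)$. Then
\[ \begin{aligned}
f_x(2, 1) &= \lim_{h \to 0} \frac{f(2 + h, 1) - f(2, 1)}{h} \\
&= \lim_{h \to 0} \frac{(2 + h)^2 (1) - 2^2 (1)}{h} \\
&= \lim_{h \to 0} \frac{4h + h^2}{h} \\
&= 4, \\
f_y(2, 1) &= \lim_{h \to 0} \frac{f(2, 1 + h) - f(2, 1)}{h} \\
&= \lim_{h \to 0} \frac{2^2 (1 + h) - 2^2 (1)}{h} \\
&= \lim_{h \to 0} \frac{4h}{h} \\
&= 4.
\end{aligned} \]
$\Box$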
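The limits in this example can be watched converging by evaluating the difference quotients for shrinking values of $h$. This is a minimal sketch; the function names `quotient_x` and `quotient_y` and the particular step sizes are assumptions made for illustration.

```python
def f(x, y):
    return x**2 * y

def quotient_x(h):
    """Difference quotient for the partial derivative in x at (2, 1)."""
    return (f(2 + h, 1) - f(2, 1)) / h

def quotient_y(h):
    """Difference quotient for the partial derivative in y at (2, 1)."""
    return (f(2, 1 + h) - f(2, 1)) / h

for h in (0.1, 0.01, 0.001, 0.0001):
    print(h, quotient_x(h), quotient_y(h))
```

Algebraically the first quotient equals $4 + h$ and the second equals $4$ for every nonzero $h$, so both approach 4 as $h \to 0$.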
In the preceding example, the value $f_x(2, 1) = 4$ can be interpreted
as the slope of the line that is tangent to the graph of $f(x, 1) = x^2$
at $x = 2$. That is, we consider the restriction of $f$ to the portion of its
domain where $y = 1$, and thus obtain a function of the single variable $x$,
$g(x) = f(x, 1) = x^2$. Note that if we apply differentiation rules from
single-variable calculus to $g$, we obtain $g'(x) = 2x$, and $g'(2) = 4$, which
is the value we obtained for $f_x(2, 1)$.
Similarly, if we consider $f_y(2, 1) = 4$, this can be interpreted as the
slope of a line that is tangent to the graph of $p(y) = f(2, y) = 4y$ at $y = 1$.
Note that if we differentiate $p$, we obtain $p'(y) = 4$, which, again, shows that
the partial derivative of a function of several variables can be obtained by
freezing the values of all variables except the one with respect to which we
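\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad
\mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]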
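\[ \begin{aligned}
f_{x_2}(\mathbf{x}_0) &= \lim_{h \to 0} \frac{(\mathbf{c} \cdot \mathbf{x}_0 + h\,\mathbf{c} \cdot \mathbf{e}_2)^2 - (\mathbf{c} \cdot \mathbf{x}_0)^2}{h} \\
&= \lim_{h \to 0} \frac{(\mathbf{c} \cdot \mathbf{x}_0)^2 + 2(\mathbf{c} \cdot \mathbf{x}_0)(h\,\mathbf{c} \cdot \mathbf{e}_2) + (h\,\mathbf{c} \cdot \mathbf{e}_2)^2 - (\mathbf{c} \cdot \mathbf{x}_0)^2}{h} \\
&= \lim_{h \to 0} \frac{2h(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) + h^2 (\mathbf{c} \cdot \mathbf{e}_2)^2}{h} \\
&= \lim_{h \to 0} \left[ 2(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) + h(\mathbf{c} \cdot \mathbf{e}_2)^2 \right] \\
&= 2(\mathbf{c} \cdot \mathbf{x}_0)(\mathbf{c} \cdot \mathbf{e}_2) \\
&= 2(\langle 4, -3, 2, -1 \rangle \cdot \langle 1, 3, 2, 4 \rangle)(\langle 4, -3, 2, -1 \rangle \cdot \langle 0, 1, 0, 0 \rangle) \\
&= 2[4(1) - 3(3) + 2(2) - 1(4)](-3) \\
&= 2(-5)(-3) \\
&= 30.
\end{aligned} \]
This shows that $f$ is increasing sharply as a function of $x_2$ at the point $\mathbf{x}_0$.
Note that the same result can be obtained by defining
\[ \begin{aligned}
g(x_2) &= f(1, x_2, 2, 4) \\
&= (\mathbf{c} \cdot \langle 1, x_2, 2, 4 \rangle)^2 \\
&= (\langle 4, -3, 2, -1 \rangle \cdot \langle 1, x_2, 2, 4 \rangle)^2 \\
&= (4 - 3x_2)^2,
\end{aligned} \]
differentiating this function of $x_2$ to obtain $g'(x_2) = 2(4 - 3x_2)(-3)$, and then
evaluating this derivative at $x_2 = 3$ to obtain $g'(3) = 2(4 - 3(3))(-3) = 30$.
$\Box$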
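The value 30 can be reproduced numerically. The sketch below assumes, consistent with the derivation above, that $f(\mathbf{x}) = (\mathbf{c} \cdot \mathbf{x})^2$ with $\mathbf{c} = \langle 4, -3, 2, -1 \rangle$ and $\mathbf{x}_0 = \langle 1, 3, 2, 4 \rangle$; the helper names and step size are illustrative.

```python
c = (4, -3, 2, -1)
x0 = (1.0, 3.0, 2.0, 4.0)
e2 = (0, 1, 0, 0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def f(x):
    return dot(c, x) ** 2

def partial_derivative(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func
    with respect to the component in position `index`."""
    forward = list(point)
    backward = list(point)
    forward[index] += h
    backward[index] -= h
    return (func(forward) - func(backward)) / (2 * h)

exact = 2 * dot(c, x0) * dot(c, e2)          # closed form from the derivation
numeric = partial_derivative(f, x0, 1)       # index 1 corresponds to x_2
print(exact, numeric)
```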
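Just as functions of a single variable can have second derivatives, third
derivatives, or derivatives of any order, functions of several variables can
have higher-order partial derivatives. To that end, let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ be a
scalar-valued function of $n$ variables $x_1, x_2, \ldots, x_n$. Then, the second partial
derivative of $f$ with respect to $x_i$ and $x_j$ at $\mathbf{x}_0 \in D$ is defined to be
\[ \begin{aligned}
\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0) &= f_{x_i x_j}(\mathbf{x}_0) \\
&= \frac{\partial}{\partial x_i}\left( \frac{\partial f}{\partial x_j} \right)(\mathbf{x}_0) \\
&= \lim_{h_i \to 0} \frac{f_{x_j}(\mathbf{x}_0 + h_i \mathbf{e}_i) - f_{x_j}(\mathbf{x}_0)}{h_i} \\
&= \lim_{(h_i, h_j) \to (0, 0)} \frac{1}{h_i h_j} \left[ f(\mathbf{x}_0 + h_i \mathbf{e}_i + h_j \mathbf{e}_j) - f(\mathbf{x}_0 + h_i \mathbf{e}_i) - f(\mathbf{x}_0 + h_j \mathbf{e}_j) + f(\mathbf{x}_0) \right].
\end{aligned} \]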
The second line of the above definition is the most helpful, in terms of
describing how to compute a second partial derivative with respect to xi
and xj : first, compute the partial derivative with respect to xj . Then,
compute the partial derivative of the result with respect to xi , and finally,
evaluate at the point x0 . That is, the second partial derivative, or a partial
derivative of higher order, can be viewed as an iterated partial derivative.
A commonly used method of indicating that a function is evaluated at
a given point, especially if the formula for the function is complicated or
otherwise does not lend itself naturally to the usual notation for evaluation
at a point, is to follow the function with a vertical bar, and indicate the
evaluation point as a subscript to the bar. For example, given a function
$f(x)$, we can write
\[ f'(4) = \frac{df}{dx}(4) = \left. \frac{df}{dx} \right|_{x=4}. \]
1.4.2
Clairaut's Theorem
The following theorem is very useful for reducing the amount of work necessary to compute all of the higher-order partial derivatives of a function.
Theorem (Clairaut's Theorem): Let $f : D \subseteq \mathbb{R}^2 \to \mathbb{R}$, and let $\mathbf{x}_0 \in D$.
If the second partial derivatives $f_{xy}$ and $f_{yx}$ are continuous on $D$, then they
are equal:
\[ f_{xy}(\mathbf{x}_0) = f_{yx}(\mathbf{x}_0). \]
Example Let $f(x, y) = \sin(2x) \cos^2(4y)$. Then
\[ f_x = 2 \cos^2(4y) \cos(2x), \]
which yields
\[ f_{xy} = (2 \cos^2(4y) \cos(2x))_y = -16 \cos(2x) \cos(4y) \sin(4y) \]
and
\[ f_{yx} = (-8 \sin(2x) \cos(4y) \sin(4y))_x = -16 \cos(2x) \cos(4y) \sin(4y), \]
and we conclude that these mixed partial derivatives are equal. $\Box$
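Clairaut's Theorem can be illustrated numerically for $f(x, y) = \sin(2x) \cos^2(4y)$ by differentiating in the two possible orders with nested central differences. This is only a sketch: the step size, the sample point, and the function names are assumptions, and finite differences approximate the derivatives rather than compute them exactly.

```python
import math

def f(x, y):
    return math.sin(2 * x) * math.cos(4 * y) ** 2

def fx(x, y, h=1e-4):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-4):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def fxy(x, y, h=1e-4):
    """Differentiate with respect to x first, then y."""
    return (fx(x, y + h) - fx(x, y - h)) / (2 * h)

def fyx(x, y, h=1e-4):
    """Differentiate with respect to y first, then x."""
    return (fy(x + h, y) - fy(x - h, y)) / (2 * h)

def mixed_exact(x, y):
    return -16 * math.cos(2 * x) * math.cos(4 * y) * math.sin(4 * y)

x0, y0 = 0.3, 0.2    # arbitrary sample point
print(fxy(x0, y0), fyx(x0, y0), mixed_exact(x0, y0))
```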
1.4.3
Techniques
We now describe the most practical techniques for computing partial derivatives. As mentioned previously, computing the partial derivative of a function with respect to a given variable, at a given point, is equivalent to freezing the values of all other variables at that point, and then computing the
derivative of the resulting function of one variable at that point.
However, generally it is most practical to compute the partial derivative
as a function of all of the independent variables, which can then be evaluated
at any point at which we wish to know the value of the partial derivative,
just as when we have a function f (x), we normally compute its derivative
as a function f 0 (x), and then evaluate that function at any point x0 where
we want to know the rate of change.
Therefore, the most practical approach to computing a partial derivative
of a function f with respect to xi is to apply differentiation rules from
single-variable calculus to differentiate f with respect to xi , while treating
all other variables as constants. The result of this process is a function that
represents $\partial f/\partial x_i (x_1, x_2, \ldots, x_n)$, and then values can be substituted for the
independent variables x1 , x2 , . . . , xn .
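\[ \begin{aligned}
&= \cos 4y \left. \frac{\partial}{\partial x}\left[ e^{-(x^2 + y^2)} \sin 3x \right] \right|_{x = \pi/2,\, y = \pi} \\
&= \cos 4y \left. \left[ \sin 3x \frac{\partial}{\partial x}\left[ e^{-(x^2 + y^2)} \right] + e^{-(x^2 + y^2)} \frac{\partial}{\partial x}[\sin 3x] \right] \right|_{x = \pi/2,\, y = \pi} \\
&= \cos 4y \left. \left[ e^{-(x^2 + y^2)} \sin 3x \frac{\partial}{\partial x}\left[ -(x^2 + y^2) \right] + 3 e^{-(x^2 + y^2)} \cos 3x \right] \right|_{x = \pi/2,\, y = \pi} \\
&= \cos 4y \left. \left[ -2x e^{-(x^2 + y^2)} \sin 3x + 3 e^{-(x^2 + y^2)} \cos 3x \right] \right|_{x = \pi/2,\, y = \pi} \\
&= \cos 4\pi \left[ -2(\pi/2) e^{-((\pi/2)^2 + \pi^2)} \sin(3\pi/2) + 3 e^{-((\pi/2)^2 + \pi^2)} \cos(3\pi/2) \right] \\
&= \pi e^{-5\pi^2/4}.
\end{aligned} \]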
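The final value can be checked with a central difference. The sketch below assumes that the function being differentiated is $f(x, y) = e^{-(x^2 + y^2)} \sin 3x \cos 4y$, which is inferred from the surviving steps of the computation (the statement of the example is missing from this copy), and that the evaluation point is $(\pi/2, \pi)$.

```python
import math

def f(x, y):
    # Assumed form of the function, reconstructed from the computation above.
    return math.exp(-(x**2 + y**2)) * math.sin(3 * x) * math.cos(4 * y)

def fx_numeric(x, y, h=1e-5):
    """Central-difference estimate of the partial derivative in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

x0, y0 = math.pi / 2, math.pi
exact = math.pi * math.exp(-5 * math.pi**2 / 4)
numeric = fx_numeric(x0, y0)
print(numeric, exact)
```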
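\[ f_x = \frac{1}{x + y^2} \frac{\partial}{\partial x}[x + y^2] = \frac{1}{x + y^2}, \qquad
f_y = \frac{1}{x + y^2} \frac{\partial}{\partial y}[x + y^2] = \frac{2y}{x + y^2}. \]
\[ \begin{aligned}
f_{xx} = (f_x)_x &= \frac{\partial}{\partial x}\left[ \frac{1}{x + y^2} \right] \\
&= -\frac{1}{(x + y^2)^2} \frac{\partial}{\partial x}[x + y^2] \\
&= -\frac{1}{(x + y^2)^2}, \\
f_{xy} = (f_x)_y &= \frac{\partial}{\partial y}\left[ \frac{1}{x + y^2} \right] \\
&= -\frac{1}{(x + y^2)^2} \frac{\partial}{\partial y}[x + y^2] \\
&= -\frac{2y}{(x + y^2)^2}, \\
f_{yx} = f_{xy} &= -\frac{2y}{(x + y^2)^2}, \\
f_{yy} = (f_y)_y &= \frac{\partial}{\partial y}\left[ \frac{2y}{x + y^2} \right] \\
&= \frac{(x + y^2)(2y)_y - 2y(x + y^2)_y}{(x + y^2)^2} \\
&= \frac{2(x + y^2) - 4y^2}{(x + y^2)^2} \\
&= \frac{2(x - y^2)}{(x + y^2)^2}.
\end{aligned} \]
\[ f_{xx}(1, 2) = -\frac{1}{25}, \qquad f_{xy}(1, 2) = f_{yx}(1, 2) = -\frac{4}{25}, \qquad f_{yy}(1, 2) = -\frac{6}{25}. \]
$\Box$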
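These second partial derivatives can be spot-checked with second-order finite differences. The sketch assumes $f(x, y) = \ln(x + y^2)$, which is consistent with the first partial derivatives shown above but is not stated in the surviving text; the helper names and step size are also assumptions.

```python
import math

def f(x, y):
    return math.log(x + y**2)      # assumed function, see the note above

def fxx(x, y, h=1e-4):
    return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2

def fyy(x, y, h=1e-4):
    return (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2

def fxy(x, y, h=1e-4):
    """Four-point central-difference estimate of the mixed partial derivative."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)

print(fxx(1, 2), -1 / 25)
print(fxy(1, 2), -4 / 25)
print(fyy(1, 2), -6 / 25)
```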
Example Let $f(x, y, z) = x^2 y^4 z^3$. We will compute the second partial
derivatives of this function at the point $(x_0, y_0, z_0) = (1, 2, 3)$ by repeated
computation of first partial derivatives. First, we compute
\[ f_x = (x^2 y^4 z^3)_x = (x^2)_x y^4 z^3 = 2xy^4 z^3, \]
by treating $y$ and $z$ as constants, then
\[ f_y = (x^2 y^4 z^3)_y = (y^4)_y x^2 z^3 = 4x^2 y^3 z^3, \]
by treating $x$ and $z$ as constants, and then
\[ f_z = (x^2 y^4 z^3)_z = x^2 y^4 (z^3)_z = 3x^2 y^4 z^2. \]
We then differentiate each of these with respect to $x$, $y$ and $z$ to obtain the
second partial derivatives:
\[ f_{xx} = (f_x)_x = (2xy^4 z^3)_x = 2y^4 z^3, \]
\[ f_{xy} = (f_x)_y = (2xy^4 z^3)_y = (2x)(4y^3)(z^3) = 8xy^3 z^3, \]
Note that the order in which partial differentiation operations occur does
not appear to matter; for example, $f_{xy} = f_{yx}$. That is, Clairaut's
Theorem applies for any number of variables. It also applies to any order of
partial derivative. For example,
\[ f_{xyy} = (f_{xy})_y = (8xy^3 z^3)_y = 24xy^2 z^3, \]
\[ f_{yyx} = (f_{yy})_x = (12x^2 y^2 z^3)_x = 24xy^2 z^3. \]
$\Box$
In single-variable calculus, implicit differentiation is applied to an equation that implicitly describes y as a function of x, in order to compute dy/dx.
The same approach can be applied to an equation that implicitly describes
any number of dependent variables in terms of any number of independent
variables. The approach is the same as in the single-variable case: differentiate both sides of the equation with respect to the independent variable,
leaving derivatives of dependent variables in the equation as unknowns. The
resulting equation can then be solved for the unknown partial derivatives.
Example Consider the equation
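\[ x^2 z + y^2 z + z^2 = 1. \]
If we view this equation as one that implicitly describes $z$ as a function of $x$
and $y$, we can compute $z_x$ and $z_y$ using implicit differentiation with respect
to $x$ and $y$, respectively. Applying the Product Rule yields the equations
\[ 2xz + x^2 z_x + y^2 z_x + 2z z_x = 0, \qquad x^2 z_y + 2yz + y^2 z_y + 2z z_y = 0, \]
which can be solved for the unknown partial derivatives to obtain
\[ z_x = -\frac{2xz}{x^2 + y^2 + 2z}, \qquad z_y = -\frac{2yz}{x^2 + y^2 + 2z}. \]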
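Because the equation $x^2 z + y^2 z + z^2 = 1$ is quadratic in $z$, one branch of $z(x, y)$ can be written explicitly and used to check the implicit-differentiation formulas. The sketch below picks the positive root; that choice of branch, the sample point, and the helper names are assumptions made for illustration.

```python
import math

def z(x, y):
    """Positive root of z^2 + (x^2 + y^2) z - 1 = 0."""
    s = x**2 + y**2
    return (-s + math.sqrt(s**2 + 4)) / 2

def zx_formula(x, y):
    return -2 * x * z(x, y) / (x**2 + y**2 + 2 * z(x, y))

def zy_formula(x, y):
    return -2 * y * z(x, y) / (x**2 + y**2 + 2 * z(x, y))

def zx_numeric(x, y, h=1e-6):
    return (z(x + h, y) - z(x - h, y)) / (2 * h)

def zy_numeric(x, y, h=1e-6):
    return (z(x, y + h) - z(x, y - h)) / (2 * h)

x0, y0 = 1.0, 0.5     # arbitrary sample point
print(zx_formula(x0, y0), zx_numeric(x0, y0))
print(zy_formula(x0, y0), zy_numeric(x0, y0))
```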
1.5
Tangent Planes, Linear Approximations and Differentiability
1.5.1
Tangent Planes and Linear Approximations
\[ \frac{\partial L_f}{\partial y} = B, \]
we conclude that the linear function that best approximates $f(x, y)$ near
$(x_0, y_0)$ is the linear approximation
\[ L_f(x, y) = f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0). \]
Furthermore, the graph of this function is called the tangent plane of $f(x, y)$
at $(x_0, y_0)$. Its equation is
\[ z - z_0 = \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0). \]
Example Let $f(x, y) = 2x^2 y + 3y^2$, and let $(x_0, y_0) = (1, 1)$. Then $f(x_0, y_0) =
5$, and the first partial derivatives at $(x_0, y_0)$ are
\[ f_x(1, 1) = 4xy|_{x=1,y=1} = 4, \]
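The linear approximation for this function can be evaluated directly. In the sketch below, $f_y = 2x^2 + 6y$ (so $f_y(1, 1) = 8$) is computed here by the usual rules and does not appear in the surviving text of the example; the function names and sample offsets are likewise assumptions.

```python
def f(x, y):
    return 2 * x**2 * y + 3 * y**2

def fx(x, y):
    return 4 * x * y

def fy(x, y):
    return 2 * x**2 + 6 * y      # derived here, not quoted from the notes

def linear_approximation(x, y, x0=1.0, y0=1.0):
    """L_f(x, y) = f(x0, y0) + f_x(x0, y0)(x - x0) + f_y(x0, y0)(y - y0)."""
    return f(x0, y0) + fx(x0, y0) * (x - x0) + fy(x0, y0) * (y - y0)

for d in (0.1, 0.01, 0.001):
    x, y = 1 + d, 1 + 2 * d
    error = abs(f(x, y) - linear_approximation(x, y))
    print(d, f(x, y), linear_approximation(x, y), error)
```

The error shrinks roughly in proportion to the square of the offset `d`, which is the behavior expected of a tangent-plane approximation.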
\[ V_h = \pi r^2 = 25\pi. \]
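1.5.2
Functions of More than Two Variables
\[ L_f(x_1, x_2, \ldots, x_n) = y_0 + \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}), \]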
1.5.3
The Gradient Vector
It can be seen from the above definitions that writing formulas that involve
the partial derivatives of functions of n variables can be cumbersome. This
can be addressed by expressing collections of partial derivatives of functions
of several variables using vectors and matrices, especially for vector-valued
functions of several variables.
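By convention, a point $\mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)})$, which can be identified
with the position vector $\mathbf{p}_0 = \langle x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)} \rangle$, is considered to be a
column vector
\[ \mathbf{p}_0 = \begin{bmatrix} x_1^{(0)} \\ x_2^{(0)} \\ \vdots \\ x_n^{(0)} \end{bmatrix}. \]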
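Also, by convention, given a function of $n$ variables, $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$,
the collection of its partial derivatives with respect to all of its variables is
written as a row vector
\[ \nabla f(\mathbf{p}_0) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}. \]
This vector is called the gradient of $f$ at $\mathbf{p}_0$.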
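Viewing the partial derivatives of $f$ as a vector allows us to use vector
operations to describe, much more concisely, the linearization of $f$. Specifically, the linearization of $f$ at $\mathbf{p}_0$, evaluated at a point $\mathbf{p} = (x_1, x_2, \ldots, x_n)$,
can be written as
\[ \begin{aligned}
L_f(\mathbf{p}) &= f(\mathbf{p}_0) + \frac{\partial f}{\partial x_1}(\mathbf{p}_0)(x_1 - x_1^{(0)}) + \frac{\partial f}{\partial x_2}(\mathbf{p}_0)(x_2 - x_2^{(0)}) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{p}_0)(x_n - x_n^{(0)}) \\
&= f(\mathbf{p}_0) + \sum_{i=1}^n \frac{\partial f}{\partial x_i}(\mathbf{p}_0)(x_i - x_i^{(0)}) \\
&= f(\mathbf{p}_0) + \nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0),
\end{aligned} \]
where $\nabla f(\mathbf{p}_0) \cdot (\mathbf{p} - \mathbf{p}_0)$ is the dot product, also known as the inner product,
of the vectors $\nabla f(\mathbf{p}_0)$ and $\mathbf{p} - \mathbf{p}_0$. Recall that given two vectors $\mathbf{u} =
\langle u_1, u_2, \ldots, u_n \rangle$ and $\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle$, the dot product of $\mathbf{u}$ and $\mathbf{v}$, denoted
by $\mathbf{u} \cdot \mathbf{v}$, is defined by
\[ \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^n u_i v_i = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \|\mathbf{u}\| \|\mathbf{v}\| \cos \theta, \]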
fx fy fz
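1.5.4
The Jacobian Matrix
\[ \mathbf{f}(\mathbf{p}) = \begin{bmatrix} f_1(\mathbf{p}) \\ f_2(\mathbf{p}) \\ \vdots \\ f_m(\mathbf{p}) \end{bmatrix}, \]
where each $f_i : D \to \mathbb{R}$. Combining the two conventions described above,
the partial derivatives of these component functions at a point $\mathbf{p}_0 \in D$ are
arranged in an $m \times n$ matrix
\[ J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_1}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\[2mm]
\dfrac{\partial f_2}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_2}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial f_m}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial f_m}{\partial x_n}(\mathbf{p}_0)
\end{bmatrix}. \]
\[ J_{\mathbf{f}}(\mathbf{p}_0) = \begin{bmatrix} \nabla f_1(\mathbf{p}_0) \\ \nabla f_2(\mathbf{p}_0) \\ \vdots \\ \nabla f_m(\mathbf{p}_0) \end{bmatrix}
= \begin{bmatrix} \dfrac{\partial \mathbf{f}}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial \mathbf{f}}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial \mathbf{f}}{\partial x_n}(\mathbf{p}_0) \end{bmatrix}. \]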
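The Jacobian matrix provides a concise way of describing the linearization of a vector-valued function, just as the gradient does for a scalar-valued
function. The linearization of $\mathbf{f}$ at $\mathbf{p}_0$ is the function $L_{\mathbf{f}}(\mathbf{p})$, defined by
\[ \begin{aligned}
L_{\mathbf{f}}(\mathbf{p}) &= \begin{bmatrix} f_1(\mathbf{p}_0) \\ f_2(\mathbf{p}_0) \\ \vdots \\ f_m(\mathbf{p}_0) \end{bmatrix}
+ \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1}(\mathbf{p}_0) \\[2mm] \dfrac{\partial f_2}{\partial x_1}(\mathbf{p}_0) \\ \vdots \\ \dfrac{\partial f_m}{\partial x_1}(\mathbf{p}_0) \end{bmatrix} (x_1 - x_1^{(0)})
+ \cdots
+ \begin{bmatrix} \dfrac{\partial f_1}{\partial x_n}(\mathbf{p}_0) \\[2mm] \dfrac{\partial f_2}{\partial x_n}(\mathbf{p}_0) \\ \vdots \\ \dfrac{\partial f_m}{\partial x_n}(\mathbf{p}_0) \end{bmatrix} (x_n - x_n^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + \sum_{j=1}^n \frac{\partial \mathbf{f}}{\partial x_j}(\mathbf{p}_0)(x_j - x_j^{(0)}) \\
&= \mathbf{f}(\mathbf{p}_0) + J_{\mathbf{f}}(\mathbf{p}_0)(\mathbf{p} - \mathbf{p}_0).
\end{aligned} \]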
2
2
22
2
2
#
,
0
2
2
2
=
+
2
2
y 4
2
2
2
"
#
2
2
2
+
x
2
2
2
4
=
.
2
2
2 2x + 2 y 4
1.5.5
Differentiability
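\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]
\[ \lim_{x \to x_0} \frac{f(x) - L_f(x)}{x - x_0} = 0. \]
\[ \lim_{\mathbf{p} \to \mathbf{p}_0} \frac{\|\mathbf{f}(\mathbf{p}) - L_{\mathbf{f}}(\mathbf{p})\|}{\|\mathbf{p} - \mathbf{p}_0\|} = 0, \]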
|y(x + 1) 2| = 0,
|x 1|
1,
0 p
(x 1)2 + (y 1)2
1.6
The Chain Rule
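\[ \begin{aligned}
\frac{df}{dt} &= \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} + \frac{\partial f}{\partial z} \frac{dz}{dt} \\
&= f_x(x(t), y(t), z(t)) \frac{dx}{dt} + f_y(x(t), y(t), z(t)) \frac{dy}{dt} + f_z(x(t), y(t), z(t)) \frac{dz}{dt} \\
&= (-2e^{z(t)} \sin 2x(t) \sin 3y(t))(2) + (3e^{z(t)} \cos 2x(t) \cos 3y(t))(2t) + (e^{z(t)} \cos 2x(t) \sin 3y(t))(3t^2) \\
&= -4e^{t^3} \sin 4t \sin 3t^2 + 6t e^{t^3} \cos 4t \cos 3t^2 + 3t^2 e^{t^3} \cos 4t \sin 3t^2.
\end{aligned} \]
$\Box$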
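The Chain Rule result can be compared with a direct numerical derivative of the composite function. The sketch assumes $f(x, y, z) = e^z \cos 2x \sin 3y$ with $x(t) = 2t$, $y(t) = t^2$, $z(t) = t^3$; these are inferred from the surviving steps of the computation, since the statement of the example is missing from this copy.

```python
import math

def f(x, y, z):
    # Assumed form, reconstructed from the partial derivatives above.
    return math.exp(z) * math.cos(2 * x) * math.sin(3 * y)

def composite(t):
    return f(2 * t, t**2, t**3)

def chain_rule_derivative(t):
    """df/dt as obtained from the Chain Rule."""
    e = math.exp(t**3)
    return (-4 * e * math.sin(4 * t) * math.sin(3 * t**2)
            + 6 * t * e * math.cos(4 * t) * math.cos(3 * t**2)
            + 3 * t**2 * e * math.cos(4 * t) * math.sin(3 * t**2))

def numeric_derivative(t, h=1e-6):
    return (composite(t + h) - composite(t - h)) / (2 * h)

t0 = 0.7      # arbitrary sample value
print(chain_rule_derivative(t0), numeric_derivative(t0))
```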
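Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[ f(x, y) = x^2 y + xy^2, \]
and let $\mathbf{g} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
\[ \mathbf{g}(s, t) = \begin{bmatrix} x(s, t) \\ y(s, t) \end{bmatrix} = \begin{bmatrix} 2s + t \\ s - 2t \end{bmatrix}. \]
Then, $f \circ \mathbf{g}$ is a scalar-valued function of $s$ and $t$,
\[ (f \circ \mathbf{g})(s, t) = x(s, t)^2 y(s, t) + x(s, t) y(s, t)^2 = (2s + t)^2 (s - 2t) + (2s + t)(s - 2t)^2. \]
To compute its gradient, which includes its partial derivatives with respect
to $s$ and $t$, we first compute
\[ \nabla f = \begin{bmatrix} f_x & f_y \end{bmatrix} = \begin{bmatrix} 2xy + y^2 & x^2 + 2xy \end{bmatrix}, \]
and
\[ J_{\mathbf{g}}(s, t) = \begin{bmatrix} x_s & x_t \\ y_s & y_t \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & -2 \end{bmatrix}, \]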
gu gv
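\[ \begin{aligned}
& \begin{bmatrix} \cos u \cos v & -\sin u \sin v \end{bmatrix} \\
&= [3 \sin^2 u \cos^2 v + 4 \sin u \cos v] \begin{bmatrix} \cos u \cos v & -\sin u \sin v \end{bmatrix} \\
&= \begin{bmatrix} (3 \sin^2 u \cos^2 v + 4 \sin u \cos v) \cos u \cos v & -(3 \sin^2 u \cos^2 v + 4 \sin u \cos v) \sin u \sin v \end{bmatrix}.
\end{aligned} \]
$\Box$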
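Example Let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
\[ \mathbf{f}(x, y) = \begin{bmatrix} f_1(x, y) \\ f_2(x, y) \end{bmatrix} = \begin{bmatrix} x^2 y \\ xy^2 \end{bmatrix}, \]
and let $\mathbf{g} : \mathbb{R} \to \mathbb{R}^2$ be defined by
\[ \mathbf{g}(t) = \langle x(t), y(t) \rangle = \langle \cos t, \sin t \rangle. \]
Then $\mathbf{f} \circ \mathbf{g}$ is a vector-valued function of $t$,
\[ (\mathbf{f} \circ \mathbf{g})(t) = \langle \cos^2 t \sin t, \cos t \sin^2 t \rangle. \]
To compute its derivative with respect to $t$, we first compute
\[ J_{\mathbf{f}}(x, y) = \begin{bmatrix} (f_1)_x & (f_1)_y \\ (f_2)_x & (f_2)_y \end{bmatrix} = \begin{bmatrix} 2xy & x^2 \\ y^2 & 2xy \end{bmatrix}, \]
and $\mathbf{g}'(t) = \langle -\sin t, \cos t \rangle$, and then use the Chain Rule to obtain
\[ \begin{aligned}
(\mathbf{f} \circ \mathbf{g})'(t) &= J_{\mathbf{f}}(x(t), y(t))\, \mathbf{g}'(t)
= \begin{bmatrix} (f_1)_x(x(t), y(t)) & (f_1)_y(x(t), y(t)) \\ (f_2)_x(x(t), y(t)) & (f_2)_y(x(t), y(t)) \end{bmatrix} \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} \\
&= \langle (f_1)_x(x(t), y(t)) x'(t) + (f_1)_y(x(t), y(t)) y'(t),\; (f_2)_x(x(t), y(t)) x'(t) + (f_2)_y(x(t), y(t)) y'(t) \rangle \\
&= \langle 2 \cos t \sin t (-\sin t) + \cos^2 t (\cos t),\; \sin^2 t (-\sin t) + 2 \cos t \sin t (\cos t) \rangle \\
&= \langle -2 \cos t \sin^2 t + \cos^3 t,\; -\sin^3 t + 2 \cos^2 t \sin t \rangle.
\end{aligned} \]
$\Box$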
1.6.1
The Implicit Function Theorem
The Chain Rule can also be used to compute partial derivatives of implicitly defined functions in a more convenient way than is provided by implicit
differentiation. Let the equation F (x, y) = 0 implicitly define y as a differentiable function of x. That is, y = f (x) where F (x, f (x)) = 0 for x in
\[ F_x + F_y \frac{dy}{dx} = 0, \]
which yields
\[ \frac{dy}{dx} = -\frac{F_x}{F_y}. \]
By the Implicit Function Theorem, the equation $F(x, y) = 0$ defines $y$ implicitly as a function of $x$ near $(x_0, y_0)$, where $F(x_0, y_0) = 0$, provided that
$F_y(x_0, y_0) \neq 0$ and $F_x$ and $F_y$ are continuous near $(x_0, y_0)$. Under these
conditions, we see that $dy/dx$ is defined at $(x_0, y_0)$ as well.
Example Let $F : \mathbb{R}^2 \to \mathbb{R}$ be defined by
\[ F(x, y) = x^2 + y^2 - 4. \]
The equation $F(x, y) = 0$ defines $y$ implicitly as a function of $x$, provided
that $F$ satisfies the conditions of the Implicit Function Theorem.
We have
\[ F_x = 2x, \quad F_y = 2y. \]
Since both of these partial derivatives are polynomials, and therefore are
continuous on all of $\mathbb{R}^2$, it follows that if $F_y \neq 0$, then $y$ can be implicitly
defined as a function of $x$ at a point where $F(x, y) = 0$, and
\[ \frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{x}{y}. \]
For example, at the point $(x, y) = (0, 2)$, $F(x, y) = 0$, and $F_y = 4$. Therefore,
$y$ can be implicitly defined as a function of $x$ near this point, and at $x = 0$,
we have $dy/dx = 0$. $\Box$
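For the circle $x^2 + y^2 - 4 = 0$, the implicitly defined function near $(0, 2)$ is the upper semicircle $y = \sqrt{4 - x^2}$, so the formula $dy/dx = -x/y$ can be checked against a numerical derivative. The sketch below is illustrative; the function names, step size, and sample points are assumptions.

```python
import math

def y_of_x(x):
    """Upper branch of the circle x^2 + y^2 = 4, valid for -2 < x < 2."""
    return math.sqrt(4 - x**2)

def dydx_formula(x):
    """dy/dx = -F_x / F_y = -x / y on the upper branch."""
    return -x / y_of_x(x)

def dydx_numeric(x, h=1e-6):
    return (y_of_x(x + h) - y_of_x(x - h)) / (2 * h)

for x in (0.0, 0.5, 1.0, 1.5):
    print(x, dydx_formula(x), dydx_numeric(x))
```

As $x$ approaches $\pm 2$, $F_y = 2y$ approaches zero and the slope grows without bound, which is where the hypothesis $F_y \neq 0$ fails.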
(0)
(0)
(0)
i = 1, 2, . . . , n.
(0)
(0)
(0)
\[ F_{x_i} + F_y \frac{\partial y}{\partial x_i} = 0, \]
where all partial derivatives are evaluated at $\mathbf{p}_0$, and solve for $\partial y/\partial x_i$ at $\mathbf{p}_0$.
Example Let $F : \mathbb{R}^3 \to \mathbb{R}$ be defined by
\[ F(x, y, z) = x^2 z + z^2 y - 2xyz + 1. \]
The equation $F(x, y, z) = 0$ defines $z$ implicitly as a function of $x$ and $y$,
provided that $F$ satisfies the conditions of the Implicit Function Theorem.
We have
\[ F_x = 2xz - 2yz, \quad F_y = z^2 - 2xz, \quad F_z = x^2 + 2yz - 2xy. \]
Since all of these partial derivatives are polynomials, and therefore are continuous on all of $\mathbb{R}^3$, it follows that if $F_z \neq 0$, then $z$ can be implicitly defined
as a function of $x$ and $y$ at a point where $F(x, y, z) = 0$, and
\[ z_x = -\frac{F_x}{F_z} = \frac{2yz - 2xz}{x^2 + 2yz - 2xy}, \qquad
z_y = -\frac{F_y}{F_z} = \frac{2xz - z^2}{x^2 + 2yz - 2xy}. \]
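\[ \mathbf{p}_0 = (x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y_1^{(0)}, y_2^{(0)}, \ldots, y_m^{(0)}) \in D \]
be such that
\[ \mathbf{F}(x_1^{(0)}, x_2^{(0)}, \ldots, x_n^{(0)}, y_1^{(0)}, y_2^{(0)}, \ldots, y_m^{(0)}) = \mathbf{0}. \]
\[ \mathbf{F}_{x_i} + \mathbf{F}_{y_1} \frac{\partial y_1}{\partial x_i} + \mathbf{F}_{y_2} \frac{\partial y_2}{\partial x_i} + \cdots + \mathbf{F}_{y_m} \frac{\partial y_m}{\partial x_i} = \mathbf{0}, \quad i = 1, 2, \ldots, n, \]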
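\[ J_{\mathbf{x},\mathbf{F}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial F_1}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_1}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_1}{\partial x_n}(\mathbf{p}_0) \\[2mm]
\dfrac{\partial F_2}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_2}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_2}{\partial x_n}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F_m}{\partial x_1}(\mathbf{p}_0) & \dfrac{\partial F_m}{\partial x_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_m}{\partial x_n}(\mathbf{p}_0)
\end{bmatrix}, \]
\[ J_{\mathbf{y},\mathbf{F}}(\mathbf{p}_0) = \begin{bmatrix}
\dfrac{\partial F_1}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_1}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_1}{\partial y_m}(\mathbf{p}_0) \\[2mm]
\dfrac{\partial F_2}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_2}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_2}{\partial y_m}(\mathbf{p}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F_m}{\partial y_1}(\mathbf{p}_0) & \dfrac{\partial F_m}{\partial y_2}(\mathbf{p}_0) & \cdots & \dfrac{\partial F_m}{\partial y_m}(\mathbf{p}_0)
\end{bmatrix}, \]
and
\[ J_{\mathbf{y}}(\mathbf{x}_0) = \begin{bmatrix}
\dfrac{\partial y_1}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_1}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_1}{\partial x_n}(\mathbf{x}_0) \\[2mm]
\dfrac{\partial y_2}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_2}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_2}{\partial x_n}(\mathbf{x}_0) \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial y_m}{\partial x_1}(\mathbf{x}_0) & \dfrac{\partial y_m}{\partial x_2}(\mathbf{x}_0) & \cdots & \dfrac{\partial y_m}{\partial x_n}(\mathbf{x}_0)
\end{bmatrix}. \]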
ux uy
vx vy
1.7
Directional Derivatives and the Gradient Vector
Previously, we defined the gradient as the vector of all of the first partial
derivatives of a scalar-valued function of several variables. Now, we will
learn about how to use the gradient to measure the rate of change of the
function with respect to a change of its variables in any direction, as opposed
to a change in a single variable. This is extremely useful in applications in
which the minimum or maximum value of a function is sought. We will
also learn how the gradient can be used to easily describe tangent planes to
level surfaces, thus providing an alternative to implicit differentiation or the
Chain Rule.
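1.7.1
The Gradient Vector
\[ \begin{bmatrix} f_{x_1} & f_{x_2} & \cdots & f_{x_n} \end{bmatrix} \]
Example Let $f(x, y, z) = e^{-(x^2 + y^2)} \cos z$. Then
\[ \nabla f(x, y, z) = \begin{bmatrix} -2x e^{-(x^2 + y^2)} \cos z & -2y e^{-(x^2 + y^2)} \cos z & -e^{-(x^2 + y^2)} \sin z \end{bmatrix}. \]
Therefore, at the point $(x_0, y_0, z_0) = (1, 2, \pi/3)$, the gradient is the vector
\[ \nabla f(x_0, y_0, z_0) = \nabla f(1, 2, \pi/3) = \left\langle -e^{-5},\; -2e^{-5},\; -\frac{\sqrt{3}}{2} e^{-5} \right\rangle. \]
$\Box$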
It should be noted that various differentiation rules from single-variable
calculus have direct generalizations to the gradient. Let u and v be differentiable functions defined on Rn . Then, we have:
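Linearity:
\[ \nabla(au + bv) = a \nabla u + b \nabla v \]
where $a$ and $b$ are constants
Product Rule:
\[ \nabla(uv) = u \nabla v + v \nabla u \]
Quotient Rule:
\[ \nabla\left( \frac{u}{v} \right) = \frac{v \nabla u - u \nabla v}{v^2} \]
Power Rule:
\[ \nabla(u^n) = n u^{n-1} \nabla u \]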
1.7.2
Directional Derivatives
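\[ \begin{aligned}
D_{\mathbf{u}} f(x_0, y_0) &= \lim_{h \to 0} \frac{f((x_0, y_0) + h\mathbf{u}) - f(x_0, y_0)}{h} \\
&= \nabla f(x_0, y_0) \cdot \mathbf{u}.
\end{aligned} \]
That is, the directional derivative in the direction of $\mathbf{u}$ is the dot product of
the gradient with $\mathbf{u}$. It can be shown that this is the case for any number of
variables: given $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$, and a unit vector $\mathbf{u} \in \mathbb{R}^n$, the directional
derivative of $f$ at $\mathbf{x}_0 \in \mathbb{R}^n$ in the direction of $\mathbf{u}$ is given by
\[ D_{\mathbf{u}} f(\mathbf{x}_0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u}. \]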
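where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$, the directional derivative can be used
to determine the direction along which $f$ increases most rapidly, decreases
most rapidly, or does not change at all.
We first note that if $\theta$ is the angle between $\nabla f(\mathbf{x}_0)$ and $\mathbf{u}$, then
\[ D_{\mathbf{u}} f(\mathbf{x}_0) = \nabla f(\mathbf{x}_0) \cdot \mathbf{u} = \|\nabla f(\mathbf{x}_0)\| \cos \theta, \]
because $\mathbf{u}$ is a unit vector. Then we have the following:
When $\theta = 0$, $\cos \theta = 1$, so $D_{\mathbf{u}} f$ is maximized, and its value is
$\|\nabla f(\mathbf{x}_0)\|$. In this case,
\[ \mathbf{u} = \frac{\nabla f(\mathbf{x}_0)}{\|\nabla f(\mathbf{x}_0)\|}, \]
When $\theta = \pi$, $\cos \theta = -1$, so $D_{\mathbf{u}} f$ is minimized, and its value is
$-\|\nabla f(\mathbf{x}_0)\|$. In this case,
\[ \mathbf{u} = -\frac{\nabla f(\mathbf{x}_0)}{\|\nabla f(\mathbf{x}_0)\|}, \]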
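\[ \nabla f(x, y) = \begin{bmatrix} f_x(x, y) & f_y(x, y) \end{bmatrix} = \begin{bmatrix} 2xy & x^2 + 3y^2 \end{bmatrix}, \]
which yields $\nabla f(x_0, y_0) = \langle f_x(x_0, y_0), f_y(x_0, y_0) \rangle = \langle -8, 16 \rangle$. It follows that
the direction of steepest ascent is
\[ \mathbf{u} = \frac{\nabla f(x_0, y_0)}{\|\nabla f(x_0, y_0)\|} = \frac{\langle -8, 16 \rangle}{\sqrt{(-8)^2 + 16^2}} = \frac{\langle -8, 16 \rangle}{\sqrt{320}} = \frac{\langle -8, 16 \rangle}{8\sqrt{5}} = \left\langle -\frac{1}{\sqrt{5}}, \frac{2}{\sqrt{5}} \right\rangle. \]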
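The direction of steepest ascent can be computed by normalizing the gradient. The sketch below uses the gradient formula $\langle 2xy, x^2 + 3y^2 \rangle$ from the example and evaluates it at the hypothetical point $(-2, 2)$, which reproduces the gradient $\langle -8, 16 \rangle$; the signs of the point's coordinates are not legible in this copy, so that point is an assumption.

```python
import math

def gradient(x, y):
    """Gradient <2xy, x^2 + 3y^2> from the example."""
    return (2 * x * y, x**2 + 3 * y**2)

def unit_vector(v):
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def directional_derivative(grad, u):
    """D_u f = grad . u for a unit vector u."""
    return grad[0] * u[0] + grad[1] * u[1]

grad = gradient(-2.0, 2.0)          # hypothetical point, see the note above
u = unit_vector(grad)               # direction of steepest ascent
print(grad, u)
print(directional_derivative(grad, u), math.sqrt(320))
```

The directional derivative in the direction `u` equals the magnitude of the gradient, $\sqrt{320}$, which is the largest value it can take over all unit vectors.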
1.7.3
Tangent Planes to Level Surfaces
\[ z_x(x_0, y_0) = -\frac{F_x(x_0, y_0, z_0)}{F_z(x_0, y_0, z_0)}, \qquad
z_y(x_0, y_0) = -\frac{F_y(x_0, y_0, z_0)}{F_z(x_0, y_0, z_0)}. \]
\[ y = y_0 + tF_y(x_0, y_0, z_0), \quad z = z_0 + tF_z(x_0, y_0, z_0). \]
\[ = \langle 4, 2, 4 \rangle. \]
It follows that the equation of the plane that is tangent to the sphere at
$(3, 3, 2)$ is
\[ 4(x - x_0) + 2(y - y_0) + 4(z - z_0) = 0, \]
\[ -\frac{F_x}{F_z} = z_x. \]
It follows that the vector $\langle 1, 0, z_x \rangle$ is a vector that lies within the tangent
plane.
Similarly, by setting $x - x_0 = 0$ and $y - y_0 = 1$, it follows that the vector
$\langle 0, 1, z_y \rangle$ is also within the tangent plane. Because these vectors cannot be
scalar multiples of each other, they must be linearly independent. It can
be verified that both of these vectors are orthogonal to the normal to the
tangent plane, $\nabla F(x_0, y_0, z_0)$. For example, we have
\[ \langle 1, 0, z_x \rangle \cdot \langle F_x, F_y, F_z \rangle = F_x + F_z z_x = F_x + F_z(-F_x/F_z) = 0. \]
1.7.4
The preceding discussion about tangent planes to level surfaces can be scaled
down to two dimensions, yielding equations of tangent lines to level curves.
Consider a curve defined by the equation $F(x, y) = 0$. At a point $(x_0, y_0)$ on this curve, the tangent vector points in a direction $\mathbf{u}$ such that
\[ D_{\mathbf{u}} F(x_0, y_0) = \nabla F(x_0, y_0) \cdot \mathbf{u} = 0, \]
since $F$ is equal to the constant function 0 along the curve. That is, $\nabla F(x_0, y_0)$ is orthogonal to the tangent line to the curve at $(x_0, y_0)$. Therefore, for any point $(x, y)$ on the tangent line, we have
\[ F_x(x_0, y_0)(x - x_0) + F_y(x_0, y_0)(y - y_0) = 0. \]
Thus we have an equation for the tangent line that is valid even when $dy/dx = -F_x/F_y$ is not defined due to $F_y(x_0, y_0) = 0$, which is analogous to the most general form of the equation of a tangent plane that is valid even when $F_z(x_0, y_0, z_0) = 0$.
When $F_y(x_0, y_0) \neq 0$, a vector contained within this tangent line can be obtained by setting $x - x_0 = 1$ in the above equation of the tangent line, which yields the vector
\[ \langle x - x_0, y - y_0 \rangle = \langle 1, -F_x/F_y \rangle = \langle 1, dy/dx \rangle \]
with initial point $(x_0, y_0)$. It can then be verified that this vector is orthogonal to $\nabla F(x_0, y_0)$. We have
\[ \nabla F \cdot \langle 1, dy/dx \rangle = \langle F_x, F_y \rangle \cdot \langle 1, -F_x/F_y \rangle = F_x + F_y(-F_x/F_y) = 0, \]
where it is assumed that all derivatives are evaluated at $(x_0, y_0)$.
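The orthogonality of the tangent vector and the gradient can be checked numerically. The following is a minimal Python sketch; the level curve $F(x, y) = x^2 + y^2 - 25 = 0$ and the point (3, 4) are illustrative assumptions, not taken from the text.

```python
# Illustrative level curve (an assumption): F(x, y) = x^2 + y^2 - 25 = 0.
def grad_F(x, y):
    """Gradient <F_x, F_y> of F(x, y) = x^2 + y^2 - 25."""
    return (2.0 * x, 2.0 * y)

def tangent_vector(x0, y0):
    """Return <1, -F_x/F_y> at (x0, y0); only valid when F_y(x0, y0) != 0."""
    fx, fy = grad_F(x0, y0)
    if fy == 0:
        raise ValueError("F_y vanishes; use the general tangent line equation")
    return (1.0, -fx / fy)

x0, y0 = 3.0, 4.0                       # a point on the circle of radius 5
g = grad_F(x0, y0)
t = tangent_vector(x0, y0)
dot = g[0] * t[0] + g[1] * t[1]         # should vanish
```

When $F_y$ vanishes, the sketch raises an error rather than dividing by zero, mirroring the remark that the general form of the tangent line equation must be used in that case.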
1.8
In single-variable calculus, one learns how to compute maximum and minimum values of a function. We first recall these methods, and then we will
learn how to generalize them to functions of several variables.
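Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$. A local maximum of a function $f$ is a point $\mathbf{a} \in D$ such that $f(\mathbf{x}) \le f(\mathbf{a})$ for $\mathbf{x}$ near $\mathbf{a}$. The value $f(\mathbf{a})$ is called a local maximum value. Similarly, $f$ has a local minimum at $\mathbf{a}$ if $f(\mathbf{x}) \ge f(\mathbf{a})$ for $\mathbf{x}$ near $\mathbf{a}$, and the value $f(\mathbf{a})$ is called a local minimum value.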
\[ \frac{\partial^2 f}{\partial x_i \, \partial x_j}(\mathbf{a}). \]
Because mixed second partial derivatives are equal if they are continuous,
it follows that H is a symmetric matrix, meaning that Hij = Hji .
We can now state the Second Derivatives Test. If a is a critical point of
f , and the Hessian, H, is positive definite at a, then a is a local minimum
of f . The notion of a matrix being positive definite is the generalization to
matrices of the notion of a positive number. When a matrix H is symmetric
and positive definite, the following statements are all true:
• $\mathbf{x}^T H \mathbf{x} > 0$, where $\mathbf{x}$ is a nonzero column vector of real numbers, and $\mathbf{x}^T$ is the transpose of $\mathbf{x}$, which is a row vector. Note that $\mathbf{x}^T H \mathbf{x}$ is the same as $\mathbf{x} \cdot (H\mathbf{x})$.
• The eigenvalues of $H$ are positive.
• The determinant of $H$ is positive.
• The diagonal entries of $H$, $H_{ii}$ for $i = 1, 2, \ldots, n$, are positive.
If a curve contained within the graph of f passes through a critical point a
with unit tangent vector u, then the concavity of the curve at a is measured
12x + 4y 1 4x + 16y 3
Since the determinant, which is the product of $H$'s two eigenvalues, is positive, it follows that they must both have the same sign.
To determine that sign, we check the diagonal entries of $H$. If $H$ is positive definite, then the diagonal entries $f_{xx}$ and $f_{yy}$ would both be positive. Therefore, it is sufficient to check $f_{xx}$; because $\det(H) > 0$, both diagonal entries must have the same sign. We have $f_{xx} = 12$, so we conclude that $H$ is positive definite, and that the critical point (2/11, 1/44) is a local minimum of $f$. □
The preceding example describes how the Second Derivatives Test can be
performed for a function of two variables:
• If $\det(H) = f_{xx} f_{yy} - f_{xy}^2 > 0$, and $f_{xx} > 0$, then the critical point is a minimum.
• If $\det(H) > 0$ and $f_{xx} < 0$, then the critical point is a maximum.
• If $\det(H) < 0$, then the critical point is a saddle point.
• If $\det(H) = 0$, then the test is inconclusive.
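The four cases of the test translate directly into a small decision procedure. The following Python sketch is one possible encoding; the function name `classify_critical_point` is hypothetical, and the caller is assumed to supply second partial derivatives already evaluated at a critical point.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the two-variable Second Derivatives Test.

    fxx, fyy, fxy are the second partial derivatives evaluated at a
    critical point (a point where the gradient is zero).
    """
    det_h = fxx * fyy - fxy ** 2        # determinant of the Hessian
    if det_h > 0:
        # Both eigenvalues share the sign of fxx.
        return "local minimum" if fxx > 0 else "local maximum"
    if det_h < 0:
        return "saddle point"
    return "inconclusive"
```

For instance, with $f_{xx} = 12$, $f_{yy} = 16$ and $f_{xy} = 4$ the determinant is 176 and the point is classified as a local minimum.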
1.8.1
Absolute Extrema
to check all critical points in [a, b], and the endpoints a and b, as the absolute
maximum and absolute minimum must each occur at one of these points.
The generalization of a closed interval to the multivariable case is the
notion of a compact set. Previously, we defined an open set, and a boundary
point. A closed set is a set that contains all of its boundary points. A
bounded set is a set that is contained entirely within a ball Dr (x0 ) for some
choice of r and x0 . Finally, a set is compact if it is closed and bounded.
We can now state the generalization of the Extreme Value Theorem to
the multivariable case. It states that a continuous function on a compact
set has an absolute minimum and an absolute maximum. Therefore, given
such a compact set D, to find the absolute maximum and minimum, it is
sufficient to check the critical points of f in D, and to find the extreme
(maximum and minimum) values of f on the boundary. The largest of all of
these values is the absolute maximum value, and the smallest is the absolute
minimum value.
It should be noted that in cases where D has a simple shape, such as
a rectangle, triangle or cube, it is possible to check boundary points by
characterizing them using one or more equations, using these equations to
eliminate a variable, and then substituting for the eliminated variable in f
to obtain a function of one less variable. Then, it is possible to find extreme
values on the boundary by solving a maximization or minimization problem
in one less dimension.
Example Consider the function $f(x, y) = x^2 + 3y^2 - 4x - 6y$. We will find the absolute maximum and minimum values of this function on the triangle with vertices (0, 0), (4, 0) and (0, 3).
First, we look for critical points. We have
\[ \nabla f = \begin{bmatrix} 2x - 4 & 6y - 6 \end{bmatrix}. \]
We see that there is only one critical point, at $(x_0, y_0) = (2, 1)$. Because the triangle includes points that satisfy the inequalities $x \ge 0$, $y \ge 0$ and $y \le 3 - 3x/4$, and the point (2, 1) satisfies all of these inequalities, we conclude that this point lies within the triangle. It is therefore a candidate for an absolute maximum or minimum.
We now check the boundary, by examining each edge of the triangle individually. On the edge between (0, 0) and (0, 3), we have $x = 0$, which yields $f(0, y) = 3y^2 - 6y$. We then have $f_y(0, y) = 6y - 6$, which has a critical point at $y = 1$. Therefore, (0, 1) is also a candidate for an absolute extremum. Similarly, along the edge between (0, 0) and (4, 0), we have $y = 0$, which yields $f(x, 0) = x^2 - 4x$. We then have $f_x(x, 0) = 2x - 4$, which has
x        y        f(x, y)
2        1        -7
0        1        -3
2        0        -4
104/43   51/43    -289/43
0        0        0
4        0        0
0        3        9
We conclude that the absolute minimum is at (2, 1), and the absolute maximum is at (0, 3). The function is shown in Figure 1.5. □
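The comparison of candidate points can be reproduced with exact rational arithmetic. The Python sketch below evaluates $f(x, y) = x^2 + 3y^2 - 4x - 6y$ at the interior critical point, the critical point of each edge, and the three vertices. The $x$-coordinate 104/43 of the hypotenuse candidate is an assumption derived here from $y = 51/43$ and the edge equation $y = 3 - 3x/4$.

```python
from fractions import Fraction as Fr

def f(x, y):
    return x**2 + 3*y**2 - 4*x - 6*y

candidates = [
    (Fr(2), Fr(1)),               # interior critical point
    (Fr(0), Fr(1)),               # critical point on the edge x = 0
    (Fr(2), Fr(0)),               # critical point on the edge y = 0
    (Fr(104, 43), Fr(51, 43)),    # critical point on the edge y = 3 - 3x/4
    (Fr(0), Fr(0)), (Fr(4), Fr(0)), (Fr(0), Fr(3)),   # vertices
]

values = {p: f(*p) for p in candidates}
abs_min_point = min(values, key=values.get)
abs_max_point = max(values, key=values.get)
```

Using `Fraction` avoids rounding, so the value on the hypotenuse comes out as an exact fraction rather than a decimal approximation.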
1.8.2
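defined by
\[
H_f(\mathbf{x}_0) = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2}(\mathbf{x}_0) & \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{x}_0) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{x}_0) \\
\frac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{x}_0) & \frac{\partial^2 f}{\partial x_2^2}(\mathbf{x}_0) & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}(\mathbf{x}_0) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{x}_0) & \frac{\partial^2 f}{\partial x_n \partial x_2}(\mathbf{x}_0) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(\mathbf{x}_0)
\end{bmatrix}
\]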
xx0
(0)
(0)
R2 (x0 , x)
= 0.
kx x0 k2
(0)
i,j=1
R2 (x0 , x).
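Example Let $f(x, y) = x^2 y^3 + x y^4$, and let $(x_0, y_0) = (1, -2)$. Then, from partial differentiation of $f$, we obtain its gradient
\[ \nabla f = \begin{bmatrix} f_x & f_y \end{bmatrix} = \begin{bmatrix} 2xy^3 + y^4 & 3x^2y^2 + 4xy^3 \end{bmatrix}, \]
and its Hessian,
\[ H_f(x, y) = \begin{bmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{bmatrix} = \begin{bmatrix} 2y^3 & 6xy^2 + 4y^3 \\ 6xy^2 + 4y^3 & 6x^2y + 12xy^2 \end{bmatrix}. \]
Therefore
\[ \nabla f(1, -2) = \begin{bmatrix} 0 & -20 \end{bmatrix}, \quad H_f(1, -2) = \begin{bmatrix} -16 & -8 \\ -8 & 36 \end{bmatrix}, \]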
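and the Taylor expansion of $f$ around $(1, -2)$ is
\[ f(x, y) = f(1, -2) + \nabla f(1, -2) \cdot \langle x - 1, y + 2 \rangle + \frac{1}{2} \langle x - 1, y + 2 \rangle \cdot \begin{bmatrix} -16 & -8 \\ -8 & 36 \end{bmatrix} \begin{bmatrix} x - 1 \\ y + 2 \end{bmatrix} + R_2((1, -2), (x, y)). \]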
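The gradient and Hessian of $f(x, y) = x^2 y^3 + x y^4$ at $(1, -2)$ can be checked by direct evaluation. The Python sketch below hard-codes the partial derivatives obtained by hand; the helper `taylor2`, which assembles a second-degree Taylor approximation from them, is a hypothetical name introduced only for illustration.

```python
def f(x, y):
    return x**2 * y**3 + x * y**4

def gradient(x, y):
    """(f_x, f_y), differentiated by hand."""
    return (2*x*y**3 + y**4, 3*x**2*y**2 + 4*x*y**3)

def hessian(x, y):
    """((f_xx, f_xy), (f_yx, f_yy)); symmetric because f is smooth."""
    fxx = 2*y**3
    fxy = 6*x*y**2 + 4*y**3
    fyy = 6*x**2*y + 12*x*y**2
    return ((fxx, fxy), (fxy, fyy))

def taylor2(x, y, x0=1, y0=-2):
    """Second-degree Taylor approximation of f about (x0, y0)."""
    gx, gy = gradient(x0, y0)
    (hxx, hxy), (_, hyy) = hessian(x0, y0)
    dx, dy = x - x0, y - y0
    quad = hxx*dx*dx + 2*hxy*dx*dy + hyy*dy*dy
    return f(x0, y0) + gx*dx + gy*dy + 0.5*quad
```

Near $(1, -2)$ the approximation error should shrink like the cube of the distance to the expansion point, which is what the remainder term predicts.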
1.8.3
Principal Minors
2
[Hf (0, 0, 0)]1:3,1:3 = 1
0
1
2
,
1 0
2 0 .
0 2
1.9
Constrained Optimization
i = 1, 2, . . . , m
for given functions $g_i(\mathbf{x})$. That is, we may only consider $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ that belong to the intersection of the hypersurfaces (surfaces, when $n = 3$, or curves, when $n = 2$) defined by the $g_i$, when computing a maximum or minimum value of $f$. For conciseness, we rewrite these constraints as a vector equation $\mathbf{g}(\mathbf{x}) = \mathbf{0}$, where $\mathbf{g} : \mathbb{R}^n \to \mathbb{R}^m$ is a vector-valued function with component functions $g_i$, for $i = 1, 2, \ldots, m$.
By Taylor's theorem, we have, for $\mathbf{x}_0 \in \mathbb{R}^n$ at which $\mathbf{g}$ is differentiable,
\[ \mathbf{g}(\mathbf{x}) = \mathbf{g}(\mathbf{x}_0) + J_{\mathbf{g}}(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + R_1(\mathbf{x}_0, \mathbf{x}), \]
,
4
y=
.
9
2 13.6.
Substituting these values into the above equations for x and y yield the
critical points
3
1
3
x1 = , y1 = , 1 = ,
5
5
2
x2 1.416626,
y2 2.956124,
2 13.6.
Substituting the $x$ and $y$ values into $f(x, y)$ yields the minimum value of 9/5 at $(x_1, y_1)$ and the maximum value of approximately 86.675 at $(x_2, y_2)$. □
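Example Let $f(x, y, z) = x + y + z$. We wish to find the extrema of this function subject to the constraints $x^2 + y^2 = 1$ and $2x + z = 1$. That is, we must have $g_1(x, y, z) = g_2(x, y, z) = 0$, where $g_1(x, y, z) = x^2 + y^2 - 1$ and $g_2(x, y, z) = 2x + z - 1$. We must find $\lambda_1$ and $\lambda_2$ such that
\[ \nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2, \]
or
\[ \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \lambda_1 \begin{bmatrix} 2x & 2y & 0 \end{bmatrix} + \lambda_2 \begin{bmatrix} 2 & 0 & 1 \end{bmatrix}. \]
This equation, together with the constraints, yields the system of equations
\begin{align*}
1 &= 2x\lambda_1 + 2\lambda_2 \\
1 &= 2y\lambda_1 \\
1 &= \lambda_2 \\
1 &= x^2 + y^2 \\
1 &= 2x + z.
\end{align*}
From the third equation, $\lambda_2 = 1$, which, by the first equation, yields $2x\lambda_1 = -1$. It follows from the second equation that $x = -y$. This, in conjunction with the fourth equation, yields $(x, y) = (1/\sqrt{2}, -1/\sqrt{2})$ or $(x, y) = (-1/\sqrt{2}, 1/\sqrt{2})$. From the fifth equation, we obtain the two critical points
\[ (x_1, y_1, z_1) = \left( \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 1 - \sqrt{2} \right), \quad (x_2, y_2, z_2) = \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 1 + \sqrt{2} \right). \]
Substituting these points into $f$ yields $f(x_1, y_1, z_1) = 1 - \sqrt{2}$ and $f(x_2, y_2, z_2) = 1 + \sqrt{2}$.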
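A quick numerical check of the two critical points is sketched below in Python. The helper `lagrange_residual` is a hypothetical name; it returns the largest violation among the five equations of the system, with $\lambda_1$ recovered from the second equation and $\lambda_2 = 1$ from the third.

```python
import math

def lagrange_residual(x, y, z, lam1, lam2):
    """Largest absolute residual of the five Lagrange equations."""
    eqs = [
        1 - (2*x*lam1 + 2*lam2),   # x-component of the gradient condition
        1 - 2*y*lam1,              # y-component
        1 - lam2,                  # z-component
        1 - (x**2 + y**2),         # constraint g1 = 0
        1 - (2*x + z),             # constraint g2 = 0
    ]
    return max(abs(e) for e in eqs)

s = 1 / math.sqrt(2)
p1 = (s, -s, 1 - math.sqrt(2))
p2 = (-s, s, 1 + math.sqrt(2))

r1 = lagrange_residual(*p1, 1 / (2 * p1[1]), 1.0)
r2 = lagrange_residual(*p2, 1 / (2 * p2[1]), 1.0)
f1 = sum(p1)   # f(x, y, z) = x + y + z
f2 = sum(p2)
```

Both residuals should be at rounding level, and the two values of $f$ should differ by $2\sqrt{2}$.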
\[ \sum_{i=1}^{m} \lambda_i g_i(\mathbf{x}), \]
1.10
1.10.1
Matrix Multiplication
\[ \sum_{k=1}^{n} a_{ik} b_{kj}, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, p. \]
Note that the number of columns in A must equal the number of rows in B,
or the product AB is undefined. Furthermore, in general, even if A and B
can be multiplied in either order (that is, if they are square matrices of the
same size), AB does not necessarily equal BA. In the special case where the
matrix B is actually a column vector x with n components (that is, p = 1),
it is useful to be able to recognize the summation
\[ y_i = \sum_{j=1}^{n} a_{ij} x_j \]
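The summation $y_i = \sum_j a_{ij} x_j$ can be written out literally in code. The following Python sketch stores a matrix as a list of rows; the matrix and vector used are hypothetical sample data, not the matrices of the surrounding example.

```python
def matvec(A, x):
    """Return y with y[i] = sum_j A[i][j] * x[j]."""
    n = len(x)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal len(x)")
    return [sum(row[j] * x[j] for j in range(n)) for row in A]

A = [[1, 0, 2],
     [0, 3, 1]]        # hypothetical 2 x 3 matrix
x = [1, 2, 3]          # hypothetical 3-vector
y = matvec(A, x)       # a 2-vector
```

The dimension check mirrors the requirement that the number of columns of $A$ equal the number of rows of the right-hand factor.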
1 2
7
8
A = 3 4 , B =
.
9 10
5 6
Then, because the number of columns in $A$ is equal to the number of rows in $B$, the product $C = AB$ is defined, and equal to the $3 \times 2$
1.10.2
Eigenvalues
determinant and trace can be used to easily confirm that the eigenvalues of
A are either both positive, both negative, or of opposite signs. This is the
basis for the Second Derivatives Test for functions of two variables.
Example Let $A$ be a symmetric $2 \times 2$ matrix defined by
\[ A = \begin{bmatrix} 4 & 6 \\ 6 & 10 \end{bmatrix}. \]
Then
\[ \mathrm{tr}(A) = 4 + 10 = 14, \quad \det(A) = 4(10) - 6(6) = 4. \]
It follows that the product and the sum of $A$'s two eigenvalues are both positive. Because $A$ is symmetric, its eigenvalues are also real. Therefore, they must both also be positive, and we can conclude that $A$ is positive definite.
To actually compute the eigenvalues, we can compute its characteristic polynomial, which is
\begin{align*}
\det(A - \lambda I) &= \det \begin{bmatrix} 4 - \lambda & 6 \\ 6 & 10 - \lambda \end{bmatrix} \\
&= (4 - \lambda)(10 - \lambda) - (6)(6) \\
&= \lambda^2 - 14\lambda + 4.
\end{align*}
Note that
\[ \det(A - \lambda I) = \lambda^2 - \mathrm{tr}(A)\lambda + \det(A), \]
which is true for $2 \times 2$ matrices in general. To compute the eigenvalues, we use the quadratic formula to compute the roots of this polynomial, and obtain
\[ \lambda = \frac{14 \pm \sqrt{14^2 - 4(4)(1)}}{2(1)} = 7 \pm 3\sqrt{5} \approx 13.708, \; 0.292. \]
If $A$ represented the Hessian of a function $f(x, y)$ at a point $(x_0, y_0)$, and $\nabla f(x_0, y_0) = \mathbf{0}$, then $f$ would have a local minimum at $(x_0, y_0)$. □
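The relation between trace, determinant and eigenvalues gives a direct formula for symmetric $2 \times 2$ matrices. A minimal Python sketch follows; the function name `eig_sym_2x2` is hypothetical, and the matrix is passed by its three distinct entries.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] via lambda^2 - tr*lambda + det = 0."""
    tr = a + d
    det = a * d - b * b
    # tr^2 - 4*det = (a - d)^2 + 4*b^2 >= 0, so the roots are real.
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

lam_small, lam_large = eig_sym_2x2(4, 6, 10)
positive_definite = lam_small > 0
```

Because the discriminant equals $(a - d)^2 + 4b^2$, the square root never fails for real symmetric input, which reflects the fact that such matrices have real eigenvalues.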
1.10.3
The dot product of two vectors $\mathbf{u}$ and $\mathbf{v}$, denoted by $\mathbf{u} \cdot \mathbf{v}$, can also be written as $\mathbf{u}^T \mathbf{v}$, where $\mathbf{u}$ and $\mathbf{v}$ are both column vectors, and $\mathbf{u}^T$ is the transpose of $\mathbf{u}$, which converts $\mathbf{u}$ into a row vector. In general, the transpose of a matrix $A$ is the matrix $A^T$ whose entries are defined by $[A^T]_{ij} = [A]_{ji}$. That is, in the transpose, the roles of rows and columns are reversed. The dot product is also known as an inner product; the outer product of two column vectors $\mathbf{u}$ and $\mathbf{v}$ is $\mathbf{u}\mathbf{v}^T$, which is a matrix, whereas the inner product is a scalar.
Given an $m \times n$ matrix $A$, the null space of $A$ is the set $N(A)$ of all $n$-vectors $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{0}$. If $\mathbf{x}$ is such a vector, then for any $m$-vector $\mathbf{v}$, $\mathbf{v}^T (A\mathbf{x}) = \mathbf{v}^T \mathbf{0} = 0$. However, because of two properties of the transpose, $(A^T)^T = A$ and $(AB)^T = B^T A^T$, this inner product can be rewritten as $\mathbf{v}^T A \mathbf{x} = \mathbf{v}^T (A^T)^T \mathbf{x} = (A^T \mathbf{v})^T \mathbf{x}$. It follows that any vector in $N(A)$ is orthogonal to any vector in the range of $A^T$, denoted by $R(A^T)$, which is the set of all $n$-vectors of the form $A^T \mathbf{v}$, where $\mathbf{v}$ is an $m$-vector. This is the basis for the condition $\nabla f = J_{\mathbf{g}}^T \boldsymbol{\lambda}$ in the method of Lagrange multipliers when there are multiple constraints.
Example Let
1 2 4
A = 1 3 6 .
1 5 10
Then
1
1
1
AT = 2 3 5 .
4 6 10
The null space of A, N (A), consists of all vectors that are multiples of the
vector
0
v = 2 ,
1
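as it can be verified by matrix-vector multiplication that $A\mathbf{v} = \mathbf{0}$. Now, if we let $\mathbf{w}$ be any vector in $\mathbb{R}^3$, and we compute $\mathbf{u} = A^T \mathbf{w}$, then $\mathbf{v} \cdot \mathbf{u} = \mathbf{v}^T \mathbf{u} = 0$, because
\[ \mathbf{v}^T \mathbf{u} = \mathbf{v}^T A^T \mathbf{w} = (A\mathbf{v})^T \mathbf{w} = \mathbf{0}^T \mathbf{w} = 0. \]
For example, it can be confirmed directly that $\mathbf{v}$ is orthogonal to any of the columns of $A^T$. □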
Chapter 2
Multiple Integrals
2.1
f (x) dx = lim
f (xi )x,
i=1
Z
F (x) =
f (s) ds,
a
then we have
1
F (x) = lim
h0 h
0
Z
x+h
Z
f (s) ds
Z
1 x+h
f (s) ds =
f (s) ds.
h x
\[ \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) \, \Delta y \, \Delta x. \]
We then obtain the exact volume of this solid by letting the numbers of subintervals, $n$ and $m$, tend to infinity. The result is the double integral of $f(x, y)$ over the rectangle $R = \{(x, y) \mid a \le x \le b, \; c \le y \le d\}$, which is also written as $R = [a, b] \times [c, d]$. The double integral is defined to be
\[ V = \iint_R f(x, y) \, dA = \lim_{m,n \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) \, \Delta y \, \Delta x, \]
which is equal to the volume of the given solid. The $dA$ corresponds to the quantity $\Delta A = \Delta x \Delta y$, and emphasizes the fact that the integral is defined to be the limit of the sum of volumes of boxes, each with a base of area $\Delta A$.
To evaluate double integrals of this form, we can proceed as in the single-variable case, by noting that if $f(x_0, y)$, a function of $y$, is integrable on $[c, d]$
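for each $x_0 \in [a, b]$, then we have
\begin{align*}
\iint_R f(x, y) \, dA &= \lim_{m,n \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i, y_j) \, \Delta y \, \Delta x \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \left[ \lim_{m \to \infty} \sum_{j=1}^{m} f(x_i, y_j) \, \Delta y \right] \Delta x \\
&= \lim_{n \to \infty} \sum_{i=1}^{n} \left[ \int_c^d f(x_i, y) \, dy \right] \Delta x \\
&= \int_a^b \int_c^d f(x, y) \, dy \, dx.
\end{align*}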
This result is known as Fubini's Theorem, which states that a double integral of a function $f(x, y)$ can be evaluated as two iterated single integrals,
provided that f is integrable as a function of either variable when the other
variable is held fixed. This is guaranteed if, for instance, f (x, y) is continuous
on the entire rectangle R.
That is, we can evaluate a double integral by performing partial integration with respect to either variable, x or y, which entails applying the
Fundamental Theorem of Calculus to integrate f (x, y) with respect to only
that variable, while treating the other variable as a constant. The result will
be a function of only the other variable, to which the Fundamental Theorem
of Calculus can be applied a second time to complete the evaluation of the
double integral.
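Example Let $R = [0, 1] \times [0, 2]$, and let $f(x, y) = x^2 y + x y^3$. We will use Fubini's Theorem to evaluate
\[ \iint_R f(x, y) \, dy \, dx. \]
We have
\begin{align*}
\iint_R f(x, y) \, dy \, dx &= \int_0^1 \int_0^2 (x^2 y + x y^3) \, dy \, dx \\
&= \int_0^1 \left[ \int_0^2 (x^2 y + x y^3) \, dy \right] dx \\
&= \int_0^1 \left[ \int_0^2 x^2 y \, dy + \int_0^2 x y^3 \, dy \right] dx \\
&= \int_0^1 \left[ x^2 \int_0^2 y \, dy + x \int_0^2 y^3 \, dy \right] dx \\
&= \int_0^1 \left[ x^2 \, \frac{y^2}{2} \Big|_0^2 + x \, \frac{y^4}{4} \Big|_0^2 \right] dx \\
&= \int_0^1 (2x^2 + 4x) \, dx \\
&= \left( \frac{2x^3}{3} + 2x^2 \right) \Big|_0^1 \\
&= \frac{8}{3}.
\end{align*}
□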
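The definition of the double integral as a limit of Riemann sums can be tested numerically against the value 8/3. The Python sketch below uses midpoints of the subrectangles as sample points, which is an assumption made here for accuracy; any choice of sample points converges to the same limit for a continuous integrand.

```python
def double_integral_rect(f, a, b, c, d, n=200, m=200):
    """Midpoint Riemann sum of f over [a, b] x [c, d]."""
    dx = (b - a) / n
    dy = (d - c) / m
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # midpoint in x
        for j in range(m):
            y = c + (j + 0.5) * dy      # midpoint in y
            total += f(x, y)
    return total * dx * dy

approx = double_integral_rect(lambda x, y: x**2 * y + x * y**3, 0, 1, 0, 2)
```

Increasing `n` and `m` should bring `approx` closer to 8/3, in line with the limit in the definition.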
In view of Fubini's Theorem, a double integral is often written as
\[ \iint_R f(x, y) \, dA = \iint_R f(x, y) \, dy \, dx = \iint_R f(x, y) \, dx \, dy. \]
\begin{align*}
&= \int_1^4 \left( 8y - xy - \frac{y^2}{2} \right) \Big|_0^2 \, dx \\
&= \int_1^4 (14 - 2x) \, dx \\
&= (14x - x^2) \Big|_1^4
\end{align*}
2.2
R1
R2
where
\[ F(x, y) = \begin{cases} f(x, y) & (x, y) \in D, \\ 0 & (x, y) \in R, \; (x, y) \notin D. \end{cases} \]
\[ \int_a^b \int_{g_1(x)}^{g_2(x)} F(x, y) \, dy \, dx = \int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y) \, dy \, dx. \]
This is valid because F (x, y) = 0 when y < g1 (x) or y > g2 (x), because
in these cases, (x, y) lies outside of D. The resulting iterated integral can
be evaluated in the same way as iterated integrals over rectangles; the only
difference is that when the limits of the inner integral are substituted for y
in the antiderivative of f (x, y) with respect to y, the limits are functions of
x, rather than constants.
A similar approach can be applied to a region of type II, which is bounded on the left and right by continuous functions of $y$, and bounded above and below by horizontal lines. Specifically, $D$ is a region of type II if
\[ D = \{(x, y) \mid h_1(y) \le x \le h_2(y), \; c \le y \le d\}. \]
Using Fubini's Theorem, we obtain
\[ \iint_D f(x, y) \, dA = \int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y) \, dx \, dy. \]
Example We wish to compute the volume of the solid under the plane $x + y + z = 8$, and bounded by the surfaces $y = x$ and $y = x^2$. These surfaces intersect along the lines $x = 0$, $y = 0$ and $x = 1$, $y = 1$. It follows that the volume $V$ of the solid is given by the double integral
\[ \int_0^1 \int_{x^2}^{x} (8 - x - y) \, dy \, dx. \]
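Evaluating the inner integral first, we obtain
\begin{align*}
V &= \int_0^1 \left( 8y - xy - \frac{y^2}{2} \right) \Big|_{x^2}^{x} \, dx \\
&= \int_0^1 \left( \frac{x^4}{2} + x^3 - \frac{19x^2}{2} + 8x \right) dx \\
&= \left( \frac{x^5}{10} + \frac{x^4}{4} - \frac{19x^3}{6} + 4x^2 \right) \Big|_0^1 \\
&= \frac{1}{10} + \frac{1}{4} - \frac{19}{6} + 4 \\
&= \frac{71}{60}.
\end{align*}
□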
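An iterated integral over a type I region, where the inner limits depend on $x$, can be approximated in the same spirit as a Riemann sum. The Python sketch below is a minimal midpoint-rule version; the function name `integrate_type1` and the resolution parameters are assumptions made for illustration.

```python
def integrate_type1(f, a, b, g1, g2, n=200, m=200):
    """Approximate the integral of f over {a <= x <= b, g1(x) <= y <= g2(x)}."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        lo, hi = g1(x), g2(x)           # inner limits are functions of x
        dy = (hi - lo) / m
        inner = sum(f(x, lo + (j + 0.5) * dy) for j in range(m)) * dy
        total += inner * dx
    return total

V = integrate_type1(lambda x, y: 8 - x - y, 0.0, 1.0,
                    lambda x: x**2, lambda x: x)
```

The only change relative to a rectangle is that the inner step size `dy` is recomputed for each `x`, which corresponds to substituting limits that are functions of $x$ into the inner antiderivative.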
Note that it is sometimes necessary to determine the intersections of surfaces
that define a solid, in order to obtain the limits of integration.
To compute the volume of a solid that is bounded above and below
(along the z-direction) by two different surfaces, we can add the volume of
the solid bounded by the top surface and the plane z = 0 to the volume of
the solid bounded above by z = 0 and below by the lower surface, which
is equivalent to subtracting the volume of the solid bounded above by the
lower surface and below by z = 0.
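Example We will compute the volume $V$ of the solid in the first octant bounded by the planes $z = 10 + x + y$, $z = 2 - x - y$, and $x = 0$, as well as the surfaces $y = \sin x$ and $y = \cos x$. As these surfaces intersect along the line $y = \sqrt{2}/2$, $x = \pi/4$, this volume is given by the double integral
\begin{align*}
V &= \int_0^{\pi/4} \int_{\sin x}^{\cos x} \left[ (10 + x + y) - (2 - x - y) \right] dy \, dx \\
&= \int_0^{\pi/4} \int_{\sin x}^{\cos x} (8 + 2x + 2y) \, dy \, dx \\
&= \int_0^{\pi/4} \left( 8y + 2xy + y^2 \right) \Big|_{\sin x}^{\cos x} \, dx \\
&= \int_0^{\pi/4} \left( 8\cos x - 8\sin x + 2x\cos x - 2x\sin x + \cos^2 x - \sin^2 x \right) dx \\
&= \int_0^{\pi/4} \left( 8\cos x - 8\sin x + 2x\cos x - 2x\sin x + \cos 2x \right) dx \\
&= \left( 2x\sin x + 2x\cos x + 6\sin x + 10\cos x + \frac{1}{2}\sin 2x \right) \Big|_0^{\pi/4} \\
&= \frac{\sqrt{2}\,\pi}{2} + 8\sqrt{2} - \frac{19}{2}.
\end{align*}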
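\[ A(D) = \int_{-1}^{1} \int_0^{\sqrt{1 - x^2}} 1 \, dy \, dx = \int_{-1}^{1} y \Big|_0^{\sqrt{1 - x^2}} \, dx = \int_{-1}^{1} \sqrt{1 - x^2} \, dx. \]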
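\[ A(D) = \int_{-\pi/2}^{\pi/2} \cos^2\theta \, d\theta = \int_{-\pi/2}^{\pi/2} \frac{1 + \cos 2\theta}{2} \, d\theta = \left( \frac{\theta}{2} + \frac{\sin 2\theta}{4} \right) \Big|_{-\pi/2}^{\pi/2} = \frac{\pi}{2}. \]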
2.2.1
In some cases, a region can be classified as being of either type I or type II,
and therefore a function can be integrated over the region in two different
ways. However, one approach or the other may be impractical, due to the
ey dA
1Z 1
ey dy dx.
x
3
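\begin{align*}
\iint_D e^{y^3} \, dA &= \int_0^1 \int_0^{y^2} e^{y^3} \, dx \, dy \\
&= \int_0^1 x e^{y^3} \Big|_0^{y^2} \, dy \\
&= \int_0^1 y^2 e^{y^3} \, dy \\
&= \frac{1}{3} \int_0^1 e^u \, du, \quad u = y^3, \\
&= \frac{1}{3} e^u \Big|_0^1 \\
&= \frac{1}{3}(e - 1).
\end{align*}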
2.2.2
It is important to note that all of the properties of double integrals that have been previously discussed, including linearity, homogeneity, monotonicity, and additivity, apply to double integrals over non-rectangular regions as well. One additional property, which is a consequence of monotonicity, is that if $f(x, y) \ge m$ on a region $D$, and $f(x, y) \le M$ on $D$, then
\[ m A(D) \le \iint_D f(x, y) \, dA \le M A(D), \]
The exact value is $\frac{1}{4}(e^4 - 5) \approx 12.4$, which is between the above lower and upper bounds. □
2.3
y
.
x
In order to integrate a function over a region defined using polar coordinates, we must derive the double integral in these coordinates, as was previously done in Cartesian coordinates.
Let a solid be bounded by the surface $z = f(r, \theta)$, as well as the surfaces $r = a$, $r = b$, $\theta = \alpha$ and $\theta = \beta$, which define a polar rectangle. To compute the volume of this solid, we can approximate it by several solids for which the volume can easily be computed. This is accomplished by dividing the polar rectangle into several smaller polar rectangles of dimensions $\Delta r$ and $\Delta\theta$. The height of each solid is obtained from the value of the function at a point in the polar rectangle.
Specifically, we divide the interval $[a, b]$ into $n$ subintervals of width $\Delta r = (b - a)/n$. Each subinterval is of the form $[r_{i-1}, r_i]$, where $r_i = a + i\Delta r$, for $i = 1, 2, \ldots, n$. Similarly, $[\alpha, \beta]$ is divided into $m$ subintervals of width $\Delta\theta = (\beta - \alpha)/m$, and each subinterval is of the form $[\theta_{j-1}, \theta_j]$, where $\theta_j = \alpha + j\Delta\theta$. Then, the volume $V$ of the solid is approximated by
r 2 = x2 + y 2 ,
n X
m
X
1
i=1 j=1
tan =
2
f (ri , j )(ri2 ri1
),
V =
f (r, ) r dr d.
h1 ()
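where $D = \{(x, y) \mid 1 \le x^2 + y^2 \le 4, \; x \le 0\}$, we convert the integrand, and the description of $D$, to polar coordinates. We then have
\begin{align*}
\iint_D (r\cos\theta + r\sin\theta) \, dA &= \int_{\pi/2}^{3\pi/2} \int_1^2 r^2 (\cos\theta + \sin\theta) \, dr \, d\theta \\
&= \int_{\pi/2}^{3\pi/2} \frac{r^3}{3} \Big|_1^2 (\cos\theta + \sin\theta) \, d\theta \\
&= \frac{7}{3} \int_{\pi/2}^{3\pi/2} (\cos\theta + \sin\theta) \, d\theta \\
&= \frac{7}{3} (\sin\theta - \cos\theta) \Big|_{\pi/2}^{3\pi/2} \\
&= \frac{7}{3} \left[ (-1 - 0) - (1 - 0) \right] \\
&= -\frac{14}{3}.
\end{align*}
□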
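The extra factor of $r$ in a polar double integral is easy to forget, and a numerical check makes its role visible. The Python sketch below approximates the integral of $x + y$ over the half-annulus $1 \le r \le 2$, $\pi/2 \le \theta \le 3\pi/2$ with a midpoint rule; the function name and grid sizes are assumptions made for illustration.

```python
import math

def polar_integral(f, a, b, alpha, beta, n=200, m=200):
    """Midpoint-rule approximation of the integral of f(x, y) over a polar rectangle."""
    dr = (b - a) / n
    dth = (beta - alpha) / m
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * dr
        for j in range(m):
            th = alpha + (j + 0.5) * dth
            # The factor r accounts for the area r dr dtheta of a polar rectangle.
            total += f(r * math.cos(th), r * math.sin(th)) * r
    return total * dr * dth

approx = polar_integral(lambda x, y: x + y, 1.0, 2.0,
                        math.pi / 2, 3 * math.pi / 2)
```

Dropping the factor `r` from the summand would produce a visibly different number, which is a convenient way to confirm that the area element is being handled correctly.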
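Example To compute the volume of the solid in the first octant bounded below by the cone $z = \sqrt{x^2 + y^2}$, and above by the sphere $x^2 + y^2 + z^2 = 8$, as well as the planes $y = x$ and $y = 0$, we first rewrite the equations of the bounding surfaces in polar coordinates. The solid is bounded below by the cone $z = r$, above by the sphere $r^2 + z^2 = 8$, and the surfaces $\theta = 0$ and $\theta = \pi/4$, since the solid lies in the first octant. The surfaces that bound the solid above and below intersect when $2r^2 = 8$, or $r = 2$. It follows that the volume is given by
\begin{align*}
V &= \int_0^{\pi/4} \int_0^2 \left[ \sqrt{8 - r^2} - r \right] r \, dr \, d\theta \\
&= \int_0^{\pi/4} \int_0^2 r\sqrt{8 - r^2} \, dr \, d\theta - \int_0^{\pi/4} \int_0^2 r^2 \, dr \, d\theta \\
&= \frac{1}{2} \int_0^{\pi/4} \int_4^8 u^{1/2} \, du \, d\theta - \int_0^{\pi/4} \frac{r^3}{3} \Big|_0^2 \, d\theta, \quad u = 8 - r^2, \\
&= \frac{1}{2} \int_0^{\pi/4} \frac{2}{3} u^{3/2} \Big|_4^8 \, d\theta - \frac{8}{3} \int_0^{\pi/4} d\theta \\
&= \frac{1}{3} \int_0^{\pi/4} \left[ 16\sqrt{2} - 8 \right] d\theta - \frac{2\pi}{3} \\
&= \frac{4\pi}{3} \left[ \sqrt{2} - 1 \right].
\end{align*}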
1
2
1x2
f (x, y) dy dx
1
2
Example We evaluate the double integral
\[ \int_0^2 \int_0^{\sqrt{2x - x^2}} \sqrt{x^2 + y^2} \, dy \, dx \]
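\begin{align*}
\int_0^2 \int_0^{\sqrt{2x - x^2}} \sqrt{x^2 + y^2} \, dy \, dx &= \int_0^{\pi/2} \int_0^{2\cos\theta} r^2 \, dr \, d\theta \\
&= \int_0^{\pi/2} \frac{r^3}{3} \Big|_0^{2\cos\theta} \, d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos^3\theta \, d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos^2\theta \cos\theta \, d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} (1 - \sin^2\theta) \cos\theta \, d\theta \\
&= \frac{8}{3} \int_0^{\pi/2} \cos\theta \, d\theta - \frac{8}{3} \int_0^{\pi/2} \sin^2\theta \cos\theta \, d\theta \\
&= \frac{8}{3} \sin\theta \Big|_0^{\pi/2} - \frac{8}{3} \int_0^1 u^2 \, du, \quad u = \sin\theta, \\
&= \frac{8}{3}(1) - \frac{8}{3} \, \frac{u^3}{3} \Big|_0^1 \\
&= \frac{16}{9}.
\end{align*}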
2.4
Triple Integrals
\[ \lim_{m,n,\ell \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{\ell} f(x_i, y_j, z_k) \, \Delta V, \]
where
\[ F(x, y, z) = \begin{cases} f(x, y, z) & (x, y, z) \in E, \\ 0 & (x, y, z) \notin E. \end{cases} \]
All of the properties previously associated with the double integral, such as
linearity and additivity, generalize to the triple integral as well.
Just as regions were classified as type I or type II for double integrals,
they can be classified for the purpose of setting up triple integrals. A solid
region E is said to be of type 1 if it lies between the graphs of two continuous functions of x and y that are defined on a two-dimensional region D.
Specifically,
\[ E = \{(x, y, z) \mid (x, y) \in D, \; u_1(x, y) \le z \le u_2(x, y)\}. \]
Then, an integral of a function $f(x, y, z)$ over $E$ can be evaluated as
\[ \iiint_E f(x, y, z) \, dV = \iint_D \left[ \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z) \, dz \right] dA, \]
\[ \iiint_E f(x, y, z) \, dV = \int_a^b \int_{g_1(x)}^{g_2(x)} \int_{u_1(x,y)}^{u_2(x,y)} f(x, y, z) \, dz \, dy \, dx. \]
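\begin{align*}
&= \int_0^1 \int_0^{1-x} x \, \frac{z^2}{2} \Big|_0^{1-x-y} \, dy \, dx \\
&= \frac{1}{2} \int_0^1 x \int_0^{1-x} (1 - x - y)^2 \, dy \, dx \\
&= \frac{1}{2} \int_0^1 x \left( -\frac{(1 - x - y)^3}{3} \right) \Big|_0^{1-x} \, dx \\
&= \frac{1}{6} \int_0^1 x(1 - x)^3 \, dx \\
&= \frac{1}{6} \int_0^1 (1 - u) u^3 \, du, \quad u = 1 - x, \\
&= \frac{1}{6} \int_0^1 (u^3 - u^4) \, du \\
&= \frac{1}{6} \left( \frac{u^4}{4} - \frac{u^5}{5} \right) \Big|_0^1 \\
&= \frac{1}{6} \left( \frac{1}{4} - \frac{1}{5} \right) \\
&= \frac{1}{120}.
\end{align*}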
Example We will compute the volume of the solid $E$ bounded by the surfaces $y = x$, $y = x^2$, $z = x$, and $z = 0$. Because $E$ is bounded by two surfaces that define $z$ as a function of $x$ and $y$, we view $E$ as a solid of type 1. It is bounded by the graphs of the functions $z = 0$ and $z = x$ that are defined on a region $D$ in the $xy$-plane. This region is bounded by the curves $y = x$ and $y = x^2$. Because these curves intersect when $x = 0$ and $x = 1$, we can describe $D$ as a region of type I:
\[ D = \{(x, y) \mid 0 \le x \le 1, \; x^2 \le y \le x\}. \]
It follows that the volume of $E$ is given by the iterated integral
\begin{align*}
\iiint_E 1 \, dV &= \int_0^1 \int_{x^2}^{x} \int_0^x 1 \, dz \, dy \, dx \\
&= \int_0^1 \int_{x^2}^{x} x \, dy \, dx \\
&= \int_0^1 x \int_{x^2}^{x} 1 \, dy \, dx \\
&= \int_0^1 x(x - x^2) \, dx \\
&= \int_0^1 (x^2 - x^3) \, dx \\
&= \left( \frac{x^3}{3} - \frac{x^4}{4} \right) \Big|_0^1 \\
&= \frac{1}{12}.
\end{align*}
□
Example We evaluate the triple integral
\[ \iiint_E x \, dV \]
4(y 2 +z 2 )
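\begin{align*}
\iiint_E x \, dV &= \int_0^{2\pi} \int_0^1 \int_{4r^2}^{4} x r \, dx \, dr \, d\theta \\
&= \int_0^{2\pi} \int_0^1 \frac{x^2}{2} \Big|_{4r^2}^{4} \, r \, dr \, d\theta \\
&= 8 \int_0^{2\pi} \int_0^1 r(1 - r^4) \, dr \, d\theta \\
&= 8 \int_0^{2\pi} \int_0^1 (r - r^5) \, dr \, d\theta \\
&= 8 \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^6}{6} \right) \Big|_0^1 \, d\theta \\
&= 8 \int_0^{2\pi} \frac{1}{3} \, d\theta \\
&= \frac{16\pi}{3}.
\end{align*}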
2.5
\[ \iint_D \rho(x, y) \, dA. \]
We see that just as the integral allows simple product formulas for area and
volume to be applied to more general problems, it allows similar formulas
for quantities such as mass to be generalized as well.
The center of mass, also known as the center of gravity, of an object is
the point at which the object behaves as if its entire mass is concentrated
at that point. If the object is one- or two-dimensional, the center of mass is
the point at which the object can be balanced horizontally (like a see-saw
with riders at either end, in the one-dimensional case).
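For a lamina with its shape defined by a bounded region $D \subseteq \mathbb{R}^2$, and with density given by $\rho(x, y)$, its center of mass $(\bar{x}, \bar{y})$ is located at
\[ \bar{x} = \frac{M_y}{m}, \quad \bar{y} = \frac{M_x}{m}, \]
where $M_x$ and $M_y$ are the moments of the lamina about the $x$-axis and $y$-axis, respectively. These are given by
\[ M_x = \iint_D y \rho(x, y) \, dA, \quad M_y = \iint_D x \rho(x, y) \, dA. \]
These integrals are obtained from the formula for the moment of a point mass about an axis, which is given by the product of the mass and the distance from the axis.
Similarly, the moments about the $xy$-, $yz$- and $xz$-planes, $M_{xy}$, $M_{yz}$, and $M_{xz}$, of a solid $E \subseteq \mathbb{R}^3$ with density $\rho(x, y, z)$ are given by
\[ M_{xy} = \iiint_E z \rho(x, y, z) \, dV, \quad M_{yz} = \iiint_E x \rho(x, y, z) \, dV, \quad M_{xz} = \iiint_E y \rho(x, y, z) \, dV. \]
The center of mass $(\bar{x}, \bar{y}, \bar{z})$ of the solid is located at
\[ \bar{x} = \frac{M_{yz}}{m}, \quad \bar{y} = \frac{M_{xz}}{m}, \quad \bar{z} = \frac{M_{xy}}{m}. \]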
As in the 2-D case, each moment is defined using the distance of each point
of E from the coordinate plane about which the moment is being computed.
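The formulas for the mass and moments of a lamina can be approximated with a single pass over a grid. The Python sketch below treats a rectangular lamina; the density $\rho(x, y) = x + y$ on the unit square is a hypothetical choice made only to have a concrete case whose exact center of mass, $(7/12, 7/12)$, is easy to compute by hand.

```python
def lamina_center_of_mass(rho, a, b, c, d, n=200):
    """Approximate (x_bar, y_bar) for a lamina on [a, b] x [c, d] with density rho."""
    dx = (b - a) / n
    dy = (d - c) / n
    m = Mx = My = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for j in range(n):
            y = c + (j + 0.5) * dy
            dm = rho(x, y) * dx * dy    # mass of one small cell
            m += dm
            Mx += y * dm                # moment about the x-axis
            My += x * dm                # moment about the y-axis
    return My / m, Mx / m

x_bar, y_bar = lamina_center_of_mass(lambda x, y: x + y, 0.0, 1.0, 0.0, 1.0)
```

Note the crossover in the last line of the function: $\bar{x}$ uses the moment about the $y$-axis and $\bar{y}$ uses the moment about the $x$-axis.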
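The moment of inertia, or second moment, of an object about an axis gives an indication of the object's tendency to rotate about that axis. For a lamina defined by a region $D \subseteq \mathbb{R}^2$ with density function $\rho(x, y)$, its moments of inertia about the $x$-axis and $y$-axis, $I_x$ and $I_y$ respectively, are given by
\[ I_x = \iint_D y^2 \rho(x, y) \, dA, \quad I_y = \iint_D x^2 \rho(x, y) \, dA. \]
Similarly, for a solid $E \subseteq \mathbb{R}^3$ with density $\rho(x, y, z)$, the moments of inertia about the $x$-, $y$- and $z$-axes are given by
\[ I_x = \iiint_E (y^2 + z^2) \rho(x, y, z) \, dV, \quad I_y = \iiint_E (x^2 + z^2) \rho(x, y, z) \, dV, \quad I_z = \iiint_E (x^2 + y^2) \rho(x, y, z) \, dV. \]
The moment $I_z$ is also called the polar moment of inertia, or the moment of inertia about the origin, when $E$ reduces to a lamina with density $\rho(x, y)$.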
2.6
\[ r = \sqrt{3^2 + (-3)^2} = \sqrt{18} = 3\sqrt{2}, \quad \tan\theta = -1. \]
4x2
f (x, y, z) dz dy dx,
2
x2 +y 2
r2
Z Z Z
(r cos )r dz dr d.
x dV =
0
Z
x dV
cos
0
r
Z
dz dr d
0
2
2
Z
=
2
3
r4
0
2
5r2 dr d
2
3
3
r
d + 5
cos d
4 2
3 2
0
Z 2
Z 2
95
65
cos2 + sin cos d +
cos d
4 0
3 0
Z
65 2 1
1
95
(1 + cos 2) + sin 2 d +
sin |2
0
4 0 2
2
3
2
65 1
1
1
+ sin 2) cos 2 d
4 2
2
4
0
=
=
=
65
.
4
2.7
\[ y = \rho \sin\phi \sin\theta. \]
Example To convert the point $(x, y, z) = (1, \sqrt{3}, -4)$ to spherical coordinates, we first compute
\[ \rho = \sqrt{x^2 + y^2 + z^2} = \sqrt{1^2 + (\sqrt{3})^2 + (-4)^2} = \sqrt{20} = 2\sqrt{5}. \]
Next, we use the relation $\tan\theta = y/x$, and the fact that $x = 1 > 0$, to obtain
\[ \theta = \tan^{-1} \frac{y}{x} = \tan^{-1} \sqrt{3} = \frac{\pi}{3}. \]
2 5
2
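The conversion from Cartesian to spherical coordinates can be sketched in a few lines of Python. The sample point $(1, \sqrt{3}, -4)$ is an assumption chosen for illustration, and the code uses `math.atan2` instead of $\tan^{-1}(y/x)$ so that points with $x \le 0$ are also handled; $\phi$ is measured from the positive $z$-axis, matching $z = \rho\cos\phi$.

```python
import math

def to_spherical(x, y, z):
    """Return (rho, theta, phi) with x = rho sin(phi) cos(theta), etc."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0:
        raise ValueError("angles are undefined at the origin")
    theta = math.atan2(y, x)       # angle in the xy-plane
    phi = math.acos(z / rho)       # angle from the positive z-axis, in [0, pi]
    return rho, theta, phi

rho, theta, phi = to_spherical(1.0, math.sqrt(3.0), -4.0)
```

Converting back with $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$ and $z = \rho\cos\phi$ should reproduce the original point up to rounding.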
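To evaluate integrals in spherical coordinates, it is important to note that the volume of a spherical box of dimensions $\Delta\rho$, $\Delta\theta$ and $\Delta\phi$, as $\Delta\rho, \Delta\theta, \Delta\phi \to 0$, converges to the infinitesimal
\[ \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi, \]
where $(\rho, \theta, \phi)$ denotes the location of the box in the limit. Therefore, the integral of a function $f(x, y, z)$ over a solid $E$, when evaluated in spherical coordinates, becomes
\[ \iiint_E f(x, y, z) \, dV = \iiint_E f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi) \, \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi. \]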
Example We wish to compute the volume of the solid $E$ in the first octant bounded below by the plane $z = 0$ and the hemisphere $x^2 + y^2 + z^2 = 9$, bounded above by the hemisphere $x^2 + y^2 + z^2 = 16$, and the planes $y = 0$ and $y = x$. This would be highly inconvenient to attempt to evaluate in
,
4
.
2
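That is, the solid is actually a spherical rectangle. It follows that the volume $V$ is given by the iterated integral
\begin{align*}
V &= \int_0^{\pi/2} \int_0^{\pi/4} \int_3^4 \rho^2 \sin\phi \, d\rho \, d\theta \, d\phi \\
&= \int_0^{\pi/2} \sin\phi \int_0^{\pi/4} \int_3^4 \rho^2 \, d\rho \, d\theta \, d\phi \\
&= \int_0^{\pi/2} \sin\phi \int_0^{\pi/4} \frac{\rho^3}{3} \Big|_3^4 \, d\theta \, d\phi \\
&= \frac{37\pi}{4 \cdot 3} \int_0^{\pi/2} \sin\phi \, d\phi \\
&= -\frac{37\pi}{4 \cdot 3} \cos\phi \Big|_0^{\pi/2} \\
&= \frac{37\pi}{12}.
\end{align*}
□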
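Example We use spherical coordinates to evaluate the triple integral
\[ \iiint_H (x^2 + y^2) \, dV, \]
where $H$ is the solid that is bounded below by the $xy$-plane, and bounded above by the sphere $x^2 + y^2 + z^2 = 1$. In spherical coordinates, $H$ is defined by the inequalities
\[ H = \{(\rho, \theta, \phi) \mid 0 \le \rho \le 1, \; 0 \le \theta \le 2\pi, \; 0 \le \phi \le \pi/2\}. \]
As the integrand $x^2 + y^2$ is equal to $(\rho\cos\theta\sin\phi)^2 + (\rho\sin\theta\sin\phi)^2 = \rho^2\sin^2\phi$ in spherical coordinates, we have
\begin{align*}
\iiint_H (x^2 + y^2) \, dV &= \int_0^{2\pi} \int_0^{\pi/2} \int_0^1 (\rho^2\sin^2\phi) \rho^2 \sin\phi \, d\rho \, d\phi \, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \sin^3\phi \int_0^1 \rho^4 \, d\rho \, d\phi \, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \sin^3\phi \, \frac{\rho^5}{5} \Big|_0^1 \, d\phi \, d\theta \\
&= \frac{1}{5} \int_0^{2\pi} \int_0^{\pi/2} \sin^3\phi \, d\phi \, d\theta \\
&= \frac{1}{5} \int_0^{2\pi} \int_0^{\pi/2} \sin^2\phi \sin\phi \, d\phi \, d\theta \\
&= \frac{1}{5} \int_0^{2\pi} \int_0^{\pi/2} (1 - \cos^2\phi) \sin\phi \, d\phi \, d\theta \\
&= \frac{1}{5} \int_0^{2\pi} \int_0^{\pi/2} \sin\phi \, d\phi \, d\theta - \frac{1}{5} \int_0^{2\pi} \int_0^{\pi/2} \cos^2\phi \sin\phi \, d\phi \, d\theta \\
&= \frac{1}{5} \int_0^{2\pi} (-\cos\phi) \Big|_0^{\pi/2} \, d\theta - \frac{1}{5} \int_0^{2\pi} \int_0^1 u^2 \, du \, d\theta, \quad u = \cos\phi, \\
&= \frac{1}{5} \int_0^{2\pi} 1 \, d\theta - \frac{1}{5} \int_0^{2\pi} \frac{u^3}{3} \Big|_0^1 \, d\theta \\
&= \frac{2\pi}{5} \left( 1 - \frac{1}{3} \right) \\
&= \frac{4\pi}{15}.
\end{align*}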
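The value $4\pi/15$ can be checked with a direct midpoint-rule sum in spherical coordinates. The Python sketch below is a minimal version under the assumption that the solid is described by $0 \le \rho \le \rho_{\max}$, $0 \le \phi \le \phi_{\max}$ and $0 \le \theta \le 2\pi$; the function name and grid size are hypothetical.

```python
import math

def spherical_integral(f, rho_max, phi_max, n=60):
    """Midpoint sum of f(x, y, z) over 0<=rho<=rho_max, 0<=phi<=phi_max, 0<=theta<=2pi."""
    drho = rho_max / n
    dphi = phi_max / n
    dth = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * drho
        for j in range(n):
            phi = (j + 0.5) * dphi
            sp, cp = math.sin(phi), math.cos(phi)
            for k in range(n):
                th = (k + 0.5) * dth
                x = rho * sp * math.cos(th)
                y = rho * sp * math.sin(th)
                z = rho * cp
                # rho^2 sin(phi) is the spherical volume factor.
                total += f(x, y, z) * rho**2 * sp
    return total * drho * dphi * dth

approx = spherical_integral(lambda x, y, z: x * x + y * y, 1.0, math.pi / 2)
```

The integrand is evaluated in Cartesian form on purpose, so the check also exercises the coordinate conversion rather than only the simplified expression $\rho^2\sin^2\phi$.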
2.8
g 0 (x)
y = h(u, v),
a u b,
c v d.
|ru rv | =
u v
v u
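It follows that
\[ \iint_D f(x, y) \, dx \, dy = \iint_{\tilde{D}} f(g(u, v), h(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du \, dv, \]
where $\tilde{D} = [a, b] \times [c, d]$ is the domain of $g$ and $h$, and
\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial x}{\partial v} \frac{\partial y}{\partial u} \]
is the Jacobian of the transformation from $(u, v)$ to $(x, y)$. It is also the determinant of the Jacobian matrix of the vector-valued function that maps $(u, v)$ to $(x, y)$.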
Example Let $D$ be the parallelogram with vertices (0, 0), (2, 4), (6, 1), and (8, 5). To integrate a function $f(x, y)$ over $D$, we can use a change of variable $(x, y) = (g(u, v), h(u, v))$ that maps a rectangle to this parallelogram, and then integrate over the rectangle.
Using the vertices, we find that the equations of the edges are
\[ -x + 6y = 0, \quad -x + 6y = 22, \quad 2x - y = 0, \quad 2x - y = 11. \]
v = 2x y,
D
D
2
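In general, when integrating a function $f(x_1, x_2, \ldots, x_n)$ over a region $D \subseteq \mathbb{R}^n$, if the integral is evaluated using a change of variable $(x_1, x_2, \ldots, x_n) = \mathbf{g}(u_1, u_2, \ldots, u_n)$ that maps a region $E \subseteq \mathbb{R}^n$ to $D$, then
\[ \int_D f(x_1, \ldots, x_n) \, dx_1 \cdots dx_n = \int_E (f \circ \mathbf{g})(u_1, \ldots, u_n) \, |\det(J_{\mathbf{g}}(u_1, \ldots, u_n))| \, du_1 \cdots du_n, \]
where
\[
J_{\mathbf{g}}(u_1, u_2, \ldots, u_n) = \begin{bmatrix}
\frac{\partial x_1}{\partial u_1} & \frac{\partial x_1}{\partial u_2} & \cdots & \frac{\partial x_1}{\partial u_n} \\
\frac{\partial x_2}{\partial u_1} & \frac{\partial x_2}{\partial u_2} & \cdots & \frac{\partial x_2}{\partial u_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial x_n}{\partial u_1} & \frac{\partial x_n}{\partial u_2} & \cdots & \frac{\partial x_n}{\partial u_n}
\end{bmatrix}
\]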
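It follows that the Jacobian of this transformation is given by the determinant of this matrix,
\begin{align*}
\begin{vmatrix}
\sin\phi\cos\theta & -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\
\sin\phi\sin\theta & \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \\
\cos\phi & 0 & -\rho\sin\phi
\end{vmatrix}
&= \cos\phi \begin{vmatrix} -\rho\sin\phi\sin\theta & \rho\cos\phi\cos\theta \\ \rho\sin\phi\cos\theta & \rho\cos\phi\sin\theta \end{vmatrix} - \rho\sin\phi \begin{vmatrix} \sin\phi\cos\theta & -\rho\sin\phi\sin\theta \\ \sin\phi\sin\theta & \rho\sin\phi\cos\theta \end{vmatrix} \\
&= \cos\phi \left[ -\rho^2\sin\phi\cos\phi\sin^2\theta - \rho^2\sin\phi\cos\phi\cos^2\theta \right] \\
&\quad - \rho\sin\phi \left[ \rho\sin^2\phi\cos^2\theta + \rho\sin^2\phi\sin^2\theta \right] \\
&= -\rho^2\cos^2\phi\sin\phi - \rho^2\sin^2\phi\sin\phi \\
&= -\rho^2\sin\phi.
\end{align*}
The absolute value of the Jacobian is the factor that must be included in the integrand when converting a triple integral from Cartesian to spherical coordinates. □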
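The determinant computed above can be spot-checked at any sample point. The Python sketch below builds the $3 \times 3$ Jacobian matrix with columns ordered as $(\rho, \theta, \phi)$, which is the ordering that produces the negative sign, and compares its determinant with $-\rho^2\sin\phi$; the sample values are arbitrary.

```python
import math

def spherical_jacobian(rho, theta, phi):
    """Jacobian matrix of (x, y, z) with respect to (rho, theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [sp * ct, -rho * sp * st, rho * cp * ct],
        [sp * st,  rho * sp * ct, rho * cp * st],
        [cp,       0.0,          -rho * sp],
    ]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

rho, theta, phi = 2.0, 0.7, 1.1           # arbitrary sample point
J = det3(spherical_jacobian(rho, theta, phi))
expected = -rho**2 * math.sin(phi)
```

Swapping the $\theta$ and $\phi$ columns would flip the sign of the determinant, which is why only its absolute value enters the change of variables formula.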
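Example We evaluate the double integral
\[ \iint_R (x^2 - xy + y^2) \, dA, \]
where $R$ is the region bounded by the ellipse $x^2 - xy + y^2 = 2$, using the change of variables
\[ x = \sqrt{2}\,u - \sqrt{2/3}\,v, \quad y = \sqrt{2}\,u + \sqrt{2/3}\,v. \]
First, we compute the Jacobian of the change of variables,
\[ \frac{\partial(x, y)}{\partial(u, v)} = \det \begin{bmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{bmatrix} = \det \begin{bmatrix} \sqrt{2} & -\sqrt{2/3} \\ \sqrt{2} & \sqrt{2/3} \end{bmatrix} = \sqrt{2}\sqrt{2/3} + \sqrt{2}\sqrt{2/3} = \frac{4}{\sqrt{3}}. \]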
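Next, we need to define the region $R$ in terms of $u$ and $v$. Rewriting the equation $x^2 - xy + y^2 = 2$ in terms of $u$ and $v$ yields the equation $2u^2 + 2v^2 = 2$, or $u^2 + v^2 = 1$. That is, the change of variables maps the unit disk $\tilde{R}$ to $R$, where $\tilde{R} = \{(u, v) \mid u^2 + v^2 \le 1\}$, so that, after converting to polar coordinates in the $uv$-plane,
\[ \iint_R (x^2 - xy + y^2) \, dA = \frac{4}{\sqrt{3}} \iint_{\tilde{R}} (2u^2 + 2v^2) \, du \, dv = \frac{8}{\sqrt{3}} \int_0^{2\pi} \int_0^1 r^3 \, dr \, d\theta. \]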
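Evaluating this integral, we obtain
\begin{align*}
\iint_R (x^2 - xy + y^2) \, dA &= \frac{8}{\sqrt{3}} \int_0^{2\pi} \int_0^1 r^3 \, dr \, d\theta \\
&= \frac{8}{\sqrt{3}} \int_0^{2\pi} \frac{r^4}{4} \Big|_0^1 \, d\theta \\
&= \frac{2}{\sqrt{3}} \int_0^{2\pi} 1 \, d\theta \\
&= \frac{4\pi}{\sqrt{3}}.
\end{align*}
□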
Example We wish to use an appropriate change of variable to evaluate the double integral
\[ \iint_R (x + y) e^{x^2 - y^2} \, dA, \]
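\[ y = \frac{1}{2}(u - v). \]
It follows that
\[ \frac{\partial(x, y)}{\partial(u, v)} = \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial x}{\partial v} \frac{\partial y}{\partial u} = \frac{1}{2} \left( -\frac{1}{2} \right) - \frac{1}{2} \cdot \frac{1}{2} = -\frac{1}{2} \]
and therefore
\begin{align*}
\iint_R (x + y) e^{x^2 - y^2} \, dA &= \int_0^3 \int_0^2 u e^{uv} \left| -\frac{1}{2} \right| dv \, du \\
&= \frac{1}{2} \int_0^3 \int_0^2 u e^{uv} \, dv \, du \\
&= \frac{1}{2} \int_0^3 e^{uv} \Big|_0^2 \, du \\
&= \frac{1}{2} \int_0^3 \left[ e^{2u} - 1 \right] du \\
&= \frac{1}{2} \left( \frac{e^{2u}}{2} - u \right) \Big|_0^3 \\
&= \frac{1}{2} \left[ \frac{e^6}{2} - 3 - \frac{1}{2} \right] \\
&= \frac{1}{4}(e^6 - 7).
\end{align*}
□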
Chapter 3
Vector Calculus
3.1
Vector Fields
x1 x0 +(t1 t0 )V(x0 ).
mM G
r,
krk3
qQ
x,
kxk3
3.2
Line Integrals
This result is obtained by applying the basic formula for work along each of $n$ subintervals of width $\Delta x = (b - a)/n$, and taking the limit as $\Delta x \to 0$.
Now, suppose that a force is applied to an object to move it along a path traced by a curve $C$, instead of moving it along a straight line. If the amount of force that is being applied to the object at any point $p$ on the curve $C$ is given by the value of a function $F(p)$, then the work can be approximated, as before, by applying the basic formula for work to each of $n$ line segments that approximate the curve and have lengths $\Delta s_1, \Delta s_2, \ldots, \Delta s_n$. The work done on the $i$th segment is approximately $F(p_i)\Delta s_i$, where $p_i$ is any point on the segment. By taking the limit as $\max \Delta s_i \to 0$, we obtain the line integral
\[ W = \int_C F(p) \, ds = \lim_{\max \Delta s_i \to 0} \sum_{i=1}^{n} F(p_i) \, \Delta s_i, \]
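\[ y = y(t), \quad a \le t \le b. \]
Then, if we divide $[a, b]$ into subintervals of width $\Delta t = (b - a)/n$, with endpoints $[t_{i-1}, t_i]$ where $t_i = a + i\Delta t$, we can approximate $C$ by $n$ line segments with endpoints $(x(t_{i-1}), y(t_{i-1}))$ and $(x(t_i), y(t_i))$, for $i = 1, 2, \ldots, n$. From the Pythagorean Theorem, it follows that the $i$th segment has length
\[ \Delta s_i = \sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{ \left( \frac{\Delta x_i}{\Delta t} \right)^2 + \left( \frac{\Delta y_i}{\Delta t} \right)^2 } \, \Delta t, \]
where $\Delta x_i = x(t_i) - x(t_{i-1})$ and $\Delta y_i = y(t_i) - y(t_{i-1})$. Letting $\Delta t \to 0$, we obtain
\[ \int_C F(p) \, ds = \int_a^b F(x(t), y(t)) \sqrt{ \left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 } \, dt. \]
We recall that if $F(x, y) \equiv 1$, then this integral yields the arc length of the curve $C$.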
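Example (Stewart, Section 13.2, Exercise 8) To evaluate the line integral
\[ \int_C x^2 z \, ds \]
where $C$ is the line segment from $(0, 6, -1)$ to $(4, 1, 5)$, we first need parametric equations for the line segment. Using the vector between the endpoints,
\[ \mathbf{v} = \langle 4 - 0, 1 - 6, 5 - (-1) \rangle = \langle 4, -5, 6 \rangle, \]
we obtain the parametric equations
\[ x = 4t, \quad y = 6 - 5t, \quad z = -1 + 6t, \quad 0 \le t \le 1. \]
It follows that
\begin{align*}
\int_C x^2 z \, ds &= \int_0^1 (x(t))^2 z(t) \sqrt{[x'(t)]^2 + [y'(t)]^2 + [z'(t)]^2} \, dt \\
&= \int_0^1 (4t)^2 (6t - 1) \sqrt{4^2 + (-5)^2 + 6^2} \, dt \\
&= \int_0^1 16t^2 (6t - 1) \sqrt{77} \, dt \\
&= 16\sqrt{77} \int_0^1 (6t^3 - t^2) \, dt \\
&= 16\sqrt{77} \left( 6\,\frac{t^4}{4} - \frac{t^3}{3} \right) \Big|_0^1 \\
&= 16\sqrt{77} \left( \frac{3}{2} - \frac{1}{3} \right) \\
&= \frac{56\sqrt{77}}{3}.
\end{align*}
□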
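A line integral with respect to arc length reduces to an ordinary integral in $t$ once the curve is parametrized, and that reduction is straightforward to code. The Python sketch below is a minimal midpoint-rule version; the function name `line_integral_scalar` and the step count are assumptions made for illustration.

```python
import math

def line_integral_scalar(f, r, dr, a, b, n=2000):
    """Approximate the integral of f ds along r(t), a <= t <= b.

    r(t) returns the point on the curve; dr(t) returns the derivative r'(t).
    """
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        speed = math.sqrt(sum(c * c for c in dr(t)))   # |r'(t)|
        total += f(*r(t)) * speed
    return total * dt

# Segment from (0, 6, -1) to (4, 1, 5), integrand x^2 z.
r = lambda t: (4 * t, 6 - 5 * t, -1 + 6 * t)
dr = lambda t: (4.0, -5.0, 6.0)
approx = line_integral_scalar(lambda x, y, z: x * x * z, r, dr, 0.0, 1.0)
exact = 56 * math.sqrt(77) / 3
```

Because the speed $\|\mathbf{r}'(t)\|$ is recomputed at each sample, the same function works for curves that are not traversed at constant speed.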
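Example (Stewart, Section 13.2, Exercise 10) We evaluate the line integral
\[ \int_C (2x + 9z) \, ds \]
where $C$ is defined by the parametric equations
\[ x = t, \quad y = t^2, \quad z = t^3, \quad 0 \le t \le 1. \]
We have
\begin{align*}
\int_C (2x + 9z) \, ds &= \int_0^1 (2x(t) + 9z(t)) \sqrt{[x'(t)]^2 + [y'(t)]^2 + [z'(t)]^2} \, dt \\
&= \int_0^1 (2t + 9t^3) \sqrt{1^2 + (2t)^2 + (3t^2)^2} \, dt \\
&= \int_0^1 (2t + 9t^3) \sqrt{1 + 4t^2 + 9t^4} \, dt \\
&= \frac{1}{4} \int_1^{14} u^{1/2} \, du, \quad u = 1 + 4t^2 + 9t^4, \\
&= \frac{1}{4} \cdot \frac{2}{3} u^{3/2} \Big|_1^{14} \\
&= \frac{1}{6} (14^{3/2} - 1).
\end{align*}
□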
Although we have introduced line integrals in the context of computing
work, this approach can be used to integrate any function along a curve. For
example, to compute the mass of a wire that is shaped like a plane curve C,
where the density of the wire is given by a function $\rho(x, y)$ defined at each
If the curve $C$ is parametrized by the vector equation $\mathbf{r}(t) = \langle x(t), y(t) \rangle$, where $a \le t \le b$, then the tangent vector is parametrized by
\[ \mathbf{T}(t) = \mathbf{r}'(t) / \|\mathbf{r}'(t)\|, \]
and, as before, $ds = \sqrt{[x'(t)]^2 + [y'(t)]^2} \, dt = \|\mathbf{r}'(t)\| \, dt$. It follows that
\[ \int_C \mathbf{F} \cdot \mathbf{T} \, ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|} \, \|\mathbf{r}'(t)\| \, dt = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt = \int_C \mathbf{F} \cdot d\mathbf{r}. \]
The last form of the line integral is merely an abbreviation that is used for convenience. As with line integrals of scalar-valued functions, the parametric representation of the curve is necessary for actual evaluation of a line integral.
Example (Stewart, Section 13.2, Exercise 20) We evaluate the line integral
Z
F dr
C
0 t .
We have
Z
F dr =
C
F(r(t)) r0 (t) dt
Z0
hcos t, sin t, ti h1, cos t, sin ti dt
=
0
2
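If we write $\mathbf{F}(x, y) = \langle P(x, y), Q(x, y) \rangle$, where $P$ and $Q$ are the component functions of $\mathbf{F}$, then we have
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt \\
&= \int_a^b \langle P(x(t), y(t)), Q(x(t), y(t)) \rangle \cdot \langle x'(t), y'(t) \rangle \, dt \\
&= \int_a^b P(x(t), y(t)) \, x'(t) \, dt + \int_a^b Q(x(t), y(t)) \, y'(t) \, dt.
\end{aligned}
\]
We write
\[ \int_a^b P(x(t), y(t)) \, x'(t) \, dt = \int_C P \, dx, \qquad \int_a^b Q(x(t), y(t)) \, y'(t) \, dt = \int_C Q \, dy, \]
and conclude
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_C P \, dx + Q \, dy. \]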
These line integrals of scalar-valued functions can be evaluated individually to obtain the line integral of the vector field F over C. However, it is
important to note that unlike line integrals with respect to the arc length s,
the value of line integrals with respect to x or y (or z, in 3-D) depends on the
orientation of C. If the curve is traced in reverse (that is, from the terminal
point to the initial point), then the sign of the line integral is reversed as
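well. We denote by $-C$ the curve $C$ with its orientation reversed. We then
have
\[ \int_{-C} \mathbf{F} \cdot d\mathbf{r} = -\int_C \mathbf{F} \cdot d\mathbf{r}, \]
and
\[ \int_{-C} P \, dx = -\int_C P \, dx, \qquad \int_{-C} Q \, dy = -\int_C Q \, dy. \]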
All of this discussion generalizes to space curves (that is, curves in 3-D) in
a straightforward manner, as illustrated in the examples.
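Example (Stewart, Section 13.2, Exercise 6) Let $\mathbf{F}(x, y) = \langle \sin x, \cos y \rangle$ and
let $C$ be the curve that is the top half of the circle $x^2 + y^2 = 1$, traversed
counterclockwise from $(1, 0)$ to $(-1, 0)$, and the line segment from $(-1, 0)$ to
$(-2, 3)$. To evaluate the line integral
\[ \int_C \mathbf{F} \cdot \mathbf{T} \, ds = \int_C \sin x \, dx + \cos y \, dy, \]
we consider the integrals over the semicircle, denoted by $C_1$, and the line
segment, denoted by $C_2$, separately. We then have
\[ \int_C \sin x \, dx + \cos y \, dy = \int_{C_1} \sin x \, dx + \cos y \, dy + \int_{C_2} \sin x \, dx + \cos y \, dy. \]
For the semicircle, we use the parametric equations
\[ x = \cos t, \quad y = \sin t, \quad 0 \le t \le \pi. \]
This yields
\[
\begin{aligned}
\int_{C_1} \sin x \, dx + \cos y \, dy &= \int_0^\pi \sin(\cos t)(-\sin t) \, dt + \int_0^\pi \cos(\sin t) \cos t \, dt \\
&= -\cos(\cos t) \Big|_0^\pi + \sin(\sin t) \Big|_0^\pi \\
&= -\cos(-1) + \cos(1) \\
&= 0.
\end{aligned}
\]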
For the line segment, we use the parametric equations
\[ x = -1 - t, \quad y = 3t, \quad 0 \le t \le 1. \]
This yields
\[ \int_{C_2} \sin x \, dx + \cos y \, dy = \int_0^1 \sin(-1 - t)(-1) \, dt + \int_0^1 \cos(3t)(3) \, dt = \cos 1 - \cos 2 + \sin 3 \]
from the Fundamental Theorem of Calculus and the Chain Rule. However,
this shortcut can only be applied when an integral involves only one of the
independent variables. □
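Example (Stewart, Section 13.2, Exercise 12) We evaluate the line integral
\[ \int_C \mathbf{F} \cdot d\mathbf{r}, \]
where
\[ \mathbf{F}(x, y, z) = \langle P(x, y, z), Q(x, y, z), R(x, y, z) \rangle = \langle z, x, y \rangle, \]
and $C$ is defined by the parametric equations
\[ x = t^2, \quad y = t^3, \quad z = t^2, \quad 0 \le t \le 1. \]
We have
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_C P \, dx + Q \, dy + R \, dz \\
&= \int_0^1 \left[ z(t) \, x'(t) + x(t) \, y'(t) + y(t) \, z'(t) \right] dt \\
&= \int_0^1 \left[ t^2 (2t) + t^2 (3t^2) + t^3 (2t) \right] dt \\
&= \int_0^1 (5t^4 + 2t^3) \, dt \\
&= \left[ 5 \frac{t^5}{5} + 2 \frac{t^4}{4} \right]_0^1 \\
&= \frac{3}{2}.
\end{aligned}
\]
□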
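The formula $\int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt$ translates directly into a short numerical routine. The sketch below is illustrative only: the function names are hypothetical, Simpson's rule is an arbitrary quadrature choice, and the data are those of the preceding example with $\mathbf{F} = \langle z, x, y \rangle$.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for g on [a, b] (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def vector_line_integral(F, r, dr, a, b, n=1000):
    """Approximate the integral of F . dr along r(t), a <= t <= b."""
    def integrand(t):
        # Dot product F(r(t)) . r'(t)
        return sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t)))
    return simpson(integrand, a, b, n)

F = lambda x, y, z: (z, x, y)
r = lambda t: (t**2, t**3, t**2)
dr = lambda t: (2 * t, 3 * t**2, 2 * t)

work = vector_line_integral(F, r, dr, 0.0, 1.0)
print(work)  # close to 3/2
```

Only the parametrization and its derivative are needed; the curve itself is never described in any other way, which mirrors the remark that the parametric representation is necessary for actual evaluation.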
3.3
We have learned that the line integral of a vector field $\mathbf{F}$ over a
piecewise smooth curve $C$ that is parameterized by a vector-valued function $\mathbf{r}(t)$,
$a \le t \le b$, is given by
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt. \]
Now, suppose that $\mathbf{F}$ is continuous and is a conservative vector field; that is,
$\mathbf{F} = \nabla f$ for some scalar-valued function $f$. Then, by the Chain Rule, we
have
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \nabla f(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt = \int_a^b \frac{d}{dt} \left[ (f \circ \mathbf{r})(t) \right] dt = (f \circ \mathbf{r})(t) \Big|_a^b = f(\mathbf{r}(b)) - f(\mathbf{r}(a)). \]
This is the Fundamental Theorem of Line Integrals, which is a generalization
of the Fundamental Theorem of Calculus.
If the curve $C$ is a closed curve; that is, the initial and terminal points
of $C$ are the same, then $\mathbf{r}(b) = \mathbf{r}(a)$, which yields
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = f(\mathbf{r}(b)) - f(\mathbf{r}(a)) = 0. \]
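If we decompose $C$ into two curves $C_1$ and $C_2$, and use the fact that the sign
of the line integral of a vector field over a curve depends on the orientation
of the curve, then we have
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} + \int_{C_2} \mathbf{F} \cdot d\mathbf{r} = \int_{C_1} \mathbf{F} \cdot d\mathbf{r} - \int_{-C_2} \mathbf{F} \cdot d\mathbf{r} = 0. \]
That is,
\[ \int_{C_1} \mathbf{F} \cdot d\mathbf{r} = \int_{-C_2} \mathbf{F} \cdot d\mathbf{r}. \]
However, $C_1$ and $-C_2$ have the same initial and terminal points. It follows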
that if F is conservative within an open, connected domain D (so that any
two points in D can be connected by a path that lies within D), then the
line integral of F is independent of path in D; that is, the value of the line
integral of F over a path C depends only on its initial and terminal points.
The converse of this statement is also true: if the line integral of a
vector field F is independent of path within an open, connected domain
D, then F is a conservative vector field on D. To see this, we consider
the two-variable case and let $D$ be a region in $\mathbb{R}^2$. Furthermore, we let
$\mathbf{F}(x, y) = \langle P(x, y), Q(x, y) \rangle$. We choose an arbitrary point $(a, b) \in D$, and
define
\[ f(x, y) = \int_{(a,b)}^{(x,y)} \mathbf{F} \cdot d\mathbf{r}. \]
Since this line integral is independent of path, we can define f (x, y) using
any path between (a, b) and (x, y) that we choose, knowing that its value at
(x, y) will be the same in any case.
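By choosing a path that ends with a horizontal line segment from $(x_1, y)$
to $(x, y)$ contained entirely in $D$, parametrized by $x = t$, $y = y$, for $x_1 \le t \le x$, we can show that
\[
\begin{aligned}
\frac{\partial f}{\partial x}(x, y) &= \frac{\partial}{\partial x} \left[ \int_{(a,b)}^{(x_1,y)} \mathbf{F} \cdot d\mathbf{r} \right] + \frac{\partial}{\partial x} \left[ \int_{(x_1,y)}^{(x,y)} \mathbf{F} \cdot d\mathbf{r} \right] \\
&= 0 + \frac{\partial}{\partial x} \int_{x_1}^{x} \left[ P(x(t), y) \, x'(t) + Q(x(t), y) \, y'(t) \right] dt \\
&= \frac{\partial}{\partial x} \int_{x_1}^{x} P(t, y) \, dt + 0 \\
&= P(x, y).
\end{aligned}
\]
Using a similar argument, we can show that $\partial f / \partial y = Q$. We have thus
shown that $\mathbf{F}$ is conservative, and conclude that $\mathbf{F}$ is a conservative vector
field if and only if its line integral is independent of path.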
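\[ \frac{\partial f}{\partial y} = Q, \qquad \frac{\partial Q}{\partial x} = \frac{\partial^2 f}{\partial x \, \partial y}. \]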
A similar procedure can be used for a vector field defined on R3 , except that
the function g depends on both y and z, and differentiation with respect to
both $y$ and $z$ is needed to completely define the function $f(x, y, z)$ such that
$\nabla f = \mathbf{F}$.
Example (Stewart, Section 13.3, Exercise 14) Let
\[ \mathbf{F}(x, y, z) = \langle P(x, y, z), Q(x, y, z), R(x, y, z) \rangle = \langle 2xz + y^2, 2xy, x^2 + 3z^2 \rangle. \]
To confirm that $\mathbf{F}$ is conservative, we check the appropriate first partial
derivatives of $P$, $Q$ and $R$:
\[ P_y = 2y = Q_x, \qquad P_z = 2x = R_x, \qquad Q_z = 0 = R_y. \]
\[ g_z(y, z) = 3z^2, \]
which yields
\[ g(y, z) = z^3 + K \]
\[ y = t + 1, \quad z = 2t - 1, \quad 0 \le t \le 1, \]
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= f(1, 2, 1) - f(0, 1, -1) \\
&= 1^2(1) + 2^2(1) + 1^3 + K - \left( 0^2(-1) + 1^2(0) + (-1)^3 + K \right) \\
&= 1 + 4 + 1 + K - (0 + 0 - 1 + K) \\
&= 7.
\end{aligned}
\]
□
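Path independence for this conservative field can be observed numerically. In the sketch below, the potential $f(x, y, z) = x^2 z + x y^2 + z^3$ is read off from the evaluation in the example; the $x$-component of the curve is not shown in the text above, so $x = t^2$ is an assumption made only for illustration, and the straight segment is a second, arbitrarily chosen path between the same endpoints $(0, 1, -1)$ and $(1, 2, 1)$. The helper names are hypothetical.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for g on [a, b] (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def F(x, y, z):
    """The conservative field <2xz + y^2, 2xy, x^2 + 3z^2> from the example."""
    return (2 * x * z + y * y, 2 * x * y, x * x + 3 * z * z)

def f(x, y, z):
    """Potential function (with K = 0), so that grad f = F."""
    return x * x * z + x * y * y + z**3

def work(r, dr):
    """Line integral of F along r(t), 0 <= t <= 1."""
    def integrand(t):
        return sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t)))
    return simpson(integrand, 0.0, 1.0)

# Two different paths from (0, 1, -1) to (1, 2, 1).
curved = work(lambda t: (t * t, t + 1, 2 * t - 1), lambda t: (2 * t, 1.0, 2.0))
straight = work(lambda t: (t, t + 1, 2 * t - 1), lambda t: (1.0, 1.0, 2.0))
potential_difference = f(1, 2, 1) - f(0, 1, -1)

print(curved, straight, potential_difference)  # all three should be 7 up to rounding
```

Both quadratures agree with the difference of potential values, as the Fundamental Theorem of Line Integrals predicts.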
Let $\mathbf{F}$ represent a force field. Then, recall that the work done by the
force field to move an object along a path $\mathbf{r}(t)$, $a \le t \le b$, is given by the
line integral
\[ W = \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt. \]
where $A = \mathbf{r}(a)$ and $B = \mathbf{r}(b)$ are the initial and terminal points, respectively, and
\[ K(P) = \frac{1}{2} m \|\mathbf{v}(t)\|^2, \quad \mathbf{r}(t) = P, \]
is the kinetic energy of the object at the point $P$. That is, the work done
by the force field along $C$ is the change in the kinetic energy from point $A$
to point $B$.
If $\mathbf{F}$ is also a conservative force field, then $\mathbf{F} = -\nabla P$, where $P$ is the
potential energy. It follows from the Fundamental Theorem of Line Integrals
that
\[ W = \int_C \mathbf{F} \cdot d\mathbf{r} = -\int_C \nabla P \cdot d\mathbf{r} = -[P(B) - P(A)]. \]
We conclude that
\[ P(A) + K(A) = P(B) + K(B). \]
That is, when an object is moved by a conservative force field, then its
total energy remains constant. This is known as the Law of Conservation
of Energy.
3.4 Green's Theorem
We have learned that if a vector field is conservative, then its line integral
over a closed curve C is equal to zero. However, if this is not the case, then
evaluation of a line integral using the formula
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt \]
and
\[ D = \{ (x, y) \mid c \le y \le d, \ h_1(y) \le x \le h_2(y) \}. \]
Using the first definition, we have $C = C_1 \cup C_2 \cup (-C_3) \cup (-C_4)$, where:
• $C_1$ is the curve with parameterization $x = t$, $y = g_1(t)$, for $a \le t \le b$
• $C_2$ is the vertical line segment with parameterization $x = b$, $y = t$, for $g_1(b) \le t \le g_2(b)$
• $C_3$ is the curve with parameterization $x = t$, $y = g_2(t)$, for $a \le t \le b$
• $C_4$ is the vertical line segment with parameterization $x = a$, $y = t$, for $g_1(a) \le t \le g_2(a)$
We use positive orientation to describe the curve $C$, which means that the
curve is traversed counterclockwise. This means that as the curve is traversed, the region $D$ is on the left.
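In view of
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_C P \, dx + Q \, dy, \]
we have
\[
\begin{aligned}
\int_C P \, dx &= \int_{C_1} P \, dx + \int_{C_2} P \, dx + \int_{-C_3} P \, dx + \int_{-C_4} P \, dx \\
&= \int_{C_1} P \, dx + \int_{C_2} P \, dx - \int_{C_3} P \, dx - \int_{C_4} P \, dx \\
&= \int_a^b P(t, g_1(t))(1) \, dt + \int_{g_1(b)}^{g_2(b)} P(b, t)(0) \, dt - \int_a^b P(t, g_2(t))(1) \, dt - \int_{g_1(a)}^{g_2(a)} P(a, t)(0) \, dt \\
&= -\int_a^b \left[ P(t, g_2(t)) - P(t, g_1(t)) \right] dt \\
&= -\int_a^b \int_{g_1(t)}^{g_2(t)} P_y(t, y) \, dy \, dt \\
&= -\iint_D \frac{\partial P}{\partial y} \, dA.
\end{aligned}
\]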
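\[ \int_C P \, dx + Q \, dy = \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA. \]
Another common statement of the theorem is
\[ \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \int_{\partial D} P \, dx + Q \, dy, \]
where $\partial D$ denotes the positively oriented boundary of $D$.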
This theorem can be used to find a simpler approach to evaluating a
line integral of the vector field hP, Qi over C by converting the integral to
a double integral over D, or it can be used to find a simpler approach to
evaluating a double integral over a region D by converting it into an integral
over its boundary.
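To show that Green's Theorem applies for more general regions than
those that are of both type I and type II, we consider a region $D$ that is
the union of two regions $D_1$ and $D_2$ that are of both type I and type II. Let
$C$ be the positively oriented boundary of $D$, let $D_1$ have positively oriented
boundary $C_1 \cup C_3$, and let $D_2$ have positively oriented boundary $C_2 \cup (-C_3)$,
where $C_3$ is the boundary between $D_1$ and $D_2$. Then, $C = C_1 \cup C_2$. It follows
that for functions $P$ and $Q$ that satisfy the assumptions of Green's Theorem
on $D$, we can apply the theorem to $D_1$ and $D_2$ individually to obtain
\[
\begin{aligned}
\iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA &= \iint_{D_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{D_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \int_{C_1 \cup C_3} P \, dx + Q \, dy + \int_{C_2 \cup (-C_3)} P \, dx + Q \, dy \\
&= \int_{C_1} P \, dx + Q \, dy + \int_{C_3} P \, dx + Q \, dy + \int_{C_2} P \, dx + Q \, dy + \int_{-C_3} P \, dx + Q \, dy \\
&= \int_{C_1} P \, dx + Q \, dy + \int_{C_2} P \, dx + Q \, dy + \int_{C_3} P \, dx + Q \, dy - \int_{C_3} P \, dx + Q \, dy \\
&= \int_{C_1} P \, dx + Q \, dy + \int_{C_2} P \, dx + Q \, dy \\
&= \int_{C_1 \cup C_2} P \, dx + Q \, dy \\
&= \int_C P \, dx + Q \, dy.
\end{aligned}
\]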
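\[
\begin{aligned}
\iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA &= \iint_{D'} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{D''} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \int_{C_1} P \, dx + Q \, dy + \int_{C_2} P \, dx + Q \, dy \\
&= \int_{C_1 \cup C_2} P \, dx + Q \, dy.
\end{aligned}
\]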
\[ \mathbf{F}(x, y) = \left\langle \frac{-y}{x^2 + y^2}, \frac{x}{x^2 + y^2} \right\rangle \]
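\[ \int_C \mathbf{F} \cdot d\mathbf{r} + \int_{-C'} \mathbf{F} \cdot d\mathbf{r} = \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0, \]
since $\mathbf{F}$ is conservative on $D$. It follows that
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = -\int_{-C'} \mathbf{F} \cdot d\mathbf{r} = \int_{C'} \mathbf{F} \cdot d\mathbf{r}, \]
so we can compute the line integral of $\mathbf{F}$ over $C$, which we have not specified,
by computing the line integral over the circle $C'$, which can be parameterized
by $x = a \cos t$, $y = a \sin t$, for $0 \le t \le 2\pi$. This yields
\[
\begin{aligned}
\int_{C'} \mathbf{F} \cdot d\mathbf{r} &= \int_0^{2\pi} P(x(t), y(t)) \, x'(t) \, dt + Q(x(t), y(t)) \, y'(t) \, dt \\
&= \int_0^{2\pi} \frac{-a \sin t}{(a \cos t)^2 + (a \sin t)^2} (-a \sin t) \, dt + \frac{a \cos t}{(a \cos t)^2 + (a \sin t)^2} (a \cos t) \, dt \\
&= \int_0^{2\pi} \frac{a^2 \sin^2 t}{a^2 \cos^2 t + a^2 \sin^2 t} \, dt + \frac{a^2 \cos^2 t}{a^2 \cos^2 t + a^2 \sin^2 t} \, dt \\
&= \int_0^{2\pi} 1 \, dt \\
&= 2\pi.
\end{aligned}
\]
We conclude that the line integral of $\mathbf{F}$ over any positively oriented, piecewise
smooth, simple closed curve that encloses the origin is equal to $2\pi$. □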
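The conclusion that the value $2\pi$ does not depend on the particular curve can be illustrated numerically. The sketch below assumes the field $\mathbf{F} = \langle -y/(x^2+y^2), \, x/(x^2+y^2) \rangle$ that appears in the computation above, and compares a small circle with an ellipse; both curves, the helper names, and the quadrature rule are choices made here for illustration only.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for g on [a, b] (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def circulation(x, y, dx, dy):
    """Integral of P dx + Q dy over a closed curve (x(t), y(t)), 0 <= t <= 2*pi."""
    def integrand(t):
        X, Y = x(t), y(t)
        r2 = X * X + Y * Y
        return (-Y / r2) * dx(t) + (X / r2) * dy(t)
    return simpson(integrand, 0.0, 2 * math.pi)

# Circle of radius 1/2 and an ellipse with semi-axes 3 and 2, both counterclockwise.
circle = circulation(lambda t: 0.5 * math.cos(t), lambda t: 0.5 * math.sin(t),
                     lambda t: -0.5 * math.sin(t), lambda t: 0.5 * math.cos(t))
ellipse = circulation(lambda t: 3 * math.cos(t), lambda t: 2 * math.sin(t),
                      lambda t: -3 * math.sin(t), lambda t: 2 * math.cos(t))

print(circle, ellipse, 2 * math.pi)
```

Both results agree with $2\pi$ even though the curves differ, because each encloses the origin exactly once with positive orientation.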
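Example Consider an $n$-sided polygon $P$ with vertices $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. The area $A$ of the polygon is given by the double integral
\[ A = \iint_P 1 \, dA. \]
Let $Q(x, y) = \frac{x}{2}$ and $P(x, y) = -\frac{y}{2}$, so that $Q_x - P_y = 1$.
It follows from Green's Theorem that if $\partial P$ is positively oriented, then
\[ A = \int_{\partial P} Q \, dy + P \, dx = \frac{1}{2} \int_{\partial P} x \, dy - y \, dx. \]
To evaluate this line integral, we consider each edge of $P$ individually. Let
$C$ be the line segment from $(x_1, y_1)$ to $(x_2, y_2)$, and assume, for convenience,
that $C$ is not vertical. Then $C$ can be parameterized by $x = t$, $y = mt + b$,
for $x_1 \le t \le x_2$, where
\[ m = \frac{y_2 - y_1}{x_2 - x_1}, \qquad b = y_1 - m x_1. \]
We then have
\[
\begin{aligned}
\int_C x \, dy - y \, dx &= \int_{x_1}^{x_2} mt \, dt - (mt + b) \, dt \\
&= -\int_{x_1}^{x_2} b \, dt \\
&= b(x_1 - x_2) \\
&= y_1 (x_1 - x_2) - m x_1 (x_1 - x_2) \\
&= y_1 (x_1 - x_2) + (y_2 - y_1) x_1 \\
&= x_1 y_2 - x_2 y_1.
\end{aligned}
\]
We conclude that
\[ A = \frac{1}{2} \left[ (x_1 y_2 - x_2 y_1) + (x_2 y_3 - x_3 y_2) + \cdots + (x_{n-1} y_n - x_n y_{n-1}) + (x_n y_1 - x_1 y_n) \right]. \]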
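The final formula is straightforward to implement. The following sketch, with a hypothetical function name `polygon_area`, sums the edge contributions $x_i y_{i+1} - x_{i+1} y_i$ and assumes the vertices of a simple polygon are listed in order around its boundary.

```python
def polygon_area(vertices):
    """Signed area of a simple polygon: positive if the vertices are listed
    counterclockwise (positive orientation), negative if clockwise."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around: the last edge returns to the first vertex
        total += x1 * y2 - x2 * y1      # contribution of one edge
    return total / 2.0

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (0, 3)]

print(polygon_area(square))          # 1.0
print(polygon_area(triangle))        # 6.0
print(polygon_area(square[::-1]))    # -1.0 (orientation reversed)
```

The sign change for the reversed square reflects the dependence of line integrals with respect to $x$ and $y$ on the orientation of the curve.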
3.5
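\[ P_z - R_x = 0, \qquad Q_x - P_y = 0. \]
\[ \nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle. \]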
Unlike the curl, the divergence is defined for vector fields with any number of
variables, as long as the number of independent and the number of dependent
variables are the same.
It can be verified directly that if F is the curl of a vector field G, then
div F = 0. That is, the divergence of any curl is zero, as long as G has
continuous second partial derivatives. This is useful for determining whether
a given vector field F is the curl of any other vector field G, for if it is, its
divergence must be zero.
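Example (Stewart, Section 13.5, Exercise 18) The vector field $\mathbf{F}(x, y, z) = \langle yz, xyz, xy \rangle$ is not the curl of any vector field $\mathbf{G}$, because
\[ \operatorname{div} \mathbf{F} = (yz)_x + (xyz)_y + (xy)_z = 0 + xz + 0 = xz, \]
whereas if $\mathbf{F} = \operatorname{curl} \mathbf{G}$, then
\[ \operatorname{div} \mathbf{F} = \operatorname{div} \operatorname{curl} \mathbf{G} = 0. \]
□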
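The divergence in this example can also be estimated numerically, which gives a quick test of whether a field could be a curl. The sketch below uses central differences; the function name `divergence`, the step size, and the sample point $(2, 3, 5)$ are arbitrary choices made for illustration.

```python
def divergence(F, p, h=1e-5):
    """Central-difference estimate of div F at the point p = (x, y, z)."""
    total = 0.0
    for i in range(3):
        fwd = list(p)
        bwd = list(p)
        fwd[i] += h
        bwd[i] -= h
        # Partial derivative of the i-th component with respect to the i-th variable
        total += (F(*fwd)[i] - F(*bwd)[i]) / (2 * h)
    return total

F = lambda x, y, z: (y * z, x * y * z, x * y)
p = (2.0, 3.0, 5.0)

div_F = divergence(F, p)
print(div_F, p[0] * p[2])  # the estimate should be close to x*z = 10
```

Since the estimate is clearly nonzero at this point, the field cannot be the curl of a vector field with continuous second partial derivatives.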
If F represents the velocity field of a fluid, then, at each point within the
fluid, div F measures the tendency of the fluid to diverge away from that
point. Specifically, the divergence is the rate of change, with respect to time,
of the density of the fluid. Therefore, if div F = 0, then we say that F, and
therefore the fluid as well, is incompressible.
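The divergence of a gradient is
\[ \operatorname{div}(\nabla f) = \nabla \cdot \nabla f = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \cdot \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \]
We denote this expression $\nabla \cdot \nabla f$ by $\nabla^2 f$, or $\Delta f$, which is called the Laplacian of $f$. The operator $\nabla^2$ is called the Laplace operator. Its name comes
from Laplace's equation
\[ \Delta f = 0. \]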
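The curl and divergence can be used to restate Green's Theorem in
forms that are more directly generalizable to surfaces and solids in $\mathbb{R}^3$. Let
$\mathbf{F} = \langle P, Q, 0 \rangle$, the embedding of a two-dimensional vector field in $\mathbb{R}^3$. Then
\[ \operatorname{curl} \mathbf{F} = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k}, \]
where, as before, $\mathbf{k} = \langle 0, 0, 1 \rangle$. It follows that
\[ \operatorname{curl} \mathbf{F} \cdot \mathbf{k} = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \cdot \mathbf{k} = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}. \]
This expression is called the scalar curl of the two-dimensional vector field
$\langle P, Q \rangle$. We conclude that Green's Theorem can be rewritten as
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \iint_D (\operatorname{curl} \mathbf{F}) \cdot \mathbf{k} \, dA. \]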
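\[
\begin{aligned}
\iint_D \operatorname{div} \mathbf{F} \, dA &= \iint_D \left( \frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} \right) dA \\
&= \int_C P \, dy - Q \, dx \\
&= \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{n}(t) \, \|\mathbf{r}'(t)\| \, dt,
\end{aligned}
\]
where
\[ \mathbf{n}(t) = \frac{1}{\|\mathbf{r}'(t)\|} \langle y'(t), -x'(t) \rangle \]
is the outward unit normal vector to the curve $C$. Note that $\mathbf{n} \cdot \mathbf{T} = 0$, where
$\mathbf{T}$ is the unit tangent vector
\[ \mathbf{T}(t) = \frac{1}{\|\mathbf{r}'(t)\|} \langle x'(t), y'(t) \rangle. \]
\[ \int_C \mathbf{F} \cdot \mathbf{n} \, ds = \iint_D \operatorname{div} \mathbf{F} \, dA. \]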
3.6
We have learned that Green's Theorem can be used to relate a line integral
of a two-dimensional vector field F over a closed plane curve C to a double
integral of a component of curl F over the region D that is enclosed by
C. Our goal is to generalize this result in such a way as to relate the line
integral of a three-dimensional vector field F over a closed space curve C to
the integral of a component of curl F over a surface enclosed by C.
We have also learned that Green's Theorem relates the integral of the
normal component of a two-dimensional vector field over a closed curve C to
the double integral of div F over the region D that it encloses. We wish to
generalize this result in order to relate the integral of the normal component
of a three-dimensional vector field F over a closed surface S to the triple
integral of div F over the solid E contained within S.
In order to realize either of these generalizations, we need to be able to integrate functions over piecewise smooth surfaces, just as we now know how to
integrate functions over piecewise smooth curves. Whereas a smooth curve
C, being a curved one-dimensional entity, is most conveniently described by
a parameterization $\mathbf{r}(t)$, where $a \le t \le b$ and $\mathbf{r}(t)$ is a differentiable function
of one variable, a smooth surface $S$, being a curved two-dimensional entity,
is most conveniently described by a parametrization $\mathbf{r}(u, v)$, where $(u, v)$ lies
within a 2-D region, and $\mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle$ is a differentiable
function of two variables. We say that $S$ is a parametric surface, and
\[ x = x(u, v), \quad y = y(u, v), \quad z = z(u, v) \]
x(u, v) =
1+
\[ x = x, \quad y = \frac{\sqrt{x}}{2} \cos \theta, \quad z = \frac{\sqrt{x}}{2} \sin \theta, \]
where $0 \le \theta \le 2\pi$, since for each $x$, a point $(x, y, z)$ on the paraboloid must
lie on a circle centered at $(x, 0, 0)$ with radius $\sqrt{x/4}$, parallel to the $yz$-plane.
This is an example of a surface of revolution, since the surface is obtained
by revolving the curve $y = f(x)$ around the $x$-axis. □
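Let $P_0 = (x_0, y_0, z_0) = \mathbf{r}(u_0, v_0)$ be a point on a parametric surface $S$. A
curve defined by $\mathbf{g}(v) = \mathbf{r}(u_0, v)$ that lies within $S$ and passes through $P_0$
has the tangent vector
\[ \mathbf{r}_v = \mathbf{g}'(v_0) = \left\langle \frac{\partial x}{\partial v}(u_0, v_0), \frac{\partial y}{\partial v}(u_0, v_0), \frac{\partial z}{\partial v}(u_0, v_0) \right\rangle \]
at $P_0$. Similarly, the tangent vector at $P_0$ of the curve $\mathbf{h}(u) = \mathbf{r}(u, v_0)$, that
also lies within $S$ and passes through $P_0$, is
\[ \mathbf{r}_u = \mathbf{h}'(u_0) = \left\langle \frac{\partial x}{\partial u}(u_0, v_0), \frac{\partial y}{\partial u}(u_0, v_0), \frac{\partial z}{\partial u}(u_0, v_0) \right\rangle. \]
If these vectors are not parallel, then together they define the tangent plane
of $S$ at $P_0$. Its normal vector is
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle a, b, c \rangle, \]
which yields the equation
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \]
of the tangent plane.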
Example (Stewart, Section 13.6, Exercise 30) Consider the surface defined
by the parametric equations
\[ x = u^2, \quad y = v^2, \quad z = uv, \quad 0 \le u, v \le 10. \]
\[ x_u = 2u, \quad y_u = 0, \quad z_u = v, \]
\[ x_v = 0, \quad y_v = 2v, \quad z_v = u. \]
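Then, $\mathbf{r}$ approximately maps the rectangle $R_{ij}$ with lower left corner
$(u_{i-1}, v_{j-1})$ into a parallelogram with adjacent edges defined by the vectors
\[ \mathbf{r}(u_i, v_{j-1}) - \mathbf{r}(u_{i-1}, v_{j-1}) \approx \mathbf{r}_u \, \Delta u \]
and
\[ \mathbf{r}(u_{i-1}, v_j) - \mathbf{r}(u_{i-1}, v_{j-1}) \approx \mathbf{r}_v \, \Delta v. \]
The area of this parallelogram is
\[ \Delta A_{ij} = \|\mathbf{r}_u \times \mathbf{r}_v\| \, \Delta u \, \Delta v. \]
Adding all of these areas approximates the area of $S$, which we denote by
$A(S)$. If we let $m, n \to \infty$, we obtain
\[ A(S) = \lim_{m,n \to \infty} \sum_{i=1}^{n} \sum_{j=1}^{m} \Delta A_{ij} = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\| \, dA. \]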
Example (Stewart, Section 13.6, Exercise 34) We wish to find the area of
the surface $S$ that is the part of the plane $2x + 5y + z = 10$ that lies inside
the cylinder $x^2 + y^2 = 9$. First, we must find parametric equations for this
surface. Because $x$ and $y$ are restricted to the circle of radius 3 centered at
the origin, it makes sense to use polar coordinates for $x$ and $y$. We then
have the parametric equations
\[ x = u \cos v, \quad y = u \sin v, \quad z = 10 - 2u \cos v - 5u \sin v, \quad 0 \le u \le 3, \quad 0 \le v \le 2\pi. \]
It follows that
\[ A(S) = \int_0^{2\pi} \int_0^3 u \sqrt{30} \, du \, dv = 2\pi \sqrt{30} \int_0^3 u \, du = 9\pi\sqrt{30}. \]
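The area formula $A(S) = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\| \, dA$ can be approximated in the same way it was derived, by summing parallelogram areas over a grid of sub-rectangles. The sketch below is illustrative only: the function names are hypothetical, the partial derivatives are estimated by central differences rather than computed by hand, and the grid size is an arbitrary choice. It is applied to the portion of the plane $2x + 5y + z = 10$ inside the cylinder $x^2 + y^2 = 9$.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def surface_area(r, u0, u1, v0, v1, n=100, h=1e-6):
    """Midpoint-rule approximation of the area of the surface r(u, v)."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Central-difference estimates of the tangent vectors r_u and r_v
            ru = [(p - m) / (2 * h) for p, m in zip(r(u + h, v), r(u - h, v))]
            rv = [(p - m) / (2 * h) for p, m in zip(r(u, v + h), r(u, v - h))]
            c = cross(ru, rv)
            total += math.sqrt(sum(x * x for x in c)) * du * dv
    return total

# Part of the plane 2x + 5y + z = 10 inside the cylinder x^2 + y^2 = 9.
r = lambda u, v: (u * math.cos(v), u * math.sin(v),
                  10 - 2 * u * math.cos(v) - 5 * u * math.sin(v))

area = surface_area(r, 0.0, 3.0, 0.0, 2 * math.pi)
exact = 9 * math.pi * math.sqrt(30)
print(area, exact)
```

Because $\|\mathbf{r}_u \times \mathbf{r}_v\| = u\sqrt{30}$ is linear in $u$ and independent of $v$, the midpoint rule is essentially exact here; for a general surface the grid would need to be refined.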
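\[ x = u, \quad y = v, \quad z = f(u, v), \quad (u, v) \in D. \]
It follows that
\[ \mathbf{r}_u = \langle 1, 0, f_u \rangle, \qquad \mathbf{r}_v = \langle 0, 1, f_v \rangle. \]
We then have $\mathbf{r}_v \times \mathbf{r}_u = \langle f_u, f_v, -1 \rangle$, which yields the equation of the tangent
plane
\[ \frac{\partial f}{\partial u}(u_0, v_0)(x - x_0) + \frac{\partial f}{\partial v}(u_0, v_0)(y - y_0) = z - z_0, \]
which, using the relations $x = u$ and $y = v$, can be rewritten as
\[ \frac{\partial f}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial f}{\partial y}(x_0, y_0)(y - y_0) = z - z_0. \]
Recall that this is the equation of the tangent plane of a surface defined by
an equation of the form $z = f(x, y)$ that had been previously defined. It
follows that the area of such a surface is given by the double integral
\[ A(S) = \iint_D \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2} \, dA. \]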
Example (Stewart, Section 13.6, Exercise 38) To find the area $A(S)$ of the
surface $z = 1 + 3x + 2y^2$ that lies above the triangle with vertices $(0, 0)$,
$(0, 1)$ and $(2, 1)$, we compute
\[ \frac{\partial z}{\partial x} = 3, \qquad \frac{\partial z}{\partial y} = 4y, \]
□
A surface of revolution $S$ that is obtained by revolving the curve $y = f(x)$, $a \le x \le b$, around the $x$-axis has parametric equations
\[ x = u, \quad y = f(u) \cos v, \quad z = f(u) \sin v, \]
\[ A(S) = 2\pi \int_a^b |f(u)| \sqrt{1 + [f'(u)]^2} \, du. \]
3.7 Surface Integrals
3.7.1 Surface Integrals of Scalar-Valued Functions
It follows that the integral of $f(x, y, z) \equiv 1$ along $C$ is equal to the arc length
of $C$.
We now define integrals of scalar-valued functions on surfaces in an analogous manner. Recall that the area of a smooth surface $S$, parametrized by
$\mathbf{r}(u, v) = \langle x(u, v), y(u, v), z(u, v) \rangle$ for $(u, v) \in D$, is given by the integral
\[ A(S) = \iint_D \|\mathbf{r}_u \times \mathbf{r}_v\| \, du \, dv. \]
To integrate a scalar-valued function $f(x, y, z)$ over $S$, we assume for simplicity that $D$ is a rectangle, and divide it into sub-rectangles $\{R_{ij}\}$ of
dimension $\Delta u$ and $\Delta v$, as we did when we derived the formula for $A(S)$.
Then, the function $\mathbf{r}$ maps each sub-rectangle $R_{ij}$ into a surface patch $S_{ij}$
that has area $\Delta S_{ij}$. This area is then multiplied by $f(P_{ij})$, where $P_{ij}$ is any
point on $S_{ij}$.
Letting $\Delta u, \Delta v \to 0$, we obtain the surface integral of $f$ over $S$ to be
\[ \iint_S f(x, y, z) \, dS = \lim_{\Delta u, \Delta v \to 0} \sum_{i=1}^{n} \sum_{j=1}^{m} f(P_{ij}) \, \Delta S_{ij} = \iint_D f(\mathbf{r}(u, v)) \, \|\mathbf{r}_u \times \mathbf{r}_v\| \, du \, dv, \]
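\[ 0 \le u \le 1, \quad 0 \le v \le \pi. \]
Then we have
\[ \mathbf{r}_u = \langle \cos v, \sin v, 0 \rangle, \qquad \mathbf{r}_v = \langle -u \sin v, u \cos v, 1 \rangle, \]
which yields
\[ \|\mathbf{r}_u \times \mathbf{r}_v\| = \|\langle \sin v, -\cos v, u \rangle\| = \sqrt{\sin^2 v + \cos^2 v + u^2} = \sqrt{1 + u^2}. \]
It follows that
\[
\begin{aligned}
\iint_S \sqrt{1 + x^2 + y^2} \, dS &= \int_0^1 \int_0^\pi \sqrt{1 + (u \cos v)^2 + (u \sin v)^2} \, \|\mathbf{r}_u \times \mathbf{r}_v\| \, dv \, du \\
&= \int_0^1 \int_0^\pi \sqrt{1 + u^2} \sqrt{1 + u^2} \, dv \, du \\
&= \int_0^1 \int_0^\pi (1 + u^2) \, dv \, du \\
&= \pi \int_0^1 (1 + u^2) \, du \\
&= \pi \left[ u + \frac{u^3}{3} \right]_0^1 \\
&= \frac{4\pi}{3}.
\end{aligned}
\]
□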
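The surface integral of a scalar-valued function is useful for computing
the mass and center of mass of a thin sheet. If the sheet is shaped like a
surface $S$, and it has density $\rho(x, y, z)$, then the mass is given by the surface
integral
\[ m = \iint_S \rho(x, y, z) \, dS, \]
and the center of mass is $(\bar{x}, \bar{y}, \bar{z})$, where
\[ \bar{x} = \frac{1}{m} \iint_S x \, \rho(x, y, z) \, dS, \quad \bar{y} = \frac{1}{m} \iint_S y \, \rho(x, y, z) \, dS, \quad \bar{z} = \frac{1}{m} \iint_S z \, \rho(x, y, z) \, dS. \]
3.7.2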
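\[ \mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|}, \]
This is analogous to the definition of the line integral of a vector field over
a curve $C$,
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \int_C \mathbf{F} \cdot \mathbf{T} \, ds = \int_a^b \mathbf{F}(\mathbf{r}(t)) \cdot \mathbf{r}'(t) \, dt. \]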
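\[ \mathbf{r}_v = \langle 0, 1, g_v \rangle, \]
we obtain
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle -g_u, -g_v, 1 \rangle = \langle -g_x, -g_y, 1 \rangle, \]
which yields
\[ \mathbf{n} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{\|\mathbf{r}_u \times \mathbf{r}_v\|} = \frac{\langle -g_x, -g_y, 1 \rangle}{\sqrt{1 + g_x^2 + g_y^2}}. \]
where $\mathbf{F} = \langle x, y, z^4 \rangle$.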
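First, we must compute the unit normal vector for $S$. Using cylindrical
coordinates yields the parameterization
\[ x = u \cos v, \quad y = u \sin v, \quad z = u, \quad 0 \le u \le 1, \quad 0 \le v \le 2\pi. \]
We then have
\[ \mathbf{r}_u = \langle \cos v, \sin v, 1 \rangle, \qquad \mathbf{r}_v = \langle -u \sin v, u \cos v, 0 \rangle, \]
which yields
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle -u \cos v, -u \sin v, u \cos^2 v + u \sin^2 v \rangle = u \langle -\cos v, -\sin v, 1 \rangle. \]
Because we assume downward orientation, we must have the $z$-component
of the normal vector be negative. Therefore, $\mathbf{r}_u \times \mathbf{r}_v$ must be negated, which
yields
\[ \iint_S \mathbf{F} \cdot d\mathbf{S} = -\iint_D \mathbf{F}(x(u, v), y(u, v), z(u, v)) \cdot (\mathbf{r}_u \times \mathbf{r}_v) \, dA, \]
where $D$ is the domain of the parameters $u$ and $v$, the rectangle $[0, 1] \times [0, 2\pi]$.
Evaluating this integral, we obtain
\[
\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= -\iint_D \langle u \cos v, u \sin v, u^4 \rangle \cdot u \langle -\cos v, -\sin v, 1 \rangle \, dA \\
&= -\int_0^{2\pi} \int_0^1 (-u \cos^2 v - u \sin^2 v + u^4) \, u \, du \, dv \\
&= \int_0^{2\pi} \int_0^1 (u^2 - u^5) \, du \, dv \\
&= 2\pi \int_0^1 (u^2 - u^5) \, du \\
&= 2\pi \left[ \frac{u^3}{3} - \frac{u^6}{6} \right]_0^1 \\
&= \frac{\pi}{3}.
\end{aligned}
\]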
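\[ x = u, \quad y = 0, \quad z = v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u. \]
Then, from
\[ \mathbf{r}_u = \langle 1, 0, 0 \rangle, \qquad \mathbf{r}_v = \langle 0, 0, 1 \rangle, \]
we obtain
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle 0, -1, 0 \rangle. \]
This vector is pointing outside the tetrahedron, so it is the outward normal
vector that we wish to use. Therefore, the surface integral of $\mathbf{F}$ over $S_1$ is
\[
\begin{aligned}
\iint_{S_1} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle 0, v - 0, u \rangle \cdot \langle 0, -1, 0 \rangle \, dv \, du \\
&= -\int_0^1 \int_0^{1-u} v \, dv \, du \\
&= -\int_0^1 \left[ \frac{v^2}{2} \right]_0^{1-u} du \\
&= -\frac{1}{2} \int_0^1 (1 - u)^2 \, du \\
&= \frac{1}{2} \left[ \frac{(1 - u)^3}{3} \right]_0^1 \\
&= -\frac{1}{6}.
\end{aligned}
\]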
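For the second side, $S_2$, with vertices $(0, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$, we
parameterize using
\[ x = 0, \quad y = u, \quad z = v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u. \]
Then, from
\[ \mathbf{r}_u = \langle 0, 1, 0 \rangle, \qquad \mathbf{r}_v = \langle 0, 0, 1 \rangle, \]
we obtain
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle 1, 0, 0 \rangle. \]
This vector is pointing inside the tetrahedron, so we must negate it to obtain
the outward normal vector. Therefore, the surface integral of $\mathbf{F}$ over $S_2$ is
\[
\begin{aligned}
\iint_{S_2} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle u, v - u, 0 \rangle \cdot \langle -1, 0, 0 \rangle \, dv \, du \\
&= -\int_0^1 \int_0^{1-u} u \, dv \, du \\
&= \int_0^1 u(u - 1) \, du \\
&= \left[ \frac{u^3}{3} - \frac{u^2}{2} \right]_0^1 \\
&= -\frac{1}{6}.
\end{aligned}
\]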
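For the base $S_3$, with vertices $(0, 0, 0)$, $(1, 0, 0)$ and $(0, 1, 0)$, we parametrize
using
\[ x = u, \quad y = v, \quad z = 0, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u. \]
Then, from
\[ \mathbf{r}_u = \langle 1, 0, 0 \rangle, \qquad \mathbf{r}_v = \langle 0, 1, 0 \rangle, \]
we obtain
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle 0, 0, 1 \rangle. \]
This vector is pointing inside the tetrahedron, so we must negate it to obtain
the outward normal vector. Therefore, the surface integral of $\mathbf{F}$ over $S_3$ is
\[
\begin{aligned}
\iint_{S_3} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} \langle v, 0 - v, u \rangle \cdot \langle 0, 0, -1 \rangle \, dv \, du \\
&= -\int_0^1 \int_0^{1-u} u \, dv \, du \\
&= \int_0^1 u(u - 1) \, du \\
&= \left[ \frac{u^3}{3} - \frac{u^2}{2} \right]_0^1 \\
&= -\frac{1}{6}.
\end{aligned}
\]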
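Finally, for the top face $S_4$, with vertices $(1, 0, 0)$, $(0, 1, 0)$ and $(0, 0, 1)$,
we parametrize using
\[ x = u, \quad y = v, \quad z = 1 - u - v, \quad 0 \le u \le 1, \quad 0 \le v \le 1 - u, \]
\[
\begin{aligned}
\iint_{S_4} \mathbf{F} \cdot d\mathbf{S} &= \int_0^1 \int_0^{1-u} (1 - v) \, dv \, du \\
&= \int_0^1 \left[ v - \frac{v^2}{2} \right]_0^{1-u} du \\
&= \int_0^1 \left[ (1 - u) - \frac{(1 - u)^2}{2} \right] du \\
&= \int_0^1 \left( \frac{1}{2} - \frac{1}{2} u^2 \right) du \\
&= \left[ \frac{u}{2} - \frac{u^3}{6} \right]_0^1 \\
&= \frac{1}{3}.
\end{aligned}
\]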
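The four face integrals can be reproduced with a short script. This is a sketch under stated assumptions: the field $\mathbf{F}(x, y, z) = \langle y, z - y, x \rangle$ is not stated in the text above and is inferred here from the integrands on the three coordinate faces, the outward normals $\pm(\mathbf{r}_u \times \mathbf{r}_v)$ are taken from the computations, and the helper names are hypothetical. Nested Simpson's rule is exact for these polynomial integrands.

```python
def simpson(g, a, b, n=10):
    """Composite Simpson's rule for g on [a, b] (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def flux_over_face(F, r, normal):
    """Integrate F(r(u, v)) . normal over the triangle 0 <= u <= 1, 0 <= v <= 1 - u.
    Here `normal` is the (constant) outward vector +/-(r_u x r_v) for the face."""
    def inner(u):
        def g(v):
            return sum(Fi * Ni for Fi, Ni in zip(F(*r(u, v)), normal))
        return simpson(g, 0.0, 1.0 - u)
    return simpson(inner, 0.0, 1.0)

# Assumed field, inferred from the integrands shown for S1, S2 and S3.
F = lambda x, y, z: (y, z - y, x)

faces = [
    (lambda u, v: (u, 0.0, v), (0, -1, 0)),          # S1, in the plane y = 0
    (lambda u, v: (0.0, u, v), (-1, 0, 0)),          # S2, in the plane x = 0
    (lambda u, v: (u, v, 0.0), (0, 0, -1)),          # S3, in the plane z = 0
    (lambda u, v: (u, v, 1.0 - u - v), (1, 1, 1)),   # S4, the top face
]

fluxes = [flux_over_face(F, r, N) for r, N in faces]
total_flux = sum(fluxes)
print(fluxes, total_flux)
```

Under the assumed field, the face values are $-1/6$, $-1/6$, $-1/6$ and $1/3$, for a total of $-1/6$. This agrees with the triple integral of $\operatorname{div} \mathbf{F} = -1$ over the tetrahedron, whose volume is $1/6$.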
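3.8 Stokes' Theorem
\[ \iint_D (\operatorname{curl} \mathbf{F}) \cdot \mathbf{k} \, dA, \]
\[ y = 2 \sin t, \quad z = 5, \quad 0 \le t \le 2\pi. \]
\[ 10 \sin 2t \, \Big|_0^{2\pi} = 0. \]
This result can also be obtained by noting that because $\mathbf{F} = \nabla f$, where
$f(x, y, z) = xyz$, it follows that $\operatorname{curl} \mathbf{F} = \mathbf{0}$. □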
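Example (Stewart, Section 13.8, Exercise 8) We wish to evaluate the line
integral of $\mathbf{F}(x, y, z) = \langle xy, 2z, 3y \rangle$ over the curve $C$ that is the intersection
of the cylinder $x^2 + y^2 = 9$ with the plane $x + z = 5$.
To describe the surface $S$ enclosed by $C$, we use the parameterization
\[ x = u \cos v, \quad y = u \sin v, \quad z = 5 - u \cos v, \quad 0 \le u \le 3, \quad 0 \le v \le 2\pi. \]
Using
\[ \mathbf{r}_u = \langle \cos v, \sin v, -\cos v \rangle, \qquad \mathbf{r}_v = \langle -u \sin v, u \cos v, u \sin v \rangle, \]
we obtain
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle u, 0, u \rangle. \]
We then compute
\[ \operatorname{curl} \mathbf{F} = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \times \langle xy, 2z, 3y \rangle = \langle 1, 0, -x \rangle. \]
By Stokes' Theorem,
\[
\begin{aligned}
\int_C \mathbf{F} \cdot d\mathbf{r} &= \int_0^{2\pi} \int_0^3 \langle 1, 0, -u \cos v \rangle \cdot \langle u, 0, u \rangle \, du \, dv \\
&= \int_0^{2\pi} \int_0^3 (u - u^2 \cos v) \, du \, dv \\
&= 2\pi \left[ \frac{u^2}{2} \right]_0^3 \\
&= 9\pi.
\end{aligned}
\]
□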
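As a consistency check on Stokes' Theorem, the line integral in this example can be evaluated directly over the boundary curve. The sketch below assumes the parametrization $x = 3\cos t$, $y = 3\sin t$, $z = 5 - 3\cos t$, $0 \le t \le 2\pi$, for $C$ (counterclockwise when viewed from above, matching the upward normal $\langle u, 0, u \rangle$); this parametrization and the helper names are choices made here, not taken from the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for g on [a, b] (n is rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

F = lambda x, y, z: (x * y, 2 * z, 3 * y)

# Boundary curve: intersection of x^2 + y^2 = 9 with the plane x + z = 5.
r = lambda t: (3 * math.cos(t), 3 * math.sin(t), 5 - 3 * math.cos(t))
dr = lambda t: (-3 * math.sin(t), 3 * math.cos(t), 3 * math.sin(t))

def integrand(t):
    # F(r(t)) . r'(t)
    return sum(Fi * di for Fi, di in zip(F(*r(t)), dr(t)))

line_integral = simpson(integrand, 0.0, 2 * math.pi)
print(line_integral, 9 * math.pi)
```

The direct line integral matches the surface integral of the curl, $9\pi$, up to rounding error.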
Stokes' Theorem can also be used to provide insight into the physical
interpretation of the curl of a vector field. Let $S_a$ be a disk of radius $a$
centered at a point $P_0$, and let $C_a$ be its boundary. Furthermore, let $\mathbf{v}$ be a
velocity field for a fluid. Then the line integral
\[ \int_{C_a} \mathbf{v} \cdot d\mathbf{r} = \int_{C_a} \mathbf{v} \cdot \mathbf{T} \, ds, \]
where $\mathbf{T}$ is the unit tangent vector of $C_a$, measures the tendency of the fluid
to move around $C_a$. This is because this measure, called the circulation of $\mathbf{v}$
around $C_a$, is greatest when the fluid velocity vector is consistently parallel
to the unit tangent vector. That is, the circulation around $C_a$ is maximized
when the fluid follows the path of $C_a$.
Now, by Stokes' Theorem,
\[
\begin{aligned}
\int_{C_a} \mathbf{v} \cdot d\mathbf{r} &= \iint_{S_a} \operatorname{curl} \mathbf{v} \cdot d\mathbf{S} \\
&= \iint_{S_a} \operatorname{curl} \mathbf{v} \cdot \mathbf{n} \, dS \\
&\approx \operatorname{curl} \mathbf{v}(P_0) \cdot \mathbf{n}(P_0) \iint_{S_a} 1 \, dS
\end{aligned}
\]
3.8.1
\[ \int_C \mathbf{F} \cdot d\mathbf{r} = \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S}, \]
Z Z
F(r(t)) r (t) dt =
a
It is important that the parameterizations $\mathbf{r}$ and $\mathbf{g}$ have the proper orientation for Stokes' Theorem to apply. This is why it is required that $C$
have positive orientation. It means, informally, that if one were to walk
along C, in such a way that n, the unit normal vector of S, can be viewed,
then S should always be on the left relative to the path traced along C.
It follows that the parameterizations of C and S must be consistent with
one another, to ensure that they are oriented properly. Otherwise, one of the
parameterizations must be reversed, so that the sign of the corresponding
integral is corrected. The orientation of a curve can be reversed by changing
the parameter to $s = a + b - t$. The orientation of a surface can be reversed
by interchanging the variables u and v.
3.9
where $\mathbf{n}$ is the outward unit normal vector of $D$. Now, let $E$ be a three-dimensional solid whose boundary, denoted by $\partial E$, is a closed surface $S$
with positive orientation. Then, if we consider two-dimensional slices of $E$,
each one being parallel to the $xy$-plane, then each slice is a region $D$ with
positively oriented boundary $C$, to which Green's Theorem applies. If we
multiply the integrals on both sides of Green's Theorem, as applied to each
slice, by $dz$, the infinitesimal thickness of each slice, then we obtain
\[ \iint_S \mathbf{F} \cdot \mathbf{n} \, dS = \iiint_E \operatorname{div} \mathbf{F} \, dV, \]
or, equivalently,
\[ \iint_S \mathbf{F} \cdot d\mathbf{S} = \iiint_E \nabla \cdot \mathbf{F} \, dV. \]
This result is known as the Gauss Divergence Theorem, or simply the Divergence Theorem.
As the Divergence Theorem relates the surface integral of a vector field,
known as the flux of the vector field through the surface, to an integral of its
divergence over a solid, it is quite useful for converting potentially difficult
double integrals into triple integrals that may be much easier to evaluate,
as the following example demonstrates.
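Example (Stewart, Section 13.9, Exercise 6) Let $S$ be the surface of the
box with vertices $(\pm 1, \pm 2, \pm 3)$, and let $\mathbf{F}(x, y, z) = \langle x^2 z^3, 2xyz^3, xz^4 \rangle$. To
compute the surface integral of $\mathbf{F}$ over $S$ directly is quite tedious, because
$S$ has six faces that must be handled separately. Instead, we apply the
Divergence Theorem to integrate $\operatorname{div} \mathbf{F}$ over $E$, the interior of the box. We
then have
\[
\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= \iiint_E \operatorname{div} \mathbf{F} \, dV \\
&= \int_{-1}^{1} \int_{-2}^{2} \int_{-3}^{3} \left[ (x^2 z^3)_x + (2xyz^3)_y + (xz^4)_z \right] dz \, dy \, dx \\
&= \int_{-1}^{1} \int_{-2}^{2} \int_{-3}^{3} 8xz^3 \, dz \, dy \, dx \\
&= 32 \int_{-1}^{1} x \int_{-3}^{3} z^3 \, dz \, dx \\
&= 32 \int_{-1}^{1} x \left[ \frac{z^4}{4} \right]_{-3}^{3} dx \\
&= 32 \int_{-1}^{1} 0 \, dx \\
&= 0.
\end{aligned}
\]
□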
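The Divergence Theorem can also be used to convert a difficult surface
integral into an easier one.
Example (Stewart, Section 13.9, Exercise 17) Let $\mathbf{F}(x, y, z) = \langle z^2 x, \frac{1}{3} y^3 + \tan z, x^2 z + y^2 \rangle$. Let $S$ be the top half of the sphere $x^2 + y^2 + z^2 = 1$. To
evaluate the surface integral of $\mathbf{F}$ over $S$, we note that if we combine $S$
with $S_1$, the disk $x^2 + y^2 \le 1$, $z = 0$, with downward orientation, we then
obtain a new surface $S_2$ that is the boundary of the top half of the ball
$x^2 + y^2 + z^2 \le 1$, which we denote by $E$. By the Divergence Theorem,
\[ \iint_S \mathbf{F} \cdot d\mathbf{S} + \iint_{S_1} \mathbf{F} \cdot d\mathbf{S} = \iint_{S_2} \mathbf{F} \cdot d\mathbf{S} = \iiint_E \operatorname{div} \mathbf{F} \, dV. \]
We parameterize $S_1$ by
\[ x = u \sin v, \quad y = u \cos v, \quad z = 0, \quad 0 \le u \le 1, \quad 0 \le v \le 2\pi. \]
This parameterization is used instead of the usual one arising from polar
coordinates, due to the downward orientation. It follows from
\[ \mathbf{r}_u = \langle \sin v, \cos v, 0 \rangle, \qquad \mathbf{r}_v = \langle u \cos v, -u \sin v, 0 \rangle \]
that
\[ \mathbf{r}_u \times \mathbf{r}_v = \langle 0, 0, -u \sin^2 v - u \cos^2 v \rangle = u \langle 0, 0, -1 \rangle, \]
which points downward, as desired. From
\[ \operatorname{div} \mathbf{F}(x, y, z) = (z^2 x)_x + \left( \frac{1}{3} y^3 + \tan z \right)_y + (x^2 z + y^2)_z = x^2 + y^2 + z^2, \]
which suggests the use of spherical coordinates for the integral over $E$, we
obtain
\[
\begin{aligned}
\iint_S \mathbf{F} \cdot d\mathbf{S} &= \iiint_E \operatorname{div} \mathbf{F} \, dV - \iint_{S_1} \mathbf{F} \cdot d\mathbf{S} \\
&= \iiint_E (x^2 + y^2 + z^2) \, dV + \int_0^1 \int_0^{2\pi} u (u^2 \cos^2 v) \, dv \, du \\
&= \int_0^1 \int_0^{2\pi} \int_0^{\pi/2} \rho^4 \sin \phi \, d\phi \, d\theta \, d\rho + \int_0^1 u^3 \int_0^{2\pi} \cos^2 v \, dv \, du \\
&= 2\pi \int_0^1 \rho^4 \left( -\cos \phi \right) \Big|_0^{\pi/2} \, d\rho + \int_0^1 u^3 \int_0^{2\pi} \frac{1 + \cos 2v}{2} \, dv \, du \\
&= 2\pi \int_0^1 \rho^4 \, d\rho + \int_0^1 u^3 \left[ \frac{v}{2} + \frac{\sin 2v}{4} \right]_0^{2\pi} du \\
&= 2\pi \left[ \frac{\rho^5}{5} \right]_0^1 + \pi \int_0^1 u^3 \, du \\
&= \frac{2\pi}{5} + \pi \left[ \frac{u^4}{4} \right]_0^1 \\
&= \frac{2\pi}{5} + \frac{\pi}{4} \\
&= \frac{13\pi}{20}.
\end{aligned}
\]
□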
Suppose that $\mathbf{F}$ is a vector field that, at any point, represents the flow
rate of heat energy, which is the rate of change, with respect to time, of
the amount of heat energy flowing through that point. By Fourier's Law,
$\mathbf{F} = -K \nabla T$, where $K$ is a constant called thermal conductivity, and $T$ is a
function that indicates temperature.
Now, let $E$ be a three-dimensional solid enclosed by a closed, positively
oriented, surface $S$ with outward unit normal vector $\mathbf{n}$. Then, by the law
of conservation of energy, the rate of change, with respect to time, of the
amount of heat energy inside $E$ is equal to the flow rate, or flux, of heat
into $E$ through $S$. That is, if $\rho(x, y, z)$ is the density of heat energy, then
\[ \frac{\partial}{\partial t} \iiint_E \rho \, dV = \iint_S \mathbf{F} \cdot (-\mathbf{n}) \, dS, \]
where we use $-\mathbf{n}$ because $\mathbf{n}$ is the outward unit normal vector, but we need
to express the flux into $E$ through $S$.
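\[ \frac{\partial}{\partial t} \iiint_E c \rho_0 T \, dV = \iint_S K \nabla T \cdot \mathbf{n} \, dS. \]
Next, we note that because $c$, $\rho_0$, and $E$ do not depend on time, we can
write
\[ \iiint_E c \rho_0 \frac{\partial T}{\partial t} \, dV = \iint_S K \nabla T \cdot d\mathbf{S}. \]
Now, we apply the Divergence Theorem, and obtain
\[ \iiint_E c \rho_0 \frac{\partial T}{\partial t} \, dV = \iiint_E K \operatorname{div} \nabla T \, dV = \iiint_E K \nabla^2 T \, dV. \]
That is,
\[ \iiint_E \left( c \rho_0 \frac{\partial T}{\partial t} - K \nabla^2 T \right) dV = 0. \]
Since the solid $E$ is arbitrary, it follows that
\[ \frac{\partial T}{\partial t} = \frac{K}{c \rho_0} \nabla^2 T. \]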
This is known as the heat equation, which is one of the most important
partial differential equations in all of applied mathematics.
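3.10 Differential Forms
Green's Theorem:
\[ \iint_D (Q_x - P_y) \, dA = \int_C P \, dx + Q \, dy \]
Stokes' Theorem:
\[ \iint_S \operatorname{curl} \mathbf{F} \cdot d\mathbf{S} = \int_C \mathbf{F} \cdot d\mathbf{r} \]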
Example Let $f(x, y, z) = e^{xy} \sin z$ and let $g(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$.
Then $f$, $g$ and $f + g$ are all 0-forms on $\mathbb{R}^3$, and
\[ f + g = e^{xy} \sin z + (x^2 + y^2 + z^2)^{3/2}. \]
That is, addition of 0-forms is identical to addition of functions.
If we define $\omega_1 = f \, dx$ and $\omega_2 = g \, dy$, then $\omega_1$ and $\omega_2$ are both 1-forms
on $\mathbb{R}^3$, and so is $\omega = \omega_1 + \omega_2$, where
\[ \omega = f \, dx + g \, dy = e^{xy} \sin z \, dx + (x^2 + y^2 + z^2)^{3/2} \, dy. \]
Furthermore, if $h(x, y, z) = xy^2 z^3$, and
\[ \eta_1 = f \, dx \wedge dy, \qquad \eta_2 = g \, dz \wedge dx \]
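3. Anticommutativity:
$\omega \wedge \eta = (-1)^{kl} (\eta \wedge \omega)$.
4. Associativity:
$\omega_1 \wedge (\omega_2 \wedge \omega_3) = (\omega_1 \wedge \omega_2) \wedge \omega_3$
5. Homogeneity: If $f$ is a 0-form, then
$(f \omega) \wedge \eta = \omega \wedge (f \eta) = f (\omega \wedge \eta)$.
6. If $dx_i$ is a basic 1-form, then $dx_i \wedge dx_i = 0$.
7. If $f$ is a 0-form, then $f \wedge \omega = f \omega$.
Example Let $\omega = f \, dx$ and $\eta = g \, dy$ be 1-forms. Then
\[ \omega \wedge \eta = (f \, dx \wedge g \, dy) = fg (dx \wedge dy) = fg \, dx \wedge dy, \]
by homogeneity, while
\[ \eta \wedge \omega = (-1)^{1(1)} (\omega \wedge \eta) = -fg \, dx \wedge dy. \]
On the other hand, if $\nu = h \, dy \wedge dz$ is a 2-form, then
\[ \nu \wedge \omega = fh (dy \wedge dz \wedge dx) = fh \, dy \wedge dz \wedge dx = -fh \, dy \wedge dx \wedge dz = fh \, dx \wedge dy \wedge dz \]
by homogeneity and anticommutativity, while
\[ \nu \wedge \eta = gh (dy \wedge dz \wedge dy) = gh \, dy \wedge dz \wedge dy = -gh \, dy \wedge dy \wedge dz = 0. \]
□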
Note that if any 3-form on R3 is multiplied by a k-form, where k > 0, then
the result is zero, because there cannot be distinct basic 1-forms in the wedge
product of such forms.
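Example Let $\omega = x \, dx - y \, dy$, and $\eta = z \, dy \wedge dz - x \, dz \wedge dx$. Then
\[
\begin{aligned}
\omega \wedge \eta &= (x \, dx - y \, dy) \wedge (z \, dy \wedge dz - x \, dz \wedge dx) \\
&= (x \, dx \wedge z \, dy \wedge dz) - (y \, dy \wedge z \, dy \wedge dz) - (x \, dx \wedge x \, dz \wedge dx) + (y \, dy \wedge x \, dz \wedge dx) \\
&= xz \, dx \wedge dy \wedge dz - yz \, dy \wedge dy \wedge dz - x^2 \, dx \wedge dz \wedge dx + xy \, dy \wedge dz \wedge dx \\
&= xz \, dx \wedge dy \wedge dz - yz \, dy \wedge dy \wedge dz + x^2 \, dx \wedge dx \wedge dz + xy \, dy \wedge dz \wedge dx \\
&= xz \, dx \wedge dy \wedge dz - 0 + 0 - xy \, dy \wedge dx \wedge dz \\
&= (xz + xy) \, dx \wedge dy \wedge dz.
\end{aligned}
\]
□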
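The multiplication rules can be mechanized, which is a useful check on sign bookkeeping. The sketch below is one possible representation, not a standard library: a form evaluated at a fixed point is stored as a dictionary mapping a tuple of basic 1-form indices (0 for $dx$, 1 for $dy$, 2 for $dz$) to a numeric coefficient, and the names `sort_sign` and `wedge` are hypothetical. It reproduces the preceding example at the arbitrarily chosen point $(x, y, z) = (2, 3, 5)$.

```python
def sort_sign(indices):
    """Sort a tuple of basic 1-form indices, tracking the sign from anticommutativity.
    Returns (0, ()) if an index repeats, since dx_i ^ dx_i = 0."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    # Bubble sort: every swap of adjacent basic 1-forms flips the sign.
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(a, b):
    """Wedge product of two forms given as {index tuple: coefficient} dictionaries."""
    out = {}
    for ia, ca in a.items():
        for ib, cb in b.items():
            sign, key = sort_sign(ia + ib)
            if sign:
                out[key] = out.get(key, 0) + sign * ca * cb
    return out

DX, DY, DZ = 0, 1, 2
x, y, z = 2, 3, 5                       # sample point at which coefficients are evaluated

omega = {(DX,): x, (DY,): -y}           # x dx - y dy
eta = {(DY, DZ): z, (DZ, DX): -x}       # z dy^dz - x dz^dx

product = wedge(omega, eta)
print(product)                          # {(0, 1, 2): 16}, i.e. (xz + xy) dx^dy^dz

dx_form, dy_form = {(DX,): 1}, {(DY,): 1}
print(wedge(dx_form, dy_form), wedge(dy_form, dx_form))  # coefficients 1 and -1
```

At the sample point, $xz + xy = 10 + 6 = 16$, in agreement with the hand computation, and the last line illustrates anticommutativity for two 1-forms.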
The second operation is differentiation. Given a $k$-form $\omega$, where $k < 3$,
the derivative of $\omega$, denoted by $d\omega$, is a $(k+1)$-form. It satisfies the following
laws:
1. If $f$ is a 0-form, then
$df = f_x \, dx + f_y \, dy + f_z \, dz$
2. Linearity: If $\omega_1$ and $\omega_2$ are $k$-forms, then
$d(\omega_1 + \omega_2) = d\omega_1 + d\omega_2$
3. Product Rule: If $\omega$ is a $k$-form and $\eta$ is an $l$-form, then
$d(\omega \wedge \eta) = (d\omega \wedge \eta) + (-1)^k (\omega \wedge d\eta)$
4. The second derivative of a form is zero; that is, for any $k$-form $\omega$,
$d(d\omega) = 0$.
We now illustrate the use of these differentiation rules.
Example Let $\omega = x^2 y^3 z^4 \, dx \wedge dy$ be a 2-form. Then, by Linearity and the
Product Rule,
\[
\begin{aligned}
d\omega &= [d(x^2 y^3 z^4) \wedge dx \wedge dy] + (-1)^0 [x^2 y^3 z^4 \wedge d(dx \wedge dy)] \\
&= \left[ (x^2 y^3 z^4)_x \, dx + (x^2 y^3 z^4)_y \, dy + (x^2 y^3 z^4)_z \, dz \right] \wedge dx \wedge dy + x^2 y^3 z^4 \{ (d(dx) \wedge dy) + (-1)^1 (dx \wedge d(dy)) \} \\
&= \left[ 2xy^3 z^4 \, dx + 3x^2 y^2 z^4 \, dy + 4x^2 y^3 z^3 \, dz \right] \wedge dx \wedge dy + x^2 y^3 z^4 \{ (0 \wedge dy) - (dx \wedge 0) \} \\
&= 2xy^3 z^4 \, dx \wedge dx \wedge dy + 3x^2 y^2 z^4 \, dy \wedge dx \wedge dy + 4x^2 y^3 z^3 \, dz \wedge dx \wedge dy + 0 \\
&= -4x^2 y^3 z^3 \, dx \wedge dz \wedge dy \\
&= 4x^2 y^3 z^3 \, dx \wedge dy \wedge dz.
\end{aligned}
\]
In general, differentiating a $k$-form $\omega$, when $k > 0$, only requires differentiating the coefficient function with respect to the variables that are not among
any basic 1-forms that are included in $\omega$. In this example, since $\omega = f \, dx \wedge dy$,
we obtain $d\omega = f_z \, dz \wedge dx \wedge dy = f_z \, dx \wedge dy \wedge dz$. □
We now consider the kind of differential forms that appear in the theorems of vector calculus.
The boundary of $C$, $\partial C$, consists of its initial point $A$ and terminal point $B$. If we define the integral of a 0-form $\omega$ over this 0-dimensional region by
\[ \int_{\partial C} \omega = \omega(B) - \omega(A), \]
(u, v) D.
(Qy Px ) dx dy
Z Z
=
d.
S
(r(t)) dt
Za
.
C
(u, v) D.
Z
F n dS
ZS
F(r(u, v)) (ru rv ) du dv
D
Z
hP (r(u, v)), Q(r(u, v)), R(r(u, v))i
ZD
P (r(u, v))
D
(y, z)
(z, x)
+ Q(r(u, v))
+
(u, v)
(u, v)
du dv
The importance of this unified theorem is that, unlike the previously stated
theorems of vector calculus, this theorem, through the language of differential forms, can be generalized to functions of any number of variables. This
is because operations on differential forms are not defined in terms of other
operations, such as the cross product, that are limited to three variables.
For example, given a 3-form $\omega = f(x, y, z, w) \, dx \wedge dy \wedge dw$, its integral over a
3-dimensional, closed, positively oriented hypersurface $S$ embedded in $\mathbb{R}^4$ is
equal to the integral of $d\omega$ over the 4-dimensional solid $E$ that is enclosed by
$S$, where $d\omega$ is computed using the previously stated rules for differentiation
and multiplication of differential forms.