18.024 Course Notes

Contents

A. Linear spaces
   Linear independence
   Gauss-Jordan elimination
   Parametric equations of lines and planes in Vn
   Parametric equations for k-planes in Vn

B. Matrices
   Systems of linear equations
   Cartesian equations of k-planes in Vn
   The inverse of a matrix
   Determinants
   Supplementary exercises

C. Derivatives of vector functions
   Differentiating inverse functions
   Implicit differentiation
   The second-derivative test for extrema
   The extreme value and small-span theorems
   Exercises on line integrals

E. Green's theorem in the plane
   Conditions under which Pi + Qj is a gradient
   The change of variables theorem
   Isometries
   Stokes' theorem
   Exercises on the divergence theorem
   Grad, curl, div and all that
Linear Spaces

Recall that addition and scalar multiplication of vectors in Vn satisfy the following properties.

Addition:
1. (Commutativity) A + B = B + A.
2. (Associativity) A + (B+C) = (A+B) + C for all A, B, C.
3. (Existence of zero) There is an element 0 such that A + 0 = A for every A.
4. (Existence of negatives) Given A, there is a B such that A + B = 0.

Scalar multiplication:
5. (Associativity) c(dA) = (cd)A.
6. (Distributivity) (c+d)A = cA + dA and c(A+B) = cA + cB.
7. (Multiplication by unity) 1A = A.

More generally, let V be any set of objects.
Definition. Suppose that two operations are defined on V, as follows: The first is an operation that assigns to each pair A, B of elements of V an element denoted A + B. The second is an operation that assigns to each real number c and each element A of V an element denoted cA. Then V, with these two operations, is called a linear space (or vector space) if the seven properties above hold. The study of linear spaces and their properties is dealt with in a subject called Linear Algebra. In this course we shall be concerned almost exclusively with Vn and its subsets.

Definition. A subset W of Vn is said to be closed under vector addition and scalar multiplication if for every pair A, B of vectors of W, and every scalar c, the vectors A + B and cA belong to W. Note that it is automatic that 0 belongs to W: for if A is in W, then 0 = 0A belongs to W. Furthermore, for each A in W, the vector -A = (-1)A is also in W. This means (as you can readily check) that W is a linear space in its own right (i.e., it satisfies all the properties listed above). Such a subset W is called a linear subspace (or simply a subspace) of Vn.
Example 1. The subset of Vn consisting of the zero vector 0 alone is a subspace of Vn; so of course is Vn itself.

Example 3. Given vectors A and B of Vn, the set of all vectors of the form sA + tB, where s and t are scalars, is a subspace of Vn. It is called the subspace spanned by A and B. In the case n = 3, it can be pictured as consisting of all vectors lying in the plane through the origin that contains A and B.
We generalize the construction given in the preceding examples as follows:

Definition. Let S = {A1,...,Ak} be a set of vectors in Vn. A vector of the form

X = c1A1 + ... + ckAk,

where the ci are scalars, is called a linear combination of A1,...,Ak. The set W of all such vectors X is a subspace of Vn, as we will see; it is said to be the subspace spanned by the vectors A1,...,Ak, or the linear span of A1,...,Ak, and is denoted by L(S).

Theorem 1. W = L(S) is a subspace of Vn.

Proof. If X and Y belong to W, then

X = c1A1 + ... + ckAk  and  Y = d1A1 + ... + dkAk

for some scalars ci and di. We compute

X + Y = (c1+d1)A1 + ... + (ck+dk)Ak,
aX = (ac1)A1 + ... + (ack)Ak,

so both X + Y and aX belong to W by definition. Thus W is a subspace of Vn.
Giving a spanning set for W is one standard way of specifying W. Different spanning sets can of course give the same subspace. For example, it is intuitively clear that, for the plane through the origin in Example 3, any two non-zero vectors C and D that are not parallel and lie in this plane will span it. We shall give a proof of this fact shortly.
Example 4. The n-tuple space Vn is spanned by the coordinate vectors

E1 = (1,0,0,...,0), E2 = (0,1,0,...,0), ..., En = (0,0,0,...,1).

For if X = (x1,...,xn) is in Vn, then X = x1E1 + ... + xnEn. When n = 2, the vectors E1 and E2 are commonly denoted i and j; when n = 3, the vectors E1, E2, and E3 are denoted i, j, and k.

Example 5. The subset of V3 consisting of all vectors of the form (a,b,0) is a subspace of V3. It consists of all linear combinations a(1,0,0) + b(0,1,0), so it is spanned by (1,0,0) and (0,1,0).

Example 6. The set of all vectors of the form X = (3a+2b, a-b, a+7b) is a subspace of V3. For X = a(3,1,1) + b(2,-1,7), so this set is spanned by (3,1,1) and (2,-1,7).
Example 7. Similarly, the set of all vectors of V4 whose components are given linear expressions in three arbitrary scalars x1, x2, and x3 is a subspace of V4, for such an element can be written in the form x1A1 + x2A2 + x3A3 for fixed vectors A1, A2, A3 of V4.

Exercises

1. Given fixed scalars a1,...,an, show that the set of all vectors (x1,...,xn) in Vn satisfying a1x1 + ... + anxn = 0 is a subspace of Vn.
2. In each of the following, let W denote the set of all vectors (x,y,z) in V3 satisfying the condition given. (For convenience we write (x,y,z) instead of (x1,x2,x3) for the general element of V3.) Determine whether W is a subspace of V3.
   (c) x² + y² = 0.
   (d) x = y and 2x = z.
   (e) x = y or 2x = z.
   (f) x² + y = 1.
3. Let F denote the set of all real-valued functions defined on the interval [a,b]. Then F is a linear space if f + g denotes the usual sum of functions and cf denotes the usual product of a function by a real number. Which function is the zero vector? Determine whether the following subset of F is a subspace of F: all functions f such that f(a) = 0.
Linear independence

Definition. Let S = {A1,...,Ak} be a set of vectors of Vn. We say that S spans the vector X if X is in L(S), that is, if

X = c1A1 + ... + ckAk

for some scalars ci. If S spans the vector X, we say that S spans X uniquely if the equations

X = Σ ciAi  and  X = Σ diAi

imply that ci = di for all i.

Theorem. Suppose S spans the vector X. Then S spans X uniquely if and only if S spans the zero vector 0 uniquely.

Proof. Suppose S spans X uniquely, and suppose 0 = Σ ciAi. Write X = Σ xiAi; adding, we obtain X = Σ (xi + ci)Ai. Since S spans X uniquely, we must have xi = xi + ci, or ci = 0, for all i. Conversely, if S spans 0 uniquely and X = Σ ciAi = Σ diAi, then subtracting gives 0 = Σ (ci - di)Ai, whence ci = di for all i.

This theorem implies that if S spans one vector of L(S) uniquely, then it spans the zero vector uniquely, whence it spans every vector of L(S) uniquely. This condition is important enough to be given a special name:

Definition. The set S = {A1,...,Ak} of vectors of Vn is said to be linearly independent (or simply, independent) if it spans the zero vector uniquely, that is, if the only scalars ci with c1A1 + ... + ckAk = 0 are c1 = ... = ck = 0. Otherwise S is said to be linearly dependent.
The following examples illustrate this situation.

Example 8. If a subset T of S is dependent, then S is dependent. For if T spans 0 non-trivially, so does S (simply give the additional vectors zero coefficients). This statement is equivalent to the statement that if S is independent, then so is any subset of S.

Example 9. Any set containing the zero vector 0 is dependent. For example, if S = {A1,...,Ak} and A1 = 0, then 1·A1 + 0·A2 + ... + 0·Ak = 0 is a non-trivial representation of the zero vector.

Example 10. Any set of mutually orthogonal non-zero vectors A1,...,Ak in Vn is independent. For suppose scalars ci satisfy c1A1 + ... + ckAk = 0. Taking the dot product of both sides of this equation with A1 gives the equation 0 = c1 A1·A1, and since A1·A1 ≠ 0 we conclude that c1 = 0. Similarly, taking the dot product with Aj for the fixed index j shows that cj = 0.
Sometimes it is convenient to replace the vectors Ai by the vectors Bi = Ai/||Ai||. The vectors B1,...,Bk are of unit length and are mutually orthogonal. Such a set of vectors is called an orthonormal set. The coordinate vectors E1,...,En form such a set.

Example 11. A set consisting of a single vector A is independent if and only if A ≠ 0.

Example 12. A set consisting of two vectors A and B is independent if and
only if the vectors are not parallel. More generally, one has the following result:

Theorem 2. The set S = {A1,...,Ak} is dependent if and only if one of the vectors Aj equals a linear combination of the others.

Proof. If Aj = Σ ciAi (where the sum extends over all indices i different from j), then S spans 0 non-trivially, since Aj appears with coefficient -1 in the resulting relation. Conversely, suppose c1A1 + ... + ckAk = 0 where some coefficient, say cm, is non-zero. Then we can solve for Am:

Am = Σ (-ci/cm)Ai,

where the sum on the right extends over all indices different from m.
A fundamental relation holds between spanning sets for W and independent sets in W.

Theorem 3. Let W be a subspace of Vn spanned by the k vectors A1,...,Ak. Then any set of more than k vectors of W is dependent.

Proof. Let B1,...,Bm be a set of vectors of W, with m > k. We wish to show that these vectors are dependent. That is, we wish to find scalars x1,...,xm, not all zero, such that x1B1 + ... + xmBm = 0. Writing each Bj in terms of A1,...,Ak and collecting coefficients, finding such scalars amounts to solving a homogeneous system of k linear equations in the m unknowns x1,...,xm; such systems we now discuss.
Definition. Given a homogeneous system of linear equations, as in (*) following, a solution of the system is a vector (x1,...,xn) that satisfies each equation of the system. The set of all solutions is a linear subspace of Vn (as you can check: if X and Y are solutions and c is a scalar, then X + Y and cX are solutions as well).
Theorem 4. A homogeneous system of k linear equations in n unknowns, with n > k, has a solution different from the zero vector.

Proof. We are concerned here only with proving the existence of some solution other than 0, not with actually finding such a solution in practice, nor with finding all possible solutions. (We will study the practical problem in much greater detail in a later section.) We start with a system of k equations in n unknowns:

(*)  a11 x1 + a12 x2 + ... + a1n xn = 0
     ...
     ak1 x1 + ak2 x2 + ... + akn xn = 0.

Our procedure will be to reduce the size of this system step by step by eliminating first x1, then x2, and so on. After k - 1 steps, we will be reduced to solving just one equation, and this will be easy. But a certain amount of care is needed in the description; for instance, if a11 = ... = ak1 = 0, it is nonsense to speak of "eliminating" x1, since all its coefficients are zero. We have to allow for this possibility.

To begin, then: if all the coefficients of x1 are zero, you may verify that the vector (1,0,...,0) is a solution of the system which is different from 0, and you are done. Otherwise, at least one of the coefficients of x1 is nonzero, and we may suppose for convenience that the equations have been arranged so that this happens in the first equation, with the result that a11 ≠ 0. We multiply the first equation by the scalar a21/a11 and then subtract it from the second, eliminating the x1-term from the second equation. Similarly, we eliminate the x1-term in each of the remaining equations. The result is a new system of linear equations of the form

(**)  a11 x1 + a12 x2 + ... + a1n xn = 0
      b22 x2 + ... + b2n xn = 0
      ...
      bk2 x2 + ... + bkn xn = 0,

in which x1 appears only in the first equation; the last k - 1 equations (the smaller system "enclosed in the box") involve only x2,...,xn.

Now any solution of this new system of equations is also a solution of the old system (*), because we can recover the old system from the new one: we merely multiply the first equation of the system (**) by the same scalars we used before, and then we add it to the corresponding later equations of this system.

The crucial thing about what we have done is contained in the following statement: if the smaller system enclosed in the box above has a solution other than the zero vector, then the larger system (**) also has a solution other than the zero vector (so that the original system (*) we started with has a solution other than the zero vector). We prove this as follows: Suppose (d2,...,dn) is a solution of the smaller system, different from (0,...,0). We substitute it into the first equation and solve for x1, thereby obtaining a vector (d1,d2,...,dn), which you may verify is a solution of the larger system (**).

In this way we have reduced the size of our problem; we now need only to prove our theorem for a system of k - 1 equations in n - 1 unknowns. If we apply this reduction a second time, we reduce the problem to proving the theorem for a system of k - 2 equations in n - 2 unknowns. Continuing in this way, after k - 1 elimination steps in all, we will be down to a system consisting of only one equation, in n - k + 1 unknowns. Now n - k + 1 ≥ 2, because we assumed in our hypothesis that n > k; thus our problem reduces to proving the following statement: a "system" consisting of one linear homogeneous equation in two or more unknowns always has a solution other than 0.

We leave it to you to show that this statement holds. (Be sure you consider the case where one or more or all of the coefficients are zero.)
Example 13. The coordinate vectors E1,...,En span all of Vn. It follows from Theorem 3 that any set of more than n vectors in Vn is dependent; that is, one of them equals a linear combination of the others. In particular, this holds for any four vectors in V3.
Theorem 5. Let W be a subspace of Vn that does not consist of the vector 0 alone. Then:
(a) W has a linearly independent spanning set, containing at most n vectors.
(b) Any two linearly independent spanning sets for W have the same number of elements.

Proof. (a) Choose A1 ≠ 0 in W. In general, suppose A1,...,Ai are independent vectors of W. If this set spans W, we are finished. Otherwise, we can choose a vector A(i+1) of W that is not in L(A1,...,Ai). Then the set {A1,...,Ai,A(i+1)} is independent: for suppose that

c1A1 + ... + c(i+1)A(i+1) = 0

for some scalars ci not all zero. If c(i+1) ≠ 0, this equation contradicts the fact that A(i+1) is not in L(A1,...,Ai); and if c(i+1) = 0, it contradicts the independence of A1,...,Ai. This process must stop after at most n steps, for by Theorem 3 an independent set in Vn has at most n vectors; thus we obtain an independent spanning set with k ≤ n elements.

(b) Suppose S = {A1,...,Ak} and T = {B1,...,Bm} are two linearly independent spanning sets for W. Since S spans W and T is an independent set in W, Theorem 3 gives m ≤ k; by symmetry, k ≤ m. Hence k = m.
Definition. Given a subspace W of Vn that does not consist of 0 alone, it has a linearly independent spanning set. Any such set is called a basis for W, and the number of elements in this set is called the dimension of W. We make the convention that if W consists of 0 alone, then the dimension of W is zero.

It follows that Vn has dimension n, since the coordinate vectors E1,...,En form a basis. (Surprise!) There are many other bases for Vn.
Exercises

1. Consider the subspaces given in the examples of the preceding section. Find the dimension of each of these subspaces, and find spanning sets for them that are not bases.

2. Check the details of Example 14.
...
>
Let
IA;,
be ...,G
an independent set in Vn
function T
i
V +W
linearity properties:
.,A,,
V +W
7. L e t W be a subspace of V : n
Let X, Y be vectors of W. Then X = Σ xiAi and Y = Σ yiAi for unique scalars xi and yi. These scalars are called the components of X and Y, respectively, relative to the basis A1,...,Ak.
(a)
Note that
--)
and
Conclude
..
'f xiAi
is a
A1 7
that
X*Y = 2 x i Y i
.be7 L
' B1,..., B be mutually orthogonal non-zero vectors m
Given scalars
..
Btl
n+
...,Bm ,Am+l ) -
orthogonal to each of B
..,B, .
Step 2. Show that if W is a subspace of Vn of positive dimension, then W has a basis consisting of vectors that are mutually orthogonal. [Hint: Proceed by induction on the dimension of W.]

Step 3. Prove the theorem.

Gauss-Jordan elimination
W,
we have at present no constructive process for determining the dimension
A ~ + C1B1 + + ~
"
CmBm
=
..,Bm ,Bm+l)
by
n.
Let A be a matrix of size k by n; let aij denote the entry of A that lies in the ith row and the jth column. The rows of A are vectors of Vn; they span a subspace of Vn called the row space of A.

There are three elementary operations that can be applied to the rows of A:
(1) Exchange two rows of A.
(2) Replace row i of A by itself plus a scalar c times another row, say row m.
(3) Multiply row i of A by a non-zero scalar c.

These operations are called the elementary row operations. Their usefulness derives from the following fact:

Theorem 6. Suppose B is obtained from A by applying an elementary row operation. Then the row spaces of A and B are the same.
Proof. It suffices to consider the case where B is obtained by applying a single operation. Let A1,...,Ak be the rows of A, and let B1,...,Bk be the rows of B.

Suppose B is obtained by operation (3). Then Bi = cAi, and Bj = Aj for j ≠ i. Clearly, any linear combination of B1,...,Bk can be written as a linear combination of A1,...,Ak. Because c ≠ 0, we also have Ai = (1/c)Bi, so the converse is true as well.

Suppose B is obtained by operation (2). Then Bi = Ai + cAm, and Bj = Aj for j ≠ i. Again, any linear combination of B1,...,Bk can be written as a linear combination of A1,...,Ak. Because Ai = Bi - cBm and Aj = Bj for j ≠ i, the converse is also true.

Operation (1) merely rearranges the rows, so it obviously does not change the row space.
The Gauss-Jordan procedure consists of applying elementary row operations to the matrix A until it is brought into a form where the dimension of its row space can be read off by inspection. It is the following:

Gauss-Jordan elimination. Consider the first column of the matrix that does not consist entirely of zeros. Exchange rows, if necessary, so as to bring a non-zero entry of this column to the top row. Then add multiples of the top row to the lower rows so as to make all remaining entries in this column into zeros. Restrict your attention now to the matrix obtained by deleting the first column and first row, and begin again. The procedure stops when the matrix remaining has only one row.
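By way of illustration, the elimination procedure just described can be written out as a short Python sketch. (The function name, the use of lists of floats, and the exact-zero test are incidental choices of this sketch, not anything fixed by the notes.)

    def echelon_form(rows):
        # rows: list of lists of numbers; returns an echelon form of the matrix,
        # using only row exchanges and adding multiples of one row to another.
        A = [row[:] for row in rows]          # work on a copy
        k = len(A)
        n = len(A[0]) if k else 0
        top = 0                               # first row of the submatrix still being processed
        for col in range(n):
            # find a row at or below 'top' with a non-zero entry in this column
            pivot_row = next((i for i in range(top, k) if A[i][col] != 0), None)
            if pivot_row is None:
                continue                      # the column is entirely zero; go to the next column
            A[top], A[pivot_row] = A[pivot_row], A[top]   # bring that row to the top
            for i in range(top + 1, k):       # clear the entries below the pivot
                factor = A[i][col] / A[top][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[top])]
            top += 1
            if top == k:
                break
        return A

For instance, echelon_form([[1., 2., 1.], [2., 4., 0.], [3., 6., 1.]]) returns a matrix with exactly two non-zero rows, showing that the row space of that matrix has dimension 2.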
Solution. First step. Alternative (a) applies. Exchange rows (1) and (2). Then replace row (3) by row (3) + row (1), and replace row (4) by row (4) + 2 times row (1).

Second step. Restrict attention to the matrix in the box. Replace row (4) by row (4) - row (2), obtaining a matrix B which is in what is called
echelon or "stair-step" form. The entries beneath the steps are zero, and the entries -1, 1, and 3 that appear at the "inside corners" of the stairsteps are non-zero. The entries that appear at the "inside corners" of the stairsteps are often called the pivots of the echelon form.

You can check readily that the non-zero rows of the matrix B are independent. (We shall prove this fact later.) It follows that the non-zero rows of the matrix B form a basis for the row space of B, and hence a basis for the row space of the original matrix A. Thus this row space has dimension 3.
In general, whenever we reduce the matrix A to the echelon form B, the non-zero rows of B are independent, so they form a basis for the row space of B, and hence a basis for the row space of A.
Begin by considering the last non-zero row. By adding multiples of this row
to each row above it, one can bring the matrix to the form where each entry lying
above the pivot in this row is zero. Then continue the process, working now
with the next-to-last non-zero row. Because all the entries above the last
pivot are already zero, they remain zero as you add multiples of the next-to-
last non-zero row to the rows above it. Similarly one continues. Eventually
the matrix reaches the form where all the entries that are directly above the
pivots are zero. (Note that the stairsteps do not change during this process,
nor do the pivots themselves.)
Applying this procedure in the example considered earlier, one brings
the matrix B into the form
elementary row operations of types (1) and (2). It has not been necessary to multiply a row by a non-zero scalar. This fact will be important later on.
We are not yet finished. The final step is to multiply each non-zero row by a suitable scalar, so as to turn its pivot into 1. This we can do, because the pivots are non-zero. At the end of this process, the matrix is said to be in reduced echelon form.
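Continuing the earlier sketch (again an illustration of mine, reusing the echelon_form routine above), one can pass from echelon form to reduced echelon form by clearing the entries above each pivot, working upward, and then scaling each pivot to 1:

    def reduced_echelon_form(rows):
        A = echelon_form(rows)                # the echelon-form routine sketched earlier
        k = len(A)
        n = len(A[0]) if k else 0
        # locate the pivot (first non-zero entry) of each non-zero row
        pivots = []
        for i in range(k):
            cols = [j for j in range(n) if A[i][j] != 0]
            if cols:
                pivots.append((i, cols[0]))
        # work from the last non-zero row upward, clearing entries above each pivot
        for i, j in reversed(pivots):
            for r in range(i):
                factor = A[r][j] / A[i][j]
                A[r] = [a - factor * b for a, b in zip(A[r], A[i])]
        # finally scale each pivot to 1
        for i, j in pivots:
            p = A[i][j]
            A[i] = [a / p for a in A[i]]
        return A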
The computations just described prove the following theorem:

Theorem 7. Let A be a matrix; let W be its row space. Suppose we transform A by elementary row operations into the echelon matrix B, or into the reduced echelon matrix D. Then the non-zero rows of B (or of D) are independent, so they form a basis for W. Thus the dimension of W equals the number r of non-zero rows of D, which is the same as the number of non-zero rows of B.

Proof. If the rows of B were not independent, then one would equal a linear combination of the others. This would imply that the row space of B could be spanned by fewer than r vectors, which the stair-step pattern of the pivots makes impossible.
Exercises
I. Find bases for the row spaces of the following matrices:
2. Reduce each of the matrices in Exercise 1 to reduced echelon form.

3. Suppose D and D' are reduced echelon matrices (without zero rows) whose rows span the same subspace W of Vn. We show that D = D'. Let R1,...,Rk be the non-zero rows of D, and suppose that the pivots (first non-zero entries) in these rows occur in columns j1,...,jk respectively.
(a) Show that the pivots of D' occur in the same columns. [Hint: Let R be a row of D'; suppose its pivot occurs in column p. We have R = c1R1 + ... + ckRk for some scalars ci. (Why?) Show that ci = 0 if ji < p. Derive a contradiction if p is not equal to any of the ji.]
(b) If R is a row of D' whose pivot occurs in column jm, show that R = Rm.
Parametric equations of lines and planes in Vn

Definition. Given n-tuples P and A, with A ≠ 0, the line through P determined by A is the set of all points X of Vn of the form

(*)  X = P + tA,

where t is an arbitrary scalar. It is denoted by L(P;A). The vector A is called a direction vector for the line.

Note that if P = 0, then L is simply the 1-dimensional subspace of Vn spanned by A. If t = 0, then X = P; if t = 1, then X = P + A; these are points of L. Occasionally, one writes the vector equation (*) out in scalar form as follows:

x1 = p1 + t a1, ..., xn = pn + t an,

where P = (p1,...,pn) and A = (a1,...,an). These are called the scalar parametric equations for the line.
The direction vector A of a line is not unique; any non-zero scalar multiple of A determines the same line. Recall that two non-zero vectors A and B are said to be parallel if A = cB for some scalar c ≠ 0.

Theorem 8. The lines L(P;A) and L(Q;B) are equal if and only if they have a point in common and A is parallel to B.

Proof. Suppose first that L(P;A) = L(Q;B). Then the lines certainly have a point in common. Since P and P + A lie on the first line, they also lie on the second line, so that P = Q + t1B and P + A = Q + t2B for some scalars t1 and t2; subtracting, A = (t2 - t1)B, so A is parallel to B.

Conversely, suppose the two lines have a point in common, so that P + t1A = Q + t2B for some scalars t1 and t2, and suppose that A = cB for some c ≠ 0. We can solve these equations for P in terms of Q and B:

P = Q + t2B - t1A = Q + (t2 - t1c)B.

Now, given any point X = P + tA of the line L(P;A), we can write

X = P + tA = Q + (t2 - t1c)B + tcB.

Thus X belongs to L(Q;B). The symmetry of the argument shows that the reverse inclusion holds as well.
Definition. It follows from the preceding theorem that the direction vector of a line is determined by the line up to a non-zero scalar factor. We say two lines are parallel if their direction vectors are parallel.

Corollary 9. Distinct parallel lines cannot intersect.

Corollary 10. Given a line L and a point Q, there is exactly one line containing Q that is parallel to L.
Proof. Suppose L is the line L(P;A). Then the line L(Q;A) contains Q and is parallel to L. By Theorem 8, any line containing Q and parallel to L is equal to this line.

Theorem 11. Given distinct points P and Q of Vn, there is exactly one line containing them both; it is the line L(P;A), where A = Q - P.

Proof. The line L(P;A) contains both P (since P = P + 0A) and Q (since Q = P + 1A).
Now suppose L(R;B) is some other line containing P and Q. Then P = R + t1B and Q = R + t2B for distinct scalars t1 and t2. It follows that Q - P = (t2 - t1)B, so that the vector A = Q - P is parallel to B. It follows from Theorem 8 that L(R;B) = L(P;A).
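As an aside, Theorem 8 translates directly into a test for comparing two lines given in parametric form. The sketch below is purely illustrative (the function names, the tolerance, and the decision to leave the non-parallel case unresolved are choices of the sketch, not of the notes):

    def parallel(A, B, tol=1e-12):
        # A and B are parallel exactly when every 2 by 2 "determinant" built
        # from corresponding pairs of components vanishes
        n = len(A)
        return all(abs(A[i] * B[j] - A[j] * B[i]) < tol
                   for i in range(n) for j in range(i + 1, n))

    def classify_lines(P, A, Q, B, tol=1e-12):
        # By Theorem 8, L(P;A) = L(Q;B) iff A is parallel to B and the lines
        # share a point; sharing a point then amounts to Q - P being parallel to A.
        diff = [q - p for p, q in zip(P, Q)]
        if parallel(A, B, tol):
            return "equal" if parallel(diff, A, tol) else "parallel and distinct"
        return "not parallel"           # the lines may or may not intersect

For example, classify_lines((0, 0, 0), (1, 1, 0), (2, 2, 0), (3, 3, 0)) returns "equal", in agreement with Theorem 8.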
Definition. If P is a point of Vn, and if A and B are independent vectors of Vn, we define the plane through P determined by A and B to be the set of all points X of Vn of the form

(*)  X = P + sA + tB,

where s and t are arbitrary scalars. We denote this plane by M(P;A,B). The equation (*) may be written out in scalar form as well. When s = t = 0, then X = P; when s = 0 and t = 1, then X = P + B; and so on. Note that if P = 0, the plane is just the 2-dimensional linear subspace of Vn spanned by A and B.

Theorem 12. The planes M(P;A,B) and M(Q;C,D) of Vn are equal if and only if they have a point in common and the linear span of A and B equals the linear span of C and D.
Proof. If the planes are equal, they obviously have a point in common. Since P, P + A, and P + B all lie on the first plane, they lie on the second plane as well. Then

P = Q + s1C + t1D,
P + A = Q + s2C + t2D,
P + B = Q + s3C + t3D

for some scalars si and ti. Subtracting, we see that

A = (s2 - s1)C + (t2 - t1)D,
B = (s3 - s1)C + (t3 - t1)D.

Thus A and B lie in the linear span of C and D. Symmetry shows that C and D lie in the linear span of A and B as well. Thus these linear spans are equal.
Conversely, suppose that the planes intersect in a point R and that L(A,B) = L(C,D). Then

P + s1A + t1B = Q + s2C + t2D

for some scalars si and ti. We can solve this equation for P as follows:

P = Q + s2C + t2D - s1A - t1B.

Now consider any point X = P + sA + tB of M(P;A,B). Substituting the expression for P, we see that X equals Q plus a linear combination of C, D, A, and B. Since A and B belong to L(C,D), it follows that X equals Q plus a linear combination of C and D alone, so X belongs to M(Q;C,D). By symmetry, every point of M(Q;C,D) belongs to M(P;A,B) as well.
Definition. Given a plane M = M(P;A,B), the vectors A and B are not uniquely determined by M, but their linear span is. We say the planes M(P;A,B) and M(Q;C,D) are parallel if L(A,B) = L(C,D).
Corollary 13. Distinct parallel planes cannot intersect.

Corollary 14. Given a plane M and a point Q, there is exactly one plane containing Q that is parallel to M.

Proof. Suppose M = M(P;A,B). Then M(Q;A,B) is a plane that contains Q and is parallel to M. By Theorem 12, any plane containing Q and parallel to M is equal to this plane.
Definition. We say three points P, Q, R are collinear if they lie on a line.

Lemma 15. The points P, Q, R are collinear if and only if the vectors Q - P and R - P are dependent (i.e., parallel).

Proof. The line L(P;Q-P) is the one containing P and Q, and the line L(P;R-P) is the one containing P and R. If Q - P and R - P are parallel, these lines are the same, by Theorem 8, so P, Q, and R are collinear. Conversely, if P, Q, and R are collinear, these lines must be the same, so that Q - P and R - P must be parallel.
Theorem 16. Given three non-collinear points P, Q, R of Vn, there is exactly one plane containing them; it is the plane M(P;A,B), where A = Q - P and B = R - P.

Proof. Since P, Q, R are not collinear, the vectors A and B are independent, by the preceding lemma, so M(P;A,B) is a plane. It contains P; and it contains P + A = Q and P + B = R.

Now suppose M(S;C,D) is another plane containing P, Q, and R. Then

P = S + s1C + t1D,  Q = S + s2C + t2D,  R = S + s3C + t3D

for some scalars si and ti. Subtracting, we see that the vectors Q - P = A and R - P = B belong to the linear span of C and D. By symmetry, C and D belong to the linear span of A and B. Then Theorem 12 implies that these two planes are equal.
Exercises

1. We say the line L = L(R;C) is parallel to the plane M = M(P;A,B) if the direction vector C belongs to L(A,B). Show that if L is parallel to M and intersects M, then L is contained in M.

2. Show that two vectors A1 and A2 in Vn are linearly dependent if and only if they lie on some line through the origin.

3. Show that three vectors A1, A2, A3 in Vn are linearly dependent if and only if they lie on some plane through the origin.
4. Let A = (1,-1,0) and B = (2,0,1).
(a) Find parametric equations for the line through P with direction vector B, and for the line through R with direction vector A. Do these lines intersect?
(b) Find R.
5.
Parametric equations for k-planes in Vn

The definitions just given generalize as follows:

Definition. Given a point P of Vn and a set A1,...,Ak of k independent vectors in Vn, we define the k-plane through P determined by A1,...,Ak to be the set of all points X of Vn of the form

X = P + t1A1 + ... + tkAk

for some scalars ti. We denote this set of points by M(P;A1,...,Ak). Said differently, X is in the k-plane M(P;A1,...,Ak) if and only if X - P is in the linear span of A1,...,Ak.
Note that if P = 0, the k-plane is just the k-dimensional linear subspace of Vn spanned by A1,...,Ak. Just as with the case of lines (1-planes) and planes (2-planes), one has the following theorems:

Theorem 17. Let M1 = M(P;A1,...,Ak) and M2 = M(Q;B1,...,Bk) be k-planes in Vn. Then M1 = M2 if and only if M1 and M2 have a point in common and the linear span of A1,...,Ak equals the linear span of B1,...,Bk.

The proof of this theorem is left as an exercise. We say M1 and M2 are parallel if the linear span of A1,...,Ak equals that of B1,...,Bk.

Theorem 18. Given a k-plane M in Vn and a point Q of Vn, there is exactly one k-plane containing Q that is parallel to M.
Finally, one says that the points P0, P1, ..., Pk of Vn are dependent if the vectors P1 - P0, ..., Pk - P0 are dependent; in that case they are contained in a plane of dimension less than k.

Theorem 19. Given k+1 points P0,...,Pk in Vn that do not lie in any plane of dimension less than k, there is exactly one k-plane containing them; it is the k-plane M(P0; P1-P0, ..., Pk-P0).

If M(P;A1,...,Ak) is a k-plane and M(Q;B1,...,Bm) is an m-plane in Vn, and if k ≤ m, we say the k-plane is parallel to the m-plane if the linear span of A1,...,Ak is contained in the linear span of B1,...,Bm.

Exercises

1. Consider the line L = L(P;A) in V3, where A = (1,-1,2). Find parametric equations for a 2-plane containing the point P = (1,1,1) that is parallel to L. Is it unique? Can you find such a plane containing both the point P and the point Q = (-1,0,2)? Is it unique? Can you find such a 3-plane that contains both S and the point T = (0,1,0,2)? Is it unique?
Matrices

We have already defined what we mean by a matrix. In this section, we define the algebraic operations on matrices. If A and B are two matrices of the same size, say k by n, we define A + B to be the k by n matrix obtained by adding the corresponding entries of A and B, and we define cA to be the matrix obtained from A by multiplying each entry of A by c. That is, if aij and bij are the entries of A and B, respectively, in row i and column j, then the entries of A + B and of cA in row i and column j are aij + bij and c·aij, respectively.

Note that for fixed k and n, the set of all k by n matrices satisfies all the properties of a linear space. This fact is hardly surprising, for a k by n matrix is very much like a k·n tuple; the only difference is that the components are written in a rectangular array instead of a linear array.

Unlike tuples, however, matrices have a further operation, a product operation. It is defined as follows:

Definition. If A is a k by n matrix, and B is an n by p matrix, we define their product D = A·B to be the k by p matrix whose entry in row i and column j is

dij = ai1 b1j + ai2 b2j + ... + ain bnj.

Here i = 1,...,k and j = 1,...,p. The entry dij is computed, roughly speaking, by taking the "dot product" of the i-th row of A with the j-th column of B. Schematically, one runs across the i-th row of A and down the j-th column of B, multiplying corresponding entries and adding.
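For readers who like to see the definition in computational form, here is a direct transcription of it in Python (an illustration only; the notes themselves work entirely by hand):

    def mat_mult(A, B):
        # A is k by n, B is n by p (lists of rows); returns the k by p product A.B
        k, n, p = len(A), len(B), len(B[0])
        assert all(len(row) == n for row in A), "A must have as many columns as B has rows"
        return [[sum(A[i][s] * B[s][j] for s in range(n)) for j in range(p)]
                for i in range(k)]

For instance, mat_mult([[1, 2], [0, 1]], [[1, 0], [3, 1]]) gives [[7, 2], [3, 1]], while multiplying the same two matrices in the other order gives [[1, 2], [3, 7]]; so the product operation is not commutative, a point that will matter later.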
This definition seems rather strange, but it is in fact extremely useful. Motivation will come later! One important justification for this definition is the fact that this product operation satisfies some of the familiar "laws of algebra":

Theorem 1. Matrix multiplication has the following properties. Let A, B, C, D be matrices.
(1) (Distributivity) If the sums and products involved are defined, then A·(B + C) = A·B + A·C; similarly, if (B + C)·D is defined, then (B + C)·D = B·D + C·D.
(2) (Homogeneity) c(A·B) = (cA)·B = A·(cB).
(3) (Associativity) If the products involved are defined, then (A·B)·C = A·(B·C).
(4) (Existence of identities) For each m there is an m by m matrix Im such that Im·A = A and B·Im = B whenever these products are defined.

Proof. Consider the first distributivity formula. If B and C can be added, they have the same size, say n by p; if A·(B + C) is defined, then A has size k by n. Then A·B and A·C are both defined and have size k by p; thus their sum is also defined. The distributivity formula now follows from the equation

Σ_s a_is (b_sj + c_sj) = Σ_s a_is b_sj + Σ_s a_is c_sj,

where s runs from 1 to n. The other distributivity formula and the homogeneity formula are proved similarly. We leave them as exercises.
Now let us verify associativity. If A is k by n and B is n by p, then A·B is defined and has size k by p; the product (A·B)·C is thus defined provided C has size p by q, and it then has size k by q. Similarly, B·C is defined and has size n by q, so A·(B·C) is defined and also has size k by q. To prove equality, fix i and j and compare entries: the entry of (A·B)·C in row i and column j is Σ_t (Σ_s a_is b_st) c_tj, while that of A·(B·C) is Σ_s a_is (Σ_t b_st c_tj); both equal the sum of all the terms a_is b_st c_tj, so they are equal.

Finally, let Im denote the m by m matrix whose entry in row i and column j is δij, where δij = 1 if i = j and δij = 0 if i ≠ j; that is, Im has 1's down the "main diagonal" and 0's elsewhere. For instance, I3 has rows (1,0,0), (0,1,0), (0,0,1). Suppose A has m columns, and let C = A·Im. Let i and j be fixed. Then as s ranges from 1 to m, all but one of the terms of the summation c_ij = Σ_s a_is δ_sj vanish:

c_ij = 0 + ... + 0 + a_ij·1 + 0 + ... + 0 = a_ij.

We conclude that A·Im = A if A has m columns. Similarly, Im·B = B if B has m rows.

Note that even when A·B and B·A are both defined, they need not be equal; examples are easy to construct.
1.
2.
are two possible choices for the identity element of size m by m, compute
1 ;
I ;
,I
4 Find a non-zero 2 by .
2 matrix A
Systems of linear equations

Given scalars aij, for i = 1,...,k and j = 1,...,n, we wish to study the following, which is called a system of k linear equations in n unknowns:

(*)  a11 x1 + ... + a1n xn = c1
     ...
     ak1 x1 + ... + akn xn = ck.

A vector (x1,...,xn) is called a solution of the system if it satisfies each equation. The solution set of the system consists of all such vectors; it is a subset of Vn. We wish to determine whether this system has a solution, and if so, what the nature of the general solution is. Note that we are not assuming anything about the relative size of k and n; they may be equal, or one may be larger than the other.

Matrix notation is convenient for dealing with this system of equations. Let A denote the k by n matrix whose entry in row i and column j is aij. Let X and C denote the matrices whose single columns have entries x1,...,xn and c1,...,ck, respectively. These are matrices with only one column; accordingly, they are called column matrices. With this notation, the system (*) can be written as the single matrix equation

A·X = C.

There is an obvious one-to-one correspondence between Vn and the set of all n by 1 column matrices, and even the vector space operations correspond. What this means is that we can identify Vn with the space of all n by 1 matrices if we wish; all this amounts to is a change of notation. Representing elements of Vn as column matrices is so convenient that we shall do so frequently.
Example 1. Consider the system of three equations in three unknowns shown. [Here we use x, y, z instead of x1, x2, x3 for convenience.] This system has no solution, because the sum of the first two equations contradicts the third equation.

Example 2. This system has a solution; in fact, it has more than one solution. In solving this system, we can ignore the third equation, since it is the sum of the first two. Then we can assign a value to y arbitrarily, say y = t, and solve the first two equations for x and z. We obtain the result shown. Shifting back to tuple notation, we can say that the solution set consists of all vectors X of the resulting form; as t varies, these points trace out a line in V3.

These examples illustrate the general result: Suppose one is given a system of k linear equations in n unknowns. Then the solution set is either (1) empty, or (2) it consists of a single point, or (3) it consists of the points of an m-plane in Vn, for some m > 0. In case (1), we say the system is inconsistent, meaning that it has no solution. In case (3), the system has infinitely many solutions.
We shall apply Gauss-Jordan elimination to the study of this problem. The crucial result we shall need is stated in the following theorem:

Theorem 2. Consider the system of equations A·X = C, where A is a k by n matrix and C is a k by 1 matrix. Let B be the matrix obtained by applying an elementary row operation to A, and let C' be the matrix obtained by applying the same elementary row operation to C. Then the solution set of the system B·X = C' is the same as the solution set of the system A·X = C.

Proof. Exchanging rows i and j of both matrices has the effect of simply exchanging equations i and j. Replacing row i by itself plus c times row j has the effect of replacing the ith equation by itself plus c times the jth equation. And multiplying row i by a non-zero scalar d has the effect of multiplying both sides of the ith equation by d. Thus each solution of the first system is also a solution of the second system.

Now we recall that the elementary operations are invertible. Thus the system A·X = C can be obtained by applying an elementary operation to both sides of the equation B·X = C'. It follows that every solution of the second system is a solution of the first system. Thus the two solution sets are identical.
We consider first the homogeneous system A·X = 0. In this case, the system obviously has at least one solution, namely the trivial solution X = 0. Furthermore, we know that the set of solutions is a linear subspace of Vn. We wish to determine the dimension of this solution space, and to find a basis for it.
Recall that the rows of A span a subspace W of Vn called the row space of A; let r be the dimension of W. Then r equals the number of non-zero rows in the echelon form of A. It follows at once that r ≤ k. It is also true that r ≤ n, because W is a subspace of Vn. The number r is called the rank of A (or sometimes the row rank of A).

Theorem 3. Let A be a k by n matrix of rank r. Then the solution space of the system A·X = 0 is a subspace of Vn of dimension n - r.
Proof. The preceding theorem tells us that we can apply elementary row operations to both the matrices A and 0 without changing the solution set; doing so, we replace A by its reduced echelon form D, while the zero matrix 0 is unchanged. The number of non-zero rows of D equals the dimension of the row space of A, which is r. Now for a zero row of D, the corresponding equation is automatically satisfied, no matter what X we choose. Only the first r equations are relevant.

Suppose that the pivots of D appear in columns j1,...,jr. Let J denote the set of indices j1,...,jr, and let K denote the remaining indices from the set {1,...,n}. Each unknown xj for which j is in J appears with a non-zero coefficient in only one of the equations of the system D·X = 0. Therefore, we can "solve" for each of these unknowns in terms of the remaining unknowns xk, for k in K. The general solution of the system can then be written as a vector of which each component is a linear combination of the xk for k in K; the combination that appears in the kth component (for k in K) consists merely of the single term xk.
Example 3. Let A be the 4 by 5 matrix shown. The equation A·X = 0 represents a system of 4 equations in 5 unknowns. Now A reduces by row operations to the reduced echelon matrix shown below it. Here the pivots appear in columns 1, 2, and 4; thus J is the set {1,2,4} and K is the set {3,5}. Each of the unknowns x1, x2, x4 appears in only one equation of the system. We solve for these unknowns in terms of the others, as follows:

x1 = 8x3 + 3x5,  x2 = -4x3 - 2x5,  x4 = 0.

The general solution can thus be written (using tuple notation for convenience)

X = x3(8,-4,1,0,0) + x5(3,-2,0,0,1),

where x3 and x5 are arbitrary.
The same procedure we followed in this example can be followed in general. Once we write X as a vector of which each component is a linear combination of the xk (for k in K), we can write X as a linear combination of n - r fixed vectors of Vn, one for each of the n - r unknowns xk with k in K. (In the example, these vectors are (8,-4,1,0,0) and (3,-2,0,0,1).) It follows that the solution space of the system has a spanning set consisting of n - r vectors. We now show that these vectors are independent; then the theorem is proved. To verify independence, it suffices to show that if we take the vector X, which equals a linear combination of these vectors with coefficients xk (for k in K), then X = 0 only if each xk equals 0.
This is easy. Consider the first expression for X that we wrote down, in which each component of X is a linear combination of the unknowns xk. For each k in K, the kth component of X is simply xk; hence X = 0 forces xk = 0 for every k in K. For example, in the example we just considered, we see that the equation X = 0 implies that x3 = 0 and x5 = 0, because x3 is the third component of X and x5 is the fifth component of X.
This proof is especially interesting because it not only gives us the
dimension of the solution space of the system, but it also gives us a method
for finding a basis for this solution space, in practice. All that is involved is
Gauss-Jordan elimination.
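The method described in the proof can be carried out quite mechanically. The following sketch (an illustration of mine, reusing the reduced_echelon_form routine sketched earlier) produces the n - r basis vectors of the solution space of A·X = 0:

    def nullspace_basis(A, tol=1e-9):
        D = reduced_echelon_form(A)           # reduced echelon form of A
        k = len(D)
        n = len(D[0]) if k else 0
        # J = columns containing pivots, K = the remaining ("free") columns
        J = []
        for row in D:
            pivot_cols = [j for j in range(n) if abs(row[j]) > tol]
            if pivot_cols:
                J.append(pivot_cols[0])
        K = [j for j in range(n) if j not in J]
        basis = []
        for free in K:
            X = [0.0] * n
            X[free] = 1.0                     # set the free unknown x_free equal to 1
            for i, j in enumerate(J):         # solve equation i for the pivot unknown x_j
                X[j] = -D[i][free]
            basis.append(X)
        return basis

For instance, feeding in the reduced echelon matrix [[1,0,-8,0,-3],[0,1,4,0,2],[0,0,0,1,0],[0,0,0,0,0]] (which is consistent with the basis vectors (8,-4,1,0,0) and (3,-2,0,0,1) found in Example 3) returns exactly those two vectors.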
Corollary 4. Let A be a k by n matrix. If the rows of A are independent, then the solution space of the system A·X = 0 has dimension n - k.
We now turn to the non-homogeneous system A·X = C. For the moment, we assume that the system has at least one solution, and we determine what the general solution looks like in this case.

Theorem 5. Let A be a k by n matrix; let r equal the rank of A. If the system A·X = C has a solution, then the solution set is a plane in Vn of dimension m = n - r.

Proof. Let P be one particular solution of the system. If X is any solution, then A·(X - P) = A·X - A·P = C - C = 0, so X - P is a solution of the homogeneous system A·X = 0; and conversely, if X - P solves the homogeneous system, then X solves the original system. The solution space of the system A·X = 0 is a subspace of Vn of dimension m = n - r; let A1,...,Am be a basis for it. Then X is a solution of the system A·X = C if and only if X - P is a linear combination of the vectors Ai, that is, if and only if

X = P + t1A1 + ... + tmAm

for some scalars ti. This says precisely that the solution set is the m-plane M(P;A1,...,Am).
Now let us try to determine when the system A·X = C has a solution. The answer is given in the following result:

Theorem 6. Let A be a k by n matrix; let r be the rank of A. If r < k, there are matrices C for which the system A·X = C has no solution. If r = k, the system A·X = C has a solution for every C.
Proof. We consider the system A·X = C and apply elementary row operations to both A and C, bringing A into echelon form B; let C' denote the matrix obtained by applying the same operations to C. (For the moment, we need not go all the way to reduced echelon form.)

Suppose first that r < k. Then the last row of B is a zero row. Let c'_k be the entry of C' in row k. If c'_k is not zero, there are no values of x1,...,xn satisfying the last equation, so the system has no solution. Let us choose C* to be a k by 1 matrix whose last entry is non-zero. Then apply the same elementary operations as before, in reverse order, to both B and C*. These operations transform B back to A; when we apply them to C*, the result is a matrix C such that the system A·X = C has no solution.

Now in the case r = k, the echelon matrix B has no zero rows, so
the difficulty that occurred in the preceding paragraph does not arise. We shall show that in this case the system has a solution. More generally, we shall consider the following two cases at the same time: either (1) B has no zero rows, or (2) whenever the ith row of B is zero, the corresponding entry c'_i of C' is zero. We show that in either of these cases the system has a solution.

Let us apply further elementary operations to both B and C', so as to reduce B to reduced echelon form D. Let C'' be the matrix obtained by applying these same operations to C'. Note that the zero rows of B, and the corresponding entries of C', are not affected by these operations, since reducing B to reduced echelon form requires us to work only with the non-zero rows. Consider the resulting system of equations D·X = C''. We now proceed as in the proof of Theorem 3. Let J be the set of column indices in which the pivots of D appear, and let K be the remaining indices. Since each xj, for j in J, appears in only one equation of the system, we can solve for each such xj in terms of the entries of C'' and the unknowns xk for k in K. We can now assign values arbitrarily to the xk and thus obtain a particular solution of the system. The theorem follows.
The procedure just described actually does much more than was necessary merely to prove the theorem. Given a particular system, it tells us whether or not there is a solution; and it tells us, when there is one, how to express the solution set in parametric form as an m-plane in Vn. Consider the following example:

Example 4. Consider once again the reduced echelon matrix of Example 3. The system ...
Remark. Solving the system A·X = C in practice involves applying elementary operations to A and C simultaneously. A convenient way to perform these calculations is to form a new matrix from A by adjoining C as an additional column. This matrix is often called the augmented matrix of the system. One then applies the elementary operations to this matrix, thus dealing with both A and C at the same time. This procedure is described in section 16.18 of vol. I of Apostol.
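As a further illustration (again a sketch of mine, not part of the notes, and reusing the reduced_echelon_form and nullspace_basis routines sketched earlier), the augmented-matrix procedure can be coded directly; it reports inconsistency, or else returns one particular solution together with a basis of the homogeneous solution space:

    def solve_system(A, C, tol=1e-9):
        # Solve A.X = C. Returns None if the system is inconsistent, otherwise a
        # pair (P, basis): P is a particular solution and basis spans the
        # solution space of the homogeneous system A.X = 0.
        n = len(A[0])
        augmented = [row + [c] for row, c in zip(A, C)]
        D = reduced_echelon_form(augmented)
        for row in D:
            # a row reading 0 = ... = 0 | c with c != 0 signals an inconsistent system
            if all(abs(x) < tol for x in row[:n]) and abs(row[n]) > tol:
                return None
        # read off a particular solution by setting every free unknown to 0
        P = [0.0] * n
        for row in D:
            pivot_cols = [j for j in range(n) if abs(row[j]) > tol]
            if pivot_cols:
                P[pivot_cols[0]] = row[n]
        return P, nullspace_basis(A)

The full solution set is then the m-plane P + t1·B1 + ... + tm·Bm, where B1,...,Bm are the returned basis vectors, exactly as in Theorem 5.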
Exercises

1. Let A be a k by n matrix.
(a) If k < n, show that the system A·X = 0 has a solution different from 0. (Is this result familiar?) What can you say about the dimension of the solution space?
(b) If k > n, show that ...
/--
Repeat Exercise 2 for the matrices C, D l and E of p. A23. B be the matrix of p. A23.
.-. 113.
[~int:What happens to
6. Let A be a k by n matrix; let r be the rank of A. Let R be the set of all those vectors C of Vk for which the system A·X = C has a solution. (That is, R is the set of all vectors of the form A·X, as X ranges over Vn.)
(a) Show that R is a subspace of Vk.
(b) Show that R has dimension r. [Hint: Let A1,...,Am be a basis for the solution space of A·X = 0; extend this to a basis A1,...,Am,B1,...,Br for all of Vn. Show the vectors A·B1,...,A·Br span R; this follows from the fact that A·Ai = 0 for all i. Show these vectors are independent.]
(c) Conclude that if r < k, there are vectors C in Vk such that the system A·X = C has no solution; while if r = k, this system has a solution for all C. (This provides an alternate proof of Theorem 6.)

7. Let A be a k by n matrix. The columns of A, when looked at as elements of Vk, span a subspace of Vk that is called the column space of A. The row space and column space of A are very different, but it is a totally unexpected fact that they have the same dimension! Prove this fact as follows: Let R be the subspace of Vk defined in Exercise 6. Show that R is spanned by the vectors A·E1,...,A·En; conclude that R equals the column space of A.
Cartesian equations of k-planes in Vn

There are two standard ways of specifying a k-plane M in Vn. One is by an equation in parametric form:

X = P + t1A1 + ... + tkAk,

where A1,...,Ak are independent vectors. (If the vectors Ai were not independent, this equation would still specify an m-plane for some m, but some work would be required to determine m. We normally require the vectors to be independent in the parametric form of the equation of a k-plane.)

Another way to specify a plane in Vn is as the solution set of a system of linear equations, that is, of a matrix equation A·X = C. If the rows of A are independent and A has size k by n, then the plane in question has dimension n - k. The equation is called a cartesian form for the equation of the plane. (If the rows of A were not independent, the solution set would still be an m-plane for some m, but some work would be required to determine m.)

The process of "solving" the system of equations A·X = C that we described in the preceding section is an algorithm for passing from a cartesian equation for M to a parametric equation for M. One can ask whether there is a process for the reverse, for passing from a parametric equation for M to a cartesian equation. The answer is "yes," as we shall see shortly. The other question one might ask is, "Why should one care?" The answer is that sometimes one form is convenient, and other times the other form is more useful. This is particularly true in the case of 3-dimensional space V3, as we shall see.
Let W be a subspace of Vn, spanned by the rows of a matrix A. The solution space of the homogeneous system A·X = 0 consists of precisely those vectors that are orthogonal to every vector of W; for this reason it is sometimes called the orthogonal complement of W. It is often denoted W⊥ (read "W perp").

Theorem. If W is a subspace of Vn of dimension k, then its orthogonal complement W⊥ has dimension n - k. Furthermore, W is the orthogonal complement of W⊥.

Proof. That W⊥ has dimension n - k is an immediate consequence of Theorem 3: for W is the row space of a k by n matrix A with independent rows, and W⊥ is the solution space of the system A·X = 0, which has dimension n - k. Applying the same argument to W⊥ shows that its orthogonal complement has dimension n - (n - k) = k; since W is contained in the orthogonal complement of W⊥ and both have dimension k, they are equal.
Suppose a k-plane M in Vn is given in parametric form as the set of all points X = P + t1A1 + ... + tkAk. Let W = L(A1,...,Ak), and let B1,...,Bm be a basis for W⊥ (so that m = n - k). Then X lies in M if and only if X - P lies in W, and this holds if and only if

B1·(X - P) = 0, ..., Bm·(X - P) = 0.

These equations, which can be rewritten as Bj·X = Bj·P, are cartesian equations for M.

The preceding discussion tells us how to find a cartesian equation for M in practice. One takes the matrix A whose rows are the vectors Ai; one finds a basis B1,...,Bm for the solution space of the system A·X = 0, using the Gauss-Jordan algorithm; and then one writes down the equations Bj·X = Bj·P.
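In code, this passage from parametric to cartesian form is just a null-space computation; the sketch below (mine, building on the routines above) returns the coefficient vectors and right-hand sides of the cartesian equations:

    def cartesian_equations(P, A_vectors):
        # M is the k-plane X = P + t1*A1 + ... + tk*Ak.  Returns a list of pairs
        # (B, b) such that M is the solution set of the equations B.X = b.
        normals = nullspace_basis([list(a) for a in A_vectors])   # basis of L(A1,...,Ak)-perp
        return [(B, sum(bi * pi for bi, pi in zip(B, P))) for B in normals]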
We now turn to the special case of V3, whose model is the familiar 3-dimensional space in which we live. In this space, we have only lines (1-planes) and planes (2-planes) to deal with. We can use either the parametric or the cartesian form for lines and planes, as we prefer; but in this situation we tend to prefer the parametric form for a line, and the cartesian form for a plane. Let us explain why.

If L is a line given in parametric form X = P + tA, then A is uniquely determined up to a scalar factor. (The point P is of course not determined.) The equation itself then exhibits some geometric information about the line; one can for instance tell by inspection whether or not two lines are parallel.
However, if a plane M in V3 is given in parametric form by the equation X = P + sA + tB, one does not have as much geometric information available by inspection. A more useful way to specify M is to find a cartesian equation for it. We note that the orthogonal complement of L(A,B) is one-dimensional, and is thus spanned by a single non-zero vector N = (a1, a2, a3); it is uniquely determined up to a scalar factor. (In practice, one finds N by solving the system of equations A·N = 0, B·N = 0.) Then a cartesian equation for M is the equation

(*)  N·(X - P) = 0;

written out in scalar form, this equation is

a1(x1 - p1) + a2(x2 - p2) + a3(x3 - p3) = 0.
We have thus proved the first half of the following theorem:

Theorem. If M is a 2-plane in V3, then M has a cartesian equation of the form

a1 x1 + a2 x2 + a3 x3 = b,

where N = (a1, a2, a3) is non-zero. Conversely, any such equation is the cartesian equation of a plane in V3. The vector N is called a normal vector to the plane.

Proof. To prove the converse, we note that this equation is a system consisting of 1 equation in 3 unknowns, and the matrix A = [a1 a2 a3] has rank 1. Therefore the solution set of the system A·X = [b] is a plane of dimension 3 - 1 = 2.
The cartesian equation of a plane is useful; it contains some geometric information about the plane. For instance, one can tell by inspection whether two planes given by cartesian equations are parallel: they are parallel if and only if their normal vectors are parallel, and that can be determined by inspection of the two equations. Similarly, one can tell readily whether the line X = P + tA is parallel to a plane M; one just checks whether or not A is orthogonal to the normal vector of M.
Theorem. Three planes in V3, with cartesian equations N1·X = b1, N2·X = b2, and N3·X = b3, intersect in exactly one point if and only if their normal vectors N1, N2, N3 are independent.

Proof. Take a cartesian equation for each plane; collectively, they form a system A·X = C whose coefficient matrix A has rows N1, N2, N3. The solution set of this system (which consists of the points common to all three planes) consists of a single point if and only if the rows of A are independent.

Theorem. Two planes in V3 intersect in a straight line if and only if their normal vectors are independent.

Proof. Let N1·X = b1 and N2·X = b2 be cartesian equations for the two planes. If N1 and N2 are independent, the matrix with rows N1 and N2 has rank 2, so the common solution set of the two equations is a plane of dimension 3 - 2 = 1, that is, a line. If N1 and N2 are dependent, the planes are parallel, so they either coincide or fail to intersect at all.

Finally, consider a line L given by X = P + tA and a plane M given by N·X = b, and suppose L is parallel to M, so that A is perpendicular to N. Then N·(P + tA) = N·P + tN·A = N·P; this holds for all t if it happens that N·P = b, and it holds for no t if N·P ≠ b. Thus the intersection of L and M is either all of L, or it is empty.
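This observation about a line and a plane can be summarized in a few lines of code (illustrative only; names and tolerance are choices of the sketch):

    def line_plane_intersection(P, A, N, b, tol=1e-12):
        dot = lambda u, v: sum(x * y for x, y in zip(u, v))
        if abs(dot(N, A)) > tol:
            # not parallel: solve N.(P + tA) = b for the single value of t
            t = (b - dot(N, P)) / dot(N, A)
            return [p + t * a for p, a in zip(P, A)]        # the one intersection point
        # parallel: the intersection is all of L or empty
        return "line lies in the plane" if abs(dot(N, P) - b) < tol else "no intersection"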
Example. Suppose M is the plane in V3 given in parametric form by X = P + sA + tB, for the vectors A and B given. To find a cartesian equation for M we need a normal vector N = (a1, a2, a3) satisfying A·N = 0 and B·N = 0. One can use the Gauss-Jordan algorithm, or in this simple case proceed almost by inspection. One can for instance set a2 = 1; the second equation then implies that a1 = 2, and then the first equation tells us that a3 = -a1 - a2 = -3.
Exercises
1.
is a plane in
2.
through
(1,0,0)that
x1
- x3 = 5.
md
-7x = 4 . 3
MI
then
is coniained in M.
\
planes of Exercise 3.
6.
and Q = (3,1,5) that is parallel to the line through R = (1,1,1) with direction vector A = (1,3,4).
7. Write cartesian equations for the plane M(P;A,B) in V4, where P = (1, -1, 0, 2), A = (1, 0, 1, 0), and B = (2, 1, 0, 1).
8.
...
anXn = b, where
2;
7.
parallel. What can you say about the intersection of MI and M2 ? Give examples to illustrate the possibilities.
The inverse of a matrix

We now consider the existence of multiplicative inverses for matrices. At this point, we must take the non-commutativity of matrix multiplication into account. For it is perfectly possible, given a matrix A, that there exists a matrix B such that A·B equals an identity matrix, without it following that B·A equals an identity matrix. Consider the following example:

Example 6. Let A be the 2 by 3 matrix and B the 3 by 2 matrix displayed. Then A·B = I2, but B·A ≠ I3, as you can check.

Definition. Let A be a k by n matrix. A matrix B of size n by k is called an inverse for A if both of the following equations hold:

A·B = Ik  and  B·A = In.
It is a remarkable fact that, for square matrices, if one of these equations holds, then the other equation holds as well!

Theorem 13. Let A be a matrix of size k by n. Then A has an inverse if and only if k = n = rank A. In that case, the inverse is unique.

If B is an n by k matrix, we say B is a right inverse for A if A·B = Ik. We say B is a left inverse for A if B·A = In. We prove the theorem in several steps.

Step 1. Let r be the rank of A. We show that if A has a right inverse, then r = k; and if A has a left inverse, then r = n. The "only if" part of the theorem follows.

First, suppose B is a right inverse for A, so that A·B = Ik. It follows that the system of equations A·X = C has a solution for arbitrary C, for the vector X = B·C is one such solution, as you can check. Theorem 6 then implies that r must equal k.

Second, suppose B is a left inverse for A, so that B·A = In. It follows that the system of equations A·X = 0 has only the trivial solution, for the equation A·X = 0 implies that B·(A·X) = 0, whence X = 0. Now the dimension of the solution space of the system A·X = 0 is n - r; it follows that n - r = 0.
Step 2. Now suppose that r = k = n. We show that A has a right inverse. Since r = k, the system A·X = C has a solution for every C. In particular, it has a solution when C is the ith coordinate vector Ei (written as a column); let us choose Bi so that A·Bi = Ei for i = 1,...,n. Then if B is the n by n matrix whose successive columns are B1,...,Bn, the equation A·B = In holds, so A has a right inverse.

Step 3. We show that if A and B are n by n matrices and A·B = In, then B·A = In as well. The equation A·B = In says that A has a right inverse, so by Step 1 the rank of A is n; it also says that B has a left inverse, so by Step 1 the rank of B is n. By Step 2, B in turn has a right inverse C, with B·C = In. Now we compute

A = A·In = A·(B·C) = (A·B)·C = In·C = C,

so that B·A = B·C = In, as desired.

Step 4. Finally, we show the inverse is unique. If A has a left inverse B and a right inverse C, then

B = B·In = B·(A·C) = (B·A)·C = In·C = C,

so any two inverses for A are equal.

Let us state the result proved in Step 3 as a separate theorem:

Theorem 14. If A and B are n by n matrices such that A·B = In, then B·A = In.
Definition. An n by n matrix A of rank n is said to be nonsingular.

To find the inverse of a nonsingular n by n matrix A, it suffices, by Theorem 14, to find a matrix B such that A·B = In; Step 2 of the preceding proof shows how this can be done: one solves the n systems of equations A·Bi = Ei for i = 1,...,n, and lets B be the matrix whose columns are B1,...,Bn.

One may ask whether the computation of the inverse of a matrix has any practical significance, or whether it is of theoretical interest only. In fact, the problem of finding the inverse of a matrix in an efficient and accurate way is of great importance in engineering. One way to explain this is to note that often in a real-life situation, one has a fixed matrix A, and one wishes to solve the system A·X = C for many different values of C. Rather than solving each one of these systems separately, it is much more efficient to find the inverse of A, for then the solution X = A^-1·C can be computed by a single matrix multiplication.
Exercises
1. Show that the matrix given is a right inverse to the matrix A of Example 6. Find two right inverses for A.

2. Show that the matrix A of Example 6 has no left inverse. Note that its right inverse is not unique.

3. Let B be an n by k matrix with k < n. Show that B has no right inverse. Show that if B has a left inverse, then that left inverse is not unique.
Determinants

To each n by n matrix A there is assigned a number, called the determinant of A and denoted det A. We shall characterize this number by the properties it satisfies.

Theorem 15. There is exactly one function det, assigning a real number det A to each n by n matrix A, with the following properties:
(1) If A' is obtained from A by exchanging two rows of A, then det A' = -det A.
(2) If A' is obtained from A by replacing row i of A by itself plus c times row j (where j ≠ i), then det A' = det A.
(3) If A' is obtained from A by multiplying row i of A by the scalar c, then det A' = c·det A.
(4) If In is the identity matrix, then det In = 1.

We are going to assume this theorem for the time being, and explore some of its consequences. We will show, among other things, that these four properties characterize the determinant function completely. Later we shall construct a function having these properties.

First we shall explore some consequences of the first three of these properties. We shall call properties (1)-(3) listed in Theorem 15 the elementary row properties of the determinant function.
Theorem. Let f be a function that assigns, to each n by n matrix A, a real number f(A). Suppose f satisfies the elementary row properties of the determinant function. Then

(*)  f(A) = f(In)·det A

for every n by n matrix A.

This theorem says that any function f that satisfies properties (1), (2), and (3) of Theorem 15 is a scalar multiple of the determinant function. It also says that if f satisfies property (4) as well, then f(A) = det A; that is, there is at most one function that satisfies all four conditions.
Proof. Step 1. First we show that if the rows of A are dependent, then f(A) = 0 and det A = 0. Equation (*) then holds trivially in this case.

Let us apply elementary row operations to A to bring it to echelon form B. We need only the first two elementary row operations to do this, and they change the values of f and det by at most a sign. Because the rows of A are dependent, the bottom row of B consists entirely of zeros. If we multiply this row by the scalar c, we leave the matrix unchanged, and hence we leave the values of f and det unchanged. On the other hand, this operation multiplies these values by c. Since c is arbitrary, we conclude that f(B) = 0 and det B = 0, whence f(A) = 0 and det A = 0.
Step 2. Now let us consider the case where the rows of A are independent. Again, we apply elementary row operations to A to bring it to echelon form; however, this time we arrange matters so that the values of f and det do not change. An operation of type (2) changes neither f nor det. If at some stage we must exchange two rows to bring a non-zero entry to the upper left-hand corner, this changes the sign of both the functions f and det, so we then multiply that row by -1 to change the signs back. Then we repeat the process, working with the second row and column, and so on. The operations we apply will therefore change the value of neither f nor det.

Because the rows of A are independent, the echelon matrix B obtained does not have a zero row at the bottom when we finish, and the "stairsteps" move over exactly one column at a time; thus all the entries below the main diagonal are zero (this is what is called upper triangular form), and all the diagonal entries are non-zero.
The values of f and det for A are thus the same as their values for B. To compute these values, it now suffices to simplify B further. By adding multiples of the last row to the rows lying above it, then multiples of the next-to-last row to the rows lying above it, and so on, we can bring the matrix to the form where all the entries off the main diagonal are zero. This is called diagonal form. The values of both f and det remain the same if we replace B by this new matrix C. (Its diagonal entries b11, b22, ..., bnn are the same as those of B.)

Finally, we multiply the first row of C by 1/b11, the second row by 1/b22, the third by 1/b33, and so on; the result is the identity matrix In. This action multiplies the values of f and det by the factor 1/(b11 b22 ... bnn). It follows that

f(In) = f(C)/(b11 ... bnn)  and  det In = det C/(b11 ... bnn),

so that det C = b11 ... bnn and f(C) = f(In)·(b11 ... bnn) = f(In)·det C. Since f(A) = f(C) and det A = det C, we conclude that f(A) = f(In)·det A, as desired.
Besides proving the determinant function unique, this theorem also tells us one way to compute determinants. One applies this version of the Gauss-Jordan algorithm to reduce the matrix to echelon form. If the matrix that results has a zero row, then the determinant is zero. Otherwise, the matrix that results is in upper triangular form with non-zero diagonal entries, and the determinant is the product
of the diagonal entries.
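This description translates into a short routine; the sketch below (mine, purely illustrative) mirrors the elimination used earlier while keeping track of the sign changes caused by row exchanges:

    def determinant(rows):
        A = [row[:] for row in rows]
        n = len(A)
        det = 1.0
        for col in range(n):
            pivot = next((i for i in range(col, n) if A[i][col] != 0), None)
            if pivot is None:
                return 0.0                    # the rows are dependent, so the determinant is zero
            if pivot != col:
                A[col], A[pivot] = A[pivot], A[col]
                det = -det                    # a row exchange changes the sign
            for i in range(col + 1, n):
                factor = A[i][col] / A[col][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[col])]
            det *= A[col][col]                # the determinant is the product of the pivots
        return det

For instance, determinant([[2., 1.], [4., 3.]]) returns 2.0, in agreement with the 2 by 2 formula a1·b2 - a2·b1 = 2·3 - 1·4.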
If the rows of A are not independent, then det A = 0, while if they are independent, then det A ≠ 0. We state this result as a theorem:

Theorem 16. Let A be an n by n matrix. Then A has rank n (that is, A is nonsingular) if and only if det A ≠ 0.
Theorem. Let A and B be n by n matrices. Then

det(A·B) = (det A)·(det B).

Proof. This theorem is almost impossible to prove by direct computation. Try the case n = 2 if you doubt me! Instead, we proceed in another direction: we hold B fixed and consider the function f defined by f(A) = det(A·B). We show that f satisfies the elementary row properties of the determinant function; the preceding theorem then gives

f(A) = f(In)·det A = det(In·B)·det A = (det B)·(det A).

Let A1,...,An be the rows of A; then row i of A·B is Ai·B. Exchanging rows i and j of A has the effect of exchanging rows i and j of A·B; hence it changes the value of f by a factor of -1. Replacing row i of A by Ai + cAj replaces row i of A·B by

(Ai + cAj)·B = Ai·B + c(Aj·B) = (row i of A·B) + c(row j of A·B).

Hence it leaves the value of f unchanged. Finally, replacing the ith row Ai of A by cAi has the effect on A·B of multiplying its ith row by c; hence it multiplies the value of f by c. Thus f satisfies the elementary row properties, and the theorem follows.
We shall derive just one additional result, concerning the inverse matrix.
Exercises 1.
Suppose that
suppose also t h a t
x, y, z are numberssuch t h a t
2.
Let
Calculate
f(In).
Express
7 .
4.
(a) A1 = (1,-1,0), A2 = (0,1,-1), A3 = (2,3,-1).
(b) A1 = (1,-1,2,1), A2 = (-1,2,-1,0), A3 = (3,-1,1,0), A4 = (1,0,0,1).
(c) P ,
= ( ~ . O , O ~ , O 42 = , ( ,~)
A5
I ~ ~ ~ o ~ A3
~= o ( )I o
A4 = ( l . l f O , l t l ) )
= (1,010,010)
A3
~ O ~ ' ~ ~ O , ~ ) ,
(d) A1 = (1,-1), A2 = (0,1), A3 = (1,1).
A formula for A^-1

We know that A has an inverse if and only if it has rank n, and we know that A has rank n if and only if det A ≠ 0. Now we derive a formula for the inverse that involves determinants directly. We begin with a lemma about the evaluation of determinants.
Lemma --
18.
function f of
f(B) = det
, I
where B1 consists of the first j-1 columns of B, arid B cc~nsists 2 of the remainder of B. Then
Proof. You can readily check that f satisfies properties ( 1)-( 3) of the determinant function. Hence- f (B) = f(I 1) -det B. nW
compute
f(In) = det n-j where the large zeros stand for zero matrices of the appropriate size.
A sequence of .j-1 interchanges of adjacent rows gives us the equation
One can apply elementary operations to this matrix, without changing the
value of the determinant, to replace all of the entries al,...,aj-l,aj+l,...,a
n
by zeros. Then the resulting matrix is in diagonal form. We conclude that
Corollary
Consider a n by n
where Bl,
...,B4
Proof. A sequence of i-1 interchanges of adjacent rows wilt king the matrix A to the form given in the preceding l m a .
Corollary 20. If all the entries in the jth column of A are zero except for the entry aij in row i, then det A = (-1)^(i+j)·aij·det Aij. (Here Aij denotes the matrix obtained from A by deleting row i and column j.)

The number (-1)^(i+j) det Aij is given a special name: it is called the (i,j)-cofactor of A. The pattern of the signs (-1)^(i+j) is the familiar checkerboard pattern, beginning with + in the upper left-hand corner.
Theorem 21. Let A be an n by n matrix with det A ≠ 0. If A·B = In, then

bij = (-1)^(i+j) det Aji / det A.

(That is, the entry of B in row i and column j equals the (j,i)-cofactor of A, divided by det A.)

This theorem gives a formula for computing the inverse B of A: one computes det A together with the determinants of the n^2 matrices Aji, each of size (n-1) by (n-1). (It is a practical procedure only in low dimensions!)
Proof.
k t
e ere
of 1
I
in row j . )
because A - In =
, we1 have
~ - ( column ~ i ~ of
Now we introduce a couple of weird matrices for reasons that will become
clear. Using the two preceding equations, we put them together to get
the following matrix equation:
It turns out that when we take determinants of both sides of this equation,
we get exactly the equation of our theorem! First, we show that
det [El
... Ei-l
X Ei+l
... En]
= xi
det
If x. = 0, 1
applies, because the th column of this matrix consistsof zeros except for
i
an entry of
in row j
-1.
Remark 1. If A is a matrix of size n by k, the transpose of A, denoted A^t, is the matrix of size k by n whose entry in row i and column j is aji. (Roughly speaking, A^t is obtained from A by flipping it about its main diagonal.) Of course, (A^t)^t is the same as A.

Using this terminology, the theorem just proved says that the inverse of A is the transpose of its matrix of cofactors, divided by det A:

A^-1 = (cof A)^t / det A.
This formula for A^-1 is used for computational purposes only for 2 by 2 or 3 by 3 matrices; the work simply gets too great otherwise. But it is important for theoretical purposes. For instance, it shows that if the entries of A depend continuously on a parameter, then so do the entries of A^-1, provided det A is never zero.
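For small matrices the cofactor formula is easy to apply directly. Here is an illustrative 3 by 3 version (a sketch of mine, not part of the notes):

    def inverse_3x3(A):
        def det2(a, b, c, d):                 # 2 by 2 determinant
            return a * d - b * c
        # (i, j)-cofactor: (-1)**(i+j) times the determinant of the matrix obtained
        # by deleting row i and column j
        def cofactor(i, j):
            rows = [r for r in range(3) if r != i]
            cols = [c for c in range(3) if c != j]
            m = [[A[r][c] for c in cols] for r in rows]
            return (-1) ** (i + j) * det2(m[0][0], m[0][1], m[1][0], m[1][1])
        # expand det A along the first row
        d = sum(A[0][j] * cofactor(0, j) for j in range(3))
        if d == 0:
            raise ValueError("det A = 0: the matrix has no inverse")
        # entry (i, j) of the inverse is the (j, i)-cofactor divided by det A
        return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]

Applied to the identity matrix this returns the identity matrix, and in general it reproduces the formula A^-1 = (cof A)^t / det A of Theorem 21.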
Remark 2. If det A is small, then dividing by det A in this formula may produce a large change in the computed entries of A^-1. This means, in an engineering problem, that a small error in calculating A (even round-off error) may result in a gross error in the calculated value of A^-1. A matrix for which det A is relatively small is said to be ill-conditioned. In engineering practice, one usually tries to reformulate the problem to avoid dealing with such a matrix.
Exercises
use t h e formula f o r
dirnens ions.
(b)
c :n
0 c d
A
,assmFng
ace
f O .
2. Let A be an n by n matrix with integer entries. Show that if det A = ±1, then the entries of A^-1 are integers.
of p. A.23.
Which of these
t does A-'
5.
Or
A has an entry
ail
row
i equal 0.
Theorem. Let A, B, and C be matrices of size k by k, k by m, and m by m, respectively. Then

det [ A  B ]  =  (det A)·(det C),
    [ 0  C ]

where the matrix on the left is the (k+m) by (k+m) matrix having A in its upper left block, B in its upper right block, the zero matrix in its lower left block, and C in its lower right block.

Prove this theorem as follows. Let B and C be fixed; for each k by k matrix A, define

f(A) = det [ A  B ]
           [ 0  C ].

(a) Show that f satisfies the elementary row properties of the determinant function (as a function of the k by k matrix A).
(b) Use Exercise 5 to show that f(Ik) = det C.
(c) Complete the proof.
Construction of the determinant when n ≤ 3

The actual definition of the determinant function is the least interesting part of this entire discussion. The situation is similar to the situation with respect to the functions sin x, cos x, and e^x: one uses their properties constantly, while referring to the actual definitions only rarely. In this section we construct the determinant function in the cases n ≤ 3, which are the ones familiar to you. This case is in fact all we shall need for our applications to calculus.

Theorem. Let f be a real-valued function of n by n matrices. Suppose that:
(i) Exchanging any two rows of A changes the value of f by a factor of -1.
(ii) For each i, f is linear as a function of the ith row.
Then f satisfies the elementary row properties of the determinant function.
Proof. By hypothesis, f satisfies the first elementary row property. Let A1,...,An be the rows of A. To say that f is linear as a function of row i alone is to say that (when f is written as a function of the rows of A)

(*)  f(A1,..., cX + dY, ..., An) = c·f(A1,..., X, ..., An) + d·f(A1,..., Y, ..., An),

where cX + dY, X, and Y appear in the ith place. The special case d = 0 tells us that multiplying the ith row of A by c has the effect of multiplying the value of f by c; thus f satisfies the third elementary row property.

Finally, consider the operation of replacing row i of A by itself plus c times row j. We then compute (assuming j > i for convenience in notation)

f(A1,..., Ai + cAj,..., Aj,..., An) = f(A1,..., Ai,..., Aj,..., An) + c·f(A1,..., Aj,..., Aj,..., An).

The second term vanishes, since two rows of the matrix involved are the same. (Exchanging them does not change the matrix, but by hypothesis (i) it changes the value of f by a factor of -1; hence this value of f must be zero.) Thus f satisfies the second elementary row property as well.

Definition. We define
det [a] = a,

and for the 2 by 2 matrix with rows (a1, a2) and (b1, b2) we define

det [ a1  a2 ]  =  a1·b2 - a2·b1.
    [ b1  b2 ]

For the 3 by 3 matrix with rows (a1,a2,a3), (b1,b2,b3), (c1,c2,c3) we define the determinant to be

a1(b2·c3 - b3·c2) - a2(b1·c3 - b3·c1) + a3(b1·c2 - b2·c1).
Theorem. The functions just defined satisfy all four properties of the determinant function (for n ≤ 3).

Proof. The fact that the determinant of the identity matrix is 1 follows by direct computation; it remains to check that the hypotheses of the preceding theorem hold.

In the 2 by 2 case, exchanging rows leads to the determinant b1·a2 - b2·a1, which is the negative of a1·b2 - a2·b1. In the 3 by 3 case, the fact that exchanging the last two rows changes the sign of the determinant follows from the 2 by 2 case. The fact that exchanging the first two rows also changes the sign follows similarly if we rewrite the formula defining the determinant so that it is expanded with respect to the second row instead of the first.

It remains to check linearity. A function of the form

f(X) = a·x1 + b·x2 + c·x3 = [a b c]·X

is linear, where X = (x1, x2, x3) is a vector in V3. The determinant of the matrix whose first row is X and whose remaining rows are (b1,b2,b3) and (c1,c2,c3) has this form, where the coefficients a, b, and c involve only the constants bi and cj. Thus the determinant is linear as a function of the first row; linearity in the other rows is checked similarly.
Exercise
*l.
Let us define
det
determinant.
(c) Shwthat exchanging the first two rows changes the sign.
expression as a sum of terms involving det
[Hint:
Write the
pi
bjJ
(d) Show that exchanging any two rows changes the sign.
P i
'7 j.
ith row.
( g ) Conclude that this formula satisfies all the properties of the determinant
function.
Take the positive integers 1, 2, ..., k and write them down in some arbitrary order, say j1, j2, ..., jk. This new ordering is called a permutation of these integers. For each integer ji in this ordering, let us count how many integers follow it in this ordering but precede it in the natural ordering 1, 2, ..., k. This number is called the number of inversions caused by the integer ji. If we determine this number for each integer ji in the ordering and add the results together, the number we get is called the total number of inversions which occur in this ordering. If the number is odd, we say the permutation is an odd permutation; if the number is even, we say it is an even permutation. For example, consider the following reordering of the integers between 1 and 6:

2, 5, 1, 3, 6, 4.

If we count up the inversions, we see that the integer 2 causes one inversion, 5 causes three inversions, 1 and 3 cause no inversions, 6 causes one inversion, and 4 causes none. The sum is five, so the permutation is odd.

If a permutation is odd, we say the sign of that permutation is -; if it is even, we say its sign is +. A useful fact about the sign of a permutation is the following:
Theorem 22. If we interchange two adjacent elements of a permutation, we change the sign of the permutation.

Proof. Let us suppose the elements ji and j(i+1) of the permutation j1, ..., ji, j(i+1), ..., jk are the two we interchange, obtaining the permutation j1, ..., j(i+1), ji, ..., jk. The number of inversions caused by the integers j1, ..., j(i-1) clearly is the same in the new permutation as in the old one, and so is the number of inversions caused by j(i+2), ..., jk. It remains to compare the number of inversions caused by j(i+1) and by ji in the two permutations.

Case I: ji precedes j(i+1) in the natural ordering 1, ..., k. In this case, the number of inversions caused by ji is the same in both permutations, but the number of inversions caused by j(i+1) is one larger in the second permutation than in the first, for ji follows j(i+1) in the second permutation, but not in the first. Hence the total number of inversions is increased by one.

Case II: ji follows j(i+1) in the natural ordering 1, ..., k. In this case, the number of inversions caused by j(i+1) is the same in both permutations, but the number of inversions caused by ji is one less in the second permutation than in the first.

In either case the total number of inversions changes by one, so that the sign of the permutation changes.

EXAMPLE. If we interchange the second and third elements of the permutation considered in the previous example, we obtain 2, 1, 5, 3, 6, 4, in which the total number of inversions is four, so the permutation is even.
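The counting procedure just described is easy to mechanize (illustrative sketch only):

    def sign_of_permutation(perm):
        # Count, for each element, how many later elements precede it in the
        # natural order; the permutation is even or odd according to the total.
        inversions = sum(1 for i in range(len(perm))
                           for j in range(i + 1, len(perm))
                           if perm[j] < perm[i])
        return +1 if inversions % 2 == 0 else -1

For instance, sign_of_permutation([2, 5, 1, 3, 6, 4]) returns -1 (five inversions), and sign_of_permutation([2, 1, 5, 3, 6, 4]) returns +1 (four inversions), in agreement with the examples above.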
Definition. Consider a k by k matrix A = [aij]. Pick out one entry from each row of A; do this in such a way that these entries all lie in different columns of A. Take the product of these entries, and prefix a sign according as the permutation j1, ..., jk of the column indices is even or odd. (Note that we arrange the entries in the order of the rows they come from, and then we compute the sign of the resulting permutation of the column indices.) If we write down all possible such expressions and add them together, the number we get is defined to be the determinant of A. In symbols,

det A = Σ ± a_{1 j1} a_{2 j2} ··· a_{k jk},

where the sum extends over all permutations j1,...,jk of the integers 1,...,k.

REMARK. We apply this definition to the general 2 by 2 matrix, and obtain the formula

det [ a11  a12 ]  =  a11·a22 - a12·a21.
    [ a21  a22 ]
The formula for the determinant of a 4 by 4 matrix involves 24 terms, and for a 5 by 5 matrix it involves 120 terms; we will not write down these formulas. The reader will readily believe that the definition we have given is not very useful for computational purposes!
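As an illustration, the definition can be programmed directly; the short sketch below (in Python, with names chosen only for illustration) expands det A as the signed sum over all permutations of the column indices, with each sign computed by counting inversions exactly as defined above.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation: (-1) raised to the total number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant of a k-by-k matrix A (a list of rows), by the permutation expansion."""
    k = len(A)
    total = 0
    for cols in permutations(range(k)):   # one entry from each row, all in different columns
        term = sign(cols)
        for row, col in enumerate(cols):
            term *= A[row][col]
        total += term
    return total

# The 2-by-2 case reproduces a11*a22 - a12*a21:
print(det([[1, 2], [3, 4]]))   # -2
```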
If A is the k by k identity matrix, every term in the expansion has a factor of zero in it except for the term a11 a22 ... akk, and this term equals 1; hence the determinant of the identity matrix equals 1.

Theorem. If A' is the matrix obtained from A by interchanging two adjacent rows, then det A' = -det A.

Proof. Each term in the expansion of det A' also appears in the expansion of det A, because we make all possible choices of one entry from each row and column when we write down this expansion. The only thing we have to do is to compare what signs this term has when it appears in the two expansions. Let

a(1,j1) ... a(i,ji) a(i+1,j(i+1)) ... a(k,jk)

be a term in the expansion of det A. If we look at the corresponding term in the expansion of det A', we see that we have the same factors, but they are arranged differently. For to compute the sign of this term, we agreed to arrange the entries in the order of the rows they came from, and then to take the sign of the corresponding permutation of the column indices. Thus in the expansion of det A', this term will appear with the entries from rows i and i+1 interchanged. The permutation of the column indices here is the same as above except that the elements ji and j(i+1) have been interchanged. By Theorem 22, this means that this term appears in the expansion of det A' with the sign opposite to its sign in the expansion of det A. Since this result holds for each term in the expansion of det A', we have det A' = -det A.
Suppose that one row of A is the vector [x1 ... xn]. Then in the expansion of det A, each term in the expression equals a constant times xj for some j. (This happens because in forming this term, we picked out exactly one entry from each row of A.) Hence det A can be written in the form c1 x1 + ... + ck xk; that is, the determinant is a linear function of this row.

Exercises

1. Given a term of the sum appearing in the definition of det A, arrange the factors in this term in the order of their column indices, obtaining an expression of the form a(i1,1) a(i2,2) ... a(ik,k). Show that the sign of the permutation j1, j2, ..., jk equals the sign of the permutation i1, i2, ..., ik. Conclude that det A^t = det A in general.

3. Let A be a k by k matrix, and consider its entry in row i and column j. ... [Hint: Write the i-th row as a sum of vectors, each of which has a single non-zero component. Then use the fact that the determinant function is linear as a function of the i-th row.]
The cross product. If A = (a1,a2,a3) and B = (b1,b2,b3) are vectors in V3, we define A x B to be the vector

A x B = ( det[a2 a3; b2 b3],  -det[a1 a3; b1 b3],  det[a1 a2; b1 b2] ).

Theorem. For all vectors A, B, C in V3 and all scalars c, we have

(a) B x A = -A x B;
(b) A x (B + C) = A x B + A x C  and  (B + C) x A = B x A + C x A;
(c) c(A x B) = (cA) x B = A x (cB);
(d) A.(A x B) = 0 and B.(A x B) = 0;
(e) ||A x B||^2 = ||A||^2 ||B||^2 - (A.B)^2.

Proof. Property (a) holds because exchanging two rows of a determinant changes the sign, and (b) and (c) follow because the determinant is linear as a function of each row separately. If C = (c1,c2,c3), then

C.(A x B) = det[c1 c2 c3; a1 a2 a3; b1 b2 b3]

by definition of the determinant. It follows that A.(A x B) = B.(A x B) = 0, because the determinant vanishes if two rows are equal. The only proof that requires some work is (e). For this, we recall that

||A x B||^2 = (a2 b3 - a3 b2)^2 + (a1 b3 - a3 b1)^2 + (a1 b2 - a2 b1)^2,

while the right side equals

(a1^2 + a2^2 + a3^2)(b1^2 + b2^2 + b3^2) - (a1 b1 + a2 b2 + a3 b3)^2.

We first take the squared terms on the left side and show they equal those on the right side; these are the terms appearing in the sum of (ai bj)^2 over all i and j with i not equal to j. Then we take the "mixed" terms on the two sides and check that they agree as well.

We also record the following: for all vectors A, B, C of V3, we have

A.(B x C) = (A x B).C.
Definition. An ordered triple (A,B,C) of vectors of V3 is called a positive triple if

A.(B x C) > 0.

A positive triple is sometimes said to be a right-handed triple, and a negative one is said to be left-handed.

The triple (i, j, k) is a positive triple in V3; if one curls the fingers of one's right hand in the direction from the first to the second, then one's thumb points in the direction of the third. The same is true of any positive triple (A,B,C): if one moves the vectors continuously, changing their lengths and the angles between them, but never letting them become dependent, and if one moves one's right hand around correspondingly, then the fingers still correspond to the new triple (A,B,C) in the same way, and this new triple is still a positive triple, since the determinant cannot have changed sign while the vectors moved around. (Since they did not become dependent, the determinant did not vanish.)

Theorem 29. Let A and B be vectors in V3. If A and B are dependent, then A x B = 0. Otherwise, A x B is the vector of length ||A|| ||B|| sin θ (where θ is the angle between A and B) that is orthogonal to both A and B and is such that the triple (A, B, A x B) forms a positive (i.e., right-handed) triple.

Proof. We know that A x B is orthogonal to both A and B. We compute

||A x B||^2 = ||A||^2 ||B||^2 - (A.B)^2 = ||A||^2 ||B||^2 (1 - cos^2 θ) = ||A||^2 ||B||^2 sin^2 θ.

Finally, if C = A x B, then (A,B,C) is a positive triple, since

A.(B x C) = (A x B).C = ||A x B||^2 > 0.
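The properties just proved are easy to check numerically; in the sketch below (sample vectors chosen arbitrarily, for illustration only) A x B comes out orthogonal to A and to B, and its length agrees with ||A|| ||B|| sin θ.

```python
import math

def cross(A, B):
    a1, a2, a3 = A; b1, b2, b3 = B
    # components are the 2-by-2 determinants in the definition above
    return (a2*b3 - a3*b2, -(a1*b3 - a3*b1), a1*b2 - a2*b1)

def dot(A, B):
    return sum(a*b for a, b in zip(A, B))

A, B = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
C = cross(A, B)
norm = lambda V: math.sqrt(dot(V, V))
theta = math.acos(dot(A, B) / (norm(A) * norm(B)))

print(dot(A, C), dot(B, C))                           # both essentially zero
print(norm(C), norm(A) * norm(B) * math.sin(theta))   # equal, as in Theorem 29
```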
Polar coordinates

Let A = (a,b) be a point of V2 different from O. We wish to define what we mean by a "polar angle" for A. The idea is that it should be the angle between the vector A and the unit vector i = (1,0). But we also wish to choose it so its value reflects whether A lies in the upper or lower half-plane. So we make the following definition:

Definition. Given A = (a,b) different from O, we define the number

(*)   θ = ± arccos(A.i / ||A||),

the sign being + if b > 0 and - if b < 0. Any number of the form 2mπ + θ is also defined to be a polar angle for A.

If b = 0, the sign in this equation is not determined, but that does not matter. For if A = (a,0) where a > 0, then arccos(A.i/||A||) = arccos(1) = 0, so the sign does not matter. And if A = (-a,0) where a > 0, then arccos(A.i/||A||) = arccos(-1) = π. Since +π and -π differ by a multiple of 2π, the sign does not matter here either.

Note: The polar angle θ for A is uniquely determined if we require -π < θ ≤ π. But that is a rather artificial restriction.

Theorem. Let A = (a,b) be a point of V2 different from O. Let r = (a^2 + b^2)^(1/2) = ||A||, and let θ be a polar angle for A. Then a = r cos θ and b = r sin θ.

Proof. If A = (a,0) with a > 0, then r = a and θ = 0 + 2mπ; hence r cos θ = a and r sin θ = 0. If A = (-a,0) with a > 0, then r = a and θ = π + 2mπ; hence r cos θ = -a and r sin θ = 0.

Otherwise, θ = 2mπ ± arccos(a/r). Then a/r = cos(±(θ - 2mπ)) = cos θ, or a = r cos θ. Furthermore,

b^2 = r^2 - a^2 = r^2 (1 - cos^2 θ) = r^2 sin^2 θ,

so b = ±r sin θ. We show that in fact b = r sin θ. For if b > 0, then θ = 2mπ + arccos(a/r), so that 2mπ < θ < 2mπ + π and sin θ is positive. Because b, r, and sin θ are all positive, we must have b = r sin θ rather than b = -r sin θ. On the other hand, if b < 0, then θ = 2mπ - arccos(a/r), so that 2mπ - π < θ < 2mπ and sin θ is negative. Since r is positive, and b and sin θ are negative, we must have b = r sin θ rather than b = -r sin θ.
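The definition and the theorem translate directly into a short computation; the following sketch (an illustration, not part of the notes) computes a polar angle by the formula (*) and recovers a = r cos θ and b = r sin θ.

```python
import math

def polar_angle(a, b):
    """A polar angle for the point A = (a, b) != (0, 0), per the definition above."""
    r = math.hypot(a, b)
    theta = math.acos(a / r)          # arccos(A.i / ||A||), a value in [0, pi]
    return theta if b >= 0 else -theta

for a, b in [(3.0, 4.0), (-1.0, -2.0), (-5.0, 0.0)]:
    r, theta = math.hypot(a, b), polar_angle(a, b)
    print((a, b), (r * math.cos(theta), r * math.sin(theta)))   # recovers (a, b)
```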
Planetary Motion

In the text, Apostol shows how Kepler's three (empirical) laws of planetary motion can be deduced from the following two laws:

(1) Newton's second law of motion: F = ma.

(2) Newton's law of universal gravitation: the force on each object has magnitude GmM/r^2 and is directed toward the other object. Here m, M are the masses of the two objects, r is the distance between them, and G is a universal constant.

Here we show (essentially) the reverse: how Newton's laws can be deduced from Kepler's. More precisely, suppose a planet P moves in the xy plane with the sun at the origin. Newton's laws tell us that the acceleration of P is given by the equation

a = -(λ/r^2) u_r,

where u_r is the unit vector pointing from the sun toward the planet. That is, Newton's laws tell us that there is a number λ such that this equation holds, and that λ is the same for all planets in the solar system. (One needs to consider other systems to see that λ involves the mass of the sun.) This is what we shall prove. We use the formula for acceleration in polar coordinates (Apostol, p. 542):

a = [d^2r/dt^2 - r (dθ/dt)^2] u_r + [r d^2θ/dt^2 + 2 (dr/dt)(dθ/dt)] u_θ.

We also use some facts about area that we shall not actually prove until Units VI and VII of this course.
Theorem. Suppose a planet P moves in the xy plane with the sun at the origin.

(a) Kepler's second law implies that the acceleration is radial.

(b) Kepler's first and second laws imply that

a = -(λ_P / r^2) u_r,

where λ_P is a constant that may depend on the planet P.

(c) Kepler's three laws imply that λ_P is the same for all planets.

Proof. (a) We use the following formula for the area swept out by the radial vector:

A = ∫ (1/2) r^2 dθ.

Differentiating, we have dA/dt = (1/2) r^2 dθ/dt, which is constant by Kepler's second law. That is,

(*)   r^2 dθ/dt = K

for some constant K. Differentiating again, we have

r^2 d^2θ/dt^2 + 2r (dr/dt)(dθ/dt) = 0,   that is,   r [r d^2θ/dt^2 + 2 (dr/dt)(dθ/dt)] = 0.

The expression in brackets is just the transverse component (the u_θ component) of a. Hence a is radial.
(b) To apply Kepler's first law, we need the equation of an ellipse with focus at the origin. We use the fact that an ellipse is the locus of all points (x,y) the sum of whose distances from (0,0) and (a,0) is a constant b > a. In polar coordinates centered at the origin, this condition becomes

r + sqrt(r^2 - 2a(r cos θ) + a^2) = b,

and solving for r gives an equation of the form

r = c / (1 - e cos θ),   where e = a/b and c = (b^2 - a^2)/(2b).

(The number e is called the eccentricity of the ellipse, by the way.) Now we compute the radial component of the acceleration. Differentiating the equation of the ellipse with respect to t and simplifying,

dr/dt = (1/c)(-1) r^2 (e sin θ) dθ/dt = -(1/c)(e sin θ) K,

where we have used (*). Differentiating again, we have

d^2r/dt^2 = -(1/c)(e cos θ) K dθ/dt = -(K^2/c)(e cos θ)/r^2.

Hence

a_r = d^2r/dt^2 - r (dθ/dt)^2 = -(K^2/r^2)[(e cos θ)/c + 1/r] = -(K^2/r^2)[(e cos θ)/c + (1 - e cos θ)/c] = -(K^2/c)(1/r^2).

Thus, as desired,

a = -(λ_P/r^2) u_r   with   λ_P = K^2/c.
(c) To apply Kepler's third law, we need a formula for the area of an ellipse, which will be proved later, in Unit VII. It is

Area = π (major axis)(minor axis)/4.

For our ellipse it is easy to see that the major axis equals b and that

minor axis = sqrt(b^2 - a^2).

Now we can apply Kepler's third law. Since area is being swept out at the constant rate (1/2)K, we know that (since the period is the time it takes to sweep out the entire area)

Area = (1/2) K (Period).

Kepler's third law states that the following number is the same for all planets:

(Period)^2 / (major axis)^3 = [2 Area / K]^2 / b^3 = [π^2 b^2 (b^2 - a^2) / (4 K^2)] / b^3 = (π^2/4)(b^2 - a^2)/(b K^2) = (π^2/2)(c/K^2) = π^2 / (2 λ_P).

Hence if this number is the same for all planets, then λ_P is the same for all planets.
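The converse direction can also be checked numerically. The following sketch (initial conditions and constants chosen arbitrarily, for illustration only) integrates the motion of a particle under an inverse-square radial acceleration and confirms that r^2 dθ/dt, twice the rate at which area is swept out, stays constant, in accordance with Kepler's second law.

```python
# For motion under a = -(lam/r^2) u_r, the quantity x*vy - y*vx = r^2 * dtheta/dt
# should stay constant.  (Semi-implicit Euler integration; a rough sketch.)
lam = 1.0
x, y = 1.0, 0.0          # initial position
vx, vy = 0.0, 0.8        # initial velocity
dt, steps = 1e-3, 20_000

samples = []
for n in range(steps):
    r = (x*x + y*y) ** 0.5
    ax, ay = -lam * x / r**3, -lam * y / r**3
    vx += ax * dt; vy += ay * dt
    x += vx * dt;  y += vy * dt
    if n % 5_000 == 0:
        samples.append(x * vy - y * vx)   # r^2 * dtheta/dt

print(samples)   # essentially the same value at every sample
```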
Then f is continuous. Let P be the partition

P = {0, 1/n, 1/(n-1), ..., 1/3, 1/2, 1}.

Draw a picture of the inscribed polygon determined by P in the case n = 5. Show that in general, this inscribed polygon has length at least

1 + 2(1/2 + 1/3 + ... + 1/n).

Conclude that f is not rectifiable.
(5) Let u be a fixed unit vector. A particle moves in Vn in such a way that its position vector r(t) satisfies the equation r(t).u = 5t^3 for all t, and its velocity vector makes a constant angle with u. Its speed at time t is e^(2t). Find ||a(t)||, and find the angle between u and a(t) at time t.

(6) Consider r = e^(- ...), where N is a positive integer.
Derivatives of vector functions

Suppose f is defined in an open ball about a in R^n, taking values in R^k; we write f(x) = (f1(x), ..., fk(x)), where each fi is a scalar function. If each partial derivative Dj fi exists at a, we define the derivative of f at a to be the matrix of these partials. Said differently, the derivative Df(a) of f at a is the k by n matrix whose entry in row i and column j is Dj fi(a); it is often called the Jacobian matrix of f at a.

We say that f is differentiable at a if

f(a + h) - f(a) = Df(a) h + E(h) ||h||,

where E(h) -> 0 as h -> 0. (Here h is written as a column matrix in R^n, and f(a + h) - f(a) and E(h) are written as column matrices in R^k.)
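Since the Jacobian matrix is just the array of partial derivatives, it can be approximated by finite differences; the following sketch (an illustration, not part of the notes) does this for a sample function.

```python
def jacobian(f, a, h=1e-6):
    """Finite-difference approximation to the k-by-n Jacobian matrix Df(a).
    Entry (i, j) approximates D_j f_i(a)."""
    fa = f(a)
    k, n = len(fa), len(a)
    J = [[0.0] * n for _ in range(k)]
    for j in range(n):
        a_shift = list(a)
        a_shift[j] += h
        fj = f(a_shift)
        for i in range(k):
            J[i][j] = (fj[i] - fa[i]) / h
    return J

# Example: f(x, y) = (x*y, x + y**2) has Jacobian [[y, x], [1, 2y]].
f = lambda p: (p[0] * p[1], p[0] + p[1] ** 2)
print(jacobian(f, [2.0, 3.0]))   # approximately [[3, 2], [1, 6]]
```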
There are several relations between differentiability of f and differentiability of its coordinate functions; we consider some of them here.

Theorem 1. The function f(x) is differentiable at a if and only if each of its coordinate functions fi(x) is differentiable at a.

Proof. By definition, f is differentiable at a if and only if

f(a + h) - f(a) = Df(a) h + E(h) ||h||,

where E(h) -> 0 as h -> 0. (Here f(a + h) - f(a), Df(a) h, and E(h) are column matrices.) Comparing the i-th entries of these matrices, we have the following equation:

fi(a + h) - fi(a) = Dfi(a) h + Ei(h) ||h||.

Now f is differentiable at a if and only if E(h) -> 0 as h -> 0. And each function fi is differentiable at a if and only if Ei(h) -> 0 as h -> 0. But Ei(h) -> 0 for each i if and only if E(h) -> 0 as h -> 0, so the theorem follows.
Theorem 2. If f(x) is differentiable at a, then f is continuous at a.

Proof. If f is differentiable at a, then so is each coordinate function fi; then in particular each fi is continuous at a, so f is continuous at a.

Now suppose that f(x) = f(x1, ..., xn) is a scalar function and that x(t) = (x1(t), ..., xn(t)) is a parametrized curve passing through a, with x(t0) = a. If x(t) is differentiable at t0 and f(x) is differentiable at a, then f(x(t)) is differentiable at t0, and its derivative is given by the equation

d/dt f(x(t)) = grad f(x(t)) . x'(t)   when t = t0.
We can rewrite this formula in scalar form as follows:

d/dt f(x(t)) = D1 f(x(t)) dx1/dt + ... + Dn f(x(t)) dxn/dt.

Recalling the definitions of the matrices Df and Dx, we see that the latter formula can be written in the form

d/dt f(x(t)) = Df(x(t)) Dx(t).

(Note that the matrix Df is a row matrix, while the matrix Dx is by its definition a column matrix.) This is the form of the chain rule that we find especially useful, for it is the formula that generalizes to higher dimensions.

Let us now consider a composite of vector functions of vector variables. Suppose f is defined on an open ball about a in R^n, taking values in R^k, with f(a) = b. Suppose g is defined in an open ball about b, taking values in R^p. Let

F(x) = g(f(x))

denote the composite function. We write f(x) = f(x1, ..., xn) and g(y) = g(y1, ..., yk).
If f and g are differentiable at a and b respectively, it is easy to see that the partial derivatives of F(x) exist at a, and to calculate them. After all, the i-th coordinate function of F(x) is given by the equation

Fi(x) = gi(f1(x), ..., fk(x)).

If we set each of the variables other than xj equal to the corresponding coordinate of a and let the variable xj alone vary, the coordinate functions of f become functions of xj alone. The chain rule already proved then gives us the formula

(*)   Dj Fi(a) = D1 gi(b) Dj f1(a) + ... + Dk gi(b) Dj fk(a).

Thus the entry of DF(a) in row i and column j equals the product of the i-th row of Dg(b) with the j-th column of Df(a).

If f and g are continuously differentiable on their domains, then the composite F(x) = g(f(x)) is continuously differentiable on its domain, and

DF(x) = Dg(f(x)) Df(x).

This matrix equation is useful for theoretical purposes; one usually uses the scalar formula (*) when one calculates partial derivatives of a composite function, however. The following proof is included solely for completeness; we shall not need to use it:
Theorem 4. Let f and g be as above. If f is differentiable at a and g is differentiable at b, then F = g(f(x)) is differentiable at a, and

DF(a) = Dg(b) Df(a).

Proof. We know that

g(b + k) - g(b) = Dg(b) k + E1(k) ||k||,

where E1(k) -> 0 as k -> 0. Let us set

k = f(a + h) - f(a)

in this formula; then g(b + k) = g(f(a + h)) = F(a + h), so the left side becomes F(a + h) - F(a). Thus the Jacobian matrix of F at a will turn out to be Dg(b) Df(a); this is our generalized version of the chain rule. (The estimate that completes the proof is carried out below.)
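The matrix equation DF(a) = Dg(b) Df(a) is easy to check numerically; the sketch below (sample functions chosen only for illustration) compares a finite-difference Jacobian of the composite with the product of the Jacobians of the pieces.

```python
def jacobian(f, a, h=1e-6):
    """Finite-difference Jacobian of f at a (same sketch as before)."""
    fa = f(a)
    J = []
    for i in range(len(fa)):
        row = []
        for j in range(len(a)):
            a2 = list(a); a2[j] += h
            row.append((f(a2)[i] - fa[i]) / h)
        J.append(row)
    return J

def matmul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

f = lambda p: (p[0] ** 2, p[0] * p[1], p[1] + 1.0)        # R^2 -> R^3
g = lambda q: (q[0] + q[1] * q[2], q[1] ** 2)             # R^3 -> R^2
F = lambda p: g(f(p))

a = [1.5, -0.5]
b = f(a)
print(jacobian(F, a))                          # Jacobian of the composite, directly
print(matmul(jacobian(g, b), jacobian(f, a)))  # Dg(b) Df(a): the same, up to rounding
```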
There is, however, a problem here. We have just shown that if f and g are differentiable, then the partial derivatives of the composite function F(x) = g(f(x)) exist (see Theorem 4 and formula (*)). But the existence of the partial derivatives of the functions Fi is not enough to guarantee that F is differentiable. One needs to know that both f and g are continuously differentiable. In this case, the partials of f and g are continuous on their respective domains; then the formula (*) shows that Dj Fi is also a continuous function of x. Then by our basic theorem, each Fi is differentiable, so that F is differentiable, by definition. We summarize these facts as follows:
Theorem 3. Let f be defined on an open ball in R^n about a, taking values in R^k; let f(a) = b. Let g be defined in an open ball about b, taking values in R^p. If f and g are continuously differentiable on their respective domains, then the composite F(x) = g(f(x)) is continuously differentiable, and DF(x) = Dg(f(x)) Df(x).
We return now to the proof of Theorem 4. We know that

f(a + h) - f(a) = Df(a) h + E2(h) ||h||,

where E2(h) -> 0 as h -> 0. Substituting k = f(a + h) - f(a) into the earlier formula, we get the equation

F(a + h) - F(a) = Dg(b)[Df(a) h + E2(h) ||h||] + E1(f(a + h) - f(a)) ||f(a + h) - f(a)||.

Thus

F(a + h) - F(a) = Dg(b) Df(a) h + E3(h) ||h||,

where

E3(h) = Dg(b) E2(h) + E1(f(a + h) - f(a)) ||f(a + h) - f(a)|| / ||h||.

We must show that E3(h) -> 0 as h -> 0. The first term is easy, since Dg(b) is constant and E2(h) -> 0 as h -> 0. Furthermore, as h -> 0, the expression f(a + h) - f(a) approaches 0 (since f is continuous), so that E1(f(a + h) - f(a)) -> 0. It remains to show that the quotient ||f(a + h) - f(a)|| / ||h|| is bounded as h -> 0. Now

||f(a + h) - f(a)|| / ||h|| = ||Df(a) u + E2(h)|| ≤ ||Df(a) u|| + ||E2(h)||,

where u = h/||h|| is a unit vector. We know that E2(h) -> 0 as h -> 0, and it is easy to see that ||Df(a) u|| is bounded as u ranges over all unit vectors. (Exercise!) Hence the expression ||f(a + h) - f(a)|| / ||h|| is bounded, and we are finished.
Part, but not all, of this theorem generalizes to vector functions. We shall show that if a differentiable function has a differentiable inverse, then the derivative of the inverse is the inverse matrix of the derivative.

Theorem. Let A be a subset of R^n. Suppose that f : A -> R^n has an inverse function g, that f is differentiable at the interior point a of A, and that g is differentiable at the point b = f(a). Then

Dg(b) = [Df(a)]^(-1).

Proof. Because g is inverse to f, we have

g(f(x)) = x

for all x in A. Now both f and g are differentiable, and so is the composite function, so we can use the chain rule to compute the derivative of g(f(x)) at a; it equals Dg(b) Df(a). On the other hand, the derivative of the identity function is the n by n identity matrix. Hence Dg(b) Df(a) = I, so that Dg(b) = [Df(a)]^(-1).
Remark 1. This theorem shows that in order for the differentiable function f to have a differentiable inverse, the matrix Df(a) must be non-singular, that is, it must have rank n. Conversely, if Df(a) has rank n, then there is some (probably smaller) open ball B about a such that f carries B in a 1-1 fashion onto an open set C in R^n.

Remark 2. In the case n = 2, where f(x,y) = (u,v) is a transformation from the (x,y) plane to the (u,v) plane, the rule for the derivative of the inverse function x = g(y) is often written in the form: if f(a) = b, then

Dg(b) = [Df(a)]^(-1).

If we write out these matrices in Leibnitz notation, we obtain the equation

[∂x/∂u  ∂x/∂v; ∂y/∂u  ∂y/∂v] = [∂u/∂x  ∂u/∂y; ∂v/∂x  ∂v/∂y]^(-1).

Now the formula for the inverse of a 2 by 2 matrix gives, for instance,

∂x/∂v = -(∂u/∂y) / [(∂u/∂x)(∂v/∂y) - (∂u/∂y)(∂v/∂x)].
Implicit differentiation

Suppose one is given a system of equations in several unknowns. Under suitable hypotheses one expects to be able to solve this system for some of the unknowns in terms of the others, say for x in terms of y, and one expects the resulting function x = X(y) to be differentiable. Assuming this expectation to be correct, one can then calculate the derivative of the resulting function X as a function of y by using the chain rule. One understands best how this is done by working examples; Apostol works several in the text.

First let us consider the problem discussed on p. 294 of the text. It involves an equation of the form

F(x,y,z) = 0,

where F is continuously differentiable. Assuming that one can in theory solve this equation for z as a function of x and y, say z = f(x,y), Apostol derives equations for the partials of this unknown function:

∂f/∂x = -(∂F/∂x)/(∂F/∂z)   and   ∂f/∂y = -(∂F/∂y)/(∂F/∂z).

Here the functions on the right side of these equations are evaluated at the point (x,y,f(x,y)). One must of course assume that ∂F/∂z is not zero in order to carry out these calculations. It is a remarkable fact that the condition ∂F/∂z ≠ 0 is also sufficient to justify the assumptions we made in carrying them out. This is a consequence of a famous theorem of Analysis called the Implicit Function Theorem. One consequence of this theorem is the following: If one has a point (x0,y0,z0) that satisfies the equation F(x,y,z) = 0, and if ∂F/∂z ≠ 0 at this point, then there exists a unique differentiable function f(x,y), defined in an open set B about (x0,y0), such that f(x0,y0) = z0 and such that F(x,y,f(x,y)) = 0 for all (x,y) in B. Of course, once one knows that f exists and is differentiable, one can find its partials by implicit differentiation.
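The formulas above are easy to check numerically. The sketch below (using the equation x^2 + y^2 + z^2 - 4 = 0, which also appears in the example that follows) compares the implicit formula for ∂f/∂x with a finite difference of the explicit solution.

```python
import math

# Implicit differentiation: df/dx = -(dF/dx)/(dF/dz) for F(x,y,z) = x^2 + y^2 + z^2 - 4.
Fx = lambda x, y, z: 2 * x
Fz = lambda x, y, z: 2 * z

x, y = 1.0, 1.0
z = math.sqrt(4 - x**2 - y**2)        # explicit solution z = f(x, y)

implicit = -Fx(x, y, z) / Fz(x, y, z)

h = 1e-6
explicit = (math.sqrt(4 - (x + h)**2 - y**2) - z) / h   # finite-difference df/dx

print(implicit, explicit)   # both about -0.7071
```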
Example. Consider the equation

F(x,y,z) = x^2 + y^2 + z^2 - 4 = 0.

At the point a = (0,2,0) we have ∂F/∂z = 2z = 0, so the theorem gives no information there. However, the point b = (1,1,sqrt 2) satisfies the equation also, and ∂F/∂z ≠ 0 at this point. The Implicit Function Theorem therefore implies that there is a function f(x,y) defined in a neighborhood of (x0,y0) = (1,1) such that f(1,1) = sqrt 2 and f satisfies the equation F(x,y,z) = 0 identically. Note that f is not uniquely determined unless we specify its value at (x0,y0): there are two functions f defined in a neighborhood of (1,1) that satisfy the equation, namely

z = [4 - x^2 - y^2]^(1/2)   and   z = -[4 - x^2 - y^2]^(1/2).

At the point a = (0,2,0), on the other hand, ∂F/∂y ≠ 0, so y is determined as a function of (x,z) near this point. The picture makes this fact clear.
Now let us consider the more general situation discussed on p. 296 of the text. We have two equations

(*)   F(x,y,z,w) = 0   and   G(x,y,z,w) = 0,

where F and G are continuously differentiable. (We have inserted an extra variable to make things more interesting.) Assuming there are functions x = X(z,w) and y = Y(z,w) that satisfy these equations for all points (z,w) in an open set in the (z,w) plane, we have the identities

F(X,Y,z,w) = 0   and   G(X,Y,z,w) = 0.

Differentiating these identities with respect to z by the chain rule gives

(∂F/∂x)(∂X/∂z) + (∂F/∂y)(∂Y/∂z) + ∂F/∂z = 0,
(∂G/∂x)(∂X/∂z) + (∂G/∂y)(∂Y/∂z) + ∂G/∂z = 0.

These are linear equations for ∂X/∂z and ∂Y/∂z; we can solve them if the coefficient matrix

∂(F,G)/∂(x,y) = [∂F/∂x  ∂F/∂y; ∂G/∂x  ∂G/∂y]

is non-singular. The functions on the right side of the resulting equations are evaluated at the point (X(z,w), Y(z,w), z, w), so that both sides are functions of z and w alone. You can check that one obtains similar equations for the other partials.

Again, it is a remarkable fact that this condition is also sufficient to justify the assumptions we have made. Specifically, the Implicit Function Theorem tells us that if (x0,y0,z0,w0) is a point satisfying the equations (*), and if the matrix ∂(F,G)/∂(x,y) is non-singular at this point, then there do exist unique differentiable functions X(z,w) and Y(z,w), defined in an open set about (z0,w0), such that X(z0,w0) = x0 and Y(z0,w0) = y0 and such that the equations (*) hold identically when X and Y are substituted for x and y. Thus under this assumption all our calculations are justified.
E3mple 3,
The points
At the point
[ y],
whichis
non-singular. Therefore, there exist unique functions x = X(z,w) and y = Y(Z ,w)
satisfy these equations identically, such that X(-1,O) = 1 and Y(-1,O) = 2 . SSnce we know the values of X and Y at the point (-l,O), we
Indeed,
can find the values of their partial derivatives at this point also.
Om
matrix hF,G/bx,y
which is singular. Therefore we do not expect to be able to solve for x and y in terms of z d we have
w nc.ar this point. However, at this point,
<
I1
Therefore, the implicit function theorem implies that we and w in terms of y and z near this point.
can
solve for x
Exercises
1. Given the continuously differentiable scalar field f(x, ) , y 2 3 9 let + t = f(t ,t + 1. Find ( 1 , given that Vf(1,2) = 5i - J () )
*.
at the point
5.
(2,1,2).
Let
be a s c a l a r function of
variables.
Define
Express
f
F8(1)
i n terms o f t h e f i r s t o r d e r p a r t i a l s o f
(3,3,2).
i n terms o f t h e f i r s t a n d s e c o n d o r d e r
a t the point (3,3,2).
Express
p a r t i a l s of
6. Let
f : R
f
---r
R'
and let
g : R~
Suppose t h a t
f(0,O)
= (1,2)
-( 1 , 2 ) f
= (0'0).
g(O,O) = ( 1 , 3 , 2 )
d l J ) = (-1,Opl)-
Suppose t h a t
a)
If
h ( 4 ) =g(f(x)), f i n d
b)
If
has an i n v e r s e
Dh(0,O). 2 2 : R R ,
find
Dk(0,O).
JX/&
md
3 ~ / & at
the point
( Z ~ , W ~ ) (-1 =
, 0).
G = 0,
at the point
1 , 2 - 2 )
pairs of variables is it possible to solve in terms of the other two near this point?
The second-derivative test for extrema of a function of two variables

Theorem. Suppose that f(x1,x2) has continuous second-order partial derivatives in an open ball centered at a, and suppose that D1f and D2f vanish at a. Let

A = D1,1 f(a),   B = D1,2 f(a),   C = D2,2 f(a).

(a) If B^2 - AC > 0, then f has neither a relative maximum nor a relative minimum at a.

(b) If B^2 - AC < 0 and A > 0, then f has a relative minimum at a.

(c) If B^2 - AC < 0 and A < 0, then f has a relative maximum at a.

(d) If B^2 - AC = 0, the test is inconclusive.
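As a quick illustration (a sketch, not part of the notes; the sample values are chosen only for illustration), the test can be packaged as a small routine:

```python
def classify(A, B, C):
    """Second-derivative test at a critical point, with
    A = D11 f(a), B = D12 f(a), C = D22 f(a)."""
    disc = B * B - A * C
    if disc > 0:
        return "neither a maximum nor a minimum (saddle)"   # part (a)
    if disc < 0:
        return "relative minimum" if A > 0 else "relative maximum"  # parts (b), (c)
    return "test inconclusive"                               # part (d)

# f(x, y) = x^2 + 3*x*y + y^2 at the origin: A = 2, B = 3, C = 2.
print(classify(2.0, 3.0, 2.0))   # saddle
# f(x, y) = x^2 + y^2 at the origin: A = 2, B = 0, C = 2.
print(classify(2.0, 0.0, 2.0))   # relative minimum
```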
Proof.
S t e p 1.
f (xl, x2) a -.
centered a t Then
= (hlk).
where
* a -
a -
to
a - + tv. -
i.e .
Let
F ( t -) v
denote t h e l e f t s i d e o f t h i s e q u a t i o n .
F(tv)
We
w i l l be concerned about t h e s i g n of
when f
is small,
h a s a l o c a l maxi-
a -,
o r neither. v = (h,k)
Step 3 ,
be a u n i t v e c t o r .
Consider t h e q u a d r a t i c f u n c t i o n
W e s h a l l determine what v a l u e s
takes a s
v -
v a r i e s over t h e
B~
AC
<
0,
t h e n w e show t h a t
(v)
has
f o r a l l u n i t vectors
v.
thus
When
-= v
A
(1,0),
then
Q(v) = A ;
Q(v)
h a s t h e same s i g n a s function
[ O ,21~1 ,
i n t h i s case.
t
~ o n i i d e r h e continuous t
ranges o v e r t h e i n t e r v a l ranges o v e r a l l u n i t v e c t o r s
(cos t , s i n t )
in
V2.
I f t h i s f u n c t i o n t a k e s on a v a l u e whose s i g n i s d i f f e r
A,
e n t from t h a t o f t h e r e must be a
to such t h a t
(ho,ko).
Now i f
ho # 0,
t h i s means t h a t
i s a real r o o t o f t h e e q u a t i o n
g ( t ) = f(al + th, a2 + t k ) .
W know t h a t e between
0
g ( t ) = g(0) and
t.
2 g l ( 0 ) - t + g W ( c ) */2! t
where
g
is
C a l c u l a t i n g the d e r i v a t i v e s o f
gives
(*)
follows.
Here
a -
= a
- + cv, -
where
is
and
t.
S t e p 2. vanish a t
I n t h e p r e s e n t c a s e , +&e f i r s t p a r t i a l s o f so t h a t
The o n l y r e a s o n this
approximation r a t h e r t h a n
equality
a -
i n s t e a d of a t
a -,
elementary e p s i l o n i c s .
+ E 1l D .
f ( a * )-A] h2
+ [
D ~ f ( a * ) -c] k 2 . 2
a -
is close to
-, a
B~
AC
>
0.
0,
AC
>
0.
AC
on both positive and negative values. Proof. If A # 0, the equation Ax2 + Bx + C = 0 Thus the equation. y = Ax
2
2Bx
points.
B2
On the other hand, if A = 0, then B # 0 (since 2 in this case the equation y = Ax + 2Bx + C AC > 0 ) ; It follows that in
for which
and a number xl
for which
Let
ho/ko
xo
and let
(hl,kl)
Then Q(ho,k,)
< 0 and
Step 4.
i
W e prove p a r t
( a ) of t h e theorem.
Assume
Q(&)
B~
AC
> 0.
Let
% be a u n i t v e c t o r f o r which
(**),
>
0.
Examining formula
w e see t h a t t h e expression
2 [ f(a+tv)
- -
Q(vo) a s
L
t ->
0.
Let
the
s t r a i g h t l i n e from
f
to
- + G, a x
(5)
o t h e r hand, i f
vl
to
i s a p o i n t a t which
Q(vl) <
0,
approaches
along t h e s t r a i g h t
l i n e from
0 -
5 +
vl,
f
the e x p r e s s i o n
f (5)
f (5) approaches
through n e g a t i v e v a l u e s .
W e conclude t h a t
-. a
S t e p 5.
W e prove p a r t s (b)
theorem.
(Q(x)
1 >
v.
Then
IQ
(1) h a s a p o s i t i v e minimum I
(Apply t h e extreme[Q(COS
as
v
G
ranges o v e r a l l u n i t v e c t o r s .
v a l u e theorem t o t h e continuous f u n c t i o n
0
t , s i n t ) 1,
for
<t
2r.)
N w choose o
i s less
i s within
6
If
of
-. a
0 < t
< 6,
A
then
-* a
0
i s on
to
(*)
- + 6v; a -
since
is a u n i t vector, then
whenever whenever
the r i g h t s i d e . 0 5
)
h a s t h e same s i g n a s
< t < 6.
If
> 0,
t h i s means t h a t
f (x)
- - f (a) > 0 -
0
I
then
6,
so
f
0
h a s a r e l a t i v e minimum a t
a.
so
If
<
0,
f (a)
<
whenever
--
6,
f h a s a rela
t i v e maximum a t
a -.
Exercises
1.
Show t h a t t h e f u n c t i o n
a t the o r i g i n , w h i l e t h e f u n c t i o n
x4 x4
y4
y4
there.
Conclude t h a t t h e s e c o n d - d e r i v a t i v e t e s t i s i n c o n c l u s i v e
2.
f (x)
of a s i n g l e variwhat c a n you s a y f
at
able.
If
f 1( a ) = f" ( a ) = 0
and
f"' ( a ) # 0,
a b o u t t h e e x i s t e n c e o f a r e l a t i v e maximum o r minimum a t
3.
!
a?
Suppose near
f (x) h a s c o n t i n u o u s d e r i v a t i v e s o f o r d e r s Suppose
,...,n+l
x = a.
and
f ("+l) ) # 0 . (a
t i v e maximum o r minimum. o f
4.
(a)
Suppose
f (xl, x2)
h a s continuous third-order
(*)
p a r t i a l s near
-. a
Derive a t h i r d - o r d e r v e r s i o n o f formula
of
the p r e c e d i n g theorem.
(b)
and s i m i l a r l y f ~ r(hDl+kD2)
I
R ~ .
in
R~
and
[a,b]
Q
!
= [a,b]
[c,d] = {(x,y)
and
c 5 y
d. )
[a,b]
and
[c,d]
Q.
If P1 =
{X~'X~'*
.'xn]
is a partition of
[a,b],
and if
P1
P2
Since
P1
partitions
partitions
Let
Q = [a,bl
Then, given
6
[c,dl
>
f
0,
there
is
q partition
of
such
is
bounded span
on
of
following terminology:
Q,
is
tt6-nice" if
is
span of
r.
in the set
is defined b y
the equation
S1
is a s u b s e t o f
f spanS f .
S,
then
span
1
To b e g i n , w e n o t e t h e f o l l o w i n g e l e m e n t a r y f a c t :
Qo
Suppose
[ao,boI x [co,d0I
Q.
is any r e c t a n g l e contained i n
ponent i n t e r v a l
I1 = [ a o , p J
L e t u s b i s e c t t h e f i r s t com
[ao,bo]
of
Po
and
I2 = [ p , b o ] ,
[ao,bo]. vals
I
[co,d0]
J1
Then
Qo
is w r i t t e n a s t h e union o f t h e
four rectangles
I1
J1
and
I2
J1
and
I1
and
I2 x J2.
eo-
Now i f e a c h o f t h e s e r e c t a n g l e s h a s a p a r t i t i o n t h a t i s nice,
t i o n of
t h a t is
ro-nice.
The f i g u r e i n d i c a t e s t h e p r o o f ;
e a c h of t h e s u b r e c t a n g l e s o f t h e n e w p a r t i t i o n i s c o n t a i n e d i n a subrectangle o f one o f t h e o l d p a r t i t i o n s .
Now we prove the theorem. false and derive a contradiction. some nice. eo
We suppose the theorem is That is, we assume that for has no partition that is ao
Let us bisect each of the component intervals of writing Q as the union of four rectangles.
Q,
partition that is
nice.
tangles
Q,Q1,Q2,.
.. .
(s,t)
such
Because the
Qm become arbitrarily small as m increases, and because they all contain the point ( s , t ) , we can choose m large enough that Qm lies within this ball. Since (s,t), Qm is contained in
f in
centered at
the span of
is a parti-
Let
i
continu
rectangle Set
9.
&
bounded
9.
Proof. is a-nice.
= 1,
that
Q
f
subrectangles, say
...,R mn'
>Now
for
I ( ) S Mi fil
1~ e Ri.
Then if
M = max{M1,
...,Mmn }
we have
If(x)l
for all
y E Q.
I3
I M
Theorem. Let f be a scalar function continuous on the rectangle Q. Then there are points x0 and x1 of Q such that

f(x0) ≤ f(x) ≤ f(x1)   for all x in Q.

Proof. We know f is bounded on Q; let

M = sup{f(x) | x in Q}.

We wish to show there is a point x1 of Q such that f(x1) = M. Suppose there is no such point. Then the function

g(x) = 1/(M - f(x))

is continuous on Q, hence it is bounded on Q; let C be a positive number such that g(x) ≤ C for all x in Q. Then M - f(x) ≥ 1/C for all x in Q, or

f(x) ≤ M - (1/C)   for all x in Q,

contradicting the fact that M is the least upper bound of the values of f on Q. A similar argument produces a point x0 at which f attains its infimum.
1.
parabolic arc
y = x
for
-1
x 5
p.
'
2.
Let
on the set
(a)
(x,y)
+
S.
0.
Show that
on
(b)
(-s,~)
d from a
x2
(a,O)
to
when
+ y2 = a2
Compute it when
3.
Let
be as in problem 2.
x
Let
(x,y)
with
>
0.
i--- :
that is
defined in
U.
fi
+(x,g~=
J - F ~ --. i * ~ n r ~ i t t e A
-T
CLccrYc
4.
subset that
I
s of
R".
*2
Suppose that f =
md
=.a% in
s.
%W O
Then there
>
0,
satisfy
then -
A =
11
f .
f.
and
- integrable are on
Q.
(b)
Q2.
Then
L e t Q be s u b d i v i d e d -- r e c t a n g l e s i n t o two - i n t e g r a b l e o v e r Q - - -n l y - is i f and o - i f i t
and Q2;
Q1
and -
is
i n t e g r a b l e -- Q1 over both
furthermore,
(c)
over
If -
on -
Q,
and i - -f
and -
are -integrable
Q,
then
For
example, c o n s i d e r t h e formula
where
and
are i n t e g r a b l e .
s1 C f C tl
on
Q,
and such t h a t
lJQ If
JJ,
I\
J
Q g,
sl,
W e choose s t e p f u n c t i o n s
and
S2 C
< t2
are also step functions relative to this partition. one adds the earlier inequalities to obtain
Furthermore,
Finally, we compute
this computation uses the fact that linearity has already been proved for step functions. ~hus
JJQ (f + g)
exists.
TO
by definition.
'
Then
here again we use the linearity of the double integral for step
functions. It follows from the second half of the Riemann
conditi.on that
11
"Q
exist?
integrals?
(4) What are the applications of the double integral? We shall deal with questions (l), ( Z ) , and (4) now, postponing question (3) until the next unit.
Theorem. Let f be defined and bounded on Q = [a,b] x [c,d], and assume that f is integrable on Q. For each fixed y in [c,d], assume that the one-dimensional integral

A(y) = ∫ from a to b of f(x,y) dx

exists. Then the integral ∫ from c to d of A(y) dy exists, and

∫∫_Q f = ∫ from c to d of A(y) dy.

Proof. Given ε > 0, choose step functions s and t defined on Q
such that
s(x,y) C
(x,y)
t(x,y),
exists.
and
in
[c,d],
hence it is integrable.] Now I claim that the b S (y) = s(x,y)dx is a step function on the interval
x;
...,xm
and
yo,.-.,yn is
of
[arb] and
s(x,y)
( x ~ - ~ , x ~ ( ~ ~ - ~ , y Let . y ~) x )
Then
is
s(x,y) = s(x.7) in
--
x.
(This is immediate if x
x i , x i ; if x
here
fore
Hence tion.
S (y)
is constant on
so it is a step func-
< < c - y - d.
- . .
Now since
s 5 f
for all
(xfy), we have
.
by the comparison theorem. hypothesis.) (T3emiddle integral exists by y That is, for all
in
[c,d] ,
Thus
and
respectively,
Furthermore
It f o l l o w s t h a t
A(y)dy
exists, by t h e Riemann c o n d i t i o n .
i s i n t e g r a b l e , w e can conclude
Now t h a t w e know
A(y)
from a n e a r l i e r i n e q u a l i t y t h a t
that is,
But i t is a l s o t r u e that
by d e f i n i t i o n .
E
Since t h e i n t e g r a l s of
A ( Y )dy
s
and
and
t
f
a p a r t , we conclude L l a t Because
E
of e a c h o t \ e r .
i s a r b i t r a r y , t h e y must be e q u a l .
With this theorem at hand, one can proceed to calculate some specific double integrals. Several examples are worked out in 11.7 and 11.8 of Apostol.
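For instance, a double integral over a rectangle can be computed as an iterated integral exactly as the theorem describes; the following sketch (a simple midpoint rule, for illustration only) does this for f(x,y) = xy on [0,1] x [0,2].

```python
def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation to the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def double_integral(f, a, b, c, d):
    # A(y) = integral of f(x, y) over [a, b]; then integrate A over [c, d].
    A = lambda y: midpoint(lambda x: f(x, y), a, b)
    return midpoint(A, c, d)

# The double integral of x*y over [0,1] x [0,2] equals 1.
print(double_integral(lambda x, y: x * y, 0.0, 1.0, 0.0, 2.0))
```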
Now let us turn to the first of our basic questions, the question of the existence of the integral. We readily prove the following:

Theorem 4. The integral ∫∫_Q f exists if f is continuous on the rectangle Q.
-E',
p C. 27
E'.
choose a p a r t i t i o n of
such t h a t t h e span
Qij
'i j = min f ( x )
on
Qij:
t i j = max f ( x )
on
Qij.
Then
tij
si j
<
s
E'.
U s e t h e numbers
sij
and on
Q.
tij
t o obtain
s t e p functions
and
with
<
f G t
One t h e n has
This number e q u a l s
E
i f w e begin t h e proof by s e t t i n g
E/
(d-c) (b-a)
We
Definition. d e f i n e t h e area o f
If
Q
Q =
[a,b] x [c,d]
is a rectangle, we
by t h e e q u a t i o n
a r e a Q.=
I\
1;
1 i s a s t e p f u n c t i o n , we can c a l c u l a t e
(d-c) (b-a)
Of c o u r s e , s i n c e
i
t h i s i n t e g r a l d i r e c t l y as t h e p r o d u c t
A d d i t i v i t y of two r e c t a n g l e s
Q1
i m p l i e s t h a t i f we subdivide and
Q2,
into
then
a r e a Q = a r e a Q1 + a r e a Q*.
then
area Q =
li,j area
Qij,
where t h e summation e x t e n d s o v e r a l l s u b r e c t a n g l e s o f t h e p a r t i t i o n .
1t now f o l l o w s t h a t i f
A
and
a r e r e c t a n g l e s and
Q,
then
area A Let
a r e a Q.
D
Definition.
be a s u b s e t of t h e plane.
E
Then
is
>
0,
there i s a f i n i t e
E.
(I)
(2)
(3)
f i n i t e s e t h a s c o n t e n t zero.
A h o r i z o n t a l l i n e segment h a s c o n t e n t zero.
A v e r t i c a l l i n e segment h a s c o n t e n t zero.
(4)
A s u b s e t o f a set o f c o n t e n t z e r o h a s c o n t e n t zero.
y = $(x);
i
< x <.b
h a s c o n t e n t zero.
(7)
x=$(y);
has c o n t e n t zero.
c G y 9 d
Let
E'
> 0.
l e t us use t h e small-
partition span of
$ J
a = xo < x l <
of
[a ,b]
E'.
such t h a t t h e Consider t h e
rectangles
for
i = l,...,n.
Q,
because
( 1
Qx
1 <
whenever
x
Ai
i s i n the i n t e r v a l
,xi]
The t o t a l a r e a of t h e r e c t a n g l e s
equals
i=l
( x ~ - i-1) 2 ' = 2 c 1 ( b x
- a).
T h i s umber e q u a l s n
E
i f w e begin t h e proof by s e t t i n g
= e / 2 (b-a)
W now prove an elementary f a c t about sets of c o n t e n t zero: e Let be Let be of - Q - -a r e c t a n g l e . - D - -a s u b s e t Q -- c o n t e n t - Given E > 0 , t h a t has zero. t h e r e -s-a p a r t i t i o n i of such t h a t of t h that Q -- t h o s e s u b r e c t a n g l e s - -e p a r t i t i o n - c o n t a i n p o i n t s - D have t o t a l --- E. of area l e s s than Lemma 5. Note t h a t t h i s lemma does n o t s t a t e merely t h a t
D
is
b u t t h a t t h e sum of t h e
D
a r e a s o f - t h e s u b r e c t a n g l e s t h a t c o n t a i n p o i n t s of all
i
is l e s s
D
than
E.
The f o l l o w i n g f i g u r e i l l u s t r a t e s t h e d i s t i n c t i o n ;
i s c o n t a i n e d i n t h e union of two s u b r e c t a n g l e s , b u t t h e r e a r e
seven s u b r e c t a n g l e s t h a t c o n t a i n p o i n t s of
D.
Proof.
A1,...,An
F i r s t , choose f i n i t e l y many r e c t a n g l e s
c/2
of t o t a l a r e a l e s s t h a n
whose union c o n t a i n s
i,
D.
That i s , f o r each
Ai,
choose a
rectangle
A;
Ai
whoseinterior contains
Ai.
such t h a t t h e a r e a o f
i s no more t h a n t w i c e t h a t o f
sets
Int
A i
contains
Of
D,
and t h e r e c t a n g l e s
A ;
have t o t a l
may extend
so l e t
A:
A:
denote t h e r e c t a n g l e t h a t i s t h e
Q.
E.
i n t e r s e c t i o n of
and
Then t h e r e c t a n g l e s
A:
also
have t o t a l a r e a l e s s t h a n
t o define a partition
of t h e r e c t a n g l e
Q.
See t h e f i g u r e .
W e show t h a t t h i s i s o u r d e s i r e d p a r t i t i o n .
Note t h a t by c o n s t r u c t i o n , t h e r e c t a n g l e
by
P,
Aj;
is parti,tioned
s o t h a t it i s a union of s u b r e c t a n g l e s Nw i f a subrectangle o
Qij
Qij
of
P. D,
c o n t a i n s a p o i n t of
then
it c o n t a i n s a p o i n t o f
Int
A1;
f o r some
kt
s o that it a c t u a l l y
B
lies in
and hence i n
Ai;.
Suppose w e l e t
d e n o t e t h e union
D; B C A.
of a l l t h e subrectangles
A
Qij
t h a t c o n t a i n p o i n t s of
and l e t
be the union of t h e r e c t a n g l e s
Then
It follows that
1
NOW
area Qij
Q~j~~
It follows that
This last inequality is in general strict, because some subso rectangles i j belong to more than one rectangle , ' their areas are counted more than once in the sum on the right
side of the inequality.
It follows that
as desired. 0
Now
on -
I/',
Then f is integrable over Q. We prove this result as follows: Because h and g are integrable, we can find step functions sl, s2, tl, t2 such that s1 1g 5 tl and s2 5 h j t2, and such that
Consider the step functions sl and t2. We know that s1Iglf1h1t2 so sl is beneath f, and t2 is above f. Furthermore, because the integral of g is between the integrals of sl and of tl, we know that
we have
S t e ~. Now we prove the theorem. Let D be a set of zero content containing the 2 discontinuities of f. Choose M so that If(x) 1 5 M for x in Q; then given r > 0, set r t =
r/2M. Choose a partition P of Q such that those subrectangles that contain points of D
6).
Now we define functions g and h such that g subrectangles that does not contain a point of D, set
g(x) = f(x) = h(x) for x E Q. .. Do this for each such subrectangle. Then for any other x in Q, set
1J
g(x) = -M ThengSfShonQ.
and h(x) = M.
Now g is integrable over each subrectangle Q.. that does not contain a point of D,
1J
since it equals the continuous function f there. And g is integrable over each subrectangle Q.. that does contain a point of D, because it is a step function on such a
1J
subrectangle. (It is constant on the interior of Q. ..) The additivity property of the
U
integral now implies that g is integrable over Q. Similarly, h is integrable over Q. Using additivity, we compute the integral
JLQ
(h-g) =
1JJ
Qij
(h-g) = 2M
Thus the conditions of Step 1 hold, and f is integrable over Q.

Theorem 7. Suppose f is bounded on Q, and f equals 0 except on a set D of content zero. Then ∫∫_Q f equals zero.
Proof. We apply Step 2 of the preceding proof to the function f. Choose M so that If(x)l 5 M for x in Q; given
E
> 0, set
E'
= E / ~ M Choose a .
partition P such that those subrectangles that contain points of D have total area less than E'. Define functions g and h as follows: If Q.. is one of the subrectangles that does not
1J
contain a point of D, set g(x) = f(x) = 0 and h(x) = f(x) = 0 on Q. .. Do this for each
1J
and h(x) = M.
Now g and h are step functions on Q, because they are constant on the interior of each subrectangle Q. .. We compute
1J
JJQ
Similarly,
h =M
E, SO
Proof.
W write e
g
g = f
f
+ (g-f).
Nw o
is integrable
by h y p o t h e s i s , and
i s i n t e g r a b l e by t h e preceding
corollary.
Then
i s i n t e g r a b l e and
[IS f
f o r a function
c a s e where
i s a r e g i o n of Types I o r 11.
W discuss here e
bounded s e t i n t h e p l a n e .
I n t S,
P roof. - Let
then -
is a t each -1 c o n t i n u o u s -
S. As
point
If
exists.
be a rectangle c o n t d h i n g
usual.
S.
S
let
N
equal
on
S,
and l e t
xo
equal
outside
Then
(because
i s continuous a t each p o i n t
f
of t h e i n t e r i o r of
xo,
it equals
i n an open b a l l about
and
is continuous xl
at
x0).
The f u n c t i o n
S,
i s a l s o continuous a t each p o i n t
of t h e e x t e r i o r of about xl.
Hence
! f
E l
Note: value of
)'IS
f =
/Irnt
1
for instance.
Let us remark on a more general existence theorem than
that stated in Theorem 9. if ~d
S
If
and D
D,
and
do.
integral hold also for this extended integral: Theorem 10. Let S
---
One
(b) Comparison.
If
f < g
on t h e set
St
then
(c) Additivity.
Let
S = SL
S2.
If
S1 n S2
has content
zero, then
Proof.
ft
(a) Given
f, g
defined on
0 0
S,
let
2, g equal
Then Let
Q
gr
respectively, on
dg
S and equal S
otherwise. otherwise.
cl
equals cf
dg on
and
be a rectangle containing S.
We know that
(b) Similarly, if
conclude that
f -
< g, then
from which we
(c) Let
be a rectangle containing
and equal
0
S.
Let fl
f2 equal f on f
equal on
S2,
on S1,
elsewhere. Let
f3 equal
Sf
and equal 0
it equals
Because zero.
Sl n S2
l/
f4
Now
How
iIS
f when
S
is a general
is a region of type
S:
one
evaluates
proved on p ,
This result is
llS f
for
two
S2 that
is of type I
I[
{I
s,A.
s,
obtain
Area.
We can now construct a rigorous theory of area. We already have defined the area of the rectangle Q = [a,b] x [c,d] by the equation

area Q = ∫∫_Q 1.

We say that a bounded set S in the plane is Jordan-measurable if ∫∫_S 1 exists; in this case, we define

area S = ∫∫_S 1.

Note that if the boundary of S has content zero, then S is Jordan-measurable, by Theorem 9.

Theorem. Let S and T be Jordan-measurable sets.

(1) (Monotonicity.) If S is contained in T, then area S ≤ area T.

(2) (Positivity.) area S ≥ 0; and area S = 0 if and only if S has content zero.

(3) (Additivity.) If S ∩ T is a set of content zero, then S ∪ T is Jordan-measurable, and

area(S ∪ T) = area S + area T.

Proof. Let Q be a rectangle containing S and T. Define
is(x) = 1
for
= 0
for
x ft S.
Define
FT
(1) If S
$ (x)
C lT(x)
area
L <
I .
= area T.
(2) Since
0 =
11, 11,
S
1 = area
s,
for all S.
~f
$1
S
1=
$1 is =
Q
0,
by Corollary 7
.
I/
1 = 0.
Then
/I
2=
is = 0.
>
t
0,
is defined
such that
f/
t <
E.
Let
P be a partition relative
to which
is a step function.
Now
is contained in the union of these subrectangles (of total area less than zero.
(3) Because
e)
Thus
has content
I, !
1 and
$1 T
exist and
S n T has 1 exists
SUT
$1
$1
not in
Int S
lies in
Bd S,
area S = area(1nt S)
area (S
Int S)
= area (Int S)
+ area (Bd S)
= area (Int S )
Remark.
Let
be a bounded s e t i n t h e plane.
S,
without developing
Q
Let
be a r e c t a n g l e con
Given a p a r t i t i o n
of
P
Q,
let
a(P)
denote t h e t o t a l .
a r e a of a l l s u b r e c t a n g l e s of let
A(P)
t h a t a r e contained
&
S,
and
P
denote t h e t o t a l a r e a of a l l s u b r e c t a n g l e s o f S. Define t h e i n n e r a r e a of
that
contain points of
be t h e supremum
of t h e numbers
a (P) , a s
ranges o v e r a l l p a r t i t i o n s o f
S
Q;
and d e f i n e t h e o u t e r a r e a o f
A(P)
t o b e t h e infemum of t h e numbers
S
If t h e i n n e r a r e a and o u t e r a r e a of
a r e equal, t h e i r
common v a l u e i s c a l l e d t h e a r e a of
S.
W e leave it a s a ( n o t t o o d i f f i c u l t ) e x e r c i s e t o show t h a t
about o u r n o t i o n of a r e a .
This f a c t i s
n o t immediate from t h e d e f i n i t i o n of a r e a , f o r we used r e c t a n g l e s w i t h s i d e s p a r a l l e l t o t h e c o o r d i n a t e axes t o form t h e p a r t i t i o n s on which we based o u r n o t i o n of " i n t e g r a l " , and hence o f " a r e a " . ~t i s n o t immediate, for i n s t a n c e , t h a t t h e r e c t a n g l e s p i c t u r e d below have t h e same a r e a , f o r t h e a r e a of
T
S
and
i s defined
by r e c t a n g l e s w i t h v e r t i c a l and h o r i z o n t a l
Eh erci ses
1.
Show t h a t i f
Q
ISS 1
e x i s t s , then Since
Bd S
[Hint:
Chwse
so that
S CQ.
S&
< E
S
IS
exists
there a r e
P
functions such
s and
that
s
t h a t a r e s t e p functions r e l a t i v e t o a p a r t i t i o n
of
Q,
<, Is jt o
P
Q and
[$ ( t - s )
Show t h a t t h e subrectangles
determined by
2. ( a )
.] .
Let
and
be bounded subsets of
R~
Show t h a t
B (S U T ) C (Bd SVBd T ) . d
Give
S
(b) Show t h a t i f
S V T and SnT,
and
a r e Jordan-measurable,
then s o a r e
and furthermore
ar.ea(SbT)
areas
+ areaT
area ( S n T ) .
3.
2 2 x y ,
where
is t h e b o u n d e d p o r t i o n o f t h e f i r s t
xy
= 1
and
xy
= 2
2
and the
= x
and
y = 4x.
(Do
not e v a l u a t e t h e i n t e g r a l s . )
z = x
A s o l i d is b o u n d e d a b o v e b y t h e s u r f a c e
xy-plane, and by the plane
below
by the
x = 2.
Make a sketch;
e x p r e s s i t s v o l u m e a s a n i n t e g r a l ; a n d f i n d t h e volume.
5 .
i n t h e f i r s t a t a n t of z = 0
R~ bcunded by:
z = xy
and
wd
x + 2y + z = 1.
( b ) The s u r f a c e s
z = 0 and
Let Q denote the rectangle [0,1] x [0,1] in the following exercises. @(a) Let f(x,y) = l/(y-x) f(x,y) = 0 Does if x # y,
if x = y.
JJQf
exist? if x # y,
Does
JJQg exist?
Show that
JJQ
f does not.
JJQ
The d i s c u s s i o n i n 1 1 . 1 9
nr e e n ' s G
Theorem - -e P l a n e i n th
d e s c r i b e d i n two d i f f e r e n t ways, a s f o l l o w s :
The a u t h o r ' s proof i s complete and r i g o r o u s e x c e p t f o r one gap, which a r i s e s from h i s u s e of t h e i n t u i t i v e n o t i o n of "counter
1
clockwisen.
S p e c i f i c a l l y , what h e does i s t h e f o l l o w i n g :
For t h e f i r s t :
p a r t o f t h e proof h e o r i e n t s t h e boundary C o f R as f o l l o w s :
(*)
Then i n t h e second p a r t o f t h e p r o o f , h e o r i e n t s C a s f o l l o w s :
(**)
By d e c r e a s i n g y , on t h e c u r v e x = q L ( y ) ;
By i n c r e a s i n g x , on t h e l i n e segment y = c;
By i n c r e a s i n g y , on t h e c u r v e x = I J ~ ( Y ) a nd
; By d e c r e a s i n g x, on t h e l i n e segment y = d .
(The l a t t e r l i n e segment c o l l a p s e s t o a s i n g l e p o i n t i n t h e prec e d i ng f i g u r e . ) The c r u c i a l q u e s t i o n i s : orientations How d o e s one know ---t h e s e two
a re t h e s ante? ---
~ I and 11. ~ e
S p e c i f i c a l l y , s u c h a r e g i o n can b e d e s c r i b e d by f o u r monotonic f u n c t i o n s :
A..
wt..ere
dl
decreasing and
X2
and d3 are
,
U w *
stzictly increasing,
y = C $ ~ ( X t h a t bounds t h e r e g i o n on t h e top. )
P2 ( y ) .
[ ~ o r m a l l y , one d i r e c t s t h e s e curves
i n c r e a s i n g x = i n c r e a s i n g y on y = a 2 ( x )
increasing y o n x = b
d e c r e a s i n g x = i n c r e a s i n g y on y = cr4 (x)
decreasing x o n y = d
d e c r e a s i n g x = d e c r e a s i n g y on y = a g ( x )
decreasing y.]

We make the following definition:

Definition. Let R be an open set in the plane bounded by a simple closed piecewise-differentiable curve C. We say that R is a Green's region if it is possible to choose a direction on C so that the equation

∮_C P dx + Q dy = ∫∫_R (∂Q/∂x - ∂P/∂y) dx dy

holds for every pair of functions P(x,y) and Q(x,y) that are continuously differentiable in an open set containing R and C. The direction on C that makes this equation correct is called the counterclockwise direction, or the counterclockwise orientation, of C.

In these terms, Theorem 11.10 of Apostol can be restated as follows:

Theorem 1. Let R be bounded by a simple closed piecewise-differentiable curve. If R is a region of Type I and also of Type II, then R is a Green's region.
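Before turning to examples, here is a small numerical sketch (with P, Q, and the region chosen only for illustration) of the defining equation of a Green's region, for the unit square with P = -y^2 and Q = xy, so that ∂Q/∂x - ∂P/∂y = 3y.

```python
def midpoint(g, lo, hi, n=500):
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

P = lambda x, y: -y * y
Q = lambda x, y: x * y

# Line integral of P dx + Q dy counterclockwise around the unit square, side by side.
line = (
      midpoint(lambda t: P(t, 0.0), 0, 1)      # bottom: x = t increasing, dy = 0
    + midpoint(lambda t: Q(1.0, t), 0, 1)      # right:  y = t increasing, dx = 0
    - midpoint(lambda t: P(t, 1.0), 0, 1)      # top:    x decreasing
    - midpoint(lambda t: Q(0.0, t), 0, 1)      # left:   y decreasing
)

area_integral = midpoint(lambda y: midpoint(lambda x: 3.0 * y, 0, 1), 0, 1)

print(line, area_integral)   # both about 1.5
```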
As t h e f o l l o w i n g f i g u r e i l l u s t r a t e s , a l m o s t any r e g i o n
For example, t h e r e g i o n R i s a
Definition.
L e t R b e a bounded r e g i o n i n t h e p l a n e
..., Cn.
We c a l l R a
g e n e r a l i z e d Green's r e g i o n i f i t i s p o s s i b l e t o d i r e c t the
curves C1,
...,
Cn so t h a t t h e e q u a t i o n
~f + (23 d e f i n e d
c a t i o n s o f Theorem 1.
generalized
For example, t h e r e g i o n R p i c t u r e d
IXza
Definition. Let C be a piecewise-differentiable curve in the plane parametrized by
- the function d t ) = (x(t),y(t)). The vector T = (x' (t),y8(t))/llgt (t) 11 is the unit tangent
vector to C. The vector
s = (Y' (t),-x'(t))/llsr'(t)ll
-8
Iff = Pi
ss, [E
Remark. If f is the velocity of a fluid, then outward through C in unit time. Thus aP/&
5c
of the fluid, per unit area. It is called the divergence off.] Definition. Let $ be a scalar field (continuously differentiable) defined on C. If g is a point of C, then $'(zc;g) is the directional derivative of $ in the direction of p. It is
4
equal to V$(g) -9, course. Physicists and engineers use the (lousy) notation of denote this directional derivative.
2to
CQC 2
ds =
J'J
v2g dx dy
R
where v2g = a 2 g / h 2
(b) Show
+ a2g/lly2.
These equations are important in applied math and classical physics. A function f with V 2f = 0 is said to be harmonic. Such functions arise in physics: In a region free of charge, electrostatic potential is harmonic; for a body in temperature equilibrium, the temperature function is harmonic.
C ~ n d i t i o n sUnder which
Let f = P ;
P?
Q j i s a Gradient.
--
p3 be a continuously d i f f e r e n t i a b l e vector
a gradient on S,
W e do know t h a t f w i l l be a g r a d i e n t i f S
W e seek t o e x t e n d
r e g i o n S3 c o n s i s t i n g of t h e p o i n t s i n s i d e C2 and o u t s i d e
C3 h a s a h o l e , and s o does t h e r e g i o n S 4 o b t a i n e d from t h e
This t a s k t u r n s o u t
t o be s u r p r i s i n g l y d i f f i c u l t ,
W e b e g i n by p r o v i n g some f a c t s about t h e geometry of t h e
plane.
Definition.
A s t a i r s t e p curve C i n t h e p l a n e i s a curve
segments
.
Then by u s i n g t h e c o o r d i n a t e s o f t h e end
p o i n t s of t h e l i n e segments o f t h e curve C a s p a r t i t i o n p o i n t s ,
Let in t Then t h - C be-a simple c l o s e d s t a i r s t e p curve - h e plane. --e complement - C - - w r i t t e n - -e union - - d i s j o i n t of can be as th of two and h e he open - - o f t h e s e sets i s bounded - t - o t- r i s sets. One Each has a s its unbounded. - o f them- C -- boundary.
Theorem 2 . Proof. Choose a r e c t a n g l e Q whose i n t e r i o r c o n t a i n s C ,
and a p a r t i t i o n o f Q, s a y xo < x 1 < c x < ym, n and yo c yl c s u c h t h a t C i s made up of edges o f s u b r e c t a n g l e s of t h i s p a r t i t i o n . S t e p 1. the partition W b e g i n by marking e each o f t h e r e c t a n g l e s i n
...
...
or
- by
t h e bottom one.
+.
In general, i f a
g i v e n r e c t a n g l e i s marked w i t h
Repeat
In t h e following
SteE 2. of t h - -e
W e prove t h e following:
I f two --
subrectangles
---
we
rove
i t holds f o r t h e v e r t i c a l edges, by i n d u c t i o n .
~t i s t r u e f o r e a c h of t h e lowest v e r t i c a l edges, t h o s e
o f t h e form xix[yo,yl].
Supposing now it i s t r u e
- 1, w e prove i t t r u e f o r r e c t a n g l e s
(! )
There a r e 16 c a s e s t o c o n s i d e r
, of
which we
i l l u s t r a t e 8:
'-wj j-I
(1)
(The other eight are obtained from these by changing all the signs.)
We know in each case, by construction, whether the two horizontal edges are in C, and we h o w from the induction hypothesis whether the lower vectical edge is in C. Those edges that we h o w are in C are marked heavily in the figure. We seek to determine whether the upper vertical edge (marked
"?I1)
is in C or not.
hTe use
e a c h v e r t e x i n C l i e s on e x a c t l y two edges i n C.
In case ( I ) ,
i s n o t i n C, f o r o t h e r w i s e t h e middle v e r t e x would l i e on
t h r e e edges o f C. S i m i l a r r e a s o n i n g shows t h a t i n c a s e s ( S ) ,
i
(6), and ( 7 )
r e c t a n g l e s i s marked rectangles a r e S t e p 4.
W e d i v i d e a l l o f t h e complement of C i n t o two I n t o U w e p u t t h e i n t e r i o r s of a l l i n t o V w e p u t t h e i n t e r i o r s of a l l a l s o p u t i n t o V a l l p o i n t s of t h e
W e s t i l l have
-, and +. W e
p l a n e l y i n g o u t s i d e and on t h e boundary o f Q.
1
I
C o n s i d e r f i r s t a n edge l y i n g i n t e r i o r t o Q.
I f it does
U o r i n V accardingly.
i n t e r i o r t o 9.
I f i t i s n o t on t h e c u r v e C, t h e n c a s e (1) o f
a l l f o u r a r e i n V; p u t v i n t o U o r V a c c o r d i n g l y .
It i s i m m e d i a t e l y c l e a r from t h e c o n s t r u c t i o n t h a t U and V
.
+
I t i s a l s o immediate t h a t
F u r t h e r m o r e , C i s t h e common
boundary o f U and V, b e c a u s e f o r e a c h edge l y i n g i n C , one o f t h e a d j a c e n t - ' r e c t a n g l e s i s narlred by S t e p 2 . ~ e f i n i t i o n . L e t C be a simple c l o s e d s t a i r s t e p curve i n t h e plane. The bounded open s e t U c o n s t r u c t e d i n t h e p r e c e d i n g
C,
and t h e o t h e r i s marked
-,
proof i s c a l l e d t h e i n n e r region of
I t is true that
U
o r t h e region inside
C.
W s h a l l not need t h i s f a c t . e
Definition, Then
S
Let
b e a n open c o n n e c t e d s e t i n t h e p l a n e .
s t a i r s t e p curve
which l i e s i n S.
S,
t h e inner region of
is
also a s u b s e t o f
Theorem 3 . s t a i r s t e p curve
closed
l y i n g i n U, i t is t r u e t h a t
i f Ci
i s t h e boundary of Q i j ,
(For Qij
direction.
i s a t y p e 1-11 r e g i o n ) .
t h e p a r t i t i o n l y i n g i n C a p p e a r s i n o n l y one of t h e s e c u r v e s
Cij,
)
I U i n e segments i n
CI
MA'+ Qdy =
- 2) d x d ~ e
T h e o n l y q u e s t i o n i s w h e t h e r t h e d i r e c t i o n s w e have t h u s g i v e n
t o t h e l i n e s e g m e n t s l y i n g i n C combine t o g i v e an o r i e n t a t i o n o f C.
That t h e y d o i s p r o v e d by examining t h e p o s s i b l e c a s e s .
t h e o t h e r seven a r e opposite
Seven o f them a r e a s f o l l o w s ;
t o them.
-
0
pla.ne s u c h t h a t
Let be an se i t - S - - o p e n -t-n-h e
of
p o i n t s o f S can be joined
- --
~ - s t a i r s t e p curve a
--
be a v e c t o r f i e l d t h a t i s c o n t i n u o u s l y d i f f e r e n t i a b l e i n S ,
s u c h that
--
--
on a l l of - -S.
1
S.
(a) If S
is - simply
connected
then
is a qradient
b a qradient in S. e
(b) f S
L not s
simply
connectad, then
--
or - may not
Proof.
W prove ( a ) here. e
Assume t h a t
St.= -
is simply c o n n ~ c t e d .
WE show t h a t
1.
Pdx + Qdy =
f o r e v e r y s i m p l e c l o s e d s t a i r s t e p c u r v e C Lying i n S.
We
know t h a t t h e r e g i o n U i n s i d e C i s a Green's r e g i o n . U
W e a l s o know t h a t t h e r e g i o n
S. then
(For
C
of
U
S,
not i n
has a hole a t
p.
is simply connected.) h o l d s on a l l o f
;
There
f o r e t h e equation f o r e conclude t h a t
aQ/ax = aP/ay
we t h e r e
f o r some o r i e n t a t i o n o f
(and hence f o r b o t h o r i e n t a t i o n s o f
S t e p 2.
W e show t h a t i f
---
S, S.
t h e n t h e same
Assume
c o n s i s t s of t h e edges of s u b r e c t a n g l e s
C,
i n a p a r t i t i o n o f some r e c t a n g l e t h a t c o n t a i n s
a s usual.
W e proceed by i n d u c t i o n o n t h e number of v e r t i c e s on t h e
c u r v e C.
Consider t h e v e r t i c e s o f C i n o r d e r :
I f i t h a s o n l y two, t h e n
The l i n e
a r e through.
Otherwise, l e t vk b e t h e first v e r t e x i n t h i s
W e
vkWl,
the integral.
This i s a s i m p l e c l o s e d c u r v e , s i n c e k
' a l l i t s v e r t i c e s a r e d i s t i n c t , s o t h e i n t e g r a l around i t i s
V~+~,...,V
z e r o , by S t e p 1.
/C Pdx
Therefore t h e value of t h e i n t e g r a l
i f w e d e l e t e t h e v e r t i c e s vi,.
..
,v
k- 1 from t h e sequence.
Then
In t h e f o l l o w i n g case,
touched is t h e p o i n t
q.
One c o n s i d e r s t h e s i m p l e c l o s e d c r o s s ~ e l e t i n gthis
c l o s e d c u r v e remaining. Step 3.
W e show t h a t i f C1 and C2 a r e any two s t a i r s t e p
c u r v e s i n S from p t o q, t h e n
IC1Pdx
his f o l l o w s by t h e u s u a l argument.
t h e r e v e r s e d d i r e c t i o n , t h e n C = C1 curve.
W e have
(-C2)
i n a closed s t a i r s t e p
p o i n t o f S , and d e f i n e
$ (x) =
J'c (g Pdx +
Qdy.
There
a$/ax = P and
= Q.
C ( 5 ) was
an a r b i t r a r y p i e c e w i s e smooth curve.
B u t t h e proof works j u s t a s
w e l l i f w require e
C(5)
t o be a s t a i r s t e p curve.
To compute
W e computed
w e f i r s t computed
choosing a c u r v e g r a t e d along curve
Cl.
W e computed
from $(x+h,y)
plus the s t r a i g h t l i n e
from t o be a s t a i r s t e p c u r v e .
C1
t h e p r e s e n t c a s e , we have r e q u i r e d Then we n o t e t h a t i f C1
&2
s t a i r s t e p ' u v e c r
+ C2
is - a l s o
a - stairstep
o u t change.
curve.
Remark.
I t i s a f a c t t h a t i f two p a i r o f p o i n t s o f
S,
can
b e j o i n e d by some p a t h i n
s t e p path.
i
( W e s h a l l n o t b o t h e r t o prove t h i s f a c t . )
be
1. Let S be the punctured plane, i.e., the plane with the origin deleted. Show that the vector field

g = (x i + y j)/(x^2 + y^2)

is a gradient in S. [Hint: First find φ(x,y) so that ∂φ/∂x = x/(x^2 + y^2) and ∂φ/∂y = y/(x^2 + y^2).] Show that the vector field

f = (-y i + x j)/(x^2 + y^2)

is not a gradient in S. [Hint: Compute the integral of f around the unit circle.]

2. Let C2 be a simple closed stairstep curve lying in the inner region of the simple closed stairstep curve C1. Show that the region consisting of those points that are in the inner region of C1 and are not on C2 ...
Even if the region S is not simply connected, one can say something about a vector field f = P i + Q j that is continuously differentiable in S and satisfies ∂P/∂y = ∂Q/∂x on all of S. Suppose S is the punctured plane, and let A denote the value of the integral of P dx + Q dy taken counterclockwise around some fixed simple closed curve enclosing the origin. Then:

(a) If C is any simple closed curve in S, then the integral of P dx + Q dy around C either equals A (if the origin is in the inner region of C) or equals 0 (otherwise).

(b) If A = 0, then f is a gradient in the punctured plane.

(c) If A ≠ 0, then f differs from a gradient by a constant multiple of the vector field (-y i + x j)/(x^2 + y^2). That is, there is a constant c such that f + c g equals a gradient field in the punctured plane. (Indeed, c = -A/2π.)
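A brief numerical sketch (assuming the angular field g = (-y i + x j)/(x^2 + y^2) discussed above): integrating g around the unit circle gives 2π rather than 0, which is why g is not a gradient on the punctured plane.

```python
import math

def P(x, y): return -y / (x*x + y*y)
def Q(x, y): return  x / (x*x + y*y)

n = 20_000
total = 0.0
for i in range(n):
    t = 2 * math.pi * (i + 0.5) / n
    x, y = math.cos(t), math.sin(t)
    dx, dy = -math.sin(t), math.cos(t)    # derivative of the parametrization
    total += (P(x, y) * dx + Q(x, y) * dy) * (2 * math.pi / n)

print(total)   # about 2*pi, not 0
```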
Subtracting, w e obtain
aQ - - -(ax au 3x au a v
ax a - -) Y av au
J(u,v) ax
Since
,
= f
is evaluated a t
2 Q/&
It w i l l suffice
S
i s m e r e l y c o n t i n u o u s o n some open s e t c o n t a i n i n g
C.
C.
One t h e n a p p l i e s
I,
1i
aY
aY + -61 av
2 (t)) dt
where the partials are emluated at P_(t). We can write this last integral
as a fine integral over the curve D. Indeed, if we define
NOW
double integral.
We
proof.
I
Let
R = [c,dl
[c',dtl.
Q(x,y) = on all of
f(t,y)dt because
for
f
(x,y) in
Re
R,
is continuous.
Let
(u,v) = @(t)
be a choose the
T.
-(t) a
~(fi(t))
is a parametrization of the t.
curve C .
dr
Also, it may be countsrclock~~ise clockwise.
I\
S
f (x,y)dx d y =
\\
S
aa/ax ax ay = t
This sign is
if
a -(t)
parametrizes
in the counterclock-
otherwise.
Ez3
The change of variables theorem

Theorem 7 (The change of variables theorem). Let S be an open set in the (x,y) plane bounded by a piecewise-differentiable simple closed curve C = ∂S. Let

F(u,v) = (X(u,v), Y(u,v))

be a transformation from an open set of the (u,v) plane into the (x,y) plane that carries the set T, bounded by the piecewise-differentiable simple closed curve D = ∂T, into S, and carries D onto C. As a transformation of D onto C, F may be constant on some segments of D, but otherwise it is one-to-one. Assume that X and Y are continuously differentiable in some rectangle R containing T and D. Then

∫∫_S f(x,y) dx dy = ± ∫∫_T f(X(u,v), Y(u,v)) J(u,v) du dv.

Here J(u,v) = det ∂(X,Y)/∂(u,v) is the Jacobian determinant of the transformation. The sign is + if F carries the counterclockwise orientation of D to the counterclockwise orientation of C, and - otherwise.

Example 1. The transformation F(r,θ) = (r cos θ, r sin θ) carries the rectangle T in the (r,θ) plane indicated in the figure onto the region S in the (x,y) plane. It is constant on one edge of T, but it is one-to-one on the rest of T. Note that it carries the counterclockwise orientation of D = ∂T to the counterclockwise orientation of C = ∂S.
Assume also that J(u,v) does not change sign on T: if J(u,v) > 0 on all of T, the sign in the change of variables formula is +, while if J(u,v) < 0 on all of T, the sign is -. Therefore in either case

    ∬_S f(x,y) dx dy = ∬_T f(r(u,v)) |J(u,v)| du dv.

Proof. We apply the preceding theorem to the function f(x,y) ≡ 1. We obtain the formula

    ∬_S dx dy = ± ∬_T J(u,v) du dv.

The left side of this equation is positive. Therefore if J(u,v) > 0 on all of T, the sign on the right side of the equation is +; while if J(u,v) < 0 on all of T, the sign is -. And the sign does not depend on the particular function being integrated, only on the transformation involved.
Remark. Given a point (u0,v0), let us choose a small rectangle T about this point, and consider its image S under the transformation. If T is small enough, J(u,v) will be very close to J(u0,v0) on T, and so will not change sign. Assuming S is a Green's region, we have

    area S = ∬_S dx dy = ∬_T |J(u,v)| du dv ≈ |J(u0,v0)| (area T).

Thus, roughly speaking, the magnitude of J(u,v) measures how much the transformation stretches or shrinks a small piece of the u,v plane when it carries it to a piece of the x,y plane. And the sign of J(u,v) tells whether the transformation preserves or reverses the orientation of the small rectangle, before shrinking or stretching it.
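A small numerical experiment illustrates the remark. The following sketch in Python is an illustration only; the particular map r(u,v) = (uv, u - v) and the point (2,1) are arbitrary choices, not taken from the notes.

import numpy as np

def r(u, v):
    return np.array([u * v, u - v])

u0, v0, h = 2.0, 1.0, 1e-3
# J(u, v) = det [[v, u], [1, -1]] = -v - u, so |J(u0, v0)| = 3.
J0 = abs(-(v0) - u0)

# The image of the small square [u0, u0+h] x [v0, v0+h] is nearly the
# parallelogram spanned by r_u*h and r_v*h, whose area is |J| * h^2.
ru = (r(u0 + h, v0) - r(u0, v0)) / h
rv = (r(u0, v0 + h) - r(u0, v0)) / h
approx_area = abs(ru[0] * rv[1] - ru[1] * rv[0]) * h**2

print(J0 * h**2, approx_area)   # both are about 3e-6

As h shrinks, the area of the image rectangle and |J(u0,v0)| times the area of T agree to higher and higher accuracy, which is exactly the content of the remark.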
As an application of the change of variables theorem, we shall verify the final property of our notion of area, namely, the fact that congruent regions in the plane have the same area. First, we must make precise what we mean by a "congruence."

Definition. A transformation h : R² → R² of the plane to itself is called a congruence (or an isometry) if it preserves distances between points. That is, h is a congruence if

    ||h(a) - h(b)|| = ||a - b||

for every pair a, b of points in the plane.

Theorem. If h : R² → R² is a congruence, then

    h(x,y) = (ax + by + p, cx + dy + q),

or, writing vectors as column matrices, h(x) = Mx + (p,q), where M is the matrix with rows (a, b) and (c, d). Moreover ad - bc, the Jacobian determinant of h, equals ±1.
Proof. Let (p,q) denote the point h(0,0). Define k : R² → R² by the equation

    k(x) = h(x) - (p,q).

It is easy to check that k is a congruence, since

    ||k(a) - k(b)|| = ||h(a) - h(b)|| = ||a - b||

for every pair of points a, b; and k(0) = 0. Let us study the congruence k.

First, k preserves norms of vectors. By hypothesis,

    ||a|| = ||a - 0|| = ||k(a) - k(0)|| = ||k(a) - 0|| = ||k(a)||.

Second, we show that k preserves dot products. Because ||k(a) - k(b)||² = ||a - b||², we have

    ||k(a)||² - 2 k(a)·k(b) + ||k(b)||² = ||a||² - 2 a·b + ||b||².

Because k preserves norms, it follows that k(a)·k(b) = a·b.

We now show that k is a linear transformation. Let e1 and e2 be the usual unit basis vectors for R²; let

    e3 = k(e1)   and   e4 = k(e2).

Then e3 and e4 are unit orthogonal vectors, because k preserves dot products and norms. Given x = (x,y) = x e1 + y e2, consider k(x). Because e3 and e4 form a basis for R², we can write

    k(x) = α e3 + β e4

for some scalars α and β, which are of course functions of x. Let us compute α and β. We have

    α = k(x)·e3        because e4 is orthogonal to e3,
      = k(x)·k(e1)     by definition of e3,
      = x·e1           because k preserves dot products,
      = x.

Similarly,

    β = k(x)·e4 = k(x)·k(e2) = x·e2 = y.

We conclude that for all points x = (x,y) of R²,

    k(x) = x e3 + y e4.

Letting e3 = (a,c) and e4 = (b,d), we can write out k(x) in components as

    k(x) = (ax + by, cx + dy).

Thus k is a linear transformation.

Returning now to our original transformation h, we recall that k(x) = h(x) - (p,q). Therefore we can write out h(x) in components as

    h(x) = (ax + by + p, cx + dy + q).

To compute the Jacobian determinant of h, we note that because e3 = (a,c) and e4 = (b,d) are unit orthogonal vectors, we have

    a² + c² = 1,   b² + d² = 1,   ab + cd = 0,

so that

    (ad - bc)² = (a² + c²)(b² + d²) - (ab + cd)² = 1.

Hence the Jacobian determinant ad - bc of h equals ±1.
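One can check the conclusion numerically for a particular congruence. The following sketch in Python with numpy uses a rotation combined with a translation; the angle, the translation, and the sample points are arbitrary choices made only for the check.

import numpy as np

theta, p, q = 0.7, 3.0, -2.0
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # columns are unit orthogonal
t = np.array([p, q])

def h(x):
    # h(x) = Mx + (p, q), the form produced by the theorem
    return M @ x + t

rng = np.random.default_rng(0)
a_pt, b_pt = rng.standard_normal(2), rng.standard_normal(2)

print(np.linalg.norm(a_pt - b_pt))            # original distance
print(np.linalg.norm(h(a_pt) - h(b_pt)))      # same distance, as required
print(np.linalg.det(M))                       # Jacobian determinant = 1.0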
Theorem 10. Let h : R² → R² be a congruence carrying S onto T. If both S and T are Green's regions, then

    area S = area T.

Proof. The congruence h carries distinct points of R² to distinct points of R², so it is a one-to-one transformation of R² to itself. Thus the hypotheses of the preceding theorem are satisfied. Furthermore, |J(u,v)| = 1. From the equation

    area T = ∬_T dx dy = ∬_S |J(u,v)| du dv = ∬_S du dv,

we conclude that area S = area T.
EXERCISES.

1. Let h(x) = Ax be an arbitrary linear transformation of R² to itself. If S is a rectangle of area M, what is the area of the image of S under the transformation h?

2. Given the transformation h(x) = (ax + by + p, cx + dy + q):
(a) Show that if (a,c) and (b,d) are unit orthogonal vectors, then h is a congruence.
(b) If ad - bc = ±1, show that h preserves areas. Is h necessarily a congruence?

3. A translation of R² is a transformation of the form h(x) = x + a, where a is fixed. A rotation of R² is a transformation of the form

    h(x,y) = (x cos φ - y sin φ, x sin φ + y cos φ),

where φ is fixed.
(a) Check that the rotation carries the point with polar coordinates (r,θ) to the point with polar coordinates (r, θ+φ).
(b) Show that translations and rotations are congruences. Conversely, show that every congruence with Jacobian +1 can be written as the composite of a translation and a rotation.
(c) In what form can a congruence with Jacobian -1 be written?

4. Let A be a square matrix. Show that if the rows of A are orthonormal vectors, then the columns of A are also orthonormal vectors.

5. Let S consist of the points (x,y) with b²x² + a²y² ≤ 1. Given f(x,y), express the integral

    ∬_S f(x,y) dx dy

as an integral over the unit disk u² + v² ≤ 1. Evaluate when f(x,y) = x².

6. Let C be a circular cylinder of radius a whose central axis is the x-axis. Let D be a circular cylinder of radius b ≤ a whose central axis is the z-axis. Express the volume common to the two cylinders as an integral in cylindrical coordinates. Evaluate when b = a.

7. Transform the integral in problem 3, p. D.26, by using the substitution x = u/v, y = uv with u, v > 0. Evaluate the integral.
Stokes' Theorem

Our text states and proves Stokes' theorem in 12.11, but it uses the scalar form for writing both the line integral and the surface integral involved. In the applications, it is the vector form of the theorem that is most likely to be quoted, since the notations dx∧dy and the like are not in common use. So we restate the theorem here in vector form.

Let F = P i + Q j + R k be a continuously differentiable vector field defined in an open set U of R³. We define a new vector field in U, called curl F, by the equation

    curl F = (∂R/∂y - ∂Q/∂z) i + (∂P/∂z - ∂R/∂x) j + (∂Q/∂x - ∂P/∂y) k.

It is convenient to note that curl F may be written formally as

    curl F = ∇ × F = det | i      j      k    |
                         | ∂/∂x   ∂/∂y   ∂/∂z |
                         | P      Q      R    |.

Theorem (Stokes' theorem). Let S = r(T) be a parametrized surface, where T is a region in the (u,v) plane. Assume that T is a Green's region, bounded by a simple closed piecewise-smooth curve D, and that r has continuous second-order partial derivatives in an open set containing T and D. Let C be the curve r(D). If F is a continuously differentiable vector field defined in an open set of R³ containing S, then

    ∬_S (curl F · n) dS = ∮_C (F · T) ds.

Here the orientation of C is that derived from the counterclockwise orientation of D, and the normal direction n is the one determined by ∂r/∂u × ∂r/∂v.

Furthermore, neither side of the equation depends on the parametrization chosen for S. If we pass to another parametrization, with parameters (s,t) say, there are two cases. If the orientation which the new parametrization gives to C is the same as that determined by r, then the unit normal determined by ∂r/∂s × ∂r/∂t is the same as that determined by r, so the right side of the equation is also unchanged. If the orientation of C changes, the left side changes sign; but in that case, the unit normal determined by the new parametrization is opposite to that determined by r, so the right side changes sign as well. The theorem follows.
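Before turning to the proof, here is a quick symbolic check of the theorem in one particular case. The sketch below is in Python with sympy; the field F = z i + x j + y k and the upper unit hemisphere are arbitrary choices made only for the check, not an example from the notes.

import sympy as sp

u, v, t = sp.symbols('u v t')

# Parametrization of the upper unit hemisphere and the field F = (P, Q, R).
r = sp.Matrix([sp.sin(u) * sp.cos(v), sp.sin(u) * sp.sin(v), sp.cos(u)])
F = lambda x, y, z: sp.Matrix([z, x, y])

# curl F computed from the determinant formula.
x, y, z = sp.symbols('x y z')
P, Q, R = F(x, y, z)
curlF = sp.Matrix([sp.diff(R, y) - sp.diff(Q, z),
                   sp.diff(P, z) - sp.diff(R, x),
                   sp.diff(Q, x) - sp.diff(P, y)])

# Surface integral of (curl F) . (r_u x r_v) over T = [0, pi/2] x [0, 2*pi].
normal = r.diff(u).cross(r.diff(v))
integrand = curlF.subs({x: r[0], y: r[1], z: r[2]}).dot(normal)
surface = sp.integrate(integrand, (u, 0, sp.pi/2), (v, 0, 2*sp.pi))

# Line integral of F . dr around the boundary circle, counterclockwise.
alpha = sp.Matrix([sp.cos(t), sp.sin(t), 0])
line = sp.integrate(F(*alpha).dot(alpha.diff(t)), (t, 0, 2*sp.pi))

print(surface, line)   # both equal pi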
We shall in fact verify only the first equation, the one arising from the term P i; the others are handled similarly. The idea of the proof is to express the line and surface integrals of the theorem as integrals over D and T, respectively, and then to apply Green's theorem to show they are equal.

Let r(u,v) = (X(u,v), Y(u,v), Z(u,v)), as usual, and let α(t), a ≤ t ≤ b, be a counterclockwise parametrization of D. We compute as follows:

    ∮_C P dx = ∫_a^b P(r(α(t))) [∂X/∂u · u'(t) + ∂X/∂v · v'(t)] dt,

and we can write this as a line integral over D. Indeed, if we let p and q be the functions

    p(u,v) = P(r(u,v)) ∂X/∂u(u,v),   q(u,v) = P(r(u,v)) ∂X/∂v(u,v),

then ∮_C P dx = ∮_D (p du + q dv), and Green's theorem in the (u,v) plane turns this into the double integral

    (*)   ∬_T (∂q/∂u - ∂p/∂v) du dv,

where P and its partials are evaluated at r(u,v), of course. When ∂q/∂u and ∂p/∂v are written out and subtracted, we see that the first and last terms cancel each other (they involve the mixed second partials of X), and the double integral (*) takes the form

    ∬_T [ ∂P/∂z (∂Z/∂u ∂X/∂v - ∂Z/∂v ∂X/∂u) - ∂P/∂y (∂X/∂u ∂Y/∂v - ∂X/∂v ∂Y/∂u) ] du dv.

Since curl F = ∂P/∂z j - ∂P/∂y k when F = P i, the formula for the surface integral tells us that this is exactly

    ∬_S (curl F · n) dS.

Here ∂P/∂z and ∂P/∂y are evaluated at r(u,v).
Exercises on the divergence theorem

1. Let S denote the part of the paraboloid z = 9 - x² - y² lying above the xy-plane. Evaluate

    ∬_S F · n dS

if F = sin(x+z) i + xz j + (x² + y²) k.
2. Let S1 denote the surface z = 1 - x² - y², z ≥ 0, and let S2 denote the unit disk x² + y² ≤ 1 in the plane z = 0. Let

    F = x i + (2x + y) j + z k,

and let n1 and n2 be the unit normals to S1 and S2, respectively, having a positive k component. Evaluate

    ∬_S2 F · n2 dS.
Grad, Curl, Div and all that

We study two questions about these operations:

I. Do they have physical or geometric interpretations?
II. How are they related to one another?

I. For the gradient, the interpretation is already familiar. For divergence, the question is answered in 12.20 of Apostol; the formula given there can serve as a definition of divergence F.

For curl, one has the analogous formula

    (curl F · n)(a) = lim as r → 0 of (1/(πr²)) ∮_C(r) F · T ds,

where C(r) is the circle of radius r centered at a, lying in the plane through a perpendicular to n, and C(r) is directed in a counterclockwise fashion as viewed from the tip of n. This number is the component of curl F at a in the direction n; it measures the circulation of F about the axis n at a, and we can use it to interpret curl F as follows.

To get a physical feeling for curl F, think of F as the velocity field of a fluid. Let us place a small paddle wheel of radius r with its axis along n, centered at a. The fluid causes the wheel to rotate with some angular speed ω (considered as positive if the rotation is counterclockwise as viewed from the tip of n, and negative if it is the reverse). On physical grounds, it is reasonable to suppose that the average value of F · T around the circle C(r) equals the speed of a point on one of the paddles, that is, rω. It follows that

    ∮_C(r) F · T ds = (2πr)(rω),

so that by the formula above we have (if r is very small)

    curl F(a) · n = 2ω.

In physical terms, then, the vector curl F(a) points in the direction of the axis about which the paddle wheel spins most rapidly, and its magnitude is twice that angular speed.
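The relation curl F · n = 2ω can be checked directly for the velocity field of a rigid rotation. The sketch below is in Python with sympy; the field F = ω(-y i + x j) is the standard rigid-rotation example, chosen here only for the check.

import sympy as sp

x, y, z, omega = sp.symbols('x y z omega')
P, Q, R = -omega * y, omega * x, sp.Integer(0)

curlF = sp.Matrix([sp.diff(R, y) - sp.diff(Q, z),
                   sp.diff(P, z) - sp.diff(R, x),
                   sp.diff(Q, x) - sp.diff(P, y)])
print(curlF.T)   # Matrix([[0, 0, 2*omega]]): twice the angular speed, along k

# Circulation around a circle of radius rho about the origin, divided by the
# circumference, gives the average tangential speed rho*omega.
t, rho = sp.symbols('t rho', positive=True)
a = sp.Matrix([rho * sp.cos(t), rho * sp.sin(t), 0])
Fa = sp.Matrix([-omega * a[1], omega * a[0], 0])
circulation = sp.integrate(Fa.dot(a.diff(t)), (t, 0, 2 * sp.pi))
print(sp.simplify(circulation / (2 * sp.pi * rho)))   # omega*rho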
II. Observe that grad carries us from scalar fields to vector fields, curl carries vector fields to vector fields, and div carries vector fields to scalar fields. We have the diagram:

    scalar fields φ(x)
        |  grad
    vector fields F(x)
        |  curl
    vector fields G(x)
        |  div
    scalar fields ψ(x)

Let us consider first the top two operations, grad and curl. We restrict ourselves to scalar and vector fields that are continuously differentiable on a region U of R³.
Theorem 1. F is a gradient in U if and only if

    ∮_C F · dα = 0

for every closed piecewise-smooth path C in U.
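The "only if" half of Theorem 1 is easy to illustrate by machine. The sketch below is in Python with sympy; the potential φ = x²y + z and the tilted closed curve are arbitrary choices made only for the illustration.

import sympy as sp

x, y, z, t = sp.symbols('x y z t')
phi = x**2 * y + z
F = sp.Matrix([sp.diff(phi, x), sp.diff(phi, y), sp.diff(phi, z)])   # F = grad phi

# A closed curve: a tilted circle alpha(t), with t running from 0 to 2*pi.
alpha = sp.Matrix([sp.cos(t), sp.sin(t), sp.cos(t) + sp.sin(t)])
integrand = F.subs({x: alpha[0], y: alpha[1], z: alpha[2]}).dot(alpha.diff(t))
print(sp.simplify(sp.integrate(integrand, (t, 0, 2 * sp.pi))))   # 0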
Theorem 2. If F = grad φ for some φ, then curl F = 0.

Proof. We compute curl F by the formula above. We know that if F is a gradient and the partials of F are continuous, then D_i F_j = D_j F_i for all i, j. Hence curl F = 0.
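Theorem 2 can be confirmed symbolically for any particular φ. The sketch below is in Python with sympy; the sample φ is an arbitrary choice.

import sympy as sp

x, y, z = sp.symbols('x y z')
phi = sp.exp(x * y) * sp.cos(z) + x * z**2

# F = grad phi, then curl F via the component formula; equality of mixed
# partials D_i D_j phi = D_j D_i phi forces every component to vanish.
P, Q, R = sp.diff(phi, x), sp.diff(phi, y), sp.diff(phi, z)
curlF = sp.Matrix([sp.diff(R, y) - sp.diff(Q, z),
                   sp.diff(P, z) - sp.diff(R, x),
                   sp.diff(Q, x) - sp.diff(P, y)])
print(sp.simplify(curlF.T))   # Matrix([[0, 0, 0]])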
Theorem 3. If curl F = 0 in a star-convex region U, then F = grad φ for some φ defined in U. The function ψ(x) = φ(x) + c is the most general function such that F = grad ψ.

Proof. If curl F = 0, then D_i F_j = D_j F_i for all i, j, so that F is a gradient in U by the Poincaré lemma.

Theorem 4. The condition curl F = 0 in U does not in general imply that F is a gradient in U.
Proof. Consider the vector field

    G = (-y i + x j)/(x² + y²).

It is defined in the region U consisting of all of R³ except for the z-axis. It is easy to check that curl G = 0 in U. To show that G is not a gradient in U, we let C be the unit circle

    α(t) = (cos t, sin t, 0)

in the xy-plane, and compute

    ∮_C G · dα = ∫_0^{2π} 1 dt = 2π.

Since this integral is not zero, Theorem 1 shows that G cannot be a gradient in U.
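Both facts used in this proof can be checked by machine. The sketch below is in Python with sympy and mirrors the proof, using the same field G and the unit circle.

import sympy as sp

x, y, z, t = sp.symbols('x y z t')
P, Q, R = -y / (x**2 + y**2), x / (x**2 + y**2), sp.Integer(0)

# curl G vanishes away from the z-axis.
curlG = sp.Matrix([sp.diff(R, y) - sp.diff(Q, z),
                   sp.diff(P, z) - sp.diff(R, x),
                   sp.diff(Q, x) - sp.diff(P, y)])
print(sp.simplify(curlG.T))                        # Matrix([[0, 0, 0]])

# Yet the line integral around the unit circle is 2*pi, not 0.
alpha = sp.Matrix([sp.cos(t), sp.sin(t), 0])
G = sp.Matrix([P, Q, R]).subs({x: alpha[0], y: alpha[1], z: alpha[2]})
integrand = sp.simplify(G.dot(alpha.diff(t)))      # reduces to 1
print(sp.integrate(integrand, (t, 0, 2 * sp.pi)))  # 2*pi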
The region U in this example fails to have a property called simple connectivity. Roughly speaking, a region U is said to be "simply connected" if every closed curve in U bounds an orientable surface lying in U. The region R³ - (z-axis) is not simply connected. It turns out that if U is simply connected and curl F = 0 in U, then F is a gradient in U. The proof goes roughly as follows. Given a closed piecewise-smooth curve C in U, let S be an orientable surface in U which C bounds. Apply Stokes' theorem. One obtains the equation

    ∮_C F · dα = ∬_S (curl F · n) dS = 0.

Since this holds for every closed curve C in U, Theorem 1 shows that F is a gradient in U.
Now let us consider the bottom two operations of the diagram, curl and div. There are analogues of all the earlier theorems.

Theorem 5. If G is a curl in U, then

    ∬_S G · n dS = 0

for every closed orientable surface S in U.
Proof. Let S be a closed orientable surface lying in U. (We do not assume that U includes the region that S bounds.) Split S up into two surfaces S1 and S2 that intersect in their common boundary, which is a simple smooth closed curve C. Now by hypothesis, G = curl F for some F defined in U. We compute, using Stokes' theorem:

    ∬_S1 G · n dS = ∬_S1 (curl F · n) dS = ∮_C F · dα,

    ∬_S2 G · n dS = ∬_S2 (curl F · n) dS = -∮_C F · dα,

since the orientation of C derived from S2 (with the outward normal) is opposite to that derived from S1. Adding, we see that

    ∬_S G · n dS = 0.
Theorem 6. If G = curl F for some F, then div G = 0.

Proof. By assumption, G = curl F for some F. Then

    div G = (D1D2F3 - D1D3F2) - (D2D1F3 - D2D3F1) + (D3D1F2 - D3D2F1) = 0,

since the mixed second-order partial derivatives are equal.
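The cancellation of mixed partials in this computation can likewise be confirmed for a sample field. The sketch below is in Python with sympy; the field components are arbitrary choices.

import sympy as sp

x, y, z = sp.symbols('x y z')
F1, F2, F3 = x * y * z, sp.sin(x) + z**2, sp.exp(y) * x

# G = curl F, computed componentwise; then div G, which vanishes identically.
G = sp.Matrix([sp.diff(F3, y) - sp.diff(F2, z),
               sp.diff(F1, z) - sp.diff(F3, x),
               sp.diff(F2, x) - sp.diff(F1, y)])
divG = sp.diff(G[0], x) + sp.diff(G[1], y) + sp.diff(G[2], z)
print(sp.simplify(divG))   # 0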
Theorem 7. If div G = 0 in a star-convex region U, then G = curl F for some F defined in U. The function H = F + grad φ is the most general function such that G = curl H.

We shall not prove this theorem in full generality. The proof is by direct computation, as in the Poincaré lemma. A proof that holds when U is a 3-dimensional box, or when U is all of R³, is given in section 12.16 of Apostol. This proof also shows how to construct a specific such function F in the given cases.

Note that if G = curl F and G = curl H, then curl(H - F) = 0 in U, so that H - F = grad φ for some φ.
Theorem 8. The condition div G = 0 in U does not in general imply that G is a curl in U.
Proof. Let G be the vector field

    G(x,y,z) = (x i + y j + z k)/(x² + y² + z²)^(3/2),

defined in the region U consisting of all of R³ except for the origin. One checks directly that div G = 0 in U. If S is the unit sphere centered at the origin, then we show that G is not a curl. If ||(x,y,z)|| = 1, so that (x,y,z) is a point of S, then

    G(x,y,z) = x i + y j + z k = n.

Therefore

    ∬_S G · n dA = ∬_S 1 dA = (area of sphere) ≠ 0,

so by Theorem 5, G cannot be a curl in U.
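Both claims in this proof, div G = 0 and the nonvanishing flux through the unit sphere, can be verified symbolically. The sketch below is in Python with sympy; the spherical parametrization is the standard one, introduced here only for the check.

import sympy as sp

x, y, z, u, v = sp.symbols('x y z u v')
rho = sp.sqrt(x**2 + y**2 + z**2)
G = sp.Matrix([x, y, z]) / rho**3

# div G vanishes away from the origin.
divG = sum(sp.diff(G[i], var) for i, var in enumerate((x, y, z)))
print(sp.simplify(divG))   # 0

# Flux through the unit sphere, parametrized by spherical coordinates (u, v).
r = sp.Matrix([sp.sin(u) * sp.cos(v), sp.sin(u) * sp.sin(v), sp.cos(u)])
normal = r.diff(u).cross(r.diff(v))          # outward normal (u between 0 and pi)
Gr = G.subs({x: r[0], y: r[1], z: r[2]})
flux = sp.integrate(sp.simplify(Gr.dot(normal)), (u, 0, sp.pi), (v, 0, 2 * sp.pi))
print(flux)   # 4*pi, the area of the unit sphere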
Remark. Suppose we say that a region U in R³ is "two-simply connected" if every closed surface in U bounds a solid region lying in U. The region U = R³ - (origin) is not two-simply connected; the region U = R³ - (z-axis) is. It turns out that if U is two-simply connected and div G = 0 in U, then G is a curl in U. The proof goes roughly as follows. Given a closed surface S in U, let V be the solid region it bounds. Since G is by hypothesis defined on all of V, and div G = 0 there, the divergence theorem gives ∬_S G · n dS = 0; one can then show that G is a curl in U.

There is much more one can say about these matters, but one needs to introduce a bit of algebraic topology in order to do so. It is a bit late in the semester for that!
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.