Techniques of Differentiation
DVI file created at 23:23, 17 January 2008
Copyright 1994, 2008 Five Colleges, Inc.
276 CHAPTER 5. TECHNIQUES OF DIFFERENTIATION
In this chapter we will look at the cases where this limit can be evaluated
exactly. Although using this definition of derivative usually leads to many
algebraic manipulations, the other interpretations of derivatives as slopes,
rates, and multipliers will still be helpful in visualizing what's going on. The
process of calculating the derivative of a function is called differentiation.
For this reason, functions which are locally linear and not locally vertical
(so they do have slopes, and hence derivatives at every point) are called
differentiable functions. Our goal in this chapter is to differentiate functions
given by formulas.
Derivatives of Basic Functions
When a function is given by a formula, there is in fact a formula for its
derivative. We have already seen several examples in chapters 3 and 4. These
examples include all of what we may consider the basic functions. We
collect these formulas in the following table.
Rules for Derivatives of Basic Functions

function        derivative
$mx + b$        $m$
$x^r$           $r\,x^{r-1}$
$\sin x$        $\cos x$
$\cos x$        $-\sin x$
$e^x$           $e^x$
$\ln x$         $1/x$
In the case of the linear function $mx + b$, we obtained the derivative by
using its geometric description as the slope of the graph of the function. The
derivatives of the exponential and logarithm functions came from the definition
of the exponential function as the solution of an initial value problem.
To find the derivatives of the other functions we will need to start from the
definition.
An example: $f(x) = x^3$
We begin by examining the calculation of the derivative of $f(x) = x^3$ using
the definition. The change $\Delta y$ in $y = f(x)$ corresponding to a change
$\Delta x$ in $x$ is given by
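$$\Delta y = f(x + \Delta x) - f(x) = (x + \Delta x)^3 - x^3 = 3x^2\,\Delta x + 3x(\Delta x)^2 + (\Delta x)^3.$$

From this we get

$$f'(x) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \left[ 3x^2 + 3x\,\Delta x + (\Delta x)^2 \right].$$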
To see what's happening with this expression, let's consider the specific
value $x = 2$ and evaluate the corresponding values of $\Delta y/\Delta x$ for
successively smaller $\Delta x$. The value of $\Delta y/\Delta x$ gets closer
and closer to 12 as $\Delta x$ gets smaller and smaller.

$\Delta x$    $3\cdot 2^2 + 6\,\Delta x + (\Delta x)^2$    $\Delta y/\Delta x$
.1            12 + .6 + .01                   12.61
.01           12 + .06 + .0001                12.0601
.001          12 + .006 + .000001             12.006001
.0001         12 + .0006 + .00000001          12.00060001
.00001        12 + .00006 + .0000000001       12.0000600001

It is clear from this table that we can make $\Delta y/\Delta x$ as close to 12
as we like by making $\Delta x$ small enough. Therefore $f'(2) = 12$.
Note that in the table above we have used positive values of $\Delta x$. You
should check to convince yourself that if we had used negative values of
$\Delta x$ we would have come up with a different set of approximations
$\Delta y/\Delta x$, but that the limit would still be the same, namely 12. It
doesn't matter whether we use positive or negative values for $\Delta x$, or a
mixture of the two, so long as $\Delta x \to 0$.
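The table can also be reproduced numerically. The following is a minimal Python sketch, not part of the original text; the helper name `difference_quotient` is an illustrative assumption. It evaluates $\Delta y/\Delta x$ for $f(x) = x^3$ at $x = 2$, using both positive and negative values of $\Delta x$.

```python
def f(x):
    """The example function f(x) = x^3."""
    return x ** 3


def difference_quotient(func, x, dx):
    """Return the difference quotient (func(x + dx) - func(x)) / dx."""
    return (func(x + dx) - func(x)) / dx


# Shrink dx and watch both one-sided quotients settle toward 12.
for dx in (0.1, 0.01, 0.001, 0.0001, 0.00001):
    from_right = difference_quotient(f, 2.0, dx)
    from_left = difference_quotient(f, 2.0, -dx)
    print(f"dx = {dx:<8g} +dx: {from_right:.10f}   -dx: {from_left:.10f}")
```

The quotients computed with negative $\Delta x$ approach 12 from below, while those computed with positive $\Delta x$ approach it from above.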
In general, for any given $x$, the second and third terms in the expansion
for $\Delta y/\Delta x$ become vanishingly small as $\Delta x \to 0$, so that
$\Delta y/\Delta x$ can be made as close to $3x^2$ as we like by making
$\Delta x$ small enough. For this reason, we say that the derivative $f'(x)$ is
exactly $3x^2$:

$$f'(x) = \lim_{\Delta x \to 0} \left[ 3x^2 + 3x\,\Delta x + (\Delta x)^2 \right] = 3x^2.$$
In other words, given the function $f$ specified by the formula $f(x) = x^3$,
we have found the formula for its derivative function $f'$: $f'(x) = 3x^2$.
Note that this general formula agrees with the specific value $f'(2) = 12$
found above, since $3 \cdot 2^2 = 12$.
For a particular value of $\Delta x$, the corresponding value of
$\Delta y/\Delta x$ is only an approximation of $3x^2$, whereas the derivative
$f'(x)$ is exactly $3x^2$.
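More generally, for any function $y = f(x)$, a particular difference quotient
$\Delta y/\Delta x$ is an approximation of $f'(x)$, and the derivative is the
limit of these quotients as $\Delta x \to 0$. For the power function
$f(x) = x^r$, the rule in the table above is again $f'(x) = r\,x^{r-1}$.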
We can prove this rule for the case when $r$ is a positive integer using
algebraic manipulations very like the ones carried out for $x^3$; see the
exercises for verifications of this and the other differentiation rules in this
section. Using a rule for quotients of functions (coming later in this section),
we can show that this rule also holds for negative integer exponents. Further
arguments using the chain rule show that the pattern still holds for rational
exponents. We can eliminate this case-by-case approach, though, by recalling
the approach developed in chapter 4. We saw that we can give meaning to
$b^r$ for any positive base $b$ and any real number $r$ by defining

$$b^r = e^{r \ln(b)}.$$

Using the formulas for the derivatives of $e^x$ and $\ln x$ together with the
chain rule, we can prove the rule for $x > 0$ and for arbitrary real exponent
$r$ directly, without first proving the special cases for integer or rational
exponents. See the exercises for details. Arguments justifying the formulas for
the derivatives of the trigonometric functions are also in the exercises.
Combining Functions
We can form new functions by combining functions. We have already studied
one of the most useful ways of doing this in chapter 3 when we looked at
forming chains of functions and developed the chain rule for taking the
derivative of such a chain. Suppose $u = f(x)$ and $y = g(u)$. Chaining these
two functions together we have $y$ as a function of $x$:

$$y = h(x) = g(f(x)).$$

The chain rule tells us how to find the derivative of $y$ with respect to $x$.
In function notation it takes the form

$$h'(x) = g'(f(x)) \cdot f'(x).$$
In Leibniz notation, using $f(x) = u$, we can write the chain rule as

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$
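As an illustration, the chain rule can be checked numerically. The sketch below is an added example, not part of the original text: the choice of $f(x) = x^3$ and $g(u) = \sin u$, and the helper `central_difference`, are assumptions made only for the demonstration. It compares the chain-rule value $g'(f(x)) \cdot f'(x)$ with a symmetric difference quotient of the chained function $h$.

```python
import math


def central_difference(func, x, dx=1e-6):
    """Approximate func'(x) by the symmetric quotient over [x - dx, x + dx]."""
    return (func(x + dx) - func(x - dx)) / (2 * dx)


def f(x):
    return x ** 3            # inner function u = f(x)


def f_prime(x):
    return 3 * x ** 2


def g(u):
    return math.sin(u)       # outer function y = g(u)


def g_prime(u):
    return math.cos(u)


def h(x):
    return g(f(x))           # the chain y = h(x) = g(f(x))


def h_prime(x):
    # Chain rule: h'(x) = g'(f(x)) * f'(x)
    return g_prime(f(x)) * f_prime(x)


x = 1.2
print("chain rule :", h_prime(x))
print("numerical  :", central_difference(h, x))
```

The two printed values should agree to several decimal places; the small discrepancy comes from the finite step used in the approximation.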
We also saw in chapter 3 that the polynomial $5x^3 - 7x^2 + 3$ can be thought
of as an algebraic combination of simple functions. We can build an even
more complicated function by forming a quotient with this polynomial in the
numerator and the difference of the functions $\sin x$ and $e^x$ in the
denominator. The result is

$$\frac{5x^3 - 7x^2 + 3}{\sin x - e^x}.$$

The derivative of this function, as well as of other functions formed by
adding, subtracting, multiplying and dividing simpler functions, is obtained
by use of the following rules for the derivatives of algebraic combinations of
differentiable functions.
Rules for Algebraic Combinations of Functions

function              derivative
$f(x) + g(x)$         $f'(x) + g'(x)$
$f(x) - g(x)$         $f'(x) - g'(x)$
$c\,f(x)$             $c\,f'(x)$
$f(x) \cdot g(x)$     $f'(x) \cdot g(x) + f(x) \cdot g'(x)$
$\dfrac{f(x)}{g(x)}$  $\dfrac{g(x) \cdot f'(x) - f(x) \cdot g'(x)}{[g(x)]^2}$
Notice carefully that the product rule has a plus sign but the quotient rule
has a minus sign. You can remember these formulas better if you think about
where these signs come from. Increasing either factor increases a (positive)
product, so the derivative of each factor appears with a plus sign in the
formula for the derivative of a product. Similarly, increasing the numerator
increases a positive quotient, so the derivative of the numerator appears
with a plus sign in the formula for the derivative of a quotient. However,
increasing the denominator decreases a positive quotient, so the derivative
of the denominator appears with a minus sign.
Let's now use the rules to differentiate the quotient

$$\frac{5x^3 - 7x^2 + 3}{\sin x - e^x}.$$

First, the derivative of the numerator $5x^3 - 7x^2 + 3$ is

$$5(3x^2) - 7(2x) + 0 = 15x^2 - 14x.$$

Similarly the derivative of $\sin x - e^x$ is $\cos x - e^x$. Finally, the
derivative of the quotient function is obtained by using the rule for quotients:

$$\frac{(\sin x - e^x)(15x^2 - 14x) - (5x^3 - 7x^2 + 3)(\cos x - e^x)}{(\sin x - e^x)^2}.$$
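A formula of this size is easy to get wrong by a sign, so it can be worth checking numerically. The following Python sketch is an added illustration (the function names and the test point $x = 1$ are assumptions, not from the text). It evaluates the quotient-rule formula for $(5x^3 - 7x^2 + 3)/(\sin x - e^x)$ and compares it with a symmetric difference quotient.

```python
import math


def numerator(x):
    return 5 * x**3 - 7 * x**2 + 3


def numerator_prime(x):
    return 15 * x**2 - 14 * x


def denominator(x):
    return math.sin(x) - math.exp(x)


def denominator_prime(x):
    return math.cos(x) - math.exp(x)


def quotient(x):
    # Undefined wherever sin x = e^x; x = 1 is safely away from such points.
    return numerator(x) / denominator(x)


def quotient_prime(x):
    """Quotient rule: (g f' - f g') / g^2, with f the numerator, g the denominator."""
    g = denominator(x)
    return (g * numerator_prime(x) - numerator(x) * denominator_prime(x)) / g**2


def central_difference(func, x, dx=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + dx) - func(x - dx)) / (2 * dx)


x = 1.0
print("quotient rule:", quotient_prime(x))
print("numerical    :", central_difference(quotient, x))
```

If the minus sign in the numerator of the quotient rule were replaced by a plus sign, the two printed values would no longer agree.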
The following examples further illustrate the use of the rules for algebraic
combinations of functions.

function                            derivative
$3e^t + \sqrt[3]{t}$                $3e^t + (1/3)\,t^{-2/3}$
$\dfrac{5}{x^3} - 7x^4 + \ln x$     $5(-3)x^{-4} - 7(4x^3) + 1/x$
$7\sqrt{x}\,\cos x$                 $7\left(\dfrac{1}{2\sqrt{x}}\right)\cos x + 7\sqrt{x}\,(-\sin x)$
$\dfrac{4}{3}\pi r^3$               $\dfrac{4}{3}\pi\,3r^2$
$\dfrac{3s^6}{s^2 - s}$             $\dfrac{(s^2 - s) \cdot 3(6s^5) - 3s^6(2s - 1)}{(s^2 - s)^2}$
For another kind of example, suppose the per capita daily energy consumption
in a country is currently 800,000 BTU, and, due to energy conservation
efforts, it is falling at the rate of 1,000 BTU per year. Suppose
too that the population of the country is currently 200,000,000 people and
is rising at the rate of 1,000,000 people per year. Is the total daily energy
consumption of this country rising or falling? By how much?
Three different quantities vary with time in this example: daily per capita
energy consumption, population and total daily energy consumption. We can
model this situation with three functions $C(t)$, $P(t)$ and $E(t)$.
$C(t)$ : per capita consumption at time $t$
$P(t)$ : population at time $t$
$E(t)$ : total energy consumption at time $t$
Since the per capita consumption times the number of people in the population
gives the total energy consumption, these three functions are related
algebraically:

$$E(t) = C(t) \cdot P(t).$$
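If $t = 0$ represents today, then we are given the two rates of change

$C'(0) = -1{,}000 = -10^3$ BTU per person per year, and
$P'(0) = 1{,}000{,}000 = 10^6$ persons per year.

Using the product rule,

$$\begin{aligned}
E'(0) &= C(0) \cdot P'(0) + C'(0) \cdot P(0) \\
      &= (8 \times 10^5)(10^6) + (-10^3)(2 \times 10^8) \\
      &= (8 \times 10^{11}) - (2 \times 10^{11}) \\
      &= 6 \times 10^{11} \text{ BTU per year.}
\end{aligned}$$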
So the total daily energy consumption is currently rising at the rate of
$6 \times 10^{11}$ BTU per year. Thus the growth in the population more than
offsets the efforts to conserve energy.
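The arithmetic in this example is short enough to check directly. The sketch below is an added illustration using the figures stated in the example; the variable names are assumptions chosen for readability. It keeps the two terms of the product rule separate so that the competing effects are visible.

```python
# Figures from the example; t = 0 is today.
C0 = 800_000          # per capita daily consumption, BTU per person
C_rate = -1_000       # rate of change of C, BTU per person per year (falling)
P0 = 200_000_000      # population, persons
P_rate = 1_000_000    # rate of change of P, persons per year (rising)

# Product rule: E'(0) = C(0) * P'(0) + C'(0) * P(0)
population_term = C0 * P_rate       # effect of population growth
conservation_term = C_rate * P0     # effect of conservation (negative)
E_rate = population_term + conservation_term

print("population term  :", population_term)
print("conservation term:", conservation_term)
print("E'(0)            :", E_rate, "BTU per year")
```

The population term is four times the size of the conservation term, which is why the total is still rising.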
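Finally, it is a useful exercise to check that the units make sense in this
computation. Recall that $C(t)$ represents per capita daily energy
consumption, so the units for $C(0) \cdot P'(0)$ are

$$\frac{\text{BTU}}{\text{person}} \times \frac{\text{persons}}{\text{year}} = \frac{\text{BTU}}{\text{year}},$$

and, similarly, the units for $C'(0) \cdot P(0)$ are

$$\frac{\text{BTU}}{\text{person} \cdot \text{year}} \times \text{persons} = \frac{\text{BTU}}{\text{year}}.$$

Suppose the function $g$ is obtained from $f$ by multiplying by a constant $c$,
so $g(x) = cf(x)$. The graph of $y = g(x)$ is then obtained from the graph of
$y = f(x)$ by stretching (or compressing) the $y$-coordinates by the factor
$c$. What is the relationship between the slopes $f'(x)$ and $g'(x)$? If
the $y$-coordinates are tripled, the slope will be three times as great. If
they are halved, the slope will also be half as much. More generally, the
elongated (or compressed) graph of $g$ has a slope equal to $c$ times the
slope of the original graph of $f$. In other words, $g(x) = cf(x)$ implies
$g'(x) = cf'(x)$.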
Now suppose instead that $g$ is obtained from $f$ by adding a constant $b$,
so $g(x) = f(x) + b$. This time the graph of $y = g(x)$ is obtained from
the graph of $y = f(x)$ by shifting up or down (according to the sign
of $b$) by $|b|$ units. What is the relationship between the slopes $f'(x)$
and $g'(x)$? The shifted graph has exactly the same slope as the original
graph, so in this case, $g(x) = f(x) + b$ implies $g'(x) = f'(x)$.
[Figure: graphs of $y = \sin(x)$ and $y = \sin(x) + .5$]
There is a similar pattern when the coordinates of the input variable are
stretched or shifted, that is, when $y = f(u)$ and $u$ is rescaled by the linear
relation $u = mx + b$. These results depend on the chain rule and appear in
the exercises.
The fact that the derivative of $f(x) + b$ is the same as the derivative of
$f(x)$ is a special case of the general addition rule, which says the derivative
of a sum is the sum of the derivatives. In the special case, the derivative
of the constant function $b$ is zero, so adding a constant leaves the derivative
unchanged. To see how natural it is to add rates in the general case,
consider the following example:
Suppose we are diluting concentrated orange juice by mixing it with water
in a big tub. We may let $f(t)$ be the amount (in gallons) of concentrate in
the tub and $g(t)$ be the amount of water in the tub at time $t$. Then $f'(t)$
is the rate at which concentrate is being added at time $t$ (measured in gallons
per minute), and $g'(t)$ is the rate at which water is flowing into the tub. The
formula $F(t) = f(t) + g(t)$ then gives the total amount of liquid in the tub at
time $t$, and $F'(t) = f'(t) + g'(t)$ is the rate at which the total amount of
liquid is changing.

We now turn to the product rule: if $F(x) = f(x) \cdot g(x)$, then
$F'(x) = f(x) \cdot g'(x) + f'(x) \cdot g(x)$.
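To save some writing, let

$$\Delta F = F(x + \Delta x) - F(x), \qquad \Delta f = f(x + \Delta x) - f(x), \qquad \Delta g = g(x + \Delta x) - g(x).$$

Rewrite the last two equations as

$$f(x + \Delta x) = f(x) + \Delta f, \qquad g(x + \Delta x) = g(x) + \Delta g.$$

Now we can write

$$\begin{aligned}
F(x + \Delta x) &= f(x + \Delta x) \cdot g(x + \Delta x) \\
                &= (f(x) + \Delta f) \cdot (g(x) + \Delta g) \\
                &= f(x) \cdot g(x) + f(x) \cdot \Delta g + \Delta f \cdot g(x) + \Delta f \cdot \Delta g.
\end{aligned}$$

This gives us a simple expression for $\Delta F = F(x + \Delta x) - F(x)$,
namely,

$$\Delta F = f(x) \cdot \Delta g + \Delta f \cdot g(x) + \Delta f \cdot \Delta g.$$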
These quantities all have nice geometric interpretations. First, think of
the numbers $f(x)$ and $g(x)$ as lengths that depend on $x$; then $F(x)$
naturally stands for the area of the rectangle whose sides are $f(x)$ and
$g(x)$. If the sides of the rectangle grow by the amounts $\Delta f$ and
$\Delta g$, then the area $F$ grows by $\Delta F$. As the following diagram
shows, $\Delta F$ has three parts, corresponding to the three terms in the
expression we derived algebraically for $\Delta F$.
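[Figure: a rectangle with sides $f(x) + \Delta f$ and $g(x) + \Delta g$, divided
into four regions of area $f(x) \cdot g(x)$, $f(x) \cdot \Delta g$,
$\Delta f \cdot g(x)$, and $\Delta f \cdot \Delta g$]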
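Now we divide $\Delta F$ by $\Delta x$ and finish the argument:

$$\frac{\Delta F}{\Delta x} = \frac{f(x) \cdot \Delta g + \Delta f \cdot g(x) + \Delta f \cdot \Delta g}{\Delta x} = f(x) \cdot \frac{\Delta g}{\Delta x} + \frac{\Delta f}{\Delta x} \cdot g(x) + \frac{\Delta f \cdot \Delta g}{\Delta x}.$$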
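Consider what happens to each of the three terms as $\Delta x$ gets smaller and
smaller. In the first term, the second factor $\Delta g/\Delta x$ approaches
$g'(x)$, by the definition of the derivative. The first factor, $f(x)$, doesn't
change at all as $\Delta x$ shrinks. So the first term approaches
$f(x) \cdot g'(x)$. In the same way, in the second term the factor
$\Delta f/\Delta x$ approaches $f'(x)$ while $g(x)$ doesn't change, so the
second term approaches $f'(x) \cdot g(x)$.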
Finally, look at the third term. We would know what to expect if we had
another factor of $\Delta x$ in the denominator. We can put ourselves in familiar
territory by the trick of multiplying the third term by $\Delta x/\Delta x$:

$$\frac{\Delta f \cdot \Delta g}{\Delta x} = \frac{\Delta f}{\Delta x} \cdot \frac{\Delta g}{\Delta x} \cdot \Delta x.$$

Thus we can see that as $\Delta x$ approaches zero, the third term itself
approaches $f'(x) \cdot g'(x) \cdot 0 = 0$.
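We may summarize our calculation by writing

$$\lim_{\Delta x \to 0} \frac{\Delta F}{\Delta x}
= f(x) \cdot \left[ \lim_{\Delta x \to 0} \frac{\Delta g}{\Delta x} \right]
+ \left[ \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \right] \cdot g(x)
+ \left[ \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} \right] \cdot \left[ \lim_{\Delta x \to 0} \frac{\Delta g}{\Delta x} \right] \cdot \left[ \lim_{\Delta x \to 0} \Delta x \right]$$

from which we have

$$\begin{aligned}
\lim_{\Delta x \to 0} \frac{\Delta F}{\Delta x} &= f(x) \cdot g'(x) + f'(x) \cdot g(x) + f'(x) \cdot g'(x) \cdot 0 \\
&= f(x) \cdot g'(x) + f'(x) \cdot g(x).
\end{aligned}$$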
This completes the proof of the product rule. Other formal arguments are
left to the exercises.
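The behaviour of the three terms in this argument can be watched numerically. The following Python sketch is an added illustration; the choice $f(x) = \sin x$, $g(x) = e^x$, the point $x = 1$, and the helper name `product_terms` are all assumptions made for the demonstration. For each $\Delta x$ it prints the three pieces of $\Delta F/\Delta x$.

```python
import math


def product_terms(f, g, x, dx):
    """Split (F(x+dx) - F(x)) / dx, where F = f * g, into its three terms."""
    df = f(x + dx) - f(x)
    dg = g(x + dx) - g(x)
    first = f(x) * dg / dx     # approaches f(x) * g'(x)
    second = df * g(x) / dx    # approaches f'(x) * g(x)
    third = df * dg / dx       # approaches f'(x) * g'(x) * 0 = 0
    return first, second, third


x = 1.0
for dx in (0.1, 0.01, 0.001, 0.0001):
    first, second, third = product_terms(math.sin, math.exp, x, dx)
    print(f"dx = {dx:<7g} first = {first:.6f}  second = {second:.6f}  third = {third:.6f}")

# Limiting values predicted by the product rule at x = 1:
print("f(x) g'(x) =", math.sin(x) * math.exp(x))
print("f'(x) g(x) =", math.cos(x) * math.exp(x))
```

The third column shrinks roughly in proportion to $\Delta x$, while the first two settle on the two terms of the product rule.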
Exercises
Finding Derivatives
1. Find the derivative of each of the following functions.
a) $3x^5 - 10x^2 + 8$
b) $(5x^{12} + 2)(\pi - \pi^2 x^4)$
c) $\sqrt{u} - 3/u^3 + 2u^7$
d) $mx + b$ ($m$, $b$ constant)
e) $.5 \sin x + \sqrt[3]{x} + \pi^2$
f) $\dfrac{\pi - \pi^2 x^4}{5x^{12} + 2}$
g) $2\sqrt{x} - \dfrac{1}{\sqrt{x}}$
h) $\tan z\,(\sin z - 5)$
i) $\dfrac{\sin x}{x^2}$
j) $x^2 e^x$
k) $\cos x + e^x$
l) $\sin x / \cos x$
m) $e^x \ln x$
n) $\dfrac{2^x}{10 + \sin x}$
o) $\sin(e^x \cos x)$
p) $6e^{\cos t} / (5\sqrt[3]{t})$
q) $\ln(x^2 + xe^x)$
r) $\dfrac{5x^2 + \ln x}{7\sqrt{x} + 5}$
2. Suppose $f$ and $g$ are functions and that we are given
$f(2) = 3$, $g(2) = 4$, $g(3) = 2$,
$f'(2) = 2$, $g'(2) = 1$, $g'(3) = 17$.
Evaluate the derivative of each of the following functions at $t = 2$:
a) $f(t) + g(t)$
b) $5f(t) - 2g(t)$
c) $f(t)g(t)$
d) $\dfrac{f(t)}{g(t)}$
e) $g(f(t))$
f) $\sqrt{g(t)}$
g) $t^2 f(t)$
h) $(f(t))^2 + (g(t))^2$
i) $\dfrac{1}{f(t)}$
j) $f(3t - (g(1 + t))^2)$
k) What additional piece of information would you need to calculate the
derivative of $f(g(t))$ at $t = 2$?
l) Estimate the value of $f(t)/g(t)$ at $t = 1.95$.
3. a) Extend the product rule to express $(f(t)g(t)h(t))'$ in terms of $f$, $g$,
and $h$.
b) If the length, width, and height of a rectangular box are changing at the
rates of 3, 6, and 5 inches/minute at the moment when all three dimensions
happen to be 10 inches, at what rate is the volume of the box changing then?
c) If the length, width, and height of a box are 10 inches, 12 inches, and
8 inches, respectively, and if the length and height of the box are changing
at the rates of 3 inches/minute and 2 inches/minute, respectively, at what
rate must the width be changing to keep the volume of the box constant?
4. In this problem we examine the effect of stretching or shifting the
coordinates of the input variable of a function. Your answers should address
both the algebra and the geometry of the problem to show how the algebraic
relations between the functions are manifested in their graphs.
a) Suppose $f(x) = \sin(x)$ and $g(x) = \sin(mx)$, where $m$ is a constant
stretching factor. What is the relation between $f'(x)$ and $g'(x)$?
b) As in (a), suppose $f(x) = \sin(x)$, but this time $g(x) = \sin(x + b)$ where
$b$ is the size of a (constant) shift. What is the relation between $f'(x)$ and
$g'(x)$ this time?
c) Now consider the general case: $f(x)$ is an unspecified differentiable
function and $g(x) = f(mx + b)$, where the input variable is stretched by the
constant factor $m$ and shifted by the constant amount $b$. What is the
relation between $f'(x)$ and $g'(x)$?
b) Explain why square cm are not the appropriate units for $V'(r)$, even
though dimensionally correct.
7. Do the following.
a) Show that $\dfrac{1}{1 - x^2}$ and $\dfrac{x^2}{1 - x^2}$ have the same derivative.
b) If f
(x) = g
?)
c) Show that $\dfrac{1}{1 - x^2} = \dfrac{x^2}{1 - x^2} + C$ by finding $C$.
8. Suppose that the current total daily energy consumption in a particular
country is $16 \times 10^{13}$ BTU and is rising at the rate of
$6 \times 10^{11}$ BTU per year. Suppose that the current population is
$2 \times 10^8$ people and is rising at the rate of $10^6$ people per year.
What is the current daily per capita energy consumption? Is it rising or
falling? By how much?
9. The population of a particular country is 15,000,000 people and is growing
at the rate of 10,000 people per year. In the same country the per capita
yearly expenditure for energy is $1,000 per person and is growing at the rate
of $8 per year. What is the country's current total yearly energy expenditure?
How fast is the country's total yearly energy expenditure growing?
10. The population of a particular country is 30 million and is rising at
the rate of 4,000 people per year. The total yearly personal income in the
country is 20 billion dollars, and it is rising at the rate of 500 million dollars
per year. What is the current per capita personal income? Is it rising or
falling? By how much?
11. An explorer is marooned on an iceberg. The top of the iceberg is shaped
like a square with sides of length 100 feet. The length of the sides is shrinking
at the rate of two feet per day. How fast is the area of the top of the iceberg
shrinking? Assuming the sides continue to shrink at the rate of two feet per
day, what will be the dimensions of the top of the iceberg in five days? How
fast will the area of the top of the iceberg be shrinking then?
12. Suppose the iceberg of problem 11 is shaped like a cube. How fast is the
volume of the cube shrinking when the sides have length 100 feet? How fast
after five days?
Deriving Differentiation Rules
13. In this problem we calculate the derivative of $f(x) = x^4$.
a) Expand
$$f(x + \Delta x) = (x + \Delta x)^4 = (x + \Delta x)(x + \Delta x)(x + \Delta x)(x + \Delta x)$$
as a sum of 16 terms. (Don't collect like terms yet.)
b) How many terms in part a involve no $\Delta x$'s? What form do such terms
have?
c) How many terms in part a involve exactly one $\Delta x$? What form do such
terms have?
d) Group the terms in part a so that $f(x + \Delta x)$ has the form
$$Ax^4 + B\,\Delta x + R\,(\Delta x)^2,$$
where there are no $\Delta x$'s among the terms in $A$ or $B$, but $R$ has
several terms, some involving $\Delta x$. Use part b to check your value of $A$;
use part c to check your value of $B$.
e) Compute the quotient
$$\frac{f(x + \Delta x) - f(x)}{\Delta x},$$
taking advantage of part d.
f) Now find
$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x};$$
this is the derivative of $x^4$. Is your result here compatible with the rule for
the derivative of $x^n$?
14. In this problem we calculate the derivative of $f(x) = x^n$, where $n$ is any
positive integer.
a) First show that you can write
$$f(x + \Delta x) = x^n + nx^{n-1}\,\Delta x + R\,(\Delta x)^2$$
by developing the following line of argument. Write $(x + \Delta x)^n$ as a
product of $n$ identical factors:
$$(x + \Delta x)^n = \underbrace{(x + \Delta x)}_{\text{1-st}}\,\underbrace{(x + \Delta x)}_{\text{2-nd}}\,\underbrace{(x + \Delta x)}_{\text{3-rd}} \cdots \underbrace{(x + \Delta x)}_{n\text{-th}}$$
But now, before tackling this general case, look at the following examples.
In the examples we use notation to help us keep track of which factors are
contributing to the final result.
i) Consider the product $(a + b)(a + b) = aa + ab + ba + bb$. There are four
individual terms. Each term contains one of the entries in the first factor
(namely $a$ or $b$) and one of the entries in the second factor (namely $a$ or
$b$). The four terms represent thereby all possible ways of choosing one entry
in the first factor and one entry in the second factor.
ii) Multiply out the product $(a + b)(a + b)(A + B)$. (Don't combine like terms
yet.) Does each term contain one entry from the first factor, one from the
second, and one from the third? How many terms did you get? In fact there
are two ways to choose an entry from the first factor, two ways to choose
an entry from the second factor, and two ways to choose an entry from the
third factor. Therefore, how many ways can you make a choice consisting of
one entry from the first, one from the second, and one from the third?
Now return to the general case:
$$(x + \Delta x)^n = \underbrace{(x + \Delta x)}_{\text{1-st}}\,\underbrace{(x + \Delta x)}_{\text{2-nd}}\,\underbrace{(x + \Delta x)}_{\text{3-rd}} \cdots \underbrace{(x + \Delta x)}_{n\text{-th}}$$
How many ways can you choose an entry from each factor and not get any
$\Delta x$'s? Multiply these chosen entries together; what does the product look
like (apart from having no $\Delta x$'s in it)?
How many ways can you choose an entry from each factor in such a way
that the resulting product has precisely one $\Delta x$? Describe all the various
choices which give that result. What does a product that contains precisely
one $\Delta x$ factor look like? What do you obtain for the sum of all such terms
with precisely one $\Delta x$ factor?
What is the minimum number of $\Delta x$ factors in any of the remaining terms
in the full expansion of $(x + \Delta x)^n$?
Do your calculations agree with this summary:
$$(x + \Delta x)^n = x^n + nx^{n-1}\,\Delta x + R\,(\Delta x)^2\ ?$$
b) Now find the value of
$$\frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
c) Finally, find
$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
Do you get $nx^{n-1}$?
15. In this problem we give another derivation of the power rule based on
writing
$$x^r = e^{r \ln(x)}.$$
Use the chain rule to differentiate $e^{r \ln(x)}$. Explain why your answer
equals $rx^{r-1}$.
16. Does the rule for the derivative of $x^r$ hold for $r = 0$? Why or why not?
17. In this exercise we prove the Addition Rule: $F(x) = f(x) + g(x)$ implies
$F'(x) = f'(x) + g'(x)$.
a) Show $F(x + \Delta x) - F(x) = f(x + \Delta x) - f(x) + g(x + \Delta x) - g(x)$.
b) Divide by $\Delta x$ and finish the argument.
18. In this exercise we prove the Quotient Rule: $F(x) = f(x)/g(x)$ implies
$$F'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}.$$
a) Rewrite $F(x) = f(x)/g(x)$ as $f(x) = g(x)F(x)$. Pretend for the moment
that you know what $F'(x)$ is, and apply the Product Rule to find $f'(x)$ in
terms of $F(x)$, $g(x)$, $F'(x)$, $g'(x)$.
b) Replace $F(x)$ by $f(x)/g(x)$ in your expression for $f'(x)$ in part a.
c) Solve the equation in part b for $F'(x)$ in terms of $f(x)$, $g(x)$, $f'(x)$ and
$g'(x)$.
19. In this problem we calculate the derivative of $f(x) = x^n$ when $n$ is a
negative integer. First write $n = -m$, so $m$ is a positive integer. Then
$f(x) = x^{-m} = 1/x^m$.
a) Use the Quotient Rule and this new expression for $f$ to find $f'(x)$.
b) Do the algebra to re-express $f'(x)$ as $nx^{n-1}$.
20. In this problem we calculate the derivatives of $\sin x$ and $\cos x$. We will
need the addition formulas:
$$\sin(A + B) = \sin A \cos B + \cos A \sin B$$
$$\cos(A + B) = \cos A \cos B - \sin A \sin B$$
First tackle $f(x) = \sin x$:
a) Use the addition formula for $\sin(A + B)$ to rewrite $f(x + \Delta x)$ in
terms of $\sin(x)$, $\cos(x)$, $\sin(\Delta x)$, and $\cos(\Delta x)$.
b) The quotient
$$\frac{f(x + \Delta x) - f(x)}{\Delta x}$$
can now be written in the form
$$P(\Delta x) \cdot \sin x + Q(\Delta x) \cdot \cos x,$$
where $P$ and $Q$ are specific functions of $\Delta x$. What are the formulas for
those functions?
c) Use a calculator or computer to estimate the limits
$$\lim_{\Delta x \to 0} P(\Delta x) \quad \text{and} \quad \lim_{\Delta x \to 0} Q(\Delta x).$$
(Try $\Delta x$ = .1, .01, .001, .0001. Be sure your calculator is set on radians,
not degrees.) Using part b you should now be able to determine the limit
$$\lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$
by writing it in the form
$$\left[ \lim_{\Delta x \to 0} P(\Delta x) \right] \cdot \sin x + \left[ \lim_{\Delta x \to 0} Q(\Delta x) \right] \cdot \cos x.$$
d) What is $f'(x)$?
e) Proceed similarly to find the derivative of $g(x) = \cos x$.
21. In this problem we calculate the derivatives of the other circular
functions. Use the quotient rule together with the derivatives of $\sin x$ and
$\cos x$ to verify that the derivatives of the other four circular functions are
as given in the table below:

function                              derivative
$\tan x = \dfrac{\sin x}{\cos x}$     $\sec^2 x$
$\csc x = \dfrac{1}{\sin x}$          $-\cot x \csc x$
$\sec x = \dfrac{1}{\cos x}$          $\sec x \tan x$
$\cot x = \dfrac{1}{\tan x}$          $-\csc^2 x$
Differential Equations
22. If $y = f(x)$ then the second derivative of $f$ is just the derivative of
the derivative of $f$; it is denoted $f''(x)$ or $d^2y/dx^2$. Find the second
derivative of each of the following functions.
a) $f(x) = e^{3x - 2}$
b) $f(x) = \sin \omega x$, where $\omega$ is a constant
c) $f(x) = x^2 e^x$
23. Show that $e^{3x}$ and $e^{-3x}$ both satisfy the (second order) differential
equation
$$f''(x) = 9f(x).$$
Furthermore, show that any function of the form
$g(x) = \alpha e^{3x} + \beta e^{-3x}$ satisfies this differential equation. Here
$\alpha$ and $\beta$ are arbitrary constants. Finally, choose $\alpha$ and $\beta$
so that $g(x)$ also satisfies the two conditions $g(0) = 12$ and $g'(0) = 15$.
24. Show that $y = \sin x$ satisfies the differential equation $y'' + y = 0$. Show
that $y = \cos x$ also satisfies the differential equation. Show that, in fact,
$y = a \sin x + b \cos x$ satisfies the differential equation for any choice of
constants $a$ and $b$. Can you find a function $g(x)$ that satisfies these three
conditions:
$$g''(x) + g(x) = 0, \qquad g(0) = 1, \qquad g'(0) = 4?$$
25. Show that $\sin \omega x$ satisfies the differential equation
$y'' + \omega^2 y = 0$. What other solutions can you find to this differential
equation? Can you find a function $L(x)$ that satisfies these three conditions:
$$L''(x) + 4L(x) = 0, \qquad L(0) = 36, \qquad L'(0) = 64?$$
The Colorado River Problem
Make your answer to this sequence of questions an essay. Identify all the
variables you consider (e.g., A stands for the area of the lake), and indicate
the functional relationships between them (A depends on time t, measured
in weeks from the present). Identify the derivatives of those functions, as
necessary.
The Colorado River, which excavated the Grand Canyon, among others,
used to empty into the Gulf of California. It no longer does. Instead, it runs
into a marshy area some miles from the Gulf and stops. One of the major
reasons for this change is the construction of dams, notably the Hoover
Dam. Every dam creates a lake behind it, and every lake increases the total
surface area of the river. Since the rate at which water evaporates is
proportional to the area of the water surface exposed to air, the lakes along
the Colorado have increased the loss of river water through evaporation. Over
the years, these losses (in conjunction with other factors, like increased usage
by a rapidly growing population) have been significant enough to dry up the
river at its mouth.
26. Let us analyze the evaporation rate along a river that was recently
dammed. Suppose the lake is currently 50 yards wide, and getting wider at
a rate of 3 yards per week. As the lake fills, it gets longer, too. Suppose it
is currently 950 yards long, and it is extending upstream at a rate of
15 yards per week. Assuming the lake remains approximately rectangular as it
grows, find
[Figure: the lake behind the dam, drawn as a rectangle 50 yds wide and 950 yds
long, with the river and the dam labeled]
a) the current area of the lake, in square yards;
b) the rate at which the surface of the lake is currently growing, in square
yards per week.
27. Suppose the lake continues to spread sideways at the rate of 3 yards per
week, and it continues to extend upstream at the rate of 15 yards per week.
a) Express the area of the lake as a (quadratic!) function of time, where
time is measured from the present, in weeks, and where the lake's area is as
given in problem 26.
b) How many weeks will it take for the lake to cover 30 acres (= 145,200
square yards)?
c) At what rate is the lake surface growing when it covers 30 acres?
28. Compare the rates at which the surface of the lake is growing in problem
26 (which is the current rate) and in problem 27 (which is the rate when
the lake covers 30 acres). Are these rates the same? If they are not, how do
you account for the difference? In particular, the width and length grow at
fixed rates, so why doesn't the area? Use what you know about derivatives
to answer the question.
29. Suppose the local climate causes water to evaporate from the surface
of the lake at the rate of 0.22 cubic yards per week, for each square yard of
surface. Write a formula that expresses total evaporation per week in terms
of area. Use E to denote total evaporation.
30. The lake is fed by the river, and that in turn is fed by rainwater and
groundwater from its watershed. (The watershed, or basin, of a river is
that part of the countryside containing the ponds and streams which drain
into the river.) Suppose the watershed provides the lake, on average, with
25,000 cubic yards of new water each week.
Assuming, as we did in problem 27, that the lake widens at the constant
rate of 3 yards per week, and lengthens at the rate of 15 yards per week, will
the time ever come that the water being added to the lake from its watershed
balances the water being removed by evaporation? In other words, will the
lake ever stop filling?
5.2 Finding Partial Derivatives
We know from Chapter 3 that no additional formulas are needed to calculate
partial derivatives. We simply use the usual differentiation formulas, treating
all the variables except one (the one with respect to which the partial
derivative is formed) as if they were constants. If we do this we get new
techniques for analyzing rates of change in problems that involve functions
of several variables.
Some Examples
Here are two examples to illustrate the technique for calculating partial
derivatives:
1. Suppose $f(x, y) = x^2 y + 5x^3 - \sqrt{x + y}$. Then
$$f_x(x, y) = 2xy + 15x^2 - \frac{1}{2\sqrt{x + y}}, \quad \text{and} \quad f_y(x, y) = x^2 - \frac{1}{2\sqrt{x + y}}.$$
2. Suppose $g(u, v) = e^{uv} + \dfrac{u}{v}$. Then
$$g_u(u, v) = ve^{uv} + \frac{1}{v}, \quad \text{and} \quad g_v(u, v) = ue^{uv} - \frac{u}{v^2}.$$
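The idea of treating every variable but one as a constant translates directly into a numerical check. The Python sketch below is an added illustration: it takes $f(x, y) = x^2 y + 5x^3 - \sqrt{x + y}$ as in the first example, and the helper names `numeric_partial_x` and `numeric_partial_y` and the test point $(1, 3)$ are assumptions. Each helper varies one input while holding the other fixed.

```python
import math


def f(x, y):
    return x**2 * y + 5 * x**3 - math.sqrt(x + y)


def f_x(x, y):
    # y is treated as a constant while differentiating with respect to x
    return 2 * x * y + 15 * x**2 - 1 / (2 * math.sqrt(x + y))


def f_y(x, y):
    # x is treated as a constant while differentiating with respect to y
    return x**2 - 1 / (2 * math.sqrt(x + y))


def numeric_partial_x(func, x, y, h=1e-6):
    """Symmetric difference quotient in x, with y held fixed."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)


def numeric_partial_y(func, x, y, h=1e-6):
    """Symmetric difference quotient in y, with x held fixed."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)


x, y = 1.0, 3.0
print("f_x:", f_x(x, y), "numerical:", numeric_partial_x(f, x, y))
print("f_y:", f_y(x, y), "numerical:", numeric_partial_y(f, x, y))
```

At the point $(1, 3)$ the square root equals 2, so the formulas give $f_x = 6 + 15 - 1/4$ and $f_y = 1 - 1/4$, which makes the comparison easy to verify by hand.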
Eradication of Disease
Controlling (or, better still, eradicating) a communicable disease depends
first on the development of a vaccine. But even after this step has been
accomplished, public health officials must still answer important questions,
including:
• What proportion of the population must be vaccinated in order to
eliminate the disease?
• At what age should people be vaccinated?
In their 1982 article, "Directly Transmitted Infectious Diseases: Control
by Vaccination" (Science, Vol. 215, 1053-1060), Roy Anderson and Robert
May formulate a model for the spread of disease that permits them to answer
these and other questions. For a particular disease in a particular
environment, the important variables in their model are
1. The average human life expectancy $L$, in years;
2. The average age $A$ at which individuals catch the disease, in years;
3. The average age $V$ at which individuals are vaccinated against the
disease, in years.
Anderson and May deduce from their model that in order to eradicate
the disease, the proportion of the population that is vaccinated must exceed
$p$, where $p$ is given by
$$p = \frac{L + V}{L + A}.$$
For a disease like measles, public health officials can directly affect the
variable $V$, for example by the recommendations they make to physicians
about immunization schedules for children. They may also indirectly affect
the variables $A$ and $L$, because public health policy influences factors which
can modify the age at which children catch the disease or the overall life
expectancy of the population. (Many other factors affect these variables as
well.) Which of these three variables has the greatest effect on the proportion
of the population that must be vaccinated?
In other words, which is largest: $\partial p/\partial L$, $\partial p/\partial A$,
or $\partial p/\partial V$?
Using the rules, we compute:
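\[ \frac{\partial p}{\partial L} = \frac{1\cdot(L + A) - 1\cdot(L + V)}{(L + A)^2} = \frac{A - V}{(L + A)^2}, \]
\[ \frac{\partial p}{\partial A} = -\frac{L + V}{(L + A)^2}, \quad\text{and}\quad \frac{\partial p}{\partial V} = \frac{1}{L + A}. \]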
For measles in the United States, reasonable values of the variables are L
= 70 years, A = 5 years and V = 1 year. Using these values, the crucial
proportion of the population needing to be vaccinated is p = 71/75 = .947,
and the partial derivatives are
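\[ \frac{\partial p}{\partial L} = \frac{4}{(75)^2} = .0007, \qquad \frac{\partial p}{\partial A} = -\frac{71}{(75)^2} = -.0126, \qquad \frac{\partial p}{\partial V} = \frac{1}{75} = .0133. \]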
[Margin note: Determining units]
A comment is in order here on units. While the input variables L, A, and
V are all measured in years (so the rates are "per year"), the output variable p is
dimensionless: it is the ratio of persons vaccinated to persons in the whole population.
It would be reasonable to write p as a percentage. Then we can attach the
units "percent per year" to each of the three partial derivatives. Thus we have:
\[ \frac{\partial p}{\partial L} = .07\%\ \text{per year}, \qquad \frac{\partial p}{\partial A} = -1.26\%\ \text{per year}, \qquad \frac{\partial p}{\partial V} = 1.33\%\ \text{per year}. \]
It is not surprising that a change in average life expectancy has a negligible
effect on the proportion p of the population that must be vaccinated in order
to eradicate measles. Nor is it surprising that changing the age of vaccination
has the greatest effect on p. But it is not obvious ahead of time that changing
the age at which children catch the disease has nearly as large an effect on p:
• Decreasing the age of vaccination decreases the proportion p by 1.33%
per year of decrease.
• Increasing the age at which children catch measles decreases the
proportion p by 1.26% per year of increase.
Changes can also go the "wrong" way. For example, in an area where
use of communal child care facilities is growing, contact among very young
children increases, and the age at which children are exposed to (and can
catch) communicable diseases like measles falls. The Anderson–May model
tells us that immunization practices must change to compensate: either the
age of vaccination must drop a like amount, or the fraction of the population
that is vaccinated must grow by 1.26% per year of decrease in the average
age at infection.
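The measles figures above can be reproduced with a few lines of code. This is a small sketch rather than anything from the original text; the function names are illustrative assumptions, while the formulas and the values L = 70, A = 5, V = 1 come from the discussion above.

```python
def p(L, A, V):
    """Threshold proportion that must be vaccinated, p = (L + V) / (L + A)."""
    return (L + V) / (L + A)

def dp_dL(L, A, V):
    return (A - V) / (L + A) ** 2

def dp_dA(L, A, V):
    return -(L + V) / (L + A) ** 2

def dp_dV(L, A, V):
    return 1 / (L + A)

# Values quoted in the text for measles in the United States (in years).
L, A, V = 70.0, 5.0, 1.0

print(f"p = {p(L, A, V):.3f}")
for name, rate in [("dp/dL", dp_dL), ("dp/dA", dp_dA), ("dp/dV", dp_dV)]:
    # Multiply by 100 to express each rate in percent per year.
    print(f"{name} = {100 * rate(L, A, V):+.2f}% per year")
```

Changing the three input values (for instance, to those of a different country) shows at once which variable the threshold p is most sensitive to.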
Exercises
Finding Partial Derivatives
1. Find the partial derivatives of the following functions.
a) $x^2 y$
b) $\sqrt{x + y}$
c) $e^{xy}$
d) $\dfrac{y}{x}$
e) $\dfrac{x + y}{y + z}$
f) $\sin\dfrac{y}{x}$
2. a) Suppose $f(x, y) = e^{-(x+2y)}(2x - 5y)$. Find $f_x(x, y)$ and $f_y(x, y)$.
b) Find a point (a, b) at which $f_x(a, b) = 0$. At such a point a small change
in x leaves the value of f virtually unchanged.
c) Find a point (a, b) at which a small increase in the x-value would produce
the same change in f(a, b) as would the same-sized decrease in the y-value.
3. Suppose $g(u, v) = \dfrac{\sin u + v^2 + 7uv}{1 + u^2 + v^4}$. Find $g_u(u, v)$ and $g_v(u, v)$.
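4. The second partial derivatives of z = f(x, y) are the partial
derivatives of $\partial f/\partial x$ and $\partial f/\partial y$, namely:
\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right), \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right), \qquad \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right). \]
Find the three second partial derivatives of the following functions.
a) $x^2 y$
b) $\sqrt{x + y}$
c) $e^{xy}$
d) $\dfrac{y}{x}$
e) $\sin\dfrac{y}{x}$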
Eradication of Disease
5. Suppose you were dealing with measles in a developing country where
L = 50 years, A = 4 years, and V = 2 years. Discuss the impact on measles
control if increased public health efforts increase L to 55 years, A to 5 years,
and decrease V to 1.5 years.
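Partial differential equations
6. Show that the function $z = \dfrac{1}{\sqrt{t}}\exp\left(-\dfrac{x^2}{4t}\right)$ satisfies the partial
differential equation
\[ \frac{\partial^2 z}{\partial x^2} = \frac{\partial z}{\partial t}. \]
7. Show that every linear function of the form z = px + qy + c satisfies the
partial differential equation
\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0. \]
Here p, q, and c are arbitrary constants.
8. Show that the function $z = e^x \sin y$ also satisfies the partial differential
equation
\[ \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = 0. \]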
5.3 The Shape of the Graph of a Function
We know from chapter 3 that the derivative gives us qualitative information
about the shape of the graph of a differentiable function.

function                        derivative
increasing                      positive
decreasing                      negative
level                           zero
steep (rising or falling)       large (positive or negative)
gradual (rising or falling)     small (positive or negative)
straight                        constant
[Margin note: Contexts for optimization problems]
Having a formula for the derivative of a function will thus give us a great
deal of information about the behavior of the function itself. In particular we
will be interested in using the derivative to solve optimization problems:
finding maximum or minimum values of a function. Such problems occur
frequently in many fields.
• Economists actually define human rationality in terms of optimization.
Each person is assumed to have a utility function, a function that
assigns to each of many possible outcomes its utility, a numerical measure
of its value to her. (Different people may have different utility
functions, depending on their personal value systems.) A rational person
is one who acts to maximize her utility. Some utilities are expressed
in terms of money. For example, a rational manufacturer will seek to
maximize her profit (in dollars). Her profit will depend on (that is, be
a function of) such variables as the cost of her raw materials and the
unit price she charges for her product.
• Many physical laws are expressed as minimum principles. Ordinary
soap bubbles exhibit one of these principles. A soap film has a surface
energy which is proportional to its surface area. For almost any
physical system, its stable state is one which minimizes its energy. Stable
soap films are thus examples of minimal surfaces. Interfaces involving
crystals also have surface energies, leading to the study of crystalline
minimal surfaces.
• Statisticians develop mathematical summaries for data; in other words,
mathematical models. For example, a relationship between two
numerical variables may be summarized by a linear function, say y = mx + b,
where x and y are the variables of interest. It would be very rare to
find data that were exactly linear. In a particular case, the statistician
chooses the linear model that minimizes the discrepancy between the
actual values of y and the theoretical values obtained from the linear
function. Statisticians frequently measure this discrepancy by summing
the squares of the differences between the actual and the theoretical
values of y for each data point. The best-fitting line or regression line is
the graph of the linear function which is optimal in this sense.
• Psychologists who study decision-making have found that some people
are risk-averse; they make their decisions primarily to avoid risks. If
we regard risk as a function of the various outcomes under
consideration (a bit like a utility function), such a person acts to minimize this
function.
The derivative is the key tool here. We will develop a general procedure
for using the derivative of a function to locate its extremes.
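To make the statisticians' criterion above concrete, here is a minimal sketch of the sum-of-squares discrepancy and of the line that minimizes it. It is an illustration under stated assumptions, not the text's own procedure: the data points are invented, the function names are hypothetical, and the closed-form slope and intercept are the standard result of setting the two partial derivatives of the discrepancy (with respect to m and b) equal to zero.

```python
def sum_squared_error(m, b, data):
    """Sum of squared differences between the actual y-values and m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in data)

def best_fit_line(data):
    """Slope and intercept that minimize sum_squared_error (least squares)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

# Invented, roughly linear data used only for illustration.
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]

m, b = best_fit_line(data)
print("slope:", m, "intercept:", b)
print("discrepancy at the best fit:", sum_squared_error(m, b, data))
print("discrepancy with a nudged slope:", sum_squared_error(m + 0.1, b, data))
```

Nudging either m or b away from the computed values can only increase the discrepancy, which is what it means for the regression line to be optimal in this sense.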
Language
Here is a graph of what we might consider a "generic" differentiable function.
[Figure: graph of a generic differentiable function, with the points a, b, c, and d marked on the x-axis]
[Margin note: Local extremes and global extremes]
The most distinctive features are the hilltops and valley bottoms, points
where the graph levels and the derivative is zero. We distinguish between
local extremes, like those occurring at the points x = b, x = c, and x = d, and
a global extreme, like the global maximum at the point x = a. The function
has a local minimum at x = b because $f(x) \ge f(b)$ for all x sufficiently near
b. The function has a global maximum at x = a because $f(x) \le f(a)$ for
all x. Notice that this particular function does not have a global minimum.
What kinds of local extremes does the function have at x = c and x = d?
The convention is to say that all extremes are local extremes, and a local
extreme may or may not also be a global extreme.
[Margin note: A function may have no extreme at a point where its derivative equals zero]
Examining the graph of as simple a function as $f(x) = x^3$ shows us that
a function need not have any extremes at all. Moreover, since $f'(0) = 0$ for
this function, a zero derivative doesn't necessarily identify a point where an
extreme occurs.
[Figure: graph of $y = x^3$]
[Margin note: An extreme can occur at a cusp]
Can a function have an extreme at a point other than where the derivative
is zero? Consider the graph of $f(x) = x^{2/3}$ below.
[Figure: graph of $y = x^{2/3}$]
This function is differentiable everywhere except at the point x = 0. And it
is at this very point, where
\[ f'(x) = \frac{2}{3}x^{-1/3} = \frac{2}{3\sqrt[3]{x}} \]
is undefined, that the function has its global minimum. For this reason,
points where the derivative fails to exist (or is infinite) are as important as
points where the derivative equals zero. All of these kinds of points are called
critical points for the function.
A critical point for a function f is a point on the
graph of f where
• $f'(c) = 0$,
• $f'(c)$ is undefined,
• x = c is an endpoint of the interval.
The following graphs illustrate three local maxima, each satisfying one of
these three conditions.
[Figure: three graphs, each with a local maximum at x = c]
When we apply Principle II to optimization problems, an important part
of the task will be ascertaining which, if any, of the critical points or
endpoints we find actually gives the extreme we're looking for. We'll examine
a variety of techniques, graphical and analytical, for locating critical points
and determining what kind of extreme point (if any) they are.
Finding Extremes
Using a graphical approach
[Margin note: Computer graphing can be easy if the general location of the extremes is known]
If we can use a computer to examine the graph of the function of interest, we
can determine the existence and location of extremes by inspection. However,
every graphing utility requires the user to specify the interval on which the
function will be graphed, and careful analysis may be required in order to
choose an interval that contains all the extremes of interest.
For functions given by data, whose graphs have only finitely many points,
we can zoom in to find the exact coordinates of the extreme datapoints. For
a function given by a formula, we can estimate the coordinates of an extreme
to arbitrary accuracy by zooming in on the point as closely as desired. This
is the method we used in some of the exercises of Chapter 1, and it is quite
satisfactory in many situations.
Using the formula for the derivative
[Margin note: Formulas can give exact answers and can handle parameters]
In this chapter we are concentrating on functions given by formulas. For
these functions we may want a method other than the approximation using
a graphing utility described above.
• For some functions, the determination of extremes using a formula for
the derivative is at least as easy as using the computer.
• Some functions are described in terms of a parameter, a constant whose
value may vary from one problem to another. For example, the rate
equation from the S-I-R model for change in the number of infected,
I
(x).
3. Find any roots of f
(x) is undened.
5. Determine the shape of the graph to locate any local extremes. Find
the shape either by looking directly at the graph of y = f(x) or by
analyzing the sign of f
increasing
or decreasing? At the indicated point, what is the sign of f
? Is f
(not f)
increasing or decreasing at the point? What does this then say about the
sign of f
(the derivative of f
) at the point?
[Figure: six graphs, labeled a) through f), each drawn on x- and y-axes with a point indicated]
3. For each of the following, sketch a graph of y = f(x) that is consistent
with the given information. On each graph, mark any critical points or
extremes.
a) f
(1) = 0; f
(2) = 0;
f
(2) = 0; f
(2) = 0; f
(3) = 0; f
(3) > 0.
4. The geometric meaning of the second derivative If f is any func-
tion and if f
is increasing over
that interval, and we say the curve is concave upward over that interval.
If f
(0) = g
(0) = h
(0) = k
(0) = 1, g
(0) = 5, h
(0) = 1, and k
, the .
5. a) Here is the graph you saw back on page 302:
[Figure: graph of the generic function, with the points a, b, c, and d marked on the x-axis]
At which point is the second derivative greater, b or d, and why?
b) Reproduce a sketch of this curve and indicate where the curve is concave
up and where it is concave down.
c) What must be true about the second derivative at the points where the
curve changes concavity from up to down, or vice versa? Give a clear
justification for your answer.
d) What must be true about the second derivative near the right-hand end
of the graph, and why?
e) Put all this together to sketch the graph of the second derivative of this
function. Label the values a–d on your sketch.
6. Second derivative test for maxima and minima Explain why the
following test works.
If f
(c) = 0 and f
(c) = 0 and f
(c) = 0 and f
(x)and hence the geometry of the graph of fon either side of c?)
Finding critical points
7. For each of the following functions, find the critical points, if any, without
using a computer or calculator. Can you sketch the graph of the function
near the critical point? Use the second derivative test if you can't figure the
behavior out from a simpler inspection.
a) $f(x) = x^{1/3}$
b) $f(x) = x^3 + \frac{3}{2}x^2 - 6x + 5$
c) $f(x) = \dfrac{2x + 1}{x - 1}$
d) $f(x) = \sin x$
e) $f(x) = \sqrt{1 - x^2}$
f) $f(x) = \dfrac{e^x}{x}$
g) $f(x) = x \ln x$
h) $f(x) = x^c + \dfrac{1}{x^c}$, where c is some constant
8. For each of the following graphs, mark any critical points or extremes.
Indicate which extremes are local and which are global. (Assume that at
their ends the curves continue in the direction they are headed.)
[Figure: three graphs, labeled a), b), and c), each drawn on x- and y-axes]
Finding extremes
Except where indicated, you should not use a computer or calculator to solve
the following problems.
9. For what positive value of x does $f(x) = x + \dfrac{7}{x}$ attain its minimum value?
Explain how you found this value.
10. For what value of x in the interval [1, 2] does $f(x) = x + \dfrac{7}{x}$ attain its
minimum value? Explain how you found this value.
11. Use a graphing program to make a sketch of the function $y = f(x) = x^2\, 2^{-x}$
on the interval $0 \le x \le 10$. From the graph, estimate the value of x
which makes y largest, accurately to four decimal places. Then find where y
takes on its maximum by setting the derivative f
\[ f'(r) = 4\pi r - \frac{2V}{r^2} = \frac{4\pi r^3 - 2V}{r^2} \]
[Margin note: Looking for critical points]
The derivative is undefined at r = 0, which is outside the domain under
consideration. So now we set the derivative equal to zero and solve for any
possible critical points.
\[
\begin{aligned}
f'(r) &= \frac{4\pi r^3 - 2V}{r^2} \\
0 &= \frac{4\pi r^3 - 2V}{r^2} \\
0 &= 4\pi r^3 - 2V \\
r &= \sqrt[3]{V/2\pi}
\end{aligned}
\]
Thus $r = \sqrt[3]{V/2\pi}$ is the only critical point.
We can actually sketch the shape of the graph of A versus r based on this Finding the shape
of the graph of A
analysis of f
(x) = 4x
3
+ 3x
2
+ 2x + 1,
which is certainly dened for all x.
In order to use a graphing utility to nd the roots of f
(x) = 0, we need
to choose an interval that will contain the roots we seek. Since f
(x) = 0
and f
$'(-1) = 12(-1)^2 + 6(-1) + 2 = 8$.
[Margin note: The equation of the tangent line]
Thus the equation of the tangent line is
\[ y + 2 = 8(x + 1). \]
[Margin note: Finding the x-intercept of the tangent line]
To find the x-intercept of this line, we must set y equal to zero and solve
for x: $0 + 2 = 8(x + 1)$ gives us $x = -0.75$. Of course, this x-intercept is
not equal to r, but it's a better approximation than, say, $-1$. To get an even
better approximation, we repeat this process, starting with the line tangent
to the graph of g at $x = -0.75$ instead of at $x = -1$.
[Figure: graph of g near the root r, showing the tangent lines at the points $(-1, -2)$ and $(-.75, -.5)$]
[Margin note: The general equation of the tangent line]
The slope of this new tangent line is $g'(x_0)$, so
\[ y - g(x_0) = g'(x_0)(x - x_0) \]
is the equation of the tangent line. Since this line crosses the x-axis at
the point $(x_1, 0)$, we set $x = x_1$ and $y = 0$ in the equation to obtain
\[ 0 - g(x_0) = g'(x_0)(x_1 - x_0). \]
Now it is easy to solve for $x_1$:
\[
\begin{aligned}
g'(x_0)(x_1 - x_0) &= -g(x_0) \\
x_1 - x_0 &= -\frac{g(x_0)}{g'(x_0)} \\
x_1 &= x_0 - \frac{g(x_0)}{g'(x_0)}.
\end{aligned}
\]
In the same way we get
\[ x_2 = x_1 - \frac{g(x_1)}{g'(x_1)}, \qquad x_3 = x_2 - \frac{g(x_2)}{g'(x_2)}, \]
and so on.
To summarize, suppose that $x_0$ is given some value START. Then
Newton's method is the computation of the sequence of numbers determined by
\[
\begin{aligned}
x_0 &= \text{START} \\
x_{n+1} &= x_n - \frac{g(x_n)}{g'(x_n)}, \qquad n = 0, 1, 2, 3, \ldots
\end{aligned}
\]
As we have seen many times, the sequence
\[ x_1, x_2, x_3, \ldots, x_n, \ldots \]
is a list of numbers to which we can always add a new term by iterating
our method yet again.
[Margin note: The limit of the successive approximations is the root]
For most functions, if we begin with an appropriate
starting value of $x_0$, there is another number r that is the limit of this list of
numbers, in the sense that the difference between $x_n$ and r becomes as small
as we wish as n increases without bound,
\[ r = \lim_{n \to \infty} x_n. \]
The numbers $x_1, x_2, x_3, \ldots, x_n, \ldots$ constitute a sequence of successive
approximations for the root r of the equation g(x) = 0. We can write a computer
program to carry out this algorithm for as many steps as we choose. The
program NEWTON does just that for $g(x) = 4x^3 + 3x^2 + 2x + 1$.
Program: NEWTON
Newton's method for solving g(x) = 4x^3 + 3x^2 + 2x + 1 = 0
start = -1
numberofsteps = 8
x = start
FOR n = 0 to numberofsteps
print n, x
g = 4 * x^3 + 3 * x^2 + 2 * x + 1
gprime = 12 * x^2 + 6 * x + 2
x = x - g/gprime
NEXT n
If we program a computer using this algorithm with START = $-1$, then
we get
\[
\begin{aligned}
x_0 &= -1.00000000000 \\
x_1 &= -0.75000000000 \\
x_2 &= -0.63235294118 \\
x_3 &= -0.60687911790 \\
x_4 &= -0.60583128240 \\
x_5 &= -0.60582958619 \\
x_6 &= -0.60582958619 \\
x_7 &= -0.60582958619 \\
x_8 &= -0.60582958619.
\end{aligned}
\]
Thus we have found the root of g(x) = 0, the critical point we were
looking for. In fact, after only 6 steps we could see that the value of the
critical point was specified to at least ten decimal places. Also at the sixth
step, we had the eight decimal places obtained with the use of the graphing
utility. In fact, it turns out that the number of decimal places fixed roughly
doubles with each round. In the above list, for instance, $x_2$ fixed one decimal,
$x_3$ fixed two decimals, $x_4$ fixed four, $x_5$ fixed ten (the eleventh digit of the
root is really an 8, which gets rounded to a 9 in $x_6$ through $x_8$), and $x_6$ would have
fixed at least twenty if we had printed them all out! Moreover, by changing
only three lines of the program (the first, sixth, and seventh) we can use
NEWTON for any other function. With the use of the program NEWTON,
we will see that in most cases we can obtain results more quickly and to a
higher degree of accuracy with Newton's method than by using a graphing
utility, although it is still sometimes helpful to use a graphing utility to get
a reasonable starting value.
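For readers working in a modern language, the computation carried out by NEWTON can be sketched in Python as follows. This is an illustrative translation under stated assumptions, not the book's program: the function name `newton` and its parameters are hypothetical, and g and its derivative are passed in as arguments so that a different equation needs no change to the loop itself.

```python
def newton(g, gprime, start, number_of_steps):
    """Return the Newton iterates x_0, x_1, ..., x_number_of_steps."""
    x = start
    iterates = [x]
    for _ in range(number_of_steps):
        # One Newton step: follow the tangent line at x down to the x-axis.
        x = x - g(x) / gprime(x)
        iterates.append(x)
    return iterates

def g(x):
    return 4 * x**3 + 3 * x**2 + 2 * x + 1

def gprime(x):
    return 12 * x**2 + 6 * x + 2

iterates = newton(g, gprime, start=-1.0, number_of_steps=8)
for n, x in enumerate(iterates):
    print(n, f"{x:.11f}")
```

The sketch does not guard against a zero derivative; if gprime(x) happens to be 0 at some step, the division fails, which mirrors the geometric fact that a horizontal tangent line never reaches the x-axis.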
Examples
Example 1. Start with cos x = x. The solution(s) to this equation (if
any) will be the x-coordinates of any points of intersection of the graphs of
y = cos x and y = x. Sketch these two graphs and convince yourself that
there is one solution, between 0 and $\pi/2$. The equation cos x = x is not in the
form g(x) = 0, so rewrite it as $\cos x - x = 0$. Now we can apply Newton's
method with $g(x) = \cos x - x$. Try starting with $x_0 = 1$. This gives the
iteration scheme
\[
\begin{aligned}
x_0 &= 1 \\
x_{n+1} &= x_n - \frac{\cos x_n - x_n}{-\sin x_n - 1}, \qquad n = 0, 1, 2, \ldots
\end{aligned}
\]
The numbers we get are
\[
\begin{aligned}
x_0 &= 1.000000000 \\
x_1 &= .750363868\ldots \\
x_2 &= .739112891\ldots \\
x_3 &= .739085133\ldots \\
x_4 &= .739085133\ldots
\end{aligned}
\]
We have the solution to 9 decimal places in only four steps. Not only does
Newton's method work, it works fast!
Example 2. Suppose we continue with the equation cos x = x, but this time
choose $x_0 = 0$. What will we find? The numbers we get are
\[
\begin{aligned}
x_0 &= 0.000000000 \\
x_1 &= 1.000000000 \\
x_2 &= 0.750363868
\end{aligned}
\]
There is no need to continue; we can see that we will again obtain
$r = 0.739085133\ldots$ as in Example 1. Look again at your sketch and see why you
might have predicted this result.
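Examples 1 and 2 can be reproduced with a short sketch like the one below. The helper name `newton_iterates` is an assumption made for illustration; the function g and its derivative are the ones used in the examples.

```python
import math

def newton_iterates(g, gprime, start, steps):
    """Return the list x_0, x_1, ..., x_steps produced by Newton's method."""
    x = start
    iterates = [x]
    for _ in range(steps):
        x = x - g(x) / gprime(x)
        iterates.append(x)
    return iterates

def g(x):
    return math.cos(x) - x

def gprime(x):
    return -math.sin(x) - 1

from_one = newton_iterates(g, gprime, 1.0, 5)    # Example 1: x_0 = 1
from_zero = newton_iterates(g, gprime, 0.0, 6)   # Example 2: x_0 = 0

for n, x in enumerate(from_one):
    print("start 1:", n, f"{x:.9f}")
for n, x in enumerate(from_zero):
    print("start 0:", n, f"{x:.9f}")
```

Because the first step from 0 lands exactly on 1, the second run simply repeats the first run one step later.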
Example 3. Next, let's find the roots of the polynomial $x^5 - 3x + 1$. This
means solving the equation $x^5 - 3x + 1 = 0$. We know the necessary
derivative, so we're ready to apply Newton's method, except for one thing: which
starting value $x_0$ do we pick? This is the part of Newton's method that leaves
us on our own.
[Margin note: Finding the starting value $x_0$ can be hard]
Assuming that some graphing software is available, the best thing to do
is graph the function. But for most graphing utilities, we need to choose an
interval. How do we choose one which is sure to include all the roots of the
polynomial? The derivative $5x^4 - 3$ of this polynomial is simple enough that
we can use it to get an idea of the shape of the graph of $y = x^5 - 3x + 1$ before
we turn to the computer. Clearly the derivative is zero only for $x = \pm\sqrt[4]{3/5}$,
and the derivative is positive except between these two values of x.
[Margin note: Using the derivative to find the shape of the graph]
In other words, we know the shape of the graph of y = g(x): it is increasing, then
decreases for a bit, then increases from there on out. This still is not enough
information to tell us how many roots g has, though; its graph might lie in any
one of the following configurations and so have 1, 2, or 3 roots (there are two
other possibilities not shown: one has 2 roots and one has 1 root).
[Figure: three possible configurations of the graph of g, with 1, 2, and 3 roots]
We can thus say that g(x) = 0 has at least one and at most three real roots. However,
if we further observe that $g(-2) = -25$, $g(-1) = 3$, $g(0) = 1$, $g(1) = -1$, and
$g(2) = 27$, we see that the graph of g must cross the x-axis at some value of
x between $-2$ and $-1$, between 0 and 1, and again between 1 and 2. Therefore
the right-hand sketch above must be the correct one.
Or we can almost as easily turn to a graphing utility. If we try the interval
$[-5, 5]$, we see again that the graph crosses the x-axis in exactly three points.
[Figure: graph of $y = x^5 - 3x + 1$, with $-2$, $-1$, 1, and 2 marked on the x-axis]
[Margin note: Finding the root between $-2$ and $-1$]
One of the roots is between $-2$ and $-1$, one is between 0 and 1, and the
third is between 1 and 2. To find the first, we apply Newton's method with
$x_0 = -2$. Then we get
\[
\begin{aligned}
x_0 &= -2.000000000 \\
x_1 &= -1.675324675\ldots \\
x_2 &= -1.478238029\ldots \\
x_3 &= -1.400445373\ldots \\
x_4 &= -1.389019863\ldots \\
x_5 &= -1.388792073\ldots \\
x_6 &= -1.388791984\ldots \\
x_7 &= -1.388791984\ldots
\end{aligned}
\]
This took a few more steps than the other examples, but not a lot. Notice
again that once there are any decimals fixed at all, the number of decimals
fixed roughly doubles in the next approximation. In the exercises you will
be asked to compute the other two roots.
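The two stages of Example 3, bracketing a root by sign changes and then refining it with Newton's method, can be sketched as follows. The helper name `newton_iterates` is a hypothetical choice; only the root between $-2$ and $-1$ treated in the example is computed here.

```python
def newton_iterates(g, gprime, start, steps):
    """Return the list x_0, x_1, ..., x_steps produced by Newton's method."""
    x = start
    iterates = [x]
    for _ in range(steps):
        x = x - g(x) / gprime(x)
        iterates.append(x)
    return iterates

def g(x):
    return x**5 - 3 * x + 1

def gprime(x):
    return 5 * x**4 - 3

# Sign changes of g between consecutive integers show where roots must lie.
for left in range(-2, 2):
    if g(left) * g(left + 1) < 0:
        print(f"g changes sign between {left} and {left + 1}")

iterates = newton_iterates(g, gprime, -2.0, 7)
for n, x in enumerate(iterates):
    print(n, f"{x:.9f}")
```

The sign-change scan only tests integer endpoints, so it is a rough locator: it would miss two roots lying between the same pair of integers, which is why the shape analysis using the derivative is still needed.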
Example 4. Let's use Newton's method to find the obvious solution r = 0
of $x^3 - 5x = 0$. If we choose $x_0$ sufficiently close to 0, Newton's method
should work just fine. But what does "sufficiently close" mean? Suppose we
try $x_0 = 1$. Then we get
\[
\begin{aligned}
x_0 &= 1 \\
x_1 &= -1 \\
x_2 &= 1 \\
x_3 &= -1 \\
x_4 &= 1 \\
&\;\;\vdots
\end{aligned}
\]
[Margin note: Newton's method can fail]
The $x_n$'s oscillate endlessly, never getting close to 0. Going back to the
geometric interpretation of Newton's method, this oscillation can be explained
by the graph below.
[Figure: graph of $y = x^3 - 5x$, showing the tangent lines at $x_0 = 1$ and $x_1 = -1$, each crossing the x-axis at the other point]
Using more advanced methods, it is possible to get precise estimates for
how close $x_0$ needs to be to r in order for Newton's method to succeed. For
now, we'll just have to rely on common sense and trial and error.
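The oscillation in Example 4 is easy to observe numerically. In the sketch below, the helper name `newton_iterates` and the second starting value 0.5 are illustrative assumptions; the value 0.5 is just one hypothetical starting point that turns out to be close enough to 0.

```python
def newton_iterates(g, gprime, start, steps):
    """Return the list x_0, x_1, ..., x_steps produced by Newton's method."""
    x = start
    iterates = [x]
    for _ in range(steps):
        x = x - g(x) / gprime(x)
        iterates.append(x)
    return iterates

def g(x):
    return x**3 - 5 * x

def gprime(x):
    return 3 * x**2 - 5

# Starting at 1, each tangent line sends us to the other of the two points.
oscillating = newton_iterates(g, gprime, 1.0, 4)
print(oscillating)

# A starting value nearer to the root 0 behaves very differently.
converging = newton_iterates(g, gprime, 0.5, 5)
print(converging)
```

Comparing the two printed lists shows why a starting value chosen by trial and error deserves a quick check: the same formula either cycles forever or homes in on the root, depending only on where it begins.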
One important thing to note is the relation between algebra and Newton's
method. Although we can now solve many more equations than we could
earlier, this doesn't mean that we can abandon algebra. In fact, given a new
equation, you should first try to solve it algebraically, for exact solutions are
often better. Only when this fails should you look for approximate solutions
using Newton's method. So don't forget algebra; you'll still need it!
Exercises
1. The Babylonian algorithm. Show that the Babylonian algorithm of
chapter 2 is the same as Newton's method applied to the equation $x^2 - a = 0$.
2. When Newton introduced his method, he did so with the example
$x^3 - 2x - 5 = 0$. Show that this equation has only one root, and find it.
This example appeared in 1669 in an unpublished manuscript of Newton's (a published version
came later, in 1711). The interesting fact is that Newton's method differs from the one presented
here: his scheme was more complicated, requiring a different formula to get each approximation.
In 1690, Joseph Raphson transformed Newton's scheme into the one used above. Thus, "Newton's
method" is more properly called the "Newton–Raphson method," and many modern texts
use this more accurate name.
3. Use Newton's method to find a solution of $x^3 + 2x^2 + 10x = 20$ near the
point x = 1.
The approximate solution 1;22,7,42,33,4,40 of this equation appears in a book written in 1228
by Leonardo of Pisa (also known as Fibonacci). This number looks odd because it's written in
sexagesimal notation: it translates into
\[ 1 + \frac{22}{60} + \frac{7}{60^2} + \frac{42}{60^3} + \frac{33}{60^4} + \frac{4}{60^5} + \frac{40}{60^6}. \]
This solution is accurate to 10 decimal places, which is not bad for 750 years ago. In the Middle
Ages, there was a lot of interest in solving equations. There were even contests, with a prize going
to the person who could solve the most. The quadratic formula, which expresses algebraically
the roots of any second degree equation, had been known for thousands of years, but there were
no general methods for finding roots of higher degree equations in the 13th century. We don't
know how Leonardo found his solution. Why give away your secrets to your competitors?
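The sexagesimal expansion above can be converted to a decimal with exact fractions. This short sketch is an added illustration, not part of the exercise; the variable names are assumptions. It also evaluates how nearly Leonardo's value satisfies the equation.

```python
from fractions import Fraction

# The sexagesimal "digits" 1;22,7,42,33,4,40: the k-th entry is divided by 60**k.
digits = [1, 22, 7, 42, 33, 4, 40]

value = sum(Fraction(d, 60**k) for k, d in enumerate(digits))
approx = float(value)

# How close does this value come to solving x^3 + 2x^2 + 10x = 20?
residual = approx**3 + 2 * approx**2 + 10 * approx - 20

print("exact fraction:", value)
print("decimal value: ", approx)
print("residual:      ", residual)
```

Using `Fraction` keeps the sum exact until the final conversion to a decimal, so any rounding comes only from that last step and from evaluating the residual.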
4. Use Newton's method to find a solution of $x^3 + 3x^2 = 5$.
In 1530, Nicolo Tartaglia was challenged to solve this equation algebraically. Five years later, in
1535, he found the solution
\[ x = \sqrt[3]{\frac{3 + \sqrt{5}}{2}} + \sqrt[3]{\frac{3 - \sqrt{5}}{2}} - 1. \]
Initially, Tartaglia could solve only certain types of cubic equations, but this was enough to let
him win some famous contests with other mathematicians of the time. By 1541, he knew the
general solution, but he made the mistake of telling Geronimo Cardano. Cardano published the
solution in 1545 and the resulting formulas are called Cardan's Formulas.
The above solution of $x^3 + 3x^2 = 5$ is called a solution by radicals
because it is obtained by extracting various roots or "radicals." Similarly,
some time before 1545, Luigi Ferrari showed that any fourth degree equation
can be solved by radicals. This led to an intense interest in the fifth degree
equation. To see what happens in this case, read the next problem.
5. In Example 3, we saw that one root of $x^5 - 3x + 1$ was $-1.38879198\ldots$.
Use Newton's method to find the other two roots.
In 1826, Niels Henrik Abel proved that the general polynomial of degree 5 or greater cannot be
solved by radicals. Using the work of Evariste Galois (done around 1830, but not understood
until many years later), it can be shown that the roots of the equation $x^5 - 3x + 1 = 0$ cannot
be expressed by any combination of radicals. Thus algebra can't solve this equation, and some kind
of successive approximation technique is unavoidable!
6. One of the more surprising applications of Newton's method is to
compute reciprocals. To make things more concrete, we will compute 1/3.4567.
Note that this number is the root of the equation 1/x = 3.4567.
a) Show that the formula of Newton's method gives us
\[ x_{n+1} = 2x_n - 3.4567\,x_n^2. \]
b) Using $x_0 = .5$ and the formula from (a), compute 1/3.4567 to a high
degree of accuracy.
c) Try starting with $x_0 = 1$. What happens? Explain graphically what goes
wrong.
This method for computing reciprocals is important because it involves
only multiplication and subtraction. Since $a/b = a \cdot (1/b)$, this implies that
division can likewise be built from multiplication and subtraction. Thus,
when designing a computer, the division routine doesn't need to be built
from scratch; the designer can use the method illustrated here. There are
some computers that do division this way.
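The remark that division can be built from multiplication and subtraction can be seen in a short sketch. The function names `reciprocal` and `divide`, the fixed step count, and the sample division are all illustrative assumptions; the update rule is the formula from part (a) with 3.4567 replaced by a general number a.

```python
def reciprocal(a, x0, steps=8):
    """Approximate 1/a by iterating x -> 2x - a*x*x from the starting value x0.

    Only multiplication and subtraction are used. Whether the iteration
    settles down depends on the choice of x0.
    """
    x = x0
    for _ in range(steps):
        x = 2 * x - a * x * x
    return x

def divide(numerator, denominator, x0):
    """Compute numerator/denominator as numerator * (1/denominator)."""
    return numerator * reciprocal(denominator, x0)

estimate = reciprocal(3.4567, 0.5)
print("estimate of 1/3.4567:", estimate)
print("check (should be near 1):", 3.4567 * estimate)
print("10 / 3.4567 is about:", divide(10.0, 3.4567, 0.5))
```

The fixed number of steps is a simplification; a fuller routine would stop once successive estimates agree to the precision required.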
7. In this problem we will determine the maximum value of the function
\[ f(x) = \frac{x + 1}{x^4 + 1}. \]
a) Graph f(x) and convince yourself that the maximum value occurs
somewhere around x = .5. Of course, the exact location is where the slope of the
graph is zero, i.e., where $f'(x) = 0$.
b) Compute $f'(x)$.
c) Since the answer to (b) is a fraction, it vanishes when its numerator
does. Setting the numerator equal to 0 gives a fourth degree equation. Use
Newton's method to find a solution near x = .5.
d) Compute the maximum value of f(x).
8. Consider the hyperbola y = 1/x and the circle $x^2 - 4x + y^2 + 3 = 0$.
a) By graphing the circle and the hyperbola, convince yourself that there
are two points of intersection.
b) By substituting y = 1/x into the equation of the circle, obtain a fourth
degree equation satisfied by the x-coordinate of the points of intersection.
c) Solve the equation from (b) by Newton's method, and then determine the
points of intersection.
9. Sometimes Newton's method doesn't work so nicely. For example,
consider the equation sin x = 0.
a) Compute $x_1$ using Newton's method for each of the four starting values
$x_0$ = 1.55, 1.56, 1.57 and 1.58.
b) The answers you get are wildly different. Using the basic formula
\[ x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)} \]
explain why.
The epidemic runs its course
We return to the epidemiology example we have studied since chapter 1.
Recall that our S-I-R model keeps track of three subgroups of the population:
the susceptible, the infected, and the recovered. One of the interesting
features of the model is that the larger the initial susceptible population, the
more rapidly the epidemic runs its course. We observe this by choosing fixed
values of $R_0 = R(0)$ and $I_0 = I(0)$ and looking at graphs of S(t) versus t for
various values of $S_0 = S(0)$.
[Figure: graphs of S(t) versus t for several values of S(0). Horizontal axis: time (days), from 0 to 100. Vertical axis: susceptible population, marked from 5000 to 45000. In every case R(0) = 0 and I(0) = 100.]
We see in each case that for sufficiently large t the graph of S levels off,
approaching a value we'll call $S_\infty$:
\[ S_\infty = \lim_{t \to \infty} S(t). \]
What we mean by the epidemic "running its course" is that S(t) reaches
this limit value. We can see from the graphs that the value of $S_0$ affects the
number $S_\infty$
\[ S' = -aSI, \qquad I' = aSI - bI, \qquad R' = bI. \]
10. Use the differentiation rules together with these differential equations
to show that
\[ \bigl(I + S - (b/a)\ln S\bigr)' = 0. \]
11. Explain why the result of problem 10 means that $I + S - (b/a)\ln S$
has the same value (call it C) for every value of t.
12. Look at the graphs of the solutions I(t) for various values of S(0) below.
[Figure: graphs of I(t) versus t for S(0) = 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, and 45000. Horizontal axis: time (days), from 0 to 100. Vertical axis: infected population, up to 24000. In every case R(0) = 0 and I(0) = 100.]
Write $\lim_{t \to \infty} I(t) = I_\infty$
\[ S_\infty - \frac{b}{a}\ln(S_\infty) = I_0 + S_0 - \frac{b}{a}\ln(S_0). \]
This equation determines $S_\infty$ implicitly as a function of $I_0$ and $S_0$. For
particular values of $I_0$ and $S_0$ (and of the parameters), you can use Newton's
method to find $S_\infty$.
14. Use the values
a = .00001 (person-days)$^{-1}$
b = .08 day$^{-1}$
$I_0$ = 100 persons
$S_0$ = 35,000 persons
Writing x instead of $S_\infty$ gives
\[ x - 8000\ln(x) = -48{,}605. \]
Apply Newton's method to find $x = S_\infty$
might be 100.
15. Using the same values of a, b, and $I_0$ as in problem 14, determine the
value of S