Calc 1
ISBN 978-1-61610-155-8
Orange Grove Texts Plus is an imprint of the University Press of Florida, which is the scholarly
publishing agency for the State University System of Florida, comprising Florida A&M
University, Florida Atlantic University, Florida Gulf Coast University, Florida International
University, Florida State University, New College of Florida, University of Central Florida,
University of Florida, University of North Florida, University of South Florida, and University of
West Florida.
Chapter 1. Functions
1. Functions
2. Classes of Functions
3. Operations on Functions
4. Viewing the Graphs of Functions
5. Inverse Functions
6. The Velocity Problem and the Tangent Problem
Chapter 2. Limits and Derivatives
7. The Limit of a Function
8. Limit Laws
9. Continuous Functions
10. Limits at Infinity
11. Derivatives
12. The Derivative as a Function
Chapter 3. Rules of Differentiation
13. Derivatives of Polynomial and Exponential Functions
14. The Product and Quotient Rules
15. Derivatives of Trigonometric Functions
16. The Chain Rule
17. Implicit Differentiation
18. Derivatives of Logarithmic Functions
19. Applications of Rates of Change
20. Related Rates
21. Linear Approximations and Differentials
Chapter 4. Applications of Differentiation
22. Minimum and Maximum Values
23. The Mean Value Theorem
24. The First and Second Derivative Tests
25. Taylor Polynomials and the Local Behavior of a Function
26. L’Hospital’s Rule
27. Analyzing the Shape of a Graph
28. Optimization Problems
Functions
1. Functions
A function f is a rule that associates to each element x in a set D
a unique element f (x) of another set R. Here the set D is called the
domain of f , while the set R is called the range of f . The fact that f
associates to each element of D an element of R is represented by the
symbol f : D → R. Instead of saying that f associates f (x) to x, we
often say that f sends x to f (x), which is shorter.
If the sets mentioned in the previous definition are sets of numbers,
then it is often easier to describe f by an algebraic expression. Let N
be the set of all natural numbers (which are the nonnegative integers).
Then the function f : N → N given by the rule f (x) = 2x + 3 is
the function that sends each nonnegative integer n to the nonnegative
integer 2n + 3. For instance, it sends 0 to 3, 1 to 5, 17 to 37, and so on.
In this case, the algebraic description is simpler than actually saying
“f is the function that sends n to 2n + 3.”
The rule that describes f may be simple or complicated. It could
be that a function is defined by cases such as
\[
f(x) =
\begin{cases}
0.1x & \text{if } 0 \le x \le 40, \\
4 + 0.15(x - 40) & \text{if } 40 < x \le 80, \\
10 + 0.2(x - 80) & \text{if } x > 80.
\end{cases}
\]
This example could describe an income tax code. The first $40,000
of income is taxed at a rate of 10%, income above $40,000 but not above
$80,000 is taxed at a rate of 15%, and income above $80,000 is taxed
at a rate of 20%. The value of f(x) is the amount of tax, in thousands
of dollars, to be paid on an income of x thousand dollars for any
positive real number x.
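As a minimal sketch, the piecewise tax rule can be written as a short Python function. The name tax and the choice to reject negative incomes are assumptions made for illustration only.

```python
def tax(x: float) -> float:
    """Tax owed (in thousands of dollars) on an income of x thousand dollars."""
    if x < 0:
        # Assumption: the rule is only meant for nonnegative incomes.
        raise ValueError("income must be nonnegative")
    if x <= 40:
        return 0.1 * x                   # 10% bracket
    if x <= 80:
        return 4 + 0.15 * (x - 40)       # 15% on the part above 40
    return 10 + 0.2 * (x - 80)           # 20% on the part above 80


for income in (30, 40, 60, 80, 100):
    print(income, tax(income))
```

The neighboring rules agree at the breakpoints: both give 4 at x = 40 and both give 10 at x = 80, so the tax owed does not jump when an income crosses into the next bracket.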
There are times when the rules that apply in various cases are
closely connected to each other. A classic example is the absolute value
function, that is,
\[
f(x) = |x| =
\begin{cases}
x & \text{if } 0 \le x, \\
-x & \text{if } x < 0.
\end{cases}
\]
In this case, f(x) = f(−x) for all x. When that happens, we say that f
is an even function. For instance, g(x) = cos x and h(x) = x^2 are even
functions. There are also functions for which −f(x) = f(−x) holds for
all x. Then we say that f is an odd function. Examples of odd functions
include g(x) = sin x and h(x) = x^3.
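The two symmetry conditions can be spot-checked numerically. The sketch below uses hypothetical helper names (looks_even, looks_odd) and an arbitrary list of sample points; passing such a check on finitely many points is only evidence of symmetry, not a proof.

```python
import math


def looks_even(f, samples):
    """True if f(x) == f(-x) (up to rounding) at every sample point."""
    return all(math.isclose(f(x), f(-x), abs_tol=1e-12) for x in samples)


def looks_odd(f, samples):
    """True if -f(x) == f(-x) (up to rounding) at every sample point."""
    return all(math.isclose(-f(x), f(-x), abs_tol=1e-12) for x in samples)


samples = [0.0, 0.5, 1.0, 2.0, 3.7]   # arbitrary test points

print(looks_even(abs, samples), looks_even(math.cos, samples))
print(looks_odd(math.sin, samples), looks_odd(lambda x: x ** 3, samples))
print(looks_odd(abs, samples))        # the absolute value is not odd
```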
There are times when a plain English description of a function is
simpler than an algebraic one. For instance, “let g be the function
that sends each integer that is at least 2 into its largest prime divisor”
is simpler than describing that function with algebraic symbols (and
symbols of formal logic). If the sets D and R are not sets of numbers,
an algebraic description may not even be possible. An example of this
is when D and R are both sets of people and f (x) is the biological
father of person x. Note that it is not by accident that we said that
f (x) is the father (and not the son) of x. Indeed, a function must send
x to a unique f (x). While a person has only one biological father, he
or she may have several sons.
Sometimes the rule that sends x to f (x) can only be given by listing
the value of f (x) for each x, as opposed to a general rule. For instance,
let D be the set of 200 specific cities in the United States, let R be the
set of all nonnegative real numbers, and for a city x, let f (x) be the
amount of precipitation that x had in 2009. Then f is a function since
it sends each x ∈ D into an element of R. This function is given by its
list of values, not by a rule that would specify how to compute f (x) if
given x.
Finally, functions can also be represented by their graphs. If
f : D → R is a function, then let us consider a two-dimensional coor-
dinate system such that the horizontal axis corresponds to elements of
D, and the vertical axis corresponds to elements of R. The graph of f
is the set of all points with coordinates (x, f (x)) such that x ∈ D. The
requirement that f (x) is unique for each x will ensure that no vertical
line intersects the graph of f more than once. This is called the vertical
line test.
2. Classes of Functions
2.1. Power Functions. A power function is a function f given by the
rule f(x) = x^a, where a is a fixed real number. Note that x^{-a} = 1/x^a,
so, for instance, x^{-3} = 1/x^3. The special case of a = −1, that is, the
function f(x) = 1/x, is called the reciprocal function. Note that the
rule g(x) = 1 for all real numbers x also defines a power function, one
For example, sin and cos are both periodic with period 2π, and
tan and cot are periodic with period π. The reader will be asked in
Exercise 2.7(1) about the periodicity of sec and csc.
2.5. Algebraic Functions. An algebraic function is a function that con-
tains only addition, subtraction, multiplication, division, and taking
roots. For instance, power functions with integer exponents are al-
gebraic functions, since they only use multiplication, though possibly
many times. Therefore, polynomials are algebraic functions as well since
they are sums of constant multiples of power functions. This implies
that rational functions are also algebraic since they are obtained by
dividing a polynomial (also an algebraic function) by another one.
The preceding list did not contain all algebraic functions since it
did not contain any functions in which roots were involved. So we get
additional examples if we include roots, such as the functions given by
the rules f(x) = √(x + 3), g(x) = √x, h(x) = ∛((x + 1)/(x − 1)).
3. Operations on Functions
3.1. Transformations of a Function. We have seen the basic mathemat-
ical functions and their graphs in the last section. In this section, we
will look at their transformations.
It is easy to see what happens to the graph of a function if we
increase or decrease each value of a function by a constant. Indeed, the
graph of the function g given by g(x) = f (x) + 5 for all x is simply the
graph of the function f translated by five units to the north. Similarly,
the graph of the function h given by h(x) = f (x) − 7 is the graph of f
translated by seven units to the south.
Horizontal translations are a little bit trickier. The reader is invited
to verify that if g is the function given by g(x) = f (x − 2), then the
graph of g is the graph of f translated by two units to the east, that is,
3.3. Exercises.
(1) Sketch the graph of f(x) = x^2, g(x) = (x − 3)^2, and h(x) =
(2x + 5)^2.
(2) Sketch the graph of f (x) = cos 2x, g(x) = sin(x − 2), and
h(x) = 3 tan x.
(3) Show examples for f and g when g ◦ f is defined for all real
numbers, but f ◦ g is not.
(4) Show examples when f ◦ g = g ◦ f .
graph of f , this means that the graph goes roughly from the northwest
to the southeast.
If we simply ask a computer or graphing calculator to plot the
graph of a function without specifying the interval [x_1, x_2] in which
the value of x can range, we may get an error message, or the computer
may simply substitute default values for x_1 and x_2. For example,
the software package Maple 13 uses the default values x_1 = −10 and
x_2 = 10. The interval [x_1, x_2] is often called the viewing window.
We have to be careful, however, since not all viewing windows are
appropriate for all functions, and choosing an inappropriate viewing
window may cause misleading results.
For functions like f(x) = x, g(x) = |x|, or h(x) = x^2 + 3, the viewing
window [−10, 10] is appropriate as the behavior of these functions
outside that window is similar to their behavior inside the window.
Now let f(x) = (x + 10)^2. In this case, using the viewing window
[−10, 10], we get the graph of an increasing function. That is misleading
since f is decreasing on the interval (−∞, −10]. So, in this case, a
viewing window that starts at a point x_1 < −10 is necessary.
This problem becomes more difficult if we are dealing with functions
that change from increasing to decreasing many times, perhaps in an
irregular fashion and perhaps far away from the origin. For this reason,
it is worth noting that if f is a polynomial function of degree n, then it
cannot change directions more than n − 1 times. If we found all n − 1
direction changes, then we can be sure that we did not miss any of
them. We will return to this topic in a later chapter, when we discuss
the derivative of a function.
The preceding example showed why selecting a viewing window
that is too small can be misleading. The next example shows why a
viewing window that is too large can also mislead us. Plot the graph
of the function g(x) = 4x^3 + 9x^2 + 6x + 1. Using the default viewing
window [−10, 10], or some window containing that one, many software
packages will show a graph that increases everywhere and disappears in
a small interval to the left of 0. This should raise our suspicion that the
program does not properly display the graph of g around 0. Indeed,
g is defined for all real numbers, so its graph should not disappear
anywhere. Taking a closer look, that is, changing the viewing window
to [−1, 1], we see a function that is actually decreasing between x = −1
and x = −1/2.
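The behavior of g on the small window can be checked by evaluating it at a few points. The sketch below is only a numerical illustration; the sample points are an arbitrary choice.

```python
def g(x: float) -> float:
    return 4 * x**3 + 9 * x**2 + 6 * x + 1


# Evenly spaced sample points between x = -1 and x = -1/2.
xs = [-1.0, -0.875, -0.75, -0.625, -0.5]
values = [g(x) for x in xs]

for x, y in zip(xs, values):
    print(x, y)

# For comparison: the size of g at the ends of the default window.
print(g(-10.0), g(10.0))
```

The sampled values fall from g(−1) = 0 to g(−1/2) = −0.25, a drop of only a quarter of a unit. On the window [−10, 10] the values of g run from −3159 to 4961, which suggests why such a small dip is easy to lose at that scale.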
Trigonometric functions, with their periodicity, are particularly
good examples to demonstrate what software packages can and cannot
do. The reader is encouraged to plot the graph of the functions sin x,
cos 2x, tan(x/4), and, finally, sin(1/x) and explain the obtained graphs.
In particular, the reader should try to explain why, for sin(1/x), the
choice of the viewing window is not important as long as it contains
x = 0.
5. Inverse Functions
The inverse f^{-1} of a function f : A → B “undoes” what f did.
That is, if f(x) = y, then f^{-1}(y) = x, so f sends x to y, while f^{-1}
sends y back to x. It goes without saying that this f^{-1} will only be
a function if f^{-1}(y) is unambiguous, that is, when there is only one
x ∈ A so that f(x) = y. In that case, and only in that case, it is clear
that f^{-1}(y) = x.
Let us now formalize these concepts.
Definition 1. A function f : A → B is called one-to-one if it sends
different elements into different elements, that is, if x ≠ x′ implies that
f(x) ≠ f(x′).
One-to-one functions are also called injective functions or injections.
Visually, no horizontal line can intersect the graph of a one-to-one
function more than once.
For instance, if A and B are both the set of real numbers, then
f(x) = x and g(x) = x^3 are both one-to-one, but h(x) = x^2 is not.
Definition 2. Let f be a one-to-one function with domain A and
range B. Then the inverse of f is the function f^{-1} : B → A given by
f^{-1}(y) = x if f(x) = y.
Example 2. Let A and B both be the set of all real numbers. Let
f : A → B be given by f(x) = 2x + 7. Then f^{-1}(y) = (y − 7)/2.
Solution: If f(x) = y, then y = 2x + 7, so y − 7 = 2x and
(y − 7)/2 = x. As x = f^{-1}(y), it follows that f^{-1}(y) = (y − 7)/2.
□
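The formula found in Example 2 can be sanity-checked by composing the two functions in both orders. The names f and f_inverse below are chosen for illustration, and the sample inputs are arbitrary.

```python
def f(x: float) -> float:
    return 2 * x + 7


def f_inverse(y: float) -> float:
    return (y - 7) / 2


for x in (-3.0, 0.0, 1.5, 10.0):
    # Applying f and then f_inverse should return the starting value.
    print(x, f(x), f_inverse(f(x)))

for y in (-1.0, 7.0, 12.0):
    # Applying f_inverse and then f should do the same.
    print(y, f_inverse(y), f(f_inverse(y)))
```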
The preceding example shows a general strategy for finding the inverse
of a function. Write the equation f(x) = y, with the appropriate
algebraic expression replacing f(x). Then solve for x. If there is more
than one solution, then f is not one-to-one, and so it has no inverse
function. If there is exactly one solution, then that expression is the
value of f^{-1}(y).
Example 3. If A is the set of positive real numbers, B is the set
of real numbers that are larger than 1, and f : A → B is given by
f(x) = √(x + 1), then f^{-1}(y) = y^2 − 1.
The last two rules simply express the fact that the functions f(x) =
a^x and f^{-1}(y) = log_a(y) are inverses of each other, so their composition
is an identity function.
If we know the logarithm of a number in a base and want to compute
it in another base, we can do so using the following theorem.
Theorem 1. For positive real numbers a, b, and x, with a ≠ 1 and
b ≠ 1, we have
\[
\log_a x = \frac{\log_b(x)}{\log_b(a)}.
\]
Proof. Start with the identity
\[
x = a^{\log_a x}.
\]
Now take the logarithm of base b of both sides to get
\[
\log_b x = \log_a x \cdot \log_b a.
\]
Now divide both sides by log_b a to get the identity of the theorem. □
Example 4. We can use Theorem 1 to compute log_16(256) from
log_2(256) as follows:
\[
\log_{16}(256) = \frac{\log_2(256)}{\log_2(16)} = \frac{8}{4} = 2.
\]
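The change-of-base identity is easy to try out with Python's standard math module. In the sketch below, log_base is a hypothetical helper name; it applies Theorem 1 with b = e, using math.log for the natural logarithm.

```python
import math


def log_base(a: float, x: float) -> float:
    """log_a(x), computed from natural logarithms via Theorem 1 with b = e."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1, and x > 0")
    return math.log(x) / math.log(a)


# Example 4, computed two ways: from base-2 logarithms and from natural ones.
print(math.log2(256) / math.log2(16))
print(log_base(16, 256))
```

Both printed values agree with 2 up to floating-point rounding.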
So if a calculator or computer can provide the logarithm of all
positive real numbers in one base, we can compute the logarithm of any
positive real number in any base. For this reason, many calculators and
computers are programmed to work primarily with logarithms of one
given base, namely of base e, where e ≈ 2.718 is an irrational number
that will be formally defined in Chapter 2.
The logarithm of base e is so important that it has its own name,
natural logarithm, and its own notation, ln. So ln x = log_e x.
5.2. Inverses of Trigonometric Functions. Basic trigonometric functions,
such as sin, cos, and tan, are very important in calculus, so it is no sur-
prise that their inverse functions are important as well. However, we
have to be precise when we define them since trigonometric functions
are not one-to-one. In fact, they are periodic, of period 2π or π, and
so they take every value in their range infinitely often.
In order to get around this difficulty, we will restrict our trigono-
metric functions to just a short interval, in which they are one-to-one,
and define their inverses based on that restriction.
For instance, consider sin as a function whose domain is [−π/2, π/2].
In that interval, sin is a one-to-one function (since it is increasing),
and its range is the interval [−1, 1]. So its inverse is the function
sin^{-1} : [−1, 1] → [−π/2, π/2]. That is, if y ∈ [−1, 1], then sin^{-1} y
is the (only) x ∈ [−π/2, π/2] for which sin x = y. For instance,
sin^{-1}(1/2) = π/6, while sin^{-1}(0) = 0 and sin^{-1}(√2/2) = π/4.
The inverses of the other trigonometric functions are defined similarly;
only the intervals to which we restrict the functions (in order to
make them one-to-one) change.
That is, cos^{-1} is the inverse function of the cos function that is
restricted to the interval [0, π]. So cos^{-1} is a function with domain
[−1, 1] and range [0, π]. Similarly, tan^{-1} is the inverse function of the
tan function that is restricted to the interval (−π/2, π/2). Its domain
is the set of all real numbers, and its range is the interval (−π/2, π/2).
The inverse functions of cot, sec, and csc, while not used often, can
also be defined analogously.
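Python's standard math module provides asin, acos, and atan, which return values in the restricted intervals described above. The following sketch illustrates the effect of the restriction; the specific inputs are arbitrary.

```python
import math

# sin takes the value 1/2 at both pi/6 and 5*pi/6 ...
print(math.sin(math.pi / 6), math.sin(5 * math.pi / 6))

# ... but the inverse sine returns only the angle in [-pi/2, pi/2].
principal = math.asin(math.sin(5 * math.pi / 6))
print(principal, math.pi / 6)

# Values mentioned in the text.
print(math.asin(0.5), math.asin(math.sqrt(2) / 2), math.pi / 4)

# acos lands in [0, pi]; atan stays strictly inside (-pi/2, pi/2).
print(math.acos(-1.0), math.atan(1e9))
```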
5.3. Exercises.
(1) Is there a function f defined on all positive real numbers for
which f^{-1} = f?
(2) If we are given log_a x, how can we compute log_{1/a}(x)?
(3) For which values of a is log_a an increasing function, and for
which values of a is it a decreasing function?
(4) What is the geometric connection between the graphs of f and
f^{-1}?
(5) Is it true that if g is the inverse function of the one-to-one
function f , then g is one-to-one?
velocity for the time period from t_1 to t_2, we simply compute the value
of the fraction
\[
\frac{s(t_2) - s(t_1)}{t_2 - t_1}.
\]
This fraction is precisely the slope of the line that intersects the graph
of the function s at points (t_1, s(t_1)) and (t_2, s(t_2)). If we choose t_1
and t_2 closer and closer together, then these points will get closer and
closer together as well. Finally, if we set t_1 = t_2, then we will not
immediately know the slope of the line that touches the graph of s at
the point (t_1, s(t_1)) since we will know only one point of this line,
not two. However, and this will be made more precise in the next section,
the slope we are looking for will be approximated by the sequence of
slopes of the lines that we got when we chose t_1 and t_2 closer and closer
together.
Finally, we point out that there is nothing magical about the func-
tion s(t) here. We could consider any function f : R → R, and ask
what the slope of the tangent line to this curve is at the point (x, f (x)).
6.3. Exercises.
(1) A car travels one hour at a speed of 60 miles per hour, then
two hours at a speed of 45 miles per hour. What is the average
speed of the car during this three-hour period?
(2) Consider the car of the previous exercise. What is its average
speed during the first two hours of its trip?
(3) I drove at 40 miles per hour for two hours. How fast do I have
to drive in my third hour if I want to reach an average speed
of 45 miles per hour for my three-hour drive?
(4) Consider the function f(x) = x^2. Can you find two points P
and Q on the graph of f such that the slope of the line PQ is
between 0 and 0.01?
(5) Consider the function g(x) = x^3. Let P = (1, 1). Can you find
a point Q on the graph of g such that the slope of the line PQ
is between 1 and 1.01?
CHAPTER 2
Limits and Derivatives
Example 5. Let
\[
g(x) =
\begin{cases}
1 & \text{if } 0 \le x, \\
0 & \text{if } x < 0.
\end{cases}
\]
7.4. Infinite Limits. In our definitions of limits in this section, the limit
L was always a real number. In this section, we extend those definitions
to the cases of infinite limits. If L = ∞, then the values of f have to
get arbitrarily close to ∞; that is, they have to get as large as we want.
This is the content of the following definition.
\[
\lim_{x \to a} f(x) = \infty.
\]
7.5. Exercises.
(1) Find lim_{x→3} (x^2 − 4x + 3)/(x − 3).
(2) Does lim_{x→0} 1/|x| exist?
(3) Give an example of a function f such that lim_{x→0−} f(x) = 0
and lim_{x→0+} f(x) = ∞.
(4) Does lim_{x→0} (1/x^3 + 1/x^2) exist?
(5) Give an example of a function f such that lim_{x→1−} f(x) = ∞,
lim_{x→1+} f(x) = −∞, and f(1) is a real number.
8. Limit Laws
8.1. Basic Limit Laws. If f and g are two functions and we know the
limit of each of them at a given point a, then we can easily compute
the limit at a of their sum, difference, product, constant multiple, and
quotient. The rules that provide this limit are given below, and they
are very similar to the ways in which the sum, difference, product,
constant multiple, and quotient of two functions are defined. Indeed,
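(I)
\[
\lim_{x \to a} (f + g)(x) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x),
\]
(II)
\[
\lim_{x \to a} (f - g)(x) = \lim_{x \to a} f(x) - \lim_{x \to a} g(x),
\]
(III)
\[
\lim_{x \to a} (f \cdot g)(x) = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x),
\]
(IV)
\[
\lim_{x \to a} (c \cdot f)(x) = c \cdot \lim_{x \to a} f(x),
\]
where c is a real number, and
(V)
\[
\lim_{x \to a} \frac{f}{g}(x) = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}
\]
if lim_{x→a} g(x) ≠ 0.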
It is not difficult to believe that these rules are valid. For instance,
if f(x) gets arbitrarily close to L as x approaches a and g(x) gets
arbitrarily close to L′ as x approaches a, then, as x approaches a, the
value of f(x) + g(x), that is, the value of (f + g)(x), will get arbitrarily
close to L + L′. This intuitive argument can be made formal using the
precise definition of limits.
Example 10. Let f(x) = |x| and let g(x) = x^2. Find the limits of
f + g, f − g, fg, 3f + 2g, and f/g at a = 2.
Solution: Based on the five limit laws given earlier, it makes sense to
first compute the limits of f and g at 2. The reader is invited to verify
that
\[
\lim_{x \to 2} f(x) = \lim_{x \to 2} |x| = \lim_{x \to 2} x = 2,
\]
and
\[
\lim_{x \to 2} g(x) = \lim_{x \to 2} x^2 = \lim_{x \to 2} x \cdot \lim_{x \to 2} x = 2 \cdot 2 = 4,
\]
where we used the fact that g(x) = x^2 = x · x, so law III can be applied
to compute the limit of g at 2.
Now it is simply a matter of basic algebra to compute the five limits
that we have been asked to find. Indeed, applying the five limit laws,
we get that
(I) lim_{x→2} (f + g)(x) = lim_{x→2} f(x) + lim_{x→2} g(x) = 2 + 4 = 6,
(II) lim_{x→2} (f − g)(x) = lim_{x→2} f(x) − lim_{x→2} g(x) = 2 − 4 = −2,
(III) lim_{x→2} (f · g)(x) = lim_{x→2} f(x) · lim_{x→2} g(x) = 2 · 4 = 8,
(IV) lim_{x→2} (3f + 2g)(x) = 3 lim_{x→2} f(x) + 2 lim_{x→2} g(x) = 3 · 2 +
2 · 4 = 14 (note that here we applied limit law IV to first f,
then to g, and then we applied law I to 3f and 2g), and
(V)
\[
\lim_{x \to 2} \frac{f}{g}(x) = \frac{\lim_{x \to 2} f(x)}{\lim_{x \to 2} g(x)} = \frac{2}{4} = \frac{1}{2}.
\]
□
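The five limits of Example 10 can be compared with function values taken very close to 2. This is a rough numerical sketch, not a proof: the step size 1e-6 and the dictionary name combinations are arbitrary choices made for illustration.

```python
def f(x: float) -> float:
    return abs(x)


def g(x: float) -> float:
    return x ** 2


# Each combination from Example 10, paired with the limit found there.
combinations = {
    "f + g": (lambda x: f(x) + g(x), 6.0),
    "f - g": (lambda x: f(x) - g(x), -2.0),
    "f * g": (lambda x: f(x) * g(x), 8.0),
    "3f + 2g": (lambda x: 3 * f(x) + 2 * g(x), 14.0),
    "f / g": (lambda x: f(x) / g(x), 0.5),
}

h = 1e-6  # distance from a = 2 at which the functions are sampled
for name, (func, limit) in combinations.items():
    # Sample just to the left and just to the right of 2.
    print(name, func(2 - h), func(2 + h), limit)
```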
8.2. Frequently Used Special Cases of Limit Laws. A few special cases
of limit laws I–V are used so frequently that it is worth mentioning
them separately. First, if we repeatedly multiply a function by itself,
we get a power of that function. Applying law III each time, we get
that for all positive integers n,
\[
\lim_{x \to a} (f(x))^n = \left( \lim_{x \to a} f(x) \right)^n. \tag{2.1}
\]
Note that we have essentially applied this rule in the special case
of n = 2 when we computed lim_{x→2} x^2 in Example 10.
The reader is invited to verify that the limits of the constant
function f (x) = c and the identity function f (x) = x are given by
lim_{x→a} c = c for all a and lim_{x→a} x = a. Formal proofs will be given in
the next section.
8.3. Other Useful Facts About Limits. In this section, we discuss a few
facts about limits that are often used to compute limits, but are slightly
different in nature from the limit laws we discussed so far.
First, let us recall that the definition of L = lim_{x→a} f(x) requires
that f(x) get arbitrarily close to L if x is sufficiently close to a but
not equal to a. That is, the value of f(a) does not have to satisfy any
requirements. In fact, we can change f(a) to anything we want, and
L = lim_{x→a} f(x) will not change. What matters is what happens at
points other than a. Hence, we can conclude that if f(x) = g(x) for all
points x ≠ a, then lim_{x→a} f(x) = lim_{x→a} g(x) as long as these limits
exist. For instance, let f(x) = (x^2 − 4)/(x − 2) for all real numbers
x ≠ 2 and let f(2) = 2010. Let g(x) = x + 2 for all real numbers. Then
f(x) = g(x) unless x = 2, and hence lim_{x→2} f(x) = lim_{x→2} g(x) = 4.
The statement that if f(x) = g(x) for all points x ≠ a, then
lim_{x→a} f(x) = lim_{x→a} g(x) as long as these limits exist can be
significantly strengthened. See Exercise 8.4(1) for a possible direction for
that.
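The example with f(2) = 2010 can be explored numerically. The sketch below evaluates f at points approaching 2 from the right; the sequence of points 2 + 10^(-k) is an arbitrary choice, and the output only suggests the limit rather than proving it.

```python
def f(x: float) -> float:
    if x == 2:
        return 2010.0                 # the value at 2 is set arbitrarily
    return (x ** 2 - 4) / (x - 2)


def g(x: float) -> float:
    return x + 2


for k in range(1, 7):
    x = 2 + 10.0 ** -k
    print(x, f(x), g(x))

print(f(2), g(2))                     # the two functions differ only here
```

The printed values of f approach 4 even though f(2) itself is 2010, which is consistent with the claim that the value at the point plays no role in the limit.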
Second, Equation (2.2) can be interpreted by saying that the limit
of a power function f(x) = x^n at any point a is simply the value of
f(a). Now note that polynomials are nothing else but sums of constant
multiples of power functions with nonnegative integer exponents.
Hence, using limit laws I and IV, we get the following theorem.
Theorem 2. Let p be a polynomial function. Then, for any real
number a, we have
\[
\lim_{x \to a} p(x) = p(a).
\]
Now recall that a rational function is just the ratio of two polyno-
mials. Hence, using limit law V, we get the following statement from
Theorem 2.
Corollary 1. Let R(x) be a rational function and let a be a real
number such that R(a) is defined. Then
\[
\lim_{x \to a} R(x) = R(a).
\]
8.4. Exercises.
(1) Let f (x) and g(x) be two functions that only differ for a
finite number of values of the variable x. Is it true that
lim_{x→a} f(x) = lim_{x→a} g(x) as long as these limits exist? Why
or why not?
(2) Find an example of two functions f and g such that f(x) <
g(x) for all real numbers x, but there exists a real number a
such that lim_{x→a} f(x) = lim_{x→a} g(x).
(3) Explain why lim_{x→a} g(x) exists if the conditions of Corollary
2 hold.
(4) Prove that lim_{x→0} cos(log x) does not exist.
(5) Prove that lim_{x→0} |x sin(√x)| = 0.
9. Continuous Functions
Intuitively speaking, a function is called continuous at a point x = a
if its graph in a neighborhood of x = a can be drawn without lifting
the pencil from the paper, that is, by a “continuous” line. The formal
definition of continuity is as follows.
holds.
Note that Definition 11 really requires three things. The limit of
f at a must exist, the function f must be defined at a so that f(a)
exists, and the value of f(a) must agree with the limit of f at a.
If all these conditions hold, then the behavior of f at a is very
similar to the behavior of f around a; in particular, the graph of f can
be drawn without lifting the pencil from the paper. If we had to lift the
pencil from the paper, that would mean that some kind of “gap” would
exist in the graph of f , so the requirements of Definition 11 would not
be satisfied.
If a function f : R → R is continuous at all a ∈ R, then it is called
continuous. If f is continuous at each point of the open interval (c, d),
then we say that f is continuous on (c, d). Finally, if you really want
a formal definition, a neighborhood of a is any set S that contains an
open interval (c, d) containing a.
9.0.1. The Precise Definition of Continuity. As the informal defini-
tion of continuity is very close to that of limits, it is not surprising that
their precise definitions are also similar.
Definition 12. Let f be defined in an open interval containing a.
We say that f is continuous at a if, for all ε > 0, there exists δ > 0
such that if |x − a| < δ, then |f(x) − f(a)| < ε.
9.1. Examples of Continuous Functions. Let us consider some of the
most frequently used continuous functions.
Example 12. Polynomial functions are continuous.
Solution: This is a direct consequence of Theorem 2, which we dis-
cussed in the last section. Theorem 2 stated that the limit of a polyno-
mial function at a is equal to the value of the polynomial at a, which
is precisely what the definition of continuity requires. □
9.2. Functions That Are Not Continuous. It is time to stop for a moment
and think about functions that are not continuous at a given point a.
There can be three reasons for this. First, it could be that f(a) is not
defined, for instance, when f is a rational function whose denominator
becomes 0 when x = a. Or it could be that g is defined at a, but
lim_{x→a} g(x) does not exist. An example of this is the function
defined by g(x) = 1 if x ≥ 0 and g(x) = 0 if x < 0. As we have seen
before, the limit of this function does not exist at a = 0, even though g(0)
is defined. So g is not continuous at 0. Finally, it could happen that h
is defined at a and the limit of h at a exists, but h(a) is not equal to
this limit. That happens, for example, if h(x) = (x + 3)/(x^2 − 9) if
|x| ≠ 3 and h(x) = 1 if |x| = 3. Let a = −3. Then
\[
h(a) = 1 \ne \lim_{x \to a} h(x) = -\frac{1}{6}.
\]
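The third kind of discontinuity can be seen numerically for the function h. In this sketch the sample points −3 ± 10^(-k) are an arbitrary choice; the printed values only suggest the limit −1/6 and are not a substitute for the algebraic argument.

```python
def h(x: float) -> float:
    if abs(x) == 3:
        return 1.0                        # value assigned at x = 3 and x = -3
    return (x + 3) / (x ** 2 - 9)


a = -3.0
for k in range(1, 7):
    step = 10.0 ** -k
    # Values just to the left and just to the right of a.
    print(step, h(a - step), h(a + step))

print(h(a), -1 / 6)                       # the value at a versus the limit
```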
The interested reader is invited to think about the following
example.
Excursion 1. The following function is not continuous anywhere.
Let f (x) = 1 if x is rational and let f (x) = 0 if x is irrational.
9.3. New Continuous Functions from Old. It follows from the limit laws
that several transformations preserve the continuity of functions.
Theorem 4. Let f and g be two functions that are continuous at a
and let c be a real number. Then all of the following are also continuous
functions at a:
(I) f + g,
(II) f − g,
(III) f · g,
(IV) cf , and
(V) f/g as long as g(a) ≠ 0.
the values of f (x) get arbitrarily close to L and stay arbitrarily close
to L if x is large enough. Here “x is large enough” means that x is
in a suitably selected neighborhood of ∞, in other words, in an open
interval (c, ∞). Recall that this is analogous to what we required in the
finite case. There we said that limx→a f (x) = L if f (x) got arbitrarily
close to L and stayed arbitrarily close to L once x was suitably close
to a, that is, when x was in a suitably selected neighborhood of a.
Example 18. Let f(x) = 1/x. Then
\[
\lim_{x \to \infty} f(x) = 0.
\]
10.2. Infinite Limits at Infinity. It can happen that the limit of a function
at ∞ is not a real number but rather ∞ or −∞.
\[
\lim_{x \to \infty} f(x) = \infty,
\]
if f (x) gets arbitrarily large and stays arbitrarily large if x gets suffi-
ciently large.
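\[
\begin{aligned}
\lim_{x \to \infty} \cdots &= \lim_{x \to \infty} \left( 1 + \frac{7}{x - 4} \right) \\
&= 1 + \lim_{x \to \infty} \frac{7}{x - 4} \\
&= 1 + 0 \\
&= 1.
\end{aligned}
\]
□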
We would like to point out other pitfalls when dealing with the
application of limit laws and infinite limits. The following expressions
are not defined:
(I) ∞ + (−∞)
(II) ∞ · 0 and −∞ · 0
(III) 1^∞ and 1^{−∞}
The following theorem is very useful when dealing with limits at ∞.
10.4. Exercises.
(1) Find lim_{x→∞} (x + 1)/(x^2 + 4).
(2) Find lim_{x→∞} (3x^2 + 4x + 1)/(x^2 + 4).
(3) Find lim_{x→∞} (x^3 + 2x)/(x^2 + 4).
(4) Find lim_{x→−∞} (3x^2 + 4x + 1)/(x^2 + 4).
(5) Let R(x) = p(x)/q(x) be a rational function. Explain how
limx→∞ R(x) depends on p(x) and q(x).
11. Derivatives
11.1. Tangent Lines. Let us consider a function, such as f(x) = x^2, and
its graph. Let us choose a point on the graph, say the point P = (3, 9).
Now let us look for the slope of the tangent line to the graph at that
point.
That is, consider a sequence of points P_1, P_2, . . . that are all on the
graph of f and are closer and closer to P. For each of these points,
draw the line P_iP. The slope of these lines will approach a certain
slope, and so the lines P_iP will approach a certain line. That line is
called the tangent line of f at P.
Definition 20. Let f be a function and let P = (a, f (a)) be a
point on the graph of f . Then the tangent line to f at P is the line
that contains P and has slope
\[
\lim_{x \to a} \frac{f(x) - f(a)}{x - a}, \tag{2.5}
\]
provided that this limit exists.
Note that in the preceding definition, (f (x)−f (a))/(x−a) is simply
the slope of the line connecting the points P and (x, f (x)).
Example 23. In our running example, that is, when f(x) = x^2 and
P = (3, 9), the tangent line is the line that goes through P and has slope
\[
\lim_{x \to 3} \frac{f(x) - f(3)}{x - 3} = \lim_{x \to 3} \frac{x^2 - 9}{x - 3} = \lim_{x \to 3} (x + 3) = 6.
\]
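The slope found in Example 23 can be approximated by slopes of secant lines through P and nearby points of the graph. In this sketch, secant_slope is a hypothetical helper name and the points 3 + 10^(-k) are an arbitrary choice.

```python
def f(x: float) -> float:
    return x ** 2


def secant_slope(func, a: float, x: float) -> float:
    """Slope of the line through (a, func(a)) and (x, func(x)), for x != a."""
    return (func(x) - func(a)) / (x - a)


# Slopes of the lines P_iP as the points P_i approach P = (3, 9).
slopes = [secant_slope(f, 3.0, 3.0 + 10.0 ** -k) for k in range(1, 7)]
for k, slope in enumerate(slopes, start=1):
    print(k, slope)
```

Each secant slope here equals x + 3 up to rounding, so the printed values move toward 6 as x moves toward 3.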
11.3. The Derivative of a Function. The fact that the last two concepts,
the tangent line and the instantaneous velocity, led to very similar
definitions suggests that there is a very general principle at work and
we have seen two special cases of that principle.
This is indeed the case.
Definition 22. Let f be a function. The derivative of f at a is
the limit
\[
f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}
\]
if this limit exists and is finite.
So, in particular, f′(a) is the slope of the tangent line of f at a
(unless that tangent line is vertical). Furthermore, the instantaneous
velocity at time a is the derivative of the distance covered (as a function
of the time t needed to cover that distance) at t = a.
In other words, the derivative is a common generalization of the
concepts of tangent line and instantaneous velocity.
11.4. Exercises.
(1) Find the slope of the tangent line to the curve f(x) = 3x^2 − 7
at the point (2, 5).
(2) Find the slope of the tangent line to the curve f(x) = x^3 at x = 0.
(3) Show an example of a curve that does not have a tangent line
at some point a because the limit defined in (2.5) does not
exist.
(4) The distance covered by a car in a certain time period is described
by the function
\[
f(t) = tm + \frac{t^2 (b - m)}{2},
\]
\[
\begin{aligned}
&= \lim_{x \to a} \frac{(x - a)(x^2 + xa + a^2)}{x - a} \\
&= \lim_{x \to a} \left( x^2 + xa + a^2 \right) \\
&= 3a^2.
\end{aligned}
\]
□
The functions we have considered so far had only one independent
variable, usually the variable x. The dependent variable was usually
denoted by y, so y = f(x) held. So it was always clear that the derivative
was taken with respect to x. However, there are circumstances
when this is not so clear, usually when f depends on more than one
variable. Therefore, there are additional ways to denote the derivative
of f, such as
• dy/dx,
• df/dx,
• (d/dx) f(x),
• D_x f(x), or
• Df(x).
since we can apply the limit law for products on the right-hand side to
get that
Adding f (a) to both the far left and far right sides, we get that
$$f(a) = \lim_{x\to a} f(x),$$
Rules of Differentiation
$$f'(a) = na^{n-1}.$$
42 3. RULES OF DIFFERENTIATION
Proof. We have
$$f'(a) = \lim_{x\to a}\frac{f(x)-f(a)}{x-a} = \lim_{x\to a}\frac{x^n-a^n}{x-a}$$
13.3. Exercises.
(1) Prove that if $f(x) = x^{1/2}$ and $a \neq 0$, then $f'(a) = \frac{1}{2\sqrt{a}}$.
(2) Prove that if $f(x) = 1/x$ and $a \neq 0$, then $f'(a) = -\frac{1}{a^2}$.
(3) Prove Theorem 12.
(4) Prove Theorem 13.
(5) Let $f(x) = 3x^3 - 4x^2 + x - 2 + 4e^x$. Compute $f'(x)$.
14.3. Exercises.
(1) Let $h(x) = e^x x^3$. Find $h'(x)$ and $h''(x)$.
(2) Find a rule to compute $(f^2)'(x)$.
(3) Find a rule to compute $(1/f)'(x)$.
(4) Use the result of the previous exercise to prove a formula for
$g'(x)$ if $g(x) = x^n$ for a negative integer n.
(5) Let $g(x) = e^{-x}$. Find $g'(x)$.
(6) Let $h(x) = x/e^x$. Find $h'(x)$.
15. DERIVATIVES OF TRIGONOMETRIC FUNCTIONS 47
circle bordered by the lines AO, BO, and the arc AB is $\pi\cdot\alpha/(2\pi) = \alpha/2$.
So the ratio of the two areas is
$$\frac{(\sin\alpha)/2}{\alpha/2} = \frac{\sin\alpha}{\alpha}.$$
On the other hand, as n gets larger and larger, α gets smaller and
smaller, while the area of the n-gon gets closer and closer to the area
of the circle. Hence, their ratio, $\sin\alpha/\alpha$, will get arbitrarily close to 1
and stay arbitrarily close to 1. □
Lemma 2. The equality
$$\lim_{h\to 0}\frac{\cos h - 1}{h} = 0 \tag{3.4}$$
holds.
Proof. We will manipulate the expression $(\cos h - 1)/h$ so that
we can use the result of Lemma 1. First, we multiply both the
numerator and the denominator by $\cos h + 1$ to get
$$\frac{\cos h - 1}{h} = \frac{\cos^2 h - 1}{h(1+\cos h)} = \frac{-\sin^2 h}{h(1+\cos h)}.$$
Therefore, we have
$$\lim_{h\to 0}\frac{\cos h - 1}{h} = -\lim_{h\to 0}\frac{\sin^2 h}{h(1+\cos h)}$$
$$= -\lim_{h\to 0}\frac{\sin h}{h}\cdot\frac{\sin h}{1+\cos h}$$
$$= -\lim_{h\to 0}\frac{\sin h}{h}\cdot\lim_{h\to 0}\frac{\sin h}{1+\cos h}$$
$$= (-1)\cdot 0 = 0.$$
□
We can now finish the proof of Theorem 16. At the end of the first
displayed chain of equations in that proof, we saw that
$$(\sin x)' = \sin x \lim_{h\to 0}\frac{\cos h - 1}{h} + \cos x \lim_{h\to 0}\frac{\sin h}{h}.$$
The previous two lemmas showed that, on the right-hand side, the first
limit is 0 and the second limit is 1, so $(\sin x)' = \cos x$ as claimed. □
The following theorem can be proved by very similar methods.
Theorem 17. The equality $(\cos x)' = -\sin x$ holds.
16. THE CHAIN RULE 49
Solution: Let f (x) = sin x and let g(x) = 3x. Then h(x) = f (g(x)),
so, by the chain rule, we have
$$\lim_{r\to 0}\left(\frac{g(x+r)-g(x)}{r} - g'(x)\right) = 0. \tag{3.5}$$
Set
$$t = \frac{g(x+r)-g(x)}{r} - g'(x).$$
Note that t depends on r, and as r approaches 0, t approaches 0.
Similarly, let y = g(x). As f is differentiable at y, we have
$$\lim_{s\to 0}\left(\frac{f(y+s)-f(y)}{s} - f'(y)\right) = 0. \tag{3.6}$$
Set
$$u = \frac{f(y+s)-f(y)}{s} - f'(y).$$
Again, note that u depends on s and that u approaches 0 as s
approaches 0.
Now we undertake a series of manipulations of the preceding two
equations. Our goal is to express
$$(f(g(x)))' = \lim_{r\to 0}\frac{f(g(x+r)) - f(g(x))}{r}$$
in terms of $f'(g(x))$ and $g'(x)$.
16.4. Exercises.
(1) Let $h(x) = (x^2 + 1)^5$. Find $h'(x)$.
(2) Let $h(x) = \sin(x^2)$. Find $h'(x)$.
(3) Let $h(x) = \tan 3x$. Find $h'(x)$.
(4) Let $h(x) = e^{\sin x}$. Find $h'(x)$.
2
(5) Let h(x) = ex sin x . Find h (x).
17. IMPLICIT DIFFERENTIATION 53
x and y. That is, if (x, y) is on the curve, then (y, x) is also on the
curve.
17.3. Exercises.
(1) Let C be the circle given by the equation $x^2 + y^2 = 169$. Use
implicit differentiation to find the slope of the tangent line to
C at the point (5, 12).
(2) Prove that $(\sin^{-1} x)' = \frac{1}{\sqrt{1-x^2}}$.
(3) Prove that $(\cos^{-1} x)' = -\frac{1}{\sqrt{1-x^2}}$.
However, $e^y = x$ by definition, so
$$\frac{dy}{dx} = \frac{1}{x}$$
as claimed. □
It is now a breeze to determine the derivative of logarithmic
functions of any base.
Corollary 3. Let $a \neq 1$ be a fixed positive real number. Then
$$(\log_a x)' = \frac{1}{x\ln a}.$$
Proof. Note that
$$x = e^{\ln\left(a^{\log_a x}\right)} = e^{(\ln a)(\log_a x)}.$$
$$\frac{dy}{dx} = y\left(\frac{3}{x} + \frac{1}{2(x+1)} - \frac{1}{2(x-2)}\right) = \frac{x^3\sqrt{x+1}}{\sqrt{x-2}}\cdot\left(\frac{3}{x} + \frac{1}{2(x+1)} - \frac{1}{2(x-2)}\right).$$
□
18.4. Power Functions Revisited. Recall that in an earlier section, we
proved that if n is a fixed positive integer, then $(x^n)' = nx^{n-1}$. We
stated that this was the case for all real numbers n, not just positive
integers, but we have not proved that claim. Now we have the tools,
namely logarithmic differentiation, to prove it.
Theorem 22. Let n be any real number. Then we have
$$\frac{d}{dx}x^n = nx^{n-1}.$$
Proof. Set $y = x^n$. Let us assume for the sake of simplicity that
x is positive. Taking logarithms, we have
$$\ln y = n\ln x.$$
Differentiating both sides with respect to x, we get
$$\frac{dy}{dx}\cdot\frac{1}{y} = \frac{n}{x}.$$
Solving for dy/dx yields
$$\frac{dy}{dx} = \frac{ny}{x} = \frac{nx^n}{x} = nx^{n-1}$$
as claimed. □
18.5. The Number e Revisited. Recall that we have defined the
number e, the base of the natural logarithm, as the number for which
$\lim_{h\to 0}(e^h - 1)/h = 1$. Our new knowledge lets us express e more
directly, as a limit.
Note that if f(x) = ln x, then $f'(x) = 1/x$, so $f'(1) = 1$. By the
definition of derivatives, this means that
$$\lim_{h\to 0}\frac{\ln(1+h) - \ln 1}{h} = 1.$$
Observing that ln 1 = 0 and using the power rule of logarithms, we get
$$\lim_{h\to 0}\ln(1+h)^{1/h} = 1,$$
or, applying the exponential function $e^z$ to both sides, we have
$$\lim_{h\to 0}(1+h)^{1/h} = e.$$
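The limit above can be watched numerically. The following is a minimal sketch, not part of the original text: the helper name `e_estimate` and the sample values of h are illustrative assumptions. It evaluates $(1+h)^{1/h}$ for shrinking h and compares the result with the library constant `math.e`.

```python
import math


def e_estimate(h: float) -> float:
    """Return (1 + h)**(1/h), which tends to e as h tends to 0."""
    return (1.0 + h) ** (1.0 / h)


# Sample values of h are arbitrary; smaller h gives a closer estimate.
for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5):
    approx = e_estimate(h)
    print(f"h = {h:<8g} (1+h)^(1/h) = {approx:.8f}  error = {math.e - approx:.2e}")
```

For very small h (well below the values used here), floating-point rounding in 1 + h starts to dominate, so the sketch deliberately stops at $h = 10^{-5}$.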
The marginal cost function $C'(x)$ describes how the cost function
changes. In that, $C'(x)$ and M(n) are similar. There is one important
difference. As we know, the derivative $C'(x)$ is given by
$$\lim_{\Delta x\to 0}\frac{C(x+\Delta x) - C(x)}{\Delta x}. \tag{3.17}$$
However, it could well be that the smallest meaningful positive value
of Δx is 1, in case the products are such that fractional units do not
make sense (e.g., automobiles). In that case, Δx → 0 is impossible
in its precise mathematical meaning; the closest that Δx can get to
0 is when Δx = 1. In that case, however, the expression after the
limit symbol in (3.17) simplifies to C(x + 1) − C(x), justifying the
approximation
$$M(x) = C(x+1) - C(x) \approx C'(x). \tag{3.18}$$
Example 35. The cost function of a bottle of a new medication is
given by $C(x) = 10^6 + 20x + 0.001x^2 + 0.000001x^3$. Find the approximate
cost of producing the 101st and the 1001st bottles.
Solution: By the preceding discussion, we need to compute the
function $C'(x)$. By the rules of differentiating a polynomial function, we
get $C'(x) = 0.000003x^2 + 0.002x + 20$. So the 101st bottle costs
$0.000003\cdot 100^2 + 0.002\cdot 100 + 20 = 20.23$ dollars to produce, while the
1001st bottle costs $0.000003\cdot 1000^2 + 0.002\cdot 1000 + 20 = 25$ dollars to
produce. □
It is important to note that the result of the previous example, that
is, the fact that it costs more to produce the 1001st bottle than the
101st bottle, does not mean that the more bottles are produced, the
more expensive it is to produce the average bottle. This is because
the cost of producing the first bottle is astronomical, since $C(1) > 10^6$.
Compared to that, the cost of each of the first thousand, or even, first
ten thousand bottles is very small, so the production of each of them
will bring the cost of producing the average bottle down. (The cost
of producing the average bottle if n bottles are produced is of course
C(n)/n.)
In the exercises, you are asked to compare these results to the results
obtained by using the formula C(n + 1) − C(n).
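As a hedged illustration of this comparison, the sketch below evaluates the derivative-based estimate, the exact difference C(n + 1) − C(n), and the average cost C(n)/n for the cost function of Example 35. The function names are assumptions introduced here for illustration only.

```python
def cost(x: float) -> float:
    """Cost function C(x) of Example 35."""
    return 1e6 + 20 * x + 0.001 * x**2 + 0.000001 * x**3


def marginal_cost(x: float) -> float:
    """Derivative C'(x), obtained term by term from the polynomial."""
    return 0.000003 * x**2 + 0.002 * x + 20


def next_unit_cost(n: int) -> float:
    """Exact cost of bottle n + 1, that is, C(n + 1) - C(n)."""
    return cost(n + 1) - cost(n)


def average_cost(n: int) -> float:
    """Cost of the average bottle when n bottles are produced."""
    return cost(n) / n


for n in (100, 1000):
    print(f"n = {n}: C'(n) = {marginal_cost(n):.6f}, "
          f"C(n+1) - C(n) = {next_unit_cost(n):.6f}, "
          f"C(n)/n = {average_cost(n):.2f}")
```

The two estimates of the next bottle's cost differ only slightly, while the average cost is dominated by the large fixed term and keeps falling over this range.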
19.3. Exercises.
(1) Consider the particle of Example 34. After 6 seconds, how far
from its starting point is that particle? In what direction?
(2) Consider the particle of the previous exercise. Are there any
moments when the particle is not moving?
20. RELATED RATES 61
used English units of measurement while the agency’s team used the
more conventional metric system for a key spacecraft operation.
words, one has to answer the question: How are these quan-
tities measured? The orientation of the laser beam can be
described by the angle ϕ between the perpendicular to the
wall and the laser beam. The position of the bright spot may
be set by the distance y traveled by it from the point on the
wall when ϕ = 0, that is, when the laser beam is perpendic-
ular to the wall. If the pointer rotates, the angle becomes a
function of time, ϕ = ϕ(t), and so does the position of the
bright spot, y = y(t). Thus, the question is about the relation
between the rates $y'(t) = v$ (the speed at which the bright spot
travels) and $\varphi'(t) = \omega$ (the rate at which the pointer rotates).
(II) The next step is to find a function that determines the relation
between the quantities of interest, that is, between the distance
y and the angle ϕ: y = f (ϕ). It is clear that D and y are
related as the catheti of the right triangle whose hypotenuse
is the laser beam: y = D tan ϕ = f (ϕ).
(III) Once the relation between the quantities of interest has been
established, the relation between their rates can be found.
Since $(\tan\varphi)' = 1/\cos^2\varphi$, Equation (3.20) yields
$$y = D\tan\varphi \;\to\; y' = \frac{D}{\cos^2\varphi}\,\varphi'. \tag{3.21}$$
The first question is answered by setting $\varphi' = \omega = \pi$ rad/s
and D = 1 m; then $y' = v$ is measured in meters per second.
(IV) Note that the rate $y' = v$ is not constant even if the rate $\varphi' = \omega$
is constant. To answer the second question, one has to find
the value of ϕ when v = 4π m/s, D = 1 m, and ω = π rad/s.
It follows from Equation (3.21) that
$$\cos^2\varphi = \frac{D\omega}{v} = \frac{1}{4} \;\to\; \varphi = \frac{\pi}{3};$$
that is, the bright spot moves at the speed 4π m/s when the
laser beam makes 60° with the perpendicular to the wall. □
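The relation (3.21) between the two rates can be checked numerically. This is a minimal sketch under the values used in the solution (D = 1 m, ω = π rad/s); the function names `spot_speed` and `angle_for_speed` are illustrative assumptions, not notation from the text.

```python
import math


def spot_speed(D: float, omega: float, phi: float) -> float:
    """Speed of the bright spot, v = D * omega / cos^2(phi), from Equation (3.21)."""
    return D * omega / math.cos(phi) ** 2


def angle_for_speed(D: float, omega: float, v: float) -> float:
    """Angle phi in [0, pi/2) at which the spot moves at speed v."""
    cos_squared = D * omega / v
    if not 0.0 < cos_squared <= 1.0:
        raise ValueError("no angle gives this speed (v must be at least D * omega)")
    return math.acos(math.sqrt(cos_squared))


D, omega = 1.0, math.pi  # 1 m from the wall, pi rad/s
print("speed at phi = 0:", spot_speed(D, omega, 0.0), "m/s")
phi = angle_for_speed(D, omega, 4 * math.pi)
print("angle for v = 4*pi m/s:", math.degrees(phi), "degrees")
```

Evaluating `spot_speed` at angles approaching π/2 shows the unbounded growth of the rate as the beam becomes nearly parallel to the wall.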
20.4. Can Anything Travel Faster Than Light? The solution (3.21) has
an interesting feature. When ϕ approaches 90°, that is, when the laser beam
is getting closer to being parallel to the wall, the cosine, cos ϕ, tends to
0 in Equation (3.21), and hence the rate $y' = v$ grows unboundedly. It
seems that, with merely a laser pointer, a superluminal object can
be created in a lecture hall! Let us investigate this. The speed of light is
c ≈ 300,000 km/s ≈ 186,000 mi/sec. The light can make a trip around
20.5. Related Problem. The next time you watch a Florida sunset, look
at your shadow. Does there exist a position of the Sun above the
horizon at which your shadow extends faster than the speed of light?
20.6. More Than Two Related Rates. There are situations when several
quantities are related among themselves. If these quantities become
functions of a variable t, then their rates are linearly related. A proof of
this statement is given in a more advanced course, where the functions
of several variables are studied. However, the basic idea of finding
relations between the rates has not changed: They are obtained by
21. LINEAR APPROXIMATIONS AND DIFFERENTIALS 65
Solution:
(I) Since $f'(x) = (\sin x)' = \cos x$, $f'(0) = 1$, and f(0) = 0, the
linearization is L(x) = x.
(II) In Theorem 23, let a = −δ and b = δ. Next, one has to
find M. The simplest way to do this is to take the maximal
value of $|f''(x)|$ in the interval |x| ≤ δ. Note that δ must satisfy
δ < π/2 because L(π/2) − sin(π/2) = π/2 − 1 exceeds
the given error ε. So sin x is monotonic in |x| ≤ δ, and hence
$|(\sin x)''| = |\sin x| \le \sin\delta = M$ for all |x| ≤ δ. By Theorem 23,
The converse problem is simpler: Find an upper bound for the error
of the linear approximation of sin x at x = 0 in the interval |x| ≤ 0.2.
One has $|\sin x - x| \le \varepsilon = \frac{1}{2}M\delta^2 = 0.5\cdot\sin(0.2)\cdot(0.2)^2 \approx 3.9734\cdot 10^{-3}$.
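The bound just computed can be compared with the error actually observed on the interval. The sketch below is illustrative only: the function names and the sampling grid are assumptions. It evaluates $\frac{1}{2}\sin(\delta)\,\delta^2$ and the largest sampled value of |sin x − x| for δ = 0.2.

```python
import math


def error_bound(delta: float) -> float:
    """Upper bound (1/2) * M * delta^2 with M = sin(delta)."""
    return 0.5 * math.sin(delta) * delta**2


def max_sampled_error(delta: float, samples: int = 10001) -> float:
    """Largest value of |sin x - x| on an even grid over [-delta, delta]."""
    grid = (-delta + 2.0 * delta * k / (samples - 1) for k in range(samples))
    return max(abs(math.sin(x) - x) for x in grid)


delta = 0.2
print("bound from the theorem:", error_bound(delta))
print("largest sampled error: ", max_sampled_error(delta))
```

The sampled error stays below the bound, as it should; the bound is not sharp because M overestimates |sin x| away from the endpoints.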
21.3. Differential. For a real variable x, the differential dx is defined as
an increment of x. It can be given the value of any real number inde-
pendently of the value of x; that is, dx is considered as an independent
variable. So, with every real variable, one can associate another real
variable, called the differential. If two real variables are related, the
following rule postulates the relation between their differentials.
Definition 27. Let two variables y and x be related as y = f (x),
where f is a differentiable function. The differential dy = df (x) is
defined by the linear transformation of dx:
$$dy = df(x) = f'(x)\,dx. \tag{3.24}$$
Note that the variables x and dx on the right-hand side are independent
variables. Equation (3.24) states that, if the variables y and x are
related, then the differential dy is no longer an independent variable
and is determined by x and dx; specifically, dy depends linearly on dx.
21.4. Geometrical Significance of the Differential. Put dx = Δx, where
Δx is a real number. Consider an increment of the variable y between
x + Δx and x: Δy = f(x + Δx) − f(x). For any x, the differential
of a function $df(x) = dy = f'(x)\,\Delta x$ does not generally coincide with
its increment Δy. For example, put $f(x) = x^2$, x = 1, Δx = 0.2; then
$\Delta y = (1 + 0.2)^2 - 1 = 0.44$, whereas $dy = f'(1)\,\Delta x = 2\cdot 0.2 = 0.4$.
Since the derivative $f'(x)$ determines the slope of the tangent line to
the graph y = f(x), the differential is the increment of y along the
tangent line through the point (x, f(x)) in the interval [x, x + Δx].
Thus, $dy \neq \Delta y$ in general because the tangent line does not generally coincide
with the graph.
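The numbers in this example are easy to reproduce. The following sketch (helper names are assumptions made for illustration) computes the increment Δy and the differential dy for $f(x) = x^2$ at x = 1 and several values of Δx.

```python
def increment(f, x: float, dx: float) -> float:
    """Exact change of f between x and x + dx."""
    return f(x + dx) - f(x)


def differential(f_prime, x: float, dx: float) -> float:
    """Change along the tangent line at x: f'(x) * dx."""
    return f_prime(x) * dx


def square(x: float) -> float:
    return x * x


def square_prime(x: float) -> float:
    return 2.0 * x


for dx in (0.2, 0.02, 0.002):
    delta_y = increment(square, 1.0, dx)
    dy = differential(square_prime, 1.0, dx)
    print(f"dx = {dx}: delta_y = {delta_y:.6f}, dy = {dy:.6f}, "
          f"(delta_y - dy)/dx = {(delta_y - dy) / dx:.6f}")
```

For this particular function the gap Δy − dy equals $(\Delta x)^2$, so the printed ratio (Δy − dy)/Δx shrinks in proportion to Δx.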
An intuitive understanding of the differential stems from its
geometrical interpretation. Let Δx tend to 0. The ratio
$$\frac{\Delta y - dy}{\Delta x} = \frac{\Delta y}{\Delta x} - f'(x)$$
tends to 0 by the existence of $f'(x)$. This means that the difference
Δy − dy must go to 0 faster than Δx. An increment Δx is said to
be infinitesimally small if $(\Delta x)^n$, n > 1, can always be neglected. So
Applications of Differentiation
72 4. APPLICATIONS OF DIFFERENTIATION
One of the lessons that can be learned from this example is that one
can think of a relative minimum (maximum) as an absolute minimum
(maximum) when f is restricted to a sufficiently small subset in its
domain. This observation is accurately stated by the following theorem.
Theorem 24 (The Extreme Value Theorem). If f is a continuous
function on a closed interval [a, b], then f attains its absolute maximum
and minimum values in [a, b]; that is, there exist c1 and c2 in [a, b] such
that f (c1 ) ≤ f (x) ≤ f (c2 ) for all x in [a, b].
The continuity hypothesis is essential. In fact, the continuity of
$f(x) = x^3 - x$ was implicitly used in Example 41 to establish the
existence of its relative maximum and minimum! The following example
22. MINIMUM AND MAXIMUM VALUES 73
(IV) A function defined on a closed interval [a, b] can have its ab-
solute maximum or minimum at the endpoints. When finding
the absolute maximum and minimum values, the values of f
at the critical points must be compared with f (a) and f (b).
The largest (smallest) of them is the absolute maximum (min-
imum) value. 2
Example 42. If a stone is thrown at a speed $v_0$ m/s and an angle θ
with the horizontal line, then its trajectory is a parabola:
$$y = x\tan\theta - \frac{g\,x^2}{2v_0^2\cos^2\theta}, \tag{4.4}$$
where y is the stone height (vertical position), x is the horizontal
position (all the positions are in meters), and g = 9.8 m/s² is a constant
universal for all objects near the surface of the Earth (the free-fall
acceleration). This is a consequence of Newton's second law. At what
angle should one throw a stone to reach the maximal range at a given
speed $v_0$?
Solution:
(I) The range as a function of the angle θ has to be found first.
The stone lands when its height y vanishes. This happens at
x = 0 (naturally, this is where the stone was thrown) and at
x = L(θ), where
$$L(\theta) = \frac{2v_0^2}{g}\tan\theta\cos^2\theta = \frac{2v_0^2}{g}\sin\theta\cos\theta = \frac{v_0^2}{g}\sin(2\theta).$$
(II) The range L(θ) is a differentiable function of θ, so the values
of θ at which L attains its extreme values may be found from
the equation
$$L'(\theta) = 0 \;\to\; \frac{2v_0^2}{g}\cos(2\theta) = 0 \;\to\; \cos(2\theta) = 0.$$
This equation has countably many solutions 2θ = π/2 + πn,
where n is any integer. But in the interval of the physical
values of θ ∈ [0, π/2], it has only one solution θ = π/4. Since
sin(2π/4) = 1 (the absolute maximum of the sine), L attains
its maximal value at θ = π/4; that is, the range is maximal,
$L_{\max} = v_0^2/g$, when a stone is thrown at 45°. □
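The conclusion can be confirmed by brute force. The sketch below scans angles in [0, π/2] and picks the one with the largest range $L(\theta) = v_0^2\sin(2\theta)/g$; the initial speed of 10 m/s and the grid size are arbitrary assumptions made for illustration.

```python
import math

G = 9.8  # free-fall acceleration in m/s^2, as in the example


def projectile_range(v0: float, theta: float) -> float:
    """Range L(theta) = v0^2 * sin(2*theta) / g on level ground, no air resistance."""
    return v0**2 * math.sin(2.0 * theta) / G


def best_angle(v0: float, steps: int = 9000) -> float:
    """Angle in [0, pi/2] with the largest range, found on an even grid."""
    angles = ((math.pi / 2) * k / steps for k in range(steps + 1))
    return max(angles, key=lambda theta: projectile_range(v0, theta))


v0 = 10.0  # assumed initial speed in m/s
theta_star = best_angle(v0)
print("best angle in degrees:", math.degrees(theta_star))
print("maximal range in meters:", projectile_range(v0, theta_star))
print("v0^2 / g in meters:", v0**2 / G)
```

The grid search is only a check; the calculus argument above is what guarantees that 45° is optimal for every initial speed.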
bit from the parabola (due to friction with the air). So the optimal angle
would deviate a bit from π/4. The deviation would also depend on the
mass and the initial speed. The range optimization problem would be
more involved and would require the theory of differential equations.
It should also be noted that the angle at which the maximal range is
attained depends on the initial height at which the stone is thrown.
So, the angle would be different from 45◦ when the stone is thrown, for
example, from a cliff.
Solution:
(I) Let $f(x) = x^5 + x^3 + x - 1$. Evidently, f(−1) = −4 < 0 and
f(1) = 2 > 0. By continuity, f has to take all intermediate
values between −4 and 2 (the intermediate value theorem). So
f has at least one root in (−1, 1).
(II) Suppose it has two roots a and b, that is, f(a) = f(b) = 0.
Then, by Rolle's theorem, $f'(x)$ has to vanish somewhere in
(a, b). But this is not possible because $f'(x) = 5x^4 + 3x^2 + 1 > 0$
for any x. Thus, f has only one real root. □
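The two steps of this argument can be mirrored numerically. The sketch below assumes the polynomial is $x^5 + x^3 + x - 1$, the form consistent with the stated values f(−1) = −4, f(1) = 2 and with the derivative $5x^4 + 3x^2 + 1$; the bisection helper is an illustrative assumption, not a method from the text.

```python
def f(x: float) -> float:
    """Assumed polynomial x^5 + x^3 + x - 1 (matches f(-1) = -4 and f(1) = 2)."""
    return x**5 + x**3 + x - 1


def f_prime(x: float) -> float:
    """Derivative 5x^4 + 3x^2 + 1, positive for every real x."""
    return 5 * x**4 + 3 * x**2 + 1


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Locate a root of func in [lo, hi], given a sign change at the endpoints."""
    if func(lo) * func(hi) > 0:
        raise ValueError("func must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


root = bisect(f, -1.0, 1.0)
print("root in (-1, 1):", root, " f(root) =", f(root))
print("smallest sampled f'(x):", min(f_prime(k / 10) for k in range(-50, 51)))
```

Bisection relies on exactly the sign change used in step (I), and the positive sampled derivative illustrates (but does not prove) the monotonicity used in step (II).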
Theorem 27 (The Mean Value Theorem). Let f be a function that
satisfies the following hypotheses:
(I) f is continuous on the closed interval [a, b].
(II) f is differentiable on the open interval (a, b).
Then there is a number c ∈ (a, b) such that
$$f'(c) = \frac{f(b)-f(a)}{b-a} \quad\text{or}\quad f(b) - f(a) = f'(c)(b-a). \tag{4.5}$$
The geometrical interpretation of the theorem is simple. Consider
the line through the points (a, f(a)) and (b, f(b)). Its slope is
(f(b) − f(a))/(b − a). The theorem asserts the existence of a point where the
graph y = f(x) has a tangent line with the same slope (cf. Equation
(4.5)), as $f'(c)$ is the slope of the tangent line at x = c. Let us turn
to a formal proof.
Proof of Theorem 27.
(I) Consider the line through the points (a, f(a)) and (b, f(b)). Its
equation is
$$y = L(x) = f(a) + \frac{f(b)-f(a)}{b-a}(x-a), \qquad L(a) = f(a),\quad L(b) = f(b). \tag{4.6}$$
Next, consider the function
$$h(x) = f(x) - L(x) = f(x) - f(a) - \frac{f(b)-f(a)}{b-a}(x-a). \tag{4.7}$$
Its values determine the deviation of the graph y = f(x) from
the secant line y = L(x) on the closed interval [a, b].
(II) The function h(x) satisfies the three hypotheses of Rolle’s
theorem. First, it is continuous on [a, b] as the sum of two
0 (a local maximum) and $f''(1/\sqrt{3}) = 2\sqrt{3} > 0$ (a local minimum).
The function also has an inflection point at x = 0: $f''(x) = 6x < 0$
if x < 0 and $f''(x) = 6x > 0$ if x > 0. Note that an inflection point
may not be related to a critical point! In other words, the tangent line
at an inflection point can have any slope. In the previous example,
$f'(0) = -1$.
$$T_n(a) = f(a),\quad T_n'(a) = f'(a),\quad T_n''(a) = f''(a),\ \dots,\ T_n^{(n)}(a) = f^{(n)}(a).$$
The resulting polynomial is called the nth-degree Taylor polynomial:
$$T_n(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n.$$
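As a small illustration of the formula, the sketch below builds the Taylor polynomial of $e^x$ at a = 0, where every derivative equals 1, so that $T_n(x) = \sum_{k=0}^{n} x^k/k!$. The function name `taylor_exp` and the sample point x = 1/4 are assumptions chosen for illustration.

```python
import math


def taylor_exp(x: float, n: int) -> float:
    """nth-degree Taylor polynomial of e^x at a = 0: sum of x^k / k! for k = 0..n."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))


x = 0.25
for n in range(1, 5):
    approx = taylor_exp(x, n)
    print(f"T_{n}({x}) = {approx:.8f}   error = {math.exp(x) - approx:.2e}")
```

Each additional term reduces the error at this point, and the difference between consecutive polynomials is exactly the newly added monomial $x^n/n!$.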
25. TAYLOR POLYNOMIALS AND THE LOCAL BEHAVIOR 85
Two observations can be made from this table. First, the accuracy
increases with the degree of the Taylor polynomial (reading the rows
of the table). Second, lower-degree Taylor polynomials become more
accurate as the argument gets closer to the point at which the Taylor
polynomials are constructed (reading the columns of the table). For
example, the approximation $e^x \approx T_3(x)$ is accurate up to four significant
digits if |x| ≤ 1/4. So the accuracy of the approximation $e^x \approx T_2(x)$
is determined by the difference $T_2 - T_3 = -x^3/6$, that is, by the next
monomial to be added to $T_2$ to get the next Taylor polynomial. This
observation is a characteristic feature of Taylor polynomials:
or that
$$\lim_{x\to a} f(x) = \pm\infty \quad\text{and}\quad \lim_{x\to a} g(x) = \pm\infty.$$
Then
$$\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)} \tag{4.16}$$
The first equality follows from f(a) = g(a) = 0, the second and third
equalities are the consequence of the limit laws and the assumption
that $g'(a) \neq 0$, and the last equality follows from the continuity of the
derivatives. This simplified version of l'Hospital's rule can be
understood geometrically. The functions f and g can be approximated by
their tangent lines at a, $f(x) \approx f'(a)(x-a)$ and $g(x) \approx g'(a)(x-a)$,
so that $f(x)/g(x) \approx f'(a)/g'(a)$ near a.
It is not so easy to prove the general version of l’Hospital’s rule
(the proof is omitted here). L’Hospital’s rule is also valid for one-sided
limits x → a± and for the limits at ±∞. The conditions of l’Hospital’s
rule must be verified for the corresponding limits.
What happens if $f'(a) = g'(a) = 0$? Apparently, the conditions of
l'Hospital's rule are satisfied for the derivatives $f'(x)$ and $g'(x)$ in this
case. So, l'Hospital's rule may be applied again to the ratio $f'(x)/g'(x)$.
For functions differentiable many times, l'Hospital's rule is easy to
understand via the Taylor polynomials:
$$\frac{f(x)}{g(x)} \approx \frac{f(a) + f'(a)(x-a) + \frac{1}{2}f''(a)(x-a)^2 + \cdots}{g(a) + g'(a)(x-a) + \frac{1}{2}g''(a)(x-a)^2 + \cdots}.$$
If f(a) = g(a) = 0, then the limit of the ratio is determined by
$f'(a)/g'(a)$. If f(a) = g(a) = 0 and $f'(a) = g'(a) = 0$, then the
limit is determined by $f''(a)/g''(a)$, and so on.
Solution:
(I) Let $f(x) = e^x - 1$ and g(x) = x. Then f(0) = g(0) = 0 (the
conditions of l'Hospital's rule are fulfilled). Hence,
$$\lim_{x\to 0}\frac{e^x - 1}{x} = \lim_{x\to 0}\frac{(e^x - 1)'}{(x)'} = \lim_{x\to 0}\frac{e^x}{1} = 1.$$
Although our goal has not been achieved, our effort has not been in
vain. Since the left-hand side vanishes by Example 50, it follows that
$x\ln^2(x) \to 0$ as $x \to 0^+$. By repeating this procedure recursively, one
can infer that $x\ln^n(x) \to 0$ as $x \to 0^+$ for any n = 1, 2, ....
The limit of g(x) ln(f(x)) is of type 0 · ∞ and can be treated by the
rule (4.17). The procedure is illustrated with an example of the type
$\infty^0$ indeterminate power:
$$\lim_{x\to\infty} x^{1/x} = \lim_{x\to\infty} e^{\ln(x^{1/x})} = \lim_{x\to\infty} e^{\ln(x)/x} = e^{\lim_{x\to\infty}\ln(x)/x} = e^0 = 1.$$
$$f - g = f\left(1 - \frac{g}{f}\right) = \frac{1 - g/f}{1/f} \quad\text{or}\quad f - g = g\left(\frac{f}{g} - 1\right) = \frac{f/g - 1}{1/g}.$$
If f(x)/g(x) → 1, then the indeterminate difference is equivalent to an
indeterminate form of type 0/0 and can be investigated by l'Hospital's
rule. The limit of f/g is an indeterminate form of type ∞/∞ and can
also be investigated by l'Hospital's rule. Suppose that f(x)/g(x) → k
as x → a, where k can be either a non-negative number or k = ∞.
If k < 1, then f − g = g(f/g − 1) → ∞ · (k − 1) = −∞; that
is, g increases faster than f as x → a. If k > 1 or k = ∞, then
f − g = g(f/g − 1) → ∞ · (k − 1) = ∞; that is, f increases faster
than g as x → a. For example,
$$\lim_{x\to 0^+}\left(\ln(x) + \frac{1}{x}\right) = \lim_{x\to 0^+}\frac{1}{x}\bigl(1 + x\ln(x)\bigr) = \lim_{x\to 0^+}\frac{1}{x}(1 + 0) = \infty.$$
where l’Hospital’s rule has been used in the second equality. Note that
Taylor polynomials allow us to find the local behavior of this function
near x = 0. Use T2 to approximate cos(x) and T3 for sin(x):
$$\frac{1 - \cos(x)}{\sin(x)} \approx \frac{x^2/2}{x - x^3/6} = \frac{x/2}{1 - x^2/6} \approx \frac{x}{2},$$
where $x^2/6$ is small as compared to 1 when x is close enough to 0 and
can therefore be neglected in the denominator.
is not defined at roots of cos(x). How does tan(x) behave, say, near
x = π/2? Since both sin(x) and cos(x) are smooth near x = π/2, the
behavior of tan(x) near π/2 can be understood with the help of Taylor
polynomials. Let us approximate sin(x) by $T_1(x) = 1$ (the linear term
vanishes because cos(π/2) = 0) and
cos(x) by $T_3(x) = -(x - \pi/2) + (x - \pi/2)^3/6$. To simplify the notation,
write Δx = x − π/2 (the deviation of x from π/2). Then
$$\tan(x) \approx \frac{1}{-\Delta x + (\Delta x)^3/6} = -\frac{1}{\Delta x}\cdot\frac{1}{1 - (\Delta x)^2/6} \approx -\frac{1}{\Delta x} = -\frac{1}{x - \pi/2},$$
where the second ratio in the product has been approximated by 1
because Δx is small. Since tan(x + π) = tan(x), this behavior repeats
itself near every root of cos(x).
27.1. Growth of the Power, Exponential, and Logarithmic Functions. Let
us compare the growth of the power function $x^n$, the exponential
function $e^x$, and the logarithmic function ln(x) as x → ∞. The exponential
function grows faster than the power function. Let $f(x) = e^x$ and
$g(x) = x^n$, where n is a positive integer. Let us analyze the ratio f/g as
x → ∞. The conditions of l'Hospital's rule are satisfied: $e^x \to \infty$ and
$x^n \to \infty$ as x → ∞.
L'Hospital's rule can successively be applied until the indeterminate
form is resolved:
$$\lim_{x\to\infty}\frac{e^x}{x^n} = \lim_{x\to\infty}\frac{e^x}{nx^{n-1}} = \lim_{x\to\infty}\frac{e^x}{n(n-1)x^{n-2}} = \cdots = \lim_{x\to\infty}\frac{e^x}{n!} = \infty.$$
The conclusion is true for any real n. For any real n, there exists
a positive integer N such that n < N, and hence $x^n < x^N$ for x > 1. But $e^x$
grows faster than $x^N$. Similarly, it is straightforward to show that the
logarithmic function grows slower than any power function $x^n$ with n > 0:
$$\lim_{x\to\infty}\frac{\ln(x)}{x^n} = \lim_{x\to\infty}\frac{(\ln(x))'}{(x^n)'} = \lim_{x\to\infty}\frac{1/x}{nx^{n-1}} = \lim_{x\to\infty}\frac{1}{nx^n} = 0$$
27.3. Guidelines for Analyzing the Shape of a Graph. The following guide-
lines are useful for sketching the graph of a function. It should be noted
that not all the steps can always be carried out. This depends very
much on the complexity of the function in question. So these are really
guidelines, not a “must-do” algorithm. Given a function f , find:
(I) Domain.
The domain consists of all values of x at which f(x) is
defined. Typically, it is a collection of intervals. If f is defined
for x > a or x < a, or both, but not at a, then the local behavior
of f near a must be studied (see below).
(II) Roots of f and the value f(0).
Roots of f(x) define the intercepts of the graph y = f(x) with
the x axis. They are not always easy to find. The value f(0)
(if x = 0 is in the domain of f) defines the intercept of y = f(x)
with the y axis.
(III) Symmetry and periodicity.
If f (−x) = f (x) (an even function) for all x in the domain,
then the graph y = f (x) is symmetric about the y axis. If
f (−x) = −f (x) (an odd function) for all x in the domain,
then the graph y = f (x) is symmetric about the origin (or the
rotation through 180◦ about the origin). If there is a number p
such that f (x+p) = f (x), then f is periodic and p is its period.
The graph y = f (x) repeats itself on intervals of length p, for
example [a, a+p], [a+p, a+2p], and so on for any a. Examples
are sin(x), p = 2π; tan(x), p = π; cos(4x), p = 2π/4 = π/2.
(IV) Asymptotes and asymptotic behavior of f .
If f is a ratio f = h/g, then vertical asymptotes are x = c,
where c solves g(c) = 0 and h(c) ≠ 0. If h(c) = 0, find the
limits $\lim_{x\to c^{\pm}} f(x)$. If one of the limits or both is infinite,
investigate the local behavior of f near c (e.g., with the help
of Taylor polynomials if possible). The asymptotic behavior of
f (x) near c and for large positive and negative x determines
27. ANALYZING THE SHAPE OF A GRAPH 95
(I) If $f'(x) > 0$ for all x < c and $f'(x) < 0$ for all x > c, then
f(c) is the absolute maximum value of f.
(II) If $f'(x) < 0$ for all x < c and $f'(x) > 0$ for all x > c, then
f(c) is the absolute minimum value of f.
(II) What is known about the cost function C(x)? First, its value
at a particular number of supplied units x = x0 = 60 is C0 =
C(60) = 2500. Also, the cost function decreases by ΔC = 20
if x increases by Δx = 5. So the ratio M = −ΔC/Δx = −4 is
the rate of change of C or the marginal cost. Therefore,
C(x) = C0 + M (x − x0 ) = 2500 − 4(x − 60) = 2740 − 4x .
(III) One has to maximize the profit function:
$$P(x) = xp(x) - C(x) = 114x - \tfrac{1}{2}x^2 - 2740.$$
Since $P'(x) = 114 - x$, the function has one critical point
x = 114 at which P(x) attains its absolute maximal value by
the first derivative test for absolute extreme values.
(IV) If x = 114 units can be sold, the price per unit is p(114) =
110 − 57 = 53; that is, the rebate should be p(60) − p(114) =
80 − 53 = 27. Thus, the store should offer a rebate of $27 to
maximize its profit. Note also the increase in the weekly profit:
P (60) = $2,300 whereas P (114) = $3,758. 2
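The figures in this solution can be double-checked with a few lines of code. In the sketch below, the price function p(x) = 110 − x/2 is an assumption inferred from the values p(60) = 80 and p(114) = 110 − 57 quoted in the solution; the function names and the search range are likewise illustrative.

```python
def price(x: float) -> float:
    """Assumed price per unit, p(x) = 110 - x/2 (inferred from the quoted values)."""
    return 110 - x / 2


def cost(x: float) -> float:
    """Cost function from step (II): C(x) = 2740 - 4x."""
    return 2740 - 4 * x


def profit(x: float) -> float:
    """Weekly profit P(x) = x * p(x) - C(x)."""
    return x * price(x) - cost(x)


best_x = max(range(0, 221), key=profit)  # search over whole units sold
print("units maximizing profit:", best_x)
print("rebate:", price(60) - price(best_x))
print("profit at 60 units:", profit(60), " profit at best:", profit(best_x))
```

Searching over whole units agrees with the critical point found from the derivative, since the profit function is a downward-opening parabola.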
$$\lim_{n\to\infty} x_n = r.$$
Example 54. Find the root of $f(x) = x - e^{-x}$ that is correct to six
decimal places.
Solution:
(I) Determine the position of the root first. The graphs y = x
and $y = e^{-x}$ intersect between 0 and 1. So the root lies in the
interval (0, 1).
(II) Verify the condition $f'(x) \neq 0$: $f'(x) = 1 + e^{-x} > 0$ for all x.
(III) Pick an initial value of $x_1 = 0$. Then Newton's sequence for
six decimal places is:
$$x_1 = 0,\quad x_2 = 0.5,\quad x_3 = 0.566311,\quad x_4 = 0.567143,\quad x_5 = 0.567143.$$
So the root r = 0.567143 is correct to six decimal places (in
fact, $f(0.567143) = -4.5\times 10^{-7}$). □
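The sequence in this example can be reproduced with a short program. This is a minimal sketch of Newton's iteration $x_{k+1} = x_k - f(x_k)/f'(x_k)$ for $f(x) = x - e^{-x}$; the function names, the stopping tolerance, and the iteration cap are assumptions made for illustration.

```python
import math


def f(x: float) -> float:
    """Function of Example 54: f(x) = x - e^(-x)."""
    return x - math.exp(-x)


def f_prime(x: float) -> float:
    """Derivative f'(x) = 1 + e^(-x), which never vanishes."""
    return 1.0 + math.exp(-x)


def newton_sequence(x1: float, tol: float = 1e-7, max_iter: int = 50) -> list:
    """Return Newton's sequence starting at x1, stopping once steps fall below tol."""
    xs = [x1]
    for _ in range(max_iter):
        x_next = xs[-1] - f(xs[-1]) / f_prime(xs[-1])
        xs.append(x_next)
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs


sequence = newton_sequence(0.0)
for index, value in enumerate(sequence, start=1):
    print(f"x_{index} = {value:.6f}")
```

The stopping rule compares consecutive iterates, which is a common practical criterion; it is not a proof of accuracy, so the residual f(x) is worth checking as the example does.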
29.2. Pitfalls in Newton’s Method. Unfortunately, there is no unique
recipe for choosing an initial point in Newton’s sequence. The choice
depends very much on the function in question. In practice, it is de-
termined by trying different values. A few possible bad behaviors of
Newton’s sequence are useful to keep in mind.
(I) A bad choice of the initial point $x_1$ can produce the value of $x_2$
that is a worse approximation to the root than $x_1$. Consider,
for example, the function $f(x) = x^3 - 3x^2 + 2$ in the interval
[0, 2] and f(x) = 2 when x < 0 and f(x) = −2 when x > 2.
The function is continuously differentiable because
$f'(x) = 3x^2 - 6x$ approaches 0 as $x \to 0^+$ and $x \to 2^-$. The function
has the root x = 1 and $f'(x) < 0$ in the open interval (0, 2).
If $0 < x_1 < 2$ is close enough to either x = 0 or x = 2, then
$x_2$ would be outside the interval (0, 2). Note that the actual
behavior of f(x) outside the interval [0, 2] is not relevant for
the conclusion. The essential point here is that such a situation
is likely to occur when $f'(x_1)$ is close to 0.
(II) A poor choice of the initial point may lead to a cycle in
Newton's sequence. Take $f(x) = x^3 - 2x + 2$ and $x_1 = 0$. Since
$f'(x) = 3x^2 - 2$, the next elements are $x_2 = 0 - 2/(-2) = 1$,
$x_3 = 1 - 1/1 = 0 = x_1$. That is, Newton's sequence is a cyclic
sequence, which never converges. The initial point must be
taken closer to the root.
(III) If $f'(x) \to \pm\infty$ as x approaches a root r (the graph y = f(x)
has a vertical tangent line at the root), Newton's sequence may
oscillate around r, never converging to it, or it may diverge for
any initial point. To understand this phenomenon, suppose
30. Antiderivatives
In many practical problems, a function is to be recovered from its
derivative. For example, if the velocity is given as a function of time,
v = v(t), one might want to find the position as a function of time,
s = s(t), where $s'(t) = v(t)$. What is s(t)?
Definition 35. A function F is called an antiderivative of f on
an interval I if $F'(x) = f(x)$ for all x in I.
For many basic functions, it is not difficult to find the corresponding
antiderivative. For example, from the rule $(x^{n+1})' = (n+1)x^n$, it follows
that if $f(x) = x^n$, $n \neq -1$, the antiderivative is $F(x) = x^{n+1}/(n+1)$. It
has also been proved that $(\ln|x|)' = 1/x$. So the function F(x) = ln |x|
is the antiderivative of f(x) = 1/x for all $x \neq 0$.
30.1. Uniqueness of the Antiderivative. Suppose $F'(x) = f(x)$ for all x
in an interval (a, b). Is such an F(x) unique? This question is answered
by Corollary 5 given at the end of Section 23. Indeed, let F(x) and
G(x) be antiderivatives of f(x), that is, $F'(x) = G'(x) = f(x)$ on
(a, b). By Corollary 5, F and G may only differ by a constant: G(x) =
F (x)+C. Recall that Corollary 5 does not hold for the union of disjoint
intervals. Thus, any two antiderivatives of the same function may differ
at most by a constant on an interval.
Theorem 37. If F is an antiderivative of f on an interval I, then
the most general antiderivative of f on I is
F (x) + C,
where C is an arbitrary constant.
For example, the general antiderivative of the power function
$f(x) = x^n$, $n \neq -1$, is $F(x) = x^{n+1}/(n+1) + C$, and for f(x) = 1/x, it is
F(x) = ln |x| + C. This nonuniqueness of the antiderivative is not a
drawback of the concept but rather a great advantage. This is explained
by the following example. The velocity of a piece of chalk thrown
vertically upward with a velocity of $v_0$ is $v(t) = v_0 - gt$, where g = 9.8 m/s²
is the acceleration of a free fall. At t = 0, the chalk has a velocity of
$v(0) = v_0$. Then it begins to slow down (v(t) decreases because of
gravity). Eventually, at $t = v_0/g$, the chalk stops and begins to fall back. If
h(t) is the height of the chalk relative to the floor, then $h'(t) = v(t)$;
that is, the height is an antiderivative of v(t). It is easy to find a
particular antiderivative of v(t) using the antiderivative of the power
function: $h(t) = v_0 t - gt^2/2$ (indeed, $h'(t) = v_0 - gt$). What is the
physical significance of the general antiderivative $h(t) = C + v_0 t - gt^2/2$?
It appears as if the position of the chalk relative to the floor is not
uniquely determined. In particular, h(0) = C is the height at the very
moment when the chalk was thrown upward. But the chalk could be
thrown upward at 1 m above the floor or 2 m above it with the very
same initial velocity. So, in both the cases, v(t) is the same, while the
h(t) are not. In the first case, h(0) = 1, whereas in the second case,
h(0) = 2. Thus, the constant C can be fixed by specifying the value of
the antiderivative at a particular point.
This feature of the general antiderivative can also be visualized by
plotting the graphs y = F(x) + C for different values of C. All such
graphs are obtained from the graph y = F(x) by rigid translations
along the y-axis. If one demands that the graph y = F(x) + C should
pass through a particular point (x_0, y_0), then C is fixed: y_0 = F(x_0) + C
or C = y_0 − F(x_0). For example, find f(x) if f'(x) = 3x^2 and f(2) = 1.
The general antiderivative of 3x^2 is f(x) = x^3 + C. From f(2) = 1, it
follows that f(2) = 8 + C = 1 or C = −7. Therefore, f(x) = x^3 − 7.
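The way a single point condition fixes the constant can be checked with a few lines of Python. This is only an illustrative sketch: the helper name fix_constant is hypothetical (it does not come from the text), and the particular antiderivative is taken with C = 0.

```python
def fix_constant(F, x0, y0):
    """Return C such that the graph y = F(x) + C passes through (x0, y0)."""
    return y0 - F(x0)


def F(x):
    """A particular antiderivative of 3x^2 (the one with C = 0)."""
    return x**3


C = fix_constant(F, 2, 1)      # C = y0 - F(x0) = 1 - 8


def f(x):
    """The antiderivative selected by the condition f(2) = 1."""
    return F(x) + C


print(C, f(2))
```

The same helper works for any particular antiderivative F and any point (x0, y0) in its domain.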
\[ (\tan x)' = \sec^2 x, \qquad (\sin^{-1} x)' = \frac{1}{\sqrt{1 - x^2}}, \qquad (\tan^{-1} x)' = \frac{1}{1 + x^2}. \]
In particular, this table says that the general antiderivative of f(x) =
1/(1 + x^2) is F(x) = tan^{−1}(x) + C. The table of derivatives of basic
that satisfies the condition F^(n)(x) = f(x), then the general antiderivative
of the nth order is F(x) + C_1 x^(n−1) + C_2 x^(n−2) + ··· + C_(n−1) x + C_n,
where C_1, ..., C_n are arbitrary constants. Indeed, the nth derivative of
a polynomial of degree n − 1 is 0. Note that this analysis applies only
when f is defined on an interval. Why?
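The claim that the nth derivative of a polynomial of degree n − 1 vanishes can be illustrated by differentiating a list of coefficients. The sketch below assumes n = 3 and made-up values of the constants; the function name differentiate is hypothetical.

```python
def differentiate(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...], where c_k multiplies x^k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


n = 3
# C1 x^2 + C2 x + C3 with the made-up constants C1 = 4, C2 = -2, C3 = 5 (degree n - 1 = 2).
poly = [5, -2, 4]
for _ in range(n):
    poly = differentiate(poly)

# An empty coefficient list stands for the zero polynomial.
print(poly)
```

Each differentiation removes the constant term and lowers the degree by one, so after n steps nothing is left, whatever the constants were.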
The following example illustrates the significance of arbitrary
constants in general higher-order antiderivatives.
Example 57. Any free-falling object near the surface of the Earth
has the free-fall acceleration of 9.8 m/s^2. A piece of chalk is thrown
vertically upward at a speed of 7 m/s from 1.5 m above the floor. When
does the chalk hit the floor?
Solution:
(I) Let h(t) be the height of the chalk relative to the floor. Then
its velocity is v(t) = h'(t), and its acceleration is a(t) =
v'(t) = h''(t). Since all free-falling objects have an acceleration
of 9.8 m/s^2, one has h''(t) = −9.8. The minus sign indicates
that the acceleration is directed downward.
(II) The general second antiderivative of the constant function
−9.8 is h(t) = −9.8t^2/2 + C_1 t + C_2, where C_1 and C_2 are
arbitrary constants.
(III) To fix C_1 and C_2, the initial conditions of the motion must
be used. The initial velocity is v(0) = 7. Since v(t) = h'(t) =
−9.8t + C_1, one can infer that v(0) = C_1 = 7. The initial height
is h(0) = 1.5. Hence, h(0) = C_2 = 1.5.
(IV) The height is h(t) = −9.8t^2/2 + 7t + 1.5. The chalk hits the
floor when its height vanishes, that is, at the time
t > 0 when h(t) = 0. The positive root of the quadratic equation
−9.8t^2/2 + 7t + 1.5 = 0 is t ≈ 1.62 s. The maximal height
reached by the chalk is 4 m. Why? □
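The numbers quoted in Example 57 can be reproduced numerically. The following sketch uses only the data of the example; the variable names (v0, h0, t_hit, t_top, h_max) are chosen here for illustration. The maximal height is found from the condition v(t) = 0.

```python
import math

g = 9.8    # free-fall acceleration, m/s^2
v0 = 7.0   # initial upward velocity, m/s (this is the constant C_1)
h0 = 1.5   # initial height, m (this is the constant C_2)


def height(t):
    """Height of the chalk, h(t) = -g t^2/2 + v0 t + h0."""
    return -g * t**2 / 2 + v0 * t + h0


# Positive root of -g t^2/2 + v0 t + h0 = 0 by the quadratic formula.
t_hit = (v0 + math.sqrt(v0**2 + 2 * g * h0)) / g

# The velocity v(t) = v0 - g t vanishes at the top of the trajectory.
t_top = v0 / g
h_max = height(t_top)

print(round(t_hit, 2), round(h_max, 2))
```

The chalk reaches its maximal height when its velocity changes sign, which is the observation behind the question at the end of the example.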
CHAPTER 5
Integration
Thus, if the limit lim_{n→∞} A_n^U exists, then lim A_n^U = lim A_n^L because
0 < A_n^U − A_n^L < 2/n → 0 as n → ∞. On the other hand, A_n^L ≤ A ≤ A_n^U
for any n. Taking the limit n → ∞ in this inequality yields
\[ \lim_{n\to\infty} A_n^L = A = \lim_{n\to\infty} A_n^U . \]
From a geometrical point of view, when n gets larger, the area A_n^U
approaches A from above while A_n^L does so from below. For n large
enough, both A_n^U and A_n^L may serve as a good approximation of A.
In fact, the error of either of the approximations does not exceed 2/n
because 0 < A_n^U − A_n^L < 2/n and A_n^L ≤ A ≤ A_n^U. It appears that the
limit lim_{n→∞} A_n^U can actually be calculated by means of the formula
for the sum of squares of the first n positive integers:
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}\, n(n + 1)(2n + 1) = \frac{n}{6}\,(2n^2 + 3n + 1). \tag{5.1} \]
Indeed, by making use of this formula, one can infer that
\[ A_n^U = \frac{1}{n^3}\,\bigl(1^2 + 2^2 + \cdots + n^2\bigr) = \frac{2n^2 + 3n + 1}{6n^2} = \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2} \to \frac{1}{3} \]
as n → ∞. So the area is A = 1/3.
that is, the limit of A_n^* does not depend on the choice of sample points
x_k^*. The area could have been approximated by, for example, A_n^* with
the sample points taken as the midpoints x_k^* = (x_k + x_{k−1})/2, or with any other
convenient choice. This analysis can be extended to any continuous
function.
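The independence of the limit from the choice of sample points can be observed numerically for f(x) = x^2 on [0, 1]. The sketch below is an illustration only; the function names are hypothetical. Because x^2 increases on [0, 1], the left endpoints give the lower sum and the right endpoints give the upper sum.

```python
def riemann_sum(f, a, b, n, sample):
    """Sum of f(x_k^*) * dx over n equal segments of [a, b].

    sample(left, right) returns the sample point chosen in [left, right].
    """
    dx = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        left, right = a + (k - 1) * dx, a + k * dx
        total += f(sample(left, right)) * dx
    return total


def square(x):
    return x * x


def left_endpoint(left, right):
    return left


def right_endpoint(left, right):
    return right


def midpoint(left, right):
    return (left + right) / 2


for n in (10, 100, 1000):
    lower = riemann_sum(square, 0.0, 1.0, n, left_endpoint)
    upper = riemann_sum(square, 0.0, 1.0, n, right_endpoint)
    middle = riemann_sum(square, 0.0, 1.0, n, midpoint)
    print(n, lower, middle, upper)
```

All three columns approach 1/3 as n grows, with the lower sums staying below and the upper sums staying above that value.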
31.1. The Area Under the Graph of a Continuous Function. Let f(x) be
continuous on [a, b]. Consider a partition of [a, b] by n segments of
length Δx = (b − a)/n. The endpoints of the partition segments are
x_k = a + k Δx with k = 0, 1, 2, ..., n, such that x_0 = a and x_n = b. Let
x_k^* be a sample point in the interval [x_{k−1}, x_k].
Definition 36. The area A of the region that lies under the graph
of a continuous function f(x) ≥ 0 on an interval [a, b] is
\[ A = \lim_{n\to\infty} A_n^* = \lim_{n\to\infty} \bigl[ f(x_1^*)\,\Delta x + f(x_2^*)\,\Delta x + \cdots + f(x_n^*)\,\Delta x \bigr] \tag{5.2} \]
where the sum formula (5.3) has been used. To compute the limit in
the denominator, let x = 1/n, so that x → 0 as n → ∞. The limit becomes the
indeterminate form (1 − e^{−2x})/x of type 0/0, which can be resolved by
l’Hospital’s rule: (1 − e^{−2x})'/(x)' = 2e^{−2x}/1 → 2 as x → 0. Thus, the
distance traveled is D = (1 − e^{−2})/2. □
the numbers a and b are called the lower and upper integration limits,
respectively, and the function f is called the integrand.
Evidently, for a continuous and non-negative f on [a, b], the definite
integral coincides with the area under the graph of f. The geometrical
significance of the definite integral in general will be discussed
later after establishing its basic properties.
Let f be integrable on [a, b] and let x_k^* be a sample point in I_k for
each k = 1, 2, ..., n. For any number ε > 0, there exists an integer N
such that
\[ \left| \int_a^b f(x)\,dx - \sum_{k=1}^{n} f(x_k^*)\,\Delta x \right| < \varepsilon \]
for every integer n > N and for every choice of x_k^* in I_k. Indeed,
for any x_k^*, m_k ≤ f(x_k^*) ≤ M_k, and therefore
\[ A_n^L \le \sum_{k=1}^{n} f(x_k^*)\,\Delta x \le A_n^U . \]
Hence, by the definition of the limit, no matter how small ε is, there is
always a large enough integer N such that the deviation of the values of the
sequence elements from the limit does not exceed ε for all n > N. The
sum in (5.4) is called the Riemann sum after the German mathematician
Bernhard Riemann (1826–1866). It follows from the preceding
analysis that the sequence of Riemann sums for an integrable function
converges to the definite integral. Since the limit is independent of the
choice of sample points, any choice convenient to calculate the limit
(5.4) can be made.
32.1. Continuity and Integrability. The relation (5.4) holds and can be
used to calculate the definite integral, provided the function f is
integrable. The question of integrability requires investigating the
convergence of the sequences of the upper and lower sums, which might be a
tedious task even for such simple functions as, for example, f(x) = x^2,
as discussed in the previous section. The following theorem is helpful
when studying the question of integrability.
Theorem 38. If f is continuous on [a, b], or if f has only a finite
number of jump discontinuities, then f is integrable on [a, b]; that is,
the definite integral ∫_a^b f(x) dx exists.
This theorem justifies the definition of the area under the graph of
a continuous function introduced in the previous section.
Let f(x) be defined on [0, 1] such that f(x) = 1 if x is a rational
number, and f(x) = 0 otherwise (i.e., if x is irrational). The function is
not continuous anywhere in [0, 1]. For example, f(1/2) = 1, but when
x approaches 1/2, the value f(x) keeps jumping from 0 to 1 and back,
no matter how close x is to 1/2 because, for any δ > 0, the interval
(1/2 − δ, 1/2 + δ) always contains both rational and irrational numbers.
This function gives an example of a nonintegrable function. Indeed,
take a partition x_k = k/n, k = 0, 1, ..., n. Any partition interval
[(k − 1)/n, k/n] contains both rational and irrational numbers. Therefore,
m_k = 0 and M_k = 1. Hence, the lower sum vanishes for any partition,
A_n^L = 0, whereas the upper sum is A_n^U = Σ_{k=1}^{n} Δx = 1, that is,
lim_{n→∞} A_n^L = 0 while lim_{n→∞} A_n^U = 1. The function is not integrable.
The integral does not exist. Note that the Riemann sum can still be
defined, but its limit would depend on the choice of sample points (e.g.,
take x_k^* to be rational numbers or take x_k^* to be irrational numbers;
both options are possible since any partition interval always contains
rational and irrational numbers).
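The dependence of the Riemann sum on the sample points can be mimicked in a small program. A computer cannot test a floating-point number for rationality, so the sketch below relies on a modelling assumption that is not part of the text: rational sample points are represented by Fraction objects, and any other number (here, a float built from the square root of 2) stands for an irrational point.

```python
from fractions import Fraction
import math


def dirichlet(x):
    """f(x) = 1 for rational x and 0 otherwise.

    Modelling assumption: a point is treated as rational exactly
    when it is stored as a Fraction.
    """
    return 1 if isinstance(x, Fraction) else 0


def riemann_sum(sample_points, n):
    """Riemann sum on [0, 1] for the partition x_k = k/n."""
    dx = Fraction(1, n)
    return sum(dirichlet(x) * dx for x in sample_points)


n = 50
# Rational sample points: the midpoints (2k - 1)/(2n) of the partition intervals.
rational_points = [Fraction(2 * k - 1, 2 * n) for k in range(1, n + 1)]
# Stand-ins for irrational sample points: (k - 1 + sqrt(2)/2)/n lies in [(k - 1)/n, k/n].
irrational_points = [(k - 1 + math.sqrt(2) / 2) / n for k in range(1, n + 1)]

print(riemann_sum(rational_points, n))    # every term contributes dx
print(riemann_sum(irrational_points, n))  # every term vanishes
```

The two sums stay at 1 and 0 for every n, so the sequence of Riemann sums has no limit that is independent of the sample points.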
For any two integrable functions f(x) and g(x) and constants c_1 and
c_2, it follows from the convergence of the Riemann sums (5.4) for f and
g that
\[ \begin{aligned}
\int_a^b [c_1 f(x) + c_2 g(x)]\,dx &= \lim_{n\to\infty} \sum_{k=1}^{n} [c_1 f(x_k^*) + c_2 g(x_k^*)]\,\Delta x \\
&= c_1 \lim_{n\to\infty} \sum_{k=1}^{n} f(x_k^*)\,\Delta x + c_2 \lim_{n\to\infty} \sum_{k=1}^{n} g(x_k^*)\,\Delta x \\
&= c_1 \int_a^b f(x)\,dx + c_2 \int_a^b g(x)\,dx .
\end{aligned} \tag{5.6} \]
and, in particular,
\[ \int_a^a f(x)\,dx = 0 . \]
on [c, b], that is, f(c) = 0. Then it follows from the property (5.8) that
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = A_1 - A_2 , \]
where A_1 is the area under the graph of f on [a, c] and A_2 is the area
above the graph of f on [c, b].
(III) Thus,
\[ \int_0^1 f(x)\,dx = \frac{1 - e^{-2}}{2} - \frac{2}{3} + \frac{1}{4} = \frac{1 - 6e^{-2}}{12} . \]
□
under the graph of f(t) = t in the interval [0, x], which is the area of a
right triangle:
\[ A(x) = \int_0^x t\,dt = \frac{x^2}{2} . \]
The area A(x) can be viewed as a function of the variable x, which
is the length of each leg (cathetus) of the triangle. This function has an
interesting property:
\[ A'(x) = x = f(x) . \]
In other words, the derivative of the definite integral with respect to its
upper limit equals the value of the integrand at the upper limit. Recall
that if v(t) ≥ 0 is the speed of a moving object, then the distance
traveled by the object in time T is given by the area under the graph
of v(t), that is,
\[ s(T) = \int_0^T v(t)\,dt . \]
On the other hand, the speed is the rate of change of s(T), and therefore
one should have s'(T) = v(T); that is, the derivative of the integral
with respect to its upper limit is again the value of the integrand at
the upper limit. How general is this property? Does it hold for all
integrable functions? The following theorem answers these questions.
Theorem 39. If f is continuous on [a, b], then the function defined
by
\[ g(x) = \int_a^x f(t)\,dt , \qquad a \le x \le b, \]
is continuous on [a, b] and differentiable on (a, b), and g'(x) = f(x).
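The statement of Theorem 39 can be tested numerically before it is proved. The sketch below is an illustration under stated assumptions: the integrand f(t) = cos t, the lower limit a = 0, the point x = 1, and the midpoint-sum approximation of the integral are all choices made here, not taken from the text.

```python
import math


def integral(f, a, b, n=2000):
    """Midpoint Riemann sum approximating the definite integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx


f = math.cos
a = 0.0


def g(x):
    """g(x) = integral of f from a to x, computed approximately."""
    return integral(f, a, x)


x, h = 1.0, 1e-4
# Difference quotient (g(x + h) - g(x)) / h for a small h.
slope = (g(x + h) - g(x)) / h

print(slope, f(x))
```

The difference quotient agrees with f(x) up to an error that shrinks with h, as the theorem predicts for a continuous integrand.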
Proof. By the definition of the derivative, one has to prove that
\[ \lim_{h\to 0} \frac{g(x + h) - g(x)}{h} = f(x) \tag{5.13} \]
for a < x < b. The ratio in the limit can be transformed as follows:
\[ \begin{aligned}
\frac{g(x + h) - g(x)}{h} &= \frac{1}{h}\left( \int_a^{x+h} f(t)\,dt - \int_a^{x} f(t)\,dt \right) \\
&= \frac{1}{h}\left( \int_a^{x} f(t)\,dt + \int_x^{x+h} f(t)\,dt - \int_a^{x} f(t)\,dt \right) \\
&= \frac{1}{h} \int_x^{x+h} f(t)\,dt ,
\end{aligned} \]
where the property (5.8) has been used. Note that since a < x < b
(i.e., x ≠ a and x ≠ b), for a sufficiently small h ≠ 0, both x and x + h
(h can be positive or negative) always lie in the interval (a, b) so that
Example 61. Evaluate ∫_0^1 (1 + x^2)^{−1} dx.
Solution: An antiderivative of (1 + x^2)^{−1} is F(x) = tan^{−1}(x). Therefore,
\[ \int_0^1 \frac{1}{1 + x^2}\,dx = \tan^{-1}(1) - \tan^{-1}(0) = \frac{\pi}{4} - 0 = \frac{\pi}{4} . \]
□
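The value obtained in Example 61 can be compared with a direct Riemann-sum approximation. In the sketch below, the helper midpoint_sum and the number of segments are assumptions made for illustration; math.atan plays the role of the antiderivative F(x) = tan^{−1}(x).

```python
import math


def midpoint_sum(f, a, b, n=1000):
    """Riemann sum of f over [a, b] with midpoints as sample points."""
    dx = (b - a) / n
    return sum(f(a + (k + 0.5) * dx) for k in range(n)) * dx


def integrand(x):
    return 1.0 / (1.0 + x * x)


# F(1) - F(0) with the antiderivative F = arctan.
exact = math.atan(1.0) - math.atan(0.0)
approx = midpoint_sum(integrand, 0.0, 1.0)

print(exact, math.pi / 4, approx)
```

The antiderivative gives the value in one step, whereas the Riemann sum needs many terms to reach comparable accuracy.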
Example 62. Evaluate ∫_1^4 (1 + x)/√x dx.
Solution: By the linearity of the integral,
\[ \int_1^4 \frac{1 + x}{\sqrt{x}}\,dx = \int_1^4 \bigl(x^{-1/2} + x^{1/2}\bigr)\,dx = \int_1^4 x^{-1/2}\,dx + \int_1^4 x^{1/2}\,dx . \]
34.1. The Net Change Theorem. Put f(x) = F'(x) in the fundamental
theorem of calculus (5.18). The result obtained is known as the net
change theorem.
Theorem 41. The integral of a continuous rate of change is the
net change:
\[ \int_a^b F'(x)\,dx = F(b) - F(a) . \]
Note that F'(x) may be positive and negative in the interval [a, b] so
that the quantity y = F(x) may increase and decrease. The difference
F(b) − F(a) represents the net change of y when x changes from a
to b. The net change vanishes if F(b) − F(a) = 0. This does not mean
that the quantity y does not change at all, but rather this might mean,
for example, that the quantity y increases from the value F(a), then,
at some c in [a, b], it begins to decrease, returning to its initial value
when x = b so that its net change vanishes.
An analogy with an object moving along a straight line can be
made to illustrate the net change. Let s(t) be a position function of
the object relative to some point on the line. Then s'(t) = v(t) is its
velocity (note that the velocity can be negative so that the object can
move back and forth). The net change of the position over the time
interval [t_1, t_2] is
\[ \int_{t_1}^{t_2} v(t)\,dt = s(t_2) - s(t_1). \]
Solution:
(I) The indefinite integral of v(t) is s(t) = t − t^2 + C. So the net
change of the object position is
\[ \int_0^1 v(t)\,dt = \int_0^1 s'(t)\,dt = s(1) - s(0) = 0 . \]
(II) Note that the velocity changes its sign at t = 1/2. So, in the
interval [0, 1/2], it is positive (i.e., the object moves to the right
from its initial position), then the velocity becomes negative in
[1/2, 1] (i.e., the object goes back to the initial point). To find
the distance traveled by the object, the absolute value |v(t)|
must be integrated over the interval [0, 1]. Think of |v(t)| as
the speed shown on the speedometer of your car; it is always
nonnegative regardless of the direction in which the car is
moving.
\[ \begin{aligned}
\int_0^1 |1 - 2t|\,dt &= \int_0^{1/2} (1 - 2t)\,dt - \int_{1/2}^{1} (1 - 2t)\,dt \\
&= [s(1/2) - s(0)] - [s(1) - s(1/2)] = 1/2,
\end{aligned} \]
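The difference between the net change of the position and the distance traveled can be verified directly for the velocity v(t) = 1 − 2t of this solution. The sketch below takes the antiderivative s(t) = t − t^2 with C = 0 and, as a cross-check chosen here for illustration, also approximates the integral of |v(t)| by a midpoint sum.

```python
def s(t):
    """An antiderivative of v(t) = 1 - 2t (taken with C = 0)."""
    return t - t**2


# Net change of the position over [0, 1].
net_change = s(1) - s(0)

# Distance traveled: the sign of v(t) flips at t = 1/2, so the interval is split there.
distance = (s(0.5) - s(0)) - (s(1) - s(0.5))

# Cross-check: midpoint Riemann sum of the speed |v(t)| over [0, 1].
n = 1000
dt = 1.0 / n
numeric = sum(abs(1 - 2 * (k + 0.5) * dt) for k in range(n)) * dt

print(net_change, distance, numeric)
```

The object returns to its starting point, so the net change is zero, while the distance traveled is the sum of the two legs of the trip.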
Other examples of the net change include the change of the volume V(t) of water
in a reservoir between two moments of time,
\[ \int_{t_1}^{t_2} V'(t)\,dt = V(t_2) - V(t_1), \]
where V'(t) is the rate of change of the volume; the net change of the
population,
\[ \int_{t_1}^{t_2} n'(t)\,dt = n(t_2) - n(t_1), \]
where n'(t) is the growth rate; the relation between the cost and
marginal cost functions,
\[ \int_{t_1}^{t_2} C'(t)\,dt = C(t_2) - C(t_1); \]
and similarly for many other quantities.
where C is an arbitrary constant and the last equality follows from the
fact that an indefinite integral of f(u) = 1 is u. So we can conclude
that F'(x) dx = du, provided the variables u and x are related as
u = F(x). This also shows that it is permissible to operate with dx
and du after the integral sign as if they were differentials. This
observation leads to a neat technical trick to calculate indefinite integrals.
For example,
\[ \int \frac{1}{\sqrt{x + 1}}\,dx = \int d\bigl(2\sqrt{x + 1}\bigr) = 2\sqrt{x + 1} + C , \]
where the substitution u = 2√(x + 1) has been used. This trick can be
generalized.
Let F(u) be an indefinite integral of a continuous function f(u) on
an interval I. Let u = g(x), where g is differentiable and its range is
the interval I. By the chain rule,
\[ \bigl(F(g(x))\bigr)' = F'(g(x))\,g'(x) = f(g(x))\,g'(x) . \]
In other words, F(g(x)) + C is an indefinite integral of f(g(x)) g'(x). On
an interval, the most general indefinite integral of f(u) is ∫ f(u) du =
F(u) + C. Therefore, F(g(x)) and ∫ f(u) du can differ at most by an
additive constant. This proves the following theorem.
Theorem 42. (The Substitution Rule). If u = g(x) is a differentiable
function whose range is an interval I and f is continuous on I,
then
\[ \int f(g(x))\,g'(x)\,dx = \int f(g(x))\,dg(x) = \int f(u)\,du . \tag{5.19} \]
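The chain-rule step behind the substitution rule can be checked numerically. The sketch below picks, purely as an assumption for illustration, f(u) = cos u with the antiderivative F(u) = sin u and the inner function g(x) = x^2, and compares a central difference quotient of F(g(x)) with f(g(x)) g'(x).

```python
import math

F = math.sin   # an antiderivative of f
f = math.cos   # the integrand as a function of u


def g(x):
    """Inner function u = g(x)."""
    return x * x


def dg(x):
    """Derivative g'(x) of the inner function."""
    return 2 * x


def composite(x):
    """Candidate antiderivative F(g(x)) of f(g(x)) g'(x)."""
    return F(g(x))


def derivative(func, x, h=1e-6):
    """Central difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


for x in (0.3, 1.0, 1.7):
    print(x, derivative(composite, x), f(g(x)) * dg(x))
```

The two columns agree to within the accuracy of the difference quotient, which is what it means for F(g(x)) to be an indefinite integral of f(g(x)) g'(x).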
and
\[ \int_{-a}^{a} f(x)\,dx = \int_0^a f(-u)\,du + \int_0^a f(x)\,dx . \]
□