Book
1 author:
Wilson Mixon
Berry College
All content following this page was uploaded by Wilson Mixon on 17 August 2018.
Wilson Mixon
The title page of this book is a bit misleading. I (Wilson Mixon) am not
the author of most of the book’s material. I am grateful to have received
the permission of Professors Anthony L. Ostrosky, Jr. and James V. Koch
to use their textbook [16] as the basis for this project. I have edited their
material slightly and have added some material. Mainly, I have incorporated
material from the Maxima open-source computer algebra system. Also, I
have extended the original discussion of sets and lists.
I hope that this project reflects well on the work of Professors Ostrosky and
Koch. I accept responsibility for any shortcomings that I have introduced in
this rendering of their work.
It is fitting that the bulk of this preface should be in the words of the original
authors:
ima user interface. To incorporate Maxima into your study of this material,
look at these two sites: http://maxima.sourceforge.net/ and
http://andrejv.github.io/wxmaxima/. Both sites have links to documentation.1
Another site that readers of this text should visit is
http://statmath.wu.ac.at/~leydold/maxima/ [9]. This site contains the text Introduction to Maxima for
Economics, which has a quite complete introduction to both Maxima and
wxMaxima. It also provides a briefer development of much of the material
in this text. Finally, it contains some more advanced material, especially the
treatment of ordinary differential equations.
I am coauthor (with Michael Hammock) of a textbook that develops micro-
economic theory more fully than the confines of the present text allow. That
text, Microeconomic Theory and Computation [7], also provides more detail
on the use of Maxima than is provided here.
1
You probably will not want to download wxMaxima. Windows and MacOS users
should download an executable file that will install wxMaxima. Linux users can access
Maxima and wxMaxima from their repositories.
Acknowledgements
This effort is dedicated to the Maxima team, which maintains and continually
improves this remarkable piece of software. Among the members of that
team, I thank Robert Dodier and Andrej Vodopivec for encouraging my
early efforts to produce material to illustrate Maxima’s power and usefulness.
A third team member, Gunter Königsmann, continues to provide help and
insight in the use of the wxMaxima graphical user interface, which greatly
facilitates the use of Maxima, especially by newcomers.
I thank Michael Hammock for working with me on the previously mentioned
text and for maintaining a website that contains a large amount of material
on the use of Maxima in economic analysis (www.wxmaximaecon.com).
Finally, and above all, I thank my wife Barbara. A suggestion that she
made a few years ago pointed me toward Maxima. Since then, I’ve repaid
her by bending her ears about the most recent tidbit that I’ve discovered by
working on various projects and by providing material for her to proofread.
Remarkably, she continues to encourage me and even offers to proofread new
material. Who would believe such a report?
Contents
6 Differentiation II
6.1 Partial Differentiation
6.2 Rules of Differentiation
6.3 Higher-order Partial Derivatives
6.4 Applications of Partial Derivatives
6.5 Questions and Problems
7 Optimization
7.1 Extreme Value(s): Functions of One Variable
7.2 Inflection Points and Concavity
7.3 Maxima and Minima II
7.4 Maxima and Minima Subject to Constraints
7.5 Economic Applications
7.6 Questions and Problems
CHAPTER 1. THE ROLE AND POWER OF MATHEMATICS 2
C = P1 · X1 + P2 · X2 + · · · + Pn · Xn
1
For reasons that will become clear later, we do not use subscripts. Much of our
work will involve commands that are written in text, so that subscripts are hard to enter.
Multiplication is indicated with centered dots (·) so that P2 is a variable name. In contrast,
P · 2 is a product, with the variable name being P. We will not be entirely consistent in
our usage. In some settings, subscripts provide for easier interpretation. The context will
typically make the usage clear.
• The first and obvious implication of the analysis is that the diet is
not very palatable. Not many individuals would find a diet limited to
Stigler’s set of ingredients to be very tasty.
• Second, the meaning of the seemingly direct phrase “the cost of food”
is not as clear-cut as one might hope. Spending on food in the United
States is around $3500 per capita. We buy much more than subsistence,
and a casual reference to “the cost of food” refers to something quite
different from the cost of a subsistence diet. We eat more, sometimes
too much more, than subsistence requires and, more importantly, we
select foods that have attractions other than mere subsistence.
• Fifth, the prices of foods relative to those of other goods and services
have declined and relative prices have changed among types of foods.
• Finally, we now have more foods to choose from for our new cost-
minimizing diet.
provides links for downloading Maxima, which includes wxMaxima, and for
getting started in wxMaxima. Also, see [7], Chapters 1 and 2.
Figure 1.1 shows three input/output cells. The input is entered as text.
Commands are ended by semicolons (if resulting output is to be printed) or
dollar signs (if resulting output is to be suppressed). When a set of commands
is ready to be executed, pressing Ctrl-Enter generates the output.
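A minimal sketch of the two command terminators follows. The names revenue and cost and the numbers are hypothetical, chosen only for illustration; they are not the expressions shown in Figure 1.1.

```maxima
/* Hypothetical expressions; names and numbers are illustrative only. */
revenue : 10*q$             /* "$": the assignment is made, nothing is printed */
cost : 20 + 6*q;            /* ";": the assigned expression is printed */
solve(revenue = cost, q);   /* ";": the root is printed, here [q = 5] */
```

Ending the first command with a dollar sign keeps the worksheet uncluttered while still binding the name for later use.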
The first cell shows a simple way to assign a name to an expression (the two
commands in the first cell). The second cell graphs the expression(s), and the
third cell shows how to find a root of an expression, in this case “Revenue
= Cost.” At this point, just observe the general nature of the workbook
1.4 Summary
Mathematics underlies the analysis of many of the issues that business deci-
sion makers, public policymakers, and economists address. This chapter dis-
cusses the application of mathematical analysis as it applies to an important
issue, the cost of a subsistence diet. This example shows how mathematics
can be applied, and it shows that the mathematical tools are becoming in-
creasingly sophisticated. Coupled with the power of modern computers, this
increased sophistication has broadened the purview of applied mathematical
analysis.
One important advance of the past few decades has been the development
of computer algebra systems. These systems extend analysts’ abilities by
solving complex problems, by allowing for the examination of a range of
scenarios, and by producing simulations of systems that defy formal solution.
The remainder of this book displays some of these features by applying the
Maxima open-source computer algebra system.
The development and use of mathematical tools to solve business and economic
problems have expanded very rapidly in recent years. A course covering
the materials presented in this book is now required of business administration,
accounting, marketing, and economics majors in most colleges and
universities. It behooves the student who wishes to be well prepared and
efficient to master the mathematics that appears in the following chapters,
not only because it probably will be necessary in order to graduate, but
also because mathematics will prove to be very useful in later employment.
Mathematical economics can open new doors to those who take the time to
master its essentials.
2. Many historians claim that they do not use models in writing history
and in arriving at conclusions about historical phenomena. Is it possible
to analyze something without having an underlying model? Will hard
work produce insights and generalizations if you do not have a model?
Explain.
3. Those who use mathematical tools in the analysis of business and eco-
nomics problems frequently contend that it is possible to say things
with mathematics that could not be said verbally. Is this true? Can
the reverse be true?
CHAPTER 2. VARIABLES, SETS, LISTS, AND RELATIONS 11
2.1 Variables
The remainder of this text focuses on mathematical models that relate to
economic activities. The analysis is phrased in terms of expressions of how
variables relate to each other. Most of these variables, like income or a price,
can be quantified. Others, like utility, can be ordered but not quantified in
a meaningful way. A variable is a quantity that can assume different values
at different points of observation.
The magnitude of a variable can assume various values. For example, the
gross domestic product (GDP) of the United States could be 100, 1000, 1500,
or, indeed, any positive magnitude. Because a variable’s magnitude can as-
sume various different values, a variable must be represented by a general
symbol. Hence the price of a pizza might be represented by the symbol p,
while the tax rate might be represented by the symbol r. Chapter 1 represented
the magnitudes of the 80 different foods as X1, X2, . . . , X80. The
letters at the end of the alphabet, such as u, w, x, y, and z, commonly sym-
bolize the magnitudes of variables. This, however, is a matter of convention
rather than of necessity. In Maxima, we often use a string of letters to name
a variable. Thus cons might be the name assigned to consumption.
Variables can be either cardinal or ordinal. The values assigned to cardinal
variables have meaning: the difference between $2 and $5 is $3. Ordinal
variables define order alone: we may judge one person to be friendlier or
happier than another but we cannot assign a specific value to the difference
in the levels of friendliness or happiness.
We may classify cardinal variables as being either continuous or discrete in
terms of the magnitudes that are permissible for those variables. A contin-
uous variable is one that can assume any value within a given interval of
values. Annual income is a continuous variable. A discrete variable is one
that can assume at most a limited number of values within a given interval
of values. The number of siblings in your family is a discrete variable.
Consider some examples of continuous and discrete variables. The examples
reveal that the distinction, while quite real, can sometimes be ignored with
impunity. When this distinction can be safely ignored depends, of course, on
theoretical considerations.
The result of executing this command is assigned the name solnY.2 The
2
Any allowable name would do; some names, like value, are reserved and cannot be
applied, though Value could be, because Maxima is case-sensitive.
output is shown as the single item in a list. The solve command always
produces a list. Maxima follows mathematical conventions, so the output
is not always as you might expect to see it. In the example above, we can
restate the expression as follows: Y = (a + I0 + G0)/(1 − b), so that the
multiplier 1/(1 − b) becomes apparent.
Suppose that we have values for the parameters a and b and for the exogenous
values I0 and G0. The following input/output combination shows a set of
values and the implied values of Y and C. The first command below assigns
the name Yeq to the equilibrium income. The second command substitutes
a set of parameter values into Yeq and assigns the name Yeq0 to the result.
The third command substitutes the parameters and the equilibrium output
level into the consumption function, yielding the equilibrium consumption
level. The final command provides a check, to confirm that total spending
sums to the equilibrium output level.3
(%o) −(I0 + G0 + a)/(b − 1)
(%o) 780.0
(%o) 735.0
(%o) 780.0
The above four-equation model has two endogenous variables (Y and C) and
two exogenous variables (I and G), the values of which are assumed to be
determined outside this model. The consumption function illustrates the use
of two parameters (a = 150 and b = 0.75). The values of the exogenous
variables are also numerical constants. The final command states and solves
the equilibrium condition.
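The commands that produced the output above are not reproduced in the text. A minimal sketch of what such a cell could look like follows, assuming the model Y = C + I0 + G0 with C = a + b · Y. The split I0 = 25 and G0 = 20 is an assumption, chosen only because it sums to the 45 that the reported values imply; the names params and Ceq0 are hypothetical.

```maxima
/* Sketch only: I0 = 25 and G0 = 20 are assumed values (they sum to 45). */
solnY : solve(Y = a + b*Y + I0 + G0, Y)$   /* equilibrium condition */
Yeq : rhs(solnY[1]);                       /* equilibrium income, symbolic */
params : [a = 150, b = 0.75, I0 = 25, G0 = 20]$
Yeq0 : subst(params, Yeq);                 /* equilibrium income, numeric */
Ceq0 : subst(append(params, [Y = Yeq0]), a + b*Y);   /* consumption */
Ceq0 + subst(params, I0 + G0);             /* check: total spending */
```

With these assumed values the last three commands should reproduce the reported 780.0, 735.0, and 780.0.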
Consider the positive integers (1, 2, 3, . . . ), the negative integers (−1, −2,
−3, . . . ), and zero. All these values may be found on the real number line
portrayed below. A real number line has the following characteristics: (1)
The origin (location of zero) on the real number line is arbitrarily chosen.
(2) The units of measurement on the real number line are arbitrarily chosen.
(3) A positive or negative direction along the real number line is indicated
by the sign of the number; this sign reflects the location of a particular
point relative to the origin. (4) The ordering relation among the numbers
on the real number line is that, if x < y, then the point x lies to the left of
point y on the real number line. The number line in Figure 2.1, generated
by Maxima, shows three integers, the values −π and π, the constant e, the
fraction 2/3, and the square root of 5. Confirm that these values are in the
correct sequence.
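One way to check the sequence is to let Maxima sort the values numerically. The sketch below assumes the three integers are −2, 0, and 4, since the figure's integers are not stated in the text; the name vals is hypothetical.

```maxima
/* The integers -2, 0, and 4 are assumed; the other values come from the text. */
vals : [%pi, 2/3, -2, sqrt(5), 0, %e, -%pi, 4]$
/* Compare floating-point values, but keep the exact forms in the result. */
sort(vals, lambda([u, v], is(float(u) < float(v))));
```

The comparison function converts each pair to floating point only for the ordering test, so the sorted list still displays π, e, and √5 in exact form.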
The gap between any two whole, integer values found on the real number
line may be partially filled with rational numbers. A rational number like
2/3 results from the division of one integer by another, provided that the
denominator is not equal to zero. The number 2/3, expressed as the quotient
of the integers 2 and 3, is more commonly known as a fraction. Any integer
may be expressed as the quotient of some two integers. Therefore every
integer is a rational number. For example, 5 = 10/2 = 5/1.
The remaining gaps on the real number line are filled by irrational numbers.
Irrational numbers cannot be expressed as the quotient of two integers. An
example is the value of π, which is 3.14159. . . . The square root of 5 (√5) is
another example of an irrational number, √5 = 2.236067 . . ..
To summarize: A rational number is the quotient of two integers, the denom-
inator not being equal to zero; an irrational number cannot be expressed as
the quotient of two integers; and a fraction is a rational number that is not
an integer.
The rational and irrational numbers together form the real number system.
The one-to-one relationship between the real number system and the real
number line means that we may use the terms “real number” and “point”
interchangeably. A real number is a point on the real number line. We omit
complex numbers, which involve i = √−1. Complex numbers can occur in
some analyses of dynamic systems, but they take us beyond the purview of this
text.
stance of the set or it is not, and a condition exists for determining which
of these is true. A set may be defined by either the “roster method” or the
“set-builder method.” Consider a simple set: A = {1, 2, 3, 4, 5}. This is
an application of the roster (or enumeration) method: the elements of
the set are listed. Such a set necessarily has a finite number of elements.
Notice the notation: A set’s name is typically a capital letter, and the rule
for identifying its elements is enclosed in curly brackets.
In the set-builder (or definition) approach, these brackets contain a rule
for identifying the elements. Consider B = {x | 0 < x < 100}. Read this as
“x such that x’s value is between 0 and 100 and does not include 0 or 100.”
Infinitely many real numbers qualify, so a set that is constructed with the
set-builder method can (but need not) be infinite.
Maxima uses the roster method, so the sets that it manipulates are finite,
though they can be very large. The three commands below build sets A,
B, and C. The resulting output appears below the commands. Compare
the third command with the third output line. Maxima has removed the
elements that repeat. It does this because a set consists entirely of distinct
elements; repetitions are not allowed. Also observe that Maxima writes the
elements of set B in alphabetic order, not in the order in which they were
entered. An important aspect of a set is that the sequence of its elements does not matter.
(%i) A:{1, 2, 3, 4, 5, 6, 7, 8, 9};
B:{red, orange, yellow, green, blue, indigo, violet};
C:{1,2,3,4,5,6,7,8,9,8,7,6,5,4,3,2,1};
(%o) {1, 2, 3, 4, 5, 6, 7, 8, 9}
{blue, green, indigo, orange, red, violet, yellow}
{1, 2, 3, 4, 5, 6, 7, 8, 9}
Consider one more case, one in which we might be tempted to think that the
set contains a single element. We enter these named expressions: X: a/c +
b/c, Y: a/c + b/c and Z: (a + b)/c as Maxima commands. The first two
expressions, assigned the names X and Y, are equivalent to each other and
have the same form. The third expression, Z, has the same value but not
the same form. Therefore, it is a distinct element of the set {X, Y, Z}. Either
of these two equivalent commands generates the relevant set: {X,Y,Z} or
set(X,Y,Z).4 The result is {(b + a)/c, b/c + a/c}.
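Collected into a single cell, the commands in this paragraph might look like the sketch below. The last two commands are additions for illustration: cardinality() counts the elements, and is(equal(...)) tests equality of value rather than of form.

```maxima
X : a/c + b/c$
Y : a/c + b/c$
Z : (a + b)/c$
{X, Y, Z};               /* X and Y collapse into one element; Z stays separate */
cardinality({X, Y, Z});  /* number of distinct elements */
is(equal(X, Z));         /* compares values, so the differing forms do not matter */
```

The set holds two elements because Maxima compares the forms of the expressions when it removes duplicates, not their simplified values.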
4
We have distributed the commands through this paragraph, rather than placing them
Equality
Two sets S1 and S2 are said to be equal or identical if and only if S1 and
S2 have exactly the same elements. The next exhibit considers four sets, S,
A, B, and C. Sets A, B, and C are subsets of S, but only C contains all of
S’s elements. Therefore, C = S, while A ≠ S and B ≠ S. Note that we have
entered some repetitions that do not appear in the first output line, which
identifies set S. Each element of a set must be unique.
The foregoing material requires defining a function named evens() and an-
other named odds(). The definition of these functions, which provides a way
for Maxima to emulate the set-builder approach to set building, is sketched
in the workbook that accompanies this chapter. For a more complete treatment,
see [13].
Next we confirm that Maxima can discover the relationships between sets
that we have asserted. The following commands are entered: [is(S=S),
is(S=C), is(A=S), is(B=S), is(B=C)]. Maxima treats each of these is(
... ) statements as a condition to be evaluated. The resulting output gives
the answers that we expect: [true, true, false, false, false].
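The exhibit that defines S, A, B, and C is not reproduced here. The sketch below is one hypothetical way to build four sets with the stated relationships. It assumes S holds the integers 1 through 10, and it uses Maxima's built-in subset(), evenp, and oddp in place of the evens() and odds() functions defined in the workbook.

```maxima
/* Assumed contents: S = {1, ..., 10}; A = its evens; B = its odds. */
S : setify(makelist(i, i, 1, 10))$
A : subset(S, evenp)$
B : subset(S, oddp)$
C : union(A, B)$
[is(S = S), is(S = C), is(A = S), is(B = S), is(B = C)];
```

Because Maxima stores sets in a canonical order without repetitions, is(S = C) compares the two sets element by element.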
Subsets
Union
A new set may be formed by the union of two sets. Let S1 and S2 be any
two arbitrary sets. The union of S1 and S2 consists of the elements that are
in S1, in S2, or in both S1 and S2. The notation is S1 ∪ S2, which is read
“S1 union S2.” In the example above, A ∪ B = C. In the next example,
Maxima determines the union of these two sets: all integers from 1 through
10 and the even integers from 2 through 20.
The commands below create set S1, which consists of the integers from 1
through 10, and set S2, which consists of the even integers from 2 through 20.
The third command forms the union of S1 and S2.
(%i) S1:setify(makelist(i,i,1,10));
S2:setify(makelist(2*i, i, 1,10)); union(S1,S2);
Intersection
Another way to form a new set is via the intersection of two or more sets.
The intersection of two sets (the definition easily extends to more than two)
S1 and S2 consists of the elements that are in both S1 and S2. The notation
S1 ∩ S2 is read “S1 intersection S2.” Formally, S1 ∩ S2 is equivalent to
{x|x ∈ S1 and x ∈ S2}. The intersection of S1 and S2 above is the set {2, 4,
6, 8, 10 }, which executing the command intersection(S1,S2) confirms.
The intersection of sets that share no common elements is the null set; such
sets are said to be disjoint. The command intersection({red, yellow,
green}, {up, over, out}) generates this output: {}, which is Maxima’s
notation for the null set.
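A short sketch of these operations follows. It assumes, in line with the stated result {2, 4, 6, 8, 10}, that S1 holds the integers from 1 through 10 and S2 the even integers from 2 through 20. The emptyp() and disjointp() tests are additions for illustration.

```maxima
/* Assumed: S1 = integers 1..10, S2 = even integers 2..20. */
S1 : setify(makelist(i, i, 1, 10))$
S2 : setify(makelist(2*i, i, 1, 10))$
intersection(S1, S2);
/* Sets with no common element: the intersection is empty. */
emptyp(intersection({red, yellow, green}, {up, over, out}));
disjointp({red, yellow, green}, {up, over, out});
```

disjointp() answers the question directly, without forming the intersection first.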
Set Difference
The union operator defines elements that are members of any of two or more
sets. The intersection operator defines elements that two or more sets share
in common. A third, related operator is the set difference operator, which
defines elements that are in the first set but not in the second set. Order
matters. A formal definition is this: Given any two arbitrary sets S1 and
S2, the set difference of S1 and S2 consists of the set of all elements that
belong to S1 but not to S2. Formally, S1 − S2 = {x | x ∈ S1 and x ∉ S2}.
Consider these three sets: X : {1, 2, 3}, Y : {3, 4, 5}, and Z : {1, 2, 5, 6, 7}.
The command setdifference(X,Y) produces the set that consists of the
elements that are in X but not in Y : {1,2}. In contrast, setdifference(Y,
X) produces the set that consists of elements of Y that are not in X: {4, 5}.
Other examples based on these sets appear in the workbook that accompanies
this chapter.
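The two commands in the text, together with two further illustrations, can be sketched as follows. The last two commands are additions and are not taken from the workbook.

```maxima
X : {1, 2, 3}$
Y : {3, 4, 5}$
Z : {1, 2, 5, 6, 7}$
setdifference(X, Y);    /* elements of X that are not in Y */
setdifference(Y, X);    /* elements of Y that are not in X */
setdifference(Z, X);    /* elements of Z that are not in X */
symmdifference(X, Y);   /* elements in exactly one of X and Y */
```

Unlike setdifference(), symmdifference() gives the same result whichever set is listed first.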
Multiple Sets
Universal Sets
A universal set includes all elements that are allowed by definition. If the
set consists of the potential results of flipping a coin, then the universal set is {heads,
tails} (assuming the coin never lands on its edge). Thus, a universal set is
a complete listing of all elements or outcomes that can be associated with a
particular action or situation.
If the contents of a set A are known and if the universal set is given, then we
can deduce the contents of a second set that complements the first set. The
complement to set A is A′ = U − A, where the prime indicates the complement. Alternatively,
the complementary set A′ is {x | x ∈ U and x ∉ A}.
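Because the complement is a set difference taken against U, it can be computed with setdifference(). The sets below are hypothetical, and Ac is simply a name chosen here for the complement.

```maxima
/* Hypothetical universal set and subset. */
U : setify(makelist(i, i, 1, 10))$
A : {2, 4, 6, 8, 10}$
Ac : setdifference(U, A);        /* the complement of A relative to U */
is(union(A, Ac) = U);            /* A and its complement exhaust U */
emptyp(intersection(A, Ac));     /* and they share no element */
```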
Exercise 2 - 1
2. Let S = {1, 2, 3}, T = {3, 4, 5}, V = {3, 2, 1}, and the universal set
U = {1, 2, 3, 4, 5}. Which of the following statements are correct? If a
statement is incorrect, correct it.
itive appeal of the Venn diagrams. However, the fact that they are algebraic
rather than graphical in character is advantageous in extended applications
of set theory. Throughout, we assume the existence of three sets–A, B, and
C–that are subsets of the universal set U . You might recognize the similar-
ity of many of these laws to those that provide the foundation for standard
algebra.
• Commutative Laws
(a) A ∪ B = B ∪ A
(b) A ∩ B = B ∩ A
• Associative Laws
(a) A ∪ (B ∪ C) = (A ∪ B) ∪ C
(b) A ∩ (B ∩ C) = (A ∩ B) ∩ C
• Distributive Laws
(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
• Idempotent Laws
(a) A ∪ A = A
(b) A ∩ A = A
• Identity Laws
(a) A ∪ ∅ = A
(b) A ∪ U = U
(c) A ∩ U = A
(d) A ∩ ∅ = ∅
• DeMorgan’s Laws
(a) (A ∪ B)′ = A′ ∩ B′
(b) (A ∩ B)′ = A′ ∪ B′
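These laws can be spot-checked in Maxima for particular sets. The sketch below uses hypothetical sets and a hypothetical helper comp() that returns the complement relative to U. A check of this kind illustrates the laws for one case; it does not prove them.

```maxima
/* Hypothetical sets; comp() is an illustrative helper, not a built-in. */
U : setify(makelist(i, i, 1, 10))$
A : {1, 2, 3, 4}$
B : {3, 4, 5, 6}$
C : {4, 6, 8, 10}$
comp(s) := setdifference(U, s)$
[is(intersection(A, union(B, C))
      = union(intersection(A, B), intersection(A, C))),   /* distributive (a) */
 is(union(A, intersection(B, C))
      = intersection(union(A, B), union(A, C))),          /* distributive (b) */
 is(comp(union(A, B)) = intersection(comp(A), comp(B))),  /* DeMorgan (a) */
 is(comp(intersection(A, B)) = union(comp(A), comp(B)))]; /* DeMorgan (b) */
```

Each entry of the resulting list should be true; changing the contents of A, B, and C gives further test cases.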
Reagan, Bush, and Clinton. As members of a set, the following would be true:
{Reagan, Bush, Carter, Clinton} = {Clinton, Bush, Reagan, Carter}.
2.5 Lists
A historian would not consider the two orderings in the set of presidents to
be the same. They would insist on a list of the presidents: [Reagan, Bush,
and Clinton] would be a chronological list; some other ranking might result
in a different list. A list is an ordered n-tuple of elements, and any of these
elements may consist of text strings, numbers, mathematical expressions,
sets, or other lists. Lists are the basic building blocks for computer algebra
systems like Maxima. The command pList: [Carter, Reagan, BushI,
Clinton,BushII,Obama] produces a list of presidents and assigns it the name
pList. With this information in Maxima’s memory, the command pList[3]
produces the output BushI.5
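The following sketch repeats the command from the text and adds a few standard list operations for illustration. The final two commands contrast a list, where order matters, with a set, where it does not.

```maxima
pList : [Carter, Reagan, BushI, Clinton, BushII, Obama]$
pList[3];                       /* third item: BushI */
first(pList);
last(pList);
length(pList);
is(pList = reverse(pList));     /* lists: a different order is a different list */
is({Carter, Reagan} = {Reagan, Carter});   /* sets: order is irrelevant */
```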
Lists can be used to assign names to expressions. These expressions can
involve computation or they can consist of strings. Also, a list can contain
another list or a set. The following set of input and resulting output shows
a four-item list. Each item in the list is bound to a member of a set of four
names. The command that creates the lists ends with a $, so that printing
is suppressed. The individual items are then recalled and printed. The four
commands in the second input line result in the four lines of output, one for
each of the named items in the list.
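The input described above is not reproduced in the text. One way to obtain this behavior, offered here as an assumption about what the cell may have looked like, is Maxima's parallel assignment of a list of values to a list of names. The four names and their contents are hypothetical.

```maxima
/* Parallel assignment; names and contents are illustrative only. */
[num, txt, inner, aSet] : [2 + 3, "a text string", [1, 2, 3], {x, y}]$
num; txt; inner; aSet;
```

The first line ends with a dollar sign, so nothing is printed until the second line recalls the four named items.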
will do), then the counter variable itself, and finally a start and end value
(additional arguments can be inserted; see the Maxima Manual ). Each of
the three commands below creates a list, to which a name has been assigned.
The names can be used to recall the list or items in the list.
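The three commands are not reproduced here. Judging from the output shown for sqrt(xList), xList/2, and xList^2, they may have resembled the sketch below; the list names come from the text, but the exact commands are an assumption.

```maxima
/* makelist(expression, counter, start, end) */
xList : makelist(i, i, 0, 10);
sqrtList : makelist(sqrt(i), i, 0, 10);
halfList : makelist(i/2, i, 0, 10);
xList[4];    /* recalls a single item, here the fourth */
```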
Maxima contains many built-in operations like sqrt(), sin(), and log()
that can be applied directly to the items in a list, as below. Also, some
operations like dividing all items by the same constant or raising them to
the same power can be applied directly to a list. The next three commands
show a quick way to create the items in sqrtList and halfList, along with
a way to square each item in xList.
(%i) sqrt(xList);
xList/2;
xList^2;
(%o) [0, 1, √2, √3, 2, √5, √6, √7, 2^(3/2), 3, √10]
(%o) [0, 1/2, 1, 3/2, 2, 5/2, 3, 7/2, 4, 9/2, 5]
(%o) [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
Some operations cannot be applied to a list this way. Maxima’s map() com-
mand can be applied in such cases. This command can also be used instead
of some of those that we have already seen. The next input/output group
shows how to apply some basic operations to two lists of equal length (be
aware that you must avoid illegal operations like dividing by zero). Note
the use of quotation marks to indicate the binary operation that is being
conducted.
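A minimal sketch of mapping binary operations onto two lists follows. The second list, yList, is hypothetical; it is built so that it contains no zero, which keeps the division legal.

```maxima
xList : makelist(i, i, 0, 10)$
yList : makelist(i + 1, i, 0, 10)$   /* hypothetical; contains no zero */
map("+", xList, yList);              /* element-by-element sums */
map("*", xList, yList);              /* element-by-element products */
map("/", xList, yList);              /* element-by-element quotients */
```

The operator appears in quotation marks because it is passed to map() as a name rather than written between two operands.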
The first command below takes the factorial of each term in xList. The
map() command requires the following: a specification of the operation to
be applied, in quotation marks, and the list or lists to which the operation
applies.
Operations or functions can be mapped onto a single list. The next
input/output group shows two ways to return the factorial of the values
in xList. The first approach directly applies the factorial command, !. The
second approach is to create a named function fact(x) and then to map that
function onto the values. Be aware of the slight differences in the syntax.
(%i) map("!",xList);
fact(x):=factorial(x)$ map(fact, xList);
(%o) [1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]
(%o) [1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]
The subst command can be used to determine the values of the expressions
above, given specified values of the parameters a and b.
To determine the minimum or maximum value, we must use the apply com-
mand, as below. The workbook provides details regarding this operation.
(%i) load(descriptive)$
mean(xList); std1(xList); smin(xList); smax(xList);
(%o) 5
(%o) √11
(%o) 0
(%o) 10
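The apply command mentioned above can be sketched as follows, assuming xList holds the integers 0 through 10. apply() passes the items of a list to a function as separate arguments; lmin() and lmax() are built-in alternatives that accept the list directly.

```maxima
xList : makelist(i, i, 0, 10)$
apply(min, xList);     /* same as min(0, 1, ..., 10) */
apply(max, xList);     /* same as max(0, 1, ..., 10) */
apply("+", xList);     /* sum of the items */
lmin(xList);
lmax(xList);
```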
2.6 Relations
Any ordered pair (triple, quadruple, and so forth) of values constitutes a
relation. Given the ordered n-tuple (x, y, z, . . .) a relation among the variables
exists whenever every set of values for any of n − 1 of the variables implies
one or more values for the remaining variable. For example, suppose that
x + 2 · y − 1.5 · z^2 = 0. Then specifying values for any two of the variables
implies one or more values for the third.
Using the command expr:x + 2*y - 1.5*z^2 = 0 we enter this expression.
We then assign values to two of the variables and solve for the third variable.
Assigning values to x and y can result in more than one z value. Assigning
values to x and z implies a single y value. Likewise, assigning values to y
and z implies a single x value.
To support these assertions, we use three commands that solve the expression
for each of the three variables in terms of the other two. The subst commands
below determine the implications of given values of y and z for x, of x and z
for y, and of x and y for z. For x and y, single values result; but for z, a list
of two values is reported.
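A sketch of the commands described above follows. The names solnX, solnY, and solnZ and the numerical values substituted are hypothetical. Maxima may print a note that it has replaced 1.5 with the rational number 3/2 before solving.

```maxima
expr : x + 2*y - 1.5*z^2 = 0$
solnX : solve(expr, x)$    /* a list containing one solution */
solnY : solve(expr, y)$    /* a list containing one solution */
solnZ : solve(expr, z)$    /* a list of two solutions, opposite in sign */
subst([y = 1, z = 2], solnX);
subst([x = 4, z = 2], solnY);
subst([x = 4, y = 1], solnZ);
```

The two entries in the last result differ only in sign, because z enters the expression as a square.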
3. Refer to the three sets in (2). Suppose that we replace {} with [],
indicating that the three quadruplets are lists, not sets. How would this
change affect the interpretation of the values?
This chapter builds on material from Chapter 2, which shows that variables
may be related via mathematical expressions. This chapter focuses on a
subset of those expressions, functions. It examines three types of functional
relationships. The first, explicit functions, exist when the value of some
variable is determined by an explicit relationship between that variable and
the value(s) of one or more other variables. The second, implicit functions,
exist when a set of two or more variables must jointly satisfy the condition
that a mathematical expression imposes. We saw an example of an implicit
function at the end of Chapter 2. Finally, two or more variables’ values
can be bound by the fact that each of these variables is functionally related
to another set of one or more variables, which are called parameters. Our
variables are said to be related via parametric equations (sometimes called
freedom equations).
Much of our illustrative analysis involves just two variables. In some cases,
the important relationship actually involves just two variables. In other
cases, the relationship can involve more than two variables, but the method
of approach can be outlined with the two-variable case and then extended.
Because of the importance of the two-variable case, this section begins with
an extension of the number line that Chapter 2 developed, showing two
variables’ values simultaneously with the use of rectangular coordinates. Oc-
casionally, we extend the analysis to include a third dimension. Furthermore,
the reasoning involved in creating rectangular coordinates in a plane can be
CHAPTER 3. RECTANGULAR COORDINATES AND FUNCTIONS 35
Both of the coordinate axes have the basic properties of any real number
line. Points to the right of the origin on the x axis and upward on the y
axis indicate positive values; points to the left of the origin on the x axis and
downward on the y axis represent negative values.
The two real number lines that form the coordinate system need not have the
same units of measurement. That is, the units in which they are measured
need not be the same. Indeed, often they cannot be the same. One axis might
measure profit and the other axis number of units sold, so the units for one
axis are dollars per time period and those for the other are some measure of physical units
per time period (the two time periods need not be the same). Or one axis
might measure price in terms of dollars per unit and the other axis quantity
in terms of physical units. This latter example describes the coordinate axes
that we use to conduct supply-and-demand analysis.
3.2 Functions
Consider these two equations: y = x^2 and y^2 = x. Both are equations, but
for only one of the two is y a function of x. When y = x^2, each value of x
implies a single value of y. Such is not the case when y^2 = x, for in that case
a particular x value can be related to either of two y values. For example,
when x = 4, y can be either -2 or +2, for squaring either yields a value of
4 and satisfies the equation. We summarize this introduction with a formal
definition: A function is a relation (a set of ordered pairs) such that no two
ordered pairs have the same first element.
A function is denoted by y = f (x), which is read “y is a function of x” (not
“y equals f times x”). Other functional notations that are frequently used
include g(x), h(x), and F (x). To create a functional expression in Maxima,
use notation like this: f(x, a) := (a*x^2). The next cell shows this ex-
pression when a’s value is not specified, again when each of two values of a
is specified, and finally when both a and x are specified. All five commands
are placed into a list, so the output also appears in a list. Be aware that the
values entered must be entered in the order specified in parentheses, x first
and then a.
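A minimal sketch of such a cell follows. The particular values (a = 2, a = 3, and the pair x = 4, a = 3) are our own illustrative choices, not necessarily those of the original exhibit.

```maxima
/* Five commands in one list: the definition, then four evaluations.
   Arguments are supplied in the declared order, x first and then a. */
[f(x, a) := a*x^2,   /* define the function                  */
 f(x, a),            /* a unspecified: a*x^2                 */
 f(x, 2),            /* a = 2: 2*x^2                         */
 f(x, 3),            /* a = 3: 3*x^2                         */
 f(4, 3)];           /* x = 4 and a = 3: 3*16 = 48           */
```

Reversing the arguments, as in f(3, 4), would give 4*3^2 = 36 instead, which is why the order matters.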
of all values that variable y may assume is referred to as the range of the
function.1
implicit(y^2=x,x,0,5,y,-sqrt(5),sqrt(5)))$
wxdraw(first,second, columns=2)$
1. For each of the following, determine the range of the dependent variable
y and the domain of the independent variable x.
(a) $y = 8 + x$ (b) $y = \sqrt{x}$ (c) $y = \sqrt{4 - x^2}$
(d) $y = \frac{1}{x^2 - 1}$ (e) $y = \frac{1}{8 - x}$
2. Write out a few (a, b) that are consistent with the following expressions.
Indicate which expressions can be cast as explicit functions with y as
the dependent variable.
(a) $y = x^2$ (b) $y = x^4$ (c) $y^2 = x$ (d) $y^3 = x$ (e) $y^4 = x$
(f) $y = \sqrt{x}$ (g) $y = 1/x$ (h) $y = \pi \cdot x^2$ (i) $x^2 + y^2 = 4$
2 2
(j) y + x = 1 (k) y + x = 9 (l) x = 3 (m) y = 1/2
3. Use the wxdraw2d command to graph the expressions in (h) and (k)
from the list above.
(%i) [f(x):=sqrt(x), g(z):=z+1, f(g(z))];
(%o) [f(x) := sqrt(x), g(z) := z + 1, sqrt(z + 1)]
Using the function notation shown above, f(x) := an expression, can greatly
facilitate the evaluation of a function for a number of values. Suppose that
we wish to evaluate a function for values of x from 51 through 58, or for a single value
of x. The first command below creates a list of x values and binds that list to
a name. The second command creates the functional expression. The third
command applies the function to the list of x values and assigns the result
to the name yList. Then the matrix command creates a table of values. The
final command applies the function to a single value, x = 81. The results
appear as a table and as a single value, 5 · 9 = 45.
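The five steps just described can be sketched as follows. The function itself is not shown in this excerpt; we assume f(x) = 5·sqrt(x), inferred from the product 5 · 9 reported for x = 81, and the name xList is our own.

```maxima
xList: makelist(i, i, 51, 58)$          /* x values 51 through 58          */
f(x) := 5*sqrt(x)$                      /* assumed functional form         */
yList: float(map(f, xList))$            /* apply f to every x in the list  */
transpose(matrix(xList, yList));        /* two-column table of (x, y)      */
f(81);                                  /* single value: 5*9 = 45          */
```

Wrapping the mapped list in float converts the exact results, such as 5*sqrt(51), into decimal approximations for the table.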
Exercise Set 2.2 Evaluate these expressions by hand and again with Maxima.
1. Given that f (x) = 100 + 7x find (a) f (0), (b) f (5), and (c) f (−10).
2. Given that f (x) = 10 − 4x find (a) f (1), (b) f (10) (c) f (a + h).
3. Given that $f(x) = x^2 + 4x - 6$ find (a) f(0), (b) f(10), (c) f(−2).
4. Given that $f(x) = 1/x^2$ find (a) f(2), (b) f(−4), (c) f(x + h).
5. Given that f (x) = 2x find (a) f (0), (b) f (3), (c) f (−3).
11. Given that $f(x) = 4x - x^2$ and $g(z) = 1/z$ find f(g(z)) and g(f(x)).
Polynomial Functions
• $y = 5 \cdot x + 3$ is a first-degree polynomial.
• $y = 5 \cdot x^2$ is a second-degree polynomial.
• $y = 5 \cdot x + 5 \cdot x^2$ is also a second-degree polynomial.
• $y = 5 \cdot x + 5 \cdot x^2 + 5 \cdot x^5$ is a fifth-degree polynomial.
Note the presence of radcan in the third example. Copy this cell, remove the radcan
(don’t forget to remove both parentheses) and see what difference occurs. The term
radcan stands for radical canonical.
Figure 3.3 shows an example of four specific types of polynomial functions, all
of which will be used in examples in this book and all of which are frequently
used to analyze and illustrate economic and business issues. The general
expressions for the four are these: constant, $y = a \cdot x^0 = a$; linear, $y = a + b \cdot x$;
quadratic, $y = a + b \cdot x + c \cdot x^2$; and cubic, $y = a + b \cdot x + c \cdot x^2 + d \cdot x^3$.
We have already seen examples of constant and linear functions, both in the
simple Keynesian model. We will encounter all of these functional forms as
we encounter various models, such as those of cost curves, that require the
additional flexibility that terms with larger exponents provide.
If we can observe any two points on a straight line, then we can compute the
line’s slope and intercept using techniques that you learned in high-school
algebra. Likewise, if we can observe any three points, then we can compute
the coefficients of a quadratic equation. Finally, any four points determine
the coefficients of a cubic equation.⁴ The output below shows an example
of using Maxima to determine the coefficients of a quadratic equation and
plotting that equation.
The values of the (x, y) pairs were selected arbitrarily. For each x value, this
must be true: $y = a + b \cdot x + c \cdot x^2$, so the first list in the solve command
is a list of three points at which that statement must be true if a, b, and c
are indeed the proper coefficients. The second list is the list of unknowns
for which we seek values. The solve output consists of a list with another
list embedded. The expression that we name expression is obtained by
extracting the inside list, by using the [1] after the %. The % symbol refers
to the output that results from the most recent command. Thus we are
inserting this list: [a = −90, b = 12, c = −1/4] into the expression for the
quadratic equation. As exercises, repeat this process in order to define the
coefficients of a linear polynomial and a cubic polynomial and to graph the
implied polynomials.
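A sketch of the procedure just described follows. The three (x, y) pairs are hypothetical: we chose them so that the solution reproduces the coefficients quoted in the text, and the name soln is ours (the text uses % to refer to the preceding output instead).

```maxima
/* Each equation states y = a + b*x + c*x^2 at one observed point:
   (10, 5), (20, 50), and (30, 45) are illustrative values only.   */
soln: solve([ 5 = a + b*10 + c*10^2,
             50 = a + b*20 + c*20^2,
             45 = a + b*30 + c*30^2], [a, b, c]);
/* soln is a list containing one list of equations; [1] extracts it. */
expression: subst(soln[1], a + b*x + c*x^2);
/* Plot the fitted quadratic (wxMaxima, as elsewhere in this text). */
wxdraw2d(xlabel = "x", ylabel = "y", explicit(expression, x, 0, 40))$
```

For these points the solution is a = −90, b = 12, c = −1/4. The same pattern, with two equations or four, handles the linear and cubic exercises.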
Sometimes, we know the value at a point and we know (or have a good
estimate of) the slope at that point and we wish to use that information to
determine the linear function that passes through that point. The cell below
shows an example, where a line with a slope of -2.25 passes through the point
(20, 100). The first command solves the relevant equation and assigns the
solution the name soln. The second command substitutes a = 145 into the
expression y = a−2.25·x to yield the expression for the linear equation. The
wxdraw2d command draws the line over a range of x values, yielding Figure
3.6. The drop line shows the designated point through which the line passes.
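The commands might look like the following sketch. The name line and the plotting range are our own choices; the original cell may differ in its details.

```maxima
/* The line y = a - 2.25*x must pass through (20, 100); solve for a.
   Maxima may report that it replaced 2.25 by the rational 9/4.      */
soln: solve(100 = a - 2.25*20, a);          /* [a = 145]              */
/* Substitute the intercept into the general expression.             */
line: subst(soln, a - 2.25*x);              /* 145 - 2.25*x           */
/* Draw the line and mark the designated point.                      */
wxdraw2d(xlabel = "x", ylabel = "y",
         explicit(line, x, 0, 40),
         point_type = filled_circle, points([[20, 100]]))$
```

The arithmetic is easy to check by hand: 100 = a − 2.25 · 20 = a − 45, so a = 145.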
The quadratic form is general enough to apply to many economics and busi-
ness issues, so examining it in more detail is warranted. We begin with the
quadratic formula, which is shown below.
This formula provides a rule to find the roots of a quadratic equation. That
is, it shows the values of x for which y = 0.⁵ We see that two such values seem
to occur. We say “seem to” because our attention is limited to real numbers.
If $b^2 - 4 \cdot a \cdot c < 0$, then no real solutions occur. Also, if only positive x values
make economic sense, then the number of solutions can be 0, 1, or 2. The
examples below show three quadratic equations. The first, named p1, has no
real solutions; the second, p2, has two real solutions but only one for x > 0;
and the third, p3, has two real solutions for positive values of x.
The term $b^2 - 4 \cdot a \cdot c$ is called the discriminant: it discriminates between
equations with no real solutions and those with at least one real solution.
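The three cases can be illustrated with a short sketch. The polynomials below are hypothetical stand-ins for p1, p2, and p3, chosen only to exhibit the three behaviors; disc is a helper name of our own.

```maxima
/* Discriminant: (coefficient of x)^2 - 4*(coefficient of x^2)*(constant). */
disc(p) := coeff(p, x, 1)^2 - 4*coeff(p, x, 2)*coeff(p, x, 0)$

p1: x^2 + x + 1$        /* discriminant -3: no real roots          */
p2: x^2 - x - 6$        /* discriminant 25: roots -2 and 3         */
p3: x^2 - 5*x + 6$      /* discriminant  1: roots 2 and 3          */

map(disc, [p1, p2, p3]);
map(lambda([p], solve(p = 0, x)), [p1, p2, p3]);
```

For p1, solve still returns two roots, but both involve %i, Maxima's symbol for the imaginary unit, which is how a negative discriminant shows up in practice.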
A firm’s cost curve is often represented by a cubic equation like the one
graphed below. If $TC = 50 + 20 \cdot x - 2 \cdot x^2 + 0.25 \cdot x^3$, then its marginal cost
⁵ Factoring a quadratic and completing the square are two other ways to find its roots.
Every quadratic equation can, however, be solved by using the quadratic formula.
1. $x^2 - x - 6 = 0$
2. $x^2 - 25 = 0$
3. $x^2 + 6 \cdot x + 8 = 0$
4. $3 \cdot x^2 + 7 \cdot x - 3 = 0$
5. $x^2 - 4 \cdot x + 4 = 0$
6. $x^2 + x - 12 = 0$
7. $x^2 - 5 \cdot x + 3 = 0$
8. $x^3 - 5 \cdot x^2 - x + 5 = 0$
been applied to learning curves and to the fraction of a population that has
adopted a new product or technology. The function is $f(t) = K/(1 + e^{a + b \cdot t})$.
The value K defines the upper limit. The parameters a and b (b < 0)
determine the function’s curvature and height properties. Euler’s number is
e, and t is time. The cell below defines a function for which K = 0.8, a = 0,
and b = −0.2, shows some values of that function, and graphs the function.
The map command is used to map the function onto the list of t values.
For this set of values, the population has been growing and has reached one-
half its maximum size when observations begin (t = 0). When y is relatively
small, the per-period growth rate is high and, at first, increasing. Then the
growth rate decelerates, asymptotically approaching zero. In this illustration y
is only very slightly less than K, its maximum value, by year 30.
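A sketch of such a cell appears below, assuming the logistic form K/(1 + e^(a + b·t)), which is consistent with the statement that the function equals K/2 at t = 0 when a = 0. The list name tList and the five-year spacing are our own choices.

```maxima
K: 0.8$  a: 0$  b: -0.2$                 /* parameter values from the text   */
f(t) := K/(1 + exp(a + b*t))$            /* logistic function                */
tList: makelist(5*i, i, 0, 6)$           /* t = 0, 5, ..., 30                */
transpose(matrix(tList, float(map(f, tList))));
wxdraw2d(xlabel = "t", ylabel = "y", yrange = [0, 1],
         explicit(f(t), t, 0, 30))$
```

At t = 0 the exponent is zero, so f(0) = 0.8/2 = 0.4; at t = 30 the exponent is −6 and f(30) is within about 0.002 of K.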
1. Sketch on the same set of axes the graphs of the exponential functions
$y = b^x$ for these values of b: 4, 8, and 12.
2. Sketch on the same set of axes the graphs of the exponential functions
$y = a \cdot b^{c \cdot x}$, with a = 5 and c = 1, for these values of b: 0.3, 1, and 7.
3. Draw the graph of this logistic curve: $y = \frac{3}{1 + e^{1 - 2 \cdot x}}$. Graph the function
for −2 ≤ x ≤ 5.
1. $10^0 = 1 \Leftrightarrow \log_{10} 1 = 0$
2. $10^1 = 10 \Leftrightarrow \log_{10} 10 = 1$
The allowable bases for logarithms are positive, and raising a positive number
to any power yields a positive number. This fact prohibits logarithms of
negative numbers. We can extract the logarithm of quite small positive
values like 0.001, for which the base-ten logarithm is -3, as Example 5 above
shows.⁶
The logarithmic function permits any positive real number as the base b.
Even so, almost all analysis is conducted with b = 10 or b = e ≈ 2.718. Ten
is the base for a system called “common logarithms,” and e is the base for a
system called “natural logarithms” or “Napierian logarithms” (Napier is the
mathematician who developed this system).
A widely-used convention in science texts is to use the notation log x to refer
to common logarithms and ln x to refer to natural logarithms. In contrast,
mathematics textbooks use log x to refer to natural logarithms and $\log_{10} x$ to
refer to common logarithms. We use the mathematics convention
with one modification, the use of parentheses: log(x) rather than log x. This
modification accommodates the way Maxima works: it can take logarithms
of expressions as well as numbers, so log is a command for which
the argument must be entered in parentheses. The results of the commands
log(100), log(100.0), log(10.0), and log(100.0)/log(10.0) appear below.
(%i) matrix(["log(100)","log(100.0)","log(10.0)",
"log(100.0)/log(10.0)"],[ log(100),log(100.0),
log(10.0),log(100.0)/log(10.0)]);
(%o)
log(100)   log(100.0)   log(10.0)   log(100.0)/log(10.0)
log(100)   4.6051       2.3025      2.0
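Because Maxima's log is the natural logarithm, a base-ten logarithm has to be built from the change-of-base rule, as the last column above does. A small helper makes this convenient; the name logBase10 is our own and is not a built-in Maxima function.

```maxima
/* Change of base: log10(x) = log(x)/log(10). float forces a decimal
   result even when the argument is an exact integer.                */
logBase10(x) := float(log(x)/log(10))$
[logBase10(100), logBase10(10), logBase10(0.001)];
```

The three results should be very close to 2, 1, and −3; tiny floating-point discrepancies in the last digits are possible.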
The cell below shows some important laws that govern the behavior of loga-
rithms. The command logexpand:super forces Maxima to expand some of
the expressions rather than just evaluating them and reporting them as they
were originally expressed. The last column applies for natural logarithms,
reflecting the general relationship that whenever $x = b^y$, then $\log_b x = y$.
(%i) logexpand:super$ exprList:[a*b,a/b,1/b,a^b,1/a,exp(a)]$
logList:log(exprList)$ matrix(exprList,logList);
(%o)
a b               a/b               1/b       a^b        1/a       e^a
log(b) + log(a)   log(a) − log(b)   −log(b)   log(a) b   −log(a)   a
The relationship between the graph of an exponential function such as $y = b^x$
and its inverse function $y = \log_b x$ can be illustrated graphically. The graph
below shows this symmetry, using the exponential function $y = e^x$ and its
inverse function y = log(x).
2. Determine the range of values of x for which the following functions are
defined. (a) y = log(x + 8) (b) y = log(8 − x) (c) $y = \log(x^2 - 4)$
(d) $y = \log(25 - x^2)$ (e) y = log(x − 9) (f) $y = \log(x^3 + 8)$
3. Using the laws concerning the use of logarithms express each of the
following as a single logarithm: log(x) + log(y) + log(z).
Inverse Functions
ylabel="y", key=string(expr1),explicit(expr1,x,0,8))$
decrease: gr2d( xaxis=true,yrange=[-10,30],xlabel="x",
ylabel="y",key=string(expr2),explicit(expr2,x,0,8))$
wxdraw(increase,decrease,columns=2)$
Before we graph the function and its inverse, we determine the range of the
inverse function. Doing so helps guide the graphing of the two functions.
Figure 3.14 shows that the two functions, except for scale, are mirror images
of each other. We have added reference lines that take into account the
difference in scale for the two variables. In the first, the reference line is
y = 4000 · x, and in the second, the line is the equivalent, x = y/4000. We have
extended the axes slightly from the values shown above in order to emphasize
the similarity of these two functions.
For emphasis, we repeat that $f^{-1}$ is not the same as $(f)^{-1}$, which is the
reciprocal of the expression on the right-hand side of the initial function. The
difference between the two is shown below, as created with Maxima’s print
command.
The inverse of f(x) = 1/y = 1/(125 · 2^x). The inverse function is log(y/125)/log(2).
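The distinction can be reproduced with a brief sketch, assuming (from the printed output) that the function is f(x) = 125 · 2^x. The names reciprocal and inverse are ours.

```maxima
f(x) := 125*2^x$
reciprocal: 1/f(x);                  /* (f)^(-1): one over the expression   */
inverse: solve(y = f(x), x);         /* f^(-1): solve y = f(x) for x        */
/* Round trip: f(3) = 1000, so the inverse function should return 3. */
float(subst(y = f(3), rhs(inverse[1])));
```

We expect solve to return a single solution of the form x = log(y/125)/log(2), matching the printed output, although the exact layout can vary across Maxima versions.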
Interpret $\prod_{i=1}^{10} a_i$ as $a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9 a_{10}$
The first table below shows the results of the following instructions (see the
workbook for details):
• Sum the second through sixth values in the list of squared values.
• Sum k times each of the ten squared values. This illustrates the homogeneity
property of summation.
• Sum the list of nine first differences, confirming that this sum equals
the last value less the first value in the original list of squared values.
This result is an example of the telescoping property of addition. For
any list $[a_1, a_2, \cdots, a_n]$, it must be true that $(a_2 - a_1) + (a_3 - a_2) + \cdots + (a_n - a_{n-1}) = a_n - a_1$.
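The instructions listed above can be sketched as follows. We assume the ten squared values are the squares of 1 through 10; the workbook's list may differ, and the names sqList and diffList are ours.

```maxima
sqList: makelist(i^2, i, 1, 10)$                 /* 1, 4, 9, ..., 100        */
sum(sqList[i], i, 2, 6);                         /* 4+9+16+25+36 = 90        */
sum(k*sqList[i], i, 1, 10);                      /* homogeneity: 385*k       */
diffList: makelist(sqList[i + 1] - sqList[i], i, 1, 9)$
/* Telescoping: the nine differences sum to last minus first. */
[sum(diffList[i], i, 1, 9), sqList[10] - sqList[1]];    /* [99, 99] */
```

The second result factors as k times 385, the sum of the ten squares, which is the homogeneity property in action.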
The next table shows the effects of the same sets of commands, except that
product replaces sum, and that ratios of adjacent values replace differences
between adjacent values.
Observe that the homogeneity property of multiplication implies that multiplying
each of a set of n values by k and then multiplying the results yields a
result that is $k^n$ times as large as the product of the initial values.
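The multiplicative counterparts can be checked the same way. As before, the list of squares of 1 through 10 is our assumption, and vals, nVals, and ratios are our own names.

```maxima
vals: makelist(i^2, i, 1, 10)$
nVals: length(vals)$
/* Homogeneity: scaling every factor by k scales the product by k^10. */
product(k*vals[i], i, 1, nVals) / product(vals[i], i, 1, nVals);
/* Ratios of adjacent values telescope to last/first = 100/1.         */
ratios: makelist(vals[i + 1]/vals[i], i, 1, nVals - 1)$
[product(ratios[i], i, 1, nVals - 1), vals[nVals]/vals[1]];
```

The first result is k^10 because there are ten factors; the second list contains the same number twice.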
Observe also that a variant of the telescoping property also applies. The
product of the ratios created as in the table below equals the terminal value
1. For each of the following functions, sketch the graphs of $f$ and $f^{-1}$ and
determine the range and domain for both the function and its inverse.
(a) $y = 5 + 4 \cdot x$ (b) $y = 3 - 2 \cdot x$ (c) $y = \sqrt{2 \cdot x + 8}$
(d) $y = \frac{x}{x+1}$ (e) $y = x$ (f) $y = 2 \cdot x - 3$
2. Evaluate each of the following sums, both by hand and using Maxima.
(a) $\sum_{j=40}^{44} j$ (b) $\sum_{k=1}^{4} 2^{k-1}$ (c) $\sum_{i=0}^{5} (-1)^i$
(d) $\sum_{k=1}^{4} \frac{k-1}{k+1}$ (e) $\sum_{i=1}^{8} (3 \cdot i + 5)$
4. Evaluate each of the following products, both by hand and using Maxima.
(a) $\prod_{j=40}^{42} j$ (b) $\prod_{j=2}^{5} x^j$ (c) $\prod_{i=5}^{5} i^2$
(d) $\prod_{i=1}^{4} (3 \cdot i + a)$
4. Given the following price and quantity demanded data for pizza.
(a) Find the equation of the demand function for pizza such that Q =
f (P ).
(b) What are the slope and the ordinate (intercept) of the demand
function? Beware: Q is the “y” variable given the way the function is
expressed.
10. Suppose that the average annual earnings for members of a profession
is represented by this function: $\log(y) = f(x) = 12.5 + 0.07 \cdot x - 0.001 \cdot x^2$,
where y is average earnings and x is years of experience.
(a) Graph this function, with y (not log(y)) on the vertical axis.
(b) Assume that the average age at which members of this profession
begin to practice is 25. What factor(s) might account for the downturn
in earnings at x = 35?
Chapter 4
CHAPTER 4. LIMITS, CONTINUITY, AND DIFFERENTIABILITY 71
We use the methods of integral calculus when we know the form of the
function that describes changes and we wish to know the form of the function
that describes the level of the variable in question. An important use of
integral calculus is related to the fact that we can use an integral to find the
area under a particular curve or function.
This chapter and the next three address the differential calculus and its
applications. Chapter 8 introduces integral calculus. Before we learn how
to take derivatives and to use them, we develop the concepts of limits and
continuity. The concept of limits is critical to understanding how to interpret
derivatives. Whether or not a function is continuous determines whether
differential calculus methods can be applied.
4.1 Limits
We have already seen one example of a limit. When we examined the logistic
function, we saw that a population grows toward a limiting value. Thinking
of that example gives us a basis for a formal definition of the term. Begin
with a function y = f(x). If f(x) approaches some number A as x approaches
some value $x_0$, then A is said to be the limit of f(x) as x
approaches $x_0$. The standard notation for the preceding statement is this:
\[
\lim_{x \to x_0} f(x) = A.
\]
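A numerical sketch of this definition: evaluate f(x) = x^2 at points that approach x0 = 3 from below and from above, then compare with Maxima's limit command. The step sizes and list names here are our own illustrative choices.

```maxima
f(x) := x^2$
stepList: [1, 0.1, 0.01, 0.001, 0.0001]$
belowList: 3 - stepList$                 /* x values approaching 3 from below */
aboveList: 3 + stepList$                 /* x values approaching 3 from above */
/* Columns: x below, f(x) below, x above, f(x) above. */
transpose(matrix(belowList, map(f, belowList), aboveList, map(f, aboveList)));
limit(f(x), x, 3);                       /* 9 */
```

Both columns of function values close in on 9, the value that limit reports.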
(%i) f(x):=x^2$
diffList: [3, 2, 1, 0.1, 0.01, 0.001, 0.001, 0.0001]$
smallList: 3 - diffList$ largeList: 3 + diffList$
The last line above shows the result of applying the float command with
n approaching infinity. The first result shows the exact value; the second
shows its floating-point representation.
Example. Consider a generalization of the example above. Suppose that a
sum of money accumulates for t years at an annual rate of r. The sum plus
interest is compounded n times per year. The next cell shows the formula for
the value after t years and it shows the limit of that formula as the number
of compounding periods per year approaches infinity. This result shows that
Maxima’s limit command is not restricted to finding numerical values. It
can also find limits of expressions, where the limit is another expression.
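A sketch of this calculation follows. The symbol y0 for the initial sum and the name fv are our own, and we add assume statements so that Maxima does not need to ask about the signs of r and t.

```maxima
assume(r > 0, t > 0)$                 /* rate and horizon are positive       */
fv: y0*(1 + r/n)^(n*t);               /* value after t years, n periods/year */
limit(fv, n, inf);                    /* continuous compounding              */
```

The limit should be y0 multiplied by %e^(r*t), where %e is Maxima's notation for Euler's number.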
(%i) [limit((x^2+1)/(x-1),x,1,plus),
limit((x^2+1)/(x-1),x,1,minus)];
wxdraw2d(yrange=[-100,100],xlabel="x", ylabel="y",
key="(x^2 + 1)/(x - 1)",
explicit( (x^2+1)/(x-1), x, 0, 3 ) )$
(%o) [∞, −∞]
³ We leave as an exercise the determination of y values for x values that are ever closer
to 1. Follow the development of the first example.
"ratio","times A"]$
limitsList: [limit(f(x),x,100.0),limit(g(x),x,100.0),
limit(f(x)+g(x),x,100.0),
limit(f(x)-g(x),x,100.0),limit(f(x)*g(x),x,100.0),
limit(f(x)/g(x),x,100.0),
limit(K*f(x)*g(x),x,100.0)]$
matrix(nameList, limitsList);
(%o)
A        B       sum     difference   product   ratio     times A
1.9999   320.0   322.0   −318.0       640.0     0.00625   640.0 K
EXERCISE 3.1
In each of Exercises 1 through 15, evaluate the limits. Try to determine the
limits yourself before having Maxima provide the value. To keep the exercises
compact we use $\lim_{x \to A} f(x)$ rather than the equivalent
\[
\lim_{x \to A} f(x).
\]
(%o) f(x, B) := A/x
(%o) [0, 0, infinity, infinity]
(%o) [0, 0, ∞, −∞] (%o) [0, 0, −∞, ∞]
This exhibit shows how the function is defined. The commands specify three
sets of limits. In the first set A’s sign is not specified; in the second, A > 0,
and in the third A < 0. In each case we seek limits as x becomes a very
large positive number (inf) and a very large negative number (minf). Also,
we seek limits as x approaches zero from above and from below. In all cases,
A/x approaches 0 as x becomes either a very large positive number or a very large
negative number. Thus the limit of A/x is zero as x approaches either infinity or
negative infinity.
The behavior as x approaches zero depends on the sign of A. If that sign
is not specified, Maxima provides the ambiguous response infinity, which
does not specify a sign. The second and third output lists show the source of
this ambiguity: whether the limiting value of A/x is positive or negative for
a specified limiting value of x depends on A’s sign. Figure 4.3 adds insight
into the behavior of this function.
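The three sets of limits can be generated with a sketch like the one below. It uses a one-argument version of the function and a helper, lims, both of which are our own simplifications of the exhibit.

```maxima
f(x) := A/x$
/* Limits at +infinity, -infinity, zero from above, zero from below. */
lims() := [limit(f(x), x, inf), limit(f(x), x, minf),
           limit(f(x), x, 0, plus), limit(f(x), x, 0, minus)]$
lims();                                   /* sign of A unspecified */
assume(A > 0)$  lims();                   /* A positive            */
forget(A > 0)$  assume(A < 0)$  lims();   /* A negative            */
forget(A < 0)$                            /* clean up              */
```

The forget commands matter: without them the earlier assumption about A would remain in force and conflict with the next one.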
(%o) [1/((1 − a)/y^b + a/x^b)^(1/b), x^a · y^(1−a)]
The first output entry shows the CES function. The second entry shows the
limit of this function as b → 0. The result is the Cobb-Douglas function.
4.3 Continuity
The first step in determining and interpreting the derivative of a function
is to have a clear understanding of the concept of a limit. The second step
involves using the concept of continuity of a function. Very roughly speaking, a
continuous function on a particular interval is one whose graph can be drawn
without lifting one’s pencil or pen from the paper in that interval. Figure
4.4 below shows two continuous functions and two discontinuous functions.⁷
Despite the kink at x = 50, the v-shaped function in the first panel is contin-
uous. Both functions in the second panel exhibit discontinuities. Following a
representation of the function and then evaluates the limit of that series. We use this
because on some installations of Maxima, applying the limit command to this expression
resulted in an overflow error.
⁷ The second of the two explicit commands used to generate the first discontinuous
graph contains a “fudge” term in that the second segment begins at 50.001 and not 50.
The reason for this is that draw would connect the points if 50 were both the endpoint of
the first segment and the beginning point of the second segment.
The next exhibit shows the new, augmented function for which the domain
spans the real number line. Selected values of this function, named faug(x),
appear as the output. The discontinuity has been removed.
An example much like the one above is the one that applies to periodic
compounding: We have already evaluated the limit of this function when
n → ∞, finding that the result is $y_0 \cdot e^{r \cdot t}$. Use Maxima to confirm that the
limit as n → 0 is $y_0$. Also confirm that the function is not defined at n = 0
(of course, n < 0 makes no economic sense). The fact that n = 0 is not part
of the domain actually makes economic sense: if n = 0, compounding never
occurs.
Infinite discontinuity
The next exhibit shows two functions that have infinite discontinuities. As
x approaches a critical value (0 here) the functions’ values become either
very large and positive or very large and negative. The one-sided limits may differ or they may be
the same (that is, both ∞ or both −∞). Do not conclude that the fact that the limit
of $1/x^2$ is the same from either direction implies continuity. Of the three
conditions that must be met for a function to be continuous, $1/x^2$ fails the
first two: 0 is not in the domain, and the limit is not a finite value.
wxdraw(different,same, columns=2)$
(%o) [f(x) limits:, infinity, ∞, −∞]
(%o) [g(x) limit:, ∞]
The first output line above reminds us that Maxima cannot determine the
limit of 1/x as x → 0 unless the direction is provided. The second output
line shows the same is not true of $1/x^2$: g(x) → ∞ regardless of the direction
from which zero is approached.
explicit(sin(1/x),x,-.1,0),color=black,
key="Approaching 0 from above",
explicit(sin(1/x),x,0,.1) )$
Exercise 3-3
Determine the values of x for which the following functions are continuous.
If a discontinuity exists, determine the type of the discontinuity, and correct
any removable discontinuities.
\[
m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{f(x_2) - f(x_1)}{x_2 - x_1}.
\]
second row shows the implied x value; the third row shows the implied y
values. The last two rows show the implied changes in y, ∆y, and the quo-
tients, ∆y/∆x. For this linear function all values on the last row equal the
coefficient of x.
We cannot always use linear functions in decision and choice problems. When
the underlying function is nonlinear, we must determine the slope of the
function at a particular point of interest, since the slope of the function
differs at different points on that function. Consider the nonlinear function
$y = f(x) = x^2$. We commence by selecting a particular point on the graph of
that function, namely $(3, 9) = (x_1, y_1)$. Then, as in the preceding example,
we generate a list of values of a variable $x_2$ by specifying changes for which
the absolute values decrease as we move toward the middle of the
table. The table shows that the difference quotient moves toward 6 as we
move closer to the point (3, 9). That quotient is not defined at $x_2 = 3$ because
that value implies ∆x = 0.
The difference quotient method helps us find the slope of a line between two
points. It measures an average rate of change rather than the slope of the
function at a specific point. Given two points (x1 , y1 ) and (x2 , y2 ), ∆y/∆x
measures the average rate of change in y that occurs over the interval per
unit change in x. Commands are omitted because they are virtually the same
as those for the preceding exhibit.
x changes   x values   y values   y changes   difference quotients
−2          1          1          −8          4
−1          2          4          −5          5
−0.5        2.5        6.25       −2.75       5.5
−0.2        2.8        7.8399     −1.16       5.8
−0.1        2.9        8.41       −0.589      5.8999
−0.01       2.99       8.9401     −0.0598     5.9899
0           3          9          0           −−−
0.01        3.01       9.06       0.06        6.0099
0.1         3.1        9.61       0.61        6.1
0.2         3.2        10.24      1.24        6.2
0.5         3.5        12.25      3.25        6.5
1           4          16         7           7
2           5          25         16          8
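A table of this kind can be generated with a sketch like the following. The names x1, dxList, and dq are ours, and we omit the ∆x = 0 row because the quotient is undefined there.

```maxima
f(x) := x^2$
x1: 3$                                    /* the point of interest, (3, 9) */
dxList: [-2, -1, -0.5, -0.2, -0.1, -0.01, 0.01, 0.1, 0.2, 0.5, 1, 2]$
dq(dx) := (f(x1 + dx) - f(x1))/dx$        /* difference quotient           */
/* Columns: change in x, x value, y value, change in y, quotient. */
transpose(matrix(dxList, x1 + dxList, map(f, x1 + dxList),
                 map(f, x1 + dxList) - f(x1), map(dq, dxList)));
```

For f(x) = x^2 the quotient simplifies to 6 + ∆x, so each entry in the last column differs from 6 by exactly the change in x, apart from floating-point rounding.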
Both the linear example and the quadratic example illustrate that as the
change in variable x, ∆x, becomes increasingly small, approaching 0 in the
limit, the difference quotient approaches some finite value as a limit. In the
case of $y = x^2$, as x → 3, ∆y/∆x → 6. This is true whether we
approach x = 3 from above or from below. When this is the case, this limit
is called the derivative of y = f(x) with respect to x. This derivative is
denoted dy/dx and is defined as follows: Given the function y = f(x), the
derivative of y with respect to x is
\[
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x},
\]
provided that the limit exists.
This definition of a derivative still measures a rate of change; however, the
rate of change is measured over an infinitesimally small change in variable x. For that reason,
a derivative may be thought of intuitively as being taken at a particular
point on a curve. We apply the definition of a derivative to the functions
$y = 10 + 2 \cdot x$ and $y = x^2$.
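For the linear function
\[
\frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
= \lim_{\Delta x \to 0} \frac{(10 + 2 \cdot (x + \Delta x)) - (10 + 2 \cdot x)}{\Delta x}
= \lim_{\Delta x \to 0} \frac{(10 + 2 \cdot x + 2 \cdot \Delta x) - (10 + 2 \cdot x)}{\Delta x}
= \lim_{\Delta x \to 0} \frac{2 \cdot \Delta x}{\Delta x} = 2.
\]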
This confirms that the slope of a linear function is the same for all values of
the independent variable.
For the second function, $y = x^2$
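The derivation for the quadratic is not reproduced in this excerpt. A standard version, applying the same definition step by step, runs as follows.

```latex
\[
\frac{dy}{dx}
= \lim_{\Delta x \to 0} \frac{(x + \Delta x)^2 - x^2}{\Delta x}
= \lim_{\Delta x \to 0} \frac{x^2 + 2 \cdot x \cdot \Delta x + (\Delta x)^2 - x^2}{\Delta x}
= \lim_{\Delta x \to 0} \left( 2 \cdot x + \Delta x \right)
= 2 \cdot x.
\]
```

At x = 3 this gives a slope of 6, the value toward which the difference quotients in the table converge. Unlike the linear case, the slope depends on x.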
Exercise 3-4
For the following functions, find the derivative, dy/dx = df (x)/dx, by evalu-
ating the limit of the difference quotient. Then use Maxima to confirm your
results.
1. $f(x) = x^3$ 2. $f(x) = x^2 + 3 \cdot x + 4$ 3. $f(x) = 4 \cdot x - 1$
4. $f(x) = 4 - 3 \cdot x$ 5. $f(x) = a \cdot b \cdot x$ 6. $f(x) = b \cdot x^2$
7. $f(x) = 144 - 32 \cdot x$
The slope of the line that is tangent to f (x) at x = 3 is the limiting value of
chords like these two, when ∆x → 0.
The limit command confirms that the limiting ratio of ∆y to ∆x is 2 · x.
This command, using h to denote ∆x, shows that the limiting slope of the
chords is indeed 2x. The direction from which ∆x → 0 does not affect the
outcome and, therefore, does not need to be specified in the limit command.
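A sketch of the command described here, with h standing for ∆x and f(x) = x^2 taken from the preceding discussion; the one-sided versions are added only to illustrate that the direction does not matter.

```maxima
f(x) := x^2$
limit((f(x + h) - f(x))/h, h, 0);                  /* 2*x */
/* One-sided limits agree, so no direction is needed. */
[limit((f(x + h) - f(x))/h, h, 0, plus),
 limit((f(x + h) - f(x))/h, h, 0, minus)];         /* [2*x, 2*x] */
```

Replacing the definition of f with one of the functions from Exercise 3-4 gives a quick check on hand calculations.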
Refer to Figure 4.5 to recall this function. The value x = 0 is not part of
this function’s domain. Maxima can return a derivative for this function,
the messy term below. As it turns out, however, this derivative cannot be
evaluated for x = 0. Trying to evaluate dy/dx at x = 0 yields the same error
Differentiation: Univariate
Functions
CHAPTER 5. DIFFERENTIATION: UNIVARIATE FUNCTIONS 96
1. $y = 1/2$ 2. $y = 1000$ 3. $y = e^x$ 4. $y = \pi$
5. $y = x^{n+1}$ 6. $y = x^2$ 7. $y = x^{3/2}$ 8. $y = -x^2$
9. $y = -x^{-0.5}$ 10. $y = x^2 \cdot \log(x)$ 11. $y = e^x/\sqrt{x}$
5.1.1 Polynomials
A quadratic equation illustrates the process of taking a derivative of a poly-
nomial and of interpreting that derivative. Recall that the derivative is the
slope of the function. The expression below says that the derivative is negative
for x < 10 and positive for x > 10. Thus, the function reaches a minimum value at
x = 10, as the graph confirms.
The Maxima command to determine the expression for the derivative and to
print that expression is this:
For f(x) := 0.5 x^2 − 10 x + 100 the derivative is df(x)/dx = 1.0 x − 10.
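The command itself is missing from this excerpt. A sketch that produces a line like the one above is shown here; the exact wording of the strings is our reconstruction.

```maxima
f(x) := 0.5*x^2 - 10*x + 100$
/* diff(f(x), x) returns the derivative; print displays text and results together. */
print("For f(x) :=", f(x), "the derivative is df(x)/dx =", diff(f(x), x))$
```

Setting the derivative equal to zero gives x = 10, the location of the minimum noted in the text.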
The graphs in Figure 5.2 show the values of the quadratic function and
its derivative. The two are drawn separately because the y-axis units are
different.
These commands
y:sqrt(x)*exp(.05*x);[diff(exp(.05*x),x),diff(sqrt(x),x)];
[dydx: diff(y,x), radcan(dydx)];
z:sqrt(x)/exp(.05*x);[diff(exp(.05*x),x),diff(sqrt(x),x)];
[dzdx: diff(z,x), radcan(dzdx)];
Figure 5.5 graphically depicts the behavior of these two expressions and of
their derivatives over the indicated ranges. For the product, the function is
monotonic and the derivative is always positive. For small values, the func-
tion y grows at a decreasing rate, but for larger values the exponential term’s
rapid growth dominates the damping effect of the relatively slow growth of
$\sqrt{x}$. The ratio of these two functions of x is not monotonic: at first, the
growth of the square root term dominates. Eventually, however, the exponential
term, now in the denominator, overwhelms the growth in the square
root term. The behavior of z is reflected in its derivative, which begins with
positive values but falls to zero when x = 10, and remains negative thereafter. For
large values of x, z is asymptotically approaching 0, as is its derivative (from
below).
The second list of commands calculates dy/dx in the two ways indicated above.²
[y:sqrt(u),dydu:diff(sqrt(u),u),u:a+b*x^n,dudx:diff(u,x)];
[dydu*dudx, diff(''y,x)];
The output consists of two lists. The first list shows y = f(u) and its
derivative and then u = g(x) and its derivative. The second list contains
dy/dx first as a product of two derivatives and then as produced by Maxima.
Confirm the equivalence of the two expressions.
[sqrt(u), 1/(2 sqrt(u)), b x^n + a, b n x^(n−1)]
[b n x^(n−1)/(2 sqrt(u)), b n x^(n−1)/(2 sqrt(b x^n + a))]
not only does a given value of x yield a unique value of y because y = f (x),
but also that a given value of y yields a unique value of x because x = g(y).
There is a one-to-one correspondence between y and x.
Consider the example that appears in the display below. For positive values
of x, f is monotonic. The commands below do the following. The commands
in the first line specify f and determine the expression for df/dx. The commands
in the second line define the inverse function g. The third line of commands
determines the expression for dg/dy and substitutes f where appropriate to
confirm that dg/dy = 1/(df/dx).
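The example's function is not reproduced in this excerpt, so the sketch below uses a hypothetical stand-in, f = x^2 on x > 0, whose inverse is g = sqrt(y). The three steps mirror the three lines of commands described above.

```maxima
assume(x > 0, y > 0)$
f: x^2$          dfdx: diff(f, x);        /* f and df/dx = 2*x               */
g: sqrt(y)$                                /* inverse function, x = g(y)      */
dgdy: diff(g, y);                          /* dg/dy = 1/(2*sqrt(y))           */
/* Substitute y = f and compare with 1/(df/dx); the difference should
   simplify to 0 under the assumption x > 0.                           */
ratsimp(subst(y = f, dgdy) - 1/dfdx);
```

The assume command is what allows sqrt(x^2) to simplify to x; without it Maxima would return abs(x) and the comparison would not reduce so cleanly.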
Exercise 4.3
Find dy/dx for the following. Remember that log(x) refers to the natural
logarithm, unless another base is specified. Find the derivatives by hand and
check your work with Maxima.
The “0th order” derivative is the original function. The fourth-order deriva-
tive of y is zero, as is any higher-order derivative. All derivatives of z are
positive if r > 0. Use the rules developed above to confirm that the entries
in each row are the derivatives of the expressions in the preceding row.
We use the following notation to indicate order: \(d^{n}y/dx^{n}\) is the \(n\)th-order derivative
increasing. For x greater than 2.5 or so, the derivative remains positive but
decreases. Look again at the original function and observe that for small
values of x, f(x) is accelerating, rising at an increasing rate. For larger values
of x, f(x) decelerates and finally decreases in value. An exercise: determine
the value at which the first derivative equals zero.
The second derivative’s graph (upper right panel) shows that the second
derivative changes sign at x = 8/3, the point at which the first derivative stops
rising. The second derivative’s derivative (the third derivative of f) is a
constant. The derivative of a constant is zero, so the
fourth and all higher-order derivatives for this function are zero.
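Successive derivatives of this kind are easy to generate in Maxima. The sketch below uses a hypothetical cubic, chosen only so that its second derivative vanishes at x = 8/3; it is an assumption and not necessarily the function graphed in the text.

```maxima
/* Successive derivatives of a cubic. The function is a hypothetical
   stand-in whose second derivative is zero at x = 8/3. */
f : 4*x^2 - x^3/2$
makelist(diff(f, x, k), k, 1, 4);
/* orders 1 through 4: 8*x - 3*x^2/2, 8 - 3*x, -3, 0 */
solve(diff(f, x, 2) = 0, x);            /* [x = 8/3] */
```

The third derivative is the constant -3, so the fourth derivative (and every later one) is zero, as the paragraph above states for cubic functions of this type.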
These graphs are in separate panels rather than in a single graph for an
important reason. The units of the functions are quite different. Suppose that
the original function does represent production. Then the original function is
expressed in physical units per time period. The first derivative is in physical
units per time period per worker hour. The second derivative is in physical
units per time period per worker hour per worker hour, and so forth. The
units for higher-order derivatives can become quite unwieldy. Fortunately,
as previously noted, we seldom go beyond the third derivative.
Exercise 4.4
Determine the following for each of these functions: \(dy/dx\), \(d^2y/dx^2\), and \(d^3y/dx^3\).
1. \(y = 8 \cdot x^3\)    2. \(y = 8 \cdot x^{1/2}\)    3. \(y = x^4 + x^2 + 1\)
4. \(y = e^x\)    5. \(y = \log(x)\)
this function as the total product of labor (TPL) function. Two important
per-unit functions can be derived from this function. The output per unit of
labor, or average product of labor, is APL = q/L = f(L)/L. The change in
output per one (small) unit change in labor is the marginal product of labor,
MPL = dq/dL = df(L)/dL.
The law of diminishing returns states that, beyond some employment level,
MPL begins to decrease. That is, dq/dL decreases. Chapter 6 returns to this
analysis, extending it to cases in which more than one input is treated as
variable.
One of the simplest of such production functions is the Cobb-Douglas function:
\(q = A \cdot L^{a} \cdot K^{1-a}\), where q is the amount that the firm produces each
period, L is the number of worker units employed each period, and K is the
number of units of capital employed each period. The coefficient A reflects
the level of technology and is constant at any point in time. For our purposes,
units do not matter and are ignored.4 The input/output exhibit below
shows the total product, average product, and marginal product functions
for a Cobb-Douglas production function. A little manipulation reveals that
the APL and MPL functions can be stated as follows: \(APL = A \cdot (K/L)^{1-a}\)
and \(MPL = a \cdot A \cdot (K/L)^{1-a}\), so that \(MPL = a \cdot APL\). This simple relationship
between APL and MPL is a characteristic of this function and does
not generalize to all production functions.
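The relationship between the two per-unit functions can be confirmed with a few commands. This is a minimal sketch rather than the book's own exhibit; the names APL and MPL follow the text.

```maxima
/* Cobb-Douglas per-unit functions; a sketch, not the book's exhibit. */
q   : A * L^a * K^(1 - a)$
APL : q / L$                        /* average product of labor */
MPL : diff(q, L)$                   /* marginal product of labor */
ratsimp(MPL / APL);                 /* should reduce to a */
```

Because the ratio is the constant a, with 0 < a < 1, MPL lies below APL at every positive employment level.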
The graphs in Figure 5.7 show the total and per-unit functions. One aspect
of this function that might cause concern is that MPL decreases from the
beginning and is always below APL. In principles classes you probably saw a
different configuration, with MPL rising at first, until diminishing marginal
product sets in. The failure of the Cobb-Douglas to generate this result is a
shortcoming, but probably not a serious one. Most production occurs in the range
where diminishing marginal product is present; in fact, in the range for which
APL > MPL > 0. For the Cobb-Douglas case, APL > MPL > 0 for all
values of L.5
4 “Our purposes” are purely illustrative. For applications to specific issues, of course,
knowing the units is crucial.
Figure 5.7: TPL, APL, and MPL for Cobb-Douglas production function
5 In the Maxima commands, the L range does not begin at 0 for the per-unit functions.
The values of APL and MPL are not defined for L = 0.
such cost curves. We use such a curve for the following analysis.8 The cost
functions are as follows:
Total Cost: \(0.05q^3 - 1.7q^2 + 25q + 150\)
Average Cost: \(0.05q^2 - 1.7q + \frac{150}{q} + 25\)
Figure 5.9 shows the curves that these expressions generate. Both TC and
TVC increase throughout the range of production. For small output rates,
the rate of cost increase diminishes, but after q ≈ 10 the rate increases. The
MC curve reflects this aspect of the TC curve, decreasing for small q and
increasing for larger q. Both AC and AVC exhibit the U-shape that economic
theory suggests one should observe. The two converge as q increases, with
the vertical distance between them equaling the value of AFC. The AFC
curve contains no useful analytical information and will be omitted from
subsequent analysis.
Figure 5.9 reveals three quantities that can be of interest: the quantity at
which MC’s slope changes sign, the quantity at which MC = AVC (and AVC
8 We could begin with a cubic production function and establish the relationship between
output and cost, but the process is rather involved. See Hammock and Mixon [7].
Figure 5.9: Total and Per-unit cost curves, cubic cost function
reaches its minimum value. The second command yields [q = 0, q = 17]. The
first solution reflects a peculiarity of the cubic functional form and has no
economic import. The second shows the quantity at which AVC reaches its
minimum value.
The final input item in the list of commands, solve(MC=AC,q), determines the
quantity at which the marginal cost curve intersects the average cost curve:
[q = 20.55].9
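The quantities discussed here can be recovered from the cubic cost function given earlier. The following is a sketch under stated assumptions: it uses exact rational coefficients in place of the decimals, the name TFC is introduced for illustration, and find_root is used for the MC = AC intersection because the exact cubic roots are unwieldy. It is not a copy of the workbook's command list.

```maxima
/* Cubic cost function from the text, with exact rational coefficients.
   TFC and the use of find_root are illustrative assumptions. */
TC  : q^3/20 - 17*q^2/10 + 25*q + 150$
TFC : 150$                              /* fixed cost: TC at q = 0 */
MC  : diff(TC, q)$                      /* marginal cost */
AVC : ratsimp((TC - TFC)/q)$            /* average variable cost */
AC  : TC/q$                             /* average cost */
solve(diff(MC, q) = 0, q);              /* MC stops falling at q = 34/3 */
solve(MC = AVC, q);                     /* roots q = 0 and q = 17 */
find_root(MC - AC, q, 1, 50);           /* about 20.55 */
```

The bracket [1, 50] passed to find_root is chosen so that MC - AC changes sign exactly once inside it; any other bracket with that property would serve.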
The general expression for the demand curve is q = f (p), where f shows the
quantity demanded at each price. Often, it is easier to work with the inverse
function, p = g(q). An important reason for using the inverse demand curve
is that total revenue is TR = p·q. The illustration here treats a case in which
a homogeneous product is sold in a single market and the same price applies
to each unit. More complex demand curves can be specified with suitable
extensions of this simple representation.
The illustration in this section uses two types of demand curves, linear and
constant-elasticity (the concept of elasticity is the one you encountered in
your principles course and which the next section revisits). The demand
curves are of this form: \(q = a + b \cdot p\) and \(q = A \cdot p^{e}\). In the first specification,
a is the horizontal intercept (the quantity demanded when p = 0) and b is the
demand curve’s slope (b < 0). In the second specification, A is a constant
(the quantity when p = 1) and e is the price elasticity of demand (e < 0).
For this specification, no horizontal intercept exists.
The three lines of input below show expressions for two demand curves,
named fL and fCE. The second input line provides Maxima with information
it requires in determining the inverse functions. The third input line produces
expressions for the inverse demand curves.
The resulting output consists of two lists. The first, \([b\,p + a,\ p^{e} A]\), contains the
original expressions for the quantity demanded. The second,
\([[p = (q-a)/b],\ [p = q^{1/e}/A^{1/e}]]\), is the pair of inverse demand curves.
The inverse of the linear demand curve can be written as p = −(a/b) + (1/b)·q,
so that −(a/b) is the vertical intercept (graphs appear below) and (1/b) is the
inverse demand curve’s slope. For the constant-elasticity demand curve, the
inverse is \(p = (1/A^{1/e}) \cdot q^{1/e}\). The following input shows the inverse
demand curves and the associated total revenue curves and marginal revenue
curves. We use functional notation, pL(q,a,b) for example, to facilitate
graphing. The first two input lines specify the inverse demand curves. The
second pair of lines specifies total revenue functions, and the third pair
specifies marginal revenue functions. The “quote-quote”
operator ('') is used to force Maxima to evaluate the products in the second
pair of commands and the derivatives in the third pair.
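A sketch of what such input might look like follows. Only the name pL appears in the text; pCE, TRL, TRCE, MRL, and MRCE are assumed names, and the exact form of the workbook's commands may differ. Because the quote-quote operator evaluates when a line is read, each definition must be entered after the ones it depends on.

```maxima
/* Inverse demand, total revenue, and marginal revenue in functional notation.
   pL follows the text; the other function names are assumptions. */
pL(q, a, b)   := (q - a)/b$
pCE(q, A, e)  := q^(1/e)/A^(1/e)$
TRL(q, a, b)  := ''(q * pL(q, a, b))$          /* total revenue, linear case */
TRCE(q, A, e) := ''(q * pCE(q, A, e))$         /* total revenue, constant elasticity */
MRL(q, a, b)  := ''(diff(TRL(q, a, b), q))$    /* marginal revenue, linear case */
MRCE(q, A, e) := ''(diff(TRCE(q, A, e), q))$   /* marginal revenue, constant elasticity */
ratsimp(MRL(q, a, b));                         /* equivalent to (2*q - a)/b */
ratsimp(MRCE(q, A, e) / pCE(q, A, e));         /* should reduce to (e + 1)/e */
```

The last line illustrates the familiar result that, with constant elasticity e, marginal revenue is the price multiplied by 1 + 1/e.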
Profits
panel, which shows total values, reveals that the firm incurs losses at all
output levels because total revenue lies below total cost. The firm does, however,
cover its variable costs, so that producing at a rate between 0 and just
under 25 units results in a lower loss than producing nothing and losing the
per-period fixed cost, $150. Minimum loss appears to occur at approximately
q = 15. We determine the actual value below.
The per-unit curves tell the same story. The inverse demand curve (p) is
below average cost but above average variable cost. Maximum profit (mini-
mum loss here) occurs where the marginal revenue and marginal cost curves
intersect, apparently around q = 14. The combination of a polynomial cost
Differentiation: Multivariate
Functions
CHAPTER 6. DIFFERENTIATION II 121
The derivative taken in this expression measures the rate of change of y with
respect to \(x_2\) and is referred to as the partial derivative of y with respect to \(x_2\).
The partial derivative indicates the effect of a change in a single independent
variable on the dependent variable, with all other independent variables in
the function held constant while this particular derivative is taken.
The process of partial differentiation is denoted by the variant form of the
lower-case Greek letter delta (δ), namely ∂. Suppose that we have a function
of the form \(y = f(x_1, x_2)\). We represent the partial derivative of y with
respect to \(x_1\) by \(\partial y/\partial x_1\). Notation regarding partial derivatives is not completely
standard. Any of the following can be used to mean the same as \(\partial y/\partial x_1\):
\(f_1\), \(f_{x_1}\), \(y_1\), or \(y_{x_1}\). The context usually makes the meaning clear.
We use u_x and u_y as names for these partial derivatives, ∂u/∂x = 4y + 2x
and ∂u/∂y = 2y + 4x. It is apparent in this case that the values of both
partial derivatives depend on the values of both x and y.2
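These two partial derivatives can be generated as shown below. The form of u used here is an assumption inferred from the derivatives reported above (any added constant would give the same partials); the names u_x and u_y follow the text.

```maxima
/* Partial derivatives of a function of two variables.
   The form of u is an assumption inferred from the stated derivatives. */
u   : x^2 + 4*x*y + y^2$
u_x : diff(u, x);                   /* 4*y + 2*x */
u_y : diff(u, y);                   /* 2*y + 4*x */
```

In each call, diff treats the variable that is not named as a constant, which is exactly the "all other variables held constant" condition in the definition.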
For the next example, v = 2 · y/x + 4 · x/y, the partial derivatives are also
functions of both variables. The table below shows
the original function and the two partial derivatives. Both ∂v/∂x and ∂v/∂y
are rather involved functions of both variables, x and y. When we look at
higher-order partial derivatives, we will discover a way to determine more
about how values of the partial derivatives change as x and y change.
v(x,y)                                  v_x                                     v_y
\(\frac{2y}{x} + \frac{4x}{y}\)         \(\frac{4}{y} - \frac{2y}{x^2}\)        \(\frac{2}{x} - \frac{4x}{y^2}\)
Figure 6.1: Graph of \(\sqrt{x \cdot y}\)
6.2.2 Differentials
One purpose of analysis is to explain, at least in a qualitative sense, how
changes in one or more independent variables affect the value of a dependent
variable. To approach this question in an intuitive fashion, we begin with a
tautology: ∆y = (∆y/∆x) · ∆x. This says that the change in y
equals the change in y per unit change in x, multiplied by the change in x.
As it stands, this tells us nothing. Suppose, however, that we replace ∆y/∆x with
its limiting value, dy/dx. Now, we can say that ∆y ≈ (dy/dx) · ∆x. This
expression says that, at a given value of x, the line tangent to the function
approximates the effect of a change in x on y.
One step remains in defining the differential. As ∆x becomes a small number,
the product (dy/dx) · ∆x becomes closer to ∆y. Thus, for a sufficiently
small change dx, the expression becomes very nearly accurate. Consider an
example. Let \(y = x^3\). Change x from 2 to 2.01 and determine both ∆y and
dy. At x = 2, y = 8 and \(dy/dx = 3 \cdot x^2 = 12\). At x = 2.01, \(y = 2.01^3 \approx 8.1206\),
so ∆y = 8.1206 − 8 = 0.1206. The predicted change using the differential is
\(dy = 3 \cdot x^2 \cdot 0.01 = 0.12\).
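The same comparison can be made in Maxima. This is a minimal sketch; the names Delta_y and dy are chosen here to mirror the notation of the paragraph above.

```maxima
/* Differential approximation for y = x^3 at x = 2 with a change of 0.01. */
y(x)    := x^3$
dydx    : diff(y(x), x)$                    /* 3*x^2 */
Delta_y : y(2.01) - y(2);                   /* actual change, about 0.1206 */
dy      : subst(x = 2, dydx) * 0.01;        /* differential, about 0.12 */
```

Repeating the last two lines with a smaller step, such as 0.001, shows the gap between the actual change and the differential shrinking.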
We extend the concept of a differential to include functions of two or more
independent variables. We begin by noting that a partial derivative measures
the rate of change of the dependent variable with respect to an infinitesimal
change in one of the independent variables, all other independent variables
held constant. A total differential of a function, however, is a linear ap-
proximation of the rate of change of the dependent variable when all of the
independent variables change by an infinitesimal amount.
The total differential is the sum of the changes in the dependent variable
caused by simultaneous infinitesimal changes in all the independent variables.
Suppose that \(y = f(x_1, x_2, \ldots, x_n)\). Then the total differential of y is this:
\(dy = f_1 \cdot dx_1 + f_2 \cdot dx_2 + \cdots + f_n \cdot dx_n\), where \(dx_1, dx_2, \ldots, dx_n\) indicate small
changes in the n independent variables. These changes occur independently
of each other.
Extend the example above by making \(z = x^3 + y^2\). The total differential
for z is \(dz = 3 \cdot x^2 \cdot dx + 2 \cdot y \cdot dy\). Begin with x = 2 and y = 3, so that
z = 8 + 9 = 17 and dz = 12 · dx + 6 · dy. Let ∆x = 0.01 and ∆y = 0.01.
Then the approximation for dz is 12 · 0.01 + 6 · 0.01 = 0.18, so that the
approximate value of z is now 17.18. The actual value is 17.181.
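Maxima's diff with a single argument returns the total differential, with del(x) and del(y) standing for dx and dy. The sketch below evaluates it for this example; the names approx and actual are illustrative. The del terms are substituted first so that the later substitution x = 2 does not alter the argument of del(x).

```maxima
/* Total differential of z = x^3 + y^2 at (2, 3) with changes of 0.01. */
z      : x^3 + y^2$
dz     : diff(z)$                   /* 2*y*del(y) + 3*x^2*del(x) */
approx : subst([del(x) = 0.01, del(y) = 0.01, x = 2, y = 3], dz);    /* about 0.18 */
actual : subst([x = 2.01, y = 3.01], z) - subst([x = 2, y = 3], z);  /* about 0.1807 */
```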
y = g(x) cannot remain constant when x varies. That is, x and y are not
independent of each other. In a case like this, dz = ∂f /∂x · dx + ∂f /∂y · dy
is the total differential of z. We can divide both sides of this expression by
dx to obtain dz/dx = ∂f /∂x + ∂f /∂y · dy/dx. The term dz/dx is the total
derivative of z with respect to x.
This total derivative has two parts. The first part, ∂f /∂x, measures the
change in z brought about by changes in x, all other variables held constant.
The second part, ∂f /∂y · dy/dx measures the change in z brought about by
change in variable x that works through intermediate variable y. The first
part of this expression is often called the direct effect, while the second part
is called the indirect effect. The indirect effect takes account of the fact that
changes in variable x affect variable y, which in turn affects variable z. In
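general, if \(z = f(x, y_1, y_2, \ldots, y_n)\), then
\[ \frac{dz}{dx} = \frac{\partial z}{\partial x} + \frac{\partial z}{\partial y_1} \cdot \frac{dy_1}{dx} + \frac{\partial z}{\partial y_2} \cdot \frac{dy_2}{dx} + \cdots + \frac{\partial z}{\partial y_n} \cdot \frac{dy_n}{dx}. \]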
Consider three examples, the first of which we also develop with Maxima.
Let \(z = f(x, y) = x^2 + 2 \cdot x \cdot y + y^2\). The total derivative is
\(dz/dx = \partial z/\partial x + (\partial z/\partial y) \cdot (dy/dx)\). In our example, \(y = g(x) = e^{3x}\), so that \(dy/dx = 3 \cdot e^{3x}\). We
confirm that the rule for evaluating dz/dx yields the same result as inserting
g(x) into f(x, y) and evaluating the result. We proceed with three sets of
commands, which generate the table below.
\(y^2 + 2xy + x^2\)        \(2y + 2x\)        \(2y + 2x\)
\(\%e^{6x} + 2x\,\%e^{3x} + x^2\)        \(6\,\%e^{6x} + 6x\,\%e^{3x} + 2\,\%e^{3x} + 2x\)
\(3\,\%e^{3x}(2y + 2x) + 2y + 2x\)        \(6\,\%e^{6x} + 6x\,\%e^{3x} + 2\,\%e^{3x} + 2x\)
The table above shows the result of using Maxima to determine this total
derivative. The first input line defines z without specifying how x and y are
related, and it takes the two partial derivatives. Maxima responds as if x
and y are independent of each other. The commands are these:
[z : x^2 + 2*x*y + y^2, dzdx01: diff(z,x), dzdy:diff(z,y)]. The
name dzdx01 is assigned to the derivative with respect to x because we will
determine this derivative two more times in the next two lines of input.
The second input line, [z:subst(y=exp(3*x),z),dzdx02:diff(z,x)], adds
the information that y is a function of x and takes the total derivative of z
with respect to x, which is the second item in the second line of output.
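The agreement between the total-derivative rule and direct substitution can be confirmed in a single pass. This sketch is an assumption about how one might arrange the check, not the workbook's third input line; the names g, byrule, and direct are hypothetical.

```maxima
/* Total derivative of z = x^2 + 2*x*y + y^2 when y = exp(3*x).
   g, byrule and direct are illustrative names. */
z      : x^2 + 2*x*y + y^2$
dzdx01 : diff(z, x)$                 /* partial derivative with respect to x */
dzdy   : diff(z, y)$                 /* partial derivative with respect to y */
g      : exp(3*x)$
byrule : subst(y = g, dzdx01 + dzdy*diff(g, x))$   /* direct plus indirect effect */
direct : diff(subst(y = g, z), x)$                 /* substitute first, then differentiate */
expand(byrule - direct);             /* should be 0 */
```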
\[ \left(x\left(\frac{d}{ds}\,y\right)z + \left(\frac{d}{ds}\,x\right)y\,z\right)\%e^{xyz}\,\mathrm{del}(s), \]
which shows that five derivatives must be evaluated in order to produce dw
in terms of s and t. The rather long subst command
subst([diff(x,s)=diff(s^2+t^2,s), diff(x,t)= diff(s^2 + t^2,t),
diff(y,s)= diff(s*t,s), diff(y,t)= diff(s*t,t),
diff(z,t) =diff(sqrt(t),t) ], diff(w)) provides the necessary infor-
mation for evaluation of dw. Now the command diff(w) generates this
result:
\[ \left(2tyz + sxz + \frac{xy}{2\sqrt{t}}\right)\%e^{xyz}\,\mathrm{del}(t) + \left(2syz + txz\right)\%e^{xyz}\,\mathrm{del}(s). \]
We are not quite finished, because x, y, and z are in the expression. One
more set of substitutions, subst([x=s^2+t^2,y=s*t,z=sqrt(t)],%) gives
us the result that we seek:
\[ \left(2s\,t^{5/2} + \frac{3s\sqrt{t}\,(t^2 + s^2)}{2}\right)\%e^{s\,t^{3/2}(t^2 + s^2)}\,\mathrm{del}(t)\; + \]
\[ \left(t^{3/2}(t^2 + s^2) + 2s^2\,t^{3/2}\right)\%e^{s\,t^{3/2}(t^2 + s^2)}\,\mathrm{del}(s). \]
The coefficient of del(t) is ∂w/∂t, and the coefficient of del(s), on the second
line, is ∂w/∂s.5
5 The workbook shows that placing the definitions of s and t directly into w and
executing diff(w) yields the same result.
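As a cross-check on the chain-rule coefficients, one can compare them with derivatives taken after substituting for x, y, and z. The following is a sketch under that approach; the names defs, w_st, dwds_chain, and dwdt_chain are assumptions introduced for illustration.

```maxima
/* Cross-check for w = exp(x*y*z) with x = s^2 + t^2, y = s*t, z = sqrt(t).
   defs, w_st, dwds_chain and dwdt_chain are illustrative names. */
w    : exp(x*y*z)$
defs : [x = s^2 + t^2, y = s*t, z = sqrt(t)]$
w_st : subst(defs, w)$               /* w written in terms of s and t */
/* chain rule; z does not depend on s, so dw/ds has two terms */
dwds_chain : subst(defs, diff(w, x)*diff(s^2 + t^2, s) + diff(w, y)*diff(s*t, s))$
dwdt_chain : subst(defs, diff(w, x)*diff(s^2 + t^2, t) + diff(w, y)*diff(s*t, t)
                         + diff(w, z)*diff(sqrt(t), t))$
radcan(diff(w_st, s) - dwds_chain);  /* should reduce to 0 */
radcan(diff(w_st, t) - dwdt_chain);  /* should reduce to 0 */
```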
Exercise 6.3
Evaluate the derivatives below, first by following the approach used above
and then with Maxima.
8. For \(z = (x^2 + y^2)^3\), when x = s + t and y = 25 − t, find ∂z/∂s and ∂z/∂t.
F                                    Fx                      Fy                      −Fx/Fy
\(F(x, y) := x^2 + y^2 - 16\)        \(F_x(x, y) := 2x\)     \(F_y(x, y) := 2y\)     \(-x/y\)
Finally, consider \(F(x, y) = y^3 + x \cdot y - 12\). Following the steps in the preceding
examples reveals that \(dy/dx = -y/(3 \cdot y^2 + x)\), as reported in the following
table.
F(x, y)                  Fx           Fy                  dy/dx
\(y^3 + xy - 12\)        \(y\)        \(3y^2 + x\)        \(-\frac{y}{3y^2 + x}\)
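The rule dy/dx = −Fx/Fy used in these tables can be wrapped in a one-line helper. The name implicit_slope is hypothetical, and the sketch assumes that the implicit relation is written in the variables x and y.

```maxima
/* Implicit-function rule dy/dx = -Fx/Fy.
   The helper name implicit_slope is hypothetical. */
implicit_slope(F) := -diff(F, x)/diff(F, y)$
implicit_slope(x^2 + y^2 - 16);       /* -x/y */
implicit_slope(y^3 + x*y - 12);       /* -y/(3*y^2 + x) */
```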
This expression is much more difficult to evaluate than the preceding two.
Graphical analysis helps us to understand it better. Figure 6.4 shows that
the relationship between x and y is not monotonic. For the values of x
and y that yield the upper curve, the derivative is monotonic (dy/dx > 0),
but for the values that generate the lower curve the derivative is no longer
monotonic. If, however, our attention is limited to positive x values, then
the implications are simpler: Both the initial function and its derivative are
monotonic: dy/dx is negative but it approaches zero as x increases.
Partial derivatives can be defined for implicit functions of more than two
variables. If F (x, y, z) = 0, then ∂z/∂x = −Fx /Fz . The other two partial
derivatives can be defined in like fashion. The next display shows an implicit
function of three variables. It also shows Fx and Fy , along with ∂y/∂x. As
the entry in the fourth column shows, for this function ∂y/∂x depends on
the values of all three variables.
F(x, y, z)                        Fx                              Fy                                        ∂y/∂x
\(y^2 z^2 + \sqrt{xy}\)           \(\frac{y}{2\sqrt{xy}}\)        \(2y\,z^2 + \frac{x}{2\sqrt{xy}}\)        \(-\frac{y}{2\sqrt{xy}\,\left(2y\,z^2 + \frac{x}{2\sqrt{xy}}\right)}\)
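\[ z_{xx} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial x^2} = \frac{\partial^2 f}{\partial x^2} = f_{xx} \]
\[ z_{yy} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial y^2} = \frac{\partial^2 f}{\partial y^2} = f_{yy} \]
\[ z_{xy} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right) = \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial y\,\partial x} = f_{xy} \]
\[ z_{yx} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) = \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial x\,\partial y} = f_{yx} \]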
We refer to \(z_{xy}\) and \(z_{yx}\) as cross (or mixed) partial derivatives. They result
when one differentiates function z first with respect to one variable and then
with respect to the other variable. For example, let \(z = 5 \cdot x^2 \cdot y\). Then we
find \(z_{xy}\) in two steps. First we differentiate z with respect to x, and obtain
10 · x · y. Then we differentiate 10 · x · y with respect to y, and obtain 10 · x,
which is \(z_{xy}\).
In many cases, \(z_{xy} = z_{yx}\). When this is the case, it makes no difference
whether we first differentiate the function with respect to x, then with respect
to y, or vice versa. We obtain the same result. In general, \(z_{xy} = z_{yx}\) when
Young’s Theorem applies.
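Name      Expression
z         \(\sqrt{xy} + x\,y^2 + x^2 y\)
zx        \(\frac{y}{2\sqrt{xy}} + y^2 + 2xy\)
zy        \(\frac{x}{2\sqrt{xy}} + 2xy + x^2\)
zxx       \(2y - \frac{y^2}{4(xy)^{3/2}}\)
zyy       \(2x - \frac{x^2}{4(xy)^{3/2}}\)
zxy       \(\frac{1}{2\sqrt{xy}} - \frac{xy}{4(xy)^{3/2}} + 2y + 2x\)
zyx       \(\frac{1}{2\sqrt{xy}} - \frac{xy}{4(xy)^{3/2}} + 2y + 2x\)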
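The equality of the two cross partial derivatives in the table can be verified directly. A minimal sketch, with zxy and zyx used as variable names to match the table:

```maxima
/* Young's Theorem check for z = sqrt(x*y) + x*y^2 + x^2*y. */
z   : sqrt(x*y) + x*y^2 + x^2*y$
zxy : diff(diff(z, x), y)$          /* differentiate by x, then by y */
zyx : diff(diff(z, y), x)$          /* differentiate by y, then by x */
ratsimp(zxy - zyx);                 /* should reduce to 0 */
```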
Exercise 6.4
Use implicit differentiation to show ∂z/∂x and ∂z/∂y for these expressions.
1. \(2 \cdot x^2 + 3 \cdot y^2 + 4 \cdot z^2 = 24\)    2. \(\log_e x \cdot y \cdot z = 10\)
3. \(a \cdot x + b \cdot y + c \cdot z = e\)    4. \(e^x + e^y + e^z = 1000\)
5. \(3 \cdot x^2 - 4 \cdot y - z^2 + x^2 \cdot y \cdot z^2 = 20\)    6. \(-x^2 - 4 \cdot y^2 + 2 \cdot z^3 = 60\)
7. \(\log_e x + \log_e y + \log_e z = e^x\)    8. \(x + y - \log_e z = 0\)
2 3
9. (x + 8 · y · z) · (x + 5) = 8
10. \(x^3 + y^3 + z + x \cdot y + 2 \cdot x \cdot y^2 + 3 \cdot y \cdot z^3 = 0\)
The exhibit below shows a stylized linear demand function for a transit sys-
tem’s services. The list named paramsEst is a hypothetical set of estimated
values for the parameters. The first output line shows the general expression
for this demand function. The second output line shows the expression given
the estimated parameters. Also, it shows the number of rides per period
(3950.0) that is estimated, given the values of the independent variables.7
The bottom three lines show the implied elasticities, along with the calcula-
tions required to generate them.
Supposing that qx is stated in 1000s of riders per period, interpret the values
in the fourth input line as follows: px, the per-ride fare is $4; py, $0.50, is the
per-mile cost of operating a private automobile; m, $40,000, is annual per-
household income in the relevant area; and n, 400,000, is the number of people
living within a defined distance from the transit line. From the estimated
linear relationship we can determine that the demand curve slopes downward,
that transit rides and automobile rides are substitutes, that transit rides are
an inferior good, and that adding 1 person to the area generates 2 more rides
per period.
Expressing these values in terms of elasticities offers a number of advantages.
One is that doing so makes comparison with similar analyses done by others
easier. Such comparisons can indicate whether the study has been conducted
in a proper fashion (best available set of measures for variables and the
proper set of variables in the model, for example). Also, elasticities provide
some relevant information more directly than the original coefficients. In
particular, −1 < Exx < 0 implies that the transit authority could increase
revenues by raising the fare. That Exy = 0.038 indicates a weak relationship
between automobile operation cost and ridership.
7 The values used here are stylized, but they are based on research by one of the authors
of the original edition of this text. See Ostrosky and Kuhn, p. 160.
Next consider a constant-elasticity demand function, \(qx = A \cdot px^{Exx} \cdot py^{Exy} \cdot m^{Exm}\)
(other variables could be included).8 At all values of the independent
variables, the elasticities are the same. This constancy of the elasticities
implies that the slopes change as variables’ values change, as Figure 6.5
demonstrates.
To determine the slopes, take the first partial derivatives of the qx function.
To use Maxima, apply this list of commands:
[diff(qx(px,py,m,A,Exx,Exy,Exm),px),
diff(qx(px,py,m,A,Exx,Exy,Exm),py),
diff(qx(px,py,m,A,Exx,Exy,Exm),m)].
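The resulting expressions are in the following table.
dqx/dpx = \(Exx \cdot m^{Exm} \cdot px^{Exx-1} \cdot py^{Exy} \cdot A\)
dqx/dpy = \(Exy \cdot m^{Exm} \cdot px^{Exx} \cdot py^{Exy-1} \cdot A\)
dqx/dm = \(Exm \cdot m^{Exm-1} \cdot px^{Exx} \cdot py^{Exy} \cdot A\)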
A little manipulation of the results above reveals that the slopes are these:
dqx/dpx = Exx·qx/px, dqx/dpy = Exy ·qx/py, and dqx/dm = Exm·qx/m.
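The "little manipulation" amounts to multiplying each slope by the ratio of the variable to the quantity. A sketch for the own-price case, assuming the function is defined with the same argument list as in the commands above:

```maxima
/* Own-price elasticity of the constant-elasticity demand function. */
qx(px, py, m, A, Exx, Exy, Exm) := A * px^Exx * py^Exy * m^Exm$
slope : diff(qx(px, py, m, A, Exx, Exy, Exm), px)$
ratsimp(slope * px / qx(px, py, m, A, Exx, Exy, Exm));   /* should reduce to Exx */
```

Replacing px by py or m in the last two lines gives Exy and Exm in the same way.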
The next table shows qx values for selected combinations of prices and in-
come, given these parameters: A = 0.012, Exx = −1.5, Exy = 0.75, and
Exm = 1.2. Comparing the quantity in the second row with that in the
first row allows computation of the arc own-price elasticity. Likewise, the
values in the third and fourth rows provide the values for calculating the
arc cross-price elasticity and the arc income elasticity. The accompanying
workbook shows the computations using the commonly-used midpoint for-
mula. The arc elasticity values are, respectively, -1.475 (compared to the
point elasticity of -1.5), 0.751 (compared to the point elasticity of 0.75), and
1.19 (compared to the point elasticity of 1.2).
prices and income quantity
py = 4, m = 100, px = 2 3.014263717811496
py = 4, m = 100, px = 3 1.640757346405055
py = 5, m = 100, px = 2 3.563393273053673
py = 4, m = 150, px = 2 4.903325869367984
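The arc elasticities quoted above can be reproduced from the tabulated quantities. The sketch below assumes the usual midpoint formula (change divided by the average of the two values, for both quantity and the variable that changes); the function names qd and arc are hypothetical and are not taken from the workbook.

```maxima
/* Arc elasticities by the midpoint formula, using the stated parameters.
   The function names qd and arc are hypothetical. */
qd(px, py, m) := 0.012 * px^(-1.5) * py^0.75 * m^1.2$
arc(q1, q2, v1, v2) := ((q2 - q1)/((q1 + q2)/2)) / ((v2 - v1)/((v1 + v2)/2))$
arc(qd(2, 4, 100), qd(3, 4, 100), 2, 3);       /* own-price, about -1.475 */
arc(qd(2, 4, 100), qd(2, 5, 100), 4, 5);       /* cross-price, about 0.751 */
arc(qd(2, 4, 100), qd(2, 4, 150), 100, 150);   /* income, about 1.19 */
```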
8 A is a scaling factor. Mathematically, it is the value of qx when all independent
variables equal 1. Given that physical, temporal, and monetary units can be selected at
will, A’s value can be selected for convenience.
Figure 6.6 shows the highest sales level that EEE can achieve with B =
10000. It also shows the budget line, along with three isosales lines. One
of the isosales lines shows various combinations of TV and N that would
yield S = $160, 190. Only one of these combinations can be achieved with
B = $10000. The second, lower isosales line is for S = $12000. Either
of the intersections of this line with the budget line would be consistent with
B = $10000, but would be inefficient. The third isosales line, for S = $20000,
cannot be attained with this budget.
\[ \frac{d(pd)}{dt} = \frac{pdx}{pdx - psx}. \]
In terms of economic impacts, this expression is the change in the price that
buyers pay, given the quantity that is determined above. We could equally
9 We follow Bishop[2].
10 The notation pdx and psx refers to d(pd)/dx and d(ps)/dx.
\[ \frac{d(ps)}{dt} = \frac{psx}{pdx - psx}. \]
This is the change in the price that sellers receive.
The expressions above indicate the signs of the price changes in the presence of a tax.
Commonly, these relationships are expressed in terms of elasticities. For a
price elasticity, E, the value is E = (dx/dp) · (p/x), so that dp/dx = p/(E · x).
Making these substitutions for the supply curve and the demand curve leads
to these relationships (copied from wxMaxima):
\[ \frac{Eps}{Eps - Epd}, \qquad \frac{Epd}{Eps - Epd}. \]
The first term is the effect of the tax on the price that buyers pay, and the
second shows the effect on the price that sellers receive. The ratio of these
two, Eps/Epd, is the ratio of the fraction of the tax that is passed to buyers
to the fraction that is passed to sellers.
The difference between the price paid and the price received is 1 · t. Confirm
that subtracting the effect of the tax on the price sellers receive from the
effect on the price that buyers pay yields 1. Also, divide the first term by
the second to confirm that the ratio of the two effects is Eps/Epd.11
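Both confirmations can be carried out in Maxima. This is a sketch under the text's assumptions (the same p and x are used for the demand and supply elasticities); the names dpd, dps, and elast are introduced for illustration, while pdx, psx, Epd, and Eps follow the text.

```maxima
/* Tax incidence: effect of a per-unit tax on buyer and seller prices.
   dpd, dps and elast are assumed names. */
dpd   : pdx/(pdx - psx)$            /* effect on the price buyers pay */
dps   : psx/(pdx - psx)$            /* effect on the price sellers receive */
ratsimp(dpd - dps);                 /* 1 */
elast : [pdx = p/(Epd*x), psx = p/(Eps*x)]$
ratsimp(subst(elast, dpd));         /* equivalent to Eps/(Eps - Epd) */
ratsimp(subst(elast, dps));         /* equivalent to Epd/(Eps - Epd) */
ratsimp(subst(elast, dpd/dps));     /* Eps/Epd */
```

Maxima may print the results with the signs of numerator and denominator both reversed; the expressions are the same.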
made in the amount of labor being used, the amount of capital being held
constant: \(MPL = \partial Q/\partial L = f_L\). Likewise, the marginal product of capital
is \(MPK = \partial Q/\partial K = f_K\).
Optimization
The preceding illustration depicts a firm as having a fixed budget and seeking
to generate maximum sales based on an optimal use of two advertising media.
More generally, we can represent firms as setting out to gain the largest
output for a given cost by selecting the ideal mix of inputs.
Not surprisingly, if the firm uses two inputs, K and L, for which the unit
costs are r and w, then the firm will attain this result if it uses the mix that
spends the allocated cost C = w · L + r · K on a combination such that
\(f_K/f_L = r/w\).
Suppose, however, that the firm wishes to select a quantity and then find the
lowest-cost input mix consistent with that quantity. This problem is called
the dual to the one above. Figure 6.7 shows an isoquant (for Q = 200 units
in this illustration; see the accompanying workbook) and three isocost lines.
Line TC1 represents a cost that is inconsistent with producing the specified
output level. Line TC2 represents a cost at which either of two points (where
the isoquant intersects TC2) is consistent with the required output level but
at an unnecessarily high cost.
Finally, the isocost line TC0 allows the output level to be produced, given
that labor and capital are combined as indicated by the point of tangency of
TC0 and the isoquant (at approximately L = 25 and K = 15). As in the case
of maximizing output subject to a cost constraint, minimizing cost subject
to an output constraint requires that the ratio of marginal products equal
the ratio of input prices. The negative of the slope of the isoquant is called
the marginal rate of technical substitution (mrts).
where 0 < b < 1. We change both inputs by a factor k, using this command:
Q(k*L, k*K). The resulting expression is \(k^{b} (\log(K) + \log(k))\, L^{b}\).
We cannot derive an expression of the form \(k^{n} \cdot Q\).
The table shows the marginal products of labor and capital and the marginal
rate of technical substitution. We cannot derive an expression in the form
mrts = m · (K/L), so this function is not homothetic.
MPL                            MPK                    mrts
\(b \log(K)\, L^{b-1}\)        \(\frac{L^{b}}{K}\)    \(\frac{b\,K \log(K)}{L}\)
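The entries in the table can be generated as follows. The production function \(Q = L^{b} \log(K)\) is inferred from the expressions reported above and should be read as an assumption; the sketch is not the workbook's own input.

```maxima
/* Marginal products and mrts for Q = L^b * log(K), a form inferred
   from the reported results. */
Q(L, K) := L^b * log(K)$
MPL  : diff(Q(L, K), L)$            /* marginal product of labor */
MPK  : diff(Q(L, K), K)$            /* marginal product of capital */
mrts : ratsimp(MPL / MPK);          /* should reduce to b*K*log(K)/L */
```

Because mrts depends on K itself and not only on the ratio K/L, scaling both inputs by the same factor changes the mrts, which is why the function fails to be homothetic.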
of the revenue is the profit that accrues to the owner of the inframarginal
firm or to the owner of some specialized factor (like location) that makes the
firm’s cost lower than the cost incurred by the marginal firm(s).
Finally, of course, production functions need not be homogeneous at all.
Euler’s theorem applies only to homogeneous functions. Having said this,
however, be aware that being descriptively inaccurate is not the same as
being useless. If production at an aggregate level is well approximated by
a linearly homogeneous production function, then the marginal productivity
theory of distribution might provide the best model with which to start
thinking about this important issue.
change in K/L is twice the percentage change in mrts. For the second case,
the percentage change in K/L is only 0.2 that of the change in mrts.
The elasticity of substitution takes on special importance when we recall that
optimizing firms select the input combinations that equate their mrts with
the ratio of the relative input prices (more generally, their relative marginal
input costs–see the preceding section). Thus, we can redefine the elasticity
of substitution as
\[ Esub = \frac{d(K/L)}{d(w/r)} \cdot \frac{w/r}{K/L}. \]
This is the percentage change that the firm makes in its input ratio per
one-percent change in the ratio of the input prices. Given that w · L and
r · K are the costs of the inputs, consider the implications of Esub’s value on
how total cost is distributed. If Esub = 1, then a one-percent change in the
ratio of input costs is exactly offset by a one-percent change in the opposite
direction, so that the share of cost that goes to each input type remains the
same.
If, however, Esub ≠ 1, then a change affects income distribution between the
inputs. Specifically, if Esub > 1, then a change in w/r causes labor’s share
to move in the opposite direction. Likewise, if Esub < 1, then a change in w/r
causes labor’s share to move in the same direction that w/r changes. Among
other implications, the value of a firm’s Esub could affect a union’s ability
\[ Esub = \frac{d(K/L)}{d\,mrts} \cdot \frac{mrts}{K/L} = \frac{1}{1+b}. \]
Refer to Figure 6.9 to recall the implications of the elasticity for the shape
of isoquants for the CES production function. Also, note that as b → 0, the
elasticity of substitution approaches 1, the Cobb-Douglas value.
Output Elasticity
and
\[ \frac{1-a}{(K \cdot A/Q)^{b}} = \frac{1-a}{A^{b}} \left(\frac{Q}{K}\right)^{b}. \]
As b approaches zero (Esub → 1), these two output elasticity values approach
a and 1 − a as the CES production function approaches the Cobb-Douglas
function, a conclusion that we approach below.
The terms f′(x) and g′(x) are the first derivatives of the two functions.
Rather than working with the CES function, we consider its logarithm, which
Maxima provides:
\[ \log(A) - \frac{\log\left(\frac{a}{L^{b}} + \frac{1-a}{K^{b}}\right)}{b}. \]
The second term is a ratio of two terms, both of which approach 0 as b → 0.
We can see that if g(b) = b, then g′(b) = 1 for all values of b, so the
limit of g′(b) is 1. Therefore we can focus on f′(b). The derivative of
\[ -\log\left(\frac{a}{L^{b}} + \frac{1-a}{K^{b}}\right) \]
is
\[ -\frac{a \cdot K^{b} \cdot \log(L) + (1-a) \cdot \log(K) \cdot L^{b}}{(a-1) \cdot L^{b} - a \cdot K^{b}}. \]
As b → 0, this expression approaches
We know that the limiting value of log(A), a constant, is just log(A). Therefore
the limiting value of
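The limit of f′(b) can be evaluated by setting b = 0 in the derivative, since the resulting denominator, a + (1 − a), equals 1. The sketch below does this; the names f and fprime are illustrative, and the step relies on the L'Hôpital argument outlined above.

```maxima
/* Limit of f'(b) as b -> 0 for f(b) = -log(a/L^b + (1-a)/K^b). */
f      : -log(a/L^b + (1 - a)/K^b)$
fprime : diff(f, b)$
ratsimp(subst(b = 0, fprime));      /* equivalent to a*log(L) + (1 - a)*log(K) */
```

Adding log(A) gives log(A) + a·log(L) + (1 − a)·log(K), which is the logarithm of the Cobb-Douglas function \(A \cdot L^{a} \cdot K^{1-a}\).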
Optimization: Maximization,
Minimization, and Constraints
Economic analysis assumes that actors have objective functions. These func-
tions include the utility function that represents a consumer’s preferences and
the profit function that represents the outcome of a firm’s actions. Further-
more, this analysis often begins with the assumption that the actors attempt
to maximize (or, for some functions, minimize) the value of these functions.
The actions of the decision-maker are nearly always constrained by limita-
tions such as the amount of money or time (or both), or some minimum
acceptable level of performance or output. When such constraints apply, the
actor generally cannot achieve the maximum value of the objective function.
Rather, the actor is assumed to optimize. That is, the actor finds the set
of options that yield the maximum (or minimum) attainable value of the
function, subject to the constraint.
We have inserted an important and somewhat controversial behavioral as-
sumption here, that the actor is an optimizing agent. This is the standard
approach for neoclassical economics. This approach can be thought of as
either positive or normative. That is, we can interpret the results of the
model as behavior in which we should expect actors to engage (positive).
Alternatively, we may be interested in investigating the conditions that are
required for optimization and using those to prescribe behavior for someone
who is seeking to optimize in a particular setting (normative).
This chapter initially demonstrates how one may find the maximum or min-
CHAPTER 7. OPTIMIZATION 159
We define concavity as follows: Take as given that dy/dx and d²y/dx² exist
for all x in some interval. Then, if d²y/dx² > 0 the curve f(x) is said to
be concave upward at x. If d²y/dx² < 0, then the curve f(x) is said to be
concave downward at x.
Figure 7.4 shows the two graphs from Figure 7.2 and adds three tangent line
segments to each of the two graphs. The first graph illustrates the case in
which the curve of the function is concave downward. At x = 1, the first
point of tangency on the graph, the first derivative is positive and the second
derivative is negative. This means that the function is increasing, but at
a decreasing rate. At x = 9, dy/dx < 0 and d²y/dx² < 0, indicating that
the function is decreasing at an increasing rate, that is, that the slope of
the function is becoming increasingly negative. Between these two values, at
x = 5, dy/dx = 0, and f(x) achieves its maximum value.²
The second graph depicts the case where the curve of the function is concave
upward. At x = 1, dy/dx < 0, but d²y/dx² > 0. This implies that the
velocity (“speed”), while the second derivative of distance with respect to time measures
acceleration. An economic example relates to the general price level (say GDP Deflator
value): The inflation rate is the first derivative, and the annual rate at which inflation is
changing is the second derivative
2
For the first function, d²y/dx² = −2 for all x values. For the second function, the
corresponding value is 2.
function is decreasing at a decreasing rate; that is, that the slope of the
function is becoming less negative. When x = 9, dy/dx > 0 and d²y/dx² > 0.
The function is now increasing at an increasing rate.
When the concavity of the function changes from downward to upward or
from upward to downward at a value of x, this point on the function is
called an inflection point or point of inflection. Figure 7.5 illustrates the
two different types of points of inflection. In the upper portion of the first
column, the point of inflection occurs where the concavity of the function
changes from downward to upward, at x = 20. The point of inflection in
the upper portion of the second column occurs where the concavity of the
function changes from upward to downward, again at x = 20.³
Points of inflection have definite implications for the first and second deriva-
tives. We can see in the lower portion of the first column that the point of
inflection is the minimum value of dy/dx when concavity is changing from
downward to upward. Analogously, the point of inflection in the lower por-
tion of the second column is the maximum value of dz/dx in the case in
3
The two functions are x³/60 − x² + 25 · x and −x³/30 + 2 · x² − 15 · x.
extreme point exists. These tests do not say whether the extreme value is
absolute or relative. They can say that, at least, a relative extreme value has
been determined.
Function                          Derivative                  Solution(s)
−2 · x³/15 + 8 · x² − 110 · x     −2 · x²/5 + 16 · x − 110    [x = 8.8197, x = 31.18]
x³/30 − 2 · x² + 40 · x           x²/10 − 4 · x + 40          [x = 20.0]
The second polynomial’s derivative has a root at x = 20, but dg(x)/dx > 0
for x < 20 and also for x > 20. For example, at x = 19.9, dg(x)/dx = 0.001;
also, at x = 20.1, dg(x)/dx = 0.001. Thus, x = 20 corresponds to an
inflection point for g(x) = x³/30 − 2 · x² + 40 · x. Figure 7.7 confirms the
results that we have derived using information in the table above.
Exercise 7.1
For each expression below, find any extreme points that exist and determine
whether each such point is a relative maximum, a relative minimum, or a
point of inflection. Confirm your results with Maxima graphs.
1. y = x² − 4 · x + 16   2. y = x³ − 6 · x² + 9 · x   3. y = x · e^x
4. y = x · (x − 1)²   5. y = (x − 1)³ + 8   6. y = x + 1/x
7. y = x^(2·x) − 2 · x   8. y = x³/3 − x² + x + 1   9. y = x³
For each of the following, find the absolute maximum and/or minimum in
the designated intervals. Graph each function.
10. y = x², where −8 ≤ x ≤ 16
12. y = (25 − 3 · x)^0.5, where 0 ≤ x ≤ 3
13. y = (x − 8)², where −2 ≤ x ≤ 4
14. y = 150 − 0.8 · x, where 0 ≤ x ≤ 10
14. (a) When an automobile travels s miles per hour, the cost per mile (in
dollars) of operating the automobile is O = s²/5000 − s/50 + 189/200. At what speed
is the cost per hour minimized? (b) Suppose that the cost of the driver’s
time is added to this equation, and that this cost is 30/s. Determine the new
cost-minimizing speed.⁴
15. The demand curve for a firm’s product is q = 8 − p, where q is the number
of units sold and p is the price per unit. (a) What price should the firm charge
if it chooses to maximize its total sales revenue (p · q)? (b) This demand curve
implies that the total revenue function is TR = p · q = 8 · q − q². Confirm that
the marginal revenue function is mr = d(TR)/dq = 8 − 2 · q. (c) Suppose
that marginal cost is constant at mc = 1. What quantity maximizes the
firm’s profit? What price must the firm charge in order to sell this quantity?
First, given that the first derivative of a function y = f (x) exists, solve the
equation dy/dx = 0 for its critical roots. (This step is identical to the first
step of the first derivative test.) Next, if the second derivative d²y/dx² also
exists, then one of the following three conditions must hold at each critical root x = x0:
(a) If d²y/dx² < 0, then the function f(x) has a relative maximum at x = x0.
(b) If d²y/dx² > 0, then the function f(x) has a relative minimum at x = x0.
(c) If d²y/dx² = 0, then the second derivative test fails. We must return to
the first derivative test to ascertain whether a relative maximum or minimum
exists.
Figure 7.7 illustrates the second derivative test graphically. In the first panel,
when x = 8.8197, d²y/dx² = 8.9443 (condition b), so this point corresponds
to a local minimum value. When x = 31.18, d²y/dx² = −8.9443 (condition
a), so this point corresponds to a local maximum value. In the right-hand
graph, at x = 20, d²y/dx² = 0 (condition c), so the second derivative test
cannot detect whether or not a relative maximum or minimum exists at this
point. Hence we cannot be certain what we have, based solely on these two
tests.
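The second derivative test can be sketched in Maxima for the two polynomials in the table above. The labels f and g, and the helper names crit_f, crit_g, d2f, and d2g, are assumptions made for this illustration. By conditions (a) and (b), a negative second derivative at a critical root signals a relative maximum and a positive one a relative minimum.

```maxima
/* The two polynomials from the table (labels f and g are assumed). */
f(x) := -2*x^3/15 + 8*x^2 - 110*x;
g(x) := x^3/30 - 2*x^2 + 40*x;

/* Critical roots: solve dy/dx = 0. */
crit_f : solve(diff(f(x), x) = 0, x);
crit_g : solve(diff(g(x), x) = 0, x);

/* Second derivatives. */
d2f : diff(f(x), x, 2);
d2g : diff(g(x), x, 2);

/* Pair each critical root with the second derivative evaluated there. */
float(map(lambda([s], [s, subst(s, d2f)]), crit_f));
float(map(lambda([s], [s, subst(s, d2g)]), crit_g));
```

For g, the second derivative is zero at the single root, so the test is inconclusive there and the first derivative test is needed.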
Exercise 7.2
Find the extreme values of the following functions, and determine by use
of the second derivative test whether they are maxima or minima. Confirm
your results with Maxima graphs.
1. y = x² − 8 · x + 10   2. y = x · (6 − x)²   3. y = x² + 8
4. y = x⁴ − 2 · x² + 6   5. y = x · e^(−x)   6. y = x + 1/x
7. y = x³/3 + x²/2 + 12 · x   8. y = x/(x + 1)   9. y = 1/(x + 4)
Figure 7.8 shows that the condition ∂z/∂x = ∂z/∂y = 0 can indicate that
either a local maximum or a local minimum has been achieved. We emphasize
“can,” because two other possibilities exist. The function could
have reached an inflection point, just as we saw could happen with a single
independent variable. Furthermore, it could reach a saddle point.
To envision a saddle point, imagine yourself at a point on a three dimensional
shape. When you look in one direction you appear to be at the top of a hill
(where the slope is zero). Now, turn 90 degrees. You are still at a point where
the slope is zero, but the shape running from the back to the front of the
saddle is now a valley and you are at the minimum. Figure 7.9 illustrates
the case of a saddle point. Standing at the point indicated by +, when you
face in the y direction, you are atop the surface (at least locally). When you
face in the x direction you are (at least locally) at the minimum point on the
surface.
Figure 7.10 shows a more complicated picture. The function here is z = x³ · y²,
for which ∂z/∂x = 3 · x² · y² and ∂z/∂y = 2 · x³ · y. When either x = 0 or
y = 0, both partial derivatives equal zero, but no local extreme values are
apparent. In particular, observe that moving from negative to positive values
and a constraint (green plane). The objective is to move as high up the hill
as the constraint allows. The graph does not provide enough information to
determine the highest feasible value of z.⁶
In general, imposing a constraint on a maximization problem must result in an
extreme point whose value is less than or equal to the extreme value obtained
when the same objective function
is maximized in the absence of the constraint. Similarly, imposing a
constraint on a minimization problem must result in an extreme point whose
value is greater than or equal to the value obtained when the same objective
function is minimized in the absence of the constraint.
We generally try to solve a constrained optimization problem by one of two
methods. The first involves substituting the constraint into the objective
function, then proceeding as if one were maximizing or minimizing an un-
constrained function. This method seems straightforward. Unfortunately,
it often becomes complicated and quite troublesome when the objective
function and constraint(s) are something other than very simple functions.
6
We determine below that (x = 7/5, y = 16/5) is the combination that yields the
largest value of z, z = 19/5.
The Greek letter λ (lambda) is a newly created unknown variable that has
the property of being able to apply to the constrained objective function
precisely the same first-order condition applied when an extremum is found
in the absence of a constraint. The new variable λ has an important inter-
pretation: it equals the change in the objective function per unit change in
the constraint.
First-order Conditions
We determine the optimal values of x and y in three steps: First we differentiate
the new objective function with respect to x, y, and λ. Then we
set these partial derivatives equal to 0. Finally, we solve this system of three
equations for the three unknown values.
Thus, we create a system of these three equations:
Lx = fx − λ · gx = 0,
Ly = fy − λ · gy = 0,
and
Lλ = −g(x, y) = 0.
We can solve these equations for the critical roots of the function L(x, y, λ).
Note that the last of the three first-order conditions is actually nothing more
than the constraint that must be satisfied when the extreme point is found.
Example 1. We now apply this approach to the function and constraint
that generate Figure 7.12. The two expressions, stated as Maxima output,
are f(x, y) := −((x − 5)² + (y − 5)²) + 20 and g(x, y) := x + y/2 − 3.⁷
The four commands below create the Lagrangian expression and determine
the first partial derivatives.
L : f(x,y)- %lambda*g(x,y);
Lx: diff(L,x);
Ly: diff(L,y);
Llambda: diff(L,%lambda);
Inserting these values into f (x, y) shows the maximum value of z given this
constraint. Use the command subst(soln,f(x,y)); to generate the result
19/5, the maximum attainable value of z.⁸
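The whole calculation can be sketched as one Maxima session. This is an illustration under stated assumptions: the objective is read as f(x, y) = −((x − 5)² + (y − 5)²) + 20, the form consistent with the solution reported in footnote 6, and soln is taken to be the first (and only) solution that solve returns.

```maxima
/* Objective and constraint (objective form is an assumption; see lead-in). */
f(x, y) := -((x - 5)^2 + (y - 5)^2) + 20;
g(x, y) := x + y/2 - 3;

/* Lagrangian and its first partial derivatives. */
L : f(x, y) - %lambda*g(x, y);
Lx : diff(L, x);
Ly : diff(L, y);
Llambda : diff(L, %lambda);

/* solve returns a list of solutions; keep the first one. */
soln : first(solve([Lx, Ly, Llambda], [x, y, %lambda]));

/* Value of the objective at the constrained optimum. */
subst(soln, f(x, y));
```

Under these assumptions the last command should reproduce the value 19/5 cited in the text, at x = 7/5 and y = 16/5.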
7
When an expression is entered into Maxima, it is treated as being equal to zero unless
another value is expressly entered.
8
The accompanying workbook relaxes the constraint by a small amount and shows
that the value of λ closely approximates the resulting change in z’s value.
The first two entries are the (uncompensated) demand curves for x and y.
Note that this consumer’s income would be equally divided between the two
goods: px · x = m/2 = py · y. The Lagrangian multiplier λ can be interpreted
as the marginal utility of income. Given this function, this value is constant
for a given set of prices.⁹
Now, suppose that m = 100, px = 2, and py = 5. The command subst([m =
100, px=2, py = 5], soln) performs the (in this case simple) calculations
to yield [x = 25, y = 10, λ = 5]. We see that the consumer does spend
m/2 = 50 on each good. Also, λ is a positive constant so the consumer gains
5 “units” of utility per one-unit increase in m, no matter what the consumer’s
income level might be.
Often a minus sign (-) is used in front of the constraint in a Lagrangian
expression. Making this change does not affect the critical roots of the in-
dependent variables in the original objective function. There is an intuitive
explanation for why the sign of the constraint term is of no consequence. The
value of the constraint term, when the objective function is being maximized
or minimized, as appropriate, is equal to zero. Whether we add or subtract
zero is of no consequence.
9
Beware of two possible errors in interpreting this specific result. First, this constancy
is a characteristic of this class of utility functions and should not be treated as a general
result. Second, numerical values of utility have no meaning: the measures are
subjective.
The method of Lagrange identifies only those values of the independent vari-
ables that satisfy first-order or necessary conditions for an extreme point.
These values may or may not actually represent an extreme point. A second-
order test is necessary to provide further information on this matter. The
second-order test is as follows.
1. Given: Lx = Ly = 0 at x = x0, y = y0. Given also: Lxx, Lyy, and Lxy
exist at x = x0, y = y0.
2. Then, one of the following conditions must hold:
(a) If Lxx · Lyy − (Lxy)² > 0, and both Lxx and Lyy are negative, then we have
a relative maximum at x = x0, y = y0.
(b) If Lxx · Lyy − (Lxy)² > 0, and both Lxx and Lyy are positive, then we have
a relative minimum at x = x0, y = y0.
(c) If Lxx · Lyy − (Lxy)² ≤ 0, then the second-order test fails and is incapable of
indicating whether or not a relative extreme point exists. A relative extreme
point may exist. One must analyze the function z = f(x, y) in the neighborhood
of x = x0, y = y0 in order to ascertain whether a local extreme point
exists at x = x0, y = y0.
The analysis of such complicated cases is one of the areas in which a computer
algebra system becomes especially useful. Many points in the neighborhood
can typically be evaluated quickly, providing insights into the function’s be-
havior in what might be a critical region.
The second-order test outlined above is quite similar to the second-order test
described for the case when an unconstrained extreme point is being sought.
There is, however, an important difference. Assume that Lxx · Lyy − (Lxy)² ≤ 0.
In the unconstrained case, a saddle point exists when fxx · fyy − (fxy)² < 0, and
an extreme point may exist when fxx · fyy − (fxy)² = 0. In the constrained
case, however, we can say nothing about the existence of a saddle point when
Lxx · Lyy − (Lxy)² < 0. An extreme point may exist when Lxx · Lyy − (Lxy)² = 0
as well as when Lxx · Lyy − (Lxy)² < 0 in the constrained case.
Summary of Conditions for Constrained Extremum: z = f (x, y) subject to
g(x, y) = 0.
First-order condition
Lx = Ly = Lλ = 0
Second-order condition
(a) Maximum: Lxx · Lyy − (Lxy)² > 0 and Lxx, Lyy < 0
(b) Minimum: Lxx · Lyy − (Lxy)² > 0 and Lxx, Lyy > 0
(c) Test fails: Lxx · Lyy − (Lxy)² ≤ 0
Example 1. Find the extremum of z = x² + y² − 4 · x − 4 · y + 7 subject to
x + y = 4.
Form the Lagrangian function L = x² + y² − 4 · x − 4 · y + 7 − λ · (x + y − 4).
Derive the first-order conditions:
Lx = 2 · x − 4 − λ = 0,
Ly = 2 · y − 4 − λ = 0, and
Lλ = −(x + y − 4) = 0.
The first two conditions imply that 2 · x − 4 = λ = 2 · y − 4, so that x = y.
The constraint then implies that x = y = 2.
The second-order expressions are Lxx = 2, Lyy = 2, and Lxy = 0. Therefore
Lxx · Lyy − (Lxy)² = 4 − 0 > 0, which ensures that an extreme value exists.
Furthermore, Lxx > 0 and Lyy > 0 indicate that a minimum value has been
found.
The value of λ is 0, implying that relaxing the constraint would move us
no closer to the unconstrained minimum. As it happens, this constraint
is irrelevant: The constrained optimum is the unconstrained minimum. Repeat this
example with a different constraint (for example, x + y = 5) and confirm that λ ≠ 0; also, interpret the new
value.¹⁰
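A minimal Maxima sketch of this example follows. To make the role of λ visible, it replaces the right-hand side of the constraint with a symbol k; that symbol, and the names Lag, soln, and zstar, are hypothetical and introduced here only for illustration. The sketch uses the convention L = z − λ · (x + y − k).

```maxima
/* Objective from Example 1. */
z : x^2 + y^2 - 4*x - 4*y + 7;

/* Lagrangian with constraint x + y = k (k is a hypothetical parameter). */
Lag : z - %lambda*(x + y - k);

/* First-order conditions solved for x, y, and the multiplier. */
soln : first(solve([diff(Lag, x), diff(Lag, y), diff(Lag, %lambda)],
                   [x, y, %lambda]));

/* Constrained optimum as a function of k. */
zstar : ratsimp(subst(soln, z));

/* Compare the multiplier with the derivative of the optimal value. */
[subst(soln, %lambda), ratsimp(diff(zstar, k))];

/* The case in the text: k = 4. */
subst(k = 4, soln);
```

Both entries of the comparison list should equal k − 4, so the multiplier is zero at k = 4 and nonzero for any other value of k.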
Example 2. Find the extremum of z = x · √y subject to x + y = 4. The
original expression and the Lagrangian expression are these:
[x · √y, x · √y − λ · (y + x − 4)].
The terms that relate to the second-order test (fxx, fyy, and fxy) are
[0, −x/(4 · y^(3/2)), 1/(2 · √y)].
[Table: values of z at (x, y) combinations near (8/3, 4/3); the entries are not recoverable from the source.]
The center entry (8/3, 4/3) is the solution that we found above. It is also
the largest value of z in this neighborhood of (x, y) combinations.
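A neighborhood check of this kind can be sketched in Maxima. The sketch assumes the objective z = x · √y shown in the Maxima output for this example and moves along the constraint x + y = 4; the spacing step = 1/6 and the names zc and xs are arbitrary choices made for illustration, not values taken from the text.

```maxima
/* Objective as shown in the Maxima output for Example 2. */
z(x, y) := x*sqrt(y);

/* Impose the constraint x + y = 4 by writing y = 4 - x. */
zc(x) := z(x, 4 - x);

/* Points just below, at, and just above x = 8/3 (step is hypothetical). */
step : 1/6;
xs : [8/3 - step, 8/3, 8/3 + step];

/* Rows of [x, y, z] along the constraint. */
makelist([xv, 4 - xv, float(zc(xv))], xv, xs);
```

The middle row, (8/3, 4/3), should show the largest value of z among the three.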
Exercise 7.4
Solve the following constrained optimization problems by the method of La-
grange multipliers. Use Maxima to confirm your computations.
1. z = 2 · x² + y² subject to x + y = 1
2. z = x² − 2 · x · y + y² subject to x + y = 2
3. z = x² + 4 · y² + 24 subject to x − 4 · y = 10
4. z = 4 · x² + x · y + 3 · y² subject to x + 2 · y = 21
5. z = 6 · x² − x · y + 5 · y² subject to 2 · x + y = 24
6. z = 3 · x² + y² − 2 · x · y − 8 subject to x + y = 1
Figure 7.14: Demand, cost, and profits Revenue, cost, and profits
over a long time, then fractional units make sense. If it is thinking of this
week, only a discrete number of books can be sold.
Fortunately, Maxima lets us have it both ways. The following three commands
determine price, output, and profit if the bookstore sells 6.7329, 6, or
7 books per week:
subst(x=x1,[x,p,TR-TC]);
subst(x=floor(x1),[x,p,TR-TC]);
subst(x=ceiling(x1),[x,p,TR-TC]);
The second derivative of the profit function is −0.999 · x − 0.35. For any
of the three x values above, this expression is negative. We have, therefore,
determined a profit-maximizing quantity.
As Figure 7.15 illustrates, the MPL curve cuts the APL curve at the APL
curve’s highest point. That is, MPL = APL at the value of L for which APL
is at a maximum. We can demonstrate this mathematically by showing the
conditions under which APL is at a maximum. We differentiate APL = Q/L
with respect to L using these two Maxima commands: depends(Q,L) and
diff(Q/L, L).¹⁶ The result is
$$\frac{\frac{d}{dL}Q}{L} - \frac{Q}{L^{2}}.$$
Setting this derivative equal to
zero and multiplying through by L, which has a positive value, implies that
MPL = APL is a necessary condition for APL to achieve an extreme value.
To confirm that this value is a maximum requires evaluating the second
derivative of APL(L). The derivative supplied by Maxima is this:
$$\frac{\frac{d^{2}}{dL^{2}}Q}{L} - \frac{2\,\frac{d}{dL}Q}{L^{2}} + \frac{2Q}{L^{3}}.$$
The second term is −2 · MPL/L² and the third term is 2 · (Q/L) · (1/L²) = 2 · APL/L².
Because the first-order condition requires that APL = MPL, these sum to
zero. Therefore, the sign of the first term determines the sign of the second
derivative of APL(L). The numerator of that term, the first derivative of
MPL(L), is negative for all L. The test confirms what the graph shows: at
L = L1, the average product of labor reaches its maximum value.¹⁷
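The derivation can be sketched in Maxima as follows. The first part uses only the commands named in the text. The second part applies the result to a hypothetical production function, Qh = 30 · L² − L³, which is not from the text and serves only to show MPL = APL at the maximum of APL.

```maxima
/* General case: Q is an unspecified function of L. */
depends(Q, L);
dAPL : diff(Q/L, L);
d2APL : diff(Q/L, L, 2);

/* Hypothetical production function, for illustration only. */
Qh : 30*L^2 - L^3;
APLh : expand(Qh/L);
MPLh : diff(Qh, L);

/* Value of L at which APL has zero slope. */
L1 : rhs(first(solve(diff(APLh, L) = 0, L)));

/* MPL and APL coincide there; the second derivative of APL is negative. */
[subst(L = L1, MPLh), subst(L = L1, APLh), diff(APLh, L, 2)];
```

For this assumed function the first two entries of the final list should be equal, and the third should be negative.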
We can now see why production by a price-taking firm occurs within the
range of decreasing marginal product. In this example, production occurs
where MPL = 41.666 and APL = 59.562. More generally, the following relationships
hold: MC = w/MPL and AVC = w/APL. Profit-maximization
requires that MC = p, so MPL = w/p > 0, where w/p is the “real wage”
that the firm pays. The price level must exceed the minimum value of AVC,
so MC > AVC, implying that MPL < APL.¹⁹
7.5.5 Taxation
Chapter 6 examines the effects of imposing a tax on either the sale or the
purchase of a single good or service when both buyers and sellers are price
takers. It shows that whether the tax is nominally imposed on buyers or
on sellers is beside the point. Either way, the effects on the price paid by
the buyers and the price received by the sellers are determined by the relative
values of the price elasticity of demand and the price elasticity of supply.
This section examines the case in which the seller in question is a price-
searcher. More specifically, we examine the polar case of the pure monopolist.
In the case of pure competition (price takers on both sides of the market),
the analysis proceeds by solving a system of two equations (the demand
and supply curves) and then imposing and tracing the effects of a change.
(This approach is another example of comparative statics analysis.) When
examining the case of price takers’ markets, we did not look at the behavior
of any single firm or buyer; the effects of their behavior are summarized in
the supply curve and the demand curve respectively. Therefore, we did not
have to apply any optimization conditions.
With a single price-searching seller, however, the firm’s optimizing behavior
is central to the analysis. As before, we follow the analytical approach that
Bishop [2] provides. Begin by defining the objective function: the function
π = TR(x) − TC(x) − t · x is to be maximized.
The first-order condition for maximizing π is TRx − TCx − t = 0. The second-
19
If the firm is a price-searcher, then employment can occur at a lower L, where MPL
might exceed APL. The accompanying workbook shows how this can happen. Likewise,
if the firm faces an upward-sloping supply curve of labor, it might employ an amount of
labor such that MPL > APL.
order condition is that πxx = TRxx − TCxx < 0.²⁰ That is, the marginal
revenue’s slope must be less than that of the marginal cost. In most cases,
we expect MR curves to slope downward (TRxx < 0) and marginal cost
curves to be either horizontal or upward sloping (TCxx ≥ 0).
In order to determine the nature of dp/dt, we first apply the implicit function
theorem to the first-order condition and determine dx/dt. First, ∂πx/∂x =
TRxx − TCxx. And, of course, ∂πx/∂t = −1. Therefore,
$$\frac{dx}{dt} = -\frac{\partial \pi_x/\partial t}{\partial \pi_x/\partial x} = \frac{1}{TR_{xx} - TC_{xx}}.$$
Multiplying this term by dp/dx yields the result that we seek:
$$\frac{dp}{dt} = \frac{p_x}{TR_{xx} - TC_{xx}},$$
where px is the slope of the inverse demand curve. Both the numerator and
the denominator are negative, so dp/dt > 0.
This formulation allows direct analysis of the case in which the demand curve
and the marginal cost curve are linear. If p = α + a · x, then MR = α + 2 · a · x.
Likewise, if AVC = b + c · x then MC = b + 2 · c · x. The slopes of MR and MC
are 2 · a and 2 · c. The table below summarizes these aspects and implications
of the linear price and average variable cost functions. The results imply that
dp/dt = a/(2 · (a − c)). Recalling that a < 0 and c ≥ 0, these values imply
that dp/dt < 1, a result that is similar to the case of price-taker markets.
We can say a bit more. Suppose that c = 0. Then dp/dt = 1/2. For c > 0,
dp/dt < 1/2. As with price-taker markets, the relative values of a and c
determine the effect of the tax on the prices paid and received.
p            p slope      TR               MR               MR slope
a·x + α      a            a·x² + α·x       2·a·x + α        2·a

AVC          AVC slope    TVC              MC               MC slope
c·x + b      c            c·x² + b·x       2·c·x + b        2·c
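The linear case can be reproduced with a short Maxima sketch that uses the symbols in the table. The names profit, xstar, pstar, and dpdt, and the numerical slopes in the last line, are assumptions introduced for illustration.

```maxima
/* Inverse demand and costs, using the table's symbols. */
p : a*x + alpha;
TR : p*x;
TVC : (c*x + b)*x;

/* Profit net of the per-unit tax t. */
profit : TR - TVC - t*x;

/* Profit-maximizing output from the first-order condition. */
xstar : rhs(first(solve(diff(profit, x) = 0, x)));

/* Price at that output, and its response to the tax. */
pstar : subst(x = xstar, p);
dpdt : ratsimp(diff(pstar, t));

/* Hypothetical slopes: a = -1 with c = 0, then with c = 1/2. */
[subst([a = -1, c = 0], dpdt), subst([a = -1, c = 1/2], dpdt)];
```

The expression dpdt should be equivalent to a/(2 · (a − c)); with the assumed slopes it evaluates to 1/2 and 1/3.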
20
Also, the local maximum profit must be larger than at any other output rate. In
particular, it must exceed π(0), the profit (positive or negative) that the firm earns when
x = 0.
Unlike the case of price-taker markets, the result above is not general. We
can easily find a case in which dp/dt > 1. Suppose that the demand curve
has constant elasticity at each price and that marginal cost is constant (its
slope is 0). Recalling that MR = p · (1 + 1/Epd), setting MR = MC implies
that p = MC/(1 + 1/Epd). Here, Epd is the price elasticity of demand.
For a price-searching firm to produce an appreciable amount of a good with
this type of demand curve, marginal revenue must be positive. That requires
that the demand curve be elastic. That is, Epd < −1. This, in turn,
implies that 0 < 1 + 1/Epd < 1. Because Epd is a constant, 1/(1 + 1/Epd) is a constant
markup that maximizes profits. If a firm’s sale is taxed, then the tax-inclusive
marginal cost is another constant, c + t, and the price will rise by t · 1/(1 + 1/Epd) > t.
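The markup argument can be checked with a brief Maxima sketch. The symbol mc stands for the constant marginal cost (c in the text), and the names markup, pstar, and dpdt are introduced here for illustration.

```maxima
/* Constant markup implied by MR = MC with constant elasticity Epd. */
markup : 1/(1 + 1/Epd);

/* Profit-maximizing price when the tax-inclusive marginal cost is mc + t. */
pstar : (mc + t)*markup;

/* Response of price to the tax. */
dpdt : ratsimp(diff(pstar, t));

/* With Epd = -2 the price rises by twice the tax. */
subst(Epd = -2, dpdt);
```

Any elasticity below −1 gives a value of dpdt greater than 1, which is the case that the linear model rules out.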
Figure 7.17 shows the impact of a $1 excise tax on the sale of products in
a market with linear demand and cost curves and in a market with constant
elasticity and constant marginal cost. The details are in the workbook that
accompanies this chapter.
In the linear case, the price increase is clearly less than the vertical displacement
of the marginal cost curve (which in this illustration is $1). In
some sense, therefore, we can say that the buyers and the single seller share
the cost of the tax. In the second example, however, the price rises by more
than the tax (twice as much in this case because we use Epd = −2), so the
question of “sharing” cannot be stated in simple terms.
One might be tempted to think that the monopolist gains from having a
tax imposed, because the price rises by more than the tax. This inference
is incorrect, because it ignores two effects of the tax. First, the monopolist
must pay the tax and, second, the monopolist’s output decreases. In this
illustrative example the monopolist’s profit (less fixed cost, which we ignore)
falls from $7.5 to $6.0.²¹
This illustrative case that results in dp/dt > 1 is not the only case in which
this result can occur. Bishop [2] provides the general conditions that lead to
this result. Also Bishop addresses the impact of ad valorem taxes.
the firm expects to sell 60,000 units over the space of the next year. Let us
further assume that these sales will be spaced evenly throughout the year,
so that 60,000/ 12 = 5000 units will be sold each month.
Storage Cost. Let U represent the number of units that the firm receives
when it reorders. This means that the average number of units the firm
has in its inventory (assuming that the sales of the units are spaced evenly
throughout time) is U/2. There are costs associated with maintaining a unit
of inventory in terms of protection, storage, and so forth. Let c represent the
cost of maintaining a unit of inventory for one year. Hence c · U/2 is the total
cost of maintaining an average inventory of U/2 units.
Reordering cost. Assume that two separate types of costs are associated
with reordering to replenish the inventory. The first type of cost is fixed in
nature and does not vary with the size of the order. The cost of recording
an order (which presumably does not depend on the size of the order) is an
example of this type of cost. We represent this fixed cost by the letter f . The
second type of cost varies directly and proportionately with the size of the
order and covers the incremental cost of shipping and packaging each unit
in the order.²² Let b refer to the incremental cost associated with reordering
each of U units.
The total cost of reordering in a specific instance is equal to the sum f +b·U .
Since a total of Q units is eventually needed for sale, and U units are reordered
each time, a total of Q/U reorders are made during the year. This means that
the total cost of reordering during the entire year is given by (f +b·U )·(Q/U ).
Total cost (storage and reordering). The total inventory cost (TIC)
associated with storing and reordering is TIC = c · U/2 + (f + b · U) · Q/U,
which can be rewritten as TIC = c · U/2 + f · Q/U + b · Q. Note that b is
not a coefficient of U.
To determine the cost-minimizing number of orders requires that we find the
optimal value of U (the order size), because the number of orders placed
is Q/U. Thus, we require the solution to dTIC/dU = 0, where dTIC/dU =
c/2 − f · Q/U². Setting this expression equal to zero and solving yields
the expression U² = 2 · f · Q/c, so that U = √(2 · f · Q/c). This result is
commonly called the square root law of inventory management. Because b is
22
Proportionality is not required. If the relationship between order size and order cost
were more complicated, the analysis would proceed in the same way. The exact nature of
the solution would, of course, differ somewhat.
not a coefficient of U, its value has no effect on the optimal inventory level
(or, equivalently, the optimal number of orders per year).
To confirm that we have found a cost-minimizing order size, evaluate the second
derivative, which is d²TIC/dU² = 2 · Q · f/U³. All terms in this expression are
positive, so the value of U that satisfies the first-order condition corresponds
to a minimum value of T IC.
We determine the optimal value of U, given the following: f = 500 and Q =
60000, with c = 20 (the value consistent with the result that follows). That
value is U ≈ 1732.05, which implies that the optimal number
of orders, N, is 34.64. Of course, the actual number must be an integer, and
might be constrained by provider-imposed restrictions. To see what other
values of N imply, substitute U = Q/N into TIC, so that
$$TIC = \frac{2N^{2}f + Qc + 2NQb}{2N}.$$
Using the coefficient values above and b = 1 generates the relationship that
Figure 7.18 depicts.
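The inventory calculation can be sketched in Maxima. The parameter list uses f = 500, Q = 60000, and b = 1 from the text; c = 20 is an assumption, chosen because it is the value that reproduces U ≈ 1732.05. The names params, Ustar, Nstar, and TIC_N are introduced here for illustration.

```maxima
/* Total inventory cost as a function of the order size U. */
TIC : c*U/2 + (f + b*U)*Q/U;

/* First-order condition and its roots (the positive root is relevant). */
dTIC : ratsimp(diff(TIC, U));
solve(dTIC = 0, U);

/* Second derivative: positive for positive U, so the root is a minimum. */
d2TIC : ratsimp(diff(TIC, U, 2));

/* Parameter values; c = 20 is an assumption (see lead-in). */
params : [f = 500, Q = 60000, c = 20, b = 1];
Ustar : float(subst(params, sqrt(2*f*Q/c)));
Nstar : subst(params, Q)/Ustar;

/* Cost as a function of the number of orders N, at the nearest integers. */
TIC_N : subst(U = Q/N, TIC);
makelist([n, float(subst(append(params, [N = n]), TIC_N))], n, [34, 35]);
```

Comparing the two rows of the final list shows which integer number of orders is cheaper under the assumed parameters.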
for N > 25 or so. Of course, this result is due to the parameters of this example
and is not general.
• WL = fL − λ · w = 0
• WK = fK − λ · r = 0
• Wλ = C0 − w · L − r · K = 0
23
The two-input limitation is for demonstration only. The methods easily extend to
any number of inputs.
• VL = w − µ · fL = 0
• VK = r − µ · fK = 0
• Vµ = Q0 − f(L, K) = 0
The first two first-order conditions yield implications that are identical to
the implications of the first-order conditions that we derived above. Also, we
can determine that µ = 1/λ = w/MPL = r/MPK. For this firm w is the
marginal input cost of labor, the cost of acquiring one more unit of labor,
and MPL is, as always, the change in output from a small change in L.
Therefore, the ratio of the two is the ratio of the cost change to the output
change, which is the marginal cost. Therefore, µ = MC.
Profit Maximization
Now, we can reexamine the implications of profit maximization. We have
established that profit maximization requires producing a quantity such that
MR = MC. We now know that MC = w/MPL = r/MPK if the firm’s
output is to be produced in a least-cost fashion. These conditions combine to
imply that the firm must employ labor and capital at rates such that MR =
a. z = f(x, y) = 2 · x³ + y²
b. z = f(x, y) = x² + x² · y + y²
c. z = f(x, y) = x³ − x · y²
d. z = f(r, s) = r + 2 · r² + s − s³
e. z = f(P, Y) = 12 · P · Y − P · Y²
f. z = f(L, K) = 10 · L^0.75 · K^0.25
2. Heinz Westphal, Vintner, imports Rhein and Mosel wines. The value of
the wine V increases as time passes according to the following formula:
V = 6 · 2.5^(√t), where t = ageing time in years. The present value of
the wine PV, given a discount rate r and continuous appreciation, is
PV = V · e^(−r·t).
(a) How long should Westphal hold the wine before selling it in order to
maximize the present value of the wine? That is, what t maximizes
P V ? State your solution as a general expression in terms of r.
(b) If r = 0.08, what is the corresponding t that maximizes P V ?
(a) Find the quantities of M and L that minimize the cost of doing
so.
(b) With respect to the production function, do diminishing marginal
returns exist with respect to M , L, or both?
(c) Does the production function exhibit increasing, decreasing, or
constant returns to scale? (d) Does Euler’s theorem apply?²⁵
a. y = f(x) = 2 · x²
b. y = 6 + 0.15 · x
c. y = 6 · x² + 2 · x + 1
d. y = a + b · x + c · x² + d · x³
e. y = 10 + 5 · x + 2 · x² − x³
f. y = sin(x)
g. y = sin(x²)
h. y = 15 − x + 2 · x² + x³
i. y = 100/x²
25
At first glance, this question should bother you. Isn’t the ratio of meat to finished
meat close to 1? Econometricians estimating production functions like this one run into
a “dominant variable” problem. One of the variables, like unbutchered meat here, varies
so closely with output that the impacts of the other variables cannot be detected very
accurately. This problem is one of detection and estimation, however, and not one of
existence. Just as butchered meat cannot be produced without unbutchered meat, it
cannot be produced without other inputs like labor and capital. It is possible that the
elasticity of substitution is much lower than 1, which is the elasticity of substitution in this
example. Replacing this function with, say, a constant-elasticity-of-substitution function
would not change the nature of the analysis, just its difficulty.
8. The West Mifflin Ford dealership expects to sell 1000 new Mustangs
during the next year. These sales will be evenly spaced throughout the
year. The cost of storing an unsold Mustang for 1 year is $1500. The
cost of placing a new order for new Mustangs is $700 plus $250 per new
automobile ordered.
(a) What is the optimal size of order that West Mifflin should place
when it orders new Mustangs?
(b) How many such orders should West Mifflin place during this year?
(c) Determine total inventory cost for the two integer values nearest
to the computed value.
(d) How much would total inventory cost change if Ford allows no
more than 24 orders per year?
9. The state of Taxonia wishes to maximize the total tax revenue T that it
receives from a per-unit tax of amount t per unit that it is going to place
on the output of Monopoly, Inc., to which Taxonia grants monopoly
status. The total revenue T R function of Monopoly, Inc., is given by
$TR = 6 \cdot Q - Q^2$, while its total cost ($TC$) function in the absence of
a sales tax is given by T C = 2 · Q, where Q is units of output.
(a) What tax per unit will maximize total tax receipts for the state of
Taxonia?
(b) How much tax revenue (T) will this tax raise?
(c) What are Monopoly Inc.’s profit-maximizing price and output in
this situation?
Chapter 8
Integral Calculus
CHAPTER 8. INTEGRAL CALCULUS 207
when we found the derivative of a function, that derivative was unique. For
example, if $y = F(x) = 2 \cdot x^2$, then $dy/dx = 4 \cdot x$, which is unique. There is
no other value or function that is the first derivative of the function $y = 2 \cdot x^2$.
We stress the non-uniqueness of an antiderivative (integral), because it is
important to the understanding of the process of integration. In general, the
indefinite integral of f (x) = 2 · x would be
$F(x) = x^2 + C$, for $\frac{dF(x)}{dx} = \frac{d(x^2 + C)}{dx} = 2 \cdot x = f(x)$.
Geometrically, $y = x^2 + C$ represents a family of curves that are parallel to
one another, but have a vertical displacement from one another.
Figure 8.1 illustrates such a family of curves for the function $F(x) = x^2 + C$.
Unless we know the value of the arbitrary constant C, we cannot determine
the unique antiderivative of a given function. When additional information
is supplied concerning the value of the constant C, we state that the initial
conditions or boundary conditions have been specified. In the example de-
picted in Figure 8.1, if we are given the initial condition that x = 0, and
the value F (x) = F (0) = 3, then the value of the constant C is determined.
Because $F(0) = 0^2 + C = 3$, we have $C = 3$. Thus
$F(x) = x^2 + C$ becomes $x^2 + 3$. Figure 8.1 shows $x^2 + C$ for three values
of $C$: 0, 20, and -20.
$\frac{d\left(\frac{1}{n+1} \cdot x^{n+1} + C\right)}{dx} = \frac{n+1}{n+1} \cdot x^n = x^n$, so $f(x)\,dx = x^n\,dx$.
The proof above correctly suggests that the derivative of an integral must
always be equal to the integrand. That is, if the correct integration has been
performed, d(F (x) + C)/dx must be equal to f (x).
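Examples
1. $\int x^5\,dx = \frac{1}{6} \cdot x^6 + C$. Check: $\frac{d(x^6/6 + C)}{dx} = x^5$
2. $\int x\,dx = \frac{1}{2} \cdot x^2 + C$. Check: $\frac{d(x^2/2 + C)}{dx} = x$
3. $\int dx = \int 1\,dx = \int x^0\,dx = x + C$. Check: $\frac{d(x^1/1 + C)}{dx} = x^0/1 = 1$
4. $\int \sqrt{x}\,dx = \int x^{1/2}\,dx = \frac{2}{3} \cdot x^{3/2} + C$. Check: $\frac{d(\frac{2}{3} \cdot x^{3/2} + C)}{dx} = x^{1/2} = \sqrt{x}$
5. $\int \frac{1}{x^2}\,dx = \int x^{-2}\,dx = -\frac{1}{x} + C$. Check: $\frac{d(-(1/x) + C)}{dx} = \frac{d(-x^{-1} + C)}{dx} = x^{-2} = 1/x^2$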
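The "Check" step in each example can also be carried out numerically. The following is a minimal sketch in Python (used here only for illustration; the text's own computations are done in Maxima). The helper names `numerical_derivative` and `max_check_error`, the step size `h`, and the test points are assumptions introduced for this sketch. It differentiates each antiderivative with a central difference and compares the result with the integrand.

```python
import math

def numerical_derivative(F, x, h=1e-6):
    """Central-difference approximation to dF/dx at x (h is an assumed step size)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# (antiderivative F, integrand f) pairs from the five examples, with C set to 0
pairs = [
    (lambda x: x**6 / 6, lambda x: x**5),
    (lambda x: x**2 / 2, lambda x: x),
    (lambda x: x, lambda x: 1.0),
    (lambda x: (2 / 3) * x**1.5, lambda x: math.sqrt(x)),
    (lambda x: -1 / x, lambda x: 1 / x**2),
]

def max_check_error(points=(0.5, 1.0, 2.0, 3.0)):
    """Largest gap between F'(x) and f(x) over all examples and test points."""
    return max(abs(numerical_derivative(F, x) - f(x))
               for F, f in pairs for x in points)

print(max_check_error())  # a very small number if every antiderivative is correct
```

Because the derivative of a constant is zero, the same check passes for any value of C, which is another way of seeing why the antiderivative is not unique.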
The power rule of integration explicitly requires that $n \neq -1$. The following
example demonstrates why this restriction is necessary. Let us try to find
the integral of $f(x) = 1/x$: $\int (1/x)\,dx = \int x^{-1}\,dx = (1/0) \cdot x^0$. Hence, when
$n = -1$, the power rule no longer applies because the integral is undefined
due to division by 0. The following rule deals with this type of situation.
To prove this result, note that the differential of the right-hand side of the
equation describing the general logarithmic rule is d(log(|x|) + C), which is
equal to (1/x)dx. We started our integration with (1/x)dx on the left-hand
side, so the general logarithmic rule is proved.
The antiderivative in the general logarithmic rule contains an absolute-value
sign. This is used because real-valued logarithms do not exist for negative values of any
variable x. When working with a problem in which we are certain that
the domain of a variable consists only of positive values, we may omit the
absolute-value sign.
It is possible, however, for the initial condition to have a value other than
that of the constant of integration. Exponential functions sometimes furnish
examples of this phenomenon. Consider the integral $\int e^x\,dx$ with the initial
condition $F(0) = 3$. Thus $F(x) = \int e^x\,dx = e^x + C$. Hence $F(0) = e^0 + C = 3$,
or $1 + C = 3$, and so $C = 2$. That is, $F(0) = 3 \neq C = 2$. Therefore we should
not always assume that the constant of integration and the initial condition
of the function are identical.
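The two initial-condition calculations in this section follow the same pattern: evaluate the antiderivative without its constant at the given point and subtract the result from the required value. A minimal Python sketch follows; the function name `constant_of_integration` is hypothetical and is not part of the text or of Maxima.

```python
import math

def constant_of_integration(F_without_C, x0, value):
    """Solve F(x0) + C = value for C, where F_without_C omits the constant."""
    return value - F_without_C(x0)

# F(x) = x^2 + C with F(0) = 3: here C equals the initial value
C_square = constant_of_integration(lambda x: x**2, 0, 3)

# F(x) = e^x + C with F(0) = 3: here C differs from the initial value
C_exp = constant_of_integration(math.exp, 0, 3)

print(C_square, C_exp)  # prints: 3 2.0
```

The constant coincides with the initial value only when the antiderivative without its constant is zero at the initial point, as it is for $x^2$ at $x = 0$.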
where $F_i(x)$ is the antiderivative for $f_i(x)$ for all $i = 1, 2, \ldots, n$ and $C = \sum_{i=1}^{n} C_i$, the sum of the arbitrary constants of integration.
The multiplicative property is this: $\int K \cdot f(x)\,dx = K \cdot \int f(x)\,dx = K \cdot F(x) + C$, where $K$ is any constant.
The linearity property combines the additive and multiplicative properties
to this: a linear combination of n functions has an integral that equals the
linear combinations of the integrals of the individual functions.
$\int \sum_{i=1}^{n} K_i \cdot f_i(x)\,dx = \sum_{i=1}^{n} K_i \cdot F_i(x) + C,$
1. $\int -4 \cdot x^2\,dx = -4 \cdot \int x^2\,dx = -\frac{4}{3} \cdot x^3 + C$
2. $\int (3 \cdot x^2 - 5 \cdot x + 1)\,dx = 3 \cdot \int x^2\,dx - 5 \cdot \int x\,dx + \int 1\,dx = x^3 + C_1 - \frac{5}{2} \cdot x^2 + C_2 + x + C_3 = x^3 - \frac{5}{2} \cdot x^2 + x + C$
3. $\int (8/x)\,dx = 8 \cdot \int (1/x)\,dx = 8 \cdot \log(|x|) + C$
1. The power rule: $\int u^n\,du = \frac{u^{n+1}}{n+1} + C$, for $n \neq -1$
2. The general logarithmic rule: $\int \frac{1}{u}\,du = \log_e |u| + C$
3. The general exponential rule: $\int a^u\,du = \frac{a^u}{\log_e a} + C$
4. The exponential rule, base e: $\int e^u\,du = e^u + C$
Integration by substitution is connected to the use of the chain rule in differ-
entiation. Integration is, as we have pointed out, the reverse of differentiation.
This means that when we introduce a new function u = g(x) in the process
of integration, the usual checking process (by which we ascertain whether
our integral is correct) must utilize the chain rule of differentiation. That is,
since integration by substitution involves the introduction of a new function
u, which is a function of x, the checking process must use the function-of-a-
function rule (the chain rule) in order to return us to the original function.
Examples
1. Evaluate the integral $\int 2 \cdot (e^{2 \cdot x} + 1)^2 \cdot e^{2 \cdot x}\,dx$.
Let $u = e^{2 \cdot x} + 1$. Then $du/dx = 2 \cdot e^{2 \cdot x}$, or $dx = \frac{1}{2 \cdot e^{2 \cdot x}}\,du$.
Our integral, stated in terms of u, is $\int 2 \cdot u^2 \cdot e^{2 \cdot x} \cdot du/(2 \cdot e^{2 \cdot x})$.
This simplifies to $\int u^2\,du = \frac{1}{3} \cdot u^3 + C$.
This, in terms of x, is $\frac{1}{3} \cdot (e^{2 \cdot x} + 1)^3 + C$.
constant e = 2.718 . . .), and the second is the integral stated in terms
of u, $\int u^2\,du$. If you want Maxima to evaluate the result of the output
above, enter this command: integrate(u^2,u);, producing the
result $u^3/3$. Be aware that Maxima does not generate the constant of
integration; you must keep in mind that it exists.
2. Evaluate $\int 3 \cdot x^2 \cdot (x^3 - 4)^2\,dx$. Let $u = x^3 - 4$, so that $du/dx = 3 \cdot x^2$,
or $dx = du/(3 \cdot x^2)$.
With the pertinent substitutions,
$\int 3 \cdot x^2 \cdot (x^3 - 4)^2\,dx = \int 3 \cdot x^2 \cdot u^2 \cdot (du/(3 \cdot x^2)) = \int u^2\,du$.
This final expression is evaluated as $u^3/3 + C = (1/3) \cdot (x^3 - 4)^3 + C$.
3. Evaluate $\int \frac{dx}{x - 2}$. Let $u = x - 2$, so that $du/dx = 1$ or $du = dx$.
Now, $\int \frac{dx}{x - 2} = \int \frac{du}{u} = \log(|u|) + C = \log(|x - 2|) + C$.
Example
Evaluate $\int \frac{2 \cdot x - 3}{x^2 - 3 \cdot x}\,dx$. Try this: Let $u = 2 \cdot x - 3$ so that $du/dx = 2$ or
$dx = du/2$.
Now $\int \frac{2 \cdot x - 3}{x^2 - 3 \cdot x}\,dx = \int \frac{u}{x^2 - 3 \cdot x} \cdot \frac{1}{2} \cdot du$, which is not integrable, since the
substitution created a new function with two variables, x and u. The proper
substitution should have been $u = x^2 - 3 \cdot x$.
Now $du/dx = 2 \cdot x - 3$, so $dx = \frac{du}{2 \cdot x - 3}$. Complete the steps to confirm that the
integral is $\log(|u|) + C = \log(|x^2 - 3 \cdot x|) + C$.
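The checking process described earlier for substitution (differentiate the result with the chain rule and recover the integrand) can be imitated numerically. The Python sketch below is illustrative only: the dictionary `examples`, the helper names, and the test points are assumptions made for this sketch. Each entry pairs the antiderivative obtained by substitution, restated in terms of x, with the original integrand.

```python
import math

def numerical_derivative(F, x, h=1e-6):
    """Central-difference approximation to dF/dx at x."""
    return (F(x + h) - F(x - h)) / (2 * h)

# substitution used -> (antiderivative in terms of x, original integrand)
examples = {
    "u = e^(2x) + 1": (lambda x: (math.exp(2 * x) + 1) ** 3 / 3,
                       lambda x: 2 * (math.exp(2 * x) + 1) ** 2 * math.exp(2 * x)),
    "u = x^3 - 4": (lambda x: (x**3 - 4) ** 3 / 3,
                    lambda x: 3 * x**2 * (x**3 - 4) ** 2),
    "u = x - 2": (lambda x: math.log(abs(x - 2)),
                  lambda x: 1 / (x - 2)),
    "u = x^2 - 3x": (lambda x: math.log(abs(x**2 - 3 * x)),
                     lambda x: (2 * x - 3) / (x**2 - 3 * x)),
}

def relative_gap(F, f, x):
    """Gap between F'(x) and f(x), scaled so large integrand values do not dominate."""
    return abs(numerical_derivative(F, x) - f(x)) / max(1.0, abs(f(x)))

test_points = (0.5, 1.0, 2.5)  # chosen to avoid x = 0, 2, and 3, where the logarithms are undefined

for name, (F, f) in examples.items():
    print(name, max(relative_gap(F, f, x) for x in test_points))
```

A large gap for any entry would signal that the substitution was carried back to x incorrectly, for example by writing the wrong power after replacing u.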
It is often possible to decide on the appropriate substitution by simple obser-
vation of the original integrand. That ability, however, usually means that
you have acquired the knowledge and foresight that seem to come only with
experience, some trial and error, and hard work. Integration is generally
considered to be a more difficult process to master than differentiation. The
correct way to integrate a function is not always readily apparent. Also, if the
substitution is carried out improperly so that a function or functions of two
or more variables results, then you must try a new substitution. There is no
completely general way to find the needed integral by means of substitution.
All of these difficulties are reasons why tables of integrals are so useful, and
one of the reasons that a computer algebra system can be a useful asset.2
Exercises 8.1
Integrate the following integrals. Determine how many of these integrals
Maxima can evaluate directly, without the use of the changevar command.
2
Many online integral tables are available. For example: http://integral-table.com/.
of the original region of interest. Eventually, if the process were carried out
long enough, the method of exhaustion would lead to a close approximation
of the area of a particular region.
We use the method of exhaustion to develop an intuitive and visual idea of
how the integral calculus is used to find the area under a curve. Instead of
using a many-sided polygon, we use a rectangle (a four-sided polygon).5
Figure 8.2 illustrates the continuous function y = f (x), where the domain
of the function is the closed interval [a, b]. The problem confronting us is to
calculate the shaded area, which is the area enclosed by the curve and the
abscissa between points a and b. We refer to this area as A.
As an illustrative approximation to the area defined above, we divide the
interval [a, b] into n subintervals (where n = 4 in our example) as shown in
Figure 8.3. Part (a) approximates the area under the curve by inscribing four
rectangles below the curve between points a and b. Part (b) approximates
the area under the curve between points a and b by inscribing four rectangles
from above the curve. The left-hand boundary of each rectangle in part (a)
has a minimum height of y = f (x), whereas the right-hand boundary of each
rectangle in part (b) has a height that represents the maximum value that
y = f (x) assumes in that subinterval.
5
Later, we look at numerical methods for estimating areas that cannot be determined
Figure 8.3: Approximating the area using rectangles: (a) approximation from
below, and (b) approximation from above.
The area of a rectangle is given by the product of the height and the width
of that rectangle. The first rectangle in Figure 8.3 has a height of f (x0 ) and
a width of ∆x0 = x1 − x0 . To generalize, the ith rectangle in part (a) has
a height of $f(x_i)$ and a width of $\Delta x_i$. The area of the ith rectangle is given
by $\text{Area}_i = f(x_i) \cdot \Delta x_i$. The total area in the four rectangles between points
a and b in part (a) is given by
$A_n^- = \sum_{i=0}^{3} f(x_i) \cdot \Delta x_i.$
We can see that this is an underestimate of the total area under the curve
between points a and b.
In similar fashion, we can measure the area of each rectangle in Figure 8.3.
This measure, which yields an overestimate of the area under the curve be-
tween points a and b, is equal to
$A_n^+ = \sum_{i=1}^{4} f(x_i) \cdot \Delta x_{i-1}.$
The two approximations to the area under the curve between points a and b
are labeled $A_n^-$ (underestimate) and $A_n^+$ (overestimate). It is apparent that
$A_n^- < A < A_n^+$. The unshaded portions of the rectangles under the curve in
part (a) and above the curve in part (b) are responsible for the differences
between $A_n^-$, $A$, and $A_n^+$.
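The squeezing of the true area between the two rectangle sums is easy to see numerically. The sketch below is a Python illustration under stated assumptions: it uses equal-width subintervals and an increasing function, so that the left endpoint of each subinterval gives the minimum height and the right endpoint the maximum. The function $f(x) = x^2$ on [0, 3] is a hypothetical choice, not the curve in Figure 8.3; its exact area is 9.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper rectangle sums for an increasing f on [a, b], n equal subintervals."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    lower = sum(f(xs[i]) * dx for i in range(n))      # left endpoints: minimum heights
    upper = sum(f(xs[i + 1]) * dx for i in range(n))  # right endpoints: maximum heights
    return lower, upper

f = lambda x: x**2  # hypothetical increasing function; exact area on [0, 3] is 9

for n in (4, 40, 400, 4000):
    lower, upper = lower_upper_sums(f, 0.0, 3.0, n)
    print(n, lower, upper, upper - lower)
```

With n = 4 the two sums are 5.90625 and 12.65625. The gap between them equals (f(b) − f(a)) · ∆x for an increasing function, so it shrinks in proportion to 1/n as the number of rectangles grows.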
Figure 8.4: Approximating the area using rectangles: (a) approximation from
below, and (b) approximation from above.
Definition: Let $A_n^+$ be the upper estimate and $A_n^-$ the lower estimate of the
area under the graph of y = f (x) when the interval [a, b] is divided into n
subintervals. If
$\lim_{n \to \infty} A_n^+ = \lim_{n \to \infty} A_n^-$
• Second, the symbol representing change, ∆x, has now been replaced
by the integration notation dx and represents an infinitesimal change.
That is, when we integrate f (x) over the interval from a to b, the
constant of integration disappears.
The first fundamental theorem of the calculus relates F (x) and f (x).
This is the theorem: Given an integrable function f (t) on a closed
interval [a, b], that is, given $\int_a^x f(t)\,dt = F(x)$ if a ≤ x ≤ b, then the
derivative of $F(x)$ exists at each value x and is equal to $f(x)$. That is,
$F'(x) = f(x)$.
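Examples
1. $\int_0^3 x\,dx = \left. \frac{x^2}{2} \right|_0^3 = 9/2 - 0 = 9/2$
2. $\int_{-1}^{2} (x^3 - 3 \cdot x^2)\,dx = \left. \left( \frac{x^4}{4} - x^3 \right) \right|_{-1}^{2} = \left( \frac{16}{4} - 8 \right) - \left( \frac{1}{4} + 1 \right) = -\frac{21}{4}$
3. $\int_3^9 \frac{dx}{x} = \log_e(9) - \log_e(3) = \log_e(9/3) = \log_e(3)$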
The loge rather than log is placed here as a reminder. As noted earlier,
Maxima uses log to mean loge and we follow this convention.
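The three definite integrals above can be confirmed without finding any antiderivative. The following Python sketch uses composite Simpson's rule, a standard numerical method that is not the method of the text; the function name `simpson` and the number of subintervals are assumptions of this sketch.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

ex1 = simpson(lambda x: x, 0, 3)                 # compare with 9/2
ex2 = simpson(lambda x: x**3 - 3 * x**2, -1, 2)  # compare with -21/4
ex3 = simpson(lambda x: 1 / x, 3, 9)             # compare with log(3)

print(ex1, ex2, ex3, math.log(3))
```

Note that `math.log` in Python, like `log` in Maxima, is the natural logarithm.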
The next two examples illustrate the fact that when we use the change-of-variable
technique in order to integrate a function, that is, when we integrate
by substitution, we must always use new limits of integration.
Examples
1. $\int_4^{10} \frac{dx}{2 \cdot x - 5}$.
Let u = 2 · x − 5. Then du/dx = 2, so dx = du/2. Note that when
we integrate with respect to u, the new limits of integration are: When
x = 4, u = 3 and when x = 10, u = 15. Thus
$\int_3^{15} \frac{1}{2} \cdot \frac{du}{u} = \left. \frac{1}{2} \cdot \log(|u|) \right|_3^{15} = (\log(15) - \log(3))/2 = \log(15/3)/2 = \log(5)/2.$
Alternatively, before we evaluate the integral, we can convert the
antiderivative back from u to x and then use the original limits of 4 and
10. That is,
$\int_3^{15} \frac{1}{2} \cdot \frac{du}{u} = \left. \frac{1}{2} \cdot \log(|u|) \right|_3^{15} = \left. \frac{1}{2} \cdot \log(|2 \cdot x - 5|) \right|_4^{10} =$
Figure 8.5: Integrating negative areas. (a) Sum of the areas’ values. (b) Sum
of the areas’ absolute values.
f (x) ≥ 0. When f (x) < 0 in some intervals, as is the case in part (b), we
can obtain f (x) by finding its mirror image with respect to the x axis. The
area between the curve and the x axis in the interval [c, d] is equivalent in
absolute size in both parts (a) and (b). The area between the curve and the
x axis in the interval [c, d] in part (b) is the mirror image of the area between
the curve and the x axis in the same interval in part (a).
As we demonstrate below, the function |f (x)| is integrable on the interval
[a, b] whenever f (x) is integrable on the same interval. That is, we can
show that $\int_a^b |f(x)|\,dx$ is the sum of the positive areas minus the sum of the
negative areas. Hence $\int_a^b |f(x)|\,dx = \int_a^c f(x)\,dx - \int_c^d f(x)\,dx + \int_d^b f(x)\,dx$.
Examples
2. Find the area bounded by the curve $y = x^2 - 4 \cdot x$ and the x axis such
that only positive values of x are permitted. See the shaded area in the
right panel of Figure 8.6. This function has roots at x = 0 and x = 4
and is negative over the range defined by the two roots. Therefore,
Area $= \int_0^4 (x^2 - 4 \cdot x)\,dx = \left. \left( \frac{x^3}{3} - 2 \cdot x^2 \right) \right|_0^4 = (64/3) - 32 - 0 = -32/3$, so the
absolute value is 32/3.
6
This discontinuity must be of either a jump or point variety in order to apply the
method described here. It cannot be an infinite discontinuity.
7. $\int_{-2}^{1} (x - 5)^4\,dx$
8. $\int_2^4 \frac{4 \cdot x^3}{x^4 + 1}\,dx$
14. $\int_{-1}^{2} 3 \cdot x^2 \cdot (x^3 - 4)^2\,dx$
15. $\int_4^{10} \frac{2 \cdot x - 3}{x^2 - 3 \cdot x}\,dx$
and g2 bound between a and c. If our interest is in the absolute value of the
sum of Areas B and C, integrating (f2 − g2) will not work. We
must integrate the function |f2 − g2|.
Definition. Let f (x) and g(x) be two functions, both of which are integrable
on the interval [a, c]. The total absolute area between the curves of these two
functions is given by $\int_a^c |f(x) - g(x)|\,dx$.
Example. The equations for the two expressions that define Areas B and C
in Figure 8.8 are $f2 = -x^2/10 + 2 \cdot x - 5$ and $g2 = x^2/10 - 2 \cdot x + 5$. For these
functions Area B = 94.281, Area B + Area C = -27.86, and absolute value of
Area B plus Area C = 216.42. All values are computed in the accompanying
workbook. Maxima’s integrate command does not evaluate $\int_a^c |f2 - g2|\,dx$.
If b has been determined, then the absolute area can be computed as follows:
$\int_a^c |f2 - g2|\,dx = \int_a^b (f2 - g2)\,dx - \int_b^c (f2 - g2)\,dx$.8
Whether you seek the value of the algebraic difference between two functions
are equal.
8
Also, Maxima’s romberg command, which we consider below, can evaluate the ex-
pression in terms of absolute values.
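The values reported in the example can be reproduced with the antiderivative of f2 − g2. The Python sketch below rests on two assumptions that the text does not state: that a and b are the two points where the curves cross, and that the right-hand bound is c = 25. These choices are inferred only because they reproduce the reported figures; the function names are likewise hypothetical.

```python
import math

def difference(x):
    """f2(x) - g2(x) for the two curves in the example."""
    f2 = -x**2 / 10 + 2 * x - 5
    g2 = x**2 / 10 - 2 * x + 5
    return f2 - g2

def antiderivative(x):
    """Antiderivative of f2 - g2 = -x^2/5 + 4x - 10, constant omitted."""
    return -x**3 / 15 + 2 * x**2 - 10 * x

# The curves cross where x^2 - 20x + 50 = 0
a = 10 - math.sqrt(50)  # ASSUMPTION: a is the left crossing point
b = 10 + math.sqrt(50)  # right crossing point, where f2 - g2 changes sign
c = 25.0                # ASSUMPTION: right-hand bound, not stated in the text

area_B = antiderivative(b) - antiderivative(a)  # region where f2 lies above g2
area_C = antiderivative(c) - antiderivative(b)  # negative: g2 lies above f2
net_area = area_B + area_C
absolute_area = area_B - area_C                 # subtracting flips the sign of the negative piece

print(round(area_B, 3), round(net_area, 2), round(absolute_area, 2))
```

Subtracting the second integral is what converts the signed sum into the total absolute area, because f2 − g2 is negative everywhere between b and c.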
$\left. \frac{2}{3} \cdot (2 \cdot x)^{3/2} \right|_0^{1/2} + \left. \left( \frac{1}{3} \cdot (2 \cdot x)^{3/2} - x^2 + 2 \cdot x \right) \right|_{1/2}^{2} = \frac{27}{12} = \frac{9}{4}$.
0 1/2
Alternatively, and much more easily, we can integrate with respect to y. The
two curves are drawn with y as the independent variable in the second panel
of Figure 8.10. The integral is this:
Area $= \int_{-1}^{2} \left( \frac{y + 2}{2} - \frac{y^2}{2} \right) dy = \left. \frac{1}{2} \cdot \left( \frac{y^2}{2} + 2 \cdot y - \frac{y^3}{3} \right) \right|_{-1}^{2} = \frac{27}{12} = \frac{9}{4}$.
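Both routes to the area can be checked by evaluating the antiderivatives at their limits. The Python sketch below uses only the antiderivatives that appear in the example, with hypothetical function names. Integrating with respect to x requires two pieces, split at x = 1/2; integrating with respect to y requires one.

```python
def area_with_respect_to_x():
    """Sum of the two pieces: x from 0 to 1/2, then x from 1/2 to 2."""
    first_piece = (2 / 3) * (2 * 0.5) ** 1.5 - (2 / 3) * (2 * 0.0) ** 1.5
    G = lambda x: (1 / 3) * (2 * x) ** 1.5 - x**2 + 2 * x
    second_piece = G(2.0) - G(0.5)
    return first_piece + second_piece

def area_with_respect_to_y():
    """Single piece: y from -1 to 2."""
    H = lambda y: 0.5 * (y**2 / 2 + 2 * y - y**3 / 3)
    return H(2.0) - H(-1.0)

print(area_with_respect_to_x(), area_with_respect_to_y())  # both approximately 2.25
```

The agreement of the two results (27/12 = 9/4 = 2.25) illustrates that the choice of the variable of integration affects the amount of work, not the answer.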
Example 2. Find the area bounded by the curves $y = x^2$ and $y = 2 \cdot x$. (See
Figure 8.11.) Solving the two equations simultaneously, we find that the
points of intersection are (0, 0) and (2, 4). Thus
Area $= \int_0^2 (2 \cdot x - x^2)\,dx = \left. (x^2 - x^3/3) \right|_0^2 = 4/3$.
Alternatively, we can integrate with respect to y, adjusting for the proper
limits, so that:
Area $= \int_0^4 \left( \sqrt{y} - \frac{y}{2} \right) dy = \left. \left( \frac{2}{3} \cdot y^{3/2} - \frac{y^2}{4} \right) \right|_0^4 = \frac{4}{3}$.
Exercise 8.4
Draw a sketch (either by hand or using Maxima) of the area bounded by the
following expressions. Compute the values by hand and check your solution with
Maxima.
1. $y = x^3$, $y = 0$, $x = 0$, $x = 2$
2. $y = 9 - x^2$, $y = x + 3$
3. $y = 3 - x^2$, $y = -2 \cdot x$
4. $y = 6 - x$, $y = x + 2$, $y = 0$
5. $y = 6 - x$, $y = x + 2$, $y = 8$
6. $y = 6 \cdot x - x^2$, $y = x^2 - 2 \cdot x$
7. $y^2 = x$, $y = x/2 - 3/2$
8. $y = x^3$, $y = 2 \cdot x + 4$, $x = 0$
9. $y = x$, $y = 10 - 4 \cdot x$, $y = 0$, $x = 0$
We consider two general types of improper integrals. The first occurs when
there are infinite limits of integration. The second occurs when there is an
infinite integrand.
Case 1: Improper integral due to an infinite limit of integration
When the limits of integration are no longer finite, for example, when we
wish to study the definite integral $\int_a^b f(x)\,dx$ as $b \to \infty$ and/or as $a \to -\infty$,
we have an improper integral. In such a case, the integral cannot be evaluated
by substituting the limits directly. This is because $F(\infty) - F(a)$ is meaningless, as are
$F(b) - F(-\infty)$ and $F(\infty) - F(-\infty)$.
Definition. An improper integral with an infinite limit of integration is
formally symbolized by
$\int_a^{\infty} f(x)\,dx = \lim_{b \to \infty} \int_a^b f(x)\,dx$ or $\lim_{b \to \infty} F(x) \Big|_a^b$.
Such an integral is said to be convergent when the limit exists and is finite,
whereas it is said to be divergent when the limit does not exist.
We can use the definition in any particular case by initially finding $\int_a^b f(x)\,dx$, that
is, by finding the indefinite integral F (x). Second, we evaluate F (x) for a
and b, then find the limit as b → ∞. If the limit is finite, then the integral
exists and is convergent. If the limit is infinite, then the integral is divergent
and has no finite value.
It is not uncommon to see an improper integral written without the limit
notation in front of the integral. That is, instead of $\lim_{b \to \infty} \int_a^b f(x)\,dx$ we often see
the shorthand expression $\int_a^{\infty} f(x)\,dx = F(x) \Big|_a^{\infty}$. This shorthand notation
nevertheless must be evaluated with the limit concept held firmly in mind.
This implicit step must be carried out, since the limit may be divergent, and
if it is, the integral has no finite value.
The existence of an improper integral with an infinite limit for its upper
bound does not change the fact that we are measuring the area under a
curve. Figure 8.12 illustrates the graph of a function y = f (x) where the
upper limit of integration b is infinite. That is, $\int_a^b f(x)\,dx = \int_a^{\infty} f(x)\,dx$.
If the improper integral is convergent, that is, if the limit exists, then the
shaded region under the curve is considered to be a finite area. However, if
the improper integral is divergent, then a limit does not exist and the shaded
area under the curve is infinite in size.
It is possible, of course, for the lower bound of integration to be infinite
as well. In this case, the lower bound a tends to −∞. We can define the
improper integral $\int_{-\infty}^{b} f(x)\,dx$ as $\lim_{a \to -\infty} \int_a^b f(x)\,dx$. We then use the usual
procedure to determine whether the improper integral is convergent or divergent.
A more complicated case is the situation in which both limits of integration
are infinite; that is, we wish to find $\int_{-\infty}^{\infty} f(x)\,dx$.
Definition. An improper integral with both limits of integration infinite exists
when, for any real number C,
$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{C} f(x)\,dx + \int_{C}^{\infty} f(x)\,dx = \lim_{a \to -\infty,\, b \to \infty} \int_a^b f(x)\,dx.$
Both integrals, $\int_{-\infty}^{C} f(x)\,dx$ and $\int_{C}^{\infty} f(x)\,dx$, must be convergent in order
for the improper integral $\int_{-\infty}^{\infty} f(x)\,dx$ to be convergent. If either of the two
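Examples
1. $\int_{-\infty}^{0} e^{3 \cdot x}\,dx = \lim_{a \to -\infty} \int_a^0 e^{3 \cdot x}\,dx = \lim_{a \to -\infty} \left. \frac{e^{3 \cdot x}}{3} \right|_a^0 = 1/3 - 0 = 1/3$
2. $\int_1^{\infty} \frac{1}{\sqrt{x}}\,dx = \lim_{b \to \infty} \int_1^b \frac{dx}{\sqrt{x}} = \lim_{b \to \infty} \left. 2 \cdot \sqrt{x} \right|_1^b = \lim_{b \to \infty} (2 \cdot \sqrt{b} - 2)$
This integral is divergent; its limit does not exist.
3. $\int_{-\infty}^{\infty} e^x\,dx = \lim_{a \to -\infty,\, b \to \infty} \int_a^b e^x\,dx = \lim_{a \to -\infty,\, b \to \infty} \left. e^x \right|_a^b = \lim_{a \to -\infty,\, b \to \infty} (e^b - e^a)$
The term $e^b$ grows without bound as b increases, so the last term does
not have a limit. Therefore, the integral is divergent.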
divergent.”
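Examples
1. $\int_0^3 \frac{dx}{x - 3} = \lim_{b \to 3} \int_0^b \frac{dx}{x - 3} = \lim_{b \to 3} \left. \log_e(|x - 3|) \right|_0^b = \lim_{b \to 3} \left( \log_e(|b - 3|) - \log_e(|-3|) \right)$
The limit does not exist, and the integral is divergent.
2. $\int_0^1 \frac{dx}{x} = \lim_{a \to 0} \int_a^1 \frac{dx}{x} = \lim_{a \to 0} \left. \log_e(|x|) \right|_a^1 = \lim_{a \to 0} \left( \log_e(|1|) - \log_e(|a|) \right)$
The limit does not exist, and the integral is divergent.
3. $\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0} \int_a^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0} \left. 2 \cdot \sqrt{x} \right|_a^1 = \lim_{a \to 0} (2 - 2 \cdot \sqrt{a}) = 2$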
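The contrast between examples 2 and 3 becomes concrete when the expressions inside the limits are evaluated for a sequence of values of a approaching 0. The Python sketch below is illustrative only; the function names are hypothetical.

```python
import math

def sqrt_case(a):
    """Value of the integral of dx/sqrt(x) from a to 1, i.e. 2 - 2*sqrt(a)."""
    return 2 - 2 * math.sqrt(a)

def log_case(a):
    """Value of the integral of dx/x from a to 1, i.e. log(1) - log(a)."""
    return math.log(1) - math.log(a)

for a in (1e-2, 1e-4, 1e-8, 1e-12):
    print(a, sqrt_case(a), log_case(a))
```

The first column of results settles toward 2, while the second keeps growing as a shrinks. Both integrands are unbounded near x = 0, so an infinite integrand alone does not decide whether the integral converges.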
Examples
1. Evaluate $\int_1^5 dx/(x - 2)^2$. The integrand is discontinuous at x = 2 so
we restate it as $\int_1^5 dx/(x - 2)^2 = \int_1^2 dx/(x - 2)^2 + \int_2^5 dx/(x - 2)^2$.
Evaluating these integrals reveals, however, that both are divergent. If
either is divergent, then so is $\int_1^5 dx/(x - 2)^2$.
2. Evaluate $\int_1^5 dx/\sqrt{x^2 - 9}$. This integral poses two sources of difficulty.
First, for x < 3 the integrand’s value is a complex number. Second, the integrand is
discontinuous at x = 3. Evaluating the
entire integral using Maxima yields
5
this result: log (3) − i atan 2 , which is a complex value. Integrating
from 3 to 5 yields log(3). Thus, the integral converges. Evaluating
this interval by hand involves some trigonometric substitutions. See
Mitchell [?].
power. One can also use consumer surplus to help make a decision about the
desirability of building a new highway, a new lock and dam, or of expanding
a wilderness area.9 Figure 8.14 provides a framework for examining the basic
aspects of consumer surplus.
Start with area A in Figure 8.14. The area of A (which extends off the
graph) is the numerical value of consumer surplus when consumers can buy
the quantity that they select at a price of $5 per unit. Mechanically, this area
is $\int_0^{x_1} (p - 5)\,dx$, where $x_1$ is the quantity demanded when $p = 5$ ($x_1 \approx 35.9$).
The value of p at each quantity is the price that a consumer is willing and
able to pay for that unit. Hence, the demand curve is also a willingness to
pay curve.
For this demand curve, which is $x = 120 \cdot p^{-3/4}$, area A is approximately $17,218.
We find this area by integrating the inverse demand curve, $p = (x/120)^{-4/3}$.
The integral for the inverse of this inelastic demand curve is divergent, so we
9
The concept of consumer surplus is more complex than the present illustration indi-
cates. For details and references, see Hammock and Mixon [7]. Also, while the concept
is fairly precise, applications in evaluating projects like those cited above must involve a
significant degree of imprecision.
central feature of markets that consist of price searchers: the quantity that
maximizes the sum of consumer and producer surplus is the equilibrium
quantity.
Let f (x) and g(x) be the inverse demand and supply curves. The area between
these is Surpluses $= \int_0^x (f(x) - g(x))\,dx = F(x) - G(x) + C$, where
C is the unknown constant of integration. It is apparent upon inspection
that the first-order condition for maximizing the “Surpluses” function is that
f (x) − g(x) = 0. The remarkable conclusion is, therefore, that the equilib-
rium quantity generates maximum combined surpluses.10
roughly (often very roughly) approximate the marginal value of their services.
To determine the total value of these services (however that value is defined)
requires integrating the marginal value function. An important proposition
of elementary calculus is that one cannot determine total value by looking
at the margin (derivative).
To illustrate this point, suppose that the marginal value functions for educators
and athletes are $MV_{edu} = 50000/\sqrt{edu}$ and $MV_{ath} = 500/\sqrt{ath}$, where edu is the
number of educators and ath is the number of elite athletes.11
For the values used here, the resulting marginal value of educators is approx-
imately $63,640 per year and that of elite athletes is approximately $273,860
per year. The total areas under the marginal value curves are approximately
$254,560,000 and $1,643,200, respectively. Thus the total value attributed to
11
This example is purely illustrative but not purely fanciful. In a recent year, the
number of K-12 teachers was about 2.4 million and the number of professional athletes
was about 12,000. The median salary of teachers was around $55,000, while that of
professional athletes was around $36,000. Accordingly, we presume that the very highly paid,
elite athletes number well below 12,000. For our purposes, we set the number at 3,000
and the number of educators at 2,000,000.
the services of educators is about 154.92 times that attributed to the athletes
(to emphasize: this is a multiple of 154.92, not 154.92 percent). All values
are computed in Maxima. The accompanying workbook shows the details.
This example, unlike the previous two, does not offer a breakdown into con-
sumer and producer surplus. This fact reflects the nature of the two labor
markets in which these activities occur. Educators are employed by an amalgam
of government and private agencies (mostly government). These agencies’
employment and wage decisions are likely politically motivated, rather
than aimed at maximizing a relatively simple objective function.
The market for athletes is likely even more complex. The value provided by
athletes will be divided among the athletes, the owners of franchises, and
spectators. Neither the competitive model of price takers nor the monopolis-
tic model of price searchers applies well. Superstar models and tournament
models both predict very high earnings for a few participants.12
where exp(. . . ) is the same as $e^{(\ldots)}$. Here µ is the mean of this population’s
values and σ is its standard deviation.
12
The model as stated implicitly assumes that all of these athletes are identical. Both
the superstar model and the tournament model indicate that, even with homogeneity, large
differences will accrue. If the athletes are not quite identical, both models predict great
rewards for relatively small differences in ability. Cyrenne [5] summarizes these models
and applies them to salaries of ice hockey players. The model in the current example
also ignores earnings differences due to different individual bargaining abilities or due to
endorsement earnings.
Garicano and Rossi-Hansberg [6] is a much more ambitious application and extension
of the superstar model.
As an example, suppose that $K(0) = 100$, and $I(t) - D(t) = a \cdot e^{g \cdot t}$, where
t = 0 is the initial period. The first panel shows the investment level for
each time period, and the second panel shows the capital stock for each time
period. Let t1 = 20, where t1 replaces b in the general expression for the
capital stock.
The equation for the capital stock can be written as the command
K(K0,a,g,t1) := ''(K0 + integrate(NetInv(t,a,g),t,0,t1)), which yields this
13
Creating a function as we have done is not necessary. Maxima’s distrib module
provides these values and more. Furthermore, it does so for 25 continuous and discrete
distributions, not just normal distributions.
output: $K(K_0, a, g, t_1) := a \cdot \left( \frac{\%e^{g \cdot t_1}}{g} - \frac{1}{g} \right) + K_0$.
Recall that %e is Maxima’s notation for the constant e (= 2.718 . . .). We use
K0 = 100, a = 1, and g = 0.03. For t1 = 20, the value of the capital stock
is approximately 127.404, so the stock has grown by 27.404 units during this
twenty-year period.
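The reported capital stock can be reproduced in two ways: from the closed-form expression, and by summing the net investment flow directly. The Python sketch below is illustrative only; the function names and the midpoint-rule summation are assumptions of this sketch, not the Maxima workbook's method.

```python
import math

def capital_stock(K0, a, g, t1):
    """Closed form: K0 plus the integral of a*e^(g*t) from 0 to t1."""
    return K0 + a * (math.exp(g * t1) - 1) / g

def capital_stock_numeric(K0, a, g, t1, steps=100_000):
    """Same quantity, accumulating net investment with the midpoint rule."""
    dt = t1 / steps
    flow = sum(a * math.exp(g * (i + 0.5) * dt) * dt for i in range(steps))
    return K0 + flow

K = capital_stock(100, 1, 0.03, 20)
print(round(K, 3), round(K - 100, 3))  # capital stock and its change over 20 years
```

The numerical version mirrors the flow panel: it adds up small slices of net investment, and the total matches the vertical distance read from the stock panel.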
Observe how the value of ∆K appears in the two panels. In the flow panel
on the left, it is an area: changes per year summed over the 20 years. In the
stock panel on the right, it appears as a vertical distance.
aggregate economy. This section sketches the Solow growth model, which
illustrates how production, consumption, saving, and investment interact to
determine an economy’s capital stock and per-capita income. The develop-
ment used here follows Mankiw [11].
The model begins with production. The model assumes that production is
a function of two inputs, capital (K) and labor (L). Also, the production
function exhibits constant returns to scale, or in terms that Chapter 6
develops, it is homogeneous of degree 1. Formally, Y = f (K, L) where Y is
total output. Because f (K, L) is homogeneous of degree 1, multiplying all
inputs by the same value multiplies output by the same value. We multiply
both K and L by 1/L, which causes Y to be multiplied by 1/L. Hence,
Y /L = f (K/L, L/L) or y = f (k, 1) where lower-case letters denote
per-labor-unit values. (From now on, we refer to these as “per capita,” which
is exactly appropriate if L is proportional to population.) The constant in
f (. . .) is of no consequence, so we rewrite the per-capita production function
as y = f (k).
For illustration, we use the simple Cobb-Douglas production function $Y = A \cdot \sqrt{K} \cdot \sqrt{L}$,
which converts to $y = A \cdot \sqrt{k}$, where A is a technology index.
Solow [18] does not use a specific functional specification.
The production function generates a marginal product of capital function:
$mpk = dy/dk$. In the illustrative example $mpk = A/(2 \cdot \sqrt{k})$. We use this
function below to determine income shares.
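The two steps in the illustrative example, the conversion to per-capita terms and the marginal product of capital, can be written out as follows. This is a sketch of the intermediate algebra, which the text leaves implicit; it assumes only the Cobb-Douglas form given above and $k = K/L$.

```latex
% Per-capita production: divide Y = A * sqrt(K) * sqrt(L) by L
\[
  y = \frac{Y}{L}
    = \frac{A \cdot \sqrt{K} \cdot \sqrt{L}}{L}
    = A \cdot \frac{\sqrt{K}}{\sqrt{L}}
    = A \cdot \sqrt{\frac{K}{L}}
    = A \cdot \sqrt{k}
\]
% Marginal product of capital: differentiate y = A * k^(1/2) with the power rule
\[
  mpk = \frac{dy}{dk}
      = A \cdot \frac{1}{2} \cdot k^{-1/2}
      = \frac{A}{2 \cdot \sqrt{k}}
\]
```

Because $mpk$ falls as $k$ rises, the per-capita production function exhibits diminishing marginal returns to capital.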
We limit our attention to a relatively simple model. This is a model of a
closed economy without government. Thus total output is either consumed
or saved: Y = C + S.15 We restate these values in per-unit-of-labor terms,
y = c + s, in order to relate them directly to the production relationships
stated above.
Another assumption is that not only is the economy closed in terms of trade
(no net exports), but it is also closed in terms of capital flows. This as-
sumption implies that, in equilibrium, y − c = s = i where y is per-capita
output and c is per-capita consumption, so s = y − c is per-capita saving.
Saving is the only source of funds for investment, so per-capita saving equals
per-capita investment: s = i.
15
This is not as severe an assumption as it might appear. If part of the output is diverted
to government, then government must spend on either consumption or investment.
year.16
Now, spread the payments evenly over each of the three years. The resulting
present value calculation is executed with these commands (VC for “value,
continuous”): [VC1: float(integrate(A1*exp(-0.1*t), t, 0, 1)),
VC2: float(integrate(A2*exp(-0.1*t), t, 1, 2)),
VC3: float(integrate(A3*exp(-0.1*t), t, 2, 3))] and
VC: VC1 + VC2 + VC3. The resulting value is 5687.5, approximately. As
predicted, the fact that the payments are made throughout each year rather
than at the ends increases their present value somewhat.
We can generalize this expression to allow for τ (the Greek letter tau) periods.
Now \(V = \int_0^{\tau} R(t) \cdot e^{-d \cdot t}\,dt\). Here R(t) is a function of t, so the integral of this
expression will depend on R(t)’s form. The discount rate is d. Consider a
simple case in which R is the same each period. Now
\[
V = \int_0^{\tau} R \cdot e^{-d \cdot t}\,dt = \frac{R}{d} \cdot \left(1 - e^{-d \cdot \tau}\right).
\]
Suppose that τ = 2, R = 3000, and d = 0.06. Then the present value of this
stream is \((3000/0.06) \cdot (1 - e^{-0.12}) \approx 50000 \cdot (1 - 0.8869) \approx 5654\) dollars.
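The constant-flow case can be reproduced in Maxima. This is a minimal sketch that uses the symbols R, d, and tau from the text; the names V and V_num are introduced here for illustration only.

```maxima
/* Present value of a constant flow R received continuously for tau periods,
   discounted at rate d */
assume(d > 0, tau > 0)$
V: integrate(R * exp(-d*t), t, 0, tau);

/* Numerical case from the text: tau = 2, R = 3000, d = 0.06 */
V_num: float(subst([tau = 2, R = 3000, d = 0.06], V));
```

The numerical value is about 5654, in line with the hand calculation (3000/0.06)·(1 − 0.8869).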
Figure 8.20 generalizes this calculation by letting τ range from 0 to 50 periods.
Also, it shows the effect of increasing the discount rate from 0.06 to 0.08.
The flattening of the two present value functions as the payment period is
lengthened reflects the effect of compounding: the discounting process has
an increasingly large impact as the successive payments move farther into
the future. Likewise, the size of the discount factor becomes increasingly
important as the length of the payment period increases.
Consider an application, due to Chiang [4]. A wine dealer holds a quantity
of wine. This wine can be sold immediately for $A, but holding it and selling
it later will result in a higher price. Suppose that the wine’s value increases
according to this function: \(P = A \cdot e^{\sqrt{t}}\).17
The present value of the wine, sold in period t, is
\(V(t) = P \cdot e^{-d \cdot t} = A \cdot e^{\sqrt{t}} \cdot e^{-d \cdot t} = A \cdot e^{\sqrt{t} - d \cdot t}\).
For now, we assume that storage cost is zero, so the
profit-maximizing dealer must simply choose the value of t that yields maximum
present value. We can convert the expression to a linear-in-logarithms
expression, \(\log(V(t)) = \log(A) + t^{1/2} - d \cdot t\).
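The first-order condition for the logarithmic form can be examined in Maxima. The sketch below is an illustration under stated assumptions: the candidate t_star = 1/(4·d²) comes from solving 1/(2·√t) = d by hand, and the names logV, foc, t_star, check, and soc are introduced here, not taken from the text.

```maxima
assume(d > 0, t > 0)$

/* Log of present value: log(A) + sqrt(t) - d*t */
logV: log(A) + sqrt(t) - d*t$

/* First-order condition: derivative with respect to t */
foc: diff(logV, t);

/* Candidate solution obtained by hand from 1/(2*sqrt(t)) = d */
t_star: 1/(4*d^2)$

/* Substituting the candidate into the condition should give 0 */
check: ratsimp(subst(t = t_star, foc));

/* Second derivative; it is negative for t > 0, so the candidate is a maximum */
soc: diff(logV, t, 2);
```

With a hypothetical discount rate of d = 0.1, the candidate selling date is t = 1/(4·0.01) = 25 periods.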
17
The precise functional relationship between price and age is not important. This one
is used for convenience. The reasoning in this example extends to examples like fisheries
and forests, in which the growth is a physical growth function rather than a price function.
See McAfee [10], Chapter 4.
determine that
\[
\frac{dN(t)}{dt} = \frac{dA(t)}{dt} \cdot e^{-d \cdot t} - d \cdot A(t) + \frac{s}{d} \cdot e^{-d \cdot t}.
\]
5. Currently 100,000 cars per hour use a stretch of highway at rush hours.
Over the next few years, this value will grow, following this growth
function: \(g(t) = 10000/\sqrt{0.4t}\). To what value will the number have grown in 3
years?
6. Consider this marginal revenue function, which applies over the relevant
range for product y: \(MR_y = 10/(1 + y)^2\).
Matrix Algebra
Each term aij relates the number of units of requirement i that are provided
by one unit of food j.
We have a system that consists of 10 equations: the objective function that
we are trying to minimize plus the nine constraint equations. The system
CHAPTER 9. MATRIX ALGEBRA 260
contains 80 unknown values, the quantities of the 80 different foods that can
be consumed. This system is an example of applied linear algebra.
Definition: Linear algebra is the study of systems of linear equations and the
attempt to find a simultaneous solution for the unknowns of those equations,
if such a solution exists.
It is important to note that linear algebra deals with linear equations. Lin-
ear equations are generally easier to deal with than are nonlinear equations.
Nonlinear equations and nonlinear models often cannot be solved without
the help of a computer.1 It is also true, however, that we can usefully ap-
proximate many business and economics relationships with linear functional
forms. Hence we are not severely disadvantaged by the fact that matrix alge-
bra is restricted to the study, manipulation, and solution of linear equations.
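\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \quad
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad \text{and} \quad
C = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix}
\]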
The matrix labeled A above represents the coefficients of the variables in the
system of equations. The A matrix has m rows and n columns. This can be
contrasted with the variable matrix, labeled X, which consists of n rows and
only one column. In general, the number of rows in a matrix is not related to
the number of columns. Together, however, the number of rows and the number
of columns define the dimension (or order) of a matrix. For example, matrix A
has m rows and n columns and is therefore said to be an m × n matrix (which is
read, “m by n matrix”). The dimension of a matrix is always read rows first,
columns second. A 5 × 7 matrix has five rows and seven columns, not vice versa.
In the important special case in which m = n, for example, a 5 × 5 matrix, one
is dealing with a square matrix.
We occasionally encounter the notation A = [aij ], which represents a matrix
composed of the elements that take the form aij . The number of rows and
columns is unspecified. Note well that [aij ], which represents a matrix, is not
equivalent to aij , which represents a specific element in a matrix. That is,
[aij] ≠ aij unless the dimensions of the matrix are 1 × 1.
A small comment on notation is in order. Either aij or ai,j may be used to
indicate the value in the matrix element in the row i and column j. If we
instruct Maxima to create a matrix of a’s, using the command genmatrix(a,
3, 3), the resulting matrix is
\[
\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}.
\]
\[
X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_{80} \end{bmatrix}
\]
is an 80 × 1 matrix,
C = [C] is a 1 × 1 matrix,
\[
A = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,80} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,80} \\ \vdots & \vdots & & \vdots \\ a_{9,1} & a_{9,2} & \cdots & a_{9,80} \end{bmatrix}
\]
is a 9 × 80 matrix, and
\[
R = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_9 \end{bmatrix}
\]
is a 9 × 1 matrix.
Matrices X and R have the dimensions 80 × 1 and 9 × 1 respectively. Both
matrices have only one column and are referred to as column vectors. Matrix
P has the dimensions 1 × 80 and is referred to as a row vector.
We can use the concept of a vector to view a matrix as a series of related
row and/or column vectors. Consider the matrix
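\[
A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\]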
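\[
A = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix}
= \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}
\]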
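\[
\begin{aligned}
8 \cdot x_1 + 10 \cdot x_2 + 12 \cdot x_3 &= 1 \\
3 \cdot x_1 \qquad\qquad\; + 2 \cdot x_3 &= 0 \\
x_1 - 2 \cdot x_2 - 5 \cdot x_3 &= -5
\end{aligned}
\]
The element in the second row and second column (\(a_{22}\)) of the coefficient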
matrix is 0 and must be included. Further, should we interchange the el-
ements in the first and second columns of the first row, that is, should we
3
We could also view the matrix as an ordered set of n column vectors.
interchange a11 and a12 , then the matrix would become the one below, A∗.
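\[
A^{*} = \begin{bmatrix} 10 & 8 & 12 \\ 3 & 0 & 2 \\ 1 & -2 & -5 \end{bmatrix}
\]
\[
\begin{aligned}
10 \cdot x_1 + 8 \cdot x_2 + 12 \cdot x_3 &= 1 \\
3 \cdot x_1 \qquad\qquad\; + 2 \cdot x_3 &= 0 \\
x_1 - 2 \cdot x_2 - 5 \cdot x_3 &= -5
\end{aligned}
\]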
Matrices A and A∗ are not the same; they represent different sets of coeffi-
cients.
We must finally observe that a matrix has no numeric value per se. One
cannot state that a matrix has a value of 5, 7, 14, or any other number. A
matrix is simply a shorthand, efficient method of writing an array of elements.
of matrix B. The result is the element that appears in the first row, first
column of the summed matrix C.
Likewise, we pair and add the elements in the first row, second column of
each matrix, and so forth. Formally, we can write this process as follows:
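\[
\begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}
+
\begin{bmatrix} b_{1,1} & b_{1,2} & \cdots & b_{1,n} \\ b_{2,1} & b_{2,2} & \cdots & b_{2,n} \\ \vdots & \vdots & & \vdots \\ b_{m,1} & b_{m,2} & \cdots & b_{m,n} \end{bmatrix}
=
\begin{bmatrix} b_{1,1} + a_{1,1} & b_{1,2} + a_{1,2} & \cdots & b_{1,n} + a_{1,n} \\ b_{2,1} + a_{2,1} & b_{2,2} + a_{2,2} & \cdots & b_{2,n} + a_{2,n} \\ \vdots & \vdots & & \vdots \\ b_{m,1} + a_{m,1} & b_{m,2} + a_{m,2} & \cdots & b_{m,n} + a_{m,n} \end{bmatrix}
\]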
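Element-by-element addition is easy to confirm in Maxima. The two 2 × 2 matrices below are hypothetical values chosen only for illustration.

```maxima
/* Two hypothetical matrices of the same dimension */
A: matrix([1, 2], [3, 4])$
B: matrix([5, 6], [7, 8])$

/* Matrix addition pairs and adds corresponding elements */
C: A + B;

/* The first-row, first-column element of C is a[1,1] + b[1,1] */
is(C[1,1] = A[1,1] + B[1,1]);
```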
Name the third matrix C. Then we can note that
\(c_{32} = \sum_{k=1}^{2} a_{3k} \cdot b_{k2} = 8 \cdot 4 + 3 \cdot 9 = 59\).
Choose 2 or 3 other values in C and confirm that they
are generated in like fashion.
The preceding examples indicate the dimensions of the matrix that results
from matrix multiplication. Our approach to matrix multiplication has so far
been mechanical. We shall now develop an intuitive understanding of matrix
multiplication as well. In particular, we return to linear equations systems,
with which this chapter began. We do so in order to address the logic of
matrix algebra, or linear algebra.
Recall the general system
\[
\begin{aligned}
a_{11} \cdot x_1 + a_{12} \cdot x_2 + \cdots + a_{1n} \cdot x_n &= c_1 \\
a_{21} \cdot x_1 + a_{22} \cdot x_2 + \cdots + a_{2n} \cdot x_n &= c_2 \\
&\;\;\vdots \\
a_{m1} \cdot x_1 + a_{m2} \cdot x_2 + \cdots + a_{mn} \cdot x_n &= c_m
\end{aligned}
\]
We previously learned how to abbreviate this system of linear equations as
A · X = C. Our definition of matrix equality enables us to state that matrix C
equals the product A · X if and only if element \(c_i\) is given by \(c_i = \sum_{k=1}^{n} a_{ik} \cdot x_k\)
for i = 1, 2, . . . , m.
The subscripts of the terms in this system lead intuitively to the definition
of matrix multiplication. Specifically, we observe that the subscript k is used
in both the aik and the xk terms. This ensures that the number of columns
in matrix A is the same as the number of rows in matrix X. In more detailed
form, the matrix multiplication A · X = C involves the following:
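\[
\begin{bmatrix} a_{11} \cdot x_1 + a_{12} \cdot x_2 + \cdots + a_{1n} \cdot x_n \\ a_{21} \cdot x_1 + a_{22} \cdot x_2 + \cdots + a_{2n} \cdot x_n \\ \vdots \\ a_{m1} \cdot x_1 + a_{m2} \cdot x_2 + \cdots + a_{mn} \cdot x_n \end{bmatrix}
=
\begin{bmatrix} \sum_{k=1}^{n} a_{1k} \cdot x_k \\ \sum_{k=1}^{n} a_{2k} \cdot x_k \\ \vdots \\ \sum_{k=1}^{n} a_{mk} \cdot x_k \end{bmatrix}
=
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{bmatrix}
\]
We now need to go from the somewhat familiar case above to the general
case in which A is once again an m × n matrix and X is an n × p matrix.
Any element \(c_{ij}\) of the new matrix A · X = C is given by \(\sum_{k=1}^{n} a_{ik} \cdot x_{kj}\), where
i = 1, 2, . . . , m and j = 1, 2, . . . , p.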
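In detail, the product A · X = C is given by the equalities below:
\[
[a_{ik}] \cdot [x_{kj}] =
\begin{bmatrix} \sum_{k=1}^{n} a_{1k} \cdot x_{k1} & \cdots & \sum_{k=1}^{n} a_{1k} \cdot x_{kp} \\ \sum_{k=1}^{n} a_{2k} \cdot x_{k1} & \cdots & \sum_{k=1}^{n} a_{2k} \cdot x_{kp} \\ \vdots & & \vdots \\ \sum_{k=1}^{n} a_{mk} \cdot x_{k1} & \cdots & \sum_{k=1}^{n} a_{mk} \cdot x_{kp} \end{bmatrix}
= [c_{ij}],
\]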
for i = 1, 2, . . . , m and j = 1, 2, . . . , p.
In this case, the matrix of coefficients is applied to a matrix of variables.
Each column of the latter is the same length, but the columns do not
(necessarily) contain the same variables. Suppose, for example, the coefficients
relate to demographic data like age, years of schooling, and income. The age
and schooling variables would likely be well-defined, but the analyst might
have little reason to choose between two competing measures of income. The
first column in the X matrix might have pre-tax income, and the second column
might have post-tax income. Multiplying the two-column X matrix by the
(fixed-value) coefficient matrix A would result in a two-column C matrix. The
values in these columns would differ due to the different income measures.
Before addressing the nature of this issue more formally, we consider a sim-
ple hypothetical example. Consider three individuals who are the same age,
have the same level of schooling, and have the same pre-tax and post-tax in-
come, but behave somewhat differently. Specifically, suppose that their level
of consumption of some product is defined by the following three equations:
z1: 10 + 5*age - 2*years + 3*income,
z2: 12 + 4*age - 1.5*years+2.5*income, and
z3: 11 + 4.5*age - 1.8*years + 2.75*income. Each of these individuals is 30 years
of age and has 12 years of schooling. Each also has a post-tax income of 35
and a pre-tax income of 40 (presumably in $1000’s per year). The three matrices
below contain this data and the resulting values of z for these people.5
\[
\begin{bmatrix} 10 & 5 & -2 & 3 \\ 12 & 4 & -1.5 & 2.5 \\ 11 & 4.5 & -1.8 & 2.75 \end{bmatrix}
\cdot
\begin{bmatrix} 1 & 1 \\ 30 & 30 \\ 12 & 12 \\ 35 & 40 \end{bmatrix}
=
\begin{bmatrix} 241 & 256 \\ 201.5 & 214.0 \\ 220.65 & 234.4 \end{bmatrix}.
\]
Note the two 1’s in the X matrix. This is the value by which the constant
term is multiplied. The A matrix is 3 × 4, and the X matrix is 4 × 2, so the
C matrix is 3 × 2. We have two predicted levels of z for each person. The
first column uses post-tax income to predict z, and the second column uses
pre-tax income.
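The same product can be formed in Maxima with the dot operator. In this sketch the coefficient rows are taken from the three equations z1, z2, and z3 as stated in the text, and the data columns use age 30, 12 years of schooling, and incomes of 35 (post-tax) and 40 (pre-tax).

```maxima
/* Coefficient matrix: one row per person, columns are
   constant, age, years of schooling, income (from z1, z2, z3) */
A: matrix([10, 5,   -2,   3   ],
          [12, 4,   -1.5, 2.5 ],
          [11, 4.5, -1.8, 2.75])$

/* Data matrix: column 1 uses post-tax income, column 2 uses pre-tax income.
   The leading 1's multiply the constant terms. */
X: matrix([1, 1], [30, 30], [12, 12], [35, 40])$

/* A is 3 x 4 and X is 4 x 2, so the product C is 3 x 2 */
C: A . X;
```

Note that the non-commutative dot (A . X) is used rather than the asterisk, which Maxima treats as element-by-element multiplication.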
We now develop a more general treatment, one that provides a relatively
simple way to remember what is supposed to be multiplied and what is
5
The “people” are likely to be representatives of some groups, perhaps identified by region
or ethnicity. Coefficients would likely be estimates that have come from econometric
studies.
Exercise 9.1.
6
You might have noted an exception to this rule. When scalar k and matrix A are
multiplied, then k · A = A · k. The case of an identity matrix (to be defined shortly) is
another exception. Actually, k = k · 1, and 1 is a degenerate identity matrix, so this is
actually just one exception. More later.
1. Find the coefficient matrix for each of the following systems of linear
equations.
a. 3 · x1 + 2 · x2 + 4 · x3 = 17, x1 + 2 · x2 + x3 = 4, and
5 · x1 + x2 + 3 · x3 = −2
b. x1 + x2 = 4 and 3 · x1 + 2 · x2 = 0
c. x + 2 · y + 4 · z − 2 = −6, −4 · x + 2 · w = 7,
3 · y + z − 4 · w = 0, and −x − y + z = 6
d. x + y − z = 10, −5 · y + 3 · z = 4, and −3 · x + 2 · y = −3
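2. State the dimensions of the following matrices.
(a) \(A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 4 & 0 \\ 0 & 0 & 2 \end{bmatrix}\)
(b) \(B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}\)
(c) \(C = \begin{bmatrix} 1 \\ 2 \end{bmatrix}\)
(d) \(D = \begin{bmatrix} 0 & 19 & 9 \\ 0 & 26 & 12 \\ 0 & 33 & 15 \\ 2 & 7 & 4 \end{bmatrix}\)
(e) \(E = \begin{bmatrix} 9 & 12 & 15 \\ -9 & 5 & 1 \end{bmatrix}\)
(f) \(F = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}\)
(g) \(G = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & -6 \\ -9 & 5 & 1 \\ 8 & 0 & 1 \\ 2 & -11 & 10 \end{bmatrix}\)
(h) \(H = \begin{bmatrix} 2 & 1 \\ 0 & 1 \\ 0 & 0 \\ 15 & 5 \end{bmatrix}\)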
For the next three sets of matrices, perform the indicated matrix opera-
tions whenever the matrices meet the required dimensional constraints.
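3. Given that \(A = \begin{bmatrix} -2 \\ 0 \\ 4 \end{bmatrix}\), \(B = \begin{bmatrix} 1 & 1 & 2 \end{bmatrix}\), and \(C = \begin{bmatrix} 2 & 6 & 0 \end{bmatrix}\),
find (a) A · B, (b) B · A, and (c) A · (B + C) = A · B + A · C.
4. Given that \(A = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}\), \(B = \begin{bmatrix} 3 & 5 \\ 2 & 0 \end{bmatrix}\), and \(C = \begin{bmatrix} 5 & 4 \\ 0 & -4 \end{bmatrix}\),
find (a) A + B, (b) A · (B + C), (c) A · B · C, and (d) C · B · A.
5. Given that \(A = \begin{bmatrix} 5 & 6 & 1 & 3 \\ 1 & 2 & 0 & -1 \end{bmatrix}\) and \(B = \begin{bmatrix} 2 & 1 & 4 \\ 1 & 3 & 1 \\ 1 & 0 & -1 \\ 0 & 2 & -1 \end{bmatrix}\), find A · B.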
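\[
D = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix}.
\]
\[
O = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad
O = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix} \qquad
O = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\]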
As the second and third examples above indicate, the null matrix is not
restricted to being a square matrix, as are the identity, diagonal, and scalar
matrices. A square null matrix is idempotent.
The null matrix, like the number 0 in the real number system, has several
unique qualities. For instance, the commutative law for addition of matrices
holds when the null matrix and another matrix A are added if both the null
matrix and the A matrix satisfy the usual dimensional requirements. That
is,
On the other hand, if both matrix A and the null matrix are square, then
the two products will commute. Thus,
Case 2. Given two real numbers a and b, we know from number algebra
that if a · b = 0, then a = 0 and/or b = 0. However, in matrix algebra, the
product A · B = O does not imply that A = O and/or that B = O. The
following two examples illustrate this point.
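\[
A = \begin{bmatrix} 0 & 0 \\ -2 & 3 \end{bmatrix} \quad
B = \begin{bmatrix} 0 & 3 \\ 0 & 2 \end{bmatrix} \quad
A \cdot B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = O
\]
\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix} \quad
B = \begin{bmatrix} -3 & 6 \\ 1 & -2 \end{bmatrix} \quad
A \cdot B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = O
\]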
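Both products can be confirmed in Maxima. The names A1, B1, A2, and B2 are introduced here only to keep the two examples apart.

```maxima
/* First example: neither factor is the null matrix */
A1: matrix([0, 0], [-2, 3])$
B1: matrix([0, 3], [0, 2])$

/* Second example */
A2: matrix([1, 3], [2, 6])$
B2: matrix([-3, 6], [1, -2])$

/* Each product is the 2 x 2 null matrix */
[A1 . B1, A2 . B2];

/* Compare with Maxima's built-in null matrix */
[is(A1 . B1 = zeromatrix(2, 2)), is(A2 . B2 = zeromatrix(2, 2))];
```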
Case 3. Given three real numbers a, b, and c, we know from number algebra
that when a · b = a · c (with a ≠ 0), then b = c. Once again, however, this
relationship does not hold in matrix algebra. For example, given matrices
A, B, and C such that A · B = A · C, it does not follow that B and C are
identical matrices such that B = C, as this example shows.
\[
A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix} \quad
B = \begin{bmatrix} 2 & 2 \\ 2 & 4 \end{bmatrix} \quad
C = \begin{bmatrix} -4 & 2 \\ 4 & 4 \end{bmatrix} \quad
A \cdot C = \begin{bmatrix} 8 & 14 \\ 16 & 28 \end{bmatrix} = A \cdot B,
\]
but B ≠ C.
We can extend this result to include the product of any finite number of matrices:
\((A_1 \cdot A_2 \cdot \cdots \cdot A_n)' = A_n' \cdot \cdots \cdot A_2' \cdot A_1'\).
The next three examples illustrate this property. The first example was
created as the text was being printed. That is, the computations were done
by hand. In the second and third, we used Maxima to create the matrices and
to carry out the computations. In each of these two examples the Maxima
output consists of three lists. The first list is the original matrices (A and B
or A, B, and C). The second list contains the product of the original matrices,
the transpose of that product, and the transposes of the original matrices.
The third list contains a single item, \(B' \cdot A'\) or \(C' \cdot B' \cdot A'\). The accompanying
workbook shows the commands.
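1. \(A = \begin{bmatrix} 3 & 2 \\ 0 & 5 \end{bmatrix}\), \(B = \begin{bmatrix} 2 & 0 \\ 1 & 3 \end{bmatrix}\)
Then
\[
A \cdot B = \begin{bmatrix} 8 & 6 \\ 5 & 15 \end{bmatrix} \quad
(A \cdot B)' = \begin{bmatrix} 8 & 5 \\ 6 & 15 \end{bmatrix} \quad
A' = \begin{bmatrix} 3 & 0 \\ 2 & 5 \end{bmatrix} \quad
B' = \begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}
\]
and \(B' \cdot A' = \begin{bmatrix} 8 & 5 \\ 6 & 15 \end{bmatrix}\). Thus \((A \cdot B)' = \begin{bmatrix} 8 & 5 \\ 6 & 15 \end{bmatrix} = B' \cdot A'\).
2. A and B
\[
\left[ \begin{bmatrix} 1 & 3 & -1 \\ 2 & 0 & 0 \\ 0 & -1 & 6 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ -1 & 2 \\ 1 & 3 \end{bmatrix} \right]
\]
\(A \cdot B\), \((A \cdot B)'\), \(A'\), and \(B'\)
\[
\left[ \begin{bmatrix} -3 & 3 \\ 2 & 0 \\ 7 & 16 \end{bmatrix}, \begin{bmatrix} -3 & 2 & 7 \\ 3 & 0 & 16 \end{bmatrix}, \begin{bmatrix} 1 & 2 & 0 \\ 3 & 0 & -1 \\ -1 & 0 & 6 \end{bmatrix}, \begin{bmatrix} 1 & -1 & 1 \\ 0 & 2 & 3 \end{bmatrix} \right]
\]
\(B' \cdot A'\)
\[
\begin{bmatrix} -3 & 2 & 7 \\ 3 & 0 & 16 \end{bmatrix}
\]
3. A, B, and C
\[
\left[ \begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}, \begin{bmatrix} 0 & 3 & -1 \\ 1 & 0 & 2 \end{bmatrix}, \begin{bmatrix} 3 & 2 & 1 \\ 2 & -1 & 0 \\ 1 & 0 & -1 \end{bmatrix} \right]
\]
\(A \cdot B \cdot C\), \((A \cdot B \cdot C)'\), \(A'\), \(B'\), and \(C'\)
\[
\left[ \begin{bmatrix} 15 & -4 & 1 \\ 35 & -1 & -1 \end{bmatrix}, \begin{bmatrix} 15 & 35 \\ -4 & -1 \\ 1 & -1 \end{bmatrix}, \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 3 & 0 \\ -1 & 2 \end{bmatrix}, \begin{bmatrix} 3 & 2 & 1 \\ 2 & -1 & 0 \\ 1 & 0 & -1 \end{bmatrix} \right]
\]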
\(C' \cdot B' \cdot A'\)
\[
\begin{bmatrix} 15 & 35 \\ -4 & -1 \\ 1 & -1 \end{bmatrix}
\]
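The reverse-order rule for transposes can be checked with a few Maxima commands. This sketch uses the matrices A and B of the second example; the names lhs and rhs are introduced here for illustration, and the accompanying workbook may use different commands.

```maxima
/* Matrices A (3 x 3) and B (3 x 2) from the second example */
A: matrix([1, 3, -1], [2, 0, 0], [0, -1, 6])$
B: matrix([1, 0], [-1, 2], [1, 3])$

/* Transpose of the product */
lhs: transpose(A . B);

/* Product of the transposes in reverse order */
rhs: transpose(B) . transpose(A);

/* The two results are the same 2 x 3 matrix */
is(lhs = rhs);
```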
Exercise 9.2.
Find the transpose of each of the following seven matrices.
−12 5 4
1 3 4 4 −2
1. 2. 0 8 3. 4. 0
5 −1 −1 −2 1
−5 4 −3
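5. \(\begin{bmatrix} 1 & 2 & 5 \end{bmatrix}\)
6. \(\begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix}\)
7. \(\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 4 & 4 & 4 \end{bmatrix}\)
8. Given that \(A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}\), \(B = \begin{bmatrix} 1 & 3 \\ 0 & 5 \end{bmatrix}\), and \(C = \begin{bmatrix} 1 & 0 \\ -1 & 3 \end{bmatrix}\), show that
(a) \((A + B)' = A' + B'\), (b) \((A \cdot B)' = B' \cdot A'\),
(c) \((A \cdot B \cdot C)' = C' \cdot B' \cdot A'\), and (d) \((A')' = A\).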
9.4 Determinants
Previous sections have demonstrated how to write a linear equation system
in shorthand by means of matrix algebra. For example, we developed the
shorthand A · X = C to represent a typical system of linear equations. It’s
nice to be able to write a large system of equations in a concise, shorthand
notation. However, the premium is on being able to solve that system of
equations for the values of the unknown variables represented by the vector
X.
We can find solutions in a large number of situations. For example, when
we have two linear equations in two unknowns, we can find the solution values
of the unknown variables by solving one equation for one unknown in terms of
the other, substituting, and solving. It is apparent, nonetheless, that the process
of substitution becomes exceedingly complex when many equations and un-
knowns are involved. Therefore we must further develop our matrix-algebra
tools so that we can find the solution values for a large set of simultaneous
linear equations.
The first step we must take is to master the concept of the determinant of a
matrix. Once we have found the value of the determinant, we ordinarily know
whether or not we can solve the system of equations in question, and we often
can find the precise solution values. The determinant of a square matrix is
a uniquely defined scalar (number) that is characteristic of that particular
matrix.
Determinants are denoted by vertical straight lines. Thus,
\[
|A| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}
\]
is a scalar (number). This scalar is said to be a determinant of the nth
order.
The determinant is calculated by summing products of the matrix’s terms in
a specific fashion. Consider a 2 × 2 matrix
\[
\begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}.
\]
Its determinant is \(a_{1,1} \cdot a_{2,2} - a_{1,2} \cdot a_{2,1}\), the product of the terms in the diagonal
less the product of the two off-diagonal terms.
This cross-multiplication process can be extended to a 3 × 3 matrix as the
third example below illustrates. The third-order determinant consists of six
terms, three of which are added and three of which are subtracted in the
process of cross-multiplication. Using the same notation as in the 2 × 2
matrix, the determinant of a 3 × 3 matrix is
\[
a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - (a_{31} a_{22} a_{13} + a_{32} a_{23} a_{11} + a_{33} a_{21} a_{12}).
\]
Figure 9.3 illustrates how we can find the various products when a third-
order determinant is involved. The solid lines in Figure 9.3 form a cross
product of three elements, beginning in each case with an element in the
top row and including two other elements that are each from a different row
and column. The dashed lines also form a cross product of three elements,
beginning in each case with an element from the bottom row and including
two other elements, each of which is from a different row and column. The six
products together determine the value of the determinant, with the solid-line
products to be added and the dashed-line products to be subtracted.
Examples
1. \(A = \begin{bmatrix} 4 & 2 \\ 1 & 5 \end{bmatrix}\), so |A| = 4 · 5 − 1 · 2 = 20 − 2 = 18.
2. \(A = \begin{bmatrix} -3 & -4 \\ 1 & 5 \end{bmatrix}\), so |A| = −3 · 5 − 1 · (−4) = −15 + 4 = −11.
3. \(A = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 2 & 4 \\ 4 & 1 & 3 \end{bmatrix}\), so |A| = (1 · 2 · 3 + 0 · 4 · 4 + 0 · 3 · 1) −
(4 · 2 · 0 + 1 · 4 · 1 + 3 · 3 · 0) = 2.
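The three examples, and the six-term formula itself, can be checked with Maxima's determinant command. In this sketch the names G and six_terms are introduced for illustration; kill(a) only makes sure that a has no earlier value.

```maxima
/* The three numerical examples: 18, -11, and 2 */
determinant(matrix([4, 2], [1, 5]));
determinant(matrix([-3, -4], [1, 5]));
determinant(matrix([1, 0, 0], [3, 2, 4], [4, 1, 3]));

/* Symbolic check of the six-term rule for a general 3 x 3 matrix */
kill(a)$
G: genmatrix(a, 3, 3)$
six_terms: a[1,1]*a[2,2]*a[3,3] + a[1,2]*a[2,3]*a[3,1] + a[1,3]*a[2,1]*a[3,2]
         - (a[3,1]*a[2,2]*a[1,3] + a[3,2]*a[2,3]*a[1,1] + a[3,3]*a[2,1]*a[1,2])$

/* The difference between Maxima's determinant and the formula expands to 0 */
is(expand(determinant(G) - six_terms) = 0);
```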
Label these smaller matrices \(|A_{1,1}|\), \(|A_{1,2}|\), and \(|A_{1,3}|\). We can now restate |A|
as follows:
\[
|A| = \sum_{j=1}^{3} (-1)^{1+j} \cdot a_{1,j} \cdot |A_{1,j}|.
\]
The term (−1)i+j ·|Ai,j | is the cofactor of ai,j . When i+j is an even number, a
value is added to the sum; when i + j is an odd number, a value is subtracted
from the sum. As noted above, any row value (i, not just i = 1) can be used.
Also, the summation could be over the rows in a specified column.
The terms \(|A_{i,j}|\) are called minors. The terms \((-1)^{i+j} \cdot |A_{i,j}|\) are called
signed minors or cofactors. We can define each cofactor as \(|C_{i,j}| = (-1)^{i+j} \cdot |A_{i,j}|\).
Then a general expression for the solution of a determinant by the
method of cofactors can be written as
\[
|A| = \sum_{j=1}^{n} a_{i,j} \cdot |C_{i,j}|
\]
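A cofactor expansion can be carried out step by step in Maxima. This is a minimal sketch, assuming the 3 × 3 matrix of the third determinant example; the function name cof and the names row1 and col2 are introduced here for illustration. Maxima's minor(M, i, j) returns M with row i and column j removed.

```maxima
M: matrix([1, 0, 0], [3, 2, 4], [4, 1, 3])$

/* Cofactor of element (i, j): signed determinant of the minor */
cof(i, j) := (-1)^(i + j) * determinant(minor(M, i, j))$

/* Expansion along the first row */
row1: apply("+", makelist(M[1, j] * cof(1, j), j, 1, 3));

/* Expansion down the second column gives the same value */
col2: apply("+", makelist(M[i, 2] * cof(i, 2), i, 1, 3));

/* Compare both expansions with the built-in command */
[row1, col2, determinant(M)];
```

Expanding along the first row is convenient here because two of its three elements are zero, so only one cofactor contributes.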
No row or column contains any zeros, and the computation, even with the
application of the method of cofactors, will be arduous and subject to mistakes.
The solution, given almost instantaneously by Maxima, is \(\frac{1}{560105280000}\).8
The site http://www.purplemath.com/modules/minors.htm offers a slightly
more expansive discussion of this topic. This site shows how to compute the
value of a 4 × 4 determinant. It also offers some tips on how to manipulate
a matrix so as to generate cells that have 0 as a value. These tips involve
using the properties of determinants that we state below.
A digression on the differences between matrices and determinants.
Matrices and determinants are not the same thing. A matrix, denoted by
brackets or parentheses, has no numeric value. A matrix is a rectangular
array of numbers, variables, and parameters. A determinant, on the other
hand, does have a numeric value. A determinant is defined to be a scalar
(number).
Matrices can be of any dimension and need not be square. Determinants
must be square. A 2 × 3 determinant does not exist.
Properties of determinants. We can usefully apply the following prop-
erties when we work with determinants. These properties apply to determi-
nants of any dimension.
Property 1. The determinant of a matrix A has the same value as the
determinant of its transpose \(A'\). Let
\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \quad \text{Then } A' = \begin{bmatrix} a & c \\ b & d \end{bmatrix}.
\]
For both of these matrices, the determinant is the same, a · d − b · c.
Property 2. Interchanging any two rows (or any two columns) of a determinant
does not alter the absolute value of that determinant. It does, however,
change the sign of the determinant.
For example \(\begin{vmatrix} a & b \\ c & d \end{vmatrix} = a \cdot d - b \cdot c\) but
\(\begin{vmatrix} b & a \\ d & c \end{vmatrix} = b \cdot c - a \cdot d\). Confirm that
interchanging rows 1 and 2 of the initial matrix has the same effect as
interchanging the columns.
Property 3. A determinant in which any two rows (or any two columns) are
identical, or a determinant in which any two rows (or any two columns) are
multiples of each other, has a value of zero.
8
Three commands result in the creation of this matrix and the computation of its
determinant: h[i,j]:= i/(i+j), genmatrix(h,5,5), and determinant(%).
For example \(\begin{vmatrix} a & b \\ k \cdot a & k \cdot b \end{vmatrix} = a \cdot b \cdot k - k \cdot a \cdot b = 0\).
Property 4. A determinant in which any row or any column has all zero
elements has a value of zero.
For example \(\begin{vmatrix} a & b \\ 0 & 0 \end{vmatrix} = 0 \cdot a - 0 \cdot b = 0\).
Property 5. Adding (or subtracting) a multiple of one row of a determinant
to (from) another row of that determinant, or adding (or subtracting) a
multiple of one column of a determinant to (from) another column of that
determinant does not change the value of the determinant.
For example \(\begin{vmatrix} a & b \\ c + k \cdot a & d + k \cdot b \end{vmatrix} = a \cdot d + a \cdot k \cdot b - (c \cdot b + k \cdot a \cdot b) = a \cdot d - c \cdot b\).
Property 6. If every element in one row (or one column) is multiplied by a
constant k, then the value of the determinant is also multiplied by k. For
example, for \(A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\), |A| = a · d − c · b and
\(\begin{vmatrix} a & b \\ k \cdot c & k \cdot d \end{vmatrix} = k \cdot a \cdot d - k \cdot c \cdot b = k \cdot |A|\).
By iteration, if we multiply the elements of two columns of a matrix M by k,
then the determinant of the new matrix is \(k^2 \cdot |M|\). Further, if we multiply all
elements of an n × n matrix M by k, the determinant of the resulting matrix
is \(k^n \cdot |M|\).
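These properties can be confirmed symbolically for the 2 × 2 case. The sketch below is an illustration, not a proof for general n; the names dA and p1 through p6n are introduced here, and each check evaluates to true when the property holds.

```maxima
kill(a, b, c, d, k)$
A: matrix([a, b], [c, d])$
dA: determinant(A)$          /* a*d - b*c */

/* Property 1: transposing leaves the determinant unchanged */
p1: is(expand(determinant(transpose(A)) - dA) = 0);

/* Property 2: interchanging the two rows changes the sign */
p2: is(expand(determinant(matrix([c, d], [a, b])) + dA) = 0);

/* Property 3: a row that is a multiple of another row gives zero */
p3: is(expand(determinant(matrix([a, b], [k*a, k*b]))) = 0);

/* Property 5: adding k times row 1 to row 2 changes nothing */
p5: is(expand(determinant(matrix([a, b], [c + k*a, d + k*b])) - dA) = 0);

/* Property 6: multiplying one row by k multiplies the determinant by k */
p6: is(expand(determinant(matrix([a, b], [k*c, k*d])) - k*dA) = 0);

/* Multiplying every element of a 2 x 2 matrix by k gives k^2 times |A| */
p6n: is(expand(determinant(k*A) - k^2*dA) = 0);
```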
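Exercise 9.3. Evaluate the determinant of each of the following matrices.
Confirm your solutions with Maxima.
1. \(\begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 1 & 5 & 7 \end{bmatrix}\)
2. \(\begin{bmatrix} 1 & 3 & 4 \\ 2 & 0 & 7 \\ 5 & 6 & 9 \end{bmatrix}\)
3. \(\begin{bmatrix} 4 & 1 & 6 \\ 7 & 2 & 9 \\ 3 & 0 & 8 \end{bmatrix}\)
4. \(\begin{bmatrix} 2 & 1 \\ 3 & 4 \end{bmatrix}\)
5. \(\begin{bmatrix} 5 & -4 \\ 4 & 0 \end{bmatrix}\)
6. \(\begin{bmatrix} 2 & 1 & -3 & 4 \\ 7 & -2 & 1 & 1 \\ 6 & -3 & -3 & -3 \\ 3 & -2 & 5 & 2 \end{bmatrix}\)
7. \(\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}\)
8. \(\begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}\)
9. \(\begin{bmatrix} 1 & 2 & -2 & 3 \\ 3 & -1 & 5 & 0 \\ 1 & 7 & 2 & -3 \\ 4 & 0 & 2 & 1 \end{bmatrix}\)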
\[
\underset{n \times 1}{X} = \underset{n \times n}{A^{-1}} \cdot \underset{n \times 1}{C}.
\]
matrix A is another square matrix, denoted \(A^{-1}\), that satisfies the relation
\(A^{-1} \cdot A = A \cdot A^{-1} = I\).
This definition of the inverse matrix is consistent with ordinary algebraic
rules. For example, in ordinary algebra, \(a \cdot a^{-1} = a^{-1} \cdot a = 1\). In the
case of matrix algebra, it makes no difference whether A is premultiplied
or postmultiplied by \(A^{-1}\). The product that results is always the identity
matrix I.
Our definition of the inverse of a matrix has two noteworthy implications.
We state these without proof. See Perlis [17].
The theorem above uses the term adjoint of matrix A. We now define this
new concept. Definition: The adjoint of matrix A, denoted by adjA, is the
transpose of the cofactor matrix of A, which we encountered when computing
determinants. More formally, let the cofactor matrix of matrix A be given
by \(C = [\,|C_{ij}|\,]\). Then the adjoint of A is given as
\[
\begin{bmatrix} |C_{11}| & |C_{12}| & \cdots & |C_{1n}| \\ |C_{21}| & |C_{22}| & \cdots & |C_{2n}| \\ \vdots & \vdots & & \vdots \\ |C_{n1}| & |C_{n2}| & \cdots & |C_{nn}| \end{bmatrix}'
= \mathrm{adj}A =
\begin{bmatrix} |C_{11}| & |C_{21}| & \cdots & |C_{n1}| \\ |C_{12}| & |C_{22}| & \cdots & |C_{n2}| \\ \vdots & \vdots & & \vdots \\ |C_{1n}| & |C_{2n}| & \cdots & |C_{nn}| \end{bmatrix}.
\]
We can either divide the adjoint matrix by the determinant or use the
command invert(M) to determine the inverse matrix, which is
\[
M^{-1} = \begin{bmatrix} -\frac{14}{19} & -\frac{1}{19} & \frac{7}{19} \\ \frac{17}{57} & -\frac{11}{57} & \frac{1}{57} \\ \frac{4}{19} & \frac{3}{19} & -\frac{2}{19} \end{bmatrix}.
\]
Finally, we can compute either \(M^{-1} \cdot M\) or \(M \cdot M^{-1}\) to confirm that the
result is a 3 × 3 identity matrix.
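Both routes to the inverse can be followed in Maxima. The sketch below assumes that M is the 3 × 3 matrix with rows (1, 3, 4), (2, 0, 7), and (5, 6, 9), which is the matrix whose inverse is displayed above; the names detM, adjM, and Minv are introduced here for illustration.

```maxima
M: matrix([1, 3, 4], [2, 0, 7], [5, 6, 9])$

/* Determinant: 57, so the inverse exists */
detM: determinant(M);

/* Adjoint: transpose of the cofactor matrix */
adjM: adjoint(M);

/* Route 1: divide the adjoint by the determinant */
Minv: adjM / detM;

/* Route 2: the built-in command */
invert(M);

/* Either product with M returns the 3 x 3 identity matrix */
[Minv . M, M . Minv];
```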
Suppose that our matrix is
\[
\begin{bmatrix} 1 & 0 & 4 \\ 2 & -3 & 1 \\ 6 & -9 & 3 \end{bmatrix}.
\]
For this matrix (for which the values in the third row are 3 times their
counterparts in the second row), the determinant is 0. Thus we cannot divide
the elements of the adjoint matrix by the determinant in order to compute
the elements of the inverse matrix. This matrix is singular and does not have
an inverse matrix.
The three examples above illustrate two important points. First, the
definition of an inverse matrix requires that |A| ≠ 0. Not only does this mean
that matrix A is nonsingular, but also it recognizes that division by zero is
undefined. Therefore |A| ≠ 0 is a necessary and sufficient condition for an
inverse matrix to exist. Second, it is always possible to check whether the
theorem concerning inverse matrices has been applied correctly. One need only
multiply the alleged inverse and the original matrix. If the theorem has been
applied correctly, the result must be the identity matrix.
Properties of Inverse Matrices. Three properties of inverse matrices
warrant mention.
1. If \(A^{-1}\) exists, then \((A^{-1})^{-1} = A\). That is, the inverse of an inverse
matrix, if it exists, is the original matrix. For an example, consider
\(M^{-1}\) in Example 2 above. Its inverse is
\[
\begin{bmatrix} 1 & 3 & 4 \\ 2 & 0 & 7 \\ 5 & 6 & 9 \end{bmatrix},
\]
which is matrix M.
2. If \(A^{-1}\) and \(B^{-1}\) both exist, then \((A \cdot B)^{-1} = B^{-1} \cdot A^{-1}\). That is, the
inverse of the product of two matrices is equal to the product of their
inverses in reverse order. This property generalizes to any number of
matrices, so that \((A \cdot B \cdots Z)^{-1} = Z^{-1} \cdots B^{-1} \cdot A^{-1}\).
3. If \(A^{-1}\) exists, then \((A')^{-1} = (A^{-1})'\). That is, the inverse of the transpose
is the transpose of the inverse.
1. Determine whether or not the inverse matrix exists. That is, find |A|.
If |A| = 0, then there is no inverse matrix.
3. Find the adjoint of matrix A. That is, take the transpose of the cofactor
matrix such that \(C' = \mathrm{adj}A\).
With Maxima: Use the command invert(M), where M is the name that you
have assigned to the matrix.
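Exercise 9.4. For each of the following matrices, find the inverse if it exists.
1. \(\begin{bmatrix} 2 & 1 \\ 0 & 5 \end{bmatrix}\)
2. \(\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}\)
3. \(\begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}\)
4. \(\begin{bmatrix} 2 & 1 & 3 \\ 3 & 0 & 1 \\ -1 & 1 & 4 \end{bmatrix}\)
5. \(\begin{bmatrix} -1 & 0 & 2 \\ 3 & 1 & -6 \\ -2 & -1 & 5 \end{bmatrix}\)
6. \(\begin{bmatrix} 1 & 0 & -2 \\ -3 & -1 & 6 \\ 2 & 1 & -5 \end{bmatrix}\)
7. \(\begin{bmatrix} 7 & 6 & 5 \\ 1 & 2 & 1 \\ 3 & -2 & 1 \end{bmatrix}\)
8. \(\begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}\)
9. \(\begin{bmatrix} 0 & 5 \\ 6 & 4 \end{bmatrix}\)
10. \(\begin{bmatrix} 2 & 4 \\ -3 & -6 \end{bmatrix}\)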
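\[
\underset{n \times n}{A} \cdot \underset{n \times 1}{X} = \underset{n \times 1}{C}.
\]
If the inverse matrix \(A^{-1}\) does exist, then premultiplying both sides of A · X =
C by \(A^{-1}\) yields \(A^{-1} \cdot A \cdot X = A^{-1} \cdot C\), or
\[
\underset{n \times 1}{X} = \underset{n \times n}{A^{-1}} \cdot \underset{n \times 1}{C} = \underset{n \times 1}{D}.
\]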
Our definition of matrix equality tells us that the left side n × 1 column
vector of unknown variables represented by X must be equal to the right side
n × 1 column vector of solution values represented by D if the two sides are
indeed equal.
We have also found that we can find an inverse matrix (such as \(A^{-1}\)) only
if the matrix A is square. We stated this requirement by asserting that
\(A^{-1} \cdot A = A \cdot A^{-1} = I\), the identity matrix. This means that the number of
equations is equal to the number of unknown variables.
Recall that when the value of the determinant of a matrix is zero, then you
cannot find an inverse for that matrix. That is, \(A^{-1} = \frac{1}{|A|} \cdot \mathrm{adj}A\) and \(|A| \neq 0\).
Thus, when |A| ≠ 0, there is a unique solution for a linear equation system.
Nonsingularity implies that an inverse can be found. When an inverse ma-
trix can be found, then there is a unique solution. We may summarize the
relationship between nonsingularity and the existence of a unique solution
as follows: Nonsingularity implies the existence of an inverse and a unique
solution and vice versa.
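The solution X = A⁻¹ · C can be computed directly in Maxima. The two-equation system below (2·x1 + x2 = 5 and x1 + 3·x2 = 10) is a hypothetical example chosen for illustration; it is not taken from the text.

```maxima
/* Hypothetical system: 2*x1 + x2 = 5 and x1 + 3*x2 = 10 */
A: matrix([2, 1], [1, 3])$
C: matrix([5], [10])$

/* A nonzero determinant (here 5) means that a unique solution exists */
determinant(A);

/* Premultiply C by the inverse of A to obtain the solution vector */
X: invert(A) . C;

/* Check: multiplying A by the solution reproduces C */
is(A . X = C);
```

The solution vector is x1 = 1 and x2 = 3.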
Consider two examples.
To derive a specific set of estimates (numerical values) from this set of
estimators (rules), we require a data set. To illustrate the process, we look at a
hypothetical example in which \(y_t = b_0 + b_1 \cdot x_{1,t} + b_2 \cdot x_{2,t} + e_t\). The \(e_t\) terms
are not observable. They can be estimated after the parameters have been
estimated. The data are as follows:
\[
\begin{bmatrix} 1 & 2 & 5 \\ 1 & 5 & 8 \\ 1 & 6 & 9 \\ 1 & 7 & 7 \\ 1 & 6.5 & 8 \end{bmatrix}, \quad
\begin{bmatrix} 1 \\ 2.1 \\ 3 \\ 4.5 \\ 7 \end{bmatrix}.
\]
The first matrix contains the x values. This matrix includes a column of
1’s which attach to the constant term, the estimated value of b0 . We are
estimating three parameters, b0 , b1 , and b2 . We have five data points. Me-
chanically, this is enough to provide estimates. It is not nearly enough to
provide reliable estimates, but it does illustrate the process.
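Now we transpose X, and then multiply this transpose by X. The result
is the 3 × 3 matrix in the middle. Then we determine the inverse of that
matrix.
\[
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 2 & 5 & 6 & 7 & 6.5 \\ 5 & 8 & 9 & 7 & 8 \end{bmatrix}, \quad
\begin{bmatrix} 5 & 26.5 & 37 \\ 26.5 & 1.56 \cdot 10^{2} & 2.05 \cdot 10^{2} \\ 37 & 2.05 \cdot 10^{2} & 283 \end{bmatrix}, \quad
\begin{bmatrix} 6.63 & 0.259 & -1.05 \\ 0.259 & 0.139 & -0.135 \\ -1.05 & -0.135 & 0.239 \end{bmatrix}
\]
The first of the following pair of matrices shows the result of multiplying
\(X'\) by Y. Multiplying \((X' \cdot X)^{-1}\) by this matrix creates our set of OLS
estimates for the coefficients: \((X' \cdot X)^{-1} \cdot (X' \cdot Y)\).
\[
\begin{bmatrix} 17.6 \\ 1.07 \cdot 10^{2} \\ 1.36 \cdot 10^{2} \end{bmatrix}, \quad
\begin{bmatrix} 0.806 \\ 1.16 \\ -0.466 \end{bmatrix}.
\]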
output shows the estimates. It also repeats the Y vector for comparison, and
it shows the size of the residuals–the differences between the OLS estimates
and the actual values in this sample.
OLS estimates   Actual values   Residuals
0.802           1               0.198
2.89            2.1             −0.792
3.59            3               −0.588
5.68            4.5             −1.18
4.64            7               2.36
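The whole calculation can be reproduced with a few Maxima commands. This is a minimal sketch that uses the data matrices shown above; the names XtX, XtY, b, fitted, and resid are introduced here for illustration and need not match those in the accompanying workbook.

```maxima
/* Data: a column of 1's for the constant term, then the two regressors */
X: matrix([1, 2, 5], [1, 5, 8], [1, 6, 9], [1, 7, 7], [1, 6.5, 8])$
Y: matrix([1], [2.1], [3], [4.5], [7])$

/* Cross-product matrices X'X (3 x 3) and X'Y (3 x 1) */
XtX: transpose(X) . X$
XtY: transpose(X) . Y$

/* OLS estimates (X'X)^(-1) . (X'Y): roughly 0.806, 1.16, and -0.466 */
b: invert(XtX) . XtY;

/* Predicted values and residuals (actual minus predicted) */
fitted: X . b$
resid: Y - fitted;
```

Because the model includes a constant term, the residuals sum to zero apart from rounding.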
1. x1 + 3 · x2 = 15 and x1 − 2 · x2 = −3
2. 2 · x1 + 3 · x2 = 10 and −4 · x1 + x2 = −6
3. 2 · x1 − 3 · x2 = 7 and 3 · x1 + 5 · x2 = 1
4. 10 · x1 − x2 − x3 = 0, −x1 + 12 · x2 − 2 · x3 = 0, and x1 + 2 · x2 = 24
5. 12 · x1 − 2 · x2 − x3 = 0, 12 · x1 − 6 · x2 − x5 = 0, and x1 + x2 = 16
6. x1 + 2 · x2 − 3 · x3 = −1, 3 · x1 − x2 + 2 · x3 = 7, and 5 · x1 + 3 · x2 − 4 · x3 = 2
7. 2 · x1 + x2 − 2 · x3 = 10, 3 · x1 + 2 · x2 − 2 · x3 = 1, and 5 · x1 + 4 · x2 + 3 · x3 = 4
8. x1 + 2 · x2 − 3 · x3 = 6, 2 · x1 − x2 + 4 · x3 = 2, and 4 · x1 + 3 · x2 − 2 · x3 = 14
9. x1 + 3 · x2 − 2 · x3 = 0, 2 · x1 − 3 · x2 + x3 = 0, and 3 · x1 − 2 · x2 + 2 · x3 = 0
more independent variables were said to flow directly from the one- and two-
independent variable cases. This assertion was not formally demonstrated.
With the help of matrix algebra, however, we can see how to identify extreme
points when we deal with functions that have n independent variables.
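such that
$$
|H| = \begin{vmatrix}
f_{11} & f_{12} & \cdots & f_{1n} \\
f_{21} & f_{22} & \cdots & f_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
f_{n1} & f_{n2} & \cdots & f_{nn}
\end{vmatrix}.
$$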
Once we have found the Hessian determinant, one of the following conditions
must hold:
a. When |H1 |, |H2 |, . . . , |Hn | > 0, we have a minimum at the critical point.
b. When |H1 | < 0, |H2 | > 0, |H3 | < 0, . . . (i.e., alternating signs), we have
a maximum at the critical point.
c. When neither (a) nor (b) is true, the test fails, and we must examine the
function in the neighborhood of the critical point in order to determine
whether an extreme point exists.
The terms |H1 |, |H2 |, . . . , |Hn | are principal minors of a Hessian determinant.
The Hessian determinant that is used in the second-order condition is a
symmetric determinant. That is, the main diagonal of the Hessian consists
of all second-order partial derivatives of the function with respect to the
variables of the function; for example, f11 , f22 , . . . , fnn . The off-diagonal
elements in the Hessian are composed of all mixed or cross-partial derivatives
of the function, for example, f12 or f36 , where, according to Young’s theorem,
fij = fji .
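The process for defining the Hessian minors is illustrated here for |H1 |, |H2 |,
and |H3 |. The process continues up to and including |Hn |. (The | | indicates
determinants, not absolute values.)
$$
|H_1| = |f_{11}| = f_{11}, \qquad
|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix}
      = \begin{vmatrix} f_{11} & f_{21} \\ f_{12} & f_{22} \end{vmatrix},
$$
$$
|H_3| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}
      = \begin{vmatrix} f_{11} & f_{21} & f_{31} \\ f_{12} & f_{22} & f_{32} \\ f_{13} & f_{23} & f_{33} \end{vmatrix}, \ldots
$$
and so forth through |Hn |.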
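The sequence of principal minors can be generated mechanically. The sketch below uses a hypothetical three-variable function chosen only for illustration; the helper name lead_minor is likewise an assumption, not a built-in Maxima command. It relies on the built-in hessian, genmatrix, and determinant functions.

```maxima
/* Hypothetical function, chosen only for illustration */
f : x^2 + 2*y^2 + 3*z^2 - x*y - y*z$
vars : [x, y, z]$

/* First-order conditions: set every first partial derivative to zero */
crit : solve(makelist(diff(f, v) = 0, v, vars), vars);

H : hessian(f, vars);

/* k-th principal minor: determinant of the upper-left k x k block of A */
lead_minor(A, k) := determinant(genmatrix(lambda([i, j], A[i, j]), k, k))$

minors : makelist(lead_minor(H, k), k, 1, length(vars));
```

For this function the only critical point is the origin, and the three minors work out to 2, 7, and 40. All are positive, so condition (a) indicates a minimum. When the Hessian contains variables, the critical values would first have to be substituted into H before the signs are checked.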
We can see that these results are consistent with the conditions that Chapter
7 developed for a single variable. In that case the Hessian contains a single
element, f11 . When |H1 | > 0, we have found a local minimum, and when
|H1 | < 0, we have found a local maximum. This is the same condition that
Chapter 7 developed, that $f''(x) > 0$ indicates a minimum and $f''(x) < 0$
indicates a maximum.
Analogously, in Chapter 7 a function of two independent variables satisfied
the second-order condition for a minimum if $f_{xx} \cdot f_{yy} - (f_{xy})^2 > 0$ and
$f_{xx}, f_{yy} > 0$ at the critical point. A maximum existed if $f_{xx} \cdot f_{yy} - (f_{xy})^2 > 0$
and $f_{xx}, f_{yy} < 0$ at the critical point. This is precisely what the second-order
condition for the Hessian determinants requires:
$$
|H_1| = |f_{xx}| = f_{xx} > 0 \quad \text{and} \quad
|H_2| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx} \cdot f_{yy} - (f_{xy})^2 > 0
$$
for a minimum. Similarly,
$$
|H_1| = |f_{xx}| = f_{xx} < 0 \quad \text{and} \quad
|H_2| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx} \cdot f_{yy} - (f_{xy})^2 > 0
$$
are required to establish a maximum.
A function with only one independent variable has only one principal minor.
A function of two independent variables has only two principal minors. A
function with n independent variables has n principal minors. We must
examine each of those principal minors when we seek to determine whether
an extreme point exists. If any one of those principal minors is found to have
an incorrect sign, then we need go no further with the evaluation process.
It is wise to begin with |H1 |, proceed to |H2 |, and so forth. If, for example, we
are testing for the existence of a maximum at a critical point, then the signs
of the principal minors will alternate, beginning with a negative. If |H1 | < 0
and |H2 | > 0, but |H3 | > 0, then a maximum point may not exist. In this case,
the test fails, and we must examine the function in the neighborhood of the
critical point in order to determine whether a maximum exists (a process in
which a computer algebra system is quite useful). In any case, we need not
go beyond |H3 | to determine that the test has failed. Finally, if there are
n independent variables, and n is an even number, then the sign of the nth
principal minor must be positive if a maximum exists. If n is odd, then the
sign of the nth principal minor must be negative for a maximum to exist.
Consider two examples.
Exercise 9.6. Determine the critical points, if any, that correspond to local
maxima or minima for the following functions.
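1. $z = 2 \cdot x^2 + y^2 - 2 \cdot x \cdot y + 5 \cdot x - 3 \cdot y + 1$
2. $z = 2 \cdot x_1 + x_1 \cdot x_2 + 4 \cdot x_2 + x_1 \cdot x_3 + x_3^2 + 8$
3. $z = 4 \cdot x_1 \cdot x_2 + 3 \cdot x_3 \cdot x_1^2 + x_2 \cdot x_3$
6. $z = x^3 + y^3 + z^3 - 3 \cdot x \cdot y \cdot z$
7. $z = 5 \cdot x^3 - 2 \cdot x \cdot y + 3 \cdot y^2$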
derivatives equal to 0, and solve for their critical roots. Only critical-root
values can be extreme points. However, a critical-root value is not always an
extreme point.
We need a second-order (sufficient) condition in order to make a firm judg-
ment.
Second-order (sufficient) condition
Given that the first partial derivatives of L exist and have been set equal to
0 for solution purposes, we must find the bordered Hessian determinant
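relating to the function and its constraints. The bordered Hessian determinant
of a function $z = f(x_1, x_2, \ldots, x_n)$, subject to $g(x_1, x_2, \ldots, x_n) = 0$, is
denoted by $|H^B|$ and is composed of all second-order partial derivatives of
the Lagrangian function L, bordered by the first partial derivatives of
the constraint,
such that
$$
|H^B| = \begin{vmatrix}
0 & g_1 & g_2 & \cdots & g_n \\
g_1 & L_{11} & L_{12} & \cdots & L_{1n} \\
g_2 & L_{21} & L_{22} & \cdots & L_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
g_n & L_{n1} & L_{n2} & \cdots & L_{nn}
\end{vmatrix}.
$$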
If all the first-order partial derivatives of the constraint and all the second-
order partial derivatives of the function L exist at the critical point(s), then
one of the following conditions must hold.
(a) When $|H_2^B|, |H_3^B|, \ldots, |H_n^B| < 0$, we have a minimum at the critical point.
(b) When $|H_2^B| > 0$, $|H_3^B| < 0$, $|H_4^B| > 0, \ldots$, we have a maximum at the
critical point.
(c) When neither (a) nor (b) is met, the test fails, and we must examine the
function in the neighborhood around the critical point in order to determine
whether a constrained extreme value exists.
A bordered Hessian determinant is a symmetric determinant. It is simply
a Hessian determinant that is bordered by the first partial derivatives of
the constraint, and 0. The symmetry follows from the fact that a Hessian
determinant, which is the major part of a bordered Hessian determinant, is
also symmetric.
It is customary to denote the ith bordered principal minor of a bordered
Hessian determinant by the symbol |HiB |. Among the bordered principal
The notation |H2B | means that we must take the Hessian determinant |H2 |
and place around it the appropriate border.
We should note carefully that the process of evaluating bordered Hessian
determinants begins with $|H_2^B|$, not with $|H_1^B|$. We do not evaluate $|H_1^B|$
when we maximize or minimize subject to a single constraint. We shall
shortly state a general rule that deals with this situation.
Consider two examples.
1. Find the maximum or minimum value(s) for the function $z = x_1^2 - 10 \cdot x_2^2$,
subject to this constraint: $x_1 - x_2 = 18$.
Our Lagrangian expression is $L = x_1^2 - 10 \cdot x_2^2 + \lambda \cdot (x_1 - x_2 - 18)$.
The first-order conditions are $2 \cdot x_1 + \lambda = 0$, $-20 \cdot x_2 - \lambda = 0$, and
$x_1 - x_2 - 18 = 0$, so the critical values are $x_1 = 20$, $x_2 = 2$, and $\lambda = -40$.
Our second-order condition is
$$
|H_2^B| = \begin{vmatrix} 0 & 1 & -1 \\ 1 & 2 & 0 \\ -1 & 0 & -20 \end{vmatrix} = 18 > 0,
$$
and
$$
|H_3^B| = \begin{vmatrix} 0 & 1 & 2 & 4 \\ 1 & 10 & -4 & -2 \\ 2 & -4 & 20 & 0 \\ 4 & -2 & 0 & 2 \end{vmatrix} = -3528.
$$
Thus, we determine that a constrained minimum point on w occurs at
the critical values of x, y, and z.
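The first example (z = x1² − 10 · x2² subject to x1 − x2 = 18) can be checked in Maxima. This is a sketch under one naming assumption: the multiplier is called lam, because lambda is a reserved word in Maxima.

```maxima
z : x1^2 - 10*x2^2$
g : x1 - x2 - 18$
L : z + lam*g$      /* lam plays the role of the Lagrangian multiplier */

/* First-order conditions */
crit : solve([diff(L, x1) = 0, diff(L, x2) = 0, diff(L, lam) = 0],
             [x1, x2, lam]);

/* Because L is linear in lam, the Hessian of L taken over [lam, x1, x2]
   has 0 in the corner and the constraint's first partials as its border. */
HB : hessian(L, [lam, x1, x2]);
determinant(HB);
```

The solve step reproduces x1 = 20, x2 = 2, and λ = −40, and the determinant equals 18, matching the value of the 3 × 3 bordered determinant shown above. Listing lam first in the variable list is simply a convenient way to build the border.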
zero for solution purposes, we must find the appropriate bordered Hessian
determinant. In this case, the bordered Hessian is given by
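$$
|H^B| = \begin{vmatrix}
0 & \cdots & 0 & 0 & k_1 & k_2 & \cdots & k_n \\
\vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & h_1 & h_2 & \cdots & h_n \\
0 & \cdots & 0 & 0 & g_1 & g_2 & \cdots & g_n \\
k_1 & \cdots & h_1 & g_1 & L_{11} & L_{12} & \cdots & L_{1n} \\
k_2 & \cdots & h_2 & g_2 & L_{21} & L_{22} & \cdots & L_{2n} \\
\vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots \\
k_n & \cdots & h_n & g_n & L_{n1} & L_{n2} & \cdots & L_{nn}
\end{vmatrix}.
$$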
If all the first-order partial derivatives of the constraints and all the second-
order partial derivatives of the function L exist at the critical point(s), then
one of the following conditions must hold:
(a) When $|H_{m+1}^B|, |H_{m+2}^B|, \ldots, |H_{m+n}^B|$ all have the same sign, namely
that of $(-1)^m$, we have a constrained minimum at the critical point.
(b) When $|H_{m+1}^B|, |H_{m+2}^B|, \ldots, |H_{m+n}^B|$ alternate in sign, where $|H_{m+1}^B|$ has
the sign of $(-1)^{m+1}$, we have a constrained maximum at the critical point.
(c) When the requirements of neither (a) nor (b) are met, the test fails and
we must examine the function in the neighborhood around the critical
point in order to determine whether a constrained extremum exists.
The practice of beginning the evaluation of the bordered Hessians with
something other than |H1 | continues. The rule that guides this behavior requires
that we begin with a bordered Hessian whose size is one bigger than the
number of constraints. Hence, when m constraints exist, we begin our analysis of
the bordered Hessians with $|H_{m+1}^B|$. We can now look back to our previous
work and explain our previous choices in this regard. When m = 0 and no
constraint exists, we begin with |H1 |. When m = 1, we begin with $|H_2^B|$, and
so on.
The signs that are required for the successive bordered Hessian determinants
follow a definite order. When we evaluate critical point(s) with respect to
a maximum, the bordered Hessians must alternate in sign. In the case in
which two constraints exist, the sign of $|H_{m+1}^B| = |H_3^B|$ must be negative, the
sign of $|H_4^B|$ must be positive, and so forth. The rule is that the sign of the
first determinant is given by $(-1)^{m+1}$. Hence, if m = 2, the sign of $|H_3^B|$ is
$(-1)^3 = -1$, and $|H_3^B|$ is negative.
The sign determination for the case of a constrained minimum differs from
that of a constrained maximum. When m = 0, all bordered Hessians must
be positive. When m = 1, all bordered Hessians must be negative. When
m = 2, all bordered Hessians must once again be positive. In general, the
sign of all bordered Hessians must be the same as the sign of $(-1)^m$ if a
minimum exists.
As an example, suppose that five constraints apply. Then all bordered Hessian
determinants must have a negative sign, for $(-1)^5 = -1$, which is negative.
We can see that when the number of constraints is odd, the signs must
all be negative, whereas when the number of constraints is even, the signs
must all be positive.
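The sign rules can be applied mechanically once the full bordered Hessian is known. The sketch below reuses the 4 × 4 bordered Hessian from the three-variable example earlier in this section (one constraint, so m = 1 and n = 3). The helper name bordered_minor is an assumption made for illustration.

```maxima
/* Bordered Hessian from the three-variable example (m = 1, n = 3) */
HB : matrix([0, 1, 2, 4], [1, 10, -4, -2], [2, -4, 20, 0], [4, -2, 0, 2])$
m : 1$
n : 3$

/* The k-th bordered minor is the determinant of the upper-left
   (m + k) x (m + k) block of the bordered Hessian */
bordered_minor(A, k) :=
    determinant(genmatrix(lambda([i, j], A[i, j]), m + k, m + k))$

/* Evaluation starts with k = m + 1 */
minors : makelist(bordered_minor(HB, k), k, m + 1, n);
signs  : map(signum, minors);

/* Sign required of every minor for a minimum, and
   sign required of the first minor for a maximum */
[(-1)^m, (-1)^(m + 1)];
```

Here the two minors are −76 and −3528. Both share the sign of (−1)^1 = −1, which is the pattern required for a constrained minimum.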
1) (1) Y = C + I + G,
   (2) C = 200 + 0.7 · Y,
   (3) I = 75 + 0.1 · Y,
   (4) G = 100
3. Find the equilibrium prices (Pi) and quantities (Qi) for goods A, B,
and C using inverse matrix algebra. Then use Maxima to check your
answer.
QD,A = 8 − 2 · PA + 3 · PB − PC
QD,B = 4 − 4 · PB + PA + 3 · PC
QD,C = 6 − PC + 3 · PA + 3 · PB
QS,A = 10
QS,B = 2 · PA + 2
QS,C = 8 + PC
Matrix algebra has several important uses beyond those demonstrated in
Chapter 9. In particular, matrix algebra underlies two powerful quantitative
techniques, linear programming and input-output analysis. Indeed, these are
just extensions of matrix algebra, and we devote a separate chapter to them
only as a means of focusing attention on their importance to economists.
We begin by considering linear programming in a matrix algebra framework,
and subsequently examine input-output analysis in the same fashion.
CHAPTER 10. LINEAR PROGRAMMING 313
problem. This objective function is linear in the $X_j$'s, which we call the
decision variables. Thus, our objective function is $C = \sum_{j=1}^{80} P_j \cdot X_j$.²
An obvious way to minimize C is to not spend any money on food, but this
is not permissible, for such a menu plan would not satisfy the recommended
daily dietary allowances, the constraints. More bluntly, it would kill the
consumer. Let R symbolize a dietary requirement. Thus R1 might represent
the recommended intake of calories per individual per day. In Stigler’s 1945
version of the diet problem, the recommended intake of calories per individual
per day was 3000. Fewer calories would presumably be detrimental to the
consumer’s health. We represent the nine different dietary requirements of
Stigler’s problem by the variable names R1 , R2 , . . . , R9 .
The diet problem is inherently challenging because two different foods seldom
yield the same amount of nutrient per ounce of food. For example, in Stigler’s
1945 exercise, 1 ounce of uncooked bacon yielded 186 calories, whereas 1
ounce of uncooked sirloin steak yielded only 88 calories. Let the symbol aij
represent the number of units of nutrient i that are provided by 1 ounce of
food j. Hence the aij for uncooked bacon (in terms of calories) is 186, while
the analogous aij for uncooked sirloin steak is 88.
A consumer can satisfy a nutrient requirement by eating many different foods.
The term aij · Xj represents the total number of units of nutrient i that
are obtained when one consumes a given number of ounces of food j. For
example, if one consumes 6 ounces of uncooked bacon, then Xj = 6, and
since aij = 186, aij · Xj = 186 · 6 = 1116 calories.
We have previously noted that Stigler assumed that the individual must
obtain at least 3000 calories per day from the foods consumed. The consumer
can choose among the 80 foods in order to satisfy this requirement. We can
write this constraint as follows, stating that the sum of all the calories the
consumer derives from consuming various foods must be 3000 or greater:
a1,1 · X1 + a1,2 · X2 + · · · + a1,80 · X80 ≥ 3000.
2
Nonlinear objective functions are not permissible in a linear programming problem.
If the researcher attempts to represent the underlying phenomenon by a linear equation
when it is actually nonlinear, then the results obtained will be inaccurate and unreli-
able. Fortunately, nonlinear programming techniques are now accessible, given advances
in mathematics and in computing power. Even spreadsheet programs like Excel typically
offer ways to address nonlinear systems. We illustrate nonlinear programming at the end
of this section.
The nine daily dietary requirements that Stigler imposed on the consumer in
his diet problem were reported in Chapter 1. Each of these nine requirements
constitutes a constraint on the consumer’s activities that takes the form of a
linear inequality similar to the expression above that addresses caloric intake.
The nine constraints are
a1,1 · X1 + a1,2 · X2 + · · · + a1,80 · X80 ≥ R1
a2,1 · X1 + a2,2 · X2 + · · · + a2,80 · X80 ≥ R2
  ...
a9,1 · X1 + a9,2 · X2 + · · · + a9,80 · X80 ≥ R9
The constraints or requirements in this set of inequalities are, like the objective
function, linear in the decision variables Xj .
We must include one additional, seemingly obvious set of constraints, that
Xj ≥ 0 for all j = 1, 2, . . . , 80. These restrictions are known as nonnega-
tivity constraints. They explicitly restrict the solution values of the decision
variables to be either zero or positive, thus eliminating the possibility of a
nonsensical solution that might (for example) allow the individual to consume
-5 ounces of bacon.
A formal expression of the problem at hand is that we must minimize our
objective function C subject to the nine inequalities and the eighty nonneg-
ativity conditions being satisfied.
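The structure of the problem (a linear cost to minimize, "at least" nutrient constraints, and nonnegativity conditions) can be seen in a deliberately tiny version. In the sketch below only the calorie figures (186 and 88 per ounce, and the 3000-calorie requirement) come from the text. The two prices and the second nutrient requirement are hypothetical values invented for illustration; they are not Stigler's data.

```maxima
load(simplex)$

/* X1 = ounces of uncooked bacon, X2 = ounces of uncooked sirloin steak */
P1 : 2$    /* hypothetical price per ounce */
P2 : 3$    /* hypothetical price per ounce */
cost : P1*X1 + P2*X2$

calories  : 186*X1 + 88*X2 >= 3000$   /* calorie figures from the text */
nutrient2 : 2*X1 + 6*X2 >= 60$        /* hypothetical second requirement */

float(minimize_lp(cost, [calories, nutrient2, X1 >= 0, X2 >= 0]));
```

The full Stigler problem has the same shape, with 80 decision variables and nine requirement rows in place of two of each.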
We now show a variant of the Stigler solution, which also appears in Chapter
1. For purposes of exposition, Stigler focused on a subset of five foods and
eight constraints. The solution to this subset yielded results that were close
to those of the larger model. The problem consists of the following nine
equations, with the first one being the objective function.
x5 + x4 + x3 + x2 + x1 (z)
1. The object of the problem is to find optimal values for the decision
variables or unknowns in the problem.
2. The optimal values of the decision variables are such that they either
minimize or maximize an explicit linear objective function.
3. The minimization or maximization solution of the objective function
must be feasible. That is, the values of the decision variables in the
optimal solution must satisfy both the linear inequality constraints and
the nonnegativity constraints.
or feasible region, that maximizes (minimizes) the value of the objective func-
tion.
If we ignore, for the time being, the less-than sign in the linear constraints of
Acme's linear programming problem, and therefore treat these constraints as
if they had equals signs, then we can begin to provide a visual representation
of the linear-programming problem. From Chapter 2, we know that we can
graph a straight line without great difficulty if we are given two points on that
line. For example, if we know the horizontal intercept (abscissa) and vertical
intercept (ordinate) of a line, then we can connect these two intercepts with
a straight line and obtain the needed graph. We use this technique to define
the solution space for Acme. Figure 10.1 illustrates this procedure.
Figure 10.1 also shows an iso-revenue line (for R = $200; any value will do).
This line lies outside the feasible region, so this level of revenue cannot be
attained.
The shaded area in Figure 10.1 has four corners (not counting the origin,
where R = 0). Two of the points are interior and involve the production of
positive amounts of both products. Two points are on the axes and involve
the production of just one of the two goods. Figure 10.2 replicates Figure
10.1 and focuses on the feasible region.
The accompanying workbook shows that the interior solutions occur at (24/5,
12/5) and (56/9,4/3), or approximately (4.8, 2.4) and (6.22, 1.33).
An infinite number of points lie either within or on the boundary of the feasible
region. This means that an infinite number of possibilities confront us when
we attempt to identify the solution that is optimal. To illustrate the search
technique that linear programming performs, we consider a few specific points
within the solution space. At the origin, as we have seen, R = 0. Acme can,
however, produce positive quantities of goods X1 and X2 that generate some
sales revenue, so the solution at the origin is not optimal. We can do better.
We assert that any movement from the origin that involves production of
more of one good and no less of the other adds to revenue. Such moves
are possible whenever the (X1 , X2 ) combination is inside the boundary of the
feasible region. Therefore, any point inside the solution space represents less
production (and therefore less total sales revenue) than at least one point on
the boundary of the solution space. Thus, the optimal solution lies on the
boundary of the solution space, not inside it.
When we are considering points as candidates for the optimal solution, we
can ignore any point inside the solution space. Knowledge of this fact
substantially reduces the number of possible solutions with which we must
contend. We proceed therefore to consider only those solution points that are
on the boundary. We begin with the values X1 = 7, X2 = 0, which yield
R = $140. From this point, we move counter-clockwise to (6.22, 1.33), at
which R is approximately $144.44 (computing from the rounded coordinates
gives $144.35). The next move, to (4.8, 2.4), generates revenue of
$132.00. Finally, when no X1 is produced and X2 = 4, the revenue level
falls to $60. Therefore, among the corners, the largest revenue occurs where
X1 = 6.22 and X2 = 1.33.
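The corner-by-corner comparison is easy to reproduce with exact arithmetic. The sketch below assumes the Acme objective R = 20 · X1 + 15 · X2 and the three constraint lines used in this section; the function name revenue is chosen here for illustration.

```maxima
revenue(a, b) := 20*a + 15*b$

/* Interior corners: intersections of adjacent constraint lines */
solve([5*X1 + 15*X2 = 60, 3*X1 + 4*X2 = 24], [X1, X2]);
solve([3*X1 + 4*X2 = 24, 12*X1 + 7*X2 = 84], [X1, X2]);

/* The four corners, moving counter-clockwise from the X1 axis */
corners : [[7, 0], [56/9, 4/3], [24/5, 12/5], [0, 4]]$
map(lambda([p], revenue(p[1], p[2])), corners);
float(%);
```

The exact revenues are 140, 1300/9, 132, and 60. The value 1300/9 is approximately 144.44, which is why figures computed from the rounded coordinates (6.22, 1.33) come out slightly lower.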
One of the corner points of the solution space is always an optimal solution to
a linear-programming problem, so we are justified in ignoring the linear seg-
ments between these corners. This assertion may not be intuitively obvious,
however, and requires more explanation.
Consider Figure 10.3, which graphs the constraints of the Acme linear pro-
gramming problem. The solution space (feasible region) is indicated by the
black border. Graphs that correspond to five values of the objective function,
R = $20X1 + $15X2 also appear. Four of the five revenue levels are feasible;
they are the values calculated above. For the highest of these five revenue
values, the iso-revenue line is tangent to the feasible region boundary. Thus,
a single value on that line is feasible. For R > $144.44, no point is in the
feasible region. Thus, the iso-revenue line for R = $160, shown in yellow,
cannot be attained.
In general, a corner of the solution space will always be an optimal solution to
a linear programming problem. Only when the slope of the objective function
is the same as the slope of a binding constraint will there be more than one
optimal solution. In the unlikely event that the slope of the objective function
in Figure 10.3 were the same as the slope of any one of the three line segments,
then all the points on that line segment (including the corner points) would
be optimal. Thus, the value of the objective function at the corner point is
at least as high (or low, if the program involves minimization) as the points
on the line segment, so only corner points need be considered.
An efficient linear-programming solution technique is ordinarily used to iden-
tify the corner points in a problem, evaluate them, and select the optimal
solution from among those possibilities. The simplex algorithm, one such
technique that is frequently used, is the one that Maxima used above to solve
the subset of the Stigler diet problem. This algorithm is a search technique
that identifies and evaluates the corner points in a problem. It repeatedly
strives to find a better solution than the one at hand. When it reaches an
optimal point, such as the one that we identified in the Acme problem, it
stops. A movement to any other corner would result in a worse solution.
Figure 10.3: The feasible region and revenue levels for Acme
The table below shows the objective function R and the three constraints.
R    15X2 + 20X1
C1   15X2 + 5X1 ≤ 60
C2   4X2 + 3X1 ≤ 24
C3   7X2 + 12X1 ≤ 84
The following commands load the simplex module and run the linear
programming tool:
load(simplex)$
float(maximize_lp(R, [C1,C2,C3,X1>=0,X2>=0]));
The maximize_lp command is embedded in a float command to generate
easily interpreted decimal values rather than exact values; the float wrapper
is optional. The results are consistent with our computations above:
[144.44,[X2=1.3333,X1=6.2222]]. The first item that the command reports
is the value of the objective function. The second item is a list of the two
production levels.
3. The coefficients of the variables in the primal objective function are the
right-hand constants of the constraint equations in the dual problem.
4. The coefficients of the variables in the dual objective function are the
right-hand constants of the constraint equations in the primal problem.
9. The optimal value of the objective function is identical for both the
primal and the dual problems.
Examples
subject to:
5 · X1 + 15 · X2 ≤ 60
3 · X1 + 4 · X2 ≤ 24
12 · X1 + 7 · X2 ≤ 84 and
X1 , X2 ≥ 0,
then the dual problem is
minimize: C = 60 · va + 24 · vb + 84 · vc
subject to:
5 · va + 3 · vb + 12 · vc ≥ 20
(The “value” to the firm of selling a unit of X1 is attributed to the
inputs, which could be used in producing X2 . Likewise for X2 below.)
15 · va + 4 · vb + 7 · vc ≥ 15
va , vb , vc ≥ 0
The optimal solution is: X1 = 6.22, X2 = 1.33, R = 144.35, va = 0,
vb = 1.47, and vc = 1.30.
The v’s are shadow prices. The objective is to minimize cost, C, based
on these shadow prices. The value of va is 0, because units of input a
remain unused after the optimal solution has been implemented.
Here is the relevant information in the form of Maxima commands:
Cost: 60*va + 24*vb + 84*vc;
Ca: 5*va + 3*vb + 12*vc >= 20;
Cb: 15*va + 4*vb + 7*vc >= 15;
The command
minimize_lp(Cost, [Ca,Cb,va>=0, vb>=0, vc>=0]);
implements the simplex method and yields the following results (a float
command was invoked also):
[144.44, [vc = 1.2963, vb = 1.4815, va = 0.0]].
The values differ slightly from those above, because Maxima's output
involves less rounding error.
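The equality of the primal and dual objective values can be confirmed directly by solving both problems in exact arithmetic. This is a minimal sketch; the names primal and dual are chosen here for illustration.

```maxima
load(simplex)$

primal : maximize_lp(20*X1 + 15*X2,
    [5*X1 + 15*X2 <= 60, 3*X1 + 4*X2 <= 24, 12*X1 + 7*X2 <= 84,
     X1 >= 0, X2 >= 0]);

dual : minimize_lp(60*va + 24*vb + 84*vc,
    [5*va + 3*vb + 12*vc >= 20, 15*va + 4*vb + 7*vc >= 15,
     va >= 0, vb >= 0, vc >= 0]);

/* The first element of each result is the optimal objective value */
is(first(primal) = first(dual));
```

Without the float wrapper both problems report 1300/9 as the objective value, with vb = 40/27 and vc = 35/27 in the dual, so the comparison on the last line should be true.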
A digression: We can return to the primal problem and look at the shadow
prices in a slightly different way, one that relates to the Lagrangian
multipliers that we encountered earlier. The commands below sequentially
add 1 unit of inputs a, b, and c, holding the other two at their
initial levels:
C1alt: 15*X2 + 5*X1 <= 61$
C2alt: 4*X2 + 3*X1 <= 25$
C3alt: 7*X2 + 12*X1 <= 85$
The results, after invoking a float command, are these:
[144.44, [X2 = 1.3333, X1 = 6.2222]]
[145.93, [X2 = 1.7778, X1 = 5.963]]
[145.74, [X2 = 1.2222, X1 = 6.3704]]
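The three perturbed problems can also be run in a single pass, so that the change in revenue from each extra unit of input is reported directly. The helper names max_rev and bump below are assumptions introduced for this sketch.

```maxima
load(simplex)$
revenue : 20*X1 + 15*X2$

/* Maximum revenue for a given list of input availabilities [a, b, c] */
max_rev(rhs) := first(maximize_lp(revenue,
    [5*X1 + 15*X2 <= rhs[1], 3*X1 + 4*X2 <= rhs[2], 12*X1 + 7*X2 <= rhs[3],
     X1 >= 0, X2 >= 0]))$

base : [60, 24, 84]$

/* A list with 1 in position i and 0 elsewhere */
bump(i) := makelist(if j = i then 1 else 0, j, 1, 3)$

/* Revenue gained from one more unit of each input in turn */
shadow : makelist(max_rev(base + bump(i)) - max_rev(base), i, 1, 3);
float(shadow);
```

The differences are 0, 40/27, and 35/27, the same values that the dual problem assigns to va, vb, and vc. An extra unit of input a adds nothing because that constraint is not binding.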
Exercise: Draw the four constraints in the primal problem for Example 2
and shade the feasible region. Use the graph to estimate the values of X1
and X2 . Which constraints are binding? See the accompanying workbook to
determine the exact values of X1 and X2 and to confirm that two of the Y
values are zero.
We provide a brief overview in two steps. First, we revisit the diet problem,
comparing the simplex solution to that provided by cobyla. Then we set up
and solve a simple nonlinear problem that appears in Bradley et al.
A nonlinear problem cannot be addressed with linear programming, but a
linear problem can be addressed with nonlinear programming. We saw earlier
that the simplex method could be implemented with the following commands:
load(simplex)$
minimize_lp(z, [c1,c2,c3,c4,c5,c6,c7,c8,
x1>=0,x2>=0,x3>=0,x4>=0,x5>=0]);
where z is the objective function, and the brackets contain a list of
constraints. The resulting output is [0.10904, [x5 = 0.048628, x4 = 0.0051128,
x3 = 0.01125, x2 = 0.0085915, x1 = 0.035456]].
The interpretation appears early in this section.
The cobyla counterparts are the following input:
load(fmin_cobyla);
fmin_cobyla(z, [x1,x2,x3,x4,x5], [0.031,0.01,0.01,0.005,0.005],
constraints = [c1,c2,c3,c4,c5,c6,c7,c8,
x1>=0,x2>=0,x3>=0,x4>=0,x5>=0],iprint=1);
This input differs from the simplex input only in minor details, with one
exception. The minor details are that the constraints are explicitly identified
as such and that a code iprint=1 is added in order to control the amount of
detail that is reported. The more substantive difference is that this command
requires a list of initial guesses of the values of x1 , x2 , . . .. In general, the
better guesses one can provide the better the method will work. In particular,
close guesses reduce the number of iterations required to obtain the desired
results.
As with the input, the output differs in detail from the simplex output:
[[x1 = 0.035456, x2 = 0.0085915, x3 = 0.01125,
x4 = 0.0051128, x5 = 0.048628], 0.10904, 42, 0].
The output appears with the x values first, followed by the value of z. The
number 42 is the number of iterations that were required. The 0 is a code that
indicates that the process was completed without error.
We now consider a simple example that is not amenable to linear programming.
We wish to maximize the function $z = 2 \cdot x_1 - x_1^2 + x_2$. Unfortunately
cobyla is limited to minimization, but we can minimize
$-z = -(2 \cdot x_1 - x_1^2 + x_2)$. The constraints are these: $x_2 \le 1.8$, $x_1^2 + x_2^2 \le 4$,
$x_1 \ge 0$, and $x_2 \ge 0$. The command that we enter is load(fmin_cobyla)$
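One plausible way to set up this problem is sketched below. It follows the fmin_cobyla calling pattern shown earlier for the diet problem, but the details are assumptions made here: the name negz, the initial guess [1, 1], and the choice iprint = 0 (which suppresses progress output) are all illustrative.

```maxima
load(fmin_cobyla)$

/* Minimizing -z is equivalent to maximizing z */
negz : -(2*x1 - x1^2 + x2)$

fmin_cobyla(negz, [x1, x2], [1, 1],     /* [1, 1] is an assumed initial guess */
    constraints = [x2 <= 1.8, x1^2 + x2^2 <= 4, x1 >= 0, x2 >= 0],
    iprint = 0);
```

As with the diet problem, the result should list the x values first, then the minimized value of −z, the number of function evaluations, and a return code. The maximum of z is the negative of the reported objective value.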
or $(I - A) \cdot X = d$
In this equation I is an n × n identity matrix, A is the technical coefficient
matrix, X is the n-industry variable matrix, and d is the final demand matrix.
We frequently refer to the matrix (I − A) as a Leontief matrix. Using matrix
inversion, if I − A is nonsingular, then we can find $(I - A)^{-1}$. This means
that the unique solution for the X matrix is $X = (I - A)^{-1} \cdot d$.
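A small numerical case shows the solution formula at work. The two-industry coefficient matrix and final demand vector below are hypothetical values invented for illustration; they are not taken from the tables in this chapter.

```maxima
/* Hypothetical two-industry economy */
A : matrix([1/5, 3/10], [2/5, 1/10])$   /* technical coefficients */
d : matrix([10], [20])$                 /* final demand */

Leontief : ident(2) - A$                /* the matrix I - A */
determinant(Leontief);                  /* nonzero, so the inverse exists */

X : invert(Leontief) . d;

/* Check: gross output less intermediate use equals final demand */
Leontief . X;
```

For these values the determinant is 3/5 and the required gross outputs are 25 and 100/3. The last line returns the final demand vector, confirming the solution.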
An illustrative example. The table on the next page lists the sources
of inputs, and the destinations of outputs, in a hypothetical economy. We
could represent this table's contents as a system of 11 simultaneous linear
equations in 11 unknown values. Movements along any row show the output
of an industry and where that output goes. For example, Row 6 addresses
industry F. Two units of its output go to industry A, six units to industry
B, and so forth. Column 7 reveals that two units of industry F's output
constitute the accumulation of inventories in industry F itself. Column 12
shows that the total production of industry F is 46 units.
A movement down any column in this table lists the inputs that each industry
or sector receives from other industries or sectors. For example, column
5 indicates the inputs that industry E receives from other industries and
sectors. Thus industry E uses five units of industry A’s output, three units
of industry B’s output, five units of industry C’s output, and so forth.
The “processing sector” of an input-output table (rows 1 through 6 and
columns 1 through 6) contains all those industries that produce salable goods
and services, such as cars, furniture, and toothpaste. The processing sector
of most input-output tables is highly developed and may contain as many as
500 industries.
Columns 7 through 11 contain the “final demand” sector. For example,
household purchases of goods and services, in column 11, total 14 units from
industry A, 17 units from industry B, and so forth. Rows 7 through 11 contain
the “payments sector” of the table. This sector shows the contribution of
various owners of factor inputs (for example, households) to the production
of each output. For example, households provide 19 units of their inputs,
predominantly labor, to industry A, as recorded in row 11, column 1.
Compare this sector to the Keynesian aggregate expenditures equation Y =
C+I+G+X−M , where Y is total spending, C is private-sector consumption,
I is private-sector investment, G is government spending, X is exports, and
M is imports.
The table also records Total Gross Outlays (in row 12) and Total Gross Output
(in column 12). The total gross outlay of inputs and the total gross output
of goods and services are not equivalent to Gross Domestic Product, which
deliberately excludes intermediate outputs and inputs and concentrates only
on the value of final goods and services. In contrast, total gross outlay and
total gross output involve repeated double counting. This is not bad, however,
We can use this table to see how one computes the technical coefficients of
production that we discussed earlier. Each technical coefficient of production
should show the number of units of input j required to produce one unit of
output i. The next table consists of technical coefficients of production
derived from the input-output matrix in the preceding table. Consider industry
C: It receives a total of 40 units of inputs, one of which comes from the de-
pletion of its own inventories. Seven of these 40 units come from industry F.
Therefore the technical coefficient of production is 7/39 ≈ 0.18. This tells us
that every unit of output produced by industry C requires 0.18 unit of the
output of Industry F. (Note that we subtract inventory depletion from total
gross outlay before computing the technical coefficient of production.)
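The arithmetic for industry C can be written out as a short Maxima calculation. The variable names are chosen here for illustration; the three figures are those quoted in the paragraph above.

```maxima
/* Figures for industry C quoted above */
total_inputs : 40$          /* total gross outlay of industry C */
inventory_depletion : 1$    /* inputs drawn from its own inventories */
from_F : 7$                 /* units received from industry F */

coef : from_F / (total_inputs - inventory_depletion);
float(coef);
```

The exact coefficient is 7/39, or about 0.18. Repeating the calculation for every cell of the processing sector produces the full technical coefficient matrix.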
These review questions are selected from a final exam in Professor
Ostrosky’s course at Illinois State University, “Introduction to Mathematical
Economics,” which closely parallels the material presented in this book.
APPENDIX A. ADDITIONAL REVIEW QUESTIONS 340
5. Assume that the demand per week for the NoFuzz Cable is 10,000
subscribers when the price is $60 per unit, and 20,000 subscribers when
the price is $40.
11. Solve the following set of equations using inverse matrix algebra:
x1 + 2 · x3 + x4 = 4
x1 − x2 + 2 · x4 = 12
2 · x1 + x2 + x4 = 12
x1 + 2 · x2 + x3 + x4 = 12
12. Suppose that A and B are the only two firms in the market selling the
same product (we say that they are duopolists). The industry’s inverse
demand function for the product is P = 92 − qA − qB , where qA and qB
denote the output produced and sold by A and B, respectively. For A
the cost function is CA = 10 · qA and for B it is $C_B = q_B^2/4$. Suppose
that the firms enter into an agreement on output and price control by
jointly acting as a monopoly.
where PA and PB are the selling prices (in dollars per kilogram) of A
and B, respectively. Determine the selling prices that will maximize
Sweet Tooth’s profit.
BIBLIOGRAPHY
[1] Bassi LJ (1976) The Diet Problem Revisited. American Economist, 20:
35–39.
[2] Bishop RL (1968) The Effects of Specific and Ad Valorem Taxes. Quarterly
Journal of Economics, 82: 198–218.
[5] Cyrenne P (2014) Salary Inequality, Team Success and the Superstar
Effect. Available at:
ftp://ftp.repec.org/opt/ReDIF/RePEc/win/winwop/2014-02.pdf.
[13] Miller DE (2012) Using wxMaxima for Basic Set Operations. Available
at: http://andrejv.github.io/wxmaxima/help.html.
INDEX
function
  and relation, 38
  definition, 37
  dependent & independent variables, 38
  domain, 38
  explicit, 38
  form, 43
  polynomial, 44
functions
  composite, 42
  inverse, 61, 62
  monotonic, 61
  types, 34

Hessian determinants, 302
Hessians
  bordered, 309
homogeneous function, 141
homothetic function, 146

idempotent matrix, 277
identity matrix, 276
  Kronecker’s delta, 277
imperfect competition, 187
implicit function, 129
  partial derivative, 131
Implicit Function Theorem, 130, 195
implicit functions, 39
improper integral, 237
  infinite discontinuity, 241
  infinite integrand, 240
  infinite limit(s) of integration, 237
income determination model, 16
income elasticity of demand, 135
indefinite integral, 207, 209
independent variables, 38
index
  matrix
    use to build tables, 32
inferior goods, 135
inverse
  functions, 62
inflection points and concavity, 161
inflection point
  and derivatives, 164
  necessary and sufficient conditions, 164
inframarginal firm, 150
initial conditions, 208, 211
  see boundary conditions, 208
input-output analysis, 330
  applications, 335
  multipliers, 336
  technical coefficients, 335
integral
  additive property, 212
  definite, 221, 223
  graphical representation, 208
  linearity property, 212
  multiplicative property, 212
  Riemann, 223
integral calculus, 206
integrand, 207
integration, 206
  and summation, 225
  area between two curves, 232, 234
  boundary conditions, 208
  by parts, 217
  by substitution, 213, 226
  constant of integration, 207
  discontinuous function, 230
  general exponential rule, 211
  general logarithmic rule, 210
  important properties, 213
  indefinite integral, 209