Ronald S. Irving - Beyond The Quadratic Formula
“IrvingBook” — 2013/5/22 — 15:39 — page ii — #2
Beyond the
Quadratic Formula
Ron Irving
University of Washington
Learn from the Masters, Frank Swetz, John Fauvel, Otto Bekken, Bengt
Johansson, and Victor Katz
Math Made Visual: Creating Images for Understanding Mathematics, Claudi
Alsina and Roger B. Nelsen
Mathematics Galore!: The First Five Years of the St. Marks Institute of
Mathematics, James Tanton
Methods for Euclidean Geometry, Owen Byer, Felix Lazebnik, and Deirdre
L. Smeltzer
Ordinary Differential Equations: A Brief Eclectic Tour, David A. Sánchez
Oval Track and Other Permutation Puzzles, John O. Kiltinen
Paradoxes and Sophisms in Calculus, Sergiy Klymchuk and Susan Staples
A Primer of Abstract Mathematics, Robert B. Ash
Proofs Without Words, Roger B. Nelsen
Proofs Without Words II, Roger B. Nelsen
Rediscovering Mathematics: You Do the Math, Shai Simonson
She Does Math!, edited by Marla Parker
Solve This: Math Activities for Students and Clubs, James S. Tanton
Student Manual for Mathematics for Business Decisions Part 1: Probabil-
ity and Simulation, David Williamson, Marilou Mendel, Julie Tarr, and
Deborah Yoklic
Student Manual for Mathematics for Business Decisions Part 2: Calculus
and Optimization, David Williamson, Marilou Mendel, Julie Tarr, and
Deborah Yoklic
Teaching Statistics Using Baseball, Jim Albert
Visual Group Theory, Nathan C. Carter
Which Numbers are Real?, Michael Henle
Writing Projects for Mathematics Courses: Crushed Clowns, Cars, and
Coffee to Go, Annalisa Crannell, Gavin LaRose, Thomas Ratliff, and
Elyn Rykken
To my parents,
Florence and Herbert Irving
Preface
Every student learns the formula for the solution of a quadratic, or de-
gree two, polynomial equation in a high school algebra course. It is one
of the few mathematical topics that many adults remember years later, at
least by name. However, the study of cubic, or degree three, and quartic, or
degree four, polynomial equations has largely disappeared from the mathe-
matical curriculum. In the rush to calculus, high school students do not see
it. At the university level, undergraduate mathematics majors often crown
their algebraic studies with Galois theory, which provides the tools needed
to show that there is no formula for the solution of degree five equations
analogous to the quadratic formula for degree two equations. Galois theory
can also be used to show that formulas exist for solutions in degrees
three and four, but these may be skipped over.
What are the formulas? The answer is at the heart of this book. The re-
sults are both elementary and beautiful. Moreover, they are an essential part
of the history of mathematics, representing the high point in mathematical
developments of the sixteenth century.
This book has evolved from notes used in a class for in-service and
prospective secondary mathematics teachers. It is intended to be suitable
well as later extensive comments by the copy editor, criticism of the history
sections by an anonymous MAA reader, and transformation of my primitive
diagrams by Beverly Ruedi. I am indebted as well to Don Albers and Jerry
Bryce for their willingness to consider this project and for their ongoing
support.
Exploring mathematics, even elementary mathematics, is a privilege,
connecting us to fellow humans across millennia and cultures in our search
for fundamental truth. (I hope this book illuminates these connections.) My
greatest debt is to the members of my family, who have allowed and encour-
aged me to enjoy this privilege. My parents, to whom I have dedicated this
book, arranged for me to arrive on a leap day, thereby inspiring my child-
hood interests in mathematics and astronomy. Their gift of Irving Adler’s
The Giant Golden Book of Mathematics [2] on my second birthday sealed
my fate.
Contents
Preface
1 Polynomials
1.1 Definitions
1.2 Multiplication and Degree
1.3 Factorization and Roots
1.4 Bounding the Number of Roots
1.5 Real Numbers and the Intermediate Value Theorem
1.6 Graphs
2 Quadratic Polynomials
2.1 Sums and Products
2.2 Completing the Square
2.3 Changing Variables
2.4 A Discriminant
2.5 History
3 Cubic Polynomials
3.1 Reduced Cubics
3.2 Cardano’s Formula
3.3 Graphs
3.4 A Discriminant
3.5 History
4 Complex Numbers
4.1 Complex Numbers
4.2 Quadratic Polynomials and the Discriminant
4.3 Square and Cube Roots
1 Polynomials
1.1 Definitions
What is a polynomial? We know that
\[ 3x^2 - 4x + 7 \]
is one, as is
\[ 5x^{17} + 12x^{11} - 4x^7 + 13x^4 + x^3 - x + 113. \]
So are
\[ \frac{4}{3}x^{100} + \frac{2}{5}x^{88} - 5 \]
and
\[ \sqrt{2}\,x^{333} - 2x^{200} + \frac{1}{3}x^{111} + x^4 - 2. \]
But
\[ x^4 + \sin x \]
is not a polynomial, and neither is
\[ 10^x. \]
(Why not? For now, we can agree that they certainly don’t look like poly-
nomials. We will return to this question in Exercises 1.12 and 1.14.)
The formal definition of a polynomial is: a polynomial is an expression
of the form
\[ a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \cdots + a_2 x^2 + a_1 x + a_0, \]
Exercise 1.1. What are the degrees of the four polynomials displayed at
the start of the section?
\[ x^3 - 3x + 2. \]
Then
\[ f(2) = 2^3 - 3 \cdot 2 + 2 = 8 - 6 + 2 = 4. \]
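The substitution just carried out can be checked mechanically. The following is a minimal sketch, not part of the original text: the helper name `evaluate` and the choice to list coefficients from highest degree to lowest are assumptions made for illustration. It uses Horner's rule, which evaluates a polynomial with one multiplication and one addition per coefficient.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial at x using Horner's rule.

    coeffs lists the coefficients from highest degree to lowest
    (an assumed convention), so x^3 - 3x + 2 is [1, 0, -3, 2].
    """
    result = 0
    for c in coeffs:
        # Multiply the running value by x, then add the next coefficient.
        result = result * x + c
    return result


f = [1, 0, -3, 2]        # x^3 - 3x + 2
print(evaluate(f, 2))    # 4
```

Note the explicit 0 for the missing degree-two term; leaving it out would shift every coefficient to the wrong power.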
There is nothing special about x. We may write other letters for the
variable (also called the indeterminate) of a polynomial, or any symbol in
place of x. For example,
\[ y^3 - 3y + 2 \]
is a polynomial, as are
\[ t^3 - 3t + 2 \]
and
\[ \square^3 - 3\square + 2. \]
Although written differently, they describe the same function.
Exercise 1.2. Is
p
3 3
437 C 2411 4 C
7
a polynomial? If so, what is its degree? If not, why not?
\[ 3(4x^3 - 3x^2 + 2x + 8) = 12x^3 - 9x^2 + 6x + 24 \]
and
\[ \frac{1}{4}(4x^3 - 3x^2 + 2x + 8) = x^3 - \frac{3}{4}x^2 + \frac{1}{2}x + 2. \]
A non-zero polynomial is monic if the coefficient of its highest-degree
term is 1. As we did in the example, we can multiply any non-zero polyno-
mial by a suitable real number to obtain a monic polynomial: Given
\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]
\[ f(x) = 0, \]
\[ x^5 - 3x^3 - 6x + 4 = 0 \]
\[ x^2 + 1 = 0 \]
doesn’t. Substituting any real number for x, we obtain as its square a posi-
tive real number or 0. Adding 1 yields a positive real number, which there-
fore can’t be 0.
In saying a polynomial equation has no solution, we are implicitly assuming
that the domain of allowable solutions is the set of real numbers. In
Chapter 4, we will introduce complex numbers, and we will find that equations
such as $x^2 + 1 = 0$ have solutions in this expanded domain. It would
be more accurate to say that $x^2 + 1 = 0$ has no real number solution rather
than that it has no solution.
For any non-zero polynomial $f(x)$ and non-zero real number $c$, the
polynomial equations
\[ f(x) = 0 \]
and
\[ c f(x) = 0 \]
have the same solutions. If $a$ is a real number satisfying $f(a) = 0$, then
$cf(a) = 0$ also, and similarly, if $cf(a) = 0$, then since $c$ is non-zero, $f(a)$
must be 0. Therefore, whenever we want to solve a polynomial equation
$f(x) = 0$ (with $f(x)$ non-zero), we can replace $f(x)$ by its associated
monic polynomial and solve the resulting equation instead. We will make
use of this elementary observation throughout the book.
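As an illustration of this observation, here is a small sketch, assuming coefficients are listed from highest degree to lowest. The function names `make_monic` and `evaluate` are hypothetical, chosen for this example. It divides a polynomial by its leading coefficient and confirms on one example that the roots are unchanged.

```python
from fractions import Fraction


def make_monic(coeffs):
    """Divide every coefficient by the leading one (highest degree first)."""
    lead = Fraction(coeffs[0])
    if lead == 0:
        raise ValueError("leading coefficient must be non-zero")
    # Fractions keep the arithmetic exact, so no rounding is introduced.
    return [Fraction(c) / lead for c in coeffs]


def evaluate(coeffs, x):
    """Horner evaluation, coefficients highest degree first."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result


f = [2, -6, 4]          # 2x^2 - 6x + 4, which has roots 1 and 2
g = make_monic(f)       # x^2 - 3x + 2
for root in (1, 2):
    print(root, evaluate(f, root), evaluate(g, root))
```

Both polynomials evaluate to 0 at 1 and at 2, as the argument above predicts for any non-zero multiplier.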
\[ r(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]
and
\[ s(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_1 x + b_0. \]
\[ (a_n + b_n) x^n + (a_{n-1} + b_{n-1}) x^{n-1} + \cdots + (a_1 + b_1) x + (a_0 + b_0). \]
Exercise 1.3. Let us be more explicit about the coefficients of the product
of two polynomials. Suppose $r(x)$ is a non-zero polynomial of degree $m$,
with
\[ r(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0. \]
\[ s(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_1 x + b_0. \]
\[ a_0 b_1 + a_1 b_0. \]
\[ a_0 b_2 + a_1 b_1 + a_2 b_0. \]
(v) Show, for any non-negative integer $k$, that the coefficient of $x^k$ in the
product $r(x)s(x)$ is
\[ a_k b_0 + a_{k-1} b_1 + a_{k-2} b_2 + \cdots + a_2 b_{k-2} + a_1 b_{k-1} + a_0 b_k. \]
\[ x^{12} + 3x^7 + 4x^2 - 2 \]
and
\[ 4x^5 - 3x^4 + 12x. \]
Write other polynomials and multiply them.
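The coefficient rule of Exercise 1.3 translates directly into a short program. The sketch below is an illustration only: the name `multiply` and the convention of storing coefficients from the constant term upward (so that list index equals degree) are assumptions made here. Each product $a_i b_j$ is added to the coefficient of $x^{i+j}$.

```python
def multiply(r, s):
    """Multiply two polynomials given as coefficient lists [a0, a1, a2, ...].

    Lowest degree first (an assumed convention), so index = degree.
    The coefficient of x^k in the product is the sum of a_i * b_j
    over all pairs with i + j = k.
    """
    product = [0] * (len(r) + len(s) - 1)
    for i, a in enumerate(r):
        for j, b in enumerate(s):
            product[i + j] += a * b
    return product


# (1 + x)(1 - x + x^2) = 1 + x^3
print(multiply([1, 1], [1, -1, 1]))   # [1, 0, 0, 1]
```

The length of the result, `len(r) + len(s) - 1`, reflects the fact that the degrees of two non-zero polynomials add when they are multiplied.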
Given polynomials $f(x)$ and $r(x)$, the polynomial $r(x)$ divides the
polynomial $f(x)$ if there is another polynomial $s(x)$ such that
\[ f(x) = r(x)s(x). \]
Exercise 1.5. Let $a(x)$ and $b(x)$ be non-zero polynomials. Let $q(x)$ be the
quotient obtained by long division when one divides $a(x)$ into $b(x)$ and let
$r(x)$ be the remainder.
(i) By the definition of quotient and remainder,
(ii) Using the definition of divisibility, show that if $r(x) = 0$, then $a(x)$
divides $b(x)$.
(iii) Suppose instead that $a(x)$ divides $b(x)$. Show that $r(x)$ must be 0.
(This requires a little care. Use the definition of divisibility to produce
a polynomial $p(x)$ for which $b(x) = a(x)p(x)$. Compare with the
equation in the first part of the exercise and use Theorem 1.1 to deduce
that $r(x) = 0$.)
(iv) Conclude that to test if $a(x)$ divides $b(x)$, carry out long division and
see if the remainder is 0.
(v) Decide if $x - 2$ divides $x^3 + 6x - 20$.
(vi) Decide if $x^2 + 2x + 1$ divides $x^5 + 4x^2 + 2$.
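The divisibility test described in part (iv) can be sketched in code. This is a minimal illustration under stated assumptions: coefficients are listed from highest degree to lowest, the divisor has a non-zero leading coefficient, and the names `divide` and `divides` are hypothetical. Exact fractions are used so that a zero remainder is detected reliably.

```python
from fractions import Fraction


def divide(b, a):
    """Long division of b(x) by a(x), coefficients highest degree first.

    Returns (quotient, remainder) with b = a * quotient + remainder.
    Assumes a[0] is non-zero.
    """
    a = [Fraction(c) for c in a]
    rem = [Fraction(c) for c in b]
    quotient = []
    while len(rem) >= len(a):
        factor = rem[0] / a[0]          # next quotient coefficient
        quotient.append(factor)
        for i, c in enumerate(a):       # subtract factor * a(x), aligned at the top
            rem[i] -= factor * c
        rem.pop(0)                      # the leading term has been cancelled
    return quotient, rem


def divides(a, b):
    """True when a(x) divides b(x), that is, when the remainder is 0."""
    _, rem = divide(b, a)
    return all(c == 0 for c in rem)


f = [1, 0, -3, 2]                  # x^3 - 3x + 2
print(divide(f, [1, -1]))          # quotient x^2 + x - 2, remainder 0
print(divides([1, -1], f))         # True
print(divides([1, -2], f))         # False: the remainder is f(2) = 4
```

The example reuses the polynomial $x^3 - 3x + 2$ from Section 1.1 rather than the polynomials of parts (v) and (vi), which are left to the reader.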
Let us collect some facts about multiplication and division that are ana-
logues of familiar facts about number multiplication and division.
The third part of Theorem 1.2 says that when we have an equality of
polynomial products, we can cancel a non-zero factor.
\[ f(x) = (x - a)s(x). \]
\[ x^2 - 13x + 36 = (x - 4)(x - 9) \]
\[ (x - r)(x - s) \]
for real numbers $r$ and $s$. If we allow integers only, there is no such
factorization, but with real numbers, we find the factorization
\[ (x - \sqrt{2})(x + \sqrt{2}). \]
Theorem 1.3 then tells us (as is already evident) that $\sqrt{2}$ and $-\sqrt{2}$ are roots
of $x^2 - 2$.
Theorem 1.4 allows us to limit the possibilities for roots. If we can find
all the factors of $f(x)$, we will be able to find its roots by examining the
degree-one factors. The proof of Theorem 1.4 is given in the next exercise.
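The exercise that follows expresses $f(x)$ as $f(a)$ plus a multiple of $x - a$. As a numerical companion, here is a sketch of synthetic division, a standard shortcut for dividing by $x - a$ that is not the method of the exercise itself. The name `synthetic_division` and the highest-degree-first coefficient order are assumptions for illustration.

```python
def synthetic_division(coeffs, a):
    """Divide f(x) by (x - a); coefficients highest degree first.

    Returns (h, remainder) where f(x) = (x - a) * h(x) + remainder.
    The remainder equals f(a), so a is a root exactly when it is 0.
    """
    values = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        values.append(carry)
    return values[:-1], values[-1]


f = [1, -13, 36]                     # x^2 - 13x + 36
print(synthetic_division(f, 4))      # ([1, -9], 0): f(x) = (x - 4)(x - 9)
print(synthetic_division(f, 5))      # ([1, -8], -4): f(5) = -4, so 5 is not a root
```

The running values are the same ones produced by Horner's rule, which is why the final value is $f(a)$.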
\[ g(y) = a_n (y + a)^n + \cdots + a_1 (y + a) + a_0. \]
(ii) Use Theorem 1.1 to show for each positive integer $i$ that $(y + a)^i$ has
degree $i$.
(iii) Deduce that $g(y)$ has degree $n$ and that therefore real numbers
$b_n, b_{n-1}, \ldots, b_1, b_0$ exist for which
\[ g(y) = b_n y^n + b_{n-1} y^{n-1} + \cdots + b_1 y + b_0. \]
(v) Let’s determine $b_0$. Substitute $a$ for $x$ on both sides and show that
\[ f(a) = b_0. \]
\[ f(x) = f(a) + (x - a)\left[ b_n (x - a)^{n-1} + \cdots + b_2 (x - a) + b_1 \right]. \]
That is, given $f(x)$ and $a$, there is a polynomial $h(x)$ such that
\[ (x - a_1)(x - a_2) \cdots (x - a_k) \]
The possibility that both hold is not excluded. For example, the statement
“6 is even or 7 is even” is correct, since 6 is even, and the statement “6 is
even or 8 is even” is correct also. If the statement “P or Q” is true but Q is
not true, then P must be true.
We prepare for the proof of Theorem 1.5 with some polynomial divisi-
bility facts.
Exercise 1.9. Suppose $r(x)$ and $s(x)$ are polynomials and $a$ is a real number.
If $r(a)s(a) = 0$, then $r(a) = 0$ or $s(a) = 0$. (Why?) Use this and
Theorem 1.4 to deduce that if $x - a$ divides $r(x)s(x)$, then $x - a$ divides at
least one of $r(x)$ and $s(x)$.
(ii) Suppose $k > 1$. Since $a_2$ is a root of $f(x)$, use Theorem 1.4 and
Exercise 1.9 to deduce that $x - a_2$ divides $x - a_1$ or $g_1(x)$.
(iii) By assumption, $a_1$ and $a_2$ are distinct. Use this and Exercise 1.8 to
deduce that $x - a_2$ cannot divide $x - a_1$, and therefore $x - a_2$
divides $g_1(x)$.
(iv) Conclude that there is a polynomial $g_2(x)$ satisfying
$g_1(x) = (x - a_2) g_2(x)$, and therefore that
Exercise 1.11. Prove Theorem 1.6 using Theorem 1.5 and a comparison
of degrees.
We noted in Section 1.1 that sin x is not a polynomial. This now follows
as a consequence of Theorem 1.6.
Exercise 1.12. Deduce from Theorem 1.6 that $\sin x$ cannot be equal to
a polynomial. (Hint: How many solutions are there to $\sin x = 0$?) Show
similarly that $\cos x$ and $\tan x$ are not polynomials.
Exercise 1.13. Prove that 2 does not have a rational square root.
(i) Suppose $a$ and $b$ are positive integers such that $a/b$ is a square root
of 2. Assume further that $a/b$ is a reduced fraction; that is, no positive
integer divides $a$ and $b$ other than 1. Square and clear denominators to
obtain $a^2 = 2b^2$.
(ii) Show that 2 must divide a.
(iii) Show that 2 must divide b.
(iv) This is a contradiction, proving that a and b can’t exist.
numbers to a connected set. Combining this with the result that a connected
set of real numbers containing r and s contains all the numbers in between,
we can easily deduce the intermediate value theorem [38, p. 170].
Theorem 1.8 (intermediate value theorem). Let $a < b$ be real numbers and
let $f$ be a continuous function defined on the closed interval $[a, b]$. Given
a real number $d$ between $f(a)$ and $f(b)$, there is a real number $c$ in $[a, b]$
such that $f(c) = d$.
Theorem 1.10. Given a positive real number $d$, there exists a real number
$c$ such that $c^2 = d$, so every positive real number has a square root.
The same argument works when we replace the exponent 2 by any pos-
itive integer n, showing that every positive real number has a positive nth
root.
Here is another application of the intermediate value theorem, stated for
polynomials although it holds for any continuous function:
Theorem 1.11. Let $a < b$ be real numbers and let $f(x)$ be a polynomial
for which $f(a) < 0$ and $f(b) > 0$. Then there is a real number $c$ between
$a$ and $b$ such that $f(c) = 0$.
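Theorem 1.11 guarantees that a root exists but does not say how to find it. One common numerical approach, offered here only as an illustrative sketch and not as part of the text's argument, is bisection: repeatedly halve the interval, keeping the half on which the sign change persists. The function name `bisect_root` and the tolerance `tol` are assumptions of this example.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Approximate a root of f in [a, b], assuming f is continuous
    with f(a) < 0 and f(b) > 0 (the hypotheses of Theorem 1.11)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 and f(b) > 0")
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < 0:
            a = mid      # sign change lies in the right half
        else:
            b = mid      # sign change lies in the left half (or mid is a root)
    return (a + b) / 2


# A square root of 2 as a root of x^2 - 2 on [1, 2].
root = bisect_root(lambda x: x * x - 2, 1, 2)
print(root)
```

The result is a floating-point approximation, which is consistent with the remark that computing roots is a matter of the real numbers rather than of algebra.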
1.6 Graphs
In Section 1.5, we introduced some results on real numbers. Here, we will
continue our excursion into realms beyond algebra, this time turning to an-
alytic geometry to study the shape of the graph of a polynomial function.
We will give merely an overview. Calculus gives the results of this section
as easy exercises. Some of the results can be obtained by more elementary
considerations, albeit at the cost of a little more work, but either way, foun-
dational theorems on the construction and properties of the real numbers
are essential for the proofs.
We will use the results of this section in Sections 2.4, 3.3, and 3.4, where
the discriminants of quadratic and cubic polynomials are studied. In Sec-
tions 4.2 and 5.3, the results will be proved again algebraically, so that their
derivation does not depend on the results of this section. Nonetheless, the
approach to discriminants using graphs in Sections 2.4 and 3.4 provides
additional insight and intuition.
The polynomial functions whose graphs are the simplest are the powers
of $x$. We know that the graph of $y = x$ is the line forming a $45^\circ$ angle with
the $x$-axis. Suppose $n$ is an integer greater than 1. For positive real numbers
$a < b$ and $r < s$, the inequality $ar < bs$ holds. Setting $r = a$ and $s = b$,
we find that $a^2 < b^2$. For a positive integer $n$, we can repeat the argument
$n$ times to obtain $a^n < b^n$. This tells us that the graph of $y = x^n$ is always
increasing in height as $x$ increases, for $x \ge 0$.
Figure 1.1. Graph of $y = x^n$
One feature of the graph in Figure 1.1 that requires calculus to justify
is the way the shape has been drawn. It’s what is called concave up. Those
familiar with calculus will recognize that this follows from the fact that for
$x > 0$, the second derivative $n(n - 1)x^{n-2}$ of $x^n$ is positive.
Figure 1.2. Graphs of $y = x^4$ and $y = 3x^3$
obtained from the graph to the right by taking the mirror image across the
$y$-axis and then reflecting again across the $x$-axis. We see from this that the
graph rises steadily as $x$ increases, from arbitrarily low heights to $(0, 0)$ and
then onwards to arbitrarily high heights. This is illustrated in Figure 1.2.
The graph of a general polynomial will be more complicated, but some
features remain unchanged from that of a simple power of $x$. For example,
consider the graph of the polynomial $x^7 - 6x^5 + 11x^3 - 6x$ depicted in
Figure 1.3. Like the graph of an odd power of $x$, it rises to infinity to the
right and drops to minus infinity to the left. The complicating feature is that
it rises and falls in between. Counting, we find that it makes six turns on its
way from minus infinity to infinity. (Those familiar with calculus will recognize
that a seventh-degree polynomial can have no more turns, though it may
have fewer. For instance, $x^7$ has no turns.)
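The count of six turns can be reproduced numerically. The sketch below is a rough illustration, assuming a sampling grid fine enough that no two turns fall within one step; the name `count_turning_points`, the interval $[-2, 2]$, and the step count are choices made here, not taken from the text. It samples the polynomial and counts how often the graph switches between rising and falling.

```python
def count_turning_points(f, lo, hi, steps=4000):
    """Count switches between rising and falling on [lo, hi] by sampling.

    This is a numerical heuristic: it can miss turns that are closer
    together than the grid spacing.
    """
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [f(x) for x in xs]
    # +1 where the graph rises between samples, -1 where it falls.
    signs = []
    for y0, y1 in zip(ys, ys[1:]):
        if y1 != y0:
            signs.append(1 if y1 > y0 else -1)
    # Each change of sign marks one turn.
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)


def f(x):
    return x**7 - 6 * x**5 + 11 * x**3 - 6 * x


print(count_turning_points(f, -2, 2))   # 6
```

The interval suffices because this polynomial factors as $x(x^2 - 1)(x^2 - 2)(x^2 - 3)$, so all of its roots, and hence all of its turns, lie between $-\sqrt{3}$ and $\sqrt{3}$.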
Figure 1.3. Graph of $y = x^7 - 6x^5 + 11x^3 - 6x$
Let’s discuss the general picture. Suppose that $f(x)$ is a monic polynomial
of positive degree $n$,
\[ x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0. \]
Theorem 1.13 tells us that the graph of a degree-$n$ polynomial $f(x)$ behaves
like the graph of $x^n$ in its extremes. Away from them, the graph may
behave differently from the graph of $x^n$. It may shift from rising to falling to
rising multiple times, as illustrated by the seventh-degree polynomial graph
of Figure 1.3.
A graph changes from falling to rising or rising to falling at what are
called turning points. Let us define this precisely. A local maximum of a
function $f(x)$ is a point $(a, f(a))$ on its graph with the property that there
exists a positive number $r$ such that every $c$ in the open interval $(a - r, a + r)$
satisfies $f(c) \le f(a)$. Similarly, $(b, f(b))$ is a local minimum if there is a
positive number $s$ such that every $c$ in the open interval $(b - s, b + s)$
satisfies $f(b) \le f(c)$. A turning point for $f(x)$ is a point that is either a
local maximum or a local minimum.
An example of a turning point is the point $(0, 1)$ on the graph of
$f(x) = x^2 + 1$, for which it is a local minimum. In fact, since $f(x) > 1$ for any
$x \ne 0$, the point $(0, 1)$ is more than a local minimum. It’s what is called a
global minimum.
Readers familiar with calculus will know, given a differentiable function
$y = f(x)$, that for $(a, f(a))$ to be a turning point, $f'(a)$ must equal 0.
Thus, the turning points, if there are any, will be among the points where the
derivative vanishes. The converse does not hold: $f'(a)$ may equal 0 without
$(a, f(a))$ being a turning point. An example is $x^3$, whose derivative at 0 is
0, but $(0, 0)$ is not a turning point, since the function $f(x) = x^3$ is always
increasing. (The graph is rising when $x \ne 0$ but flat at $x = 0$.)
If $f(x)$ is a polynomial of positive degree $n$, we learn in calculus how to
compute its derivative $f'(x)$, and find that $f'(x)$ is itself a polynomial, of
degree $n - 1$. By Theorem 1.6, $f'(x)$ can have at most $n - 1$ distinct roots;
that is, the derivative of $f(x)$ vanishes at at most $n - 1$ points. Since these
are the only candidates for turning points, we obtain the next theorem.
Theorem 1.15. A polynomial of positive degree $n$ has at most $n - 1$ turning
points.
We can use Theorems 1.13 and 1.15 to see again that the sine function
cannot be a polynomial, and to see that the exponential function $10^x$ isn’t a
polynomial.
Exercise 1.14. This exercise assumes familiarity with $\sin x$ and $10^x$ as
functions defined for all real values of $x$.
(i) The graph of $y = \sin x$ has the shape of an infinite wave, oscillating
between peaks of height 1 and valleys of height $-1$. Describe the turning
points of $\sin x$ and deduce that it has infinitely many local minima
and infinitely many local maxima.
(ii) Deduce from Theorem 1.15 that $\sin x$ cannot be a polynomial.
(iii) Use Theorem 1.13 to give an alternative proof that $\sin x$ cannot be a
polynomial.
(iv) For $x < 0$, we have $0 < 10^x < 1$. Deduce from Theorem 1.13 that the
function $y = 10^x$ cannot be a polynomial.
2 Quadratic Polynomials
The heart of this book is the study of solutions to cubic and quartic equa-
tions, which we will begin in Chapter 3. This chapter is devoted to quadratic
equations. Even though they are familiar from a first algebra course, a close
look is warranted, as a warmup before we tackle the greater difficulties of
cubic and quartic equations and to introduce themes that will recur as we
study cubics and quartics.
The general quadratic equation has the form
\[ ax^2 + bx + c = 0, \]
for real numbers $a$, $b$, and $c$, with $a \ne 0$. The quadratic formula for the
solutions of this equation takes the form
\[ x = -\frac{b}{2a} \pm \frac{\sqrt{b^2 - 4ac}}{2a}. \]
Since $a \ne 0$, we are free to divide both sides of the equation
$ax^2 + bx + c = 0$ by $a$ before searching for solutions. This amounts to setting
$a = 1$ and studying quadratic equations of the form
\[ x^2 + bx + c = 0. \]
We will take this approach throughout this chapter. The quadratic formula
then takes the simpler form
\[ x = -\frac{b}{2} \pm \frac{\sqrt{b^2 - 4c}}{2}. \]
We will obtain the quadratic formula (in this simpler form) by three
different approaches, and turn to some history at the end.
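The reduction to the monic case and the simpler formula can be sketched together in code. This is an illustration only, with hypothetical function names `solve_monic_quadratic` and `solve_quadratic`; it works in floating point and reports real solutions only, matching the chapter's restriction to real numbers.

```python
import math


def solve_monic_quadratic(b, c):
    """Real solutions of x^2 + b x + c = 0, using
    x = -b/2 ± sqrt(b^2 - 4c)/2. Returns a sorted list."""
    disc = b * b - 4 * c
    if disc < 0:
        return []                        # no real solutions
    half_root = math.sqrt(disc) / 2
    # A set collapses the two values into one when disc is 0.
    return sorted({-b / 2 - half_root, -b / 2 + half_root})


def solve_quadratic(a, b, c):
    """Divide by a (which must be non-zero) to reduce to the monic case."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return solve_monic_quadratic(b / a, c / a)


print(solve_quadratic(2, 20, -78))   # same solutions as x^2 + 10x - 39 = 0
print(solve_quadratic(1, -2, 1))     # a repeated root
print(solve_quadratic(1, 0, 1))      # no real solutions
```

Dividing by `a` first keeps a single formula in play, which mirrors the approach taken in the rest of the chapter.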
Exercise 2.2. Solve the general form of the age problem. Suppose $r$ and $s$
are unknown numbers, with $u = r + s$ and $v = r - s$. Determine $r$ and $s$
in terms of $u$ and $v$.
We see from the solution to Exercise 2.2 that we can determine two
numbers from their sum and difference. It is also possible to determine two
numbers from their sum and product. Doing so requires the calculation of
a square root, which leads to an ambiguity in sign. But it turns out to be
harmless, since the two solutions it yields are the two we seek.
Theorem 2.1. Given two real numbers $r$ and $s$, let $u$ be their sum and $p$
their product. Then $r$ and $s$ are the two quantities
\[ \frac{u}{2} \pm \frac{\sqrt{u^2 - 4p}}{2}. \]
Exercise 2.4. Let $b$ and $c$ be real numbers and let $f(x)$ be the polynomial
$x^2 + bx + c$.
(i) Suppose $f(x)$ factors as $(x - r)(x - s)$ for distinct real numbers $r$ and
$s$. Show that $r$ and $s$ are the only roots of $f(x)$. (Hint: Suppose $t$ is a
root. Substitute it for $x$.)
(ii) Conversely, show that if distinct real numbers $r$ and $s$ are roots of $f(x)$,
then $f(x) = (x - r)(x - s)$. (Hint: Use Theorem 1.5.)
(iii) Suppose $f(x)$ factors as $(x - r)^2$. Show that $r$ is the only root of $f(x)$.
(iv) Conclude by proving Theorem 2.2.
\[ r + s = -b \]
and
\[ rs = c. \]
(ii) Use Theorem 2.1 to obtain expressions for $r$ and $s$ in terms of $b$ and $c$.
This proves Theorem 2.3.
Theorem 2.3 (Quadratic Formula). Let $b$ and $c$ be real numbers and suppose
$x^2 + bx + c$ has real roots $r$ and $s$ (distinct or coincident). Then $r$ and
$s$ are the two quantities
\[ -\frac{b}{2} \pm \frac{\sqrt{b^2 - 4c}}{2}. \]
We see from Theorem 2.1 and Exercise 2.5 that the quadratic formula
is the expression for the roots of $x^2 + bx + c$ in terms of their sum $-b$ and
their product $c$.
Let us consider how the quadratic formula of Theorem 2.3 provides
a solution for quadratic equations. In the simplest case, with $b = 0$, the
equation has the form
\[ x^2 + c = 0 \]
and the quadratic formula tells us that the solution is $x = \pm\sqrt{-c}$. If $c > 0$,
then $-c$ has no square roots and there is no solution. If $c < 0$, then $-c$ is
positive and has two square roots. The quadratic formula does not tell us
how to compute them. It tells what we already know, that the solutions to
the equation are the square roots. Thus, rather than regarding the formula
as a way to solve an arbitrary quadratic equation, we should view it as a
reduction technique. It gets us to the point of having to do a square root
calculation, then leaves us on our own. As we saw in Section 1.5, calculating
square roots is not a problem of algebra.
\[ x^2 + 10x - 39 = 0. \]
\[ x^2 + 10x + 5^2 = 39 + 5^2. \]
\[ x^2 + bx + \frac{b^2}{4} \]
is the square of a degree one polynomial.
(iii) Rewrite
\[ x^2 + bx + c \]
by adding and subtracting $b^2/4$ and find that solving the equation
$x^2 + bx + c = 0$ is equivalent to solving
\[ (x + b/2)^2 = d/4 \]
\[ x^2 + bx + c = 0 \]
has no solutions.
(ii) If $b^2 - 4c = 0$, then the only solution of $x^2 + bx + c = 0$ is $x = -b/2$.
(iii) If $b^2 - 4c > 0$, then the equation $x^2 + bx + c = 0$ has two solutions,
given by
\[ x = -\frac{b}{2} \pm \frac{\sqrt{b^2 - 4c}}{2}. \]
The quadratic formula takes on a simpler form if we alter how we write
the coefficients in our initial quadratic equation. Given the real numbers $b$
and $c$, let $B = b/2$ and $C = -c$. Then $b = 2B$, $c = -C$, and
\[ x^2 + bx + c = 0 \]
can be written as
\[ x^2 + 2Bx - C = 0, \]
or
\[ x^2 + 2Bx = C. \]
For the remainder of the section, we take this as our standard form for
a quadratic equation. With this notation, we can appreciate the process of
completing the square geometrically.
Algebraically, to make the left side of $x^2 + 2Bx = C$ a perfect square,
we add $B^2$ to both sides, yielding
\[ x^2 + 2Bx + B^2 = B^2 + C \]
or
\[ (x + B)^2 = B^2 + C. \]
Taking square roots, we recover the quadratic formula, now in the form
\[ x + B = \pm\sqrt{B^2 + C}, \]
or
\[ x = -B \pm \sqrt{B^2 + C}. \]
The change in our coefficients results in a quadratic formula without the
cluttering appearances of 2 and 4.
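The steps of completing the square in this standard form can be followed one at a time in a short sketch. The function name `solve_by_completing_square` is hypothetical, and floating-point arithmetic is assumed; only real solutions are returned.

```python
import math


def solve_by_completing_square(B, C):
    """Solve x^2 + 2Bx = C for real x.

    Step 1: add B^2 to both sides, giving (x + B)^2 = B^2 + C.
    Step 2: take square roots, giving x + B = ±sqrt(B^2 + C).
    Step 3: subtract B.
    """
    right_side = B * B + C
    if right_side < 0:
        return []                      # a square cannot be negative
    root = math.sqrt(right_side)
    return sorted({-B - root, -B + root})


# x^2 + 10x = 39, so B = 5 and C = 39.
print(solve_by_completing_square(5, 39))   # [-13.0, 3.0]
```

With $B = 5$ and $C = 39$ the right side becomes $25 + 39 = 64$, whose square roots are $\pm 8$, giving $x = -5 \pm 8$.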
\[ x^2 + 10x = 39 \]
Figure 2.1. Completing the Square
Since the square has area $x^2$ and each rectangle has area $5x$, the figure
drawn has area $x^2 + 10x$. We are given that
\[ x^2 + 10x = 39, \]
The larger square has side length given by $\sqrt{64}$, so the original square has
side length $x$ given by $-5 + \sqrt{64}$, or 3. We have solved the quadratic
equation geometrically, and we have seen that the algebraic process of
completing the square has a geometric counterpart.
The general case is handled similarly. Given $x^2 + 2Bx = C$, with $B$
and $C$ arbitrary positive numbers, we draw the square whose sides have
an unknown length $x$, then place rectangles atop it and to the right with
sides of lengths $B$ and $x$. The resulting figure has area $x^2 + 2Bx$, which
we recognize as equal to $C$, thanks to the given equation. We complete
the square by placing a square of side length $B$ in the upper right corner,
yielding a larger square with side length $x + B$. Its area is both $(x + B)^2$
and $B^2 + C$, yielding
\[ (x + B)^2 = B^2 + C. \]
\[ x^2 + 10x - 39 = 0 \]
by changing variables.
(i) Let $y$ be related to $x$ by $x = y - 5$. Substitute $y - 5$ for $x$ in the equation
and obtain an equation in $y$.
(ii) The new quadratic equation has the form $y^2 - d = 0$ for a constant $d$.
We have a quadratic equation without a degree-one term.
(iii) Take square roots to obtain two values for $y$, then use $x = y - 5$ to
obtain two solutions to the original quadratic equation.
We now take up the general case, with the goal of finding a change of
variable that, as in Exercise 2.8, eliminates the degree-one term.
We will use this technique in our treatment of cubic and quartic equa-
tions. It will allow us to drop the term of second-highest degree from the
polynomial, as here we were able to drop the degree-one term.
2.4 A Discriminant
A quadratic polynomial $x^2 + bx + c$ may have two distinct real roots, one
repeated real root, or no real roots, as we saw in Theorem 2.2. It is possible
to determine which occurs from $b$ and $c$, without finding the roots.
This is evident from the quadratic formula, since the roots, if they exist,
are
\[ -\frac{b}{2} \pm \frac{\sqrt{b^2 - 4c}}{2}. \]
We see that if $b^2 - 4c > 0$, there are two distinct real roots, if $b^2 - 4c = 0$,
there is one multiple root, and if $b^2 - 4c < 0$, there are no real roots. We can
obtain this independently of the roots by studying the shape of the graph of
$y = x^2 + bx + c$.
To do so, we first need to review the shape of the graph of $y = x^2$. This
was discussed in Section 1.6, where we saw that there is a turning point
at $(0, 0)$, with the graph falling to $(0, 0)$ as $x$ increases through negative
values to 0 and rising as $x$ increases through positive values, the shape being
concave up throughout. (See Figure 2.2.)
The graph has an alternative description from the theory of conic sec-
tions. We won’t be using it, but the issues are worth reviewing. The distance
between two points $(a, b)$ and $(c, d)$ is the square root of $(c - a)^2 + (d - b)^2$.
(This is essentially the Pythagorean Theorem.) The distance from a point $P$
to a line $\ell$ is defined to be the distance between $P$ and the closest point $Q$
on $\ell$ to $P$, with $Q$ being the intersection of $\ell$ and the line perpendicular to
$\ell$ that passes through $P$.
Exercise 2.10. Verify that the points in the plane equidistant from $(0, 1/4)$
and the line $y = -1/4$ are precisely those satisfying the equation $y = x^2$.
(Hint: Set the squared distances equal to each other rather than the
distances.)
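A numerical spot check can build confidence before attempting the algebra, though it is no substitute for the proof the exercise asks for. The sketch below follows the hint and compares squared distances; the names `squared_distances` and `is_equidistant`, the sample points, and the tolerances are assumptions of this illustration.

```python
import math

FOCUS = (0.0, 0.25)        # the point (0, 1/4)
DIRECTRIX_Y = -0.25        # the horizontal line y = -1/4


def squared_distances(x, y):
    """Squared distance from (x, y) to the focus and to the directrix."""
    to_focus = (x - FOCUS[0]) ** 2 + (y - FOCUS[1]) ** 2
    # The closest point on a horizontal line lies directly above or below.
    to_directrix = (y - DIRECTRIX_Y) ** 2
    return to_focus, to_directrix


def is_equidistant(x, y):
    to_focus, to_directrix = squared_distances(x, y)
    return math.isclose(to_focus, to_directrix, rel_tol=1e-9, abs_tol=1e-12)


for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    print(x, is_equidistant(x, x * x))     # points on y = x^2
print(is_equidistant(1.0, 2.0))            # a point off the curve
```

Expanding the two squared distances by hand shows why the check succeeds: the terms $y^2$ and $1/16$ cancel, leaving $x^2 = y$.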
Figure. Graph of $y = x^2$, with focus $(0, 1/4)$, vertex $(0, 0)$, directrix $y = -1/4$, and axis of symmetry $x = 0$; the points $(x, y)$ and $(x, -1/4)$ are marked.
Exercise 2.11. Suppose $y = f(x)$ is a function defined for all real number
values of $x$.
(i) Given a real number $C$, verify that the graph of $y = f(x - C)$ is
obtained from the graph of $y = f(x)$ by shifting every point to the
right a distance of $C$.
(iii) Combine (i) and (ii) to obtain the result that the graph of
$y = f(x - C) + D$ is obtained from the graph of $y = f(x)$ by shifting every
point to the right a distance of $C$ and upward a distance of $D$.
Figure. Graphs of $y = x^2 + 4$, $y = x^2$, and $y = x^2 - 4$
Exercise 2.13. Let $b$ and $c$ be real numbers. Using Exercise 2.12, conclude
that the graph of $y = x^2 + bx + c$ has a local (and global) minimum at
$(-b/2,\ c - b^2/4)$.
The graph of $x^2 + bx + c$ is the parabola with focus at $(-b/2,\ c - b^2/4 + 1/4)$,
directrix $y = c - b^2/4 - 1/4$, axis of symmetry $x = -b/2$, and
vertex at $(-b/2,\ c - b^2/4)$.
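The description of the parabola above can be checked on examples. The following is a minimal Python sketch, not part of the original text: the helper names `parabola_features` and `is_equidistant` are hypothetical, and exact rational arithmetic is assumed so that squared distances can be compared without rounding, in the spirit of the hint to Exercise 2.10.

```python
from fractions import Fraction

def parabola_features(b, c):
    """Vertex, focus, and directrix height of y = x^2 + b*x + c (illustrative helper)."""
    b, c = Fraction(b), Fraction(c)
    h = -b / 2                 # axis of symmetry x = -b/2
    k = c - b * b / 4          # minimum value c - b^2/4
    quarter = Fraction(1, 4)
    return {"vertex": (h, k), "focus": (h, k + quarter), "directrix": k - quarter}

def is_equidistant(x, b, c):
    """Check that the point (x, x^2 + b*x + c) is equally far from focus and directrix."""
    x = Fraction(x)
    y = x * x + Fraction(b) * x + Fraction(c)
    feats = parabola_features(b, c)
    fx, fy = feats["focus"]
    squared_to_focus = (x - fx) ** 2 + (y - fy) ** 2
    squared_to_line = (y - feats["directrix"]) ** 2   # vertical distance to a horizontal line
    return squared_to_focus == squared_to_line

print(parabola_features(6, 5))
print(all(is_equidistant(x, 6, 5) for x in range(-5, 6)))
```

Comparing squared distances avoids square roots entirely, which is why the check can be exact.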
Now that we have found the turning point on the graph of a monic
quadratic polynomial, we can use it to get information on the polynomial’s
roots.
Exercise 2.14. Using Exercise 2.13, show that the nature of the roots of
$x^2 + bx + c$ is determined by the sign of $c - b^2/4$:
(i) If $c - b^2/4 < 0$, then the graph’s turning point is below the x-axis.
Conclude that the graph crosses the x-axis twice, so that $x^2 + bx + c$
has two distinct real roots.
(ii) If $c - b^2/4 = 0$, then the turning point is on the x-axis. Conclude that
the graph of $x^2 + bx + c$ touches the x-axis once, so that $x^2 + bx + c$
has one real root. Since $x^2 + bx + c$ must factor as the product of two
degree-one polynomials, deduce that the real root has multiplicity 2.
(iii) If $c - b^2/4 > 0$, then the turning point is above the x-axis. Conclude
that the graph of $x^2 + bx + c$ does not touch or cross the x-axis, so
that $x^2 + bx + c$ has no real roots.
(The three cases are illustrated in Figure 2.4.)
Theorem 2.5. Let $f(x) = x^2 + bx + c$ for real numbers $b$ and $c$ and let
$\delta$ be the associated discriminant $b^2 - 4c$.
(i) If $\delta > 0$, then $f(x)$ has two distinct real roots.
(ii) If $\delta = 0$, then $f(x)$ has a repeated real root.
(iii) If $\delta < 0$, then $f(x)$ has no real root.
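The three cases of Theorem 2.5 translate directly into a small decision procedure. The sketch below is illustrative only; the function name `classify_roots` is an assumption, and integer or rational coefficients are assumed so that the comparison with zero is exact.

```python
def classify_roots(b, c):
    """Classify the real roots of x^2 + b*x + c by the sign of the discriminant b^2 - 4c."""
    delta = b * b - 4 * c
    if delta > 0:
        return "two distinct real roots"
    if delta == 0:
        return "repeated real root"
    return "no real root"

print(classify_roots(10, -39))   # x^2 + 10x - 39
print(classify_roots(2, 1))      # x^2 + 2x + 1
print(classify_roots(0, 1))      # x^2 + 1
```

With floating-point coefficients the test `delta == 0` would be fragile, so a tolerance would be needed there.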
2.5 History
In this section, we will look at the history of quadratic equations from an-
cient Babylonian civilization four millennia ago to Italy in 1500, touching
occasionally on developments in algebra beyond the quadratic formula.
Every ancient civilization developed a body of mathematical knowl-
edge, in part to serve the practical needs of measurement, construction,
and commerce. The work of the ancient Greeks may be the most familiar,
thanks to Euclid, but important contributions were made as well by Baby-
lonian, Egyptian, Indian, and Chinese civilizations. Mathematical problems
and solutions were often described through words, and geometry was used
to express concepts that we might now address algebraically through nota-
tional systems not then available. General methods would be laid out im-
plicitly through the working of a series of examples. All these civilizations
addressed problems, in different guises, that we can now interpret as fitting
under the heading of solving quadratic equations.
Let’s begin with a look at some Babylonian work on problems reducible
to quadratic equations. The first Babylonian dynasty, in the years ranging
from around 1900 B.C.E. to 1600 B.C.E., made many contributions to hu-
man culture. Perhaps most notable is the code of Hammurabi, consisting of
282 laws that were written on clay tablets. Some of the earliest preserved
mathematical texts date to this time. Babylonians used cuneiform, a wedge-
like script that evolved from pictographs, for their writing. They also used
base 60 in writing numbers, a choice that persists today in our methods for
measuring time (minutes, seconds) and angle (degrees, minutes, seconds).
Surviving on tablets are examples of the mathematical problems that Baby-
lonians posed and solved. Many reduce to solving quadratic equations.
Our understanding of the mathematics on the cuneiform tablets owes
much to the pioneering work of Otto Neugebauer, an Austrian scholar who
studied mathematics at Munich and Göttingen in the 1920s. While at
Göttingen, Neugebauer shifted his interest from mathematics to its his-
tory and did his doctoral research on Egyptian mathematics. He stayed
on at Göttingen and began to study the Babylonian tablets that can be
found at many of the major museums in Europe and the United States.
His three-volume work Mathematische Keilschrift-Texte [44], published in
1935, translated and interpreted tablets found in museums from London and
Paris to Berlin and Istanbul.
Neugebauer came to the United States in 1939 and continued his work,
publishing Mathematical Cuneiform Texts [46] jointly with Abraham Sachs
in 1945. He provided an overview of his findings in The Exact Sciences
in Antiquity [45], published in 1952 based on a 1949 lecture series he pre-
sented at Cornell. These books are well worth a look, both for their content
and for the tablet photographs.
In surveying the Babylonian treatment of algebra, Neugebauer high-
lighted its abstract nature, divorced from both geometric considerations and
practical meaning:
[G]eometrical concepts play a very secondary part in Babylonian al-
gebra, however extensively a geometrical terminology may be used. It
suffices to quote the existence of examples in which areas and lengths
are added, or areas multiplied, thus excluding any geometrical inter-
pretation . . . . Indeed, still more drastic examples can be quoted for the
disregard of reality. We have many examples concerning wages to be
paid for labor according to a given quota per man and day. Again, prob-
lems are set up involving sums, differences, products of these numbers
and one does not hesitate to combine in this way the number of men
and the number of days. It is a lucky accident if the unknown number
of workmen, found by solving a quadratic equation, is an integer.
Obviously the algebraic relation is the only point of interest, exactly as
it is irrelevant for our algebra what the letters may signify.
$x^2 - x = 870.$
The solution is laid out as follows: “You put 1, the unit. You divide in
two 1: 0;30. You multiply by 0;30: 0;15. You add to 14,30: 14,30;15. It
is the square of 29;30. You add 0;30, which you multiplied, to 29;30: 30,
the side of the square.”
Once again thinking of the equation as $x^2 + bx + c = 0$, this time with
$b = -1$ and $c = -870$, we can describe the solution as follows: We first
form the quantity $-b/2$. We then square it and add the result to $-c$ to get
$b^2/4 - c$. The next step is to determine the square root, to which we add
the number we earlier used for multiplying, which was $-b/2$. The result, 30,
is the desired answer, and once again it is what we obtain by using the
quadratic formula.
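The tablet's steps can be retraced mechanically. The following Python sketch is an illustration under stated assumptions: the names `exact_sqrt` and `tablet_solution` are hypothetical, the sexagesimal values are held as exact fractions (0;30 is 1/2 and 14,30 is 870), and the square root is assumed to come out exactly, as it does on the tablet.

```python
from fractions import Fraction
from math import isqrt

def exact_sqrt(value):
    """Square root of a positive Fraction that is a perfect square; raises otherwise."""
    num, den = isqrt(value.numerator), isqrt(value.denominator)
    if num * num != value.numerator or den * den != value.denominator:
        raise ValueError("not a perfect square")
    return Fraction(num, den)

def tablet_solution(right_side):
    """Follow the tablet's steps for x^2 - x = right_side."""
    half = Fraction(1, 2)           # "You divide in two 1: 0;30"
    squared = half * half           # "You multiply by 0;30: 0;15"
    total = squared + right_side    # "You add to 14,30: 14,30;15"
    root = exact_sqrt(total)        # "It is the square of 29;30"
    return root + half              # "You add 0;30 ... to 29;30: 30"

print(tablet_solution(870))  # prints 30
```

In the notation of the paragraph above, `half` plays the role of $-b/2$ and `total` the role of $b^2/4 - c$.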
It turns out that the two problems are not typical. Few examples of single
quadratic equations have been found. More common are problems involving
two variables and two equations, one equation having the form $xy = 1$, the
other being a linear equation in x and y. They can be reduced to solving a
single quadratic equation.
$xy + (x - y) = 183$
and
$x + y = 27.$
The tablet then lays out a sequence of numerical calculations that lead to
the answer. They can be interpreted as instructing the student to add the
equations, yielding $x(y + 2) = 210$, and to add 2 to both sides of the
second equation, yielding $x + (y + 2) = 29$. This transforms the original
equations to normal form, $x$ and $y + 2$ being two numbers whose product
is 210 and whose sum is 29.
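Once the system is in normal form, the two numbers are the roots of $t^2 - 29t + 210$. The sketch below is a modern illustration, not the tablet's own calculation; the function name `numbers_with_sum_and_product` is hypothetical and integer solutions are assumed.

```python
from math import isqrt

def numbers_with_sum_and_product(s, p):
    """Return integers (u, v), u >= v, with u + v = s and u * v = p, or None."""
    disc = s * s - 4 * p            # discriminant of t^2 - s*t + p
    if disc < 0:
        return None
    r = isqrt(disc)
    if r * r != disc or (s + r) % 2:
        return None
    return (s + r) // 2, (s - r) // 2

u, v = numbers_with_sum_and_product(29, 210)   # x and y + 2, in some order
candidates = [(u, v - 2), (v, u - 2)]          # possible (x, y) pairs
for x, y in candidates:
    # each pair is checked against the two original equations
    print(x, y, x * y + (x - y) == 183, x + y == 27)
```

Because the normal form is symmetric in its two numbers, either of them may be taken as $x$, and both choices are checked against the original equations.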
It is important to note that scholarship on Babylonian mathematics in
recent decades has yielded a perspective sharply different from that en-
gendered by Neugebauer’s work. Jens Høyrup, in his 2002 study Lengths,
Widths, Surfaces: A Portrait of Old Babylonian Algebra and Its Kin [35],
observes that
The Babylonian algebra which most historians of mathematics found
in Neugebauer’s works looked astonishingly modern and similar to
ours. It is the purpose of the present book to replace this standard inter-
pretation by a less modernizing reading. . . . The mathematical texts are
school texts. They contain no theorems and no theoretical investiga-
tions. . . . [Their authors] were teachers of computation, at times teach-
ers of pure, unapplicable computation . . . but they remained teachers,
teachers of scribe school students who were later to end up applying
mathematics to engineering, managerial, accounting, or notarial tasks.
Høyrup’s own analysis of the first problem on AO 8862 #1 [35, pp. 169–
170] emphasizes its concrete nature: “The text starts by stating that a rect-
angular surface or field is built, that is, marked out; after pacing off its di-
mensions, the speaker ‘appends’ the excess of the length over the width to
it; the outcome is 3,3. Even this is done quite concretely in the terrain. Then
he ‘turns back’ and reports the accumulation of the length and the width to
be 27.”
In Mathematics in Ancient Iraq: A Social History [56], Eleanor Robson
offers an enlightening side-by-side comparison of van der Waerden’s and
Høyrup’s analyses [56, pp. 276–278]. “In van der Waerden’s translation of
1954 the problem is entirely numerical . . . In his reading ‘the Babylonians’
are just like modern mathematicians: they use ‘symbols’ and ‘equations,’
which means that the problem can ‘safely’ be expressed as modern alge-
bra. Høyrup, by contrast, opens his comments on the same problem with an
interpretative diagram that does not appear on the cuneiform tablet and con-
tinues [as quoted above]. All of van der Waerden’s apparently arithmetical
numbers turn out to have dimension: they are particular lengths and areas
that are manipulated physically.”
Another common form taken by problems on the tablets is to find two
numbers from known values of the sum of their squares and of their sum
(or difference). An example is problem 9 of BM 13901 [59, p. 15]: “I added
the area of my two squares: 1300. The side of one exceeds the side of the
other by 10.” Here we are asked to solve the equations
$x^2 + y^2 = 1300$
and
$x - y = 10.$
This is easily converted to a quadratic equation in one variable. In reinter-
preting the problem this way, we are placing a modern overlay on the data
of the tablet. Høyrup offers his own analysis [35, pp. 66–70], attempting to
interpret the tablet’s prescription as a series of instructions for tearing out,
inscribing, and appending squares of given dimensions.
Around 600 B.C.E., a new mathematical current arose within the Greek
civilization of the Mediterranean, starting with the work of Thales and
Pythagoras. The Greek tradition would have enormous influence on the de-
velopment of mathematics, and much has been written about it. Perhaps
of greatest importance was the introduction of the axiomatic method and
deductive proof. For a concise introduction, William Berlinghoff and Fer-
nando Gouvêa have an informative survey [9, pp. 14–24]. (See S. Cuomo’s
Ancient Mathematics [15] for a more detailed study.) The most famous of
the Greek mathematicians is Euclid, who lived around 300 B.C.E. in Alexan-
dria. Little is known about him, or about the origins of his greatest work, the
Elements. For instance, no definitive answer can be given to the question of
how much of the Elements is due to him and how much is a compendium
of earlier work.
The Elements provided the model for the axiomatic method and laid the
basis for geometry for centuries. Most of its thirteen books are devoted to
geometry, with plane geometry treated first and regular polyhedra in three-
space covered in the concluding Book 13. Books 7 to 9 contain some of the
most famous results of elementary number theory, such as Euclid’s proof
that there are infinitely many prime numbers. Book 2 provides solutions to
geometric questions about area that amount to solving simultaneous equations
$x + y = a$ and $xy = b$ for constants $a$ and $b$ [65, pp. 77–80]. As we
know, this reduces to solving a quadratic equation.
Jumping ahead five centuries to the late stages of the Greek tradition, we
come to Diophantus, another Alexandrian. Diophantus wrote Arithmetica,
a collection of about 200 problems and their solutions. Arithmetica con-
sisted of thirteen chapters, of which six were preserved in Greek and four in
an Arabic translation from the ninth century, but the Arabic chapters were
essentially lost until their rediscovery in 1968.
In contrast to Euclid’s geometric algebra, Diophantus works in a manner
we would recognize as more strictly algebraic, with symbols for the lower
[14]. Here is one problem Brahmagupta poses [14, p. 346]: “When does
the residue of revolutions of the sun, less one . . . equal to the square root of
two less than the residue of revolutions, less one, multiplied by ten and augmented
by two?” Translating into modern algebraic notation, Brahmagupta
is looking for a solution of the equation $x^2 - 86x = -249$. Following Colebrooke
in his translation, let us call $-249$ the “absolute number” and $x$ the
“middle term.” With this terminology, Brahmagupta’s proposed solution
amounts to a statement of the quadratic formula:
Now, from the absolute number, multiplied by four times the coeffi-
cient of the square, and added to the square of the coefficient of the
middle term, the square root being extracted, and lessened by the co-
efficient of the middle term, the remainder is divided by twice the co-
efficient of the square, yields the value of the middle term.
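Brahmagupta's verbal rule can be followed step by step. The Python sketch below is illustrative: the function name `brahmagupta_root` and its parameter names are hypothetical, and the equation is read as $x^2 - 86x = -249$, with the minus signs taken as an assumption that is consistent with the problem as quoted.

```python
from math import sqrt

def brahmagupta_root(square_coeff, middle_coeff, absolute):
    """Root of square_coeff*x^2 + middle_coeff*x = absolute, following the quoted rule."""
    # absolute number times four times the coefficient of the square,
    # added to the square of the coefficient of the middle term
    inside = absolute * 4 * square_coeff + middle_coeff ** 2
    # square root extracted, lessened by the coefficient of the middle term,
    # the remainder divided by twice the coefficient of the square
    return (sqrt(inside) - middle_coeff) / (2 * square_coeff)

x = brahmagupta_root(1, -86, -249)
print(x, x * x - 86 * x)
```

The rule selects one of the two roots given by the quadratic formula, the one with the positive square root.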
twelfth century by Robert of Chester and Gerard of Cremona and in the thir-
teenth century by William of Luna. Robert of Chester’s version, Liber alge-
brae et almucabola, was translated into English in 1915 by Louis Charles
Karpinski, who also provided notes and an introduction [36]. Karpinski’s
edition is well worth a look, as is the more recent Al-Khwarizmi: The Be-
ginnings of Algebra [54] by Roshdi Rashed, which includes a translation
(with the original Arabic text on facing pages) and commentary. Rashed
also writes about the traditions of calculation in the eighth century and al-
Khwarizmi’s knowledge of Greek and Indian mathematical literature.
At the outset of his book, al-Khwarizmi introduces three types of quantities:
roots, squares, and numbers, the root being what we would label $x$,
the square being $x^2$, and number being number. He then gets right to business,
classifying linear and quadratic equations into six forms and illustrating
how to solve each with examples. We would regard all six types as one,
but since al-Khwarizmi does not use negative numbers or 0, he is obliged to
consider separate cases, with $a$, $b$, and $c$ always positive:
(1) Squares equal roots, or what we would describe as $ax^2 = bx$.
(2) Squares equal numbers, or $ax^2 = c$.
(3) Roots equal numbers, or $bx = c$.
(4) Squares and roots equal numbers, or $ax^2 + bx = c$.
(5) Squares and numbers equal roots, or $ax^2 + c = bx$.
(6) Roots and numbers equal squares, or $bx + c = ax^2$.
(In addition to [36] and [54], see also the discussion of al-Khwarizmi’s work
in The Beginnings and Evolution of Algebra by I.G. Bashmakova and G.S.
Smirnova, from which the above summary is drawn [8, p. 50].)
What is novel about the opening of al-Khwarizmi’s book is that in it he
lays out the mathematical issues in purely algebraic terms before turning to
the examples—drawn from commerce and inheritances—that fill the later
pages of the book. The focus is on equations in the abstract, classified by
degree. Rashed explains in the introductory essay to his translation [54,
p. 24] that what al-Khwarizmi
does cannot be reduced to anything to be found in other traditions,
such as those of the Babylonians, of Diophantus, of Heron of Alexan-
dria, of Aryabhata or of Brahmagupta. It is not in the course of solving
problems that al-Khwārizmı̄ finds these equations. The classification
in fact precedes the problems. It is introduced deliberately as the nec-
essary first step in the construction of a theory of equations of the first
and second degree; and this theory will become the nucleus of a math-
ematical discipline.
The first of al-Khwarizmi’s six cases that interests us is (4), and the
example he uses to illustrate it is none other than the quadratic equation
$x^2 + 10x = 39$ that we studied in Section 2.2. This equation is not distinguished
in any mathematical sense, but since it is the first one for which
al-Khwarizmi employs the technique of completing the square, it has res-
onated through history. Or, as Karpinski suggests [36, p. 19], the equation
“runs like a thread of gold through the algebras for several centuries, appear-
ing in the algebras of the three writers mentioned, Abu Kamil, al-Karkhi and
Omar al-Khayyami, and frequently in the works of Christian writers, cen-
turies later.” Here is al-Khwarizmi’s own description of the example [54,
p. 100]:
“Squares plus roots are equal to a number”, as when you say: a square
plus ten roots are equal to thirty-nine dirhams [a unit of measure];
namely, if you add to any square [a quantity] equal to ten of its roots,
the total will be thirty-nine.
Procedure: you halve the number of the roots which, in this problem,
yields five; you multiply it by itself; the result is twenty-five; you add
it to thirty-nine; the result is sixty-four; you take the root, that is eight,
from which you subtract half the number of the roots, which is five.
The remainder is three, that is the root of the square you want, and the
square is nine.
Upon completing his treatment of the remaining types of quadratic equation,
al-Khwarizmi returns to the equation $x^2 + 10x = 39$ and illustrates
two ways of solving it geometrically. The second is the one we used in
Section 2.2.
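The procedure al-Khwarizmi describes for type (4) can be written out as a short computation. This is a sketch in modern terms, assuming a monic equation $x^2 + bx = c$ with $b$ and $c$ positive; the function name `al_khwarizmi_type4` is hypothetical.

```python
from math import sqrt

def al_khwarizmi_type4(roots, number):
    """Positive solution of x^2 + roots*x = number, following the quoted procedure."""
    half = roots / 2                 # "you halve the number of the roots"
    total = half * half + number     # "you multiply it by itself ... you add it to thirty-nine"
    return sqrt(total) - half        # "you take the root ... from which you subtract half"

x = al_khwarizmi_type4(10, 39)
print(x, x * x)   # prints 3.0 9.0
```

Only the positive root appears, in keeping with al-Khwarizmi's avoidance of negative numbers.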
Rashed concludes his introductory essay by turning to the question of
Indian influence on al-Khwarizmi. Two centuries ago, in translating the
work of the Indians, Colebrooke suggested [14, pp. xx–xxi] that al-Khwa-
rizmi was a mere borrower. This assessment is misguided, but the extent to
which Islamic mathematicians and astronomers were familiar with and in-
fluenced by Indian literature is an interesting question. Rashed emphasizes
what is original to al-Khwarizmi [54, p. 77]:
Comparing the two texts, that of Brahmagupta and that of al-Khwa-
rizmi, reveals irreducible differences. . . . Brahmagupta arrived at the
quadratic equation in one unknown in the course of solving a problem
in astronomy. In other words, he did not give himself the equation as
such with a view to solving it. This link between problem and equation,
which is found in other mathematics, this as it were empirical ground-
ing for the equation, has vanished in al-Khwarizmi’s programme. From
the start, al-Khwarizmi proceeded by defining basic terms, which were
then combined to give him the ideal canonical equations with which his
theory is concerned. This new approach breaks the close link between
problems and equations. As for problems, al-Khwarizmi turns to them
later, as exercises in algebra, that is as providing an area in which he
can apply the theory of equations that he has already constructed. . . .
. . . al-Khwarizmi wanted to construct “a form of calculation” for un-
knowns that was independent of what they represented, that is a new
mathematical discipline, that by its very nature was subject to the rules
of proof. There is no trace of any such programme in the work of his
predecessors.
Al-Khwarizmi was one of several Islamic scholars who made important
contributions to algebra. Others include Thabit ibn Qurra, who spent much
of his career in Baghdad, studying medicine, philosophy, mathematics, and
astronomy as well as translating Euclid and other Greeks into Arabic. He
died in 901. Also worth mentioning is the famed Persian poet, philosopher,
and mathematician Omar Khayyam, who studied in Samarkand and worked
in Bukhara (both in modern-day Uzbekistan), dying in 1131. He wrote a
book on algebra, studying quadratic equations from a geometric viewpoint
as in Euclid. The work of Thabit and Omar Khayyam is discussed in detail
in van der Waerden’s A History of Algebra From Al-Khwarizmi to Emmy
Noether [66, pp. 15–31].
While mathematics was flourishing in Arabic, Persian, and Indian lands,
the medieval era in Europe was mathematically more quiescent. The revival
of significant European mathematical activity occurred first in Italy, per-
haps because its cities were major trade centers with connections to Arabic
ports along the Mediterranean. The great Italian mathematician Leonardo
da Pisa—better known to us as Fibonacci—provides testimony to this effect
in the prologue to his influential 1202 work Liber Abaci [28, pp. 15–16], or
the Book of Calculation:
As my father was a public official away from our homeland in the
Bugia [the Algerian port city of Bejaia] customshouse established for
the Pisan merchants who frequently gathered there, he had me in my
youth brought to him, looking to find for me a useful and comfortable
future; there he wanted me to be in the study of mathematics and to
be taught for some days. There from a marvelous instruction in the
art of the nine Indian figures, the introduction and knowledge of the
art pleased me so much above all else, and I learnt from them, who-
ever was learned in it, from nearby Egypt, Syria, Greece, Sicily and
Provence, and their various methods, to which locations of business I
travelled considerably afterwards for much study, and I learnt from the
assembled disputations.
Fibonacci is best known for the sequence of numbers that bears his
name, the Fibonacci numbers, which arise in his solution to the following
problem [28, pp. 404–405]:
3
Cubic Polynomials
In this chapter, we will take our first look at cubic equations and the famous
formula for their solution known as Cardano’s formula. Girolamo Cardano,
for whom the formula is named, was a sixteenth-century Italian scholar. The
story of the formula’s discovery is complex, as we will see in Section 3.5,
and credit must be shared with Scipione del Ferro and Niccolò Fontana.
Our results in this chapter will be imprecise, because we are lacking
what turns out to be an essential tool: complex numbers. However, it is this
first look that will reveal the need for complex numbers. After developing
the basic facts about them in Chapter 4, we will return to cubic equations in
Chapter 5 and treat them with appropriate care.
Theorem 3.1. Let $f(x)$ be a cubic polynomial. Then there is a real number
$a$ such that $f(a) = 0$.
Theorem 1.4 states that, given a polynomial $f(x)$ and a real number $a$,
if $f(a) = 0$, then $x - a$ divides $f(x)$. Combining this with Theorem 3.1,
we find that any cubic polynomial can be factored as a product of linear and
quadratic polynomials.
$(x - a)(x^2 + mx + n)$
Theorem 3.2. Let $f(x)$ be a monic, cubic polynomial. Exactly one of the
following occurs.
(i) $f(x)$ has one real root $a$, of multiplicity 3, and factors as
$(x - a)^3$.
(ii) $f(x)$ has two distinct real roots $a_1$ and $a_2$, of multiplicities 1 and 2,
and factors as
$(x - a_1)(x - a_2)^2$.
(iii) $f(x)$ has three distinct simple real roots $a_1$, $a_2$, and $a_3$, and factors as
$(x - a_1)(x - a_2)(x - a_3)$.
$(x - a)s(x),$
$y^3 + py + q$
is called a reduced cubic. In the next exercise, we will see that a change of
variable allows us to pass from an arbitrary cubic polynomial to a reduced
one. This is analogous to the change of variable used in Exercise 2.9 to pass
from a quadratic polynomial to a new one without a degree-one term, and
will simplify the task of solving cubic equations.
$f(x) = x^3 + bx^2 + cx + d,$
for real numbers $b$, $c$, and $d$. Given another real number $a$, let’s see what
happens under the change of variable $x = y + a$, or $y = x - a$.
(i) Substitute $y + a$ for $x$ and obtain a polynomial $g(y)$ in the new variable
$y$. Write it as
$y^3 + By^2 + Cy + D$
and obtain formulas for the coefficients $B$, $C$, and $D$ in terms of $a$ and
the old coefficients $b$, $c$, and $d$.
(ii) Observe that there is a choice of $a$ for which $B = 0$. Thus for this $a$,
changing variables provides a new polynomial $g(y)$ that is reduced.
(iii) Verify that $g(y)$ is the reduced polynomial described in Theorem 3.3.
(iv) What is the relation between a root of $f(x)$ and a root of $g(y)$? In
particular, given a root $s$ of $g(y)$, describe a root $r$ of $f(x)$.
(v) Conclude that solving $g(y) = 0$ allows us to solve $f(x) = 0$.
We have shown that we can pass from the problem of solving an arbi-
trary cubic equation to the equivalent problem of solving a cubic equation
of the form
$y^3 + py + q = 0,$
and that being able to solve equations of this simpler type allows us to solve
arbitrary cubic equations.
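The change of variable can be carried out mechanically by expanding $(y + a)^3 + b(y + a)^2 + c(y + a) + d$ and collecting powers of $y$. The Python sketch below does this with exact fractions; the function name `reduce_cubic` is hypothetical, and the choice $a = -b/3$ is the one that makes the coefficient $3a + b$ of $y^2$ vanish.

```python
from fractions import Fraction

def reduce_cubic(b, c, d):
    """Return (a, p, q) so that x = y + a turns x^3 + b x^2 + c x + d into y^3 + p y + q."""
    b, c, d = Fraction(b), Fraction(c), Fraction(d)
    a = -b / 3                               # makes the y^2 coefficient 3a + b equal to 0
    p = 3 * a * a + 2 * a * b + c            # coefficient of y after expanding
    q = a ** 3 + b * a * a + c * a + d       # constant term, which is f(a)
    return a, p, q

a, p, q = reduce_cubic(-3, -4, 12)
print(a, p, q)   # prints 1 -7 6
```

A root $s$ of the reduced cubic then gives the root $s + a$ of the original cubic, since $x = y + a$.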
$x^3 - 3x^2 - 4x + 12 = 0.$
(i) What reduced cubic equation would we solve in order to find the solu-
tions?
(ii) How are the solutions of the reduced cubic equation related to the so-
lutions of the original cubic equation?
In Section 3.2, we will solve the reduced cubic equation in Exercise 3.3,
allowing us to find the solutions of the original cubic equation.
There are two families of reduced cubic equations that take on an even
simpler form, those for which $p = 0$ and those for which $q = 0$. Let’s
discuss these.
(ii) Deduce that if $p > 0$ then $y = 0$ is the only solution and 0 is a simple
root of $y^3 + py$; if $p = 0$ then $y = 0$ is the only solution and is a
repeated root of $y^3 + py$; and if $p < 0$ then there are three distinct
solutions: $y = 0$ and $y = \pm\sqrt{-p}$.
$y^3 - r^3 = (y - r)(y^2 + ry + r^2).$
(iv) Conclude that $r$ is the lone root of $y^3 + q$ and that $r$ has multiplicity
1.
We will explore its history in Section 3.5. For now, let us just note that
Cardano did not write down such a formula explicitly. Rather, he illustrated
how to solve reduced cubic equations through examples.
$z^3 - \frac{p^3}{27z^3} + q = 0.$
$z^6 + qz^3 - \frac{p^3}{27} = 0.$
(vi) Take cube roots of both sides and deduce that the two values of $z$ have
a product satisfying
$\sqrt[3]{-\frac{q}{2} + \sqrt{R}}\ \sqrt[3]{-\frac{q}{2} - \sqrt{R}} = -\frac{p}{3}.$
(vii) This means that if $z$ is the cube root of $-(q/2) + \sqrt{R}$, then $-p/(3z)$ is
the cube root of $-(q/2) - \sqrt{R}$.
(viii) Recall that $z$ was introduced to satisfy $y = z - p/(3z)$. The two terms on
the right of this equation, $z$ and $-p/(3z)$, are the cube roots of $-(q/2) + \sqrt{R}$
and of $-(q/2) - \sqrt{R}$.
$y = \sqrt[3]{-\frac{q}{2} + \sqrt{R}} + \sqrt[3]{-\frac{q}{2} - \sqrt{R}}.$
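The formula can be tried numerically. The sketch below is a hedged illustration rather than a full treatment: it assumes $R = q^2/4 + p^3/27$, which is the quantity under the square root when $z^6 + qz^3 - p^3/27 = 0$ is solved as a quadratic in $z^3$, and it handles only the case $R \ge 0$, where every step stays within the real numbers. The function names are hypothetical.

```python
from math import copysign, sqrt

def real_cube_root(t):
    """Real cube root of a real number, valid for negative t as well."""
    return copysign(abs(t) ** (1 / 3), t)

def cardano_real_root(p, q):
    """A real root of y^3 + p*y + q by Cardano's formula, assuming R >= 0."""
    R = q * q / 4 + p ** 3 / 27          # assumed definition of R
    if R < 0:
        raise ValueError("R < 0: the square root is not a real number")
    s = sqrt(R)
    return real_cube_root(-q / 2 + s) + real_cube_root(-q / 2 - s)

y = cardano_real_root(6, -20)            # the cubic y^3 + 6y - 20
print(y, y ** 3 + 6 * y - 20)
```

The printed values are floating-point approximations, so the residual may differ from 0 by rounding error.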
We can stop here, content that we have found the single real root of
$y^3 + 6y - 20$. Or we can use a calculator to determine at least approximately
what the sum of the two cube roots is. If we do this, we discover that it is
close to 2. Substituting 2 for $y$ in $y^3 + 6y - 20$ to see how close to 0 we
are, we further discover that
$2^3 + 6 \cdot 2 - 20 = 8 + 12 - 20 = 0.$
Exercise 3.10. We will calculate the cube roots that appear in Exercise 3.9
by guessing and verifying.
(i) Guess that the cube root of $10 + 6\sqrt{3}$ has the form $a + b\sqrt{3}$. Cube
this. Combine terms that don’t involve $\sqrt{3}$ and terms that do to get two
equations in $a$ and $b$ with integer coefficients.
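The guess-and-verify step can be mimicked by a small search. In this sketch a number $a + b\sqrt{3}$ is stored as the pair `(a, b)`, and the cube is computed from the expansion $(a + b\sqrt{3})^3 = (a^3 + 9ab^2) + (3a^2b + 3b^3)\sqrt{3}$. The function names and the search bound are assumptions made for illustration.

```python
def cube_in_sqrt3(a, b):
    """Cube a + b*sqrt(3); return (part without sqrt(3), coefficient of sqrt(3))."""
    return a ** 3 + 9 * a * b ** 2, 3 * a ** 2 * b + 3 * b ** 3

def search_cube_root(target, bound=5):
    """All integer pairs (a, b) with |a|, |b| <= bound whose cube equals target."""
    span = range(-bound, bound + 1)
    return [(a, b) for a in span for b in span if cube_in_sqrt3(a, b) == target]

print(search_cube_root((10, 6)))    # candidates for the cube root of 10 + 6*sqrt(3)
print(search_cube_root((10, -6)))   # candidates for the cube root of 10 - 6*sqrt(3)
```

Working with integer pairs keeps the check exact, which a floating-point cube root would not.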
The form of the solution in Exercise 3.12 is similar to the form of the
solution in Exercise 3.11, the only essential difference being the minus sign
under the square root. But what a difference! Because of the minus sign, the
solution makes no sense. After all, there is no square root of $-3$.
Let’s not worry about the meaninglessness of our solution just yet. Instead,
taking a hint from Exercises 3.10 and 3.11, let’s treat $\sqrt{-3}$ the way
we did $\sqrt{3}$ and try to find cube roots that we might be able to add to obtain
a simpler answer. Even though we don’t know what $\sqrt{-3}$ means, let’s
assume in the exercises that when we square it, the result is the number $-3$.
Exercise 3.13. Let’s guess that $-3 + 10\sqrt{-3}/9$ has a cube root in the
form $a + b\sqrt{-3}$. We defer any concerns about the meaning of $a + b\sqrt{-3}$,
choosing for now just to work with it formally.
(i) Cube $a + b\sqrt{-3}$. Combine terms that don’t involve $\sqrt{-3}$ and terms
that do to get two equations in $a$ and $b$ with integer coefficients.
(ii) Make guesses for $a$ and $b$. Small positive integers for $a$ and fractions
involving thirds for $b$ should give a solution quickly.
(iii) Find the cube root of $-3 - 10\sqrt{-3}/9$ similarly.
(iv) The expressions for the cube roots of $-3 + 10\sqrt{-3}/9$ and $-3 - 10\sqrt{-3}/9$
involve the square root of $-3$. Let’s not worry yet. Let’s proceed.
(v) Add the two cube roots: the troublesome $\sqrt{-3}$ terms cancel, leaving
us with a meaningful real number, 2. Verify that 2 is a solution to
$y^3 - 7y + 6 = 0$.
(vi) Conclude that Cardano’s formula has led us to a correct solution of the
equation $y^3 - 7y + 6 = 0$, namely, $y = 2$.
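The formal calculation in this exercise needs only the rule that the square of $\sqrt{-3}$ is $-3$. The sketch below stores $a + b\sqrt{-3}$ as the pair `(a, b)` of exact fractions and cubes it using that rule alone, so no complex numbers are assumed. The function name `cube_formal` is hypothetical, and the pair tried is one guess of the kind suggested in part (ii).

```python
from fractions import Fraction

def cube_formal(a, b):
    """Cube a + b*s, where s*s = -3; return (part without s, coefficient of s)."""
    a, b = Fraction(a), Fraction(b)
    # (a + b s)^3 = a^3 + 3 a^2 b s + 3 a b^2 s^2 + b^3 s^3, with s^2 = -3 and s^3 = -3 s
    return a ** 3 - 9 * a * b ** 2, 3 * a ** 2 * b - 3 * b ** 3

guess = (1, Fraction(2, 3))
print(cube_formal(*guess))               # compare with -3 and 10/9
print(cube_formal(1, Fraction(-2, 3)))   # compare with -3 and -10/9
```

If both comparisons succeed, adding the two formal cube roots cancels the terms in $s$ and leaves twice the first component.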
$x^3 - 3x^2 - 4x + 12 = 0$
by using the change of variable of Theorem 3.3 and applying the result of
Exercise 3.13.
(iv) For the displayed numbers to make sense, there must be a square root
of $-3$. Let’s continue not to worry about this. It will be convenient to
have a special name for the hypothetical number
$-\frac{1}{2} + \frac{\sqrt{-3}}{2}.$
We will use the lower case Greek letter omega for this purpose and
write $-1/2 + \sqrt{-3}/2$ as $\omega$.
is $\omega^2$.
Exercise 3.16. Let $c$ be any non-zero real number. From Exercise 3.5, $c$
has one real cube root. Call it $a$. Provided that the new number $\omega$ introduced
in the preceding exercise makes sense, show that $\omega a$ and $\omega^2 a$ are also cube
roots of $c$. Conclude that if we decide to treat $\omega$ as an allowable number, we
will find that the polynomial $x^3 - c$ has three distinct roots, one real and the
others of a new form.
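The claim of this exercise can be tested numerically if we are willing to borrow Python's built-in complex numbers as a stand-in for the "hypothetical number" $\omega$. That borrowing is an assumption of this sketch, since the text has not yet given $\sqrt{-3}$ a meaning; the names `OMEGA` and `three_cube_roots` are likewise hypothetical.

```python
import math

# Stand-in for omega = -1/2 + sqrt(-3)/2, using Python's complex type
OMEGA = complex(-0.5, math.sqrt(3) / 2)

def three_cube_roots(c):
    """Return a, omega*a, omega^2*a, where a is the real cube root of c."""
    a = math.copysign(abs(c) ** (1 / 3), c)
    return [a, OMEGA * a, OMEGA ** 2 * a]

for root in three_cube_roots(8.0):
    print(root, root ** 3)   # each cube should be close to 8
```

The cubes agree with $c$ only up to floating-point rounding, so a tolerance is needed when comparing them.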
3.3 Graphs
In this section, we will look at the shapes of cubic polynomial graphs. The
results of this section are needed only in Section 3.4, where we take our
first look at the discriminant of a cubic. We will return to the study of the
discriminant in Section 5.3 and obtain results purely algebraically, without
reference to this section. Thus, we might choose to omit both this section
and Section 3.4. However, the principal results enhance our visual or geo-
metric understanding of cubic polynomials.
To give full proofs of the results of this section, some background from
the foundations of real numbers and calculus is needed. Readers with that
background, including familiarity with the connection between derivatives
and turning points summarized in Section 1.6, will be able to prove the re-
sults easily. For those unfamiliar with calculus, we will indicate an approach
that reduces the calculus to a minimum, although some foundational theo-
rems on the real numbers are needed.
We will restrict ourselves to reduced cubic polynomials, those of the
form $x^3 + px + q$. As we saw in Section 3.1, this is in fact no restriction
at all. Recall from Section 1.6 the notions of local maximum, local minimum,
and turning point for the graph of a function $y = f(x)$. We know
from Theorem 1.15 that the graph of $x^3 + px + q$ has at most two turning
points. We know from Theorem 1.13 that the graph rises from arbitrarily
low heights on the left of the y-axis to arbitrarily high heights on the right.
The theorems tell us that the graph of a cubic has two possible behaviors: it
will rise steadily as x increases or it will rise to a local maximum, fall to a
local minimum, then rise. Which one occurs depends on the sign of p.
The possibilities are illustrated in Figure 3.1, which shows the graphs of
the cubic polynomials $x^3 + px$ for $p = -4, -2, 0, 2$, and $4$. For $p = 2$ or
$4$ (or, more generally, for $p$ positive), the graph rises steadily, with no turning
points. For $p = -2$ or $-4$ (or, more generally, for $p$ negative), the graph
rises, then falls, then rises. An exceptional case occurs when $p = 0$. Here,
the graph never turns, but at $x = 0$, it stops rising for an instant, being
essentially flat. More precisely, the x-axis is tangent to the graph at $(0, 0)$,
[Figure 3.1: graphs of $y = x^3 + 4x$, $y = x^3 + 2x$, $y = x^3$, $y = x^3 - 2x$, and $y = x^3 - 4x$, drawn on common x- and y-axes.]
Theorem 3.5. Let $p$ and $q$ be real numbers with $p < 0$ and let $f(x) =
x^3 + px + q$. Let $a$ be the positive square root of $-p/3$.
(i) The graph of $y = f(x)$ has two turning points, a local maximum at
$x = -a$ and a local minimum at $x = a$.
(ii) As $x$ increases, the graph rises to a turning point at
$(-a, q - 2ap/3)$, falls to a turning point at $(a, q + 2ap/3)$,
then rises.
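To make the theorem concrete, here is a minimal Python sketch that computes the two turning points from $p$ and $q$, using the coordinates $(-a, q - 2ap/3)$ and $(a, q + 2ap/3)$ that appear in Exercise 3.19(iv). The function names turning_points and f are hypothetical, introduced only for this illustration.

```python
import math

def f(x, p, q):
    """Evaluate the reduced cubic x^3 + p x + q."""
    return x ** 3 + p * x + q

def turning_points(p, q):
    """Return (local_max, local_min) as (x, y) pairs for x^3 + p x + q, p < 0."""
    if p >= 0:
        raise ValueError("Theorem 3.5 assumes p < 0")
    a = math.sqrt(-p / 3)  # the positive square root of -p/3
    local_max = (-a, q - 2 * a * p / 3)
    local_min = (a, q + 2 * a * p / 3)
    return local_max, local_min

local_max, local_min = turning_points(-3.0, 1.0)
print(local_max, local_min)
# The heights should agree with direct evaluation of the cubic.
print(f(local_max[0], -3.0, 1.0), f(local_min[0], -3.0, 1.0))
```

For $x^3 - 3x + 1$ this gives $a = 1$, with the local maximum at height 3 and the local minimum at height $-1$.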
Exercise 3.19. Let $p$ and $q$ be real numbers with $p < 0$ and let $a$ be
the positive square root of $-p/3$. In this exercise, we will find two turning
points for the graph of $y = x^3 + px + q$. We can rewrite the cubic as
$x^3 - 3a^2 x + q$.
(i) Let $g(x) = x^3 - 3a^2 x + 2a^3$. Because $a$ is a root, $x - a$ divides $g(x)$.
Factor $g(x)$ as $(x - a)^2 (x + 2a)$. Use this factorization to verify for
$x > a$ that as $x$ increases, so does $g(x)$. Also check that $g(a) = 0$, but
$g(x) > 0$ for any $x \ge 0$ besides $x = a$.
(ii) Deduce that $(a, 0)$ is a local minimum for $g(x)$ and that the graph of
$g(x)$ rises to the right of $x = a$ as $x$ increases.
(iv) Let $q$ be a real number and let $f(x) = x^3 - 3a^2 x + q$. The graph of
$f(x)$ is just a vertical shift of the graphs of $g(x)$ and $h(x)$. Conclude
that $f(x)$ has a local maximum at $(-a, q - 2ap/3)$ and a local minimum
at $(a, q + 2ap/3)$, with the graph rising as $x$ increases to $-a$ and
as $x$ increases from $a$.
Let’s continue with the notation of Exercise 3.19. We might guess from
the exercise that the graph of $y = x^3 + px + q$ falls as $x$ goes from $-a$ to $a$.
Theorem 3.5 tells us that this is the case, as we easily check using calculus,
since we can verify that the derivative $3x^2 + p$ of $x^3 + px + q$ is negative
for all values of $x$ satisfying $-a < x < a$.
Without calculus, we can still show with a little more work (and an
appeal to basic facts from the foundations of real numbers) that $x^3 + px + q$
decreases as $x$ goes from $-a$ to $a$. For instance, we can show for each real
number $r$ between $-a$ and $a$ that there is some open interval around $r$ on
which $x^3 + px + q$ is decreasing. This local information, together with
basic results on the real numbers, shows $x^3 + px + q$ is decreasing across
the entire interval $(-a, a)$. Rather than pursuing this point further, we will
accept that it is true, as we already know from calculus.
3.4 A Discriminant
For a quadratic polynomial $x^2 + bx + c$, we know that there are three
possibilities for the roots: two distinct real roots, one real root repeated, or
no real roots. We also know that we can decide which case holds from the
coefficients $b$ and $c$, as we saw in Theorem 2.5. The three cases correspond
to the quantity $b^2 - 4c$, the discriminant of $x^2 + bx + c$, being positive,
zero, or negative. This can be proved by using elementary calculus or the
quadratic formula.
For a cubic polynomial $x^3 + bx^2 + cx + d$, as for a quadratic polynomial,
it is possible to determine from the coefficients the nature of the
cubic’s roots: whether the cubic has one simple real root, three distinct real
roots, or repeated real roots. This is governed by an expression in $b$, $c$, and
We will handle the case $p < 0$ in Exercise 3.21, using Theorem 3.5.
It will be helpful to have a picture in mind of the possible behaviors of the
graphs. We will use Figure 3.2 as a guide. The figure shows the graphs of the
cubic polynomials $x^3 - 3x + q$ for $q = -4, -2, 0, 2$, and $4$. All five graphs
have the same shape, as we would expect. They are translations of each
other, up or down. How far each one is shifted determines how the x-axis
meets it and therefore how many roots the corresponding polynomial has.
[Figure 3.2: graphs of $y = x^3 - 3x + 4$, $y = x^3 - 3x + 2$, $y = x^3 - 3x$, $y = x^3 - 3x - 2$, and $y = x^3 - 3x - 4$.]
Imagine a video showing the graph of $y = x^3 - 3x + q$ as $q$ increases
from $-10$ to $10$. At the start of the video, the graph of $y = x^3 - 3x - 10$
is shown. As time goes by, the curve steadily rises, until at the end of the
video we arrive at the graph of $y = x^3 - 3x + 10$.
As suggested in Figure 3.2 with $q = -4$, at the video’s start, the curve
will cross the x-axis only once. When $q = -2$, the curve will cross the
x-axis off to the right, but suddenly the turning point on the left makes
contact with the x-axis, producing a second root. Once $q$ increases above
$-2$, the turning point rises above the x-axis and is replaced by two points
of intersection on the left, along with the point of intersection on the right.
Thus, as the figure illustrates in the case of $q = 0$, whenever $-2 < q < 2$
the graph of $x^3 - 3x + q$ crosses the x-axis three times and there are three
roots. The behavior changes when $q = 2$. If we were to watch the video as
$q$ increases from $-2$ to $2$, we would see the middle of the three intersection
points approach the rightmost intersection point, until they merge when $q = 2$.
The graph of $x^3 - 3x + 2$ therefore meets the x-axis only twice and the
polynomial has two roots. Finally, as $q$ increases beyond $2$, the graph no
longer makes contact with the x-axis to the right, crossing only on the left,
with $x^3 - 3x + q$ having only one root.
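The video can be mimicked with a short Python sketch. It relies on the fact that the turning points of $x^3 - 3x + q$ sit at $x = -1$ and $x = 1$, with heights $q + 2$ and $q - 2$, and it counts distinct real roots by asking where the x-axis lies relative to those heights. The function name count_real_roots is hypothetical, and the sketch is an illustration of the discussion above, not a general root finder.

```python
def count_real_roots(q):
    """Count distinct real roots of x^3 - 3x + q from its turning-point heights."""
    high = q + 2  # height of the local maximum, at x = -1
    low = q - 2   # height of the local minimum, at x = 1
    if low < 0 < high:
        return 3  # the x-axis passes strictly between the turning points
    if low == 0 or high == 0:
        return 2  # the x-axis is tangent to the graph at one turning point
    return 1      # the x-axis lies above the maximum or below the minimum

for q in (-4, -2, 0, 2, 4):
    print(q, count_real_roots(q))
```

For the five curves of Figure 3.2 the counts are 1, 2, 3, 2, 1, matching the description of the video.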
$f(a) < f(-a) = 0.$
(c) The x-axis crosses the graph between the local minimum and the
local maximum:
$f(a) < 0 < f(-a).$
$0 = f(a) < f(-a)$
(e) The x-axis crosses the graph below the local minimum:
(iii) For each of these five cases, describe the number of real roots of $f(x)$
and their multiplicity, referring to Theorem 3.2 for guidance on the
options.
(iv) Express the last result by saying that there are three possibilities for
the nature of the roots, depending on whether $f(-a)$ and $f(a)$ have
opposite sign, $f(-a)$ and $f(a)$ have the same sign, or one of them is
0.
(v) The three cases can be described more compactly in terms of the sign
of the product of $f(-a)$ and $f(a)$:
(a) If $f(-a)f(a) < 0$, then $f(x)$ has three distinct real roots.
(b) If $f(-a)f(a) > 0$, then $f(x)$ has one real root, of multiplicity 1.
(c) If $f(-a)f(a) = 0$, then $f(x)$ has two distinct real roots, of
multiplicities 1 and 2.
(vi) Calculate the product $f(-a)f(a)$ to get
$f(-a)f(a) = q^2 + \frac{4p^3}{27},$
so
$-27f(-a)f(a) = -4p^3 - 27q^2.$
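The identity in part (vi) can be spot-checked numerically before it is derived by hand. The Python sketch below evaluates $f(-a)f(a)$ directly, with $a$ the positive square root of $-p/3$, and compares it with $q^2 + 4p^3/27$ for a few sample values of $p < 0$ and $q$. The function name turning_value_product and the sample values are hypothetical choices for illustration; agreement up to rounding is evidence, not a proof.

```python
import math

def turning_value_product(p, q):
    """Return f(-a) * f(a) for f(x) = x^3 + p x + q, where a = sqrt(-p/3), p < 0."""
    a = math.sqrt(-p / 3)

    def f(x):
        return x ** 3 + p * x + q

    return f(-a) * f(a)

for p, q in [(-3.0, 1.0), (-7.0, -6.0), (-2.0, 5.0)]:
    lhs = turning_value_product(p, q)
    rhs = q ** 2 + 4 * p ** 3 / 27
    print(p, q, math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-9))
```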
Exercise 3.22. Let’s continue the analysis of the roots of the cubic polynomial
$x^3 + px + q$.
(i) Assume $p = 0$. Then $x^3 + px + q$ has one real root, which has
multiplicity 1 if $q \ne 0$ and multiplicity 3 if $q = 0$. Verify that $\delta < 0$ if
$q \ne 0$ and $\delta = 0$ if $q = 0$.
(ii) Assume $p > 0$. From Exercise 3.20, $x^3 + px + q$ has one simple real
root. Verify that $\delta < 0$.
(iii) Assume $p < 0$. Deduce from Exercise 3.21 that when the graph of
$y = x^3 + px + q$ crosses the x-axis three times, so that $x^3 + px + q$
has three distinct real roots, $\delta > 0$. Deduce that when the graph crosses
the x-axis only once, so that $x^3 + px + q$ has one real root, $\delta < 0$.
Deduce that when the graph crosses the x-axis once and is tangent to
the x-axis once, so that there is a repeated real root, $\delta = 0$.
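The sign test that this exercise builds toward can be sketched in a few lines of Python. The sketch assumes that $\delta$ denotes $-4p^3 - 27q^2$, the quantity that appears at the end of Exercise 3.21; the definition of $\delta$ itself is not reproduced in this passage, so treat that as an assumption. The function names delta and describe_roots are hypothetical.

```python
def delta(p, q):
    """Assumed discriminant of x^3 + p x + q: -4p^3 - 27q^2."""
    return -4 * p ** 3 - 27 * q ** 2

def describe_roots(p, q):
    """Describe the real roots of x^3 + p x + q from the sign of delta."""
    d = delta(p, q)
    if d > 0:
        return "three distinct real roots"
    if d < 0:
        return "one simple real root"
    return "a repeated real root"

print(describe_roots(-3, 0))   # x^3 - 3x has roots 0 and +/- sqrt(3)
print(describe_roots(1, 1))    # x^3 + x + 1 rises steadily
print(describe_roots(-3, -2))  # x^3 - 3x - 2 = (x + 1)^2 (x - 2)
```

With integer inputs the arithmetic is exact, so the borderline case $\delta = 0$ is detected reliably; with floating-point coefficients a tolerance would be needed.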
Exercise 3.23. Using Theorem 3.6, describe the nature of the roots of each
of these polynomials:
(i) $x^3 - 3x + 2$.
(ii) $x^3 + 6x - 20$.
(iii) $x^3 - 2x - 4$.
(iv) $x^3 - 7x - 6$.
$\delta = -108R.$
3.5 History
We closed our account of quadratic equations in Section 2.5 with Luca Paci-
oli, who summarized the mathematical knowledge of the time in 1494 [50].
Regarding cubic and quartic equations, Pacioli wrote [66, p. 47] that “it has
not been possible until now to form general rules.” This was the setting at
the dawn of the sixteenth century.
By the century’s end, general rules would be in place, thanks to the work
of Cardano and other Italian mathematicians. The story of these discoveries
is a wonderful one. Given the excellent accounts in [67] and [66], as well
as the 1953 biography Cardano: The Gambling Scholar [49] by the math-
ematician Øystein Ore, only a brief account will be given here. (Ore’s bi-
ography is unfortunately no longer in print. For more on Cardano, one can
also turn to Anthony Grafton’s Cardano’s cosmos: the worlds and works
of a Renaissance astrologer [33] as well as Cardano’s autobiography, The
Book of My Life [13], available in a 2002 edition that contains both Jean
Stoner’s 1929 English translation and an introduction by Grafton.)
An important point to keep in mind in preparation for the story is that
the academic culture at the time was not anything like that to which we are
accustomed. If someone were now to solve a centuries-long problem, he or
she would immediately announce it, lecture on it, and publish it, with the
written account being made available on the internet long before it actually
appears in print. Other mathematicians would study the solution, make sure
there are no errors, and pay close attention to the methods used for the solu-
tion, in anticipation that the methods and the new ideas they contain may be
applicable to other problems. In contrast, in sixteenth century Italy, scholars
would keep techniques of solution to themselves, perhaps employing them
to succeed at public competitions.
Scipione del Ferro was the first person to obtain a formula for solutions
to cubic equations. Del Ferro lived from 1465 until 1526, serving as a pro-
fessor at the University of Bologna for the final thirty years of his life. He
found a way to solve cubic equations of the form
$x^3 + px = q.$
Del Ferro and his contemporaries worked with positive numbers only, both
as coefficients and as solutions, so p and q are understood to be positive
and we seek positive values of $x$. An example would be $x^3 + 6x = 20$, the
cubic equation of Exercise 3.9.
We saw in Section 2.5 that seven centuries earlier, al-Khwarizmi had
classified quadratic equations into three forms (excluding cases in which
one of the coefficients is 0). Similarly, because of the restriction to positive
coefficients, there are three forms for cubic equations with no degree 2 term,
the form $x^3 + px = q$ that del Ferro studied as well as
$x^3 = px + q$
and
$x^3 + q = px.$
Were 0 available, we might momentarily be tempted to add a fourth case,
$x^3 + px + q = 0.$
But since p and q are positive, it cannot have a positive solution. Hence, we
are missing nothing by omitting it.
Del Ferro did not publish his solution, but he did communicate it to his
son-in-law, Annibale della Nave, and his colleague Antonio Maria Fior. In
1535, Fior challenged the Venetian mathematician Niccolò Fontana (1499–
1557) to a contest in which each would pose thirty problems to the other, the
loser paying for a banquet for thirty. Fontana, better known by his nickname
Tartaglia, or the stutterer, prepared problems of varying types. In contrast,
all of Fior’s problems were cubic equations of the type $x^3 + px = q$, with
del Ferro’s solution as his secret weapon.
Just before time expired, on the night that ran from February 12 to 13,
Tartaglia found a way to solve the equation $x^3 + px = q$ on his own! So
much for Fior’s secret weapon. Tartaglia solved all thirty of Fior’s problems,
whereas Fior could solve only some of Tartaglia’s. Victory was sufficient
satisfaction for Tartaglia, who chose to forgo the banquet. Of course, he had
also won eternal fame as independent co-discoverer, with del Ferro, of the
cubic’s solution.
The story now shifts to another participant, the one for whom the so-
lution to the cubic equation is named, Girolamo Cardano. Cardano was a
prominent scholar in many fields, famous as a doctor, astrologer, philoso-
pher, and mathematician. He lived from 1501 to 1576 and spent most of
his life in the city of his birth, Milan. In 1539, having heard of Tartaglia’s
solution to the cubic, he asked Tartaglia through an intermediary what the
solution was, but Tartaglia chose not to tell. Cardano then invited Tartaglia
to come to Milan as his guest, enticing him with the opportunity to meet Al-
fonso d’Avalos, the military commander of Milan, to whom Tartaglia would
be able to show some of his military inventions.
Like Cardano, Tartaglia was a man of many talents, including expertise
in ballistics and military engineering. In 1546, he would publish some of his
military work in Quesiti et Inventioni Diverse, or New Problems and Inven-
tions [63]. (Its drawings of cannons, cannonball paths, and fortifications are
reason enough to seek out a copy of the book.) Naturally, Tartaglia accepted
Cardano’s invitation. During his stay, Tartaglia told Cardano how he solved
the equation $x^3 + px = q$, with Cardano swearing an oath on March 29,
1539 never to publish it.
We should note at this point that the principal source for many details
of this story, including Cardano’s oath, is Tartaglia himself, as he would
devote the final pages of Quesiti et Inventioni Diverse to an account of his
dealings with Cardano.
After Tartaglia left, Cardano saw how to use Tartaglia’s ideas in order
to obtain solutions to the cubic equations of the forms $x^3 = px + q$ and
$x^3 + q = px$. Difficulties arise in these two cases because for certain
values of $p$ and $q$, the solutions may contain expressions with square roots
of negative numbers, as we saw in Exercise 3.12. (In contrast, this cannot
occur in the expression for a solution of the cubic equation $x^3 + px = q$.)
Nonetheless, in principle Cardano had found a solution to any cubic
equation without a degree 2 term, and as we know, it is then an elementary
matter to solve all cubic equations.
The fourth participant in our story is Lodovico Ferrari, who lived from
1522 to 1565. Ferrari came from Bologna to Milan at the age of 14 to work
as a servant in Cardano’s household. Cardano quickly realized Ferrari’s
talent, and Ferrari moved from servant to student to collaborator. Ferrari
learned from Cardano the method of solving cubic equations, then made
his own great contribution: the solution of quartic equations. What Ferrari
discovered is that one can reduce the problem of solving a quartic equation
to that of solving an auxiliary cubic equation, one whose coefficients are
expressed in terms of the coefficients of the original quartic. We will study
Ferrari’s discovery in Section 6.2.
Ferrari’s result was an advance of the greatest importance, but it posed
a difficulty for Cardano. He had given his oath to Tartaglia that he would
not publish Tartaglia’s solution to cubic equations of a special form. Yet,
Cardano had extended this to other cubic equations and his disciple Ferrari
had shown how to use it to solve quartic equations. This was too important
to keep secret. Moreover, he may have had a way out, since the formula was
initially due to del Ferro, not Tartaglia.
Cardano decided to publish the solution, which he did in the book Ars
Magna, written in Latin and published in 1545. (It is available in an English
translation by T. Richard Witmer, with the title The Great Art; or, The Rules
of Algebra [12].) He states clearly at the beginning of Ars Magna [12, pp. 8–
9] that the solution to the cubic equation in the special form $x^3 + px = q$
was discovered by del Ferro and re-discovered by Tartaglia:
In our own days Scipione del Ferro of Bologna has solved the case
of the cube and first power equal to a constant, a very elegant and ad-
mirable accomplishment. Since this art surpasses all human subtlety
and the perspicuity of mortal talent and is a truly celestial gift and a
very clear test of the capacity of men’s minds, whoever applies him-
self to it will believe that there is nothing he cannot understand. In
emulation of him, my friend Niccolò Tartaglia of Brescia, wanting not
to be outdone, solved the same case when he got into a contest with
his [Scipione’s] pupil, Antonio Maria Fior, and, moved by my many
entreaties, gave it to me. For I had been deceived by the words of Luca
Paccioli, who denied that any more general rule could be discovered
than his own. Notwithstanding the many things which I had already
discovered, as is well known, I had despaired and had not attempted
to look any further. Then, however, receiving Tartaglia’s solution and
seeking for the proof of it, I came to understand that there were a great
many other things that could also be had. Pursuing this thought and
with increased confidence, I discovered these others, partly by myself
and partly through Lodovico Ferrari, formerly my pupil. Hereinafter
those things which have been discovered by others have their names
attached to them; those to which no name is attached are mine. The
demonstrations, except for the three by [al-Khwarizmi] and the two by
Lodovico, are all mine.
Cardano turns to cubic equations in Chapter 11, “On the Cube and First
Power Equal to the Number,” which is devoted to equations of the form
$x^3 + px = q$. At the start of the chapter, he once again credits del Ferro
and Tartaglia [12, p. 96]:
Scipio Ferro of Bologna well-nigh thirty years ago discovered this rule
and handed it on to Antonio Maria Fior of Venice, whose contest with
Niccolò Tartaglia of Brescia gave Niccolò occasion to discover it. He
[Tartaglia] gave it to me in response to my entreaties though with-
holding the demonstration. Armed with this assistance, I sought out its
By modern standards, Cardano had given proper credit for the results
on which he and Ferrari built, and had satisfied all scholarly expectations.
But this was a different era, and Tartaglia was furious. As already noted, he
told his side of the story a year later in Quesiti et Inventioni Diverse [63],
making the story of the oath—and its text—public.
Cardano did not respond to Tartaglia’s accusations, but Ferrari, on Car-
dano’s behalf, issued a public challenge. The dispute was argued in Milan
on August 10, 1548, with Tartaglia leaving before it was settled and evi-
dently being named the loser. (See Ore’s discussion [49, pp. 99–105] for an
attempt to sort out the details from the available records.)
These discoveries make for a richly entertaining story, worthy of the
greatness of the discoveries themselves. As Varadarajan notes in conclud-
ing his account of the dispute [67, p. 62], “Tartaglia, through his penchant
for secrecy, represents the middle ages, while Cardano, with his generous
views about authorship and sharing of knowledge, represents the modern
view. Mathematics was very fortunate that a person with the great vision
and world-view like Cardano was able to get his hands on the original dis-
covery and was then able to build a wonderful structure on top of it that led
to the beginning of Algebra as we know today.” Ore offers the following
conclusion [49, p. 106]:
As for Cardano’s actual account of his formula, let us see what he says
in Chapter 11 of Ars Magna in his treatment of equations of the form
$x^3 + px = q$ [12, pp. 98–99]. Some of his terminology has been modernized
in the English translation. The terms binomium and apotome, which the
translation retains, refer to pairs of numbers such as $5 + \sqrt{3}$ and $5 - \sqrt{3}$.
the cube root of the apotome from the cube root of the binomium, the
remainder [or] that which is left is the value of x.
For example,
$x^3 + 6x = 20.$
Cube 2, one-third of 6, making 8; square 10, one-half of the constant;
100 results. Add 100 and 8, making 108, the square root of which is
$\sqrt{108}$. This you will duplicate: to one add 10, one-half the constant,
and from the other subtract the same. Thus you will obtain the binomium
$\sqrt{108} + 10$ and its apotome $\sqrt{108} - 10$. Take the cube roots
of these. Subtract [the cube root of the] apotome from that of the binomium
and you will have the value of $x$:
$\sqrt[3]{\sqrt{108} + 10} - \sqrt[3]{\sqrt{108} - 10}.$
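Cardano's recipe for this example can be followed step by step in Python. The sketch below is only a transcription of the quoted steps for an equation $x^3 + px = q$ with $p$ and $q$ positive; the function name cardano_root is hypothetical, and floating-point cube roots mean the result is approximate.

```python
import math

def cardano_root(p, q):
    """Follow Cardano's steps for x^3 + p x = q, assuming p > 0 and q > 0."""
    # Cube one-third of the coefficient of x, square one-half of the constant,
    # add the two results, and take the square root of the sum.
    square_root = math.sqrt((p / 3) ** 3 + (q / 2) ** 2)
    binomium = square_root + q / 2  # for the example: sqrt(108) + 10
    apotome = square_root - q / 2   # for the example: sqrt(108) - 10
    # Subtract the cube root of the apotome from that of the binomium.
    return binomium ** (1 / 3) - apotome ** (1 / 3)

x = cardano_root(6, 20)
print(x, x ** 3 + 6 * x)
```

For $x^3 + 6x = 20$ the computed value is very close to 2, and indeed $2^3 + 6 \cdot 2 = 20$.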
4
Complex Numbers
Our experience with Cardano’s formula has taught us that to solve cubic
equations, we need to work with numbers such as $\sqrt{-3}$, that is, square roots
of negative numbers. Cubic equations were where they were first encoun-
tered. They are now called complex numbers, and have been found to have
many important uses in mathematics and science. In this chapter, we will
introduce them and see how to calculate their nth powers and roots, which
is needed in our study of polynomial equations.
Suppose we are given two complex numbers, $a + b\sqrt{-1}$ and $c + d\sqrt{-1}$.
How do we add and multiply them? Let’s do what comes naturally and see
what happens. This will lead us to definitions of sum and product.
For addition, if we rearrange and combine terms, we obtain
$(a + b\sqrt{-1}) + (c + d\sqrt{-1}) = a + c + b\sqrt{-1} + d\sqrt{-1}$
$= (a + c) + (b + d)\sqrt{-1}.$
This would appear to be a sensible definition of addition, and it is what we
adopt. For example,
$(2 + 3\sqrt{-1}) + (8 + 2\sqrt{-1}) = 10 + 5\sqrt{-1}.$
How about multiplication? Let’s use the distributive law and multiply
the same two numbers:
$(2 + 3\sqrt{-1})(8 + 2\sqrt{-1}) = 2 \cdot 8 + 2 \cdot 2\sqrt{-1} + 3\sqrt{-1} \cdot 8 + 3\sqrt{-1} \cdot 2\sqrt{-1}.$
It is natural to expect that $\sqrt{-1}$ should commute with real numbers so that
$\sqrt{-1} \cdot 8 = 8 \cdot \sqrt{-1}$. Assuming this and rearranging terms, we obtain
$(2 + 3\sqrt{-1})(8 + 2\sqrt{-1}) = 2 \cdot 8 + 2 \cdot 2\sqrt{-1} + 3 \cdot 8\sqrt{-1} + 3 \cdot 2(\sqrt{-1})^2.$
If $\sqrt{-1}$ is to make sense as a number, then it must have the property that its
square is $-1$:
$(\sqrt{-1})^2 = -1.$
If we assume this, then we find that
$(2 + 3\sqrt{-1})(8 + 2\sqrt{-1}) = 16 + 4\sqrt{-1} + 24\sqrt{-1} + 6(-1) = 10 + 28\sqrt{-1}.$
More generally, suppose we wish to multiply $a + b\sqrt{-1}$ and $c + d\sqrt{-1}$.
Proceeding in the same way, assuming $\sqrt{-1} \cdot c = c \cdot \sqrt{-1}$ and
$(\sqrt{-1})^2 = -1$, we will obtain
$(a + b\sqrt{-1})(c + d\sqrt{-1}) = (ac - bd) + (ad + bc)\sqrt{-1}.$
Let us adopt this as our definition of multiplication. The product of two
complex numbers is again a complex number, as would need to be the case
if the definition is to be of any use.
Let’s summarize. Given complex numbers $a + b\sqrt{-1}$ and $c + d\sqrt{-1}$,
we have adopted the rules that their sum and product are given by
$(a + b\sqrt{-1}) + (c + d\sqrt{-1}) = (a + c) + (b + d)\sqrt{-1}$
and
$(a + b\sqrt{-1})(c + d\sqrt{-1}) = (ac - bd) + (ad + bc)\sqrt{-1}.$
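The two rules just summarized translate directly into code. In the Python sketch below, a complex number $a + b\sqrt{-1}$ is represented by the pair (a, b); this representation and the function names add and multiply are choices made for illustration, not notation from the text. The last line compares the result with Python's built-in complex type as an independent check.

```python
def add(z, w):
    """Add a + b*sqrt(-1) and c + d*sqrt(-1), given as pairs (a, b) and (c, d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def multiply(z, w):
    """Multiply two complex numbers given as pairs, using (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

print(add((2, 3), (8, 2)))       # (10, 5)
print(multiply((2, 3), (8, 2)))  # (10, 28)
print(complex(2, 3) * complex(8, 2) == complex(*multiply((2, 3), (8, 2))))  # True
```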
$(a + bi) + (c + di) = (a + c) + (b + d)i$
A positive real number $r$ has a real square root $\sqrt{r}$. When we square
a pure imaginary number $\sqrt{r}\,i$ according to our multiplication rule, we
obtain
$(\sqrt{r}\,i)^2 = -r.$
Thus $\sqrt{r}\,i$ is a square root of $-r$. By creating the new number $i$ to serve
as a square root of $-1$, we have obtained square roots for all negative real
numbers.
A notion that we will need is that of complex conjugate. The complex
conjugate, or simply the conjugate, of a complex number $a + bi$ is the
complex number $a - bi$. The conjugate of a complex number $r$ is denoted
by putting a bar over $r$ to get $\bar{r}$. In this notation, we would write for example
that
$\overline{2 + 3i} = 2 - 3i$
and
$\overline{12i} = -12i.$
The complex conjugate of a real number $a$ is $a$ itself and the complex
conjugate of a pure imaginary number $bi$ is its opposite $-bi$.
A multiplicative inverse of a number $r$ is a number $s$ satisfying $rs = 1$.
For instance, the multiplicative inverse of 1 is 1 and the multiplicative
inverse of 2 is 1/2. What is the multiplicative inverse of $\pi$? It is $1/\pi$, of
course. But this isn’t an answer. It’s a notation. The fact that there is a real
number $s$ for which $\pi s = 1$ is by no means obvious. It requires a proof, one
that depends on foundational results about the construction of real numbers.
(See, for instance, [38, pp. 71–73] for a discussion.)
Let us accept the truth of the statement that every non-zero real num-
ber r has a multiplicative inverse, which we will write as 1=r . Thanks to
conjugation, we can deduce from this that non-zero complex numbers have
multiplicative inverses.
(ii) Assume that $a + bi \ne 0$. Describe (in terms of $a$ and $b$) the real and
imaginary parts of a complex number that is a multiplicative inverse of
$a + bi$.
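Readers who have worked the exercise may want a way to check their answer. One standard route, consistent with the remark that conjugation is the key, multiplies $a + bi$ by its conjugate $a - bi$ to get the real number $a^2 + b^2$ and then divides. The Python sketch below follows that route with exact fractions; the function name inverse is hypothetical, and the formula is offered as a check, not as the text's own solution.

```python
from fractions import Fraction

def inverse(a, b):
    """Return (real, imaginary) parts of 1/(a + bi), assuming a + bi != 0."""
    a, b = Fraction(a), Fraction(b)
    norm = a * a + b * b  # (a + bi)(a - bi) = a^2 + b^2, a non-zero real number
    if norm == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return (a / norm, -b / norm)

re, im = inverse(3, 4)
print(re, im)  # 3/25 -4/25
# Multiply back using (ac - bd) + (ad + bc)i; the product should be 1 + 0i.
print(3 * re - 4 * im, 3 * im + 4 * re)  # 1 0
```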
Let’s derive the basic facts on conjugation and the arithmetic operations
for complex numbers.
and
$\overline{rs} = \bar{r}\,\bar{s}.$
$\overline{r^n} = \bar{r}^n$
Let’s restate one part of Exercise 4.4 as a theorem, for later reference.
Exercise 4.5. Prove Theorem 4.2. (Hint: Apply Exercise 4.4 to calculate
$f(\bar{r})$.)
$x^2 + bx + c = (x - r)(x - \bar{r})$
$(x - a_1)(x - a_2).$
We saw in Theorem 2.5 that the nature of the roots of a quadratic polynomial
$x^2 + bx + c$ is determined by the sign of $b^2 - 4c$, which we denoted
$\delta$. In the next exercise, we will see that the nature is determined by the sign
of the square of the difference of the polynomial’s roots.
Exercise 4.7. Suppose $b$ and $c$ are real numbers. Write $r_1$ and $r_2$ for the
roots of $x^2 + bx + c$ and $\Delta$ for $(r_1 - r_2)^2$.
(i) Conclude from Theorem 4.3 that there are three mutually exclusive
possibilities for $r_1$ and $r_2$: they are real and distinct, or real and
coincident, or distinct complex conjugates of each other.
(ii) If $r_1 = r_2$, then $\Delta = 0$.
(iii) If $r_1$ and $r_2$ are real and distinct, then $\Delta > 0$.
(iv) The only remaining possibility is that $r_1$ and $r_2$ are not real but are
complex numbers, each the conjugate of the other. Suppose $r_1 = m + ni$
and $r_2 = m - ni$, with $m$ and $n$ real numbers and with $n \ne 0$. (If
$n = 0$, the roots are real.) Calculate $\Delta$ and show that it is a negative
real number.
(v) We wish to show that the converses of the results in (ii)-(iv) hold as
well; that is:
(a) If $\Delta > 0$, then $r_1$ and $r_2$ are real and distinct;
(b) if $\Delta = 0$, then $r_1$ is a real number and $r_1 = r_2$;
(c) if $\Delta < 0$, then $r_1$ and $r_2$ are non-real, complex numbers, each the
conjugate of the other.
(vi) Assume that $\Delta > 0$. We are to prove that $r_1$ and $r_2$ are real and
distinct. Suppose this is not the case and deduce from the first part of the
exercise that either $r_1$ and $r_2$ coincide and are real or they are non-real
complex conjugates. Use other parts of the proof to obtain the
contradiction that $\Delta \le 0$. Conclude, as desired, that $r_1$ and $r_2$ are real and
distinct.
(vii) Make similar arguments to prove the other two converses.
(viii) Conclude that the sign of $\Delta$ determines the nature of the roots of
$x^2 + bx + c$.
Theorem 4.4. Let $b$ and $c$ be real numbers and let $r_1$ and $r_2$ be the roots
(real or complex) of the quadratic polynomial $x^2 + bx + c$, so that
$x^2 + bx + c$ factors as $(x - r_1)(x - r_2)$. Let $\Delta$ be the discriminant of $x^2 + bx + c$,
which by definition is $(r_1 - r_2)^2$.
(i) $\Delta$ determines the nature of the roots of $x^2 + bx + c$: If $\Delta > 0$, then
the roots are real and distinct; if $\Delta = 0$, then there is one root, real of
multiplicity 2; if $\Delta < 0$, then the roots are a pair of non-real complex
conjugates.
(ii) $\Delta$ can be calculated in terms of the coefficients:
$\Delta = b^2 - 4c.$
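Both parts of the theorem can be illustrated with a small Python sketch. It starts from a pair of roots, expands $(x - r_1)(x - r_2)$ to obtain $b$ and $c$, and then compares $(r_1 - r_2)^2$ with $b^2 - 4c$. The function names and the three sample root pairs (distinct real, coincident real, and complex conjugate) are hypothetical choices for illustration.

```python
def coefficients(r1, r2):
    """Expand (x - r1)(x - r2) = x^2 + b x + c and return (b, c)."""
    return -(r1 + r2), r1 * r2

def discriminant_from_roots(r1, r2):
    """Return the square of the difference of the roots."""
    return (r1 - r2) ** 2

examples = [(1, 2), (3, 3), (complex(1, 2), complex(1, -2))]
for r1, r2 in examples:
    b, c = coefficients(r1, r2)
    # The two printed values should agree: positive, zero, then negative.
    print(discriminant_from_roots(r1, r2), b * b - 4 * c)
```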
Theorem 4.5. Let $f(x)$ be a monic, cubic polynomial. Exactly one of (i)–(iv)
occurs.
(i) $f(x)$ has one real root $a$, of multiplicity 3, and factors as
$(x - a)^3.$
(ii) $f(x)$ has two distinct real roots $a_1$ and $a_2$, of multiplicities 1 and 2,
and factors as
$(x - a_1)(x - a_2)^2.$
(iii) $f(x)$ has three distinct simple real roots $a_1$, $a_2$, and $a_3$, and factors as
$(x - a_1)(x - a_2)(x - a_3).$
(iv) $f(x)$ has one real root $a$ and two distinct non-real, complex roots $r$
and $\bar{r}$, and factors as
$(x - a)(x - r)(x - \bar{r}).$
$(x + yi)^2 = a + bi$
for real values of the unknowns $x$ and $y$. We view $a$ and $b$ as known, fixed
constants.
(i) Expand the equation $(x + yi)^2 = a + bi$ and obtain two simultaneous
equations for $x$ and $y$ with real coefficients.
(ii) Using one of the equations and the fact that $b \ne 0$, show that if these
equations have solutions, $x$ and $y$ will be non-zero.
(iii) Using this, divide both sides of one equation by $x$ to get an equation
expressing $y$ in terms of $x$ and substitute this into the other equation
to get a single equation in $x$.
(iv) To find a square root of $a + bi$, we need to solve a single equation in $x$
involving only real numbers. Clear denominators in this equation and
obtain a degree 4 or quartic equation in $x$.
(v) It can be regarded as a degree 2 or quadratic equation in $x^2$. Use the
quadratic formula to obtain two values for $x^2$ in terms of $a$ and $b$.
(vi) We are looking for a real value of $x$ that solves the equation. If one of
our expressions for $x^2$ is a positive real number, then its two square
roots are our desired values for $x$. Verify that one of the expressions
for $x^2$ is positive.
(vii) Take square roots to obtain two values of $x$. Using an equation you
obtained earlier in the problem, obtain the two corresponding values
for $y$.
(viii) Write the formulas for the two square roots of $a + bi$. Notice that each
is $-1$ times the other.
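The steps of this exercise can be sketched in Python as a way to check one's work. The sketch assumes the two simultaneous equations are $x^2 - y^2 = a$ and $2xy = b$, which lead to the quadratic $4(x^2)^2 - 4a(x^2) - b^2 = 0$ in $x^2$; readers should derive these themselves before relying on the code. The function name complex_square_roots is hypothetical.

```python
import math

def complex_square_roots(a, b):
    """Return the two square roots of a + bi (with b != 0) as (x, y) pairs."""
    if b == 0:
        raise ValueError("this sketch assumes b != 0")
    # Positive solution of the quadratic in x^2: (a + sqrt(a^2 + b^2)) / 2.
    x_squared = (a + math.sqrt(a * a + b * b)) / 2
    x = math.sqrt(x_squared)
    y = b / (2 * x)  # from 2xy = b
    return (x, y), (-x, -y)

print(complex_square_roots(3, 4))  # ((2.0, 1.0), (-2.0, -1.0))
```

For $3 + 4i$ the sketch returns $2 + i$ and $-2 - i$, and indeed $(2 + i)^2 = 4 - 1 + 4i = 3 + 4i$.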
We have just shown that any non-zero complex number has two com-
plex numbers as its square roots, each the opposite of the other. We will
record part of this.
(ii) Use Theorem 4.6 to deduce that $x^2 + bx + c$ has a root in the complex
numbers, and so factors as a product of two linear polynomials.
Theorem 4.7. Let $b$ and $c$ be complex numbers. There exist complex numbers
$r_1$ and $r_2$ (possibly coincident) such that
$x^2 + bx + c = (x - r_1)(x - r_2).$
In Exercise 4.11, we have proved that complex square roots exist and
shown how to determine them explicitly in terms of square roots of posi-
tive real numbers. We wish to be able to compute cube roots of complex
numbers as well, since this is essential in using Cardano’s formula. Our ex-
perience in Exercise 4.11 suggests that we should be able to calculate cube
roots of complex numbers in terms of cube roots of real numbers. Let’s
mimic what we did for square roots and see how far we can get.
Exercise 4.13. Fix real numbers $a$ and $b$; they should be regarded as constants,
not as variables. Assume $b \ne 0$. We wish to determine the cube roots
of the non-real complex number $a + bi$; that is, we wish to find a complex
number $m + ni$ such that $(m + ni)^3 = a + bi$. We can introduce variables
$x$ and $y$ for the unknown real numbers $m$ and $n$, so that finding a cube root
amounts to solving the equation
$(x + yi)^3 = a + bi$
Exercise 4.14. Begin with the two equations for x and y obtained in Exercise 4.13.
(i) Using one of the equations and the assumption that $b \neq 0$, show that
if these equations have solutions, then y will be non-zero.
(ii) Introduce a new variable s and set $x = sy$, so that $s = x/y$. This
makes sense, since we know that y can’t be 0. Substitute sy for x in
the two equations for x and y to get two equations in s and y. Write
them so that each is an expression in s times $y^3$. Using the assumption
that $b \neq 0$, show that $3s^2 - 1 \neq 0$.
(iii) Solve one equation for $y^3$, substitute in the other, and obtain a cubic
equation in s with coefficients involving a, b, and 3.
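The outcome of part (iii) can be sanity-checked numerically. The sketch below is only an illustration, not part of the text: it assumes that expanding $(x + yi)^3$ gives the two equations $a = x^3 - 3xy^2$ and $b = 3x^2y - y^3$, so that one form of the cubic in s is $bs^3 - 3as^2 - 3bs + a = 0$. The function name is hypothetical, and the cube root is found with Python's built-in complex power rather than by the method of the exercise.

```python
def ratio_cubic_residual(a, b):
    """Evaluate b*s^3 - 3*a*s^2 - 3*b*s + a at s = x/y, where x + yi is
    the principal cube root of a + bi. Assumes b != 0, so that y != 0."""
    w = complex(a, b) ** (1 / 3)  # numerical principal cube root
    x, y = w.real, w.imag
    s = x / y
    return b * s**3 - 3 * a * s**2 - 3 * b * s + a

# (2 + i)^3 = 2 + 11i, so here x = 2, y = 1, and s = 2.
print(abs(ratio_cubic_residual(2.0, 11.0)))  # should be close to 0
```

A residual near zero is consistent with s satisfying the assumed cubic; it does not replace the algebra the exercise asks for.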
Figure 4.1. The Point $1 + \sqrt{3}i$ in the Plane (figure not reproduced; it marks the points (0, 0) and (1, 0) and the angle $\pi/3$)
Points in the plane can also be specified by their polar coordinates. The
polar coordinates of a point p in the plane are numbers r and $\theta$, with r
representing the distance from p to the origin of the plane and $\theta$ representing
the angle formed by the positive ray of the x-axis and the line connecting p
to the origin. The angle is measured in radians, proceeding counterclockwise
from the x-axis to the line with p on it. Negative angles indicate that
we proceed in a clockwise direction from the x-axis.
Let’s write a pair of polar coordinates as $[r, \theta]$, using the brackets to
indicate that the pair of numbers is not the usual cartesian coordinates. For
example, the point $(1, \sqrt{3})$ (in cartesian coordinates) can be written in polar
coordinates as $[2, \pi/3]$. See Figure 4.1.
Polar coordinates have some disadvantages. The principal one is that a
point p is described by infinitely many different pairs of polar coordinates.
If we move around the plane along a circle, tracing an angle of $2\pi$, we return
to the point at which we started. Thus, in addition to $[r, \theta]$, the point p also
has polar coordinates $[r, \theta + 2\pi]$. One more trip around the circle and we
return to the same point, but this time with polar coordinates $[r, \theta + 4\pi]$.
More generally, for any integer n, the point has polar coordinates
$[r, \theta + 2n\pi]$. The origin can be described in still more ways, since it has polar
coordinates $[0, \theta]$ for arbitrary $\theta$.
Polar coordinates have advantages as well (or else we would not intro-
duce them). One advantage is that certain figures in the plane can be de-
scribed as the graphs of equations that take on an especially simple form in
polar coordinates. For example, $r = 1$ is the equation in polar coordinates
of a circle of radius 1 with the origin as its center.
In cartesian coordinates, the circle is described by the equation
$x^2 + y^2 = 1$, and we can use it to define the cosine and sine functions for
arbitrary real numbers $\theta$. Given a point p on the circle such that the radius
from the origin to p forms an angle $\theta$ with the x-axis (measured counterclockwise
from the x-axis to the radius), the x and y coordinates of p are
the cosine and sine of $\theta$:
(ii) Conclude that a point with cartesian coordinates $(a, b)$ and polar
coordinates $[r, \theta]$ satisfies
$$a = r\cos\theta; \quad b = r\sin\theta.$$
Exercise 4.16. Let a and b be real numbers. We wish to find polar coordinates
$[r, \theta]$ for the point described in cartesian coordinates by $(a, b)$.
(i) If $(a, b)$ has polar coordinates $[r, \theta]$, explain why $(a, -b)$ has polar
coordinates $[r, -\theta]$.
(ii) Describe r in terms of a, b, and square roots.
(iii) Assume $b \geq 0$. Describe $\theta$ in terms of a, b, and the inverse cosine
function.
(iv) What is $\theta$ if $b < 0$?
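The conversion this exercise asks for can be sketched in a few lines of Python. This is a minimal illustration rather than part of the text: the name `to_polar` is hypothetical, and the code assumes the usual answers, namely that r is the square root of $a^2 + b^2$ and that $\theta$ is the inverse cosine of $a/r$ when $b \geq 0$ and its negative when $b < 0$.

```python
import math

def to_polar(a, b):
    """Return one pair (r, theta) of polar coordinates for the point (a, b)."""
    r = math.hypot(a, b)               # sqrt(a^2 + b^2)
    if r == 0:
        return (0.0, 0.0)              # the origin: any angle would do
    ratio = max(-1.0, min(1.0, a / r)) # guard against rounding just past +-1
    theta = math.acos(ratio)           # an angle between 0 and pi
    return (r, theta if b >= 0 else -theta)

print(to_polar(1.0, math.sqrt(3.0)))   # close to (2, pi/3)
```

The returned angle lies between $-\pi$ and $\pi$; adding any multiple of $2\pi$ gives another valid pair.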
for real numbers r and $\theta$, with r and $\theta$ chosen to be polar coordinates of the
point with cartesian coordinates $(a, b)$. Exercise 4.16 shows how to express
r and $\theta$ explicitly in terms of a and b. We call r the length or magnitude of
the complex number $a + bi$ and $\theta$ the angle of inclination of $a + bi$, since it
represents the angle at which the line through $(a, b)$ and $(0, 0)$ inclines with
respect to the x-axis. Often, $\theta$ is called the argument of $a + bi$.
Recall that in Exercise 4.1 we proved that real numbers commute with
i under multiplication. Thus, the complex number $a + bi$ is the same as
$a + ib$. Sometimes, when working with complicated expressions for the
imaginary part, the $a + ib$ form is preferable, in that it eliminates the need
for extra parentheses. Thus, for example, we might prefer $r\cos\theta + ir\sin\theta$
to $r\cos\theta + (r\sin\theta)i$. Sometimes it’s convenient to write $r\cos\theta + ri\sin\theta$.
For example, we might write $2\cos(\pi/3) + 2i\sin(\pi/3)$ rather than putting
the i before the 2 or after the sine term.
Let’s see what multiplication by $-1$ does to the plane. Suppose $[s, \phi]$ are
polar coordinates for the point $(a, b)$, so that
$$a + bi = s\cos\phi + si\sin\phi.$$
Since $\cos(\phi + \pi) = -\cos\phi$ and $\sin(\phi + \pi) = -\sin\phi$, we have
$$-(a + bi) = s\cos(\phi + \pi) + si\sin(\phi + \pi).$$
Thus, multiplication by $-1$ leaves the length of each point in the plane
unchanged and adds $\pi$ to its angle of inclination. This results in a rotation of
the plane around the origin by $\pi$, or 180°. Given a positive real number r,
it follows that multiplication by $-r$ is an expansion or contraction of the
plane followed by a 180° rotation.
We wish to extend this analysis to obtain a geometric description of
multiplication of complex numbers by an arbitrary complex number. The
essential case to handle is that of multiplication by i , the most basic of
non-real complex numbers. First we check that multiplication of complex
numbers satisfies the associative law.
$$(pq)r = p(qr).$$
verify that
Theorem 4.8. Let u be a complex number with length 1 and angle of inclination
$\theta$, so that
$$u = \cos\theta + i\sin\theta.$$
Let v be a complex number with length s and angle of inclination $\phi$, so that
$$v = s\cos\phi + si\sin\phi.$$
Then the product uv has length s and angle of inclination $\theta + \phi$:
$$uv = s\cos(\theta + \phi) + si\sin(\theta + \phi).$$
Thus, multiplication by u effects a rotation of the complex plane
counterclockwise through the angle $\theta$.
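A quick numerical check of the statement, using Python's standard `cmath` module, might look as follows. The sample values of the length and the two angles are arbitrary choices made for this illustration, not values taken from the text.

```python
import cmath

theta = 0.7          # angle of inclination of u (arbitrary sample value)
s, phi = 2.5, 1.1    # length and angle of inclination of v (sample values)

u = cmath.rect(1.0, theta)   # cos(theta) + i sin(theta)
v = cmath.rect(s, phi)       # s cos(phi) + s i sin(phi)

length, angle = cmath.polar(u * v)
print(length, angle)         # close to s and theta + phi
```

Note that `cmath.polar` reports an angle between $-\pi$ and $\pi$, so for larger sample angles the printed value can differ from the sum of the two angles by a multiple of $2\pi$.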
Theorem 4.8 will follow from the standard trigonometric identities for
the sine and cosine of the sum of two angles. These are sufficiently impor-
tant that we should record them.
Theorem 4.9. Let $\theta$ and $\phi$ be two real numbers. Then
$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$$
and
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi.$$
We should also take a moment to review how Theorem 4.9 is proved, to
ensure that we avoid any circularity in our reasoning. Some books provide
a proof of Theorem 4.9 as a consequence of de Moivre’s formula, which
we will prove in Section 4.6 as a consequence of Theorem 4.9. This is the
circularity we wish to avoid.
Fortunately, there are many other ways to prove Theorem 4.9. See Chap-
ter 6 of Eli Maor’s Trigonometric Delights [40, pp. 87–94] for one based
on the classical result of plane geometry known as Ptolemy’s theorem.
This states, for a quadrilateral inscribed in a circle, that the product of the
lengths of the quadrilateral’s two diagonals equals the sum of the products
of lengths of the two pairs of opposite sides. (Ptolemy is the great Alexan-
drian astronomer and mathematician of the second century C.E.)
Thus, we may as well consider the general case. However, we will lay out
the issues in such a way that the reader can ignore the general case and
focus on the cases $n = 2$ and $n = 3$.
One way to derive de Moivre’s formula is to use trigonometry and Theo-
rem 4.9. We will take this approach in a moment. But first let’s see how eas-
ily it follows from Euler’s formula, a result due to Leonhard Euler (1707–
1783), the greatest mathematician of the eighteenth century.
Euler’s formula involves the number e, whose central role in mathemat-
ics Euler first discovered and whose notation he introduced. Readers who
have studied calculus will be familiar with it as the number most naturally
used for exponentiation. What makes it so natural is that the exponential
function $e^x$ with base e is its own derivative, in contrast to other
exponential functions such as $2^x$ and $10^x$. Another attractive feature is that its
inverse function $\log_e x$ is the antiderivative of $1/x$. For readers unfamiliar
with e, suffice to say that it is an irrational number with decimal expansion
that begins 2.71828 and that for theoretical reasons it is the best base for
studying exponentiation.
Euler’s formula relates exponential and trigonometric functions:
$$e^{i\theta} = \cos\theta + i\sin\theta.$$
At first glance, the formula may appear both mysterious and wondrous.
For those readers familiar with power series expansions of functions, the
mystery is easily addressed, but the wonder remains.
In a calculus course, it is shown that the sine and cosine functions have
power series expansions
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
and
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots,$$
and the exponential function satisfies
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots.$$
The expansions are valid for any real number x.
Once complex numbers are introduced, it is not a large leap to replace
real numbers in the power series expansion of $e^x$ with pure imaginary
numbers. Suppose for instance that $\theta$ is a real number and let $x = i\theta$. If we
substitute $i\theta$ for x in the power series expansion of $e^x$ and use the
equalities $i^2 = -1$, $i^3 = -i$, and $i^4 = 1$, we find that
$$e^{i\theta} = 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} - \cdots.$$
Collecting real and imaginary terms on the right side of this equation, we
obtain $\cos\theta + i\sin\theta$, yielding Euler’s formula. That’s all there is to it.
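The power series argument can be watched at work numerically. The sketch below sums the first terms of the exponential series at a purely imaginary argument and compares the result with the cosine and sine. The function name `exp_series` and the choice of 30 terms are assumptions made for this illustration only.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z/1! + z^2/2! + ... of the exponential series."""
    total = 0j
    term = 1 + 0j                 # current term z^k / k!, starting at k = 0
    for k in range(terms):
        total += term
        term *= z / (k + 1)       # turn z^k / k! into z^(k+1) / (k+1)!
    return total

theta = 2.0
approx = exp_series(1j * theta)
exact = complex(math.cos(theta), math.sin(theta))
print(abs(approx - exact))        # should be very small
```

For moderate values of the angle, thirty terms are far more than needed, since the factorials in the denominators grow much faster than the powers in the numerators.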
Given a real number $\theta$ and a positive integer n, the rules of exponentiation
yield
$$(e^{i\theta})^n = e^{in\theta}.$$
Reinterpreting the left and right sides of this equation via Euler’s formula,
we see that
$$(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.$$
(ii) Square $\cos\theta + i\sin\theta$, combine real and imaginary terms, and use
Theorem 4.9 to show that
(iii) Multiply both sides of the last equality by $\cos\theta + i\sin\theta$ and use
Theorem 4.9 to show that
(iv) We can continue this process step by step to obtain the general result.
Suppose k is a positive integer less than n and we know that
Multiply both sides of this equality by $\cos\theta + i\sin\theta$ and use Theorem
4.9 to deduce that
$$r\cos\theta + ri\sin\theta$$
for some positive real number r and some real $\theta$. Write $\sqrt{r}$ for the positive
real square root of r.
(i) Using Theorem 4.12, verify that the two complex numbers
$$\sqrt{r}\cos\frac{\theta}{2} + \sqrt{r}\,i\sin\frac{\theta}{2}$$
and
$$\sqrt{r}\cos\left(\frac{\theta}{2} + \pi\right) + \sqrt{r}\,i\sin\left(\frac{\theta}{2} + \pi\right)$$
are square roots of c.
(ii) Check that each is the opposite of the other, as we would expect.
(iii) Use the formula to compute square roots of
(a) 4 and $-4$.
(b) i and $-i$.
(c) $1 + i$ and $-1 + i$.
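The two formulas in part (i) translate directly into a short numerical routine. The following is a sketch for checking answers to part (iii), with the hypothetical name `polar_square_roots`; it relies on `cmath.polar` to supply the length and one angle of inclination.

```python
import cmath
import math

def polar_square_roots(c):
    """Return the two square roots of a non-zero complex number c,
    built from its length r and an angle of inclination theta."""
    r, theta = cmath.polar(c)
    first = cmath.rect(math.sqrt(r), theta / 2)
    second = cmath.rect(math.sqrt(r), theta / 2 + math.pi)
    return first, second

for c in (4, -4, 1j, -1j, 1 + 1j, -1 + 1j):
    first, second = polar_square_roots(c)
    print(c, first, second)
```

The printed values are floating-point approximations, so a root such as 2i may appear with a tiny non-zero real part.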
$$r\cos\theta + ri\sin\theta$$
We can use Exercise 4.27 to redo some cube root calculations that we
made earlier when we used Cardano’s formula, as illustrated next.
(ii) In the same way, find the three cube roots of $-3 - 10\sqrt{-3}/9$.
(iii) Alternatively, use Theorem 4.1 to find its three cube roots.
(iv) Compare the results to those arising from the calculation of cube roots
in Exercise 3.13.
Exercise 4.31. For small n, we can write down the nth roots of unity as
ordinary complex numbers, without trigonometric functions in their expres-
sions. We have already done so in several cases. Let’s review and extend
what we know. Describe without cosines and sines the two square roots of
1, the three cube roots of 1, the four fourth roots of 1, the six sixth roots of
1, and the eight eighth roots of 1.
The two square roots of 1 are the powers of $-1$, the three cube roots of
1 are the powers of the number we have called $\omega$, and the four fourth roots
of 1 are the powers of i. For an arbitrary positive integer n, let $\zeta_n$ be the
number
$$\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$
Here, $\zeta$ is the Greek letter zeta. Notice that $\zeta_2$ is $-1$ and $\zeta_3$ is $\omega$, while $\zeta_4$ is
i.
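The definition is easy to experiment with numerically. In the sketch below, the names `zeta` and `roots_of_unity` are hypothetical helpers introduced for this illustration; the code lists the powers of $\zeta_n$ and lets us confirm that each one, raised to the nth power, returns 1 up to rounding error.

```python
import math

def zeta(n):
    """The number cos(2*pi/n) + i sin(2*pi/n)."""
    return complex(math.cos(2 * math.pi / n), math.sin(2 * math.pi / n))

def roots_of_unity(n):
    """The powers 1, zeta_n, zeta_n^2, ..., zeta_n^(n-1)."""
    z = zeta(n)
    return [z**k for k in range(n)]

for w in roots_of_unity(8):
    print(w, abs(w**8 - 1))   # second value should be close to 0
```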
We learn different parts of Theorem 4.15 early in our mathematical ed-
ucation. One might not think to state it as a theorem, but it provides the
template for the results that follow.
Theorem 4.15 has an analogue for any integer $n > 2$, with 1 and $-1$
replaced by the nth roots of unity. We will need the appropriate analogue
for $n = 3$ in Section 5.3.
Exercise 4.32. Prove Theorem 4.16. The first part has already been shown
in Exercise 3.15 (or Exercise 4.31). Verify the second part by direct calcu-
lation, and use it to prove the third and fourth parts.
(ii) $1 + i + i^2 + i^3 = 0$.
Here is the theorem that generalizes Theorems 4.15, 4.16, and 4.17:
Theorem 4.18. Let n be a positive integer and let $\zeta_n$ be the number
$\cos(2\pi/n) + i\sin(2\pi/n)$.
(i) The n distinct nth roots of unity are $1, \zeta_n, \zeta_n^2, \ldots, \zeta_n^{n-1}$, with
$(\zeta_n)^n = 1$.
$$x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + \cdots + x^2 + x + 1).$$
Then substitute $\zeta_n$ for x and use the fact that $\zeta_n \neq 1$ to deduce the
desired result.
(iii) The proof of Theorem 1.5 works just as well for complex roots of a
polynomial as for real roots. Deduce that Theorem 1.5 can be applied
to show that
(iv) Conclude that this reduces the problem of calculating nth roots of posi-
tive real numbers to that of computing certain values of the exponential
and logarithm functions and dividing by n.
The lesson of Exercise 4.35 is that applying the logarithm converts ex-
traction of an nth root of a real number to division by n.
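As a small illustration of this lesson, the following Python sketch computes an nth root of a positive real number using only the logarithm, a division by n, and the exponential. The function name `real_nth_root` is an assumption made for this example.

```python
import math

def real_nth_root(a, n):
    """nth root of a positive real number a: take the logarithm,
    divide by n, and apply the exponential function."""
    return math.exp(math.log(a) / n)

print(real_nth_root(32.0, 5))   # close to 2
```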
$$\theta = \arccos a.$$
(i) Use Theorem 4.13 to deduce that one nth root of $\cos\theta + i\sin\theta$ is
$$\cos(\theta/n) + i\sin(\theta/n).$$
(iii) We may rewrite this in another way. The sine and cosine functions
satisfy
$$\sin x = \cos(x - \pi/2).$$
Rewrite the expression for an nth root of $a + bi$ as
(iv) Conclude that this reduces the problem of calculating nth roots of
complex numbers of length 1 to that of computing certain values of the
inverse cosine and cosine functions and dividing by n.
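The reduction can be sketched in Python as follows. The function name `unit_nth_root` is hypothetical, and the sketch assumes that $a + bi$ has length 1 with $b \geq 0$, so that the inverse cosine of a really is an angle of inclination. Only the inverse cosine, the cosine, and division by n are used.

```python
import math

def unit_nth_root(a, b, n):
    """One nth root of a + bi, assuming a^2 + b^2 = 1 and b >= 0."""
    theta = math.acos(a)                            # angle of inclination
    real_part = math.cos(theta / n)
    imag_part = math.cos(theta / n - math.pi / 2)   # equals sin(theta / n)
    return complex(real_part, imag_part)

root = unit_nth_root(0.6, 0.8, 5)
print(root, abs(root**5 - complex(0.6, 0.8)))       # second value near 0
```

For $b < 0$ one would instead use the negative of the inverse cosine as the angle, in line with Exercise 4.16.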
4.9 History
Complex numbers are so widely used that it can be difficult to understand
why their acceptance took centuries, stretching from their appearance in
Cardano’s Ars Magna [12] in 1545 to the development of the subject now
called complex analysis in the late nineteenth century. Two entertaining
accounts of the intellectual challenge complex numbers posed are Paul J.
Nahin’s An Imaginary Tale: The Story of $\sqrt{-1}$ [43] and Barry Mazur’s
Imagining Numbers (Particularly the Square Root of Minus Fifteen) [41].
They serve as natural complements to each other, with rich historical dis-
cussions that are especially worth reading. In this section, we will touch on
just a few historical highlights.
Let’s begin with de Moivre’s formula. It does not appear that de Moivre
wrote the formula down explicitly, but he surely knew it. Or at least he
understood that taking nth roots of complex numbers produces formulas
similar to trigonometric formulas for sines and cosines, allowing him to
relate taking nth roots with dividing angles by n. For instance, in a 1707
note with the title “The analytic solution of certain equations of the third,
fifth, seventh, ninth and other higher uneven powers, by rules similar to
those called Cardan’s,” he writes [16], [60, pp. 443–444]:
$$5y - 20y^3 + 16y^5 = \frac{61}{64},$$
$$y = \frac{1}{2}\sqrt[5]{\frac{61}{64} + \sqrt{-\frac{375}{4096}}} + \frac{1}{2}\sqrt[5]{\frac{61}{64} - \sqrt{-\frac{375}{4096}}}.$$
And if by any means the fifth root of the binomial can be extracted the
root will come out true and possible, although the expression seems to
include an impossibility. Now the fifth root of the binomial
$$\frac{61}{64} + \sqrt{-\frac{375}{4096}}$$
is
$$\frac{1}{4} + \frac{1}{4}\sqrt{-15},$$
and of the binomial
$$\frac{61}{64} - \sqrt{-\frac{375}{4096}}$$
is
$$\frac{1}{4} - \frac{1}{4}\sqrt{-15}$$
whose semi-sum $1/4 = y$. But if the extraction cannot be performed,
or should seem too difficult, the thing may always be effected by a
table of natural sines in the following manner.
To the radius 1 let $a = 61/64 = 0.95112$ be the sine of a certain
arc which therefore will be 72° 23′, the fifth part of which (because
$n = 5$) is 14° 28′; the sine of this is 0.24981, nearly $= 1/4$. So also
for equations of higher degree.
De Moivre recognizes that $(61/64) + i\sqrt{375/4096}$ has the form
$\sin\theta + i\cos\theta$ for a suitable $\theta$, and adding the fifth roots of this number and its
conjugate amounts to computing the sine of $\theta/5$. As Mazur comments ([41,
p. 199]), “Although there were earlier hints of the link between the solution
of polynomial equations and trigonometry, we know that Abraham De
Moivre, by 1707, had perceived the analogy between the geometric problem
of cutting an arc of a circle into n equal parts and taking the nth root of
a complex number.”
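De Moivre's arithmetic can be checked with a few lines of Python. This is a modern sanity check, not anything de Moivre or the sources cited here provide: exact fractions confirm that $y = 1/4$ satisfies his equation, complex arithmetic confirms the fifth root he states, and the sine computation mirrors his use of a table of sines.

```python
import math
from fractions import Fraction

# y = 1/4 in 5y - 20y^3 + 16y^5, using exact rational arithmetic.
y = Fraction(1, 4)
lhs = 5 * y - 20 * y**3 + 16 * y**5
print(lhs == Fraction(61, 64))

# The fifth power of 1/4 + (1/4)sqrt(-15), compared with the binomial.
z = complex(0.25, math.sqrt(15) / 4)
target = complex(61 / 64, math.sqrt(375 / 4096))
print(abs(z**5 - target))                 # should be close to 0

# The arc whose sine is 61/64, and the sine of its fifth part.
arc = math.asin(61 / 64)
print(math.degrees(arc), math.sin(arc / 5))
```

Working to full floating-point precision, the final sine agrees with 1/4 far more closely than the table value quoted in the passage, which reflects rounding the arc to whole minutes.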
In 1748, Leonhard Euler stated de Moivre’s formula in its standard
form, deriving it using the angle-sum formulas for sine and cosine. Here
is the start of his discussion:
Since $(\sin z)^2 + (\cos z)^2 = 1$, on decomposing into factors we get
rank the square root of a negative number amongst the possible num-
bers, and we must therefore say that it is an impossible quantity. In this
manner we are led to the idea of numbers, which from their nature are
impossible; and therefore they are usually called imaginary quantities,
because they exist merely in the imagination. . . . But notwithstanding
this, these numbers present themselves to the mind; they exist in our
imagination, and we still have a sufficient idea of them; . . . for this
reason also, nothing prevents us from making use of these imaginary
numbers, and employing them in calculation.
Elements of Algebra would become an influential text. A translation
from German to French was prepared by Johann III Bernoulli, with about
one hundred pages of additions by Joseph-Louis Lagrange. An English
translation of the French edition was begun by Francis Horner, who also
wrote a short biography of Euler, and completed by Reverend John Hewlett.
The 1840 edition of the English translation containing Bernoulli’s notes,
Lagrange’s additions, and Horner’s biography remains available thanks to
Springer-Verlag.
A good account of Euler’s work on complex numbers can be found in
Chapter 5 of William Dunham’s Euler: The Master of Us All [23]. Dunham
concludes it with the observation [23, pp. 101–102] that “complex numbers
were here to stay. A concept only dimly understood for its role in solving
cubic equations had been legitimized by the discoveries and influence of
Leonhard Euler. Without apology or embarrassment, he treated these num-
bers as equal players on the mathematical stage and showed how to take
their roots, logs, sines, and cosines.”
As the eighteenth century gave way to the nineteenth, the geometric
description of complex numbers presented in Sections 4.4 and 4.5 first ap-
peared. How it did is a fascinating story, well told in the books of Nahin
[43] and Mazur [41]. We will provide a brief sketch. But first we should
note that an earlier effort to describe pure imaginary numbers as points on
the plane that lie off the real number line appears in a 1685 work on algebra
[69] by the English mathematician John Wallis (1616–1703). His approach
was inconclusive. We will return to some of the material in Wallis’s book in
Section 5.8.
The first person to describe complex numbers geometrically in the way
we now understand was the Norwegian Caspar Wessel (1745–1818). He
presented a paper to the Royal Danish Academy of Sciences in 1797 that
was published in Danish two years later in the Academy’s Memoires [70].
Wessel recognized that in multiplying two complex numbers, one adds their
angles of inclination. It appears, however, that Wessel’s work went unno-
ticed for nearly a century, until it was rediscovered in 1895 [43, pp. 48–49,
243].
In 1806, the Swiss-born Jean-Robert Argand (1768–1822) published a
pamphlet, Essay on the Geometrical Interpretation of Imaginary Quanti-
ties, in which he also identified complex numbers with points on the plane.
(We will quote from A.S. Hardy’s 1881 translation [4].) Before tackling
complex numbers, Argand reviews the one-time challenge of making sense
of negative numbers [4, pp. 17–22], concluding that
the difficulty of the subject will not be questioned if we remember
that the exact sciences had been cultivated for many centuries, and had
made great progress before either a true conception of negative quan-
tities was reached or a general method for their use had been devised.
The notion of a negative number, Argand explains, might seem imaginary,
but when we compare two quantities, we consider not just the ratio of their
absolute values, but also “a relation of direction, or of the sense in which
they are estimated, a relation either of identity or opposition.”
Next Argand poses the problem [4, p. 23] of finding “the geometric
mean between two quantities of different signs, that is, to find the value of
x in the proportion $+1 : x :: x : -1$.” In effect, he asks for a value of x
satisfying $1/x = x/(-1)$, or $x^2 = -1$.
Here we encounter a difficulty . . . ; but, as before, the quantity which
was imaginary, when applied to certain magnitudes, became real when
to the idea of absolute number we added that of direction, may it not
be possible to treat this quantity, which is regarded imaginary, because
we cannot assign it a place in the scale of positive and negative quan-
tities, with the same success? On reflection this has seemed possible,
provided we can devise a kind of quantity to which we may apply the
idea of direction, so that having chosen two opposite directions, one
for positive and one for negative values, there shall exist a third —
such that the positive direction shall stand in the same relation to it
that the latter does to the negative.
Having opened the door to additional directions, Argand explains that
the quantity x which is to be the geometric mean of 1 and $-1$ should be
perpendicular to the line containing them. This yields two choices, which
are “related to each other as $+1$ and $-1$. They are, therefore, what is
ordinarily expressed by $+\sqrt{-1}$ and $-\sqrt{-1}$.” So begins Argand’s identification
of complex numbers with points on the plane.
Argand’s work was ignored initially. Guillaume-Jules Hoüel tells the
story in his preface to the 1876 reprint of Argand’s book [4, pp. iii–xvi]. (See
also [43, pp. 73–74].) Argand was living in Paris at the time, and he sent a
copy to the great French mathematician Adrien-Marie Legendre. Without
mentioning Argand’s name, Legendre described Argand’s ideas in a letter
to François Français, a professor of mathematics. On François’s death in
1810, his younger brother Jacques inherited his papers. Jacques published
an article in 1813 in the Annales de Mathématiques describing Argand’s
ideas as well as Legendre’s letter, and asked who the unknown author of the
pamphlet might be. Argand saw Français’s article and identified himself,
receiving full credit from Français.
Carl Friedrich Gauss (1777–1855), the greatest mathematician of the
era, had come up with the same ideas, perhaps even earlier than Wessel. He
chose not to publish his work until 1831, yet he came to share naming rights
with Argand. The plane, when its points are identified with complex num-
bers, has come to be called the Gaussian plane, Argand plane, or Argand
diagram.
5 Cubic Polynomials, II
roots of $-q/2 + \sqrt{R}$. Let A be one of them. Then the other two are $\omega A$ and
$\omega^2 A$. Similarly, if B is a cube root of $-q/2 - \sqrt{R}$, then the other two are
$\omega B$ and $\omega^2 B$. In the next exercise, we determine which cube roots to add
together in order to obtain roots of $y^3 + py + q$.
Exercise 5.1. Let p and q be real numbers with $p \neq 0$. (We can handle
the $p = 0$ case directly.) Since any non-zero number has three distinct
cube roots, each summand on the right side of Cardano’s formula has three
possible values.
(i) Check, as in Exercise 3.7, that the product of $-q/2 + \sqrt{R}$ and
$-q/2 - \sqrt{R}$ is $-p^3/27$.
(iv) Check that $A^3B^3 = -p^3/27$ and deduce that AB equals one of the
numbers $-p/3$, $-\omega p/3$, and $-\omega^2 p/3$.
$$r_1 = A + B; \quad r_2 = \omega A + \omega^2 B; \quad r_3 = \omega^2 A + \omega B.$$
Exercise 5.1 removes the imprecision that was present in our earlier
derivation of Cardano’s formula. We can summarize what we have found in
a theorem.
Theorem 5.1 (Cardano’s formula). Suppose p and q are real numbers, with
$p \neq 0$. Let A be a cube root of
$$-\frac{q}{2} + \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}$$
and let B be the unique cube root of
$$-\frac{q}{2} - \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}$$
satisfying $AB = -p/3$. Let $\omega$ be the cube root $-1/2 + \sqrt{-3}/2$ of 1. Then
the three roots of the polynomial $y^3 + py + q$ are
$$A + B, \quad \omega A + \omega^2 B, \quad \omega^2 A + \omega B.$$
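The theorem can be turned into a short numerical routine. The Python sketch below is one possible rendering, with the hypothetical name `cardano_roots`. It assumes $p \neq 0$, takes A to be the principal cube root supplied by Python's complex power, and obtains B from the condition $AB = -p/3$ rather than by extracting a second cube root.

```python
import cmath
import math

OMEGA = complex(-0.5, math.sqrt(3) / 2)   # the cube root of 1 called omega

def cardano_roots(p, q):
    """Approximate the three roots of y^3 + p*y + q for real p != 0, q."""
    R = p**3 / 27 + q**2 / 4
    sqrt_R = cmath.sqrt(R)                 # complex when R is negative
    A = (-q / 2 + sqrt_R) ** (1 / 3)       # one cube root (principal value)
    B = -p / (3 * A)                       # forces A*B = -p/3
    return (A + B,
            OMEGA * A + OMEGA**2 * B,
            OMEGA**2 * A + OMEGA * B)

for root in cardano_roots(-7, 6):
    print(root)   # approximately 2, -3, and 1, with tiny imaginary parts
```

Because $A^3 B^3 = -p^3/27$ and A is a cube root of the first expression, the B computed this way is automatically a cube root of the second. The results are floating-point approximations, so real roots may carry imaginary parts on the order of rounding error.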
$$y^3 - 7y + 6 = 0$$
and
$$-3 - \frac{10}{9}\sqrt{-3}.$$
(ii) Using the numbers $\omega$ and $\omega^2$ and the determination of one cube root of
$-3 + \frac{10}{9}\sqrt{-3}$ in Exercise 3.13 or Exercise 4.28, write expressions for
the three complex numbers that are cube roots of $-3 + \frac{10}{9}\sqrt{-3}$. Write
also the three complex numbers that are cube roots of $-3 - \frac{10}{9}\sqrt{-3}$.
(iii) Pair the cube roots of $-3 + \frac{10}{9}\sqrt{-3}$ and $-3 - \frac{10}{9}\sqrt{-3}$ as specified in
Theorem 5.1 to get three pairs such that the product of the complex
numbers in each pair equals 7/3.
(iv) Add the complex numbers in each pair to obtain all three real solutions
of $y^3 - 7y + 6 = 0$.
$$y^3 - 3y + 1 = 0.$$
(i) Using Cardano’s formula, show that the solutions have the form
$$\sqrt[3]{-\frac{1}{2} + \frac{\sqrt{3}}{2}i} + \sqrt[3]{-\frac{1}{2} - \frac{\sqrt{3}}{2}i}.$$
(ii) The numbers inside the cube root signs are our familiar friends $\omega$ and
$\omega^2$, the non-real cube roots of 1. Thus, the solution can be rewritten as
$$y = \sqrt[3]{\omega} + \sqrt[3]{\omega^2}.$$
(iii) Find three cube roots of $\omega$ and three cube roots of $\omega^2$, expressed in
terms of the cosine and sine of suitable angles.
(iv) Form pairs of cube roots in accordance with Exercise 5.1 and add them,
taking into account the advice that preceded this exercise.
(v) The three solutions to $y^3 - 3y + 1 = 0$ have the form $2\cos\theta$, for three
particular angles $\theta$. What are they?
As the last part of Exercise 5.4 may suggest, once we turn to calculators,
we are settling for approximate answers, and these can be obtained in a
variety of ways independent of Cardano’s formula. For example, we can
solve cubic equations using Newton’s method—a standard application of
calculus—or turn to a selection of mathematical programs without even
worrying about the nature of the underlying algorithms.
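For readers who have not seen Newton's method, here is a minimal Python sketch applied to a reduced cubic. The function name, the starting guesses, the step limit, and the tolerance are all choices made for this illustration. Each step replaces the current guess y by $y - f(y)/f'(y)$, where $f(y) = y^3 + py + q$ and $f'(y) = 3y^2 + p$.

```python
def newton_cubic(p, q, y0, steps=50, tol=1e-14):
    """Approximate a real root of y^3 + p*y + q, starting from the guess y0."""
    y = y0
    for _ in range(steps):
        f = y**3 + p * y + q
        df = 3 * y**2 + p
        if df == 0:
            break                      # flat tangent: the step is undefined
        y_next = y - f / df
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y

print(newton_cubic(-7, 6, 2.5))    # converges toward the root 2
print(newton_cubic(-7, 6, -4.0))   # converges toward the root -3
```

Which root the method finds depends on the starting guess, and a poor guess can fail to converge, so this is an approximation tool rather than a formula.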
The need to approximate already reared its head in our use of the qua-
dratic formula. The finding of square roots of positive real numbers is an
algorithmic process, one we carry out until we arrive at as close an approx-
imation to the answer as we wish.
What the quadratic formula does for us is reduce the problem of solving
arbitrary quadratic equations to that of calculating square roots. Likewise,
Cardano’s formula reduces solving cubic equations to the calculation of
square and cube roots. It is only at this stage, and because we may be cal-
culating cube roots of complex numbers, that we turn to trigonometry and
calculators.
In addition, both the quadratic formula and Cardano’s have theoretical
value, representing roots of polynomials in terms of their coefficients. We
will make use of this in Section 5.3 in obtaining a formula for the discrimi-
nant of a cubic polynomial.
Exercise 5.5. Let b and c be real numbers and let U, V be the roots of the
quadratic polynomial $x^2 + bx + c$. From Exercise 2.5, $b = -(U + V)$ and
$c = UV$.
(i) Let d be the real cube root of c. Choose cube roots A of U and B of
V so that $AB = d$.
(ii) Using the formulas $A^3 = U$, $B^3 = V$, and $AB = d$, expand $(A + B)^3$
and verify that
$$(A + B)^3 = 3d(A + B) - b.$$
Theorem 5.2. Let b and c be real numbers and let $\sqrt[3]{c}$ be the real cube
root of c. Let U and V be the roots of the quadratic polynomial
$$x^2 + bx + c.$$
Choose cube roots $\sqrt[3]{U}$ and $\sqrt[3]{V}$ of U and V so that $\sqrt[3]{U}\,\sqrt[3]{V} = \sqrt[3]{c}$.
Then the sum
$$\sqrt[3]{U} + \sqrt[3]{V}$$
is a root of the cubic polynomial
$$y^3 - 3\sqrt[3]{c}\,y + b.$$
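A numerical spot check of the theorem is sketched below. The function name `theorem_5_2_check` is hypothetical, and the sketch assumes $c \neq 0$ so that the division is defined. It takes one cube root of U from Python's complex power and chooses the cube root of V as the quotient of the real cube root of c by it, which enforces the required product.

```python
import cmath
import math

def theorem_5_2_check(b, c):
    """Return (y, residual): y is the sum of the paired cube roots and
    residual is the value of y^3 - 3*d*y + b, where d is the real cube
    root of c. Assumes c != 0."""
    U = (-b + cmath.sqrt(b * b - 4 * c)) / 2   # one root of x^2 + b*x + c
    d = math.copysign(abs(c) ** (1 / 3), c)    # real cube root of c
    A = U ** (1 / 3)                           # a cube root of U
    B = d / A                                  # cube root of V with A*B = d
    y = A + B
    return y, y**3 - 3 * d * y + b

y, residual = theorem_5_2_check(6.0, 343 / 27)
print(y, abs(residual))   # y is close to 2; the residual is close to 0
```

With $b = 6$ and $c = 343/27$ the cubic is $y^3 - 7y + 6$, the example treated elsewhere in this chapter.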
Theorem 5.3 (Cardano again). Let p and q be real numbers. Let U and V
be the roots of the quadratic polynomial
$$t^2 + qt - \frac{p^3}{27}.$$
Choose cube roots $\sqrt[3]{U}$ and $\sqrt[3]{V}$ of U and V so that $\sqrt[3]{U}\,\sqrt[3]{V} = -p/3$.
Then the sum
$$\sqrt[3]{U} + \sqrt[3]{V}$$
is a root of the cubic polynomial
$$y^3 + py + q.$$
$$(r_1 - r_2)^2 = b^2 - 4c.$$
This allows us to obtain information on the roots from the coefficients alone:
if $b^2 - 4c$ is positive, the roots are real and distinct; if $b^2 - 4c$ is 0, the roots
are real and coincide; and if $b^2 - 4c$ is negative, then the roots are a pair of
non-real, complex conjugate numbers.
We would like similarly to be able to obtain information about the roots
of a cubic polynomial in terms of its coefficients. The discussion that
followed Exercise 4.8 showed that the two quantities $b^2 - 4c$ and $(r_1 - r_2)^2$
associated with $x^2 + bx + c$ can both be regarded as its discriminant, but
$(r_1 - r_2)^2$ is the more fundamental quantity, the one we chose as the
discriminant’s definition.
Given a cubic polynomial $x^3 + bx^2 + cx + d$, let’s write $r_1, r_2, r_3$ for
its roots, real or complex. In analogy with the quadratic case, we define the
discriminant of $x^3 + bx^2 + cx + d$ to be
$$(r_1 - r_2)^2 (r_1 - r_3)^2 (r_2 - r_3)^2$$
and denote it by $\Delta$.
For the special case of a reduced cubic polynomial $x^3 + px + q$, we
introduced the quantity $-4p^3 - 27q^2$ in Section 3.4, denoted it by $\delta$, and
called it the discriminant. We also saw, in Exercise 3.22 and Theorem 3.6,
that the nature of the roots of $x^3 + px + q$ is determined by $\delta$: the polynomial
has three distinct real roots if $\delta$ is positive, a repeated real root if $\delta$ is 0,
and one simple real root plus two complex conjugate roots if $\delta$ is negative.
Let’s show for a general cubic polynomial that our newly introduced $\Delta$
determines the nature of the roots in the same way.
(iii) Suppose that one root is real and the other two are complex conjugates.
Show that $\Delta < 0$. (Hint: Suppose the roots are r, $a + bi$, and $a - bi$,
with r, a, and b all real numbers. Using this notation, calculate $\Delta$ by
calculating the product of the root differences before squaring.)
(iv) Prove the converses: if $\Delta = 0$, then $f(x)$ has a multiple root and all
its roots are real; if $\Delta > 0$, then $f(x)$ has three distinct real roots;
and if $\Delta < 0$, then $f(x)$ has one real root and two non-real complex
conjugate roots. (Hint: These follow from the three statements by the
logical argument that we used in Exercise 4.7 to prove the quadratic
analogue.)
(v) Conclude that you can determine the nature of the roots of $f(x)$ from
the sign of the discriminant of $f(x)$, as described in Theorem 5.4.
Exercise 5.8. Suppose p and q are real numbers and consider the polynomial
$y^3 + py + q$. We will use the notation and results of Exercise 5.1.
Accordingly, we choose A to be one of the three cube roots of $-q/2 + \sqrt{R}$
and B to be the cube root of $-q/2 - \sqrt{R}$ satisfying $AB = -p/3$. With
these choices the three roots of $y^3 + py + q$ are
$$r_1 = A + B, \quad r_2 = \omega A + \omega^2 B, \quad r_3 = \omega^2 A + \omega B.$$
that
$$r_1 - r_3 = -\omega^2(1 - \omega)(A - \omega B),$$
and that
$$r_2 - r_3 = \omega(1 - \omega)(A - B).$$
and that
$$(r_1 - r_2)(r_1 - r_3)(r_2 - r_3) = 6\sqrt{3}\,i\sqrt{R}.$$
$$\Delta = -108R.$$
Theorem 5.5. Given real numbers p and q, the discriminant $\Delta$ of the
polynomial $y^3 + py + q$ is
$$-4p^3 - 27q^2.$$
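The theorem can be checked on a cubic whose roots are known. The sketch below uses $y^3 - 7y + 6$, which has roots 1, 2, and $-3$, and compares the squared product of root differences with the coefficient formula. Both function names are hypothetical, chosen for this illustration.

```python
def discriminant_from_roots(r1, r2, r3):
    """Square of the product of the differences of the roots."""
    return ((r1 - r2) * (r1 - r3) * (r2 - r3)) ** 2

def reduced_discriminant(p, q):
    """The expression -4p^3 - 27q^2 for the cubic y^3 + p*y + q."""
    return -4 * p**3 - 27 * q**2

print(discriminant_from_roots(1, 2, -3), reduced_discriminant(-7, 6))
# prints: 400 400
```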
(ii) $y^3 + 5y + 1$.
(iii) $y^3 - 5y + 1$.
$$18bcd - 4b^3d + b^2c^2 - 4c^3 - 27d^2.$$
three distinct real roots when $\Delta$ is positive, there is one real root along with
a distinct pair of complex conjugate roots when $\Delta$ is negative, and there is
a repeated root when $\Delta$ is zero. Theorem 5.6 allows us to determine which
occurs for a cubic polynomial from its coefficients alone. If $\Delta = 0$, we can
use the coefficients to do a little more.
Theorem 5.7. The polynomial $x^3 + bx^2 + cx + d$ has one real root, of multiplicity
3, if $\Delta = 0$ and $b^2 - 3c = 0$. It has two real roots, of multiplicities
1 and 2, if $\Delta = 0$ and $b^2 - 3c \neq 0$.
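The case analysis above translates directly into a short routine. The following is a minimal sketch, assuming integer (or otherwise exact) coefficients so that the test for $\Delta = 0$ is reliable; with floating-point coefficients a tolerance would be needed. The function names are ours, not the book's.

```python
def cubic_discriminant(b, c, d):
    """Discriminant of x^3 + b x^2 + c x + d in terms of its coefficients."""
    return 18*b*c*d - 4*b**3*d + b**2*c**2 - 4*c**3 - 27*d**2

def describe_roots(b, c, d):
    """Nature of the roots of x^3 + b x^2 + c x + d, read off from the coefficients."""
    delta = cubic_discriminant(b, c, d)
    if delta > 0:
        return "three distinct real roots"
    if delta < 0:
        return "one real root and two non-real conjugate roots"
    # delta == 0: a repeated root; b^2 - 3c separates the two sub-cases.
    if b*b - 3*c == 0:
        return "one real root of multiplicity 3"
    return "two real roots, of multiplicities 1 and 2"

print(describe_roots(-6, 11, -6))   # (x - 1)(x - 2)(x - 3)
print(describe_roots(-3, 3, -1))    # (x - 1)^3
print(describe_roots(0, -3, 2))     # (x - 1)^2 (x + 2)
```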
(iv) In contrast, deduce that if $y^3 + py + q$ has three distinct real roots, then
Cardano’s formula for the roots expresses all three of them as sums of
cube roots of non-real complex numbers.
Our examples of polynomials for which such cube roots arose weren’t anoma-
lies. They were inevitable.
(iii) To simplify notation, let’s introduce a new constant $a$, with $a = \sqrt{-p/3}$;
that is, $a$ is the positive square root of $-p/3$. Check that
$p = -3a^2$ and $q = \pm 2a^3$. The choice of sign in the expression for
$q$ in terms of $a$ is determined by the sign of $q$: if $q$ is positive, then
$q = 2a^3$; if $q$ is negative, then $q = -2a^3$.
$$y^3 - 3a^2y - 2a^3.$$
Verify that its roots are $-a$ and $2a$, and that we have the factorization
Let us return to the cubic $y^3 - 7y + 6$ and find its roots using Theorem
5.10.
$$y^3 - 7y + 6 = 0.$$
(i) One cube root of $-3 + 10\sqrt{-3}/9$ was determined in Exercise 3.13 or
Exercise 4.28. Add this cube root to its conjugate to obtain one solution
to $y^3 - 7y + 6 = 0$.
(ii) Using $\omega$ and $\omega^2$, write expressions for the other two cube roots of
$-3 + 10\sqrt{-3}/9$. Add each to its conjugate to obtain the other two
solutions.
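The two steps of this exercise can be imitated numerically. The sketch below assumes $R = q^2/4 + p^3/27$, which is consistent with the relation $\Delta = -108R$ and with Theorem 5.5, and it uses Python's principal complex cube root rather than the exact cube root found in the earlier exercises.

```python
import math

p, q = -7.0, 6.0
R = q**2 / 4 + p**3 / 27                  # assumed definition of R; negative here
w = complex(-q / 2, math.sqrt(-R))        # the number -3 + 10*sqrt(-3)/9
omega = complex(-0.5, math.sqrt(3) / 2)   # a non-real cube root of 1

A = w ** (1 / 3)                          # principal cube root of w
roots = []
for k in range(3):
    cube_root = A * omega**k              # the three cube roots of w
    # Adding a number to its conjugate gives twice its real part.
    roots.append((cube_root + cube_root.conjugate()).real)

print(sorted(roots))
```

The three values agree with $-3$, $1$, and $2$ up to rounding error, and these are the roots of $y^3 - 7y + 6$.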
$$x^3 + 6x^2 + 3x + 18 = 0.$$
$$x^3 - 3x^2 - 10x + 24 = 0.$$
$$x^3 - 3xy^2 = a; \quad 3x^2y - y^3 = b.$$
The calculation in the last exercise shows that we have come full circle.
To solve an “irreducible” cubic equation, we need to calculate the cube root
of a non-real complex number $a + bi$. To compute it, we are led to solve a
different cubic equation. Passing from this different cubic to its associated
reduced cubic equation and applying Cardano’s formula, we find that we
must compute the same cube root with which we began. It is in this sense
that the problem is irreducible.
We need not despair. We must accept that non-algebraic techniques are
needed. For example, we can use trigonometry to compute cube roots, as
worked out in Exercise 4.27.
Exercise 5.21. Suppose $p$ and $q$ are real numbers. Assume $p < 0$ and let
$\sqrt{-p/3}$ be the positive square root of $-p/3$. Show that
$$-1 < \frac{3q}{2p\sqrt{-p/3}} < 1$$
precisely when $-4p^3 - 27q^2 > 0$. (Hint: Use the fact, for a real number $r$,
that $-1 < r < 1$ precisely when $r^2 < 1$.)
for real numbers $r$ and $\theta$, with $r > 0$. Using Exercise 4.16, show that
$$r = \sqrt{-\frac{p^3}{27}} = -\frac{p}{3}\sqrt{-\frac{p}{3}},$$
$$r\cos\theta + r\,i\sin\theta$$
is
$$\sqrt[3]{r}\cos\frac{\theta}{3} + \sqrt[3]{r}\,i\sin\frac{\theta}{3}.$$
Deduce, using our shorthand notation, that
$$a\cos\left(\frac{1}{3}\arccos\frac{b}{2a}\right) + a\,i\sin\left(\frac{1}{3}\arccos\frac{b}{2a}\right)$$
is a cube root of $-q/2 + \sqrt{-R}\,i$.
(v) By Theorem 5.10, one root of $y^3 + py + q$ is given by adding the
complex number displayed in (iv) to its conjugate. Adding a complex
number and its complex conjugate yields twice the real part of the
complex number. Conclude that
$$2a\cos\left(\frac{1}{3}\arccos\frac{b}{2a}\right)$$
is a root of $y^3 + py + q$.
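Here is a small numerical sketch of the cosine formula. The shorthand $a$ and $b$ is not defined in the passage above, so the code assumes $a = \sqrt{-p/3}$ and $b = 3q/p$, a choice that makes $b/(2a)$ equal to the quantity studied in Exercise 5.21. The entry with $k = 0$ is the root displayed in (v); the entries with $k = 1, 2$ come from the other two cube roots, whose angles differ by multiples of $2\pi/3$.

```python
import math

def trig_roots(p, q):
    """Real roots of y^3 + p y + q when -4p^3 - 27q^2 > 0 (which forces p < 0)."""
    a = math.sqrt(-p / 3)
    b = 3 * q / p                      # assumed shorthand: b/(2a) as in Exercise 5.21
    theta = math.acos(b / (2 * a))
    return [2 * a * math.cos((theta + 2 * math.pi * k) / 3) for k in range(3)]

print(trig_roots(-7, 6))   # the roots of y^3 - 7y + 6, up to rounding
print(trig_roots(-3, 1))   # the roots of y^3 - 3y + 1
```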
$$y^3 - 7y + 6 = 0.$$
$$y^3 - 3y + 1 = 0$$
in terms of cosines. We get the same answer as in Exercise 5.4 using Car-
dano’s and de Moivre’s formulas.
(ii) Using Theorem 4.9 again, write $\cos 3\theta$ in terms of sines and cosines
of $\theta$ and $2\theta$.
(iv) Use
$$\cos^2\theta + \sin^2\theta = 1$$
and the equality in (iii) to obtain the desired identity.
(ii) Use the triple angle formula and collect terms to rewrite the equation
as
$$(6a^3 + 2ap)\cos\theta + 2a^3\cos 3\theta + q = 0.$$
It will suffice to find values of $a$ and $\theta$ for which $6a^3 + 2ap = 0$ and
$2a^3\cos 3\theta + q = 0$.
(iii) We have assumed that $a$ is positive, so that $a \neq 0$. Using this, solve
$6a^3 + 2ap = 0$ for $a$ and show that
$$a = \sqrt{-p/3}.$$
(iv) Substitute $\sqrt{-p/3}$ for $a$ in the equation
$$2a^3\cos 3\theta + q = 0,$$
Exercise 5.27. Let $c$ be a real number. Show that the only root of the linear
polynomial $x + c$ is $-c$. Deduce that the sign of the root can be determined
in terms of the sign of the constant coefficient. (Yes, this is trivial, but it is
the first in a sequence of results.)
Exercise 5.28. Let $b$ and $c$ be real numbers and let $r_1$ and $r_2$ be the roots
of the quadratic polynomial $x^2 + bx + c$. Since we are interested in the case
that the roots are real, we assume that $b^2 - 4c \geq 0$.
(i) If $c = 0$, the roots are 0 and $-b$, so their sign is determined by the sign
of the coefficient $b$.
(ii) Assume $c \neq 0$. From Exercise 2.5, $c = r_1 r_2$ and $b = -(r_1 + r_2)$.
Show that if $r_1$ and $r_2$ have the same sign, then $c > 0$, and if they have
opposite sign, then $c < 0$.
(iii) Assume that $c$ is positive. Show that $r_1$ and $r_2$ are both positive precisely
when $b < 0$ and both negative precisely when $b > 0$.
(iv) Deduce Theorem 5.13.
$$b = -r_1 - r_2 - r_3,$$
$$c = +r_1 r_2 + r_1 r_3 + r_2 r_3,$$
$$d = -r_1 r_2 r_3.$$
Exercise 5.30. Prove Theorem 5.15. (Hint: Use Theorem 5.14 and the fact
that the product of a non-zero complex number and its conjugate is a posi-
tive real number.)
Exercise 5.32. For the cubic polynomials in (i)-(iv), use Theorems 5.15
and 5.16 to determine from the coefficients how many real roots there are
and how many of them are positive or negative.
(i) $x^3 - 6x^2 + 11x - 6$.
(ii) $x^3 - 5x^2 + 9x - 5$.
(iii) $x^3 + 6x^2 + 3x + 18$.
(iv) $x^3 - 3x^2 - 10x + 24$.
Let’s extract one small piece of Theorems 5.15 and 5.16 that will play
an essential role in our study of quartic polynomials. (See Exercise 6.8.)
Theorem 5.17. Let $f(x)$ be a cubic polynomial with non-zero constant
coefficient $d$.
(i) If $d > 0$, then $f(x)$ has at least one negative real root.
(ii) If $d < 0$, then $f(x)$ has at least one positive real root.
Exercise 5.33. Verify that Theorem 5.17 follows from Theorems 5.15 and
5.16. It is valid regardless of whether $f(x)$ has one or three real roots.
5.8 History
We concluded Section 5.5 with the observation that we need not despair
when faced with a cubic equation to solve in the irreducible case. How did
Cardano react? Let’s find out, as we continue the historical account of cubic
equations that we began in Section 3.5.
The cubic equations whose solution Cardano learned from Tartaglia
have the special form
$$x^3 + px = q$$
with p and q positive. Since p is positive, the discriminant is negative and
the formula of del Ferro and Tartaglia involves cube roots of real numbers.
Cardano, however, extended Tartaglia’s solution to cover cubic equations of
the additional forms
$$x^3 = px + q$$
and
$$x^3 + q = px.$$
Again, the coefficients are taken to be positive. But then, rewriting them in
the form $x^3 + Px + Q = 0$ by moving the appropriate terms to the left
side, we see that $P$ is negative, so the discriminant $-4P^3 - 27Q^2$ may
be positive. Thus Cardano opened the door to the irreducible case, with its
square roots of negative numbers and cube roots of non-real numbers.
Cardano was aware of the new difficulty. However, uncertain how to
handle it, he omitted examples of cubic equations of this type in Ars Magna
[12]. Thus, the answer to our opening question is that he suppressed the
irreducible case.
In an interesting passage later in Ars Magna, Cardano does make some
calculations with square roots of negative numbers. We know from Section
2.1 that solving a quadratic equation $x^2 - bx + c = 0$ (the minus sign in
front of the b is intentional) is equivalent to finding two numbers whose
sum is b and whose product is c, and that special cases of this problem can
be found on Babylonian clay tablets from over three thousand years ago. In
Chapter 37, titled “On the Rule for Postulating a Negative,” Cardano poses
and solves just such a problem [12, p. 219]:
If it should be said, Divide 10 into two parts the product of which is
30 or 40, it is clear that this case is impossible. Nevertheless, we will
work thus: We divide 10 into two equal parts, making each 5. These we
square, making 25. Subtract 40 if you will, from the 25 thus produced
. . . leaving a remainder of $-15$, the square root of which added to or
subtracted from 5 gives parts the product of which is 40. These will be
$5 + \sqrt{-15}$ and $5 - \sqrt{-15}$.
By completing the square, Cardano has found two numbers that have
sum 10 and product 40. Equivalently, he has solved the quadratic equation
$$x^2 - 10x + 40 = 0,$$
despite its discriminant being negative. Next Cardano writes [12, pp. 219–220],
Putting aside the mental tortures involved, multiply $5 + \sqrt{-15}$ by
$5 - \sqrt{-15}$, making $25 - (-15)$, which is $+15$. Hence this product is 40.
. . . This truly is sophisticated . . . . So progresses arithmetic subtlety
the end of which, as is said, is as refined as it is useless.
Recognizing that the problem makes no physical sense, Cardano is content
to label the result useless.
Complex numbers enter into the solving of quadratic equations pre-
cisely when the equations have no real solutions. Though sophisticated, the
new numbers aren’t needed. What’s different about cubic equations is that
the new numbers appear precisely when the equations have only real (and
distinct) solutions. This was a puzzle Cardano was not prepared to address.
In contrast, Cardano’s near-contemporary Rafael Bombelli (1526–1572)
did not shy away from the irreducible case. In the influential three-volume
work L’Algebra [11], printed in 1572 and again in 1579, Bombelli set out to
present the results of Ars Magna in a manner more accessible to beginners.
Bombelli gives as one example the irreducible cubic equation
$$x^3 = 15x + 4.$$
(See the discussion in [66, pp. 60–61].) Using Cardano’s formula, Bombelli
writes the solution
$$\sqrt[3]{2 + \sqrt{-121}} + \sqrt[3]{2 - \sqrt{-121}}.$$
$$a^3 - 3ab = 2.$$
By setting the product of $a + \sqrt{-b}$ and $a - \sqrt{-b}$ equal to the product of
$\sqrt[3]{2 + \sqrt{-121}}$ and $\sqrt[3]{2 - \sqrt{-121}}$, we get
$$a^2 + b = 5.$$
$$A^3 - 3B^2A = B^2D$$
under the assumption that “$B$ is greater than half of $D$.” Taking $B$ and $D$
to be positive, we can check that the condition $B > D/2$ is the requirement
that the given cubic’s discriminant is positive. He then describes how to
obtain a solution geometrically in terms of two right triangles, one having
an acute angle three times that of the second. Next, Viète illustrates the
method by solving the cubic equations
$$x^3 - 300x = 432$$
and
$$x^3 - 300x = -432.$$
In [48, pp. 203–205], R.W.D. Nickalls relates this passage of Viète’s
to our more familiar trigonometric view, explaining that “Viète’s approach
stems from his familiarity with the then equivalent of the trigonometric
triple-angle identities, since he himself had established formulae for chords
of multiple arcs in terms of chords of simple arcs, and hence he was aware
that solving a cubic with three real roots was analogous to trisecting an
angle.”
Before moving on, let us recognize the greater significance of Viète’s
work, as he was the first to write down cubic equations (in the irreducible
case) in general form and solve them.
Another French mathematician, Albert Girard (1595–1632), who would
spend most of his life in the Netherlands as a religious refugee, published In-
vention Nouvelle en l’Algèbre in 1629 [32]. It’s a short work, just 64 pages,
with a central section on algebra that is full of new ideas. We will discuss
Girard’s contributions further in Sections 7.2 and 7.4. For now, we will be
content with a few words about his account of cubic equations.
Girard doesn’t use letters for constants or variables, yet his text is easy to
understand. For example, he divides cubic equations into a variety of cases,
one of which he describes as the equations with “1(3) equal to (1) + (0).”
The parenthetical numbers should be interpreted as powers of $x$, so that
this refers to equations in which a cube equals a degree-one term plus a
constant, or $x^3 = px + q$. For this case, Girard provides a “rule for solving
the equation 1(3) equal to (1) + (0) when the cube of a third of the number
of (1) is larger than the square of half of (0) with the aid of a table of
sines.” This is, of course, the irreducible case. Without a general notation,
Girard must describe the rule by way of example. The example he uses is
“1(3) equal to 13(1) + 12,” which we would write as $x^3 = 13x + 12$. He
displays a series of calculations that leads to the solution $x = 4$, using what
is essentially Viète’s approach.
We mentioned John Wallis in passing in Section 4.9 with regard to his
early effort to provide a geometric description of complex numbers. This is
contained in his wonderfully titled 1685 book, A Treatise of Algebra, both
Historical and Practical: Shewing the original, progress, and advancement
thereof, from time to time, and by what steps it hath attained to the heighth
at which now it is; with some additional treatises, which would have great
influence. J.F. Scott’s The Mathematical Work of John Wallis, written in
1938, is a valuable guide to Wallis’s mathematical work, with a chapter
devoted to the Treatise of Algebra [58, pp. 133–165]. Scott’s concluding
appraisal [58, p. 165] is that Wallis’s book “constituted a reservoir from
which contemporary and later algebraists drew much inspiration. It may
be mentioned that this treatise did a great deal towards popularizing the
notation which was now rapidly becoming current in Europe.”
Of particular significance is Wallis’s unreserved adoption of complex
numbers as solutions to algebraic equations. Scott writes [58, pp. 156–157],
“In his quadratic equations he discusses every type, and the rules he evolves
for determining the nature of the roots by a mere inspection of the equation
would not be out of place in a modern text-book. He was quite at home
with imaginaries, and he knew that such roots always occurred in pairs.
Moreover he would not allow the use of the word Impossible as applied to
an equation with imaginary roots. An equation, for example, such as
$$aa + 8a = -25,$$
of which the roots are $-4 + \sqrt{16 - 25}$ and $-4 - \sqrt{16 - 25}$, ‘imaginaries’ —
had hitherto been styled ‘Impossible’. Yet, avers Wallis” (and now we turn
to Wallis’s own writings [69, p. 174]):
$$aaa - 7a = 6$$
Exercise 5.34. Let $a$ and $b$ be real numbers with $b \neq 0$ and suppose $x + yi$
is a cube root of $a + bi$.
(i) Set $(x + yi)^3 = a + bi$, as in Exercise 4.13, and obtain two equations
in $x$ and $y$ by setting the real and imaginary parts equal to each other.
(ii) Square both equations, take their sum, and obtain the equation
$$x^6 + 3x^4y^2 + 3x^2y^4 + y^6 = a^2 + b^2.$$
$$4x^3 - 3mx - a = 0.$$
(v) Calculate the discriminant of $(4x^3 - 3mx - a)/4$ and use the fact that
$b \neq 0$ to show that it is positive.
(vi) Conclude that cube roots of complex numbers can be calculated by
solving reduced cubic equations in the irreducible case.
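To see the conclusion of this exercise in action, the sketch below finds a cube root $x + yi$ of $a + bi$ by solving $4x^3 - 3mx - a = 0$. It assumes, in line with part (ii), that $m$ denotes the real cube root of $a^2 + b^2$, so that $m = x^2 + y^2$. The cubic is solved here with the cosine substitution $x = \sqrt{m}\cos t$ instead of Cardano's formula, and the function name is ours.

```python
import math

def cube_root_via_cubic(a, b):
    """One cube root x + yi of a + bi (b != 0), found from 4x^3 - 3mx - a = 0."""
    m = (a * a + b * b) ** (1 / 3)     # assumed meaning of m: m = x^2 + y^2
    r = math.sqrt(m)
    # With x = r*cos(t), the cubic 4x^3 - 3mx = a becomes r^3 * cos(3t) = a.
    t = math.acos(a / r**3) / 3
    x = r * math.cos(t)
    y = math.sqrt(max(0.0, m - x * x))
    # Choose the sign of y so that the imaginary part 3x^2 y - y^3 matches b.
    if abs(3 * x * x * y - y**3 - b) > abs(-3 * x * x * y + y**3 - b):
        y = -y
    return complex(x, y)

z = cube_root_via_cubic(81, math.sqrt(2700))
print(z, z**3)
```

For $a = 81$ and $b = \sqrt{2700}$ we have $m = 21$, and the cubic is $4x^3 - 63x - 81 = 0$. The computed cube root is close to $9/2 + (\sqrt{3}/2)i$.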
$$4x^3 - 63x - 81 = 0.$$
De Moivre suggests that this be “compared with the equation for the cosines,
namely $4x^3 - 3r^2x = r^2c$.” To understand the sense in which this is an
equation for cosines, we can rewrite it as
$$4\left(\frac{x}{r}\right)^3 - 3\left(\frac{x}{r}\right) = \frac{c}{r}.$$
Taking $x/r$ to be $\cos\theta$ and $c/r$ to be $\cos 3\theta$ for a suitable angle $\theta$, the
equation becomes the trigonometric identity
$$4\cos^3\theta - 3\cos\theta = \cos 3\theta$$
of Theorem 5.12.
Following de Moivre again, we use the comparison of $4x^3 - 63x = 81$
to $4x^3 - 3r^2x = r^2c$ to obtain the values $r = \sqrt{21}$ and $c = 27/7$. To find
$x$, we use the triple cosine identity and the identifications $x/r = \cos\theta$ and
$c/r = \cos 3\theta$. Applying the inverse cosine to $c/r$, dividing by 3, computing
must determine $p$ and $q$ in such a manner, that $3\sqrt[3]{pq}$ may become
equal to $f$, and $p + q = g$; for we then know that one of the roots of
our equation will be $x = \sqrt[3]{p} + \sqrt[3]{q}$.
Following his derivation, Euler presents a sequence of examples of cu-
bic equations with negative discriminant, starting with two for which the
cube root calculations are trivial. He introduces a third example [26, p. 265],
$x^3 = 6x + 40$, with the immediate comment that “$x = 4$ is one of the
roots.” Applying Cardano’s formula, he expresses the same surprise we did
after Exercise 3.9 on comparing the complicated answer it produces with
the known simpler answer:
Consequently one of the roots will be . . .
$$x = \sqrt[3]{20 + 14\sqrt{2}} + \sqrt[3]{20 - 14\sqrt{2}};$$
6
Quartic Polynomials
and
$$(A + B)^3 = A^3 + 3A^2B + 3AB^2 + B^3.$$
Show that
$$(A + B)^4 = A^4 + 4A^3B + 6A^2B^2 + 4AB^3 + B^4.$$
Exercise 6.3. What is the reduced quartic equation associated to the quartic
equation
$$x^4 + 4x^3 - 2x + 4 = 0?$$
How are the solutions of the reduced quartic equation related to the solu-
tions of the original quartic equation?
$$z^4 + qz^2 + rz + s = 0$$
with real coefficients $q$, $r$, and $s$. Moving the lower-degree terms to the right
to obtain
$$z^4 + qz^2 = -rz - s,$$
we can regard the left side as a quadratic polynomial in $z^2$ and complete the
square by adding $qz^2 + q^2$ to both sides. This yields
$$(z^2 + q)^2 = qz^2 + q^2 - rz - s.$$
Our goal is to find a value of $t$ for which the right side becomes the square
of a linear polynomial in $z$.
We know that a monic quadratic polynomial $z^2 + bz + c$ is the square
of a linear polynomial precisely when it has a real root of multiplicity 2, or
equivalently, when its discriminant $b^2 - 4c$ is 0. A polynomial of the form
$$(q + 2t)z^2 - rz + (q^2 - s + 2qt + t^2)$$
Rewriting the expression on the left side as a cubic polynomial in the unknown
constant $t$ and multiplying by $-1$, we find that $t$ must satisfy
are squares.
Using the techniques we have learned for solving cubic equations, we
can find $t$, then factor the quadratic in $z$ as the square of a degree-one
polynomial $dz + e$. In this way, we obtain the quartic equation
$$z^2 + q + t = dz + e$$
and
$$z^2 + q + t = -dz - e.$$
Solving for $z$ yields four solutions to the original quartic equation. We (and
Ferrari) have reduced the problem of solving the original quartic equation
$z^4 + qz^2 + rz + s = 0$ to that of solving the cubic equation in $t$.
Let’s try Ferrari’s approach in an example, one we will pursue only to
the point of obtaining the reduced cubic equation that needs solving.
$$z^4 + 6z^2 - 60z + 36 = 0.$$
(i) Use Ferrari’s method to show this can be done by finding t satisfying
Let’s work out the next example to conclusion. It is missing both degree
three and degree two terms, so that the cubic equation to solve is already
reduced.
$$z^4 - 12z + 3 = 0.$$
(ii) Show that the right side of this equation will be the square of a degree-one
polynomial in $z$ if $t$ is a solution of the cubic equation
$$8t^3 - 24t - 144 = 0,$$
or
$$t^3 - 3t - 18 = 0.$$
(v) Substitute $t$ into the equation in $z$ and $t$ at the start of the exercise to
obtain
$$(z^2 + 3)^2 = 6z^2 + 12z + 6.$$
(vi) As Ferrari’s method ensures, the right side of this equation is the square
of a degree-one polynomial in $z$:
and
$$z^2 + \sqrt{6}\,z + (3 + \sqrt{6}) = 0.$$
$$z^4 - 12z + 3 = 0.$$
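The steps of this example can be replayed numerically. The sketch below takes $t = 3$, which satisfies $t^3 - 3t - 18 = 0$, uses $6z^2 + 12z + 6 = (\sqrt{6}z + \sqrt{6})^2$, and solves the two resulting quadratic equations. The helper `quadratic_roots` is our own and is not part of the text.

```python
import cmath
import math

def quartic(z):
    return z**4 - 12 * z + 3

t = 3                      # a root of t^3 - 3t - 18 = 0
d = e = math.sqrt(6)       # 6z^2 + 12z + 6 = (d*z + e)^2

def quadratic_roots(b, c):
    """Both roots of z^2 + b z + c, allowing non-real roots."""
    s = cmath.sqrt(b * b - 4 * c)
    return [(-b + s) / 2, (-b - s) / 2]

# z^2 + t = d*z + e   and   z^2 + t = -(d*z + e)
roots = quadratic_roots(-d, t - e) + quadratic_roots(d, t + e)

for z in roots:
    print(z, abs(quartic(z)))   # the second number should be essentially zero
```

Two of the four roots are real and two are non-real complex conjugates.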
Exercise 6.6. Let $q$ and $s$ be real numbers and consider the polynomial
$z^4 + qz^2 + s$.
(i) Assume that $q^2 - 4s \geq 0$. Show that $z^4 + qz^2 + s$ factors as
$(z^2 + l)(z^2 + m)$.
(ii) Assume that $q^2 - 4s < 0$ and observe that this implies $s > 0$. Verify
that a factorization of the form $(z^2 + l)(z^2 + m)$ can’t exist. Therefore,
to show that $z^4 + qz^2 + s$ factors as a product of quadratic polynomials
with real coefficients, we will instead seek a factorization of the more
general form
$$z^4 + qz^2 + s = (z^2 + kz + l)(z^2 + k'z + m),$$
where the coefficients $k$, $k'$, $l$, and $m$ are real and $k$ and $k'$ are non-zero.
Our search will occupy the remainder of the exercise.
(iii) Multiply the two quadratic polynomials, obtain expressions for the
coefficients of each power of $z$, and show that for equality to hold, we
must have $k' = -k$ and $l = m$. Thus, we can simplify our task and
look for a factorization of the form
$$z^4 + qz^2 + s = (z^2 + kz + l)(z^2 - kz + l).$$
(iv) Show next that for equality to hold, $k$ and $l$ must satisfy $l^2 = s$ and
$2l = k^2 + q$. We view these as equations in the unknowns $k$ and $l$,
with $q$ and $s$ regarded as known constants.
$$j^2 + 2qj + (q^2 - 4s) = 0.$$
(viii) Let $k$ and $-k$ represent the square roots of the positive root. Conclude
that $k$ and $-k$ give a value for $l$ and allow us to factor $z^4 + qz^2 + s$ as
a product of quadratic polynomials with real coefficients.
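As a quick illustration of where the exercise leads, the following sketch computes $k$ and $l$ when $q^2 - 4s < 0$. It assumes that the positive root of $j^2 + 2qj + (q^2 - 4s) = 0$ is $j = -q + 2\sqrt{s}$, which follows from rewriting the quadratic as $(j + q)^2 = 4s$. The function name is hypothetical.

```python
import math

def factor_biquadratic(q, s):
    """For q^2 - 4s < 0, return (k, l) with
    z^4 + q z^2 + s = (z^2 + k z + l)(z^2 - k z + l)."""
    if q * q - 4 * s >= 0:
        raise ValueError("use a factorization of the form (z^2 + l)(z^2 + m) instead")
    j = -q + 2 * math.sqrt(s)    # positive root of j^2 + 2qj + (q^2 - 4s) = 0
    k = math.sqrt(j)
    l = (k * k + q) / 2          # from 2l = k^2 + q
    return k, l

print(factor_biquadratic(0, 4))   # z^4 + 4 = (z^2 + 2z + 2)(z^2 - 2z + 2)
print(factor_biquadratic(1, 1))   # z^4 + z^2 + 1 = (z^2 + z + 1)(z^2 - z + 1)
```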
Exercise 6.7. Use the quadratic formula to show that the roots of
$z^4 + qz^2 + s$ are
$$\pm\sqrt{\frac{-q \pm \sqrt{q^2 - 4s}}{2}}.$$
$$(z^2 + kz + l)(z^2 + k'z + m).$$
Show that for this to hold, we must have $k' = -k$. Thus we can look
for a factorization of the form
$$z^4 + qz^2 + rz + s = (z^2 + kz + l)(z^2 - kz + m).$$
$$l + m - k^2 = q; \quad k(m - l) = r; \quad lm = s.$$
(iii) Since $k$ can’t be zero, we are free to divide by it. Do so in the second
equation and obtain from the first two equations the new equations
$$m + l = q + k^2; \quad m - l = \frac{r}{k}.$$
Using these, obtain
$$2m = q + k^2 + \frac{r}{k}; \quad 2l = q + k^2 - \frac{r}{k}.$$
Conclude that $l$ and $m$ are determined once $k$ is known.
(iv) We can multiply both sides of the equation $lm = s$ by 4 to obtain
$(2l)(2m) = 4s$. Substitute the expressions for $2l$ and $2m$ and obtain
$$k^6 + 2qk^4 + (q^2 - 4s)k^2 - r^2 = 0.$$
$$j^3 + 2qj^2 + (q^2 - 4s)j - r^2 = 0.$$
(vi) Deduce from Theorem 5.17 that the cubic equation in $j$ has a positive
real solution.
(vii) Let $k$ and $-k$ represent the square roots of this solution. Conclude that
$k$ and $-k$ give values for $l$ and $m$ and allow us to factor
$z^4 + qz^2 + rz + s$ as a product of quadratic polynomials with real coefficients. The
roots of the quadratic polynomials yield the solutions of the original
quartic equation.
$$x^4 + bx^3 + cx^2 + dx + e$$
$$z^4 + qz^2 + rz + s.$$
$$j^3 + 2qj^2 + (q^2 - 4s)j - r^2,$$
$$j^3 + 2qj^2 + (q^2 - 4s)j - r^2$$
be the given quartic’s associated resolvent cubic polynomial and let $k$ and
$-k$ be square roots of a positive real root of the resolvent cubic. Then
$z^4 + qz^2 + rz + s$ factors as
$$\left(z^2 + kz + \frac{1}{2}\left(q + k^2 - \frac{r}{k}\right)\right)\left(z^2 - kz + \frac{1}{2}\left(q + k^2 + \frac{r}{k}\right)\right).$$
The roots of $z^4 + qz^2 + rz + s$ are the roots of the two quadratic polynomials
in the factorization.
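The factorization can be turned into a short computation. The sketch below is one possible rendering, with function names of our own. It assumes a positive root $j$ of the resolvent cubic is already known, and tries it on $z^4 - 3z^2 + 6z - 2$, whose resolvent cubic $j^3 - 6j^2 + 17j - 36$ has the root $j = 4$.

```python
import cmath
import math

def descartes_factor(q, r, s, j):
    """Given a positive root j of the resolvent cubic, return (k, l, m) with
    z^4 + q z^2 + r z + s = (z^2 + k z + l)(z^2 - k z + m)."""
    residual = j**3 + 2 * q * j**2 + (q * q - 4 * s) * j - r * r
    assert abs(residual) < 1e-9, "j is not a root of the resolvent cubic"
    k = math.sqrt(j)
    l = (q + k * k - r / k) / 2
    m = (q + k * k + r / k) / 2
    return k, l, m

def quadratic_roots(b, c):
    """Both roots of z^2 + b z + c, allowing non-real roots."""
    disc = cmath.sqrt(b * b - 4 * c)
    return [(-b + disc) / 2, (-b - disc) / 2]

k, l, m = descartes_factor(-3, 6, -2, 4)
roots = quadratic_roots(k, l) + quadratic_roots(-k, m)
print(k, l, m)   # 2.0 -1.0 2.0, that is, (z^2 + 2z - 1)(z^2 - 2z + 2)
print(roots)
```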
The procedure for finding roots of quartic polynomials can be long and
complicated, but in principle it works. The hardest part is likely to be the
calculation of roots of the resolvent cubic, a problem we already understand.
Let’s try the procedure on some examples. They are designed so that
the polynomials arising as resolvent cubics have positive real roots that are
easy to find. Thus, in solving the quartic equations, we can focus on the
new features of the procedure and not get distracted by the now-familiar
difficulties of the cubic.
$$z^4 - 3z^2 + 6z - 2 = 0.$$
To do so, write the resolvent cubic, find a real root by guessing (squares of
integers are good guesses), take its two square roots, factor $z^4 - 3z^2 + 6z - 2$
as a product of two quadratic polynomials, and find their roots.
$$z^4 - 2z^2 - 8z - 3 = 0$$
Our third example is the quartic equation of Exercise 6.5. We will solve
it using Descartes’ method.
$$z^4 - 12z + 3 = 0.$$
In this case, since there is no square term, the resolvent cubic will be a
reduced cubic, so that Cardano’s formula can be applied directly. Find a
real root by using Cardano’s formula or by guessing. (Guessing may not
lead to a root that is the square of an integer, but it may lead to a positive
integer.) Then proceed as in the preceding exercises. Compare the solutions
to the result in Exercise 6.5 to make sure they are the same.
$$z^4 + qz^2 + rz + s$$
Theorem 6.3. Any quartic polynomial $f(x)$ with real coefficients factors
as a product of quadratic polynomials with real coefficients. Exactly one of
the following possibilities occurs for the roots of $f(x)$:
(i) $f(x)$ has four distinct real roots.
(ii) $f(x)$ has two distinct real roots and a pair of non-real complex conjugate
roots.
(iii) $f(x)$ has no real roots, but two pairs of distinct non-real complex
conjugate roots.
(iv) $f(x)$ has repeated roots.
Exercise 6.12. Deduce Theorem 6.3 for reduced quartic polynomials from
Exercise 6.6 and Theorem 6.2. Then explain why it holds for quartic
polynomials in general.
$$z^4 + qz^2 + rz + s = 0.$$
Exercise 6.13. Use the notation and assumptions of the preceding discussion.
(i) Explain why there are eight possible choices for the triple of numbers
$(k_1, k_2, k_3)$, depending on choice of signs.
(Hint: Use Theorem 5.14 on how the roots and coefficients of a cubic
polynomial are related.)
Exercise 6.14. Continue within the setting of Exercise 6.13. For definiteness,
let’s fix the choice of triple of square roots $(k_1, k_2, k_3)$ to be one of
the four satisfying $k_1 k_2 k_3 = -r$.
(i) Given our choice of $k_1$ as one of the square roots of the positive real
number $j_1$, we obtain from Theorem 6.2 the factorization of
$z^4 + qz^2 + rz + s$ as
$$\left(z^2 + k_1 z + \frac{1}{2}\left(q + k_1^2 - \frac{r}{k_1}\right)\right)\left(z^2 - k_1 z + \frac{1}{2}\left(q + k_1^2 + \frac{r}{k_1}\right)\right).$$
Use the quadratic formula to obtain expressions for the roots of
$$z^2 + k_1 z + \frac{1}{2}\left(q + k_1^2 - \frac{r}{k_1}\right).$$
Show that
$$z = -\frac{k_1}{2} \pm \frac{1}{2}\sqrt{k_1^2 - 2\left(q + k_1^2 - \frac{r}{k_1}\right)}.$$
(ii) Use the formulas from Exercise 6.13 for $2q$ and $r$ in terms of the $k_i$’s
as well as the choice of the triple $(k_1, k_2, k_3)$ to satisfy $k_1 k_2 k_3 = -r$
to write this equality as
$$z = -\frac{k_1}{2} \pm \frac{1}{2}\sqrt{(k_2 - k_3)^2}.$$
(iii) Conclude that the roots of the quadratic factor are
$$z = \frac{1}{2}(-k_1 + k_2 - k_3) \quad \text{and} \quad z = \frac{1}{2}(-k_1 - k_2 + k_3).$$
(iv) Follow the same procedure to find the roots of
$$z^2 - k_1 z + \frac{1}{2}\left(q + k_1^2 + \frac{r}{k_1}\right)$$
and show that you get
$$z = \frac{1}{2}(k_1 + k_2 + k_3) \quad \text{and} \quad z = \frac{1}{2}(k_1 - k_2 - k_3).$$
(v) Conclude that once the three roots $j_1$, $j_2$, $j_3$ of the resolvent cubic
polynomial are found, the roots of the quartic polynomial can be expressed
in terms of square roots of $j_1$, $j_2$, and $j_3$, as described in Theorem 6.4.
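The four expressions found in (iii) and (iv) can be tested on an example. The quartic below is our own illustration and is not taken from the text: $z^4 - 25z^2 + 60z - 36 = (z - 1)(z - 2)(z - 3)(z + 6)$, whose resolvent cubic $j^3 - 50j^2 + 769j - 3600$ has roots 9, 16, and 25. The sketch picks square roots $k_1$ and $k_2$ freely and then sets $k_3 = -r/(k_1 k_2)$, which forces $k_1 k_2 k_3 = -r$ and assumes $r \neq 0$.

```python
import cmath

def quartic_roots_from_resolvent(j1, j2, r):
    """Roots of z^4 + q z^2 + r z + s from two roots j1, j2 of its resolvent
    cubic, assuming r != 0 (so that no resolvent root is zero)."""
    k1 = cmath.sqrt(j1)
    k2 = cmath.sqrt(j2)
    k3 = -r / (k1 * k2)      # a square root of j3 chosen so that k1*k2*k3 = -r
    return [
        (k1 + k2 + k3) / 2,
        (k1 - k2 - k3) / 2,
        (-k1 + k2 - k3) / 2,
        (-k1 - k2 + k3) / 2,
    ]

roots = quartic_roots_from_resolvent(9, 16, 60)
print(sorted(z.real for z in roots))   # approximately -6, 1, 2, 3
```

Because `cmath.sqrt` is used, the same routine also covers negative resolvent roots, where the $k_i$ are pure imaginary.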
$$j^3 + 2qj^2 + (q^2 - 4s)j - r^2$$
$$z^4 + qz^2 + rz + s.$$
Let’s solve two quartic equations using Theorem 6.4. The equations
have been chosen so that the roots of the resolvent cubic polynomials are
easily found by trial and error.
In the last exercise, the two quartic equations have the same resolvent
cubic equation. This is a special case of a general phenomenon.
$$z^4 + qz^2 + rz + s = 0$$
and
$$z^4 + qz^2 - rz + s = 0$$
have the same resolvent cubic equation. Describe how the solutions of the
first are related to the solutions of the second.
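is defined in the same way. Suppose its roots are $r_1$, $r_2$, $r_3$, and $r_4$. Then its
discriminant $\Delta$ is the product
$$(r_1 - r_2)^2 (r_1 - r_3)^2 (r_1 - r_4)^2 (r_2 - r_3)^2 (r_2 - r_4)^2 (r_3 - r_4)^2.$$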
Because the root differences are squared, we get the same result regardless
of how the roots are ordered.
For quadratic and cubic polynomials, the discriminant is important for
two reasons. First, its sign gives us information on the nature of the roots:
how many are real numbers and how many are non-real complex numbers.
Second, there is a formula expressing the discriminant in terms of the co-
efficients of the polynomial, allowing us to calculate the discriminant from
the coefficients and thereby obtain information about the roots from the co-
efficients alone.
We will show for a quartic polynomial that the discriminant gives in-
formation on the nature of the roots, although not as complete as in the
quadratic and cubic cases, and that it can be calculated from the coefficients.
Exercise 6.17. In this exercise, we will find out what the discriminant of
a quartic polynomial tells us about the nature of its roots. Let $f(x)$ be the
polynomial, with discriminant $\Delta$, and let $r_1$, $r_2$, $r_3$, and $r_4$ be its roots.
(i) Check that $\Delta = 0$ precisely when at least two of the roots coincide.
(ii) Assume in the remainder of the exercise that the four roots are distinct,
so that $\Delta \neq 0$. From Theorem 6.3, there are three possibilities
for the roots: all are real, two are real and two are non-real complex
conjugates, or there are two pairs of non-real complex conjugates.
(iii) Show that if all four roots are real, then $\Delta > 0$.
(iv) Suppose two roots are real and two are complex conjugates. Show that
$\Delta < 0$. (Hint: The product of a non-zero complex number and its
conjugate is a positive real number. Name the four roots $r$, $s$, $a + bi$,
and $a - bi$, where $r$, $s$, $a$, and $b$ are real numbers. There are six
root differences. Four of the six differences occur as pairs of conjugate
complex numbers, so that their products are real. One of the remaining
root differences is real and the other is pure imaginary. From this it
follows that $\Delta$ is negative.)
(v) Suppose that the four roots occur in two pairs of non-real complex
conjugates. Show that $\Delta > 0$. (Hint: Pair four of the differences so
that they are conjugates and their products are real. The other two root
differences should be pure imaginary.)
(vi) Show that the converses of the statements in (iii)–(v) hold; that is,
prove Theorem 6.5. (Hint: As in Exercises 4.7 and 5.7, these follow
from the statements themselves by an elementary logical argument.)
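The three sign patterns of this exercise can be observed on small examples. In the sketch below the sample root sets are our own choices: the roots of $(z - 1)(z - 2)(z - 3)(z + 6)$, of $z^4 - 1$, and of $z^4 + 4$.

```python
from itertools import combinations

def discriminant(roots):
    """Product of the squared differences of the roots."""
    prod = 1
    for u, v in combinations(roots, 2):
        diff = u - v
        prod *= diff * diff
    return prod

four_real = discriminant([1, 2, 3, -6])                          # all roots real
two_real = discriminant([1, -1, 1j, -1j])                        # roots of z^4 - 1
none_real = discriminant([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])     # roots of z^4 + 4

print(four_real)   # positive
print(two_real)    # negative real number
print(none_real)   # positive real number
```

The values are 1016064, $-256$, and 16384, so the signs are positive, negative, and positive, as parts (iii) to (v) predict.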
Exercise 6.18. Let $q$ and $s$ be real numbers. Use the description of the
roots of
$$z^4 + qz^2 + s$$
in Exercise 6.7 to calculate the discriminant, obtaining Theorem 6.6.
$$k_1 + k_2, \quad k_1 + k_3, \quad k_2 + k_3,$$
$$k_1 - k_2, \quad k_1 - k_3, \quad k_2 - k_3.$$
(iii) The $k_i$’s are the square roots of the solutions $j_1$, $j_2$, $j_3$ of the resolvent
cubic equation. Using the $j_i$’s, write the equation for $\Delta$ as
$$\Delta = (j_1 - j_2)^2 (j_1 - j_3)^2 (j_2 - j_3)^2.$$
Since the $j_i$’s are by definition the roots of the resolvent cubic polynomial,
the product on the right side is in fact the discriminant of the
resolvent cubic.
(ii) Use the formula for the discriminant of a reduced cubic polynomial
$y^3 + py + q$ in terms of $p$ and $q$ to show that its discriminant is
$$256s^3 - 27r^4.$$
(ii) Conclude that it is also the discriminant of the reduced quartic polynomial
$z^4 + qz^2 + rz + s$ with which we began.
(iii) Expand and simplify the expression in $q$, $r$, and $s$. Combine with Theorem
6.6 to conclude that Theorem 6.7 holds.
$$x^4 + bx^3 + cx^2 + dx + e.$$
From Exercise 6.2 and Theorem 6.1, the change of variable $x = z - (b/4)$
transforms $x^4 + bx^3 + cx^2 + dx + e$ into a polynomial of the form
$z^4 + qz^2 + rz + s$.
(i) Verify, as in Exercise 5.10, that although the roots of
$x^4 + bx^3 + cx^2 + dx + e$ and the roots of $z^4 + qz^2 + rz + s$ are not the same,
their differences are.
(ii) Deduce that the discriminant of $x^4 + bx^3 + cx^2 + dx + e$ is the same
as the discriminant of $z^4 + qz^2 + rz + s$.
(iii) Conclude that a formula for the discriminant of $x^4 + bx^3 + cx^2 + dx + e$
in terms of $b$, $c$, $d$, and $e$ can be found by substituting into the formula
for the discriminant of $z^4 + qz^2 + rz + s$ the expressions given by
Theorem 6.1 for $q$, $r$, and $s$ in terms of $b$, $c$, $d$, and $e$.
(iv) Carry out this calculation, simplify, and conclude that Theorem 6.8
holds.
$z^4 + qz^2 + s$
$z^4 + qz^2 + rz + s.$
(i) Deduce, since $\Delta$ is also the discriminant of the resolvent cubic polynomial
$j^3 + 2qj^2 + (q^2 - 4s)j - r^2,$
that its roots $j_1$, $j_2$, and $j_3$ are real and distinct.
(ii) From Theorem 5.14, deduce that
$j_1 j_2 j_3 = r^2.$
(iii) Conclude that either all the roots $j_i$ are positive or one is positive and
the other two are negative.
(iv) From Theorem 5.16, deduce that the $j_i$ are all positive precisely when
$q < 0$ and $q^2 - 4s > 0$.
(v) If the $j_i$ are all positive, then their square roots $\pm k_i$ are all real numbers.
Conclude from Theorem 6.4 that the four roots of the quartic
$z^4 + qz^2 + rz + s$ are all real.
(vi) Suppose $j_1$ is positive and $j_2$ and $j_3$ are negative. Conclude that the
two square roots $\pm k_1$ are real while $\pm k_2$ and $\pm k_3$ are pure imaginary.
Show in this case that the four roots of $z^4 + qz^2 + rz + s$ are all
non-real complex numbers.
(vii) Conclude that the roots of $z^4 + qz^2 + rz + s$ are all real precisely when
$q < 0$ and $q^2 - 4s > 0$.
Exercise 6.27. For each of the quartic polynomials, use the discriminant
and the coefficients to decide how many roots are real.
(i) $z^4 + z + 1$.
(ii) $z^4 + z - 1$.
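Hand computations of this kind can be cross-checked with a few lines of code. The sketch below is only an illustration, not part of the text's development: the function names are hypothetical, the closed form used for the discriminant of $z^4 + qz^2 + rz + s$ is the standard one (it reduces to $256s^3 - 27r^4$ when $q = 0$), and the sign test assumes the discriminant is non-zero. The sample inputs are two quartics that appear in Section 6.8, not the ones in the exercise.

```python
def reduced_quartic_discriminant(q, r, s):
    """Discriminant of z^4 + q*z^2 + r*z + s (standard closed form)."""
    return (16 * q**4 * s - 4 * q**3 * r**2 - 128 * q**2 * s**2
            + 144 * q * r**2 * s - 27 * r**4 + 256 * s**3)


def count_real_roots(q, r, s):
    """Number of real roots of z^4 + q*z^2 + r*z + s, assuming distinct roots."""
    disc = reduced_quartic_discriminant(q, r, s)
    if disc == 0:
        raise ValueError("repeated root: the sign test does not apply")
    if disc < 0:
        return 2  # two real roots and one pair of complex conjugates
    if q < 0 and q * q - 4 * s > 0:
        return 4  # positive discriminant, all roots real
    return 0      # positive discriminant, two pairs of complex conjugates


# Euler's quartic x^4 - 25x^2 + 60x - 36 and Descartes's x^4 - 4x^2 - 8x + 35.
euler_count = count_real_roots(-25, 60, -36)
descartes_count = count_real_roots(-4, -8, 35)
```

Integer arithmetic is used throughout, so the signs are exact for integer coefficients.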
Theorem 6.11 tells us in particular that the sum of the roots of a reduced
quartic polynomial is 0. We can now complete our analysis of the roots of
such a polynomial when it has zero discriminant.
(a) If $q < 0$ and $q^2 - 4s > 0$, the roots are all real. If $q^2 + 12s = 0$,
there is a non-zero real root $a$ of multiplicity 3 and $-3a$ is a root
of multiplicity 1. If $q^2 + 12s \neq 0$, there is a non-zero real root $a$
of multiplicity 2 and there are two real roots of the form $-a + c$
and $-a - c$ for some non-zero real number $c$ with $c \neq \pm 2a$.
(b) If it is not the case that $q < 0$ and $q^2 - 4s > 0$, then there is a
non-zero real root $a$ of multiplicity 2 and there are two complex
conjugate roots of the form $-a + bi$ and $-a - bi$, for some
non-zero real number $b$.
(iii) Show that if all the roots are real, they have the form $a, a, -a + c, -a - c$
for some non-zero real number $c$, but if they are not all real, they have
the form $a, a, -a + bi, -a - bi$ for some non-zero real number $b$.
(iv) Use Theorem 6.11 to calculate $q$, $r$ and $s$ in terms of the roots. Verify
that $q < 0$ and $q^2 - 4s > 0$ if the roots are all real, but not otherwise.
(v) Compute $q^2 + 12s$ in terms of the roots and show that it is the square
of a simple expression. Check that it can’t be 0 unless the roots are all
real, in which case it is 0 precisely when $c = \pm 2a$. This is exactly the
case of a multiplicity 3 root.
These theorems provide our final illustration of the theme that a poly-
nomial’s coefficients encode rich information on its roots.
$x^3 + 6x^2 + 3x + 18 = 0$
and
$x^3 - 3x^2 - 10x + 24 = 0.$
Interweave a concrete treatment of the equations with the general discus-
sion, including the following points.
(i) The reduction of arbitrary cubic equations to reduced cubic equations,
so that Cardano’s formula can be applied.
(ii) How Cardano’s formula can be used to obtain not just one but three
solutions to the reduced cubic equation. In particular, explain which
cube roots to pair in the formula in order to obtain the three solutions.
(iii) How to compute cube roots of non-real complex numbers.
(iv) How to sidestep such computations by relying instead on trigonometric
and inverse trigonometric functions.
$x^4 - 4x^3 + 3x^2 + 8x - 10 = 0$
and
$x^4 - 42x^2 + 64x + 105 = 0.$
Interweave a concrete treatment of the equations with the general discus-
sion, including the following points.
(i) The reduction of arbitrary quartic equations to reduced quartic equa-
tions, and its application to the first quartic.
(ii) The solving of a reduced quartic equation, such as the one obtained,
by writing the reduced quartic polynomial in it as a product of two
(v) The observation, after guessing a solution to this cubic equation, that
we can use the procedure described for the first quartic to factor the
second one as a product of two quadratic polynomials, followed by
an explanation that we can instead describe the roots of the reduced
quartic in terms of the three solutions of the cubic equation.
(vi) A list of eight candidates for solutions to the original quartic equation
in terms of the three solutions to the cubic equation, and an explanation
of which four to choose. (Provide an explanation that does not depend
on substituting all eight candidates in the original equation.)
(vii) A description of a quartic equation whose solutions are the other four
candidates, and an explanation of why they are solutions.
(viii) A discussion of how these ideas can be used to solve any quartic equa-
tion, once a method is available for solving cubic equations.
Exercise 6.34. Write an essay about the role the discriminant plays in un-
derstanding the solutions of quadratic, cubic, and quartic equations. Ad-
dress the following issues.
(i) The definition of the discriminant of a polynomial of degree 2, 3, or 4
in terms of its roots, and the information it gives about the roots.
(ii) The formula for the discriminant of a polynomial of degree 2, 3, or 4
in terms of its coefficients, and how the formulas make it possible to
obtain information about the roots from the coefficients.
(iii) The quadratic formula for the roots of a quadratic polynomial and the
appearance of the discriminant within the formula.
(iv) Cardano’s formula for the roots of a reduced cubic polynomial and the
appearance of the discriminant within the formula.
(v) The way in which the sign of the discriminant of a cubic polynomial
affects the calculations in Cardano’s formula, with the need to compute
cube roots of non-real complex numbers in certain cases.
(vi) How to combine information from the discriminant of a quartic poly-
nomial with information from the quartic’s coefficients to refine our
understanding of the nature of the quartic’s roots.
6.8 History
Lodovico Ferrari was the first person to develop a method for solving quar-
tic equations. We already introduced him in Section 3.5, learning that he
was Cardano’s assistant, that Cardano described his quartic solution in Ars
Magna, and that he participated in the famous dispute in 1548 with Tartaglia.
Let us look at Cardano’s presentation of Ferrari’s work before moving on to
contributions of two intellectual giants from the following centuries, René
Descartes and Leonhard Euler.
Cardano presents the quartic solution in the penultimate chapter of Ars
Magna, making the following transition [12, p. 237]:
There is another rule, more noble than the preceding. It is Lodovico
Ferrari’s, who gave it to me on my request. Through it we have all the
solutions for equations of the fourth power, square, first power, and
number, or of the fourth power, cube, square, and number, and I set
them out here in order.
What Cardano sets out are twenty families of quartic equations, depending
on the signs of the coefficients.
After describing a procedure for handling the equations, Cardano works
nine examples. The first example is introduced by [12, p. 239]:
For example, divide 10 into three proportional parts, the product of the
first and second of which is 6. This was proposed by Zuanne de Tonini
de Coi, who said it would not be solved. I said it could, though I did not
yet know the method [for doing so]. This was discovered by Ferrari.
Let the mean be $x$. The first, then, will be $6/x$ and the third will be
$x^3/6$. These equal 10. Multiplying all terms by $6x$, we will have
$60x = x^4 + 6x^2 + 36.$
challenge. Some pages later, in one of the most historic passages of western
literature, Descartes arrives at his first item of knowledge [18, p. 28]:
Descartes works out some examples of this method, then explains how
to transform a polynomial equation to a new one whose roots are all mul-
tiplied or divided by a given quantity. It is exciting to read these pages and
see now-familiar algebraic notions and formulations coming to life for the
first time.
Just a few pages later [19, pp. 180–183], before discussing cubic equa-
tions, Descartes turns to factoring quartic polynomials as products of quadrat-
ics. He passes to a reduced quartic (with p, q, and r as coefficients rather
than our q, r , and s), then introduces the resolvent cubic without ado:
Again, given an equation in which the unknown quantity has four dimensions
. . . we must increase or diminish the roots so as to remove
the second term, in the way already explained, and then reduce it to
another of the third degree, in the following manner: Instead of
$x^4 \pm px^2 \pm qx \pm r = 0$
write
$y^6 \pm 2py^4 + (p^2 \pm 4r)y^2 - q^2 = 0.$
For the ambiguous sign put $+2p$ in the second expression if $+p$ occurs
in the first; but if $-p$ occurs in the first, write $-2p$ in the second . . . .
For example, given
$x^4 - 4x^2 - 8x + 35 = 0$
replace it by
$y^6 - 8y^4 - 124y^2 - 64 = 0.$
For since $p = -4$, we replace $2py^4$ by $-8y^4$; . . . .
Similarly, instead of
$x^4 - 17x^2 - 20x - 6 = 0$
we must write
$y^6 - 34y^4 + 313y^2 - 400 = 0,$
for 34 is twice 17, and 313 is the square of 17 increased by four times
6, and 400 is the square of 20.
We do not bother with the distinction Descartes maintains between positive
and negative coefficients, and therefore would dispense with his clarifica-
tions on how to choose signs in writing the resolvent cubic equation.
Descartes presents a third example of a quartic and its resolvent cu-
bic, one with coefficients that are algebraic expressions rather than specific
numbers, then explains how to use the resolvent cubic to factor a quartic
[19, p. 184]:
If, however, the value of $y^2$ can be found, we can by means of it separate
the preceding equation into two others, each of the second dimension,
whose roots will be the same as those of the original equation.
Instead of $x^4 \pm px^2 \pm qx \pm r = 0$, write the two equations
$+x^2 - yx + \frac{1}{2}y^2 \pm \frac{1}{2}p \pm \frac{q}{2y} = 0$
and
$+x^2 + yx + \frac{1}{2}y^2 \pm \frac{1}{2}p \pm \frac{q}{2y} = 0.$
. . . It is then easy to determine all the roots of the proposed equation.
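Descartes's recipe can be tried out on his own first example. The following is a minimal sketch, assuming the modern convention in which $p$, $q$, and $r$ carry their own signs, so that the quartic is $x^4 + px^2 + qx + r$ and the two factors are $x^2 - yx + \frac{1}{2}y^2 + \frac{1}{2}p + \frac{q}{2y}$ and $x^2 + yx + \frac{1}{2}y^2 + \frac{1}{2}p - \frac{q}{2y}$. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def descartes_resolvent(p, q, r):
    """Coefficients of y^6, y^4, y^2, 1 in the resolvent for x^4 + p*x^2 + q*x + r."""
    return (1, 2 * p, p * p - 4 * r, -q * q)


def descartes_factors(p, q, r, y):
    """Given y != 0 whose square is a root of the resolvent (read as a cubic in y^2),
    return (b, c) pairs describing the two factors x^2 + b*x + c."""
    y = Fraction(y)
    common = (y * y + p) / 2       # (1/2)y^2 + (1/2)p
    shift = Fraction(q) / (2 * y)  # q / (2y)
    return (-y, common + shift), (y, common - shift)


def multiply_quadratics(first, second):
    """Expand (x^2 + b1*x + c1)(x^2 + b2*x + c2); coefficients, highest degree first."""
    (b1, c1), (b2, c2) = first, second
    return [1, b1 + b2, c1 + c2 + b1 * b2, b1 * c2 + b2 * c1, c1 * c2]


# Descartes's example x^4 - 4x^2 - 8x + 35, using the root y^2 = 16 of its resolvent.
p, q, r = -4, -8, 35
_, c4, c2, c0 = descartes_resolvent(p, q, r)
resolvent_at_16 = 16**3 + c4 * 16**2 + c2 * 16 + c0
factors = descartes_factors(p, q, r, 4)
product = multiply_quadratics(*factors)
```

Exact rational arithmetic is used so that the product of the two quadratics can be compared coefficient by coefficient with the original quartic.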
We will suppose that the root of an equation of the fourth degree has
the form, $x = \sqrt{p} + \sqrt{q} + \sqrt{r}$, in which the letters $p$, $q$, $r$ express the
roots of an equation of the third degree, such as $z^3 - fz^2 + gz - h = 0$;
so that $p + q + r = f$; $pq + pr + qr = g$; and $pqr = h$.
$x^4 - ax^2 - bx - c = 0,$
$f = \frac{1}{2}a$, $g = \frac{1}{16}a^2 + \frac{1}{4}c$, and $h = \frac{1}{64}b^2$, or $\sqrt{h} = \frac{1}{8}b$,
This method appears at first to furnish only one root of the given equation;
but if we consider that every sign $\sqrt{\ }$ may be taken negatively, as
well as positively, we immediately perceive that this formula contains
all the four roots.
A short discussion follows, like the last part of Exercise 6.14 and of
Exercise 6.16. Euler explains that there are eight possibilities for the sum of
the three square roots, but the correct four are chosen by using the rule that
the square roots’ product must equal $b/8$. Euler’s notation doesn’t match
ours, nor does his resolvent cubic, but it is easy to check that with the
appropriate change in the coefficient labels and signs and a change of variable
by a factor of 4, his result coincides with Theorem 6.4.
Procedure in hand, Euler solves the equation $x^4 - 25x^2 + 60x - 36 = 0$
[26, pp. 284–285]. Let’s do so too.
$x^4 - 25x^2 + 60x - 36 = 0$
using the version of Euler’s approach given in Theorem 6.4. One root of
its resolvent cubic can be found by testing various integer squares. Using
it, factor the cubic and find the other two roots. Euler constructed a very
simple example indeed.
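Once the exercise has been worked by hand, the answer can be checked mechanically. The sketch below rests on two assumptions that are not spelled out in this passage: that the resolvent cubic of $z^4 + qz^2 + rz + s$ is $j^3 + 2qj^2 + (q^2 - 4s)j - r^2$, and that the roots of the quartic are the half-sums $\frac{1}{2}(\pm k_1 \pm k_2 \pm k_3)$ whose signed square roots have product $-r$ (the counterpart, in our notation, of Euler's rule that the product equals $b/8$). The function names and the search limit are hypothetical, and the search only succeeds when all three resolvent roots are distinct squares of positive integers.

```python
from fractions import Fraction
from itertools import product


def resolvent_cubic_value(q, r, s, j):
    """Value at j of the resolvent cubic j^3 + 2q*j^2 + (q^2 - 4s)*j - r^2."""
    return j**3 + 2 * q * j**2 + (q * q - 4 * s) * j - r * r


def quartic_roots_from_integer_squares(q, r, s, search_limit=50):
    """Roots of z^4 + q*z^2 + r*z + s, found by testing integer squares
    k*k (1 <= k <= search_limit) as roots of the resolvent cubic."""
    ks = [k for k in range(1, search_limit + 1)
          if resolvent_cubic_value(q, r, s, k * k) == 0]
    if len(ks) != 3:
        raise ValueError("resolvent roots are not three distinct integer squares")
    k1, k2, k3 = ks
    roots = []
    for e1, e2, e3 in product((1, -1), repeat=3):
        # Keep the four sign choices whose signed product equals -r.
        if e1 * e2 * e3 * k1 * k2 * k3 == -r:
            roots.append(Fraction(e1 * k1 + e2 * k2 + e3 * k3, 2))
    return sorted(roots)


euler_roots = quartic_roots_from_integer_squares(-25, 60, -36)
```

Substituting each entry of `euler_roots` back into the quartic gives an independent confirmation that does not rely on the sign rule.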
7 Higher-Degree Polynomials
$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0,$
$y^n + b_{n-2}y^{n-2} + \cdots + b_1 y + b_0 = 0,$
$z^n + c_{n-3}z^{n-3} + \cdots + c_1 z + c_0 = 0,$
where the new coefficients are expressible in terms of the old ones. The
equation has no terms of degrees $n - 1$ or $n - 2$ and is called a principal
equation. If we can solve a principal equation, then Tschirnhausen’s change of
variable process allows us to pass back to solutions of the reduced equation,
perhaps having to calculate some square roots to do so. From the solutions
to the reduced equation, we get solutions of the original equation.
Applying Tschirnhausen’s idea to a reduced equation of degree 3, we
obtain a cubic equation of the form $z^3 + c = 0$, which can be solved by
computing cube roots. Square root calculations lead to the roots of the
reduced equation and adding a suitable constant leads to the roots of a general
cubic. Cardano’s formula can be derived in this way.
The process of passing from a reduced equation to a principal equation
is an example of a general process called a Tschirnhausen transformation.
Applying it to quintic polynomials, we are led to quintic equations of the
form
$z^5 + cz^2 + dz + e = 0.$
Erland Bring (1736–1798) used more complicated Tschirnhausen transfor-
mations to go one step further. He showed in 1786 that the problem of solv-
ing a quintic equation can be reduced to that of solving a quintic equation
of the form
$w^5 + dw + e = 0.$
Such equations are now called quintic equations in Bring-Jerrard form, in
honor of Bring and of George Jerrard (1804–1863), who used the same idea
to study equations of higher degree.
Several eighteenth-century mathematicians tried to solve the general
quintic equation, including Bring, Etienne Bézout, Edward Waring, La-
grange, and—no surprise—Euler. We saw in Section 5.2 that Euler solved
reduced cubic equations in Elements of Algebra [26] by determining, for a
quadratic polynomial with roots $U$ and $V$, the coefficients of a cubic
polynomial whose roots have the form $\sqrt[3]{U} + \sqrt[3]{V}$. Going backwards, he
associated to a reduced cubic polynomial its resolvent quadratic polynomial
and found the roots of the cubic as sums of cube roots of the roots of the
quadratic. He had a similar approach to solving reduced quartic equations
using resolvent cubics, as we discussed in Section 6.8. We concluded that
section with his remark that efforts (up to 1770) to resolve equations of the
fifth degree had been unsuccessful.
Earlier, in a 1732 paper, Euler had conjectured that the roots of a quintic
polynomial take the form
$\sqrt[5]{A_1} + \sqrt[5]{A_2} + \sqrt[5]{A_3} + \sqrt[5]{A_4},$
Ruffini sent Lagrange a copy of the book, but received no reply. In 1803,
Ruffini published a paper with another proof. In 1806, he published yet
another proof, and in 1813 he published a paper in which he expressed his
disappointment at the poor reception his work received. As it turns out,
Lagrange had read his work, but did not think the proof was complete and
chose not to express his approval. Other contemporaries also did not find
the proof complete or correct.
Raymond Ayoub has given an excellent appraisal of Ruffini’s work in
“Paolo Ruffini’s Contributions to the Quintic” [5], along with an account
of the work of Lagrange, Gauss, and others. It is clear in retrospect, as
emerges from Ayoub’s account, that Ruffini did not receive proper credit
for his contributions. One exception was the response of the great French
mathematician Augustin-Louis Cauchy, who wrote to Ruffini in 1821 [5,
p. 271] that “your memoir on the general resolution of equations is a work
which has always seemed to me worthy of the attention of mathematicians
and which, in my judgement, proves completely the insolvability of the
general equation of degree > 4.”
It fell to the Norwegian mathematician Niels Abel (1802–1829) to pro-
vide the first widely recognized proof that there can be no general solu-
tion of quintic equations by radicals. His result was published in 1824 [1].
A few years later, the French mathematician Évariste Galois (1811–1832)
provided another approach, one that would revolutionize algebra [31]. His
proof can be found in every graduate-level text on algebra (such as Abstract
Algebra, by David S. Dummit and Richard M. Foote [22, p. 629]) and some
undergraduate texts as well, and his ideas continue to influence mathematics
today.
The lives of Abel and Galois are of great interest. Both were mathemat-
ical geniuses of the first rank. Both died young. Galois is notorious for his
dramatic death at 20 from wounds incurred in a duel, and for the image of
him feverishly writing out his mathematical ideas in a letter on the eve of
the duel. In both cases, we can only dream of the glorious discoveries they
would have made had they lived longer, and reflect on the unfairness of life.
Since their stories are well told elsewhere, we will move on. Especially rec-
ommended for further reading is Peter Pesic’s Abel’s Proof: An Essay on
the Sources and Meaning of Mathematical Unsolvability [52].
There is one more twist to the tale. Ruffini deserves credit as the first
mathematician to focus attention on proving that the quintic cannot be solved
by radicals rather than searching for such a solution. But Gauss had similar
inklings. In his 1799 doctoral dissertation, Gauss wrote [5, pp. 262–263]:
After the labors of many geometers left little hope of ever arriving at
the resolution of the general equation algebraically, it appears more
$(x^2 + bx + c)g(x),$
Theorem 7.3. Let $f(x)$ be a polynomial of degree $n > 0$ with real
coefficients.
(i) $f(x)$ factors as a constant multiple of a polynomial of the form
$(x - r_1) \cdots (x - r_j)(x^2 + b_1 x + c_1) \cdots (x^2 + b_k x + c_k),$
$(x - r_1) \cdots (x - r_n),$
where $r_1, \ldots, r_n$ are the complex numbers that are roots of $f(x)$.
The first mathematician to state the theorem was Albert Girard, in 1629,
in Invention Nouvelle en l’Algèbre [32]. We briefly examined Girard’s ac-
count of cubic equations in Section 5.8. After discussing cubics, he states
a theorem that begins, “All algebraic equations have as many solutions as
the size of the highest quantity.” Girard makes no effort to provide justi-
fication, but he illustrates the statement with examples. For the first one,
Girard writes, “Given the equation $x^4 = 4x^3 + 7x^2 - 34x - 24$, the size of
the highest quantity is 4, which signifies that there are 4 certain solutions,
neither more nor less.” Girard’s final example is $x^4 = 4x - 3$, for which
he lists the solutions $1, 1, -1 + \sqrt{-2}, -1 - \sqrt{-2}$. From this, we can infer
that Girard intends his theorem to be understood in the context of complex
numbers, with roots counted according to their multiplicities. Thus, he has
provided a correct statement of the fundamental theorem.
Several of the leading mathematicians of the eighteenth century—Jean-
le-Rond d’Alembert, Euler, Lagrange, and Pierre-Simon Laplace (1749–
1827)—attempted to prove the fundamental theorem, but their arguments
were not complete. (See Chapter 6 of Dunham’s Euler: The Master of Us
All [23] for an account of Euler’s efforts.) One problem with these early
proofs was their reliance on the implicit assumption that a polynomial has
roots somewhere, without it being clear where. Within this mysterious domain,
they would then show that the roots actually lie in the complex numbers. As
Remmert puts it [55, pp. 98–99], “until Gauss all mathematicians believed
in the existence of solutions in some sort of no-man’s land . . . and tried
imaginatively to show that these solutions were in fact complex numbers.”
It was Gauss who first pointed out this problem, in his doctoral dissertation
of 1799 [55, p. 104]:
Gauss was not yet aware of Laplace’s proof. In 1815, he would subject
Laplace to the same criticism [55, p. 105]: “The ingenious way in which
Laplace dealt with this matter cannot be absolved from the main objections
affecting all these attempted proofs.”
Gauss offered not just criticism in his 1799 dissertation, but also a proof
of his own, one that did not rely on the existence of roots in some unspeci-
fied domain. Rather, he set out to prove their existence from scratch, within
the complex numbers. Although Gauss avoided the error of his predeces-
sors, his proof, which is topological in nature, had gaps.
The first proof of the fundamental theorem that appears to be free of
error is one given by Argand in 1814, relying on the existence of a minimum
for a continuous function. He doesn’t justify the existence of the minimum,
something Cauchy would later do. Gauss would continue to give additional
proofs using a variety of methods. His second proof, from 1816, is algebraic
in nature, and correct. It completes Euler’s argument.
Regarding the gap in the proofs of Gauss’s predecessors, Remmert ob-
serves [55, p. 109]:
Nowadays one can only speculate about how mathematicians before
the beginning of the nineteenth century had visualized the solutions
of equations in their mind’s eye. It is difficult for us to understand
why, until the time of Gauss, they had an unshakable belief in a kind
of “extraterrestrial” existence of such solutions “somewhere or other,”
and then sought to show that these solutions were complex numbers.
With the development of algebra in the nineteenth century, it became a sim-
ple matter to construct the extraterrestrial solutions.
In studying polynomial equations, we typically wish to find solutions
in some familiar domain of numbers, such as the rational numbers, the real
numbers, or perhaps the complex numbers. As our historical discussion sug-
gests, it may be convenient, at least provisionally, to search for solutions in
a broader domain of numbers, one that may contain the complex numbers
and be large enough to contain roots of the polynomial. If we can introduce
such a domain, we can then work within it to show that the roots are in fact
complex numbers. This leads to the algebraic notion of a field.
The families of numbers with which we are most familiar—integers,
rationals, reals, and complexes—all have certain basic arithmetic properties
in common. To describe them, let us write K to denote any of these four
sets of numbers. Then K satisfies:
(1) Commutativity: Any two numbers $a$ and $b$ in $K$ satisfy $a + b = b + a$
and $a \cdot b = b \cdot a$.
(2) Associativity: Any three numbers $a$, $b$, and $c$ in $K$ satisfy
$(a + b) + c = a + (b + c)$ and $(a \cdot b) \cdot c = a \cdot (b \cdot c)$.
(3) Distributivity: Any three numbers $a$, $b$, and $c$ in $K$ satisfy
$a \cdot (b + c) = a \cdot b + a \cdot c$.
$f(x) = (x - r_1)(x - r_2) \cdots (x - r_n).$
We call $K$ a field extension of $\mathbb{C}$ and say that $f(x)$ splits into linear
factors over $K$. We may also call $K$ a splitting field for $f(x)$, although this
term is usually reserved for a field of this type that is minimal in a suitable
sense.
Theorem 7.5 is the result that the mathematicians of the eighteenth cen-
tury were missing, or unjustifiably took for granted. It is purely algebraic,
the proof requiring no property of the real numbers other than the fact that
they form a field. Once it is available, the proofs of the fundamental the-
orem that Gauss criticized can now be completed [55, pp. 107–108]: “The
Gaussian objection against the attempts of Euler-Lagrange and Laplace was
invalidated as soon as Algebra was able to guarantee the existence of a split-
ting field for every polynomial. From that moment on, as Adolf Kneser al-
ready observed in 1888, the attempted proofs became in effect fully valid.”
We will discuss Theorem 7.5 in Section 7.3 and use it in Section 7.5,
where we present Laplace’s 1795 proof of the fundamental theorem.
$(x - a_1)(x - a_2) \cdots (x - a_k)g(x)$
M&M’s, Lego bricks—as long as addition and multiplication rules are in-
troduced that satisfy the properties. We will have a better perspective on the
notion of a field if we introduce two examples besides rational, real, and
complex numbers.
Binary arithmetic gives one example. Write $\mathbb{F}_2$ for the set of symbols 0
and 1 with the familiar addition rules $0 + 0 = 0$ and $0 + 1 = 1 + 0 = 1$
along with the not-so-familiar rule $1 + 1 = 0$. For multiplication, we will
adopt the usual rules $0 \cdot 0 = 0 \cdot 1 = 1 \cdot 0 = 0$ and $1 \cdot 1 = 1$.
Exercise 7.2. Under the definitions for addition and multiplication, verify
that $\mathbb{F}_2$ is a field, with 0 as the additive identity and 1 as the multiplicative
identity.
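Because the set has only two elements, every field property can be checked exhaustively. The sketch below is one way to do so, offered as a companion to the exercise and not as its intended solution. The names `add`, `mul`, and `is_field` are hypothetical, and the list of properties tested (commutativity, associativity, distributivity, identities, additive inverses, and multiplicative inverses of non-zero elements) is the standard definition of a field.

```python
from itertools import product

ELEMENTS = (0, 1)


def add(a, b):
    """Addition in the two-element set: the usual sums, except 1 + 1 = 0."""
    return (a + b) % 2


def mul(a, b):
    """Multiplication in the two-element set: the usual products."""
    return (a * b) % 2


def is_field():
    """Check every field property by running through all cases."""
    for a, b in product(ELEMENTS, repeat=2):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False  # commutativity fails
    for a, b, c in product(ELEMENTS, repeat=3):
        if add(add(a, b), c) != add(a, add(b, c)):
            return False  # associativity of addition fails
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False  # associativity of multiplication fails
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False  # distributivity fails
    for a in ELEMENTS:
        if add(a, 0) != a or mul(a, 1) != a:
            return False  # 0 or 1 fails to act as an identity
        if not any(add(a, b) == 0 for b in ELEMENTS):
            return False  # no additive inverse
        if a != 0 and not any(mul(a, b) == 1 for b in ELEMENTS):
            return False  # no multiplicative inverse
    return True


f2_is_field = is_field()
```

Replacing `% 2` by `% 4` and `ELEMENTS` by `(0, 1, 2, 3)` gives a set in which the element 2 has no multiplicative inverse, so the same checker can also reject a candidate.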
Given two polynomials $f(x)$ and $g(x)$ in $K[x]$, the addition and
multiplication laws for $K$ allow us to add and multiply $f(x)$ and $g(x)$ in the
way defined in Section 1.2, obtaining new polynomials $f(x) + g(x)$ and
$f(x)g(x)$ that also lie in $K[x]$. (Under these addition and multiplication
rules, $K[x]$ satisfies the defining properties of a ring, the polynomial ring
with coefficients in $K$.) For example, we can add and multiply polynomials
whose coefficients lie in our new fields $\mathbb{F}_2$ and $\mathbb{R}(t)$.
Given a field $K$, let’s write 0 for its additive identity. In all our examples
of fields besides $\mathbb{F}_2$, the additive identity is the actual number 0, but
sometimes, as is the case with $\mathbb{F}_2$, we can’t identify 0 with a number. It is simply
a distinguished member of the field. We can define the notion of root for any
polynomial $f(x)$ in $K[x]$: a root of $f(x)$ is an element $a$ of $K$ satisfying
$f(a) = 0$. For example, the polynomial $x^2 + 1$ in $\mathbb{F}_2[x]$ has 1 as a root,
since
$1^2 + 1 = 1 + 1 = 0.$
The definition of irreducible polynomial makes sense for polynomials
with coefficients in any field: a polynomial $f(x)$ of positive degree in $K[x]$
is irreducible if it cannot be factored as the product of two polynomials of
lower degree in $K[x]$. In the next exercise, we determine the irreducible
polynomials of low degree in $\mathbb{F}_2[x]$.
We can show, just as with $\mathbb{Q}[x]$, that for every positive integer $n$, there
is an irreducible polynomial of degree $n$ in $\mathbb{F}_2[x]$.
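For small degrees, the claim can be confirmed by brute force: list every polynomial of degree $n$ with coefficients 0 or 1 and strike out those that arise as products of two polynomials of lower positive degree. The sketch below is only an illustration of that sieve, under an encoding chosen for convenience and not taken from the text: a polynomial is stored as an integer whose bit $i$ is the coefficient of $x^i$, so that $x^2 + x + 1$ is `0b111`. The function names are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two polynomials with coefficients in the two-element field,
    each encoded as an integer whose bit i is the coefficient of x^i."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # adding coefficients mod 2 is bitwise XOR
        a <<= 1          # multiply a by x
        b >>= 1
    return result


def irreducibles(n):
    """All irreducible polynomials of degree n >= 1, as sorted integer encodings."""
    candidates = set(range(1 << n, 1 << (n + 1)))  # leading coefficient 1
    for d in range(1, n // 2 + 1):
        for a in range(1 << d, 1 << (d + 1)):                  # degree d
            for b in range(1 << (n - d), 1 << (n - d + 1)):    # degree n - d
                candidates.discard(poly_mul(a, b))
    return sorted(candidates)


counts = [len(irreducibles(n)) for n in range(1, 6)]
```

The sieve only needs factor degrees up to $n/2$, because a reducible polynomial of degree $n$ always has a factor of degree at most $n/2$. The running time grows like $4^n$, so this is a check for small $n$ and not a proof of the general statement.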
Once we broaden the study of polynomials to allow any field as the set
of coefficients, we become interested in two types of theorems: those that
are specific to the choice of K, and those that hold no matter what field K
is chosen. The fundamental theorem of algebra is an example of the first
type of theorem, as it is a statement about polynomials with real or complex
coefficients. The results of Sections 1.2, 1.3, and 1.4 are examples of the
second type. They make sense for polynomials with coefficients in any field
K and the proofs work when real numbers are replaced by elements of
K. For example, combining Theorems 1.3 and 1.4, and extending them to
arbitrary fields, we obtain:
Theorem 7.8. Let $K$ be a field and let $f(x)$ be a polynomial in $K[x]$ with
exactly $k$ distinct roots $a_1, \ldots, a_k$ in $K$. For $1 \le i \le k$, let $m_i$ denote the
multiplicity of $a_i$ as a root of $f(x)$. Then $f(x)$ has a factorization
$f(x) = (x - a_1)^{m_1}(x - a_2)^{m_2} \cdots (x - a_k)^{m_k} g(x)$
for a polynomial $g(x)$ in $K[x]$ that itself has no roots in $K$.
It is easy to take this for granted. Yet, it requires proof. The proof is
straightforward. If $n$ is prime, we are done. Otherwise, by the definition of
prime, $n$ factors as a product $r \cdot s$ for smaller positive integers $r$ and $s$. If $r$
or $s$ is prime, we leave it as is. If not, we factor it as the product of smaller
positive integers. We continue in this way as we test each new factor for
primeness. Any non-prime factors become smaller with each round, so that
after at most $n - 1$ rounds, we will have factored $n$ as a product of prime
numbers.
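The argument is effectively an algorithm, and it can be transcribed almost word for word. The sketch below is one such transcription, with hypothetical function names. It tests primeness by trial division, which is a choice made here for simplicity and is not part of the proof.

```python
def smallest_proper_factor(n):
    """Return the smallest factor d of n with 1 < d < n, or None if n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return None


def prime_factors(n):
    """Factor an integer n > 1 into primes by repeatedly splitting non-prime factors."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    pending, primes = [n], []
    while pending:
        m = pending.pop()
        r = smallest_proper_factor(m)
        if r is None:
            primes.append(m)              # m is prime: leave it as is
        else:
            pending.extend([r, m // r])   # m = r * s with both factors smaller
    return sorted(primes)


example = prime_factors(360)
```

Every split replaces a factor by two strictly smaller ones, which is why the loop must stop, just as in the proof.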
Theorem 7.9 has a polynomial analogue, with essentially the same proof:
Theorem 7.11. Let $n$ be an integer greater than 1. There exist prime
numbers $p_1, \ldots, p_k$ and positive integers $m_1, \ldots, m_k$ satisfying
$n = p_1^{m_1} \cdots p_k^{m_k}.$
and
$q_1^{n_1} \cdots q_\ell^{n_\ell}$
are two prime factorizations of $n$, for prime numbers $p_1 < p_2 < \cdots < p_k$
and $q_1 < q_2 < \cdots < q_\ell$ and for positive integer exponents $m_i$ and $n_j$. Then
$k = \ell$ and for each $i$ between 1 and $k$, we have the equalities $p_i = q_i$ and
$m_i = n_i$.
Euclid’s key property of prime numbers has its counterpart for
irreducible polynomials: given a field $K$, if an irreducible polynomial $p(x)$
in $K[x]$ divides the product $r(x)s(x)$ of two polynomials $r(x)$ and $s(x)$
in $K[x]$, then $p(x)$ divides $r(x)$ or $s(x)$. This is an extension of Exercise
1.9. We will not prove it. The proof is not difficult, paralleling as it does
the proof of the integer result. We introduce the notion of greatest common
divisor for polynomials, obtain a polynomial analogue of the euclidean
algorithm, and prove the result, mimicking the classical integer arguments.
Once this property of irreducible polynomials is available, we can mimic
the proof of Theorem 7.13 to obtain what is essentially an extension of The-
orem 7.8, a unique factorization theorem for polynomials. The statement is
more complicated than its integer counterpart because we can alter a fac-
torization of a polynomial by inserting constant factors without changing
it in an essential way, but this is the only issue: up to constant factors, the
same irreducible polynomials occur in any factorization of a polynomial
into irreducible polynomials with the same exponents.
With these general factorization theorems in place, we may next wish to
study factorization problems with particular fields of coefficients, such as Q
or F2 . They turn out to be important, and difficult. Let us instead conclude
by returning to the issue that prompted us to introduce the notion of a field
in Section 7.2.
We saw that several eighteenth-century attempts to prove the fundamental
theorem of algebra failed because of a missing ingredient, Theorem 7.5.
This states, given a polynomial $f(x)$ in $\mathbb{R}[x]$ of positive degree $n$, that there
exists a field $K$ containing $\mathbb{C}$ and elements $r_1, r_2, \ldots, r_n$ in $K$ satisfying
$f(x) = (x - r_1)(x - r_2) \cdots (x - r_n).$
The proof does not depend on working with the real numbers. It yields more
generally, with no additional work, the following theorem.
Theorem 7.14. Let $K$ be a field and let $f(x)$ be a polynomial of positive
degree $n$ in $K[x]$. There exists a field $L$ containing $K$ and elements $r_1, r_2, \ldots, r_n$
in $L$, repetitions allowed, such that
\[ f(x) = (x - r_1)(x - r_2) \cdots (x - r_n) \]
in $L[x]$.
The proof is not difficult, and can be found in many algebra texts (for
example, [22, p. 536]). A full discussion would take us too far afield. Let us
take a brief look at the essential issue, which is the following partial result:
\[ r^n - a_{n-1} r^{n-1} - \cdots - a_1 r - a_0 = 0, \]
or
\[ r^n = a_{n-1} r^{n-1} + \cdots + a_1 r + a_0. \]
We then take $L$ to be the set of all polynomial-like expressions in $r$ of degree
at most $n - 1$ with coefficients from $K$.
Addition of expressions is defined in the obvious way. For multiplication,
we first treat $r$ as a polynomial indeterminate and multiply as usual.
The result will be a polynomial in $r$, but it may have degree $n$ or greater. We
then use the last equation, which we can think of as a rewrite rule telling us
how to replace $r^n$ by a sum of lower-degree terms in $r$. Iterating, we can
write any polynomial in $r$ as a polynomial in $r$ of degree less than $n$. Having
defined addition and multiplication in $L$, we easily verify that $L$ satisfies the
defining conditions of a ring.
What requires closer examination is the additional requirement that fields
must satisfy: that every non-zero element has a multiplicative inverse. Is this
the case for $L$? It is not difficult to see that the answer is no whenever $f(x)$
fails to be irreducible. However, if $f(x)$ is irreducible, then $L$ is a field. The
proof of this, like the proof of unique factorization for polynomials, depends
on polynomial analogues of integer results on greatest common divisors and
the euclidean algorithm.
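Multiplication in $L$ can be illustrated with a short sketch. It assumes $K = \mathbf{Q}$, stores an element $c_0 + c_1 r + \cdots + c_{n-1} r^{n-1}$ as the list of its coefficients, and encodes the rewrite rule $r^n = a_0 + a_1 r + \cdots + a_{n-1} r^{n-1}$ as the list `rule`; the function name `multiply_mod` is hypothetical.

```python
from fractions import Fraction

def multiply_mod(u, v, rule):
    """Multiply two elements of L, then reduce with the rewrite rule.

    u, v : coefficient lists [c0, ..., c_{n-1}], lowest degree first
    rule : [a0, ..., a_{n-1}], meaning r**n = a0 + a1*r + ... + a_{n-1}*r**(n-1)
    """
    n = len(rule)
    # Step 1: multiply as ordinary polynomials in r.
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ci in enumerate(u):
        for j, cj in enumerate(v):
            prod[i + j] += Fraction(ci) * Fraction(cj)
    # Step 2: rewrite r**d for d >= n, starting from the highest degree.
    for d in range(2 * n - 2, n - 1, -1):
        c = prod[d]
        prod[d] = Fraction(0)
        for k, ak in enumerate(rule):
            prod[d - n + k] += c * Fraction(ak)
    return prod[:n]

# f(x) = x^2 - 2 is irreducible over Q; the rule is r^2 = 2.
inverse_check = multiply_mod([1, 1], [-1, 1], [2, 0])   # (1 + r)(-1 + r)

# f(x) = x^2 - 1 is reducible over Q; the rule is r^2 = 1.
zero_divisors = multiply_mod([-1, 1], [1, 1], [1, 0])   # (r - 1)(r + 1)
```

In the first case the product is 1, so $1 + r$ has the inverse $-1 + r$. In the second case two non-zero elements multiply to 0, so neither can have an inverse, which is the obstruction that appears when $f(x)$ fails to be irreducible.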
\[ x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^{n-1} a_{n-1} x + (-1)^n a_n. \]
We index the coefficients so that the subscript of a coefficient and the expo-
nent of its associated power of x sum to n, and also we write the coefficients
with alternating signs. The convenience of these choices will soon be clear.
Assume that the coefficients are real numbers.
By the fundamental theorem of algebra, $f(x)$ factors as
\[ (x - r_1)(x - r_2) \cdots (x - r_n), \]
where $r_1, \ldots, r_n$ are the real or complex roots of $f(x)$, listed with possible
repetitions. The discriminant of $f(x)$ is defined to be the square of the
We take the product over all roots, with repetitions, not over the set of distinct
roots. If there is a repeated root, which means there is a pair of indices
$i$ and $j$ with $i \neq j$ and $r_i = r_j$, then the difference $r_i - r_j$ is 0 and $\Delta = 0$. If
no root is repeated, then the factors in the product are non-zero and $\Delta \neq 0$.
We are assuming that coefficients are real numbers. We could choose to
work in the generality of Section 7.3, with an arbitrary field $K$ as the field of
coefficients for the polynomial $f(x)$. Theorem 7.14 ensures the existence
of a larger field $L$ that contains a set $r_1, \ldots, r_n$ of roots for $f(x)$ for which
$f(x)$ factors in $L[x]$ as
\[ f(x) = (x - r_1)(x - r_2) \cdots (x - r_n). \]
However, we would have to prove that $\Delta$ doesn’t depend on the field $L$ and
roots $r_i$. This is a journey we choose not to take.
For polynomials $f(x)$ of degree 2, 3, or 4, we found formulas for the
discriminant that express it as a sum of integer multiples of products of the
coefficients. For example, we found in Theorem 5.6 that the discriminant of
$x^3 + bx^2 + cx + d$ is
\[ 18bcd - 4b^3 d + b^2 c^2 - 4c^3 - 27d^2. \]
We wish to show for $f(x)$ of any degree $n$ that a formula like this exists:
the discriminant of $f(x)$ can be written as a polynomial expression in the
coefficients of $f(x)$ with integer coefficients.
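As a quick sanity check of the cubic formula, the sketch below compares it with the product of squared differences of the roots for one cubic with known integer roots. The function names and the sample roots 1, 2, 4 are illustrative choices, and agreement on one example is of course not a proof.

```python
from itertools import combinations
from math import prod

def cubic_discriminant_from_coefficients(b, c, d):
    """Discriminant of x^3 + b x^2 + c x + d by the coefficient formula."""
    return 18*b*c*d - 4*b**3*d + b**2*c**2 - 4*c**3 - 27*d**2

def discriminant_from_roots(roots):
    """Product of (r_i - r_j)^2 over all pairs of roots with i < j."""
    return prod((ri - rj) ** 2 for ri, rj in combinations(roots, 2))

# (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
r1, r2, r3 = 1, 2, 4
b = -(r1 + r2 + r3)
c = r1*r2 + r1*r3 + r2*r3
d = -(r1*r2*r3)

from_coefficients = cubic_discriminant_from_coefficients(b, c, d)
from_roots = discriminant_from_roots([r1, r2, r3])
```

Both computations give 36 for this cubic.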
The discriminant is not the only expression in the roots of $f(x)$ that we
have written as a polynomial expression in its coefficients. In Theorem 5.14
we saw for a cubic polynomial $x^3 - a_1 x^2 + a_2 x - a_3$ with roots $r_1$, $r_2$, and
$r_3$ that
\[ a_1 = r_1 + r_2 + r_3, \]
\[ a_2 = r_1 r_2 + r_1 r_3 + r_2 r_3, \]
\[ a_3 = r_1 r_2 r_3. \]
The proof amounts to factoring the cubic as $(x - r_1)(x - r_2)(x - r_3)$ and carrying
out the multiplication of the degree-one terms. The simpler quadratic
analogue was obtained for real roots in Exercise 2.5 and in general in Exer-
cise 4.8, and the quartic analogue was Theorem 6.11.
A similar result holds for polynomials of higher degree. We have factored
a general degree $n$ polynomial $f(x)$ as
\[ (x - r_1)(x - r_2) \cdots (x - r_{n-1})(x - r_n). \]
When we carry out the multiplication of the $n$ terms, the result is the sum
of the expressions we obtain by choosing from each term $x - r_i$ either the
$x$ or the $-r_i$ and then multiplying them together. For an integer $k$ between
1 and $n$, the coefficient of $x^{n-k}$ will be the sum of the terms we obtain
when we choose $x$ from $n - k$ of the factors and choose the constants $-r_i$ from $k$ of
the factors. We might for instance choose constants from the factors with
indices $i_1, i_2, \ldots, i_k$. This would yield
\[ (-r_{i_1})(-r_{i_2}) \cdots (-r_{i_k})\, x^{n-k}, \]
or
\[ (-1)^k r_{i_1} r_{i_2} \cdots r_{i_k}\, x^{n-k}. \]
To get the entire coefficient of $x^{n-k}$, which we have written as $(-1)^k a_k$,
we need to add all these expressions together. The result is
\[ a_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} r_{i_1} r_{i_2} \cdots r_{i_k}. \]
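The formula for $a_k$ can be checked by machine on a small example. In this sketch, `coefficients_from_roots` forms the sums of products of $k$ roots directly, while `expand_product` multiplies out the linear factors one at a time; both names are ours, and the roots are arbitrary sample values.

```python
from itertools import combinations
from math import prod

def coefficients_from_roots(roots):
    """Return [a_1, ..., a_n], where a_k is the sum of all products of k roots."""
    n = len(roots)
    return [sum(prod(c) for c in combinations(roots, k)) for k in range(1, n + 1)]

def expand_product(roots):
    """Multiply out (x - r_1)...(x - r_n); coefficients listed highest degree first."""
    poly = [1]
    for r in roots:
        # (poly) * x is poly + [0]; (poly) * r is r times ([0] + poly)
        poly = [a - r * b for a, b in zip(poly + [0], [0] + poly)]
    return poly

roots = [1, 2, 4]
a = coefficients_from_roots(roots)
expanded = expand_product(roots)
# With the alternating-sign convention, the coefficient of x^(n-k) is (-1)^k a_k.
signed = [1] + [(-1) ** k * a_k for k, a_k in enumerate(a, start=1)]
```

For the roots 1, 2, 4 this gives $a_1 = 7$, $a_2 = 14$, $a_3 = 8$, matching $x^3 - 7x^2 + 14x - 8$.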
is the case as well for the discriminant, a point that is worth a moment of
thought.
By definition,
\[ \Delta(r_1, \ldots, r_n) = \prod_{1 \le i < j \le n} (r_i - r_j)^2. \]
If we reorder the roots $r_i$ in some way, the order of the factors in the product
will change, but the list of factors is unchanged: we choose every possible
pair of roots, take its difference, square it, and multiply. Since multiplication
doesn’t depend on the ordering, the product that defines $\Delta(r_1, \ldots, r_n)$
doesn’t either. If we were to omit the squares in the product, the resulting
product
\[ \prod_{1 \le i < j \le n} (r_i - r_j) \]
would fail to be independent of the ordering of the $r_i$’s. For example, consider
what happens if we switch the order of $r_1$ and $r_2$ but leave the order
of $r_3, \ldots, r_n$ intact. The factor $r_1 - r_2$ changes to its opposite, $r_2 - r_1$. For
any $i > 2$, the factors $r_1 - r_i$ and $r_2 - r_i$ switch with each other, but their
product is unchanged. The remaining factors $r_i - r_j$ with $i, j > 2$ remain
unchanged. The product as a whole, then, is changed to its opposite.
Expressions in the roots of f .x/ that remain unchanged under re-naming
of the roots, such as the examples we have just examined, are called sym-
metric.
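The contrast between the squared and unsquared products is easy to see numerically. The following sketch, with illustrative function names and arbitrary sample roots, swaps the first two roots and compares the results.

```python
from itertools import combinations
from math import prod

def difference_product(roots):
    """Product of r_i - r_j over all pairs i < j (no squares)."""
    return prod(ri - rj for ri, rj in combinations(roots, 2))

def discriminant(roots):
    """Product of (r_i - r_j)^2 over all pairs i < j."""
    return difference_product(roots) ** 2

roots = [1, 2, 4, 7]
swapped = [2, 1, 4, 7]          # r_1 and r_2 exchanged, the rest left intact

unsquared_before = difference_product(roots)
unsquared_after = difference_product(swapped)
```

The unsquared product changes from 540 to -540 under the swap, so it is not symmetric, while `discriminant` returns the same value for both orderings.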
To proceed, we pass to a more abstract framework. The formulas we
have obtained relating roots and coefficients of a polynomial make sense
whatever the values of the coefficients are. In effect, we can regard them
as variables, to be replaced by particular real numbers when we identify
a specific polynomial of interest. If we are studying the polynomial
$x^3 + 3x^2 - 2x + 7$, for example, then we substitute $-3$ for $a_1$, $-2$ for $a_2$ and $-7$
for $a_3$ (keeping our sign convention in mind).
This suggests the path we should take, which is to start over, working
with a generic polynomial of degree n whose coefficients are variables or
indeterminates. We can in this way discuss all degree n polynomials at once,
specializing the variable coefficients to actual ones when we have a specific
polynomial in mind.
Let us begin anew, then. Not only should the coefficients of a degree
$n$ polynomial be generic, but the roots should be too. In other words, we
replace the roots $r_i$ by variables. Fix a positive integer $n$ and introduce the
new variables or indeterminates $t_1, t_2, \ldots, t_n$, which we think of as stand-ins
for the roots of a degree $n$ polynomial. We anticipate that when we study
It is understood that the sum is finite and the coefficients $a_{i_1, i_2, \ldots, i_n}$ are
integers, all but finitely many of them being 0. Just as the $t_i$s are stand-ins for
the roots of a given degree $n$ polynomial, the expressions are stand-ins for
polynomial expressions in the roots.
We already know some expressions that will interest us. For each $k$
between 1 and $n$, let’s define $e_k(t_1, \ldots, t_n)$ by
\[ e_k(t_1, \ldots, t_n) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} t_{i_1} t_{i_2} \cdots t_{i_k} \]
and call it the $k$th elementary symmetric polynomial. Repeating the analysis
we made for sums of products of roots, we obtain:
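We also introduce, for each positive integer $k$, the power sum polynomials
$p_k(t_1, \ldots, t_n)$, defined by
\[ p_k(t_1, \ldots, t_n) = t_1^k + t_2^k + \cdots + t_n^k. \]
\[ f(x) = x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^{n-1} a_{n-1} x + (-1)^n a_n. \]
\[ (x - r_1)(x - r_2) \cdots (x - r_n). \]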
If, for each index $i$ from 1 to $n$, we substitute $r_i$ for $t_i$, then we pass from
the generic polynomial
\[ x^n + \sum_{k=1}^{n} (-1)^k e_k(t_1, \ldots, t_n)\, x^{n-k} \]
to
\[ f(x) = x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^{n-1} a_{n-1} x + (-1)^n a_n. \]
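polynomials. For example, with $n = 2$, since $e_1(t_1, t_2)$ and $e_2(t_1, t_2)$ are
symmetric, so is
\[ e_1(t_1, t_2)^2 - 4 e_2(t_1, t_2). \]
Expanding, we find that
\[ e_1(t_1, t_2)^2 - 4 e_2(t_1, t_2) = (t_1 + t_2)^2 - 4 t_1 t_2 = (t_1 - t_2)^2. \]
\[ e_1(t_1, \ldots, t_n), \ldots, e_n(t_1, \ldots, t_n). \]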
We obtain the same result for power sum polynomials, too, but we can
say more, as we will later in the section.
To see the power of Theorem 7.18, let’s return to our standard degree $n$
polynomial
\[ x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^{n-1} a_{n-1} x + (-1)^n a_n. \]
We know that when we substitute the roots $r_i$ for the variables $t_i$, the elementary
symmetric polynomials $e_k$ specialize to the coefficients $a_k$:
$e_k(r_1, \ldots, r_n) = a_k$. We can therefore use $\Delta(t_1, \ldots, t_n)$ to calculate the
discriminant $\Delta(r_1, \ldots, r_n)$ of $f(x)$ in terms of the coefficients $a_k$:
\[ x^5 + cx^2 + dx + e, \]
is given by
\[ t^m u^n + t^n u^m, \]
\[ t^m u^n + t^n u^m = t^m u^m (t^k + u^k) = (tu)^m (t^k + u^k). \]
Thus we can combine the two cases into one, and it suffices to show for all
non-negative integers $m$ and $k$ that
\[ (tu)^m (t^k + u^k) \]
\[ t^k + u^k = (t^{k-1} + u^{k-1})(t + u) - tu\,(t^{k-2} + u^{k-2}) \]
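The last identity is a recurrence: it expresses $t^k + u^k$ in terms of the two previous power sums together with $t + u$ and $tu$. A minimal sketch of how it can be iterated, assuming we are handed only the values $e_1 = t + u$ and $e_2 = tu$ (the function name is hypothetical):

```python
def power_sum_two_vars(k, e1, e2):
    """Return t^k + u^k given only e1 = t + u and e2 = t * u."""
    if k == 0:
        return 2                     # t^0 + u^0
    prev, curr = 2, e1               # the cases k = 0 and k = 1
    for _ in range(k - 1):
        # t^j + u^j = (t^(j-1) + u^(j-1)) (t + u) - t u (t^(j-2) + u^(j-2))
        prev, curr = curr, curr * e1 - prev * e2
    return curr

t, u = 3, 5
value = power_sum_two_vars(4, t + u, t * u)
```

With $t = 3$ and $u = 5$ the function returns 706, which equals $3^4 + 5^4$, even though it never sees $t$ and $u$ separately.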
\[ p_k(t_1, \ldots, t_n) = (-1)^{k-1} k\, e_k(t_1, \ldots, t_n) - \sum_{i=1}^{k-1} (-1)^{k-i} e_{k-i}(t_1, \ldots, t_n)\, p_i(t_1, \ldots, t_n). \]
When $k = 1$, this is
\[ e_1(t_1, \ldots, t_n) = p_1(t_1, \ldots, t_n), \]
\[ x^n - a_1 x^{n-1} + a_2 x^{n-2} - \cdots + (-1)^{n-1} a_{n-1} x + (-1)^n a_n, \]
we are able to write the power sums $p_k(r_1, \ldots, r_n)$ in terms of the coefficients
$a_i$. Girard wrote such formulas explicitly for $k \le 4$. Let’s conclude
the section with a look at his work.
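The identity for $p_k$ can be run as a recursion that produces power sums of the roots from the coefficients alone. The sketch below assumes the sign convention $x^n - a_1 x^{n-1} + \cdots + (-1)^n a_n$ used throughout, takes $e_k = 0$ for $k > n$, and uses a function name of our own choosing.

```python
def power_sums_from_coefficients(a, kmax):
    """Return [p_1, ..., p_kmax] for the polynomial with coefficients a = [a_1, ..., a_n]."""
    n = len(a)

    def e(k):
        # e_k specializes to a_k; by convention it is 0 once k exceeds n
        return a[k - 1] if 1 <= k <= n else 0

    p = []
    for k in range(1, kmax + 1):
        total = (-1) ** (k - 1) * k * e(k)
        for i in range(1, k):
            total -= (-1) ** (k - i) * e(k - i) * p[i - 1]
        p.append(total)
    return p

# x^3 - 7x^2 + 14x - 8 has roots 1, 2, 4.
sums = power_sums_from_coefficients([7, 14, 8], 4)
```

For this cubic the recursion gives 7, 21, 73, 273, which are the sums of the first, second, third, and fourth powers of 1, 2, and 4.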
Prior to Girard, Viète obtained formulas in special cases for the coeffi-
cients of a polynomial as sums of products of the roots. Girard was the first
to state such formulas in general. In his 1629 book Invention Nouvelle en
l’Algèbre [32], Girard stated the theorem that any polynomial of degree n
has n roots. Only with the presence of n roots can we even contemplate ex-
pressions for the coefficients of the polynomial in terms of the roots. Girard
provided them in the second part of the same theorem.
Before stating the theorem, Girard introduced a notion he called a fac-
tion:
When several numbers are proposed, their sum is called the first fac-
tion; the sum of all the products two-by-two is called the second fac-
tion; the sum of all the products three-by-three is called the third fac-
tion; and so on up to the end, with the product of all the numbers called
the last faction. There are as many factions as the numbers proposed.
The second part of the theorem then states:
The first faction of the solutions [of a polynomial equation] equals the
first coefficient, the second faction of the same is equal to the second
coefficient, the third the third, and so on, so that the last faction is equal
to the last, and this according to the signs, which one observes are in
alternating order.
Girard offers no proof. But the proof is clear, provided we understand a
polynomial of degree $n$ to have $n$ roots, as Girard states, and provided we
understand that the polynomial factors as the product of terms $x - r$, as $r$
runs through the $n$ roots, which he surely did.
A few pages later, Girard introduces the expressions in the roots $r_i$ that
we call the power sum polynomials. He then writes formulas for them in
terms of the coefficients $a_i$, for powers up to 4:
\[ r_1 + \cdots + r_n = a_1, \]
\[ r_1^2 + \cdots + r_n^2 = a_1^2 - 2a_2, \]
\[ r_1^3 + \cdots + r_n^3 = a_1^3 - 3a_1 a_2 + 3a_3, \]
\[ r_1^4 + \cdots + r_n^4 = a_1^4 - 4a_1^2 a_2 + 4a_1 a_3 + 2a_2^2 - 4a_4. \]
These are the first four Girard-Newton identities of Theorem 7.19. Girard
doesn’t continue with higher exponents, but it is clear that a sequence of
such formulas exists.
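Girard’s four formulas can be tested against directly computed power sums. This is a sketch only: the function name and the sample roots are arbitrary, and a numerical match on one example illustrates the formulas without proving them.

```python
from itertools import combinations
from math import prod

def girard_power_sums(a1, a2, a3, a4):
    """Girard's expressions for the sums of 1st to 4th powers of the roots."""
    return [
        a1,
        a1**2 - 2*a2,
        a1**3 - 3*a1*a2 + 3*a3,
        a1**4 - 4*a1**2*a2 + 4*a1*a3 + 2*a2**2 - 4*a4,
    ]

roots = [2, -1, 3, 5, -4]
# a_k is the k-th "faction": the sum of all products of k of the roots.
a = [sum(prod(c) for c in combinations(roots, k)) for k in range(1, 5)]
via_girard = girard_power_sums(*a)
direct = [sum(r**k for r in roots) for k in range(1, 5)]
```

The two lists `via_girard` and `direct` agree for these roots. The same check works for any list of integers, including one with fewer than four entries, where the missing factions are 0.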
Isaac Newton (1643–1727) obtained the same result in 1707 [47, pp.
392–393], again without proof. (In supplementary notes immediately after
Newton’s statement of the result, Reverend Wilder provides a proof [47, pp.
393–396].) The identities are often named after Newton alone, but credit is
due to both.
Girard’s observation that certain symmetric expressions in the roots of a
polynomial can be written in terms of the polynomial’s coefficients comes
in for praise in H. Gray Funkhouser’s 1930 paper, “A short account of the
history of symmetric functions of roots of equations” [30, pp. 358, 360]:
states for a polynomial $f(x)$ with real coefficients that a field $K$ exists containing
both $\mathbf{C}$ and a set of roots of $f(x)$, so that $f(x)$ factors in $K[x]$ as
a product of linear factors, and Theorem 7.17, the fundamental theorem on
symmetric polynomials. We will also need the principle of mathematical
induction.
Induction has been used implicitly on occasion, for instance in the proof
of de Moivre’s formula in Exercise 4.23 and the proof of the two-variable
case of the fundamental theorem on symmetric polynomials in Section 7.4.
Its application in proving the fundamental theorem is more subtle. Hence, in
preparation for that proof, an explicit discussion of induction is warranted.
Let’s write N for the set of positive integers. (This is a traditional math-
ematical notation, the letter “N” suggesting the natural numbers, another
name for the positive integers.) Here is the statement of the principle:
Let $S$ be the set of positive integers for which the equality holds. We see immediately
that $S$ contains 1. Suppose we can show for any positive integer
$k$ that if $k$ is in $S$, so is $k + 1$. Then the principle of mathematical induction
implies that $S = \mathbf{N}$ and de Moivre’s formula holds for all positive integers.
What we must show, then, is that if
\[ (\cos\theta + i \sin\theta)^k = \cos k\theta + i \sin k\theta, \]
then
\[ (\cos\theta + i \sin\theta)^{k+1} = \cos (k+1)\theta + i \sin (k+1)\theta. \]
This is essentially what we checked in Exercise 4.23.
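The shape of the induction argument can be mirrored in a small numerical experiment: start from the case $k = 1$ and pass from $k$ to $k + 1$ by one more multiplication. The sketch below uses floating-point arithmetic, so it checks the formula only up to a tolerance and is no substitute for the proof; the function name and the tolerance are our choices.

```python
import math

def check_de_moivre_by_steps(theta, kmax, tol=1e-9):
    """Check (cos t + i sin t)^k = cos kt + i sin kt for k = 1, ..., kmax."""
    z = complex(math.cos(theta), math.sin(theta))
    power = z                                   # the base case k = 1
    for k in range(1, kmax):
        # induction step: one more factor turns the k-th power into the (k+1)-th
        power *= z
        target = complex(math.cos((k + 1) * theta), math.sin((k + 1) * theta))
        if abs(power - target) > tol:
            return False
    return True

holds = check_de_moivre_by_steps(0.7, 50)
```

Each pass through the loop corresponds to one application of the step from $k$ to $k + 1$.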
\[ x^2 - (r_i + r_j)x + r_i r_j. \]
Since $r_i + r_j$ and $r_i r_j$ are complex, it lies in $\mathbf{C}[x]$. By design, its roots are
$r_i$ and $r_j$. We proved in Theorem 4.7 that any quadratic polynomial in $\mathbf{C}[x]$
has complex roots. Hence, $r_i$ and $r_j$ lie in $\mathbf{C}$. But $r_i$ and $r_j$ are roots of
$f(x)$, so $f(x)$ has roots in $\mathbf{C}$.
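This last step can be made concrete: once the sum and product of two roots are known to be complex numbers, the quadratic formula recovers the roots themselves. The following sketch relies on Python's `cmath.sqrt` for a complex square root; the function name and the sample values are illustrative.

```python
import cmath

def roots_from_sum_and_product(s, p):
    """Roots of x^2 - s*x + p for complex s and p, via the quadratic formula."""
    root_of_disc = cmath.sqrt(s * s - 4 * p)    # a complex square root always exists
    return (s + root_of_disc) / 2, (s - root_of_disc) / 2

ri, rj = 1 + 2j, 3 - 1j
x1, x2 = roots_from_sum_and_product(ri + rj, ri * rj)
```

Given only $r_i + r_j = 4 + i$ and $r_i r_j = 5 + 5i$, the function returns $3 - i$ and $1 + 2i$, up to rounding.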
We have proved, under the assumption that polynomials whose degree
is a product of $2^k$ and an odd number have roots in $\mathbf{C}$, that polynomials
whose degree is a product of $2^{k+1}$ and an odd number also have roots in $\mathbf{C}$.
We also proved the base case, that polynomials of odd degree have roots in
$\mathbf{C}$. By the principle of mathematical induction, polynomials of any positive
degree have roots in $\mathbf{C}$, proving the fundamental theorem.
Let’s review where we used our two assumptions. The existence of real
roots for odd-degree polynomials entered the stage at the beginning, as the
base case of the induction argument. The existence of real square roots for
positive real numbers was used, implicitly, at the end, when we quoted Theorem
4.7 on quadratic polynomials in $\mathbf{C}[x]$ having roots in $\mathbf{C}$. This followed
by a completing-the-square argument from the existence in $\mathbf{C}$ of square
roots of complex numbers (Theorem 4.6), which followed from the existence
of real square roots for positive real numbers.
What a beautiful proof! It depends on several of the ideas we have in-
troduced in this chapter, plus a touch of genius. It is a perfect peak on which
to bring our study of polynomials to a close.
References
[1] Niels Abel, Mémoire sur les équations algébriques ou l’on démontre
l’impossibilité de la résolution de l’équation générale du cinquième degré,
(1824) in Œuvres Complètes, Volume I, Grøndahl & Søn, Christiania, 1881,
28–33; translated by P. Pesic in [52, pp. 155–169].
[2] Irving Adler, The Giant Golden Book of Mathematics, illustrated by Lowell
Hess, Golden Press, New York, 1960.
[3] Lars V. Ahlfors, Complex Analysis, 3rd ed., McGraw-Hill, New York, 1979.
[4] Jean-Robert Argand, Imaginary Quantities: Their Geometrical Interpretation,
translated by A.S. Hardy with a preface by J. Hoüel, D. Van Nostrand, New
York, 1881.
[5] Raymond G. Ayoub, Paolo Ruffini’s contributions to the quintic, Archive for
History of Exact Sciences, 23 (1980) 253–277.
[6] Claude-Gaspard Bachet, Problèmes Plaisants et Délectables, 4th ed.,
Gauthier-Villars, Paris, 1905.
[7] I.G. Bashmakova, Diophantus and Diophantine Equations, translated by Abe
Shenitzer with the editorial assistance of Hardy Grant, Mathematical Associa-
tion of America, Washington, D.C., 1997.
[8] I.G. Bashmakova and G.S. Smirnova, The Beginnings and Evolution of Alge-
bra, translated by Abe Shenitzer with the editorial assistance of David A. Cox,
Mathematical Association of America, Washington, D.C., 2000.
[9] William P. Berlinghoff and Fernando Q. Gouvêa, Math Through the Ages, Ox-
ton House Publishers, Farmington, Maine, 2002.
[10] A.E. Berriman, The Babylonian quadratic equation, The Mathematical
Gazette, 40 (1956) 185–192.
[11] Rafael Bombelli, l’Algebra, Giovanni Rossi, Bologna, 1579.
[12] Girolamo Cardano, The Great Art; or, The Rules of Algebra, translated and
edited by T. Richard Witmer, The M.I.T. Press, Cambridge, Massachusetts,
1968.
[13] Girolamo Cardano, The Book of My Life, translated by Jean Stoner with an
introduction by Anthony Grafton, New York Review Books, New York, 2002.
[14] Henry Thomas Colebrooke, Algebra with Arithmetic and Mensuration from
the Sanskrit of Brahmegupta and Bhascara, John Murray, London, 1817.
[15] S. Cuomo, Ancient Mathematics, Routledge, London, 2001.
[16] Abraham de Moivre, The analytic solution of certain equations of the third,
fifth, seventh, ninth and other higher uneven powers, by rules similar to those
called Cardan’s, Philosophical Transactions, 25 (1707) 2368–2371; portions
translated in [60, pp. 441–444].
[17] Abraham de Moivre, On the reduction of radicals to simpler terms, or the
extraction of roots of any binomial $a + \sqrt{+b}$ or $a + \sqrt{-b}$, Philosophical
Transactions, 40 (1739) 463–478; portions translated in [60, pp. 447–450].
[18] René Descartes, A Discourse on the Method of Correctly Conducting One’s
Reason and Seeking Truth in the Sciences, translated by Ian Maclean, Oxford
University Press, Oxford, 2006.
[19] René Descartes, The Geometry of René Descartes, translated by David Eugene
Smith and Marcia L. Latham, Open Court Publishing Company, La Salle, Illi-
nois, 1952.
[20] Leonard Eugene Dickson, Elementary Theory of Equations, John Wiley and
Sons, New York, 1914.
[21] Peter Doyle and Curt McMullen, Solving the quintic by iteration, Acta Math-
ematica, 163 (1989) 151–180.
[22] David S. Dummit and Richard M. Foote, Abstract Algebra, 3rd ed., John Wiley
& Sons, New York, 2004.
[23] William Dunham, Euler: The Master of Us All, Mathematical Association of
America, Washington, D.C., 1999.
[24] Gotthold Eisenstein, Allgemeine Auflösung der Gleichung von den ersten vier
Graden, Journal für die reine und angewandte Mathematik 27 (1844) 81–83;
reprinted in Mathematische Werke, Vol. I, Chelsea Publishing Company, New
York, 1975, 7–9.
[25] Gotthold Eisenstein, Entwicklung von $\alpha^{\alpha^{\alpha^{\cdot^{\cdot^{\cdot}}}}}$, Journal für die reine und
angewandte Mathematik 28 (1844) 49–52; reprinted in Mathematische Werke, Vol.
I, Chelsea Publishing Company, New York, 1975, 122–125.
[26] Leonhard Euler, Elements of Algebra, translated by Rev. John Hewlett,
Springer-Verlag, New York, Berlin, Heidelberg, Tokyo, 1984.
[27] Charles Fefferman, An easy proof of the fundamental theorem of algebra,
American Mathematical Monthly, 74 (1967) 854–855.
[28] Fibonacci, Fibonacci’s Liber Abaci: A Translation into Modern English of
Leonardo Pisano’s Book of Calculation, translated by L.E. Sigler, Springer-
Verlag, New York, 2002.
[29] Benjamin Fine and Gerhard Rosenberger, The Fundamental Theorem of Alge-
bra, Springer-Verlag, New York, 1997.
[48] R.W.D. Nickalls, Viète, Descartes and the cubic equation, Mathematical
Gazette 90 (2006) 203–208.
[49] Øystein Ore, Cardano: The Gambling Scholar, Princeton University Press,
Princeton, New Jersey, 1953.
[50] Luca Pacioli, Summa de arithmetica, geometria, proportioni e proportionalità,
Paganino de Paganini, Venice, 1494.
[51] S.J. Patterson, Eisenstein and the quintic equation, Historia Mathematica 17
(1990) 132–140.
[52] Peter Pesic, Abel’s Proof, The MIT Press, Cambridge, Massachusetts, 2004.
[53] Satya Prakash, A Critical Study of Brahmagupta and His Works, The Indian
Institute of Astronomical & Sanskrit Research, New Delhi, 1968.
[54] Roshdi Rashed, editor, translator, commentator, Al-Khwārizmı̄: The Begin-
nings of Algebra, SAQI Books, London, 2009.
[55] Reinhold Remmert, The fundamental theorem of algebra, in Numbers,
Springer-Verlag, New York, 1990.
[56] Eleanor Robson, Mathematics in Ancient Iraq, Princeton University Press,
Princeton, New Jersey, 2008.
[57] Paolo Ruffini, Opere Matematiche di Paolo Ruffini, 3 volumes, edited by Ettore
Bortolotti, Volume 1, Palermo, 1915, Volumes 2 and 3, Edizioni Cremonese,
Rome, 1953 and 1954.
[58] J.F. Scott, The Mathematical Work of John Wallis, Chelsea Publishing Com-
pany, New York, 1981.
[59] Jacques Sesiano, An Introduction to the History of Algebra: Solving Equations
from Mesopotamian Times to the Renaissance, translated by Anna Pierrehum-
bert, American Mathematical Society, Providence, Rhode Island, 2009.
[60] David Eugene Smith, A Source Book in Mathematics, Dover Publications, Mi-
neola, New York, 1959.
[61] John Stillwell, Mathematics and Its History, 3rd ed., Springer Sci-
ence+Business Media, New York, 2010.
[62] John Stillwell, Eisenstein’s footnote, The Mathematical Intelligencer 17
(1995) 58–62.
[63] Niccolò Tartaglia, Quesiti et Inventioni Diverse, facsimile reproduction from
the 1554 edition, Ateneo di Brescia, Brescia, 1959.
[64] B.L. van der Waerden, Science Awakening, translated by Arnold Dresden, P.
Noordhoff, Groningen, Holland, 1954.
[65] B.L. van der Waerden, Geometry and Algebra in Ancient Civilizations,
Springer-Verlag, Berlin, New York, 1983.
[66] B.L. van der Waerden, A History of Algebra from Al-Khwārizmı̄ to Emmy
Noether, Springer-Verlag, Berlin, New York, 1985.
[67] V.S. Varadarajan, Algebra in Ancient and Modern Times, American Mathemat-
ical Society, Providence, Rhode Island, 1998.
[68] François Viète, The Analytic Art, translated by T. Richard Witmer, The Kent
State University Press, Kent, Ohio, 1983.
[69] John Wallis, A Treatise of Algebra, both Historical and Practical, John Play-
ford, London, 1685.
[70] Caspar Wessel, On the analytical representation of direction, Memoirs of the
Royal Danish Academy of Sciences 1799; portions translated in [60, pp. 55–
66].